GFW Technical Review 10 – Trojan

During WWII, radio became the backbone of military communication, and the cryptography that protected it formed a hidden battlefield. The Allied intelligence advantage was a key reason for decisive victories in both the European and Pacific theaters. The problem with radio was that messages were broadcast into the air: not only your intended receiver but everyone within range, including the enemy, could intercept the signal and decode the message. Both Allied and Axis forces developed cryptographic schemes to encrypt their transmissions, attempting to camouflage them as random noise. Nazi Germany famously used the Enigma machine. But designing a perfect cryptographic scheme is hard. Statistical patterns in the messages and weaknesses in the cipher can leak information that, with enough automated computation, allows the codes to be broken.

Rather than trying to make radio indecipherable, there is a different approach: planting false intelligence. In the run-up to the invasion of Normandy, the Allied forces carried out a massive deception campaign. They deliberately leaked false information, deployed inflatable tanks and dummy landing craft, and carried out deceptive military exercises, misleading the Nazis into thinking the naval invasion was going to take place at Calais. The campaign was a great success. The Nazis kept their strongest forces at Calais, a key reason why the Normandy invasion succeeded.

Inflatable tanks were used during the Allied deception campaign

There are two approaches here: one is to encrypt and conceal, the other is to imitate and disguise. The same principles map to modern censorship circumvention as two strategies: polymorphism and steganography. Polymorphism takes the concealment path. Polymorphic proxies make network traffic look like nothing but a stream of completely random bytes, so the GFW cannot classify it. Shadowsocks and VMess took this approach. But just as Enigma was eventually broken through the careful analysis and computation of Turing and his colleagues, the “look like nothing” proxies unavoidably have recognisable characteristics that the GFW can exploit. In fact, a stream of bytes with high entropy, no recognisable protocol headers, and no cleartext handshake is a signal in itself. By the late 2010s, the GFW had developed increasingly effective techniques for identifying and blocking these fully encrypted protocols, not by breaking their encryption, but by noticing that the traffic looked like encrypted data and nothing else.
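To make the "high entropy is itself a signal" point concrete, here is a minimal Python sketch. It is an illustration only, not the GFW's actual rule set: the function names, the 7.0 bits-per-byte threshold, and the six-byte printable-prefix check are all assumptions chosen for the example. The paper by Wu et al. in the references documents the heuristics observed in practice.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_fully_encrypted(first_packet: bytes, threshold: float = 7.0) -> bool:
    """Hypothetical classifier: high entropy and no printable protocol prefix.

    Both the threshold and the prefix length are illustrative assumptions.
    """
    printable_prefix = all(0x20 <= b <= 0x7E for b in first_packet[:6])
    return (not printable_prefix) and shannon_entropy(first_packet) >= threshold


# A cleartext protocol has a recognisable, low-entropy header.
http_request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n" * 8

# Stand-in for ciphertext: every byte value appears equally often.
uniform_payload = bytes(range(256)) * 4

print(round(shannon_entropy(http_request), 2), looks_fully_encrypted(http_request))
print(round(shannon_entropy(uniform_payload), 2), looks_fully_encrypted(uniform_payload))
```

The cleartext request is rejected by the classifier because of its printable prefix and low entropy, while the uniform payload is flagged. The traffic that tries hardest to look like nothing is exactly what such a rule selects.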

The response was a philosophical shift in circumvention design, taking the disguise path of steganography. Instead of camouflaging your proxy traffic and hoping it dodges some blacklist, what if you imitate (or even directly use) an existing protocol and disguise yourself as a common class of network traffic? That way, you avoid exposing identifiable characteristics to the GFW. If you pick a sufficiently common protocol and disguise well enough, the cost and collateral damage the GFW would incur in blocking it become enormous.

This is the idea behind TLS-based evasion technologies: use TLS directly and disguise proxy traffic as the most common class of traffic on the internet.

Trojan

The core idea of Trojan is simple: use TLS to disguise proxied traffic as normal HTTPS.

A Trojan server is, first and foremost, a real TLS server with a valid certificate for a real domain. When any client connects, the TLS handshake proceeds exactly as it would for any HTTPS website, because it is a standard TLS handshake, handled by a standard TLS stack. There is nothing to fingerprint in the handshake because there is nothing custom about it.

The distinction happens after the handshake completes, inside the encrypted tunnel. The server reads the first bytes of application data. If they begin with a valid Trojan authentication token (the hex-encoded SHA-224 hash of a pre-shared password, 56 bytes long), the server treats the connection as a proxy session and forwards traffic to the requested destination.
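The token described above can be sketched in a few lines of Python. The function names are hypothetical, and the constant-time comparison is a design choice of this sketch rather than something the text specifies.

```python
import hashlib
import hmac

TOKEN_LEN = 56  # SHA-224 yields 28 raw bytes, i.e. 56 hex characters


def trojan_token(password: str) -> bytes:
    """Hex-encoded SHA-224 digest of the pre-shared password, as ASCII bytes."""
    return hashlib.sha224(password.encode("utf-8")).hexdigest().encode("ascii")


def is_authenticated(first_bytes: bytes, expected_token: bytes) -> bool:
    """Check whether the application data starts with the expected token.

    hmac.compare_digest avoids leaking how many leading bytes matched.
    """
    return hmac.compare_digest(first_bytes[:TOKEN_LEN], expected_token)


token = trojan_token("example-password")  # placeholder password
print(len(token))
print(is_authenticated(token + b"\r\n", token))
print(is_authenticated(b"GET / HTTP/1.1\r\n", token))
```

Because the token travels inside the TLS tunnel, an on-path observer never sees it. Only a party that completes the handshake and sends application data can test a guess.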

The Trojan protocol itself is minimal. Since it operates inside the TLS tunnel, there is no need for its own encryption layer, and the designer skipped everything else too. The “Trojan Request” field simply contains a one-byte command followed by a SOCKS5-style destination header (address type, address, and port), similar to Shadowsocks, and the payload follows immediately.
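As a rough illustration of how little there is to the wire format, the sketch below assembles the first bytes a client would send inside the TLS tunnel. It follows the layout in the published protocol description listed in the references (token, CRLF, request, CRLF, payload, with a command byte at the start of the request). The function name is hypothetical, and only the CONNECT command with a domain-name address is covered.

```python
import hashlib
import struct

CRLF = b"\r\n"
CMD_CONNECT = 0x01   # TCP connect, as in SOCKS5
ATYP_DOMAIN = 0x03   # address is a length-prefixed domain name, as in SOCKS5


def build_trojan_request(password: str, host: str, port: int, payload: bytes = b"") -> bytes:
    """Assemble token + CRLF + request + CRLF + payload for a domain destination."""
    token = hashlib.sha224(password.encode("utf-8")).hexdigest().encode("ascii")
    host_bytes = host.encode("ascii")
    if len(host_bytes) > 255:
        raise ValueError("domain name must fit in a one-byte length prefix")
    request = (
        bytes([CMD_CONNECT, ATYP_DOMAIN, len(host_bytes)])
        + host_bytes
        + struct.pack("!H", port)  # port in network byte order
    )
    return token + CRLF + request + CRLF + payload


message = build_trojan_request("example-password", "example.com", 443, b"hello")
print(len(message), message[56:])
```

Everything after the 56-byte token is ordinary framing with no nonces, padding, or integrity tags, because the enclosing TLS session already provides confidentiality and integrity.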

Probe Defense

To defend against active probes, a Trojan server must behave identically to a web server when receiving unauthenticated requests. It silently forwards those requests to a fallback, typically Nginx or another web server hosting a real website. The unauthenticated client sees a normal webpage and has no way to confirm that the server is anything other than an ordinary HTTPS site. This is strictly stronger than Shadowsocks-style probe resistance: Shadowsocks only maintains its disguise at the TCP layer, whereas a probe sent to a Trojan server receives a perfectly correct HTTPS response, because the fallback is a real website.

The Trojan server routes unauthenticated requests to a backend web server
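The routing decision can be sketched as a pure function over the first bytes read from the TLS tunnel. This is a simplified model under stated assumptions, not the reference implementation: `classify_connection` and its return values are hypothetical names, and real servers also have to handle partial reads and timeouts.

```python
import hashlib
import hmac

TOKEN_LEN = 56
CRLF = b"\r\n"


def classify_connection(first_bytes: bytes, valid_tokens: set[bytes]) -> str:
    """Return "proxy" for an authenticated Trojan session, otherwise "fallback"."""
    candidate = first_bytes[:TOKEN_LEN]
    followed_by_crlf = first_bytes[TOKEN_LEN:TOKEN_LEN + 2] == CRLF
    # Check every token so the time taken does not depend on which one matched.
    matches = [hmac.compare_digest(candidate, token) for token in valid_tokens]
    return "proxy" if any(matches) and followed_by_crlf else "fallback"


tokens = {hashlib.sha224(b"example-password").hexdigest().encode("ascii")}

client_hello = next(iter(tokens)) + CRLF + b"\x01\x03\x0bexample.com\x01\xbb" + CRLF
probe = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

print(classify_connection(client_hello, tokens))
print(classify_connection(probe, tokens))
```

In a real deployment the bytes already read must be replayed to the fallback web server unchanged, so that the probe gets the same response it would get from the website directly.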

TLS-Based Evasion Family

Trojan was the first widely adopted TLS-based circumvention tool. A broader family of protocols has since built on the idea.

Cloak

Cloak is a TLS-based pluggable transport intended to be used in conjunction with existing protocols like Shadowsocks or OpenVPN, similar to Tor’s idea of pluggable transports. It wraps the existing protocol inside a TLS channel to provide censorship resistance through TLS indistinguishability. The appeal is operational rather than cryptographic: operators can keep their existing Shadowsocks or OpenVPN deployment intact and bolt on a TLS layer in front, instead of migrating to a different protocol. Cloak also actively mimics the TLS fingerprint of a real browser like Chrome or Firefox, so the outer layer blends in with ordinary HTTPS traffic.

VLESS

VLESS is an evolution of VMess that strips the encryption layer and other complexities like the time-sync requirement. Encryption is delegated to an outer TLS layer, and Nginx typically handles the fallback website. The resulting architecture looks very similar to Trojan.

The Merits and Costs

TLS-based evasion is a significant step forward from Shadowsocks. It offers several benefits:

Obfuscation. The majority of the modern internet runs on TLS. Using TLS for proxy traffic makes protocol-level analysis much more difficult, if not impossible. The potential collateral damage of blocking TLS is also far higher, forcing the GFW to be more cautious.
Probe resistance. The real website behind Trojan makes the server hard to distinguish from an ordinary HTTPS site, even under active probing.
Security. TLS is a well-studied, battle-tested cryptographic protocol. It is far safer than self-designed protocols such as early Shadowsocks, which had multiple security vulnerabilities.

These benefits come with costs:

Performance. The TLS handshake adds round trips before any data can flow: one for a full TLS 1.3 handshake and two for TLS 1.2, whereas Shadowsocks and VMess have no handshake of their own. This is often not noticeable to the end user, but it does impact performance on the long-distance, low-quality links typical of proxy deployments.
Complexity. The Trojan protocol itself is not much more complicated than Shadowsocks, but properly setting up a Trojan server requires buying a domain, obtaining and installing certificates, and configuring a backend web server. Many of these steps require solid knowledge of networking and web infrastructure.
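To put the performance cost in numbers, the back-of-the-envelope sketch below multiplies the link's round-trip time by the number of handshake round trips (one for a full TLS 1.3 handshake, two for TLS 1.2). The RTT values are made-up examples, not measurements, and the function name is hypothetical.

```python
def handshake_overhead_ms(rtt_ms: float, handshake_round_trips: int) -> float:
    """Extra delay before the first proxied byte, ignoring loss and retransmission."""
    return rtt_ms * handshake_round_trips


# Illustrative round-trip times: a nearby server versus an intercontinental link.
for label, rtt_ms in (("nearby", 20.0), ("intercontinental", 200.0)):
    for version, round_trips in (("TLS 1.3", 1), ("TLS 1.2", 2)):
        delay = handshake_overhead_ms(rtt_ms, round_trips)
        print(f"{label:>16} {version}: +{delay:.0f} ms per new connection")
```

The overhead is paid once per new connection, so it matters most for workloads that open many short-lived connections across a high-latency path.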

Not a Silver Bullet

TLS raises the bar, but it does not end the arms race. As Trojan and its family gained traction, the GFW shifted its approach away from protocol-level detection and toward statistical fingerprinting, where the target is no longer the proxy itself but the context around it. The next post examines that shift.

References

M. C. Tschantz, S. Afroz, Anonymous, and V. Paxson. SoK: Towards Grounding Censorship Circumvention in Empiricism. In 2016 IEEE Symposium on Security and Privacy (SP). https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7546542
Mingshi Wu, Jackson Sippe, Danesh Sivakumar, Jack Burg, Peter Anderson, Xiaokang Wang, Kevin Bock, Amir Houmansadr, Dave Levin, and Eric Wustrow. How the Great Firewall of China Detects and Blocks Fully Encrypted Traffic. In 32nd USENIX Security Symposium (USENIX Security 23). https://www.usenix.org/conference/usenixsecurity23/presentation/wu-mingshi
trojan-gfw. The Trojan Protocol. https://trojan-gfw.github.io/trojan/protocol.html
trojan-gfw. trojan. https://github.com/trojan-gfw/trojan
cbeuw. Cloak. https://github.com/cbeuw/Cloak
klzgrad. trojan issue #14, Design discussion. https://github.com/trojan-gfw/trojan/issues/14
Wikipedia. Operation Bodyguard. https://en.wikipedia.org/wiki/Operation_Bodyguard