GFW Technical Review 04 – The West Chamber Project
Most circumvention tools rely on a proxy and an encrypted channel between the user and that proxy, as a VPN does. But there is an entirely different class of techniques, much less well known today, that uses no proxy at all. These techniques target TCP and the GFW’s implementation of it, exploiting the gap between what the GFW believes about a connection and what the endpoints actually do.
TCP-Based Evasion
One of the earliest proposals was simply to ignore the TCP RST packets the GFW injects, since RST injection was the only mechanism its DPI component used to tear down connections. The trick worked in isolation but was never practical: even if a user patched their local TCP stack to drop resets, the GFW sends RSTs to both ends of the connection, and the remote server would close it regardless.
TCP is a stateful protocol. Both endpoints maintain a set of states (connection state, sequence number, transmission window, and so on) collectively known as the TCB (Transmission Control Block). For the GFW to reassemble TCP streams, it must track this state for every connection on its network. Once it desynchronizes, meaning it loses an accurate view of the TCB, it can no longer reliably reassemble the flow: with the wrong sequence numbers, it cannot reorder packets or know when to wait for a retransmission.
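To make the role of the TCB concrete, here is a minimal sketch of a per-direction state record and the window check that reassembly depends on. The `TCB` class, its fields, and the `classify` helper are hypothetical names chosen for illustration; nothing here is taken from the GFW’s actual implementation, which is not public.

```python
from dataclasses import dataclass

MOD = 2**32  # TCP sequence numbers wrap at 32 bits


@dataclass
class TCB:
    """One direction of a tracked connection (hypothetical, simplified)."""
    state: str    # e.g. "ESTABLISHED"
    rcv_nxt: int  # next sequence number expected from the sender
    rcv_wnd: int  # size of the receive window in bytes


def in_window(tcb: TCB, seq: int) -> bool:
    """True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2**32."""
    return (seq - tcb.rcv_nxt) % MOD < tcb.rcv_wnd


def classify(tcb: TCB, seq: int) -> str:
    """What a holder of this TCB would do with a segment starting at seq."""
    if seq == tcb.rcv_nxt:
        return "deliver"   # next in-order segment
    if in_window(tcb, seq):
        return "buffer"    # out of order: wait for the gap to fill
    return "discard"       # outside the window: stale or bogus


# The real server and a desynchronized observer look at the same segment.
server = TCB("ESTABLISHED", rcv_nxt=1000, rcv_wnd=65535)
observer = TCB("ESTABLISHED", rcv_nxt=900_000_000, rcv_wnd=65535)

print(classify(server, 1000))    # deliver
print(classify(observer, 1000))  # discard
```

The same segment is delivered by the endpoint and thrown away by the observer, purely because the two disagree about `rcv_nxt`.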
The key principle of TCP-based evasion is to desynchronize the GFW while keeping the real connection alive with the remote server. The usual move is to convince the GFW that the connection has closed when it has not. The space of possible TCP connections is enormous (every combination of source IP, source port, destination IP, and destination port), so the GFW cannot keep state for every connection that ever existed; it can only track active ones and drop state for those it thinks have closed. Once the GFW believes a connection is closed, it stops censoring it.
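The bookkeeping argument can be sketched as a toy flow table keyed by the four-tuple. This is an illustrative model only: the `FlowKey` and `FlowTable` names are hypothetical, the addresses come from documentation ranges, and the rule "drop state on any FIN or RST" is an assumption made to keep the example short.

```python
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class FlowTable:
    """Toy observer that keeps state only for flows it believes are open."""

    def __init__(self) -> None:
        self.active: Dict[FlowKey, str] = {}

    def observe(self, key: FlowKey, flags: str) -> bool:
        """Update state for one packet; return True if it would be inspected."""
        if key not in self.active:
            if "S" not in flags:
                return False           # untracked flow: nothing to inspect
            self.active[key] = "OPEN"  # start tracking on SYN
            return True
        if "F" in flags or "R" in flags:
            del self.active[key]       # believed closed: free the state
        return True


table = FlowTable()
key = FlowKey("192.0.2.10", 40000, "198.51.100.20", 443)
results = [table.observe(key, flags) for flags in ("S", "PA", "F", "PA")]
print(results)  # [True, True, True, False]
```

The last packet goes uninspected: once the entry is gone, later traffic on the same four-tuple no longer matches any tracked flow.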
From the GFW’s perspective, defending against desynchronization is fundamentally hard. TCP is a complex protocol with a large state space. Ambiguities inevitably arise around edge cases, and different TCP implementations behave differently. For performance, the GFW’s TCP stack has to be lightweight and therefore cannot be fully RFC-compliant. That is the attack vector: identify any divergence between the GFW’s TCP behavior and a standard implementation, then exploit it.
Even a perfect replica of the endpoint’s TCP stack would still be vulnerable. The GFW has no direct view of the states each side actually holds; it can only infer them. It also has no view of end-to-end network conditions, such as where packets are being dropped.
The West Chamber Project
The West Chamber Project was an early attempt at practical TCP-based evasion. From the client side, it performs a TCP insertion attack against the GFW by sending carefully crafted packets that convince the GFW the connection has closed.
The insertion happens during the TCP handshake, after the client receives a SYN/ACK from the remote server. Instead of responding with an ACK, the client first sends a FIN packet with an invalid sequence number.
From the GFW’s perspective, this looks like a connection termination. The GFW does not validate the sequence number, because tracking sequence numbers accurately would be too expensive. An RFC-compliant remote server, however, checks the sequence number, sees it is invalid, and simply ignores the packet. The client then proceeds with the normal handshake. The GFW, now desynchronized, no longer censors this TCP stream.
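The asymmetry described above can be simulated in a few lines. This is a sketch under stated assumptions: `NaiveMonitor` is a hypothetical stand-in for an observer that reacts to the FIN flag without checking the sequence number, and `server_accepts` is a simplified version of the receive-window test an RFC-compliant endpoint applies. No packets are sent.

```python
MOD = 2**32


def server_accepts(rcv_nxt: int, rcv_wnd: int, seq: int) -> bool:
    """Simplified acceptability test: seq must lie in the receive window."""
    return (seq - rcv_nxt) % MOD < rcv_wnd


class NaiveMonitor:
    """Hypothetical observer that acts on flags without validating seq."""

    def __init__(self) -> None:
        self.tracking = True

    def see(self, flags: str, seq: int) -> None:
        if "F" in flags:           # any FIN is taken as a teardown
            self.tracking = False


client_isn = 5000
rcv_nxt = client_isn + 1           # what the server expects after the SYN
rcv_wnd = 65535

bad_seq = (rcv_nxt + 2**31) % MOD  # far outside the server's window

monitor = NaiveMonitor()
monitor.see("F", bad_seq)
server_took_fin = server_accepts(rcv_nxt, rcv_wnd, bad_seq)

print(monitor.tracking)  # False
print(server_took_fin)   # False
```

The monitor stops tracking while the server discards the FIN, which is exactly the disagreement the insertion relies on.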
But this only desynchronizes the client-to-server direction; the server-to-client side still needs handling. The trick: inject an ACK with a correct sequence number but an invalid acknowledgment number. The GFW sees nothing unusual; it just looks like a valid ACK. RFC 793, however, requires that a server in the SYN-RECEIVED state respond with an RST when it receives an ACK with an invalid acknowledgment number. Crucially, the server sends that RST without closing the connection. It stays in SYN-RECEIVED, waiting for a valid ACK.
The GFW sees the server’s RST and concludes the connection has closed, unaware that the server is still sitting in SYN-RECEIVED. The client, for its part, drops the RST, because its sequence number is invalid. The GFW is now desynchronized in both directions while the server still waits for the handshake to finish. The client sends a valid ACK, completes the handshake, and begins transferring data with no further interference from the GFW.
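The server-side behavior can be modeled directly from the RFC 793 rule for SYN-RECEIVED: an ACK is acceptable if SND.UNA =< SEG.ACK =< SND.NXT, and otherwise the server replies with a reset whose sequence number equals the offending acknowledgment number. The sketch below is a toy model, with the `Server` class and `client_accepts` helper as hypothetical names; it does not touch the network.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MOD = 2**32


@dataclass
class Server:
    """Toy server in SYN-RECEIVED, following the RFC 793 ACK check."""
    snd_una: int  # oldest unacknowledged sequence number (the ISS)
    snd_nxt: int  # next sequence number to send (ISS + 1)
    state: str = "SYN-RECEIVED"

    def on_ack(self, ack: int) -> Optional[Tuple[str, int]]:
        """Process a bare ACK; return (flags, seq) of any reply."""
        if (ack - self.snd_una) % MOD <= (self.snd_nxt - self.snd_una) % MOD:
            self.state = "ESTABLISHED"
            return None
        # Unacceptable ACK: reply <SEQ=SEG.ACK><CTL=RST>, state unchanged.
        return ("R", ack)


def client_accepts(rcv_nxt: int, rcv_wnd: int, seq: int) -> bool:
    """The client only honors segments that fall inside its receive window."""
    return (seq - rcv_nxt) % MOD < rcv_wnd


server_iss = 9000
server = Server(snd_una=server_iss, snd_nxt=server_iss + 1)

bad_ack = (server_iss + 1 + 2**31) % MOD  # deliberately unacceptable
reply = server.on_ack(bad_ack)
state_after_bad_ack = server.state
rst_reaches_client = client_accepts(server_iss + 1, 65535, reply[1])

server.on_ack(server_iss + 1)             # the real ACK completes the handshake
print(reply[0], state_after_bad_ack, rst_reaches_client, server.state)
# R SYN-RECEIVED False ESTABLISHED
```

Because the RST inherits its sequence number from the bogus acknowledgment number, the client controls where it lands and can place it outside its own receive window.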
This client-side component is named ZHANG, after Zhang Sheng, a character in the classical Chinese romance “The Story of the Western Chamber” (西厢记).
The CUI Component
West Chamber includes a separate server-side plugin called CUI, named after Cui Yingying, the other protagonist in the same love story. CUI operates independently from ZHANG.
CUI works by injecting two pairs of SYN/ACK and RST packets after the initial SYN. Each SYN/ACK carries an invalid acknowledgment number; each RST carries an invalid sequence number. The client ignores all of them, but they thoroughly confuse the GFW, leaving it unable to track the connection in either direction. Once again, this exploits the GFW’s inability to track sequence and acknowledgment numbers accurately. The server then sends the real SYN/ACK and continues the connection undetected.
DNS Component
West Chamber is not limited to TCP-based evasion; it also includes a DNS component that defends against the GFW’s DNS poisoning.
To use this feature, users must first point their DNS resolver at an overseas server, since domestic resolvers return poisoned records. As explained in Post 01, the GFW injects fake DNS responses that race ahead of the legitimate ones. Those fake responses have a useful fingerprint: the spoofed IP addresses come from a small, known set. West Chamber filters them out, leaving the client with the legitimate responses.
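The filtering idea reduces to a set-membership test on the answer address. The sketch below assumes a hypothetical `KNOWN_FORGED` set filled with placeholder addresses from the RFC 5737 documentation ranges; the real fingerprint list is not given in this post, and the function name is illustrative rather than West Chamber’s own.

```python
from typing import Iterable, Optional, Set

# Placeholder addresses (RFC 5737 documentation range), NOT the real list.
KNOWN_FORGED: Set[str] = {"192.0.2.1", "192.0.2.2"}


def first_legitimate(answers: Iterable[str], forged: Set[str]) -> Optional[str]:
    """Return the first answer whose address is not in the forged set."""
    for addr in answers:
        if addr not in forged:
            return addr
    return None


# Answers in arrival order: the injected reply wins the race, the real one follows.
arrivals = ["192.0.2.1", "198.51.100.7"]
print(first_legitimate(arrivals, KNOWN_FORGED))  # 198.51.100.7
```

A resolver using this approach has to keep waiting after the first reply arrives, since the legitimate answer is usually the slower one.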
Variants
The West Chamber Project’s implementation is just one of many ways to perform TCP insertion attacks. Beyond manipulating sequence and acknowledgment numbers, other techniques include:
- TCP checksum manipulation: crafting packets with invalid checksums that the GFW may treat differently from the endpoints
- Packet reordering: deliberately sending packets out of order to confuse the GFW’s reassembly logic
- TTL manipulation: setting the TTL so packets reach the GFW but expire before reaching the remote server
Any discrepancy between the GFW’s handling of TCP and the server’s actual behavior is a potential attack vector.
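As an illustration of the checksum variant, the sketch below computes a TCP checksum over the IPv4 pseudo-header using the standard Internet checksum (RFC 1071), then shows how an endpoint would tell a valid segment from a deliberately corrupted one. The helper names, addresses, and header values are made up for the example, and whether a given observer skips this verification is an assumption that has to be measured, not something the code establishes.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                 # pad to a whole 16-bit word
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                  # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP segment."""
    pseudo = struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                         socket.inet_aton(dst_ip), 0, socket.IPPROTO_TCP,
                         len(segment))
    return internet_checksum(pseudo + segment)


def with_checksum(segment: bytes, value: int) -> bytes:
    """Write value into the checksum field (bytes 16-17 of the TCP header)."""
    return segment[:16] + struct.pack("!H", value) + segment[18:]


src, dst = "192.0.2.10", "198.51.100.20"  # documentation addresses
# Minimal 20-byte header: ports, seq, ack, data offset, FIN flag, window,
# checksum (zero for now), urgent pointer.
header = struct.pack("!HHIIBBHHH", 40000, 443, 5001, 0, 5 << 4, 0x01, 65535, 0, 0)

good = tcp_checksum(src, dst, header)
bad = (good + 1) & 0xFFFF

# A receiver verifies by recomputing over the segment as sent: 0 means valid.
valid_ok = tcp_checksum(src, dst, with_checksum(header, good)) == 0
corrupt_ok = tcp_checksum(src, dst, with_checksum(header, bad)) == 0
print(valid_ok, corrupt_ok)  # True False
```

An endpoint that verifies checksums silently drops the corrupted segment, so it is only useful as an insertion packet against an observer that does not.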
Resynchronization
West Chamber was more of a research project than a tool with large-scale real-world impact. Clever as the technique was, the GFW adapted quickly.
The GFW counters these attacks with a resynchronization mechanism: instead of abandoning state tracking once desynchronized, it tries to recover its TCB under certain conditions. Researchers’ measurements show that triggers include multiple SYN packets, multiple SYN/ACK packets, ACKs with manipulated acknowledgment numbers, and unexpected RSTs. Any of these can pull the GFW back into active tracking and break West Chamber’s core assumption. Recovery costs CPU, so the heuristics tend to be simple, cheap, and minimally stateful.
Countering Resynchronization
New strategies emerged to counter the GFW’s resynchronization mechanism. The key idea is to understand recovery well enough to weaponize it. Instead of merely desynchronizing the GFW (which would just trigger recovery), inject one packet that forces it into the resynchronization state, then a second packet with an invalid sequence number that pushes the recovery into an incorrect state.
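The two-packet idea can be sketched with a toy monitor. Two assumptions are baked in and are not claims about the real system: an extra SYN is used as the resynchronization trigger, and the monitor adopts the sequence number of the next client packet it sees while resynchronizing. The `ResyncMonitor` class is hypothetical.

```python
MOD = 2**32


class ResyncMonitor:
    """Toy observer with a hypothetical resynchronization rule."""

    def __init__(self, expected_seq: int, window: int = 65535) -> None:
        self.expected_seq = expected_seq
        self.window = window
        self.resync = False

    def see(self, flags: str, seq: int, payload: bytes = b"") -> bool:
        """Process one client packet; return True if its payload is inspected."""
        if "S" in flags:            # assumed trigger: an unexpected extra SYN
            self.resync = True
            return False
        if self.resync:             # assumed rule: trust the next packet's seq
            self.expected_seq = seq
            self.resync = False
        if (seq - self.expected_seq) % MOD >= self.window:
            return False            # outside the window: skipped
        self.expected_seq = (seq + len(payload)) % MOD
        return bool(payload)


real_seq = 1001
bogus_seq = (real_seq + 2**31) % MOD  # the server would ignore this segment

baseline = ResyncMonitor(expected_seq=real_seq)
inspected_without_attack = baseline.see("PA", real_seq, b"GET /")

attacked = ResyncMonitor(expected_seq=real_seq)
attacked.see("S", real_seq - 1)       # packet 1: force resynchronization
attacked.see("A", bogus_seq, b"x")    # packet 2: poison the recovered state
inspected_with_attack = attacked.see("PA", real_seq, b"GET /")

print(inspected_without_attack, inspected_with_attack)  # True False
```

In this model the monitor ends up synchronized, but to the wrong sequence number, so the real request falls outside its window.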
Systematic Discovery of Insertion Packets
Another area of study is the systematic generation of insertion packets. Any TCP insertion attack that manipulates the GFW’s TCB must ensure that the insertion packets:
- Are not dropped by network middleboxes between the client and the GFW
- Do not alter the remote server’s TCB in ways that would close the connection
This is a hard problem along several dimensions. First, the GFW’s TCP implementation and source code are not public; its behavior can only be studied through measurement. Second, both TCP implementations and network conditions are highly diverse, so any proposed solution must work across most real-world configurations while remaining effective against the GFW. To handle this, researchers have built automated tools that systematically discover such discrepancies and generate effective insertion packets.
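The two constraints, plus the requirement that the packet actually affects the censor, can be expressed as a filter over candidate packets. The sketch below is a brute-force enumeration over a tiny hand-picked space, not the symbolic-execution approach used by research tools. All three predicates are stand-in assumptions: middleboxes drop bad checksums, the server ignores out-of-window segments, and the monitor acts on FIN or RST regardless of sequence number.

```python
from dataclasses import dataclass
from itertools import product
from typing import List

MOD = 2**32


@dataclass(frozen=True)
class Candidate:
    flags: str
    seq_offset: int       # offset from the sequence number the server expects
    valid_checksum: bool


def survives_path(c: Candidate) -> bool:
    """Assumption: middleboxes on the path drop bad-checksum packets."""
    return c.valid_checksum


def server_ignores(c: Candidate, window: int = 65535) -> bool:
    """The server discards bad checksums and out-of-window segments."""
    return (not c.valid_checksum) or c.seq_offset % MOD >= window


def monitor_accepts(c: Candidate) -> bool:
    """Assumption: the monitor acts on FIN/RST without checking seq."""
    return "F" in c.flags or "R" in c.flags


def usable(c: Candidate) -> bool:
    return survives_path(c) and server_ignores(c) and monitor_accepts(c)


space = product(["A", "F", "R"], [0, 2**31], [True, False])
found: List[Candidate] = [c for c in (Candidate(*t) for t in space) if usable(c)]

for c in found:
    print(c.flags, c.seq_offset, c.valid_checksum)
```

Under these assumptions only the out-of-window FIN and RST with valid checksums survive the filter. Swapping in different predicates changes the result, which is why the real work lies in modeling the path, the server, and the censor accurately.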
Closing Thoughts
TCP insertion attacks can circumvent the GFW without any proxy at all, a real advantage in principle over the alternatives. In practice, the approach never went mainstream: it depends on a deep understanding of the GFW’s TCP stack (which keeps evolving) and on edge cases in TCP behavior that vary across endpoint implementations and network paths. The work has been invaluable for revealing the GFW’s internals, but it has remained an academic pursuit rather than a deployed circumvention tool.
The mainstream eventually converged on proxy-based solutions. The most influential of them, Shadowsocks, emerged in 2012, two years after the West Chamber Project. That is the subject of the next post.
References
- Zhongjie Wang, Yue Cao, Zhiyun Qian, Chengyu Song, and Srikanth V. Krishnamurthy. Your state is not mine: a closer look at evading stateful internet censorship. In Proceedings of the 2017 Internet Measurement Conference (IMC ’17). https://doi.org/10.1145/3131365.3131374
- Zhongjie Wang, Shitong Zhu, Yue Cao, Zhiyun Qian, Chengyu Song, Srikanth Krishnamurthy, Tracy D. Braun, and Kevin S. Chan. SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery. In Network and Distributed Systems Security (NDSS) Symposium 2020. https://dx.doi.org/10.14722/ndss.2020.24083
- Sheharbano Khattak, Mobin Javed, Philip D. Anderson, Vern Paxson. Towards Illuminating a Censorship Monitor’s Model to Facilitate Evasion. 3rd USENIX Workshop on Free and Open Communications on the Internet (FOCI 13). https://www.usenix.org/conference/foci13/workshop-program/presentation/khattak
- Richard Clayton, Steven J. Murdoch, and Robert N. M. Watson. 2006. Ignoring the Great Firewall of China. https://www.cl.cam.ac.uk/~rnc1/ignoring.pdf
- T. Ptacek, T.N. Newsham. 1998. Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection. https://www.cs.unc.edu/~fabian/course_papers/PtacekNewsham98.pdf
- Jon Postel. 1981. Transmission Control Protocol. RFC 793. https://tools.ietf.org/html/rfc793
- scholarzhang. 2010. West Chamber Project. https://github.com/codegooglecom/scholarzhang