GFW Technical Review 04 – The West Chamber Project

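待月西厢下,迎风户半开。隔墙花影动,疑是玉人来。
(“Waiting for the moon beneath the western chamber, the door stands half open to the breeze. Flower shadows stir beyond the wall; could it be my beloved coming?”)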

Most circumvention technologies rely on some form of proxy and an encrypted channel between the user and the proxy, such as a VPN. However, there exists an entirely different class of circumvention technologies – much less well known today – that does not rely on any proxy at all. These techniques focus on the TCP protocol and GFW’s implementation of the TCP stack, achieving circumvention by exploiting inherent vulnerabilities in how GFW handles TCP streams.


TCP-Based Evasion

One of the earliest proposals was to simply ignore the TCP RST packets generated by GFW, as RST injection was the only mechanism used by GFW’s on-path DPI component to terminate connections. While this was demonstrated to work, it was never practical: even if users could tweak their local TCP stack to ignore resets, TCP RST packets are sent to both ends of the connection, and the remote server would close the connection regardless.

TCP is a stateful protocol. Both endpoints maintain a set of states – connection state, sequence number, transmission window, and so on – collectively known as the TCB (Transmission Control Block). Crucially, for GFW to perform TCP stream reassembly, it must track the state of every connection traversing its network. If GFW becomes desynchronized – that is, if it loses accurate track of the TCB – it can no longer reliably reassemble the TCP flow. For example, with incorrect sequence numbers, it wouldn’t know how to reorder packets or whether to wait for retransmission.
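To make the role of the TCB concrete, here is a minimal Python sketch of the per-direction state an observer would need for reassembly. The `TCB` fields, the `classify` helper, and the window size are illustrative assumptions, not GFW's actual data structures; the point is only that the same packet is handled very differently depending on the sequence number the observer expects.

```python
from dataclasses import dataclass

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


@dataclass
class TCB:
    """Minimal per-direction state an observer needs for reassembly."""
    rcv_nxt: int   # next sequence number expected from the sender
    rcv_wnd: int   # size of the receive window in bytes


def in_window(tcb: TCB, seq: int) -> bool:
    """True if seq falls inside [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2**32."""
    return (seq - tcb.rcv_nxt) % MOD < tcb.rcv_wnd


def classify(tcb: TCB, seq: int, length: int) -> str:
    """Decide what a reassembler would do with a data segment."""
    if not in_window(tcb, seq):
        return "discard"            # looks stale or bogus from this TCB's view
    if seq == tcb.rcv_nxt:
        tcb.rcv_nxt = (tcb.rcv_nxt + length) % MOD
        return "deliver"            # in order: append to the stream
    return "buffer"                 # ahead of rcv_nxt: wait for the gap to fill


synced = TCB(rcv_nxt=1000, rcv_wnd=65535)
desynced = TCB(rcv_nxt=3_000_000_000, rcv_wnd=65535)
print(classify(synced, 1000, 100))    # handled as in-order data
print(classify(desynced, 1000, 100))  # same segment, wrong TCB
```

With an accurate `rcv_nxt`, the segment is delivered in order; with a desynchronized `rcv_nxt`, the very same segment falls outside the window and is discarded.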

The key principle of TCP-based evasion is to desynchronize GFW while keeping the connection alive with the remote server. A common approach is to trick GFW into believing that a TCP connection has been closed when in fact it remains active. Since the total number of possible TCP connections is enormous (every combination of the 4-tuple: source IP, source port, destination IP, and destination port), GFW cannot maintain state for every possible connection. It can only feasibly track active connections and discard state for those it believes have closed. Therefore, if GFW thinks a connection is closed, it stops censoring that connection.
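The bookkeeping described above can be sketched as a small flow table keyed by the 4-tuple. This is a toy model under stated assumptions: the `FlowTable` class, its single-letter flag strings, and the rule "any FIN or RST frees the entry" are hypothetical simplifications, not a description of GFW's real implementation.

```python
from typing import Dict, Tuple

FourTuple = Tuple[str, int, str, int]  # (src IP, src port, dst IP, dst port)

# For IPv4 the 4-tuple space is 2**32 * 2**16 * 2**32 * 2**16 = 2**96 entries,
# far too many to pre-allocate state for.
TUPLE_SPACE = 2 ** 96


class FlowTable:
    """Toy monitor that inspects only the flows it believes are open."""

    def __init__(self) -> None:
        self.active: Dict[FourTuple, str] = {}

    def on_packet(self, flow: FourTuple, flags: str) -> bool:
        """Update state; return True if this flow is still being inspected."""
        if "S" in flags and "A" not in flags:   # SYN: start tracking
            self.active[flow] = "OPEN"
        elif "F" in flags or "R" in flags:      # FIN or RST: assume closed
            self.active.pop(flow, None)         # free the state
        return flow in self.active


table = FlowTable()
flow = ("10.0.0.2", 40000, "203.0.113.5", 80)
for flags in ("S", "A", "F", "PA"):
    print(flags, table.on_packet(flow, flags))
```

Once the entry is evicted, later data packets on the same 4-tuple (the final `PA` here) are no longer inspected, which is exactly the outcome a desynchronization attack aims for.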

From GFW’s perspective, defending against desynchronization is fundamentally difficult. TCP is a complex protocol with a large state space. Ambiguities inevitably arise in how packets or states are handled at the protocol’s edge cases, and different TCP implementations can exhibit different behaviors. For performance reasons, GFW’s TCP stack must be lightweight and therefore cannot be fully RFC-compliant. This is the attack vector: if we can identify how GFW’s TCP implementation differs from standard implementations, we may be able to exploit those differences. In fact, even if GFW managed to faithfully replicate the TCP implementation used by the endpoints, it would still be vulnerable to desynchronization. GFW lacks full visibility into the actual states maintained by both sides of the connection – it can only infer them. Furthermore, it has no knowledge of end-to-end network conditions, such as where packet drops occur.

Figure: The state transitions in the TCP protocol are complicated, leaving plenty of room for ambiguity.

The West Chamber Project

The West Chamber Project was an early attempt at practical TCP-based evasion. It performs a TCP insertion attack against GFW from the client side by sending carefully crafted packets to trick GFW into believing the TCP connection has closed.

The insertion happens during the TCP handshake, when the client receives a SYN/ACK from the remote server. Instead of responding with an ACK packet, the client first sends a FIN packet with an invalid sequence number.

From GFW’s perspective, it sees a FIN packet and concludes that the connection has been terminated. GFW does not validate the sequence number because accurately tracking sequence numbers is computationally expensive. However, an RFC-compliant remote server first checks the sequence number, determines it is invalid, and simply ignores the packet. The client then proceeds with the normal handshake. Since GFW is now desynchronized, this TCP stream is no longer subject to censorship.
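The discrepancy exploited here can be sketched in a few lines of Python. The function names, the initial sequence number, and the window size are all hypothetical; the monitor model (any FIN or RST ends tracking, sequence number unchecked) is the behavior this post attributes to GFW at the time, not something verified by this code.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def server_accepts(rcv_nxt: int, rcv_wnd: int, seq: int) -> bool:
    """Endpoint check: is seq inside the receive window (modulo 2**32)?"""
    return (seq - rcv_nxt) % MOD < rcv_wnd


def naive_monitor_closes(flags: str) -> bool:
    """Assumed monitor behaviour: any FIN or RST ends tracking, seq unchecked."""
    return "F" in flags or "R" in flags


client_isn = 1_000              # hypothetical initial sequence number
rcv_nxt = client_isn + 1        # the server expects ISN + 1 after the SYN
rcv_wnd = 65_535                # hypothetical receive window

# Pick a sequence number half the sequence space away from what is expected.
bogus_fin_seq = (rcv_nxt + 2 ** 31) % MOD

monitor_gave_up = naive_monitor_closes("F")
server_ignored = not server_accepts(rcv_nxt, rcv_wnd, bogus_fin_seq)
print(monitor_gave_up, server_ignored)
```

Both conditions must hold for the insertion to work: the monitor acts on the FIN, while the server rejects it as out of window and keeps the connection alive.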

But this only desynchronizes the client-to-server direction; the server-to-client direction still needs to be desynchronized. The client accomplishes this by injecting an ACK packet with a correct sequence number but an invalid acknowledgment number. This injected packet doesn’t immediately trigger any reaction from GFW – it simply appears to be a valid ACK. However, RFC 793 requires a server in the SYN-RECEIVED state to respond with an RST when it receives an ACK with an invalid acknowledgment number. Crucially, even though the server sends this RST, it does not close the connection. Instead, it remains in the SYN-RECEIVED state, waiting for a valid ACK.

When GFW observes this RST packet from the server, it concludes that the connection has closed – unaware that the server remains in SYN-RECEIVED state. The client, meanwhile, drops this RST because it contains an invalid sequence number. At this point, GFW is fully desynchronized in both directions, while the remote server is still waiting in SYN-RECEIVED state. The client then completes the handshake with a valid ACK and can begin transferring data without GFW’s interference.
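The server-side half of this exchange can be modeled as follows. This is a simplified sketch: the `Server` class and `client_drops_rst` helper are hypothetical names, the sequence numbers are made up, and only the SYN-RECEIVED rule from RFC 793 (an unacceptable ACK yields a reset whose sequence number is the offending acknowledgment number, with no state change) is represented.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


@dataclass
class Server:
    """Server-side view of a half-open connection (hypothetical model)."""
    state: str
    snd_una: int   # oldest unacknowledged sequence number (the server's ISN)
    snd_nxt: int   # next sequence number to send (ISN + 1 after the SYN/ACK)

    def on_ack(self, seg_ack: int) -> Optional[Tuple[str, int]]:
        """Process an ACK in SYN-RECEIVED; return a reply segment or None."""
        if self.state != "SYN-RECEIVED":
            raise ValueError("this sketch only models SYN-RECEIVED")
        # RFC 793 test: SND.UNA <= SEG.ACK <= SND.NXT, modulo 2**32.
        offset = (seg_ack - self.snd_una) % MOD
        if offset <= (self.snd_nxt - self.snd_una) % MOD:
            self.state = "ESTABLISHED"   # acceptable ACK completes the handshake
            return None
        # Unacceptable ACK: reply <SEQ=SEG.ACK><CTL=RST>; the state is unchanged.
        return ("RST", seg_ack)


def client_drops_rst(rcv_nxt: int, rcv_wnd: int, rst_seq: int) -> bool:
    """The client ignores a reset whose sequence number is outside its window."""
    return (rst_seq - rcv_nxt) % MOD >= rcv_wnd


server = Server(state="SYN-RECEIVED", snd_una=5000, snd_nxt=5001)
bad_ack = (5001 + 2 ** 31) % MOD           # deliberately invalid acknowledgment
reply = server.on_ack(bad_ack)             # server answers with an RST
print(reply, server.state)
print(client_drops_rst(rcv_nxt=5001, rcv_wnd=65535, rst_seq=reply[1]))
server.on_ack(5001)                        # the real ACK completes the handshake
print(server.state)
```

The reset inherits its sequence number from the bogus acknowledgment number, which is why the client can safely discard it while an observer that does not validate sequence numbers takes it at face value.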

This client-side component is named ZHANG, after Zhang Sheng, a character in the classical Chinese romance “The Story of the Western Chamber” (西厢记).

Figure: The full packet flow of the TCP handshake, with ZHANG injecting two packets.

The CUI Component

West Chamber includes a separate server-side plugin called CUI, named after Cui Yingying, the other protagonist in the same love story. CUI operates independently from ZHANG.

CUI works by injecting two pairs of SYN/ACK and RST packets after receiving the initial SYN. The SYN/ACK contains an invalid acknowledgment number, and the RST contains an invalid sequence number. The client ignores all of these packets, but they thoroughly confuse GFW, rendering it unable to track the connection in either direction. Once again, this exploits GFW’s inability to accurately track sequence and acknowledgment numbers. After injecting these packets, the server sends the real SYN/ACK response and continues the connection undetected.
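Why the client survives these four injected packets can be sketched by replaying them against the SYN-SENT rules of RFC 793. Everything here is a hedged toy model: `syn_sent_step` and `replay` are hypothetical helpers, the sequence numbers are invented, and the injected RST is modeled as a bare reset without an acceptable ACK, which is an assumption about how its invalid sequence number plays out, since a client in SYN-SENT judges resets by their acknowledgment field.

```python
from typing import List, Tuple

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def syn_sent_step(iss: int, snd_nxt: int, flags: str, seg_ack: int) -> str:
    """Return the client's next state after one segment arrives in SYN-SENT."""
    if "A" in flags:
        # Acceptable only if ISS < SEG.ACK <= SND.NXT (modulo 2**32).
        ack_ok = 0 < (seg_ack - iss) % MOD <= (snd_nxt - iss) % MOD
        if not ack_ok:
            return "SYN-SENT"        # segment discarded, state unchanged
    if "R" in flags:
        # A reset only counts when it carries an acceptable ACK.
        return "CLOSED" if "A" in flags else "SYN-SENT"
    if "S" in flags and "A" in flags:
        return "ESTABLISHED"         # genuine SYN/ACK completes the handshake
    return "SYN-SENT"


def replay(iss: int, packets: List[Tuple[str, int]]) -> List[str]:
    """Feed (flags, ack number) pairs to the client; record state after each."""
    state, seen = "SYN-SENT", []
    for flags, seg_ack in packets:
        if state == "SYN-SENT":
            state = syn_sent_step(iss, (iss + 1) % MOD, flags, seg_ack)
        seen.append(state)
    return seen


iss = 7000                                  # hypothetical client ISN
bad_ack = (iss + 1 + 2 ** 31) % MOD         # invalid acknowledgment number
injected = [("SA", bad_ack), ("R", 0), ("SA", bad_ack), ("R", 0)]
print(replay(iss, injected + [("SA", iss + 1)]))
```

In this model the client stays in SYN-SENT through all four injected packets and only moves to ESTABLISHED on the real SYN/ACK, while an observer that does not check acknowledgment or sequence numbers has seen two apparent handshake replies and two apparent resets.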

Figure: Packet flow of the TCP handshake, with CUI injecting four packets from the server side.

DNS Component

West Chamber is not limited to TCP-based evasion; it also includes a DNS component that defends against GFW’s DNS poisoning.

To use this feature, users must first configure their DNS resolver to point to an overseas DNS server, since domestic servers contain poisoned records. As explained in Blog 01, GFW injects fake DNS responses that race ahead of legitimate responses from the actual DNS server. These fake responses have certain fingerprints that West Chamber exploits. In particular, the spoofed IP addresses that GFW generated came from a very limited set. West Chamber performs packet filtering based on these known fake IPs, allowing the client to discard GFW’s spoofed responses and use the legitimate ones.
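The filtering idea can be illustrated with a short Python sketch. The blocklist below is a placeholder built from documentation-only addresses (RFC 5737), not the addresses GFW actually injected, and `is_spoofed` and `first_legitimate` are hypothetical helper names; West Chamber itself performed this filtering at the packet level rather than in application code.

```python
import ipaddress
from typing import Iterable, List, Optional

# Hypothetical blocklist; these are documentation addresses (RFC 5737),
# not the addresses GFW actually injects.
KNOWN_FAKE_IPS = frozenset(
    ipaddress.ip_address(a) for a in ("192.0.2.1", "198.51.100.7")
)


def is_spoofed(answers: Iterable[str]) -> bool:
    """A response is treated as spoofed if any answer is a known fake IP."""
    return any(ipaddress.ip_address(a) in KNOWN_FAKE_IPS for a in answers)


def first_legitimate(responses: List[List[str]]) -> Optional[List[str]]:
    """Responses are listed in arrival order; skip spoofed ones."""
    for answers in responses:
        if not is_spoofed(answers):
            return answers
    return None


# The injected response wins the race, but the filter waits for the real one.
arrivals = [["192.0.2.1"], ["203.0.113.10"]]
print(first_legitimate(arrivals))
```

The design relies entirely on the fake addresses coming from a small, known set; a response carrying an address outside that set passes the filter regardless of who sent it.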

Variants

The West Chamber Project’s implementation represents just one of many ways to perform TCP insertion attacks. Beyond manipulating sequence and acknowledgment numbers, other techniques include:

  • TCP checksum manipulation: Crafting packets with invalid checksums that GFW may process differently than the endpoints
  • Packet reordering: Intentionally sending packets out of order to confuse GFW’s reassembly logic
  • TTL manipulation: Setting the TTL value so that packets reach GFW but expire before reaching the remote server

Any discrepancy between GFW’s handling of TCP and the real server’s handling represents a potential attack vector.
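Two of these variants reduce to simple arithmetic, sketched below in Python. The checksum routine follows the standard Internet checksum (RFC 1071) but, as a simplification, is applied to a bare byte string rather than a real TCP segment with its pseudo-header; the hop counts and helper names (`with_checksum`, `verifies`, `reaches`, `insertion_ttl`) are hypothetical.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def with_checksum(body: bytes) -> bytes:
    """Append a valid checksum word to body (padded to even length)."""
    if len(body) % 2:
        body += b"\x00"
    return body + struct.pack("!H", internet_checksum(body))


def verifies(sealed: bytes) -> bool:
    """An endpoint accepts data only if the checksum over everything is zero."""
    return internet_checksum(sealed) == 0


def reaches(ttl: int, hops_away: int) -> bool:
    """A packet reaches a device hops_away hops out if its TTL lasts that long."""
    return ttl >= hops_away


def insertion_ttl(hops_to_monitor: int, hops_to_server: int) -> int:
    """Smallest TTL that reaches the monitor but expires before the server."""
    if not 0 < hops_to_monitor < hops_to_server:
        raise ValueError("monitor must sit strictly between client and server")
    return hops_to_monitor


sealed = with_checksum(b"example payload!")
corrupted = sealed[:-1] + bytes([sealed[-1] ^ 0x01])   # flip one checksum bit
print(verifies(sealed), verifies(corrupted))

ttl = insertion_ttl(hops_to_monitor=8, hops_to_server=15)
print(ttl, reaches(ttl, 8), reaches(ttl, 15))
```

A packet with the corrupted checksum is dropped by any endpoint that verifies it, so it only has an effect on an observer that skips verification. The TTL variant gets the same asymmetry from the network path instead, at the cost of having to estimate the hop counts first.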


Resynchronization

West Chamber was more of a research project than one with large-scale real-world impact. The technique was clever and well-crafted, but GFW evolved quickly in response.

GFW countered these attacks by introducing a resynchronization mechanism: instead of abandoning state tracking after being desynchronized, it attempts to resynchronize its TCB under certain conditions. According to researchers’ measurements, scenarios such as multiple SYN packets, multiple SYN/ACK packets, ACKs with manipulated acknowledgment numbers, or unexpected RST packets could all trigger resynchronization, bringing GFW back to actively censoring the connection and breaking West Chamber’s fundamental assumption. Because such mechanisms always incur additional computational cost, they tend to be simple and cheap and to require minimal state tracking.

Countering Resynchronization

New strategies emerged to combat GFW’s resynchronization mechanism. The key idea is to gain sufficient understanding of how resynchronization works. Then, instead of merely desynchronizing GFW (which would trigger resynchronization), we can inject a packet that forces GFW into its resynchronization state, followed by another packet with an invalid sequence number that causes GFW to resynchronize into an incorrect state.
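The two-step idea can be sketched with a toy monitor in Python. The `ResyncMonitor` class is a deliberately crude assumption: it enters a resynchronization state on any SYN or RST and then adopts the sequence number of the next packet it sees. The real triggers and resynchronization rules were inferred by researchers through measurement and are more involved than this.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


class ResyncMonitor:
    """Toy monitor that re-learns the client's sequence number when confused."""

    def __init__(self, expected_seq: int) -> None:
        self.expected_seq = expected_seq
        self.resyncing = False

    def observe(self, flags: str, seq: int, length: int = 0) -> bool:
        """Return True if the monitor treats this packet as in-sync payload."""
        if self.resyncing:
            # Assumed rule: trust whatever sequence number arrives next.
            self.expected_seq = (seq + length) % MOD
            self.resyncing = False
            return length > 0
        if "S" in flags or "R" in flags:
            self.resyncing = True          # assumed trigger: extra SYN or RST
            return False
        if seq == self.expected_seq:
            self.expected_seq = (seq + length) % MOD
            return length > 0
        return False                        # out-of-sync data is not reassembled


def server_accepts(rcv_nxt: int, rcv_wnd: int, seq: int) -> bool:
    """Endpoint check: is seq inside the receive window (modulo 2**32)?"""
    return (seq - rcv_nxt) % MOD < rcv_wnd


real_seq = 1001                              # hypothetical next client sequence
bogus_seq = (real_seq + 2 ** 31) % MOD       # out of window for the server

# Trigger alone: the monitor resynchronizes onto the real request.
m1 = ResyncMonitor(real_seq)
m1.observe("S", 1000)
print(m1.observe("PA", real_seq, 50))

# Trigger followed by an insertion packet: the monitor adopts a wrong state.
m2 = ResyncMonitor(real_seq)
m2.observe("S", 1000)
m2.observe("PA", bogus_seq, 1)
print(m2.observe("PA", real_seq, 50), server_accepts(real_seq, 65535, bogus_seq))
```

In the first run the monitor recovers and inspects the real request, which is why plain desynchronization stopped working. In the second run the insertion packet poisons the resynchronization, and the server ignores that packet because its sequence number is out of window.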

Figure: A more sophisticated insertion attack targeting GFW’s resynchronization mechanism, proposed by Zhongjie Wang et al.

Systematic Discovery of Insertion Packets

Another area of study is the systematic generation of insertion packets. Any TCP insertion attack that manipulates GFW’s TCB must ensure that the insertion packets:

  1. Are not dropped by network middleboxes between the client and GFW
  2. Do not alter the remote server’s TCB in ways that cause it to close the connection

This is a challenging problem along multiple dimensions. First, we do not have access to GFW’s TCP implementation specifications or source code – its behavior can only be studied through measurement. Second, both TCP implementations and network conditions are highly diverse. Any proposed solution must be compatible with the majority of real-world configurations while remaining effective against GFW. To address this, researchers have developed automated tools to systematically discover such discrepancies and generate effective insertion packets.
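The search problem can be framed as filtering a space of candidate packets through three predicates, as in the Python sketch below. All three predicates are made-up stand-ins: a real tool would replace them with live measurements against middleboxes, servers, and GFW, and would explore a far larger space than the three fields of this hypothetical `Candidate` type.

```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class Candidate:
    """One candidate insertion packet, reduced to three illustrative features."""
    flags: str
    seq_in_window: bool
    checksum_ok: bool


def passes_middleboxes(c: Candidate) -> bool:
    """Assumed: middleboxes on the path drop packets with bad checksums."""
    return c.checksum_ok


def server_ignores(c: Candidate) -> bool:
    """Assumed: the server discards bad checksums and out-of-window segments."""
    return not c.checksum_ok or not c.seq_in_window


def monitor_accepts(c: Candidate) -> bool:
    """Assumed: the monitor acts on any FIN or RST without further checks."""
    return c.flags in ("F", "R")


def discover() -> List[Candidate]:
    """Enumerate the candidate space and keep packets meeting all three tests."""
    space = product(("F", "R", "A"), (True, False), (True, False))
    return [
        c
        for c in (Candidate(*features) for features in space)
        if passes_middleboxes(c) and server_ignores(c) and monitor_accepts(c)
    ]


for candidate in discover():
    print(candidate)
```

Under these assumed predicates only FIN and RST packets with a valid checksum and an out-of-window sequence number survive. Changing any one predicate, for example a middlebox that tolerates bad checksums, changes the set of usable packets, which is why the discovery has to be repeated as networks and GFW evolve.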


Closing Thoughts

TCP insertion attacks have the potential to circumvent GFW without requiring additional proxies – a significant advantage over other circumvention solutions. However, this approach has never achieved mainstream adoption. Successful TCP insertion requires a deep understanding of GFW’s TCP stack, which itself evolves constantly, and the technique relies on edge cases in the TCP protocol while facing uncertainty about endpoint implementations and network paths. As a result, it has never been reliable enough for practical use. It has nevertheless revealed a great deal about GFW’s inner workings, and it remains primarily of academic interest.

The mainstream eventually converged on proxy-based solutions. The single most influential one, Shadowsocks, emerged in 2012 – two years after the West Chamber Project. That will be the subject of the next blog.

