GFW Technical Review 07 – Active Probing

So far, we have modeled the GFW as a passive censor: it observes traffic as it flows through, performs DPI, and blocks connections based on what it sees. But this is only part of the GFW’s arsenal. To improve detection accuracy, the GFW takes a more aggressive approach: it initiates its own connections and sends probing packets to servers suspected of hosting circumvention services like Shadowsocks. By observing how these servers respond, the GFW can gain insights that passive observation alone cannot provide.

The Firewall Talks Back

Passive DPI has inherent limitations against obfuscated protocols. As we saw with Shadowsocks, a well-designed protocol makes its traffic indistinguishable from random bytes, giving the GFW no reliable way to identify it. Instead, the GFW must rely on heuristics such as packet entropy and length, which often lack sufficient accuracy for confident blocking.
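To make the entropy heuristic concrete, here is a minimal Python sketch that computes the Shannon entropy of a payload's byte distribution. The function name `shannon_entropy` is an illustrative choice, and nothing here reflects how the GFW actually computes or thresholds entropy; it only shows what kind of signal such a heuristic works with.

```python
import math
from collections import Counter


def shannon_entropy(payload: bytes) -> float:
    """Return the Shannon entropy of the byte distribution, in bits per byte."""
    if not payload:
        return 0.0
    counts = Counter(payload)
    total = len(payload)
    # Sum of -p * log2(p) over every byte value that occurs in the payload.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A constant payload carries no information per byte; a payload in which
# every byte value appears equally often reaches the maximum of 8 bits.
low = shannon_entropy(b"\x00" * 64)
high = shannon_entropy(bytes(range(256)))
```

Long encrypted payloads score close to 8 bits per byte, but so do compressed files and other encrypted protocols, which is one reason a heuristic like this is not accurate enough on its own for confident blocking.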

Active probing adds a verification step. When the GFW’s passive analysis flags a suspicious connection, it records the server’s IP address and port. Then, from machines under its control, the GFW opens its own connection to that server and tries to interact with it. If the server responds in a way that confirms a circumvention service, the GFW adds it to the blocklist.

The key insight: while the Shadowsocks protocol is obfuscated, the server’s behavior is not. Active probing was first documented around 2012, when the GFW began targeting servers suspected of running Tor. It has since been extended to a wide range of circumvention protocols.

The Probing Infrastructure

The GFW operates its probing infrastructure at remarkable scale. In an experiment conducted in 2019, researchers received over 50,000 probing attempts originating from more than 12,000 unique IP addresses. These addresses predominantly belonged to two Autonomous Systems: AS4837 (CHINA169-BACKBONE) and AS4134 (CHINANET-BACKBONE), though probes from other ASes were also observed. Given the GFW’s distributed architecture, each regional component likely runs its own probing system rather than sharing a single nationwide one.

The same experiment revealed that approximately 90% of probes used source TCP ports between 32768 and 60999, the default ephemeral port range on many Linux kernels.

Timing and frequency vary significantly. When probing is triggered for a particular Shadowsocks server, the GFW typically sends multiple probes. The first can arrive less than a second after the initial suspicious traffic, though it may also come much later. Over 50% of initial probes arrive within one minute, and most arrive within 20 minutes. Follow-up probes mostly arrive within one hour, though some have been observed as late as 10 hours after the initial trigger.

Probing Shadowsocks

Probing a Shadowsocks server is not straightforward. Given the fully encrypted nature of Shadowsocks, the GFW cannot construct arbitrary valid packets without knowing the password. There is no “magic probe” that can definitively identify a Shadowsocks server from a single request. Instead, the GFW performs repeated probes with varying characteristics, hoping to learn something from the server’s TCP-layer behavior.

The ATYP Brute-Force Attack

The earliest documented Shadowsocks vulnerability to active probing was discovered in 2015. The attack exploits the fact that stream ciphers provide no integrity checking. Specifically, the attacker attempts to brute-force the ATYP field in the first Shadowsocks packet. ATYP has only three valid values: 0x01 (IPv4), 0x03 (domain), and 0x04 (IPv6). When Shadowsocks receives an invalid ATYP, it immediately closes the connection; otherwise, it waits for further data from the client.

Even without knowing the password, the attacker knows which byte corresponds to ATYP, since it sits at a fixed offset. They simply cycle through all 256 possible ciphertext values for that byte position. A stream cipher produces each ciphertext byte by XORing the plaintext byte with a keystream byte (see the Shadowsocks post), and for a fixed key and IV that keystream byte does not change, so trying every ciphertext value also means trying every possible plaintext value. The attacker then checks whether exactly three of the 256 attempts leave the connection open instead of closing it immediately. If so, that is a strong indicator that the server is running Shadowsocks.
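The counting argument above can be checked with a toy model. The sketch below is not Shadowsocks code: `make_toy_server` is a hypothetical stand-in that decrypts a single byte with a secret keystream byte and applies the ATYP check, which is all the attack depends on.

```python
import os

VALID_ATYP = {0x01, 0x03, 0x04}  # IPv4, domain, IPv6


def make_toy_server(keystream_byte: int):
    """Hypothetical stand-in for an unauthenticated stream-cipher server.

    Returns a function that reports whether the server would keep the
    connection open for a given ciphertext byte at the ATYP position.
    """
    def keeps_connection_open(ciphertext_byte: int) -> bool:
        atyp = ciphertext_byte ^ keystream_byte  # stream-cipher decryption
        return atyp in VALID_ATYP

    return keeps_connection_open


def count_survivors(server) -> int:
    """Try all 256 ciphertext values and count the ones that are not rejected."""
    return sum(1 for value in range(256) if server(value))


# The prober never learns the keystream byte, yet the count is always the same.
secret_keystream_byte = os.urandom(1)[0]
survivors = count_survivors(make_toy_server(secret_keystream_byte))
```

Because XOR with a fixed byte is a one-to-one mapping, `survivors` equals the number of valid ATYP values regardless of the secret. With an AEAD cipher, a modified byte fails authentication before ATYP is ever examined, so this count no longer leaks anything.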

Random Byte Attack

Since the GFW cannot construct valid Shadowsocks packets, it takes a simpler approach: send packets filled with random bytes and observe how the server responds. Depending on the length (and sometimes content) of the random packet, Shadowsocks servers exhibit characteristic behaviors:

Timeout: The server does not respond until the prober times out. This typically occurs when the probe packet is too short to cover the salt/IV, and the server waits for more bytes.
TCP RST: The server resets the connection. This occurs when the probe packet is invalid, for example when the decrypted ATYP byte is not a valid value.
FIN/ACK: The server does not reset immediately but closes the connection shortly after with a FIN/ACK.

The exact behavior depends on the probe packet length, the salt/IV length of the cipher, and the specific Shadowsocks implementation. By repeatedly probing with random bytes of different lengths, the GFW can determine whether the response pattern matches known Shadowsocks fingerprints. Even AEAD cipher implementations, which are resistant to the ATYP brute-force attack, still exhibit different behaviors (timeout vs. reset) depending on packet length.

[Figure: How different Shadowsocks implementations respond to random probes of varying length]

Note how OutlineVPN v1.0.6 exhibits a clearly identifiable pattern: a 50-byte random probe results in a FIN/ACK, while anything longer results in a RST. This makes it trivially fingerprintable.
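To see these reactions on a server you operate yourself, a single random-byte probe can be sketched in Python as follows. The names `classify_reaction` and `probe`, the 10-second wait, and the extra "DATA" label are illustrative assumptions, not a description of the GFW's actual tooling.

```python
import os
import socket


def classify_reaction(recv) -> str:
    """Map the outcome of one recv() call to a reaction label.

    `recv` is any callable with the signature of socket.recv.
    """
    try:
        data = recv(4096)
    except socket.timeout:
        return "TIMEOUT"   # the server stayed silent until we gave up
    except ConnectionResetError:
        return "RST"       # the server reset the connection
    return "FIN/ACK" if data == b"" else "DATA"


def probe(host: str, port: int, length: int, wait: float = 10.0) -> str:
    """Send `length` random bytes to host:port and classify the reaction."""
    with socket.create_connection((host, port), timeout=wait) as sock:
        sock.sendall(os.urandom(length))
        return classify_reaction(sock.recv)
```

Calling `probe` over a range of lengths and recording where the label changes yields the kind of per-length pattern discussed above.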

TCP FIN vs RST Threshold

The FIN/ACK vs RST distinction deserves explanation, as it reveals how subtle implementation details can become fingerprinting vectors.

This behavior is an artifact of how the Linux kernel handles TCP connection closure. When an application parses incoming data, determines it is invalid, and decides to close the connection, the kernel’s behavior depends on whether the application has fully drained the read buffer:

If the application has called recv() to read all data from the kernel’s buffer before calling close(), the kernel sends a FIN/ACK.
If unread data remains in the buffer when close() is called, the kernel sends a RST instead.

This gives probers a handy fingerprinting vector: they observe that probes of a specific length result in FIN/ACK, while longer probes produce RST.

[Figure: Linux TCP close behavior: FIN vs RST depending on read buffer state]
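The rule above can be turned into a small model of what a prober sees. The sketch assumes a hypothetical server that waits until it has `bytes_app_reads` bytes, reads exactly that many, finds them invalid, and calls close() immediately; real implementations differ in how much they read and when.

```python
def close_signal(probe_len: int, bytes_app_reads: int) -> str:
    """Model the reaction to a probe of `probe_len` bytes.

    Assumes the application blocks until `bytes_app_reads` bytes have arrived,
    consumes exactly that many, then closes the socket.
    """
    if probe_len < bytes_app_reads:
        return "TIMEOUT"   # not enough data yet: the server keeps waiting
    if probe_len == bytes_app_reads:
        return "FIN/ACK"   # read buffer fully drained before close()
    return "RST"           # unread bytes remain in the buffer at close()


# With a 50-byte read, the model gives FIN/ACK at exactly 50 bytes and RST
# for anything longer, matching the OutlineVPN v1.0.6 pattern noted earlier.
pattern = {n: close_signal(n, 50) for n in (49, 50, 51, 200)}
```

The threshold at which FIN/ACK turns into RST therefore reveals how many bytes the application consumed before giving up, which in turn hints at the salt/IV and header sizes in use.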

Replay Attack

Beyond random bytes, the GFW also captures the first packet from connections it monitors and replays it verbatim to the Shadowsocks server on a new connection. Since Shadowsocks is stateless, such a replay constitutes a valid request, and the server will respond with data. Even though the GFW cannot decrypt the response, the fact that the server responds meaningfully is yet another indicator of its identity.

Beyond identical replay, the GFW also sends modified versions of captured packets. Notable variations include:

Byte 0 modification (salt/IV tweak): Changes the salt/IV and observes behavior changes
Byte 16 modification (ATYP byte): The ATYP brute-force attack described above
Modifications at various positions between 0 and 16: Likely an attempt to estimate the salt/IV length

Note that with an altered salt/IV, the server’s behavior becomes identical to the random byte attack scenarios, since the decryption produces garbage.
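The replay variants listed above amount to copying a captured first packet and changing one byte at a chosen offset. The sketch below is illustrative only: the helper names are hypothetical, and flipping every bit with XOR 0xFF is an arbitrary choice, since the actual values the GFW substitutes are not specified here.

```python
def change_byte(packet: bytes, offset: int) -> bytes:
    """Return a copy of `packet` with the byte at `offset` altered."""
    if not 0 <= offset < len(packet):
        raise IndexError("offset outside packet")
    mutated = bytearray(packet)
    mutated[offset] ^= 0xFF  # any nonzero XOR mask guarantees a different byte
    return bytes(mutated)


def replay_variants(first_packet: bytes) -> dict:
    """Build the identical replay plus single-byte variants at offsets 0 and 16."""
    variants = {"identical": first_packet}
    for offset in (0, 16):
        if offset < len(first_packet):
            variants[f"byte_{offset}_changed"] = change_byte(first_packet, offset)
    return variants


captured = bytes(range(40))  # placeholder bytes standing in for a captured packet
variants = replay_variants(captured)
```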

Probe Resistance

To make a proxy like Shadowsocks resistant to active probing, several areas must be addressed to defend against the full attack surface.

Use Authenticated Encryption

Authenticated encryption schemes (e.g., AEAD ciphers) are essential. Simple stream ciphers without authentication repeatedly lead to subtle and dangerous vulnerabilities. Even if the GFW is not interested in decrypting the data stream, the lack of authentication still enables effective probing attacks.

Respond Consistently to Invalid Requests

The proxy must exhibit consistent behavior when receiving erroneous requests. For Shadowsocks, the most practical approach is to time out the connection when it receives invalid data. This ensures that probers consistently see a timeout, which prevents fingerprinting based on TCP behavior.

Research has shown that many legitimate servers on the Internet exhibit this “timeout on error” behavior, so the behavior itself is not fingerprintable. As shown in the earlier diagram, newer versions of shadowsocks-libev and OutlineVPN implement consistent timeout behavior against random probes.
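One way to get the "timeout on error" behavior is to keep reading and discarding input after a request fails validation, without ever replying or closing early. The asyncio sketch below illustrates that idea under stated assumptions: the function name, the 4096-byte read size, and the use of a fixed overall deadline are choices made for this example, not the code of any particular Shadowsocks implementation.

```python
import asyncio


async def drain_until_deadline(reader: asyncio.StreamReader,
                               hold_seconds: float) -> int:
    """Silently discard input until the peer closes or the deadline passes.

    Returns the number of bytes discarded. Nothing is ever written back,
    so the peer sees the same silence regardless of what it sent.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + hold_seconds
    discarded = 0
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            return discarded
        try:
            chunk = await asyncio.wait_for(reader.read(4096), remaining)
        except asyncio.TimeoutError:
            return discarded
        if not chunk:  # peer closed its side first
            return discarded
        discarded += len(chunk)
```

A server would call this from its connection handler once decryption or authentication fails, and close the socket only after it returns. Draining also keeps the kernel's read buffer empty, so the eventual close does not turn into a RST whose presence depends on the probe length.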

Alternatively, the server can close connections consistently on error, for example by always sending a TCP RST for invalid requests. This approach is more subtle to implement correctly: incomplete packets (those with insufficient bytes for the salt or header) must also be rejected rather than waiting for more data.

Implement Replay Protection

To resist replay attacks, the server must ensure salt values are not reused. The simplest approach is to maintain an in-memory store of previously seen salts. However, the GFW may save captured packets indefinitely, so naively the server would need to persist used salts forever: inefficient, prone to unbounded memory growth, and requiring disk persistence across restarts.

Modern proxy implementations employ time-based validation: the first packet includes a client timestamp, which the server checks against its own clock, rejecting requests where the time difference exceeds a threshold (e.g., 30 seconds). With this mechanism, the server only needs to store used salts for 30 seconds, knowing that any replay attempts beyond that window will be rejected due to timestamp validation.
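The combination of timestamp validation and a short-lived salt store can be sketched as follows. The class name `ReplayFilter`, its method names, and the injectable clock are assumptions made for illustration; a real protocol also has to define how the timestamp is encoded and authenticated inside the first packet.

```python
import time


class ReplayFilter:
    """Reject stale requests by timestamp and recent duplicates by salt."""

    def __init__(self, window: float = 30.0, clock=time.time):
        self.window = window
        self.clock = clock
        self.seen = {}  # salt -> time after which its timestamp is stale anyway

    def accept(self, salt: bytes, client_ts: float) -> bool:
        now = self.clock()
        # Forget salts whose timestamps would now fail the window check.
        self.seen = {s: exp for s, exp in self.seen.items() if exp >= now}
        if abs(now - client_ts) > self.window:
            return False  # too old or too far in the future
        if salt in self.seen:
            return False  # replay inside the window
        self.seen[salt] = client_ts + self.window
        return True
```

Usage with a controllable clock:

```python
current = [1000.0]
replay_filter = ReplayFilter(window=30.0, clock=lambda: current[0])
first = replay_filter.accept(b"salt-a", 1000.0)    # fresh request
second = replay_filter.accept(b"salt-a", 1000.0)   # immediate replay
current[0] = 1031.0
late = replay_filter.accept(b"salt-a", 1000.0)     # replay after the window
```

Here a salt is kept until its own timestamp falls out of the window, so a client whose clock runs slightly ahead is remembered a little longer than the window itself. The store stays bounded by the request rate times that retention period.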

Closing Thoughts

Active probing opens another dimension for the GFW. It provides confidence and data points orthogonal to passive DPI-based classification, significantly enhancing detection accuracy.

However, once probing methods are studied and publicized, they quickly become obsolete. Circumvention tools can efficiently patch the specific vulnerabilities being exploited. Modern circumvention protocols all incorporate probe resistance, making it a high-priority consideration for protocol designers. The biggest remaining obstacle is the continued use of old, outdated implementations that lack these protections.

References

Alice, Bob, Carol, Jan Beznazwy, and Amir Houmansadr. 2020. How China Detects and Blocks Shadowsocks. In Proceedings of the ACM Internet Measurement Conference (IMC ’20). https://dl.acm.org/doi/10.1145/3419394.3423644
David Fifield. Shadowsocks active-probing attacks and defenses. https://groups.google.com/g/traffic-obf/c/CWO0peBJLGc/m/Py-clLSTBwAJ
BreakWa11. 2015. Shadowsocks协议的弱点分析和改进. https://web.archive.org/web/20160829052958/https://github.com/breakwa11/shadowsocks-rss/issues/38
Sergey Frolov, Jack Wampler, Eric Wustrow. Detecting Probe-resistant Proxies. https://www.ndss-symposium.org/wp-content/uploads/2020/02/23087-paper.pdf
Shadowsocks. SIP022 AEAD-2022 Ciphers. https://shadowsocks.org/doc/sip022.html
Anonymous, Anonymous, Anonymous, David Fifield, Amir Houmansadr. A practical guide to defend against the GFW’s latest active probing. https://gfw.report/blog/ss_advise/en/