GFW Technical Review 02 – VPN

When the Great Firewall first deployed its earliest filtering mechanisms – IP blocking, DNS poisoning, and basic keyword-based DPI – the Internet was still dominated by plaintext protocols. This gave the firewall broad visibility into traffic contents, but it also left the firewall highly vulnerable to any technology that encrypted or encapsulated traffic. VPNs quickly emerged as a natural response, not because they were designed for circumvention, but because they offered a mature, standardized way to wrap traffic inside an encrypted tunnel, making the underlying communication opaque to simple filtering logic. If the traffic never revealed its true destination, IP- and DNS-based blocking became ineffective, and once the payload was encrypted, keyword-based DPI could no longer read or classify it. The assumption at the time – largely correct – was that encrypted tunnels would appear “opaque enough” that early DPI systems could not meaningfully identify them.

How VPNs Work

At its core, a VPN is a mechanism that takes arbitrary IP packets and encapsulates them inside another protocol, typically with encryption applied to the inner payload. A VPN client creates a virtual tunnel interface, which behaves like a normal network adapter. Applications send packets to this interface as if they were sending them directly to the Internet. Instead of being routed normally, these packets are encrypted and then wrapped inside an outer VPN transport protocol. Conceptually, a VPN packet looks like this:

Outer IP header → Transport header (TCP/UDP) → Encrypted VPN payload (inner IP packet)

The outer IP header points to the VPN server, usually located outside the censored network. The server decrypts the packets, restores the original IP packets, and forwards them to their intended destinations. As a result, the only traffic visible to the GFW is the encrypted, encapsulated VPN tunnel. The destination IPs, DNS queries, and application-layer content are hidden.
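The encapsulation described above can be sketched in a few lines of Python. This is a toy model, not any real VPN's wire format: the `OuterPacket` class, the 4-byte packet ID prefix, the addresses, and the SHA-256-based keystream are all assumptions made for illustration, and the keystream is a stand-in for a real cipher that must not be used for actual security.

```python
import hashlib
import struct
from dataclasses import dataclass


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). NOT secure; stands in for a real cipher."""
    blocks = []
    for counter in range((length + 31) // 32):
        blocks.append(hashlib.sha256(key + nonce + struct.pack("!I", counter)).digest())
    return b"".join(blocks)[:length]


def xor_bytes(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class OuterPacket:
    """Hypothetical model of what an on-path observer sees."""
    src_ip: str      # plaintext: the VPN client
    dst_ip: str      # plaintext: the VPN server, not the real destination
    transport: str   # plaintext: TCP or UDP
    dst_port: int    # plaintext
    payload: bytes   # packet ID + encrypted inner IP packet


def encapsulate(inner_packet: bytes, key: bytes, packet_id: int,
                client_ip: str, server_ip: str) -> OuterPacket:
    """Encrypt the inner packet, then wrap it in an outer packet to the VPN server."""
    nonce = struct.pack("!I", packet_id)
    ciphertext = xor_bytes(inner_packet, toy_keystream(key, nonce, len(inner_packet)))
    return OuterPacket(client_ip, server_ip, "UDP", 1194, nonce + ciphertext)


def decapsulate(outer: OuterPacket, key: bytes) -> bytes:
    """What the VPN server does: strip the outer layer and recover the inner packet."""
    nonce, ciphertext = outer.payload[:4], outer.payload[4:]
    return xor_bytes(ciphertext, toy_keystream(key, nonce, len(ciphertext)))


# A byte string standing in for a real inner IP packet.
inner = b"[inner IP packet] dst=203.0.113.80 GET / HTTP/1.1 Host: blocked.example"
key = b"shared-secret-from-handshake"
outer = encapsulate(inner, key, 1, "192.0.2.10", "198.51.100.7")

print(outer.dst_ip, outer.dst_port)          # the only addressing the observer sees
print(b"blocked.example" in outer.payload)   # the inner hostname should not be readable
print(decapsulate(outer, key) == inner)      # the server recovers the original packet
```

The observer-visible fields are exactly the ones on `OuterPacket` other than the contents of `payload`, which is why the outer headers matter so much for blocking.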

However, not everything is concealed: the outer IP and transport headers remain in plaintext. If GFW operators know the IP address of a VPN endpoint, they can still block it outright, no matter how strong the encryption inside the tunnel is.

Common VPN Protocols

Different VPN protocols encapsulate traffic in different ways, which also influences how detectable they are.

IPsec

One of the earliest standardized VPN protocols. It uses ESP (Encapsulating Security Payload) and AH (Authentication Header) to protect packets and relies on IKE (Internet Key Exchange) for key negotiation. IPsec has recognizable packet formats, and IKE commonly uses UDP port 500 (or UDP port 4500 when NAT traversal is in use), making it relatively easy to fingerprint.
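To illustrate why IKE is easy to recognize, here is a minimal sketch of a check a passive observer might run on a UDP payload. It assumes the fixed 28-byte IKE header layout (two 8-byte SPIs, next payload, version, exchange type, flags, message ID, length) and the 4-byte all-zero non-ESP marker that precedes IKE messages on port 4500. The function name and the exact set of checks are illustrative choices, not a description of what the GFW actually does.

```python
import struct

# Fixed IKE header: initiator SPI, responder SPI, next payload, version,
# exchange type, flags, message ID, total length. 28 bytes in total.
IKE_HEADER = struct.Struct("!8s8sBBBBII")
NON_ESP_MARKER = b"\x00\x00\x00\x00"


def looks_like_ike(udp_payload: bytes, dst_port: int) -> bool:
    """Heuristic: does this UDP payload look like an IKE message?"""
    if dst_port == 4500:
        # With NAT traversal, IKE messages are prefixed by four zero bytes.
        if not udp_payload.startswith(NON_ESP_MARKER):
            return False
        udp_payload = udp_payload[4:]
    elif dst_port != 500:
        return False

    if len(udp_payload) < IKE_HEADER.size:
        return False

    init_spi, _resp_spi, _next, version, _exch, _flags, _msg_id, length = (
        IKE_HEADER.unpack_from(udp_payload)
    )
    major_version = version >> 4
    return (
        major_version in (1, 2)           # IKEv1 or IKEv2
        and length == len(udp_payload)    # length field covers the whole message
        and init_spi != b"\x00" * 8       # the initiator SPI is never zero
    )


# A synthetic IKEv2-style header followed by four bytes of dummy body.
sample = IKE_HEADER.pack(b"\x11" * 8, b"\x00" * 8, 33, 0x20, 34, 0x08, 0, 32) + b"\x00" * 4
print(looks_like_ike(sample, 500))
print(looks_like_ike(NON_ESP_MARKER + sample, 4500))
```

Even this crude check needs no key material: every field it reads is sent in the clear before any encryption is negotiated.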

OpenVPN

A later protocol that reuses TLS for its handshake and cryptographic layer. OpenVPN can run over UDP or TCP, often defaulting to UDP/1194. Because it borrows from TLS but does not fully mimic HTTPS, its traffic patterns are distinctive.

L2TP/PPTP

Older tunneling technologies widely supported in early consumer devices. PPTP relied on GRE for encapsulation and MS-CHAP for authentication, while L2TP provides no encryption of its own and was therefore usually run in combination with IPsec. Over time, both fell out of favor due to security and performance limitations.
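Protocols such as GRE, ESP, and AH are not carried over TCP or UDP at all. They have their own IP protocol numbers, which appear in plaintext in the outer IPv4 header. The sketch below reads that one byte and labels the packet. The function names are hypothetical, and the sample headers are synthetic (checksum left at zero) and use documentation addresses.

```python
import socket
import struct

# IP protocol numbers for the tunneling protocols mentioned above.
TUNNEL_PROTOCOLS = {
    47: "GRE (used by PPTP)",
    50: "ESP (IPsec)",
    51: "AH (IPsec)",
}


def classify_ipv4(packet: bytes):
    """Return a label if the IPv4 protocol field names a tunneling protocol, else None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None  # too short, or not IPv4
    protocol = packet[9]  # the protocol field sits at byte offset 9
    return TUNNEL_PROTOCOLS.get(protocol)


def build_ipv4_header(protocol: int, src: str, dst: str) -> bytes:
    """Build a minimal 20-byte IPv4 header for demonstration (checksum not computed)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, 64, protocol, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


print(classify_ipv4(build_ipv4_header(47, "192.0.2.10", "198.51.100.7")))
print(classify_ipv4(build_ipv4_header(50, "192.0.2.10", "198.51.100.7")))
print(classify_ipv4(build_ipv4_header(6, "192.0.2.10", "198.51.100.7")))  # plain TCP
```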

SSH Tunneling

Not technically a VPN protocol but functionally similar. The user runs an SSH client on one end and an SSH server on the other, and traffic is encapsulated inside the encrypted SSH stream. Although simpler to set up, SSH sessions also have identifiable handshake patterns.

Figure: SSH tunneling works in much the same way as a VPN.
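One of the most obvious SSH handshake patterns is the identification string: before any encryption starts, each side sends a plaintext line of the form `SSH-protoversion-softwareversion`. The following sketch, with a hypothetical function name, shows how little work it takes to spot one in the first bytes of a TCP stream.

```python
def parse_ssh_banner(first_bytes: bytes):
    """Return (protocol_version, software) if an SSH identification line is present.

    The software part includes any trailing comment text from the banner.
    Returns None when no line starts with "SSH-".
    """
    for line in first_bytes.split(b"\n"):
        line = line.rstrip(b"\r")
        if line.startswith(b"SSH-"):
            parts = line.split(b"-", 2)
            if len(parts) == 3:
                version = parts[1].decode("ascii", "replace")
                software = parts[2].decode("ascii", "replace")
                return version, software
    return None


print(parse_ssh_banner(b"SSH-2.0-OpenSSH_9.6\r\n"))   # ('2.0', 'OpenSSH_9.6')
print(parse_ssh_banner(b"\x16\x03\x01\x02\x00\x01"))  # None (looks like TLS, not SSH)
```

Because the banner is sent in the clear regardless of port number, moving the SSH server to a non-standard port does not hide it from this kind of check.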

Case Study: OpenVPN

OpenVPN is a representative example of a classic VPN protocol and shows how early VPN traffic appeared “on the wire.”

OpenVPN connections are stateful. To begin, the client sends a “Client Reset” packet (P_CONTROL_HARD_RESET_CLIENT_V2 in current protocol versions), and the server responds with a “Server Reset” (P_CONTROL_HARD_RESET_SERVER_V2). This exchange establishes the session IDs and sets up a control channel. The client and server then perform a TLS handshake over this control channel, typically using OpenSSL for cryptography. Once the TLS session is established and keys are exchanged, the data channel becomes active and encrypted payloads can flow.
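As a rough illustration of the first packet in that exchange, the sketch below builds a client hard reset as it would look on a connection without tls-auth or tls-crypt (those options add extra fields and are not modeled here). The layout assumed is: one byte holding the opcode and key ID, an 8-byte session ID, a one-byte ACK count of zero, and a 4-byte packet ID. The function names are hypothetical.

```python
import os
import struct

P_CONTROL_HARD_RESET_CLIENT_V2 = 7
P_CONTROL_HARD_RESET_SERVER_V2 = 8


def build_client_reset(session_id: bytes = None, key_id: int = 0) -> bytes:
    """Build a minimal client hard reset packet (no tls-auth / tls-crypt assumed)."""
    if session_id is None:
        session_id = os.urandom(8)  # random per-session identifier
    if len(session_id) != 8 or not 0 <= key_id <= 7:
        raise ValueError("session_id must be 8 bytes and key_id must fit in 3 bits")
    # High 5 bits: opcode. Low 3 bits: key ID.
    first_byte = (P_CONTROL_HARD_RESET_CLIENT_V2 << 3) | key_id
    # ACK count = 0, packet ID = 0 for the very first control packet.
    return struct.pack("!B8sBI", first_byte, session_id, 0, 0)


def frame_for_tcp(packet: bytes) -> bytes:
    """Over TCP, each OpenVPN packet is preceded by a 2-byte length."""
    return struct.pack("!H", len(packet)) + packet


reset = build_client_reset(session_id=b"\x01" * 8)
print(reset.hex())                 # starts with 38: opcode 7, key ID 0
print(frame_for_tcp(reset).hex())  # starts with 000e: a 14-byte packet follows
```

Only the session ID varies between connections. Every other byte of this first packet is constant, which already hints at how recognizable the handshake is.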

Although OpenVPN encrypts the data channel, it was never designed to hide the fact that it is a VPN. Several aspects remain visible:

The control channel is not further obfuscated beyond standard TLS.
The data channel exposes certain fields, such as opcode and Key ID, in plaintext.
Opcodes appear at fixed offsets in packets and follow predictable sequences.

Figure: OpenVPN wire packet format. The opcode is exposed in plaintext.
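To make the plaintext fields concrete, here is a small sketch that decodes the opcode and key ID from the first byte of an OpenVPN packet. The opcode occupies the high 5 bits and the key ID the low 3 bits. Over TCP, a 2-byte length prefix comes first. The opcode names follow the OpenVPN protocol documentation, and the function name is hypothetical.

```python
OPCODES = {
    1: "P_CONTROL_HARD_RESET_CLIENT_V1",
    2: "P_CONTROL_HARD_RESET_SERVER_V1",
    3: "P_CONTROL_SOFT_RESET_V1",
    4: "P_CONTROL_V1",
    5: "P_ACK_V1",
    6: "P_DATA_V1",
    7: "P_CONTROL_HARD_RESET_CLIENT_V2",
    8: "P_CONTROL_HARD_RESET_SERVER_V2",
    9: "P_DATA_V2",
}


def parse_opcode(packet: bytes, over_tcp: bool = False):
    """Return (opcode_name, key_id) read from the plaintext first byte."""
    offset = 2 if over_tcp else 0  # skip the 2-byte length prefix on TCP
    if len(packet) <= offset:
        raise ValueError("packet too short")
    first_byte = packet[offset]
    opcode = first_byte >> 3     # high 5 bits
    key_id = first_byte & 0x07   # low 3 bits
    return OPCODES.get(opcode, f"unknown({opcode})"), key_id


print(parse_opcode(b"\x38" + b"\x00" * 13))    # ('P_CONTROL_HARD_RESET_CLIENT_V2', 0)
print(parse_opcode(b"\x31" + b"\xff" * 40))    # ('P_DATA_V1', 1)
print(parse_opcode(b"\x00\x0e\x38" + b"\x00" * 13, over_tcp=True))
```

No decryption is involved at any point: the fields are read directly from bytes that every on-path device can see.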

For an adversary like the GFW, these characteristics create reliable identification vectors. A censor can observe packet sequences, opcode patterns, and handshake structures and classify them with high accuracy. As DPI matured, OpenVPN’s distinct “fingerprint” became trivial to detect and block.
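A heavily simplified version of such a classifier might look like the sketch below. It inspects only the first byte of the first few UDP payloads in a flow and checks for the reset, reset, control/ACK sequence. This is an illustrative toy under stated assumptions (no retransmissions or reordering, no tls-crypt, a hypothetical `flow` representation), not the method used by the GFW or by the fingerprinting research cited in the references.

```python
CLIENT_RESETS = {1, 7}   # P_CONTROL_HARD_RESET_CLIENT_V1 / V2
SERVER_RESETS = {2, 8}   # P_CONTROL_HARD_RESET_SERVER_V1 / V2
HANDSHAKE_FOLLOW_UP = {4, 5}  # P_CONTROL_V1, P_ACK_V1


def looks_like_openvpn(flow) -> bool:
    """flow: the first few packets as (direction, udp_payload) tuples.

    direction is "c2s" (client to server) or "s2c" (server to client).
    """
    opcodes = [(direction, payload[0] >> 3) for direction, payload in flow if payload]
    if len(opcodes) < 3:
        return False  # not enough packets to judge

    (dir0, op0), (dir1, op1) = opcodes[0], opcodes[1]
    if dir0 != "c2s" or op0 not in CLIENT_RESETS:
        return False
    if dir1 != "s2c" or op1 not in SERVER_RESETS:
        return False
    # After the resets, the handshake consists of control and ACK packets.
    return all(op in HANDSHAKE_FOLLOW_UP for _, op in opcodes[2:])


openvpn_like = [
    ("c2s", b"\x38" + b"\x01" * 8 + b"\x00" * 5),  # client hard reset
    ("s2c", b"\x40" + b"\x02" * 8 + b"\x00" * 17), # server hard reset
    ("c2s", b"\x28" + b"\x01" * 8 + b"\x00" * 13), # ACK
    ("c2s", b"\x20" + b"\x01" * 8 + b"\x00" * 90), # control (carries TLS)
]
other_udp = [
    ("c2s", b"\xc3" + b"\x00" * 40),
    ("s2c", b"\xc5" + b"\x00" * 40),
    ("c2s", b"\x41" + b"\x00" * 40),
]

print(looks_like_openvpn(openvpn_like))  # True
print(looks_like_openvpn(other_udp))     # False
```

A real classifier would have to tolerate loss and retransmission and would combine several signals to keep false positives low, but the core observation is the same: the opcode sequence alone carries a lot of information.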

Closing Thoughts

In the early era of censorship circumvention, VPNs stood out as an immediately available and technically mature option. Their impact was so strong that, even today, many users refer to all circumvention tools generically as “VPNs,” regardless of what protocol they actually employ.

However, VPNs were not purpose-built for censorship resistance. They encrypted traffic, but they did not hide metadata or protocol identity. As the GFW adopted more advanced Deep Packet Inspection techniques, these weaknesses became clear. Traditional VPNs increasingly struggled to evade detection, and new families of circumvention protocols emerged in response. We will explore these DPI mechanisms – and how they shaped the next generation of circumvention tools – in the next post.

References

Great Firewall father speaks out. http://www.china.org.cn/china/2011-02/18/content_21951602.htm
OpenVPN Protocol. https://openvpn.net/community-resources/openvpn-protocol/
SSH Tunneling. https://www.ssh.com/academy/ssh/tunneling
What is OpenVPN Protocol. https://www.vpnunlimited.com/help/vpn-protocols/open-vpn-protocol
Diwen Xue, Reethika Ramesh, Arham Jain, Michalis Kallitsis, J. Alex Halderman, Jedidiah R. Crandall, Roya Ensafi. OpenVPN is Open to VPN Fingerprinting. 31st USENIX Security Symposium (USENIX Security 22). https://www.usenix.org/conference/usenixsecurity22/presentation/xue-diwen