GFW Technical Review 01 – Architecture

In the late 1990s and early 2000s, China’s Internet infrastructure expanded rapidly. As connectivity grew, policymakers looked for ways to control cross-border information flows. The physical layout of the early Chinese Internet made this achievable: most international traffic passed through a handful of centralized exchange points. These “choke points” were natural places to plug in filtering without redesigning the entire domestic network.

The earliest versions of the Great Firewall (GFW) were built around these gateways. At each one, border routers enforced access-control rules, external appliances tapped and inspected traffic, forged packets interrupted offending connections, and centralized controllers coordinated policies across carriers. The infrastructure was run by state-affiliated telecom operators working alongside newly established regulators tasked with information control.

Even at this early stage, the GFW was never a single firewall appliance. No single device could enforce broad, heterogeneous policies across all cross-border traffic. From the start, it was a distributed system: a collection of loosely coupled components cooperating toward nationwide filtering. Three mechanisms emerged as its foundational pillars.


IP Address Blocking

Routers at backbone or gateway points maintain ACLs and drop packets to and from specific IP ranges. The method is reliable (routers almost never misapply an ACL) and computationally cheap. The burden sits on the operator, who must keep an exhaustive, constantly updated list of target addresses. That is hard: the web is enormous, and IP addresses change often. IP-based blocking also tends to over-block, especially on shared hosting and CDNs where many domains share one address.
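
To make the over-blocking problem concrete, here is a minimal sketch of ACL-style matching. It is not how any real router is configured: the prefixes are documentation ranges, and the names BLOCKED_PREFIXES, SHARED_HOSTING, acl_permits, and reachable are all made up for illustration.

```python
import ipaddress

# Hypothetical blocklist. These are documentation prefixes, not real targets.
BLOCKED_PREFIXES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]

def acl_permits(src: str, dst: str) -> bool:
    """Return False if either endpoint falls inside a blocked prefix."""
    for addr in (ipaddress.ip_address(src), ipaddress.ip_address(dst)):
        if any(addr in net for net in BLOCKED_PREFIXES):
            return False
    return True

# Hypothetical shared-hosting mapping: two unrelated sites behind one address.
SHARED_HOSTING = {
    "blocked-site.example": "198.51.100.7",
    "innocent-site.example": "198.51.100.7",
}

def reachable(domain: str, client_ip: str = "192.0.2.10") -> bool:
    """The ACL only sees addresses, so every site on a blocked address is lost."""
    return acl_permits(client_ip, SHARED_HOSTING[domain])
```

Because the rule sees only addresses, reachable("innocent-site.example") comes back False even though that site was never the target.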

Maintaining huge ACLs on a handful of backbone routers does not scale, however. To spread the burden across the domestic network, the GFW uses null routing. Routers exchange routes via BGP, and BGP is built on trust: by default, a router accepts what its peers announce without verifying it.

That trust is the opening. The GFW injects a false BGP announcement for a target IP, advertising a route that leads nowhere. The announcement propagates from router to router until each one that receives it drops traffic for that destination. The result is a systemwide null route: efficient, and very hard for typical users to work around.

Figure: Topology for propagating null routes across the network, from the GFW's own architects.
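
The propagation idea can be sketched with a toy model. This is a simplification under stated assumptions, not real BGP: there are no path attributes, policies, or best-path selection, and the Router class, the DISCARD label, and the three-router line topology are all hypothetical.

```python
import ipaddress

DISCARD = "discard"  # stand-in for a null interface

class Router:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.routes = {}  # ip_network -> next-hop label

    def announce(self, prefix, next_hop):
        """Accept an announcement without validation and re-advertise it to peers."""
        net = ipaddress.ip_network(prefix)
        if self.routes.get(net) == next_hop:
            return  # already known, which stops the flood from looping
        self.routes[net] = next_hop
        for peer in self.peers:
            peer.announce(prefix, next_hop)

    def lookup(self, dst):
        """Longest-prefix match. Returns the next-hop label, or None if no route."""
        addr = ipaddress.ip_address(dst)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]

# Three routers in a line, each starting with only a default route.
a, b, c = Router("a"), Router("b"), Router("c")
for r in (a, b, c):
    r.routes[ipaddress.ip_network("0.0.0.0/0")] = "upstream"
a.peers, b.peers, c.peers = [b], [a, c], [b]

# Inject one false /32 at router a. It floods to b and c.
a.announce("198.51.100.7/32", DISCARD)
```

After the single injection at a, c.lookup("198.51.100.7") returns the discard label even though nobody touched c directly, while neighboring addresses still follow the default route. The /32 wins because longest-prefix match prefers it over 0.0.0.0/0.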

Port Blocking

Backbone routers can also block traffic to specific destination ports. On Cisco equipment this is typically implemented through ACL-Based Forwarding (ABF). Port-level blocking is more surgical than IP blocking: since many addresses host multiple services, filtering by port cuts off the targeted service while leaving the rest reachable.
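
As a rough illustration of why port rules are more surgical, here is a sketch of a deny rule that can optionally carry a port. DenyRule, RULES, and permitted are hypothetical names, and the address and port number are arbitrary examples rather than anything taken from a real configuration.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DenyRule:
    network: ipaddress.IPv4Network
    port: Optional[int] = None  # None means every port, i.e. plain IP blocking

    def matches(self, dst_ip: str, dst_port: int) -> bool:
        in_network = ipaddress.ip_address(dst_ip) in self.network
        return in_network and (self.port is None or self.port == dst_port)

# Hypothetical rule: deny one service port on one address.
RULES = [DenyRule(ipaddress.ip_network("198.51.100.7/32"), port=1194)]

def permitted(dst_ip: str, dst_port: int) -> bool:
    return not any(rule.matches(dst_ip, dst_port) for rule in RULES)
```

With this rule, permitted("198.51.100.7", 1194) is False while permitted("198.51.100.7", 443) is True, so the web service on the same address stays reachable.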


DNS Poisoning

IP blocking is operationally expensive and brittle: the routers enforce the list faithfully, but the list itself goes stale because IP addresses change frequently. Domain names, by contrast, are stable. DNS interference therefore became a more attractive and more scalable technique.

The GFW poisons domestic DNS resolvers by feeding them forged responses. DNS is hierarchical and cache-driven, so a bad record injected at the root or ISP level quickly propagates through caches and contaminates the wider domestic DNS ecosystem.
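
The caching effect can be shown with a toy resolver. This is only a sketch: CachingResolver and the upstream callable are hypothetical, the addresses are documentation ranges, and time is passed in explicitly so the example stays deterministic.

```python
class CachingResolver:
    def __init__(self, upstream):
        self.upstream = upstream  # callable: name -> (address, ttl_seconds)
        self.cache = {}           # name -> (address, expires_at)

    def resolve(self, name, now):
        hit = self.cache.get(name)
        if hit and hit[1] > now:
            return hit[0]  # served from cache, right or wrong
        address, ttl = self.upstream(name)
        self.cache[name] = (address, now + ttl)
        return address

# The upstream answers with a forged record once, then with the genuine one.
answers = iter([("203.0.113.66", 300), ("198.51.100.7", 300)])

def upstream(name):
    return next(answers)

isp = CachingResolver(upstream)
first = isp.resolve("blocked-site.example", now=0)     # miss: stores the forged record
second = isp.resolve("blocked-site.example", now=200)  # hit: forged record served again
third = isp.resolve("blocked-site.example", now=301)   # TTL expired: refetches
```

A single forged answer is enough to mislead every client of that resolver until the TTL runs out, which is why poisoning a shared cache scales so well.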

In principle, users can bypass poisoned resolvers by querying foreign DNS servers. The GFW counters by injecting forged responses when it detects outbound DNS queries crossing the border. These forged responses almost always arrive first, since they are generated locally rather than traveling to Japan, the US, or elsewhere. The client accepts the first answer and ignores the genuine one when it finally shows up.
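
The race can be sketched in a few lines. The model below assumes a stub resolver that simply takes the earliest response with a matching transaction ID. DnsResponse and first_accepted are hypothetical names, and the latencies are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class DnsResponse:
    txid: int
    address: str
    arrival_ms: float
    forged: bool

def first_accepted(query_txid, responses):
    """Return the earliest response whose transaction ID matches the query."""
    matching = [r for r in responses if r.txid == query_txid]
    return min(matching, key=lambda r: r.arrival_ms) if matching else None

responses = [
    # Genuine answer from a distant resolver (made-up round-trip time).
    DnsResponse(txid=0x1A2B, address="198.51.100.7", arrival_ms=150.0, forged=False),
    # Forged answer injected near the border (made-up latency).
    DnsResponse(txid=0x1A2B, address="203.0.113.66", arrival_ms=8.0, forged=True),
]

accepted = first_accepted(0x1A2B, responses)
```

Here accepted is the forged record. The injector can copy the transaction ID because it sees the query on the wire, so arrival time is the only thing that decides the winner.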

The forged responses also contribute to cache poisoning. As they travel back to the client, they can update DNS caches at various levels along the route.

Keyword-Based Filtering (Early DPI)

In the early Internet, most traffic was plaintext, which made pattern matching on TCP streams straightforward. Operators tapped the cross-border choke points and captured a copy of all international traffic off-path. The capture was sharded and fed to a server farm running early Deep Packet Inspection (DPI).

Figure: The GFW taps cross-border traffic and injects disruptive packets.

These DPI systems scanned unencrypted HTTP requests for specific keywords. Some could perform limited stateful inspection, tracking TCP streams to identify behavioral patterns, but this was far more expensive and early deployments were mostly stateless.
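
The difference between the two modes shows up when a keyword straddles a packet boundary. The sketch below assumes in-order, loss-free delivery and uses a placeholder keyword. KEYWORDS, stateless_match, and stateful_match are hypothetical names.

```python
KEYWORDS = [b"forbidden-term"]  # placeholder, not a real blocklist entry

def stateless_match(packets):
    """Inspect each packet payload in isolation."""
    return any(kw in pkt for pkt in packets for kw in KEYWORDS)

def stateful_match(packets):
    """Reassemble the stream first (assumes in-order, loss-free delivery)."""
    stream = b"".join(packets)
    return any(kw in stream for kw in KEYWORDS)

# The same HTTP request, sent whole and split across two TCP segments.
whole = [b"GET /search?q=forbidden-term HTTP/1.1\r\nHost: example.com\r\n\r\n"]
split = [b"GET /search?q=forbid", b"den-term HTTP/1.1\r\nHost: example.com\r\n\r\n"]
```

Both functions flag the whole request, but only stateful_match catches the split one. The price is that the stateful version has to buffer data per connection, which is where the extra cost comes from.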

When a DPI device decided a connection violated policy, it injected forged TCP RST packets to both endpoints, tearing the connection down from both sides. This became one of the earliest and most recognizable signatures of the GFW in action.

Figure: The GFW cuts off the TCP connection by sending RST packets to both ends.
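
For intuition, here is how an off-path observer could derive a pair of resets from one captured client-to-server segment. This sketch only builds descriptions of packets and sends nothing. Segment and forge_resets are hypothetical names, and the sequence-number choice reflects standard TCP behavior (a reset is only honored if its sequence number falls where the receiver expects), not a confirmed detail of the GFW's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    src: str
    sport: int
    dst: str
    dport: int
    seq: int
    ack: int
    payload_len: int = 0
    flags: str = ""

def forge_resets(observed: Segment):
    """Build one RST per direction from an observed client-to-server segment."""
    # To the server, spoofed as the client: continue the client's sequence space.
    to_server = Segment(
        src=observed.src, sport=observed.sport,
        dst=observed.dst, dport=observed.dport,
        seq=(observed.seq + observed.payload_len) % 2**32, ack=0, flags="R",
    )
    # To the client, spoofed as the server: the client's ACK field is the next
    # sequence number it expects from the server.
    to_client = Segment(
        src=observed.dst, sport=observed.dport,
        dst=observed.src, dport=observed.sport,
        seq=observed.ack, ack=0, flags="R",
    )
    return to_client, to_server

observed = Segment("192.0.2.10", 50000, "198.51.100.7", 80,
                   seq=1000, ack=5000, payload_len=120, flags="PA")
to_client, to_server = forge_resets(observed)
```

Each forged segment borrows the address and port of the real peer, so neither endpoint can tell from the header alone that the reset came from a third party.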

A Distributed, Multi-Layer Design

A core design philosophy of the GFW is to be distributed and multi-layered rather than monolithic. Null routing and DNS poisoning both push responsibility outward into the network instead of concentrating it at the border. Over time, the same logic extended to DPI: heavy inspection at international gateways alone would create massive bottlenecks, so ISPs gradually took on more of it within the domestic backbone. Regional enforcement also enabled location-specific policies, such as those observed in Xinjiang in recent years.

The distributed approach delivers scalability, fault tolerance, and policy flexibility, none of which a centralized firewall could match.


Operational Challenges

Each mechanism is conceptually simple, but together they create serious operational headaches. The hardest one is false positives. The GFW is ultimately a binary classifier deciding whether a given connection violates policy, and no classifier is perfect: IP lists become outdated, DPI matches the wrong substring, and legitimate flows resemble disallowed ones.

Like any classifier, the GFW must balance precision and recall. Historically it has been tuned toward precision, prioritizing a low false-positive rate even at the cost of more false negatives. That balance is adjustable, and operators regularly shift it toward recall during politically sensitive periods, when the system becomes noticeably more aggressive and collateral blocking rises.
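
The trade-off is easy to see with numbers. The sketch below uses invented scores and labels and assumes a single decision threshold, which is a simplification: nothing in the public record says the GFW works from one score. precision_recall, evaluate, and FLOWS are hypothetical names.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up (suspicion score, actually violates policy) pairs.
FLOWS = [
    (0.95, True), (0.90, True), (0.70, False), (0.65, True),
    (0.40, False), (0.35, True), (0.10, False),
]

def evaluate(threshold):
    """Block every flow scoring at or above the threshold, then score the result."""
    tp = sum(1 for score, bad in FLOWS if score >= threshold and bad)
    fp = sum(1 for score, bad in FLOWS if score >= threshold and not bad)
    fn = sum(1 for score, bad in FLOWS if score < threshold and bad)
    return precision_recall(tp, fp, fn)

strict = evaluate(0.8)      # block only the most suspicious flows
aggressive = evaluate(0.3)  # block far more, as in a sensitive period
```

On this toy data the strict setting blocks nothing legitimate but misses half of the violating flows, while the aggressive setting catches all of them and blocks two legitimate flows along the way.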

Leakage is another persistent challenge. Any “false information” the GFW injects, whether poisoned DNS records or null routes, must stay strictly within national borders; otherwise it spreads through the global routing or DNS ecosystem and causes collateral damage. In 2010, a Chilean DNS operator observed incorrect resolutions for domains like Facebook, YouTube, and Twitter. The trail led back to DNS queries inadvertently routed through China: the GFW had injected forged responses that leaked abroad because its direction checks were insufficient at the time.
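
A direction check of the kind that was missing might look like the sketch below. This is a guess at the shape of the logic, not a description of the real system: DOMESTIC_PREFIXES is a made-up documentation prefix, and is_domestic and should_inject are hypothetical names.

```python
import ipaddress

# Hypothetical stand-in for the set of domestic prefixes.
DOMESTIC_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]

def is_domestic(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DOMESTIC_PREFIXES)

def should_inject(src_ip: str, dst_ip: str, check_direction: bool) -> bool:
    """Decide whether to forge a reply to a query that already matched the blocklist."""
    if not check_direction:
        return True  # every matching query seen on the wire gets a forged reply
    # Only answer queries that originate inside and are headed outside.
    return is_domestic(src_ip) and not is_domestic(dst_ip)

# A foreign client whose query happens to reach a server inside the border.
foreign_client, server_inside = "203.0.113.5", "192.0.2.53"
leaks = should_inject(foreign_client, server_inside, check_direction=False)
contained = should_inject(foreign_client, server_inside, check_direction=True)
```

Without the check, the foreign client receives a forged answer (leaks is True). With it, the query passes through untouched (contained is False), which keeps the false record inside the border.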


Observable Characteristics

The GFW’s architecture is not publicly documented, so most of what we know comes from researchers and users examining its externally visible behavior. Common observations include:

  • TCP RST injections: forged packets abruptly terminating connections.
  • Silent timeouts: connections stall until the client gives up, often a signature of IP- or route-based blocking.
  • Forged DNS records: detectable when users manually compare DNS results or change resolver settings.
  • Asymmetric behavior: outbound and inbound traffic treated differently, a pattern seen repeatedly in academic measurements.

Together, these behaviors point to a complex, distributed filtering system rather than a single device.
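
A measurement script might map what it sees onto these categories roughly as follows. This is a heuristic sketch only: ordinary outages, server-side resets, and CDNs that return different addresses by region all produce the same symptoms, so the labels are hints rather than verdicts. classify_failure and dns_answers_disagree are hypothetical names.

```python
import socket

def classify_failure(exc: BaseException) -> str:
    """Map a connection error to the mechanism it most resembles."""
    if isinstance(exc, ConnectionResetError):
        return "possible RST injection"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "possible IP- or route-based blocking"
    return "inconclusive"

def dns_answers_disagree(local_answers, reference_answers) -> bool:
    """True when two resolvers share no address at all for the same name."""
    return set(local_answers).isdisjoint(reference_answers)
```

In practice a probe would wrap a call such as socket.create_connection in try/except and pass the caught exception to classify_failure, then repeat the measurement from several vantage points before drawing any conclusion.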


Closing Thoughts

The GFW began as a handful of simple, distributed mechanisms built around natural choke points. Two decades of refinement, scaling, and partial decentralization have turned it into one of the most sophisticated national filtering architectures in the world. The next post shifts to the client side, starting with how early VPNs appeared on the wire and what happened once the GFW noticed.

