liberate

ISPs use middleboxes to implement a variety of network management policies (e.g., prioritizing or blocking traffic) in their networks. While such policies can be beneficial (e.g., blocking malware), they also raise issues of network neutrality and freedom of speech when used for application-specific differentiation and censorship. In general, there is a poor understanding of how such policies are implemented in practice and how they can be evaded efficiently. As a result, most circumvention solutions are brittle point solutions based on manual analysis.

We present the design and implementation of liberate, a general-purpose tool for automatically identifying middlebox policies, reverse-engineering their implementations, and adaptively deploying custom circumvention techniques. Our key insight is that differentiation is necessarily implemented by middleboxes using incomplete models of end-to-end communication protocols at the network and transport layers. liberate conducts targeted network measurements to identify the corresponding inconsistencies and leverages this information to transform arbitrary network traffic such that it is purposefully misclassified (e.g., to avoid shaping or censorship). Unlike previous work, our approach is application-agnostic, can be deployed unilaterally (i.e., at only one endpoint) on unmodified applications via a linked library or transparent proxy, and can adapt to classifier changes at runtime. We evaluate liberate both in a testbed environment and in operational networks that throttle or block traffic using DPI-based classifier rules, and show that our approach is effective across a wide range of middlebox deployments.
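To make the key insight concrete, the following is a minimal toy sketch, not liberate's actual implementation. It assumes a hypothetical classifier that searches each TCP segment independently for a keyword (here the made-up hostname `video.example.com`) and never reassembles the byte stream, whereas the receiving endpoint does. All function names are illustrative.

```python
from typing import List

# Hypothetical classifier rule; real rules are discovered by measurement.
KEYWORD = b"video.example.com"


def middlebox_matches(segments: List[bytes]) -> bool:
    """Toy classifier: inspects each segment on its own, with no reassembly."""
    return any(KEYWORD in seg for seg in segments)


def endpoint_reassemble(segments: List[bytes]) -> bytes:
    """The receiving TCP stack hands the in-order byte stream to the application."""
    return b"".join(segments)


def split_inside_keyword(payload: bytes, keyword: bytes) -> List[bytes]:
    """Split the payload so the keyword straddles a segment boundary."""
    idx = payload.find(keyword)
    if idx < 0 or len(keyword) < 2:
        return [payload]  # nothing to hide, or keyword too short to split
    cut = idx + len(keyword) // 2
    return [payload[:cut], payload[cut:]]


request = b"GET /clip.mp4 HTTP/1.1\r\nHost: video.example.com\r\n\r\n"
original = [request]
transformed = split_inside_keyword(request, KEYWORD)

print("original classified:   ", middlebox_matches(original))
print("transformed classified:", middlebox_matches(transformed))
print("server sees same bytes:", endpoint_reassemble(transformed) == request)
```

The application data is unchanged end to end; only the middlebox's incomplete view of the flow differs. A classifier that does reassemble streams would not be fooled by this particular transformation, which is why the technique has to be chosen per network.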

Key Contributions

An application-agnostic approach to identifying traffic-classification rules for differentiation.
A taxonomy of evasion techniques that exploit inconsistencies between end-to-end and middlebox views of network flows.
A tool that identifies classification rules, identifies the network location of the corresponding middlebox, and deploys low-cost, custom countermeasures without modifying applications.
Public, open-source tools and datasets to allow others to incorporate and extend our work.
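One plausible way to identify which bytes of a flow a classifier keys on is to modify regions of a recorded payload and check whether classification changes. The sketch below illustrates that idea with a recursive halving search. It is an assumption-laden illustration and not necessarily the procedure the tool uses: `is_classified` is a hypothetical oracle (in practice each call would be a replay through the network under test), and the payload is assumed to be classified before any modification.

```python
from typing import Callable, List, Optional


def blind(payload: bytes, start: int, end: int) -> bytes:
    """Invert every bit in payload[start:end], leaving the rest intact."""
    inverted = bytes(b ^ 0xFF for b in payload[start:end])
    return payload[:start] + inverted + payload[end:]


def find_trigger_bytes(
    payload: bytes,
    is_classified: Callable[[bytes], bool],
    start: int = 0,
    end: Optional[int] = None,
) -> List[int]:
    """Return offsets of bytes whose modification stops classification.

    Assumes is_classified(payload) is True for the unmodified payload.
    """
    if end is None:
        end = len(payload)
    if start >= end:
        return []
    # If the flow is still classified with this region blinded,
    # no byte in the region is needed by the rule.
    if is_classified(blind(payload, start, end)):
        return []
    if end - start == 1:
        return [start]
    mid = (start + end) // 2
    return (find_trigger_bytes(payload, is_classified, start, mid)
            + find_trigger_bytes(payload, is_classified, mid, end))


# Hypothetical rule and payload used only for this illustration.
KEYWORD = b"video.example.com"
request = b"GET /clip.mp4 HTTP/1.1\r\nHost: video.example.com\r\n\r\n"
found = find_trigger_bytes(request, lambda p: KEYWORD in p)
print(bytes(request[i] for i in found))
```

Regions that do not affect classification are discarded with a single probe, so the number of probes grows with the size of the matching field rather than with the size of the whole payload.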

Key Findings

Policies are implemented using classifiers that rely on searches for keywords in HTTP payloads, SNI fields, and protocol-specific fields. Some of the policies apply only to a small number of initial packets, while Iran's censoring devices inspect the entire flow for matching keywords. We found no evidence that UDP traffic was classified by any of the operational networks we tested, providing a surprisingly easy way to evade their policies (i.e., use UDP-based protocols).
Middleboxes running the classifiers exhibit different, incomplete implementations of network and transport layers. Specifically, our testbed device does not check for a wide range of invalid packet header values, while the Great Firewall of China (GFC) does extensive packet validation. Iran and T-Mobile use middleboxes that only partially check for invalid packet headers. Further, we find that reordering of TCP segments can alter classification in all instances except for the GFC.
AT&T throttles video traffic only on port 80. [Figures: throughput when testing Veoh (HTTP); throughput when testing Vimeo (HTTPS)]
Sprint does not have a differentiation policy based on content, port, or IP address, and we observed different rates for the same replay.
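The findings above suggest why evasion has to be tailored to each network: classifiers differ in how many packets they inspect and in how strictly they validate headers. The toy model below sketches this under stated assumptions. The `Packet` type, the `window` and `validates_checksum` parameters, the decoy payload, and the keyword are all hypothetical; a bad checksum stands in for any invalid header value that the server discards but a lax middlebox accepts.

```python
from dataclasses import dataclass
from typing import List

KEYWORD = b"video.example.com"  # hypothetical classifier rule


@dataclass(frozen=True)
class Packet:
    payload: bytes
    checksum_ok: bool = True  # stand-in for any header validity check


def middlebox_matches(packets: List[Packet], window: int = 1,
                      validates_checksum: bool = False) -> bool:
    """Toy classifier: keyword search over the first `window` accepted packets."""
    seen = [p for p in packets if p.checksum_ok or not validates_checksum]
    return any(KEYWORD in p.payload for p in seen[:window])


def server_receives(packets: List[Packet]) -> bytes:
    """The server's stack drops invalid packets before delivering data."""
    return b"".join(p.payload for p in packets if p.checksum_ok)


def prepend_inert(packets: List[Packet], count: int = 1) -> List[Packet]:
    """Insert decoy packets that the server discards but a lax middlebox counts."""
    decoy = Packet(payload=b"GET / HTTP/1.1\r\n", checksum_ok=False)
    return [decoy] * count + list(packets)


request = b"GET /clip.mp4 HTTP/1.1\r\nHost: video.example.com\r\n\r\n"
flow = [Packet(request)]
evasive = prepend_inert(flow)

print("lax, first-packet classifier:", middlebox_matches(evasive))
print("validating classifier:       ",
      middlebox_matches(evasive, validates_checksum=True))
print("whole-flow classifier:       ", middlebox_matches(evasive, window=10))
print("server payload intact:       ", server_receives(evasive) == request)
```

In this model the decoy only defeats a classifier that both skips header validation and stops inspecting after the first packet. A classifier that validates headers or inspects the entire flow still matches, which mirrors the differences reported across the testbed device, the GFC, and Iran.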

Pcaps that recorded successful evasion

Code

The code used for the analysis can be found here

Paper

lib•erate, (n): A library for exposing (traffic-classification) rules and avoiding them efficiently
Fangfan Li, Abbas Razaghpanah, Arash Molavi Kakhki, Arian Akhavan Niaki, David Choffnes, Phillipa Gill, Alan Mislove. In Proceedings of the 17th ACM Internet Measurement Conference (IMC'17), London, UK, November 2017 [pdf].