Obfuscating Information to Evade Censorship Filters
Methods of the Freedom Of Information Hack Project
by Chris Gragsone
Previous techniques for information smuggling involved encryption, complex
network tunneling, or special clients. These do not work well in
environments of despotic control. They provide evidence of subverting
authority, for which the user may be punished. The availability of the
clients is as restricted as the information to which they would provide
access. The learning curve also limits who can use the techniques. By
using a technique that is supported by mainstream web browsers, project
FOIH will give the masses access to information without requiring a
non-mainstream client or knowledge of advanced networking and encryption.
Censorship filters come in two forms: content vector filters and URL
filters. Content vector filters block access when the content matches
phrases on a keyword list. URL filters deny access based on the host,
path, or query of the request. Thus, to allow information to be passed
indiscriminately to the end users, FOIH must circumvent both techniques.
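The two filter types described above can be modeled roughly as follows. This is only a sketch of the idea, not the behavior of any particular filtering product; the blocklists, host names, and function names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical blocklists, for illustration only.
BLOCKED_PHRASES = {"forbidden topic"}
BLOCKED_HOSTS = {"blocked.example"}


def content_filter_blocks(body):
    """Content vector filter: block if any listed phrase appears in the body."""
    text = body.lower()
    return any(phrase in text for phrase in BLOCKED_PHRASES)


def url_filter_blocks(url):
    """URL filter: block on the host, or on a listed host in the path or query."""
    parts = urlsplit(url)
    if parts.hostname in BLOCKED_HOSTS:
        return True
    rest = (parts.path + "?" + parts.query).lower()
    return any(host in rest for host in BLOCKED_HOSTS)
```

Note that the URL check also catches a blocked host passed as a parameter to some other site, which is why a proxy has to hide the requested URL as well as the returned content.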
Project FOIH's countermeasure for content vector filters is to obfuscate
the content, using encryption, so that it cannot be examined "in flight".
The obfuscation process is a simple substitution cipher. Deniable cryptography
is not part of FOIH's objective, so encryption strength is not an issue.
However, it would require an immense amount of resources for a filter to
decrypt every text that appears cryptic and still pass traffic at network
speeds. Therefore, simple encryption is sufficient for our obfuscation
needs: it makes it highly difficult for the filter to drop the packet based
on key phrases alone. The encrypted HTML is accompanied by a Javascript
decryption function. The function rewrites the document automatically,
and without the user's knowledge.
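A simple substitution cipher of the kind described can be sketched as below. The sketch is in Python for brevity and assumes a letter-for-letter table built from a seed; the function names and the way the key is chosen are illustrative assumptions, not FOIH's actual scheme. In FOIH the decoding half would live in the Javascript shipped with the page.

```python
import random
import string


def make_key(seed):
    """Build a substitution table that permutes the lowercase letters."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))


def invert(table):
    """Return the table that undoes the given substitution."""
    return {cipher: plain for plain, cipher in table.items()}


def substitute(text, table):
    """Map each letter through the table, keeping case; pass other characters."""
    out = []
    for ch in text:
        mapped = table.get(ch.lower())
        if mapped is None:
            out.append(ch)  # digits, punctuation, whitespace are left alone
        else:
            out.append(mapped.upper() if ch.isupper() else mapped)
    return "".join(out)
```

Encoding is `substitute(page, key)` and decoding is `substitute(cipher, invert(key))`. A keyword filter looking for a plain phrase will not find it in the encoded text, which is all the obfuscation is meant to achieve.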
It is possible for the filter to detect encryption and drop the transfer
by using character frequency analysis. Character frequency analysis
examines the distribution of characters in a text and compares it to the
normal distribution for the selected language. If the two do not match,
the transfer is perceived to be encrypted and is stopped. To circumvent
this analysis, FOIH will insert "non-offensive" dictionary words until
the file matches the expected distribution of characters.
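Both the frequency check and the padding countermeasure can be sketched as follows. The reference table holds approximate, commonly cited English letter frequencies; the distance measure, the threshold, and the greedy word selection are assumptions made for illustration. The sketch also leaves out how the decoder would later recognize and strip the padding words.

```python
from collections import Counter
import string

LETTERS = string.ascii_lowercase

# Approximate, commonly cited English letter frequencies, in percent.
ENGLISH_PERCENT = dict(zip(
    "etaoinshrdlcumwfgypbvkjxqz",
    [12.7, 9.1, 8.2, 7.5, 7.0, 6.7, 6.3, 6.1, 6.0, 4.3, 4.0, 2.8, 2.8,
     2.4, 2.4, 2.2, 2.0, 2.0, 1.9, 1.5, 1.0, 0.8, 0.15, 0.15, 0.10, 0.07],
))
_TOTAL = sum(ENGLISH_PERCENT.values())
REFERENCE = {c: ENGLISH_PERCENT[c] / _TOTAL for c in LETTERS}


def letter_distribution(text):
    """Relative frequency of each letter a-z in the text."""
    counts = Counter(ch for ch in text.lower() if ch in LETTERS)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in LETTERS}
    return {c: counts[c] / total for c in LETTERS}


def distance(text):
    """Sum of absolute differences from the reference (0 = match, 2 = disjoint)."""
    dist = letter_distribution(text)
    return sum(abs(dist[c] - REFERENCE[c]) for c in LETTERS)


def pad_toward_english(text, words, threshold, max_words=200):
    """Greedily append the word that most reduces the distance."""
    padded = text
    for _ in range(max_words):
        current = distance(padded)
        if current <= threshold:
            break
        best = min(words, key=lambda w: distance(padded + " " + w))
        if distance(padded + " " + best) >= current:
            break  # no candidate word improves the match any further
        padded += " " + best
    return padded
```

A filter would flag a transfer whose `distance` exceeds some cutoff; the padding loop pushes the obfuscated file back under that cutoff at the cost of extra bandwidth.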
To provide access to blocked sites, FOIH will act as a proxy.
Called with no arguments, FOIH will generate an obfuscated Common Gateway
Interface (CGI) form. The user will then enter the desired URL, and
FOIH will respond by transforming the requested page and returning an
obfuscated version of it. However, filters can identify, and block, URLs
specified in GET and POST queries. To counter this, FOIH randomly chooses
between the GET and POST methods, and the request is sent back to the
FOIH CGI as obfuscated data.
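The request side of the proxy might look like the sketch below. The endpoint URL, the field name `q`, and the use of a fixed 13-letter shift (itself a trivial substitution cipher) are all hypothetical choices made to keep the example short; in FOIH the encoding step would be performed by the Javascript in the generated form.

```python
import random
import string
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

FOIH_ENDPOINT = "http://foih.example/cgi-bin/foih"  # hypothetical address
SHIFT = 13  # hypothetical key: rotate letters by 13 places

_lower, _upper = string.ascii_lowercase, string.ascii_uppercase
_plain = _lower + _upper
_shifted = (_lower[SHIFT:] + _lower[:SHIFT]) + (_upper[SHIFT:] + _upper[:SHIFT])
ENCODE = str.maketrans(_plain, _shifted)
DECODE = str.maketrans(_shifted, _plain)


def build_request(target_url, rng=random):
    """Client side: hide the target URL and pick GET or POST at random."""
    payload = urlencode({"q": target_url.translate(ENCODE)})
    if rng.choice(("GET", "POST")) == "GET":
        return Request(FOIH_ENDPOINT + "?" + payload, method="GET")
    return Request(FOIH_ENDPOINT, data=payload.encode("ascii"), method="POST")


def recover_url(query_string):
    """Server side: read the submitted field and undo the obfuscation."""
    return parse_qs(query_string)["q"][0].translate(DECODE)
```

The request object is only constructed here, not sent. Either way, the blocked host name never appears in clear text in the query string or the POST body, so a URL filter that scans request parameters has nothing to match.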
As FOIH's popularity increases, so will filters' resistance to FOIH.
Not only must the data be obfuscated, but the method, and even the act of
obfuscation, must be obfuscated. Although it would be impractical
for a high-bandwidth filter to decrypt simple substitution on every web page
it receives, it could be possible for a local filter, such as one "protecting"
a library, to attempt to decrypt all HTML traffic in order to spot obfuscation
hidden behind FOIH's "counter frequency analysis" method. It is also
likely that future filters will search for Javascript that rewrites the
document.
This process of natural selection between text obfuscation and content
filters will force FOIH to be an ever-evolving project that keeps
implementing new countermeasures. As filters get better at mass decryption
of simple ciphers, fake character frequencies will not suffice to hide the
presence of encryption. FOIH will need to implement more processor- and
bandwidth-intensive encryption systems.
Javascript was chosen to implement the decryption due to its simplicity
and widespread support. However, FOIH's dependence on this language
will become an Achilles' heel. To prevent content filters from blocking
a FOIH "signature", FOIH must become more dynamic, offering more choices
and using multiple languages such as Java, ActiveX, and VBScript for the
decryption method.