Profiling Tor Users with Unsupervised Learning Techniques

Abstract

Website fingerprinting has been shown to be effective against Tor, one of the most popular low-latency anonymity networks. With this attack, a local network adversary can recover a client’s browsing history from the traffic fingerprints observed on the client’s connection to the Tor network. Previous studies on website fingerprinting focus on designing supervised classifiers that identify visits to a set of target websites. In this paper, we consider an adversary with the same capabilities as in website fingerprinting, but who uses unsupervised techniques to profile users’ browsing activity. We use OPTICS, a density-based clustering algorithm, to group similar traffic samples together, and the BCubed Precision and Recall metrics to measure the quality of the clustering. For a world of 100 websites, we show that, under mild assumptions, the attacker is able to group visits of different users to the same site with a success rate of more than 50%. We also evaluate how the number of different pages that users can access affects the effectiveness of the attack, and find that for a world of 1,000 pages the attack performance does not degrade significantly.
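
To illustrate how BCubed Precision and Recall score a clustering, the following is a minimal sketch using the standard per-sample definition of the metrics. It is not taken from the paper's implementation: the function name `bcubed`, the variable names, and the toy data are hypothetical, and the cluster ids stand in for whatever labels a clustering algorithm such as OPTICS would assign to traffic samples.

```python
from collections import Counter


def bcubed(true_sites, cluster_ids):
    """Return (precision, recall) averaged over all samples.

    true_sites[i]  : the website actually visited in sample i (ground truth)
    cluster_ids[i] : the cluster assigned to sample i by the clustering step
    """
    if len(true_sites) != len(cluster_ids):
        raise ValueError("both sequences must have the same length")
    if not true_sites:
        raise ValueError("at least one sample is required")

    cluster_sizes = Counter(cluster_ids)                # samples per cluster
    site_sizes = Counter(true_sites)                    # samples per website
    pair_sizes = Counter(zip(true_sites, cluster_ids))  # samples per (site, cluster)

    n = len(true_sites)
    pairs = list(zip(true_sites, cluster_ids))

    # Per-sample precision: fraction of the sample's cluster that shares its site.
    precision = sum(pair_sizes[(s, c)] / cluster_sizes[c] for s, c in pairs) / n
    # Per-sample recall: fraction of the sample's site that landed in its cluster.
    recall = sum(pair_sizes[(s, c)] / site_sizes[s] for s, c in pairs) / n
    return precision, recall


# Toy data (hypothetical): three visits to site "a", two visits to site "b".
sites = ["a", "a", "a", "b", "b"]
clusters = [0, 0, 1, 1, 1]

precision, recall = bcubed(sites, clusters)
print(f"BCubed precision={precision:.3f} recall={recall:.3f}")
```

In this toy case one visit to site "a" is placed in the cluster that otherwise holds site "b", so both metrics fall below 1 (each equals 11/15). A clustering that puts every sample in its own cluster would reach precision 1 but low recall, which is why the two metrics are reported together.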

Publication
International Workshop on Inference and Privacy in a Hyperconnected World