De-anonymizing Web Browsing Data with Social Networks

Su, Jessica; Shukla, Ansh; Goel, Sharad; Narayanan, Arvind

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1qn72

Full metadata record

DC Field	Value	Language
dc.contributor.author	Su, Jessica	-
dc.contributor.author	Shukla, Ansh	-
dc.contributor.author	Goel, Sharad	-
dc.contributor.author	Narayanan, Arvind	-
dc.date.accessioned	2021-10-08T19:44:28Z	-
dc.date.available	2021-10-08T19:44:28Z	-
dc.date.issued	2017-04	en_US
dc.identifier.citation	Su, Jessica, Ansh Shukla, Sharad Goel, and Arvind Narayanan. "De-anonymizing Web Browsing Data with Social Networks." In WWW '17: Proceedings of the 26th International Conference on World Wide Web (2017): pp. 1261-1269. doi:10.1145/3038912.3052714	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr1qn72	-
dc.description.abstract	Can online trackers and network adversaries de-anonymize web browsing data readily available to them? We show (theoretically, via simulation, and through experiments on real user data) that de-identified web browsing histories can be linked to social media profiles using only publicly available data. Our approach is based on a simple observation: each person has a distinctive social network, and thus the set of links appearing in one's feed is unique. Assuming users visit links in their feed with higher probability than a random user, browsing histories contain tell-tale marks of identity. We formalize this intuition by specifying a model of web browsing behavior and then deriving the maximum likelihood estimate of a user's social profile. We evaluate this strategy on simulated browsing histories, and show that given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50% of the time. To gauge the real-world effectiveness of this approach, we recruited nearly 400 people to donate their web browsing histories, and we were able to correctly identify more than 70% of them. We further show that several online trackers are embedded on sufficiently many websites to carry out this attack with high accuracy. Our theoretical contribution applies to any type of transactional data and is robust to noisy observations, generalizing a wide range of previous de-anonymization attacks. Finally, since our attack attempts to find the correct Twitter profile out of over 300 million candidates, it is, to our knowledge, the largest-scale demonstrated de-anonymization to date.	en_US
dc.format.extent	1261 - 1269	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	WWW '17: Proceedings of the 26th International Conference on World Wide Web	en_US
dc.rights	Final published version. This is an open access article.	en_US
dc.title	De-anonymizing Web Browsing Data with Social Networks	en_US
dc.type	Conference Article	en_US
dc.identifier.doi	doi:10.1145/3038912.3052714	-
dc.identifier.isbn13	978-1-4503-4913-0	-
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding	en_US

Files in This Item:

File	Description	Size	Format
DeanonymizingWebBrowsingData.pdf		3.06 MB	Adobe PDF
