Link prediction by de-anonymization: How We Won the Kaggle Social Network Challenge

Narayanan, Arvind; Shi, Elaine; Rubinstein, Benjamin IP

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1bg0g

Full metadata record

DC Field	Value	Language
dc.contributor.author	Narayanan, Arvind	-
dc.contributor.author	Shi, Elaine	-
dc.contributor.author	Rubinstein, Benjamin IP	-
dc.date.accessioned	2021-10-08T19:44:30Z	-
dc.date.available	2021-10-08T19:44:30Z	-
dc.date.issued	2011	en_US
dc.identifier.citation	Narayanan, Arvind, Elaine Shi, and Benjamin IP Rubinstein. "Link prediction by de-anonymization: How We Won the Kaggle Social Network Challenge." The 2011 International Joint Conference on Neural Networks (2011): pp. 1825-1834. doi:10.1109/IJCNN.2011.6033446	en_US
dc.identifier.issn	2161-4393	-
dc.identifier.uri	https://arxiv.org/abs/1102.4374	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr1bg0g	-
dc.description.abstract	This paper describes the winning entry to the IJCNN 2011 Social Network Challenge run by Kaggle.com. The goal of the contest was to promote research on real-world link prediction, and the dataset was a graph obtained by crawling the popular Flickr social photo sharing website, with user identities scrubbed. By de-anonymizing much of the competition test set using our own Flickr crawl, we were able to effectively game the competition. Our attack represents a new application of de-anonymization to gaming machine learning contests, suggesting changes in how future competitions should be run. We introduce a new simulated annealing-based weighted graph matching algorithm for the seeding step of de-anonymization. We also show how to combine de-anonymization with link prediction (the latter is required to achieve good performance on the portion of the test set not de-anonymized), for example by training the predictor on the de-anonymized portion of the test set and combining probabilistic predictions from de-anonymization and link prediction.	en_US
dc.format.extent	1825 - 1834	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	The 2011 International Joint Conference on Neural Networks	en_US
dc.rights	Author's manuscript	en_US
dc.title	Link prediction by de-anonymization: How We Won the Kaggle Social Network Challenge	en_US
dc.type	Conference Article	en_US
dc.identifier.doi	doi:10.1109/IJCNN.2011.6033446	-
dc.identifier.eissn	2161-4407	-
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding	en_US

Files in This Item:

File	Description	Size	Format
WonKaggleSocialNetworkChallenge.pdf		753.67 kB	Adobe PDF
