Link prediction by de-anonymization: How We Won the Kaggle Social Network Challenge
Author(s): Narayanan, Arvind; Shi, Elaine; Rubinstein, Benjamin IP
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1bg0g
Abstract: This paper describes the winning entry to the IJCNN 2011 Social Network Challenge run by Kaggle.com. The goal of the contest was to promote research on real-world link prediction, and the dataset was a graph obtained by crawling the popular Flickr social photo sharing website, with user identities scrubbed. By de-anonymizing much of the competition test set using our own Flickr crawl, we were able to effectively game the competition. Our attack represents a new application of de-anonymization to gaming machine learning contests, suggesting changes in how future competitions should be run. We introduce a new simulated annealing-based weighted graph matching algorithm for the seeding step of de-anonymization. We also show how to combine de-anonymization with link prediction (the latter is required to achieve good performance on the portion of the test set not de-anonymized), for example by training the predictor on the de-anonymized portion of the test set, and combining probabilistic predictions from de-anonymization and link prediction.
Publication Date: 2011
Citation: Narayanan, Arvind, Elaine Shi, and Benjamin IP Rubinstein. "Link prediction by de-anonymization: How We Won the Kaggle Social Network Challenge." The 2011 International Joint Conference on Neural Networks (2011): pp. 1825-1834. doi:10.1109/IJCNN.2011.6033446
DOI: doi:10.1109/IJCNN.2011.6033446
ISSN: 2161-4393
EISSN: 2161-4407
Pages: 1825-1834
Type of Material: Conference Article
Journal/Proceeding Title: The 2011 International Joint Conference on Neural Networks
Version: Author's manuscript