
Automatically Generating a Large, Culture-Specific Blocklist for China

Author(s): Hounsel, Austin; Mittal, Prateek; Feamster, Nick

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr16261
Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Hounsel, Austin | - |
| dc.contributor.author | Mittal, Prateek | - |
| dc.contributor.author | Feamster, Nick | - |
| dc.date.accessioned | 2021-10-08T19:50:30Z | - |
| dc.date.available | 2021-10-08T19:50:30Z | - |
| dc.date.issued | 2018 | en_US |
| dc.identifier.citation | Hounsel, Austin, Prateek Mittal, and Nick Feamster. "Automatically generating a large, culture-specific blocklist for China." In 8th USENIX Workshop on Free and Open Communications on the Internet (2018). | en_US |
| dc.identifier.uri | https://www.usenix.org/system/files/conference/foci18/foci18-paper-hounsel.pdf | - |
| dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr16261 | - |
| dc.description.abstract | Internet censorship measurements rely on lists of websites to be tested, or “block lists” that are curated by third parties. Unfortunately, many of these lists are not public, and those that are tend to focus on a small group of topics, leaving other types of sites and services untested. To increase and diversify the set of sites on existing block lists, we use natural language processing and search engines to automatically discover a much wider range of websites that are censored in China. Using these techniques, we create a list of 1125 websites outside the Alexa Top 1,000 that cover Chinese politics, minority human rights organizations, oppressed religions, and more. Importantly, none of the sites we discover are present on the current largest block list. The list that we develop not only vastly expands the set of sites that current Internet measurement tools can test, but it also deepens our understanding of the nature of content that is censored in China. We have released both this new block list and the code for generating it. | en_US |
| dc.language.iso | en_US | en_US |
| dc.relation.ispartof | 8th USENIX Workshop on Free and Open Communications on the Internet | en_US |
| dc.rights | Final published version. This is an open access article. | en_US |
| dc.title | Automatically Generating a Large, Culture-Specific Blocklist for China | en_US |
| dc.type | Conference Article | en_US |
| pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |

Files in This Item:
| File | Description | Size | Format |
| --- | --- | --- | --- |
| CultureList.pdf | | 410.66 kB | Adobe PDF |


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.