
Hierarchical relational models for document networks

Author(s): Chang, Jonathan; Blei, David M

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1zn47
Abstract: We develop the relational topic model (RTM), a hierarchical model of both network structure and node attributes. We focus on document networks, where the attributes of each document are its words, that is, discrete observations taken from a fixed vocabulary. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be used to summarize a network of documents, predict links between them, and predict words within them. We derive efficient inference and estimation algorithms based on variational methods that take advantage of sparsity and scale with the number of links. We evaluate the predictive performance of the RTM for large networks of scientific abstracts, web documents, and geographically tagged news.
Publication Date: 2010
Citation: Chang, Jonathan; Blei, David M. (2010). Hierarchical relational models for document networks. The Annals of Applied Statistics, 4 (1), 124-150. doi:10.1214/09-AOAS309
DOI: 10.1214/09-AOAS309
ISSN: 1932-6157
Pages: 124 - 150
Type of Material: Journal Article
Journal/Proceeding Title: The Annals of Applied Statistics
Version: Final published version. This is an open access article.
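
The abstract describes the link between each pair of documents as a binary random variable conditioned on the documents' contents. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each document is summarized by a vector of topic proportions and that the link probability is a logistic function of the weighted element-wise product of those vectors; the abstract does not state the functional form, and the names `link_probability`, `eta`, and `nu` as well as all numeric values are hypothetical.

```python
import math


def link_probability(theta_a, theta_b, eta, nu):
    """Probability that two documents are linked, given their topic proportions.

    Assumed form (not stated in the abstract):
        sigmoid(sum_k eta[k] * theta_a[k] * theta_b[k] + nu)
    """
    if not (len(theta_a) == len(theta_b) == len(eta)):
        raise ValueError("theta_a, theta_b and eta must have the same length")
    # Weighted similarity of the two topic-proportion vectors, plus an intercept.
    score = sum(e * a * b for e, a, b in zip(eta, theta_a, theta_b)) + nu
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical topic proportions for three documents over three topics.
doc_a = [0.8, 0.1, 0.1]
doc_b = [0.7, 0.2, 0.1]  # similar topic mix to doc_a
doc_c = [0.1, 0.1, 0.8]  # different topic mix from doc_a

eta = [4.0, 4.0, 4.0]  # hypothetical per-topic weights
nu = -2.0              # hypothetical intercept

p_similar = link_probability(doc_a, doc_b, eta, nu)
p_different = link_probability(doc_a, doc_c, eta, nu)
```

Under these assumed parameters, documents with similar topic proportions receive a higher link probability than documents with dissimilar ones, which is the behavior a content-conditioned link model needs in order to predict links from text.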



Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.