Skip to main content

Exact recovery in the stochastic block model

Author(s): Abbe, Emmanuel; Bandeira, AS; Hall, G

Download
To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1ng3t
Full metadata record
DC FieldValueLanguage
dc.contributor.authorAbbe, Emmanuel-
dc.contributor.authorBandeira, AS-
dc.contributor.authorHall, G-
dc.date.accessioned2021-10-08T20:16:12Z-
dc.date.available2021-10-08T20:16:12Z-
dc.date.issued2016en_US
dc.identifier.citationAbbe, E, Bandeira, AS, Hall, G. (2016). Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62 (471 - 487. doi:10.1109/TIT.2015.2490670en_US
dc.identifier.urihttp://arks.princeton.edu/ark:/88435/pr1ng3t-
dc.description.abstractThe stochastic block model with two communities, or equivalently the planted bisection model, is a popular model of random graph exhibiting a cluster behavior. In the symmetric case, the graph has two equally sized clusters and vertices connect with probability p within clusters and q across clusters. In the past two decades, a large body of literature in statistics and computer science has focused on providing lower bounds on the scaling of |p - q| to ensure exact recovery. In this paper, we identify a sharp threshold phenomenon for exact recovery: if α = pn/log(n) and β = qn/log(n) are constant (with α> β), recovering the communities with high probability is possible if (α + β/2) - √αβ > 1 and is impossible if (α + β/2) - √αβ < 1. In particular, this improves the existing bounds. This also sets a new line of sight for efficient clustering algorithms. While maximum likelihood (ML) achieves the optimal threshold (by definition), it is in the worst case NP-hard. This paper proposes an efficient algorithm based on a semidefinite programming relaxation of ML, which is proved to succeed in recovering the communities close to the threshold, while numerical experiments suggest that it may achieve the threshold. An efficient algorithm that succeeds all the way down to the threshold is also obtained using a partial recovery algorithm combined with a local improvement procedure.en_US
dc.format.extent471 - 487en_US
dc.language.isoen_USen_US
dc.relation.ispartofIEEE Transactions on Information Theoryen_US
dc.rightsAuthor's manuscripten_US
dc.titleExact recovery in the stochastic block modelen_US
dc.typeJournal Articleen_US
dc.identifier.doidoi:10.1109/TIT.2015.2490670-
pu.type.symplectichttp://www.symplectic.co.uk/publications/atom-terms/1.0/journal-articleen_US

Files in This Item:
File Description SizeFormat 
Exact recovery in the stochastic block model.pdf1.79 MBAdobe PDFView/Download


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.