Provable learning of noisy-or networks

Arora, Sanjeev; Ge, R; Ma, T; Risteski, A

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1n431

Full metadata record

DC Field	Value	Language
dc.contributor.author	Arora, Sanjeev	-
dc.contributor.author	Ge, R	-
dc.contributor.author	Ma, T	-
dc.contributor.author	Risteski, A	-
dc.date.accessioned	2019-08-29T17:05:05Z	-
dc.date.available	2019-08-29T17:05:05Z	-
dc.date.issued	2017	en_US
dc.identifier.citation	Arora, S, Ge, R, Ma, T, Risteski, A. (2017). Provable learning of noisy-or networks. Part F128415 (1057 - 1066). doi:10.1145/3055399.3055482	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr1n431	-
dc.description.abstract	Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (= coordinates of the given datapoint) are explained as a probabilistic function of some hidden variables. Finding parameters with the maximum likelihood is NP-hard even in very simple settings. In recent years, provably efficient algorithms were nevertheless developed for models with linear structures: topic models, mixture models, hidden Markov models, etc. These algorithms use matrix or tensor decomposition, and make some reasonable assumptions about the parameters of the underlying model. But matrix or tensor decomposition seems of little use when the latent variable model has nonlinearities. The current paper shows how to make progress: tensor decomposition is applied for learning the single-layer noisy-or network, which is a textbook example of a Bayes net and is used, for example, in the classic QMR-DT software for diagnosing which disease(s) a patient may have by observing the symptoms he/she exhibits. The technical novelty here, which should be useful in other settings in the future, is the analysis of tensor decomposition in the presence of systematic error (i.e., where the noise/error is correlated with the signal, and doesn't decrease as the number of samples goes to infinity). This requires rethinking all steps of tensor decomposition methods from the ground up. For simplicity our analysis is stated assuming that the network parameters were chosen from a probability distribution, but the method seems more generally applicable.	en_US
dc.format.extent	1057 - 1066	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	Proceedings of the Annual ACM Symposium on Theory of Computing	en_US
dc.rights	Author's manuscript	en_US
dc.title	Provable learning of noisy-or networks	en_US
dc.type	Conference Article	en_US
dc.identifier.doi	doi:10.1145/3055399.3055482	-
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding	en_US

Files in This Item:

File	Description	Size	Format
Provable learning of Noisy-or Networks.pdf		373.79 kB	Adobe PDF