
Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction

Author(s): Zhou, Jian; Troyanskaya, Olga G

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr17n92
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zhou, Jian | -
dc.contributor.author | Troyanskaya, Olga G | -
dc.date.accessioned | 2021-10-08T19:47:41Z | -
dc.date.available | 2021-10-08T19:47:41Z | -
dc.date.issued | 2014 | en_US
dc.identifier.citation | Zhou, Jian, and Olga Troyanskaya. "Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction." In International Conference on Machine Learning 32, no. 1 (2014): pp. 745-753. | en_US
dc.identifier.uri | http://proceedings.mlr.press/v32/zhou14.html | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr17n92 | -
dc.description.abstract | Predicting protein secondary structure is a fundamental problem in protein structure prediction. Here we present a new supervised generative stochastic network (GSN) based method to predict local secondary structure with deep hierarchical representations. GSN is a recently proposed deep learning technique (Bengio & Thibodeau-Laufer, 2013) for globally training deep generative models. We present the supervised extension of GSN, which learns a Markov chain to sample from a conditional distribution, and apply it to protein structure prediction. To scale the model to full-sized, high-dimensional data, like protein sequences with hundreds of amino acids, we introduce a convolutional architecture, which allows efficient learning across multiple layers of hierarchical representations. Our architecture uniquely focuses on predicting structured low-level labels informed with both low and high-level representations learned by the model. In our application this corresponds to labeling the secondary structure state of each amino-acid residue. We trained and tested the model on separate sets of non-homologous proteins sharing less than 30% sequence identity. Our model achieves 66.4% Q8 accuracy on the CB513 dataset, better than the previously reported best performance of 64.9% (Wang et al., 2011) for this challenging secondary structure prediction problem. | en_US
dc.format.extent | 745 - 753 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | International Conference on Machine Learning | en_US
dc.rights | Final published version. Article is made available in OAR by the publisher's permission or policy. | en_US
dc.title | Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction | en_US
dc.type | Conference Article | en_US
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US
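
The abstract reports results as Q8 accuracy, which is generally understood as the fraction of residues whose predicted eight-state secondary structure label matches the reference label. The following is a minimal sketch of that metric for a single sequence, not the authors' evaluation code. The function name `q8_accuracy`, the one-letter state encoding (with `-` standing for coil), and the input validation are assumptions made for illustration; how the paper pools residues across the CB513 proteins is not stated in this record.

```python
from typing import Sequence

# The eight DSSP-style states. Using "-" for coil is an assumed encoding;
# some datasets write "C" or "L" instead.
Q8_STATES = frozenset("HGIEBTS-")


def q8_accuracy(predicted: Sequence[str], observed: Sequence[str]) -> float:
    """Return the fraction of residues whose predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(observed) == 0:
        raise ValueError("cannot compute accuracy for an empty sequence")

    # Reject labels outside the assumed eight-state alphabet.
    unknown = (set(predicted) | set(observed)) - Q8_STATES
    if unknown:
        raise ValueError(f"unknown state labels: {sorted(unknown)}")

    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # Toy 8-residue example with one mismatch at position 7 (T vs S).
    print(q8_accuracy("HHEE-TTS", "HHEE-TSS"))
```

For a dataset-level figure, one common choice is to concatenate all proteins' labels before calling the function, so that longer proteins contribute more residues; averaging per-protein scores is the alternative, and the two can differ.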

Files in This Item:
File | Description | Size | Format
DeepSupervisedConvolutionalGenerativeStochasticNetwork.pdf | | 456.49 kB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.