
MLASSO-Hum: A LASSO-based interpretable human-protein subcellular localization predictor

Author(s): Wan, S; Mak, M-W; Kung, S-Y

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1f18sf4j
Full metadata record
DC Field | Value | Language
dc.contributor.author | Wan, S | -
dc.contributor.author | Mak, M-W | -
dc.contributor.author | Kung, S-Y | -
dc.date.accessioned | 2024-01-21T19:18:23Z | -
dc.date.available | 2024-01-21T19:18:23Z | -
dc.date.issued | 2015-10-07 | en_US
dc.identifier.citation | Wan, S, Mak, M-W, Kung, S-Y. (2015). MLASSO-Hum: A LASSO-based interpretable human-protein subcellular localization predictor. Journal of Theoretical Biology, 382, 223-234. doi:10.1016/j.jtbi.2015.06.042 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1f18sf4j | -
dc.description.abstract | Knowing the subcellular compartments of human proteins is essential to shed light on the mechanisms of a broad range of human diseases. In computational methods for protein subcellular localization, knowledge-based methods (especially gene ontology (GO) based methods) are known to perform better than sequence-based methods. However, existing GO-based predictors often lack interpretability and suffer from overfitting due to the high dimensionality of feature vectors. To address these problems, this paper proposes an interpretable multi-label predictor, namely mLASSO-Hum, which can yield sparse and interpretable solutions for large-scale prediction of human protein subcellular localization. By using the one-vs-rest LASSO-based classifiers, 87 out of more than 8000 GO terms are found to play more significant roles in determining the subcellular localization. Based on these 87 essential GO terms, we can decide not only where a protein resides within a cell, but also why it is located there. To further exploit information from the remaining GO terms, a method based on the GO hierarchical information derived from the depth distance of GO terms is proposed. Experimental results show that mLASSO-Hum performs significantly better than state-of-the-art predictors. We also found that in addition to the GO terms from the cellular component category, GO terms from the other two categories also play important roles in the final classification decisions. For readers' convenience, the mLASSO-Hum server is available online at http://bioinfo.eie.polyu.edu.hk/mLASSOHumServer/. | en_US
dc.format.extent | 223 - 234 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | Journal of Theoretical Biology | en_US
dc.rights | Author's manuscript | en_US
dc.title | MLASSO-Hum: A LASSO-based interpretable human-protein subcellular localization predictor | en_US
dc.type | Journal Article | en_US
dc.identifier.doi | doi:10.1016/j.jtbi.2015.06.042 | -
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article | en_US
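
The abstract describes one-vs-rest LASSO-based classifiers that keep only a small set of essential GO terms per location. The general idea can be illustrated with a minimal sketch. This is not the authors' implementation: the term names, the toy data, the regularization strength `alpha`, the 0.5 decision threshold, and the plain coordinate-descent solver are all assumptions made for illustration, and the depth-distance step mentioned in the abstract is omitted.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values in [-gamma, gamma] become exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    resid = list(y)  # residual y - Xw, with w initialised to zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature can never be selected
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def fit_one_vs_rest(X, label_sets, labels, alpha):
    """Fit one LASSO model per label, with target 1.0 if the protein carries that label."""
    return {
        lab: lasso_cd(X, [1.0 if lab in s else 0.0 for s in label_sets], alpha)
        for lab in labels
    }


def selected_terms(models, term_names, tol=1e-9):
    """List the terms with non-zero weight for each label (the interpretable part)."""
    return {
        lab: [t for t, wj in zip(term_names, w) if abs(wj) > tol]
        for lab, w in models.items()
    }


def predict(models, x, threshold=0.5):
    """Multi-label decision: report every label whose linear score exceeds the threshold."""
    return {
        lab for lab, w in models.items()
        if sum(wj * xj for wj, xj in zip(w, x)) > threshold
    }


# Hypothetical toy data: rows are proteins, columns are GO-term indicators.
TERMS = ["term_a", "term_b", "term_c", "term_d"]
X = [
    [1, 0, 1, 0], [1, 0, 0, 1],
    [0, 1, 1, 0], [0, 1, 0, 1],
    [1, 1, 1, 0], [1, 1, 0, 1],
    [0, 0, 1, 0], [0, 0, 0, 1],
]
Y = [
    {"nucleus"}, {"nucleus"},
    {"cytoplasm"}, {"cytoplasm"},
    {"nucleus", "cytoplasm"}, {"nucleus", "cytoplasm"},
    set(), set(),
]

models = fit_one_vs_rest(X, Y, ["nucleus", "cytoplasm"], alpha=0.05)
essential = selected_terms(models, TERMS)
```

In this toy setting each label ends up with a single non-zero weight, so `essential` names the one term that explains each location, and `predict(models, [1, 1, 0, 1])` returns both labels for a protein annotated with both informative terms.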

Files in This Item:
File | Description | Size | Format
mLASSO_Hum_A_LASSO_based_interpretable_h.pdf | | 548.59 kB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.