
High Dimensional Semiparametric Scale-Invariant Principal Component Analysis

Author(s): Han, Fang; Liu, Han

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1kz3f
Abstract: We propose a new high dimensional semiparametric principal component analysis (PCA) method, named Copula Component Analysis (COCA). The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. COCA improves upon PCA and sparse PCA in three aspects: (i) It is robust to modeling assumptions; (ii) It is robust to outliers and data contamination; (iii) It is scale-invariant and yields more interpretable results. We prove that the COCA estimators obtain fast estimation rates and are feature selection consistent when the dimension is nearly exponentially large relative to the sample size. Careful experiments confirm that COCA outperforms sparse PCA on both synthetic and real-world data sets.
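
The scale-invariance claim in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the latent correlation matrix is estimated, so the code below assumes a rank-based estimate built from Kendall's tau via the transform sin(pi/2 * tau), a common choice for Gaussian copula models. The function names `rank_based_correlation` and `leading_direction` are hypothetical, the data are simulated, and the sparsity step of a sparse PCA solver is omitted (a plain eigendecomposition stands in for it).

```python
import numpy as np
from scipy.stats import kendalltau


def rank_based_correlation(X):
    """Estimate a latent correlation matrix from ranks only.

    Assumption (not taken from the abstract): each entry is
    sin(pi/2 * tau), where tau is Kendall's tau for that pair of columns.
    """
    n, d = X.shape
    R = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            tau, _ = kendalltau(X[:, j], X[:, k])
            R[j, k] = R[k, j] = np.sin(0.5 * np.pi * tau)
    return R


def leading_direction(R):
    """Leading eigenvector of R (plain PCA step, no sparsity penalty)."""
    vals, vecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    v = vecs[:, -1]
    # Fix the sign so the largest-magnitude entry is positive.
    return v if v[np.argmax(np.abs(v))] > 0 else -v


# Simulated latent Gaussian data with a known correlation structure.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8, 0.0],
                [0.8, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
Z = rng.multivariate_normal(np.zeros(3), cov, size=300)

# Observed data: each margin passes through a different strictly
# increasing transformation, including a large rescaling.
X = np.column_stack([np.exp(Z[:, 0]), Z[:, 1] ** 3, 1000.0 * Z[:, 2]])

R_latent = rank_based_correlation(Z)
R_observed = rank_based_correlation(X)
v = leading_direction(R_observed)
```

Because Kendall's tau depends only on the ordering of the observations, `R_observed` matches `R_latent` even though the third column of `X` has been rescaled by a factor of 1000. Ordinary PCA on the sample covariance of `X` would instead be dominated by that rescaled column.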
Publication Date: Oct-2014
Citation: Han, Fang, and Han Liu. "High Dimensional Semiparametric Scale-Invariant Principal Component Analysis." IEEE Transactions on Pattern Analysis and Machine Intelligence 36, no. 10 (2014): 2016-2032. doi:10.1109/TPAMI.2014.2307886
DOI: 10.1109/TPAMI.2014.2307886
ISSN: 0162-8828
EISSN: 1939-3539
Pages: 2016 - 2032
Type of Material: Journal Article
Journal/Proceeding Title: IEEE Transactions on Pattern Analysis and Machine Intelligence
Version: Author's manuscript



Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.