To refer to this page use:
|Abstract:||In this paper we focus on the principal component regression and its application to high dimension non-Gaussian data. The major contributions are in two folds. First, in low dimensions and under a double asymptotic framework where both the dimension d and sample size n can increase, by borrowing the strength from recent development in minimax optimal principal component estimation, we first time sharply characterize the potential advantage of classical principal component regression over least square estimation under the Gaussian model. Secondly, we propose and analyze a new robust sparse principal component regression on high dimensional elliptically distributed data. The elliptical distribution is a semiparametric generalization of the Gaussian, including many well known distributions such as multivariate Gaussian, rank-deficient Gaussian, t, Cauchy, and logistic. It allows the random vector to be heavy tailed and have tail dependence. These extra flexibilities make it very suitable for modeling finance and biomedical imaging data. Under the elliptical model, we prove that our method can estimate the regression coefficients in the optimal parametric rate and therefore is a good alternative to the Gaussian based methods. Experiments on synthetic and real world data are conducted to illustrate the empirical usefulness of the proposed method.|
|Citation:||Han, Fang, and Han Liu. "Robust sparse principal component regression under the high dimensional elliptical model." In Advances in Neural Information Processing Systems, pp. 1941-1949. 2013.|
|Pages:||1941 - 1949|
|Type of Material:||Conference Article|
|Journal/Proceeding Title:||Advances in Neural Information Processing Systems|
|Version:||Final published version. Article is made available in OAR by the publisher's permission or policy.|
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.