
Sparse principal component regression

Year: 2013   Vol.: 62   No.: 1

Authors: Joseph Ryan G. Lansangan

Abstract:

Modeling of complex systems usually involves high-dimensional independent variables. Econometric models are usually built from time series data that often exhibit nonstationarity due to the impact of policies and other economic forces. Both settings are commonly affected by multicollinearity, which results in unstable least squares estimates of the linear regression coefficients. Principal component regression can provide a solution, but when the regressors are nonstationary or the dimension exceeds the sample size, the principal components may reduce to simple averages of the regressors, and the resulting model is difficult to interpret because the estimates of the regression coefficients are biased. A sparsity constraint is added to the least squares criterion to induce the sparsity needed for the components to reflect the relative importance of each regressor in a sparse principal component regression (SPCR) model. Simulated and real data are used to illustrate the method and assess its performance. In many cases, SPCR leads to better estimation and prediction than conventional principal component regression (PCR). SPCR is also able to recognize the relative importance of indicators from the sparse components used as predictors. SPCR can be used in modeling high-dimensional data, as an intervention strategy in regression with nonstationary time series data, and whenever there is a general problem of multicollinearity.
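
The abstract does not spell out the estimation algorithm, so the following is only a minimal sketch of the general idea: extract sparse components from the regressors, then regress the response on the component scores. It substitutes a soft-thresholded power iteration for the paper's sparsity-constrained least squares criterion. The function names (`sparse_loading`, `fit_spcr`, `predict_spcr`), the relative threshold `lam`, and the simulated data are all assumptions made for illustration and are not taken from the article.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries below t become exactly 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_loading(X, lam=0.3, n_iter=200):
    """One sparse loading vector from centered X (hypothetical helper).

    Power iteration on X'X with soft-thresholding at lam * max|X'X v|.
    With 0 <= lam < 1 the largest entry always survives, so the vector
    never collapses to zero. This is NOT the estimator used in the paper.
    """
    S = X.T @ X
    v = np.linalg.eigh(S)[1][:, -1]          # start from the ordinary first PC
    for _ in range(n_iter):
        w = S @ v
        w = soft_threshold(w, lam * np.max(np.abs(w)))
        v = w / np.linalg.norm(w)
    return v

def fit_spcr(X, y, n_components=1, lam=0.3):
    """Regress y on scores of sparse components (illustrative sketch)."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    R = Xc.copy()
    V = np.zeros((X.shape[1], n_components))
    for k in range(n_components):
        v = sparse_loading(R, lam)
        V[:, k] = v
        R = R - np.outer(R @ v, v)           # deflate: remove the found direction
    Z = Xc @ V                               # component scores
    A = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return x_mean, V, coef

def predict_spcr(model, X):
    x_mean, V, coef = model
    return coef[0] + (X - x_mean) @ V @ coef[1:]

# Simulated example (assumed setup): 60 regressors, 40 observations,
# only the first 5 regressors carry the latent signal f.
rng = np.random.default_rng(0)
n, p = 40, 60
f = rng.normal(size=n)
X = 0.1 * rng.normal(size=(n, p))
X[:, :5] += 2.0 * f[:, None]
y = 3.0 * f + 0.1 * rng.normal(size=n)

model = fit_spcr(X, y, n_components=1, lam=0.3)
selected = np.flatnonzero(model[1][:, 0])    # regressors with nonzero loading
pred = predict_spcr(model, X)
```

In this sketch the nonzero entries of the loading vector indicate which regressors contribute to the component, which is the sense in which a sparse component can reflect the relative importance of each regressor even when the dimension exceeds the sample size.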

Keywords: sparsity; high dimensionality; multicollinearity; nonstationarity; sparse principal components
