In this paper, a novel wavelength region selection algorithm, called elastic net grouping variable selection combined with partial least squares regression (EN-PLSR), is proposed for multi-component spectral data analysis. The EN-PLSR algorithm automatically selects groups of successive, strongly correlated predictor variables related to the response variable in two steps. First, a portion of the correlated predictors is selected and divided into subgroups by means of the grouping effect of elastic net estimation. Then, a recursive leave-one-group-out strategy is employed to further shrink the variable groups according to the root mean square error of cross-validation (RMSECV) criterion. Results on real near-infrared (NIR) spectroscopic data sets show that the EN-PLSR algorithm is competitive with full-spectrum PLS and moving window partial least squares (MWPLS) regression methods and that it is suitable for strongly correlated spectroscopic data.
Vol. 6, Iss. 5 Virtual Journal for Biomedical Optics
Guang-Hui Fu, Qing-Song Xu, Hong-Dong Li, Dong-Sheng Cao, and Yi-Zeng Liang, "Elastic Net Grouping Variable Selection Combined with Partial Least Squares Regression (EN-PLSR) for the Analysis of Strongly Multi-collinear Spectroscopic Data," Appl. Spectrosc. 65, 402-408 (2011)