The spectra in spectral reflectance datasets tend to be highly correlated and can therefore be represented more compactly using standard techniques such as principal components analysis (PCA) as part of a lossy compression strategy. However, the presence of outlier spectra can increase the overall error of the reconstructed spectra. This paper introduces a new outlier modeling (OM) method that detects and clusters outliers, then models them separately with their own sets of basis vectors. Outliers are defined in terms of the robust Mahalanobis distance, which is computed from the multivariate mean and covariance estimated robustly by the fast minimum covariance determinant algorithm. After the outliers are removed from the main dataset, the performance of PCA on the remaining data improves significantly; however, since outlier spectra are a part of the image, they cannot simply be ignored. The solution is to group the outliers into a small number of clusters and then model each cluster separately using its own cluster-specific PCA-derived bases. Tests show that OM leads to lower reconstruction errors for reflectance spectra in terms of both normalized RMS and goodness of fit.
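The pipeline described in the abstract (robust detection, clustering of outliers, and per-cluster PCA) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices are assumptions that the abstract does not specify: the chi-square 0.975 quantile used as the distance cutoff, k-means as the clustering algorithm, the number of clusters and bases, the particular error normalization in `relative_rms`, and the synthetic stand-in data produced by the hypothetical helper `synthetic_spectra`. The robust estimator is scikit-learn's `MinCovDet`, which implements the fast minimum covariance determinant algorithm.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.cluster import KMeans
from sklearn.covariance import MinCovDet
from sklearn.decomposition import PCA


def pca_reconstruct(X, n_bases):
    """Reconstruct X from its own mean and first n_bases principal components."""
    if X.shape[0] <= n_bases + 1:
        return X.copy()  # a mean plus n_bases vectors fits this few samples exactly
    pca = PCA(n_components=n_bases).fit(X)
    return pca.inverse_transform(pca.transform(X))


def detect_outliers(X, quantile=0.975, seed=0):
    """Flag rows whose squared robust Mahalanobis distance exceeds a cutoff."""
    mcd = MinCovDet(random_state=seed).fit(X)  # fast MCD location and covariance
    d2 = mcd.mahalanobis(X)                    # squared robust distances
    # Assumed cutoff: chi-square quantile with one degree of freedom per band.
    return d2 > chi2.ppf(quantile, df=X.shape[1])


def outlier_modeling(X, n_bases=3, n_clusters=2, quantile=0.975, seed=0):
    """Return (reconstruction, outlier mask) using separate bases per group."""
    is_outlier = detect_outliers(X, quantile, seed)
    X_hat = np.empty_like(X, dtype=float)
    # Inliers get their own PCA bases, fitted without the outliers.
    X_hat[~is_outlier] = pca_reconstruct(X[~is_outlier], n_bases)

    out_idx = np.flatnonzero(is_outlier)
    if out_idx.size:
        k = min(n_clusters, out_idx.size)
        # Assumed clustering step: k-means on the outlier spectra.
        km = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = km.fit_predict(X[out_idx])
        for c in range(k):
            members = out_idx[labels == c]
            # Each outlier cluster is modeled with cluster-specific bases.
            X_hat[members] = pca_reconstruct(X[members], n_bases)
    return X_hat, is_outlier


def relative_rms(X, X_hat):
    """RMS error divided by RMS signal (one possible normalization)."""
    return np.sqrt(np.mean((X - X_hat) ** 2) / np.mean(X ** 2))


def synthetic_spectra(seed=0, n_in=300, n_out=15, n_bands=16):
    """Hypothetical stand-in data: one main population plus two outlier groups."""
    rng = np.random.default_rng(seed)

    def noise(n):
        return 0.05 * rng.standard_normal((n, n_bands))

    mixing = rng.standard_normal((3, n_bands))
    groups = [rng.standard_normal((n_in, 3)) @ mixing + noise(n_in)]
    for _ in range(2):
        offset = 5 * rng.standard_normal(n_bands)
        mixing = rng.standard_normal((2, n_bands))
        groups.append(offset + rng.standard_normal((n_out, 2)) @ mixing + noise(n_out))
    return np.vstack(groups)
```

A comparison against a single global PCA with the same number of bases could then be run as `X = synthetic_spectra()`, `X_hat, mask = outlier_modeling(X)`, followed by `relative_rms(X, X_hat)` and `relative_rms(X, pca_reconstruct(X, 3))`. Note that the separate models require storing an extra mean and basis set per cluster, a cost this sketch does not account for.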
© 2014 Optical Society of America
Original Manuscript: January 24, 2014
Revised Manuscript: April 1, 2014
Manuscript Accepted: April 23, 2014
Published: June 12, 2014
Farnaz Agahian and Brian Funt, "Outlier modeling for spectral data reduction," J. Opt. Soc. Am. A 31, 1445-1452 (2014)