The significance of a spectral feature is defined as the probability that the feature captures the structure of the data set at hand. Specifically, the significance is proportional to the variance of the feature within a given data set: the larger the variance, the higher the probability that the feature captures the underlying structure. This approach is particularly useful for selecting features that differentiate clusters of samples and for constructing self-organizing maps (SOMs) of clusters. A significance spectrum is obtained by plotting significance as a function of wavenumber. After the approach to feature significance was developed, the framework was applied to the construction of SOMs for clustering infrared spectra of bacteria. The significance framework consistently chooses features that make it possible to construct maps from reduced feature sets that are at least as good as maps constructed from the full feature sets. In addition, significance reliably picks features that are consistent with biological interpretations of the spectra.
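The variance-based definition can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "proportional to the variance" is made concrete by normalizing the per-wavenumber variances so they sum to one, and the function names (`significance_spectrum`, `select_features`), the wavenumber grid, and the synthetic spectra are all hypothetical.

```python
import numpy as np

def significance_spectrum(spectra):
    """Return one significance value per wavenumber.

    spectra: array of shape (n_samples, n_wavenumbers).
    Assumption: significance is the feature variance divided by the
    total variance, so the values sum to 1 and read as probabilities.
    """
    spectra = np.asarray(spectra, dtype=float)
    variances = spectra.var(axis=0)
    total = variances.sum()
    if total == 0:
        # No feature varies at all: fall back to a uniform spectrum.
        return np.full(spectra.shape[1], 1.0 / spectra.shape[1])
    return variances / total

def select_features(significance, keep):
    """Return the indices of the `keep` most significant features, sorted."""
    order = np.argsort(significance)[::-1]
    return np.sort(order[:keep])

# Synthetic data (illustrative only): 20 spectra, 50 wavenumbers,
# with two groups of samples that differ at column 30.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000, 400, 50)
spectra = rng.normal(0.0, 0.01, size=(20, 50))
spectra[:10, 30] += 1.0

sig = significance_spectrum(spectra)
top = select_features(sig, keep=5)
```

Plotting `sig` against `wavenumbers` would give a significance spectrum in the sense described above, and the columns in `top` form a reduced feature set that could be passed to a SOM in place of the full spectra.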
Vol. 7, Iss. 3 Virtual Journal for Biomedical Optics
Lutz Hamel and Chris W. Brown, "Bayesian Probability Approach to Feature Significance for Infrared Spectra of Bacteria," Appl. Spectrosc. 66, 48-59 (2012)