In typical near-infrared multivariate statistical analyses, samples with similar spectra produce points that cluster in a particular region of spectral hyperspace. These clusters can vary significantly in shape and size because of variation in sample packing, particle-size distribution, component concentration, and drift over time. When these factors are combined with discriminant analysis based on simple distance metrics, a result that places a point inside a cluster does not necessarily mean that the point is actually a member of that cluster. Instead, the point may belong to a new, slightly different cluster that overlaps the first. Such a cluster can arise from factors like low-level contamination or instrumental drift. An extension added to part of the BEAST (<u>B</u>ootstrap <u>E</u>rror-<u>A</u>djusted <u>S</u>ingle-sample <u>T</u>echnique) can be used to set nonparametric probability-density contours inside spectral clusters as well as outside them. When multiple points begin to appear in a certain region of cluster hyperspace, the perturbation of these density contours can be detected at an assigned significance level. With this method, false samples can be detected both within and beyond 3 SDs of the center of the training set. The procedure is shown to be effective for contaminant levels of a few hundred ppm in an over-the-counter drug capsule and to function with as few as one or two wavelengths, suggesting its application to very simple process sensors.
Robert A. Lodder and Gary M. Hieftje, "Detection of Subpopulations in Near-Infrared Reflectance Analysis," Appl. Spectrosc. 42, 1500-1512 (1988)
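
The idea of a nonparametric, direction-dependent distance from a training cluster can be illustrated with a minimal sketch. This is not the authors' published algorithm: the function names (`bootstrap_centers`, `directional_sd_distance`), the use of the 84.13th percentile as a one-sided "1 SD" estimate, and the rescaling by the square root of the training-set size are all assumptions made here for illustration. The sketch covers only the distance of a single point in SD units and does not attempt the contour-perturbation test for subpopulations.

```python
import numpy as np


def bootstrap_centers(X, n_boot=500, seed=None):
    """Return the centers (means) of n_boot bootstrap resamples of the rows of X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Each row of idx selects n training samples with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    return X[idx].mean(axis=1)  # shape: (n_boot, n_wavelengths)


def directional_sd_distance(X, x_new, n_boot=500, seed=None):
    """Distance of x_new from the training center, in directional SD units.

    The spread is estimated only along the line from the center to x_new,
    and only on the side facing x_new, so asymmetric clusters are allowed.
    """
    centers = bootstrap_centers(X, n_boot, seed)
    center = centers.mean(axis=0)
    direction = x_new - center
    dist = np.linalg.norm(direction)
    if dist == 0.0:
        return 0.0
    unit = direction / dist
    # Project the bootstrap replicates onto the direction of the test point.
    proj = (centers - center) @ unit
    # Assumption: the 84.13th percentile minus the median stands in for +1 SD.
    sd_of_center = np.quantile(proj, 0.8413) - np.median(proj)
    # Assumption: rescale the spread of the centers to sample-level spread.
    sd = sd_of_center * np.sqrt(X.shape[0])
    if sd <= 0.0:
        raise ValueError("degenerate bootstrap distribution along this direction")
    return float(dist / sd)


if __name__ == "__main__":
    # Synthetic two-wavelength "training set" (hypothetical data).
    rng = np.random.default_rng(0)
    training = rng.normal(size=(200, 2))
    for point in ([0.5, 0.0], [5.0, 0.0]):
        d = directional_sd_distance(training, np.array(point), seed=1)
        print(point, "is", round(d, 2), "directional SDs from the center")
```

A point scoring more than 3 on this scale would be flagged by a simple distance test. The case the abstract emphasizes, false samples lying within 3 SDs, requires examining how groups of new points distort the density contours, which this sketch does not do.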