This study examines the preprocessing of Raman spectra from different biological samples and evaluates its effect on the ability to extract robust, quantitative information. Four data sets of Raman spectra were chosen to cover different aspects of biological Raman spectra; the samples comprised salmon oils, juice samples, salmon meat, and mixtures of fat, protein, and water. A range of frequently used preprocessing methods, as well as combinations of methods, was evaluated. Different aspects of the regression results obtained from partial least squares regression (PLSR) were used as indicators for comparing the effects of the preprocessing methods. The results, as expected, suggest that baseline correction should be performed before normalization. Total intensity normalization performed after adequate baseline correction gave robust calibration models for all data sets. Combination methods such as standard normal variate (SNV), multiplicative signal correction (MSC), and extended multiplicative signal correction (EMSC), in their basic forms, were not able to handle the baseline features present in several of the data sets, and these methods thus provided no additional benefit over baseline correction followed by total intensity normalization. EMSC offers additional possibilities that require further investigation.
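The order of operations recommended in the abstract (baseline correction first, total intensity normalization second) can be illustrated with a minimal Python sketch. The abstract does not say which baseline correction algorithm was used, so the iterative polynomial fit below is an assumed stand-in, and the function names (`iterative_polynomial_baseline`, `total_intensity_normalize`, `preprocess`, `snv`) and default parameters are hypothetical. A basic SNV is included only for comparison.

```python
import numpy as np


def iterative_polynomial_baseline(wavenumbers, spectrum, degree=2, n_iter=100):
    """Estimate a smooth baseline by repeated polynomial fitting.

    Assumed stand-in method: after each fit, points above the fit are
    clipped down to it, so the polynomial settles under the Raman bands.
    """
    # Standardize the x-axis so the polynomial fit is well conditioned.
    x = (wavenumbers - wavenumbers.mean()) / wavenumbers.std()
    work = np.asarray(spectrum, dtype=float).copy()
    fit = work
    for _ in range(n_iter):
        fit = np.polyval(np.polyfit(x, work, degree), x)
        work = np.minimum(work, fit)
    return fit


def total_intensity_normalize(spectrum):
    """Divide a spectrum by the sum of its intensities."""
    return spectrum / spectrum.sum()


def preprocess(wavenumbers, spectrum, degree=2):
    """Baseline correction first, then total intensity normalization."""
    baseline = iterative_polynomial_baseline(wavenumbers, spectrum, degree)
    return total_intensity_normalize(spectrum - baseline)


def snv(spectrum):
    """Basic standard normal variate: center and scale a single spectrum."""
    return (spectrum - spectrum.mean()) / spectrum.std()


if __name__ == "__main__":
    # Synthetic example: two Gaussian bands, then the same bands scaled
    # and placed on a curved background (values are illustrative only).
    x = np.linspace(400.0, 1800.0, 351)
    bands = (np.exp(-0.5 * ((x - 1000.0) / 15.0) ** 2)
             + 0.5 * np.exp(-0.5 * ((x - 1450.0) / 20.0) ** 2))
    distorted = 3.0 * bands + 1e-5 * (x - 900.0) ** 2 + 5.0

    clean_pp = preprocess(x, bands)
    distorted_pp = preprocess(x, distorted)
    print("max difference after preprocessing:",
          np.abs(clean_pp - distorted_pp).max())
```

In this sketch the background is a polynomial of the same degree as the fit, which is the most favorable case for the baseline step; real fluorescence backgrounds are less cooperative, and the polynomial degree would need to be chosen per data set.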
Vol. 2, Iss. 1 Virtual Journal for Biomedical Optics
Nils Kristian Afseth, Vegard Herman Segtnan, and Jens Petter Wold, "Raman Spectra of Biological Samples: A Study of Preprocessing Methods," Appl. Spectrosc. 60, 1358-1367 (2006)