Virtual Journal for Biomedical Optics

  • Editors: Andrew Dunn and Anthony Durkin
  • Vol. 6, Iss. 8 — Aug. 26, 2011
Minimum Description Length approach for unsupervised spectral unmixing of multiple interfering gas species

Julien Fade, Sidonie Lefebvre, and Nicolas Cézard


Optics Express, Vol. 19, Issue 15, pp. 13862-13872 (2011)
http://dx.doi.org/10.1364/OE.19.013862


Abstract

We present an original statistical method for unsupervised identification and concentration estimation of spectrally interfering gas components of unknown nature and number. We show that such spectral unmixing can be efficiently achieved using information criteria derived from the Minimum Description Length (MDL) principle, outperforming standard information criteria such as AICc or BIC. In the context of spectroscopic applications, we also show that the most efficient MDL technique implemented is robust to experimental artifacts.

© 2011 OSA

1. Introduction

Air pollution monitoring in the atmosphere has motivated the development of many active optical instruments based on absorption spectroscopy. Ideally, a single instrument should be able to detect and quantify numerous gas species. It is therefore appropriate to use an illumination source that covers a large spectral range. Two kinds of sources can be used: i) narrow-line lasers with broad tunability, and ii) instantaneous broadband sources. Both families have demonstrated high potential for the measurement of multi-component gas mixtures in the atmosphere. Narrow-line tunable lasers have been used in multi-wavelength systems such as Differential Absorption Lidars (DIAL) [1, 2] and Tunable-Diode Laser Absorption Spectroscopy (TD-LAS) [3]. Instantaneous broadband sources have been used in various experimental schemes such as Differential Optical Absorption Spectroscopy (DOAS) [4], open-path active Fourier-Transform InfraRed (FTIR) spectroscopy [5], white-light filament-induced spectroscopy [6], and more recently, supercontinuum fiber laser spectroscopy [7, 8]. These various techniques share a common experimental design, which is sketched in Fig. 1.

Fig. 1 Illustration of an absorption spectroscopy experiment using an active broadband illumination or tunable laser source.

All these techniques provide multi-spectral absorption data that can be processed by multivariate statistical analysis in order to characterize the gas mixture. When the number and nature of the chemicals are known a priori, efficient algorithms can be designed to estimate their concentrations [9–11]. However, in many practical cases, the number, nature, and concentration of gas components are all unknown. In such situations, the same algorithms tend to over-fit signal noise by assigning non-zero concentrations to many gas species in the fixed list of expected gases (all of them being estimated at the same time). This results in a complex and often unrealistic gas diagnosis. To avoid this, it is necessary to design unsupervised methods enabling simultaneous gas selection and concentration estimation. In this paper, we use the powerful concept of the Minimum Description Length (MDL) principle to tackle this problem. We illustrate the potential of the method for spectral unmixing of several chemicals in the mid-infrared range. This spectral range is of particular interest for air pollution monitoring, as many industrial and greenhouse gases exhibit strong absorption lines in this band.

In broad outline, the MDL principle is based on the idea that the best model describing the measured data must minimize the code length needed to describe the data and to encode the model itself [12]. Such a principle has already been applied in various domains, such as social sciences [13], biology [14], and radar signal processing [15]. For the first time to the best of our knowledge, we show that this principle can be used for spectroscopic applications. More precisely, the approach presented in this paper allows unsupervised spectral unmixing of gas mixtures to be carried out simply, with detection performance that exceeds that of standard information criteria.

2. Principle of unsupervised spectral unmixing algorithm

2.1. Posing of the problem

Before presenting the principle of the unsupervised spectral unmixing method addressed in this paper, let us detail the physical model that will be considered in the following. In most absorption spectroscopy experiments, one is interested in measuring a vector X containing intensity measurements on M spectral slits (or wavelengths), not necessarily adjacent. In the presence of absorbing gas species, these spectral measurements reveal specific absorption patterns depending on the nature and concentration of the chemicals encountered by the probe light beam. These spectral absorption patterns are superimposed on the spectral baseline of the active illumination source. The vector X of the measured intensities is linked to the K-dimensional vector c containing the gas concentrations, $c = [c_1, \ldots, c_K]^T$, through the following equation
$$X = \left(a_0^u \, e^{-H^u c}\right) * g,$$
(1)
where g denotes the spectral slit (or laser linewidth) convolution function, which is assumed known in the following. In this equation, $a_0^u$ denotes the baseline spectrum, and the M × K matrix $H^u = [h_1^u, h_2^u, \ldots, h_K^u]$ contains the unconvolved high-resolution absorption spectra of the K gas species. For the sake of simplicity, we will only consider in this paper the case of small absorption optical depths (i.e., $H^u c \ll 1$). Moreover, we assume that the baseline $a_0^u$ varies slowly with respect to both the absorption lines and the convolution function widths. In such conditions, the measured intensities can be written
$$X = a_0 \, e^{-Hc},$$
(2)
where the matrix $H = [(h_1^u * g), (h_2^u * g), \ldots, (h_K^u * g)]$ contains the convolved absorption spectra of the K gas species, and where the convolved spectral baseline $a_0$ is assumed known, either from instrumental calibration or from a precise radiometric model of the illumination source. More accurate models involving deconvolution procedures, as well as the influence of a possible resolution mismatch between the instrument and the model, are outside the scope of this paper but could deserve investigation in future work.
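As an illustration of how the convolved matrix H might be built from high-resolution spectra, the following minimal sketch convolves each column of a hypothetical array `Hu` with a normalized Gaussian slit function. The function names, the choice of a width expressed as a full width at half maximum in sample points, and the kernel truncation are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def gaussian_kernel(fwhm_pts, half_width=4.0):
    """Normalized Gaussian slit function sampled on the spectral grid.

    fwhm_pts: full width at half maximum in grid points (assumed convention).
    half_width: truncation of the kernel, in units of the standard deviation.
    """
    sigma = fwhm_pts / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    n = int(np.ceil(half_width * sigma))
    x = np.arange(-n, n + 1)
    g = np.exp(-0.5 * (x / sigma) ** 2)
    return g / g.sum()  # unit area, so integrated absorption is preserved

def convolve_spectra(Hu, g):
    """Convolve every column of the M x K matrix Hu with the kernel g."""
    cols = [np.convolve(Hu[:, k], g, mode="same") for k in range(Hu.shape[1])]
    return np.column_stack(cols)

# Toy usage: two isolated lines of different strength
Hu = np.zeros((101, 2))
Hu[50, 0] = 1.0
Hu[30, 1] = 2.0
H = convolve_spectra(Hu, gaussian_kernel(10.0))
```

Normalizing the kernel to unit sum keeps the integrated absorption of each line unchanged, which is convenient when comparing spectra at different resolutions.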

The noisy experimental intensity measurements over the M spectral slits, obtained for instance with a dispersive spectrometer or an FTIR spectrometer, will be denoted $\tilde{X}$ in the remainder of this paper. It is a common procedure to use the logarithm of the measured data so as to obtain a linear regression model of the following form:
$$\tilde{Y} = \ln \tilde{X} = b_0 - Hc + n,$$
(3)
with $b_0 = \ln a_0$, and where the M-dimensional zero-mean random vector n models the experimental noise. We assume that the noise contribution to the measured signal can be correctly accounted for with a Gaussian additive model. We also assume independence between the noise affecting two distinct spectral slits, i.e., $\langle n_i n_j \rangle = 0$ if $i \neq j$. For such a linear regression model, the usual estimator is $\hat{c} = (H^T H)^{-1} H^T (b_0 - \tilde{Y})$ and is usually referred to as the Minimum Mean Squared Error (MMSE) estimator since it minimizes the Residual Sum of Squares $\mathrm{RSS} = (\tilde{Y} - \hat{Y})^T (\tilde{Y} - \hat{Y})$, with $\hat{Y} = b_0 - H\hat{c}$.
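The estimation step of the linear model can be sketched as follows. This is a minimal illustration, assuming the log-data model Ỹ = b0 − Hc + n; the function name and the toy data (random positive "spectra", flat baseline, noise level) are hypothetical and chosen only to make the example runnable.

```python
import numpy as np

def least_squares_concentrations(H, b0, Y_tilde):
    """Estimate c in Y_tilde = b0 - H c + n; returns (c_hat, rss)."""
    y = b0 - Y_tilde  # absorption signal, equal to H c - n
    # lstsq solves the normal equations (H^T H) c = H^T y in a stable way
    c_hat, *_ = np.linalg.lstsq(H, y, rcond=None)
    Y_hat = b0 - H @ c_hat
    resid = Y_tilde - Y_hat
    return c_hat, float(resid @ resid)

# Toy data (assumed values, for illustration only)
rng = np.random.default_rng(0)
M, K = 200, 3
H = np.abs(rng.normal(size=(M, K)))   # stand-in for convolved spectra
b0 = np.zeros(M)                      # flat log-baseline
c_true = np.array([0.5, 0.2, 0.1])
Y_tilde = b0 - H @ c_true + 0.01 * rng.normal(size=M)

c_hat, rss = least_squares_concentrations(H, b0, Y_tilde)
```

Using `np.linalg.lstsq` rather than forming the inverse of H^T H explicitly gives the same estimator while being better conditioned when the spectra overlap strongly.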

2.2. Model selection

The issue of model selection arises in many practical situations. For the problem at hand, two questions have to be answered: how many gas components (regressors) do we need to describe the experimental data, and which regressors have to be selected in the linear regression model of Eq. (3) to best explain the observations? Without any model selection step, the most exhaustive regression model would include any gas species presenting characteristic absorption lines within the spectral range considered, and may lead to misleading and imprecise (if not incorrect) results, mostly due to overfitting of the noise. To avoid such undesirable situations, many penalization methods have been proposed, among which we can cite the Akaike Information Criterion (AIC) [18], the Bayesian Information Criterion (BIC) [19], the Risk Inflation Criterion (RIC) [20], etc. These so-called information criteria make it possible to introduce sparsity constraints in the regression model, by selecting the solution (i.e., the regressor matrix H) which minimizes $-\ell(\tilde{Y}|H) + \mathcal{C}$, where $\ell$ denotes the log-likelihood and the penalization term $\mathcal{C}$ depends on the information criterion considered. It can be noted however that since the negative log-likelihood is proportional to the logarithm of the RSS, up to an additive constant independent of the selected regression model [16], the model selection can be carried out equivalently by minimizing $\frac{M}{2} \ln \mathrm{RSS} + \mathcal{C}$.

Let us briefly recall two of the classical information criteria, which will be used in the remainder of the paper as benchmarks to assess the quality of the proposed MDL-based methods. The simplest is the Akaike Information Criterion (AIC) [18], which introduces a penalization term equal to the number K of regressors included in the model. In the case of samples of limited size, this penalization term can be refined; the resulting criterion is usually referred to as AICc, and its penalization term will be denoted $\mathcal{C}^{(a)}$ in the following, with [16]
$$\mathcal{C}^{(a)} = \frac{M}{2}\,\frac{1 + K/M}{1 - (K+2)/M}.$$
(5)
We shall also consider the well-known Bayesian Information Criterion (BIC) [19], whose penalization term reads
$$\mathcal{C}^{(b)} = \frac{K}{2} \ln M.$$
(6)
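The two benchmark penalties can be evaluated with a few lines of code. The sketch below assumes the forms (M/2)(1 + K/M)/(1 − (K + 2)/M) for AICc and (K/2) ln M for BIC, added to (M/2) ln RSS; the function names are hypothetical.

```python
import numpy as np

def aicc_penalty(M, K):
    """AICc penalty, Eq. (5); only meaningful for M > K + 2."""
    return 0.5 * M * (1.0 + K / M) / (1.0 - (K + 2) / M)

def bic_penalty(M, K):
    """BIC penalty, Eq. (6)."""
    return 0.5 * K * np.log(M)

def criterion(rss, M, penalty):
    """Quantity to minimize over candidate models: (M/2) ln RSS + penalty."""
    return 0.5 * M * np.log(rss) + penalty
```

For M = 400 spectral slits, each additional regressor raises the BIC penalty by (1/2) ln 400 ≈ 3.0, whereas the AICc penalty grows by roughly 1 per regressor near K = 4. This difference in marginal cost is consistent with the stronger tendency of AICc to select oversized models.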

Other information criteria can be found in abundance in the literature, which may suggest that an appropriate "most efficient" criterion can be designed for any given statistical problem. Among various attempts to build a general theoretical framework to interpret model complexity, the Minimum Description Length (MDL) principle introduced by Rissanen [12] is an interesting and fruitful approach. The MDL principle is based on the underlying idea that the best description of the data will be given by the model leading to the shortest code length (expressed in bits or in nats, with 1 nat = 1/ln 2 bits ≈ 1.44 bits) needed both to describe the data given the model and to encode the model itself [12, 16].

More recently, sophisticated forms of the MDL principle have been proposed, with a constant effort towards loosening the assumptions made on the observed data. We shall focus in the following on two MDL approaches whose expressions are recalled below. Detailed theoretical foundations of these MDL theories can be found in Refs. [12, 16, 21].

Mixture MDL and g-prior (gMDL): Within the framework of mixture MDL [21], a prior distribution is assigned to the vector parameter θ. With a specific choice of the prior distribution (Zellner's g-prior), one obtains the so-called gMDL, for which the criterion to minimize has the following closed-form expression [16]:
$$\min \begin{cases} \dfrac{M}{2} \ln \mathrm{RSS} + \mathcal{C}^{(g)} & \text{if } F > 1 \\[2mm] \dfrac{M}{2} \ln \left[ (b_0 - \tilde{Y})^T (b_0 - \tilde{Y}) \right] & \text{otherwise,} \end{cases}$$
(7)
where $F = (M - K)\left[(b_0 - \tilde{Y})^T (b_0 - \tilde{Y}) - \mathrm{RSS}\right] / (K\,\mathrm{RSS})$ is the standard F-ratio for testing the null model containing the spectral baseline only. The penalization term $\mathcal{C}^{(g)}$ in Eq. (7) is given in [16] and can be written
$$\mathcal{C}^{(g)} = \frac{K}{2} \ln F + \frac{M}{2} \ln \frac{M}{M-K}.$$
(8)
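A minimal sketch of the gMDL evaluation is given below. It assumes the piecewise form of Eq. (7) with the penalty (K/2) ln F + (M/2) ln[M/(M − K)], where y stands for the response b0 − Ỹ; the function name and argument conventions are hypothetical.

```python
import numpy as np

def gmdl_criterion(y, rss, K):
    """gMDL value for a model with K regressors.

    y   : response vector b0 - Y_tilde (length M)
    rss : residual sum of squares of the K-regressor fit
    """
    M = y.size
    yty = float(y @ y)
    null_value = 0.5 * M * np.log(yty)  # baseline-only branch of Eq. (7)
    if K == 0:
        return null_value
    F = (M - K) * (yty - rss) / (K * rss)  # F-ratio against the null model
    if F > 1:
        penalty = 0.5 * K * np.log(F) + 0.5 * M * np.log(M / (M - K))
        return 0.5 * M * np.log(rss) + penalty
    return null_value
```

Because the penalty depends on F, and therefore on the data, a regressor that explains little variance is charged less than in BIC but also gains less, and a model whose F-ratio does not exceed 1 is scored as the null model.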

Normalized Maximized Likelihood (nMDL): Lastly, we shall be interested in the recently proposed Normalized Maximized Likelihood form of the MDL [22]. This approach has proved efficient in various practical problems and has shown several optimality properties [12, 16]. For the statistical problem considered in this paper, the nMDL theory suggests introducing the following penalization terms [12]:
  • For a model including the baseline only:
    $$\mathcal{C}^{(n)} = \frac{M}{2} \ln \frac{2\pi}{M} + \frac{1}{2} \ln \frac{M}{2} + \ln \ln \frac{b}{a}.$$
    (9)
  • In any other case:
    $$\mathcal{C}^{(n)} = \frac{K}{2} \ln F + \frac{1}{2} \ln\left[K (M-K)\right] + \frac{K}{2} \ln M + \frac{M}{2} \ln \frac{2\pi}{M-K} + 2 \ln \ln \frac{b}{a} - \ln 2 + c,$$
    (10)
    where c denotes the code length needed for encoding the model. Following Rissanen, we use the code length
    $$c = \min\left\{ K_{max},\ \left[ -\ln \frac{K!\,(K_{max}-K)!}{K_{max}!} + \ln K + \log_2 \ln(e K_{max}) \right] \right\}$$
    (11)
    for a selection among $K_{max}$ potential regressors contained in the spectral database.

It must be noted that the nMDL approach requires the hyperparameters a and b to be estimated. According to Rissanen's indications [12], the estimator of the hyperparameter a is given by the RSS obtained with the most exhaustive model (i.e., $K_{max}$ regressors included), while the estimator of the hyperparameter b is the RSS obtained with the least exhaustive model (i.e., baseline only).
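The nMDL penalty can be sketched in the same way. The code below is only an illustration under stated assumptions: it uses the penalty expressions (M/2) ln(2π/M) + (1/2) ln(M/2) + ln ln(b/a) for the baseline-only model and (K/2) ln F + (1/2) ln[K(M − K)] + (K/2) ln M + (M/2) ln[2π/(M − K)] + 2 ln ln(b/a) − ln 2 + c otherwise, and it leaves the model code length c as a caller-supplied argument (named `model_code_length` here, a hypothetical name) rather than computing it.

```python
import numpy as np

def nmdl_penalty(M, K, F, a, b, model_code_length=0.0):
    """nMDL penalization term for a model with K regressors.

    a : RSS of the most exhaustive model (all database regressors)
    b : RSS of the baseline-only model; b > a is required so that
        ln(b / a) is positive and its logarithm is defined
    F : F-ratio of the candidate model (unused when K == 0)
    model_code_length : code length c of the model index (assumed given)
    """
    loglog = np.log(np.log(b / a))
    if K == 0:
        return 0.5 * M * np.log(2 * np.pi / M) + 0.5 * np.log(M / 2) + loglog
    return (0.5 * K * np.log(F)
            + 0.5 * np.log(K * (M - K))
            + 0.5 * K * np.log(M)
            + 0.5 * M * np.log(2 * np.pi / (M - K))
            + 2.0 * loglog
            - np.log(2.0)
            + model_code_length)
```

Since a and b are fixed once for a given data set, the terms in ln ln(b/a) only shift the comparison between the baseline-only model and the others; they do not affect the ranking among models with K ≥ 1.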

2.3. Stepwise algorithm for unsupervised spectral unmixing

Fig. 2 (a) Example of simulated noisy data with S-SNR=6.3 dB (blue curve) superimposed with the true spectrum (black curve) and baseline (green dashed curve). (b)–(f) Comparison of the reconstructed signal after various steps of nMDL-based stepwise model selection (red dotted curve) with the true spectrum (black curve).

Since we are concerned with absorption spectroscopy applications, we also implement a modified version of the algorithm so as to include a positivity constraint in the estimation results by rejecting models leading to physically unwanted negative concentration values.
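The positivity-constrained selection can be illustrated with a short sketch. The details of the stepwise procedure are not reproduced in this section, so the code below assumes a simple greedy forward selection and uses the BIC score for brevity; both choices, as well as all names and the toy data, are assumptions made for illustration and may differ from the algorithm actually used by the authors.

```python
import numpy as np

def fit(H, y, cols):
    """Least-squares fit of y on the selected columns; returns (coef, rss)."""
    if not cols:
        return np.zeros(0), float(y @ y)
    A = H[:, cols]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return coef, float(r @ r)

def bic(rss, M, K):
    return 0.5 * M * np.log(rss) + 0.5 * K * np.log(M)

def forward_stepwise(H, y, positive=True):
    """Greedy forward selection (assumed variant) with optional positivity."""
    M, Kmax = H.shape
    selected = []
    _, rss = fit(H, y, selected)
    best = bic(rss, M, 0)
    while len(selected) < Kmax:
        candidates = []
        for j in range(Kmax):
            if j in selected:
                continue
            coef, rss = fit(H, y, selected + [j])
            if positive and np.any(coef < 0):
                continue  # reject models with negative concentrations
            candidates.append((bic(rss, M, len(selected) + 1), j))
        if not candidates:
            break
        score, j = min(candidates)
        if score >= best:
            break  # no candidate improves the criterion
        best = score
        selected.append(j)
    return sorted(selected)

# Toy database of six well-separated Gaussian lines (illustrative values)
k = np.arange(200)
centers = [25, 55, 85, 115, 145, 175]
H = np.column_stack([np.exp(-0.5 * ((k - mu) / 5.0) ** 2) for mu in centers])
# Small deterministic perturbation standing in for measurement noise
y = 1.0 * H[:, 1] + 0.5 * H[:, 3] + 0.01 * (-1.0) ** k
selected = forward_stepwise(H, y)
```

Swapping `bic` for another criterion only requires replacing the scoring function, which is what makes a stepwise scheme convenient for comparing information criteria on the same data.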

3. Implementation and comparison of MDL-based information criteria

3.1. Simulated absorption spectroscopy experiment

We simulated a typical absorption spectroscopy experiment by numerically generating spectral measurements over M = 400 adjacent spectral slits, spanning between 3.2 and 3.6 μm, with a simulated instrumental spectral resolution of 2.3 nm (Gaussian slit function). The physical situation considered in this experiment consisted of a spectrally uniform illumination propagated through a gas mixture with 4 components: O₃ (6000 ppm.m), NO₂ (500 ppm.m), CH₄ (70 ppm.m) and H₂CO (30 ppm.m), where the numerical values in parentheses correspond to their respective path-length integrated concentrations.

The model selection was carried out with the stepwise algorithm presented above from a spectral database containing Kmax = 16 gas species, including the 4 gases of the "true" model and 12 spectrally interfering species (such as H₂O, N₂O, NH₃, HCl, etc.) with significant absorption strength within the spectral range considered. The strong spectral overlap of the database species can be checked in Fig. 3, where the absorption spectra of 8 gas species (among 16 in the spectral database) are plotted. In this figure, the spectra are convolved with a Gaussian kernel to match the spectral resolution of the instrument considered in the simulated experimental data.

Fig. 3 Absorption spectra of the 4 gas components present in the mixture (red curves) and of 4 other chemicals of the spectral database (black and green curves) with resolution 2.3 nm. The green curve corresponds to the absorption spectrum of HCl, which is used in section 3.4 to simulate anomalous measurements (outliers).

To account for experimental/detection noise, M statistically independent realizations of Gaussian random noise with variance σ² were added to the absorption spectra generated over M spectral slits. Varying the noise variance allowed us to simulate experiments with different values of the Signal to Noise Ratio (SNR), usually defined in the context of additive Gaussian noise as the ratio of the flat baseline value to the noise standard deviation σ. However, this quantity is poorly adapted to assess the difficulty of the estimation problem considered, since it only depends on the active illumination power, and does not depend on the absorption strength of the gas mixture to be detected. We thus introduce another figure of merit, denoted S-SNR for spectral SNR, and defined as:
$$\text{S-SNR} = \frac{\sqrt{\frac{1}{M} (b_0 - Y)^T (b_0 - Y)}}{\sigma} = \frac{\sqrt{\frac{1}{M}\, c^T H^T H c}}{\sigma}.$$
(12)
In this expression, the numerator can be interpreted as the root mean square of the absorption signal $b_0 - Y = Hc$ from which the nature and concentration of the gas components have to be estimated. An increase of the gas mixture concentrations accentuates the spectral absorption patterns in the measured spectrum, thus leading to an easier identification/estimation. In that case, it can be seen from the above definition that the S-SNR value is correspondingly increased. An example of simulated noisy data is given in Fig. 2(a) for an S-SNR of 6.3 dB.
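The figure of merit can be computed directly from its definition as the root mean square of the absorption signal Hc divided by σ. The sketch below works with the linear ratio only, since the decibel convention used for the quoted values is not specified here; the function names are hypothetical.

```python
import numpy as np

def spectral_snr(H, c, sigma):
    """S-SNR as a linear ratio: RMS of the absorption signal H c over sigma."""
    signal = H @ c
    return np.sqrt(np.mean(signal ** 2)) / sigma

def sigma_for_target(H, c, target_ssnr):
    """Noise standard deviation giving a prescribed linear S-SNR."""
    return np.sqrt(np.mean((H @ c) ** 2)) / target_ssnr

# Toy check: unit absorption on every slit and sigma = 0.5 gives S-SNR = 2
H_toy = np.eye(4)
c_toy = np.ones(4)
ssnr = spectral_snr(H_toy, c_toy, 0.5)
```

Doubling all concentrations doubles the S-SNR at fixed σ, whereas the conventional SNR (baseline over σ) stays unchanged, which is the reason for introducing this quantity.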

3.2. Simulation results

The results of the numerical simulations are summarized in Table 1, where the percentage of correct model selections is given for the 4 information criteria compared in this paper and for different SNRs. For each physical situation considered, this percentage is evaluated over R = 5 × 10³ realizations of the selection/estimation task on statistically independent simulated data. Two situations were considered according to whether light has undergone absorption from the gas mixture or not.

Table 1. Percentage of Correct Models Selected by Stepwise Algorithm with Four Information Criteria (AICc, BIC, gMDL, nMDL) and for Various Values of SNR

This table clearly reveals that in the context of unsupervised spectral unmixing, the MDL approaches implemented outperform the classical information criteria such as AICc or BIC, for reasonably high values of the SNR. This general result can be refined by observing that when the gas mixture is present, the nMDL is by far the most efficient criterion, with less than 2% erroneous selected models when S-SNR ≥ 6.3 dB, while the standard BIC selects approximately 17% erroneous models in the same conditions and AICc is strongly ineffective, leading to a large majority of erroneous selections. For lower values of the signal to noise ratio (S-SNR <4.3 dB) however, the percentage of correct models selected by nMDL strongly diminishes, and better performance is obtained with BIC. As for the gMDL approach, it can be noted that this criterion outperforms BIC for high SNRs (S-SNR≥ 9.8 dB), but the advantage quickly drops out as the SNR decreases.

To complement this analysis, it is interesting to focus on the distribution of the size of the selected models. In Fig. 4(a), the histogram of the selected model sizes is plotted for the 4 criteria and for an S-SNR of 6.3 dB. This figure reveals a clear tendency for AICc to overfit the noise patterns, thus leading to strongly overestimated model sizes. While the size distributions for BIC and gMDL are very similar, with approximately 16% of overestimated models (K = 5), it is interesting to note that nMDL appears very efficient at avoiding overfitting, with only 1% of overestimated selections and 0.4% of selections with only K = 3 components. This property has already been addressed in Ref. [12] and remains valid in the less favorable situations of low SNRs where nMDL is outperformed by BIC: when S-SNR = 2 dB, nMDL leads to only 53.5% of correct models, but more than 99% of the remaining selections have an underestimated size (K = 3) and the "missing" gas component is always H₂CO. In the context of absorption spectroscopy, this behavior seems interesting since it decreases the probability of erroneously detecting a gas component in excess and thus strengthens the confidence in the components selected with nMDL.

Fig. 4 Histograms of the number of regressors selected by AICc, BIC, gMDL and nMDL criteria for S-SNR=6.3 dB, (a): with a 4-components gas mixture; (b): without gas mixture.

Let us now analyze the second physical situation considered in the simulations, where the illumination beam does not undergo any absorption. In this situation, it appears clearly again that MDL approaches lead to better results when compared with standard criteria such as AICc and BIC. Once again, this result can be interpreted from the ability of MDL approaches to avoid overfitting, which can also be checked on the histograms of Fig. 4(b). With approximately 99% of correct models, the gMDL criterion leads to the lowest probability of false alarm Pfa ≃ 1 – 0.99 = 1%, which we define as the probability of detecting a gas mixture when there is none. On this particular point, the nMDL criterion appears less efficient, with Pfa ≃ 6.5%.

3.3. Influence of a positivity constraint

As stated in section 2.3, we also implemented a positivity-constrained version of the stepwise algorithm to provide physically acceptable results in the context of absorption spectroscopy. As can be checked in Table 2, such a constraint noticeably improves the quality of model selection with all the criteria considered. For instance, with a S-SNR=6.3 dB, the positivity-constrained algorithm selects 42.4% of correct models with AICc, 91.2% with BIC, 90.2% with gMDL and 99.3% with nMDL. When there is no gas mixture, the proportion of erroneous rejection of the null model hypothesis also diminishes whatever the criterion considered.

Table 2. Percentage of Correct Models Selected by Stepwise Algorithm Implementing Positivity Constraint on Regression Coefficients (i.e., on Gas Components Concentrations)

It can be noticed however that the performance of nMDL is less influenced by this constraint than the other criteria. This property may indicate that the nMDL criterion is intrinsically efficient at avoiding non-physical results in the context addressed here, even if no positivity-constraint is applied to the algorithm.

3.4. Influence of outliers

To complete our analysis, we study the influence of measurement outliers. In practical situations of in-field experiments, many sources of measurement artifacts may exist, and it is likely that some amount of anomalous measures may occur. It is thus interesting to check the robustness of the implemented methods to the occurrence of outliers.

The results obtained are summarized in Table 3, where the percentage of correct models is given for an S-SNR of 6.3 dB on the averaged signal. Once again, it can be clearly seen that in the context of spectral unmixing of interfering gas species, the nMDL criterion outperforms the other methods, with still 90% of correct models for a significant amount of outliers (20%). It can also be noted that the inclusion of outliers does not influence the Pfa obtained with nMDL (approximately 1 – 0.933 ≃ 6.7%) while this quantity noticeably decreases when other criteria are implemented.

Table 3. Percentage of Correct Models Selected by Stepwise Algorithm with S-SNR=6.3 dB for Varying Proportion of Measurement Outliers

4. Conclusion

In this paper, we presented an original technique for unsupervised spectral unmixing of multiple gas species. More precisely, we have shown that two Minimum Description Length approaches can be successfully implemented in a stepwise model selection algorithm. Applied to spectroscopic data, this algorithm allows one to estimate the number, nature and concentration of the components of an unknown gas mixture without requiring adjustment of any parameter.

In the context addressed in this paper, numerical simulations have demonstrated that the MDL approaches outperform the standard information criteria tested (AICc, BIC). When a gas mixture is present within the path of the illumination beam, the gMDL approach does not provide great improvement in comparison to classical BIC, but we illustrated its efficiency in avoiding false alarms when no gas mixture is present. However, the most promising results for a practical implementation were obtained with the Normalized Maximized Likelihood (nMDL) approach, which seems to be a very interesting alternative to standard criteria, and can still be implemented with a simple algorithm. The nMDL criterion strongly outperforms the other methods for reasonable values of the SNR and provides the best robustness to measurement artifacts.

A promising perspective for this work is the opportunity to apply this method to experimental spectroscopic data, thanks to the recent development in our laboratories of appropriate powerful mid-infrared sources with a broadband spectrum [23] or with a highly tunable operating wavelength [24]. It must be noted that this approach is not limited to the case of absorption spectroscopy, and could also be applied in many situations requiring spectral unmixing (Raman spectroscopy, hyperspectral data processing, etc.). A further analysis of the influence of the spectral resolution and of the noise model would also be a useful theoretical continuation of this work, as would the study of detection performance. A comparison of the MDL-based model selection techniques presented in this paper with other parsimonious model selection methods such as the lasso approaches [25, 26] is another interesting perspective.

References and links

1. P. Weibring, C. Abrahamsson, M. Sjöholm, J. N. Smith, H. Edner, and S. Svanberg, “Multi-component chemical analysis of gas mixtures using a continuously tuneable lidar system,” Appl. Phys. B 79, 525–530 (2004).
2. J. R. Quagliano, P. O. Stoutland, R. R. Petrin, R. K. Sander, R. J. Romero, M. C. Whitehead, C. R. Quick, J. J. Tiee, and L. J. Jolin, “Quantitative chemical identification of four gases in remote infrared (9–11 μm) differential absorption lidar experiments,” Appl. Opt. 36, 1915–1927 (1997).
3. G. Wysocki, R. Lewicki, R. Curl, F. Tittel, L. Diehl, F. Capasso, M. Troccoli, G. Hofler, D. Bour, S. Corzine, R. Maulini, M. Giovannini, and J. Faist, “Widely tunable mode-hop free external cavity quantum cascade lasers for high resolution spectroscopy and chemical sensing,” Appl. Phys. B 92, 305–311 (2008).
4. U. Platt and J. Stutz, Differential Optical Absorption Spectroscopy (Springer, 2008).
5. R. A. Hashmonay, R. M. Varma, M. Modrak, R. H. Kagann, and P. D. Sullivan, “Simultaneous measurement of vaporous and aerosolized threats by active open path FTIR,” Unclassified Technical Report ADA449529, Arcadis Geraghty and Miller Research, Triangle Park, NC (2004).
6. J. Kasparian, M. Rodriguez, G. Méjean, J. Yu, E. Salmon, H. Wille, R. Bourayou, S. Frey, Y. André, A. Mysyrowicz, R. Sauerbrey, J. Wolf, and L. Wöste, “White-light filaments for atmospheric analysis,” Science 301, 61–64 (2003).
7. D. M. Brown, K. Shi, Z. Liu, and C. R. Philbrick, “Long-path supercontinuum absorption spectroscopy for measurement of atmospheric constituents,” Opt. Express 16, 8457–8471 (2008).
8. P. S. Edwards, A. M. Wyant, D. M. Brown, Z. Liu, and C. R. Philbrick, “Supercontinuum laser sensing of atmospheric constituents,” Proc. SPIE 7323, 73230S (2009).
9. E. R. Warren, “Optimum detection of multiple vapor materials with frequency-agile lidar,” Appl. Opt. 35, 4180–4193 (1996).
10. S. Yin and W. Wang, “Novel algorithm for simultaneously detecting multiple vapor materials with multiple-wavelength differential absorption lidar,” Chin. Opt. Lett. 4, 360–363 (2006).
11. J. Fade and N. Cézard, “Supercontinuum lidar absorption spectroscopy for gas detection and concentration estimation,” in Proceedings of the 25th International Laser and Remote-sensing Conference (2010), pp. 798–801.
12. J. Rissanen, Information and Complexity in Statistical Modeling (Springer, 2007).
13. R. A. Stine, “Model selection using information theory and the MDL principle,” Sociolog. Methods Res. 33, 230–260 (2004).
14. C. D. Giurcaneanu, “Stochastic complexity for the detection of periodically expressed genes,” in Proceedings of the IEEE International Workshop on Genomic Signal Processing and Statistics (2007), pp. 1–4.
15. H. Chen, T. Kirubarajan, Y. Bar-Shalom, and K. R. Pattipati, “MDL approach for multiple low-observable track initiation,” Proc. SPIE 4728, 477–488 (2002).
16. M. Hansen and B. Yu, “Model selection and the principle of minimum description length,” J. Am. Stat. Assoc. 96, 746–774 (2001).
17. C. L. Mallows, “Some comments on Cp,” Technometrics 15, 661–675 (1973).
18. H. Akaike, “A new look at the statistical model identification,” IEEE Trans. Autom. Control 19, 716–723 (1974).
19. G. Schwarz, “Estimating the dimension of a model,” Ann. Stat. 6, 461–464 (1978).
20. D. P. Foster and E. I. George, “The risk inflation criterion for multiple regression,” Ann. Stat. 22, 1947–1975 (1994).
21. J. Rissanen, Stochastic Complexity in Statistical Inquiry, Series in Computer Science (World Scientific, 1989), Vol. 15.
22. J. Rissanen, “Fisher information and stochastic complexity,” IEEE Trans. Inf. Theory 42, 48–54 (1996).
23. M. Duhant, W. Renard, G. Canat, F. Smektala, J. Troles, P. Bourdon, and C. Planchat, “Improving mid-infrared supercontinuum generation efficiency by pumping a fluoride fiber directly into the anomalous regime at 1995 nm,” in CLEO/Europe and EQEC 2011 Conference Digest (2011), p. CD9_1 (to be published).
24. A. Berrou, M. Raybaut, A. Godard, and M. Lefebvre, “High-resolution photoacoustic and direct absorption spectroscopy of main greenhouse gases by use of a pulsed entangled cavity doubly resonant OPO,” Appl. Phys. B 98, 217–230 (2010).
25. R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. R. Stat. Soc. Ser. B 58, 267–288 (1996).
26. E. J. Candès and Y. Plan, “Near-ideal model selection by ℓ1 minimization,” Ann. Stat. 37, 2145–2177 (2009).

OCIS Codes
(070.4790) Fourier optics and signal processing : Spectrum analysis
(280.1120) Remote sensing and sensors : Air pollution monitoring
(300.0300) Spectroscopy : Spectroscopy
(010.1030) Atmospheric and oceanic optics : Absorption
(010.0280) Atmospheric and oceanic optics : Remote sensing and sensors

ToC Category:
Spectroscopy

History
Original Manuscript: March 31, 2011
Revised Manuscript: May 6, 2011
Manuscript Accepted: May 9, 2011
Published: July 6, 2011

Virtual Issues
Vol. 6, Iss. 8 Virtual Journal for Biomedical Optics

Citation
Julien Fade, Sidonie Lefebvre, and Nicolas Cézard, "Minimum description length approach for unsupervised spectral unmixing of multiple interfering gas species," Opt. Express 19, 13862-13872 (2011)
http://www.opticsinfobase.org/vjbo/abstract.cfm?URI=oe-19-15-13862

