1. Introduction
Mueller polarimetry consists of illuminating a scene with four well-chosen polarization states and measuring the Stokes vector of the light scattered by the scene for each incident polarization [1]. These measurements give access to the response of the observed material to any incident polarization state, all this information being gathered in the 4 × 4 Mueller matrix. In the design of Mueller polarimeters, the choice of the polarization states that minimize the estimation variance has been widely studied [2–9]. In these studies, it was generally assumed that the noise that affects the measurements is additive and independent of the level of the signal [10].
[1] B. Laude-Boulesteix, A. De Martino, B. Drévillon, and L. Schwartz, “Mueller Polarimetric Imaging System with Liquid Crystals,” Appl. Opt. 43(14), 2824–2832 (2004).
[2] K. M. Twietmeyer and R. A. Chipman, “Optimization of Mueller matrix polarimeters in the presence of error sources,” Opt. Express 16, 11589–11603 (2008).
[9] P. A. Letnes, I. S. Nerbø, L. M. S. Aas, P. G. Ellingsen, and M. Kildemo, “Fast and optimal broad-band Stokes/Mueller polarimeter design by the use of a genetic algorithm,” Opt. Express 18, 23095–23103 (2010).
[10] Y. Takakura and J. E. Ahmad, “Noise distribution of Mueller matrices retrieved with active rotating polarimeters,” Appl. Opt. 46, 7354–7364 (2007).
However, in many cases, the shot noise due to the useful signal is dominant compared to the signal-independent detector noise. This is for example the case in photon counting systems or quantum detectors with a sufficient level of light. It is thus important to determine the optimal Mueller polarimeter configurations in the presence of signal-dependent shot noise. In this paper, we propose a set of polarization states for which the estimation variance is minimal, in a given sense, and depends on the observed Mueller matrix only through its intensity reflectivity, not on its other polarimetric properties. This result is particularly important in Mueller imaging, since it makes it possible to estimate the Mueller matrices of all the materials present in the image with the same precision. This issue has already been addressed for Stokes polarimeters [11], but not, to the best of our knowledge, for Mueller imagers.
[11] F. Goudail, “Noise minimization and equalization for Stokes polarimeters in the presence of signal-dependent Poisson shot noise,” Opt. Lett. 34, 647–649 (2009).
The paper is organized as follows. In Section 2, we define the criterion used to quantify the performance of a Mueller polarimeter and illustrate it on the example of additive Gaussian noise. In Section 3, we find the polarimeter configurations that optimize this criterion in the presence of Poisson shot noise. In Section 4, we present simulations that validate the obtained results and illustrate the benefit of using the proposed optimal measurement configurations.
2. Performance criterion for a Mueller polarimeter
We consider Mueller polarimeters that perform $N = 16$ intensity measurements to estimate the Mueller matrix of a material. Let us denote by $M$ the 4 × 4 Mueller matrix to estimate. The measurement system is composed of an unpolarized light source of intensity $I_0$, a polarization state generator with matrix of states $A$, and a polarization state analyzer with matrix $B$. The matrices $A$ and $B$ contain sets of 4 Stokes vectors used respectively in illumination and analysis to acquire the Mueller matrix:
$$U = \frac{1}{2}\left[\mathbf{S}_1^U \;\; \mathbf{S}_2^U \;\; \mathbf{S}_3^U \;\; \mathbf{S}_4^U\right] \tag{2}$$
where $U = \{A, B\}$ and $\mathbf{S}_i^U$ are the unit intensity Stokes vectors of the polarization states used. The intensities acquired from the scene are thus given by:
$$I = I_0\, B^T M A \tag{3}$$
where $T$ denotes the transpose of the matrix, $I_0$ is the intensity coming from the light source, and $I$ is a 4 × 4 matrix containing the intensities obtained from the 16 measurements using the polarization states defined in the $A$ and $B$ matrices. In the following, to simplify equations, we will consider that we are estimating the Mueller matrix $I_0 M$.
Eq. (3) can thus be rewritten as follows:
$$V_I = [B \otimes A]^T\, V_M \tag{4}$$
where ⊗ denotes the Kronecker product [14] and $V_M$ and $V_I$ are 16-dimensional vectors obtained by reading respectively the matrices $I_0 M$ and $I$ in the lexicographic order.
[14] A. N. Langville and W. J. Stewart, “The Kronecker product and stochastic automata networks,” J. Comp. Appl. Math. 167, 429–447 (2004).
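The passage from the matrix form of the measurement to its Kronecker form can be checked numerically. The sketch below is only illustrative: it assumes that the matrix form is $I = B^T (I_0 M) A$ and that "lexicographic order" means row-by-row reading, and it uses random 4 × 4 matrices as stand-ins for $A$, $B$ and $I_0 M$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x4 stand-ins for A, B and the scaled Mueller matrix I0*M.
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))
M = rng.normal(size=(4, 4))

# Matrix form of the measurement (assumed): I = B^T M A.
I = B.T @ M @ A

# Lexicographic (row-by-row) reading of the two matrices.
V_M = M.reshape(-1)
V_I = I.reshape(-1)

# Vector form: V_I = [B (x) A]^T V_M.
W = np.kron(B, A).T
V_I_kron = W @ V_M

# Largest discrepancy between the two forms (should be at rounding level).
max_abs_diff = float(np.max(np.abs(V_I - V_I_kron)))
print(max_abs_diff)
```

With row-by-row reading, the identity vec($XMY$) = ($X \otimes Y^T$) vec($M$) applied to $X = B^T$, $Y = A$ gives exactly the $[B \otimes A]^T$ factor used in the text.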
In this paper, we will consider that the measurements can be disturbed by two kinds of noise: additive Gaussian noise (which can model sensor noise) and Poisson shot noise. The sensor noise will be modeled as Gaussian noise of zero mean and variance $\sigma^2$, while Poisson noise has the intrinsic property that its variance is equal to its mean. In the Poisson case, the variance of the noise disturbing each acquisition is thus equal to the mean of the measured intensity.
To estimate the Mueller matrix (and thus the vector $V_M$) from the noisy intensity measurements stacked in the vector $V_I$, we use the following estimator, which consists in inverting Eq. (4):
$$\hat{V}_M = \left([B \otimes A]^T\right)^{-1} V_I \tag{5}$$
If the noise disturbing the acquisition is additive Gaussian with zero mean, or Poisson distributed, $\hat{V}_M$ is an unbiased estimator, since
$$\langle \hat{V}_M \rangle = \left([B \otimes A]^T\right)^{-1} \langle V_I \rangle = V_M \tag{6}$$
where $\langle \cdot \rangle$ denotes ensemble averaging. Its covariance matrix has the following expression [10]:
$$\Gamma_{\hat{V}_M} = \left([B \otimes A]^T\right)^{-1}\, \Gamma_{V_I}\, [B \otimes A]^{-1} \tag{7}$$
where $\Gamma_{V_I}$ is the covariance matrix of $V_I$. Finally, a standard performance criterion for a Mueller polarimeter is the sum of the variances of all the elements of the Mueller matrix, which is the trace of $\Gamma_{\hat{V}_M}$:
$$\mathcal{C} = \mathrm{trace}\{\Gamma_{\hat{V}_M}\} \tag{8}$$
This criterion can be rewritten in a simpler form by using some properties of the Kronecker product and of the trace:
$$\mathcal{C} = \mathrm{trace}\{[Q_B \otimes Q_A]\, \Gamma_{V_I}\} \tag{9}$$
with $Q_U = (U^T U)^{-1}$.
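The equivalence between the trace of the estimator covariance and the compact form involving $Q_U = (U^T U)^{-1}$ can be illustrated as follows. This is a minimal sketch under assumptions: the matrices `A` and `B` are arbitrary well-conditioned 4 × 4 matrices (not physical polarization states), and `Gamma_VI` is a hypothetical diagonal covariance of the 16 intensities.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical well-conditioned measurement matrices and diagonal intensity covariance.
A = rng.normal(size=(4, 4)) + 3.0 * np.eye(4)
B = rng.normal(size=(4, 4)) + 3.0 * np.eye(4)
Gamma_VI = np.diag(rng.uniform(0.5, 2.0, size=16))

# Direct route: covariance of the estimator obtained by inverting [B (x) A]^T,
# then the sum of its diagonal elements.
W_inv = np.linalg.inv(np.kron(B, A).T)
Gamma_VM = W_inv @ Gamma_VI @ W_inv.T
C_direct = float(np.trace(Gamma_VM))

# Compact route: C = trace{(Q_B (x) Q_A) Gamma_VI}, with Q_U = (U^T U)^-1.
Q_A = np.linalg.inv(A.T @ A)
Q_B = np.linalg.inv(B.T @ B)
C_compact = float(np.trace(np.kron(Q_B, Q_A) @ Gamma_VI))

print(C_direct, C_compact)
```

The two routes agree because the trace is invariant under cyclic permutation and $(B^T B) \otimes (A^T A) = [B \otimes A]^T [B \otimes A]$.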
In order to illustrate the previous results, let us consider that we are in the presence of additive Gaussian noise. In this case, $V_I$ is a random vector such that each of its elements $[V_I]_i$, $i \in [1, 16]$, is a Gaussian random variable of mean value $\langle I_i \rangle$ and variance $\sigma^2$. We assume that the fluctuations are statistically independent from one intensity measurement to the other. The covariance matrix $\Gamma_{V_I}$ of $V_I$ is thus a diagonal matrix with diagonal elements equal to $\sigma^2$. In this case, the expression of the criterion $\mathcal{C}$ can be simplified as follows:
$$\mathcal{C} = \sigma^2\, \mathrm{trace}\{Q_A\}\, \mathrm{trace}\{Q_B\} \tag{10}$$
It has been shown that trace{$Q_U$} is minimized if the 4 Stokes vectors of $U$, indexed by $i \in [1, 4]$ (defined in Eq. (2)), form a regular tetrahedron on the Poincaré sphere [4, 12, 13]. Thus, to minimize $\mathcal{C}$, the matrices $A$ and $B$ must be of this form. Note that $A$ and $B$ need not be identical.
[4] J. S. Tyo, “Design of optimal polarimeters: maximization of signal-to-noise ratio and minimization of systematic error,” Appl. Opt. 41, 619–630 (2002).
[12] A. Ambirajan and D. C. Look, “Optimum angles for a polarimeter: part II,” Opt. Eng. 34, 1656–1658 (1995).
[13] J. S. Tyo, “Noise equalization in Stokes parameter images obtained by use of variable-retardance polarimeters,” Opt. Lett. 25, 1198–1200 (2000).
The variance on each coefficient of the Mueller matrix is given by:
and, by considering that $A$ and $B$ are two sets of polarization states forming a regular tetrahedron on the Poincaré sphere, the variances associated with each coefficient are given by:
where var[$M$] is a matrix containing the estimation variance of each coefficient of the Mueller matrix. The minimal value of $\mathcal{C}$ is thus equal to:
The variance on each coefficient does not depend on the observed Mueller matrix, which is expected in the presence of additive noise. In the next section, we will show that specific polarimeter configurations enable us to obtain similar properties in the presence of Poisson distributed shot noise.
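The additive Gaussian case can be made concrete with a short numerical sketch. It assumes the normalization stated later in the text (first row of $A$ and $B$ equal to 1/2, so each column is half a unit-intensity Stokes vector), takes $\sigma^2 = 1$, and uses one arbitrary regular tetrahedron with a vertex at the pole of the Poincaré sphere; the variable names are illustrative only.

```python
import numpy as np

# One regular tetrahedron on the Poincaré sphere (illustrative choice, vertex at the pole).
s = np.array([
    [0.0, 0.0, 1.0],
    [2 * np.sqrt(2) / 3, 0.0, -1 / 3],
    [-np.sqrt(2) / 3, np.sqrt(2 / 3), -1 / 3],
    [-np.sqrt(2) / 3, -np.sqrt(2 / 3), -1 / 3],
])

# Columns of U are unit-intensity Stokes vectors scaled by 1/2 (assumed normalization).
U = 0.5 * np.vstack([np.ones(4), s.T])
A = B = U

sigma2 = 1.0  # variance of the additive Gaussian noise (arbitrary units)

# trace{Q_U} with Q_U = (U^T U)^-1.
trace_Q = float(np.trace(np.linalg.inv(U.T @ U)))

# Covariance of the estimator for Gamma_VI = sigma2 * identity.
W_inv = np.linalg.inv(np.kron(B, A).T)
Gamma_VM = sigma2 * W_inv @ W_inv.T

# Variance of each Mueller coefficient, arranged as a 4x4 array, and their sum.
var_M = np.diag(Gamma_VM).reshape(4, 4)
C = float(var_M.sum())

print(trace_Q)
print(np.round(var_M, 6))
print(C)
```

Under these assumptions the sketch gives trace{$Q_U$} = 10, variances equal to $\sigma^2$ times the outer product of (1, 3, 3, 3) with itself, and $\mathcal{C} = 100\,\sigma^2$, none of which involves the observed Mueller matrix.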
3. Optimal Mueller matrix estimation in the presence of Poisson shot noise
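Let us now consider that we are in the presence of Poisson shot noise. In this case, $V_I$ is a random vector such that each of its elements $[V_I]_i$, $i \in [1, 16]$, is a Poisson random variable of mean value $\langle I_i \rangle$ and variance $\langle I_i \rangle$. From the properties of Poisson shot noise, the fluctuations are statistically independent from one intensity measurement to the other. The covariance matrix $\Gamma_{V_I}$ is thus diagonal of the form:
$$[\Gamma_{V_I}]_{i,j} = \langle I_i \rangle\, \delta_{ij} \tag{14}$$
where $\delta_{ij}$ is the Kronecker delta. Using the properties of the trace of a matrix, it is possible to rewrite the $\mathcal{C}$ criterion from Eq. (9) as:
$$\mathcal{C} = \sum_{i=1}^{16} \sum_{j=1}^{16} [Q_B \otimes Q_A]_{i,j}\, [\Gamma_{V_I}]_{j,i} \tag{15}$$
and substituting the expression of $[\Gamma_{V_I}]_{i,j}$ in Eq. (14), we obtain:
$$\mathcal{C} = V_M^T\, V_{(A,B)} \tag{16}$$
with the vector:
$$[V_{(A,B)}]_k = \sum_{i=1}^{16} [B \otimes A]_{k,i}\, [Q_B \otimes Q_A]_{i,i}, \quad k \in [1, 16] \tag{17}$$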
It is interesting to notice that, as the first row of the matrices $A$ and $B$ only consists of 1/2, every element of the first column of the matrix $[B \otimes A]^T$ is equal to 1/4, and thus the criterion $\mathcal{C}$ can be separated into two terms:
$$\mathcal{C} = \frac{[V_M]_1}{4}\, \mathrm{trace}\{Q_A\}\, \mathrm{trace}\{Q_B\} + V'^{T}_M\, V'_{(A,B)} \tag{18}$$
where $V'_U$ (with $U$ standing for $M$ or $(A,B)$) is a 15-dimensional vector containing the coefficients of $V_U$ from 2 to 16. It has to be noted that, contrary to the case where the noise is additive Gaussian, the criterion $\mathcal{C}$ depends on both the measurement matrices $(A, B)$ and the observed Mueller matrix. It is thus possible to find the pair of matrices $(A, B)$ that minimizes this criterion when observing a particular Mueller matrix $M$. However, as will be shown in the last section of this paper, this pair of measurement matrices is optimal only for one observed Mueller matrix, and can lead to high values of the criterion when used to estimate other matrices, which means high variances for some coefficients of the Mueller matrix. It is thus interesting to find the pair of matrices $(A, B)$ that minimizes the criterion $\mathcal{C}$ whatever the observed Mueller matrix. For that purpose, we will use a min/max approach.
If we consider a particular Mueller matrix $M$, it is always possible to find a pair of matrices $(A, B)$ leading to a negative value of the product $V'^{T}_M V'_{(A,B)}$. However, if we consider the physical Mueller matrix associated with a perfect depolarizer:
$$M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{19}$$
the vector $V'_M$ is null whatever the matrices $(A, B)$. We can thus say that, if we consider all the possible physical Mueller matrices, the maximal value of the product $V'^{T}_M V'_{(A,B)}$ is greater than or equal to zero. This is true for all measurement matrices $(A, B)$. We thus have the relation:
$$\forall (A, B), \quad \max_{M} \left[ V'^{T}_M\, V'_{(A,B)} \right] \geq 0 \tag{20}$$
Let us now consider two sets of 4 Stokes vectors spread over the Poincaré sphere and forming a regular tetrahedron. They are gathered in the two matrices of illumination and analysis $A$ and $B$ as presented in Eq. (2). This type of matrix has two interesting properties:
By substituting Eq. (21) in Eq. (18), the criterion $\mathcal{C}$ is rewritten as:
Finally, using the property in Eq. (22), we obtain that the product $V'^{T}_M V'_{(A,B)}$ is equal to 0, which is the minimal value that can be reached if we want to minimize the criterion $\mathcal{C}$ considering all the possible vectors $V'_M$ (see Eq. (20)). The conclusion is thus that, using Stokes vectors forming a regular tetrahedron on the Poincaré sphere, it is possible to minimize the maximal variance over all observed Mueller matrices, and the obtained value of the criterion $\mathcal{C}$ is then equal to:
The min/max value of $\mathcal{C}$ has the same expression as in the case of additive Gaussian noise, with $\sigma^2$ replaced by $[V_M]_1$, which corresponds to a variance since we are in the presence of Poisson shot noise.
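The min/max property can be probed numerically: for regular tetrahedra, the Poisson criterion should depend on the observed matrix only through its first coefficient. The sketch below is illustrative and rests on several assumptions: the 1/2 normalization of the first row of $A$ and $B$, independently and randomly rotated tetrahedra for $A$ and $B$, and a test Mueller matrix built as a diattenuator in the Lu and Chipman parametrization. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def rotated_tetrahedron(rotation):
    """Return U = 1/2 [S_1 S_2 S_3 S_4] for a rotated regular tetrahedron."""
    s = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
    return 0.5 * np.vstack([np.ones(4), rotation @ s.T])

def random_rotation():
    """Draw a 3x3 rotation matrix (orthogonal, determinant +1)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.linalg.det(q))

def diattenuator(D, axis):
    """Mueller matrix of a diattenuator with unit transmittance for unpolarized light."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    k = np.sqrt(1.0 - D**2)
    m_D = k * np.eye(3) + (1.0 - k) * np.outer(axis, axis)
    Dv = D * axis
    return np.block([[np.ones((1, 1)), Dv[None, :]], [Dv[:, None], m_D]])

def poisson_criterion(A, B, M):
    """C = sum_i <I_i> [Q_B (x) Q_A]_ii, using variance = mean for Poisson noise."""
    mean_I = np.kron(B, A).T @ M.reshape(-1)
    Q_diag = np.diag(np.kron(np.linalg.inv(B.T @ B), np.linalg.inv(A.T @ A)))
    return float(Q_diag @ mean_I)

M = diattenuator(0.5, [0.8, 0.6, 0.0])

# Ratio C / [V_M]_1 for several independently rotated tetrahedra in A and B.
ratios = [
    poisson_criterion(rotated_tetrahedron(random_rotation()),
                      rotated_tetrahedron(random_rotation()), M) / M[0, 0]
    for _ in range(5)
]
print(ratios)
```

Under the normalization assumed here, every ratio comes out at 25, the value reported for the tetrahedral configurations in Table 1, whatever the rotation applied to the tetrahedra.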
However, it must be noted that, contrary to the case of additive noise, the variances on each coefficient $[V_M]_i$ may vary with the value of $V_M$. Indeed, Eq. (7) yields
As the first row of the $A$ and $B$ matrices is always equal to 1/2, every element of the first column of the matrix $[B \otimes A]^T$ is equal to 1/4. Using this property, the variance can be expressed as:
In order to have variances that are independent of the Mueller matrix coefficients other than $[V_M]_1$, which is related to the reflectivity of the material, the second term of Eq. (26) has to be equal to zero. The question is thus: does there exist a regular tetrahedron for which this term is always equal to zero? To answer it, let us rewrite this term as follows:
with
The only two matrices that have the required property for all $i \in [2, 16]$ are (within arbitrary column permutations)
and the matrix $A_2 = B_2$ obtained from $A_1$ by reversing the signs of all the elements of the last three rows. The tetrahedron obtained from the matrix $A_1$ is presented in Fig. 1.
Fig. 1 Tetrahedron obtained from the matrix $A_1$ presented in Eq. (29). (a) Top view. (b) Global view.
The uniqueness of this result can be verified by an exhaustive search. Let us define the following criterion:
It is clear that $\mathcal{F}$ is equal to 0 if and only if the required property holds for all $i$ and $k$. The goal is now to compute the value of this criterion for all the regular tetrahedra with vertices on the Poincaré sphere. In order to generate these tetrahedra, we start from the tetrahedron presented in Eq. (29) and apply to it two different rotations, represented in Fig. 2. By varying the angle $\alpha$ from −90° to 90° and $\beta$ from −180° to 180°, it is possible to generate all the possible regular tetrahedra and to compute the criterion $\mathcal{F}$ for each of them. Note that for $\alpha = 0$ and $\beta = 0$, the generated tetrahedron is the optimal one. The obtained results are presented in Fig. 3.
Fig. 2 Definition of the angles of rotation $\alpha$ and $\beta$ used to generate any regular tetrahedron on the Poincaré sphere from the optimal one presented in Eq. (29).
Fig. 3 Evolution of the criterion ℱ as a function of the angles α and β.
We can observe that the criterion is minimal and equal to 0 only for combinations of $\alpha$ and $\beta$ taking the values −90°, 0° and 90°, and it is easily verified that all these combinations lead to one of the two optimal tetrahedra defined in Eq. (29). It is interesting to notice that the pairs of matrices $(A_1, B_2)$ and $(A_2, B_1)$ also yield a value of $\mathcal{F}$ equal to 0, and the conclusions are the same as those obtained when using the pairs $(A_1, B_1)$ and $(A_2, B_2)$ to estimate the Mueller matrix.
Using this optimal matrix for illumination ($A$) and analysis ($B$), the estimation variance on each coefficient of the Mueller matrix depends on the observed matrix only through $[V_M]_1$, and the variance of each coefficient is given by:
These variances are gathered in the matrix var[$M$], which is equal to:
We can observe that we obtain a result similar to the one obtained in the case of intensity disturbed by additive Gaussian noise (see Eq. (12)). The only difference is that the variance $\sigma^2$ is replaced by the coefficient $[V_M]_1$, which also represents a variance in the presence of Poisson noise. However, in the case of Poisson noise, these properties are not obtained for all polarimeter structures based on regular tetrahedra, but only in the case of the measurement matrices in Eq. (29).
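The difference between an arbitrary regular tetrahedron and the optimal one can be illustrated by computing the per-coefficient Poisson variances for two different observed matrices. This sketch makes a loud assumption: the optimal tetrahedron is taken to be the one whose vertices sit at alternating corners of a cube, $(\pm 1, \pm 1, \pm 1)/\sqrt{3}$, which has the property that the squared elements of each row are identical across columns; the matrix of Eq. (29) itself is not reproduced here. The second tetrahedron (vertex at the pole) and all helper names are hypothetical.

```python
import numpy as np

def stokes_matrix(s):
    """U = 1/2 [S_1 S_2 S_3 S_4] from four reduced Stokes vectors given as rows of s."""
    return 0.5 * np.vstack([np.ones(4), np.asarray(s, dtype=float).T])

def diattenuator(D, axis):
    """Diattenuator Mueller matrix (Lu and Chipman form, unit unpolarized transmittance)."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    k = np.sqrt(1.0 - D**2)
    Dv = D * axis
    m_D = k * np.eye(3) + (1.0 - k) * np.outer(axis, axis)
    return np.block([[np.ones((1, 1)), Dv[None, :]], [Dv[:, None], m_D]])

def poisson_variances(A, B, M):
    """Variance of each estimated Mueller coefficient under Poisson noise (4x4 array)."""
    W = np.kron(B, A).T
    mean_I = W @ M.reshape(-1)          # Poisson: variance of each intensity = its mean
    W_inv = np.linalg.inv(W)
    return ((W_inv**2) @ mean_I).reshape(4, 4)

# Assumed optimal tetrahedron (alternating cube corners) and an arbitrary one (vertex at pole).
cube = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
pole = np.array([
    [0.0, 0.0, 1.0],
    [2 * np.sqrt(2) / 3, 0.0, -1 / 3],
    [-np.sqrt(2) / 3, np.sqrt(2 / 3), -1 / 3],
    [-np.sqrt(2) / 3, -np.sqrt(2 / 3), -1 / 3],
])
A_opt, A_other = stokes_matrix(cube), stokes_matrix(pole)

M_a = diattenuator(0.5, [0.8, 0.6, 0.0])
M_b = diattenuator(0.42, [0.24, 0.24, 0.94])

var_opt_a = poisson_variances(A_opt, A_opt, M_a)
var_opt_b = poisson_variances(A_opt, A_opt, M_b)
var_other_a = poisson_variances(A_other, A_other, M_a)
var_other_b = poisson_variances(A_other, A_other, M_b)

print(np.round(var_opt_a, 4))
print(np.round(var_other_a, 4))
print(np.round(var_other_b, 4))
```

With the cube-corner tetrahedron, both observed matrices give the same variance array, $([V_M]_1/4)$ times the outer product of (1, 3, 3, 3) with itself under the normalization assumed here, while the pole-oriented tetrahedron gives arrays that change with the observed matrix.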
4. Examples & discussion
Let us now illustrate these results and their interest on an example. We consider the Mueller matrix of a diattenuator with diattenuation $D = 0.5$ and axis $\mathbf{D}$ given by $\mathbf{D}^T = [0.8, 0.6, 0]$ [15]. The intensity acquisitions are disturbed by Poisson shot noise and we use the same set of polarization states in illumination and analysis ($A = B$).
[15] S. Lu and R. A. Chipman, “Interpretation of Mueller matrices based on polar decomposition,” J. Opt. Soc. Am. A 13, 1106–1113 (1996).
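For readers who want to reproduce the example, a diattenuator Mueller matrix can be built from its diattenuation and axis. The sketch below uses the standard Lu and Chipman parametrization and assumes a unit transmittance for unpolarized light, a value the text does not specify; the function name `diattenuator` is illustrative.

```python
import numpy as np

def diattenuator(D, axis, T_u=1.0):
    """Mueller matrix of a diattenuator.

    D    : diattenuation (0 <= D <= 1)
    axis : 3-vector giving the diattenuation axis (normalized internally)
    T_u  : transmittance for unpolarized light (assumed equal to 1 here)
    """
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    Dv = D * axis                               # diattenuation vector
    k = np.sqrt(1.0 - D**2)
    m_D = k * np.eye(3) + (1.0 - k) * np.outer(axis, axis)
    return T_u * np.block([[np.ones((1, 1)), Dv[None, :]], [Dv[:, None], m_D]])

M1 = diattenuator(0.5, [0.8, 0.6, 0.0])

# Intensity transmitted for fully polarized light along and against the axis.
I_max = float((M1 @ np.array([1.0, 0.8, 0.6, 0.0]))[0])
I_min = float((M1 @ np.array([1.0, -0.8, -0.6, 0.0]))[0])

print(np.round(M1, 4))
print(I_max, I_min)
```

The two extreme transmitted intensities satisfy $(I_{max} - I_{min})/(I_{max} + I_{min}) = D$, which gives a quick sanity check on the construction.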
We consider three different configurations to estimate the Mueller matrix. The first one, which we call Min, consists in using the set of polarization states minimizing the criterion $\mathcal{C}$ presented in Eq. (23) for this matrix. The second, which we call Tetra, consists in using a set of polarization states forming an arbitrary regular tetrahedron on the Poincaré sphere. The associated matrix $A_{tetra}$ is given by:
Finally, the third case, called TetraMin/max, consists in using the set of polarization states forming the optimal regular tetrahedron on the Poincaré sphere defined in Eq. (29). For these 3 configurations, we compute the criterion $\mathcal{C}$ (see Eq. (18)) and the variance matrix var[$M$] by using the analytical form in Eq. (25). We have checked the validity of this expression with Monte Carlo simulations: when a sufficient number of realizations is used, one obtains a very good agreement with the theoretical values for all the Mueller matrices we have tested. The results are gathered in Table 1.
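A Monte Carlo check of this kind can be sketched as follows. This is not the authors' simulation code: the source intensity `I0`, the number of realizations `n_real`, the random seed, and the cube-corner tetrahedron used for $A = B$ are all assumptions chosen for illustration, and the analytical variance is taken as the diagonal of the covariance in Eq. (7) with a Poisson intensity covariance.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed measurement matrices: cube-corner regular tetrahedron, first row equal to 1/2.
s = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
A = B = 0.5 * np.vstack([np.ones(4), s.T])

# Diattenuator with D = 0.5 and axis [0.8, 0.6, 0] (Lu and Chipman form).
D, axis = 0.5, np.array([0.8, 0.6, 0.0])
k = np.sqrt(1.0 - D**2)
m_D = k * np.eye(3) + (1.0 - k) * np.outer(axis, axis)
M = np.block([[np.ones((1, 1)), (D * axis)[None, :]], [(D * axis)[:, None], m_D]])

I0 = 1000.0      # hypothetical source intensity, in photon counts
n_real = 20000   # hypothetical number of noise realizations

W = np.kron(B, A).T
W_inv = np.linalg.inv(W)
mean_I = W @ (I0 * M).reshape(-1)       # mean of the 16 intensities

# Simulate Poisson-distributed intensities and invert each realization.
counts = rng.poisson(mean_I, size=(n_real, 16))
V_hat = counts @ W_inv.T                # each row estimates vec(I0 * M)

# Empirical versus analytical variance of each coefficient.
var_mc = V_hat.var(axis=0).reshape(4, 4)
var_th = ((W_inv**2) @ mean_I).reshape(4, 4)
max_rel_err = float(np.max(np.abs(var_mc / var_th - 1.0)))
mean_hat = V_hat.mean(axis=0).reshape(4, 4)

print(np.round(var_th, 2))
print(max_rel_err)
```

With a few tens of thousands of realizations, the relative gap between empirical and analytical variances is expected to be of the order of a percent, and it shrinks as the number of realizations grows.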
Table 1 Variance of each coefficient of the Mueller matrix and efficiency criterion values obtained by using different sets of polarization states. Min: Set of polarization states minimizing the criterion 𝒞. Tetra: Set of polarization states forming a regular tetrahedron on the Poincaré sphere defined in Eq. (33). TetraMin/max: Optimal set of polarization states forming a regular tetrahedron on the Poincaré sphere defined in Eq. (29).
| | Min | Tetra | TetraMin/max |
|---|---|---|---|
| 𝒞M1 = [VM]1. | 23.32 | 25 | 25 |
We observe that the criterion $\mathcal{C}$ is, as expected, minimal in the configuration Min because the set of polarization states has been adapted to the measured matrix. Note that, in this configuration, the polarization states do not form a regular tetrahedron on the Poincaré sphere. Considering the two other configurations Tetra and TetraMin/max, the criterion $\mathcal{C}$ is equal to $(5/2)^2 = 6.25$, as found previously in Eq. (24). Let us now look at the variances of the different coefficients of the Mueller matrix. In the configuration Min, some coefficients have a lower variance than the one obtained by using the optimized regular tetrahedron presented in Eq. (29). However, others have a higher variance. This means that, even if the global estimation of the Mueller matrix seems to be more efficient when using the set of polarization states minimizing $\mathcal{C}$, some coefficients are estimated with a worse precision than when using the optimized regular tetrahedron (for example, the coefficient $M_{33}$, whose variance is 13% larger). The same observation can be made for the arbitrary regular tetrahedron: even if it leads to the same value of $\mathcal{C}$ as the optimal regular tetrahedron, some coefficients are estimated with a poorer precision than in the optimal case, for example the coefficient $M_{11}$, whose variance is 64% larger.
Moreover, it has to be noted that the set of polarization states used in the configuration Min has been optimized for one particular matrix. What are the consequences of using this set to estimate another Mueller matrix? Let us consider that we observe another diattenuator matrix, of diattenuation $D = 0.42$ with $\mathbf{D}^T = [0.24, 0.24, 0.94]$. The sets of polarization states used to estimate the Mueller matrix are kept the same, and the results are presented in Table 2.
Table 2 Variance of each coefficient of the Mueller matrix and efficiency criterion values obtained by using different sets of polarization states. Min: Set of polarization states minimizing the criterion 𝒞 for M1. Tetra: Set of polarization states forming a tetrahedron on the Poincaré sphere defined in Eq. (33). TetraMin/max: Optimal set of polarization states forming a tetrahedron on the Poincaré sphere defined in Eq. (29).
| | Min | Tetra | TetraMin/max |
|---|---|---|---|
| 𝒞M2 = [VM]1. | 25.8 | 25 | 25 |
First, we can observe that the criterion $\mathcal{C}$ in the configuration Min is now larger than the one obtained with the regular tetrahedra. Indeed, the set of polarization states used is not optimized for this matrix, which is why the variance increases. As expected, the value of the criterion does not change when using the tetrahedra. Considering now the variance of each coefficient, we can see that, in the configuration Min, some of them have a variance that is now 90% larger than the one obtained with the optimal tetrahedron, such as the coefficient $M_{11}$. The same observation can be made for the configuration Tetra, where the variance of the coefficient $M_{22}$ is now 74% larger than in the optimal configuration.
In conclusion, we have shown that using the set of polarization states presented in Eq. (29) makes it possible to minimize and equalize the variance of the different coefficients of the Mueller matrix to estimate. These variances do not depend on the polarimetric properties of the material, which is not the case when using any other set of polarization states. This configuration also avoids estimating Mueller matrices with very high variance on some coefficients.
However, in some applications, it may be interesting to estimate some coefficients with a higher precision than others. In this case, an optimization of the measurement configuration that takes into account the requirements of the application can be done using Eqs. (15) and (26).