1. Introduction
Measurements of the polarization state of light are foundational to optical physics and applied optics, and are an important avenue by which classical pictures connect with the quantum theory of light. Polarimetry is equally important in the description of light scattering that is vital to atmospheric and environmental physics. Polarization measurements can generally be divided up into either time-sequential operations (in which a sequence of measurements using various retarders and analyzers is used to deduce the Stokes parameters) [
1. C. A. Skinner, “The polarimeter and its practical applications,” J. Franklin Inst. 196, 721–750 (1923); 2. E. Collett, Polarized Light: Fundamentals and Applications (CRC Press, 1992).
] or simultaneous measurements (in which either the amplitude or the wavefront is divided and directed to different sensors using suitable analyzers) [
3. R. Azzam, “Arrangement of four photodetectors for measuring the state of polarization of light,” Opt. Lett. 10, 309–311 (1985); 4. H. Luo, K. Oka, E. DeHoog, M. Kudenov, J. Schwiegerling, and E. L. Dereniak, “Compact and miniature snapshot imaging polarimeter,” Appl. Opt. 47, 4413–4417 (2008).
]. As imaging detectors have advanced, pixel-level filtering has provided an important avenue by which standard polarimetry can be applied to imaging polarimetry [
5. J. Guo and D. Brady, “Fabrication of thin-film micropolarizer arrays for visible imaging polarimetry,” Appl. Opt. 39, 1486–1492 (2000); 6. J. Tyo, D. Goldstein, D. Chenault, and J. Shaw, “Review of passive imaging polarimetry for remote sensing applications,” Appl. Opt. 45, 5453–5469 (2006).
]. There have also been various proposals and implementations of polarimeters that make use of space-variant polarization elements. Gori [
7. F. Gori, “Measuring Stokes parameters by means of a polarization grating,” Opt. Lett. 24, 584–586 (1999).
] suggested the use of a polarization grating to deduce the polarization of light from a diffracted field, and recently Sparks
et al. [8. W. Sparks, T. Germer, J. MacKenty, and F. Snik, “Compact and robust method for full Stokes spectropolarimetry,” Appl. Opt. 51, 5495–5511 (2012).
] reported on the design of a robust polarimeter that makes use of a linearly varying birefringent element. Each of these methods has been shown to have certain advantages in accuracy, robustness, or compactness. They are classical polarimeter designs and do not explicitly incorporate a priori statistical knowledge of the polarization state into the measurement.
In this paper we report on a measurement in which a polarization-dependent point spread function yields a probability density map within the Poincaré sphere, offering an approach to polarization measurement that can be explicitly analyzed in a statistical framework. By interpreting the image as a conditional probability distribution, Bayes' theorem allows an explicit representation of the likelihood of a polarization state given particular a priori constraints.
2. Stokes parameters and the Poincaré sphere
Consider a collimated optical beam with uniform polarization. This polarization can be expressed as a mixture of two orthogonally polarized reference states: E_U = α ê_L + β ê_R, where ê_L and ê_R represent the unit polarization vectors for left- and right-circular polarizations, and α and β are, respectively, the relative amounts of each of these polarizations in the mixture. If this is a stochastic mixture (partially polarized light), α and β are complex random processes that may be partially correlated or uncorrelated. The Stokes parameters are then defined in the usual way, where the angular brackets denote correlations:
The first of these parameters is proportional to the irradiance of the field. The other three parameters can be normalized by the first one as s_i = S_i/S_0, so that the normalized Stokes three-vector s = (s_1, s_2, s_3) is defined. This vector then represents each possible state of polarization as a point in three-dimensional space constrained to the interior and surface of the unit sphere, known as the Poincaré sphere, since |s| ≤ 1. The magnitude |s| corresponds to the degree of polarization of the field, such that fields for which |s| = 1 (i.e. where s lies on the surface of the Poincaré sphere) are said to be fully polarized.
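As a minimal numerical sketch of these definitions, the Python snippet below estimates the Stokes parameters from samples of α and β and then forms s and the degree of polarization. The convention used here (S_1 and S_2 taken as twice the real and imaginary parts of ⟨α*β⟩, and S_3 as the right-circular minus the left-circular power) is an assumption, and the function names are purely illustrative.

```python
import numpy as np

def stokes_from_circular(alpha, beta):
    """Stokes parameters from samples of the circular-basis amplitudes.

    alpha, beta: complex samples of the left- and right-circular coefficients.
    ASSUMED convention: S1 + i*S2 = 2<alpha* beta>, S3 = <|beta|^2> - <|alpha|^2>.
    """
    alpha = np.asarray(alpha, dtype=complex)
    beta = np.asarray(beta, dtype=complex)
    p_left = np.mean(np.abs(alpha) ** 2)
    p_right = np.mean(np.abs(beta) ** 2)
    cross = np.mean(np.conj(alpha) * beta)   # correlation <alpha* beta>
    return np.array([p_left + p_right, 2 * cross.real, 2 * cross.imag,
                     p_right - p_left])

def normalized_stokes(S):
    """Return the three-vector s = (S1, S2, S3)/S0 and its magnitude |s|."""
    s = S[1:] / S[0]
    return s, float(np.linalg.norm(s))

# Fully polarized right-circular light: alpha = 0, beta = 1.
s_rc, dop_rc = normalized_stokes(stokes_from_circular([0.0], [1.0]))

# Equal-power components whose cross-correlation averages to zero.
s_un, dop_un = normalized_stokes(stokes_from_circular([1, 1], [1, -1]))
```

With this convention the first case lands on the surface of the Poincaré sphere and the second at its center.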
3. Propagation of light through stress-engineered optical elements
The basis of the method proposed here is to let the beam propagate through an optical element with spatially varying birefringence, followed by an analyzer. Our experiment makes use of a stress-engineered optical (SEO) element having threefold symmetry. The use and fabrication of these SEO windows have been described in detail elsewhere [9, 10] and are summarized as follows: An optical window (diameter 12.7 mm, thickness 8 mm, BK7 glass) is stressed using a thermal compression procedure in which an outer metal ring is designed with a hole about 25 μm smaller than the outer diameter of the optical flat. Material is removed from the metal housing to create three contact regions separated by 120°. The high thermal expansion coefficient of the metal ring allows the insertion of the glass window at about 300°C. After cooling, the SEO window acquires a stress distribution of trigonal symmetry, which near the window’s center follows a power law model in which the retardance increases linearly with radius and the orientation of the fast axis precesses with the azimuth. The Jones matrix for a general retarder can be modified as follows to describe the center region of an SEO window:
in which (r, φ) is the window polar coordinate, 𝕀 is the 2 × 2 identity matrix, is the pseudo-rotation matrix, and c is the stress coefficient of the window. This approximate result is only valid in the central part of the window, as can be seen from the insets in Fig. 1. For this reason, a circular aperture of radius R is used to block the outer regions of the window. When the incident uniform beam propagates past this SEO followed by a left-circular analyzer, its resulting transverse irradiance distribution provides a signature that can be unambiguously linked to the polarization state of the incident beam. We will refer to this irradiance distribution by the optical convention of point spread function (PSF). In principle, one can work with PSFs corresponding to any propagation distance away from the SEO. However, it is particularly convenient to consider either the irradiance immediately after the SEO and analyzer, or at a Fourier-conjugate plane, i.e. at the back-focal plane of a lens. While the theoretical treatment that follows is valid for any of these cases, our experiments will use the focused approach for reasons discussed in the concluding remarks.
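The action of the window and analyzer can be illustrated with a short Python sketch. It assumes a central-region Jones matrix of the form cos(cr/2) 𝕀 + i sin(cr/2) Q(−φ), with Q(φ) = [[cos φ, sin φ], [sin φ, −cos φ]], which is the general-retarder matrix with retardance cr and a fast axis that precesses with the azimuth. This form, the handedness convention of the circular unit vectors, and all function names are assumptions made for illustration.

```python
import numpy as np

def pseudo_rotation(phi):
    """Pseudo-rotation matrix [[cos, sin], [sin, -cos]] (ASSUMED form)."""
    return np.array([[np.cos(phi), np.sin(phi)],
                     [np.sin(phi), -np.cos(phi)]])

def seo_jones(r, phi, c):
    """ASSUMED central-region SEO Jones matrix: retardance c*r, axis precessing with phi."""
    half = 0.5 * c * r
    return np.cos(half) * np.eye(2) + 1j * np.sin(half) * pseudo_rotation(-phi)

# Circular unit vectors (the handedness convention is an assumption).
E_LEFT = np.array([1.0, 1.0j]) / np.sqrt(2)
E_RIGHT = np.array([1.0, -1.0j]) / np.sqrt(2)

def analyzed_irradiance(e_in, r, phi, c):
    """Irradiance passed by a left-circular analyzer placed after the SEO."""
    amplitude = np.vdot(E_LEFT, seo_jones(r, phi, c) @ e_in)
    return abs(amplitude) ** 2

# Illustrative point inside the aperture, with r in units of R and cR = 0.8*pi.
c, r, phi = 0.8 * np.pi, 0.6, 1.1
I_L = analyzed_irradiance(E_LEFT, r, phi, c)
I_R = analyzed_irradiance(E_RIGHT, r, phi, c)
```

Under these assumptions the two irradiances are cos²(cr/2) and sin²(cr/2), so their sum is constant across the window, and the right-circular input picks up an azimuthal phase factor of the kind associated with a vortex.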
Fig. 1 Experimental setup, in which a partially-polarized beam is prepared by combining two orthogonally polarized laser beams, spatially filtering and recollimating them. This beam then passes through the center of the SEO element shown in the inset, and through a left-circular analyzer, and is focused by a lens onto a CCD. (a) Experimental image of contours of equal (half-wave) birefringence of the SEO. (b) Theoretical model of birefringence at the central region of the SEO. The aperture size used in the experiments, corresponding to cR = 0.8π, is illustrated with the small blue circle in the inset.
Our experimental setup is shown in Fig. 1. The incident partially polarized beam was prepared by combining, with a beamsplitter, two independent orthogonally polarized laser sources (doubled Nd:YAG, wavelength 532 nm), followed by further polarization control optics and a spatial filter to ensure beam coincidence. The degree of polarization was set by adjusting the power ratio of the two sources (equal irradiance sources of orthogonal polarization provide a very low degree of polarization, while a single source produces a fully polarized state). In each case, the polarization state was measured independently using a calibrated polarimeter (ThorLabs™). The irradiance images were captured using an Imaging Source 480 × 640 format CCD sensor (pixel width of 5.6 μm) controlled by ICCapture™, which allows the control of the gamma parameter of the camera and ensures that no pixels are saturated. The SEO window was kept in a fixed position during the course of the experiment, since any change in its orientation would change the orientation of the images.
Figures 2(a)–2(f) show a sample set of PSFs for cR = 0.8π. The orientation of the distribution can be linked to the orientation of the polarization ellipse, while the spread of the distribution is linked to the degree of circular polarization, with one state of circular polarization occupying a tight distribution at the center and the opposite state occupying an annular region that contains a phase vortex. The right- and left-circular distributions are reminiscent of, and closely related to, those studied in ref. [11. A. M. Beckley, T. G. Brown, and M. A. Alonso, “Full Poincaré beams,” Opt. Express 18, 10777–10785 (2010)].
Fig. 2 Comparison of simulated (a–f) and measured (g–l) PSFs, for fully polarized incident light whose polarization is right-circular (a,g), left-circular (b,h), horizontal (c,i), vertical (d,j), and linear at +45° (e,k) and −45° (f,l).
4. Relation between the PSF and the polarization of the incident beam
Let us now provide a quantitative link between the measured PSF and the input polarization. After the field passes through the SEO window and left-circular analyzer, the PSF at the detector is given by
where I_R(x) and I_L(x) are the irradiance profiles corresponding to incident right- and left-circular polarization, respectively, and x = (ρ, ϕ) are the polar coordinates at the CCD. These profiles are ideally axially symmetric. If the CCD were to be placed directly after the SEO and left-circular analyzer, I_R,L would be given by
where A is an apodization function (assumed to be constant in what follows). On the other hand, if the beam emerging from the SEO and analyzer is focused by a lens (as it is in our experimental setup), I_R,L are given by
where f is the focal distance of the lens, and J_n is the nth-order Bessel function of the first kind. Figures 2(g)–2(l) show the agreement of the PSFs calculated through the substitution of Eq. (5) into Eq. (3) with those measured experimentally.
Note that Eq. (3) can be written as
where the vector u(x) is defined as
Notice that u(x) is a unit vector, which therefore provides a mapping from points x on the CCD to points on the surface of the Poincaré sphere. Notice also from Eq. (6) that, for fully polarized fields, the PSF vanishes at points where u = −s and is maximal at points where u = s.
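The mapping and its two extreme cases can be checked with a small Python sketch. It assumes that the PSF can be written as (S_0/2)(I_R + I_L)(1 + u·s) and that u has components 2√(I_R I_L) cos ϕ, 2√(I_R I_L) sin ϕ and I_R − I_L, all divided by I_R + I_L. The sign and origin of the azimuth, the prefactor, and the function names are assumptions chosen only to be consistent with the properties stated above.

```python
import numpy as np

def u_vector(I_R, I_L, phi):
    """ASSUMED form of u(x): unit vector built from I_R, I_L and the azimuth."""
    total = I_R + I_L
    mix = 2.0 * np.sqrt(I_R * I_L)
    return np.array([mix * np.cos(phi), mix * np.sin(phi), I_R - I_L]) / total

def psf(I_R, I_L, phi, s, S0=1.0):
    """PSF written as (S0/2)(I_R + I_L)(1 + u.s); the prefactor is assumed."""
    return 0.5 * S0 * (I_R + I_L) * (1.0 + u_vector(I_R, I_L, phi) @ s)

# Example pixel: profiles taken as sin^2 and cos^2 of half the local retardance.
delta, phi = 0.5 * np.pi, 0.3
I_R, I_L = np.sin(delta / 2) ** 2, np.cos(delta / 2) ** 2
u = u_vector(I_R, I_L, phi)

bright = psf(I_R, I_L, phi, u)     # fully polarized state with s = u
dark = psf(I_R, I_L, phi, -u)      # antipodal state with s = -u
```

Because (2√(I_R I_L))² + (I_R − I_L)² = (I_R + I_L)², the vector is of unit length for any pair of non-negative profiles.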
5. Probabilistic estimation of the polarization
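If the incident polarization s is known, the probability density of a photon hitting the point x is a normalized version of the PSF in Eq. (3):
$$P(\mathbf{x}|\mathbf{s}) = w(\mathbf{x})\,\frac{1 + \mathbf{u}(\mathbf{x})\cdot\mathbf{s}}{1 + \bar{\mathbf{u}}\cdot\mathbf{s}}, \qquad (8)$$
where
$$w(\mathbf{x}) = \frac{I_R(\mathbf{x}) + I_L(\mathbf{x})}{\Phi_R + \Phi_L}, \qquad (9)$$
with Φ_R,L representing the total powers for each polarization, and an overline denotes averaging of a function over the x plane with w(x) as a weight factor:
$$\bar{f} = \int w(\mathbf{x})\, f(\mathbf{x})\, \mathrm{d}^2 x. \qquad (10)$$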
Therefore, ū corresponds to the average of the unit vector u(x) mapping x to the Poincaré sphere, weighted by w(x). In the ideal case of a perfectly aligned system and a detector with infinite resolution, the first two components of the constant vector ū vanish due to the dependence of u on ϕ. It is easy to show that the third component can also be made to vanish by choosing the window radius so that Φ_R = Φ_L. In this case, the conditional probability density in Eq. (8) depends linearly on s. In practice, however, it is best not to calculate I_R and I_L but to use measured values [like those in Figs. 2(g)–2(h)] as part of the setup’s calibration, and then to calculate Φ_R, Φ_L, and the averages in Eq. (10) as sums over all pixels rather than as integrals. Due to imperfections in the SEO, the setup’s alignment, and the CCD’s finite extent, pixel orientation and discretization, none of the components of ū will generally vanish exactly. For the experimental data used later in this work, ū = (0.089, −0.026, −0.004). The system’s calibration also uses other known polarizations [e.g., those in Figs. 2(i)–2(l)] to deduce any systematic errors, including the defocus that produces the skewness apparent in the measured point spread functions for the linear polarization states. The details of this procedure are given in [12. R. D. Ramkhalawon, A. M. Beckley, and T. G. Brown, “Star test polarimetry using stress-engineered optical elements,” Proc. SPIE 8227, 82270Q–82270Q-8 (2012)].
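Since, for a given s, the probability of the nth photon hitting a point x_n in the detector is independent of where other photons are detected, the probability density of N photons hitting N specific locations is just the product of the individual probability densities:
$$P(\mathbf{x}_1,\ldots,\mathbf{x}_N|\mathbf{s}) = \prod_{n=1}^{N} P(\mathbf{x}_n|\mathbf{s}). \qquad (11)$$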
Consider now the case where the incident polarization is not known precisely, but instead there is an underlying probability density P(s) dictated by the physical process generating the field. For example, if the source of the field is fully correlated, then the field is expected to be fully polarized and P(s) must be zero except at the surface of the sphere. Similarly, if the measured field is due to reflections/scattering off non-chiral materials of an initially unpolarized field, then P(s) = δ(s_3) P(s_1, s_2). Note that, in any case, P(s) includes a Jacobian factor to account for the chosen parametrization of the polarization parameters. For a given P(s), the probability density of a photon hitting a position x in the detector is given by
$$P(\mathbf{x}) = \int P(\mathbf{x}|\mathbf{s})\, P(\mathbf{s})\, \mathrm{d}\mathbf{s}, \qquad (12)$$
and the probability density of finding N photons in N specific positions is
$$P(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \prod_{n=1}^{N} P(\mathbf{x}_n). \qquad (13)$$
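Our goal is to infer the incident field’s unknown polarization s from the measured locations at the CCD of N photons. By using Eqs. (11) and (13) as well as standard relations for conditional probabilities, the probability density for the polarization given N photon positions is given by
$$P(\mathbf{s}|\mathbf{x}_1,\ldots,\mathbf{x}_N) = P(\mathbf{s}) \prod_{n=1}^{N} \frac{P(\mathbf{x}_n|\mathbf{s})}{P(\mathbf{x}_n)}. \qquad (14)$$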
Note that the product of P(x_n) in the denominator is independent of s and therefore just serves as normalization, and that the distribution P(s) enters as a global factor. The information provided by each detected photon comes from the factors P(x_n|s). It is clear from Eq. (8) and the fact that |u| = 1 that, for a given x_n, P(x_n|s) vanishes exactly for one state of (full) polarization. If the setup is such that ū vanishes, then P(x_n|s) is a linear function of s that vanishes at one point on the surface of the Poincaré sphere and is maximal at the antipodal point [see Fig. 3(a)]. If ū does not vanish, the contours of constant P(x_n|s) are still planar sections of the Poincaré sphere, but they are generally no longer parallel, so the zero and the maximum are not antipodes in general, as shown in Fig. 3(b). In any case, each detected photon rules out a state of polarization for which P(x_n|s) = 0.
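The way each detection excludes one fully polarized state can be seen in a minimal Python sketch for two photons. It assumes that the s-dependent part of each factor P(x_n|s) is (1 + u_n·s)/(1 + ū·s), with u_n the point on the sphere associated with the nth detected position; the two example directions and the function name are hypothetical.

```python
import numpy as np

def log_likelihood(s, u_hits, u_bar=np.zeros(3)):
    """Sum over photons of ln[(1 + u_n.s)/(1 + u_bar.s)] (s-dependent part only)."""
    s = np.asarray(s, dtype=float)
    num = 1.0 + u_hits @ s
    den = 1.0 + u_bar @ s
    with np.errstate(divide="ignore"):
        return float(np.sum(np.log(num / den)))

# Two detected photons, described by the sphere points u(x_n) they map to.
u_hits = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0]])

# The state antipodal to the first photon's u is excluded (log-likelihood -inf).
ruled_out = log_likelihood(-u_hits[0], u_hits)

# A state lying between the two directions is favored over either one alone.
favored = log_likelihood((u_hits[0] + u_hits[1]) / np.sqrt(2), u_hits)
```

With a constant prior P(s), the exponential of this sum is proportional to the posterior density, so a single additional photon can remove a state from consideration entirely.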
Fig. 3 (a,b) Plots of P(x_n|s) (with black corresponding to zero) for given x_n as a function of s, over a cross-section of the Poincaré sphere that includes the origin as well as the points of zero and maximum probability, for (a) ū = 0 and (b) ū = (0, 0, 0.5). (c) Plot of q(s) over the surface of the sphere for the case of two detected photons.
6. Discretization of the detector
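In practice, the CCD is subdivided into a discrete number of pixels, each centered at a point x_i. Therefore, each position x_n must be assigned to the pixel at the closest x_i, so Eq. (14) becomes
$$P(\mathbf{s}|\tilde{\mathbf{I}}) = P(\mathbf{s}) \exp[q(\mathbf{s}) - q_0], \qquad (15)$$
where Ĩ = (Ĩ_1, Ĩ_2, ...) with Ĩ_i being the measured number of photons falling in the pixel at x_i, and
$$q(\mathbf{s}) = \sum_i \tilde{I}_i \ln P(\mathbf{x}_i|\mathbf{s}), \qquad q_0 = \sum_i \tilde{I}_i \ln P(\mathbf{x}_i). \qquad (16)$$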
Equation (15) gives the probability density in Poincaré space that the field is in a state of polarization s given a detected signal Ĩ at the CCD. As mentioned earlier, the prefactor P(s) in this expression helps preselect the expected states of polarization, and provides the correct Jacobian for the conditional probability density corresponding to the chosen parametrization of s. In the experimental results that follow, we assume that P(s) is constant. The factor exp(−q_0) is also constant, and serves simply as normalization. The main part of Eq. (15) is the factor exp[q(s)]. When the number of detected photons N is large, the distribution given by this factor becomes sharply peaked and, in the absence of noise and experimental errors, this peak is essentially at the true polarization s_0 of the incident field. This can be shown by taking the gradient in s and using the fact that, in the limit of small pixel area a and large N, Ĩ_i ≈ N a P(x_i|s_0):
$$\nabla_{\mathbf{s}} q(\mathbf{s})\Big|_{\mathbf{s}_0} = \sum_i \tilde{I}_i\, \frac{\nabla_{\mathbf{s}} P(\mathbf{x}_i|\mathbf{s})}{P(\mathbf{x}_i|\mathbf{s})}\bigg|_{\mathbf{s}_0} \approx N a \sum_i \nabla_{\mathbf{s}} P(\mathbf{x}_i|\mathbf{s})\Big|_{\mathbf{s}_0} = N a\, \nabla_{\mathbf{s}} \sum_i P(\mathbf{x}_i|\mathbf{s})\Big|_{\mathbf{s}_0} \approx 0, \qquad (17)$$
where in the last step we used Σ_i P(x_i|s) ≈ 1/a.
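The statement that the peak of exp[q(s)] sits near s_0 can be illustrated with a small simulation. The sketch below assumes an idealized detector for which ū = 0 and the detected positions map uniformly onto the sphere, so that photons can be drawn directly as unit vectors u with density proportional to 1 + u·s_0. It then maximizes q(s) = Σ_n ln(1 + u_n·s) by a damped Newton iteration. The sampling scheme, the optimizer, and all names are illustrative choices rather than the procedure used in the experiments, and the constraint |s| ≤ 1 is not enforced.

```python
import numpy as np

def sample_sphere_hits(s0, n, rng):
    """Draw n unit vectors u with density proportional to 1 + u.s0 (rejection sampling)."""
    hits = np.empty((0, 3))
    while len(hits) < n:
        v = rng.normal(size=(2 * n, 3))
        v /= np.linalg.norm(v, axis=1, keepdims=True)      # uniform on the sphere
        keep = rng.random(2 * n) < 0.5 * (1.0 + v @ s0)
        hits = np.vstack([hits, v[keep]])
    return hits[:n]

def maximize_q(u_hits, iterations=30):
    """Damped Newton ascent of q(s) = sum_n ln(1 + u_n.s), started at s = 0."""
    s = np.zeros(3)
    for _ in range(iterations):
        d = 1.0 + u_hits @ s
        scaled = u_hits / d[:, None]
        grad = scaled.sum(axis=0)
        hess = -np.einsum("ni,nj->ij", scaled, scaled)
        step = np.linalg.solve(hess, -grad)
        t = 1.0
        # Shrink the step until every factor 1 + u_n.s stays positive.
        while np.any(1.0 + u_hits @ (s + t * step) <= 0.0) and t > 1e-12:
            t *= 0.5
        s = s + t * step
    return s

rng = np.random.default_rng(0)
s0 = np.array([0.3, -0.2, 0.4])            # hypothetical true polarization
s_hat = maximize_q(sample_sphere_hits(s0, 20000, rng))
```

For this many photons the estimate is expected to fall within a few hundredths of s_0, in line with the N^(−1/2) scaling of the widths.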
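For large N, exp[q(s)] is approximately a (generally anisotropic) Gaussian distribution whose maximum is at s_0:
$$\exp[q(\mathbf{s})] \approx \exp[q(\mathbf{s}_0)]\, \exp\!\left[\tfrac{1}{2}(\mathbf{s}-\mathbf{s}_0)\cdot\mathbb{H}\,(\mathbf{s}-\mathbf{s}_0)\right], \qquad (18)$$
where ℍ is the Hessian matrix of second derivatives of q evaluated at s_0. By using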
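Ĩ_i ≈ N a P(x_i|s_0), the components of this matrix can be found to be given approximately by
$$\mathbb{H}_{mn} \approx -\frac{N}{1+\bar{\mathbf{u}}\cdot\mathbf{s}_0}\left[\,\overline{\left(\frac{u_m u_n}{1+\mathbf{u}\cdot\mathbf{s}_0}\right)} - \frac{\bar{u}_m \bar{u}_n}{1+\bar{\mathbf{u}}\cdot\mathbf{s}_0}\right], \qquad (19)$$
where, as defined in Eq. (10), an overline denotes averaging over the CCD plane with weight factor w(x). The orientation axes and standard-deviation widths of the Gaussian are given, respectively, by the eigenvectors of this Hessian and the inverse of the square root of minus its eigenvalues.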
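As a sketch of how these widths might be evaluated from calibration data, the Python snippet below computes the Hessian from per-pixel weights w_i (summing to one) and unit vectors u_i. It assumes P(x|s) = w(x)[1 + u(x)·s]/[1 + ū·s], for which differentiating q twice and using Ĩ_i ≈ N a P(x_i|s_0) gives H_mn ≈ −N/(1 + ū·s_0) times the difference between the weighted average of u_m u_n/(1 + u·s_0) and ū_m ū_n/(1 + ū·s_0). The pixel set used in the example (points spread uniformly over the sphere) and the function names are hypothetical.

```python
import numpy as np

def hessian_q(w, u, s0, n_photons):
    """Large-N Hessian of q at s0 for P(x|s) = w (1 + u.s)/(1 + u_bar.s) (ASSUMED form).

    w: (M,) pixel weights summing to 1; u: (M, 3) unit vectors; s0: (3,).
    """
    u_bar = w @ u
    d = 1.0 + u @ s0
    d_bar = 1.0 + u_bar @ s0
    avg = np.einsum("m,mi,mj->ij", w / d, u, u)   # weighted average of u_m u_n/(1 + u.s0)
    return -(n_photons / d_bar) * (avg - np.outer(u_bar, u_bar) / d_bar)

def gaussian_widths(hessian):
    """Standard-deviation widths and axes from the eigen-decomposition of the Hessian."""
    eigenvalues, axes = np.linalg.eigh(hessian)
    return 1.0 / np.sqrt(-eigenvalues), axes

# Illustrative pixel set: points spread uniformly over the sphere (midpoint grid).
uz = -1.0 + (np.arange(200) + 0.5) / 100.0
az = 2.0 * np.pi * (np.arange(64) + 0.5) / 64.0
UZ, AZ = np.meshgrid(uz, az, indexing="ij")
sin_theta = np.sqrt(1.0 - UZ ** 2)
u = np.stack([sin_theta * np.cos(AZ), sin_theta * np.sin(AZ), UZ], axis=-1).reshape(-1, 3)
w = np.full(len(u), 1.0 / len(u))

widths_center, _ = gaussian_widths(hessian_q(w, u, np.zeros(3), 2500))
widths_offset, _ = gaussian_widths(hessian_q(w, u, np.array([0.0, 0.0, 0.5]), 2500))
```

For this uniform pixel set and an unpolarized field, the three widths coincide at √(3/N).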
7. Optimal polarimeter
The previous analysis raises the question of whether there is a birefringence distribution that the SEO element could have that would optimize the performance of the system. Let us define “optimal performance” as that where the magnitudes of the spreads of the probability distribution determined by the Hessian depend only on the magnitude of s_0 (i.e. on the degree of polarization) and not on its direction. For the derivation that follows, we consider the limit of small pixel size, so x is regarded as a continuous variable. The fact that R is chosen such that ū vanishes is not sufficient to ensure optimal performance; the factors 1 + u(x)·s that make up the probability must also have equal weight. From Eq. (8) we see that (assuming ū vanishes) the weight of this linear function due to an infinitesimal area over the CCD is w ρ dρ dϕ. Uniformity over the sphere means that this weight should be proportional to sin θ dθ dϕ, where θ = arccos(u_z). The condition for the polarimeter to be optimal then states
which, after substituting the expressions for w and u_z from Eqs. (9) and (7), results in the following condition for I_R,L:
One case in which it is easy to find the solution to this constraint is that where the CCD is placed right after the SEO element and analyzer, since then I_R + I_L is constant. The solution to Eq. (21) is then simply
where I_0 is a constant. That is, the birefringence map should be
limited to 0 ≤ ρ ≤ R. For the optimal polarimeter, the Hessian given by Eq. (19) can be calculated analytically.
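The uniformity condition can be explored numerically for the unfocused geometry. Taking I_R + I_L constant, the weight ρ dρ dϕ matches sin θ dθ dϕ when u_z is linear in ρ². If one further assumes I_L = I_0 cos²(δ/2) and I_R = I_0 sin²(δ/2) for a local retardance δ(ρ), then δ(ρ) = 2 arcsin(ρ/R) is one profile with this property. This profile, the form of u, and the names below are assumptions derived for illustration and are not taken from the expressions referred to above.

```python
import numpy as np

def optimal_retardance(rho, R):
    """Retardance profile giving u_z = 2(rho/R)^2 - 1 (derived under the stated assumptions)."""
    return 2.0 * np.arcsin(rho / R)

R = 1.0
# Midpoint samples of equal area: rho^2 uniform on (0, R^2), azimuth uniform.
rho = R * np.sqrt((np.arange(400) + 0.5) / 400.0)
az = 2.0 * np.pi * (np.arange(64) + 0.5) / 64.0
RHO, AZ = np.meshgrid(rho, az, indexing="ij")

delta = optimal_retardance(RHO, R)
I_R, I_L = np.sin(delta / 2) ** 2, np.cos(delta / 2) ** 2   # I_R + I_L = 1
mix = 2.0 * np.sqrt(I_R * I_L)
u = np.stack([mix * np.cos(AZ), mix * np.sin(AZ), I_R - I_L], axis=-1).reshape(-1, 3)

u_bar = u.mean(axis=0)                                        # expected to vanish
second_moment = (u[:, :, None] * u[:, None, :]).mean(axis=0)  # expected to be identity/3
```

A vanishing ū together with an isotropic second moment is what makes the Hessian at s_0 = 0 proportional to the identity, so the widths are equal in all directions.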
8. Theoretical comparison with a standard polarimeter
It is interesting to compare the performance of this type of polarimeter to that of a standard one, where the detected light is separated into six parts (through spatial or temporal multiplexing) and each part is made to pass through an analyzer (vertical, horizontal, linear at 45°, linear at −45°, right-circular, and left-circular) after which a detector counts the transmitted photons. For such a polarimeter, the probability density can be written as
where P_0 is a normalization constant, Ĩ_St is a vector whose six components represent the number of photons detected by each of the detectors, and
equals (±1, 0, 0) for the vertical/horizontal polarization detector, (0, ±1, 0) for the ±45° linear polarization detector, and (0, 0, ±1) for the right/left circular polarization detector. That is, it is also true for a standard polarimeter that the probability density is a product of linear functions of s, each of which vanishes at exactly one point on the surface of the Poincaré sphere. However, in this case, these linear functions are always aligned with the three Cartesian coordinates, so only the six reference full states of polarization can be “ruled out” by the detection of a photon. In the large-N limit, the probability density and the components of the Hessian of q_St(s) = ln[P_St(s|Ĩ_St)] evaluated at s = s_0 can be shown to be given approximately by
where s_0n are the components of the polarization s_0 of the incident field, and δ_m,n is the Kronecker delta. Given the diagonal nature of this Hessian, the orientation axes of the Gaussian distribution for exp[q_St(s)] are always those of the axes s_1, s_2, and s_3. For both the standard and the new polarimeter, the Hessian is proportional to the number of detected photons N, so the standard-deviation widths scale as N^(−1/2).
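A simple model of the six-detector polarimeter makes these statements concrete. The Python sketch below assumes that the N detected photons are shared among the six detectors in proportion to 1 + p_j·s_0, where p_j is the analyzer direction on the Poincaré sphere, and that q_St(s) = Σ_j Ĩ_j ln(1 + p_j·s) up to terms independent of s. Under these assumptions the Hessian is diagonal with entries −N/[3(1 − s_0n²)]. The model, this closed form, and the names are illustrative assumptions.

```python
import numpy as np

# Analyzer directions on the Poincaré sphere for the six detectors.
ANALYZERS = np.array([[1, 0, 0], [-1, 0, 0],
                      [0, 1, 0], [0, -1, 0],
                      [0, 0, 1], [0, 0, -1]], dtype=float)

def expected_counts(s0, n_photons):
    """ASSUMED model: N detected photons shared in proportion to 1 + p_j.s0."""
    return n_photons * (1.0 + ANALYZERS @ s0) / 6.0

def standard_hessian(s0, n_photons):
    """Hessian at s0 of q_St(s) = sum_j counts_j ln(1 + p_j.s); needs |s0n| < 1."""
    counts = expected_counts(s0, n_photons)
    d = 1.0 + ANALYZERS @ s0
    return -np.einsum("j,ji,jk->ik", counts / d ** 2, ANALYZERS, ANALYZERS)

def standard_widths(s0, n_photons):
    """Standard-deviation widths along s1, s2, s3 (the Hessian is diagonal)."""
    return 1.0 / np.sqrt(-np.diag(standard_hessian(s0, n_photons)))

widths_center = standard_widths(np.zeros(3), 2500)
widths_near_pole = standard_widths(np.array([0.0, 0.0, 0.9]), 2500)
```

In this model the width along a given axis shrinks as s_0 approaches the corresponding reference state, while the widths along the other two axes are unchanged, which is the source of the direction dependence of the standard polarimeter.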
Figure 4 gives a comparison of the spreads of the probability distribution over a slice of the s_1–s_3 plane for the SEO-based polarimeter according to Eq. (19), for the optimal polarimeter described earlier, and for the theoretical model of a standard polarimeter according to Eq. (25), all for N = 2500 (recall that these spreads scale as N^(−1/2)). The black ellipses represent the standard-deviation cross-section of the Gaussian distributions over this plane for several values of s_0, while the radius of the underlying green circles represents the standard-deviation cross-section in the direction of s_2 (normal to the plane). Notice that, when the degree of polarization is small (i.e. for s_0 near the center of the Poincaré sphere), both SEO and standard polarimeters present very similar standard deviations, which are nearly isotropic and roughly of size √(3/N). [The slight anisotropy of the spreads of the SEO-based polarimeter in part (a) is due to the fact that this figure was generated using the measured I_R and I_L, which resulted in ū not vanishing exactly.] However, for a highly polarized field, the uncertainty in the results of the standard polarimeter (particularly in the radial direction) depends significantly on whether the polarization is close to one of the six reference states, while for the proposed system the behavior is significantly more uniform, resembling that of the optimal case shown in Fig. 4(b).
Fig. 4 Widths of the standard deviations of the polarization measurements corresponding to several values of s_0 within the s_1–s_3 slice of the Poincaré sphere, for (a) the proposed SEO-based polarimetric system, (b) the optimal polarimeter, and (c) a standard polarimeter. The radius of the green circles represents the extent of the standard deviation in the direction normal to the plane.
10. Concluding remarks
We have chosen one particular form of a polarization-dependent point spread function to illustrate how such a PSF maps to a probability density function on the Poincaré sphere. One could, in principle, employ a 1-D PSF such as that used by Sparks et al. [8], or use a window with a different stress distribution to achieve a similar goal. The advantage of the trigonally stressed window is that, as with the case of Full Poincaré beams [11], a simple geometric mapping is possible between coordinates on the PSF and points on the Poincaré sphere. For the collection of linear state measurements shown in Fig. 2, the orientation of the PSF gives the orientation of the polarization ellipse.
Such an arrangement could, in principle, be implemented without a lens. However, the lens allows the point spread function to be interrogated with a relatively small number of pixels (we have tested the concept with as few as 16 pixels) by focusing a collimated beam onto an image sensor. In this way, it is adaptable to an imaging polarimetry scenario, in which separated, distant point sources could be separately analyzed using the polarization-dependent point spread function.
We believe that this approach to polarization measurement could prove useful for situations in optics that require the extraction of polarization information from a single measurement, in cases of very small photon number, and in the characterization of polarization-entangled quantum states [13. D. F. V. James, P. G. Kwiat, W. J. Munro, and A. G. White, “Measurement of qubits,” Phys. Rev. A 64, 052312 (2001)]. By inferring the probability density of the polarization conditioned on a priori information, we can explore the implications of polarization measurements at very low light levels. For a very small number of independent photons, the formalism suggests that each photon represents an individual stochastic event (absorption at a particular pixel) described by a spatial probability density function. Figure 3(c) illustrates the character of this distribution for the case of a sequence of two photons which are, in this case, generated by a numerically weighted PSF. While it is clear that each photon location yields some result (in a maximum likelihood sense) for the state of the field, it is more striking that the location of each event eliminates precisely one fully polarized state from the possible states that comprise the light field. For quantum measurements, the method projects the polarization state of the photon onto a continuous variable space; since the method is inherently probabilistic, it may provide an important tool for exploring novel quantum states using correlative imaging. Replacing our analyzer with a polarization beam splitter and measuring coincidence between two sensors would open up new avenues for the direct imaging of quantum states.