## Maximum-likelihood estimation of parameterized wavefronts from multifocal data |

Optics Express, Vol. 20, Issue 14, pp. 15928-15944 (2012)

http://dx.doi.org/10.1364/OE.20.015928

Acrobat PDF (1697 KB)

### Abstract

A method for determining the pupil phase distribution of an optical system is demonstrated. Coefficients in a wavefront expansion were estimated using likelihood methods, where the data consisted of multiple irradiance patterns near focus. Proof-of-principle results were obtained in both simulation and experiment. Large-aberration wavefronts were handled in the numerical study. Experimentally, we discuss the handling of nuisance parameters. Fisher information matrices, Cramér-Rao bounds, and likelihood surfaces are examined. ML estimates were obtained by simulated annealing to deal with numerous local extrema in the likelihood function. Rapid processing techniques were employed to reduce the computational time.

© 2012 OSA

## 1. Introduction

1. I. S. Stefanescu, “On the phase retrieval problem in two dimensions,” J. Math. Phys. **26**(9), 2141–2160 (1985). [CrossRef]

2. J. H. Seldin and J. R. Fienup, “Numerical investigation of the uniqueness of phase retrieval,” J. Opt. Soc. Am. A **7**(3), 412–427 (1990). [CrossRef]

3. M. R. Teague, “Deterministic phase retrieval: a Green’s function solution,” J. Opt. Soc. Am. **73**(11), 1434–1441 (1983). [CrossRef]

6. G. R. Brady, M. Guizar-Sicairos, and J. R. Fienup, “Optical wavefront measurement using phase retrieval with transverse translation diversity,” Opt. Express **17**(2), 624–639 (2009). [CrossRef] [PubMed]

6. G. R. Brady, M. Guizar-Sicairos, and J. R. Fienup, “Optical wavefront measurement using phase retrieval with transverse translation diversity,” Opt. Express **17**(2), 624–639 (2009). [CrossRef] [PubMed]

## 2. Propagation algorithm

**r**

_{0}and

**r**are two-dimensional vectors in the pupil and image planes, respectively [7,8]. It is a version of the Huygens wavelet formulation when the obliquity factor in the integral is neglected and distance is replaced by

*z*in the denominator. Thus, the field at each observation point in the image plane is the sum of an infinite number of secondary waves emanating from the aperture. Since the integrand of the double integral in Eq. (1) can vary rapidly over the integration domain, especially when the aberrations are large, a significant amount of sampling in the aperture is required to accurately predict the irradiance distribution in the image plane. Consequently, numerical evaluations of the Huygens diffraction integral are prohibitively time-consuming and an approximation to this integral is necessary to reduce the computing problem.

*f*is defined as the radius of curvature of the unaberrated wave in the exit pupil (i.e., the distance between the exit pupil and paraxial focus). Inserting Eq. (2) into Eq. (1) leads toor equivalently,where the last exponential term is part of the integrand. A binomial expansion for

*A*(

*r*) is given by

**r**and

**r**

_{0}are inseparable in these terms, so including them in the integral means that we cannot take advantage of the FFT. Since a brute-force computation of the diffraction integral is too computationally expensive and impractical, the best we can do is minimize the terms in Eq. (6) by considering planes sufficiently close to nominal focus, so that

*z*≈

*f*and

**r**is small.

## 3. Parameterized wavefront description

*n*th polynomial with coefficient

*α*,

_{n}*n*= 1,…,

*N*), but an important determination is the number of coefficients necessary for an accurate representation of the wavefront. Even if a small number of terms is used in the expansion, this does not imply that the wavefront aberration is small; representing large wavefront errors simply requires large coefficients. In our approach, we assume that the wavefront is smoothly varying, so that sufficient accuracy can be achieved with a relatively small value of

*N*.

## 4. Maximum-likelihood estimation

**perform a search through the space defined by all possible values to find a point that maximizes the likelihood, or probability, of generating the observed data vector**

*θ***g**:Here,

**g**, conditioned on the set of parameters

**. Essentially all search algorithms locate an extremum via iterative methods that execute in a variable number of steps depending on the starting location, the volume of the search space, the complexity of the probability surface, and values selected for convergence factors. Since we are dealing with a complicated likelihood surface in a high-dimensional space with many local minima, we implement a global search algorithm called simulated annealing (SA) that can be tailored to the probability surface [9**

*θ*9. A. Corana, M. Marchesi, C. Martini, and S. Ridella, “Minimizing multimodal functions of continuous variable with the ‘simulated annealing’ algorithm,” ACM Trans. Math. Softw. **13**(3), 262–280 (1987). [CrossRef]

## 5. Fisher information and Cramér-Rao bounds

10. H. H. Barrett, C. Dainty, and D. Lara, “Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions,” J. Opt. Soc. Am. A **24**(2), 391–414 (2007). [CrossRef] [PubMed]

*Cramér-Rao inequality*[11,12], written aswhere

*p*th parameter and

## 6. Numerical results

### 6.1 Test lens description

*D*= 88.41 mm and f/#

_{xp}_{w}= 1.796, respectively, at the wavelength λ = 0.6328 μm. The peak-to-valley wavefront error was 149.1λ.

*N*= 37, the maximum number of terms (Table 1 ). Since the system has rotational symmetry, the coefficients for all non-rotationally symmetric Zernike terms are zero; thus non-zero coefficients correspond to piston, defocus, and various orders of spherical aberration. Table 2 provides the polynomial expressions for {

*Z*,

_{n}*n*= 1,…,9, 16, 25, 36, 37}, corresponding to the rotationally symmetric terms and several low-order terms.

### 6.2 Irradiance data

*z*=

*z*

_{1}=

*z*– 0.25 mm and

_{f}*z*=

*z*

_{2}=

*z*– 0.43 mm, where the paraxial focal plane is located at

_{f}*z*=

*z*= 157.8 mm and the exit pupil lies in the

_{f}*z*= 0 plane. So that each irradiance distribution provided unique information, special care was taken to select planes sufficiently far from each other, yet close enough to the nominal focus for the approximation in Eq. (9) to hold.

### 6.3 Fisher information and Cramér-Rao bounds

*α*,

_{n}*n*= 2,…, 9, 16}, shown in Fig. 3(a) on a logarithmic scale. Although the off-axis coefficients included here are zero in this idealized case, in a real system, they may become non-zero if there are any system misalignments. Since

*α*

_{1}corresponds to piston and does not influence the irradiance data, we had no interest in estimating it and disregarded it in the FIM. The variance σ

^{2}in the detector elements was based on a peak SNR of 10

^{4}. We evaluated the FIM components at

**, based on the values in Table 1. Note that the FIM is a 9 × 9 symmetric matrix and is order-specific, with the parameter indices on the**

*θ**x*and

*y*axes. The

*jk*

^{th}entry has units of

^{1/2}, on the order of 10

^{−10}to 10

^{−9}, permit the estimation of the wavefront parameters to very high precision, provided that the forward model is exact.

### 6.4 Likelihood surfaces

**g**|

**) versus**

*θ***for a particular**

*θ***g**is called the

*likelihood surface*for that data vector. Although we are minimizing the sum-of-squares (i.e., negative likelihood), we will refer to this objective function as the likelihood surface.

*N*parameters, the likelihood surface exists in an

*N*-dimensional hyperspace. However, we are restricted to plotting the likelihood surface while varying up to two parameters at a time. We selected a handful of pairs of parameters for the following plots (Fig. 4 ) and applied the ranges given in Table 4 . In each plot, the fixed parameters were set to the true values underlying the data.

### 6.5 Maximum-likelihood estimates

*α*,

_{n}*n*= 2,…, 9, 16}, then estimated them using ML estimation according to our Gaussian noise model. Upper and lower limits in the search space were based on the specified ranges in Table 4. We generated a random starting point, as provided in Table 5 .

*α*,

_{n}*n*= 2,…, 9, 16} are given in Table 5, along with the respective standard deviation. Both the bias and variance are small for every parameter, while the bias is always within one standard deviation. Based on the small biases, it is not surprising that the true and estimated irradiance patterns are virtually indistinguishable (Fig. 6 ).

^{3}forward propagations, per temperature phase. For the specified pupil sampling and FFT grid size, the computation time was 790 ms per forward propagation, which included both output planes. So, the average of 87.3 temperature phases per trial corresponds to 34.5 hours of computation time. We carried out this study using an NVIDIA Tesla C1060 GPU, whose peak double-precision (DP) performance is 78 GFLOPS. Had we used a Tesla C2075 model, offering 515 GFLOPS of DP power, the computation time per trial would have been roughly 5.2 hours. Further improvement can be achieved with a cluster of GPUs. For instance, the VSC455 V8 GPU workstation by Velocity Micro combines 8 Tesla C2075s for over 4 TFLOPS of DP power, which would turn the 5.2 hours into just 40 minutes. Another way to improve computational time involves beamlet-based propagation algorithms, such as CODE V Beam Synthesis Propagation (BSP), which can provide accurate and efficient modeling of the wave propagation.

## 7. Experimental results

### 7.1 System configuration

### 7.2 Test lens description

*D*

_{lens}= 25 mm and radius of curvature of

*R*

_{1}= -

*R*

_{2}= 76.66 mm. According to ZEMAX, the exit pupil diameter and working f-number was

*D*= 25.39 mm and f/#

_{xp}_{w}= 3.471, respectively. It was imperative that the image-space NA of the test lens (NA = 0.14) was less than that of the imaging lens to avoid information loss. The position of the paraxial focal plane was

*z*=

*z*= 90.83 mm, where the

_{f}*z*= 0 plane contained the exit pupil.

*N*= 37) in the exit pupil of the lens are provided in Table 6 . With a smaller peak-to-valley wavefront error of 30.6λ, we anticipated less stringent requirements on the pupil sampling in our propagation algorithm. The minimum pupil sampling level was 256 × 256 for this lens, which we zero-padded for an FFT grid size of 1024 × 1024. This resulted in oversampling of the irradiance distribution, however, this only increased the computational time and did not affect the outcome of the study.

### 7.3 Irradiance data

### 7.4 Fisher information and Cramér-Rao bounds

*α*,

_{n}*n*= 2,…, 9, 16}, based on the method in Section 6.3. The structure of these matrices is comparable to that for the highly aberrated lens, including strong coupling among the rotationally-symmetric terms,

*α*

_{4},

*α*

_{9}, and

*α*

_{16}. Since the CRB takes on diminutive values, the variance from detector noise is unlikely to preclude high-precision estimates.

### 7.5 Nuisance parameters

**denotes the wavefront parameters of interest,**

*α***denotes the wavefront parameters of no immediate interest, and**

*β***denotes all other nuisance parameters, so that the entire vector parameter can be written as**

*χ***= (**

*θ***,**

*α***,**

*β***)**

*χ**. Therefore,*

^{t}**is comprised of Zernike coefficients {**

*α**α*,

_{n}*n*= 2,…, 9, 16},

**is comprised of all remaining coefficients, and**

*β***consists of unknown system parameters to be discussed next. We dealt with**

*χ***by replacing it (up to the ZEMAX limit of**

*β**N*= 37) with the respective design coefficients in Table 6, denoted as

*β*_{0}, then separately estimated

**prior to the primary estimation task. Finally, we set**

*χ***.**

*α**z*

_{1}=

*z*+ Δ

_{f}*z*

_{1}and

*z*

_{2}=

*z*+ Δ

_{f}*z*

_{2}. During acquisition of the detector data, it was difficult to accurately pinpoint the paraxial focal plane,

*z*=

*z*, which served as the reference plane in our studies. Another nuisance parameter was the true magnification of the microscope objective just before the detector, denoted as

_{f}*M*

_{1}and

*M*

_{2}for the first and second image, respectively. To achieve the design magnification of 40X, the detector must be placed 160 mm from the rear principal plane of the objective lens, however, we had no knowledge of the exact location of this plane.

*i*output plane, we estimated

^{th}*z*and

_{i}*M*from the respective detector data in Fig. 7 by performing a straightforward 2D grid search. During this step, we fixed all wavefront parameters,

_{i}**and**

*α***, to the design parameters,**

*β*

*α*_{0}and

*β*_{0}, and computed the cost function for both image planes assuming a Gaussian noise model (Fig. 9 ). Prior to generating these plots, we first did a broader search for local minima in the region of interest, but did not find any. The final nuisance parameter estimates were given by where Δ

*z*is the distance from paraxial focus and

*M*is the magnification.

*x*= 0.1101 μm and Δ

_{d}*x*= 0.1098 μm. This resulted in sampling the irradiance distribution below the diffraction limit of approximately 0.5 λ/NA = 0.33 µm (i.e., oversampling by a factor of three). However, the excess magnification also allowed us to avoid detector saturation without the use of additional optics, as mentioned in Sec. 7.1.

_{d}### 7.6 Maximum-likelihood estimates

*α*,

_{n}*n*= 2,…, 9, 16} are given in Table 8. Without knowledge of the true values or a gold standard, we have no basis for evaluating biases in the estimates, though the estimated values are within one standard deviation from the design values. Once again, large departures from the CRB may have resulted from the numerous local basins in the cost function.

^{3}forward propagations in each temperature phase. For the given FFT grid size, the computation time was 308 ms per forward propagation. On average, there were 105 temperature phases per trial, corresponding to 20.2 hours of computation time with a single NVIDIA Tesla C1060. This amounts to 3.1 hours with a Tesla C2075 model, or 23.3 minutes with a VSC455 V8.

## 8. Conclusions

_{w}= 3.47. Since the imaging (i.e., relay) lens was much faster with f/# ≈0.53 (NA = 0.95), it should not have resulted in information loss from the suppression of high spatial frequencies. However, this may become an issue if the f-number of the test lens is low enough. To avoid having to include the imaging lens in the forward model here, one solution is to use the translation stage to place a ground glass in any plane where an irradiance measurement is desired, then relay the incoherent irradiance, instead of the field, to the detector. Note that it is necessary to have a rotating ground glass to decorrelate the resulting speckle noise. An alternative to a rotating diffuser is a liquid-crystal diffuser operated in a dynamic scattering mode.

## Acknowledgments

## References and links

1. | I. S. Stefanescu, “On the phase retrieval problem in two dimensions,” J. Math. Phys. |

2. | J. H. Seldin and J. R. Fienup, “Numerical investigation of the uniqueness of phase retrieval,” J. Opt. Soc. Am. A |

3. | M. R. Teague, “Deterministic phase retrieval: a Green’s function solution,” J. Opt. Soc. Am. |

4. | G. R. Brady and J. R. Fienup, “Improved optical metrology using phase retrieval,” |

5. | G. R. Brady and J. Fienup, “Phase retrieval as an optical metrology tool,” Optifab: Technical Digest, SPIE Technical Digest |

6. | G. R. Brady, M. Guizar-Sicairos, and J. R. Fienup, “Optical wavefront measurement using phase retrieval with transverse translation diversity,” Opt. Express |

7. | H. H. Barrett and K. J. Myers, |

8. | J. W. Goodman, |

9. | A. Corana, M. Marchesi, C. Martini, and S. Ridella, “Minimizing multimodal functions of continuous variable with the ‘simulated annealing’ algorithm,” ACM Trans. Math. Softw. |

10. | H. H. Barrett, C. Dainty, and D. Lara, “Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions,” J. Opt. Soc. Am. A |

11. | C. R. Rao, “Information and accuracy attainable in the estimation of statistical parameters,” Bull. Calcutta Math. Soc. |

12. | H. Cramér, |

**OCIS Codes**

(100.5070) Image processing : Phase retrieval

(120.3940) Instrumentation, measurement, and metrology : Metrology

(120.5050) Instrumentation, measurement, and metrology : Phase measurement

**ToC Category:**

Image Processing

**History**

Original Manuscript: February 22, 2012

Revised Manuscript: May 30, 2012

Manuscript Accepted: June 6, 2012

Published: June 28, 2012

**Citation**

Julia A. Sakamoto and Harrison H. Barrett, "Maximum-likelihood estimation of parameterized wavefronts from multifocal data," Opt. Express **20**, 15928-15944 (2012)

http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-14-15928

Sort: Year | Journal | Reset

### References

- I. S. Stefanescu, “On the phase retrieval problem in two dimensions,” J. Math. Phys.26(9), 2141–2160 (1985). [CrossRef]
- J. H. Seldin and J. R. Fienup, “Numerical investigation of the uniqueness of phase retrieval,” J. Opt. Soc. Am. A7(3), 412–427 (1990). [CrossRef]
- M. R. Teague, “Deterministic phase retrieval: a Green’s function solution,” J. Opt. Soc. Am.73(11), 1434–1441 (1983). [CrossRef]
- G. R. Brady and J. R. Fienup, “Improved optical metrology using phase retrieval,” 2004 Optical Fabrication & Testing Topical Meeting, OSA, Rochester, NY, paper OTuB3 (2004).
- G. R. Brady and J. Fienup, “Phase retrieval as an optical metrology tool,” Optifab: Technical Digest, SPIE Technical DigestTD03, 139–141 (2005).
- G. R. Brady, M. Guizar-Sicairos, and J. R. Fienup, “Optical wavefront measurement using phase retrieval with transverse translation diversity,” Opt. Express17(2), 624–639 (2009). [CrossRef] [PubMed]
- H. H. Barrett and K. J. Myers, Foundations of Image Science (Wiley, 2004).
- J. W. Goodman, Introduction to Fourier Optics (Roberts & Company Publishers, 2005).
- A. Corana, M. Marchesi, C. Martini, and S. Ridella, “Minimizing multimodal functions of continuous variable with the ‘simulated annealing’ algorithm,” ACM Trans. Math. Softw.13(3), 262–280 (1987). [CrossRef]
- H. H. Barrett, C. Dainty, and D. Lara, “Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions,” J. Opt. Soc. Am. A24(2), 391–414 (2007). [CrossRef] [PubMed]
- C. R. Rao, “Information and accuracy attainable in the estimation of statistical parameters,” Bull. Calcutta Math. Soc.37, 81–91 (1945).
- H. Cramér, Mathematical Methods of Statistics (Princeton University Press, 1946).

## Cited By |
Alert me when this paper is cited |

OSA is able to provide readers links to articles that cite this paper by participating in CrossRef's Cited-By Linking service. CrossRef includes content from more than 3000 publishers and societies. In addition to listing OSA journal articles that cite this paper, citing articles from other participating publishers will also be listed.

« Previous Article | Next Article »

OSA is a member of CrossRef.