Misalignment changes the aberration field of an optical system in a systematic manner that can be well approximated, in many cases, by simple linear and/or quadratic functions of field and alignment parameters. This makes the distributions of aberrations over the system field extremely useful not only in misalignment diagnostics, but also in determining optimal adjustments for efficiently improving the collimation quality of the system. Since this paper focuses on the latter aspect, it is important to define what we mean by optimal correction, as this determines how the corrections are computed as well as the strategy for applying them to the system.
Ideally, the optimal correction is the one that most efficiently removes misalignment from the components of a system. To do that, one needs to be able to measure (or estimate) the alignment state of the optical components with an accuracy at least comparable to the alignment tolerance. Several alignment methods have taken this approach, and reverse optimization is quite often at their core [1–4]. The principle is to search for the alignment states of individual components with which the system reproduces the observed wavefront. However, this approach often relies on non-linear optimization procedures and, as a result, produces a stagnated estimate that differs significantly from the true alignment state. One remedy is to use wavefronts sampled at multiple fields, but these multi-field samples can be degenerate among themselves, so the stagnation problem can still persist, especially in multi-element systems [5]. Although there is a feasible alternative that avoids this issue [5–7], the special measurement scheme proposed in the references could be hard to implement in systems with only one or two (partially) adjustable elements, as one often faces in reality.
1. H. J. Jeong, G. N. Lawrence, and K. B. Nahm, “Auto-alignment of a three mirror off-axis telescope by reverse optimization and end-to-end aberration measurements,” Proc. SPIE 818, 419–430 (1987).
5. H. Lee, G. B. Dalton, I. A. J. Tosh, and S.-W. Kim, “Computer-guided alignment II: Optical system alignment using differential wavefront sampling,” Opt. Express 15, 15424–15437 (2007).
In such constrained systems, it is better to adjust alignment-sensitive components to deliberately introduce additional variations to the aberration field and thus to compensate for the alignment-driven aberrations. This is analogous to the compensator concept in optical tolerancing. If such a correction brings a system to a state free from alignment-driven aberrations, the system can be declared to be in alignment, no matter what the actual alignment state is, and this correction can be called optimal. Methods that aim for this definition of optimal correction often use the singular value decomposition of the alignment influence matrix [8, 9]. The corrections from these tend to be rather complex combinations of adjustments of many of the alignment parameters (often including alignment-insensitive ones). This would be less suitable for alignment correction of constrained systems [10].
8. H. N. Chapman and D. W. Sweeney, “Rigorous method for compensation selection and alignment of microlithographic optical systems,” Proc. SPIE 3331, 102–113 (1998).
9. A. M. Hvisc and J. H. Burge, “Alignment analysis of four-mirror spherical aberration correctors,” Proc. SPIE 7018, 701819 (2008).
10. D. O’Donoghue, South African Large Telescope, Observatory, 7935, South Africa (Personal communication, 2009).
The main outcome of the presented analyses is threefold. (i) There is a linear matrix relation between alignment parameters and alignment-driven aberration terms. The solution of this equation corresponds to the optimal adjustment of a chosen set of alignment parameters. (ii) The optimal adjustment is physically equivalent to placing the centers of primary field aberrations at a desired common field location simultaneously (called aberration concentering hereafter). This restores the field distribution of aberrations to the nominal and improves the image quality at the same time. (iii) In most low-order aberration dominant systems, only three alignment-driven terms need to be removed. Thus, at most three alignment parameters per axis are required. This can still be true for systems with higher-order alignment-driven aberrations, although aberration concentering may not be achieved. However, adding one more alignment parameter per axis to also remove the fourth term arising from higher-order aberrations is shown to be effective for further improving collimation and aberration concentering quality. In Section 2, details of this approach are described together with error analyses. We present the results of case studies and robustness tests in Section 3. The results demonstrate the method’s feasibility for efficient removal of alignment-driven aberrations in the face of measurement and model uncertainties. We conclude with a discussion of how this approach can be useful in the collimation of wide-field, large-aperture, multi-surface systems with higher-order field aberrations (Sec. 4).
2.1. Alignment-driven aberrations
The aberration fields of many optical systems are dominated by low-order primary aberrations. Some of these, namely coma, astigmatism, and curvature, are sensitive to misalignment of individual optical components and thus easily detectable when they exist. Removing the alignment-driven terms of these aberrations effectively improves the quality of collimation and restores the system performance close to the nominal regardless of its actual alignment state. Let Comax, Comay, Astg1, Astg2, and Curv be the coefficients of the corresponding coma, astigmatism, and curvature terms Zi, where Zi is the i-th standard Zernike polynomial [11]. These coefficients can be expressed as a function of de-center (x, y) and tilt (θ, ϕ) parameters [12–17].
11. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66, 207–211 (1976).
12. R. V. Shack and K. Thompson, “Influence of alignment errors of a telescope system on its aberration field,” in Optical alignment, R. M. Shagam and W. C. Sweatt, eds., Proc. SPIE 251, 146–153 (1980).
For a single surface case, these are given as,
where O(n) includes terms of order higher than n-1 in field and/or alignment parameters and F0 corresponds to the defocus term at the center of the field. In low-order aberration dominant systems, O(n) is negligible, but in some wide-field systems these terms may need to be accounted for, as discussed later (Sec. 4).
The terms in brackets in Eq. (1) are due to misalignment and need to be suppressed during the course of collimation. We call those in round brackets linear terms and those in square brackets quadratic terms hereafter. The linear terms of Coma are field constant and thus mainly control the overall magnitude of wavefront error across the system field of view. The linear terms of Astg and Curv effectively determine the overall slope of the aberration field and, when they exist, produce a so-called focus gradient across the image field. The quadratic terms are similar to the linear terms, but usually less significant, as coefficients such as A and F are much smaller than C. The linear terms commonly result in displacement of the centers of the individual aberrations away from the nominal by different amounts. This induces large, non-intrinsic, asymmetric image quality variations across the field of a system. Thus, the linear terms are those to be removed in the collimation process. It is outside the scope of this paper to give explicit expressions for these coefficients, but one can obtain them by, for example, a ray-tracing-based numerical sensitivity analysis [16] or rigorous analytic derivations [17].
16. H. Lee, G. B. Dalton, I. A. J. Tosh, and S. Kim, “Computer-guided alignment I: Phase and amplitude modulation of the alignment-influenced wavefront,” Opt. Express 15, 3127–3139 (2007).
17. H. Lee, G. Dalton, I. Tosh, and S. Kim, “Computer-guided alignment III: Description of inter-element alignment effect in circular-pupil optical systems,” Opt. Express 16, 10992–11006 (2008).
2.2. Aberration concentering and optimal collimation by alignment correction
Before beginning the collimation process, the amounts of the linear terms of a misaligned system need to be quantified first. This can be done in three steps: (i) measuring wavefront data at a set of field positions, (ii) determining the aberration coefficients by decomposing the wavefront data into aberration functions (such as Zernike polynomials), and (iii) fitting a linear or quadratic function to the distributions of the aberration coefficients across the field.
In step (i), one may attempt to measure the wavefront at a two-dimensional grid of discrete field positions across the field. However, scanning the wavefront along only the two field axes (i.e. the Hx- and Hy-axis) can provide as much information for quantifying the linear terms as the grid sampling can. This is because the linear terms of each aberration only respond to certain parameters associated with one of the two field axes, as easily noticed in Eq. (1), and thus can be split into two groups. For example, all terms of Comax and those in the first line of Astg only respond to Hx, x, or ϕ, all of which are associated with the Hx-axis. Taking this aberration scanning approach, one can obtain a linear curve for Coma and a quadratic curve for Astg along each field axis after performing steps (i) and (ii). By fitting a linear or quadratic function to these scans, as one would do in step (iii), the linear fitting coefficients can be obtained. These correspond to the amounts of the linear terms of the aberrations. Note that the aberration scanning naturally requires fewer wavefront measurements than the grid sampling does, which can simplify and speed up the measurement process.
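Step (iii) can be illustrated with a minimal sketch. The scan values below are synthetic, and the variable names are hypothetical. The sketch assumes that the field-constant coefficient of the coma scan and the field-linear coefficient of the astigmatism scan are the alignment-driven terms, and it locates the apparent center of each aberration as the zero crossing of the fitted line or the vertex of the fitted parabola. These choices are assumptions of the sketch and are not taken from Eq. (1) or Eq. (2).

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical field scan along the Hx axis (normalized field, made-up values).
hx = np.linspace(-1.0, 1.0, 9)
coma_scan = 0.30 + 1.20 * hx                  # offset + nominal linear dependence
astg_scan = 0.05 - 0.40 * hx + 0.80 * hx**2   # offset + slope + nominal quadratic

# Step (iii): least-squares fits; coefficients are returned lowest order first.
coma_fit = P.polyfit(hx, coma_scan, 1)
astg_fit = P.polyfit(hx, astg_scan, 2)

# Terms taken as alignment-driven in this sketch (an assumption, see lead-in).
coma_offset = coma_fit[0]
astg_slope = astg_fit[1]

# Apparent centers: zero crossing of the line and vertex of the parabola.
coma_center = -coma_fit[0] / coma_fit[1]
astg_center = -astg_fit[1] / (2.0 * astg_fit[2])
```

In this synthetic case the two centers fall at different field positions, which is the situation that concentering is meant to remove.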
Let X⃗ and Y⃗ be vectors containing the measured values of the linear terms of Coma, Astg1, Curv, and Astg2 in the Hx- and Hy-axis, respectively. The field coordinates of the centers of Coma, Astg1, and Curv, for example, are given by the following.
Equation (2) indicates that reducing X⃗ is equivalent to concentering the three aberrations at the nominal field center. At the same time, it removes the linear terms from the system aberration field. As a result, the aberrations recover their intrinsic distribution patterns across the system field. To do this, one needs to apply appropriate alignment corrections (Δx, Δy, Δθ, Δϕ) to the alignment parameters (x, y, θ, ϕ) of the system. These corrections can be obtained by solving a set of linear equations given by the measured X⃗ and the expressions of the linear terms in Eq. (1). The equations are given in Eq. (3).
Here, it should be noted that the set of corrections obtained by solving one of the above equations is not necessarily similar (let alone identical) to the corrections given by the other equations. This can give rise to, for instance, small coma at the center field, but large astigmatism and/or skewed curvature across the field, leading to a focus gradient. This occurs quite often, especially when a misaligned system is tested on-axis without verifying the off-axis wavefront (as is usually done in practice), in that (X3) and (X4) cannot be sensed by on-axis wavefront measurement alone. This illustrates the importance of verifying imaging performance across the field.
To remove the linear terms together, one needs to find a solution to some of the above equations for a given set of correction parameters. For example, in a single surface misalignment case, one can solve the following.
In this particular case, Δx = −x and Δϕ = −ϕ remove the linear and quadratic terms altogether.
2.3. Description of residual alignment-driven aberrations after alignment correction
Let us assume that we have a system with N misaligned surfaces and we desire to concenter Coma, Astg1, and Curv at the nominal center field of the system. Adopting a generic notation of xi and yi for the alignment parameters, we can write
x are N × 1 vectors of
along Hx-axis, respectively, and likewise along Hy-axis. Mx and My are 3 × N matrices. x⃗ and y⃗ are 2N × 1 vectors. Note that each surface has two alignment parameters per field axis. A^T means the transpose of A.
As only three aberrations are to be concentered, correcting three alignment parameters in each axis (6 in total) is sufficient to remove the linear terms. However, this may not be sufficient to eliminate the quadratic terms. Let the correction parameters be Δxk and Δyk with k = 1, 2, 3. As only three parameters are to be adjusted, the influence matrices must be subsets of Mx and My. Letting mx and my be the 3 × 3 subset matrices, Δx⃗ and Δy⃗ can be expressed in terms of x⃗ and y⃗ as,
is the inverse of m. Although three alignment parameters are sufficient, one may wish to use more alignment parameters in this process for some reason. In that case, the matrices m are no longer square and their inverses in Eq. (6) can be replaced by pseudo-inverses computed via singular value decomposition (SVD).
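The inverse and pseudo-inverse cases can be sketched as follows. The matrices and measured values are made up, and the function name `solve_corrections` is hypothetical. The sign convention (corrections chosen so that they cancel the measured terms) is an assumption of this sketch, not a transcription of Eq. (6).

```python
import numpy as np

def solve_corrections(m, measured):
    """Adjustments d that cancel the measured terms: m @ d = -measured.

    np.linalg.pinv is computed from the SVD; for a square,
    well-conditioned m it coincides with the ordinary inverse.
    """
    return -np.linalg.pinv(m) @ measured

# Hypothetical 3 x 3 influence matrix: 3 terms versus 3 chosen parameters.
m3 = np.array([[1.0, 0.2, 0.0],
               [0.1, 0.8, 0.3],
               [0.0, 0.4, 1.5]])
measured = np.array([0.30, -0.40, 0.10])   # made-up measured terms

d3 = solve_corrections(m3, measured)

# Adding a fourth (hypothetical) parameter makes the matrix 3 x 4; the
# pseudo-inverse then returns the minimum-norm set of adjustments.
m4 = np.hstack([m3, np.array([[0.5], [0.1], [0.2]])])
d4 = solve_corrections(m4, measured)
```

With more parameters than terms the solution is no longer unique. The pseudo-inverse picks the smallest overall adjustment, which may or may not be the most convenient one mechanically.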
Upon applying these corrections, the corrected alignment states are given, with an N × N unit matrix (1) and an (N − 3) × N zero matrix (0), as,
While the linear terms vanish, the quadratic terms may still have residuals. The residuals can be expressed in terms of x⃗, using Eq. (6), at the common center (i.e. Hx = 0) as,
, and B are the coefficient matrices of the quadratic terms. Assuming that the elements of x⃗ are independent random variables following Gaussian distributions with zero mean and standard deviation σx, the probability distributions of the residual aberrations can be computed. If these distributions happen to be Gaussian, the statistics in Eq. (9) can be used to find the optimal values of σx [19].
19. G. D’Agostini, Bayesian Reasoning in Data Analysis (World Scientific, 2005).
where i ≠ 1,2,3, E[A] is the mean value of A, and Var[A] is the variance of A. The condition of, for example, minimum curvature can be
where Curvreq is the allocation to Curv from the total rms wavefront error budget of a system. Similar conditions can be posed to the other aberrations and one then needs to find the optimum values of σx and σy that meet these conditions. It should be noted, however, that the distributions are different from one case to another and may significantly deviate from a Gaussian. In such cases, one needs to find the optimal σx and σy using more sophisticated optimization procedures.
2.4. Error analysis
Two error sources can play a critical role when the alignment correction method described above is applied in practice. One is the uncertainty in the measurement and the other is the error in the influence matrix (i.e. model error). The measurement error originates from the aberration coefficient measurement and propagates through the curve fit procedures. Omitting axis notations for convenience, let wi be the average of a particular aberration coefficient, with its associated error, inferred from M measurements at the i-th field location Hi. The curve fit coefficients p⃗ of the aberration scans can be computed by a least-squares analysis [18].
18. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, 2nd ed. (Cambridge, 2002).
In fact, the least-squares analysis also yields the covariance matrix C of the fit coefficients, so that the variance of pi equals the (i,i) element of C. If pj is the i-th element of X⃗, the variance of that element follows from the corresponding diagonal element of C. Equation (6) can be rewritten as,
and the measurement-driven uncertainty in the correction estimates is approximated by
The error in the influence matrix can also be treated as part of the measurement error. For a given x⃗, if the true influence matrix of the system differs from what we think it is (Mx) by δMx, the corrections are expressed as,
be the variances of δMij,x and xi, respectively, the expected variance of the alignment corrections can be approximated by the following.
where nij,x is the (i,j) element of nx. This outcome obviously depends on the variance of the unknown true alignment state. However, this can be replaced by its expected variance to set the expected upper limit on the alignment correction uncertainty.
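As a rough illustration of how measurement errors carry over to the corrections, the sketch below applies standard linear error propagation through the inverse of a hypothetical 3 × 3 influence matrix and cross-checks it with a Monte Carlo draw. The matrix, the error values, and the assumption of uncorrelated Gaussian measurement errors are all made up for the example. It is not a transcription of the expressions in this section.

```python
import numpy as np

# Hypothetical 3 x 3 influence matrix and its inverse.
m = np.array([[1.0, 0.2, 0.0],
              [0.1, 0.8, 0.3],
              [0.0, 0.4, 1.5]])
n = np.linalg.inv(m)

# Made-up 1-sigma errors of the three measured terms (assumed uncorrelated).
sigma_meas = np.array([0.02, 0.05, 0.03])
cov_meas = np.diag(sigma_meas**2)

# Linear propagation: cov(d) = n @ cov_meas @ n.T for d = -n @ measured.
cov_corr = n @ cov_meas @ n.T
var_corr = np.diag(cov_corr)

# Monte Carlo cross-check with a fixed seed.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, sigma_meas, size=(200_000, 3))
var_mc = np.var(-(noise @ n.T), axis=0)
```

For uncorrelated errors, each propagated variance reduces to a sum of squared elements of the inverse matrix weighted by the measurement variances. Parameters with large entries in the inverse therefore amplify measurement noise the most.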
4. Discussion: Wide-field large aperture multi-surface systems with higher-order alignment-driven aberrations
Although many optical systems are low-order aberration dominant, there are an increasing number of wide-field on/off-axis systems that exhibit notable higher-order field aberrations. We use the Hobby-Eberly Telescope as an example system. It consists of five reflective surfaces with a 10 m pupil and a 22 arcmin field. The primary (M1) feeds an f/1.3 beam into the four-mirror prime focus corrector (PFC), which produces an f/3.65 beam at the focal plane. We assume that the four mirrors in the PFC can be tilted about an arbitrary rotation point. This rigid-body motion adds field constant coma to the system, together with relatively small amounts of astigmatism and curvature.
In the presence of large higher-order field aberrations, the field aberrations start showing substantial amounts of extra higher-order terms that are absent in Eq. (1). For example, Comax shows field quadratic terms and, when scanned along the Hx axis, can be approximated by the following function.
where dhx and dhy are linear functions of (x, ϕ) and (y, θ), respectively. Here, the field constant term is still substantial and the linear coefficients in this term are much larger than the cubic ones. Therefore, it should be possible to reduce the field constant term in the same way as in the previous cases. However, in the presence of higher-order aberrations, the new field quadratic term (coupled with alignment parameters) can contribute substantially to Coma. The major influence of this term is to deform the Coma field scans into a quadratic shape, effectively breaking the oddness of the original functional form of Coma. Therefore, in this case, reducing the field constant term of Coma does not necessarily place its center at a desired field location. A similar effect also occurs in Astg and Curv, where substantial amounts of field cubic terms, coupled with alignment parameters, can appear.
In the current example system, this is certainly the case. As a demonstration, the mirror surfaces (except M1) of the system are perturbed within ±0.1 mm in decenter and ±0.05 deg in tilt. After the perturbation, the PFC as a rigid body is intentionally tilted to null the field constant coma. The field scans of the aberrations show substantial higher-order features (Fig. 7).
Fig. 7. Initial field scans of the example telescope system: (A) Coma field scans, (B) Astigmatism field scans, (C) Curvature field scans, (D) Field scans of RMS wavefront error (wv=632.8 nanometers).
The curve fit coefficients for the Hx scans, in Table 1, clearly show the existence of a significant quadratic term for Coma and cubic terms for Astg, whereas the terms intrinsic to each aberration (e.g. odd functions of Hx for Coma) have been changed by only small amounts, meaning that the alignment influence on these terms is relatively weak. Note, however, that the cubic terms in Astg are still several times less significant than the linear terms. A clear indication from this is that selective reduction of the alignment-driven Hx terms of Coma and Astg should restore the distributions of the aberrations to their nominal over the field of view.
Table 1. Curve fit coefficients of Comax, Astg1, and Curv along Hx axis (unit in wv)
To demonstrate this, the alignment correction was computed in the same way as in the case studies (Method I) and by also including the equation of the field quadratic term in Coma (Method II). In Method II, a total of four alignment parameters per axis are required to completely correct the four alignment-driven terms. We have chosen M4 decenter/tilt, M5 decenter, and PFC tilt, and these are also used in Method I. We use the initial scan data in Fig. 7.
Fig. 8. Field scans of Coma, Astg1, and Curv after correction by Method I: (A) Coma field scans, (B) Astigmatism field scans, (C) Curvature field scans, (D) Field scans of RMS wavefront error (wv=632.8 nanometers).
The correction by Method I substantially removed alignment-driven aberrations and restored the field distributions close to nominal, Astg in particular (Fig. 8). At the edge of the field, the RMS wavefront error is reduced from 11 wv to 2.5 wv. However, as discussed, the distribution of Coma is still offset from the desired nominal center field, showing large asymmetry over the field. The distributions of Astg are also off-centered, but by smaller amounts than Coma. Though the individual aberrations are not quite centered at the common field, the overall wavefront error across the field becomes close to the nominal. Note that an almost identical result is obtained when only M4 decenter/tilt and PFC tilt are used.
Fig. 9. Field scans of Coma, Astg1, and Curv after correction by Method II: (A) Coma field scans, (B) Astigmatism field scans, (C) Curvature field scans, (D) Field scans of RMS wavefront error (wv=632.8 nanometers).
Because the additional Coma term is also corrected, Method II produced a set of alignment corrections that concentered the field distributions of Coma, Astg, and Curv at the nominal center field (Fig. 9). The Coma scans follow the original odd function in H. The amount of asymmetry in all aberrations is negligible and the overall wavefront error is nearly identical to the nominal. In comparison to Method I’s results, the RMS wavefront error at the edge of the field was reduced from 2.5 wv to 2.2 wv (roughly 10% improvement) by Method II. Although Method I can be effective in removing the alignment-driven aberrations, the full field ray-spot distribution, shown in Fig. 10, clearly demonstrates that Method II can further improve the quality of aberration concentering and collimation in the presence of higher-order aberrations, towards the edge of the field in particular.
Fig. 10. Full field ray-spot diagrams of the example telescope system: (A) initial, (B) After correction by Method I, (C) After correction by Method II, (D) Nominal.
Note that, if a misaligned system develops large amounts of cubic terms in Astg and Curv, two more alignment parameters per field axis are likely to be necessary for concentering the three aberrations. Even so, depending on the amount of alignment-driven optical performance degradation, not all of these alignment parameters may have to be used. If a system is in a late stage of its commissioning and the performance is not far away from the nominal, one may use only some of the alignment parameters to efficiently reduce alignment-driven aberrations. If the degradation is still large, it would be necessary to use all of the required alignment parameters. In any case, the proposed method can be a useful way to test the alignment state of a system and to determine the optimal next adjustment.
In this paper, we described a new collimation method for misaligned optical systems. A series of theoretical analyses of the method indicates the following. The optimal adjustment given by the method is physically equivalent to placing the centers of primary field aberrations at a desired common field location simultaneously. This not only restores the field distribution of aberrations to the nominal, but also improves the image quality across the field. In the case study of the three-mirror system, for example, the optimal correction found by the method demonstrated a complete restoration of the field aberration distribution and significantly improved the RMS wavefront error from 0.78 wv to 0.12 wv. Note that this improvement was obtained from a single aberration scan and alignment correction without any further iterations. In most low-order aberration dominant systems, at most three alignment parameters per axis need to be adjusted to improve the quality of aberration concentering and collimation. This would mean adjusting at most two surfaces in practice. Analyses suggest that, for misaligned systems with higher-order alignment-driven aberrations, aberration concentering may not be achieved, although a factor of 5 improvement in RMS wavefront error was observed in the presented example system. However, adding one more alignment parameter per axis to also remove the fourth term arising from higher-order aberrations is shown to be effective for further improving collimation and aberration concentering quality. The observed improvement in RMS wavefront error was approximately 10%. Finally, the case studies and robustness tests demonstrated that the method can be robust against measurement and model uncertainties. This demonstrates the method’s feasibility as an independent alignment test method for both coarse and fine collimation of misaligned optical systems.