OSA's Digital Library

Optics Express

Optics Express

  • Editor: C. Martijn de Sterke
  • Vol. 15, Iss. 8 — Apr. 16, 2007
  • pp: 5218–5226
« Show journal navigation

Shape specification for axially symmetric optical surfaces

G. W. Forbes  »View Author Affiliations

Optics Express, Vol. 15, Issue 8, pp. 5218-5226 (2007)

View Full Text Article

Acrobat PDF (317 KB)

Browse Journals / Lookup Meetings

Browse by Journal and Year


Lookup Conference Papers

Close Browse Journals / Lookup Meetings

Article Tools



Advances in fabrication and testing are allowing aspheric optics to have greater impact through their increased prevalence and complexity. The most widely used characterization of surface shape is numerically deficient, however. Furthermore, with regard to tolerancing and to constraints for manufacturability, this representation is poorly suited for design purposes. Effective alternatives are therefore presented for working with rotationally symmetric surfaces that are either (i) strongly aspheric or (ii) constrained in terms of the slope in the departure from a best-fit sphere.

© 2007 Optical Society of America

1. Introduction

Characterizing the desired shape of individual mirrors or lens surfaces is central to activities such as optical design and fabrication. In essence, this process is pure geometry. After agreeing on certain conventions, however, the problem becomes a numerical one that is made ever more challenging by the demanding tolerances of modern precision optics. A variety of different approaches to this task have been adopted within the context of optics, and especially within optical design [1

1. See, for example, the discussion and references in H. Chase, “Optical design with rotationally symmetric NURBS”, SPIE Proceedings 4832, 10–24 (2002) and A. W. Greynolds, “Superconic and subconic surface descriptions in optical design,” Proc. SPIE 4832, 1–9 (2002). Such matters are also treated within the manuals for commercial optical design software. [CrossRef]

]. Limitations of one of the most widely used options [2

2. G. H. Spencer and M. V. R .K. Murty, “General ray-tracing procedure,” J. Opt. Soc. Am. 52, 672–678 (1962), see Eq. (16). [CrossRef]

] are demonstrated here, and a number of simple improvements are proposed. General geometry and numerics are the key points of interest; details that are specific to only one context, such as design, are suppressed as much as possible. While some comments are offered about how these ideas are expected to be of use for design purposes, the emphasis is on general processes for specifying ideal surface shapes. The results relate to various activities including optical fabrication, testing, and patenting, as well as design.

Since conic sections have special optical properties that are often exploited in system design, it is conventional to express rotationally symmetric surface shapes in terms of their deviation from the sagittal representation —or just the “sag”— of a close-fitting conic, say [2

2. G. H. Spencer and M. V. R .K. Murty, “General ray-tracing procedure,” J. Opt. Soc. Am. 52, 672–678 (1962), see Eq. (16). [CrossRef]



where {z,ρ,ϕ} are standard cylindrical polar coordinates. Here, c is the paraxial curvature of the surface and ε the conic parameter. Only a handful of terms are typically retained in the added polynomial, but it is increasingly common to see this number grow in order to characterize a desired shape with sufficient accuracy. For the purposes of fabrication and testing, this characterization of the surface shape is completed by specification of the aperture size, i.e. a value for ρ max where Eq. (1) is then valid over 0<ρ<ρ max.

There are enough degrees of freedom in Eq. (1) to admit independent control of all even powers up to ρ 2M+4 (and, in fact, via ε to ρ 2M+6). This representation is therefore sufficiently general to approximate any symmetric shape with arbitrary accuracy provided M is allowed to be large enough. However, the failings of such a representation are well known. For example, if a particular shape, say z = f(ρ), is prescribed, a least-squares fit can be performed to determine the optimal values of {am;m = 0,1,..M} once a close-fitting conic section has been identified. It can readily be shown, however, that even when working in terms of the normalized variable u≔/ρ max and using double-precision arithmetic, the associated Gram matrix [defined below at Eq. (7)] is so ill-conditioned that this simplistic process fails when more than about ten terms are used. When a solution is found for modest values of M, the values of {am; m = 0,1,..M} typically alternate in sign and only yield the desired shape through heavy cancellation between the individual terms. This property is all too familiar to the optical design and fabrication community and makes for numerical inefficiency due to problematic round-off errors. In particular, large numbers of digits must be specified. A sample of these difficulties is displayed in Sec. 4, and one of the goals of this work is to push these failings far beyond the ever-growing design space for optics. The proposed solution is to use non-standard orthogonal bases in place of the simplistic additive polynomial of Eq. (1). A useful property of orthogonal decompositions is that the mean square value of the associated superposition is just the sum of the squares of the coefficients. Section 3 contains a particular basis that is tailored to exploit this property to achieve the second goal for this work: facilitating the enforcement of manufacturability constraints during design.

2. Strong aspheres

By choosing an appropriate basis, the ill-conditioning of the Gram matrix discussed in the Introduction can be entirely removed. To see this, rewrite Eq. (1) as


where the departure from a conic, written as D con (u), is defined by


In place of the standard choice of simple monomials, namely Q mon m(x) = xm [which gives Eq. (1)], we can choose {Q mon m(x); m = 0,1,…M} to be an orthogonal set. Consider fitting the surface described by z = f(ρ) where f may be specified via a numerical look-up table, or whatever. After choosing a close-fitting conic (which can be refined by using a process that invokes the procedure discussed here) the coefficients in the additive polynomial can be chosen to minimize the rms sag error. That is, if the difference between f(ρ) and the conic component of Eq. (2) is written as g(ρ), the minimization of


becomes a standard least-squares process. The angle brackets in Eq. (4) denote a weighted average that may be defined by a sum or an integral of whatever form we please. In this work, for any function h, it is convenient at first to take


Choosing the weight function to be constant, i.e. W(u 2) = 1, means that equal areas in the aperture carry equal weight, and this is as good as any for the moment.

Setting the gradient of E 2 to zero leads to the expression for the optimal coefficients:


where bm ≔〈g(u ρ max)u 4 Q con m(u 2)〉 and the elements of the Gram matrix are given by


It follows that this matrix is diagonal if the basis is chosen to be a particular case of the Jacobi polynomials [3

3. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions (Dover, 1978), see 22.2.1.

]. In particular,


For reference, the first six members can be written as


As can be seen in Fig. 1, these basis elements are scaled to take a maximum value of unity. Even when the mean square error of Eq. (4) is computed as a finite sum, the basis of Eq. (8) is still a good choice. Although the Gram matrix may then not be precisely diagonal, it is nearly so and remains well conditioned. Some benefits of this basis are outlined in Sec. 4.

Fig. 1. Plots of the orthogonal basis elements for m = 0, 1, 2,… 5.

3. Mild aspheres

One simple definition of the best-fitting sphere for a rotationally symmetric surface is to choose the sphere that is coincident with the surface at its axial point and around its perimeter. Various minor modifications from this best fit are possible, e.g. to minimize the maximum absolute deviation, but the distinctions are relatively minor for mild aspheres. If this simple option is accepted and the sag at the edge of the clear aperture is written as f(ρ max), it follows that the curvature of the best-fit sphere, say c bfs, is given by c bfs = 2f(ρ max)/[ρ 2 max + f(ρ max)2]. In place of Eq. (1), it is now more effective to express the sag as


where the sag-based departure from the best-fit sphere, written as D bfs(u), is defined by


So that the coefficient values have no impact on the best-fitting sphere, D bfs(u) is explicitly forced to vanish at the aperture’s centre and edge, i.e. at u = 0 and u = 1. Also, so that the axial curvature may be different fromc bfs, the pre-factor in Eq. (11) reduces to u 2 for u≪1 [unlike the form used in Eq. (3) where the pre-factor is u 4].

To first order, D bfs(u) can be converted to a deviation measured along the surface normal by multiplying it by the cosine of the angle between the optical axis and the local normal to the best-fit sphere. This cosine factor is precisely the square root that appears in the denominator of Eq. (11), and this is the justification for the square root’s presence. It is now straightforward to choose Q bfs m (x) to be polynomials of order m and to configure them so that the weighted rms slope of the departure along the normal is just the sum of the squares of am. To achieve this, the elements of the normal-departure slope are written as


and then Q bfs m (x) is chosen to make Sm (u) orthogonal. In particular,


where the angle brackets were defined at Eq. (5) and δmn is the Kronecker delta. By starting with Q bfs 0 (x) as a constant that is normalized in accordance with Eq. (13), higher-order members can be determined progressively by requiring that each new polynomial be orthogonal to all the lower members and normalized in keeping with Eq. (13).

To constrain the tendency of orthogonal polynomials to shoot off at the edge of their domain, it is sometimes convenient to choose a weight function that emphasizes the behaviour at the ends. If we choose W(x) = [x(1-x)]-1/2 in Eq. (5), the first six basis elements are found to be


These are plotted in Fig. 2. Notice that, as shown in Fig. 3, on account of the particular nonuniform weight distribution, the associated slope functions remain aesthetically bounded even at the part's centre and edge. The key property of this representation is that the mean square slope of the normal departure from a best-fit sphere is now evidently given by


It is this result that makes this basis well suited to certain design tasks.

Fig. 2. Plots for m = 0, 1, 2,… 5 of the basis elements that are tailored for use when constraining the departure from a best-fitting sphere along its normal.
Fig. 3. Plots of the orthogonal slope functions of Eq. (12) for m = 0, 1, 2,… 5.

4. Simple applications

A specific example, say a Cartesian oval, can provide elementary numerical checks for implementations of the basis proposed in Sec. 2. Although such an example may be entirely academic, it can also help to clarify some of the benefits associated with this basis. The single refracting surface that perfectly images a source point to its image point can be found by requiring that the summed optical distance along line segments that join the two points via any point on the surface be a constant. An example is shown in Fig. 4 with n = 1.5 and d 1, = d 2 = 10mm, and 2ρ max = 7mm, where n is the refractive index and d 1, and d 2 are the distances between the axial point on the surface and the object and image points, respectively. Finding the explicit sag of the surface involves solving a quartic equation. Thankfully, standard mathematical software packages give direct access to the solution for the purposes of the fitting described here. The axial curvature is readily found to be c = -(n/d 1+ 1/d 2)/(n -1) = -0.5mm-1. For the current purposes, it is not important whether the conic parameter, i.e. ε of Eq. (2), is chosen to match the fourth-order expansion of the sag, or to give a fit through the surface at the edge of the aperture. For the latter, in the notation used before Eq. (4), we have ε= [2f(rho; max)- 2 max]/[cf(ρ max)2], which gives ε≈0.156284. The residual 60μm of deviation from this fitted conic is plotted in Fig. 5(a).

Fig. 4. The Cartesian oval used as an example. For some purposes, it can be helpful to note that a parametric description as η(θ) follows upon solving a quadratic.

For the case M = 7, the fit error associated with either of Eqs. (1) or (2) is plotted in Fig. 5(b). The coefficients for each case are presented in the first and last columns of Table 1. In principle, these two polynomials are identical. When the traditional monomial basis is adopted, however, the coefficient values are highly dependent upon the number of terms used in the fit. Further, because of the ill-conditioned Gram matrix, the accuracy of the fits based on monomials rapidly bottoms out as M is increased. More problematically, the non-empty null space for the Gram matrix means that the coefficients are truly ill-defined: the associated values in the Table are good to only a couple of significant digits because a coordinated change can leave the overall fit effectively untouched. This is clarified in the second column of Table 1 where some sample changes in the coefficients for the monomial basis are given. Although individually exceeding hundreds of thousands of nanometers, these changes combine to give no more than just one nanometer of change in the net sag. [These changes are precisely –Q con 7(x) and, as suggested by Fig. 1, x 2 Q con 7(x) has a peak value of unity.] Setting tolerances directly in such a representation is evidently an ugly prospect. The first column exhibits the coefficients’ characteristic alternation in signs and growing magnitude. In fact, the largest coefficient value grows with increasing M so, even if it were possible to solve for a larger number of coefficients, evaluation of the polynomial would involve increasing numerical round-off problems due to cancellation. Notice, for example, in Table 1 that a 5 is on the scale of millimeters for the traditional basis even though we are fitting just 60μm of aspheric departure. The traditional representation is evidently increasingly inefficient in terms of the number of significant digits required in specifying the coefficient values.

For the orthogonal basis, the Gram matrix is diagonal, and the coefficient values are thus independent of M. The decomposition therefore resembles a familiar spectrum. In contrast to the monomial representation, all of the digits shown in the table for this case are significant, and there are clearly fewer of them. Notice that these coefficient values decay exponentially with m: in this case, successive terms fall by a factor of three or four. (This trend continues out to any number of terms, and the fit is accurate to full machine precision — 15 digits— by M = 25. The key point in this is the striking robustness.) Notice that, because of the decaying coefficient values, fit error plots such as Fig. 5(b) generally resemble the first of the truncated basis elements. If we truncate at M = 4, for example, it is clear that the fit error will resemble the dark blue curve of Fig. 1 and have a peak value of roughly 200nm (see a 5 of Table 1).

Fig. 5. Plots of (a) the Cartesian oval’s departure from the best-fitting conic, and (b) the error in the associated polynomial fit with M=7.

Because Gmn is diagonal, it follows from Eqs. (3) and (7) that the mean square departure between the reference conic and the asphere is just a weighted sum of the squares of the coefficients in the decomposition. While this is a relatively incidental result for this basis, the corresponding property given in Eq. (15) for the basis of Sec. 3 is its defining feature. This property can be exploited in a number of ways within design tools. When aspherizing an all-spherical imaging system, for example, it is straightforward to linearly approximate the impact on the wavefront error from the aspheric terms on each surface. The result involves simple ray inclination factors. Even when averaged over the field, the mean square wavefront error is then a quadratic in the coefficients. Since the mean square slope of the normal deviation from the spherical surfaces follows from the trivial quadratic of Eq. (15), the optimal wavefront error for any prescribed mean square slope can be found by using Lagrange multipliers and solving only linear equations. Various simple strategies can now be used to identify optimal solutions that use only aspheres that can be measured interferometrically. An effective step in this is to constrain the rms slope to be below some fraction, say γ where 0 < γ<1, of Nyquist. When testing in reflection, the Nyquist slope corresponds to a change in the normal aspheric departure between adjacent pixels of λ/4. When the aperture is imaged onto an N × N detector, the constraint on the rms slope follows from Eq. (15):


This type of result underlies the envisioned applications of the basis introduced in Sec. 3. Notice that, if desired, the maximum absolute slope can be monitored when solving for the appropriate value of the Lagrange multiplier.

Table 1. Fit coefficients (in nanometer units) for two different bases. [To clarify a point made in the text, the central column contains just the monomial expansion of Q con 7(x), c.f. Eq. (9).]

View This Table

5. Concluding remarks

The traditional parametrization of optical surface shapes in terms of monomials is famously ill-conditioned. One symptom of this weakness is the wildly varying and precariously balanced values among the associated degrees of freedom. All the problematic features disappear when the basis is orthogonalized, however. As a result, significantly fewer digits are required to characterize a surface to any given accuracy. What is more, the high-order coefficients then independently control just the high-order shape. This property can facilitate the tolerancing of an aspheric surface. In the context of optical design, the coefficients associated with orthogonalized representations also promise to give a more natural coordinate space for optimization of the associated merit function. For example, including extra terms in the asphere delivers the benefits of higher orders with minimal change to the existing coefficients. It is worth noting, however, that in general design processes, the value of ρ max for each surface must be updated whenever the system modifications that are part of optimization cause the effective (illuminated) aperture to drift significantly from ρ max.

The introduction of the polynomials described in Sec. 3, namely Q bfs m (x), was motivated by the fact that aspheres can be significantly more cost effective if their departure from a best-fit sphere is sufficiently constrained. In particular, it is then possible to avoid the need for dedicated, expensive null optics as part of their testing. Further, when a part’s local principal curvatures vary slowly enough, the fabrication difficulties associated with non-uniform tool-fit are greatly relieved. While similarly avoiding the weaknesses of working with monomials, Q bfs m (x) retains most of the advantages of Q con m (x) while also facilitating the design of milder aspheres. Notice that the simple design procedures discussed in Sec. 4 do not require access to derivatives, and they change neither aperture sizes nor best-fit curvatures. In this sense, a simple application of Q bfs m (x) requires minimal dedicated effort other than implementing the basis elements themselves, see Eq. (14).

References and links


See, for example, the discussion and references in H. Chase, “Optical design with rotationally symmetric NURBS”, SPIE Proceedings 4832, 10–24 (2002) and A. W. Greynolds, “Superconic and subconic surface descriptions in optical design,” Proc. SPIE 4832, 1–9 (2002). Such matters are also treated within the manuals for commercial optical design software. [CrossRef]


G. H. Spencer and M. V. R .K. Murty, “General ray-tracing procedure,” J. Opt. Soc. Am. 52, 672–678 (1962), see Eq. (16). [CrossRef]


M. Abramowitz and I. Stegun, Handbook of Mathematical Functions (Dover, 1978), see 22.2.1.


E. H. Doha, “On the coefficients of differentiated expansions and derivatives of Jacobi polynomials,” J. Phys. A: Math. Gen. 35, 3467–3478 (2002). [CrossRef]


E. W. Weisstein, “Jacobi Polynomial” from MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/JacobiPolynomial.html, see esp. Eqs. (10–14).


B. Y. Ting and Y. L. Luke, “Conversion of polynomials between different polynomial bases,” IMA J. Numer. Anal. , 1, 229–234 (1981). [CrossRef]


A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike coefficients of scaled pupils”, J. Microlithogr., Microfabr., Microsyst. 5, 1–3 (2006). (Note that the Zernikes are also Jacobi polynomials.) [CrossRef]

OCIS Codes
(000.4430) General : Numerical approximation and analysis
(220.1250) Optical design and fabrication : Aspherics
(220.4610) Optical design and fabrication : Optical fabrication
(220.4830) Optical design and fabrication : Systems design
(220.4840) Optical design and fabrication : Testing

ToC Category:
Optical Design and Fabrication

Original Manuscript: February 20, 2007
Revised Manuscript: April 12, 2007
Manuscript Accepted: April 12, 2007
Published: April 13, 2007

G. W. Forbes, "Shape specification for axially symmetric optical surfaces," Opt. Express 15, 5218-5226 (2007)

Sort:  Author  |  Year  |  Journal  |  Reset  


  1. See, for example, the discussion and references in H. Chase, "Optical design with rotationally symmetric NURBS", SPIE Proceedings 4832, 10-24 (2002) and A. W. Greynolds, "Superconic and subconic surface descriptions in optical design," Proc. SPIE 4832, 1-9 (2002). Such matters are also treated within the manuals for commercial optical design software. [CrossRef]
  2. G. H. Spencer and M. V. R.K. Murty, "General ray-tracing procedure," J. Opt. Soc. Am. 52, 672-678 (1962), see Eq. (16). [CrossRef]
  3. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions (Dover, 1978), see 22.2.1.
  4. E. H. Doha, "On the coefficients of differentiated expansions and derivatives of Jacobi polynomials," J. Phys. A: Math. Gen. 35, 3467-3478 (2002). [CrossRef]
  5. E. W. Weisstein, "Jacobi Polynomial" from MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/JacobiPolynomial.html, see esp. Eqs. (10-14).
  6. B. Y. Ting and Y. L. Luke, "Conversion of polynomials between different polynomial bases," IMA J. Numer. Anal.  1, 229-234 (1981). [CrossRef]
  7. A. J. E. M. Janssen and P. Dirksen, "Concise formula for the Zernike coefficients of scaled pupils," J. Microlithogr. Microfabr. Microsyst. 5, 1-3 (2006). (Note that the Zernikes are also Jacobi polynomials.) [CrossRef]

Cited By

Alert me when this paper is cited

OSA is able to provide readers links to articles that cite this paper by participating in CrossRef's Cited-By Linking service. CrossRef includes content from more than 3000 publishers and societies. In addition to listing OSA journal articles that cite this paper, citing articles from other participating publishers will also be listed.

« Previous Article  |  Next Article »

OSA is a member of CrossRef.

CrossCheck Deposited