OSA's Digital Library

Optics Express

  • Editor: Andrew M. Weiner
  • Vol. 21, Iss. 23 — Nov. 18, 2013
  • pp: 29065–29072
Acceleration of computation of φ-polynomials

Ilhan Kaya and Jannick Rolland


Optics Express, Vol. 21, Issue 23, pp. 29065-29072 (2013)
http://dx.doi.org/10.1364/OE.21.029065


Abstract

We investigate the benefits of the computational power offered by multi-core platforms for the computation of φ-polynomials used in the description of freeform surfaces. Specifically, we devise parallel algorithms based upon the recurrence relations of both Zernike polynomials and gradient orthogonal Q-polynomials and implement them on Graphics Processing Units (GPUs). The results show that these recurrence-based parallel algorithms reduce the computational time of the φ-polynomials by more than an order of magnitude relative to a sequential implementation.

© 2013 Optical Society of America

1. Introduction

Optical elements that are not rotationally symmetric are expected to become prevalent components in future optical systems [1]. Emerging examples of freeform optical surfaces that have been designed, implemented, manufactured, and tested are found not only in compact head-worn displays [2,3] but also in illumination optics applications [4]. The mathematical surface description for this new type of optical element is an active area of research that seeks the most economical, accurate, and efficient characterization. The orthogonal full-aperture φ-polynomials [5–7], defined on a circular aperture, and local radial basis functions, which allow for more general aperture shapes [8,9], are among the many forms of surface description. Zernike polynomials are the most widely used φ-polynomials in optics. They are used not only to describe optical surfaces but also to express wavefront aberrations [10]. The recently introduced gradient orthogonal Q-polynomials were compared to Zernike polynomials and shown to represent an optical surface with similar accuracy in a least squares setting [11].

As the demands on optical surface descriptions increase, more φ-polynomial terms may be required to accurately express the departure of the surface from a base surface, which is typically a sphere, a conic, or a best-fit sphere. Future descriptions of freeform surfaces may include high-order φ-polynomial terms, as shown in [12]. High-order terms may also be needed to measure bumps on a surface that may arise during the polishing process [11], and mid-spatial frequencies on the surface add to the challenge [12]. Both a large number of terms and the high-order φ-polynomials themselves make the surface description computationally intensive. Furthermore, multi-dimensional optical surface optimization with full-aperture φ-polynomials is a highly challenging and computationally intensive task, and optimization cycles may become a major bottleneck in the optical design process.

Over the last decade, the computational power available on highly parallel, multithreaded, many-core Graphics Processing Units (GPUs) has been exploited through parallel algorithms to reduce the run time of computationally intensive scientific problems. In fields ranging from computational dynamics [13] to optical imaging, GPUs are reported to accelerate applications by more than one order of magnitude or to achieve faster data processing and visualization rates [14]. In a blood flow visualization framework, 12- to 60-fold speedups are reported with the help of GPUs [15]. Through parallel algorithms designed for the Single Instruction Multiple Thread (SIMT) GPU architecture, the computation of the full-aperture φ-polynomials may be accelerated significantly on commodity graphics hardware. The main contribution of this paper is to devise and implement several recurrence-based data-parallel algorithms for the computation of Zernike and gradient orthogonal Q-polynomials and to show that an order of magnitude speedup is possible in the computation of these φ-polynomials.

This paper is organized as follows. Section 2 summarizes the recurrence-based computation of φ-polynomials (i.e., Zernike and gradient orthogonal Q-polynomials) and details the parallel algorithms that implement these recurrence relations on a SIMT architecture. Section 3 presents the computational results of executing the parallel algorithms for both the Zernike polynomials and the gradient orthogonal Q-polynomials on GPUs and reports the speedups relative to a sequential implementation of the recurrence relations on a multi-core Central Processing Unit (CPU).

2. Recurrence relations of φ-polynomials and their parallelization

In order to investigate the parallelization and possible speedups in the computation of the φ-polynomials, recurrence relations shown in Eqs. (4) and (6) were parallelized on a SIMT architecture. There are two promising ways to accelerate the computation of the recurrence shown in Eq. (4). The first one is that instead of computing the coefficients shown in Eq. (5) sequentially as the recurrence is run, all the coefficients up until the (n-m)/2th execution of the recurrence relation are computed together at once:

for each thread
    get the local id corresponding to the kth recurrence run
    compute a[k], b[k], c[k] locally (see Eq. (5))
    store them
end

In the above algorithm, each thread handles one specific run of the recurrence relation and computes only the coefficients required for that run. When all the threads return, all the coefficients for the recurrence are ready to use. This is a data-parallel SIMT algorithm, since the same compute instruction is executed on every thread with different data corresponding to the specific recurrence run, k. For example, the computation of Z_{110}^{10}(r,φ) requires 50 runs of Eq. (4), with a total of 150 coefficients computed as shown in Eq. (5). With the algorithm shown above, spawning 50 threads that simultaneously compute the coefficients for each run of the recurrence in Eq. (4) reduces the computational time of the Zernike polynomial.
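The coefficient step can be sketched in Python as follows. Because Eqs. (4) and (5) are not reproduced in this section, the sketch assumes the standard three-term recurrence for the Jacobi polynomials P_k^(0,m) (Abramowitz and Stegun, Chap. 22 [17]) as a stand-in; the normalization and indexing used in the paper may differ. The names jacobi_coeffs and all_coeffs are hypothetical, and the thread pool only mimics the one-task-per-run structure; it is not a GPU kernel.

```python
from concurrent.futures import ThreadPoolExecutor


def jacobi_coeffs(k, m):
    """Coefficients (a, b, c) of run k >= 1 of the assumed recurrence
    P[k+1](x) = (a*x + b)*P[k](x) - c*P[k-1](x) for Jacobi P^(0,m)."""
    s = 2 * k + m
    den = 2.0 * (k + 1) * (k + m + 1) * s
    a = (s + 1) * (s + 2) * s / den
    b = -(s + 1) * m * m / den
    c = 2.0 * k * (k + m) * (s + 2) / den
    return a, b, c


def all_coeffs(n, m, workers=4):
    """Compute the coefficients of every run independently, one task per k."""
    runs = range(1, (n - m) // 2)  # runs k = 1 .. (n-m)/2 - 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda k: jacobi_coeffs(k, m), runs))


# Coefficient table for the radial part of Z_{110}^{10}
coeffs = all_coeffs(110, 10)
print(len(coeffs), "coefficient triples computed")
```

No run depends on any other, which is what makes the step data-parallel. With P0 and P1 supplied as seeds, this sketch needs (n-m)/2 - 1 runs, so its count differs by one from the 50 runs quoted above, presumably because Eq. (4) is indexed differently.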

The second way to accelerate the recurrence relation shown in Eq. (4), and thus the computation of the Zernike polynomials shown in Eqs. (3) and (1), is to assign one thread to each sample ray position in the ray grid over the aperture. Each thread computes not only the recurrence relation shown in Eq. (4) but also the power term in Eq. (3) and the sine or cosine in Eq. (1) at its sample ray point (r,φ). Hence, once all the threads return, the computation of the Zernike polynomial, Z_n^m(r,φ), is complete across the aperture of the optical element. This data-parallel SIMT algorithm is shown below:

for each thread
    get the local sample ray point, (r,φ), to operate on
    create a local data cache[3]
    store in cache[0] the first Jacobi, P0, and in cache[1] the second Jacobi, P1, at (r,φ)
    while (recurrence exec num < (n-m)/2)
        run the recurrence, store it in cache[2]
        swap cache[0], cache[1]; swap cache[1], cache[2]
        recurrence exec num++
    end
    compute power, r^m
    compute sine/cosine (mφ)
    store the result
end
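The per-ray algorithm above can be illustrated with the following Python sketch. It assumes the common representation Z_n^m(r,φ) = r^m P_k^(0,m)(2r^2 - 1) cos(mφ) or sin(mφ), with k = (n-m)/2, and the standard Jacobi three-term recurrence in place of Eqs. (1), (3), and (4), which are not reproduced here. The function names are hypothetical, and the list comprehension over the ray points stands in for launching one GPU thread per point.

```python
import math


def jacobi_coeffs(k, m):
    """(a, b, c) for P[k+1] = (a*x + b)*P[k] - c*P[k-1], Jacobi P^(0,m), k >= 1."""
    s = 2 * k + m
    den = 2.0 * (k + 1) * (k + m + 1) * s
    return ((s + 1) * (s + 2) * s / den,
            -(s + 1) * m * m / den,
            2.0 * k * (k + m) * (s + 2) / den)


def zernike_at(n, m, r, phi, coeffs, use_sine=False):
    """Evaluate Z_n^m at one ray point using a three-slot rolling cache."""
    x = 2.0 * r * r - 1.0
    degree = (n - m) // 2
    # cache[0] = P0, cache[1] = P1, cache[2] = scratch slot
    cache = [1.0, 1.0 + (m + 2) * (x - 1.0) / 2.0, 0.0]
    for k in range(1, degree):
        a, b, c = coeffs[k - 1]
        cache[2] = (a * x + b) * cache[1] - c * cache[0]
        cache[0], cache[1] = cache[1], cache[0]  # swap cache[0], cache[1]
        cache[1], cache[2] = cache[2], cache[1]  # swap cache[1], cache[2]
    radial = cache[0] if degree == 0 else cache[1]
    angular = math.sin(m * phi) if use_sine else math.cos(m * phi)
    return (r ** m) * radial * angular


def zernike_on_grid(n, m, points):
    """Precompute the coefficients once, then treat every ray point independently."""
    coeffs = [jacobi_coeffs(k, m) for k in range(1, (n - m) // 2)]
    return [zernike_at(n, m, r, phi, coeffs) for (r, phi) in points]


# Five sample points along the radius at phi = 0
points = [(i / 4.0, 0.0) for i in range(5)]
values = zernike_on_grid(4, 0, points)
print(values)
```

The two swaps rotate the cache so that, after each run, slots 0 and 1 again hold the two most recent polynomial values, which keeps the per-thread storage fixed at three numbers regardless of the polynomial order.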

The computation of a specific gradient orthogonal Q-polynomial requires two kernels, corresponding to the two recurrence-based parallel algorithms shown below. The coefficients for the recurrence relation shown in Eq. (6) and the coefficients for the auxiliary polynomial three-term recurrence relation that works in tandem with it must be computed in advance:

for each thread
    get the local id corresponding to the kth recurrence run
    compute a[k], b[k], c[k], d[k] for the auxiliary recurrence (see Eq. (A.3) of [7])
    compute F[k], G[k] for the Q-poly recurrence (see Eqs. (A.13) and (A.16) of [7])
    store them
    synchronize threads
    thread 0: compute f[k] and g[k] (see Eq. (A.18) of [7])
end

Once all the coefficients have been computed in parallel by the above algorithm, with as many threads as the total number of runs of the recurrence relation shown in Eq. (6), we can compute the gradient orthogonal Q-polynomial by creating as many threads as there are ray points in the ray grid. Each thread executes the following parallel algorithm to compute the recurrence relation locally for its specific ray:

for each thread
    get the local sample ray point, (r,φ), to operate on
    create an auxiliary auxcache[3] and a data cache[2]
    store in auxcache[0] the first auxiliary, A0, and in auxcache[1] the second auxiliary, A1, at (r,φ)
    store in cache[0] the first gradient orthogonal Q-poly, Q0, at (r,φ)
    while (recurrence exec num < n)
        run the auxiliary recurrence, store it in auxcache[2] (see Eq. (4))
        run the gradient orthogonal Q-poly recurrence, store it in cache[1] (see Eq. (6))
        swap auxcache[0], auxcache[1]; swap auxcache[1], auxcache[2]
        swap cache[0], cache[1]
        recurrence exec num++
    end
    compute power, r^m
    compute sine/cosine (mφ)
    store the result
end

A parallel implementation of the gradient orthogonal Q-polynomial requires both of the algorithms shown above as GPU kernels. As a result of this high level of granularity in the computations (i.e., each thread operates on a single ray point within a ray grid), we have observed significant acceleration of the φ-polynomial computations through parallelization.

3. Numerical results of SIMT parallelization of φ-polynomials

The first step is to validate that the parallel algorithm produces the same result as the sequential algorithm. For this purpose, a low-order Zernike polynomial and a low-order Q-polynomial are computed in parallel, compared to their sequential counterparts, and displayed in Fig. 1. Since the sequentially and parallel computed φ-polynomials coincide visually, only the GPU versions are shown. To quantify the differences, Fig. 1 also shows the difference between the sequential and the parallel versions; the results agree to within 14 significant digits. This agreement is attributed to the IEEE-compliant double precision support on both chips.

The second investigation analyzes the effect of the total number of ray points across the aperture on the φ-polynomial computation time. Naturally, as the number of ray points increases, the total time to compute a specific φ-polynomial increases. To quantify the effect of ray grid size on the computation time, a high-order Zernike polynomial, Z_{110}^{10}(r,φ), and a high-order gradient orthogonal Q-polynomial, Q_{50}^{10}(r,φ), were computed both sequentially and in parallel. The results are shown in Table 1 and Fig. 2.

Table 1. Effect of the size of the ray grids on the speedup of the computation of φ-polynomials.

Fig. 2. (a) Total execution time of the sequential and parallel algorithms of φ-polynomials on both CPU and GPU as a function of the grid size; (b) speedups of φ-polynomials with grid size.

Table 1 shows that the computation time of both the parallel and the sequential algorithms increases for both φ-polynomials as the number of rays quadruples from row to row. However, the time for the sequential algorithm grows proportionally faster than that for the parallel algorithm. The speedup is defined as the ratio of the total execution time of the sequential algorithm to the total execution time of the parallel algorithm, and it increases with the ray grid size. Figure 2 shows the total execution times of the sequential and parallel algorithms on CPU and GPU and the corresponding speedups of the φ-polynomials with respect to the ray grid size.
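The speedup definition above translates into a small timing harness such as the following Python sketch. The helper names time_call and speedup are hypothetical, and taking the best of a few repeats is an assumption of this sketch rather than the measurement protocol used for Table 1.

```python
import time


def time_call(fn, *args, repeats=3):
    """Best wall-clock time in seconds of fn(*args) over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(t_sequential, t_parallel):
    """Speedup as defined in the text: sequential time over parallel time."""
    return t_sequential / t_parallel


# Arbitrary illustrative timings in seconds, not values from Table 1
print(speedup(2.0, 0.5))
```

In practice, t_sequential and t_parallel would each be obtained from time_call on the same ray grid and the same polynomial order, so that the ratio isolates the effect of parallelization.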

In Fig. 2, we can clearly see that the computational time for the gradient orthogonal Q-polynomials is higher than that of the Zernike polynomials (see dash-dot blue line in Fig. 2(a)), although the recurrence relations are run exactly 50 times for both of the φ-polynomials. The gradient orthogonal Q-polynomial is more compute intensive because of its unconventional recurrence relation and the need to compute an auxiliary polynomial through another recurrence. This computationally expensive operation causes significant overhead for the sequential algorithm on the CPU; however, it is not a significant burden for the parallel algorithm running on the GPU. This can be observed in the almost coincident red dash-dot and solid lines in Fig. 2(a), which show the parallel execution times of the Q-polynomial and the Zernike polynomial, respectively. Figure 2(b) quantifies the speedup for both the gradient orthogonal Q-polynomial and the Zernike polynomial. The results show that the speedup increases with the total number of ray samples and grows significantly on average as the number of rays across the aperture quadruples.

Another aspect of the φ-polynomial computation to investigate is the order of the φ-polynomials. Given that high-order polynomials may occur in optical surface descriptions, it is desirable to determine whether the parallelization and the speedup in computation time are affected by the order of the φ-polynomials. In Table 2 and Fig. 3, we show the computation times of the sequential and parallel algorithms as the order of the φ-polynomials is increased. For this experiment, the total number of ray points is kept fixed at 1024 × 1024 over the circular aperture, and the azimuthal order is fixed at m = 2.

Table 2. Effect of the order of the φ-polynomial over the computation time and speedup.

Fig. 3. Effect of the polynomial order on the computation of φ-polynomials: (a) computation times on CPU and GPU; (b) speedups through parallelization.

Table 2 shows that the speedup for the φ-polynomials increases with the order of the polynomials. The sequential computation time grows gradually as the order of the polynomial is increased (see Fig. 3(a), blue lines), whereas the computational time does not grow if parallel algorithms are utilized (see Fig. 3(a), red lines). Consequently, parallelization yields speedups of an order of magnitude, i.e., 10 to 40 times, in the computation of Zernike or gradient orthogonal Q-polynomials over the range of polynomial orders (see Fig. 3(b)).

The level of parallelization is increased when the size of the ray grid or the order of the polynomial is increased, whose effects are shown in Table 1 and Table 2, respectively. The main reason for this increase in speedups is higher levels of granularity in the computation provided through parallel algorithms shown in Section 2.

The speedups reported in Table 1 and Table 2 may be associated with specific features of GPUs in executing the parallelized recurrence algorithms. A modern GPU is able to run many more concurrent active threads than a CPU can [19]. The GeForce GT 650M GPU has 2 multiprocessors, each designed to execute hundreds of threads concurrently (see page 63 of [18]), whereas the Intel® Core i7 3610QM consists of 4 cores running 8 concurrent threads with hyper-threading [20]. In total, the GeForce GT 650M supports up to 2 × 1536 resident concurrent threads (see page 1 of [19]). Furthermore, GPU threads are lightweight, and context switches are faster. Although the computational load is increased gradually in Table 2 by incrementing the polynomial order, the time to compute the φ-polynomials on the GPU does not change, owing to the GPU's ability to execute more instructions in one clock cycle. The Clenshaw process, which computes a linear combination of φ-polynomials based upon recurrence relations, may also be carried out on GPUs [21]. It is expected that the parallelization of the Clenshaw algorithm would yield speedups similar to those reported in this paper. Finally, one would possibly observe a similar speedup benefit if actual ray tracing were performed through the parallelized algorithms with the help of highly threaded GPUs or hyper-threading multi-core CPUs executing similar parallel ray tracing algorithms.
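As an illustration of the Clenshaw process mentioned above, the following Python sketch sums a Chebyshev series, the case treated in Clenshaw's note [21], rather than a series of φ-polynomials; extending it to Zernike or Q-polynomials would require their own recurrence coefficients and is not attempted here. The function names are hypothetical, coeffs is assumed to be non-empty, and the comprehension over sample points stands in for one GPU thread per point.

```python
import math


def clenshaw_chebyshev(coeffs, x):
    """Sum coeffs[k] * T_k(x) by Clenshaw's backward recurrence."""
    b1 = 0.0
    b2 = 0.0
    # b_k = c_k + 2*x*b_{k+1} - b_{k+2}, run from the highest order down to k = 1
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * x * b1 - b2, b1
    return coeffs[0] + x * b1 - b2


def direct_sum(coeffs, x):
    """Reference evaluation using T_k(x) = cos(k * acos(x)), valid for |x| <= 1."""
    t = math.acos(x)
    return sum(c * math.cos(k * t) for k, c in enumerate(coeffs))


def evaluate_on_points(coeffs, xs):
    """Each sample point is independent, so this loop is data-parallel."""
    return [clenshaw_chebyshev(coeffs, x) for x in xs]


series = [1.0, 2.0, 3.0]
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
print(evaluate_on_points(series, xs))
```

The backward recurrence never forms the individual basis polynomials, so each thread needs only two running values per sample point, which is the same storage pattern as the rolling caches in Section 2.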

4. Conclusion

In this work, we have investigated the effects of parallelizing the computation of φ-polynomials through their recurrence relations, which provide more robust and efficient results. The effects of the ray grid size and of the order of the φ-polynomials on the computational time were also examined. We have quantified the increasing benefit of parallelization as the intensity of the computation grows, such as for high-order terms and finer ray-grid resolutions. Furthermore, the parallel algorithms proposed in this research were validated to be in excellent correspondence with the sequential implementations. We utilized a many-core, highly threaded GPU for parallel execution and a multi-core CPU for the sequential algorithms. This by no means implies that CPUs should not be utilized for parallelization with appropriate hyper-threading libraries. On the contrary, the future computation of the φ-polynomials should take advantage of the devised parallel algorithms running on both highly threaded many-core GPUs and CPUs.

Acknowledgments

This research was supported by the National Science Foundation GOALI grant ECCS-1002179. We thank Greg Forbes for stimulating discussion about this work.

References and links

1. K. Fuerschbach, J. P. Rolland, and K. P. Thompson, “A new family of optical systems employing φ-polynomial surfaces,” Opt. Express 19(22), 21919–21928 (2011).
2. O. Cakmakci, S. Vo, K. P. Thompson, and J. P. Rolland, “Application of radial basis functions to shape description in a dual-element off-axis eyewear display: Field-of-view limit,” SID 16, 1089–1098 (2008).
3. O. Cakmakci, K. Thompson, P. Vallee, J. Cote, and J. P. Rolland, “Design of a freeform single-element head-worn display,” Proc. SPIE 7618, 761803 (2010).
4. J. C. Miñano, P. Benitez, and A. Santamaria, “Freeform optics for illumination,” Opt. Rev. 16(2), 99–102 (2009).
5. F. Zernike, “Beugungstheorie des Schneidenverfahrens und seiner verbesserten Form, der Phasenkontrastmethode,” Physica 1(7–12), 689–704 (1934).
6. M. Born and E. Wolf, Principles of Optics (Cambridge University, 1999).
7. G. W. Forbes, “Characterizing the shape of freeform optics,” Opt. Express 20(3), 2483–2499 (2012).
8. O. Cakmakci, B. Moore, H. Foroosh, and J. P. Rolland, “Optimal local shape description for rotationally non-symmetric optical surface design and analysis,” Opt. Express 16(3), 1583–1589 (2008).
9. I. Kaya and J. P. Rolland, “Hybrid RBF and local phi-polynomial freeform surfaces,” Adv. Opt. Technol. 2(1), 81–88 (2013).
10. R. W. Gray, C. Dunn, K. P. Thompson, and J. P. Rolland, “An analytic expression for the field dependence of Zernike polynomials in rotationally symmetric optical systems,” Opt. Express 20(15), 16436–16449 (2012).
11. I. Kaya, K. P. Thompson, and J. P. Rolland, “Comparative assessment of freeform polynomials as optical surface descriptions,” Opt. Express 20(20), 22683–22691 (2012).
12. G. W. Forbes, “Fitting freeform shapes with orthogonal bases,” Opt. Express 21(16), 19061–19081 (2013).
13. C. L. Phillips, J. A. Anderson, and S. C. Glotzer, “Pseudo-random number generation for Brownian dynamics and dissipative particle dynamics simulations on GPU devices,” J. Comput. Phys. 230(19), 7191–7201 (2011).
14. Y. Jian, K. Wong, and M. V. Sarunic, “Graphics processing unit accelerated optical coherence tomography processing at megahertz axial scan rate and high resolution video rate volumetric rendering,” J. Biomed. Opt. 18(2), 026002 (2013).
15. S. Liu, P. Li, and Q. Luo, “Fast blood flow visualization of high-resolution laser speckle imaging data using graphics processing unit,” Opt. Express 16(19), 14321–14329 (2008).
16. G. W. Forbes, “Robust and fast computation for the polynomials of optics,” Opt. Express 18(13), 13851–13862 (2010).
17. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions (Dover, 1972), Chap. 22.
18. NVIDIA, CUDA C Programming Guide (NVIDIA, 2012).
19. NVIDIA, CUDA C Best Practices Guide (NVIDIA, 2012).
20. Intel, “Intel core 7 processor specifications” (Intel, 2013), http://www.intel.com/content/www/us/en/processors/core/core-i7-processor/Corei7Specifications.html.
21. C. W. Clenshaw, “A note on the summation of Chebyshev series,” Math. Tables Other Aids Comput. 9, 118–120 (1955).

OCIS Codes
(200.4960) Optics in computing : Parallel processing
(220.0220) Optical design and fabrication : Optical design and fabrication

ToC Category:
Optical Design and Fabrication

History
Original Manuscript: September 4, 2013
Revised Manuscript: November 1, 2013
Manuscript Accepted: November 9, 2013
Published: November 15, 2013

Citation
Ilhan Kaya and Jannick Rolland, "Acceleration of computation of φ-polynomials," Opt. Express 21, 29065-29072 (2013)
http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-21-23-29065

