OSA's Digital Library

Optics Express

Optics Express

  • Editor: C. Martijn de Sterke
  • Vol. 17, Iss. 25 — Dec. 7, 2009
  • pp: 23147–23152
« Show journal navigation

Real-time 2D parallel windowed Fourier transform for fringe pattern analysis using Graphics Processing Unit

Wenjing Gao, Nguyen Thi Thanh Huyen, Ho Sy Loi, and Qian Kemao  »View Author Affiliations


Optics Express, Vol. 17, Issue 25, pp. 23147-23152 (2009)
http://dx.doi.org/10.1364/OE.17.023147


View Full Text Article

Acrobat PDF (279 KB)





Browse Journals / Lookup Meetings

Browse by Journal and Year


   


Lookup Conference Papers

Close Browse Journals / Lookup Meetings

Article Tools

Share
Citations

Abstract

In optical interferometers, fringe projection systems, and synthetic aperture radars, fringe patterns are common outcomes and usually degraded by unavoidable noises. The presence of noises makes the phase extraction and phase unwrapping challenging. Windowed Fourier transform (WFT) based algorithms have been proven to be effective for fringe pattern analysis to various applications. However, the WFT-based algorithms are computationally expensive, prohibiting them from real-time applications. In this paper, we propose a fast parallel WFT-based library using graphics processing units and computer unified device architecture. Real-time WFT-based algorithms are achieved with 4 frames per second in processing 256×256 fringe patterns. Up to 132× speedup is obtained for WFT-based algorithms using NVIDIA GTX295 graphics card than sequential C in quad-core 2.5GHz Intel(R)Xeon(R) CPU E5420.

© 2009 OSA

1. Introduction

In optical interferometers, fringe projection systems, and synthetic aperture radars, fringe patterns are common outcomes with unavoidable noises [1

1. D. W. Robinson, and G. T. Reid, in Interferogram analysis: digital fringe pattern measurement techniques, (Bristol, England: Institute of Physics 1993)

3

3. D. C. Ghiglia, and M. D. Pritt, in Two-dimensional phase unwrapping: theory, algorithms and software, (John Wiley& Sons, Inc 1998).

]. The presence of noises makes the phase extraction and phase unwrapping challenging. Among various techniques proposed for fringe pattern analysis, recently developed windowed Fourier transform (WFT) based algorithms [4

4. Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: Principles, applications and implementations,” Opt. Lasers Eng. 45(2), 304–317 ( 2007). [CrossRef]

,5

5. Q. Kemao, H. Wang, and W. Gao, “Windowed Fourier transform for fringe pattern analysis: theoretical analyses,” Appl. Opt. 47(29), 5408–5419 ( 2008). [CrossRef] [PubMed]

] are considered in this paper, due to the following reasons: (i) noises in fringe patterns can be effectively suppressed [4

4. Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: Principles, applications and implementations,” Opt. Lasers Eng. 45(2), 304–317 ( 2007). [CrossRef]

,5

5. Q. Kemao, H. Wang, and W. Gao, “Windowed Fourier transform for fringe pattern analysis: theoretical analyses,” Appl. Opt. 47(29), 5408–5419 ( 2008). [CrossRef] [PubMed]

]; (ii) quality-guided phase unwrapping along with WFT processing has been shown to be successful [6

6. Q. Kemao, W. Gao, and H. Wang, “Windowed Fourier-filtered and quality-guided phase-unwrapping algorithm,” Appl. Opt. 47(29), 5420–5428 ( 2008). [PubMed]

]; (iii) phase extraction from a single fringe pattern has been demonstrated with the concept of frequency-guided demodulation [7

7. H. Wang and Q. Kemao, “Frequency guided methods for demodulation of a single fringe pattern,” Opt. Express 17(17), 15118–15127 ( 2009), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-17-17-15118. [CrossRef] [PubMed]

]. The WFT algorithms have been successfully used in the measurement of 3D profile [8

8. W. Chen, X. Su, Y. P. Cao, Q. C. Zhang, and L. Q. Xiang, “Method for eliminating zero spectrum in Fourier transform profilometry,” Opt. Lasers Eng. 43(11), 1267–1276 ( 2005). [CrossRef]

], thickness of thin films [9

9. P. Hlubina, D. Ciprian, J. Lunacek, and R. Chlebus, “Phase retrieval from the spectral interference signal used to measure thickness of SiO2 thin film on silicon wafer,” Appl. Phys. B 88(3), 397–403 ( 2007). [CrossRef]

], deformation of a force plate [10

10. M. P. Arroyo, J. A. Bea, N. Andres, R. Osta, and M. Doblare, “Force plate for measuring small animal forces by digital speckle pattern interferometry,” Proc. SPIE 6616, 66164D ( 2007). [CrossRef]

] and a dragonfly’s wings [11

11. P. Cheng, J. Hu, G. Zhang, L. Hou, B. Xu, and X. Wu, “Deformation measurements of dragonfly’s wings in free flight by using windowed Fourier transform,” Opt. Lasers Eng. 46(2), 157–161 ( 2008). [CrossRef]

], refractive index [12

12. W. Zhao, Y. Chen, L. Shen, and A. Y. Yi, “Refractive index and dispersion variation in precision optical glass molding by computed tomography,” Appl. Opt. 48(19), 3588–3595 ( 2009). [CrossRef] [PubMed]

], vibration [13

13. Y. Fu, R. M. Groves, G. Pedrini, and W. Osten, “Kinematic and deformation parameter measurement by spatiotemporal analysis of an interferogram sequence,” Appl. Opt. 46(36), 8645–8655 ( 2007). [CrossRef] [PubMed]

], as well as used for comparison with newly developed algorithms [14

14. J. A. Gómez-Pedrero, J. A. Quiroga, and M. Servín, “Adaptive asynchronous algorithm for fringe pattern demodulation,” Appl. Opt. 47(21), 3954–3961 ( 2008). [CrossRef] [PubMed]

,15

15. S. Gorthi and P. Rastogi, “Numerical analysis of fringe patterns recorded in holographic interferometry using high-order ambiguity function,” J. Mod. Opt. 56(8), 949–954 ( 2009). [CrossRef]

]. However, the WFT algorithms are computationally expensive and time-consuming [4

4. Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: Principles, applications and implementations,” Opt. Lasers Eng. 45(2), 304–317 ( 2007). [CrossRef]

]. A fast solution is thus significant and expected especially in real-time applications. Many hardware devices have been proposed in photomechanics to accelerate numerical computation, including field programmable gate array (FPGA) [16

16. T. Ito, N. Masuda, K. Yoshimura, A. Shiraki, T. Shimobaba, and T. Sugie, “Special-purpose computer HORN-5 for a real-time electroholography,” Opt. Express 13(6), 1923–1932 ( 2005), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-13-6-1923. [CrossRef] [PubMed]

], digital signal processor (DSP) [17

17. X. Wang, X. Peng, and T. Jindong, “Tree-dimensional digital imaging based on temporal phase unwrapping with parallel DSP,” Proc. SPIE 6723 (2007)

], computer cluster [18

18. T. W. Ng, K. T. Ang, and G. Argentini, “Temporal fringe pattern analysis with parallel computing,” Appl. Opt. 44(33), 7125–7129 ( 2005). [CrossRef] [PubMed]

], multicore computer [19

19. W. Gao, Q. Kemao, H. Wang, F. Lin, and H. S. Seah, “Parallel computing for fringe pattern processing: A multicore CPU approach in MATLAB® environment,” Opt. Lasers Eng. 47(11), 1286–1292 ( 2009). [CrossRef]

] and graphics processing unit (GPU) [20

20. N. Masuda, T. Ito, T. Tanaka, A. Shiraki, and T. Sugie, “Computer generated holography using a graphics processing unit,” Opt. Express 14(2), 603–608 ( 2006), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-14-2-603. [CrossRef] [PubMed]

22

22. S. Liu, P. Li, and Q. Luo, “Fast blood flow visualization of high-resolution laser speckle imaging data using graphics processing unit,” Opt. Express 16(19), 14321–14329 ( 2008), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-16-19-14321. [CrossRef] [PubMed]

]. In this paper, we propose fast parallel WFT-based algorithms using GPUs on a graphics card and computer unified device architecture (CUDA). With NVIDIA GTX295 graphics card, up to 132× speedup compared with the C codes has been achieved. As an example, a 256×256 fringe pattern can be effectively filtered with the speed of 4 frames per second (fps), which is fast enough for many real applications. With the fast development pace of the GPU, better acceleration performance can be expected in the near future. In the rest of the paper, the WFT algorithms are briefly introduced and the parallelization of the algorithms for GPUs and CUDA is described. Its application in processing digital holographic fringe patterns is demonstrated. The speedup and accuracy of the parallelized WFT-based algorithms are reported and discussed.

2. Windowed Fourier transform for fringe pattern analysis and its parallelization principles

A two dimensional windowed Fourier transform pair can be written as,
Sf(u,v;ξ,η)=++f(x,y)g(xu,yv)exp(jξxjηy)dxdy
(1)
f(x,y)=14π2++++Sf(u,v,ξ,η)g(xu,yv)exp(jξx+jηy)dξdηdudv
(2)
whereSf(u,v;ξ,η)is the windowed Fourier spectrum of a input fringe pattern f(x,y);g(xu,yv)is a real and symmetric window function to localized the Fourier integral within the neighborhood of a point at (u,v). A Gaussian window:
g(x,y)=1/πσxσyexp(x2/2σx2y2/2σy2)
(3)
is often used, where σx and σy indicate the window size; exp(jξx+jηy) is a harmonic function with ξ and η as its frequencies. Equation (1) decomposes a signal while Eq. (2) reconstructs it. Based on windowed Fourier transform, two algorithms have been proposed: the windowed Fourier filtering (WFF) and the windowed Fourier ridges (WFR). The parallelization of the WFF algorithm will be described in detail. The parallelization of the WFR algorithm is similar to that of the WFF and will not be elaborated to keep the paper concise. Nevertheless, the speedup performance of both the WFF and WFR are given at the end of the paper.

The WFF algorithm filters an input fringe pattern through selecting proper frequency bands as (ξ,η)[ξl,ξh]×[ηl,ηh], and thresholding the windowed Fourier spectrum Sf(u,v;ξ,η) by setting the spectrum with small amplitude to zero. The discrete WFF algorithm can be written as follows,
f¯(x,y)=ξiηi4π2ηs=ηlηhξs=ξlξh[f(x,y)hξs,ηs(x,y)¯hξs,ηs(x,y)]
(4)
where hξs,ηs(x,y)=g(x,y)exp(jξsx+jηsy) and (ξs,ηs)[ξl,ξh]×[ηl,ηh]; ξi and ηi are sampling intervals for ξ and η respectively; f(x,y)hξs,ηs(x,y)¯ denotes the thresholded spectrum; f¯(x,y) denotes the reconstructed fringe pattern. More details of the WFF algorithm can be found in [4

4. Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: Principles, applications and implementations,” Opt. Lasers Eng. 45(2), 304–317 ( 2007). [CrossRef]

].

Knowing that a convolution can be realized by Fourier transforms, in order to reduce the computation time, Eq. (4) can be rewritten as
f¯(x,y)=ξiηi4π2ηs=ηlηhξs=ξlξh1({1[(f)×(hξs,ηs)]¯}×(hξs,ηs))
(5)
where and 1 denote forward and inverse Fourier transforms respectively; the coordinates (x,y)on the right hand of the equation is omitted for conciseness. It can be seen that Eq. (5) consists of the following three major types of operations:
  • (i) Pixel-based operations including addition, multiplication and thresholding. As the operations on one pixel are independent from those on other pixels, operations on different pixels can be assigned to different computational units for simultaneously execution. This is extremely efficient to be executed in a GPU with hundreds of computational units called stream processors.
  • (ii) Forward and inverse Fourier transforms. By divide-and-conquer paradigm and matrix-vector notation, fast Fourier transform (FFT) has been developed into many parallel versions [23

    23. C. V. Loan, Computational frameworks for the fast Fourier transform data, (SIAM 1992)

    ] and can be efficiently parallelized by CUDA CUFFT library [24

    24. NVIDIA, “Tesla GPU computing solutions,” 2009 GPU workshop, http://www.idre.ucla.edu/events/2009/gpu-workshop/

    ]. It should be noted that, (f) can be pre-computed before the iteration starts; (hξs,ηs) appears twice but is only needed to be computed once for each iteration. Furthermore, since a Gaussian window is used, (hξs,ηs) can be theoretically expressed as

    (hξs,ηs)=4πσxσyexp{[σx2(ξξs)2+σy2(ηηs)2]/2}
    (6)

which becomes pixel-based operations (i.e., type (i)) with no communication among pixels and can be more efficiently executed in a GPU. Thus, only one forward and two inverse Fourier transforms are computed in one iteration.

  • (iii) Iterations for each (ξs,ηs)[ξl,ξh]×[ηl,ηh]. These iterations are independent from one another so that different iterations can be assigned to different GPUs to be calculated at the same time if multiple GPUs are available in the computer.

3. Graphics computing Unit and computer unified device architecture

From the above analysis and discussion, it can be seen that GPUs are the ideal hardware for accelerating the WFF algorithm. As suggested by its name, GPU was designed to process thousands of threads simultaneously by underlying parallel stream processors for graphics rendering [25

25. R. Scott “Stream Processor Architecture,” Springer international series in Engineering and Computer Science, 664 ( 2001).

]. Recently GPGPU (General-Purpose computation in GPUs) was developed to leverage the powerful GPU capability for general computation acceleration, especially for image processing. Among all the hardware accelerators, GPU is outstanding because of its tremendous cost performance, performance per power consumption, and low learning curve. Designed for NVIDIA GPUs, CUDA is a C-based general-purpose parallel computing programming model, which resolves the programming difficulty and inefficient hardware mapping of traditional graphics APIs for GPGPU. The functions in CUDA codes are called kernels. Based on single-instruction, multiple-thread (SIMT) architecture [26

26. NVIDIA, “ CUDA Programming Guide Version 2.3” (2009) http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/

], CUDA maps a single kernel to multiple threads to process different input data simultaneously. Using CUDA programming model, GPUs can be regarded as computation devices operating as coprocessors to the central processing unit (CPU). GPUs communicate with the CPU through fast PCI-Express port on motherboard of a personal computer (PC). Data-parallel and computationally intensive functions are offloaded to GPUs to get better performance than sequential execution in the CPU. Only one copy of source code is required with mixed C codes for the CPU and CUDA codes for GPUs. NVIDIA “nvcc” compiler [27

27. NVIDIA, “The CUDA compiler Driver NVCC” (2009) http://moss.csc.ncsu.edu/~mueller/cluster/nvidia/2.0/nvcc_2.0.pdf

] extracts and compiles the codes for GPUs while the rest codes are complied by standard C compiler.

4. Parallel windowed Fourier filtering based on CUDA

The WFF algorithm is such a typical data-parallel, computationally intensive algorithm that can benefit from the GPU hardware. As mentioned before, there are three types of operations in the WFF algorithm. Correspondingly, the following three approaches are explored to design the fast parallel WFF algorithm. Firstly, while the pixel-based operations are implemented by loops in a CPU, it can be implemented by a kernel in a GPU with the form “kernel_name<<<grid, block>>> ()”. After the kernel launched from the CPU, it is executed “block×grid” times by “block×grid” different light-weighted CUDA threads according to SIMT architecture. In this way, each iteration of pixel-based operations is implemented by a thread and all the threads are equally distributed among stream processors of the GPU for simultaneous execution [26

26. NVIDIA, “ CUDA Programming Guide Version 2.3” (2009) http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/

]. The number of grid and block should be enough to hide the memory and instruction pipeline latency as well as to scale over many GPU models. To make full use of GPUs capacity on GTX295 graphics card, we conFig. 256 as block size and 90 as grid size to optimize the parallel WFF code.

Secondly, we adopt CUDA parallel FFT library called CUFFT to accelerate sequential FFT. The pseudo codes in Fig. 1
Fig. 1 Pseudo code for WFF using one GPU.
address the parallel WFT algorithm using one GPU. The functions in Fig. 1 are launched by CPU and passed on to GPUs for execution. It is worth mentioning that CUFFT library achieves best performance when the transform sizes are powers of a single factor [28

28. NVIDIA, “CUDA CUFFT library 2.3” (2009), http://www.nvidia.com/object/cuda_develop.html

]. Therefore, we extend the padding size of convolution to the nearest power of a single factor to optimize the parallel WFF algorithm.

Thirdly, the parallel WFF library can automatically detect the available GPUs in a PC and equally divide frequencies along x axis according to the number of GPUs. In Fig. 1, ξl' and ξh' indicate one of the equally divided frequency bands for each GPU. After loading a fringe pattern, each GPU calculates its own section of frequencies independently. Then CPU reads back the individual result from available GPUs and synthesizes all the results.

5. Experiments and performance of parallel WFF and WFR using CUDA and GPU

The experimental code is implemented in quad-core 2.5GHz Intel(R) Xeon(R) CPU E5420 and NVIDIA GTX295 graphics card with 2 GPUs. CUDA 2.3 and visual studio 2005 are adopted as the development software. We apply the proposed parallel WFT library based on GPUs, in both single-precision and double-precision arithmetic, to a digital holographic fringe pattern as shown in Fig. 2(a)
Fig. 2 Sequential and parallel WFF result for processing a digital holographic fringe pattern. (a) a digital holographic fringe pattern, (b) the WFF filtering result using CPU, (c)the WFF filtering result using the proposed system based on GPU (single precision), (d) the difference map of the two results.
. Sequential MATLAB® 2008a version (double precision) and “C” version (double precision) of WFT-based algorithms are also tested in the above CPU for comparison. The parameters are set as follows: for the parallel WFF algorithm, ξi=ηi=0.1 and the threshold is 6; for the parallel WFR, ξi=ηi=0.025; for both algorithms, σx=σy=10 and [ξs,ηs][1,1]×[1,1].

Although the accuracy of GPU result is slightly sacrificed by using the single-precision arithmetic instead of double-precision in CUDA, the loss of the accuracy is tolerable. Figure 2(a) shows an input fringe pattern from a digital holographic interferometer. Figures 2(b) and 2(c) show the WFF results by CPU (double-precision) and GPU (single-precision). Their difference is shown in Fig. 2(d), which has a peak value of 1.03×105radians. If the double-precision arithmetic is adopted for the GPUs, the peak difference decreases to 1015radians. The parallel WFR algorithm based on GPUs and CUDA has the similar accuracy to the WFF for both single-precision and double-precision. Regarding the accuracy, though the double-precision is preferred, single-precision can be adopted for practical use.

Table 1

Table 1. Execution time for different size of fringe pattern by WFF and WFR algorithm

table-icon
View This Table
shows the computation time for the above experiments. It can be seen that for the WFF algorithm, GPUs with single-precision give up to 77× and 135× speedup compared with the sequential C codes and sequential MATLAB® codes, respectively. For the WFR algorithm, 132× and 306× speedup can be achieved compared with sequential C codes and MATLAB® codes, respectively. For a 256×256 fringe pattern, only 0.25 second is needed to process one frame using the parallel WFF, which enables its application to many real-time measurements. Note that in some applications the frequency band for the WFF and the WFR can be narrower and the proposed parallel WFT library based on GPUs can consequently achieve higher frame rate.

6. Conclusions and discussions

In this paper, parallel windowed Fourier transform algorithms for fringe pattern analysis using graphics processing unit (GPU) is proposed. Windowed Fourier filtering is used as an example to show the parallelization technique for GPU. Real-time windowed Fourier filtering is achieved with 4 fps for 256×256 digital holographic fringe patterns in GTX295 graphics card. The dramatic speedup performance shows the significance and importance of parallelizing an algorithm. The parallelization approach in GPU is not only technically sound but also practically attractive as a personal computer can simply be equipped with a CUDA supported graphics card through PCI-Express interface. In addition, the GPU solution is cost-effective and the most advanced CUDA supported graphics card costs only several hundred US dollars. Finally, the developed parallel WFT codes can run seamlessly in all CUDA supported graphics cards. With either a more advanced GPU or more available GPUs in a PC, better acceleration performance is expected.

References and links:

1.

D. W. Robinson, and G. T. Reid, in Interferogram analysis: digital fringe pattern measurement techniques, (Bristol, England: Institute of Physics 1993)

2.

X. Su and W. Chen, “Fourier transform profilometry: a review,” Opt. Lasers Eng. 35(5), 263–284 ( 2001). [CrossRef]

3.

D. C. Ghiglia, and M. D. Pritt, in Two-dimensional phase unwrapping: theory, algorithms and software, (John Wiley& Sons, Inc 1998).

4.

Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: Principles, applications and implementations,” Opt. Lasers Eng. 45(2), 304–317 ( 2007). [CrossRef]

5.

Q. Kemao, H. Wang, and W. Gao, “Windowed Fourier transform for fringe pattern analysis: theoretical analyses,” Appl. Opt. 47(29), 5408–5419 ( 2008). [CrossRef] [PubMed]

6.

Q. Kemao, W. Gao, and H. Wang, “Windowed Fourier-filtered and quality-guided phase-unwrapping algorithm,” Appl. Opt. 47(29), 5420–5428 ( 2008). [PubMed]

7.

H. Wang and Q. Kemao, “Frequency guided methods for demodulation of a single fringe pattern,” Opt. Express 17(17), 15118–15127 ( 2009), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-17-17-15118. [CrossRef] [PubMed]

8.

W. Chen, X. Su, Y. P. Cao, Q. C. Zhang, and L. Q. Xiang, “Method for eliminating zero spectrum in Fourier transform profilometry,” Opt. Lasers Eng. 43(11), 1267–1276 ( 2005). [CrossRef]

9.

P. Hlubina, D. Ciprian, J. Lunacek, and R. Chlebus, “Phase retrieval from the spectral interference signal used to measure thickness of SiO2 thin film on silicon wafer,” Appl. Phys. B 88(3), 397–403 ( 2007). [CrossRef]

10.

M. P. Arroyo, J. A. Bea, N. Andres, R. Osta, and M. Doblare, “Force plate for measuring small animal forces by digital speckle pattern interferometry,” Proc. SPIE 6616, 66164D ( 2007). [CrossRef]

11.

P. Cheng, J. Hu, G. Zhang, L. Hou, B. Xu, and X. Wu, “Deformation measurements of dragonfly’s wings in free flight by using windowed Fourier transform,” Opt. Lasers Eng. 46(2), 157–161 ( 2008). [CrossRef]

12.

W. Zhao, Y. Chen, L. Shen, and A. Y. Yi, “Refractive index and dispersion variation in precision optical glass molding by computed tomography,” Appl. Opt. 48(19), 3588–3595 ( 2009). [CrossRef] [PubMed]

13.

Y. Fu, R. M. Groves, G. Pedrini, and W. Osten, “Kinematic and deformation parameter measurement by spatiotemporal analysis of an interferogram sequence,” Appl. Opt. 46(36), 8645–8655 ( 2007). [CrossRef] [PubMed]

14.

J. A. Gómez-Pedrero, J. A. Quiroga, and M. Servín, “Adaptive asynchronous algorithm for fringe pattern demodulation,” Appl. Opt. 47(21), 3954–3961 ( 2008). [CrossRef] [PubMed]

15.

S. Gorthi and P. Rastogi, “Numerical analysis of fringe patterns recorded in holographic interferometry using high-order ambiguity function,” J. Mod. Opt. 56(8), 949–954 ( 2009). [CrossRef]

16.

T. Ito, N. Masuda, K. Yoshimura, A. Shiraki, T. Shimobaba, and T. Sugie, “Special-purpose computer HORN-5 for a real-time electroholography,” Opt. Express 13(6), 1923–1932 ( 2005), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-13-6-1923. [CrossRef] [PubMed]

17.

X. Wang, X. Peng, and T. Jindong, “Tree-dimensional digital imaging based on temporal phase unwrapping with parallel DSP,” Proc. SPIE 6723 (2007)

18.

T. W. Ng, K. T. Ang, and G. Argentini, “Temporal fringe pattern analysis with parallel computing,” Appl. Opt. 44(33), 7125–7129 ( 2005). [CrossRef] [PubMed]

19.

W. Gao, Q. Kemao, H. Wang, F. Lin, and H. S. Seah, “Parallel computing for fringe pattern processing: A multicore CPU approach in MATLAB® environment,” Opt. Lasers Eng. 47(11), 1286–1292 ( 2009). [CrossRef]

20.

N. Masuda, T. Ito, T. Tanaka, A. Shiraki, and T. Sugie, “Computer generated holography using a graphics processing unit,” Opt. Express 14(2), 603–608 ( 2006), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-14-2-603. [CrossRef] [PubMed]

21.

T. Shimobaba, Y. Sato, J. Miura, M. Takenouchi, and T. Ito, “Real-time digital holographic microscopy using the graphic processing unit,” Opt. Express 16(16), 11776–11781 ( 2008), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-16-16-11776. [CrossRef] [PubMed]

22.

S. Liu, P. Li, and Q. Luo, “Fast blood flow visualization of high-resolution laser speckle imaging data using graphics processing unit,” Opt. Express 16(19), 14321–14329 ( 2008), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-16-19-14321. [CrossRef] [PubMed]

23.

C. V. Loan, Computational frameworks for the fast Fourier transform data, (SIAM 1992)

24.

NVIDIA, “Tesla GPU computing solutions,” 2009 GPU workshop, http://www.idre.ucla.edu/events/2009/gpu-workshop/

25.

R. Scott “Stream Processor Architecture,” Springer international series in Engineering and Computer Science, 664 ( 2001).

26.

NVIDIA, “ CUDA Programming Guide Version 2.3” (2009) http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/

27.

NVIDIA, “The CUDA compiler Driver NVCC” (2009) http://moss.csc.ncsu.edu/~mueller/cluster/nvidia/2.0/nvcc_2.0.pdf

28.

NVIDIA, “CUDA CUFFT library 2.3” (2009), http://www.nvidia.com/object/cuda_develop.html

OCIS Codes
(100.2650) Image processing : Fringe analysis
(070.2615) Fourier optics and signal processing : Frequency filtering
(090.5694) Holography : Real-time holography

ToC Category:
Image Processing

History
Original Manuscript: October 27, 2009
Manuscript Accepted: November 24, 2009
Published: December 2, 2009

Citation
Wenjing Gao, Nguyen Thi Thanh Huyen, Ho Sy Loi, and Qian Kemao, "Real-time 2D parallel windowed Fourier transform for fringe pattern analysis using Graphics Processing Unit," Opt. Express 17, 23147-23152 (2009)
http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-17-25-23147


Sort:  Author  |  Year  |  Journal  |  Reset  

References

  1. D. W. Robinson, and G. T. Reid, in Interferogram analysis: digital fringe pattern measurement techniques, (Bristol, England: Institute of Physics 1993)
  2. X. Su and W. Chen, “Fourier transform profilometry: a review,” Opt. Lasers Eng. 35(5), 263–284 (2001). [CrossRef]
  3. D. C. Ghiglia, and M. D. Pritt, in Two-dimensional phase unwrapping: theory, algorithms and software, (John Wiley& Sons, Inc 1998).
  4. Q. Kemao, “Two-dimensional windowed Fourier transform for fringe pattern analysis: Principles, applications and implementations,” Opt. Lasers Eng. 45(2), 304–317 (2007). [CrossRef]
  5. Q. Kemao, H. Wang, and W. Gao, “Windowed Fourier transform for fringe pattern analysis: theoretical analyses,” Appl. Opt. 47(29), 5408–5419 (2008). [CrossRef] [PubMed]
  6. Q. Kemao, W. Gao, and H. Wang, “Windowed Fourier-filtered and quality-guided phase-unwrapping algorithm,” Appl. Opt. 47(29), 5420–5428 (2008). [PubMed]
  7. H. Wang and Q. Kemao, “Frequency guided methods for demodulation of a single fringe pattern,” Opt. Express 17(17), 15118–15127 (2009), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-17-17-15118 . [CrossRef] [PubMed]
  8. W. Chen, X. Su, Y. P. Cao, Q. C. Zhang, and L. Q. Xiang, “Method for eliminating zero spectrum in Fourier transform profilometry,” Opt. Lasers Eng. 43(11), 1267–1276 (2005). [CrossRef]
  9. P. Hlubina, D. Ciprian, J. Lunacek, and R. Chlebus, “Phase retrieval from the spectral interference signal used to measure thickness of SiO2 thin film on silicon wafer,” Appl. Phys. B 88(3), 397–403 (2007). [CrossRef]
  10. M. P. Arroyo, J. A. Bea, N. Andres, R. Osta, and M. Doblare, “Force plate for measuring small animal forces by digital speckle pattern interferometry,” Proc. SPIE 6616, 66164D (2007). [CrossRef]
  11. P. Cheng, J. Hu, G. Zhang, L. Hou, B. Xu, and X. Wu, “Deformation measurements of dragonfly’s wings in free flight by using windowed Fourier transform,” Opt. Lasers Eng. 46(2), 157–161 (2008). [CrossRef]
  12. W. Zhao, Y. Chen, L. Shen, and A. Y. Yi, “Refractive index and dispersion variation in precision optical glass molding by computed tomography,” Appl. Opt. 48(19), 3588–3595 (2009). [CrossRef] [PubMed]
  13. Y. Fu, R. M. Groves, G. Pedrini, and W. Osten, “Kinematic and deformation parameter measurement by spatiotemporal analysis of an interferogram sequence,” Appl. Opt. 46(36), 8645–8655 (2007). [CrossRef] [PubMed]
  14. J. A. Gómez-Pedrero, J. A. Quiroga, and M. Servín, “Adaptive asynchronous algorithm for fringe pattern demodulation,” Appl. Opt. 47(21), 3954–3961 (2008). [CrossRef] [PubMed]
  15. S. Gorthi and P. Rastogi, “Numerical analysis of fringe patterns recorded in holographic interferometry using high-order ambiguity function,” J. Mod. Opt. 56(8), 949–954 (2009). [CrossRef]
  16. T. Ito, N. Masuda, K. Yoshimura, A. Shiraki, T. Shimobaba, and T. Sugie, “Special-purpose computer HORN-5 for a real-time electroholography,” Opt. Express 13(6), 1923–1932 (2005), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-13-6-1923 . [CrossRef] [PubMed]
  17. X. Wang, X. Peng, and T. Jindong, “Tree-dimensional digital imaging based on temporal phase unwrapping with parallel DSP,” Proc. SPIE 6723 (2007)
  18. T. W. Ng, K. T. Ang, and G. Argentini, “Temporal fringe pattern analysis with parallel computing,” Appl. Opt. 44(33), 7125–7129 (2005). [CrossRef] [PubMed]
  19. W. Gao, Q. Kemao, H. Wang, F. Lin, and H. S. Seah, “Parallel computing for fringe pattern processing: A multicore CPU approach in MATLAB® environment,” Opt. Lasers Eng. 47(11), 1286–1292 (2009). [CrossRef]
  20. N. Masuda, T. Ito, T. Tanaka, A. Shiraki, and T. Sugie, “Computer generated holography using a graphics processing unit,” Opt. Express 14(2), 603–608 (2006), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-14-2-603 . [CrossRef] [PubMed]
  21. T. Shimobaba, Y. Sato, J. Miura, M. Takenouchi, and T. Ito, “Real-time digital holographic microscopy using the graphic processing unit,” Opt. Express 16(16), 11776–11781 (2008), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-16-16-11776 . [CrossRef] [PubMed]
  22. S. Liu, P. Li, and Q. Luo, “Fast blood flow visualization of high-resolution laser speckle imaging data using graphics processing unit,” Opt. Express 16(19), 14321–14329 (2008), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-16-19-14321 . [CrossRef] [PubMed]
  23. C. V. Loan, Computational frameworks for the fast Fourier transform data, (SIAM 1992)
  24. NVIDIA, “Tesla GPU computing solutions,” 2009 GPU workshop, http://www.idre.ucla.edu/events/2009/gpu-workshop/
  25. R. Scott “Stream Processor Architecture,” Springer international series in Engineering and Computer Science, 664 (2001).
  26. NVIDIA, “ CUDA Programming Guide Version 2.3” (2009) http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/
  27. NVIDIA, “The CUDA compiler Driver NVCC” (2009) http://moss.csc.ncsu.edu/~mueller/cluster/nvidia/2.0/nvcc_2.0.pdf
  28. NVIDIA, “CUDA CUFFT library 2.3” (2009), http://www.nvidia.com/object/cuda_develop.html

Cited By

Alert me when this paper is cited

OSA is able to provide readers links to articles that cite this paper by participating in CrossRef's Cited-By Linking service. CrossRef includes content from more than 3000 publishers and societies. In addition to listing OSA journal articles that cite this paper, citing articles from other participating publishers will also be listed.

Figures

Fig. 1 Fig. 2
 

« Previous Article  |  Next Article »

OSA is a member of CrossRef.

CrossCheck Deposited