1. Introduction
Holograms are regarded by most imaging researchers as the ultimate goal of 3-dimensional (3D) image reconstruction, because the reconstructed image is optically identical to the original object in free space. Many researchers have therefore worked on holography since its invention by Gabor in 1948.
Electronic holograms have been researched since the 1960s. Computational holography is one form of electronic holography, in which the interference pattern (fringe pattern) is calculated numerically to obtain a hologram. The hologram is loaded onto a spatial light modulator (SLM), which is illuminated with the reference light to reconstruct the image [1–3]. The numerically calculated hologram is termed a computer-generated hologram (CGH). The inputs to a CGH calculation, the depth information and the light intensity of each object point, are in digital form, and the output, the resulting hologram, consists of pixels.
The inherent and critical problem of the CGH method is its enormous amount of calculation. Calculating the fringe pattern contributed to one hologram pixel by one light source, and accumulating it over all the light sources, requires M×N×P×Q calculations to make a hologram with a resolution of M×N [pixel²] for an object of P×Q [pixel²]. Thus, the two major issues for CGH are how to simplify the calculation equation and how to increase the calculation speed with minimal loss of reconstructed image quality [4, 5]. The research group of [4], which first started CGH research, sped up the generation by including only the horizontal parallax (HPO) and by using a look-up table (LUT) method and a parallel supercomputer. They could generate one hologram frame per second for an object of 10,000 light sources and a hologram resolution of 6M [pixel²]. In [5], the CGH equation was approximated with a Taylor series expansion of the square root. This work deserves attention because it has been the basis of recent CGH research, even though it did not itself contribute much to the speed-up.
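To give a feel for the M×N×P×Q operation count stated above, the following minimal sketch simply multiplies it out. The function name `cgh_operation_count` and the example sizes are illustrative assumptions, not values taken from the cited works.

```python
def cgh_operation_count(holo_w: int, holo_h: int, obj_w: int, obj_h: int) -> int:
    """Number of point-to-pixel fringe evaluations for a direct CGH computation.

    Every hologram pixel (holo_w x holo_h) accumulates one contribution from
    every object point (obj_w x obj_h), hence the product of all four sizes.
    """
    return holo_w * holo_h * obj_w * obj_h


if __name__ == "__main__":
    # Hypothetical example: HD hologram and 100 x 100 object points (10,000 sources).
    ops = cgh_operation_count(1920, 1080, 100, 100)
    print(f"{ops:,} fringe evaluations per hologram frame")
```

Even this modest hypothetical case needs on the order of 10^10 fringe evaluations per frame, which is why both the equation and the hardware have to be optimized.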
Hardware implementations based on FPGAs [6–9] and graphics processing units (GPUs) [10–14] were developed subsequently, because implementing the approximated equation in software could not reach the desired speed-up. The authors of [6] modified the equation of [5] into a recursive equation that calculates a row of a hologram for one light source and implemented it with an FPGA. This research group continued to upgrade its implementations and developed the HORN-5 system, a printed circuit board (PCB) with four Xilinx FPGAs that calculates the Fresnel-transform CGH [7]. It arranges as many calculation cells as there are pixels in a column of a hologram and takes 0.0679 sec/hologram with a 166 MHz clock. In addition, this group proposed HORN-6, a special-purpose computer system dedicated to CGH calculation [8]. In [9], a CGH processor with a 100% pipelined structure was proposed, covering the structures of the basic cell for the Fresnel transform, the kernel, and the processor.
Reference [12] proposed an LUT-based algorithm and implemented it on an Nvidia GPU. It takes 0.3 sec to calculate a hologram frame of 1,024×768 [pixel²] with 1,000 object light sources. Reference [13] used a 3-dimensional mesh model to calculate the CGH and implemented it on a GPU. Reference [14] used an AMD GPU with optimized GPU programming and achieved about 0.03 sec/hologram at HD resolution, but with only 1,000 light sources, which are too few to yield reasonable image quality.
Even if an SLM with a pixel pitch of 10 μm is used for image reconstruction, a hologram of HD resolution produces an image of only about 2×1 [cm²]. Therefore, a much higher calculation speed is required to serve real-time hologram video of moderate image size. We proposed a high-performance CGH processor in [9] by re-arranging the CGH equation and adopting a pipelining scheme. In this paper, we modify the CGH equation used in previous works, including [9], to maximize parallel computation, and we propose a new hardware structure that implements the modified equation to speed up the computation.
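The image-size estimate above follows from multiplying the pixel count by the pixel pitch, assuming that the reconstructed image is roughly as large as the active area of the SLM and that HD resolution means 1,920×1,080 pixels:

```latex
\begin{align*}
W &= 1920 \times 10\,\mu\text{m} = 19.2\,\text{mm} \approx 2\,\text{cm},\\
H &= 1080 \times 10\,\mu\text{m} = 10.8\,\text{mm} \approx 1\,\text{cm}.
\end{align*}
```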
Section 2 gives a conceptual explanation of CGH and the recursive equation for high-speed computation. Section 3 modifies this equation for maximum parallel computation and proposes the corresponding hardware structure. Section 4 describes the implementation on an Altera FPGA and the experiments. Section 5 concludes the paper.
2. Computer-generated hologram (CGH)
This section briefly explains the CGH equations proposed so far for high-speed computation or hardware implementation.
2.1. The basic CGH equation
CGH generation calculates the interference between two light waves, the object wave and the reference wave. This paper focuses on the phase hologram, which generates a hologram from the phases of the object wave components. The proof of this method can be found in [3].
The fundamental equation for the phase hologram used in CGH research is Eq. (1), where $I_{\alpha j}$ is the intensity contribution to the hologram pixel $(x_\alpha, y_\alpha)$ from the object pixel $(x_j, y_j, z_j)$, $A_j$ is the light intensity of the object pixel, $\lambda$ is the wavelength, $p$ is the pixel pitch (the pixel pitches of the object plane and the hologram are assumed to be the same), and $\Phi_\alpha$ and $\Phi_j$ are the initial phases of the object wave and the reference wave, respectively. A hologram pixel is completely calculated by summing the contributions from all the object pixels. However, computing one hologram pixel at a time has a drawback: every term of the sum belongs to a different object pixel, with a different z value and/or different x and y values, so the position changes at every step and complicates the calculation. Therefore, it is preferable in CGH calculation to compute the contributions of one object pixel to all the hologram pixels and then move on to the next object pixel. This scheme minimizes the changes of the z value.
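The object-point-major loop order described above can be sketched as follows. The fringe term used here, $A_j \cos(2\pi r/\lambda)$ with $r$ the distance between object point and hologram pixel and both initial phases set to zero, is an assumed common form and is not necessarily identical to the paper's Eq. (1); the function name `direct_cgh` and its argument layout are likewise illustrative.

```python
import math


def direct_cgh(points, width, height, pitch, wavelength):
    """Accumulate a hologram object point by object point.

    points: iterable of (xj, yj, zj, Aj); xj, yj in pixel units, zj in metres.
    Assumed fringe term: Aj * cos(2*pi/wavelength * distance), initial phases = 0.
    """
    hologram = [[0.0] * width for _ in range(height)]
    k = 2.0 * math.pi / wavelength
    for xj, yj, zj, aj in points:           # outer loop: one object point at a time
        z2 = zj * zj                        # z term is computed once per object point
        for ya in range(height):
            dy2 = (pitch * (ya - yj)) ** 2  # constant along a hologram row
            row = hologram[ya]
            for xa in range(width):         # only the x difference changes here
                dx = pitch * (xa - xj)
                row[xa] += aj * math.cos(k * math.sqrt(dx * dx + dy2 + z2))
    return hologram
```

With this ordering, the z-dependent quantity is evaluated once per object point and the y-dependent quantity once per row, which is the property the row-based recursion of the next subsection builds on.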
2.2. Recursive equation for CGH calculation
Examining Eq. (1), one can see that along a row of the hologram, for a given light source, only the x value changes. Thus, it is more efficient to process a row of the hologram together, as shown in Fig. 1 [6–9]. If $z_j \gg p|x_\alpha - x_j|$ and $z_j \gg p|y_\alpha - y_j|$, Eq. (1) can be approximated by Eq. (2) by expanding its square root in a series [6], where $x_{\alpha j} = x_\alpha - x_j$ and $y_{\alpha j} = y_\alpha - y_j$. In Eq. (2), the only quantity (phase information) that changes along a row of hologram pixels is $\theta(x_{\alpha j}, y_{\alpha j}, z_j)$ [6]. The phase $\theta_H$ at $(x_\alpha, y_\alpha)$ on the hologram plane formed by the light source at $(x_j, y_j)$ in the virtual object is expressed as Eq. (3), and the separated phases $\theta_{XY}$ and $\theta_Z$ are defined in Eqs. (4) and (5), respectively.
Fig. 1 Calculating a row of hologram components by one object light source.
Now, let us consider the hologram pixel at $x = p(x_\alpha + d)$ in the same row. Its $\theta_{XY}$ is given by Eq. (6). This means that once $\theta_{XY}(x_{\alpha j}, y_{\alpha j}, z_j)$ is calculated, $\theta_{XY}$ of any pixel at $x > p x_\alpha$ in the same row can be obtained by adding the rightmost term of Eq. (6). The $\theta_Z$ values of all the pixels in the row are, of course, the same. Therefore, once $\theta_{XY}$ for the leftmost pixel in a row and $\theta_Z$ are calculated, all the other pixels in the row can be calculated recursively from Eq. (6), as in Eqs. (7) and (8).
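As a hedged sketch of why such a recursion exists, assume the Fresnel-approximated phase takes the form commonly used in the cited hardware works, expressed in units of $2\pi$. The paper's own Eqs. (3) to (8) may differ in notation or constant factors.

```latex
\begin{align*}
\theta_H &= \theta_Z + \theta_{XY}, \qquad
\theta_Z = \frac{z_j}{\lambda}, \qquad
\theta_{XY}(x_{\alpha j}, y_{\alpha j}, z_j) = \frac{p^2}{2\lambda z_j}\left(x_{\alpha j}^2 + y_{\alpha j}^2\right),\\
\theta_{XY}(x_{\alpha j} + d,\, y_{\alpha j},\, z_j)
 &= \frac{p^2}{2\lambda z_j}\left((x_{\alpha j} + d)^2 + y_{\alpha j}^2\right)
  = \theta_{XY}(x_{\alpha j}, y_{\alpha j}, z_j) + \frac{p^2}{2\lambda z_j}\left(2 d\, x_{\alpha j} + d^2\right).
\end{align*}
```

Under this assumption, the added term depends only on $d$, $x_{\alpha j}$, and $z_j$. Stepping one pixel at a time ($d = 1$) therefore needs only additions once the first increment and its constant step are known.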
3. Parallel CGH calculation and its hardware architecture
This section proposes a modified CGH equation that can be evaluated fully in parallel when implemented in hardware. Its hardware design is then proposed together with its internal and external pipelining schemes. In addition, this section describes a precision approximation that causes minimal quality degradation.
3.1. CGH equation in full parallel
These values form a progressive sequence of differences. The general term of this sequence is
Now, let us compare this equation to Eq. (10). In Eq. (10), the pixel values in a row of a hologram affected by a light source must be calculated serially, one by one, because of its recursive property, although different rows can be processed in parallel if the hardware is provided. For Eq. (14), the independence of the rows is the same as for Eq. (11). In addition, once $\theta_{XY}(x = 0)$ and $\theta_{XY}(x = p)$, i.e., $d = 1$, are calculated, all the other pixel values in a row can be calculated in parallel if hardware is provided for each pixel.
This method gives a great deal of flexibility in the parallel computation of digital holograms: if sufficient hardware resources are provided, two calculation cycles suffice for all the hologram components of one light source. Of course, Eq. (10) can also be computed in parallel by dividing a row into several sub-rows, or, in the extreme case, into individual pixels. However, the first pixel of each sub-row must then be calculated by Eq. (5), which requires much more hardware and a more complicated data input scheme than Eq. (10) or (14).
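The difference between the serial and the parallel form can be illustrated with a short sketch. It assumes a quadratic phase $\theta_{XY} = c\,(x^2 + y^2)$ with $c = p^2/(2\lambda z_j)$; the names `gamma1` and `delta` follow the symbols $\Gamma_1$ and $\Delta$ used in Section 3.2, but their definitions here are derived from that assumption and are not taken from the paper's equations. All function names are hypothetical.

```python
def row_parameters(x0, y, z, pitch, wavelength):
    """First phase, first increment and constant step for one hologram row.

    Assumes theta_XY = c * (x**2 + y**2) with c = pitch**2 / (2 * wavelength * z).
    """
    c = pitch * pitch / (2.0 * wavelength * z)
    theta0 = c * (x0 * x0 + y * y)   # phase of the leftmost pixel (d = 0)
    gamma1 = c * (2 * x0 + 1)        # theta(d = 1) - theta(d = 0)
    delta = 2.0 * c                  # constant second difference
    return theta0, gamma1, delta


def theta_xy_recursive(theta0, gamma1, delta, n):
    """Serial form: each pixel needs the previous one (two additions per step)."""
    thetas = [theta0]
    gamma = gamma1
    for _ in range(1, n):
        thetas.append(thetas[-1] + gamma)  # theta_d = theta_{d-1} + Gamma_d
        gamma += delta                     # Gamma_{d+1} = Gamma_d + Delta
    return thetas


def theta_xy_parallel(theta0, gamma1, delta, n):
    """Closed form: every pixel depends only on its own index d, so all
    entries could be evaluated at the same time by independent cells."""
    return [theta0 + d * (gamma1 + delta * (d - 1) / 2) for d in range(n)]
```

Both functions return the same row of phases. The closed form trades the two additions per pixel of the recursion for multiplications by $d$ and $(d-1)/2$, which removes the dependency between neighbouring pixels.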
3.2. Hardware architecture of computational cells
We separate Eq. (8) into two components, the initial-parameter calculator $i_{init}(x_{\alpha j}, y_{\alpha j}, z_j)$ and the update-phase calculator $i_{update}(i_{init}(), d)$, given by Eqs. (15) and (16), respectively, to design hardware for Eqs. (7) and (8), including Eq. (14). Figure 2 shows their hardware architecture.
Fig. 2 Architecture of CGH cells: (a) initial-parameter calculator, (b) update-phase calculator.
The outputs of the initial-parameter calculator in Fig. 2(a) are the phase of the first pixel, $\theta_{H,d=0} = \theta_{XY,d=0} + \theta_Z$, together with $\Gamma_1$ and $\Delta$; $\Gamma_1$ and $\Delta$ are used in the update-phase calculator. In this cell, $\theta_Z$ and $\Delta/2$ are taken from a pre-generated look-up table, LUT1, with $z_j$ as the address. Note that $z_j$ is used only once per row and the wavelength $\lambda$ is fixed. The update-phase calculator in Fig. 2(b) also uses a look-up table, LUT2, for the cosine function; Section 3.4 explains this in more detail. Note that the update-phase calculator does not have the feedback loop found in the corresponding cell of [7] or [9], which greatly increases the flexibility in pipelining, as explained in the next section.
Fig. 3 The pipelined architecture of the update-phase calculator.
Only one initial-parameter calculator cell is required to calculate a row of hologram components for a light source, whereas as many update-phase calculator cells as desired can be included. With a single update-phase cell, the recursive operation of Eq. (11) is performed. With $M-1$ update-phase cells ($M$ is the number of pixels in a hologram row), which is the fastest case, all the pixels except the leftmost one are calculated in parallel.
3.3. Pipelining
Many pipelining schemes are possible, and more than one is usually implemented, depending on the hardware provided: hologram by hologram, light source by light source, row by row, pixel by pixel, internal operator by operator, and so on. Only the operator-level scheme is explained here, because the higher-level pipelining schemes depend more on the number of update-phase cells included than on the internal operators.
Figure 3 shows the pipelining scheme for the update-phase cell only, because pipelining the initial-parameter cell may be meaningless unless sufficient cells are included. As can be seen in the figure, we use a counter for $(d-1)/2$ as well as for $d$; the former is easily obtained from the latter by decrementing and shifting. The cell has six pipeline stages, and Table 1 shows their time scheduling. From the sixth clock cycle after $\Delta$ is obtained, each clock cycle outputs one pixel value. The delay that determines the clock period is that of one multiplier.
Table 1 Pipeline Stage Scheduling
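| Cycle | Stage 1 (R0) | Stage 2 (R1) | Stage 3 (R2) | Stage 4 (R3) | Stage 5 (R4) | Stage 6 (R5) |
|---|---|---|---|---|---|---|
| 1 | $0\Delta$ | | | | | |
| 2 | | $\Gamma_1 + 0\Delta$ | | | | |
| 3 | $1\Delta$ | | $\Gamma_1$ | | | |
| 4 | | $\Gamma_1 + 1\Delta$ | $\Gamma_2$ | $\theta_{H,1}$ | | |
| 5 | $2\Delta$ | | $\Gamma_3$ | $\theta_{H,2}$ | $\cos(2\pi\theta_{H,1})$ | |
| 6 | | $\Gamma_1 + 2\Delta$ | $\Gamma_4$ | $\theta_{H,3}$ | $\cos(2\pi\theta_{H,2})$ | $A_j\cos(2\pi\theta_{H,1})$ |
| 7 | $3\Delta$ | | $\Gamma_5$ | $\theta_{H,4}$ | $\cos(2\pi\theta_{H,3})$ | $A_j\cos(2\pi\theta_{H,2})$ |
| 8 | | $\Gamma_1 + 3\Delta$ | $\Gamma_6$ | $\theta_{H,5}$ | $\cos(2\pi\theta_{H,4})$ | $A_j\cos(2\pi\theta_{H,3})$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| n | | | $\Gamma_{n-2}$ | $\theta_{H,n-3}$ | $\cos(2\pi\theta_{H,n-4})$ | $A_j\cos(2\pi\theta_{H,n-5})$ |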
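The schedule in Table 1 implies a simple timing model: the first result leaves Stage 6 in cycle 6, and one more result follows in every later cycle. The sketch below encodes that reading. The function name and the assumption that the pipeline never stalls are illustrative and not part of the paper.

```python
def cycles_for_outputs(n_outputs: int, depth: int = 6) -> int:
    """Clock cycles until a stall-free pipeline of `depth` stages
    has emitted `n_outputs` results (one result per cycle once full)."""
    if n_outputs <= 0:
        return 0
    return depth + n_outputs - 1


if __name__ == "__main__":
    # Third pixel value of a row, as in the Stage 6 column of Table 1.
    print(cycles_for_outputs(3))
    # Hypothetical row of 1,920 pixels handled by a single update-phase cell
    # (the leftmost pixel comes from the initial-parameter cell).
    print(cycles_for_outputs(1919))
```

For long rows the fill latency of the six stages becomes negligible, so the throughput of one cell approaches one pixel value per clock cycle.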
Now, let us consider extending the cells to perform parallel computation. Only the extension to two cells is explained for simplicity, because the extension to more than two cells is conceptually the same. Figure 4 depicts two possible schemes. The one in Fig. 4(a) shares the cosine function and a multiplier, so its outputs must be taken sequentially through the MUX. Conversely, the one in Fig. 4(b) can calculate two pixel values separately; the MUX at the end of this structure outputs the two results sequentially, as in Fig. 4(a), and can be removed if sequential output is unnecessary. Consequently, the structure of Fig. 4(a) saves some hardware resources at the expense of some flexibility in parallel computation compared to the structure of Fig. 4(b).
Fig. 4 Extension of update-phase calculator for pixel-based parallelization; (a) extendable structure with sequential outputs, (b) extendable structure with sequential or parallel outputs.
3.4. Precision approximation for cosine function
Fixed-point computation is preferred over floating-point computation in hardware implementation, because it uses fewer resources and is faster. Note that a fixed-point number system is itself an approximation. For example, when a number is expressed as an 8-bit integer, it is approximated by an integer between 0 and 255. Therefore, the intermediate results can usually be approximated without losing much precision in the final result, which reduces the amount of hardware resources used.
This section deals with the approximation of the cosine function used in Figs. 2, 3, and 4. The methodology was a fixed-point simulation in which a given number of bits is assigned to the result of the cosine function. Figure 5 shows the simulation results for the numbers of bits on the horizontal axis, both for the hologram (Fig. 5(a)) and for the reconstructed object (Fig. 5(b)). We estimated both the peak signal-to-noise ratio (PSNR) (Eq. (17)) and the normal correlation (NC) (Eq. (18)) for each case, where $X$ and $Y$ are the horizontal and vertical resolutions, and $I$ and $I'$ are an original and a reconstructed pixel, respectively. In both cases, the results saturate to those obtained without approximation when more than 28 bits are assigned.
Fig. 5 The experimental results of approximation for the cosine function; (a) hologram (b) reconstructed object.
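For orientation, PSNR and NC are commonly defined for 8-bit images as shown below. Whether Eqs. (17) and (18) use exactly these forms, in particular the peak value 255 and the normalization of NC, is an assumption.

```latex
\begin{align*}
\mathrm{MSE} &= \frac{1}{XY}\sum_{x=1}^{X}\sum_{y=1}^{Y}\left(I(x,y) - I'(x,y)\right)^2, &
\mathrm{PSNR} &= 10\log_{10}\frac{255^2}{\mathrm{MSE}}\ \text{[dB]},\\
\mathrm{NC} &= \frac{\sum_{x=1}^{X}\sum_{y=1}^{Y} I(x,y)\,I'(x,y)}{\sum_{x=1}^{X}\sum_{y=1}^{Y} I(x,y)^2}.
\end{align*}
```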
However, in real applications, the quality of the reconstructed image is more important than that of the hologram. In addition, the results of subjective tests or visual inspection may differ considerably from PSNR or NC measurements, especially for holograms. Figure 6 shows some examples of the reconstructed image for various cosine approximations with the Rabbit test image. In the figure, assigning 0 bits means that $\cos\theta_H = 1$ always, assigning 1 bit makes $\cos\theta_H$ either 1 or −1, and so on. One can easily see from the figure that the image obtained with 1 bit does not differ much in quality from that obtained with 30 bits. From this experiment, we concluded that 3 bits are sufficient for the cosine function, and we implement LUT2 in Figs. 3 and 4 with 3 bits.
Fig. 6 The object reconstruction results for the approximations of cosine function by assigning; (a) 0 bit, (b) 1 bit, (c) 15 bits, (d) 30 bits.
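The bit-assignment experiment can be mimicked with a small quantizer. The sketch below rounds the cosine to $2^b$ uniformly spaced levels in $[-1, 1]$, which reproduces the two cases described in the text (0 bits gives 1, 1 bit gives 1 or −1). The uniform spacing for larger $b$, and the function names, are assumptions made for illustration and may differ from the fixed-point format actually used for LUT2.

```python
import math


def quantize_cos(theta_turns: float, bits: int) -> float:
    """cos(2*pi*theta) rounded to 2**bits uniformly spaced levels in [-1, 1].

    bits = 0 always returns 1.0; bits = 1 returns +1.0 or -1.0.
    The uniform level spacing is an assumption for illustration.
    """
    c = math.cos(2.0 * math.pi * theta_turns)
    if bits == 0:
        return 1.0
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    return round((c + 1.0) / step) * step - 1.0


def max_quantization_error(bits: int, samples: int = 1000) -> float:
    """Largest |cos - quantized cos| over one period, sampled uniformly."""
    return max(
        abs(math.cos(2.0 * math.pi * k / samples) - quantize_cos(k / samples, bits))
        for k in range(samples)
    )


if __name__ == "__main__":
    for b in (1, 3, 8):
        print(b, max_quantization_error(b))
```

The worst-case error shrinks as bits are added, which is the numerical counterpart of the PSNR and NC curves; the visual result in Fig. 6 is what motivates stopping at a small bit width.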
3.5. CGH processor
The initial-parameter calculator cell of Fig. 2(a) and the update-phase calculator of Fig. 4(a) or (b) constitute a CGH kernel that performs the actual CGH calculation. The number of cells in the CGH kernel is pre-determined according to the parallel computation scheme. The kernel is in turn a component of the CGH processor, which also includes input/output interfaces, memory and its controller, and DMA. In this paper, the architectures of the CGH kernel and CGH processor of [9] are used.
4. Hardware implementation and experiments
The architecture proposed in the previous section was implemented in VHDL for an Altera FPGA. Quartus II 10.0 and ModelSim 6.5e were used for VHDL design and simulation, respectively.
Table 2 compares the hardware resources of each cell of the proposed method with those of [9], which has a similar hardware composition. As shown in the table, the proposed scheme uses fewer hardware resources (multipliers, adders, and LUTs), even though it uses one MUX per cell, and it has high flexibility in parallel computation.
Table 2 Hardware Resource of CGH Cell
| Resource | Initial-parameter calculator, [9] | Initial-parameter calculator, proposed | Optimized update-phase calculator, [9] | Optimized update-phase calculator, proposed |
|---|---|---|---|---|
| Multiplier | 2 | 2 | 3 | 2 |
| Adder | 3 | 3 | 3 | 2 |
| LUT1 | 1 | 1 | - | - |
| LUT2 | 1 | - | - | 1 |
| LUT3 | - | - | 1 | - |
| Register | 4 | 4 | 8 | 6 |
| MUX | - | - | 1 | - |
As explained previously, our scheme can calculate the contributions of one light source to all the hologram pixels in parallel if sufficient hardware is provided. However, the hardware for all the hologram pixels might be too large to realize. Thus, we estimated the calculation capability as the amount of hardware increases, as shown in Fig. 7. The horizontal axis indicates the amount of hardware, expressed as the corresponding number of hologram rows, and the vertical axis indicates the number of hologram frames per second. Three hologram resolutions were considered: 1,920×1,080, 1,408×1,050, and 1,280×1,024 [pixel²] (1920, 1408, and 1280 in the figure, respectively). In addition, two clock frequencies, 166 MHz and 294 MHz, were included, of which 294 MHz is the maximum stable frequency. As can be seen in the figure, the calculation speed follows Eq. (19), as expected.
Fig. 7 Calculation speed according to the amount of hardware.
Table 3 shows the performance under some implementation conditions. The number of object light points was the same in all cases, but two hologram resolutions were considered, 1,920×1,080 and 1,408×1,050 [pixel²]. Two amounts of hardware were examined: the number of cells corresponding to one row of the hologram and to four rows. The maximum clock frequency that operated stably was 294 MHz, at which 27.22 frames/sec of holograms with HD resolution could be generated. As explained above, the speed is proportional to the clock frequency and inversely proportional to the hologram resolution and the number of object points, as shown in the other cases of the table.
Table 3 Performances for Various Implementation Conditions
| Number of object points | 10,000 | 10,000 | 10,000 |
|---|---|---|---|
| Hologram resolution [pixel²] | 1,920×1,080 | 1,920×1,080 | 1,408×1,050 |
| Frequency [MHz] | 294 | 294 | 166 |
| Time/CGH [sec] | 0.036 | 0.0092 | 0.0158 |
| CGHs/sec | 27.22 | 91.8 | 62.2 |
| Included number of cells | 1,920 | 7,680 | 5,632 |
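The Time/CGH entries of Table 3 are consistent with a simple cycle-count model in which the cells covering one or four hologram rows produce those rows in one clock cycle per object point. The sketch below encodes that model. It is an idealization assumed here (no pipeline fill, no memory stalls) and is not the paper's Eq. (19); the function name and parameters are hypothetical.

```python
def time_per_cgh(points: int, rows: int, parallel_rows: int, clock_hz: float) -> float:
    """Idealised seconds per hologram.

    Assumes one clock cycle per group of `parallel_rows` hologram rows for
    each object point, ignoring pipeline fill and memory transfer time.
    """
    cycles = points * rows / parallel_rows
    return cycles / clock_hz


if __name__ == "__main__":
    # 1,920 cells (one row), HD hologram, 10,000 object points, 294 MHz.
    t = time_per_cgh(10_000, 1080, 1, 294e6)
    print(f"{t:.4f} s per hologram, {1.0 / t:.2f} holograms per second")
```

Under these assumptions the model shows the proportionality stated in the text directly: the time per hologram scales with the number of object points and hologram rows, and inversely with the number of rows computed in parallel and with the clock frequency.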
Figure 8 shows two examples of reconstructed objects, Ballet and Hyun-Jin. Ballet is a multi-view test video sequence from MPEG, used with a depth map resolution of 200×200 [pixel²]. Hyun-Jin is an image that we made, with a depth map resolution of 177×144 [pixel²]; its depth information was captured with a depth camera from Mesa Imaging. The hologram resolution was 1,280×1,024 [pixel²] for both images. For each test image, the figure shows its depth map in Fig. 8(a) and 8(d), the image reconstructed by simulation from the CGH generated with the original Eq. (1) in Fig. 8(b) and 8(e), and the image reconstructed in the optical system (such as that of [9]) from the CGH generated by the proposed hardware in Fig. 8(c) and 8(f). The resolution and pixel pitch for the software-based CGH and reconstruction are 1,024×1,024 and 10.4 μm, respectively. In the optical system, the resolution and pixel pitch of the SLM are 1,280×1,024 and 13.62 μm, respectively.
Fig. 8 Examples of reconstructed images (upper ones for Ballet and lower ones for Hyun-Jin); (a) and (d) depth maps, (b) and (e) reconstructed results by software, (c) and (f) (Media 1 and Media 2) reconstructed results by optical apparatus for the CGH generated by the proposed hardware.
5. Conclusion
In this paper, we proposed a CGH generation equation that maximizes parallel computation by modifying the previous one. In addition, we proposed the hardware design of the CGH cells that calculate the initial parameters and update the phase for the other pixels, while the architecture of the kernel and processor was taken from one of our previous papers.
The experimental results, obtained by implementing the proposed scheme in hardware under various conditions and with various amounts of hardware, verified that the calculation speed is proportional to the amount of hardware and the clock frequency, and inversely proportional to the resolutions of the object image and the hologram. This maximizes the flexibility of CGH calculation.
The purpose of this paper is to maximize the parallel computation of CGH generation with the proposed hardware architecture. It can complete the computation of all the hologram pixels for one light source in a few clock cycles if sufficient hardware is incorporated. It also offers a trade-off between the amount of hardware and the computation speed, all the way from pixel-by-pixel serial computation to fully parallel computation. Therefore, the proposed scheme can be adapted efficiently to the hardware and speed requirements of each application.
Acknowledgments
This work was supported by the IT R&D program of KEIT [KI002058, Signal Processing Elements and their SoC Developments to Realize the Integrated Service System for Interactive Digital Holograms].
References and links
1. S. Benton and V. M. Bove Jr., Holographic Imaging (Wiley, 2008).
2. J. K. Chung and M. H. Tsai, Three-Dimensional Holographic Imaging (Wiley, 2002).
3. P. Hariharan, Basics of Holography (Cambridge University Press, 2002).
4. M. Lucente, “Interactive computation of holograms using a look-up table,” J. Electron. Imaging 2, 28–34 (1993).
5. H. Yoshikawa, S. Iwase, and T. Oneda, “Fast computation of fresnel holograms employing differences,” Proc. SPIE 3956, 48–55 (2000).
6. T. Shimobaba and T. Ito, “An efficient computational method suitable for hardware of computer-generated hologram with phase computation by addition,” Comput. Phys. Commun. 138, 44–52 (2001).
7. T. Ito, N. Masuda, K. Yoshimura, A. Shiraki, T. Shimobaba, and T. Sugie, “Special-purpose computer HORN-5 for a real-time electroholography,” Opt. Express 13, 1923–1932 (2005).
8. Y. Ichihashi, H. Nakayama, T. Ito, N. Masuda, T. Shimobaba, A. Shiraki, and T. Sugie, “HORN-6 special-purpose clustered computing system for electroholography,” Opt. Express 17, 13895–13903 (2009).
9. Y.-H. Seo, H.-J. Choi, J.-S. Yoo, and D.-W. Kim, “An architecture of a high-speed digital hologram generator based on FPGA,” J. Syst. Archit. 56, 27–37 (2009).
10. N. Masuda, T. Ito, T. Tanaka, A. Shiraki, and T. Sugie, “Computer generated holography using a graphics processing unit,” Opt. Express 14, 603–608 (2006).
11. L. Ahrenberg, P. Benzie, M. Magnor, and J. Watson, “Computer generated holography using parallel commodity graphics hardware,” Opt. Express 14, 7636–7641 (2006).
12. Y. Pan, X. Xu, S. Solanki, X. Liang, R. Bin, A. Tanjung, C. Tan, and T.-C. Chong, “Fast CGH computation using S-LUT on GPU,” Opt. Express 17, 18543–18555 (2009).
13. Y.-Z. Liu, J.-W. Dong, Y.-Y. Pu, B.-C. Chen, H.-X. He, and H.-Z. Wang, “High-speed full analytical holographic computations for true-life scenes,” Opt. Express 18, 3345–3351 (2010).
14. T. Shimobaba, T. Ito, N. Masuda, Y. Ichihashi, and N. Takada, “Fast calculation of computer-generated-hologram on AMD HD5000 series GPU and OpenCL,” Opt. Express 18, 9955–9960 (2010).
15. J. W. Goodman, Introduction to Fourier Optics, 3rd ed. (Roberts and Company, 2005).