1. Introduction
Thus far, a number of approaches for generating computer-generated holograms (CGHs) of three-dimensional (3-D) objects have been proposed [1–7]. Recently, a novel look-up table (NLUT) method was proposed to highly enhance the computational speed as well as to dramatically reduce the total number of pre-calculated interference patterns required for CGH generation [7]. In this method, a 3-D object is approximated as a set of discretely sliced image planes having different depths, and only the fringe patterns of the center-located object points on each image plane, which are called two-dimensional principal fringe patterns (2-D PFPs), are pre-calculated and stored. Therefore, the memory size of the NLUT can be reduced from the order of terabytes (TB) for the conventional look-up table (LUT) down to the order of gigabytes (GB) for a moderate 3-D object [7].
To further reduce the memory capacity, a one-dimensional NLUT (1-D NLUT) was proposed, in which the GB-sized memory of the conventional NLUT could be decreased down to the order of megabytes (MB) by employing a new concept of 1-D sub-PFPs decomposed from the conventional 2-D PFPs [8]. Thus far, this 1-D NLUT might be the most effective approach for generating CGH patterns for future lifelike 3-D mobile terminals in terms of memory capacity.
In addition, in order to accelerate the computational speed, several hardware and software methods have also been proposed [10–17]. In the hardware approach, the original NLUT or its modified versions were implemented on field-programmable gate arrays (FPGAs) or graphics processing units (GPUs) [16,17]. On the other hand, in the software approach, the NLUT employed various image compression algorithms to remove spatially or temporally redundant object data by taking advantage of the shift-invariance property of the NLUT [10–15].
Among them, a temporal redundancy-based NLUT (TR-NLUT) was proposed [10], in which temporally redundant object data between consecutive video frames of 3-D moving objects are removed with the differential pulse-code modulation (DPCM) algorithm, and then only the difference images are applied to the NLUT for CGH calculation. This way, the TR-NLUT can reduce the number of object points to be calculated for CGH generation and obtain a more enhanced computational speed than the NLUT.
This method, however, has a critical drawback in that its compression efficiency is highly dependent on the temporal redundancy of the 3-D moving objects [10,15]. That is, as the 3-D objects move faster, the image differences between two consecutive video frames get larger, which results in a sharp decrease in the compression efficiency of the TR-NLUT. To be more precise, the TR-NLUT cannot achieve any improvement in computational speed in cases where there are more than 50% image differences between two consecutive video frames.
On the other hand, a motion-compensation concept has been widely employed in the compression of 2-D video images [17–20]. In this approach, the video data to be processed can be significantly reduced by compensating the object’s motions with estimated motion vectors between two consecutive video frames. Moreover, as mentioned above, the NLUT has an inherent shift-invariance property, which means the PFPs for the object points on each depth plane are always the same regardless of the location shift of the object points [7,16]. Therefore, by applying this motion-compensation concept to the NLUT based on its shift-invariance property, a massive removal of redundant object data between consecutive video frames may be expected.
For this, a motion-compensated NLUT (MC-NLUT) method was proposed [16], in which motion vectors of the 3-D moving objects are extracted between two consecutive video frames and, with these, the object’s motions at each frame are compensated. Then, only the difference image between the motion-compensated object image of the previous frame and the object image of the current frame is applied to the NLUT for CGH calculation.
Accordingly, in this paper, a new type of NLUT employing the MPEG-based motion estimation and compensation scheme, called the MPEG-based NLUT (MPEG-NLUT), is proposed as an alternative to the conventional TR-NLUT and MC-NLUT methods for the rapid generation of video holograms of fast-moving 3-D objects in space.
The MPEG is the most efficient compression algorithm for video images because of its excellent ability to exploit the high temporal correlation between successive video frames. That is, the MPEG operates its block-based motion estimation and compensation process on a strict mathematical model; hence, as many redundant object data between consecutive video frames as possible can be removed, which allows the computational speed of the MPEG-NLUT to be highly enhanced.
Furthermore, since the compression performance of the MPEG-NLUT may not depend on the temporal redundancy of the 3-D moving objects, it can be applied to video compression of fast-moving 3-D objects. In other words, contrary to the conventional TR-NLUT and MC-NLUT methods, the MPEG-NLUT method shows an excellent compression performance even for cases where there are more than 50% image differences between consecutive video frames. In addition, since the MPEG stream contains the motion vectors and residual information of 3-D moving objects, the proposed method can directly generate the CGHs from the MPEG stream without a decoding process at the recipients.
To confirm the feasibility of the proposed MPEG-NLUT method, experiments with three types of test 3-D video scenarios are performed and the results are compared to those of the conventional NLUT, TR-NLUT and MC-NLUT methods in terms of the average number of calculated object points and the average calculation time.
2. Shift-invariance property of the NLUT method
Basically, a 3-D object can be treated as a set of image planes discretely sliced along the z-direction, and each image plane having a specific depth is approximated as a collection of self-luminous object points of light. In the NLUT method, only the 2-D PFPs representing the fringe patterns of the object points located on the centers of each depth plane are pre-calculated and stored [7]. Then, the fringe patterns of the other object points located on each depth plane can be obtained just by shifting these pre-calculated PFPs according to the displaced coordinate values from the center to the corresponding object points, without further calculation processes.
Here, we can define the unity-magnitude PFP for the object point (x_{p}, y_{p}, z_{p}) positioned on the center of the depth plane z_{p}, T(x, y; z_{p}), as Eq. (1), in which the wave number k is defined as k = 2π/λ, and λ and θ_{R} represent the free-space wavelength of the light and the incident angle of the reference beam, respectively. Moreover, the oblique distance r_{p} between the p-th object point (x_{p}, y_{p}, z_{p}) and the point (x, y, 0) on the hologram plane is given by Eq. (2).
Therefore, in the NLUT method, the CGH pattern for an object, I(x, y), can be expressed in terms of the shifted versions of the pre-calculated PFPs of Eq. (1), as shown in Eq. (3), where N denotes the number of object points.
Equation (3) reveals that the NLUT method may enable obtaining the CGH pattern of a 3-D object just by a combination of shifting and adding operations on the PFPs of each depth plane of the 3-D object. That is, on each depth plane, the corresponding PFPs are shifted according to the displaced values of the object points from the center point, and then these shifted versions of the PFPs for all object points located on each depth plane are added up to obtain the final CGH pattern of the 3-D object, which confirms the unique shift-invariance property of the NLUT.
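The shift-and-add principle above can be sketched in a few lines of NumPy. The function below is illustrative only: `pfp` stands for one depth plane's pre-calculated PFP, stored over an extended window so that any shifted crop stays inside it, and the amplitude weighting and sign conventions are assumptions, not the paper's exact formulation.

```python
import numpy as np

def cgh_from_pfp(pfp, points, cgh_shape):
    """Accumulate a CGH by shifting one depth plane's PFP to each object
    point and adding the shifted copies (the NLUT shift-and-add step).

    pfp       : 2-D array, the principal fringe pattern for this depth
                plane, large enough to cover the CGH for any in-plane shift
    points    : iterable of (dx, dy, amplitude), displacements of each
                object point from the plane's center point
    cgh_shape : (rows, cols) of the output hologram
    """
    rows, cols = cgh_shape
    # The PFP is pre-calculated over an extended window so that a shifted
    # crop always stays inside it (cf. the enlarged PFP in Sec. 4.5.4).
    cy = (pfp.shape[0] - rows) // 2
    cx = (pfp.shape[1] - cols) // 2
    cgh = np.zeros(cgh_shape)
    for dx, dy, a in points:
        # Shifting the PFP by (dx, dy) is equivalent to cropping a
        # displaced window out of the stored pattern.
        cgh += a * pfp[cy - dy:cy - dy + rows, cx - dx:cx - dx + cols]
    return cgh
```

Because the PFP is merely cropped at a displaced window, the fringe pattern of every object point on the plane comes from the same stored array, which is exactly the shift-invariance the text describes.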
Moreover, the CGH pattern for the cubic block moved to the location of B, which is called CGH_{B} here, can be obtained by simply shifting the hologram pattern of CGH_{A} to the location of (x_{1}, y_{1}) without any other calculation processes in the NLUT method. At the same time, from this hologram pattern of CGH_{B}, which is regarded as a shifted version of CGH_{A}, the corresponding cubic image can be reconstructed at the shifted location of B'(x_{1}, y_{1}, z_{1}).
Accordingly, in the NLUT method, the CGH pattern for the moved object can be generated only with the moving distances of the object without any additional calculations by taking advantage of the shift-invariance property of the NLUT. That is, if the motion vectors of the 3-D moving objects between two successive video frames of t_{1} and t_{2} are effectively extracted, the CGH pattern for the video frame of t_{2} can be generated just by shifting the pre-calculated CGH pattern for the video frame of t_{1} according to the extracted motion vectors.
Here, this shift-invariance property of the NLUT is directly matched to the concept of motion estimation and compensation which has been widely employed in compression of 2-D video data. Therefore, by applying this motion estimation and compensation concept to the NLUT, 3-D video data to be calculated for CGH generation is expected to be greatly reduced, which may result in a significant increase of the computational speed of the NLUT.
3. Limitations of the conventional TR-NLUT and MC-NLUT
Figure 3(d), on the other hand, shows the motion vectors extracted from each block of the 3-D moving object by using the MPEG-based motion estimation and compensation scheme.
In this figure, as the car body moves to the right side while the four wheels rotate, the motion vectors corresponding to the car body have the same moving direction, whereas the motion vectors corresponding to the car wheels have many different moving directions. In the proposed MPEG-NLUT, the different motion vectors existing on the moving object can be simultaneously estimated and compensated, so that an accurate compensation of the object’s motion can be achieved on a block basis.
Figure 3(h) shows the object points of the difference image extracted between the motion-compensated reference and input images with the MPEG.
Figures 3(e)–3(h) finally confirm that the proposed MPEG-NLUT requires the smallest number of object points to be calculated for CGH generation compared to the conventional NLUT, TR-NLUT and MC-NLUT methods.
4. Proposed MPEG-based NLUT method
First, input 3-D video frames are grouped into a set of GOPs (groups of pictures) in sequence, in which each GOP consists of one RF (reference frame) and several GFs (general frames). Then, all video frames in each GOP are divided into 2-D arrays of image blocks. Second, the CGH pattern of the RF in a GOP is generated by calculating the block-CGHs (B-CGHs) for each block of the RF by using the TR-NLUT and adding them all together. Third, the motion vectors (MVs) are estimated from the blocks between the RF and each of the GFs in a GOP. Fourth, all object motions on each block of the GFs are compensated just by shifting the image blocks and their corresponding B-CGH patterns according to the estimated MVs, from which motion-compensated RFs (MC-RFs) for each of the GFs are obtained. Fifth, the difference images are extracted from the blocks between the MC-RF and each of the GFs in a GOP. Finally, only the B-CGHs for these difference images are calculated and added to those of each block of the MC-RFs to generate the CGHs for the GFs.
4.1 Grouping and blocking of input 3-D video frames
In order to apply the MPEG-based motion estimation and compensation process to the sequence of 3-D video frames, a grouping and blocking process of the input video frames is needed. That is, input 3-D video frames with both of intensity and depth data are sequentially grouped into a series of GOPs. Here, a GOP is assumed to be composed of four video frames including one RF and three GFs, in which each video frame consists of intensity and depth images having the resolution of 512 × 512 pixels.
As an example, Fig. 5 shows the structure of the input 3-D video frames grouped into a sequence of GOPs, in which 10 video frames are grouped into three GOPs of GOP_{1}, GOP_{2} and GOP_{3}. As mentioned above, the first frame and the remaining three in each GOP are designated as the RF and the GFs, respectively. Here, the number of GFs can be chosen depending on the object’s motions in the input video frames. This means that for slow-moving objects, the similarity of the objects between two adjacent video frames increases, so the number of video frames composing a GOP can be increased, whereas for fast-moving objects the corresponding number of video frames can be decreased.
Following the grouping process, all video frames in each GOP are divided into 32 × 32 square blocks for the subsequent block-based motion estimation and compensation process, in which each block has a resolution of 16 × 16 pixels.
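Using the parameters stated above (GOPs of one RF plus three GFs, 512 × 512 frames, 16 × 16-pixel blocks), the grouping and blocking step might be sketched as follows; the function names are hypothetical.

```python
import numpy as np

def group_into_gops(frames, gop_size=4):
    """Split the frame sequence into GOPs; the first frame of each GOP is
    the RF and the rest are GFs (one RF + three GFs when gop_size == 4)."""
    return [frames[i:i + gop_size] for i in range(0, len(frames), gop_size)]

def split_into_blocks(frame, block=16):
    """Divide one frame into a 2-D array of square blocks
    (a 512 x 512 frame yields 32 x 32 blocks of 16 x 16 pixels)."""
    h, w = frame.shape
    return frame.reshape(h // block, block, w // block, block).swapaxes(1, 2)
```

Here `split_into_blocks(frame)[m, n]` is the (m, n)^{th} image block, so the later motion estimation and B-CGH steps can index blocks directly.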
4.2 Calculation of B-CGHs for each block of the RF
The CGH pattern of the RF in a GOP can be generated by computing the B-CGHs for each block of the RF using the TR-NLUT and adding them all together. With this CGH pattern of the RF, the motion-compensated versions of the remaining three GFs in a GOP can then be calculated.
Here, if I_{B}(m, n) represents the B-CGH of the (m, n)^{th} block of the RF, in which m (1≤m≤M) and n (1≤n≤M) are integers, then the CGH pattern of the RF in a GOP, denoted here as I_{R}, can be simply generated as the sum of the B-CGHs calculated for all blocks of the RF as follows.
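The sum just described can be written out explicitly. The following display is a reconstruction from the surrounding text (the original typeset equation is not reproduced in this copy), with I_{B}(m, n) denoting the B-CGH of the (m, n)^{th} block:

```latex
I_{R}(x,y) \;=\; \sum_{m=1}^{M}\sum_{n=1}^{M} I_{B}(m,n)
```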
Now, for the feasibility test of the proposed method, as well as for a comparative performance analysis against the conventional NLUT, TR-NLUT and MC-NLUT methods, three types of test video scenarios, called here ‘Case I’, ‘Case II’ and ‘Case III’, are computationally generated by using 3DS MAX as sequences of images. Each scenario is composed of 150 video frames with fast-moving 3-D objects, under the condition that the image differences between two consecutive video frames are more than 50% in all video frames.
Furthermore, in the third test scenario of ‘Case III’ shown in Fig. 7(c), two objects move fast in opposite directions with some overlapping regions between them. For this case, there should be some difficulty in segmenting those objects into individuals since the two ‘Car’ objects are overlapped on the same blocks, which means the MC-NLUT may not be appropriate for this test scenario.
4.3 Block-based motion estimation
Here, in case the block-based motion estimation scheme is applied to the 3-D video composed of intensity and depth images, the MVs for each block of the RF are extracted only from the intensity image because the corresponding blocks of the depth image have the same MVs as mentioned above.
As seen in Fig. 8, A_{m,n}(x_{1}, y_{1}), the (m, n)^{th} image block of the RF centered on the location coordinates (x_{1}, y_{1}), is assumed to move to the position B_{m,n}(x_{2}, y_{2}) in the GF with the moving amounts d_{x} and d_{y} along the x and y directions, respectively. The shifted location coordinates of B_{m,n}(x_{2}, y_{2}) in the GF can then be expressed as follows, where d_{x} and d_{y} are designated as the MVs of the block in the x and y directions, respectively, and Δt denotes the time interval between two consecutive video frames. That is, the motion-compensated image block can be obtained by the combined use of the block of the RF and the estimated MVs.
Here, the block-matching process is performed between the block A_{m,n}(x_{1}, y_{1}) on the RF and all possible blocks of the same size within the searching region in the GF. The matching degree of one block with another is based on the value of a cost function, and the block showing the least value of the cost function is the one that matches the reference block most closely.
There are many types of cost functions. Among them, the MAD (mean absolute difference) given by Eq. (6) is used as the cost function in this paper because it is known as one of the most popular and least computationally intensive cost functions [20]. In Eq. (6), S_{L} is the side length of the block, and C_{ij} and R_{ij} represent the pixels being compared in the candidate block of the GF and the reference block of the RF, respectively.
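A minimal full-search block matcher built on the MAD cost of Eq. (6) could look like the sketch below; the ±`search`-pixel window and the out-of-frame handling are illustrative assumptions rather than the paper's exact search strategy.

```python
import numpy as np

def mad(block_gf, block_rf):
    """Mean absolute difference between a candidate GF block (C_ij) and
    the reference RF block (R_ij), as in Eq. (6)."""
    return np.abs(block_gf.astype(float) - block_rf.astype(float)).mean()

def estimate_mv(rf, gf, top, left, size, search=7):
    """Full-search block matching: slide the RF block over a +/- search
    window in the GF and return the (dx, dy) with the least MAD (Eq. (7))."""
    ref = rf[top:top + size, left:left + size]
    best, best_mv = float("inf"), (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > gf.shape[0] or l + size > gf.shape[1]:
                continue  # candidate block would fall outside the GF
            cost = mad(gf[t:t + size, l:l + size], ref)
            if cost < best:
                best, best_mv = cost, (dx, dy)
    return best_mv
```

For a GF that is simply the RF translated by four pixels to the right, this matcher returns the MV (4, 0), the same form as the v(4,0)-type vectors reported later for the ‘Car’ objects.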
After finding the best-matching block in the GF, the displacement between the position of the reference block of the RF and that of the matching block in the GF is computed as the output MV. As seen in Fig. 8, the center points of the reference block A and the best-matching block B are given by (x_{1}, y_{1}) and (x_{2}, y_{2}), which means the reference block of the RF has moved from the location (x_{1}, y_{1}) to the location (x_{2}, y_{2}) in the GF. Accordingly, the motion vector of block A can be computed by using Eq. (7).
That is, the block A is shifted by d_{x} and d_{y} along the x and y directions, respectively, between the RF and the GF. Here, if the motion vector d_{x} or d_{y} is negative, the block is oriented to the left along the x-axis or upward along the y-axis, respectively.
For both ‘Case II’ and ‘Case III’, two ‘Car’ objects move fast in opposite directions, so that MVs with different directions exist at the 2nd frames of Figs. 10(e) and 10(h). That is, most MVs of the two cars are given by (4,0) and (−4,0). Moreover, as shown in Fig. 10(f), at the 71st frame of ‘Case II’, most MVs are given by v_{1}(−4,0) for the ‘Car’ moving from the right to the left, which means the block moves only in the −x direction with a shift amount of four pixels. Likewise, most MVs for the ‘Car’ moving from the left to the right are given by v_{2}(4,0).
For ‘Case III’, with mutually overlapped regions between the two moving ‘Car’ objects from the 30th to the 112th video frames, the MVs of one object could be affected by the other object. As seen in Fig. 10(i), two MVs of v_{1}(−4,0) and v_{2}(4,0) exist, which means the two cars move from the right to the left and vice versa, respectively. However, due to prediction errors, three different MVs of v_{3}(−1,3), v_{4}(−3,1) and v_{5}(3,3) also exist because the two cars are overlapped on the same blocks.
4.4 Block-based motion compensation
For example, as seen in Fig. 9(b), the motion vector of block A is given by (3,0) for the x and y directions. Therefore, block A', which represents the compensated version of A, can be obtained with the motion vector for block A.
However, during the pixel-shifting process of the blocks of the RF, several blocks happen to overlap, which may cause some blocks to be occluded by other blocks, just like the blocks ‘B’ and ‘D’, and ‘C’ and ‘E’, as shown in Fig. 11(b). In these cases, the pixel values of the occluded regions X and Y are determined by those of the blocks arriving there last. That is, if two or more blocks move to the same area in the MC-RF, the last-arriving block covers all of the previous blocks.
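The block-shifting step, together with the "last-arriving block wins" rule for overlapped regions, can be sketched as below; treating later loop iterations as "arriving last" is an assumption made for illustration.

```python
import numpy as np

def motion_compensate(rf_blocks, mvs, frame_shape, block=16):
    """Build the MC-RF by pasting each RF block at its MV-shifted location.
    Later-pasted blocks simply overwrite earlier ones, so overlapped
    regions keep the pixel values of the last-arriving block."""
    mc_rf = np.zeros(frame_shape, dtype=rf_blocks.dtype)
    n_rows, n_cols = rf_blocks.shape[:2]
    for m in range(n_rows):
        for n in range(n_cols):
            dx, dy = mvs[m][n]
            top = m * block + dy
            left = n * block + dx
            # Clip blocks that are shifted partially outside the frame.
            t0, l0 = max(top, 0), max(left, 0)
            t1 = min(top + block, frame_shape[0])
            l1 = min(left + block, frame_shape[1])
            if t1 > t0 and l1 > l0:
                mc_rf[t0:t1, l0:l1] = rf_blocks[m, n][t0 - top:t1 - top,
                                                      l0 - left:l1 - left]
    return mc_rf
```

When two blocks land on the same area, the second paste overwrites the first, mirroring the occlusion rule for the regions X and Y in Fig. 11(b).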
4.5 CGHs generation for the GFs and reconstruction of 3-D objects
4.5.1 Overall procedure for generation of CGHs for the GFs
4.5.2 Generation of the CGH pattern for the MC-RF
As shown in Fig. 14, the instantaneous CGH pattern for the MC-RF can be generated just by shifting the B-CGHs for each block of the RF according to the estimated MVs. For instance, Figs. 9(a) and 9(b) show 9 blocks of the RF and their estimated MVs, respectively, and Fig. 15 shows the shifting process of the B-CGHs of the RF with these estimated MVs: (a), (d) and (g) represent the B-CGHs for the blocks A, B and I; (b), (e) and (h) show their shifted versions with the corresponding MVs; (c), (f) and (i) show their shifted versions compensated with the hologram patterns for the blank areas; and (j) shows the CGH for the MC-RF obtained by adding all shifted B-CGHs.
That is, for the generation of the instantaneous CGH pattern for the MC-RF, the B-CGHs for each block of the RF require a two-step process. In the first step, the B-CGHs for each block of the RF are shifted according to their estimated MVs. Then, in the second step, the hologram patterns for the blank areas that occur due to the shifting process are calculated, and with these the shifted B-CGHs are compensated. After the shifting and compensation operations of the B-CGHs for each block of the RF, all shifted and compensated versions of the B-CGHs are summed up together to obtain the instantaneous CGH pattern for the whole MC-RF, as shown in
Eq. (9).
In Eq. (9), I_{S}(x, y), I_{Rm}(x, y) and I_{RBm}(x, y) represent the instantaneous CGH obtained for the MC-RF, the shifted version of the B-CGH for the m^{th} block of the RF with the corresponding MVs, and the hologram pattern for the blank area of the m^{th} block, respectively. M denotes the total number of blocks, and d_{x} and d_{y} represent the MVs along the x and y directions.
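Based on the definitions in this paragraph, Eq. (9) can be reconstructed as the following sketch (the original typeset equation is not reproduced in this copy), where each B-CGH I_{Rm} is shifted by its block's MVs (d_{x}, d_{y}) and I_{RBm} fills the blank area vacated by the m^{th} block:

```latex
I_{S}(x,y) \;=\; \sum_{m=1}^{M}\Big[\, I_{Rm}\!\left(x-d_{x},\, y-d_{y}\right) \;+\; I_{RBm}(x,y) \,\Big]
```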
4.5.3 Compensation of hologram patterns for the overlapped regions
The instantaneous CGH pattern I_{S}(x, y) given by Eq. (9) cannot represent the final CGH version calculated for the MC-RF because some overlapped regions among the shifted blocks happen to occur during the motion-compensation process. In a region overlapped by several shifted blocks, as many different pixel values as there are overlapped blocks exist together; therefore, the pixel values for these overlapped regions are to be determined by those of the last-arriving blocks.
For instance, Fig. 11(b) shows two overlapped areas between the motion-compensated blocks of Fig. 11(a). ‘X’ and ‘Y’ in Fig. 11(b) represent the intersected areas between the two blocks ‘B’ and ‘D’, and ‘C’ and ‘E’, respectively. Here, the final pixel values for the overlapped areas ‘X’ and ‘Y’ are set to those of the last-arriving blocks ‘D’ and ‘E’, respectively. However, this process may also cause some object points, coming from the two blocks ‘B’ and ‘C’ into their corresponding overlapped areas, to be removed.
Therefore, the final CGH for the MC-RF, I_{C}(x, y), can be obtained by subtracting the hologram patterns for these removed object points located on each overlapped area from the instantaneous CGH pattern calculated for the MC-RF of Eq. (9), as given by Eq. (10), in which I_{RO} represents the CGH pattern for the overlapped areas, generated with the NLUT.
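From the description above, Eq. (10) can be reconstructed in the following form (again a sketch from the surrounding text, not the original typeset equation):

```latex
I_{C}(x,y) \;=\; I_{S}(x,y) \;-\; I_{RO}(x,y)
```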
4.5.4 Generation of CGH patterns for the GFs
Right after the final CGH pattern for the MC-RF of Eq. (10) is obtained, the hologram pattern for the difference image between the MC-RF and the GF shown in Fig. 13 is calculated, with which the CGH pattern for the MC-RF is finally compensated. That is, the hologram pattern for the object points in the difference image between the MC-RF and the GF, which is called I_{D}(x, y) as shown in Fig. 14, is calculated. Then, the CGH pattern for the GF, I(x, y), can be obtained just by adding the hologram pattern for the difference image, I_{D}(x, y), to the final CGH pattern for the MC-RF, I_{C}(x, y), as shown in Eq. (11).
Here in this paper, the CGH patterns with a resolution of 1,400 × 1,400 pixels, in which each pixel of the hologram pattern is 10 μm × 10 μm, are generated by using the intensity and depth data of the test 3-D videos of Fig. 7. Moreover, the viewing distance and the discretization steps in the horizontal and vertical directions are set to be 600 mm and 30 μm, respectively.
Accordingly, to fully display the hologram fringe patterns, the PFP must be shifted by 1,536 pixels (512 × 3 pixels = 1,536 pixels) horizontally and vertically [7]. Thus, the total resolution of the PFP becomes 2,936 (1,400 + 1,536) × 2,936 (1,400 + 1,536) pixels.
4.5.5 Reconstruction of 3-D objects from the CGHs
5. Performance analysis of the proposed method
For the ‘Case I’ scenario, in which a ‘Car’ object moves fast along the pathway from the left to the right, the differences in the object points between the previous and current frames were set to be larger than 50% of the object points of the current frames, so that the numbers of calculated object points for these difference images get much larger than those of the current frames. In other words, the numbers of object points to be calculated in the TR-NLUT and MC-NLUT must be increased much more than that of the original NLUT in cases where there are more than 50% image differences between consecutive video frames.
As seen in Fig. 17(a) and Table 1, the average number of calculated object points of the original NLUT is calculated to be 7,462, whereas those of the TR-NLUT and MC-NLUT are estimated to be 12,009 and 9,896, respectively. That is, instead of being decreased, the average numbers of calculated object points of the TR-NLUT and MC-NLUT are increased by 60.94% and 32.63%, respectively, compared to the original NLUT.
On the other hand, in the proposed MPEG-NLUT method, the test video frames of ‘Case I’ are grouped into a set of GOPs, in which each GOP is composed of one RF and three GFs. Here, the CGHs for the RFs in each GOP are calculated with the TR-NLUT. That is, the B-CGHs for each block of the RF of a GOP are computed by combined use of the corresponding B-CGHs of the previous GOP and the TR-NLUT. However, the CGHs for the three other GFs are generated by combined use of the CGH for the MC-RF and the CGHs for the difference images between the MC-RF and each of the GFs. Therefore, in the proposed method, the numbers of calculated object points for the three GFs can be greatly reduced because the object’s motions between the RF and each of the GFs are precisely compensated by the block-based motion estimation and compensation process, which signifies that the difference images between the MC-RF and each of the GFs can be kept very small.
For ‘Case I’, the average number of calculated object points of the proposed method is estimated to be 5,262, which is the smallest number compared to those of the conventional methods mentioned above. That is, the average number of calculated object points of the proposed method has been reduced by 29.48%, 56.19% and 46.83% compared to the conventional NLUT, TR-NLUT and MC-NLUT methods, respectively.
Furthermore, the average calculation times are estimated to be 4.717 ms, 6.868 ms, 5.382 ms and 3.749 ms for the conventional NLUT, TR-NLUT, MC-NLUT and proposed MPEG-NLUT methods, respectively, as seen in Fig. 17(d) and Table 1. In other words, the average calculation time of the proposed method has been reduced by 20.51%, 45.41% and 30.34% compared to the conventional NLUT, TR-NLUT and MC-NLUT methods, respectively.
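The percentage reductions quoted for ‘Case I’ follow directly from the Table 1 averages; the short spot-check below (values copied from the text) reproduces them to within a few hundredths of a percentage point of rounding.

```python
def reduction_pct(baseline, proposed):
    """Percentage reduction of `proposed` relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# 'Case I' averages quoted in the text: object points and calculation times.
points = {"NLUT": 7462, "TR-NLUT": 12009, "MC-NLUT": 9896, "MPEG-NLUT": 5262}
times_ms = {"NLUT": 4.717, "TR-NLUT": 6.868, "MC-NLUT": 5.382, "MPEG-NLUT": 3.749}

for name in ("NLUT", "TR-NLUT", "MC-NLUT"):
    print(name,
          round(reduction_pct(points[name], points["MPEG-NLUT"]), 2),
          round(reduction_pct(times_ms[name], times_ms["MPEG-NLUT"]), 2))
```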
For ‘Case II’, since the two ‘Car’ objects move fast in opposite directions without any overlapped regions, the number of calculated object points and the image differences between two consecutive video frames are almost doubled compared to those of ‘Case I’, as shown in Fig. 17(b), which means the image differences between two consecutive video frames become much larger than 50% of the object points of the current frame.
As mentioned above, the two ‘Car’ objects move fast in opposite directions, so that motion vectors with different directions exist in ‘Case II’. However, the motion vectors of one ‘Car’ object may not be affected by those of the other because there is no overlapped region between them. Therefore, for ‘Case II’, the total number of calculated object points is almost doubled compared to ‘Case I’, as shown in Figs. 17(a) and 17(b) and Table 1.
Here, for ‘Case II’, the average number of calculated object points of the proposed method is calculated to be 10,355, whereas those of the conventional NLUT, TR-NLUT and MC-NLUT methods are estimated to be 14,217, 23,248 and 18,550, respectively. Thus, the average number of calculated object points of the proposed method is reduced by 27.17%, 55.46% and 44.18% compared to those of the NLUT, TR-NLUT and MC-NLUT methods, respectively.
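These reductions can be reproduced from the per-method averages (the percentages agree with the stated values to within rounding):

```python
# Average calculated object points per frame for 'Case II', from Table 1
points = {"NLUT": 14_217, "TR-NLUT": 23_248, "MC-NLUT": 18_550}
proposed = 10_355

for name, baseline in points.items():
    pct = (1 - proposed / baseline) * 100
    print(f"{name}: {pct:.2f}% fewer points than the proposed method uses")
```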
Moreover, the average calculation times are estimated to be 4.766 ms, 7.021 ms, 5.497 ms and 3.786 ms for the conventional NLUT, TR-NLUT, MC-NLUT and proposed MPEG-NLUT methods, respectively, as shown in Fig. 17(e) and Table 1. That is, the average calculation time of the proposed method is reduced by 20.55%, 46.07% and 31.12% compared to those of the conventional NLUT, TR-NLUT and MC-NLUT methods, respectively.
On the other hand, for ‘Case III’, the two ‘Car’ objects move fast in opposite directions with some overlapping regions; the only difference between ‘Case II’ and ‘Case III’ is whether overlapping regions between the two objects exist. In ‘Case III’, the objects overlap from the 30th to the 112th frame, so the motion vectors of one ‘Car’ object may be affected by those of the other in the overlapped blocks.
As seen in Fig. 17(c) and Table 1, the number of calculated object points decreases from the 30th to the 75th frame in all methods because the overlapping region between the two ‘Car’ objects grows, and then increases again from the 75th to the 112th frame as the overlapping region shrinks.
Now, for ‘Case III’, the average number of calculated object points of the proposed method is calculated to be 10,293, whereas those of the NLUT, TR-NLUT and MC-NLUT methods are estimated to be 13,982, 22,912 and 19,273, respectively. Thus, the average number of calculated object points of the proposed method is reduced by 26.38%, 55.07% and 46.59% compared to those of the NLUT, TR-NLUT and MC-NLUT methods, respectively.
In addition, the average calculation times are estimated to be 4.697 ms, 6.760 ms, 5.519 ms and 3.826 ms for the conventional NLUT, TR-NLUT, MC-NLUT and proposed MPEG-NLUT methods, respectively, as shown in Fig. 17(f) and Table 1. That is, the average calculation time of the proposed method is reduced by 18.56%, 43.41% and 30.68% compared to those of the conventional NLUT, TR-NLUT and MC-NLUT methods, respectively.
In summary, the overall results of Figs. 17(a)–17(c) and Table 1 show that, compared to the original NLUT method, the average numbers of calculated object points of the TR-NLUT and MC-NLUT methods are increased by 60.94%, 63.52% and 63.87%, and by 32.63%, 30.48% and 37.84%, respectively, whereas those of the proposed method are decreased by 29.48%, 27.17% and 26.38% for ‘Case I’, ‘Case II’ and ‘Case III’, respectively.
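The quoted increases of the TR-NLUT and MC-NLUT methods over the original NLUT can likewise be checked from the per-case averages (here for the two cases whose NLUT baselines appear in the text):

```python
# Average calculated object points per frame, from Table 1
nlut    = {"Case II": 14_217, "Case III": 13_982}
tr_nlut = {"Case II": 23_248, "Case III": 22_912}
mc_nlut = {"Case II": 18_550, "Case III": 19_273}

def increase_pct(baseline, value):
    """Percentage increase of `value` over `baseline`."""
    return (value / baseline - 1) * 100

for case in ("Case II", "Case III"):
    print(case,
          f"TR-NLUT +{increase_pct(nlut[case], tr_nlut[case]):.2f}%",
          f"MC-NLUT +{increase_pct(nlut[case], mc_nlut[case]):.2f}%")
```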
Table 1 also shows the detailed processing times for motion estimation, motion compensation and CGH calculation in the proposed method for each of ‘Case I’, ‘Case II’ and ‘Case III’. The motion estimation, motion compensation and CGH calculation times are found to be 0.219 ms, 0.092 ms and 3.438 ms for ‘Case I’; 0.118 ms, 0.094 ms and 3.575 ms for ‘Case II’; and 0.118 ms, 0.093 ms and 3.614 ms for ‘Case III’. Thus, the preprocessing times for motion estimation and compensation account for only 8.30%, 5.60% and 5.52% of the total calculation time for ‘Case I’, ‘Case II’ and ‘Case III’, respectively.
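The preprocessing shares follow from the per-stage times; a short check (values agree with the stated percentages to within rounding):

```python
# (motion estimation, motion compensation, CGH calculation) in ms, per case
times = {
    "Case I":   (0.219, 0.092, 3.438),
    "Case II":  (0.118, 0.094, 3.575),
    "Case III": (0.118, 0.093, 3.614),
}

for case, (me, mc, cgh) in times.items():
    share = (me + mc) / (me + mc + cgh) * 100
    print(f"{case}: preprocessing is {share:.2f}% of the total time")
```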
Finally, these experimental results confirm that the proposed method significantly reduces the number of calculated object points and the resultant calculation time required for generating video holograms of fast-moving 3-D objects compared to the conventional NLUT, TR-NLUT and MC-NLUT methods. In particular, the conventional TR-NLUT and MC-NLUT methods show no improvement in computational speed when the 3-D objects move fast enough that the image differences between two consecutive video frames exceed 50% of the object points of the current frame.
In addition, real-time generation of video holograms of fast-moving 3-D objects with the proposed MPEG-NLUT method on graphics processing units (GPUs) may be expected, because recent results of a GPU-based NLUT system have shown a 300-fold enhancement of the computational speed compared to the conventional central-processing-unit (CPU)-based approach [
16. M.-W. Kwon, S.-C. Kim, and E.-S. Kim, “Graphics processing unit-based implementation of a one-dimensional novel-look-up-table for real-time computation of Fresnel hologram patterns of three-dimensional objects,” Opt. Eng. 53(3), 035103 (2014). [CrossRef]
]. This means the calculation time for one video frame could be reduced to the order of milliseconds.