OSA's Digital Library

Biomedical Optics Express

Biomedical Optics Express

  • Editor: Joseph A. Izatt
  • Vol. 3, Iss. 12 — Dec. 1, 2012
  • pp: 3223–3230
« Show journal navigation

Accelerating mesh-based Monte Carlo method on modern CPU architectures

Qianqian Fang and David R. Kaeli  »View Author Affiliations


Biomedical Optics Express, Vol. 3, Issue 12, pp. 3223-3230 (2012)
http://dx.doi.org/10.1364/BOE.3.003223


View Full Text Article

Acrobat PDF (1049 KB)





Browse Journals / Lookup Meetings

Browse by Journal and Year


   


Lookup Conference Papers

Close Browse Journals / Lookup Meetings

Article Tools

Share
Citations

Abstract

In this report, we discuss the use of contemporary ray-tracing techniques to accelerate 3D mesh-based Monte Carlo photon transport simulations. Single Instruction Multiple Data (SIMD) based computation and branch-less design are exploited to accelerate ray-tetrahedron intersection tests and yield a 2-fold speed-up for ray-tracing calculations on a multi-core CPU. As part of this work, we have also studied SIMD-accelerated random number generators and math functions. The combination of these techniques achieved an overall improvement of 22% in simulation speed as compared to using a non-SIMD implementation. We applied this new method to analyze a complex numerical phantom and both the phantom data and the improved code are available as open-source software at http://mcx.sourceforge.net/mmc/.

1. Introduction

Near-infrared (NIR) light with wavelength between 600 and 1000 nm can penerate deep into the tissue [1

1. D. A. Boas, D. H. Brooks, E. L. Miller, C. A. DiMarzio, M. Kilmer, R. J. Gaudette, and Q. Zhang, “Imaging the body with diffuse optical tomography,” IEEE Signal Proc. Mag. 18, 57–75 (2001). [CrossRef]

], owning primarily to the relatively low optical absorption of human tissue chromorphores, namely oxy/deoxy-hemoglobin, lipids and water. This remarkable characteristic makes NIR light a suitable candidate to probe deep tissue physiology such as angiogenesis and oxygen metabolism in a safe and non-invasive manner. Substantial efforts have been made in the past decade to utilize NIR imaging to facilitate the investigation of breast cancer and human brain functions, monitoring therapy responses, and a range of other applications [2

2. A. P. Gibson, J. C. Hebden, and S. R. Arridge, “Recent advances in diffuse optical imaging,” Phys. Med. Biol. 50, R1–R43 (2005). [CrossRef] [PubMed]

]. A notable challenge for NIR imaging is the high scattering within the tissue. The transport of low-energy NIR photons is highly nonlinear and exhibits a complex and diffusive pattern. Accurate and efficient numerical methods play an essential role in NIR-based techniques such as diffuse optical tomography (DOT) and fluorescence molecular tomography (FMT).

The Monte Carlo (MC) method is a numerical technique offering both generality and accuracy in modeling photon transport inside human tissue, and often serves as the gold standard in biophotonic applications. After 20 years of evolution, biophotonic MC algorithms have become increasingly more efficient and capable in handling complex tissue structures. One of the first MC codes, MCML [3

3. L. H. Wang, S. L. Jacques, and L. Q. Zheng, “MCML - Monte Carlo modeling of light transport in multilayered tissues,” Comput. Meth. Prog. Bio. 47, 131–146 (1995). [CrossRef]

], can only simulate problems within layered infinite media. A more general approach was proposed by Boas et al. for 3D heterogeneous domains using vox-elated representations [4

4. D. A. Boas, J. Culver, J. Stott, and A. Dunn, “Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head,” Opt. Express 10, 159–170 (2002). [PubMed]

, 5

5. Q. Fang and D. A. Boas, “Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing,” Opt. Express 17, 20178–20190 (2009). [CrossRef] [PubMed]

]. However, to achieve higher accuracy near curved boundaries in a voxelated domain, one has to increase the global density of the grid resulting in a significant increase in memory costs and execution time. Over the past few years, several mesh-based approaches have been pursued to obtain better accuracy in modelling complex tissue boundaries. In [6

6. H. Li, J. Tian, F. Zhu, W. Cong, L. V. Wang, E. A. Hoffman, and G. Wang, “A mouse optical simulation environment (MOSE) to investigate bioluminescent phenomena in the living mouse with the monte carlo method,” Academ. Radiol. 11, 1029–1038 (2004). [CrossRef]

], Li et al. reported a Mouse Optical Simulation Environment (MOSE) that allows one to define 3D heterogeneous structures using geometric primitives. Margallo-Balbás et al. [7

7. E. Margallo-Balbäs and P. J. French, “Shape based Monte Carlo code for light transport in complex heterogeneous Tissues,” Opt. Express 15, 14086–14098 (2007). [CrossRef] [PubMed]

] and Ren et al. [8

8. N. Ren, J. Liang, X. Qu, J. Li, B. Lu, and J. Tian, “GPU-based Monte Carlo simulation for light propagation in complex heterogeneous tissues,” Opt. Express 18, 6811–6823 (2010). [CrossRef] [PubMed]

] have reported methods using triangular surface meshes and ray-surface intersection calculations to propagate photons in a piece-wise-constant domain. This approach achieves better accuracy near tissue boundaries, however, the speed of the ray-surface intersection tests depends on the partitioning scheme of the surfaces [9

9. C. Wächter, “Quasi-Monte Carlo light transport simulation by efficient ray tracing,” Ph.D. dissertation (Ulm University, Ulm, Germany, 2007).

] and the traversal of the data structure often introduces significant computational overhead [10

10. I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland Univ., Saarbrücken, Germany, 2004).

].

In the past year, two new MC algorithms were proposed to perform photon transport simulations using a 3D volumetric mesh. These algorithms include the Tetrahedron-based Inhomogeneous MC Optical Simulator (TIM-OS) by Shen & Wang [11

11. H. Shen and G. Wang, “A tetrahedron-based inhomogeneous Monte Carlo optical simulator,” Phys. Med. Biol. 55, 947–962 (2010). [CrossRef] [PubMed]

] and the Mesh-based MC Method (MMCM) by Fang [12

12. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1, 165–175 (2010). [CrossRef] [PubMed]

]. It has been shown that both methods pursue a similar approach, but differ slightly in the ray-tracing formulation used [13

13. Q. Fang, “Comment on ‘A study on tetrahedron-based inhomogeneous Monte-Carlo optical simulation’,” Biomed. Opt. Express 2, 1258–1264 (2011). [CrossRef] [PubMed]

]. Compared to surface-based approaches, the use of a tetrahedral mesh leads to significant savings in the overhead of testing ray-triangle intersections. On average, only 2.5 ray-triangle intersection tests are required per step in MMCM and four for TIM-OS. Combined with the recent advances in image-based mesh generation methods [14

14. CGAL Editorial Board, CGAL User and Reference Manual, 3rd ed. (2009).

],[15

15. Q. Fang and D. A. Boas, “Tetrahedral mesh generation from volumetric binary and grayscale images,” in IEEE International Symposium on Biomedical Imaging: from Nano to Macro, 2009. ISBI ’09 (IEEE, 2009), pp. 1142–1145. [CrossRef]

], the development of these new mesh-based techniques provides a robust path to fast and accurate modeling of photon propagation inside complex anatomical structures.

Ray-tracing techniques are at the heart of both mesh-based MC techniques. It is natural to hypothesize that by incorporating the latest ray-tracing methods developed by the computer graphics community, one can further advance mesh-based MC simulations. In particular, several Single Instruction Multiple Data (SIMD)-based methods have been proposed recently, including Ward’s method [10

10. I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland Univ., Saarbrücken, Germany, 2004).

] and Shevtsov’s method [16

16. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007 (2007), Vol. 11, pp. 33–39.

]. Recently, Havel and Herout proposed a hybrid approach (referred to as H&H hereafter) that effectively combines both methods with an efficient Streaming SIMD Extensions (SSE) implementation [17

17. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Visualiz. Comput. Graphics 16, 434–438 (2010). [CrossRef]

]. Moreover, in [16

16. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007 (2007), Vol. 11, pp. 33–39.

], Shevtsov et al. described a branch-less ray-tracing strategy to take advantage of the pipelining mechanism in modern CPU architectures. Incorporating these innovations into biophotonic MC simulation is expected to improve overall speed. Our main interest in this report is to discuss the implementation and comparison of four ray-tracing techniques: a Plücker-based ray-tracer, a Badouel-based ray-tracer and their SSE counterparts, in a mesh-based MC simulation.

The contribution of this work is to demonstrate the use of compteporary ray-tracing techniques for efficient MC simulations on a modern CPU processor. We highlight the ray-tracing nature of biophotonic MC simulations, and suggest that a thorough investigation to the ray-tracing technique state-of-the-art can not only guide the performance improvement of many existing MC algorithms, but also provide inspirations for the future generations of MC algorithms. In the remaining sections, we first introduce the general structure of a mesh-based Monte Carlo photon simulation. We subsequently examine the computational bottlenecks present in non-SSE implementations with a profiling analysis. Using the knowledge learned from the profiling study, we improve the simulation performance by incorporating SSE in the ray-tracing, random number generation and the calculations of math functions. We then examine the improvement in simulation speed by re-profiling the SSE-enabled codes and report the improved runtimes. We conclude with a brief discussion on limitations and future prospects.

2. Methods

2.1. Mesh-based Monte Carlo Method

In a mesh-based MC simulation, the problem domain is first discretized using a tetrahedral mesh (or other shapes [12

12. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1, 165–175 (2010). [CrossRef] [PubMed]

]). This is followed by the launching of a large number of photon packets each initialized with a unitary weight. Each simulated photon packet enters the domain at a specified source location p, enclosed by a tetrahedron e0, along an initial directional vector, v. Position p and vector v define a 3D “ray”. For each movement, we calculate a random scattering length d along v based on the local scattering coefficient (μs) inside e0 [3

3. L. H. Wang, S. L. Jacques, and L. Q. Zheng, “MCML - Monte Carlo modeling of light transport in multilayered tissues,” Comput. Meth. Prog. Bio. 47, 131–146 (1995). [CrossRef]

]. This ray is then tested against the four triangular faces of e0 using a ray-triangle intersection test [9

9. C. Wächter, “Quasi-Monte Carlo light transport simulation by efficient ray tracing,” Ph.D. dissertation (Ulm University, Ulm, Germany, 2007).

] to determine the exiting point px. If ||ppx|| ≤ d, we move the photon to px by setting ppx and update e0 by the index of the neighboring element at the exiting triangle, and repeat the above ray-tetrahedron test. If ||ppx|| > d, we then move the photon to the end of the scattering path by pp + vd, and start a new scattering event. To do so, we generate 1) an exponentially-distributed random scattering length d, 2) a [0, 2π] uniformly-distributed random azimuth angle, and 3) a random elevation angle based on the Henyey-Greenstein phase function [3

3. L. H. Wang, S. L. Jacques, and L. Q. Zheng, “MCML - Monte Carlo modeling of light transport in multilayered tissues,” Comput. Meth. Prog. Bio. 47, 131–146 (1995). [CrossRef]

, 4

4. D. A. Boas, J. Culver, J. Stott, and A. Dunn, “Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head,” Opt. Express 10, 159–170 (2002). [PubMed]

]. With the updated propagation vector v and d, we continue to propagate the photon until it exits the domain. As the photon moves along its trajectory, the photon packet weight is attenuated exponentially based on the Beer-Lambert law [1

1. D. A. Boas, D. H. Brooks, E. L. Miller, C. A. DiMarzio, M. Kilmer, R. J. Gaudette, and Q. Zhang, “Imaging the body with diffuse optical tomography,” IEEE Signal Proc. Mag. 18, 57–75 (2001). [CrossRef]

] using the absorption coefficient (μa) of each enclosing element. The weight-loss is deposited into the mesh using the Barycentric coordinates calculated from the above ray-tracing step [12

12. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1, 165–175 (2010). [CrossRef] [PubMed]

].

When a photon hits a boundary between two media with different refractive indices, a reflection coefficient, R, is calculated based on Fresnel’s law. A uniformly distributed random number, s ∈ [0, 1], is selected. If s > R, we transmit the photon across the boundary and update v by the refraction direction. Otherwise, the photon stays in the current medium and propagates along a reflected direction. When a photon passes through an external surface of the mesh, we terminate the photon and launch a new one. At the end of the simulation, we output the volumetric flux (or fluence) accumulated over the mesh by all simulated photon packets and save the distances traveled by the exiting photons inside each tissue type (referred to as the partial path-lengths). With the path-length information, one can quickly re-calculate the optical intensity at the detectors without repeating the entire simulation [4

4. D. A. Boas, J. Culver, J. Stott, and A. Dunn, “Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head,” Opt. Express 10, 159–170 (2002). [PubMed]

].

2.2. Computational bottlenecks and profiling analysis

We have published an open-source C implementation of the above algorithm, namely mmc. Independently, Shen and Wang have published a C++ software, timos [11

11. H. Shen and G. Wang, “A tetrahedron-based inhomogeneous Monte Carlo optical simulator,” Phys. Med. Biol. 55, 947–962 (2010). [CrossRef] [PubMed]

]. By comparison [13

13. Q. Fang, “Comment on ‘A study on tetrahedron-based inhomogeneous Monte-Carlo optical simulation’,” Biomed. Opt. Express 2, 1258–1264 (2011). [CrossRef] [PubMed]

], mmc uses a Plücker-coordinates-based ray-tracing scheme for generality while timos uses a simplified Badouel algorithm [18

18. D. Badouel, Graphics Gems (Academic, 1990).

] for better speed. In both cases, the simulations are accelerated using multi-threaded computing [12

12. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1, 165–175 (2010). [CrossRef] [PubMed]

, 11

11. H. Shen and G. Wang, “A tetrahedron-based inhomogeneous Monte Carlo optical simulator,” Phys. Med. Biol. 55, 947–962 (2010). [CrossRef] [PubMed]

] but neither of the published codes were optimized with SIMD.

Here, we perform a profiling analysis to identify the computational bottlenecks in the non-SSE mmc implementation. An open-source profiler, Callgrind, is picked for its availability and performance. The benchmark details are described in Section 3. Running the profiler, we extract the run-time for each function of the simulation code and generate a run-time-sensitive call-graph using a visualization tool, Kcachgrind. The resulting call-graph is shown in Fig. 1(a). Each block in Fig. 1(a) represents a C-function corresponding to each step of the simulation ( sincos is a math function to produce a pair of sine and cosine simultaneously). The size of the block is proportional to the execution time spent within the function. We report the percentage of the run-time for each function by normalizing with the total run-time. Note that the result here is for a single CPU core as the profiler is restricted to run with a single thread.

Fig. 1 Subroutine-level profiling for mesh-based MC simulation (a) without SSE, (b) with Havel & Herout’s SSE-based algorithm, and (c) with additional SSE-enabled math library and RNG. The area and percentage in each block represent the run-time for each subroutine normalized by the total run-time of case (a). If a block is enclosed by another, its percentage contributes to that of the enclosing block. Profiling is performed within a single thread.

2.3. SSE-accelerated ray-tetrahedron intersection tests

For the branch-less Baudoul’s method, we use a Structure-of-Arrays (SoA) format [10

10. I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland Univ., Saarbrücken, Germany, 2004).

] to organize the face norm data in the memory. For example, instead of defining N = {Nx, Ny, Nz, d} as in [17

17. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Visualiz. Comput. Graphics 16, 434–438 (2010). [CrossRef]

], we define Nx={Nx1,Nx2,Nx3,Nx4} where Nxi(i=1,,4) is the Nx component of the i-th face in a tetrahedron, and so on. With this approach, a single SSE command can advance one operation for four triangular surfaces simultaneously. An SSE implementation without the SoA memory layout was found to provide almost no acceleration.

2.4. Accelerating RNG and math functions with SSE

To accelerate the calculation of the math functions ( log, sincos), we leveraged a free math library, SSEMath by Julien Pommier [20

20. J. Pommier, “Simple SSE and SSE2 optimized sin, cos, log and exp” (2007), http://gruntthepeon.free.fr/ssemath/.

]. In this library, each call to the math function produces four values simultaneously with SSE commands. To utilize this fast math library in the photon scattering module, we created a static buffer with a length multiple of four and filled the buffer with random numbers generated by the SFMT RNG. We subsequently call the SSE version of the log and sincos functions to compute four function values simultaneously. Each time the MC simulator requests a random function, we pop a precomputed value from the buffer. When a buffer is used up, we refill the buffer by repeating the above process.

3. Results

In this section, we profile the mesh-based Monte Carlo code with different ray-tracing algorithms and RNG/math configurations. From the profiling output, we quantitatively compare the proposed algorithms to demonstrate the benefits of using SSE acceleration.

For this purpose, we build a benchmark problem using the widely-used Digimouse atlas [21

21. B. Dogdas, D. Stout, A. Chatziioannou, and R. M. Leahy, “Digimouse: a 3D whole body mouse atlas from CT and cryosection data,” Phys Med Biol. 52, 577–87 (2007). [CrossRef] [PubMed]

]. From the segmented digital mouse CT image, we create a 3D tetrahedral mesh using an image-based mesh generator, iso2mesh [15

15. Q. Fang and D. A. Boas, “Tetrahedral mesh generation from volumetric binary and grayscale images,” in IEEE International Symposium on Biomedical Imaging: from Nano to Macro, 2009. ISBI ’09 (IEEE, 2009), pp. 1142–1145. [CrossRef]

], and utilities based on the CGAL library [14

14. CGAL Editorial Board, CGAL User and Reference Manual, 3rd ed. (2009).

]. As seen in Fig. 2, the resulting mesh contains 42,301 nodes and 209,907 tetrahedral elements, including 21 distinct tissue regions according to the segmentation. The mesh data, script, and optical properties of each tissue type [22

22. W. F. Cheong, S. A. Prahl, and A. J. Welch, “A review of the optical properties of biological tissues,” IEEE J. Quantum Electronics 26, 2166–2185 (1990). [CrossRef]

] are provided online on our website [12

12. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1, 165–175 (2010). [CrossRef] [PubMed]

]. A pencil-beam point source is positioned on the skin above the right-hemisphere of the brain. All simulations are performed on a Dell workstation running 64bit Ubuntu Linux 10.04. The computer has a quad-core Intel E5520 CPU. The software was compiled with gcc 4.4.3 with -O3and -msse4 options. We launched a total of 106 photons to produce the profiling results using Callgrind.

Fig. 2 A tetrahedral mesh generated from the Digimouse atlas. Different colors represent different tissue types.

In Fig. 1(b), we report the performance benefits of replacing the non-SSE Plücker-based ray-tracer with the H&H’s SSE-based ray-tracer. The area of each box is proportional to its run-time. The width of the largest box in Fig. 1(b) is scaled by matching the area of the next_scatter block with that in Fig. 1(a), given that RNG and math functions are not changed in the two cases.

Next, we evaluate the improvement obtained from the SSE-accelerated math functions and the added SFMT RNG in the code. The updated performance results are shown in Fig. 1(c). Again, the size of each box is proportional to its run-time and the area for havel_raytet is set to match that in Fig. 1(b).

In addition to the subroutine-level profiling analysis, in Table 1, we summarize total runtimes for the ray-tracing calculation and the total simulation time across various combinations of ray-tracers and SSE-enabled techniques. In each case, we simulate 107 photons with four CPU cores running in parallel. Note that the two Badouel-based algorithms perform less computation than the other two algorithms at a cost of reduced accuracy.

Table 1. Simulation run-times (in seconds) for various ray-tracers.

table-icon
View This Table

4. Discussion

From Table 1, it is clear that the SSE-accelerated ray-tracing techniques make measurable improvements in the overall simulation speed. The total run-time is reduced by 22% for SSE-based H&H’s method compared to the Plücker’s method. For the simplified ray-tracing using the Badouel’s method, the branch-less algorithm reduces the simulation time by 26% by improving pipelining on a modern CPU. The acceleration of the ray-tracing calculation alone is 2X when compared to H&H with Plücker-based methods; comparing the branch-less and branched Badouel’s methods, the speedup is 1.3X. From the profiling results shown in Fig. 1(b) and (c), we see that the SSE-enabled RNG and math functions have accelerated the photon scattering calculations by 2.7X. Better computational efficiency was clearly achieved by simultaneously exploiting the multiple processor cores, the SIMD capability, and pipelining available in modern CPU processors. It is interesting to note that the speed-up ratio calculated by the profiling data in Fig. 1 is over-estimated compared to that derived from Table 1. We believe this is due to the fact that the profiler can only utilize a single CPU thread, thus, is a result of variable overhead when running single-threaded versus multi-threaded. Nonetheless, the relative run-times of the subroutines can provide clear insight and help guide us during code optimization.

In summary, we have investigated the potential of accelerating biophotonic Monte Carlo simulations on modern CPUs. We have used SSE constructs to accelerate ray-tracing calculations, as well as key math functions and random number generators. With these techniques, we were able to speed up the ray-tracing calculation by a factor of two, and improve the overall speed by 22 to 26%. The updated source code and benchmark dataset is provided on our website at http://mcx.sourceforge.net/mmc/. The software can not only take advantage of multi-core processors on a symmetric multiprocessor (SMP) system, but also has the potential to gain massive acceleration when running on a software distributed shared memory (SDSM) cluster, either locally or via cloud computing. The proposed techniques should also scale gracefully when many-core CPU chipsets become available on the market.

Acknowledgments

QF wishes to thank the funding support from the Bill & Melinda Gates Foundation ( #OPP1035992).

References and links

1.

D. A. Boas, D. H. Brooks, E. L. Miller, C. A. DiMarzio, M. Kilmer, R. J. Gaudette, and Q. Zhang, “Imaging the body with diffuse optical tomography,” IEEE Signal Proc. Mag. 18, 57–75 (2001). [CrossRef]

2.

A. P. Gibson, J. C. Hebden, and S. R. Arridge, “Recent advances in diffuse optical imaging,” Phys. Med. Biol. 50, R1–R43 (2005). [CrossRef] [PubMed]

3.

L. H. Wang, S. L. Jacques, and L. Q. Zheng, “MCML - Monte Carlo modeling of light transport in multilayered tissues,” Comput. Meth. Prog. Bio. 47, 131–146 (1995). [CrossRef]

4.

D. A. Boas, J. Culver, J. Stott, and A. Dunn, “Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head,” Opt. Express 10, 159–170 (2002). [PubMed]

5.

Q. Fang and D. A. Boas, “Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing,” Opt. Express 17, 20178–20190 (2009). [CrossRef] [PubMed]

6.

H. Li, J. Tian, F. Zhu, W. Cong, L. V. Wang, E. A. Hoffman, and G. Wang, “A mouse optical simulation environment (MOSE) to investigate bioluminescent phenomena in the living mouse with the monte carlo method,” Academ. Radiol. 11, 1029–1038 (2004). [CrossRef]

7.

E. Margallo-Balbäs and P. J. French, “Shape based Monte Carlo code for light transport in complex heterogeneous Tissues,” Opt. Express 15, 14086–14098 (2007). [CrossRef] [PubMed]

8.

N. Ren, J. Liang, X. Qu, J. Li, B. Lu, and J. Tian, “GPU-based Monte Carlo simulation for light propagation in complex heterogeneous tissues,” Opt. Express 18, 6811–6823 (2010). [CrossRef] [PubMed]

9.

C. Wächter, “Quasi-Monte Carlo light transport simulation by efficient ray tracing,” Ph.D. dissertation (Ulm University, Ulm, Germany, 2007).

10.

I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland Univ., Saarbrücken, Germany, 2004).

11.

H. Shen and G. Wang, “A tetrahedron-based inhomogeneous Monte Carlo optical simulator,” Phys. Med. Biol. 55, 947–962 (2010). [CrossRef] [PubMed]

12.

Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1, 165–175 (2010). [CrossRef] [PubMed]

13.

Q. Fang, “Comment on ‘A study on tetrahedron-based inhomogeneous Monte-Carlo optical simulation’,” Biomed. Opt. Express 2, 1258–1264 (2011). [CrossRef] [PubMed]

14.

CGAL Editorial Board, CGAL User and Reference Manual, 3rd ed. (2009).

15.

Q. Fang and D. A. Boas, “Tetrahedral mesh generation from volumetric binary and grayscale images,” in IEEE International Symposium on Biomedical Imaging: from Nano to Macro, 2009. ISBI ’09 (IEEE, 2009), pp. 1142–1145. [CrossRef]

16.

M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007 (2007), Vol. 11, pp. 33–39.

17.

J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Visualiz. Comput. Graphics 16, 434–438 (2010). [CrossRef]

18.

D. Badouel, Graphics Gems (Academic, 1990).

19.

M. Saito and M. Matsumoto, Monte Carlo and Quasi-Monte Carlo Methods 2006 (Springer, 2008).

20.

J. Pommier, “Simple SSE and SSE2 optimized sin, cos, log and exp” (2007), http://gruntthepeon.free.fr/ssemath/.

21.

B. Dogdas, D. Stout, A. Chatziioannou, and R. M. Leahy, “Digimouse: a 3D whole body mouse atlas from CT and cryosection data,” Phys Med Biol. 52, 577–87 (2007). [CrossRef] [PubMed]

22.

W. F. Cheong, S. A. Prahl, and A. J. Welch, “A review of the optical properties of biological tissues,” IEEE J. Quantum Electronics 26, 2166–2185 (1990). [CrossRef]

23.

S. Powell and T. S. Leung, “Highly parallel Monte-Carlo simulations of the acousto-optic effect in heterogeneous turbid media,” J. Biomed. Opt. 17, 045002 (2012). [CrossRef] [PubMed]

OCIS Codes
(170.3660) Medical optics and biotechnology : Light propagation in tissues
(170.5280) Medical optics and biotechnology : Photon migration
(170.7050) Medical optics and biotechnology : Turbid media

ToC Category:
Optics of Tissue and Turbid Media

History
Original Manuscript: July 23, 2012
Revised Manuscript: September 25, 2012
Manuscript Accepted: September 27, 2012
Published: November 12, 2012

Citation
Qianqian Fang and David R. Kaeli, "Accelerating mesh-based Monte Carlo method on modern CPU architectures," Biomed. Opt. Express 3, 3223-3230 (2012)
http://www.opticsinfobase.org/boe/abstract.cfm?URI=boe-3-12-3223


Sort:  Author  |  Year  |  Journal  |  Reset  

References

  1. D. A. Boas, D. H. Brooks, E. L. Miller, C. A. DiMarzio, M. Kilmer, R. J. Gaudette, and Q. Zhang, “Imaging the body with diffuse optical tomography,” IEEE Signal Proc. Mag.18, 57–75 (2001). [CrossRef]
  2. A. P. Gibson, J. C. Hebden, and S. R. Arridge, “Recent advances in diffuse optical imaging,” Phys. Med. Biol.50, R1–R43 (2005). [CrossRef] [PubMed]
  3. L. H. Wang, S. L. Jacques, and L. Q. Zheng, “MCML - Monte Carlo modeling of light transport in multilayered tissues,” Comput. Meth. Prog. Bio.47, 131–146 (1995). [CrossRef]
  4. D. A. Boas, J. Culver, J. Stott, and A. Dunn, “Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head,” Opt. Express10, 159–170 (2002). [PubMed]
  5. Q. Fang and D. A. Boas, “Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing,” Opt. Express17, 20178–20190 (2009). [CrossRef] [PubMed]
  6. H. Li, J. Tian, F. Zhu, W. Cong, L. V. Wang, E. A. Hoffman, and G. Wang, “A mouse optical simulation environment (MOSE) to investigate bioluminescent phenomena in the living mouse with the monte carlo method,” Academ. Radiol.11, 1029–1038 (2004). [CrossRef]
  7. E. Margallo-Balbäs and P. J. French, “Shape based Monte Carlo code for light transport in complex heterogeneous Tissues,” Opt. Express15, 14086–14098 (2007). [CrossRef] [PubMed]
  8. N. Ren, J. Liang, X. Qu, J. Li, B. Lu, and J. Tian, “GPU-based Monte Carlo simulation for light propagation in complex heterogeneous tissues,” Opt. Express18, 6811–6823 (2010). [CrossRef] [PubMed]
  9. C. Wächter, “Quasi-Monte Carlo light transport simulation by efficient ray tracing,” Ph.D. dissertation (Ulm University, Ulm, Germany, 2007).
  10. I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland Univ., Saarbrücken, Germany, 2004).
  11. H. Shen and G. Wang, “A tetrahedron-based inhomogeneous Monte Carlo optical simulator,” Phys. Med. Biol.55, 947–962 (2010). [CrossRef] [PubMed]
  12. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express1, 165–175 (2010). [CrossRef] [PubMed]
  13. Q. Fang, “Comment on ‘A study on tetrahedron-based inhomogeneous Monte-Carlo optical simulation’,” Biomed. Opt. Express2, 1258–1264 (2011). [CrossRef] [PubMed]
  14. CGAL Editorial Board, CGAL User and Reference Manual, 3rd ed. (2009).
  15. Q. Fang and D. A. Boas, “Tetrahedral mesh generation from volumetric binary and grayscale images,” in IEEE International Symposium on Biomedical Imaging: from Nano to Macro, 2009. ISBI ’09 (IEEE, 2009), pp. 1142–1145. [CrossRef]
  16. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007 (2007), Vol. 11, pp. 33–39.
  17. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Visualiz. Comput. Graphics16, 434–438 (2010). [CrossRef]
  18. D. Badouel, Graphics Gems (Academic, 1990).
  19. M. Saito and M. Matsumoto, Monte Carlo and Quasi-Monte Carlo Methods 2006 (Springer, 2008).
  20. J. Pommier, “Simple SSE and SSE2 optimized sin, cos, log and exp” (2007), http://gruntthepeon.free.fr/ssemath/ .
  21. B. Dogdas, D. Stout, A. Chatziioannou, and R. M. Leahy, “Digimouse: a 3D whole body mouse atlas from CT and cryosection data,” Phys Med Biol.52, 577–87 (2007). [CrossRef] [PubMed]
  22. W. F. Cheong, S. A. Prahl, and A. J. Welch, “A review of the optical properties of biological tissues,” IEEE J. Quantum Electronics26, 2166–2185 (1990). [CrossRef]
  23. S. Powell and T. S. Leung, “Highly parallel Monte-Carlo simulations of the acousto-optic effect in heterogeneous turbid media,” J. Biomed. Opt.17, 045002 (2012). [CrossRef] [PubMed]

Cited By

Alert me when this paper is cited

OSA is able to provide readers links to articles that cite this paper by participating in CrossRef's Cited-By Linking service. CrossRef includes content from more than 3000 publishers and societies. In addition to listing OSA journal articles that cite this paper, citing articles from other participating publishers will also be listed.

Figures

Fig. 1 Fig. 2
 

« Previous Article  |  Next Article »

OSA is a member of CrossRef.

CrossCheck Deposited