OSA's Digital Library

Biomedical Optics Express

Biomedical Optics Express

  • Editor: Joseph A. Izatt
  • Vol. 2, Iss. 5 — May. 1, 2011
  • pp: 1258–1264
« Show journal navigation

Comment on A study on tetrahedron-based inhomogeneous Monte-Carlo optical simulation”

Qianqian Fang  »View Author Affiliations


Biomedical Optics Express, Vol. 2, Issue 5, pp. 1258-1264 (2011)
http://dx.doi.org/10.1364/BOE.2.001258


View Full Text Article

Acrobat PDF (709 KB)





Browse Journals / Lookup Meetings

Browse by Journal and Year


   


Lookup Conference Papers

Close Browse Journals / Lookup Meetings

Article Tools

Share
Citations

Abstract

The Monte Carlo (MC) method is a popular approach to modeling photon propagation inside general turbid media, such as human tissue. Progress had been made in the past year with the independent proposals of two mesh-based Monte Carlo methods employing ray-tracing techniques. Both methods have shown improvements in accuracy and efficiency in modeling complex domains. A recent paper by Shen and Wang [Biomed. Opt. Express 2, 44 (2011)] reported preliminary results towards the cross-validation of the two mesh-based MC algorithms and software implementations, showing a 3–6 fold speed difference between the two software packages. In this comment, we share our views on unbiased software comparisons and discuss additional issues such as the use of pre-computed data, interpolation strategies, impact of compiler settings, use of Russian roulette, memory cost and potential pitfalls in measuring algorithm performance. Despite key differences between the two algorithms in handling of non-tetrahedral meshes, we found that they share similar structure and performance for tetrahedral meshes. A significant fraction of the observed speed differences in the mentioned article was the result of inconsistent use of compilers and libraries.

© 2011 OSA

Comment

Both teams had released software implementations, namely timos (binary-only) [10

10. H. Shen and G. Wang, “TIM-OS Project Site,” https://sites.google.com/a/imaging.sbes.vt.edu/tim-os/.

] and mmc (open-source) [11

11. Q. Fang, “Mesh-based Monte Carlo—the software,” http://mcx.sourceforge.net/mmc/.

], based on the algorithms they published (in this comment, we use lower-case names for implementations and capital names for algorithms). A rigorous cross-validation between the two codes could reveal key differences and trade-offs which may benefit the further development of the method. In a recent paper published in Biomedical Optics Express [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

], Shen and Wang took the first step in this direction and reported their preliminary findings comparing the computational speed and accuracy of the two codes, along with two other MC packages, using the binary builds of the software.

Comparison of the ray-tracing algorithms

Given the generalized conclusion in [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

], the authors of the article showed clear interests in exploring the theoretical performance of the two algorithms. However, in the FLOPS (floating-point operations) analysis presented in the Appendix of [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

], the authors used “text-book” formulas [8

8. N. Platis and T. Theoharis, “Fast ray-tetrahedron intersection using Plücker coordinates,” J. Graph. Tools 8(4), 37–48 (2003).

] to compare the ray-tracing kernels in TIM-OS and MMCM. In practice, this may lead to a suboptimal estimate of performance, since many contemporary ray-tracing algorithms extensively use pre-computed data to accelerate the computation [4

4. C. Wächter, “Quasi-Monte Carlo light transport simulation by efficient ray tracing,” Ph.D. dissertation (Ulm University, 2007)

,6

6. I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland University, 2004).

,9

9. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007, (2007), pp. 33–39.

,13

13. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Vis. Comput. Graph. 16(3), 434–438 (2010). [CrossRef] [PubMed]

]. This is especially the case when using Plücker-coordinate-based ray-tracing, such as in Shevtsov’s method [9

9. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007, (2007), pp. 33–39.

].

Recently, Havel and Herout [13

13. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Vis. Comput. Graph. 16(3), 434–438 (2010). [CrossRef] [PubMed]

] published a brief review on the ray-tracing algorithm state of the art. They also proposed a hybrid approach that combines the strengths of Wald’s and Shevtsov’s methods. Given the connection between mesh-based MC and ray-tracing algorithms, we found this paper particularly helpful to understand the algorithmic differences between TIM-OS and MMCM.

Combining Eqs. (9) and (10) in [13

13. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Vis. Comput. Graph. 16(3), 434–438 (2010). [CrossRef] [PubMed]

], we can reconstruct a pseudo-code that describes Shevtsov’s method in a format similar to that in the Appendix of [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

]. The pseudo-code, the “apparent” FLOPS count of each step and the probability of each execution path are listed in the following table. For consistency, we count division as 1 FLOP as in [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

]; in practice, division is typically more expensive than other operations.For simplicity, we numerically estimate the probabilities of different execution paths by simulating a large number of photons using the above method at various scattering settings. We find that these numbers quickly converge to the listed values independent of the scattering and anisotropy settings.

In the above discussion, we highlighted the fact that the FLOPS estimates were taken as the “face values”. In modern CPUs, these values are not necessarily proportional to the execution time. Memory latency is another important factor and proper memory pre-fetching can significantly impact the efficiency [13

13. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Vis. Comput. Graph. 16(3), 434–438 (2010). [CrossRef] [PubMed]

]. Also, the if-branches in Table 1 and Appendix of [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

] can hurt efficiency by causing pipeline stalls in modern CPUs [9

9. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007, (2007), pp. 33–39.

]. In [9

9. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007, (2007), pp. 33–39.

], the authors had shown that a branchless structure can be more efficient despite the slight increase in FLOPS.

In addition, we found Eq. (13) in [13

13. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Vis. Comput. Graph. 16(3), 434–438 (2010). [CrossRef] [PubMed]

] very helpful in understanding the key differences between TIM-OS and MMCM in the tetrahedral case. Comparing the equations, we found that the core ray-tracing kernel in TIM-OS requires the first, second and the sixth equations; for MMCM, all equations are needed. From this perspective, we can see that the two algorithms are not really “quite different” as concluded in [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

]. Rather one is a partial version of the other: MMCM performs a complete ray-tracing calculation while TIM-OS bypasses the calculations for Barycentric coordinates.

We want to note particularly that Plücker-coordinate-based MC is not limited to tetrahedral meshes. MMCM can be extended to support general polyhedral elements, such as hexahedrons, or incorporate high-order basis functions, i.e. the p-method, to enhance accuracy [2

2. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1(1), 165–175 (2010). [CrossRef] [PubMed]

]. In [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

], the authors proposed to use refined elements, i.e. the h-method, to reduce discretization error. Based on the error analysis for mesh-refinement, the discretization error of the p-method diminishes much faster than the h-method [14

14. K. H. Huebner, The Finite Element Method for Engineers (Wiley, 2001), Section 10.6.4.

]. To produce solutions with a comparable discretization error, the computational expense of the h-method can quickly exceed that of the p-method as we refine the mesh.

Comparison of interpolation strategies

Although the fluence is calculated using the piece-wise-constant basis functions in timos, the authors also proposed to use a linear interpolation as a post-process [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

]. To a certain extent, this strategy allows timos to approximate the continuous fluence profile, similar to mmc, where the interpolation is done during the energy deposit phase. However, we want to point out that these two strategies are not equivalent. More specifically, we found two issues related to the post-interpolation strategy. First of all, the linear interpolation between the centroids of all tetrahedra is typically achieved by first constructing a 3D tessellation. Thus, the interpolation is only defined inside the convex hull of the centroids, which, in most cases, is smaller than the original mesh. Secondly, the space where the fluence is calculated is different from where it is interpolated. This discrepancy can cause an increase in discretization error.

To further illustrate the second issue, we use a 1D example to show the difference in accuracy of the two interpolation strategies. For simplicity, we select a continuous function f(x) = 1/x and define a set of equally spaced nodes at xi=i*Δx (i = 1,2,…). Our goal is to estimate the integrals in the form of y(i)=(i1/2)Δx(i+1/2)Δxf(x)dx. If we choose the mmc strategy, we first define the 1D linear basis as ϕi(x)=1(|xiΔx|/Δx) and integrate f(x)ϕi(x) in the intervals [(i1)Δx,iΔx] and [iΔx,(i+1)Δx]; alternatively, we can integrate f(x) in both intervals and assign the results to the middle points. We then approximate y(i) by taking the average, which is essentially a post-interpolation. It is not difficult to analytically show that
ytruth=(i1/2)Δx(i+1/2)Δxf(x)dx=ln((2i+1)/(2i1))ymmc=(i1)Δx(i+1)Δxf(x)ϕi(x)dx=(i+1)ln(i+1)+(i1)ln(i1)2iln(i)ytimos=12((i1)ΔxiΔxf(x)dx+iΔx(i+1)Δxf(x)dx)=12ln((i+1)/(i1))
(1)
In Fig. 1
Fig. 1 Absolute and relative errors of the solutions using different interpolation strategies. Here we define e mmc = y mmc - y truth and e timos = y timos - y truth.
, we plot the absolute and relative errors for y mmc and y timos at two Δx settings. Several observations can be made from this plot: (1) the mmc strategy produces a better approximation than the post-interpolation approach, especially when f(x) has large variations, and (2) the error of the post-interpolation becomes greater as Δx increases. Similar analysis can be done in 2D and 3D spaces. For practical photon transport simulations, we expect the solutions produced by post-interpolation to have lower accuracy near the source and with coarse meshes.

Impact of compilers and optimization settings

By running benchmarks for several configurations with the pre-compiled binaries, the authors of [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

] showed that timos software is 3~6 times faster than mmc. However, no analysis was given. With further investigation, we found that this reported difference was largely dominated by differences in compilers and libraries instead of the algorithms themselves.

To illustrate the above finding, we recompile the mmc source code with various compiler and optimization settings and report the run-times in Table 2

Table 2. Comparison of Run-Times for Various mmc and timos Builds with Different Compilers and Optimizations

table-icon
View This Table
| View All Tables
.Here, all mmc binaries were compiled on a computer (Machine-A) running 64bit CentOS 5.5 with either gcc 4.1.2 or Intel C++ compiler (icc) 2011.0.084. This computer has two Intel Xeon E5530 CPUs (8 cores total), similar to the one used in [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

]. The timos executable was compiled with the Intel C++ compiler and was downloaded from its website [10

10. H. Shen and G. Wang, “TIM-OS Project Site,” https://sites.google.com/a/imaging.sbes.vt.edu/tim-os/.

] on Dec. 10, 2010. All binaries were compiled with the -O3 flag for both compilers except for mmc-icc-SSE where “-fast” and “-msse4.1” options were appended. We chose the “sph2” example [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

] as the benchmark, and simulated 3×107 photons for each executable using 32 parallel threads. To estimate the run-time, we ran each binary 4 times consecutively, and took the average from the last 3 runs. We repeated the simulations on a second computer (Machine-B) running on a single Intel Core i7-870 CPU under the hyper-threading mode.

In Table 2, mmc-gcc and mmc-icc were built with the original mmc v0.2 source code released on Aug. 29, 2010 [11

11. Q. Fang, “Mesh-based Monte Carlo—the software,” http://mcx.sourceforge.net/mmc/.

]. The difference between mmc-gcc and mmc-icc is solely in the compiler used - gcc or icc - as the suffixes indicate. In mmc-gcc2 and mmc-icc2, we replaced the exponential function expf() defined in the GNU C library (glibc) with an approximation function using the 9th order Taylor’s expansion; in addition, we also added the “static inline” prefix to enable expansion of inline functions across multiple source units. In mmc-icc-SSE, we further optimized the ray-tracing kernel using the Streaming SIMD Extensions (SSE).

Comparing the run-times in rows 2 and 3, we can clearly see the dramatic difference in speed by simply using a different compiler: the mmc-icc binary is 1.6 times faster than its gcc counterpart! To identify the cause of this difference, we profiled both binaries and found that the slow math functions (exp, log, sin and cos) in the GNU C Library were responsible for the poor performance. As a result, in mmc-gcc2, we replaced expf() with a fast approximation. The run-time of mmc-gcc2 became close to mmc-icc, indicating that the speed enhancement of the icc binaries benefited from a more efficient implementation of the math functions. The normalized run-times (divided by that of timos) for various mmc binaries show a wide swing from 1.5 to 4, indicating the large susceptibility of run-time with respect to the choices of compiler and library settings. Similar findings were observed on Machine-B.

Other related issues

In addition to the compiler and library differences, there are several other issues that also deserve investigation. Insufficient consideration of these issues may also lead to inaccurate estimates of run-time.

From private communication with the timos authors, we learned that the Russian roulette settings were different between the two codes. In timos, the Russian roulette has a built-in threshold of 0.00001 and is used for both static and time-resolved simulations. By comparison, mmc only enables Russian roulette for static simulations. This configuration is based on our belief that using Russian roulette in a time-resolved simulation introduces time-dependent statistical characteristics for the temporal point spread function (TPSF). This is not desired in certain rigorous analysis even though the solutions may be visually similar. The use of Russian roulette can give a slight speed advantage for timos. Nevertheless, the results from the two solvers possess different characteristics. Thus, this comparison is not entirely valid.

Moreover, timos calls the SIMD-oriented Fast Mersenne Twister (SFMT) random number generator (RNG) from the Intel Math Kernel Library (MKL) [15

15. Intel Knowledge Base, “New fast basic random number generator SFMT19937 in Intel MKL,” (2010), http://software.intel.com/en-us/articles/new-fast-basic-random-number-generator-sfmt19937-in-intel-mkl/.

] while mmc uses a plain multi-thread 48bit RNG from glibc. Accelerated by the SSE instructions, SFMT algorithm is known to be fast and it is reasonable to believe that the contribution from RNG differences is non-negligible. In addition to speed, using identical RNG settings in timos and the reference code can also introduce biases in the accuracy assessment in Table 3 of [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

]. Nonetheless, the diminishing absorption fraction shown in Table 4 in [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

] does reveal a bug in the mmc implementation. By changing the absorption accumulator variable type from “float” to “double,” we find that the above issue disappears.

Conclusions

From the above discussions, we believe that identifying the detailed causes of performance differences is more informative to the community than reporting running times and error metrics without in-depth analysis. Caution must be exercised to avoid common pitfalls in performance analysis where external components can easily introduce biases to the results. Summarizing the above findings, we believe the main conclusion in [12

12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

], “TIM-OS is superior to MMC in terms of efficiency for the general heterogeneous models,” is not sufficiently justified and thus deserves further debate.

Acknowledgment

QF would like to thank Jiří Havel for his help in understanding the SSE implementation in [IEEE Trans. Vis. Comput. Graph. 16, 434 (2010)]. QF also appreciates the kind communications with Haiou Shen and Ge Wang to learn the details regarding the compilation and execution settings for the timos binary. This study was partially supported by the Harvard Catalyst Pilot Grant.

References and links

1.

H. Shen and G. Wang, “A tetrahedron-based inhomogeneous Monte Carlo optical simulator,” Phys. Med. Biol. 55(4), 947–962 (2010). [CrossRef] [PubMed]

2.

Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1(1), 165–175 (2010). [CrossRef] [PubMed]

3.

E. Margallo-Balbás and P. J. French, “Shape based Monte Carlo code for light transport in complex heterogeneous Tissues,” Opt. Express 15(21), 14086–14098 (2007). [CrossRef] [PubMed]

4.

C. Wächter, “Quasi-Monte Carlo light transport simulation by efficient ray tracing,” Ph.D. dissertation (Ulm University, 2007)

5.

N. Ren, J. Liang, X. Qu, J. Li, B. Lu, and J. Tian, “GPU-based Monte Carlo simulation for light propagation in complex heterogeneous tissues,” Opt. Express 18(7), 6811–6823 (2010). [CrossRef] [PubMed]

6.

I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland University, 2004).

7.

D. Badouel, “An efficient ray-polygon intersection,” Graphics Gems, S.A. Glassner, ed. (Academic Press Professional, 1990), pp. 390–393.

8.

N. Platis and T. Theoharis, “Fast ray-tetrahedron intersection using Plücker coordinates,” J. Graph. Tools 8(4), 37–48 (2003).

9.

M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007, (2007), pp. 33–39.

10.

H. Shen and G. Wang, “TIM-OS Project Site,” https://sites.google.com/a/imaging.sbes.vt.edu/tim-os/.

11.

Q. Fang, “Mesh-based Monte Carlo—the software,” http://mcx.sourceforge.net/mmc/.

12.

H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]

13.

J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Vis. Comput. Graph. 16(3), 434–438 (2010). [CrossRef] [PubMed]

14.

K. H. Huebner, The Finite Element Method for Engineers (Wiley, 2001), Section 10.6.4.

15.

Intel Knowledge Base, “New fast basic random number generator SFMT19937 in Intel MKL,” (2010), http://software.intel.com/en-us/articles/new-fast-basic-random-number-generator-sfmt19937-in-intel-mkl/.

OCIS Codes
(170.3660) Medical optics and biotechnology : Light propagation in tissues
(170.5280) Medical optics and biotechnology : Photon migration
(170.7050) Medical optics and biotechnology : Turbid media

ToC Category:
Optics of Tissue and Turbid Media

History
Original Manuscript: December 13, 2010
Manuscript Accepted: December 29, 2010
Published: April 19, 2011

Citation
Qianqian Fang, "Comment on “A study on tetrahedron-based inhomogeneous Monte-Carlo optical simulation”," Biomed. Opt. Express 2, 1258-1264 (2011)
http://www.opticsinfobase.org/boe/abstract.cfm?URI=boe-2-5-1258


Sort:  Author  |  Year  |  Journal  |  Reset  

References

  1. H. Shen and G. Wang, “A tetrahedron-based inhomogeneous Monte Carlo optical simulator,” Phys. Med. Biol. 55(4), 947–962 (2010). [CrossRef] [PubMed]
  2. Q. Fang, “Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates,” Biomed. Opt. Express 1(1), 165–175 (2010). [CrossRef] [PubMed]
  3. E. Margallo-Balbás and P. J. French, “Shape based Monte Carlo code for light transport in complex heterogeneous Tissues,” Opt. Express 15(21), 14086–14098 (2007). [CrossRef] [PubMed]
  4. C. Wächter, “Quasi-Monte Carlo light transport simulation by efficient ray tracing,” Ph.D. dissertation (Ulm University, 2007)
  5. N. Ren, J. Liang, X. Qu, J. Li, B. Lu, and J. Tian, “GPU-based Monte Carlo simulation for light propagation in complex heterogeneous tissues,” Opt. Express 18(7), 6811–6823 (2010). [CrossRef] [PubMed]
  6. I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation (Saarland University, 2004).
  7. D. Badouel, “An efficient ray-polygon intersection,” Graphics Gems, S.A. Glassner, ed. (Academic Press Professional, 1990), pp. 390–393.
  8. N. Platis and T. Theoharis, “Fast ray-tetrahedron intersection using Plücker coordinates,” J. Graph. Tools 8(4), 37–48 (2003).
  9. M. Shevtsov, A. Soupikov, and A. Kapustin, “Ray-triangle intersection algorithm for modern CPU architectures,” in Proceedings of GraphiCon 2007, (2007), pp. 33–39.
  10. H. Shen and G. Wang, “TIM-OS Project Site,” https://sites.google.com/a/imaging.sbes.vt.edu/tim-os/ .
  11. Q. Fang, “Mesh-based Monte Carlo—the software,” http://mcx.sourceforge.net/mmc/ .
  12. H. Shen and G. Wang, “A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation,” Biomed. Opt. Express 2(1), 44–57 (2011). [CrossRef] [PubMed]
  13. J. Havel and A. Herout, “Yet faster ray-triangle intersection (using SSE4),” IEEE Trans. Vis. Comput. Graph. 16(3), 434–438 (2010). [CrossRef] [PubMed]
  14. K. H. Huebner, The Finite Element Method for Engineers (Wiley, 2001), Section 10.6.4.
  15. Intel Knowledge Base, “New fast basic random number generator SFMT19937 in Intel MKL,” (2010), http://software.intel.com/en-us/articles/new-fast-basic-random-number-generator-sfmt19937-in-intel-mkl/ .

Cited By

Alert me when this paper is cited

OSA is able to provide readers links to articles that cite this paper by participating in CrossRef's Cited-By Linking service. CrossRef includes content from more than 3000 publishers and societies. In addition to listing OSA journal articles that cite this paper, citing articles from other participating publishers will also be listed.

Figures

Fig. 1
 

« Previous Article  |  Next Article »

OSA is a member of CrossRef.

CrossCheck Deposited