OSA's Digital Library

Optics Express

Optics Express

  • Editor: Andrew M. Weiner
  • Vol. 21, Iss. 5 — Mar. 11, 2013
  • pp: 6061–6075
« Show journal navigation

Statistical properties of amplitude and decay parameter estimators for fluorescence lifetime imaging

Jeongtae Kim and Jiyeong Seok  »View Author Affiliations


Optics Express, Vol. 21, Issue 5, pp. 6061-6075 (2013)
http://dx.doi.org/10.1364/OE.21.006061


View Full Text Article

Acrobat PDF (906 KB)





Browse Journals / Lookup Meetings

Browse by Journal and Year


   


Lookup Conference Papers

Close Browse Journals / Lookup Meetings

Article Tools

Share
Citations

Abstract

We analyze the statistical properties of the maximum likelihood estimator, least squares estimator, and Pearson’s χ2-based and Neyman’s χ2-based estimators for the estimation of decay constants and amplitudes for fluorescence lifetime imaging. Our analysis is based on the linearization of the gradient of the objective functions around true parameters. The analysis shows that only the maximum likelihood estimator based on the Poisson likelihood function yields unbiased and efficient estimation. All other estimators yield either biased or inefficient estimations. We validate our analysis by using simulations.

© 2013 OSA

1. Introduction

Fluorescence lifetime imaging microscopy (FLIM) provides useful information about the status of fluorophores and their molecular environments, which can be used to resolve physiological parameters in cells [1

1. J. R. Lakowicz, Principles of Fluorescence Spectroscopy (Kluwer Academic/Plenum, 1999) [CrossRef] .

, 2

2. H. E. Grecco, P. Roda-Navarro, and P. J. Verveer, “Global analysis of time correlated single photon counting FRET-FLIM data,” Opt. Express 17, 6493–6508 (2009) [CrossRef] [PubMed] .

]. Moreover, FLIM can be extended to monitor fluorescence resonance energy transfer that can be used to study the dynamics of proteins within the cells [3

3. S. Kumar, C. Dunsby, P. A. A. D. Beule, D. M. Owen, U. Anand, P. M. P. Lanigan, R. K. P. Benninger, D. M. Davis, M. A. A. Neil, P. Anand, C. Benham, A. Naylor, and P. M. W. French, “Multifocal multiphoton excitation and time correlated single photon counting detection for 3-D fluorescence lifetime imaging,” Opt. Express 15, 12548–12561 (2007) [CrossRef] [PubMed] .

]. Since FLIM is based on the decay properties of fluorophores, an accurate estimation of fluorescence lifetime becomes important. The fluorescence lifetime is usually estimated either in the time domain, using a time correlated single photon counting (TCSPC) technique or in the frequency domain, using an intensity-modulated light source and phase-shifted emission light [1

1. J. R. Lakowicz, Principles of Fluorescence Spectroscopy (Kluwer Academic/Plenum, 1999) [CrossRef] .

]. Compared with methods used for the frequency domain, the TCSPC-based method has advantages such as robustness to scattering, higher precision, and the ability to measure short lifetimes [4

4. B. B. Collier and M. J. McShane, “Dynamic windowing algorithm for the fast and accurate determination of luminescence lifetimes,” Anal. Chem. 84, 4725–4731 (2012) [CrossRef] [PubMed] .

].

We think that the plurality of incorrect estimators is due to the lack of an analysis of statistical properties of each estimator. Although there are some investigations that compare the performance of different estimators [11

11. T. Hauschild and M. Jentschel, “Comparison of maximum likelihood estimation and chi-square statistics applied to counting experiments,” Nucl. Instrum. Meth. A 457, 384–401 (2001) [CrossRef] .

,12

12. M. Maus, M. Cotlet, J. Hofkens, T. Gensch, F. C. De Schryver, J. Schaffer, and C. A. M. Seidel, “An experimental comparison of the maximum likelihood estimation and nonlinear least-squares fluorescence lifetime analysis of single molecules,” Anal. Chem. 73, 2078–2086 (2001) [CrossRef] [PubMed] .

], the comparisons are based on only numerical simulations and experiments, and do not use analysis. In addition, the usefulness of these comparisons is limited, because most studies are based on estimating the parameters of a single exponential decay. Recently, an investigation on estimating parameters of a double exponential model for measuring protein interacting fractions was reported [13

13. K. A. Walther, B. Papke, M. B. Sinn, K. Michel, and A. Kinkhabwala, “Precise measurement of protein interacting fractions with fluorescence lifetime imaging microscopy,” Mol. BioSyst. 7, 322–336 (2011) [CrossRef] [PubMed] .

]. Except for empirical comparisons, to our knowledge, there exists only one study that analyzed the performance of the estimators [14

14. J. Tellinghuisen and C. W. Wilkerson, “Bias and precision in the estimation of exponential decay parameters from sparse data,” Anal. Chem. 65, 1240–1246 (1993) [CrossRef] .

]. However, this analysis was not very useful since it assumed an infinite number of photons and was also used for the estimation of parameters of a a single decay. Moreover, since the behaviors of MLE and Pearson’s χ2-based estimator for the asymptotic case of infinite photon counts are identical in their analysis [14

14. J. Tellinghuisen and C. W. Wilkerson, “Bias and precision in the estimation of exponential decay parameters from sparse data,” Anal. Chem. 65, 1240–1246 (1993) [CrossRef] .

], it may mislead readers, who may think that equal performance can be achieved either using the MLE or Pearson’s χ2-based estimator. In fact, the Pearson’s χ2-based estimator should not be used since, as demonstrated herein, this estimator is biased for a finite number of photon counts.

2. Theory

2.1. Problem formulation

We model a photon count y(ti), that is measured at time ti from a mixture of fluorophores, as a Poisson random variable [1

1. J. R. Lakowicz, Principles of Fluorescence Spectroscopy (Kluwer Academic/Plenum, 1999) [CrossRef] .

]:
y(ti)=Poisson{λ(ti;θ)},i=0,.,N1,
(1)
where, the mean of the Poisson random variable, λ (ti; θ), is defined as follows:
λ(ti;θ)=g(ti;θ)*h(ti),
(2)
where, * is the convolution operator, h(t) is the impulse response function (IRF) of a measurement system, and g(t; θ) is an exponential decay model that is defined as follows:
g(t;θ)=j=0M1Aje(tτj),
(3)
where, the 2M × 1 long vector of unknown parameters, θ = [τ0, τ1,...,τM−1, A0, A1,..., AM−1], consists of the decay constants τi and the amplitudes Ai of all fluorophores [1

1. J. R. Lakowicz, Principles of Fluorescence Spectroscopy (Kluwer Academic/Plenum, 1999) [CrossRef] .

]. To simplify the notations, we denote λ (ti; θ) by λi(θ) and y(ti) by yi. Because FLIM should be based on accurate decay properties, the parameter vector θ should be accurately estimated from the noisy measurement Y = [y0, y1,..., yN−1]. To achieve this goal, the estimation is usually carried out by determining the minimizer of an objective function, in the following way:
θ^=argminθ>0Φ(θ;Y).
(4)
Since it is desired that the mean of the estimator be the same as the true parameter (i.e., unbiased) and that the variance of the estimator be as small as possible, one should design the objective function based on the underlying probabilistic nature of the measured photon counts. One of the most useful estimators is the MLE, which is defined by the maximizer of the likelihood function of Y. The MLE for the measurement that is modeled in Eq. (1) is determined by the minimizer of the negative Poisson log-likelihood function, defined as follows:
L(θ;Y)=i=0N1λi(θ)yilogλi(θ)+logyi!.
(5)
If one assumes that the measured data is acquired under additive Gaussian noises with equal variances, then the LSE would be the MLE for such measured data. The objective function for the LSE is defined as follows:
ΦLS(θ;Y)=12i=0N1(yiλi(θ))2.
(6)
Since each measurement yi has different variance, one may attempt to estimate the vector θ by considering different variances for different yi. Because the mean and the variance of the Poisson random variable yi are the same as λi(θ), assuming each measurement as a Gaussian random variable with equal mean and variance yields the well known Pearson’s χ2 objective function defined, as follows [11

11. T. Hauschild and M. Jentschel, “Comparison of maximum likelihood estimation and chi-square statistics applied to counting experiments,” Nucl. Instrum. Meth. A 457, 384–401 (2001) [CrossRef] .

]:
ΦP(θ;Y)=12i=0N1(yiλi(θ))2λi(θ).
(7)
Another alternative of the Pearson’s χ2 is the modified Neyman’s χ2, which is obtained by substituting the variance in the denominator with max{yi, 1}, as follows:
ΦN(θ;Y)=12i=0N1(yiλi(θ))2max{yi,1}.
(8)
The denominator of the original Neyman’s χ2 is yi, which is based on the intuition that E[yi] is λi(θ). However, since the original Neyman’s χ2 function is not defined when yi is zero, we study the modified Neyman’s χ2-based method as suggested previously [11

11. T. Hauschild and M. Jentschel, “Comparison of maximum likelihood estimation and chi-square statistics applied to counting experiments,” Nucl. Instrum. Meth. A 457, 384–401 (2001) [CrossRef] .

]. We note that if all yi are non-zero, the modified Neyman’s χ2 is the same as the original [11

11. T. Hauschild and M. Jentschel, “Comparison of maximum likelihood estimation and chi-square statistics applied to counting experiments,” Nucl. Instrum. Meth. A 457, 384–401 (2001) [CrossRef] .

].

2.2. Mean and variance approximation

We consider an estimator that is defined by the minimizer of an analytic objective function. The minimizer is determined by zeroing the gradient of the objective function, as follows [16

16. J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag. 23, 1430 –1444 (2004) [CrossRef] .

]:
θΦ(θ^;Y)=θΦ(θ;Y)|θ=θ^=0,
(9)
where, the ∇θ operator is defined by the column gradient using θj [16

16. J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag. 23, 1430 –1444 (2004) [CrossRef] .

]. To approximate its mean and variance, we linearize the column gradient ∇θΦ(θ̂, Y) by the first-order Taylor series expansion that is performed around the true parameter θ, as follows [16

16. J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag. 23, 1430 –1444 (2004) [CrossRef] .

]:
θΦ(θ^;Y)θΦ(θ;Y)+[θ[θΦ(θ;Y)]T](θ^θ).
(10)
Note that the approximation should be useful when the objective function is locally quadratic in a region around θ, and when θ̂ belongs to the local region. For such cases, the Hessian matrix ∇θ[∇θΦ(θ; Y)] is a positive definite matrix. Since ∇θΦ(θ̂; Y) = 0, the estimator θ̂ can be approximated as follows [16

16. J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag. 23, 1430 –1444 (2004) [CrossRef] .

]:
θ^θ[θ[θΦ(θ;Y)]T]1θΦ(θ,Y).
(11)
Therefore, the bias may be approximated as follows [16

16. J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag. 23, 1430 –1444 (2004) [CrossRef] .

]:
E[θ^]θE[[θ[θΦ(θ;Y)]T]1θΦ(θ;Y)]H1E[θΦ(θ;Y)],
(12)
where, H is the expectation of Hessian matrix at θ and is defined as follows:
H=E[θ[θΦ(θ;Y)]T].
(13)
In addition, the covariance of the estimator may be approximated as follows [16

16. J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag. 23, 1430 –1444 (2004) [CrossRef] .

]:
Cov[θ^]H1E[θΦ(θ;Y)θΦ(θ;Y)T]H1
(14)
Note that the approximation would be more accurate if estimated parameter values are closer to true parameter values, because in such a case, the objective function can be better approximated by a locally quadratic function around the true parameter (i.e., high signal to noise scenario).

2.3. Cramer-Rao bound

For any unbiased estimator, the Cramer-Rao bound is the lowest achievable bound for the variance of the estimator [5

5. H. Cramer, Mathematical Methods of Statistics (Princeton University, 1999).

, 17

17. H. Van Trees, Detection, Estimation, and Modulation Theory, Part 1 (John Wiley & Sons, 2001).

]. The variance of the j-th component of any unbiased estimator θ̂j is bounded by Jjj, as follows [17

17. H. Van Trees, Detection, Estimation, and Modulation Theory, Part 1 (John Wiley & Sons, 2001).

]:
Var[θ^j]Jjj,
(15)
where, the matrix J is the inverse of the Fisher information matrix F, which is defined as follows:
FjkE[logp(Y;θ)θjlogp(Y;θ)θk]=E[2logp(Y;θ)θjθk].
(16)
We compute the Cramer-Rao bound for the estimation of θ from Y (defined in Eq. (1)). By using the negative log-likelihood function defined in Eq. (5), it is straightforward to show that the (j, k)-th element of the Fisher information matrix F can be defined as follows:
Fjk=i=0N11λi(θ)λi(θ)θjλi(θ)θk.
(17)
Note that it is possible to simplify the Fisher information matrix, using the relation E[yi] = Var[yi] = λi(θ).

2.4. Maximum likelihood estimation

The MLE that use the negative Poisson log-likelihood function, defined in Eq. (5), is unbiased in our approximation since
E[L(θ;Y))θj]=E[i=0N1(yiλi(θ)1)λi(θ)θj]=0.
(18)
For covariance approximation, it is sufficient to note that [17

17. H. Van Trees, Detection, Estimation, and Modulation Theory, Part 1 (John Wiley & Sons, 2001).

, p. 80]
E[[θL(θ,Y)][θL(θ,Y)]T]=E[θ[θL(θ,Y)]T].
(19)
Since the approximated covariance matrix is the same as the inverse of the Fisher information matrix, we conclude that, in our approximation, the MLE is both unbiased and efficient. This result is expected, since the MLE is asymptotically unbiased and asymptotically efficient [5

5. H. Cramer, Mathematical Methods of Statistics (Princeton University, 1999).

].

2.5. Least squares estimation

The least squares estimation is also unbiased in our approximation, because
E[ΦLS(θ;Y)θj]=E[i(yiλi(θ))λi(θ)θj]=0.
(20)
However, the covariance approximation of the LSE reveals that it has larger variance than MLE. Let H = E [∇θ[∇θΦLS(θ; Y)]T] and K = E[∇θΦLS(θ; Y)∇θΦLS(θ; Y)T]. Then the (j, k)-th elements of H and K matrices are defined as follows:
Hjk=i=0N1λi(θ)θjλi(θ)θk,
(21)
Kjk=i=0N1λi(θ)λi(θ)θjλi(θ)θk.
(22)
The approximated covariance function of the LSE, θ̂LS, satisfies the following inequality due to the matrix Cauchy-Schwarz inequality [18

18. T. G., “A matrix extension of the cauchy-schwarz inequality,” Econ. Lett. 63, 1–3 (1999) [CrossRef] .

]:
Cov{θ^LS}H1KH1F1,
(23)
where, AB is understood in the sense that AB matrix is positive semidefinite. The proof of this inequality is included in Appendix A. Therefore, we conclude that the variance of the LSE is larger than the MLE, although the estimator is unbiased. This is because the LSE is based on an incorrect probability model.

2.6. Pearson’s χ2

2.7. Neyman’s χ2

2.8. Numerical optimization

To determine the MLE and the χ2-based estimations, it is required to solve the minimization problem defined in Eq. (4) with the objective functions defined in Eq. (5) through Eq. (8). Note that the minimization problem is a constrained optimization problem with non-negativity constraints since both decay and amplitude parameters have positive values. We solve the constrained minimization problem using the trust region reflective method [21

21. J. Moré and D. Sorensen, “Computing a trust region step,” SIAM J. Sci. Comput. 4, 553–572 (1983) [CrossRef] .

23

23. R. Byrd, R. Schnabel, and G. Shultz, “Approximate solution of the trust region problem by minimization over two-dimensional subspaces,” Math. Program. 40, 247–263 (1988) [CrossRef] .

] implemented in the Optimization toolbox in MATLAB (Mathworks, USA), which is based on the quadratic approximation of an objective function. The Hessian matrix HM of the negative log-likelihood function, which is the objective function for the MLE, is defined as follows:
HjkM=i=0N1(1yiλi(θ))2λi(θ)θjθk+yiλi(θ)2λi(θ)θjλi(θ)θk.
(29)
For the quadratic approximation of the negative log-likelihood function, we ignore the first term in Eq. (29) since inclusion of the second order derivative term can be destabilizing if the model fits badly [10

10. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C++ - The Art of Scientific Computing (Cambridge University, 2002).

, p. 688]. The Hessian matrices for the χ2-based estimators are also computed using only first order derivative terms for the same reason.

The approximations for the statistical properties of each estimator developed in the preceding sections are based on the assumption that each estimator is determined by zeroing the gradient of corresponding objective function around true parameter. In practice, since objective functions for the MLE and the χ2-based estimators are non-convex functions, numerical optimization may find only a local minimum depending on initial parameter values. Because it is very challenging to analyze the behaviors of local minima theoretically, we study the effects of local minima in decay and amplitude parameter estimation using numerical simulations in the following section.

3. Simulation results

To validate the analysis that was presented in the preceding section, we conducted simulation studies in order to estimate the two decay constants and their associated amplitudes in an exponential decay model that is defined as follows:
λi(θ)=(A1etiτ1+A2etiτ2)*h(ti).
(30)
In Eq. (30), we used the IRF of a truncated Gaussian function with σ = 1 and 5 nonzero points. Note that the Gaussian function was often used to model an IRF [2

2. H. E. Grecco, P. Roda-Navarro, and P. J. Verveer, “Global analysis of time correlated single photon counting FRET-FLIM data,” Opt. Express 17, 6493–6508 (2009) [CrossRef] [PubMed] .

, 6

6. S. Laptenok, K. M. Mullen, J. W. Borst, I. H. M. van Stokkum, V. V. Apanasovich, and A. J. W. G. Visser, “Fluorescence lifetime imaging microscopy (FLIM) data analysis with TIMP,” J. Stat. Softw. 18, 1–20 (2007).

]. We generated 1024 Poisson random variables yi whose mean is λi(θ) with uniformly spaced ti from t0 = 0 to tN−1 = 12.788ns. We set true values as τ1 = 1ns, τ2 = 4ns. To simulate a different number of photon counts, we changed A1 from 23 to 125 while we set A2 = A1 − 10. By using the generated photon counts (i.e., the Poisson random variables), we estimated the parameters using the MLE, LSE, and Pearson’s χ2 and Neyman’s χ2-based estimators. We set initial parameter values by two times the true values. We repeated the estimation 5000 times, using different Poisson random variable realizations.

Figure 1 shows the sample mean of each estimator for [τ1, τ2, A1, A2] from the 5000 estimations with the approximated mean values that were obtained using the formula developed in the preceding section for the Pearson’s and Neyman’s χ2-based estimators. Notably, because there is no bias in our approximation, the approximated means of the MLE and the LSE are the same as true ones. As shown in Fig. 1, the sample means of the MLE and the LSE are very close to the true parameters for the entire range of photon counts. However, the Pearson’s χ2-based estimator shows positive biases in estimating τ1 and τ2. There also exists a small positive bias in estimating A1 and a small negative bias in estimating A2. For higher photon counts, the approximated mean formula is in better agreement with the sample mean. The approximated mean values are included only in the range for which the approximated Hessian is a positive definite matrix. Compared with the Pearson’s χ2-based estimator, the Neyman’s χ2-based estimator exhibits more complicated behaviors. The Neyman’s χ2-based estimator shows positive biases in estimating τ1 and τ2 when photon counts are very low. However, in the region wherein photon counts are relatively high, the biases are negative. The approximated mean formula agrees with the sample mean only in the regime of high photon counts. As a matter of fact, the approximated Hessian matrix of the Neyman’s χ2 is positive definite only for sufficiently high photon counts, indicating that the Neyman’s χ2-based estimator might be more nonlinear than the Pearson’s χ2-based estimator.

Fig. 1 Mean of the estimated parameters for different photon counts: (a) τ1; (b) τ2; (c) A1; (d) A2;

In Fig. 2, we present the sample variance of each estimator with the Cramer-Rao bound and approximated variances for the LSE. As shown in Fig. 2, the sample variances of the MLE and the LSE agree with the Cramer-Rao bound and the approximated variance formula very well for the entire range of photon counts. We note that, according to the proof presented in Appendix A, the approximated variance of the LSE is always greater than the Cramer-Rao bound. The variances of the Pearson’s χ2-based estimator are larger than those of the MLE for low photon counts, although the variances become close to the Cramer-Rao bound as the number of photons increases. Note that even for high photon counts, the Pearson’s χ2-based estimator is biased in estimating τ1 and τ2. For Neyman’s χ2-based estimator, the variances are greater than those of the MLE for relatively low photon counts. Although the variances for estimating τ1 and τ2 approach the Cramer-Rao bound as photon counts increase, variances in estimated amplitudes do not approach the Cramer-Rao bound even in the case of very high photon counts, as is shown in Fig. 2(c) and Fig. 2(d). We believe that the Neyman’s χ2-based estimator should not be used because it is more biased than the Pearson’s χ2-based method, in addition to showing larger variances. Note that the Neyman’s χ2-based method was a frequently used method in FLIM imaging [7

7. S. Pelet, M. J. R. Previte, L. H. Laiho, and P. T. C. So, “A fast global fitting algorithm for fluorescence lifetime imaging microscopy based on image segmentation.” Biophys. J. 87, 2807–17 (2004) [CrossRef] [PubMed] .

, 12

12. M. Maus, M. Cotlet, J. Hofkens, T. Gensch, F. C. De Schryver, J. Schaffer, and C. A. M. Seidel, “An experimental comparison of the maximum likelihood estimation and nonlinear least-squares fluorescence lifetime analysis of single molecules,” Anal. Chem. 73, 2078–2086 (2001) [CrossRef] [PubMed] .

]. We believe that one should choose the MLE for FLIM imaging, not only for low photon counts but also for high photon counts. Our analysis and simulations show biases in Pearson’s and Neyman’s χ2-based methods even for the case of more than 45000 photon counts. In addition, since the LSE shows larger variances than the MLE, there is no reason to favor the LSE over to the MLE.

Fig. 2 Variance in the estimation of parameters for different photon counts: (a) τ1; (b) τ2; (c) A1; (d) A2;

As mentioned in the preceding section, numerical optimization may determine only a local minimizer since the objective functions for the MLE and the χ2-based estimators are non-convex functions. To study the effects of local minimizers in estimating decay and amplitude parameters, we conducted another simulation study. We generated 1024 Poisson random variables yi whose mean is λi(θ) with true parameter values as τ1 = 1ns, τ2 = 4ns, A1 = 72, and A2 = 62 (number of average total photon counts is 24852). Then, we estimated the decay and the amplitude parameters using the MLE and the χ2-based methods while increasing initial parameter values from the true values to five times the true values. We repeated these estimations 5000 times with different Poisson random variable realizations. Figure 3 shows the sample mean of each estimator from 5000 realizations for different initial parameters. To our surprise, all the four estimation methods are very robust to the change of initial parameters. As shown in Fig. 3, the sample mean values of estimated parameters using the four methods are almost the same for the entire range of initial parameters. Similar results are found in the sample variance of each estimator which is shown in Fig. 4. One can see that the sample variances of the MLE and the χ2-based estimations do not change much for the entire range of initial parameters. In addition, the sample variance of the MLE and the LSE agree with the Cramer-Rao bound and the variance approximation formula for the LSE, respectively. These simulation results suggest that the objective functions of the four estimation methods might be locally convex functions for the cases of high photon counts.

Fig. 3 Mean of the estimated parameters for different initial parameter values (high photon counts): (a) τ1 (true value=1); (b) τ2 (true value=4); (c) A1 (true value=72); (d) A2 (true value=62);
Fig. 4 Variance in the estimation of parameters for different initial parameter values (high photon counts): (a) τ1; (b) τ2; (c) A1; (d) A2;

Fig. 5 Mean of the estimated parameters for different initial parameter values (low photon counts and similar decay parameters): (a) τ1 (true value=1); (b) τ2 (true value=2); (c) A1 (true value=28); (d) A2 (true value=18);

Fig. 6 Variance in the estimation of parameters for different initial parameter values (low photon counts and similar decay parameters): (a) τ1; (b) τ2; (c) A1; (d) A2;

4. Conclusion

We used gradient linearization around true parameters to analyze the statistical properties of amplitude and decay constant estimators for FLIM. In the analysis, the MLE that was based on the Poisson likelihood function exhibited both unbiasedness and efficiency. The LSE was unbiased, but it showed a larger variance than the MLE. Since the Pearson’s and Neyman’s χ2-based estimators showed biases, it can be concluded that they are not suitable for even high photon counts. We believe that the analysis presented in this work can also be used to predict the performance of the estimators for FLIM.

Appendix A: Inequality for the variance approximations of MLE and LSE

Matrix Cauchy-Schwarz inequality is defined as follows [18

18. T. G., “A matrix extension of the cauchy-schwarz inequality,” Econ. Lett. 63, 1–3 (1999) [CrossRef] .

]:
BTB[BTA][ATA]1[ATB]0,
(31)
where, a matrix C0 means that the matrix C is a positive semidefinite matrix. We define N × 2M matrices A and B as follows:
Bij=1λi(θ)λi(θ)θj,
(32)
Aij=λi(θ)λi(θ)θj,i=0,,N1,j=0,,2M1.
(33)
Note that the Fisher information matrix is defined as follows:
F=BTB.
(34)
In addition, since the H matrix, defined in Eq. (21) is ATB, and the K matrix, defined in Eq. (22), is ATA, the covariance approximation of the LSE satisfies the following inequality:
Cov{θ^LS}[ATB]1[ATA][BTA]1F1Cov{θ^MLE},
(35)
where θ̂MLE denotes MLE.

Appendix B: Mean approximation for the estimation of a single time constant using the Pearson’s χ2

For the estimation of a single decay constant from a single fluorophore, the mean intensity is modeled as follows:
λi(θ)=Aetiτ*i(ti).
(36)
Then, the first order derivative is defined as follows:
λi(θ)τ=tiτ2λi(θ).
(37)
Note that the first term in Eq. (25) dominates when A is very large, since the first term is proportional to A but others are not. Therefore, we approximate the second derivative of the objective function as follows:
hpi=0N1ti2τ4λi(θ).
(38)
Therefore, the bias is approximated as follows:
E[τ^]τ1hp12i=0N11λi(θ)λi(θ)τ=12Ai=0N1tiτ2i=0N1ti2τ4etiτ*i(ti).
(39)
Note that the bias in estimating τ approaches zero as A approaches infinity. However, note that the convergence rate is rather slow, since the bias is proportional to the reciprocal value of the amplitude A.

Acknowledgment

This work was supported by the Korean Research Foundation Grant funded by the Korean Government (KRF-2008-331-D00419) and Ewha Global Top 5 Grant 2011 of Ewha Womans University.

References and links

1.

J. R. Lakowicz, Principles of Fluorescence Spectroscopy (Kluwer Academic/Plenum, 1999) [CrossRef] .

2.

H. E. Grecco, P. Roda-Navarro, and P. J. Verveer, “Global analysis of time correlated single photon counting FRET-FLIM data,” Opt. Express 17, 6493–6508 (2009) [CrossRef] [PubMed] .

3.

S. Kumar, C. Dunsby, P. A. A. D. Beule, D. M. Owen, U. Anand, P. M. P. Lanigan, R. K. P. Benninger, D. M. Davis, M. A. A. Neil, P. Anand, C. Benham, A. Naylor, and P. M. W. French, “Multifocal multiphoton excitation and time correlated single photon counting detection for 3-D fluorescence lifetime imaging,” Opt. Express 15, 12548–12561 (2007) [CrossRef] [PubMed] .

4.

B. B. Collier and M. J. McShane, “Dynamic windowing algorithm for the fast and accurate determination of luminescence lifetimes,” Anal. Chem. 84, 4725–4731 (2012) [CrossRef] [PubMed] .

5.

H. Cramer, Mathematical Methods of Statistics (Princeton University, 1999).

6.

S. Laptenok, K. M. Mullen, J. W. Borst, I. H. M. van Stokkum, V. V. Apanasovich, and A. J. W. G. Visser, “Fluorescence lifetime imaging microscopy (FLIM) data analysis with TIMP,” J. Stat. Softw. 18, 1–20 (2007).

7.

S. Pelet, M. J. R. Previte, L. H. Laiho, and P. T. C. So, “A fast global fitting algorithm for fluorescence lifetime imaging microscopy based on image segmentation.” Biophys. J. 87, 2807–17 (2004) [CrossRef] [PubMed] .

8.

N. Boens, W. Qin, N. Basarić, J. Hofkens, M. Ameloot, J. Pouget, J.-P. Lefèvre, B. Valeur, E. Gratton, M. vandeVen, N. D. Silva, Y. Engelborghs, K. Willaert, A. Sillen, G. Rumbles, D. Phillips, A. J. W. G. Visser, A. van Hoek, J. R. Lakowicz, H. Malak, I. Gryczynski, A. G. Szabo, D. T. Krajcarski, N. Tamai, and A. Miura, “Fluorescence lifetime standards for time and frequency domain fluorescence spectroscopy,” Anal. Chem. 79, 2137–2149 (2007) [CrossRef] [PubMed] .

9.

P. J. Steinbach, R. Ionescu, and C. R. Matthews, “Analysis of kinetics using a hybrid maximum-entropy/nonlinear-least-squares method: application to protein folding,” Biophys. J. 82, 2244–2255 (2002) [CrossRef] [PubMed] .

10.

W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C++ - The Art of Scientific Computing (Cambridge University, 2002).

11.

T. Hauschild and M. Jentschel, “Comparison of maximum likelihood estimation and chi-square statistics applied to counting experiments,” Nucl. Instrum. Meth. A 457, 384–401 (2001) [CrossRef] .

12.

M. Maus, M. Cotlet, J. Hofkens, T. Gensch, F. C. De Schryver, J. Schaffer, and C. A. M. Seidel, “An experimental comparison of the maximum likelihood estimation and nonlinear least-squares fluorescence lifetime analysis of single molecules,” Anal. Chem. 73, 2078–2086 (2001) [CrossRef] [PubMed] .

13.

K. A. Walther, B. Papke, M. B. Sinn, K. Michel, and A. Kinkhabwala, “Precise measurement of protein interacting fractions with fluorescence lifetime imaging microscopy,” Mol. BioSyst. 7, 322–336 (2011) [CrossRef] [PubMed] .

14.

J. Tellinghuisen and C. W. Wilkerson, “Bias and precision in the estimation of exponential decay parameters from sparse data,” Anal. Chem. 65, 1240–1246 (1993) [CrossRef] .

15.

J. Fessler, “Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): applications to tomography,” IEEE Trans. Image Process. 5, 493 –506 (1996) [CrossRef] [PubMed] .

16.

J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag. 23, 1430 –1444 (2004) [CrossRef] .

17.

H. Van Trees, Detection, Estimation, and Modulation Theory, Part 1 (John Wiley & Sons, 2001).

18.

T. G., “A matrix extension of the cauchy-schwarz inequality,” Econ. Lett. 63, 1–3 (1999) [CrossRef] .

19.

P. Hall and B. Selinger, “Better estimates of exponential decay parameters,” J. Phys. Chem. 85, 2941–2946 (1981) [CrossRef] .

20.

K. J. Mighell, “Parameter estimation in astronomy with poisson-distributed data. I. the χγ2 statistic,” The Astrophys. J. 518, 380 (1999) [CrossRef] .

21.

J. Moré and D. Sorensen, “Computing a trust region step,” SIAM J. Sci. Comput. 4, 553–572 (1983) [CrossRef] .

22.

M. A. Branch, T. F. Coleman, and Y. Li, “A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems,” SIAM J. Sci. Comput. 21, 1–23 (1999) [CrossRef] .

23.

R. Byrd, R. Schnabel, and G. Shultz, “Approximate solution of the trust region problem by minimization over two-dimensional subspaces,” Math. Program. 40, 247–263 (1988) [CrossRef] .

OCIS Codes
(100.3190) Image processing : Inverse problems
(180.2520) Microscopy : Fluorescence microscopy
(300.6280) Spectroscopy : Spectroscopy, fluorescence and luminescence

ToC Category:
Microscopy

History
Original Manuscript: January 3, 2013
Revised Manuscript: February 23, 2013
Manuscript Accepted: February 24, 2013
Published: March 4, 2013

Virtual Issues
Vol. 8, Iss. 4 Virtual Journal for Biomedical Optics

Citation
Jeongtae Kim and Jiyeong Seok, "Statistical properties of amplitude and decay parameter estimators for fluorescence lifetime imaging," Opt. Express 21, 6061-6075 (2013)
http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-21-5-6061


Sort:  Author  |  Year  |  Journal  |  Reset  

References

  1. J. R. Lakowicz, Principles of Fluorescence Spectroscopy (Kluwer Academic/Plenum, 1999). [CrossRef]
  2. H. E. Grecco, P. Roda-Navarro, and P. J. Verveer, “Global analysis of time correlated single photon counting FRET-FLIM data,” Opt. Express17, 6493–6508 (2009). [CrossRef] [PubMed]
  3. S. Kumar, C. Dunsby, P. A. A. D. Beule, D. M. Owen, U. Anand, P. M. P. Lanigan, R. K. P. Benninger, D. M. Davis, M. A. A. Neil, P. Anand, C. Benham, A. Naylor, and P. M. W. French, “Multifocal multiphoton excitation and time correlated single photon counting detection for 3-D fluorescence lifetime imaging,” Opt. Express15, 12548–12561 (2007). [CrossRef] [PubMed]
  4. B. B. Collier and M. J. McShane, “Dynamic windowing algorithm for the fast and accurate determination of luminescence lifetimes,” Anal. Chem.84, 4725–4731 (2012). [CrossRef] [PubMed]
  5. H. Cramer, Mathematical Methods of Statistics (Princeton University, 1999).
  6. S. Laptenok, K. M. Mullen, J. W. Borst, I. H. M. van Stokkum, V. V. Apanasovich, and A. J. W. G. Visser, “Fluorescence lifetime imaging microscopy (FLIM) data analysis with TIMP,” J. Stat. Softw.18, 1–20 (2007).
  7. S. Pelet, M. J. R. Previte, L. H. Laiho, and P. T. C. So, “A fast global fitting algorithm for fluorescence lifetime imaging microscopy based on image segmentation.” Biophys. J.87, 2807–17 (2004). [CrossRef] [PubMed]
  8. N. Boens, W. Qin, N. Basarić, J. Hofkens, M. Ameloot, J. Pouget, J.-P. Lefèvre, B. Valeur, E. Gratton, M. vandeVen, N. D. Silva, Y. Engelborghs, K. Willaert, A. Sillen, G. Rumbles, D. Phillips, A. J. W. G. Visser, A. van Hoek, J. R. Lakowicz, H. Malak, I. Gryczynski, A. G. Szabo, D. T. Krajcarski, N. Tamai, and A. Miura, “Fluorescence lifetime standards for time and frequency domain fluorescence spectroscopy,” Anal. Chem.79, 2137–2149 (2007). [CrossRef] [PubMed]
  9. P. J. Steinbach, R. Ionescu, and C. R. Matthews, “Analysis of kinetics using a hybrid maximum-entropy/nonlinear-least-squares method: application to protein folding,” Biophys. J.82, 2244–2255 (2002). [CrossRef] [PubMed]
  10. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C++ - The Art of Scientific Computing (Cambridge University, 2002).
  11. T. Hauschild and M. Jentschel, “Comparison of maximum likelihood estimation and chi-square statistics applied to counting experiments,” Nucl. Instrum. Meth. A457, 384–401 (2001). [CrossRef]
  12. M. Maus, M. Cotlet, J. Hofkens, T. Gensch, F. C. De Schryver, J. Schaffer, and C. A. M. Seidel, “An experimental comparison of the maximum likelihood estimation and nonlinear least-squares fluorescence lifetime analysis of single molecules,” Anal. Chem.73, 2078–2086 (2001). [CrossRef] [PubMed]
  13. K. A. Walther, B. Papke, M. B. Sinn, K. Michel, and A. Kinkhabwala, “Precise measurement of protein interacting fractions with fluorescence lifetime imaging microscopy,” Mol. BioSyst.7, 322–336 (2011). [CrossRef] [PubMed]
  14. J. Tellinghuisen and C. W. Wilkerson, “Bias and precision in the estimation of exponential decay parameters from sparse data,” Anal. Chem.65, 1240–1246 (1993). [CrossRef]
  15. J. Fessler, “Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): applications to tomography,” IEEE Trans. Image Process.5, 493 –506 (1996). [CrossRef] [PubMed]
  16. J. Kim and J. Fessler, “Intensity-based image registration using robust correlation coefficients,” IEEE Trans. Med. Imag.23, 1430 –1444 (2004). [CrossRef]
  17. H. Van Trees, Detection, Estimation, and Modulation Theory, Part 1 (John Wiley & Sons, 2001).
  18. T. G., “A matrix extension of the cauchy-schwarz inequality,” Econ. Lett.63, 1–3 (1999). [CrossRef]
  19. P. Hall and B. Selinger, “Better estimates of exponential decay parameters,” J. Phys. Chem.85, 2941–2946 (1981). [CrossRef]
  20. K. J. Mighell, “Parameter estimation in astronomy with poisson-distributed data. I. the χγ2 statistic,” The Astrophys. J.518, 380 (1999). [CrossRef]
  21. J. Moré and D. Sorensen, “Computing a trust region step,” SIAM J. Sci. Comput.4, 553–572 (1983). [CrossRef]
  22. M. A. Branch, T. F. Coleman, and Y. Li, “A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems,” SIAM J. Sci. Comput.21, 1–23 (1999). [CrossRef]
  23. R. Byrd, R. Schnabel, and G. Shultz, “Approximate solution of the trust region problem by minimization over two-dimensional subspaces,” Math. Program.40, 247–263 (1988). [CrossRef]

Cited By

Alert me when this paper is cited

OSA is able to provide readers links to articles that cite this paper by participating in CrossRef's Cited-By Linking service. CrossRef includes content from more than 3000 publishers and societies. In addition to listing OSA journal articles that cite this paper, citing articles from other participating publishers will also be listed.


« Previous Article  |  Next Article »

OSA is a member of CrossRef.

CrossCheck Deposited