1. Introduction
We begin in Section 2 by deriving the Wiener estimator and the scanning linear estimator as solutions to an EMSE minimization. The imaging equation, a description of measurement noise in imaging systems, and the object model are presented in Section 3. The object is decomposed into a signal and a background term. The signal is parameterized by the object features we wish to estimate: the signal size, amplitude, and location. The background represents nuisance parameters. In addition to measurement noise, the signal and the background are also random. This leads to a triply stochastic treatment of the image data, which is described in Section 4. This section also defines the first- and second-order statistics of the data as well as a cross-covariance between the images and the parameters to be estimated. At this juncture we transition from theoretical forms of the estimation rules to the specifics of our simulation study. The models and statistics of the simulated objects are introduced in Section 5. Section 6 provides the details of our simulated imaging operator with a display of sample images, data covariances, cross-covariance, and the model-specific calculation of the scanning linear estimator. Finally, the performance of both estimators, for various cases of prior knowledge, is illustrated and discussed in Section 7. The conclusions are presented in Section 8.
2. Deriving the estimation rules
We will consider a digital image, written as an M×1 vector g, that has some general relationship to a set of parameters denoted by an N×1 vector called θ, with elements corresponding to the parameters to be estimated. An estimate of these parameters, calculated from an image, will be denoted by θ̂(g). The vector θ will be treated as a random variable drawn from an ensemble described by a probability density function (pdf) called pr(θ). The performance of the estimation scheme will be evaluated according to its average performance on the population, rather than for one specific object.
Throughout this paper, statistical averages will be written as

〈y(x)〉_{x} = ∫ d^{P}x y(x) pr(x), (1)

where x is a P×1 random vector with pdf pr(x), y(x) is a scalar-valued function of the random vector, and the subscript on the LHS conveys the ensemble used for the expectation. This notation allows for expectations with respect to multiple random quantities to be expanded into nested conditional averages in a short-hand form, such as

〈·〉_{g,θ} = 〈〈·〉_{g|θ}〉_{θ}. (2)
The performance of an estimation rule is conventionally quantified by measuring the distance between the estimate and the true value of the parameter. The mean-squared error (MSE) is a popular choice for quantifying this distance. The stochastic nature of this problem is accounted for by calculating the EMSE,

EMSE = 〈‖θ̂(g)−θ‖²〉_{g,θ} = tr〈(θ̂(g)−θ)(θ̂(g)−θ)^{t}〉_{g,θ}.
We have made use of the equality between an inner product of two vectors and the trace of their outer product. The trace, defined as the sum of the diagonal elements of a square matrix, is denoted tr. The EMSE calculation involves the familiar squared difference between estimated and actual values; however, because the estimation task can in general be multi-dimensional, the measure of difference is a vector norm.
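The inner-product/trace identity invoked above can be checked numerically; the values in this small sketch are illustrative, not quantities from the study.

```python
import numpy as np

# Check the identity used in the EMSE definition: for the error vector
# e = theta_hat - theta, ||e||^2 equals tr(e e^t).  Values are made up.
rng = np.random.default_rng(0)
theta = rng.normal(size=3)                          # "true" parameters
theta_hat = theta + rng.normal(scale=0.1, size=3)   # a hypothetical estimate

e = theta_hat - theta
sq_norm = e @ e                        # inner product (squared vector norm)
tr_outer = np.trace(np.outer(e, e))   # trace of the outer product
```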
Among all possible estimators, the mean of the posterior density pr(θ|g) minimizes the EMSE and is given by

θ̂_{PM}(g) = 〈θ〉_{θ|g} = ∫ d^{N}θ θ pr(θ|g), (3)

where the subscript PM denotes the posterior mean and N is the number of parameters to be estimated. This estimation rule is, in general, nonlinear with respect to g and requires sufficient knowledge of the posterior density to calculate or estimate the integral expression. Markov-chain Monte Carlo (MCMC) techniques can be used to estimate the posterior mean [14. J. Shao, Mathematical Statistics (Springer, 1999).], [15. M. Kupinski, J. W. Hoppin, E. Clarkson, and H. H. Barrett, "Ideal-observer computation in medical imaging with use of Markov-chain Monte Carlo techniques," J. Opt. Soc. Am. A 20, 430–438 (2003).], but the disadvantages include computational time and stringent requirements on prior knowledge.
In this paper, we will evaluate two estimation rules that minimize EMSE under different statistical conditions that simplify the form of the posterior mean.
2.1. Wiener estimation
When the joint pdf of the image data and the parameters, denoted pr(θ,g), is Gaussian, the posterior mean of Eq. (3) reduces to the linear Wiener estimator, given by [16. N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series with Engineering Applications (MIT Press, 1949).], [17. J. L. Melsa and D. L. Cohn, Decision and Estimation Theory (McGraw-Hill, 1978).]

θ̂_{W}(g) = θ̄ + K_{θ,g}K^{-1}_{g}(g−g̿), (4)

where the double overbar indicates averages over g and θ as in Eq. (2). The optimal EMSE is given by

EMSE_{W} = tr(K_{θ} − K_{θ,g}K^{-1}_{g}K^{t}_{θ,g}).
Here K_{θ,g}=〈(θ−θ̄)(g−g̿)^{t}〉_{g,θ} is the cross-covariance between the parameters and the data, which is a measure of the interdependence between these two random vectors. The cross-covariance is an N×M matrix, where N is the number of parameters to be estimated and M is the number of elements in an image. In general, the cross-covariance is non-square because the number of parameters to be estimated is usually much smaller than the dimensionality of an image. A given row of K_{θ,g} selects one particular element of θ and is a measure of how changes to this scalar quantity will change each of the M elements in the image data. The cross-covariance is an expectation with respect to the data as well as the parameter ensemble; hence the calculation requires knowledge of the pdf pr(θ). Calculating the Wiener estimator also requires knowledge of the first- and second-order statistics of the data g; the inverse of the data covariance K^{-1}_{g} and the average data g̿ appear in the estimation rule of Eq. (4).
Among all affine transforms of the data (defined as a linear term plus a constant), the Wiener estimator minimizes the EMSE; if the joint pdf is Gaussian, it minimizes the EMSE among all estimators. An analogy can be seen with detection tasks: the ideal observer requires full knowledge of the likelihood ratios, but restriction to optimal linear test statistics reduces the requirement on prior knowledge to first- and second-order data statistics [5. H. H. Barrett, "Objective assessment of image quality: effects of quantum noise and object variability," J. Opt. Soc. Am. A 7(7), 1266–1278 (1990).]. In both detection and estimation tasks, the optimal nonlinear performers, the ideal observer and the posterior mean, are equivalent to their respective linear reductions, the Hotelling observer and the Wiener estimator, when the statistics are Gaussian [5].
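The estimation rule of Eq. (4) can be illustrated with a toy jointly Gaussian model; the forward matrix, dimensions, and noise level in this sketch are assumptions, not the paper's system.

```python
import numpy as np

# Sketch of Wiener estimation from sample statistics.  A linear-Gaussian
# forward model g = H theta + noise makes pr(theta, g) jointly Gaussian,
# so the Wiener estimator is optimal among all estimators here.
rng = np.random.default_rng(1)
N, M, n_samp = 2, 16, 20000

H = rng.normal(size=(M, N))                          # hypothetical system
theta = rng.normal(size=(n_samp, N)) + 1.0           # parameter ensemble
g = theta @ H.T + 0.5 * rng.normal(size=(n_samp, M)) # noisy data

theta_bar = theta.mean(axis=0)
g_bar = g.mean(axis=0)
K_g = np.cov(g, rowvar=False)                        # M x M data covariance
K_tg = (theta - theta_bar).T @ (g - g_bar) / n_samp  # N x M cross-covariance

def wiener(g_new):
    """theta_hat = theta_bar + K_tg K_g^{-1} (g - g_bar), as in Eq. (4)."""
    return theta_bar + K_tg @ np.linalg.solve(K_g, g_new - g_bar)

theta_hat = np.array([wiener(gi) for gi in g])
emse_w = np.mean(np.sum((theta_hat - theta) ** 2, axis=1))
emse_prior = np.mean(np.sum((theta_bar - theta) ** 2, axis=1))
```

The comparison against `emse_prior` shows the reduction in EMSE relative to an estimator that simply returns the prior mean.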
2.2. Scanning-linear estimation
If the posterior density pr(θ|g) is unimodal and sufficiently symmetric about the mean, then the mean and mode of the distribution will be close. Calculating the mode of the posterior density (also called maximum a posteriori (MAP) estimation) requires a scan of parameter space to compare solutions and find the maximum. The general form of MAP estimation is

θ̂_{MAP}(g) = argmax_{θ} pr(θ|g) = argmax_{θ} [pr(g|θ)pr(θ)/pr(g)] = argmax_{θ} [ln pr(g|θ) + ln pr(θ)],
where the second equality comes from using Bayes' rule and the third because ln(·), the natural logarithm, is a monotonic transform and pr(g) is independent of θ. The likelihood of the data, conditioned on the parameters, is called pr(g|θ). In order to simplify the function to be optimized, we will approximate the likelihood by a Gaussian, but this condition does not imply that the joint pdf pr(g,θ) is also Gaussian. The approximate likelihood is

pr(g|θ) ≈ [(2π)^{M} det(K_{g|θ})]^{-1/2} exp[−½(g−g̿(θ))^{t}K^{-1}_{g|θ}(g−g̿(θ))], (7)
where g̿(θ) is the data averaged over all sources of randomness except the parameters to be estimated θ, K_{g|θ} is the covariance matrix for the random vector g−g̿(θ), and det(·) is the determinant of the matrix. When searching for the maximum of this Gaussian likelihood we would have to re-evaluate the covariance, its inverse, and its determinant at every candidate θ. Our second approximation is to average over this dependence, K_{g|θ} ≈ 〈K_{g|θ}〉_{θ} = K̄_{g}, which is equivalent to making a slowly-varying-parameter approximation.
The determinant and other scale factors in Eq. (7) are now independent of θ and can be ignored for the purposes of an extremum search. In order to avoid the effort of exponentiating, we operate on the natural logarithm of this approximate posterior density, which yields

ln[pr(g|θ)pr(θ)] ≈ −½(g−g̿(θ))^{t}K̄^{-1}_{g}(g−g̿(θ)) + ln pr(θ) + const.
After dropping terms that do not depend on θ, the scanning-linear estimator that maximizes the approximate posterior density is

θ̂_{SL}(g) = argmax_{θ} {g̿(θ)^{t}K̄^{-1}_{g}g − ½g̿(θ)^{t}K̄^{-1}_{g}g̿(θ) + ln pr(θ)}. (10)

The first term is a linear operation on the image data, given by a vector inner product with g̿(θ)^{t}K̄^{-1}_{g}. The next two terms are independent of the data but must be retained because of their dependence on θ. The scanning-linear estimator represents an increase in data-processing complexity compared to the Wiener estimator, or any other purely linear estimator: it operates linearly on the data, but the linear template itself is a general nonlinear function of the parameters. The estimation rule requires a search for the value of θ that maximizes this linear operation on the data. Once the data statistics and parameterized signal model for our present study have been introduced, the specific form of this estimator will be presented in Section 6.3.
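A minimal one-parameter sketch of this search, assuming a hypothetical Gaussian-bump data model, white noise (so K̄_{g} ∝ I), and a standard-normal prior on the parameter:

```python
import numpy as np

# One-dimensional sketch of Eq. (10): estimate a bump's center c from
# noisy samples by scanning candidate solutions.  All model details here
# (bump shape, noise level, prior) are illustrative assumptions.
rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 101)

def g_bar(c):
    """Mean data conditioned on the parameter: a Gaussian bump at c."""
    return np.exp(-((x - c) ** 2) / (2 * 0.1 ** 2))

c_true = 0.3
g = g_bar(c_true) + 0.2 * rng.normal(size=x.size)

K_inv = np.eye(x.size)                 # white noise: averaged covariance ~ I
log_prior = lambda c: -0.5 * c ** 2    # standard-normal prior (assumed)

candidates = np.linspace(-0.9, 0.9, 181)
objective = [g_bar(c) @ K_inv @ g
             - 0.5 * g_bar(c) @ K_inv @ g_bar(c)
             + log_prior(c) for c in candidates]
c_hat = candidates[int(np.argmax(objective))]
```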
3. Assumptions and models
3.1. The imaging operator
An object can be represented as a continuous function of space and written as f(r). The associated image is a finite number of measurements that may be organized into the vector g. This leads to the mathematical description of imaging as an operator that maps a function of continuous variables to a discrete vector [1. H. H. Barrett and K. J. Myers, Foundations of Image Science (Wiley-Interscience, 2004).]. If the relationship between the spatial distribution of the object and the noise-free image data is also linear, this mapping can be written as [1]

ḡ_{m} = ∫ d^{q}r h_{m}(r) f(r),  m = 1, …, M,
where r is a vector in q dimensions (e.g., q=2 or 3), and m is the index for the m^{th} measurement of the imaging system. This index can distinguish the various elements across a detector face. It can also be organized, in a lexicographical fashion, to include multiple cameras of tomographic systems or successive exposures of a temporal sequence of images. The function h_{m}(r) is the system sensitivity of the m^{th} measurement at the spatial coordinate r, and f(r) is the continuous description of the object. The overline in ḡ indicates that the data is averaged over the measurement noise in the imaging system. An alternative short-hand form represents the system as an operator that maps a function of a continuous variable to a finite-dimensional vector; now the imaging equation can be compactly written as

ḡ = ℋf.

Here, ℋ is a continuous-to-discrete linear operator, and f is an infinite-dimensional vector representing the object [1. H. H. Barrett and K. J. Myers, Foundations of Image Science (Wiley-Interscience, 2004).]. Because it is notationally compact, this form will be used for the remainder of the paper.
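The continuous-to-discrete mapping can be sketched in one spatial dimension; the indicator-function sensitivities h_m and the Riemann-sum quadrature below are assumptions made for illustration, not the paper's system model.

```python
import numpy as np

# Illustrative discretization of g_bar_m = integral h_m(r) f(r) dr in 1D.
# h_m is taken as an indicator over the m-th detector bin (an assumption);
# the integral is approximated by a Riemann sum on a fine object grid.
r = np.linspace(0.0, 1.0, 1000, endpoint=False)   # fine object grid
dr = r[1] - r[0]
f = np.sin(np.pi * r) ** 2                        # example object f(r)

M = 10                                            # number of measurements
g_bar = np.array([np.sum(f[(r >= m / M) & (r < (m + 1) / M)]) * dr
                  for m in range(M)])             # g_bar = H f
```

Because the bins partition the field of view, the total measured signal equals the object's integral for these sensitivity functions.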
3.2. The object model
The quantities we wish to estimate are the elements of a vector called θ. This vector quantifies features or attributes of the object that we would like to estimate, such as signal location, size, and amplitude. Consider an object equation given by

f(r;θ) = f^{sig}(r;θ) + f^{bkgnd}(r).

Here, an object f(r;θ) is constructed by adding together a signal term f^{sig}(r;θ) and a background term f^{bkgnd}(r). The object is a doubly stochastic random function because both the statistics on the signal and the statistics on the background determine the overall object statistics. The signal is a deterministic function of the random vector θ. The background is not a function of θ because it contains all of the object structure that is not relevant to the features we seek to estimate. We will proceed with a model of statistically independent signals and backgrounds. The data will be treated as triply stochastic: object fluctuations due to both signal and background variability, plus measurement noise.
3.3. Measurement noise
When an object is measured by an imaging system, there are fluctuations in the measured values. In a real imaging system, repeatedly acquiring data from the same object will not yield the same image each time. The system sensitivity functions are the deterministic portion of the imaging system, and the noise can be modeled by the addition of a stochastic term whose mean is zero,

g = ḡ(f) + n.
The vector n is a random perturbation to each data element, whose statistics may in general be object dependent. Many imaging systems operate as photon-counting devices, which leads to a Poisson probability law on the measurement noise of the image data; consequently, for a given object, the noise values in different detector elements are independent of one another. Therefore the noise covariance matrix is diagonal,

K_{n}(f) = Diag[ḡ(f)].

Here Diag(·) denotes a diagonal matrix with the vector argument as the diagonal elements, and ḡ(f) is the noise-free image of a given signal plus a given background.
We will later make use of the fact that the noise covariance, when averaged over an object ensemble, remains diagonal,

〈K_{n}(f)〉_{f} = Diag(g̿).

Here g̿ denotes the grand mean of the data, defined as the expected value with respect to all sources of randomness in the problem: noise, signal, and background. The object variabilities due to signal and background variations are both accounted for by the average over f.
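The diagonal structure of the Poisson noise covariance can be verified empirically for a toy three-pixel object; the pixel values below are assumed for illustration.

```python
import numpy as np

# For Poisson noise, the covariance of repeated images of one fixed
# object is diagonal with the mean image on the diagonal: K_n = Diag(g_bar).
rng = np.random.default_rng(3)
g_bar = np.array([5.0, 20.0, 50.0])          # noise-free image (made up)
imgs = rng.poisson(g_bar, size=(200000, 3))  # repeated noisy acquisitions

K_n = np.cov(imgs.T)                          # 3 x 3 sample noise covariance
```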
4. Image statistics
4.1. Conditional data averages
For the remainder of this paper, the vector g will denote a noisy image of a random background plus a random signal. A noise-free image of this random object is denoted by

ḡ(f) = ℋf,

and the grand mean is a triply stochastic average over detector noise, backgrounds, and signals,

g̿ = 〈〈〈g〉_{g|f}〉_{f^{bkgnd}}〉_{θ}.

Rather than adding a third overbar to the grand mean, we define a conditional doubly stochastic average of a noise-free image over just the random background as

g̿(θ) = 〈ℋf〉_{f^{bkgnd}|θ} = ℋf^{sig}(θ) + ℋf̄^{bkgnd}. (19)
4.2. Data covariance
A primary ingredient for the optimal linear estimator is the overall data covariance matrix K_{g}, defined by

K_{g} = 〈(g−g̿)(g−g̿)^{t}〉 = K̄_{n} + K^{sig}_{g} + K^{bkgnd}_{g},

where the average is over all sources of randomness. Here, the contribution from the background is

K^{bkgnd}_{g} = ℋK^{bkgnd}_{f}ℋ^{†},

where K^{bkgnd}_{f} is the covariance operator for the continuous random variable f^{bkgnd}, and ℋ^{†} is the adjoint of the continuous-to-discrete imaging operator introduced in Section 3.1. Similarly, the signal portion is

K^{sig}_{g} = ℋK^{sig}_{f}ℋ^{†}.

For Poisson statistics, the noise covariance is diagonal and is given by

K̄_{n} = Diag(g̿).

The covariance operators K^{sig}_{f} and K^{bkgnd}_{f} for the object model used in our study will be specified in Section 5.
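The three-term structure of K_{g} can be illustrated for a single detector element via the law of total variance; the signal and background distributions in this sketch are toy assumptions, not the study's models.

```python
import numpy as np

# One detector element of a triply stochastic measurement: random signal,
# random background, and Poisson noise.  The total variance splits into a
# noise term (the Poisson mean) plus the object-variability term.
rng = np.random.default_rng(4)
n = 300000

sig = rng.gamma(4.0, 0.5, size=n) + 4.5    # random signal contribution
bkg = 10.0 + rng.normal(0.0, 2.0, size=n)  # random background contribution
g = rng.poisson(sig + bkg)                 # noisy measurements

var_total = g.var()
var_noise = (sig + bkg).mean()             # Poisson: <var(g | object)> = mean
var_obj = (sig + bkg).var()                # signal + background variability
```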
4.3. Cross-covariance
The cross-covariance is needed to construct the linear Wiener estimation template. The cross-covariance, which expresses the sensitivity of the data to changes in the parameters, is calculated by

K_{θ,g} = 〈(θ−θ̄)(g−g̿)^{t}〉_{g,θ},

where g̿ is the grand-mean image. To simplify the cross-covariance expression, we begin by making use of the fact that the joint expectation can be performed by a conditional average of the data given a particular θ followed by an expectation with respect to the ensemble of θ. That is,

K_{θ,g} = 〈(θ−θ̄)(g̿(θ)−g̿)^{t}〉_{θ},

where g̿(θ) is the image average over noise and backgrounds, given in Eq. (19). Further simplification can be made by recognizing that the grand mean is independent of θ; thus the term 〈(θ−θ̄)g̿^{t}〉_{θ} = 0 and the cross-covariance is

K_{θ,g} = 〈(θ−θ̄)g̿(θ)^{t}〉_{θ}.

The system operator can be factored out of the expectation:

K_{θ,g} = 〈(θ−θ̄)[f^{sig}(θ)+f̄^{bkgnd}]^{t}〉_{θ} ℋ^{†}.

Here, f^{sig}(θ) is a signal parameterized by θ, f̄^{bkgnd} is the average background, and ℋ^{†} is the adjoint of the continuous-to-discrete imaging operator. This expression for the cross-covariance is advantageous for studies of system comparison; if the expectation can be evaluated prior to applying the imaging operator, then computational redundancy can be avoided. The background term is a nuisance parameter; it is independent of the parameters θ and therefore does not contribute to the cross-covariance, so

K_{θ,g} = 〈(θ−θ̄)f^{sig}(θ)^{t}〉_{θ} ℋ^{†}.
In the Wiener estimator expression of Eq. (4), the random background affects only the grand-mean image and the data covariance. The estimates will be biased for a particular object background, but this bias is designed to average to zero by subtracting the grand mean from the data. This is a primary feature of the way in which the optimal linear estimator, the Wiener estimator, treats nuisance parameters.
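A Monte Carlo sketch of the cross-covariance construction in one spatial dimension, with assumed priors and a hypothetical bump-shaped signal, evaluated in object space before any imaging operator is applied:

```python
import numpy as np

# Estimate K_{theta,g} by averaging (theta - theta_bar)(g_bar(theta))^t
# over parameter samples.  The bump model and priors are illustrative.
rng = np.random.default_rng(5)
x = np.linspace(-1, 1, 31)

def g_bar_theta(A, c):
    """Noise/background-averaged image of a bump with amplitude A at c."""
    return A * np.exp(-((x - c) ** 2) / 0.02)

n = 50000
A = rng.gamma(4.0, 0.5, size=n)           # amplitude prior (assumed)
c = rng.normal(0.0, 0.2, size=n)          # location prior (assumed)
imgs = np.array([g_bar_theta(Ai, ci) for Ai, ci in zip(A, c)])

theta = np.stack([A, c], axis=1)
K_tg = (theta - theta.mean(0)).T @ (imgs - imgs.mean(0)) / n  # 2 x 31
```

The amplitude row of the result is non-negative, matching the intuition discussed for Fig. 5(b): raising A never lowers a pixel.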
5. Simulated objects
Thus far, we have provided formulas that are entirely general with respect to the object’s signal and background description. To proceed with the calculation of the estimation rules, we must specify a functional form of the continuous object f(r) and the distribution of the object ensemble.
5.1. Parameters of interest: the signal model
The signal portion of the object describes the features we are interested in estimating from the image data. For the purposes of the simulations to follow, we confine our treatment to signals that are related to θ by the parameterized signal model, specified in two dimensions as

f^{sig}(r;θ) = [A / I_{cyl}(R)] cyl(r−c; R),

where

cyl(r; R) = 1 if |r| ≤ R, and 0 otherwise.

Here, the signal's shape is defined by a circle of radius R; A is a scalar quantity representing the integrated output of the circular signal; I_{cyl}(R) is the area of the circle, ∫dr cyl(r−c; R); and c is the location of the signal's center. The two-dimensional signal model is thus parameterized by a 4×1 vector θ=(A,R,c_{x},c_{y})^{t}. The signal's spatial distribution is uniquely specified by the vector θ.
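The circular signal model can be sketched on a pixel grid; the normalization by the circle's area follows the text, while the grid construction details are assumptions.

```python
import numpy as np

# Parameterized circular signal of Section 5.1 sampled on a square grid.
def f_sig(A, R, cx, cy, n=61, spacing=0.5):
    """Uniform disk of radius R at (cx, cy) with integrated output A."""
    half = (n - 1) / 2.0 * spacing
    xs = np.linspace(-half, half, n)
    X, Y = np.meshgrid(xs, xs)
    disk = ((X - cx) ** 2 + (Y - cy) ** 2 <= R ** 2).astype(float)
    area = np.pi * R ** 2            # I_cyl(R), the area of the circle
    return A * disk / area

img = f_sig(A=6.5, R=2.0, cx=0.0, cy=0.0)
total = img.sum() * 0.5 ** 2         # pixel sum times pixel area ~ A
```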
In order to reproduce signal variability, the parameter vector θ is treated as a random variable; consequently, our parameterized object model describes an ensemble of possible f^{sig}(r;θ) distributions. The first- and second-order statistics of this stochastic process are defined as

f̄^{sig}(r) = 〈f^{sig}(r;θ)〉_{θ}

and

K^{sig}_{f}(r,r′) = 〈[f^{sig}(r;θ)−f̄^{sig}(r)][f^{sig}(r′;θ)−f̄^{sig}(r′)]〉_{θ}.

Calculating these expectations requires prior knowledge of the distribution on the parameters, pr(θ). We will assume that the parameters to be estimated are statistically independent, so the multivariate joint probability density function is separable, leading to pr_{θ}(θ)=pr_{A}(A)pr_{R}(R)pr_{c}(c).
The signal amplitude is a value naturally constrained to be positive; thus, a shifted gamma distribution is chosen to ensure positivity. The shape of the gamma distribution is determined by the amount of shift A_{0} and two independent parameters, denoted α_{A} and β_{A}:

pr_{A}(A) = (A−A_{0})^{α_{A}−1} exp[−(A−A_{0})/β_{A}] / [β_{A}^{α_{A}} Γ(α_{A})],  A ≥ A_{0}.

The mean and variance are functions of these parameters and are given by Ā = α_{A}β_{A} + A_{0} and σ²_{A} = α_{A}β²_{A}. The mean value of the signal amplitude was determined by constraining the contrast with respect to the background to be within a realistic range. A normalized histogram of 10,000 random samples drawn from this gamma distribution, with α_{A}=4, β_{A}=1/2, and A_{0}=4.5, is superimposed on the theoretical pdf in Fig. 1(a). A shifted-gamma distribution was also used to describe the variability of the signal radius R.
Fig. 1. The probability density function (red line) is superimposed on a histogram of 10,000 samples.
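The amplitude prior can be sampled with numpy's gamma generator. The values α_A = 4 and A_0 = 4.5 follow the text; β_A = 1/2 is used here so that the variance α_Aβ_A² equals the value of 1.0 for pr(A) quoted in Section 7.2.

```python
import numpy as np

# Sampling the shifted-gamma amplitude prior: A = A0 + Gamma(alpha, beta).
rng = np.random.default_rng(6)
alpha_A, beta_A, A0 = 4.0, 0.5, 4.5
A = A0 + rng.gamma(alpha_A, beta_A, size=100000)

mean_A = A.mean()   # should approach alpha_A * beta_A + A0 = 6.5
var_A = A.var()     # should approach alpha_A * beta_A**2 = 1.0
```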
The prior distribution on the location of the signal is a multivariate uncorrelated Gaussian centered at the origin. In our simulation study, the linear dimension of the object space field of view is 30 mm, and the width parameter of the Gaussian distribution is set to σ_{c}=4 mm, which limits the probability of signal-absent images.
5.2. Nuisance parameters: the background model
Each realization of the background is modeled as a uniform term plus a sum of randomly placed lumps within the support,

f^{bkgnd}(r) = S(r)[A_{b} + Σ_{n=1}^{L} l(r−r_{n})]. (38)

The function S(r) represents the support region, which is unity within the imaging system's FOV and zero otherwise; A_{b} is the uniform contribution to the background; and l(r) is the functional description of each lump. The total number of lumps and the location of each lump are the random quantities that determine the stochastic nature of this background model. The number of lumps within each realization of the background, denoted L in Eq. (38), is a Poisson-distributed random variable.
Fig. 2. Samples of 2D Gaussian lumpy backgrounds.
The centroid of each lump, r_{n}, is uniformly distributed throughout the FOV. This uniformity leads to a mean lumpy background that is constant within the support region and given by

f̄^{bkgnd}(r) = S(r)[A_{b} + L̄I_{l}/V_{FOV}].

Here, I_{l} is the area of a single lump, ∫dr l(r); L̄ is the average number of lumps; and V_{FOV} is the volume of the FOV.
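Generating one realization of the Gaussian lumpy background of Eq. (38) on the study's 61×61 grid can be sketched as follows; the values of L̄, σ, b_0, and A_b are illustrative, not the paper's.

```python
import numpy as np

# One lumpy-background sample: Poisson number of Gaussian lumps with
# uniformly distributed centers, added to a uniform term A_b.
rng = np.random.default_rng(7)
n, spacing = 61, 0.5
half = (n - 1) / 2.0 * spacing
xs = np.linspace(-half, half, n)
X, Y = np.meshgrid(xs, xs)

def lumpy_background(L_bar=20, sigma=3.0, b0=1.0, A_b=5.0):
    bkg = np.full((n, n), A_b)
    L = rng.poisson(L_bar)                         # random number of lumps
    for _ in range(L):
        cx, cy = rng.uniform(-half, half, size=2)  # uniform lump centers
        bkg += b0 * np.exp(-((X - cx) ** 2 + (Y - cy) ** 2)
                           / (2 * sigma ** 2))
    return bkg

b = lumpy_background()
```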
The autocovariance is a second-order statistic that requires evaluating the expectation of the random process at two distinct points. In general, the autocovariance of a stochastic process is a function of this pair of points
The lumpy background model is a special case of a filtered Poisson point process [1. H. H. Barrett and K. J. Myers, Foundations of Image Science (Wiley-Interscience, 2004).]. In the theoretical limit of infinite support, equivalent to the case of S(r)=1 everywhere, the first- and second-order statistics of the lumpy background model satisfy stationarity conditions. In practice, stationarity is inhibited by the effect of finite support. The autocovariance for a lumpy background within a finite FOV retains the hallmark of stationarity, being functionally dependent on the difference between a pair of points, but the support region truncates the otherwise stationary function:

K^{bkgnd}_{f}(r,r′) = S(r)S(r′)(L̄/V_{FOV})[l⊗l](r−r′),

where [l⊗l](r−r′) represents the autocorrelation of the lump function evaluated at r−r′. In this study, the lump function l(r) is chosen to be Gaussian, l(r) = b_{0} exp(−r²/2σ²). The width parameter σ and peak value b_{0} of each Gaussian are selected to produce lumps that are broad and low in amplitude relative to the signal. Nine samples from this background ensemble are evaluated on a discrete planar grid and displayed in Fig. 2.
The first- and second-order statistics of this Gaussian lumpy background model are
and
Fig. 3. Noise-free images generated by sampling from the ensemble of possible signals and backgrounds.
These first- and second-order statistics have been calculated using the distribution on lump location and number of lumps; note that the pdf pr(f
^{bkgnd}) has not been specified and is not necessarily Gaussian.
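The peak of the Gaussian-lump autocorrelation, [l⊗l](0) = πσ²b₀², can be checked by numerical quadrature; the grid bounds and sampling below are chosen only for this check.

```python
import numpy as np

# Numerical check: correlating a 2D Gaussian lump with itself at zero lag
# gives integral of l(r)^2 over the plane, which equals pi * sigma^2 * b0^2.
sigma, b0 = 1.0, 2.0
xs = np.linspace(-8, 8, 321)
dx = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs)
l = b0 * np.exp(-(X ** 2 + Y ** 2) / (2 * sigma ** 2))

peak = np.sum(l ** 2) * dx ** 2     # [l (x) l](0) by Riemann sum
```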
6. Simulated images and image statistics
Typically, a simulation study requires a careful trade-off between the realism of the model and the computational complexity of the implementation. The simulation of real imaging systems is usually informed by knowledge of the acquisition geometry and/or measurement data from phantoms designed to probe the system's performance. Our goal, however, is to evaluate the performance of the proposed estimators in very simplistic test cases. Our simulated imaging system is overly optimistic in modeling the transfer of information from object to image. We will demonstrate some cases for which linear estimation fails even in this simulation, thereby providing evidence that it would not work in a realistic imaging context. The scanning-linear estimator will also be evaluated for this simple case, and the successful results motivate future work beyond the current simulation.
Consider a sampling operator given by

ḡ_{m} = [ℋf]_{m} = f(r_{m}), (44)

where f(r) is a planar object, each r_{m} is a spatial coordinate on an evenly spaced grid, and m is a lexicographic ordering index. In our study, the planar object is sampled on a 61×61 grid of 0.5 mm spacing, resulting in a simulated image with a linear FOV of 30 mm in each direction.
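The sampling operator of Eq. (44) is straightforward to sketch on the study's grid; the example object below is an arbitrary smooth function, not one of the paper's objects.

```python
import numpy as np

# Sampling imaging operator: each measurement is the object evaluated at
# a grid point, returned in lexicographic order (61 x 61, 0.5 mm spacing).
n, spacing = 61, 0.5
half = (n - 1) / 2.0 * spacing        # 15 mm half-width: a 30 mm linear FOV
xs = np.linspace(-half, half, n)
X, Y = np.meshgrid(xs, xs)

def H(f):
    """Continuous-to-discrete map: sample f(x, y) at the grid points."""
    return f(X, Y).ravel()

g_bar = H(lambda x, y: np.exp(-(x ** 2 + y ** 2) / 8.0))
```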
To convey a sense of the amount of signal and background variability present in the simulated data, nine realizations of noise-free data are displayed in Fig. 3. These images are generated by drawing samples from the signal and background ensembles described in Section 5.1 and Section 5.2, respectively.
6.1. Calculating the cross-covariance
The expectation of the product of the parameterized signal model and the parameter vector, introduced in Eq. (31), is most naturally calculated one row at a time, where the decomposition is given by

〈(θ−θ̄)f^{sig}(θ)^{t}〉_{θ} = [〈(A−Ā)f^{sig}(θ)^{t}〉_{θ};
〈(R−R̄)f^{sig}(θ)^{t}〉_{θ};
〈(c_{x}−c̄_{x})f^{sig}(θ)^{t}〉_{θ};
〈(c_{y}−c̄_{y})f^{sig}(θ)^{t}〉_{θ}].
Fig. 4. Noisy images generated by sampling from the ensemble of possible signals and backgrounds, and adding Poisson noise.
Here, the definition of the vector θ≡[A,R,c_{x},c_{y}]^{t} has been used, and each row is an infinite-dimensional vector in object space. Consider the functional form of the row corresponding to signal amplitude, 〈(A−Ā)f^{sig}(r;θ)〉_{θ}.
The results of this cross-covariance calculation are displayed in Fig. 5, where each row is displayed as a planar image. Evaluating K_{θ,g} at a given row selects one element of the vector θ and measures the cross-covariance between each image pixel and the selected parameter. Viewing the data in this fashion allows for visual interpretation of how the Wiener estimator uses the image data. The cross-covariance for signal amplitude A is provided in Fig. 5(b) and is non-negative for all pixels because an increase in the parameter A would either increase or leave unchanged each pixel value of the image. This template is circularly symmetric about the center of the image, which is the average signal location. The cross-covariance for the signal radius is displayed in Fig. 5(a). Adjacent negative and positive lobes are features common to edge-detection algorithms, and the Wiener radius template also has this type of distribution. The radius template is circularly symmetric and centered about the average signal location. The templates for location estimation, Figs. 5(c) and 5(d), resemble a centroid, or center-of-mass, calculation that tapers off at the edge of the image, where the signal is less likely to be located.
Fig. 5. Rows of the matrix K_{θ,g}, the cross-covariance between the data and the parameters to be estimated.
6.2. Calculating the data covariance
A data covariance matrix can be difficult to evaluate due to finite data storage. If M measurements are made by the imaging system, the covariance matrix is M×M. For tomographic systems, temporal systems, or any system that collects large amounts of data, explicit storage of each covariance element may not be practical. Potentially useful solutions include a decomposition of a sample covariance matrix [19. H. H. Barrett, K. J. Myers, B. Gallas, E. Clarkson, and H. Zhang, "Megalopinakophobia: its symptoms and cures," Proc. SPIE 4320, 299–307 (2001).] or projection onto a lower-dimensional orthonormal set (i.e., channelizing the data). The imaging operator ℋ of Eq. (44) samples the continuous function on a grid of 61×61 points; consequently, the overall data covariance matrix K_{g} is only 61²×61², and storage requirements are minimal.
where r_{m} is a point in object space. This integral expression is evaluated using the same Monte Carlo and Gaussian-quadrature hybrid method employed for the cross-covariance calculation. We drew 100,000 samples from the radius distribution; the integral with respect to the signal center c was evaluated by quadrature at these samples, and the sample mean was used as an unbiased estimator of the integral's true value. The sample size of 100,000 was chosen by adding 10,000 samples at a time until the maximum change in the estimates was less than one tenth of a percent.
The organization of each covariance element into a matrix requires lexicographically ordering the rows and columns of an image into one index. Then, one row (or, equivalently, one column, since all covariance matrices are symmetric) of K_{g} selects a given pixel location, and the M values of this row are the covariances between the selected pixel and all other pixels in the image. The covariance values could also be organized into a function of four indices, the row and column designations for the pair of pixels at which the covariance is evaluated. With this type of organization, the covariance would be written as K_{g}(m,n,m′,n′), where (m,n) is the row and column designation of one pixel and (m′,n′) is the designation of the other. In this fashion, K_{g}(m,n,:,:) is a J×J matrix, where J²=M and the colon denotes evaluating the function at all indices. This matrix can be viewed as an image in data space that expresses the covariance of the (m,n) pixel with each of the others. Such an image peaks at the point K_{g}(m,n,m,n) because this is the variance of the (m,n) pixel.
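The four-index view of the covariance is a simple reshape; the correlated toy data below (small J for speed) is an assumption used only to demonstrate the indexing.

```python
import numpy as np

# View one row of an M x M covariance matrix as a J x J image,
# K_g(m, n, :, :), with J^2 = M.
rng = np.random.default_rng(8)
J = 8
samples = rng.normal(size=(5000, J * J))
samples += 0.5 * rng.normal(size=(5000, 1))   # shared correlated component
K_g = np.cov(samples, rowvar=False)           # M x M, with M = 64

K4 = K_g.reshape(J, J, J, J)                  # four-index organization
m, n = 3, 5
cov_img = K4[m, n]                            # J x J image for pixel (m, n)
```

The resulting image peaks at (m, n), the variance of the selected pixel, as described above.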
6.3. Calculating the scanning linear estimator
In Section 2.2, we introduced a scanning-linear estimator derived from MAP estimation with two critical approximations to the data likelihood: a Gaussian distribution and a covariance independent of θ. In this section, we will use the functional form of the object model introduced thus far to derive the task-specific scanning-linear estimator.
The first assumption is that the likelihood of the data conditioned on the parameters, called pr(g|θ), is Gaussian. For a majority of realistic imaging problems, the noise-free data will not be uniquely determined by the parameters to be estimated because of nuisance parameters. In these circumstances, the likelihood is given by a marginalization over the nuisance parameters,

pr(g|θ) = ∫ db pr(g|θ; b) pr(b), (50)
Fig. 6. The three terms that contribute to the triply stochastic data covariance K_{g}.
where b = ℋf^{bkgnd} is the noise-free image of a background. The measurement noise is described by pr(g|θ;b), the probability of an image when the object is fixed. Maximum-likelihood estimation requires numerous evaluations of the integral expression in Eq. (50) to find the value of θ that maximizes the likelihood. This search, and the attendant integral evaluations, must then be repeated for each image g. Avoiding this calculation is the utility of assuming a Gaussian-distributed likelihood. The mean of the approximated likelihood is simply the data averaged over backgrounds and noise but conditioned on θ,

g̿(θ) = s(θ) + b̄,

where s(θ) is the noise-free image of a signal parameterized by θ, and b̄ is the noise-free image of the mean background. The conditional covariance contains only terms for the noise and the background,

K_{g|θ} = K_{n}(θ) + K^{bkgnd}_{g}.
The second approximation is to replace the conditional covariance with the ensemble-averaged covariance to avoid the functional dependence on θ. These matrices differ only in the noise term, so replacing the conditional covariance by its average is equivalent to making a slowly-varying-signal approximation in the noise covariance:

K_{g|θ} ≈ 〈K_{g|θ}〉_{θ} = Diag(g̿) + K^{bkgnd}_{g} = K̄_{g}.
Substituting these expressions into Eq. (10) shows that the scanning-linear estimator is calculated by optimizing the function

θ̂_{SL}(g) = argmax_{θ} {[s(θ)+b̄]^{t}K̄^{-1}_{g}g − ½[s(θ)+b̄]^{t}K̄^{-1}_{g}[s(θ)+b̄] + ln pr(θ)}.

The entire argument within argmax{·} is called the objective function, and the scanning-linear estimate is the one that maximizes this objective function. For this problem, the objective function is a continuous scalar-valued function of the parameters to be estimated as well as a linear transform of the image data. Therefore, the optimization is performed with respect to the continuous vector θ for each image g.
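An end-to-end sketch of this search for a disk signal on a uniform mean background, with a white averaged covariance and a flat prior assumed for brevity (the grid size, amplitudes, and background level are illustrative, not the study's values):

```python
import numpy as np

# Scanning-linear search over candidate (A, c_x, c_y) for a sampled disk
# signal in Poisson noise, maximizing the objective of this section.
rng = np.random.default_rng(9)
n, spacing, R = 21, 0.5, 2.0
half = (n - 1) / 2.0 * spacing
xs = np.linspace(-half, half, n)
X, Y = np.meshgrid(xs, xs)

def s(A, cx, cy):
    """Noise-free signal image: sampled uniform disk, integrated output A."""
    disk = ((X - cx) ** 2 + (Y - cy) ** 2 <= R ** 2).astype(float)
    return A * disk / (np.pi * R ** 2)

b_bar = 2.0 * np.ones((n, n))              # mean background image (assumed)
A_true, c_true = 40.0, (1.0, -1.5)
g = rng.poisson(s(A_true, *c_true) + b_bar).astype(float)

best, theta_hat = -np.inf, None
Kinv = 1.0 / b_bar.mean()                  # white K_bar approximation
for A in np.linspace(20, 60, 9):
    for cx in xs:
        for cy in xs:
            m = s(A, cx, cy) + b_bar       # candidate mean data
            obj = Kinv * (m.ravel() @ g.ravel()
                          - 0.5 * m.ravel() @ m.ravel())
            if obj > best:
                best, theta_hat = obj, (A, cx, cy)
```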
7. Results
7.1. Estimating signal location, radius, and amplitude
The signal location, radius, and amplitude are all random variables, and the task is to estimate these quantities from noisy image data acquired when the signals are embedded in a Gaussian lumpy background.
7.1.2. Scanning linear results
A thorough search of θ space is not possible due to its continuous nature; instead, possible solutions are discretized to a grid of approximately 1.2 million points. These points are generated by spacing the location solutions one pixel apart, for a total of 61×61 locations. The radius is incremented by one pixel for a total of 16 candidate radii. Twenty amplitude solutions are evenly spaced over a plausible range. These candidate θ solutions are evaluated for 300 sample images. For each image, the maximum returned by the discrete search occurs at a point neighboring the true parameter values. Examples are shown in Fig. 8, where a cross-section of the objective function is paired with the corresponding image data. The radius and amplitude are held fixed near their true values while the location is scanned across each pixel, and the scalar value of the objective function at these points populates the displays in Fig. 8.
Fig. 7. Scatter plots of the Wiener estimates versus the true parameter values for each element of θ=[A,R,c_{x},c_{y}]. The red line indicates perfect estimation performance, the actual estimates are blue points, and the black line shows θ̄, the prior mean.
7.2. Radius of signal known, location and amplitude to be estimated
In this task, the radius of each signal is fixed at R = 2 mm, which corresponds to a diameter of 8 pixels. Ten thousand signals are sampled from the ensemble pr(θ), added to samples from the lumpy background distribution, and noisy image data are created. The scanning-linear estimator is not implemented for this task; its results are expected to be redundant because it already performed well on the more difficult task in which the radius was unknown.
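The lumpy background samples mentioned here can be generated as a Poisson-distributed number of Gaussian lumps at uniformly random positions. A minimal 2-D sketch; the parameter values are illustrative assumptions, not those of the paper's simulation:

```python
import numpy as np

def gaussian_lumpy_background(shape=(64, 64), mean_lumps=50,
                              lump_strength=1.0, lump_width=4.0, rng=None):
    """Sample a 2-D Gaussian lumpy background: a Poisson-distributed
    number of Gaussian lumps centered at uniformly random positions.
    Parameter values are illustrative, not the paper's."""
    rng = np.random.default_rng() if rng is None else rng
    ny, nx = shape
    b = np.zeros(shape)
    y, x = np.mgrid[0:ny, 0:nx]
    n_lumps = rng.poisson(mean_lumps)
    for _ in range(n_lumps):
        cy, cx = rng.uniform(0, ny), rng.uniform(0, nx)
        b += lump_strength * np.exp(-((x - cx) ** 2 + (y - cy) ** 2)
                                    / (2 * lump_width ** 2))
    return b
```

A noisy image for the study would then be formed by adding a sampled signal to such a background, passing the object through the imaging operator, and adding measurement noise.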
Fig. 8. The objective functions scanned over solutions for signal locations are paired with the noisy image data.
7.2.1. Wiener results
The location and amplitude are estimated from each image g, and the resulting sample EMSE is calculated for each element of the vector θ̂(g) = [Â, ĉ_x, ĉ_y]. We are interested in the effect that prior knowledge of the signal location has on the EMSE of the amplitude estimates. The prior distribution on location is a two-dimensional Gaussian; hence shrinking the standard deviation σ_c corresponds to less variability in the location of the signals. The sample EMSE of the amplitude estimate as it varies with the standard deviation of signal locations is plotted in Fig. 10(a). It is to be expected that, when the signal location varies less, the amplitude estimates should improve. This is confirmed by the simulation study, but the effect is not substantial.
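The Wiener estimates evaluated here are affine in the image data. A minimal sketch of the computation θ̂(g) = θ̄ + K_{θ,g} K^{-1}_{g} (g − ḡ), under the assumption that the prior mean θ̄, data mean ḡ, cross-covariance K_{θ,g}, and data covariance K_g are available (in the study they follow from the simulation models):

```python
import numpy as np

def wiener_estimate(g, g_bar, theta_bar, K_theta_g, K_g):
    """Affine Wiener estimate: theta_bar + K_theta_g K_g^{-1} (g - g_bar).

    g, g_bar : M-vectors (image data and its mean)
    theta_bar: N-vector (prior mean of the parameters)
    K_theta_g: N x M cross-covariance between parameters and data
    K_g      : M x M data covariance
    """
    # Solve K_g x = (g - g_bar) rather than forming the explicit inverse.
    x = np.linalg.solve(K_g, g - g_bar)
    return theta_bar + K_theta_g @ x
```

Solving the linear system instead of inverting K_g is the standard numerically stable choice when M is large.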
The smallest σ_c included in the study is 1.0 mm (2 pixels), and the estimation performance for this case is plotted in Fig. 11. The location is estimated rather accurately, as shown in Figs. 11(b) and 11(c), but the amplitude estimation given in Fig. 11(a) is not trending well with the data. The sample EMSE of these amplitude estimates is 0.7550 when σ_c is 1.0 mm (2 pixels), which can be read off of Fig. 10(a). Because the variance of pr(A) is 1.0, the ratio of Wiener EMSE to intrinsic variance is 75%, indicating only a 25% reduction in EMSE compared to an estimator that simply returns the average amplitude. The Wiener location estimator has a sample EMSE of 0.10 when σ_c is 1.0 mm (2 pixels), so the ratio of sample EMSE to the variance of pr(c) is 10%. This low ratio reflects the strong performance seen in Figs. 11(b) and 11(c). Instead of looking at scatter plots for a range of location variability, the ratio of the sample EMSE of the x-location estimate to the prior variance σ²_c is plotted in Fig. 10(b). This measure of performance indicates that, when the signal location is distributed throughout the FOV, linear estimation is not good at finding it; when σ_c is 4.0 mm (8 pixels), the EMSE nearly equals the prior variance σ²_c, and the performance ratio is close to one. However, when the standard deviation of pr(c) is 1.50 mm (3 pixels), the Wiener EMSE is reduced by about 75% compared to an estimate that merely guesses the average location. These results suggest that, although Wiener estimation cannot locate signals of known shape when the location variability is large, it has potential when the location is known to be within a specific region.
Fig. 9. Scatter plots of the scanning linear estimates versus the true parameter values for each element of θ = [A, R, c_x, c_y]. The red line indicates perfect estimation performance and the actual estimates are blue points.
Fig. 10. (a): The Wiener EMSE of the estimated signal amplitude as a function of the standard deviation of the pdf on signal location. (b): The Wiener EMSE of the estimated location, divided by the variance of the pdf on signal location, as a function of the width of the pdf on signal location.
Fig. 11. Scatter plots of the Wiener estimates versus the true parameter values for each element of θ = [A, c_x, c_y] when the radius is fixed at 2 mm (4 pixels). The red line indicates perfect estimation performance, and the actual estimates are blue points.
7.3. Location and radius of signal known, amplitude to be estimated
This study seeks to estimate the amplitude of a signal when the location and shape are both known, so that the vector of parameters to estimate reduces to a scalar, θ = A. The radius is fixed at 2 mm, and the signal is always located at the center of object space. As in previous simulations, 10,000 of these signals with varying amplitude are embedded in randomly selected Gaussian lumpy backgrounds, and noisy images are generated.
7.3.1. Wiener results
To provide an understanding of the calculation done by the Wiener estimator, a central slice of the matrix K_{A,g} K^{-1}_{g} is plotted in Fig. 13, superimposed with a slice of the average signal. These functions have been normalized by their respective maximum values to accentuate their regions of overlap. The estimation template resembles a high-pass filtered version of the signal because this is precisely the action of the matrix multiplication by K^{-1}_{g}, also called the pre-whitening step.
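The high-pass character of the pre-whitened template can be reproduced in a toy 1-D example: applying the inverse of a spatially correlated covariance to a smooth signal profile yields a template that stays peaked at the signal center but develops negative sidelobes. A sketch under assumed covariance and signal models (not the paper's):

```python
import numpy as np

n = 31
i = np.arange(n)
# Smooth Gaussian signal slice centered at pixel 15 (assumed, for illustration).
s = np.exp(-((i - 15) ** 2) / (2 * 3.0 ** 2))
# Exponentially correlated covariance, standing in for a lumpy-background K_g.
K_g = np.exp(-np.abs(i[:, None] - i[None, :]) / 5.0)
# Pre-whitening step: template = K_g^{-1} s.
template = np.linalg.solve(K_g, s)
# The template remains peaked where the signal peaks, but acquires negative
# sidelobes in the tails, i.e. it behaves like a high-pass filtered signal.
```

The negative sidelobes are what allow the template to suppress the correlated background while preserving response to the signal.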
7.3.2. Scanning linear results
For the case of estimating a single scalar parameter that is linearly related to the object, an approximation to the scanning linear estimator can be found in closed form by setting the gradient of the objective function to zero.
Fig. 12. Scatter plot of true versus Wiener estimates of amplitude when the signal shape and location are known a priori. The red line indicates perfect estimation performance, and the actual estimates are blue points.
Fig. 13. The linear template used on image data (black line) and the average signal (red line) are normalized by their respective maximum values to accentuate their regions of overlap.
Consider Eq. (55) with the scalar amplitude A substituted for θ, where s(A) is the noise-free image of a signal with amplitude A. The functional dependence between signal and amplitude is simply linear, s(A) = A s̃, where s̃ is a normalized signal image. This substitution yields the objective function A s̃^{T} K^{-1}_{g} (g − b̄) − (A²/2) s̃^{T} K^{-1}_{g} s̃ + ln pr(A), where b̄ is the mean background image.
Without the prior pr(A), this objective function is quadratic in A, and the maximization can be carried out by setting its gradient to zero. This modification is akin to maximum-likelihood (ML) estimation; the estimator maximizes the Gaussian-approximated likelihood under the slowly-varying covariance assumption of Eq. (54). If the prior is ignored, the remaining terms are maximized at the estimate Â(g) = s̃^{T} K^{-1}_{g} (g − b̄) / (s̃^{T} K^{-1}_{g} s̃), which is unbiased: ⟨Â(g)⟩_{g|A} = A.
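Setting the gradient of the prior-free quadratic objective to zero gives a closed-form amplitude estimate of the form s̃^{T} K^{-1}_{g} (g − b̄) / (s̃^{T} K^{-1}_{g} s̃). A minimal numerical sketch, where b̄ standing for the mean background image is an assumption about the paper's notation:

```python
import numpy as np

def ml_amplitude(g, s_tilde, b_bar, K_g):
    """Amplitude estimate maximizing the prior-free quadratic objective:
    A_hat = s~^T K_g^{-1} (g - b_bar) / (s~^T K_g^{-1} s~)."""
    w = np.linalg.solve(K_g, s_tilde)   # pre-whitened template K_g^{-1} s~
    return float(w @ (g - b_bar)) / float(w @ s_tilde)
```

Applying the estimator to the mean data A s̃ + b̄ returns A exactly; because the estimator is linear in g, this is equivalent to unbiasedness under zero-mean fluctuations.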
Fig. 14. Scatter plot of true versus scanning linear estimates of amplitude when the signal shape and location are known a priori. The red line indicates perfect estimation performance, and the actual estimates are blue points.
8. Conclusions
In this paper, we have described and evaluated two estimation procedures designed to minimize the EMSE under distinct statistical circumstances. The estimators' performance was evaluated for tasks of estimating signal location, amplitude, and radius from noisy data sets that include nuisance parameters. In our computation, the nuisance parameters were modeled using the Gaussian lumpy background model, and the signal was parameterized as a circle with a random centroid, radius, and amplitude.
The estimation task was modified to simulate various cases of prior knowledge. We measured the EMSE of the Wiener estimator when the radius was fixed, and the signal’s location variability was decreased in steps approaching the limit of the location known exactly. Even for our simplified imaging model, the Wiener estimator did not perform well when the signal location was unknown. This result indicates a fundamental limitation on linear estimation. For instances when the location and size of the signal are both fixed, the Wiener estimator is a potentially useful linear estimation procedure for image-quality evaluation.
The metric optimized by the scanning linear estimator (see Eq. (55)) offers an interesting connection to detection tasks. The first term of the objective function, derived from an approximate likelihood, is also known as the Hotelling discriminant function. In a detection study, the scalar value of the Hotelling discriminant is used as a threshold to classify signal-absent from signal-present images [11] (H. H. Barrett and K. J. Myers, Foundations of Image Science, Wiley-Interscience, 2004). Instead of thresholding the output of the Hotelling discriminant function, we have used it as an approximate likelihood to be maximized.