1 Introduction
In recent years, the problem of forming bio-medical images from sparse observations of diffuse photon density wave (DPDW) data has received considerable attention [
1. S. R. Arridge, “Optical tomography in medical imaging,” Inverse Problems 15, R41–R93 (1999).
3. S. J. Norton and T. Vo-Dinh, “Diffraction tomographic imaging with photon density waves: an explicit solution,” J. Opt. Soc. Am. A 15, 2670–2677 (1998).
4. M. O’Leary, D. Boas, B. Chance, and A. Yodh, “Experimental images of heterogeneous turbid media by frequency domain diffusion photon tomography,” Opt. Lett. 20, 425–428 (1995).
5. T. J. Farrell, M. S. Patterson, and B. Wilson, “A diffusion theory model of spatially resolved, steady state diffuse reflectance for the non-invasive determination of tissue optical properties in vivo,” Med. Phys. 19, 879–888 (1992).
]. A common goal of such processing is the detection, localization, and characterization of regions in the body, which we refer to as anomalies, whose optical properties (space-varying absorption coefficient, µ_a(r), and scattering coefficient, µ_s(r)) differ from those of the surrounding tissue. By extracting such information about anomalies from the DPDW data, one can potentially draw medically useful conclusions related to the flow of blood in an area, the presence of a tumor, etc.
A typical approach to these problems is the “image-then-detect” method, in which the first step is to use measured DPDW data to form an image in two dimensions, or a volumetric rendering in 3-D, of a subcutaneous region of interest. This step is followed by image postprocessing via standard techniques to extract and analyze anomalous areas. A major difficulty with these methods is the need to first form a pixel-by-pixel reconstruction of the underlying region, as this step requires the solution of an inverse scattering problem. The inverse scattering problem is highly ill-posed in the sense that small changes in the data can have dramatic, adverse effects on the estimated solution; no matter which model (Born, Rytov, exact, etc.) is used to approximate the forward scattering problem, the computed solution will be hopelessly contaminated by noise artifacts unless some type of regularization is employed. Any number of regularization techniques is possible [
7. P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems (SIAM, Philadelphia, 1998).
].
A further difficulty with standard approaches is that a small amount of measured DPDW data is used to recover a preliminary image comprising a large number of pixels. Our approach to the imaging problem is fundamentally different: we parameterize the problem directly in terms of a small number of parameters that describe the shape, location, and contrast of the anomaly, as well as the value of the background. As a result, we are able to localize anomalies directly, without postprocessing. This formulation requires the minimization of a functional with respect to only a small number of parameters (relative to the total number of pixels reconstructed), ensuring the efficiency of the method.
The remainder of this paper is organized as follows. In §2, the mathematical model for the diffuse photon density wave system of interest is presented. We describe our algorithm in §3 and provide numerical examples in §4. Finally, we discuss conclusions and future work in §5.
2 Models
2.1 Forward DPDW Model
In this paper, we consider information extraction algorithms based on the first-Born approximation to the frequency-domain integral equation formulation of the diffusion equation [9, Eq. (1)] describing the flux of the optical modulation envelope. Specifically, for a point source located at position r_s, the model is a linear integral equation, (1), relating the measured scattered field to the unknown perturbation g(r).
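One standard way to write such a first-Born integral relation is sketched below; the notation (G_0, φ_inc, Ω) and the omitted physical constants are our illustrative choices, not necessarily those of [9]:

```latex
% Sketch of a first-Born DPDW data model (notation illustrative):
% scattered field at detector r_d due to a point source at r_s
\phi_{\mathrm{sc}}(\mathbf{r}_d,\mathbf{r}_s)
  = \int_{\Omega} G_0(\mathbf{r}_d,\mathbf{r})\, g(\mathbf{r})\,
    \phi_{\mathrm{inc}}(\mathbf{r},\mathbf{r}_s)\, d\mathbf{r}
\tag{1}
```

Here G_0 is the Green’s function of the homogeneous background medium, φ_inc is the incident field due to the source at r_s, and g(r) carries the perturbation in the absorption coefficient along with physical constants.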
In this work it is assumed that we have a total of N_s point sources, where the scattered fields generated by each source are observed over an array of N_r receivers. We use a 2-dimensional model. Thus, for the ith source, at a given modulation frequency, we have a data vector, y_i, of length 2N_r (for DC, the vector length is N_r) comprising the in-phase and quadrature components of the measured scattered wavefield. Using a method-of-moments procedure to discretize (1) with pulse basis functions to represent g(r) [
8. R. F. Harrington, Field Computations by Moment Methods (Macmillan, New York, 1968).
], we have the discrete equations

y_i = G_i g + n_i,  i = 1, …, N_s,  (2)
where G_i is the discretization of the Born kernel associated with the ith source, g is the vector of pulse-basis expansion coefficients for g(r), and n_i is the noise vector. Finally, all y_i are collected for the various modulation frequencies into a single vector, y^T = [y_1^T, …, y_{N_s}^T], to arrive at the discretized data model

y = Gg + n,  (3)

with G and n obtained by “stacking” all the G_i and n_i for each frequency. In our experiments, we take data at 0 and 200 MHz, so y is of length 3N_rN_s.
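The stacking in (3) is just row-concatenation of the per-source, per-frequency blocks. A minimal numpy sketch, with placeholder random kernels standing in for the Born matrices (shapes match the experiments of §4):

```python
import numpy as np

rng = np.random.default_rng(0)

Ns, Nr, m = 10, 10, 31 * 31          # sources, receivers, pixels
# Per source: one DC block (Nr rows) plus in-phase and quadrature blocks
# at 200 MHz (2*Nr rows), here filled with placeholder random entries.
G_blocks = [rng.standard_normal((3 * Nr, m)) for _ in range(Ns)]
g = rng.standard_normal(m)           # pulse-basis coefficients of g(r)

G = np.vstack(G_blocks)              # stacked Born kernel
n = 0.01 * rng.standard_normal(G.shape[0])
y = G @ g + n                        # discretized data model, Eq. (3)
```

For the 31×31 grid with 10 sources and 10 receivers, G has dimension 300×961, matching the description in §4.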
Fig. 1. Source/receiver configuration in the transmission geometry.
2.2 A Model for g(r)
The model we use here to describe g(r), the unknown perturbation in the absorption coefficient, was originally described by the authors in [
9. M. E. Kilmer, E. L. Miller, D. A. Boas, D. H. Brooks, C. A. DiMarzio, and R. J. Gaudette, “Direct object localization and characterization from diffuse photon density wave data,” Proceedings of the SPIE Photonics West Meeting, Jan. 1999.
]. Here, g(r) is written as

g(r) = S(r) B_1(r) a_1 + (1 − S(r)) B_2(r) a_2,  (4)
where S(r) is an indicator function that is one over the (unknown) support of the object and zero elsewhere. The p×1 vectors a_1, a_2 hold the expansion coefficients that effectively determine the weight of each function. The 1×p vectors B_i, i = 1, 2, have as their components the functions b_{i,j}(r), j = 1, …, p, which are the expansion functions. In other words, we model the variation of g over the anomaly as one linear combination of functions and the variation of g over the background as a different linear combination of functions. The particular choice of the b_{i,j} depends on the application. In this paper, for instance, we assume that there is a homogeneous anomaly of contrast a_{1,1} against a homogeneous background of value a_{2,1}; thus, B_1 = b_{1,1}(r) = b_{2,1}(r) = B_2 = 1. That is, we assume g is piecewise constant. Use of higher-order polynomials, trigonometric functions, etc. provides greater flexibility in capturing true, underlying inhomogeneities, but adds a small amount of computational complexity to the problem. In this paper, the objective is to determine the structure of S and the a_i given a data vector y related to g via (1), assuming the B_i are known.
In order to determine the support of the anomaly, and hence S(r), we define the contour of the anomaly as a B-spline curve c(s):

c(s) = Σ_{i=0}^{K−1} [x̂_i, ŷ_i]^T β_{k_i}(s),  s ∈ [0, L],  (5)

for a given set of expansion coefficients, or control points, (x̂_i, ŷ_i), and a set of periodic, quadratic basis functions β_{k_i}(s). Each β_{k_i}(s) has support over a subinterval [k_i, k_{i+3}] ⊂ [0, L], and the sequence {k_0, …, k_{K−1}} is called the knot sequence. Since the basis was taken to be periodic, [x̂_0, ŷ_0] = [x̂_{K−2}, ŷ_{K−2}] and [x̂_1, ŷ_1] = [x̂_{K−1}, ŷ_{K−1}]. According to our model, if (x, y) is a point outside the curve, S(r) = 0; otherwise, S(r) = 1.
From the point of view of the discrete implementation of our model, S will be a diagonal matrix which functions as the discrete equivalent of S(r) in (4). If the center of the lth pixel in the discretization is inside the closed curve c(s), then we set S_ll = 1; otherwise we set S_ll = 0. Define the m×p matrices B_1, B_2 to have rows given by B_1(x_j, y_i), B_2(x_j, y_i) (with (x_j, y_i) denoting the pixel center coordinates and m denoting the total number of pixels), ordered lexicographically first by increasing i and then by increasing j. With these definitions the discrete version of (4) is

g = S B_1 a_1 + (I − S) B_2 a_2 = [S B_1, (I − S) B_2] [a_1; a_2].  (6)
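For the piecewise-constant case (B_1 = B_2 = 1, p = 1), forming the discrete g from a closed contour reduces to a point-in-polygon test on the pixel centers. A sketch using matplotlib’s Path; the circular contour and the grid spacing here are illustrative stand-ins, not the paper’s B-spline:

```python
import numpy as np
from matplotlib.path import Path

# Pixel centers of a 31x31 grid over a 3 cm x 3 cm region,
# ordered lexicographically (rows of constant y, increasing x).
xs = np.linspace(0.05, 2.95, 31)
centers = np.array([(x, y) for y in xs for x in xs])

# A closed contour standing in for the B-spline curve c(s);
# here we simply sample a circle of radius 0.5 cm about the center.
t = np.linspace(0.0, 2.0 * np.pi, 200)
contour = np.column_stack([1.5 + 0.5 * np.cos(t), 1.5 + 0.5 * np.sin(t)])

# S_ll = 1 when the lth pixel center lies inside the closed curve.
inside = Path(contour).contains_points(centers)
S = np.diag(inside.astype(float))

# Piecewise-constant instance of the discrete model: B_1 = B_2 = 1.
a1, a2 = 0.2, 0.05                 # anomaly contrast, background value
m = centers.shape[0]
g = S @ (a1 * np.ones(m)) + (np.eye(m) - S) @ (a2 * np.ones(m))
```

The resulting g takes exactly the two values a1 (inside the curve) and a2 (outside), as the piecewise-constant model requires.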
3 Algorithm Description
Our algorithm seeks a good approximation to g(r) by successively improving approximations to a_1, a_2, and S. Let c*(s) denote the contour of the true anomaly, and let c_0(s) denote an initial guess^1 at c*(s). We note that it is highly likely that the number and locations of the distinct control points of c*(s) and c_0(s) will be different.
The curve c_0(s) defines an initial guess at S, which we use to estimate a_1, a_2 as the minimizers of

J(a_1, a_2) ≔ ∥ G [S B_1, (I − S) B_2] [a_1; a_2] − y ∥₂².  (7)
Note that if the number of columns in B_1 and B_2 is small (as is the case for problems of interest here) compared to 3N_sN_r, the height of GQ, this problem is quickly and easily solved^2. The total cost associated with any curve c(s) (and hence with our estimate of the solution g(r)) is therefore

J(a_1, a_2) + λ Ω(c),  (8)
where the first term enforces fidelity to the data while the second is the regularizer. The regularizer enforces a priori information we have about the shape of the anomaly and also serves to stabilize the least squares solution. In this work, we chose Ω(c) to be

Ω(c) = Σ_i [ (x̂_{i+1} − x̂_i)² + (ŷ_{i+1} − ŷ_i)² ],  (9)
where (x̂_i, ŷ_i) denotes the ith pair of control points for the current estimate, c(s), of c*(s). The idea is that large gaps between adjacent control points are penalized more heavily, thereby discouraging the algorithm from choosing non-physical anomalies. For example, since there are neither sources nor receivers on the sides of the domain we are trying to reconstruct, without regularization we are more likely to reconstruct tall, narrow anomalies in the presence of noisy data. Clearly, a value of λ that is too large or too small will cause the reconstruction to be under- or over-dependent on the data, respectively. We do not address the selection of a near-optimal parameter here, but refer the interested reader to the abundant literature on the subject (see [7] and the references therein).
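One simple regularizer matching this description (squared distances between adjacent control points, with periodic wrap-around) can be computed as follows; the exact form used in the paper may differ, so treat this as a sketch:

```python
import numpy as np

def omega(ctrl):
    """Sum of squared gaps between adjacent control points.

    ctrl: (K-2) x 2 array of distinct control points; the curve is
    periodic, so the last point is adjacent to the first. This is one
    plausible reading of the regularizer Omega(c).
    """
    diffs = np.diff(np.vstack([ctrl, ctrl[:1]]), axis=0)
    return float(np.sum(diffs ** 2))

# Four corners of a unit square: four unit-length gaps.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
```

Spreading a fixed perimeter across more, evenly spaced control points lowers this penalty, which is what discourages a few widely separated points from producing tall, narrow shapes.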
The algorithm proceeds by perturbing the curve c_0(s), in a fashion to be described shortly, to obtain a finite collection of new curves, computing the associated cost (and a_1, a_2) for each new curve from (8), and then selecting the new estimate, c_1(s), of c*(s) as a curve which gave the minimum cost over all the perturbations, provided that minimum is less than or equal to the current cost for c_0(s). The curves are obtained by taking each of the K−2 distinct control points of c_0(s) and moving it by a fixed amount h, in mm, in the vertical, horizontal, and diagonal directions. Thus, there are 8 possible moves for each of the K−2 distinct control points, resulting in at most 8(K−2) new curves for which the cost is computed. Then c_1(s) is selected from among these curves, and the process is iterated^3 until the cost can no longer be reduced by making these types of perturbations to the control points of the current estimate c_k(s). The algorithm is outlined in Figure 2, where h is understood to have been chosen a priori.
Fig. 2. Anomaly Recovery Algorithm
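As a sketch of the control-point search described above, the following greedy loop tries the 8 moves of size h for every control point and accepts a minimum-cost curve while the cost does not increase. The quadratic toy cost is purely illustrative; in the paper each cost evaluation would also solve (7) for a_1, a_2:

```python
import numpy as np

# The 8 moves: vertical, horizontal, and diagonal unit steps.
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
         if (dx, dy) != (0, 0)]

def refine(ctrl, cost, h=1.0, max_iter=100):
    """Greedy control-point descent: for each of the K-2 distinct
    control points, try the 8 moves of size h; accept the best new
    curve if its cost does not exceed the current cost."""
    best = cost(ctrl)
    for _ in range(max_iter):
        candidates = []
        for i in range(len(ctrl)):
            for dx, dy in MOVES:
                trial = ctrl.copy()
                trial[i] += (h * dx, h * dy)
                candidates.append((cost(trial), trial))
        c_min, ctrl_min = min(candidates, key=lambda t: t[0])
        if c_min > best:            # no perturbation reduces the cost
            break
        best, ctrl = c_min, ctrl_min
    return ctrl, best

# Toy cost: pull all control points toward the point (3, 3).
cost = lambda c: float(np.sum((c - 3.0) ** 2))
ctrl0 = np.zeros((5, 2))
ctrl_final, final_cost = refine(ctrl0, cost)
```

Note that the 8(K−2) candidate costs in each outer iteration are independent, which is the parallelism referred to in §5.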
4 Numerical Experiments
All our experiments were conducted in Matlab using double precision floating-point arithmetic. We used Matlab’s Spline Toolbox to create and manipulate the B-splines. The region of interest is 3 cm in width and 3 cm in depth, and the region is discretized into a 31×31-pixel image. There were 10 sources across the top of the region and 10 receivers along the bottom, and data were collected at the modulation frequencies of 0 and 200 MHz. Thus, the matrix G had dimension 300×31².
To generate the data, c*(s) was taken to be a quadratic spline curve defined using 6 distinct control points, and S* was the corresponding discrete indicator matrix. With a*_1, a*_2 as the true values of the anomaly and background, respectively, the noise-free data was formed as the matrix-vector product Gg, with g in (6) and G described above. This generated a data vector of length 300, the first 100 points of which corresponded to 0 MHz, the second 100 to the in-phase component at 200 MHz, and the third 100 to the quadrature component at 200 MHz.
We added shot noise to the data as follows. For a particular source and receiver pair, the noise component n_sr(ω) (at any given frequency ω, for either an in-phase or quadrature component) is given by

n_sr(ω) = σ_sr(0) v_1,
where v_1 is a number chosen from a normal distribution with mean zero and variance one^4. Here σ_sr(0) is proportional to the square root of the total signal strength at DC:

σ_sr(0) = γ |ỹ_s(r) + ỹ_s^inc(r)|^{1/2},
where ỹ_s(r) denotes the noise-free scattered field corresponding to source s and receiver r at DC, ỹ_s^inc(r) denotes the incident field at source s and receiver r at DC, and γ is the proportionality constant. We denote the 300-length, noise-contaminated data vector by y and the noise-free data by ỹ. The value of the signal-to-noise ratio that we give in the examples corresponds to

SNR = 10 log_10 ( ∥ỹ∥₂² / ∥n∥₂² ).
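A small numerical sketch of this noise model; γ, the DC signal values, and the noise-free data below are illustrative placeholders, not values from the experiments:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noise-free data vector and a stand-in for the total DC signal strength
# |scattered + incident| at each source/receiver pair (placeholders).
y_clean = rng.standard_normal(300)
total_dc = 1.0 + np.abs(rng.standard_normal(300))

gamma = 0.01                               # proportionality constant
sigma = gamma * np.sqrt(total_dc)          # sigma_sr(0)

n = sigma * rng.standard_normal(300)       # n_sr = sigma_sr(0) * v_1
y = y_clean + n

snr = 10.0 * np.log10(np.sum(y_clean**2) / np.sum(n**2))
```

Because sigma is derived from the DC signal strength, the same per-pair standard deviations are reused for the in-phase and quadrature components at 200 MHz.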
Since the noise is not white, we must replace J(a_1, a_2) in (7) by

J(a_1, a_2) ≔ ∥ Σ^{−1} ( G [S B_1, (I − S) B_2] [a_1; a_2] − y ) ∥₂²,

where Σ^{−1} is a diagonal matrix with 1/σ_sr(0) placed appropriately along the diagonal.
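Because Σ^{−1} is diagonal, minimizing the weighted functional is an ordinary least-squares problem on row-scaled quantities. A sketch for the piecewise-constant case; the matrices below are synthetic placeholders, not the Born kernel:

```python
import numpy as np

rng = np.random.default_rng(2)
rows, m = 300, 961

G = rng.standard_normal((rows, m))            # placeholder "Born" matrix
s_diag = (rng.random(m) < 0.1).astype(float)  # diagonal of the indicator S
Q = np.column_stack([s_diag, 1.0 - s_diag])   # [S*B1, (I-S)*B2] with B1=B2=1

a_true = np.array([0.2, 0.05])                # anomaly and background values
sigma = 0.01 + 0.02 * rng.random(rows)        # sigma_sr(0) per data point
y = G @ Q @ a_true + sigma * rng.standard_normal(rows)

# Whitening: scale each row of G*Q and each entry of y by 1/sigma.
A = (G @ Q) / sigma[:, None]
a_hat, *_ = np.linalg.lstsq(A, y / sigma, rcond=None)
```

Each cost evaluation in the curve search solves a system of this size (300 equations, 2p unknowns), which is why the overall method stays cheap.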
In our examples, we compare the solution computed using our algorithm with the solution obtained via the commonly used truncated singular value decomposition (TSVD) method [7]. Essentially, if

G = Σ_{i=1}^n µ_i u_i v_i^T

is the singular value decomposition of the matrix G, the TSVD regularized solution for truncation index j, j ≤ n, is

g_tsvd = Σ_{i=1}^j (u_i^T y / µ_i) v_i.

We refer to the best, or optimal, TSVD solution as the one corresponding to the index j which gives the smallest relative error in the 2-norm with respect to the true solution.
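The TSVD solution and the oracle choice of truncation index can be sketched in a few lines of numpy; the system matrix, true image, and noise level here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)
rows, m = 300, 961

G = rng.standard_normal((rows, m))           # placeholder system matrix
g_true = np.zeros(m)
g_true[400:450] = 0.2                        # a crude "anomaly" block
y = G @ g_true + 0.5 * rng.standard_normal(rows)

U, s, Vt = np.linalg.svd(G, full_matrices=False)
c = U.T @ y                                  # the coefficients u_i^T y

def g_tsvd(j):
    """Truncated-SVD solution: sum over i <= j of (u_i^T y / mu_i) v_i."""
    return Vt[:j].T @ (c[:j] / s[:j])

# "Optimal" truncation index: smallest relative 2-norm error vs. g_true.
errors = [np.linalg.norm(g_tsvd(j) - g_true) / np.linalg.norm(g_true)
          for j in range(1, len(s) + 1)]
j_opt = 1 + int(np.argmin(errors))
```

The oracle index requires g_true and is therefore only usable in simulation; it gives the TSVD comparison its most favorable setting.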
4.1 Example 1
Since the TSVD solution was already available, we used it to initialize c_0(s) for our algorithm. Thus, the blue curve in Figure 3a shows c_0(s) to be an approximation to the outline of where the anomaly appears to be in the TSVD reconstruction, which we obtained by manually postprocessing the TSVD image. We set h, the amount by which to perturb the control points, to 1 mm in our algorithm and took λ=.005 in (9). The final estimate of the contour, b_k*(s), reached by the algorithm is the red curve in Figure 3b; that is, no other curve is locally more optimal than that curve. The total reconstruction obtained using our algorithm to determine both b_k*(s) and a_1, a_2 is displayed in Figure 3d. The relative 2-norm error between the true and our computed solution is .56, a significant improvement over the TSVD solution. Although this relative error value seems high, as evidenced by Figure 3d the shape-based reconstruction very accurately recovers the value of g inside and outside the anomaly and fairly accurately estimates the extent of the anomaly (the reconstruction is off by only 10 pixels). Thus, the relative error measure does not accurately reflect the high quality of the reconstruction. Further, recall from the discussion at the beginning of this section that c*(s) was defined with 6 distinct control points while our reconstruction assumed only 5 distinct control points. We believe that if we used more control points for c_0(s), we could have reconstructed the irregular contour more accurately. The trade-off for using more control points in the reconstruction is the time it takes to execute one outer iteration.
4.2 Example 2
Fig. 3. From left to right, top to bottom: a) contour curves for b*(r), c_0(r), b_k*(r); b) true image of the perturbation in the absorption coefficient; c) TSVD reconstruction of the absorption perturbation; d) our reconstruction of the absorption using h=1 and λ=.005.
Fig. 4. From left to right, top to bottom: a) contour curves for b*(r), c_0(r), b_k*(r); b) true image of the perturbation in the absorption coefficient; c) TSVD reconstruction of the absorption perturbation; d) our reconstruction of the absorption using h=1 and λ=.05.
5 Conclusions and Future Work
In this work, we presented an effective technique for simultaneously solving the image formation and object characterization problems from DPDW data with a particular Gaussian noise model. The success of our method was based on the underlying model for the perturbation of the absorption coefficient. Our model formulates the problem in terms of only a small number of unknowns: namely, those that describe the B-spline contour of the anomaly and those that describe the values of the perturbations over both the anomaly and the background. Our examples illustrated that our reconstructions contain significantly more qualitative and quantitative information than the commonly used TSVD reconstructions. Further, although our algorithm was implemented in serial, it is in fact highly parallelizable in that the cost evaluations at any given iteration can be performed in parallel. Thus, it is computationally feasible to consider more complicated structures (i.e. those with more control points that would each need to be perturbed during an iteration). Finally, the model and algorithm presented here are readily adapted to account for non-linear scattering models that arise when the Born approximation is no longer valid. In the future we hope to report results for non-linear scattering models and to generalize this work for scattering problems and 3D reconstructions. The generalization of our approach to multiple anomalies is straightforward if initial estimates of boundaries of such anomalies can be found; more work needs to be done to generalize our method to find multiple anomalies with a single boundary as a starting guess.
Although we have used a fixed knot sequence and number of control points to describe our B-spline curve, we note that it is possible to add and delete knots (and, correspondingly, control points) from our description of the contour, as well as to specify more complicated basis functions B_1(r) and B_2(r), thereby allowing reconstruction of more complex anomalies and backgrounds. The question of how to choose λ appropriately is also an important issue. All of these considerations were beyond the scope of this work and remain areas for future research.