OSA's Digital Library

Virtual Journal for Biomedical Optics
Editor: Gregory W. Faris
Vol. 5, Iss. 11 — Aug. 25, 2010
Robust segmentation of intraretinal layers in the normal human fovea using a novel statistical model based on texture and shape analysis

Vedran Kajić, Boris Považay, Boris Hermann, Bernd Hofer, David Marshall, Paul L. Rosin, and Wolfgang Drexler


Optics Express, Vol. 18, Issue 14, pp. 14730-14744 (2010)
http://dx.doi.org/10.1364/OE.18.014730


Abstract

A novel statistical model based on texture and shape is developed for fully automatic intraretinal layer segmentation of normal retinal tomograms obtained by a commercial 800 nm optical coherence tomography (OCT) system. While existing algorithms often fail dramatically due to strong speckle noise, non-optimal imaging conditions, shadows and other artefacts, the accuracy of the novel algorithm deteriorates only slowly as the difficulty of the segmentation task increases. Evaluation against a large set of manual segmentations shows unprecedented robustness, even in the presence of additional strong speckle noise with the dynamic range tested down to 12 dB, enabling segmentation of almost all intraretinal layers in cases previously inaccessible to existing algorithms. For the first time, an error measure is computed from a large, representative manually segmented data set (466 B-scans from 17 eyes, segmented twice by different operators) and compared to the automatic segmentation, with a difference of only 2.6% against the inter-observer variability.

© 2010 OSA

1. Introduction

Optical Coherence Tomography (OCT) [1

1. W. Drexler, and J. G. Fujimoto, Optical Coherence Tomography: Technology and Applications (Springer, 2008).

] is a biomedical imaging method using infrared light that provides high resolution, three-dimensional (3D) sub-surface insight into living tissue by means of white-light interferometry. The simple optical access to the light-sensitive retina, which is hard to reach by other high resolution methods, has led to wide clinical acceptance of this imaging technique. Currently available OCT systems have increased in speed towards one hundred thousand depth scans per second at axial resolutions below 3 µm, enabling imaging of the retinal microstructure. The result of a typical retinal OCT scan of less than 10 seconds is a large volumetric data set consisting of a stack (typically 128-512) of high resolution cross-sectional images (B-scans, typically 1024x512 pixels).

In order to make these large retinal 3D OCT data sets clinically useful it is necessary to analyze the structure by segmentation of layers, as they correspond to patches of similar cellular components that can be used to establish a potential early disease diagnosis or perform therapy monitoring. However, due to the sheer amount of data, it is inconvenient or even impossible for a human operator to manually perform the segmentation in a high throughput clinical environment. Therefore it is necessary to develop effective computer algorithms for automated segmentation of relevant layers of the investigated tissue.

Existing published approaches to retinal OCT data segmentation vary in the number of layers segmented and in their robustness in the presence of strong speckle noise, shadows, irregularities (i.e. vessels, structural changes at the fovea and optic nerve head) and pathological changes in the tissue. In general they tend to be very sensitive to noisy data or are limited to segmenting only a small number of layers.

Fabritius et al. [2

2. T. Fabritius, S. Makita, M. Miura, R. Myllylä, and Y. Yasuno, “Automated segmentation of the macula by optical coherence tomography,” Opt. Express 17(18), 15659–15669 (2009). [CrossRef] [PubMed]

] presented a fast, efficient algorithm for finding only the internal limiting membrane (ILM) and retinal pigment epithelium (RPE) boundaries that utilizes 3D information and performs simple filtering. This rather simple step is typically the first one performed before a more detailed analysis of the intraretinal structure.

Zawadzki et al. [3

3. R. J. Zawadzki, S. S. Choi, S. M. Jones, S. S. Oliver, and J. S. Werner, “Adaptive optics-optical coherence tomography: optimizing visualization of microscopic retinal structures in three dimensions,” J. Opt. Soc. Am. A 24(5), 1373 (2007). [CrossRef]

] used a semi-automatic algorithm for OCT segmentation where the user would have to paint the areas of interest in any slice of the volume. For segmentation a support vector machine (SVM) was used with a feature vector that contained intensity, location, mean of the neighbourhood, standard deviation and gradient magnitude.

Fernandez et al. [5

5. D. Cabrera Fernández, H. M. Salinas, and C. A. Puliafito, “Automated detection of retinal layer structures on optical coherence tomography images,” Opt. Express 13(25), 10200–10216 (2005). [CrossRef] [PubMed]

] presented segmentation results using a peak finding algorithm. Since it is an iterative thresholding algorithm, it is sensitive to noise and to deviations from normal retinal data. Extending it to non-typical cases might prove difficult, since the algorithm’s parameters are manually selected rather than learned from a set of segmentation examples. Even though some good results were obtained, it is prone to failure and allows detected boundaries to overlap.

Mujat et al. [6

6. M. Mujat, R. Chan, B. Cense, B. Park, C. Joo, T. Akkin, T. Chen, and J. de Boer, “Retinal nerve fiber layer thickness map determined from optical coherence tomography images,” Opt. Express 13(23), 9480–9491 (2005). [CrossRef] [PubMed]

] used the deformable spline algorithm (active contour) to determine nerve fibre layer (NFL) thickness only. Blood vessels could be detected by using intensity holes in the RPE layer.

A Markov boundary model was applied to connect the extracted boundary edge primitives by Koozekanani et al. [7

7. D. Koozekanani, K. Boyer, and C. Roberts, “Retinal thickness measurements from optical coherence tomography using a Markov boundary model,” IEEE Trans. Med. Imaging 20(9), 900–916 (2001). [CrossRef] [PubMed]

]. Even though more robust than standard column-wise thresholding methods, it still relies on connecting 1D points. That makes it sensitive to noise and thus detected layer boundaries can easily drift off from the real ones. Special rules have to be applied to correct for such cases which makes the whole approach less general.

An elegant approach to retinal segmentation based on spectral rounding was introduced by Tolliver et al. [8

8. D. Tolliver, Y. Koutis, H. Ishikawa, J. S. Schuman, and G. L. Miller, “Unassisted Segmentation of Multiple Retinal Layers via Spectral Rounding,” in ARVO(2008).

]. It is a graph partitioning algorithm based on an eigenvector calculation that determines the oscillation steps representing the retinal edges. It performed very well even though no a priori information was available to the algorithm; it simply divides an image iteratively along the oscillation boundary, with different regions of the image corresponding to different modes of oscillation. Although the accuracy was good, the number of extracted layers was low, and it is very unlikely that layers with weaker signal could be extracted without using additional structural information.

Mishra et al. [9

9. A. Mishra, A. Wong, K. Bizheva, and D. A. Clausi, “Intra-retinal layer segmentation in optical coherence tomography images,” Opt. Express 17(26), 23719–23728 (2009). [CrossRef]

] presented a promising two-step algorithm based on a kernel optimization scheme. Initially, approximate positions of the boundaries are found, followed by the second, refinement step. Very good segmentation results were obtained; however no quantitative evaluation on a large data set was given, nor was any result given on the actual dynamic range of the presented images. Additionally, it is unclear how the algorithm performs in cases with more variability in boundary distances, such as the foveal pit region. Only images of the flat part of the retina were shown and the algorithm imposes some fixed constraints on the shape of layers.

Thus, all of the aforementioned methods suffer from one or more of the following disadvantages: they distinguish only the most prominent layers, do not exhibit robustness in noisy and varied cases and/or require manual intervention of the operator.

The proposed method of the present paper uses training data obtained from manual segmentations by human operators as input to a statistical model which is able to actively learn and determine the plausible solutions in a noisy environment. During the learning stage parameters of a statistical model are extracted so that it best fits the training data. That includes the possible variation of layer boundaries as well as texture information within the layers. This approach offers greater flexibility over the fixed constraints on layer smoothness, since it learns from the data what amount of variability is possible and in what regions, while on the other hand constrains data to a plausible space of states.

Our model based approach uses the variation obtained from the training set and imposes those constraints when segmenting an unseen image. This keeps the segmentation close to the ground truth and less sensitive to noise. However, it is extremely important to have a large, representative training set that includes all possible variation. We solved this issue by applying a novel approach for obtaining manual segmentations of the OCT data via the Amazon Mechanical Turk (AMT), a service designed to offer a large international human workforce for the completion of user-defined tasks.

Overall the novel algorithm segments eight layers: NFL, ganglion cell layer and inner plexiform layer (GCL + IPL), inner nuclear layer (INL), outer plexiform layer (OPL), outer nuclear layer (ONL), connecting cilia (CL), outer segment (OS) and RPE.

2. Materials and methods

We have used a three-dimensional OCT system for imaging. It uses a superluminescent light source with 840 nm central wavelength and 50 nm optical bandwidth. The axial resolution is 5-6 µm, while the transverse resolution is 15-20 µm. The data acquisition speed was 27 klines/s, the optical power 500 µW, and the SNR 96 dB with a sensitivity roll-off of −6 dB/mm. The depth range was 3.5 mm with an axial sampling of 2.3 µm/voxel.

2.1 The algorithm overview

As seen in the overview of the algorithm (Fig. 1
Fig. 1 Algorithm overview: manually segmented data is used as the input to the training phase of the algorithm. After passing the pre-processing block a statistical model is constructed that captures the variance in the training data, which can be then used to segment unseen data.
), one can observe that the pre-processing stage is performed both for the training step and for the segmentation of unseen data. Once the variation parameters have been learned from the manually segmented training data, they can be used to drive the model to perform segmentation of unseen data. The actual segmentation process is essentially an optimization run that changes the model parameters in order to minimize the objective function, which defines the difference between the model and a given unseen image that is to be segmented.

2.2 Pre-processing

Before the segmentation process, dual-tree complex wavelet (DTCW) denoising is applied to the data. The denoising algorithm exhibits very good performance, while being computationally efficient [10

10. I. W. Selesnick, R. G. Baraniuk, and N. G. Kingsbury, “The Dual-Tree Complex Wavelet Transform,” IEEE Signal Process. Mag. 22(6), 123–151 (2005). [CrossRef]

]. This reduces the speckle noise present and thus makes the subsequent segmentation tasks easier.
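The paper uses publicly available DTCW code. As a rough illustration of the wavelet-shrinkage idea only (not the dual-tree transform itself), the following sketch denoises with a single-level 2D Haar decomposition and soft thresholding; the transform choice, threshold value and function names are our assumptions, not the authors' implementation:

```python
import numpy as np

def haar2d(img):
    """One-level 2D Haar decomposition into (LL, LH, HL, HH) bands.
    Assumes even image dimensions."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2)

def ihaar2d(LL, LH, HL, HH):
    """Inverse of haar2d (perfect reconstruction)."""
    h, w = LL.shape
    out = np.empty((2 * h, 2 * w))
    out[0::2, 0::2] = (LL + LH + HL + HH) / 2
    out[0::2, 1::2] = (LL - LH + HL - HH) / 2
    out[1::2, 0::2] = (LL + LH - HL - HH) / 2
    out[1::2, 1::2] = (LL - LH - HL + HH) / 2
    return out

def soft_threshold(x, t):
    """Shrink coefficients toward zero by t (soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def denoise(img, t=10.0):
    """Suppress small detail coefficients, which mostly carry speckle."""
    LL, LH, HL, HH = haar2d(img)
    return ihaar2d(LL, soft_threshold(LH, t),
                   soft_threshold(HL, t), soft_threshold(HH, t))
```

With the threshold set to zero the round trip reproduces the input exactly, which is a convenient sanity check for the transform pair.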

Denoising based on quasi-random nonlinear scale space described in [11

11. A. Mishra, A. Wong, D. A. Clausi, and P. W. Fieguth, “Quasi-random nonlinear scale space,” Pattern Recognit. Lett. In Press. (Corrected Proof.).

] and applied to OCT speckle reduction in [12

12. A. Wong, A. Mishra, K. Bizheva, and D. A. Clausi, “General Bayesian estimation for speckle noise reduction in optical coherence tomography retinal imagery,” Opt. Express 18(8), 8338–8352 (2010). [CrossRef] [PubMed]

] would likely be more effective. It is an effective and fast method based on formulating the denoising problem as a general Bayesian least-squares estimation problem, in which a quasi-random density estimation approach is introduced for estimating the posterior distribution between consecutive scale space realizations. However, the relatively small performance difference (the speed difference is larger) is not significant for the performance of the statistical model, so we have used well-tested and freely available DTCW code.

After that, registration of the stack and segmentation of the three initial well-defined boundaries (ILM, connecting cilia (CL) and the end of the RPE) are performed. Registration and initial boundary detection are currently independent, since the initial boundary detection operates on each B-scan independently.

A stack registration algorithm has been developed based on B-spline multi-resolution pyramid registration approach [13

13. P. Thevenaz and M. Unser, “A pyramid approach to sub-pixel image fusion based on mutual information,” in Proceedings of the International Conference on Image Processing (1996), p. 265.

] and [14

14. C. O. S. Sorzano, P. Thevenaz, and M. Unser, “Elastic registration of biological images using vector-spline regularization,” IEEE Trans. Biomed. Eng. 52(4), 652–663 (2005). [CrossRef]

]. The basic algorithm for translation and rotation is used to register source to target image.
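The paper relies on B-spline multi-resolution registration [13,14]. As a much simpler stand-in for the translational part only, FFT phase correlation can recover an integer shift between two B-scans; the sketch below assumes circular (wrap-around) shifts, and the function name is ours:

```python
import numpy as np

def phase_correlate(target, source):
    """Estimate the integer (row, col) shift that, applied to `source`
    with np.roll, aligns it with `target` (assumes circular shifts)."""
    cross = np.fft.fft2(target) * np.conj(np.fft.fft2(source))
    cross /= np.maximum(np.abs(cross), 1e-12)      # normalized cross-power spectrum
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices to signed shifts in roughly [-N/2, N/2)
    return tuple(p - s if p > s // 2 else p for p, s in zip(peak, corr.shape))
```

Usage: if `source` is `target` circularly shifted by (3, 5), the function returns (-3, -5), the shift that undoes the displacement.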

Initial boundary detection would also be possible based on a decoupled active contour (DAC) approach as presented in [15

15. A. K. Mishra, P. W. Fieguth, and D. A. Clausi, “Decoupled Active Contour (DAC) for Boundary Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence 99.

]. We have tested the level set active contour approach and discarded it for its slow convergence. However, DAC is both robust and fast, as it decouples the measurement (solved by using Hidden Markov Model (HMM) and Viterbi search) and prior active contour energy terms. As we have found our initial boundary estimation approach sufficient for the current application we have not experimented with all other available methods. For future work, however, algorithms such as DAC could prove valuable.

2.3 Model building

From the manually segmented images we extract the shape and texture features and for each image we put all the extracted shape features into one vector and all the texture features into another vector. Separate models for shape and texture are constructed similarly, so only the shape model construction will be explained. If we have m training images, for each layer (n layers) we get one vector of offsets v per layer, per image of width w, which stacked together for all the layers define x. All of the manual segmentations then comprise the matrix X Eq. (1).
X = \begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{m1} & \cdots & v_{mn} \end{pmatrix}, \qquad v_{ij} = [\mathrm{off}_1, \ldots, \mathrm{off}_w]
(1)
The shape features used are sparsely sampled distances of the boundaries from the top boundary (ILM). The texture features currently used are simple, although it is trivial to include additional features if needed to further increase performance in the presence of vessels, large shadows or pathological tissue. They are: the mean of all the pixels of each layer in the original image; the standard deviation and mean of all the pixels of each layer in the median-filtered image; and the multiple-scale edges (from a pyramid of Gaussian-filtered versions of the image) sampled along the boundaries. In practice, for an image of width 512, we sampled each boundary at 26 positions. Thus we have 26 spatial features and 4 texture features per layer; for eight layers, we obtain 208 spatial and 32 texture features.
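As a sketch of the per-layer texture statistics described above (mean and standard deviation only; the median-filtered variants and multi-scale edge features are omitted, and all names are ours):

```python
import numpy as np

def layer_texture_features(img, boundaries):
    """For each layer between consecutive boundaries, compute the mean and
    standard deviation of its pixels.

    img        : (height, width) intensity image
    boundaries : (n_boundaries, width) integer row indices, top to bottom
    """
    rows = np.arange(img.shape[0])[:, None]        # (height, 1) row index column
    feats = []
    for top, bot in zip(boundaries[:-1], boundaries[1:]):
        mask = (rows >= top) & (rows < bot)        # pixels inside this layer
        pix = img[mask]
        feats.extend([pix.mean(), pix.std()])
    return np.array(feats)
```

For two layers this yields a 4-element feature vector (mean, std per layer), matching the "few features per layer" structure described in the text.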

Statistical models can reproduce specific patterns of variability in shape and texture by analyzing the variations in shape across the training set. It is difficult to achieve this selectivity, whilst allowing for natural variability, without using very large descriptors and thus it is essential to select good features from the training set for the model building phase. The key step of the statistical model training phase is the dimensionality reduction of the large set of features from the training data set. The reason for dimensionality reduction is to reduce the computational cost of the optimization method that is used to fit the model to the real data later on. The idea behind this concept is to find statistical dependencies between the produced features and reduce the dimensionality of the space by identifying only a certain number of the most prominent properties in the data set, represented by the most important eigenvectors.

Principal component analysis (PCA) is the standard vector space transform technique used to reduce multidimensional data sets to lower dimensions for analysis. It works by calculating the eigenvalue decomposition of a data covariance matrix or singular value decomposition of a data matrix. Usually a relatively small number of eigenvectors with greatest eigenvalues can describe the original data well. If X is the original data matrix, as defined in Eq. (1), after the decomposition we can select only L principal components and in that way project the data into a reduced dimensionality space to get Y Eq. (2).
X = W \Sigma V^T, \qquad Y = W_L^T X
(2)
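Equation (2) can be written out directly in code: a minimal PCA reduction via SVD, with samples as columns of X (variable names are ours):

```python
import numpy as np

def pca_reduce(X, L):
    """Project the columns of X (features x samples) onto the L principal
    components with the largest singular values: Y = W_L^T X."""
    Xc = X - X.mean(axis=1, keepdims=True)   # center each feature
    W, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    WL = W[:, :L]                            # first L eigenvectors
    return WL.T @ Xc, WL

def pca_reconstruct(Y, WL, mean):
    """Map reduced coordinates back to the original feature space."""
    return WL @ Y + mean
```

If the centered data actually has rank L or less, the reconstruction is exact, which illustrates why a few eigenvectors can "describe the original data well".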
However, rather than PCA, we used neural-network-based dimensionality reduction, since it offers nonlinear eigenvectors and can therefore, when the data is nonlinearly distributed, reduce the space more compactly than the linear representation obtained by PCA [17

17. M. Scholz, M. Fraunholz, and J. Selbig, “Nonlinear Principal Component Analysis: Neural Network Models and Applications,” in Principal Manifolds for Data Visualization and Dimension Reduction(2007), pp. 44–67.

]. The shape features proved to be nonlinear, and thus we obtained a more compact representation using nonlinear dimensionality reduction rather than PCA. A neural network (NN) is a mathematical or computational model based on principles found in biological neural networks. It consists of an interconnected group of artificial neurons, in which each connection has a weight that modulates the value passed across the connection. During the training phase the weights are modified until the network implements a desired function. Once training has completed, the network can be applied to data that was not part of the training set. It is useful to note that a special type of (inverse) neural network [18

18. M. Scholz, F. Kaplan, C. L. Guy, J. Kopka, and J. Selbig, “Non-linear PCA: a missing data approach,” Bioinformatics 21(20), 3887–3895 (2005). [CrossRef] [PubMed]

] can be used to perform dimensionality reduction on the training feature set that is produced which contains missing values. Missing values occur when no data value is stored for the variable in the current observation. The generating function is used to produce larger dimensionality data X from the parameters z (equivalent to WL in PCA) Eq. (3). The extraction function does the reverse.
\Phi_{\mathrm{gen}}: z \rightarrow \hat{X}, \qquad \Phi_{\mathrm{extr}}: X \rightarrow z
(3)
We encounter the problem of missing data because during the registration process slices are moved, and since input to the dimensionality reduction step has to be a rectangular matrix, it is necessary to fill the missing values. In practice we can set them as “not a number” (NaN) and perform the nonlinear PCA (Fig. 3
Fig. 3 Filling the gaps after the registration with NaNs and applying inverse neural network nonlinear PCA dimensionality reduction. In the case of the example data shown on the right, we can see that already the first eigenvector (e1) captures most of the variance in the original data set. This illustrates the idea behind the dimensionality reduction.
). After that, we end up with a reduced number of variables which can reasonably well describe any variation observed in the training data. We reduced dimensionality of the original spatial feature space from 208 to 12, and the texture feature space from 32 to 2. This number of eigenvectors allowed for an efficient optimization in the subsequent steps, while still preserving the original data variation well.
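The missing-value handling can be illustrated with a linear, EM-style stand-in for the inverse-network nonlinear PCA used in the paper: alternate between a rank-L PCA reconstruction and re-inserting the reconstructed values at the NaN positions. The iteration scheme, count and names are our assumptions:

```python
import numpy as np

def pca_impute(X, L, n_iter=100):
    """Fill NaN entries of X (features x samples) by alternating a rank-L
    PCA reconstruction with re-imputation at the missing positions.
    (A linear stand-in for the inverse-network nonlinear PCA in the text.)"""
    missing = np.isnan(X)
    # Initialize missing entries with the per-feature mean of observed values
    Xf = np.where(missing, np.nanmean(X, axis=1, keepdims=True), X)
    for _ in range(n_iter):
        mean = Xf.mean(axis=1, keepdims=True)
        W, S, Vt = np.linalg.svd(Xf - mean, full_matrices=False)
        recon = W[:, :L] @ np.diag(S[:L]) @ Vt[:L] + mean
        Xf = np.where(missing, recon, X)   # observed values stay fixed
    return Xf
```

On data that is exactly low-rank apart from the gaps, this converges to a completion consistent with the observed entries while leaving them untouched.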

Our approach is based on a similar concept to the Active Appearance Model (AAM). For completeness, it will be first explained how the basic AAM model works, followed by an explanation of how the proposed statistical model differs from that concept. An Active Appearance Model (AAM) manipulates a model capable of synthesising new images of the object of interest by finding the model parameters which generate a synthetic image as close as possible to the target image [16

16. T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active Appearance Models,” IEEE Trans. Pattern Anal. Mach. Intell. 23(6), 681–685 (2001). [CrossRef]

]. An AAM will, based on learned shape deformation, generate a new image with a texture learned from the texture variation and then compute the distance between the synthesized and the given image that is to be segmented. x is the shape vector (which is normalized by subtracting the mean shape and rescaling, Eq. (4)) and g is the texture vector obtained from an image I and the shape vector (it is also normalized) Eq. (5).
x \leftarrow (x - \mu(x)\mathbf{1}) / \sigma(x)
(4)
g=G(x,I)
(5)
Function S(s) produces new shape vectors by adding the shape parameters s multiplied by the shape matrix Qs (a matrix of sorted eigenvectors learned from the training set, usually produced by PCA decomposition) to the mean shape vector x¯ Eq. (6). The same procedure is used to generate new texture vectors.
x = S(s) = \bar{x} + Q_s s, \qquad g = T(t) = \bar{g} + Q_g t
(6)
However, unlike the AAM which compares pixelwise synthesized images, we use the layer boundaries produced by the model during the optimization to compute texture features of the bounded area and compare it to the expected texture properties of each layer learned from the training set. This approach is used since unlike the areas in which AAMs are usually applied, the texture of retinal OCT scans varies so much within one layer that the direct comparison with a synthesized image is unusable. The objective function (Eq. (7)) evaluates how well the model matches real data and is minimized during the optimization.
f(s,t) = \left| T^{-1}(G(S(s), I)) - t \right| + \sum_{b} \frac{\left( S(s)_b - b_{\mathrm{init}} \right)^2}{w}
(7)
b is the number of boundaries, w is the image width and b_init defines the positions of the three initial boundaries found by the adaptive thresholding algorithm. T^{-1} is the inverse of T, defined in Eq. (6); it returns the model texture parameters t that are most likely to generate a given vector of texture features g. The first term of the objective function defines the main measure for evaluating the model fit, determined by the difference between the model texture parameters and the texture parameters extracted from the image regions defined by the model shape parameters. The second term penalizes deviations of the boundaries produced by the optimization from the initial boundaries found by the adaptive thresholding algorithm. This is an important novelty compared to the standard AAM, which helps constrain the optimization process to valid solutions. Additionally, we do not start the optimization process from the mean of the model; rather, we determine the median distance between the ILM and RPE boundaries found by the adaptive thresholding algorithm, as well as the ratio of the foveal pit distance to the greatest thickness found in the image. Using these values we pick the closest example from the training set and use its parameters as the initial model position. This ensures faster and more robust convergence.
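A toy sketch of an objective with this structure, using linear shape/texture models, a stub texture extractor in the role of G, and the pseudo-inverse of the texture matrix in the role of T^{-1}; all names and the exact penalty form are our assumptions, not the paper's implementation:

```python
import numpy as np

def make_objective(x_mean, Qs, g_mean, Qg, extract_texture, b_init, w):
    """Build f(s, t) = |T^-1(G(S(s), I)) - t| + sum_b (S(s)_b - b_init_b)^2 / w.
    S(s) = x_mean + Qs @ s synthesizes boundary positions; extract_texture
    measures texture features inside those boundaries."""
    Qg_pinv = np.linalg.pinv(Qg)

    def f(s, t):
        shape = x_mean + Qs @ s                  # S(s): boundary positions
        g = extract_texture(shape)               # G(S(s), I)
        t_hat = Qg_pinv @ (g - g_mean)           # T^-1(g): best texture params
        texture_term = np.abs(t_hat - t).sum()
        penalty = ((shape[:len(b_init)] - b_init) ** 2).sum() / w
        return texture_term + penalty
    return f
```

At the model mean with matching texture parameters and initial boundaries, the objective is zero; any boundary deviation contributes the quadratic penalty term.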

Another novelty is introduced in the second stage of the algorithm, which fits a model to each A-scan (depth-scan) independently to further improve the accuracy. This stage starts from the position defined by the result of the first-stage B-scan fitting. We have chosen to divide the image area into four segments and built an A-scan model for each segment, since different types of variation can be expected at different offsets from the foveal depression. The A-scan model is trained on offsets produced by back-projecting the manual segmentation data using the main B-scan model and computing the boundary offsets between the back projections and the original segmentations, Eq. (8) (n is the number of layers and u is the number of A-scans from all the images in the given segment).
A = \begin{pmatrix} \mathrm{aOff}_{11} & \cdots & \mathrm{aOff}_{1n} \\ \vdots & \ddots & \vdots \\ \mathrm{aOff}_{u1} & \cdots & \mathrm{aOff}_{un} \end{pmatrix}
(8)
In Fig. 4
Fig. 4 On the left is the result after the global low-res optimisation followed by, on the right, the refined result by the A-scan optimization.
it can be seen how the second refinement stage of the algorithm improves precise tracking of the layer boundaries.

2.4 The Mechanical Turk

3. Results and discussion

For evaluation purposes we have used 466 manually segmented B-scans, sampled almost uniformly (in some cases we had to discard the manual segmentation) from 17 eyes (each stack contains 128 B-scans). We have tested the performance of our algorithm on this data set using a leave-one-out test: we iteratively left out all data from one person, trained the model on the remaining data and then tested the performance on the data from the person left out. This procedure is performed for each person in the training set. This way we make sure that we are testing the performance of our algorithm on “unseen” data.
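The leave-one-subject-out protocol can be sketched as follows, with `train_fn` and `error_fn` as placeholders for the model training and error computation (all names are ours):

```python
import numpy as np

def leave_one_subject_out(features, labels, subjects, train_fn, error_fn):
    """For each subject, train on all other subjects' B-scans and evaluate
    on the held-out subject's B-scans; returns per-subject errors."""
    errors = {}
    for s in np.unique(subjects):
        held = subjects == s                               # this subject's scans
        model = train_fn(features[~held], labels[~held])   # train on the rest
        errors[s] = error_fn(model, features[held], labels[held])
    return errors
```

Splitting by subject (rather than by B-scan) is what makes the test data genuinely "unseen": B-scans from one eye are highly correlated, so mixing them across train and test would inflate the apparent accuracy.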

For evaluation, automatic segmentation results were compared to manual segmentation done by the AMT workers. Two types of error measures were used, computed for each boundary i separately and from these we compute error measures for an entire B-scan or for an individual layer, Eq. (9).
E_i^B = \sum_{j=1}^{w} \left| y_{\mathrm{Aut}}^{ij} - y_{\mathrm{Ref}}^{ij} \right|, \qquad E_i^{LDEV} = \sqrt{w \sum_{j=1}^{w} \left( y_{\mathrm{Aut}}^{ij} - y_{\mathrm{Ref}}^{ij} \right)^2}
(9)
E^B (Basic) is the basic error measure that counts the number of misclassified pixels. E^LDEV (Layer DEViation) uses the factor w under the square root for normalization so that in the special case when the two boundaries are equally distant from each other along their whole length (y_Aut^ij − y_Ref^ij = d for all j), it is equal to E^B (proved in Eq. (10)).
E_i^B = \sum_{j=1}^{w} \left| y_{\mathrm{Aut}}^{ij} - y_{\mathrm{Ref}}^{ij} \right| = \sum_{j=1}^{w} |d| = w|d|, \qquad E_i^{LDEV} = \sqrt{w \sum_{j=1}^{w} \left( y_{\mathrm{Aut}}^{ij} - y_{\mathrm{Ref}}^{ij} \right)^2} = \sqrt{w \sum_{j=1}^{w} d^2} = \sqrt{w \cdot w d^2} = w|d| = E_i^B
(10)
For all other cases E^LDEV is larger than E^B. Thus E^LDEV will penalize large deviations from the reference position of a boundary, unlike E^B, which only measures the number of misclassified pixels. E^LDEV is therefore useful for penalizing specific types of poor algorithm performance, such as a large but narrow jump in a boundary position that would not affect E^B significantly since the misclassified area would be relatively small. The error for a whole image (this refers to both E^B and E^LDEV) is defined in Eq. (11).
E = \frac{\sum_{i=1}^{b} E_i}{A}
(11)
Ei is the error for each boundary and A is the area between top (ILM) and bottom boundaries (RPE/CH).

In the case when we express error for layer k separately, instead of summing up across all boundary errors, only the two boundaries that define a layer are added and divided by the sum of the layer area as given by the automatic segmentation (AA) and the layer area as given by the reference segmentation (AR), Eq. (12). This is used to normalize for double counting of misclassified pixels, as each layer is bounded by two boundaries.
E = \frac{\sum_{i=k}^{k+1} E_i}{A_A + A_R}, \qquad 0 < k < b
(12)
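Equations (9)-(12) translate directly into code, and the equality of the two measures for a constant offset d (Eq. (10)) can be checked numerically; the function names are ours:

```python
import numpy as np

def boundary_errors(y_aut, y_ref):
    """Per-boundary errors of Eq. (9): E_B = sum_j |d_j| and
    E_LDEV = sqrt(w * sum_j d_j^2), where d_j is the offset between the
    automatic and reference boundary at column j and w is the image width."""
    d = np.asarray(y_aut, float) - np.asarray(y_ref, float)
    w = d.size
    return np.abs(d).sum(), np.sqrt(w * (d ** 2).sum())

def layer_error(e_top, e_bot, area_aut, area_ref):
    """Per-layer error of Eq. (12): the errors of the two bounding boundaries,
    normalized by the automatic plus reference layer areas to avoid
    double-counting misclassified pixels."""
    return (e_top + e_bot) / (area_aut + area_ref)
```

For a constant offset the two measures coincide, while a narrow, large jump in a boundary leaves E_B small but inflates E_LDEV, exactly the behaviour the text motivates.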
A confidence measure could be introduced based on the values returned by the objective function after the optimization step: large values correspond to low confidence in the boundary positions determined by the model fitting. This would help the operator decide whether the obtained results are reliable.

In Table 1

Table 1. Variability of manual segmentations on 75 B-scans in percent (the data has been previously examined and “bad” results left out)

the inter-worker variability of the manual segmentations used in training is presented for each boundary and in total, while in Table 2

Table 2. Error values on 466 B-scans at various positions from 17 eyes in percent before the A-scan optimization

and Table 3

Table 3. Error values on 466 B-scans at various positions from 17 eyes in percent after the A-scan optimization

results are presented for both the initial segmentation and after the second step refinement.

It can be seen that the total error rates (especially E^B, the main measure) are close to the inter-operator variability (18.2% compared to the inter-operator 16.1%). The E^LDEV difference is somewhat larger. Thus, we can conclude that the algorithm's performance comes close to that of the manual ground truth.

Our algorithm performs well even when artefacts are present, such as strong shadows, which can cause problems for less robust algorithms (Fig. 5
Fig. 5 Robust performance for all the layers is achieved even in presence of shadowing. A despeckled image is shown on the left; the segmented image is on the right.
).

In Fig. 6
Fig. 6 Median and coefficient of variation computed on thickness maps of all the individual layers (nerve fibre layer (NFL), ganglion cell layer and inner plexiform layer (GCL + IPL), inner nuclear layer (INL), outer plexiform layer (OPL), outer nuclear layer (ONL), connecting cilia (CL), outer segment (OS), retinal pigment epithelium (RPE)), as well as the retina, obtained from 17 eyes.
thickness maps are shown for 17 different eyes after registering them and computing median and coefficient of variation (expressed as absolute variation in pixels), since it would take too much space to present the results for each eye individually. It can be seen that despite the data being affected by artefacts, the results are accurate and show larger variation only around the foveal pit region, as can be expected.

To evaluate the performance of the algorithm under conditions of increased noise (reduced dynamic range), which frequently occur in clinical measurements for a number of reasons (opaque cornea or cataract lens, residue in the vitreous humour, non-optimal imaging conditions, etc.), background noise (speckle, multiplicative random noise) has been added to tomograms (Fig. 7
Fig. 7 Segmentation in a case of added strong noise. Left original image. Right filtered, denoised image with segmentation results superimposed.
) and results plotted on a graph. The background was generated using a texture synthesis approach [19

19. A. A. Efros, and W. T. Freeman, “Image quilting for texture synthesis and transfer,” in Proceedings of the 28th annual conference on Computer graphics and interactive techniques(ACM, 2001), pp. 341–346.

]. This enables us to efficiently produce a different speckle noise pattern for each image even though they are all based on the same physical speckle template, which is only one image of background noise with the typical spatial frequency distribution. Using this approach we can generate an arbitrary number of synthetic, but uncorrelated and realistic, images of background noise that we then add to each given image to simulate low dynamic range. The algorithm shows robust performance under such conditions, as evidenced by a gentle rise of the error versus dynamic range curve.
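The paper quilts a measured speckle template [19] to synthesize background noise. As a simpler stand-in for experimenting with the same idea, the sketch below adds an exponential-intensity background (a standard model for fully developed speckle) scaled so that the image peak stands a target number of decibels above the mean background; the scaling rule and names are our assumptions:

```python
import numpy as np

def add_speckle_background(img, dynamic_range_db, rng):
    """Add a synthetic speckle background so that the image peak sits
    dynamic_range_db above the mean background level (10*log10 convention)."""
    noise_level = img.max() / (10 ** (dynamic_range_db / 10.0))
    background = noise_level * rng.exponential(1.0, size=img.shape)
    return img + background
```

Re-drawing the background with a fresh random state for every B-scan gives uncorrelated noise realizations at a controlled dynamic range, mirroring how the quilted templates are used in the evaluation.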

This can be seen in two graphs showing the error rates EB and ELDEV plotted versus the dynamic range for a set of images, for all the layers combined, with the confidence interval (1.96 standard deviations) plotted as dashed lines (Fig. 8), as well as two graphs showing the error rates for each individual layer (Fig. 9). The boundaries most affected by decreasing dynamic range are those defining the INL and OPL, as could be expected, since these layers normally exhibit significant variation and have weak boundaries that are affected early by the noise increase. The boundaries between the CL, OS and RPE are also difficult to determine.

Fig. 8 Error rates EB (Basic) and ELDEV (Layer DEViation) with decreasing dynamic range for all the data sets, with the confidence interval (1.96 × standard deviation) marked by the dashed lines. For both error measures a slow rise in the error values can be observed, which guarantees robust performance with noisy data.

Fig. 9 Error rates for the individual layers, EB (Basic) and ELDEV (Layer DEViation), with decreasing dynamic range for all the data sets. For all the individual layers (nerve fibre layer (NFL), ganglion cell layer and inner plexiform layer (GCL + IPL), inner nuclear layer (INL), outer plexiform layer (OPL), outer nuclear layer (ONL), connecting cilia (CL), outer segment (OS), retinal pigment epithelium (RPE)) a slow rise in the error values can be observed. Thin layers inherently exhibit greater error values, as both errors are normalized by the layer area.
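The confidence bands in Figs. 8 and 9 are the mean error ± 1.96 standard deviations at each dynamic-range level; a minimal sketch of that computation (function and variable names assumed):

```python
import numpy as np

def error_band(errors_per_level):
    """Mean and 1.96-sigma band of per-image errors at each dynamic-range level.

    errors_per_level: dict mapping dynamic range (dB) -> list of per-image errors.
    Returns the sorted levels and arrays for the mean, lower, and upper band.
    """
    levels = sorted(errors_per_level)
    mean = np.array([np.mean(errors_per_level[k]) for k in levels])
    half = 1.96 * np.array([np.std(errors_per_level[k]) for k in levels])
    return levels, mean, mean - half, mean + half
```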

4. Conclusion

We have proposed an algorithm for automatically segmenting all major retinal layers based on a novel statistical model. We have introduced two important novelties with respect to the standard Active Appearance Model (AAM): a second term in the optimization function that penalizes large deviations from the three boundaries found by the adaptive thresholding algorithm, and a second algorithm stage that refines the model fit for each A-scan independently, giving increased accuracy.
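The first novelty can be sketched as a two-term objective: the standard AAM appearance-matching error plus a penalty on the deviation of the model boundaries from the three adaptively thresholded boundaries. This is an assumed formulation for illustration only; the weighting `lam`, the quadratic penalty, and all names are hypothetical, not the authors' notation.

```python
import numpy as np

def objective(texture_residual, model_boundaries, threshold_boundaries, lam=1.0):
    """Two-term cost: AAM appearance error plus boundary-deviation penalty.

    texture_residual: per-pixel appearance error of the AAM fit.
    model_boundaries / threshold_boundaries: arrays of shape (3, n_ascans)
    holding boundary depths for the three thresholded boundaries.
    """
    appearance_term = np.sum(np.asarray(texture_residual, dtype=float) ** 2)
    deviation = np.asarray(model_boundaries, dtype=float) - np.asarray(
        threshold_boundaries, dtype=float)
    penalty_term = lam * np.sum(deviation ** 2)  # discourages large deviations
    return appearance_term + penalty_term
```

An optimizer minimizing this cost is pulled toward good texture matches that also stay close to the thresholded boundaries, with `lam` trading off the two terms.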

The algorithm has been thoroughly tested and evaluated against a large, manually segmented data set from an 800 nm OCT system and proved highly robust in full foveal scans, even in the presence of artefacts and added strong background noise reducing the dynamic range down to 12 dB. It is the first time that such a large, representative data set (466 B-scans from 17 eyes) has been used to evaluate an OCT segmentation algorithm. We used manual segmentations of this large data set as ground truth, rather than the frequently used error computed between the algorithm's results on inter-visit measurements, since the latter tends to underestimate the real error by ignoring the systematic error of the algorithm. Apart from the basic error measure, which counts the number of misclassified pixels, we also used a second error measure that penalizes large deviations from the ground truth.
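The two error measures can be illustrated as follows. The exact definitions from the paper are not reproduced here; this is an assumed variant in which a basic measure counts misclassified pixels and a deviation measure weights boundary errors by their magnitude, both normalized by the layer area as the text describes.

```python
import numpy as np

def basic_error(auto_mask, manual_mask):
    """Fraction of misclassified pixels, normalized by the manual layer area."""
    auto_mask = np.asarray(auto_mask, dtype=bool)
    manual_mask = np.asarray(manual_mask, dtype=bool)
    # XOR marks pixels labelled as the layer by exactly one of the segmentations.
    return np.count_nonzero(auto_mask ^ manual_mask) / np.count_nonzero(manual_mask)

def deviation_error(auto_boundary, manual_boundary, layer_area):
    """Squared boundary deviation summed over A-scans, normalized by layer area.

    Squaring penalizes large deviations more strongly than the pixel count does.
    """
    d = np.asarray(auto_boundary, dtype=float) - np.asarray(manual_boundary, dtype=float)
    return np.sum(d ** 2) / layer_area
```

Normalizing by the layer area explains the behaviour noted in Fig. 9: thin layers inherently yield larger error values for the same absolute deviation.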

Thus, we can conclude that our algorithm demonstrated reliable performance under conditions which prove extremely challenging for the pre-existing methods. Additionally, it has the potential to be used in other areas where boundaries are not well defined, such as segmentation of the choroidal layers, an important open problem in OCT data analysis. It would also be possible to extend the proposed algorithm to the segmentation of pathological cases, as well as of ONH (optic nerve head) scans, which contain discontinuous boundaries. If the stack registration is very precise, the initial ILM and RPE boundary finding step could be replaced by the very efficient algorithm proposed by Fabritius et al. [2], which relies on the full 3D information present in the stack. Clinically, fully automated segmentation of all major layers is essential for making medically useful the capabilities offered by high-resolution, high-speed OCT of large portions of the human retina at microscopic detail.

Acknowledgements

The authors would like to thank Marieh Esmaeelpour for advice on biological interpretation of the data. This research was supported in part by the BBSRC, Cardiff University; FP6-IST-NMP-2 STREPT (017128, NanoUB); DTI grant (OMICRON); AMR grant (AP1110); European Union project FUN OCT (FP7 HEALTH, contract no. 201880) & CARL ZEISS Meditec Inc.

References and links

1. W. Drexler and J. G. Fujimoto, Optical Coherence Tomography: Technology and Applications (Springer, 2008).
2. T. Fabritius, S. Makita, M. Miura, R. Myllylä, and Y. Yasuno, “Automated segmentation of the macula by optical coherence tomography,” Opt. Express 17(18), 15659–15669 (2009). [CrossRef] [PubMed]
3. R. J. Zawadzki, S. S. Choi, S. M. Jones, S. S. Oliver, and J. S. Werner, “Adaptive optics-optical coherence tomography: optimizing visualization of microscopic retinal structures in three dimensions,” J. Opt. Soc. Am. A 24(5), 1373 (2007). [CrossRef]
4. M. K. Garvin, M. D. Abramoff, R. Kardon, S. R. Russell, X. Wu, and M. Sonka, “Intraretinal Layer Segmentation of Macular Optical Coherence Tomography Images Using Optimal 3-D Graph Search,” IEEE Trans. Med. Imaging 27(10), 1495–1505 (2008). [CrossRef] [PubMed]
5. D. Cabrera Fernández, H. M. Salinas, and C. A. Puliafito, “Automated detection of retinal layer structures on optical coherence tomography images,” Opt. Express 13(25), 10200–10216 (2005). [CrossRef] [PubMed]
6. M. Mujat, R. Chan, B. Cense, B. Park, C. Joo, T. Akkin, T. Chen, and J. de Boer, “Retinal nerve fiber layer thickness map determined from optical coherence tomography images,” Opt. Express 13(23), 9480–9491 (2005). [CrossRef] [PubMed]
7. D. Koozekanani, K. Boyer, and C. Roberts, “Retinal thickness measurements from optical coherence tomography using a Markov boundary model,” IEEE Trans. Med. Imaging 20(9), 900–916 (2001). [CrossRef] [PubMed]
8. D. Tolliver, Y. Koutis, H. Ishikawa, J. S. Schuman, and G. L. Miller, “Unassisted Segmentation of Multiple Retinal Layers via Spectral Rounding,” in ARVO (2008).
9. A. Mishra, A. Wong, K. Bizheva, and D. A. Clausi, “Intra-retinal layer segmentation in optical coherence tomography images,” Opt. Express 17(26), 23719–23728 (2009). [CrossRef]
10. I. W. Selesnick, R. G. Baraniuk, and N. G. Kingsbury, “The Dual-Tree Complex Wavelet Transform,” IEEE Signal Process. Mag. 22(6), 123–151 (2005). [CrossRef]
11. A. Mishra, A. Wong, D. A. Clausi, and P. W. Fieguth, “Quasi-random nonlinear scale space,” Pattern Recognit. Lett., in press (corrected proof).
12. A. Wong, A. Mishra, K. Bizheva, and D. A. Clausi, “General Bayesian estimation for speckle noise reduction in optical coherence tomography retinal imagery,” Opt. Express 18(8), 8338–8352 (2010). [CrossRef] [PubMed]
13. P. Thevenaz and M. Unser, “A pyramid approach to sub-pixel image fusion based on mutual information,” in Proceedings of the International Conference on Image Processing (1996), p. 265.
14. C. O. S. Sorzano, P. Thevenaz, and M. Unser, “Elastic registration of biological images using vector-spline regularization,” IEEE Trans. Biomed. Eng. 52(4), 652–663 (2005). [CrossRef]
15. A. K. Mishra, P. W. Fieguth, and D. A. Clausi, “Decoupled Active Contour (DAC) for Boundary Detection,” IEEE Trans. Pattern Anal. Mach. Intell. 99.
16. T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active Appearance Models,” IEEE Trans. Pattern Anal. Mach. Intell. 23(6), 681–685 (2001). [CrossRef]
17. M. Scholz, M. Fraunholz, and J. Selbig, “Nonlinear Principal Component Analysis: Neural Network Models and Applications,” in Principal Manifolds for Data Visualization and Dimension Reduction (2007), pp. 44–67.
18. M. Scholz, F. Kaplan, C. L. Guy, J. Kopka, and J. Selbig, “Non-linear PCA: a missing data approach,” Bioinformatics 21(20), 3887–3895 (2005). [CrossRef] [PubMed]
19. A. A. Efros and W. T. Freeman, “Image quilting for texture synthesis and transfer,” in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (ACM, 2001), pp. 341–346.

OCIS Codes
(100.0100) Image processing : Image processing
(170.4500) Medical optics and biotechnology : Optical coherence tomography
(170.4580) Medical optics and biotechnology : Optical diagnostics for medicine
(100.3008) Image processing : Image recognition, algorithms and filters

ToC Category:
Medical Optics and Biotechnology

History
Original Manuscript: May 14, 2010
Revised Manuscript: June 20, 2010
Manuscript Accepted: June 21, 2010
Published: June 24, 2010

Virtual Issues
Vol. 5, Iss. 11 Virtual Journal for Biomedical Optics

Citation
Vedran Kajić, Boris Považay, Boris Hermann, Bernd Hofer, David Marshall, Paul L. Rosin, and Wolfgang Drexler, "Robust segmentation of intraretinal layers in the normal human fovea using a novel statistical model based on texture and shape analysis," Opt. Express 18, 14730-14744 (2010)
http://www.opticsinfobase.org/vjbo/abstract.cfm?URI=oe-18-14-14730

