Virtual Journal for Biomedical Optics

Exploring the Interface of Light and Biomedicine

  • Editors: Andrew Dunn and Anthony Durkin
  • Vol. 7, Iss. 3 — Feb. 29, 2012

Using artificial neural networks for open-loop tomography

James Osborn, Francisco Javier De Cos Juez, Dani Guzman, Timothy Butterley, Richard Myers, Andrés Guesalaga, and Jesus Laine


Optics Express, Vol. 20, Issue 3, pp. 2420-2434 (2012)
http://dx.doi.org/10.1364/OE.20.002420



Abstract

Modern adaptive optics (AO) systems for large telescopes require tomographic techniques to reconstruct the phase aberrations induced by the turbulent atmosphere along a line of sight to a target which is angularly separated from the guide sources that are used to sample the atmosphere. Multi-object adaptive optics (MOAO) is one such technique. Here, we present a method which uses an artificial neural network (ANN) to reconstruct the target phase given off-axis reference sources. We compare our ANN method with a standard least-squares-type matrix multiplication method and with the learn and apply method developed for the CANARY MOAO instrument. The ANN is trained with a large range of possible turbulent layer positions and therefore does not require any input of the optical turbulence profile. It is therefore less susceptible to changing conditions than some existing methods. We also exploit the non-linear response of the ANN to make it more robust to noisy centroid measurements than other, linear techniques.

© 2012 OSA

1. Introduction

Adaptive optics (AO) systems require guide sources to sample the turbulent atmosphere above the telescope. If a guide star is located very close to the target, or we can use the target itself, then this star can be used to directly measure the phase aberrations along the line of sight to the target. However, if there is no sufficiently bright guide star, or if we would like to observe multiple or extended objects in the field, then we require multiple guide stars to sample the turbulent atmosphere. Multiple guide sources are also needed when artificial guide stars are employed. In this case each guide star only illuminates a cone within the turbulent volume above the telescope. If the light cones of these guide stars overlap with the cylinder illuminated by the target we can use tomographic techniques to reconstruct the phase aberrations along the line of sight to the target. The majority of modern AO systems (with the exception of extreme AO for extrasolar planet imaging) make use of tomographic reconstruction techniques. Three major varieties of tomographic AO currently under investigation are laser tomography AO (LTAO) [1], multi-conjugate AO (MCAO) [2] and multi-object AO (MOAO) [3, 4].

1. M. Le Louarn and N. Hubin, “Wide-field adaptive optics for deep-field spectroscopy in the visible,” Mon. Not. R. Astron. Soc. 349(3), 1009–1018 (2004).

2. J. M. Beckers, “Detailed compensation of atmospheric seeing using multiconjugate adaptive optics,” Proc. SPIE 1114, 215–217 (1989).

3. F. Hammer, F. Sayède, E. Gendron, T. Fusco, D. Burgarella, V. Cayatte, J.-M. Conan, F. Courbin, H. Flores, I. Guinouard, L. Jocou, A. Lançon, G. Monnet, M. Mouhcine, F. Rigaud, D. Rouan, G. Rousset, V. Buat, and F. Zamkotsian, “The FALCON Concept: Multi-Object Spectroscopy Combined with MCAO in Near-IR,” in Scientific Drivers for ESO Future VLT/VLTI Instrumentation, J. Bergeron and G. Monnet, eds. (Springer-Verlag, 2002), p. 139.

4. F. Assémat, E. Gendron, and F. Hammer, “The FALCON concept: multi-object adaptive optics and atmospheric tomography for integral field spectroscopy – principles and performance on an 8-m telescope,” Mon. Not. R. Astron. Soc. 376, 287–312 (2007).

In the case of MOAO a number of target directions are observed simultaneously and corrected independently by one deformable mirror (DM) per channel. The guide stars (natural and laser) are distributed around the field and are monitored with open-loop wavefront sensors (WFS), i.e. without a DM in the beam. The information from each guide star is then combined in such a way as to estimate the phase aberrations for each target. CANARY [5, 6] is the first on-sky test of tomographic MOAO and is thus a perfect test bench both for the opto-mechanical technology that needs to be developed and for the algorithms that are required for the control of the instrument.

5. T. Morris, Z. Hubert, R. Myers, E. Gendron, A. Longmore, G. Rousset, G. Talbot, T. Fusco, N. Dipper, F. Vidal, D. Henry, D. Gratadour, T. Butterley, F. Chemla, D. Guzman, P. Laporte, E. Younger, A. Kellerer, M. Harrison, M. Marteaud, D. Geng, A. Basden, A. Guesalaga, C. Dunlop, S. Todd, C. Robert, K. Dee, C. Dickson, N. Vedrenne, A. Greenaway, B. Stobie, H. Dalgarno, and J. Skvarc, “CANARY: The NGS/LGS MOAO demonstrator for EAGLE,” in Proc. 1st AO4ELT Conference, p. 08003 (2010).

6. E. Gendron, F. Vidal, M. Brangier, T. Morris, Z. Hubert, A. Basden, G. Rousset, R. M. Myers, F. Chemla, A. Longmore, T. Butterley, N. Dipper, C. Dunlop, D. Geng, D. Gratadour, D. Henry, P. Laporte, N. Looker, D. Perret, A. Sevin, G. Talbot, and E. Younger, “MOAO first on-sky demonstration with CANARY,” Astron. Astrophys. 529, L2 (2011).

Optical turbulence profilers show that the atmosphere can be considered to be made up of a number of independent, very thin turbulent layers. The altitude and strength of these layers can change with time, so the vertical profile of the optical turbulence develops and evolves during an observation [7]. Median profiles are used in simulations for performance analysis, but it should be remembered that a median profile is not representative of any real profile. The tomographic reconstructor must be able to handle these changing turbulence profiles.

7. R. Avila, E. Carrasco, F. Ibañez, J. Vernin, J. L. Prieur, and D. X. Cruz, “Generalized SCIDAR Measurements at San Pedro Mártir. II. Wind Profile Statistics,” Publ. Astron. Soc. Pac. 118, 503–515 (2006).

Here we present a new method which uses an artificial neural network (ANN) to combine the information from the WFSs and output the integrated reconstructed phase aberrations from the target to the telescope. ANNs are trained by exposing them to a large number of inputs together with the desired outputs. In theory this training data should cover the full range of possible scenarios; in practice that is not possible, so given enough training the ANN will provide a best estimate of the solution. When the ANN is confronted with a superposition of a number of the independent training sets it can then predict an output by combining a number of the synaptic pathways. In this way we do not need to train the ANN with every possible turbulent profile.

We propose to train an ANN off-line with simulated data. The reconstructor is named CARMEN (Complex Atmospheric Reconstructor based on Machine lEarNing). The idea is to train the reconstructor to be able to handle any turbulent profile that it might be exposed to. We do this by carefully selecting the optimum training routines. This is a train-and-apply technique; once trained with the correct parameters (for example, the number and geometry of the guide stars) it will work for any optical turbulence profile. Therefore, we train CARMEN with a large number of independent turbulence profiles. We train it with the off-axis WFS slopes and the desired on-axis target Zernike coefficients. When the network is implemented and shown the off-axis WFS data it will estimate the on-axis Zernike coefficients. We have chosen to output Zernike coefficients at this stage to limit the number of outputs required (i.e. 27 values, assuming we predict up to 6th-order Zernikes, instead of a value of the order of the number of actuators in the DM). It would be possible to predict a higher number of degrees of freedom and we would expect the performance to increase accordingly. No a priori knowledge of the atmosphere is required, and no input from the user or re-training is needed if the atmospheric turbulence profile changes during observing. This is an alternative approach to most other tomographic reconstructors.
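As a quick sanity check on the output count quoted above (our illustration, not part of the CARMEN software), the number of Zernike modes up to a given radial order follows the triangular numbers; excluding piston, order 6 gives 27 coefficients:

```python
def zernike_count(max_radial_order: int, include_piston: bool = False) -> int:
    """Number of Zernike modes up to and including a given radial order.

    There are (n + 1)(n + 2) / 2 modes up to radial order n; piston is
    usually dropped because it carries no wavefront-shape information.
    """
    n = max_radial_order
    total = (n + 1) * (n + 2) // 2
    return total if include_piston else total - 1

print(zernike_count(6))  # 27, matching the number of CARMEN outputs
```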

ANNs have been applied to the field of AO in the past. Angel et al. (1990) [8], Sandler et al. (1991) [9] and Lloyd-Hart et al. (1992) [10] present successful results using neural networks for wavefront sensing in the focal plane. Montera et al. (1996) [11] experimented with an ANN to reduce WFS centroiding error and to estimate the Fried parameter, r0, and the WFS slope measurement error. They found that the ANN performed as well as, but not better than, a standard linear approach for estimating the WFS slopes and the Fried parameter; however, the ANN was very good at estimating the WFS slope measurement error. ANNs have also been investigated for spatial and temporal prediction of the slope measurements. Lloyd-Hart & McGuire (1995) [12] use an ANN to make a temporal prediction of the WFS slopes; the AO latency is then reduced, allowing a better correction. Weddell & Webb (2006, 2007) [13, 14] developed this idea and used off-axis WFS measurements to temporally predict the on-axis slopes. However, this was limited to low-order Zernike modes (tip/tilt) only. More recently, neural networks have been used to model open-loop DMs for MOAO [15]. An accurate DM model is required for open-loop AO as the DM is not seen by the WFS.

8. J. R. P. Angel, P. Wizinowich, M. Lloyd-Hart, and D. Sandler, “Adaptive optics for array telescopes using neural-network techniques,” Nature 348, 221–224 (1990).

9. D. G. Sandler, T. K. Barrett, D. A. Palmer, R. Q. Fugate, and W. J. Wild, “Use of a neural network to control an adaptive optics system for an astronomical telescope,” Nature 351, 300–302 (1991).

10. M. Lloyd-Hart, P. Wizinowich, B. McLeod, D. Wittman, D. Colucci, R. Dekany, D. McCarthy, J. R. P. Angel, and D. Sandler, “First results of an on-line adaptive optics system with atmospheric wavefront sensing by an artificial neural network,” Astrophys. J. 390(1), L41–L44 (1992).

11. D. A. Montera, M. C. Welsh, B. M. Roggemann, and D. W. Ruck, “Processing wave-front-sensor slope measurements using artificial neural networks,” Appl. Opt. 35(21), 4238–4251 (1996).

12. M. Lloyd-Hart and P. McGuire, “Spatio-temporal prediction for adaptive optics wavefront reconstructors,” in Proc. European Southern Observatory Conf. on Adaptive Optics, pp. 95–102 (1995).

13. S. J. Weddell and R. Y. Webb, “Dynamic Artificial Neural Networks for Centroid Prediction in Astronomy,” in Proc. of the Sixth International Conference on Hybrid Intelligent Systems, p. 68 (2006).

14. S. J. Weddell and R. Y. Webb, “A Neural Network Architecture for the Reconstruction of Turbulence Degraded Point Spread Functions,” in Proc. Image and Vision Computing New Zealand, pp. 103–108 (2007).

15. D. Guzmán, F. J. Juez, R. Myers, A. Guesalaga, and F. Lasheras, “Modeling a MEMS deformable mirror using non-parametric estimation techniques,” Opt. Express 18(20), 21356–21369 (2010).

The difference between our proposal and the work of Lloyd-Hart & McGuire (1995) [12] and Weddell & Webb (2006) [13] is that we train the network in simulation rather than on-sky. This allows us to select and control what the network learns and means that we can predict to a higher order. One advantage of on-sky training is that the network is inherently trained on the concurrent turbulence profile. However, if this profile were to change then, like other reconstructors that need to be re-calculated, it would need to be re-trained.

2. Existing reconstructor techniques

Tomographic reconstruction is the re-combination of the information from several guide stars to estimate the phase aberrations along a different line of sight to a scientific target. A standard approach is to use Shack-Hartmann WFSs to measure the phase aberrations in the light cones to the guide stars. When the light cones overlap at the altitude of a turbulent layer the same phase aberrations will be applied to both wavefronts but in different areas of the meta-pupil. We can then look for correlation in the phase maps at the ground. Figure 1 shows a topological diagram of a system with three guide stars and one target. Any turbulence at low altitudes will be well sampled. At higher altitudes the overlap is reduced and we therefore have less information. Above the altitude where the beams no longer overlap there will be very limited correlation in the phase aberration (possibly some correlation in the very low order modes, depending on the extent of the separation and the outer scale of the turbulence) and it is therefore very difficult to gain any information. Any turbulence above this altitude will essentially be noise.

Fig. 1 Topological diagram of the light cones for three guide stars and one target for a 4.2 m telescope, with the guide stars equally distributed on a ring of radius 30 arcseconds. The target direction is shown in red, the guide stars in green and the full field of view in blue. The cut-throughs on the right are taken at 0 m, 5000 m and 10000 m. At higher altitudes the overlap of the guide stars reduces and we sample smaller areas of the target light cone.
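The shrinking overlap sketched in Fig. 1 is straightforward to quantify. The following sketch (our illustration, assuming natural guide stars at infinity so each beam is a cylinder of diameter D) gives the fractional overlap between a guide-star footprint and the target footprint at altitude h, for an angular separation theta, using the standard circle-circle intersection area:

```python
import math

def footprint_overlap_fraction(D: float, theta_rad: float, h: float) -> float:
    """Fractional overlap of two pupil footprints of diameter D at altitude h,
    for beams separated by theta_rad (radians). Assumes sources at infinity,
    i.e. cylindrical beams with circular footprints of diameter D."""
    R = D / 2.0
    d = theta_rad * h  # centre-to-centre separation of the footprints
    if d >= 2.0 * R:
        return 0.0     # beams no longer overlap at this altitude
    # Area of intersection of two equal circles (lens area).
    lens = 2.0 * R**2 * math.acos(d / (2.0 * R)) - (d / 2.0) * math.sqrt(4.0 * R**2 - d**2)
    return lens / (math.pi * R**2)

# 4.2 m telescope, guide star 30 arcsec from the target (geometry of Fig. 1).
theta = 30.0 / 206265.0
for h in (0.0, 5000.0, 10000.0):
    print(f"{h:7.0f} m: {footprint_overlap_fraction(4.2, theta, h):.3f}")
```

For this geometry the overlap falls from 100% at the ground to roughly half at 10 km, and vanishes entirely above D/theta, which is about 29 km here, consistent with the statement that high-altitude turbulence is essentially noise.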

There are several tomographic techniques which can be used to combine the information from the guide stars. Here we examine two methods, a standard least squares type matrix vector multiplication (LS) and learn and apply (L+A). We have chosen these two as benchmark tests to compare with our new technique.

Learn and Apply (L+A) [20] has recently been developed and successfully tested with CANARY. L+A takes a different approach to many other techniques in that it includes the concept of a SLODAR [21] system and so automatically includes the atmospheric optical turbulence profile within the reconstruction. This is done by calculating the covariance matrices between the slopes of all of the guide stars with each other, and of all of the guide stars with an on-axis calibration WFS. By combining the two covariance matrices, the turbulence profile and the geometric positions of all of the guide stars relative to the target are taken into account in the reconstructor. If the turbulence profile were to change during the course of the observation the covariance matrices would need to be re-calculated. However, as the guide star WFSs are open loop it is possible to monitor the profile using the SLODAR method and therefore know when the reconstructor needs to be updated. It should be noted that the on-axis WFS is only available during calibration and not when the instrument is observing the scientific target. This means that the reconstructor cannot be completely updated during observations. However, it might be possible to estimate the on-axis covariance matrix using the off-axis matrix and knowledge of the geometry, allowing L+A to remain stable even in changeable conditions.

20. F. Vidal, E. Gendron, and G. Rousset, “Tomography approach for multi-object adaptive optics,” J. Opt. Soc. Am. A 27(11), 253–264 (2010).

21. R. W. Wilson, “SLODAR: measuring optical turbulence altitude with a Shack–Hartmann wavefront sensor,” Mon. Not. R. Astron. Soc. 337(1), 103–108 (2002).
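The learn and apply construction described above amounts to a linear regression built from two covariance matrices. A schematic sketch with random stand-in data (illustrative shapes and statistics only; this is not the CANARY pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 3 off-axis WFSs of 72 slopes each, 72 on-axis slopes.
n_off, n_on, n_frames = 3 * 72, 72, 5000

# Stand-ins for recorded open-loop slope time series (rows = frames). The
# on-axis "truth" slopes are made partly correlated with the off-axis ones.
s_off = rng.standard_normal((n_frames, n_off))
s_on = 0.9 * s_off[:, :n_on] + 0.1 * rng.standard_normal((n_frames, n_on))

# Learn step: covariance of the off-axis slopes with themselves, and with
# the on-axis calibration sensor.
c_off_off = s_off.T @ s_off / n_frames
c_on_off = s_on.T @ s_off / n_frames

# Apply step: least-squares reconstructor; the pseudo-inverse regularises
# the poorly conditioned covariance matrix.
W = c_on_off @ np.linalg.pinv(c_off_off)

s_on_hat = s_off @ W.T  # estimated on-axis slopes, one row per frame
```

The "learn" stage is the two covariance estimates; the "apply" stage is a single matrix-vector multiply per frame, which is why the reconstructor is cheap to apply but must be re-learned when the turbulence profile changes.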

3. Neural networks

ANNs are well known for their ability to solve problems that are otherwise difficult to model [22–24]. A detailed explanation of ANNs can be found in [15]. Artificial Neural Networks are computational models, inspired by biological neural networks, which consist of a series of interconnected simple processing elements called neurons or nodes. The Multi-Layer Perceptron (MLP) is a specific type of feedforward neural network in which the nodes are organised in layers (input, hidden and output layers) and each neuron is connected with one or more of the following layers only. Each neuron receives a series of data from the preceding layer's neurons or an external source, transforms it locally using an activation or transfer function (Eq. (1)) and sends the result to one or more nodes in any of the following layers (Fig. 2). This cycle repeats until the output neurons are reached.

22. K. Huarng and T. H.-K. Yu, “The application of neural networks to forecast fuzzy time series,” Physica A 363(2), 481–491 (2006).

24. J. W. Denton, “How good are neural networks for causal forecasting?” J. Bus. Forecast. Methods Syst. 14(2), 17–20 (1995).

Fig. 2 A simplified network diagram for CARMEN. The slopes from the WFS are input to the network. They are all connected to every neuron in the hidden layer by a synapse. Each neuron in the hidden layer is then connected to every output node. CARMEN will output the predicted Zernike coefficients for the target direction. Each of the synapses has a weight. At run time the inputs are injected into the network which is then processed by the different activation functions and weights generating a response. In the diagram only a few of the synapses are shown for clarity.
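The run-time behaviour described in the caption of Fig. 2 is a single feedforward pass: slopes in, weighted sums and activations through the hidden layer, Zernike coefficients out. A minimal sketch (our illustration, with hypothetical layer sizes and random weights; a real network would use the trained weights):

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation, as used in the hidden layer of CARMEN."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(slopes, W1, b1, W2, b2):
    """One forward pass of a single-hidden-layer MLP: WFS slopes in,
    Zernike coefficient estimates out."""
    h = sigmoid(W1 @ slopes + b1)  # hidden layer: weighted sum + activation
    return W2 @ h + b2             # linear output layer

# Hypothetical sizes: 3 WFSs x 72 slopes in, 27 Zernike coefficients out;
# the hidden layer matches the input size (see Sec. 3.1).
rng = np.random.default_rng(1)
n_in, n_out = 216, 27
W1, b1 = 0.01 * rng.standard_normal((n_in, n_in)), np.zeros(n_in)
W2, b2 = 0.01 * rng.standard_normal((n_out, n_in)), np.zeros(n_out)

z = mlp_forward(rng.standard_normal(n_in), W1, b1, W2, b2)
print(z.shape)  # (27,)
```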

3.1. Training

As described above, the ANN is trained by showing it a representative selection of inputs with the desired outputs. The training data should attempt to cover the full range of possible scenarios. We propose to train the ANN with simulated data. If we present it with enough independent data the weightings will converge and the network should be able to cope with any input which is similar to, or a combination of stimuli which are all similar to, the training data. If we are not careful with the training data the network will learn to make connections which are only a coincidence in the training set, or are perhaps a secondary concern. By using simulated data we can control what the neural network sees and hope to guide the learning process. We have tested many training scenarios. The best one we have found involves training the network with a single turbulent layer (r0 = 0.12 m and L0 = 30 m). The layer is placed at 155 altitudes ranging from 0 m to 15500 m with 150 m resolution. At each altitude we present CARMEN with 1000 randomly generated phase screens. Using this dataset CARMEN has seen all of the possible layer positions. CARMEN will combine the response of this basis set and use it to model the input data. We can essentially model the atmosphere with the same resolution we use to train CARMEN. In reality what we are doing is teaching the network how to combine slopes with different light-cone overlap fractions in the WFSs (Fig. 1). There are other alternatives for training sets, such as including two turbulent layers (one fixed at the ground and another higher layer at a number of different altitudes), or a more realistic case with a number of layers with different relative strengths. However, although more realistic, these datasets are no longer independent, the network becomes over-trained, and we find that the results are not as good as with the simpler approach explained above.
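The training-set construction described above can be organised as a simple loop over layer altitudes. In this sketch the AO simulation itself is replaced by a random stand-in (`simulate_sample` is hypothetical; a real training set would come from a full Monte Carlo AO model), and the per-altitude screen count is reduced so the example runs quickly:

```python
import numpy as np

rng = np.random.default_rng(3)

# Geometry from the text: a single turbulent layer stepped through 155
# altitudes between 0 m and 15,500 m, with 1000 random phase screens per
# altitude (reduced to 10 here so the sketch runs in seconds).
altitudes = np.linspace(0.0, 15500.0, 155)
screens_per_altitude = 10
n_slopes, n_zern = 216, 27  # hypothetical: 3 WFSs x 72 slopes; 27 Zernikes

def simulate_sample(h, rng):
    """Stand-in for the AO simulation: returns (off-axis WFS slopes,
    on-axis Zernike coefficients) for one random phase screen at altitude h."""
    slopes = rng.standard_normal(n_slopes)
    zern = rng.standard_normal(n_zern)
    return slopes, zern

X, Y = [], []
for h in altitudes:
    for _ in range(screens_per_altitude):
        s, z = simulate_sample(h, rng)
        X.append(s)
        Y.append(z)
X, Y = np.asarray(X), np.asarray(Y)
print(X.shape, Y.shape)  # (1550, 216) (1550, 27)
```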

After testing networks with different architectures and activation functions we have found that the optimum architecture depends on the profile of the optical turbulence in the atmosphere and on the magnitude of the noise. As the optimum architecture is different under different conditions we have decided to use the simplest approach which produces good results in all cases. The simplest network consists of an MLP with only one hidden layer, containing the same number of neurons as the input to allow a full mapping, trained with a back-propagation (BP) algorithm using a sigmoid activation function and a learning rate of 0.03. The results from the more complicated networks are not presented here; however, they were all broadly similar, with each one having slightly better performance in different circumstances. For example, in more complicated atmospheres with many turbulent layers, networks with an additional hidden layer resulted in a slightly lower residual wavefront error. By training the networks with these simple sets that cover the full range of possible layer positions, the network can combine the responses in order to estimate the outputs for much more complicated profiles. No additional information or re-training is necessary even if the atmosphere changes drastically during observing. The tomographic reconstructor is robust even in the most challenging conditions.
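The chosen configuration (one sigmoid hidden layer the same width as the input, back-propagation, learning rate 0.03) corresponds to the following single-sample update. This is a generic textbook BP step with a mean-square-error loss on toy random data, not the authors' R implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_hidden, n_out, lr = 216, 216, 27, 0.03  # hidden width = input; lr from the text

W1 = 0.01 * rng.standard_normal((n_hidden, n_in))
W2 = 0.01 * rng.standard_normal((n_out, n_hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_step(x, target, W1, W2):
    """One stochastic back-propagation update; returns the MSE-type loss."""
    h = sigmoid(W1 @ x)                 # hidden activations
    y = W2 @ h                          # linear output layer
    err = y - target                    # output-layer error
    dW2 = np.outer(err, h)              # gradient w.r.t. output weights
    dh = (W2.T @ err) * h * (1.0 - h)   # back-propagated through the sigmoid
    dW1 = np.outer(dh, x)               # gradient w.r.t. hidden weights
    W1 -= lr * dW1                      # in-place weight updates
    W2 -= lr * dW2
    return 0.5 * float(err @ err)

# Repeatedly presenting one (slopes, Zernike) pair drives the loss down.
x, t = rng.standard_normal(n_in), rng.standard_normal(n_out)
losses = [bp_step(x, t, W1, W2) for _ in range(200)]
```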

The hardware used for training was an OpenSuSE 11.3 system running on an 8-core 2.4 GHz Intel Xeon E5530 CPU with 32 GB of RAM, although only one core and approximately 620 MB of RAM were used. With this configuration, using R v2.12.2, the training time was 4 days, 1 hour and 23 minutes.

4. Results

The results presented here are generated by Monte Carlo simulation. We assume three off-axis natural guide stars equally spaced in a ring of 30 arcseconds radius. The target direction is at the centre of this ring. The telescope diameter is 4.2 m and we assume 7 × 7 subapertures in the Shack-Hartmann WFS. The simulation parameters were chosen to be similar to those of CANARY and the results are compared with a standard LS method and with L+A. In the simulations we use a standard thresholded centre of gravity algorithm for the centroiding.

We assess each of the tomographic reconstructors with three test cases. These are the good, median and bad seeing atmospheric profiles from La Palma, as used in the CANARY simulations (shown in Table 1). Each of the profiles has four turbulent layers, but the altitudes and relative strengths of the layers and the integrated turbulence strength are different in each case.

Table 1. Table of Atmospheric Parameters for the Three Test Cases


In order to compare the reconstructors fairly the LS method is optimised in terms of virtual DM altitude and actuator density by experimentation and the learn stage of L+A is also performed with an atmosphere of the same parameters, but different phase maps, for each test case.

4.1. Noiseless simulation results

The results of a noiseless simulation are shown in Table 2. In contrast to the LS and L+A reconstructors, no change was made to CARMEN between the test cases. The results show that CARMEN was able to adapt to each test case successfully, as it consistently results in the lowest WFE. It is important to note that we show the comparison results to prove that CARMEN can perform as well as the other techniques. The other techniques can be optimised to further improve their results; however, the real strength of CARMEN is that it does not need any modification even in very changeable conditions, and this is reflected in the results.

Table 2. Table of PSF Metrics for Each Tomographic Reconstructor and Test Scenario


If the exposure time were long enough for even the lowest order modes to average out, then running each of these test cases sequentially to simulate a changing atmosphere would produce a resulting PSF that is simply the sum of the three test case PSFs. Therefore, we see that CARMEN would be able to function with a changing atmosphere with no re-configuration necessary. The other reconstructors would require re-configuring in order to obtain a similar result, otherwise their performance could be seriously impaired.

Figure 3(a) shows the PSFs generated using each of the tomographic reconstructors and atmospheric test case 2 (median seeing). The azimuthally averaged radial profiles are also shown in Fig. 3(b). The non-circular diffraction effect seen in the PSF arises because we are approximating the wavefront with Zernikes up to sixth order.

Fig. 3 Simulated PSFs (left) for test 2 (median seeing scenario). Clockwise from top left is the uncorrected PSF, LS, L+A and CARMEN tomography. The residual WFEs are 817 nm, 322 nm, 289 nm and 262 nm respectively. The azimuthally averaged radial profiles of the PSFs are shown on the right.

Figure 4(a) shows the azimuthally averaged radial profiles for the scenario where each of the test cases is run sequentially. The LS and L+A were re-configured for each atmospheric test case. The WFEs for the LS, L+A and CARMEN tomographic techniques are 356 nm, 317 nm and 293 nm, corresponding to Strehl ratios of 0.198, 0.265 and 0.319, respectively. Figure 4(b) shows the residual Zernike variance on a mode-by-mode basis. We see that the residual variance is lower for CARMEN for every mode.

Fig. 4 (a) Radial profiles of the simulated PSFs using the three test atmospheres run sequentially to simulate a changing atmosphere. The residual WFEs are 356 nm, 317 nm and 293 nm for LS, L+A and CARMEN reconstruction. (b) The residual Zernike variance as a function of mode number.

The test atmospheric profiles used above are all similar. We have also applied unrealistic extreme profiles to CARMEN to see if they will still be compensated. We introduce three more test cases, each with two turbulent layers and a 50% split in turbulence strength, one at the ground and one at 5, 10 or 15 km. The LS and L+A were re-configured for each test and CARMEN was left unaltered. Table 3 shows the resulting metrics. The correction reduces with the altitude of the high turbulent layer because of the reduced fraction of overlap of the metapupils. We see that CARMEN functions with a wide range of altitudes for the high layer.

Table 3. Table of Metrics for the Three Extreme Test Cases.


So far the test cases have involved small numbers of layers. Here we experiment with atmospheric profiles containing many layers. The residual WFE for a seven layer atmosphere (as shown in Fig. 5) with CARMEN was 328 nm compared to the uncorrected WFE of 818 nm. The integrated r0 was 0.12 m. This shows that the network functions even with a large number of turbulent layers in the atmosphere without any modification or extra input.

Fig. 5 Arbitrary fictional seven-layer turbulent profile used to test CARMEN's ability to combine the response of many layers. The uncorrected WFE is 818 nm and the CARMEN residual WFE is 328 nm.

The plots in Fig. 6 show that CARMEN, although trained with a single value of the integrated turbulence strength, r0, and outer scale, L0, can actually correct for a wide range of realistic values. We varied r0 between 0.05 m and 0.25 m and L0 between 2 m (D/2, where D is the diameter of the telescope) and 100 m (≈ 25 × D) and the observed pattern in correction is consistent with the other reconstructors.

Fig. 6 WFE as a function of integrated turbulence strength, r0, (a) and of the outer scale, L0, (b) using the atmospheric profile of test case 2.

4.2. Simulation results with shot noise

We have tested our reconstructor with simulated detector noise (shot noise and read noise) in the wavefront sensor. We assumed 100 photons per subaperture (which equates to an 11th magnitude star and a throughput of 50% on a 4.2 m telescope), twenty by twenty pixels per subaperture and 0.2 electrons of readout noise.

There are two approaches that we can take to train the ANN for noise: we can run the noisy WFS measurements through the original CARMEN trained without noise, or we can train a new ANN with slopes that include centroid noise. After testing in simulation we find that the latter is a significantly better solution. Table 4 shows the resultant PSF metrics generated with reconstructors using WFS vectors including shot noise. We see that in the presence of shot noise the difference between CARMEN and the other reconstructors becomes even greater. This is expected, as neural networks have been shown to be good at learning patterns in noisy data [28]. The neural network is essentially de-prioritising the higher order modes, which are now indistinguishable from the noise. The noise was not included when training L+A, and the conditioning parameter was altered to maximise the performance of the LS reconstructor.

28. S. Tamura, “An analysis of a noise reduction neural network,” in Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP-89), vol. 3, pp. 2001–2004 (1989).

Table 4. Table of PSF Metrics for Each Tomographic Reconstructor and Test Scenario Including Shot Noise in the WFSs


Figure 7(a) shows the radial profiles of the PSFs for the three different tomographic reconstructors with the median seeing atmospheric test case. The residual WFEs for the uncorrected, LS, L+A and CARMEN reconstructors are 817, 543, 547 and 368 nm respectively. Figure 7(b) shows the variance of the residual Zernike coefficients (Σ(Z_reconstructed − Z_measured)² / n, where Z_reconstructed are the reconstructed Zernike coefficients, Z_measured are the measured Zernike coefficients and n is the number of iterations of the simulation) for each of the three reconstructors. We can see that CARMEN fits the low order modes better than the other methods. As most of the energy is concentrated in these modes, this explains where the performance advantage of CARMEN comes from. However, in order to do this CARMEN must be trained with a dataset containing the same magnitude of shot noise.

Fig. 7 (a) Azimuthally averaged radial profiles of the uncorrected and LS, L+A and CARMEN reconstructed PSFs. Note that the LS and L+A radial profiles overlap almost perfectly. (b) Residual Zernike variance for the three reconstructors with WFS shot noise.
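The residual-variance metric used in Fig. 7(b) is a per-mode mean-square difference over the simulation iterations. A small helper plus a worked two-frame example (our illustration):

```python
import numpy as np

def residual_zernike_variance(z_rec, z_meas):
    """Per-mode residual variance, sum((Z_rec - Z_meas)^2) / n over n frames.
    Both arrays are shaped (n_frames, n_modes)."""
    z_rec, z_meas = np.asarray(z_rec), np.asarray(z_meas)
    return np.mean((z_rec - z_meas) ** 2, axis=0)

# Tiny worked example: two simulation frames, three Zernike modes.
rec = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
meas = [[0.0, 0.5, 0.0], [0.0, 0.5, 0.0]]
print(residual_zernike_variance(rec, meas))  # per-mode variances 0.5, 0.0, 0.5
```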

5. On-sky implementation

In the work presented here we have only used natural guide stars. However, laser guide stars (LGS) will be required to increase the sky coverage. This will reduce the performance of any tomographic reconstructor due to the reduced overlap of the metapupils caused by focal anisoplanatism of the beams [29]. Although there are no problems with including LGSs in the training simulation, there are other practical issues which may complicate an on-sky implementation. For example, we would need to train the network with the same sodium column density profile and fratricide effects. Although ANNs have been shown to be robust, the training simulation should incorporate all of these issues to optimise the performance.

29. R. W. Wilson and C. R. Jenkins, “Adaptive optics for astronomy: theoretical performance and limitations,” Mon. Not. R. Astron. Soc. 268, 39–61 (1996).

So far all of the training has been done off-line in simulation. This approach has the advantage that we can carefully select the training scenarios to optimise the performance. However, it might also be beneficial to have an additional on-line secondary correction which can tweak the output of CARMEN for the optical turbulence profile, WFS parameters, optical setup (e.g. misregistration errors) and centroid noise actually being observed, and any other effect not included in the simulation. One option for this secondary correction would be to implement an additional neural network. As with the L+A technique, an on-axis truth sensor would be required to train this network. Once trained, this network will take in the vectors from the off-axis wavefront sensors and the output of the initial CARMEN prediction, and output a new, improved estimate of the on-axis phase aberrations. The disadvantage is that if we tune the tomographic reconstructor to the actual turbulence profile and that profile then changes, we will lose performance, as with the other reconstructors. This secondary network is currently under development and we plan to test it with on-sky data.

5.1. Extremely large telescopes

An important question for AO instrument scientists is scalability to ELT-sized telescopes. Because of the larger number of subapertures and guide stars involved, tomography at ELT scales becomes computationally more demanding. Although training the ANN becomes exponentially more time consuming for larger telescopes (or, more correctly, for larger numbers of sub-apertures), the computational complexity of applying the trained network remains constant. A network can therefore be trained and implemented on ELT-scale telescopes. Although we think it might be possible to extrapolate the correction geometrically for any target direction, it is worth noting that at present every different asterism requires a new training, so advance planning is necessary.
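The scaling argument can be illustrated with a back-of-the-envelope count of multiply-accumulate operations for one forward pass of a trained one-hidden-layer network. The subaperture, guide-star and hidden-layer numbers are illustrative guesses, not CANARY or ELT specifications:

```python
# Rough sketch of per-frame cost for a trained one-hidden-layer network:
# once trained, each reconstruction is a fixed pair of dense matrix-vector
# products, so the cost per frame is constant for a given geometry.

def forward_macs(n_subaps, n_gs, n_hidden, n_modes):
    """Multiply-accumulates for one forward pass through two dense layers."""
    n_in = 2 * n_subaps * n_gs          # x/y slopes per guide star
    return n_in * n_hidden + n_hidden * n_modes

# 8-m-class (CANARY-like) geometry vs a hypothetical ELT-scale geometry
small = forward_macs(n_subaps=36, n_gs=3, n_hidden=216, n_modes=14)
large = forward_macs(n_subaps=5000, n_gs=6, n_hidden=60000, n_modes=5000)
print(small, large)  # the per-frame cost grows with geometry, not with time
```

This also makes the parallelism argument below concrete: the per-frame work is dominated by dense matrix products, which map naturally onto parallel hardware.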

The strength of the ANN for tomographic reconstruction comes from its non-linear properties. The disadvantage is that the computational time required for each iteration scales badly in comparison with other, linear techniques. However, ANN architectures and their associated learning algorithms take advantage of the inherent parallelism in neural processing [30]. For specific applications such as tomographic reconstruction at ELT scales, which demand a high volume of adaptive real-time processing and learning of large data-sets in a reasonable time, energy-efficient ANN hardware with truly parallel processing capabilities is recommended [31]. A wide spectrum of technologies and architectures has been explored in the past, including digital, analog, hybrid, FPGA-based and (non-electronic) optical implementations ([31] and references therein). Efficient neural-hardware designs are well known for achieving high speeds and low power dissipation.

We are not yet able to define the final computational requirements of an ANN tomographic reconstructor at ELT scales but, as an example of the capabilities of a neural-hardware implementation, a typical real-time image processing task may demand 10 teraflops, well beyond the capacity of current PCs or workstations [31]. In such cases neurohardware is an attractive choice and can provide a better cost-to-performance ratio even when compared with supercomputers.

6. Conclusion

We have presented and tested in simulation a novel and versatile tomographic reconstruction technique using an artificial neural network. We train the network with a number of simulated datasets designed to sample the full range of possible input signals. The data set comprises a single turbulent layer positioned at a number of different altitudes, in order to show the network as many different overlap fractions as possible. After testing several different training scenarios and network architectures we found that the simplest, with a single hidden layer, performs best. The reconstructor has been compared in simulation with a standard LS technique and with L+A.
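The training strategy summarised above can be caricatured as follows: generate training pairs from a single layer swept over many altitudes, then fit a one-hidden-layer network by backpropagation. The data generator here is a toy stand-in (the layer altitude is appended as an input so the toy regression is well-posed, whereas the real network infers the geometry from multiple guide-star slope vectors); none of the sizes or functions come from the paper's simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_sample(altitude_km, n_slopes=20, n_out=5):
    """Hypothetical sample: random 'slopes' and a target produced by an
    altitude-dependent linear mixing, standing in for overlap geometry."""
    x = rng.normal(size=n_slopes)
    mix = np.sin(altitude_km + np.arange(n_out)[:, None] * np.arange(n_slopes))
    return np.append(x, altitude_km / 20.0), mix @ x / n_slopes

# Training set spanning layer altitudes from 0 to 20 km
altitudes = np.linspace(0.0, 20.0, 200)
X, Y = map(np.array, zip(*(toy_sample(h) for h in altitudes)))

# One-hidden-layer network trained by plain batch gradient descent
w1 = rng.normal(0, 0.1, (21, 30)); b1 = np.zeros(30)
w2 = rng.normal(0, 0.1, (30, 5)); b2 = np.zeros(5)
mse0 = float(((np.tanh(X @ w1 + b1) @ w2 + b2 - Y) ** 2).mean())

lr = 0.05
for _ in range(500):
    h = np.tanh(X @ w1 + b1)
    err = (h @ w2 + b2) - Y
    gw2 = h.T @ err / len(X); gb2 = err.mean(0)          # output layer grads
    gh = (err @ w2.T) * (1 - h ** 2)                     # backprop through tanh
    gw1 = X.T @ gh / len(X); gb1 = gh.mean(0)            # hidden layer grads
    w1 -= lr * gw1; b1 -= lr * gb1; w2 -= lr * gw2; b2 -= lr * gb2

final_mse = float(((np.tanh(X @ w1 + b1) @ w2 + b2 - Y) ** 2).mean())
print(mse0, final_mse)  # training error should fall from its initial value
```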

We compare with LS and L+A only as a benchmark, to show that the performance of CARMEN is on a par with other accepted reconstruction techniques. These reconstructors could be optimised further to obtain a better correction, but we believe that CARMEN could equally be improved by allowing the training process more time. We therefore do not wish to draw conclusions about the magnitude of the correction at any one time.

The second strength of CARMEN is its ability to process shot-noise-corrupted centroid measurements. We have shown through Monte Carlo simulation that CARMEN reconstructs the on-axis Zernike coefficients from noisy off-axis guide sources better than the LS and L+A reconstructors. For example, for the CANARY median test case, LS and L+A result in residual WFEs of 543 and 547 nm respectively, whereas CARMEN achieves a residual WFE of 368 nm. From analysis of the variance of the Zernike residuals we see that the majority of this improvement comes from the low-order modes.
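The connection between per-mode residual variances and a single residual WFE figure is simply a quadrature sum: the total residual WFE is the square root of the summed Zernike residual variances. The variances below are made up for illustration, not taken from the paper's results:

```python
import numpy as np

# Per-mode residual variances in nm^2 (illustrative values only); the
# low-order modes dominate the budget, as in the analysis above.
resid_var_nm2 = np.array([9e4, 2e4, 6e3, 3e3, 1e3])

# Total residual wavefront error: quadrature sum over modes
wfe_nm = float(np.sqrt(resid_var_nm2.sum()))
print(round(wfe_nm, 1))  # prints 346.4
```

This is why reducing the low-order residual variances, as CARMEN does under centroid noise, translates almost directly into the headline WFE improvement.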

Acknowledgments

The author received a postdoctoral fellowship from the School of Engineering at Pontificia Universidad Católica de Chile, as well as support from the European Southern Observatory and the Government of Chile. D. Guzman appreciates support from Pontificia Universidad Católica, grant inicio No. 8/2010. TB acknowledges the Santander Mobility Grant. This work was partially supported by the Chilean Research Council grants Fondecyt-1095153 and Fondecyt-11110149 and by the Spanish Science and Innovation Ministry, project reference PLAN NACIONAL AYA2010-18513. We would also like to thank Eric Gendron and Fabrice Vidal (LESIA) for their useful comments regarding the Learn and Apply method.

References and links

1. M. Le Louarn and N. Hubin, “Wide-field adaptive optics for deep-field spectroscopy in the visible,” Mon. Not. R. Astron. Soc. 349(3), 1009–1018 (2004). [CrossRef]

2. J. M. Beckers, “Detailed compensation of atmospheric seeing using multiconjugate adaptive optics,” Proc. SPIE 1114, 215–217 (1989).

3. F. Hammer, F. Sayède, E. Gendron, T. Fusco, D. Burgarella, V. Cayatte, J.-M. Conan, F. Courbin, H. Flores, I. Guinouard, L. Jocou, A. Lançon, G. Monnet, M. Mouhcine, F. Rigaud, D. Rouan, G. Rousset, V. Buat, and F. Zamkotsian, “The FALCON Concept: Multi-Object Spectroscopy Combined with MCAO in Near-IR,” in Scientific Drivers for ESO Future VLT/VLTI Instrumentation, J. Bergeron and G. Monnet, eds. (Springer-Verlag, 2002), p. 139. [CrossRef]

4. F. Assémat, E. Gendron, and F. Hammer, “The FALCON concept: multi-object adaptive optics and atmospheric tomography for integral field spectroscopy – principles and performance on an 8-m telescope,” Mon. Not. R. Astron. Soc. 376, 287–312 (2007). [CrossRef]

5. T. Morris, Z. Hubert, R. Myers, E. Gendron, A. Longmore, G. Rousset, G. Talbot, T. Fusco, N. Dipper, F. Vidal, D. Henry, D. Gratadour, T. Butterley, F. Chemla, D. Guzman, P. Laporte, E. Younger, A. Kellerer, M. Harrison, M. Marteaud, D. Geng, A. Basden, A. Guesalaga, C. Dunlop, S. Todd, C. Robert, K. Dee, C. Dickson, N. Vedrenne, A. Greenaway, B. Stobie, H. Dalgarno, and J. Skvarc, “CANARY: The NGS/LGS MOAO demonstrator for EAGLE,” in 1st AO4ELT Conference, p. 08003 (2010).

6. E. Gendron, F. Vidal, M. Brangier, T. Morris, Z. Hubert, A. Basden, G. Rousset, R. M. Myers, F. Chemla, A. Longmore, T. Butterley, N. Dipper, C. Dunlop, D. Geng, D. Gratadour, D. Henry, P. Laporte, N. Looker, D. Perret, A. Sevin, G. Talbot, and E. Younger, “MOAO first on-sky demonstration with CANARY,” Astron. Astrophys. 529, L2 (2011).

7. R. Avila, E. Carrasco, F. Ibañez, J. Vernin, J. L. Prieur, and D. X. Cruz, “Generalized SCIDAR Measurements at San Pedro Mártir. II. Wind Profile Statistics,” Publ. Astron. Soc. Pac. 118, 503–515 (2006). [CrossRef]

8. J. R. P. Angel, P. Wizinowich, M. Lloyd-Hart, and D. Sandler, “Adaptive optics for array telescopes using neural-network techniques,” Nature 348, 221–224 (1990). [CrossRef]

9. D. G. Sandler, T. K. Barrett, D. A. Palmer, R. Q. Fugate, and W. J. Wild, “Use of a neural network to control an adaptive optics system for an astronomical telescope,” Nature 351, 300–302 (1991). [CrossRef]

10. M. Lloyd-Hart, P. Wizinowich, B. McLeod, D. Wittman, D. Colucci, R. Dekany, D. McCarthy, J. R. P. Angel, and D. Sandler, “First results of an on-line adaptive optics system with atmospheric wavefront sensing by an artificial neural network,” Astrophys. J. 390(1), L41–L44 (1992). [CrossRef]

11. D. A. Montera, M. C. Welsh, B. M. Roggemann, and D. W. Ruck, “Processing wave-front-sensors slope measurements using artificial neural networks,” Appl. Opt. 35(21), 4238–4251 (1996). [CrossRef] [PubMed]

12. M. Lloyd-Hart and P. McGuire, “Spatio-temporal prediction for adaptive optics wavefront reconstructors,” in Proc. European Southern Observatory Conf. on Adaptive Optics, pp. 95–102 (1995).

13. S. J. Weddell and R. Y. Webb, “Dynamic Artificial Neural Networks for Centroid Prediction in Astronomy,” in Proc. of the Sixth International Conference on Hybrid Intelligent Systems, p. 68 (2006). [CrossRef]

14. S. J. Weddell and R. Y. Webb, “A Neural Network Architecture for the Reconstruction of Turbulence Degraded Point Spread Functions,” in Proc. Image and Vision Computing New Zealand, pp. 103–108 (2007).

15. D. Guzmán, F. J. Juez, R. Myers, A. Guesalaga, and F. Lasheras, “Modeling a MEMS deformable mirror using non-parametric estimation techniques,” Opt. Express 18(20), 21356–21369 (2010). [CrossRef] [PubMed]

16. B. L. Ellerbroek, “First-order performance evaluation of adaptive-optics systems for atmospheric-turbulence compensation in extended-field-of-view astronomical telescopes,” J. Opt. Soc. Am. A 11(2), 783–805 (1994). [CrossRef]

17. T. Fusco, J. Conan, G. Rousset, L. M. Mugnier, and V. Michau, “Optimal wave-front reconstruction strategies for multiconjugate adaptive optics,” J. Opt. Soc. Am. A 18(10), 2527–2538 (2001). [CrossRef]

18. J. W. Wild, E. J. Kibblewhite, and R. Vuilleumier, “Sparse matrix wave-front estimators for adaptive-optics systems for large ground-based telescopes,” Opt. Lett. 20(9), 955–957 (1995). [CrossRef] [PubMed]

19. E. Thiébaut and M. Tallon, “Fast minimum variance wavefront reconstruction for extremely large telescopes,” J. Opt. Soc. Am. A 27(5), 1046–1059 (2010). [CrossRef]

20. F. Vidal, E. Gendron, and G. Rousset, “Tomography approach for multi-object adaptive optics,” J. Opt. Soc. Am. A 27(11), 253–264 (2010). [CrossRef]

21. R. W. Wilson, “SLODAR: measuring optical turbulence altitude with a Shack–Hartmann wavefront sensor,” Mon. Not. R. Astron. Soc. 337(1), 103–108 (2002). [CrossRef]

22. K. Huarng and T. H.-K. Yu, “The application of neural networks to forecast fuzzy time series,” Physica A 363(2), 481–491 (2006). [CrossRef]

23. K. Swingler, Applying Neural Networks: A Practical Guide (Academic Press, 1996).

24. J. W. Denton, “How good are neural networks for causal forecasting?” J. Bus. Forecast. Methods Syst. 14(2), 17–20 (1995).

25. S. S. Haykin, Neural Networks: A Comprehensive Foundation (Prentice Hall, 1999).

26. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature 323, 533–536 (1986). [CrossRef]

27. L. Bottaci, P. J. Drew, J. E. Hartley, M. B. Hadfield, R. Farouk, P. W. Lee, I. M. Macintyre, G. S. Duthie, and J. R. Monson, “Artificial neural networks applied to outcome prediction for colorectal cancer patients in separate institutions,” The Lancet 350(9076), 469–472 (1997). [CrossRef]

28. S. Tamura, “An analysis of a noise reduction neural network,” in Proc. 1989 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-89), vol. 3, pp. 2001–2004 (1989).

29. R. W. Wilson and C. R. Jenkins, “Adaptive Optics for astronomy: theoretical performance and limitations,” Mon. Not. R. Astron. Soc. 268, 39–61 (1996).

30. M. Hänggi and G. S. Moschytz, Cellular Neural Networks: Analysis, Design and Optimization (Kluwer Academic Publishers, 2000).

31. J. Misra and I. Saha, “Artificial neural networks in hardware: A survey of two decades of progress,” Neurocomputing 74(1–3), 239–255 (2010). [CrossRef]

OCIS Codes
(010.1080) Atmospheric and oceanic optics : Active or adaptive optics
(010.1330) Atmospheric and oceanic optics : Atmospheric turbulence

ToC Category:
Adaptive Optics

History
Original Manuscript: October 6, 2011
Revised Manuscript: December 22, 2011
Manuscript Accepted: December 22, 2011
Published: January 19, 2012

Virtual Issues
Vol. 7, Iss. 3 Virtual Journal for Biomedical Optics

Citation
James Osborn, Francisco Javier De Cos Juez, Dani Guzman, Timothy Butterley, Richard Myers, Andrés Guesalaga, and Jesus Laine, "Using artificial neural networks for open-loop tomography," Opt. Express 20, 2420-2434 (2012)
http://www.opticsinfobase.org/vjbo/abstract.cfm?URI=oe-20-3-2420


