OSA's Digital Library

Optics Express

  • Editor: C. Martijn de Sterke
  • Vol. 20, Iss. 3 — Jan. 30, 2012
  • pp: 3241–3249
Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing

L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer


Optics Express, Vol. 20, Issue 3, pp. 3241-3249 (2012)
http://dx.doi.org/10.1364/OE.20.003241


Abstract

Many information processing challenges are difficult to solve with traditional Turing or von Neumann approaches. Implementing unconventional computational methods is therefore essential and optics provides promising opportunities. Here we experimentally demonstrate optical information processing using a nonlinear optoelectronic oscillator subject to delayed feedback. We implement a neuro-inspired concept, called Reservoir Computing, proven to possess universal computational capabilities. We particularly exploit the transient response of a complex dynamical system to an input data stream. We employ spoken digit recognition and time series prediction tasks as benchmarks, achieving competitive processing figures of merit.

© 2012 OSA

1. Introduction

Optical information processing is a vision originating from the 1970s [1, 2], but due to power consumption, volume and scaling issues, interest decayed in the 1980s. Notwithstanding, optical information processing has been receiving reawakened interest with the evolution of photonic technologies and quantum computing [3]. The potential role of optics in supercomputing is again under consideration [4–6].

Inspired by the way the brain processes information, the neuroscience, neural network, and dynamical systems communities have been proposing novel computational concepts [7–9]. These concepts are fundamentally different from the standard Turing or von Neumann machine methods, which are implemented in most computational systems. One of these concepts is known as Echo State Network [7], Liquid State Machine [8] or, more generally, Reservoir Computing (RC). RC is based on the computational power of complex recurrent networks operating in a dynamical and transient-like fashion. Recurrent networks have also been employed in standard neural network approaches, but their connection weights are difficult to train. RC benefits from the advantages of recurrent neural networks while avoiding the problems in the training procedure. A schematic illustration of the network structure typically considered in RC is shown in Fig. 1(a). These complex networks (or reservoirs) usually consist of a large number (10^2 to 10^3) of randomly connected nonlinear dynamical nodes receiving the information to be processed via input signals. These input signals are injected from l input channels into m reservoir nodes, with random weights w^i_{lm}. The reservoir response, i.e. the response of the network to the input signal, is evaluated at the read-out nodes j via a linear weighted sum of k node states, with coefficients w^r_{jk}. Due to the characteristics of the reservoir and its large number of dynamical elements (degrees of freedom), complex classification tasks and any nonlinear approximation can, in principle, be realized [7, 8, 10].
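As a rough illustration of the network picture in Fig. 1(a), the following Python sketch drives a small random reservoir and applies a linear read-out. The tanh node nonlinearity, the network size, the spectral-radius rescaling and all function names are assumptions chosen for illustration; they are not taken from the experiment described in this paper.

```python
import numpy as np

def run_reservoir(inputs, n_nodes=100, spectral_radius=0.9, seed=0):
    """Drive a random tanh reservoir; return node states, shape (T, n_nodes)."""
    rng = np.random.default_rng(seed)
    n_steps, n_channels = inputs.shape
    # Random input weights (role of w^i_{lm}): input channel l -> node m.
    w_in = rng.uniform(-1.0, 1.0, size=(n_nodes, n_channels))
    # Random recurrent weights, rescaled so that the largest |eigenvalue|
    # equals spectral_radius (a common heuristic for fading memory).
    w_res = rng.normal(size=(n_nodes, n_nodes))
    w_res *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w_res)))
    states = np.zeros((n_steps, n_nodes))
    x = np.zeros(n_nodes)
    for t in range(n_steps):
        x = np.tanh(w_res @ x + w_in @ inputs[t])
        states[t] = x
    return states

def readout(states, w_out):
    """Linear read-out (role of w^r_{jk}): weighted sum of the node states."""
    return states @ w_out.T
```

Only `w_out` would be trained in RC; `w_in` and `w_res` stay fixed after their random initialization, which is what sidesteps the training difficulties of general recurrent networks.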

Fig. 1 Schematic representation of RC based on (a) a complex network of nonlinear nodes or (b) a single nonlinear element subject to delayed feedback via time multiplexing, where f(x) stands for the system’s nonlinear transformation and h(t) denotes the system’s impulse response.

Without input, the reservoir is typically set to operate in an asymptotically stable fixed-point state. When excited by an external stimulus (i.e. the information to be processed), the reservoir might, however, exhibit complex transient dynamics. The transient dynamical states, essential for information processing purposes in this scheme, must comply with certain characteristics. If two input signals are similar enough within a certain range, a sufficiently similar transient response must be generated by the reservoir (approximation property). If two input signals belong to different classes, their transient states must sufficiently differ (separation property). These two properties, together with a short-term (fading) memory of the system, are crucial for the computational performance of RC [7, 8]. Similar mechanisms have been reported in real physiological systems [11]. In addition, RC requires the system to be trained with known signals. During this training phase the read-out weights are optimized, enabling subsequent processing of untrained signals belonging to the same class as those used in the training procedure [10].

The experimental implementation of traditional RC brings a key challenge with it. The reservoir is usually composed of a relatively large number of nonlinear nodes interconnected in a network. For instance, a photonic Liquid State Machine based on a network of coupled Semiconductor Optical Amplifiers (SOA) has recently been proposed and simulated [12, 13]. However, considering the physical complexity of the reservoir, the approach of many nodes is technologically highly demanding and often unrealistic. These constraints can be overcome by replacing the complex network of many elements with an approach based on a single nonlinear element subject to long delayed feedback via time multiplexing [10]. Delay systems are well known to be high dimensional and they have been shown to exhibit a sufficiently large number of different transient states. Despite its simplicity (a scalar nonlinear dynamical system, but with a long delay), this system can perform certain tasks as well as traditional reservoirs [10]. A schematic representation of this approach is shown in Fig. 1(b). Here, the complex network is replaced by a reservoir consisting of a single nonlinear element with delayed feedback. The network nodes are distributed along the delay line and the data injection is realized via time multiplexing. From a practical point of view, a big advantage of our scheme is the possible simplification of a hardware implementation.

In the following, we demonstrate the first experimental realization of optical-based RC using a single nonlinear optoelectronic device subject to delayed feedback. Our experiments prove that the RC concept can be transferred from the electronic [10] to the optical domain, using optoelectronic hardware. Moreover, by using a different nonlinearity we show that the particular type of nonlinearity does not seem to be crucial. An advantage of the particular choice of nonlinearity in this manuscript is that it allows us to study the dependence of the RC performance on the shape of the nonlinearity in detail. This is achieved by tuning a single experimental parameter. Finally, our experiment demonstrates the potential for a high bandwidth realization of RC.

2. Experimental setup

The scheme we propose is based on a simple and efficient delay-coupled photonic system, depicted in Fig. 2. This setup was originally proposed as a modern integrated optics version allowing for the exploration of optical chaos [14–16], as exhibited by an Ikeda ring cavity [18]; it was also later successfully modified and used in the framework of broadband optical chaos communications [15], and highlighted as a system for studying fundamental characteristics and applications of complex dynamics including RC [17]. Our implementation consists of several key components. We employ a standard telecommunication wavelength DFB diode laser (20 mW) emitting at 1550 nm. An integrated telecom Mach-Zehnder modulator (MZM, LiNbO3) provides an electro-optic nonlinear modulation transfer function (a sin² function). A long optical fiber implements the delayed feedback loop and a photodiode is employed for optical detection. An electronic feedback circuit closes the nonlinear delay loop, connecting its output to the MZM input electrode. This circuit serves several purposes. It acts as a low pass filter with a characteristic response time TR. It adds the input information uI(t) to the delayed signal x(t) and amplifies the sum before it is applied to the MZM, so that the modulator is driven far enough into its nonlinear range. In addition, it provides the data output w(t).

Fig. 2 Optoelectronic implementation of RC with a single nonlinear element subject to delayed feedback.

Our experimental system provides direct access to key parameters, e.g. the nonlinearity gain β and the offset phase of the MZM Φ0, enabling easy tunability of nonlinearity and dynamical behaviors. Parameter β is controlled via the laser diode power, while Φ0 is controlled by the DC bias input of the MZM. In the absence of input signal, the system is set to operate in a steady (fixed point) state by keeping β at a sufficiently low value. By setting the system in the steady state, a consistent response of the device to the same input signal is guaranteed.

The signal in the feedback loop can be described by the following scalar equation:

$$\varepsilon\,\dot{x}(s) + x(s) = \beta \sin^2\left[\mu x(s-1) + \rho u_I(s-1) + \Phi_0\right], \tag{1}$$

where ρ is the relative weight of the input information compared to the feedback signal x and μ corresponds to the feedback scaling. The parameter ε = TR/τD is the oscillator response time normalized to the delay and s = t/τD is the normalized time. For ρ = 0, the system exhibits the well-known Ikeda dynamics [18], whose bifurcation diagram has already been intensively explored in the literature [19]. In the RC approach, the system typically remains at a fixed point when it is not excited by input information (β < 1). Dynamical complexity occurs during the transient response of the nonlinear delay system when it is excited by the input information.
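To make the fixed-point statement concrete, the sketch below integrates Eq. (1) numerically with a plain forward-Euler scheme. It is a minimal illustration, not the authors' simulation code: the zero initial history, the step size, the function name `simulate` and the default ε = 0.0115 (the ratio of TR = 240 ns to τD = 20.87 μs quoted later in this section) are assumptions made here.

```python
import math

def simulate(beta, phi0, mu=1.0, rho=0.0, eps=0.0115, u=None,
             n_delays=50, steps_per_delay=1000):
    """Euler-integrate eps*x'(s) + x(s) = beta*sin^2(mu*x(s-1) + rho*u(s-1) + phi0).

    Time s is measured in units of the delay. `u` is an optional function of
    s giving the input signal; the history on [-1, 0) is assumed to be zero.
    """
    h = 1.0 / steps_per_delay                 # integration step
    total = (n_delays + 1) * steps_per_delay
    x = [0.0] * total                         # x[k] approximates x(s = k*h - 1)
    for k in range(steps_per_delay, total - 1):
        s_delayed = (k - steps_per_delay) * h - 1.0   # this is s - 1
        drive = rho * u(s_delayed) if u is not None else 0.0
        arg = mu * x[k - steps_per_delay] + drive + phi0
        x[k + 1] = x[k] + (h / eps) * (beta * math.sin(arg) ** 2 - x[k])
    return x

# Without input (rho = 0) and with beta < 1 the trace relaxes to a constant
# value x* that satisfies x* = beta * sin^2(mu * x* + phi0).
trace = simulate(beta=0.3, phi0=0.2 * math.pi)
```

Passing a function for `u` together with a nonzero `rho` perturbs the system away from this fixed point, which is the transient regime exploited for computation.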

In delay systems, the dynamical degrees of freedom are distributed along the delay line [20]. Therefore, we define virtual nodes by dividing the total delay interval of length τD, realized by 4.2 km of optical fiber, into subintervals of length θ [10]. At the end of each subinterval we extract the respective virtual node state. In this way, we aim to mimic the nodes of traditional reservoirs. Unlike in traditional RC, connectivity between virtual nodes is limited to local couplings including a few nearest neighbors. The extent of the coupling is determined by the characteristic response time (TR) of the nonlinear delayed feedback loop through its impulse response. The longer (shorter) TR is relative to the separation θ, the more (fewer) consecutive virtual nodes are connected. Temporal separations θ slightly smaller than TR were found to yield the best RC performance [10]. In addition to this short-time (local) coupling, a long-time coupling originates from the delayed feedback, as explicitly written in Eq. (1).
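The virtual-node picture can be sketched in a few lines of Python. The example below assumes the sample-and-hold convention of the single-node scheme in [10], in which each input sample is held for one full delay τD and multiplied by a mask that changes every θ; the function names, the discretization into `steps_per_node` points per subinterval and the mask values in the usage note are illustrative assumptions.

```python
import numpy as np

def multiplex_input(samples, mask, steps_per_node):
    """Stretch each input sample over one delay and imprint the node mask."""
    samples = np.asarray(samples, dtype=float)
    mask = np.asarray(mask, dtype=float)
    # Mask value of the virtual node that is active at each time step.
    per_delay = np.repeat(mask, steps_per_node)
    return (samples[:, None] * per_delay[None, :]).ravel()

def node_states(trace, n_nodes, steps_per_node):
    """Read x at the end of every subinterval; one row per delay interval."""
    trace = np.asarray(trace, dtype=float)
    per_delay = n_nodes * steps_per_node
    n_delays = trace.size // per_delay
    grid = trace[: n_delays * per_delay].reshape(n_delays, n_nodes, steps_per_node)
    return grid[:, :, -1]
```

For instance, `multiplex_input([1.0, 2.0], [0.5, -0.5, 0.5], 2)` yields twelve values, six per input sample, and `node_states` applied to a response trace of the same length returns a 2 × 3 array of virtual node states.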

In order to evaluate the performance of the system, the transient response of the reservoir needs to be processed for a given task. This dedicated processing is carried out by one or several read-out nodes. Each read-out node is defined by a linear weighted sum of the virtual node states. As is also the case in traditional RC processing, the read-out weights are obtained via a training procedure. This training optimizes the linear separation of the virtual node states, excited by the input information to be processed. A parallel read-out of the virtual nodes can be obtained by simply tapping the delay line at the node positions. Each virtual node is scaled with a weight that needs to be determined from the training stage. In our scheme, a sequential read-out is also possible via time multiplexing, making it more practical and ideally suited for an experimental realization. We have sequentially read out the full transient response of the nonlinear delay dynamics and performed an off-line training procedure using a dedicated toolbox [13].
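The internals of the toolbox are not described here. As a stand-in, the sketch below trains a linear read-out by ridge regression, the method named later for the spoken digit task; the function names, the array layout and the default regularization strength are assumptions for illustration.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6):
    """Ridge regression: minimize ||S W^T - Y||^2 + ridge * ||W||^2 over W."""
    s = np.asarray(states, dtype=float)      # shape (n_samples, n_nodes)
    y = np.asarray(targets, dtype=float)     # shape (n_samples, n_outputs)
    gram = s.T @ s + ridge * np.eye(s.shape[1])
    # Solve the regularized normal equations instead of inverting explicitly.
    return np.linalg.solve(gram, s.T @ y).T  # shape (n_outputs, n_nodes)

def apply_readout(states, weights):
    """Each read-out node is a linear weighted sum of the virtual node states."""
    return np.asarray(states, dtype=float) @ weights.T
```

Because only this linear step is trained, the reservoir itself never has to be modified, and applying the trained weights is a single matrix product.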

In our experiments we have chosen NN = 400 virtual nodes [10] and a delay time of τD = 20.87 μs, i.e. θ = τD/NN = 52.18 ns. With the internal system timescale of TR = 240 ns, we calculate a ratio of TR/θ ≃ 4.6 between the system response time and node width. It is worth mentioning that other values of NN and τD yield similar results, as long as the indicated relative scaling is fulfilled. This is of particular relevance when the proposed setup has to be extended to an ultra-fast version involving standard high speed telecom components.
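The timescale arithmetic above is easy to reproduce. The short Python sketch below recomputes θ and TR/θ from the quoted values and shows how the delay would have to change if the same relative scaling were kept for other parameters; the helper `delay_for` is a hypothetical name introduced here.

```python
tau_d = 20.87e-6   # delay time tau_D in seconds (from the text)
n_nodes = 400      # number of virtual nodes N_N (from the text)
t_r = 240e-9       # response time T_R in seconds (from the text)

theta = tau_d / n_nodes    # node separation, about 52.18 ns
ratio = t_r / theta        # response time in units of theta, about 4.6

def delay_for(t_r, n_nodes, ratio):
    """Delay that keeps T_R / theta equal to `ratio` for the given T_R and N_N."""
    return n_nodes * t_r / ratio
```

Under this scaling a loop with a 100 times shorter response time would need a 100 times shorter delay for the same number of nodes.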

To evaluate the performance of our system we perform two challenging tasks typically used as benchmarks in machine learning and neural network computing: spoken digit recognition and time series prediction. We would like to emphasize at this point that data injection and classification are computed offline in this work. For RC, the input data is multiplied with a discrete mask and undergoes some additional pre-processing depending on the task at hand. The post-processing of the reservoir read-out consists only of a linearly weighted sum. As such, both steps could in the future be implemented in the experimental realization with high bandwidth components. The training procedure, which is also carried out offline, does not affect the bandwidth of the online operation once it has been performed. Accordingly, the achievable bandwidth of an experimental realization consisting of entirely hardware-based data injection, reservoir response and classifier read-out should be determined by the bandwidth of our reservoir.

3. Benchmark tests for evaluating computational power

Spoken digit recognition is a benchmark test widely used in the field of machine learning and in particular RC [21]. Recognizing spoken digits reliably at high speed is a very demanding computational task. At the same time this test also has a certain appeal due to its practical nature. The standard approach to spoken digit recognition utilizes data preprocessing that replicates the response of the human cochlea to sound waves, as depicted in Fig. 3. Lyon’s cochlear ear model [22] divides the input signal into 86 channels containing different frequency information and associates each channel’s response to the data input with a firing (excitation) probability. The input data matrix Ml (dimension Nf × Ns) constructed with Lyon’s cochlear ear model consists of the corresponding Nf = 86 frequency channels and a maximum of Ns = 130 samples in time. Ml is multiplied with the input connectivity matrix Wi (dimension NN × Nf, NN = 400 being the number of virtual nodes in the delay line), creating the data input Mi for the reservoir. Most of the elements w^i_{lm} of the connectivity matrix Wi are set to zero, realizing a sparse and random connectivity between the input layer and the reservoir. The remaining elements are chosen randomly from two discrete mask values, keeping the system in a transient state for the duration of the spoken digit, while also breaking the symmetry between the NN nodes. The elements of the connectivity matrix remain constant for the duration of the node separation θ. For training the output weights we have randomly chosen 475 spoken digits among a data set of 500, leaving 25 for testing. The read-out weights w^r_{jk} are calculated from a ridge regression [23] on the system response to the 475 training samples. These weights correspond to the coefficients of a read-out matrix Wr, which is expected to provide the identification of the spoken digit in the form of a so-called target function. The entire training and test procedure is repeated 20 times with different, non-overlapping fragmentations of the 500 speech samples. By following this approach, we minimize the influence of individual speakers and spoken digits on our results and obtain statistical information.

Fig. 3 Injection of a spoken digit into the reservoir showing the input connectivity matrix (left), a cochleagram of a spoken digit (middle) and the resulting input data of the network (right). In the connectivity matrix the color code represents the magnitude of the input scaling factors w^i_{lm}; in the cochleagram and the network input data the color encodes the amplitudes of the signals, with red (blue) corresponding to large (small) values.

The performance for this task is characterized by the word error rate (WER), as well as a margin. We compute the margin by taking the classifier value of the reservoir’s best guess and subtracting the classifier value of the second best guess. Figures 4(a) and 4(b) show the WER and margin extracted from our experiment, displayed in the (β, Φ0)-plane. Part (c) of the same figure provides the Φ0-dependence for a constant β, while the transmission function of the MZM as a function of Φ0 is shown in part (d). As demonstrated by the nonlinear transfer function of the MZM, depicted in Fig. 4(d), and by Eq. (1), we can experimentally realize a variety of different nonlinear response properties to data input. These can be directly tuned by scanning the (β, Φ0)-plane, which allows us to control the magnitude and sign of the linear as well as the nonlinear response. We can choose to work with settings for different sign and magnitude of slope as well as curvature. Accordingly, our experiment represents not only a powerful electro-optical realization of RC, but at the same time it allows for studying the influence of nonlinearity and dynamical properties on the RC performance. A strong dependence of the classification capability of the reservoir on these parameters is found, with the WER ranging from (7.24 ± 0.79) % down to only (0.04 ± 0.017) %. The systematic dependence of the WER on Φ0 shows the importance of the nonlinearity for the classification performance. We find the lowest WER always to be at points close, but not equal, to the local extrema of the nonlinear response. Around these points the nonlinearity can be approximated by a quadratic function. The optimal operational point has a tendency to be shifted from the local extrema towards the side with a negative slope in the response function. Corresponding points, sharing the same nonlinearity, differ in stability properties of the fixed point for a change in sign of the slope [19]. Besides operating around the local extrema of the response function, we can tune the operating point to the vicinity of the inflection point, making its response almost linear. Here the performance degrades strongly, highlighting the importance of the nonlinearity for classification tasks. When changing β, we find the optimal operational conditions for intermediate values. As soon as β is sufficiently large (β > 0.1) the performance does not critically depend on β, as long as Φ0 is kept optimized. An increase in β, however, results in a growing sensitivity to Φ0. In the absence of feedback (μ = 0), the system’s performance significantly degrades, with the best classification yielding a WER of 1.84 %. Removing the delayed feedback strips the system of its memory, which is thus shown to be beneficial for successful spoken digit classification using our setup. Figure 4(c) shows the WER and margin as a function of Φ0 for β = 0.3 and ρπ in more detail. Error bars are extracted from three independent measurements, repeated under identical experimental conditions. It can be seen that good performance is not limited to a single point, with the WER remaining below 0.5 % for the range 0.75π ≤ Φ0 ≤ 0.95π.
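The two figures of merit can be written down compactly. In the sketch below the classifier value of each digit class is taken to be the time average of its read-out signal; that averaging step and the function names are assumptions, while the margin (best guess minus second best guess) and the error rate follow the definitions given above.

```python
def classify(outputs):
    """outputs[c][t] is the read-out value of class c at time step t.

    Returns (predicted class, margin) based on time-averaged class scores.
    """
    scores = [sum(row) / len(row) for row in outputs]
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    best, second = ranked[0], ranked[1]
    return best, scores[best] - scores[second]

def word_error_rate(predicted, actual):
    """Fraction of test words whose predicted class differs from the label."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)
```

A large margin indicates that the winning class is well separated from the runner-up, so the margin can still distinguish operating points whose WER is equally low.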

Fig. 4 (a) and (b) show the WER and margin for spoken digit recognition in the (β, Φ0)-plane (bifurcation parameter vs. MZM phase). The two figures of merit show a similar dependency on both parameters, with excellent performance at β = 0.3 and Φ0 = 0.89π. (c) Detailed dependence of the RC performance on the MZM phase at β = 0.3. (d) MZM transmission function as a function of phase Φ0.

We further evaluated the performance of our system by addressing the one-time-step prediction task of a time series recorded from a far-infrared laser operating in a chaotic state [24]. The one-time-step prediction is performed by feeding the reservoir only one explicit data point at a time. Information about points further in the past is present in the system only implicitly, through its internal fading memory. To evaluate the performance of our RC approach we computed the normalized mean square error (NMSE) between a sequence of predicted points and their corresponding targets. The results for the one-time-step prediction are depicted in Fig. 5. For β = 0.2 (blue points), we again find a strong dependence of the NMSE on the MZM phase Φ0 and therefore on the characteristics of the nonlinearity. For Φ0 = 0.1π we obtain the lowest prediction error, with an NMSE = 0.124 ± 4 × 10⁻⁴. For the task of time series prediction the system’s performance is optimized for Φ0 being shifted further away from the local extrema in the response function, closer towards the inflection point. In addition, the system’s performance significantly degrades for those values of Φ0 corresponding to the local extrema. This differs from the behavior obtained in the spoken digit recognition task, where the performance at these values of Φ0 was not optimal but the loss in performance was far less significant. We interpret this as a manifestation of the importance of memory for the one-time-step prediction task; however, a small amount of nonlinearity is still required for good performance. To provide evidence that the performance indeed stems from the interplay of high-dimensional mapping and nonlinearity and not from the nonlinearity alone, we in addition plot the data obtained when disconnecting the feedback line (red points, μ = 0). The lower performance without feedback loop (i.e. memory) is clearly visible. Data presented for β = 0.2 shows consistently better optimal performance for Φ0 < 0.5π, where the slope of Eq. (1) is positive. For the case of zero feedback the performance is almost symmetric around Φ0 = 0.5π, again indicating that this effect might be connected to properties of the system’s memory. Time series prediction based on numerical methods achieved even lower prediction errors (below 1 % using echo state networks [25] or support vector machines [26]), however neglecting noise and finite experimental precision and, moreover, externally feeding the reservoir several data points at a time.
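For reference, the error measure and the one-time-step setup can be sketched as below. The text does not spell out the normalization of the NMSE; the sketch assumes the common convention of dividing the mean square error by the variance of the target sequence, and both function names are introduced here for illustration.

```python
def one_step_pairs(series):
    """Pair each data point with its successor as (input, target)."""
    return list(series[:-1]), list(series[1:])

def nmse(predicted, target):
    """Mean square error normalized by the variance of the target sequence."""
    n = len(target)
    mean_t = sum(target) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / n
    return mse / var_t
```

With this normalization a predictor that always outputs the mean of the targets scores an NMSE of 1, so values well below 1 indicate that the reservoir captures structure in the series.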

Fig. 5 MZM phase dependence of the RC performance in a time series prediction task, using the Santa-Fe data set. Best performance for β = 0.2 is found around Φ0 = 0.1π, Φ0 = 0.5π, Φ0 = 0.7π and Φ0 = 0.85π phase values in the vicinity of local extrema of the transfer function of the MZM (see Figs. 4(d), 1(a), and 1(b)).

4. Conclusion

Major work needs to be done in the future in order to explore the full potential of our approach, including scaling possibilities. In addition, implementation of more advanced features, e.g. enhancing the connectivity of the virtual network, real-time post-processing and plasticity rules to optimize the reservoir for the corresponding task during the training phase, are foreseen.

Acknowledgments

We would like to thank J. Danckaert, G. Van der Sande and the members of the PHOCUS consortium for fruitful discussions. The project PHOCUS acknowledges the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under FET-Open grant number: 240763. Moreover, this work was supported by MICINN (Spain), and FEDER, under Projects TEC2009-14101 (DeCoDicA), and 0200950I190 (Proyecto Intramurales Especiales). LL thanks the Institut universitaire de France for its institutional support, as well as the Spanish Ministry for Research for a visiting professor position at the IFISC.

References and links

1. D. A. B. Miller, M. H. Mozolowski, A. Miller, and S. D. Smith, “Nonlinear optical effects in InSb with a cw CO laser,” Opt. Commun. 27, 133–136 (1978).
2. E. Abraham and S. D. Smith, “Optical bistability and related devices,” Rep. Prog. Phys. 45, 815–885 (1982).
3. J. L. O’Brien, “Optical quantum computing,” Science 318, 1567–1570 (2007).
4. H. J. Caulfield and S. Dolev, “Why future supercomputing requires optics,” Nat. Photonics 4, 261 (2010).
5. R. S. Tucker, “The role of optics in computing,” Nat. Photonics 4, 405 (2010).
6. D. A. B. Miller, “Correspondence to the editor,” Nat. Photonics 4, 406 (2010).
7. H. Jaeger and H. Haas, “Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication,” Science 304, 78–80 (2004).
8. D. V. Buonomano and W. Maass, “State-dependent computations: Spatiotemporal processing in cortical networks,” Nat. Rev. Neurosci. 10, 113–125 (2009).
9. J. P. Crutchfield, W. L. Ditto, and S. Sinha, “Introduction to focus issue: Intrinsic and designed computation: Information processing in dynamical systems beyond the digital hegemony,” Chaos 20, 037101 (2010).
10. L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, “Information processing using a single dynamical node as complex system,” Nat. Commun. 2, 468 (2011).
11. M. Rabinovich, R. Huerta, and G. Laurent, “Transient dynamics of neural processing,” Science 321, 48–50 (2008).
12. K. Vandoorne, W. Dierckx, B. Schrauwen, D. Verstraeten, R. Baets, P. Bienstman, and J. Van Campenhout, “Toward optical signal processing using photonic reservoir computing,” Opt. Express 16, 11182–11192 (2008).
13. K. Vandoorne, J. Dambre, D. Verstraeten, B. Schrauwen, and P. Bienstman, “Parallel reservoir computing using optical amplifiers,” IEEE Trans. Neural Netw. 22, 1469–1481 (2011).
14. A. Neyer and E. Voges, “Dynamics of electrooptic bistable devices with delayed feedback,” IEEE J. Quantum Electron. 18, 2009–2015 (1982).
15. L. Larger, J.-P. Goedgebuer, and V. S. Udaltsov, “Ikeda-based nonlinear delayed dynamics for application to secure optical transmission systems using chaos,” C. R. Phys. 5, 669–681 (2004).
16. K. E. Callan, L. Illing, Z. Gao, D. J. Gauthier, and E. Schöll, “Broadband chaos generated by an optoelectronic oscillator,” Phys. Rev. Lett. 104, 113901 (2010).
17. K. Ikeda, “Multiple-valued stationary state and its instability of the transmitted light by a ring cavity system,” Opt. Commun. 30, 257–261 (1979).
18. L. Larger and J. M. Dudley, “Optoelectronic chaos,” Nature 465, 41–42 (2010).
19. T. Erneux, L. Larger, M. W. Lee, and J.-P. Goedgebuer, “Ikeda Hopf bifurcation revisited,” Physica D 194, 49–64 (2004).
20. F. T. Arecchi, G. Giacomelli, A. Lapucci, and R. Meucci, “Two-dimensional representation of a delayed dynamical system,” Phys. Rev. A 45, R4225–R4228 (1992).
21. D. Verstraeten, B. Schrauwen, D. Stroobandt, and J. Van Campenhout, “Isolated word recognition with the liquid state machine: a case study,” Inf. Process. Lett. 95, 521–528 (2005).
22. R. F. Lyon, “A computational model of filtering, detection, and compression in the cochlea,” Proc. of the IEEE Int. Conf. Acoust., Speech, Signal Processing (1982).
23. A. E. Hoerl and R. W. Kennard, “Ridge regression: Applications to nonorthogonal problems,” Technometrics 12, 69–82 (1970).
24. A. S. Weigend and N. A. Gershenfeld, “Time series prediction: Forecasting the future and understanding the past,” ftp://ftp.santafe.edu/pub/Time-Series/Competition (1993).
25. A. Rodan and P. Tino, “Minimum complexity echo state network,” IEEE Trans. Neural Netw. 22, 131–144 (2011).
26. L. J. Cao, “Support vector machines experts for time series forecasting,” Neurocomputing 51, 321–339 (2003).
27. D. Psaltis, D. Brady, X. G. Gu, and S. Lin, “Holography in artificial neural networks,” Nature 343, 325–330 (1990).
28. Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, “Optoelectronic reservoir computing,” http://arxiv.org/abs/1111.7219

OCIS Codes
(190.3100) Nonlinear optics : Instabilities and chaos
(200.3050) Optics in computing : Information processing
(250.4745) Optoelectronics : Optical processing devices

ToC Category:
Optics in Computing

History
Original Manuscript: October 24, 2011
Revised Manuscript: January 13, 2012
Manuscript Accepted: January 16, 2012
Published: January 27, 2012

Citation
L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, "Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing," Opt. Express 20, 3241-3249 (2012)
http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-3-3241


