All-optical reservoir computing

François Duport, Bendix Schneider, Anteo Smerieri, Marc Haelterman, and Serge Massar


Optics Express, Vol. 20, Issue 20, pp. 22783-22795 (2012)
http://dx.doi.org/10.1364/OE.20.022783



Abstract

Reservoir Computing is a novel computing paradigm that uses a nonlinear recurrent dynamical system to carry out information processing. Recent electronic and optoelectronic Reservoir Computers based on an architecture with a single nonlinear node and a delay loop have shown performance on standardized tasks comparable to state-of-the-art digital implementations. Here we report an all-optical implementation of a Reservoir Computer, made of off-the-shelf components for optical telecommunications. It uses the gain saturation of a semiconductor optical amplifier as its nonlinearity. The present work shows that, within the Reservoir Computing paradigm, all-optical computing with state-of-the-art performance is possible.

© 2012 OSA

1. Introduction

Reservoir Computing is a novel neural network computing paradigm introduced at the onset of the 21st century [1–15]. It is very well suited to processing time-dependent inputs, and provides state-of-the-art performance on tasks such as speech recognition and time series prediction (see [11,15] for reviews).

A Reservoir Computer consists of a high-dimensional nonlinear dynamical system, the reservoir, driven by a time-dependent input, and an output layer in which the output signal is calculated. The time-dependent output of the reservoir is given by a linear combination of the instantaneous internal states of the reservoir. The weights of this linear combination are the only parameters of the system that are trained to match the reservoir output to a target sequence. In addition, a few parameters of the dynamical system are usually tunable, to adjust how close the system is to the edge of stability and to ensure adequate coupling between the internal variables. The simplicity and flexibility of the concept of Reservoir Computing make it eminently suitable for experimental implementation. Indeed, given the loose constraints that a system must satisfy in order to constitute a suitable reservoir, a variety of experimental realizations have so far been built. These include a bucket of water [16], an analogue VLSI chip [17], and a delay system with a single nonlinear node. The latter architecture was introduced experimentally in [18] and studied independently from a theoretical point of view in [19,20]. Because of its simplicity, this architecture has proven very powerful for experiments. It enabled, using an electronic implementation, the first experimental Reservoir Computer with performance comparable to state-of-the-art digital implementations [21], as well as the first optoelectronic implementations of Reservoir Computing [18,22,23]. For a further general discussion, we refer to the supplementary material of [22], in which a roadmap for building experimental reservoirs is presented.

The realization of all-optical Reservoir Computers is attractive. All-optical computation has been a tantalizing but frustrating goal for decades. On the one hand, optics offers great promise for computation, thanks to its inherent speed and parallelism. On the other hand, the difficulty of realizing optical nonlinearities and of building complex optical architectures has hindered these efforts. Reservoir Computing seems a promising new avenue. Indeed, in Reservoir Computing one can directly use the available nonlinearities for computation, rather than first adapting them so as to implement specific operations (e.g. logical gates such as AND and OR) which are then combined into more complex architectures. This implies that Reservoir Computing is highly flexible, a desirable feature when, as in optics, implementing nonlinearities is difficult.

The possibility of an all-optical implementation of Reservoir Computing based on an array of Semiconductor Optical Amplifiers (SOAs) on a chip has been proposed and studied through numerical simulations in [24,25]. Here we report the first all-optical experimental implementation of a Reservoir Computer. The architecture we use is based on a fiber-optics delay loop with a single nonlinear node and off-line training [21–23]. The nonlinearity is provided by the gain saturation of an SOA.

2. The photonic hardware implementation

2.1. Principles of reservoir computing

On a conceptual level, a Reservoir Computer can be thought of as a collection of internal nodes whose states x_i evolve in discrete time n according to a nonlinear recurrent map of the form

$$ x_i(n) = F_{\mathrm{NL}}\left( \alpha \sum_{j=1}^{N} A_{ij}\, x_j(n-1) + \beta\, m_i\, u(n) \right), \qquad i = 1, 2, \ldots, N \qquad (1) $$

where N is the number of nodes, F_NL is a nonlinear function, u(n) is the input signal, m is the input mask vector, and A is the interconnection matrix. The feedback gain α and the input gain β are parameters used to tune the dynamics of the reservoir in order to find its best working point. The input mask m enriches the dynamics of the reservoir by distributing the same input to different nodes with different weights m_i.

The aim of a Reservoir Computer is to carry out computation on the input signal u(n). The result of the computation is an output ŷ(n), given by a linear combination of the node states

$$ \hat{y}(n) = \sum_{i=1}^{N} W_i\, x_i(n) \qquad (2) $$

The output weights W_i are calculated to minimize the distance between the actual output ŷ(n) and the desired output y(n). This is typically done by considering a "training" sequence for which the input and desired output are known. In many cases the distance between the actual and desired output is measured using the Normalized Mean Square Error (NMSE), given by

$$ \mathrm{NMSE} = \frac{\langle (y(n) - \hat{y}(n))^2 \rangle_n}{\langle (y(n) - \langle y \rangle_n)^2 \rangle_n} \qquad (3) $$

Note that the input and output can both have more than one dimension, in which case the input mask m_i and readout weights W_i become the matrices m_ik and W_ik, respectively, as will be considered in Secs. 4.5 and 4.6.
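
To make Eqs. (1)–(3) concrete, the following Python/NumPy sketch simulates a small discrete-time reservoir and trains its linear readout on a toy one-step memory task. The tanh nonlinearity, the random interconnection matrix, the parameter values, and the task are illustrative assumptions, not the experimental system described below:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 50                       # number of internal nodes
    alpha, beta = 0.9, 0.5       # feedback and input gains (illustrative)
    A = rng.normal(size=(N, N))  # interconnection matrix (illustrative)
    A /= np.max(np.abs(np.linalg.eigvals(A)))  # normalize spectral radius
    m = rng.choice([-0.1, 0.1], size=N)        # input mask

    def run_reservoir(u, f_nl=np.tanh):
        """Iterate Eq. (1) and return the node states x_i(n)."""
        x = np.zeros((len(u), N))
        for n in range(1, len(u)):
            x[n] = f_nl(alpha * (A @ x[n - 1]) + beta * m * u[n])
        return x

    u = rng.uniform(-1.0, 1.0, size=4000)
    y = np.roll(u, 1)            # toy target: y(n) = u(n-1)
    X = run_reservoir(u)
    # Train the readout of Eq. (2) by least squares on the first half
    W, *_ = np.linalg.lstsq(X[:2000], y[:2000], rcond=None)
    y_hat = X[2000:] @ W
    nmse = np.mean((y[2000:] - y_hat) ** 2) / np.var(y[2000:])  # Eq. (3)
    print(f"test NMSE = {nmse:.4f}")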

2.2. Configuration with a delay loop and a single nonlinear node

The Reservoir Computer implementation using a single nonlinear node and delayed feedback has been described in detail in [21–23]. Its main advantage is experimental simplicity, since it requires very few components.

To implement this architecture, we need to go from the discrete time n used in Eq. (1) to continuous time. To this end, each input u(n) undergoes a sample-and-hold procedure, during which it is held for a time T. Each interval of length T is subdivided into N intervals of length θ. The i-th interval of duration θ (i = 1, ..., N) is associated with the input value m_i u(n), where m_i is the corresponding value of the input mask. It is these values m_i u(n) that are sequentially used to drive the reservoir.

We denote by T′ the round-trip time of the delay loop. Two approaches have been proposed to implement the single nonlinear node and delayed feedback scheme. In the first, used in [21,23], one takes T′ = T. In this case, to enrich the dynamics, it is necessary to include in the delay loop a low-pass filter with a time constant typically of order a few times θ. This internal time scale couples successive node states to each other.

In the second approach, used in [22], the nonlinearity is instantaneous, but one desynchronizes the input from the round-trip time, for instance by taking T′ = (N+1)θ. This introduces a coupling from each internal variable x_i to the adjacent one x_{i−1}. It is this second approach that is used in the present experiment.
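
The following sketch illustrates this time-multiplexing scheme under the stated assumptions (N = 50, θ = 155.76 ns, a random binary mask; all names and mask values are illustrative): each input u(n) is expanded into N masked values held for θ each, and choosing a round-trip time of (N+1)θ shifts the delayed feedback by one θ-slot, which produces the coupling from x_i to x_{i−1}:

    import numpy as np

    N = 50                       # nodes per input sample
    theta = 155.76e-9            # node duration (s)
    T = N * theta                # input hold time
    T_prime = (N + 1) * theta    # loop round-trip time

    rng = np.random.default_rng(1)
    m = rng.choice([-0.1, 0.1], size=N)    # input mask (illustrative)

    def masked_drive(u):
        """Sample-and-hold: each input u(n) becomes the N values
        m_i * u(n), each held for one theta-slot."""
        return np.concatenate([m * un for un in u])

    # With T' = (N+1)*theta, a sample emitted in theta-slot s returns,
    # one round trip later, in slot s + N + 1: node i thus receives
    # feedback from node i-1 (the wrap at i = 1 is written out
    # explicitly in Eq. (4) of Sec. 4.2).
    s = 7
    print(s % N, "->", (s + N + 1) % N)    # prints "7 -> 8"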

2.3. Experimental implementation

Our experimental setup is depicted in Fig. 1. The all-optical reservoir is implemented using off-the-shelf fiber components operating in the telecommunication C-band. It uses an incoherent light source, so that interference effects can be neglected and the internal variables x_i(n) correspond to the light intensity inside the loop. The nonlinear feedback loop consists of a fiber spool (1.6 km of single-mode fiber), an SOA (Covega C-Band SOA1117), and a tunable optical attenuator (Agilent 81571A). The latter determines the feedback strength α that appears in Eq. (1). In addition, the delay loop contains an isolator to avoid counter-propagating Amplified Spontaneous Emission (ASE) within the SOA and a bandpass filter (a commercial Coarse Wavelength Division Multiplexer centered on 1550 nm) that removes a large part of the spontaneous emission noise from the SOA. The noise figure of this amplifier, measured on the photodiode, takes values between 25 dB and 45 dB, depending on the input power and the pump current.

Fig. 1. Schematic of the experimental setup of the all-optical reservoir. The reservoir consists of off-the-shelf fiber-optics components operating in the telecommunication C-band. It is based on a nonlinear all-optical loop operating in the incoherent regime. Optical components are in red, whereas electronic components are in green. The all-optical loop is driven by the input optical signal. A Superluminescent Light Emitting Diode (SLED) generates broadband white light. An electronic signal corresponding to the time-dependent input multiplied by the input mask is generated by the Arbitrary Waveform Generator (AWG). This electronic signal drives an integrated Lithium Niobate Mach-Zehnder intensity modulator (MZ), thereby producing a time-dependent input optical signal whose intensity is adjusted with a variable attenuator. The input optical signal is injected into the cavity by means of a 50/50 fiber coupler. The cavity itself consists of an isolator, a Semiconductor Optical Amplifier (SOA), a variable optical attenuator, and a fiber spool that acts as a delay line. The cavity operates below the lasing threshold. A 90/10 fiber coupler is used to send 10% of the cavity intensity to a readout photodiode and then to a digitizer. Coarse Wavelength Division Multiplexers (CWDM, boxes with waves) are used to select a wavelength band near the maximum of emission of the SLED and the maximum gain of the SOA.

The round-trip time of the loop is T′ = 7.9437 µs. With N = 50 internal variables and T′ = (N+1)θ, the input time scale is T = Nθ = 7.7880 µs, and θ = 155.76 ns.

Nonlinearity in a Reservoir Computer is essential to fulfill complex tasks. In our system the nonlinearity is provided by the saturation of the optical gain in the SOA. In Fig. 2 we show the measured power amplification characteristics of the SOA for various injection currents. These characteristics constitute the nonlinear function F_NL of our reservoir. As can be seen in Fig. 2, the injection current controls the shape of this nonlinear function and therefore constitutes an adjustable parameter in our system. For a given task, the injection current is varied to obtain the best performance. For most tasks we have found that operating in a relatively linear regime, corresponding to a low injection current, provides the best performance. Note that in our experimental conditions the amplifier is always used far from the complete saturation regime.

Fig. 2. Incoherent light amplification characteristics of the SOA. Left: output power versus input power. The SOA gain is almost linear at low input powers and bends over at input powers above the saturation level. The overall output power increases with increasing injection current. Due to spontaneous emission, the output power is nonzero even at zero input power. From bottom to top, the injection current is (a) 160 mA, (b) 200 mA, (c) 240 mA, (d) 300 mA, (e) 340 mA, (f) 400 mA, and (g) 520 mA. (The hysteresis effect is due to the high-pass filter of the modulator's driver used for the recording.) Right: normalized input-output relation. The data are the same as in the left panel, but the input and output powers are normalized to lie in the interval [0,1]. It is this function that characterizes the nonlinearity of the reservoir, which can be tuned by changing the pump current.

To drive the reservoir we use an incoherent light source: a 40 nm broadband Superluminescent Light Emitting Diode (SLED, Denselight DL-CS5254A) with peak emission at 1560 nm. The time-varying inputs are created by means of an Arbitrary Waveform Generator (AWG, National Instruments model PXI-5422) that drives a Mach-Zehnder Modulator (MZM, Photline model MXAN-LN-10). The intensity injected into the reservoir is set by means of a variable optical attenuator (Agilent, model 81571A). Two practically identical Coarse Wavelength Division Multiplexers (CWDM) with passband 1540–1558.5 nm are added at the output of the SLED and of the SOA. These filters improve the signal-to-noise ratio (SNR) by matching the bandwidth of the source to that of the cavity and removing a large portion of the ASE noise generated by the SOA.

A fraction (10%) of the circulating light intensity is extracted from the loop via a fiber coupler in order to record the evolution of the node states by means of a photodiode (TTI TIA-525, 120 MHz bandwidth) and a digitizer (NI PXI-5124), whose output is stored on a desktop computer used for post-processing. The input and output are sampled by the AWG and the digitizer at 200 MSamples per second.

3. Operation mode of the all-optical reservoir

3.1. Signal pre-processing

3.2. Signal post-processing

In our experiment the reservoir training and testing are realized on the basis of the values recorded by the digitizer, and are thus not performed in real time. The entire recorded intensity time trace is first rescaled to lie in the interval [−1, +1]. Then each internal state is measured by averaging the normalized intensity around the middle of the corresponding time window over a duration of θ/2. By doing so, we discard the leading and trailing edges of each time interval, where abrupt jumps in the light intensity take place. This amounts to eliminating most of the influence of the photodetector response time on the state values.
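
A minimal sketch of this state-extraction step, assuming the 200 MSamples/s digitizer rate and θ = 155.76 ns quoted above (the function name and the exact window placement are illustrative):

    import numpy as np

    fs = 200e6                       # digitizer sampling rate (samples/s)
    theta = 155.76e-9                # node duration (s)
    spn = int(round(fs * theta))     # samples per node interval (~31)

    def extract_states(trace, n_nodes=50):
        """Rescale a recorded intensity trace to [-1, 1], then average
        the central theta/2 of each node interval, discarding the
        leading and trailing edges."""
        trace = 2.0 * (trace - trace.min()) / (trace.max() - trace.min()) - 1.0
        n_steps = len(trace) // (n_nodes * spn)
        x = trace[: n_steps * n_nodes * spn].reshape(n_steps, n_nodes, spn)
        lo, hi = spn // 4, 3 * spn // 4     # central half of each interval
        return x[:, :, lo:hi].mean(axis=2)  # shape: (time steps, nodes)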

For each experiment the desired or "target" output sequence y(n) corresponding to a given input sequence u(n) is defined according to the task under consideration. The reservoir training then consists in applying a least mean square algorithm that calculates the set of weight values W_i that minimizes the mismatch between the desired output sequence y(n) and the actual output sequence ŷ(n) as calculated from Eq. (2). In order to make the reservoir more robust against overfitting, ridge regularization is used.
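
In matrix form, the ridge-regularized readout solves W = (XᵀX + λI)⁻¹ Xᵀy, where the rows of X are the recorded node states. A sketch (the regularization constant λ is a placeholder, not the value used in the experiment):

    import numpy as np

    def train_ridge(X, y, lam=1e-4):
        """Ridge-regularized least squares for the readout weights of
        Eq. (2): W = (X^T X + lam*I)^(-1) X^T y."""
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)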

Once the training is performed, the weight values W_i are kept constant, so that the reservoir can be tested on the remainder of the recorded normalized intensity time trace. Note that besides the training and testing sequences, the recorded output intensity time trace also contains a "warm-up" sequence that is not used in the training phase. Its role is to ensure that the reservoir has no memory of the steady state from which it systematically starts in the various experiments. Its length is typically 2 to 4 times the number of nodes in the reservoir (typically 200–400 samples). The total length of the warm-up, training, and testing sequences is usually around ten thousand samples.

4. Results

4.1. Operating point

We investigated several benchmark tasks that have been used previously in the Reservoir Computing community. In the following paragraphs we describe the different tasks and give the corresponding scores obtained with our all-optical Reservoir Computer. We also quote the operating point used to get best performance on the task. The operating point is specified by three parameters.

The injection current of the SOA determines the shape of the nonlinearity, see Fig. 2. It also determines the noise figure of the SOA due to its ASE. For most tasks we have found that a small injection current, corresponding to a rather linear regime, gives the best performance. The value of the injection current used is quoted for all tasks. In most cases it is 187 mA.

The feedback gain α is estimated by measuring the small-signal gain of the SOA and taking into account the other losses in the cavity. The largest feedback gain, corresponding to minimum loss in the cavity, is −1.6 dB.

The input gain β is quantified by taking the power of the masked signal m_i u(n) at the output of the MZM, averaging over the mask values m_i and the input values u(n), and then propagating the resulting average power (taking into account the losses in the variable attenuator and the coupler) to the input of the SOA. The input power, referred back to the input of the SOA, is quoted in dBm and µW, which allows comparison with the transfer function of the SOA reported in Fig. 2.
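
For reference, dBm values convert to linear power via P[mW] = 10^(P[dBm]/10); a quick check against the operating points quoted in Secs. 4.4–4.6:

    def dbm_to_uw(p_dbm):
        """Convert power from dBm to microwatts: P[mW] = 10**(P[dBm]/10)."""
        return 1000.0 * 10.0 ** (p_dbm / 10.0)

    print(dbm_to_uw(-12.55))   # ~55.6 uW, as quoted in Secs. 4.4-4.6
    print(dbm_to_uw(-15.55))   # ~27.9 uW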

4.2. Numerical simulations

We have developed a simple numerical model of the experiment, based on numerically integrating the recurrence

$$ x_i(n) = \begin{cases} F_{\mathrm{NL}}\big( \alpha\, x_{i-1}(n-1) + \beta\, m_i\, u(n) \big) + \nu_i(n), & 2 \le i \le N \\ F_{\mathrm{NL}}\big( \alpha\, x_{N}(n-2) + \beta\, m_1\, u(n) \big) + \nu_1(n), & i = 1 \end{cases} \qquad (4) $$

where F_NL is the measured response of the SOA given in Fig. 2, and ν_i(n) is the noise added to node variable x_i at time n (the ν_i(n) are independent, identically distributed Gaussian variables with zero mean). This numerical model takes into account the measured transfer functions of all the components. It does not take into account any bandpass effects of the electronics (either at the input or at the readout). The ASE noise added by the SOA is modeled by the noise terms ν_i(n). The aim of this numerical model was to establish whether the architecture used (single nonlinear node and delay loop), combined with the form of the nonlinearity (saturable nonlinearity), introduced fundamental limitations into our experiment. Except where relevant, we do not quote the predictions of these numerical simulations.
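
A sketch of this model in Python/NumPy, with an illustrative saturable-gain function standing in for the measured SOA response of Fig. 2 (the gain values are placeholders; the noise standard deviation of 10^−3 is the one quoted in Sec. 4.4):

    import numpy as np

    rng = np.random.default_rng(2)
    N, alpha, beta = 50, 0.8, 0.3          # placeholder gains
    m = rng.choice([-0.1, 0.1], size=N)    # input mask
    sigma = 1e-3                           # noise std, as quoted in Sec. 4.4

    def f_nl(p):
        """Illustrative saturable gain standing in for the measured
        SOA response of Fig. 2 (not the actual device data)."""
        return p / (1.0 + np.abs(p))

    def simulate(u):
        """Integrate the recurrence of Eq. (4); column i holds x_{i+1}."""
        x = np.zeros((len(u), N))
        for n in range(2, len(u)):
            nu = sigma * rng.normal(size=N)
            # i = 1: feedback comes from node N, two time steps back
            x[n, 0] = f_nl(alpha * x[n - 2, N - 1] + beta * m[0] * u[n]) + nu[0]
            # i = 2..N: feedback from the adjacent node, one step back
            for i in range(1, N):
                x[n, i] = f_nl(alpha * x[n - 1, i - 1] + beta * m[i] * u[n]) + nu[i]
        return x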

4.3. Memory capacities

The memory capacities characterize in a simple way how a reservoir processes information. The inputs u(n) are taken to be independent identically distributed random variables. In the present work the u(n) are drawn from the uniform distribution over the interval [−1,1].

In the case of the linear memory capacity, introduced by Jaeger in 2001 [2], the task is to reconstruct the input k steps in the past, that is, y_k(n) = u(n−k). Upon summing the capacities over all delays k, one obtains the linear memory capacity.

The nonlinear capacities introduced in [26] are extensions of the linear capacity to the case where the function to reconstruct is a nonlinear function of the past inputs. Here we consider only second-order polynomials of the past input. Namely, we consider the second-order Legendre polynomial of the input k steps in the past, y_k(n) = 3u²(n−k) − 1, and the product of two inputs k and k′ steps in the past, y_{kk′}(n) = u(n−k) u(n−k′). Upon summing the capacities over k (in the case of the second-order Legendre polynomials) and over k, k′ (in the case of products), one obtains the quadratic memory capacity and the cross memory capacity, respectively.
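
The capacity for a given target is commonly evaluated as the squared correlation, on test data, between the trained readout output and that target; summing over delays gives the corresponding memory capacity. A sketch under this assumption (the reservoir states X can come from any simulation, such as the one sketched in Sec. 2.1):

    import numpy as np

    def capacity(X_tr, X_te, u_tr, u_te, k):
        """Capacity for delay k: squared correlation, on test data,
        between the trained readout output and the target u(n-k)."""
        y_tr, y_te = np.roll(u_tr, k), np.roll(u_te, k)
        W, *_ = np.linalg.lstsq(X_tr[k:], y_tr[k:], rcond=None)
        r = np.corrcoef(X_te[k:] @ W, y_te[k:])[0, 1]
        return r ** 2

    # Linear memory capacity: sum over all delays k.  The quadratic and
    # cross capacities use 3u^2(n-k)-1 and u(n-k)u(n-k') as targets.
    # mc = sum(capacity(X_tr, X_te, u_tr, u_te, k) for k in range(1, 50))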

A key result proven in [26] is that the sum of the linear memory capacity, the quadratic memory capacity, and the cross memory capacity is less than or equal to the number of internal variables N (in our case N = 50). This generalizes the result of [2], where it is shown that this inequality holds in the case of the linear memory capacity. Note that in order to saturate the bound one should in general add all higher-order polynomials; see [26] for the conditions under which this bound is saturated. In other words, while the linear memory capacity quantifies how well the reservoir remembers past inputs, the quadratic and cross memory capacities quantify how well the reservoir can compute second-order polynomials of the past inputs.

Figure 4 shows how these quantities change as functions of the feedback gain α and the input gain β.

Fig. 4. Linear, quadratic, and cross memory capacities for the all-optical reservoir, as a function of the feedback gain α and the input gain β, for an injection current of 187 mA. Note that the color scale is different for each capacity. The + signs locate the optimum working points for the channel equalization task at the different signal-to-noise ratios. The × signs locate the optimum working points for the radar task at low sea state over the 10 prediction delays (some delays have identical optimal working points). This shows that the capacities are correlated with, but do not completely explain, the optimal working points.

Compared with the capacities of the optoelectronic architecture, the memory capacities of the all-optical reservoir are lower. In particular, we were unable to reach a large cross memory capacity or a total memory capacity close to its optimal value, i.e., the number of nodes in the reservoir. This is attributed to the ASE noise of the SOA that is inherent to our all-optical configuration and decreases the SNR.

4.4. Channel equalization task

This task consists in recovering a sequence of symbols d(n), drawn from {−3, −1, 1, 3}, after transmission through a model of a nonlinear wireless communication channel that combines intersymbol interference with a memoryless nonlinearity:

$$ q(n) = 0.08\,d(n+2) - 0.12\,d(n+1) + d(n) + 0.18\,d(n-1) - 0.1\,d(n-2) + 0.091\,d(n-3) - 0.05\,d(n-4) + 0.04\,d(n-5) + 0.03\,d(n-6) + 0.01\,d(n-7) $$
$$ u(n) = q(n) + 0.036\,q^2(n) - 0.011\,q^3(n) + \text{noise} \qquad (5) $$

White noise is added to obtain a Signal-to-Noise Ratio (SNR) ranging from 12 dB to 32 dB. The detected signal u(n) obtained at the end of the communication channel constitutes the reservoir input, while the target output is the original sequence d(n).
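
A sketch generating input/target pairs for this channel (the symbol alphabet {−3, −1, 1, 3} is the standard one for this task; the exact noise convention used to set the SNR is an assumption):

    import numpy as np

    def make_channel_data(n_symbols, snr_db, rng=np.random.default_rng(3)):
        """Generate reservoir input u(n) and target d(n) from Eq. (5)."""
        d = rng.choice([-3.0, -1.0, 1.0, 3.0], size=n_symbols + 9)
        h = [0.08, -0.12, 1.0, 0.18, -0.1, 0.091, -0.05, 0.04, 0.03, 0.01]
        # q(n) mixes d(n+2) down to d(n-7); the shift j runs from -2 to 7
        q = sum(c * d[7 - j : 7 - j + n_symbols]
                for j, c in zip(range(-2, 8), h))
        u = q + 0.036 * q**2 - 0.011 * q**3
        noise = rng.normal(size=n_symbols)
        u = u + noise * np.sqrt(np.mean(u**2) / 10 ** (snr_db / 10.0))
        return u, d[7 : 7 + n_symbols]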

This task is evaluated through the calculation of the Symbol Error Rate (SER), representing the fraction of misclassified symbols. The results we obtained for this task are presented in Fig. 5, together with the results obtained for the same task with our optoelectronic reservoir [22] and those obtained by running the numerical simulations described in Sec. 4.2.

Fig. 5. Performance of the all-optical reservoir on the channel equalization task with 50 nodes at 187 mA pump current (Optical RC), with a simulation of the same system (Simulation), and with an optoelectronic reservoir of 50 nodes (OptoElectronic RC). Error bars are statistical.

This figure reveals that the all-optical reservoir performs as well as the optoelectronic implementation for signal-to-noise ratios ranging from 12 to 16 dB, while for larger SNR its performance is slightly degraded. Note, however, that even in this high-SNR range the measured SER is still one order of magnitude better than that of the bilinear filtering technique presented in [28]. It should also be noted that there is good agreement between the results of the optical experiment and those obtained with the numerical simulations, using internal noise ν_i(n) with a standard deviation of 1·10^−3.

The experiment is repeated 10 times over 10 different data sets; each test sequence is 6000 time steps long. The results show the mean and standard deviation of the SER over the 10 data sets.

To obtain these results the pump current was set to 187 mA. The feedback gain and input gain were optimized for each value of the SNR. The optimal values are in the range −2.35 dB to −3.6 dB for α and 27.9 µW (−15.55 dBm) to 55.6 µW (−12.55 dBm) for β.

4.5. The radar task

This task consists in predicting the radar signal backscattered from the ocean surface, using data from the McMaster University IPIX radar database [29], recorded under low and high sea state conditions. The results presented in Fig. 6 are the best NMSEs for prediction delays ranging from one to ten steps.

Fig. 6. Predicting the radar signal with a reservoir of 50 nodes at a pump current of 187 mA (Optical RC) and with the optoelectronic reservoir of 50 nodes reported in [22] (OptoElectronic RC).

Our performance is similar to that of [19], where the authors report, for a reservoir of 80 nodes, an NMSE equal to 1.15·10^−3 for one-step prediction and to 3.01·10^−2 for five-step prediction at low sea state.

The pump current was set to 187 mA. The feedback gain α and input gain β were optimized for each value of the prediction delay and for each sea state. The optimal values are in the range −2.6 dB to −6.6 dB for α and 55.6 µW (−12.55 dBm) to 221.3 µW (−6.55 dBm) for β.

4.6. Isolated spoken digit recognition

The isolated spoken digit recognition is a task introduced in [30] in the context of Reservoir Computing. It consists of the classification of audio sequences, each one containing one digit spoken by one female speaker. The data set, a subset of the National Institute of Standards and Technology (NIST) TI-46 corpus [31], has 500 sequences: 5 female speakers pronouncing 10 times every digit from 0 to 9.

For each sequence, the audio signal is sampled at 12.5 kHz, preprocessed using the Lyon cochlear ear model [32], and then sampled at regular intervals. The input signal u(n) consists of an 86-dimensional state vector with up to 130 samplings for a single sequence. For this task we used a reservoir of N = 200 nodes. The input mask is therefore a 200×86 matrix m_ij, whose elements are either −0.1 or +0.1 with equal probability. The masked input driving the reservoir is given by the N-dimensional vector Σ_j m_ij u_j(n).

The output layer consists of 10 linear classifiers ŷ_k(n), k = 0, ..., 9, each associated with one digit. For each classifier the desired output is equal to 1 if the corresponding digit is being sent to the reservoir, and to −1 otherwise. For each sequence, the outputs of all the classifiers are averaged over the sequence length; using a winner-take-all approach, the classifier with the highest average is then set to 1, and all the others are set to −1. The metric used to evaluate the recognition is the Word Error Rate (WER), i.e., the rate of incorrectly classified digits.
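
A sketch of this winner-take-all decision and of the WER metric (array shapes and function names are illustrative):

    import numpy as np

    def classify_digit(Y):
        """Winner-take-all: Y has shape (n_timesteps, 10), one column per
        digit classifier; return the digit whose time-averaged output wins."""
        return int(np.argmax(Y.mean(axis=0)))

    def word_error_rate(predicted, truth):
        """WER: fraction of misclassified sequences."""
        return float(np.mean(np.asarray(predicted) != np.asarray(truth)))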

For this task, since the data set contains only 500 sequences, the evaluation follows a standard cross-validation process. Five subsets of 100 sequences each are randomly chosen; the reservoir is trained over 4 subsets and tested over the last one. The length of 100 sequences for the test subset ensures that each test subset contains all the digits. This procedure is repeated 5 times rotating the subsets, so that each subset is used once for the test. The results reported here are the average WERs and corresponding standard deviations over the 5 test subsets.

The pump current was set to 270 mA. The feedback gain α was set to −7.49 dB and the input gain β to 55.6 µW (−12.55 dBm).

The best WER that we can obtain with our all-optical reservoir is 3% (s.d. 1.2%). This is significantly worse than the 0.4% (s.d. 0.55%) WER reported in [22] for our optoelectronic reservoir. Discrete-time simulations of the same task on our reservoir give perfect classification (0 WER). The difference between the simulations and the experiment is attributed to the ASE noise, which is not included in the simulations.

5. Conclusion

We have presented what is, to the best of our knowledge, the first all-optical implementation of a Reservoir Computer. The system design is based on a simple optical delayed-feedback loop combined with the nonlinearity of an optical amplifier, and from a practical viewpoint it only requires standard off-the-shelf optical components. We tested our all-optical reservoir on several benchmark tasks previously used in the literature to evaluate reservoir performance. We showed that, despite its simplicity, our system exhibits state-of-the-art performance on several tasks. This outcome is promising, as it reveals the potential of optics in the field of artificial intelligence.

However, compared to the optoelectronic reservoir reported in [22], the performance on some tasks, such as the memory capacities, nonlinear channel equalization, and isolated spoken digit recognition, is somewhat degraded. Our numerical simulations show that this is not due to the form of the nonlinearity, which differs between the two experiments, but can be attributed to the noise in the present system, which arises from the ASE of the SOA. This highlights a difficulty inherent to all analog computation that will have to be addressed in future experimental realizations of Reservoir Computing.

The configuration studied here is sequential. The reservoir speed is at present limited by the bandwidth of the AWG and the digitizer, but the system can already process inputs at a rate of 7.8 µs per symbol, regardless of the number of nodes used, even if more time is then needed for the post-processing. It has, moreover, the merit of showing that standard optical components with a nonlinear response can be exploited efficiently to perform complex computational tasks. Looking further into the future, it may be possible to exploit this flexibility together with the parallelism of optics in order to perform complex tasks at ultrahigh speed. One possibility, for example, is to encode the states of different nodes at different frequencies, thus implementing reservoirs of higher dimensionality; another is to move to free-space optics and encode the states of different nodes in the spatial modulation of a light beam. Either way, we hope that the present work will contribute to a renewed development of all-optical computing.

Acknowledgments

The authors thank Joni Dambre, Benjamin Schrauwen, and Peter Bienstman for helpful discussions. They acknowledge financial support from the Interuniversity Attraction Poles program of the Belgian Science Policy Office, under grant IAP P7-35 "photonics@be", and from the Fonds de la Recherche Scientifique FRS-FNRS.

References and links

1. H. Jaeger, "The 'echo state' approach to analysing and training recurrent neural networks," Technical Report GMD Report 148 (German National Research Center for Information Technology, 2001).
2. H. Jaeger, "Short term memory in echo state networks," GMD Report 152 (German National Research Institute for Computer Science, 2001).
3. W. Maass, T. Natschläger, and H. Markram, "Real-time computing without stable states: a new framework for neural computation based on perturbations," Neural Comput. 14(11), 2531–2560 (2002).
4. H. Jaeger and H. Haas, "Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication," Science 304(5667), 78–80 (2004).
5. J. J. Steil, "Backpropagation-decorrelation: online recurrent learning with O(N) complexity," in Proceedings of IEEE International Joint Conference on Neural Networks (IEEE, 2004), pp. 843–848.
6. R. Legenstein and W. Maass, "What makes a dynamical system computationally powerful?" in New Directions in Statistical Signal Processing: From Systems to Brain (MIT Press, 2005), pp. 127–154.
7. D. Verstraeten, B. Schrauwen, M. D'Haene, and D. Stroobandt, "An experimental unification of reservoir computing methods," Neural Netw. 20(3), 391–403 (2007).
8. W. Maass, P. Joshi, and E. D. Sontag, "Computational aspects of feedback in neural circuits," PLOS Comput. Biol. 3(1), e165 (2007).
9. H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Netw. 20(3), 335–352 (2007).
10. D. V. Buonomano and W. Maass, "State-dependent computations: spatiotemporal processing in cortical networks," Nat. Rev. Neurosci. 10(2), 113–125 (2009).
11. M. Lukoševičius and H. Jaeger, "Reservoir computing approaches to recurrent neural network training," Comput. Sci. Rev. 3(3), 127–149 (2009).
12. B. Hammer, B. Schrauwen, and J. J. Steil, "Recent advances in efficient learning of recurrent networks," in Proceedings of the European Symposium on Artificial Neural Networks (2009), pp. 213–216.
13. F. Triefenbach, A. Jalalvand, B. Schrauwen, and J. Martens, "Phoneme recognition with large hierarchical reservoirs," Adv. Neural Inf. Process. Syst. 23, 1–9 (2010).
14. F. Wyffels and B. Schrauwen, "A comparative study of reservoir computing strategies for monthly time series prediction," Neurocomputing 73(10–12), 1958–1964 (2010).
15. M. Lukoševičius, H. Jaeger, and B. Schrauwen, "Reservoir computing trends," KI - Künstliche Intelligenz (2012), pp. 1–7.
16. C. Fernando and S. Sojakka, "Pattern recognition in a bucket," in Proceedings of the 7th European Conference on Artificial Life, W. Banzhaf, J. Ziegler, T. Christaller, P. Dittrich, and J. Kim, eds. (2003), Vol. 2801, pp. 588–597.
17. F. Schürmann, K. Meier, and J. Schemmel, "Edge of chaos computation in mixed-mode VLSI - a hard liquid," in Proceedings of Advances in Neural Information Processing Systems, L. K. Saul, Y. Weiss, and L. Bottou, eds. (MIT Press, 2005).
18. Y. Paquot, B. Schrauwen, J. Dambre, M. Haelterman, and S. Massar, "Reservoir computing: a photonic neural network for information processing," Proc. SPIE 7728, 77280B (2010).
19. A. Rodan and P. Tino, "Simple deterministically constructed recurrent neural networks," in Intelligent Data Engineering and Automated Learning (IDEAL, 2010), pp. 267–274.
20. A. Rodan and P. Tino, "Minimum complexity echo state network," IEEE Trans. Neural Netw. 22(1), 131–144 (2011).
21. L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, "Information processing using a single dynamical node as complex system," Nat. Commun. 2, 468 (2011). http://www.nature.com/ncomms/journal/v2/n9/full/ncomms1476.html
22. Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, "Optoelectronic reservoir computing," Sci. Rep. 2, 287 (2012). http://www.nature.com/srep/2012/120227/srep00287/full/srep00287.html
23. L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, "Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing," Opt. Express 20(3), 3241–3249 (2012). http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-3-3241
24. K. Vandoorne, W. Dierckx, B. Schrauwen, D. Verstraeten, R. Baets, P. Bienstman, and J. Van Campenhout, "Toward optical signal processing using photonic reservoir computing," Opt. Express 16(15), 11182–11192 (2008).
25. K. Vandoorne, J. Dambre, D. Verstraeten, B. Schrauwen, and P. Bienstman, "Parallel reservoir computing using optical amplifiers," IEEE Trans. Neural Netw. 22(9), 1469–1481 (2011).
26. J. Dambre, D. Verstraeten, B. Schrauwen, and S. Massar, "Information processing capacity of dynamical systems," Sci. Rep. 2, 514 (2012).
27. H. Jaeger, "Adaptive nonlinear system identification with echo state networks," Adv. Neural Inf. Process. Syst. 15, 593–600 (2002).
28. V. J. Mathews, "Adaptive algorithms for bilinear filtering," Proc. SPIE 2296, 317–327 (1994).
29. The McMaster IPIX radar database, http://soma.ece.mcmaster.ca/ipix/dartmouth/datasets.html
30. D. Verstraeten, B. Schrauwen, and D. Stroobandt, "Isolated word recognition using a liquid state machine," in Proceedings of the 13th European Symposium on Artificial Neural Networks (ESANN) (2005), pp. 435–440.
31. Texas Instruments-Developed 46-Word Speaker-Dependent Isolated Word Corpus (TI46), September 1991, NIST Speech Disc 7-1.1 (1 disc) (1991).
32. R. Lyon, "A computational model of filtering, detection, and compression in the cochlea," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (1982), pp. 1282–1285.

OCIS Codes
(060.4370) Fiber optics and optical communications : Nonlinear optics, fibers
(200.4260) Optics in computing : Neural networks
(200.4560) Optics in computing : Optical data processing
(200.4700) Optics in computing : Optical neural systems
(200.4740) Optics in computing : Optical processing

ToC Category:
Optics in Computing

History
Original Manuscript: July 5, 2012
Revised Manuscript: September 16, 2012
Manuscript Accepted: September 16, 2012
Published: September 20, 2012

Citation
François Duport, Bendix Schneider, Anteo Smerieri, Marc Haelterman, and Serge Massar, "All-optical reservoir computing," Opt. Express 20, 22783-22795 (2012)
http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-20-22783

