

Title:
A METHOD FOR COMPUTER-ASSISTED LEARNING OF ONE OR MORE NEURAL NETWORKS
Document Type and Number:
WIPO Patent Application WO/2008/132066
Kind Code:
A1
Abstract:
The invention relates to a method for computer-assisted learning of one or more neural networks based on a time series of data comprising first values at subsequent time steps. Furthermore, the invention relates to a method for computer-assisted prediction of values of a time series based on one or more neural networks. According to the invention, a rate of change of an initial time series of data is calculated, thus resulting in derivative values. Those derivative values are subjected to Empirical Mode Decomposition, which is a well-known technique. The modes extracted by this Empirical Mode Decomposition as well as some time-lagged values of the time series are used as inputs for the neural networks, whereas the outputs of those networks are future values of the time series which are to be predicted. Those networks are trained and used for prediction of future values of the time series. The invention provides good prediction results because derivative values are used as inputs and also past values are considered as inputs. The invention may be used for predicting values of any time series and is particularly suitable for predicting the dynamics of a technical system, e.g. for predicting vibrations occurring in a technical system.

Inventors:
MININ ALEXEY (RU)
MOKHOV ILYA (RU)
Application Number:
PCT/EP2008/054701
Publication Date:
November 06, 2008
Filing Date:
April 18, 2008
Assignee:
SIEMENS AG (DE)
MININ ALEXEY (RU)
MOKHOV ILYA (RU)
International Classes:
G06N3/08; G06N3/04
Domestic Patent References:
WO2002065157A22002-08-22
Foreign References:
KR20030031602A2003-04-23
US20030033094A12003-02-13
Other References:
IYENGAR R N ET AL: "Intrinsic mode functions and a strategy for forecasting Indian monsoon rainfall", METEOROLOGY AND ATMOSPHERIC PHYSICS, SPRINGER-VERLAG, VI, vol. 90, no. 1-2, 1 September 2005 (2005-09-01), pages 17 - 36, XP019378219, ISSN: 1436-5065
DORFFNER G: "Neural networks for time series processing", NEUROFUZZY. IEEE EUROPEAN WORKSHOP, XX, XX, vol. 6, no. 4, 1 January 1996 (1996-01-01), pages 447 - 468, XP002290512
RUQIANG YAN ET AL: "Hilbert-Huang Transform-Based Vibration Signal Analysis for Machine Health Monitoring", IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 55, no. 6, 1 December 2006 (2006-12-01), pages 2320 - 2329, XP011143733, ISSN: 0018-9456
ILYA MOKHOV AND ALEXEY MININ: "Advanced Forecasting and Classification Technique for Condition Monitoring of Rotating Machinery", LECTURE NOTES IN COMPUTER SCIENCE, vol. 4881, 6 December 2007 (2007-12-06), Springer Berlin / Heidelberg, pages 37 - 46, XP002491471, ISSN: 1611-3349, ISBN: 978-3-540-77225-5, Retrieved from the Internet [retrieved on 20080807]
Attorney, Agent or Firm:
SIEMENS AKTIENGESELLSCHAFT (München, DE)
Claims:

Patent Claims

1. A method for computer-assisted learning of one or more neural networks based on a time series of data comprising first values at subsequent time steps, wherein: the rate of change of said first values is calculated, thus generating a time series of second values; the time series of said second values is subjected to an Empirical Mode Decomposition, resulting in several modes, each mode being a time series of mode values; one or more neural networks are trained, each neural network comprising one or more artificial neurons being coupled to: one or more first inputs for each mode for inputting a mode value at a given time step and one or more second inputs for inputting second values at said given time step and/or at one or more time steps before said given time step; at least one output for outputting a second value at a time step after said given time step.

2. The method according to claim 1, wherein said one or more neural networks comprise one or more recurrent neural networks.

3. The method according to claim 2, wherein said one or more recurrent neural networks comprise recurrent Elman networks.

4. The method according to one of the preceding claims, wherein said one or more neural networks comprise one or more perceptrons.

5. The method according to one of the preceding claims, wherein said one or more neural networks are learned by a weights optimization method, particularly the Broyden-Fletcher-Goldfarb-Shanno method.

6. The method according to one of the preceding claims, wherein said one or more neural networks comprise a plurality of neural networks having a common output for outputting an average output of said at least one output of said plurality of neural networks.

7. The method according to one of the preceding claims, wherein the time series of data comprise data measured in a technical system.

8. The method according to claim 7, wherein the data measured in said technical system comprise vibrations.

9. The method according to one of the preceding claims, wherein in said Empirical Mode Decomposition an envelope of maxima of second values and an envelope of minima of second values is approximated by cubic splines.

10. A method for computer-assisted prediction of values of a time series based on one or more neural networks learned by a method according to one of the preceding claims, the time series being a time series of first values, wherein: the rate of change of said first values is calculated, thus generating a time series of second values; the time series of said second values is subjected to an Empirical Mode Decomposition, resulting in several modes, each mode being a time series of mode values; mode values of said modes at a given time step are input into said one or more first inputs of said one or more neural networks and one or more second values at said given time step and/or at one or more time steps before said given time step are input into said one or more second inputs of said one or more neural networks, resulting in at least one second value at a time step after said given time step at said at least one output of said one or more neural networks.

11. A computer program product directly loadable into the internal memory of a digital computer, comprising software code portions for performing a method according to one of the preceding claims when said product is run on a computer.

Description

A method for computer-assisted learning of one or more neural networks

The invention relates to a method for computer-assisted learning of one or more neural networks based on a time series of data comprising first values at subsequent time steps as well as to a method for computer-assisted prediction of values of a time series based on one or more neural networks learned by this learning method. Furthermore, the invention relates to a corresponding computer program product.

The invention refers to the technical field of information technology, particularly to the field of information processing of data by using neural networks.

The invention may be used for predicting any time series of data for which the underlying dynamics should be investigated. Particularly, the invention is used for processing data measured in technical systems in order to predict the time behaviour of the technical system.

Artificial neural networks are widely used for forecasting data. To do so, neural networks are learned by training data based on a given time series. A well-known method for training neural networks and predicting future values of time series is based on lagged vectors wherein time-lagged values of the time series are used as inputs for the neural networks. In order to obtain good prediction results, the underlying neural networks are often very complex and the computing time for learning and predicting values is very long.

Therefore, it is an object of the invention to provide a method for learning one or more neural networks providing good prediction results in reasonable computing time.

This object is solved by the independent claims. Preferred embodiments of the invention are defined in the dependent claims.

In a first step of the learning method according to the invention, the rate of change of first values of a time series of data is calculated, thus generating a time series of second values. Hence, those second values substantially correspond to time derivatives of said first values. In a next step, the time series of said second values is subjected to Empirical Mode Decomposition, resulting in several modes, each mode being a time series of mode values.
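
For illustration, the rate-of-change step may be sketched as a simple first difference of the time series; the patent does not prescribe a particular discretisation, so the unit-step difference used below is merely one plausible choice.

```python
import numpy as np

def rate_of_change(first_values):
    """Approximate the rate of change of a series sampled at unit
    time steps by its first difference (one common discretisation)."""
    x = np.asarray(first_values, dtype=float)
    # The "second values" have one point fewer than the input series.
    return x[1:] - x[:-1]
```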

Thereafter, one or more neural networks are trained. Each neural network comprises one or more artificial neurons. Those neurons are coupled to one or more first inputs for each mode for inputting a mode value at a given time step and one or more second inputs for inputting second values at said given time step and/or at one or more time steps before said given time step. Furthermore, said one or more artificial neurons are coupled to at least one output for outputting a second value at a time step after said given time step.

According to the invention, instead of the original first values, second values indicating the rate of change of the first values are processed. This enables an efficient use of the Empirical Mode Decomposition method which is a well-known method and will also be described in the detailed description. Using the rate of change of the first values has the advantage that modes having substantially the same order are generated. Furthermore, the dynamics of the past is taken into account when training the neural networks by using, besides the modes being extracted by Empirical Mode Decomposition, some time-lagged second values derived from the initial time series.

As will be apparent from the detailed description, predictions based on networks learned by the method according to the invention have a good quality in comparison with traditional methods.

In a preferred embodiment of the invention, the neural networks used for training comprise one or more recurrent neural networks which are very well suited for prediction problems because those networks include feedback, i.e. produced outputs are fed back to the neurons of the network as inputs.

Several different structures of recurrent neural networks are known, and a preferred embodiment of the invention uses the well-known Elman networks as recurrent networks to be trained. Nevertheless, non-recurrent networks may also be used in the method according to the invention, particularly so-called perceptrons.

The neural networks may be learned by any known method. In a preferred embodiment of the invention, a weights optimization method optimizing the synaptic weights of the neurons is used for learning, preferably the Broyden-Fletcher-Goldfarb-Shanno method well-known in the art.
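
As a rough sketch of such a weights optimization, the synaptic weights of a small feed-forward network can be collected into a single parameter vector and handed to a generic BFGS minimizer. Here scipy's BFGS implementation stands in for the method named above, and the network size and training data are illustrative assumptions only.

```python
import numpy as np
from scipy.optimize import minimize

def train_bfgs(inputs, targets, n_hidden=4, seed=0):
    """Fit a one-hidden-layer tanh network by minimising the squared
    error over all synaptic weights with the BFGS method."""
    rng = np.random.default_rng(seed)
    n_in = inputs.shape[1]
    # Flatten hidden-layer weights and output weights into one vector.
    w0 = rng.normal(scale=0.1, size=n_hidden * n_in + n_hidden)

    def forward(w):
        w1 = w[:n_hidden * n_in].reshape(n_in, n_hidden)
        w2 = w[n_hidden * n_in:]
        return np.tanh(inputs @ w1) @ w2

    def loss(w):
        return np.mean((forward(w) - targets) ** 2)

    result = minimize(loss, w0, method='BFGS')
    return result.x, loss(result.x)
```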

In another embodiment of the invention, said one or more neural networks comprise a plurality of neural networks having a common output for outputting the average output of said at least one output of said plurality of neural networks, thus resulting in better prediction results.

As mentioned before, the method according to the invention may be applied to any time series of data. However, the method of the invention is preferably applied to data measured in a technical system. Such data may for example relate to vibrations.

In the Empirical Mode Decomposition used in the method of the invention, an envelope of maxima of second values and an envelope of minima of second values is approximated by cubic splines, leading to a good mode extraction.

Besides the above method for learning one or more neural networks, the invention also relates to a method for computer-assisted prediction of values related to a time series based on one or more neural networks learned by the above described method. The prediction method comprises the following steps: the rate of change of said first values is calculated, thus generating a time series of second values; the time series of said second values is subjected to an Empirical Mode Decomposition, resulting in several modes, each mode being a time series of mode values; mode values of said modes at a given time step are input into said one or more first inputs of said one or more neural networks and one or more second values at said given time step and/or at one or more time steps before said given time step are input into said one or more second inputs of said one or more neural networks, resulting in at least one second value at a time step after said given time step at said at least one output of said one or more neural networks.

Hence, the method for prediction uses the learned neural networks by inputting corresponding data of a time series in order to predict future values of this time series.

Besides the above described methods, the invention relates to a computer program product directly loadable into the internal memory of a digital computer, comprising software code portions for performing a method according to the invention when said product is run on a computer.

Embodiments of the invention will now be described in detail with respect to the accompanying drawings, wherein:

Fig. 1 shows two embodiments of neural networks which may be used in the method according to the invention;

Fig. 2 shows two diagrams illustrating the Empirical Mode Decomposition technique used according to the invention;

Fig. 3 shows three diagrams illustrating a composition of a signal for subjecting to Empirical Mode Decomposition;

Fig. 4 shows several diagrams illustrating the result of an Empirical Mode Decomposition performed on the composite signal illustrated in Fig. 3;

Fig. 5 shows a time series of values used for testing the method according to the invention;

Fig. 6 is a diagram showing the modes extracted by Empirical Mode Decomposition from the time series of Fig. 5;

Fig. 7 is a diagram showing derivative values of the time series of Fig. 5;

Fig. 8 is a diagram showing the mode values extracted by Empirical Mode Decomposition from the derivatives values of Fig. 7;

Fig. 9 is a diagram indicating the error between the sum of the modes shown in Fig. 8 and the initial signal according to the derivatives as shown in the dia- gram of Fig. 7;

Fig. 10 is a diagram showing the inputs used in a neural network according to one embodiment of the invention;

Fig. 11 shows a diagram comparing the prediction results according to one embodiment of the invention with the initial signal; and

Fig. 12 is a table comparing the quality of prediction according to an embodiment of the invention with the quality of prediction according to a prior art method.

According to the invention, an Empirical Mode Decomposition technique is combined with artificial neural networks. In general, neural networks are suitable for various problems. These problems can be divided into two classes, namely approximation and classification. As described in the following, neural networks will be used for predicting a time series of data. Particularly, financial data taken from the Dow Jones Index were used in order to test an embodiment of the invention. The problem arising when using such data is the filtration of the signal provided by the data. If such data are fed into a neural network directly, the signal is of a "crude" kind, i.e. the statistical quality of a neural forecast will be rather low. Although the nature of the "noise" in financial data is unknown, an adaptive filter has to be found. Therefore, the Empirical Mode Decomposition, decomposing a signal into empirical mode functions, was used as described in detail later on. This method may be interpreted as a filter which possesses attractive properties when used in combination with artificial neural networks, namely additivity and orthogonality of the generated modes. Although the method of the invention was tested with financial data, the method is in general applicable to any kind of time series of data, particularly data relating to technical systems, such as the diagnostics of vibrations in a technical system.

Fig. 1 shows schematically two artificial neural networks which may be used in the method of the invention. On the left hand side of Fig. 1, a simple perceptron 11 is shown comprising several inputs x1, x2, ..., xn coupled through synaptic weights with neurons 1, 2 and 3. The couplings are shown by lines between the inputs and the neurons and for illustrative purposes some lines are designated with the respective weights w11, w12, w13, wn1, wn2 and wn3. Furthermore, each neuron 1, 2 and 3 is associated with an output y1, y2 and y3, respectively. The network 11 is a very simple network only having one layer of neurons. According to the invention, much more complicated neural networks having several layers of neurons may be used. Those networks are called feed forward networks or multilayer perceptrons.

On the right hand side of Fig. 1, another type of network 12 is shown which may also be used in combination with the invention. This network 12 is a recurrent neural network. The structure of the network is very similar to the structure of the network 11. Particularly, network 12 has inputs x1, x2, ..., xn, three neurons 1, 2, 3 coupled via weights to the inputs, and outputs y1, y2 and y3, each associated with one of the neurons 1, 2 and 3. The difference between networks 11 and 12 lies in the presence of a feedback loop designated as 101 for neuron 1, 102 for neuron 2 and 103 for neuron 3. Hence, the output is fed back to the input of each neuron and this feedback leads to the presence of memory in the recurrent network. Similar to the network 11, the recurrent network 12 is a very simple network and much more complicated networks having several layers may be used in the invention described herein. In the embodiment of the invention described in the following, a special version of a recurrent network, namely the Elman network, has been used. This network is known and will not be described in detail.

According to the invention, the well-known technique of Empirical Mode Decomposition (also called EMD) was used in order to generate an input for the neural network. For better understanding, a description of this technique will be presented in the following. Empirical Mode Decomposition is an adaptive representation of non-stationary signals as a sum of AM-FM components (AM = Amplitude Modulated; FM = Frequency Modulated) with zero mean. According to this technique, the oscillation in a signal on a local level is considered. To do so, the high frequency component in a given signal x(t) between two extrema, e.g. two minima t- and t+, is evaluated. This high frequency component, designated as d(t), represents the local detail which is responsible for the oscillation connecting two minima on a path through a maximum which always exists between two minima. Besides the high frequency component d(t), the initial signal x(t) also comprises a low frequency component m(t) (also called trend) with respect to the high frequency component. Summarized, the initial signal x(t) may be represented as a sum of high frequency and low frequency components, namely as:

x(t) = m(t) + d(t)

In order to extract a mode by Empirical Mode Decomposition, the extraction of the high frequency component is applied iteratively on the detail d(t) until the signal m(t) has zero average or an average near zero according to some stopping criterion. Thus, the algorithm to extract a mode is as follows:

The local extrema in the initial signal x(t) are determined and envelopes between minima and between maxima are calculated. This step is shown in diagram 201 of Fig. 2. This diagram shows a time series of a periodic signal 202. In diagram 201 as well as in diagram 207, the abscissa represents time and the ordinate represents the amplitude of the signal. The maxima of the signal are determined and some of those maxima are designated for illustrative purposes as 203 in diagram 201. Analogously, the minima of the signal 202 are determined and some of those minima are designated for illustrative purposes as 204. The extrema must be extracted very accurately, and this leads to overdiscretisation. Border effects must be taken into account in order to minimize the error at the ends of the time series which appears due to the finite number of data points in the time series. In order to reduce border effects, extrema are added which do not exist. This technique of adding extrema is well-known in the prior art and will not be described in detail.

After determining the maxima and minima, an envelope 205 connecting adjacent maxima and an envelope 206 connecting adjacent minima are calculated. This calculation may be done by any approximation technique; preferably, cubic splines are used for the approximation. As a consequence, a function e_min(t) for the envelope 206 of the minima and a function e_max(t) for the envelope 205 of the maxima is given. The average between e_min(t) and e_max(t) is calculated thereafter, i.e. the following function m(t) is computed:

m(t) = (e_min(t) + e_max(t)) / 2

m(t) is designated as function 209 in diagram 201. As a next step, the detail d(t) = x(t) - m(t) is calculated. This residue is shown in diagram 207 with respect to the signal 202 and designated as 208. The above described method is repeated in a next iteration step with the residue 208, i.e. once again the envelopes are determined, the average of these envelopes is calculated and the detail is extracted by determining the difference between the function 208 and the average. As mentioned above, the iterations are repeated until the average of m(t) is zero or in the proximity of zero according to some stopping criterion. As a result, a first "Intrinsic Mode Function", also called IMF, in the form of the residue existing at the end of the iterations is extracted.

This IMF function is subtracted from the initial signal x(t) and the above described iterations are analogously performed for this new signal. At the end of the iterations, a second mode is extracted. These iterations may be repeated several times on the difference between the initial signal at the beginning of the previous mode extraction and the previously extracted IMF function, in order to extract further modes. At the end of the EMD technique, a finite number of modes is extracted from the initial signal x(t). The above described EMD method is fully automatic and adaptive. It should be noted that in case of harmonic oscillations including a high frequency and a low frequency, the EMD method operates on a local scale and is not equivalent to band filtering.
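
The mode-extraction loop described above can be sketched as follows. This is a deliberately minimal illustration: the treatment of border effects and the stopping criterion are simplistic placeholder choices, not those required by the invention.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def _envelope_mean(x):
    """Mean of the cubic-spline envelopes through the local maxima
    and minima; the end points crudely pin down both envelopes."""
    t = np.arange(len(x))
    maxima = np.where((x[1:-1] > x[:-2]) & (x[1:-1] > x[2:]))[0] + 1
    minima = np.where((x[1:-1] < x[:-2]) & (x[1:-1] < x[2:]))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema: the signal is a residue
    upper = CubicSpline(np.r_[0, maxima, len(x) - 1], np.r_[x[0], x[maxima], x[-1]])
    lower = CubicSpline(np.r_[0, minima, len(x) - 1], np.r_[x[0], x[minima], x[-1]])
    return (upper(t) + lower(t)) / 2.0

def emd(x, max_modes=10, max_sift=50, tol=1e-8):
    """Minimal Empirical Mode Decomposition: returns (modes, residue)
    such that sum(modes) + residue reproduces the input signal."""
    x = np.asarray(x, dtype=float)
    modes, residue = [], x.copy()
    for _ in range(max_modes):
        d = residue.copy()
        for _ in range(max_sift):  # sifting iterations for one IMF
            m = _envelope_mean(d)
            if m is None:
                return modes, residue
            d = d - m
            if np.mean(np.abs(m)) < tol:  # average of m(t) near zero
                break
        modes.append(d)
        residue = residue - d  # subtract the extracted IMF and repeat
    return modes, residue
```

The additivity property mentioned above holds by construction: the extracted modes plus the final residue sum back to the input signal.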

Fig. 3 shows three diagrams 301, 302 and 303 wherein the abscissa represents the time and the ordinate the amplitude of a signal. Diagram 301 shows a tone signal 304 with low frequency, diagram 302 a chirp signal 305 representing noise, and diagram 303 shows the composite signal 306 resulting from adding the chirp signal 305 to the tone signal 304. The result of an Empirical Mode Decomposition applied to the signal 306 is shown in Fig. 4. Particularly, Fig. 4 shows seven diagrams wherein in each diagram the abscissa refers to time and the ordinate refers to the amplitude. The diagrams shown in Fig. 4 are designated as 401, 402, 403, 404, 405, 406, and 407. Diagrams 401 to 406 refer to the modes extracted by Empirical Mode Decomposition and diagram 407 shows the residue of the signal remaining at the end of the Empirical Mode Decomposition. As can be seen from Fig. 4, the signal includes two important modes 401 and 402, and the rest of the modes 403 to 406 can be considered as equal to zero. It is evident that the original tone signal 304 could be extracted as mode 402 and the original chirp signal 305 could be extracted as mode 401 from the composite signal 306.
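
A composite signal in the spirit of Fig. 3 can be constructed as follows; the exact frequencies and amplitudes used in the figure are not stated in the text, so the values below are illustrative assumptions.

```python
import numpy as np
from scipy.signal import chirp

t = np.linspace(0.0, 1.0, 2000)
tone = np.sin(2 * np.pi * 3 * t)                     # low-frequency tone (cf. diagram 301)
noise_chirp = 0.3 * chirp(t, f0=30, f1=120, t1=1.0)  # chirp "noise" (cf. diagram 302)
composite = tone + noise_chirp                       # composite signal (cf. diagram 303)
```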

Fig. 5 shows a time series of data being used in one embodiment of the invention for forecasting future values in this time series. Particularly, Fig. 5 is a diagram showing the development of the security of the company ALTRIA GROUP INC (also abbreviated as MO). The abscissa in Fig. 5 shows the time in days and the ordinate indicates the value of this security. According to a prior art approach for neural forecasting, time-lagged values of the time series were used for predicting future values. This approach is based on a reconstruction of the phase attractor of the dynamical system inside a neural network, thus extracting laws of development of the system in time.

According to the embodiment of the invention described herein, it is assumed that the time series is an "almost Markov" process. A Markov process is a process where the future of the process does not depend on the past at a known present time step. In this context, "almost" means that tomorrow's value of the time series depends mainly on today's value but, nevertheless, also depends on some previous values. In a first test, inputs in a neural network were considered, said inputs being the security values in the curve of Fig. 5 during the last five days but with different weights. However, this approach has the following problem. Having one input (or several correlated inputs in the form of lagged vectors), the neural network will have a so-called "information famine". Therefore, one input or only a few inputs are not enough to approximate the time series by means of a neural network.

To overcome this problem, it is necessary to extract as much information as possible from only one time series. To do so, the above described EMD method may be used. This is because the extracted modes are orthogonal and the number of modes will be about nine to twelve. Those non-correlated modes may form the orthogonal inputs of a neural network. Nevertheless, by applying this approach to the time series shown in Fig. 5, it becomes apparent that low frequency modes will strongly dominate in comparison to other modes.

This is illustrated in the diagram of Fig. 6. The abscissa in this diagram indicates the mode numbers 1 to 10 and the ordinate refers to the values of the modes, wherein for each mode the time development of the mode value in a predetermined time window is shown. Those modes are extracted from the initial signal shown in Fig. 5. The mode with the lowest frequency, i.e. the mode with number 10 having a frequency period longer than the time window, has the highest values, lying between 20 and approximately 60 in the diagram of Fig. 6. Contrary to that, the mode values of the other modes are much lower, i.e. the maximum mode values of all other modes 1 to 9 are much smaller than 10.

Moreover, the low frequency components in the signal of Fig. 5 form a trend of the time series, and this trend cannot be removed by decomposition because this would introduce border effects as well as other effects which cannot be neglected. Furthermore, the subtraction of modes leads to strong boundary effects and, thus, a forecast even for one day will become impossible. Thus, the inventors realized that it is not possible to predict a time series by influencing the initial signal artificially. Nevertheless, in order to get good results, all inputs must have the same value on average. To do so, the Empirical Mode Decomposition performed according to the invention is not applied to the initial signal but to the derivative of the signal, i.e. the rate of change in time of the initial signal is calculated. This new representation of the time series makes it possible to reduce the influence of low frequency components and results in input values for the neural network which are all of the same order.
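
The effect can be illustrated with a toy example (the numbers below are illustrative, not the Dow Jones data): a strong linear trend dominates the raw signal, whereas after taking the rate of change all components are of the same order.

```python
import numpy as np

t = np.arange(500.0)
signal = 0.5 * t + np.sin(2 * np.pi * t / 25)  # strong trend plus small oscillation
deriv = np.diff(signal)                        # rate of change of the signal
# The raw signal spans hundreds of units; its derivative spans less than one.
```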

Fig. 7 shows a diagram wherein the abscissa indicates the time in days and the ordinate corresponds to the derivative values of the initial signal shown in Fig. 5. The signal shown in Fig. 7 is the signal which is subjected to Empirical Mode Decomposition. As a result of the Empirical Mode Decomposition, the modes shown in Fig. 8 are obtained. Analogously to Fig. 6, different modes 1 to 10 are shown along the abscissa and the mode values are shown along the ordinate. As can be seen from Fig. 8, most of the modes are of the same order and there is no longer a dominating low frequency mode.

Fig. 9 illustrates the computational error of the Empirical Mode Decomposition method applied to the derivative values of Fig. 7. To do so, the sum of all modes is computed and this sum is subtracted from the signal of Fig. 7, thus resulting in the error. In the diagram of Fig. 9, the time in days is indicated on the abscissa and the error on a scale of 10^-16 is shown for each day. Apparently, the error is very small and can be neglected. It should be noted that, in case new points are added to the time series, the uniqueness of the decomposition is broken, since the modes change and, thus, it is necessary to retrain the artificial neural network receiving the modes as inputs each time a new value is received. Training the network on a computer in a Matlab environment (CPU: Celeron 1.3 GHz, RAM: 2 GB DDR2) takes five minutes, and the decomposition of a signal into modes takes 30 seconds. Hence, for an operative forecast for one day, this training time is not critical.

According to the invention, it has to be taken into account that a time series is usually not purely Markov, so that it is necessary to include as inputs in the neural network one or more lagged vectors, i.e. one or more derivative values at a given time step and some earlier time steps. Thus, the input of a neural network includes modes for the given time step, i.e. the current day, and also some set of lagged vectors which are derivative values of earlier time steps. Fig. 10 shows a diagram illustrating the inputs used for a neural network according to the invention. Along the abscissa of the diagram of Fig. 10, 15 different inputs for the neural network are shown. The ordinate indicates the value of each input for a given time window. The first inputs 1 to 9 are modes extracted from the derivative values of the initial signal. Inputs 10 to 15 are lagged vectors, i.e. derivative values at the current time step and five earlier time steps.
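
One possible way to assemble the 15-component input vector of Fig. 10 (nine mode values at the current step, followed by the derivative value at the current step and at five earlier steps) is sketched below; the function name and layout are illustrative assumptions.

```python
import numpy as np

def build_inputs(modes, second_values, t, n_lags=6):
    """Input vector for time step t: one value per mode at step t,
    then the derivative (second) values at steps t, t-1, ..., t-n_lags+1.
    With 9 modes and n_lags=6 this yields the 15 inputs of Fig. 10."""
    mode_part = [m[t] for m in modes]
    lag_part = [second_values[t - k] for k in range(n_lags)]
    return np.array(mode_part + lag_part)
```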

As already mentioned in the foregoing, the method of the invention was applied to the time series shown in Fig. 5 which represents security values of the company ALTRIA GROUP INC from the Dow Jones Index. The data was taken from the site http://finance.yahoo.com. The data shows the security value for a time period from January 3, 1995 to April 17, 2007. The neural network was trained by a training set comprising the first 2500 values in the data set. The test set or validation set comprised 400 values from 2501 to 2900. The remaining 180 values were used as a production set (generalization set) for producing forecasts.

For carrying out the experiment, a neural network in the form of a committee of recurrent Elman networks was used, and the result of the committee was the average of the outputs of all networks. Each Elman network comprised two layers of neurons, each layer having ten neurons. As the method of training for the recurrent Elman networks, a weights optimization method was applied, namely the weights optimization method named BFGS (Broyden-Fletcher-Goldfarb-Shanno). As activation functions in the neural network, hyperbolic tangent functions were used. As described above, the inputs of the network were modes extracted by Empirical Mode Decomposition from the derivative values of the signal of Fig. 5 as well as some lagged derivative values.
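
The committee arrangement may be sketched as follows. The toy network below has a single recurrent layer with random, untrained weights (rather than the two ten-neuron layers trained by BFGS in the experiment) and only illustrates the feedback state of an Elman-style cell and the averaging of the member outputs.

```python
import numpy as np

class ElmanSketch:
    """Minimal Elman-style cell: the hidden state is fed back as an
    additional input at the next time step (the network's memory)."""
    def __init__(self, n_in, n_hidden, seed):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(scale=0.3, size=(n_in, n_hidden))
        self.w_rec = rng.normal(scale=0.3, size=(n_hidden, n_hidden))
        self.w_out = rng.normal(scale=0.3, size=n_hidden)
        self.h = np.zeros(n_hidden)

    def step(self, x):
        # New hidden state depends on the input and the previous state.
        self.h = np.tanh(x @ self.w_in + self.h @ self.w_rec)
        return self.h @ self.w_out

def committee_step(networks, x):
    """Common output of the committee: the average over all members."""
    return np.mean([net.step(x) for net in networks])
```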

The result of the neural forecast method according to the invention is indicated in the diagram of Fig. 11. Along the abscissa of this diagram, 180 time steps of the production set mentioned before are shown. The ordinate indicates the derivative values at each time step. The diagram shows the desired signal, i.e. the initial derivative values to be predicted, as line 501, whereas line 502 shows the values predicted by the method according to the invention. As will be apparent from Fig. 11, there is a good coincidence between the real values according to line 501 and the predicted values according to line 502.

Furthermore, the new method according to the invention was compared with a conventional prediction method based on lagged vectors. Both methods were applied to the production data set. To compare both methods, the determination coefficient, the correlation coefficient as well as the root mean squared error were calculated for each method. The highest value of the determination coefficient and the correlation coefficient is 1, and a method is better the higher these coefficients are. In the table of Fig. 12, the values of the determination coefficient, the correlation coefficient and the root mean squared error of the new method are shown in the column named 601, whereas the corresponding values for the traditional method of lagged vectors are shown in column 602. Along line 603, the respective determination coefficients are shown. Along line 604, the respective correlation coefficients are shown. Line 605 indicates the root mean squared error. It is evident that the method according to the invention is better than the traditional method. Particularly, the determination coefficient of the method according to the invention is 0.92 compared to 0.7 for the method of lagged vectors. The correlation coefficient of the new method, having the value 0.95, is also better than that of the traditional method, having the coefficient 0.92. Moreover, the root mean squared error is smaller when using the method of the invention: the error of the method of the invention has the value 0.08 whereas the traditional method has the value 0.13.
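
The three comparison figures can be computed in the standard way; the patent does not spell out the exact formulas used, so the conventional definitions below are assumed.

```python
import numpy as np

def forecast_quality(y_true, y_pred):
    """Determination coefficient (R^2), correlation coefficient and
    root mean squared error of a forecast against the true values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    corr = np.corrcoef(y_true, y_pred)[0, 1]
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return r2, corr, rmse
```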

According to the foregoing, a new technique for neural forecasting of time series has been described and applied to a time series of financial data. Nevertheless, the method may be applied to any other time series of data, particularly to a time series in technical systems, e.g. in the field of vibro-diagnostics. The technique according to the invention enables the prediction of future values of the derivative of a financial time series with much greater accuracy than existing techniques. The technique is not very complex but enables the construction of a profitable financial strategy in the financial market. The approach could be extended if the energy of the modes is used instead of the modes themselves. In this instance, the problem of approximation can be reduced to a problem of classification. Hence, it should be possible to train an artificial neural network to classify a financial time series into three categories, namely buy, sell or do nothing. As already mentioned before, the method of the invention can be extended to problems of vibro-diagnostics, problems of classification of critical moments in the dynamics of complex systems, and to the forecasting of crashes in financial markets.