


Title:
METHOD FOR ASSESSING RISK SEQUENTIALLY
Document Type and Number:
WIPO Patent Application WO/2022/136195
Kind Code:
A1
Abstract:
A computer-implemented method for determining a risk indicator comprises: a) receiving time series for a set of assets, each element of said time series comprising an asset price and a corresponding date, a window duration span and a window overlap span; b) for each asset of said set of assets, determining asset window energy vectors by b1) grouping the elements of said time series which relate to said asset into an asset time series subset, b2) for each asset time series subset, determining asset windows, each comprising elements of a given asset time series subset which are associated with consecutive dates, chosen such that the time difference between the element with the oldest date and the element with the most recent date is equal to said window duration span, and such that the asset windows associated with a given asset each overlap at least one other asset window associated with said given asset by said window overlap span, b3) for each of said asset windows, applying a window function followed by a Fourier transform, b4) for each resulting Fourier transform, applying a filter-bank in order to define an asset window energy vector having a chosen dimension greater than or equal to 10; c) training a conditional variational autoencoder on the asset window energy vectors of operation b4), in which, for a given asset window energy vector, the conditioning variable is defined as the sign of the difference between the asset prices of the element with the most recent date and the element with the oldest date in the asset window corresponding to said given asset window energy vector, and in which the objective function is based on the Kullback–Leibler divergence, and, for each asset window energy vector of operation b4), storing the output of the encoder of the trained conditional variational autoencoder as a turbulence vector, each turbulence vector being thus associated with an asset and a time window; d) training a hidden Markov model on the turbulence vectors by applying an expectation-maximization algorithm; e) calculating a risk indicator by applying the trained hidden Markov model to a time series for the set of assets with new elements for the set of assets.

Inventors:
DE BEAUCHEF DE SERVIGNY ARNAUD (FR)
MADMOUN HACHEM (FR)
Application Number:
PCT/EP2021/086664
Publication Date:
June 30, 2022
Filing Date:
December 17, 2021
Assignee:
BRAMHAM GARDENS (FR)
International Classes:
G06N3/04; G06N3/08; G06N7/00; G06N20/00
Other References:
BRYAN LIM ET AL: "Detecting Changes in Asset Co-Movement Using the Autoencoder Reconstruction Ratio", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 27 September 2020 (2020-09-27), XP081771639
TIM DE RYCK ET AL: "Change Point Detection in Time Series Data using Autoencoders with a Time-Invariant Representation", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 21 August 2020 (2020-08-21), XP081745695
NORLANDER ERIK: "Clustering and Anomaly Detection in Financial Trading Data", 8 October 2019 (2019-10-08), LUND UNIVERSITY LIBRARIES, pages 1 - 75, XP055906350, Retrieved from the Internet [retrieved on 20220329]
Attorney, Agent or Firm:
CABINET NETTER (FR)
Claims:

[Claim 1] A computer-implemented method for determining a risk indicator comprising: a) receiving time series for a set of assets, each element of said time series comprising an asset price and a corresponding date, a window duration span and a window overlap span, b) for each asset of said set of assets, determining asset window energy vectors by b1) grouping the elements of said time series which relate to said asset into an asset time series subset, b2) for each asset time series subset, determining asset windows, each comprising elements of a given asset time series subset which are associated with consecutive dates, chosen such that the time difference between the element with the oldest date and the element with the most recent date is equal to said window duration span, and such that the asset windows associated with a given asset each overlap at least one other asset window associated with said given asset by said window overlap span, b3) for each of said asset windows, applying a window function followed by a Fourier transform, b4) for each resulting Fourier transform, applying a filter-bank in order to define an asset window energy vector having a chosen dimension greater than or equal to 10, c) training a conditional variational autoencoder on the asset window energy vectors of operation b4), in which, for a given asset window energy vector, the conditioning variable is defined as the sign of the difference between the asset prices of the element with the most recent date and the element with the oldest date in the asset window corresponding to said given asset window energy vector, and in which the objective function is based on the Kullback–Leibler divergence, and, for each asset window energy vector of operation b4), storing the output of the encoder of the trained conditional variational autoencoder as a turbulence vector, each turbulence vector being thus associated with an asset and a time window, d) training a hidden Markov model on the turbulence vectors by applying an expectation-maximization algorithm, e) calculating a risk indicator by applying the trained hidden Markov model to a time series for the set of assets with new elements for the set of assets.

[Claim 2] Computer-implemented method according to claim 1, wherein the window function of operation b3) is a Hamming window function.

[Claim 3] Computer-implemented method according to one of the preceding claims, wherein the objective function for the training of the conditional variational autoencoder of operation c) is of the form

min over θ and Φ of Σ_i ( ‖X_i − X̂_i‖² + KL( N(m(X_i), s(X_i)) ‖ N(0, I) ) )

where X̂_i are the sought predictions, X_i are the inputs, y_i contains the sign of the return of the asset associated with an input X_i, N() is the Normal law, and KL() is the Kullback–Leibler divergence.

[Claim 4] Computer-implemented method according to one of the preceding claims, wherein the expectation-maximization algorithm of operation d) comprises maximizing the equation

Q(λ, λ′) = Σ_k γ_1(k) log π_k + Σ_t Σ_{k,l} ξ_t(k, l) log a_{k,l} + Σ_t Σ_k γ_t(k) log b_k(x_t)

where π is the initial state distribution, a is the state transition matrix, b_k() is the emission density of state k, γ_t(k) is the posterior probability of being in state k at step t, and ξ_t(k, l) is the posterior probability of transitioning from state k to state l at step t.

[Claim 5] Computer-implemented method according to claim 4, wherein γ_t(k) and ξ_t(k, l) are evaluated at every step by applying a forward-backward algorithm.

[Claim 6] Computer-implemented method according to one of the preceding claims, wherein the filter-bank of operation b4) comprises triangle functions having a response of 1 at the central frequency of each filter, which decreases linearly towards 0 until it reaches the central frequencies of the two adjacent filters.

[Claim 7] A computer program comprising instructions for performing the method of any of the preceding claims.

[Claim 8] A data storage medium having recorded thereon the computer program of claim 7.

[Claim 9] A computer system comprising a processor coupled to a memory (4), the memory (4) having recorded thereon the computer program of claim 7.

Description:
Method for assessing risk sequentially

The invention concerns a computer implemented method for assessing risk attached to an asset. More precisely, the invention aims at assessing risk based on time series related to assets.

In conventional art, risk is generally assessed from a statistical backward-looking point of view. This means that risk is usually computed as an historical measure of uncertainty in the market. This view is classically accepted as largely unbiased, but does not capture the full depth and breadth of information that is actually available in the market. More precisely, it does not capture the dynamics of risk over time.

When considering mainstream risk measures such as volatility, VaR, CVaR, and other related ones, there is an implicit assumption that only distributions matter and that the dynamics of prices themselves are of little interest because of their random nature. This perspective is at the heart of traditional “Modern Finance” and rests on Brownian motion, a core assumption of stochastic calculus. However, no existing technique manages to capture the intrinsically sequential nature of risk.

The invention aims at improving the situation. To this end, the Applicant proposes a computer-implemented method for determining a risk indicator comprising: a) receiving time series for a set of assets, each element of said time series comprising an asset price and a corresponding date, a window duration span and a window overlap span, b) for each asset of said set of assets, determining asset window energy vectors by b1) grouping the elements of said time series which relate to said asset into an asset time series subset, b2) for each asset time series subset, determining asset windows, each comprising elements of a given asset time series subset which are associated with consecutive dates, chosen such that the time difference between the element with the oldest date and the element with the most recent date is equal to said window duration span, and such that the asset windows associated with a given asset each overlap at least one other asset window associated with said given asset by said window overlap span, b3) for each of said asset windows, applying a window function followed by a Fourier transform, b4) for each resulting Fourier transform, applying a filter-bank in order to define an asset window energy vector having a chosen dimension greater than or equal to 10, c) training a conditional variational autoencoder on the asset window energy vectors of operation b4), in which, for a given asset window energy vector, the conditioning variable is defined as the sign of the difference between the asset prices of the element with the most recent date and the element with the oldest date in the asset window corresponding to said given asset window energy vector, and in which the objective function is based on the Kullback–Leibler divergence, and, for each asset window energy vector of operation b4), storing the output of the encoder of the trained conditional variational autoencoder as a turbulence vector, each turbulence vector being thus associated with an asset and a time window, d) training a hidden Markov model on the turbulence vectors by applying an expectation-maximization algorithm, e) calculating a risk indicator by applying the trained hidden Markov model to a time series for the set of assets with new elements for the set of assets.

This method is advantageous because it allows capturing all of the available data to robustly quantify the time-varying investor appetite for risk. This is achieved by modeling and denoising the available data.

In various embodiments, the method may present one or more of the following features:

- the window function of operation b3) is a Hamming window function,

- the objective function for the training of the conditional variational autoencoder of operation c) is of the form

min over θ and Φ of Σ_i ( ‖X_i − X̂_i‖² + KL( N(m(X_i), s(X_i)) ‖ N(0, I) ) )

where X̂_i are the sought predictions, X_i are the inputs, y_i contains the sign of the return of the asset associated with an input X_i, N() is the Normal law, and KL() is the Kullback–Leibler divergence,

- the expectation-maximization algorithm of operation d) comprises maximizing the equation

Q(λ, λ′) = Σ_k γ_1(k) log π_k + Σ_t Σ_{k,l} ξ_t(k, l) log a_{k,l} + Σ_t Σ_k γ_t(k) log b_k(x_t)

in which γ_t(k) and ξ_t(k, l) are evaluated at every step by applying a forward-backward algorithm, and

- the filter-bank of operation b4) comprises triangle functions having a response of 1 at the central frequency of each filter and which decreases linearly towards 0 until it reaches the central frequencies of the two adjacent filters.

The invention also concerns a computer program comprising instructions for performing the above method, a data storage medium having recorded thereon this computer program, and a computer system comprising a processor coupled to a memory, the memory having recorded thereon this computer program.

Other features and advantages of the invention will readily appear in the following description of the drawings, which show exemplary embodiments of the invention and on which:

- [Fig. 1] represents a general diagram of a system executing the method according to the invention,

- [Fig. 2] represents an exemplary result of a function which determines energy vectors for a time series of elements,

- [Fig. 3] represents an exemplary structure of an autoencoder used in the system of Figure 1,

- [Fig. 4] represents an exemplary embodiment of a function which trains a hidden Markov model based on the risk vectors produced by the autoencoder of Figure 3,

- [Fig. 5] represents the probability results of the hidden Markov model applied to historical data, used to demonstrate the efficiency of the system of Figure 1,

- [Fig. 6] represents the mean for the values used to establish Figure 5 grouped using a clustering based on the probabilities represented on Figure 5,

- [Fig. 7] represents the standard deviation for the values used to establish Figure 5 grouped using a clustering based on the probabilities represented on Figure 5, and

- [Fig. 8] represents the returns for the values used to establish Figure 5 grouped using a clustering based on the probabilities represented on Figure 5.

The drawings and the following description are comprised for the most part of positive and well-defined features. As a result, they are not only useful in understanding the invention, but they can also be used to contribute to its definition, should the need arise.

The description may make reference to or use elements protected or protectable by copyright. The Applicant does not object to the reproduction of those elements inasmuch as it is limited to the necessary legal publications; however, this should not be construed as a waiver of rights or any form of license.

The description further contains an annex referenced as Annex A, which provides the details of some mathematical formulas used in some of the embodiments of the invention. This annex is included as a means of clarification and in order to simplify the references. It forms an integral part of the description and can thus be used to define the invention should the need arise.

Figure 1 represents a general diagram of a system executing the method according to the invention. System 2 comprises a memory 4, an energy vectorizing unit 6, a conditional autoencoder 8, and a hidden Markov modelizer 10.

Memory 4 stores asset data, as well as all the data which is produced for carrying out the invention. The asset data represents a time series for a given asset. In this time series, for a given asset, there are many data couples. Each couple includes an asset price and a date value. The data may be organized differently, but it will appear that these elements can always be linked in the form of these couples.

In the example described herein, the memory 4 may be realized in any suitable way, that is, by means of a hard disk drive, a solid-state drive, a flash memory, a memory embedded in a processor, a distant storage accessible in the cloud, etc. The data described above may be stored together or across one or several locations.

The energy vectorizing unit 6, conditional autoencoder 8, and hidden Markov modelizer 10 are computer programs or functions which are executed on one or more processors. Such processors include any means known for performing automated calculus, such as CPUs, GPUs, CPU and/or GPU grids, remote calculus grids, specifically configured FPGAs, specifically configured ASICs, specialized chips such as SOCs or NOCs, AI-specialized chips, etc.

Generically speaking, the philosophy followed by the invention relies on the core intuition that the price of an asset summarizes very complex pieces of information. Similarly to a concert, the overall price signal is the aggregate of a large number of players and/or investors who become vocal, each expressing their own view.

In order to understand the dynamics of prices, the invention aims at slicing and dicing them, looking to work at the level of more granular and interpretable information. In addition, the invention aims at de-noising this aggregate price signal at a sub-component level in order to only keep the gist of the information in a recognizable manner.

This philosophy has led the Applicant to consider audio techniques, as this kind of problem has been handled rather well over time in the audio space, where people have looked to extract a recognizable compressed signal from a complex and noisy audio environment. While working on this, the Applicant understood that price series naturally convey a two-dimensional intertwined perception of risk. The first dimension can be characterised as the investment horizons of consideration, and the second one consists of the level of stress perceived at each of these horizons. From price-series information, one can effectively track what corresponds to short-term emotion and what relates to more structural trends. By being able to disentangle these horizons, the Applicant discovered that a risk measure can convey some forward-looking information, related in particular to the evolution of the structural trends, but also to the evolving balance between the various horizons.

This constitutes a significant departure from the “Modern Finance” framework articulated earlier. However, even after these steps, the resulting fine decomposition of the level of stress at different time horizons is still too complex to help infer reliable investment decisions.

For this reason, the Applicant turned to imaging techniques to reduce dimensionality without losing the core of the above horizon / stress information. This is done by using a new family of machine learning techniques that imaging experts typically use in order to move from a high-resolution but noisy picture to a simpler, compact, low-dimensional representation. This compression process led the Applicant to a unique finding, in the sense that it enables condensing the complex multidimensional horizon and stress-level picture into a two-dimensional risk vector. This risk vector is trustworthy in the sense that it enables reverting back to the picture of the previous step in a meaningful manner.

In order to characterize this novel risk vector, the Applicant observed the dynamics of the risk vectors over time. In most instances, lasting states could be observed within which the risk vectors related to an asset would cluster. Based on the observation of the behavior of prices within each of these clusters, and mirroring the language of fluid mechanics, the Applicant was able to separate periods of high turbulence and periods of low turbulence. These different periods are characterized by clearly differentiated Sharpe ratios, as well as specific levels of skewness and kurtosis. The interesting point is that this separation is very often, but not always, clearly forward-looking, because there is a high degree of resilience within each cluster.

From a process perspective, this philosophy is carried out by splitting the asset values into asset specific time windows having a chosen size, determining corresponding energy vectors, reducing these energy vectors into risk vectors, training a hidden Markov model on the risk vectors, and using the trained hidden Markov model to determine whether, based on new value data of an asset, this asset is undergoing a high turbulence period or a low turbulence period.

The training is based on a large amount of information, typically more than 10 years, in order for the hidden Markov model to be robust. However, this amount of information is limited to less than 30 years, so as not to blur the most recent behavior.

Figure 2 shows an exemplary embodiment of the operations performed by the energy vectorizing unit 6.

In an operation 200, energy vectorizing unit 6 executes a function Split(), which receives a data set from memory 4. This data set comprises data couples which form a time series of asset price values for a set of assets. In other words, for a given list of assets, there are hundreds of asset price / date couples relating to the list of assets. Preferably, this list is such that there is one couple per day, i.e. for every asset of the list of assets, there are daily values over a continuous period of time. Function Split() operates by grouping all couples relating to the same asset together. As a result, a number of sets of couples equal to the number of assets in the list of assets is obtained. These sets are hereinafter named asset time series subsets, and each contains all of the couples relating to a given asset. Within a given asset time series subset, the couples are ordered chronologically, such that each asset time series subset forms a subset of the time series of elements formed by the data set. As a result, one can also see each given asset time series subset as a signal function of the price of the asset associated with this asset time series subset. After function Split(), an operation 210 is performed in which energy vectorizing unit 6 executes a function Window().
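The grouping performed by Split() can be sketched as follows; this is a minimal illustration, assuming a hypothetical flat list of (asset, date, price) tuples as the stored data couples:

```python
from collections import defaultdict

def split(couples):
    """Group (asset, date, price) records by asset and sort each group
    chronologically, mirroring the Split() function of operation 200."""
    subsets = defaultdict(list)
    for asset, date, price in couples:
        subsets[asset].append((date, price))
    for series in subsets.values():
        series.sort()  # chronological order within each asset time series subset
    return dict(subsets)

data = [("AAA", "2021-01-02", 10.0), ("BBB", "2021-01-01", 5.0),
        ("AAA", "2021-01-01", 9.5)]
subsets = split(data)
# subsets["AAA"] == [("2021-01-01", 9.5), ("2021-01-02", 10.0)]
```

One dictionary entry per asset is obtained, which matches the statement that the number of subsets equals the number of assets.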

The idea behind function Window() is that an asset price series, like speech, is a non-stationary signal. It is thus desired to extract spectral features from a small window of prices for which it can be assumed that the signal is stationary, i.e. its statistical properties are constant within this region. This is done by using a window function, which is nonzero inside some region and zero elsewhere, running this window across the price series, and extracting the waveform inside this window.

The function Window() thus splits each asset time series subset into consecutive overlapping asset windows, and applies a window function on each asset window.

More precisely, for a given asset time series subset, function Window() creates time series of successive asset prices of the given asset time series subset, hereinafter called asset windows, such that each asset window has a chosen window duration span, and also such that each asset window overlaps at least one other asset window by a window overlap span.

For example, the asset windows may each span two months, and successive asset windows may overlap by 20 days. As a result, for a given asset time series subset, function Window() first creates a series of successive asset windows each comprising 60 asset prices relating to consecutive days, and, for each asset window, the first 20 values are identical to the last 20 values of the previous asset window and/or the last 20 values are identical to the first 20 values of the next asset window. While the above values have been found to be optimal by the Applicant, the window duration span and the window overlap span may take different values. The reason for creating overlapping asset windows is to keep the windows homogeneous between them, such that the resulting processing will itself be more or less continuous.
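The slicing into overlapping windows with the 60-value span and 20-value overlap of this example can be sketched as follows (the function name is illustrative, not the patent's):

```python
import numpy as np

def make_windows(prices, span=60, overlap=20):
    """Slice a price series into windows of `span` consecutive values,
    each sharing `overlap` values with its neighbour (operation 210)."""
    step = span - overlap  # advance by 40 values between window starts
    return np.array([prices[i:i + span]
                     for i in range(0, len(prices) - span + 1, step)])

prices = np.arange(100.0)  # stand-in for 100 daily asset prices
w = make_windows(prices)
# w[1][:20] equals w[0][-20:]: consecutive windows overlap by 20 values
```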

For each asset window, the function Window() applies a chosen windowing function. In the example described herein, the window function is the Hamming function. The aim of the window function is to smooth the tails of the signal in the asset windows, in order to reinforce the uniqueness of the signal within each asset window despite the overlapping setup. Applying the window function to the asset windows also helps reduce spectral leakage by counteracting the Fourier assumption of infinite data in the following operation. Different window functions may be used, such as the Gaussian window, the Hanning window, the Blackman window, etc.
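Applying the Hamming taper can be sketched as follows; the random windows stand in for real 60-sample asset windows, and other tapers such as numpy's hanning or blackman could be substituted as noted above:

```python
import numpy as np

# The Hamming taper is close to 0 at both ends and near 1 in the middle,
# which smooths the tails of each asset window before the Fourier transform.
asset_windows = np.random.default_rng(0).normal(size=(5, 60))
hamming = np.hamming(60)
smoothed = asset_windows * hamming  # broadcast over the 5 windows
```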

As a result, function Window() returns smoothed asset windows which form a continuous signal between them while retaining unique local properties of the asset price signal, for each asset of the list of assets making up the data set.

Thereafter, the asset windows are all Fourier transformed in order to extract the spectral information for each asset window and to know how much energy the signal contains per frequency band. In order to do that, a function DFT() is executed in an operation 220 by the energy vectorizing unit 6. This function preferably performs a Discrete Fourier Transform because it is fast, efficient, and allows the operations to be massively parallelized.
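Operation 220 can be sketched with numpy's FFT routines; using the real-input transform rfft is one possible implementation choice, not necessarily the patent's:

```python
import numpy as np

# DFT of each smoothed window: rfft keeps the non-redundant half of the
# spectrum of a real signal, so 60 real samples -> 31 complex coefficients.
rng = np.random.default_rng(1)
smoothed = rng.normal(size=(5, 60)) * np.hamming(60)  # 5 tapered windows
spectra = np.fft.rfft(smoothed, axis=1)
power = np.abs(spectra) ** 2  # energy per frequency bin, per window
```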

The result of this function is a set of asset window Fourier transform signals, for all assets. However, this data still maintains too high a dimensionality, and it is extremely hard to extract risk information from it. As a result, in an operation 230, the energy vectorizing unit 6 executes a function Filter() which applies a filter-bank to each asset window Fourier transform signal in order to reduce each of them to a vector having a chosen size.

In the example described herein, each filter function in the filter-bank is a triangle function having a response of 1 at the central frequency of the filter, which decreases linearly towards 0 until it reaches the central frequencies of the two adjacent filters. Preferably, the filter-bank may contain 26 filters, such that each asset window Fourier transform signal is reduced to a vector of dimension 26. Choosing the above-defined filter-bank is akin to extracting frequency bands in each asset window Fourier transform signal. One of the most useful facts about the filter-bank approach is that its coefficients tend to be largely uncorrelated, both across banks and from one period to the next. Of course, other filter-banks and/or a different number of filters may be used. The idea really is to start reducing the dimensionality of the asset window Fourier transform signals. Thus, after performing all of its operations, energy vectorizing unit 6 returns a set of vectors of size 26. Each vector represents the energy levels of an asset price for a set of successive and overlapping time windows, for all of the assets of the asset list whose values make up the data set.
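A minimal sketch of such a 26-filter triangular filter-bank follows; the uniform spacing of the central frequencies is an assumption, since the patent does not fix their placement:

```python
import numpy as np

def triangular_filterbank(n_filters=26, n_bins=31):
    """Each filter peaks at 1 on its central frequency and decays linearly
    to 0 at the central frequencies of the two adjacent filters."""
    centres = np.linspace(0, n_bins - 1, n_filters + 2)
    bank = np.zeros((n_filters, n_bins))
    bins = np.arange(n_bins, dtype=float)
    for i in range(n_filters):
        left, centre, right = centres[i], centres[i + 1], centres[i + 2]
        rising = (bins - left) / (centre - left)
        falling = (right - bins) / (right - centre)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, 1.0)
    return bank

bank = triangular_filterbank()
# Reduce a (5 windows x 31 bins) power spectrum to 5 energy vectors of dim 26.
power = np.abs(np.fft.rfft(np.random.default_rng(2).normal(size=(5, 60)),
                           axis=1)) ** 2
energy_vectors = power @ bank.T
```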

By looking at the energy vectors over the year 2008, the Applicant discovered that the energy vectors provide a measure of stress per horizon, in the sense that a low frequency corresponds to a long-term horizon while a high frequency corresponds to a short-term horizon. In comparison with traditional measures of risk, the interpretation here is much richer because not only do they carry a measure of market stress, but they carry it on the full spectrum of horizons. In addition, when looking at the low-frequency (i.e. long-term) information, one can read the forward-looking trend assumed by the market. This is something that cannot be done with volatility.

However, the signal represented by the energy vectors, while remaining remarkably close in terms of actual information to the original data set, is still far too complex to be interpreted in itself. The Applicant thus decided to use machine learning techniques, which are known as great means of reducing dimensionality while maintaining the meaningfulness of information.

Unsupervised neural networks learn useful data representations using the encoding-decoding paradigm. The encoding features are used in several areas of machine learning. For instance, in Natural Language Processing (NLP), universal encoding vectors can be extracted using a neural network aiming at predicting some context vectors. This results in a mapping from the high-dimensional space of words into a low-dimensional feature space with interesting geometrical properties. More recently, universal embeddings have been extracted from encoding-decoding architectures trained for machine translation. Likewise, the speech community has used bottleneck features trained on phoneme predictions. The Applicant considered that these tools could be useful to reduce the dimensionality of the energy vectors. The function of the autoencoder 8 is thus to further reduce the dimensionality of the vectors in order to allow the training of a hidden Markov model by the hidden Markov modelizer 10. The advantage of autoencoders is that they are able to reduce dimensionality without having to be aware of too many specifics of the training data.

In the following, the energy vectors are referenced as X, while the sign of the return of the asset associated to a given energy vector X is referenced Y. The reduced energy vector X will be referenced Z.

The autoencoder 8 is actually a conditional variational autoencoder (CVAE). The use of a conditional autoencoder approach is motivated by the aim to reduce the dimensionality of the vector X while keeping the bulk of the structural observation attached to it. The trade-off is that instead of considering the output Z as a simple vector, it is described as a bivariate Gaussian. In other words, some precision is lost, but there is a significant gain in parsimony and tractability.

The aim is thus to build a lower-dimensionality distribution p(Z | X, Y) which summarizes the input energy vector X well, in the sense that, from it, it will be possible to infer an energy vector as close as possible to energy vector X. Also, during the training process, the additional information Y is provided as the conditioning variable in the CVAE 8.

In addition, the distribution Z will contain Gaussian components characterized by a vector of means mx and a vector of standard deviations sx. The realizations of this multivariate distribution Z will be inferred from the prior realizations of a multivariate Gaussian normalized noise.
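The inference of realizations of Z from prior Gaussian noise described above is the usual reparameterization step, which can be sketched as follows (the 2-d numbers are placeholders, not outputs of a trained encoder):

```python
import numpy as np

# A realization of the latent Z is obtained by shifting and scaling
# standard Gaussian noise with the encoder outputs m(X) and s(X).
rng = np.random.default_rng(3)
m_x = np.array([0.2, -0.5])   # vector of means from the encoder
s_x = np.array([0.3, 0.1])    # vector of standard deviations from the encoder
eps = rng.standard_normal(2)  # prior multivariate Gaussian normalized noise
z = m_x + s_x * eps           # one realization of the latent vector Z
```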

The thinking behind this approach is that directly identifying the law followed by the asset price goes way beyond our capability. Instead, considering small time windows and increasing the dimensionality of the price dynamics by decomposing it into 26 stationary, highly independent sub-series makes the attempt to understand the law behind the evolution of these new vectors more within our reach. In other words, the invention aims at making the complexity more explicit within small clusters of data, which are then treated as largely independent over time but following the same distributions, in order to uncover the main characteristics of these distributions in a non-supervised manner.

In the end, what matters are only the two moments of the distribution Z for each realization of an energy vector X. In the space of images, the process used here typically enables moving from a high-resolution picture to a low-dimensional representation. Once the model is trained, it is used to generate a parsimonious vector Z which synthesizes as much as possible of the information contained in energy vector X, while being nicely distributed and parsimonious.

The CVAE is a probabilistic model of data based on a continuous mixture of distributions:

p_Φ(X | Y) = ∫ p_Φ(X | Z, Y) p(Z) dZ, with p_Φ(X | Z, Y) = N(X; f_Φ(Z, Y), σ² I_d)

where d is the dimensionality of the data.

To train this model, it is necessary to maximize the marginal log-likelihood

max_Φ Σ_i log p_Φ(X_i | Y_i)

of the energy vectors dataset. By marginalizing over the latent variables Z_i and considering their distributions q(z_i), a variational lower bound is obtained which depends on both the distributions q(z_i) for all values of i and Φ as follows:

L(q, Φ) = Σ_i ( E_{q(z_i)}[ log p_Φ(X_i | z_i, Y_i) ] − KL( q(z_i) ‖ p(z_i) ) )

The gap between the marginal log-likelihood and this variational lower bound can be shown to be equal to

Σ_i KL( q(z_i) ‖ p_Φ(z_i | X_i, Y_i) )

With KL() being the relative entropy or Kullback-Leibler divergence.

The goal of the learning of the CVAE 8 is thus to maximize on θ and Φ the following equation

Σ_i ( E_{q_θ(z_i | X_i, Y_i)}[ log p_Φ(X_i | z_i, Y_i) ] − KL( q_θ(z_i | X_i, Y_i) ‖ N(0, I) ) )

where the free distributions q(z_i) are now parameterized by the encoder θ and the prior p(z_i) is taken as the standard normal N(0, I).

Since objective functions for machine learning applications are usually minimizing functions, this can be rewritten as minimizing on θ and Φ the following equation

Σ_i ( −E_{q_θ(z_i | X_i, Y_i)}[ log p_Φ(X_i | z_i, Y_i) ] + KL( q_θ(z_i | X_i, Y_i) ‖ N(0, I) ) )
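For a diagonal Gaussian encoder output N(m(X), diag(s(X)²)) against a standard normal prior, the Kullback-Leibler term has the well-known closed form 0.5 Σ (s² + m² − 1 − log s²); a small numerical check:

```python
import numpy as np

def kl_to_standard_normal(m, s):
    """KL( N(m, diag(s^2)) || N(0, I) ) for a diagonal Gaussian."""
    s2 = np.square(s)
    return 0.5 * np.sum(s2 + np.square(m) - 1.0 - np.log(s2))

kl = kl_to_standard_normal(np.zeros(2), np.ones(2))
# KL(N(0, I) || N(0, I)) is exactly 0
```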

As a result, as shown on Figure 3, value Y is referenced 30 and energy vector X is referenced 31. Both are fed to an encoder 32 (for θ) for minimizing the above equation, with function m(X) referenced 33 and function s(X) referenced 34 defining the intermediate vector Z referenced 35. Vector Z and value Y are then fed to a decoder 36 (for Φ) which is trained to return estimated energy vectors 37 which best approximate the input energy vectors X.

In view of the invention, only the vectors m(X) are important, as they realize the dimension reduction of the energy vectors X.

The empirical analysis of vectors Z can be summed up based on the following paradigm. For each country / geographic zone, a universe of large cap stocks is collected, typically those present in the broad country index of interest. In total, per country, there are between 300 and 600 time series, each with typically thirty years of history. From this, expanding windows (expanding once a year) are created, where the population of X varies between 20,000 and 100,000 26-dimensional observations. These windows are used to estimate the parameters of the CVAE. The meta-parameters are kept simple, as on both the encoder and the decoder sides one neural layer was shown to be sufficient. After various testing, the Applicant discovered that the most parsimonious and effective articulation of the vectors Z corresponded to a 2-dimensional vector. Higher dimensions did not help. This situation is quite unique, as the dimensionality reduction obtained is substantial. The intuition behind vectors Z is that the set of inputs X articulates a risk management message which includes some forward-looking information. It should be noted that the dimension reduction from energy vectors X to vectors Z brings another layer of resolution reduction, in the sense that if the energy vectors X are considered as a high-resolution picture, the vectors Z are its corresponding low-resolution recognizable twin.

By empirically analyzing the vectors Z, the Applicant was able to uncover the fact that these vectors actually relate to the risk attached to the asset price, and that this risk could be modeled by means of a hidden Markov model. As a result, the Applicant fed the vectors Z, thus treated as risk vectors Z, to the hidden Markov modelizer 10.

Figure 4 shows an exemplary embodiment of a function which trains the hidden Markov modelizer 10. The function of Figure 4 is a loop which performs an expectation-maximization algorithm in order to estimate the parameters of the hidden Markov model.

More precisely, each vector X associated to a given asset represents a time instant t of the model, which has a state H_t (which in the case of risk vectors Z can take two values) and an observation vector x_t associated thereto. To parameterize the hidden Markov model, local conditional probabilities need to be assigned to each of the existing time instants. The first state node has no parents, and thus this node receives an unconditional distribution π_i = P(H_1 = i), with i being equal to 1 or 2. This is done in an operation 400 by a function Init().

Each successive state node has the previous state node in the chain as its parent, and thus a 2x2 matrix Q, known as the state transition matrix, specifies its local conditional probability. Each term Q_ij is defined as P(H_t = j | H_{t-1} = i). Since each output node has a single state node as a parent, a probability distribution p(x_t | q_t) called the emission distribution is required. In the example herein, the emission distribution is assumed to be Gaussian and independent of t, such that p(x_t | q_t = i) = N(x_t; μ_i, Σ_i), with N() being the Normal law and (μ_i, Σ_i) being the parameters of the emission distributions.

This is done by means of a function FwdBwd(), which applies a forward-backward algorithm to return a triplet SFP comprising the values of the smoothing, filtering and prediction probabilities. Function FwdBwd() is performed in an operation 410, and the pseudo-code of the forward-backward algorithm is provided in Annex A.
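Such a forward-backward pass can be sketched as follows (NumPy, two states, scalar Gaussian emissions for brevity; the function name `fwd_bwd` and its signature are illustrative and do not reproduce the pseudo-code of Annex A):

```python
import numpy as np

def fwd_bwd(x, pi, Q, mu, sigma):
    """Forward-backward pass for a 2-state HMM with scalar Gaussian emissions.

    Returns (smoothing, filtering, prediction) probabilities -- the triplet SFP.
    """
    T, K = len(x), len(pi)
    # Emission likelihoods p(x_t | H_t = i) under N(mu_i, sigma_i^2).
    B = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    filt = np.zeros((T, K))
    pred = np.zeros((T, K))
    pred[0] = pi
    for t in range(T):                      # forward (filtering) pass
        if t > 0:
            pred[t] = filt[t - 1] @ Q       # one-step-ahead prediction
        f = pred[t] * B[t]
        filt[t] = f / f.sum()
    smooth = np.zeros((T, K))
    smooth[-1] = filt[-1]
    for t in range(T - 2, -1, -1):          # backward (smoothing) pass
        ratio = smooth[t + 1] / (filt[t] @ Q)
        smooth[t] = filt[t] * (Q @ ratio)
    return smooth, filt, pred
```

Each row of the filtering and smoothing matrices sums to one, as it represents a probability distribution over the two states at that time instant.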

After the filtering and smoothing probabilities have been evaluated, the parameters of the hidden Markov model for loop iteration i are evaluated by a function Upd() which receives the filtering and smoothing probabilities as arguments.

More precisely, it can be shown that the maximization step can be calculated such that:

π_i = γ_1(i)
Q_ij = ( Σ_t ξ_t(i, j) ) / ( Σ_t γ_t(i) ), the sums running over t = 1 to T−1
μ_i = ( Σ_t γ_t(i) x_t ) / ( Σ_t γ_t(i) )
Σ_i = ( Σ_t γ_t(i) (x_t − μ_i)(x_t − μ_i)^T ) / ( Σ_t γ_t(i) )

with γ_t(i) being the smoothing probability P(H_t = i | x_1, ..., x_T) and ξ_t(i, j) the pairwise smoothing probability P(H_t = i, H_{t+1} = j | x_1, ..., x_T).

This function thus evaluates the above equations and returns a vector PQMS comprising the corresponding values of π_i, Q_ij, μ_i and Σ_i.
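A minimal sketch of such an update function (NumPy, scalar Gaussian emissions for brevity; the name `upd` and the array conventions for γ and ξ are illustrative, not the patented implementation):

```python
import numpy as np

def upd(gamma, xi, x):
    """Baum-Welch style maximisation step, returning the PQMS quadruple.

    gamma[t, i] : smoothing probability P(H_t = i | x_1..T)
    xi[t, i, j] : pairwise smoothing probability P(H_t = i, H_{t+1} = j | x_1..T)
    x[t]        : scalar observation at time t
    """
    pi = gamma[0]                                             # initial distribution
    Q = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]      # transition matrix
    w = gamma.sum(axis=0)                                     # state responsibilities
    mu = (gamma * x[:, None]).sum(axis=0) / w                 # emission means
    sigma = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / w)  # emission std devs
    return pi, Q, mu, sigma
```

By construction each row of the returned transition matrix Q sums to one.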

Thereafter, the estimation step can be updated in an operation 430 by a function Estim() which receives the current SFP and PQMS variables and calculates the current estimation value, namely the marginal log-likelihood of the data under the current parameters. Finally, an exit condition is evaluated in an operation 440 in which a function Out() compares the current estimation value to the estimation value of the previous loop. If the difference between these values is above a chosen threshold, then the loop counter i is incremented in an operation 450, and the loop starts over with operation 410. Else, the function ends in operation 499 and the hidden Markov model is trained.

Once the hidden Markov model is trained, it can be used to make predictions about the state associated with a new energy vector X based on the historical values, as it can be proven that

P(H_{T+1} = i | x_1, ..., x_T) = Σ_j P(H_T = j | x_1, ..., x_T) Q_ji
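In the notation above, this one-step-ahead prediction reduces to a single vector-matrix product; a sketch with hypothetical numbers (the values of Q and of the filtered probabilities are purely illustrative):

```python
import numpy as np

# P(H_{T+1} = i | x_1..T) = sum_j P(H_T = j | x_1..T) * Q[j, i]
Q = np.array([[0.95, 0.05],      # hypothetical, strongly persistent transition matrix
              [0.10, 0.90]])
filt_T = np.array([0.8, 0.2])    # hypothetical filtered state probabilities at time T
pred = filt_T @ Q                # -> array([0.78, 0.22])
```

The persistence of the transition matrix is what makes the predicted state probabilities stable over time, consistent with the clustering observed in Figure 5.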

From this equation, the Applicant was able to prove the efficiency of the method. Indeed, Figure 5 represents the value of P() for the Consumer Staples sector in the US (MXUSOCS index). More precisely, the system of Figure 1 was trained on 18 years' worth of data, from 1985 to 2003, and the rest of the data was used to test the reliability of the invention. Figure 5 shows the values of P() from 2004 to 2020. This graph shows that the values of P() cluster in two states which tend to be stable. The values above 0.5 have been labelled good state, while the values below have been labelled bad state.

Figures 6 and 7 respectively show the mean and standard deviation of the good state and the bad state (the good state being on top in Figure 6 and at the bottom in Figure 7), while Figure 8 shows the returns that would have been obtained by investing only during the good state, or only during the bad state (the good state being on top in Figure 8).

These figures clearly show the informative and forward-looking nature of the risk measured by the probability from the hidden Markov model. This information can thus be used to invest in stocks when the probability indicates the good state, and in safe investments such as bonds in the alternative. More precisely, every day, the hidden Markov model may be trained using the previously acquired data and the data of the previous day, and the invention may be used to provide predictions for the next day. The conditional variable autoencoder may be retrained about once a year. The hidden Markov model may also be replaced by a prediction model based on LSTM neural networks and attention mechanisms which uses the same embedding vectors in order to predict favorable states. To that end, the model takes as input a sequence of embedding vectors corresponding to a sequence of frames, along with a historical sequence of favorable states. The output of the model is the sequence of states shifted by one frame.