Title:
FORECASTING SYSTEM AND METHOD
Document Type and Number:
WIPO Patent Application WO/2022/122811
Kind Code:
A1
Abstract:
The disclosure relates to a method and apparatus for generating a resource consumption forecast, comprising: receiving data for a plurality of resources; identifying groups of resources based on cluster analysis of the data for the plurality of resources; forecasting one or more properties of future resource consumption for each of the plurality of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each sales modelling engine; determining, for each combination of group of products, modelling engine and set of hyperparameters, a confidence value using k-fold cross-validation; and selecting, for each group of resources, a modelling engine and set of hyperparameters as the resource consumption forecast model based on the determined confidence values associated with that group of resources.

Inventors:
NAG PARIKSHIT (IN)
Application Number:
PCT/EP2021/084765
Publication Date:
June 16, 2022
Filing Date:
December 08, 2021
Assignee:
UNILEVER IP HOLDINGS B V (NL)
UNILEVER GLOBAL IP LTD (GB)
CONOPCO INC DBA UNILEVER (US)
International Classes:
G06Q10/04
Foreign References:
KR20180060317A 2018-06-07
Other References:
TAYLOR SJ, LETHAM B: "Forecasting at scale", PEERJ PREPRINTS, vol. 5, 2017, pages e3190v2, Retrieved from the Internet
T. CHEN, C. GUESTRIN: "XGBoost: A Scalable Tree Boosting System", ARXIV:1603.02754V3, 10 June 2016 (2016-06-10), Retrieved from the Internet
G. KE ET AL.: "LightGBM: A Highly Efficient Gradient Boosting Decision Tree", 31ST CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS (NIPS 2017), LONG BEACH, CA, USA, Retrieved from the Internet
L. PROKHORENKOVA ET AL.: "CatBoost: unbiased boosting with categorical features", ARXIV:1706.09516V5, 20 January 2019 (2019-01-20), Retrieved from the Internet
Attorney, Agent or Firm:
REIJNS, Tiemen, Geert, Pieter (NL)
Claims:
Claims

1. An apparatus comprising: one or more processors; an output device; and computer memory, the computer memory comprising computer program code configured to: receive data for a plurality of resources; identify groups of resources based on cluster analysis of the data for the plurality of resources; forecast one or more properties of future resource consumption for each group of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine; determine a confidence value using k-fold cross-validation for each combination of group of resources, modelling engine and set of hyperparameters; and select, for each group of resources, a modelling engine and set of hyperparameters as the resource consumption forecast model based on the determined confidence values associated with that group of resources.

2. An apparatus according to claim 1, wherein the apparatus is further configured to forecast future consumption using the resource consumption forecast model.

3. An apparatus according to claim 1 or claim 2, wherein the resource consumption data relates to one or more of: the use of energy by users in a power network, the use of computational resources by physical machines or virtual machines on a network or in another system, and the consumption of a commodity by a population.

4. An apparatus according to any of the preceding claims, wherein the groups of resources are resources with similar consumption trends by end users, the apparatus further configured to identify groups of resources by determining a forecast of consumption by end users.

5. An apparatus according to any of the preceding claims, wherein the resource consumption forecast is a forecast of a primary supply of resources to locations at which the resources are distributed to end users.

6. An apparatus according to any of the preceding claims, wherein the confidence value is a loss value and the apparatus is configured to select the modelling engine and set of hyperparameters which result in the lowest loss as a selected forecast model.

7. An apparatus according to any of the preceding claims, wherein k = 5, and wherein each of the k-fold validations is performed on different sub-sets of the data.

8. An apparatus according to any of the preceding claims, wherein the apparatus is further configured to select the hyperparameters using Bayesian optimization.

9. An apparatus according to any of the preceding claims, wherein the modelling engines are provided with, and take account of predefined event information.

10. An apparatus according to any of the preceding claims, wherein the resource consumption forecast is a product primary sales forecast.

11. An apparatus according to claim 10, wherein the data received for the plurality of resources is primary sales data, the resources are products, and the modelling engine is a primary sales modelling engine.

12. An apparatus according to claim 11, wherein the groups of products are products with similar sales trends.

13. An apparatus according to any of the preceding claims, wherein the one or more properties of future resource consumption include one or more of a number of sales in a particular period and sales values.

14. An apparatus according to claim 11, wherein the modelling engines are provided with, and take account of, pricing information or stock availability.

15. A method for generating a resource consumption forecast, comprising: receiving data for a plurality of resources; identifying groups of resources based on cluster analysis of the data for the plurality of resources; forecasting one or more properties of future resource consumption for each group of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine; determining a confidence value using k-fold cross-validation for each combination of group of resources, modelling engine and set of hyperparameters; and selecting, for each group of resources, a modelling engine and set of hyperparameters as the resource consumption forecast model based on the determined confidence values associated with that group of resources.

Description:
Forecasting System and Method

Field of the invention

The invention relates to a system and method for generating a resource consumption forecast, and in particular, although not exclusively, to generating a product sales forecast.

Background of the invention

Multivariate analysis relates to the simultaneous consideration of a number of aspects of a physical or other system in order to determine its behaviour. Such analysis methods are known in the art and can be used to perform forecasting, such as demand forecasting. Demand forecasting is of practical application in a wide range of fields in order to determine the likely future consumption of a resource, such as the use of energy by users in a power network, the use of computational resources by physical machines or virtual machines on a network or in another system, or the consumption of a commodity by a population. In recent years, such forecasting has also found applications in e-commerce in order to predict demand for a product, such as a consumer product.

Given the volatility of the e-commerce market, traditional time series analysis may be of limited practical use in providing useful forecasts for demand planning. A large number of different modelling packages are available for forecasting based on historic data. Each package may have its own advantages and limitations for modelling a particular consumption pattern. Further, planners operating such software packages may be required to manually build in anticipated uplifts in demand to account for big-day events, promotions, and pricing changes, using existing solutions based on their intuition and knowledge of the market. As such, the application of suitable demand forecasting tools and the selection of appropriate methods requires substantial skill and effort on the part of the planner in order to provide reliable forecasts. Such a process may also be particularly time consuming using prior art demand planning techniques.

Some aspects of the present disclosure relate to alleviating the above difficulties.

Summary of the invention

According to a first aspect of the present disclosure, there is provided a method for generating a resource consumption forecast, comprising: receiving data for a plurality of resources; identifying groups of resources based on cluster analysis of the data for the plurality of resources; forecasting one or more properties of future resource consumption for each group of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine; determining a confidence value using k-fold cross-validation for each combination of group of resources, modelling engine and set of hyperparameters; selecting, for each group of resources, a modelling engine and set of hyperparameters as the resource consumption forecast model based on the determined confidence values associated with that group of resources.

In this way, such aspects provide an automated tool for the generation and validation of resource consumption forecasts using one of a number of base modelling engines. The various base modelling engines may be appropriate for different types of resource, and the application of the method may allow the selection of a suitable modelling engine and set of hyperparameters to be optimized. In particular, the use of k-fold cross-validation has been found to provide a robust solution, allowing the automation or semi-automation of model selection over a wide range of resources. As such, the amount of skill and experience required of the demand planner, and the time that it takes for them to obtain a model appropriate to a particular resource (and other resources in the same group) may be reduced. The method may comprise forecasting future consumption using the selected model algorithm and associated parameters.

The resource consumption data may relate to one or more of: the use of energy by users in a power network, the use of computational resources by physical machines or virtual machines on a network or in another system, and the consumption of a commodity by a population. Such consumption forecasts are examples of forecasts that model the future state of a technical system. The consumption of a commodity by a population may include determining the number of sales of a product.

The method may comprise modifying a condition for the plurality of modelling engines in order to obtain a plurality of sets of hyperparameters for each modelling engine using the modified conditions. In this way, the method may be used to plan for known changes in a system that may affect consumption of the resources.

The resources may be products, such as consumer products. The resource consumption forecast may be a product sales forecast.

The groups of resources may be resources with similar consumption trends. Identifying groups of resources may comprise determining a forecast of consumption of the resources by end users. The resource consumption forecast may be a forecast of a primary supply of resources to locations at which the resources are distributed to end users.

The confidence value may be a loss value. The modelling engine and set of hyperparameters which result in the lowest loss value may be selected as a chosen forecast model.

The modelling engines may comprise one or more of the XGBoost, CatBoost and LightGBM engines. The k value in the k-fold cross-validation may be equal to 5. A value of 5 has been found to provide a suitable trade-off in terms of implementation complexity and robustness of the cross-validation. Each of the k-fold validations may be performed on different subsets of the data.

The hyperparameters may be selected using Bayesian optimization. The use of Bayesian optimization allows the selection of hyperparameters to be automated and so reduces the complexity of modelling for the demand planner.

The modelling engines may be provided with, and take account of, predefined event information. The predefined event information may include public holidays or promotional periods.

The method may comprise validating the homogeneity of a cluster by calculating a silhouette score.

The received data may be sales data. The resources may be products, such as consumer products. The modelling may relate to sales, or primary sales, modelling. The group of products may be products with similar sales trends, such as offtake sales trends. One or more properties of the future resource consumption forecast may include one or more of a number of sales, such as offtake sales, of the product in a particular period and sales values. The received data may be primary sales data. The resources may be products. The modelling engine may be a primary sales modelling engine.

The data may include pricing information for one or more products. The data may include stock availability for one or more products. The modelling engines may be provided with and take account of pricing information. The modelling engines may be provided with and take account of stock availability.

The method may comprise forecasting future sales using the selected model algorithm and associated parameters.

The method may be an automated or computer implemented method.

According to a further aspect of the present disclosure there is provided a method for generating a resource consumption forecast, comprising: receiving data for a resource; identifying a plurality of data subsets for the resource, each data subset comprising a different plurality of resources; forecasting one or more properties of future resource consumption for each subset of data using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine; determining a confidence value using k-fold cross-validation based on the plurality of data subsets for each combination of modelling engine and set of hyperparameters; selecting a modelling engine and set of hyperparameters as the resource consumption forecast model based on the determined confidence values associated with that resource.

According to a further aspect of the present disclosure there is provided an apparatus comprising: one or more processors; an output device; and computer memory, the computer memory comprising computer program code configured to: receive data for a plurality of resources; identify groups of resources based on cluster analysis of the data for the plurality of resources; forecast one or more properties of future resource consumption for each group of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine; determine a confidence value using k-fold cross-validation for each combination of group of resources, modelling engine and set of hyperparameters; select, for each group of resources, a modelling engine and set of hyperparameters as the resource consumption forecast model based on the determined confidence values associated with that group of resources.

Brief Description of Figures

One or more embodiments will now be described by way of example only with reference to the accompanying drawings in which:

Figure 1 illustrates a method for generating a resource consumption forecast in accordance with an aspect of the present invention;

Figure 2 illustrates a schematic block diagram of a process flow in accordance with an aspect of the present invention;

Figure 3 illustrates a further process flow diagram;

Figure 4 illustrates a further schematic block diagram of elements of an offtake forecasts stage and an automated supervised model selection engine stage; and

Figure 5 illustrates a schematic block diagram of a computer system.

Detailed description of the invention

First Aspect of a Resource Consumption Forecasting Method

According to a first aspect, there is provided a method for generating a resource consumption forecast. The resource consumption forecast may be applicable to a number of different types of forecasting. For example, the forecast may relate to the use of energy by users in a power network, use of computational resources by physical machines or virtual machines on a network or in another system, or forecasting sales for a particular product. As such the resource consumption forecasting may relate to the future consumption of a resource, such as energy, computational or natural resources, or sales data.

The method includes a number of stages. Data for a plurality of resources are received. Groups of resources are identified based on cluster analysis of the data for the plurality of resources. One or more properties of future resource consumption are forecast for each of the plurality of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine. For each combination of group of resources, modelling engine and set of hyperparameters, a confidence value is determined using k-fold cross-validation. A hyperparameter is a predetermined parameter. For each group of resources, a modelling engine and set of hyperparameters is selected as the resource consumption forecast model based on the determined confidence values associated with that group of resources.

Second Aspect of a Resource Consumption Forecasting Method

According to a second aspect, there is provided a method for selecting a resource consumption forecast model by performing k-fold cross validation on models determined using different permutations of historic time series data for a single resource. The resource consumption forecast may be applicable to a number of different types of forecasting. For example, the forecast may relate to the use of energy by users in a power network, use of computational resources by physical machines or virtual machines on a network or in another system, or forecasting sales for a particular product. As such the resource consumption forecasting may relate to the future consumption of a resource, such as energy, computational or natural resources, or sales data.

In one example, there is provided a method for generating a resource consumption forecast, comprising: receiving data for a resource; identifying a plurality of data subsets for the resource, each data subset comprising a different plurality of resources; forecasting one or more properties of future resource consumption for each subset of data using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine; determining a confidence value using k-fold cross-validation based on the plurality of data subsets for each combination of modelling engine and set of hyperparameters; selecting a modelling engine and set of hyperparameters as the resource consumption forecast model based on the confidence values associated with that resource determined by the k-fold cross-validation.

The data for the resource may be time series data, in which case the plurality of data subsets for the resource may include different datapoints. For example:

Dataset comprises t0, t1, t2, t3, t4, t5;

1st subset comprises t1, t2, t3, t4, t5;

2nd subset comprises t0, t2, t3, t4, t5;

3rd subset comprises t0, t1, t3, t4, t5;

4th subset comprises t0, t1, t2, t4, t5; and

5th subset comprises t0, t1, t2, t3, t5.
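
The subsets listed above each omit one datapoint from the dataset. A minimal Python sketch of how such subsets might be generated is given below; the function name leave_one_out_subsets and its n_subsets parameter are hypothetical and are not taken from the disclosure.

```python
def leave_one_out_subsets(datapoints, n_subsets):
    """Return n_subsets lists; the i-th list omits datapoints[i]."""
    if n_subsets > len(datapoints):
        raise ValueError("cannot omit more points than the dataset contains")
    return [datapoints[:i] + datapoints[i + 1:] for i in range(n_subsets)]


# The six-point dataset from the example above, with five subsets.
dataset = ["t0", "t1", "t2", "t3", "t4", "t5"]
subsets = leave_one_out_subsets(dataset, n_subsets=5)
for number, subset in enumerate(subsets, start=1):
    print(number, subset)
```

Each subset keeps the remaining datapoints in their original time order, which matters if the modelling engine relies on lagged values.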

Hardware-resource control systems

Hardware-resource control systems may be used, for example, to allocate computational resources within a computer system according to the method of the first or second aspect. In the example discussed below, a networked computer system comprising a server and a plurality of end user devices is tasked with executing a number of execution threads of application program interfaces (APIs). In this system, the APIs have a primary resource consumption related to processing power required at the server and consumption of computing resources by the end user devices. The control system may forecast future demand to determine whether additional processing power is likely to need to be acquired from other computation resources in order to ensure efficient operation of the system. In effect, the resource consumption forecast is a forecast of a primary supply of resources to locations at which the resources are distributed to end users.

The control system may receive a number of data sources, for example including: demand data regarding the demand from a number of APIs on the computer system; a database containing information concerning events that are likely to increase overall demand on the system, which might include details of public holidays; demand data regarding computational resource capability of connected systems for spreading the data-processing load; and details of periods when the price of electricity, and therefore computational resource time, is reduced.

The data are provided to a conventional multivariate forecasting engine to generate user end consumption forecasts for the various API threads. Groups of API threads with similar user consumption trends may be identified using cluster detection techniques. A feature engineering process, which is conventional in the field of machine learning, can be used by a user to adapt the forecast provided by the multivariate forecasting engine to accommodate different demand features, such as applying a network bandwidth constraint or identifying a period of heightened anticipated demand.

As discussed above with reference to the first aspect, one or more properties of future primary resource consumption are forecast for each of the plurality of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine. In this example, the one or more properties may relate to bandwidth requirement or processing demand at the server from a particular API or group of APIs. For each combination of group of API threads, modelling engine and set of hyperparameters, a confidence value is determined using k-fold cross-validation. For each group of API threads, a modelling engine and set of hyperparameters is selected as the resource consumption forecast model based on the determined confidence values associated with that group of API threads.

E-commerce Platform

In a further aspect, the method described with reference to the first or second aspect may be used to implement a sales forecasting model for use, for example, in an e-commerce platform.

According to a further aspect of the present disclosure there is provided a method for generating a primary sales forecast model, comprising: receiving offtake sales data for a plurality of products; identifying groups of products based on cluster analysis of the offtake sales data for the plurality of products; forecasting one or more properties of future primary sales for each group of products using a plurality of primary sales modelling engines, and a plurality of sets of hyperparameters for each primary sales modelling engine; determining a confidence value using k-fold cross-validation for each combination of: group of products, modelling engine and set of hyperparameters; and selecting, for each group of products, a primary sales modelling engine and set of hyperparameters as the primary sales forecast model based on the determined confidence values associated with that group of products.

Groups of products, or base packs, may be identified from the predicted offtake by grouping products that have similar offtake sales trends. For example, all the base packs may be segmented using Spectral Clustering. The homogeneity of each cluster may be validated by calculating a Silhouette Score. In general, the clusters may be identified using any type of cluster detection algorithm and evaluation technique known in the art. In this way, products that have similar offtake sales patterns can be identified.

Computer Programs

According to a further aspect of the disclosure there is provided a data processing unit configured to perform any method described herein as computer-implementable. The data processing unit may comprise one or more processors and memory, the memory comprising computer program code configured to cause the processor to perform any method described herein.

According to a further aspect of the disclosure there is provided a computer readable storage medium comprising computer program code configured to cause a processor to perform any computer-implementable method described herein. The computer readable storage medium may be a non-transitory computer readable storage medium.

There may be provided a computer program, which when run on a computer, causes the computer to configure any apparatus, including a circuit, unit, controller, device or system disclosed herein to perform any method disclosed herein. The computer program may be a software implementation. The computer may comprise appropriate hardware, including one or more processors and memory that are configured to perform the method defined by the computer program.

The computer program may be provided on a computer readable medium, which may be a physical computer readable medium such as a disc or a memory device, or may be embodied as a transient signal. Such a transient signal may be a network download, including an internet download. The computer readable medium may be a computer readable storage medium or non-transitory computer readable medium.

Detailed description of Figures

Figure 1 illustrates a method 100 in accordance with the first aspect. The method 100 illustrated in Figure 1 includes a number of stages. Data for a plurality of resources are received 102. Groups of resources are identified 104 based on cluster analysis of the data for the plurality of resources. One or more properties of future resource consumption are forecast 106 for each of the plurality of resources using a plurality of modelling engines, and a plurality of sets of hyperparameters for each modelling engine. For each combination of group of resources, modelling engine and set of hyperparameters, a confidence value is determined 108 using k-fold cross-validation. A hyperparameter is a predetermined parameter. For each group of resources, a modelling engine and set of hyperparameters is selected 110 as the resource consumption forecast model based on the determined confidence values associated with that group of resources.

Various features of the method described previously with reference to the first and second aspects may be better understood by considering the specific embodiments discussed below regarding an e-commerce platform described with reference to Figures 2 to 4.

Figure 2 illustrates a schematic block diagram of a process flow in accordance with the disclosed method. In this example, the process flow 200 relates to an example in which the resource forecast method provides a primary sales forecast in an e-commerce context. That is, the resource consumption forecast is a product primary sales forecast. The process flow 200 receives a number of data sources 212, 214, 216, 218. The data sources include offtake data 212 and primary sales data 216. Offtake data relates to the purchase of a product by an end consumer, for example from a retailer, whereas primary sales relate to purchases by a retailer or wholesaler from a supply source. The process flow 200 also receives market event data, which in this example are Amazon events 214. Such events may include discount or promotion events, such as Black Friday or Thanksgiving. The market events may be country specific, for example. The process flow 200 also receives TTS discount data 218, which relates to information regarding promotion provided by a supplier to a retailer.

The offtake data 212, Amazon events data 214 and TTS discount data 218 are provided to offtake forecasters 220, which are configured to generate a plurality of offtake forecasts related to the products that are the subject of the input data 212, 214, 218. The offtake forecasts generated by the offtake forecasters 220 may be implemented using a plurality of known forecast modelling engines. For example, offtakes may be predicted using Prophet, a generalized additive model open sourced by Facebook Inc. and implemented in Python (https://facebook.github.io/prophet/; see also Taylor SJ, Letham B. 2017. Forecasting at scale. PeerJ Preprints 5:e3190v2 https://doi.org/10.7287/peerj.preprints.3190v2). Prophet provides a procedure for forecasting time series data in which non-linear trends are fit with yearly, weekly, and daily seasonality, for example. It takes into account the impact of Big Days, Discounts and national holidays determined from a series of historical data.

Groups of products, or base packs, may be identified from the predicted offtake by grouping products that have similar offtake sales trends. For example, all the base packs may be segmented using Spectral Clustering. The homogeneity of each cluster may be validated by calculating a Silhouette Score. In general, the clusters may be identified using any type of cluster detection algorithm and evaluation technique known in the art. In this way, products that have similar offtake sales patterns can be identified.
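
The silhouette-based validation of cluster homogeneity may be sketched as follows. This is a minimal illustration in Python using only the standard library, and it assumes that cluster labels have already been assigned (for example by Spectral Clustering, which is not reproduced here), that each product is represented by a short vector of predicted offtake values, and that Euclidean distance is an acceptable similarity measure. The function name and the sample values are illustrative only.

```python
import math
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")

    coefficients = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            coefficients.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(math.dist(point, other) for other in members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        largest = max(a, b)
        coefficients.append((b - a) / largest if largest > 0 else 0.0)
    return mean(coefficients)


# Illustrative offtake vectors for four hypothetical base packs.
offtake_forecasts = [(10, 12, 11), (11, 13, 12), (50, 40, 30), (52, 41, 29)]
good_score = silhouette_score(offtake_forecasts, [0, 0, 1, 1])
mixed_score = silhouette_score(offtake_forecasts, [0, 1, 0, 1])
print(good_score, mixed_score)
```

A score close to 1 suggests compact, well separated clusters, whereas a score near or below 0 suggests that products have been assigned to the wrong group.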

The Amazon events data 214, the primary sales data 216 and the TTS discount data 218 are also provided along with output from the offtake forecast engines 220 to a feature engineering and segmentation stage 222. Segmentation may be achieved using spectral clustering, using generic packages commonly available for the Python programming language. In other respects, the feature engineering and segmentation stage 222 may be implemented by a manual practice known in the art for creating features to modify the offtake forecasts, such as the application of an anticipated discount, a predicted offtake or a predicted primary sale, in order to generate leads, lags or rolling averages in the offtake forecasts. Lag features may be provided as a function of historic time series data. In this way, feature engineering using predictive functions that take historic data as an input allows for consideration of autocorrelation, which may otherwise be ignored by machine learning techniques. The specific implementation of the offtake forecasters 220 and the process of feature engineering and segmentation 222 are not the subject of this disclosure.
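
By way of illustration only, lag and rolling-average features of the kind mentioned above might be derived from a historic series as sketched below in Python. The function names, the window length and the sample values are assumptions made for this sketch. The rolling mean is computed over past values only, which is a design choice intended to avoid leaking the current value into its own feature.

```python
def lag_feature(series, lag):
    """Value observed `lag` periods earlier; None where no history exists."""
    if lag < 1:
        raise ValueError("lag must be a positive number of periods")
    return [series[i - lag] if i >= lag else None for i in range(len(series))]


def rolling_mean(series, window):
    """Mean of the `window` values preceding each period (current one excluded)."""
    if window < 1:
        raise ValueError("window must be a positive number of periods")
    return [
        sum(series[i - window:i]) / window if i >= window else None
        for i in range(len(series))
    ]


# Illustrative offtake history for one hypothetical product.
offtake = [100, 120, 90, 150, 130]
lag_1 = lag_feature(offtake, lag=1)
rolling_3 = rolling_mean(offtake, window=3)
print(lag_1)
print(rolling_3)
```

Because each feature at a given period depends only on earlier periods, a modelling engine that treats rows independently can still take account of autocorrelation in the series.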

The outputs of the offtake forecasters 220 and feature engineering and segmentation process stages 222 are provided directly to an automated supervised model selection engine 224, as well as to the feature engineering and segmentation stage 222. In this way, the model selection engine 224 is able to act on forecasts and modified forecasts from the offtake forecast stage 220 and the feature engineering and segmentation stage 222.

The automated supervised model selection engine 224 provides a secondary forecasting step which determines a primary sales forecast based on the output of the offtake forecaster. The automated supervised model selection engine 224 is configured to forecast one or more properties of future sales using modified conditions for each of the plurality of products using a plurality of sales modelling engines, and a plurality of sets of hyperparameters for each sales modelling engine. The selection of hyperparameters may be automated using a Bayesian optimization framework (for example, HyperOpt, a Python library for serial and parallel optimization, https://github.com/hyperopt/hyperopt). Such optimization can discretize the hyperparameters and remove the need for user selection.

The automated supervised model selection engine 224 is further configured to determine a confidence value using k-fold cross-validation for each combination of group of products, modelling engine and set of hyperparameters. For instance, the k-fold cross validation may be 5-fold cross validation in which each of the k-fold validations is performed on different sub-sets of the data. Each validation step is performed on different initial data.
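
A minimal sketch of how a loss value might be obtained by 5-fold cross-validation for one combination of modelling engine and set of hyperparameters is given below in Python. Several assumptions are made: the folds are contiguous and unshuffled (the disclosure does not state how folds are drawn), the loss is mean squared error, and the engine is a hypothetical stand-in (a one-feature ridge regression through the origin) rather than any of the engines named in this disclosure.

```python
from statistics import mean


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k disjoint, contiguous validation folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_loss(fit, xs, ys, hyperparameters, k=5):
    """Mean squared error on the held-out fold, averaged over k folds."""
    fold_losses = []
    for validation in k_fold_indices(len(xs), k):
        held_out = set(validation)
        train = [i for i in range(len(xs)) if i not in held_out]
        predict = fit([xs[i] for i in train], [ys[i] for i in train],
                      **hyperparameters)
        fold_losses.append(mean((predict(xs[i]) - ys[i]) ** 2
                                for i in validation))
    return mean(fold_losses)


def fit_ridge_through_origin(xs, ys, alpha):
    """Hypothetical stand-in engine; alpha is its only hyperparameter."""
    slope = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)
    return lambda x: slope * x


# Illustrative, noise-free data: the target is exactly twice the feature.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
loss_small = cross_validated_loss(fit_ridge_through_origin, xs, ys, {"alpha": 0.0})
loss_large = cross_validated_loss(fit_ridge_through_origin, xs, ys, {"alpha": 100.0})
print(loss_small, loss_large)
```

Each of the five validation steps holds out a different subset of the data, so every datapoint is used for validation exactly once. For time series data, folds that respect time order may be preferable to the simple contiguous split assumed here.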

For each identified group of products, a sales modelling engine and set of hyperparameters is selected as the primary sales forecast model 226 based on the determined confidence values associated with that group of products. Future primary sales for a particular product may be forecast using the selected primary sales model 226 and associated selected parameters.

Advantages of the method include, for example, that:

• it addresses the external drivers and quantifies the impact of changes.

• it can be used for simulating future results on the basis of changes in levers. For example, for the coming month the method is able to predict what the sales will be if the price is reduced by 5% versus what they would be if the price is cut by 10%. Sales output can be simulated for a range of different price values.

• it may be used to consider a plethora of modelling options and select an optimum model for the given data set based on a rigorous cross validation framework.

• it is “no touch” in that feature selection and hyperparameter tuning are automated, and it is scalable.
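The price simulation described in the list above can be illustrated with a short sketch. The linear `toy_sales_model` below is a hypothetical stand-in for a selected primary sales model; its coefficients, the base price and the function names are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Iterable


def toy_sales_model(price: float) -> float:
    """Hypothetical fitted model: forecast sales fall linearly as price rises."""
    return 1000.0 - 40.0 * price


def simulate_price_cuts(
    model: Callable[[float], float],
    base_price: float,
    cuts_percent: Iterable[float],
) -> Dict[float, float]:
    """Forecast sales for each percentage price cut applied to the base price."""
    return {cut: model(base_price * (1 - cut / 100.0)) for cut in cuts_percent}


# Compare no change, a 5% cut and a 10% cut for the coming period.
scenarios = simulate_price_cuts(toy_sales_model, base_price=10.0, cuts_percent=[0, 5, 10])
uplift_5_vs_10 = scenarios[10] - scenarios[5]
```

The same pattern extends to any lever that the selected model takes as an input: the lever is varied while the model and its parameters are held fixed.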

Figure 3 illustrates a further process flow diagram 300 that may be used to better understand aspects of methods described in accordance with the present disclosure. Initially, data sources are received, such as those described previously with reference to figure 2. From the received data sources, offtakes are predicted, and then in turn primary sales are predicted based on the offtake forecast, for example using the offtake forecasting engines described previously with respect to figure 2. The forecast models based on the data sources, and the modified models which incorporate aspects of the applied features, are then used to provide a forecast using, for example, the automated supervised model selection engine described previously with reference to figure 2. This acts as a baseline primary sales model.

Features, which generally relate to promotions or discounts that may be expected to result in a difference in sales, are applied, as described previously in relation to the feature engineering and segmentation stage of figure 2. The same model and parameters determined for the baseline primary sales model can then be applied to the offtake forecast modified based on the applied features. In this way, the primary sales model can be used to determine a difference in primary sales based on applied features to offtake forecasts.

Figure 4 illustrates elements of an automated supervised model selection engine 424 in the form of a further schematic block diagram. The engine 424 may be implemented as automated machine learning modelling using, for example, a plurality of existing proprietary modelling engines 444. Such engines include XGBoost (T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System”, KDD ’16, August 13-17, 2016, San Francisco, CA, USA, arXiv:1603.02754v3 [cs.LG] 10 June 2016, https://arxiv.org/pdf/1603.02754.pdf), LightGBM (G. Ke et al., “LightGBM: A Highly Efficient Gradient Boosting Decision Tree”, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree.pdf) and CatBoost (L. Prokhorenkova et al., “CatBoost: unbiased boosting with categorical features”, arXiv:1706.09516v5 [cs.LG] 20 Jan 2019, https://arxiv.org/pdf/1706.09516.pdf). The CatBoost class definition is provided below in Python code showing the parameters that may be set as hyperparameters. In this example, all of the parameters other than loss_function take the value ‘None’.
class CatBoostRegressor(iterations=None, learning_rate=None, depth=None, l2_leaf_reg=None, model_size_reg=None, rsm=None, loss_function='RMSE', border_count=None, feature_border_type=None, per_float_feature_quantization=None, input_borders=None, output_borders=None, fold_permutation_block=None, od_pval=None, od_wait=None, od_type=None, nan_mode=None, counter_calc_method=None, leaf_estimation_iterations=None, leaf_estimation_method=None, thread_count=None, random_seed=None, use_best_model=None, best_model_min_trees=None, verbose=None, silent=None, logging_level=None, metric_period=None, ctr_leaf_count_limit=None, store_all_simple_ctr=None, max_ctr_complexity=None, has_time=None, allow_const_label=None, one_hot_max_size=None, random_strength=None, name=None, ignored_features=None, train_dir=None, custom_metric=None, eval_metric=None, bagging_temperature=None, save_snapshot=None, snapshot_file=None, snapshot_interval=None, fold_len_multiplier=None, used_ram_limit=None, gpu_ram_part=None, pinned_memory_size=None, allow_writing_files=None, final_ctr_computation_mode=None, approx_on_full_history=None, boosting_type=None, simple_ctr=None, combinations_ctr=None, per_feature_ctr=None, ctr_target_border_count=None, task_type=None, device_config=None, devices=None, bootstrap_type=None, subsample=None, sampling_unit=None, dev_score_calc_obj_block_size=None, max_depth=None, n_estimators=None, num_boost_round=None, num_trees=None, colsample_bylevel=None, random_state=None, reg_lambda=None, objective=None, eta=None, max_bin=None, gpu_cat_features_storage=None, data_partition=None, metadata=None, early_stopping_rounds=None, cat_features=None, grow_policy=None, min_data_in_leaf=None, min_child_samples=None, max_leaves=None, num_leaves=None, score_function=None, leaf_estimation_backtracking=None, ctr_history_unit=None, monotone_constraints=None)

As described previously, the model selection engine then proceeds to select 446 hyperparameters for each model in an automated manner. This may be achieved by sequentially testing a plurality of different hyperparameter combinations, for example. For each tested model, an optimal set of hyperparameters may be chosen. Of these, the model that produces the best results with its respective optimised hyperparameters may be selected 448. For example, the confidence values generated using k-fold cross-validation may be used to select the modelling engine and set of hyperparameters which result in the loss value being minimized.
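The two-stage selection described above (best hyperparameters per engine, then best engine overall) can be sketched as follows. This sketch enumerates a small grid exhaustively rather than using the Bayesian optimization mentioned earlier. The engine names, the grids and the `toy_cv_loss` function are hypothetical placeholders for real modelling engines and their cross-validated losses.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, Any]


def best_params_for_engine(
    engine: str,
    grid: Dict[str, List[Any]],
    cv_loss: Callable[[str, Params], float],
) -> Tuple[Params, float]:
    """Sequentially test every hyperparameter combination for one engine."""
    keys = sorted(grid)
    candidates = [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]
    scored = [(params, cv_loss(engine, params)) for params in candidates]
    return min(scored, key=lambda item: item[1])


def select_model(
    grids: Dict[str, Dict[str, List[Any]]],
    cv_loss: Callable[[str, Params], float],
) -> Tuple[str, Params, float]:
    """Pick the engine whose optimised hyperparameters give the lowest loss."""
    per_engine = {
        engine: best_params_for_engine(engine, grid, cv_loss)
        for engine, grid in grids.items()
    }
    best_engine = min(per_engine, key=lambda name: per_engine[name][1])
    params, loss = per_engine[best_engine]
    return best_engine, params, loss


def toy_cv_loss(engine: str, params: Params) -> float:
    """Hypothetical loss surface standing in for k-fold cross-validation."""
    offset = {"engine_a": 0.30, "engine_b": 0.25}[engine]
    return offset + abs(params["learning_rate"] - 0.1) + 0.01 * abs(params["depth"] - 6)


grids = {
    "engine_a": {"learning_rate": [0.05, 0.1, 0.3], "depth": [4, 6, 8]},
    "engine_b": {"learning_rate": [0.05, 0.1, 0.3], "depth": [4, 6, 8]},
}
best_engine, best_params, best_loss = select_model(grids, toy_cv_loss)
```

In practice `cv_loss` would train the named engine with the given hyperparameters and return its cross-validated loss, and the selection would be repeated for each group of products.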

Figure 5 illustrates a schematic block diagram of a computer system 500 which may be used to implement the methods described previously. The system 500 comprises one or more processors 502 in communication with memory 504. The memory 504 is an example of a computer readable storage medium. The one or more processors 502 are also in communication with one or more input devices 506 and one or more output devices 508. The various components of the system 500 may be implemented using generic means for computing known in the art. For example, the input devices 506 may comprise a keyboard or mouse and the output devices 508 may comprise a monitor or display, and an audio output device such as a speaker.