Title:
LIFE CYCLE MANAGEMENT OF MACHINE LEARNING MODEL
Document Type and Number:
WIPO Patent Application WO/2023/144831
Kind Code:
A1
Abstract:
A computer-implemented method is provided that is performed by a node (QQ110, QQ300, QQ500) configured to perform life cycle management of at least one machine learning, ML, model for telecommunications dimensioning in a network. The method includes performing (703) one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset. The method further includes selecting (711) a second minimal informative subset of data that sufficiently models a second dataset.

Inventors:
VASUDEVAN SHRIHARI (IN)
PRASATH M J (IN)
PUTHENPURAKEL SLEEBA PAUL (IN)
Application Number:
PCT/IN2022/050055
Publication Date:
August 03, 2023
Filing Date:
January 25, 2022
Assignee:
ERICSSON TELEFON AB L M (SE)
VASUDEVAN SHRIHARI (IN)
International Classes:
G06N20/00; H04W24/02; H04W24/08
Foreign References:
US20170154280A1, 2017-06-01
Other References:
"Thesis Institut Polytechnique de Paris", 31 July 2021, article NGUYEN TUAN ANH: "Dimensioning cellular IoT network using stochastic geometry and machine learning techniques"", pages: 1 - 145, XP093083334
Attorney, Agent or Firm:
D J, Solomon et al. (IN)
Claims:
CLAIMS:

1. A computer-implemented method performed by a node (QQ110, QQ300, QQ500) configured to perform life cycle management of at least one machine learning, ML, model for telecommunications dimensioning in a network, the method comprising: performing (703) one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset; and selecting (711) a second minimal informative subset of data that sufficiently models a second dataset, the sufficiency based on performance of the current ML model refined on a second incremental dataset, and wherein the second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

2. The method of Claim 1, further comprising: incrementally adding (705) data from the second dataset to the first minimal informative subset of data to obtain a second incremental dataset; assessing (707) performance of the current ML model on the second incremental dataset; and applying (709) a stopping criterion to the incrementally adding data when the performance of the current ML model refined on the second incremental dataset satisfies one of (i) is within a threshold of performance of the current ML model trained on the entire first and second datasets combined or taken together during offline training of the current ML model, (ii) when greater than a defined percentage of data from the second dataset is needed in the incrementally adding to achieve (a) the performance on the second incremental dataset that approximates performance on the entire first and second datasets or (b) the performance saturates, during online or offline retraining of the current ML model, or (iii) when performance of the current ML model refined online on the second incremental dataset saturates.

3. The method of any of Claims 1 to 2, wherein when the refined current ML model has an acceptable performance, generating a forecast of telecommunications dimensioning with the refined current ML model.

4. The method of any of Claims 1 to 3, further comprising: determining (801) whether the current ML model or a stored ML model is successfully refined; and when (i) none of the current ML model and a stored ML model is successfully refined, creating (803) a new ML model, or (ii) the current ML model and/or at least one stored ML model is successfully refined, using (805) a refined ML model that has the best performance to generate a forecast of telecommunications dimensioning.

5. The method of Claim 4, further comprising: determining (901) (i) whether the current ML model has an acceptable performance, or (ii) whether another ML model exists from a plurality of stored ML models based on known or unknown model deployment metadata and known target performance; and when (i) the current ML model or a stored ML model has an acceptable performance, or (ii) the another ML model exists, omitting (903) refining the current ML model.

6. The method of any of Claims 2 to 5, wherein when (i) refinement of the current ML model does not result in an acceptable performance, or (ii) a periodic refinement to the current ML model does not result in an acceptable performance, further comprising: identifying (1001), from a plurality of ML models, another ML model to refine, the identifying performed with or without available model metadata, and with or without known target performance of the another ML model, and the model metadata comprising information about deployment of a model.

7. The method of Claim 6, wherein when the model metadata is available and a target performance of the another ML model is available, the identifying comprises identifying the another ML model based on a model error of the another ML model that is within the target performance of the another ML model.

8. The method of Claim 6, wherein when the ML model metadata is available and a target performance of the another ML model is not known, the identifying comprises identifying ML models from the plurality of ML models that match the ML model metadata.

9. The method of Claim 8, further comprising: performing (1003) a procedure to find a second minimal informative subset of new data that sufficiently models the second dataset based on performance of a retrained at least one ML model from the plurality of ML models.

10. The method of Claim 9, wherein the procedure comprises iterating through the identified ML models on new data until (i) performance saturation is reached, or (ii) none of the identified ML models can be refined and a new ML model is created.

11. The method of Claim 10, further comprising: designating (1005) a best performing ML model from the identified ML models or the new ML model to generate a forecast of telecommunications dimensioning.

12. The method of Claim 6, wherein when model metadata is not available, the identifying comprises (i) evaluating a performance of each of the plurality of ML models on a subset of the new data, (ii) identifying a subset of the plurality of ML models to retrain on a new dataset, the subset of the plurality of ML models identified based on a defined value that sets the number of best performing ML models to include in the subset of the plurality of ML models; (iii) using an existing ML model that is determined to be performant, and (iv) when an existing ML model is not determined to be performant, performing a procedure to find a second minimal informative subset of the second dataset that sufficiently models the second dataset based on a performance of a retrained at least one ML model from the plurality of ML models, wherein the performance comprises achieving a target performance.

13. The method of Claim 12, wherein the procedure comprises iterating through the subset of the plurality of ML models on the new data until (i) the target performance is reached, or (ii) none of the subset of the plurality of ML models can be refined and a new ML model is created.

14. The method of Claim 13, further comprising: designating (1007) a best performing ML model from the subset of the plurality of ML models or the new ML model to generate a forecast of telecommunications dimensioning.

15. The method of any of the Claims 1 to 14, wherein the at least one ML model comprises a linear regression model, and the telecommunications dimensioning comprises estimating resources for a set of telecommunications features that capture telecommunications network traffic behavior.

16. The method of Claim 15, wherein the set of telecommunications features comprises at least one of time, de-registrations, initial registrations, re-registrations, call duration, answered calls, call attempts, and total short message services, SMS.

17.
The method of any of Claims 1 to 16, further comprising: initially training (701) the current ML model, the initial training comprising: selecting a seed dataset from the first dataset comprising performance data for network traffic; incrementally adding data from the remainder of the first dataset to the seed dataset to obtain a first incremental dataset; assessing performance of the current ML model on the first incremental dataset; and applying a stopping criterion to the incrementally adding data when the performance of the current ML model trained on the first incremental dataset satisfies one of (i) is within a threshold of performance over the entire first dataset during training of the current ML model, or (ii) when performance of the current ML model trained on the first incremental dataset saturates during online training of the current ML model.

18. The method of any of Claims 1 to 17, further comprising: deleting (1101), on a defined periodic basis, unused and/or out-of-date datasets and/or ML models from a ML model repository comprising the current ML model and a plurality of additional ML models; and maintaining (1103) in the ML model repository at least one dataset and at least one ML model that is in use and/or is up-to-date.

19. The method of any of Claims 1 to 18, further comprising: compressing (1201) the first dataset and/or the second dataset for a stored ML model into a compressed form on a defined periodic basis, the compressing comprising finding a minimal informative subset of accumulated datasets for the stored ML model.

20. A node (QQ110, QQ300, QQ500) configured to perform life cycle management of at least one machine learning, ML, model for telecommunications dimensioning in a network, the node comprising: processing circuitry (QQ302, QQ504); memory (QQ304, QQ508) coupled with the processing circuitry, wherein the memory includes instructions that when executed by the processing circuitry cause the node to perform operations comprising: performing one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset; and selecting a second minimal informative subset of data that sufficiently models a second dataset, the sufficiency based on performance of the current ML model refined on a second incremental dataset, and wherein the second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

21. The node of Claim 20, the operations further comprising any of the operations of Claims 2 to 19.

22.
A computer program comprising computer code to be executed by a node (QQ110, QQ300, QQ500), the node configured to perform life cycle management of at least one machine learning, ML, model for telecommunications dimensioning, to perform operations comprising: performing one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset; and selecting a second minimal informative subset of data that sufficiently models a second dataset, the sufficiency based on performance of the current ML model refined on a second incremental dataset, and wherein the second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

23. The computer program of Claim 22, the operations further comprising any of the operations of Claims 2 to 19.

24. A computer program product comprising a non-transitory storage medium (QQ304) including program code to be executed by processing circuitry (QQ302, QQ504) of a node (QQ110, QQ300, QQ500) configured to perform life cycle management of at least one machine learning, ML, model for telecommunications dimensioning, whereby execution of the program code causes the node to perform operations comprising: performing one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset; and selecting a second minimal informative subset of data that sufficiently models a second dataset, the sufficiency based on performance of the current ML model refined on a second incremental dataset, and wherein the second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

25. The computer program product of Claim 24, the operations further comprising any of the operations of Claims 2 to 19.

Description:
LIFE CYCLE MANAGEMENT OF MACHINE LEARNING MODEL

TECHNICAL FIELD

[0001] The present disclosure relates generally to life cycle management for a machine learning (ML) model for telecommunications dimensioning in a network, and related methods and apparatuses.

BACKGROUND

[0002] Dimensioning of telecommunications cloud infrastructure (e.g., virtual network function (VNF)/computing processing unit (CPU)/memory (MEM)) may be critical for pre-sales support (e.g., contract formulation), greenfield and brownfield deployment, and as an ongoing activity to cater to increases in subscriber growth and introduction of new services with technology evolution. Introduction of network function virtualization (NFV) may have enabled the expansion of network capacity in a shorter period of time. Some operators may have deployed their VNFs in a NFV architecture predominantly using an on-premises telecommunications cloud infrastructure. The NFV architecture may have enabled such an operator to scale down and scale up faster than the traditional physical network function (PNF) networks, where hardware has to be procured, commissioned, and connected in order to be made available for network application deployment.

[0003] Although NFV may have improved the expansion and scalability of telecommunications networks over traditional PNF networks, the underlying infrastructure usage may not be optimized. Efforts have been made to understand dimensioning requirements of a telecommunications network based on historical data from performance management (PM) counters, e.g., by monitoring network behavior and other factors.

SUMMARY

[0004] ML may be used for telecommunications dimensioning. Life cycle management of ML models for telecommunications dimensioning, however, is lacking in its ability to handle drifts in data and/or predictions (e.g., concept drift). For example, when drift is detected, a ML model may need to be retrained or a new model may need to be deployed. Life cycle management of ML models for telecommunications dimensioning may not adequately identify ML models to update with new data, e.g., when deployment metadata is unavailable to identify which ML model(s) to update with new data. ML model identification, thus, can be a challenge. Additionally, re-training of a ML model or deployment of a new ML model using an entire new dataset may not be feasible (e.g., due to limited computational resources) and/or may be time-consuming and a wasteful use of CPU/memory resources. Thus, CPU-efficient, memory-efficient, and/or time-efficient retraining of a ML model, or deployment of a new ML model, using less than an entire new dataset may be lacking.

[0005] Potential advantages provided by various embodiments of the present disclosure may include that the method includes operations that may retrain a ML model or deploy a new ML model using less than an entire new dataset, which may result in CPU-efficient, memory-efficient, and/or time-efficient retraining or deployment of a ML model. Additionally, the method further includes operations to identify one or more ML models to update with a minimal subset of new data, e.g., including when deployment metadata may be unavailable or insufficient, which may resolve challenges in ML model identification.

[0006] Further potential advantages may include efficiency based on the method using a current, existing model when possible; refining a current ML model and using the refined ML model; or when no existing ML model is compatible, creating a new ML model.
Challenges in ML model identification also may be resolved based on the method including ML model selection from a repository of ML models, with or without deployment metadata or a target performance metric.

[0007] In various embodiments, a computer-implemented method is provided that is performed by a node configured to automatically perform life cycle management of at least one ML model for telecommunications dimensioning in a network. The method includes performing one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset. The method further includes selecting a second minimal informative subset of data that sufficiently models a second dataset. The sufficiency is based on performance of the current ML model refined on a second incremental dataset. The second incremental dataset includes data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

[0008] In some embodiments, the method further includes incrementally adding data from the second dataset to the first minimal informative subset of data to obtain a second incremental dataset. The method further includes assessing performance of the current ML model on the second incremental dataset. The method further includes applying a stopping criterion to the incrementally adding data when the performance of the current ML model refined on the second incremental dataset satisfies one of (i) is within a threshold of performance of the current ML model trained on the entire first and second datasets combined or taken together during offline training of the current ML model, (ii) when greater than a defined percentage of data from the second dataset is needed in the incrementally adding to achieve (a) the performance on the second incremental dataset that approximates performance on the entire first and second datasets or (b) the performance saturates, during online or offline retraining of the current ML model, or (iii) when performance of the current ML model refined online on the second incremental dataset saturates.

[0009] In some embodiments, the method further includes determining whether the current ML model or a stored ML model is successfully refined. The method further includes when (i) none of the current ML model and a stored ML model is successfully refined, creating a new ML model, or (ii) the current ML model and/or at least one stored ML model is successfully refined, using a refined ML model that has the best performance to generate a forecast of telecommunications dimensioning.

[0010] In some embodiments, the method further includes determining (i) whether the current ML model has an acceptable performance, or (ii) whether another ML model exists from a plurality of stored ML models based on known or unknown model deployment metadata and known target performance. The method further includes when (i) the current ML model or a stored ML model has an acceptable performance, or (ii) the another ML model exists, omitting refining the current ML model.
[0011] In some embodiments, when (i) refinement of the current ML model does not result in an acceptable performance, or (ii) the periodic refinement to the current ML model does not result in an acceptable performance, the method further includes identifying, from a plurality of ML models, another ML model to refine, the identifying performed with or without available model metadata, and with or without known target performance of the another ML model, and the model metadata comprising information about deployment of a model.

[0012] In some embodiments, the method further includes performing a procedure to find a second minimal informative subset of new data that sufficiently models the second dataset based on performance of a retrained at least one ML model from the plurality of ML models.

[0013] In some embodiments, the method further includes designating a best performing ML model from the identified ML models or the new ML model to generate a forecast of telecommunications dimensioning.

[0014] In some embodiments, the method further includes designating a best performing ML model from the subset of the plurality of ML models or the new ML model to generate a forecast of telecommunications dimensioning.

[0015] In some embodiments, the method further includes initially training the current ML model. The initial training includes selecting a seed dataset from the first dataset comprising performance data for network traffic; incrementally adding data from the first dataset to the seed dataset to obtain a first incremental dataset; assessing performance of the current ML model on the first incremental dataset; and applying a stopping criterion to the incrementally adding data when the performance of the current ML model trained on the first incremental dataset satisfies one of (i) is within a threshold of performance over the entire first dataset during offline training of the current ML model, or (ii) when performance of the current ML model trained on the first incremental dataset saturates during online training of the current ML model.

[0016] In some embodiments, the method further includes deleting, on a defined periodic basis, unused and/or out-of-date datasets and/or ML models from a ML model repository comprising the current ML model and a plurality of additional ML models. The method further includes maintaining in the ML model repository at least one dataset and at least one ML model that is in use and/or is up-to-date.

[0017] In some embodiments, the method further includes compressing the first dataset and/or the second dataset for a stored ML model into a compressed form on a defined periodic basis. The compressing includes finding a minimal informative subset of accumulated datasets for the stored ML model.

[0018] In various embodiments, a node configured to perform life cycle management of at least one ML model for telecommunications dimensioning in a network is provided. The node includes processing circuitry; and memory coupled with the processing circuitry. The memory includes instructions that when executed by the processing circuitry cause the node to perform operations.
The operations include performing one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset. The operations further include selecting a second minimal informative subset of data that sufficiently models a second dataset, the sufficiency based on performance of the current ML model refined on a second incremental dataset, and wherein the second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

[0019] In various embodiments, a computer program including program code to be executed by a node is provided. The node is configured to perform life cycle management of at least one ML model for telecommunications dimensioning. Execution of the program code causes the node to perform operations. The operations include performing one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset. The operations further include selecting a second minimal informative subset of data that sufficiently models a second dataset, the sufficiency based on performance of the current ML model refined on the second incremental dataset, and wherein the second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

[0020] In various embodiments, a computer program product including a non-transitory storage medium including program code to be executed by processing circuitry of a node is provided. Execution of the program code causes the node to perform operations. The operations include performing one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset. The operations further include selecting a second minimal informative subset of data that sufficiently models a second dataset, the sufficiency based on performance of the current ML model refined on the second incremental dataset, and wherein the second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

BRIEF DESCRIPTION OF DRAWINGS

[0021] The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this application, illustrate certain non-limiting embodiments of inventive concepts.
In the drawings:

[0022] Figure 1 is a block diagram illustrating an example embodiment of a node and other components configured to perform ML-based telecommunications dimensioning using at least one ML model, and can perform life cycle management of at least one ML model in accordance with some embodiments of the present disclosure;

[0023] Figure 2 is a flowchart illustrating operations of a node in accordance with some embodiments of the present disclosure;

[0024] Figures 3A-3C are flowcharts illustrating operations of a node in accordance with some embodiments of the present disclosure;

[0025] Figures 4-5 are flowcharts illustrating operations of a node in accordance with some embodiments of the present disclosure;

[0026] Figures 6A-6B are flowcharts illustrating operations of a node in accordance with some embodiments of the present disclosure;

[0027] Figures 7-12 are flowcharts illustrating operations of a node in accordance with some embodiments of the present disclosure;

[0028] Figure 13 is a plot illustrating CPU load for a dataset subject to offline learning in accordance with some embodiments of the present disclosure;

[0029] Figure 14 is a plot illustrating a training root mean squared error (RMSE) for the data of Figure 13 in accordance with some embodiments of the present disclosure;

[0030] Figure 15 is a plot illustrating a RMSE for data subject to online learning in accordance with some embodiments of the present disclosure;

[0031] Figure 16 is a plot illustrating a RMSE standard deviation for the data of Figure 15 in accordance with some embodiments of the present disclosure;

[0032] Figure 17 is a block diagram of a communication system in accordance with some embodiments of the present disclosure;

[0033] Figure 18 is a block diagram of a node in accordance with some embodiments of the present disclosure; and

[0034] Figure 19 is a block diagram of a virtualization environment in accordance with some embodiments of the present disclosure.

DETAILED DESCRIPTION

[0035] Inventive concepts will now be described more fully hereinafter with reference to the accompanying drawings, in which examples of embodiments of inventive concepts are shown. Inventive concepts may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of present inventive concepts to those skilled in the art. It should also be noted that these embodiments are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present/used in another embodiment.

[0036] The following description presents various embodiments of the disclosed subject matter. These embodiments are presented as teaching examples and are not to be construed as limiting the scope of the disclosed subject matter. For example, certain details of the described embodiments may be modified, omitted, or expanded upon without departing from the scope of the described subject matter.

[0037] The following explanation of potential problems with some approaches is a present realization as part of the present disclosure and is not to be construed as previously known by others.

[0038] Dimensioning, as a core capability, may be expected to transform telecommunications operations from being an infrequent back-end sales/deployment process to a proactive resource estimation solution for managing network deployments.
Dimensioning may function both offline (e.g., deployments) and online (e.g., managed service, e.g., in an auto-scaling context); at low frequency (e.g., stable predictable traffic) or high frequency (e.g., spikes); and centrally (e.g., vendor provided cloud), closer to the network edge (e.g., a communication service provider (CSP) deploys some resources on-premises with limited resources, while relying on a vendor for other resources and dimensioning service), or at a customer edge, as fifth generation (5G) technology evolves.

[0039] Presently, ML and data-driven dimensioning may be in nascent stages; and dimensioning models may be learned offline on historical data.

[0040] For a ML production system, life cycle management of ML models being developed and/or applied may be a needed feature to handle drifts in data and/or predictions (e.g., concept-drift). When drift is detected, it may be desirable to learn a new ML model to maintain performance levels. This new ML model may be learned from scratch or may be learned from a previous model/dataset as a starting point. In the context of zero-touch ML telecommunications dimensioning systems operating in live network environments, it may be desired that these operations be done very rapidly. For example, it may be desired that the entire life cycle management process (e.g., ML model creation/identification, training/retraining, deploying/re-deploying) is achieved on-the-fly, with minimal latency.

[0041] Life cycle management for ML models presently may be inefficient and not suited to anticipate changes in network behavior and/or in cloud-native data loads. For example, in some approaches, all available data may be used for ML model training or re-training. Existing approaches also may not adequately address efficient ML model selection (e.g., for retraining) from a repository of ML models, particularly when ML model metadata (e.g., deployment metadata) is unknown and/or the number of ML models in the repository is large. For example, if multiple ML models (e.g., closely related ML models) can be improved (e.g., re-trained) based on incoming data, existing approaches may pick only one ML model, which may deprive other ML models of adapting to a data or concept-drift, and may result in an explosion in the number of ML models in a ML model repository.

[0042] Thus, existing approaches for ML model training, retraining, deployment, and/or model swapping (e.g., re-deployment) may not solve, or may not adequately address:

(1) Efficient training, re-training, deployment, and/or re-deployment via minimal new-data selection. Using an entire new dataset to train, re-train, deploy, and/or re-deploy a ML model may not be feasible (e.g., on-premises low-resource cloud infrastructure, such as limited computational resources), or it may be time-consuming and wasteful. Thus, it may be desirable to compress the new dataset sufficiently to enable memory-efficient and/or fast training, re-training, deployment, and/or re-deployment of a ML model(s).

(2) Identification of one or more ML models to update with new data. When a ML model needs to learn on a new dataset, training from a closely related prior dataset/model may make training fast. However, deployment metadata may be unavailable or insufficient to resolve which one or more of multiple dimensioning ML models to update with new data. A ML model repository may also have numerous models. Thus, ML model identification in these circumstances can be a challenge.
[0043] Figure 7 is a flowchart illustrating computer-implemented operations of a node according to some embodiments of the present disclosure. The node can be network node QQ110, QQ300, QQ500 as discussed further herein that is configured to perform life cycle management of at least one ML model for telecommunications dimensioning in a network. The method includes performing (703) one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, wherein the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset. The method further includes selecting (711) a second minimal informative subset of data that sufficiently models a second dataset. The sufficiency is based on performance of the current ML model refined on a second incremental dataset. The second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

[0044] Various operations from the flow chart of Figure 7 may be optional with respect to some embodiments of nodes and related methods. For example, as discussed further herein, operations of blocks 701 and 705-709 of Figure 7 may be optional.

[0045] Further, as discussed further herein, in some embodiments, the method further includes use of a stopping criterion that may be based on life cycle management context in addition to performance saturation. In some embodiments, the stopping criterion applies when performance saturates, or when more than a defined amount (e.g., a percentage) of a dataset is needed.

[0046] Additionally, as discussed further herein, the method further includes identifying a ML model(s) with or without deployment metadata; and, in some embodiments, the method includes offline (e.g., deployments) or online (e.g., managed services) performance.

[0047] Potential advantages provided by various embodiments of the present disclosure may include efficient use of CPU and/or memory resources, based on the method either training a current ML model on a first minimal informative subset of data, or proactively refining a current ML model to follow a drift in a subsequent dataset, and selecting a second minimal informative subset of data that sufficiently models a second dataset. For ease of discussion only, the procedure to select a minimal informative subset of data is also referred to herein as an "active learning" procedure. Further, based on inclusion of a stopping criterion that is based on life cycle management context, efficiency of an active learning process may be improved and, thus, may make the method suitable for on-the-fly operations.

[0048] Additional potential advantages include that ML model identification may be more efficient based on the identification being performed with or without deployment metadata. As a consequence, the method may use existing ML models whenever possible, may refine a subset of ML models if needed, and/or may retrain a ML model from scratch only when all existing ML models fail. Further, the method may improve efficiency based on each of these operations using a minimal informative subset of data.
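For illustration only, the following Python sketch shows one way such an active-learning selection of a minimal informative subset could look for a linear (e.g., ordinary least squares, OLS) dimensioning model, using prediction variance as the acquisition function and stopping when performance is competitive with training on the entire dataset. It is a sketch under assumptions, not the disclosed implementation: the function names, the choice of acquisition function, and the default threshold are all illustrative.

    import numpy as np

    def fit_ols(X, y):
        # Least-squares fit; returns the coefficient vector.
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return coef

    def rmse(coef, X, y):
        return float(np.sqrt(np.mean((X @ coef - y) ** 2)))

    def predictive_variance(X_train, X_pool):
        # OLS prediction uncertainty (up to the noise scale): x^T (X^T X)^-1 x
        XtX_inv = np.linalg.pinv(X_train.T @ X_train)
        return np.einsum('ij,jk,ik->i', X_pool, XtX_inv, X_pool)

    def select_minimal_subset(X_seed, y_seed, X_other, y_other, T1=0.05):
        # Reference performance achievable with the entire (combined) dataset.
        X_all = np.vstack([X_seed, X_other])
        y_all = np.concatenate([y_seed, y_other])
        perf_comb = rmse(fit_ols(X_all, y_all), X_all, y_all)

        X_acq, y_acq = X_seed.copy(), y_seed.copy()
        coef = fit_ols(X_acq, y_acq)
        pool = list(range(len(X_other)))
        while pool:
            # Acquire the remaining point with maximum prediction uncertainty.
            i = pool.pop(int(np.argmax(predictive_variance(X_acq, X_other[pool]))))
            X_acq = np.vstack([X_acq, X_other[i:i + 1]])
            y_acq = np.append(y_acq, y_other[i])
            coef = fit_ols(X_acq, y_acq)
            # Stop when performance is competitive with the full dataset.
            if abs(rmse(coef, X_all, y_all) - perf_comb) <= T1:
                break
        return X_acq, y_acq, coef

In practice, the acquired subset (X_acq, y_acq) plays the role of the minimal informative subset described above, and the refit coefficients correspond to the revised model.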
[0049] A further potential advantage may be that as telecommunications technology evolves (e.g., 5G technology), the method may be scalable (e.g., ML model agnostic) based on the method performing offline or online and at a variety of locations (e.g., centrally, at a network edge, or at a customer edge).

[0050] Figure 1 is a block diagram illustrating an example embodiment of a node and other components configured to perform ML-based telecommunications dimensioning using at least one ML model, and the node can perform life cycle management of at least one ML model in accordance with some embodiments of the present disclosure. As discussed further herein, the node can be located at a variety of locations, e.g., centrally, at a network edge, or at a customer edge. The node 107 includes components that perform operations of a life cycle management (LCM) controller that can work with a ML model registry 111 (e.g., a memory, a database, a repository, etc.). The LCM controller of node 107 can (1) select 107a a minimal informative subset of data from incoming data 101; and (2) identify 107b a ML model(s) from ML registry 111 (and optionally from ML model identification deployment metadata 105), so as to use an existing ML model(s) when possible, retrain/refine an existing ML model(s) when needed, or create/train a new ML model.

[0051] The incoming data 101 includes data features extracted from PM counters to estimate resources required (e.g., amount of CPU, MEM, or number of VNFs). The method may be an offline process (e.g., a deployment) or may be a deployed network function that functions online and automatically. In the former case, the ML model metadata (e.g., deployment metadata) may be known; in the latter case, it may be likely that such data is unavailable. Based on the application context and input data 101 received, the LCM controller of node 107 determines a suitable ML model from the ML model registry 111a to use as-is, train, re-train, deploy, or re-deploy 103, execute in the execution environment, and serve 109 outcomes. In training, re-training, deployment, and re-deployment 103, the LCM controller of node 107 may use an offline active learning procedure to pick the minimal informative subset of data 107a to train 103 the ML model if the application context is, e.g., a standalone deployment. In the case of a deployed network function, the LCM controller of node 107 may use an online ML approach to achieve the same objective. Model registry 111 can include all model related artifacts (e.g., model data 111b, performance metrics 111c, deployment metadata 111d). A deployed ML model can continue serving dimensioning outcomes for incoming (e.g., test) data.

[0052] The following is a table of nomenclature referenced in the present disclosure:

[0053] Figure 2 is a flowchart illustrating operations of a node for initial ML model training and deployment 201 in accordance with some embodiments of the present disclosure. In block 203, a dataset (D) and deployment metadata (DM) are obtained. In block 205, the dataset (D) is preprocessed (e.g., remove outliers, scale data, etc.). In block 207, the dataset (D) is partitioned into a minimal seed subset (D-SEED) (e.g., pick a number of data points, equal to the number of unknown model coefficients, with the highest and lowest feature values by max norm (e.g., the square root of the sum of the squares of the feature values)) and a remainder data subset (D-OTHER). A sketch of one possible reading of this partitioning follows.
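The following is a minimal sketch of the seed partitioning of block 207, under one plausible reading (take the points with the k largest and the k smallest feature-vector norms, where k is the number of unknown model coefficients), assuming NumPy arrays; the function name is illustrative.

    import numpy as np

    def partition_seed(X, n_coefficients):
        # Max norm of each point: the square root of the sum of squared features.
        norms = np.linalg.norm(X, axis=1)
        order = np.argsort(norms)
        # D-SEED: points with the lowest and highest norms; D-OTHER: the rest.
        seed_idx = np.concatenate([order[:n_coefficients], order[-n_coefficients:]])
        other_idx = np.setdiff1d(np.arange(len(X)), seed_idx)
        return X[seed_idx], X[other_idx]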
[0054] Still referring to Figure 2, in block 209, an active learning procedure is called to obtain a minimal informative subset of data. In the context of initial ML model training and deployment, the active learning procedure 209 may be called offline or online (e.g., depending on an application context), as discussed further herein. The procedure 209 is called with inputs, which include a minimal seed subset (e.g., D-SEED), a number of unknown ML model coefficients (e.g., NULL coefficients), a remainder data subset (e.g., D-OTHER), a target performance threshold, an acceptance threshold, and retrain = FALSE (e.g., when initial ML model training is being performed). An initial minimal seed subset (D-SEED) can be intelligently (e.g., domain-knowledge, heuristic) chosen. As discussed further herein, in a telecommunications dimensioning context, data points with the highest and lowest input feature values may be used.

[0055] The active learning procedure 209 incrementally picks additional points from the remainder data (D-OTHER) to refine/retrain a model (M) until a stopping criterion is met. The active learning procedure 209 incrementally picks the additional points that maximize an acquisition function, as discussed further herein. Telecommunications dimensioning can be a linear regression problem. Thus, as discussed further herein, an acquisition function can be used to pick the additional points whose prediction uncertainty is maximum, as a technique for identifying the minimal informative subset of data given the ML model learned from currently available data.

[0056] Still referring to Figure 2, the acquired points (D-NEW) together with the minimal seed subset (D-SEED) constitute the minimal informative subset of data, which is returned along with the corresponding revised ML model (M-NEW) trained on the minimal informative subset of data, and performance metrics (PERF) of the ML model M-NEW (PERF-NEW) on the minimal informative subset of data. Performance metrics may be application and situation dependent; and they may be hold-out metrics or cross-validation metrics. Coefficients of the revised ML model M-NEW are also obtained and are denoted as COEF-NEW (e.g., in initial training, coefficients are revised because they were NULL initially).

[0057] In block 211, the deployment metadata (DM), the minimal informative subset of data (D-NEW + D-SEED), the pre-processed dataset (PP), M-NEW, COEF-NEW, and PERF-NEW are stored (e.g., in model registry 111). In block 213, the ML model M-NEW is ready to be deployed and to serve (or continue to serve) predictions.

[0058] Still referring to Figure 2, if a prior ML model exists (e.g., in model registry 111) that is associated with a dataset D and is being updated with new data, then D-SEED is the existing dataset of the ML model (that is, D), and D-NEW is the extracted minimal informative data subset from the new data, via active learning procedure 209. The updated ML model associated with a new incremental dataset (D-SEED + D-NEW) is updated in the model registry and/or deployed.

[0059] Figures 3A-3C are flowcharts illustrating operations of a node in accordance with some embodiments of the present disclosure.

[0060] Referring first to Figure 3A, Figure 3A illustrates operations of a node to select the active learning procedure 209 based on an application context. In block 301, application context is obtained for the active learning procedure 209.
In block 303, the node determines whether the application context is critical (e.g., time critical such as for a live network deployment, a large dataset, or a user preference). If no, in block 305, the active learning procedure 209 is called offline. If yes, in block 307, the active learning procedure 209 is called online.

[0061] A difference between the offline and the online active learning procedure 209 is the manner of application and the stopping criterion used to stop the procedure 209. Both the offline and the online active learning procedure 209 incrementally acquire informative new data points and refine a current ML model until a stopping criterion is met. Data can be acquired one at a time or in batches.

[0062] Referring next to Figure 3B, Figure 3B illustrates operations of the node to call the active learning procedure 209. In some embodiments, the active learning procedure 209 can be performed when the method is performed in a one-off application (e.g., dimensioning for a customer deployment). A dataset may be received and, from the dataset, performance metrics of interest using the entire (also referred to herein as a combined (COMB)) dataset are computed. The procedure 209 finds a minimal informative subset of data that best approximates (e.g., per a user-provided threshold) the performance with the entire (or combined) dataset. In embodiments using the procedure 209, the procedure 209 tracks an LCM application context stopping criterion. The stopping criterion uses a fraction of the acquired new-data in relation to the entire new-data as a technique for early-stopping the process 209 and for post-processing a completed process 209 to see if the result is valid (e.g., if more than a certain percentage of the data is needed, a Subject Matter Expert (SME), for example, may be flagged to check the appropriateness of the ML model for the new data).

[0063] Referring to the operations of Figure 3B, the inputs to active learning procedure 209 include D-SEED, M, COEF, D-OTHER, RETRAIN, and thresholds T1 and T2. T1 and T2 have default values defined; however, calls to active learning procedure 209 can override these values of T1 and T2 if desired.

[0064] In block 215, the node trains a ML model M-COMB using both D-SEED and D-OTHER. Performance is recorded as PERF-COMB. The method proceeds to block 217.

[0065] In block 217, D-NEW is set to an empty list and PERF-NEW is set as empty, and the size of D-OTHER is recorded as D-SIZE. The method proceeds to block 219.

[0066] In block 219, the operations of blocks 221-231 are repeated while there are points in D-OTHER and competitive performance relative to PERF-COMB is not yet achieved.

[0067] In block 221, for each point in D-OTHER, an acquisition function value is computed given existing ML model specifics. Example embodiments of acquisition functions are discussed further herein. The method proceeds to block 223.

[0068] In block 223, the datum is selected that maximizes the acquisition function. The method proceeds to block 225.

[0069] In block 225, the datum from D-OTHER is popped and appended to D-NEW. The method proceeds to block 227.

[0070] In block 227, depending on available inputs, either the ML model is refined on D-NEW (e.g., fine-tuned from COEF) or a new ML model is retrained afresh (e.g., an ordinary least-squares (OLS) model) on D-SEED and D-NEW combined. The resulting ML model is denoted as M-NEW and its performance is computed as PERF-NEW. The method proceeds to block 229.
[0071] In block 229, the node determines whether PERF-NEW is within T1 of PERF-COMB. If no, in block 231, the node determines whether the size of D-NEW is greater than the T2 size (e.g., a percentage) of D-SIZE and whether retrain = TRUE. If the determination of block 231 is no, the operations of blocks 219-229 are repeated. If the determination of block 231 is yes, the operation of block 235 is performed.

[0072] If the operation of block 229 results in yes, in block 233, the node determines whether the size of D-NEW is greater than the T2 size (e.g., a percentage) of D-SIZE and whether retrain = TRUE. If yes, in block 235, PERF-NEW, D-NEW, and M-NEW are set to empty and process 209 continues to block 237. If the determination of block 233 is no, the active learning procedure 209 continues to block 237.

[0073] In block 237, active learning procedure 209 returns PERF-NEW, D-SEED + D-NEW, and M-NEW. For complex ML models and/or large datasets and/or tight latency/resource constraints, the refining option (that is, fine-tuning from existing coefficients) may be more viable. For simple ML models and small data-sizes, a retraining of an existing ML model from scratch with the additional data may be performed.

[0074] Referring next to Figure 3C, Figure 3C illustrates operations of the node to call the active learning procedure 209. In some embodiments, the online active learning procedure 209 can be performed when the components of the node that perform the method are deployed as a network function to operate automatically and without human intervention. The online active learning procedure 209 may be compute-efficient, zero-touch, and may assume unavailability of full-dataset performance metrics.

[0075] Thus, in embodiments using the online active learning procedure 209, the active learning procedure 209 tracks two stopping criteria: (1) the saturation of one or more performance metrics, as captured by the standard deviation of a moving window of recent metric values, and (2) the LCM application context inspired stopping criterion that uses a fraction (e.g., a percentage) of the acquired new-data in relation to the entire new-data as a technique for early-stopping the active learning process 209 and post-processing a completed active learning process 209 to see if the result is valid. The LCM-context inspired stopping criterion is based on a premise that retraining an existing ML model assumes compatible data. This premise can be reasonably assumed to be violated if more than a (e.g., user-set) threshold percentage of the new-data needs to be acquired. In an example embodiment, if such a threshold is exceeded, the active learning procedure 209 will return an empty/NULL outcome triggering the training of a new ML model. Thus, the LCM application-context based stopping criterion may enable the active learning process 209 to be compute-resource efficient and thus enable on-the-fly ML model swapping. This stopping criterion can work for both supervised and unsupervised ML models even if telecommunications dimensioning is itself a supervised regression problem.

[0076] Referring to the operations of Figure 3C, the inputs to active learning procedure 209 include D-SEED, M, COEF, D-OTHER, RETRAIN, W (a value for the number of entries in a performance list (PERF-LIST)), and thresholds T1 and T2. W, T1, and T2 have default values defined; however, calls to active learning procedure 209 can override these values of W, T1, and T2 if desired.
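As a concrete illustration of the two online stopping checks just described, the following is a minimal Python sketch. The default values of W, T1, and T2 are placeholders, not values from the disclosure, and the function names are illustrative.

    import statistics

    def perf_saturated(perf_list, W=5, T1=1e-3):
        # Saturation check: standard deviation of the W most recent entries
        # of PERF-LIST is within threshold T1.
        if len(perf_list) < W:
            return False
        return statistics.stdev(perf_list[-W:]) <= T1

    def lcm_early_stop(d_new_size, d_size, retrain, T2=0.5):
        # LCM-context check: while retraining, abort (empty/NULL outcome)
        # if more than a fraction T2 of the new data D-OTHER was acquired.
        return retrain and d_new_size > T2 * d_size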
[0077] In block 217, D-NEW and PERF-LIST are set to empty lists and PERF-NEW is set as empty. The size of D-OTHER is recorded as D-SIZE. The method proceeds to block 309.

[0078] In block 309, the operations of blocks 221-315 are repeated while there are points in D-OTHER and performance on D-SEED and D-NEW has not saturated.

[0079] In block 221, for each point in D-OTHER, an acquisition function value is computed given existing ML model specifics. Example embodiments of acquisition functions are discussed further herein. The method proceeds to block 223.

[0080] In block 223, the datum is selected that maximizes the acquisition function. The method proceeds to block 225.

[0081] In block 225, the datum from D-OTHER is popped and appended to D-NEW. The method proceeds to block 227.

[0082] In block 227, depending on available inputs, either the ML model is refined (e.g., fine-tuned from COEF) on D-NEW or another ML model is retrained (e.g., an OLS model) on D-SEED and D-NEW combined. The resulting ML model is denoted as M-NEW and its performance is computed as PERF-NEW. The method proceeds to block 311.

[0083] In block 311, PERF-NEW is appended to PERF-LIST. The method proceeds to block 313.

[0084] In block 313, the node determines whether PERF-LIST is greater than or equal to W entries. If no, in block 315, the node determines whether the size of D-NEW is greater than T2 (e.g., a percentage) of D-SIZE and RETRAIN = TRUE. If the determination of block 315 is no, the operations of blocks 309 to 313 are repeated. If the determination of block 315 is yes, the active learning procedure 209 continues to block 235.

[0085] If the determination of block 313 is yes, the active learning procedure 209 continues to block 317. In block 317, a saturation metric (PERF-SAT) on a performance measure is computed using W of the most recent entries of PERF-LIST. The method proceeds to block 319.

[0086] In block 319, the node determines whether PERF-SAT is within threshold T1. If no, the active learning procedure 209 continues to block 315. If yes, the active learning procedure 209 continues to block 233.

[0087] In block 233, the node determines whether the size of D-NEW is greater than the T2 size (e.g., a percentage) of D-SIZE and whether retrain = TRUE. If yes, in block 235, PERF-NEW, D-NEW, and M-NEW are set to empty and active learning process 209 continues to block 237. If the determination of block 233 is no, the active learning procedure 209 continues to block 237.

[0088] In block 237, active learning procedure 209 returns PERF-NEW, D-SEED + D-NEW, and M-NEW.

[0089] Model identification will now be discussed further. Depending on the application context, model metadata (also referred to herein as "deployment metadata") may or may not be available. Existing approaches may operate on at most one ML model at any given time. A potential problem with this approach may be that in data drift scenarios, multiple relevant ML models may exist. The method of some embodiments of the present disclosure includes proactively refining multiple (e.g., all) relevant models, which may result in an efficient LCM process, zero-touch operation, and minimization of stored ML model instances (e.g., in a model registry). Relevant ML models include ML models that follow the data and any drift in the data. As a consequence, the method may enable, in the telecommunications domain, zero-touch cloud-native operation that handles multiple (e.g., all) relevant ML models in the LCM context.
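The proactive refinement of all relevant models can be sketched as follows. This is an illustrative skeleton under assumptions: the registry is modeled as a simple list of dicts, and refine_fn stands in for a refinement procedure such as active learning procedure 209, returning None when the new data is incompatible (the empty/NULL outcome); all names are hypothetical.

    def refine_all_relevant(registry, metadata, new_data, refine_fn, error_fn):
        # Refine every stored model whose deployment metadata matches, then
        # return the best refined model, or None if none could be refined
        # (in which case the caller would create a new model).
        relevant = [entry for entry in registry if entry["metadata"] == metadata]
        refined = []
        for entry in relevant:
            result = refine_fn(entry["model"], new_data)
            if result is not None:        # refinement succeeded on a minimal subset
                entry["model"] = result   # update the registry entry in place
                refined.append(result)
        if not refined:
            return None                   # trigger creation of a new model
        # Designate the best performing refined model to serve predictions.
        return min(refined, key=lambda model: error_fn(model, new_data))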
[0090] In some embodiments, in a ML model LCM context, three scenarios may occur, which are addressed by embodiments of the method of the present disclosure. The three scenarios are described further herein with reference to the flowcharts of Figures 4, 5, 6A, and 6B.

[0091] Referring first to Figure 4, in this embodiment, deployment metadata is known and a target performance metric threshold for determining compatibility of data is known. This embodiment includes the minimal informative subset of data discussed herein, and only refines/retrains a ML model if needed. The ML model may be used as-is, with refinement, or after creating a new ML model, as discussed further herein. Relevant ML models with matching deployment metadata are pulled up from storage (e.g., from a model registry). Incoming data (or a sample of incoming data) (D-TEST) is evaluated against the relevant ML models. The best performing ML model is identified. If the best performing ML model is within the target performance metric threshold, the same is used to serve (or to continue serving) predictions. If the best performing ML model does not qualify for direct usage, but its performance is not too far off as determined by another threshold (e.g., a user-provided threshold), it is subjected to model refinement on the basis that a compatible ML model has been found but needs refinement before deployment. The online or offline active learning procedure 209 is used for the refinement, depending on the application context.

[0092] Only in the event that the identified ML model performs poorly on the new data (in other words, not within the threshold) is a fresh ML model creation triggered. If the new ML model is not performant enough, an alarm can be triggered to an operations center to seek SME validation of model assumptions.

[0093] In Figure 4, threshold T1 is a threshold for determining data compatibility without retraining (e.g., 5% over existing/best). Threshold T2 is a threshold for determining data compatibility with the same ML model (e.g., 15% over existing/best). Threshold T3 is a threshold for determining if a new ML model is created (e.g., RMSE of 5 MHz or MAPE of 15%). Returning ML model predictions assumes prior inverse-transformation according to PP, or pre-processing specific to D-TEST (PP-TEST) if new ML model creation is triggered.

[0094] Referring to the operations of Figure 4, the operations of blocks 401-421 are performed in embodiments that use an existing ML model. In block 401, data is obtained, including test deployment metadata (DM-TEST), test data (or a sample of test data) D-TEST, and thresholds for test performance. The method proceeds to block 403.

[0095] In block 403, ML models with matching deployment metadata (DM, D, PP, M, COEF, PERF) are recovered (e.g., pulled up from storage) corresponding to DM-TEST. The method proceeds to block 405.

[0096] In block 405, the node determines whether at least one ML model was retrieved. If no, in block 407, an error is returned; or a new ML model can be created using the operations of blocks 435-445. If yes, the method proceeds to block 409.

[0097] In block 409, the operations of blocks 411-415 are repeated for each ML model retrieved. In block 411, D-TEST is preprocessed using PP (e.g., scale data) to obtain new data with ML model M's pre-processing applied to it (D-TEST-PP). In block 413, ML model M is applied on D-TEST-PP, and a performance metric PERF-TEST is computed. The method proceeds to block 415.
[0098] In block 415, the node determines whether all retrieved ML models have been processed. If no, the method proceeds to block 409 and the operations of blocks 411-415 are repeated. If yes, the method proceeds to block 417.

[0099] In block 417, the ML model M having the best performance metric PERF-TEST is selected. The method proceeds to block 419.

[00100] In block 419, the node determines whether PERF-TEST is within T1 of PERF. If no, the method proceeds to block 409 and the operations of blocks 411-419 are repeated. If yes, the method proceeds to block 421. In some embodiments, comparing PERF-TEST with existing ML model performance of PERF is a relative performance comparison. In some embodiments, an absolute performance comparison may be used, e.g., is PERF-TEST within T1 (e.g., is the error within 10 CPU units), to determine ML model compatibility for using the ML model as-is or using the ML model after refinement.

[00101] In block 421, ML model predictions are returned if needed, and ML model M is ready to serve (or to continue serving) predictions.

[00102] Still referring to the operations of Figure 4, the operations of blocks 423-431 are performed in embodiments that refine and use an ML model. If the operation of block 419 results in no, the method proceeds to block 423.

[00103] In block 423, the node determines whether PERF-TEST is within threshold T2 of PERF. If no, the method proceeds to the operation 435 to create a new ML model (discussed further herein). If yes, the method proceeds to call active learning procedure 209 with inputs D, M, COEF, D-TEST-PP, and RETRAIN = TRUE to retrieve PERF-NEW, D+D-NEW, and M-NEW. The method proceeds to block 425.

[00104] In block 425, the method records model coefficients of M-NEW as COEF-NEW. If required, PP is updated based on D and D-NEW. The method proceeds to block 427.

[00105] In block 427, DM, D+D-NEW, PP, M-NEW, COEF-NEW, PERF-NEW are stored (e.g., in a model registry), and an existing model record is replaced. The method proceeds to block 429.

[00106] In block 429, M-NEW is applied on D-TEST. The method proceeds to block 431.

[00107] In block 431, ML model predictions are returned if required, and ML model M-NEW is ready to continue serving predictions.

[00108] In block 433, if the refined ML model is not performant enough with model and/or domain knowledge used to refine the ML model, an alarm can be triggered to an operations center to seek SME intervention (e.g., to validate model assumptions).

[00109] Still referring to the operations of Figure 4, the operations of blocks 435-445 are performed in embodiments that create a new ML model. If the operation of block 423 results in no, the method proceeds to the operation 435 to create a new ML model.

[00110] In block 435, D-TEST is pre-processed (PP-TEST) (e.g., remove outliers, scale data, etc.). The method proceeds to block 437.

[00111] In block 437, the pre-processed data PP-TEST is partitioned into a minimal seed data subset (D-SEED) (e.g., pick a number-of-unknown-model-coefficients number of data with max norm) and the remainder/other data subset (D-OTHER). The method proceeds to call active learning procedure 209.

[00112] Active learning procedure 209 is called with inputs D-SEED, model-type, D-OTHER, and RETRAIN = FALSE to retrieve PERF-NEW, D-SEED+D-NEW, and M-NEW. The method then proceeds to block 439.

[00113] In block 439, model coefficients of M-NEW are recorded as COEF-NEW. The method proceeds to block 441.
[00114] In block 441, the node determines whether PERF-NEW is not empty and within threshold T3. If no, an alarm can be triggered 433 to an operations center to seek SME intervention (e.g., to validate model assumptions). If yes, the method proceeds to block 443.

[00115] In block 443, DM-TEST, D-SEED+D-NEW, PP-TEST, M-NEW, COEF-NEW, PERF-NEW are appended to model storage (e.g., to a model repository). The method proceeds to block 445.

[00116] In block 445, the node computes and returns ML model predictions on D-TEST. M-NEW is ready to serve (or to continue serving) predictions.

[00117] Referring next to Figure 5, in this embodiment, deployment metadata is known and a target performance metric threshold for determining compatibility of data is not known. This embodiment includes identifying and refining relevant ML models. Relevant ML models with matching deployment metadata are pulled up from storage (e.g., from a model registry). Each of these relevant ML models is subjected to a model refinement process using online or offline active learning procedure 209 applied on incoming/new data: (i) If the new data is compatible with an existing ML model, very few data points are acquired, and the ML model remains largely the same; (ii) If the new data is very different from an existing ML model, the active learning procedure 209 will return an empty/NULL result, and the existing ML model remains unchanged; or (iii) the relevant ML models are refined to follow the drift in data. After iterating through the relevant ML models, if at least one of the ML models has been refined successfully, no new ML models are created; otherwise, a new ML model is created. From among the refined ML models, the best performing ML model is designated to serve (or to continue serving) predictions. If a new ML model is created, it will serve (or continue serving) predictions.

[00118] In Figure 5, threshold T3 is a threshold for determining new ML model acceptance (e.g., RMSE of 5 MHz or MAPE of 15%). Returning ML model predictions assumes inverse-transformation according to PP or PP-TEST.

[00119] Referring to the operations of Figure 5, the operations of blocks 501-515 and 209 (on the left side of Figure 5) are performed in embodiments to pull relevant ML models with matching deployment metadata. Each of these relevant ML models is subjected to a model refinement process using active learning procedure 209 applied on incoming/new data.

[00120] In block 501, data is obtained, including test deployment metadata (DM-TEST), test data (or a sample of test data) D-TEST, and threshold T3 for acceptable new ML model performance. The method proceeds to block 503.

[00121] In block 503, ML models are recovered (DM, D, PP, M, COEF, PERF) from storage (e.g., from a model repository) corresponding to DM-TEST. The method proceeds to block 505.

[00122] In block 505, the node determines whether at least one ML model was retrieved. If no, in block 507, an error is returned; or a new ML model can be created using the operations of blocks 537-545. If yes, the method proceeds to block 509 and a flag is set to FALSE. The method then proceeds to block 511.

[00123] In block 511, the operations of blocks 513, 515, and 209 are repeated for each ML model retrieved. The method proceeds to block 513.

[00124] In block 513, D-TEST is pre-processed using PP (e.g., scale data, etc.) to obtain D-TEST-PP. The method proceeds to block 515.
[00125] In block 515, ML model M is applied on D-TEST-PP, and a performance metric PERF-TEST is computed. The method proceeds to call active learning procedure 209.

[00126] Active learning procedure 209 is called with inputs D, M, D-TEST-PP, RETRAIN = TRUE to retrieve PERF-NEW, D+D-NEW, and M-NEW. The method then proceeds to block 517 to begin performing ML model refinement. ML model refinement operations are performed in blocks 517-533.

[00127] In block 517, the node determines whether D-NEW is empty. If yes, the method proceeds to block 527 (discussed further herein). If no, the method proceeds to block 519.

[00128] In block 519, model coefficients of M-NEW are recorded as COEF-NEW. If required, PP is updated based on D and D-NEW. The method proceeds to block 521.

[00129] In block 521, DM, D+D-NEW, PP, M-NEW, COEF-NEW, PERF-NEW are stored (e.g., in a model registry). An existing model record is replaced. The method proceeds to block 523.

[00130] In block 523, a flag is set to TRUE to indicate that at least one ML model has been successfully refined. The method proceeds to block 525.

[00131] In block 525, M-NEW is applied on D-TEST. The method proceeds to block 527.

[00132] In block 527, the node determines whether all retrieved ML models have been processed. If no, the method proceeds to block 511 and blocks 513, 515, 209, and 517-527 are repeated to process the remaining ML model(s). If yes, the method proceeds to block 529.

[00133] In block 529, the method determines whether FLAG = TRUE. If no, the method proceeds to block 535 to begin operations to create a new ML model (discussed further herein). If yes, the method proceeds to block 531.

[00134] In block 531, from the one or more refined ML models, a new ML model M-NEW is selected with maximum performance PERF-NEW. The method proceeds to block 533.

[00135] In block 533, ML model predictions are returned for D-TEST if required; and ML model M-NEW is ready to serve (or to continue serving) predictions.

[00136] Still referring to the operations of Figure 5, the operations of blocks 535-547 are performed in embodiments that create a new ML model. If the operation of block 529 results in no because FLAG = FALSE after processing all retrieved ML models (and, thus, the new data is not compatible with any existing, retrieved ML model since no retrieved ML model was successfully refined), the method proceeds to the operation 535 to create a new ML model.

[00137] In block 535, D-TEST is pre-processed (PP-TEST) (e.g., remove outliers, scale data, etc.). The method proceeds to block 537.

[00138] In block 537, the pre-processed data PP-TEST is partitioned into a minimal seed data subset (D-SEED) (e.g., pick a number-of-unknown-model-coefficients number of data with max norm) and the remainder/other data subset (D-OTHER). The method proceeds to call active learning procedure 209.

[00139] Active learning procedure 209 is called with inputs D-SEED, model-type, D-OTHER, and RETRAIN=FALSE to retrieve PERF-NEW, D-SEED+D-NEW, and M-NEW. The method then proceeds to block 539.

[00140] In block 539, model coefficients of M-NEW are recorded as COEF-NEW. The method proceeds to block 541.

[00141] In block 541, the node determines whether PERF-NEW is not empty and within threshold T3. If no, an alarm can be triggered 547 to an operations center to seek SME intervention (e.g., to validate model assumptions). If yes, the method proceeds to block 543.
[00142] In block 543, DM-TEST (a new ML model), D-SEED+D-NEW, PP-TEST, M-NEW, COEF-NEW, PERF-NEW are appended to model storage (e.g., to a model repository). The method proceeds to block 545.

[00143] In block 545, the node computes and returns ML model predictions on D-TEST. M-NEW is ready to serve (or to continue serving) predictions.

[00144] Referring next to Figures 6A-6B, in this embodiment, deployment metadata is unavailable and a target performance metric is known (e.g., approximate performance-based model identification). In this embodiment, because no model identification information is available, ML model identification is done considering available ML models (e.g., stored in a model registry) (operations of blocks 601-615). For efficiency, ML model identification is performed using an approximate performance metric. A small sample (e.g., a random sample) of new data is extracted and evaluated using each of the available ML models (e.g., stored in a model registry). The top-K (e.g., K set by a user) ML models are deemed relevant (operations of block 615). If an existing ML model is determined to be performant, the existing ML model is used (operations of blocks 617-621). If not, refining the top-K ML models is performed, and if any of the top-K ML models is successfully refined, then the best of the refined top-K ML models is deployed (operations of blocks 623-647). A new ML model is created (operations of blocks 649-661 in Figure 6B) if none of the top-K ML models are compatible with the new data (that is, none of the top-K ML models is refined). That is, each of the top-K ML models is subjected to a model refinement process using active learning procedure 209 applied on incoming/new data: (i) If the new data is compatible with an existing ML model, very few data points are acquired, and the ML model remains largely the same; (ii) If the new data is very different from an existing ML model, the active learning procedure 209 will return an empty/NULL result, and the existing ML model remains unchanged; or (iii) the relevant ML models are refined to follow the drift in data. After iterating through all relevant ML models, if at least one of the ML models has been refined successfully, no new ML models are created; otherwise, a new ML model is created. From among the refined ML models, the best performing ML model is designated to serve (or to continue serving) predictions. If a new ML model is created, it will serve (or continue serving) predictions.

[00145] In Figures 6A, 6B, threshold T1 is a threshold for determining data compatibility with the same ML model without retraining (e.g., 5% over existing/best). Threshold T3 is a threshold for determining new ML model acceptance (e.g., RMSE of 5 MHz or MAPE of 15%). Returning ML model predictions assumes prior inverse-transformation according to PP or PP-TEST. In some embodiments, thresholds can be a relative performance comparison or an absolute performance comparison (e.g., is PERF-TEST within T1, such as the error being within 10 CPU units).

[00146] Referring to the operations of Figure 6A, the operations of blocks 601-615 are performed in embodiments to identify a relevant ML model(s). Each of these relevant ML models is then evaluated, and refined if needed, as discussed further herein.

[00147] In block 601, data is obtained, including test data (or a sample of test data) D-TEST, a number of models K and a test-data percentage, and thresholds T1 and T3. The method proceeds to block 603.
[00148] In block 603, a random sample comprising a percentage (e.g., a small percentage) of D-TEST is extracted and denoted D-TEST-S. The method proceeds to block 605.

[00149] In block 605, the operations of blocks 607-613 are repeated for each available ML model (e.g., in a model repository). The method proceeds to block 607.

[00150] In block 607, relevant ML models are recovered (DM, D, PP, M, COEF, PERF). The method proceeds to block 609.

[00151] In block 609, D-TEST-S is pre-processed using PP (e.g., scale data, etc.) to obtain D-TEST-S-PP. The method proceeds to block 611.

[00152] In block 611, ML model M is applied on D-TEST-S-PP, and a performance metric PERF-TEST-S is computed. The method proceeds to block 613.

[00153] In block 613, the node determines whether all ML models have been processed. If no, the operations of blocks 605-613 are repeated. If yes, the method proceeds to block 615.

[00154] In block 615, the ML models are sorted based on PERF-TEST-S, and the top-K ML models are selected. The method proceeds to block 617 to perform operations to use an existing ML model.

[00155] In block 617, the node determines whether the performance of the best model (denoted in this example embodiment as M1) is less than threshold T1. If no, the method proceeds to block 623 (discussed further herein). If yes, the method proceeds to block 619.

[00156] In block 619, best model (M1) predictions are computed on D-TEST. The method proceeds to block 621.

[00157] In block 621, best model predictions are returned if required. ML model M1 is ready to continue serving predictions.

[00158] If the operation of block 617 resulted in no, the method proceeds from block 617 to block 623.

[00159] In block 623, a flag is set to FALSE. The method proceeds to block 625.

[00160] In block 625, the operations of blocks 627-641 are repeated for each of the top-K ML models retrieved.

[00161] In block 627, D-TEST is pre-processed using PP (e.g., scale data, etc.) to obtain D-TEST-PP. The method proceeds to block 629.

[00162] In block 629, ML model M is applied on D-TEST-PP, and a performance metric PERF-TEST is computed. The method proceeds to call active learning procedure 209.

[00163] Active learning procedure 209 is called with inputs D, M, COEF, D-TEST-PP, RETRAIN=TRUE to retrieve PERF-NEW, D+D-NEW, and M-NEW. The method proceeds to block 631 to begin operations to refine M-NEW.

[00164] In block 631, the node determines whether D-NEW is empty (that is, model refinement was unsuccessful). If yes, the method proceeds to block 641 (discussed further herein). If no, the method proceeds to block 633.

[00165] In block 633, model coefficients of M-NEW are recorded as COEF-NEW. If required, PP is updated based on D and D-NEW. The method proceeds to block 635.

[00166] In block 635, DM, D+D-NEW, PP, M-NEW, COEF-NEW, PERF-NEW are stored (e.g., in a model registry). An existing model record is replaced. The method proceeds to block 637.

[00167] In block 637, a flag is set to TRUE. The method proceeds to block 639.

[00168] In block 639, M-NEW is applied on D-TEST. The method proceeds to block 641.

[00169] In block 641, the node determines whether all retrieved ML models have been processed. If no, the method proceeds to block 625 and blocks 627, 629, 209, and 631-641 are repeated to process the remaining ML model(s). If yes, the method proceeds to block 643.

[00170] In block 643, the method determines whether FLAG = TRUE.
If no, the method proceeds to block 649 of Figure 6B (discussed further herein) to begin operations to create a new ML model. If yes, because at least one ML model was successfully refined, the method proceeds to block 645.

[00171] In block 645, from among the K retrieved ML models that were refined, the ML model M-NEW with maximum performance PERF-NEW is selected. The method proceeds to block 647.

[00172] In block 647, ML model predictions are returned for D-TEST if required; and ML model M-NEW is ready to serve (or to continue serving) predictions.

[00173] If the result of the operation of block 643 is no, the method proceeds to block 649 of Figure 6B to begin operations to create a new ML model.

[00174] Referring to Figure 6B, in block 649, D-TEST is pre-processed (PP-TEST) (e.g., remove outliers, scale data, etc.). The method proceeds to block 651.

[00175] In block 651, the pre-processed data PP-TEST is partitioned into a minimal seed data subset (D-SEED) (e.g., pick a number-of-unknown-model-coefficients number of data with max norm) and the remainder/other data subset (D-OTHER). The method proceeds to call active learning procedure 209.

[00176] Active learning procedure 209 is called with inputs D-SEED, model-type, D-OTHER, and RETRAIN=FALSE to retrieve PERF-NEW, D-SEED+D-NEW, and M-NEW. The method then proceeds to block 653.

[00177] In block 653, model coefficients of M-NEW are recorded as COEF-NEW. The method proceeds to block 655.

[00178] In block 655, the node determines whether PERF-NEW is not empty and within threshold T3. If no, an alarm can be triggered 661 to an operations center to seek SME intervention (e.g., to validate model assumptions). If yes, the method proceeds to block 657.

[00179] In block 657, DM-TEST, D-SEED+D-NEW, PP-TEST, M-NEW, COEF-NEW, PERF-NEW are appended to model storage (e.g., to a model repository). The method proceeds to block 659.

[00180] In block 659, the node computes and returns ML model predictions on D-TEST. M-NEW is ready to serve (or to continue serving) predictions.

[00181] Example embodiments of the method of Figures 2, 3, 4, 5, 6A, and 6B will now be discussed with reference to the flowcharts of Figures 7-12.

[00182] As referenced above, referring to Figure 7, the method includes performing (703) one of (i) determining that a performance of a current ML model is not acceptable for a forecast of telecommunications dimensioning, where the current ML model is trained on a first minimal informative subset of data from a first dataset comprising performance data for network traffic, or (ii) a proactive periodic refinement to the current ML model to proactively follow a drift in a subsequent dataset. The method further includes selecting (711) a second minimal informative dataset of data that sufficiently models a second dataset. The sufficiency is based on performance of the current ML model refined on a second incremental dataset. The second incremental dataset comprises data incrementally added to the first minimal informative subset of data from the second dataset comprising performance data for network traffic that is different than the first dataset.

[00183] In some embodiments, the method includes more than one incremental dataset, where an Nth incremental dataset = the (N-1)th incremental dataset + the Nth minimal informative subset of data, as illustrated in the sketch following this list. In some embodiments:
• The first incremental dataset = the first minimal informative subset of data.
• The second incremental dataset = the first incremental dataset + the second minimal informative dataset.
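A minimal sketch of this incremental-dataset bookkeeping follows; the function name and the list-based representation of datasets are illustrative assumptions, not part of the disclosed method.

```python
# Minimal sketch of paragraph [00183]: the Nth incremental dataset equals
# the (N-1)th incremental dataset plus the Nth minimal informative subset.
def incremental_datasets(minimal_subsets):
    """minimal_subsets: [S1, S2, ..., SN], each a list of data points.
    Returns [D1, D2, ..., DN], where D1 = S1 and Dn = D(n-1) + Sn."""
    datasets, current = [], []
    for subset in minimal_subsets:
        current = current + list(subset)   # Dn = D(n-1) + Sn
        datasets.append(current)
    return datasets

# Example with three minimal informative subsets:
print(incremental_datasets([["a", "b"], ["c"], ["d", "e"]]))
# [['a', 'b'], ['a', 'b', 'c'], ['a', 'b', 'c', 'd', 'e']]
```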
[00184] In some embodiments, the current ML model is refined with new/additional data (i.e., the Nth minimal informative subset of data).

[00185] In some embodiments, the current ML model (e.g., a simple ML model) is retrained on prior data and new/additional data (i.e., the Nth incremental dataset).

[00186] In some embodiments, performance of the current ML model, the refined ML model, and/or the retrained ML model is evaluated on the Nth incremental dataset to decide when to stop adding additional data points.

[00187] In some embodiments, in offline training, the performance is evaluated to determine whether performance of the applicable ML model on the second incremental dataset is a close approximation of the performance of the applicable ML model trained on the first and the second datasets combined.

[00188] In some embodiments, in online training, the performance is evaluated to determine whether the performance of the applicable ML model on the second incremental dataset (with the data points added so far) saturates.

[00189] In some embodiments, the method stops adding new/additional data points when a threshold amount of points of the new/additional dataset is reached, irrespective of the performance achieved so far.

[00190] In some embodiments, the method further includes incrementally adding (705) data from the second dataset to the first minimal informative subset of data to obtain a second incremental dataset; assessing (707) performance of the current ML model on the second incremental dataset; and applying (709) a stopping criterion to the incrementally adding data when the performance of the current ML model refined on the second incremental dataset satisfies one of (i) is within a threshold of performance of the current ML model trained on the entire first and second datasets combined or taken together during offline training of the current ML model, (ii) when greater than a defined percentage of data from the second dataset is needed in the incrementally adding to achieve (a) the performance on the second incremental dataset that approximates performance on the entire first and second datasets or (b) the performance saturates, during online or offline retraining of the current ML model, or (iii) when performance of the current ML model refined online on the second incremental dataset saturates.

[00191] In some embodiments, performance comprises a metric (e.g., an error, an accuracy, or another metric, such as information content or informativeness of a data subset).

[00192] In some embodiments, when the refined current ML model has an acceptable performance, the method generates a forecast of telecommunications dimensioning with the refined current ML model.

[00193] Referring to Figure 8, in some embodiments, the method further includes determining (801) whether the current ML model or a stored ML model is successfully refined; and when (i) none of the current ML model and a stored ML model is successfully refined, creating (803) a new ML model, or (ii) the current ML model and/or at least one stored ML model is successfully refined, using (805) a refined ML model that has the best performance to generate a forecast of telecommunications dimensioning.

[00194] In some embodiments, the conditions for applying a stopping criterion to the incrementally adding data contribute to the determining (801) and the creating (803).
A stopping criterion is applied to the incrementally adding data when the performance of the current ML model refined on the second incremental dataset satisfies a condition including (i) the performance is within a threshold of performance of the current ML model trained on the entire first and second datasets combined or taken together during offline training of the current ML model; (ii) when greater than a defined percentage of data from the second dataset is needed in the incrementally adding to achieve (a) the performance on the second incremental dataset approximates performance on the entire first and second datasets or (b) the performance saturates, during online or offline retraining of the current ML model; or (iii) when performance of the current ML model refined online on the second incremental dataset saturates. The first (i) and second (ii) conditions may occur in offline training, and the second (ii) and the third (iii) conditions may occur in online training. When the second (ii) condition occurs before the first (i) or third (iii) condition in their respective offline/online process, adding points and refining of that ML model stops, and refinement of that ML model has failed.

[00195] Referring to Figure 9, in some embodiments, the method further includes determining (901) (i) whether the current ML model has an acceptable performance, or (ii) whether another ML model exists from a plurality of stored ML models based on known or unknown model deployment metadata and known target performance; and when (i) the current ML model or a stored ML model has an acceptable performance, or (ii) the another ML model exists, omitting (903) refining the current ML model.

[00196] Referring to Figure 10, when (i) refinement of the current ML model does not result in an acceptable performance, or (ii) a periodic refinement to the current ML model does not result in an acceptable performance, the method further includes identifying (1001), from a plurality of ML models, another ML model to refine. The identifying is performed with or without available model metadata, and with or without known target performance of the another ML model. The model metadata includes information about deployment of a model.

[00197] In some embodiments, when the model metadata is available and a target performance of the another ML model is available, the identifying includes identifying the another ML model based on a model error of the another ML model that is within the target performance of the another ML model.

[00198] In some embodiments, the target performance comprises at least one of performance that is relative to performance of another ML model or an absolute performance metric.

[00199] In some embodiments, when the ML model metadata is available and a target performance of the another ML model is not known, the identifying includes identifying ML models from the plurality of ML models that match the ML model metadata.

[00200] In some embodiments, the method further includes performing (1003) a procedure to find a second minimal informative subset of new data that sufficiently models the second dataset based on performance of a retrained at least one ML model from the plurality of ML models.

[00201] In some embodiments, for each ML model from the plurality of ML models that gets retrained, the second minimal informative subset of data is found. Each ML model is associated with its own dataset, which comprises one or more minimal informative subsets of data.
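For illustration, the following is a hedged sketch of the three stopping conditions (i)-(iii) discussed above in connection with Figures 7 and 8. The threshold names, the window size, and the use of a standard deviation over recent entries as the saturation measure are assumptions of the sketch, not a definitive implementation.

```python
# Hedged sketch of stopping conditions (i)-(iii). Thresholds t1 and t2,
# the window w, and the saturation tolerance are illustrative values.
import numpy as np

def stopping_decision(perf_list, full_data_perf, frac_used, mode,
                      t1=0.05, t2=0.5, w=5, sat_tol=0.01):
    """Return 'stop: refined', 'stop: refinement failed', or 'continue'.

    perf_list      -- performance after each incremental data addition
    full_data_perf -- performance on the combined datasets (offline), or None
    frac_used      -- fraction of the second dataset consumed so far
    mode           -- 'offline' or 'online'
    """
    # Condition (ii): more than a defined percentage of the second dataset
    # was needed -- refinement of this ML model has failed.
    if frac_used > t2:
        return 'stop: refinement failed'
    # Condition (i), offline: within a threshold of full-data performance.
    if mode == 'offline' and full_data_perf is not None:
        if abs(perf_list[-1] - full_data_perf) <= t1 * abs(full_data_perf):
            return 'stop: refined'
    # Condition (iii), online: performance saturates over the last w steps.
    if mode == 'online' and len(perf_list) >= w:
        if np.std(perf_list[-w:]) <= sat_tol:
            return 'stop: refined'
    return 'continue'
```

Consistent with the text above, condition (ii) is checked first, so a model that would eventually satisfy condition (i) or (iii), but only after consuming too much of the new data, is reported as a failed refinement.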
[00202] In some embodiments, the procedure comprises iterating through the identified ML models on new data until (i) performance saturation is reached, or (ii) none of the identified ML models can be refined and a new ML model is created.

[00203] In some embodiments, the method further includes designating (1005) a best performing ML model from the identified ML models or the new ML model to generate a forecast of telecommunications dimensioning.

[00204] In some embodiments, when model metadata is not available, the identifying comprises (i) evaluating a performance of each of the plurality of ML models on a subset of the new data, (ii) identifying a subset of the plurality of ML models to retrain on a new dataset, the subset of the plurality of ML models identified based on a defined value that sets the number of best performing ML models to include in the subset of the plurality of ML models; (iii) using an existing ML model that is determined to be performant, and (iv) when an existing ML model is not determined to be performant, performing a procedure to find a second minimal informative subset of the second dataset that sufficiently models the second dataset based on a performance of a retrained at least one ML model from the plurality of ML models, wherein the performance comprises achieving a target performance.

[00205] In some embodiments, the procedure includes iterating through the subset of the plurality of ML models on the new data until (i) the target performance is reached, or (ii) none of the subset of the plurality of ML models can be refined and a new ML model is created.

[00206] In some embodiments, the method further includes designating (1007) a best performing ML model from the subset of the plurality of ML models or the new ML model to generate a forecast of telecommunications dimensioning.

[00207] Referring again to Figure 7, in some embodiments, the method further comprises initially training (701) the current ML model. The initial training includes selecting a seed dataset from the first dataset comprising performance data for network traffic; incrementally adding data from the remainder of the first dataset to the seed dataset to obtain a first incremental dataset; assessing performance of the current ML model on the first incremental dataset; and applying a stopping criterion to the incrementally adding data when the performance of the current ML model trained on the first incremental dataset satisfies one of (i) is within a threshold of performance over the entire first dataset during training of the current ML model, or (ii) when performance of the current ML model trained on the first incremental dataset saturates during online training of the current ML model.

[00208] Referring to Figure 11, in some embodiments, the method further includes deleting (1101), on a defined periodic basis, unused and/or out-of-date datasets and/or ML models from a ML model repository comprising the current ML model and a plurality of additional ML models; and maintaining (1103) in the ML model repository at least one dataset and at least one ML model that is in use and/or is up-to-date.

[00209] Referring to Figure 12, in some embodiments, the method further includes compressing (1201) the first dataset and/or the second dataset for a stored ML model into a compressed form on a defined periodic basis. The compressing includes finding a minimal informative subset of accumulated datasets for the stored ML model.
[00210] In some embodiments, the compressing finds the minimal informative data subset of all accumulated datasets that have not been forgotten. In some embodiments, the compressing is performed for each stored ML model.

[00211] In some embodiments, the at least one ML model comprises a linear regression model, and the telecommunications dimensioning comprises estimating resources for a set of telecommunications features that capture telecommunications network traffic behavior. In the linear regression model, a resource requirement is a linear function of telecommunications domain features that capture network traffic behavior.

[00212] In some embodiments, the set of telecommunications features comprise at least one of time, de-registrations, initial registrations, re-registrations, call duration, answered calls, call attempts, and total short message services, SMS.

[00213] The various operations from the flow charts of Figures 8-12 may be optional with respect to some embodiments of nodes and related methods.

[00214] In an example embodiment, dimensioning (that is, a resource requirement) estimate is equal to linear regression (features), where features include at least one of a time feature, de-registration(s), initial registration(s), re-registration(s), call durations, answered calls, call attempts, and/or total SMS.

[00215] In an example embodiment, a minimum sample problem is addressed using active learning procedure 209. As discussed further herein, a simple expression is derived for an acquisition function for the case of linear regression. In the following discussion, lower-case letters are used to denote both matrices and vectors, with their dimensions qualified by the context of the discussion.

[00216] A linear regression formulation can include y = xw + e, where:
• y is an (n × 1) target vector, x is an (n × d) predictor matrix, and w is a (d × 1) weight vector to be determined.
• e represents the (n × 1) vector of residuals, expressed as additive white noise, that is, E[e] = 0 and E[eeᵀ] = σ²I.

[00217] An OLS estimate of the weights w is given by ŵ = (xᵀx)⁻¹xᵀy.

[00218] If the true weights are w∗ so that y = xw∗ + e, the expected value and covariance (uncertainty) of the estimated weights, given the data, is then specified by E[ŵ|x] = w∗ + (xᵀx)⁻¹xᵀE[e] = w∗.

[00219] That is, the OLS estimate is unbiased, and cov[ŵ|x] = E[(ŵ − w∗)(ŵ − w∗)ᵀ|x] = E[((xᵀx)⁻¹xᵀ)eeᵀ(x(xᵀx)⁻¹)] = (xᵀx)⁻¹xᵀE[eeᵀ]x(xᵀx)⁻¹ = σ²(xᵀx)⁻¹.

[00220] The active learning procedure 209 incrementally selects the point whose response is most uncertain (given the existing data points) and, thus, may best inform the modeling. For example, for an existing data sample x of dimension (k × d), where k << n, a new datum x′ of dimension (1 × d) is selected, whose (1 × 1) response is y′. The active learning procedure 209 seeks to find the x′ from all data not currently in x which has the most uncertain response y′, quantified by Var{y′|x′, x, y}: Var{y′|x′, x, y} = E[(x′ŵ − x′w∗)²|x] = E[{x′(ŵ − w∗)}²|x] = E[{x′(ŵ − w∗)}{x′(ŵ − w∗)}ᵀ|x] = x′E[(ŵ − w∗)(ŵ − w∗)ᵀ|x]x′ᵀ = x′ cov(ŵ|x) x′ᵀ = α x′C⁻¹x′ᵀ.

[00221] Here, x′ is (1 × d), C = xᵀx is the (d × d) covariance of the data x, and α corresponds to the noise variance σ².

[00222] Given a well-chosen initial data sample x, the procedure 209 can incrementally find points x′ that maximize an acquisition function, e.g., x′C⁻¹x′ᵀ, and add them to the data sample x. The process can be repeated until the RMSE is in a range of the best (all-data) RMSE.
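For illustration only, the following is a minimal sketch of the offline procedure implied by this derivation: seed with the maximum-norm points, then repeatedly add the point maximizing x′C⁻¹x′ᵀ until the subset-trained OLS model's RMSE over the full dataset approaches the best (all-data) RMSE. The function name, the tolerance parameter, and the use of a pseudo-inverse for numerical robustness are assumptions of the sketch.

```python
# Minimal sketch of minimum-sample selection for OLS linear regression
# using the acquisition function x' C^-1 x'^T derived above.
import numpy as np

def select_informative_subset(x, y, k_seed, tol=0.1):
    """x: (n, d) predictors; y: (n,) targets; k_seed: seed size;
    tol: stop when RMSE is within tol (e.g., 10%) of the all-data RMSE."""
    w_full = np.linalg.pinv(x.T @ x) @ x.T @ y            # OLS on all data
    rmse_full = np.sqrt(np.mean((x @ w_full - y) ** 2))   # best (all-data) RMSE
    order = np.argsort(-np.linalg.norm(x, axis=1))        # max-norm seed heuristic
    chosen, rest = list(order[:k_seed]), list(order[k_seed:])
    while rest:
        xs, ys = x[chosen], y[chosen]
        c_inv = np.linalg.pinv(xs.T @ xs)                 # C^-1, (d x d)
        w_hat = c_inv @ xs.T @ ys                         # OLS on the subset
        rmse = np.sqrt(np.mean((x @ w_hat - y) ** 2))
        if rmse <= rmse_full * (1.0 + tol):               # stopping criterion
            break
        scores = [xi @ c_inv @ xi for xi in x[rest]]      # x' C^-1 x'^T
        chosen.append(rest.pop(int(np.argmax(scores))))
    return chosen
```

In the online setting discussed below, rmse_full is unavailable; the same loop would instead stop when the performance metric saturates.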
[00223] If C, as a covariance matrix of the existing data x, is a similarity measure based on covariance, then C⁻¹ is a measure of dissimilarity, and the procedure 209 incrementally identifies points most dissimilar to the existing set.

[00224] While an acquisition function of some embodiments is discussed as above, the present disclosure is not so limited, and other acquisition functions may be used.

[00225] Selecting a good initial data sample x may be important. Examples of good heuristics include, without limitation:
• Using points that have maximum norm (e.g., the square root of the sum of the squares of feature values). This may be useful if high-value data is known to be useful for modeling for telecommunications network function dimensioning.
• Using points that maximize any (e.g., one or more) features.
• Using points distributed across a range of the target variable so initial data for the linear model is spread out, and may be most dissimilar.

[00226] For the offline active learning procedure 209, in some embodiments, after picking an initial minimum set of samples, the acquisition function can be used to incrementally select additional samples until the stopping criterion occurs (e.g., the error approximates the error obtained using the full dataset). This approximation can be defined by a user-set threshold (e.g., within 10%).

[00227] For the online active learning procedure 209, when, e.g., a ML model exists in a live network deployment scenario and needs to be retrained using new data:
• The full/combined data error is unavailable.
• Additional data are identified using the acquisition function and incorporated to incrementally refine the ML model.
• The stopping criterion is the saturation of a model performance metric (e.g., RMSE or the trace of the inverse covariance matrix).

[00228] While example embodiments are discussed herein with respect to using linearity of a dimensioning ML model to derive simple expressions which may enable efficient processing, the method of the present disclosure is model agnostic and acquisition function agnostic.

[00229] In an example embodiment, Figure 13 is a plot illustrating CPU load for a call session control function (CSCF) dataset subject to an offline active learning procedure in accordance with some embodiments of the present disclosure; and Figure 14 is a further plot illustrating a training root mean squared error (RMSE) for the data of Figure 13. In this example embodiment, the full CSCF dataset has 711 data points and was subjected to offline active learning procedure 209 to make a minimum-sample selection. As illustrated in Figures 13 and 14, only 23 informative points were required to get a competitive error of 1.99, compared to the full dataset error of 1.83.

[00230] Further, in another example embodiment using online active learning procedure 209, a CSCF ML model already existed with 50 (randomly selected) data points. The remaining 661 (new) data points were used to update the existing ML model. First, as the new data was consistent with the existing ML model, a reasonable RMSE was obtained, as illustrated in Figure 15. As shown in Figure 15, the ML model learned with 50 data points with a residual error of 1.49; and when tested without refinement on a compatible dataset of the remaining 661 data points, the ML model produced a slightly higher error of 2.01 that is suggestive of compatibility. Second, in this example embodiment, a result of the method was validated as being competitive.
For the validation, a full dataset RMSE to pick samples from was not computed. Instead, samples from the remaining 661 data points that maximized the acquisition function were incrementally added to the existing 50 data points, and the ML model was refined/re-learned until the RMSE saturated (per a threshold, as quantified by its standard deviation over the last 5 sample additions falling to 0.01 or below) at the 25th additional sample from data partition 2 (data2), as illustrated in Figure 16. As illustrated in Figures 15 and 16, only 25 of the 661 data points were required to re-train the existing ML model to an accuracy of 1.93. Thus, the online active learning procedure 209 can stop at this point. As shown in the inset plot in Figure 16, while online active learning procedure 209 can stop, outcomes can continue to be evaluated until the last available point to observe trends. For purposes of validating this result as being competitive, the combined dataset accuracy was computed to be 1.84. Thus, in total only 75 of the 711 data samples would be stored (e.g., in a model repository).

[00231] Another example embodiment is discussed in the context of planned or unplanned cell maintenance (e.g., 4G, 5G, and beyond). In a planned maintenance scenario, operators may take down multiple cell sites and reroute the traffic to the nearest neighbor. In an unplanned maintenance scenario, there can be failure of a cell site, and traffic also gets routed to the nearest neighbor. When such situations occur, there may be changes in the network/subscriber behavior, which may lead to a change in the data. This data/concept drift triggers the method of the present disclosure for model retraining and redeployment as part of LCM.

[00232] In some approaches for retraining, a considerable amount of data may be needed in order to arrive at a new ML model, and doing so may be computationally expensive and time consuming. For example, a new ML model may be trained and deployed in this scenario in all the cell sites or in a centralized location. After the maintenance, the operator may put the cell site(s) back into operation, thereby making the network go back to its original state, and the traffic behavior is restored to the previous state as well. In this scenario, the newly developed ML model will fail, which again constitutes a concept/data drift from a ML model LCM perspective.

[00233] Existing approaches may consider this as a unique scenario and start the ML model retraining using a large amount of data, come up with a ML model from scratch, and deploy it. Such approaches may be computationally expensive, time consuming, and not practical if such events occur, e.g., more often than as a rare one-off event.

[00234] The method of some embodiments of the present disclosure includes intelligently identifying and deploying a ML model that was previously used in this scenario (with minimal/efficient retraining or no retraining). As a consequence, time and effort may be saved instead of building the same/similar ML model again.

[00235] As discussed herein, a node may be provided, for example, as discussed below with respect to network node QQ110A, QQ110B of Figure 17, network node QQ300 of Figure 18, and/or virtualization environment QQ500 of Figure 19, all of which should be considered interchangeable in the examples and embodiments described herein and be within the intended scope of this disclosure, unless otherwise noted.
The node configured to perform LCM of at least one ML model for telecommunications dimensioning may be provided by, e.g., a node in the cloud running software on cloud compute hardware, or a software function/service governing or controlling the telecommunications deployment running in the cloud. That is, the node may be implemented as part of a telecommunications deployment (e.g., a node that is part of the telecommunications deployment), or on a node as a separate functionality/service hosted in the cloud. The node also may be provided as standalone software for dimensioning a telecommunications deployment running on consumer computational systems like servers or workstations; and the dimensioning exercise may be to estimate resource requirements for a telecommunications deployment that may include virtual or cloud-based network functions (VNFs or CNFs) and even physical network functions (PNFs). The cloud may be public, private (e.g., on premises or hosted), or hybrid.

[00236] Figure 17 shows an example of a communication system QQ100 in accordance with some embodiments.

[00237] In the example, the communication system QQ100 includes a telecommunication network QQ102 that includes an access network QQ104, such as a RAN, and a core network QQ106, which includes one or more core network nodes QQ108. The access network QQ104 includes one or more access network nodes, such as network nodes QQ110a and QQ110b (one or more of which may be generally referred to as network nodes QQ110), or any other similar 3rd Generation Partnership Project (3GPP) access node or non-3GPP access point. The network nodes QQ110 facilitate direct or indirect connection of UEs, such as by connecting UEs QQ112a, QQ112b, QQ112c, and QQ112d (one or more of which may be generally referred to as UEs QQ112) to the core network QQ106 over one or more wireless connections.

[00238] Example wireless communications over a wireless connection include transmitting and/or receiving wireless signals using electromagnetic waves, radio waves, infrared waves, and/or other types of signals suitable for conveying information without the use of wires, cables, or other material conductors. Moreover, in different embodiments, the communication system QQ100 may include any number of wired or wireless networks, network nodes, UEs, and/or any other components or systems that may facilitate or participate in the communication of data and/or signals whether via wired or wireless connections. The communication system QQ100 may include and/or interface with any type of communication, telecommunication, data, cellular, radio network, and/or other similar type of system.

[00239] The UEs QQ112 may be any of a wide variety of communication devices, including wireless devices arranged, configured, and/or operable to communicate wirelessly with the network nodes QQ110 and other communication devices. Similarly, the network nodes QQ110 are arranged, capable, configured, and/or operable to communicate directly or indirectly with the UEs QQ112 and/or with other network nodes or equipment in the telecommunication network QQ102 to enable and/or provide network access, such as wireless network access, and/or to perform other functions, such as administration in the telecommunication network QQ102.

[00240] In the depicted example, the core network QQ106 connects the network nodes QQ110 to one or more hosts, such as host QQ116. These connections may be direct or indirect via one or more intermediary networks or devices.
In other examples, network nodes may be directly coupled to hosts. The core network QQ106 includes one or more core network nodes (e.g., core network node QQ108) that are structured with hardware and software components. Features of these components may be substantially similar to those described with respect to the UEs, network nodes, and/or hosts, such that the descriptions thereof are generally applicable to the corresponding components of the core network node QQ108. Example core network nodes include functions of one or more of a Mobile Switching Center (MSC), Mobility Management Entity (MME), Home Subscriber Server (HSS), Access and Mobility Management Function (AMF), Session Management Function (SMF), Authentication Server Function (AUSF), Subscription Identifier De-concealing function (SIDF), Unified Data Management (UDM), Security Edge Protection Proxy (SEPP), Network Exposure Function (NEF), and/or a User Plane Function (UPF).

[00241] The host QQ116 may be under the ownership or control of a service provider other than an operator or provider of the access network QQ104 and/or the telecommunication network QQ102, and may be operated by the service provider or on behalf of the service provider. The host QQ116 may host a variety of applications to provide one or more services. Examples of such applications include live and pre-recorded audio/video content, data collection services such as retrieving and compiling data on various ambient conditions detected by a plurality of UEs, analytics functionality, social media, functions for controlling or otherwise interacting with remote devices, functions for an alarm and surveillance center, or any other such function performed by a server.

[00242] As a whole, the communication system QQ100 of Figure 17 enables connectivity between the UEs, network nodes, and hosts. In that sense, the communication system may be configured to operate according to predefined rules or procedures, such as specific standards that include, but are not limited to: Global System for Mobile Communications (GSM); Universal Mobile Telecommunications System (UMTS); Long Term Evolution (LTE), and/or other suitable 2G, 3G, 4G, 5G standards, or any applicable future generation standard (e.g., 6G); wireless local area network (WLAN) standards, such as the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standards (WiFi); and/or any other appropriate wireless communication standard, such as the Worldwide Interoperability for Microwave Access (WiMax), Bluetooth, Z-Wave, Near Field Communication (NFC), ZigBee, LiFi, and/or any low-power wide-area network (LPWAN) standards such as LoRa and Sigfox.

[00243] In some examples, the telecommunication network QQ102 is a cellular network that implements 3GPP standardized features. Accordingly, the telecommunications network QQ102 may support network slicing to provide different logical networks to different devices that are connected to the telecommunication network QQ102. For example, the telecommunications network QQ102 may provide Ultra Reliable Low Latency Communication (URLLC) services to some UEs, while providing Enhanced Mobile Broadband (eMBB) services to other UEs, and/or Massive Machine Type Communication (mMTC)/Massive IoT services to yet further UEs.

[00244] In some examples, the UEs QQ112 are configured to transmit and/or receive information without direct human interaction.
For instance, a UE may be designed to transmit information to the access network QQ104 on a predetermined schedule, when triggered by an internal or external event, or in response to requests from the access network QQ104. Additionally, a UE may be configured for operating in single- or multi-RAT or multi-standard mode. For example, a UE may operate with any one or combination of Wi-Fi, NR (New Radio) and LTE, i.e., being configured for multi-radio dual connectivity (MR-DC), such as E-UTRAN (Evolved-UMTS Terrestrial Radio Access Network) New Radio – Dual Connectivity (EN-DC).

[00245] In the example, the hub QQ114 communicates with the access network QQ104 to facilitate indirect communication between one or more UEs (e.g., UE QQ112c and/or QQ112d) and network nodes (e.g., network node QQ110b).

[00246] Figure 18 shows a network node QQ300 in accordance with some embodiments. As used herein, network node refers to equipment capable, configured, arranged and/or operable to communicate directly or indirectly with a UE and/or with other network nodes or equipment, in a telecommunication network. Examples of network nodes include, but are not limited to, base stations (BSs) (e.g., radio base stations, Node Bs, eNBs, and NR NodeBs (gNBs)), and access points (APs) (e.g., radio access points).

[00247] Base stations may be categorized based on the amount of coverage they provide (or, stated differently, their transmit power level) and so, depending on the provided amount of coverage, may be referred to as femto base stations, pico base stations, micro base stations, or macro base stations. A base station may be a relay node or a relay donor node controlling a relay. A network node may also include one or more (or all) parts of a distributed radio base station such as centralized digital units and/or remote radio units (RRUs), sometimes referred to as Remote Radio Heads (RRHs). Such remote radio units may or may not be integrated with an antenna as an antenna integrated radio. Parts of a distributed radio base station may also be referred to as nodes in a distributed antenna system (DAS).

[00248] Other examples of network nodes include multiple transmission point (multi-TRP) 5G access nodes, multi-standard radio (MSR) equipment such as MSR BSs, network controllers such as radio network controllers (RNCs) or base station controllers (BSCs), base transceiver stations (BTSs), transmission points, transmission nodes, multi-cell/multicast coordination entities (MCEs), Operation and Maintenance (O&M) nodes, Operations Support System (OSS) nodes, Self-Organizing Network (SON) nodes, positioning nodes (e.g., Evolved Serving Mobile Location Centers (E-SMLCs)), and/or Minimization of Drive Tests (MDT) nodes.

[00249] The network node QQ300 includes a processing circuitry QQ302, a memory QQ304, a communication interface QQ306, and a power source QQ308. The network node QQ300 may be composed of multiple physically separate components (e.g., a NodeB component and a RNC component, or a BTS component and a BSC component, etc.), which may each have their own respective components. In certain scenarios in which the network node QQ300 comprises multiple separate components (e.g., BTS and BSC components), one or more of the separate components may be shared among several network nodes. For example, a single RNC may control multiple NodeBs. In such a scenario, each unique NodeB and RNC pair may in some instances be considered a single separate network node.
In some embodiments, the network node QQ300 may be configured to support multiple radio access technologies (RATs). In such embodiments, some components may be duplicated (e.g., separate memory QQ304 for different RATs) and some components may be reused (e.g., a same antenna QQ310 may be shared by different RATs). The network node QQ300 may also include multiple sets of the various illustrated components for different wireless technologies integrated into network node QQ300, for example GSM, WCDMA, LTE, NR, WiFi, Zigbee, Z-wave, LoRaWAN, Radio Frequency Identification (RFID) or Bluetooth wireless technologies. These wireless technologies may be integrated into the same or different chip or set of chips and other components within network node QQ300.

[00250] In certain alternative embodiments, the network node QQ300 does not include separate radio front-end circuitry QQ318; instead, the processing circuitry QQ302 includes radio front-end circuitry and is connected to the antenna QQ310. Similarly, in some embodiments, all or some of the RF transceiver circuitry QQ312 is part of the communication interface QQ306. In still other embodiments, the communication interface QQ306 includes one or more ports or terminals QQ316, the radio front-end circuitry QQ318, and the RF transceiver circuitry QQ312, as part of a radio unit (not shown), and the communication interface QQ306 communicates with the baseband processing circuitry QQ314, which is part of a digital unit (not shown).

[00251] The antenna QQ310 may include one or more antennas, or antenna arrays, configured to send and/or receive wireless signals. The antenna QQ310 may be coupled to the radio front-end circuitry QQ318 and may be any type of antenna capable of transmitting and receiving data and/or signals wirelessly. In certain embodiments, the antenna QQ310 is separate from the network node QQ300 and connectable to the network node QQ300 through an interface or port.

[00252] The antenna QQ310, communication interface QQ306, and/or the processing circuitry QQ302 may be configured to perform any receiving operations and/or certain obtaining operations described herein as being performed by the network node. Any information, data and/or signals may be received from a UE, another network node and/or any other network equipment. Similarly, the antenna QQ310, the communication interface QQ306, and/or the processing circuitry QQ302 may be configured to perform any transmitting operations described herein as being performed by the network node. Any information, data and/or signals may be transmitted to a UE, another network node and/or any other network equipment.

[00253] Embodiments of the network node QQ300 may include additional components beyond those shown in Figure 18 for providing certain aspects of the network node's functionality, including any of the functionality described herein and/or any functionality necessary to support the subject matter described herein. For example, the network node QQ300 may include user interface equipment to allow input of information into the network node QQ300 and to allow output of information from the network node QQ300. This may allow a user to perform diagnostic, maintenance, repair, and other administrative functions for the network node QQ300.

[00254] Figure 19 is a block diagram illustrating a virtualization environment QQ500 in which functions implemented by some embodiments may be virtualized.
In the present context, virtualizing means creating virtual versions of apparatuses or devices, which may include virtualizing hardware platforms, storage devices and networking resources. As used herein, virtualization can be applied to any device described herein, or components thereof, and relates to an implementation in which at least a portion of the functionality is implemented as one or more virtual components. Some or all of the functions described herein may be implemented as virtual components executed by one or more virtual machines (VMs) implemented in one or more virtual environments QQ500 hosted by one or more hardware nodes, such as a hardware computing device that operates as a network node, UE, core network node, or host. Further, in embodiments in which the virtual node does not require radio connectivity (e.g., a core network node or host), the node may be entirely virtualized.

[00255] Applications QQ502 (which may alternatively be called software instances, virtual appliances, network functions, virtual nodes, virtual network functions, etc.) are run in the virtualization environment QQ500 to implement some of the features, functions, and/or benefits of some of the embodiments disclosed herein.

[00256] Hardware QQ504 includes processing circuitry, memory that stores software and/or instructions executable by hardware processing circuitry, and/or other hardware devices as described herein, such as a network interface, input/output interface, and so forth. Software may be executed by the processing circuitry to instantiate one or more virtualization layers QQ506 (also referred to as hypervisors or virtual machine monitors (VMMs)), provide VMs QQ508a and QQ508b (one or more of which may be generally referred to as VMs QQ508), and/or perform any of the functions, features and/or benefits described in relation with some embodiments described herein. The virtualization layer QQ506 may present a virtual operating platform that appears like networking hardware to the VMs QQ508.

[00257] The VMs QQ508 comprise virtual processing, virtual memory, virtual networking or interface and virtual storage, and may be run by a corresponding virtualization layer QQ506. Different embodiments of the instance of a virtual appliance QQ502 may be implemented on one or more of VMs QQ508, and the implementations may be made in different ways. Virtualization of the hardware is in some contexts referred to as network function virtualization (NFV). NFV may be used to consolidate many network equipment types onto industry standard high volume server hardware, physical switches, and physical storage, which can be located in data centers and customer premise equipment.

[00258] In the context of NFV, a VM QQ508 may be a software implementation of a physical machine that runs programs as if they were executing on a physical, non-virtualized machine. Each of the VMs QQ508, and that part of hardware QQ504 that executes that VM, be it hardware dedicated to that VM and/or hardware shared by that VM with others of the VMs, forms a separate virtual network element. Still in the context of NFV, a virtual network function is responsible for handling specific network functions that run in one or more VMs QQ508 on top of the hardware QQ504 and corresponds to the application QQ502.

[00259] Hardware QQ504 may be implemented in a standalone network node with generic or specific components. Hardware QQ504 may implement some functions via virtualization.
[00259] Hardware QQ504 may be implemented in a standalone network node with generic or specific components. Hardware QQ504 may implement some functions via virtualization. Alternatively, hardware QQ504 may be part of a larger cluster of hardware (e.g., in a data center or customer premises equipment (CPE)) where many hardware nodes work together and are managed via management and orchestration QQ510, which, among other things, oversees lifecycle management of applications QQ502. In some embodiments, hardware QQ504 is coupled to one or more radio units that each include one or more transmitters and one or more receivers that may be coupled to one or more antennas. Radio units may communicate directly with other hardware nodes via one or more appropriate network interfaces and may be used in combination with the virtual components to provide a virtual node with radio capabilities, such as a radio access node or a base station. In some embodiments, some signaling can be provided with the use of a control system QQ512, which may alternatively be used for communication between hardware nodes and radio units.
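As a minimal sketch of the lifecycle-management role just described for management and orchestration QQ510 (hypothetical names and operations, not the disclosed implementation), an orchestrator might expose instantiate, scale, and terminate operations for applications QQ502:

```python
# Illustrative sketch (hypothetical names): a management-and-orchestration
# component, cf. QQ510, overseeing the lifecycle of applications QQ502
# deployed across a cluster of hardware nodes.
class Orchestrator:
    def __init__(self) -> None:
        # Maps application name -> number of running instances.
        self.apps = {}

    def instantiate(self, app: str) -> None:
        # Bring a new application (e.g., a virtual network function) online.
        self.apps[app] = 1

    def scale(self, app: str, replicas: int) -> None:
        # Adjust the instance count, e.g., to follow traffic load.
        if app not in self.apps:
            raise KeyError(f"unknown application: {app}")
        self.apps[app] = replicas

    def terminate(self, app: str) -> None:
        # Retire the application and release its resources.
        self.apps.pop(app, None)

# Usage: lifecycle of a hypothetical application instance.
mano = Orchestrator()
mano.instantiate("ml-lcm-app")
mano.scale("ml-lcm-app", replicas=3)
mano.terminate("ml-lcm-app")
```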
[00260] Although the nodes described herein may include the illustrated combination of hardware components, other embodiments may comprise computing devices with different combinations of components. It is to be understood that these nodes may comprise any suitable combination of hardware and/or software needed to perform the tasks, features, functions and methods disclosed herein. Determining, calculating, obtaining or similar operations described herein may be performed by processing circuitry, which may process information by, for example, converting the obtained information into other information, comparing the obtained information or converted information to information stored in the node, and/or performing one or more operations based on the obtained information or converted information, and as a result of said processing making a determination. Moreover, while components are depicted as single boxes located within a larger box, or nested within multiple boxes, in practice, nodes may comprise multiple different physical components that make up a single illustrated component, and functionality may be partitioned between separate components. For example, a communication interface may be configured to include any of the components described herein, and/or the functionality of the components may be partitioned between the processing circuitry and the communication interface. In another example, non-computationally intensive functions of any of such components may be implemented in software or firmware and computationally intensive functions may be implemented in hardware.

[00261] In certain embodiments, some or all of the functionality described herein may be provided by processing circuitry executing instructions stored in memory, which in certain embodiments may be a computer program product in the form of a non-transitory computer-readable storage medium. In alternative embodiments, some or all of the functionality may be provided by the processing circuitry without executing instructions stored on a separate or discrete device-readable storage medium, such as in a hard-wired manner. In any of those particular embodiments, whether executing instructions stored on a non-transitory computer-readable storage medium or not, the processing circuitry can be configured to perform the described functionality. The benefits provided by such functionality are not limited to the processing circuitry alone or to other components of the computing device, but are enjoyed by the computing device as a whole, and/or by end users and a wireless network generally.

[00262] Further definitions and embodiments are discussed below.

[00263] In the above description of various embodiments of present inventive concepts, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of present inventive concepts. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which present inventive concepts belong. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

[00264] When an element is referred to as being "connected", "coupled", "responsive", or variants thereof to another element, it can be directly connected, coupled, or responsive to the other element, or intervening elements may be present. In contrast, when an element is referred to as being "directly connected", "directly coupled", "directly responsive", or variants thereof to another element, there are no intervening elements present. Like numbers refer to like elements throughout. Furthermore, "coupled", "connected", "responsive", or variants thereof as used herein may include wirelessly coupled, connected, or responsive. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Well-known functions or constructions may not be described in detail for brevity and/or clarity. The term "and/or" (abbreviated "/") includes any and all combinations of one or more of the associated listed items.

[00265] It will be understood that although the terms first, second, third, etc. may be used herein to describe various elements/operations, these elements/operations should not be limited by these terms. These terms are only used to distinguish one element/operation from another element/operation. Thus, a first element/operation in some embodiments could be termed a second element/operation in other embodiments without departing from the teachings of present inventive concepts. The same reference numerals or the same reference designators denote the same or similar elements throughout the specification.

[00266] As used herein, the terms "comprise", "comprising", "comprises", "include", "including", "includes", "have", "has", "having", or variants thereof are open-ended and include one or more stated features, integers, elements, steps, components or functions, but do not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions or groups thereof. Furthermore, as used herein, the common abbreviation "e.g.", which derives from the Latin phrase "exempli gratia", may be used to introduce or specify a general example or examples of a previously mentioned item, and is not intended to be limiting of such item. The common abbreviation "i.e.", which derives from the Latin phrase "id est", may be used to specify a particular item from a more general recitation.

[00267] Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products.
It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits. These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).

[00268] These computer program instructions may also be stored in a tangible computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the functions/acts specified in the block diagrams and/or flowchart block or blocks. Accordingly, embodiments of present inventive concepts may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.) that runs on a processor such as a digital signal processor, which may collectively be referred to as "circuitry", "a module" or variants thereof.

[00269] It should also be noted that in some alternate implementations, the functions/acts noted in the blocks may occur out of the order noted in the flowcharts. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Moreover, the functionality of a given block of the flowcharts and/or block diagrams may be separated into multiple blocks, and/or the functionality of two or more blocks of the flowcharts and/or block diagrams may be at least partially integrated. Finally, other blocks may be added/inserted between the blocks that are illustrated, and/or blocks/operations may be omitted without departing from the scope of inventive concepts. Moreover, although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows.

[00270] Many variations and modifications can be made to the embodiments without substantially departing from the principles of the present inventive concepts. All such variations and modifications are intended to be included herein within the scope of present inventive concepts. Accordingly, the above disclosed subject matter is to be considered illustrative, and not restrictive, and the examples of embodiments are intended to cover all such modifications, enhancements, and other embodiments, which fall within the spirit and scope of present inventive concepts.
Thus, to the maximum extent allowed by law, the scope of present inventive concepts is to be determined by the broadest permissible interpretation of the present disclosure, including the examples of embodiments and their equivalents, and shall not be restricted or limited by the foregoing detailed description.