

Title:
A COMPUTER IMPLEMENTED METHOD, A SYSTEM AND A COMPUTER PROGRAM FOR OPTIMIZING THE OPERATION OF A CLOUD HOSTED SOFTWARE AS A SERVICE (SaaS) SYSTEM
Document Type and Number:
WIPO Patent Application WO/2019/086522
Kind Code:
A1
Abstract:
The present invention relates to a computer implemented method for optimizing the operation of a cloud hosted SaaS system, the method comprising: a) monitoring cloud resource usage regarding a cloud provider computing system, to detect cloud resource load consumption; b) detecting service usage regarding software applications deployed in the cloud provider computing system; and c) binding the detected cloud resource load consumption with the detected service usage, and determining associated cost contributions based on respective load contributions obtained from said binding, to optimize the operation of the cloud hosted SaaS system based on the determined cost contributions. The present invention also relates to a computer program and a system both adapted to implement the method of the invention.

Inventors:
LAOUTARIS NIKOLAOS (ES)
SAKELLARIOU SPYRIDON (GR)
Application Number:
PCT/EP2018/079838
Publication Date:
May 09, 2019
Filing Date:
October 31, 2018
Assignee:
LSTECH LTD (GB)
International Classes:
G06Q10/06; G06Q10/10; G06Q30/06
Foreign References:
US20160094401A12016-03-31
US20150120920A12015-04-30
US20110145413A12011-06-16
US20150358349A12015-12-10
US20140214755A12014-07-31
US20140074659A12014-03-13
US20150341230A12015-11-26
US20160112375A12016-04-21
US20110313896A12011-12-22
Other References:
A. BOUSIA; E. KARTSAKLI; A. ANTONOPOULOS; L. ALONSO; C. VERIKOUKIS: "Sharing the Small Cells for Energy Efficient Networking: How much does it cost?", Procs. of GLOBECOM, 2014, pages 2649-2654
L. GYARMATI; R. STANOJEVIC; M. SIRIVIANOS; N. LAOUTARIS: "Sharing the Cost of Backbone Networks: Cui Bono?", Procs. of the 12th ACM SIGCOMM Conf. on Internet Measurement (IMC), ACM, 2012, pages 509-522
L. GYARMATI; N. LAOUTARIS; P. RODRIGUEZ: "On Economic Heavy Hitters: Shapley Value Analysis of 95th-Percentile Pricing", Procs. of the 10th ACM SIGCOMM Conf. on Internet Measurement (IMC), ACM, 2010, pages 75-80
Attorney, Agent or Firm:
PONTI & PARTNERS, S.L.P. (ES)
Claims:
Claims

1.- A computer implemented method for optimizing the operation of a cloud hosted software as a service (SaaS) system, wherein a SaaS provider computing system is operatively connected as a bridge between a cloud provider computing system and at least one customer computing system, to provide at least one customer with SaaS services associated at least to software applications deployed in the cloud provider computing system, the method comprising performing the following steps:

a) monitoring cloud resource usage regarding said cloud provider computing system, to detect cloud resource load consumption;

b) detecting service usage regarding said applications; and

c) binding the detected cloud resource load consumption with the detected service usage, and determining associated cost contributions based on respective load contributions obtained from said binding, to optimize the operation of the cloud hosted SaaS system based on the determined cost contributions.

2.- A method according to claim 1, wherein the method is applied to a plurality of customers, being provided with the SaaS services through respective customer computing systems operatively connected to the SaaS provider computing system, said step b) comprising automatically detecting service usage for each of said customers, and said step c) comprising automatically binding the detected cloud resource consumption with the detected service usage for each customer, to determine and associate cost contributions to services and customers.

3.- A method according to claim 2, comprising performing said step b) by embedding at least one tracking code to service provisioning web pages through which the SaaS is provided to the customers, to capture at least customer identifying information of customers accessing said web pages to track application usage for each customer.

4.- A method according to claim 3, wherein:

- step a) comprises obtaining cloud resource load time-series per cloud resource;

- step b) comprises capturing service invocation time-series by customers, including said customer information;

- step c) comprises aligning said cloud resource load time-series with said service invocation time-series within a common time resolution so that services arrived for execution within this resolution complete their execution within this resolution, and calculating the cost contributions per service and customer based on information available during the intervals of the considered resolution in the aligned time-series.

5.- A method according to claim 4, wherein said service invocation time-series also includes identifying information regarding the SaaS provider computing system, the service of the SaaS provider computing system invoked by a customer, the time the service was invoked, and the user who invoked the service, including said customer information.

6.- A method according to claim 4 or 5, comprising performing said cost contribution calculation for different load ranges through respective different cost penalty functions.

7.- A method according to claim 6, comprising distributing the cost penalty of said different cost penalty functions among active services of said customers, at least proportionally to their relative presence in the detections performed at step b), according to a proportional sharing approach.

8.- A method according to claim 7, comprising distributing the cost penalty of the different cost penalty functions among active services of the customers, also based on the demand said active services bring to the cloud resources in the detections performed at step b), according to a weighted-proportional sharing approach.

9.- A method according to any of claims 3 to 8, wherein said at least one tracking code is at least one of a platform-specific service page snippet, a skeleton service page snippet to be completed at the SaaS provider computing system, a service page snippet conveying user UUIDs (universally unique identifiers), and a log-in snippet conveying user UUIDs and email binding.

10.- A method according to claim 4 or to any of claims 5 to 9 when depending on claim 4, comprising obtaining said service invocation time-series by mapping said service invocations into a series of time intervals based on the service invocations chronological order, on their execution time, on a lag between service invocation and execution, and on the length and time location of said intervals.

11.- A method according to claim 4 or to any of claims 5 to 10 when depending on claim 4, comprising obtaining said cloud resource load time-series from measurement data, regarding one or more metrics of one or more of said cloud resources, included in said time intervals.

12.- A method according to any of the previous claims, comprising performing said optimization of the operation of the cloud hosted SaaS system based on the determined cost contributions at step c), wherein said optimization comprises at least one of the following actions:

- cloud resource remapping of the detected services;

- improvement of the implementation of services provided by the SaaS provider computing system; and

- reconfiguration of the network connections between the cloud provider computing system and the at least one customer computing system, through the SaaS provider computing system.

13.- A method according to claim 12, comprising using the information obtained at step c) to detect software bugs and/or intrusion DoS attacks, and based on the results of said detections performing said improvement of the implementation of services provided by the SaaS provider computing system.

14.- A method according to any of the previous claims, comprising performing said steps a), b) and c) by means of a remote computing system remotely communicated with the SaaS provider computing system.

15.- A method according to claim 14, comprising performing said steps a), b) and c) by means of said remote computing system for various cloud-hosted SaaS systems, remotely communicated therewith, and providing, by means of said remote computing system, on a subscription basis, cost analytics as-a-service, with the cost contributions obtained at step c), to each of said various cloud-hosted SaaS systems in order to allow them to use the provided information at least to perform the optimization of their operation.

16.- A computer program, comprising code instructions that when run in one or more processors implement the method of any of the previous claims.

17.- A system for optimizing the operation of a cloud hosted software as a service (SaaS) system, wherein a SaaS provider computing system is operatively connected as a bridge between a cloud provider computing system and at least one customer computing system, to provide at least one customer with SaaS services associated at least to software applications deployed in the cloud provider computing system, the system being adapted to implement the method of any of claims 1 to 14, for which purpose the system comprises:

- monitoring means for performing step a);

- detection means for performing step b); and

- processing means for performing step c).

18.- A system according to claim 17, comprising a remote computing system that comprises said monitoring, detection and processing means, wherein said remote computing system is configured and arranged to provide, on a subscription basis, cost analytics as-a-service, with the cost contributions obtained at step c), to various cloud-hosted SaaS systems in order to allow them to use the provided information at least to perform the optimization of their operation.

19.- A computer implemented method for an economical optimization related to a cloud hosted software as a service (SaaS) system, wherein a SaaS provider computing system is operatively connected as a bridge between a cloud provider computing system and at least one customer computing system, to provide at least one customer with SaaS services associated at least to software applications deployed in the cloud provider computing system, the method comprising performing the following steps:

a) monitoring cloud resource usage regarding said cloud provider computing system, to detect cloud resource load consumption;

b) detecting service usage regarding said applications; and

c) binding the detected cloud resource load consumption with the detected service usage, and determining associated cost contributions based on respective load contributions obtained from said binding, to provide cost analytics.

Description:
A computer implemented method, a system and a computer program for optimizing the operation of a cloud hosted software as a service (SaaS) system

FIELD OF THE INVENTION

The present invention relates, in a first aspect, to a computer implemented method for optimizing the operation of a cloud hosted software as a service (SaaS) system, based on cost contributions determined from cloud resource load consumption bound with detected services usage.

A second aspect of the invention relates to a computer program adapted to implement the method of the first aspect of the invention.

In a third aspect, the present invention relates to a system adapted to implement the method of the first aspect of the invention.

BACKGROUND OF THE INVENTION

Considering a cloud-hosted service provider, a problem arises regarding how to share the cost of its virtual infrastructure among the offered services and/or the users that use them, in a fair manner. The term cost is generically used in the present document to refer to any kind of cost pertaining to a computing infrastructure, such as processing, storage, and internal and external transmission cost. It is substantiated in the particular context of a resource: VMs (virtual machines) bear processing cost, disks bear storage cost, while local or wide-area network interfaces bear transmission cost.

The problem of fair cost sharing has been addressed in traditional IP networking infrastructures assuming load measurements both per resource (network link, primarily) and per service/user consuming the resource [1]-[3]. Load measurements are reported per IP packet/flow header, and thus consumption per service and user, as understood by business, can easily (through lookups) be deduced. Indeed, a number of IP network traffic measurement tools like Cisco's NetFlow [4] provide per-flow traffic statistics, while a number of datasets with IP traffic traces are publicly available.

However, unlike networking infrastructures, in cloud-hosted infrastructures load measurements of cloud resources (VMs, storage entities, egress network services or specialized computing platforms) are provided in a systemic aggregate form. They do not include any semantics pointing to the application/service level contributors incurring the reported load (such as header information in IP traffic measurements). Indeed, available cloud monitoring tools like Google's Stackdriver [5] and Elasticsearch's Metricbeat [6] provide a rich set of system-specific metrics which however are totally unaware of the overlying services/users actually using the monitored resources. The lack of explicit bindings between application/service and system layers in cloud-hosted service environments is a challenge that makes the problem at hand unique and most intriguing.

Cloud cost analytics for cloud cost management and optimization is an emerging market, as shown in Table 1 for different prior art proposals currently available in the market.

Existing commercial solutions rely solely on system information, as provided by available cloud monitoring services, and they mainly address cloud-based enterprises rather than as-a-service providers per se. The business value of their analytics is mainly exhausted in the distribution of cloud cost among the different business functions (departments, cost centers, units) of the subscribed enterprises, assuming that enterprises utilize a certain amount of cloud resources per business function. This is done simply by allowing client enterprises to tag the virtual resources per business function, and subsequently by distributing the cloud cost per resource and adding the resulting resource cost per business-function tag. Such an allocation of the cloud cost is more useful to large enterprises like banks than to as-a-service providers, who are naturally interested in knowing the cost per (group of) offered service and/or customer.

It is, therefore, necessary to provide an alternative to the state of the art which covers the gaps found therein, by providing a method which actually provides such knowledge regarding cloud cost per service and/or customer, in order to optimize the operation of a cloud hosted SaaS system.

SUMMARY OF THE INVENTION

To that end, the present invention relates to a computer implemented method for optimizing the operation of a cloud hosted SaaS system, wherein a SaaS provider computing system is operatively connected as a bridge between a cloud provider computing system and at least one customer computing system, to provide at least one customer with SaaS services associated at least to software applications deployed in the cloud provider computing system (i.e., the SaaS provider computing system is a customer that leases resources from the cloud provider computing system), the method comprising performing the following steps:

a) monitoring cloud resource usage regarding said cloud provider computing system, to detect cloud resource load consumption;

b) detecting service usage regarding said applications; and

c) binding the detected cloud resource load consumption with the detected service usage, and determining associated cost contributions based on respective load contributions obtained from said binding, to optimize the operation of the cloud hosted SaaS system based on the determined cost contributions.

For an embodiment, the method of the first aspect of the invention is applied to a plurality of customers, being provided with the SaaS services through respective customer computing systems operatively connected to the SaaS provider computing system, step b) comprising automatically detecting service usage for each of said customers, and step c) comprising automatically binding the detected cloud resource consumption with the detected service usage for each customer, to determine and associate cost contributions to services and customers.

According to an implementation of the above mentioned embodiment, the method comprises performing step b) by embedding at least one tracking code to service provisioning web pages through which the SaaS is provided to the customers, to capture at least customer identifying information of customers accessing said web pages to track application usage for each customer.
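By way of a non-limiting illustration, a tracking code of this kind usually resolves to a request (e.g. a 1x1 'tracking pixel') whose query string conveys the service and customer identifiers; the collector on the receiving side then only needs to parse those parameters into a service-usage record. The parameter names (svc, cust, uid) and the URL below are assumptions of this sketch, not prescribed by the invention:

```python
from urllib.parse import urlparse, parse_qs
from datetime import datetime, timezone

def parse_pixel_hit(url: str) -> dict:
    """Turn one tracking-pixel request URL into a service-usage record.

    The parameter names (svc, cust, uid) are illustrative assumptions;
    a real deployment would use whatever the embedded snippet emits.
    """
    qs = parse_qs(urlparse(url).query)
    return {
        "service": qs["svc"][0],
        "customer": qs["cust"][0],
        "user_uuid": qs.get("uid", [None])[0],
        # invocation timestamp recorded at collection time
        "invoked_at": datetime.now(timezone.utc).isoformat(),
    }

hit = parse_pixel_hit("https://tracker.example/px.gif?svc=report&cust=acme&uid=123e4567")
```

A real deployment would additionally validate the identifiers and persist such records into the service usage dataset described further below.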

For a preferred embodiment, steps a), b) and c) of the method of the first aspect of the invention are performed as follows:

- step a) comprises obtaining cloud resource load time-series per cloud resource;

- step b) comprises capturing service invocation time-series by customers, including said customer information; and

- step c) comprises aligning said cloud resource load time-series with said service invocation time-series within a common time resolution so that services arrived for execution within this resolution complete their execution within this resolution, and calculating the cost contributions per service and customer based on information available during the intervals of the considered resolution in the aligned time-series.
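As a minimal sketch of steps a) to c) under the completion assumption, with illustrative, assumed inputs (invocation timestamps, one aggregate load measurement per charging interval, and a flat price per load unit), the alignment and per-service cost calculation could look as follows:

```python
from collections import Counter, defaultdict

def cost_contributions(invocations, load_series, interval, price_per_load_unit):
    """Proportional cost sharing per charging interval (illustrative sketch).

    invocations: list of (timestamp, service) pairs
    load_series: dict mapping charging-interval start -> measured resource load
    Under the completion assumption, the load of an interval is attributed
    to the services invoked within that same interval, here proportionally
    to their number of invocations.
    """
    per_interval = defaultdict(Counter)
    for t, service in invocations:
        start = (t // interval) * interval          # align to charging interval
        per_interval[start][service] += 1

    contributions = Counter()
    for start, counts in per_interval.items():
        cost = load_series.get(start, 0) * price_per_load_unit
        total = sum(counts.values())
        for service, n in counts.items():
            contributions[service] += cost * n / total
    return dict(contributions)
```

Aggregating the returned per-service figures further per customer follows the same pattern, keyed on (service, customer) instead of service alone.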

Preferably, the above mentioned service invocation time-series also includes identifying information regarding the SaaS provider computing system, the service of the SaaS provider computing system invoked by a customer, the time the service was invoked, and the user who invoked the service, including said customer information.

According to an embodiment, the method of the first aspect of the invention comprises performing the cost contribution calculation for different load ranges through respective different cost penalty functions.

For an implementation of said embodiment, the method comprises distributing the cost penalty of the different cost penalty functions among active services of said customers, at least proportionally to their relative presence in the detections performed at step b), according to a proportional sharing approach.

Preferably, the method of the first aspect of the invention further comprises distributing the cost penalty of the different cost penalty functions among active services of the customers, also based on the demand said active services bring to the cloud resources in the detections performed at step b), according to a weighted-proportional sharing approach.
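A weighted-proportional split of one interval's cost penalty can be sketched as follows; the per-service demand figures are assumptions standing in for whatever demand measure an embodiment derives from the detections of step b):

```python
def share_penalty(penalty, demands):
    """Weighted-proportional split of one interval's cost penalty (sketch).

    demands: dict service -> demand that the service brings to the resource
    (e.g. invocation count times a per-service weight). The weights are
    assumptions of this sketch, not prescribed by the text.
    """
    total = sum(demands.values())
    if total == 0:
        return {s: 0.0 for s in demands}
    return {s: penalty * d / total for s, d in demands.items()}
```

With all demands set equal, this degenerates to the plain proportional sharing approach of the previous embodiment.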

Regarding the above mentioned at least one tracking code, depending on the embodiment it is at least one of a platform-specific service page snippet, a skeleton service page snippet to be completed at the SaaS provider computing system, a service page snippet conveying user UUIDs (universally unique identifiers), and a log-in snippet conveying user UUIDs and email binding.

According to an embodiment, the method of the first aspect of the invention comprises obtaining the above mentioned service invocation time-series by mapping the service invocations into a series of time intervals based on the service invocations chronological order, on their execution time, on a lag between service invocation and execution, and on the length and time location of said intervals.
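The mapping of service invocations into time intervals can be sketched as below, assuming for simplicity a constant lag between invocation and execution and equal-length intervals; both simplifications are assumptions of this sketch only:

```python
def map_to_intervals(invocations, lag, interval_len, period_start):
    """Map service invocations to charging-interval indices (sketch).

    Each invocation time is shifted by an assumed constant lag between
    invocation and execution, then bucketed by interval length relative
    to the start of the billing period.
    """
    buckets = {}
    for t, service in sorted(invocations):        # chronological order
        exec_time = t + lag
        idx = int((exec_time - period_start) // interval_len)
        buckets.setdefault(idx, []).append(service)
    return buckets
```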

Regarding the said cloud resource load time-series, they are obtained, for an embodiment of the method of the first aspect of the invention, from measurement data, regarding one or more metrics of one or more of said cloud resources, included in the time intervals.

With respect to the above mentioned optimization of the operation of the cloud hosted SaaS system based on the determined cost contributions at step c), the method of the first aspect of the present invention comprises performing said optimization, wherein said optimization comprises, for some embodiments, at least one of the following actions:

- cloud resource remapping of the detected services;

- improvement of the implementation of services provided by the SaaS provider computing system; and

- reconfiguration of the network connections between the cloud provider computing system and the at least one customer computing system, through the SaaS provider computing system.

For an embodiment, the method of the first aspect of the invention comprises using the information obtained at step c) to detect software bugs and/or intrusion DoS attacks, and based on the results of said detections performing the above mentioned improvement of the implementation of services provided by the SaaS provider computing system.

A second aspect of the invention relates to a computer program, comprising code instructions that when run in one or more processors implement the method of the first aspect of the invention.

A third aspect of the invention relates to a system for optimizing the operation of a cloud hosted software as a service (SaaS) system, wherein a SaaS provider computing system is operatively connected as a bridge between a cloud provider computing system and at least one customer computing system, to provide at least one customer with SaaS services associated at least to software applications deployed in the cloud provider computing system, the system being adapted to implement the method of the first aspect of the invention, for which purpose the system comprises:

- monitoring means for performing step a);

- detection means for performing step b); and

- processing means for performing step c).

Generally, the system of the third aspect of the invention comprises at least a computing system remotely communicated with the SaaS provider computing system to perform the above described steps of the method of the first aspect of the invention.

The optimization of the operation of the cloud hosted SaaS system is generally performed by the SaaS provider computing system, based on the cost contributions determined at step c) by said computing system remotely communicated therewith. However, for other embodiments, that optimization is performed by one or more other entities of the cloud hosted SaaS system, together with, or alternatively to, the SaaS provider computing system.

The present invention, in its three aspects, goes far beyond current commercial offerings, both in terms of target and business intelligence. It targets cloud-hosted service providers offering a number of services in the mass market, not just multifunctional cloud-based enterprises, while it allows distribution of cloud cost among the offered services and the users/customers using these services, not only among different business functions. The absence, in the prior art proposals, of a connection between application/service usage and cloud infrastructure cost is the key differentiating point of the solution provided by the present invention over said prior art proposals. As a result, cloud-hosted service providers are provided with unprecedented intelligence/information regarding the costliness of their service portfolio and/or their clientele, and thus regarding the effectiveness of the maintained virtual infrastructure, which can be optimized at different levels based on said intelligence/information.

The following advantages are provided by the present invention, for different embodiments, underpinned by two major technical/scientific innovations:

> automated methods for tracing information about users/customers invoking services from the Web sites of service providers in a transparent and safe manner,

> a cost attribution algorithm for sharing the cloud cost per set of resources among (groups of) services and users/customers using these resources in a fair manner, which, compared to the state of the art, bears the following innovative aspects:

■ it bridges the gap between application/service and system layers that inevitably exists in cloud environments,

■ it relies only on resource load measurements, without necessarily requiring load measurements per individual service and user/customer consuming the resource,

■ it is flexible enough to apply different cost sharing policies as appropriate to different types of resources, through simple parameterizations, and

■ it is capable of self-adjusting critical operational aspects for improving the credibility of the cost sharing results according to the specific dynamics of the particular cloud environment where it applies.

The scalable and efficient implementation of the cost attribution algorithm in some embodiments adds to the innovation of the present invention, considering the huge amount of data arriving on-line from tracking probes and resource monitoring agents across a multitude of provider Web sites and cloud instances.

A further aspect of the present invention relates to a computer implemented method for an economical optimization related to a cloud hosted software as a service (SaaS) system, wherein a SaaS provider computing system is operatively connected as a bridge between a cloud provider computing system and at least one customer computing system, to provide at least one customer with SaaS services associated at least to software applications deployed in the cloud provider computing system, the method comprising performing the following steps:

a) monitoring cloud resource usage regarding said cloud provider computing system, to detect cloud resource load consumption;

b) detecting service usage regarding said applications; and

c) binding the detected cloud resource load consumption with the detected service usage, and determining associated cost contributions based on respective load contributions obtained from said binding, to provide cost analytics.

All the embodiments described in the present document regarding the first aspect of the invention are also valid for the above mentioned further aspect of the present invention, in this case with the purpose of achieving an economical optimization related to the SaaS system (such as a fair and appropriate billing for the services provided to each of different customers thereof).

BRIEF DESCRIPTION OF THE FIGURES

In the following some preferred embodiments of the invention will be described with reference to the enclosed figures. They are provided only for illustration purposes without however limiting the scope of the invention.

Figure 1 schematically highlights the added value of the present invention compared to available commercial solutions.

Figure 2 schematically shows an informational architecture related to an implementation of the method of the first aspect of the invention, for an embodiment.

Figure 3 shows, by means of several plots, different cloud load ranges and associated load periods, taken into account to implement the method of the first aspect of the invention, for an embodiment.

Figure 4 schematically shows a functional architecture of the present invention, for an embodiment.

Figure 5 shows different processes, (a) and (b), which are included in the method of the first aspect of the invention, for corresponding embodiments, for tracking service invocations from provider Web sites.

Figure 6 shows two flow charts, (a) and (b), outlining a cost attribution algorithm implementing the method of the first aspect of the present invention, for an embodiment.

Figure 7 outlines a suitable "no-active" algorithm for setting the charging granularity in block A of Figure 6(a), according to an embodiment of the method of the first aspect of the invention.

Figure 8 schematically shows a system and functional architecture of both the method and system of the present invention, for an embodiment.

Figures 9 to 12 refer to some business methods described below.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Figure 1 schematically highlights the added value of the present invention compared to available commercial solutions, identified as "competitors" in the Figure.

As shown in Figure 1, available commercial solutions only provide cloud computational cost distribution per business unit, as they only obtain the cloud resource consumption of a SaaS provider, using typical cloud monitoring services and tag-based aggregation techniques.

In contrast, the present invention provides cloud computational cost distribution also per service, customer and user (and also per business unit), by binding the obtained cloud resource consumption to information obtained by safe service/user tracking, as described above in this document and also below for different embodiments, specifically by adding a tracking code to a web site provided by a SaaS provider to a user through a log-in/service web page.

In the present section, some implementations of the present invention are described, according to specific working embodiments.

In order to facilitate the understanding of the following description, some notions, terminology, assumptions and the rationale underlying the main features of the invention, for said implementations, will first be explained.

Rationale, Notions and Assumptions:

Figure 2 presents the notions and their relationships underlying the present invention, highlighting the invention-specific ones. They are defined and discussed in the following.

Overall:

In the context of a cloud-hosted service provider, the term job is used to refer to an invocation (actual use) of an offered service by a user. It is assumed that users belong to customers (subscribers, retailers etc.) of the service provider and that such a client hierarchy is included in (or can be deduced from) user semantics. As a result, jobs can be classified into various groups through tags that the service provider can assign to its services, customers and users. Below, in this document, means for safely accounting jobs (services invoked by users/customers) based on the 'tracking pixel' technique are described.

The cloud infrastructure entails computational cost that can fall into certain categories like processing, storage and transmission cost categories. The problem is to derive fair shares of the computational cost of a cloud-hosted service provider over an agreed billing period (or another type of period instead of a billing period, in an equivalent manner), among the jobs that arrived during the billing period. The fair proportion of the computational cost per job is called the cost contribution of the job.

By aggregating cost contributions per service, user/customer and resource, the cost contributions per groups of business interest (e.g. large customers for a particular type of service) can subsequently be derived; a very useful insight indeed for service providers.
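The aggregation of per-job cost contributions into groups of business interest described above can be sketched as follows, with an assumed (service, customer) tag pair per job:

```python
from collections import defaultdict

def aggregate(job_costs, tags):
    """Roll up per-job cost contributions into groups of business interest.

    job_costs: dict job_id -> cost contribution of that job
    tags: dict job_id -> (service, customer) tag pair (illustrative choice;
    any tag tuple of business interest could be used instead)
    """
    per_group = defaultdict(float)
    for job, cost in job_costs.items():
        per_group[tags[job]] += cost
    return dict(per_group)
```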

It is taken that there exist reliable monitoring tools for providing load information, in terms of various metrics, for the cloud resources making up the virtual infrastructure of the service provider. Load measurements per resource are produced at a certain granularity, the monitoring granularity.

However, they do not bear any hint to the jobs actually consuming these resources, so the loads per job cannot be known.

Based on the above, two distinct sets of input can be drawn during a billing period: the service usage dataset containing the time-series of the accounted jobs; and, the cloud usage dataset containing the time-series of the load measurement per resource. As already pointed out, these two datasets share no other information linkage but time information. The premise underlying the present invention is that cost contributions depend on respective load contributions and as such, the service and cloud usage datasets need to be logically combined.

Charging resolution:

For combining service and cloud usage information, the so-called completion assumption is put forward. It assumes that there is a time resolution, not necessarily of constant length during a billing period (although a different kind of period instead of a billing period could also be used, in an equivalent manner), such that jobs (services invoked by users/customers) falling into this resolution complete their execution within this resolution. This 'job-duration-including' resolution is called charging resolution herein, its granularity charging granularity and the resulting intervals, which equally space the billing period, charging intervals.

The completion assumption is statistically valid considering that resources alternate between idle and busy periods and hence the average busy time could be taken as a charging granularity. As it will become apparent in the following, smaller charging granularities would be more desirable for avoiding coarse aggregations and therefore enhancing differentiation among load and subsequently cost contributions. This issue is elaborated further below, where an algorithm for dynamically adjusting charging granularities is also presented. Without loss of generality, it is taken that charging granularities are multiples of monitoring granularities as the latter are usually configurable in the order of seconds.

Under the completion assumption, jobs arrive and complete in charging intervals and therefore the load during these intervals can be attributed solely to them. In other words, the active jobs in the infrastructure per charging interval are those that arrived during that interval (no other jobs from previous intervals) and so these jobs induce the measured resource load. Note that resource loads during charging intervals can be derived by suitably aggregating the measured loads over the contained measurement intervals.

Load ranges and penalties:

A parametric peak-load cost sharing scheme [2] is adopted herein, considering that resources may become more expensive during higher loads, especially when reaching their capacity, since upgrade cost also needs to be taken into account. Cost contributions should therefore be differentiated on the extent to which jobs happen to form peak resource loads rather than just being analogous to their respective loads - even when loads per job could be accurately measured. To this end, it is considered that resources operate within certain load ranges, with a cost penalty being assigned to each load range. Load ranges and penalties are defined per resource type e.g. VM, storage or egress network. They are transparent to the service providers, being treated as controlled parameters of our solution approach.

Load ranges can be defined either statically, based on pre-determined thresholds, or dynamically, based on percentile values. For example, CPU utilization measurements of a VM could be split into the static ranges of {[0,40), [40,60), [60,80), [80,100)} or into the dynamic ranges of {[0th-25th), [25th-50th), [50th-75th), [75th-100th] percentiles}, containing all, the top 75%, 50% and 25% of the measurements, respectively. It is deemed that static/dynamic ranges are more suitable for finite/infinite capacity resources, like VMs or egress network, respectively. Considering a specific resource, charging intervals can then be enlisted into the defined load ranges according to their aggregate measured loads and hence the billing period can be partitioned into load periods (Figure 3). For example, there could be lighter, light, moderate or busy periods. In the simplest case, a single load range and therefore a single period could be defined, which could either reflect the busy period of the resource, containing loads above a certain threshold or percentile, or encompass the whole billing period, containing all load values.
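By way of a purely illustrative sketch (function names and the nearest-rank percentile convention are assumptions, not part of the invention), classifying loads into static or dynamic ranges could look as follows:

```python
from bisect import bisect_right

def static_range(load, thresholds):
    """Return the index of the static load range containing `load`.
    thresholds = [40, 60, 80] yields ranges [0,40), [40,60), [60,80), [80,100)."""
    return bisect_right(thresholds, load)

def dynamic_ranges(loads, percentiles=(25, 50, 75)):
    """Compute percentile cut points from the observed loads themselves
    (nearest-rank convention, one of several possible), then classify each load."""
    ordered = sorted(loads)
    n = len(ordered)
    cuts = [ordered[min(n - 1, int(p / 100 * n))] for p in percentiles]
    return [bisect_right(cuts, x) for x in loads]

loads = [10, 35, 45, 55, 65, 85, 90, 20]
static = [static_range(x, [40, 60, 80]) for x in loads]
dynamic = dynamic_ranges(loads)
```

A charging interval whose aggregate load falls in range index 3 would thus belong to the busiest load period under either definition.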

The so-defined notions of load ranges and corresponding periods are flexible enough to capture different types of resources and computational cost; thus, allowing our approach to apply to changing cloud provider policies through simple parameterization. Evidently, load periods have random durations within billing periods depending on how many charging intervals happen to belong to the defined load ranges.

Cost penalties are assigned to charging intervals for reflecting the relative surcharges of operating in a charging interval of a particular load range. They are defined as fixed or load-dependent weights biasing heavier load ranges. For instance, considering four load ranges, cost penalties could be fixed to {1, 2, 4, 8}, correspondingly, or given by the function a^load for increasing base values a (> 1) per load range. Note that dynamic load ranges do not necessarily admit load-dependent penalties; fixed penalties could be given based on suitable functions of the loads at the ends of the ranges. Fixed penalties differentiate cost solely per load range (different loads within a range are considered insignificant) while load-dependent penalties differentiate cost both per range and per load therein.

The penalties per charging interval of the same load period can be added to yield the penalty of the load period throughout its duration. Load periods can then be seen as rectangles having their penalty as area and their duration as length. And so, under the perspective of peak-load sharing, the cost of a resource can be split among them according to their area (penalty) proportion.
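Continuing the illustrative sketch under the same caveats (the fixed penalties {1, 2, 4, 8} and all names are assumptions), the peak-load split of a resource's cost among its load periods by penalty proportion could be expressed as:

```python
def split_cost_by_penalty(interval_ranges, range_penalties, resource_cost):
    """interval_ranges: load-range index per charging interval.
    range_penalties: fixed penalty per load range, e.g. [1, 2, 4, 8].
    Returns the cost share of each load period (one per load range)."""
    period_penalty = [0.0] * len(range_penalties)
    for r in interval_ranges:
        period_penalty[r] += range_penalties[r]   # add interval penalty to its period
    total = sum(period_penalty)
    # Split the resource cost in proportion to the accumulated penalties ("areas").
    return [resource_cost * p / total for p in period_penalty]

# Five charging intervals: two in range 0, one in range 1, two in range 3.
shares = split_cost_by_penalty([0, 0, 1, 3, 3], [1, 2, 4, 8], resource_cost=100.0)
```

In this toy run the two busiest intervals accumulate a penalty of 16 out of 20 and therefore attract 80% of the resource cost.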

Proportional and weighted-proportional sharing:

Having differentiated the cost of operations of resources by means of suitable load periods and associated cost penalties, the cost contribution of a job at a particular resource is set to its fair proportion of the cost penalties throughout the load periods considered. Two expressions can be distinguished: average per interval or per load period as a whole. More specifically, cost contributions at a resource are the fair penalty contributions averaged over all considered charging intervals or load periods. Below in this document, the differences between these two expressions are elaborated.

In the absence of explicit load information per job, two schemes of fairness are considered: the proportional and the weighted-proportional sharing schemes.

Under proportional sharing, fairness is taken to be the relative presence (appearance frequency) of the jobs. As a result, jobs with higher presence during heavier periods are given larger cost contributions provided that the heavier periods are long enough. If the heavier load periods are rather short, cost contributions are merely analogous to frequencies provided that the lighter range penalties do not differentiate substantially among each other.
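A minimal sketch of proportional sharing (the group names and numbers are hypothetical): the penalty of a load period is split among job groups in proportion to their appearance frequency during that period:

```python
def proportional_contributions(presence, period_penalty):
    """presence: {job_group: count of active instances in the load period}.
    Returns each group's share of the period's penalty, proportional to presence."""
    total = sum(presence.values())
    return {g: period_penalty * v / total for g, v in presence.items()}

# Service A appears three times as often as service B in this load period.
contrib = proportional_contributions({"svcA": 3, "svcB": 1}, period_penalty=8.0)
```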

Under weighted-proportional sharing, jobs are not differentiated only on how frequently they use the resources but also on the demand they bring to the resources. That is, whereas in proportional sharing jobs are assumed to bring the same demand, implying that they cost the same, in weighted-proportional sharing jobs are taken to bring different demands and therefore their cost should be accordingly differentiated. Under the completion assumption, the load of a resource during a charging interval is the sum of the demands of the active jobs during this interval. This gives rise to a system of linear equations that will be shown below, one for each charging interval, through which the demand of the jobs may be calculated. Depending on when conditions of unique solutions emerge, job demands may be calculated per service or per service and user/customer. This system of linear equations is called herein the job-demand equations and its solutions the service differentiation weights.

Then, under weighted-proportional sharing, fairness is taken to be the relative service weight of the jobs. As a result, jobs with higher weights (service demand) are given larger cost contributions, provided that all jobs have more or less the same presence profile. Should jobs have different profiles, their cost contributions should be analogous to their total weight (total demand) - the product of their presence and weight. Obviously, when all jobs have the same weight, weighted-proportional sharing reduces to proportional sharing.
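Weighted-proportional sharing can be sketched analogously (again with hypothetical names and values): presence counts are scaled by the service differentiation weights so that cost contributions become analogous to total demand; with equal weights it reduces to plain proportional sharing:

```python
def weighted_contributions(presence, weights, period_penalty):
    """presence: {job_group: active-instance count}; weights: {job_group: relative
    per-job demand}. Each group's share is proportional to presence * weight."""
    total_weight = {g: presence[g] * weights[g] for g in presence}
    total = sum(total_weight.values())
    return {g: period_penalty * w / total for g, w in total_weight.items()}

# Service B jobs each bring twice the demand of service A jobs.
contrib = weighted_contributions({"svcA": 3, "svcB": 1},
                                 {"svcA": 1.0, "svcB": 2.0}, 10.0)
```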

Discussion:

The previous sharing schemes are necessitated by the fact that there are no explicit load measurements per job at cloud resources; and they constitute an innovation of the present invention. Should the loads per job be known in advance, fairness could be based on the relative loads of the jobs or on the Shapley value [3] of the coalition of all jobs, averaging individual load increments.

One might argue that presence is incidental and as such the derived cost contributions will be incidental too and thus of limited credibility. However, the completion assumption makes presence the mere causality of resource loads, i.e. the jobs found present during a charging interval induce the measured load and only those. Furthermore, by averaging presence over the billing period, its incidental nature becomes that of a statistical estimator of means. Hence, the present invention makes presence acquire significance. The credibility of the present invention is subject rather to the validity of the completion assumption than to the variability underlying presence or the estimation of service differentiation weights. An automated procedure for dynamically adjusting the charging granularity so that the completion assumption holds with an increased level of confidence is discussed and described below in this document.

The notions of load ranges and penalties are powerful and flexible enough to support different peak-load sharing policies as appropriate for the particular types of resources. Furthermore, by grounding cost differentiations on penalties per load range and not on individual load measurements, errors due to statistical deviations can be smoothed out. These two aspects are in contrast to existing cost sharing schemes, which mainly address a single type of resource while relying directly on load measurements per resource contributor (traffic of transmission links). Being more general, the present invention reduces to the existing ones if individual load measurements exist, by considering one load range (encompassing the whole billing period or its heavier-load part) and load-dependent penalties given by the identity function.

Last, it is stressed that the present invention targets sharing cloud cost per billing period, not asserting the costliness of (different groups of) jobs. For the latter, a statistical analysis would be required to complement the cost attribution algorithm of the present invention.

Functional Architecture

Figure 4 summarizes the previous analysis by presenting the functional components making up the present invention, which are put into perspective in the following sections.

Through the Service Provider Interface, cloud-hosted service providers provide information about the cloud they maintain and subsequently they can view analytics on the cost of their operations per groups of interest of the services they offer, the users/customers they support and the cloud resources they employ. The Service Invocation Tracker captures service invocations from the Web sites of the service providers based on the 'tracking pixel' technique while the Cloud Usage Collector ingests load measurements for the cloud resources of the service providers by relying on available monitoring APIs such as Stackdriver [5] and Metricbeat [6].

The Charging Resolution Manager and Cost Calculation Algorithm realize the intelligence for deriving fair cost shares per service, user/customer and resource. They are invoked during and, of course, at the end of each billing period (or another type of period instead of a billing period, in an equivalent manner), with the former component determining the length of the charging granularity and aligning within it the collected service and cloud usage traces, and the latter calculating the fair cost contributions. The Load and Cost Aggregator performs cost aggregations across various groups of services, users/customers and resources as required by the service providers.

Tracking Service Invocations by Users:

Service providers are considered to offer an automated service provisioning Web site for presenting their service offerings in the Internet and allowing users to subscribe and subsequently use the offered services as and when they wish. In such a site, besides informational and subscription management pages, there is a log-in page for ensuring proper user access and a number of service pages for allowing users to use the services.

As is standard in Web technology, the pages are supplied by the site and execute, for rendering, at the Web browsers of the users.

Figure 5 sketches the processes employed for tracking service invocations in service provider Web sites. They rely on the 'tracking pixel' technique and are described in the following.

A snippet (a piece of code, called "tracking code" in a previous section) is provided to SaaS providers upon subscription to the service provided by the present invention (usually through a computing system remotely communicated with the SaaS provider), a service that is called "cost analytics" in the present document (and accessed through the Webpage "cost-analytics.com" in the embodiment of Figure 5), for them to embed it into their service pages (such as "serviceprovider.com"). The snippet requests the fetching of a tiny image, a pixel, from a Web site ("cost-tracker.com") of ours (where "ours" refers to the remote computing system providing the subscribed service), piggy-backing, in the HTTP request, the required service invocation information:

Provider that subscribed to cost analytics.

Service of the provider invoked by a user.

Time the service was invoked.

User who invoked the service, including customer information.

The first three elements can easily be provided by the snippet. However, user information cannot be provided because it is usually missing from the service pages. Ideally, user information should also bear information about the customer of the provider to whom the user belongs. Email is a suitable piece of such information, as customer information could be deduced from the domain name (after the @) of the users who log in with their corporate email. Provider information is included in the snippet at the very time it is created. Indeed, upon subscription of a service provider to cost analytics, a UUID (universally unique identifier) is generated and bound to the provider; and so, the snippet can carry this UUID in the redirections (pixel-fetching HTTP requests) it issues. Service information is just the URL of the service pages where the snippet will be embedded, while service invocation timestamps can be drawn when the snippet executes at user browsers or when the redirections are received by the Web site ("cost-tracker.com") of the remote computing system providing the service. User information is known to the Web site ("serviceprovider.com") of the service provider (being a standard part of session information) but it is usually kept solely for session management purposes - it is not copied into the supplied pages. For capturing user information, two processes are provided: an explicit method and a UUID-late-binding method.

In the explicit user tracking process (Fig. 5(a)), a number of platform-specific snippets are provided as appropriate to industry-wide adopted Web site development or content management systems. For completeness, a skeleton snippet is also provided targeting those service providers willing to provide the required user information based on their own efforts.

The UUID-late-binding process (Fig. 5(b)) for tracking user information evolves in two phases, requiring two kinds of snippets: one for the service pages and one for the log-in page. Both snippets cause redirections (pixel-fetching requests) to the Web site ("cost-tracker.com") of the remote computing system providing the service. The service page snippet conveys user UUIDs, which are generated when the snippet executes at user browsers, alongside provider, service and time information as previously specified. The log-in snippet conveys the binding between user UUIDs and emails. It generates user UUIDs at user browsers - guaranteed to be the same as the ones generated by the service page snippet - while it extracts the emails of the users from the log-in page, assuming of course that users log in with their emails. Since the redirections from both snippets are received by the remote computing system providing the service, user UUIDs and emails can be bound. Considering that users cannot access service provider Web sites (such as "serviceprovider.com") without having logged in before, user UUID-email bindings can occur pretty soon, within a day per user. For reducing the volume of redirections, the log-in page snippet could save the emails of the users in the local storage of their browsers so as to send binding redirections only when the emails are not found in the storage.
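For illustration only, the receiving side of the two kinds of redirections could be sketched as below; the endpoint and the query-parameter keys (uid, email, pid, service, t) as well as the data structures are assumptions for the sketch, not the actual protocol of the invention:

```python
from urllib.parse import urlparse, parse_qs

bindings = {}          # user UUID -> (email, customer domain after '@')
invocations = []       # accounted (provider UUID, service, user UUID, time) tuples

def handle_pixel_request(url):
    """Dispatch one pixel-fetching redirection received by the tracking server."""
    q = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
    if "email" in q:   # log-in snippet: bind user UUID to email and customer domain
        bindings[q["uid"]] = (q["email"], q["email"].split("@", 1)[1])
    else:              # service-page snippet: account a service invocation
        invocations.append((q["pid"], q["service"], q["uid"], q["t"]))

# A log-in redirection followed by a service-page redirection for the same user.
handle_pixel_request("https://cost-tracker.com/p.gif?uid=u1&email=joe%40acme.com")
handle_pixel_request("https://cost-tracker.com/p.gif?pid=prov1&service=%2Freports&uid=u1&t=1700000000")
```

After both requests, the service invocation of user u1 can be attributed to customer acme.com via the stored binding.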

To exemplify the above and demonstrate the inseparable role of the tracking subsystem, let Joe be an employee of acme.com who has subscribed to the services offered by saasservice.com. The email of Joe is joe@acme.com and, using this email, he logs in to saasservice.com in order to use the services he wishes. The log-in page of saasservice.com has, among other e.g. authentication functionality, the tracking code (snippet). Then, the tracking code passes to the tracking server (of the remote computing system providing the subscribed service) the domain of Joe, i.e. acme.com, along with a UUID uniquely identifying Joe. The domain corresponds to a particular customer of saasservice.com, in this example acme.com, and the email to a specific user of this customer. From then on, whenever Joe hits a service of saasservice.com, his UUID will be sent to "us" (i.e. to the remote computing system providing the subscribed service) and bound with his previously accounted employer domain. This way the computing system providing the subscribed service (and hence, implementing the present invention) will know that any extra load induced on the cloud layer can be attributed to Joe of acme.com and other similarly-traced users; and, thus, the portion of its cloud cost that corresponds to Joe and/or acme.com in particular can be provided to saasservice.com.

Summarizing, the following kinds of snippets are provided:

Platform-specific service page snippets.

A skeleton service page snippet to be completed by providers.

A service page snippet conveying user UUIDs and a log-in snippet conveying user UUID and email binding.

The specified tracking solutions are secure and workable, since a) their underlying redirection (pixel-fetching) logic is actually served by the service provider Web sites, which are considered trustworthy among their users, and b) they avoid explicit use of cookies, which could be blocked by users or browsers for privacy protection reasons. The solutions will not work if users suppress code execution in their browsers or the loading of images through plain HTML requests from remote domains. However, in these cases a great deal of Web functionality is essentially lost.

The Cost Attribution Algorithm: Input Information:

The cost attribution algorithm runs at the end of each billing period (or, instead, of any other chosen period, in an equivalent manner) for calculating cost contributions per service, user/customer and cloud resource of a service provider. It can also run at any point before the end of a billing period (or chosen period) to provide partial results up to that point. Its input information for a specific service provider is outlined in Table 2 and described in the following.

Table 2: Input information for the cost attribution algorithm

The Service Usage Dataset includes the traces of service invocations by users during a billing period as captured by the tracking system:

(service, user/customer, time)

It is assumed that user information includes information on the client hierarchy of the service provider; though not required by the algorithm as such, it is useful for aggregating the produced cost contributions at application/service level.

The Cloud Usage Dataset includes load measurements -for pre-determined metrics- of the chargeable cloud resources in the infrastructure of the service provider during a billing period:

(resource, metric, time, measurement)

This information is collected through reliable cloud monitoring services like Stackdriver [5] and Metricbeat [6], which can take and report measurements at configurable granularities.

It is assumed that cloud resources are of certain general types (VM, disk, network etc.) and that the semantics of a resource include both its type and its unique identification (id) in the cloud infrastructure of the service provider.

resource ::= <resource type>.<resource id>

Resource type information is necessary for accordingly differentiating algorithmic settings (see below) while it is also useful for providing an additional level of results aggregation. The id of a resource could either be its instance id as given by the cloud provider or a characteristic name given by the service provider. The set of resources of a service provider is drawn automatically from the cloud provider at subscription time through appropriate APIs and it is continuously updated from monitored data, since resources are elastic both in their type and number.
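As a trivial sketch (the dot separator is an assumption drawn from the notation above), resource semantics could be split into type and id as follows:

```python
# Hypothetical sketch: resource ::= <resource type>.<resource id>
# (the separator is assumed to be a dot; the id may itself contain dots,
# hence the single split).

def parse_resource(resource):
    rtype, rid = resource.split(".", 1)
    return rtype, rid

vm_type, vm_id = parse_resource("vm.i-0a1b2c3d")
```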

A resource has a number of metrics reflecting different types of computational cost. E.g. a VM encompasses processing cost through its CPU utilisation and RAM usage metrics, or transmission cost through the number of bytes input or output from its interfaces. Therefore, the metrics required by the algorithm are only a small subset of the plethora of the metrics provided by the available monitoring services. The metrics as reported by the monitoring services are distinguished as being: gauge, delta or cumulative. Obviously, this information is required by the algorithm for performing needed aggregations. The types of resources, the metric(s) to measure, their type and the respective monitoring granularities are known in advance per cloud provider, constituting part of the overall configuration of the cost sharing algorithm:

(cloud provider, resource type, metric, metric type, monitoring granularity)

metric type ::= gauge | delta | cumulative

Tags include a set of tags/labels per individual service, user/customer and resource which are freely assigned by the service provider for enabling analytics per business group of interest:

(service, [tag1, ...])

(user/customer, [tag1, ...])

(resource, [tag1, ...])

Tagging information is required for aggregating cost contributions over services, users/customers and resources (described below); it is transparent to the core of the cost attribution algorithm that derives the individual cost contributions.

Algorithmic settings provide appropriate values for the controlled parameters of the cost attribution algorithm, which based on the discussion provided above in the present document, include the charging granularity, resource load ranges and associated cost penalties:

charging granularity (in seconds)

(resource type, metric, [load_range1, ..., load_rangeR])

(resource type, metric, penalty type, [penalty1, ..., penaltyR] or penalty_function(x))

As the charging granularity can be adjusted dynamically (as will be described below), the input value specifies its initial value; an indicator could also be passed as to whether to stick with the initial value or not. As outlined previously in the present document, load ranges can be specified either in absolute or relative terms while penalties could be fixed or load-dependent:

load range ::= number | percentile

penalty type ::= fixed | load-dependent

In addition, algorithmic settings may include information for allocating invoked services to charging intervals, should such information be provided (described below):

(service type, lag, shift)

Notation:

Considering a cloud-hosted service provider, let:

e denote a chargeable resource.

M_e be the set of metrics of a specific chargeable resource e.

T_B be the duration of a billing period, in seconds, or the duration from the start of the current billing period until the time the algorithm is called to run.

T_C be the fixed duration of a charging interval, in seconds.

C_B be the number of charging intervals during a billing period; obviously,

C_B = T_B / T_C

I_t denote a specific charging interval (of length T_C) during a billing period, t = 1..C_B.

e,m l(t) be the aggregate measured load of metric m at resource e during the charging interval I_t, t = 1..C_B; it is calculated based on the time-series of the respective measurements as specified below.

e,m R be the number of load ranges defined for metric m at resource e.

e,m r(t) be the load range, 1..e,m R, to which the charging interval I_t, t = 1..C_B belongs, according to the load of metric m at resource e during this interval, e,m l(t).

e,m C_r be the number of charging intervals belonging to the load range r = 1..e,m R defined for metric m at resource e - these intervals make up the corresponding load period; obviously,

C_B = Σ_{r=1..e,m R} e,m C_r

e,m p_r(t) be the penalty in the cost of operating in the charging interval I_t, t = 1..C_B that belongs to the load range r = 1..e,m R defined for metric m at resource e; as stated previously in the present document, it can either be fixed, and therefore constant throughout the intervals of the same load range, say e,m p_r, or dependent on the load of the interval.

e,m P_r be the total penalty in the cost of operating throughout the load period r = 1..e,m R of metric m at resource e; obviously,

e,m P_r = Σ_{t: e,m r(t)=r} e,m p_r(t), r = 1..e,m R

V(t)_{s,u} be the number of active instances of service s invoked by user u during the charging interval I_t, t = 1..C_B; it is calculated based on the time-series of the traced service invocations (see description above) as specified below in the present document.

V(t) be the total number of active instances of all services and all users during the charging interval I_t, t = 1..C_B; obviously,

V(t) = Σ_{s,u} V(t)_{s,u}, t = 1..C_B

e,m V_{s,u}^r be the number of active instances of service s invoked by user u during the load period r = 1..e,m R of metric m at resource e.

e,m V^r be the total number of active instances of all services and all users during the load period r = 1..e,m R of metric m at resource e; obviously,

e,m V^r = Σ_{s,u} e,m V_{s,u}^r, r = 1..e,m R

e,m β_{s,u} be the service differentiation weight of service s and user u at resource e for metric m; it is calculated as specified below in the present document.

e,m w_{s,u} be the cost contribution of service s invoked by user u with respect to the consumption of a metric m at resource e; this is the main outcome of the cost attribution algorithm and it is calculated as specified below in the present document.

Service Usage Aggregation:

For determining the active jobs (services invoked by users) per charging interval, V(t)_{s,u}, the captured service invocations need to be mapped into the series of charging intervals in a way that ensures chronological ordering as well as the underlying completion assumption (described above); the jobs allocated to a charging interval need to complete execution during this interval. Furthermore, the mapping needs to take into account the fact that there might be a lag between job invocation and execution. The following parameters are considered:

lag_s be the time elapsed from when a job of service s is invoked until the job starts executing.

shift_s be the minimum time of execution of a job of service s at a resource; in other words, a job is allocated to a charging interval if its admission time (invocation plus lag) is before its shift from the interval end, otherwise the job is allocated to the next charging interval.

Then, considering a job of service s which is invoked at time t, the charging interval during which the job will be considered active in the infrastructure is given by:

(t + lag_s) -> I_t = [st, et), per the charging granularity    (1a)

charging interval of job_s = I_t, if t + lag_s <= et - shift_s; I_{t+1}, if t + lag_s > et - shift_s    (1b)

Using the above formula, the jobs invoked during a billing period are allocated into an appropriate charging interval during which they complete their execution. Then, the number of active jobs in each charging interval, V(t)_{s,u}, can be calculated by a simple counting function.
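The allocation rule can be sketched in a few lines (the 0-based indexing convention and names are assumptions for the sketch):

```python
def charging_interval(t, lag, shift, granularity):
    """Return the 0-based index of the charging interval to which a job of a
    service with the given lag and shift, invoked at time t (seconds from the
    billing-period start), is allocated."""
    admission = t + lag                      # time the job starts executing
    idx = int(admission // granularity)      # interval I = [st, et) containing it
    et = (idx + 1) * granularity
    if admission > et - shift:               # too close to the interval end:
        idx += 1                             # allocate to the next interval
    return idx
```

With both parameters defaulting to zero, a job is simply allocated to the interval containing its invocation time.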

The lag, shift parameters are considered as meta-data that should be provided by the cloud-based service provider, who has an intimate knowledge of the services it provides. This information could be captured by means of a simple questionnaire during subscription. If this information cannot be provided, the parameters will default to zero.

The above assume that accounted invocation times precede execution times, meaning that lag > 0. This could be the case of non-blocking service invocations e.g. when services are invoked through Restful APIs. However, if services are invoked in a batch-mode, then the service page will return to the users after the execution of the invoked service and, as such, accounted invocation times will follow execution times, in which case lag < 0. In any case and given the absence of such intimate information, it all boils down to appropriately setting the length of the charging intervals; suitable self-adjusting means for appropriately setting the charging granularity are described below in the present document.

Cloud Usage Aggregation:

Consider a charging interval, I_t = [st, et), and the set of the measurements of a metric m at resource e that fall into this interval, including the measurement at its right end, in increasing time order, say

{(t_1, l_1), ..., (t_k, l_k), (t_{k+1}, l_{k+1})}

It is assumed that the charging granularity is a multiple of the monitoring granularity, say k times, so we have:

st = t_1 < t_2 < ... < t_k < t_{k+1} = et

t_2 - t_1 = ... = t_{k+1} - t_k = Δ

Then, the aggregate measured load during the charging interval I_t, which encompasses k monitoring intervals, is given as follows, depending on the type of the monitored metric:

metric is gauge: e,m l(t) = (Σ_{j=2..k+1} l_j) / k    (2a)

metric is cumulative: e,m l(t) = l_{k+1} - l_1    (2b)

metric is delta: e,m l(t) = Σ_{j=2..k+1} l_j    (2c)
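One consistent reading of (2a)-(2c) can be sketched as follows (names are assumptions; gauge metrics are averaged, cumulative counters are differenced, delta increments are summed):

```python
def aggregate_load(metric_type, samples):
    """samples: measurements (l_1, ..., l_{k+1}) at the k+1 monitoring instants
    t_1 = st, ..., t_{k+1} = et bounding the k monitoring sub-intervals of a
    charging interval."""
    k = len(samples) - 1
    if metric_type == "gauge":        # (2a) average level over the interval
        return sum(samples[1:]) / k
    if metric_type == "cumulative":   # (2b) growth of a monotone counter
        return samples[-1] - samples[0]
    if metric_type == "delta":        # (2c) sum of per-sub-interval increments
        return sum(samples[1:])
    raise ValueError(metric_type)
```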

As the charging intervals are taken to be left-closed-right-open, the time-series of the load over these intervals (the x-axis has their start times, each being the end time of the previous interval) forms a right-continuous step-function.

Service Differentiation Weights:

Jobs bring different amounts of load at the various resources, depending on their particular service requirements, and therefore their cost contribution needs to be accordingly differentiated. However, explicit load measurements per job cannot be provided in a cloud environment and, as such, load differentiation among jobs can only be estimated from available information.

Consider a specific resource, a particular metric and the respective time-series of the active jobs and load, cf. (1a,b), (2a,b,c) - resource and metric indices are omitted:

{V(t)_{s,u}, ∀s, ∀u, t = 1..C_B} and {l(t), t = 1..C_B}

By means of the completion assumption (described above), the load of a resource during a charging interval is attributed only to the active jobs during this interval. Thus, considering that jobs per service s and user u bring the same demand, say β_{s,u}, the following equations hold:

job-demand equations; same demand per service and user:

Σ_s Σ_u V(t)_{s,u} β_{s,u} = l(t), t = 1..C_B    (3a)

The above constitutes a system of linear equations with unknowns {β_{s,u}, ∀s, ∀u}.

Given the underlying assumptions, the yielded loads per service and user, if a solution exists, are treated as weights for relatively differentiating the jobs according to their induced load, not as their load per se.

Being of statistical nature (the coefficients V(t)_{s,u} are observations of random variables), the job-demand equations are considered as traces rather than as hard constraints. As such, the system cannot be inconsistent. Mathematically speaking, if the coefficient matrix is of full column rank, a least squares solution (minimizing the error over all equations) can always be determined; otherwise, some of the unknowns will be undetermined and an infinitude of solutions exists. The mathematical condition of full column rank means that a solution can be found if:

a) more equations (traces) than unknowns: the number of distinct service and user pairs is at most the number of charging intervals, and

b) columns are linearly independent: the jobs per service and user appear independently of each other i.e. the appearance of jobs per service and user does not follow the appearance of jobs from other services and users.

If one of the above conditions does not hold, the system is said not to include sufficient information for admitting a single (least squares) solution. In that case, a new linear system is formed by aggregating jobs per service and customer (over all users of the customers); note that user semantics include customer information (see above in the present document). If a single solution still cannot be determined, jobs are finally aggregated per service only (over all customers and their users). If the system at this level still lacks sufficient information, the trivial solution, that of setting all weights to 1, is returned. All this is mathematically depicted by the following:

job— demand equations; same demand per service and customer:

Σ_{s,c} (Σ_{u∈c} V(t)_{s,u}) β_{s,c} = l(t), t = 1..C_B (3b)

job— demand equations; same demand per service:

Σ_s (Σ_u V(t)_{s,u}) β_s = l(t), t = 1..C_B (3c)

trivial solution:

β_{s,u} = 1, ∀s, ∀u

Evidently, the number of services is less than the number of distinct service and customer pairs, which in turn is less than the number of distinct service and user pairs. Therefore, if the first condition (more equations than unknowns) for having a single (least squares) solution does not hold at a specific job aggregation level, it will also not hold at the lower levels. However, no such assertion can be made for the second condition (linear independence): independence might not hold at a specific level but may hold at lower or even higher levels. Intuitively, the likelihood of independence increases at the lower, more detailed, levels. However, this depends on the length of the charging intervals; e.g., the invocation behaviour of a specific service among different customers may be the same over larger intervals. Based on the above, in order to select the system to solve for determining the service differentiation weights, the first condition is checked top-down across the job aggregation levels and then the second condition bottom-up. There are well-known methods, within the realm of basic linear algebra, for checking linear independence and subsequently determining a least squares solution; their efficient and numerically-stable implementation in emerging Big Data processing platforms remains a challenge.
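A minimal, pure-Python sketch of this selection step, assuming the system at a given aggregation level is handed over as a coefficient matrix V (one row per charging interval, one column per job group) and a load vector (all names are illustrative; a production implementation would use a numerically stable distributed solver, as noted above):

```python
def _solve(A, b):
    """Gauss-Jordan elimination with partial pivoting; returns x or None if singular."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None  # singular: no unique solution
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def solve_demand_weights(V, loads):
    """Least-squares fit of the job-demand equations V beta = loads via the
    normal equations (V^T V) beta = V^T loads. Returns None when V^T V is
    singular, i.e. V lacks full column rank, signalling that jobs should be
    aggregated to the next coarser level (service+customer, then service,
    then the trivial all-ones solution)."""
    cols = len(V[0])
    VtV = [[sum(row[i] * row[j] for row in V) for j in range(cols)] for i in range(cols)]
    Vtl = [sum(row[i] * l for row, l in zip(V, loads)) for i in range(cols)]
    return _solve(VtV, Vtl)
```

A caller would try the (service, user) level first and fall back level by level whenever `None` is returned.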

Cost Contributions for a Single Resource and Metric:

With reference to the notation introduced above, cost contributions per service and user at a specific resource with respect to the consumption of a particular metric are defined as in the following (resource and metric indices are omitted).

Proportional sharing:

interval-based cost contributions:

w^(1)_{s,u} = Σ_r Σ_{t: r(t)=r} p_r(t) (V(t)_{s,u} / V(t)) / Σ_r P_r (4a)

range-based cost contributions:

w^(2)_{s,u} = Σ_r P_r (V^r_{s,u} / V^r) / Σ_r P_r (4b)

Evidently, either form satisfies the unity constraint:

Σ_{s,u} w_{s,u} = 1

As can be seen, the two forms above differ in the granularity at which the cost penalty is distributed among the active jobs according to their relative presence. The first form does the distribution at a fine time-grain, per charging interval, whereas the second at a coarser time-grain, per load period (the union of the charging intervals belonging to the same load range). This is more evident in the case where the penalties per load range are fixed; in this case, the interval-based cost contributions become:

interval-based cost contributions; fixed penalties:

w^(1)_{s,u} = Σ_r p_r Σ_{t: r(t)=r} (V(t)_{s,u} / V(t)) / Σ_r p_r C_r (4a1)

In the special case where the cost penalties are fixed, all the same and not zero, i.e. p_r(t) = p and P_r = pC_r, r = 1..R, we have:

same fixed penalties:

w^(1)_{s,u} = (1/C_B) Σ_t V(t)_{s,u} / V(t)

w^(2)_{s,u} = Σ_r (C_r / C_B) (V^r_{s,u} / V^r)

So, in this case, the first form yields as cost contribution of a specific job the average of the frequencies at which this job appears in all charging intervals, whereas the second form yields the average of the frequencies at which this job appears in the load periods, weighted by the number of intervals in each load period.

In the particular case where only one load range is considered, not necessarily encompassing all charging intervals of a billing period but a subset of C^1 specific heavier-load intervals, we have:

single load range, fixed penalty:

w^(1)_{s,u} = (1/C^1) Σ_{t in range} V(t)_{s,u} / V(t)

w^(2)_{s,u} = V^1_{s,u} / V^1

So, in this case, the first form yields as cost contribution of a specific job the average of the frequencies at which this job appears in each charging interval that happens to be in the so-defined busy period, whereas the second form yields the overall frequency at which this job appears throughout the busy period.
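The two proportional-sharing forms can be sketched for a single resource and metric as follows (data layout and names are illustrative assumptions; load-dependent penalties are handed over per interval, with the per-range totals P_r computed internally):

```python
def cost_contributions(counts, ranges, penalty):
    """Proportional-sharing cost contributions for one resource and metric.

    counts:  per charging interval t, a dict {job: V(t)_job} of active-job counts
    ranges:  ranges[t] = load range r(t) of interval t
    penalty: penalty[t] = cost penalty p_{r(t)}(t) of interval t

    Returns (interval_based, range_based) contribution dicts.
    """
    total = sum(penalty)  # equals the sum of the per-range totals P_r
    jobs = {j for c in counts for j in c}
    # Form 1: share each interval's penalty by the jobs' presence in that interval.
    w1 = {j: sum(p * c.get(j, 0) / sum(c.values())
                 for c, p in zip(counts, penalty) if c) / total
          for j in jobs}
    # Form 2: share each range's total penalty P_r by presence over the load period.
    w2 = {j: 0.0 for j in jobs}
    for r in set(ranges):
        ts = [t for t, rr in enumerate(ranges) if rr == r]
        P_r = sum(penalty[t] for t in ts)
        V_r = sum(sum(counts[t].values()) for t in ts)
        for j in jobs:
            V_rj = sum(counts[t].get(j, 0) for t in ts)
            w2[j] += P_r * V_rj / V_r / total
    return w1, w2
```

Both returned dicts satisfy the unity constraint, but the two forms generally differ whenever the total job presence varies across the intervals of a load period.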

In the case of fixed penalties, cf. (4a1), (4b), the defined cost contribution forms are equivalent if the following holds:

fixed penalties; equivalence condition:

(1/C_r) Σ_{t: r(t)=r} V(t)_{s,u} / V(t) = V^r_{s,u} / V^r, ∀r

That is, if the average frequency of a particular job across all charging intervals of a load period is equal to the frequency of the job during the whole load period. A sufficient condition for this to hold is that the total occurrences per interval, V(t), are all equal.

Considering that the appearance frequency of a particular job during a charging interval is a random variable, the Laws of Large Numbers guarantee that its average will converge to the true mean as the number of intervals (samples) grows, and hence to the frequency over a load period, if the period is large enough. Thus, we can state that when fixed penalties are considered, the interval-based cost contributions converge to the range-based cost contributions if the number of intervals per load range is large enough; that is, the equivalence condition holds asymptotically:

fixed penalties; asymptotics:

w^(1)_{s,u} → w^(2)_{s,u} as C_r → ∞, ∀r

fixed penalties; conjecture:

w^(1)_{s,u} ≈ w^(2)_{s,u}

Since the second form is rather a limit of the first form, it makes sense to adopt the first form on grounds of capturing temporal accuracy during billing periods, provided of course that the charging intervals are large enough that the distribution of job presence is statistically similar among them. If this is not the case, the second form is preferred in order to overcome temporal statistical deviations. The decision as to which form to use can thus be automatically determined by the cost attribution algorithm in accordance with its self-made adjustments of the charging granularity (described further below).

Note that in the case of load-dependent penalties the above asymptotic behaviour does not hold, since the sequence of p_r(t)'s may not converge in t.
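The equivalence and convergence claims above can be checked numerically; a small sketch with synthetic i.i.d. job counts (two hypothetical jobs "a" and "b", a single load range, and fixed equal penalties, so the common penalty factor cancels):

```python
import random

def two_forms(counts):
    """Interval-based vs. range-based contribution of job "a" under fixed,
    equal penalties and a single load range (the penalty factor cancels)."""
    T = len(counts)
    # Form 1: average of the per-interval appearance frequencies.
    w1 = sum(c["a"] / (c["a"] + c["b"]) for c in counts) / T
    # Form 2: appearance frequency over the whole load period.
    w2 = sum(c["a"] for c in counts) / sum(c["a"] + c["b"] for c in counts)
    return w1, w2
```

With equal per-interval totals V(t) the two forms coincide (the sufficient condition above), and with symmetric i.i.d. draws they converge as the number of intervals grows.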

Weighted-proportional sharing:

Compared to the previous case, job frequencies are replaced by the relative total weights of the jobs, i.e. the relative total service demand brought into the infrastructure per job:

V(t)_{s,u} replaced by β_{s,u} V(t)_{s,u}

V^r_{s,u} replaced by β_{s,u} V^r_{s,u}

As previously, relative total weights can be calculated per distinct charging interval or per load period. So, we have:

interval-based cost contributions:

w^(1)_{s,u} = Σ_r Σ_{t: r(t)=r} p_r(t) (β_{s,u} V(t)_{s,u} / Σ_{s,u} β_{s,u} V(t)_{s,u}) / Σ_r P_r (5a)

range-based cost contributions:

w^(2)_{s,u} = Σ_r P_r (β_{s,u} V^r_{s,u} / Σ_{s,u} β_{s,u} V^r_{s,u}) / Σ_r P_r (5b)

Evidently, when all service differentiation weights β_{s,u} are equal, i.e. all jobs weigh and cost the same, the above formulae reduce to the ones of the previous case. The following special cases are worth stating.

interval-based cost contributions; fixed penalties:

w^(1)_{s,u} = Σ_r p_r Σ_{t: r(t)=r} (β_{s,u} V(t)_{s,u} / Σ_{s,u} β_{s,u} V(t)_{s,u}) / Σ_r p_r C_r

same fixed penalties:

w^(1)_{s,u} = (1/C_B) Σ_t β_{s,u} V(t)_{s,u} / Σ_{s,u} β_{s,u} V(t)_{s,u}

w^(2)_{s,u} = Σ_r (C_r / C_B) (β_{s,u} V^r_{s,u} / Σ_{s,u} β_{s,u} V^r_{s,u})

single load range, fixed penalty:

w^(1)_{s,u} = (1/C^1) Σ_{t in range} β_{s,u} V(t)_{s,u} / Σ_{s,u} β_{s,u} V(t)_{s,u}

w^(2)_{s,u} = β_{s,u} V^1_{s,u} / Σ_{s,u} β_{s,u} V^1_{s,u}

Similarly to the previous case, the two expressions of cost contributions asymptotically converge for large load periods in the case of fixed penalties.

Aggregating Cost Contributions:

The cost contributions as calculated by (4a,b) and (5a,b) are at the most granular level: per service, user and resource. Being percentages, they cannot be added unless they refer to the same kind of computational cost. Subject to that, the contribution of a specific group of services and/or users to a particular kind of computational cost over a set of resources is obtained by adding the respective cost contributions. With reference to the notation introduced previously in this document we have:

Cost contribution of sets of services S and users U for a set of resources E:

W_{S,U} = Σ_{e∈E} Σ_{s∈S, u∈U} w^e_{s,u} (6a)
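A minimal sketch of this aggregation, assuming the individual contributions are stored keyed by (resource, service, user) and all refer to the same kind of computational cost (names are illustrative):

```python
def aggregate_contribution(w, services, users, resources):
    """Total contribution of service set S and user set U over resource
    set E, given per-(resource, service, user) contributions
    w[(e, s, u)] that all refer to the same kind of computational cost."""
    return sum(v for (e, s, u), v in w.items()
               if e in resources and s in services and u in users)
```

Restricting any of the three sets simply filters the terms that enter the sum.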

Algorithm Outline:

Figure 6 summarizes the foregoing by presenting the outline of the cost attribution algorithm, for an embodiment. Two parts are distinguished: the core part (view (a)), resulting in individual cost contributions per resource, service and user/customer (described above), and the aggregation part (view (b)), yielding the contributions of specific groups of services and users/customers to the computational cost of particular sets of resources (described above). Their efficient implementation in the Spark distributed processing platform (described below) adds to the innovative aspects of the present invention.

The core of the cost attribution algorithm can run in a batch fashion (for an entire billing period) or in an on-line, incremental fashion (several times during a billing period, e.g. daily). In the latter case, since the calculation of cost contributions is not associative, the corresponding part (see step C) needs to be re-run every time from the beginning of the billing period, as does the part for calculating the service differentiation weights (step B).

As shown in Figure 6(a), the core part comprises getting cloud usage and service use dataset for the period, together with algorithmic settings, as described above in the present document, and continues with step A for alignment to charging granularity where a loads-jobs time-series is created, then step B for service weights calculation where a weights database is created, then step C for cost contributions calculation where individual cost contributions per resource, service, user/customer are calculated.

Regarding the aggregation part of the algorithm, as indicated in Figure 6(b), it starts from the current and/or past billing periods, and comprises getting the tags described above in the present document, followed by a cost contributions aggregation according to equation (6a) above, to output a cost attributions dataset. As stated above in this document, in alternative embodiments, other types of periods, shorter or longer than the billing period, can be used in a manner equivalent to what is explained herein for the billing period.

Discussion on Charging Granularity:

The choice of the granularity of the charging intervals so that the completion assumption holds (see above) is central to the present invention and is therefore a significant factor affecting the credibility of the yielded cost contributions.

As the charging granularity gets larger, the completion assumption becomes more likely to hold; conversely, finer charging granularities are likely to make the completion assumption invalid. The length of the charging granularity should also be seen from the perspective of the load ranges, which are introduced for differentiating cost sharing based on peak loads. If charging intervals are too large, peak loads may be diluted by the calculations for deriving the aggregate load (see above). The choice of the optimal charging granularity is therefore a trade-off between the validity of the completion assumption and the loss of peak-load information.

Information on the duration of each job executed during a billing period would be helpful; however, this cannot be known a priori and is hard to measure from system statistics, since very complex mappings between user-submitted jobs and system processes/threads are required, which mostly depend on the specificities of the architecture of the particular software system in place. Furthermore, it cannot be taken for granted that all jobs of the same service will have the same execution times on an empty machine, since execution times depend on input and therefore on user behaviour.

The above discussion indicates that the optimal setting of the charging granularity could only be determined through benchmark tests in particular provider environments. And, based on the benchmark results, the optimal setting of the charging granularity could be automated through appropriate ML techniques.

A criterion for dynamically adjusting the charging granularity could be the proportion of the charging intervals with significant load but with no (or a very small number of) active jobs. Such intervals indicate that the completion assumption does not hold and therefore the charging granularity should increase. Along this line, Figure 7 outlines a suitable algorithm for setting the charging granularity. This algorithm could run as part of a training phase for getting an initial estimate of the charging granularity, as well as alongside the cost attribution algorithm (in block A of Figure 6(a)) for verifying the validity of the initial estimate and recalculating cost contributions if new estimates are deemed more valid. The meaning of each step of the algorithm of Figure 7 and of the terminology used is described in the same Figure 7 (by the corresponding legends included therein), which shows how the granularity G is increased (under the constraint that G+1 remain below a maximum charging granularity D) when the percentage of no-active intervals (i.e. those without active jobs) is above a tolerable percentage indicated by P.
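The adjustment loop can be sketched as follows (function and parameter names are hypothetical; `no_active_fraction` stands for the measured fraction of significant-load intervals without active jobs at a given granularity):

```python
def tune_granularity(no_active_fraction, G, D, P):
    """Sketch of the Figure-7 loop: grow the charging granularity G while
    the fraction of intervals that carry significant load but have no
    active jobs exceeds the tolerance P, never exceeding the maximum D.

    no_active_fraction(g) -> fraction of such intervals at granularity g.
    """
    while no_active_fraction(G) > P and G + 1 <= D:
        G += 1  # coarser intervals make the completion assumption likelier to hold
    return G
```

The same loop can be re-run alongside the cost attribution algorithm to re-validate the estimate as new monitoring data arrives.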

Output Information:

Table 3 specifies the output information of the cost attribution algorithm, for the embodiments described in the present section. Two datasets are produced:

- one with the individual cost contributions per service, user/customer and resource, resulting from the core part (Figure 6(a)) of the algorithm, and

- one with the aggregate cost contributions per desired group of services, users/customers or resources, resulting from the aggregation part (Figure 6(b)) of the algorithm.

Table 3: Output information of the cost attribution algorithm

System Architecture:

With reference to the functional architecture presented above with reference to Figure 4, Figure 8 provides an overall system view depicting the main computational blocks implementing the specified functionality, while Table 4 presents the underlying implementation technologies.

The system will run at a remote computing system providing, on a subscription basis, its cost analytics as-a-service to various cloud-hosted SaaS systems, allowing them to use the provided information to optimize the operation of the cloud-hosted SaaS system (as described above) or, optionally, also to perform optimization actions and/or create and provide optimization advice.

The legends included in Figure 8 identify the different entities or sub-systems involved in the system of the invention and also describe the actions implemented thereby and indicate the data communication between them (by means of the depicted arrow lines and terms included thereon) according to the method of the present invention, which have been described above in the present document.

Table 4: System technologies

A person skilled in the art could introduce changes and modifications in the embodiments described without departing from the scope of the invention as it is defined in the attached claims.

Business methods

What is CostNous?

CostNous: a method and an implementation of a cloud-based service for operators of complex IT/network infrastructure and their multi-tenant clients/services, allowing them to:

- compute current costs, revenues and profits based on actual usage records,

- compute future costs, revenues and profits based on forecasted usage,

- conduct "what-if" analysis on anticipated costs, revenues and profits in view of real business scenarios such as customer-base/usage changes, resource breakdowns, site expansion etc., and

- optimise investment on infrastructure based on current and future usage costs, revenues and profits.

Problem Description

Large virtual or physical infrastructure operators need to know how to optimally price their service offering in order to stay competitive and continue to attract new customers, while avoiding churn of existing ones.

As seen in Figure 9, the problem of charging the right price for business services offered online can be quite complex, as each time a customer uses a business service, her action cascades activities across a number of subsystems that ultimately end up consuming resources, each with an associated cost. The consumed resources can be physical (first layer from the bottom of Figure 9), virtual (second layer from the bottom of Figure 9), or licensed software (third layer from the bottom of Figure 9).

Depending on the type of company and service offering, the enterprise may own/lease all or some of the layers shown in Figure 9.

In order for an enterprise to know the exact cost each service/customer is incurring on their resources, they need to collect logs from all infrastructure layers they own. However, logs that show customer-related information are not always correlated with the logs related to resource usage. Consequently, enterprises end up averaging everything out, i.e. dividing their total costs from each layer (see invoice arrows in Figure 9) by the number of customers using their services and some kind of overall volume/transaction-based usage. Unfortunately, such gross oversimplifications lead to imprecise/unfair pricing, churn risk and higher infrastructure costs:

Unfair pricing: Customers that use the offered services during peak times can impose orders of magnitude higher cost on the enterprise compared to customers that make a similar amount of use during non-peak times (see L. Gyarmati, R. Stanojevic, M. Sirivianos, N. Laoutaris, "Sharing the Cost of Backbone Networks: Cui Bono?," in Proc. of ACM IMC'12, and R. Stanojevic, N. Laoutaris, P. Rodriguez, "On Economic Heavy Hitters: Shapley value analysis of 95th-percentile pricing," in Proc. of ACM IMC'10).

High churn & missed opportunities: Not accurately knowing the cost and profitability of each customer makes it difficult for enterprises to offer smart discounts to keep good customers in, or attract new ones from the competition. It also makes it difficult to explain to demanding customers that impose a high cost on the infrastructure why they need to pay more than average charge rates.

Dilemma of either paying for a higher infrastructure budget or risking breaking SLAs: If customer/service usage is not correlated with resource usage, CTI managers and engineers find it extremely hard to predict what resources will be needed in the near future. That leads to over-provisioning in order to be safe and not break any Service Level Agreements. Over-provisioning can significantly lower profitability and, in the case of startups, can deplete cash reserves.

What does CostNous offer?

CostNous provides valid answers to a number of business-critical questions. The term valid is used to stress the fact that CostNous provides its answers by analyzing customer usage data combined with system resource usage data. CostNous utilizes big data technologies in order to work on raw data rather than starting from higher aggregation models which inevitably ignore the data's structural and temporal dynamics.

CostNous offers the following features:

Automatic Data Ingestion System (ADIS) for systems hosted on public clouds

A unique component of CostNous is ADIS, an intelligent component that allows CostNous to automatically integrate with any virtual datacenter hosted on a supported public cloud. Immediately after an IT administrator provides ADIS with the appropriate credentials for accessing her virtual datacenter and adds a few lines of JavaScript code to their web portal, ADIS automatically starts collecting customer and resource usage data. As soon as a sufficient amount of data is collected, ADIS executes a statistical analysis subroutine in order to correlate the impact of each business service on the virtual datacenter resources (e.g. CPU, memory, storage, network, GPU, etc.). Furthermore, ADIS is capable of processing invoices sent to the user by the cloud provider in order to assign a unit cost to each system resource. ADIS minimizes the effort and costs associated with integrating CostNous with any virtual datacenter hosted on a public cloud provider.

Actual cost calculation, profitability assessment and correct pricing

CostNous calculates the cost per customer/service utilizing peak-load information, following the intuition that the cost of a resource increases during high-utilization times, much like a hotel room is more expensive during high season. Cost calculations based on total or average resource consumption may be misleading, since they silently assume that all customers/services consume resources in a uniform way. Instead, CostNous seeks to identify the contribution of customers/services to resource peaks and accordingly derive the total cost of serving/offering customers/services. This way, the most costly customers/services are not necessarily the heavy-hitters but those who drive resources to operate at increased loads and therefore those who actually necessitate costly infrastructure upgrades. The actual profitability per customer and service can then be assessed by comparing the charges applied by the billing system of the provider with the actual cost of operations as deduced by CostNous. This is helpful for renegotiating prices in order to retain the most profitable customers and explain their bill to the costlier ones. Furthermore, by being able to calculate the actual cost of operations, correct prices can be given to peer or infrastructure-sharing providers such as CDNs and VSPs.

What-if scenarios based on realistic synthetic customer usage

CostNous allows the definition of various what-if scenarios regarding hypothetical usage. Based on scenario settings, hypothetical loads per customer and service at each resource are generated and the anticipated cost, revenues and profits are calculated. For increasing the validity of the produced results, the hypothetical loads are generated at the raw time-scale and in a way that mimics the dynamics of the respective actual customer behavior which has been decoded during a stochastic analysis phase.

Hypothetical loads regarding existing customers/services result in the generation of suitably adjusted usage data time-series as per scenario settings, with the adjustments being applied uniformly so as not to spoil the usage pattern dynamics. Hypothetical loads regarding new customers/services can also be inferred. When new customer or service loads are calculated, CostNous generates separate and non-identical usage data time-series per resource, which are statistically close to a similar real-world customer or service. The hypothetical loads of new and existing customers/services are then consolidated and the anticipated traffic of the scenario is produced at the raw time granularity of the historical data (usage & resource logs) ingested into the system.

Investment optimization based on current or scenario-driven anticipated traffic

By following the evolution of the current traffic and by calculating the anticipated traffic in the context of what-if scenarios, CostNous can guide investment and procurement. In particular, CostNous can determine:

- the minimum amount of investment required for all resources to operate within desired levels of utilization such that QoS is not compromised,

- the most critical resources that need to be upgraded so that the total expenditure does not exceed a given budget, and

- the optimal time to undertake a resource upgrade, considering that the resource upgrade unit cost may be time-varying.

The above aspects are cast as mathematical optimization problems which are then solved by appropriate numerical techniques. The information utilized in these optimization problems, expressed in the form of constraints and variables, reflects the upgrades that need to be done at each resource given the anticipated peak load, so that the utilization per resource remains within desirable levels and QoS is not compromised.

It should be stressed that CostNous undertakes investment optimization adopting a holistic view, that is, considering all resources shared by the customers/services, not resources of particular use. The effect of traffic loads from ingress resources on downstream resources, e.g. from an access point to transport (back-hauling links) and then to virtual computing resources and core network resources, is accounted for, and thus the required investment is deduced at all resource levels. Evidently, by taking into account the timed journey of the traffic per customer/service across resources, not only does the total required investment become more accurate, but it can also be prioritized both in time and per particular resource.
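The per-resource sizing behind these optimization problems can be sketched as follows (all names and the target-utilization rule are illustrative assumptions, not the exact patented formulation): each resource must end up with enough capacity that its anticipated peak load stays within the desired utilization level.

```python
def required_upgrades(peak_load, capacity, target_util, unit_cost):
    """For each resource, the capacity needed so that the anticipated peak
    load stays within the target utilization, and the resulting investment.

    Returns (extra capacity needed per resource, total investment).
    """
    upgrades = {}
    for r in peak_load:
        needed = peak_load[r] / target_util           # capacity keeping peak at target utilization
        upgrades[r] = max(0.0, needed - capacity[r])  # only buy what is missing
    total = sum(upgrades[r] * unit_cost[r] for r in upgrades)
    return upgrades, total
```

Budget caps and time-varying unit costs would enter as additional constraints on top of this per-resource sizing.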

Service/product bundle/inventory optimisation

CostNous exploits its traffic forecast capability on a per-customer and per-service basis to also offer product/service inventory optimisation functionalities. The operator of CostNous can select all or a subset of its services to be optimally bundled together. The operator then indicates one or more capacity constraints on the amount of resources that can be made available to the resulting service bundle (or, equivalently, the product inventory in use cases that involve non-IT services, see later). CostNous then solves an appropriate knapsack optimisation problem to identify a subset of the services to be included in the bundle such that the overall profit from them is optimised in view of the constraints on how much of each service can be hosted. To solve this problem, CostNous utilises its fine-grained view of service profit margins, together with its future forecasts, and its ability to solve various types of knapsack optimisation.
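The bundle selection can be sketched as a 0/1 knapsack over forecasted per-service profit and resource demand (a minimal single-constraint sketch with hypothetical data; CostNous's actual solver and its multiple capacity constraints are not specified here):

```python
def bundle_profit(services, capacity):
    """0/1 knapsack: pick the subset of services maximizing total profit
    subject to a single integer capacity constraint on the bundle.

    services: list of (name, resource_demand, forecast_profit) tuples.
    Returns (max profit, set of selected service names).
    """
    best = [0] * (capacity + 1)            # best[c] = max profit using capacity c
    choice = [set() for _ in range(capacity + 1)]
    for name, demand, profit in services:
        for c in range(capacity, demand - 1, -1):  # descend so each service is used once
            if best[c - demand] + profit > best[c]:
                best[c] = best[c - demand] + profit
                choice[c] = choice[c - demand] | {name}
    return best[capacity], choice[capacity]
```

With services "crm" (demand 3, profit 5), "mail" (2, 3) and "bi" (4, 6) under capacity 5, the optimal bundle is {crm, mail} with profit 8.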

Use Cases

For Infrastructure Operators

Datacenters / Cloud Providers

Public Cloud providers and owners of large datacenters can use CostNous in the following scenarios:

• Infrastructure Cost Optimization: Cloud and virtual datacenter operators can accurately identify the cost impact of each customer on their physical resources. For example, they can identify customers whose usage patterns allow for resource overcommitment, meaning that they can service more of them on the same physical resources, leading to CAPEX savings.

• Aggressive Customer Acquisition-Churn Prevention: Operators can target with special discounts customers who are identified by CostNous as having a lower cost impact than the average.

• Budget Optimization as a Service: Cloud Providers can offer the cost optimization, what-if analysis and forecasting capabilities of CostNous as a paid service to their customers. See following use cases on how their customers can use CostNous.

Financial clearance operators and FinTech Industry

Financial clearance operators as well as other FinTech service providers can use CostNous in the following scenarios:

• Evaluate Data Center Capacity under Financial Crisis: The usage patterns of financial transactions change dramatically when an actual or perceived economic crisis occurs. Using CostNous' what-if analysis tool, users can simulate the transaction usage of specific services (e.g. increased ATM usage during capital controls) and see whether their resources will be able to maintain the needed performance SLAs.

• Aggressive International Customer Acquisition: Potential large customers (for example, banks wanting to outsource their credit card clearance) that operate in different time zones, and whose resource demand therefore peaks when the majority of other customers' usage is low, can be identified with CostNous what-if analysis.

Such customers can be offered aggressive discounts.

Network Operators

• Optimize Mobile Telephony Base Station Maintenance: 3G/4G Base Station maintenance costs can be greatly reduced by using CostNous to forecast the usage and profitability of each base station. For example the stations that are forecasted to have the highest demands on mobile data traffic and, at the same time, are used by the highest paying customers will be prioritized first to upgrade from 3G to 4G.

• Competitive Virtual Network Operators Offering: CostNous makes it possible for network operators to offer virtual slices of their mobile and fixed infrastructure to Virtual Network Operators at the lowest possible pricing. Using the CostNous what-if scenario module, an operator can predict the cost impact of hosting a virtual network operator on its access, backhaul and core network. With CostNous the operator can offer competitive pricing to the virtual operator while maintaining profitability.

• Inter-operator Traffic Fees Negotiation: CostNous' historical and forecasted cost impact analysis for different types of service usage on an operator's network is an invaluable tool when operators negotiate transit traffic fees among themselves. With CostNous, operators can decide which transit traffic generates the highest cost for them and demand the appropriate fees from the relevant operators.

• Optimal Large Business Customer Price Discount: It is typical for large companies to continuously negotiate their network and telephony contracts. With CostNous, the operator's account managers know exactly what their maximum possible discount is for a business customer when they negotiate complex telecommunication contracts involving multilayered fixed and mobile services.

• Optimized Retail Pricing for Different Contract Types: Marketing managers can view the total historical and forecasted cost for each type of contracted subscriber and retail service offering. With CostNous forecasting and what-if scenarios they can see what the profitability impact will be before launching a new service offering, or when it is time to revise their pricing.

• Promotional Campaign Infrastructure Impact: Marketing, in collaboration with the operator's IT and network departments, can use CostNous what-if scenarios to understand what impact a promotional campaign will have on their infrastructure. For example, a marketing campaign that gives short messages for free during Christmas may be a great promotional activity, but the operator's SMSC (Short Message Service Center) may not be capable of coping with the additional load, creating a negative customer experience and backfiring on the campaign.

Non-IT infrastructure Operators (utilities, transportation, etc.)

CostNous can be used with non-IT complex infrastructures and networks, providing the same services it provides to IT-based organisations. Electricity, water, petroleum, and gas networks can use CostNous to compute the costs put on the network by different large customers, as well as to conduct what-if and optimisation studies. Unlike IT-based cases, where CostNous can read the usage of different customers from headers in network packets or from process IDs, in the non-IT cases CostNous relies on smart metering at ingress or egress points of the network. Combining these with additional topological and routing information about the network, CostNous uses Network Tomography to compute an estimate of the path taken by each flow from provider to customer. Once this is known, CostNous can, as in the IT case, compute the strain put on the infrastructure by each and every flow of any customer.

Inventory Optimisation for store chains

A retail chain can use the CostNous Inventory Optimisation functionality to optimise the product inventory for each one of its stores. In this use case, a CostNous service is mapped to the (sales) traffic of a particular product at each particular store, ingested into CostNous from the accounting software of the retailer. By knowing the profit margins of each product, CostNous can compute the overall profit under different inventory stocks kept at each store. The knapsack solver of CostNous can then be used to identify the level of stock for each item to be kept at each store in order to maximize profits based on the forecasted product sales.

For Customers of Cloud Infrastructure Providers

Cloud-hosted Web Services An increasing number of companies, from startups to multinationals, rely on the public cloud to host their service offerings over the internet. They are typically referred to as Over The Top (OTT) service providers. OTTs can use the online version of CostNous and integrate with it effortlessly and seamlessly via the ADIS component. Upon successful integration, CostNous automatically collects resource and service usage statistics to perform its cost and forecasting analysis. OTTs can utilize CostNous as follows:

• Reduce Virtual Infrastructure Cost by pre-allocating Virtual Resources: Many public cloud providers offer discounts if a customer can guarantee sustained use of their virtual resources for at least 6-12 months. OTTs can use CostNous to forecast their resource needs for the next 12 months and thus enjoy special discounts by pre-allocating their virtual resources.

• Optimize Profitability per Service/Customer: Using the historical and forecasted cost allocation of CostNous, OTTs can tune their pricing for each service they offer over the web or even for specific customers, depending on the impact they have on the cost of their virtual infrastructure.

• Identify Non-Profitable Customers: Due to the complexities involved in calculating unit costs for resources and then assigning total costs to each customer, OTTs rely on oversimplifications and averaged-out costs. Thus, if OTTs simply follow the common practice of dividing the virtual infrastructure costs by the number of customers and how many times each used a service, every customer may seem profitable. CostNous, with its fine-grained cost allocation, identifies profitability discrepancies among customers, discovering customers that are not profitable. For example, a Customer Relationship Management (CRM) as a Service company can use CostNous to see how the workload from each of its customers impacts the costs of the underlying cloud infrastructure, according to the cloud provider's pricing scheme on number of VMs, data traffic in/out, etc.

Mobile Virtual Network Operators

Through CostNous, an MVNO can understand how different retail offerings (e.g. prepaid with free night calls vs. prepaid without free night calls) impact the fees that the MVNO has to pay the mobile network operator(s) from which it obtains the mobile network infrastructure.

For 3rd party Analysts

Consulting companies Consulting companies are the primary contractors for customer profitability analysis, investment optimisation, and what-if commercial planning for infrastructure providers. What consulting companies typically do is develop custom, "one-off" analyses based on the data provided by the customer together with their own domain knowledge. CostNous can help consulting companies perform these tasks more efficiently by avoiding re-inventing the wheel every time. It also allows them to deliver to the customer, in addition to their reports and recommendations, a working service that is continuously updated with new data.

Block Diagram

CostNous functionality is decomposed into a number of components depicted in Figure 10 below and described in the following paragraphs:

Automatic Data Ingestion System (ADIS)

ADIS ensures a fast and seamless integration of CostNous with the user's virtual datacenter. The datacenter's administrator deploys and provides access to a minimal set of JavaScript utilities and system agents that feed CostNous with streams of customer and resource usage data. This is as simple as adding Google Analytics scripts to a website for audience monitoring.

Unit Cost Statistical Analysis

As soon as ADIS ingests enough data into CostNous, the system performs a statistical analysis that correlates customer usage with resource usage. This "training" period can last from a few hours to a few days, depending on the use case and the forecasting horizon sought. The accuracy of the system keeps improving as more data flow in. By ingesting the billing information related to the cost of the datacenter's resources, ADIS can accurately identify the unit cost of each resource and allocate the cost impact to each customer and service offering.
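The correlation of customer usage with aggregate resource load can be sketched as a least-squares fit that recovers per-invocation unit loads. The two-service set-up, data and function names below are illustrative assumptions, not the actual CostNous analysis.

```python
def estimate_unit_loads(usage, load):
    """Least-squares fit of per-invocation resource load for two services.
    usage: list of (n1, n2) invocation counts per interval; load: aggregate
    resource load per interval. Illustrative stand-in for the statistical
    correlation step described above."""
    # Normal equations for load ~ u1*n1 + u2*n2 (no intercept)
    s11 = sum(n1 * n1 for n1, _ in usage)
    s12 = sum(n1 * n2 for n1, n2 in usage)
    s22 = sum(n2 * n2 for _, n2 in usage)
    b1 = sum(n1 * y for (n1, _), y in zip(usage, load))
    b2 = sum(n2 * y for (_, n2), y in zip(usage, load))
    det = s11 * s22 - s12 * s12
    u1 = (b1 * s22 - b2 * s12) / det
    u2 = (s11 * b2 - s12 * b1) / det
    return u1, u2

# synthetic intervals: service A incurs 2 load units/call, service B incurs 5
usage = [(10, 2), (4, 8), (7, 5), (1, 9)]
load = [2 * a + 5 * b for a, b in usage]
u_a, u_b = estimate_unit_loads(usage, load)
# multiplying by the billed price per load unit yields a unit cost per invocation
```

Real data would be noisy and cover many services, but the principle, regressing aggregate load on service invocation counts, is the same.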

Customer Usage Stochastic Analyzer

The Customer Usage Stochastic Analyzer performs a statistical analysis on the time-stamped customer usage logs. Its purpose is to ascertain the stochastic nature of each service's usage per customer, in order to reveal the random characteristics of the customer usage dataset and its evolution over time. In addition, during this stage a model capable of generating realistic usage sequences is trained on historical customer usage data, based on the stochastic insights gained in the previous step.

Realistic Usage Generator

The Usage Generator utilizes the realistic usage generation model to produce customer or service usage sequences (synthetic time-stamped data) that mimic the time-dependent dynamics of the historical usage data, thereby generating forecasted usage data. The newly generated data are stochastically close to the actual observed usage. The synthetic data generation process is as unbiased as possible with respect to underlying probability distribution assumptions. The Realistic Usage Generator runs either automatically once a day or during a what-if scenario simulation, and produces a new customer/service/resource usage dataset that looks similar to the historical dataset but with future timestamps and forecasted values.

Usage Forecasting Evolution

The Usage Forecasting Evolution is a set of elaborate stochastic ML algorithms for enhancing the stochastic characterization of usage data beyond the level of plain statistics. This component is an automated, intelligent means of spotting notable statistical changes in the evolution of usage data. The Usage Forecasting Evolution algorithms minimize the manual effort required from data scientists to continuously maintain high prediction accuracy in the Customer Usage Stochastic Analyzer's model, and ensure that as usage patterns change over time, the Customer Usage Stochastic Analyzer adjusts accordingly.

What-If Scenario Simulator

The What-If Scenario Simulator allows business users to make adjustments to the forecasted usage data generation model by inserting business scenarios, such as:

• How will the forecasted usage per customer and per service be affected if a service offering is promoted by a marketing campaign that causes a 10% increase in new customers as well as a 5% increase in its use by existing customers?

• What will be the forecasted usage of a new large B2B customer?

• What will be the forecasted usage of a new service offering?

Upon running a What-If Scenario Simulation, the Realistic Usage Generator is executed, intelligently translating vague business attributes, such as "expect a 10% increase in customers who use service x", into actual fine-grained forecasted usage data.

Upon execution of a What-If Scenario Simulation, the user can view reports showing the adjusted cost and revenue forecast per customer, service and resource.

Investment Optimizer

The Investment Optimizer accepts as input saved What-If Scenario Simulations and can answer specific investment optimization questions in relation to datacenter capacity. The Investment Optimizer can answer questions like:

• Given a specific set of SLAs and the forecasted What-If Simulation, do I need to upgrade any of my resources, and when?

• If any resources must be upgraded, what is the minimum budget needed to satisfy all SLAs?

• If the available budget is lower than the expected minimum, what is the optimal budget distribution among all needed resources?

• Given unit cost evolution predictions and purchasing constraints, what is the optimal time to purchase the needed resources in order to achieve the lowest possible total cost?

Prior-Art and Advancements

Automatic Data Ingestion System

Prior-art

Various log analytics and security analytics companies, such as AlertLogic (www.alertlogic.com), Sumo Logic (https://www.sumologic.com/), Tenable (https://www.tenable.com/) and Splunk (https://www.splunk.com/), provide a plethora of tools for ingesting and correlating logs. However, all of them focus either on system logs for security and performance metrics, or on business-related logs for marketing- and sales-related activities.

Advancement

The CostNous ADIS system, like those of the vendors mentioned above, provides standard utilities to ingest system-related data; however, it differs in the automation of customer usage data ingestion for public cloud based systems, something that typically requires a lot of manual effort with the systems and vendors mentioned before. ADIS provides an innovative script-based mechanism to ingest customer-related data, intercepting customer usage traffic at the point of demarcation between the business service layer and the software applications layer (see Figure 9). Furthermore, it automates the correlation between system and usage data by performing a statistical analysis that reveals how customer usage affects resource usage (see Actual Cost Calculation below).

Actual Cost Calculation

Prior-art

Several metrics for sharing cost among a number of competitors (customers/services) in a fair manner, based on peak aggregate traffic loads, have been specified and their validity tested. These metrics range from simple proportional shares of the aggregate peak traffic at the resource to combinatorially complex metrics such as the Shapley value. These works address the problem of peak-load-based cost sharing in simple reference models.
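As a concrete illustration of such peak-load-based cost-sharing metrics, the following sketch computes Shapley-value shares in a toy reference model where a coalition's cost is taken proportional to its peak aggregate load. The player names, loads and cost model are hypothetical, chosen only to make the mechanics visible.

```python
from itertools import permutations

def shapley_peak_cost(loads, total_cost):
    """Shapley-value split of a capacity cost driven by peak aggregate load.
    loads: {player: [load per time slot]}. A coalition's cost is assumed
    proportional to the peak of its summed load (a simplified reference
    model in the spirit of the cited prior art)."""
    players = list(loads)
    slots = len(next(iter(loads.values())))

    def coalition_peak(coal):
        if not coal:
            return 0.0
        return max(sum(loads[p][t] for p in coal) for t in range(slots))

    grand_peak = coalition_peak(players)
    shares = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:             # average marginal contribution over orders
        seen = []
        for p in order:
            before = coalition_peak(seen)
            seen.append(p)
            shares[p] += coalition_peak(seen) - before
    # convert average marginal peak contributions into monetary shares
    return {p: total_cost * (v / len(perms)) / grand_peak
            for p, v in shares.items()}

# two customers peaking at different times share the capacity cost equally
loads = {"A": [10, 0], "B": [0, 10]}
shares = shapley_peak_cost(loads, total_cost=100.0)
```

Note the combinatorial cost: the number of permutations grows factorially with the number of players, which is exactly why such metrics are described as combinatorially complex.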

Advancement

CostNous extends the state of the art by specifying fair peak-load-based cost metrics per customer/service over time periods during which resources may hit peak loads at several different times, at each of which a different mix of customers/services may exist. The dependency graph between the resources is also taken into account.

Considering a particular customer/service mix, its fair actual cost is, in general, time-varying, and therefore the question is how the costs accounted at individual times are aggregated to produce a total cost figure that is fair among all competing customers/services. CostNous adopts a moving-window approach with memory from previous windows (à la exponential smoothing), followed by a scoring mechanism for normalizing the resulting cost figures among all customers/services and the types of resources they used, noting that different resources may have different operational value or upgrade cost.
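A minimal sketch of such a moving-window aggregation with exponential-smoothing memory follows; the smoothing factor and per-window figures are illustrative assumptions, not CostNous's actual parameters.

```python
def smoothed_cost_contributions(window_costs, alpha=0.3):
    """Aggregate per-window cost figures with memory of previous windows,
    exponential-smoothing style. window_costs: chronological list of
    {customer: fair cost in that window}. alpha is an assumed smoothing
    factor; the real CostNous scheme also normalizes across resources."""
    smoothed = {}
    for window in window_costs:
        for customer, cost in window.items():
            prev = smoothed.get(customer, cost)  # seed with first observation
            smoothed[customer] = alpha * cost + (1 - alpha) * prev
    return smoothed

# per-window fair cost figures for two customers over three windows
windows = [{"A": 10.0, "B": 2.0}, {"A": 6.0, "B": 4.0}, {"A": 8.0, "B": 4.0}]
scores = smoothed_cost_contributions(windows)
```

The memory term (1 - alpha) keeps a customer's total cost figure stable across windows while still following genuine shifts in its load contribution.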

The fair allocation of cost to mixes of customers/services based on resource peak loads over the evolving traffic continuum constitutes a scientific advancement of CostNous.

What-if scenarios for cost, revenue and profit assessment

Prior-art

What-if scenarios for cost, revenue and profit assessment usually focus either on business or on system analytics (e.g. SAS www.sas.com, Webtrends www.webtrends.com, Google Analytics http://analytics.google.com, VMware vRealize https://www.vmware.com/support/pubs/vmware-vrealize-suite-pubs.html). As such, they have restricted scope and do not utilize the correlation of system and business information as input.

Advancement

In complex multiple-resource, multiple-access domains, the starting point for assessing the cost, revenue and profitability of operations should be the raw traffic records, much as in retail domains the starting point is the number and cost of sales per commodity. CostNous indeed commoditizes such complex service provider domains into fine-grained commodities: customers, services and mixes of them. By analyzing available raw usage and system information, it derives valid estimates of their future demand according to various business scenarios (valid in the sense of following domain dynamics) and subsequently their actual cost, revenues and profitability. The creation of a complete view of business economics at fine-grained commodity level, in current or hypothetical contexts, based solely on actual usage data, together with the wealth of insights that can be distilled from it, constitutes a unique advantage of CostNous.

Investment optimization

Prior-art

In current business practice, the problem of determining the minimum amount of investment required for an infrastructure upgrade, given the anticipated traffic load, is dealt with at various levels by different specialized systems. Capacity planning systems (Neubrain http://www.neubrain.com/solutions/telecom-utilities-analytics) are mainly concerned with basic transmission services and related networking resources (nodes, links, firewalls, etc.); cloud management systems (VMware vRealize https://www.vmware.com/support/pubs/vmware-vrealize-suite-pubs.html) deal with virtual computing resources; while particular management systems deal with the resources dedicated to the provisioning of specific services, e.g. VoIP services. Depending on the type of service, these systems may rely on average load measurements (as is usually the case in networking services), therefore missing the peak load information which normatively should trigger resource upgrading.

Advancement

CostNous performs investment optimization by taking a holistic view, horizontally, across all resources deployed in a provider domain, removing the need for different specialized systems. Also, CostNous considers timely peak-load information at the raw-data time-scale, for all customer/service commodities competing across all resources, maximizing the validity of the produced results. The incorporation of all resources in unified investment optimization models, their resolution based on anticipated peak-load information and the resulting timely upgrade plan constitute a unique advantage of CostNous.

Time-Series Analysis and Forecasting

Prior-art

The topic of time-series analysis and forecasting has been extensively studied by the research community, as it is of interest in various application domains: telecoms, econometrics and energy consumption, to name a few. A number of general-purpose forecasting methods have been proposed, ranging from simple moving averaging, exponential smoothing and regression models to more complex autoregression, ML- and neural-network-based models. Broadly speaking, all these models try to best fit a curve to the time-series, expressed as a set of numerical equations relating present and past values; and, based on the fitted curve, they predict the next values of the time-series. Forecasting models are generally highly parameterized, with the specification of the parameters that best fit a particular time-series being by far a non-trivial task; a 'one-fits-all' paradigm is simply not applicable. For this, suitable methodologies involving manually-undertaken time-series analysis and/or more automated training phases have been proposed. Indicatively, we refer to the Box-Jenkins methodology for parameterizing ARIMA models, which relies on observing certain aspects of the time-series, such as stationarity and seasonality, through appropriate statistical charts. Existing data analytics suites, e.g. by SAS or SAP, and programming languages provide rich support for viewing time-series dynamics and implementing known forecasting models.

A fully automated approach for selecting and parameterizing the most appropriate forecasting model for a time-series at hand is generally missing. Different forecasting models see the time-series from their own perspectives; however, they all stand on certain presumptions, such as stationarity or the existence of white noise, which unfortunately do not hold in reality.

Advancement

CostNous uses state-of-the-art forecasting techniques and models. Traffic is first analyzed from stochastic perspectives to reveal its essential random characteristics and their evolution over time. The analysis is carried out at multiple time-scales; for example, assuming a minutely raw time-scale, half-hourly, hourly, daily, weekly and monthly time-scales are considered. A number of useful insights can thus be revealed to guide the selection of the most appropriate forecasting models and the determination of their parameters. Such insights at each time-scale include: identification of seasonality (recurring patterns) and stationarity (time-independent evolution); quantification of the magnitude of outliers; distribution of values around their mean or in inter-quantile intervals; and relationships between different traffic variables, or of the same variable among time-intervals (autocorrelation). For a given variable, the relationship between its values at a lower time-scale and at a higher time-scale, where a certain number of values is aggregated, is also analyzed. The stochastic analysis will reveal the time-scale(s) at which a stable (the same) statistical behavior is exhibited and where seasonality or stationarity can clearly be identified. Note that telecom traffic usually exhibits seasonality over weekly periods and stationarity during monthly periods.
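One such stochastic insight, aggregating a raw minutely series to a higher time-scale and checking for seasonality via autocorrelation at the candidate seasonal lag, can be sketched as follows. The perfectly periodic synthetic traffic is an illustrative assumption to make the effect visible.

```python
def aggregate(series, factor):
    """Sum consecutive groups of `factor` values (lower -> higher time-scale)."""
    return [sum(series[i:i + factor]) for i in range(0, len(series), factor)]

def autocorr(series, lag):
    """Plain sample autocorrelation of a series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

# synthetic "minutely" traffic for one day with a 60-minute recurring pattern
minutely = [(t % 60) + 1 for t in range(60 * 24)]
hourly = aggregate(minutely, 60)   # hourly totals are identical: stable scale
r = autocorr(minutely, 60)         # high autocorrelation at the seasonal lag
```

A high autocorrelation at the candidate lag flags seasonality at that time-scale, while identical aggregated values at the hourly scale indicate the stable statistical behavior mentioned above.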

Forecasting is then done at multiple time-scales, from the higher time-scale where seasonality and stationarity are more evident down through lower time-scales until the raw time-scale. Different models may apply at different time-scales according to the insights of the stochastic analysis. As higher time-scales aggregate a certain number of values from the lower time-scale, the forecasted values at each level should be adjusted so that their aggregate complies with the forecasted values at the higher level. In addition to using state-of-the-art forecasting models at each time-scale, CostNous employs a new breed of models relying on the notion of relationship-patterns. A relationship-pattern expresses a particular relationship among a certain number of consecutive time-series values at a particular time-scale. Such relationships may include: the ratios of the values over their sum, e.g. ratios of hourly values over the daily total, or of daily values over the weekly total; or the ratios of consecutive values, e.g. ratios of minute-by-minute values during an hour of a day.

By their definition, relationship-patterns have one degree of freedom, meaning that one piece of information (the sum or the first value) suffices to determine the values over which the pattern extends. This makes them suitable for a multiple-time-scale forecasting approach. Furthermore, relationship-patterns are useful from many angles. First, they reflect autocorrelation in a compound manner, avoiding calculations of individual autocorrelation measures. Second, they are handy for discovering seasonality by spotting periodic occurrences of the same patterns. Third, they are not affected by non-stationarity, because they express how consecutive values relate to each other rather than regressing in time in absolute terms. Last, they are mathematically tractable, since they can be treated as vectors.
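The one-degree-of-freedom property can be sketched with the ratios-over-sum relationship-pattern: the pattern plus a single forecasted total at the higher time-scale reconstructs all lower time-scale values. The bucket sizes and figures are illustrative.

```python
def ratio_pattern(values):
    """Relationship-pattern: ratios of the values over their sum."""
    total = sum(values)
    return [v / total for v in values]

def apply_pattern(pattern, total):
    """One degree of freedom: the sum alone reconstructs every value."""
    return [r * total for r in pattern]

daily = [120, 80, 200, 100]             # four 6-hour buckets of one day
pattern = ratio_pattern(daily)          # e.g. [0.24, 0.16, 0.40, 0.20]
forecast_daily_total = 600              # forecast at the higher time-scale
forecast_buckets = apply_pattern(pattern, forecast_daily_total)
```

By construction, the reconstructed lower time-scale values always aggregate exactly to the higher time-scale forecast, which is the consistency requirement stated above.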

By laying down the relationship-patterns at a given time-scale, smaller, frequently appearing patterns can be identified. It is expected that pattern variability will increase at lower time-scales. CostNous seeks to find such frequent patterns, as well as the rules or transition probabilities underlying their succession. To this end, CostNous employs an ML algorithm for pattern evolution mining. This algorithm resembles the known ML algorithms for association rule mining, with the striking difference, however, that the set of items over which to identify frequent itemsets is not known a priori but needs to be discovered dynamically; in this set-up, the items correspond to sub-patterns of any length above two.

By viewing the time-series at a particular time-scale as a sequence (chain) of patterns (under a specific relationship), forecasting is then about determining the next pattern to appear. This selection is based on probabilistic pattern succession rules mined during a training phase (based on historical data), as well as on rules mined from the recent history of the time-series. If significant changes are noted between the two, i.e. between the expected and the actually observed, suitable notifications are produced, signaling anomalies in the expected traffic evolution. Such anomalies need not necessarily be treated as negative events. For example, a change in the weekly patterns may indicate an increase in sales, while a change in the daily patterns may indicate a change in customer behavior. The model by default trusts the currently noted evolution and adjusts itself accordingly; however, it can be told to stick to the expected evolution patterns, e.g. if the changes were attributed to an outage.

In general, pattern forecasting is of a probabilistic nature, resulting in a set of patterns that are likely to occur. A number of techniques are then applied to forecast the constituent values and related error intervals; and the one that has proven most accurate (the fewest errors) is selected.
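Pattern-succession forecasting, reduced to a first-order chain of mined transition probabilities, can be sketched as follows. The two-pattern alphabet and data are illustrative; the actual CostNous rules are richer and include sub-pattern discovery.

```python
from collections import Counter, defaultdict

def mine_succession_rules(pattern_chain):
    """Count pattern -> next-pattern transitions and turn them into
    probabilistic succession rules (a minimal first-order illustration)."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(pattern_chain, pattern_chain[1:]):
        counts[cur][nxt] += 1
    return {cur: {p: c / sum(cnt.values()) for p, c in cnt.items()}
            for cur, cnt in counts.items()}

def most_likely_next(rules, current):
    """Forecast the next pattern as the most probable successor."""
    return max(rules[current].items(), key=lambda kv: kv[1])[0]

# a day-type chain: W = workday usage pattern, E = weekend usage pattern
chain = list("WWWWWEEWWWWWEEWWWWWEE")
rules = mine_succession_rules(chain)
nxt = most_likely_next(rules, "E")
```

Comparing the rules mined on recent history against those mined during training is what enables the anomaly notifications described above.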

The multiple-time-scale, pattern-oriented forecasting models of CostNous constitute a scientific advancement, given that state-of-the-art algorithms are predominantly single-value-oriented.

Realistic Traffic Generation

Prior-art

The field of traffic modeling and generation has been extensively studied, especially in the context of telecom traffic: ATM earlier, IP and Web traffic nowadays. In general, existing traffic generation models assume a priori a certain probability distribution and use actual traffic measurements to estimate its parameters. Such parametric models essentially produce traffic sequences at steady state (equilibrium), silently assuming that the traffic process will eventually converge to a specific distribution. Realistic traffic generation models, on the other hand, do not aim at producing traffic at steady state; instead, they aim at reliably reproducing the transient evolution of the traffic, i.e. its dynamics time by time. In addition to capturing and parameterizing the model of the traffic process (in its transient or steady-state evolution, analytically or empirically/numerically), the problem of traffic generation entails another problem: that of generating random sequences from a general distribution (PRNG, pseudo-random number generation). This problem is not trivial even for known distributions, and even for the normal or uniform distributions for which generators are provided in existing computer systems. Most of the proposed methods rely on uniform number generators, and therefore their ability to generate sequences following the random characteristics of a general distribution primarily depends on the ability of the used generators to credibly produce uniformly distributed sequences.

The EU co-funded research project ONTIC has developed an approach and suitable techniques for realistic traffic generation in the previous sense. The proposed approach considers traffic generation at multiple time-scales and relies on non-parametric statistical inference and the commonly used inversion method for sampling from general distributions. Traffic is first generated at the higher time-scale where statistical convergence is first noticed, and then in a top-down fashion until the raw time-scale. As each time-scale aggregates (sums) a certain number of values from the lower time-scale, the proposed top-down traffic generation approach entails resolving a set of so-called linear stochastic equations; to this end, suitable methods have been proposed and tested.
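The inversion method over a non-parametric (empirical) distribution can be sketched as follows. This is a simplified single-time-scale illustration with unsmoothed order statistics, not the ONTIC implementation.

```python
import random

def empirical_inverse_sampler(sample, seed=42):
    """Inversion-method sampling from the empirical CDF of an observed
    sample: a uniform draw u is mapped through the inverse empirical CDF
    (here, plain order statistics without smoothing)."""
    order = sorted(sample)
    rng = random.Random(seed)   # fixed seed for a reproducible sketch
    n = len(order)

    def draw():
        u = rng.random()                       # uniform in [0, 1)
        return order[min(int(u * n), n - 1)]   # inverse empirical CDF
    return draw

observed = [3, 7, 7, 7, 12, 15, 15, 20, 20, 20]
draw = empirical_inverse_sampler(observed)
synthetic = [draw() for _ in range(1000)]      # value frequencies follow the sample
```

Because no distribution family is assumed, the generated values reproduce the observed frequencies directly, which is the non-parametric property the ONTIC approach relies on.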

Advancement

Adhering to its multiple-time-scale and non-parametric nature, CostNous enhances the ONTIC approach to realistic traffic generation by incorporating correlation information. Evidently, by reproducing the observed correlations, the 'realistic-ness' of the generated synthetic traffic is increased, which is the ultimate target.

As outlined in the forecasting section, CostNous captures correlation information at each time-scale, and between successive time-scales, in the form of relationship-patterns. This information is utilized in resolving the aggregation constraints (linear stochastic equations) when moving down the time-scales. In particular, the representative patterns and related succession rules that the stochastic analysis has identified at a time-scale are applied to yield the traffic values at each level based on the generated values at the level above; recall that patterns have one degree of freedom. By alternating the succession of the identified patterns, within the rules, different synthetic traffic sequences can be generated, all bearing the same statistical characteristics, including closeness to an original sequence; note that a sample of size n corresponds to n factorial ordered sequences.

The multiple-time-scale, non-parametric traffic generation method explicitly utilizing correlation information constitutes a scientific advancement of CostNous over ONTIC and, consequently, over other conventional (parametric) traffic generation methods, which generate traffic as a sequence of values drawn independently from an (a priori) assumed common distribution.

CostNous On-Line: Cost Analytics for Cloud-hosted Service Providers

Abstract

This document specifies the architecture, techniques and algorithms for attributing the cost of the cloud infrastructure of a service provider to services and users/customers based on two distinct sets of information: a) service invocation times by users/customers, and b) measurements of certain metrics that represent the aggregate, across all services and users, consumption of chargeable cloud resources. Specifically, the solution approach involves three main steps:

> First, the service invocation times including also user/customer information are captured from the service provisioning Web site of the service provider through appropriate 'tracking pixel' techniques.

> Second, the service invocation time-series and the load time-series per chargeable cloud resource, as provided by available cloud monitoring tools, are aligned to a common time resolution, chosen so that services arriving for execution within an interval of this resolution also complete their execution within it.

> And third, the cost contributions per service and user/customer are calculated based on the information available during the intervals of the considered resolution. A parametric peak-load cost sharing scheme is adopted to differentiate cost contributions according to the extent to which services per user/customer contribute to heavier resource loads. As such, the rationale behind the calculation of cost contributions lies along two axes:

o First, to differentiate the cost of operations according to different load ranges (the heavier the load, the more expensive) by means of suitably defined cost penalty functions; and,

o Subsequently, to distribute the cost penalty among the active services of users/customers proportionally to their relative presence given that there is no clear information (measurements) regarding the resource consumption incurred per service and user/customer.
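The steps above can be sketched end to end as follows. The penalty function, interval data and job names are illustrative assumptions, not the actual CostNous parameterization.

```python
def cost_contributions(intervals, penalty, total_cost):
    """Parametric peak-load cost sharing as described above. Each interval
    (one slot of the common time resolution) carries the aggregate resource
    load and the set of active jobs (service/customer). The penalty function
    makes heavier loads costlier; within an interval the penalty is split
    equally among the active jobs, since per-job consumption is not measured."""
    raw = {}
    for load, active_jobs in intervals:
        if not active_jobs:
            continue
        w = penalty(load) / len(active_jobs)   # proportional presence
        for job in active_jobs:
            raw[job] = raw.get(job, 0.0) + w
    total_w = sum(raw.values())
    # cost contribution = share of total penalty; attributed cost in money
    return {job: total_cost * w / total_w for job, w in raw.items()}

# hypothetical (aggregate load, active jobs) per aligned interval
intervals = [
    (0.9, ["svcA/cust1", "svcB/cust2"]),   # near-peak interval
    (0.2, ["svcA/cust1"]),                 # light-load interval
]
shares = cost_contributions(intervals, penalty=lambda x: x ** 2,
                            total_cost=100.0)
```

A convex penalty such as the squared load used here is one way to make near-peak intervals disproportionately expensive; the scheme's parameterization allows other penalty shapes per resource and tariff.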

The binding of the application/service and system layers for cost analytics in cloud-hosted service infrastructures constitutes the unique added value of our solution compared to relevant existing commercial offerings. Furthermore, our solution advances the state of the art in cost sharing. It is flexible enough, through appropriate configuration, to apply different cost sharing policies as appropriate for the variety of resources available today in the cloud and the applied tariff schemes. Furthermore, it does not require explicit knowledge of resource loads per individual contributor. In contrast, known cost sharing approaches rely on explicit resource load measurements per contributor, while silently assuming a monolithic type of resource and rate-based tariffs.

1. The Problem

1.1 Problem Statement and Challenges

Considering a cloud-hosted service provider, the problem is to share the cost of its virtual infrastructure among the offered services and/or the users that use them, in a fair manner.

The problem of fair cost sharing has been addressed in traditional IP networking infrastructures, assuming load measurements both per resource (network link, primarily) and per service/user consuming the resource [1]-[3]. Load measurements are reported per IP packet/flow header, and thus consumption per service and user, as understood by the business, can easily be deduced (through lookups). Indeed, a number of IP network traffic measurement tools, like Cisco's NetFlow [4], provide per-flow traffic statistics, while a number of datasets with IP traffic traces are publicly available.

However, unlike networking infrastructures, in cloud-hosted infrastructures the load measurements of chargeable cloud resources (VMs, storage entities, egress network services or specialized computing platforms) are provided in a systemic, aggregate form. They do not include any semantics pointing to the application/service-level contributors incurring the reported load (such as the header information in IP traffic measurements). Indeed, available cloud monitoring tools like Google's Stackdriver [5] and Elasticsearch's Metricbeat [6] provide a rich set of system-specific metrics, which however are totally unaware of the overlying services/users actually using the monitored resources.

The lack of explicit bindings between application/service and system layers in cloud-hosted service environments is a challenge that makes the problem at hand unique and most intriguing.

1.2 Prior Art

Cloud cost analytics for cloud cost management and optimization is an emerging market (Table 1 above).

Existing commercial solutions rely solely on system information, as provided by available cloud monitoring services, and mainly address cloud-based enterprises rather than as-a-service providers per se. The business value of their analytics is mainly limited to the distribution of cloud cost among the different business functions (departments, cost centers, units) of the subscribed enterprises, assuming that enterprises utilize a certain amount of cloud resources per business function. This is done simply by allowing client enterprises to tag the virtual resources per business function, and subsequently by distributing the cloud cost per resource and summing the resulting resource costs per business-function tag. Such an allocation of the cloud cost is more useful to large enterprises, like banks, than to as-a-service providers, who are naturally interested in knowing the cost per (group of) offered service and/or customer.

Our system goes far beyond current commercial offerings, both in terms of target and of business intelligence. It targets cloud-hosted service providers offering a number of services in the mass market, not just multi-functional cloud-based enterprises, and it allows the distribution of cloud cost among the offered services and the users/customers using these services, not only among different business functions. Connecting application/service usage with cloud infrastructure cost, which current offerings lack, is the key differentiating point and the unique value of our solution over its competitors. As a result, cloud-hosted service providers are provided with unprecedented business intelligence regarding the costliness of their service portfolio and/or their clientele, and hence the effectiveness of the maintained virtual infrastructure and the correctness of the adopted pricing policies.

1.3 Innovation

The unique value that our system brings in the cost analytics market is underpinned by two major technical/scientific innovations:

> automated methods for tracing information about users/customers invoking services from the Web sites of service providers in a transparent and safe manner (section 0),

> a cost attribution algorithm (section 0) for fairly sharing cloud cost, per set of resources, among the (groups of) services and users/customers using these resources, which compared to the state of the art bears the following innovative aspects:

it bridges the gap between application/service and system layers that inevitably exists in cloud environments,

it relies only on resource load measurements without necessarily requiring load measurements per individual service and user/customer (section 0) consuming the resource,

it is flexible enough to apply different cost sharing policies as appropriate to the tariff schemes used by cloud providers for charging different types of resources, through simple parameterizations, and

it is capable of self-adjusting critical operational aspects for improving the credibility of the cost sharing results according to the specific dynamics of the particular cloud environment where it applies.

The scalable and efficient implementation of the cost attribution algorithm (section 0) adds to the innovation of the approach, considering the huge amount of data coming on-line from tracking probes and resource monitoring agents across a multitude of provider Web sites and cloud instances.

2. Solution Concepts and Approach 2.1. Rationale, Notions and Assumptions jError! No se encuentra el origen de la referenda, presents the notions and their relationships underlying our approach highlighting the approach-specific ones. They are defined and discussed in the following. 2.1.1 Overall

In the context of a cloud-hosted service provider, the term job is used to refer to an invocation (actual use) of an offered service by a user. It is assumed that users belong to customers (subscribers, retailers etc.) of the service provider and that such a client hierarchy is included in (or can be deduced from) user semantics. As a result, jobs can be classified into various groups through tags that the service provider can assign to its services, customers and users. Section 0 describes means for safely accounting jobs (services invoked by users/customers) based on the commonly used 'tracking pixel' technique.

The problem is to derive fair shares of the cost of the service provider infrastructure as billed by the cloud provider over an agreed billing period among the jobs arrived during the billing period. The fair proportion of the billing cost per job is called the cost contribution of the job while its corresponding share in monetary units is called the attributed cost of the job. By aggregating cost contributions per service, user/customer and resource (section 0), the cost contributions per groups of business interest (e.g. large customers for a particular type of service) can subsequently be derived; a very useful insight indeed for service providers.

It is taken that there exist reliable monitoring tools for providing load information, in terms of various metrics, for the chargeable cloud resources making up the virtual infrastructure of the service provider. Load measurements per resource are produced at a certain granularity, the monitoring granularity.

However, they do not bear any hint to the jobs actually consuming these resources, so the loads per job cannot be known. Based on the above, two distinct sets of input can be drawn during a billing period: the service usage dataset containing the time-series of the accounted jobs; and, the cloud usage dataset containing the time-series of the load measurement per chargeable resource. As already pointed out, these two datasets share no other information linkage but time information. Our premise is that cost contributions depend on respective load contributions and as such, the service and cloud usage datasets need to be logically combined.

2.1.2 Charging resolution

For combining service and cloud usage information, the so-called completion assumption is put forward. It assumes that there is a time resolution, not necessarily of constant length during a billing period, such that jobs (services invoked by users/customers) falling into this resolution complete their execution within this resolution. We call this 'job-duration-including' resolution charging resolution, its granularity charging granularity and the resulting intervals, which equally space the billing period, charging intervals.

The completion assumption is statistically valid considering that resources alternate between idle and busy periods and hence the average busy time could be taken as a charging granularity. As it will become apparent in the following, smaller charging granularities would be more desirable for avoiding coarse aggregations and therefore enhancing differentiation among load and subsequently cost contributions. Section 0 elaborates on this issue and presents an algorithm for dynamically adjusting charging granularities. Without loss of generality, it is taken that charging granularities are multiples of monitoring granularities as the latter are usually configurable in the order of seconds.

Under the completion assumption, jobs arrive and complete in charging intervals and therefore the load during these intervals can be attributed solely to them. In other words, the active jobs in the infrastructure per charging interval are those arrived during the charging intervals (no other jobs from previous intervals) and so these jobs induce the measured resource load. Note that resource loads during charging intervals can be derived by suitably aggregating the measured loads over the contained measurement intervals (section 0).

2.1.3 Load ranges and penalties

We adopt a parametric peak-load cost sharing scheme [2] considering that resources may become more expensive during higher loads, especially when reaching their capacity since upgrade cost needs also be taken into account. Cost contributions should therefore be differentiated on the extent to which jobs happen to form peak resource loads rather than just being analogous to their respective loads -even when loads per job could be accurately measured. To this end, we consider that resources operate into certain load ranges with a cost penalty being assigned to each load range. Load ranges and penalties are defined per resource type e.g. VM, storage or egress network. They are transparent to the service providers, considered as controlled parameters of our solution approach.

Load ranges can be defined either statically, based on pre-determined thresholds or dynamically, based on percentile values. For example, CPU utilization measurements of a VM could be split into the static ranges of {[0,40), [40,60), [60, 80), [80,100)} or into the dynamic ranges of {[0-25 th ), [25 th -50 th ), [50 th -75 th ), [75 th -100 th ] percentiles} containing all, the top 75%, 50% and 25% measurements, respectively. It is deemed that static/dynamic ranges are more suitable for finite/infinite capacity resources like VMs or egress network, respectively. Considering a specific resource, charging intervals can then be enlisted into the defined load ranges according to their aggregate measured loads and hence the billing period can be partitioning into load periods (jError! No se encuentra el origen de la referenda.)- For example, we could have lighter, light, moderate or busy periods. In the simplest case, a single load range and therefore period could be defined, which could either reflect the busy period of the resource, containing loads above a certain threshold or percentile, or encompass the whole billing period, containing all load values.

The so-defined notions of load ranges and corresponding periods are flexible enough to capture different tariff schemes for different types of resources; thus, allowing our approach to apply to changing cloud provider policies through simple parameterization. Evidently, load periods have random durations within billing periods depending on how many charging intervals happen to belong to the defined load ranges.

Cost penalties are assigned to charging intervals for reflecting the relative surcharges of operating in a charging interval of a particular load range. They are defined as fixed or load-dependent weights biasing heavier load ranges. For instance, considering four load ranges, cost penalties could be fixed to {1 , 2, 4, 8}, correspondingly, or given by the function a load for increasing base values a's (>1 ) per load range. Note that dynamic load ranges do not necessarily admit load-dependent penalties; fixed penalties could be given based on suitable functions of the loads at the ends of the ranges. Fixed penalties differentiate cost solely per load range (different loads within a range are considered insignificant) and so they are deemed suitable for resources charged at a flat fee like VMs. Load-dependent penalties differentiate cost both per range and load therein and so they are deemed more appropriate for resources charged at rate-based tariffs especially when thick load ranges are considered.

The penalties per charging interval of the same load period can be added to yield the penalty of the load period throughout its duration. Load periods can then be seen as rectangles having as area their penalty and length their duration. And so, under the perspectives of peak-load sharing, the billed cost of a resource can be split among them according to their area (penalty) proportion. 2.1.4 Proportional and weighted-proportional sharing

Having differentiated the cost of operations of resources by means of suitable load periods and associated cost penalties, the cost contribution of a job at a particular resource is set to its fair proportion of the cost penalties throughout the load periods considered. Two expressions can be distinguished depending on how throughout is meant: average per interval or per load period as a whole. More specifically, cost contributions at a resource are the fair penalty contributions averaged over all considered charging intervals or load periods. Section 4.6.1 elaborates on the differences between these two expressions.

In the absence of explicit load information per job, two schemes of fairness are considered: the proportional and the weighted-proportional sharing schemes.

Under proportional sharing, fairness is taken to be the relative presence (appearance frequency) of the jobs. As a result, jobs with higher presence during heavier periods are given larger cost contributions provided that the heavier periods are long enough. If the heavier load periods are rather short, cost contributions are merely analogous to frequencies provided that the lighter range penalties do not differentiate substantially among each other.

Under weighted-proportional sharing, jobs are not differentiated only on how frequently they use the resources but also on the demand they bring to the resources. That is, whereas in proportional sharing jobs are assumed to bring the same demand, implying that they cost the same, in weighted-proportional sharing jobs are taken that they bring different demands and therefore their cost should be accordingly differentiated. Under the completion assumption, the load of a resource during a charging interval is the sum of the demands of the active jobs during this interval. This gives rise to a system of linear equations, one for each charging interval, through which the demand of the jobs may be calculated (section 4.5). Depending on when conditions of unique solutions emerge, job demands may be calculated per service or per service and user/customer. We call this system of linear equations the job demand equations and the solutions service differentiation weights.

Then, under weighted-proportional sharing, fairness is taken to be the relative service weight of the jobs. As a result, jobs with higher weights (service demand) are given larger cost contributions provided that all jobs have more or less the same presence profile. Should jobs have different profiles, their cost contributions should be analogous to their total weight (total demand) -product of their presence and weight. Obviously, when all jobs have the same weight weighted-proportional sharing reduces to proportional-sharing. 2.1.5 Discussion

The previous sharing schemes are necessitated by the fact that there are no explicit load measurements per job at cloud resources; and, constitute an innovation of our approach. Should the loads per job were known fairness could be based on the relative loads of the jobs or the Shapley vale [3] of the coalition of all jobs averaging individual load increments.

One might argue that presence is incidental and as such the derived cost contributions will be incidental too and thus of limited credibility. However, the completion assumption makes presence the mere causality of resource loads i.e. the jobs found present during a charging interval induce the measured load and only those. Furthermore, by averaging presence over the billing period its incidental nature becomes as that of a statistical estimator of means. Hence, our approach makes presence acquire significance.

The credibility of our approach is rather subject to the validity of the completion assumption than to the variability underlying presence or the estimation of service differentiation weights. An automated procedure for dynamically adjusting the charging granularity so that the completion assumption be in increased level of confidence is discussed and described in section 4.9.

The notions of load ranges and penalties are powerful and flexible enough to support different peak-load sharing policies as appropriate for the particular types of resources and charging schemes applied -finite capacity or ample resources charged with flat or rate-based tariffs. Furthermore, by grounding cost differentiations on penalties per load range and not on individual load measurements, errors due to statistical deviations can be smoothed out. These two aspects are in contrast to existing cost sharing schemes which mainly address finite capacity resources and rate-based tariffs while they rely directly on load measurements per resource contributor (traffic of transmission links). Being more general, our approach reduces to the existing ones if individual load measurements exist, by considering one load range (encompassing the whole billing period or its heavier-load part) and load-dependent penalties given by the identity function.

Last, it is stressed that our approach targets at sharing cloud cost per billing period, not to assert on the costliness of (different groups of) jobs. For this, a statistical analysis would be required to complement our cost attribution algorithm.

2.2 Functional Architecture

Figure 4 summarizes the previous analysis by presenting the functional components making up our solution, which are put into perspectives in the following sections. Through the Service Provider Interface, cloud-hosted service providers provide information about the cloud they maintain and subsequently they can view analytics on the cost of their operations per groups of interest of the services they offer, the users/customers they support and the cloud resources they employ. The Service Invocation Tracker captures service invocations from the Web sites of the service providers based on the commonly used 'tracking pixel' technique (section 3) while the Cloud Usage Collector ingests load measurements for the cloud resources of the service providers by relying on available monitoring APIs such as Stackdriver [5] and Metricbeat [6]. The Charging Resolution Manager and Cost Calculation Algorithm realize the intelligence for deriving fair cost shares per service, user/customer and resource (section 4.8). They are invoked during and, of course, at the end of each billing period with the former component to determine the length of the charging granularity (4.9) and align within it the collected service (section 4.3) and cloud usage (section 4.4) traces, and the latter component to calculate the fair cost contributions (section 4.6). The Load and Cost Aggregator performs cost aggregations across various groups of services, users/customers and resources (section 4.7) as required by the service providers.

3. Tracking Service Invocations by Users We take that service providers offer an automated service provisioning Web site for presenting their service offerings in the Internet and allowing users to subscribe and subsequently use the offered services as and when they wish. In such a site, besides informational and subscription management pages, there is a log-in page for ensuring proper user access and a number of service pages for allowing users to use the services. According to the Web technology, the pages are supplied by the site and execute for rendering at the Web browsers of the users. jError! No se encuentra el origen de la referencia. sketches the methods employed for tracking service invocations in service provider Web sites. They rely on the commonly used 'tracking pixel' technique and are described in the following. A snippet (piece of code) is provided to service providers upon subscription to our cost analytics for them to embed it into their service pages. Without this snippet our analytics cannot be in effect. The snippet requests the fetching of a tiny image, a pixel, from a Web site of ours, piggy-bagging, in the HTTP request, the required service invocation information: Provider that subscribed to our cost analytics.

Service of the provider invoked by a user.

Time the service was invoked.

User who invoked the service, including customer information.

The first three elements can easily be provided by the snippet. However, user information cannot be provided because it is usually missing from the service pages. Ideally, user information should also bear information about the customer of the provider to whom the user belongs. Email is a suitable piece of such information as customer information could be deduced by the domain name (after @) of the users who log-in with their corporate email.

Provider information is included in the snippet at the very time it is created. Indeed, upon subscription of a service provider to our analytics, a UUID (universally unique identifier) is generated and bound to the provider; and so, the snippet can have this UUID in the redirections (pixel-fetching HTTP requests) it issues. Service information is just the URL of the service pages where the snippet will be embedded while service invocation timestamps can be drawn when the snippet executes at user browsers or when the redirections are received by our Web site. User information is known to the Web site of the service provider (being a standard part of session information) but usually it is solely kept for sesion management purposes -it is not copied into the supplied pages. For capturing user information, two methods are provided: an explicit and a uuid-late-binding method.

In the explicit user tracking method, a number of platform-specific snippets are provided by us as appropriate to industry-wide adopted Web site development or content management systems. For completeness, a skeleton snippet is also provided targeting those service providers willing to provide the required user information based on their own efforts.

The uuid-late-binding method for tracking user information evolves in two phases, requiring two kinds of snippets: one for the service pages and one for the log-in page. Both snippets cause redirections (pixel-fetching requests) to our Web site. The service page snippet conveys user UUIDs which are generated when the snippet executes at user browsers alongside provider, service and time information as previously specified. The log-in snippet conveys the binding between user UUIDs and emails. It generates user UUIDs at user browsers - guaranteed to be the same with the ones generated by the service page snippet- while it extracts the emails of the users from the log-in page, assuming of course that users log-in with their emails. Since the redirections from both snippets are received by us, user UUIDs and emails can be bound. Considering that users cannot access service provider Web sites without having logged-in before, user UUID- email bindings can occur pretty soon, within a day per user. For reducing the volume of redirections, the log-in page snippet could save the emails of the users in the local storage of their browsers so that to send binding redirections only when the emails are not found in the storage.

To exemplify the above and demonstrate the inseparable role of the tracking sub-system, let Joe be an employee of acme.com who have subscribed to the services offered by saasservice.com. The email of Joe is joe@acme.com and using this email logs in saasservice.com in order to use the services that wishes. The log-in page of saasservice.com has, among other e.g. authentication functionality, our code. Then, our code passes to our tracking server the domain, i.e., acme.com of Joe along with a UUID uniquely identifying Joe. The domain corresponds to a particular customer of saasservice.com, in this example acme.com, and the email to a specific user of this customer. From then on, whenever Joe hits a service of saasservice.com, his UUID will be snipped to us and bound with his previously accounted employer domain. This way our system will know that any extra load induced on the cloud layer can be attributed to Joe of acme.com and other similarly-traced users; and, thus we can provide saasservice.com the portion of its cloud cost that corresponds to Joe and/or acme.com in particular.

Summarizing, the following kinds of snippets are provided:

- Platform-specific service page snippets.

- A skeleton service page snippet to be completed by providers. - A service page snippet conveying user UUIDs and a log-in snippet conveying user UUID and email binding.

The specified tracking solutions are secure and workable, since a) their underlying redirection (pixel-fetching) logic is actually served by the service provider Web sites which are considered trustworthy among their users, and b) they avoid explicit use of cookies which could be deprecated by users or browsers for privacy protection reasons. The solutions will not work if users suppress code execution in their browsers or the loading of images through plain HTML requests from remote domains. However in these cases a great deal of Web functionality is essentially lost. 4. The Cost Attribution Algorithm 4.1 Input Information

The cost attribution algorithm runs at the end of each billing period for calculating cost contributions per service, user/customer and chargeable cloud resource of a service provider. It can also run at any point before the end of a billing period to provide partial results up to that point. Its input information for a specific service provider is outlined in Table 2 and described in the following.

Table 5: Input information for the cost attribution algorithm

The Service Usage Dataset includes the traces of service invocations by users during a billing period as captured by the tracking system (section ):

(service, user/ customer, time)

It is assumed that user information includes information of the client-hierarchy of the service provider; though not required by the algorithm as such, it is useful for aggregating the produced cost contributions at application/service level. The Cloud Usage Dataset includes load measurements -for pre-determined metrics- of the chargeable cloud resources in the infrastructure of the service provider during a billing period:

(resource, metric, time, measurement) This information is collected through reliable cloud monitoring services like Stackdriver [5] and Metricbeat [6], which can take and report measurements at configurable granularities.

It is assumed that the chargeable cloud resources are of certain general types (VM, disk, network etc.) and that the semantics of a resource include both its type and its unique identification (id) in the cloud infrastructure of the service provider. resource ·■=< resource type >■< resource id >

Resource type information is necessary for accordingly differentiating algorithmic settings (see below) while it is also useful for providing an additional level of results aggregation. The id of a resource could either be its instance id as given by the cloud provider or a characteristic name given by the service provider. The set of resources of a service provider is drawn automatically from the cloud provider at subscription time through appropriate APIs and it is continuously updated from monitored data since resources are elastic both in its type and number.

A chargeable resource may be charged in terms of one (usually) or more metrics or at a fixed fee depending on its capacity. E.g. egress network is charged based on the number of bytes transmitted whereas VMs or persistent disks are charged flat according to the number of cores and RAM or disk size, respectively. In the latter case, a suitable metric(s) is chosen as a representative measure of the consumption of the resource e.g. CPU utilization for VMs or bytes written for disks. Therefore, the metrics required by the algorithm are only a small subset of the plethora of the metrics provided by the available monitoring services. The metrics as reported by the monitoring services are distinguished as being: gauge, delta or cumulative. Obviously, this information is required by the algorithm for performing needed aggregations. The types of chargeable resources, the metric(s) to measure, their type and the respective monitoring granularities are known in advance per cloud provider, constituting part of the overall configuration of the cost sharing algorithm: metric type, monitoring granularity

metric type■= gauge | delta {cumulative Tags include a set of tags/labels per individual service, user/customer and resource which are freely assigned by the service provider for enabling analytics per business group of interest:

(service, [tagl, .. ]) (user /customer, [tagl, ..

(resource, [tagl, .. ])

Tagging information is required for aggregating cost contributions over services, users/customers and resources (section 0) -it is transparent to the core of the cost attribution algorithm that derives the individual cost contributions. Bill and Tariffs specify the cost per consumption unit per resource type and the bill received for the billing period for which the cost attribution algorithm will run:

(period, billing category, bill) (resource tyve , metric, unitcost^

Analytical bills may not necessarily be at the level of individual chargeable resources. In general, it is assumed that cloud providers have their own billing categories and there is a mapping between chargeable resources and these billing categories. This information is considered overall configuration information.

{cloud provider, service provider, resource, billing category)

Unit cost information is required for bringing individual cost contributions into a common base so that they can be aggregated (section ). It can be drawn from the cloud provider and/or the bills received from the service provider. It is noted that cloud service tariffs depend on a number of parameters (e.g. committed usage period) while they can frequently change, furthermore there are a number of discounts offered on-line. An average tariff should therefore be carefully calculated e.g. averaging billed cost among resources of the same type and total consumption units per resource.

Algorithmic settings provide appropriate values for the controlled parameters of the cost attribution algorithm, which based on the discussion in section 0, include the charging granularity, resource load ranges and associated cost penalties: charging granularity (in seconds) (resource †yvP , metric, [load_rangel, . . , load_rangeR]^

(resource type , metric, penalty type, [penaltyl, . . , penaltyR] or penalty f U nction ( x ))

As the charging granularity can be adjusted dynamically (section 0), the input value specifies its initial value; an indicator could also be passed as to whether to stick with the initial value or not. As outlined in section 0, load ranges can be specified either in absolute or relative terms while penalties could be fixed or load-dependent: load range ·■= number \ percentile penalty type ·■= fixed \ load— dependent

In addition, algorithmic settings may include information for allocating invoked services to charging intervals should such information can be provided (section 0):

(service type, lag, shift)

4.2 Notation

Considering a cloud-hosted service provider, let:

denote a chargeable resource.

be the set of metrics of a specific chargeable resource e.

e,m.

Λ be the consumption unit cost of metric m at resource e.

Bill be the cost of the set of resources E as billed by the cloud provider,

be the duration of a billing period, in seconds, or the duration from the start of the current billing period until the time the algorithm is called to run.

Tc be the fixed duration of a charging interval, in seconds,

be the number of charging intervals during a billing period; obviously, C B = T B /T C

k denote a specific charging interval (of length T c ) during a billing period, t = 1. . C B . e' m l(t) be the aggregate measured load of metric m at resource e during the charging interval I t , t = 1. . C B ; it is calculated based on the time-series of the respective measurements as specified in section 0.

be the number of load ranges defined for metric m at resource e.

e' m r(t) be the load range, 1. . e,m R, to which the charging interval I t , t = 1. . C B belongs according to the load of metric m at resource e during this interval, e,m l(t).

e,m. r be the number of charging intervals belonging to the load range r = 1. . e,m R

defined for metric m at resource e -these intervals make up the corresponding load period; obviously,

e,m r r v -ι

r _ y e,m r r

e,m r

P r (t) be the penalty in the cost of operating in the charging interval I t , t = 1. . C B that belongs to the load range r = 1. . e,m R defined for metric m at resource e; as stated in section 0, it can either be fixed and therefore constant throughout the intervals of the same load range, say

e ,m p r , or dependent on the load of the interval.

e,m D r be the total penalty in the cost of operating throughout the load period r =

1.. e,m R of metric m at resource e; obviously,

C e>m £r- e ' m ^ ( penalties are fixed

(∑f e ' m r(t)=r e,m r (t) < i penalties are load— dependent V

V(t s,u be the number of active instances of service s invoked by user u during the

charging interval I t , t = 1.. C B ; it is calculated based on the time-series of the traced service invocations (section 0) as specified in section 0.

V t) be the total number of active instances of all services and all users during the

charging interval I t , t = 1.. C B ; obviously,

V(t) = ∑ SiU V t) S t = l. . C B

e' m Vs,u be the number of active instances of service s invoked by user u during the load period r = 1.. e,m R of metric m at resource e; obviously,

e.myr ^ e tne tota | num b er 0 f active instances of all services and all users during the load period r = 1.. e,m R of metric m at resource e; obviously,

e ' "V r =∑ s ,„ e' "¾. r = l. . e'm R

β' πι βε, η be the service differentiation weight of service s and user u at resource e for

metric m; it is a calculated as specified in section 0.

e' m w s u be the cost contribution of service s invoked by user u with respect to the

consumption of a metric m at resource e; this is the main outcome of the cost attribution algorithm and it is calculated as specified in section iError! No se encuentra el origen de la referenda..

e' m A s u be the attributed cost of service s invoked by user u with respect to the billed

consumption of metric m at resource e; it is calculated as specified in section 0.

4.3 Service Usage Aggregation

For determining the active jobs (services invoked by users) per charging interval, V(i) s u , the captured service invocations (section 0) need to be mapped into the series of charging intervals in a way to ensure chronological ordering as well as the underlying completion assumption (section 0) -the jobs allocated to a charging interval need to complete execution during this interval. Furthermore, the mapping needs to take into account the fact that there might be a lag between job invocation and execution. The following parameters are considered: lag s be the time elapsed from when a job of service s is invoked until the job starts executing.

shift_s be the minimum time of execution of a job of service s at a resource; in other words, a job is allocated to a charging interval if its admission time (invocation plus lag) is before its shift from the interval end; otherwise, the job is allocated to the next charging interval.

Then, considering a job of service s which is invoked at time t, the charging interval during which the job will be considered active in the infrastructure is given by:

charging granularity: (t + lag_s) ∈ I_t = [st, et)    (1a)

charging interval of job_s = I_t, if t + lag_s ≤ et − shift_s; I_{t+1}, if t + lag_s > et − shift_s    (1b)

Using the above formula, the jobs invoked during a billing period are allocated to an appropriate charging interval during which they complete their execution. Then, the number of active jobs in each charging interval, V(t)_{s,u}, can be calculated by a simple counting function.
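The allocation rule (1a,b) and the subsequent counting function can be sketched as follows; this is an illustrative reading of the rule, not the patented implementation, and all names (`charging_interval`, `active_jobs`, the dict-based lag/shift meta-data) are assumptions:

```python
from collections import Counter

def charging_interval(t, lag_s, shift_s, interval_len, period_start=0.0):
    """Return the 1-based index of the charging interval to which a job of
    service s invoked at time t is allocated, per rules (1a) and (1b)."""
    admission = t + lag_s                                    # time the job starts executing
    i = int((admission - period_start) // interval_len) + 1  # interval I_i containing admission
    et = period_start + i * interval_len                     # right end of I_i
    # Rule (1b): push to the next interval if the job is admitted too close
    # to the interval end to complete within it.
    if admission > et - shift_s:
        i += 1
    return i

def active_jobs(invocations, lag, shift, interval_len):
    """invocations: iterable of (service, user, t); lag/shift: per-service
    meta-data (default 0). Returns V(t)_{s,u} as a Counter keyed by
    (service, user, interval index) -- the 'simple counting function'."""
    return Counter(
        (s, u, charging_interval(t, lag.get(s, 0.0), shift.get(s, 0.0), interval_len))
        for s, u, t in invocations
    )
```

For example, with 10-second intervals and zero lag, an invocation at t = 3 lands in interval 1, while an invocation at t = 9.5 with shift = 1 is pushed into interval 2.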

The lag and shift parameters are considered meta-data that should be provided by the cloud-based service provider, who has an intimate knowledge of the services that it provides. This information could be captured by means of a simple questionnaire during subscription. If this information cannot be provided, the parameters default to zero.

The above assumes that accounted invocation times precede execution times, meaning that lag > 0. This could be the case for non-blocking service invocations, e.g. when services are invoked through RESTful APIs. However, if services are invoked in a batch mode, then the service page will return to the users after the execution of the invoked service and, as such, accounted invocation times will follow execution times, in which case lag < 0. In any case, and given the absence of such intimate information, it all boils down to appropriately setting the length of the charging intervals; suitable self-adjusting means for appropriately setting the charging granularity are described in section 4.9.

4.4 Cloud Usage Aggregation

Consider a charging interval, I_t = [st, et), and the set of the measurements of a metric m at resource e that fall into this interval, including the measurement at its right end, in increasing time order, say {(t_1, l_1), .., (t_k, l_k), (t_{k+1}, l_{k+1})}.

We assume that the charging granularity is at multiples of the monitoring granularity, say k times, so we have: st = t_1 < t_2 < .. < t_k < t_{k+1} = et, with t_2 − t_1 = .. = t_{k+1} − t_k = Δ.

Then, the aggregate measured load during the charging interval I_t, which encompasses k monitoring intervals, is given as follows, depending on the type of the monitored metric:

metric is gauge: ^{e,m}l(t) = (Σ_{i=2..k+1} l_i) / k    (2a)

metric is cumulative: ^{e,m}l(t) = l_{k+1} − l_1    (2b)

metric is delta: ^{e,m}l(t) = Σ_{i=2..k+1} l_i    (2c)
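The three aggregation rules (2a)-(2c) can be illustrated with a small sketch; note that the exact endpoint conventions are a reconstruction from the surrounding definitions, and the function name is illustrative:

```python
def aggregate_load(values, kind):
    """Aggregate k monitoring samples over one charging interval I_t = [st, et).
    values: [l_1, l_2, .., l_{k+1}] in increasing time order, where l_1 is the
    left-end measurement and l_{k+1} the one at the interval's right end.
    kind: 'gauge', 'cumulative' or 'delta', per (2a)-(2c)."""
    k = len(values) - 1
    if kind == "gauge":          # (2a): average of the sampled levels
        return sum(values[1:]) / k
    if kind == "cumulative":     # (2b): growth of a monotone counter
        return values[-1] - values[0]
    if kind == "delta":          # (2c): per-interval increments simply add up
        return sum(values[1:])
    raise ValueError(kind)
```

For instance, with samples [0, 2, 4, 6] a cumulative metric yields 6 (counter growth), a delta metric 12 (sum of increments), and a gauge metric 4 (average level).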

As the charging intervals are taken to be left-closed-right-open, the time-series of the load over these intervals (the x-axis has their start times, these being the end times of the previous interval) forms a right-continuous step-function. The following point is worth mentioning. We take it that monitoring services are perfectly reliable. This assumption is made not out of convenience but rather from reality. In a cloud-based environment, resources are elastic; they can come and go as conditions warrant. As such, the absence of measurements for a particular resource does not necessarily mean failure of the monitoring services but simply that the resource went away. In any case, one might argue that the monitoring or the ingestion system can miss some measurements. We take it that such outages cannot be spotted on-line but only after receiving relevant explicit notices. In such cases, the time-series of the measurements should be rectified based on common techniques for filling missing values (forward or backward expansion, or linear interpolation).

4.5 Service Differentiation Weights

Jobs bring different amounts of load at the various resources, depending on their particular service requirements, and therefore their cost contribution needs to be accordingly differentiated. However, explicit load measurements per job cannot be provided in a cloud environment and, as such, load differentiation among jobs can only be estimated from available information.

Consider a specific resource, a particular metric and the respective time-series of the active jobs and load, cf. (1a,b), (2a,b,c) (resource and metric indices are omitted):

{V(t)_{s,u}, ∀s, ∀u, t = 1..C_B} and {l(t), t = 1..C_B}

By means of the completion assumption (section 0), the load of a resource during a charging interval is attributed only to the active jobs during this interval. Thus, considering that jobs per service s and user u bring the same demand, say β_{s,u}, the following equations hold:

job-demand equations; same demand per service and user: Σ_{s,u} V(t)_{s,u} β_{s,u} = l(t), t = 1..C_B    (3a)

The above constitutes a system of linear equations with unknowns {β_{s,u}, ∀s, ∀u}. Given the underlying assumptions, the yielded loads per service and user, if a solution exists, are treated as weights for relatively differentiating the jobs according to their induced load, not as their load per se.

Being of a statistical nature (the coefficients V(t)_{s,u}'s are observations of random variables), the job-demand equations are considered as traces rather than as constraints. As such, the system cannot be inconsistent. Mathematically speaking, if the coefficient matrix is of full column rank, a least squares solution (minimizing the error over all equations) can always be determined; otherwise, some of the unknowns will be undetermined and an infinitude of solutions exists. The mathematical condition of full column rank means that a solution can be found if: a) there are more equations-traces than unknowns: the number of charging intervals is at least the number of distinct service and user pairs, and

b) the columns are linearly independent: the jobs per service and user appear independently of each other, i.e. the appearance of jobs per service and user does not follow the appearance of jobs from other services and users.

If one of the above conditions does not hold, then the system is said not to include sufficient information for admitting a single (least squares) solution. If so, a new linear system is formed by aggregating jobs per service and customer (over all users of the customers); note that user semantics include customer information (section 0). If still a single solution cannot be determined, jobs are finally aggregated per service only (over all customers and their users). If the system at this level is still not of sufficient information, the trivial solution, that of having all β_{s,u} = 1, is returned. All these are mathematically depicted by the following:

job-demand equations; same demand per service and customer: Σ_{s,c} (Σ_{u∈U_{s,c}} V(t)_{s,u}) β_{s,c} = l(t), t = 1..C_B    (3b)

job-demand equations; same demand per service: Σ_s (Σ_u V(t)_{s,u}) β_s = l(t), t = 1..C_B    (3c)

trivial solution: β_{s,u} = 1, ∀s, ∀u

Evidently, the number of services is less than the number of distinct service and customer pairs, which in turn is less than the number of distinct service and user pairs. Therefore, if the first condition (more equations than unknowns) for having a single (least squares) solution does not hold at a specific job aggregation level, it will also not hold at the lower levels. However, no such assertions can be made for the second condition (linear independence). Independence might not hold at a specific level but it may hold at lower or even higher levels. Intuitively, the likelihood of independence is increased at lower, more detailed, levels. However, this depends on the length of the charging intervals; e.g. the invocation behavior of a specific service among different customers may be the same over larger intervals.

Based on the above, in order to select the system to solve for determining the service differentiation weights, the first condition is checked top-down across the job aggregation levels and then the second condition bottom-up. There are well-known methods, within the realm of basic linear algebra, for checking linear independence and subsequently determining a least squares solution, with their efficient and numerically-stable implementation in emerging Big Data processing platforms being a challenge.
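The level-selection and solving procedure can be sketched in a few lines; this is a minimal pure-Python illustration via the normal equations (a production system would use a numerically stable QR or SVD routine, as the text notes), and all function names are assumptions:

```python
def solve_least_squares(A, b):
    """Return the least-squares solution of A x = b, or None if the
    coefficient matrix lacks full column rank (insufficient information)."""
    n = len(A[0])
    # Normal equations: (A^T A) x = A^T b
    AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(len(A))) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting; a (near-)zero pivot
    # signals rank deficiency of A^T A, hence of A.
    M = [row[:] + [v] for row, v in zip(AtA, Atb)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def differentiation_weights(levels):
    """levels: [(A, b), ..] job-demand systems (3a)-(3c) ordered from the
    most detailed aggregation level (per service and user) down to the
    least (per service). Condition a) (enough traces) is checked first,
    then the system is solved; None means the trivial solution applies."""
    for A, b in levels:
        if len(A) >= len(A[0]):      # condition a): equations >= unknowns
            beta = solve_least_squares(A, b)
            if beta is not None:     # condition b): independence held
                return beta
    return None                      # caller applies beta = 1 for all jobs
```

For example, the consistent system with rows (1,0), (0,1), (1,1) and loads (2, 3, 5) yields weights (2, 3), while a system with linearly dependent columns returns None.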

4.6 Cost Contributions for a Single Resource and Metric

With reference to the notation introduced in section 0, cost contributions per service and user at a specific resource with respect to the consumption of a particular metric are defined as in the following (resource and metric indices are omitted).

4.6.1 Proportional sharing

interval-based cost contributions: w^{(1)}_{s,u} = (Σ_t p_{r(t)}(t) · V(t)_{s,u}/V(t)) / Σ_t p_{r(t)}(t)    (4a)

range-based cost contributions: w^{(2)}_{s,u} = (Σ_r P_r · V(r)_{s,u}/V_r) / Σ_r P_r    (4b)

Evidently, either form satisfies the unity constraint: Σ_{s,u} w_{s,u} = 1.

As can be seen, the two forms above differ in the granularity for distributing the cost penalty among the active jobs according to their relative presence. The first form does the distribution at a fine time-grain, per charging interval, whereas the second at a grosser time-grain, per load period (union of charging intervals belonging to the same load range). This is more evident in the case where the penalties per load range are fixed; in this case, the interval-based cost contributions become:

interval-based cost contributions; fixed penalties: w^{(1)}_{s,u} = (Σ_r p_r Σ_{t: r(t)=r} V(t)_{s,u}/V(t)) / Σ_r p_r C_r    (4a1)

In the special case where the cost penalties are fixed, all the same and not zero, i.e. p_r(t) = p, P_r = p·C_r, r = 1..R, we have:

same fixed penalties: w^{(1)}_{s,u} = (1/C_B) Σ_t V(t)_{s,u}/V(t), w^{(2)}_{s,u} = (Σ_r C_r · V(r)_{s,u}/V_r) / C_B

So, in this case, the first form yields as cost contribution of a specific job the average of the frequencies at which this job appears in all charging intervals, whereas the second form yields the average of the frequencies at which this job appears in the load periods, weighted by the number of intervals at each load period. In the particular case where only one load range is considered, not necessarily encompassing all charging intervals of a billing period but a subset of specific C^1 heavier-load intervals, we have:

single load range, fixed penalty: w^{(1)}_{s,u} = (1/C^1) Σ_{t∈range} V(t)_{s,u}/V(t), w^{(2)}_{s,u} = V(1)_{s,u}/V_1

So, in this case, the first form yields as cost contribution of a specific job the average of the frequencies at which this job appears at each charging interval that happens to be in the so-defined busy period, whereas the second form yields the overall frequency at which this job appears throughout the busy period.
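The distinction between the two forms under a single load range with a fixed penalty can be made concrete with a toy example; all numbers and names below are illustrative:

```python
def interval_based(V_su, V):
    """w(1): average of the per-interval appearance frequencies of a job.
    V_su: job appearances per charging interval; V: totals per interval."""
    return sum(v / tot for v, tot in zip(V_su, V)) / len(V)

def range_based(V_su, V):
    """w(2): overall frequency of the job throughout the busy period."""
    return sum(V_su) / sum(V)

# When the per-interval totals V(t) are all equal (the sufficient condition
# for equivalence discussed next), the two forms coincide:
#   V_su = [1, 3], V = [4, 4]  ->  both forms give 0.5
# With unequal totals they diverge:
#   V_su = [1, 3], V = [2, 4]  ->  interval-based 0.625, range-based 4/6
```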

In the case of fixed penalties, cf. (4b), (4a1), the defined cost contribution forms are equivalent if the following holds:

fixed penalties; equivalence condition: (1/C_r) Σ_{t: r(t)=r} V(t)_{s,u}/V(t) = V(r)_{s,u}/V_r, r = 1..R

That is, if the average frequency of a particular job across all charging intervals of a load period is equal to the frequency of the job during the whole load period. A sufficient condition for this to hold is for the total occurrences per interval, V(t)'s, to be all equal.

Considering that the appearance frequency of a particular job during a charging interval is a random variable, the Laws of Large Numbers guarantee that its average will converge to the true mean as the number of intervals (samples) grows, and so to the frequency over a load period, if large enough. Thus, we can state that when fixed penalties are considered, the interval-based cost contributions converge to the range-based cost contributions if the number of intervals per load range is large enough; that is, the equivalence condition holds asymptotically:

fixed penalties; asymptotics: w^{(1)}_{s,u} → w^{(2)}_{s,u} a.s., as C_r → ∞, r = 1..R

fixed penalties; conjecture: w^{(1)}_{s,u} ≥ w^{(2)}_{s,u}

Since the second form is rather a limit of the first form, it makes sense to adopt the first form on the grounds of capturing temporal accuracy during billing periods, provided of course that the charging intervals are large enough so that the distribution of job presence is statistically similar among them. If this is not the case, the second form is preferred in order to overcome temporal statistical deviations. The decision as to which form to use can thus be automatically determined by the cost attribution algorithm in accordance with the self-made adjustments of the charging granularity (section 0). Note that in the case of load-dependent penalties the above asymptotic behavior does not hold, since the sequence of p_r(t)'s may not converge in t.

4.6.2 Weighted-proportional sharing

Compared to the previous case, job frequencies are replaced by the relative total weights of the jobs, i.e. the relative total service demand brought into the infrastructure per job: β_{s,u}V(t)_{s,u} / Σ_{s,u} β_{s,u}V(t)_{s,u}

As previously, relative total weights can be calculated per distinct charging interval or per load period. So, we have:

interval-based cost contributions: w^{(1)}_{s,u} = (Σ_t p_{r(t)}(t) · β_{s,u}V(t)_{s,u}/Σ_{s,u} β_{s,u}V(t)_{s,u}) / Σ_t p_{r(t)}(t)    (5a)

range-based cost contributions: w^{(2)}_{s,u} = (Σ_r P_r · β_{s,u}V(r)_{s,u}/Σ_{s,u} β_{s,u}V(r)_{s,u}) / Σ_r P_r    (5b)

Evidently, when all service differentiation weights β_{s,u}'s are equal, i.e. all jobs weigh and cost the same, the above formulae reduce to the ones of the previous case. The following special cases are worth stating:

interval-based cost contributions; fixed penalties: w^{(1)}_{s,u} = (Σ_r p_r Σ_{t: r(t)=r} β_{s,u}V(t)_{s,u}/Σ_{s,u} β_{s,u}V(t)_{s,u}) / Σ_r p_r C_r

same fixed penalties: w^{(1)}_{s,u} = (1/C_B) Σ_t β_{s,u}V(t)_{s,u}/Σ_{s,u} β_{s,u}V(t)_{s,u}

single load range, fixed penalty: w^{(1)}_{s,u} = (1/C^1) Σ_{t∈range} β_{s,u}V(t)_{s,u}/Σ_{s,u} β_{s,u}V(t)_{s,u}, w^{(2)}_{s,u} = β_{s,u}V(1)_{s,u}/Σ_{s,u} β_{s,u}V(1)_{s,u}

Similarly to the previous case, the two expressions of cost contributions asymptotically converge for large load periods in case of fixed penalties.
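The weighted interval-based form with equal fixed penalties can be illustrated as follows; the dict-based representation and the function name are assumptions made for the sketch:

```python
def weighted_interval_based(beta, intervals, key):
    """Weighted form of the interval-based cost contribution (5a) under
    equal fixed penalties: each job's frequency is replaced by its share
    of the weighted demand in the interval.
    beta: {(service, user): differentiation weight};
    intervals: list of {(service, user): V(t)_{s,u}} per charging interval;
    key: the (service, user) pair whose contribution is computed."""
    shares = []
    for counts in intervals:
        total = sum(beta[k] * n for k, n in counts.items())  # weighted demand
        shares.append(beta.get(key, 0) * counts.get(key, 0) / total)
    return sum(shares) / len(shares)
```

With all weights equal this reduces to the unweighted form of section 4.6.1: two jobs appearing once each split the cost 0.5/0.5, but if one job carries weight 3 against 1 its share rises to 0.75.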

4.7 Aggregating and Monetizing Cost Contributions

The cost contributions as calculated by (4a,b) and (5a,b) are at the most granular level: per service, user and resource. Being percentages, they cannot be added unless they refer to the same entity. In general, different types of resources (or charged metrics on the same resource) cost differently and therefore cost contributions need to be normalized accordingly. As such, for yielding the contribution of a specific group of services and/or users to the cost:

- of a particular resource and metric, the respective cost contributions can be added;

- of a particular resource over all its charged metrics, the respective cost contributions should be weighted according to the different unit costs of the metrics at the resource; and,

- of a specific set of different resources, the respective cost contributions should be weighted according to the different unit costs of the resources and their metrics.

For turning cost contributions into monetary cost, cost contributions need simply be multiplied with the cost of the resources these cost contributions refer to. The yielded cost is called the attributed cost. Obviously, attributed costs are additive because they all refer to the same monetary units.

With reference to the notation introduced in section 0, cost contributions can be aggregated and turned into absolute cost shares as in the following.

Cost contribution of sets of services S and users U for a resource e and metric m:

^{e,m}w_{S,U} = Σ_{s∈S, u∈U} ^{e,m}w_{s,u}    (6a)

Cost contribution of sets of services S and users U for a resource e (^{e,m}u denoting the unit cost of metric m at resource e):

^e w_{S,U} = (Σ_{m∈M_e} ^{e,m}u · Σ_{s∈S, u∈U} ^{e,m}w_{s,u}) / Σ_{m∈M_e} ^{e,m}u    (6b)

Cost contribution of sets of services S and users U for a set of resources E:

^E w_{S,U} = (Σ_{e∈E} Σ_{m∈M_e} ^{e,m}u · Σ_{s∈S, u∈U} ^{e,m}w_{s,u}) / Σ_{e∈E} Σ_{m∈M_e} ^{e,m}u    (6c)

The sets of services, users (or customers, as users carry client-hierarchy information) and resources could be arbitrarily defined, e.g. based on tags or billable category information provided as input (section 0).

Attributed unit cost of resource e and metric m to service s and user u:

^{e,m}a_{s,u} = ^{e,m}u · ^{e,m}w_{s,u}    (7a)

Attributed billed cost of a set of resources E to service s and user u:

^E A_{s,u} = ^E w_{s,u} · ^E Bill    (7b)

Attributed billed cost of a set of resources E to groups of services S and users U:

^E A_{S,U} = Σ_{s∈S, u∈U} ^E A_{s,u}    (7c)
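The unit-cost normalization of (6b) and the monetization of (7b) can be sketched as follows; the metric names and figures are made up for illustration:

```python
def resource_contribution(w_by_metric, unit_cost):
    """(6b): unit-cost-weighted average of per-metric cost contributions
    at one resource. w_by_metric: {metric: contribution};
    unit_cost: {metric: unit cost of the metric at the resource}."""
    return (sum(unit_cost[m] * w for m, w in w_by_metric.items())
            / sum(unit_cost.values()))

def attributed_cost(w, bill):
    """(7b): turn a cost contribution into monetary attributed cost."""
    return w * bill

# E.g. CPU twice as expensive as network on the same resource:
# mathematically (2*0.5 + 1*0.2)/3 = 0.4, so a 90-unit bill attributes 36.
w = resource_contribution({"cpu": 0.5, "net": 0.2}, {"cpu": 2.0, "net": 1.0})
```

The design point the section makes holds in the sketch too: contributions (percentages) must be normalized before being combined, whereas attributed costs, being monetary, are simply additive.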

4.8 Algorithm Outline

Figures 12(a)i, 12(a)ii and 12(b) summarize the previous by presenting the outline of the cost attribution algorithm. Two parts are distinguished: the core part, resulting in individual cost contributions per resource, service, user/customer (section 0), and the aggregation part, yielding the attributions of specific groups of services, users/customers to the billed cost of particular sets of resources (section 0).

The algorithm is broken into a number of blocks which are described in a pseudo data-manipulation code so as to hint at its parallel implementation potential. Its efficient implementation in the Spark distributed processing platform (section 0) adds to the innovative aspects of our work. The core of the cost attribution algorithm can run in a batch fashion (for an entire billing period) or in an on-line, incremental fashion (at several times during a billing period, e.g. daily). In the latter case, since the calculation of cost contributions is not associative, the corresponding part (see step C.3) needs to re-run every time from the beginning of the billing period. So does the part for calculating the service differentiation weights.

4.9 Discussion on Charging Granularity

The choice of the granularity of the charging intervals so that the completion assumption holds (section 0) is central to the approach and therefore a significant condition affecting the credibility of the yielded cost contributions.

As the charging granularity gets larger, the completion assumption becomes true and, conversely, finer charging granularities are likely to make the completion assumption invalid. The length of the charging granularity should also be seen from the perspective of the load ranges, which are introduced for differentiating cost sharing based on peak loads. If charging intervals are large enough, peak loads may be diluted by the calculations for deriving the aggregate load (section 0). The choice, therefore, of the optimal charging granularity is a trade-off between the validity of the completion assumption and the loss of peak-load information.

Information on the duration of each job executed during a billing period would be helpful; however, this cannot be a priori known, while it is hard to measure from system statistics: very complex mappings between user-submitted jobs and system processes/threads are required, which ultimately depend on the specificities of the architecture of the particular software system in place. Furthermore, it cannot be taken for granted that all jobs of the same service will have the same execution times in an empty machine, since execution times depend on input and therefore on user behavior.
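One way to operationalize the trade-off above is to grow the charging granularity until few intervals show load without active jobs, which is the criterion discussed next; the sketch below is hypothetical, and its thresholds and names are assumptions rather than part of the original algorithm:

```python
def suspect_proportion(load, jobs, load_thr, job_thr=0):
    """Proportion of charging intervals with aggregate load above load_thr
    but at most job_thr active jobs -- a hint that jobs invoked earlier are
    still running, i.e. the completion assumption fails."""
    flagged = sum(1 for l, v in zip(load, jobs) if l > load_thr and v <= job_thr)
    return flagged / len(load)

def adjust_granularity(gran, load, jobs, load_thr, max_prop=0.05, factor=2):
    """Grow the charging granularity while too many loaded-but-empty
    intervals are observed; otherwise keep the current setting."""
    if suspect_proportion(load, jobs, load_thr) > max_prop:
        return gran * factor
    return gran
```

Such a check could run during a training phase to seed the granularity and periodically alongside the cost attribution algorithm to revalidate it.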

The above discussion indicates that the optimal setting of the charging granularity could only be determined through benchmark tests in particular provider environments. And, based on the benchmark results, the optimal setting of the charging granularity could be automated through appropriate ML techniques. A criterion for dynamically adjusting the charging granularity could be the proportion of the charging intervals with significant load but with no (or a very small number of) active jobs. Such intervals indicate that the completion assumption does not hold and therefore the charging granularity should increase. Along this line, section 0 outlines a suitable algorithm for setting the charging granularity. This algorithm could run as part of a training phase for getting an initial estimate of the charging granularity, as well as alongside the cost attribution algorithm for verifying the validity of the initial estimate and recalculating cost contributions if new estimates are deemed more valid.

4.10 Output Information

Table 3 specifies the output information of the cost attribution algorithm. Two datasets are produced:

- one with the individual cost contributions and unit cost attributions per service, user/customer and resource, resulting from the core part (section 0) of the algorithm, and

- one with aggregate cost contributions and attributions per wished group of services, users/customers or resources, resulting from the aggregation part (section 0) of the algorithm.

The groups of business interest may not necessarily be pre-configured (e.g. per billing category, section 0) but may well be defined dynamically, when and as service providers wish (e.g. through tags, section 0). Obviously, the first output dataset suffices for performing any required aggregation, since it contains all individual unit cost attributions which, by definition, are expressed in monetary units and so provide a common base for aggregating cost contributions (section 0).

Table 6: Output information of the cost attribution algorithm

5 System Architecture

With reference to the functional architecture presented in section 0, an overall system view is provided depicting the main computational blocks implementing the specified functionality, while Table 4 presents the underlying implementation technologies.

The system will run on our side, offering its cost analytics as-a-service to the various cloud-hosted providers.

References:

[1] A. Bousia, E. Kartsakli, A. Antonopoulos, L. Alonso, C. Verikoukis, "Sharing the Small Cells for Energy Efficient Networking: How much does it cost?", Procs of GLOBECOM'14, pp. 2649-2654, 2014.

[2] L. Gyarmati, R. Stanojevic, M. Sirivianos, N. Laoutaris, "Sharing the Cost of Backbone Networks: Cui Bono?", Procs of the 12th ACM SIGCOMM Conf. on Internet Measurement, IMC'12, pp. 509-522, ACM, 2012.

[3] L. Gyarmati, N. Laoutaris, P. Rodriguez, "On Economic Heavy Hitters: Shapley Value Analysis of 95th-Percentile Pricing", Procs of the 10th ACM SIGCOMM Conf. on Internet Measurement, IMC'10, pp. 75-80, ACM, 2010.

[4] NetFlow, https://www.cisco.com/c/en/us/products/ios-nx-os-software/ios-netflow/index.html

[5] Stackdriver, https://cloud.google.com/monitoring/api/v3/

[6] Metricbeat, https://www.elastic.co/guide/en/beats/metricbeat/5.2/index.html

[7] Cloud Cruiser, https://www.cloudcruiser.com/

[8] Cloudyn, https://www.cloudyn.com/

[9] Cloudability, https://www.cloudability.com/

[10] Talligent, https://www.talligent.com/