Title:
MICROSERVICE PROFILING IN CONTAINERIZED ENVIRONMENTS FOR DATA DRIVEN APPROACHES
Document Type and Number:
WIPO Patent Application WO/2021/105905
Kind Code:
A1
Abstract:
A method, system, network node and apparatus are disclosed. A network node (12) is provided. The network node (12) includes processing circuitry (18) configured to determine a plurality of process roles for a plurality of microservices associated with a virtual application and aggregate a plurality of data instances of the plurality of process roles. The processing circuitry (18) is further configured to create a profile based at least on the aggregated plurality of data instances and perform an action based at least on the profile.

Inventors:
FARRAHI MOGHADDAM FEREYDOUN (CA)
POURZANDI MAKAN (CA)
ZHANG MENGYUAN (CA)
Application Number:
PCT/IB2020/061155
Publication Date:
June 03, 2021
Filing Date:
November 25, 2020
Assignee:
ERICSSON TELEFON AB L M (SE)
International Classes:
G06F21/55; G06F21/56; H04L29/06
Foreign References:
US20180095730A12018-04-05
US20160380916A12016-12-29
US20180113790A12018-04-26
US20140196115A12014-07-10
Other References:
CHANG HYUNSEOK ET AL: "Microservice Fingerprinting and Classification using Machine Learning", 2019 IEEE 27TH INTERNATIONAL CONFERENCE ON NETWORK PROTOCOLS (ICNP), IEEE, 8 October 2019 (2019-10-08), pages 1 - 11, XP033653884, DOI: 10.1109/ICNP.2019.8888077
Attorney, Agent or Firm:
WEISBERG, Alan M. (US)
Claims:
What is claimed is:

1. A network node (12) comprising: processing circuitry (18) configured to: determine a plurality of process roles for a plurality of microservices associated with a virtual application; aggregate a plurality of data instances of the plurality of process roles; create a profile based at least on the aggregated plurality of data instances; and perform an action based at least on the profile.

2. The network node (12) of Claim 1, wherein the processing circuitry (18) is further configured to: determine a plurality of process identifier, PID, instances over a predefined time that are triggered by the plurality of microservices; and determine a plurality of virtual process role identifiers based at least on the plurality of PID instances, each of the plurality of virtual process role identifiers corresponding to a respective one of the plurality of process roles.

3. The network node (12) of Claim 2, wherein at least one of the plurality of PID instances is assigned without consideration of a process functionality; and each of the plurality of virtual process role identifiers being assigned based at least on a respective process functionality.

4. The network node (12) of Claim 3, wherein the aggregated plurality of data instances of the plurality of process roles aggregates redundant processes having the same virtual process role identifier of the plurality of virtual process role identifiers.

5. The network node (12) of any one of Claims 3 and 4, wherein the processing circuitry (18) is further configured to perform a hash function on each of the plurality of process roles; and the plurality of virtual process role identifiers being based on the hash function on each of the plurality of process roles.

6. The network node (12) of Claim 5, wherein the processing circuitry (18) is further configured to perform a hash function on at least one partial process tree, each of the at least one partial process tree including at least two respective process roles of the plurality of process roles; and the plurality of virtual process role identifiers being based on the hash function on the at least one partial process tree.

7. The network node (12) of any one of Claims 1-6, wherein each one of the plurality of data instances corresponds to a respective system call.

8. The network node (12) of any one of Claims 1-7, wherein the profile corresponds to a consistent view of the plurality of microservices.

9. The network node (12) of any one of Claims 1-8, wherein the action includes at least one of outlier detection of at least one of the plurality of microservices and detection of a network attack.

10. The network node (12) of any one of Claims 1-9, wherein the plurality of process roles are defined within a distributed virtualized environment.

11. A method for a network node (12), the method comprising: determining (S104) a plurality of process roles for a plurality of microservices associated with a virtual application; aggregating (S106) a plurality of data instances of the plurality of process roles; creating (S108) a profile based at least on the aggregated plurality of data instances; and performing (S110) an action based at least on the profile.

12. The method of Claim 11, further comprising: determining a plurality of process identifier, PID, instances over a predefined time that are triggered by the plurality of microservices; and determining a plurality of virtual process role identifiers based at least on the plurality of PID instances, each of the plurality of virtual process role identifiers corresponding to a respective one of the plurality of process roles.

13. The method of Claim 12, wherein at least one of the plurality of PID instances is assigned without consideration of a process functionality; and each of the plurality of virtual process role identifiers being assigned based at least on a respective process functionality.

14. The method of Claim 13, wherein the aggregated plurality of data instances of the plurality of process roles aggregates redundant processes having the same virtual process role identifier of the plurality of virtual process role identifiers.

15. The method of any one of Claims 13 and 14, further comprising performing a hash function on each of the plurality of process roles; and the plurality of virtual process role identifiers being based on the hash function on each of the plurality of process roles.

16. The method of Claim 15, further comprising performing a hash function on at least one partial process tree, each of the at least one partial process tree including at least two respective process roles of the plurality of process roles; and the plurality of virtual process role identifiers being based on the hash function on the at least one partial process tree.

17. The method of any one of Claims 11-16, wherein each one of the plurality of data instances corresponds to a respective system call.

18. The method of any one of Claims 11-17, wherein the profile corresponds to a consistent view of the plurality of microservices.

19. The method of any one of Claims 11-18, wherein the action includes at least one of outlier detection of at least one of the plurality of microservices and detection of a network attack.

20. The method of any one of Claims 11-19, wherein the plurality of process roles are defined within a distributed virtualized environment.

Description:
MICROSERVICE PROFILING IN CONTAINERIZED ENVIRONMENTS FOR DATA DRIVEN APPROACHES

TECHNICAL FIELD

The present disclosure relates to wireless communications, and in particular, to profiling microservices in a distributed virtualized environment.

BACKGROUND

To detect infected or malicious processes on a server, one practice/method is to monitor those processes in order to profile their behavior. Afterwards, if a process’s behavior deviates from the normal or predefined behavior, the monitoring system may decide that the process is behaving abnormally. One among various other approaches is to trace system calls from a process to profile the process. Machine Learning/Artificial Intelligence (ML/AI) may be used in this context to profile the process’s behavior using models from trained data sets. Some aspects of the service architecture are described below.

Container

FIG. 1 is an example of a block diagram of a container that contains an application and runs (or is executed) on top of a Docker. “Dockers” are known in the art and beyond the scope of this disclosure.

Microservice

Referring to FIG. 2, a microservice may be defined as an architecture derived from a service-oriented architecture (SOA), which is based on an application running (e.g., being executed) in one or several containers providing a fine-grained service. For example, microservices may be defined as: Each individual service runs on its own, separate from the others within the architecture.

In this model, different software components, i.e., microservices, are instantiated to tackle the incoming requests. These microservices may be implemented using one or many processes.

Profiling microservices

Remote Procedure Call (RPC) may be used for profiling microservices. For example, one approach does not use any method to identify different microservices. In another approach, microservice profiling is described where such profiling is used by testers to represent the behavior of a microservice. Generally, a process identifier (PID) associated with a binary is used to identify a server. However, in a virtualized environment, the PID may not serve as a unique identifier as the PID changes constantly.

Other Approaches

Other approaches relate to processing instance profiling for modeling the malware behavior based on system calls. In some of these approaches, machine learning (ML) is used for detecting malware. Some of these approaches are based on processing instances running on physical servers. However, these approaches do not address the case of several processes running at different moments in time, on different locations (i.e., the virtualized environment allows to run the same microservice instances simultaneously in different computing nodes) and representing the same role. In other words, these approaches do not address the concept of process roles for processes running in a distributed virtualized environment over a session of handling the same request.

In yet another approach, a mechanism to fingerprint using ML (e.g., autoencoders, long short-term memory (LSTM)) is described. This approach is based on tracing the system calls for a service and assumes that all PIDs are associated with the service.

However, in a microservice architecture, short-lived containerized components are part of the building blocks of a highly available and continuous service function. These short-lived (i.e., short time duration) microservices are used to perform services with minimum overhead. These short-lived microservices are each assigned a distinct PID. This may cause a microservice to be presented with many different PIDs in a short period of time, such that the approach of associating a behavior with one PID may become obsolete, as the inconsistency in the PIDs of these microservices’ processes makes it hard to trace the microservice. Even for medium-lived and long-lived microservices, it is not uncommon for a microservice to be crashed, restarted, interrupted, replaced, etc., which means a new microservice of the same type may be launched with a new PID. FIG. 3 is a diagram of a monolithic architecture over time where each microservice is represented by only one container running one process. A microservice can be represented by a set of processes inside a container where these processes interact inside the container and can also be correctly profiled the same way.

FIG. 4 is a diagram of a service handling requests with microservices over time. It may be assumed that each microservice is only one process instance. Different patterns for microservices launched from different locations are illustrated in FIG. 5, where the same process created for the same microservice and the interactions between them are designated.

With these approaches, profiling applications based on monitoring the application as a whole is limited to analyses for PIDs at one location, i.e., non-distributed environments, such that profiling applications in a distributed virtualized environment is not provided.

SUMMARY

A problem with existing approaches is that they do not address the process role problem. A process role may not be the behavior of a process instance and may also not be the fingerprint of a process functionality or process type functionality.

A process role represents the set of process instances corresponding to the microservice:

- These process instances may be instantiated in different locations (i.e., a virtualized environment allows the same microservice instances to run simultaneously in different computing nodes), as a result of autoscaling or to tackle different functions that are part of the same microservice instantiated in different locations (e.g., containers in different computing nodes) for load balancing or other causes.

- These process instances represent the microservice functionality over a period of time corresponding to a session.

Therefore, from this point of view, there is no existing approach as the existing approaches are based on PIDs and one location. The instant disclosure helps solve at least some of the problems with the process role in the context of a distributed virtualized environment (e.g., containers) as described herein.

Further, in traditional approaches to profiling applications, the application as a whole is monitored. Such monitoring as a whole may work for monolithically serial applications. However, in the distributed virtualized environment, an application is made of many software components, i.e., microservices, interacting with each other instantiated around the data center and handling small pieces of functionality.

The instant disclosure solves at least some of the problems with existing approaches by profiling different microservices inside the application to thereby improve profiling of the application. In one or more embodiments, the teachings go beyond monolithic profiling of the application and provide distributed profiling for a distributed application.

In this case, the profiling goes beyond a simple profiling of different component instances individually at time t, as there is a need to profile and capture the behavior of the distributed components everywhere in the virtualized environment and for the entire session over a period of time in order to be able to come up with a realistic profile for the microservices making up this application. One example of a feature to be used to profile different microservices is the patterns of interactions between different microservices. For example, if a backdoor in a serving component filters the logs from the serving component towards the billing component, the monitoring of the entire application may not detect the attack. It could be that the billing and serving components, individually, do not indicate any abnormal behavior. However, profiling the pattern of communications between the incoming requests to the serving microservice type and the information sent towards the billing microservice can detect an attack. Note that to build the profiles for these different microservices there may be a need for consistently profiling all corresponding PIDs and aggregating those patterns to be able to build a realistic profile as described herein with respect to one or more embodiments.
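As a non-limiting illustration of the serving/billing example above, the following Python sketch compares the ratio of records forwarded towards a billing role to requests received by a serving role against a baseline ratio. The function names, the counts and the tolerance value are assumptions made only for this illustration and are not taken from the disclosure.

```python
def forwarding_ratio(requests_in: int, records_out: int) -> float:
    """Fraction of incoming serving requests that yield a record towards billing."""
    if requests_in <= 0:
        raise ValueError("requests_in must be positive")
    return records_out / requests_in


def pattern_deviates(baseline: float, observed: float, tolerance: float = 0.1) -> bool:
    """Flag the interaction pattern when the ratio drifts beyond the tolerance.

    The tolerance of 0.1 is an arbitrary, illustrative value.
    """
    return abs(observed - baseline) > tolerance


# Hypothetical counts: during profiling every request produced a billing record;
# later, a backdoor filters out part of the records.
baseline = forwarding_ratio(1000, 1000)
observed = forwarding_ratio(1000, 700)
alarm = pattern_deviates(baseline, observed)
```

In this sketch neither component needs to look abnormal on its own; only the relationship between the two roles is checked.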

In one or more embodiments, the profiling is fine-grained based at least in part on PIDs associated with a microservice over time in a distributed environment.

Some embodiments advantageously provide methods, systems, and apparatuses for profiling microservices in a distributed virtualized environment. In one or more embodiments, it may be assumed that a service function is made of several components. These components are instantiated as containers which interact with each other. Therefore, microservices may refer to all the instances in a service function performing a defined type of activity.

Additionally, the teachings of the disclosure advantageously provide a more fine-grained approach that considers system calls from microservices, as compared with, for example, existing systems that do not use any method to identify different microservices.

Additionally, the teachings of the disclosure advantageously provide profiling of the actual run-time behavior for ML approaches, as opposed to microservice profiling described for representing the behavior of a microservice by testers.

Additionally, the teachings of the disclosure advantageously represent the interactions, for example, communications, between the groups of processes having the same role over the execution time of the service. Note that in the approach described herein, QDFG nodes represent the processes and the resources such as virtual file. Additionally, the system may aggregate calls made by a process role over time. These are one or more differences between the teaching of the disclosure and existing systems.

According to one aspect of the disclosure, a network node is provided. The network node includes processing circuitry configured to determine a plurality of process roles for a plurality of microservices associated with a virtual application and aggregate a plurality of data instances of the plurality of process roles. The processing circuitry is further configured to create a profile based at least on the aggregated plurality of data instances and perform an action based at least on the profile.

According to one or more embodiments of this aspect, the processing circuitry is further configured to determine a plurality of process identifier, PID, instances over a predefined time that are triggered by the plurality of microservices and determine a plurality of virtual process role identifiers based at least on the plurality of PID instances where each of the plurality of virtual process role identifiers corresponds to a respective one of the plurality of process roles. According to one or more embodiments of this aspect, at least one of the plurality of PID instances being assigned without consideration of a process functionality where each of the plurality of virtual process role identifiers is assigned based at least on a respective process functionality. According to one or more embodiments of this aspect, the aggregated plurality of data instances of the plurality of process roles aggregates redundant processes having the same virtual process role identifier of the plurality of virtual process role identifiers.
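As a non-limiting illustration of the aggregation described above, the following Python sketch groups traced system calls by a role key instead of by PID, so that redundant processes having the same role contribute to one set of counts. The trace record layout, the field values and the helper names are assumptions made only for this illustration.

```python
from collections import Counter, defaultdict

# Hypothetical trace records: (pid, image, command, syscall).
# The layout and values are illustrative assumptions, not from the disclosure.
TRACE = [
    (101, "billing", "worker", "read"),
    (101, "billing", "worker", "write"),
    (245, "billing", "worker", "read"),   # restarted instance: new PID, same role
    (317, "serving", "proxy", "accept"),
]


def role_id(image: str, command: str) -> str:
    """Derive a role key from functionality descriptors rather than from the PID."""
    return f"{image}/{command}"


def aggregate_by_role(trace):
    """Merge the system-call counts of all PID instances that share a role."""
    calls_per_role = defaultdict(Counter)
    pids_per_role = defaultdict(set)
    for pid, image, command, syscall in trace:
        rid = role_id(image, command)
        calls_per_role[rid][syscall] += 1
        pids_per_role[rid].add(pid)
    return dict(calls_per_role), dict(pids_per_role)


profiles, pids_per_role = aggregate_by_role(TRACE)
```

Here PIDs 101 and 245 are treated as redundant instances of the same role, so their system calls are counted together.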

According to one or more embodiments of this aspect, the processing circuitry is further configured to perform a hash function on each of the plurality of process roles. The plurality of virtual process role identifiers is based on the hash function on each of the plurality of process roles. According to one or more embodiments of this aspect, the processing circuitry is further configured to perform a hash function on at least one partial process tree where each of the at least one partial process tree includes at least two respective process roles of the plurality of process roles. The plurality of virtual process role identifiers are based on the hash function on the at least one partial process tree. According to one or more embodiments of this aspect, each one of the plurality of data instances corresponds to a respective system call.
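As a non-limiting illustration of deriving identifiers by hashing, the following Python sketch hashes a role descriptor and a two-node partial process tree. The choice of SHA-256, the truncation to 16 hexadecimal characters, the descriptor strings and the function names are assumptions made only for this illustration; this portion of the disclosure does not name a specific hash function or descriptor format.

```python
import hashlib


def role_hash(descriptor: str) -> str:
    """Hash a role descriptor (e.g., executable name plus arguments) to a stable ID."""
    return hashlib.sha256(descriptor.encode("utf-8")).hexdigest()[:16]


def partial_tree_hash(parent: str, child: str) -> str:
    """Hash a partial process tree made of two roles (parent role -> child role)."""
    joined = role_hash(parent) + ">" + role_hash(child)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


# Hypothetical descriptors; the same descriptor yields the same identifier
# regardless of which PID the operating system happened to assign.
worker_id = role_hash("nginx: worker process")
tree_id = partial_tree_hash("nginx: master process", "nginx: worker process")
```

Because the identifier depends only on the descriptor, a restarted process with a new PID maps to the same identifier, and the order of the two roles in the partial tree is preserved by the separator.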

According to one or more embodiments of this aspect, the profile corresponds to a consistent view of the plurality of microservices. According to one or more embodiments of this aspect, the action includes at least one of outlier detection of at least one of the plurality of microservices and detection of a network attack.
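As a non-limiting illustration of outlier detection based on a profile, the following Python sketch compares the system-call distribution observed for a role with the distribution stored in its profile, using an L1 distance between normalized frequencies. The distance measure, the threshold and the sample counts are assumptions made only for this illustration; the disclosure contemplates ML approaches that are not reproduced here.

```python
def normalize(counts: dict) -> dict:
    """Convert raw system-call counts into relative frequencies."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def l1_distance(profile: dict, observed: dict) -> float:
    """Sum of absolute frequency differences over all system calls seen in either set."""
    p, q = normalize(profile), normalize(observed)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def is_outlier(profile: dict, observed: dict, threshold: float = 0.5) -> bool:
    """Flag a role whose observed behavior is far from its profile.

    The threshold of 0.5 is an arbitrary, illustrative value.
    """
    return l1_distance(profile, observed) > threshold


# Hypothetical counts for one process role.
role_profile = {"read": 50, "write": 50}
normal_window = {"read": 48, "write": 52}
suspect_window = {"read": 10, "execve": 90}
```

With these sample counts, `is_outlier(role_profile, normal_window)` is false and `is_outlier(role_profile, suspect_window)` is true.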

According to one or more embodiments of this aspect, the plurality of process roles are defined within a distributed virtualized environment.

According to another aspect of the disclosure, a method for a network node is provided. A plurality of process roles for a plurality of microservices associated with a virtual application is determined. A plurality of data instances of the plurality of process roles are aggregated. A profile is created based at least on the aggregated plurality of data instances. An action is performed based at least on the profile.

According to one or more embodiments of this aspect, a plurality of process identifier, PID, instances over a predefined time that are triggered by the plurality of microservices are determined. A plurality of virtual process role identifiers are determined based at least on the plurality of PID instances where each of the plurality of virtual process role identifiers corresponds to a respective one of the plurality of process roles. According to one or more embodiments of this aspect, at least one of the plurality of PID instances is assigned without consideration of a process functionality where each of the plurality of virtual process role identifiers is assigned based at least on a respective process functionality. According to one or more embodiments of this aspect, the aggregated plurality of data instances of the plurality of process roles aggregates redundant processes having the same virtual process role identifier of the plurality of virtual process role identifiers.

According to one or more embodiments of this aspect, a hash function is performed on each of the plurality of process roles where the plurality of virtual process role identifiers are based on the hash function on each of the plurality of process roles. According to one or more embodiments of this aspect, a hash function is performed on at least one partial process tree where each of the at least one partial process tree includes at least two respective process roles of the plurality of process roles. The plurality of virtual process role identifiers are based on the hash function on the at least one partial process tree. According to one or more embodiments of this aspect, each one of the plurality of data instances corresponds to a respective system call.

According to one or more embodiments of this aspect, the profile corresponds to a consistent view of the plurality of microservices. According to one or more embodiments of this aspect, the action includes at least one of outlier detection of at least one of the plurality of microservices and detection of a network attack. According to one or more embodiments of this aspect, the plurality of process roles are defined within a distributed virtualized environment.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete understanding of the present embodiments, and the attendant advantages and features thereof, will be more readily understood by reference to the following detailed description when considered in conjunction with the accompanying drawings wherein:

FIG. 1 is a diagram of a representation of containers;

FIG. 2 is a diagram of a monolithic architecture and a microservices architecture;

FIG. 3 is a diagram of a monolithic architecture over time;

FIG. 4 is a diagram of a service handling requests with microservices over time;

FIG. 5 is a diagram of how microservices can be launched from different locations in a virtualized environment;

FIG. 6 is a block diagram of a system according to some embodiments of the present disclosure;

FIG. 7 is a flowchart of an example process in a network node according to some embodiments of the present disclosure;

FIG. 8 is a flowchart of another example process in a network node according to some embodiments of the present disclosure;

FIG. 9 is a diagram of process roles for the microservice;

FIG. 10 is a diagram of MDFG represented for the virtual application;

FIG. 11 is a diagram of a new MQDG that is generated based at least in part on virtual resources;

FIG. 12 is a diagram of a virtual IMS, Clearwater open source application;

FIG. 13 is a diagram of a graph representing a Clearwater implementation;

FIG. 14 is a diagram of another graph representing a Clearwater implementation;

FIG. 15 is a diagram of process roles defined inside the Clearwater application; and

FIG. 16 is a diagram of detected abnormal behavior.

DETAILED DESCRIPTION

Before describing in detail exemplary embodiments, it is noted that the embodiments reside primarily in combinations of apparatus components and processing steps related to profiling microservices in a distributed virtualized environment. Accordingly, components have been represented where appropriate by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein. Like numbers refer to like elements throughout the description.

As used herein, relational terms, such as “first” and “second,” “top” and “bottom,” and the like, may be used solely to distinguish one entity or element from another entity or element without necessarily requiring or implying any physical or logical relationship or order between such entities or elements. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the concepts described herein. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including” when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

In embodiments described herein, the joining term, “in communication with” and the like, may be used to indicate electrical or data communication, which may be accomplished by physical contact, induction, electromagnetic radiation, radio signaling, infrared signaling or optical signaling, for example. One having ordinary skill in the art will appreciate that multiple components may interoperate and modifications and variations are possible of achieving the electrical and data communication.

In some embodiments described herein, the term “coupled,” “connected,” and the like, may be used herein to indicate a connection, although not necessarily directly, and may include wired and/or wireless connections.

The term “network node” used herein can be any kind of network node such as a server and/or device such as in a cloud computing environment and/or any network environment, any of base station (BS), radio base station, base transceiver station (BTS), base station controller (BSC), radio network controller (RNC), gNode B (gNB), evolved Node B (eNB or eNodeB), Node B, multi-standard radio (MSR) radio node such as MSR BS, multi-cell/multicast coordination entity (MCE), integrated access and backhaul (IAB) node, relay node, donor node controlling relay, radio access point (AP), transmission points, transmission nodes, Remote Radio Unit (RRU), Remote Radio Head (RRH), a core network node (e.g., mobile management entity (MME), self-organizing network (SON) node, a coordinating node, positioning node, MDT node, etc.), an external node (e.g., 3rd party node, a node external to the current network), nodes in distributed antenna system (DAS), a spectrum access system (SAS) node, an element management system (EMS), etc. The network node may also comprise test equipment.

In some embodiments, the non-limiting terms wireless device (WD) or a user equipment (UE) are used interchangeably. The WD herein can be any type of wireless device capable of communicating with a network node or another WD over radio signals. The WD may also be a radio communication device, target device, device to device (D2D) WD, machine type WD or WD capable of machine to machine communication (M2M), low-cost and/or low-complexity WD, a sensor equipped with WD, tablet, mobile terminal, smart phone, laptop embedded equipment (LEE), laptop mounted equipment (LME), USB dongle, Customer Premises Equipment (CPE), an Internet of Things (IoT) device, or a Narrowband IoT (NB-IoT) device, etc.

Also, in some embodiments the generic term “radio network node” is used. It can be any kind of a radio network node which may comprise any of base station, radio base station, base transceiver station, base station controller, network controller, RNC, evolved Node B (eNB), Node B, gNB, Multi-cell/multicast Coordination Entity (MCE), IAB node, relay node, access point, radio access point, Remote Radio Unit (RRU), Remote Radio Head (RRH).

Note further, that functions described herein as being performed by a network node may be distributed over a plurality of wireless devices and/or network nodes. In other words, it is contemplated that the functions of the network node and wireless device described herein are not limited to performance by a single physical device and, in fact, can be distributed among several physical devices.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms used herein should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein. Embodiments provide profiling microservices in a distributed virtualized environment.

Referring again to the drawing figures, in which like elements are referred to by like reference numerals, there is shown in FIG. 6 a schematic diagram of a system 10 according to one or more embodiments. System 10 includes a plurality of network nodes 12a, 12b, 12c (collectively referred to as network node 12).

Network node 12 includes hardware 14 enabling it to communicate with other devices in system 10 such as with one or more other network nodes 12. The hardware 14 may include a communication interface 16 for setting up and maintaining a wired or wireless connection with an interface of a different device of the system 10 such as with one or more network nodes 12. In one or more embodiments, communication interface 16 may include a radio interface. The radio interface may be formed as or may include, for example, one or more RF transmitters, one or more RF receivers, and/or one or more RF transceivers.

In the embodiment shown, the hardware 14 of the network node 12 further includes processing circuitry 18. The processing circuitry 18 may include a processor 20 and a memory 22. In particular, in addition to or instead of a processor, such as a central processing unit, and memory, the processing circuitry 18 may comprise integrated circuitry for processing and/or control, e.g., one or more processors and/or processor cores and/or FPGAs (Field Programmable Gate Array) and/or ASICs (Application Specific Integrated Circuitry) adapted to execute instructions. The processor 20 may be configured to access (e.g., write to and/or read from) the memory 22, which may comprise any kind of volatile and/or nonvolatile memory, e.g., cache and/or buffer memory and/or RAM (Random Access Memory) and/or ROM (Read-Only Memory) and/or optical memory and/or EPROM (Erasable Programmable Read-Only Memory).

Thus, the network node 12 further has software 24 stored internally in, for example, memory 22, or stored in external memory (e.g., database, storage array, network storage device, etc.) accessible by the network node 12 via an external connection. The software 24 may be executable by the processing circuitry 18. The processing circuitry 18 may be configured to control any of the methods and/or processes described herein and/or to cause such methods, and/or processes to be performed, e.g., by network node 12. Processor 20 corresponds to one or more processors 20 for performing network node 12 functions described herein. The memory 22 is configured to store data, programmatic software code and/or other information described herein. In some embodiments, the software 24 may include instructions that, when executed by the processor 20 and/or processing circuitry 18, cause the processor 20 and/or processing circuitry 18 to perform the processes described herein with respect to network node 12. For example, processing circuitry 18 of the network node 12 may include profile unit 26 configured to perform one or more network node 12 functions as described herein such as with respect to profiling microservices in a distributed virtualized environment.

Although FIG. 6 shows various “units” such as profile unit 26 as being within a respective processor, it is contemplated that these units may be implemented such that a portion of the unit is stored in a corresponding memory within the processing circuitry. In other words, the unit may be implemented in hardware or in a combination of hardware and software within the processing circuitry.

FIG. 7 is a flowchart of an exemplary process in a network node 12 according to some embodiments of the present disclosure. One or more Blocks and/or functions performed by network node 12 may be performed by one or more elements of network node 12 such as by profile unit 26 in processing circuitry 18, processor 20, radio interface, etc. In one or more embodiments, network node 12 such as via one or more of processing circuitry 18, processor 20, communication interface 16 and radio interface is configured to profile (Block S100) a plurality of microservices in a distributed virtualized environment where the plurality of microservices are associated with an application, as described herein. In one or more embodiments, one or more of the plurality of microservices are provided by one or more logical nodes where some of the logical nodes may be implemented by network node 12 while some of the other logical nodes may be implemented by one or more other network nodes 12.

In one or more embodiments, network node 12 such as via one or more of processing circuitry 18, processor 20, communication interface 16 and radio interface is configured to perform (Block S102) at least one function based at least in part on the profile of the plurality of microservices, as described herein. For example, the at least one function may include one or more of detection of a network attack, detection of an abnormal process, etc.

According to one or more embodiments, the profiling of the plurality of services includes collecting and aggregating different instances of the plurality of microservices. According to one or more embodiments, the profiling of the plurality of services is based at least in part on the microservices contexts. According to one or more embodiments, the profiling of the plurality of microservices includes determining process roles for the plurality of microservices where at least one of the process roles provides a same microservice execution over time where different processes associated with the process roles are created in different logical containers in different network nodes.

According to one or more embodiments, the profiling of the plurality of microservices includes determining process roles for the plurality of microservices where each process role defines a chain of binaries for launching a process and the chain of binaries is associated with a microservice type. According to one or more embodiments, the profiling of the plurality of microservices includes determining process roles for the plurality of microservices where each of the process roles are defined at least in part by a port identifier.

According to one or more embodiments, the profiling of the plurality of microservices includes determining process roles for the plurality of microservices. The processing circuitry is further configured to generate a graph representing a data flow between parts of at least one of the process roles of one of a same microservice type and different microservice type. According to one or more embodiments, the profiling of the plurality of microservices includes determining process roles for the plurality of microservices where the process roles are associated with different processes created in the distributed virtualized environment based at least in part on a characteristic associated with the process roles. According to one or more embodiments, the characteristic is a chain of binaries.

FIG. 8 is a flowchart of another example process in a network node 12 according to some embodiments of the present disclosure. One or more Blocks and/or functions performed by network node 12 may be performed by one or more elements of network node 12 such as by profile unit 26 in processing circuitry 18, processor 20, radio interface, etc. In one or more embodiments, network node 12 such as via one or more of processing circuitry 18, processor 20, communication interface 16 and radio interface is configured to determine (Block S100) a plurality of process roles for a plurality of microservices associated with a virtual application, as described herein.

In one or more embodiments, network node 12 such as via one or more of processing circuitry 18, processor 20, communication interface 16 and radio interface is configured to aggregate (Block S100) a plurality of data instances of the plurality of process roles, as described herein. In one or more embodiments, network node 12 such as via one or more of processing circuitry 18, processor 20, communication interface 16 and radio interface is configured to create (Block S100) a profile based at least on the aggregated plurality of data instances, as described herein. In one or more embodiments, network node 12 such as via one or more of processing circuitry 18, processor 20, communication interface 16 and radio interface is configured to perform (Block S100) an action based at least on the profile, as described herein.

According to one or more embodiments, the processing circuitry 18 is further configured to: determine a plurality of process identifier, PID, instances over a predefined time that are triggered by the plurality of microservices, and determine a plurality of virtual process role identifiers based at least on the plurality of PID instances where each of the plurality of virtual process role identifiers corresponds to a respective one of the plurality of process roles.

According to one or more embodiments, at least one of the plurality of PID instances is assigned without consideration of a process functionality, whereas each of the plurality of virtual process role identifiers is assigned based at least on a respective process functionality. According to one or more embodiments, the aggregated plurality of data instances of the plurality of process roles aggregates redundant processes having the same virtual process role identifier of the plurality of virtual process role identifiers. According to one or more embodiments, the processing circuitry 18 is further configured to perform a hash function on each of the plurality of process roles where the plurality of virtual process role identifiers is based on the hash function on each of the plurality of process roles. According to one or more embodiments, the processing circuitry 18 is further configured to perform a hash function on at least one partial process tree where each of the at least one partial process tree includes at least two respective process roles of the plurality of process roles. The plurality of virtual process role identifiers are based on the hash function on the at least one partial process tree. According to one or more embodiments, each one of the plurality of data instances corresponds to a respective system call. According to one or more embodiments, the profile corresponds to a consistent view of the plurality of microservices.

According to one or more embodiments, the action includes at least one of outlier detection of at least one of the plurality of microservices and detection of a network attack. According to one or more embodiments, the plurality of process roles are defined within a distributed virtualized environment.

Embodiments provide profiling microservices in a distributed virtualized environment.

Having generally described arrangements for profiling microservices in a distributed virtualized environment, details for these arrangements, functions and processes are provided as follows, and which may be implemented by the network node 12.

Profiling process role

Referring back to FIGS. 3 and 4, there may be various PIDs associated with a microservice. In reality, as different microservices are initiated to handle incoming requests or perform operations for the applications, the PID instances may keep changing for the microservice. Additionally, microservices may be instantiated in different containers in different physical nodes. There may be a need to distinguish a microservice based at least in part on its role in a rather long period of time beyond microservice life span in a distributed environment.

The instant disclosure solves at least some of the problems with existing systems by representing and associating all PID instances of the same microservice over time to the same microservice. This approach may be referred to as process role profiling.

In one or more embodiments described herein, this process role may present the same microservice execution over time even though there are different processes that are created in different containers in different physical nodes to address the requests over time.

Note that a process role may correspond to a node in a graph representing a microservice. For example, in FIG. 9, all the logs corresponding to all the instances (i.e., data instances) of process role in a microservice over a period of time and in different locations are aggregated together such as via processing circuitry 18 under the same process role identity. These logs may then be used to extract features for profiling a microservice, as described herein.
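The aggregation described above can be illustrated with a minimal Python sketch. The record fields (`node`, `container`, `pid`, `syscall`), the role labels and the `pid_to_role` lookup are hypothetical names chosen for illustration only; the disclosure does not prescribe a log format.

```python
from collections import defaultdict


def aggregate_by_role(log_records, pid_to_role):
    """Group log records under their process role rather than their PID."""
    by_role = defaultdict(list)
    for record in log_records:
        key = (record["node"], record["container"], record["pid"])
        role = pid_to_role.get(key)
        if role is None:
            continue  # role unknown for this process; skipped in this sketch
        by_role[role].append(record)
    return dict(by_role)


# Two etcd instances on different nodes and one bono instance (assumed data).
logs = [
    {"node": "n1", "container": "c1", "pid": 101, "syscall": "read"},
    {"node": "n2", "container": "c7", "pid": 2045, "syscall": "write"},
    {"node": "n1", "container": "c2", "pid": 333, "syscall": "connect"},
]
pid_to_role = {
    ("n1", "c1", 101): "role-etcd",
    ("n2", "c7", 2045): "role-etcd",
    ("n1", "c2", 333): "role-bono",
}
aggregated = aggregate_by_role(logs, pid_to_role)
```

In this sketch both etcd records end up under the single identity `role-etcd`, even though they come from different PIDs, containers and nodes, so features can be extracted per role.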

The graph/profile representing the data flow between different process roles part of the same microservice type or different microservice types may be referred to as the Microservice Data Flow Graph (MDFG). For example, in FIG. 10, the MDFG represents different process roles for different microservice types for a virtualized application and the data flow for the application where the redundant data instances in FIG. 9 have been aggregated. This abstraction may then be used later to profile the overall virtualized application behavior.
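As a rough illustration of how redundant instances collapse into one MDFG edge, the following Python sketch counts data flows between process roles. The PIDs, role names and the `flows` list are assumed example data, and the edge-count representation is one possible choice rather than the representation used in the figures.

```python
from collections import Counter


def build_mdfg(flows, pid_to_role):
    """Return edge counts between process roles for observed PID-to-PID flows."""
    edges = Counter()
    for src_pid, dst_pid in flows:
        edges[(pid_to_role[src_pid], pid_to_role[dst_pid])] += 1
    return edges


# Two bono and two sprout instances share roles; PIDs are hypothetical.
pid_to_role = {11: "bono", 12: "bono", 21: "sprout", 22: "sprout", 31: "etcd"}
flows = [(11, 21), (12, 22), (21, 31), (12, 21)]
mdfg = build_mdfg(flows, pid_to_role)
```

Four PID-level flows reduce to two role-level edges here, which is the kind of stable abstraction the MDFG is meant to provide.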

Furthermore, the concept described herein for the processes can be extended to all the associated virtual resources. For example, in addition to process interactions, the virtual file accesses can be further added into the MDFG. The MDFG is extendable to virtual resources as the virtual resources used in a microservice type may always be the same irrespective of the location and time for that microservice. In one or more embodiments, the microservice instances of the same type may exhibit the same behavior whenever and wherever they are instantiated. In FIG. 11, a different version of the MDFG with the virtual resource accesses presented is illustrated.

In order to distinguish a process role, a chain of binaries may be used to launch a process. These binaries are associated with a microservice type in the application. The associated actions/logs from all those process instances over time and in different locations may be aggregated such that they correspond to the same role. Afterwards, a profile for the process role may be built such as via processing circuitry 18 in order to be used, for example, for machine learning activities. Note that one or more embodiments described herein may be based at least in part on logging the actions between different process roles in the same virtual application. The profile built in this manner represents the profile of interactions between microservices inside the same virtual application.

Therefore, one or more embodiments of the instant disclosure solve at least some of the problems with existing systems at least in part by associating several processes associated with the same microservice in different locations, then aggregating their behavior to build a process role profile, and then representing application behavior as interactions between these microservices.

However, one or more embodiments of the instant disclosure may go further as the use of a binary chain for defining a process role may not be enough. Untraceability may be a problem for solutions which work solely based at least in part on analysis of the tight or strict relation between components in a system, such as Syscall-based ML solutions for anomaly detection in a system.

Further, PID inconsistency may not be the only problem with data-driven approaches that have to be dealt with in a dynamic system. With the emergence of new networking schemes and architectures for containerized environments such as Docker, it is not uncommon for the packets to not follow the TCP/IP traditional practices. For example, analyzing Docker TCP communications shows that, in a session establishment handshake, it is possible that the server and requester use different IPs and port number between two same processes. This may add to the complexity of communication tracing between components of a system when the PIDs, IPs, and PORTs are not consistent and are dynamically changing.

The data driven approaches such as machine learning may be highly sensitive to consistency of data profiling of the same objects such that if these data driven approaches cannot draw the connection between the inconsistent data in the system, they may not be able to correlate a meaningful conclusion based on the raw data.

Therefore, a second complementary approach is described herein to define processes associated with the same process role.

Profiling process role definition

In this section, different methods are described for defining process roles in a distributed virtualized environment (e.g., containers). In one or more embodiments, a distributed virtualized environment is provided by one or more network nodes 12 that may correspond to, for example, one or more servers, devices, computing entities in one or more network nodes such as a network cloud.

Binary based classification

To define the processes part of the same microservice, the PIDs of the microservices may not be usable as these PIDs are assigned in an arbitrary way. For example, the PID may be assigned without consideration of a process type to which the PID is assigned such that a PID, by itself, does not reveal much information of the process to which it is assigned. At the same time, as in a virtualized and containerized microservice architecture environment, the same binary code can be invoked several times and in different roles, therefore referring to the binary code (i.e., characteristic associated with at least one process role and/or process) used to launch the process may not be sufficient for differentiation. For example, an ETCD binary can be invoked by component A in the virtual application but can also be invoked by component B in the same system such that the ETCD binary may be part of two different microservice types. Although two or more components are using exactly the same binary code to launch a process, the resulting ETCD processes may have different functionality and therefore they may also have different roles in the system, i.e., they are part of two different process roles and/or provide and/or perform different functions. In one or more embodiments, binary code and binary are used interchangeably.

One approach is to use the chain of binaries, i.e., binary chain, associated with the parent processes to identify the unique role of a process being performed in the virtualized environment. Using the previous example, it may be that BinaryA-BinaryETCD and BinaryB-BinaryETCD are used instead of PID1 and PID2. PID1 and PID2 are dynamic, but BinaryA-BinaryETCD and BinaryB-BinaryETCD are consistent with respect to the microservices. To adopt the convention of numeric representation, ProcessRole1=HASH(BinaryA-BinaryETCD) and ProcessRole2=HASH(BinaryB-BinaryETCD) are used.

In the case that the parent binaries are also similar, to make a distinction between the roles, the process may go one layer further into the heritage of the process. For example, ProcessRole1=HASH(BinaryA_parent-BinaryA-BinaryETCD).
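The binary-chain hash can be sketched in a few lines of Python. SHA-256 is used because the disclosure names it as one example of HASH; joining the chain with "-" and returning a hex digest are assumptions made for this illustration, as is the function name `chain_role_id`.

```python
import hashlib


def chain_role_id(binary_chain):
    """Hash an ordered chain of binary names (parent first) into a role id."""
    joined = "-".join(binary_chain)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


role_1 = chain_role_id(["BinaryA", "BinaryETCD"])
role_2 = chain_role_id(["BinaryB", "BinaryETCD"])
# The same chain observed later, under a different PID, gives the same id.
role_1_again = chain_role_id(["BinaryA", "BinaryETCD"])
# Going one layer further into the heritage of the process.
role_1_deeper = chain_role_id(["BinaryA_parent", "BinaryA", "BinaryETCD"])
```

Because only binary names enter the hash, the identifier does not depend on the PID assigned at launch time.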

Pseudo code for calculating the role id (i.e., virtual process role ID)

There are two functions (get_process_role_id and is_redundant_branch) that may call each other recursively in order to complete the task of calculating the role id of a given process such as via one or more of processing circuitry 18, processor 20, communication interface 16, profile unit 26, etc. For example, the virtual process role ID may be based on performing a hash function as described below.

DefineProcessRole(input TargetPID, input Process_Instantiation_Family_Tree, input StartPID)
    // TargetPID is the PID of the process for which we want to define the role
    // Process_Instantiation_Family_Tree is the TargetPID’s Instantiation Family Tree
    // StartPID is the process ID of the starting process in Process_Instantiation_Family_Tree for TargetPID
    // In one or more embodiments, the virtualized environment is considered where the virtualized part is separated
    // from the non virtualized part of the process tree. For example, StartPID is the docker engine process PID
    if there is any child process:
        hash_list_1 = TREE_HASH(for each item in children process list);
    if there is any sibling process:
        hash_list_2 = TREE_HASH(for each item in siblings list);
    if there is a parent process in tree:
        hash_list_3 = TREE_HASH(for each item in parents list);
    role_hash_list = Concatenate(hash_list_1, hash_list_2, hash_list_3);
    ProcessRoleID = LIST_HASH(role_hash_list);
    return ProcessRoleID

The source code below:

get_process_role_id(pid_input, process_instantiation_family_tree, terminating_process)
    pid0 = terminating_process;
    pid1 = pid_input;
    tree1 = process_instantiation_family_tree;
    tree2 = get_not_including_subtree(tree1, pid0);
    // this is used to separate the virtualized and
    // non virtualized part of the process tree.
    // for example, terminating process is the docker engine pid
    // a non-virtualized-related process that is branched from kernel process
    // may not affect the roles of microservices inside containers
    pid_list1 = process_childs_list = get_process_childs_list(pid1, tree2);
    pid_list2 = process_sibling_list = get_process_sibling_list(pid1, tree2);
    pid2 = process_parent = get_process_parent(pid1, tree2);
    hash1 = process_hash = HASH(BINARY_CERT(pid1));
    hash2 = hash3 = hash4 = none;
    if there is any child process:
        hash_list_1 = TREE_HASH(for each item in pid_list1);
        hash2 = LIST_HASH(hash_list_1);
    if there is any sibling process:
        hash_list_2 = TREE_HASH(for each item in pid_list2, tree2);
        hash3 = LIST_HASH(hash_list_2);
    if there is a parent process in tree:
        hash4 = parent_process_hash = get_process_role_id(pid2, tree2, pid0);
    hash_list_3 = list(hash1, hash2, hash3, hash4);
    RESULT = LIST_HASH(hash_list_3);
    return RESULT

LIST_HASH(list_input):
    list1 = list_input
    list2 = LIST_CLEAN(list1);
    list3 = sorted(list2);
    RESULT = 79123453462457; // arbitrary but fixed random value
    for i in range(1, len(list3))
        RESULT = HASH(RESULT*list3[i]);
    return RESULT

LIST_CLEAN(list_input):
    list1 = list_input;
    list2 = remove_none_values(list1); // in case a process does not have child, sibling, or parent
    list3 = remove_duplicates(list2);
    // This may be important as in virtualized environment,
    // there may be redundant roles that may need to be removed to
    // avoid calculating different role ids when a microservice scales out
    // or scales in
    RESULT = list3
    return RESULT

HASH(input):
    RESULT = SHA256(input); // this is one example of implementation
    return RESULT

TREE_HASH(pid_input, tree_input):
    pid1 = pid_input;
    tree1 = tree_input;
    tree2 = get_subtree_with_root_of(tree1, pid1);
    // it returns a subtree of tree1 where the pid1 is the root of that subtree
    hash1 = get_process_role_id(pid1, tree2, none);
    RESULT = hash1
    return RESULT

Therefore, while the PID assigned to a process may vary depending on, for example, when the process is triggered, the hash of a process will remain constant as, for example, the same subtree is triggered for a particular process.
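For readers who want to experiment, the following is a simplified, runnable Python sketch of the recursive role id calculation. It is an illustration under stated assumptions, not the implementation of the disclosure: the process tree is held in two hypothetical dictionaries (`parent` maps a PID to its parent PID, `binary` maps a PID to its binary name), `stop_pid` is treated as the top of the container subtree (ancestors above it are ignored), hashes are combined by string concatenation rather than multiplication, every element of the cleaned list is folded in, and the binary name stands in for BINARY_CERT. The example process trees are invented.

```python
import hashlib

SEED = "79123453462457"  # fixed start value, taken from the pseudocode


def sha(text):
    """HASH: SHA-256, which the pseudocode names as one possible choice."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def list_hash(hashes):
    """LIST_HASH: drop missing values and duplicates, sort, then fold."""
    cleaned = sorted({h for h in hashes if h is not None})
    result = SEED
    for h in cleaned:
        result = sha(result + h)
    return result


def children(pid, parent):
    return [p for p, par in parent.items() if par == pid]


def tree_hash(pid, parent, binary):
    """TREE_HASH: role hash of pid viewed as the root of its own subtree."""
    kids = [tree_hash(c, parent, binary) for c in children(pid, parent)]
    return list_hash([sha(binary[pid]), list_hash(kids) if kids else None])


def get_process_role_id(pid, parent, binary, stop_pid):
    """Role id built from the process, its children, siblings and ancestors."""
    kids = [tree_hash(c, parent, binary) for c in children(pid, parent)]
    sibs, parent_hash = [], None
    if pid != stop_pid and parent.get(pid) is not None:
        par = parent[pid]
        sibs = [tree_hash(s, parent, binary)
                for s in children(par, parent) if s != pid]
        parent_hash = get_process_role_id(par, parent, binary, stop_pid)
    return list_hash([
        sha(binary[pid]),
        list_hash(kids) if kids else None,
        list_hash(sibs) if sibs else None,
        parent_hash,
    ])


# First observation: shim 10 -> supervisord 11 -> {etcd 12, bono 13}; shim 20 -> etcd 21.
parent_a = {1: None, 10: 1, 11: 10, 12: 11, 13: 11, 20: 1, 21: 20}
binary_a = {1: "dockerd", 10: "containerd-shim", 11: "supervisord",
            12: "etcd", 13: "bono", 20: "containerd-shim", 21: "etcd"}

# After a restart: new PIDs, and bono scaled out to two replicas.
parent_b = {1: None, 510: 1, 511: 510, 512: 511, 513: 511, 514: 511}
binary_b = {1: "dockerd", 510: "containerd-shim", 511: "supervisord",
            512: "etcd", 513: "bono", 514: "bono"}

role_before = get_process_role_id(12, parent_a, binary_a, stop_pid=10)
role_after = get_process_role_id(512, parent_b, binary_b, stop_pid=510)
role_other = get_process_role_id(21, parent_a, binary_a, stop_pid=20)
```

In this sketch `role_before` and `role_after` are equal although every PID changed and a replica was added, because duplicates are removed before hashing, while `role_other` differs since that etcd process sits in a differently shaped subtree.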

Port Number Based Classification

In this section, an approach is described that may provide a connection between inconsistent data in order to, for example, advantageously provide ML-based solutions with consistency and higher accuracy. The port is used such as via one or more of processing circuitry 18, processor 20, communication interface 16, profile unit 26, etc. to identify processes having the same role. As used herein, process may refer to a logical process performed at a logical node provided by a network node 12. For inconsistent IPs and PIDs, the port number may be used as it may be the only consistent element in a TCP communication. Therefore, one or more embodiments described herein group the TCP packets (sent and received) such as via one or more of processing circuitry 18, processor 20, communication interface 16, profile unit 26, etc., by port number, which otherwise may not be groupable by IP or PID. Next, the receiving processes (PIDs) of received packets are identified and their process roles are calculated such as via one or more of processing circuitry 18, processor 20, communication interface 16, profile unit 26, etc. The sending processes (PIDs) of the sent packets may also need to be identified and their process roles calculated in the same way as for the receiving processes.

Eventually, over the period of time and TCP sessions, two process roles may emerge from both receiving and sending processes: one correlated with the port number. Then, a determination such as via one or more of processing circuitry 18, processor 20, communication interface 16, profile unit 26, etc. may be made that ProcessRoleX is communicating with ProcessRoleY over TCP at t1. For future communications at time t2, the determination may be made that ProcessRoleX is communicating with ProcessRoleY based at least in part on port number even if the original processes are restarted, the PIDs are changed and the IPs are inconsistent.

These two process roles may be considered the result of the profiling of one or more microservices where the two process roles ProcessRoleX and ProcessRoleY may be used to perform at least one function such as detection of a logical/network attack, determination that one or more of the processes are abnormal, etc.
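The port-based grouping can be sketched as follows in Python. The packet fields (`port`, `direction`, `pid`, `ip`), the example addresses and PIDs, and the `role_of_pid` lookup are assumptions for illustration; in practice the role of each PID would come from a role id calculation such as the one described earlier.

```python
from collections import defaultdict


def roles_by_port(packets, role_of_pid):
    """Group packets by port and collect sender and receiver roles per port."""
    ports = defaultdict(lambda: {"senders": set(), "receivers": set()})
    for pkt in packets:
        side = "senders" if pkt["direction"] == "sent" else "receivers"
        ports[pkt["port"]][side].add(role_of_pid[pkt["pid"]])
    return {port: (frozenset(v["senders"]), frozenset(v["receivers"]))
            for port, v in ports.items()}


# Observations at t1 and, after both processes restarted, at t2.
# IPs and PIDs differ between the two times; only the port is stable.
packets = [
    {"port": 2379, "direction": "sent", "pid": 100, "ip": "172.18.0.5"},
    {"port": 2379, "direction": "received", "pid": 200, "ip": "172.18.0.9"},
    {"port": 2379, "direction": "sent", "pid": 150, "ip": "172.18.0.7"},
    {"port": 2379, "direction": "received", "pid": 250, "ip": "172.18.0.3"},
]
role_of_pid = {100: "ProcessRoleX", 150: "ProcessRoleX",
               200: "ProcessRoleY", 250: "ProcessRoleY"}
result = roles_by_port(packets, role_of_pid)
```

Both the t1 and t2 sessions resolve to the same pair of roles on port 2379, which is the consistency that the port number based classification relies on.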

Cloud Implementation such as via one or more network nodes 12

Virtual IMS, Clearwater open source application

Role definition

Some examples of different roles for a Clearwater application are described below.

The following process tree of a containerized Clearwater IMS system (where the dynamic PIDs are removed and each process is represented by its binary) is considered and illustrated in FIG. 12. In one or more embodiments, the Clearwater IMS system is provided at least in part by system 10. Each container has its own unique process subtree. For bono, a HASH(containerd-shim_supervisord_bono_run-in-signalin_bono) is unique and sufficient to consistently identify this process role. However, for etcd, the HASH(containerd-shim_supervisord_etcd) is not unique anymore and the get_process_role_id(target_etcd_PID, process_instantiation_family_tree, target_containerd-shim) function may be needed in order to return a unique identifier.

In both cases, in case of container/process restart, the process role id may not change, and if multiple containers (microservices) of the same type are launched such as via one or more of processing circuitry 18, processor 20, communication interface 16, profile unit 26, etc., they may all produce the same process role id, which can be used in the use case listed below. For example, a full tree and/or subtree (i.e., partial tree or partial process tree) may be used to generate the virtual process role ID for the process described herein.

To enforce the process role identification, the process roles may be identified such as via one or more of processing circuitry 18, processor 20, communication interface 16, profile unit 26, etc. based at least in part on standard port associations for each microservice. For example, in a containerized Clearwater IMS system (shown in the following listing), the bono microservice is identified by association with its standard ports.

CONTAINER ID | IMAGE | COMMAND | CREATED | STATUS | PORTS | NAMES
b78e9987b9b5 | weaveworks/scope:1.11.6 | "/home/weave/entrypo..." | 2 weeks ago | Up 2 weeks | | weavescope
69d1a182febd | clearwater/sprout | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 5052/tcp, 5054/tcp, 0.0.0.0:32775->22/tcp | clearwater-docker_sprout_1
0c3eaa869c5b | clearwater/cassandra | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 7001/tcp, 9042/tcp, 9160/tcp, 0.0.0.0:32772->22/tcp | clearwater-docker_cassandra_1
8d41fca644d2 | clearwater/homestead-prov | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 8889/tcp, 0.0.0.0:32774->22/tcp | clearwater-docker_homestead-prov_1
228b639d8938 | clearwater/homer | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 7888/tcp, 0.0.0.0:32776->22/tcp | clearwater-docker_homer_1
9c502d104635 | clearwater/bono | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 0.0.0.0:3478->3478/tcp, 0.0.0.0:3478->3478/udp, 0.0.0.0:5060->5060/tcp, 0.0.0.0:5062->5062/tcp, 0.0.0.0:5060->5060/udp, 5058/tcp, 0.0.0.0:32778->22/tcp | clearwater-docker_bono_1
ab77521e8e45 | clearwater/ellis | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 0.0.0.0:80->80/tcp, 0.0.0.0:32773->22/tcp | clearwater-docker_ellis_1
039fa08643ab | clearwater/chronos | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 7253/tcp, 0.0.0.0:32768->22/tcp | clearwater-docker_chronos_1
8b5d1be48879 | clearwater/astaire | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 11311/tcp, 0.0.0.0:32770->22/tcp | clearwater-docker_astaire_1
59718e9529eb | clearwater/homestead | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 8888/tcp, 0.0.0.0:32769->22/tcp | clearwater-docker_homestead_1
af8bf0bd9ee9 | clearwater/ralf | "/usr/bin/supervisor..." | 2 weeks ago | Up 2 weeks | 10888/tcp, 0.0.0.0:32771->22/tcp | clearwater-docker_ralf_1
42f8c54a9f1d | quay.io/coreos/etcd:v2.2.5 | "/etcd -name etcd0 -..." | 2 weeks ago | Up 2 weeks | 2379-2380/tcp, 4001/tcp, 7001/tcp | clearwater-docker_etcd_1
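One way to derive port associations from a listing like the one above is sketched below in Python. The parsing rules (keep the container side of a `host->container` mapping, expand ranges such as `2379-2380`) and the function names are assumptions for illustration; only three of the listed containers are included to keep the example short.

```python
def parse_ports(ports_field):
    """Parse a PORTS column into a set of (container_port, protocol) pairs."""
    result = set()
    for entry in ports_field.split(","):
        entry = entry.strip()
        if not entry:
            continue
        spec = entry.split("->")[-1]  # keep the container side of a mapping
        port_part, proto = spec.split("/")
        if "-" in port_part:
            low, high = (int(p) for p in port_part.split("-"))
        else:
            low = high = int(port_part)
        for port in range(low, high + 1):
            result.add((port, proto))
    return result


def build_port_index(containers):
    """Map each (port, protocol) pair to the microservices that expose it."""
    index = {}
    for name, ports_field in containers.items():
        for key in parse_ports(ports_field):
            index.setdefault(key, set()).add(name)
    return index


containers = {
    "bono": "0.0.0.0:3478->3478/tcp, 0.0.0.0:3478->3478/udp, "
            "0.0.0.0:5060->5060/tcp, 0.0.0.0:5062->5062/tcp, "
            "0.0.0.0:5060->5060/udp, 5058/tcp, 0.0.0.0:32778->22/tcp",
    "cassandra": "7001/tcp, 9042/tcp, 9160/tcp, 0.0.0.0:32772->22/tcp",
    "etcd": "2379-2380/tcp, 4001/tcp, 7001/tcp",
}
index = build_port_index(containers)
# Ports exposed by exactly one microservice can identify it on their own.
unique_ports = {k: next(iter(v)) for k, v in index.items() if len(v) == 1}
```

In this sketch port 5060/tcp points only to bono, whereas 7001/tcp is shared by cassandra and etcd and 22/tcp by bono and cassandra, so such shared ports would need to be combined with the process tree based role id to tell the roles apart.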

FIGS. 13 and 14 each illustrate two graphs representing communications for the Clearwater implementation during 3 minutes of execution. On the left, the MDFG according to the teaching of the disclosure is illustrated, and on the right is a simple PID-based monitoring.

The MDFG approach advantageously allows for a coherent view of the communications for Clearwater. With the simple PID approach, the creation/deletion of microservices in Clearwater causes the entire graph to change every few minutes. The classical approach, i.e., the simple PID approach, may therefore not be usable to profile Clearwater behavior over time.

Profiling the process roles

The roles are defined inside the Clearwater application as illustrated in FIG. 15. Without a process role profiling approach, it may not be possible to profile the interactions between different microservices.

Utility: Using ML to detect abnormal behavior

An attack as illustrated in FIG. 16 that is based on a backdoor was purposely implemented inside the source code for testing. The attack on the backdoor was detected based at least in part on the application profiling of the interactions between different microservices.

Feasibility: Performance

The results from experiments indicate that the performance impact on the container is small.

Table 1 Accuracy of different machine learning algorithms

Table 1 shows the accuracy of machine learning algorithms based on the features generated from the consistent virtualized process roles. Three machine learning algorithms are presented in this table, namely, isolation forest, one class SVM, and the elliptic envelope algorithm. The first and third algorithms perform well in detecting outliers and have a relatively good normal behavior detection rate. Although one class SVM has the worst accuracy in outlier detection, it detects normal behavior most accurately among the three algorithms. The standard deviation for each accuracy is based on 100 iterations of the experiment. The relatively small standard deviations demonstrate that the averaged detection rates shown in the table are representative of the overall situation. These results also demonstrate the usability of the process role in generating features for machine learning and anomaly detection.

Some Aspects of the Disclosure

A system 10 for profiling microservices adapted to ML/data mining methods in containerized virtualized environment based at least in part on process role is provided. In one or more embodiments, this process role is not the same as process functionality, as described herein. For example, process role may refer to one or more characteristics of a process that may or may not include process functionality.

The profiling may be based at least in part on defining process roles for different process instances created for each microservice over time in a distributed environment.

Several methodologies are described herein to determine the process role for different process instances in the containerized virtualized environment based on the process tree and/or networking characteristics.

An identification (ID) of a process role for microservices in containerized virtualized environment is described herein.

Some Examples

Example A1. A network node 12 configured to, and/or comprising a communication interface 16 and/or comprising processing circuitry 18 configured to: profile a plurality of microservices in a distributed virtualized environment, the plurality of microservices being associated with an application; and perform at least one function based at least in part on the profile of the plurality of microservices.

Example A2. The network node 12 of Example A1, wherein the profiling of the plurality of services includes collecting and aggregating different instances of the plurality of microservices.

Example A3. The network node 12 of Example A1, wherein the profiling of the plurality of services is based at least in part on the microservices contexts.

Example A4. The network node 12 of Example A1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, at least one of the process roles providing a same microservice execution over time where different processes associated with the process roles are created in different logical containers in different network nodes 12.

Example A5. The network node 12 of Example A1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, each process role defining a chain of binaries for launching a process, the chain of binaries being associated with a microservice type.

Example A6. The network node 12 of Example A1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, each of the process roles being defined at least in part by a port identifier.

Example A7. The network node 12 of Example A1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices; and the processing circuitry 18 being further configured to generate a graph representing a data flow between parts of at least one of the process roles of one of a same microservice type and different microservice type.

Example A8. The network node 12 of Example A1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, the process roles being associated with different processes created in the distributed virtualized environment based at least in part on a characteristic associated with the process roles.

Example A9. The network node 12 of Example A8, wherein the characteristic is a chain of binaries.

Example Bl. A method, comprising: profiling (S100) a plurality of microservices in a distributed virtualized environment, the plurality of microservices being associated with an application; and performing (S102) at least one function based at least in part on the profile of the plurality of microservices.

Example B2. The method of Example B1, wherein the profiling of the plurality of microservices includes collecting and aggregating different instances of the plurality of microservices.

Example B3. The method of Example B1, wherein the profiling of the plurality of microservices is based at least in part on the microservice contexts.

Example B4. The method of Example B1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, at least one of the process roles providing a same microservice execution over time where different processes associated with the process roles are created in different logical containers in different network nodes 12.

Example B5. The method of Example B1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, each process role defining a chain of binaries for launching a process, the chain of binaries being associated with a microservice type.

Example B6. The method of Example B1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, each of the process roles being defined at least in part by a port identifier.

Example B7. The method of Example B1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices; and the method further comprising generating a graph representing a data flow between parts of at least one of the process roles of one of a same microservice type and different microservice type.

Example B8. The method of Example B1, wherein the profiling of the plurality of microservices includes determining process roles for the plurality of microservices, the process roles being associated with different processes created in the distributed virtualized environment based at least in part on a characteristic associated with the process roles.

Example B9. The method of Example B8, wherein the characteristic is a chain of binaries.
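
By way of non-limiting illustration only, the graph generation of Examples A7 and B7 may be sketched as follows. The sketch assumes that observed data flows are already available as (source process role, destination process role, byte count) tuples; the function name `build_role_graph`, the string role labels and the byte-count edge weight are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical flow record: (source role, destination role, bytes transferred).
Flow = Tuple[str, str, int]


def build_role_graph(flows: Iterable[Flow]) -> Dict[str, Dict[str, int]]:
    """Build a directed graph of data flow between process roles.

    Edges are keyed by process role rather than by PID, so flows observed
    from different short-lived processes of the same role are merged.
    """
    graph: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for src_role, dst_role, nbytes in flows:
        # Accumulate the volume carried by each role-to-role edge.
        graph[src_role][dst_role] += nbytes
    # Convert nested defaultdicts to plain dicts for a stable result.
    return {src: dict(dsts) for src, dsts in graph.items()}
```

The resulting adjacency mapping may be built between roles of a same microservice type or of different microservice types, depending on which flows are supplied.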

Therefore, the teachings described herein can consistently track microservices in a containerized environment, which may be necessary for ML approaches. Short-lived containerized components are part of the building blocks of a highly available and continuous service function. These short-duration microservices are each assigned a distinct PID. As a result, a microservice may be presented with many different PIDs in a short period of time. The classical approach of associating a behavior with one PID becomes obsolete, as the inconsistency in the PIDs of these microservices' processes makes it difficult to trace the microservice. The approach described herein provides fine-grained profiling of microservices in a distributed virtualized environment. In one or more embodiments, the approach collects and aggregates the traces for different instances of the microservices running in different locations, i.e., computing nodes, for the same virtual application. This approach then covers the appropriate modeling at least in part by taking into account at least one of scaling in/out, load balancing, migrations and other possible dynamic distributed aspects of the cloud for microservices.
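
As a minimal sketch of the aggregation described above, and not a definitive implementation, trace records from different PIDs, containers and nodes may be grouped under a common process role key. The record fields, the choice of (chain of binaries, port identifier) as the role key, and the `syscall_count` metric are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class TraceRecord:
    """One hypothetical trace sample collected from a single process."""
    node: str                      # computing node where the process ran
    container: str                 # logical container identifier
    pid: int                       # PID, which changes per instance
    binary_chain: Tuple[str, ...]  # chain of binaries that launched the process
    port: Optional[int]            # port identifier, if any
    syscall_count: int             # illustrative metric only


RoleKey = Tuple[Tuple[str, ...], Optional[int]]


def role_key(record: TraceRecord) -> RoleKey:
    # Assumption: a role is identified by its chain of binaries and port,
    # never by its PID, container or node.
    return (record.binary_chain, record.port)


def aggregate_by_role(records: Iterable[TraceRecord]) -> Dict[RoleKey, Dict[str, Any]]:
    """Aggregate data instances of each process role across PIDs and nodes."""
    profiles: Dict[RoleKey, Dict[str, Any]] = defaultdict(
        lambda: {"instances": 0, "nodes": set(), "pids": set(), "syscall_count": 0}
    )
    for record in records:
        profile = profiles[role_key(record)]
        profile["instances"] += 1
        profile["nodes"].add(record.node)
        profile["pids"].add(record.pid)
        profile["syscall_count"] += record.syscall_count
    return dict(profiles)
```

Because the key excludes the PID and the node, instances created by scaling out, load balancing or migration fall into the same aggregated profile.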

The approach described herein is able to distinguish between profiles of the same binary used in different virtualized environment contexts. For example, the same binary ETCD can be profiled in two or more different ways depending on the microservice contexts. This may be useful when it comes to fingerprinting different microservices even though they could share some binaries.
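
The distinction may be illustrated with the following sketch, which assumes that each process launch is recorded as a (microservice context label, chain of binaries) pair. The function names are hypothetical, and keying on the final binary alone is shown only for contrast.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical launch record: (microservice context label, chain of binaries).
Launch = Tuple[str, Tuple[str, ...]]


def keys_by_binary(launches: Iterable[Launch]) -> Counter:
    """Key profiles on the final binary only; contexts are merged."""
    return Counter(chain[-1] for _, chain in launches)


def keys_by_context(launches: Iterable[Launch]) -> Counter:
    """Key profiles on context plus the whole chain of binaries.

    The same binary launched in different microservice contexts
    yields separate profile keys.
    """
    return Counter((context, chain) for context, chain in launches)
```

With two launches of the same binary under different contexts, the first keying produces a single profile while the second produces two, which is the behavior that fingerprinting of microservices sharing binaries relies on.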

As will be appreciated by one of skill in the art, the concepts described herein may be embodied as a method, data processing system, computer program product and/or computer storage media storing an executable computer program. Accordingly, the concepts described herein may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects all generally referred to herein as a “circuit” or “module.” Any process, step, action and/or functionality described herein may be performed by, and/or associated to, a corresponding module, which may be implemented in software and/or firmware and/or hardware. Furthermore, the disclosure may take the form of a computer program product on a tangible computer usable storage medium having computer program code embodied in the medium that can be executed by a computer. Any suitable tangible computer readable medium may be utilized including hard disks, CD-ROMs, electronic storage devices, optical storage devices, or magnetic storage devices.

Some embodiments are described herein with reference to flowchart illustrations and/or block diagrams of methods, systems and computer program products. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer (to thereby create a special purpose computer), special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable memory or storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

It is to be understood that the functions/acts noted in the blocks may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows.

Computer program code for carrying out operations of the concepts described herein may be written in an object oriented programming language such as Java® or C++. However, the computer program code for carrying out operations of the disclosure may also be written in conventional procedural programming languages, such as the "C" programming language. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer. In the latter scenario, the remote computer may be connected to the user's computer through a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Many different embodiments have been disclosed herein, in connection with the above description and the drawings. It will be understood that it would be unduly repetitious and obfuscating to literally describe and illustrate every combination and subcombination of these embodiments. Accordingly, all embodiments can be combined in any way and/or combination, and the present specification, including the drawings, shall be construed to constitute a complete written description of all combinations and subcombinations of the embodiments described herein, and of the manner and process of making and using them, and shall support claims to any such combination or subcombination.

It will be appreciated by persons skilled in the art that the embodiments described herein are not limited to what has been particularly shown and described herein above. In addition, unless mention was made above to the contrary, it should be noted that all of the accompanying drawings are not to scale. A variety of modifications and variations are possible in light of the above teachings without departing from the scope of the following claims.