
Title:
WORKLOAD SCHEDULING BASED ON A PLATFORM ENERGY POLICY
Document Type and Number:
WIPO Patent Application WO/2012/082349
Kind Code:
A2
Abstract:
In some embodiments, a data center system may include one or more server platforms, a workload scheduler, and a set of stored data shared between the server platforms and the workload scheduler. The server platforms may include processing cores and memory in electrical communication with the processing cores. The memory may store code which when executed causes the server platform to store a platform power correlation factor, receive workload requirements for a workload from a workload scheduler, determine a current and expected energy consumption based on the workload requirements and the platform power correlation factor, communicate the current and expected energy consumption for the workload to the workload scheduler, and if the workload is dispatched to the server platform from the workload scheduler, store the workload requirements in the memory and modify characteristics of the server platform to execute the workload. The workload scheduler may determine if the workload can be sent to the server platform based on the current and expected energy consumption for the workload and pre-configured power and temperature thresholds for the server platform and also rack location, row location, and / or other data center specific information. The set of stored data may include a platform compute policy and / or a platform energy policy. Other embodiments are disclosed and claimed.

Inventors:
GIRI RAVI (IN)
Application Number:
PCT/US2011/062305
Publication Date:
June 21, 2012
Filing Date:
November 29, 2011
Assignee:
INTEL CORP (US)
GIRI RAVI (IN)
International Classes:
G06F1/32; G06F9/06
Foreign References:
US20040249515A1 (2004-12-09)
US20090187782A1 (2009-07-23)
US6167524A (2000-12-26)
US20090007128A1 (2009-01-01)
Attorney, Agent or Firm:
VINCENT, Lester, J. et al. (1279 Oakmead Parkway, Sunnyvale, California, US)
Claims:
CLAIMS

What is claimed is:

1. A server platform, comprising:

one or more processing cores; and

memory in electrical communication with the one or more processing cores, the memory storing code which when executed causes the server platform to:

store a platform power correlation factor;

receive workload requirements for a workload from a workload scheduler;

determine a current and expected energy consumption based on the workload requirements and the platform power correlation factor;

communicate the current and expected energy consumption for the workload to the workload scheduler; and

if the workload is dispatched to the server platform from the workload scheduler, store the workload requirements in the memory and modify characteristics of the server platform to execute the workload.

2. The server platform of claim 1, wherein the platform power correlation factor corresponds to an expected power draw at various levels of resource utilization.

3. The server platform of claim 1, wherein the workload requirements correspond to one or more of a number of processing cores, an amount of memory needed, and an expected run time.

4. The server platform of claim 1, wherein the workload scheduler is to determine if the workload can be sent to the server platform based on the current and expected energy consumption for the workload communicated to the workload scheduler from the server platform and pre-configured power and temperature thresholds for the server platform and also one or more of rack location, row location, and other data center specific information.

5. The server platform of claim 1, wherein the modified characteristics of the server platform include one or more of a processing core to switch off, a portion of memory to switch off, a power profile, and a performance profile.

6. The server platform of claim 1, wherein the memory to store the platform power correlation factor and the workload requirements includes a nonvolatile memory.

7. A method of operating a server platform, comprising:

storing a platform power correlation factor in a memory;

receiving workload requirements for a workload from a workload scheduler;

determining a current and expected energy consumption based on the workload requirements and the platform power correlation factor;

communicating the current and expected energy consumption for the workload to the workload scheduler; and

if the workload is dispatched to the server platform from the workload scheduler, storing the workload requirements in the memory and modifying characteristics of the server platform to execute the workload.

8. The method of claim 7, wherein the platform power correlation factor corresponds to an expected power draw at various levels of resource utilization.

9. The method of claim 7, wherein the workload requirements correspond to one or more of a number of processing cores, an amount of memory needed, and an expected run time.

10. The method of claim 7, wherein the workload scheduler is to determine if the workload can be sent to the server platform based on the current and expected energy consumption for the workload communicated to the workload scheduler from the server platform and pre-configured power and temperature thresholds for the server platform and also one or more of rack location, row location, and other data center specific information.

11. The method of claim 7, wherein the modified characteristics of the server platform include one or more of a processing core to switch off, a portion of memory to switch off, a power profile, and a performance profile.

12. The method of claim 7, wherein the memory for storing the platform power correlation factor and the workload requirements includes a non-volatile memory.

13. A data center system, comprising:

one or more server platforms;

a workload scheduler; and

a set of stored data shared between the one or more server platforms and the workload scheduler,

wherein at least one of the server platforms includes:

one or more processing cores; and

memory in electrical communication with the one or more processing cores, the memory storing code which when executed causes the server platform to:

store a platform power correlation factor;

receive workload requirements for a workload from a workload scheduler;

determine a current and expected energy consumption based on the workload requirements and the platform power correlation factor;

communicate the current and expected energy consumption for the workload to the workload scheduler; and

if the workload is dispatched to the server platform from the workload scheduler, store the workload requirements in the memory and modify characteristics of the server platform to execute the workload,

wherein the workload scheduler is to determine if the workload can be sent to the server platform based on the current and expected energy consumption for the workload communicated to the workload scheduler from the server platform and pre-configured power and temperature thresholds for the server platform and also one or more of rack location, row location, and other data center specific information,

and wherein the set of stored data includes at least one of a platform compute policy and a platform energy policy.

14. The data center system of claim 13, wherein the platform power correlation factor corresponds to an expected power draw at various levels of resource utilization.

15. The data center system of claim 13, wherein the workload requirements correspond to one or more of a number of processing cores, an amount of memory needed, and an expected run time.

16. The data center system of claim 13, wherein the modified characteristics of the workload include one or more of a processing core to switch off, a portion of memory to switch off, a power profile, and a performance profile.

17. The data center system of claim 13, wherein the memory to store the platform power correlation factor and the workload requirements includes a non-volatile memory.

Description:
WORKLOAD SCHEDULING BASED ON A PLATFORM ENERGY POLICY

The invention relates to power management and more particularly to workload scheduling of an electronic system based on a platform energy policy.

BACKGROUND AND RELATED ART

The article "Above the Clouds: A Berkeley View of Cloud Computing," written by Michael Armbrust et al., dated February 10, 2009, discusses the need for energy proportionality in data centers. Various companies provide hardware and / or software for power management. For example, Intel Corporation's Dynamic Power Node Manager and Data Center Manager are hardware and / or software power management tools for a server or a group of servers.

BRIEF DESCRIPTION OF THE DRAWINGS

Various features of the invention will be apparent from the following description of preferred embodiments as illustrated in the accompanying drawings, in which like reference numerals generally refer to the same parts throughout the drawings. The drawings are not necessarily to scale, the emphasis instead being placed upon illustrating the principles of the invention.

Fig. 1 is a block diagram of a server platform in accordance with some embodiments of the invention.

Fig. 2 is a block diagram of a data center system in accordance with some embodiments of the invention.

Fig. 3 is a flow diagram in accordance with some embodiments of the invention.

Fig. 4 is a block diagram of another data center system in accordance with some embodiments of the invention.

Fig. 5 is another flow diagram in accordance with some embodiments of the invention.

DESCRIPTION

In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular structures, architectures, interfaces, techniques, etc. in order to provide a thorough understanding of the various aspects of the invention. However, it will be apparent to those skilled in the art having the benefit of the present disclosure that the various aspects of the invention may be practiced in other examples that depart from these specific details. In certain instances, descriptions of well known devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

With reference to Fig. 1, a server platform 10 may include one or more processing cores 12 and memory 14 in electrical communication with the one or more processing cores 12. For example, the memory 14 may store code which when executed causes the server platform 10 to store a platform power correlation factor, receive workload requirements for a workload from a workload scheduler, determine a current and expected energy consumption based on the workload requirements and the platform power correlation factor, communicate the current and expected energy consumption for the workload to the workload scheduler, and if the workload is dispatched to the server platform from the workload scheduler, store the workload requirements in the memory and modify characteristics of the server platform to execute the workload.

For example, in some embodiments of the server platform 10, the platform power correlation factor may correspond to an expected power draw at various levels of resource utilization. The workload requirements may correspond to one or more of a number of processing cores, an amount of memory needed, and an expected run time. For example, the workload scheduler may be configured to determine if the workload can be sent to the server platform 10 based on the current and expected energy consumption for the workload communicated to the workload scheduler from the server platform 10 and pre-configured power and temperature thresholds for the server platform 10 and also one or more of rack location, row location, and other data center specific information.

In some embodiments of the server platform 10, the modified characteristics of the server platform may include one or more of a processing core to switch off, a portion of memory to switch off, a power profile, and a performance profile. For example, the memory 14 to store the platform power correlation factor and the workload requirements may include a non-volatile memory, such as flash memory.

With reference to Figure 2, a data center system 20 may include one or more server platforms 22, a workload scheduler 24, and a set of stored data 26 shared between the one or more server platforms 22 and the workload scheduler 24. For example, at least one of the server platforms 22 may include one or more processing cores and memory in electrical communication with the one or more processing cores. The memory may store code which when executed causes the server platform 22 to store a platform power correlation factor, receive workload requirements for a workload from the workload scheduler, determine a current and expected energy consumption based on the workload requirements and the platform power correlation factor, communicate the current and expected energy consumption for the workload to the workload scheduler, and if the workload is dispatched to the server platform from the workload scheduler, store the workload requirements in the memory and modify characteristics of the server platform to execute the workload.

In the data center system 20 in accordance with some embodiments of the invention, the workload scheduler 24 may determine if the workload can be sent to the server platform 22 based on the current and expected energy consumption for the workload communicated to the workload scheduler 24 from the server platform 22 and pre-configured power and temperature thresholds for the server platform 22 and also one or more of rack location, row location, and other data center specific information. For example, the set of stored data may include at least one of a platform compute policy and a platform energy policy.

In some embodiments of the data center system 20, the platform power correlation factor may correspond to an expected power draw at various levels of resource utilization. The workload requirements may correspond to one or more of a number of processing cores, an amount of memory needed, and an expected run time. The modified characteristics of the server platform may include one or more of a processing core to switch off, a portion of memory to switch off, a power profile, and a performance profile. For example, the memory to store the platform power correlation factor and the workload requirements may include a non-volatile memory, such as a flash memory.

With reference to Figure 3, a method of operating a server platform in accordance with some embodiments of the invention may include storing a platform power correlation factor in a memory (e.g. at block 30), receiving workload requirements for a workload from a workload scheduler (e.g. at block 31), determining a current and expected energy consumption based on the workload requirements and the platform power correlation factor (e.g. at block 32), communicating the current and expected energy consumption for the workload to the workload scheduler (e.g. at block 33), and if the workload is dispatched to the server platform from the workload scheduler, storing the workload requirements in the memory and modifying characteristics of the server platform to execute the workload (e.g. at block 34).
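The flow of Fig. 3 can be sketched in code. The class and method names below are illustrative only; the patent does not prescribe an API, and the simple cores-times-hours energy estimate is an invented placeholder for whatever the platform power correlation factor actually encodes:

```python
# Sketch of the server-platform method of Fig. 3 (blocks 30-34).
# All names and the energy formula are illustrative assumptions.

class ServerPlatform:
    def __init__(self, power_correlation_factor):
        # Block 30: store the platform power correlation factor.
        self.power_correlation_factor = power_correlation_factor
        self.stored_workloads = []

    def quote_energy(self, requirements):
        # Blocks 31-33: receive workload requirements and report the
        # current and expected energy consumption to the scheduler.
        expected = (requirements["cores"]
                    * requirements["run_time_hours"]
                    * self.power_correlation_factor)
        current = sum(w["expected_energy"] for w in self.stored_workloads)
        return {"current_energy": current, "expected_energy": expected}

    def dispatch(self, requirements, expected_energy):
        # Block 34: on dispatch, store the workload requirements in
        # memory; modifying platform characteristics would follow here.
        self.stored_workloads.append(
            {"requirements": requirements, "expected_energy": expected_energy})
```

A scheduler would call `quote_energy` first and only call `dispatch` if the quote passes its thresholds.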

For example, the platform power correlation factor may correspond to an expected power draw at various levels of resource utilization (e.g. at block 35). For example, the workload requirements may correspond to one or more of a number of processing cores, an amount of memory needed, and an expected run time (e.g. at block 36).

In some embodiments of the invention, the workload scheduler may determine if the workload can be sent to the server platform based on the current and expected energy consumption for the workload communicated to the workload scheduler from the server platform and pre-configured power and temperature thresholds for the server platform and also one or more of rack location, row location, and other data center specific information (e.g. at block 37).

For example, the modified characteristics of the server platform may include one or more of a processing core to switch off, a portion of memory to switch off, a power profile, and a performance profile (e.g. at block 38). For example, the memory for storing the platform power correlation factor and the workload requirements may include a non-volatile memory (e.g. at block 39), such as a flash memory.

Advantageously, some embodiments of the invention may provide a technique for data center energy efficiency with power and thermal aware workload scheduling. For example, some embodiments of the invention may involve balancing IT load, energy efficiency, location awareness, and / or a platform power correlation. For example, some embodiments of the invention may be useful in a data center utilizing server platforms that have service processors with temperature and power sensors (e.g. IPMI 2.0 and above including, for example, Intel's Node Manager).

By way of background and without limitation, the cost of energy for a large scale data center (e.g. HPC and / or internet/cloud provider) may be the single largest operational expense for the data center. Such data center environments may see relatively high server resource utilization (e.g. CPU, memory, I/O) and as a result higher energy consumption for running the servers as well as cooling them. Advantageously, some embodiments of the invention may provide a platform capability that helps lower energy cost with little or no throughput impact.

For example, some embodiments of the invention may provide platform level hardware and / or software capabilities that workload schedulers can use to intelligently schedule and dispatch jobs to achieve improved or optimal compute utilization as well as energy consumption. For example, some embodiments of the invention may provide an energy policy engine for HPC / cloud type of computing needs.

In accordance with some aspects of the invention, some combination of the following data may be utilized to perform effective power and thermal aware scheduling:

1. The available compute capacity, expressed, for example, in normalized units such as SPECint or SPECfp, or in an application-specific performance indicator used by a particular data center (e.g. an HPC shop);

2. The requirements for the workload. For example, the workload requirements may include information about a preferred execution environment for the workload such as architecture, number of cores, memory, and / or disk space, among other workload requirement information such as priority or criticality;

3. Current (and expected) power draw at different granularities, for example, in watts for a particular server / rack / row / data center configuration; and / or

4. Location information related to the server platforms. For example, row / rack / data center location information.
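The four kinds of data above could be gathered into simple per-platform and per-workload records. The field names below are illustrative, not taken from the patent:

```python
# Illustrative data model for the four scheduling inputs listed above.
from dataclasses import dataclass

@dataclass
class PlatformRecord:
    # 1. Available compute capacity in a normalized unit (e.g. SPECint).
    compute_capacity: float
    # 3. Current and expected power draw, in watts.
    current_power_w: float
    expected_power_w: float
    # 4. Location information: row / rack / data center.
    row: int
    rack: int
    data_center: str

@dataclass
class WorkloadRequirements:
    # 2. Preferred execution environment for the workload,
    #    plus other requirement information such as priority.
    cores: int
    memory_gb: int
    disk_gb: int
    priority: str = "normal"
```

A workload scheduler could then match a `WorkloadRequirements` instance against a pool of `PlatformRecord` instances.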

Advantageously, some embodiments of the invention may include a platform power correlation factor stored in memory. For example, the platform power correlation factor may be embedded in the firmware. For example, the platform power correlation factor may allow the data center system to determine an expected power draw at various levels of resource utilization as well as to determine an expected power draw if some of the resources were switched off. The data center system may also have the ability to record the location information for the server platforms and / or components of the server platforms in the data center.
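One way such a correlation factor could be realized is as a table of measured (utilization, power) calibration points, interpolated at query time. The table values below are invented for illustration; the patent does not specify the factor's representation:

```python
def expected_power(correlation_table, utilization):
    """Linearly interpolate the expected platform power draw (watts)
    from a table of (utilization_fraction, watts) calibration points."""
    points = sorted(correlation_table)
    # Clamp below the first and above the last calibration point.
    if utilization <= points[0][0]:
        return points[0][1]
    if utilization >= points[-1][0]:
        return points[-1][1]
    for (u0, p0), (u1, p1) in zip(points, points[1:]):
        if u0 <= utilization <= u1:
            frac = (utilization - u0) / (u1 - u0)
            return p0 + frac * (p1 - p0)

# Hypothetical calibration: 120 W at idle, 380 W at full load.
TABLE = [(0.0, 120.0), (0.5, 250.0), (1.0, 380.0)]
```

Evaluating the table with some resources assumed off would simply use a second table measured in that configuration.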

Some server platforms may provide some capability (e.g. Intel's Node Manager™) to manage server power consumption (e.g. read the server power usage and set basic policies for controlling power usage). Advantageously, some embodiments of the present invention may provide a method for workload schedulers to interact directly with the platform and leverage existing platform abilities, such as a node manager, to efficiently schedule workloads while optimizing energy consumption.

With reference to Fig. 4, in accordance with some embodiments of the invention a data center system 40 includes one or more server platforms 42 in communication with a workload scheduler 44 and a data share 46. For example, the server platform 42 may include a combination of software (e.g. drivers, operating systems, and applications) and hardware / firmware (e.g. a manageability engine interface with extensions, a flash memory store, a platform interface, a node manager, a service processor, and LAN hardware). For example, the data share 46 may include information related to a configuration management database (CMDB), compute policies, and energy policies.

Advantageously, some embodiments of the invention may provide a direct interface for workload schedulers to interact with platform capabilities, for example, the ability to map workload efficiency to the power consumption of each platform and / or the ability to record a platform's location in the data center and assist in self-manageability. For example, the platform interface to the workload schedulers may provide a mechanism to store the relevant bits of data in platform-level flash storage.

With reference to Fig. 5, in some embodiments of the invention the workload scheduler may send the workload requirements to a server platform (e.g. number of cores, amount of memory needed, and an expected run time; e.g. at block 50). The server platform may respond with a current and expected energy consumption based on the power-to-performance correlation factor (e.g. based on the workload requirements and the platform power correlation factor; e.g. at blocks 51 and 52). The server platform may also provide additional information from the node manager and / or service processor (e.g. location information; e.g. at block 53). The workload scheduler may then determine if the workload can be sent to that server platform based on how the response matches against pre-configured power and temperature thresholds for that rack / row / data center (e.g. at blocks 54 and 55). If the workload can be run on the server platform, then the workload scheduler may dispatch the job and store the workload requirements in the data store (e.g. at block 56). Based on the workload requirements, the server platform may modify some operating characteristics to execute the workload (e.g. switching off some cores, etc.; e.g. at block 57) and perform those actions (e.g. utilizing the internal interface to the node manager / service processor; e.g. at block 58).
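The scheduler's decision at blocks 54 and 55 amounts to comparing the platform's response against per-location thresholds. A minimal sketch, with invented field names and threshold values:

```python
def can_dispatch(response, thresholds):
    """Decide whether a workload may be sent to a platform (blocks 54-55).

    `response` carries the platform's quoted power figures, ambient
    temperature, and location; `thresholds` maps a (row, rack) location
    to pre-configured power and temperature limits. All field names
    here are illustrative assumptions, not from the patent."""
    limits = thresholds[(response["row"], response["rack"])]
    power_ok = (response["current_power_w"] + response["expected_power_w"]
                <= limits["max_power_w"])
    temp_ok = response["ambient_temp_c"] <= limits["max_temp_c"]
    return power_ok and temp_ok
```

On a `True` result the scheduler would dispatch the job and record the workload requirements in the shared data store (block 56).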

In one non-limiting example, for a highly critical workload that needs two cores and all the memory, the server platform can automatically switch the remaining cores to a low-power state and ensure no power capping is done, to achieve the highest throughput. In another non-limiting example, for a multi-threaded workload that does not need all the system memory, all cores can be switched to a high-power state but some of the memory DIMMs can be turned off. In another non-limiting example, if the service processor / node manager reports higher ambient temperatures, the server platform may shut down half the cores and update the performance capability data so that the workload scheduler is aware of the degraded capability.
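The three non-limiting examples above can be read as a small policy table mapping workload traits to platform adjustments. The trait and action names in this sketch are invented for illustration:

```python
def plan_adjustments(workload, total_cores, ambient_temp_c, temp_limit_c=40.0):
    """Return a list of (action, value) platform adjustments for a
    dispatched workload, following the three examples in the text.
    All names and the 40 C limit are illustrative assumptions."""
    actions = []
    if ambient_temp_c > temp_limit_c:
        # High ambient temperature: shut down half the cores and
        # report the degraded capability to the workload scheduler.
        actions.append(("power_off_cores", total_cores // 2))
        actions.append(("update_capability", "degraded"))
    elif workload["critical"]:
        # Critical workload: keep its cores uncapped, idle the rest.
        spare = total_cores - workload["cores"]
        actions.append(("low_power_cores", spare))
        actions.append(("disable_power_capping", True))
    elif not workload["needs_all_memory"]:
        # Multi-threaded workload with spare memory: run all cores
        # at high power but switch off unused DIMMs.
        actions.append(("high_power_cores", total_cores))
        actions.append(("power_off_unused_dimms", True))
    return actions
```

The returned actions would then be carried out through the internal interface to the node manager / service processor.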

Advantageously, some embodiments of the invention may also help in system management. For example, being able to query the server platform itself for performance capability information and location information may enable highly accurate and reliable manageability.

In accordance with some embodiments of the invention, components of an energy efficient data center with power and thermal aware workload scheduling may include a flash memory based data store, extensions to a manageability engine interface to allow the host OS and applications such as workload schedulers to transact with the server platforms, and an interface to the server platform firmware / BIOS, service processor, and other related platform capabilities such as a node manager.

The foregoing and other aspects of the invention are achieved individually and in combination. The invention should not be construed as requiring two or more of such aspects unless expressly required by a particular claim. Moreover, while the invention has been described in connection with what is presently considered to be the preferred examples, it is to be understood that the invention is not limited to the disclosed examples, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and the scope of the invention.