

Title:
CORE CHARACTERIZATION FOR PROCESSOR CORES
Document Type and Number:
WIPO Patent Application WO/2017/131667
Kind Code:
A1
Abstract:
A circuit includes a processor and a memory. The processor executes computer-executable instructions from the memory. A core characterizer operated by the processor characterizes process variations across a set of processor cores. The core characterizer includes a core stimulator to exercise each member of the set of processor cores to cause core changes across the set of cores. A core analyzer monitors the core changes of the processor cores to generate a proximity map for the set of processor cores. The proximity map stores boost data that includes location data and the thermal data for each member of the set of processor cores. The thermal data describes thermal characteristics for each of the processor cores based on the process variations.

Inventors:
BACHA, Anys (5475 Rings Road, Atrium II North Tower, Suite 20, Dublin Ohio, 43017, US)
WINICK, Bradley D. (3404 E. Harmony Road, Fort Collins, Colorado, 80528-9544, US)
VADEN, Thomas L. (446 Whiton Road, Neshanic Station, New Jersey, 08853, US)
Application Number:
US2016/015148
Publication Date:
August 03, 2017
Filing Date:
January 27, 2016
Assignee:
HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP (11445 Compaq Center Drive West, Houston, Texas, 77070, US)
International Classes:
G06F11/30; G06F1/08; G06F1/26; G06F9/48; G06F11/07
Foreign References:
US20150178138A12015-06-25
US20060290365A12006-12-28
US7463992B22008-12-09
US20090271141A12009-10-29
US20140245314A12014-08-28
Attorney, Agent or Firm:
SURESH, Anup Ashok et al. (Hewlett Packard Enterprise, 3404 E. Harmony Road, Mail Stop 7, Fort Collins Colorado, 80528, US)
Claims:
CLAIMS

What is claimed is:

1. A circuit, comprising:

a processor and a memory, the processor executes computer-executable instructions from the memory;

a core characterizer operated by the processor to characterize a process variation across a set of processor cores, the core characterizer comprising:

a core stimulator to exercise each member of the set of processor cores to cause a set of core changes across the set of processor cores; and

a core analyzer to monitor the set of core changes of the processor cores to generate a proximity map for the set of processor cores, the proximity map storing boost data that includes location data and thermal data for each member of the set of processor cores, and the thermal data describing a thermal characteristic for each of the processor cores based on the process variations.

2. The circuit of claim 1, comprising a voltage adjuster operated by the core stimulator to cause a set of voltage changes across the set of cores.

3. The circuit of claim 2, comprising a fault detector operated by the core analyzer to detect if a task has failed in response to the set of voltage changes, the voltage of a particular processor core at the task failure point indicating a thermal characteristics of the particular processor core.

4. The circuit of claim 1, comprising a power module installed by the core stimulator and operated on each member of the set of processor cores to cause heating and cooling of the processor cores.

5. The circuit of claim 4, comprising a temperature detector operated by the core analyzer to determine a temperature ramp rate for each member of the set of processor cores, the temperature ramp rate indicates thermal characteristics of the respective processor core.

6. The circuit of claim 1, wherein the boost data comprises minimum and maximum frequency boost information describing the minimum time for a processor core to operate at a first frequency to facilitate cooling and a maximum time to operate at a second frequency based on the thermal characteristics, the first frequency being lower than the second frequency.

7. The circuit of claim 1, wherein the proximity map is arranged as a set of stacks describing multiple levels of processor cores, each level includes at least one tile having at least two processors per tile.

8. A system, comprising:

a set of processor cores to execute computer executable instructions from a memory;

a scheduler to process a proximity map having boost data that includes location data and the thermal data for each member of the set of processor cores, the location data describes where each member is located relative to other members in the set of processor cores and the thermal data describes thermal characteristics for each of the processor cores based on the process variations; and

a frequency booster operated by the scheduler to selectively cycle the frequency of each member of the set of processor cores based on the location data and thermal data that characterizes the thermal properties of each member of the set of processor cores.

9. The system of claim 8, wherein the scheduler is a supervisory processor, a supervisory processor from the set of processor cores, or an operating system.

10. The system of claim 8, wherein the boost data comprises minimum and maximum frequency boost information describing the minimum time for a processor core to operate at a first frequency to facilitate cooling and a maximum time to operate at a second frequency based on the thermal characteristics, the first frequency being lower than the second frequency.

11. The system of claim 8, wherein the proximity map is arranged as a set of stacks describing multiple levels of processor cores, each level includes at least one tile having at least two processors per tile, the scheduler employs the set of stacks and at least one tile to selectively optimize the availability of boost durations between stacks or tiles.

12. The system of claim 11, where the scheduler selectively boosts frequency of a respective processor core by maximizing a distance to at least one other processor core based on the location data.

13. The system of claim 12, wherein the scheduler selectively boosts frequency of one member within a tile while concurrently reducing the frequency of at least one other core member within the tile.

14. A method, comprising:

exercising each member of a set of processor cores to cause core changes across the set of cores;

monitoring the core changes of the processor cores to generate a proximity map for the set of processor cores, the proximity map stores boost data that includes location data and the thermal data for each member of the set of processor cores, the thermal data describing thermal characteristics for each of the processor cores based on process variations across the set of cores; and

cycling the frequency of each member of the set of processor cores based on the location data and thermal data that characterizes the thermal properties of each member of the set of processor cores.

15. The method of claim 14, comprising inferring thermal capabilities and boost durations of processor cores based on leakage current of the processor cores.

Description:
CORE CHARACTERIZATION FOR PROCESSOR CORES

BACKGROUND

[0001] For several years, the microprocessor industry has experienced steady increases in processor core count per integrated circuit package as a result of continuous improvements in process technology. This resulted in processors reaching what is now known as many-core scaling. As the industry transitions to future fabrication techniques such as three-dimensional (3D) die-stacking, software operating the respective processor cores can also be optimized accordingly to take advantage of new chip characteristics to provide commensurate improvements in performance.

BRIEF DESCRIPTION OF THE DRAWINGS

[0002] FIG. 1 is a block diagram illustrating an example of a circuit to characterize processor cores in accordance with the present disclosure.

[0003] FIG. 2 is a block diagram illustrating an example of a system to schedule tasks within a set of processor cores based on a proximity map in accordance with the present disclosure.

[0004] FIG. 3 is a block diagram illustrating an example of a core layout for boosting core frequency in accordance with the present disclosure.

[0005] FIG. 4 is a block diagram illustrating an example of a core characterizer to characterize processor cores in accordance with the present disclosure.

[0006] FIG. 5 is a flow chart illustrating an example of a method to characterize processor cores and perform thermal-based scheduling in accordance with the present disclosure.

DETAILED DESCRIPTION

[0007] This disclosure relates to a core characterizer to generate a proximity map of multiple processor cores. The proximity map is employed by a scheduler to strategically boost (e.g., increase or decrease core frequency) core members to facilitate thermal management and processing efficiency of the collective set of cores. The proximity map includes boost data that describes where a respective core resides in a hierarchy of cores and includes data describing the thermal characteristics of the respective core. Based on the boost data, the scheduler can selectively increase and decrease operating frequencies of core members to enable some cores to cool while others are heating as their workloads increase due to elevated operating frequency. Cooling core members can act as heat sinks for other core members. This facilitates extending frequency boost cycles for core members by minimizing thermal coupling effects and taking advantage of process variation differences in the respective cores.

[0008] For each processor design, chip manufacturers define a thermal design point (TDP) limit that the die should not exceed. This is a consideration for 3D die-stacked designs, for example. However, different cores have different boost capabilities due to process variation as well as their location on the chip. Studies have shown that, because heat increases relatively slowly (over hundreds of milliseconds to a few seconds), thermal headroom can be temporarily exploited to boost core frequencies and provide increases in performance. However, the amount of thermal coupling that occurs can decrease this duration and the ability to perform such boosting cycles. In this disclosure, a low-system-impact core scheduling method is provided that minimizes the effects of thermal coupling to allow extended boosting on die-stacked core designs. Also, the core characterizer can dynamically characterize the capabilities of the cores based on process variation and location within a given set of cores. This allows constructing a proximity map that is reflective of the chip's floorplan within the given set of cores. The proximity map can then be utilized for thermal-based task scheduling such as via a dedicated supervisor processor and/or operating system.

[0009] FIG. 1 illustrates an example of a circuit 100 to characterize processor cores in accordance with the present disclosure. The circuit 100 includes a processor 110 and a memory 114. The processor 110 executes computer-executable instructions from the memory 114. A core characterizer 120 operated by the processor 110 characterizes process variations across a set of processor cores 130 (core set 130) having cores 1 through N, with N being a positive integer. The core characterizer 120 includes a core stimulator 140 to exercise each member of the set of processor cores 130 to cause core changes across the set of cores (e.g., voltage changes, temperature changes). A core analyzer 150 monitors the core changes of the processor cores 130 to generate a proximity map 160 for the set of processor cores. The proximity map 160 stores boost data that includes location data and the thermal data for each member of the set of processor cores 130. The thermal data describes thermal characteristics for each of the processor cores based on the process variations. Although the core characterizer 120, the core stimulator 140, and core analyzer 150 are shown as executing from memory 114, a set of these can be implemented as a logic circuit which can be similarly executed by the processor 110.

[00010] As will be illustrated and described below with respect to FIG. 4, the circuit 100 can also include a voltage adjuster operated by the core stimulator 140 to cause voltage changes across the set of cores. The circuit 100 can also include a fault detector operated by the core analyzer 150 to detect if a task has failed in response to the voltage changes, where the voltage of the processor core at the task failure point indicates thermal characteristics of the respective processor core. In another example, the circuit 100 can include a power module that is installed by the core stimulator 140 and operated on each member of the set of processor cores 130 to cause heating and cooling of the processor cores. The circuit 100 can also include a temperature detector operated by the core analyzer 150 to determine a temperature ramp rate for each member of the set of processor cores 130. The temperature ramp rate indicates thermal characteristics of the respective processor core. The boost data described herein can include minimum and maximum frequency boost information describing the minimum time for a processor core to operate at a lower frequency to facilitate cooling and a maximum time to operate at a higher frequency based on the thermal characteristics. Such data can be processed by a scheduler such as illustrated and described below with respect to FIG. 2. The proximity map 160 can be arranged as a set of stacks describing multiple levels of processor cores where each level includes at least one tile having at least two processors per tile (See e.g., FIG. 3).

[00011] The proximity map enables ordering the set of cores 130 in the map according to their boost capabilities. One example used to characterize the cores follows from an observation related to minimum core voltage. The ability of a core to reach a lower voltage without failing implies that it exhibits more leakage, which in turn implies that it is more susceptible to thermal runaway conditions and thus would become hotter faster. In this example, the core characterizer 120 runs a series of self-tests on each core before the cores are handed off to a supervisory processor or an operating system, for example. For a given core that is under test, the voltage can be gradually lowered by the core stimulator 140 while the core runs the self-test. The voltage can be gradually dropped in small decrements depending on the granularity of the voltage regulator supplying voltage to the core. The voltage decrementing process continues until the core fails. When a failure occurs, the core is prompted to enter an error handler. At this point, the lowest safe voltage that the core under test achieved is recorded by the core analyzer 150 and used to compute the boost duration for that respective core. The core voltage is then raised back to a safe level, and the core characterizer 120 proceeds to testing another core. This testing can be conducted in parallel for performance reasons if desired.
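By way of a non-limiting illustration, the voltage decrementing process described above can be sketched in Python as follows. The function names, the millivolt units, the default sweep limits, and the callback standing in for the self-test are assumptions made for this sketch and are not part of the disclosure. The disclosure does not specify how boost duration is computed from the recorded voltage, so the sketch only records the lowest safe voltage and orders the cores by it.

```python
from typing import Callable, Dict, List


def find_lowest_safe_voltage(
    run_self_test: Callable[[int], bool],  # hypothetical: True if the self-test passes at this voltage
    start_mv: int,
    floor_mv: int,
    step_mv: int,
) -> int:
    """Lower the voltage in fixed decrements until the self-test fails.

    Returns the last voltage (in millivolts) at which the self-test passed.
    """
    if not run_self_test(start_mv):
        raise ValueError("core fails at the starting voltage")
    lowest_safe = start_mv
    voltage = start_mv - step_mv
    # Stop at the first failure, or when the regulator floor is reached.
    while voltage >= floor_mv and run_self_test(voltage):
        lowest_safe = voltage
        voltage -= step_mv
    return lowest_safe


def characterize_cores(
    core_tests: Dict[int, Callable[[int], bool]],
    start_mv: int = 1000,  # assumed sweep limits, for illustration only
    floor_mv: int = 600,
    step_mv: int = 10,
) -> Dict[int, int]:
    """Record the lowest safe voltage for each core under test."""
    return {
        core_id: find_lowest_safe_voltage(test, start_mv, floor_mv, step_mv)
        for core_id, test in core_tests.items()
    }


def order_by_boost_capability(lowest_safe_mv: Dict[int, int]) -> List[int]:
    """Order cores so that those with a higher minimum voltage (lower inferred
    leakage) come first; cores reaching lower voltages are treated as leakier."""
    return sorted(lowest_safe_mv, key=lambda core: lowest_safe_mv[core], reverse=True)


# Simulated cores: each fails below a made-up threshold voltage.
simulated = {0: lambda mv: mv >= 730, 1: lambda mv: mv >= 800}
results = characterize_cores(simulated)
ordering = order_by_boost_capability(results)
```

In this simulated run, core 0 reaches the lower voltage and is therefore treated as the leakier core, so it is placed after core 1 in the ordering.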

[00012] In another core characterization example, a power module (e.g., code designed to increase processor activity) is run on the different cores during boot operations, and the core analyzer 150 characterizes the ramp rate of the temperature on each core. This can entail inserting idle periods where cores can be cooled down after testing. In both examples of either voltage variation or power module execution, the core characterization can be conducted once during first boot (or at predetermined intervals), and the proximity map and associated boost data can be saved in non-volatile memory for subsequent boots. The characterization tests can also be rerun after a certain number of boots to account for aging effects on hardware.

[00013] When a subsequent processor obtains the proximity information, it constructs a table that uses (stack, tile, core) coordinates to describe the location of each core on the chip along with its minimum and maximum boost durations. The stack number corresponds to the stack level in the 3D die, for example. The tile number corresponds to the tile location within a stack. The core number is the core number within a given tile. The number of cores per tile can vary (See e.g., FIG. 3, where 4 cores per tile are employed as an example). The minimum and maximum boost durations can depend on the number of active cores within the tile. For example, if a single core is active per tile, it can be boosted over the maximum duration described by proximity map 160.
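As one possible illustration of the table described above, the following Python sketch keys each entry by its (stack, tile, core) coordinates and stores the two boost durations. The type names, field names, millisecond units, and sample values are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int]  # (stack, tile, core); 1-based indices assumed


@dataclass(frozen=True)
class BoostEntry:
    min_low_freq_ms: int   # minimum time at the lower frequency, to allow cooling
    max_high_freq_ms: int  # maximum time at the higher (boosted) frequency


def build_table(rows: List[Tuple[int, int, int, int, int]]) -> Dict[Coord, BoostEntry]:
    """Build a lookup table keyed by (stack, tile, core)."""
    table: Dict[Coord, BoostEntry] = {}
    for stack, tile, core, min_ms, max_ms in rows:
        table[(stack, tile, core)] = BoostEntry(min_ms, max_ms)
    return table


def cores_in_tile(table: Dict[Coord, BoostEntry], stack: int, tile: int) -> List[Coord]:
    """Return the coordinates of every core recorded for one tile."""
    return sorted(c for c in table if c[0] == stack and c[1] == tile)


# Made-up sample rows: (stack, tile, core, min_low_freq_ms, max_high_freq_ms)
table = build_table([
    (1, 1, 1, 200, 500),
    (1, 1, 2, 250, 400),
    (2, 1, 1, 300, 350),
])
tile_cores = cores_in_tile(table, stack=1, tile=1)
```

Keying by coordinate tuple keeps the location data and the thermal data for a core in a single record, which is how the boost data is described in this disclosure.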

[00014] FIG. 2 illustrates an example of a system 200 to schedule tasks within a set of processor cores 210 based on a proximity map 220 in accordance with the present disclosure. The set of processor cores 210 execute computer executable instructions from a memory (not shown). A scheduler 230 processes the proximity map 220 having boost data 240 that includes location data and the thermal data for each member of the set of processor cores 210. The location data describes where each member is located relative to other members in the set of processor cores 210 and the thermal data describes thermal characteristics for each of the processor cores based on the process variations. A frequency booster 250 operated by the scheduler 230 selectively cycles the frequency of each member of the set of processor cores 210 based on the location data and thermal data that characterizes the thermal properties of each member of the set of processor cores.

[00015] The scheduler 230 can be a supervisory processor separate from the processor cores 210, a supervisory processor from the set of processor cores, or an operating system that interacts with the set of processor cores. As noted previously, the boost data 240 can include minimum and maximum frequency boost information describing the minimum time for a processor core to operate at a lower frequency to facilitate cooling and a maximum time to operate at a higher frequency based on the thermal characteristics. The proximity map 220 can be arranged as a set of stacks (See e.g., FIG. 3) describing multiple levels of processor cores where each level includes at least one tile having at least two processors per tile. The scheduler 230 selectively boosts frequency of a respective processor core by maximizing a distance to at least one other processor core based on the location data. The scheduler 230 can also selectively boost frequency of one member within a tile while concurrently reducing the frequency of at least one other core member within the tile.

[00016] By processing the proximity map 220, the scheduler 230 facilitates increasing the amount of silicon fragments that can act as heat-sinks for active cores. This can include forming tiles of adjacent cores. One core can be activated per tile initially. When a tile has been marked with an active core (e.g., core 1 in tile), the scheduler 230 would select a different tile that does not have any active cores (an unmarked tile) for scheduling a new task. Core stacks can also be prioritized by the use of a different stack when selecting a new tile. To allow efficient removal of heat and air flow, the scheduler 230 can impose an execution restriction that maintains a maximum number of active tiles throughout the stack (vertically) during the initial execution phase defined by a set of tasks operating on a set of cores.
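The selection of an unmarked tile, with preference given to a different stack, can be sketched as follows. This is a minimal example under assumed names: tiles are represented as hypothetical (stack, tile) pairs, and the tie-breaking rule (first candidate in list order) is an assumption rather than something stated in this disclosure.

```python
from typing import List, Optional, Set, Tuple

Tile = Tuple[int, int]  # (stack, tile); representation assumed for this sketch


def pick_unmarked_tile(
    all_tiles: List[Tile],
    marked: Set[Tile],
    last_stack: Optional[int],
) -> Optional[Tile]:
    """Pick a tile with no active core, preferring a stack other than the one
    used for the previous selection. Returns None when every tile is marked."""
    unmarked = [t for t in all_tiles if t not in marked]
    if not unmarked:
        return None
    other_stack = [t for t in unmarked if t[0] != last_stack]
    # Fall back to the same stack only when no other stack has an unmarked tile.
    return (other_stack or unmarked)[0]


all_tiles = [(s, t) for s in (1, 2, 3) for t in (1, 2, 3)]
first_pick = pick_unmarked_tile(all_tiles, marked={(1, 1)}, last_stack=1)
```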

[00017] As the vertical stacks of cores become full, the scheduler 230 can schedule the second core of each tile in the position indicated utilizing a largest Manhattan distance from one core to another. Then, other cores can be allocated (e.g., cores 3 and 4 within the tile after cores 1 and 2 have been boosted down). In one example, boosting within a tile can be performed in core pairs (or other subset designation such as triplets). The scheduler 230 would then alternate by boosting one subset of cores (e.g., cores 1 and 2) while another subset of cores (e.g., cores 3 and 4) remain unboosted. After cores 1 and 2 run for a while, the scheduler 230 then unboosts them via the frequency booster 250 so they cool down while it boosts other cores. The scheduler 230 continues to alternate between these frequency boosting configurations within a given tile. Similarly, tiles selected for boosting are selected in an alternating manner. Thus, the scheduler 230 can employ the set of stacks and/or tiles to selectively optimize the availability of boost durations between stacks or tiles.
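The alternation between boosted and unboosted core subsets within a tile can be illustrated with the following Python sketch. The generator name and the representation of subsets as tuples of core numbers are assumptions; the timing of each phase, which would come from the boost data, is deliberately left out.

```python
from itertools import cycle, islice
from typing import Iterator, Sequence, Tuple

CoreSubset = Tuple[int, ...]


def boost_phases(
    core_subsets: Sequence[CoreSubset],
) -> Iterator[Tuple[CoreSubset, CoreSubset]]:
    """Yield (boosted, unboosted) core groups, rotating through the subsets
    so that each subset is boosted in turn while the others cool down."""
    all_cores = [core for subset in core_subsets for core in subset]
    for subset in cycle(core_subsets):
        unboosted = tuple(core for core in all_cores if core not in subset)
        yield tuple(subset), unboosted


# Core pairs as in the example above: cores 1 and 2, then cores 3 and 4.
phases = list(islice(boost_phases([(1, 2), (3, 4)]), 3))
```

The same generator accepts triplets or other subset designations, since it only rotates through whatever subsets it is given.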

[00018] FIG. 3 illustrates an example of core layouts 300 for boosting core frequency in accordance with the present disclosure. In this example, three processing stacks are provided where each stack includes three processing tiles. Each processing tile includes four processing cores. The core layouts 300 are for example purposes only; more or fewer than three stacks can be utilized, more or fewer than three tiles per stack can be employed, and more or fewer than four processing cores per tile can be utilized as disclosed herein. In a first level of scheduling, tile 2 of stack 3 can be boosted. Boosting within tile 2 can include alternating between processor cores 1 and 2 in one boosting cycle, and cores 3 and 4 in another boosting cycle. The boost frequency of each core can be set based on the boost data that was generated as previously described. If other cores are activated, tile 3 on stack 1 can then be boosted if it is the furthest vertical and horizontal distance from tile 2 operating on stack 3. In this manner, thermal heating and cooling can be spread out over the entirety of the processor cores to facilitate efficient thermal management of the cores. Other cores can be similarly activated such as via tile 1 in stack 2 which is also located at a distance from the other activated tiles as described herein. Periods of frequency boosting and unboosting can be moved across the stacks, tiles, and within tiles so that hotspots are not generated in one area of an integrated circuit and to utilize cooler unboosted cores as heat sinks for more active boosted cores.
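As a hedged illustration of distance-based tile selection, the following Python sketch picks the idle tile whose nearest active tile is farthest away. It assumes that tiles sit on a regular grid addressed by hypothetical (stack, tile) pairs and that the combined vertical and horizontal distance is measured as a Manhattan distance; the actual floorplan of FIG. 3 may differ, so the sketch is not intended to reproduce the specific tile order given above.

```python
from typing import Iterable, Optional, Set, Tuple

Tile = Tuple[int, int]  # (stack, tile) on an assumed regular grid


def manhattan(a: Tile, b: Tile) -> int:
    """Vertical (stack) distance plus horizontal (tile) distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def farthest_idle_tile(candidates: Iterable[Tile], active: Set[Tile]) -> Optional[Tile]:
    """Return the idle tile that maximizes the distance to its nearest active tile."""
    idle = [t for t in candidates if t not in active]
    if not idle:
        return None
    if not active:
        return idle[0]  # nothing is active yet, so any idle tile will do
    return max(idle, key=lambda t: min(manhattan(t, a) for a in active))


tiles = [(s, t) for s in (1, 2, 3) for t in (1, 2, 3)]
pick = farthest_idle_tile(tiles, active={(3, 2)})
```

With tile 2 of stack 3 active, the pick lands on stack 1 under these assumptions; which tile of stack 1 is chosen depends on tie-breaking, which this disclosure does not specify.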

[00019] FIG. 4 illustrates an example of a core characterizer 410 to characterize processor cores in accordance with the present disclosure. The core characterizer 410 can include a voltage adjuster 420 operated by a core stimulator 430 to cause voltage changes across a set of cores. For example, a programmable voltage regulator can be provided to increase and decrease the voltage to a respective core. The core characterizer 410 can also include a fault detector 440 operated by a core analyzer 450 to detect if a task has failed in response to the voltage changes. When a core fault is detected (e.g., processing task fails) the voltage of the processor core at the task failure point indicates thermal characteristics of the respective processor core.

[00020] In another example, the core stimulator 430 can install power module 460 (e.g., firmware module to exercise core) that is operated on each member of the set of processor cores 130 to cause heating and cooling of the processor cores. The power module 460 can be a rigorous set of processor instructions to exercise each portion of the processor core in a concurrent manner (e.g., cache portions, register portions, I/O portions, and so forth). During operations of the power module, a temperature detector 470 operated by the core analyzer 450 determines a temperature ramp rate for each member of the set of processor cores. For example, linear curves can be developed that describe the heating increase of a given core over time as the power module 460 is executed where the temperature ramp rate indicates thermal characteristics of the respective processor core. In some examples, voltage changing and fault detection are applied to characterize a given core. In another example, power module exercising and ramp rate determinations are utilized to characterize cores. In yet another example, both voltage change/fault detection and power module exercising/ramp determination can be used to characterize a given core.
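One way to develop the linear curve mentioned above is an ordinary least-squares fit of temperature against time, with the slope taken as the temperature ramp rate. The following Python sketch assumes that temperature samples have already been collected while the power module runs; the function name, units, and sample values are illustrative only, and the least-squares method itself is an assumption rather than a technique named in this disclosure.

```python
from typing import Sequence


def temperature_ramp_rate(times_s: Sequence[float], temps_c: Sequence[float]) -> float:
    """Return the least-squares slope of temperature over time (deg C per second)."""
    n = len(times_s)
    if n < 2 or n != len(temps_c):
        raise ValueError("need at least two samples and equal-length inputs")
    mean_t = sum(times_s) / n
    mean_temp = sum(temps_c) / n
    # Sum of squared time deviations; zero means all samples share one timestamp.
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time samples must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_temp) for t, y in zip(times_s, temps_c))
    return sxy / sxx


# Made-up samples: a core heating by 2 deg C each second.
rate = temperature_ramp_rate([0.0, 1.0, 2.0, 3.0], [40.0, 42.0, 44.0, 46.0])
```

A core with a steeper slope heats faster under the same workload, which is the thermal characteristic the temperature detector 470 is described as capturing.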

[00021] In view of the foregoing structural and functional features described above, an example method will be better appreciated with reference to FIG. 5. While, for purposes of simplicity of explanation, the method is shown and described as executing serially, it is to be understood and appreciated that the method is not limited by the illustrated order, as parts of the method could occur in different orders and/or concurrently from that shown and described herein. Such method can be executed by various components and executed by an integrated circuit, computer, or a controller, for example.

[00022] FIG. 5 is an example of a method 500 to characterize processor cores and perform thermal-based scheduling in accordance with the present disclosure. At 510, the method 500 includes exercising each member of a set of processor cores to cause core changes across the set of cores (e.g., via core stimulator 140 of FIG. 1). At 520, the method 500 includes monitoring the core changes of the processor cores to generate a proximity map for the set of processor cores (e.g., via core analyzer 150 of FIG. 1). The proximity map stores boost data that includes location data and the thermal data for each member of the set of processor cores. The thermal data describes thermal characteristics for each of the processor cores based on process variations across the set of cores. At 530, the method 500 includes cycling the frequency of each member of the set of processor cores based on the location data and thermal data that characterizes the thermal properties of each member of the set of processor cores (e.g., via the scheduler 230 and frequency booster 250 of FIG. 2).

[00023] As noted previously, the boost data can include minimum and maximum frequency boost information describing the minimum time for a processor core to operate at a lower frequency to facilitate cooling and a maximum time to operate at a higher frequency based on the thermal characteristics. Exercising each member of the set of cores can include voltage changes applied to each member of the core and/or running a power module on the respective core. Monitoring the core changes can include core fault detection in response to voltage changes and/or ramp rate determinations in response to power module execution as described herein. The method 500 can also include inferring thermal capabilities and boost durations of processor cores based on leakage current of the processor cores. As described previously, cores that can operate at lower test voltages can be indicative of higher leakage and thus are cores to have their boosting frequencies reduced relative to other cores exhibiting lower leakage characteristics.

[00024] What have been described above are examples. It is, of course, not possible to describe every conceivable combination of components or methods, but one of ordinary skill in the art will recognize that many further combinations and permutations are possible. Accordingly, the examples are intended to embrace all such alterations, modifications, and variations that fall within the scope of this application, including the appended claims. Additionally, where the disclosure or claims recite "a," "an," "a first," or "another" element, or the equivalent thereof, it should be interpreted to include one or more than one such element, neither requiring nor excluding two or more such elements. As used herein, the term "includes" means includes but not limited to, and the term "including" means including but not limited to. The term "based on" means based at least in part on.