THOMAS, Bijo (AE Eindhoven, NL-5656, NL)
CLAIMS:
1. A method for dynamically partitioning a cache memory in a multiprocessor for a set of application tasks comprising the steps of: storing a scheduling pattern of said set of application tasks in a storage; selecting a subset of application tasks from the set of application tasks and updating a cache controller logic with said subset of application tasks, wherein said subset of application tasks comprise a set of application tasks which are executed in the upcoming clock cycles; and allocating cache partitions dynamically to the subset of application tasks updated in said cache controller logic.
2. The method of claim 1, wherein said storage comprises a task scheduler, said task scheduler stores the scheduling pattern of the application tasks.
3. The method of claim 1, wherein said updating the cache controller logic comprises updating a partition control register with the selected subset of application tasks.
4. The method of claim 3, wherein said allocating comprises allocating cache partitions dynamically to the selected subset of application tasks in said partition control register, wherein a dynamic partition unit performs the allocation of cache partitions.
5. The method of claim 1 wherein, the application tasks comprises a sequence of instructions.
6. A system for dynamically partitioning a cache memory in a multiprocessor for a set of application tasks comprising: a task scheduler (205, 405) for storing a scheduling pattern of said set of application tasks; and cache controller logic (215, 415) for selecting and updating a subset of application tasks from the set of application tasks, and for allocating cache partitions dynamically to said subset of application tasks updated in said cache controller logic.
7. The system of claim 6, wherein said task scheduler (205, 405) comprises a look up table (210, 410) for storing the scheduling pattern of the application tasks.
8. The system of claim 6, wherein the cache controller logic (415) comprises: a partition control register (425) for updating the subset of application tasks; and a dynamic partition unit (420) for allocating cache partitions dynamically to the subset of application tasks.
9. The system of claim 8, wherein the partition control register (425) is updated by the task scheduler (405) according to the scheduling pattern stored in the look up table
(410).
10. The system of claim 6, wherein the application tasks comprises a sequence of instructions. |
Dynamic cache partitioning
FIELD OF THE INVENTION
The present invention relates in general to a data processing system comprising cache storage, and more specifically relates to dynamic partitioning of the cache storage for application tasks in a multiprocessor.
BACKGROUND OF THE INVENTION
Cache partitioning is a well-known technique in multi-tasking systems for achieving more predictable cache performance by reducing resource interference. In a data processing system comprising of multiprocessors, the cache storage is shared between multiple processes or application tasks. The cache storage is partitioned into different sections for different application tasks. In a multiprocessing system with large number of application tasks, cache partitioning may result in small sized partitions per application tasks, as the total cache size is limited. This will cause performance deterioration, as the application tasks will not be able to accommodate its working set in the allotted cache partition which causes more cache misses. It can be advantageous to partition the cache into sections, where each section is allocated to a respective class of processes, rather than the processes sharing entire cache storage.
US Patent application 2002/0002657A1 by Henk Muller et al discloses a method of operating a cache memory in a system, in which a processor is capable of executing a plurality of processes. Such techniques partition the cache into many small partitions instead of using one monolithic data-cache in which, accesses to different data objects may interfere.
In such cases, typically the compiler is aware of the cache architecture, and allocates the cache partitions to the application tasks.
Future multiprocessor systems are going to be very complex and will contain a large number of application tasks. Hence the cache partitioning will result in small sized partitions per tasks, which will deteriorate the performance.
SUMMARY OF THE INVENTION
It is, inter alia, an object of the invention to provide system and method for improved dynamic cache partitioning in multiprocessors. The invention is defined by the independent claims. Advantageous embodiments are defined in the dependent claims. The invention is based on the recognition that the prior art techniques do not exploit the pattern of execution of the application tasks. For instance, the execution behavior of multimedia applications often follows a periodic pattern of execution. I.e. the multimedia applications include application tasks that get scheduled periodically and follow a pattern of execution. By exploiting this behavior it is possible to have more efficient cache partitioning. A cache partitioning technique for application tasks based on the scheduling information in multiprocessors is provided. Cache partitioning is performed dynamically based on the information of the pattern of task scheduling provided by the task scheduler. Execution behavior of the application tasks is obtained from the task scheduler and partitions are allocated to only a subset of application tasks, which are going to be executed in the upcoming clock cycles. The present invention will improve the cache utilization by avoiding unnecessary reservation of the cache partitions for the executing application tasks during the entire duration of their execution and hence an effective utilization of the cache is achieved.
In an example embodiment of the present invention, a method for dynamically partitioning a cache memory in a multiprocessor for a set of application tasks is provided. The method includes the steps of storing a scheduling pattern of the set of application tasks in a storage, selecting a subset of application tasks from the set of application tasks and updating a cache controller logic with the subset of application tasks, where the subset of application tasks comprise a set of application tasks which are executed in the upcoming clock cycles, and allocating cache partitions dynamically to the subset of application tasks updated in the cache controller logic. A task scheduler stores the scheduling pattern of the application tasks in a look up table (LUT). The selected subset of application tasks are updated in a partition control register and a dynamic partition unit allocates cache partitions dynamically to the subset of application tasks stored in the partition control register.
In another example embodiment of the present invention, a system is provided for dynamically partitioning a cache memory in a multiprocessor for a set of application tasks. The system includes a task scheduler for storing a scheduling pattern of the set of application tasks, and cache controller logic for selecting and updating a subset of application tasks from the set of application tasks, and for allocating cache partitions dynamically to the subset of application tasks updated in the cache controller logic. The cache controller logic includes a
partition control register for updating the subset of application tasks, and a dynamic partition unit for allocating cache partitions dynamically to the subset of application tasks.
The above summary of the present invention is not intended to represent each disclosed embodiment, or every aspect, of the present invention. Other aspects and example embodiments are provided in the figures and the detailed description that follows.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flow diagram illustrating an embodiment of a method for dynamically partitioning a cache memory in a multiprocessor according to an embodiment of the present invention.
FIG. 2 illustrates an embodiment of a basic architecture of a data processing system.
FIG. 3 shows a table of an example of the task scheduling pattern of the application tasks. FIG. 4 is a block diagram illustrating a dynamic partitioning system embodiment of the present invention.
DESCRIPTION OF EMBODIMENTS
The present invention proposes a cache partitioning technique based on the patterns of execution of the application tasks. The partitions are allocated to only a subset of application tasks which are going to be executed in the upcoming clock cycles. Since, considering only a subset of application tasks for cache portioning, large sized partitions can be allotted for the application tasks.
FIG. 1 is a flow diagram illustrating the method for dynamically partitioning a cache memory in a multiprocessor according to an embodiment 100 of the present invention. At step 105, the scheduling pattern of the set of the applications tasks is stored in a look-up table (LUT). The application tasks may be defined as elementary operations (like hardware operations or computer instructions) or can be an ensemble of elementary operations (like software programs). The application tasks follow a periodic pattern of execution. This scheduling pattern is stored in the LUT. At 110, a subset of application tasks is selected from the set of application tasks. The subset of application tasks includes the application tasks which are going to be executed in the upcoming clock cycles.
At step 115, a partition control register is updated with the subset of application tasks selected in step 110. A partition control register may be implemented as a memory
mapped input output (MMIO) register. A task scheduler updates the partition control register with the selected subset of application tasks. At step 120, a dynamic partitioning unit allocates cache partitions dynamically to the subset of application tasks updated in the partition control register.
FIG. 2 illustrates an embodiment of a basic architecture of a data processing system 200. The data processing system 200 includes a processing unit 210, a cache memory 220, cache controller logic 215, a data bus 230, a main memory 225 and a task scheduler 205. The scheduler can be implemented either in software or hardware. If it is a software one, it could be running on the processing unit 210. In the case of a hardware implementation, the scheduler is a separate unit as a task scheduler 205 which is coupled to the processing unit 210. The main memory 225 is coupled to the data bus 230. The cache memory 220 and the cache controller logic 215 are as well coupled to the data bus 230. The processing unit 210 is coupled to the cache memory 220. The task scheduler 205 is coupled to the data bus 230 as well as to the processing unit 210 and the cache controller logic 215.
Such a data processing system 200 may be implemented as a system-on-chip (SoC). The data processing system 200 explained above is particularly applicable to multitasking streaming applications, for example in audio and video applications.
FIG. 3 shows a table of an example of the task scheduling pattern of the application tasks 300. A particular repetition pattern of execution of application tasks is termed as a scheduling pattern. This scenario is common in multimedia/streaming applications where the constituent application tasks follow a repetitive pattern of scheduling (and hence execution). Consider an example periodic execution pattern of tasks Tl, T2, and T3. The top row indicates the task IDs TID and the bottom row indicates the schedule instance SI (1-14). From the figure it can be seen that the scheduling follows a recurring pattern (Tl, T2, Tl, T2, Tl, T2, T3), which can be tracked by the task scheduler at run time. Task scheduler stores this scheduling information dynamically using the LUT. The proposed cache partition policy considers only a subset of the tasks for allocating the partitions. In this case the suitable subset is (Tl, T2) as the task T3 occurs only in the scheduling instance 7 and hence the partition for task T3 can be allocated at a later time (i.e. by schedule instance 7). As the entire cache partition is allocated for Tl and T2 for most of the execution time (schedule instance 1-6), a more efficient cache partitioning is achieved. At schedule instance 7, a subset of cache partition occupied by either Tl or T2 (according to
some cache replacement policy like least recently used (LRU) can be evicted to accommodate the partition for T3.
FIG. 4 is a block diagram illustrating a dynamic partitioning system embodiment of the present invention 400. The figure shows a part of the basic data processing system as in FIG. 2. The figure serves to explain the relationship between the task scheduler 205 (405 in FIG. 4) and the cache controller logic 215 (415 in FIG. 4) for performing the dynamic cache partitioning according to the present invention. The system includes a task scheduler 205, a look up table (LUT) 410, cache controller logic 215 (415 in FIG.4), a dynamic partitioning unit 420, and a partition control register 425.
The task scheduler 205, stores the task schedule pattern in the form of LUT 410. The cache controller logic 215 includes a partition control register 425 and a dynamic partition unit 420. The partition control register 425 is updated by the task scheduler 205. The partition control register 425 contains information regarding which are going to be executed in the upcoming clock cycles. This information includes the task IDs of the application tasks. I.e. according to the example in FIG. 2, at an arbitrary schedule instance 1, the partition control register 425 includes only task IDs corresponding to Tl and T2. Again at say schedule instance 6, the partition control register 425 includes application tasks Tl, T2 and T3, which imply that a partition has to be allocated to T3 also. Dynamic partition unit 420 reads the information from the partition control register 425 and allocates partitions to only application tasks which have IDs registered in the partition control register 425. In this way only a subset of application tasks are selected for allocating cache partitions and hence effectively utilizing the available cache space across the application tasks.
The present invention will find its industrial applications in system on chip (SoC) for audio, video and mobile applications. The present invention will improve the cache utilization by avoiding unnecessary reservation of the cache partitions for the executing application tasks during the entire duration of their execution. So an effective utilization of the cache storage is achieved.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The
word "comprising" does not exclude the presence of elements or steps other than those listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and/or by means of a suitably programmed processor. In the device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Next Patent: AN OPTICAL DRIVE WITH IMPROVED LASER POWER CONTROL (LPC)
