

Title:
METHOD AND SYSTEM FOR DYNAMIC MANAGEMENT OF WRITE-MISS BUFFER
Document Type and Number:
WIPO Patent Application WO/2014/209956
Kind Code:
A1
Abstract:
In described examples, output traffic is controlled from a buffer that stores write-miss entries (61) associated with one level of a cache for subsequent forwarding to another level of the cache (69). A determination is made about whether a predetermined condition is satisfied (66, 67, 68). An oldest entry is output from the buffer only in response to a determination that the predetermined condition is satisfied (69). Posting of a new entry to the buffer (61) is insufficient to satisfy the predetermined condition (66, 67, 68).

Inventors:
BHORIA NAVEEN (US)
ZBICIAK JOSEPH RAYMOND MICHAEL (US)
DAMODARAN RAGURAM (US)
CHACHAD ABHIJEET ASHOK (US)
Application Number:
PCT/US2014/043797
Publication Date:
December 31, 2014
Filing Date:
June 24, 2014
Assignee:
TEXAS INSTRUMENTS INC (US)
TEXAS INSTRUMENTS JAPAN (JP)
International Classes:
G06F5/12
Foreign References:
US20090147796A1 2009-06-11
US20020026562A1 2002-02-28
US4742446A 1988-05-03
Attorney, Agent or Firm:
DAVIS, Michael, A., Jr. et al. (International Patent Manager, P.O. Box 655474, Mail Station 399, Dallas TX, US)
Claims:
CLAIMS

What is claimed is:

1. A method of controlling output traffic from a buffer that stores write-miss entries associated with one level of a cache for subsequent forwarding to another level of the cache, comprising: determining whether a predetermined condition is satisfied; and

outputting an oldest entry from the buffer only in response to a determination that the predetermined condition is satisfied;

wherein posting of a new entry to the buffer is insufficient to satisfy the predetermined condition.

2. The method of claim 1, wherein the predetermined condition is satisfied if the oldest entry has been stored in the buffer for a predetermined number of write cycles.

3. The method of claim 2, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

4. The method of claim 2, wherein the predetermined condition is satisfied if an entry in the buffer requires expedited forwarding to the another level of the cache.

5. The method of claim 4, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

6. The method of claim 2, wherein the predetermined number of write cycles is dynamically adjustable based on a number of unused locations available in the buffer.

7. The method of claim 1, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

8. The method of claim 1, wherein the predetermined condition is satisfied if a predetermined number of write cycles have occurred since an entry was last output from the buffer.

9. The method of claim 1, wherein the predetermined condition is satisfied if an entry in the buffer requires expedited forwarding to the another level of the cache.

10. The method of claim 9, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

11. A cache controller apparatus, comprising:

a buffer configured to store write-miss entries associated with one level of a cache for subsequent forwarding to another level of the cache; and a buffer controller coupled to the buffer and configured to determine whether a predetermined condition is satisfied, and to output an oldest entry from the buffer only in response to a determination that the predetermined condition is satisfied;

wherein posting of a new entry to the buffer is insufficient to satisfy the predetermined condition.

12. The apparatus of claim 11, wherein the predetermined condition is satisfied if the oldest entry has been stored in the buffer for a predetermined number of write cycles.

13. The apparatus of claim 12, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

14. The apparatus of claim 12, wherein the predetermined condition is satisfied if an entry in the buffer requires expedited forwarding to the another level of the cache.

15. The apparatus of claim 14, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

16. The apparatus of claim 11, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

17. The apparatus of claim 11, wherein the predetermined condition is satisfied if an entry in the buffer requires expedited forwarding to the another level of the cache.

18. The apparatus of claim 17, wherein the predetermined condition is satisfied if the buffer contains a predetermined threshold number of entries.

19. A data processing system, comprising:

a data processing resource; and

a multilevel cache architecture coupled to the data processing resource, and including a cache controller that includes a buffer configured to store write-miss entries associated with one level of a cache for subsequent forwarding to another level of the cache;

wherein the cache controller includes a buffer controller coupled to the buffer and configured to determine whether a predetermined condition is satisfied, and to output an oldest entry from the buffer only in response to a determination that the predetermined condition is satisfied; and

wherein posting of a new entry to the buffer is insufficient to satisfy the predetermined condition.

20. Apparatus for controlling output traffic from a buffer that stores write-miss entries associated with one level of a cache for subsequent forwarding to another level of the cache, the apparatus comprising:

means for determining whether a predetermined condition is satisfied; and

means for outputting an oldest entry from the buffer only in response to a determination that the predetermined condition is satisfied;

wherein posting of a new entry to the buffer is insufficient to satisfy the predetermined condition.

Description:
METHOD AND SYSTEM FOR DYNAMIC MANAGEMENT OF WRITE-MISS BUFFER

[0001] This relates in general to multilevel cache control, and in particular to a method and system for dynamic management of a write-miss buffer.

BACKGROUND

[0002] In a multilevel cache hierarchy, a write command for which a cache miss occurs (a "write-miss") is stored in a first-in first-out ("FIFO") write-miss buffer, so that a central processing unit ("CPU") does not stall, and so that write-miss data can be forwarded to the next level of cache when possible. Conventional approaches to write-miss buffer management merge newly received write misses with write-miss entries already in the buffer, if the appropriate merge conditions occur (e.g., address and permission matches). The write-miss buffer is typically drained fast enough (e.g., at the same rate at which new write-misses are being posted by the CPU) to prevent a full write-miss buffer that would cause the CPU to stall. Entries that are output from the write-miss buffer are forwarded to the next level of the cache hierarchy.
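As a non-limiting illustration, the conventional merge-on-post behavior described above may be sketched as follows. The names WriteMiss, line_addr, perms, data and post are hypothetical, and the sketch assumes that a merge condition is an exact match of address and permission bits and that newer bytes overwrite older bytes; the description does not specify these details.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WriteMiss:
    line_addr: int                             # hypothetical: address used for matching
    perms: int                                 # hypothetical: permission bits
    data: dict = field(default_factory=dict)   # byte offset -> byte value


def post(buffer: deque, new: WriteMiss) -> bool:
    """Merge `new` into a matching entry if one exists, else append it.

    Returns True when a merge occurred (no new buffer location is used).
    """
    for entry in buffer:
        # Assumed merge condition: same address and same permissions.
        if entry.line_addr == new.line_addr and entry.perms == new.perms:
            entry.data.update(new.data)        # assumed: newer bytes win
            return True
    buffer.append(new)                         # FIFO: newest entry at the tail
    return False


buf = deque()
post(buf, WriteMiss(0x100, 0b11, {0: 0xAA}))
post(buf, WriteMiss(0x140, 0b11, {4: 0xBB}))
merged = post(buf, WriteMiss(0x100, 0b11, {1: 0xCC}))
```

In this sketch the third write-miss merges with the first entry, so the buffer still holds two entries and only two transfers to the next cache level would eventually be needed.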

[0003] In some situations, it is desirable to reduce the traffic from the write-miss buffer to the next level of cache. As one example, such traffic reduction becomes more important when write-through mode is enabled in the cache controller. Although such conventional approaches adequately avoid complete filling of the write-miss buffer, they do not address reducing traffic from the write-miss buffer to the next cache level.

SUMMARY

[0004] In described examples, output traffic is controlled from a buffer that stores write-miss entries associated with one level of a cache for subsequent forwarding to another level of the cache. A determination is made about whether a predetermined condition is satisfied. An oldest entry is output from the buffer only in response to a determination that the predetermined condition is satisfied. Posting of a new entry to the buffer is insufficient to satisfy the predetermined condition.

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] FIGS. 1-3 are diagrams of a write-miss buffer, according to example embodiments.

[0006] FIG. 4 is a block diagram of a data processing system, according to example embodiments.

[0007] FIG. 5 is a block diagram of a cache controller of FIG. 4.

[0008] FIG. 6 is a flowchart of operations, according to example embodiments.

[0009] FIG. 7 is a flowchart of operations, according to further example embodiments.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

[0010] Traffic from the write-miss buffer to the next cache level may be reduced by measures that produce as many write-miss merges as possible. Example embodiments provide a dynamic scheme for write-miss buffer management, so that a posted write-miss command is retained in the write-miss buffer as long as possible without increasing CPU stall cycles. This increases the likelihood that entries in the write-miss buffer will be merged with future write-misses, thereby reducing traffic from the write-miss buffer to the next cache level without adversely increasing the incidence of CPU stalls.

The following three (3) parameters are available to control the drain rate of the write-miss buffer dynamically: (a) the number of write cycles that a particular write-miss has spent in the buffer; (b) the number of free entries (unused locations) available in the buffer; and (c) attributes of the write-miss (such as not merge-able with another write-miss, and/or time-sensitive). In some embodiments, the write-miss buffer only outputs a write-miss entry if one of the following conditions occurs: the oldest entry has been in the buffer for a number of write cycles that equals (or exceeds) a value MAX, where MAX is a number of cycles less than the maximum latency tolerable by the system; or the number of entries in the buffer reaches (or exceeds) a threshold value; or the buffer contains an entry having an attribute that indicates expedited handling is required for the entry, such as an entry having a strict latency requirement (e.g., a time or delay sensitive entry, or an entry that is not a cacheable write command, or an entry that is a special type of write command that must be committed to memory as soon as possible, such as a coherence write-flush command). Another example where an expedited handling attribute may be used is when a pending read miss needs to go out in order.
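As a non-limiting illustration, the three output conditions listed above may be combined into a single predicate. The names Entry, cycles_in_buffer, expedited and should_output_oldest are hypothetical and are used only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    cycles_in_buffer: int = 0   # write cycles spent in the buffer
    expedited: bool = False     # attribute: expedited handling required


def should_output_oldest(entries, max_cycles, threshold):
    """Return True if the oldest entry (entries[0]) may be output."""
    if not entries:
        return False
    aged_out = entries[0].cycles_in_buffer >= max_cycles   # equals or exceeds MAX
    too_full = len(entries) >= threshold                    # reaches or exceeds threshold
    urgent = any(e.expedited for e in entries)              # any expedited entry present
    return aged_out or too_full or urgent


young = [Entry(1), Entry(0)]
aged = [Entry(6), Entry(2)]
tagged = [Entry(1), Entry(0, expedited=True)]
```

With MAX = 6 and a threshold of 4, the list named young does not satisfy the predicate, whereas aged (oldest entry has reached MAX) and tagged (an expedited entry is present) both do. Merely posting a new entry, without more, leaves the predicate false.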

[0011] In various embodiments, the value of MAX is specified by either the user or the application, and is tapered based on the number of free buffer entries available, such as MAX = [# of specified cycles] * [# of free entries available]. Accordingly, MAX may be dynamically adjusted to vary in proportion to the number of unused locations currently available in the buffer. In some embodiments, MAX is set to a default value, such as 1, or another value suitable for the application.
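As a non-limiting illustration, the tapering of MAX may be computed as shown below. The function name tapered_max and its parameter names are hypothetical; the eight-location buffer size follows the examples of FIGS. 1-3, and the value of four specified cycles is an arbitrary assumption.

```python
def tapered_max(specified_cycles, buffer_size, entries_used):
    """MAX = [# of specified cycles] * [# of free entries available]."""
    free_entries = buffer_size - entries_used
    return specified_cycles * free_entries


# Eight-location buffer, four specified cycles (assumed value).
max_when_3_used = tapered_max(4, 8, 3)   # five free locations
max_when_7_used = tapered_max(4, 8, 7)   # one free location
max_when_full = tapered_max(4, 8, 8)     # no free locations
```

Under these assumptions MAX is 20 with three entries present, 4 with seven entries present, and 0 when the buffer is full, so that entries are retained longer when the buffer is mostly empty and are released sooner as it fills.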

[0012] In various embodiments, the threshold value is specified by either the user or the application. In some embodiments, the threshold value is set to a default value, such as one-half of the write-miss buffer size. Some embodiments dynamically adjust the threshold value based on the input stream pattern. For example, if hardware detects that newly received write-misses are being merged with existing buffer entries four locations ahead of them in the write-miss buffer, then the threshold value could be set higher than four.
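As a non-limiting illustration, one possible way of adjusting the threshold value from the observed merge pattern is sketched below. The description states only that the threshold could be set higher than the observed merge distance; the use of the largest recently observed distance plus one, the clamp to the buffer size, and the names adapt_threshold and merge_distances are all assumptions of this sketch.

```python
def adapt_threshold(merge_distances, buffer_size):
    """Pick a threshold just above the largest observed merge distance.

    merge_distances: recent observations of how many locations ahead of a
    newly received write-miss the entry it merged with was found.
    """
    default = buffer_size // 2                 # default: one-half of buffer size
    if not merge_distances:
        return default                         # no merges observed yet
    # Assumed policy: one more than the largest distance, never above the size.
    return min(buffer_size, max(merge_distances) + 1)


no_history = adapt_threshold([], 8)
four_ahead = adapt_threshold([4, 4, 3], 8)
```

For an eight-location buffer, this sketch returns the default of 4 when no merges have been observed and 5 when merges are observed up to four locations ahead.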

[0013] FIGS. 1-3 conceptually illustrate examples of dynamic write-miss buffer control. Each example shows a FIFO write-miss buffer with eight locations, designated 0-7. FIG. 1 shows a buffer 11 with write-miss entries in locations 0-2, and FIG. 2 shows the buffer 11 with entries in locations 0-4. FIG. 3 shows a buffer 31 with write-miss entries in locations 0-4. In FIGS. 1-3, the buffer entries are designated as "write-data0", "write-data1", etc., and are shown oldest-to-newest from bottom-to-top. In the column designated Col(a) in FIGS. 1-3, cycle counts (cnt0, cnt1, etc.) are associated with the respective write-miss entries. The cycle count is initialized to 0 when the corresponding write-miss is posted to the buffer, and increments at each write cycle. In the column designated Col(b) in FIG. 3, bits indicate attributes associated with the respective write-miss entries. In the FIG. 3 example, a value of 1 in Col(b) tags the corresponding entry to indicate that it requires expedited handling.

[0014] Referring to FIG. 1, the entry write-data0 is not output until the corresponding count, cnt0, reaches MAX. Referring to FIG. 2, the entry write-data0 is output in response to the posting of the entry write-data4, regardless of whether cnt0 has reached MAX, because the number of entries present in the buffer is five, which exceeds the threshold value (designated at TH in FIGS. 1-3) of four in FIG. 2.

[0015] Referring to FIG. 3, the entries will begin to drain from the buffer in response to the posting of the entry write-data4, regardless of whether cnt0 has reached MAX, and regardless of the fact that the number of entries present in buffer 31 is less than the threshold value TH, because write-data4 requires expedited handling (its Col(b) value is 1). When draining the buffer 31, write-data0 through write-data3 are successively output before write-data4 is output. Some embodiments respond instead to the posting of write-data4, by permitting the write-data4 entry to be advanced and become the next output from the buffer 31. Such operation may include extra checking to ensure maintenance of data order and integrity.

[0016] FIG. 4 shows a data processing system in which this type of dynamic write-miss buffer control may be implemented, according to example embodiments. A data processing resource 41 is coupled to a memory storage resource 43. In various embodiments, the memory storage resource 43 may be wholly separate from data processing resource 41, or partially or fully integrated with data processing resource 41. The memory storage resource 43 includes multilevel cache architecture 45 and other storage 49. A cache controller 47 controls operation of the multilevel cache 45.

[0017] FIG. 5 shows the cache controller 47 of FIG. 4 in more detail. In this example, buffer 31 of FIG. 3, for a particular level of the cache architecture 45, receives write-misses and forwards them to the next cache level under control of a buffer controller 51 that is coupled to the buffer 31 and receives the write-misses. The buffer controller 51 also receives (from a channel 53) an indication of each write cycle performed by the data processing resource 41 of FIG. 4. In some embodiments, the buffer controller 51 is capable of this type of dynamic write-miss buffer control.

[0018] FIG. 6 shows operations that may be performed to implement this type of dynamic write-miss buffer control. In some embodiments, the operations of FIG. 6 may be performed by the cache controller 47 of FIGS. 4 and 5. In FIG. 6, each write cycle (e.g., of the data processing resource 41 of FIG. 4) is detected at operation 60. When a write cycle is detected at operation 60, it is then determined at operation 61 whether a new write-miss entry is posted to the write-miss buffer. If not, the technique proceeds to operation 65. If a new entry is posted at operation 61, it is then determined at operation 62 whether the entry has an attribute of a special case (special handling). If not, the count value (such as cnt0 or cnt1 of FIG. 3) for the entry is initialized to 0 at operation 63, after which the technique proceeds to operation 65. Otherwise, the entry is tagged as a special handling case (such as the "1" bit at Col(b) in FIG. 3) at operation 64, and the technique proceeds to operation 65.

[0019] At operation 65, the previously existing count values of the previously posted entries are incremented. Thereafter, it is determined at operation 66 whether the buffer contains a special handling case entry. If so, the oldest entry is output at operation 69 for forwarding to the next cache level, after which the next write cycle is awaited at operation 60. Otherwise, it is determined at operation 67 whether the number of entries in the buffer has reached the threshold value. If so, the technique proceeds to operation 69, where the oldest entry is output. Otherwise, it is determined at operation 68 whether the number of write cycles that the oldest entry has spent in the buffer has reached MAX. If so, the technique proceeds to operation 69, where the oldest entry is output. Otherwise, the next write cycle is awaited at operation 60.
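As a non-limiting illustration, operations 60-69 of FIG. 6 may be modeled in software as one function call per write cycle. The names Entry and write_cycle are hypothetical, merging of write-misses is not modeled, the caller is assumed to set the special-handling tag (operation 64) before posting, and the comparisons use "reached" (greater than or equal) as recited for operations 67 and 68.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    special: bool = False   # Col(b) tag, set by the caller (operation 64)
    count: int = 0          # Col(a) cycle count


def write_cycle(buf, new_entry, threshold, max_cycles):
    """Process one detected write cycle; return the output entry or None."""
    existing = list(buf)                      # entries posted in earlier cycles
    if new_entry is not None:                 # operation 61: new entry posted?
        if not new_entry.special:             # operation 62: special case?
            new_entry.count = 0               # operation 63: initialize count
        buf.append(new_entry)
    for e in existing:                        # operation 65: increment the
        e.count += 1                          # previously existing counts
    if not buf:
        return None
    if (any(e.special for e in buf)           # operation 66
            or len(buf) >= threshold          # operation 67
            or buf[0].count >= max_cycles):   # operation 68
        return buf.popleft()                  # operation 69: output oldest
    return None                               # await next cycle (operation 60)


buf = deque()
posts = [Entry("a"), Entry("b"), Entry("c"), None, None, None, None]
outputs = [write_cycle(buf, p, threshold=3, max_cycles=4) for p in posts]
names = [e.name if e else None for e in outputs]
```

In this run, with an assumed threshold of 3 and MAX of 4, entry "a" is output on the third cycle because the threshold is reached, and entries "b" and "c" are output on the sixth and seventh cycles because each has then spent four write cycles in the buffer. No entry is output merely because a new entry was posted.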

[0020] Some embodiments permit a special handling case entry to be advanced and become the next output from the buffer. An example of such embodiments is shown by dashed line at operation 601 in FIG. 6 where, after a special handling case entry is tagged at operation 64, it is put in front of the other buffer entries to be next output from the buffer.
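As a non-limiting illustration, the advancing of a special handling case entry at operation 601 may be sketched as an insertion at the head of the queue rather than at the tail. The name post_entry and the tuple representation of an entry are hypothetical, and the extra checking for data order and integrity mentioned in connection with FIG. 3 is not modeled.

```python
from collections import deque


def post_entry(buf, name, special, advance_special=True):
    """Post an entry; optionally advance a special-case entry to the front."""
    if special and advance_special:
        buf.appendleft((name, special))   # operation 601: next to be output
    else:
        buf.append((name, special))       # ordinary FIFO posting


buf = deque()
for n in ("write-data0", "write-data1", "write-data2", "write-data3"):
    post_entry(buf, n, special=False)
post_entry(buf, "write-data4", special=True)
next_out = buf.popleft()
```

Here the tagged entry write-data4 becomes the next output, ahead of write-data0 through write-data3, which remain in the buffer in their original order.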

[0021] Some embodiments reduce complexity by using only a single cycle count, instead of using a different cycle count for each write-miss buffer entry. This single cycle count indicates how many write cycles have elapsed since an entry was last output from the write-miss buffer. Whenever an entry is output from the buffer, the cycle count is reset to 0. If this cycle count reaches the value of MAX, then the oldest entry is output, and the cycle count is reset. An example is shown in FIG. 7.

[0022] The operations of FIG. 7 are described in conjunction with operations of FIG. 6. Single cycle count operations are largely the same as the multiple cycle count operations of FIG. 6, except for the following modifications. In single cycle count embodiments, operation 63 of FIG. 6 is omitted, and operation 65 of FIG. 6 is replaced by operation 65A of FIG. 7, where the single cycle count is incremented. As also shown in FIG. 7, operation 68 of FIG. 6 is replaced by operation 68A, where the single cycle count is compared to MAX. If the count has reached MAX at operation 68A, then the oldest entry is output at operation 69. Otherwise, the next cycle is awaited at operation 60. The cycle count is cleared (reset to 0) at operation 701, in conjunction with the outputting at operation 69.
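As a non-limiting illustration, the single cycle count variant of FIG. 7 may be modeled as shown below. The class name SingleCountBuffer and the tuple representation of an entry are hypothetical, and merging is not modeled. Behavior with an empty buffer is not specified in the description; this sketch assumes that nothing is output and that the count simply continues to run.

```python
from collections import deque


class SingleCountBuffer:
    def __init__(self, threshold, max_cycles):
        self.entries = deque()        # items are (name, special) tuples
        self.threshold = threshold
        self.max_cycles = max_cycles
        self.count = 0                # write cycles since the last output

    def write_cycle(self, new_entry=None):
        """Process one write cycle; return the output entry or None."""
        if new_entry is not None:     # operations 61, 62, 64 (63 is omitted)
            self.entries.append(new_entry)
        self.count += 1               # operation 65A: single count incremented
        if not self.entries:
            return None               # assumed: nothing to output when empty
        special = any(s for _, s in self.entries)       # operation 66
        full = len(self.entries) >= self.threshold      # operation 67
        aged = self.count >= self.max_cycles            # operation 68A
        if special or full or aged:
            self.count = 0            # operation 701: clear the count
            return self.entries.popleft()               # operation 69
        return None


wmb = SingleCountBuffer(threshold=8, max_cycles=3)
posts = [("a", False), ("b", False), None, None, None, None]
names = [(out[0] if out else None) for out in map(wmb.write_cycle, posts)]
```

With an assumed MAX of 3, entry "a" is output on the third cycle and entry "b" on the sixth, because the single count is cleared each time an entry is output. Only one counter is maintained regardless of the number of buffer locations.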

[0023] Modifications are possible in the described embodiments, and other embodiments are possible, within the scope of the claims.