DYNAMIC APPROXIMATE STORAGE FOR CUSTOM APPLICATIONS - MICROSOFT TECHNOLOGY LICENSING LLC

Title:

DYNAMIC APPROXIMATE STORAGE FOR CUSTOM APPLICATIONS

Document Type and Number:

WIPO Patent Application WO/2016/137717

Abstract:

A memory chip for dynamic approximate storage includes an array of memory cells associated with at least two regions. The chip further includes at least one threshold register for storing values for thresholds for memory cells corresponding to each of the at least two regions; and control logic to programmatically adjust the values for the thresholds for the memory cells. A method of controlling a storage device for dynamic approximate storage includes modifying at least one value stored in a threshold register and associated with at least one cell in a region of a memory comprising at least two regions to apply a biasing for the at least one cell, wherein the biasing adjusts ranges for values in a cell.

Inventors:

STRAUSS KARIN (US)
CEZE LUIS HENRIQUE (US)
MALVAR HENRIQUE S (US)
GUO QING (US)

Application Number:

PCT/US2016/016673

Publication Date:

September 01, 2016

Filing Date:

February 05, 2016

Assignee:

MICROSOFT TECHNOLOGY LICENSING LLC (US)

International Classes:

G11C7/10

Foreign References:

US20120155174A1	2012-06-21
US20130024605A1	2013-01-24
US8891303B1	2014-11-18
US20140089561A1	2014-03-27

Other References:

None

Attorney, Agent or Firm:

MINHAS, Sandip et al. (Attn: Patent Group Docketing One Microsoft Wa, Redmond Washington, US)

Claims:

CLAIMS

1. A memory chip for dynamic approximate storage, comprising:

an array of memory cells, the array comprising at least two regions;

at least one threshold register for storing values for thresholds for memory cells corresponding to each of the at least two regions; and

control logic to programmatically adjust the values for the thresholds for the memory cells.

2. The memory chip of claim 1, wherein the memory cells comprise at least one of a single level cell or a multi-level cell, wherein the at least two regions have a corresponding at least two types of available error correction overhead.

3. The memory chip of claim 1, wherein the values for thresholds represent resistance thresholds.

4. The memory chip of claim 3, wherein the resistance thresholds for the memory cells for at least one of the at least two regions indicate non-uniform in log scale resistance bands.

5. The memory chip of claim 1, wherein the values for thresholds represent voltage thresholds.

6. The memory chip of claim 1, further comprising a controller providing a variety of error correction codes, wherein the controller stores a region-to-configuration map to indicate an appropriate region of the at least two regions to which a request targets and an appropriate error correction strength from one of the variety of error correction codes to service the request.

7. The memory chip of claim 1, further comprising at least one bit pattern mapping register for storing a bit pattern table that maps a particular bit pattern to a particular level in a multilevel cell of at least one of the regions of the at least two regions, wherein the control logic further comprises control logic to programmatically assign the particular bit pattern being mapped to the particular level.

8. A method of controlling a storage device for dynamic approximate storage, comprising:

modifying at least one value stored in a threshold register and associated with at least one cell in a region of a memory comprising at least two regions to apply a biasing for the at least one cell, wherein the biasing adjusts ranges for values in a cell.

9. The method of claim 8, further comprising:

assigning a first level of error correction to one of the at least two regions and assigning a second level of error correction to a second of the at least two regions.

10. The method of claim 8, further comprising:

assigning a bit pattern to each of the ranges for the values in the cell.

11. The method of claim 8, wherein the biasing generates non-uniform in log scale ranges.

12. A mobile device comprising:

a processor;

a storage system comprising:

a memory chip comprising an array of memory cells, the array comprising at least two regions; at least one threshold register for storing values for thresholds for memory cells corresponding to each of the at least two regions; and control logic to programmatically adjust the values for the thresholds for the memory cells; and a custom application stored on the storage system and comprising instructions that when executed by the processor store data on the memory chip, wherein the data has bits identified with at least two types of error constraints.

13. The mobile device of claim 12, further comprising a controller providing a variety of error correction codes.

14. The mobile device of claim 13, wherein the controller stores a region-to-configuration map to indicate an appropriate region of the at least two regions to which a request to store the data targets and an appropriate error correction strength from one of the variety of error correction codes to service the request.

15. The mobile device of claim 12, wherein the thresholds for the memory cells for at least one of the at least two regions indicate non-uniform in log scale ranges.

Description:

DYNAMIC APPROXIMATE STORAGE FOR CUSTOM APPLICATIONS

BACKGROUND

[0001] Memory and storage often have various tradeoffs between precision (errors), endurance, performance, energy efficiency, and density (capacity). Single-level cell (SLC) memories, such as dynamic random access memory (DRAM) and some forms of Flash, store one bit of data in each cell. To provide higher density, multi-level cell (MLC) memory, such as available with Flash and phase-change memory (PCM), subdivides the range of values in a cell into a larger number of levels to store more than one bit of data. For example, Flash represents values in the threshold voltage of a memory cell and PCM represents values in the resistance of the memory cell. Accordingly, for certain multi-level storage, the larger the resistance range allowed by the cell, the higher the number of levels that can be used in the cell to store information, making the cell denser from a storage perspective. That is, the cell is able to store more information per unit of physical volume. However, with respect to the tradeoffs, there are limitations on how dense a cell can be made while still being cheap and reliable.

[0002] In addition, the denser the cell, the more precise the write and read machinery needs to be to preserve the same error rate. For example, for a fixed resistance range, using a higher number of levels requires more precise hardware to write and read these cells correctly every time. More precise hardware means higher costs; and, for the same hardware, storing a higher number of levels in a cell incurs a higher read and write error rate. Other resistance-changing processes such as drift in PCM also affect the read error rate.

BRIEF SUMMARY

[0003] Dynamic approximate storage and systems are described herein that enable applications and operating systems to take advantage of relaxing the error requirements of a region in memory of a storage device in exchange for increased capacity, endurance, performance, energy efficiency or other property of the storage device while still being able to maintain suitable output quality for the data.

[0004] A memory chip for approximate storage is described that includes at least two regions of memory with different error constraints. The memory chip can include at least one threshold register for storing values for thresholds used to identify a value (or values) for memory cells corresponding to each of the at least two regions; and control logic to programmatically adjust the values for the thresholds for the memory cells. The thresholds can be adjusted to create asymmetric ranges for values in a cell and even adjust the number of levels (and bits) a cell can store.

[0005] A method of controlling a storage device for approximate storage includes modifying at least one value stored in a threshold register and associated with at least one cell in a region of a memory comprising at least two regions to apply a biasing for the at least one cell, wherein the biasing adjusts ranges for values in a cell. In further implementations, the error rate of a cell or a region of memory can be modified so that the different regions of memory have different levels of error correction.

[0006] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] Figure 1A illustrates operation of storage precision and data encoding.

[0008] Figure 1B illustrates a representation of three types of storage regions for the memory 100 shown in Figure 1A.

[0009] Figures 2A-2C illustrate an analog (resistance) range (x axis) for a four level cell with differently assigned thresholds to provide storage of one bit (Figure 2A) and two bits (Figures 2B and 2C).

[0010] Figure 3 illustrates a memory system for programmable storage precision.

[0011] Figures 4A-4C illustrate various operating environments and corresponding storage locations.

[0012] Figure 5A shows Table 1 of uniform (u) and biased (b) 4-level cell parameters.

[0013] Figure 5B shows Table 2 of uniform (u) and biased (b) 8-level cell parameters.

[0014] Figure 6 shows a plot of combined raw bit error rates (RBERs) over time of uniform and biased PCM cells.

[0015] Figure 7 shows a plot of density with increasing scrubbing intervals comparing uniform and biased PCM cells.

[0016] Figure 8 shows a plot comparing capabilities and overheads of error correction codes (ECC) at 512 data bit blocks.

[0017] Figure 9 shows a graph illustrating the trade-off between ECC storage overhead and scrubbing interval.

[0018] Figure 10 shows a plot comparing capabilities and overheads of ECC at 512 byte block data for Flash.

DETAILED DESCRIPTION

[0019] Dynamic approximate storage and systems are described herein that enable applications and operating systems to take advantage of relaxing the error requirements of a region in memory of a storage device in exchange for increased capacity, endurance, performance, energy efficiency or other property of the storage device while still being able to maintain suitable output quality for the data.

[0020] "Approximate storage" refers to a memory optimization technique where it is possible to indicate that certain objects may be stored in memory having a higher likelihood of errors. These regions of memory having a higher likelihood of errors are not necessarily regions having a high error rate, only that the tolerance or requirement for a particular error rate is relaxed and occasional errors may happen.

[0021] "Dynamic approximate storage" involves the adjustable optimization of relaxed error constraints for regions of a storage device. The dynamic approximate storage can involve single-level cell (SLC) and/or variable multi-level cell (MLC) storage, and may further involve error correction techniques. A relaxed error constraint for a memory region may also contribute to additional capacity for storing the data, particularly for applications that do not require, at least for some data, the precision that conventional memory provides.

[0022] Dynamic approximate storage is applicable for custom applications that can take advantage of memory with relaxed error constraints. These custom applications can include approximation-aware algorithms and/or operating systems. Relaxing the error requirements of a region in memory of a storage device can enable increased capacity, endurance, performance, energy efficiency or other property of the storage device.

[0023] The described dynamic approximate storage for custom applications contains distinct regions having different error constraints. The probability of error for a memory region is referred to herein as "precision," where lower probabilities of error represent a higher precision. That is, a higher precision has a lower error rate. Careful matching of cell error properties with types of bits being stored by an application can benefit output quality of that data while getting most of the density benefit that approximate storage offers.

[0024] The terms "memory" and "storage" are used interchangeably herein and any specific meaning being applied to the term can be understood by its context.

[0025] Figure 1A illustrates operation of storage precision and data encoding; and Figure 1B illustrates a representation of three types of storage regions for the memory 100 shown in Figure 1A. Referring to Figure 1A, memory 100 can be assigned multiple regions with different allowed levels of precision (or error rates). For example, a first type region 101 may be the most precise, with the lowest likelihood of errors (e.g., 10^-12 errors); a second type region 102 may be less precise, with a relaxed error rate; and a third type region 103 may be even less precise with a more relaxed error rate. These regions can be hardwired or dynamic (programmed). The difference in error rates may be due to types of error correction applied and/or the density permitted.

[0026] Memory 100 may be any suitable memory storage device including SLC memories and MLC memories. For a MLC memory (or a hybrid memory containing SLC and MLC cells), the first type region 101 may be configured as a single level cell region for storing a single bit (even though in the case of the MLC memory it is capable of multiple levels); the second type region 102 may be configured as a denser region, for example, with three or four-level cells; and the third type region 103 may be configured more densely than the second type region 102, for example, with eight-level (e.g., for 3 bits) or denser cells. For various SLC and MLC implementations, the regions may be assigned different types of error correction (e.g., different error codes and/or number of error correction bits). An example of these types of storage regions is shown in Figure 1B.

[0027] In Figure 1B, three types of storage are shown, representing regions 101, 102, and 103 of Figure 1A. A first type region 121 with lowest error rates can be optimized for high importance bits 111 to include memory cells 122 and error correction cells 123 providing large error correction overhead. A second type region 131 can be optimized for medium importance bits 112 to include cells 132 that are denser than memory cells 122 with moderate error rates and error correction cells 133 providing some error correction overhead. The third region 141 then includes the cells 142 with the highest allowed error rates (optionally optimized for the lower importance bits 113), for example, being the densest memory and having minimal, if any, error correction.

[0028] The inclusion of regions of relaxed error constraints can be identified to programs storing data to the memory 100. Custom applications can then take advantage of the identified precision (as approximation-aware algorithms) and can assign particular data to appropriate regions (e.g., one of regions 101, 102, and 103).

[0029] For example, a custom application may generate encoded data 110 that includes bits that can be identified as being most important to output quality (high importance bits 111), bits that can be identified as being important to output quality (medium importance bits 112), and bits that can be identified as being less important to output quality (low importance bits 113). Here, the high importance bits 111 require the highest precision to ensure output quality. Therefore, the high importance bits 111 are stored in the first type region 101. An example of high importance bits 111 is the header of a block of data. The medium importance bits 112 may be stored in the first type region 101 or it may be sufficient to store the medium importance bits 112 in the second type region 102. The low importance bits 113 can then be stored in the most relaxed error rate region, for example, the third type region 103.

[0030] Although three types of regions are shown, in some cases two types of regions may be used; and in some other cases more than three types may be used. The number of types of regions may depend on the particular applications using the memory. In some cases, the number of types is programmable and may be adjusted after manufacturing and even reprogrammed.

[0031] The memory cell for each of the regions may be the same type of cell; in such a case, the different regions are obtained by how the values of the cells are interpreted. That is, the reading and writing processes for the multi-level cells can control what type of memory region a cell belongs to. For example, a four-level multi-level cell can be used to store 1 bit or 2 bits by adjusting the thresholds (and even assigning a particular range of resistances).

[0032] This is illustrated in Figures 2A-2C, which show an analog (resistance) range (x axis) for a four level cell with differently assigned thresholds to provide storage of one bit (Figure 2A) and two bits (Figures 2B and 2C). The x-axis range is a log scale.

[0033] Phase-change memory cells store information in the resistance of a chalcogenide material, which provides a wide enough range of resistances to allow multilevel cells. The resistance varies based on the amount of amorphous and crystalline material in the cell, which can be controlled by applying current pulses of different amplitude to the material. For SLCs, a single bit of information is stored in either a fully amorphous state (high resistance) or a mostly crystalline state (low resistance). Accordingly, referring to Figure 2A, the four level cell can be programmed as a two level cell (in a manner of a SLC) by assigning a target threshold for a '1' and a '0'.

[0034] The write circuitry for SLCs can be less precise than for MLCs because of the high margins between the high resistance and low resistance levels. Since each MLC level is narrower than the SLC levels, the write circuitry involves more precision and typically employs an iterative process of applying write pulses and subsequently reading the cell to verify the resistance is within the target level boundaries (e.g., 2T). MLCs contain circuitry to map their analog state into digital information. Each range of analog values (a level) maps to a certain binary value (or "bit pattern"). On write operations, the write circuitry iteratively applies pulses and verifies if the target level was reached. In Figures 2A-2C, the final write resistance is modeled as a normal distribution around the target resistance of that level.

[0035] An example mapping for the 4-level cell is illustrated in Figure 2B with levels J0, J1, J2, and J3 (in log scale) from lowest to highest resistance having '10' mapped to the lowest resistance band, '11' mapped to the second lowest resistance band, '01' mapped to the second highest resistance band, and '00' mapped to the highest resistance band. Of course, other mappings may be used. Indeed, in some embodiments, it is possible to optimize the mapping of certain bit patterns to particular resistance ranges. For example, data streams indicated as having a high number of '00' patterns can be stored in cells having '00' mapped to a particular part of a range, such as J3 (see Figures 2B and 2C), which may have fewer drift errors (as discussed below in more detail).

[0036] In some cases, the binary values can be assigned to ranges based on their frequency, where the most common value is assigned to the level that fails the least, the second most common to the level with the second-lowest failure rate, and so on. In some cases, the most common value is assigned to the level that fails the least, and the remaining values are assigned under a Gray code constraint, but in a manner that minimizes the aggregate error rate for that type of cell.

[0037] For example, if it is determined that '00' is the most common bit pattern, then '10', then '01', and then '11'; and if it is determined that the drift of a multi-level cell occurs as described above (where the highest and lowest levels do not suffer drift, but the second and third do with the third suffering the most), then '00' can be assigned to the fourth level. For the case where Gray code is then used, which is where only one bit changes from level to level, the choice for the bit pattern for the third level is '01' or '10'. Since '01' is less common than '10' and the third level has the worst drift, '01' is assigned to the third level. Next, for the second level, '11' must be selected according to Gray code. Finally, the first level is assigned '10'.
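The assignment described in this example can be reproduced with a small exhaustive search. The following is only a sketch under stated assumptions: the function name `assign_patterns`, the pattern frequencies, and the per-level error rates are hypothetical values chosen to match the ordering given in the text (they are not taken from the figures), and the level that receives the most common pattern is passed in explicitly because the first and fourth levels tie for the lowest drift.

```python
from itertools import permutations

def hamming(a, b):
    """Number of bit positions in which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def assign_patterns(freq, level_error, pinned_level):
    """Pin the most common pattern to `pinned_level`, then pick the
    Gray-coded ordering of all patterns with the lowest expected error.

    freq:        {bit pattern: relative frequency} (hypothetical values)
    level_error: error rate per level, lowest to highest resistance
    Returns a list of patterns indexed by level, or None if none is valid.
    """
    most_common = max(freq, key=freq.get)
    best, best_cost = None, float("inf")
    for perm in permutations(freq):
        if perm[pinned_level] != most_common:
            continue
        # Gray code constraint: adjacent levels differ in exactly one bit.
        if any(hamming(perm[i], perm[i + 1]) != 1 for i in range(len(perm) - 1)):
            continue
        cost = sum(freq[p] * level_error[i] for i, p in enumerate(perm))
        if cost < best_cost:
            best, best_cost = list(perm), cost
    return best

# Hypothetical numbers: '00' most common, then '10', '01', '11';
# only the second and third levels drift, the third one the most.
FREQ = {"00": 0.4, "10": 0.3, "01": 0.2, "11": 0.1}
LEVEL_ERROR = [0.0, 0.01, 0.05, 0.0]

mapping = assign_patterns(FREQ, LEVEL_ERROR, pinned_level=3)
print(mapping)  # ['10', '11', '01', '00'], first level to fourth level
```

With '00' pinned to the fourth level, only two Gray-coded orderings remain, and the search selects the one that places the less common '01' on the most drift-prone third level, matching the assignment worked through above.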

[0038] Typically, the partitioning of each resistance range is uniform and each level sits within a target level boundary of 2T (or 4T for the example in Figure 2A), where more than a B distance (B > T) from a peak of a level's distribution may result in a value indicative of a next level of the cell. A Gray code can be used to minimize the Hamming distance between adjacent levels.

[0039] An example of uniform partitioning for PCM is a lowest resistance band J0: 10^3 to 10^4 Ω, a second lowest resistance band J1: 10^4 to 10^5 Ω, a second highest resistance band J2: 10^5 to 10^6 Ω, and a highest resistance band J3: 10^6 to 10^7 Ω. The write process typically targets the middle of these uniformly partitioned bands (e.g., 10^3.5 Ω, 10^4.5 Ω, 10^5.5 Ω, and 10^6.5 Ω, respectively).

[0040] A PCM cell can suffer two types of errors - write error and drift error. A write error occurs when the write circuitry is unable to set the cell resistance at the target level before exceeding a maximum number of iterations, leaving the cell in an undesired resistance level. In PCM, material relaxation causes cell resistances to drift to higher resistance levels, resulting in the second type of errors, drift errors. Resistance drift is caused by structural relaxation of the material, which increases resistance over time. The higher the resistance, the stronger the drift. Drift unidirectionally increases the cell resistance and its effect is more significant in higher resistance levels than the lower ones.

[0041] The implication is that even if a cell is correctly written originally (within 2T of a resistance range), it may drift over time to a different value, resulting in soft errors (e.g., if a J0 value drifts beyond J0 + B). Since J3 is the highest level of the 4-level resistance ranges, drifting to a higher resistance while in J3 does not cause an error. Instead, the second highest level in a uniform cell (J2 in Figure 2B) becomes the most drift-prone level and often dominates the combined drift error rate of the cell. Similar effects are seen in MLCs with more than four levels.

[0042] According to certain implementations described herein, errors due to drift (and other susceptibilities) are exposed to applications in a controlled manner using approximate storage. In addition to approximate storage, in some cases, methods for mitigating drift, such as memory scrubbing, error correction, or merging multiple levels to create aggregate levels wide enough to make the level boundary-crossing probability negligible (e.g., Tri-Level Cells) may be included.

[0043] Since the size and position of the band in the cells' resistance range determines the number of errors arising from the write process and from drift, it is possible to minimize certain kinds of errors by changing the cell's resistance ranges along with how bits are mapped to cells.

[0044] Figure 2C illustrates an implementation using biasing to minimize drift errors. Instead of the uniform distribution of levels of Figure 2B, i.e., where all levels are of the same size (in log space), biased levels, in which the analog value ranges are tuned for minimizing the combined (write and drift) error rate, can be implemented. The biased levels present as non-uniform in log space ranges.

[0045] For example, in the mapping described above, making the second highest resistance band wider (e.g., 10^5 to 10^6.5 Ω) while still targeting 10^5.5 Ω during write operations will result in fewer drift errors in PCM.
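The effect of widening the second highest band can be illustrated with a threshold-based read. This is a minimal sketch, not the chip's read circuitry: the function name `read_cell` and the drifted resistance value are hypothetical, thresholds are expressed as log10 of the resistance in ohms, and the bit patterns follow the Figure 2B mapping described earlier.

```python
from bisect import bisect_right

# Bit patterns from lowest to highest resistance band (Figure 2B mapping).
PATTERNS = ["10", "11", "01", "00"]

# Band boundaries as log10(ohms). UNIFORM follows the uniform partitioning
# example; BIASED widens the second highest band to 10^6.5 ohms.
UNIFORM = [4.0, 5.0, 6.0]
BIASED = [4.0, 5.0, 6.5]

def read_cell(log10_resistance, thresholds):
    """Return (level index, bit pattern) for a sensed resistance."""
    level = bisect_right(thresholds, log10_resistance)
    return level, PATTERNS[level]

# A cell written at the 10^5.5 ohm target ('01') that has drifted upward.
# The drifted value 10^6.2 ohms is a hypothetical illustration.
drifted = 6.2
print(read_cell(drifted, UNIFORM))  # (3, '00'): drift error
print(read_cell(drifted, BIASED))   # (2, '01'): still read correctly
```

The same drifted cell crosses into the highest band under the uniform thresholds but stays inside the widened band under the biased thresholds.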

[0046] Biasing repositions and resizes each resistance level as shown in Figure 2C. The combined drift error rate can be minimized by equalizing the drift error rate of each individual level (assuming the stored data maps to each level uniformly). As shown in Figure 2C, levels are wider in value ranges where drift is more likely to have an effect, i.e., the higher resistance levels. Level biasing can be optimized based on a fixed elapsed time since the last write ("scrubbing intervals"). This assumes that the system will scrub the storage contents and reset resistance levels to the target resistance at this scrubbing interval. It is worth noting that cells will work at different scrubbing intervals, but they will suffer higher error rates compared to the interval they were optimized for because levels' error rates will not be completely equalized.

[0047] The biasing changes the target resistances from being at the center of each level (with equal bands BE) to forming a narrow band at the left (D) and a wider band at the right (Bi) to leave more room for drift. However, as the target resistance is moved to lower values and D is reduced, the write error rate begins to increase because the tail of the write resistance distribution gets closer to the lower end of that level. The sizing of D and Bi is therefore a trade-off between write error rate and drift error rate. This relationship and solution can be different for drift in other technologies. For example, some technologies may suffer drift to the lower values in the ranges. Other technologies may suffer drift to the middle values or a particular range of values in the overall range of values. For either of those types of technologies, the biasing can be conducted to form wider bands in the direction where drift may occur.
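The write-error side of this trade-off can be estimated from the normal write distribution mentioned earlier. The sketch below is a simplification under stated assumptions: the written value is treated as normally distributed in log10 resistance around the target, the standard deviation `SIGMA` is a hypothetical value, the iterative write-and-verify step is ignored, and only the lower edge of the level (at distance D below the target) is considered.

```python
import math

def lower_tail_write_error(d, sigma):
    """Probability that a written value lands below the level's lower edge.

    d:     distance from the write target down to the lower edge (log10 ohms)
    sigma: standard deviation of the write distribution (log10 ohms)
    Uses the normal tail: P(X < target - d) = 0.5 * erfc(d / (sigma * sqrt(2))).
    """
    return 0.5 * math.erfc(d / (sigma * math.sqrt(2)))

SIGMA = 0.15  # hypothetical write spread, not taken from the document

# Shrinking D moves the target toward the lower edge and raises the
# probability of a write error on that side.
for d in (0.5, 0.3, 0.1):
    print(f"D = {d:.1f}: lower-tail write error = {lower_tail_write_error(d, SIGMA):.3e}")
```

A full optimization would weigh this term against the drift error that the wider right-hand band is meant to absorb; that part depends on a drift model that is not specified here.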

[0048] Accordingly, certain implementations include tuning non-uniform band sizes (e.g., non-uniform ranges in log scale according to the exponents) and write operation targets (to a value possibly different from the middle of a band) to set cell error rates in configurations that may generate less quality degradation in stored images (or any encoded data in general).

[0049] In addition or as an alternative, certain implementations include changing how values are mapped to levels, for example, to place more common values (for a particular algorithm) in levels less likely to suffer errors (e.g., into the highest resistance band which has the fewest, if any, errors due to drift). In addition to enabling the storage of 2 bits, the four-level cell can have its particular values be assigned to each target threshold with 2T boundaries in an optimized manner. For example, as shown in Figures 2B and 2C, '10' may be mapped to the lowest resistance level while '11' is mapped to the next highest. However, in some other cases, the '11' may be mapped to the lowest resistance level with '10' at the next lowest. The mapping may be assigned based on pattern occurrence. For example, for applications with large numbers of '10' patterns, the '10' can be mapped to the highest resistance level. Another example use of reassigning values to levels is to minimize the number of bits that change in adjacent levels.

[0050] Figure 3 illustrates a memory system for programmable storage precision. The memory 300 can implement memory 100 (and any or all of the resistance thresholds illustrated in Figures 2A-2C) by including, for example, circuitry 301 for dynamically changing thresholds of multi-level cell storage such as Flash and PCM. Instead of hard coded thresholds that are tuned for generic behavior, the circuitry 301 for dynamically changing thresholds can implement variable thresholds. That is, the thresholds indicating levels for a memory cell can be optimized for a particular application (or data type). For example, asymmetric ranges can be established for different values in a cell. PCM uses resistance values; however, other memories can use other physical characteristics including voltage values.
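A software model can make the role of per-region threshold registers concrete. The following is a hedged sketch, not the circuitry 301 itself: the class `ApproximateMemoryChip`, its method names, and the specific threshold values are assumptions for illustration, with thresholds given as log10 of the resistance in ohms and region numbers borrowed from Figure 1A.

```python
from bisect import bisect_right

class ApproximateMemoryChip:
    """Toy model of threshold registers that control logic can reprogram."""

    def __init__(self):
        # region id -> ascending list of thresholds (log10 ohms)
        self.threshold_registers = {}

    def set_thresholds(self, region, thresholds):
        """Programmatically adjust the thresholds for one region."""
        thresholds = list(thresholds)
        if thresholds != sorted(thresholds):
            raise ValueError("thresholds must be in ascending order")
        self.threshold_registers[region] = thresholds

    def levels(self, region):
        """Number of levels per cell implied by the region's thresholds."""
        return len(self.threshold_registers[region]) + 1

    def read_level(self, region, log10_resistance):
        """Interpret a sensed value using the region's current thresholds."""
        return bisect_right(self.threshold_registers[region], log10_resistance)

chip = ApproximateMemoryChip()
chip.set_thresholds(101, [5.0])            # one threshold: two levels, 1 bit per cell
chip.set_thresholds(102, [4.0, 5.0, 6.5])  # three thresholds: four asymmetric levels

print(chip.levels(101), chip.levels(102))                  # 2 4
print(chip.read_level(101, 6.2), chip.read_level(102, 6.2))  # 1 2
```

In this model the same physical cell value is interpreted differently depending on the region's register contents, which is how one cell type can serve both a precise region and a denser region.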

[0051] In addition, in some cases, regions can be allocated with particular error codes. In some of such cases the circuitry 301 may be used to implement variable error correction. That is, the error rate of a cell can be modified using different levels of error correction depending on the error constraint for the region to which the cell belongs. The circuitry 301 may be on-chip or part of a memory controller and include registers, decoding logic, and, optionally, logic for external control of data in the registers. A memory controller can contain at least part of the control logic for controlling aspects of the memory; in addition, some control logic may be on the memory itself and other portions of the system.

[0052] In some cases, memory 300 is a memory card based on an industry standard, such as a Compact Flash (CF), MultiMediaCard (MMC), or Secure Digital (SD) nonvolatile memory card that is available for use in mobile phones, digital cameras, tablets, phablets, laptops and other computing devices. An SD picture memory (or other data type that is encoded and contains identifiable bits of different error tolerances) implementing the memory 300 includes the circuitry 301 to store programmable thresholds and different mappings based on data attributes (e.g., mappings based on error tolerance and/or bit patterns).

[0053] A 32 GB generic SD can function as a picture memory, for example, for 80 GB of data with the inclusion of the circuitry 301 containing registers to store level information (e.g., thresholds for the levels) to enable storage of more error tolerant data bits into denser cells. In some cases, the SD includes logic that allows external control of the values stored in the registers. In some cases, instead of external programmable control of the values stored in the registers, the SD may have hardwired levels/thresholds with different versions for the precisions, which are established at the time of manufacture (or other suitable step in the process).

[0054] With an appropriate memory 300, an operating system 310 accessing the memory 300 includes a means to utilize the memory 300. That is, an attribute for the level of precision for data is included so that the operating system 310 can indicate to the memory 300 the level of precision associated with certain bits and/or bytes of data. The indication can include a flag. In some cases, the operating system 310 can receive multiple images (or other data) and send the data identified with the same importance levels into the same type of memory cells by, for example, communicating with a memory controller for the memory 300 to indicate the level of precision for a set of bits or bytes. The example illustrated in Figures 1A and 1B shows three levels of precision (and a corresponding three types of regions); however, the granularity may be adjusted so that there are more grades of memory and more grades of the types/importance of bits.

[0055] The operating system 310 may include the functionality that identifies data type (and corresponding appropriate level of storage precision) for data being stored. In addition, or as an alternative, the operating system may expose via an application programming interface (API) 320 the different levels of storage precision so that applications 330 can more easily identify to the operating system 310 whether particular data can be stored in memory cells that have relaxed requirements.

[0056] The application 330 is created or modified to be able to assign the relative prioritization of encoded bits of an image (or some other encoded data) into different error susceptibility (and resulting quality-loss) categories. When communicating with the operating system 310 to store the data in the memory 300, the application 330 requests (or indicates) different levels of precision for its data. Whether already understood by the operating system 310 or via the API 320, the operating system 310 and/or memory controller (of memory 300) then maps the bits in the different error susceptibility categories to different cell categories, according to the cells' expected error rates.
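As a rough illustration of the mapping described in this paragraph, the following Python sketch groups tagged data by precision grade. The `Precision` grades, the category names, and the `plan_storage` helper are all hypothetical; they are not part of any actual operating system API and only show one way such a mapping could be organized.

```python
from enum import Enum


class Precision(Enum):
    """Hypothetical precision grades an operating system might expose."""
    PRECISE = 0   # e.g., cells with very low error rates
    MEDIUM = 1    # e.g., denser cells with moderate error rates
    RELAXED = 2   # e.g., the densest cells with the highest error rates


# Hypothetical mapping from error susceptibility category to precision grade.
CATEGORY_TO_PRECISION = {
    "high_susceptibility": Precision.PRECISE,
    "medium_susceptibility": Precision.MEDIUM,
    "low_susceptibility": Precision.RELAXED,
}


def plan_storage(tagged_bits):
    """Group (category, payload) pairs by the precision grade they map to."""
    plan = {grade: [] for grade in Precision}
    for category, payload in tagged_bits:
        plan[CATEGORY_TO_PRECISION[category]].append(payload)
    return plan


plan = plan_storage([
    ("high_susceptibility", b"header"),
    ("low_susceptibility", b"refinement"),
])
```

In this sketch the application only names a category; the choice of cell type stays with the operating system or memory controller, which matches the division of responsibility described above.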

[0057] Accordingly, an operating environment such as illustrated in Figure 3 can include one or more memory chips (memory 300) that are programmed by the memory controller (or read/write controller) to have thresholds (and/or error codes) optimized for a particular application 330. The assignment of the particular levels of precision can be performed by the system, but can be application and/or scenario dependent. For example, an application 330 that stores images of a certain type, for example, JPEG XR, can use memory 300 with thresholds optimized for storing JPEG XR. In some cases, the operating system 310 and/or memory controller can access profiles (stored in precise memory and/or registers) to obtain system parameters based on the type of application storing data. Users may also be able to specify final qualities of their images.

[0058] The programmed thresholds may also depend on the location of the storage. Figures 4A-4C illustrate various operating environments and corresponding storage locations. For example, as illustrated in Figure 4A, cloud storage 401 can receive an image 402 from an application 330 executing on a computing device 403. The computing device 403 may be or include one or more of a server, personal computer, mobile device, wearable computer, gaming system, and appliance. One scenario for the operating environment shown in Figure 4A is a cellphone sending an image 402 to a Microsoft OneDrive® account. The number of programmed thresholds and the particular memory type (e.g., Flash, PCM, etc.) implementing the cloud storage can be selected to be suitable for handling massive amounts of data, as is common for cloud storage. In one implementation taking advantage of this scenario, the computing device 403 may be a phone that aggressively degrades a copy of the image 402 located on the device the longer the copy has not been accessed. The image 402 may be stored in the cloud storage 401 as a higher quality copy from which the computing device 403 may recover a high quality image.

[0059] In the example illustrated in Figure 4B, the storage 404 receiving the image 405 from the application 330 is a local storage on (or associated with) the device 406 executing the application 330. Continuing with the cellphone example, in this operating environment, the image 405 is stored in the cellphone's memory, for instance by writing the image from temporary storage 407 (such as at the time an image is captured by a camera of the cellphone) to a data storage 408 of the storage 404 of the computing device 406. The application 330 may also be stored in the application storage 409 of storage system 404. In some cases, the data storage 408 and application storage 409 may be part of the same memory chip while the temporary storage 407 is part of a cache that may be separate from or integrated (e.g., on-chip) with a processor of the computing device 406. In some cases, the temporary storage 407 is a region on the same chip as the data storage 408. Of course, the particular configuration is device and technology dependent.

[0060] As previously noted, the particular number of programmed thresholds can be based on the capabilities and storage needs of the device 406, which may be one or more of a server, personal computer, mobile device, wearable computer, gaming system, and appliance. In the cellphone example, 20% of the storage 404 may be allocated for the most precise region of memory so that there is sufficient space for application storage 409 and important data, whereas the remaining storage 404 can have higher allowed error rates (e.g., by being more dense or having fewer bits for error correction).

[0061] In the example illustrated in Figure 4C, the storage 410 can be associated with a website that receives an uploaded image 411 from a computing device 412 (via web browser 413) over the Internet 414 (and stores the image at a server 415 with associated storage 410). Although shown in a separate representation, the example illustrated in Figure 4C may be carried out in the environment illustrated in Figure 4A. For example, the website may be hosted by a cloud service and/or the associated storage may be cloud storage. In other cases, designated servers are used to host the website and store associated data. The needs and capabilities of these devices can influence the number of thresholds and the allocation of the amount of storage available at each threshold.

[0062] Storage Substrate Optimization

[0063] A PCM storage substrate can be optimized to offer high density, yet reasonable error rates via biasing and very low frequency scrubbing. In the example study, a PCM storage substrate was optimized to minimize errors via biasing and tuned via selective error correction to different error rate levels. This optimization was performed for a particular image encoding algorithm, an approximation-aware progressive transform codec (PTC).

[0064] In the optimization, the mapping of cell resistance levels to their digital values was adjusted to perform biasing, which optimizes the PCM cells to balance write errors against drift errors. The optimized cells were then tuned with selective error correction to match the bits encoded by the PTC that these cells are expected to store. For example, a multi-level PCM cell design can be optimized for high density (e.g., 3x) at reasonably high error rates (e.g., 10 ^~ ).

[0065] The described optimization achieves low error rates in a 4-level configuration (2 bits/cell) and reasonably low error rates in an 8-level configuration (3 bits/cell).

[0066] For optimization, a PCM cell's resistance range is partitioned into biased levels. Once the resistance range is partitioned into biased levels, the next step is to map digital values to individual biased levels. Both in general and in the PTC encoded images, zeroes are the most common ('00' for 4-level cells and '000' for 8-level cells), so the value zero is mapped to the highest level, which is immune to drift. There was no other value that appeared to be more common than others for images, so the values for the remaining levels were assigned by using a simple Gray code.
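The value-to-level assignment can be sketched as follows. This is a minimal Python illustration, assuming the standard reflected binary Gray code and assuming the sequence starts at the highest level and proceeds downward; the text does not specify the exact ordering, and the function names are hypothetical.

```python
def gray_code(i):
    """Return the i-th value of the reflected binary Gray code."""
    return i ^ (i >> 1)


def level_to_bits(num_levels):
    """Assign a bit pattern to each level; index 0 is the lowest level.

    The highest level receives the all-zero pattern, and each step down
    follows the Gray sequence, so adjacent levels differ in one bit.
    """
    width = (num_levels - 1).bit_length()
    top = num_levels - 1
    return [format(gray_code(top - level), f"0{width}b")
            for level in range(num_levels)]


four_level = level_to_bits(4)
eight_level = level_to_bits(8)
print(four_level)  # ['10', '11', '01', '00']
```

With a Gray code, a cell that is misread as a neighboring level corrupts only a single bit of the stored value, which limits the damage caused by a drift or write error.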

[0067] In a preferred implementation for the case study embodiment, three cell configurations are used: a precise configuration, a 4-level configuration, and an 8-level configuration. Neither the 4-level nor the 8-level configuration achieves the published uncorrectable bit error rate of solid-state storage products (10⁻¹⁶) in their raw form, but both can achieve reasonably low error rates that can be error-corrected to the commercial reliability level. Even for 8-level cells, which have higher error rates, the storage overhead of error correction is lower than 100%, so even with this overhead, biased 8-level cells provide denser storage when compared to the uncorrected biased 4-level cells.

[0068] Unfortunately, even after biasing, using the modeled circuitry for 16-level cells resulted in error rates that were too high (write error rates are reasonable, around 10⁻⁴, but the drift error rate is unbearably high, 10⁻¹ at 1 second after the write operation) and cannot be brought down to reasonable rates by error correction with storage overhead low enough to justify the increase in number of levels. The 2-level and 3-level cells were used as precise baselines since they show very low error rates. On the one hand, 2-level cells are simpler and faster. On the other hand, 3-level cells offer higher density at error rates still low enough to be considered precise. The 4-level and 8-level cells were then used as approximate memory cells.

[0069] Even after biasing, drift may still be an issue in the long term. To mitigate excessive drift, scrubbing can be used to rewrite the cell and bring the resistance level back down. Based on the PCM cell model (described in more detail below), the scrubbing period was expected to be on the order of 3 months (10⁷ seconds). The resulting average access bandwidth, on the order of 100 bits/second per gigabit of storage, is negligible. Also, if data is going to be scrubbed anyway, this may be a good opportunity to also perform wear leveling.
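The bandwidth figure follows from simple arithmetic, shown here as a short Python sketch. It assumes that every stored bit is accessed once per scrubbing interval; counting the read and the rewrite separately would double the result. The function and constant names are illustrative only.

```python
SCRUB_INTERVAL_S = 10**7          # scrubbing period from the text, in seconds
SECONDS_PER_DAY = 24 * 60 * 60


def scrub_bandwidth_bits_per_s(capacity_bits, interval_s=SCRUB_INTERVAL_S):
    """Average bandwidth if every stored bit is touched once per interval."""
    return capacity_bits / interval_s


interval_days = SCRUB_INTERVAL_S / SECONDS_PER_DAY
per_gigabit = scrub_bandwidth_bits_per_s(10**9)
print(per_gigabit)  # 100.0
```

An interval of 10⁷ seconds is roughly 116 days, so "3 months" is an order-of-magnitude description rather than an exact conversion.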

[0070] Once cells are optimized, the cells can be tuned to provide different error rate levels. The storage controller is responsible for offering a variety of error correction codes, each at a different point in the space defined by the storage overhead required for metadata storage and the error rate reduction provided. In principle this results in higher controller complexity, but in practice using multiple codes in the same family (e.g., BCH-4 and BCH-16) may keep complexity under control.

[0071] The controller is also responsible for organizing the storage into regions, each with a different error correction strength. The controller stores a region-to-configuration map in a table resident in the controller and backed by a preconfigured precise region of storage that persists the map during power cycles. System software sends special configuration commands to the controller to allocate and configure regions. Once configured, the controller uses the requested address and the information in the region-to-configuration map to determine which region the request targets and the appropriate error correction strength to use in servicing the request. The number of different regions is small (e.g., 8 in this example), so the region-to-configuration map can support variable-size regions and be fully associative.
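The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Region` and `RegionMap` names, the address ranges, and the string labels for the codes are hypothetical, and a hardware table would compare all entries in parallel rather than in a loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int   # first address covered by the region
    size: int    # region length in addressable units
    ecc: str     # error correction applied in this region


class RegionMap:
    """Small, fully associative region-to-configuration map."""

    def __init__(self, max_regions=8):
        self.max_regions = max_regions
        self.entries = []

    def configure(self, start, size, ecc):
        """Allocate a region; reject overlaps and table overflow."""
        if len(self.entries) >= self.max_regions:
            raise ValueError("region table is full")
        for r in self.entries:
            if start < r.start + r.size and r.start < start + size:
                raise ValueError("region overlaps an existing region")
        self.entries.append(Region(start, size, ecc))

    def lookup(self, address):
        """Compare the address against every entry (fully associative)."""
        for r in self.entries:
            if r.start <= address < r.start + r.size:
                return r
        raise KeyError(f"address {address:#x} is not in any region")


regions = RegionMap()
regions.configure(0x0000, 0x1000, "BCH-16")   # hypothetical layout
regions.configure(0x1000, 0x3000, "BCH-6")
regions.configure(0x4000, 0x4000, "none")
print(regions.lookup(0x2ABC).ecc)  # BCH-6
```

Because each entry carries its own start and size, regions can have arbitrary lengths, which is what makes variable-size regions practical when the table holds only a handful of entries.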

[0072] Regions with different error correction have different metadata overhead. As such, different regions will need different numbers of cells to store the same number of data bits. The entire storage space may be managed in one of two ways. Static management simply partitions the storage into multiple regions at manufacturing time. This approach is inflexible in that it does not allow a different proportion of storage to be dedicated to a region. The second approach is to allow dynamic reconfiguration of regions to match application demands. In this case, region resizing causes additional complexity. Assuming the storage device leaves manufacturing with all regions initialized to the strongest available error correction by default, when a region is configured for the first time, it grows in density, and thus in usable size. A simple way to cope with this is to expose this region as two regions, one of the original size before reconfiguration, and a virtual one with the surplus storage. This makes addressing simpler. A region can only be reconfigured to a smaller size if the system can accommodate the contents of the surplus region elsewhere.
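The split into an original-size region and a surplus region can be illustrated with the following Python sketch. It assumes that usable capacity is simply the physical capacity divided by (1 + ECC overhead) and ignores block granularity; the helper names and the example overheads of 31.3% and 11.7% (the BCH-16 and BCH-6 figures quoted elsewhere in this description) are used purely for illustration.

```python
def usable_data_bits(physical_bits, ecc_overhead):
    """Data bits that fit when each data bit costs (1 + overhead) stored bits."""
    return int(physical_bits / (1 + ecc_overhead))


def reconfigure(physical_bits, old_overhead, new_overhead):
    """Return (original_size, surplus_size) after weakening error correction."""
    original = usable_data_bits(physical_bits, old_overhead)
    surplus = usable_data_bits(physical_bits, new_overhead) - original
    if surplus < 0:
        raise ValueError("stronger correction shrinks the region; "
                         "relocate the surplus contents first")
    return original, surplus


# A region shipped with the strongest code is later relaxed to a weaker one.
original, surplus = reconfigure(1_000_000, old_overhead=0.313, new_overhead=0.117)
```

Existing addresses in the original-size region stay valid after reconfiguration, and only the surplus region needs new addresses, which is why this split keeps addressing simple.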

[0073] Evaluation Setup

[0074] A custom simulation infrastructure was used for the multi-level cell simulations. The quality measurements were based on 24 grayscale raw images at 768 x 512 pixel resolution in the Kodak PCD image set. Configurations and parameter settings for 4-level cells and 8-level cells are summarized in Figures 5A and 5B, respectively. Figure 5A shows Table 1 of uniform (u) and biased (b) 4-level cell parameters. RT denotes the mean resistance of a level, and RB denotes the resistance at the upper boundary of the level. Figure 5B shows Table 2 of uniform (u) and biased (b) 8-level cell parameters. Note that, compared to uniform cells, the target levels (log RT) and the level boundaries (log RB) of biased cells move toward lower resistances by appropriate amounts, resulting in lower drift-induced errors at the cost of increased write errors. The write error rate of biased cells was set to the order of 10 ^~ according to the application's characteristics. The overall drift error rate can be minimized by equalizing the drift error rates for all the levels (except for the first level and the last level). Cells are optimized for a scrubbing interval t = 10⁷ s (about 3 months) after they are written. During scrubbing, their original target resistance is restored.

[0075] The proposed system was evaluated using two metrics: peak signal-to-noise ratio (PSNR) and memory density. PSNR compares the original image, pixel by pixel, with the decoded image, which contains errors from the lossy compression algorithm (e.g., quantization) and from the memory subsystem (in this case, uncorrected write errors and drift errors). The higher the PSNR value, the smaller the difference between the original and the reconstructed images.
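For reference, the standard PSNR definition can be written as a short Python function. This sketch assumes 8-bit grayscale pixels (peak value 255) supplied as flat sequences; the bit depth is an assumption, not something stated in the text.

```python
import math


def psnr(original, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(decoded):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


identical = psnr([10, 20, 30], [10, 20, 30])
worst = psnr([0, 0, 0, 0], [255, 255, 255, 255])
```

Because the mean squared error sits in the denominator, every additional corrupted pixel lowers the PSNR, which is why the metric captures both compression loss and storage errors in one number.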

[0076] The approximate memory system was evaluated with images from several target PSNR levels, i.e., 35 dB, 38 dB, 40 dB, and 42 dB. For most images, the 40-42 dB range denotes high image quality, with distortion nearly imperceptible visually, whereas 38 dB and 35 dB represent mediocre and low quality, respectively. Due to the nondeterministic error patterns in the approximate memory system, 100 samples of each image were run in the benchmark and the minimum PSNR was used, which gives a lower bound on the quality of the reconstructed image. Memory density is defined as the number of data bits stored by a cell. Error-prone memories (e.g., PCM) commonly use error correction codes (ECC) to recover from a certain number of errors. The storage overhead of error correction bits may degrade memory density.

[0077] For a custom application for which target error rates for error tolerance classes have been determined, the PCM substrate can be optimized for the custom application. In the example case study, the substrate is optimized for an arbitrary scrub rate (10⁷ s, or approximately 3 months) by optimizing cells via biasing. Figure 6 shows a plot of combined raw bit error rates (RBERs) over time of uniform and biased PCM cells. Here, the effect of biasing on error rates, for both 4-level and 8-level cells, is illustrated, reporting combined error rates across all levels. Error rates grow over time because of drift effects.

[0078] Initially, 4-level and 8-level uniform cells (Uniform 4LC and Uniform 8LC) are used. As expected, error rates for 4-level cells are always lower than for 8-level cells because fewer levels allow more room for drift in each level. However, both types of cells start showing excessively high error rates even only an hour after being written. In contrast, Biased 4LC maintains very low drift error rates over the range of time (10⁻²⁰ at 10¹⁰ s). The raw bit error rate (RBER) of the Biased 4LC is dominated by the write errors. Biased 8LC, which combines the highest density with reasonably low error rates, provides a good trade-off, with an error rate of about 10⁻³, two orders of magnitude lower than Uniform 8LC at 10⁷ s. Luckily, it also matches the needs of the most error tolerant bits (i.e., the refinement bits). This allows no error correction to be used at all for these bits, eliminating unnecessary metadata overhead.

[0079] Figure 7 shows a plot of density with increasing scrubbing intervals comparing uniform and biased PCM cells. Figure 7 provides insight into which cell configuration offers the best trade-off between overall density for the example implementation, including error correction to maintain the uncorrectable bit error rate (UBER) at commercial rates (10⁻¹⁶), and scrubbing overhead. 2LC and 3LC cells have RBERs as low as precise memory, and hence do not require error correction. 3LC provides 1.58x higher density over 2LC. The densities of uniform cells (i.e., 4LC, 8LC, and 16LC), although high for short scrubbing intervals (so short they are unattractive), fall sharply at longer intervals, since drift-induced errors accrue fast. In contrast, biasing suppresses the growth of drift error rates significantly: Bias4LC has stable 1.86x density gains (due to write errors), and Bias8LC experiences a much smoother density degradation, achieving 2.28x density improvement after about 3 months (10⁷ s).
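The density figures can be approximated with a one-line formula: bits per cell, divided by (1 + ECC overhead), relative to a 2-level cell that stores one bit. The Python sketch below assumes that formula and, for the 8-level case, assumes an ECC overhead of 31.3%; that pairing is an inference made for illustration, not a statement from the text.

```python
import math


def density_vs_2lc(levels, ecc_overhead=0.0):
    """Data bits per cell relative to a 2-level cell (1 bit per cell)."""
    return math.log2(levels) / (1.0 + ecc_overhead)


three_level = density_vs_2lc(3)            # no error correction needed
eight_level = density_vs_2lc(8, 0.313)     # assumed 31.3% ECC overhead
print(round(three_level, 2))  # 1.58
print(round(eight_level, 2))  # 2.28
```

The formula makes the trade-off explicit: adding levels raises the numerator only logarithmically, while the stronger error correction those levels require raises the denominator.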

[0080] Once both the algorithmic error rate requirements for a custom application are determined and the substrate is optimized for lowest possible error rates, the algorithm and substrate can be matched via error correction. This relies on understanding the tradeoffs between storage overhead of the error correction mechanism and its correcting power. Figure 8 shows a plot comparing capabilities and overheads of error correction codes (ECC) at 512 data bit blocks. In Figure 8, a variety of error correction mechanisms (with storage overheads), and the correspondence between raw bit error rates (RBER) and uncorrectable bit error rates (UBER) are provided.

[0081] Single-error correcting and double-error detecting (SECDED) ECC corrects one error and detects up to two errors in 72 bits; each of the BCH codes corrects up to the denoted number of errors in 512 data bits plus overhead. The appropriate error correction technique can be selected for a memory region based on the needs/error constraints of a particular class of bits of the custom application. In some cases, additional error correction may not be needed. In other cases, some additional error correction scheme is used. For example, if a correction mechanism is desired that accepts an RBER of 10⁻³ and produces a UBER of 10⁻¹⁶, the plot shows that BCH-16 is the code that provides this capability with the lowest storage overhead (31.3%). Similarly, it can be seen that BCH-6 provides a 10 ^~ UBER at an overhead of 11.7%, which can be sufficient for certain low importance bits (that may make up a majority of the bits being stored).
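The relationship between code strength, overhead, and residual errors can be sketched numerically. The Python example below rests on two assumptions that are not stated in the text: each correctable error costs 10 parity bits (the usual figure for a binary BCH code whose codeword is shorter than 1023 bits), and bit errors are independent. Under the first assumption, BCH-16 and BCH-6 over 512 data bits cost 160/512 = 31.25% and 60/512 = 11.72%, which agrees with the quoted 31.3% and 11.7%. The failure probability computed here is a per-codeword quantity and is not claimed to reproduce the UBER values plotted in Figure 8.

```python
import math


def bch_overhead(t, data_bits=512, m=10):
    """Storage overhead of a t-error-correcting BCH code (m parity bits per error)."""
    return t * m / data_bits


def block_failure_probability(rber, t, data_bits=512, m=10):
    """Probability that a codeword holds more than t independent bit errors."""
    n = data_bits + t * m
    k = t + 1
    # First term of the binomial tail, then extend it with the term ratio.
    term = math.comb(n, k) * rber**k * (1.0 - rber) ** (n - k)
    total = 0.0
    while k <= n and term > 0.0:
        total += term
        term *= (n - k) / (k + 1) * rber / (1.0 - rber)
        k += 1
    return total


print(bch_overhead(16))  # 0.3125
print(bch_overhead(6))   # 0.1171875
strong = block_failure_probability(1e-3, 16)
weak = block_failure_probability(1e-3, 6)
```

At a raw error rate of one in a thousand, the stronger code leaves a residual failure probability many orders of magnitude below that of the weaker code, which is the trade-off against storage overhead that the paragraph describes.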

[0082] It is also worth noting that as RBER increases, the code strength required to maintain the same UBER grows rapidly. This highlights the value of biasing: had it not lowered the error rate by two orders of magnitude, the 8-level cell design would have offered RBER so high that the overhead of correcting all errors would have made it prohibitive.

[0083] The scrubbing period chosen for the biasing optimization was somewhat arbitrary. To illustrate the effects of using the same described cell design with other scrubbing intervals (so the cells are used "out-of-spec" for different scrubbing intervals), simulations were performed over the different scrubbing intervals. If the interval is shorter than specified, write errors dominate; if the interval is longer, drift errors dominate instead.

[0084] Figure 9 shows a graph illustrating the trade-off between ECC storage overhead and scrubbing interval. For each column, the code in the first row is applied to all the bits in thorough correction (TC); the selective correction (SC) uses the first-row ECC for the control and run-length bits in MB1, and the second-row ECC for the control and run-length bits in other MBs, and leaves all refinement bits unprotected. The third row shows the total overhead for SC. In Figure 9, it can be seen how error correction selection would have changed for different scrubbing intervals (assuming < 1 dB quality degradation).
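The total overhead for selective correction is a weighted average over the bit classes. The Python sketch below shows the calculation; the bit fractions are hypothetical placeholders (the real proportions depend on the encoded images), and the overheads of 31.3% and 11.7% stand in for the first-row and second-row codes only for illustration.

```python
def total_overhead(classes):
    """Weighted ECC overhead; classes is a list of (fraction_of_bits, overhead)."""
    if abs(sum(f for f, _ in classes) - 1.0) > 1e-9:
        raise ValueError("bit fractions must sum to 1")
    return sum(f * o for f, o in classes)


# Hypothetical split of encoded bits, for illustration only.
selective = [
    (0.05, 0.313),  # control and run-length bits in MB1, first-row code
    (0.40, 0.117),  # control and run-length bits in other MBs, second-row code
    (0.55, 0.0),    # refinement bits, left unprotected
]
thorough = [(1.0, 0.313)]  # first-row code applied to every bit

sc = total_overhead(selective)
tc = total_overhead(thorough)
```

Because the unprotected refinement bits contribute nothing to the sum, the selective total falls well below the thorough total whenever those bits make up a large share of the data.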

[0085] The graph in Figure 9 compares thorough correction (Bias8LC TC) with selective correction (Bias8LC SC) side-by-side at each interval. As the scrubbing interval increases (towards the right of x-axis), stronger ECC mechanisms must be employed to suppress the growth of drift error rate, resulting in higher storage overheads. On the other hand, larger intervals reduce system energy and bandwidth overheads due to data movement and checker bits computation generated by the scrubbing.

[0086] Although 10⁷ seconds was selected as the target scrubbing interval for the dense, approximate image storage system, shorter intervals might also be acceptable for other systems if higher density is the top priority. The main takeaway from these results, however, is that selectively applying error correction only where needed can significantly reduce the loss in density while bringing the memory to the algorithmically required error rates, as evidenced by the large difference in each pair of bars. By including the biasing (optimized at the scrubbing interval of 10⁷ s), only 10.22% storage overhead (brought down from almost 32%) is required, resulting in a storage density of 2.73x over the 2-level baseline.

[0087] The framework described herein is readily applicable to other technologies, e.g., Flash, particularly multi-level Flash (e.g., TLC NAND Flash). In such devices, ECCs (BCH and LDPC are common) are applied to a sector of 512 bytes (or greater, such as 1024 bytes). Figure 10 shows a plot comparing capabilities and storage overheads of ECC at 512-byte block data (typical of Flash). Each code is able to correct the denoted number of errors in a 512-byte Flash sector and the augmented ECC checker bits.

[0088] Prior studies report that TLC NAND Flash devices have an initial RBER of 10 ^~^, which increases gradually with the number of program/erase cycles. Accordingly, a TLC Flash could use BCH-16 for the cells storing the bits needing the highest precision, BCH-6 for bits needing less precision, and remaining bits needing the lowest precision could be uncorrected. RBER increases along with program/erase cycles, so stronger ECCs are gradually required. For instance, RBER reaches 10 ^~^ after approximately 3000 program/erase cycles. At this point, the density improvements of selective correction and thorough correction drop to 2.88x and 2.49x, respectively, making selective correction more attractive.

[0089] Co-designed data encoding and storage mechanisms provide denser approximate storage. By identifying the relative importance of encoded bits on output quality and performing error correction according to the identified relative importance, it is possible to increase storage capacity. Level biasing can further be incorporated into storage to reduce error rates in substrates subject to drift. Advantageously, the described systems and techniques are applicable to a variety of storage substrates.

[0090] Certain aspects of the invention provide the following non-limiting embodiments:

[0091] Example 1. A memory chip for dynamic approximate storage, comprising: an array of memory cells, the array comprising at least two regions; at least one threshold register for storing values for thresholds for memory cells corresponding to each of the at least two regions; and control logic to programmatically adjust the values for the thresholds for the memory cells.

[0092] Example 2. The memory chip of example 1, wherein the values for thresholds represent resistance thresholds.

[0093] Example 3. The memory chip of example 2, wherein the resistance thresholds for the memory cells for at least one of the at least two regions indicate non-uniform in log scale resistance bands.

[0094] Example 4. The memory chip of example 1, wherein the values for thresholds represent voltage thresholds.

[0095] Example 5. The memory chip of any of examples 1-4, wherein the memory cells comprise at least one of a single level cell or a multi-level cell, wherein the at least two regions have a corresponding at least two types of available error correction overhead.

[0096] Example 6. The memory chip of any of examples 1-5, further comprising a controller providing a variety of error correction codes, wherein the controller stores a region-to-configuration map to indicate an appropriate region of the at least two regions to which a request targets and an appropriate error correction strength from one of the variety of error correction codes to service the request.

[0097] Example 7. The memory chip of any of examples 1-6, further comprising at least one bit pattern mapping register for storing a bit pattern table that maps a particular bit pattern to a particular level in a multilevel cell of at least one of the regions of the at least two regions, wherein the control logic further comprises control logic to programmatically assign the particular bit pattern being mapped to the particular level.

[0098] Example 8. The memory chip of any of examples 1-7, wherein the memory chip is a secure digital (SD) nonvolatile memory card.

[0099] Example 9. A method of controlling a storage device for dynamic approximate storage, comprising: modifying at least one value stored in a threshold register and associated with at least one cell in a region of a memory comprising at least two regions to apply a biasing for the at least one cell, wherein the biasing adjusts ranges for values in a cell.

[0100] Example 10. The method of example 9, further comprising: assigning a bit pattern to each of the ranges for the values in the cell.

[0101] Example 11. The method of example 9 or 10, further comprising: assigning a first level of error correction to one of the at least two regions and assigning a second level of error correction to a second of the at least two regions.

[0102] Example 12. The method of any of examples 9-11, wherein the biasing generates non-uniform in log scale ranges.

[0103] Example 13. The method of any of examples 9-12, wherein the memory is a phase change memory and the values represent resistance values.

[0104] Example 14. The method of any of examples 9-12, wherein the values represent voltage values.

[0105] Example 15. The method of example 14, wherein the memory is a multilevel Flash memory.

[0106] Example 16. A mobile device comprising: a processor; a storage system comprising: a memory chip comprising an array of memory cells, the array comprising at least two regions; at least one threshold register for storing values for thresholds for memory cells corresponding to each of the at least two regions; and control logic to programmatically adjust the values for the thresholds for the memory cells; and a custom application stored on the storage system and comprising instructions that when executed by the processor store data on the memory chip, wherein the data has bits identified with at least two types of error constraints.

[0107] Example 17. The mobile device of example 16, wherein the thresholds for the memory cells for at least one of the at least two regions indicate non-uniform in log scale ranges.

[0108] Example 18. The mobile device of example 16 or 17, further comprising a controller providing a variety of error correction codes.

[0109] Example 19. The mobile device of example 18, wherein the controller stores a region-to-configuration map to indicate an appropriate region of the at least two regions to which a request to store the data targets and an appropriate error correction strength from one of the variety of error correction codes to service the request.

[0110] Example 20. The mobile device of any of examples 16-19, further comprising at least one bit pattern mapping register for storing a bit pattern table that maps a particular bit pattern to a particular level in a multilevel cell of at least one of the regions of the at least two regions, wherein the control logic further comprises control logic to programmatically assign the particular bit pattern being mapped to the particular level.

[0111] Example 21. The mobile device of any of examples 16-19, wherein the values for thresholds represent resistance thresholds.

[0112] Example 22. The mobile device of any of examples 16-19, wherein the values for thresholds represent voltage thresholds.

[0113] Example 23. A system or product for performing the method of any of examples 9-15.

[0114] Example 24. A system comprising means for modifying at least one value stored in a threshold register and associated with at least one cell in a region of a memory comprising at least two regions to apply a biasing for the at least one cell, wherein the biasing adjusts ranges for values in a cell.

[0115] It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application.

[0116] Although the subject matter has been described in language specific to structural features and/or acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as examples of implementing the claims and other equivalent features and acts are intended to be within the scope of the claims.
