Title:
PARALLELIZATION OF ERROR ANALYSIS CIRCUITRY FOR REDUCED POWER CONSUMPTION
Document Type and Number:
WIPO Patent Application WO/2012/127262
Kind Code:
A1
Abstract:
A memory device (e.g., a flash memory device) includes power efficient codeword error analysis circuitry. The circuitry analyzes codewords stored in the memory of the memory device to locate and correct errors in the codewords before the codewords are communicated to a host device that requests the codewords from the memory device. The circuitry includes a highly parallel configuration with reduced complexity (e.g., reduced gate count) that a controller may cause to perform the error analysis under most circumstances. The circuitry also includes an analysis section of greater complexity with a less parallel configuration that the controller may cause to perform the error analysis less frequently. Because the more complex analysis section runs less frequently, the error analysis circuitry may provide significant power consumption savings in comparison to prior designs for error analysis circuitry.

Inventors:
DROR ITAI (IL)
Application Number:
PCT/IB2011/000635
Publication Date:
September 27, 2012
Filing Date:
March 24, 2011
Assignee:
SANDISK IL LTD (IL)
DROR ITAI (IL)
International Classes:
G06F11/10
Domestic Patent References:
WO2009072105A2 (2009-06-11)
WO2009074979A2 (2009-06-18)
Other References:
None
Claims:
CLAIMS

I claim:

1. A memory device comprising:

an error search section configured to test for errors in a codeword under analysis, the error search section comprising:

a low parallelization circuit configuration that evaluates 'r' instances of an error locator test in parallel; and

a high parallelization circuit configuration that evaluates 's' instances of the error locator test in parallel;

where 'r' < 's'; and

a controller in communication with the error search section, the controller configured to:

obtain an error count for the codeword under analysis;

when the error count exceeds a parallelization threshold, search for the errors with the low parallelization circuit configuration; and

when the error count does not exceed the parallelization threshold, search for the errors with the high parallelization circuit configuration.

2. The memory device of claim 1, where the error locator test comprises an error locator polynomial.

3. The memory device of claim 1, where the error search section comprises a Chien search section and the error locator test comprises an error locator polynomial evaluated by the low parallelization circuit configuration and the high parallelization circuit configuration.

4. The memory device of claim 3, where the error locator polynomial comprises polynomial coefficients obtained from an error processing stage prior to the error search section.

5. The memory device of claim 2, where: the error locator polynomial is configured to locate as many as 't' errors in the codeword under analysis; and

the low parallelization circuit configuration evaluates 't' terms in the error locator polynomial for each of the 'r' instances.

6. The memory device of claim 5, where the high parallelization circuit configuration evaluates 's' instances of the error locator polynomial in parallel for 'L' terms in the error locator polynomial.

7. The memory device of claim 6, where 'L' < 't' and 'r' < 's'.

8. The memory device of claim 6, where 'L' represents a pre-selected probability for which the high parallelization circuit configuration executes.

9. The memory device of claim 6, where 'L' represents an expected pre-selected power consumption reduction.

10. The memory device of claim 1, where the parallelization threshold represents an expected pre-selected power consumption reduction.

11. A memory device comprising:

an error search section configured to test for errors in a codeword under analysis, the error search section comprising:

't' search units, each configured to evaluate a specific term position of an error locator equation across 't' term positions in the error locator equation, in parallel for 'r' instances of the error locator equation;

among the search units, 'L' search units further configured to evaluate their specific term position of the error locator equation across 'L' term positions in the error locator equation, in parallel for 's' instances of the error locator equation, where:

'L' < 't'; and

's' > 'r'; and

a controller in communication with the error search section, the controller configured to:

obtain an error count for the codeword under analysis;

when the error count exceeds 'L', execute in the error search section a low parallelization analysis in which the search units search for the errors in the codeword under analysis 'r' instances of the error locator equation at a time; and

when the error count does not exceed 'L', execute in the error search section a high parallelization analysis in which the search units search for the errors in the codeword under analysis 's' instances of the error locator equation at a time.

12. The memory device of claim 11, where the error search section comprises a Chien search section and the error locator equation comprises an error locator polynomial evaluated by the search units.

13. The memory device of claim 12, where the error locator polynomial comprises polynomial coefficients obtained from an error processing stage prior to the error search section.

14. The memory device of claim 11, where 'L' represents a pre-selected probability for which the high parallelization analysis executes.

15. The memory device of claim 11, where 'L' represents an expected pre-selected power consumption reduction.

16. The memory device of claim 11, where the error locator equation comprises an error locator polynomial.

17. The memory device of claim 16, where the 'L' search units each comprise a multiplexer that selects between:

an error locator polynomial coefficient for the error locator polynomial;

an error locator polynomial coefficient multiplier for evaluating a particular iteration of 'r' instances of the error locator polynomial in parallel; and

an error locator polynomial coefficient multiplier for evaluating a particular iteration of 's' instances of the error locator polynomial in parallel.

18. The memory device of claim 16, where the 't' search units, other than the 'L' search units, each comprise a multiplexer that selects between:

an error locator polynomial coefficient for the error locator polynomial; and

an error locator polynomial coefficient multiplier for evaluating a particular iteration of 'r' instances of the error locator polynomial in parallel.

19. A method for locating errors in a codeword, the method comprising:

obtaining an error count for a codeword under analysis;

when the error count exceeds a parallelization threshold, searching for the errors in the codeword with a low parallelization circuit configuration that evaluates 'r' instances of an error locator test in parallel; and

when the error count does not exceed the parallelization threshold, searching for the errors in the codeword with a high parallelization circuit configuration that evaluates 's' instances of the error locator test in parallel;

where 'r' < 's'.

20. The method of claim 19, where the error locator test comprises an error locator polynomial.

21. The method of claim 19, where searching for the errors with a low parallelization circuit configuration comprises performing a Chien search.

22. The method of claim 19, where searching for the errors with a high parallelization circuit configuration comprises performing a Chien search.

23. The method of claim 19, where: the error locator polynomial is configured to locate as many as 't' errors in the codeword under analysis; and

the low parallelization circuit configuration evaluates 't' terms in the error locator polynomial for each of the 'r' instances.

24. The method of claim 23, where the high parallelization circuit configuration evaluates 's' instances of the error locator polynomial in parallel for 'L' terms in the error locator polynomial.

25. The method of claim 24, where 'L' < 't' and 'r' < 's'.

26. The method of claim 24, further comprising:

setting 'L' to implement a pre-selected probability for which the high parallelization circuit configuration executes.

27. The method of claim 24, further comprising:

setting 'L' to implement an expected pre-selected power consumption reduction.

28. The method of claim 19, further comprising:

setting the parallelization threshold to implement an expected pre-selected power consumption reduction.

Description:
PARALLELIZATION OF ERROR ANALYSIS CIRCUITRY FOR REDUCED POWER CONSUMPTION

INVENTORS:

Itai Dror

BACKGROUND OF THE INVENTION

1. Technical Field.

[001] This disclosure relates to error detection and correction of data stored in, for example, memory devices employing flash memory. In particular, this disclosure relates to reducing the power consumption of error analysis circuitry in memory devices without substantially impacting error analysis performance.

2. Related Art.

[002] Continual development and rapid improvement in semiconductor manufacturing techniques have led to extremely high density memory devices. The memory devices are available in a wide range of types, speeds, and functionality. Memory devices often take the forms, as examples, of flash memory cards and flash memory drives. Today, capacities for memory devices have reached 64 gigabytes or more for portable memory devices such as Universal Serial Bus (USB) flash drives and one terabyte or more for solid state disk drives. Memory devices form a critical part of the data storage subsystem for digital cameras, digital media players, home computers, and an entire range of other host devices.

[003] One important characteristic of a memory device is its power consumption. In an age when many host devices are powered by limited capacity batteries, every fraction of a watt in power saving translates into extended battery life and extended functionality between recharges for the host device. Reliability and cost are also important characteristics of a memory device. Significant volumes of memory devices are manufactured and sold each year, and competitive pressures have resulted in very low cost and even lower margins. Accordingly, even small improvements in the cost of a memory device can yield significant financial and marketplace position benefits. At the same time, low cost cannot be achieved at the expense of reliability. Instead, consumers expect that their memory devices will store their data for extended periods of time without significant risk of data loss.

SUMMARY

[004] A memory device (e.g., a flash memory device) includes power efficient error analysis circuitry. The circuitry analyzes codewords stored in the memory of the memory device to locate and correct errors in the codewords before they are provided to a host device that requests them from the memory. The circuitry includes a highly parallel analysis section with reduced power consumption that a controller may cause to perform the error analysis under most circumstances. The circuitry also includes an analysis section with a less parallel configuration and greater power consumption that the controller may cause to perform the error analysis less frequently. Because the less parallel analysis section runs less frequently, the error analysis circuitry may provide significant power consumption savings in comparison to prior designs.

[005] In one implementation, the memory device includes an error search section configured to test for errors in a codeword under analysis. The error search section includes a low parallelization circuit configuration that evaluates 'r' instances of an error locator test in parallel. The error search section also includes a high parallelization circuit configuration that evaluates 's' instances of the error locator test in parallel, where 'r' < 's'. A controller in communication with the error search section is configured to obtain an error count for the codeword under analysis. When the error count exceeds a parallelization threshold, the memory device searches for the errors with the low parallelization circuit configuration. However, when the error count does not exceed the parallelization threshold, the memory device searches for the errors with the high parallelization circuit configuration. The parallelization threshold may be set such that the high parallelization (lower power consuming) circuit configuration performs the search in most cases.

[006] Other features and advantages of the inventions will become apparent upon examination of the following figures, detailed description, and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[007] The system may be better understood with reference to the following drawings and description. In the figures, like reference numerals designate corresponding parts throughout the different views.

[008] Figure 1 illustrates prior art error decoding logic.

[009] Figure 2 shows a probability curve of the probability of decoding more than a specified number of errors in a code word.

[010] Figure 3 illustrates a memory device that includes power efficient codeword error analysis circuitry.

[011] Figure 4 shows high parallelization and low parallelization search circuitry.

[012] Figure 5 shows search units for high parallelization search circuitry.

[013] Figure 6 shows search units for low parallelization search circuitry.

[014] Figure 7 shows test logic for finding errors in a codeword.

[015] Figure 8 shows a method for finding errors in a codeword.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[016] The discussion below makes reference to host devices and memory devices. A host device may be a wired or wireless device, may be portable or relatively stationary, and may run from battery power, AC power, or another power source. A host device may be a consumer electronic device such as a personal computer, a mobile phone handset, a game device, a personal digital assistant (PDA), an email/text messaging device, a digital camera, a digital media/content player, a GPS navigation device, a satellite signal (e.g., television signal) receiver, or cable signal (e.g., television signal) receiver. In some cases, a host device accepts or interfaces to a memory device that includes the functionality described below. Examples of memory devices include memory cards, flash drives, and solid state disk drives. For example, a music/video player may accept a memory card that incorporates the functionality described below, or a personal computer may interface to a solid state disk drive that includes the functionality described below. In other cases, the host device may directly incorporate the logic that implements the functionality described below, for example as part of its memory system.

[017] Figure 1 illustrates prior art error decoding logic 100 that may be present in a host device or a memory device. In particular, the error decoding logic 100 is a block diagram of the functionality of a Bose, Chaudhuri, and Hocquenghem (BCH) error decoder. The innovations described in this application are not limited to BCH decoders, but may instead be applied to any error detection or correction logic, including Reed-Solomon decoders, turbo decoders, low density parity check decoders, and other error detection or correction logic that operates on data elements such as codewords that are members of a particular code design.

[018] The error decoding logic 100 is partitioned into four stages 102, 104, 106, and 108. A possibly corrupted data element is submitted, word by word, to the first stage 102. The first stage 102 determines 'p' residuals, b_i, from the input data element with respect to 'p' minimal polynomials, Φ_i. The 'p' residuals are submitted to the second stage 104, which calculates the '2t' syndrome components from the 'p' residuals calculated at the first stage 102, where 't' is the maximum number of correctable errors supported by the decoder. The third stage 106 calculates the coefficients of the error location polynomial from the '2t' syndrome components that were determined by the second stage 104. The third stage 106 may employ, for example, the Berlekamp-Massey method to solve for the coefficients. The outputs of the third stage 106 are the 'e' coefficients of the error locator polynomial, where 'e' is the number of errors determined to be present in the input data element. The fourth stage 108 locates the 'e' errors in the input data element by solving the error locator polynomial. The fourth stage 108 may be implemented as Chien search logic, for example, that outputs the additive inverses of the error locations.

[019] The Chien search circuit in the fourth stage 108 may find the bit addresses of the 'e' errors by locating the zeros of the error location polynomial. The third stage 106 outputs the number of errors found, 'e', in the input data element, and the 'e' coefficients σ_i of the error locator polynomial. These parameters are input to the Chien search logic.

[020] Figure 2 shows an example plot 200 that gives the probability 202 that a data element (e.g., a codeword from a predefined code) will have more than a specified number of errors. For the purposes of illustration, the plot 200 assumes a predefined BCH(18214, 16384, 245) code that corrects up to 122 errors in a codeword, and a bit error rate of 0.34%. The BCH(18214, 16384, 245) code specifies that each codeword will have 18,214 bits, of which 16,384 are data bits (e.g., a block of 2048 8-bit bytes of user data), and that there is a minimum Hamming distance of 245 between code words. Other such plots may be generated for any particular code design and bit error rate, and the data elements input to the error detection or correction logic may be codewords from any desired code design.

[021] The plot 200 shows, for example, that there is less than a 1/1000 chance of the codeword having more than 88 errors. The probability for a given number of errors drops precipitously as the number of errors increases toward 't', the maximum error correcting capability of the code. Although it is rare to have a large number of errors in the data element, the power supply needs to be designed to accommodate the power needed to correct these large numbers of errors because they do sometimes occur.
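As an illustration of how such a curve may be derived (a sketch only, assuming independent bit errors so that the error count in a codeword follows a binomial distribution; the parameters below are the example values from this description):

from scipy.stats import binom

N_BITS = 18214      # codeword length of the example BCH(18214, 16384, 245) code
BER = 0.0034        # example raw bit error rate of 0.34%

def prob_more_than(k_errors: int) -> float:
    # P(number of bit errors in one codeword > k_errors) under the binomial model
    return binom.sf(k_errors, N_BITS, BER)

print(prob_more_than(88))    # on the order of 1/1000 for these parameters
print(prob_more_than(122))   # beyond the maximum correction capability 't' of the code

Computed this way, the tail probability at 88 errors is consistent with the roughly 1/1000 figure cited above.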

[022] The following description presents several techniques for reducing the maximum power consumption of error decoding logic. In one aspect, the memory device searches for errors in the codeword using search circuitry with different degrees of parallelization and complexity. The search circuitry chosen may depend on the number of errors, 'e', detected in the codeword in comparison against a power control threshold. The power control threshold may be a constant, or may be changed during the operation of the device that includes the error decoding logic.

[023] For example, when 'e' is greater than a power control threshold number of errors, the memory device may search for errors using a low parallelization circuit having a particular complexity. The complexity may be expressed in terms of power consumption, gate count (which may translate directly or indirectly to power consumption), or another complexity measure. When 'e' is less than the power control threshold number of errors, the memory device may instead search for errors using a high parallelization circuit having reduced complexity compared to the low parallelization circuit. In the implementations described below, the high parallelization circuit has a lower gate count and lower power consumption than the low parallelization circuit. As a result, the memory device detects and corrects a large number of errors (which occur infrequently) by employing the lower parallelization, higher complexity circuits less frequently than the higher parallelization, lower complexity circuits. Consequently, during most decoding operations the memory device benefits from a decreased maximum sustained power consumption of the error decoding logic.

[024] The power control threshold at which the lower complexity circuits perform the search may be set at any level. In some implementations, the power control threshold may be set to correspond to a desired probability that fewer than the threshold number of errors will occur. The power control threshold may thereby determine a pre-selected probability for which the low complexity circuits perform the search, and may thereby implement an expected pre-selected power consumption reduction for the memory device. As a specific example, using the code design given above, the power control threshold may be set to 88, so that there is only a 1/1000 chance that the higher complexity circuits perform the search for any given codeword. As a result, the error decoding logic searches using the lower complexity, higher parallelization circuits, 999 times out of 1000, so that the overall performance of the memory device remains nearly unaffected. At the same time, less power is consumed 999 times out of 1000, according to the difference in power consumption between the lower complexity and higher complexity circuits.
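Continuing the sketch above (a hypothetical helper that reuses the prob_more_than() function from the previous example and the same binomial assumptions), the power control threshold may be chosen for a target probability as follows:

def select_power_control_threshold(target_prob: float, t_max: int = 122) -> int:
    # Return the smallest threshold 'L' such that the probability of a codeword having
    # more than 'L' errors (and so requiring the low parallelization, higher power
    # search) is at most target_prob.
    for candidate in range(t_max + 1):
        if prob_more_than(candidate) <= target_prob:
            return candidate
    return t_max

For example, select_power_control_threshold(1e-3) returns a value near 88 for the example code and bit error rate, matching the threshold discussed above.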

[025] The reduced power requirements may give rise to less expensive or less complex power supplies, thereby reducing the cost and increasing the reliability of the electronic device that incorporates these control techniques. This is particularly true in implementations where the memory controller and the power supply are fabricated monolithically (e.g., on a single chip). In such implementations, the power supply relies on internal chip capacitance to avoid the expense and space required for discrete capacitors, but at the same time the power supply tends to be limited in peak power output, rise and fall time, and other parameters. Accordingly, reducing the power required from the power supply facilitates low cost and reliable fabrication and operation of memory devices that incorporate the power supply, e.g., on a single chip.

[026] Figure 3 shows a memory device 300 in which a memory 302 stores data elements 304 that a memory interface 306 stores and retrieves. In particular, the controller 308 and memory interface 306 retrieve data elements requested by the host device and communicate them first to the error decoding logic 310 for error detection and correction. As one example, the memory 302 may be a flash memory card memory array, and the memory interface 306 may respond to read requests from a host device by retrieving the requested data elements from the memory 302 and passing them to the error decoding logic 310. After error detection and correction, the corrected data elements may then be communicated to the host device through the host device interface 307 (e.g., a Secure Digital (SD), micro SD, Compact Flash, or other flash memory card interface).

[027] The error decoding logic 310 includes error count logic 312 (e.g., the Berlekamp-Massey coefficient solver noted above with respect to Figure 1) that determines an error count, 'e', for the data element. The comparison logic 314 in the controller 308 receives the error count and determines when the error count exceeds a power control threshold (PCT). The PCT value may be stored in the PCT register 320. If the error count exceeds the power control threshold, the comparison logic 314 may, as examples, assert a power control enable signal or status bit, or communicate a power control message or command to the control logic 316. The control logic 316 asserts the parallelization control signals to the error search logic 318 to selectively cause the search for the 'e' errors to be performed by the high parallelization, low complexity circuitry or by the low parallelization, high complexity circuitry.
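A minimal behavioral sketch of this decision (hypothetical function and signal names; it mirrors only the comparison logic 314 and control logic 316 described above, not their circuit implementation):

def select_search_mode(error_count: int, pct: int) -> str:
    # Comparison logic 314: compare the error count 'e' against the PCT register value.
    # Control logic 316 then drives the parallelization control signals accordingly.
    if error_count > pct:
        return "low_parallelization"    # higher complexity search across all 't' term positions
    return "high_parallelization"       # lower power search across 'L' term positions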

[028] In one implementation, the error search logic 318 performs a Chien search to find the roots of a polynomial over a finite field GF(q):
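In a standard formulation, assumed here (the coefficients σ_i and maximum degree 't' follow the notation used elsewhere in this description, up to the exact indexing convention of the figures), the polynomial is

Λ(x) = σ_0 + σ_1·x + σ_2·x^2 + … + σ_t·x^t, with x and each σ_j elements of GF(q).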

[029] The nonzero β in GF(q) for which Λ(β) = 0 are the roots of the polynomial.

[030] The implementation of a Chien search is made more efficient because each non-zero β may be expressed as a power α^i of a primitive element α of GF(q), for a selected exponent 'i'. Therefore, the powers α^i for 'i' between 0 and q-1 cover the entire field (other than the zero element).

[031] There is the following relationship between instances 'i' and 'i+1' of the polynomial:
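Writing the j-th term of instance 'i' of the polynomial as γ_j(i) = σ_j·α^(j·i) (a standard decomposition, assumed here):

Λ(α^i) = γ_0(i) + γ_1(i) + … + γ_t(i), and

Λ(α^(i+1)) = σ_0 + σ_1·α^(i+1) + … + σ_t·α^(t·(i+1)) = γ_0(i)·α^0 + γ_1(i)·α^1 + … + γ_t(i)·α^t.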

[032] Each Λ(α^i) is therefore the sum of the terms in the polynomial, from which the next set of coefficients for each term in the polynomial is given by:
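γ_j(i+1) = γ_j(i)·α^j, for j = 0, 1, …, t (under the same decomposition as above), so that each new instance of the polynomial is obtained by multiplying each term by a fixed power of α and summing the results.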

[034] The Chien search starts at i = 0, with each term initialized to its polynomial coefficient, and iterates through each value of i up to (q - 1). As described in more detail below, each iteration may cover 'r' instances ('r' values of 'i') of the polynomial for the low parallelization configuration, or 's' instances ('s' values of 'i') of the polynomial for the high parallelization configuration, with 's' > 'r'. If the sum of the terms in any instance of the polynomial is zero, then a root of the polynomial has been located, and an error exists in the codeword at the position corresponding to that value of 'i'.
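A compact software sketch of this iteration structure (an illustration of the search just described, not the patented circuit itself; it assumes a binary extension field GF(2^m) with elements represented as integers so that field addition is XOR, and the field helpers gf_mul(a, b), which multiplies two GF(q) elements, and alpha_pow(k), which returns α^k, are supplied by the caller):

def chien_search(sigma, q, parallel, gf_mul, alpha_pow):
    # sigma: coefficients σ_0..σ_t; parallel: 'r' or 's' instances evaluated per iteration.
    # Returns the exponents 'i' for which Λ(α^i) == 0, i.e., the located error positions.
    t = len(sigma) - 1
    gamma = list(sigma)                   # γ_j = σ_j · α^(j·i) for the current base instance i
    roots = []
    i = 0
    while i < q - 1:
        for offset in range(parallel):    # evaluate 'parallel' consecutive instances
            if i + offset >= q - 1:
                break
            acc = 0
            for j in range(t + 1):
                term = gamma[j]
                if offset:                # extra multiplication by α^(j·offset) for this instance
                    term = gf_mul(term, alpha_pow(j * offset))
                acc ^= term               # GF(2^m) addition is bitwise XOR
            if acc == 0:                  # zero sum: a root, hence an error location
                roots.append(i + offset)
        for j in range(t + 1):            # advance the base instance by 'parallel' steps
            gamma[j] = gf_mul(gamma[j], alpha_pow(j * parallel))
        i += parallel
    return roots

Calling the same function with parallel set to 'r' or to 's' corresponds to the low and high parallelization configurations; the hardware described below further differs in that the 's'-wide configuration only needs the first 'L' term positions.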

[035] Figure 4 shows an example of the error search logic 318, including a high parallelization circuit 402 and low parallelization circuit 404. The high parallelization circuit 402 overlaps in circuitry with the low parallelization circuitry 404 as shown by the overlap 406. Figure 5 isolates a view of the high parallelization circuitry 402, while Figure 6 isolates a view of the low parallelization circuitry 404.

[036] The following parameters characterize the implementations: 'L' is the power control threshold, 's' is the number of instances of the polynomial evaluated in parallel by the high parallelization circuitry, 'r' is the number of instances of the polynomial evaluated in parallel by the low parallelization circuitry, and 't' is the number of errors that may be corrected in the codeword as determined by the code design. Furthermore, in Figures 4-6, the 'σ_i' are the coefficients of the polynomial determined in an earlier stage of the error decoding logic 310, and the 'α' terms are the multipliers raised to the correct powers for each term in the polynomial and for each instance of the polynomial according to the equations above. Note that the error decoding process described in Figure 1 that feeds the error search logic 318 has the property that σ_i = 0 for 'i' > 'e'.

[037] As noted above, the total number of instances of the polynomial to evaluate is q-1. The value of 'r' is less than the value of 's', and therefore it takes more iterations of the low parallelization circuitry 404 to evaluate the total number of instances of the polynomial. This is one sense in which the low parallelization circuitry has a lesser degree of parallelization than the high parallelization circuitry. Note that the values of 's', 'r', and 'L' are selected such that the low parallelization circuitry has a higher power consumption than the high parallelization circuitry. In part, the higher power consumption results from the additional multipliers employed in the low parallelization circuitry to evaluate the polynomial across all 't' term positions. In contrast, the high parallelization circuitry uses fewer multipliers to evaluate the polynomial across 'L' term positions, when 'e' <= 'L', taking advantage of σ_i = 0 for 'i' > 'e'.

[038] The search logic 318 may be implemented using search units. Figure 5 shows an example of a search unit 502 for the high parallelization circuitry 402. There are 'L' such search units for the high parallelization circuitry. Each search unit may evaluate a specific term position of the polynomial for multiple instances (e.g., 's' instances) of the polynomial.

[039] The search unit 502 includes a multiplexer 504 that selects between three inputs: the polynomial coefficient 'σ_i' 506 for the first iteration of evaluation of a set of 's' or 'r' polynomials; a multiplication of the prior coefficient (stored in the register 512) for the term position by a power of 'α^s' 508, which multiplies the polynomial coefficient by the requisite powers of 'α^s' when 's' instances of the polynomial are evaluated each iteration in high parallelization mode; and a multiplication of the prior coefficient for the term position by a power of 'α^r' 510, which multiplies the polynomial coefficient by the requisite powers of 'α^r' when 'r' instances of the polynomial are evaluated each iteration in low parallelization mode.

[040] The search units that overlap between the high parallelization circuitry 402 and the low parallelization circuitry 404 are configurable to act in a low parallelization mode or a high parallelization mode by controlling the multiplexers. In particular, the multiplexers 504 select multiplication by powers of 'α^r' for low parallelization mode, and multiplication by powers of 'α^s' for high parallelization mode. In one implementation, the control logic 316 generates the parallelization control signals to cause the multiplexers 504 to select the appropriate parameters for the selected parallelization mode.

[041] The search unit 502 further includes a set of 's' coefficient multipliers 514. The coefficient multipliers 514 multiply the coefficient for a specific term position by an additional power of 'α' to evaluate the specific term position of the polynomial for different instances of the polynomial. No additional multiplication is needed for the first instance of the polynomial, and the unmodified values of the registers (e.g., the register 512) are used. The 's-1' coefficient multipliers 514 evaluate the first term position for 's-1' instances of the polynomial. There are 'L' search units operating in parallel, each of which evaluates a particular term position of the polynomial for each of 's' instances of the polynomial. A horizontal slice through the high parallelization circuitry 402 evaluates one instance of the polynomial (across the first 'L' term positions). One such horizontal slice is labeled in Figure 5 as the slice 516.
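A minimal behavioral model of one shared search unit such as the search unit 502 (a sketch under the same assumptions as the earlier Chien search example; gf_mul and alpha_pow are the assumed field helpers, and the class name is illustrative):

class SearchUnit:
    def __init__(self, j, sigma_j, gf_mul, alpha_pow):
        self.j = j                  # term position handled by this unit
        self.sigma_j = sigma_j      # σ_j, the multiplexer input 506
        self.reg = None             # models the register 512
        self.gf_mul = gf_mul
        self.alpha_pow = alpha_pow

    def step(self, first_iteration: bool, parallel: int):
        # Multiplexer 504: load σ_j on the first iteration of a search; afterwards,
        # multiply the prior value by α^(j·r) or α^(j·s) depending on whether
        # 'r' (low parallelization) or 's' (high parallelization) is selected.
        if first_iteration:
            self.reg = self.sigma_j
        else:
            self.reg = self.gf_mul(self.reg, self.alpha_pow(self.j * parallel))
        return self.reg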

[042] Figure 6 shows an example of a search unit 602 for the low parallelization circuitry 404. There are 't' search units for the low parallelization circuitry 404, and each search unit may evaluate a specific term position of the polynomial for multiple instances (e.g., 'r' instances) of the polynomial. Among the search units, 'L' are shared with the high parallelization circuitry 402 and have the form described above with regard to the search unit 502. The remaining 't - L' search units have a slightly different form, and the last such search unit is designated by element 602 in Figure 6.

[043] The search unit 602 includes a multiplexer 604 that selects between two inputs: the polynomial coefficient 'σ_i' 606 for the first iteration of evaluation of a set of 'r' polynomials; and a multiplication of the prior coefficient (stored in the register 610) for the term position by powers of 'α^r' 608, which multiplies the polynomial coefficient by the requisite powers of 'α^r' when 'r' instances of the polynomial are evaluated each iteration in low parallelization mode.

[044] As noted before, the 'L' search units that overlap between the high parallelization circuitry 402 and the low parallelization circuitry 404 are configurable to act in a low parallelization mode or a high parallelization mode by controlling the multiplexers. The multiplexers 604 in the remaining 't - L' search units select 'σ_i' as the coefficient for the first iteration of evaluating 'r' instances of the polynomial in parallel, and select the multiplication of the prior coefficient by powers of 'α^r' as the coefficient for subsequent iterations. In one implementation, the parallelization control signals generated by the control logic 316 control the multiplexers 604 to select inputs as noted above according to the selected operating mode.

[045] The search unit 602 further includes a set of 'r' coefficient multipliers 612. The coefficient multipliers 612 multiply the coefficient for a specific term position by an additional power of 'α' to evaluate the specific term position of the polynomial for different instances of the polynomial. No additional multiplication is needed for the first instance of the polynomial, and the unmodified values of the registers (e.g., the register 610) are used. In particular, the 'r-1' coefficient multipliers 612 evaluate the last term position for 'r-1' instances of the polynomial. There are 't' search units operating in parallel, each of which evaluates a different one of the term positions for 'r' instances of the polynomial in parallel. A horizontal slice through the low parallelization circuitry 404 evaluates one instance of the polynomial (across all 't' term positions). One such horizontal slice is labeled in Figure 6 as the slice 614.

[046] Once the term positions in the polynomial are evaluated, the error search logic 318 tests whether any instance of the polynomial, for which term positions have been evaluated, is zero (and thereby determines whether an error is located) using the test logic 700 shown in Figure 7. One instance of the test logic is labeled 702, and it evaluates the first instance of a polynomial in each set of polynomials. The test logic 702 connects to and performs an exclusive-or operation 704 against the first output of each of the coefficient multipliers across the search units. For operation in the low parallelization mode, there are 'r' instances 708 of the test logic that each accept 't' inputs corresponding to the evaluated term positions from the search units. For operation in the high parallelization mode, additional instances 710 of the test logic also operate to evaluate the additional 's - r' instances of the polynomial, for a total of 's' instances. For example, the test logic 712 determines whether the instance 'r+1' in any set of iterations of the polynomial is zero by analyzing the 'L' term positions evaluated by the search units in the high parallelization circuitry 402. The additional instances 710 of the test logic are not used when operating in low parallelization mode, and their outputs may be disabled or ignored.
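One instance of the test logic may be modeled as a simple exclusive-or reduction (a sketch only; the term outputs are the evaluated term positions delivered by the search units for one instance of the polynomial):

def instance_is_root(term_outputs) -> bool:
    # Exclusive-or the term positions (operation 704); a zero sum means Λ = 0 for this
    # instance, i.e., an error is located at the corresponding codeword position.
    acc = 0
    for term in term_outputs:
        acc ^= term
    return acc == 0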

[047] Because σ_i = 0 for 'i' > 'e', and the high parallelization mode is active when 'e' <= 'L', the additional instances 710 of the test logic only need to accept 'L' < 't' inputs from the coefficient multipliers shown in Figure 5. The other instances 708 of the test logic also operate in high parallelization mode, but the inputs to the exclusive-or operation are zero for any term position beyond 'L'. Thus, even though the other instances 708 of the test logic have 't' inputs, the inputs beyond 'L' (for example, the inputs labeled 714) are zero when operating in high parallelization mode and do not contribute to the analysis of any particular instance of the polynomial.

[048] Figure 8 shows the logic 800 that a memory device may implement to selectively search for errors with different degrees of parallelization. The memory device selects a power control threshold value (802) and sets a power control threshold register with the power control threshold value (804). The power control threshold value may be fixed or may change during the operation of the memory device, and may be provided by a host device, may be set or preconfigured in the memory device, or may be set or obtained in other ways.

[049] The memory device submits the next data element to the error decoding logic 310 (806). The error decoding logic 310 determines and provides to the controller 308 an error count 'e' for the data element (808). When the error count 'e' exceeds the power control threshold, the controller 308 searches for errors in the data element with a low parallelization circuit configuration (e.g., using the low parallelization circuit 404) (810). The low parallelization circuit configuration evaluates 'r' instances of an error locator test (e.g., the error locator polynomial) in parallel. On the other hand, when the error count 'e' does not exceed the power control threshold, the controller 308 searches for errors in the data element with a high parallelization circuit configuration (e.g., using the high parallelization circuit 402) (812). The high parallelization circuit configuration evaluates 's' instances of an error locator test (e.g., the error locator polynomial) in parallel, where 'r' < 's'. In general, the low parallelization circuit configuration is expected to consume more power than the high parallelization circuit configuration because, for example, the low parallelization circuit configuration has a higher gate count or complexity than the high parallelization circuit configuration.

[050] The error locator polynomial may be configured to locate as many as 't' errors in the codeword under analysis. The low parallelization circuit configuration may evaluate 't' terms in the error locator polynomial for each of the 'r' instances of the error locator polynomial. On the other hand, the high parallelization circuit configuration evaluates 's' instances of the error locator polynomial in parallel for 'L' terms in the error locator polynomial, with 'L' < 't'.

[051] The choice of 'L', the power control threshold, may be made to implement a pre-selected probability for which the high parallelization circuit configuration executes. For example, using the code design described above, setting 'L' to 88 implements a 999/1000 probability that the high parallelization circuit configuration executes. Because the high parallelization circuit configuration has an expected power saving compared to the low parallelization circuit configuration, the choice of 'L' may implement an expected pre-selected power consumption reduction.

[052] The methods, power control sections, controllers, and other logic described above may be implemented in many different ways in many different combinations of hardware, software, or both hardware and software. For example, the logic shown in Figures 3-8 may be circuitry in a controller, a microprocessor, or an application specific integrated circuit (ASIC), or may be implemented with discrete logic or a combination of other types of circuitry. The logic may be encoded or stored in a machine-readable or computer-readable medium such as a compact disc read only memory (CDROM), magnetic or optical disk, flash memory, random access memory (RAM) or read only memory (ROM), erasable programmable read only memory (EPROM), or other machine-readable medium as, for example, instructions for execution by a processor, controller, or other processing device. Similarly, the memory that stores the data elements such as the PCT may be volatile memory, such as Dynamic Random Access Memory (DRAM) or Static Random Access Memory (SRAM), or non-volatile memory such as NAND flash or other types of non-volatile memory, or may be combinations of different types of volatile and non-volatile memory. The instructions that comprise the software may be part of a single program, separate programs, implemented in an application programming interface (API), in libraries such as Dynamic Link Libraries (DLLs), or distributed across multiple memories and processors. The instructions may be included in firmware that a controller executes. For example, the firmware may be operational firmware for a memory card whose read/write operations are directed by a controller. The controller may execute the instructions to perform all or part of the techniques described above. For example, the instructions may perform the comparison of the error count 'e' and the PCT, and responsively search for errors using either the high or low parallelization circuit configuration.

[053] While various embodiments of the invention have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the invention. Accordingly, the invention is not to be restricted except in light of the attached claims and their equivalents.