Title:
A LEADING BIT ANTICIPATOR FOR FLOATING POINT MULTIPLICATION
Document Type and Number:
WIPO Patent Application WO/1999/060475
Kind Code:
A1
Abstract:
A floating point multiplier unit (200) with a leading bit anticipator (240) for predicting the leading non-zero bit of the sum of carry and sum terms, the leading bit anticipator comprising an array of logic gates to provide a binary tuple indicative of the logical OR of the carry and sum terms.

Inventors:
VIJAYRAO NARSING K (US)
KUMAR SUDARSHAN (US)
Application Number:
PCT/US1999/008050
Publication Date:
November 25, 1999
Filing Date:
April 08, 1999
Assignee:
INTEL CORP (US)
VIJAYRAO NARSING K (US)
KUMAR SUDARSHAN (US)
International Classes:
G06F5/01; G06F7/74; (IPC1-7): G06F7/00; G06F7/38
Foreign References:
US5530663A, 1996-06-25
US5889690A, 1999-03-30
US5771183A, 1998-06-23
US5790444A, 1998-08-04
US5493520A, 1996-02-20
Attorney, Agent or Firm:
Cortes, Roland B. (Sokoloff Taylor & Zafman LLP 7th floor 12400 Wilshire Boulevard Los Angeles, CA, US)
Taylor, Edwin H. (Sokoloff Taylor & Zafman LLP 7th floor 12400 Wilshire Boulevard Los Angeles, CA, US)
Claims:
What is claimed is:
1. A circuit to predict the position of the leading nonzero bit of C + S, where C and S are binary tuples, the circuit comprising at least one logic gate to provide a binary tuple X, where X is indicative of C OR S.
2. The circuit as set forth in claim 1, further comprising a priority encoder circuit responsive to X to provide a binary tuple indicative of the position of the leading nonzero bit of C OR S.
3. A circuit to predict the position of the leading nonzero bit of C + S, where C and S are binary n-tuples, the circuit comprising: at least one logic gate to provide a binary n-tuple X, with ith component given by [X]_i = ([C]_i OR [S]_i), i = 0, 1, ..., n-1; and a priority encoder circuit responsive to X to provide a binary tuple indicative of the position of the leading nonzero bit of X.
4. A circuit to predict the position of the leading nonzero bit of C + S, where C and S are binary n-tuples, the circuit comprising: at least one logic gate to provide a binary n-tuple X, with ith component given by [X]_i = ([C]_i NOR [S]_i), i = 0, 1, ..., n-1; and a priority encoder circuit responsive to X to provide a binary tuple indicative of the position of the leading zero bit of X.
5. A circuit to predict the position of the leading nonzero bit of C + S, where C and S are binary tuples, the circuit comprising: at least one logic gate to provide a binary tuple X, where X is indicative of C OR S; and a priority encoder circuit responsive to X to provide a binary tuple indicative of the position of the leading nonzero bit of C OR S.
6. A floating point unit comprising: a shift unit to shift a binary tuple P; and an anticipator unit responsive to binary tuples C and S to provide the shift unit a shift signal indicative of the position of the leading nonzero bit of C OR S.
7. The floating point unit as set forth in claim 6, wherein the shift unit is responsive to the shift signal so as to shift P by m bits, wherein m is such that C OR S shifted by m bits has a leading one.
8. The floating point unit as set forth in claim 7, wherein the anticipator unit further comprises: at least one logic gate to provide a binary tuple X indicative of C OR S; and a priority encoder responsive to X to provide the shift signal.
9. The floating point unit as set forth in claim 6, further comprising an adder to provide P, where P = C + S.
10. The floating point unit as set forth in claim 7, further comprising an adder to provide P, where P = C + S.
11. The floating point unit as set forth in claim 8, further comprising an adder to provide P, where P = C + S.
12. The floating point unit as set forth in claim 9, further comprising a carry-save adder unit to provide the binary tuples C and S, where C represents carry terms and S represents sum terms.
13. The floating point unit as set forth in claim 10, further comprising a carry-save adder unit to provide the binary tuples C and S, where C represents carry terms and S represents sum terms.
14. The floating point unit as set forth in claim 11, further comprising a carry-save adder unit to provide the binary tuples C and S, where C represents carry terms and S represents sum terms.
15. A method to predict the position of the leading nonzero bit of C + S, where C and S are binary n-tuples, the method comprising: performing n Boolean binary operations, one on each pair of bits {[C]_i, [S]_i}, i = 0, 1, ..., n-1, to provide an n-tuple X indicative of C OR S; and providing, based upon X, a shift signal indicative of the leading nonzero bit position of C OR S, wherein the leading nonzero bit position of C OR S is the predicted position of the leading nonzero bit of C + S.
16. A method for normalization in a floating point unit, the method comprising: providing carry and sum tuples C and S; adding C and S to provide a binary tuple P, where P = C + S; performing Boolean binary operations on the components of C and S to provide a binary tuple X indicative of C OR S; and shifting, based upon X, the binary tuple P.
17. The method as set forth in claim 16, wherein in shifting the binary tuple P, P is shifted m bits, wherein m is such that C OR S shifted by m bits has a leading one.
Description:
A Leading Bit Anticipator for Floating Point Multiplication

Field of Invention

The present invention relates to floating point multiplier and add units in microprocessors, and more particularly, to floating point multiplier and add units with leading bit anticipators.

Background

In many floating point multiplier units in which two floating point numbers are to be multiplied, the mantissas are multiplied together and normalized, where the normalization involves shifting until the final mantissa has a leading non-zero bit. To speed up the floating point multiplication, it is useful to predict the amount of shifting necessary, so that a shift circuit can be configured while the product is still being computed. Add units can also benefit from predicting the leading non-zero bit of the sum. Some execution units can perform both multiplication and addition.

It is desirable for leading non-zero bit prediction to be fast and simple to implement so that the shift circuit can be quickly configured.

Brief Description of the Drawings

Fig. 1 is a high-level diagram of a microprocessor with a floating point multiply unit.

Fig. 2 is a high-level diagram of a portion of a floating point multiply unit with a leading non-zero bit anticipator.

Fig. 3 is an embodiment of a leading non-zero bit anticipator.

Detailed Description of Embodiments

Fig. 1 is a high-level diagram of microprocessor 100 with floating point multiply functional unit 110. Registers 120 and 140 hold two floating point numbers a and b to be multiplied together, where A and B denote their mantissas, respectively, in registers 150 and 160. The product p = ab may be computed by obtaining the product of the mantissas P = AB in register 130, setting register 130 to shift P so that its leading bit is 1 (i.e., normalization), and properly computing the exponent of p based upon the exponents of a and b as well as the number of bit shifts applied to P.
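The multiply-then-normalize flow described above can be sketched in software. The following Python fragment is only an illustration under stated assumptions, not the circuit of Fig. 1: the function name fp_multiply, the toy number format (value = mantissa * 2**exponent with unsigned integer mantissas), the default width, and rounding by truncation are all hypothetical choices made for the example.

```python
def fp_multiply(mant_a, exp_a, mant_b, exp_b, width=8):
    """Multiply two values of the form mant * 2**exp (hypothetical toy format).

    The result mantissa is normalized so that bit width-1 is its leading 1;
    low-order bits that do not fit are truncated (an assumption of this sketch).
    """
    product = mant_a * mant_b          # P = AB, up to 2*width bits wide
    exponent = exp_a + exp_b           # exponents add before normalization
    if product == 0:
        return 0, 0
    # Number of positions the leading 1 sits above (or below) bit width-1.
    shift = product.bit_length() - width
    if shift > 0:
        product >>= shift              # drop low-order bits (truncation)
    else:
        product <<= -shift             # move the leading 1 up to bit width-1
    exponent += shift                  # compensate the exponent for the shift
    return product, exponent
```

For instance, 1.5 and 1.25 can be written as 192 * 2**-7 and 160 * 2**-7; the sketch returns a mantissa of 240 with exponent -7, which is 1.875.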

In Fig. 2, functional unit 200 is a high-level diagram of a portion of floating point multiply functional unit 110. In the particular embodiment of Fig. 2, the product P of the mantissas A and B is obtained by first having carry-save adder (CSA) 210 produce carry terms C and sum terms S, where C and S are binary tuples related to P by P = C + S. This sum is performed by full adder functional unit 220. In the particular embodiment of Fig. 2, the carry and sum terms are 128 bits wide, so that the product P obtained from adding the carry and sum terms is also 128 bits wide. Other embodiments will have different word sizes.
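As a rough illustration of how a product can be held as carry and sum terms with P = C + S, the Python sketch below reduces the partial products of A * B with a 3:2 compressor. The names csa and multiply_carry_save are hypothetical, and the sequential reduction is an assumption chosen for brevity; the description does not specify the internal structure of CSA 210, and hardware multipliers commonly use a tree of compressors instead.

```python
def csa(a, b, c):
    """3:2 compressor: returns (carry, sum) such that carry + sum == a + b + c."""
    sum_bits = a ^ b ^ c                            # per-bit sum without carries
    carry_bits = ((a & b) | (a & c) | (b & c)) << 1  # per-bit majority, weight 2
    return carry_bits, sum_bits


def multiply_carry_save(a, b, width):
    """Reduce the partial products of a * b to carry/sum terms (C, S).

    `b` is treated as a `width`-bit multiplier; no carry is ever propagated
    along the word, so the final addition C + S is left to a separate adder.
    """
    carry, total = 0, 0
    for i in range(width):
        if (b >> i) & 1:                           # partial product a * 2**i
            carry, total = csa(carry, total, a << i)
    return carry, total
```

With a = 13, b = 11 and width = 4, the returned pair satisfies C + S = 143, the same value a conventional multiply would give.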

To speed up the multiplication of floating point numbers, it is desirable to set up shift register functional unit 130 to properly shift the output of full adder 220 before P is finally computed. In this way, shift register 130 will be ready to shift P when it is available from full adder 220. It is therefore desirable to anticipate, or predict, the position of the leading non-zero bit of P based only upon the carry and sum terms.

This prediction function is performed by leading bit anticipator (LZA) 240. As described below, LZA 240 does not always predict the position of the leading non-zero bit exactly; however, it mispredicts by at most one position. The final result of shift register 130 after shifting, denoted by P', will therefore be, to within one bit shift, the desired mantissa of the product p. Depending upon P, a final bit shift of P' may be required, but this is not time consuming. This final bit shift is not shown in Fig. 2.
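The shift-then-correct behavior described above can be modeled in a few lines of Python. This is a sketch under assumptions that are not in the description: the function name normalize_with_prediction is hypothetical, C + S is assumed to fit in width bits, and the shifter is assumed to have one guard bit above position width-1 so that an overshoot by one position can be detected and undone.

```python
def normalize_with_prediction(c, s, width):
    """Return (mantissa, total_left_shift) for P = c + s.

    Assumes c and s are nonnegative, not both zero, and that c + s fits in
    `width` bits. The shift amount is derived from c | s only, so it could be
    set up before the sum is available.
    """
    predicted_pos = (c | s).bit_length() - 1   # anticipator output
    shift = (width - 1) - predicted_pos        # shifter configured in advance
    p_shifted = (c + s) << shift               # P' in the description
    if p_shifted >> width:                     # leading 1 landed in the guard bit
        p_shifted >>= 1                        # final one-bit correction
        shift -= 1
    return p_shifted, shift
```

With width = 8, the inputs c = 4, s = 1 need no correction and give (160, 5), while c = 5, s = 3 overshoot by one position and are corrected to (128, 4).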

Fig. 3 provides a high-level diagram of an embodiment of LZA 240. The Boolean binary operation OR is applied to each pair of bits of C and S. Functional unit 310 represents an array of OR gates, each OR gate applying an OR binary operation to a pair of bits from C and S. We denote this operation on C and S by C OR S, where C OR S is a binary tuple (word) having the same length as C and S and with ith component given by [C OR S]_i = [C]_i OR [S]_i. The predicted leading non-zero bit position of the product P is the position of the leading non-zero bit of C OR S.

Priority encoder 320 asserts one of its output lines 330 corresponding to the leading non-zero bit position of C OR S. Output lines 330 provide a binary tuple indicative of the leading non-zero bit position of C OR S, and are coupled to shift register 130 to provide this prediction information so that shift register 130 can be set up to shift P before it is computed by full adder 220.
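A bit-level model of the OR array and the priority encoder may help make the data flow concrete. In the Python sketch below, words are lists of bits with the most significant bit first, and the function names or_array and priority_encoder are hypothetical; the one-hot output list stands in for output lines 330, and an all-zero output for an all-zero input is an assumption of the sketch.

```python
def or_array(c_bits, s_bits):
    """Apply OR to each pair of bits; inputs are equal-length lists, MSB first."""
    return [c | s for c, s in zip(c_bits, s_bits)]


def priority_encoder(x_bits):
    """Assert exactly one output line, at the leading nonzero bit of x_bits.

    Returns a one-hot list of the same length (MSB first). If x_bits is all
    zeros, no line is asserted (an assumption of this sketch).
    """
    lines = [0] * len(x_bits)
    for i, bit in enumerate(x_bits):
        if bit:
            lines[i] = 1       # highest-priority (most significant) 1 wins
            break
    return lines
```

For example, priority_encoder(or_array([0, 0, 1, 0], [0, 1, 0, 1])) asserts only the second line from the left, since C OR S is 0111.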

As an example, let S = (0001000101) and C = (0000101101). The sum of S and C is (0001110010). Applying the Boolean binary operation OR to each pair of bits from C and S yields C OR S = (0001101101). For this example, the leading non-zero bit of C OR S is in the sixth position (where the least significant bit is the zeroth position), and the leading non-zero bit of C + S is predicted correctly. As an example of a misprediction, let S = (0001100101) and C = (0000101101). For this example, C + S = (0010010010) and C OR S = (0001101101), and it is seen that the leading non-zero bit of C OR S, in the sixth position, mispredicts the leading non-zero bit of C + S, in the seventh position, by one position.
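The two examples can be checked directly, and the one-position bound can be tested exhaustively for small word sizes. The bound follows from a short argument: if the leading non-zero bit of C OR S is at position k, then 2^k <= C OR S <= C + S, and since both C and S are below 2^(k+1), C + S < 2^(k+2). The Python sketch below uses hypothetical helper names (leading_bit, worst_error) and unbounded integers, so it assumes the sum is not truncated to the word width.

```python
def leading_bit(x):
    """Position of the leading nonzero bit of x, with the LSB at position 0."""
    return x.bit_length() - 1


# The two examples from the text, written as 10-bit words.
S1, C1 = 0b0001000101, 0b0000101101   # predicted correctly
S2, C2 = 0b0001100101, 0b0000101101   # mispredicted by one position

for c, s in ((C1, S1), (C2, S2)):
    print(f"C+S={c + s:010b} C OR S={c | s:010b} "
          f"actual={leading_bit(c + s)} predicted={leading_bit(c | s)}")


def worst_error(width):
    """Largest gap between actual and predicted positions over all inputs."""
    worst = 0
    for c in range(1 << width):
        for s in range(1 << width):
            if c | s:   # skip the all-zero case, which has no leading bit
                worst = max(worst, leading_bit(c + s) - leading_bit(c | s))
    return worst
```

The loop prints actual=6 predicted=6 for the first pair and actual=7 predicted=6 for the second, matching the text, and worst_error(5) evaluates to 1.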

In some situations, such as denormalized numbers, the product P discussed earlier is not shifted to have a leading non-zero bit. However, the embodiments disclosed herein have utility for denormalized numbers because the position of the leading non-zero bit is still useful for determining the amount of shifting necessary for denormalized numbers.

The embodiments disclosed herein were in reference to a floating point multiplication unit. However, it is to be appreciated that the invention claimed below is not limited to only multiplication units. Embodiments of the present invention are also applicable to addition units, and other kinds of execution units employing combinations of multiplication and addition. Consequently, the term floating point unit encompasses a floating point multiplication unit, a floating point addition unit, or a combination thereof.

Various modifications may be made to the embodiment described above without departing from the scope of the invention as claimed below. For example, it is immaterial whether priority encoder 320 is included within LZA 240 or not. As another example, OR gates 310 may be replaced with NOR gates, in which case priority encoder 320 is modified to provide signals on output 330 indicative of the position of the leading zero bit of C NOR S.
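The NOR variant can be sketched the same way. In the Python fragment below, nor_word and leading_zero_position are hypothetical names, and words are modeled as integers masked to a fixed width, which is an assumption of the sketch. Because C NOR S is the bitwise complement of C OR S, searching for its leading zero bit yields the same position as searching for the leading non-zero bit of C OR S.

```python
def nor_word(c, s, width):
    """Bitwise NOR of two `width`-bit words."""
    mask = (1 << width) - 1
    return ~(c | s) & mask        # complement of C OR S, kept to `width` bits


def leading_zero_position(x, width):
    """Position of the most significant zero bit of x (LSB is position 0).

    Returns -1 if every bit of x is 1, i.e. when C OR S is all zeros.
    """
    for pos in range(width - 1, -1, -1):
        if not (x >> pos) & 1:
            return pos
    return -1
```

For C = 0000101101 and S = 0001000101, leading_zero_position(nor_word(C, S, 10), 10) gives 6, the same position the OR-based version reports.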