Title:
FUNCTION ACCELERATOR
Document Type and Number:
WIPO Patent Application WO/2015/006577
Kind Code:
A1
Abstract:
In described examples, a processor includes a function accelerator unit (104), which is configured to evaluate a mathematical function. The function accelerator unit (104) includes a coefficient generator (202), which is configured to generate coefficients for a polynomial evaluated to produce a solution to the function. The coefficient generator (202) varies values of the coefficients based on an input value at which the function is to be evaluated. Also, the function accelerator unit (104) includes a polynomial evaluator (204), which is configured to apply the coefficients provided by the coefficient generator (202) to evaluate the polynomial at the input value.

Inventors:
DIEWALD HORST (DE)
ZIPPERER JOHANN (DE)
Application Number:
PCT/US2014/046178
Publication Date:
January 15, 2015
Filing Date:
July 10, 2014
Assignee:
TEXAS INSTRUMENTS INC (US)
TEXAS INSTRUMENTS DEUTSCHLAND (DE)
TEXAS INSTRUMENTS JAPAN (JP)
International Classes:
G06F1/02; G06F7/544; G06F17/10
Domestic Patent References:
WO2013095463A1    2013-06-27
Foreign References:
CN103176948A    2013-06-26
US20100198895A1    2010-08-05
US20010007990A1    2001-07-12
Attorney, Agent or Firm:
DAVIS, Michael, A., Jr. et al. (International Patent Manager, P.O. Box 655474, Mail Station 399, Dallas TX, US)
Claims:
CLAIMS

What is claimed is:

1. A processor, comprising:

a function accelerator unit configured to evaluate a mathematical function, and including: a coefficient generator configured to generate coefficients for a polynomial evaluated to produce a solution to the function, wherein the coefficient generator varies values of the coefficients based on an input value at which the function is to be evaluated; and a polynomial evaluator configured to apply the coefficients provided by the coefficient generator to evaluate the polynomial at the input value.

2. The processor of claim 1, wherein the coefficient generator is configured to determine a number of coefficients to be applied in the polynomial, and wherein the coefficient generator varies the number of coefficients based on the input value at which the function is to be evaluated.

3. The processor of claim 1, wherein the coefficient generator is configured to generate the coefficients based on a value of a predetermined number of most significant bits of the input value.

4. The processor of claim 1, wherein the coefficient generator includes a plurality of cascaded coefficient tables, each of which is configured to generate coefficients based on a different set of bits of the input value.

5. The processor of claim 1, wherein the coefficient generator is configured to generate a plurality of sets of different coefficients with respect to the function, and each of the sets of coefficients is applicable to a different range of the input value.

6. The processor of claim 5, wherein range size to which a given one of the sets of coefficients is applicable is different from range size to which a different one of the sets of coefficients is applicable.

7. The processor of claim 1, wherein the coefficient generator is configured to provide a scaling factor for use with a coefficient, and the polynomial evaluator is configured to apply the scaling factor in conjunction with the coefficient to evaluate the polynomial.

8. The processor of claim 1, wherein the coefficient generator includes a plurality of coefficient tables for use in evaluating a function, and the coefficient generator is configured to select which of the coefficient tables to use for coefficient generation based on a control signal provided to the coefficient generator.

9. The processor of claim 1, wherein: the function accelerator unit is configured to solve a plurality of different functions; and, for a given function, the coefficient generator is configured to generate, based on the input value, coefficients applicable to a different function to solve the given function.

10. The processor of claim 9, wherein the polynomial evaluator is configured to: produce a result of evaluation of the polynomial for the different function; and further process the result to produce a result for the given function.

11. The processor of claim 1, wherein the polynomial evaluator is configured to: determine, based on the function, which terms of the polynomial are to be computed; and skip, based on the function, computation of at least one term of the polynomial.

12. The processor of claim 1, wherein the coefficient values, input value range per coefficient set, number of coefficient sets, number of coefficients applied, and scaling factor for scaling a coefficient are programmable at run-time of the processor.

13. A method of accelerating function processing, the method comprising:

providing, to a hardware accelerator, a designation of a function to be evaluated, and an operand value at which the function is to be evaluated;

generating, by the hardware accelerator, coefficients for a polynomial to be evaluated to produce a solution to the function, and varying values of the coefficients based on the operand value; and

applying the coefficients to evaluate the polynomial at the operand value.

14. The method of claim 13, further comprising:

determining, by the hardware accelerator, a number of coefficients to be applied in the polynomial; and

varying the number of coefficients based on the operand value at which the function is to be evaluated.

15. The method of claim 13, further comprising generating the coefficients based on the value of a predetermined number of most significant bits of the operand value.

16. The method of claim 13, further comprising selecting from a plurality of sets of different coefficients with respect to the function, wherein each of the sets of coefficients is applicable to a different range of the operand value, and wherein at least some of the different ranges of the operand value are optionally of different size.

17. The method of claim 13, further comprising:

generating a scaling factor for use with a coefficient; and

applying the scaling factor in conjunction with the coefficient to evaluate the polynomial.

18. The method of claim 13, wherein generating the coefficients includes selecting the coefficients based on selection of one of minimization of energy use and result accuracy as a computational goal.

19. The method of claim 13, further comprising:

generating, for a given function, based on the operand value, coefficients applicable to a different function to solve the given function; and

producing a result of evaluation of the polynomial for the different function; and processing the result to produce a result for the given function.

20. The method of claim 13, further comprising:

determining, based on the function, which terms of the polynomial are to be computed; and skipping, based on the function, computation of at least one term of the polynomial.

21. A function acceleration circuit, comprising:

a coefficient generator configured to: generate coefficients for a polynomial to be evaluated to produce a solution to a function, wherein the coefficient generator varies values of the coefficients based on an input value at which the function is to be evaluated; determine a number of coefficients to be applied in the polynomial, wherein the coefficient generator varies the number of coefficients based on the input value at which the function is to be evaluated; and provide a scaling factor for use with at least one of the coefficients; and

a polynomial evaluator configured to: determine, based on the function, which terms of the polynomial are to be computed and which terms, between terms to be computed, are to be omitted; and apply the coefficients and the scaling factor provided by the coefficient generator to evaluate the polynomial at the input value.

22. The circuit of claim 21, wherein the coefficient generator is configured to generate a plurality of sets of different coefficients with respect to the function, and each of the sets of coefficients is applicable to a different range of the input value; and wherein range size to which a given one of the sets of coefficients is applicable is different from range size to which a different one of the sets of coefficients is applicable.

23. The circuit of claim 21, wherein the coefficient generator includes a plurality of coefficient tables for use in evaluating a function, and the coefficient generator is configured to select which of the coefficient tables to use for coefficient generation based on whether energy efficiency or result accuracy is selected.

24. The circuit of claim 21, wherein the coefficient generator is configured to generate for a given function, based on the input value, coefficients applicable to a different function; and wherein the polynomial evaluator is configured to: produce a result of evaluation of the polynomial for the different function; and further process the result to produce a result for the given function.

Description:
FUNCTION ACCELERATOR

[0001] This relates in general to information handling systems, and in particular to function acceleration.

BACKGROUND

[0002] Many computer applications require the evaluation of mathematical functions, such as trigonometric functions, exponential functions, and root functions. Evaluation of such mathematical functions is typically provided via a library of software routines executed by a processor.

SUMMARY

[0003] In described examples, a processor includes a function accelerator unit, which is configured to evaluate a mathematical function. The function accelerator unit includes a coefficient generator, which is configured to generate coefficients for a polynomial evaluated to produce a solution to the function. The coefficient generator varies values of the coefficients based on an input value at which the function is to be evaluated. Also, the function accelerator unit includes a polynomial evaluator, which is configured to apply the coefficients provided by the coefficient generator to evaluate the polynomial at the input value.

[0004] A method of accelerating function processing includes providing, to a hardware accelerator, a designation of a function to be evaluated, and an operand value at which the function is to be evaluated. The hardware accelerator generates coefficients for a polynomial to be evaluated to produce a solution to the function. Values of the coefficients are varied based on the operand value. The coefficients are applied to evaluate the polynomial at the operand value.

[0005] A function acceleration circuit includes a coefficient generator, which is configured to generate coefficients for a polynomial to be evaluated to produce a solution to a function. The coefficient generator varies values of the coefficients based on an input value at which the function is to be evaluated. The coefficient generator is also configured to determine a number of coefficients to be applied in the polynomial. The coefficient generator varies the number of coefficients based on the input value at which the function is to be evaluated. The coefficient generator is further configured to provide a scaling factor for use with at least one of the coefficients. Also, the function acceleration circuit includes a polynomial evaluator. The polynomial evaluator is configured to: determine, based on the function, which terms of the polynomial are to be computed and which terms, between terms to be computed, are to be omitted; and apply the coefficients and the scaling factor provided by the coefficient generator to evaluate the polynomial at the input value.

BRIEF DESCRIPTION OF THE DRAWINGS

[0006] FIG. 1 is a block diagram of a processor of example embodiments.

[0007] FIG. 2 is a block diagram of a function accelerator of the processor of FIG. 1.

[0008] FIG. 3 is a block diagram of a coefficient generator and polynomial evaluator of the function accelerator of FIG. 2.

[0009] FIG. 4 is a block diagram of coefficient tables of the coefficient generator of FIG. 3.

[0010] FIG. 5 is a block diagram of various fields of a function input value in the function accelerator of FIG. 2.

[0011] FIG. 6 is a flow diagram of an operation of the function accelerator of FIG. 2.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

[0012] Processors generally include an arithmetic unit that provides addition and subtraction of integer values. Many processors also include multipliers capable of integer multiplication. Processor architectures directed to more math intensive processing may support floating point and/or fixed point numeric formats, along with integer formats. Evaluation of complex functions (such as trigonometric functions, exponential functions, logarithmic functions, and roots) is generally performed via execution of software routines that apply the adder and/or multiplier of the processor as needed to evaluate the functions. Unfortunately, function evaluation via software can be slow and power inefficient.

[0013] Example embodiments include a function acceleration unit that reduces time and/or energy for estimating complex functions. The function acceleration unit employs polynomial estimation of the function, and determines the values of the coefficients of the polynomial, the number of coefficients to be applied, and other computational parameters based on the value of the operand at which the function is to be evaluated. Accordingly, embodiments may apply a first set of coefficients if the operand value falls within a first range, and a second set of coefficients if the operand value falls within a second range. Embodiments may support any number of such ranges, and the ranges may be of different sizes. By varying the coefficient number and values based on the operand value, the function acceleration unit is able to reduce time and power for generating a result, without loss of accuracy relative to conventional systems. Alternatively, embodiments may produce a more accurate result without increase in time and power relative to conventional solutions.

[0014] FIG. 1 is a block diagram of a processor 100 of example embodiments. The processor 100 executes instructions retrieved from a computer readable storage device, such as a volatile or non-volatile memory device. The instructions may include a complex instruction that directs the processor 100 to evaluate a function, such as a trigonometric, exponential, logarithmic, root, or other complex function.

[0015] The processor 100 includes an execution unit 102 and a function accelerator 104. The execution unit 102 may include an arithmetic logic unit, shifter, multiplier and/or other data manipulation circuitry applied in instruction execution. Embodiments of the processor 100 may include more than one execution unit. The function accelerator 104 is coupled to the execution unit 102. The function accelerator 104 applies polynomial evaluation to estimate a specified function at a designated input value.

[0016] The function accelerator 104 provides improved function evaluation efficiency by selecting the number and values of coefficients applied in the polynomial based on the input value. Accordingly, the function accelerator 104 may apply different numbers of coefficients and/or coefficient values in different ranges of the function, where the number and/or values of the coefficients are optimized for each range. In some embodiments, the function accelerator 104 may execute complex instructions that specify the function to be evaluated.

[0017] In some embodiments, the execution unit 102 and the function accelerator 104 may be part of and embodied in a single processor core. In other embodiments, the execution unit 102 is part of a processor core, and the function accelerator 104 is separate from the processor core.

[0018] The bus interface 106 connects the execution unit 102 and, in some embodiments, the function accelerator 104 to other components of the processor 100 and/or to components external to the processor 100 via a communication structure, such as a data and address bus. In some embodiments, the function accelerator 104 may be coupled to the execution unit 102 via the bus interface 106.

[0019] The processor resources 108 include peripheral devices, such as memories, input/output ports, timers, and communication subsystems, that the execution unit 102 (and, in some embodiments, the function accelerator 104) accesses via the bus interface 106.

[0020] FIG. 2 is a block diagram of the function accelerator 104. The function accelerator 104 includes a coefficient generator 202, a polynomial evaluator 204, and registers 210. The registers 210 are coupled to the coefficient generator 202 and the polynomial evaluator 204, and provide storage for coefficients, input values, and results of function and polynomial evaluation.

[0021] The coefficient generator 202 provides the coefficients of the polynomial to be evaluated to estimate the value of the function. The coefficient generator 202 includes one or more coefficient tables 206. The coefficient tables 206 may store coefficient values or may produce coefficient values by operation of logic (such as combinatorially). The coefficient generator 202 produces coefficients for the polynomial based on the function to be evaluated, and the function input value. Accordingly, the coefficient tables 206 may include one or more tables corresponding to each function that can be evaluated by the function accelerator 104. The coefficient tables 206 may include volatile and/or non-volatile coefficient storage (such as registers, random access memory, FLASH memory, and read only memory), and coefficient values may be programmed into the tables 206 by execution of the processor 100 (at run-time) or at manufacture of the processor 100.

[0022] The coefficient generator 202 partitions the range of input values of a function into a plurality of sub-ranges, and may generate different values for each coefficient in each sub-range. For example, for a given function, the coefficient generator may generate a first set of coefficient values for an input value in a first sub-range of the function, and generate different coefficient values for an input value in a second sub-range of the function. The number of input values encompassed by the first sub-range may differ from the number of input values encompassed by the second sub-range. The size of each sub-range may be selected in accordance with the coefficients applied to estimate the function in the sub-range.
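The partitioning described above can be illustrated in software. The following is a minimal sketch, not the described hardware: it assumes the exponential function on the interval [0, 1), three hypothetical sub-ranges of unequal width (the names BOUNDS, CENTERS and COEFFS are illustrative), and Taylor coefficients about each sub-range midpoint as a stand-in for whatever coefficient values an embodiment would actually store.

```python
import bisect
import math

# Hypothetical sub-range boundaries for exp(x) on [0, 1); widths differ on purpose.
BOUNDS = [0.0, 0.5, 0.75, 1.0]


def taylor_exp_coeffs(center, n):
    """Return n Taylor coefficients of exp about `center`, in powers of (x - center)."""
    return [math.exp(center) / math.factorial(k) for k in range(n)]


# One coefficient set per sub-range, expanded about the sub-range midpoint.
CENTERS = [(lo + hi) / 2 for lo, hi in zip(BOUNDS, BOUNDS[1:])]
COEFFS = [taylor_exp_coeffs(c, 4) for c in CENTERS]


def approx_exp(x):
    """Estimate exp(x) using the coefficient set of the sub-range containing x."""
    if not BOUNDS[0] <= x < BOUNDS[-1]:
        raise ValueError("input outside the supported range")
    idx = bisect.bisect_right(BOUNDS, x) - 1  # index of the sub-range containing x
    dx = x - CENTERS[idx]
    return sum(c * dx**k for k, c in enumerate(COEFFS[idx]))
```

With these assumed sub-ranges, the widest one (0 to 0.5) limits the accuracy; narrowing it, or giving it more coefficients, would tighten the estimate there.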

[0023] Similarly, the coefficient generator 202 may generate a different number of coefficients for each range, or at least some ranges, of the function. For example, in ranges where the function is more linear, the coefficient generator 202 may generate few coefficients. In more non-linear ranges of the function, the coefficient generator 202 may generate more coefficients. Accordingly, the coefficient generator 202 subdivides the range of the function into a number of sub-ranges suitable to estimate the function, while providing (for each sub-range) a number and value of coefficients selected to estimate the function in the sub-range. The number and/or value of the coefficients may be adaptively selected based on, for example, accuracy and/or energy constraints.
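As a rough software illustration of varying the number of coefficients per sub-range, the sketch below assumes the sine function on [0, 1.6) and a hypothetical table (TERM_TABLE) that assigns fewer Maclaurin series terms where sine is nearly linear and more terms where it curves. The boundaries and term counts are illustrative choices, not values taken from the described embodiments.

```python
import math

# Hypothetical table: (exclusive upper bound of sub-range, number of series terms).
TERM_TABLE = [(0.25, 2), (0.8, 3), (1.6, 5)]


def num_terms(x):
    """Return how many series terms to apply for input x."""
    if x < 0:
        raise ValueError("input outside the supported range")
    for upper, n in TERM_TABLE:
        if x < upper:
            return n
    raise ValueError("input outside the supported range")


def approx_sin(x):
    """Estimate sin(x) with a term count chosen from the sub-range of x."""
    n = num_terms(x)
    # Maclaurin series: x - x^3/3! + x^5/5! - ...
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n))
```

Near zero, two terms already track sine closely, so the additional multiplications are spent only in the more curved part of the range.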

[0024] The polynomial evaluator 204 receives the coefficients provided by the coefficient generator 202, and applies the coefficients to estimate the function at the input value. The polynomial evaluator 204 includes control logic 208. The control logic 208 sequences the polynomial evaluator 204 through the arithmetic operations (such as multiplications and additions) applied to estimate the function. The polynomial evaluator 204 may include adders, multipliers, shifters and other computational logic needed to evaluate the polynomial. In some embodiments, the polynomial evaluator 204 may apply computational logic embodied in other execution units of the processor 100 to compute the polynomial result.

[0025] In some embodiments, the polynomial evaluator 204 may employ fractional arithmetic (such as fixed point processing) to evaluate the polynomial. The input value and result of function evaluation may be provided in other numeric formats (such as floating point format), and the polynomial evaluator 204 may provide conversion between numeric formats as needed. Some embodiments of the polynomial evaluator may employ floating point computation.

[0026] For symmetrical functions, such as sine and cosine, the polynomial evaluator 204 may adjust the input value to allow for evaluation of the function in a predetermined sub-range. For example, input values for trigonometric functions may be adjusted to fall in a sub-range of 0 to π/2, and the result of evaluation correspondingly adjusted to produce a result in accordance with the input value. In some embodiments, the input values for trigonometric functions (such as sine or cosine) may be restricted, by adjustment operations in the function accelerator 104, to a range of 0 to π/4, and polynomial evaluation in the range of 0 to π/4 may provide more accurate results than evaluation over 0 to π/2. For example, requests to evaluate the sine function may apply a sine approximation polynomial between 0 and π/4 and may apply a cosine approximation polynomial in the range of 0 to π/4 to evaluate sine function request input values falling between π/4 and π/2. The polynomial evaluator 204 may also scale the input value to ensure that the input value falls in a magnitude range suitable for accurate computation.
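The range restriction for sine can be sketched as follows. This is an illustrative software model only: it assumes inputs already adjusted to the first quadrant, uses truncated Maclaurin polynomials as stand-ins for the approximation polynomials, and relies on the identity sin(x) = cos(π/2 - x) to map the upper half of the quadrant onto a cosine polynomial evaluated between 0 and π/4. The function names are hypothetical.

```python
import math


def sin_poly(x):
    """Truncated Maclaurin sine, x - x^3/6 + x^5/120 - x^7/5040, in nested form."""
    x2 = x * x
    return x * (1 - x2 / 6 * (1 - x2 / 20 * (1 - x2 / 42)))


def cos_poly(x):
    """Truncated Maclaurin cosine, 1 - x^2/2 + x^4/24 - x^6/720, in nested form."""
    x2 = x * x
    return 1 - x2 / 2 * (1 - x2 / 12 * (1 - x2 / 30))


def sin_first_quadrant(x):
    """Estimate sin(x) for 0 <= x <= pi/2 with polynomials used only on [0, pi/4]."""
    if not 0 <= x <= math.pi / 2:
        raise ValueError("input must already be adjusted to the first quadrant")
    if x <= math.pi / 4:
        return sin_poly(x)
    # sin(x) = cos(pi/2 - x), and pi/2 - x lies in [0, pi/4) here.
    return cos_poly(math.pi / 2 - x)
```

Because neither polynomial is ever evaluated beyond π/4, the truncation error stays small with relatively few terms.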

[0027] For specified values of a function, the polynomial evaluator 204 may store result values to be provided, instead of computing the result. For example, result values for trigonometric functions at particular input values (such as 0 and particular fractions of π) may be provided from storage, instead of being computed.

[0028] The control logic 208 may include state machines that provide the polynomial sequencing. For example, the control logic 208 may include a state machine for sequencing of each different polynomial applied to estimate a function. The state machines may specify which terms of a polynomial are applied. One polynomial state machine may apply odd numbered terms, another may apply even numbered terms, and yet another may apply terms as specified. In some embodiments, the control logic 208 may sequence polynomial evaluation in accordance with Horner's method. Polynomial sequencing (such as via state machine), in addition to other control functions of the logic 208, may be programmed at run-time or at manufacture of the processor 100.
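Horner's method and the skipping of odd or even terms can be sketched in a few lines. The helper names below (horner, horner_odd, horner_even) are hypothetical, and coefficients are assumed to be listed in ascending order of power. A polynomial with only odd terms is evaluated as x times a polynomial in x squared, so the omitted even terms cost nothing.

```python
def horner(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) using Horner's method."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def horner_odd(odd_coeffs, x):
    """Evaluate c0*x + c1*x^3 + c2*x^5 + ... without touching even terms."""
    return x * horner(odd_coeffs, x * x)


def horner_even(even_coeffs, x):
    """Evaluate c0 + c1*x^2 + c2*x^4 + ... without touching odd terms."""
    return horner(even_coeffs, x * x)
```

For instance, horner_odd([1.0, 2.0], 3.0) computes 3 + 2·27 and gives the same value as horner([0, 1, 0, 2], 3.0), with fewer multiply-add steps.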

[0029] In addition to sequencing computation of the polynomial, the control logic 208 may select which polynomial is to be applied to evaluate the function. In some cases, to evaluate a given function, the control logic 208 may select a polynomial generally applied to evaluate a different function. For example, to evaluate a sine function at an input value in a predetermined sub-range (such as π/4 to π/2), the control logic 208 may select to evaluate a cosine function and further process (square and subtract from one) the result of cosine evaluation to produce the sine function result. Accordingly, the control logic 208 may select a polynomial to evaluate based on the requested function and the function input value. The control logic 208 may provide an indication of the selected polynomial to the coefficient generator 202. Such polynomial selection information may be provided in a table of the control logic. Alternatively, polynomial selection may be provided by the coefficient tables 206 or other circuitry of the function accelerator 104 and communicated to the control logic 208.

[0030] FIG. 3 shows coefficient generation in the function accelerator 104. In FIG. 3, the coefficient generator 202 selects coefficients for a polynomial based on the value of the three most significant bits (2^-1, 2^-2, and 2^-3) 310 of the input value 302. In some embodiments, a different number of bits and/or different bits of the input value 302 may be used for selecting the coefficients. The three bits 310 of value 302 considered in FIG. 3 may represent eight sub-ranges of the function being evaluated, or in some embodiments values of the three bit field 310 may be combined to represent fewer than eight sub-ranges of the function. For example, values 3-5 of the field 310 may represent a single sub-range for coefficient generation, and each of values 0, 1, 2, 6 and 7 may represent different distinct sub-ranges for coefficient generation.
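Selecting a sub-range from the most significant bits can be modeled as below. This sketch assumes an unsigned fixed-point fraction with a hypothetical width of 16 bits (FRAC_BITS) and a hypothetical mapping table (FIELD_TO_SUBRANGE) in which field values 3 through 5 share one sub-range while the other values of the three-bit field each get their own.

```python
FRAC_BITS = 16  # assumed width of the fractional input; illustrative only


def to_fixed(x):
    """Convert a value in [0, 1) to an unsigned fixed-point fraction."""
    return int(x * (1 << FRAC_BITS))


def msb_field(x_fixed, bits=3):
    """Return the `bits` most significant bits (weights 2^-1, 2^-2, ...) of the fraction."""
    if not 0 <= x_fixed < (1 << FRAC_BITS):
        raise ValueError("input outside the fixed-point range")
    return x_fixed >> (FRAC_BITS - bits)


# Hypothetical mapping: field values 3, 4 and 5 are combined into one sub-range.
FIELD_TO_SUBRANGE = (0, 1, 2, 3, 3, 3, 4, 5)


def subrange_index(x_fixed):
    """Map a fixed-point input to the index of its coefficient sub-range."""
    return FIELD_TO_SUBRANGE[msb_field(x_fixed)]
```

Under these assumptions, inputs from 0.375 up to (but not including) 0.75 all select the same coefficient set, so six coefficient sets cover the eight field values.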

[0031] The coefficient generator 202 includes i coefficient tables 304, 306 and 308, where each coefficient table produces a coefficient for a term of the polynomial. The coefficient generator 202 may produce different coefficient values and a different number of coefficients based on the value of the three most significant bits of value 302. The number of coefficients (i) and the coefficient values (c) are provided to the polynomial evaluator 204.

[0032] The coefficient generator 202 may also generate weight values (w) that are to be applied in conjunction with the coefficients. In some embodiments, a weight value may be provided in conjunction with each coefficient value. The weight value may be applied by the polynomial evaluator 204 to scale a result of multiplication by the associated coefficient to, for example, keep the result within a desired range. The weight values may be positive or negative powers of 2 to allow for application by shifting. The weight values may be applied at various stages of the polynomial evaluation, such as immediately after application of a coefficient or later in the polynomial computation.
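Because the weights are powers of two, they can be applied with shifts alone. The sketch below assumes integer fixed-point values with 15 fractional bits and represents each weight by its exponent (a hypothetical weight_exp parameter); it is a simplified model that ignores overflow and rounding.

```python
def apply_weight(value, weight_exp):
    """Scale an integer by 2**weight_exp using only shifts."""
    if weight_exp >= 0:
        return value << weight_exp
    return value >> -weight_exp  # arithmetic right shift; rounds toward minus infinity


def scaled_term(x_fixed, coeff_fixed, weight_exp, frac_bits=15):
    """Multiply two fixed-point values, renormalize, then apply the weight."""
    product = (x_fixed * coeff_fixed) >> frac_bits
    return apply_weight(product, weight_exp)
```

For example, with x = 0.5 (16384) and a coefficient of 0.25 (8192), the product is 0.125 (4096), and a weight exponent of +1 scales it to 0.25 (8192) with a single left shift.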

[0033] The coefficient generator 202 may also select the coefficient number, coefficient values, and weight values based on a select signal (SEL) or selection information provided to the coefficient generator 202. The selection information may specify a goal of function evaluation to be provided for in the selection of coefficients. For example, in support of a goal of minimizing energy consumption, the coefficient generator 202 may provide fewer coefficients and/or sacrifice result accuracy to some degree. Similarly, to maximize result accuracy, the coefficient generator 202 may provide more coefficients, which consumes more power to generate a result.
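The trade-off controlled by the selection information can be sketched with two hypothetical coefficient tables for the exponential function, one short and one long. The table names, the use of Maclaurin coefficients, and the string-valued selector are assumptions made for illustration; they stand in for the SEL signal and whatever tables an embodiment would hold.

```python
import math

# Hypothetical tables keyed by the selected goal; coefficients are in ascending order.
TABLES = {
    "low_energy": [1 / math.factorial(k) for k in range(3)],
    "high_accuracy": [1 / math.factorial(k) for k in range(7)],
}


def approx_exp(x, sel):
    """Estimate exp(x); return (estimate, number of coefficients applied)."""
    coeffs = TABLES[sel]
    acc = 0.0
    for c in reversed(coeffs):  # Horner's method
        acc = acc * x + c
    return acc, len(coeffs)
```

At x = 0.5 the three-coefficient table gives 1.625, while the seven-coefficient table agrees with exp(0.5) to within about two parts in a million, at the cost of four more multiply-add steps.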

[0034] Additionally, embodiments may adjust accuracy versus energy consumption by selecting the width of the coefficients and term calculation logic applied to evaluate a polynomial. For example, a smaller computation width (such as 32 bits) can be selected and applied to reduce energy consumption, while a larger computation width (such as 64 bits) can be selected and applied to increase result accuracy. Such selection may be realized via a select signal or selection information provided to the coefficient generator 202 and/or the polynomial evaluator 204.

[0035] FIG. 4 shows cascaded coefficient tables of the coefficient generator 202. The coefficient generator 202 may employ such an arrangement of coefficient tables if, for example, a particular sub-range of a function is to be further partitioned. In FIG. 4, the coefficient generator 202 includes a number of cascaded coefficient tables 402, 404 and 406. Coefficient table 402 is arranged to select coefficients, weights, etc. based on the value of the uppermost three bits (2^-1, 2^-2, and 2^-3) 408 of the input value 402 at which the function is to be evaluated. Accordingly, the coefficient table 402 may define up to eight sub-ranges of the function, and provide unique coefficients that are optimized to improve estimation accuracy for each sub-range.

[0036] In the coefficient generator 202 of FIG. 4, at least one of the sub-ranges defined by table 402 is further partitioned into smaller sub-ranges by coefficient table 404. Coefficient table 404 is coupled to coefficient table 402 and is arranged to select coefficients, weights, etc. based on a lower three bits (2^-4, 2^-5, and 2^-6) 410 of the input value 402 when table 402 indicates that the input value 402 falls into the sub-range corresponding to table 404. A third coefficient table 406 is arranged to select coefficients, weights, etc. based on bits 410 and trigger signals provided by coefficient table 404. For example, the coefficient table 406 may provide coefficients for one or more of the sub-ranges defined by bits 410. By dividing the coefficients across cascaded tables in this manner, embodiments may reduce the overall amount of storage for cases where the function is partitioned into fewer than the maximum number of sub-ranges supported by the total number of bits of the fields 408 and 410.
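The storage saving from cascading can be seen in a small two-level model. The sketch below assumes a six-bit input (upper three bits and lower three bits), uses placeholder strings in place of real coefficient sets, and arbitrarily picks upper-field value 5 as the one sub-range that is refined by a second table. All names are hypothetical.

```python
# First-level table: one entry per value of the upper three bits.
LEVEL1 = [f"coarse-{k}" for k in range(8)]

# Second-level tables, present only for upper-field values that are further partitioned.
LEVEL2 = {5: [f"fine-{k}" for k in range(8)]}


def lookup(x_bits):
    """Return the coefficient-set label for a six-bit input value."""
    if not 0 <= x_bits < 64:
        raise ValueError("input must fit in six bits")
    upper = (x_bits >> 3) & 0b111  # bits weighted 2^-1 .. 2^-3
    lower = x_bits & 0b111         # bits weighted 2^-4 .. 2^-6
    if upper in LEVEL2:
        return LEVEL2[upper][lower]
    return LEVEL1[upper]


def stored_entries():
    """Count table entries held by the cascaded arrangement."""
    return len(LEVEL1) + sum(len(t) for t in LEVEL2.values())
```

In this model the cascade holds 16 entries, whereas a single flat table indexed by all six bits would need 64.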

[0037] FIG. 5 shows use of various fields of a function input value 502 in the function accelerator 104. The tables 510 may include the coefficient tables 206, tables providing constants for use in function evaluation, and/or other tables applied by the function accelerator 104. The tables 510 may be accessed using various portions of the input value 502. In some embodiments of the function accelerator 104, the fractional portion 504 of the input value 502, or a portion thereof, may be used for accessing a table 510 to provide coefficients, constants or other values for function evaluation.

[0038] The function accelerator 104 may also apply the sign flag 508 of the input value 502 to access the tables 510. For example, the sign flag 508 may be applied in conjunction with the fractional portion 504 of the input value 502 to generate polynomial coefficients. The function accelerator may also apply the sign flag 508 to select the polynomial to be evaluated and/or to determine whether the result of function evaluation produces an imaginary number.

[0039] Embodiments of the function accelerator 104 may also apply a non-fractional portion 506 of the input value 502 to access the tables 510. The non-fractional portion 506 may be an exponent value (such as an exponent value of an IEEE 754 floating point value), an integer portion of a fixed point input value 502, etc. The non-fractional portion 506 may be applied to retrieve constants, coefficients, etc. from the tables 510.
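For an IEEE 754 single-precision input, the sign flag, exponent, and fractional portion can be separated as sketched below. The function names are hypothetical, and using the top three fraction bits as a table index is an assumption that mirrors the three-bit selection described for FIG. 3.

```python
import struct


def float32_fields(x):
    """Split a value, encoded as IEEE 754 single precision, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF      # 23 bits
    return sign, exponent, fraction


def fraction_table_index(x, bits=3):
    """Use the most significant fraction bits as a table index (illustrative choice)."""
    _, _, fraction = float32_fields(x)
    return fraction >> (23 - bits)
```

For the value -2.5, which is -1.25 times 2, the sign flag is 1, the biased exponent is 128, and the leading fraction bits are 010, so the table index under this assumption is 2.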

[0040] FIG. 6 is a flow diagram of an operation 600 of the function accelerator 104 for evaluating a function. Although depicted sequentially as a matter of convenience, at least some of the described actions can be performed in a different order and/or performed in parallel. Additionally, some embodiments may perform only some of the described actions. In some embodiments, at least some portions of the operation 600 can be implemented as instructions stored in a computer readable medium and executed by the processor 100.

[0041] In block 602, the processor 100 provides (to the function accelerator 104) an indication of a function to be evaluated and an operand value at which the indicated function is to be evaluated. For example, the execution unit 102 may pass an instruction defining the function and operand to the function accelerator 104. Alternatively, the execution unit 102 may load a function specification and/or operand into registers accessible by the function accelerator 104.

[0042] In block 604, the function accelerator 104 identifies a sub-range of the function encompassing the operand value. The function accelerator 104 may divide the range of the function into any number of sub-ranges and provide different coefficients for each sub-range. The sub-ranges may each encompass a different number of operand values.

[0043] In block 606, the function accelerator 104 identifies the polynomial to be evaluated for the indicated function and operand value. Different polynomials may be provided to evaluate different sub-ranges of a function. For some sub-ranges of a given function, a polynomial generally applied to evaluate a different function may be applied to evaluate the given function.

[0044] In block 608, the function accelerator 104 generates a number of coefficients, coefficient values, and weight values to be applied in the selected polynomial. The number of coefficients, coefficient values, and weight values may be selected based on the sub-range of the function in which the operand value falls and the polynomial selected for evaluation. Control information may also be provided to the function accelerator 104 that affects coefficient selection. For example, control information received by the function accelerator 104 may cause generation of fewer coefficients to reduce polynomial computation time and energy consumption, or cause generation of more coefficients to increase result accuracy.

[0045] In block 610, the function accelerator 104 evaluates the selected polynomial using the generated coefficient and weight values. The computation of the polynomial may be sequenced in accordance with Horner's method in some embodiments.

[0046] In block 612, the function accelerator 104 applies (to the result of polynomial evaluation) any further processing needed to produce the result of the function. The result of the function may be provided to the execution unit 102 or stored for access by the execution unit 102 or other components of the processor 100.
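The sequence of blocks 604 through 612 can be traced in a compact software model. The sketch below assumes a sine request with an operand between -π/2 and π/2, truncated Maclaurin coefficient sets (SIN_COEFFS and COS_COEFFS, both hypothetical), and sign restoration as the further processing of block 612. It is one possible reading of the flow, not the described circuit.

```python
import math

# Hypothetical coefficient sets, ascending in powers of x*x.
SIN_COEFFS = [1.0, -1 / 6, 1 / 120, -1 / 5040]          # result is multiplied by x
COS_COEFFS = [1.0, -1 / 2, 1 / 24, -1 / 720, 1 / 40320]


def horner(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) using Horner's method."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def evaluate_sin(operand):
    """Estimate sin(operand) for -pi/2 <= operand <= pi/2."""
    if abs(operand) > math.pi / 2:
        raise ValueError("operand outside the supported range")
    # Block 604: identify the sub-range from the operand magnitude.
    magnitude = abs(operand)
    in_lower_half = magnitude <= math.pi / 4
    # Blocks 606 and 608: select the polynomial and its coefficients.
    if in_lower_half:
        arg, coeffs, odd = magnitude, SIN_COEFFS, True
    else:
        arg, coeffs, odd = math.pi / 2 - magnitude, COS_COEFFS, False
    # Block 610: evaluate the polynomial in arg*arg, skipping unused terms.
    result = horner(coeffs, arg * arg)
    if odd:
        result *= arg
    # Block 612: further processing, here restoring the sign (sine is odd).
    return -result if operand < 0 else result
```

In this model, an operand of 1.2 falls in the upper sub-range, so the cosine coefficient set is applied at π/2 - 1.2, while an operand of -0.3 uses the sine coefficient set and is negated at the end.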

[0047] Modifications are possible in the described embodiments, and other embodiments are possible, within the scope of the claims.