

Title:
H.264 QUANTIZATION
Document Type and Number:
WIPO Patent Application WO/2006/086426
Kind Code:
A3
Abstract:
Low complexity (16-bit arithmetic) H.264 video compression (Fig.1a, input macroblock) replaces a single quantization table (Fig.1a, input quantization parameter) for all quantization parameters with multiple quantization tables (Fig.1a, quantization) and thereby equalizes quantization shifts and round-off additions (Fig.1a, encoded quantized transformed macroblock); this eliminates the need for 32-bit accesses.

Inventors:
ZHOU MINHUA (US)
Application Number:
PCT/US2006/004358
Publication Date:
December 13, 2007
Filing Date:
February 08, 2006
Assignee:
TEXAS INSTRUMENTS INC (US)
ZHOU MINHUA (US)
International Classes:
H04N7/12; G06K9/36
Foreign References:
US5241383A, 1993-08-31
US5481309A, 1996-01-02
US6631162B1, 2003-10-07
US5991454A, 1999-11-23
US20030202579A1, 2003-10-30
Attorney, Agent or Firm:
FRANZ, Warren, L. (P.O. Box 655474, Dallas, Texas, US)
Claims:

CLAIMS

1. A method of video encoding, comprising the steps of:

(a) transforming a 4x4 block of integer data into a 4x4 block of integer transform coefficients; and

(b) quantizing said 4x4 block of coefficients by (i) element-by-element multiplication of absolute values of said coefficients with entries of one of a plurality of 4x4 positive integer quantization matrices, (ii) addition of a rounding control parameter, (iii) restoration of the signs of said coefficients, (iv) and right shifting;

(c) wherein a first matrix of said plurality of quantization matrices has entries which are equal to one-half with round-off of corresponding entries of a second matrix of said plurality; and

(d) said one of said plurality is selected according to a quantization parameter.

2. The method of Claim 1, wherein:

(a) said plurality includes 4x4 matrices M_0, M_1, ..., M_(Q-1) where Q is a positive integer which factors as Q = NM with N and M positive integers each greater than 1; and

(b) for each pair of integers n,k with n in the range 1 to N-1 and k in the range 0 to M-1, the elements of said matrices are related by

M_(nM+k)(i,j) = (M_k(i,j) + 2^(n-1)) >> n for 0 ≤ i,j ≤ 3.

3. The method of Claim 2, wherein M = 6 and N = 7.

4. The method of Claim 2, wherein:

(a) with said quantization parameter equal to nM+k with n greater than 0 and with said coefficients denoted y(i,j), said quantizing includes the computations: c(i,j) = sign[ y(i,j) ] [ |y(i,j)| M_((n-1)M+k)(i,j) + α 2^16 ] >> 16 where α is a round off factor with 0 < α < 1.

5. The method of Claim 2, wherein:

(a) with said quantization parameter equal to k and with said coefficients denoted y(i,j), said quantizing includes the computations: c(i,j) = sign[ y(i,j) ] [ |y(i,j)| M_k(i,j) + α 2^15 ] >> 15 where α is a round off factor with 0 < α < 1.

Description:

H.264 QUANTIZATION

The present invention relates to digital image and video signal processing, and more particularly to block transformation and/or quantization plus inverse quantization and/or inverse transformation.

BACKGROUND

Various applications for digital video communication and storage exist, and corresponding international standards have been and are continuing to be developed. Low bit rate communications, such as video telephony and conferencing, plus large video file compression, such as motion pictures, led to various video compression standards: H.261, H.263, MPEG-1, MPEG-2, AVS, and so forth. These compression methods rely upon the discrete cosine transform (DCT) or an analogous transform plus quantization of transform coefficients to reduce the number of bits required to encode.

DCT-based compression methods decompose a picture into macroblocks where each macroblock contains four 8x8 luminance blocks plus two 8x8 chrominance blocks, although other block sizes and transform variants could be used. FIG. 2a depicts the functional blocks of DCT-based video encoding. In order to reduce the bit-rate, 8x8 DCT is used to convert the 8x8 blocks (luminance and chrominance) into the frequency domain. Then, the 8x8 blocks of DCT-coefficients are quantized, scanned into a 1-D sequence, and coded by using variable length coding (VLC). For predictive coding in which motion compensation (MC) is involved, inverse-quantization and IDCT are needed for the feedback loop. Except for MC, all the function blocks in FIG. 2a operate on an 8x8 block basis. The rate-control unit in FIG. 2a is responsible for generating the quantization step (qp) in an allowed range and according to the target bit-rate and buffer-fullness to control the DCT-coefficients quantization unit. Indeed, a larger quantization step implies more vanishing and/or smaller quantized coefficients which means fewer and/or shorter codewords and consequent smaller bit rates and files.

There are two kinds of coded macroblocks. An INTRA-coded macroblock is coded independently of previous reference frames. In an INTER-coded macroblock, the motion compensated prediction block from the previous reference frame is first generated for each block (of the current macroblock), then the prediction error block (i.e., the difference block between the current block and the prediction block) is encoded.

For INTRA-coded macroblocks, the first (0,0) coefficient in an INTRA-coded 8x8 DCT block is called the DC coefficient, and the remaining 63 DCT-coefficients in the block are AC coefficients; while for INTER-coded macroblocks, all 64 DCT-coefficients of an INTER-coded 8x8 DCT block are treated as AC coefficients. The DC coefficients may be quantized with a fixed value of the quantization step, whereas the AC coefficients have quantization steps adjusted according to the bit rate control, which compares the bits used so far in the encoding of a picture to the allocated number of bits to be used. Further, a quantization matrix (e.g., as in MPEG-4) allows for varying quantization steps among the DCT coefficients.

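In particular, the 8x8 two-dimensional DCT is defined as:

$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

where f(x,y) is the input 8x8 sample block and F(u,v) the output 8x8 transformed block where u,v,x,y = 0, 1, ..., 7; and

$$C(u) = \begin{cases} 1/\sqrt{2} & \text{for } u = 0 \\ 1 & \text{otherwise} \end{cases}$$

Note that this transforming has the form of 8x8 matrix multiplications, F = D^t × f × D, where "×" denotes 8x8 matrix multiplication and D is the 8x8 matrix with u,x element equal to

$$\frac{1}{2}\, C(u) \cos\frac{(2x+1)u\pi}{16}$$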

The transform is performed in double precision, and the final transform coefficients are rounded to integer values.

Next, define the quantization of the transform coefficients as

QF(u,v) = F(u,v) / QP

where QP is the quantization factor computed in double precision from the quantization step, qp, as an exponential such as: QP = 2^(qp/6). The quantized coefficients are rounded to integer values and are encoded.

The corresponding inverse quantization becomes:

F'(u,v) = QF(u,v) * QP

with double precision values rounded to integer values.

Lastly, the inverse transformation (reconstructed sample block) is:

$$f'(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F'(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

again with double precision values rounded to integer values.
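As a minimal sketch of the double-precision scheme described above, the following Python fragment quantizes and reconstructs a single coefficient with QP = 2^(qp/6). The function names are hypothetical, and the forward step is assumed to be the division counterpart of the inverse quantization F'(u,v) = QF(u,v) * QP given in the text.

```python
def quantize(coeff, qp):
    """Divide by QP = 2**(qp/6) in floating point, then round to an integer."""
    return round(coeff / 2 ** (qp / 6))


def dequantize(level, qp):
    """Multiply by QP = 2**(qp/6) in floating point, then round to an integer."""
    return round(level * 2 ** (qp / 6))


# Illustrative input: one transform coefficient at two quantization steps.
for qp in (12, 24):
    level = quantize(103, qp)
    print(qp, level, dequantize(level, qp))
```

A larger qp gives a smaller quantized level and a coarser reconstruction, which is the bit-rate trade-off the background describes.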

Various more recent video compression methods, such as the H.264 and AVS standards, simplify the double precision DCT method by using integer transforms in place of the DCT and/or different size blocks. Indeed, define an nxn integer transform matrix, T_nxn, with elements analogous to the 8x8 DCT transform coefficients matrix D. Then, with f_nxn and F_nxn denoting the input nxn sample data matrix (block of pixels or residuals) and the output nxn transform-coefficients block, respectively, define the forward nxn integer transform as:

F_nxn = T^t_nxn × f_nxn × T_nxn

where "×" denotes nxn matrix multiplication, and the nxn matrix T^t_nxn is the transpose of the nxn matrix T_nxn.

For example, as in other existing video standards, in H.264 the smallest coding unit is a macroblock which contains four 8x8 luminance blocks plus two 8x8 chrominance blocks from the two chrominance components. However, as shown in FIG. 3, in H.264 the 8x8 blocks are further divided into 4x4 blocks for transform plus quantization, which leads to a total of twenty-four 4x4 blocks for a macroblock. After the integer transform, the four DC values from each of two chrominance components are pulled together to form two chrominance DC blocks, on which an additional 2x2 transform plus quantization is performed. Similarly, if a macroblock is coded in INTRA 16x16 mode, the sixteen DC values of the sixteen 4x4 luminance blocks are put together to create a 4x4 luminance DC block, on which 4x4 luminance DC transform plus quantization is carried out.

Therefore, in H.264 there are three kinds of transform plus quantization, namely, 4x4 transform plus quantization for twenty-four luminance/chrominance blocks; 2x2 transform plus quantization for two chrominance DC blocks; and 4x4 transform plus quantization for the luminance DC blocks if the macroblock is coded in INTRA16x16 mode.

The quantization of the transformed coefficients may be exponentials of the quantization step as above or may use lookup tables with integer entries. The inverse quantization mirrors the quantization. And the inverse transform also uses T_nxn and its transpose, analogous to the DCT using D and its transpose for both the forward and inverse transforms.

However, these alternative methods still have computational complexity which could be reduced while maintaining performance.

SUMMARY

The present invention provides low-complexity quantization for H.264 image/video processing by modification of quantization tables per quantization parameter.

The preferred embodiment methods provide for simplified 16-bit operations useful in H.264 video coding.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1a-1b are flow diagrams.

FIGS. 2a-2b illustrate motion compensation video compression with DCT and other transformation and quantization.

FIG. 3 shows H.264 macroblock structure.

FIG. 4 illustrates method comparisons.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The preferred embodiment methods provide simplified 4x4 and 2x2 transformed block quantizations which apply to the 16-bit H.264 method. The quantization lookup tables are made dependent upon the quantization parameter to equalize the round-off and shifting; this avoids 32-bit accesses.

The methods have application to video compression which operates on blocks of (motion-compensated) pixels with H.264 integer transformation plus quantization of the transformed coefficients where the quantization can vary widely. For H.264 encoding as illustrated in FIG. 2b, buffer fullness feedback from the bitstream output buffer may determine the quantization factor, which typically varies in the range from 1 to 200-500. The preferred embodiment methods would apply in block "quantize" in FIG. 2b. FIGS. 1a-1b show the transform/quantization flows for encoding and decoding.

Preferred embodiment systems perform preferred embodiment methods with digital signal processors (DSPs) or general purpose programmable processors or application specific circuitry or systems on a chip (SoC) such as both a DSP and RISC processor on the same chip with the RISC processor controlling. In particular, digital still cameras (DSCs) with video clip capabilities or cell phones with video capabilities could include the preferred embodiment methods. A stored program could be in an onboard ROM or external flash EEPROM for a DSP or programmable processor to perform the signal processing of the preferred embodiment methods. Analog-to-digital converters and digital-to-analog converters provide coupling to the real world, and modulators and demodulators (plus antennas for air interfaces) provide coupling for transmission waveforms.

Initially, consider the H.264 transform, quantization, and their inverses for each of the three block types: 4x4 luminance/chrominance blocks, 2x2 chrominance DC blocks, and 4x4 luminance DC blocks; the preferred embodiment methods provide simplifications of the quantizations of H.264.

(a) Forward transform for 4x4 luminance/chrominance blocks

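The forward 4x4 transform uses the following 4x4 transform matrix, T_4x4, for matrix multiplications with each 4x4 sample data matrix of the twenty-four 4x4 luminance/chrominance blocks of a macroblock:

$$T_{4\times4} = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix}$$

Thus the forward transform of the 4x4 matrix with elements x_ij to the 4x4 matrix with elements y_ij is

$$\begin{bmatrix} y_{00} & y_{01} & y_{02} & y_{03} \\ y_{10} & y_{11} & y_{12} & y_{13} \\ y_{20} & y_{21} & y_{22} & y_{23} \\ y_{30} & y_{31} & y_{32} & y_{33} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix} \begin{bmatrix} x_{00} & x_{01} & x_{02} & x_{03} \\ x_{10} & x_{11} & x_{12} & x_{13} \\ x_{20} & x_{21} & x_{22} & x_{23} \\ x_{30} & x_{31} & x_{32} & x_{33} \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix}$$

Note that the columns of T_4x4 are orthogonal and that very roughly T_4x4 is proportional to the 4x4 DCT matrix.

(b) Quantization for 4x4 luminance/chrominance blocks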

The y_ij for i = 0, 1, 2, 3 and j = 0, 1, 2, 3 are quantized to give the c_ij as a function of the quantization parameter qP by:

c_ij = sign(y_ij) * (|y_ij| * QLevelScale(qP%6, i, j) + delta) >> (15 + qP/6)

where QLevelScale(qP%6, i, j) is the quantization lookup table; qP denotes either the luminance quantization parameter, QP_Y, or the chrominance quantization parameter, QP_C (both QP_Y and QP_C are in the range 0, 1, ..., 51); delta = α*2^(15+qP/6) with 0 < α < 1 the round-off parameter; sign(.) is the signum function (sign(z) = +1 if z is positive, sign(z) = -1 if z is negative, and sign(0) = 0); * denotes scalar multiplication; / is integer division (integer quotient with the remainder discarded); % is the modulo operation, which is essentially the remainder from integer division; and >> and << denote right and left shifting, which apply to numbers expressed in binary notation. Note that qP/6 lies in the range 0 to 8. The quantization lookup table consists of six 4x4 scaling matrices, one for each of the six possible values of qP%6. Each 4x4 scaling matrix has the same simple form but differing element values:

QLevelScale[6][4][4] = {
{{13107, 8066, 13107, 8066}, {8066, 5243, 8066, 5243}, {13107, 8066, 13107, 8066}, {8066, 5243, 8066, 5243}},
{{11916, 7490, 11916, 7490}, {7490, 4660, 7490, 4660}, {11916, 7490, 11916, 7490}, {7490, 4660, 7490, 4660}},
{{10082, 6554, 10082, 6554}, {6554, 4194, 6554, 4194}, {10082, 6554, 10082, 6554}, {6554, 4194, 6554, 4194}},
{{9362, 5825, 9362, 5825}, {5825, 3647, 5825, 3647}, {9362, 5825, 9362, 5825}, {5825, 3647, 5825, 3647}},
{{8192, 5243, 8192, 5243}, {5243, 3355, 5243, 3355}, {8192, 5243, 8192, 5243}, {5243, 3355, 5243, 3355}},
{{7282, 4559, 7282, 4559}, {4559, 2893, 4559, 2893}, {7282, 4559, 7282, 4559}, {4559, 2893, 4559, 2893}}
};

Note that overall the quantization is roughly multiplication by an integer scaling factor which lies between 2^11 and 2^14 followed by integer division by 2^15 which compensates for the size of the integer scaling factor, and then integer division by 2^(qP/6) which lies in the range 1 to 2^8 and provides the reduction in the number of bits for quantization. The quantized coefficients c_ij are ultimately encoded and transmitted/stored.

(c) Inverse quantization for 4x4 luminance/chrominance blocks

After decoding to recover the c_ij, inverse quantization for a 4x4 quantized block c_ij with i = 0, 1, 2, 3 and j = 0, 1, 2, 3 gives d_ij as:

d_ij = (c_ij * IQLevelScale(qP%6, i, j)) << qP/6

where, again, qP denotes either the luminance quantization parameter, QP_Y, or the chrominance quantization parameter, QP_C, and IQLevelScale(qP%6, i, j) is the inverse quantization lookup table entry. The inverse quantization lookup table again consists of a 4x4 scaling matrix for each of the six possible qP%6 with each 4x4 scaling matrix having four elements with a low value, eight with a middle value, and four with a high value:

IQLevelScale[6][4][4] = {
{{10, 13, 10, 13}, {13, 16, 13, 16}, {10, 13, 10, 13}, {13, 16, 13, 16}},
{{11, 14, 11, 14}, {14, 18, 14, 18}, {11, 14, 11, 14}, {14, 18, 14, 18}},
{{13, 16, 13, 16}, {16, 20, 16, 20}, {13, 16, 13, 16}, {16, 20, 16, 20}},
{{14, 18, 14, 18}, {18, 23, 18, 23}, {14, 18, 14, 18}, {18, 23, 18, 23}},
{{16, 20, 16, 20}, {20, 25, 20, 25}, {16, 20, 16, 20}, {20, 25, 20, 25}},
{{18, 23, 18, 23}, {23, 29, 23, 29}, {18, 23, 18, 23}, {23, 29, 23, 29}}

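};

Note that the left shifting provides recovery of the number of bits lost in integer division by 2^(qP/6) during quantization, and that the increase in magnitude from multiplication by IQLevelScale(qP%6, i, j) is essentially offset by the prior decrease in magnitude by multiplication by QLevelScale(qP%6, i, j) plus division by 2^15 in the quantization.

(d) Inverse transform for 4x4 luminance/chrominance blocks

The inverse 4x4 transform differs from the DCT in that the 4x4 transform matrix transpose is not equal to the 4x4 matrix inverse because the rows have differing norms; that is, T_4x4 is not an orthogonal matrix. Indeed, the scaling matrices of the quantization and inverse quantization adjust the relative size of transformed pixels. Explicitly, the inverse transform uses the 4x4 matrix V_4x4 and its transpose where:

$$V_{4\times4} = \begin{bmatrix} 1 & 1 & 1 & 1/2 \\ 1 & 1/2 & -1 & -1 \\ 1 & -1/2 & -1 & 1 \\ 1 & -1 & 1 & -1/2 \end{bmatrix}$$

Note that V_4x4 looks like T_4x4 but with two columns scaled by 1/2 to reduce dynamic range. Thus the inverse transform of the 4x4 matrix with elements d_ij for i = 0, 1, 2, 3 and j = 0, 1, 2, 3 is the 4x4 matrix with elements h_ij defined as:

$$\begin{bmatrix} h_{00} & h_{01} & h_{02} & h_{03} \\ h_{10} & h_{11} & h_{12} & h_{13} \\ h_{20} & h_{21} & h_{22} & h_{23} \\ h_{30} & h_{31} & h_{32} & h_{33} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1/2 \\ 1 & 1/2 & -1 & -1 \\ 1 & -1/2 & -1 & 1 \\ 1 & -1 & 1 & -1/2 \end{bmatrix} \begin{bmatrix} d_{00} & d_{01} & d_{02} & d_{03} \\ d_{10} & d_{11} & d_{12} & d_{13} \\ d_{20} & d_{21} & d_{22} & d_{23} \\ d_{30} & d_{31} & d_{32} & d_{33} \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1/2 & -1/2 & -1 \\ 1 & -1 & -1 & 1 \\ 1/2 & -1 & 1 & -1/2 \end{bmatrix}$$

Lastly, the h_ij are scaled down to r_ij = (h_ij + 32) >> 6 to define the recovered (decoded and decompressed) data.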
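Steps (a) through (c) can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the helper names (pick, forward_transform, quantize, dequantize) are hypothetical, α = 1/3 is assumed (the value quoted for the simulations), each table row is stored as its three distinct entries and indexed by position parity, and the right shift is applied after the sign is restored, following the order in which the formula is written.

```python
# Three distinct entries per qP%6 row, ordered (even,even), mixed, (odd,odd),
# taken from the QLevelScale and IQLevelScale listings in the text.
QLEVEL = [(13107, 8066, 5243), (11916, 7490, 4660), (10082, 6554, 4194),
          (9362, 5825, 3647), (8192, 5243, 3355), (7282, 4559, 2893)]
IQLEVEL = [(10, 13, 16), (11, 14, 18), (13, 16, 20),
           (14, 18, 23), (16, 20, 25), (18, 23, 29)]
T = [[1, 2, 1, 1], [1, 1, -1, -2], [1, -1, -1, 2], [1, -2, 1, -1]]
T_T = [list(col) for col in zip(*T)]  # transpose of T


def pick(row, i, j):
    """Table entry for position (i, j), selected by the parity of i and j."""
    return row[(i % 2) + (j % 2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_transform(x):
    """Step (a): y = T^t * x * T."""
    return matmul(matmul(T_T, x), T)


def quantize(y, qp):
    """Step (b), with delta = alpha * 2**(15 + qP/6) and alpha = 1/3 (assumed)."""
    shift = 15 + qp // 6
    delta = (1 << shift) // 3
    out = []
    for i in range(4):
        row = []
        for j in range(4):
            v = y[i][j]
            sign = (v > 0) - (v < 0)
            row.append((sign * (abs(v) * pick(QLEVEL[qp % 6], i, j) + delta)) >> shift)
        out.append(row)
    return out


def dequantize(c, qp):
    """Step (c): d = (c * IQLevelScale) << (qP/6)."""
    return [[(c[i][j] * pick(IQLEVEL[qp % 6], i, j)) << (qp // 6)
             for j in range(4)] for i in range(4)]


# Illustrative input: a flat block, which transforms to a single DC coefficient.
block = [[10] * 4 for _ in range(4)]
y = forward_transform(block)
c = quantize(y, 0)
d = dequantize(c, 0)
print(y[0][0], c[0][0], d[0][0])  # prints 160 64 640
```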

Similar transforms and quantization apply to the 2x2 chrominance DC blocks.

(e) Forward transform for 2x2 chrominance DC blocks

The forward 2x2 transform uses the following 2x2 transform matrix, T_2x2, for matrix multiplications with each 2x2 sample data matrix of the two 2x2 chrominance DC blocks of a macroblock:

$$T_{2\times2} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

Thus the forward transform of the 2x2 matrix with elements x_ij to the 2x2 matrix with elements y_ij is:

$$\begin{bmatrix} y_{00} & y_{01} \\ y_{10} & y_{11} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

(f) Quantization for 2x2 chrominance DC blocks

The y_ij for i = 0, 1 and j = 0, 1 are quantized to give the c_ij as a function of the quantization parameter QP_C by:

c_ij = sign(y_ij) * (|y_ij| * QLevelScale(QP_C%6, 0, 0) + delta) >> (16 + QP_C/6)

where QLevelScale(QP_C%6, 0, 0) is an entry in the quantization lookup table listed in (b) above; QP_C is the chrominance quantization factor as before and in the range 0, 1, ..., 51; and delta = α*2^(16+QP_C/6) with 0 < α < 1 the round-off parameter. These quantized coefficients c_ij are ultimately encoded and transmitted/stored.

(g) Inverse transform for 2x2 chrominance DC blocks

After decoding to recover a 2x2 quantized DC block c_ij with i = 0, 1 and j = 0, 1, the inverse 2x2 transform is applied prior to inverse quantization to give f_ij as:

$$\begin{bmatrix} f_{00} & f_{01} \\ f_{10} & f_{11} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

Note that like the DCT, the transform is essentially its own inverse.

(h) Inverse quantization for 2x2 chrominance DC blocks

The f_ij for i = 0, 1 and j = 0, 1 are inverse quantized to give the dcC_ij as a function of the quantization parameter QP_C by:

dcC_ij = ((f_ij * IQLevelScale(QP_C%6, 0, 0)) << QP_C/6) >> 1

where, again, QP_C denotes the chrominance quantization parameter, and IQLevelScale(QP_C%6, 0, 0) is a (0,0) entry of the inverse quantization lookup table listed in (c).
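The chrominance DC path of steps (e) through (h) can be illustrated with a short Python sketch. The function names are hypothetical, α = 1/3 is an assumed round-off value, only the (0,0) table entries are stored, and the shift is applied after the sign is restored, following the written order of the formula.

```python
QLEVEL00 = [13107, 11916, 10082, 9362, 8192, 7282]   # QLevelScale(k, 0, 0)
IQLEVEL00 = [10, 11, 13, 14, 16, 18]                  # IQLevelScale(k, 0, 0)


def transform2x2(m):
    """T * m * T with T = [[1, 1], [1, -1]]; used for both (e) and (g)."""
    (a, b), (c, d) = m
    return [[a + b + c + d, a - b + c - d],
            [a + b - c - d, a - b - c + d]]


def chroma_dc_quantize(x, qp_c):
    """Steps (e) and (f): forward 2x2 transform, then quantization."""
    shift = 16 + qp_c // 6
    delta = (1 << shift) // 3          # alpha = 1/3 (assumed)
    q = QLEVEL00[qp_c % 6]
    return [[(((v > 0) - (v < 0)) * (abs(v) * q + delta)) >> shift for v in row]
            for row in transform2x2(x)]


def chroma_dc_dequantize(c, qp_c):
    """Steps (g) and (h): inverse 2x2 transform, then inverse quantization."""
    iq = IQLEVEL00[qp_c % 6]
    return [[((f * iq) << (qp_c // 6)) >> 1 for f in row]
            for row in transform2x2(c)]


# Illustrative input: four equal chrominance DC values.
c = chroma_dc_quantize([[64, 64], [64, 64]], 0)
print(c, chroma_dc_dequantize(c, 0))  # prints [[51, 0], [0, 0]] [[255, 255], [255, 255]]
```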

Lastly, similar transforms and quantization apply to the 4x4 luminance DC blocks.

(i) Forward transform for 4x4 luminance DC blocks

The forward transform of the 4x4 luminance DC block x_ij to the 4x4 matrix with elements h_ij is

$$\begin{bmatrix} h_{00} & h_{01} & h_{02} & h_{03} \\ h_{10} & h_{11} & h_{12} & h_{13} \\ h_{20} & h_{21} & h_{22} & h_{23} \\ h_{30} & h_{31} & h_{32} & h_{33} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x_{00} & x_{01} & x_{02} & x_{03} \\ x_{10} & x_{11} & x_{12} & x_{13} \\ x_{20} & x_{21} & x_{22} & x_{23} \\ x_{30} & x_{31} & x_{32} & x_{33} \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix}$$

Then scale the h_ij to get the transform y_ij by y_ij = (h_ij + 1) >> 1.

(j) Quantization for 4x4 luminance DC blocks

The y_ij for i = 0, 1, 2, 3 and j = 0, 1, 2, 3 are quantized to give the c_ij as a function of the luminance quantization parameter QP_Y by:

c_ij = sign(y_ij) * (|y_ij| * QLevelScale(QP_Y%6, 0, 0) + delta) >> (16 + QP_Y/6)

where QLevelScale(QP_Y%6, 0, 0) is a (0,0) entry of the quantization lookup table listed in (b); and again delta = α*2^(16+QP_Y/6) with 0 < α < 1 is the round-off parameter.

(k) Inverse transform for 4x4 luminance DC blocks

After decoding to recover a 4x4 quantized DC block c_ij with i = 0, 1, 2, 3 and j = 0, 1, 2, 3, the inverse 4x4 transform is applied prior to inverse quantization to give f_ij as:

$$\begin{bmatrix} f_{00} & f_{01} & f_{02} & f_{03} \\ f_{10} & f_{11} & f_{12} & f_{13} \\ f_{20} & f_{21} & f_{22} & f_{23} \\ f_{30} & f_{31} & f_{32} & f_{33} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix}$$

(l) Inverse quantization for 4x4 luminance DC blocks

The f_ij for i = 0, 1, 2, 3 and j = 0, 1, 2, 3 are inverse quantized to give the dcY_ij as a function of the quantization parameter QP_Y by:

dcY_ij = (((f_ij * IQLevelScale(QP_Y%6, 0, 0)) << (QP_Y/6)) + 2) >> 2

where, again, QP_Y denotes the luminance quantization parameter, and IQLevelScale(QP_Y%6, 0, 0) is a (0,0) entry of the inverse quantization lookup table listed in (c).
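For the luminance DC path, steps (i) through (l) differ from the chrominance case in the 4x4 transform matrix and in the extra rounding shifts. A minimal Python sketch follows; the function names are hypothetical, α = 1/3 is assumed, and the shift is applied after the sign is restored, as the formula is written.

```python
QLEVEL00 = [13107, 11916, 10082, 9362, 8192, 7282]   # QLevelScale(k, 0, 0)
IQLEVEL00 = [10, 11, 13, 14, 16, 18]                  # IQLevelScale(k, 0, 0)
H = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform4x4_dc(m):
    """H * m * H; the same matrix serves steps (i) and (k)."""
    return matmul(matmul(H, m), H)


def luma_dc_quantize(x, qp_y):
    """Steps (i) and (j): transform, scale by (h + 1) >> 1, then quantize."""
    shift = 16 + qp_y // 6
    delta = (1 << shift) // 3          # alpha = 1/3 (assumed)
    q = QLEVEL00[qp_y % 6]
    y = [[(h + 1) >> 1 for h in row] for row in transform4x4_dc(x)]
    return [[(((v > 0) - (v < 0)) * (abs(v) * q + delta)) >> shift for v in row]
            for row in y]


def luma_dc_dequantize(c, qp_y):
    """Steps (k) and (l): inverse transform, then inverse quantization."""
    iq = IQLEVEL00[qp_y % 6]
    return [[(((f * iq) << (qp_y // 6)) + 2) >> 2 for f in row]
            for row in transform4x4_dc(c)]


# Illustrative input: sixteen equal luminance DC values.
c = luma_dc_quantize([[64] * 4 for _ in range(4)], 0)
print(c[0][0], luma_dc_dequantize(c, 0)[0][0])  # prints 102 255
```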

During the development of the H.264 standard, efforts were made to ensure that the H.264 transform and quantization could be implemented in 16-bit arithmetic. This goal has largely been achieved. However, the rounding control parameter delta used in the forward quantizations of foregoing steps (b), (f), and (j) may exceed 16 bits, and this makes the H.264 forward quantization impractical to implement on a processor which does not have 32-bit memory access. Indeed, delta = α*2^(15+qP/6) or α*2^(16+qP/6), which can be up to 24 bits. Consequently, the preferred embodiments provide forward quantizations for H.264 which have a constant delta. In particular, the preferred embodiment methods of transform plus quantization and their inverses for 4x4 luminance/chrominance blocks use foregoing steps (a), (c), and (d) but replace step (b) with new step (b'); for 2x2 chrominance DC blocks use foregoing steps (e), (g), and (h) but replace step (f) with new step (f'); and for 4x4 luminance DC blocks use foregoing steps (i), (k), and (l) but replace step (j) with new step (j'). These new steps are as follows:

(b') Preferred embodiment quantization for 4x4 luminance/chrominance blocks

The y_ij for i = 0, 1, 2, 3 and j = 0, 1, 2, 3 are quantized to give the c_ij as a function of the quantization parameter qP by:

c_ij = sign(y_ij) * (|y_ij| * QMat^(0)(qP%6, i, j) + α*2^15) >> 15 when qP/6 = 0
c_ij = sign(y_ij) * (|y_ij| * QMat^(qP/6-1)(qP%6, i, j) + α*2^16) >> 16 otherwise

where, as in (b), qP denotes either the luminance quantization parameter, QP_Y, or the chrominance quantization parameter, QP_C; and also as in (b), 0 < α < 1 is the round-off parameter. QMat^(n)(qP%6, i, j) is a new quantization lookup table defined in terms of QLevelScale(qP%6, i, j), listed in (b), as:

QMat^(n)(qP%6, i, j) = QLevelScale(qP%6, i, j) for n = 0
QMat^(n)(qP%6, i, j) = (QLevelScale(qP%6, i, j) + 2^(n-1)) >> n for n > 0

That is, QLevelScale[6][4][4] is replaced by QMat^(0)[6][4][4], QMat^(1)[6][4][4], ..., or QMat^(7)[6][4][4], depending upon qP/6. Note that for QMat^(n)[6][4][4] entries there is a right shift of n bits with round-off of the corresponding QLevelScale[6][4][4] entries; the right shift decreases the sizes of the entries from the range 2^11 to 2^14 to the range 2^(11-n) to 2^(14-n). (Note qP/6 in the range 0 to 8 implies that n will be in the range 1 to 7.) This use of more tables allows the qP/6-dependent size delta to be replaced by a constant size α*2^16 (or α*2^15 when qP/6 = 0) which is a 16-bit integer. For example, the three distinct values of the table QLevelScale(0, i, j) are 13107, 8066, and 5243; whereas, the corresponding entries of QMat^(7)(0, i, j) are 102, 63, and 41, respectively. This saves 7 bits by a trade-off with lower resolution.

(f') Preferred embodiment quantization for 2x2 chrominance DC blocks

The y_ij for i = 0, 1 and j = 0, 1 are quantized to give the c_ij as a function of the chrominance quantization parameter QP_C by:

c_ij = sign(y_ij) * (|y_ij| * QMat^(QP_C/6)(QP_C%6, 0, 0) + α*2^16) >> 16

where (b') defines QMat^(QP_C/6)(QP_C%6, 0, 0) and α. Note that QMat^(8)(QP_C%6, 0, 0) is also needed; whereas, (b') only uses QMat^(n)(qP%6, i, j) for n ≤ 7.

(j') Preferred embodiment quantization for 4x4 luminance DC blocks

The y_ij for i = 0, 1, 2, 3 and j = 0, 1, 2, 3 are quantized to give the c_ij as a function of the luminance quantization parameter QP_Y by:

c_ij = sign(y_ij) * (|y_ij| * QMat^(QP_Y/6)(QP_Y%6, 0, 0) + α*2^16) >> 16

where (b') defines QMat^(QP_Y/6)(QP_Y%6, 0, 0) and α. Again, note that QMat^(8)(QP_Y%6, 0, 0) is also needed.
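To make the difference between step (b) and step (b') concrete, the following Python sketch quantizes one coefficient both ways. It is a sketch under assumptions: the names qmat, quant_anchor, and quant_simplified are hypothetical, α = 1/3 is assumed, the table for qP/6 > 0 is taken to be the one shifted by qP/6 - 1 bits so that the total shift stays 15 + qP/6, and the shift is applied after the sign is restored, as the formulas are written.

```python
# Three distinct QLevelScale entries per qP%6 row: (even,even), mixed, (odd,odd).
QLEVEL = [(13107, 8066, 5243), (11916, 7490, 4660), (10082, 6554, 4194),
          (9362, 5825, 3647), (8192, 5243, 3355), (7282, 4559, 2893)]


def qlevel(k, i, j):
    return QLEVEL[k][(i % 2) + (j % 2)]


def qmat(n, k, i, j):
    """QLevelScale entry shifted right by n bits with round-off (n = 0: unchanged)."""
    q = qlevel(k, i, j)
    return q if n == 0 else (q + (1 << (n - 1))) >> n


def quant_anchor(y, qp, i, j):
    """Step (b): delta and shift both grow with qP/6."""
    shift = 15 + qp // 6
    delta = (1 << shift) // 3                     # alpha = 1/3 (assumed)
    sign = (y > 0) - (y < 0)
    return (sign * (abs(y) * qlevel(qp % 6, i, j) + delta)) >> shift


def quant_simplified(y, qp, i, j):
    """Step (b'): a pre-shifted table keeps delta and the shift constant."""
    n = qp // 6
    scale, shift = (qmat(0, qp % 6, i, j), 15) if n == 0 else (qmat(n - 1, qp % 6, i, j), 16)
    delta = (1 << shift) // 3                     # alpha = 1/3 (assumed)
    sign = (y > 0) - (y < 0)
    return (sign * (abs(y) * scale + delta)) >> shift


print(quant_anchor(160, 30, 0, 0), quant_simplified(160, 30, 0, 0))  # prints 2 2
print(((1 << 23) // 3).bit_length(), ((1 << 16) // 3).bit_length())  # prints 22 15
```

The second line of output compares the width of delta at qP/6 = 8 in step (b) with the constant delta of step (b') for this particular α; the two quantizers can differ slightly for other inputs because the pre-shifted table has lower resolution.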

For an implementation in which table size is not a concern, the new quantization matrices used in (b'), (f'), and (j') can be pre-calculated and stored. The explicit new quantization matrices are as follows:

QMat^(0)[6][4][4] = {
{{13107, 8066, 13107, 8066}, {8066, 5243, 8066, 5243}, {13107, 8066, 13107, 8066}, {8066, 5243, 8066, 5243}},
{{11916, 7490, 11916, 7490}, {7490, 4660, 7490, 4660}, {11916, 7490, 11916, 7490}, {7490, 4660, 7490, 4660}},
{{10082, 6554, 10082, 6554}, {6554, 4194, 6554, 4194}, {10082, 6554, 10082, 6554}, {6554, 4194, 6554, 4194}},
{{9362, 5825, 9362, 5825}, {5825, 3647, 5825, 3647}, {9362, 5825, 9362, 5825}, {5825, 3647, 5825, 3647}},
{{8192, 5243, 8192, 5243}, {5243, 3355, 5243, 3355}, {8192, 5243, 8192, 5243}, {5243, 3355, 5243, 3355}},
{{7282, 4559, 7282, 4559}, {4559, 2893, 4559, 2893}, {7282, 4559, 7282, 4559}, {4559, 2893, 4559, 2893}},
};

QMat^(1)[6][4][4] = {
{{6554, 4033, 6554, 4033}, {4033, 2622, 4033, 2622}, {6554, 4033, 6554, 4033}, {4033, 2622, 4033, 2622}},
{{5958, 3745, 5958, 3745}, {3745, 2330, 3745, 2330}, {5958, 3745, 5958, 3745}, {3745, 2330, 3745, 2330}},
{{5041, 3277, 5041, 3277}, {3277, 2097, 3277, 2097}, {5041, 3277, 5041, 3277}, {3277, 2097, 3277, 2097}},
{{4681, 2913, 4681, 2913}, {2913, 1824, 2913, 1824}, {4681, 2913, 4681, 2913}, {2913, 1824, 2913, 1824}},
{{4096, 2622, 4096, 2622}, {2622, 1678, 2622, 1678}, {4096, 2622, 4096, 2622}, {2622, 1678, 2622, 1678}},
{{3641, 2280, 3641, 2280}, {2280, 1447, 2280, 1447}, {3641, 2280, 3641, 2280}, {2280, 1447, 2280, 1447}},

};

QMat^(2)[6][4][4] = {
{{3277, 2017, 3277, 2017}, {2017, 1311, 2017, 1311}, {3277, 2017, 3277, 2017}, {2017, 1311, 2017, 1311}},
{{2979, 1873, 2979, 1873}, {1873, 1165, 1873, 1165}, {2979, 1873, 2979, 1873}, {1873, 1165, 1873, 1165}},
{{2521, 1639, 2521, 1639}, {1639, 1049, 1639, 1049}, {2521, 1639, 2521, 1639}, {1639, 1049, 1639, 1049}},
{{2341, 1456, 2341, 1456}, {1456, 912, 1456, 912}, {2341, 1456, 2341, 1456}, {1456, 912, 1456, 912}},
{{2048, 1311, 2048, 1311}, {1311, 839, 1311, 839}, {2048, 1311, 2048, 1311}, {1311, 839, 1311, 839}},
{{1821, 1140, 1821, 1140}, {1140, 723, 1140, 723}, {1821, 1140, 1821, 1140}, {1140, 723, 1140, 723}},

};

QMat^(3)[6][4][4] = {
{{1638, 1008, 1638, 1008}, {1008, 655, 1008, 655}, {1638, 1008, 1638, 1008}, {1008, 655, 1008, 655}},
{{1490, 936, 1490, 936}, {936, 583, 936, 583}, {1490, 936, 1490, 936}, {936, 583, 936, 583}},
{{1260, 819, 1260, 819}, {819, 524, 819, 524}, {1260, 819, 1260, 819}, {819, 524, 819, 524}},
{{1170, 728, 1170, 728}, {728, 456, 728, 456}, {1170, 728, 1170, 728}, {728, 456, 728, 456}},
{{1024, 655, 1024, 655}, {655, 419, 655, 419}, {1024, 655, 1024, 655}, {655, 419, 655, 419}},
{{910, 570, 910, 570}, {570, 362, 570, 362}, {910, 570, 910, 570}, {570, 362, 570, 362}},
};

QMat^(4)[6][4][4] = {
{{819, 504, 819, 504}, {504, 328, 504, 328}, {819, 504, 819, 504}, {504, 328, 504, 328}},
{{745, 468, 745, 468}, {468, 291, 468, 291}, {745, 468, 745, 468}, {468, 291, 468, 291}},
{{630, 410, 630, 410}, {410, 262, 410, 262}, {630, 410, 630, 410}, {410, 262, 410, 262}},
{{585, 364, 585, 364}, {364, 228, 364, 228}, {585, 364, 585, 364}, {364, 228, 364, 228}},
{{512, 328, 512, 328}, {328, 210, 328, 210}, {512, 328, 512, 328}, {328, 210, 328, 210}},
{{455, 285, 455, 285}, {285, 181, 285, 181}, {455, 285, 455, 285}, {285, 181, 285, 181}},

};

QMat^(5)[6][4][4] = {
{{410, 252, 410, 252}, {252, 164, 252, 164}, {410, 252, 410, 252}, {252, 164, 252, 164}},
{{372, 234, 372, 234}, {234, 146, 234, 146}, {372, 234, 372, 234}, {234, 146, 234, 146}},
{{315, 205, 315, 205}, {205, 131, 205, 131}, {315, 205, 315, 205}, {205, 131, 205, 131}},
{{293, 182, 293, 182}, {182, 114, 182, 114}, {293, 182, 293, 182}, {182, 114, 182, 114}},
{{256, 164, 256, 164}, {164, 105, 164, 105}, {256, 164, 256, 164}, {164, 105, 164, 105}},
{{228, 142, 228, 142}, {142, 90, 142, 90}, {228, 142, 228, 142}, {142, 90, 142, 90}},

};

QMat^(6)[6][4][4] = {
{{205, 126, 205, 126}, {126, 82, 126, 82}, {205, 126, 205, 126}, {126, 82, 126, 82}},
{{186, 117, 186, 117}, {117, 73, 117, 73}, {186, 117, 186, 117}, {117, 73, 117, 73}},
{{158, 102, 158, 102}, {102, 66, 102, 66}, {158, 102, 158, 102}, {102, 66, 102, 66}},
{{146, 91, 146, 91}, {91, 57, 91, 57}, {146, 91, 146, 91}, {91, 57, 91, 57}},
{{128, 82, 128, 82}, {82, 52, 82, 52}, {128, 82, 128, 82}, {82, 52, 82, 52}},
{{114, 71, 114, 71}, {71, 45, 71, 45}, {114, 71, 114, 71}, {71, 45, 71, 45}},

};

QMat^(7)[6][4][4] = {
{{102, 63, 102, 63}, {63, 41, 63, 41}, {102, 63, 102, 63}, {63, 41, 63, 41}},
{{93, 59, 93, 59}, {59, 36, 59, 36}, {93, 59, 93, 59}, {59, 36, 59, 36}},
{{79, 51, 79, 51}, {51, 33, 51, 33}, {79, 51, 79, 51}, {51, 33, 51, 33}},
{{73, 46, 73, 46}, {46, 28, 46, 28}, {73, 46, 73, 46}, {46, 28, 46, 28}},
{{64, 41, 64, 41}, {41, 26, 41, 26}, {64, 41, 64, 41}, {41, 26, 41, 26}},
{{57, 36, 57, 36}, {36, 23, 36, 23}, {57, 36, 57, 36}, {36, 23, 36, 23}},
};

QMat^(8)[6][4][4] = {
{{51, 32, 51, 32}, {32, 20, 32, 20}, {51, 32, 51, 32}, {32, 20, 32, 20}},
{{47, 29, 47, 29}, {29, 18, 29, 18}, {47, 29, 47, 29}, {29, 18, 29, 18}},
{{39, 26, 39, 26}, {26, 16, 26, 16}, {39, 26, 39, 26}, {26, 16, 26, 16}},
{{37, 23, 37, 23}, {23, 14, 23, 14}, {37, 23, 37, 23}, {23, 14, 23, 14}},
{{32, 20, 32, 20}, {20, 13, 20, 13}, {32, 20, 32, 20}, {20, 13, 20, 13}},
{{28, 18, 28, 18}, {18, 11, 18, 11}, {28, 18, 28, 18}, {18, 11, 18, 11}},

};

Note that in QMat^(8)[6][4][4] only QMat^(8)(2,0,0), QMat^(8)(3,0,0), QMat^(8)(4,0,0), QMat^(8)(5,0,0) are used; the rest of the components in QMat^(8)[6][4][4] do not need to be stored. Therefore, the total table size is about 1350 bytes (QMat^(0) to QMat^(5) stored as two-byte entries, QMat^(6) to QMat^(8) stored as one-byte entries).

For implementations in which a small table size is desired, the quantization matrices for a macroblock can be computed on the fly according to the quantization scales QP_Y and QP_C by

QMat(QP_Y%6, i, j) = QLevelScale(QP_Y%6, i, j) for QP_Y/6 < 2
QMat(QP_Y%6, i, j) = (QLevelScale(QP_Y%6, i, j) + 2^(QP_Y/6-2)) >> (QP_Y/6-1) otherwise

QMat(QP_C%6, i, j) = QLevelScale(QP_C%6, i, j) for QP_C/6 < 2
QMat(QP_C%6, i, j) = (QLevelScale(QP_C%6, i, j) + 2^(QP_C/6-2)) >> (QP_C/6-1) otherwise

and, for the luminance DC and chrominance DC quantization scales, respectively,

(QLevelScale(QP_Y%6, 0, 0) + 2^(QP_Y/6-1)) >> QP_Y/6
(QLevelScale(QP_C%6, 0, 0) + 2^(QP_C/6-1)) >> QP_C/6

Therefore, for a macroblock, a 4x4 quantization matrix for 16 luminance blocks, a 4x4 quantization matrix for 8 chrominance blocks, a quantization scale for a 4x4 luminance DC block, and a quantization scale for two 2x2 chrominance DC blocks need to be computed for the transform coefficient quantization according to a given QP_Y and QP_C. Since the quantization scales do not change very frequently from macroblock to macroblock, such quantization matrix computations normally do not need to be performed for each macroblock.
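The on-the-fly alternative can be sketched in Python as below. The function names block_matrix and dc_scale are hypothetical; the DC scale for QP/6 = 0 is assumed to be the unshifted QLevelScale entry, since the rounding term 2^(QP/6-1) is not an integer in that case.

```python
# Three distinct QLevelScale entries per qP%6 row: (even,even), mixed, (odd,odd).
QLEVEL = [(13107, 8066, 5243), (11916, 7490, 4660), (10082, 6554, 4194),
          (9362, 5825, 3647), (8192, 5243, 3355), (7282, 4559, 2893)]


def block_matrix(qp):
    """4x4 quantization matrix for the luminance or chrominance blocks at this QP."""
    n, k = qp // 6, qp % 6
    matrix = []
    for i in range(4):
        row = []
        for j in range(4):
            q = QLEVEL[k][(i % 2) + (j % 2)]
            row.append(q if n < 2 else (q + (1 << (n - 2))) >> (n - 1))
        matrix.append(row)
    return matrix


def dc_scale(qp):
    """Quantization scale for the luminance DC or chrominance DC blocks at this QP."""
    n, k = qp // 6, qp % 6
    q = QLEVEL[k][0]
    return q if n == 0 else (q + (1 << (n - 1))) >> n   # n == 0 case is assumed


# Illustrative quantization parameters for one macroblock.
qp_y, qp_c = 30, 48
print(block_matrix(qp_y)[0], block_matrix(qp_y)[1])  # prints [819, 504, 819, 504] [504, 328, 504, 328]
print(dc_scale(qp_y), dc_scale(qp_c))                # prints 410 51
```

Because QP_Y and QP_C rarely change between macroblocks, a caller could cache the four results and recompute them only when either parameter changes.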

Simulations were carried out to test the efficiency of the preferred embodiment simplified forward quantization for H.264. "Anchor T&Q" is the H.264 transform plus quantization which is made up of equations (a) to (l); "Simplified T&Q" is made up of equations (a), (b'), (c), (d), (e), (f'), (g), (h), (i), (j'), (k), and (l); that is, only the forward quantization is changed in this case, and everything else remains the same. All quantization steps (qp = 0, 1, 2, ..., 51) are tested. Each qp is tested with 5000 random macroblocks, with sample values in the range [-255:255]. The PSNR values between the input sample macroblocks and their reconstructed macroblocks are computed (see FIG. 4) over all the test sample macroblocks for each qp. The results are listed in the following Tables 1, 2, and 3.

Table 3. Simulation results for INTRA16x16-coded macroblocks; α = 1/3 is used.

As shown in Tables 1-3, the preferred embodiment simplified forward quantization performs almost identically to the quantization currently recommended by H.264, for all the allowed quantization scales (0-51) and the macroblock types (INTER, INTRA4x4 or INTRA16x16). Thus the preferred embodiment quantization provides the same compression efficiency as the current H.264 quantization design, but enables the H.264 quantization to be implemented on devices with no capability of 32-bit memory access.

Various modifications to the preferred embodiments may be made while retaining the feature of multiple quantization tables to limit the bit size of a rounding control parameter. For example, the quantization may use finer resolution, such as increments of qP/8 rather than qP/6, and so forth.