

Title:
ON COEFFICIENT VALUE PREDICTION AND COST DEFINITION
Document Type and Number:
WIPO Patent Application WO/2024/076632
Kind Code:
A1
Abstract:
A mechanism for processing video data is disclosed. The mechanism determines to predict a value of a residual coefficient based on a cost. A conversion is performed between visual media data and a media data file based on the residual coefficient. The coefficient may be used in transform coding or transform-skip coding.

Inventors:
SALEHIFAR MEHDI (US)
HE YUWEN (US)
ZHANG KAI (US)
ZHANG LI (US)
Application Number:
PCT/US2023/034461
Publication Date:
April 11, 2024
Filing Date:
October 04, 2023
Assignee:
BYTEDANCE INC (US)
International Classes:
H04N19/18; H04N19/192; H04N19/103; H04N19/60; H04N19/61
Attorney, Agent or Firm:
HOWELL, Brandt D. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A method for processing video data comprising: determining to predict a value of a residual coefficient based on a cost; and performing a conversion between a visual media data and a bitstream based on the residual coefficient.

2. The method of claim 1, wherein the coefficient is used in transform coding or transform-skip coding.

3. The method of any of claims 1-2, wherein only a direct current (DC) coefficient value is predicted.

4. The method of any of claims 1-3, wherein the values of the first N coefficients are predicted, wherein N is any integer number, and wherein the first N coefficients are determined based on raster scan order, diagonal scan order, vertical scan order, horizontal scan order, or based on dividing the coefficients into subblocks and based on any combination of the scan order for the subblocks and any scan order for the coefficients inside of the subblocks.

5. The method of any of claims 1-4, wherein prediction of a coefficient depends on coding information comprising a partial value of the reconstructed coefficient; a parity; surrounding neighboring values; a block size, coding unit (CU), prediction unit (PU), or transform unit (TU) sizes; a prediction mode used for a block, which may depend on whether the block is inter coded or intra coded, on the intra direction value, or on the type of the inter prediction used for that block; multiple transform selection (MTS) index values; Low-Frequency Non-Separable Transform (LFNST) index values; block partitioning type; transform skip flag; quantization parameter (QP); color components, or color format.

6. The method of any of claims 1-5, wherein N coefficients at positions p1, p2, ..., pN are predicted.

7. The method of any of claims 1-6, wherein information related to a coefficient value is derived from a cost derivation process.

8. The method of any of claims 1-7, wherein a full coefficient is derived from a cost derivation process, a prediction of the coefficient is derived from the cost derivation process and a coefficient is added by the prediction to obtain a final coefficient, a scaling factor is derived from the cost derivation process and a coefficient is multiplied or divided by the scaling factor to obtain the final coefficient, module transform (T) information is derived and a final coefficient value is T*coeff + t, where T is any positive integer and t is any integer between 0 and T-1, or T = 2 and the cost derivation process determines a parity of a coefficient.

9. The method of any of claims 1-8, wherein information related to prediction value is derived from a set of values.

10. The method of any of claims 1-9, wherein the prediction value is predicted from zero and C such that the final coefficient value is X or X+C where C is any integer, or wherein the prediction value is not signaled such that based on a prediction from zero or C the final coefficient value is X or X+C where X is a partially coded coefficient, or wherein a flag is coded to indicate whether a prediction from zero and C is correct such that when the prediction is incorrect an opposite value is added to X, or wherein two or more prediction values are included in one set, or wherein the information is related to N sets of values that include M_i candidates, for i from 1 to N, and the N sets are implicitly derived or signaled explicitly such that N and M_i are each positive integers, or wherein M possible prediction values are denoted as v1, ..., vM, and a best prediction, denoted as vK (1 <= K <= M), is added to X to create a final coefficient of X + vK without any signaling, or wherein all M possible predictions are sorted based on a predefined cost and an index is signaled to indicate a correct prediction, or wherein a prediction derivation process is applied simultaneously with or after dependent quantization or rate distortion optimization quantization is complete, or wherein a predefined prediction value need not be constant, or wherein a predefined prediction value is a function of surrounding coefficient values, or wherein a function of a summation of an absolute value of the transform surrounding neighbors determines predefined prediction values, or wherein a function of a summation of a partial absolute value of the transform surrounding neighbors determines predefined prediction values.

11. The method of any of claims 1-10, wherein an actual prediction value is used to code a coefficient value remainder.

12. The method of any of claims 1-11, wherein an accurate prediction P is derived on both an encoder and a decoder side, where X = coeff - P is coded at the encoder and the decoder decodes X and adds P to obtain a coefficient final value, or wherein a prediction derivation process is applied after dependent quantization or rate distortion optimization quantization (RDOQ) is complete, or wherein a prediction derivation process is applied with a dependent quantization (DQ) or a RDOQ process, or wherein an approximation of a prediction is used such that an absolute value of a prediction is limited to C where C is any positive number or where a parity of the prediction is always even, always odd, or derived from a partial coefficient value, or wherein a binary search style is used to determine a prediction value, or wherein all possible values with absolute values less than C are examined and a value with a lowest cost is used as a prediction.

13. The method of any of claims 1-12, wherein a partial prediction value is used to predict a part of a coefficient.

14. The method of any of claims 1-13, wherein information related to a zero coefficient is not predicted, or wherein remaining coefficients are predicted based on a greater than zero flag, or wherein information related to a coefficient being greater than one is predicted, or wherein remaining coefficients are predicted based on a greater than one flag, or wherein information related to a coefficient being greater than two is predicted, or wherein remaining coefficients are predicted based on a greater than two flag, or wherein any information in a second residual coding pass is predicted, or wherein any information in a third residual coding pass is predicted, or wherein a part of a coefficient is signaled and another part of the coefficient is predicted, or wherein a partial prediction includes deriving a prediction value from a set of values or an actual prediction for a part of the coefficient.

15. The method of any of claims 1-14, wherein a cost for evaluating a coefficient value hypothesis or prediction is a function of at least one neighboring sample.

16. The method of any of claims 1-15, wherein a cost is calculated as a difference between a partial reconstruction of border samples in a current block and a corresponding reference, where the corresponding reference is derived from neighboring block reconstruction, where the partial reconstruction of the border samples and the corresponding reference are adjacent, or where the partial reconstruction of the border samples and the corresponding reference have a same number of samples, each reconstructed border sample has a corresponding reference sample, and a difference is calculated by comparing each pair of corresponding reconstructed border sample and reference sample, or wherein one or more rows, one or more columns, or both are used as a partial reconstruction area, or wherein K1 rows, K2 columns, or both are used as a partial reconstruction area where K1 and K2 are integer numbers, or wherein different cost functions are used to derive one hypothesis cost where the hypothesis cost is: a sum of absolute difference (SAD) between the partial reconstruction and their references, a sum of absolute transformed difference (SATD) or other cost measure between the partial reconstruction and their references, a mean removal (MR) based SAD (MR-SAD) between template samples and their references, a weighted average of SAD or MR-SAD and SATD between a partial reconstruction and their references, or a cost function between partial reconstruction and reference template according to SAD, MR-SAD, SATD, MR-SATD, sum of squared differences (SSD), sum square error (SSE), MR-SSE, weighted SAD, weighted MR-SAD, weighted SATD, weighted MR-SATD, weighted SSD, weighted MR-SSD, weighted SSE, weighted MR-SSE, or gradient information, or wherein a cost considers a continuity between a reference template and reconstructed samples adjacently or non-adjacently neighboring to a current template in addition to a SAD (Boundary SAD), or wherein reconstructed samples left or above adjacently or non-adjacently neighboring to the current template are considered, or wherein a cost is calculated based on SAD and Boundary SAD, or wherein a cost is calculated as (SAD + w*Boundary_SAD) where w is pre-defined, signaled, or derived according to decoded information.

17. The method of any of claims 1-16, wherein a number of the multiple transform selection (MTS) candidates depends on coefficient characteristics.

18. The method of any of claims 1-17, wherein a number of the MTS candidates depends on a last significant coefficient position, or wherein a number of candidates for a last significant coefficient position between P_i and P_i+1 is K_i where P_i and K_i are any non-negative numbers where K_i <= K_i+1 <= ..., or wherein a number of the MTS candidates and context for coding an index depends on a sum of absolute values of coefficients, or wherein a number of the candidates for a sum of absolute values of coefficients between P_i and P_i+1 is K_i where P_i and K_i are any non-negative numbers where K_i <= K_i+1 <= ..., or wherein a sum of absolute values of some, but not all, positions is used for determining a number of the MTS candidates and context for coding an index, or wherein the DC position is not used in the sum, or wherein only coefficients at positions p1, p2, ..., pN are used for the sum of absolute values where each pi is any non-negative integer, or wherein a sum of absolute values of some, but not all, positions is used for determining a number of MTS candidates and context for coding an index, or wherein a number of the MTS candidates and context for coding an index depends on a sum of partial absolute values of coefficients, or wherein a partial sum is min(abs(coeff), C), where C is a non-negative number, or wherein any combination of a partial sum and a full sum depending on a coefficient position or value is used for determining a number of the MTS candidates and context for coding an index, or where a sum of the min(abs(coeff), Ci) is used for determining a number of the MTS candidates and a context for coding an index where Ci is any non-negative integer and is different for each position pi, or wherein a coefficient value at position pi is not used in a corner case of Ci = 0, or wherein a coefficient value at position pi is fully used in a corner case of Ci = MAX_INT, or wherein any function other than min is used.

19. The method of any of claims 1-18, wherein a minimum function is used to determine a predicted sign such that a prediction of N signs results in 2^N hypotheses, which includes going through all the 2^N costs and finding a minimum to determine a predicted sign.

20. The method of any of claims 1-19, wherein a hypothesis with a lowest cost among all 2^N costs determines a predicted sign for all the N signs, or wherein a hypothesis with a lowest cost among all the 2^N costs only determines a predicted sign for a first k signs where k is any integer number of N or less, or wherein after coding to determine whether a prediction of a first k signs is correct, non-correct sign hypotheses are discarded and remaining hypotheses are used for predicting remaining signs, or combinations thereof.

21. The method of any of claims 1-20, wherein a head-to-head minimum function is used to determine predicted signs.

22. The method of any of claims 1-21, wherein a head-to-head minimum function is defined as: after calculating a cost for the 2^N hypotheses, for an ith sign, 2^(N-1) hypotheses are related to a negative ith sign, 2^(N-1) hypotheses are related to a positive ith sign, and remaining N-1 sign situations are identical, and a comparison of these 2^(N-1) negative and 2^(N-1) positive hypotheses is performed head-to-head to count a number of the times the negative hypothesis and the positive hypothesis have lower costs, or wherein whichever hypothesis has the most head-to-head lower costs is chosen as a predicted sign, or wherein head-to-head minimum functions are applied on all 2^N hypotheses for all signs, or wherein after determining an actual sign for an ith sign, incorrect hypotheses are discarded and the head-to-head minimum function is applied on remaining hypotheses, or wherein any combination of discarding or keeping incorrect hypotheses is applied when using hypotheses to determine predicted signs.

23. The method of any of claims 1-22, wherein a combination of different cost definitions is used to determine predicted signs.

24. The method of any of claims 1-23, wherein for signs at positions p1, ..., pJ one cost function is used and for signs at positions q1, ..., qK another cost function is used, or wherein p1, ..., pJ and q1, ..., qK are any integer numbers between 1 and N and no two are the same, or wherein a min function and another head-to-head min function are used, or a combination of discarding incorrect hypotheses and keeping incorrect hypotheses is used in combination with any combination of different cost functions for each sign prediction, or wherein different cost functions are used for determining a sign prediction for one sign, or wherein when, based on one cost criterion, positive hypothesis and negative hypothesis costs are smaller than a predefined threshold or bigger than a predefined threshold, the next cost function is used to determine which sign is predicted, continuing until a last cost function in a queue or a threshold criterion for that function is satisfied, or wherein a cost is defined based on a weighted cost difference between head-to-head hypotheses, or wherein w1 and w2 are added to each camp, where w1 and w2 may be any real number such as 0.3, 0.5, 1, ....

25. The method of any of claims 1-24, wherein cost definitions and decision making used for sign prediction are used for coefficient value prediction.

26. The method of any of claims 1-25, wherein a minimum function is used to determine a coefficient value prediction, or wherein a head-to-head minimum function is used to determine a coefficient value prediction, or wherein an incorrect hypothesis is discarded depending on coding side information related to coefficient value prediction, or wherein any combination of discarding incorrect hypotheses or retaining incorrect hypotheses is used in combination with any combination of different cost functions for each sign prediction.

27. The method of any of claims 1-26, wherein any combination of sign prediction and coefficient value prediction for candidates are applied.

28. The method of any of claims 1-27, wherein sign prediction is applied on all coefficients, or wherein sign prediction is applied only on N signs, or wherein a first N signs based on a predefined scan order are used for sign prediction, or wherein a first N signs based on coefficient magnitude are used for the sign prediction, or wherein only signs of coefficients at positions p1, ..., pN are used for sign prediction, or wherein only coefficients at positions q1, ..., qM are used for coefficient value prediction, or wherein sign prediction or coefficient value prediction are applied for a coefficient in a mutually exclusive manner, or wherein sign prediction and coefficient value prediction are both applied for a coefficient, or wherein all signs are predicted prior to prediction of coefficient values, or wherein all coefficient values are predicted prior to prediction of signs, or wherein any order combination of predicting signs and coefficient values is applied.

29. The method of any of claims 1-28, wherein there are differences between passes used for residual coding.

30. The method of any of claims 1-29, wherein different passes are used depending on a total number of context coded bins, or wherein there is no limitation on a number of context coded bins, or wherein prediction is used for the position of specified values, or wherein no special treatment is used for the 0 or any other value position.

31. The method of any of claims 1-30, wherein usage is dependent on coded information such that coded information includes block sizes, temporal layers, slice types, picture types, or color component.

32. The method of any of claims 1-31, wherein usage is indicated in the bitstream, and wherein the indication of enabling, disabling, or mechanism selection is signaled in a sequence level, group of pictures level, picture level, slice level, tile group level, sequence header, picture header, sequence parameter set (SPS), video parameter set (VPS), dependency parameter set (DPS), decoding capability information (DCI), picture parameter set (PPS), adaptation parameter set (APS), slice header, tile group header, prediction block (PB), transform block (TB), coding block (CB), prediction unit (PU), transform unit (TU), coding unit (CU), virtual pipeline data unit (VPDU), coding tree unit (CTU), CTU row, slice, tile, sub-picture, and/or other kinds of regions that contain more than one sample or pixel.

33. The method of any of claims 1-32, wherein the conversion includes encoding the visual media data into the bitstream.

34. The method of any of claims 1-32, wherein the conversion includes decoding the visual media data from the bitstream.

35. An apparatus for processing video data comprising: a processor; and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to perform the method of any of claims 1-34.

36. A non-transitory computer readable medium comprising a computer program product for use by a video coding device, the computer program product comprising computer executable instructions stored on the non-transitory computer readable medium such that when executed by a processor cause the video coding device to perform the method of any of claims 1-34.

37. A non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by a video processing apparatus, wherein the method comprises: determining to predict a value of a residual coefficient based on a cost; and generating a bitstream based on the determining.

38. A method for storing a bitstream of a video comprising: determining to predict a value of a residual coefficient based on a cost; generating a bitstream based on the determining; and storing the bitstream in a non-transitory computer-readable recording medium.

Description:
On Coefficient Value Prediction and Cost Definition

CROSS-REFERENCE TO RELATED PATENT APPLICATIONS

[0001] This patent application claims the benefit of U.S. Provisional Patent Application No. 63/413,082, filed October 4, 2022, the teachings and disclosure of which are hereby incorporated in their entireties by reference thereto.

TECHNICAL FIELD

[0002] This patent document relates to generation, storage, and consumption of digital audio video media information in a file format.

BACKGROUND

[0003] Digital video accounts for the largest bandwidth used on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video usage is likely to continue to grow.

SUMMARY

[0004] A first aspect relates to a method for processing video data comprising: determining to predict a value of a residual coefficient based on a cost; and performing a conversion between a visual media data and a bitstream based on the residual coefficient.

[0005] A second aspect relates to an apparatus for processing video data comprising: a processor; and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to perform any of the preceding aspects.

[0006] A third aspect relates to a non-transitory computer readable medium comprising a computer program product for use by a video coding device, the computer program product comprising computer executable instructions stored on the non-transitory computer readable medium such that when executed by a processor cause the video coding device to perform the method of any of the preceding aspects.

[0007] A fourth aspect relates to a non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by a video processing apparatus, wherein the method comprises: determining to predict a value of a residual coefficient based on a cost; and generating a bitstream based on the determining.

[0008] A fifth aspect relates to a method for storing a bitstream of a video comprising: determining to predict a value of a residual coefficient based on a cost; generating a bitstream based on the determining; and storing the bitstream in a non-transitory computer-readable recording medium.

[0009] For the purpose of clarity, any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create a new embodiment within the scope of the present disclosure.

[0010] These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

[0012] FIG. 1 is an example residual coding structure for transform blocks.

[0013] FIG. 2 is an example illustration of a template used for selecting probability models.

[0014] FIG. 3 is an example illustration of the two scalar quantizers used in an approach of dependent quantization.

[0015] FIG. 4 is an example state transition and quantizer selection for a dependent quantization.

[0016] FIG. 5 is an example Low-Frequency Non-Separable Transform (LFNST) process.

[0017] FIG. 6 is an example region of interest (ROI) for LFNST16.

[0018] FIG. 7 is an example ROI for LFNST8.

[0019] FIG. 8 is an example discontinuity measure.

[0020] FIG. 9 is a block diagram showing an example video processing system.

[0021] FIG. 10 is a block diagram of an example video processing apparatus.

[0022] FIG. 11 is a flowchart for an example method of video processing.

[0023] FIG. 12 is a block diagram that illustrates an example video coding system.

[0024] FIG. 13 is a block diagram that illustrates an example encoder.

[0025] FIG. 14 is a block diagram that illustrates an example decoder.

[0026] FIG. 15 is a schematic diagram of an example encoder.

DETAILED DESCRIPTION

[0027] It should be understood at the outset that although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or yet to be developed. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

[0028] Section headings are used in the present document for ease of understanding and do not limit the applicability of techniques and embodiments disclosed in each section only to that section. Furthermore, H.266 terminology is used in some descriptions only for ease of understanding and not for limiting scope of the disclosed techniques. As such, the techniques described herein are applicable to other video codec protocols and designs also. In the present document, editing changes are shown to text by bold italics indicating cancelled text and bold underline indicating added text, with respect to a draft of the VVC specification.

1. Initial discussion

[0029] This document is related to video and/or image coding technologies. Specifically, it is related to residual coding. The ideas may be applied, individually or in various combinations, to video coding standards such as High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC), or next generation video coding standards beyond VVC such as the enhanced compression model (ECM) or future video coding standards or video codecs.

2. Video coding introduction

[0030] Video coding standards have evolved primarily through the development of the ITU-T and International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) standards. The International Telecommunication Union - Telecommunication Standardization Sector (ITU-T) produced H.261 and H.263, ISO/IEC produced Moving Picture Experts Group (MPEG)-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/High Efficiency Video Coding (HEVC) standards. Since H.262, the video coding standards are based on the hybrid video coding structure wherein temporal prediction plus transform coding are utilized. To explore video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by the Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG). This group finalized the Versatile Video Coding (VVC) [1] standard, aiming at yet another 50% bit-rate reduction and providing a range of additional functionalities. After finalizing VVC, activity beyond VVC has started. A description of the additional tools on top of the VVC tools [2] has been summarized in [3], and the corresponding reference software is named ECM.

[0031] An example reference software of VVC, named VVC test model (VTM), may be found at: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/tree/VTM-17.0. An example reference software of beyond VVC, named ECM, may be found at: https://vcgit.hhi.fraunhofer.de/ecm/ECM/-/tree/ECM-6.0.

2.1 Transform coefficient level coding

[0032] In HEVC, transform coefficients of a coding block are coded using non-overlapped coefficient groups (CGs), also known as subblocks, and each CG contains the coefficients of a 4x4 block of a coding block. In VVC, the selection of coefficient group sizes becomes dependent upon transform block (TB) size only, which removes the dependency on channel type. As a consequence, various CGs (1x16, 2x8, 8x2, 2x4, 4x2 and 16x1) become available. The CGs inside a coding block, and the transform coefficients within a CG, are coded according to pre-defined scan orders. To restrict the maximum number of context-coded bins per pixel, the area of the TB and the color component are used to derive the maximum number of context-coded bins for a TB. For a luma TB, the maximum number of context-coded bins is equal to TB_zosize * 1.75. For a chroma TB, the maximum number of context-coded bins (CCB) is equal to TB_zosize * 1.25. Here, TB_zosize indicates the number of samples within a TB after coefficient zero-out. Note that the coded_sub_block_flag in transform skip residual mode is not considered for the CCB count. Unlike HEVC, where residual coding is designed for the statistics and signal characteristics of transform coefficient levels, two separate residual coding structures are employed for transform coefficients and transform skip coefficients, respectively.
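As a minimal sketch of the context-coded-bin budget described above (the function name is illustrative, not taken from the VVC specification):

```python
def max_context_coded_bins(tb_zosize: int, is_luma: bool) -> int:
    """Cap on context-coded bins for a TB, per the 1.75/1.25 factors above.

    tb_zosize is the number of samples in the TB after coefficient zero-out.
    """
    factor = 1.75 if is_luma else 1.25
    return int(tb_zosize * factor)
```

For example, a 16x16 luma TB with no zero-out (tb_zosize = 256) allows 448 context-coded bins, while the corresponding chroma TB allows 320.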

2.1.1 Residual coding for transform coefficients

[0033] FIG. 1 is an example residual coding structure for transform blocks. In transform coefficient coding, a variable, remBinsPass1, is first set to the maximum number of context-coded bins and is decreased by one when a context-coded bin is signaled. While remBinsPass1 is larger than or equal to four, the first coding pass, which includes the sig_coeff_flag, abs_level_gt1_flag, par_level_flag, and abs_level_gt3_flag, is coded by using context-coded bins. If the number of context-coded bins is not greater than Mccb in the first pass coding, the rest of the level information, which is indicated to be further coded in the first pass, is coded with the syntax element abs_remainder by using Golomb-Rice code and bypass-coded bins. When remBinsPass1 becomes smaller than 4 while coding the first pass, the rest of the coefficients, which are indicated to be further coded in the first pass, are coded with the syntax element abs_remainder, and coefficients which are not coded in the first pass are directly coded in the second pass with the syntax element dec_abs_level by using Golomb-Rice code and bypass-coded bins as depicted in FIG. 1. The remBinsPass1 is reset for every TB. The transition from using context-coded bins for the sig_coeff_flag, abs_level_gt1_flag, par_level_flag, and abs_level_gt3_flag to using bypass-coded bins for the remaining coefficients happens at most once per TB. For a coefficient subblock, if remBinsPass1 is smaller than 4, the entire coefficient subblock is coded by using bypass-coded bins. After all the above-mentioned level coding, the signs (sign_flag) for all scan positions with sig_coeff_flag equal to 1 are finally bypass coded. The unified (same) Rice parameter (ricePar) derivation is used for Pass 2 and Pass 3. The only difference is that baseLevel is set to 4 and 0 for Pass 2 and Pass 3, respectively. The Rice parameter is determined not only based on the sum of absolute levels of the neighboring five transform coefficients in the local template, but the corresponding base level is also taken into consideration as follows: RicePara = RiceParTable[ max( min( 31, sumAbs - 5 * baseLevel ), 0 ) ]
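The Rice parameter derivation above can be sketched as follows; the table contents passed in are a placeholder, since the actual RiceParTable values are defined in the VVC specification:

```python
def rice_parameter(sum_abs: int, base_level: int, rice_par_table) -> int:
    """Derive the Rice parameter from the local-template sum and base level.

    The index is clamped into [0, 31] before the table lookup, matching
    RicePara = RiceParTable[ max( min( 31, sumAbs - 5 * baseLevel ), 0 ) ].
    """
    idx = max(min(31, sum_abs - 5 * base_level), 0)
    return rice_par_table[idx]
```

With baseLevel = 4 (Pass 2), a large local sum still saturates the index at 31, while small sums clamp to 0.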

[0034] After the termination of the first subblock coding pass, the absolute value of each of the remaining yet-to-be-coded coefficients is coded by the syntax element dec_abs_level, which corresponds to a modified absolute level value with the zero-level value being conditionally mapped to a nonzero value. At the encoder side, the value of the syntax element dec_abs_level is derived from the absolute level (absLevel), the dependent quantizer state (QState), and the value of the Rice parameter (RicePara) as follows:

ZeroPos = ( QState < 2 ? 1 : 2 ) << RicePara
if( absLevel == 0 )
    dec_abs_level = ZeroPos
else
    dec_abs_level = ( absLevel <= ZeroPos ) ? ( absLevel - 1 ) : absLevel

2.1.2 Context modeling for coefficient coding

[0035] The selection of probability models for the syntax elements related to absolute values of transform coefficient levels depends on the values of the absolute levels or partially reconstructed absolute levels in a local neighbourhood. FIG. 2 is an example illustration of a template used for selecting probability models. In FIG. 2, the black square specifies the current scan position, and the gray squares represent the local neighbourhood used.

[0036] The selected probability models depend on the sum of the absolute levels (or partially reconstructed absolute levels) in a local neighbourhood and the number of absolute levels greater than 0 (given by the number of sig_coeff_flag values equal to 1) in the local neighbourhood. The context modelling and binarization depend on the following measures for the local neighbourhood:

- numSig: the number of non-zero levels in the local neighbourhood;

- sumAbs1: the sum of partially reconstructed absolute levels (absLevel1) after the first pass in the local neighbourhood;

- sumAbs: the sum of reconstructed absolute levels in the local neighbourhood;

- diagonal position (d): the sum of the horizontal and vertical coordinates of a current scan position inside the transform block.

[0037] Based on the values of numSig, sumAbs1, and d, the probability models for coding sig_flag, par_flag, gt1_flag, and gt2_flag are selected. The Rice parameter for binarizing abs_remainder is selected based on the values of sumAbs and numSig.

[0038] In VVC, reduced 32-point MTS (RMTS32), based on skipping high-frequency coefficients, is used to reduce the computational complexity of the 32-point discrete sine transform (DST)-7 / discrete cosine transform (DCT)-8. It is accompanied by coefficient coding changes that consider all types of zero-out (i.e., RMTS32 and the existing zero-out for high-frequency components in DCT2). Specifically, the binarization of the last non-zero coefficient position is coded based on the reduced TU size, and the context model selection for the last non-zero coefficient position coding is determined by the original TU size. In addition, 60 context models are used to encode the sig_coeff_flag of transform coefficients. The selection of the context model index is based on a sum of a maximum of five previously partially reconstructed absolute levels, called locSumAbsPass1, as follows:

- If cIdx is equal to 0, ctxInc is derived as follows:

ctxInc = 12 * Max( 0, QState - 1 ) + Min( ( locSumAbsPass1 + 1 ) >> 1, 3 ) + ( d < 2 ? 8 : ( d < 5 ? 4 : 0 ) )

- Otherwise (cIdx is greater than 0), ctxInc is derived as follows:

ctxInc = 36 + 8 * Max( 0, QState - 1 ) + Min( ( locSumAbsPass1 + 1 ) >> 1, 3 ) + ( d < 2 ? 4 : 0 )
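The two branches above can be sketched in Python. This is a hedged illustration only; the function name sig_ctx_inc and its plain-integer arguments are ours, not spec names.

```python
# Illustrative sketch of the sig_coeff_flag context-index derivation above.
# c_idx is the colour component index, q_state the dependent-quantization
# state, loc_sum_abs_pass1 the clipped neighbour sum, and d the diagonal
# position; this mirrors the two formulas, not the full VVC clause.

def sig_ctx_inc(c_idx: int, q_state: int, loc_sum_abs_pass1: int, d: int) -> int:
    base = min((loc_sum_abs_pass1 + 1) >> 1, 3)
    if c_idx == 0:  # luma
        return 12 * max(0, q_state - 1) + base + (8 if d < 2 else (4 if d < 5 else 0))
    # chroma
    return 36 + 8 * max(0, q_state - 1) + base + (4 if d < 2 else 0)
```

For example, a luma coefficient at the DC position (d = 0) with state 0 and no significant neighbours maps to context index 8.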

2.2 Dependent Quantization

[0039] In addition, the same HEVC scalar quantization is used with a concept called dependent scalar quantization. FIG. 3 is an example illustration of the two scalar quantizers used in an approach of dependent quantization. Dependent scalar quantization refers to an approach in which the set of admissible reconstruction values for a transform coefficient depends on the values of the transform coefficient levels that precede the current transform coefficient level in reconstruction order. The main effect of this approach is that, in comparison to the conventional independent scalar quantization used in HEVC, the admissible reconstruction vectors are packed more densely in the N-dimensional vector space (N represents the number of transform coefficients in a transform block). That means that, for a given average number of admissible reconstruction vectors per N-dimensional unit volume, the average distortion between an input vector and the closest reconstruction vector is reduced. The approach of dependent scalar quantization is realized by: (a) defining two scalar quantizers with different reconstruction levels and (b) defining a process for switching between the two scalar quantizers.

[0040] The two scalar quantizers used, denoted by Q0 and Q1, are illustrated in FIG. 3. The location of the available reconstruction levels is uniquely specified by a quantization step size Δ. The scalar quantizer used (Q0 or Q1) is not explicitly signalled in the bitstream. Instead, the quantizer used for a current transform coefficient is determined by the parities of the transform coefficient levels that precede the current transform coefficient in coding/reconstruction order.

[0041] FIG. 4 is an example state transition and quantizer selection for dependent quantization. As illustrated in FIG. 4, the switching between the two scalar quantizers (Q0 and Q1) is realized via a state machine with four states. The state can take four different values: 0, 1, 2, 3. It is uniquely determined by the parities of the transform coefficient levels preceding the current transform coefficient in coding/reconstruction order. At the start of the inverse quantization for a transform block, the state is set equal to 0. The transform coefficients are reconstructed in scanning order (e.g., in the same order they are entropy decoded). After a current transform coefficient is reconstructed, the state is updated as shown in FIG. 4, where k denotes the value of the transform coefficient level.

2.2.1 Dependent Quantization with 8-states

[0042] In ECM, the coding efficiency of trellis-coded quantization in VVC is increased by increasing the number of quantization states (at the cost of a higher encoder complexity). Dependent quantization with 8 quantization states, in addition to the variant of dependent quantization with 4 quantization states, is supported (JVET-Q0243).

[0043] For supporting both variants of dependent quantization (4 and 8 states) in a unified framework, the decoding process for the VVC variant of dependent quantization is re-written.

[0044] The state transition table (sec. 7.4.12.11 in VVC) is modified from

QStateTransTable[ ][ ] = { { 0, 2 }, { 2, 0 }, { 1, 3 }, { 3, 1 } } to

QStateTransTable[ ][ ] = { { 0, 1 }, { 2, 3 }, { 1, 0 }, { 3, 2 } }
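As a minimal sketch (Python, illustrative only), the state update is a table lookup indexed by the parity of the decoded level k:

```python
# Sketch of the 4-state dependent-quantization state machine above. Both
# tables are transcribed from the text; next_state selects the successor
# state from the parity of the transform coefficient level k.

VVC_TABLE = [[0, 2], [2, 0], [1, 3], [3, 1]]        # original (sec. 7.4.12.11)
UNIFIED_TABLE = [[0, 1], [2, 3], [1, 0], [3, 2]]    # re-written table

def next_state(table, state: int, k: int) -> int:
    return table[state][k & 1]
```

Starting from state 0, decoding an odd level moves the original machine to state 2, while the re-written (relabelled) machine moves to state 1.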

[0045] There are three aspects that depend on the quantization state QState: (a) the mapping of transmitted transform coefficient levels to intermediate quantization indexes (part of the dequantization specified in the syntax); (b) the context selection for the sig_coeff_flag; (c) the derivation of the mapping parameter ZeroPos[ ] for transform coefficient levels coded in bypass mode. All three aspects are re-written in order to reflect the swapping of quantization states:

(a) The mapping of transmitted transform coefficient levels to intermediate quantization indexes (see syntax structure residual_coding() in VVC) is modified from

TransCoeffLevel[ x0 ][ y0 ][ cIdx ][ xC ][ yC ] =

( 2 * AbsLevel[ xC ][ yC ] - ( QState > 1 ? 1 : 0 ) ) * ( 1 - 2 * coeff_sign_flag[ n ] ) to

TransCoeffLevel[ x0 ][ y0 ][ cIdx ][ xC ][ yC ] =

( 2 * AbsLevel[ xC ][ yC ] - ( QState & 1 ) ) * ( 1 - 2 * coeff_sign_flag[ n ] )

(b) The context selection of the sig_coeff_flag (see sec. 9.3.4.2.8 in VVC) depends on a parameter (context set id) that is derived based on the quantization state. In VVC, this parameter is given by

Max( 0, QState - 1 ). With the relabelling of the quantization states, this parameter can be derived according to ctxSetId[ QState & 3 ] with ctxSetId[ ] = { 0, 1, 0, 2 }.

It should be noted that for the 4-state version, the result of (QState & 3) is equal to QState. The masking is only required for the 8-state version of dependent quantization.

(c) The derivation of the mapping parameter ZeroPos[ ] for transform coefficient levels coded in bypass mode is modified from

ZeroPos[ n ] = ( QState < 2 ? 1 : 2 ) << cRiceParam to

ZeroPos[ n ] = ( 1 + ( QState & 1 ) ) << cRiceParam
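The two derivations can be checked against each other. The sketch below assumes the relabelling swaps states 1 and 2 (consistent with the re-written state table above); RELABEL and the function names are ours.

```python
# Hedged sketch verifying that the re-written ZeroPos derivation matches
# the original VVC one under a relabelling of the quantization states
# (assumed here to swap states 1 and 2).

def zero_pos_vvc(q_state: int, c_rice_param: int) -> int:
    return (1 if q_state < 2 else 2) << c_rice_param

def zero_pos_unified(q_state: int, c_rice_param: int) -> int:
    return (1 + (q_state & 1)) << c_rice_param

RELABEL = [0, 2, 1, 3]  # assumed map: old VVC state -> re-labelled state
```

With this map, zero_pos_vvc(s, c) equals zero_pos_unified(RELABEL[s], c) for every state s and Rice parameter c.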

2.3 Multiple transform selection (MTS) for core transform

[0046] In addition to DCT-II, which has been employed in HEVC, a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. MTS uses multiple transforms selected from DCT8/DST7. The introduced transform matrices are DST-VII and DCT-VIII. Table I shows the basis functions of the selected DST/DCT.

[0047] In order to keep the orthogonality of the transform matrices, the transform matrices are quantized more accurately than the transform matrices in HEVC. To keep the intermediate values of the transformed coefficients within the 16-bit range, after the horizontal and after the vertical transform, all coefficients are constrained to 10 bits.

[0048] In order to control the MTS scheme, separate enabling flags are specified at the SPS level for intra and inter, respectively. When MTS is enabled at the SPS, a CU-level flag is signaled to indicate whether MTS is applied or not. Here, MTS is applied only for luma. The MTS signaling is skipped when one of the conditions below applies:

- The position of the last significant coefficient for the luma TB is less than 1 (i.e., direct current (DC) only);

- The last significant coefficient of the luma TB is located inside the MTS zero-out region.

[0049] If the MTS CU flag is equal to zero, then DCT2 is applied in both directions. However, if the MTS CU flag is equal to one, then two other flags are additionally signaled to indicate the transform type for the horizontal and vertical directions, respectively. The transform and signaling mapping is shown in Table II. A unified transform selection for intra-subblock partitioning (ISP) and implicit MTS is used by removing the intra-mode and block-shape dependencies. If the current block is in ISP mode, or if the current block is an intra block and both intra and inter explicit MTS are on, then only DST7 is used for both horizontal and vertical transform cores. When it comes to transform matrix precision, 8-bit primary transform cores are used. Therefore, all the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point, and 32-point DCT-2. Also, other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point, and 32-point DST-7 and DCT-8, use 8-bit primary transform cores.

Table II: Transform and signaling mapping table

[0050] To reduce the complexity of large-size DST-7 and DCT-8, high-frequency transform coefficients are zeroed out for the DST-7 and DCT-8 blocks with size (width or height, or both width and height) equal to 32. Only the coefficients within the 16x16 lower-frequency region are retained.

[0051] As in HEVC, the residual of a block can be coded with transform skip mode. To avoid redundancy of syntax coding, the transform skip flag is not signalled when the CU-level MTS_CU_flag is not equal to zero. Note that the implicit MTS transform is set to DCT2 when LFNST or Matrix-based Intra Prediction (MIP) is activated for the current CU. Also, implicit MTS can still be enabled when MTS is enabled for inter coded blocks.

2.3.1 Enhanced MTS for intra coding

[0052] In the VVC design [1] for MTS, only the DST7 and DCT8 transform kernels are utilized, and they are used for both intra and inter coding.

[0053] Additional primary transforms including DCT5, DST4, DST1, and the identity transform (IDT) are employed. Also, the MTS set is made dependent on the TU size and intra-mode information. 16 different TU sizes are considered, and for each TU size, 5 different classes are considered depending on intra-mode information. For each class, 1, 4, or 6 different transform pairs are considered. The number of intra MTS candidates is adaptively selected (between 1, 4, and 6 MTS candidates) depending on the sum of absolute values of the transform coefficients. The sum is compared against two fixed thresholds to determine the total number of allowed MTS candidates:

1 candidate: sum <= th0

4 candidates: th0 < sum <= th1

6 candidates: sum > th1

[0054] Note, although a total of 80 different classes are considered, some of those different classes often share exactly the same transform set. So there are 58 (fewer than 80) unique entries in the resultant look-up table (LUT). For angular modes, a joint symmetry over TU shape and intra prediction is considered. So, a mode i (i > 34) with TU shape AxB will be mapped to the same class corresponding to the mode j = (68 - i) with TU shape BxA. However, for each transform pair, the order of the horizontal and vertical transform kernels is swapped. For example, a 16x4 block with mode 18 (horizontal prediction) and a 4x16 block with mode 50 (vertical prediction) are mapped to the same class; however, the vertical and horizontal transform kernels are swapped. For the wide-angle modes, the nearest conventional angular mode is used for the transform set determination. For example, mode 2 is used for all the modes between -2 and -14. Similarly, mode 66 is used for mode 67 to mode 80.
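The threshold rule above can be sketched as follows. th0 and th1 are placeholder thresholds here; the actual fixed values are constants not given in this text.

```python
# Hedged sketch of the intra-MTS candidate-count selection described above:
# the sum of absolute transform coefficient values is compared against two
# fixed thresholds th0 < th1 (placeholder values, passed in explicitly).

def num_intra_mts_candidates(sum_abs_coeff: int, th0: int, th1: int) -> int:
    if sum_abs_coeff <= th0:
        return 1
    if sum_abs_coeff <= th1:
        return 4
    return 6
```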

2.4 Low-Frequency non-separable transform (LFNST)

[0055] FIG. 5 is an example LFNST process. In VVC, LFNST is applied between the forward primary transform and quantization (at the encoder) and between de-quantization and the inverse primary transform (at the decoder side), as shown in FIG. 5. In LFNST, a 4x4 non-separable transform or an 8x8 non-separable transform is applied according to block size. For example, the 4x4 LFNST is applied for small blocks (i.e., min (width, height) < 8) and the 8x8 LFNST is applied for larger blocks (i.e., min (width, height) > 4).

[0056] Application of a non-separable transform, which is being used in LFNST, is described as follows using input as an example. To apply the 4x4 LFNST, the 4x4 input block X is first represented as a vector:

X = [ x00 x01 x02 x03 x10 x11 x12 x13 x20 x21 x22 x23 x30 x31 x32 x33 ]^T

[0057] The non-separable transform is calculated as F = T · X, where F indicates the transform coefficient vector, and T is a 16x16 transform matrix. The 16x1 coefficient vector F is subsequently re-organized as a 4x4 block using the scanning order for that block (horizontal, vertical, or diagonal). The coefficients with smaller index will be placed with the smaller scanning index in the 4x4 coefficient block.

2.4.1 Reduced Non-separable transform

[0058] LFNST (low-frequency non-separable transform) is based on a direct matrix multiplication approach to apply the non-separable transform so that it is implemented in a single pass without multiple iterations. However, the non-separable transform matrix dimension needs to be reduced to minimize the computational complexity and the memory space to store the transform coefficients. Hence, a reduced non-separable transform (or reduced secondary transform (RST)) method is used in LFNST. The main idea of the reduced non-separable transform is to map an N-dimensional vector (N is commonly equal to 64 for the 8x8 non-separable secondary transform (NSST)) to an R-dimensional vector in a different space, where N/R (R < N) is the reduction factor. Hence, instead of an NxN matrix, the RST matrix becomes an RxN matrix, where the R rows of the transform are R bases of the N-dimensional space. The inverse transform matrix for RST is the transpose of its forward transform. For the 8x8 LFNST, a reduction factor of 4 is applied, and the 64x64 direct matrix, which is the conventional 8x8 non-separable transform matrix size, is reduced to a 16x48 direct matrix. Hence, the 48x16 inverse RST matrix is used at the decoder side to generate core (primary) transform coefficients in the 8x8 top-left region. When 16x48 matrices are applied instead of 16x64 with the same transform set configuration, each of them takes 48 input data from three 4x4 blocks in a top-left 8x8 block, excluding the right-bottom 4x4 block. With the help of the reduced dimension, memory usage for storing all LFNST matrices is reduced from 10 kilobytes (KB) to 8 KB with a reasonable performance drop. In order to reduce complexity, LFNST is restricted to be applicable only if all coefficients outside the first coefficient sub-group are non-significant. Hence, all primary-only transform coefficients have to be zero when LFNST is applied. This allows a conditioning of the LFNST index signalling on the last-significant position, and hence avoids the extra coefficient scanning in the current LFNST design, which would otherwise be needed for checking for significant coefficients at specific positions. The worst-case handling of LFNST (in terms of multiplications per pixel) restricts the non-separable transforms for 4x4 and 8x8 blocks to 8x16 and 8x48 transforms, respectively. In those cases, the last-significant scan position has to be less than 8 when LFNST is applied; for other sizes, less than 16.
For blocks with a shape of 4xN or Nx4 with N > 8, the proposed restriction implies that the LFNST is now applied only once, and only to the top-left 4x4 region. As all primary-only coefficients are zero when LFNST is applied, the number of operations needed for the primary transforms is reduced in such cases. From the encoder perspective, the quantization of coefficients is remarkably simplified when LFNST transforms are tested. A rate-distortion optimized quantization has to be done at most for the first 16 coefficients (in scan order); the remaining coefficients are enforced to be zero.
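The reduced-transform mapping described above can be sketched as a plain matrix multiply. This is a toy illustration (with an arbitrary R x N matrix, not an actual LFNST kernel); the function names are ours.

```python
# Illustrative sketch of the reduced (non-separable) transform idea above:
# an R x N matrix maps an N-dimensional coefficient vector to R output
# coefficients (reduction factor N/R), and the inverse transform uses the
# transpose of the forward matrix.

def forward_rst(T, X):
    # F = T · X, with T an R x N matrix and X an N-dimensional vector
    return [sum(row[j] * X[j] for j in range(len(X))) for row in T]

def inverse_rst(T, F):
    # transpose(T) · F maps the R coefficients back into the N-dim space
    n = len(T[0])
    return [sum(T[i][j] * F[i] for i in range(len(T))) for j in range(n)]
```

With R = 4 and N = 16, forward_rst compresses 16 inputs into 4 coefficients, and inverse_rst re-expands them into the 16-dimensional space.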

2.4.2 LFNST extension with large kernel

[0059] The LFNST design in VVC is extended as follows:

• The number of LFNST sets (S) and candidates (C) is extended to S=35 and C=3, and the LFNST set (lfnstTrSetIdx) for a given intra mode (predModeIntra) is derived according to the following formula:
o For predModeIntra < 2, lfnstTrSetIdx is equal to 2
o lfnstTrSetIdx = predModeIntra, for predModeIntra in [0,34]
o lfnstTrSetIdx = 68 - predModeIntra, for predModeIntra in [35,66]

• Three different kernels, LFNST4, LFNST8, and LFNST16, are defined to indicate LFNST kernel sets, which are applied to 4xN/Nx4 (N>4), 8xN/Nx8 (N>8), and MxN (M, N>16), respectively.
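The set-index derivation listed above can be sketched as follows. This is a hedged illustration that applies the three conditions in the listed order (the first matching rule wins, which is an assumption, since modes 0 and 1 also fall in [0,34]).

```python
# Illustrative sketch of the lfnstTrSetIdx derivation above, applying the
# three listed conditions in order (first match wins).

def lfnst_tr_set_idx(pred_mode_intra: int) -> int:
    if pred_mode_intra < 2:
        return 2
    if pred_mode_intra <= 34:
        return pred_mode_intra
    return 68 - pred_mode_intra  # modes in [35, 66]
```

Note the symmetry: mode 50 (vertical) maps to the same set as mode 18 (horizontal), since 68 - 50 = 18.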

[0060] The kernel dimensions are specified by:

(LFNST4, LFNST8*, LFNST16*) = (16x16, 32x64, 32x96)

[0061] FIG. 6 is an example ROI for LFNST16. The forward LFNST is applied to the top-left low-frequency region, which is called the ROI. When LFNST is applied, primary-transformed coefficients that exist in the region other than the ROI are zeroed out, which is not changed from the VVC standard. The ROI for LFNST16 is depicted in FIG. 6. It comprises six 4x4 sub-blocks, which are consecutive in scan order. Since the number of input samples is 96, the transform matrix for forward LFNST16 can be Rx96. R is chosen to be 32 in this contribution, so 32 coefficients (two 4x4 sub-blocks) are generated from forward LFNST16 accordingly, which are placed following the coefficient scan order.

[0062] FIG. 7 is an example ROI for LFNST8. The forward LFNST8 matrix can be Rx64, and R is chosen to be 32. The generated coefficients are located in the same manner as with LFNST16.

[0063] The mapping from intra prediction modes to these sets is shown in Table III.

Table III: Mapping of intra prediction modes to LFNST set index

2.5 Sign prediction

[0064] The basic idea of the coefficient sign prediction method (JVET-D0031 and JVET-J0021) is to calculate the reconstructed residual for both negative and positive sign combinations for applicable transform coefficients and select the hypothesis that minimizes a cost function.

[0065] FIG. 8 is an example discontinuity measure. To derive the best signs, the cost function is defined as a discontinuity measure across the block boundary, shown in FIG. 8. It is measured for all hypotheses, and the one with the smallest cost is selected as a predictor for the coefficient signs.

[0066] The cost function is defined as a sum of absolute second derivatives in the residual domain for the above row and left column as follows:

cost = Σ | -R_-1 + 2R_0 - P_1 - r_1 |, summed over the above row and the left column,

where R is the reconstructed neighbors, P is the prediction of the current block, and r is the residual hypothesis. The term (-R_-1 + 2R_0 - P_1) can be calculated only once per block, and only the residual hypothesis is subtracted.
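A hedged sketch of this cost, assuming per-sample second derivatives along the above row and left column; the helper names (precompute_fixed, boundary_cost) are ours.

```python
# Illustrative sketch of the boundary-discontinuity cost described above.
# The fixed terms (-R_-1 + 2*R_0 - P_1) are computed once per block; each
# hypothesis evaluation then only subtracts its residual boundary samples.

def precompute_fixed(R_m1, R_0, P_1):
    # per-sample fixed part (-R_-1 + 2*R_0 - P_1), computed once per block
    return [-a + 2 * b - c for a, b, c in zip(R_m1, R_0, P_1)]

def boundary_cost(fixed_top, fixed_left, r_top, r_left):
    # sum of absolute second derivatives over top row and left column
    cost = sum(abs(f - r) for f, r in zip(fixed_top, r_top))
    cost += sum(abs(f - r) for f, r in zip(fixed_left, r_left))
    return cost
```

A residual hypothesis that exactly continues the neighbouring samples drives the cost to zero.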

[0067] The transform coefficients with the largest K qIdx values of the top-left 4x4 area are selected. The qIdx value is the transform coefficient level after compensating the impact from the multiple quantizers in DQ. A larger qIdx value will produce a larger de-quantized transform coefficient level. qIdx is derived as follows:

qIdx = (abs(level) << 1) - (state & 1);

where level is the transform coefficient level parsed from the bitstream and state is a variable maintained by the encoder and decoder in DQ.
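The qIdx derivation above translates directly into code (an illustrative one-liner; the function name is ours):

```python
# Sketch of the qIdx derivation above; level is the parsed transform
# coefficient level and state the dependent-quantization state variable.

def q_idx(level: int, state: int) -> int:
    return (abs(level) << 1) - (state & 1)
```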

[0068] The sign prediction area was extended to a maximum of 32x32. Signs of the top-left MxN block are predicted. The values of M and N are computed as follows:
o M = min(w, maxW)
o N = min(h, maxH)
where w and h are the width and height of the transform block. The maximum area for sign prediction is not always set to 32x32. The encoder sets the maximum area (maxW, maxH) based on configuration, sequence class, and QP, and signals the area in the SPS.

[0069] The maximum number of predicted signs is kept unchanged. The sign prediction is also applied to LFNST blocks. For an LFNST block, a maximum of 4 coefficients in the top-left 4x4 area are allowed to be sign predicted.

3. Technical problems solved by disclosed technical solutions

[0070] There are several parts of the residual coding / sign prediction that may be improved. In some designs, there is no explicit prediction for the coefficient values. Further, in some designs the cost defined in the sign prediction does not use all the available information.

4. A listing of solutions and embodiments

[0071] To solve the above-described problem, methods as summarized below are disclosed. The items should be considered as examples to explain the general concepts and should not be interpreted in a narrow way. Furthermore, these examples can be applied individually or combined in any manner.

[0072] Examples 1-6 relate to Coefficient value prediction based on a cost.

Example 1

[0073] In an example, the value of at least one residual coefficient may be predicted based on a cost.

[0074] The coefficient may be used in transform coding or transform-skip coding.

[0075] In one example only the value of the DC coefficient may be predicted.

[0076] In one example the value of the first N coefficients may be predicted. N may be any integer number such as 1, 2, 3, 10, 100, ...

[0077] In one example the first N coefficients may be determined based on the raster scan order.

[0078] In one example the first N coefficients may be determined based on the diagonal scan order.

[0079] In one example the first N coefficients may be determined based on the vertical/horizontal scan order.

[0080] In one example the first N coefficients may be determined based on dividing the coefficients into subblocks and any combination of the scan order for the subblocks and any scan order for the coefficients inside of the subblocks.

[0081] In one example whether to and/or how to predict a coefficient may depend on coding information. The coding information may comprise: the partial value of the reconstructed coefficient; the parity; the surrounding neighboring values; the block size, e.g. CU, PU, TU sizes; the prediction mode used for that block, which may depend on whether the block is inter coded or intra coded, on the intra direction value, and/or on the type of the inter prediction used for that block; MTS index values; LFNST index values; block partitioning type; transform skip flag; quantization parameter (QP); and/or color components and/or color format.

[0082] In one example N coefficients at positions p1, p2, ..., pN may be predicted. For example, p1, ..., pN may be any non-negative numbers such as 0, 3, 11, ...

Example 2

[0083] In an example, information related to the coefficient value may be derived from the cost derivation process.

[0084] In one example the full coefficient may be derived from the process.

[0085] In one example a prediction of the coefficient may be derived from the process, and this prediction may be added to the coefficient to get the final one.

[0086] In one example a scaling factor may be derived from the process, and the coefficient may be multiplied/divided by this scaling factor to get the final one.

[0087] In one example the modulo-T information may be derived, and the final coefficient value may be T*coeff + t, where T may be any positive integer and t any integer between 0 and T-1. T may be 2, 3, 4, 10, ..., or any other positive integer. In one example T = 2, and the process may determine the parity of the coefficient.
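The modulo-T idea can be sketched as follows (illustrative only; split and reconstruct are hypothetical helper names, with T = 2 reducing the derived remainder t to the parity):

```python
# Hedged sketch of the modulo-T reconstruction above: the remainder t
# (coefficient value modulo T) is derived at the decoder, and the coded
# quotient is combined with it as T*quotient + t.

def split(coeff: int, T: int):
    # encoder-side counterpart: quotient to code, remainder to be predicted
    return coeff // T, coeff % T

def reconstruct(quotient: int, T: int, t: int) -> int:
    assert 0 <= t < T
    return T * quotient + t
```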

Example 3

[0088] In an example, the derived information related to the prediction value may be from a set of values.

[0089] In one example it may predict the prediction value from 2 fixed numbers: 0 and C; thus, the final coefficient value may be X or X + C, where X is the partially coded coefficient. In one example C may be any integer, such as -100, -10, 0, 3, 20, ...

[0090] In one example it may always use the prediction without any signaling; thus, depending on whether the prediction is 0 or C, the final value may be X or X + C, where X is the partially coded coefficient.

[0091] In one example a flag may be coded to indicate whether the prediction of 0/C was correct or not. If not, the opposite (C/0, respectively) will be added to X.

[0092] In one example there may be M prediction values in one set, wherein M may be larger than one. M may be any positive integer such as 2, 3, 5, 10, ...

[0093] In one example there may be more than one set, say N, where N could be any positive integer, and inside of each set there may be M_i candidates, for i from 1 to N. These N sets may be implicitly derived based on the surrounding information or may be signaled explicitly. M_i may be any positive integer.

[0094] In one example the M possible prediction values may be denoted as v1, ..., vM, and the best prediction, denoted as vK (1 <= K <= M), may be added to X to create the final coefficient X + vK, without any signaling.

[0095] In one example all the M possible predictions may be sorted based on a predefined cost, and an index may be signaled to choose which one is the correct prediction.

[0096] In one example this prediction derivation process may be applied after dependent quantization (at both encoder and decoder) and/or RDOQ (at encoder only) have finished, or simultaneously with their processes.

[0097] In one example the predefined prediction values may not be constant.

[0098] In one example the predefined prediction values may be a function of the surrounding coefficient values. In one example a function of the summation of the absolute value of the T surrounding neighbors may determine the predefined prediction values. In one example a function of the summation of the partial absolute value of the T surrounding neighbors may determine the predefined prediction values.

Example 4

[0099] In an example, an actual prediction value may be used to code/decode the coefficient value remainder.

[00100] In one example an accurate prediction P may be derived on both the encoder and decoder side. The encoder codes X = coeff - P, and the decoder decodes X and adds P to get the final coefficient value.
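A minimal round-trip sketch of this scheme, assuming the prediction P is derived identically on both sides (the derivation itself is out of scope here):

```python
# Illustrative sketch of the symmetric prediction scheme above: the encoder
# codes the remainder X = coeff - P, and the decoder restores coeff = X + P.

def encode_residual(coeff: int, prediction: int) -> int:
    return coeff - prediction        # X, the coded remainder

def decode_coeff(x: int, prediction: int) -> int:
    return x + prediction            # final coefficient value
```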

[00101] In one example this prediction derivation process may be applied after dependent quantization and/or RDOQ have finished.

[00102] In one example this prediction derivation process may be applied with the dependent quantization (DQ) and/or RDOQ process.

[00103] In one example an approximation of the prediction may be used. In one example the abs value of the prediction may be limited to C, where C is any positive number such as 1, 2, 3, 4, ... In one example the parity of the prediction may be always even or odd or derived from the partial coefficient value.

[00104] In one example a binary search style method may be used to find the prediction value.

[00105] In one example all the possible values with abs values less than C may be examined, and the one with the lowest cost may be used as the prediction.

Example 5

[00106] In an example, a partial prediction value may be used to predict a part of the coefficient.

[00107] In one example, information related to being 0 or not may be predicted.

[00108] In one example, after signaling greater than 0 flag, the remaining may be predicted.

[00109] In one example information related to being greater than 1 or not may be predicted.

[00110] In one example, after signaling greater than 1 flag, the remaining may be predicted.

[00111] In one example information related to greater than 2 flag or not may be predicted.

[00112] In one example, after signaling greater than 2 flag, the remaining may be predicted.

[00113] In one example any information in pass 2 (as described in section 2.1.1) of residual coding may be predicted.

[00114] In one example, any information in pass 3 (as described in section 2.1.1) of residual coding may be predicted.

[00115] In one example, any parts of the coefficient may be signaled, and the other parts may be predicted.

[00116] In one example, this partial prediction may have any form, such as deriving the prediction value from a set of values, or an actual prediction for that part.

Example 6

[00117] In an example, the cost for evaluating a coefficient value hypothesis or prediction (as described in the section for sign prediction) may be a function of at least a neighboring sample.

[00118] For example, the cost may be calculated as the difference between the partial reconstruction of the border samples in the current block and a corresponding reference. This corresponding reference may be derived from the neighboring block reconstruction. In one example, the partial reconstruction of the border samples and the corresponding reference may be adjacent. In one example, the partial reconstruction of the border samples and the corresponding reference have the same number of samples, and each reconstructed border sample has a corresponding reference sample. The difference can be calculated by comparing each pair of corresponding reconstructed border sample and reference sample.

[00119] In one example either one row or one column or both may be used as the partial reconstruction area.

[00120] In one example either K1 rows or K2 columns or both (K1 rows and K2 columns) may be used as the partial reconstruction area. K1 and K2 may be any integer number such as 1, 2, 3, ...

[00121] In one example different cost functions may be used to derive one hypothesis cost. In one example this cost may be the Sum of Absolute Differences (SAD) between the partial reconstruction and their references. In one example this cost may be the Sum of Absolute Transformed Differences (SATD) or any other cost measure between the partial reconstruction and their references. In one example this cost may be the Mean Removal based Sum of Absolute Differences (MR-SAD) between the template samples and their references. In one example this cost may be a weighted average of SAD/MR-SAD and SATD between the partial reconstruction and their references.

[00122] In one example, the cost function between the partial reconstruction and the reference template may be a sum of absolute differences (SAD) / mean-removal SAD (MR-SAD); sum of absolute transformed differences (SATD) / MR-SATD; sum of squared differences (SSD) / MR-SSD; sum square error (SSE) / MR-SSE; weighted SAD / weighted MR-SAD; weighted SATD / weighted MR-SATD; weighted SSD / weighted MR-SSD; weighted SSE / weighted MR-SSE; and/or gradient information.

[00123] The cost may consider the continuity (Boundary_SAD) between the reference template and reconstructed samples adjacently or non-adjacently neighboring the current template, in addition to the SAD calculated above. For example, reconstructed samples left and/or above, adjacently or non-adjacently neighboring the current template, are considered. In one example, the cost may be calculated based on SAD and Boundary_SAD. In one example, the cost may be calculated as (SAD + w*Boundary_SAD). w may be pre-defined, signaled, or derived according to decoded information.

[00124] Example 7 relates to MTS set derivation.

Example 7

[00125] In an example, the number of MTS candidates may depend on the coefficient characteristics.

[00126] In one example the number of MTS candidates may depend on the last significant coefficient position.

[00127] In one example the number of candidates for a last significant coefficient position between P_i and P_{i+1} may be K_i. P_i and K_i may be any non-negative numbers, where K_i <= K_{i+1}.

[00128] In one example the number of MTS candidates, and the context for coding the index, may depend on the sum of absolute values of the coefficients.

[00129] In one example the number of candidates for a sum of absolute values of the coefficients between P_i and P_{i+1} may be K_i. P_i and K_i may be any non-negative numbers, where K_i <= K_{i+1}.

[00130] In one example, the sum of the absolute values at some (not all) of the positions may be used for determining the number of MTS candidates and the context for coding the index. In one example, the DC position may not be used in the sum. In one example, only the coefficients at positions p1, p2, ..., pN may be used for the sum of absolute values, where each p_i may be any non-negative integer.

[00132] In one example, the number of MTS candidates, and the context for coding the index, may depend on the sum of partial absolute values of the coefficients. In one example, this partial sum may use min(abs(coeff), C), where C is a non-negative number such as 0, 2, 3, ...

[00133] In one example, any combination of the partial sum and the full sum, depending on the coefficient position and/or value, may be used for determining the number of MTS candidates and/or the context for coding the index.

[00134] In one example, the sum of min(abs(coeff_i), C_i) may be used for determining the number of MTS candidates and/or the context for coding the index. C_i may be any non-negative integer and may be different for each position p_i. In the corner case of C_i = 0, the coefficient value at position p_i is not used. In the corner case of C_i = MAX_INT, the coefficient value at position p_i is fully used.

[00135] In one example, any other function besides min may be used.
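
The clamped partial sum of [00132]-[00134] can be sketched as follows; the per-position caps C_i are illustrative placeholders:

```python
def clamped_coeff_sum(coeffs, caps):
    """Sum of min(abs(coeff_i), C_i) over the block.

    A cap of 0 ignores that position entirely; a very large cap
    (the MAX_INT corner case) uses the coefficient value fully.
    """
    return sum(min(abs(c), cap) for c, cap in zip(coeffs, caps))
```

For instance, with coefficients [5, -3, 2] and caps [2, 2, 0], the sum is min(5,2) + min(3,2) + min(2,0) = 4; the third position is excluded by its zero cap.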

[00136] Examples 8-11 relate to cost definition for sign prediction, coefficient value prediction, etc.

Example 8

[00137] In an example, the min function may be used to determine the predicted signs. In other words, if N signs are being predicted, there will be 2^N hypotheses. Going through all the 2^N costs and finding the minimum may determine the predicted signs.

[00138] In one example, the hypothesis with the lowest cost among all the 2^N costs may determine the predicted signs for all N signs.

[00139] In another example, the hypothesis with the lowest cost among all the 2^N costs may only determine the predicted signs for the first k signs, where k may be any integer number such as 1, 2, 3, up to N.

[00140] In another example, after coding reveals whether the prediction of the first k signs is correct or not, the hypotheses with incorrect signs may be thrown away, and the rest of the hypotheses may be used for predicting the remaining signs.

[00141] In one example, any combination of the previous two approaches may be used to determine the predicted signs.
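
The exhaustive 2^N-hypothesis search of Example 8 can be sketched as below. The cost function is passed in as a stand-in for the partial-reconstruction cost described earlier; all names are illustrative:

```python
from itertools import product

def predict_signs(magnitudes, cost_fn):
    """Try all 2**N sign combinations and return the lowest-cost one.

    magnitudes: absolute values of the N coefficients whose signs are predicted.
    cost_fn: maps a signed coefficient list to a hypothesis cost.
    """
    best_signs, best_cost = None, float("inf")
    for signs in product((1, -1), repeat=len(magnitudes)):
        cost = cost_fn([s * m for s, m in zip(signs, magnitudes)])
        if cost < best_cost:
            best_signs, best_cost = signs, cost
    return best_signs
```

Note the exponential growth in hypotheses: predicting N signs requires evaluating 2^N partial reconstructions, which is why the disclosure also considers limiting prediction to the first k signs or pruning hypotheses after the true signs of earlier coefficients are decoded.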

Example 9

[00142] In an example, a head-to-head min function may be used to determine the predicted signs.

[00143] In one example, the head-to-head min function in this approach may be defined as follows: after calculating the costs of the 2^N hypotheses, for the ith sign, 2^(N-1) hypotheses correspond to a negative ith sign and 2^(N-1) hypotheses correspond to a positive ith sign, with everything else (the remaining N-1 sign positions) identical. These 2^(N-1) negative and 2^(N-1) positive hypotheses may then be compared head-to-head, counting the number of times the negative/positive hypothesis has the lower cost.

[00144] In one example, whichever sign has the most head-to-head lower-cost wins may be chosen as the predicted sign.

[00145] In one example, this head-to-head min function may be applied to all 2^N hypotheses for all the signs.

[00146] In another example, after the actual value of the ith sign is known, the wrong hypotheses may be thrown away, and the head-to-head min function may be applied to the remaining hypotheses.

[00147] In another example, any combination of throwing out the wrong hypotheses or keeping them may be used to determine the predicted signs.
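
The head-to-head comparison of Example 9 can be sketched as follows. Ties are resolved toward the positive sign here, which is an arbitrary choice, not from this disclosure:

```python
from itertools import product

def head_to_head_signs(magnitudes, cost_fn):
    """For each sign i, pair each positive-ith-sign hypothesis with its
    negative-ith-sign partner (all other signs equal), and pick the sign
    that wins more of the 2**(N-1) pairwise matchups."""
    n = len(magnitudes)
    hyps = list(product((1, -1), repeat=n))
    costs = {h: cost_fn([s * m for s, m in zip(h, magnitudes)]) for h in hyps}
    predicted = []
    for i in range(n):
        pos_wins = 0
        for h in hyps:
            if h[i] == 1:
                partner = h[:i] + (-1,) + h[i + 1:]  # flip only the ith sign
                if costs[h] < costs[partner]:
                    pos_wins += 1
                elif costs[partner] < costs[h]:
                    pos_wins -= 1
        predicted.append(1 if pos_wins >= 0 else -1)
    return predicted
```

Unlike the plain min of Example 8, which commits to the single cheapest hypothesis, this vote over pairwise matchups is less sensitive to one outlier hypothesis having an unusually low cost.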

Example 10

[00148] In an example, any combination of the different cost definitions may be used to determine the predicted signs.

[00149] In one example, one cost function may be used for signs at positions p1, ..., pJ and another cost function for signs at positions q1, ..., qK. p1, ..., pJ and q1, ..., qK may be any integer numbers between 1 and N, and no two of them are the same. In one example, one set may use the min function and the other the head-to-head min function, and vice versa.

[00150] In one example, any combination of throwing out the wrong hypotheses or keeping them may be used in combination with any combination of different cost functions for each sign prediction.

[00151] In one example, even for determining the sign prediction for a single sign, different cost functions may be used. In one example, if based on one cost criterion both the positive-hypothesis and negative-hypothesis costs are smaller than a predefined threshold, or bigger than a predefined threshold, the next cost function may be used to determine which sign to predict. This may continue until the last cost function in the queue, or until a cost function satisfies the threshold criterion. In one example, a new cost may be defined based on the weighted cost difference between head-to-head hypotheses. In one example, instead of just comparing the positive hypothesis to the negative hypothesis and adding 1 or 0 to each side, w1 and w2 may be added to each side, where w1 and w2 may be any real numbers such as 0.3, 0.5, 1, ...
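
The cascade of cost functions described in [00151] might look like the following sketch; the margin threshold and the fall-back rule (defer to the last function's verdict) are illustrative assumptions:

```python
def decide_sign(cost_fns, pos_hyp, neg_hyp, margin=1.0):
    """Walk a queue of cost functions in order; return +1 or -1 as soon as
    one function separates the two hypotheses by more than `margin`.
    If no function is decisive, fall back to the last function's verdict.
    """
    for fn in cost_fns:
        cost_pos, cost_neg = fn(pos_hyp), fn(neg_hyp)
        if abs(cost_pos - cost_neg) > margin:
            return 1 if cost_pos < cost_neg else -1
    # no function was decisive: use the last one's (possibly tied) verdict
    return 1 if cost_pos <= cost_neg else -1
```

A cheap cost function (e.g., a boundary-only SAD) can thus be tried first, with a more expensive measure such as SATD consulted only when the cheap one is inconclusive.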

Example 11

[00152] In an example, any of the cost definitions, decision making, etc. used for sign prediction may be used for coefficient value prediction too.

[00153] In one example, the min function may be used to determine the coefficient value prediction.

[00154] In another example, the head-to-head min function may be used to determine the coefficient value prediction.

[00155] In another example, depending on coding side information related to the coefficient value prediction, wrong hypotheses may be thrown away.

[00156] In one example, any combination of throwing out the wrong hypotheses or keeping them may be used in combination with any combination of different cost functions for each prediction.

[00157] Example 12 relates to the selection of sign prediction and coefficient value prediction candidates.

Example 12

[00158] In an example, any combination of sign prediction and coefficient value prediction for the candidates may be applied.

[00159] In one example sign prediction may be applied on all the coefficients.

[00160] In another example sign prediction may be applied only on N signs.

[00161] In one example the first N signs based on a predefined scan order may be used for sign prediction.

[00162] In another example the first N signs based on coefficient magnitude may be used for the sign prediction.

[00163] In one example, only the signs of the coefficients at positions p1, ..., pN may be used for sign prediction.

[00164] In one example, only the coefficients at positions q1, ..., qM may be used for coefficient value prediction.

[00165] In one example, sign prediction and coefficient value prediction may be applied mutually exclusively for a coefficient, i.e., if a coefficient's sign has been predicted, its value would not be, and vice versa.

[00166] In one example, sign prediction and coefficient value prediction may both be applied for a coefficient.

[00167] In one example first all of the signs are predicted, then the coefficient values are predicted.

[00168] In another example, all of the coefficient values are predicted first, and then the signs are predicted.

[00169] In another example, any order or combination of predicting the signs and coefficient values may be applied.

[00170] Example 13 relates to residual coding passes.

Example 13

[00171] In an example, there may be differences in the process/passes (as described in section 2.1.1) used for residual coding.

[00172] In one example, different passes depending on the total number of context-coded bins may be used.

[00173] In another example, there may be no limitation on the number of context-coded bins, and thus there may not be different passes depending on the total number of context-coded bins used.

[00174] In one example, there may be prediction for the position of the 0 value or any other value.

[00175] In another example, there may not be any special treatment for the position of the 0 value or any other value.

[00176] Examples 14-15 relate to general coding concepts.

Example 14

[00177] In an example, whether to and/or how to apply the methods described above may be dependent on coded information.

[00178] In one example, the coded information may include block sizes, and/or temporal layers, and/or slice/picture types, color components, etc.

Example 15

[00179] In an example, whether to and/or how to apply the methods described above may be indicated in the bitstream.

[00180] The indication of enabling/disabling or which method to be applied may be signalled at sequence level, group of pictures level, picture level, slice level, and/or tile group level, such as in sequence header, picture header, sequence parameter set (SPS), video parameter set (VPS), dependency parameter set (DPS), decoding capability information (DCI), picture parameter set (PPS), adaptation parameter set (APS), slice header, and/or tile group header.

[00181] The indication of enabling/disabling or which method to be applied may be signaled at prediction block (PB), transform block (TB), coding block (CB), prediction unit (PU), transform unit (TU), coding unit (CU), virtual pipeline data unit (VPDU), coding tree unit (CTU), CTU row, slice, tile, sub-picture, and/or other kinds of regions that contain more than one sample or pixel.

5. References

[1] B. Bross, J. Chen, S. Liu, and Y.-K. Wang, "Versatile Video Coding (Draft 10)," document JVET-S2001, 19th JVET meeting: by teleconference, 22 June - 1 July 2020.

[2] J. Chen, Y. Ye, and S. Kim, "Algorithm description for Versatile Video Coding and Test Model 11 (VTM 11)," document JVET-T2002, 20th JVET meeting: by teleconference, 7 - 16 Oct. 2020.

[3] M. Coban, F. Le Léannec, K. Naser, and J. Ström, "Algorithm description of Enhanced Compression Model 5 (ECM 5)," document JVET-Z2025, 26th JVET meeting: by teleconference, 20 - 29 April 2022.

[00182] FIG. 9 is a block diagram showing an example video processing system 4000 in which various techniques disclosed herein may be implemented. Various implementations may include some or all of the components of the system 4000. The system 4000 may include input 4002 for receiving video content. The video content may be received in a raw or uncompressed format, e.g., 8 or 10 bit multi-component pixel values, or may be in a compressed or encoded format. The input 4002 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interface include wired interfaces such as Ethernet, passive optical network (PON), etc. and wireless interfaces such as Wi-Fi or cellular interfaces.

[00183] The system 4000 may include a coding component 4004 that may implement the various coding or encoding methods described in the present document. The coding component 4004 may reduce the average bitrate of video from the input 4002 to the output of the coding component 4004 to produce a coded representation of the video. The coding techniques are therefore sometimes called video compression or video transcoding techniques. The output of the coding component 4004 may be either stored, or transmitted via a communication connection, as represented by the component 4006. The stored or communicated bitstream (or coded) representation of the video received at the input 4002 may be used by a component 4008 for generating pixel values or displayable video that is sent to a display interface 4010. The process of generating user-viewable video from the bitstream representation is sometimes called video decompression. Furthermore, while certain video processing operations are referred to as "coding" operations or tools, it will be appreciated that the coding tools or operations are used at an encoder and corresponding decoding tools or operations that reverse the results of the coding will be performed by a decoder.

[00184] Examples of a peripheral bus interface or a display interface may include universal serial bus (USB) or high definition multimedia interface (HDMI) or DisplayPort, and so on. Examples of storage interfaces include serial advanced technology attachment (SATA), peripheral component interconnect (PCI), integrated drive electronics (IDE) interface, and the like. The techniques described in the present document may be embodied in various electronic devices such as mobile phones, laptops, smartphones or other devices that are capable of performing digital data processing and/or video display.

[00185] FIG. 10 is a block diagram of an example video processing apparatus 4100. The apparatus 4100 may be used to implement one or more of the methods described herein. The apparatus 4100 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 4100 may include one or more processors 4102, one or more memories 4104 and video processing circuitry 4106. The processor(s) 4102 may be configured to implement one or more methods described in the present document. The memory (memories) 4104 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing circuitry 4106 may be used to implement, in hardware circuitry, some techniques described in the present document. In some embodiments, the video processing circuitry 4106 may be at least partly included in the processor 4102, e.g., a graphics co-processor.

[00186] FIG. 11 is a flowchart for an example method 4200 of video processing. The method 4200 includes determining to predict a value of a residual coefficient based on a cost at step 4202. A conversion is performed between a visual media data and a bitstream based on the residual coefficient at step 4204. The conversion of step 4204 may include encoding at an encoder or decoding at a decoder, depending on the example.

[00187] It should be noted that the method 4200 can be implemented in an apparatus for processing video data comprising a processor and a non-transitory memory with instructions thereon, such as video encoder 4400, video decoder 4500, and/or encoder 4600. In such a case, the instructions, upon execution by the processor, cause the processor to perform the method 4200. Further, the method 4200 can be performed by a non-transitory computer readable medium comprising a computer program product for use by a video coding device. The computer program product comprises computer executable instructions stored on the non-transitory computer readable medium such that when executed by a processor cause the video coding device to perform the method 4200.

[00188] FIG. 12 is a block diagram that illustrates an example video coding system 4300 that may utilize the techniques of this disclosure. The video coding system 4300 may include a source device 4310 and a destination device 4320. Source device 4310, which may be referred to as a video encoding device, generates encoded video data. Destination device 4320, which may be referred to as a video decoding device, may decode the encoded video data generated by source device 4310.

[00189] Source device 4310 may include a video source 4312, a video encoder 4314, and an input/output (I/O) interface 4316. Video source 4312 may include a source such as a video capture device, an interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources. The video data may comprise one or more pictures. Video encoder 4314 encodes the video data from video source 4312 to generate a bitstream. The bitstream may include a sequence of bits that form a coded representation of the video data. The bitstream may include coded pictures and associated data. The coded picture is a coded representation of a picture. The associated data may include sequence parameter sets, picture parameter sets, and other syntax structures. I/O interface 4316 may include a modulator/demodulator (modem) and/or a transmitter. The encoded video data may be transmitted directly to destination device 4320 via I/O interface 4316 through network 4330. The encoded video data may also be stored onto a storage medium/server 4340 for access by destination device 4320.

[00190] Destination device 4320 may include an I/O interface 4326, a video decoder 4324, and a display device 4322. I/O interface 4326 may include a receiver and/or a modem. I/O interface 4326 may acquire encoded video data from the source device 4310 or the storage medium/server 4340. Video decoder 4324 may decode the encoded video data. Display device 4322 may display the decoded video data to a user. Display device 4322 may be integrated with the destination device 4320, or may be external to destination device 4320, which can be configured to interface with an external display device.

[00191] Video encoder 4314 and video decoder 4324 may operate according to a video compression standard, such as the High Efficiency Video Coding (HEVC) standard, the Versatile Video Coding (VVC) standard, and other current and/or future standards.

[00192] FIG. 13 is a block diagram illustrating an example of video encoder 4400, which may be video encoder 4314 in the system 4300 illustrated in FIG. 12. Video encoder 4400 may be configured to perform any or all of the techniques of this disclosure. The video encoder 4400 includes a plurality of functional components. The techniques described in this disclosure may be shared among the various components of video encoder 4400. In some examples, a processor may be configured to perform any or all of the techniques described in this disclosure.

[00193] The functional components of video encoder 4400 may include a partition unit 4401, a prediction unit 4402 which may include a mode select unit 4403, a motion estimation unit 4404, a motion compensation unit 4405, an intra prediction unit 4406, a residual generation unit 4407, a transform processing unit 4408, a quantization unit 4409, an inverse quantization unit 4410, an inverse transform unit 4411, a reconstruction unit 4412, a buffer 4413, and an entropy encoding unit 4414.

[00194] In other examples, video encoder 4400 may include more, fewer, or different functional components. In an example, prediction unit 4402 may include an intra block copy (IBC) unit. The IBC unit may perform prediction in an IBC mode in which at least one reference picture is a picture where the current video block is located.

[00195] Furthermore, some components, such as motion estimation unit 4404 and motion compensation unit 4405 may be highly integrated, but are represented in the example of video encoder 4400 separately for purposes of explanation.

[00196] Partition unit 4401 may partition a picture into one or more video blocks. Video encoder 4400 and video decoder 4500 may support various video block sizes.

[00197] Mode select unit 4403 may select one of the coding modes, intra or inter, e.g., based on error results, and provide the resulting intra or inter coded block to a residual generation unit 4407 to generate residual block data and to a reconstruction unit 4412 to reconstruct the encoded block for use as a reference picture. In some examples, mode select unit 4403 may select a combination of intra and inter prediction (CIIP) mode in which the prediction is based on an inter prediction signal and an intra prediction signal. Mode select unit 4403 may also select a resolution for a motion vector (e.g., a sub-pixel or integer pixel precision) for the block in the case of inter prediction.

[00198] To perform inter prediction on a current video block, motion estimation unit 4404 may generate motion information for the current video block by comparing one or more reference frames from buffer 4413 to the current video block. Motion compensation unit 4405 may determine a predicted video block for the current video block based on the motion information and decoded samples of pictures from buffer 4413 other than the picture associated with the current video block.

[00199] Motion estimation unit 4404 and motion compensation unit 4405 may perform different operations for a current video block, for example, depending on whether the current video block is in an I slice, a P slice, or a B slice.

[00200] In some examples, motion estimation unit 4404 may perform uni-directional prediction for the current video block, and motion estimation unit 4404 may search reference pictures of list 0 or list 1 for a reference video block for the current video block. Motion estimation unit 4404 may then generate a reference index that indicates the reference picture in list 0 or list 1 that contains the reference video block and a motion vector that indicates a spatial displacement between the current video block and the reference video block. Motion estimation unit 4404 may output the reference index, a prediction direction indicator, and the motion vector as the motion information of the current video block. Motion compensation unit 4405 may generate the predicted video block of the current block based on the reference video block indicated by the motion information of the current video block.

[00201] In other examples, motion estimation unit 4404 may perform bi-directional prediction for the current video block. Motion estimation unit 4404 may search the reference pictures in list 0 for a reference video block for the current video block and may also search the reference pictures in list 1 for another reference video block for the current video block. Motion estimation unit 4404 may then generate reference indexes that indicate the reference pictures in list 0 and list 1 containing the reference video blocks and motion vectors that indicate spatial displacements between the reference video blocks and the current video block. Motion estimation unit 4404 may output the reference indexes and the motion vectors of the current video block as the motion information of the current video block. Motion compensation unit 4405 may generate the predicted video block of the current video block based on the reference video blocks indicated by the motion information of the current video block.

[00202] In some examples, motion estimation unit 4404 may output a full set of motion information for decoding processing of a decoder. In some examples, motion estimation unit 4404 may not output a full set of motion information for the current video. Rather, motion estimation unit 4404 may signal the motion information of the current video block with reference to the motion information of another video block. For example, motion estimation unit 4404 may determine that the motion information of the current video block is sufficiently similar to the motion information of a neighboring video block.

[00203] In one example, motion estimation unit 4404 may indicate, in a syntax structure associated with the current video block, a value that indicates to the video decoder 4500 that the current video block has the same motion information as another video block.

[00204] In another example, motion estimation unit 4404 may identify, in a syntax structure associated with the current video block, another video block and a motion vector difference (MVD). The motion vector difference indicates a difference between the motion vector of the current video block and the motion vector of the indicated video block. The video decoder 4500 may use the motion vector of the indicated video block and the motion vector difference to determine the motion vector of the current video block.

[00205] As discussed above, video encoder 4400 may predictively signal the motion vector. Two examples of predictive signaling techniques that may be implemented by video encoder 4400 include advanced motion vector prediction (AMVP) and merge mode signaling.

[00206] Intra prediction unit 4406 may perform intra prediction on the current video block. When intra prediction unit 4406 performs intra prediction on the current video block, intra prediction unit 4406 may generate prediction data for the current video block based on decoded samples of other video blocks in the same picture. The prediction data for the current video block may include a predicted video block and various syntax elements.

[00207] Residual generation unit 4407 may generate residual data for the current video block by subtracting the predicted video block(s) of the current video block from the current video block. The residual data of the current video block may include residual video blocks that correspond to different sample components of the samples in the current video block.

[00208] In other examples, there may be no residual data for the current video block for the current video block, for example in a skip mode, and residual generation unit 4407 may not perform the subtracting operation.

[00209] Transform processing unit 4408 may generate one or more transform coefficient video blocks for the current video block by applying one or more transforms to a residual video block associated with the current video block.

[00210] After transform processing unit 4408 generates a transform coefficient video block associated with the current video block, quantization unit 4409 may quantize the transform coefficient video block associated with the current video block based on one or more quantization parameter (QP) values associated with the current video block.

[00211] Inverse quantization unit 4410 and inverse transform unit 4411 may apply inverse quantization and inverse transforms to the transform coefficient video block, respectively, to reconstruct a residual video block from the transform coefficient video block. Reconstruction unit 4412 may add the reconstructed residual video block to corresponding samples from one or more predicted video blocks generated by the prediction unit 4402 to produce a reconstructed video block associated with the current block for storage in the buffer 4413.

[00212] After reconstruction unit 4412 reconstructs the video block, the loop filtering operation may be performed to reduce video blocking artifacts in the video block.

[00213] Entropy encoding unit 4414 may receive data from other functional components of the video encoder 4400. When entropy encoding unit 4414 receives the data, entropy encoding unit 4414 may perform one or more entropy encoding operations to generate entropy encoded data and output a bitstream that includes the entropy encoded data.

[00214] FIG. 14 is a block diagram illustrating an example of video decoder 4500 which may be video decoder 4324 in the system 4300 illustrated in FIG. 12. The video decoder 4500 may be configured to perform any or all of the techniques of this disclosure. In the example shown, the video decoder 4500 includes a plurality of functional components. The techniques described in this disclosure may be shared among the various components of the video decoder 4500. In some examples, a processor may be configured to perform any or all of the techniques described in this disclosure.

[00215] In the example shown, video decoder 4500 includes an entropy decoding unit 4501, a motion compensation unit 4502, an intra prediction unit 4503, an inverse quantization unit 4504, an inverse transformation unit 4505, a reconstruction unit 4506, and a buffer 4507. Video decoder 4500 may, in some examples, perform a decoding pass generally reciprocal to the encoding pass described with respect to video encoder 4400.

[00216] Entropy decoding unit 4501 may retrieve an encoded bitstream. The encoded bitstream may include entropy coded video data (e.g., encoded blocks of video data). Entropy decoding unit 4501 may decode the entropy coded video data, and from the entropy decoded video data, motion compensation unit 4502 may determine motion information including motion vectors, motion vector precision, reference picture list indexes, and other motion information. Motion compensation unit 4502 may, for example, determine such information by performing the AMVP and merge mode.

[00217] Motion compensation unit 4502 may produce motion compensated blocks, possibly performing interpolation based on interpolation filters. Identifiers for interpolation filters to be used with sub-pixel precision may be included in the syntax elements.

[00218] Motion compensation unit 4502 may use interpolation filters as used by video encoder 4400 during encoding of the video block to calculate interpolated values for sub-integer pixels of a reference block. Motion compensation unit 4502 may determine the interpolation filters used by video encoder 4400 according to received syntax information and use the interpolation filters to produce predictive blocks.

[00219] Motion compensation unit 4502 may use some of the syntax information to determine sizes of blocks used to encode frame(s) and/or slice(s) of the encoded video sequence, partition information that describes how each macroblock of a picture of the encoded video sequence is partitioned, modes indicating how each partition is encoded, one or more reference frames (and reference frame lists) for each inter coded block, and other information to decode the encoded video sequence.

[00220] Intra prediction unit 4503 may use intra prediction modes, for example received in the bitstream, to form a prediction block from spatially adjacent blocks. Inverse quantization unit 4504 inverse quantizes, i.e., de-quantizes, the quantized video block coefficients provided in the bitstream and decoded by entropy decoding unit 4501. Inverse transform unit 4505 applies an inverse transform.

[00221] Reconstruction unit 4506 may sum the residual blocks with the corresponding prediction blocks generated by motion compensation unit 4502 or intra prediction unit 4503 to form decoded blocks. If desired, a deblocking filter may also be applied to filter the decoded blocks in order to remove blockiness artifacts. The decoded video blocks are then stored in buffer 4507, which provides reference blocks for subsequent motion compensation/intra prediction and also produces decoded video for presentation on a display device.

[00222] FIG. 15 is a schematic diagram of an example encoder 4600. The encoder 4600 is suitable for implementing the techniques of VVC. The encoder 4600 includes three in-loop filters, namely a deblocking filter (DF) 4602, a sample adaptive offset (SAO) 4604, and an adaptive loop filter (ALF) 4606. Unlike the DF 4602, which uses predefined filters, the SAO 4604 and the ALF 4606 utilize the original samples of the current picture to reduce the mean square errors between the original samples and the reconstructed samples by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients. The ALF 4606 is located at the last processing stage of each picture and can be regarded as a tool trying to catch and fix artifacts created by the previous stages.

[00223] The encoder 4600 further includes an intra prediction component 4608 and a motion estimation/compensation (ME/MC) component 4610 configured to receive input video. The intra prediction component 4608 is configured to perform intra prediction, while the ME/MC component 4610 is configured to utilize reference pictures obtained from a reference picture buffer 4612 to perform inter prediction. Residual blocks from inter prediction or intra prediction are fed into a transform (T) component 4614 and a quantization (Q) component 4616 to generate quantized residual transform coefficients, which are fed into an entropy coding component 4618. The entropy coding component 4618 entropy codes the prediction results and the quantized transform coefficients and transmits the same toward a video decoder (not shown). Quantized coefficients output from the quantization component 4616 may be fed into an inverse quantization (IQ) component 4620, an inverse transform component 4622, and a reconstruction (REC) component 4624. The REC component 4624 is able to output images to the DF 4602, the SAO 4604, and the ALF 4606 for filtering prior to those images being stored in the reference picture buffer 4612.

[00224] A listing of solutions preferred by some examples is provided next.

[00225] The following solutions show examples of techniques discussed herein.

[00226] 1. A method for processing video data comprising: determining to predict a value of a residual coefficient based on a cost; and performing a conversion between a visual media data and a bitstream based on the residual coefficient.

[00227] 2. The method of claim 1, wherein the coefficient is used in transform coding or transform-skip coding.

[00228] 3. The method of any of claims 1-2, wherein only a direct current (DC) coefficient value is predicted.

[00229] 4. The method of any of claims 1-3, wherein the values of the first N coefficients are predicted, wherein N is any integer number, and wherein the first N coefficients are determined based on raster scan order, diagonal scan order, vertical scan order, horizontal scan order, or based on dividing the coefficients into subblocks and based on any combination of the scan order for the subblocks and any scan order for the coefficients inside of the subblocks.
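
Purely as an illustration of selecting "the first N coefficients" under a chosen scan, a simple up-right diagonal scan can be sketched as follows; the helper names are hypothetical and the ordering is one possible diagonal scan, not necessarily the exact VVC scan:

```python
def diagonal_scan(width, height):
    """Positions (x, y) of a width x height block in up-right diagonal order."""
    order = []
    for d in range(width + height - 1):           # anti-diagonal index: x + y = d
        for y in range(min(d, height - 1), -1, -1):
            x = d - y
            if x < width:
                order.append((x, y))
    return order

def first_n_coefficients(block, n, scan):
    """Pick the first n coefficient values along the given scan order."""
    return [block[y][x] for (x, y) in scan[:n]]

scan = diagonal_scan(4, 4)
```

For example, on a 4x4 block the scan starts at the DC position (0, 0) and walks each anti-diagonal before moving to the next.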

[00230] 5. The method of any of claims 1-4, wherein prediction of a coefficient depends on coding information comprising a partial value of the reconstructed coefficient; a parity; surrounding neighboring values; a block size, coding unit (CU), prediction unit (PU), or transform unit (TU) sizes; a prediction mode used for that block, which may depend on whether the block is inter coded or intra coded, on the intra direction value, or on the type of the inter prediction used for that block; multiple transform selection (MTS) index values; Low-Frequency Non-Separable Transform (LFNST) index values; block partitioning type; transform skip flag; quantization parameter (QP); color components, or color format.

[00231] 6. The method of any of claims 1-5, wherein N coefficients at positions p1, p2, ..., pN are predicted.

[00232] 7. The method of any of claims 1-6, wherein information related to a coefficient value is derived from a cost derivation process.

[00233] 8. The method of any of claims 1-7, wherein a full coefficient may be derived from the process, a prediction of the coefficient is derived from the process and a coefficient is added by the prediction to obtain a final coefficient, a scaling factor is derived from the process and a coefficient is computed based on the scaling factor, or modulo T information is derived and a final coefficient value is T*coeff + t, where T may be any positive integer and t any integer between 0 and T-1.
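
The modulo-style alternative above, where a final value is reconstructed as T*coeff + t, can be sketched as follows for illustration; the function names and the choice T=4 are hypothetical examples:

```python
def split_coefficient(coeff, T=4):
    """Encoder side: split coeff into (level, remainder) so that
    coeff == T * level + t, with 0 <= t <= T-1 (floor division
    keeps the remainder non-negative for negative coeff too)."""
    level, t = divmod(coeff, T)
    return level, t

def derive_final_coefficient(level, t, T=4):
    """Decoder side: combine the coded level with the derived
    modulo information t to recover the final coefficient."""
    assert 0 <= t < T
    return T * level + t

level, t = split_coefficient(11)   # 11 == 4*2 + 3
```

Python's `divmod` uses floor division, so the remainder stays in [0, T-1] even for negative coefficients, matching the stated range of t.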

[00234] 9. The method of any of claims 1-8, wherein information related to a prediction value is derived from a set of values.

[00235] 10. The method of any of claims 1-9, wherein the prediction value is predicted from a plurality of fixed numbers, or wherein the prediction value is not signaled, or wherein a flag is coded to indicate whether a prediction is correct, or wherein M prediction values are included in one set, or wherein the information is related to N sets of values that include M_i candidates, for i from 1 to N, and the N sets are implicitly derived or signaled explicitly, or wherein M possible prediction values are denoted as v1, ..., vM, and a best prediction denoted as vK (1 <= K <= M) is added to X to create a final coefficient of X + vK without any signaling, or all M possible predictions are sorted based on a predefined cost and an index is signaled to indicate a correct prediction, or wherein a prediction derivation process is applied after dependent quantization or rate distortion optimization quantization is complete, or wherein a predefined prediction value need not be constant, or wherein a predefined prediction value is a function of surrounding coefficient values.

[00236] 11. The method of any of claims 1-10, wherein an actual prediction value is used to code a coefficient value remainder.

[00237] 12. The method of any of claims 1-11, wherein an accurate prediction P is derived on both an encoder and a decoder side, where X = coeff - P is coded at the encoder and the decoder decodes X and adds P to obtain a coefficient value, or wherein a prediction derivation process is applied after dependent quantization or RDOQ, or wherein a prediction derivation process is applied with a dependent quantization (DQ) or a RDOQ process, or wherein an approximation of a prediction is used such that an absolute value of a prediction may be limited to C, where C is any positive number, or wherein a binary search style is used to determine a prediction value, or wherein all possible values with absolute values less than C are examined and a value with a lowest cost is used as a prediction.

[00238] 13. The method of any of claims 1-12, wherein a partial prediction value is used to predict a part of a coefficient.
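
The first alternative of solution 12, where only the remainder X = coeff - P is coded and the decoder adds P back, can be sketched as follows for illustration; the helper names are hypothetical, and the constant "prediction" stands in for any derivation (e.g. a cost search) that both sides can perform identically:

```python
def encode_residual_of_coeff(coeff, predict):
    """Encoder: derive P and code only the remainder X = coeff - P."""
    return coeff - predict()

def decode_coeff(x, predict):
    """Decoder: re-derive the same P and add it back to X."""
    return x + predict()

# Both sides must derive P from shared information only, so no extra
# signaling is needed for the prediction itself.
shared_prediction = lambda: 5
x = encode_residual_of_coeff(7, shared_prediction)   # codes X = 2
coeff = decode_coeff(x, shared_prediction)           # recovers 7
```

The key constraint is that the derivation of P uses only information available at both the encoder and the decoder, which is why the cost-based derivation processes of the later solutions matter.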

[00239] 14. The method of any of claims 1-13, wherein information related to a zero coefficient is not predicted, or wherein remaining coefficients are predicted based on a greater than zero flag, or wherein information related to a coefficient being greater than one is predicted, or wherein remaining coefficients are predicted based on a greater than one flag, or wherein information related to a coefficient being greater than two is predicted, or wherein remaining coefficients are predicted based on a greater than two flag, or wherein any information in a second residual coding pass is predicted, or wherein any information in a third residual coding pass is predicted, or wherein a part of a coefficient is signaled and another part of the coefficient is predicted, or wherein a partial prediction includes deriving a prediction value from a set of values or an actual prediction for a part of the coefficient.

[00240] 15. The method of any of claims 1-14, wherein a cost for evaluating a coefficient value hypothesis or prediction is a function of at least one neighboring sample.

[00241] 16. The method of any of claims 1-15, wherein a cost is calculated as a difference between a partial reconstruction of border samples in a current block and a corresponding reference, where the corresponding reference is derived from neighboring block reconstruction, or wherein one or more rows, one or more columns, or both are used as a partial reconstruction area, or wherein K1 rows, K2 columns, or both are used as a partial reconstruction area, where K1 and K2 are integer numbers, or wherein different cost functions are used to derive one hypothesis cost, or wherein a cost considers a continuity between a reference template and reconstructed samples neighboring to a current template in addition to a sum of absolute differences (SAD).
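
As an illustration of the first alternative above, a SAD between partially reconstructed border samples and references derived from the neighboring reconstruction can be sketched as follows; the function name and sample values are hypothetical:

```python
def border_cost(top_row_recon, left_col_recon, top_ref, left_ref):
    """SAD between the first reconstructed row/column of the current
    block and references extrapolated from the neighboring blocks."""
    sad = sum(abs(a - b) for a, b in zip(top_row_recon, top_ref))
    sad += sum(abs(a - b) for a, b in zip(left_col_recon, left_ref))
    return sad

# A lower cost means the hypothesis continues the neighboring content
# more smoothly, so it is more likely the correct reconstruction.
cost = border_cost([10, 12, 11], [10, 9], [10, 12, 12], [11, 9])
```

Only the border rows/columns need to be reconstructed per hypothesis, which is what keeps the hypothesis evaluation affordable compared to a full block reconstruction.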

[00242] 17. The method of any of claims 1-16, wherein a number of the multiple transform selection (MTS) candidates depends on coefficient characteristics.

[00243] 18. The method of any of claims 1-17, wherein a number of the MTS candidates depends on a last significant coefficient position, or wherein a number of candidates for a last significant coefficient position between P_i and P_i+1 is K_i, where P_i and K_i are any non-negative numbers and K_i <= K_i+1 <= ..., or wherein a number of the MTS candidates and a context for coding an index depend on a sum of absolute values of coefficients, or wherein a number of the candidates for a sum of absolute values of coefficients between P_i and P_i+1 is K_i, where P_i and K_i are any non-negative numbers and K_i <= K_i+1 <= ..., or wherein a sum of absolute values of some, but not all, positions is used for determining a number of the MTS candidates and a context for coding an index, or wherein a number of the MTS candidates and a context for coding an index depend on a sum of partial absolute values of coefficients, or wherein any combination of a partial sum and a full sum depending on a coefficient position or value is used for determining a number of the MTS candidates and a context for coding an index, or wherein a sum of min(abs(coeff), Ci) is used for determining a number of the MTS candidates and a context for coding an index, where Ci is any non-negative integer and is different for each position pi.
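
The interval mapping above, where K_i candidates are allowed when the last significant coefficient position falls between P_i and P_i+1, can be sketched as follows; the threshold and count values are hypothetical examples, not values from the disclosure:

```python
def num_mts_candidates(last_sig_pos, thresholds=(1, 8, 16), counts=(1, 2, 4, 6)):
    """Return K_i for the interval containing last_sig_pos.

    thresholds are the interval boundaries P_1 < P_2 < ...; counts are
    K_0 <= K_1 <= ..., non-decreasing as required by the solution."""
    for p, k in zip(thresholds, counts):
        if last_sig_pos < p:
            return k
    return counts[-1]
```

The intuition is that a last significant position deep in the scan indicates richer residual content, justifying the cost of testing (and signaling among) more MTS candidates.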

[00244] 19. The method of any of claims 1-18, wherein a minimum function is used to determine a predicted sign, such that a prediction of N signs results in 2^N hypotheses, and determining a predicted sign includes going through all the 2^N costs and finding a minimum.

[00245] 20. The method of any of claims 1-19, wherein a hypothesis with a lowest cost among all 2^N costs determines a predicted sign for all the N signs, or wherein a hypothesis with a lowest cost among all the 2^N costs only determines a predicted sign for a first k signs, where k is any integer number of N or less, or wherein after coding to determine whether a prediction of a first k signs is correct, hypotheses with incorrect signs are discarded and the remaining hypotheses are used for predicting remaining signs.
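
For illustration, the exhaustive 2^N minimum search of solutions 19 and 20 can be sketched as follows; `predict_signs` and the toy cost function are hypothetical names, not the disclosed implementation (a real cost would be, e.g., the border SAD of solution 16):

```python
from itertools import product

def predict_signs(magnitudes, cost_of):
    """Evaluate all 2^N sign assignments for the given coefficient
    magnitudes and return the assignment with the lowest cost."""
    return min(product((1, -1), repeat=len(magnitudes)),
               key=lambda s: cost_of([si * m for si, m in zip(s, magnitudes)]))

# Toy cost: distance from a hypothetical "expected" signed pattern.
target = [3, -2]
cost = lambda vals: sum(abs(v - t) for v, t in zip(vals, target))
signs = predict_signs([3, 2], cost)
```

The search is exponential in N, which is why the solutions also describe limiting prediction to the first k signs or pruning hypotheses once some signs are known.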

[00246] 21. The method of any of claims 1-20, wherein a head-to-head minimum function is used to determine predicted signs.

[00247] 22. The method of any of claims 1-21, wherein a head-to-head minimum function is defined as follows: after calculating a cost for each of the 2^N hypotheses, for an i-th sign, 2^(N-1) hypotheses are related to a negative i-th sign and 2^(N-1) hypotheses are related to a positive i-th sign, with the remaining N-1 sign situations identical, and a comparison of these 2^(N-1) negative and 2^(N-1) positive hypotheses is performed head-to-head to count a number of times the negative hypothesis and the positive hypothesis have lower costs, or wherein whichever hypothesis has the most head-to-head lower costs is chosen as a predicted sign, or wherein head-to-head minimum functions are applied on all 2^N hypotheses for all signs, or wherein after determining an actual sign for an i-th sign, incorrect hypotheses are discarded and the head-to-head minimum function is applied on the remaining hypotheses, or wherein any combination of discarding or keeping incorrect hypotheses is applied when using hypotheses to determine predicted signs.
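
The head-to-head pairing above can be sketched as follows for illustration: for sign i, each negative-sign hypothesis is compared against the positive-sign hypothesis that agrees on every other sign, and wins are counted. The helper name and toy cost are hypothetical:

```python
from itertools import product

def head_to_head_sign(i, magnitudes, cost_of):
    """Decide sign i by pairing the 2^(N-1) hypotheses that differ
    only in sign i and counting which polarity wins more often."""
    n = len(magnitudes)
    pos_wins = neg_wins = 0
    for others in product((1, -1), repeat=n - 1):
        pos = list(others[:i]) + [1] + list(others[i:])
        neg = list(others[:i]) + [-1] + list(others[i:])
        cp = cost_of([s * m for s, m in zip(pos, magnitudes)])
        cn = cost_of([s * m for s, m in zip(neg, magnitudes)])
        if cp < cn:
            pos_wins += 1
        elif cn < cp:
            neg_wins += 1
    return 1 if pos_wins >= neg_wins else -1

target = [3, -2]
cost = lambda vals: sum(abs(v - t) for v, t in zip(vals, target))
```

Unlike the global minimum of solution 19, which can let one outlier hypothesis decide every sign, the head-to-head count makes each sign decision robust across all settings of the other signs.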

[00248] 23. The method of any of claims 1-22, wherein a combination of different cost definitions is used to determine predicted signs.

[00249] 24. The method of any of claims 1-23, wherein for signs at positions p1, ..., pJ one cost function is used and for signs at positions q1, ..., qK another cost function is used, or wherein a combination of discarding incorrect hypotheses and keeping incorrect hypotheses is used in combination with any combination of different cost functions for each sign prediction, or wherein different cost functions are used for determining a sign prediction for one sign.

[00250] 25. The method of any of claims 1-24, wherein cost definitions and decision making used for sign prediction are used for coefficient value prediction.

[00251] 26. The method of any of claims 1-25, wherein a minimum function is used to determine a coefficient value prediction, or wherein a head-to-head minimum function is used to determine a coefficient value prediction, or wherein an incorrect hypothesis is discarded depending on coding side information related to coefficient value prediction, or wherein any combination of discarding incorrect hypotheses or retaining incorrect hypotheses is used in combination with any combination of different cost functions for each sign prediction.

[00252] 27. The method of any of claims 1-26, wherein any combination of sign prediction and coefficient value prediction for candidates are applied.

[00253] 28. The method of any of claims 1-27, wherein sign prediction is applied on all coefficients, or wherein sign prediction is applied only on N signs, or wherein a first N signs based on a predefined scan order are used for sign prediction, or wherein a first N signs based on coefficient magnitude are used for the sign prediction, or wherein only signs of coefficients at positions p1, ..., pN are used for sign prediction, or wherein only coefficients at positions q1, ..., qM are used for coefficient value prediction, or wherein sign prediction and coefficient value prediction are applied for a coefficient in a mutually exclusive manner, or wherein sign prediction and coefficient value prediction are both applied for a coefficient, or wherein all signs are predicted prior to prediction of coefficient values, or wherein all coefficient values are predicted prior to prediction of signs, or wherein any order combination of predicting signs and coefficient values is applied.

[00254] 29. The method of any of claims 1-28, wherein there are differences between passes used for residual coding.

[00255] 30. The method of any of claims 1-29, wherein different passes are used depending on a total number of context coded bins, or wherein there is no limitation on a number of context coded bins, or wherein prediction is used for the position of specified values.

[00256] 31. An apparatus for processing video data comprising: a processor; and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to perform the method of any of claims 1-30.

[00257] 32. A non-transitory computer readable medium comprising a computer program product for use by a video coding device, the computer program product comprising computer executable instructions stored on the non-transitory computer readable medium such that when executed by a processor cause the video coding device to perform the method of any of claims 1-30.

[00258] 33. A non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by a video processing apparatus, wherein the method comprises: determining to predict a value of a residual coefficient based on a cost; and generating a bitstream based on the determining.

[00259] 34. A method for storing a bitstream of a video comprising: determining to predict a value of a residual coefficient based on a cost; generating a bitstream based on the determining; and storing the bitstream in a non-transitory computer-readable recording medium.

[00260] 35. A method, apparatus or system described in the present document.

[00261] In the solutions described herein, an encoder may conform to the format rule by producing a coded representation according to the format rule. In the solutions described herein, a decoder may use the format rule to parse syntax elements in the coded representation with the knowledge of presence and absence of syntax elements according to the format rule to produce decoded video.

[00262] In the present document, the term “video processing” may refer to video encoding, video decoding, video compression or video decompression. For example, video compression algorithms may be applied during conversion from pixel representation of a video to a corresponding bitstream representation or vice versa. The bit stream representation of a current video block may, for example, correspond to bits that are either co-located or spread in different places within the bitstream, as is defined by the syntax. For example, a macroblock may be encoded in terms of transformed and coded error residual values and also using bits in headers and other fields in the bitstream. Furthermore, during conversion, a decoder may parse a bitstream with the knowledge that some fields may be present, or absent, based on the determination, as is described in the above solutions. Similarly, an encoder may determine that certain syntax fields are or are not to be included and generate the coded representation accordingly by including or excluding the syntax fields from the coded representation.

[00263] The disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.

[00264] A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

[00265] The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

[00266] Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and compact disc read-only memory (CD ROM) and digital versatile disc read-only memory (DVD-ROM) disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

[00267] While this patent document contains many specifics, these should not be construed as limitations on the scope of any subject matter or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular techniques. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

[00268] Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

[00269] Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

[00270] A first component is directly coupled to a second component when there are no intervening components, except for a line, a trace, or another medium between the first component and the second component. The first component is indirectly coupled to the second component when there are intervening components other than a line, a trace, or another medium between the first component and the second component. The term “coupled” and its variants include both directly coupled and indirectly coupled. The use of the term “about” means a range including ±10% of the subsequent number unless otherwise stated.

[00271] While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.

[00272] In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled may be directly connected or may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.