Title:
GENERAL CONSTRAINT INFORMATION AND SIGNALING OF SYNTAX ELEMENTS IN VIDEO CODING
Document Type and Number:
WIPO Patent Application WO/2021/236888
Kind Code:
A1
Abstract:
A method, apparatus, and a non-transitory computer-readable storage medium for decoding video data are provided. A decoder receives a bitstream associated with the video data. The decoder obtains a syntax element from the bitstream. The syntax element may indicate a slice is a bi-predictive slice (B-slice). The decoder decodes the video data based on the syntax element.

Inventors:
JHU HONG-JHENG (CN)
XIU XIAOYU (US)
CHEN YI-WEN (US)
MA TSUNG-CHUAN (US)
CHEN WEI (US)
KUO CHE-WEI (CN)
WANG XIANGLIN (US)
YU BING (CN)
Application Number:
PCT/US2021/033331
Publication Date:
November 25, 2021
Filing Date:
May 20, 2021
Assignee:
BEIJING DAJIA INTERNET INFORMATION TECH CO LTD (CN)
JHU HONG JHENG (CN)
XIU XIAOYU (US)
International Classes:
H04N19/70; H04N19/105; H04N19/119; H04N19/184; H04N19/423; H04N19/44
Foreign References:
US20170111642A1 2017-04-20
Other References:
B. BROSS, J. CHEN, S. LIU, Y.-K. WANG: "Versatile Video Coding (Draft 9)", 130. MPEG MEETING; 20200420 - 20200424; ALPBACH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. JVET-R2001-vA; m53983, 15 May 2020 (2020-05-15), pages 1 - 524, XP030287936
Y.-J. CHANG (QUALCOMM), V. SEREGIN, Y. HE, M. COBAN, M. KARCZEWICZ (QUALCOMM): "AhG9: On general constraint information syntax", 130. MPEG MEETING; 20200420 - 20200424; ALPBACH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. m53263 ; JVET-R0286, 4 April 2020 (2020-04-04), XP030286379
Y.-W. CHEN (KWAI), X. XIU (KWAI), T.-C. MA (KWAI), H.-J. JHU (KWAI), W. CHEN (KUAISHOU), X. WANG (KWAI INC.): "AHG9: On syntax signalling conditions in picture header", 130. MPEG MEETING; 20200420 - 20200424; ALPBACH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. m53308 ; JVET-R0324, 13 April 2020 (2020-04-13), XP030286511
Y.-W. CHEN (KWAI), X. XIU (KWAI), T.-C. MA (KWAI), H.-J. JHU (KWAI), W. CHEN (KUAISHOU), X. WANG (KWAI INC.): "AHG9: On TMVP enabling flag in picture header", 130. MPEG MEETING; 20200420 - 20200424; ALPBACH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. m53307 ; JVET-R0323, 18 April 2020 (2020-04-18), XP030286505
Attorney, Agent or Firm:
SHEN WANG (US)
Claims:
CLAIMS

What is claimed is:

1. A method for decoding video data, comprising: receiving, at a decoder, a bitstream associated with the video data; obtaining, at the decoder, a syntax element from the bitstream; and decoding, at the decoder, the video data based on the syntax element.

2. The method of claim 1, wherein the syntax element comprises no_sbt_constraint_flag that is equal to 1 when a slice is an intra slice and when intra_only_constraint_flag is equal to 1.

3. The method of claim 1, wherein the syntax element comprises no_act_constraint_flag and no_chroma_qp_offset_constraint_flag and both are equal to 1 when a chroma format is not monochrome, the no_act_constraint_flag is equal to 1 when max_chroma_format_constraint_idc is equal to 0, and the no_chroma_qp_offset_constraint_flag is equal to 1 when the max_chroma_format_constraint_idc is equal to 0.

4. The method of claim 1, wherein the syntax element comprises no_prof_constraint_flag that is equal to 1 when an affine mode is disabled, and the no_prof_constraint_flag is equal to 1 when no_affine_motion_constraint_flag is equal to 1.

5. The method of claim 1, wherein the syntax element comprises no_bdpcm_constraint_flag that is equal to 1 when a transform skip mode is disabled, and the no_bdpcm_constraint_flag is equal to 1 when no_transform_skip_constraint_flag is equal to 1.

6. The method of claim 1, wherein the syntax element comprises no_mixed_nalu_types_in_pic_constraint_flag that is equal to 1 when a picture has one subpicture, and the no_mixed_nalu_types_in_pic_constraint_flag is equal to 1 when one_subpic_per_pic_constraint_flag is equal to 1.

7. The method of claim 1, wherein the syntax element indicates that a slice is a bi-predictive slice (B-slice).

8. The method of claim 1, wherein the syntax element comprises ph_collocated_from_l0_flag and weight_table( ).

9. The method of claim 8, wherein the syntax element comprises flags indicating conditions in a picture header to prevent redundant signaling and wherein the syntax element comprises mvd_l1_zero_flag, ph_disable_bdof_flag, and ph_disable_dmvr_flag.

10. The method of claim 9, wherein the conditions in the picture header comprise: in response to determining that rpl_info_in_ph_flag is not equal to 1 or determining that the rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] is greater than 0, determining that mvd_l1_zero_flag is signaled.

11. The method of claim 9, wherein the conditions in the picture header comprise: determining, in response to determining that sps_dmvr_pic_present_flag is equal to 1, and determining that rpl_info_in_ph_flag is not equal to 1 or that the rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] is greater than 0 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] is greater than 0, that ph_disable_dmvr_flag is signaled.

12. The method of claim 9, wherein the conditions in the picture header comprise: inferring, in response to sps_dmvr_enabled_flag being equal to 1 and sps_dmvr_pic_present_flag being equal to 0, that ph_disable_dmvr_flag is equal to 0; inferring, in response to the sps_dmvr_enabled_flag being equal to 1 and the sps_dmvr_pic_present_flag being equal to 1, that the ph_disable_dmvr_flag is equal to 1; and inferring, in response to the sps_dmvr_enabled_flag being equal to 0, that the ph_disable_dmvr_flag is equal to 1.

13. The method of claim 9, wherein the conditions in the picture header comprise: inferring, in response to sps_dmvr_enabled_flag being equal to 1 and sps_dmvr_pic_present_flag being equal to 0, that a value of ph_disable_dmvr_flag is equal to 0; inferring, in response to the sps_dmvr_enabled_flag being equal to 1 and the sps_dmvr_pic_present_flag being equal to 1 and rpl_info_in_ph_flag being equal to 0, that the value of the ph_disable_dmvr_flag is equal to a signaled value; and inferring, in response to the sps_dmvr_enabled_flag being equal to 1 and the sps_dmvr_pic_present_flag being equal to 1 and the rpl_info_in_ph_flag being equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] being greater than 0, that the value of the ph_disable_dmvr_flag is equal to a signaled value.

14. The method of claim 9, wherein the conditions in the picture header comprise: inferring, in response to sps_bdof_enabled_flag being equal to 1 and sps_bdof_pic_present_flag being equal to 0, that a value of ph_disable_bdof_flag is equal to 0; inferring, in response to the sps_bdof_enabled_flag being equal to 1 and the sps_bdof_pic_present_flag being equal to 1, that a value of ph_disable_dmvr_flag is equal to 1; and inferring, in response to the sps_bdof_enabled_flag being equal to 0, that the value of the ph_disable_bdof_flag is equal to 1.

15. The method of claim 9, wherein the conditions in the picture header comprise: inferring, in response to sps_bdof_enabled_flag being equal to 1 and sps_bdof_pic_present_flag being equal to 0, that a value of ph_disable_bdof_flag is equal to 0; inferring, in response to the sps_bdof_enabled_flag being equal to 0 and the sps_bdof_pic_present_flag being equal to 0, that the value of the ph_disable_bdof_flag is equal to 1; inferring, in response to the sps_bdof_enabled_flag being equal to 1, the sps_bdof_pic_present_flag being equal to 1, and rpl_info_in_ph_flag being equal to 0, that the value of the ph_disable_bdof_flag is equal to a signaled value; inferring, in response to the sps_bdof_enabled_flag being equal to 1, the sps_bdof_pic_present_flag being equal to 1, the rpl_info_in_ph_flag being equal to 1, and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] being greater than 0, that the value of the ph_disable_bdof_flag is equal to a signaled value; and inferring, in response to the sps_bdof_enabled_flag being equal to 1, the sps_bdof_pic_present_flag being equal to 1, the rpl_info_in_ph_flag being equal to 1, and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] being equal to 0, that the value of the ph_disable_bdof_flag is equal to 1.

16. The method of claim 9, wherein the conditions in the picture header comprise: determining, in response to determining that ph_temporal_mvp_enabled_flag is equal to 1, and that num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] is greater than 0 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] is greater than 0, that ph_collocated_from_l0_flag is signaled.

17. The method of claim 9, wherein the conditions in the picture header comprise: inferring, in response to num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] being greater than 1, that a value of ph_collocated_from_l0_flag is equal to 1; and inferring, in response to num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] being greater than 1, the value of the ph_collocated_from_l0_flag is equal to 0.

18. The method of claim 9, wherein the conditions in the picture header comprise: determining, in response to pps_weighted_bipred_flag being equal to 1, wp_info_in_ph_flag being equal to 1, and either rpl_info_in_ph_flag equal to 0 or rpl_info_in_ph_flag and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] both being greater than 0, that num_l1_weights is signaled.

19. The method of claim 9, wherein the conditions in the picture header comprise: determining, in response to determining that wp_info_in_ph_flag is equal to 1, a rpl_info_in_ph_flag is equal to 1, and num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] is equal to 0 or num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] is greater than equal to 0, that NumWeightsL1 is equal to 0.

20. The method of claim 9, wherein the conditions in the picture header comprise: determining, in response to ( !pps_weighted_bipred_flag || ( pps_wp_info_in_ph_flag && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0 ) ) being equal to 1, that NumWeightsL1 is equal to 0.

21. The method of claim 9, wherein the conditions in the picture header comprise: determining, in response to determining that ph_gdr_pic_flag being equal to 0, that ph_inter_slice_allowed_flag is signaled.

22. The method of claim 1, further comprising: receiving a bitstream conformance constraint that indicates a value of ph_temporal_mvp_enabled_flag based on offsets applied to a picture size.

23. The method of claim 22, wherein the ph_temporal_mvp_enabled_flag is equal to 0 in response to no reference picture in a Decoded Picture Buffer (DPB) having an associated variable value RprConstraintsActive equal to 0.

24. The method of claim 1, further comprising: receiving a bitstream conformance constraint that indicates a value of ph_temporal_mvp_enabled_flag based on common reference picture existing among slices in a current picture.

25. The method of claim 24, wherein the ph_temporal_mvp_enabled_flag is equal to 0 in response to no common reference picture existing in the slices associated with a picture header.

26. The method of claim 24, wherein the ph_temporal_mvp_enabled_flag is equal to 0 in response to no common reference picture existing in non-intra slices associated with a picture header.

27. The method of claim 24, wherein the bitstream conformance constraint further indicates that pic_width_in_luma_samples and pic_height_in_luma_samples values of a reference picture referred to by slice_collocated_ref_idx values are equal to pic_width_in_luma_samples and pic_height_in_luma_samples values of the current picture.

28. The method of claim 1, further comprising: determining, in response to determining that pps_mixed_nalu_types_in_pic_flag is equal to 0, that ph_gdr_or_irap_pic_flag is signaled.

29. The method of claim 1, further comprising: determining, in response to determining that a value of pps_mixed_nalu_type_in_pic_flag is equal to 1, that a value of ph_gdr_or_irap_pic_flag is equal to 0.

30. The method of claim 1, wherein the syntax element comprises pps_mixed_nalu_types_in_pic_flag.

31. The method of claim 30, wherein the pps_mixed_nalu_types_in_pic_flag equals to 1 and indicates that at least one picture that is neither intra random access point (IRAP) nor gradual decoding refresh (GDR) picture referring to a Picture Parameter Sets (PPS) has more than one Video Coding Layer (VCL) Network Abstraction Layer (NAL) unit.

32. The method of claim 30, wherein a value of the pps_mixed_nalu_types_in_pic_flag for a GDR picture is equal to 0, wherein when the pps_mixed_nalu_types_in_pic_flag is equal to 0 for a picture, and a slice of the picture has nal_unit_type equal to GDR_NUT, all other slices of the picture have the same value of nal_unit_type, and the picture is known to be a GDR picture after receiving a first slice of the picture.

33. The method of claim 1, wherein the syntax element comprises ph_gdr_pic_flag, and presence of the ph_gdr_pic_flag in a picture header is based on a value of pps_mixed_nalu_types_in_pic_flag in a PPS.

34. The method of claim 33, wherein the ph_gdr_pic_flag is signaled in response to the pps_mixed_nalu_types_in_pic_flag being equal to 0.

35. The method of claim 33, wherein when not present, the value of ph_gdr_pic_flag is inferred to be equal to 0, and when pps_mixed_nalu_types_in_pic_flag is 0 and to be equal to the value of ph_gdr_or_irap_pic_flag when pps_mixed_nalu_types_in_pic_flag is 1.

36. The method of claim 1, wherein the syntax element comprises ph_gdr_pic_flag, and wherein when sps_gdr_enabled_flag is equal to 0, the value of ph_gdr_pic_flag is equal to 0.

37. The method of claim 33, wherein the syntax element comprises ph_gdr_pic_flag indicating whether a picture associated with a picture header is a GDR picture, and wherein the ph_gdr_pic_flag is equal to 1 when ph_gdr_or_irap_pic_flag is equal to 1 and pps_mixed_nalu_types_in_pic_flag is equal to 1.

38. The method of claim 1, wherein the syntax element comprises pps_mixed_nalu_types_in_pic_flag, the pps_mixed_nalu_types_in_pic_flag equal to 1 indicating that at least one non-intra random access point (IRAP) picture referring to the Picture Parameter Sets (PPS) has more than one Video Coding Layer (VCL) Network Abstraction Layer (NAL) units, and the pps_mixed_nalu_types_in_pic_flag equal to 0 indicating that at least one non-IRAP picture referring to the PPS has one or more VCL NAL units.

39. The method of claim 1, wherein the syntax element comprises ph_inter_slice_allowed_flag, wherein the ph_inter_slice_allowed_flag equals to 0 when coded slices of a picture have sh_slice_type equal to 2, and wherein the ph_inter_slice_allowed_flag is equal to 1 when the ph_gdr_pic_flag is equal to 1.

40. The method of claim 1, wherein the syntax element comprises no_deblocking_filter_constraint_flag indicating that deblocking filter is disabled.

41. The method of claim 40, wherein the no_deblocking_filter_constraint_flag is equal to 1 in response to pps_deblocking_filter_disabled_flag, ph_deblocking_filter_disabled_flag, and sh_deblocking_filter_disabled_flag being equal to 1.

42. An apparatus for video decoding, comprising: one or more processors; and a non-transitory computer-readable storage medium configured to store instructions executable by the one or more processors; wherein the one or more processors, upon execution of the instructions, are configured to perform the method in any of claims 1-41.

43. A non-transitory computer-readable storage medium for video decoding storing computer-executable instructions that, when executed by one or more computer processors, causing the one or more computer processors to perform the method in any of claims 1-41.
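
For illustration only, the inference conditions recited in claim 15 can be read as a small decision function. The sketch below is one possible reading and not part of the claims; the function name, the parameter num_ref_entries_l1 (standing for num_ref_entries[ 1 ][ RplsIdx[ 1 ] ]), and the signaled_value argument are hypothetical, and flag combinations that claim 15 does not enumerate are rejected rather than guessed.

```python
def infer_ph_disable_bdof_flag(sps_bdof_enabled_flag,
                               sps_bdof_pic_present_flag,
                               rpl_info_in_ph_flag=0,
                               num_ref_entries_l1=0,
                               signaled_value=None):
    """Return the value of ph_disable_bdof_flag under the conditions of claim 15.

    num_ref_entries_l1 is a hypothetical stand-in for
    num_ref_entries[ 1 ][ RplsIdx[ 1 ] ]; signaled_value is the value parsed
    from the picture header when the flag is actually signaled.
    """
    if not sps_bdof_pic_present_flag:
        # Flag not present in the picture header: follow the SPS-level enable.
        return 0 if sps_bdof_enabled_flag else 1
    if not sps_bdof_enabled_flag:
        # Assumption: this combination is not enumerated in claim 15.
        raise ValueError("combination not covered by claim 15")
    if not rpl_info_in_ph_flag or num_ref_entries_l1 > 0:
        # The flag is signaled, so the parsed value is used.
        if signaled_value is None:
            raise ValueError("a signaled value is required for this case")
        return signaled_value
    # RPL info is in the picture header and list 1 is empty: BDOF is disabled.
    return 1


print(infer_ph_disable_bdof_flag(1, 1, rpl_info_in_ph_flag=1, num_ref_entries_l1=0))
```

The last call covers the case that motivates the condition: with an empty reference picture list 1 no bi-prediction is possible, so the flag need not be signaled.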

Description:
GENERAL CONSTRAINT INFORMATION AND SIGNALING OF SYNTAX ELEMENTS IN VIDEO CODING

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application is based upon and claims priority to Provisional Applications No. 63/027,950 filed on May 20, 2020, and 63/030,295 filed on May 26, 2020, the entire contents thereof are incorporated herein by reference in their entirety for all purposes.

TECHNICAL FIELD

[0002] This disclosure is related to video coding and compression. More specifically, this application relates to systems and methods for general constraint information and signaling of syntax elements in video coding.

BACKGROUND

[0003] Various video coding techniques may be used to compress video data. Video coding is performed according to one or more video coding standards. For example, video coding standards include versatile video coding (VVC), joint exploration test model (JEM), high-efficiency video coding (H.265/HEVC), advanced video coding (H.264/AVC), moving picture experts group (MPEG) coding, or the like. Video coding generally utilizes prediction methods (e.g., inter-prediction, intra-prediction, or the like) that take advantage of redundancy present in video images or sequences. An important goal of video coding techniques is to compress video data into a form that uses a lower bit rate, while avoiding or minimizing degradations to video quality.

[0004] Some background knowledge related to video coding and reference picture management will be elaborated in the following sections.

SUMMARY

[0005] Examples of the present disclosure provide methods and apparatus for syntax in video coding.

[0006] According to a first aspect of the present disclosure, a method for decoding video data is provided. The method may include a decoder receiving a bitstream associated with the video data. The decoder may also obtain a syntax element from the bitstream. The decoder may further decode the video data based on the syntax element.

[0007] It is to be understood that the above general descriptions and detailed descriptions below are only examples and explanatory and not intended to limit the present disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate examples consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.

[0009] FIG. 1 is a block diagram of an encoder, according to an example of the present disclosure.

[0010] FIG. 2 is a block diagram of a decoder, according to an example of the present disclosure.

[0011] FIG. 3A is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.

[0012] FIG. 3B is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.

[0013] FIG. 3C is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.

[0014] FIG. 3D is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.

[0015] FIG. 3E is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.

[0016] FIG. 4 is an illustration of a picture divided into CTUs, according to an example of the present disclosure.

[0017] FIG. 5A is an illustration of a multi-type tree splitting mode, according to an example of the present disclosure.

[0018] FIG. 5B is an illustration of a multi-type tree splitting mode, according to an example of the present disclosure.

[0019] FIG. 5C is an illustration of a multi-type tree splitting mode, according to an example of the present disclosure.

[0020] FIG. 5D is an illustration of a multi-type tree splitting mode, according to an example of the present disclosure.

[0021] FIG. 6 is an illustration of gradual intra refreshing, according to an example of the present disclosure.

[0022] FIG. 7 is a method for decoding video data, according to an example of the present disclosure.

[0023] FIG. 8 is a diagram illustrating a computing environment coupled with a user interface, according to an example of the present disclosure.

DETAILED DESCRIPTION

[0024] Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise represented. The implementations set forth in the following description of embodiments do not represent all implementations consistent with the present disclosure. Instead, they are merely examples of apparatuses and methods consistent with aspects related to the present disclosure, as recited in the appended claims.

[0025] The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. As used in the present disclosure and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It shall also be understood that the term “and/or” used herein is intended to signify and include any or all possible combinations of one or more of the associated listed items.

[0026] It shall be understood that, although the terms “first,” “second,” “third,” etc., may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one category of information from another. For example, without departing from the scope of the present disclosure, first information may be termed as second information; and similarly, second information may also be termed as first information. As used herein, the term “if” may be understood to mean “when” or “upon” or “in response to a judgment” depending on the context.

[0027] Conceptually, video coding standards mentioned above are similar. For example, they all use block-based processing, and share similar video coding block diagram to achieve video compression. FIG. 1 shows a typical encoder block diagram for these standards.

[0028] FIG. 1 shows a general diagram of a block-based video encoder for the VVC. Specifically, FIG. 1 shows a typical encoder 100. The encoder 100 has video input 110, motion compensation 112, motion estimation 114, intra/inter mode decision 116, block predictor 140, adder 128, transform 130, quantization 132, prediction related info 142, intra prediction 118, picture buffer 120, inverse quantization 134, inverse transform 136, adder 126, memory 124, in-loop filter 122, entropy coding 138, and bitstream 144.

[0029] In the encoder 100, a video frame is partitioned into a plurality of video blocks for processing. For each given video block, a prediction is formed based on either an inter prediction approach or an intra prediction approach.

[0030] A prediction residual, representing the difference between a current video block, part of video input 110, and its predictor, part of block predictor 140, is sent to a transform 130 from adder 128. Transform coefficients are then sent from the Transform 130 to a Quantization 132 for entropy reduction. Quantized coefficients are then fed to an Entropy Coding 138 to generate a compressed video bitstream. As shown in FIG. 1, prediction related information 142 from an intra/inter mode decision 116, such as video block partition info, motion vectors (MVs), reference picture index, and intra prediction mode, are also fed through the Entropy Coding 138 and saved into a compressed bitstream 144. Compressed bitstream 144 includes a video bitstream.
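
The forward path described above (adder 128, transform 130, quantization 132) can be sketched as follows. This is a minimal illustration under stated assumptions, not the transform or quantizer of any standard: a 4x4 Hadamard matrix stands in for the actual transform, the quantizer is a plain uniform scalar quantizer, and all function names are hypothetical.

```python
# Illustrative only: a 4x4 Hadamard matrix stands in for the codec's actual transform.
H4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def prediction_residual(block, predictor):
    """Adder 128: current block minus its predictor, sample by sample."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(block, predictor)]


def forward_transform(residual):
    """Transform 130, assumed here to be Y = H * X * H^T."""
    return matmul(matmul(H4, residual), transpose(H4))


def quantize(coeffs, step):
    """Quantization 132: uniform quantizer, rounding half away from zero."""
    def q(c):
        level = (abs(c) + step // 2) // step
        return -level if c < 0 else level
    return [[q(c) for c in row] for row in coeffs]


block = [[10] * 4 for _ in range(4)]      # current 4x4 video block
predictor = [[8] * 4 for _ in range(4)]   # output of block predictor 140
levels = quantize(forward_transform(prediction_residual(block, predictor)), step=8)
print(levels)  # only the DC level is nonzero for this flat residual
```

A flat residual concentrates all of its energy in a single coefficient, which is why the transform step makes the subsequent entropy coding cheaper.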

[0031] In the encoder 100, decoder-related circuitries are also needed in order to reconstruct pixels for the purpose of prediction. First, a prediction residual is reconstructed through an Inverse Quantization 134 and an Inverse Transform 136. This reconstructed prediction residual is combined with a Block Predictor 140 to generate un-filtered reconstructed pixels for a current video block.

[0032] Spatial prediction (or “intra prediction”) uses pixels from samples of already coded neighboring blocks (which are called reference samples) in the same video frame as the current video block to predict the current video block.

[0033] Temporal prediction (also referred to as “inter prediction”) uses reconstructed pixels from already-coded video pictures to predict the current video block. Temporal prediction reduces temporal redundancy inherent in the video signal. The temporal prediction signal for a given coding unit (CU) or coding block is usually signaled by one or more MVs, which indicate the amount and the direction of motion between the current CU and its temporal reference. Further, if multiple reference pictures are supported, one reference picture index is additionally sent, which is used to identify from which reference picture in the reference picture storage the temporal prediction signal comes.

[0034] Motion estimation 114 takes in video input 110 and a signal from picture buffer 120 and outputs, to motion compensation 112, a motion estimation signal. Motion compensation 112 takes in video input 110, a signal from picture buffer 120, and the motion estimation signal from motion estimation 114 and outputs, to intra/inter mode decision 116, a motion compensation signal.
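
As a rough illustration of how a motion vector and a reference picture index together identify the temporal prediction signal, the following sketch fetches a predictor block from a list of reference pictures. It assumes integer-sample motion vectors and edge replication outside the picture; fractional-sample interpolation is omitted, and the function and parameter names are hypothetical.

```python
def motion_compensate(ref_pictures, ref_idx, x, y, width, height, mv):
    """Fetch the predictor for the block at (x, y) displaced by motion vector mv.

    ref_pictures is a list of 2-D sample arrays (lists of rows) and ref_idx
    selects one of them. Coordinates outside the picture are clamped to the
    nearest edge sample, which is an assumption of this sketch.
    """
    ref = ref_pictures[ref_idx]
    rows, cols = len(ref), len(ref[0])
    mvx, mvy = mv
    pred = []
    for j in range(height):
        ry = min(max(y + mvy + j, 0), rows - 1)
        row = []
        for i in range(width):
            rx = min(max(x + mvx + i, 0), cols - 1)
            row.append(ref[ry][rx])
        pred.append(row)
    return pred


# A 6x6 reference picture whose sample value encodes its position (row*10 + col).
ref = [[r * 10 + c for c in range(6)] for r in range(6)]
print(motion_compensate([ref], 0, x=2, y=2, width=2, height=2, mv=(1, -1)))
```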

[0035] After spatial and/or temporal prediction is performed, an intra/inter mode decision 116 in the encoder 100 chooses the best prediction mode, for example, based on the rate-distortion optimization method. The block predictor 140 is then subtracted from the current video block, and the resulting prediction residual is de-correlated using the transform 130 and the quantization 132. The resulting quantized residual coefficients are inverse quantized by the inverse quantization 134 and inverse transformed by the inverse transform 136 to form the reconstructed residual, which is then added back to the prediction block to form the reconstructed signal of the CU. Further in-loop filtering 122, such as a deblocking filter, a sample adaptive offset (SAO), and/or an adaptive loop filter (ALF) may be applied on the reconstructed CU before it is put in the reference picture storage of the picture buffer 120 and used to code future video blocks. To form the output video bitstream 144, coding mode (inter or intra), prediction mode information, motion information, and quantized residual coefficients are all sent to the entropy coding unit 138 to be further compressed and packed to form the bitstream.
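
One common way to realize a rate-distortion based mode decision is to minimize a Lagrangian cost J = D + λ·R over the candidate modes, where D is the distortion and R the bit cost. The text above does not specify the cost function, so the following is only a sketch under that assumption; the candidate values, the dictionary keys, and the function name are hypothetical.

```python
def choose_mode(candidates, lam):
    """Return the name of the candidate with the smallest cost D + lam * R.

    Each candidate is a dict with hypothetical keys "mode", "distortion"
    (for example a sum of squared errors) and "bits" (estimated rate).
    """
    best = min(candidates, key=lambda c: c["distortion"] + lam * c["bits"])
    return best["mode"]


candidates = [
    {"mode": "intra", "distortion": 100, "bits": 20},
    {"mode": "inter", "distortion": 120, "bits": 5},
]
print(choose_mode(candidates, lam=2.0))   # rate weighs heavily
print(choose_mode(candidates, lam=0.5))   # distortion dominates
```

A larger λ favors modes that spend fewer bits, while a smaller λ favors modes with lower distortion.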

[0036] In the encoder, a video frame is partitioned into blocks for processing. For each given video block, a prediction is formed based on either inter prediction or intra prediction. In inter prediction, predictors may be formed through motion estimation and motion compensation, based on pixels from previously reconstructed frames. In intra prediction, predictors may be formed based on reconstructed pixels in the current frame. Through mode decision, a best predictor may be chosen to predict a current block.

[0037] The prediction residual (i.e., the difference between a current block and its predictor) is sent to transform module. Transform coefficients are then sent to quantization module for entropy reduction. Quantized coefficients are fed to entropy coding module to generate compressed video bitstream.

[0038] As shown in FIG. 1 (described above), prediction related info from inter and/or intra prediction modules, such as block partition info, motion vectors, reference picture index, and intra prediction mode, etc., are also going through entropy coding module and saved into bitstream.

[0039] In the encoder, decoder related modules are also needed in order to reconstruct pixels for prediction purpose. First, prediction residual is reconstructed through inverse quantization and inverse transform. Such reconstructed prediction residual is combined with the block predictor to generate un-filtered reconstructed pixels for a current block.

[0040] To improve coding efficiency and visual quality, in-loop filter is commonly used. For example, deblocking filter is available in AVC, HEVC as well as the current VVC. In HEVC, an additional in-loop filter called SAO (sample adaptive offset) is defined to further improve coding efficiency. In the latest VVC, yet another in-loop filter called ALF (adaptive loop filter) is being actively investigated, and it has a high chance to be included in the final standard.

[0041] FIG. 2 shows a typical decoder block diagram for these standards. One can see that it is almost the same as the reconstruction related section residing in encoder. Specifically, FIG. 2 shows a typical decoder 200 block diagram. Decoder 200 has bitstream 210, entropy decoding 212, inverse quantization 214, inverse transform 216, adder 218, intra/inter mode selection 220, intra prediction 222, memory 230, in-loop filter 228, motion compensation 224, picture buffer 226, prediction related info 234, and video output 232.

[0042] Decoder 200 is similar to the reconstruction-related section residing in the encoder 100 of FIG. 1. In the decoder 200, an incoming video bitstream 210 is first decoded through an Entropy Decoding 212 to derive quantized coefficient levels and prediction-related information. The quantized coefficient levels are then processed through an Inverse Quantization 214 and an Inverse Transform 216 to obtain a reconstructed prediction residual. A block predictor mechanism, implemented in an Intra/inter Mode Selector 220, is configured to perform either an Intra Prediction 222 or a Motion Compensation 224, based on decoded prediction information. A set of unfiltered reconstructed pixels is obtained by summing up the reconstructed prediction residual from the Inverse Transform 216 and a predictive output generated by the block predictor mechanism, using a summer 218.
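
The reconstruction path described above (inverse quantization 214, inverse transform 216, summer 218) can be sketched as follows. As an assumption of this sketch, a 4x4 Hadamard matrix stands in for the actual inverse transform and the dequantizer is a plain multiplication by the step size; the clipping to the sample bit depth and all function names are likewise illustrative.

```python
# Illustrative only: a 4x4 Hadamard matrix stands in for the codec's actual transform.
H4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def dequantize(levels, step):
    """Inverse quantization 214: scale each level back by the step size."""
    return [[lv * step for lv in row] for row in levels]


def inverse_transform(coeffs):
    """Inverse transform 216, assumed to be X = H^T * Y * H / 16 with rounding."""
    x = matmul(matmul(transpose(H4), coeffs), H4)
    return [[(v + 8) >> 4 for v in row] for row in x]


def reconstruct(levels, predictor, step, bit_depth=8):
    """Summer 218: add the reconstructed residual to the predictor and clip."""
    residual = inverse_transform(dequantize(levels, step))
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predictor, residual)]


levels = [[4, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]  # decoded levels
predictor = [[8] * 4 for _ in range(4)]                     # from prediction
print(reconstruct(levels, predictor, step=8))
```

The output of this step is the set of unfiltered reconstructed pixels, before any in-loop filtering.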

[0043] The reconstructed block may further go through an In-Loop Filter 228 before it is stored in a Picture Buffer 226, which functions as a reference picture store. The reconstructed video in the Picture Buffer 226 may be sent to drive a display device, as well as used to predict future video blocks. In situations where the In-Loop Filter 228 is turned on, a filtering operation is performed on these reconstructed pixels to derive a final reconstructed Video Output 232.

[0044] In the decoder, bitstream is first decoded through entropy decoding module to derive quantized coefficient levels and prediction related info. Quantized coefficient levels are then processed through inverse quantization and inverse transform modules to obtain reconstructed prediction residual. Block predictor is formed through either intra prediction or motion compensation process based on prediction info decoded. The unfiltered reconstructed pixels are obtained by summing up the reconstructed prediction residual and the block predictor. In case in-loop filter is turned on, filtering operations are performed on these pixels to derive the final reconstructed video for output.

[0045] The first version of the HEVC standard was finalized in October 2013, which offers approximately 50% bit-rate saving or equivalent perceptual quality compared to the prior generation video coding standard H.264/MPEG AVC. Although the HEVC standard provides significant coding improvements over its predecessor, there is evidence that superior coding efficiency can be achieved with additional coding tools over HEVC. Based on that, both VCEG and MPEG started the exploration work of new coding technologies for future video coding standardization. One Joint Video Exploration Team (JVET) was formed in Oct. 2015 by ITU-T VCEG and ISO/IEC MPEG to begin significant study of advanced technologies that could enable substantial enhancement of coding efficiency. One reference software called joint exploration model (JEM) was maintained by the JVET by integrating several additional coding tools on top of the HEVC test model (HM).

[0046] In Oct. 2017, the joint call for proposals (CfP) on video compression with capability beyond HEVC was issued by ITU-T and ISO/IEC. In Apr. 2018, 23 CfP responses were received and evaluated at the 10th JVET meeting, which demonstrated a compression efficiency gain of around 40% over HEVC. Based on these evaluation results, the JVET launched a new project to develop the new-generation video coding standard named Versatile Video Coding (VVC). In the same month, a reference software codebase, called the VVC test model (VTM), was established to demonstrate a reference implementation of the VVC standard.

[0047] Like HEVC, the VVC is built upon the block-based hybrid video coding framework. FIG. 1 gives the block diagram of a generic block-based hybrid video encoding system. The input video signal is processed block by block (the blocks are called coding units (CUs)). In VTM-1.0, a CU can be up to 128x128 pixels. However, unlike the HEVC, which partitions blocks based only on quad-trees, in the VVC one coding tree unit (CTU) is split into CUs based on a quad/binary/ternary-tree to adapt to varying local characteristics. Additionally, the concept of multiple partition unit types in the HEVC is removed, i.e., the separation of CU, prediction unit (PU) and transform unit (TU) no longer exists in the VVC; instead, each CU is always used as the basic unit for both prediction and transform without further partitions. In the multi-type tree structure, one CTU is first partitioned by a quad-tree structure. Then, each quad-tree leaf node can be further partitioned by a binary and ternary tree structure.

[0048] FIG. 3A shows a diagram illustrating block quaternary partition in a multi-type tree structure, in accordance with the present disclosure.

[0049] FIG. 3B shows a diagram illustrating block vertical binary partition in a multi-type tree structure, in accordance with the present disclosure.

[0050] FIG. 3C shows a diagram illustrating block horizontal binary partition in a multi-type tree structure, in accordance with the present disclosure.

[0051] FIG. 3D shows a diagram illustrating block vertical ternary partition in a multi-type tree structure, in accordance with the present disclosure.

[0052] FIG. 3E shows a diagram illustrating block horizontal ternary partition in a multi-type tree structure, in accordance with the present disclosure.

[0053] As shown in FIGS. 3A-3E, there are five splitting types: quaternary partitioning, horizontal binary partitioning, vertical binary partitioning, horizontal ternary partitioning, and vertical ternary partitioning. In FIG. 1, spatial prediction and/or temporal prediction may be performed. Spatial prediction (or “intra prediction”) uses pixels from the samples of already coded neighboring blocks (which are called reference samples) in the same video picture/slice to predict the current video block. Spatial prediction reduces the spatial redundancy inherent in the video signal. Temporal prediction (also referred to as “inter prediction” or “motion compensated prediction”) uses reconstructed pixels from already coded video pictures to predict the current video block. Temporal prediction reduces the temporal redundancy inherent in the video signal. The temporal prediction signal for a given CU is usually signaled by one or more motion vectors (MVs), which indicate the amount and the direction of motion between the current CU and its temporal reference. Also, if multiple reference pictures are supported, one reference picture index is additionally sent, which identifies the reference picture in the reference picture store from which the temporal prediction signal comes. After spatial and/or temporal prediction, the mode decision block in the encoder chooses the best prediction mode, for example, based on the rate-distortion optimization method. The prediction block is then subtracted from the current video block, and the prediction residual is de-correlated using a transform and then quantized. The quantized residual coefficients are inverse quantized and inverse transformed to form the reconstructed residual, which is then added back to the prediction block to form the reconstructed signal of the CU. Further in-loop filtering, such as a deblocking filter, sample adaptive offset (SAO) and adaptive loop filter (ALF), may be applied to the reconstructed CU before it is put in the reference picture store and used to code future video blocks. To form the output video bitstream, the coding mode (inter or intra), prediction mode information, motion information, and quantized residual coefficients are all sent to the entropy coding unit to be further compressed and packed to form the bitstream.

[0054] FIG. 2 (described above) gives a general block diagram of a block-based video decoder. The video bitstream is first entropy decoded at entropy decoding unit. The coding mode and prediction information are sent to either the spatial prediction unit (if intra coded) or the temporal prediction unit (if inter coded) to form the prediction block. The residual transform coefficients are sent to inverse quantization unit and inverse transform unit to reconstruct the residual block. The prediction block and the residual block are then added together. The reconstructed block may further go through in-loop filtering before it is stored in reference picture store. The reconstructed video in reference picture store is then sent out to drive a display device, as well as used to predict future video blocks.

[0055] In general, the basic intra prediction scheme applied in the VVC is kept the same as that of the HEVC, except that several modules are further extended and/or improved, e.g., matrix weighted intra prediction (MIP) coding mode, intra sub-partition (ISP) coding mode, extended intra prediction with wide-angle intra directions, position-dependent intra prediction combination (PDPC) and 4-tap intra interpolation. The main focus of the disclosure is to improve the existing general constraint information design in the VVC standard. The related background knowledge is elaborated in the following sections.

[0056] Like HEVC, VVC uses a NAL unit based bitstream structure. A coded bitstream is partitioned into NAL units which, when conveyed over lossy packet networks, should be smaller than the maximum transfer unit size. Each NAL unit consists of a NAL unit header followed by the NAL unit payload. There are two conceptual classes of NAL units. Video coding layer (VCL) NAL units contain coded sample data, e.g., coded slice NAL units. Non-VCL NAL units contain metadata that typically belongs to more than one coded picture, or for which the association with a single coded picture would be meaningless, such as parameter set NAL units, or information that is not needed by the decoding process, such as SEI NAL units.

[0057] VVC inherits the parameter set concept of HEVC with a few modifications and additions. Parameter sets can be either part of the video bitstream or can be received by a decoder through other means (including out-of-band transmission using a reliable channel, hard coding in encoder and decoder, and so on). A parameter set contains an identification, which is referenced, directly or indirectly, from the slice header as discussed in more detail later. The referencing process is known as “activation.” Depending on the parameter set type, the activation occurs per picture or per sequence. The concept of activation through referencing was introduced, among other reasons, because implicit activation by virtue of the position of the information in the bitstream (as is common for other syntax elements of a video codec) is not available in the case of out-of-band transmission.

[0058] The video parameter set (VPS) was introduced to convey information that is applicable to multiple layers as well as sub-layers, and to enable a clean and extensible high-level design of multilayer codecs. Each layer of a given video sequence, regardless of whether the layers have the same or different sequence parameter sets (SPSs), refers to the same VPS.

[0059] In VVC, SPSs contain information which applies to all slices of a coded video sequence. A coded video sequence starts from an instantaneous decoding refresh (IDR) picture, or a BLA picture, or a CRA picture that is the first picture in the bitstream, and includes all subsequent pictures that are not an IDR or BLA picture. A bitstream consists of one or more coded video sequences. The content of the SPS can be roughly subdivided into six categories: 1) a self-reference (its own ID); 2) decoder operation point related information (profile, level, picture size, number of sub-layers, and so on); 3) enabling flags for certain tools within a profile, and associated coding tool parameters in case the tool is enabled; 4) information restricting the flexibility of structures and transform coefficient coding; 5) temporal scalability control; and 6) video usability information (VUI), which includes HRD information.

[0060] For decoder operation point related information in the SPS, there is a list of constraint flags that indicate properties that cannot be violated in the entire bitstream. The constraint flags are encapsulated in their own syntax structure, general_constraint_info( ). The syntax and the associated semantics of the general constraint information in the current VVC draft specification are illustrated in Table 1 and Table 2, respectively.

Table 1. General constraint information syntax

Table 2. General constraint information semantics

[0061] VVC’s picture parameter set (PPS) contains information which could change from picture to picture. The PPS includes information roughly comparable to what was part of the PPS in HEVC, including: 1) a self-reference; 2) initial picture control information such as the initial quantization parameter (QP) and a number of flags indicating the use of, or presence of, certain tools or control information in the slice header; and 3) tiling information.

[0062] The slice header contains information that can change from slice to slice, as well as such picture related information that is relatively small or relevant only for certain slice or picture types. The size of slice header may be noticeably bigger than the PPS, particularly when there are tile or wavefront entry point offsets in the slice header and RPS, prediction weights, or reference picture list modifications are explicitly signaled.

[0063] Versatile Video Coding (VVC)

[0064] At the 10th JVET meeting (April 10-20, 2018, San Diego, US), JVET defined the first draft of Versatile Video Coding (VVC) and the VVC Test Model 1 (VTM1) as its reference software implementation. It was decided to include a quadtree with a nested multi-type tree as the initial new coding feature of VVC. The multi-type tree is a coding block partition structure including both binary and ternary split. Since then, the reference software VTM, with both encoding and decoding process implemented, has been developed and updated through the following JVET meetings.

[0065] In VVC, a picture of an input video is partitioned into blocks called coding tree units (CTUs). A CTU is split into coding units (CUs) using a quadtree with a nested multi-type tree structure, with a CU defining a region of pixels sharing the same prediction mode (e.g., intra or inter). In this document, the term ‘unit’ defines a region of an image covering all components such as luma and chroma; the term ‘block’ is used to define a region covering a particular component (e.g., luma), and the blocks of different components (e.g., luma vs. chroma) may differ in spatial location when considering the chroma sampling format such as 4:2:0.

[0066] Partitioning of the Picture into CTUs

[0067] Pictures are divided into a sequence of coding tree units (CTUs). The CTU concept is the same as that of the HEVC. For a picture that has three sample arrays, a CTU consists of an NxN block of luma samples together with two corresponding blocks of chroma samples.

[0068] FIG. 4 shows an example of a picture divided into CTUs.

[0069] The maximum allowed size of the luma block in a CTU is specified to be 128x128 (although the maximum size of the luma transform blocks is 64x64).

[0070] Partitioning of the CTUs Using a Tree Structure

[0071] In HEVC, a CTU is split into CUs by using a quaternary-tree structure, denoted as a coding tree, to adapt to various local characteristics. The decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the leaf CU level. Each leaf CU can be further split into one, two or four PUs according to the PU splitting type. Inside one PU, the same prediction process is applied and the relevant information is transmitted to the decoder on a PU basis. After obtaining the residual block by applying the prediction process based on the PU splitting type, a leaf CU can be partitioned into transform units (TUs) according to another quaternary-tree structure similar to the coding tree for the CU. One key feature of the HEVC structure is that it has multiple partition concepts, including CU, PU, and TU.

[0072] In VVC, a quadtree with a nested multi-type tree using binary and ternary split segmentation structures replaces the concepts of multiple partition unit types, i.e., it removes the separation of the CU, PU and TU concepts except as needed for CUs that have a size too large for the maximum transform length, and supports more flexibility for CU partition shapes. In the coding tree structure, a CU can have either a square or rectangular shape. A coding tree unit (CTU) is first partitioned by a quaternary tree (a.k.a. quadtree) structure. Then the quaternary tree leaf nodes can be further partitioned by a multi-type tree structure. As shown in FIGS. 5A-5D, there are four splitting types in the multi-type tree structure.

[0073] FIG. 5A shows a vertical binary splitting (SPLIT_BT_VER).

[0074] FIG. 5B shows a horizontal binary splitting (SPLIT_BT_HOR).

[0075] FIG. 5C shows a vertical ternary splitting (SPLIT_TT_VER).

[0076] FIG. 5D shows a horizontal ternary splitting (SPLIT_TT_HOR).

[0077] The multi-type tree leaf nodes are called coding units (CUs), and unless the CU is too large for the maximum transform length, this segmentation is used for prediction and transform processing without any further partitioning. This means that, in most cases, the CU, PU and TU have the same block size in the quadtree with nested multi-type tree coding block structure. The exception occurs when maximum supported transform length is smaller than the width or height of the color component of the CU.
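As an illustration of the splitting types described above, the following minimal Python sketch computes the child block sizes produced by one split. The function name split_children and the "QT" label are hypothetical. The 1/4, 1/2, 1/4 ratio used for the ternary splits is an assumption taken from the general VVC design and is not stated in the text above.

```python
def split_children(width, height, split):
    """Return the (width, height) of each child block produced by one split.

    Illustrative sketch only. The ternary ratio 1/4 : 1/2 : 1/4 is an
    assumption about the VVC design, not something stated in this document.
    """
    if split == "QT":              # quaternary split: four equal quadrants
        return [(width // 2, height // 2)] * 4
    if split == "SPLIT_BT_VER":    # vertical binary: left and right halves
        return [(width // 2, height)] * 2
    if split == "SPLIT_BT_HOR":    # horizontal binary: top and bottom halves
        return [(width, height // 2)] * 2
    if split == "SPLIT_TT_VER":    # vertical ternary: three side-by-side parts
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if split == "SPLIT_TT_HOR":    # horizontal ternary: three stacked parts
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(f"unknown split type: {split}")


# Example: a 32x32 quadtree leaf split by a vertical ternary split.
children = split_children(32, 32, "SPLIT_TT_VER")
```

In every case the child areas add up to the parent area, which is what allows the resulting CUs to be used directly for prediction and transform without further partitioning.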

[0078] Syntax in VVC

[0079] In VVC, the first layer of syntax signaling in the bitstream is the Network Abstraction Layer (NAL), where the bitstream is divided into a set of NAL units. Some NAL units signal common control parameters to the decoder, such as the Sequence Parameter Sets (SPS) and Picture Parameter Sets (PPS). Others contain video data. The Video Coding Layer (VCL) NAL units contain slices of coded video. A coded picture is called an access unit and can be encoded as one or more slices.

[0080] A coded video sequence starts with an Instantaneous Decoder Refresh (IDR) picture. All following video pictures are coded as slices. A new IDR picture signals that the previous video segment is ended, and a new one begins. Each NAL unit begins with a one-byte header followed by the Raw Byte Sequence Payload (RBSP). The RBSP contains encoded slices. Slices are binary coded, so they may be padded with zero bits to ensure that the length is an integer number of bytes. A slice consists of a slice header and slice data. Slice data are specified as a series of CUs.

[0081] The picture header concept was adopted in the 16th JVET meeting to be transmitted once per picture as the first VCL NAL unit of a picture. It was also proposed to group some syntax elements previously in the slice header to this picture header. Syntax elements that functionally only need to be transmitted once per picture could be moved to the picture header instead of being transmitted multiple times in slices for a given picture.

[0082] In the VVC specification, the syntax tables specify a superset of the syntax of all allowed bitstreams. Additional constraints on the syntax may be specified, either directly or indirectly, in other clauses. Below is the syntax table of the slice header and picture header in VVC, Tables 3 and 4, respectively. The semantics of some syntax are also illustrated after the syntax table.

Table 3. Syntax of slice header

Table 4. Syntax of picture header

[0083] Semantics of Selected Syntax Elements

[0084] ph_temporal_mvp_enabled_flag specifies whether temporal motion vector predictors can be used for inter prediction for slices associated with the picture header (PH). If ph_temporal_mvp_enabled_flag is equal to 0, the syntax elements of the slices associated with the PH shall be constrained such that no temporal motion vector predictor is used in decoding of the slices. Otherwise (ph_temporal_mvp_enabled_flag is equal to 1), temporal motion vector predictors may be used in decoding of the slices associated with the PH. When not present, the value of ph_temporal_mvp_enabled_flag is inferred to be equal to 0. When no reference picture in the Decoded Picture Buffer (DPB) has the same spatial resolution as the current picture, the value of ph_temporal_mvp_enabled_flag shall be equal to 0.

[0085] The maximum number of subblock-based merging MVP candidates, MaxNumSubblockMergeCand, is derived as follows:

if( sps_affine_enabled_flag )
    MaxNumSubblockMergeCand = 5 - five_minus_max_num_subblock_merge_cand    (1)
else
    MaxNumSubblockMergeCand = sps_sbtmvp_enabled_flag && ph_temporal_mvp_enabled_flag

[0086] The value of MaxNumSubblockMergeCand shall be in the range of 0 to 5, inclusive.
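The derivation in equation (1) and the range requirement can be sketched in Python as follows. The function name is hypothetical, and raising an exception for an out-of-range result is an illustrative choice rather than behavior mandated by the specification.

```python
def max_num_subblock_merge_cand(sps_affine_enabled_flag,
                                five_minus_max_num_subblock_merge_cand,
                                sps_sbtmvp_enabled_flag,
                                ph_temporal_mvp_enabled_flag):
    """Derive MaxNumSubblockMergeCand as in equation (1)."""
    if sps_affine_enabled_flag:
        value = 5 - five_minus_max_num_subblock_merge_cand
    else:
        # Logical AND of two flags yields either 0 or 1.
        value = int(bool(sps_sbtmvp_enabled_flag)
                    and bool(ph_temporal_mvp_enabled_flag))
    # The derived value shall be in the range of 0 to 5, inclusive.
    if not 0 <= value <= 5:
        raise ValueError("MaxNumSubblockMergeCand out of range 0..5")
    return value
```

When affine prediction is disabled, at most one subblock merge candidate (the subblock-based temporal candidate) is available, and only if both the SPS-level and the picture-header-level flags permit it.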

[0087] slice_collocated_from_l0_flag equal to 1 specifies that the collocated picture used for temporal motion vector prediction is derived from reference picture list 0. slice_collocated_from_l0_flag equal to 0 specifies that the collocated picture used for temporal motion vector prediction is derived from reference picture list 1.

[0088] When slice_type is equal to B or P, ph_temporal_mvp_enabled_flag is equal to 1, and slice_collocated_from_l0_flag is not present, the following applies:

- If rpl_info_in_ph_flag is equal to 1, slice_collocated_from_l0_flag is inferred to be equal to ph_collocated_from_l0_flag.

- Otherwise (rpl_info_in_ph_flag is equal to 0 and slice_type is equal to P), the value of slice_collocated_from_l0_flag is inferred to be equal to 1.

[0089] slice_collocated_ref_idx specifies the reference index of the collocated picture used for temporal motion vector prediction.

[0090] When slice_type is equal to P or when slice_type is equal to B and slice_collocated_from_l0_flag is equal to 1, slice_collocated_ref_idx refers to an entry in reference picture list 0, and the value of slice_collocated_ref_idx shall be in the range of 0 to NumRefIdxActive[ 0 ] - 1, inclusive.

[0091] When slice_type is equal to B and slice_collocated_from_l0_flag is equal to 0, slice_collocated_ref_idx refers to an entry in reference picture list 1, and the value of slice_collocated_ref_idx shall be in the range of 0 to NumRefIdxActive[ 1 ] - 1, inclusive.

[0092] When slice_collocated_ref_idx is not present, the following applies:

- If rpl_info_in_ph_flag is equal to 1, the value of slice_collocated_ref_idx is inferred to be equal to ph_collocated_ref_idx.

- Otherwise (rpl_info_in_ph_flag is equal to 0), the value of slice_collocated_ref_idx is inferred to be equal to 0.
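The inference rules above for absent collocated-picture syntax elements can be sketched as follows. The function names are hypothetical, and the sketch assumes the caller has already established that ph_temporal_mvp_enabled_flag is equal to 1 and that the element in question is not present in the slice header. Treating a B slice with rpl_info_in_ph_flag equal to 0 as an error is an assumption, since the text only gives an inference for P slices in that case.

```python
def infer_collocated_from_l0_flag(slice_type, rpl_info_in_ph_flag,
                                  ph_collocated_from_l0_flag=None):
    """Infer slice_collocated_from_l0_flag when it is not present."""
    if slice_type not in ("P", "B"):
        raise ValueError("inference applies only to P and B slices")
    if rpl_info_in_ph_flag:
        return ph_collocated_from_l0_flag
    if slice_type == "P":
        return 1
    # Assumption: for a B slice without RPL info in the PH, the flag is
    # expected to be signalled, so no inference rule applies.
    raise ValueError("no inference rule for this combination")


def infer_collocated_ref_idx(rpl_info_in_ph_flag, ph_collocated_ref_idx=None):
    """Infer slice_collocated_ref_idx when it is not present."""
    return ph_collocated_ref_idx if rpl_info_in_ph_flag else 0


def collocated_list_index(slice_type, slice_collocated_from_l0_flag):
    """Return 0 or 1: the reference picture list holding the collocated picture."""
    if slice_type == "P" or slice_collocated_from_l0_flag == 1:
        return 0
    return 1
```

The last helper mirrors the rule that slice_collocated_ref_idx indexes list 0 for P slices and for B slices whose flag is 1, and list 1 otherwise.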

[0093] It is a requirement of bitstream conformance that the picture referred to by slice_collocated_ref_idx shall be the same for all slices of a coded picture.

It is a requirement of bitstream conformance that the values of pic_width_in_luma_samples and pic_height_in_luma_samples of the reference picture referred to by slice_collocated_ref_idx shall be equal to the values of pic_width_in_luma_samples and pic_height_in_luma_samples, respectively, of the current picture, and RprConstraintsActive[ slice_collocated_from_l0_flag ? 0 : 1 ][ slice_collocated_ref_idx ] shall be equal to 0.

[0094] It is noted that the values of RprConstraintsActive[ i ][ j ] are derived in section 8.3.2 of the VVC specification, as excerpted below.

[0095] Decoding Process for Reference Picture Lists Construction

[0096] This process is invoked at the beginning of the decoding process for each slice of a non-IDR picture.

[0097] Reference pictures are addressed through reference indices. A reference index is an index into a reference picture list. When decoding an I slice, no reference picture list is used in decoding of the slice data. When decoding a P slice, only reference picture list 0 (i.e., RefPicList[ 0 ]) is used in decoding of the slice data. When decoding a B slice, both reference picture list 0 and reference picture list 1 (i.e., RefPicList[ 1 ]) are used in decoding of the slice data.

[0098] At the beginning of the decoding process for each slice of a non-IDR picture, the reference picture lists RefPicList[ 0 ] and RefPicList[ 1 ] are derived. The reference picture lists are used in marking of reference pictures as specified in clause 8.3.3 or in decoding of the slice data.

[0099] NOTE 1 - For an I slice of a non-IDR picture that is not the first slice of the picture, RefPicList[ 0 ] and RefPicList[ 1 ] may be derived for bitstream conformance checking purposes, but their derivation is not necessary for decoding of the current picture or pictures following the current picture in decoding order. For a P slice that is not the first slice of a picture, RefPicList[ 1 ] may be derived for bitstream conformance checking purposes, but its derivation is not necessary for decoding of the current picture or pictures following the current picture in decoding order.

[00100] The reference picture lists RefPicList[ 0 ] and RefPicList[ 1 ], the reference picture scaling ratios RefPicScale[ i ][ j ][ 0 ] and RefPicScale[ i ][ j ][ 1 ], and the reference picture scaled flags RprConstraintsActive[ 0 ][ j ] and RprConstraintsActive[ 1 ][ j ] are derived as follows:

for( i = 0; i < 2; i++ ) {
    for( j = 0, k = 0, pocBase = PicOrderCntVal; j < num_ref_entries[ i ][ RplsIdx[ i ] ]; j++ ) {
        if( !inter_layer_ref_pic_flag[ i ][ RplsIdx[ i ] ][ j ] ) {
            if( st_ref_pic_flag[ i ][ RplsIdx[ i ] ][ j ] ) {
                RefPicPocList[ i ][ j ] = pocBase - DeltaPocValSt[ i ][ RplsIdx[ i ] ][ j ]
                if( there is a reference picture picA in the DPB with the same nuh_layer_id as the current picture and PicOrderCntVal equal to RefPicPocList[ i ][ j ] )
                    RefPicList[ i ][ j ] = picA
                else
                    RefPicList[ i ][ j ] = "no reference picture"    (2)
                pocBase = RefPicPocList[ i ][ j ]
            } else {
                if( !delta_poc_msb_cycle_lt[ i ][ k ] ) {
                    if( there is a reference picA in the DPB with the same nuh_layer_id as the current picture and PicOrderCntVal & ( MaxPicOrderCntLsb - 1 ) equal to PocLsbLt[ i ][ k ] )
                        RefPicList[ i ][ j ] = picA
                    else
                        RefPicList[ i ][ j ] = "no reference picture"
                    RefPicLtPocList[ i ][ j ] = PocLsbLt[ i ][ k ]
                } else {
                    if( there is a reference picA in the DPB with the same nuh_layer_id as the current picture and PicOrderCntVal equal to FullPocLt[ i ][ k ] )
                        RefPicList[ i ][ j ] = picA
                    else
                        RefPicList[ i ][ j ] = "no reference picture"
                    RefPicLtPocList[ i ][ j ] = FullPocLt[ i ][ k ]
                }
                k++
            }
        } else {
            layerIdx = DirectRefLayerIdx[ GeneralLayerIdx[ nuh_layer_id ] ][ ilrp_idx[ i ][ RplsIdx ][ j ] ]
            refPicLayerId = vps_layer_id[ layerIdx ]
            if( there is a reference picture picA in the DPB with nuh_layer_id equal to refPicLayerId and the same PicOrderCntVal as the current picture )
                RefPicList[ i ][ j ] = picA
            else
                RefPicList[ i ][ j ] = "no reference picture"
        }
        fRefWidth is set equal to PicOutputWidthL of the reference picture RefPicList[ i ][ j ]
        fRefHeight is set equal to PicOutputHeightL of the reference picture RefPicList[ i ][ j ]
        refPicWidth, refPicHeight, refScalingWinLeftOffset, refScalingWinRightOffset, refScalingWinTopOffset, and refScalingWinBottomOffset are set equal to the values of pic_width_in_luma_samples, pic_height_in_luma_samples, scaling_win_left_offset, scaling_win_right_offset, scaling_win_top_offset, and scaling_win_bottom_offset, respectively, of the reference picture RefPicList[ i ][ j ]
        RefPicScale[ i ][ j ][ 0 ] = ( ( fRefWidth << 14 ) + ( PicOutputWidthL >> 1 ) ) / PicOutputWidthL
        RefPicScale[ i ][ j ][ 1 ] = ( ( fRefHeight << 14 ) + ( PicOutputHeightL >> 1 ) ) / PicOutputHeightL
        RprConstraintsActive[ i ][ j ] = ( pic_width_in_luma_samples != refPicWidth ||
                pic_height_in_luma_samples != refPicHeight ||
                scaling_win_left_offset != refScalingWinLeftOffset ||
                scaling_win_right_offset != refScalingWinRightOffset ||
                scaling_win_top_offset != refScalingWinTopOffset ||
                scaling_win_bottom_offset != refScalingWinBottomOffset )
    }
}
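The last steps of the derivation above, the fixed-point scaling ratios and the RprConstraintsActive flag, can be sketched in Python. This is a simplified illustration under stated assumptions: pictures are represented as plain dictionaries keyed by syntax element name, and the helper names ref_pic_scale, rpr_constraints_active and SCALING_FIELDS are hypothetical.

```python
# Parameters compared between the current picture and a reference picture.
SCALING_FIELDS = (
    "pic_width_in_luma_samples", "pic_height_in_luma_samples",
    "scaling_win_left_offset", "scaling_win_right_offset",
    "scaling_win_top_offset", "scaling_win_bottom_offset",
)


def ref_pic_scale(f_ref_width, f_ref_height,
                  pic_output_width_l, pic_output_height_l):
    """Return (horizontal, vertical) scaling ratios in 14-bit fixed point.

    Adding half the divisor before the integer division rounds the ratio
    to the nearest representable value; 1 << 14 (16384) means "no scaling".
    """
    hor = ((f_ref_width << 14) + (pic_output_width_l >> 1)) // pic_output_width_l
    ver = ((f_ref_height << 14) + (pic_output_height_l >> 1)) // pic_output_height_l
    return hor, ver


def rpr_constraints_active(cur, ref):
    """True when any size or scaling-window parameter differs between pictures."""
    return any(cur[name] != ref[name] for name in SCALING_FIELDS)


# Example: a reference picture twice as large as the current picture.
scale = ref_pic_scale(3840, 2160, 1920, 1080)
```

A reference picture whose size and scaling window match the current picture yields a ratio of 16384 in both directions and leaves RprConstraintsActive equal to 0, which is the condition required of the collocated picture.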

[00101] scaling_win_left_offset, scaling_win_right_offset, scaling_win_top_offset, and scaling_win_bottom_offset specify the offsets that are applied to the picture size for scaling ratio calculation. When not present, the values of scaling_win_left_offset, scaling_win_right_offset, scaling_win_top_offset, and scaling_win_bottom_offset are inferred to be equal to pps_conf_win_left_offset, pps_conf_win_right_offset, pps_conf_win_top_offset, and pps_conf_win_bottom_offset, respectively.

[00102] The value of SubWidthC * ( scaling_win_left_offset + scaling_win_right_offset ) shall be less than pic_width_in_luma_samples, and the value of SubHeightC * ( scaling_win_top_offset + scaling_win_bottom_offset ) shall be less than pic_height_in_luma_samples.

[00103] The variables PicOutputWidthL and PicOutputHeightL are derived as follows:

PicOutputWidthL = pic_width_in_luma_samples - SubWidthC * ( scaling_win_right_offset + scaling_win_left_offset )    (3)

PicOutputHeightL = pic_height_in_luma_samples - SubHeightC * ( scaling_win_bottom_offset + scaling_win_top_offset )

[00104] Let refPicOutputWidthL and refPicOutputHeightL be the PicOutputWidthL and PicOutputHeightL, respectively, of a reference picture of a current picture referring to this PPS. It is a requirement of bitstream conformance that all of the following conditions are satisfied:

- PicOutputWidthL * 2 shall be greater than or equal to refPicWidthInLumaSamples.

- PicOutputHeightL * 2 shall be greater than or equal to refPicHeightInLumaSamples.

- PicOutputWidthL shall be less than or equal to refPicWidthInLumaSamples * 8.

- PicOutputHeightL shall be less than or equal to refPicHeightInLumaSamples * 8.

- PicOutputWidthL * pic_width_max_in_luma_samples shall be greater than or equal to refPicOutputWidthL * ( pic_width_in_luma_samples - Max( 8, MinCbSizeY ) ).

- PicOutputHeightL * pic_height_max_in_luma_samples shall be greater than or equal to refPicOutputHeightL * ( pic_height_in_luma_samples - Max( 8, MinCbSizeY ) ).
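The output-size derivation and the six conformance conditions listed above can be checked with a short Python sketch. The function and parameter names are hypothetical. The sketch assumes that the vertical offsets are scaled by SubHeightC, consistent with the range constraint on the scaling window offsets stated earlier.

```python
def pic_output_size(pic_width, pic_height, sub_width_c, sub_height_c,
                    left, right, top, bottom):
    """Return (PicOutputWidthL, PicOutputHeightL) for one picture.

    Assumption: vertical offsets are multiplied by SubHeightC.
    """
    out_w = pic_width - sub_width_c * (right + left)
    out_h = pic_height - sub_height_c * (bottom + top)
    return out_w, out_h


def scaling_conformance_ok(out_w, out_h, ref_w, ref_h, ref_out_w, ref_out_h,
                           pic_w, pic_h, pic_w_max, pic_h_max, min_cb_size_y):
    """Check the six listed conditions between a picture and one reference."""
    margin = max(8, min_cb_size_y)
    return all((
        out_w * 2 >= ref_w,          # reference at most 2x wider
        out_h * 2 >= ref_h,          # reference at most 2x taller
        out_w <= ref_w * 8,          # reference at most 8x narrower
        out_h <= ref_h * 8,          # reference at most 8x shorter
        out_w * pic_w_max >= ref_out_w * (pic_w - margin),
        out_h * pic_h_max >= ref_out_h * (pic_h - margin),
    ))
```

In effect, the first four conditions bound the scaling ratio between a picture and its reference to at most 2x downscaling and 8x upscaling.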

[00105] NAL Unit Syntax

[00106] Similar to HEVC, in the VVC specification, one NAL unit header table with the total length of two bytes is signaled at the beginning of each NAL unit to specify the basic information of the NAL unit. Table 5 illustrates the syntax elements that exist in the current NAL unit header.

Table 5. Syntax of NAL unit header

[00107] In Table 5, the first bit is forbidden_zero_bit, which is used to indicate whether any error occurred during transmission: 0 means that the NAL unit is normal, while 1 means that there is a syntax violation. Therefore, for a normal bitstream, its value shall be equal to 0. The next bit is nuh_reserved_zero_bit, which is reserved for future use and shall be equal to 0. The following 6 bits specify the value of the syntax element nuh_layer_id, which identifies the layer to which the NAL unit belongs. The value of nuh_layer_id shall be in the range of 0 to 55, inclusive. Other values for nuh_layer_id are reserved for future use. After that, the syntax element nal_unit_type specifies the NAL unit type, i.e., the type of RBSP data structure contained in the NAL unit, as specified in the table below.

Table 6. NAL unit type
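The two-byte NAL unit header described above can be parsed with a few bit operations, as in the following sketch. The function name is hypothetical. The widths of the last two fields (5 bits for nal_unit_type followed by 3 bits for nuh_temporal_id_plus1) are an assumption based on the published VVC NAL unit header layout, since the text above does not give them.

```python
def parse_nal_unit_header(data: bytes) -> dict:
    """Parse the two-byte NAL unit header into its syntax elements.

    Assumed layout (most significant bit first):
      forbidden_zero_bit (1) | nuh_reserved_zero_bit (1) | nuh_layer_id (6)
      nal_unit_type (5)      | nuh_temporal_id_plus1 (3)
    """
    if len(data) < 2:
        raise ValueError("a NAL unit header needs two bytes")
    b0, b1 = data[0], data[1]
    header = {
        "forbidden_zero_bit": b0 >> 7,
        "nuh_reserved_zero_bit": (b0 >> 6) & 0x01,
        "nuh_layer_id": b0 & 0x3F,
        "nal_unit_type": b1 >> 3,
        "nuh_temporal_id_plus1": b1 & 0x07,
    }
    # Constraints stated in the text: both leading bits are 0 and
    # nuh_layer_id is in the range 0..55.
    if header["forbidden_zero_bit"] != 0:
        raise ValueError("forbidden_zero_bit set: syntax violation")
    if header["nuh_reserved_zero_bit"] != 0:
        raise ValueError("nuh_reserved_zero_bit shall be 0")
    if header["nuh_layer_id"] > 55:
        raise ValueError("nuh_layer_id value is reserved")
    return header


# Example: layer 1, NAL unit type 8, temporal id plus one equal to 1.
hdr = parse_nal_unit_header(bytes([0x01, 0x41]))
```

Rejecting reserved values outright is an illustrative choice; a decoder might instead ignore such NAL units.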

[00108] Gradual intra refreshing

[00109] Low latency and error resilience are two important factors that should be considered for a practical video transmission system. Intra refreshing, which periodically inserts IRAP pictures, is commonly used to limit error propagation among temporal pictures and to enhance the error resilience of the bitstream. However, because the coding efficiency of inter coding is much better than that of intra coding, the relatively large size of intra pictures could cause a latency issue when they are sent through a network with a fixed transmission rate. This can lead to undesirable network congestion and packet losses. To address this issue, gradual intra refreshing (GDR), which spreads the intra coded regions among multiple inter pictures, was adopted into the VVC standard, as depicted in FIG. 6.

[00110] FIG. 6 shows a diagram illustrating gradual intra refreshing, in accordance with the present disclosure. 610 is an intra region of a first picture. 612 is a dirty region of the first picture. 614 is a clean region of a second picture. 616 is an intra region of the second picture. 618 is a dirty region of the second picture. 620 is a clean region of a third picture. 622 is an intra region of the third picture. 624 is a dirty region of the third picture. 626 is a clean region of a fourth picture. 628 is an intra region of the fourth picture. 630 is a dirty region of the fourth picture.

[00111] As shown in FIG. 6, three regions are defined. The clean region corresponds to the pixels that have been refreshed during the current GDR period, the dirty region corresponds to the area that has not been refreshed, and the intra region corresponds to the coding blocks where intra coding is applied. The principle of the GDR is to ensure that pixels from the clean region are reconstructed using pixels coming only from the refreshed area of the temporal reference pictures in the same GDR period. In the current VVC, there are three GDR-related syntax elements, ph_gdr_or_irap_pic_flag, ph_gdr_pic_flag and ph_recovery_poc_cnt, signaled in the picture header. Table 7 illustrates the corresponding GDR signaling in the picture header and the associated semantics.

Table 7. GDR signaling

[00112] ph_gdr_or_irap_pic_flag equal to 1 specifies that the current picture is a GDR or IRAP picture. ph_gdr_or_irap_pic_flag equal to 0 specifies that the current picture is not a GDR picture and may or may not be an IRAP picture.

[00113] ph_gdr_pic_flag equal to 1 specifies that the picture associated with the PH is a GDR picture. ph_gdr_pic_flag equal to 0 specifies that the picture associated with the PH is not a GDR picture. When not present, the value of ph_gdr_pic_flag is inferred to be equal to 0. When sps_gdr_enabled_flag is equal to 0, the value of ph_gdr_pic_flag shall be equal to 0.

[00114] NOTE 1 - When ph_gdr_or_irap_pic_flag is equal to 1 and ph_gdr_pic_flag is equal to 0, the picture associated with the PH is an IRAP picture.
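The semantics of the two GDR-related flags can be summarized as a small decision function. This is a sketch with a hypothetical function name and return labels. Passing None for ph_gdr_pic_flag models the "not present" case, and treating contradictory flag combinations as errors is an illustrative choice.

```python
def classify_picture(ph_gdr_or_irap_pic_flag, ph_gdr_pic_flag=None,
                     sps_gdr_enabled_flag=1):
    """Classify a picture from its picture-header GDR flags."""
    if ph_gdr_pic_flag is None:
        ph_gdr_pic_flag = 0           # inferred to be 0 when not present
    if ph_gdr_pic_flag and not sps_gdr_enabled_flag:
        raise ValueError("ph_gdr_pic_flag shall be 0 when GDR is disabled")
    if ph_gdr_pic_flag and not ph_gdr_or_irap_pic_flag:
        raise ValueError("a GDR picture requires ph_gdr_or_irap_pic_flag = 1")
    if ph_gdr_or_irap_pic_flag:
        return "GDR" if ph_gdr_pic_flag else "IRAP"
    # Not a GDR picture; the flags alone do not tell whether it is an IRAP.
    return "non-GDR"
```

The "non-GDR" result reflects that ph_gdr_or_irap_pic_flag equal to 0 rules out a GDR picture but leaves open whether the picture is an IRAP picture.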

[00115] ph_recovery_poc_cnt specifies the recovery point of decoded pictures in output order.

[00116] When the current picture is a GDR picture, the variable recoveryPointPocVal is derived as follows:

recoveryPointPocVal = PicOrderCntVal + ph_recovery_poc_cnt    (4)

[00117] If the current picture is a GDR picture, and there is a picture picA that follows the current GDR picture in decoding order in the CLVS that has PicOrderCntVal equal to recoveryPointPocVal, the picture picA is referred to as the recovery point picture. Otherwise, the first picture in output order that has PicOrderCntVal greater than recoveryPointPocVal in the CLVS is referred to as the recovery point picture. The recovery point picture shall not precede the current GDR picture in decoding order. The pictures that are associated with the current GDR picture and have PicOrderCntVal less than recoveryPointPocVal are referred to as the recovering pictures of the GDR picture. The value of ph_recovery_poc_cnt shall be in the range of 0 to MaxPicOrderCntLsb - 1, inclusive.
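Equation (4) and the selection of the recovery point picture can be sketched as follows. The function names are hypothetical, and pictures are represented only by their PicOrderCntVal values. The list passed to the second function is assumed to hold the pictures that follow the GDR picture in decoding order within the same CLVS.

```python
def recovery_point_poc(pic_order_cnt_val, ph_recovery_poc_cnt,
                       max_pic_order_cnt_lsb):
    """Derive recoveryPointPocVal as in equation (4)."""
    if not 0 <= ph_recovery_poc_cnt <= max_pic_order_cnt_lsb - 1:
        raise ValueError("ph_recovery_poc_cnt out of range")
    return pic_order_cnt_val + ph_recovery_poc_cnt


def find_recovery_point_picture(following_pocs, recovery_point_poc_val):
    """Return the POC of the recovery point picture, or None if none exists.

    Prefer an exact POC match; otherwise take the first picture in output
    order (smallest POC) whose POC exceeds recoveryPointPocVal.
    """
    if recovery_point_poc_val in following_pocs:
        return recovery_point_poc_val
    later = [poc for poc in following_pocs if poc > recovery_point_poc_val]
    return min(later) if later else None


def recovering_pictures(following_pocs, recovery_point_poc_val):
    """POCs of the recovering pictures: those below recoveryPointPocVal."""
    return [poc for poc in following_pocs if poc < recovery_point_poc_val]
```

For a GDR picture with PicOrderCntVal 16 and ph_recovery_poc_cnt 4, recoveryPointPocVal is 20; if no picture with that exact value exists, the next picture in output order takes its place.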

[00118] NOTE 3 - When sps_gdr_enabled_flag is equal to 1 and PicOrderCntVal of the current picture is greater than or equal to recoveryPointPocVal of the associated GDR picture, the current and subsequent decoded pictures in output order are an exact match to the corresponding pictures produced by starting the decoding process from the previous IRAP picture, when present, preceding the associated GDR picture in decoding order.

[00119] Mixed NAL types in one picture

[00120] Unlike the HEVC standard, where the NAL unit types of the slices within one picture have to be the same, VVC allows a mix of IRAP and non-IRAP NAL unit types within one picture. The motivation for this functionality is region-based random access using sub-pictures. For example, in 360-degree video streaming, some areas of a 360-degree video can be watched far more often than others. To better trade off coding efficiency against the average viewpoint switching latency, IRAP pictures can be used more frequently to code the more-often-watched areas than the other areas. For this reason, one flag, pps_mixed_nalu_types_in_pic_flag, is introduced in the PPS. When the flag is equal to one, it indicates that each picture referring to the PPS has more than one NAL unit and the NAL units do not have the same value of nal_unit_type. Otherwise (when the flag is equal to zero), each picture referring to the PPS has one or more NAL units and the NAL units of each picture referring to the PPS have the same value of nal_unit_type. Additionally, when the flag pps_mixed_nalu_types_in_pic_flag is equal to one, a bitstream conformance constraint is further applied: for any particular picture, some NAL units have a particular IRAP NAL unit type and the others have a particular non-IRAP NAL unit type. In other words, the NAL units of any particular picture cannot have more than one IRAP NAL unit type and cannot have more than one non-IRAP NAL unit type, as specified below:

[00121] For VCL NAL units of any particular picture, the following applies:

- If pps_mixed_nalu_types_in_pic_flag is equal to 0, the value of nal_unit_type shall be the same for all VCL NAL units of a picture, and a picture or a PU is referred to as having the same NAL unit type as the coded slice NAL units of the picture or PU.

- Otherwise (pps_mixed_nalu_types_in_pic_flag is equal to 1), the following applies:

- The picture shall have at least two subpictures.

- VCL NAL units of the picture shall have two or more different nal_unit_type values.

- There shall be no VCL NAL unit of the picture that has nal_unit_type equal to GDR_NUT.

- When the VCL NAL units of at least one subpicture of the picture have a particular value of nal_unit_type equal to IDR_W_RADL, IDR_N_LP, or CRA_NUT, the VCL NAL units of other subpictures in the picture shall all have nal_unit_type equal to TRAIL_NUT.
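The rules above can be read as a small conformance check. The Python sketch below is only an illustration under simplifying assumptions: each subpicture is represented by a single nal_unit_type string, and the function name `check_vcl_nal_types` and the error messages are hypothetical.

```python
from typing import List

IRAP_TYPES = {"IDR_W_RADL", "IDR_N_LP", "CRA_NUT"}


def check_vcl_nal_types(pps_mixed_nalu_types_in_pic_flag: int,
                        subpic_nal_types: List[str]) -> List[str]:
    """Return a list of violated rules (empty when the picture conforms).

    subpic_nal_types holds one nal_unit_type name per subpicture; this
    one-type-per-subpicture model is an assumption of the sketch.
    """
    errors = []
    distinct = set(subpic_nal_types)
    if pps_mixed_nalu_types_in_pic_flag == 0:
        if len(distinct) > 1:
            errors.append("all VCL NAL units shall share one nal_unit_type")
        return errors
    if len(subpic_nal_types) < 2:
        errors.append("picture shall have at least two subpictures")
    if len(distinct) < 2:
        errors.append("two or more different nal_unit_type values required")
    if "GDR_NUT" in distinct:
        errors.append("GDR_NUT is not allowed in a mixed picture")
    irap = distinct & IRAP_TYPES
    if irap:
        non_irap = distinct - irap
        # One particular IRAP type, and every other subpicture is TRAIL_NUT.
        if len(irap) > 1 or (non_irap - {"TRAIL_NUT"}):
            errors.append("IRAP subpictures may only be mixed with TRAIL_NUT")
    return errors
```

For example, a picture with one IDR_W_RADL subpicture and one TRAIL_NUT subpicture passes when the flag is 1, whereas the same picture is rejected when the flag is 0.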

[00122] Improvements to General Constraint Information and Syntax Elements

In current VVC, no_sbt_constraint_flag is signaled in the general constraint information without any constraint. However, the feature controlled by the flag no_sbt_constraint_flag is only applicable when the slice is an inter slice. Therefore, when the slice is an intra slice, the value of no_sbt_constraint_flag shall be equal to 1.

[00123] Similarly, no_act_constraint_flag and no_chroma_qp_offset_constraint_flag are signaled in the general constraint information without any constraint. However, the features controlled by the flags no_act_constraint_flag and no_chroma_qp_offset_constraint_flag are only applicable when the chroma format is not monochrome. Therefore, the value of these two flags shall be equal to 1 when the chroma format is monochrome.

[00124] Similarly, in another example, no_mixed_nalu_types_in_pic_constraint_flag is signaled in the general constraint information without any constraint. However, the feature controlled by the flag no_mixed_nalu_types_in_pic_constraint_flag is only applicable when the picture has at least two subpictures. Therefore, the value of no_mixed_nalu_types_in_pic_constraint_flag shall be equal to 1 when the picture has one subpicture.

[00125] Similarly, in yet another example, no_prof_constraint_flag is signaled in the general constraint information without any constraint. However, the feature controlled by the flag no_prof_constraint_flag is only applicable when the affine mode is enabled. Therefore, the value of no_prof_constraint_flag shall be equal to 1 when the affine mode is disabled.

[00126] Similarly, in one more example, no_bdpcm_constraint_flag is signaled in the general constraint information without any constraint. However, the feature controlled by the flag no_bdpcm_constraint_flag is only applicable when the transform skip mode is enabled. Therefore, the value of no_bdpcm_constraint_flag shall be equal to 1 when the transform skip mode is disabled.
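The dependencies described in the preceding paragraphs can be summarized as a table-like check. The following Python sketch is illustrative only: the boolean inputs (`only_intra_slices`, `monochrome`, `single_subpicture`, `affine_enabled`, `transform_skip_enabled`) and both function names are hypothetical stand-ins for the conditions named in the text, not syntax elements of VVC.

```python
from typing import Dict, List, Set


def required_constraint_flags(only_intra_slices: bool,
                              monochrome: bool,
                              single_subpicture: bool,
                              affine_enabled: bool,
                              transform_skip_enabled: bool) -> Set[str]:
    """Names of the constraint flags that shall be equal to 1."""
    required = set()
    if only_intra_slices:          # SBT applies to inter slices only
        required.add("no_sbt_constraint_flag")
    if monochrome:                 # ACT and chroma QP offsets need chroma
        required.add("no_act_constraint_flag")
        required.add("no_chroma_qp_offset_constraint_flag")
    if single_subpicture:          # mixed NAL types need >= 2 subpictures
        required.add("no_mixed_nalu_types_in_pic_constraint_flag")
    if not affine_enabled:         # PROF depends on affine mode
        required.add("no_prof_constraint_flag")
    if not transform_skip_enabled:  # BDPCM depends on transform skip
        required.add("no_bdpcm_constraint_flag")
    return required


def constraint_violations(gci: Dict[str, int], **conditions: bool) -> List[str]:
    """Flags that should be 1 under the given conditions but are not."""
    needed = required_constraint_flags(**conditions)
    return sorted(name for name in needed if gci.get(name, 0) != 1)
```

A caller would pass the decoded general constraint flags as a dictionary and the five conditions as keyword arguments; an empty result means none of the described constraints is violated.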

[00127] It is also observed that in current VVC, several coding tools are missing from the general constraint information syntax. Flags for these coding tools should be added to provide the same general constraint controls as for the other tools.

[00128] In current VVC, mvd_l1_zero_flag is signaled in the picture header (PH) without any conditional constraint. However, the feature controlled by the flag mvd_l1_zero_flag is only applicable when the slice is a bi-predictive slice (B-slice). Therefore, the flag signaling is redundant when the slice associated with the picture header is not a B-slice.

[00129] Similarly, in another example, ph_disable_bdof_flag and ph_disable_dmvr_flag are signaled in the PH only when the corresponding enabling flags (sps_bdof_pic_present_flag, sps_dmvr_pic_present_flag) signaled in the sequence parameter set (SPS) are true, respectively. As shown in Table 8, however, the features controlled by the flags ph_disable_bdof_flag and ph_disable_dmvr_flag are only applicable when the slice is a bi-predictive slice (B-slice). Therefore, the signaling of these two flags is redundant when the slices associated with the picture header are not B-slices.

Table 8. Flag features

[00130] One more example can be seen in the syntax element ph_collocated_from_l0_flag, which indicates whether the collocated picture is from list 0 or list 1. Another example can be seen in the syntax structure pred_weight_table( ), which contains the syntax elements related to the weighting tables for bi-predictive prediction.

Table 9. Flag features

[00131] A third problem is associated with the syntax element ph_temporal_mvp_enabled_flag. In current VVC, because the resolution of the collocated picture selected for TMVP derivation shall be the same as the resolution of the current picture, there is a bitstream conformance constraint to check the value of ph_temporal_mvp_enabled_flag as illustrated below:

[00132] When no reference picture in the DPB has the same spatial resolution as the current picture, the value of ph_temporal_mvp_enabled_flag shall be equal to 0.

[00133] However, in current VVC, not only the resolution of the collocated picture will affect the enabling of TMVP, but also the offsets that are applied to the picture size for scaling ratio calculation affect the enabling of TMVP. In current VVC, however, the offsets are not considered in the bitstream conformance of ph_temporal_mvp_enabled_flag.

[00134] Moreover, there is a requirement of bitstream conformance that the picture referred to by slice_collocated_ref_idx shall be the same for all slices of a coded picture. However, when a coded picture has multiple slices and no common reference picture exists among all these slices, this bitstream conformance requirement cannot be met. In such a case, ph_temporal_mvp_enabled_flag should be constrained to be 0.

[00135] According to the current VVC specification, an IRAP picture is a picture for which all the associated NAL units have the same nal_unit_type, which belongs to the IRAP NAL unit types. Specifically, the following description is used to define the IRAP picture in the VVC specification:

[00136] intra random access point (IRAP) picture: A coded picture for which all VCL NAL units have the same value of nal_unit_type in the range of IDR_W_RADL to CRA_NUT, inclusive.

[00137] NOTE 1 - An IRAP picture does not use inter prediction in its decoding process, and may be a CRA picture or an IDR picture. The first picture in the bitstream in decoding order must be an IRAP or GDR picture. Provided the necessary parameter sets are available when they need to be referenced, the IRAP picture and all subsequent non-RASL pictures in the CLVS in decoding order can be correctly decoded without performing the decoding process of any pictures that precede the IRAP picture in decoding order.

[00138] NOTE 2 - The value of pps_mixed_nalu_types_in_pic_flag for an IRAP picture is equal to 0. When pps_mixed_nalu_types_in_pic_flag is equal to 0 for a picture, and any slice of the picture has nal_unit_type in the range of IDR_W_RADL to CRA_NUT, inclusive, all other slices of the picture have the same value of nal_unit_type, and the picture is known to be an IRAP picture.

[00139] As can be seen from the above, for each IRAP picture, the PPS that the picture refers to should have its pps_mixed_nalu_types_in_pic_flag equal to 0. Similarly, in the current VVC specification, a GDR picture is a picture for which the nal_unit_type of all the NAL units associated with the picture shall be equal to GDR_NUT, as specified below:

gradual decoding refresh (GDR) picture: A picture for which each VCL NAL unit has nal_unit_type equal to GDR_NUT.

[00140] Given that all the NAL units of one GDR picture must have the same NAL unit type, the flag pps_mixed_nalu_types_in_pic_flag in the PPS that the GDR picture refers to cannot be equal to one.

[00141] On the other hand, as discussed in the introduction section, two flags, i.e., ph_gdr_or_irap_pic_flag and ph_gdr_pic_flag, are signaled in the picture header to indicate whether one picture is an IRAP picture or a GDR picture. When the flag ph_gdr_or_irap_pic_flag is equal to one and the flag ph_gdr_pic_flag is equal to zero, the current picture is an IRAP picture. When the flag ph_gdr_or_irap_pic_flag is equal to one and the flag ph_gdr_pic_flag is equal to one, the current picture is a GDR picture. According to the current VVC specification, the two flags are allowed to be signaled as one or zero without considering the value of the flag pps_mixed_nalu_types_in_pic_flag in the PPS. However, as mentioned earlier, a picture can be an IRAP picture or a GDR picture only if the NAL units in the picture have the same nal_unit_type, i.e., the corresponding pps_mixed_nalu_types_in_pic_flag has to be zero. Therefore, the existing IRAP/GDR signaling in the picture header is problematic when either or both of ph_gdr_or_irap_pic_flag and ph_gdr_pic_flag are equal to one (i.e., indicating that the current picture is either an IRAP picture or a GDR picture) and the corresponding pps_mixed_nalu_types_in_pic_flag is equal to one (i.e., indicating that there are multiple NAL unit types in the current picture).
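The conflict described here can be expressed as a short check. The Python sketch below is an informal illustration, not draft text; the function names `picture_kind` and `has_signaling_conflict` and the returned labels are hypothetical.

```python
from typing import Optional


def picture_kind(ph_gdr_or_irap_pic_flag: int, ph_gdr_pic_flag: int) -> Optional[str]:
    """Picture type indicated by the two picture header flags.

    Returns None when the flags do not identify the picture as GDR or IRAP
    (the picture is then not GDR and may or may not be IRAP).
    """
    if ph_gdr_or_irap_pic_flag == 1:
        return "GDR" if ph_gdr_pic_flag == 1 else "IRAP"
    return None


def has_signaling_conflict(ph_gdr_or_irap_pic_flag: int,
                           ph_gdr_pic_flag: int,
                           pps_mixed_nalu_types_in_pic_flag: int) -> bool:
    # Problematic case: the PH says IRAP or GDR while the PPS says the
    # picture carries more than one NAL unit type.
    indicates_irap_or_gdr = ph_gdr_or_irap_pic_flag == 1 or ph_gdr_pic_flag == 1
    return indicates_irap_or_gdr and pps_mixed_nalu_types_in_pic_flag == 1
```

With this reading, the combination (1, 0, 1) signals an IRAP picture in a PPS that allows mixed NAL unit types and is flagged as a conflict, while (1, 0, 0) is consistent.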

[00142] Proposed Methods

[00143] Several methods are proposed to address the issues described in the problem statement section. The methods are provided to simplify and/or further improve the existing design of the high-level syntax. It is noted that the proposed methods could be applied independently or in combination.

[00144] Since the feature controlled by the flag no_sbt_constraint_flag is only applicable when the slice is an inter slice, according to a method of the disclosure, it is proposed to add the constraint that the value of no_sbt_constraint_flag shall be equal to 1 when the slice is an intra slice. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00145] Since the features controlled by the flags no_act_constraint_flag and no_chroma_qp_offset_constraint_flag are only applicable when the chroma format is not monochrome, according to a method of the disclosure, it is proposed to add the constraint that the value of these two flags shall be equal to 1 when the chroma format is monochrome. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00146] Another example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00147] Since the feature controlled by the flag no_mixed_nalu_types_in_pic_constraint_flag is only applicable when the picture has at least two subpictures, according to a method of the disclosure, it is proposed to add the constraint that the value of no_mixed_nalu_types_in_pic_constraint_flag shall be equal to 1 when the picture has one subpicture. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00148] Since the feature controlled by the flag no_prof_constraint_flag is only applicable when the affine mode is enabled, according to a method of the disclosure, it is proposed to add the constraint that the value of no_prof_constraint_flag shall be equal to 1 when the affine mode is disabled. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00149] Since the feature controlled by the flag no_bdpcm_constraint_flag is only applicable when the transform skip mode is enabled, according to a method of the disclosure, it is proposed to add the constraint that the value of no_bdpcm_constraint_flag shall be equal to 1 when the transform skip mode is disabled. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00150] Several coding tools are missing from the general constraint information syntax. Flags for these coding tools should be added to provide the same general constraint controls as other flags.

[00151] In current VVC, sps_conformance_window_flag equal to 1 indicates that the conformance cropping window offset parameters follow next in the SPS. According to the disclosure, it is proposed to add a flag for the cropping function, no_conformance_window_constraint_flag, in the general constraint information syntax to provide the same general constraint controls as other flags. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00152] In current VVC, sps_weighted_pred_flag equal to 1 specifies that weighted prediction may be applied to P slices referring to the SPS. sps_weighted_pred_flag equal to 0 specifies that weighted prediction is not applied to P slices referring to the SPS. According to the disclosure, it is proposed to add the syntax element, no_weighted_pred_constraint_flag, in the general constraint information syntax to provide the same general constraint controls as other flags. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font. Additionally, given that weighted prediction is only applicable when inter coding tools are allowed, it is proposed to add one bitstream conformance constraint that the value of no_weighted_pred_constraint_flag should be equal to 1 when only intra coding is allowed for coding the sequence.

[00153] In current VVC, sps_weighted_bipred_flag equal to 1 specifies that explicit weighted prediction may be applied to B slices referring to the SPS. sps_weighted_bipred_flag equal to 0 specifies that explicit weighted prediction is not applied to B slices referring to the SPS. According to the disclosure, it is proposed to add the syntax element, no_weighted_bipred_constraint_flag, in the general constraint information syntax to provide the same general constraint controls as other flags. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font. Additionally, it is proposed to add one bitstream conformance constraint that the value of no_weighted_bipred_constraint_flag should be equal to one when only intra coding is allowed for coding the sequence.

[00154] In current VVC, sps_virtual_boundaries_enabled_flag equal to 1 specifies that disabling in-loop filtering across virtual boundaries is enabled and may be applied in the coded pictures in the CLVS. sps_virtual_boundaries_enabled_flag equal to 0 specifies that disabling in-loop filtering across virtual boundaries is disabled and not applied in the coded pictures in the CLVS. In-loop filtering operations include the deblocking filter, sample adaptive offset filter, and adaptive loop filter operations. According to the disclosure, it is proposed to add the syntax element, no_virtual_boundaries_constraint_flag, in the general constraint information syntax to provide the same general constraint controls as other flags. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00155] In the current VVC draft, there are two flags, namely no_ref_pic_resampling_constraint_flag and no_res_change_in_clvs_constraint_flag, signaled in the general constraint information syntax table. The first flag indicates whether the reference picture resampling functionality is allowed in the coded sequence, while the second flag indicates whether the resolutions of the pictures in the coded sequence are allowed to be adjusted. Given that the pictures' resolutions can differ from each other only if reference picture resampling is enabled, it is proposed to add one bitstream conformance constraint that the value of no_res_change_in_clvs_constraint_flag should be equal to one when the value of no_ref_pic_resampling_constraint_flag is equal to one, as specified below. Meanwhile, given that reference picture resampling is an inter coding functionality, it cannot be applied when only intra coding is allowed. Therefore, another bitstream conformance constraint is added to require that the value of no_ref_pic_resampling_constraint_flag be equal to one when only intra coding tools are allowed.

[00156] no_ref_pic_resampling_constraint_flag equal to 1 specifies that sps_ref_pic_resampling_enabled_flag shall be equal to 0. no_ref_pic_resampling_constraint_flag equal to 0 does not impose such a constraint. When intra_only_constraint_flag is equal to 1, the value of no_ref_pic_resampling_constraint_flag shall be equal to 1.

[00157] no_res_change_in_clvs_constraint_flag equal to 1 specifies that sps_res_change_in_clvs_allowed_flag shall be equal to 0. no_res_change_in_clvs_constraint_flag equal to 0 does not impose such a constraint. When the value of no_ref_pic_resampling_constraint_flag is equal to one, the value of no_res_change_in_clvs_constraint_flag should be equal to one.
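The two added conformance constraints form a simple chain: intra-only implies no resampling, and no resampling implies no resolution change. A minimal Python sketch of that chain follows; the function name `check_resampling_constraints` and the message strings are hypothetical and serve only as illustration.

```python
from typing import List


def check_resampling_constraints(intra_only_constraint_flag: int,
                                 no_ref_pic_resampling_constraint_flag: int,
                                 no_res_change_in_clvs_constraint_flag: int) -> List[str]:
    """Return the violated constraints (empty list when both hold)."""
    errors = []
    # Reference picture resampling is an inter coding functionality.
    if intra_only_constraint_flag == 1 and no_ref_pic_resampling_constraint_flag != 1:
        errors.append("no_ref_pic_resampling_constraint_flag shall be 1 "
                      "when intra_only_constraint_flag is 1")
    # Resolution can change within a CLVS only if resampling is enabled.
    if (no_ref_pic_resampling_constraint_flag == 1
            and no_res_change_in_clvs_constraint_flag != 1):
        errors.append("no_res_change_in_clvs_constraint_flag should be 1 "
                      "when no_ref_pic_resampling_constraint_flag is 1")
    return errors
```

For instance, the flag triple (1, 1, 1) satisfies both constraints, while (1, 0, 0) violates only the first and (0, 1, 0) violates only the second.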

[00158] In current VVC, pps_deblocking_filter_disabled_flag equal to 1 specifies that the operation of the deblocking filter is not applied for slices referring to the PPS for which one of the following two conditions is true: 1) ph_deblocking_filter_disabled_flag and sh_deblocking_filter_disabled_flag are not present and inferred to be equal to 1 and 2) ph_deblocking_filter_disabled_flag or sh_deblocking_filter_disabled_flag is present and equal to 1. Further, pps_deblocking_filter_disabled_flag equal to 1 also specifies that the operation of the deblocking filter is applied for slices referring to the PPS for which one of the following two conditions is true: 1) ph_deblocking_filter_disabled_flag and sh_deblocking_filter_disabled_flag are not present and inferred to be equal to 0 and 2) ph_deblocking_filter_disabled_flag or sh_deblocking_filter_disabled_flag is present and equal to 0. pps_deblocking_filter_disabled_flag equal to 0 specifies that the operation of the deblocking filter is applied for slices referring to the PPS for which one of the following two conditions is true: 1) ph_deblocking_filter_disabled_flag and sh_deblocking_filter_disabled_flag are not present and 2) ph_deblocking_filter_disabled_flag or sh_deblocking_filter_disabled_flag is present and equal to 0. Further, pps_deblocking_filter_disabled_flag equal to 0 also specifies that the operation of the deblocking filter is not applied for slices referring to the PPS for which ph_deblocking_filter_disabled_flag or sh_deblocking_filter_disabled_flag is present and equal to 1. According to the disclosure, it is proposed to add the syntax element, no_deblocking_filter_constraint_flag, in the general constraint information syntax to provide the same general constraint controls as other flags. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00159] Another example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00160] In current VVC, sps_num_subpics_minus1 plus 1 specifies the number of subpictures in each picture in the CLVS. The value of sps_num_subpics_minus1 shall be in the range of 0 to Ceil( sps_pic_width_max_in_luma_samples ÷ CtbSizeY ) * Ceil( sps_pic_height_max_in_luma_samples ÷ CtbSizeY ) - 1, inclusive. When not present, the value of sps_num_subpics_minus1 is inferred to be equal to 0. According to the disclosure, it is proposed to add one bitstream conformance constraint that the value of sps_num_subpics_minus1 should be equal to 0 when the value of one_subpic_per_pic_constraint_flag is equal to 1. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.
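Separately from the draft text referred to above, the range of sps_num_subpics_minus1 and the proposed constraint can be illustrated numerically. The Python sketch below is an assumption-laden illustration: the function names are hypothetical, and integer arithmetic is used in place of Ceil( ) to avoid floating-point rounding.

```python
def ceil_div(a: int, b: int) -> int:
    # Integer equivalent of Ceil( a / b ) for positive b.
    return (a + b - 1) // b


def max_sps_num_subpics_minus1(pic_width_max: int, pic_height_max: int,
                               ctb_size_y: int) -> int:
    # Upper bound: number of CTBs in the picture, minus 1.
    return ceil_div(pic_width_max, ctb_size_y) * ceil_div(pic_height_max, ctb_size_y) - 1


def subpic_count_is_valid(sps_num_subpics_minus1: int,
                          one_subpic_per_pic_constraint_flag: int,
                          pic_width_max: int, pic_height_max: int,
                          ctb_size_y: int) -> bool:
    upper = max_sps_num_subpics_minus1(pic_width_max, pic_height_max, ctb_size_y)
    if not 0 <= sps_num_subpics_minus1 <= upper:
        return False
    # Proposed constraint: a single subpicture when the constraint flag is 1.
    if one_subpic_per_pic_constraint_flag == 1 and sps_num_subpics_minus1 != 0:
        return False
    return True
```

As a worked case, a 1920x1080 picture with CtbSizeY equal to 128 spans 15 x 9 CTBs, so sps_num_subpics_minus1 may range from 0 to 134; with one_subpic_per_pic_constraint_flag equal to 1 only the value 0 remains valid.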

[00161] FIG. 7 shows a method for decoding a video signal in accordance with the present disclosure. The method may be, for example, applied to a decoder.

[00162] In step 710, the decoder may receive a bitstream associated with video data.

[00163] In step 712, the decoder may obtain a syntax element from the bitstream.

[00164] In step 714, the decoder may decode the video data based on the syntax element.

[00165] Since the features controlled by the flags mvd_l1_zero_flag, ph_disable_bdof_flag and ph_disable_dmvr_flag are only applicable when the slice is a bi-predictive slice (B-slice), according to a method of the disclosure, it is proposed to signal these flags only when the associated slices are B-slices. It is noted that when the reference picture lists are signaled in the PH (e.g., rpl_info_in_ph_flag = 1), all the slices of the coded picture use the same reference pictures signaled in the PH. Therefore, when the reference picture lists are signaled in the PH and the signaled reference picture lists indicate that the current picture is not bi-predictive, the flags mvd_l1_zero_flag, ph_disable_bdof_flag and ph_disable_dmvr_flag need not be signaled.

[00166] In the first embodiment, some conditions are added to the syntax elements sent in the picture header (PH) to prevent redundant signaling or undefined decoding behavior due to improper values sent for some of the syntax elements in the picture header. Some examples based on the embodiment are illustrated below, wherein the variable num_ref_entries[ i ][ RplsIdx[ i ] ] represents the number of reference pictures in list i.

[00167] if ( !rpl_info_in_ph_flag || ( rpl_info_in_ph_flag && num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] > 0 && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0 ) )
    mvd_l1_zero_flag

[00168] Alternatively, the conditions can be written in a more compact form which gives the same results. Because a bi-predictive slice (B-slice) or bi-predictive picture must have at least one list 1 reference picture, it suffices to check whether the current slice/picture has a list 1 reference picture. An example of the alternative condition checking is illustrated below:

if ( !rpl_info_in_ph_flag || ( rpl_info_in_ph_flag && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0 ) )
    mvd_l1_zero_flag
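The two presence conditions can be compared side by side. The following Python sketch is an informal reading of the conditions above; the parameters `num_list0_refs` and `num_list1_refs` are hypothetical shorthand for num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ], and the function names are likewise assumptions.

```python
def mvd_l1_zero_flag_present_full(rpl_info_in_ph_flag: int,
                                  num_list0_refs: int,
                                  num_list1_refs: int) -> bool:
    # First form: both reference picture lists must be non-empty
    # when the lists are signaled in the picture header.
    return (not rpl_info_in_ph_flag) or (num_list0_refs > 0 and num_list1_refs > 0)


def mvd_l1_zero_flag_present_compact(rpl_info_in_ph_flag: int,
                                     num_list1_refs: int) -> bool:
    # Compact form: only list 1 is examined.
    return (not rpl_info_in_ph_flag) or num_list1_refs > 0
```

The two forms agree whenever list 0 is non-empty; they differ only for a picture whose list 1 has entries while list 0 is empty, a case the compact form relies on not occurring for bi-predictive pictures.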

[00169] The semantics of mvd_l1_zero_flag is also modified to handle the case when it is not signaled.

[00170] mvd_l1_zero_flag equal to 1 indicates that the mvd_coding( x0, y0, 1 ) syntax structure is not parsed and MvdL1[ x0 ][ y0 ][ compIdx ] and MvdCpL1[ x0 ][ y0 ][ cpIdx ][ compIdx ] are set equal to 0 for compIdx = 0..1 and cpIdx = 0..2. mvd_l1_zero_flag equal to 0 indicates that the mvd_coding( x0, y0, 1 ) syntax structure is parsed. When not present, the value of mvd_l1_zero_flag is inferred to be 0.

[00171] Several examples of conditionally signalling the syntax element ph_disable_dmvr_flag are illustrated below:

[00172] if ( sps_dmvr_pic_present_flag && ( !rpl_info_in_ph_flag || ( rpl_info_in_ph_flag && num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] > 0 && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0 ) ) )
    ph_disable_dmvr_flag

[00173] Similarly, an example of the alternative condition checking is illustrated below:

[00174] if ( sps_dmvr_pic_present_flag && ( !rpl_info_in_ph_flag || ( rpl_info_in_ph_flag && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0 ) ) )
    ph_disable_dmvr_flag

[00175] The semantics of ph_disable_dmvr_flag is also modified to handle the case when it is not signaled.

[00176] ph_disable_dmvr_flag equal to 1 specifies that decoder motion vector refinement based inter bi-prediction is disabled in the slices associated with the PH. ph_disable_dmvr_flag equal to 0 specifies that decoder motion vector refinement based inter bi-prediction may or may not be enabled in the slices associated with the PH.

[00177] When ph_disable_dmvr_flag is not present, the following applies:

- If sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 0.

- Else if sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 1, the value of ph_disable_dmvr_flag is inferred to be equal to 1.

- Otherwise (sps_dmvr_enabled_flag is equal to 0), the value of ph_disable_dmvr_flag is inferred to be equal to 1.

[00178] An alternative way to derive the value of ph_disable_dmvr_flag when it is not present is illustrated below:

When all the conditions are considered, the value of ph_disable_dmvr_flag, whether explicitly signalled or implicitly derived, is obtained as follows:

- If sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 0.

- Else if sps_dmvr_enabled_flag is equal to 0 and sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 1.

- Else if sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 0, the value of ph_disable_dmvr_flag is equal to X. (X is explicitly signalled)

- Else if sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0, the value of ph_disable_dmvr_flag is equal to X. (X is explicitly signalled)

- Else (sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0), the value of ph_disable_dmvr_flag is inferred to be equal to 1.

[00179] Since the syntax element ph_disable_dmvr_flag is explicitly signalled under the third and the fourth conditions, they can be removed from the derivation of ph_disable_dmvr_flag when ph_disable_dmvr_flag is not present:

[00180] When ph_disable_dmvr_flag is not present, the following applies:

- If sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 0.

- Else if sps_dmvr_enabled_flag is equal to 0 and sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 1.

- Else (sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0), the value of ph_disable_dmvr_flag is inferred to be equal to 1.

[00181] The conditions can be editorially simplified as below:

[00182] When ph_disable_dmvr_flag is not present, the following applies:

- If sps_dmvr_enabled_flag is equal to 1 and sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 0.

- Otherwise (sps_dmvr_enabled_flag is equal to 0 or sps_dmvr_pic_present_flag is equal to 1), the value of ph_disable_dmvr_flag is inferred to be equal to 1.
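That the simplified two-branch rule gives the same inferred value as the three-branch rule can be confirmed by enumerating the flag combinations. The Python sketch below does this; the function names are hypothetical, and both functions model only the case where ph_disable_dmvr_flag is not present.

```python
from itertools import product


def infer_three_branch(sps_dmvr_enabled_flag: int,
                       sps_dmvr_pic_present_flag: int) -> int:
    # Three-branch inference rule for the not-present case.
    if sps_dmvr_enabled_flag == 1 and sps_dmvr_pic_present_flag == 0:
        return 0
    elif sps_dmvr_enabled_flag == 0 and sps_dmvr_pic_present_flag == 0:
        return 1
    else:
        # Remaining case: flag could have been signaled but was skipped
        # because the picture has no list 1 reference picture.
        return 1


def infer_simplified(sps_dmvr_enabled_flag: int,
                     sps_dmvr_pic_present_flag: int) -> int:
    # Editorially simplified two-branch rule.
    if sps_dmvr_enabled_flag == 1 and sps_dmvr_pic_present_flag == 0:
        return 0
    return 1


# Exhaustive comparison over all four flag combinations.
RULES_AGREE = all(
    infer_three_branch(e, p) == infer_simplified(e, p)
    for e, p in product((0, 1), repeat=2)
)
```

Because the second and third branches of the longer rule both yield 1, merging them into a single "otherwise" branch does not change any outcome.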

[00183] Another alternative way to derive the value of ph_disable_dmvr_flag when it is not present is illustrated below:

[00184] When ph_disable_dmvr_flag is not present, the following applies:

- If sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 1 - sps_dmvr_enabled_flag.

- Else if sps_dmvr_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 1 - sps_dmvr_enabled_flag.

- Else if sps_dmvr_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0, the value of ph_disable_dmvr_flag is inferred to be equal to 1 - sps_dmvr_enabled_flag.

- Else (sps_dmvr_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0), the value of ph_disable_dmvr_flag is inferred to be equal to 1.

[00185] Since the syntax element ph_disable_dmvr_flag is explicitly signalled under the second and the third conditions, they can be removed from the derivation of ph_disable_dmvr_flag when it is not present:

[00186] When ph_disable_dmvr_flag is not present, the following applies:

- If sps_dmvr_pic_present_flag is equal to 0, the value of ph_disable_dmvr_flag is inferred to be equal to 1 - sps_dmvr_enabled_flag.

- Otherwise, the value of ph_disable_dmvr_flag is inferred to be equal to 1.
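Putting the compact presence condition together with this last inference rule gives a complete "parse or infer" routine. The Python sketch below is a hedged illustration: `signaled_bit` is a hypothetical stand-in for the bit read from the picture header when the flag is present, `num_list1_refs` abbreviates num_ref_entries[ 1 ][ RplsIdx[ 1 ] ], and the function names are not taken from any specification.

```python
def dmvr_flag_is_present(sps_dmvr_pic_present_flag: int,
                         rpl_info_in_ph_flag: int,
                         num_list1_refs: int) -> bool:
    # Compact presence condition: signaled only if the SPS allows it and
    # the picture may contain B-slices.
    return bool(sps_dmvr_pic_present_flag) and (
        not rpl_info_in_ph_flag or num_list1_refs > 0
    )


def ph_disable_dmvr_flag_value(signaled_bit: int,
                               sps_dmvr_enabled_flag: int,
                               sps_dmvr_pic_present_flag: int,
                               rpl_info_in_ph_flag: int,
                               num_list1_refs: int) -> int:
    if dmvr_flag_is_present(sps_dmvr_pic_present_flag,
                            rpl_info_in_ph_flag, num_list1_refs):
        return signaled_bit  # explicitly signalled value
    if sps_dmvr_pic_present_flag == 0:
        # DMVR follows the sequence-level switch.
        return 1 - sps_dmvr_enabled_flag
    # Picture-level control exists but the picture cannot be bi-predictive.
    return 1
```

For example, with sps_dmvr_pic_present_flag equal to 1, rpl_info_in_ph_flag equal to 1 and no list 1 reference pictures, the flag is not parsed and DMVR is inferred to be disabled for the picture.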

[00187] Several examples of conditionally signalling the syntax element ph_disable_bdof_flag are illustrated below:

[00188] if ( sps_bdof_pic_present_flag && ( !rpl_info_in_ph_flag || ( rpl_info_in_ph_flag && num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] > 1 && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 1 ) ) )
    ph_disable_bdof_flag

[00189] Similarly, an example of the alternative condition checking is illustrated below:

[00190] if ( sps_bdof_pic_present_flag && ( !rpl_info_in_ph_flag || ( rpl_info_in_ph_flag && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0 ) ) )
    ph_disable_bdof_flag

[00191] The semantics of ph_disable_bdof_flag is also modified to handle the case when it is not signaled.

[00192] ph_disable_bdof_flag equal to 1 specifies that bi-directional optical flow inter prediction based inter bi-prediction is disabled in the slices associated with the PH. ph_disable_bdof_flag equal to 0 specifies that bi-directional optical flow inter prediction based inter bi-prediction may or may not be enabled in the slices associated with the PH.

[00193] When ph_disable_bdof_flag is not present, the following applies:

- If sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 0.

- Else if sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 1, the value of ph_disable_bdof_flag is inferred to be equal to 1.

- Otherwise (sps_bdof_enabled_flag is equal to 0), the value of ph_disable_bdof_flag is inferred to be equal to 1.

[00194] An alternative way to derive the value of ph_disable_bdof_flag when it is not present is illustrated below:

[00195] When all the conditions are considered, the value of ph_disable_bdof_flag, whether explicitly signalled or implicitly derived, is obtained as follows:

- If sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 0.

- Else if sps_bdof_enabled_flag is equal to 0 and sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 1.

- Else if sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 0, the value of ph_disable_bdof_flag is equal to X. (X is explicitly signalled)

- Else if sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0, the value of ph_disable_bdof_flag is equal to X. (X is explicitly signalled)

- Else (sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0), the value of ph_disable_bdof_flag is inferred to be equal to 1.

[00196] Since the syntax element ph_disable_bdof_flag is explicitly signalled under the third and the fourth conditions, they can be removed from the derivation of ph_disable_bdof_flag when ph_disable_bdof_flag is not present:

[00197] When ph_disable_bdof_flag is not present, the following applies:

- If sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 0.

- Else if sps_bdof_enabled_flag is equal to 0 and sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 1.

- Else (sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0), the value of ph_disable_bdof_flag is inferred to be equal to 1.

[00198] The conditions can be editorially simplified as below:

[00199] When ph_disable_bdof_flag is not present, the following applies:

- If sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 0.

- Otherwise (sps_bdof_enabled_flag is equal to 0 or sps_bdof_pic_present_flag is equal to 1), the value of ph_disable_bdof_flag is inferred to be equal to 1.
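As an informal check of the editorial simplification, the following sketch compares the three-branch inference with the two-branch form over every combination of the two SPS flags. The function names are hypothetical, and the sketch assumes that the final "Else" branch of the three-branch form absorbs every remaining combination.

```python
from itertools import product

def infer_three_branch(sps_bdof_enabled_flag, sps_bdof_pic_present_flag):
    """Inference with the explicitly signalled cases already removed."""
    if sps_bdof_enabled_flag == 1 and sps_bdof_pic_present_flag == 0:
        return 0
    if sps_bdof_enabled_flag == 0 and sps_bdof_pic_present_flag == 0:
        return 1
    # Remaining case: the flag is absent although a picture-level flag
    # was possible, i.e. list 1 is empty in the picture header.
    return 1

def infer_simplified(sps_bdof_enabled_flag, sps_bdof_pic_present_flag):
    """Editorially simplified two-branch inference."""
    if sps_bdof_enabled_flag == 1 and sps_bdof_pic_present_flag == 0:
        return 0
    return 1

# Both forms agree for every combination of the two flags.
agree = all(infer_three_branch(e, p) == infer_simplified(e, p)
            for e, p in product((0, 1), repeat=2))
```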

[00200] Another alternative way to derive the value of ph_disable_bdof_flag when it is not present is illustrated below:

[00201] When ph_disable_bdof_flag is not present, the following applies:

- If sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 1 - sps_bdof_enabled_flag.

- Else if sps_bdof_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 1 - sps_bdof_enabled_flag.

- Else if sps_bdof_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0, the value of ph_disable_bdof_flag is inferred to be equal to 1 - sps_bdof_enabled_flag.

- Else (sps_bdof_pic_present_flag is equal to 1 and rpl_info_in_ph_flag is equal to 1 and num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0), the value of ph_disable_bdof_flag is inferred to be equal to 1.

[00202] Since the syntax element ph_disable_bdof_flag is explicitly signalled under the second and the third conditions, they can be removed from the derivation of ph_disable_bdof_flag when it is not present:

[00203] When ph_disable_bdof_flag is not present, the following applies:

- If sps_bdof_pic_present_flag is equal to 0, the value of ph_disable_bdof_flag is inferred to be equal to 1 - sps_bdof_enabled_flag.

- Otherwise, the value of ph_disable_bdof_flag is inferred to be equal to 1.
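The arithmetic form of the inference can be tabulated as below. This is only a sketch; the function name and the `table` dictionary are illustrative and are not part of any specification text.

```python
def infer_ph_disable_bdof_flag(sps_bdof_enabled_flag, sps_bdof_pic_present_flag):
    """Inferred value of ph_disable_bdof_flag when the flag is absent."""
    if sps_bdof_pic_present_flag == 0:
        # No picture-level control: the picture follows the SPS switch.
        return 1 - sps_bdof_enabled_flag
    # Picture-level control was possible but the flag was not sent.
    return 1

# Keys are (sps_bdof_enabled_flag, sps_bdof_pic_present_flag).
table = {(e, p): infer_ph_disable_bdof_flag(e, p)
         for e in (0, 1) for p in (0, 1)}
```

The only combination that yields 0 (BDOF not disabled) is sps_bdof_enabled_flag equal to 1 with sps_bdof_pic_present_flag equal to 0.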

[00204] Moreover, the signalling conditions for syntax elements ph_collocated_from_l0_flag and pred_weight_table( ) are modified because the two types of syntax elements are only applicable when the associated slices are B-slices. Examples of the modified syntax elements signaling are illustrated below.

[00205] The semantics of ph_collocated_from_l0_flag is also modified to handle the case when it is not signaled.

[00206] ph_collocated_from_l0_flag equal to 1 specifies that the collocated picture used for temporal motion vector prediction is derived from reference picture list 0. ph_collocated_from_l0_flag equal to 0 specifies that the collocated picture used for temporal motion vector prediction is derived from reference picture list 1.

[00207] When ph_collocated_from_l0_flag is not present, the following applies:

- If num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] is larger than 1, the value of ph_collocated_from_l0_flag is inferred to be 1.

- Otherwise (num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] is larger than 1), the value of ph_collocated_from_l0_flag is inferred to be 0.
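A minimal sketch of how a decoder might resolve the flag follows. The function name, the use of `None` to mark an absent flag, and the parameter `num_ref_entries_l0` (standing for num_ref_entries[ 0 ][ RplsIdx[ 0 ] ]) are assumptions made for illustration.

```python
from typing import Optional

def resolve_ph_collocated_from_l0_flag(parsed_value: Optional[int],
                                       num_ref_entries_l0: int) -> int:
    """Return the parsed flag, or the inferred value when it is absent."""
    if parsed_value is not None:
        return parsed_value
    # Absent flag: point at list 0 when it holds more than one entry,
    # otherwise at list 1.
    return 1 if num_ref_entries_l0 > 1 else 0
```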

[00208] Similarly, an example of the alternative condition checking is illustrated below:

if( pps_weighted_bipred_flag && wp_info_in_ph_flag && ( !rpl_info_in_ph_flag || ( rpl_info_in_ph_flag && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] > 0 ) ) )
    num_l1_weights

[00209] The semantics of the syntax elements in pred_weight_table( ) are also modified to handle the case when they are not signaled.

[00210] num_l1_weights specifies the number of weights signalled for entries in reference picture list 1 when pps_weighted_bipred_flag and wp_info_in_ph_flag are both equal to 1. The value of num_l1_weights shall be in the range of 0 to Min( 15, num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] ), inclusive.

[00211] The variable NumWeightsL1 is derived as follows:

if( !pps_weighted_bipred_flag )
    NumWeightsL1 = 0
else if( wp_info_in_ph_flag && rpl_info_in_ph_flag && ( num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] == 0 || num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0 ) )
    NumWeightsL1 = 0
else if( wp_info_in_ph_flag )    (5)
    NumWeightsL1 = num_l1_weights
else
    NumWeightsL1 = NumRefIdxActive[ 1 ]
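The derivation of equation (5) can be sketched as a plain function. The sketch reads both list-size tests as comparisons with zero, which is an assumption about the intended condition; the parameter names `num_ref_entries_l0`, `num_ref_entries_l1` and `num_ref_idx_active_l1` are shorthand introduced here for num_ref_entries[ 0 ][ RplsIdx[ 0 ] ], num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] and NumRefIdxActive[ 1 ].

```python
def num_weights_l1(pps_weighted_bipred_flag, wp_info_in_ph_flag,
                   rpl_info_in_ph_flag, num_ref_entries_l0,
                   num_ref_entries_l1, num_l1_weights,
                   num_ref_idx_active_l1):
    """Sketch of the NumWeightsL1 derivation of equation (5)."""
    if not pps_weighted_bipred_flag:
        return 0
    if (wp_info_in_ph_flag and rpl_info_in_ph_flag
            and (num_ref_entries_l0 == 0 or num_ref_entries_l1 == 0)):
        # The picture cannot hold bi-predicted slices, so no list 1
        # weights exist (assumed reading of the condition).
        return 0
    if wp_info_in_ph_flag:
        return num_l1_weights
    return num_ref_idx_active_l1
```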

[00212] In the semantics of the syntax elements in pred_weight_table( ), an alternative way to derive the value of num_l1_weights when it is not present is illustrated below:

[00213] num_l1_weights specifies the number of weights signalled for entries in reference picture list 1 when pps_weighted_bipred_flag and wp_info_in_ph_flag are both equal to 1. The value of num_l1_weights shall be in the range of 0 to Min( 15, num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] ), inclusive. When not present, the value of num_l1_weights is inferred to be 0.

[00214] The variable NumWeightsL1 is derived as follows:

if( !pps_weighted_bipred_flag )
    NumWeightsL1 = 0
else if( wp_info_in_ph_flag )    (6)
    NumWeightsL1 = num_l1_weights
else
    NumWeightsL1 = NumRefIdxActive[ 1 ]

[00215] In the semantics of the syntax elements in pred_weight_table( ), another alternative way to derive the value of num_l1_weights when it is not present is illustrated below:

if( !pps_weighted_bipred_flag || ( wp_info_in_ph_flag && num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] == 0 ) )
    NumWeightsL1 = 0
else if( wp_info_in_ph_flag )
    NumWeightsL1 = num_l1_weights
else
    NumWeightsL1 = NumRefIdxActive[ 1 ]

[00216] Conceptually, it is proposed to add a signaling condition that checks whether the current picture has reference pictures from both the list0 and list1 reference picture lists for any syntax element that is only applicable in B slices, to avoid signaling redundant bits. The checking condition is not limited to the above-mentioned method of checking the size of both reference picture lists (e.g., the list0/list1 reference picture lists), and the checking condition may be any other method that indicates whether the current picture has reference pictures from both the list0 and list1 reference picture lists. For example, a flag can be signaled to indicate whether the current picture has both list0 and list1 reference pictures.

[00217] When the syntax elements are not signaled and the reference picture list information is signaled in the picture header (PH), the values of the syntax elements are derived from whether the current picture has both list0 and list1 reference pictures or has only list0 or only list1 reference pictures. In one example, when ph_collocated_from_l0_flag is not signaled, its value is inferred to indicate the only reference picture list that the current picture has. In another example, when sps_bdof_enabled_flag is equal to 1 and sps_bdof_pic_present_flag is equal to 1 but ph_disable_bdof_flag is not signalled, it implies that either num_ref_entries[ 0 ][ RplsIdx[ 0 ] ] is equal to 0 or num_ref_entries[ 1 ][ RplsIdx[ 1 ] ] is equal to 0 according to the proposed signalling condition on ph_disable_bdof_flag. Therefore, under this condition, ph_disable_bdof_flag is not signalled and is inferred to be 1. In current VVC, not only the resolution of the collocated picture but also the offsets applied to the picture size for scaling ratio calculation may affect the enabling of TMVP. In current VVC, however, the offsets are not considered in the bitstream conformance of ph_temporal_mvp_enabled_flag.

[00218] In the second embodiment, it is proposed to add a bitstream conformance constraint to the current VVC requiring that the value of ph_temporal_mvp_enabled_flag shall be dependent on the offsets that are applied to the picture size for scaling ratio calculation, as illustrated below:

[00219] When no reference picture in the DPB has the same spatial resolution and the same offsets that are applied to the picture size for scaling ratio calculation as the current picture, the value of ph_temporal_mvp_enabled_flag shall be equal to 0.
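A conformance checker for this constraint might be sketched as follows. The `PicGeometry` record, its field names, and the four-value layout of the offsets are hypothetical and merely stand in for the picture size and the offsets used for scaling ratio calculation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

@dataclass(frozen=True)
class PicGeometry:
    width: int    # luma width of the picture
    height: int   # luma height of the picture
    # Offsets applied to the picture size for scaling ratio calculation
    # (left, right, top, bottom); hypothetical layout.
    scaling_offsets: Tuple[int, int, int, int]

def temporal_mvp_flag_conforms(ph_temporal_mvp_enabled_flag: int,
                               current: PicGeometry,
                               dpb_refs: Iterable[PicGeometry]) -> bool:
    """True when the flag value respects the proposed constraint."""
    # Frozen dataclasses compare field by field, so equality means
    # same resolution and same offsets.
    has_match = any(ref == current for ref in dpb_refs)
    return ph_temporal_mvp_enabled_flag == 0 or has_match
```

With this sketch, a reference picture of identical resolution but different offsets does not permit ph_temporal_mvp_enabled_flag equal to 1.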

[00220] The above sentences can also be written in another way as below:

[00221] When no reference picture in the DPB has the associated variable value RprConstraintsActive[ i ][ j ] equal to 0, the value of ph_temporal_mvp_enabled_flag shall be equal to 0.

[00222] In current VVC, there is a requirement of bitstream conformance that the picture referred to by slice_collocated_ref_idx shall be the same for all slices of a coded picture. However, when the coded picture has multiple slices and no common reference picture exists among all these slices, this bitstream conformance requirement cannot be met.

[00223] In the third embodiment of the disclosure, the requirement of bitstream conformance on ph_temporal_mvp_enabled_flag is modified to consider whether there is a common reference picture existing among all the slices in the current picture. Based on the embodiment, several exemplary modifications to the VVC specification are illustrated below.

[00224] ph_temporal_mvp_enabled_flag specifies whether temporal motion vector predictors can be used for inter prediction for slices associated with the PH. If ph_temporal_mvp_enabled_flag is equal to 0, the syntax elements of the slices associated with the PH shall be constrained such that no temporal motion vector predictor is used in decoding of the slices. Otherwise (ph_temporal_mvp_enabled_flag is equal to 1), temporal motion vector predictors may be used in decoding of the slices associated with the PH. When not present, the value of ph_temporal_mvp_enabled_flag is inferred to be equal to 0. When no reference picture in the DPB has the same spatial resolution as the current picture, the value of ph_temporal_mvp_enabled_flag shall be equal to 0. When no common reference picture exists in all the slices associated with the PH, the value of ph_temporal_mvp_enabled_flag shall be equal to 0.

[00225] ph_temporal_mvp_enabled_flag specifies whether temporal motion vector predictors can be used for inter prediction for slices associated with the PH. If ph_temporal_mvp_enabled_flag is equal to 0, the syntax elements of the slices associated with the PH shall be constrained such that no temporal motion vector predictor is used in decoding of the slices. Otherwise (ph_temporal_mvp_enabled_flag is equal to 1), temporal motion vector predictors may be used in decoding of the slices associated with the PH. When not present, the value of ph_temporal_mvp_enabled_flag is inferred to be equal to 0. When no reference picture in the DPB has the same spatial resolution as the current picture, the value of ph_temporal_mvp_enabled_flag shall be equal to 0. When no common reference picture exists in all the inter slices associated with the PH, the value of ph_temporal_mvp_enabled_flag shall be equal to 0.

[00226] ph_temporal_mvp_enabled_flag specifies whether temporal motion vector predictors can be used for inter prediction for slices associated with the PH. If ph_temporal_mvp_enabled_flag is equal to 0, the syntax elements of the slices associated with the PH shall be constrained such that no temporal motion vector predictor is used in decoding of the slices. Otherwise (ph_temporal_mvp_enabled_flag is equal to 1), temporal motion vector predictors may be used in decoding of the slices associated with the PH. When not present, the value of ph_temporal_mvp_enabled_flag is inferred to be equal to 0. When no reference picture in the DPB has the same spatial resolution as the current picture, the value of ph_temporal_mvp_enabled_flag shall be equal to 0. When no common reference picture exists in all the non-intra slices associated with the PH, the value of ph_temporal_mvp_enabled_flag shall be equal to 0.
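The common-reference-picture test for the non-intra slices can be sketched as a set intersection. The slice representation (a slice type string paired with a set of reference picture identifiers, here picture order counts) is a hypothetical data layout chosen for illustration, and treating a picture without any non-intra slice as having no common reference picture is an assumption of the sketch.

```python
def has_common_reference(slices):
    """slices: iterable of (slice_type, reference_picture_ids) pairs.

    Returns True when at least one reference picture is shared by
    every non-intra slice of the picture.
    """
    common = None
    for slice_type, refs in slices:
        if slice_type == "I":
            continue  # intra slices do not use reference pictures
        common = set(refs) if common is None else common & set(refs)
    # No non-intra slice at all is treated as "no common reference".
    return bool(common)

def temporal_mvp_flag_allowed(ph_temporal_mvp_enabled_flag, slices):
    """True when the flag value respects the common-reference rule."""
    return ph_temporal_mvp_enabled_flag == 0 or has_common_reference(slices)
```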

[00227] In the fourth embodiment, the bitstream conformance on slice_collocated_ref_idx is simplified as below:

[00228] It is a requirement of bitstream conformance that the values of pic_width_in_luma_samples and pic_height_in_luma_samples of the reference picture referred to by slice_collocated_ref_idx shall be equal to the values of pic_width_in_luma_samples and pic_height_in_luma_samples, respectively, of the current picture, and RprConstraintsActive[ slice_collocated_from_l0_flag ? 0 : 1 ][ slice_collocated_ref_idx ] shall be equal to 0.
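The simplified requirement can be sketched as a single predicate. The tuple representation of picture sizes and the nested-list layout of `rpr_constraints_active` are assumptions made for illustration.

```python
def collocated_ref_conforms(cur_size, ref_size, rpr_constraints_active,
                            slice_collocated_from_l0_flag,
                            slice_collocated_ref_idx):
    """cur_size / ref_size: (width, height) in luma samples.

    rpr_constraints_active[list_idx][ref_idx] mirrors the variable
    RprConstraintsActive[ i ][ j ].
    """
    # The ternary of the requirement: list 0 when the flag is 1, else list 1.
    list_idx = 0 if slice_collocated_from_l0_flag else 1
    same_size = ref_size == cur_size
    no_rpr = rpr_constraints_active[list_idx][slice_collocated_ref_idx] == 0
    return same_size and no_rpr
```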

[00229] As discussed in the section “problem statement”, when the value of pps_mixed_nalu_types_in_pic_flag is equal to one, each picture referring to the PPS has more than one NAL unit and those NAL units do not have the same nal_unit_type. On the other hand, in the current picture header signaling, the values of ph_gdr_or_irap_pic_flag and ph_gdr_pic_flag are allowed to be signaled as one even when the value of the flag pps_mixed_nalu_types_in_pic_flag in the associated PPS is equal to one. Because the NAL units in one IRAP picture or one GDR picture must have the same nal_unit_type, such a signaling scenario should not be allowed. In the following, different methods are proposed to resolve this problem.

[00230] Method one: in the first method, it is proposed to condition the presence of the flag ph_gdr_or_irap_pic_flag in the picture header on the value of pps_mixed_nalu_types_in_pic_flag in the PPS. Specifically, the ph_gdr_or_irap_pic_flag is only signaled when the value of pps_mixed_nalu_types_in_pic_flag is equal to zero. Otherwise, when the flag pps_mixed_nalu_types_in_pic_flag is equal to one, the flag ph_gdr_or_irap_pic_flag is not signaled and inferred to be zero. The corresponding picture header table is illustrated after the proposed modification is applied.
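A parsing routine following method one might look like the sketch below. The bit source is modelled as a plain Python iterator over 0/1 values, which is a simplification assumed here in place of a real bitstream reader, and the function name is hypothetical.

```python
def read_ph_gdr_or_irap_pic_flag(bits, pps_mixed_nalu_types_in_pic_flag):
    """Return (value, was_signalled) for ph_gdr_or_irap_pic_flag.

    bits: iterator yielding the next picture header bits as 0 or 1.
    """
    if pps_mixed_nalu_types_in_pic_flag == 0:
        # The flag is present: consume one bit from the picture header.
        return next(bits), True
    # Mixed NAL unit types: the flag is absent and inferred to be zero,
    # and no bit is consumed.
    return 0, False
```

When pps_mixed_nalu_types_in_pic_flag is equal to one, the reader position is left unchanged, so the following syntax element starts at the same bit.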

[00231] Method two: in the second method, one bitstream conformance constraint is proposed to require that the corresponding value of the signaled flag ph_gdr_or_irap_pic_flag shall be equal to zero when the flag pps_mixed_nalu_types_in_pic_flag is equal to one. Specifically, the proposed bitstream conformance constraint can be specified as below. ph_gdr_or_irap_pic_flag equal to 1 specifies that the current picture is a GDR or IRAP picture. ph_gdr_or_irap_pic_flag equal to 0 specifies that the current picture is not a GDR picture and may or may not be an IRAP picture. When the value of pps_mixed_nalu_types_in_pic_flag is equal to one, the value of ph_gdr_or_irap_pic_flag shall be equal to zero.

[00232] Method three: in this method, it is proposed to move the signaling of pps_mixed_nalu_types_in_pic_flag from PPS level to picture level, slice level or other coding level. For instance, assuming the flag is moved to the picture header, the flag can be renamed as ph_mixed_nalu_types_in_pic_flag. Additionally, it is proposed to use the flag to condition the signaling of ph_gdr_or_irap_pic_flag. Specifically, the ph_gdr_or_irap_pic_flag is only signaled when the flag ph_mixed_nalu_types_in_pic_flag is equal to zero. Otherwise, when the flag ph_mixed_nalu_types_in_pic_flag is one, the flag ph_gdr_or_irap_pic_flag is not signaled and inferred to be zero. In another embodiment, it is proposed to add a bitstream conformance constraint that the value of ph_gdr_or_irap_pic_flag should be equal to zero when the value of ph_mixed_nalu_types_in_pic_flag is equal to one. In yet another embodiment, it is proposed to use ph_gdr_or_irap_pic_flag to condition the presence of ph_mixed_nalu_types_in_pic_flag. Specifically, the flag ph_mixed_nalu_types_in_pic_flag is only signaled when the value of ph_gdr_or_irap_pic_flag is equal to zero. Otherwise, when the value of ph_gdr_or_irap_pic_flag is equal to one, the flag ph_mixed_nalu_types_in_pic_flag is not signaled and always inferred to be zero.

[00233] Method four: it is proposed to apply the value of pps_mixed_nalu_types_in_pic_flag only to the pictures that are neither IRAP nor GDR pictures. Specifically, by such method, the semantics of pps_mixed_nalu_types_in_pic_flag should be modified as follows:

[00234] pps_mixed_nalu_types_in_pic_flag equal to 1 specifies that each picture that is neither IRAP nor GDR picture referring to the PPS has more than one VCL NAL unit and the VCL NAL units do not have the same value of nal_unit_type. pps_mixed_nalu_types_in_pic_flag equal to 0 specifies that each picture that is neither IRAP nor GDR picture referring to the PPS has one or more VCL NAL units and the VCL NAL units of each picture referring to the PPS have the same value of nal_unit_type.

[00235] On the other hand, in the current VVC specification, it is required that all the NAL units in one GDR picture have the same nal_unit_type, which is equal to GDR_NUT. The following bitstream conformance constraint is applied to the definition of the GDR picture such that the corresponding value of pps_mixed_nalu_types_in_pic_flag should be equal to zero.

[00236] gradual decoding refresh (GDR) picture: A picture for which each VCL NAL unit has nal_unit_type equal to GDR_NUT. The value of pps_mixed_nalu_types_in_pic_flag for a GDR picture is equal to 0. When pps_mixed_nalu_types_in_pic_flag is equal to 0 for a picture, and any slice of the picture has nal_unit_type equal to GDR_NUT, all other slices of the picture have the same value of nal_unit_type, and the picture is known to be a GDR picture after receiving the first slice of the picture.
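The GDR picture definition and the accompanying constraint can be sketched as two small checks. NAL unit types are represented by their symbolic names as strings, and both function names are hypothetical; the sketch does not model any other NAL unit type rule.

```python
def is_gdr_picture(vcl_nal_unit_types):
    """True when every VCL NAL unit of the picture is of type GDR_NUT."""
    types = list(vcl_nal_unit_types)
    return len(types) > 0 and all(t == "GDR_NUT" for t in types)

def gdr_constraint_holds(vcl_nal_unit_types, pps_mixed_nalu_types_in_pic_flag):
    """Check the proposed constraint tying GDR pictures to the PPS flag."""
    types = list(vcl_nal_unit_types)
    if is_gdr_picture(types):
        # A GDR picture requires the mixed-type flag to be 0.
        return pps_mixed_nalu_types_in_pic_flag == 0
    if pps_mixed_nalu_types_in_pic_flag == 0 and "GDR_NUT" in types:
        # With the flag equal to 0, one GDR slice implies all slices are GDR.
        return False
    return True
```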

[00237] In another embodiment, it is proposed to remove the GDR NAL unit type from the NAL unit header and to use only the syntax elements ph_gdr_or_irap_pic_flag and ph_gdr_pic_flag to indicate whether the current picture is a GDR picture or not.

[00238] Different from the above methods where the constraint of pps_mixed_nalu_types_in_pic_flag is applied to both IRAP and GDR pictures, in the following, three methods are proposed where the constraint is applied to IRAP picture but not to GDR picture.

[00239] Method One: in this method, it is proposed to condition the presence of the flag ph_gdr_pic_flag in the picture header on the value of pps_mixed_nalu_types_in_pic_flag in the PPS. Specifically, the flag ph_gdr_pic_flag is only signaled when the value of pps_mixed_nalu_types_in_pic_flag is equal to zero. Otherwise, when the flag pps_mixed_nalu_types_in_pic_flag is equal to one, the flag ph_gdr_pic_flag is not signaled and inferred to be zero, i.e., the current picture cannot be one GDR picture. The corresponding picture header table is modified as follows after the proposed signaling condition is applied.

[00240] ph_gdr_pic_flag equal to 1 specifies the picture associated with the PH is a GDR picture. ph_gdr_pic_flag equal to 0 specifies that the picture associated with the PH is not a GDR picture. When not present, the value of ph_gdr_pic_flag is inferred to be equal to 0 when pps_mixed_nalu_types_in_pic_flag is 0 and to be equal to the value of ph_gdr_or_irap_pic_flag when pps_mixed_nalu_types_in_pic_flag is 1. When sps_gdr_enabled_flag is equal to 0, the value of ph_gdr_pic_flag shall be equal to 0.

[00241] Method Two: it is proposed to introduce one bitstream conformance constraint that ph_gdr_pic_flag should be equal to 1 when ph_gdr_or_irap_pic_flag is 1 and pps_mixed_nalu_types_in_pic_flag is 1, as specified below.

[00242] ph_gdr_pic_flag equal to 1 specifies the picture associated with the PH is a GDR picture. ph_gdr_pic_flag equal to 0 specifies that the picture associated with the PH is not a GDR picture. When not present, the value of ph_gdr_pic_flag is inferred to be equal to 0. When sps_gdr_enabled_flag is equal to 0, the value of ph_gdr_pic_flag shall be equal to 0. The value of ph_gdr_pic_flag should be equal to 1 when ph_gdr_or_irap_pic_flag is equal to 1 and pps_mixed_nalu_types_in_pic_flag is equal to 1.

[00243] NOTE 1 - When ph_gdr_or_irap_pic_flag is equal to 1 and ph_gdr_pic_flag is equal to 0, the picture associated with the PH is an IRAP picture.

[00244] Method three: it is proposed to apply the flag pps_mixed_nalu_types_in_pic_flag only to non-IRAP pictures. Specifically, in this method, the semantics of pps_mixed_nalu_types_in_pic_flag should be modified as follows:

[00245] pps_mixed_nalu_types_in_pic_flag equal to 1 specifies that each non-IRAP picture referring to the PPS has more than one VCL NAL unit and the VCL NAL units do not have the same value of nal_unit_type. pps_mixed_nalu_types_in_pic_flag equal to 0 specifies that each non-IRAP picture referring to the PPS has one or more VCL NAL units and the VCL NAL units of each picture referring to the PPS have the same value of nal_unit_type.
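The interplay of the two picture header flags under Method Two can be sketched as follows. The function names and the string labels returned by `classify_picture` are illustrative assumptions.

```python
def classify_picture(ph_gdr_or_irap_pic_flag, ph_gdr_pic_flag):
    """Map the two PH flags to a picture category, following NOTE 1."""
    if ph_gdr_pic_flag == 1:
        return "GDR"
    if ph_gdr_or_irap_pic_flag == 1:
        # GDR-or-IRAP is set but GDR is not: the picture is an IRAP picture.
        return "IRAP"
    return "not GDR (may or may not be IRAP)"

def method_two_conforms(ph_gdr_or_irap_pic_flag, ph_gdr_pic_flag,
                        pps_mixed_nalu_types_in_pic_flag):
    """Check the Method Two constraint on ph_gdr_pic_flag."""
    if ph_gdr_or_irap_pic_flag == 1 and pps_mixed_nalu_types_in_pic_flag == 1:
        # With mixed NAL unit types, a GDR-or-IRAP picture must be GDR.
        return ph_gdr_pic_flag == 1
    return True
```

The combination that NOTE 1 maps to an IRAP picture is exactly the one rejected by the constraint when pps_mixed_nalu_types_in_pic_flag is equal to 1.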

[00246] According to the current VVC, ph_inter_slice_allowed_flag equal to 0 specifies that all coded slices of the picture have sh_slice_type equal to 2. ph_inter_slice_allowed_flag equal to 1 specifies that there may or may not be one or more coded slices in the picture that have sh_slice_type equal to 0 or 1. ph_gdr_pic_flag equal to 1 specifies the picture associated with the PH is a GDR picture. ph_gdr_pic_flag equal to 0 specifies that the picture associated with the PH is not a GDR picture. When not present, the value of ph_gdr_pic_flag is inferred to be equal to 0. When sps_gdr_enabled_flag is equal to 0, the value of ph_gdr_pic_flag shall be equal to 0. According to this disclosure, it is proposed to infer the value of ph_inter_slice_allowed_flag based on the value of ph_gdr_pic_flag. For example, in case the picture associated with the PH is a GDR picture, ph_inter_slice_allowed_flag is not signaled at the corresponding coding level. An example of the decoding process on VVC Draft is illustrated below. The changes to the VVC Draft are shown in bold and italic font.

[00247] Additionally, it is proposed to add one bitstream conformance constraint that the value of ph_inter_slice_allowed_flag should be equal to one when ph_gdr_pic_flag is equal to one.

[00248] ph_inter_slice_allowed_flag equal to 0 specifies that all coded slices of the picture have sh_slice_type equal to 2. ph_inter_slice_allowed_flag equal to 1 specifies that there may or may not be one or more coded slices in the picture that have sh_slice_type equal to 0 or 1.

[00249] When ph_gdr_or_irap_pic_flag is equal to 1 and ph_gdr_pic_flag is equal to 0 (i.e., the picture is an IRAP picture), and vps_independent_layer_flag[ GeneralLayerIdx[ nuh_layer_id ] ] is equal to 1, the value of ph_inter_slice_allowed_flag shall be equal to 0.

[00250] When ph_gdr_pic_flag is equal to 1, the value of ph_inter_slice_allowed_flag shall be equal to 1.
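The two constraints on ph_inter_slice_allowed_flag can be combined in one conformance check, sketched below. The function name is hypothetical, and `vps_independent_layer_flag` is passed as the already-indexed value for the current layer.

```python
def inter_slice_allowed_conforms(ph_inter_slice_allowed_flag,
                                 ph_gdr_or_irap_pic_flag,
                                 ph_gdr_pic_flag,
                                 vps_independent_layer_flag):
    """True when ph_inter_slice_allowed_flag respects both constraints."""
    if ph_gdr_pic_flag == 1:
        # GDR picture: inter slices must be allowed.
        return ph_inter_slice_allowed_flag == 1
    if ph_gdr_or_irap_pic_flag == 1 and vps_independent_layer_flag == 1:
        # IRAP picture in an independent layer: intra slices only.
        return ph_inter_slice_allowed_flag == 0
    return True
```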

[00251] The above methods may be implemented using an apparatus that includes one or more circuitries, which include application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components. The apparatus may use the circuitries in combination with the other hardware or software components for performing the above described methods. Each module, sub-module, unit, or sub-unit disclosed above may be implemented at least partially using the one or more circuitries.

[00252] FIG. 8 shows a computing environment (or computing device) 810 coupled with a user interface 860. The computing environment 810 can be part of a data processing server. The computing environment 810 includes processor 820, memory 840, and I/O interface 850. In some embodiments, the computing device 810 can be used to implement a coder (such as an encoder or a decoder) which performs any of the video coding processes described herein, including any encoding process or any decoding process.

[00253] The processor 820 typically controls overall operations of the computing environment 810, such as the operations associated with the display, data acquisition, data communications, and image processing. The processor 820 may include one or more processors to execute instructions to perform all or some of the steps in the above-described methods. Moreover, the processor 820 may include one or more modules that facilitate the interaction between the processor 820 and other components. The processor may be a Central Processing Unit (CPU), a microprocessor, a single chip machine, a GPU, or the like.

[00254] The memory 840 is configured to store various types of data to support the operation of the computing environment 810. Memory 840 may include predetermined software 842. Examples of such data include instructions for any applications or methods operated on the computing environment 810, video datasets, image data, etc. The memory 840 may be implemented by using any type of volatile or non-volatile memory devices, or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, or a magnetic or optical disk.

[00255] The I/O interface 850 provides an interface between the processor 820 and peripheral interface modules, such as a keyboard, a click wheel, buttons, and the like. The buttons may include but are not limited to, a home button, a start scan button, and a stop scan button. The I/O interface 850 can be coupled with an encoder and decoder.

[00256] In some embodiments, there is also provided a non-transitory computer-readable storage medium comprising a plurality of programs, such as comprised in the memory 840, executable by the processor 820 in the computing environment 810, for performing the above-described methods. For example, the non-transitory computer-readable storage medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disc, an optical data storage device or the like.

[00257] The non-transitory computer-readable storage medium has stored therein a plurality of programs for execution by a computing device having one or more processors, where the plurality of programs when executed by the one or more processors, cause the computing device to perform the above-described method for motion prediction.

[00258] In some embodiments, the computing environment 810 may be implemented with one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), graphical processing units (GPUs), controllers, micro-controllers, microprocessors, or other electronic components, for performing the above methods.

[00259] Other examples of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed here. This application is intended to cover any variations, uses, or adaptations of the disclosure following the general principles thereof and including such departures from the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples be considered as exemplary only.

[00260] It will be appreciated that the present disclosure is not limited to the exact examples described above and illustrated in the accompanying drawings, and that various modifications and changes can be made without departing from the scope thereof.