Title:
3D VIDEO CODING WITH PARTITION-BASED DEPTH INTER CODING
Document Type and Number:
WIPO Patent Application WO/2015/006884
Kind Code:
A1
Abstract:
Techniques are provided for partition-based inter-coding of depth in 3D video coding. Depth partitions may, in some examples, include non-rectangular depth partitions. The techniques may include various features that support inter-coding of non-rectangular depth partitions. In some examples, the non-rectangular partitions may be inter-coded using, for example, average value-based depth coding for residual information.

Inventors:
CHEN YING (US)
ZHAO XIN (CN)
ZHANG LI (US)
Application Number:
PCT/CN2013/000869
Publication Date:
January 22, 2015
Filing Date:
July 19, 2013
Assignee:
QUALCOMM INC (US)
CHEN YING (US)
ZHAO XIN (CN)
ZHANG LI (US)
International Classes:
H04N19/50
Domestic Patent References:
WO2012128847A1    2012-09-27
Foreign References:
CN101222639A    2008-07-16
CN101873500A    2010-10-27
CN101365142A    2009-02-11
CN101540926A    2009-09-23
Attorney, Agent or Firm:
LEE AND LI - LEAVEN IPR AGENCY LTD. (Tower W1 Oriental Plaza, 1 East Chang An Avenue, East, Beijing 8, CN)
Claims:
WHAT IS CLAIMED IS:

1. A method of decoding video data, the method comprising:

performing an inter-prediction decoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture only when a size of the PU is the same as the size of a coding unit (CU) of the PU, wherein performing the inter-prediction mode comprises:

obtaining motion information indicating inter-predictive samples for the non-rectangular partitions;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data.

2. The method of claim 1, wherein performing the inter-prediction mode comprises performing the inter-prediction mode only when a size of the PU is larger than a predetermined size of KxK pixels.

3. The method of claim 2, wherein K is equal to one of 8, 16 or 32.

4. The method of any of claims 1-3, wherein performing the inter-prediction mode comprises performing the inter-prediction mode only when a size of the PU is smaller than a predetermined size of KxK pixels.

5. The method of claim 4, wherein K is equal to one of 16, 32 or 64.
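The size conditions recited in claims 1 to 5 can be read as a simple gating test. The following is a minimal sketch, not the applicants' implementation. The function name `partition_inter_mode_allowed` and its parameters are hypothetical, and it assumes that "larger than KxK" and "smaller than KxK" mean that both dimensions are strictly greater or strictly smaller than K.

```python
def partition_inter_mode_allowed(pu_size, cu_size, min_k=None, max_k=None):
    """Return True when the non-rectangular inter mode may be used for a PU.

    pu_size, cu_size: (width, height) tuples in pixels.
    min_k: optional K; the PU must be larger than KxK (claims 2 and 3).
    max_k: optional K; the PU must be smaller than KxK (claims 4 and 5).
    Assumption: both dimensions are compared strictly against K.
    """
    # Claim 1: the mode is used only when the PU has the same size as its CU.
    if pu_size != cu_size:
        return False
    width, height = pu_size
    if min_k is not None and not (width > min_k and height > min_k):
        return False
    if max_k is not None and not (width < max_k and height < max_k):
        return False
    return True


if __name__ == "__main__":
    print(partition_inter_mode_allowed((16, 16), (16, 16), min_k=8, max_k=32))
    print(partition_inter_mode_allowed((8, 8), (16, 16)))
```

Leaving `min_k` and `max_k` as `None` applies only the PU-equals-CU condition of claim 1.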

6. The method of any of claims 1-5, wherein the non-rectangular partitions include wedgelet partitions.

7. The method of any of claims 1-6, wherein the motion information includes a reference index indicating a reference picture in a reference picture list for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, and obtaining the motion information comprises obtaining the motion information by motion vector prediction or by receiving the motion information in an encoded bitstream.

8. The method of any of claims 1-7, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.
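The average-value residual of claim 8 can be illustrated with a short sketch. This is only an illustration under stated assumptions: the partition pattern is given as a per-pixel label mask, the residual sign is taken as original average minus predicted average, and reconstruction adds the per-partition residual to every predictive sample of that partition. The claims do not fix these details, and all function names are hypothetical.

```python
def partition_mean(block, mask, label):
    """Average of the samples in `block` whose mask entry equals `label`."""
    values = [block[r][c]
              for r in range(len(block))
              for c in range(len(block[0]))
              if mask[r][c] == label]
    return sum(values) / len(values)


def average_value_residuals(original, predicted, mask):
    """One residual per partition: mean(original) - mean(predicted)."""
    labels = sorted({label for row in mask for label in row})
    return {label: partition_mean(original, mask, label)
                   - partition_mean(predicted, mask, label)
            for label in labels}


def reconstruct(predicted, mask, residuals):
    """Assumed reconstruction: add each partition's residual to its samples."""
    return [[predicted[r][c] + residuals[mask[r][c]]
             for c in range(len(predicted[0]))]
            for r in range(len(predicted))]


if __name__ == "__main__":
    mask = [[0, 0, 1, 1],
            [0, 1, 1, 1]]
    original = [[10, 10, 50, 50],
                [10, 50, 50, 50]]
    predicted = [[8, 8, 47, 47],
                 [8, 47, 47, 47]]
    res = average_value_residuals(original, predicted, mask)
    print(res)
    print(reconstruct(predicted, mask, res))
```

With only one value signalled per partition, the residual cost does not grow with the number of pixels, which suits the piecewise-flat character of depth maps.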

9. A device comprising a video decoder configured to perform the method of any of claims 1-8.

10. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 1-8.

11. A method of encoding video data, the method comprising:

performing an inter-prediction encoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture only when a size of the PU is the same as the size of a coding unit (CU) of the PU, wherein performing the inter-prediction mode comprises:

generating motion information indicating inter-predictive samples for the non-rectangular partitions;

generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

encoding the non-rectangular partitions of the PU based at least in part on the motion information and the residual data.

12. The method of claim 11, wherein performing the inter-prediction mode comprises performing the inter-prediction mode only when a size of the PU is larger than a predetermined size of KxK pixels.

13. The method of claim 12, wherein K is equal to one of 8, 16 or 32.

14. The method of any of claims 11-13, wherein performing the inter-prediction mode comprises performing the inter-prediction mode only when a size of the PU is smaller than a predetermined size of KxK pixels.

15. The method of claim 14, wherein K is equal to one of 16, 32 or 64.

16. The method of any of claims 11-15, wherein the non-rectangular partitions include wedgelet partitions.

17. The method of any of claims 11-16, wherein the motion information includes a reference index identifying a reference picture for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, the method further comprising generating the motion information based on motion vector prediction or motion estimation, and wherein encoding the non-rectangular partitions comprises, when the motion information is generated based on motion vector prediction, encoding a merge index to indicate one of a plurality of candidate sets of motion information in a merge candidate list.

18. The method of any of claims 11-17, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

19. A device comprising a video encoder configured to perform the method of any of claims 11-18.

20. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 11-18.

21. A method of decoding video data, the method comprising:

performing an inter-prediction decoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein performing the inter-prediction mode comprises:

obtaining motion information indicating inter-predictive samples for the non-rectangular partitions,

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and

reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data; and

disabling an advanced motion vector prediction (AMVP) mode when the motion information is obtained using motion vector prediction.

22. The method of claim 21, wherein the motion vector prediction includes a merge mode in which the motion information is obtained from one of a plurality of candidates in a merge candidate list for the non-rectangular partition.

23. The method of claim 21 or 22, further comprising disabling a skip mode when the motion information is obtained using motion vector prediction.

24. The method of any of claims 21-23, wherein the non-rectangular partitions include wedgelet partitions.

25. The method of any of claims 21-24, wherein the motion information includes a reference index indicating a reference picture in a reference picture list for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, and obtaining the motion information comprises obtaining the motion information by motion vector prediction or by receiving the motion information in an encoded bitstream.

26. The method of any of claims 21-25, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

27. A device comprising a video decoder configured to perform the method of any of claims 21-26.

28. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 21-26.

29. A method of encoding video data, the method comprising:

performing an inter-prediction encoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein performing the inter-prediction mode comprises:

generating motion information indicating inter-predictive samples for the non-rectangular partitions,

generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and

encoding the non-rectangular partitions of the PU based at least in part on the motion information and the residual data; and

disabling an advanced motion vector prediction (AMVP) mode when the motion information is generated using motion vector prediction.

30. The method of claim 29, wherein the motion vector prediction includes a merge mode in which the motion information is obtained from one of a plurality of candidates in a merge candidate list for the respective non-rectangular partition.

31. The method of claim 29 or 30, further comprising disabling a skip mode when the motion information is obtained using motion vector prediction.

32. The method of any of claims 29-31, wherein the non-rectangular partitions include wedgelet partitions.

33. The method of any of claims 29-31, wherein the motion information includes a reference index identifying a reference picture for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, the method further comprising generating the motion information based on motion vector prediction or motion estimation, and wherein encoding the non-rectangular partitions comprises, when the motion information is generated based on motion vector prediction, encoding a merge index to indicate one of a plurality of candidate sets of motion information in a merge candidate list for the respective non-rectangular partition.

34. The method of any of claims 29-31, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

35. A device comprising a video decoder configured to perform the method of any of claims 29-34.

36. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 29-34.

37. A method of decoding video data, the method comprising:

predicting motion information for a non-rectangular partition of video data of a prediction unit (PU) of a depth view component of a picture;

performing motion compensation on a rectangular block of the prediction unit that covers the non-rectangular partition using the predicted motion information to obtain inter-predictive samples;

selecting some of the inter-predictive samples for the non-rectangular partition of the PU;

obtaining residual data representing differences between the selected inter-predictive samples and pixels of the non-rectangular partition; and

reconstructing the non-rectangular partition of the PU based at least in part on the selected predictive samples and the residual data.

38. The method of claim 37, wherein predicting the motion information comprises:

receiving a merge index for the non-rectangular partition; and

obtaining the motion information from a selected one of a plurality of candidates in a merge candidate list for the non-rectangular partition.

39. The method of claim 37, wherein selecting some of the predictive samples comprises selecting a subset comprising less than all of the predictive samples for the non-rectangular partition of the PU.

40. The method of claim 37, wherein selecting some of the predictive samples comprises selecting predictive samples that spatially correspond to locations of pixels of the non-rectangular partition within the motion compensated rectangular block.
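Claims 37 to 40 describe motion compensating a rectangular block that covers the partition and then keeping only the samples that fall inside the partition. The sketch below illustrates that two-step idea under simplifying assumptions that are not taken from the claims: integer-pel motion vectors, no sub-sample interpolation, reference coordinates clamped at the picture boundary, and a per-pixel label mask describing the partition. All names are hypothetical.

```python
def motion_compensate_block(ref, x, y, width, height, mv):
    """Fetch a width x height block from `ref`, displaced by mv = (mvx, mvy).

    Assumptions: integer-pel motion, coordinates clamped to the picture.
    """
    mvx, mvy = mv
    rows, cols = len(ref), len(ref[0])
    block = []
    for r in range(height):
        rr = min(max(y + r + mvy, 0), rows - 1)
        row = []
        for c in range(width):
            cc = min(max(x + c + mvx, 0), cols - 1)
            row.append(ref[rr][cc])
        block.append(row)
    return block


def select_partition_samples(block, mask, label):
    """Keep only samples co-located with pixels of the partition `label`."""
    return {(r, c): block[r][c]
            for r in range(len(block))
            for c in range(len(block[0]))
            if mask[r][c] == label}


if __name__ == "__main__":
    ref = [[r * 10 + c for c in range(6)] for r in range(6)]
    block = motion_compensate_block(ref, 1, 1, 2, 2, (1, 2))
    mask = [[0, 1],
            [1, 1]]
    print(block)
    print(select_partition_samples(block, mask, 0))
    print(select_partition_samples(block, mask, 1))
```

Running motion compensation once per partition on the full rectangle and masking afterwards keeps the interpolation path identical to that of an ordinary rectangular PU.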

41. The method of any of claims 37-40, wherein the non-rectangular partitions include wedgelet partitions.

42. The method of any of claims 37-41, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the selected inter-predictive samples for the respective non-rectangular partition.

43. A device comprising a video decoder configured to perform the method of any of claims 37-42.

44. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 37-42.

45. A method of encoding video data, the method comprising:

predicting motion information for a non-rectangular partition of video data of a prediction unit (PU) of a depth view component of a picture;

performing motion compensation on a rectangular block of the prediction unit that covers the non-rectangular partition using the predicted motion information to obtain inter-predictive samples;

selecting some of the inter-predictive samples for the non-rectangular partition of the PU;

generating residual data representing differences between the selected inter-predictive samples and pixels of the non-rectangular partition; and

encoding the non-rectangular partition of the PU based at least in part on the motion information and the residual data.

46. The method of claim 45, wherein predicting the motion information comprises predicting motion information from a selected one of a plurality of motion information candidates in a merge candidate list for the non-rectangular partition.

47. The method of claim 46, wherein encoding the non-rectangular partition comprises encoding a merge index to indicate the selected motion information candidate.

48. The method of claim 45, wherein selecting some of the predictive samples comprises selecting a subset comprising less than all of the predictive samples for the non-rectangular partition of the PU.

49. The method of claim 45, wherein selecting some of the predictive samples comprises selecting predictive samples that spatially correspond to locations of pixels of the non-rectangular partition within the motion compensated rectangular block.

50. The method of any of claims 45-49, wherein the non-rectangular partitions include wedgelet partitions.

51. The method of any of claims 45-50, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the selected inter-predictive samples for the respective non-rectangular partition.

52. A device comprising a video decoder configured to perform the method of any of claims 45-51.

53. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 45-51.

54. A method of decoding video data, the method comprising:

obtaining separate sets of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component;

obtaining sets of inter-predictive samples for the non-rectangular partitions of the PU based on the separate sets of motion information;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

reconstructing the non-rectangular partitions of the PU based at least in part on inter-predictive samples and the residual data.

55. The method of claim 54, wherein the separate sets of motion include a first set of motion information for a first one of the non-rectangular partitions of the PU, and a second set of motion information for a second one of the non-rectangular partitions of the PU, and wherein each of the sets of motion information includes a motion vector and a reference index for a reference picture list.

56. The method of claim 54, wherein obtaining the separate sets of motion information comprises obtaining at least one of the separate sets of motion information using motion vector prediction based on a merge index for a merge candidate list indicating a plurality of motion information candidates.

57. The method of claim 54, wherein obtaining the separate sets of motion information comprises receiving at least one of the separate sets of motion information in an encoded bitstream.

58. The method of any of claims 54-57, wherein the non-rectangular partitions include wedgelet partitions.

59. The method of any of claims 54-58, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the selected inter-predictive samples for the respective non-rectangular partition.

60. A device comprising a video decoder configured to perform the method of any of claims 54-59.

61. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 54-59.

62. A method of encoding video data, the method comprising:

generating separate sets of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component;

obtaining sets of inter-predictive samples for the non-rectangular partitions of the PU based on the separate sets of motion information;

generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

encoding the non-rectangular partitions of the PU based at least in part on the separate sets of motion information and the residual data.

63. The method of claim 62, wherein the separate sets of motion include a first set of motion information for a first one of the non-rectangular partitions of the PU, and a second set of motion information for a second one of the non-rectangular partitions of the PU, and wherein each of the sets of motion information includes a motion vector and a reference index for a reference picture list.

64. The method of claim 63, wherein generating the separate sets of motion information comprises generating at least one of the separate sets of motion information using motion vector prediction based on one of a plurality of motion information candidates of a merge candidate list.

65. The method of claim 62, wherein generating the separate sets of motion information comprises generating at least one of the separate sets of motion information by motion estimation.

66. The method of any of claims 62-65, wherein the non-rectangular partitions include wedgelet partitions.

67. The method of any of claims 62-66, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the selected inter-predictive samples for the respective non-rectangular partition.

68. A device comprising a video encoder configured to perform the method of any of claims 62-67.

69. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 62-67.

70. A method of decoding video data, the method comprising:

obtaining motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein the motion information indicates inter-predictive samples for the non-rectangular partitions;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data, wherein the motion information for the non-rectangular partitions is compressed when a size of the PU is smaller than a predetermined size of KxK pixels.

71. The method of claim 70, wherein the motion information is not compressed when the size of the PU is greater than or equal to KxK pixels.

72. The method of claim 70 or 71, wherein K is 16.

73. The method of claim 70, wherein the motion information is compressed to include a reference index and motion vector for a first reference picture list and exclude a reference index and motion vector for a second reference picture list.

74. The method of claim 70, wherein the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions that covers a particular pixel of the PU, and exclude a reference index and motion vector for one of the non-rectangular partitions that covers the particular pixel of the PU, wherein the included reference index and motion information is used for each of the non-rectangular partitions.

75. The method of claim 74, wherein the particular pixel is a top-left pixel of the PU.

76. The method of claim 70, wherein the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions having a smallest reference index value to a selected reference picture list, and exclude a reference index and motion vector for the other of the non-rectangular partitions, wherein the included reference index and motion information is used for each of the non-rectangular partitions.

77. The method of claim 76, wherein, when the reference indexes of the non-rectangular partitions to the selected reference picture list are the same, the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions having a largest motion vector magnitude, and exclude a reference index and motion vector for the other of the non-rectangular partitions, wherein the included reference index and motion information is used for each of the non-rectangular partitions.
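The selection rule of claims 70 to 72, 76 and 77 can be sketched as follows. This is a hedged illustration only: it handles two partitions and a single reference picture list, measures motion vector magnitude as the squared Euclidean length, and keeps the first partition when both magnitudes are equal. None of those choices is specified by the claims, and the `MotionInfo` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionInfo:
    ref_idx: int   # reference index into the selected reference picture list
    mv: tuple      # motion vector as (mvx, mvy)


def mv_magnitude_sq(mv):
    """Assumed magnitude measure: squared Euclidean length."""
    return mv[0] ** 2 + mv[1] ** 2


def compress_motion(first, second):
    """Keep one set of motion information for both partitions.

    Claim 76: prefer the smaller reference index.
    Claim 77: on equal reference indexes, prefer the larger motion vector.
    """
    if first.ref_idx != second.ref_idx:
        return first if first.ref_idx < second.ref_idx else second
    if mv_magnitude_sq(first.mv) >= mv_magnitude_sq(second.mv):
        return first
    return second


def stored_motion(first, second, pu_size, k=16):
    """Compress only when the PU is smaller than KxK (claims 70 to 72)."""
    width, height = pu_size
    if width < k and height < k:
        kept = compress_motion(first, second)
        return (kept, kept)
    return (first, second)


if __name__ == "__main__":
    a = MotionInfo(ref_idx=1, mv=(4, 0))
    b = MotionInfo(ref_idx=0, mv=(1, 1))
    print(stored_motion(a, b, (8, 8)))
    print(stored_motion(a, b, (16, 16)))
```

With K set to 16, an 8x8 PU stores a single shared set of motion information, while a 16x16 PU keeps both sets.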

78. The method of any of claims 70-77, wherein the non-rectangular partitions include wedgelet partitions.

79. The method of any of claims 70-78, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

80. A device comprising a video decoder configured to perform the method of any of claims 70-79.

81. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 70-79.

82. A method of encoding video data, the method comprising: generating motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture;

compressing the motion information for the non-rectangular partitions when a size of the PU is smaller than a predetermined size of KxK pixels;

selecting inter-predictive samples for the non-rectangular partitions based on the compressed information;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

encoding the non-rectangular partitions of the PU based at least in part on the compressed motion information and the residual data.

83. The method of claim 82, further comprising not compressing the motion information when the size of the PU is greater than or equal to KxK pixels.

84. The method of claim 82 or 83, wherein K is 16.

85. The method of claim 82, wherein compressing the motion information comprises compressing the motion information to include a reference index and motion vector for a first reference picture list and exclude a reference index and motion vector for a second reference picture list.

86. The method of claim 82, wherein compressing the motion information comprises compressing the motion information to include a reference index and motion vector for one of the non-rectangular partitions that covers a particular pixel of the PU, and exclude a reference index and motion vector for one of the non-rectangular partitions that covers the particular pixel of the PU, wherein the included reference index and motion information is used for each of the non-rectangular partitions.

87. The method of claim 82, wherein the particular pixel is a top-left pixel of the PU.

88. The method of claim 82, wherein compressing the motion information comprises compressing the motion information to include a reference index and motion vector for one of the non-rectangular partitions having a smallest reference index value to a selected reference picture list, and exclude a reference index and motion vector for the other of the non-rectangular partitions, wherein the included reference index and motion information is used for each of the non-rectangular partitions.

89. The method of claim 88, wherein, when the reference indexes of the non-rectangular partitions to the selected reference picture list are the same, compressing the motion information comprises compressing the motion information to include a reference index and motion vector for one of the non-rectangular partitions having a largest motion vector magnitude, and exclude a reference index and motion vector for the other of the non-rectangular partitions, wherein the included reference index and motion information is used for each of the non-rectangular partitions.

90. The method of any of claims 82-89, wherein the non-rectangular partitions include wedgelet partitions.

91. The method of any of claims 82-90, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

92. A device comprising a video encoder configured to perform the method of any of claims 82-91.

93. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 82-91.

94. A method of encoding video data, the method comprising:

obtaining motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein the motion information indicates inter-predictive samples for the non-rectangular partitions;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data, wherein the motion information for the non-rectangular partitions is compressed when temporal motion vector prediction (TMVP) is enabled for the motion information.

95. The method of claim 94, wherein the motion information is compressed when the size of the PU is equal to 16x16 pixels.

96. The method of claim 94, wherein the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions that covers a particular set of one or more pixels of the PU, and exclude a reference index and motion vector for one of the non-rectangular partitions that covers the particular pixel of the PU, wherein the included reference index and motion information is used for each of the non-rectangular partitions.

97. The method of claim 96, wherein the particular set of one or more pixels is a top-left pixel of the PU.

98. The method of claim 96, wherein the particular set of one or more pixels is a set of one or more center pixels of the PU.

99. The method of claim 94, wherein the motion information is compressed to exclude any motion information for a TMVP candidate of a merge list comprising a plurality of motion information candidates.

100. The method of any of claims 94-99, wherein the non-rectangular partitions include wedgelet partitions.

101. The method of any of claims 94-100, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

102. A device comprising a video decoder configured to perform the method of any of claims 94-101.

103. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 94-101.

104. A method of encoding video data, the method comprising:

generating motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture;

compressing the motion information for the non-rectangular partitions when temporal motion vector prediction (TMVP) is enabled for the motion information;

selecting inter-predictive samples for the non-rectangular partitions based on the compressed information;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

encoding the non-rectangular partitions of the PU based at least in part on the compressed motion information and the residual data.

105. The method of claim 104, wherein the motion information is compressed when the size of the PU is equal to 16x16 pixels.

106. The method of claim 104, wherein compressing the motion information comprises compressing the motion information to include a reference index and a motion vector for one of the non-rectangular partitions that covers a particular set of one or more pixels of the PU, and exclude a reference index and motion vector for one of the non-rectangular partitions that covers the particular pixel of the PU, wherein the included reference index and motion information is used for each of the non-rectangular partitions.

107. The method of claim 106, wherein the particular set of one or more pixels is a top-left pixel of the PU.

108. The method of claim 106, wherein the particular set of one or more pixels is a set of one or more center pixels of the PU.

109. The method of claim 104, wherein compressing the motion information comprises compressing the motion information to exclude any motion information for a TMVP candidate of a merge list comprising a plurality of motion information candidates.

110. The method of any of claims 104-109, wherein the non-rectangular partitions include wedgelet partitions.

111. The method of any of claims 104-110, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

112. A device comprising a video encoder configured to perform the method of any of claims 104-110.

113. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 104-110.

114. A method of decoding video data, the method comprising:

predicting motion information for a first block in a first coding unit (CU) of a depth view component of a picture based on motion information of a second block in a second CU, wherein, when the second block includes non-rectangular partitions of video data, the predicted motion information is selected to be motion information from one of the non-rectangular partitions that covers a particular pixel of the second block;

generating inter-predictive samples for the first block based on the predicted motion information;

obtaining residual data representing differences between the inter-predictive samples and pixels of the first block; and

reconstructing the first block based at least in part on the inter-predictive samples and the residual data.

115. The method of claim 114, wherein the particular pixel is a center pixel of the second block.

116. The method of claim 114, wherein the motion information includes a reference index identifying a reference picture for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, and wherein the motion information for the second block is one of a plurality of candidate sets of motion information in a merge candidate list.

117. The method of claim 116, further comprising selecting the motion information for the second block based on a merge index to the merge candidate list.

118. The method of any of claims 114-117, wherein the non-rectangular partitions include wedgelet partitions.

119. A device comprising a video decoder configured to perform the method of any of claims 114-118.

120. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 114-118.

121. A method of encoding video data, the method comprising:

predicting motion information for a first block of video data in a first coding unit (CU) of a depth view component of a picture based on motion information of a second block in a second CU, wherein, when the second block includes non-rectangular partitions of video data, the predicted motion information is motion information from one of the non-rectangular partitions that covers a particular pixel of the second block;

generating inter-predictive samples for the first block based on the predicted motion information;

generating residual data representing differences between the inter-predictive samples and pixels of the first block; and

encoding the first block based at least in part on the motion information and the residual data.

122. The method of claim 121, wherein the particular pixel is a center pixel of the second block.

123. The method of claim 121, wherein the motion information includes a reference index identifying a reference picture for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, and wherein the motion information for the second block is selected from one of a plurality of candidate sets of motion information in a merge candidate list.

124. The method of claim 123, further comprising encoding a merge index to the merge candidate list to indicate the selected one of the candidate sets of motion information.

125. The method of any of claims 121-124, wherein the non-rectangular partitions include wedgelet partitions.

126. A device comprising a video encoder configured to perform the method of any of claims 121-125.

127. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 121-125.

128. A method of decoding video data, the method comprising:

generating merge candidate lists including candidate blocks for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture;

predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists;

generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples and the residual data.

129. The method of claim 128, wherein the merge candidate lists are the same for the non-rectangular partitions.

130. The method of claim 128, wherein the merge candidate lists are different for the non-rectangular partitions.

131. The method of claim 128, wherein each of the merge candidate lists includes spatial neighbor blocks, and the merge candidate list for each of the non-rectangular partitions excludes spatial neighbor blocks that are not adjacent to the respective non-rectangular partition.

132. The method of claim 128, wherein each of the merge candidate lists includes spatial neighbor blocks, the merge candidate lists are different for the non-rectangular partitions, and each of the merge candidate lists includes at least some spatial neighbor blocks in common with one another.

133. The method of claim 128, wherein each of the merge candidate lists includes spatial neighbor blocks, and the spatial neighbor blocks in the merge candidate lists are mutually exclusive.

134. The method of claim 128, wherein the merge candidate list for each of the non-rectangular partitions includes spatial neighbor blocks that are adjacent to the respective non-rectangular partition and one or more spatial neighbor blocks that are selected as a substitute spatial neighbor block by shifting vertically or horizontally from a first position of one of the spatial neighbor blocks that is not adjacent to the respective non-rectangular partition to a second position of one of the spatial neighbor blocks that is adjacent to the respective non-rectangular partition, and selecting the spatial neighbor block at the second position as the substitute spatial neighbor block.

135. The method of claim 128, wherein the merge candidate list for each of the non-rectangular partitions includes spatial neighbor blocks that are adjacent to the respective non-rectangular partition and one or more spatial neighbor blocks that are selected as a substitute spatial neighbor block by shifting vertically or horizontally from a first position of a corner spatial neighbor block that is not adjacent to the respective non-rectangular partition to a second position of one of the spatial neighbor blocks that is not adjacent to the respective non-rectangular partition but corresponds to a position of a spatial neighbor block ordinarily used in a merge candidate list for a PU that does not have non-rectangular partitions.

136. The method of claim 128, further comprising selecting candidates from the merge candidate lists for each of the non-rectangular partitions based on merge index values received in an encoded bitstream.

137. The method of any of claims 128-136, wherein the non-rectangular partitions include wedgelet partitions.

138. The method of any of claims 128-137, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

139. A device comprising a video decoder configured to perform the method of any of claims 128-138.

140. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 128-138.

141. A method of encoding video data, the method comprising:

generating merge candidate lists including candidate blocks for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture;

predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists;

generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information;

generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

encoding the non-rectangular partitions of the PU based at least in part on the predicted motion information and the residual data.

142. The method of claim 141, wherein the merge candidate lists are the same for the non-rectangular partitions.

143. The method of claim 141, wherein the merge candidate lists are different for the non-rectangular partitions.

144. The method of claim 141, wherein each of the merge candidate lists includes spatial neighbor blocks, and the merge candidate list for each of the non-rectangular partitions excludes spatial neighbor blocks that are not adjacent to the respective non-rectangular partition.

145. The method of claim 141, wherein each of the merge candidate lists includes spatial neighbor blocks, the merge candidate lists are different for the non-rectangular partitions, and each of the merge candidate lists includes at least some spatial neighbor blocks in common with one another.

146. The method of claim 141, wherein each of the merge candidate lists includes spatial neighbor blocks, and the spatial neighbor blocks in the merge candidate lists are mutually exclusive.

147. The method of claim 141, wherein the merge candidate list for each of the non-rectangular partitions includes spatial neighbor blocks that are adjacent to the respective non-rectangular partition and one or more spatial neighbor blocks that are selected as a substitute spatial neighbor block by shifting vertically or horizontally from a first position of one of the spatial neighbor blocks that is not adjacent to the respective non-rectangular partition to a second position of one of the spatial neighbor blocks that is adjacent to the respective non-rectangular partition, and selecting the spatial neighbor block at the second position as the substitute spatial neighbor block.

148. The method of claim 141, wherein the merge candidate list for each of the non-rectangular partitions includes spatial neighbor blocks that are adjacent to the respective non-rectangular partition and one or more spatial neighbor blocks that are selected as a substitute spatial neighbor block by shifting vertically or horizontally from a first position of a corner spatial neighbor block that is not adjacent to the respective non-rectangular partition to a second position of one of the spatial neighbor blocks that is not adjacent to the respective non-rectangular partition but corresponds to a position of a spatial neighbor block ordinarily used in a merge candidate list for a PU that does not have non-rectangular partitions.

149. The method of claim 141, further comprising selecting candidates from the merge candidate lists for each of the non-rectangular partitions based on merge index values received in an encoded bitstream.

150. The method of any of claims 141-149, wherein the non-rectangular partitions include wedgelet partitions.

151. The method of any of claims 141-150, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

152. A device comprising a video encoder configured to perform the method of any of claims 141-151.

153. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 141-151.

154. A method of decoding video data, the method comprising:

generating motion vector inheritance (MVI) candidates for merge candidate lists for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a view of a picture;

predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists;

generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information;

obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples and the residual data,

wherein the MVI candidate for each of the non-rectangular partitions includes a block of a texture view component, and wherein the block is selected based on at least partial co-location of the respective non-rectangular partition and the block.

155. The method of claim 154, wherein the co-location comprises co-location of a particular pixel of the respective non-rectangular partition with a pixel of the block.

156. The method of claim 154, wherein the co-location comprises co-location of a corner pixel of the respective non-rectangular partition with a pixel of the block.

157. The method of any of claims 154-156, wherein the non-rectangular partitions include wedgelet partitions.

158. The method of any of claims 154-157, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

159. A device comprising a video decoder configured to perform the method of any of claims 154-158.

160. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 154-158.

161. A method of encoding video data, the method comprising:

generating motion vector inheritance (MVI) candidates for merge candidate lists for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a view of a picture;

predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists;

generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information;

generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions; and

encoding the non-rectangular partitions of the PU based at least in part on the predicted motion information and the residual data,

wherein the MVI candidate for each of the non-rectangular partitions includes a block of a texture view component, and wherein the block is selected based on at least partial co-location of the respective non-rectangular partition and the block.

162. The method of claim 161, wherein the co-location comprises co-location of a particular pixel of the respective non-rectangular partition with a pixel of the block.

163. The method of claim 161, wherein the co-location comprises co-location of a corner pixel of the respective non-rectangular partition with a pixel of the block.

164. The method of any of claims 161-163, wherein the non-rectangular partitions include wedgelet partitions.

165. The method of any of claims 161-164, wherein the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

166. A device comprising a video encoder configured to perform the method of any of claims 161-165.

167. A computer-readable data storage medium comprising instructions to cause one or more processors to perform the method of any of claims 161-165.

Description:
3D VIDEO CODING WITH PARTITION-BASED DEPTH INTER CODING

TECHNICAL FIELD

[0001] This disclosure relates to video coding and compression, and more particularly, depth coding techniques that may be used to support three-dimensional (3D) video.

BACKGROUND

[0002] Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, tablet computers, smartphones, personal digital assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video teleconferencing devices, set-top devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC) standard, and extensions of such standards. The video devices may transmit, receive and store digital video information more efficiently.

[0003] An encoder-decoder (codec) applies video compression techniques to perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice may be partitioned into video blocks, which may also be referred to as treeblocks, coding units (CUs) and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures alternatively may be referred to as frames.

[0004] Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and the residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the spatial domain to a transform domain, resulting in residual transform coefficients, which then may be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned in order to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.
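
The residual, quantization, and scanning steps described above can be illustrated with a minimal Python sketch. It is not taken from any standard: the function names, the uniform quantizer with truncation, and the diagonal scan order are assumptions chosen for illustration, and the transform step is omitted so that the quantizer acts directly on the residual.

```python
def residual_block(original, predictive):
    """Pixel-wise difference between the original block and the predictive block."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predictive)]


def quantize(coeffs, step):
    """Illustrative uniform quantizer (assumption): divide by step, truncate toward zero."""
    return [[int(c / step) for c in row] for row in coeffs]


def diagonal_scan(block):
    """Scan a square 2D array along anti-diagonals into a 1D list (assumed scan order)."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):          # s = row + column of the current anti-diagonal
        for r in range(n):
            c = s - r
            if 0 <= c < n:
                out.append(block[r][c])
    return out


original = [[10, 12], [14, 16]]
predictive = [[9, 12], [10, 8]]
res = residual_block(original, predictive)
vector = diagonal_scan(quantize(res, 2))
```

In a real codec the scanned vector would then be passed to an entropy coder; that stage is left out here.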

[0005] A multi-view coding bitstream may be generated by encoding views, e.g., from multiple perspectives. Multi-view coding may allow a decoder to choose between different views, or possibly render multiple views. Moreover, some three-dimensional (3D) video techniques and standards that have been developed, or are under development, make use of multiview coding aspects. For example, different views may transmit left and right eye views to support 3D video. Some 3D video coding processes may apply so-called multiview-plus-depth coding. In multiview-plus-depth coding, a 3D video bitstream may contain multiple views that include not only texture view components, but also depth view components. For example, each view may comprise a texture view component and a depth view component.

SUMMARY

[0006] The techniques of this disclosure generally relate to techniques for partition-based inter-coding of depth in 3D video coding. Depth partitions may, in some examples, include non-rectangular depth partitions. The techniques may include various features that support inter-coding of non-rectangular depth partitions. In some examples, the non-rectangular partitions may be coded using, for example, average value-based depth coding for residual information.
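
As a rough illustration of average value-based residual coding, the Python sketch below computes one residual per partition as the difference between the average of the original pixels and the average of the inter-predictive samples in that partition. The partition pattern is represented as a hypothetical label mask, and the use of floating-point averages without rounding or clipping is an assumption made for simplicity.

```python
def partition_average_residuals(original, predictive, mask):
    """Return {partition label: avg(original) - avg(predictive)} over each partition."""
    sums = {}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            acc = sums.setdefault(label, [0, 0, 0])
            acc[0] += original[r][c]     # running sum of original depth pixels
            acc[1] += predictive[r][c]   # running sum of inter-predictive samples
            acc[2] += 1                  # pixel count for this partition
    return {label: acc[0] / acc[2] - acc[1] / acc[2] for label, acc in sums.items()}


def reconstruct(predictive, mask, residuals):
    """Add each partition's single residual value to every predictive sample in it."""
    return [[predictive[r][c] + residuals[mask[r][c]] for c in range(len(row))]
            for r, row in enumerate(predictive)]


mask = [[0, 0], [0, 1]]                  # hypothetical two-partition pattern
original = [[50, 50], [50, 90]]
predictive = [[48, 48], [48, 80]]
residuals = partition_average_residuals(original, predictive, mask)
rebuilt = reconstruct(predictive, mask, residuals)
```

Because only one value per partition is signaled, the reconstruction is exact here only because each partition in the example is flat; in general it recovers the partition average rather than every pixel.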

[0007] In one aspect, the disclosure describes a method of decoding video data, the method comprising performing an inter-prediction decoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture only when a size of the PU is the same as the size of a coding unit (CU) of the PU, wherein performing the inter-prediction mode comprises obtaining motion information indicating inter-predictive samples for the non-rectangular partitions, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data.

[0008] In another aspect, the disclosure describes a method of encoding video data, the method comprising performing an inter-prediction encoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture only when a size of the PU is the same as the size of a coding unit (CU) of the PU, wherein performing the inter-prediction mode comprises generating motion information indicating inter-predictive samples for the non-rectangular partitions, generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the motion information and the residual data.

[0009] In another aspect, the disclosure describes a method of decoding video data, the method comprising performing an inter-prediction decoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein performing the inter-prediction mode comprises obtaining motion information indicating inter-predictive samples for the non-rectangular partitions, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data, and disabling an advanced motion vector prediction (AMVP) mode when the motion information is obtained using motion vector prediction.

[0010] In another aspect, the disclosure describes a method of encoding video data, the method comprising performing an inter-prediction encoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein performing the inter-prediction mode comprises generating motion information indicating inter-predictive samples for the non-rectangular partitions, generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the motion information and the residual data, and disabling an advanced motion vector prediction (AMVP) mode when the motion information is generated using motion vector prediction.

[0011] In another aspect, the disclosure describes a method of decoding video data, the method comprising predicting motion information for a non-rectangular partition of video data of a prediction unit (PU) of a depth view component of a picture, performing motion compensation on a rectangular block of the prediction unit that covers the non-rectangular partition using the predicted motion information to obtain inter-predictive samples, selecting some of the inter-predictive samples for the non-rectangular partition of the PU, obtaining residual data representing differences between the selected inter-predictive samples and pixels of the non-rectangular partition, and reconstructing the non-rectangular partition of the PU based at least in part on the selected predictive samples and the residual data.
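
The idea of motion-compensating a rectangular block and then keeping only the samples that fall inside the non-rectangular partition might be sketched as follows in Python. The integer-pel motion vector, the clamping at picture boundaries, the label mask, and all function names are assumptions for illustration; sub-pel interpolation is not modeled.

```python
def motion_compensate(reference, x0, y0, width, height, mv):
    """Fetch a width x height block from the reference picture, displaced by mv = (dx, dy).

    Coordinates are clamped to the picture area (an illustrative padding assumption).
    """
    dx, dy = mv
    max_y = len(reference) - 1
    max_x = len(reference[0]) - 1
    block = []
    for y in range(height):
        ry = min(max(y0 + y + dy, 0), max_y)
        row = []
        for x in range(width):
            rx = min(max(x0 + x + dx, 0), max_x)
            row.append(reference[ry][rx])
        block.append(row)
    return block


def select_partition_samples(block, mask, label):
    """Keep only the predictive samples whose position belongs to the given partition."""
    return {(r, c): block[r][c]
            for r, row in enumerate(mask)
            for c, lab in enumerate(row) if lab == label}


# Hypothetical 4x4 reference picture whose sample value encodes its position.
reference = [[10 * y + x for x in range(4)] for y in range(4)]
block = motion_compensate(reference, x0=1, y0=1, width=2, height=2, mv=(1, 0))
mask = [[0, 1], [0, 0]]
samples_p1 = select_partition_samples(block, mask, 1)
samples_p0 = select_partition_samples(block, mask, 0)
```

Fetching the full rectangle keeps the motion compensation stage identical to that of ordinary rectangular blocks; only the selection step depends on the partition shape.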

[0012] In another aspect, the disclosure describes a method of encoding video data, the method comprising predicting motion information for a non-rectangular partition of video data of a prediction unit (PU) of a depth view component of a picture, performing motion compensation on a rectangular block of the prediction unit that covers the non-rectangular partition using the predicted motion information to obtain inter-predictive samples, selecting some of the inter-predictive samples for the non-rectangular partition of the PU, generating residual data representing differences between the selected inter-predictive samples and pixels of the non-rectangular partition, and encoding the non-rectangular partition of the PU based at least in part on the motion information and the residual data.

[0013] In another aspect, the disclosure describes a method of decoding video data, the method comprising obtaining separate sets of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component, obtaining sets of inter-predictive samples for the non-rectangular partitions of the PU based on the separate sets of motion information, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on inter-predictive samples and the residual data.

[0014] In another aspect, the disclosure describes a method of encoding video data, the method comprising generating separate sets of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component, obtaining sets of inter-predictive samples for the non-rectangular partitions of the PU based on the separate sets of motion information, generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the separate sets of motion information and the residual data.

[0015] In another aspect, the disclosure describes a method of decoding video data, the method comprising obtaining motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein the motion information indicates inter-predictive samples for the non-rectangular partitions, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data, wherein the motion information for the non-rectangular partitions is compressed when a size of the PU is smaller than a predetermined size of KxK pixels.

[0016] In another aspect, the disclosure describes a method of encoding video data, the method comprising generating motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, compressing the motion information for the non-rectangular partitions when a size of the PU is smaller than a predetermined size of KxK pixels, selecting inter-predictive samples for the non-rectangular partitions based on the compressed information, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the compressed motion information and the residual data.
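
One way to picture size-dependent motion compression is the Python sketch below. It follows an option recited in the claims, in which the reference index and motion vector of the partition covering a particular pixel (here the top-left pixel) are retained and used for every partition. The square-PU assumption, the comparison of the PU side length against K, the label mask, and the function name are all hypothetical.

```python
def compress_motion(partition_motion, mask, k, pixel=(0, 0)):
    """Return per-partition motion info, compressed when the PU is smaller than K x K.

    partition_motion maps a partition label to (reference index, motion vector).
    mask is a square label map of the PU (assumption: the PU is square).
    """
    pu_size = len(mask)
    if pu_size >= k:
        return dict(partition_motion)            # no compression for larger PUs
    r, c = pixel
    kept = partition_motion[mask[r][c]]          # motion of the partition covering `pixel`
    return {label: kept for label in partition_motion}


motion = {0: (0, (1, 2)), 1: (1, (-3, 0))}       # hypothetical (ref_idx, (mvx, mvy)) pairs
mask = [[1, 0], [0, 0]]                          # partition 1 covers the top-left pixel
compressed = compress_motion(motion, mask, k=8)
untouched = compress_motion(motion, mask, k=2)
```

Passing a different `pixel`, such as a center position, would model the center-pixel variant of the same idea.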

[0017] In another aspect, the disclosure describes a method of decoding video data, the method comprising obtaining motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein the motion information indicates inter-predictive samples for the non-rectangular partitions, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data, wherein the motion information for the non-rectangular partitions is compressed when temporal motion vector prediction (TMVP) is enabled for the motion information.

[0018] In another aspect, the disclosure describes a method of encoding video data, the method comprising generating motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, compressing the motion information for the non-rectangular partitions when temporal motion vector prediction (TMVP) is enabled for the motion information, selecting inter-predictive samples for the non-rectangular partitions based on the compressed information, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the compressed motion information and the residual data.

[0019] In another aspect, the disclosure describes a method of decoding video data, the method comprising predicting motion information for a first block in a first coding unit (CU) of a depth view component of a picture based on motion information of a second block in a second CU, wherein, when the second block includes non-rectangular partitions of video data, the predicted motion information is selected to be motion information from one of the non-rectangular partitions that covers a particular pixel of the second block, generating inter-predictive samples for the first block based on the predicted motion information, obtaining residual data representing differences between the inter-predictive samples and pixels of the first block, and reconstructing the first block based at least in part on the inter-predictive samples and the residual data.

[0020] In another aspect, the disclosure describes a method of encoding video data, the method comprising predicting motion information for a first block of video data in a first coding unit (CU) of a depth view component of a picture based on motion information of a second block in a second CU, wherein, when the second block includes non-rectangular partitions of video data, the predicted motion information is motion information from one of the non-rectangular partitions that covers a particular pixel of the second block, generating inter-predictive samples for the first block based on the predicted motion information, generating residual data representing differences between the inter-predictive samples and pixels of the first block, and encoding the first block based at least in part on the motion information and the residual data.
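
The selection of predicted motion information from a second block that contains non-rectangular partitions can be sketched in Python as follows, using the center-pixel option recited in the claims. The label mask, the choice of (N // 2, N // 2) as the center of an N x N block, and the function name are illustrative assumptions.

```python
def inherit_motion(partition_motion, mask):
    """Return the motion info of the partition covering the center pixel of the second block.

    partition_motion maps a partition label to (reference index, motion vector).
    """
    size = len(mask)
    center_row = center_col = size // 2          # assumed definition of the center pixel
    return partition_motion[mask[center_row][center_col]]


# Hypothetical 4x4 second block split by a diagonal edge into partitions 0 and 1.
mask = [
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
second_block_motion = {0: (0, (4, -1)), 1: (0, (0, 0))}
predicted = inherit_motion(second_block_motion, mask)
```

The first block then uses `predicted` as its own reference index and motion vector when generating its inter-predictive samples.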

[0021] In another aspect, the disclosure describes a method of decoding video data, the method comprising generating merge candidate lists including candidate blocks for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples and the residual data.
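
As a hedged illustration of per-partition merge candidate lists, the Python sketch below assigns each spatial neighbor position to the partition it touches, so that each partition's list excludes neighbors that are not adjacent to it. The five neighbor positions follow the conventional HEVC merge layout (A1, B1, B0, A0, B2). The adjacency test, which clamps the neighbor position to the nearest pixel inside the PU and looks up that pixel's partition, is an assumption of this sketch and not a rule stated in the disclosure. The lists hold neighbor names only; availability checks, pruning, and temporal candidates are not modeled.

```python
def spatial_neighbors(width, height):
    """Neighbor positions (x, y) relative to the top-left pixel of a width x height PU."""
    return [
        ("A1", (-1, height - 1)),   # left, bottom-most
        ("B1", (width - 1, -1)),    # above, right-most
        ("B0", (width, -1)),        # above-right
        ("A0", (-1, height)),       # below-left
        ("B2", (-1, -1)),           # above-left
    ]


def merge_neighbors_per_partition(mask):
    """Map each partition label to the names of the spatial neighbors adjacent to it."""
    height, width = len(mask), len(mask[0])
    lists = {}
    for row in mask:
        for label in row:
            lists.setdefault(label, [])
    for name, (x, y) in spatial_neighbors(width, height):
        cx = min(max(x, 0), width - 1)   # nearest PU pixel to the neighbor (assumption)
        cy = min(max(y, 0), height - 1)
        lists[mask[cy][cx]].append(name)
    return lists


mask = [
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
candidate_lists = merge_neighbors_per_partition(mask)
```

With this particular adjacency rule the resulting lists are mutually exclusive; a variant that shares some neighbors between partitions would need a looser adjacency test.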

[0022] In another aspect, the disclosure describes a method of encoding video data, the method comprising generating merge candidate lists including candidate blocks for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information, generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the predicted motion information and the residual data.

[0023] In another aspect, the disclosure describes a method of decoding video data, the method comprising generating motion vector inheritance (MVI) candidates for merge candidate lists for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a view of a picture, predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples and the residual data, wherein the MVI candidate for each of the non-rectangular partitions includes a block of a texture view component, and wherein the block is selected based on at least partial co-location of the respective non-rectangular partition and the block.

[0024] In another aspect, the disclosure describes a method of encoding video data, the method comprising generating motion vector inheritance (MVI) candidates for merge candidate lists for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a view of a picture, predicting motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generating inter-predictive samples for the non-rectangular partitions based on the predicted motion information, generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the predicted motion information and the residual data, wherein the MVI candidate for each of the non-rectangular partitions includes a block of a texture view component, and wherein the block is selected based on at least partial co-location of the respective non-rectangular partition and the block.

[0025] In another aspect, the disclosure describes a device comprising a video decoder configured to perform various methods as described herein, a device comprising a video encoder configured to perform various methods described herein, and a non-transitory computer-readable data storage medium comprising instructions to cause one or more processors to perform various methods described herein.

[0026] The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF DRAWINGS

[0027] FIG. 1 is a block diagram illustrating an example video coding system that may utilize the techniques of this disclosure.

[0028] FIG. 2 is a diagram illustrating intra prediction modes used in high efficiency video coding (HEVC).

[0029] FIG. 3 is a diagram illustrating an example of one wedgelet partition pattern for use in coding an 8x8 block of pixel samples.

[0030] FIG. 4 is a diagram illustrating an example of one contour partition pattern for use in coding an 8x8 block of pixel samples.

[0031] FIG. 5 is a diagram illustrating eight possible types of chains defined in a region boundary chain coding process.

[0032] FIG. 6 is a diagram illustrating a region boundary chain coding mode with one depth prediction unit (PU) partition pattern and the coded chains in chain coding.

[0033] FIG. 7 is a diagram illustrating derivation of a motion vector inheritance (MVI) candidate for depth coding.

[0034] FIG. 8 is a diagram illustrating generation of a motion compensated rectangular block forming a reference block used to obtain inter-predictive reference samples for a non-rectangular partition.

[0035] FIG. 9 is a diagram illustrating selection of motion information for a non-rectangular partition that covers a top-left pixel of a prediction unit of a depth view component.

[0036] FIG. 10 is a diagram illustrating selection of motion information of a non-rectangular partition for use in motion prediction for a block in another CU based on the determination that the partition covers a center pixel of the candidate block for motion prediction.

[0037] FIGS. 11A, 11B, 11C and 11D are diagrams illustrating definition of neighboring blocks used for construction of a merge candidate list for non-rectangular partitions in depth coding.

[0038] FIG. 12 is a block diagram illustrating an example video encoder that may implement the techniques of this disclosure.

[0039] FIG. 13 is a block diagram illustrating an example video decoder that may implement the techniques of this disclosure.

[0040] FIG. 14 is a flow diagram illustrating a method for enabling coding according to a non-rectangular depth inter prediction mode.

[0041] FIG. 15 is a flow diagram illustrating a method for performing coding with motion prediction while disabling advanced motion vector prediction (AMVP) in a non-rectangular depth inter prediction mode.

[0042] FIG. 16 is a flow diagram illustrating a method for performing coding with motion prediction in a non-rectangular depth inter prediction mode.

[0043] FIG. 17 is a flow diagram illustrating a method for maintaining separate sets of motion information for non-rectangular partitions in a non-rectangular depth inter prediction mode.

[0044] FIG. 18 is a flow diagram illustrating a method for compression of motion information for non-rectangular partitions in a non-rectangular depth inter prediction mode.

[0045] FIG. 19 is a flow diagram illustrating another method for compression of motion information for non-rectangular partitions in a non-rectangular depth inter prediction mode.

[0046] FIG. 20 is a flow diagram illustrating a method for selecting motion information for motion prediction from a merge candidate having non-rectangular partitions.

[0047] FIG. 21 is a flow diagram illustrating a method for generating merge candidate lists for motion prediction for non-rectangular partitions in a non-rectangular depth inter prediction mode.

[0048] FIG. 22 is a flow diagram illustrating a method for generating motion vector inheritance (MVI) candidates for a merge candidate list for motion prediction for non-rectangular partitions in a non-rectangular depth inter prediction mode.

DETAILED DESCRIPTION

[0049] This disclosure describes techniques for 3D video coding based on advanced codecs, such as High Efficiency Video Coding (HEVC) codecs. The 3D coding techniques described in this disclosure include depth coding techniques related to advanced inter-coding of depth views in a multiview-plus-depth video coding process, such as the 3D-HEVC extension to HEVC, which is presently under development.

[0050] In HEVC, assuming that the size of a coding unit (CU) is 2Nx2N, a video encoder and video decoder may support various prediction unit (PU) sizes of 2Nx2N or NxN for intra-prediction, and symmetric PU sizes of 2Nx2N, 2NxN, Nx2N, NxN, or similar for inter-prediction. A video encoder and video decoder may also support asymmetric partitioning for PU sizes of 2NxnU, 2NxnD, nLx2N, and nRx2N for inter-prediction.
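As an illustration only, the PU sizes listed above can be tabulated for a given CU size. The following Python sketch assumes the quarter/three-quarter split that HEVC uses for the asymmetric modes; the function name `pu_partitions` and the mode strings are hypothetical labels chosen for this example, not syntax elements of any standard.

```python
def pu_partitions(cu_size, mode):
    """Return the (width, height) of each PU for a square CU of side cu_size (= 2N).

    Assumption: asymmetric modes split the CU at one quarter of its side,
    so 2NxnU yields a 2N x N/2 PU on top of a 2N x 3N/2 PU.
    """
    n = cu_size // 2        # N
    q = cu_size // 4        # N/2, the short side of an asymmetric PU
    table = {
        "2Nx2N": [(cu_size, cu_size)],
        "2NxN": [(cu_size, n), (cu_size, n)],
        "Nx2N": [(n, cu_size), (n, cu_size)],
        "NxN": [(n, n)] * 4,
        "2NxnU": [(cu_size, q), (cu_size, cu_size - q)],
        "2NxnD": [(cu_size, cu_size - q), (cu_size, q)],
        "nLx2N": [(q, cu_size), (cu_size - q, cu_size)],
        "nRx2N": [(cu_size - q, cu_size), (q, cu_size)],
    }
    return table[mode]


if __name__ == "__main__":
    for mode in ("2Nx2N", "2NxN", "NxN", "2NxnU", "nRx2N"):
        print(mode, pu_partitions(16, mode))
```

In every mode the PUs tile the CU exactly, so their areas sum to the CU area.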

[0051] For depth coding as provided in 3D-HEVC, a video encoder and video decoder may be configured to support a variety of different depth coding partition modes for intra-prediction or inter-prediction, including modes that use non-rectangular partitions. Examples of depth coding with non-rectangular partitions include wedgelet partition-based depth coding and contour partition-based depth coding. As described in this disclosure, techniques for partition-based inter-coding of non-rectangular partitions, such as wedgelet partitions or contour partitions, as examples, may be performed in conjunction with an average value-based coding of residual information, which may be the same as or similar to a simplified depth coding (SDC) process for depth intra coding of residual information.
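The idea of coding one average residual value per non-rectangular partition can be sketched as follows. This is a simplified illustration under stated assumptions, not the 3D-HEVC wedgelet derivation or the SDC process itself: the partition pattern is obtained by testing which side of a straight line each sample lies on, and the residual for each partition is the rounded mean of the sample differences. The names `wedgelet_mask` and `partition_mean_residuals` are hypothetical.

```python
def wedgelet_mask(size, start, end):
    """Binary partition pattern for a size x size block.

    A sample belongs to partition 1 if it lies on the positive side of the
    line from start to end (sign of the 2D cross product), else partition 0.
    """
    (x0, y0), (x1, y1) = start, end
    return [
        [1 if (x1 - x0) * (y - y0) - (y1 - y0) * (x - x0) > 0 else 0
         for x in range(size)]
        for y in range(size)
    ]


def partition_mean_residuals(original, predicted, mask):
    """Return [mean residual of partition 0, mean residual of partition 1]."""
    sums = [0, 0]
    counts = [0, 0]
    for orig_row, pred_row, mask_row in zip(original, predicted, mask):
        for o, p, m in zip(orig_row, pred_row, mask_row):
            sums[m] += o - p      # residual = original depth - predicted depth
            counts[m] += 1
    # An empty partition contributes a zero residual in this sketch.
    return [round(s / c) if c else 0 for s, c in zip(sums, counts)]


if __name__ == "__main__":
    mask = wedgelet_mask(4, (2, 0), (2, 4))       # vertical edge at x = 2
    original = [[50, 50, 90, 90]] * 4             # two flat depth regions
    predicted = [[48, 48, 93, 93]] * 4            # inter-predictive samples
    print(partition_mean_residuals(original, predicted, mask))
```

Because depth maps tend to consist of flat regions separated by sharp edges, a single value per partition can capture most of the residual in such a block.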

[0052] Video data coded using 3D video coding techniques may be rendered and displayed to produce a three-dimensional effect. As one example, two images of different views (i.e., corresponding to two camera perspectives having slightly different horizontal positions) may be displayed substantially simultaneously such that one image is seen by a viewer's left eye, and the other image is seen by the viewer's right eye.

[0053] A 3D effect may be achieved using, for example, stereoscopic displays or autostereoscopic displays. Stereoscopic displays may be used in conjunction with eyewear that filters the two images accordingly. For example, passive glasses may filter the images using polarized lenses or different colored lenses to ensure that the proper eye views the proper image. Active glasses, as another example, may rapidly shutter alternate lenses in coordination with the stereoscopic display, which may alternate between displaying the left eye image and the right eye image. Autostereoscopic displays display the two images in such a way that no glasses are needed. For example, autostereoscopic displays may include mirrors or prisms that are configured to cause each image to be projected into the viewer's appropriate eye.

[0054] The techniques of this disclosure relate to techniques for coding 3D video data by coding texture and depth data to support 3D video. In general, the term "texture" is used to describe luminance (that is, brightness or "luma") values of an image and chrominance (that is, color or "chroma") values of the image. In some examples, a texture image may include one set of luminance data (Y) and two sets of chrominance data for blue hues (Cb) and red hues (Cr). In certain chroma formats, such as 4:2:2 or 4:2:0, the chroma data is downsampled relative to the luma data. That is, the spatial resolution of chrominance pixels may be lower than the spatial resolution of corresponding luminance pixels, e.g., one-half or one-quarter of the luminance resolution.
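As a brief illustration of the downsampling described above, the following Python sketch computes the dimensions of each chroma plane (Cb or Cr) from the luma dimensions. The function name `chroma_plane_size` is hypothetical, and the sketch assumes luma dimensions that are divisible by the subsampling factors.

```python
def chroma_plane_size(luma_width, luma_height, chroma_format):
    """Return (width, height) of one chroma plane for the given chroma format."""
    # (horizontal, vertical) subsampling factors relative to luma
    factors = {
        "4:4:4": (1, 1),   # no subsampling
        "4:2:2": (2, 1),   # half horizontal resolution
        "4:2:0": (2, 2),   # half horizontal and half vertical resolution
    }
    sx, sy = factors[chroma_format]
    return luma_width // sx, luma_height // sy


if __name__ == "__main__":
    for fmt in ("4:4:4", "4:2:2", "4:2:0"):
        print(fmt, chroma_plane_size(1920, 1080, fmt))
```

For 4:2:2 each chroma plane holds one-half as many samples as the luma plane, and for 4:2:0 one-quarter as many.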

[0055] Depth data generally describes depth values for corresponding texture data. For example, a depth image may include a set of depth pixels (or depth values) that each describes depth, e.g., in a depth component of a view, for corresponding texture data, e.g., in a texture component of the view. Each pixel may have one or more texture values (e.g., luminance and chrominance), and may also have one or more depth values. The depth data may be used to determine horizontal disparity for the corresponding texture data, and in some cases, vertical disparity may also be used.

[0056] A device that receives the texture and depth data may display a first texture image for one view (e.g., a left eye view) and use the depth data to modify the first texture image to generate a second texture image for the other view (e.g., a right eye view) by offsetting pixel values of the first image by the horizontal disparity values determined based on the depth values. In general, horizontal disparity (or simply "disparity") describes the horizontal spatial offset of a pixel in the left view relative to a corresponding pixel in the right view, where the two pixels correspond to the same portion of the same object as represented in the two views.
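The offsetting of pixel values by horizontal disparity can be sketched for a single row of pixels. The following Python example is a minimal illustration under several assumptions that the text above does not specify: disparities are non-negative integers already derived from depth, pixels shift toward lower x positions, a pixel with larger disparity (assumed closer to the viewer) wins when two pixels land on the same position, and positions that receive no pixel are left as holes. The function name `synthesize_row` is hypothetical.

```python
def synthesize_row(texture_row, disparity_row, hole=None):
    """Generate one row of a second view by shifting pixels of a first view."""
    width = len(texture_row)
    out = [hole] * width
    best = [-1] * width          # largest disparity written at each target so far
    for x in range(width):
        d = disparity_row[x]
        tx = x - d               # assumed shift direction for the second view
        if 0 <= tx < width and d > best[tx]:
            out[tx] = texture_row[x]
            best[tx] = d
    return out


if __name__ == "__main__":
    texture = [10, 20, 30, 40]
    disparity = [0, 0, 1, 1]     # the two right-hand pixels are closer
    print(synthesize_row(texture, disparity))
```

A practical renderer would also fill the holes left by disocclusion, for example by interpolating from neighboring pixels.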

[0057] In still other examples, depth data may be defined for pixels in a z-dimension perpendicular to the image plane, such that a depth associated with a given pixel is defined relative to a zero disparity plane defined for the image. Such depth may be used to create horizontal disparity for displaying the pixel, such that the pixel is displayed differently for the left and right eyes, depending on the z-dimension depth value of the pixel relative to the zero disparity plane. The zero disparity plane may change for different portions of a video sequence, and the amount of depth relative to the zero disparity plane may also change.

[0058] Pixels located on the zero disparity plane may be defined similarly for the left and right eyes. Pixels located in front of the zero disparity plane may be displayed in different locations for the left and right eye (e.g., with horizontal disparity) so as to create a perception that the pixel appears to come out of the image in the z-direction perpendicular to the image plane. Pixels located behind the zero disparity plane may be displayed with a slight blur, to present a slight perception of depth, or may be displayed in different locations for the left and right eye (e.g., with horizontal disparity that is opposite that of pixels located in front of the zero disparity plane). Many other techniques may also be used to convey or define depth data for an image.

[0059] Two-dimensional video data is generally coded as a sequence of discrete pictures, each of which corresponds to a particular temporal instance. That is, each picture has an associated playback time relative to playback times of other images in the sequence. These pictures may be considered texture pictures or texture images. In depth-based 3D video coding, each texture picture in a sequence may also correspond to a depth map. That is, a depth map corresponding to a texture picture describes depth data for the corresponding texture picture. Multiview video data may include data for various different views, where each view may include a respective sequence of texture components and corresponding depth components.

[0060] A picture generally corresponds to a particular temporal instance. Video data may be represented using a sequence of access units, where each access unit includes all data corresponding to a particular temporal instance. Thus, for example, for multiview video data plus depth coding, texture images from each view for a common temporal instance, plus the depth maps for each of the texture images, may all be included within a particular access unit. Hence, an access unit may include multiple views, where each view may include data for a texture component, corresponding to a texture image, and data for a depth component, corresponding to a depth map. Multiview-plus-depth creates a variety of coding possibilities, such as intra-picture, inter-picture, intra-view, inter-view, motion prediction, and the like.
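The grouping of view components into access units can be sketched as a simple data structure. In the following Python example, each component is described by a hypothetical tuple of (temporal instance, view identifier, component kind, payload); these field names and the function `group_into_access_units` are assumptions made for illustration and do not reflect any bitstream syntax.

```python
from collections import defaultdict


def group_into_access_units(components):
    """Group (time, view_id, kind, payload) tuples into access units.

    Returns a dict mapping each temporal instance to a dict of views, where
    each view maps a component kind ("texture" or "depth") to its payload.
    """
    units = defaultdict(dict)
    for time, view_id, kind, payload in components:
        units[time].setdefault(view_id, {})[kind] = payload
    # Sort by temporal instance so access units appear in time order.
    return dict(sorted(units.items()))


if __name__ == "__main__":
    components = [
        (1, 0, "texture", "T1_view0"),
        (0, 0, "texture", "T0_view0"),
        (0, 0, "depth", "D0_view0"),
        (0, 1, "texture", "T0_view1"),
        (0, 1, "depth", "D0_view1"),
    ]
    for time, views in group_into_access_units(components).items():
        print(time, views)
```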

[0061] In this manner, 3D video data may be represented using a multiview video plus depth format, in which captured or generated views (texture) are associated with corresponding depth maps. Moreover, in 3D video coding, textures and depth maps may be coded and multiplexed into a 3D video bitstream. Depth maps may be coded as grayscale images, where "luma" samples (that is, pixels) of the depth maps represent depth values.

[0062] In general, a block of depth data (a block of samples of a depth map) may be referred to as a depth block. A depth value may be referred to as a luma value associated with a depth sample. In any case, conventional intra- and inter-coding methods may be applied for depth map coding.

[0063] Depth maps commonly are characterized by sharp edges and constant areas, and edges in depth maps typically present strong correlations with corresponding texture data. Due to the different statistics and correlations between texture and corresponding depth, different coding schemes have been designed for depth maps based on a 2D video codec.

[0064] HEVC techniques related to this disclosure are reviewed below. Examples of video coding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. The latest joint draft of MVC is described in "Advanced video coding for generic audiovisual services," ITU-T Recommendation H.264, Mar 2010.

[0065] In addition, High Efficiency Video Coding (HEVC), mentioned above, is a new and upcoming video coding standard, developed by the Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). A recent draft of the HEVC standard, JCTVC-L1003, Benjamin Bross, Woo-Jin Han, Jens-Rainer Ohm, Gary Sullivan, Ye-Kui Wang, "High Efficiency Video Coding (HEVC) text specification draft 10 (for FDIS & Consent)," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Geneva, CH, 14-23 Jan. 2013, is available from the following link: http://phenix.it-sudparis.eu/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1003-v11.zip.

[0066] FIG. 1 is a block diagram illustrating an example video encoding and decoding system 10 that may utilize various techniques of this disclosure for depth coding. In some examples, video encoder 20 and video decoder 30 may be configured to perform various functions for partition-based inter-coding of depth data with simplified depth coding of residual information for 3D video coding. As shown in FIG. 1, system 10 includes a source device 12 that provides encoded video data to be decoded at a later time by a destination device 14. In particular, source device 12 provides the video data to destination device 14 via a computer-readable medium 16. Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, televisions, cameras, display devices, digital media players, video gaming consoles, video streaming devices, or the like. In some cases, source device 12 and destination device 14 may be equipped for wireless communication.

[0067] Destination device 14 may receive the encoded video data to be decoded via computer-readable medium 16. Computer-readable medium 16 may comprise any type of medium or device capable of moving the encoded video data from source device 12 to destination device 14. In one example, computer-readable medium 16 may comprise a communication medium to enable source device 12 to transmit encoded video data directly to destination device 14 in real-time.

[0068] The encoded video data may be modulated according to a communication standard, such as a wireless communication protocol, and transmitted to destination device 14. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 12 to destination device 14.

[0069] In some examples, encoded data may be output from output interface 22 to a computer-readable storage medium, i.e., a storage device. Similarly, encoded data may be accessed from the storage device by input interface 28. The storage device may include any of a variety of distributed or locally accessed data storage media such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In a further example, the storage device may correspond to a file server or another intermediate storage device that may store the encoded video generated by source device 12.

[0070] Destination device 14 may access stored video data from the storage device via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting that encoded video data to the destination device 14. Example file servers include a web server (e.g., for a website), an FTP server, network attached storage (NAS) devices, or a local disk drive. Destination device 14 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.

[0071] The techniques of this disclosure are not necessarily limited to wireless applications or settings. The techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions, such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some examples, system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.

[0072] In the example of FIG. 1, source device 12 includes video source 18, video encoder 20, and output interface 22. Destination device 14 includes input interface 28, video decoder 30, and display device 32. In accordance with this disclosure, video encoder 20 of source device 12 may be configured to apply techniques for partition-based depth coding with non-rectangular partitions. In other examples, a source device and a destination device may include other components or arrangements. For example, source device 12 may receive video data from an external video source 18, such as an external camera. Likewise, destination device 14 may interface with an external display device, rather than including an integrated display device.

[0073] The illustrated system 10 of FIG. 1 is merely one example. Techniques for depth coding may be performed by any digital video encoding and/or decoding device. Although generally the techniques of this disclosure are performed by a video encoder 20 and/or video decoder 30, the techniques may also be performed by a video encoder/decoder, typically referred to as a "CODEC." Moreover, the techniques of this disclosure may also be performed by a video preprocessor. Source device 12 and destination device 14 are merely examples of such coding devices in which source device 12 generates coded video data for transmission to destination device 14. In some examples, devices 12, 14 may operate in a substantially symmetrical manner such that each of devices 12, 14 includes video encoding and decoding components. Hence, system 10 may support one-way or two-way video transmission between video devices 12, 14, e.g., for video streaming, video playback, video broadcasting, or video telephony.

[0074] Video source 18 of source device 12 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video feed interface to receive video from a video content provider. As a further alternative, video source 18 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In some cases, if video source 18 is a video camera, source device 12 and destination device 14 may form so-called smart phones, tablet computers or video phones. As mentioned above, however, the techniques described in this disclosure may be applicable to video coding in general, and may be applied to wireless and/or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by video encoder 20. The encoded video information may then be output by output interface 22 onto a computer-readable medium 16.

[0075] Computer-readable medium 16 may include transient media, such as a wireless broadcast or wired network transmission, or data storage media (that is, non-transitory storage media). In some examples, a network server (not shown) may receive encoded video data from source device 12 and provide the encoded video data to destination device 14, e.g., via network transmission. Similarly, a computing device of a medium production facility, such as a disc stamping facility, may receive encoded video data from source device 12 and produce a disc containing the encoded video data. Therefore, computer-readable medium 16 may be understood to include one or more computer-readable media of various forms, in various examples.

[0076] This disclosure may generally refer to video encoder 20 "signaling" certain information to another device, such as video decoder 30. It should be understood, however, that video encoder 20 may signal information by associating certain syntax elements with various encoded portions of video data. That is, video encoder 20 may "signal" data by storing certain syntax elements to headers or in payloads of various encoded portions of video data. In some cases, such syntax elements may be encoded and stored (e.g., stored to computer-readable medium 16) prior to being received and decoded by video decoder 30. Thus, the term "signaling" may generally refer to the communication of syntax or other data for decoding compressed video data, whether such communication occurs in real- or near-real-time or over a span of time, such as might occur when storing syntax elements to a medium at the time of encoding, which then may be retrieved by a decoding device at any time after being stored to this medium.

[0077] Input interface 28 of destination device 14 receives information from computer-readable medium 16. The information of computer-readable medium 16 may include syntax information defined by video encoder 20, which is also used by video decoder 30, that includes syntax elements that describe characteristics and/or processing of blocks and other coded units, e.g., GOPs. Display device 32 displays the decoded video data to a user, and may comprise any of a variety of display devices such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, a projection device, or another type of display device.

[0078] Although not shown in FIG. 1, in some aspects, video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include appropriate MUX-DEMUX units, or other hardware and software, to handle encoding of both audio and video in a common data stream or separate data streams. If applicable, MUX-DEMUX units may conform to the ITU H.223 multiplexer protocol, as one example, or other protocols such as the user datagram protocol (UDP).

[0079] Video encoder 20 and video decoder 30 each may be implemented as any of a variety of suitable encoder or decoder circuitry, as applicable, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic circuitry, software, hardware, firmware or any combinations thereof. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined video encoder/decoder (CODEC). A device including video encoder 20 and/or video decoder 30 may comprise an integrated circuit, a microprocessor, and/or a wireless communication device, such as a cellular telephone.

[0080] Video encoder 20 and video decoder 30 may operate according to a video coding standard, such as the HEVC standard and, more particularly, the 3D-HEVC extension of the HEVC standard, as referenced in this disclosure. HEVC presumes several additional capabilities of video coding devices relative to devices configured to perform coding according to other processes, such as, e.g., ITU-T H.264/AVC. For example, whereas H.264 provides nine intra-prediction encoding modes, the HEVC Test Model (HM) may provide as many as thirty-five intra-prediction encoding modes.

[0081] In general, HEVC specifies that a video picture (or "frame") may be divided into a sequence of treeblocks or largest coding units (LCU) that include both luma and chroma samples. Syntax data within a bitstream may define a size for the LCU, which is the largest coding unit in terms of the number of pixels. A slice includes a number of consecutive treeblocks in coding order. A picture may be partitioned into one or more slices. Each treeblock may be split into coding units (CUs) according to a quadtree. In general, a quadtree data structure includes one node per CU, with a root node corresponding to the treeblock. If a CU is split into four sub-CUs, the node corresponding to the CU includes four leaf nodes, each of which corresponds to one of the sub-CUs.

[0082] Each node of the quadtree data structure may provide syntax data for the corresponding CU. For example, a node in the quadtree may include a split flag, indicating whether the CU corresponding to the node is split into sub-CUs. Syntax elements for a CU may be defined recursively, and may depend on whether the CU is split into sub-CUs. If a CU is not split further, it is referred to as a leaf-CU. Four sub-CUs of a leaf-CU may also be referred to as leaf-CUs even if there is no explicit splitting of the original leaf-CU. For example, if a CU at 16x16 size is not split further, the four 8x8 sub-CUs will also be referred to as leaf-CUs although the 16x16 CU was never split.

[0083] A CU in HEVC has a similar purpose as a macroblock of the H.264 standard, except that a CU does not have a size distinction. For example, a treeblock may be split into four child nodes (also referred to as sub-CUs), and each child node may in turn be a parent node and be split into another four child nodes. A final, unsplit child node, referred to as a leaf node of the quadtree, comprises a coding node, also referred to as a leaf-CU. Syntax data associated with a coded bitstream may define a maximum number of times a treeblock may be split, referred to as a maximum CU depth, and may also define a minimum size of the coding nodes. Accordingly, a bitstream may also define a smallest coding unit (SCU). This disclosure uses the term "block" to refer to any of a CU, PU, or TU, in the context of HEVC, or similar data structures in the context of other standards (e.g., macroblocks and sub-blocks thereof in H.264/AVC).
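The recursive quadtree splitting of a treeblock into leaf-CUs can be sketched as follows. In this Python illustration, the callback `should_split` stands in for the split flag that a decoder would read from the bitstream (or that an encoder would decide), and `min_cu_size` plays the role of the smallest coding unit; both names, and the function `split_treeblock`, are hypothetical.

```python
def split_treeblock(x, y, size, min_cu_size, should_split):
    """Return the leaf-CUs of a treeblock as a list of (x, y, size) tuples.

    A node is split into four equally sized sub-CUs when it is larger than
    min_cu_size and should_split(x, y, size) returns True; otherwise the
    node is a leaf-CU.
    """
    if size > min_cu_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):          # top row of sub-CUs first
            for dx in (0, half):      # left to right within a row
                leaves.extend(
                    split_treeblock(x + dx, y + dy, half, min_cu_size, should_split)
                )
        return leaves
    return [(x, y, size)]


if __name__ == "__main__":
    # Split the 64x64 treeblock once, then split only its top-left 32x32 CU.
    def flag(x, y, size):
        return size == 64 or (size == 32 and x == 0 and y == 0)

    for leaf in split_treeblock(0, 0, 64, 8, flag):
        print(leaf)
```

The leaf-CUs are produced in a depth-first order that visits the four children of each node row by row.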

[0084] A CU includes a coding node and prediction units (PUs) and transform units (TUs) associated with the coding node. A size of the CU corresponds to a size of the coding node and must be square in shape. The size of the CU may range from 8x8 pixels up to the size of the treeblock with a maximum of 64x64 pixels or greater. Each CU may contain one or more PUs and one or more TUs. Syntax data associated with a CU may describe, for example, partitioning of the CU into one or more PUs. Partitioning modes may differ between whether the CU is skip or direct mode encoded, intra-prediction mode encoded, or inter-prediction mode encoded. PUs may be partitioned to be non-square in shape, or include partitions that are non-rectangular in shape, in the case of depth coding as described in this disclosure. Syntax data associated with a CU may also describe, for example, partitioning of the CU into one or more TUs according to a quadtree. A TU can be square or non-square (e.g., rectangular) in shape.

[0085] The HEVC standard allows for transformations according to TUs, which may be different for different CUs. The TUs are typically sized based on the size of PUs within a given CU defined for a partitioned LCU, although this may not always be the case. The TUs are typically the same size or smaller than the PUs. In some examples, residual samples corresponding to a CU may be subdivided into smaller units using a quadtree structure known as "residual quad tree" (RQT). The leaf nodes of the RQT may be referred to as transform units (TUs). Pixel difference values associated with the TUs may be transformed to produce transform coefficients, which may be quantized.

[0086] A leaf-CU may include one or more prediction units (PUs). In general, a PU represents a spatial area corresponding to all or a portion of the corresponding CU, and may include data for retrieving reference samples for the PU. The reference samples may be pixels from a reference block. In some examples, the reference samples may be obtained from a reference block, or generated, e.g., by interpolation or other techniques. A PU also includes data related to prediction. For example, when the PU is intra-mode encoded, data for the PU may be included in a residual quadtree (RQT), which may include data describing an intra-prediction mode for a TU corresponding to the PU. As another example, when the PU is inter-mode encoded, the PU may include data defining one or more motion vectors for the PU. The data defining the motion vector for a PU may describe, for example, a horizontal component of the motion vector, a vertical component of the motion vector, a resolution for the motion vector (e.g., one-quarter pixel precision or one-eighth pixel precision), a reference picture to which the motion vector points, and/or a reference picture list (e.g., List 0, List 1, or List C) for the motion vector.

[0087] A leaf-CU having one or more PUs may also include one or more transform units (TUs). The transform units may be specified using an RQT (also referred to as a TU quadtree structure), as discussed above. For example, a split flag may indicate whether a leaf-CU is split into four transform units. Then, each transform unit may be split further into further sub-TUs. When a TU is not split further, it may be referred to as a leaf-TU. Generally, for intra coding, all the leaf-TUs belonging to a leaf-CU share the same intra prediction mode. That is, the same intra-prediction mode is generally applied to calculate predicted values for all TUs of a leaf-CU. For intra coding, a video encoder 20 may calculate a residual value for each leaf-TU using the intra prediction mode, as a difference between the portion of the CU corresponding to the TU and the original block. A TU is not necessarily limited to the size of a PU. Thus, TUs may be larger or smaller than a PU. For intra coding, a PU may be collocated with a corresponding leaf-TU for the same CU. In some examples, the maximum size of a leaf-TU may correspond to the size of the corresponding leaf-CU.

[0088] Moreover, TUs of leaf-CUs may also be associated with respective quadtree data structures, referred to as residual quadtrees (RQTs). That is, a leaf-CU may include a quadtree indicating how the leaf-CU is partitioned into TUs. The root node of a TU quadtree generally corresponds to a leaf-CU, while the root node of a CU quadtree generally corresponds to a treeblock (or LCU). TUs of the RQT that are not split are referred to as leaf-TUs. In general, this disclosure uses the terms CU and TU to refer to a leaf-CU and leaf-TU, respectively, unless noted otherwise.

[0089] A video sequence typically includes a series of pictures. As described herein, "picture" and "frame" may be used interchangeably. That is, a picture containing video data may be referred to as a video frame, or simply a "frame." A group of pictures (GOP) generally comprises a series of one or more of the video pictures. A GOP may include syntax data in a header of the GOP, a header of one or more of the pictures, or elsewhere, that describes a number of pictures included in the GOP. Each slice of a picture may include slice syntax data that describes an encoding mode for the respective slice. Video encoder 20 typically operates on video blocks within individual video slices in order to encode the video data. A video block may correspond to a coding node within a CU. The video blocks may have fixed or varying sizes, and may differ in size according to a specified coding standard.

[0090] As an example, HEVC supports prediction in various PU sizes. Assuming that the size of a particular CU is 2Nx2N, HEVC supports intra-prediction in PU sizes of 2Nx2N or NxN, and inter-prediction in symmetric PU sizes of 2Nx2N, 2NxN, Nx2N, or NxN. A PU having a size of 2Nx2N represents an undivided CU, as it is the same size as the CU in which it resides. The HM also supports asymmetric partitioning for inter-prediction in PU sizes of 2NxnU, 2NxnD, nLx2N, and nRx2N. In asymmetric partitioning, one direction of a CU is not partitioned, while the other direction is partitioned into 25% and 75%. The portion of the CU corresponding to the 25% partition is indicated by an "n" followed by an indication of "Up," "Down," "Left," or "Right." Thus, for example, "2NxnU" refers to a 2Nx2N CU that is partitioned horizontally with a 2Nx0.5N PU on top and a 2Nx1.5N PU on bottom.
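The partition geometry described above can be tabulated in a few lines. The following Python sketch is illustrative only; the function name `pu_sizes` and the string keys are hypothetical labels chosen to match the partition names in the text.

```python
def pu_sizes(part_mode, cu_size):
    """Return the (width, height) of each PU for a 2Nx2N CU of width cu_size."""
    n = cu_size // 2   # N
    q = cu_size // 4   # 0.5N, the 25% portion in asymmetric partitioning
    table = {
        "2Nx2N": [(cu_size, cu_size)],
        "2NxN": [(cu_size, n)] * 2,
        "Nx2N": [(n, cu_size)] * 2,
        "NxN": [(n, n)] * 4,
        "2NxnU": [(cu_size, q), (cu_size, cu_size - q)],
        "2NxnD": [(cu_size, cu_size - q), (cu_size, q)],
        "nLx2N": [(q, cu_size), (cu_size - q, cu_size)],
        "nRx2N": [(cu_size - q, cu_size), (q, cu_size)],
    }
    return table[part_mode]


# For a 32x32 CU (N = 16), 2NxnU gives a 32x8 PU on top and a 32x24 PU below.
example = pu_sizes("2NxnU", 32)
```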

[0091] In this disclosure, "NxN" and "N by N" may be used interchangeably to refer to the pixel dimensions of a video block in terms of vertical and horizontal dimensions, e.g., 16x16 pixels or 16 by 16 pixels. In general, a 16x16 block will have 16 pixels in a vertical direction (y = 16) and 16 pixels in a horizontal direction (x = 16). Likewise, an NxN block generally has N pixels in a vertical direction and N pixels in a horizontal direction, where N represents a nonnegative integer value. The pixels in a block may be arranged in rows and columns. Moreover, blocks need not necessarily have the same number of pixels in the horizontal direction as in the vertical direction. For example, blocks may comprise NxM pixels, where M is not necessarily equal to N.

[0092] Following intra-predictive or inter-predictive coding using the PUs of a CU, video encoder 20 may calculate residual data for the TUs of the CU. The PUs may comprise syntax data describing a method or mode of generating predictive pixel data in the spatial domain (also referred to as the pixel domain) and the TUs may comprise coefficients in the transform domain following application of a transform, e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform to residual video data. The residual data may correspond to pixel differences between pixels of the unencoded picture and prediction values corresponding to the PUs. Video encoder 20 may form the TUs including the residual data for the CU, and then transform the TUs to produce transform coefficients for the CU.

[0093] Following any transforms to produce transform coefficients, video encoder 20 may perform quantization of the transform coefficients. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the coefficients, providing further compression. The quantization process may reduce the bit depth associated with some or all of the coefficients. For example, an n-bit value may be rounded down to an m-bit value during quantization, where n is greater than m.
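The bit-depth reduction mentioned above can be illustrated with a simple right shift. This Python sketch shows only the "n-bit rounded down to m-bit" example from the text; it is not the HEVC quantizer, which scales coefficients by a step size derived from a quantization parameter. The function name is hypothetical.

```python
def round_down_bits(value, n, m):
    """Round a non-negative n-bit value down to an m-bit value (n >= m)."""
    if m > n:
        raise ValueError("m must not exceed n")
    if not 0 <= value < (1 << n):
        raise ValueError("value does not fit in n bits")
    # Discard the (n - m) least significant bits.
    return value >> (n - m)


# The 8-bit value 0b10110111 (183) becomes the 4-bit value 0b1011 (11).
reduced = round_down_bits(0b10110111, 8, 4)
```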

[0094] Following quantization, video encoder 20 may scan the transform coefficients, producing a one-dimensional vector from the two-dimensional matrix including the quantized transform coefficients. The scan may be designed to place higher energy (and therefore lower frequency) coefficients at the front of the array and to place lower energy (and therefore higher frequency) coefficients at the back of the array.
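A minimal Python sketch of such a scan follows. The classic zig-zag order is assumed here purely for illustration, because it visits low-frequency positions (near the top-left corner) first; HEVC defines its own scan patterns, and the function name is hypothetical.

```python
def zigzag_scan(block):
    """Serialize a square 2D coefficient matrix into a 1D list in zig-zag order."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal first; alternate the direction of travel
        # along each anti-diagonal depending on its parity.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]


# The matrix below is numbered in zig-zag order, so the scan returns 1..9.
vector = zigzag_scan([[1, 2, 6],
                      [3, 5, 7],
                      [4, 8, 9]])
```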

[0095] In some examples, video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to produce a serialized vector that can be entropy encoded. In other examples, video encoder 20 may perform an adaptive scan. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy encode the one-dimensional vector, e.g., according to context-adaptive variable length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), Probability Interval Partitioning Entropy (PIPE) coding or another entropy encoding methodology. Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 in decoding the video data.

[0096] Video encoder 20 may further send syntax data, such as block-based syntax data, picture-based syntax data, and GOP-based syntax data, to video decoder 30, e.g., in a picture header, a block header, a slice header, or a GOP header. The GOP syntax data may describe a number of pictures in the respective GOP, and the picture syntax data may indicate an encoding/prediction mode used to encode the corresponding picture.

[0097] Video encoder 20 and/or video decoder 30 may intra-code depth data. In addition, in accordance with examples of this disclosure, video encoder 20 and/or video decoder 30 may inter-code depth data. In particular, video encoder 20 and/or video decoder 30 may perform partition-based inter-coding of depth data, using non-rectangular partitions, and may perform an average value-based residual coding process for depth inter coding similar to simplified depth coding (SDC) of residual data for depth intra coding, as will be described.

[0098] For example, in 3D-HEVC, video encoder 20 and/or video decoder 30 may use Depth Modeling Modes (DMMs) to code a prediction unit of a depth slice. In some instances, four DMMs may be available for intra-coding depth data. In all four modes, video encoder 20 and/or video decoder 30 partitions a depth block into more than one region, as specified by a DMM pattern. Video encoder 20 and/or video decoder 30 then generates a predicted depth value for each region, which may be referred to as a "DC" predicted depth value that is based on the values of neighboring depth samples.

[0099] The DMM pattern may be explicitly signaled, predicted from spatially neighboring depth blocks, and/or predicted from a co-located texture block. For example, a first DMM (e.g., DMM mode 1) may include signaling starting and/or ending points of a partition boundary of a depth block. A second DMM (e.g., DMM mode 2) may include predicting partition boundaries of a depth block based on a spatially neighboring depth block. Third and fourth DMMs (e.g., DMM mode 3 and DMM mode 4) may include predicting partition boundaries of a depth block based on a co-located texture block of the depth block.

[0100] With four DMMs available, there may be signaling associated with each of the four DMMs (e.g., DMM modes 1-4). For example, video encoder 20 may select a DMM to code a depth PU based on a rate-distortion optimization. Video encoder 20 may provide an indication of the selected DMM in an encoded bitstream with the encoded depth data. Video decoder 30 may parse the indication from the bitstream to determine the appropriate DMM for decoding the depth data. In some instances, a fixed length code may be used to indicate a selected DMM. In addition, the fixed length code may also indicate whether a prediction offset (associated with a predicted DC value) is applied.

[0101] As noted above, a video encoder 20 and/or video decoder 30 configured in accordance with one or more aspects of this disclosure may further include features for applying partition-based coding, e.g., using partitions defined by DMMs, for inter-coding, and may apply a simplified depth coding for residual data associated with inter-predicted blocks. Some aspects of this disclosure may further relate to reducing complexity and promoting efficiency of partition-based inter-coding of depth data in conjunction with simplified depth coding. In various examples, techniques are described for inter-coding of non-rectangular partitions of PUs of depth view components.

[0102] FIG. 2 is a diagram illustrating intra prediction modes used in high efficiency video coding (HEVC). FIG. 2 generally illustrates the prediction directions associated with various directional intra-prediction modes available for intra-coding in HEVC. In the current HEVC standard, for the luma component of each Prediction Unit (PU), an intra prediction method is utilized with 33 angular prediction modes (indexed from 2 to 34), DC mode (indexed with 1) and Planar mode (indexed with 0), as shown in FIG. 2.

[0103] With planar mode, prediction is performed using a so-called "plane" function. With DC mode, prediction is performed based on an averaging of pixel values within the block. With a directional prediction mode, prediction is performed based on a neighboring block's reconstructed pixels along a particular direction (as indicated by the mode). In general, the tail end of the arrows shown in FIG. 2 represents a relative one of neighboring pixels from which a value is retrieved, while the head of the arrows represents the direction in which the retrieved value is propagated to form a predictive block.

[0104] 3D-HEVC in MPEG will now be described in further detail. A Joint Collaboration Team on 3D Video Coding (JCT-3V) of VCEG and MPEG is developing a 3D video (3DV) standard based on HEVC, for which part of the standardization efforts includes the standardization of the multiview video codec based on HEVC (MV-HEVC) and another part for 3D Video coding based on HEVC (3D-HEVC), mentioned above. For 3D-HEVC, new coding tools, including those in coding unit (CU)/prediction unit (PU) level, for both texture and depth views may be included and supported.

[0105] Currently, the HEVC-based 3D Video Coding (3D-HEVC) codec in MPEG is based on the solutions proposed in documents m22570 and m22571. The full citation for m22570 is: Schwarz et al., Description of 3D Video Coding Technology Proposal by Fraunhofer HHI (HEVC compatible configuration A), MPEG Meeting ISO/IEC JTC1/SC29/WG11, Doc. MPEG11/M22570, Geneva, Switzerland, November/December 2011. The full citation for m22571 is: Schwarz et al., Description of 3D Video Technology Proposal by Fraunhofer HHI (HEVC compatible; configuration B), MPEG Meeting - ISO/IEC JTC1/SC29/WG11, Doc. MPEG11/M22571, Geneva, Switzerland, November/December 2011. The latest reference software HTM version 7.0 for the 3D-HEVC standard presently under development can be downloaded from the following link:

[HTM-7.0]: https://hevc.hhi.fraunhofer.de/svn/svn_3DVCSoftware/tags/HTM-7.0/

The latest software description (document number: D1005) as well as the working draft of the 3D-HEVC standard is available from the following link: http://phenix.it-sudparis.eu/jct2/doc_end_user/documents/4_Incheon/wg11/JCT3V-D1005-v1.zip

The link immediately above includes the following documents: D1005_spec_v1 and JCT3V-D1005_v1. These documents are identified as follows: Gerhard Tech, Krzysztof Wegner, Ying Chen, Sehoon Yea, "3D-HEVC Test Model 4," JCT3V-D1005_spec_v1, Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 4th Meeting: Incheon, KR, 20-26 April 2013, and Gerhard Tech, Krzysztof Wegner, Ying Chen, Sehoon Yea, "3D-HEVC Test Model 4," JCT3V-D1005_v1, Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 4th Meeting: Incheon, KR, 20-26 April 2013 (collectively hereinafter "D1005" or "WD3").

[0106] In 3D-HEVC, each access unit contains multiple view components, each of which contains a unique view id, or view order index, or layer id. A view component contains a texture view component as well as a depth view component, as described above. A texture view component is coded as one or more texture slices, while the depth view component is coded as one or more depth slices.

[0107] When 3D video data is represented using the multiview video plus depth format, texture view components are associated with corresponding depth video components, which are coded and multiplexed in a 3D video bitstream by video encoder 20. Video encoder 20 and/or video decoder 30 code the depth maps in the depth view components as grayscale luma samples to represent the depth values, and may use conventional intra- and inter-coding methods for depth map coding.

[0108] Depth maps are characterized by sharp edges and constant areas. Accordingly, due to the different statistics of depth map samples, different coding schemes have been designed for coding of depth maps by video encoder 20 and/or video decoder 30, based on a 2D video codec.

[0109] In 3D-HEVC, the same definition of intra prediction modes as in HEVC is utilized. In 3D-HEVC, Depth Modeling Modes (DMMs) are introduced together with the HEVC intra prediction modes, e.g., as described above with reference to FIG. 2, to code an Intra prediction unit of a depth slice.

[0110] For better representations of sharp edges in depth maps, the current reference software HTM applies a DMM method for intra coding of a depth map. There are four new intra modes in DMM for 3D-HEVC. In all four modes, a depth block is partitioned into two regions specified by a DMM pattern, where each region is represented by a constant value. The DMM pattern can be either explicitly signaled (DMM mode 1), predicted by spatially neighboring blocks (DMM mode 2), or predicted by a co-located texture block (DMM mode 3 and DMM mode 4).

[0111] There are two types of partitioning models defined in the DMM, including Wedgelet partitioning and Contour partitioning. FIG. 3 is a diagram illustrating an example of a Wedgelet partition pattern for use in coding an 8x8 block of pixel samples. FIG. 4 is a diagram illustrating an example of a contour partition pattern for use in coding an 8x8 block of pixel samples.

[0112] Hence, as one example, FIG. 3 provides an illustration of a Wedgelet pattern for an 8x8 block. For a Wedgelet partition, a depth block is partitioned into two regions by a straight line, with a start point located at (Xs, Ys) and an end point located at (Xe, Ye), as illustrated in FIG. 3, where the two regions are labeled P0 and P1. Each pattern consists of an array of uB x vB binary digits, each labeling whether the corresponding sample belongs to region P0 or P1, where uB and vB represent the horizontal and vertical size of the current PU, respectively. The regions P0 and P1 are represented in FIG. 3 by white and shaded samples, respectively. The Wedgelet patterns are initialized at the beginning of both encoding and decoding.

[0113] FIG. 4 shows a contour pattern for an 8x8 block. For Contour partitioning, a depth block can be partitioned into two irregular regions, as shown in FIG. 4. Contour partitioning is more flexible than Wedgelet partitioning, but is difficult to signal explicitly. In DMM mode 4, a contour partitioning pattern is implicitly derived using reconstructed luma samples of the co-located texture block.

[0114] The DMM method is integrated as an alternative to the intra prediction modes specified in HEVC. A one-bit flag is signaled for each PU to specify whether DMM or unified intra prediction is applied.

[0115] With reference to FIGS. 3 and 4, each individual square within depth blocks 40 and 60 represents a respective individual pixel of depth blocks 40 and 60, respectively. Numeric values within the squares represent whether the corresponding pixel belongs to region 42 (value "0" in the example of FIG. 3) or region 44 (value "1" in the example of FIG. 3). Shading is also used in FIG. 3 to indicate whether a pixel belongs to region 42 (white squares) or region 44 (grey shaded squares).

[0116] As discussed above, each pattern (that is, both Wedgelet and Contour) may be defined by an array of uB x vB binary digits, each labeling whether the corresponding sample (that is, pixel) belongs to region P0 or P1 (where P0 corresponds to region 42 in FIG. 3 and region 62 in FIG. 4, and P1 corresponds to region 44 in FIG. 3 and region 64A, 64B in FIG. 4), where uB and vB represent the horizontal and vertical size of the current PU, respectively. In the examples of FIG. 3 and FIG. 4, the PU corresponds to blocks 40 and 60, respectively. Video coders, such as video encoder 20 and video decoder 30, may initialize Wedgelet patterns at the beginning of coding, e.g., the beginning of encoding or the beginning of decoding.

[0117] As shown in the example of FIG. 3, for a Wedgelet partition, depth block 40 is partitioned into two regions, region 42 and region 44, by straight line 46, with start point 48 located at (Xs, Ys) and end point 50 located at (Xe, Ye). In the example of FIG. 3, start point 48 may be defined as point (8, 0) and end point 50 may be defined as point (0, 8).
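One way to picture how a binary Wedgelet pattern follows from a start point and an end point is a side-of-line test on each sample centre, sketched below in Python. This is an assumption made for illustration: the text does not specify how the pattern is derived from the line, and a reference implementation may rasterize the line and fill regions differently. The function name, the choice of which side is labeled 1, and the handling of samples lying exactly on the line are all hypothetical.

```python
def wedgelet_pattern(width, height, start, end):
    """Return a height x width array of 0/1 labels for a straight partition line.

    start and end are (x, y) points on the block boundary. Samples on one
    side of the line (or exactly on it) get 0; samples on the other side get 1.
    """
    xs, ys = start
    xe, ye = end
    pattern = []
    for y in range(height):
        row = []
        for x in range(width):
            # Twice the cross product of (end - start) with
            # (sample centre - start); doubling keeps everything in integers.
            cross2 = ((xe - xs) * (2 * y + 1 - 2 * ys)
                      - (ye - ys) * (2 * x + 1 - 2 * xs))
            row.append(1 if cross2 < 0 else 0)
        pattern.append(row)
    return pattern


# Start point (8, 0) and end point (0, 8) on an 8x8 block, as in the example above.
pattern = wedgelet_pattern(8, 8, (8, 0), (0, 8))
```

With these endpoints the top-left sample falls in the region labeled 0 and the bottom-right sample in the region labeled 1.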

[0118] As shown in the example of FIG. 4, for Contour partitioning, a depth block, such as depth block 60, can be partitioned into two irregularly-shaped regions. In the example of FIG. 4, depth block 60 is partitioned into region 62 and region 64A, 64B using contour partitioning. Although pixels in region 64A are not immediately adjacent to pixels in region 64B, regions 64A and 64B may be defined to form one single region, for the purposes of predicting a PU of depth block 60. Contour partitioning may be more flexible than the Wedgelet partitioning, but may be relatively more difficult to signal. In DMM mode 4, in the case of 3D-HEVC, the contour partitioning pattern is implicitly derived using reconstructed luma samples of the co-located texture block.

[0119] In this manner, a video coder, such as video encoder 20 and video decoder 30 of FIG. 1, and FIGS. 8 and 9 described below, may use line 46, as defined by start point 48 and end point 50, to determine whether a pixel of depth block 40 belongs to region 42 (which may also be referred to as region "P0") or to region 44 (which may also be referred to as region "P1"), as shown in FIG. 3. Likewise, in some examples, a video coder may use lines 66, 68 of FIG. 4 to determine whether a pixel of depth block 60 belongs to region 64A (which may also be referred to as region "P0") or to region 64B (which may also be referred to as region "P1"). Regions "P0" and "P1" are default naming conventions for different regions partitioned according to DMM, and thus, region P0 of depth block 40 would not be considered the same region as region P0 of depth block 60.

[0120] Region boundary chain coding is another mode in 3D-HEVC. Region boundary chain coding mode is introduced together with the HEVC intra prediction modes and DMM modes to code an intra prediction unit of a depth slice. For brevity, "region boundary chain coding mode" is denoted by "chain coding" in the text, tables and figures described below.

[0121] Chain coding of a PU is signaled with a starting position of the chain, the number of chain codes and, for each chain code, a direction index. A chain is a connection between a sample and one of its eight-connectivity samples. FIG. 5 illustrates eight possible types of chains defined in a chain coding process. FIG. 6 illustrates region boundary chain coding mode with one depth prediction unit (PU) partition pattern and the coded chains in chain coding. One example of the chain coding process is illustrated in FIGS. 5 and 6. As shown in FIG. 5, there are eight different types of chain, each assigned a direction index ranging from 0 to 7.

[0122] To signal the arbitrary partition pattern shown in FIG. 6, a video encoder identifies the partition pattern and encodes the following information in the bitstream:

1. One bit "0" is encoded to signal that the chains start from the top boundary

2. Three bits "011" are encoded to signal the starting position "3" at the top boundary

3. Four bits "0110" are encoded to signal the total number of chains as 7

4. A series of connected chain indexes "3, 3, 3, 7, 1, 1, 1" is encoded, where each chain index is converted to a code word using a look-up-table.

[0123] As shown in block 70 of FIG. 5, there are 8 different types of chain, each assigned a direction index ranging from 0 to 7. The chain direction types may aid a video coder in determining partitions of a depth block. Note that, instead of directly coding the direction index (0 ... 7), differential coding may be applied for signaling the direction index.

[0124] The example of FIG. 6 includes a first partition 72 and a second partition 74 separated by a chain 76 that indicates the partitioning structure. A video encoder (such as video encoder 20) may determine and signal chain 76 for a PU in an encoded bitstream, while a video decoder (such as video decoder 30) may parse data representing chain 76 from an encoded bitstream.

[0125] In general, chain 76 includes a starting position, an indication of a number of links in the chain (e.g., a number of chain codes), and for each chain code, a direction index. To signal the arbitrary partition pattern shown in the example of FIG. 6, video encoder 20 may encode one bit (e.g., 0) to indicate that chain 76 begins from the top boundary. Video encoder 20 may encode three bits (e.g., 011) to indicate that chain 76 begins after the third depth sample of the top boundary. Video encoder 20 may encode four bits (e.g., 0110) to indicate that there are 7 total links in chain 76. Video encoder 20 may also encode a series of connected chain indexes (e.g., 3, 3, 3, 7, 1, 1, 1) to indicate the direction of each chain link (e.g., in accordance with block 70). In some examples, video encoder 20 may convert each index to a code word using a look-up-table. A video decoder, such as video decoder 30, may parse the signaling described above to determine the partitioning pattern of a block. Video decoder 30 may then decode depth values for each partition.
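The fixed-length part of this signaling can be reproduced with a short Python sketch. Two points are assumptions rather than statements from the text: the four-bit field is taken to carry the number of links minus one (which is consistent with "0110" signaling 7 links), and a boundary bit of "1" is taken to mean a boundary other than the top. The look-up-table for direction indexes is not given in the text, so the direction code words are not generated here. The function name is hypothetical.

```python
def chain_header_bits(starts_on_top, start_pos, num_links):
    """Build the fixed-length header bits of a chain-coded partition boundary."""
    if not 0 <= start_pos < 8:
        raise ValueError("start position must fit in three bits")
    if not 1 <= num_links <= 16:
        raise ValueError("number of links must fit in four bits after minus-one coding")
    boundary_bit = "0" if starts_on_top else "1"   # "1" is an assumed value
    position_bits = format(start_pos, "03b")
    count_bits = format(num_links - 1, "04b")      # assumed minus-one coding
    return boundary_bit + position_bits + count_bits


# Top boundary, starting position 3, seven links: "0" + "011" + "0110".
header = chain_header_bits(True, 3, 7)
```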

[0126] In 3D-HEVC, simplified depth coding (SDC) mode is introduced together with the HEVC intra prediction modes, DMM modes and chain coding mode to code an intra PU of a depth slice. For 3D-HEVC, video encoder 20 signals an additional flag for each intra depth PU to specify whether the current PU is coded using SDC modes. In the current 3D-HEVC, SDC is only applied for a 2Nx2N PU partition size, and is not applied for PU partition sizes of less than 2Nx2N.

[0127] When SDC is used, video encoder 20 does not include individual residual values for all samples in a depth block, and does not generate quantized transform coefficients. Instead of coding quantized transform coefficients, in SDC modes, video encoder 20 represents a depth block with the following types of information:

1. The type of partition of the current depth block, including

a. DC (1 partition)

b. DMM mode 1 (2 partitions)

c. Planar (1 partition)

2. For each partition, a residual value is signaled in the bitstream

[0128] Hence, in SDC, video encoder 20 encodes only one residual for each PU of an intra-coded depth CU. For each PU, instead of coding the differences for each pixel, video encoder 20 determines a difference between an average value of the original signal (i.e., an average value of the pixels in the block to be coded) and an average value of the prediction signal (i.e., an average value of the pixel samples in the predictive block), uses this difference as the residual for all pixels in the PU, and signals this single residual value to video decoder 30.
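The average-value residual described above can be sketched in Python as follows. The rounding rule for the averages, the clipping to an 8-bit sample range, and the function names are assumptions made for the example; the text specifies only that one difference of averages is used for all pixels.

```python
def mean_int(samples):
    """Integer mean with round-half-up (the rounding rule is an assumption)."""
    return (sum(samples) + len(samples) // 2) // len(samples)


def sdc_residual(original, prediction):
    """Single residual: average of the original minus average of the prediction."""
    return mean_int(original) - mean_int(prediction)


def sdc_reconstruct(prediction, residual, max_value=255):
    """Add the same residual to every predicted sample, clipped to the sample range."""
    return [min(max(p + residual, 0), max_value) for p in prediction]


# One partition of four depth samples and its prediction.
original = [100, 102, 98, 100]
prediction = [90, 90, 90, 90]
residual = sdc_residual(original, prediction)
reconstructed = sdc_reconstruct(prediction, residual)
```

The reconstruction recovers the average of the partition but not its per-sample variation, which is the trade-off that SDC accepts for regions of near-constant depth.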

[0129] For 3D-HEVC, three sub-modes are defined in SDC, namely SDC mode 1, SDC mode 2, and SDC mode 3, which correspond to the partition types of DC, DMM mode 1, and Planar, respectively. In SDC, as mentioned above, no transform or quantization is applied by video encoder 20. Likewise, in SDC, video decoder 30 does not apply inverse quantization or inverse transform operations.

[0130] The depth values can be optionally mapped to indexes using a Depth Lookup Table (DLT), which is constructed by analyzing the frames within a first intra period before encoding a full video sequence. If DLT is used, the entire DLT is transmitted by video encoder 20 to video decoder 30 in a sequence parameter set (SPS), and decoded index values are mapped back to depth values by video decoder 30 based on the DLT. With the use of DLT, further coding gain is observed.

[0131] For the signaling of residual in SDC modes, as described above, for each partition, the difference of the representative value of current partition and its predictor, i.e., residual, is signaled by video encoder 20 in the encoded bitstream without transform and quantization. It should be noted that the residual may be signaled by video encoder 20 using two different methods depending on the usage of DLT:

1. When DLT is not used, the delta between the representative value of a current partition and its predictor is directly transmitted.

2. When DLT is used, instead of directly signaling the difference of depth values, the difference of the indices to the DLT is signaled, i.e., the difference between the index of the representative value of the current partition and the index of the predictor in the DLT.

The corresponding decoding process is described in section H.8.4.4.3 of JCT3V-D1005.
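The two residual signaling methods can be contrasted with a small Python sketch. It is a sketch under stated assumptions, not the process of section H.8.4.4.3: the table is built simply from the set of depth values that occur, a value absent from the table is mapped to its nearest entry, and the decoded index is clamped to the table range. All function names are hypothetical.

```python
import bisect


def build_dlt(depth_samples):
    """Depth Lookup Table: the sorted list of depth values that actually occur."""
    return sorted(set(depth_samples))


def depth_to_index(dlt, value):
    """Index of the DLT entry nearest to value (nearest-entry mapping is an assumption)."""
    i = bisect.bisect_left(dlt, value)
    if i == 0:
        return 0
    if i == len(dlt):
        return len(dlt) - 1
    return i if dlt[i] - value < value - dlt[i - 1] else i - 1


def residual_without_dlt(representative, predictor):
    """Method 1: the delta of depth values is transmitted directly."""
    return representative - predictor


def residual_with_dlt(dlt, representative, predictor):
    """Method 2: the delta of DLT indices is transmitted."""
    return depth_to_index(dlt, representative) - depth_to_index(dlt, predictor)


def decode_with_dlt(dlt, predictor, index_residual):
    """Map the predictor to an index, add the residual, and map back to a depth value."""
    idx = depth_to_index(dlt, predictor) + index_residual
    return dlt[min(max(idx, 0), len(dlt) - 1)]


dlt = build_dlt([50, 50, 80, 120, 200, 80])
direct = residual_without_dlt(200, 80)
indexed = residual_with_dlt(dlt, 200, 80)
decoded = decode_with_dlt(dlt, 80, indexed)
```

When a depth map uses only a few distinct values, the index difference is much smaller in magnitude than the depth difference, which is one plausible source of the coding gain mentioned above.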

[0132] FIG. 7 is a diagram illustrating derivation of a motion vector inheritance (MVI) candidate for depth coding. Depth inter-coding in 3D-HEVC will now be discussed in more detail. Motion Vector Inheritance (MVI) for depth coding can be used to inherit motion information for a depth component from motion information for a corresponding texture component. The main idea behind motion vector inheritance (MVI) is to exploit the similarity of the motion characteristics between the texture images and associated depth images.

[0133] FIG. 7 shows a depth picture 80 and its corresponding texture picture 82, and a corresponding texture block 84 used for MVI for a current PU 86 in depth picture 80. For a given PU (current PU 86) in the depth image, an MVI candidate reuses the motion vectors and reference indices of the already coded co-located texture block 84, located by the center position of the current depth block 86, if it is available. FIG. 7 shows an example of the derivation process of the MVI candidate, where the corresponding texture block is selected as the 4x4 block located to the right bottom of the center of the current PU 86.

[0134] It should be noted that motion vectors with integer precision are used in depth coding, while quarter-sample precision of motion vectors is utilized for texture coding. Therefore, the motion vector of the corresponding texture block shall be rounded to integer precision before being used as an MVI candidate.
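The rounding step can be sketched in Python as follows. The specific rule used here (add two quarter-sample units, then shift right by two bits) is an assumption, since the text says only that the vector is rounded to integer precision; the function name is hypothetical.

```python
def round_mv_to_integer(mv_quarter):
    """Round a quarter-sample motion vector (x, y) to integer-sample units.

    Adds half an integer sample (2 quarter units) and shifts right by 2.
    Python's >> on negative integers floors, so the rule is round-half-up.
    """
    return tuple((c + 2) >> 2 for c in mv_quarter)


# (5, -3) in quarter samples is (1.25, -0.75) samples, which rounds to (1, -1).
mvi_candidate = round_mv_to_integer((5, -3))
```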

[0135] JCT3V-D0069, Liu et al., "Non-CE: Simplified Inter Mode Coding of Depth," Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 4th Meeting: Incheon, KR, 20-26 Apr. 2013, describes the introduction of a simplified inter mode coding for depth, in which an alternative approach for coding a residual signal is proposed. The basic idea of simplified inter mode coding for depth is to encode only one residual for each PU of an inter-coded depth CU (excluding a skipped depth CU), in a manner similar to SDC for intra mode coding. For each PU, instead of getting the differences between each pixel and a corresponding prediction sample, the difference between an average value of an original signal and an average value of a prediction signal is used as the residual for all pixels in the PU, and is signaled by video encoder 20 to video decoder 30 in the encoded bitstream. In the following paragraphs, this simplified inter mode coding for depth method is also referred to as average value-based inter-prediction, and the conventional inter-prediction mode (where each pixel has its own difference value included in the residual data) is referred to as pixel-based inter-prediction.

[0136] The CU-level flag to indicate the usage of average value-based inter-prediction mode is signaled by video encoder 20 after the signaling of the prediction mode (i.e., part_mode in JCT3V-C1005, Tech et al., "3D-HEVC Test Model 3," Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 3rd Meeting: Geneva, CH, 17-23 Jan. 2013, (hereinafter JCT3V-C1005)) and motion information of all PUs within one CU (i.e., prediction_unit in JCT3V-C1005).

[0137] Video encoder 20 will perform rate-distortion optimized selection to select between the pixel-based inter-prediction approach in 3D-HEVC and the proposed average value-based inter-prediction mode approach in JCT3V-D0069 when encoding a residual signal for an inter-coded depth CU (excluding a skipped depth CU), and encodes one flag at the CU level to indicate whether the average value-based inter-prediction mode or the pixel-based inter-prediction mode is selected.

[0138] In U.S. provisional application no. 61/813,478, filed April 18, 2013, it was proposed that the Wedgelet partition technique can be applied to inter depth coding together with an SDC mode similar to that used in 3D-HEVC intra coding.

[0139] Extending the SDC concept of JCT3V-D0069 for inter coding of non-regular partitions, including Wedgelet partition modes, may present some issues, especially in motion prediction. Some issues will be discussed below. First, a brief discussion of motion prediction modes in HEVC and 3D-HEVC is provided.

[0140] In general, motion information for inter-predictive coding of a PU may be obtained directly, or by using various prediction modes. Techniques for motion information prediction, sometimes referred to as motion vector prediction, may include a merge mode, skip mode, and advanced motion vector prediction (AMVP) mode. In general, according to merge mode and/or skip mode, a current video block, e.g., PU, inherits the motion information, e.g., motion vector, prediction direction, and reference picture index, from another, previously-coded neighboring block. The previously coded neighboring block may be a spatially-neighboring block in the same picture, or a co-located block in a temporal or inter-view reference picture. In some examples, motion information for a depth PU may be obtained from a spatial neighbor depth block or a corresponding texture block. Accordingly, a merge or AMVP candidate list for a depth block may include, in some examples, one or more candidates from a texture block. Merge, skip and AMVP modes may be considered motion vector prediction modes.

[0141] When implementing merge/skip mode, video encoder 20 constructs a list of merge candidates that include the motion information of the reference blocks in a defined manner, selects one of the merge candidates, e.g., based on a rate-distortion determination, and signals a candidate list index identifying the selected merge candidate to video decoder 30 in the bitstream. Video decoder 30, in implementing the merge/skip mode, receives this candidate list index, reconstructs the merge candidate list according to the defined manner, and selects the one of the merge candidates in the candidate list indicated by the index. The candidate list typically will include a number of spatial neighbors in the same picture or view, a temporal motion vector predictor (TMVP) corresponding to a co-located block in another picture or view, and other candidates that may be specified according to different modes.

[0142] For merge mode, video decoder 30 uses the motion vector of the selected one of the merge candidates to identify a prediction block in the same reference picture as the motion vector for the selected one of the merge candidates. Accordingly, at the decoder side, once the candidate list index is decoded, all of the motion information of the corresponding block of the selected candidate may be inherited for the block to be coded, such as, e.g., motion vector, prediction direction, and reference picture index. Merge mode and skip mode promote bitstream efficiency by allowing video encoder 20 to signal an index into the merge candidate list, rather than all of the motion information for inter-prediction of the current video block.

[0143] When implementing AMVP, video encoder 20 also constructs a list of candidate motion vector predictors in a defined manner. The list of candidates for AMVP may be the same as or similar to the list of candidates for merge or skip mode. Video encoder 20 selects one of the candidates, e.g., based on a rate-distortion determination, and signals a candidate list index identifying the selected candidate to video decoder 30 in the bitstream. Similar to merge mode, when implementing AMVP, video decoder 30 reconstructs the list of candidate MVPs in the defined manner, decodes the candidate list index from encoder 20, and selects one of the candidates based on the candidate list index.

[0144] In contrast to the merge/skip mode, however, when implementing AMVP, video encoder 20 also signals a reference picture index and prediction direction, specifying the reference picture to which the motion vector predictor (MVP) specified by the candidate list index points. Further, video encoder 20 determines a motion vector difference (MVD) for the current block, where the MVD is a difference between the motion vector of the selected candidate and the actual motion vector that would otherwise be used for the current block. Hence, for AMVP, in addition to the reference picture index, prediction direction and candidate list index, video encoder 20 signals the MVD for the current block in the bitstream. Due to the signaling of the reference picture index and motion vector difference for a given block, AMVP may not be as efficient as merge/skip mode, but may provide improved fidelity of the coded video data. Also, AMVP typically requires encoder 20 to perform motion estimation for the block to be coded, and also generate a mode decision for the block.
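The difference in what is signaled for merge mode and for AMVP can be sketched as below. This is a simplified illustration under stated assumptions: the function names are hypothetical, candidates are reduced to plain tuples, the sign convention MVD = actual vector minus predictor is assumed, and the smallest absolute MVD stands in for the rate-distortion determination, which the text does not detail.

```python
def merge_decode(merge_candidates, merge_idx):
    """Merge mode: only an index is signaled; all motion info is inherited."""
    return merge_candidates[merge_idx]


def amvp_encode(actual_mv, predictors):
    """AMVP encoder side: pick a predictor and compute the MVD to signal.

    Assumption: the predictor with the smallest |MVD| is chosen as a
    stand-in for a real rate-distortion decision.
    """
    def cost(i):
        return (abs(actual_mv[0] - predictors[i][0])
                + abs(actual_mv[1] - predictors[i][1]))

    idx = min(range(len(predictors)), key=cost)
    mvp = predictors[idx]
    mvd = (actual_mv[0] - mvp[0], actual_mv[1] - mvp[1])
    return idx, mvd


def amvp_decode(predictors, idx, mvd):
    """AMVP decoder side: motion vector = predictor + signaled difference."""
    mvp = predictors[idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


predictors = [(4, 0), (0, 8)]
idx, mvd = amvp_encode((5, 1), predictors)   # motion estimation found (5, 1)
decoded_mv = amvp_decode(predictors, idx, mvd)
```

The AMVP path requires the encoder to know the actual motion vector, i.e., to have run motion estimation, whereas the merge path needs only the candidate list and an index.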

[0145] A number of issues associated with partition-based inter coding of depth data, e.g., in conjunction with average value-based inter prediction, will now be discussed. As one issue, when an average value-based inter-prediction mode similar to SDC is extended for use in inter-coding of non-rectangular partitions, such as Wedgelet partitions, in the AMVP mode, all of the possible partitions of a CU need to be checked at the encoder for motion estimation and mode decision. As a result, the use of an average value-based inter-prediction mode similar to SDC for inter coding of non-rectangular partitions can present a huge increase in complexity at encoder 20. For example, video encoder 20 must perform motion estimation for numerous possible non-rectangular partitions, e.g., to generate motion vectors for generation of motion vector difference (MVD) values, relative to the motion vectors of AMVP candidates. AMVP would also require the encoder 20 to generate a reference picture index and prediction direction for each non-rectangular partition, and then select an AMVP candidate based on a rate-distortion analysis. This added complexity may be undesirable for an encoder 20, causing excessive processing time, complexity, and power consumption.

[0146] As another problem, for each partition of a CU with partition mode equal to SIZE_2Nx2N, due to the different texture characteristics, the motion vectors belonging to different partitions are typically different when motion estimation is performed. However, based on the current motion prediction candidate list, i.e., merge candidate list, generation processes for motion vector prediction in HEVC, e.g., for merge/skip or AMVP mode, the candidate lists for the two partitions in a block may be the same. Therefore, it is highly possible that the merge candidate list may be inefficient for at least one partition, and possibly even inefficient for both partitions. In other words, using the same list of candidates for motion vector prediction, e.g., for merge/skip or AMVP, for two different non-rectangular partitions, such as two different Wedgelet partitions or Contour partitions, may not be optimal, and may impair coding efficiency.

[0147] As an additional problem, typically, motion fields are maintained by rectangular blocks. However, there is presently no mechanism to handle the management of motion information in non-rectangular blocks, including access of the motion information of each partition when decoding the current block and access of the motion information of the neighboring blocks which are coded with non-rectangular partitions, such as Wedgelet partitions or Contour partitions. That is, there is no mechanism to store and retrieve motion information for a non-rectangular partition for purposes of using the motion information to decode the non-rectangular partition. Also, there is no mechanism to store and retrieve motion information for a non-rectangular partition for purposes of using the motion information to predict motion information for another block to be coded, e.g., when the non-rectangular partition forms part of a spatial neighbor or temporal block used in merge/skip or AMVP mode.

[0148] Accordingly, there should be a rational method for selecting and storing motion information for a non-rectangular partition, so that a decoder 30 may readily retrieve the motion information to decode the non-rectangular partition. Likewise, if a non-rectangular partition is used as a motion vector prediction candidate for coding of another rectangular or non-rectangular prediction unit (PU) in a motion prediction mode such as merge/skip or AMVP mode, there should be a rational method for storing and retrieving such information for the non-rectangular partition.

[0149] Techniques related to depth map inter coding in 3D-HEVC are proposed in this disclosure to support non-rectangular partitioning of depth inter blocks that are inter-coded together with an average value-based inter prediction mode such as an SDC mode performed in a manner similar to that proposed in JCT3V-D0069. More specifically, a new mode, namely Non-Rectangular Depth Inter Partition (NRDIP), is introduced. In this new NRDIP mode, residual information may be skipped. In NRDIP mode, non-rectangular partitions may be inter-coded with respect to prediction blocks generated by motion estimation and compensation or by motion vector prediction, e.g., merge/skip or AMVP, using an average value-based inter prediction mode similar to SDC, e.g., with average value-based coding of a residual value for each partition. Aspects of various examples of an NRDIP mode are discussed below. Such aspects may be used by a video encoder 20 and/or video decoder 30 independently or in any suitable combination with one another.

[0150] In discussing these aspects, reference will be made to the proposed inter depth partition mode (NRDIP), which may generally refer to a mode in which a CU is partitioned into non-rectangular partitions, and residual information for such non-rectangular partitions is encoded using an average value-based residual coding, which may be similar to SDC used for intra depth partition mode coding, in that the residual for each non-rectangular partition may be coded as a residual value representing a difference between an average value of pixels in the non-rectangular partition and an average value of pixel samples in an inter-predictive block generated for the non-rectangular partition. The inter-predictive block may be generated directly, using motion estimation and motion compensation, or indirectly by motion information prediction processes such as merge/skip or AMVP coding.

[0151] The candidate lists used for motion information prediction will generally be referred to herein as a merge list or merge candidate list, or motion prediction candidate list, with the understanding that such a list may be used for merge mode, skip mode, or AMVP mode, unless indicated otherwise. Likewise, a merge candidate may generally refer to a candidate in a merge list that is used for merge, skip or AMVP mode. For example, the same or similar merge list may be used for merge, skip or AMVP modes. In some examples, however, AMVP or skip mode may be disabled when non-rectangular partitions are coded for a PU of a depth view component.

[0152] When reconstructing an NRDIP coded non-rectangular partition, the residual value generated using the average value-based residual coding process may be added to each of the pixel values in the predictive block. The residual value may be generated by video encoder 20 and signaled to video decoder 30, such that video decoder 30 may receive the residual value in the encoded bitstream. Alternatively, in some examples, the residual for a non-rectangular partition may be obtained by prediction of the residual, e.g., from other blocks or partitions. The inter depth coding processes of this disclosure are described with reference to Wedgelet partitions, for purposes of illustration, as an example of non-rectangular partitions, but may be applicable to coding other non-rectangular partitions.

[0153] 1. In a first aspect, the proposed inter depth partition mode (NRDIP) may be applied by a video encoder 20 or video decoder 30 to only CUs with 2Nx2N partition size, to limit both encoder complexity and signaling overhead as well as to reduce the cost of the motion vector maintenance for the two non-rectangular partitions of each PU inside a CU. Hence, NRDIP is applied only for a PU having a size of 2Nx2N, which represents an undivided CU, in that the PU is the same size as the CU in which it resides. In other words, a 2Nx2N PU is the same size as its CU.

[0154] In this example, video encoder 20 and video decoder 30 apply the NRDIP mode, e.g., with average value-based inter-prediction for non-rectangular partitions, only when the CU that includes the non-rectangular partitions has a size of 2Nx2N pixel samples. If a depth PU is not 2Nx2N in size, then NRDIP is not used. For example, if the depth PU is not 2Nx2N in size, the PU may not be partitioned into non-rectangular partitions and average value-based residual coding may not be used.

[0155] Instead, intra-coding of the non-rectangular partitions may be performed by encoder 20 and decoder 30, e.g., using SDC, or rectangular partitions may be used for inter coding of the PU. By requiring a 2Nx2N PU size for NRDIP with non-rectangular partitions, the number of non-rectangular partitions for which motion estimation and mode selection would need to be performed can be reduced at the encoder side. In addition, the number of non-rectangular PUs requiring storage of motion vectors and other motion information can be reduced, promoting efficiency and reducing complexity.

a. As an option, furthermore, the proposed inter depth partition mode may be applied by video encoder 20 and video decoder 30 to only CUs with a 2Nx2N partition size where the size of the CU is larger than or equal to a predetermined minimum size KxK. As an example, K may be equal to, e.g., 8, 16 or 32, where K refers to a number of pixel samples in the CU. Hence, video encoder 20 and video decoder 30 may impose a minimum size threshold on a PU, where the PU, which is 2Nx2N and therefore the same size as the CU, must be larger than or equal to a predetermined size.

b. As another option, either as an alternative or in addition to item a above, furthermore, the proposed inter depth partition mode may be applied by video encoder 20 and/or video decoder 30 to only CUs with 2Nx2N partition size where the size of the CUs is smaller than KxK. In this example, K may be equal to, e.g., 64, 32 or 16, where K in this instance refers to a number of pixel samples in the CU. Hence, video encoder 20 and video decoder 30 may impose a maximum size threshold on a PU, where the PU, which is 2Nx2N and therefore the same size as the CU, must be smaller than or equal to a predetermined size.

The minimum size threshold and maximum size threshold may be combined such that a CU must be larger than or equal to a first predetermined size and smaller than a second predetermined size, or larger than a first predetermined size and smaller than or equal to a second predetermined size, in order for video encoder 20 and video decoder 30 to apply the NRDIP mode, whereby the PU is partitioned using non-rectangular partitions and the partitions are inter-coded using average value-based residual coding, e.g., either by motion vector prediction with candidates from a merge list or by motion estimation.
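The eligibility conditions of this first aspect can be summarized in a short check. This is a hedged sketch: the function name `nrdip_allowed`, the string form of the partition mode, and the inclusive treatment of both bounds are assumptions, and the default thresholds of 8 and 64 are simply taken from the example values of K given above.

```python
def nrdip_allowed(part_mode, cu_size, min_size=8, max_size=64):
    """Return True if NRDIP may be applied to a CU.

    part_mode: partition mode of the CU, e.g. "2Nx2N" or "Nx2N".
    cu_size:   width (= height) of the CU in pixel samples.
    Assumption: both thresholds are inclusive; the text also allows
    variants in which one of the bounds is strict.
    """
    if part_mode != "2Nx2N":
        # The PU must be the same size as its CU.
        return False
    return min_size <= cu_size <= max_size


# A 16x16 undivided CU qualifies; a CU split into two Nx2N PUs does not.
eligible = nrdip_allowed("2Nx2N", 16)
not_eligible = nrdip_allowed("Nx2N", 16)
```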

A candidate from a merge list may be selected for prediction of motion information for a non-rectangular partition, e.g., based on a rate-distortion decision. For example, video encoder 20 may select the candidate that generates the best, or at least an acceptable, balance between distortion and required bit rate for coding the non-rectangular partition when a particular merge list candidate is selected for prediction of motion information.

[0156] 2. In a second aspect, the AMVP mode for the NRDIP mode is disabled by video encoder 20 and/or video decoder 30; therefore, the motion estimation for the NRDIP mode is not needed. In this manner, NRDIP may be applied by video encoder 20 and video decoder 30 to non-rectangular partitions with motion vector prediction based on merge mode or skip mode, where motion estimation and mode selection for the non-rectangular partitions are not necessary. However, video encoder 20 and video decoder 30 do not permit AMVP to be used with NRDIP.

[0157] For example, video encoder 20 and video decoder 30 do not apply AMVP to non-rectangular partitions when NRDIP is selected or, alternatively, do not apply NRDIP to non-rectangular partitions when AMVP mode is selected. Instead, by disabling AMVP for non-rectangular partitions, video encoder 20 or video decoder 30 may simplify the inter-coding of non-rectangular partitions using SDC coding. In particular, in some examples, video encoder 20 and video decoder 30 may only apply NRDIP to non-rectangular partitions that are not coded using AMVP mode, such as non-rectangular partitions that are coded using merge/skip mode or non-rectangular partitions that are coded directly using motion estimation and motion compensation.

[0158] Hence, video encoder 20 and video decoder 30 may be configured to inter-code non-rectangular partitions of a PU for a depth view component, but disable AMVP, so that motion vector prediction with a merge candidate list may be used for prediction of the motion information of a non-rectangular partition, but AMVP is not used. In this manner, a candidate spatial neighbor block or a candidate temporal motion vector predictor (TMVP) block from a merge candidate list may be selected for a non-rectangular partition by video encoder 20, and signaled to video decoder 30 by a merge index, and the motion information of the non-rectangular partition may be predicted based on the motion information of the selected block using merge mode or skip mode, but not AMVP mode.

[0159] With AMVP disabled, the motion information for the non-rectangular partition may be predicted as the motion information of the selected candidate block, e.g., by predicting, from the selected block, the motion vector and reference index to a reference picture list. However, with AMVP disabled, it is not necessary to perform motion estimation for the non-rectangular partition in order to generate additional motion information such as a motion vector difference between the motion vector of the non-rectangular partition and the motion vector of the selected block, or a prediction direction and reference picture index for the non-rectangular partition.

[0160] Hence, disabling AMVP for motion vector prediction when non-rectangular partitions are coded for a depth PU can simplify the inter-prediction process. For NRDIP, video encoder 20 may signal, and video decoder 30 may receive, residual data that is generated based on differences between inter-predictive reference samples generated using the predicted motion information and the pixels of the non-rectangular partition. Again, the residual data may be coded using an SDC-like process where a residual value represents a difference between an average value of the pixels of the non-rectangular partition and the average value of the inter-predictive reference samples generated using the predicted motion information. For example, a single residual value is then added to each of the predictive reference samples to reconstruct the original pixels of the coded partition. The inter-predictive samples may be generated, for example, by motion compensation of the non-rectangular partition or the PU using the motion vector and reference index of the predicted motion information.

a. As an option, in addition or as an alternative, the skip mode for the NRDIP mode may also be disabled by video encoder 20 and/or video decoder 30. In skip mode, video encoder 20 does not generate a residual value for a PU. Instead, the motion information of a candidate block of the merge list is accepted as the predicted motion information of the block to be coded, the non-rectangular partition in this case, and no residual is generated between the values of inter-predictive samples and the values of pixels of the non-rectangular partition.

Accordingly, skip mode is generally incompatible with the average value-based residual coding of SDC, in that no residual coding is performed in skip mode; therefore, skip mode can be disabled for the NRDIP mode when motion vector prediction is used to predict motion information for a non-rectangular partition of a PU of a depth view component.

[0161] 3. In a third aspect, after a motion vector is predicted for a non-rectangular partition, e.g., with a merge index to a candidate in a derived merge candidate list, motion compensation may need to be done by a video encoder 20 and/or video decoder 30 for a non-rectangular partition. As an example, the motion compensation for a non-rectangular partition may be done first by motion compensation for a rectangular block covering the non-rectangular partition. For example, motion information including a motion vector that is predicted by a merge candidate from the merge candidate list may be used to identify a rectangular prediction block that encompasses, i.e., covers, the non-rectangular partition. The prediction block may not be a non-rectangular partition, or at least a non-rectangular partition that corresponds to the non-rectangular partition that is being coded.

[0162] Upon generation of inter-predictive pixel samples of the rectangular prediction block, e.g., using the motion information of a selected merge candidate, a portion of the rectangular prediction block that corresponds to the samples of the non-rectangular partition may be extracted from the rectangular block to form the prediction samples for the non-rectangular partition being coded. The samples in the extracted portion of the rectangular block may then be used by a video encoder 20 and/or video decoder 30 to form the inter-prediction reference samples for the non-rectangular partition and a residual value indicating a difference between an average value of the samples in the extracted portion of the inter-predictive samples and an average value of the samples in the non-rectangular partition to be coded, e.g., using an average value-based residual coding mode similar to an SDC mode. In this manner, a rectangular prediction block identified by the motion vector is first accessed to form a rectangular prediction block for the non-rectangular partition to be coded, and then the suitable prediction samples, corresponding to the portion of the rectangular prediction block that covers the non-rectangular partition, are extracted for the non-rectangular partition.

[0163] FIG. 8 is a diagram illustrating generation of a motion compensated rectangular block forming a reference block used to obtain inter-predictive samples for a non-rectangular partition. In the example of FIG. 8, a PU of a depth view component of a picture includes non-rectangular partitions P0 and P1. Upon selection of a merge candidate from a merge candidate list for a non-rectangular partition for a motion prediction mode such as merge, skip or AMVP, the motion information for the selected candidate is used as predicted motion information by video encoder 20 or video decoder 30 to generate inter-predictive samples for the non-rectangular partition.

[0164] In this example, the motion information, e.g., a motion vector and reference index into a reference picture list (L0 or L1), is used for motion compensation of a rectangular block that encompasses, i.e., contains or covers, the non-rectangular partition. For example, a rectangular block that contains non-rectangular partition P0, within the PU, is motion compensated by the predicted motion information to generate motion compensated block MCB1. Motion compensated block MCB1 is sized to cover non-rectangular partition P0. In some examples, the rectangular block of the PU that is used to produce motion compensated rectangular block MCB1 with the predicted motion information may be the smallest rectangular block in the PU that contains the non-rectangular partition P0.

[0165] Upon generating the motion compensated block MCB1, a video encoder 20 or video decoder 30 may extract the inter-predictive samples, the P0 samples, that correspond to the non-rectangular partition P0. Likewise, video encoder 20 or video decoder 30 may generate a motion compensated block MCB2 using the predicted motion information and a rectangular block in the PU that is selected to contain non-rectangular partition P1. The motion compensated block may include, as inter-predictive samples, pixels identified by a predicted motion vector in a reference picture identified by a predicted reference index. The pixels used as inter-predictive reference samples may be generated at integer pixel precision, or possibly at a fractional pixel precision. The motion vector and reference index may be obtained from a selected candidate spatial neighbor block to the PU being coded, or from a temporal motion vector predictor, or from another candidate, such as a motion vector inheritance (MVI) candidate.

[0166] Upon generating the motion compensated block MCB2, a video encoder 20 or video decoder 30 may extract the inter-predictive samples, the P1 samples, that correspond to the non-rectangular partition P1. Likewise, upon extracting the pertinent inter-predictive samples, video encoder 20 generates residual data indicating a difference between an average value of the P0 samples and an average value of the pixels of non-rectangular partition P0. Similarly, upon extracting the pertinent inter-predictive samples, video encoder 20 generates residual data indicating a difference between an average value of the P1 samples and an average value of the pixels of non-rectangular partition P1. Video encoder 20 may select a non-rectangular partition scheme and motion vector prediction, e.g., by selection of a merge list candidate, that produces a best or acceptable rate-distortion result. Video encoder 20 then may signal, for the PU, the non-rectangular partitioning scheme and, for the non-rectangular partitions of the PU, the pertinent residual data, and a merge index indicating the selected merge candidate for prediction of the motion information used to generate MCB1 and MCB2. Video decoder 30 may use the signaled information to generate the inter-predictive samples and reconstruct the non-rectangular partitions with the residual data.
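The third aspect (rectangular motion compensation followed by extraction of the partition samples) can be sketched as follows. Several simplifications are assumptions of this example and not part of the disclosure: the covering rectangle is the whole PU rather than the smallest rectangle containing the partition, motion vectors have integer precision, no picture-boundary clipping is performed, the partition pattern is given as a per-sample mask of partition ids, and the averages are rounded to nearest. All function names are hypothetical.

```python
def motion_compensate(ref, x0, y0, width, height, mv):
    """Copy a width x height rectangle from ref, displaced by integer mv=(dx, dy)."""
    dx, dy = mv
    return [[ref[y0 + dy + y][x0 + dx + x] for x in range(width)]
            for y in range(height)]


def extract_partition(block, mask, part_id):
    """Keep only the samples of block that belong to partition part_id."""
    return [block[y][x]
            for y in range(len(mask))
            for x in range(len(mask[0]))
            if mask[y][x] == part_id]


def mean_int(samples):
    """Integer average rounded to nearest (rounding rule is an assumption)."""
    n = len(samples)
    return (sum(samples) + n // 2) // n


def partition_residual(cur_pu, ref, mask, part_id, mv, x0=0, y0=0):
    """Average value-based residual for one non-rectangular partition."""
    height, width = len(mask), len(mask[0])
    mcb = motion_compensate(ref, x0, y0, width, height, mv)  # rectangular MCB
    pred = extract_partition(mcb, mask, part_id)             # e.g. P0 samples
    orig = extract_partition(cur_pu, mask, part_id)
    return mean_int(orig) - mean_int(pred)


# 4x4 PU split by a diagonal line into P0 (id 0) and P1 (id 1).
mask = [[0 if x + y < 4 else 1 for x in range(4)] for y in range(4)]
ref = [[10 * y + x for x in range(6)] for y in range(6)]
cur_pu = [[50 if mask[y][x] == 0 else 20 for x in range(4)] for y in range(4)]
res_p0 = partition_residual(cur_pu, ref, mask, 0, mv=(1, 1))
res_p1 = partition_residual(cur_pu, ref, mask, 1, mv=(1, 1))
```

Here both partitions reuse the same motion vector for brevity; with separate motion information per partition, each call would simply receive its own `mv`.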

[0167] 4. In a fourth aspect, video encoder 20 and/or video decoder 30 may be configured so that each of the two non-rectangular partitions belonging to the same 2Nx2N PU contains a separate set of motion information, identified by reference indices and motion vectors corresponding to reference picture list 0 (RefPicList0) and/or RefPicList1. In this manner, video decoder 30 accesses a separate set of motion information for each non-rectangular partition to inter-decode and reconstruct the non-rectangular partitions independently of one another using average value-based residual coding. In this example, video encoder 20 stores and signals motion information for each non-rectangular partition separately, so that each of the non-rectangular partitions of a PU of a depth view component can be decoded with separate motion information independently of the motion information of the other non-rectangular partition.

[0168] Again, the motion information may include motion vectors, prediction direction, and reference indices. The motion vector may imply motion vector magnitude and prediction direction. In some examples, the motion information may be determined by motion estimation and explicitly signaled for each non-rectangular partition, or determined by motion prediction and signaled by a merge index into a merge candidate list derived for the respective non-rectangular partition. For motion prediction, each non-rectangular partition may use the same merge candidate list or the non-rectangular partitions may use separate and possibly different merge candidate lists, as described in further detail below.

[0169] As discussed in further detail in item 7 below, when the non-rectangular partitions are used for motion prediction for other PUs, one of the sets of motion information of the two partitions may be selected for a given PU. For example, if a merge candidate for a PU to be coded includes a candidate PU with two non-rectangular partitions, and the candidate PU is selected according to merge mode, the motion information for one of the non-rectangular partitions may be used as the motion information for the PU to be coded.

[0170] 5. In a fifth aspect, to accommodate the current HEVC motion compression scheme for motion vectors in the current frame, the following methods are introduced for use by video encoder 20 and video decoder 30:

a. As one option, when the size of a 2Nx2N PU is equal to 8x8, for each partition, the reference index and motion vector corresponding to RefPicList0 are kept in a buffer, such as a line buffer, while the reference index and motion vector corresponding to RefPicList1 are truncated and never accessed by spatial neighboring blocks. In general, truncation means that the storage that would otherwise be needed to store this information, e.g., the reference index and motion vector corresponding to RefPicList1 in this example, is not needed, and thus the memory for storing this information can be released. Hence, if a 2Nx2N PU has a size of 8x8, encoder 20 will store a reference index and motion vector for RefPicList0 for non-rectangular partitions of the PU, but will truncate any reference index and motion vector corresponding to RefPicList1, thereby reducing the amount of motion information stored in the line buffer, and the number of overall motion information entries in the line buffer, and supporting a compatible motion compression scheme.

When a 2Nx2N PU is 16x16 or larger, no compression is needed. In this case, there is no need to truncate the reference index and motion vector in a line buffer that stores motion information for the PUs. Instead, for a PU that is 16x16 or larger, all motion information may be stored in the line buffer by video encoder 20 without compression, including, for non-rectangular partitions of the PU, any reference index and motion vector for RefPicList0, as well as any reference index and motion vector corresponding to RefPicList1.

Hence, in one example, if non-rectangular partitions each include a reference index and a motion vector for a first reference picture list and a reference index and a motion vector for a second reference picture list, only the reference index and motion vector for the first reference picture list are stored in a line buffer as motion information for the non-rectangular partitions. Accordingly, a video decoder 30 may use the compressed motion information, e.g., the motion information including the reference index and motion vector for RefPicList0, for generation of inter-predictive samples for the non-rectangular partitions.

b. As another option, as an alternative or in addition to option 5a above, when the size of a 2Nx2N PU is 8x8, or some other specified size, the reference index and motion vector corresponding to the non-rectangular partition that covers a particular pixel, such as, e.g., the top-left pixel of the current PU being coded, are kept, including motion vectors and reference indices corresponding to both RefPicList0 and RefPicList1, and the motion information of the other partition is truncated and never accessed.

FIG. 9 is a diagram illustrating selection of motion information for a non-rectangular partition P0 that covers a top-left pixel of a PU of a depth view component. To support compression of motion information, the motion information of partition P0 is kept, e.g., in a line buffer, while the motion information of partition P1 is truncated and not accessed for inter-prediction. Instead, non-rectangular partition P1 of the PU uses the motion information of non-rectangular partition P0.

Hence, in this example, video encoder 20 keeps the motion information for one of the non-rectangular partitions of the PU, and stores the information in a memory buffer such as a line buffer, while truncating the motion information for the other non-rectangular partition.

In some examples, the motion information that is kept then may be used for both non-rectangular partitions of the PU, while the motion information that is truncated is not used. For example, a video decoder 30 may use the compressed motion information that is kept for generation of inter-predictive samples for both non-rectangular partitions. When a 2Nx2N PU is 16x16 or larger, no compression is needed. In this case, there is no need to truncate the reference index and motion vector in a line buffer that stores motion information for the PUs. Instead, for a PU that is 16x16 or larger, all motion information may be stored in the line buffer by video encoder 20 without compression.

i. Selection of the motion information for one partition based on coverage of the top-left pixel is provided as one example. Alternatively, the motion information from one non-rectangular partition may be chosen, and motion information from the other non-rectangular partition excluded, by other criteria. For example, the selection may be based on the reference index and the motion vectors as follows: first, video encoder 20 selects the non-rectangular partition that has a smaller reference index to RefPicListX (with X being 0 or 1), and then keeps the motion information for the selected partition in memory, and truncates the motion information for the non-selected partition.

If reference indices to RefPicList0 and/or RefPicList1 of the two partitions are the same, then video encoder 20 selects the partition that has a larger motion vector magnitude:

Abs(MVx) + Abs(MVy),

wherein MVx and MVy are the horizontal and vertical components of a motion vector MV, and then keeps the motion information for the selected partition in memory, and truncates the motion information for the non-selected partition. Video decoder 30 may access motion information in a reciprocal manner.
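The two-stage selection just described (smaller reference index first, then larger Abs(MVx) + Abs(MVy)) can be sketched as below. This is an illustrative sketch for a single reference picture list; the names `PartitionMotion` and `select_partition_to_keep` are hypothetical, and the behavior when both the reference indices and the magnitudes are equal is not specified in the text, so keeping partition 0 in that case is an assumption.

```python
from typing import NamedTuple, Tuple


class PartitionMotion(NamedTuple):
    """Hypothetical motion record for one partition and one reference list."""
    ref_idx: int
    mv: Tuple[int, int]  # (MVx, MVy)


def mv_magnitude(mv: Tuple[int, int]) -> int:
    """Abs(MVx) + Abs(MVy), as given in the text."""
    return abs(mv[0]) + abs(mv[1])


def select_partition_to_keep(p0: PartitionMotion, p1: PartitionMotion) -> int:
    """Return 0 or 1: the index of the partition whose motion is kept."""
    if p0.ref_idx != p1.ref_idx:
        # Stage 1: the smaller reference index wins.
        return 0 if p0.ref_idx < p1.ref_idx else 1
    # Stage 2: equal reference indices, so the larger magnitude wins.
    # Assumption: a full tie keeps partition 0.
    return 0 if mv_magnitude(p0.mv) >= mv_magnitude(p1.mv) else 1
```

The motion information of the partition that is not returned would be truncated.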

[0171] 6. In a sixth aspect, to accommodate the current HEVC motion compression scheme for motion vectors in a temporal co-located picture used for temporal motion vector prediction (TMVP), the following methods are introduced for use by video encoder 20 and video decoder 30:

a. For each 16x16 block, if non-rectangular partitions are used, only the motion information covering a particular pixel, such as, e.g., the top-left sample of the 16x16 block, needs to be accessed when TMVP is enabled for motion estimation. In this example, video encoder 20 stores the motion information for the non-rectangular partition that covers the top-left pixel sample of the 16x16 block, e.g., in a manner similar to that shown in FIG. 9, discussed above, and truncates the motion information for the other non-rectangular partition, such that the motion information for the other non-rectangular partition is never accessed. Video decoder 30 then uses the compressed motion information for inter-coding of the non-rectangular partitions. The motion information for the non-rectangular partition that covers the top-left pixel sample of the 16x16 block may be derived from the TMVP candidate or from another merge or AMVP candidate, depending on a selection by encoder 20. However, in either case, video encoder 20 may select the non-rectangular partition that covers the top-left pixel sample of the 16x16 block when TMVP is enabled.

Alternatively, if non-rectangular partitions are used for one 16x16 block, only the motion information covering the center samples of the 16x16 block needs to be accessed when TMVP is enabled. In this example, video encoder 20 stores the motion information for the non-rectangular partition that covers the center pixel sample or samples of the 16x16 block and truncates the motion information for the other non-rectangular partition, such that the motion information for the other non-rectangular partition is never accessed. In some examples, a center pixel sample may be considered to be a pixel with relative coordinates of (width/2, height/2) relative to the top-left corner of the block, wherein width x height represents the size of the block.
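The choice between the top-left sample and the center sample can be illustrated with a small sketch. The representation of the partitioning as a binary mask (`mask[y][x]` is 0 for P0 and 1 for P1), the function name `kept_partition`, and the diagonal example pattern are all assumptions made for illustration; the center is taken as (width/2, height/2) using integer division.

```python
from typing import List


def kept_partition(mask: List[List[int]], rule: str = "top_left") -> int:
    """Return the index of the partition covering the reference sample.

    mask[y][x] holds the partition index (0 or 1) of each sample.
    """
    height = len(mask)
    width = len(mask[0])
    if rule == "top_left":
        x, y = 0, 0
    elif rule == "center":
        # Relative coordinates (width/2, height/2) from the top-left corner.
        x, y = width // 2, height // 2
    else:
        raise ValueError("unknown rule: " + rule)
    return mask[y][x]


# Hypothetical 16x16 block split along its anti-diagonal.
N = 16
example_mask = [[0 if x + y < N else 1 for x in range(N)] for y in range(N)]
```

With this example pattern, the top-left rule keeps the motion information of P0, whereas the center rule keeps that of P1, so the two alternatives can retain different partitions for the same block.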

In one example, the motion information for the non-rectangular partition that covers the top-left pixel sample of the 16x16 block may be derived from the TMVP candidate or from another merge or AMVP candidate, depending on a selection by encoder 20. However, in either case, video encoder 20 may select the non-rectangular partition that covers the top-left pixel sample of the 16x16 block when TMVP is enabled.

Alternatively, if non-rectangular partitions are used for one 16x16 block, in this example, video encoder 20 always sets the temporal merging candidate to be unavailable even when TMVP is enabled. In this example, the temporal merging candidate cannot be selected by video encoder 20 or video decoder 30 for either non-rectangular partition of the PU to be coded.

[0172] 7. In a seventh aspect, a block that includes non-rectangular partitions may be used as a motion vector prediction candidate, e.g., for merge mode, skip or AMVP mode, to predict the motion information of another block. That is, a block of a CU with non-rectangular partitions may be used for prediction of motion information for blocks of other CUs. However, the candidate block includes two non-rectangular partitions with separate motion information. Accordingly, it may be necessary to select the motion information of one of the non-rectangular partitions for use in predicting the motion information of another block.

[0173] In one example, when a block in a CU that is coded with non-rectangular partitions is used to predict the motion information of another block or a partition in another CU, the motion vectors of the CU may be accessed in two steps. First, for a given block position, it is decided which partition it belongs to by checking its center pixel. Again, in some examples, a center pixel may be a pixel with relative coordinates of (width/2, height/2) relative to the top-left corner of the rectangular block in which the partitions reside, wherein width x height represents the size of the block.

[0174] Then, the motion information of the non-rectangular partition covering the center pixel is used for prediction of the motion information for the block to be coded. For example, when the motion information for a block of a first CU is to be predicted using the motion information of one of two non-rectangular partitions in a block of a second CU, video encoder 20 selects the non-rectangular partition of the block in the second CU that covers a particular pixel, such as the center pixel, of the block of the second CU. That is, video encoder 20 may be configured to select the non-rectangular partition that covers the center pixel of the block used for motion prediction.

[0175] In this manner, when there are two non-rectangular partitions in a candidate block, such as a spatial neighbor block or TMVP, in a merge list used for motion estimation, video encoder 20 can select one of the non-rectangular partitions and use its motion information to predict the motion information for the block to be coded. Video encoder 20 generates a merge index indicating the candidate that is selected for motion prediction.

[0176] Upon accessing a candidate block using the merge index, video decoder 30 selects the one of the two non-rectangular partitions that covers a particular pixel, such as a center pixel of the block used for motion estimation. A center pixel may be a center pixel or one of a set of pixels that are at the center of the block.

[0177] FIG. 10 is a diagram illustrating selection of the motion information of non-rectangular partition P0 for use in motion prediction for a block in another CU based on the determination that partition P0 covers a center pixel of the candidate block for motion prediction.

[0178] 8. In an eighth aspect, when creating a merge candidate list for each non-rectangular partition, the spatial neighboring blocks are not the same as those used for the current PU, if the PU is coded with 2Nx2N. FIGS. 11A, 11B, 11C and 11D are diagrams illustrating definition of neighboring blocks used for a merge candidate list for non-rectangular partitions of a PU in depth coding. In this example, because the spatial neighboring blocks are not the same as those used for the current PU, not all blocks A0, A1, B0, B1, B2 typically used for merge list construction for a rectangular PU, e.g., as shown in FIG. 11A, are used to create the same merge candidate lists for two non-rectangular partitions.

[0179] Generating identical merge candidate lists for a motion prediction mode such as merge mode or AMVP mode for the non-rectangular partitions using the spatial neighbors of the PU may not be optimal, as each partition may have different characteristics. Accordingly, it is desirable to provide a scheme for generating different merge candidate lists for motion prediction for the non-rectangular partitions.

a. In one example, a spatial neighboring block (A0, A1, B0, B1 or B2) that is adjacent to the 2Nx2N PU, but is not adjacent to the currently coded partition, is not inserted into the merge candidate list of the current block by video encoder 20 or video decoder 30. That is, if a particular spatial neighboring block is not adjacent to a particular non-rectangular partition, then video encoder 20 and video decoder 30 construct the merge candidate list for that non-rectangular partition to exclude that spatial neighboring block.

This example is shown in FIG. 11B. In this example, the merge candidate list for the first non-rectangular partition P0 of the PU contains the black-colored spatial neighbors A0, A1, B2 that are adjacent to non-rectangular partition P0. Hence, A0, A1, and B2 are used for merge candidate list construction and motion estimation for the first non-rectangular partition P0. Blocks A0, A1 and B2 may correspond to the spatial neighbor blocks A0, A1, and B2 ordinarily used as spatial neighbors for merge list construction in HEVC.

Video encoder 20 and/or video decoder 30 sets the merge candidate list for the second partition P1 to contain the light gray-shaded spatial neighbors B0, B1. Blocks B0 and B1 may correspond to the spatial neighbor blocks B0 and B1 ordinarily used as spatial neighbors for merge list construction in HEVC. In each case, the merge candidate list for a non-rectangular partition includes only spatial neighbor candidates that are adjacent to the respective partition. In one example, the merge candidate list for a non-rectangular partition includes those spatial neighbor candidate blocks that are ordinarily used for merge list construction in HEVC and are adjacent to the respective partition (e.g., A0, A1 and B2 for partition 0 and B0 and B1 for partition 1 in the example of FIG. 11B).
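The adjacency-based filtering of spatial candidates may be sketched as follows. This is a simplified illustration under stated assumptions: the partitioning is a binary mask (`mask[y][x]` is 0 for P0, 1 for P1), each neighbor is treated as a single sample position, a neighbor counts as adjacent to the partition that owns the PU sample it touches (corner contact included), and the wedge pattern is a hypothetical one chosen so that the result matches the lists stated for FIG. 11B. The helper names are not from the disclosure.

```python
from typing import Dict, List, Tuple


def spatial_neighbors(n: int) -> Dict[str, Tuple[int, int]]:
    """Map each HEVC-style neighbor name to the PU sample (x, y) it touches.

    A0 and A1 touch the bottom-left PU sample, B0 and B1 the top-right
    PU sample, and B2 the top-left PU sample (A0, B0, B2 diagonally).
    """
    return {
        "A0": (0, n - 1),
        "A1": (0, n - 1),
        "B0": (n - 1, 0),
        "B1": (n - 1, 0),
        "B2": (0, 0),
    }


def merge_neighbors_for_partition(mask: List[List[int]], part: int) -> List[str]:
    """Keep only the neighbors whose touched PU sample belongs to `part`."""
    n = len(mask)
    return [name for name, (px, py) in spatial_neighbors(n).items()
            if mask[py][px] == part]


# Hypothetical 8x8 wedge: P1 is the upper-right region, P0 the remainder.
N = 8
wedge = [[1 if x - y >= N // 2 else 0 for x in range(N)] for y in range(N)]
```

For this pattern, partition P0 retains A0, A1 and B2, and partition P1 retains B0 and B1.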

A block may be adjacent to a partition, for example, if all pixels of one of its four sides are adjacent to the pixels of the partition. In some examples, a block with a corner pixel adjacent to a corner of a partition may be considered adjacent. In another example, for each spatial neighboring block, if the given spatial neighboring block is not available, a substitution for the merge candidate list is found by encoder 20 and/or decoder 30 by shifting the position of an unavailable spatial neighboring block horizontally or vertically until the block becomes adjacent to the current partition. More specifically, if a spatial neighboring block is considered as not adjacent to the current partition, as described above, video encoder 20 shifts it horizontally or vertically until it becomes adjacent to the current partition, provided that it is still within the left or top region of the current PU and it is different from the available ones. That is, video encoder 20 shifts from the position of a first, unavailable, i.e., non-adjacent, spatial neighboring block, either vertically or horizontally, until it arrives at the position of a second spatial neighboring block that is adjacent to the pertinent non-rectangular partition. This second spatial neighboring block then may be used as a substitute spatial neighbor block and is added to the merge list by encoder 20 and/or decoder 30. In some examples, the substitute spatial neighbor block may be the first spatial neighbor block, adjacent to the partition, that is reached upon shifting. As shown in FIG. 11C, for example, additional spatial blocks in black and light shading, respectively, are generated to form merge candidate lists for the two partitions. In particular, in the example of FIG. 11C, partition P0 uses black spatial neighbors A0, A1, B2 and B3 for its merge candidate list, and partition P1 uses light-shaded spatial neighbors B0, B1 and B4 for the merge candidate list.

To generate the merge candidate list for partition P0 of the PU, in this example, video encoder 20 and/or video decoder 30 selects spatial neighbor blocks A0, A1 and B2, which are already adjacent to partition P0, and horizontally shifts to the left from the position of spatial neighbor block B1 until it reaches the position of a spatial neighbor block B3 that is adjacent to partition P0. Block B3 is then added to the merge candidate list for partition P0, along with A0, A1 and B2. Likewise, video encoder 20 and/or decoder 30 generates a merge candidate list for partition P1 that includes the available spatial neighbor blocks B0 and B1, plus spatial neighbor block B4 generated by shifting to the right from the position of unavailable, non-adjacent block B2 until the position of block B4 is reached.
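The horizontal shifting described above can be sketched for the row of positions directly above the PU. The sketch is illustrative only: the binary mask, the one-sample step, the function name `shift_along_top`, and the wedge pattern are assumptions, and vertical shifting along the left column would follow the same pattern. A position above the PU is treated as adjacent to a partition when the PU sample directly below it belongs to that partition.

```python
from typing import List, Optional


def shift_along_top(mask: List[List[int]], start_x: int, step: int,
                    part: int) -> Optional[int]:
    """Shift along the row above the PU until adjacent to `part`.

    Starts from column start_x (the non-adjacent neighbor) and moves by
    `step` (+1 right, -1 left). Returns the first column whose PU sample
    in the top row belongs to `part`, or None if the search leaves the
    top region of the PU.
    """
    width = len(mask[0])
    x = start_x + step
    while 0 <= x < width:
        if mask[0][x] == part:
            return x
        x += step
    return None


# Hypothetical 8x8 wedge: P1 is the upper-right region, P0 the remainder.
N = 8
wedge = [[1 if x - y >= N // 2 else 0 for x in range(N)] for y in range(N)]

# B1 sits above column N - 1; shifting left finds a substitute for P0.
b3_column = shift_along_top(wedge, N - 1, -1, 0)
# B2 sits above-left at column -1; shifting right finds a substitute for P1.
b4_column = shift_along_top(wedge, -1, +1, 1)
```

In this sketch, the returned columns play the roles of the substitute blocks B3 (for P0) and B4 (for P1), each being the first adjacent position reached upon shifting.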

Block B4 is then added by encoder 20 and/or decoder 30 as a substitute candidate spatial neighbor block for the merge list for partition P1. The spatial neighbor block that is considered unavailable as not being adjacent to a given partition, and from which shifting is made to reach the position of a substitute block that is adjacent to the respective partition, may be one of the ordinary spatial neighbor blocks used for merge list construction for a rectangular block in HEVC. For other partition results, it is possible that additional blocks may be generated for a given partition by vertical shifting or horizontal shifting, and that some merge candidate lists may include a block produced by vertical shifting and another block produced by horizontal shifting.

In another example, for each partition, starting from any of the corner blocks of the partition (which is also the corner of the current PU), the block is shifted vertically, one pixel at a time, until the block becomes a typical spatial neighboring block used for merge mode (A0, A1, B0, B1 or B2); simultaneously, starting from the corner block of the partition, the block is shifted horizontally until the block becomes a typical spatial neighboring block used for merge mode. A typical spatial neighboring block may be a spatial neighboring block that would ordinarily be included in a merge candidate list for the PU, without partitioning. As shown in FIG. 11D, after shifting, spatial neighbor blocks A0 and B1 may be used for, i.e., shared by, both merge candidate lists of the two partitions. A corner block of a partition may refer to a corner block (of the PU the partition belongs to) that is fully contained by a partition.

For partition P0, when video encoder 20 and/or video decoder 30 shifts horizontally from corner block B2 of partition P0, the first ordinary merge candidate block, e.g., that is ordinarily used in HEVC merge list construction, is spatial neighbor block B1. Hence, spatial neighbor block B1 is added to the candidate merge list for partition P0. That is, although spatial neighbor block B1 is not adjacent to partition P0, video encoder 20 and/or decoder 30 adds spatial neighbor block B1 to the spatial neighbor blocks A0, A1 and B2 used in the merge list for partition P0. Shifting corner block A0 horizontally does not reach any ordinary merge candidate block (i.e., any block that is ordinarily used in merge list construction in HEVC), as there is no other block that is ordinarily used along the bottom edge of the PU.

For partition P1, shifting from corner block B3 to the left results in selection of ordinary spatial neighbor block A0, i.e., a block that is ordinarily used in merge list construction in HEVC. Hence, spatial neighbor block A0 is added to the candidate merge list for partition P1, along with spatial neighbor blocks B0 and B1. Shifting from corner block A0 vertically or from corner block B0 horizontally would only yield blocks A1 and B1, respectively, as the first ordinary blocks used in HEVC merge list construction, but such blocks are already part of the merge candidate list for the respective partitions P0 and P1. In the example of FIG. 11D, as a result of the corner block shifting, partition P0 uses spatial neighbors A0, A1, B2 and B1 for the merge candidate list, and partition P1 uses spatial neighbors B0, B1 and A0 for the merge candidate list.

Hence, spatial neighbors A0 and B1 are used in merge lists for both partition P0 and partition P1. Note that, in FIG. 11D, blocks A0, B1 (indicated by cross-hatching) are used by video encoder 20 and/or decoder 30 to form merge candidate lists for both non-rectangular partitions, whereas blocks A1 and B2 (shaded black) are used only for partition P0, and block B0 (shaded in light gray) is used only for partition P1.

[0180] 9. As a ninth aspect, when the coding tool, motion vector inheritance (MVI), is utilized by a video encoder 20 and/or video decoder 30 to create a merge candidate from the co-located texture block located by the center position of the current depth block, for each partition, a particular pixel, such as a corner pixel, of the partition is used to identify a co-located texture block, the motion information of which is used to create another MVI candidate. Alternatively, any other pixel within a partition can be used to identify the co-located texture block.

[0181] In general, in this example, MVI may be used to generate one or more motion prediction candidates for inclusion in a merge candidate list that may be used for merge mode or AMVP mode for a block to be coded. A video encoder 20 and/or video decoder 30 may select a co-located texture block from a corresponding texture view component as a first additional merge candidate, i.e., a first MVI candidate, for the merge candidate list by selecting the texture block having a center position that is the same as or similar to the center position of the depth PU. A center position may be, for example, a position with relative coordinates of (width/2, height/2) relative to the top-left corner of the block, wherein width x height represents the size of the block.

[0182] Video encoder 20 and/or video decoder 30 may select a second, additional merge candidate, i.e., a second MVI candidate, for one non-rectangular partition of the PU by selecting another texture block that has a corner pixel position, such as a top-left pixel position, that is the same as or similar to a corner pixel position of the non-rectangular partition. Video encoder 20 and/or video decoder 30 may also select yet a third, additional merge candidate, i.e., a third MVI candidate, for the non-rectangular partition of the PU by selecting another texture block that has a corner pixel position, such as a top-left pixel position, that is the same as or similar to a corner pixel position of the other non-rectangular partition.

[0183] The resulting merge candidate list for each non-rectangular partition may be the same or different, and may include all of the first, second and third MVI candidates.

Alternatively, the merge candidate list for one of the non-rectangular partitions may include the first MVI candidate and the one of the MVI candidates that was generated for the respective non-rectangular partition, but exclude the one of the MVI candidates that was not generated for the respective non-rectangular partition.

[0184] Hence, in constructing a merge list for a non-rectangular partition of a PU of a depth view component, video encoder 20 and/or decoder 30 may select a texture block in a corresponding texture view component, e.g., of the same view, based on at least partial co-location of the respective non-rectangular partition and the block of the texture view component. The partial co-location may be determined based on co-location of a particular pixel, such as a center pixel of the PU containing the non-rectangular partitions, or a corner pixel of a first partition or a corner pixel of a second partition, with a pixel in a texture block.

[0185] In various examples, the merge list for a non-rectangular partition may include spatial neighbor blocks, a TMVP, and/or one or more MVI candidates. The MVI candidate or candidates for a non-rectangular partition may include a texture block co-located with a central pixel (or other selected pixel) of the rectangular PU in which the partition resides, a texture block co-located with a corner pixel of the partition, or a texture block co-located with a corner pixel of the other partition of the PU, a combination of a texture block co-located with a central pixel (or other selected pixel) of the rectangular PU in which the partition resides and a texture block co-located with a corner pixel of the partition, a combination of a texture block co-located with a central pixel (or other selected pixel) of the rectangular PU in which the partition resides and a texture block co-located with a corner pixel of the other partition of the PU, a combination of a texture block co-located with a central pixel (or other selected pixel) of the rectangular PU in which the partition resides, a texture block co-located with a corner pixel of the partition, and a texture block co-located with a corner pixel of the other partition of the PU, or a combination of a texture block co-located with a corner pixel of the partition and a texture block co-located with a corner pixel of the other partition of the PU.

[0186] The candidates then, if selected from the merge list, may be used by video encoder 20 and/or video decoder 30 for prediction of motion information for the pertinent non-rectangular partition. In various examples, average value-based residual coding, similar to SDC, may be used to code the residual data of the partition relative to inter-predictive reference samples generated using the predicted motion information.

a. In one alternative, two MVI candidates may be added to the merge list and each MVI candidate may be created by the motion information of a co-located texture block which is located by a corner pixel, or another pixel, of a respective one of the non-rectangular partitions. In this way, the original MVI candidate derived from the co-located texture block located by the center position of the current depth block CU is not inserted into the merge candidate list for either of the non-rectangular partitions.

Instead, the merge candidate list for each non-rectangular partition includes the second and third MVI candidates, or the merge candidate list for the one non-rectangular partition includes the second MVI candidate, but not the third MVI candidate, and the merge candidate list for the other non-rectangular partition includes the third MVI candidate, but not the second MVI candidate. In addition to the MVI candidate or candidates, the merge candidate list may include other merge candidates, such as the ordinary merge candidates used in HEVC to form a merge candidate list (e.g., A0, A1, B0, B1, B2).

In this case, each of the non-rectangular partitions may have the same merge candidate list, including the same MVI candidates, or different merge candidate lists, including the same or different MVI candidates. Alternatively, the merge candidate lists may be formulated as described above with respect to the eighth aspect, and then supplemented to include one or more MVI candidates as described in this ninth aspect. For example, any of the merge candidate lists described with respect to FIGS. 11A-11D may further include MVI candidates according to different examples described in this ninth aspect.

[0187] Video encoder 20 and video decoder 30 may operate substantially in accordance with the HEVC standard, as modified or supplemented to support inter depth coding according to various examples of this disclosure. However, the techniques of this disclosure are not necessarily limited to any particular coding standard or technique. A draft of the HEVC standard, referred to as "HEVC Working Draft 9," is described in Bross et al., "High Efficiency Video Coding (HEVC) text specification draft 9," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 11th Meeting: Shanghai, China, October, 2012, which is downloadable from:

http://phenix.int-evry.fr/jct/doc_end_user/documents/11_Shanghai/wg11/JCTVC-K1003-v8.zip

[0188] Another recent draft of the HEVC standard, referred to as "HEVC Working Draft 10" or "WD10," is described in document JCTVC-L1003v34, Bross et al., "High efficiency video coding (HEVC) text specification draft 10 (for FDIS & Last Call)," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 12th Meeting: Geneva, CH, 14-23 January, 2013, which, as of June 6, 2013, is downloadable from:

http://phenix.int-evry.fr/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1003-v34.zip.

[0189] Yet another draft of the HEVC standard, referred to herein as "WD10 revisions," is described in Bross et al., "Editors' proposed corrections to HEVC version 1," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 13th Meeting, Incheon, KR, April 2013, which as of June 7, 2013, is available from:

http://phenix.int-evry.fr/jct/doc_end_user/documents/13_Incheon/wg11/JCTVC-M0432-v3.zip

[0190] In one example implementation, it is assumed that the Wedgelet index is explicitly signaled, although the non-rectangular partitions may be signaled differently or derived. Set forth below is example coding unit (CU) syntax in which a Wedgelet index is signaled. Below, highlighting indicates changes relative to current syntax and semantics in JCT-3V D1005 or HEVC WD10, as applicable.

TABLE 1 - Coding Unit Syntax

[0191] Set forth below is example prediction unit partition syntax.

Table 2 - Prediction unit partition syntax

[0192] Example coding unit semantics are described as follows:

wedge_full_tab_idx[ x0 ][ y0 ] specifies the index of the Wedgelet pattern in the corresponding pattern list, in a manner similar to when depth_intra_mode[ x0 ][ y0 ] is equal to INTRA_DEP_DMM_WFULL.

inter_sdc_resi_abs[ i ][ x0 ][ y0 ] and inter_sdc_resi_sign_flag[ i ][ x0 ][ y0 ] are used to derive the residual of the i-th partition, i.e., InterSdcResi[ i ][ x0 ][ y0 ], as follows:

InterSdcResi[ i ][ x0 ][ y0 ] = ( 1 - 2 * inter_sdc_resi_sign_flag[ i ][ x0 ][ y0 ] ) * inter_sdc_resi_abs[ i ][ x0 ][ y0 ]

All the pixels within one partition share the same residual, i.e., InterSdcResi[ i ][ x0 ][ y0 ].
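The residual derivation above can be illustrated with a short sketch. The function names, the binary mask representation of the partitions, and the omission of any clipping to the valid sample range are assumptions made for illustration; only the sign/magnitude formula and the sharing of one residual by all pixels of a partition come from the semantics above.

```python
from typing import List


def inter_sdc_resi(sign_flag: int, resi_abs: int) -> int:
    """InterSdcResi = (1 - 2 * sign_flag) * resi_abs."""
    return (1 - 2 * sign_flag) * resi_abs


def reconstruct_partition(pred: List[List[int]], mask: List[List[int]],
                          part: int, resi: int) -> List[List[int]]:
    """Add the single shared residual to every predicted sample of `part`.

    Samples of the other partition are returned unchanged. Clipping to
    the valid sample range is omitted in this sketch.
    """
    return [[p + resi if m == part else p for p, m in zip(pred_row, mask_row)]
            for pred_row, mask_row in zip(pred, mask)]
```

A sign flag of 0 yields a positive residual and a sign flag of 1 a negative one, so a flag of 1 with an absolute value of 5 lowers every predicted sample of the partition by 5.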

[0193] Example prediction unit partition semantics are described as follows:

merge_idx[ 0 ][ x0 ][ y0 ] specifies the merge index of the current prediction unit (0-th partition).

merge_idx[ 1 ][ x0 ][ y0 ] specifies the merge index of the current prediction unit (1-th partition).

[0194] An example binarization process for part_mode is described as follows:

H.9.3.2.6 Binarization process for part_mode

Inputs to this process are a request for a binarization for the syntax element part_mode, a luma location ( xCb, yCb ) specifying the top-left sample of the current luma coding block relative to the top-left luma sample of the current picture, and a variable log2CbSize specifying the current luma coding block size.

Output of this process is the binarization of the syntax element.

The binarization for the syntax element part_mode when log2CbSize > Log2MaxDmmCbSize is specified in Table 9-34 depending on the values of CuPredMode[ xCb ][ yCb ] and log2CbSize. The binarization for the syntax element part_mode when log2CbSize <= Log2MaxDmmCbSize is specified in Table 9-35 depending on the values of CuPredMode[ xCb ][ yCb ] and log2CbSize.

Table 9-34 - Binarization for part_mode when log2CbSize > Log2MaxDmmCbSize

Table 9-35 - Binarization for part_mode when log2CbSize <= Log2MaxDmmCbSize

Bin string columns:
(1) log2CbSize > MinCbLog2SizeY, !amp_enabled_flag
(2) log2CbSize > MinCbLog2SizeY, amp_enabled_flag
(3) log2CbSize = = MinCbLog2SizeY, log2CbSize = = 3
(4) log2CbSize = = MinCbLog2SizeY, log2CbSize > 3

CuPredMode[ xCb ][ yCb ]   part_mode   PartMode       (1)    (2)     (3)    (4)
MODE_INTRA                 0           PART_2Nx2N     -      -       1      1
MODE_INTRA                 1           PART_NxN       -      -       0      0
MODE_INTER                 0           PART_2Nx2N     1      1       1      1
MODE_INTER                 1           PART_2NxN      001    0011    001    001
MODE_INTER                 2           PART_Nx2N      000    0001    000    000
MODE_INTER                 3           PART_NxN       -      -       -
MODE_INTER                 4           PART_2NxnU     -      00100   -      -
MODE_INTER                 5           PART_2NxnD     -      00101   -      -
MODE_INTER                 6           PART_nLx2N     -      00000   -      -
MODE_INTER                 7           PART_nRx2N     -      00001   -      -
MODE_INTER                 8           PART_WEDGLET   01     01      01     01
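One property of the bin strings in Table 9-35 that may be checked mechanically is that, within a column, no bin string is a prefix of another, so that a decoder can identify part_mode as soon as a complete bin string has been read. The sketch below uses the MODE_INTER bin strings of the column for log2CbSize > MinCbLog2SizeY with amp_enabled_flag set. The function names are hypothetical, and the sketch operates on already-decoded bins; the context-adaptive arithmetic decoding that produces those bins is outside its scope.

```python
from typing import Dict, Iterable, Tuple

# MODE_INTER bin strings, log2CbSize > MinCbLog2SizeY, amp_enabled_flag set.
INTER_BINS_AMP: Dict[str, str] = {
    "PART_2Nx2N": "1",
    "PART_2NxN": "0011",
    "PART_Nx2N": "0001",
    "PART_2NxnU": "00100",
    "PART_2NxnD": "00101",
    "PART_nLx2N": "00000",
    "PART_nRx2N": "00001",
    "PART_WEDGLET": "01",
}


def is_prefix_free(codes: Iterable[str]) -> bool:
    """True if no code is a proper prefix of another code."""
    code_list = list(codes)
    return not any(a != b and b.startswith(a)
                   for a in code_list for b in code_list)


def decode_part_mode(bins: str, table: Dict[str, str]) -> Tuple[str, int]:
    """Read bins until they match a bin string; return (PartMode, bins used)."""
    inverse = {code: mode for mode, code in table.items()}
    prefix = ""
    for b in bins:
        prefix += b
        if prefix in inverse:
            return inverse[prefix], len(prefix)
    raise ValueError("no bin string matched")
```

In this column, the Wedgelet partition mode is identified after two bins, whereas the asymmetric modes require five.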

[0195] FIG. 12 is a block diagram illustrating an example video encoder 20 that may be configured to implement the techniques of this disclosure. FIG. 12 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video encoder 20 in the context of HEVC coding and, more particularly, 3D-HEVC. However, the techniques of this disclosure may be applicable to other coding standards or methods.

[0196] In the example of FIG. 12, video encoder 20 includes a prediction processing unit 100, a residual generation unit 102, a transform processing unit 104, a quantization unit 106, an inverse quantization unit 108, an inverse transform processing unit 110, a reconstruction unit 112, a filter unit 114, a decoded picture buffer 116, and an entropy encoding unit 118. Prediction processing unit 100 includes an inter-prediction processing unit 120 and an intra-prediction processing unit 126. Inter-prediction processing unit 120 includes a motion estimation unit 122 and a motion compensation unit 124. In other examples, video encoder 20 may include more, fewer, or different functional components.

[0197] Video encoder 20 may receive video data. Video encoder 20 may encode each CTU in a slice of a picture of the video data. Each of the CTUs may be associated with equally-sized luma coding tree blocks (CTBs) and corresponding CTBs of the picture. As part of encoding a CTU, prediction processing unit 100 may perform quad-tree partitioning to divide the CTBs of the CTU into progressively-smaller blocks. The smaller blocks may be coding blocks of CUs. For example, prediction processing unit 100 may partition a CTB associated with a CTU into four equally-sized sub-blocks, partition one or more of the sub-blocks into four equally-sized sub-sub-blocks, and so on.
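The recursive quad-tree partitioning described above may be sketched as follows. The function name, the `should_split` callback (standing in for whatever split decision the encoder makes), and the `min_size` parameter are hypothetical and are used only for illustration.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block


def quadtree_partition(x: int, y: int, size: int, min_size: int,
                       should_split: Callable[[int, int, int], bool]) -> List[Block]:
    """Recursively split a square block into four equally-sized sub-blocks.

    A block is split while it is larger than min_size and should_split
    returns True for it; otherwise it becomes a leaf (a coding block).
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves: List[Block] = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    quadtree_partition(x + dx, y + dy, half, min_size, should_split))
        return leaves
    return [(x, y, size)]
```

For example, splitting a 64x64 block once yields four 32x32 coding blocks; splitting one of those again would yield four 16x16 blocks in its place.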

Video encoder 20 may encode CUs of a CTU to generate encoded representations of the CUs (i.e., coded CUs). As part of encoding a CU, prediction processing unit 100 may partition the coding blocks associated with the CU among one or more PUs of the CU. Thus, each PU may be associated with a luma prediction block and corresponding chroma prediction blocks.

[0198] Video encoder 20 and video decoder 30 may support PUs having various sizes. As indicated above, the size of a CU may refer to the size of the luma coding block of the CU and the size of a PU may refer to the size of a luma prediction block of the PU.

Assuming that the size of a particular CU is 2Nx2N, video encoder 20 and video decoder 30 may support PU sizes of 2Nx2N or NxN for intra prediction, and symmetric PU sizes of 2Nx2N, 2NxN, Nx2N, NxN, or similar for inter prediction. Video encoder 20 and video decoder 30 may also support asymmetric partitioning for PU sizes of 2NxnU, 2NxnD, nLx2N, and nRx2N for inter prediction. In accordance with aspects of this disclosure, video encoder 20 and video decoder 30 also support non-rectangular partitions of a PU for depth inter coding.
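As an illustration of the PU sizes listed above, the following sketch enumerates the width and height of each PU for a 2Nx2N CU. It assumes the usual HEVC convention that the asymmetric modes split the CU at one quarter of its side; that convention, the function name, and the table layout are assumptions not spelled out in the text above.

```python
def pu_partitions(mode, n):
    """Return the (width, height) of each PU for a CU of size 2N x 2N."""
    s = 2 * n      # full CU side
    q = n // 2     # quarter of the CU side, used by the asymmetric modes (assumed)
    table = {
        "2Nx2N": [(s, s)],
        "2NxN": [(s, n), (s, n)],
        "Nx2N": [(n, s), (n, s)],
        "NxN": [(n, n)] * 4,
        "2NxnU": [(s, q), (s, s - q)],
        "2NxnD": [(s, s - q), (s, q)],
        "nLx2N": [(q, s), (s - q, s)],
        "nRx2N": [(s - q, s), (q, s)],
    }
    return table[mode]


modes = ["2Nx2N", "2NxN", "Nx2N", "NxN", "2NxnU", "2NxnD", "nLx2N", "nRx2N"]
areas = {m: sum(w * h for w, h in pu_partitions(m, 8)) for m in modes}
```

For a 16x16 CU (N = 8), every mode covers the full 256 samples; the non-rectangular depth partitions introduced by this disclosure are not rectangles and therefore do not appear in this table.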

[0199] Inter-prediction processing unit 120 may generate predictive data for a PU by performing inter prediction on each PU of a CU. The predictive data for the PU may include predictive sample blocks of the PU and motion information for the PU. Inter-prediction processing unit 120 may perform different operations for a PU of a CU depending on whether the PU is in an I slice, a P slice, or a B slice. In an I slice, all PUs are intra predicted. Hence, if the PU is in an I slice, inter-prediction processing unit 120 does not perform inter prediction on the PU. Thus, for blocks encoded in I-mode, the predicted block is formed using spatial prediction from previously-encoded neighboring blocks within the same frame.

[0200] If a PU is in a P slice, motion estimation unit 122 may search the reference pictures in a list of reference pictures (e.g., "RefPicList0") for a reference region for the PU. The reference region for the PU may be a region, within a reference picture, that contains sample blocks that most closely correspond to the sample blocks of the PU. Motion estimation unit 122 may generate a reference index that indicates a position in RefPicList0 of the reference picture containing the reference region for the PU. In addition, motion estimation unit 122 may generate an MV that indicates a spatial displacement between a coding block of the PU and a reference location associated with the reference region. For instance, the MV may be a two-dimensional vector that provides an offset from the coordinates in the current decoded picture to coordinates in a reference picture. Motion estimation unit 122 may output the reference index and the MV as the motion information of the PU. Motion compensation unit 124 may generate the predictive sample blocks of the PU based on actual or interpolated samples at the reference location indicated by the motion vector of the PU.
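One way to picture the search for a reference region is an exhaustive block-matching search. The sketch below is an assumption-laden illustration: it uses the sum of absolute differences (SAD) as the matching cost and integer-sample displacements only, neither of which is mandated by the text, and all function names are hypothetical.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a current block and a reference block."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def motion_search(cur, ref, bx, by, size, search_range):
    """Full search over integer displacements; returns ((dx, dy), cost)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > w or ry + size > h:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]


# Synthetic data: the current picture is the reference shifted by (+1, -1).
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(max(y - 1, 0), 7)][min(max(x + 1, 0), 7)] for x in range(8)]
       for y in range(8)]
mv, cost = motion_search(cur, ref, bx=2, by=2, size=2, search_range=2)
```

A practical encoder would typically also weigh the bit cost of the motion information and search fractional-sample positions; this sketch omits both.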

[0201] If a PU is in a B slice, motion estimation unit 122 may perform uni-prediction or bi-prediction for the PU. To perform uni-prediction for the PU, motion estimation unit 122 may search the reference pictures of RefPicList0 or a second reference picture list ("RefPicList1") for a reference region for the PU. Motion estimation unit 122 may output, as the motion information of the PU, a reference index that indicates a position in RefPicList0 or RefPicList1 of the reference picture that contains the reference region, an MV that indicates a spatial displacement between a sample block of the PU and a reference location associated with the reference region, and one or more prediction direction indicators that indicate whether the reference picture is in RefPicList0 or RefPicList1. Motion compensation unit 124 may generate the predictive sample blocks of the PU based at least in part on actual or interpolated samples at the reference region indicated by the motion vector of the PU.

[0202] To perform bi-directional inter prediction for a PU, motion estimation unit 122 may search the reference pictures in RefPicList0 for a reference region for the PU and may also search the reference pictures in RefPicList1 for another reference region for the PU. Motion estimation unit 122 may generate reference picture indexes that indicate positions in RefPicList0 and RefPicList1 of the reference pictures that contain the reference regions. In addition, motion estimation unit 122 may generate MVs that indicate spatial displacements between the reference locations associated with the reference regions and a sample block of the PU. The motion information of the PU may include the reference indexes and the MVs of the PU. Motion compensation unit 124 may generate the predictive sample blocks of the PU based at least in part on actual or interpolated samples at the reference regions indicated by the motion vectors of the PU.

[0203] In accordance with one or more techniques of this disclosure, one or more units within video encoder 20 may perform one or more techniques described herein as part of a video encoding process. Additional 3D components may also be included within video encoder 20.

[0204] Intra-prediction processing unit 126 may generate predictive data for a PU by performing intra prediction on the PU. The predictive data for the PU may include predictive sample blocks for the PU and various syntax elements. Intra-prediction processing unit 126 may perform intra prediction on PUs in I slices, P slices, and B slices.

[0205] To perform intra prediction on a PU, intra-prediction processing unit 126 may use multiple intra prediction modes to generate multiple sets of predictive data for the PU. To use an intra prediction mode to generate a set of predictive data for the PU, intra-prediction processing unit 126 may extend samples from sample blocks of neighboring PUs across the sample blocks of the PU in a direction associated with the intra prediction mode. The neighboring PUs may be above, above and to the right, above and to the left, or to the left of the PU, assuming a left-to-right, top-to-bottom encoding order for PUs, CUs, and CTUs. Intra-prediction processing unit 126 may use various numbers of intra prediction modes, e.g., 33 directional intra prediction modes. In some examples, the number of intra prediction modes may depend on the size of the region associated with the PU.

[0206] Prediction processing unit 100 may select the predictive data for PUs of a CU from among the predictive data generated by inter-prediction processing unit 120 for the PUs or the predictive data generated by intra-prediction processing unit 126 for the PUs. In some examples, prediction processing unit 100 selects the predictive data for the PUs of the CU based on rate/distortion metrics of the sets of predictive data. The predictive sample blocks of the selected predictive data may be referred to herein as the selected predictive sample blocks.

[0207] Residual generation unit 102 may generate, based on the luma, Cb and Cr coding blocks of a CU and the selected predictive luma, Cb and Cr blocks of the PUs of the CU, luma, Cb and Cr residual blocks of the CU. For instance, residual generation unit 102 may generate the residual blocks of the CU such that each sample in the residual blocks has a value equal to a difference between a sample in a coding block of the CU and a corresponding sample in a corresponding selected predictive sample block of a PU of the CU.

[0208] Transform processing unit 104 may perform quad-tree partitioning to partition the residual blocks associated with a CU into transform blocks associated with TUs of the CU. Thus, a TU may be associated with a luma transform block and two chroma transform blocks. The sizes and positions of the luma and chroma transform blocks of TUs of a CU may or may not be based on the sizes and positions of prediction blocks of the PUs of the CU. A quad-tree structure known as a "residual quad-tree" (RQT) may include nodes associated with each of the regions. The TUs of a CU may correspond to leaf nodes of the RQT.

[0209] Transform processing unit 104 may generate transform coefficient blocks for each TU of a CU by applying one or more transforms to the transform blocks of the TU.

Transform processing unit 104 may apply various transforms to a transform block associated with a TU. For example, transform processing unit 104 may apply a discrete cosine transform (DCT), a directional transform, or a conceptually similar transform to a transform block. In some examples, transform processing unit 104 does not apply transforms to a transform block. In such examples, the transform block may be treated as a transform coefficient block.

[0210] Quantization unit 106 may quantize the transform coefficients in a coefficient block. The quantization process may reduce the bit depth associated with some or all of the transform coefficients. For example, an n-bit transform coefficient may be rounded down to an m-bit transform coefficient during quantization, where n is greater than m. Quantization unit 106 may quantize a coefficient block associated with a TU of a CU based on a quantization parameter (QP) value associated with the CU. Video encoder 20 may adjust the degree of quantization applied to the coefficient blocks associated with a CU by adjusting the QP value associated with the CU. Quantization may introduce loss of information; thus, quantized transform coefficients may have lower precision than the original ones.
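The effect of the QP value on precision can be illustrated with a simple scalar quantizer. The sketch below assumes the commonly cited approximation that the quantization step size doubles for every increase of 6 in QP, i.e. Qstep = 2^((QP - 4) / 6); that mapping, the truncating rounding rule, and the function names are assumptions for illustration and are not taken from the text above. Real codecs use integer arithmetic with scaling tables and rounding offsets.

```python
def qstep(qp):
    """Approximate quantization step size for a given QP (assumed mapping)."""
    return 2 ** ((qp - 4) / 6)


def quantize(coeff, qp):
    """Map a transform coefficient to a quantized level, truncating toward zero."""
    return int(coeff / qstep(qp))


def dequantize(level, qp):
    """Inverse quantization: scale the level back by the step size."""
    return level * qstep(qp)


coeff = 90
fine = dequantize(quantize(coeff, 22), 22)    # step size 8
coarse = dequantize(quantize(coeff, 34), 34)  # step size 32
```

A larger QP gives a larger step size, fewer distinct levels, and a larger reconstruction error, which is the loss of information the paragraph above refers to.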

[0211] Inverse quantization unit 108 and inverse transform processing unit 110 may apply inverse quantization and inverse transforms to a coefficient block, respectively, to reconstruct a residual block from the coefficient block. Reconstruction unit 112 may add the reconstructed residual block to corresponding samples from one or more predictive sample blocks generated by prediction processing unit 100 to produce a reconstructed transform block associated with a TU. By reconstructing transform blocks for each TU of a CU in this way, video encoder 20 may reconstruct the coding blocks of the CU.

[0212] Filter unit 114 may perform one or more deblocking operations to reduce blocking artifacts in the coding blocks associated with a CU. Decoded picture buffer 116 may store the reconstructed coding blocks after filter unit 114 performs the one or more deblocking operations on the reconstructed coding blocks. Inter-prediction processing unit 120 may use a reference picture that contains the reconstructed coding blocks to perform inter prediction on PUs of other pictures. In addition, intra-prediction processing unit 126 may use reconstructed coding blocks in decoded picture buffer 116 to perform intra prediction on other PUs in the same picture as the CU.

[0213] Entropy encoding unit 118 may receive data from other functional components of video encoder 20. For example, entropy encoding unit 118 may receive coefficient blocks from quantization unit 106 and may receive syntax elements from prediction processing unit 100. Entropy encoding unit 118 may perform one or more entropy encoding operations on the data to generate entropy-encoded data. For example, entropy encoding unit 118 may perform a context-adaptive variable length coding (CAVLC) operation, a CABAC operation, a variable-to-variable (V2V) length coding operation, a syntax-based context-adaptive binary arithmetic coding (SBAC) operation, a Probability Interval Partitioning Entropy (PIPE) coding operation, an Exponential-Golomb encoding operation, or another type of entropy encoding operation on the data. Video encoder 20 may output a bitstream that includes entropy-encoded data generated by entropy encoding unit 118. For instance, the bitstream may include data that represents an RQT for a CU.

[0214] FIG. 13 is a block diagram illustrating an example video decoder 30 that is configured to implement the techniques of this disclosure. FIG. 13 is provided for purposes of explanation and is not limiting on the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video decoder 30 in the context of HEVC coding. However, the techniques of this disclosure may be applicable to other coding standards or methods.

[0215] In the example of FIG. 13, video decoder 30 includes an entropy decoding unit 150, a prediction processing unit 152, an inverse quantization unit 154, an inverse transform processing unit 156, a reconstruction unit 158, a filter unit 160, and a decoded picture buffer 162. Prediction processing unit 152 includes a motion compensation unit 164 and an intra-prediction processing unit 166. In other examples, video decoder 30 may include more, fewer, or different functional components.

[0216] Video decoder 30 may receive a bitstream. Entropy decoding unit 150 may parse the bitstream to decode syntax elements from the bitstream. Entropy decoding unit 150 may entropy decode entropy-encoded syntax elements in the bitstream. Prediction processing unit 152, inverse quantization unit 154, inverse transform processing unit 156, reconstruction unit 158, and filter unit 160 may generate decoded video data based on the syntax elements extracted from the bitstream.

[0217] The bitstream may comprise a series of NAL units. The NAL units of the bitstream may include coded slice NAL units. As part of decoding the bitstream, entropy decoding unit 150 may extract and entropy decode syntax elements from the coded slice NAL units. Each of the coded slices may include a slice header and slice data. The slice header may contain syntax elements pertaining to a slice. The syntax elements in the slice header may include a syntax element that identifies a PPS associated with a picture that contains the slice.

[0218] In addition to decoding syntax elements from the bitstream, video decoder 30 may perform a reconstruction operation on a non-partitioned CU. To perform the reconstruction operation on a non-partitioned CU, video decoder 30 may perform a reconstruction operation on each TU of the CU. By performing the reconstruction operation for each TU of the CU, video decoder 30 may reconstruct residual blocks of the CU.

[0219] As part of performing a reconstruction operation on a TU of a CU, inverse quantization unit 154 may inverse quantize, i.e., de-quantize, coefficient blocks associated with the TU. Inverse quantization unit 154 may use a QP value associated with the CU of the TU to determine a degree of quantization and, likewise, a degree of inverse quantization for inverse quantization unit 154 to apply. That is, the compression ratio, i.e., the ratio of the number of bits used to represent the original sequence to the number of bits used to represent the compressed one, may be controlled by adjusting the value of the QP used when quantizing transform coefficients. The compression ratio may also depend on the method of entropy coding employed.

[0220] After inverse quantization unit 154 inverse quantizes a coefficient block, inverse transform processing unit 156 may apply one or more inverse transforms to the coefficient block in order to generate a residual block associated with the TU. For example, inverse transform processing unit 156 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform, or another inverse transform to the coefficient block.

[0221] If a PU is encoded using intra prediction, intra-prediction processing unit 166 may perform intra prediction to generate predictive blocks for the PU. Intra-prediction processing unit 166 may use an intra prediction mode to generate the predictive luma, Cb and Cr blocks for the PU based on the prediction blocks of spatially-neighboring PUs. Intra-prediction processing unit 166 may determine the intra prediction mode for the PU based on one or more syntax elements decoded from the bitstream.

[0222] Prediction processing unit 152 may construct a first reference picture list (RefPicList0) and a second reference picture list (RefPicList1) based on syntax elements extracted from the bitstream. Furthermore, if a PU is encoded using inter prediction, entropy decoding unit 150 may extract motion information for the PU. Motion compensation unit 164 may determine, based on the motion information of the PU, one or more reference regions for the PU. Motion compensation unit 164 may generate, based on sample blocks at the one or more reference blocks for the PU, predictive luma, Cb and Cr blocks for the PU.

[0223] As indicated above, video encoder 20 may signal the motion information of a PU using merge mode or AMVP mode. When video encoder 20 signals the motion information of a current PU using AMVP mode, entropy decoding unit 150 may decode, from the bitstream, a reference index, an MVD for the current PU, and a candidate index. Furthermore, motion compensation unit 164 may generate a merge candidate list for the current PU. The merge candidate list includes one or more MV predictor candidates. Each of the MV predictor candidates specifies an MV of a PU that spatially or temporally neighbors the current PU. Motion compensation unit 164 may determine, based at least in part on the candidate index, a selected MV predictor candidate in the merge candidate list. Motion compensation unit 164 may then determine the MV of the current PU by adding the MVD to the MV specified by the selected MV predictor candidate. In other words, for AMVP, the MV is calculated as MV = MVP + MVD, wherein the index of the motion vector predictor (MVP) is signaled, the MVP is one of the MV candidates (spatial or temporal) from the merge list, and the MVD is signaled to the decoder side.
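The relationship MV = MVP + MVD can be sketched for both sides of the codec. The following is a minimal illustration; the function names, the tuple representation of vectors, and the encoder's rule of picking the candidate with the smallest difference are assumptions made for this example.

```python
def encode_mv(mv, candidates):
    """Pick the predictor closest to mv; return (candidate_index, mvd)."""
    def distance(i):
        return abs(mv[0] - candidates[i][0]) + abs(mv[1] - candidates[i][1])

    index = min(range(len(candidates)), key=distance)
    mvp = candidates[index]
    return index, (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(candidates, candidate_index, mvd):
    """Decoder side: MV = MVP + MVD, with the MVP chosen by the signaled index."""
    mvp = candidates[candidate_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Hypothetical candidate list built identically at the encoder and the decoder.
candidates = [(4, -2), (0, 0)]
index, mvd = encode_mv((5, -1), candidates)
recovered = decode_mv(candidates, index, mvd)
```

Because both sides derive the same candidate list, only the index and the (usually small) difference need to be transmitted.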

[0224] If the current PU is bi-predicted, entropy decoding unit 150 may decode an additional reference index, MVD, and candidate index from the bitstream. Motion compensation unit 164 may repeat the process described above using the additional reference index, MVD, and candidate index to derive a second MV for the current PU. In this way, motion compensation unit 164 may derive an MV for RefPicList0 (i.e., a RefPicList0 MV) and an MV for RefPicList1 (i.e., a RefPicList1 MV).

[0225] In accordance with one or more techniques of this disclosure, one or more units within video decoder 30 may perform one or more techniques described herein as part of a video decoding process. Additional 3D components may also be included within video decoder 30.

[0226] Continuing reference is now made to FIG. 13. Reconstruction unit 158 may use the luma, Cb and Cr transform blocks associated with TUs of a CU and the predictive luma, Cb and Cr blocks of the PUs of the CU, i.e., either intra-prediction data or inter-prediction data, as applicable, to reconstruct the luma, Cb and Cr coding blocks of the CU. For example, reconstruction unit 158 may add samples of the luma, Cb and Cr transform blocks to corresponding samples of the predictive luma, Cb and Cr blocks to reconstruct the luma, Cb and Cr coding blocks of the CU.
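The sample-wise addition performed by the reconstruction unit can be sketched as below. Clipping the result to the valid sample range is an assumption added here (it is common practice but is not stated in the paragraph above), and the function name is hypothetical.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add residual samples to predictive samples, clipping to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


pred = [[100, 250], [3, 128]]
resid = [[5, 10], [-7, 0]]
recon = reconstruct_block(pred, resid)
```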

[0227] Filter unit 160 may perform a deblocking operation to reduce blocking artifacts associated with the luma, Cb and Cr coding blocks of the CU. Video decoder 30 may store the luma, Cb and Cr coding blocks of the CU in decoded picture buffer 162.

Decoded picture buffer 162 may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on a display device, such as display device 32 of FIG. 1. For instance, video decoder 30 may perform, based on the luma, Cb and Cr blocks in decoded picture buffer 162, intra prediction or inter prediction operations on PUs of other CUs. In this way, video decoder 30 may extract, from the bitstream, transform coefficient levels of the significant luma coefficient block, inverse quantize the transform coefficient levels, apply a transform to the transform coefficient levels to generate a transform block, generate, based at least in part on the transform block, a coding block, and output the coding block for display.

[0228] FIG. 14 is a flow diagram illustrating a method for enabling coding according to a non-rectangular depth inter prediction mode. The method illustrated in FIG. 14 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the first aspect described above. For FIGS. 14-22, various methods will be described jointly with reference to similar or reciprocal operations performed by a video encoder 20 and video decoder 30. As shown in FIG. 14, upon receiving or generating a depth PU (200), encoder 20 and/or decoder 30 determines whether the depth PU has a size of 2N x 2N (202). If not, encoder 20 and/or decoder 30 disables, i.e., does not apply, a non-rectangular depth inter prediction (NRDIP) mode (208). If so, encoder 20 and/or decoder 30 generates non-rectangular partitions for the PU (204), and applies the NRDIP coding mode (206) to code the non-rectangular partitions.

[0229] Hence, as shown in FIG. 14, a video decoder 30 may be configured to perform an inter-prediction decoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture only when a size of the PU is the same as the size of a coding unit (CU) of the PU. The inter-prediction mode comprises obtaining motion information indicating inter-predictive samples for the non-rectangular partitions, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data.

[0230] Also, as shown in FIG. 14, a video encoder 20 may be configured to perform an inter-prediction encoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture only when a size of the PU is the same as the size of a coding unit (CU) of the PU. The inter-prediction mode comprises generating motion information indicating inter-predictive samples for the non-rectangular partitions, generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the motion information and the residual data.

[0231] In some examples, performing the inter-prediction mode comprises performing the inter-prediction mode only when a size of the PU is larger than a predetermined size of KxK pixels. As an example, K may be equal to, e.g., one of 8, 16 or 32. In other examples, performing the inter-prediction mode comprises performing the inter-prediction mode only when a size of the PU is smaller than a predetermined size of JxJ pixels. As an example, J may be equal to one of 16, 32 or 64. In some examples, the minimum and maximum thresholds K and J may be combined such that the inter-prediction mode is performed only when a size of the PU is larger than or equal to KxK and smaller than JxJ, where J and K are different, or alternatively when a size of the PU is larger than KxK and smaller than or equal to JxJ, where J and K are different.
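The size conditions of FIG. 14 and the optional thresholds can be combined into a single check, sketched below. The function and parameter names are hypothetical, sizes are given as the side length of a square block, and the sketch implements only one of the variants described above (PU size larger than or equal to KxK and smaller than JxJ).

```python
def nrdip_enabled(pu_size, cu_size, k=None, j=None):
    """Return True if the non-rectangular depth inter prediction mode may be used."""
    # The mode applies only when the PU is the same size as its CU (2N x 2N).
    if pu_size != cu_size:
        return False
    # Optional lower bound: PU must be at least K x K.
    if k is not None and pu_size < k:
        return False
    # Optional upper bound: PU must be smaller than J x J.
    if j is not None and pu_size >= j:
        return False
    return True


checks = [
    nrdip_enabled(16, 16),              # 2Nx2N PU, no thresholds
    nrdip_enabled(8, 16),               # PU smaller than its CU
    nrdip_enabled(16, 16, k=16, j=64),  # inside [K, J)
    nrdip_enabled(64, 64, k=16, j=64),  # not smaller than J
]
```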

[0232] The non-rectangular partitions include wedgelet partitions. The motion information may indicate a reference index indicating a reference picture in a reference picture list for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples. For example, the motion vector may indicate a block for use in generating the inter-predictive samples. Video decoder 30 may obtain the motion information by motion vector prediction, e.g., using merge, skip, AMVP modes, or by receiving the motion information in an encoded bitstream. In addition, video decoder 30 may receive residual data in the encoded bitstream from video encoder 20.

[0233] Video encoder 20 may signal the motion information or a merge index or other information for motion vector prediction. For example, video encoder 20 may generate the motion information based on motion vector prediction or motion estimation, and encode the non-rectangular partitions based on the motion information and residual data. When the motion information is generated based on motion vector prediction, video encoder 20 may encode a merge index to indicate one of a plurality of candidate sets of motion information in a merge candidate list. When motion estimation is used, video encoder 20 may encode the motion information.

[0234] In the NRDIP mode, the residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition. In this manner, this inter-predictive residual data may be similar to SDC residual data generated in a depth intra mode. Encoder 20 may send, and decoder 30 may receive, the residual data in the encoded bitstream.
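The average-value residual described above can be sketched as one value per partition. In the sketch below, the partition pattern is given as a mask of labels (0 or 1), the residual is rounded to an integer, and the decoder is assumed to add the residual to every predictive sample of the partition; the mask representation, the rounding, the reconstruction rule, and all function names are assumptions made for illustration.

```python
def partition_means(block, mask):
    """Average sample value of each partition, keyed by the partition label."""
    sums, counts = {}, {}
    for row, mask_row in zip(block, mask):
        for value, label in zip(row, mask_row):
            sums[label] = sums.get(label, 0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def average_residuals(orig, pred, mask):
    """Encoder side: one residual per partition (mean of original minus mean of prediction)."""
    mean_orig = partition_means(orig, mask)
    mean_pred = partition_means(pred, mask)
    return {label: round(mean_orig[label] - mean_pred[label]) for label in mean_orig}


def reconstruct(pred, mask, residuals):
    """Decoder side (assumed rule): add the partition's residual to each predictive sample."""
    return [
        [p + residuals[label] for p, label in zip(pred_row, mask_row)]
        for pred_row, mask_row in zip(pred, mask)
    ]


# 4x4 depth block split by a diagonal line into two partitions.
mask = [[1 if x + y >= 4 else 0 for x in range(4)] for y in range(4)]
orig = [[120 if m else 50 for m in row] for row in mask]
pred = [[123 if m else 48 for m in row] for row in mask]
residuals = average_residuals(orig, pred, mask)
recon = reconstruct(pred, mask, residuals)
```

Only two residual values are needed for the whole block in this example, regardless of how many samples each partition contains.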

[0235] FIG. 15 is a flow diagram illustrating a method for performing coding with motion prediction while disabling advanced motion vector prediction (AMVP) in a non-rectangular depth inter prediction mode. The method illustrated in FIG. 15 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the second aspect described above.

[0236] As shown in FIG. 15, upon receiving or generating a depth PU with non-rectangular partitions (220), video encoder 20 and/or decoder 30 may determine whether motion vector prediction is enabled (222). If so, in this example, video encoder 20 and/or decoder 30 disable AMVP (224), and proceed to predict motion information using motion vector prediction (226). If not (222), video encoder 20 and/or decoder 30 use estimated motion information (228), i.e., with video encoder 20 performing motion estimation to generate motion information and video decoder 30 receiving the motion information. With predicted motion information (226) or estimated motion information (228), video encoder 20 and/or decoder 30 select inter-predictive samples for the non-rectangular partitions (230), generate (video encoder 20) or receive (video decoder 30) residual data based on the predictive samples and pixels of the non-rectangular partitions (232), and either encode (video encoder 20) or reconstruct (video decoder 30) the non-rectangular partitions (234).

[0237] Accordingly, as shown in FIG. 15, a video decoder 30 may be configured to perform an inter-prediction decoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture. The inter-prediction mode may comprise obtaining motion information indicating inter-predictive samples for the non-rectangular partitions, obtaining residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstructing the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data. In this example, decoder 30 disables an advanced motion vector prediction (AMVP) mode when the motion information is obtained using motion vector prediction.

[0238] Likewise, a video encoder 20 may perform an inter-prediction encoding mode for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture. The inter-prediction mode may comprise generating motion information indicating inter-predictive samples for the non-rectangular partitions, generating residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encoding the non-rectangular partitions of the PU based at least in part on the motion information and the residual data. In this example, encoder 20 disables an advanced motion vector prediction (AMVP) mode when the motion information is generated using motion vector prediction.

[0239] The motion vector prediction may include a merge mode in which the motion information is obtained from one of a plurality of candidates in a merge candidate list for the non-rectangular partition. In some examples, encoder 20 and/or decoder 30 also may disable a skip mode when the motion information is obtained using motion vector prediction. The non-rectangular partitions may include wedgelet partitions.
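The choice between predicted and signaled motion information in FIG. 15 can be sketched as a small selection function. This is an illustration only: with AMVP disabled, the sketch treats merge as the sole prediction path, and the function name, the parameter names, and the (reference index, motion vector) tuple layout are assumptions.

```python
def obtain_motion_info(mvp_enabled, merge_candidates=(), merge_index=None, signaled=None):
    """Return (ref_idx, mv) from the merge list or from explicitly signaled data."""
    if mvp_enabled:
        # AMVP is disabled for non-rectangular partitions, so no MVD is parsed;
        # the merge index alone selects the complete motion information.
        return merge_candidates[merge_index]
    # Otherwise the encoder's estimated motion information is used as signaled.
    return signaled


# Hypothetical merge candidate list of (ref_idx, (mv_x, mv_y)) entries.
merge_list = [(0, (4, -2)), (1, (0, 0))]
predicted = obtain_motion_info(True, merge_list, merge_index=1)
explicit = obtain_motion_info(False, signaled=(0, (3, 3)))
```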

[0240] In some examples, the motion information includes a reference index indicating a reference picture in a reference picture list for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, and obtaining the motion information comprises obtaining the motion information by motion vector prediction or by receiving the motion information in an encoded bitstream. The residual data for each of the non-rectangular partitions may represent a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

[0241] FIG. 16 is a flow diagram illustrating a method for performing coding with motion prediction in a non-rectangular depth inter prediction mode. The method illustrated in FIG. 16 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the third aspect described above.

[0242] As shown in FIG. 16, video encoder 20 and/or video decoder 30 predict motion information for a non-rectangular partition of a PU of a depth video component (240), and perform motion compensation of a rectangular block that covers the non-rectangular partition in the depth PU (242). Video encoder 20 and/or decoder 30 select inter-predictive samples from the motion-compensated rectangular block for the non-rectangular partition (244), e.g., by extracting samples that correspond to the non-rectangular partition. The non-rectangular partition may be, for example, a wedgelet partition.

[0243] Video encoder 20 generates, and video decoder 30 obtains in the bitstream from video encoder 20, residual data based on the predictive samples and the pixels of the non-rectangular partition (246). Again, average-based residual coding may be employed. Video encoder 20 and/or decoder 30 then encode or reconstruct, respectively, the non-rectangular partition using the predictive samples and the residual data (248). The residual data for each of the non-rectangular partitions may represent a difference between an average value of pixels of the respective non-rectangular partition and an average value of the selected inter-predictive samples for the respective non-rectangular partition.

[0244] As shown in FIG. 16, video decoder 30 may be configured to predict motion information for a non-rectangular partition of video data of a prediction unit (PU) of a depth view component of a picture, perform motion compensation on a rectangular block of the prediction unit that covers the non-rectangular partition using the predicted motion information to obtain inter-predictive samples, select some of the inter-predictive samples for the non-rectangular partition of the PU, obtain residual data representing differences between the selected inter-predictive samples and pixels of the non-rectangular partition, and reconstruct the non-rectangular partition of the PU based at least in part on the selected predictive samples and the residual data.

[0245] Likewise, in this example, video encoder 20 may be configured to predict motion information for a non-rectangular partition of video data of a prediction unit (PU) of a depth view component of a picture, perform motion compensation on a rectangular block of the prediction unit that covers the non-rectangular partition using the predicted motion information to obtain inter-predictive samples, select some of the inter-predictive samples for the non-rectangular partition of the PU, generate residual data representing differences between the selected inter-predictive samples and pixels of the non-rectangular partition, and encode the non-rectangular partition of the PU based at least in part on the motion information and the residual data.

[0246] Predicting the motion information by encoder 20 or decoder 30 may comprise receiving a merge index for the non-rectangular partition, and obtaining the motion information from a selected one of a plurality of candidates in a merge candidate list for the non-rectangular partition. Selecting some of the predictive samples may comprise selecting a subset comprising less than all of the predictive samples for the non-rectangular partition of the PU. In some examples, the selected predictive samples spatially correspond to locations of pixels of the non-rectangular partition within the motion compensated rectangular block.
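The two steps of FIG. 16, motion compensation of a covering rectangular block followed by selection of the samples that belong to the partition, can be sketched as follows. The sketch assumes integer-sample motion vectors (no interpolation) and represents the partition pattern as a label mask; those simplifications and the function names are assumptions made for illustration.

```python
def motion_compensate(ref, x, y, width, height, mv):
    """Fetch the rectangular block at (x, y) displaced by an integer motion vector."""
    dx, dy = mv
    return [[ref[y + dy + j][x + dx + i] for i in range(width)] for j in range(height)]


def select_partition_samples(block, mask, label):
    """Keep only the samples whose position belongs to the given partition."""
    return {
        (i, j): block[j][i]
        for j in range(len(block))
        for i in range(len(block[0]))
        if mask[j][i] == label
    }


# Synthetic 6x6 reference picture and a 4x4 PU at (1, 1) with motion vector (1, 0).
ref = [[10 * y + x for x in range(6)] for y in range(6)]
block = motion_compensate(ref, x=1, y=1, width=4, height=4, mv=(1, 0))
mask = [[1 if x + y >= 4 else 0 for x in range(4)] for y in range(4)]
selected = select_partition_samples(block, mask, label=1)
```

Here the full 4x4 block is motion compensated, but only the 6 samples that lie in partition 1 are retained as its inter-predictive samples.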

[0247] FIG. 17 is a flow diagram illustrating a method for maintaining separate sets of motion information for non-rectangular partitions in a non-rectangular depth inter prediction mode. The method illustrated in FIG. 17 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the fourth aspect described above.

[0248] As shown in FIG. 17, video encoder 20 and/or video decoder 30 obtain separate sets of motion information for the non-rectangular partitions of a PU of a depth view component (260). The motion information for the non-rectangular partitions could be the same or different, in terms of motion vector and reference index, but it is stored and signaled separately. Using the separate motion information, video encoder 20 and/or decoder 30 obtains inter-predictive samples for each non-rectangular partition (262) and generates (encoder 20) residual data using the predictive samples and partition pixels, or obtains the residual data from the bitstream (decoder 30) (264). Video encoder 20 and/or video decoder 30 then encodes or reconstructs (266), respectively, the non-rectangular partitions using the motion information and residual data. For example, video encoder 20 encodes the motion information and residual data, and video decoder 30 generates inter-predictive reference samples using the motion information and reconstructs the pixels of the non-rectangular partitions with the reference samples and residual data. The inter-predictive reference samples may be pixels that are obtained or generated from reference data in a reference picture or view.

[0249] As shown in FIG. 17, video decoder 30 may be configured to obtain separate sets of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component, obtain sets of inter-predictive samples for the non-rectangular partitions of the PU based on the separate sets of motion information, obtain residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstruct the non-rectangular partitions of the PU based at least in part on inter-predictive samples and the residual data. For example, video decoder 30 may receive the separate sets of motion information and the residual data in the encoded video bitstream.

[0250] In a similar manner, video encoder 20 may be configured to generate separate sets of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component, obtain sets of inter-predictive samples for the non-rectangular partitions of the PU based on the separate sets of motion information, generate residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encode the non-rectangular partitions of the PU based at least in part on the separate sets of motion information and the residual data.

[0251] The separate sets of motion information may include a first set of motion information for a first one of the non-rectangular partitions of the PU, and a second set of motion information for a second one of the non-rectangular partitions of the PU. Each of the sets of motion information may indicate a motion vector and a reference index for a reference picture list.

[0252] One or both of the separate sets of motion information may be obtained using motion vector prediction based on a merge index for a merge candidate list indicating a plurality of motion information candidates. Video encoder 20 may generate and encode the merge index. Alternatively, one or both of the separate sets of motion information may be encoded, e.g., without motion prediction, by video encoder 20 in an encoded bitstream. The residual data for each of the non-rectangular partitions may represent a difference between an average value of pixels of the respective non-rectangular partition and an average value of the selected inter-predictive samples for the respective non-rectangular partition.

[0253] FIG. 18 is a flow diagram illustrating a method for compression of motion information for non-rectangular partitions in a non-rectangular depth inter prediction mode. The method illustrated in FIG. 18 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the fifth aspect described above.

[0254] As shown in FIG. 18, a video decoder 30 receives compressed motion information for non-rectangular partitions, such as, e.g., wedgelet partitions, of a depth PU (270). The compressed motion information may be compressed by video encoder 20 or another device in any of the ways described in this disclosure, or in other ways. Video decoder 30 generates inter-predictive samples for non-rectangular partitions using the compressed motion information (272), obtains residual data based on a difference between the predictive samples and the partition pixels (274), and reconstructs the partition based on the samples and the residual data (276). Alternatively, video encoder 20 generates and compresses the motion information (270) and encodes the non-rectangular partitions (276) based on the motion information and the residual data. The residual data for each of the non-rectangular partitions may represent a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.
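
As a non-limiting illustration, the average-value residual described above may be sketched as follows. The rounding of each average to an integer and the decoder-side step of adding the single residual value to every predictive sample of the partition are assumptions made for this sketch only. The names partition_mean, average_residual, and reconstruct_with_average are hypothetical.

```python
def partition_mean(block, mask):
    """Rounded average of the samples that belong to the partition."""
    values = [v for row, m_row in zip(block, mask)
              for v, m in zip(row, m_row) if m]
    return round(sum(values) / len(values))  # integer rounding is an assumption


def average_residual(original, prediction, mask):
    """Encoder side: one residual value per non-rectangular partition."""
    return partition_mean(original, mask) - partition_mean(prediction, mask)


def reconstruct_with_average(prediction, mask, delta):
    """Decoder side (assumed): add the residual to each partition sample."""
    return [[p + delta if m else p for p, m in zip(row, m_row)]
            for row, m_row in zip(prediction, mask)]


mask = [[True, True], [True, False]]       # three-pixel partition of a 2x2 block
original = [[100, 102], [104, 50]]
prediction = [[98, 100], [99, 55]]

delta = average_residual(original, prediction, mask)   # 102 - 99
recon = reconstruct_with_average(prediction, mask, delta)
```

Only one value per partition is carried as residual in this sketch, rather than one value per pixel.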

[0255] As shown in the example of FIG. 18, video decoder 30 may be configured to obtain motion information, e.g., from video encoder 20 via the encoded bitstream, for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, wherein the motion information indicates inter-predictive samples for the non-rectangular partitions, obtain residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstruct the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data. In this example, the motion information for the non-rectangular partitions is compressed when a size of the PU is smaller than a predetermined size of KxK pixels. That is, video encoder 20 encodes and video decoder 30 is configured to receive and process compressed motion information when a size of the PU is smaller than a predetermined size of KxK pixels.

[0256] In a similar manner, video encoder 20 may be configured to generate motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, compress the motion information for the non-rectangular partitions when a size of the PU is smaller than a predetermined size of KxK pixels, select inter-predictive samples for the non-rectangular partitions based on the compressed information, obtain residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encode the non-rectangular partitions of the PU based at least in part on the compressed motion information and the residual data.

[0257] The motion information may be compressed, for example, by video encoder 20 to reduce the size and/or number of entries of a line buffer used in video decoder 30 to store motion information for the non-rectangular partitions. In some examples, the motion information is not compressed by video encoder 20 when the size of the PU is greater than or equal to KxK pixels. As an example, K may be equal to 16. In one example, the motion information is compressed to include a reference index and motion vector for a first reference picture list and exclude a reference index and motion vector for a second reference picture list.

[0258] In another example, the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions that covers a particular pixel of the PU, and exclude a reference index and motion vector for the other of the non-rectangular partitions, which does not cover the particular pixel of the PU, wherein the included reference index and motion information is used for each of the non-rectangular partitions. As an example, the particular pixel may be a top-left pixel of the PU.

[0259] In another example, the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions having a smallest reference index value to a selected reference picture list, and exclude a reference index and motion vector for the other of the non-rectangular partitions, wherein the included reference index and motion information is used for each of the non-rectangular partitions. In this example, when the reference indexes of the non-rectangular partitions to the selected reference picture list are the same, the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions having a largest motion vector magnitude, and exclude a reference index and motion vector for the other of the non-rectangular partitions, wherein the included reference index and motion information is used for each of the non-rectangular partitions.
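
As a non-limiting illustration, the size-gated compression of the preceding examples, using the smallest-reference-index rule with the motion vector magnitude as a tie-breaker, may be sketched as follows. The sketch assumes a square PU described by its width, measures magnitude as a squared Euclidean length, and keeps the first partition when the magnitudes are also equal. These choices, and the names MotionInfo and compress_motion, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MotionInfo:
    ref_idx: int          # reference index into the selected reference picture list
    mv: Tuple[int, int]   # motion vector (x, y)


def mv_magnitude_sq(mv: Tuple[int, int]) -> int:
    # Squared Euclidean length; the magnitude measure is an assumption.
    return mv[0] * mv[0] + mv[1] * mv[1]


def compress_motion(pu_size: int, first: MotionInfo, second: MotionInfo,
                    k: int = 16) -> List[MotionInfo]:
    """Return the motion information stored for the two partitions."""
    if pu_size >= k:
        return [first, second]          # not compressed for PUs of KxK or larger
    if first.ref_idx != second.ref_idx:
        keep = first if first.ref_idx < second.ref_idx else second
    elif mv_magnitude_sq(first.mv) >= mv_magnitude_sq(second.mv):
        keep = first                    # tie on magnitude keeps the first (assumption)
    else:
        keep = second
    return [keep, keep]                 # the kept set is used for both partitions


a = MotionInfo(ref_idx=1, mv=(4, 0))
b = MotionInfo(ref_idx=0, mv=(1, 1))
c = MotionInfo(ref_idx=0, mv=(3, -4))

small_pu = compress_motion(8, a, b)     # b has the smaller reference index
tie_break = compress_motion(8, b, c)    # same index, c has the larger vector
large_pu = compress_motion(16, a, b)    # left uncompressed
```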

[0260] FIG. 19 is a flow diagram illustrating another method for compression of motion information for non-rectangular partitions in a non-rectangular depth inter prediction mode. The method illustrated in FIG. 19 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the sixth aspect described above.

[0261] As shown in FIG. 19, upon generating or receiving a depth PU with non-rectangular partitions (280), video encoder 20 and/or video decoder 30 determines whether TMVP is enabled (282). If not (282), video encoder 20 and/or video decoder 30 generates inter-predictive samples for the non-rectangular partitions using uncompressed motion information (288). If so (282), video encoder 20 and/or video decoder 30 generate inter-predictive samples for the non-rectangular partitions using compressed motion information (284). In this case, video encoder 20 may transmit compressed motion information. The motion information may be compressed in any of the ways described in this disclosure, or in other ways, when TMVP is enabled. Video encoder 20 and/or video decoder 30 then generates or obtains, as applicable, residual data based on the predictive samples and partition pixels (286), and encodes or reconstructs, as applicable, the partitions (290). The residual data for each of the non-rectangular partitions may represent a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

[0262] FIG. 19 represents an example of a method in which a video decoder 30 is configured to obtain motion information for non-rectangular partitions, such as, e.g., wedgelet partitions, of video data of a prediction unit (PU) of a depth view component of a picture, wherein the motion information indicates inter-predictive samples for the non-rectangular partitions, obtain residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstruct the non-rectangular partitions of the PU based at least in part on the inter-predictive samples indicated by the motion information and the residual data, wherein the motion information for the non-rectangular partitions is compressed when temporal motion vector prediction (TMVP) is enabled for the motion information.

[0263] Likewise, video encoder 20 may be configured to generate motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, compress the motion information for the non-rectangular partitions when temporal motion vector prediction (TMVP) is enabled for the motion information, select inter-predictive samples for the non-rectangular partitions based on the compressed information, obtain residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encode the non-rectangular partitions of the PU based at least in part on the compressed motion information and the residual data.

[0264] The motion information may be compressed, for example, by video encoder 20 to reduce the size and/or number of entries of a line buffer used in video decoder 30 to store motion information for the non-rectangular partitions. As an example, the motion information may be compressed when the size of the PU is equal to 16x16 pixels.

[0265] In one example, the motion information is compressed to include a reference index and motion vector for one of the non-rectangular partitions that covers a particular set of one or more pixels of the PU, and exclude a reference index and motion vector for the other of the non-rectangular partitions, which does not cover the particular set of one or more pixels of the PU, wherein the included reference index and motion information is used for each of the non-rectangular partitions. In this example, the particular set of one or more pixels is a top-left pixel of the PU. In another example, the particular set of one or more pixels is a set of one or more center pixels of the PU.
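
As a non-limiting illustration, the TMVP-gated compression described above may be sketched as follows. The sketch assumes a per-pixel partition map holding 0 or 1 and a two-entry list of motion information, one entry per partition. Taking the pixel at (height // 2, width // 2) as the single center pixel is an assumption, as are the name representative_motion and the example values.

```python
def representative_motion(partition_map, motions, tmvp_enabled, pixel="top_left"):
    """Return the motion information stored for the two partitions.

    partition_map: 2-D list, 0 or 1 per pixel, naming the covering partition.
    motions:       [motion of partition 0, motion of partition 1].
    pixel:         "top_left" or "center" (the particular pixel to test).
    """
    if not tmvp_enabled:
        return list(motions)            # uncompressed: both sets are kept
    height, width = len(partition_map), len(partition_map[0])
    y, x = (0, 0) if pixel == "top_left" else (height // 2, width // 2)
    keep = motions[partition_map[y][x]]
    return [keep, keep]                 # the kept set serves both partitions


# Hypothetical wedge: partition 1 covers pixels with x + y >= 3 in a 4x4 PU.
partition_map = [[1 if x + y >= 3 else 0 for x in range(4)] for y in range(4)]
motions = [(0, (2, -1)), (1, (0, 5))]   # (reference index, motion vector)

by_top_left = representative_motion(partition_map, motions, True)
by_center = representative_motion(partition_map, motions, True, pixel="center")
uncompressed = representative_motion(partition_map, motions, False)
```

The two pixel choices can select different partitions for the same PU, which is why the particular pixel would need to be fixed identically at the encoder and the decoder.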

[0266] The motion information may be compressed to exclude any motion information for a TMVP candidate of a merge list comprising a plurality of motion information candidates.

[0267] FIG. 20 is a flow diagram illustrating a method for selecting motion information for motion prediction from a merge candidate having non-rectangular partitions. The method illustrated in FIG. 20 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the seventh aspect described above.

[0268] As shown in FIG. 20, upon generating or receiving a block of a CU of a depth video component (300) and selecting a merge candidate from a merge candidate list (302), video encoder 20 and/or video decoder 30 determines whether the selected merge candidate block has non-rectangular partitions, such as, e.g., wedgelet partitions (304). The merge candidate list may include spatial neighbor blocks, a TMVP, and/or other candidates, which may be used for motion prediction for merge mode, skip mode, or AMVP mode. If the candidate block has non-rectangular partitions (304), video encoder 20 and/or video decoder 30 selects motion information for the block to be coded from one of the non-rectangular partitions that covers a selected pixel in the candidate block (306). If not (304), or after selecting the motion information (306), video encoder 20 and/or video decoder 30 generates inter-predictive samples for the block using the motion information (308), generates or obtains (310), as applicable, residual data, and encodes or reconstructs (312), as applicable, the block for which the motion information was predicted. The residual data for each of the non-rectangular partitions may represent a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

[0269] In the example of FIG. 20, video decoder 30 may be configured to predict motion information for a first block in a first coding unit (CU) of a depth view component of a picture based on motion information of a second block in a second CU, wherein, when the second block includes non-rectangular partitions of video data, the predicted motion information is selected to be motion information from one of the non-rectangular partitions that covers a particular pixel of the second block. Video decoder 30 may generate inter-predictive samples for the first block based on the predicted motion information, obtain residual data representing differences between the inter-predictive samples and pixels of the first block, and reconstruct the first block based at least in part on the inter-predictive samples and the residual data.

[0270] In a similar manner, video encoder 20 may be configured to predict motion information for a first block of video data in a first coding unit (CU) of a depth view component of a picture based on motion information of a second block in a second CU, wherein, when the second block includes non-rectangular partitions of video data, the predicted motion information is motion information from one of the non-rectangular partitions that covers a particular pixel of the second block, generate inter-predictive samples for the first block based on the predicted motion information, generate residual data representing differences between the inter-predictive samples and pixels of the first block, and encode the first block based at least in part on the motion information and the residual data.
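
As a non-limiting illustration, the selection of motion information from a candidate block having non-rectangular partitions may be sketched as follows, using a center pixel of the candidate block as the particular pixel. The CandidateBlock structure, the inherited_motion function, and the use of (height // 2, width // 2) as the center pixel are hypothetical and are assumed only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Motion = Tuple[int, Tuple[int, int]]    # (reference index, (mv_x, mv_y))


@dataclass
class CandidateBlock:
    motions: List[Motion]                            # one entry per partition
    partition_map: Optional[List[List[int]]] = None  # None: no non-rectangular partitions


def inherited_motion(candidate: CandidateBlock) -> Motion:
    """Motion information predicted from the candidate (second) block."""
    if candidate.partition_map is None:
        return candidate.motions[0]
    height = len(candidate.partition_map)
    width = len(candidate.partition_map[0])
    # Use the partition that covers the (assumed) center pixel.
    covering = candidate.partition_map[height // 2][width // 2]
    return candidate.motions[covering]


wedge_candidate = CandidateBlock(
    motions=[(0, (1, 1)), (2, (-3, 0))],
    partition_map=[
        [0, 0, 0, 1],
        [0, 0, 1, 1],
        [0, 1, 1, 1],
        [1, 1, 1, 1],
    ],
)
plain_candidate = CandidateBlock(motions=[(1, (4, 4))])

from_wedge = inherited_motion(wedge_candidate)
from_plain = inherited_motion(plain_candidate)
```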

[0271] In some examples, the particular pixel is a center pixel of the second block. The motion information may indicate a reference index identifying a reference picture for generation of the inter-predictive samples and a motion vector that identifies the inter-predictive samples, and the motion information for the second block may be one of a plurality of candidate sets of motion information in a merge candidate list. The motion information for the second block may be selected by video decoder 30 based on a merge index to the merge candidate list. Video encoder 20 may encode the merge index in the encoded bitstream.

[0272] FIG. 21 is a flow diagram illustrating a method for generating merge candidate lists for motion prediction for non-rectangular partitions in a non-rectangular depth inter prediction mode. The method illustrated in FIG. 21 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the eighth aspect described above.

[0273] As shown in FIG. 21, video encoder 20 and/or video decoder 30 may construct a merge candidate list for a non-rectangular partition such as, e.g., a wedgelet partition. The merge candidate list may be used in prediction of motion information using a merge mode, skip mode, AMVP mode, or other motion prediction modes. In constructing the list, video encoder 20 and/or video decoder 30 identifies adjacent merge candidates (320), i.e., spatial neighbor blocks that are adjacent to the non-rectangular partition, and generates substitute merge candidates (322), e.g., in any of the ways described in this disclosure. Video encoder 20 and/or video decoder 30 generates a merge list with the adjacent and substitute merge candidates (324), selects a merge candidate from the list (326), generates inter-predictive samples using the motion information (328), generates or obtains, as applicable, residual data (330) and encodes or reconstructs, as applicable, the non-rectangular partitions (332). The residual data for each of the non-rectangular partitions represents a difference between an average value of pixels of the respective non-rectangular partition and an average value of the inter-predictive samples for the respective non-rectangular partition.

[0274] In the example of FIG. 21, video decoder 30 may be configured to generate merge candidate lists including candidate blocks for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, predict motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generate inter-predictive samples for the non-rectangular partitions based on the predicted motion information, obtain residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstruct the non-rectangular partitions of the PU based at least in part on the inter-predictive samples and the residual data.

[0275] Video encoder 20 may be configured to generate merge candidate lists including candidate blocks for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a picture, predict motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generate inter-predictive samples for the non-rectangular partitions based on the predicted motion information, generate residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encode the non-rectangular partitions of the PU based at least in part on the predicted motion information and the residual data.

[0276] In some examples, the merge candidate lists are the same for the non-rectangular partitions. Alternatively, the merge candidate lists are different for the non-rectangular partitions. In one example, each of the merge candidate lists includes spatial neighbor blocks, and the merge candidate list for each of the non-rectangular partitions excludes spatial neighbor blocks that are not adjacent to the respective non-rectangular partition. In another example, each of the merge candidate lists includes spatial neighbor blocks, the merge candidate lists are different for the non-rectangular partitions, and each of the merge candidate lists includes at least some spatial neighbor blocks in common with one another. In another example, each of the merge candidate lists includes spatial neighbor blocks, and the spatial neighbor blocks in the merge candidate lists are mutually exclusive.

[0277] As another example, the merge candidate list for each of the non-rectangular partitions includes spatial neighbor blocks that are adjacent to the respective non-rectangular partition and one or more spatial neighbor blocks that are selected as a substitute spatial neighbor block by shifting vertically or horizontally from a first position of one of the spatial neighbor blocks that is not adjacent to the respective non-rectangular partition to a second position of one of the spatial neighbor blocks that is adjacent to the respective non-rectangular partition, and selecting the spatial neighbor block at the second position as the substitute spatial neighbor block.

[0278] As a further example, the merge candidate list for each of the non-rectangular partitions includes spatial neighbor blocks that are adjacent to the respective non-rectangular partition and one or more spatial neighbor blocks that are selected as a substitute spatial neighbor block by shifting vertically or horizontally from a first position of a corner spatial neighbor block that is not adjacent to the respective non-rectangular partition to a second position of one of the spatial neighbor blocks that is not adjacent to the respective non-rectangular partition but corresponds to a position of a spatial neighbor block ordinarily used in a merge candidate list for a PU that does not have non-rectangular partitions.
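
As a non-limiting illustration, the substitution by shifting to an adjacent position may be sketched as follows. The sketch is deliberately simplified and rests on several assumptions: only one left neighbor and one top neighbor are considered, their ordinary starting positions are beside the bottom-left pixel and above the top-right pixel of the PU, and corner neighbors are ignored. The name merge_positions and the position labels are hypothetical.

```python
def merge_positions(partition_map, partition):
    """Spatial neighbor positions for one partition of a PU.

    partition_map: 2-D list, 0 or 1 per pixel, naming the covering partition.
    Returns ("left", y) and ("top", x) entries naming the PU pixel beside
    which each chosen neighbor lies.
    """
    size = len(partition_map)
    positions = []
    # Left neighbor: start beside the bottom-left pixel and shift vertically
    # until the adjoining PU pixel belongs to the partition.
    for y in range(size - 1, -1, -1):
        if partition_map[y][0] == partition:
            positions.append(("left", y))
            break
    # Top neighbor: start above the top-right pixel and shift horizontally.
    for x in range(size - 1, -1, -1):
        if partition_map[0][x] == partition:
            positions.append(("top", x))
            break
    return positions


partition_map = [
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
]
list_for_p0 = merge_positions(partition_map, 0)  # both neighbors are substitutes
list_for_p1 = merge_positions(partition_map, 1)  # ordinary positions are adjacent
```

In this sketch a partition that does not touch an edge of the PU simply receives no candidate from that edge; filling such gaps is left open.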

[0279] Video decoder 30 may select candidates from the merge candidate lists for each of the non-rectangular partitions based on merge index values received in an encoded bitstream. Video encoder 20 may select the candidates and encode merge index values to be received by video decoder 30 in the encoded bitstream.

[0280] FIG. 22 is a flow diagram illustrating a method for generating motion vector inheritance (MVI) candidates for a merge candidate list for motion prediction for non-rectangular partitions in a non-rectangular depth inter prediction mode. The method illustrated in FIG. 22 represents an example of techniques that may be performed by video encoder 20 and/or decoder 30 in accordance with the ninth aspect described above.

[0281] As shown in FIG. 22, video encoder 20 and/or video decoder 30 identifies a co-located texture block (340) and generates an MVI merge candidate (or a plurality of MVI merge candidates) based on the co-located texture block (342). The texture block may be considered to be co-located with a non-rectangular partition in a number of ways as described in this disclosure. Upon generating a merge list with the MVI merge candidate (or candidates) (344), video encoder 20 and/or video decoder 30 selects a merge candidate (346) for a non-rectangular partition, and generates inter-predictive samples using the motion information (348). Video encoder 20 and/or video decoder 30 generates or obtains, as applicable, residual data based on the predictive samples and the pixels of the partition (350). Video decoder 30 may receive the residual information and the motion information, e.g., a merge index, from video encoder 20 in an encoded bitstream. Video encoder 20 and/or video decoder 30 encodes or reconstructs, as applicable, the non-rectangular partition of the depth PU using the motion information and the residual data (352).

[0282] FIG. 22 represents an example of a method performed by a video decoder 30 configured to generate motion vector inheritance (MVI) candidates for merge candidate lists for prediction of motion information for non-rectangular partitions, such as, e.g., wedgelet partitions, of video data of a prediction unit (PU) of a depth view component of a view of a picture, predict motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generate inter-predictive samples for the non-rectangular partitions based on the predicted motion information, obtain residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and reconstruct the non-rectangular partitions of the PU based at least in part on the inter-predictive samples and the residual data, wherein the MVI candidate for each of the non-rectangular partitions includes a block of a texture view component, and wherein the block is selected based on at least partial co-location of the respective non-rectangular partition and the block.

[0283] A video encoder 20, in this example, may be configured to generate motion vector inheritance (MVI) candidates for merge candidate lists for prediction of motion information for non-rectangular partitions of video data of a prediction unit (PU) of a depth view component of a view of a picture, predict motion information for the non-rectangular partitions based on selected candidates from the merge candidate lists, generate inter-predictive samples for the non-rectangular partitions based on the predicted motion information, generate residual data representing differences between the inter-predictive samples and pixels of the non-rectangular partitions, and encode the non-rectangular partitions of the PU based at least in part on the predicted motion information and the residual data, wherein the MVI candidate for each of the non-rectangular partitions includes a block of a texture view component, and wherein the block is selected based on at least partial co-location of the respective non-rectangular partition and the block.

[0284] In some examples, the co-location comprises co-location of a particular pixel of the respective non-rectangular partition with a pixel of the block. Alternatively, the co-location comprises co-location of a corner pixel of the respective non-rectangular partition with a pixel of the block.
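
As a non-limiting illustration, the selection of a co-located texture block for an MVI candidate may be sketched as follows. The sketch assumes that the texture and depth view components have the same resolution, that texture motion is stored per 4x4 grid cell, and that the particular pixel is the first pixel of the partition in raster order. These assumptions, and the names first_pixel and mvi_candidate, are hypothetical.

```python
def first_pixel(mask):
    """First partition pixel in raster order (assumed 'particular pixel')."""
    for y, row in enumerate(mask):
        for x, inside in enumerate(row):
            if inside:
                return y, x
    raise ValueError("partition mask is empty")


def mvi_candidate(mask, pu_origin, texture_motion, grid=4):
    """Motion of the texture block co-located with the partition's pixel.

    mask:           True where a pixel belongs to the partition.
    pu_origin:      (row, column) of the PU's top-left pixel in the picture.
    texture_motion: dict mapping a grid cell (row, column) to motion information.
    grid:           texture motion storage granularity in pixels (assumption).
    """
    y, x = first_pixel(mask)
    pic_y, pic_x = pu_origin[0] + y, pu_origin[1] + x
    return texture_motion[(pic_y // grid, pic_x // grid)]


# Hypothetical 8x8 PU split by an anti-diagonal wedge.
mask_b = [[x + y >= 7 for x in range(8)] for y in range(8)]
mask_a = [[not inside for inside in row] for row in mask_b]
texture_motion = {(4, 8): (0, (1, 0)), (4, 9): (1, (-2, 3))}

candidate_a = mvi_candidate(mask_a, (16, 32), texture_motion)
candidate_b = mvi_candidate(mask_b, (16, 32), texture_motion)
```

Here the two partitions of the same PU inherit motion from different texture blocks, because their particular pixels fall in different grid cells.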

[0285] It should be understood that the operations shown in FIGS. 14-22 are described for purposes of example. That is, the operations shown in FIGS. 14-22 need not necessarily be performed in the order shown, and fewer, additional, or alternative steps may be performed.

[0286] The techniques described above may be performed by video encoder 20 (FIGS. 1 and 12) and/or video decoder 30 (FIGS. 1 and 13), both of which may be generally referred to as a video coder. In addition, video coding may generally refer to video encoding and/or video decoding, as applicable.

[0287] While the techniques of this disclosure are generally described with respect to 3D-HEVC, the techniques are not limited in this way. The techniques described above may also be applicable to other current standards or future standards not yet developed. For example, the techniques for depth coding may also be applicable to other current or future standards requiring coding of a depth component.

[0288] In one or more examples, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

[0289] By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transient media, but are instead directed to non-transient, tangible storage media. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

[0290] Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

[0291] The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

[0292] Various examples have been described. These and other examples are within the scope of the following claims.