

Title:
DEVICES AND METHODS FOR SEQUENTIAL CODING FOR POINT CLOUD COMPRESSION
Document Type and Number:
WIPO Patent Application WO/2022/131948
Kind Code:
A1
Abstract:
An encoding apparatus (110) for compression of a point cloud (100) is disclosed. The encoding apparatus (110) comprises a processing circuitry (111) configured to determine a space filling curve connecting the points of the point cloud (100) in a linear order, wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud (100) along the space filling curve. The processing circuitry (111) is further configured to determine for each point of the point cloud (100) a residual position vector between the position of the point and the position of a preceding point of the point cloud (100) along the space filling curve. The processing circuitry (111) is further configured to encode the plurality of residual position vectors. Moreover, a corresponding decoding apparatus (130) is disclosed.

Inventors:
FILIPPOV ALEXEY KONSTANTINOVICH (CN)
RUFITSKIY VASILY ALEXEEVICH (CN)
Application Number:
PCT/RU2020/000687
Publication Date:
June 23, 2022
Filing Date:
December 14, 2020
Assignee:
HUAWEI TECH CO LTD (CN)
FILIPPOV ALEXEY KONSTANTINOVICH (CN)
International Classes:
H04N19/88; H04N19/593; H04N19/91
Foreign References:
US20190116372A12019-04-18
US20200217937A12020-07-09
Other References:
STEFAN GUMHOLD ET AL: "Predictive point-cloud compression", 20050731; 1077952576 - 1077952576, 31 July 2005 (2005-07-31), pages 137 - es, XP058302295, DOI: 10.1145/1187112.1187277
MATHEUS GADELHA ET AL: "Multiresolution Tree Networks for 3D Point Cloud Processing", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 10 July 2018 (2018-07-10), XP081245851
"G-PCC codec description", no. n19331, 25 June 2020 (2020-06-25), XP030289576, Retrieved from the Internet [retrieved on 20200625]
Attorney, Agent or Firm:
LAW FIRM "GORODISSKY & PARTNERS" LTD. (RU)
Claims:
CLAIMS

1. An encoding apparatus (110) for compression of a point cloud (100) in a multidimensional space, each point (101) of the point cloud (100) having a position defined by a plurality of coordinate values associated with a plurality of coordinate axes, wherein the encoding apparatus (110) comprises: a processing circuitry (111) configured to determine a space filling curve (1100) connecting the points of the point cloud (100) in a linear order, wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud (100) along the space filling curve (1100); wherein the processing circuitry (111) is further configured to determine for each point of the point cloud (100) a residual position vector between the position of the point and the position of a preceding point of the point cloud (100) along the space filling curve (1100); and wherein the processing circuitry (111) is further configured to encode the plurality of residual position vectors.

2. The encoding apparatus (110) of claim 1, wherein the processing circuitry (111) is configured to encode the plurality of residual position vectors using an entropy encoding scheme.

3. The encoding apparatus (110) of claim 1 or 2, wherein the processing circuitry (111) is further configured to encode one or more parameters defining the space filling curve (1100).

4. The encoding apparatus (110) of any one of the preceding claims, wherein for the at least one of the plurality of coordinate axes the processing circuitry (111) is configured to encode only an absolute value of the corresponding component of a respective residual position vector.

5. The encoding apparatus (110) of any one of the preceding claims, wherein the point cloud (100) is bounded by a bounding box (1400) defined by a plurality of boundaries and wherein the space filling curve (1100) wraps at least once around one of the boundaries of the bounding box (1400).

6. The encoding apparatus (110) of any one of the preceding claims, wherein the point cloud (100) is a sub-cloud of a larger point cloud, wherein the processing circuitry (111) is further configured to determine a further space filling curve (1100') connecting the points of a further sub-cloud of the larger point cloud in a linear order, wherein for the at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the further sub-cloud along the further space filling curve (1100'); wherein the processing circuitry (111) is further configured to determine for each point of the further sub-cloud a residual position vector between the position of the point and the position of a preceding point of the further sub-cloud along the further space filling curve (1100'); and wherein the processing circuitry (111) is further configured to encode the plurality of residual position vectors of each point of the further sub-cloud.

7. The encoding apparatus (110) of claim 6, wherein the processing circuitry (111) is further configured to partition the multi-dimensional space into at least a first sub-space and a second sub-space, wherein the space filling curve (1100) connects the points within the first sub-space and the further space filling curve (1100') connects the points within the second sub-space.

8. An encoding method (1800) for compression of a point cloud (100) in a multidimensional space, each point of the point cloud (100) having a position defined by a plurality of coordinate values associated with a plurality of coordinate axes, wherein the encoding method (1800) comprises: determining (1801) a space filling curve (1100) connecting the points of the point cloud (100) in a linear order, wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud (100) along the space filling curve (1100); determining (1803) for each point of the point cloud (100) a residual position vector between the position of the point and the position of a preceding point of the point cloud (100) along the space filling curve (1100); and encoding (1805) the plurality of residual position vectors.

9. A decoding apparatus (130) for reconstructing a point cloud (100) based on a bitstream (120) in a multi-dimensional space, each point (101) of the point cloud (100) having a position defined by a plurality of coordinate values associated with a plurality of coordinate axes, wherein the decoding apparatus (130) comprises: a processing circuitry (131) configured to decode a plurality of residual position vectors based on the bitstream (120) and a space filling curve (1100), wherein for each point (101) of the point cloud (100) the residual position vector defines a difference between the position of the currently processed point and the position of a preceding point of the point cloud (100) along the space filling curve (1100), wherein the space filling curve (1100) defines a coding order of the points of the point cloud (100) and wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud (100) along the space filling curve (1100); wherein the processing circuitry (131) is further configured to reconstruct the currently processed point on the basis of the position of the preceding point of the point cloud (100) along the space filling curve (1100) and the residual position vector.

10. The decoding apparatus (130) of claim 9, wherein the processing circuitry (131) is configured to decode the plurality of residual position vectors using an entropy encoding scheme.

11. The decoding apparatus (130) of claim 9 or 10, wherein the processing circuitry (131) is further configured to decode one or more parameters defining the space filling curve (1100) from the bitstream (120).

12. The decoding apparatus (130) of any one of claims 9 to 11, wherein for the at least one of the plurality of coordinate axes the processing circuitry (131) is configured to decode only an absolute value of the corresponding component of a respective residual position vector.

13. The decoding apparatus (130) of any one of claims 9 to 12, wherein the point cloud (100) is bounded by a bounding box (1400) defined by a plurality of boundaries and wherein the space filling curve (1100) wraps at least once around one of the boundaries of the bounding box (1400).

14. The decoding apparatus (130) of any one of the preceding claims, wherein the point cloud (100) is a sub-cloud of a larger point cloud, wherein the processing circuitry (131) is further configured to decode a further space filling curve (1100') defining a further coding order of the points of a further sub-cloud of the larger point cloud, wherein for the at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the further sub-cloud along the further space filling curve (1100'); and wherein the processing circuitry (131) is further configured to decode for each point of the further sub-cloud a residual position vector between the position of the point and the position of a preceding point of the further sub-cloud along the further space filling curve (1100').

15. A decoding method (1900) for reconstructing a point cloud (100) based on a bitstream (120) in a multi-dimensional space, each point (101) of the point cloud (100) having a position defined by a plurality of coordinate values associated with a plurality of coordinate axes, wherein the decoding method (1900) comprises: decoding (1901) a plurality of residual position vectors based on the bitstream (120) and a space filling curve (1100), wherein for each point (101) of the point cloud (100) the residual position vector defines a difference between the position of the currently processed point and the position of a preceding point of the point cloud (100) along the space filling curve (1100), wherein the space filling curve (1100) defines a coding order of the points of the point cloud (100) and wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud (100) along the space filling curve (1100); and reconstructing (1903) the currently processed point on the basis of the position of the preceding point of the point cloud (100) along the space filling curve (1100) and the residual position vector.

16. A computer program product comprising a non-transitory computer-readable storage medium for storing program code which causes a computer or a processor to perform the method (1800) of claim 8 or the method (1900) of claim 15 when the program code is executed by the computer or the processor.

Description:
DESCRIPTION

Devices and methods for sequential coding for point cloud compression

TECHNICAL FIELD

The present disclosure relates to image processing in general. More specifically, the disclosure relates to an encoding apparatus and method for lossless encoding of a point cloud as well as a decoding apparatus and method for lossless decoding, i.e. reconstructing, a point cloud using sequential coding.

BACKGROUND

Point clouds have been considered as a candidate format for the transmission, in particular, of two-dimensional (2D) or three-dimensional (3D) data, either captured by 3D scanners or LIDAR sensors, or used in popular applications such as Virtual Reality/Augmented Reality (VR/AR). A point cloud is a set of points in space. Besides the spatial position, each point usually has associated attributes, such as color or even reflectance, as well as temporal timestamps. In order to obtain a high-fidelity representation of the target objects, devices capture point clouds comprising in the order of thousands or even millions of points. Moreover, for dynamic scenes used, for instance, in VR/AR applications, every single frame often has a unique dense point cloud, which results in the transmission of several millions of points per second. For a viable transmission of such a large amount of data, compression is often applied.

Often the positions of the points of a point cloud do not have integer coordinate values, because the points being compressed, as described above, may have been generated by visual search algorithms or by real-world 3D capturing (e.g., by means of 3D scanners, such as radar or lidar devices). These non-integer point positions could represent positions of some vertices of a 3D object and have a correspondence with real-world length units (e.g. meters, centimeters, millimeters, etc.). Hence, the positions of the points of a point cloud may have fractional parts, wherein the size of these fractional parts usually depends on the accuracy that is required by a specific application.

SUMMARY

It is an object of the invention to provide devices and methods for encoding and decoding a point cloud with an improved sequential coding. The foregoing and other objects are achieved by the subject matter of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.

According to a first aspect an encoding apparatus for lossless compression of a point cloud comprising a plurality of points in a multi-dimensional space, in particular a 2D or 3D space, is provided. Each point of the point cloud has a position defined by a plurality of coordinate values associated with a corresponding plurality of coordinate axes.

The encoding apparatus comprises a processing circuitry configured to determine a space filling curve connecting the points of the point cloud in a linear order, wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud along the space filling curve. The processing circuitry is further configured to determine for each point of the point cloud a residual position vector between the position of the point and the position of a preceding point of the point cloud along the space filling curve. The processing circuitry is further configured to encode the plurality of residual position vectors.

Advantageously, the encoding apparatus according to the first aspect allows lossless coding of a point cloud with an improved compression performance. This is because the coding order in which the points of a point cloud are being coded affects the efficiency of prediction methods and, thus, the coding efficiency. More specifically, a known coding order may imply constraints on the values of the coordinates being coded. These constraints are a source of redundancy that is advantageously taken into consideration by the encoding apparatus according to the first aspect.

In a further possible implementation form of the first aspect, the processing circuitry is configured to encode the plurality of residual position vectors using an entropy encoding scheme.

In a further possible implementation form of the first aspect, the processing circuitry is further configured to encode one or more parameters defining the space filling curve. In an implementation form, the one or more parameters defining the space filling curve may define, for instance, the sorting type, the sorting coordinate, whether wrapping occurs, and a sign-coding parameter of the space filling curve.

In a further possible implementation form of the first aspect, for the at least one of the plurality of coordinate axes the processing circuitry is configured to encode only an absolute value of the corresponding component of a respective residual position vector. In other words, for the sorting coordinate the processing circuitry may be configured to use only the absolute value of the residual for that coordinate for encoding, while for the other coordinates the absolute value as well as the sign of the residual are encoded.

In a further possible implementation form of the first aspect, the point cloud is bounded by a bounding box defined by a plurality of boundaries and wherein the space filling curve wraps at least once around one of the boundaries of the bounding box.

In a further possible implementation form of the first aspect, the point cloud is a sub-cloud of a larger point cloud, wherein the processing circuitry is further configured to determine a further space filling curve connecting the points of a further sub-cloud of the larger point cloud in a linear order, wherein for the at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the further sub-cloud along the further space filling curve; wherein the processing circuitry is further configured to determine for each point of the further sub-cloud a residual position vector between the position of the point and the position of a preceding point of the further sub-cloud along the further space filling curve; and wherein the processing circuitry is further configured to encode the plurality of residual position vectors of each point of the further sub-cloud.

In a further possible implementation form of the first aspect, the processing circuitry is further configured to partition the multi-dimensional space into at least a first sub-space and a second sub-space, wherein the space filling curve connects the points within the first subspace and the further space filling curve connects the points within the second sub-space.

According to a second aspect an encoding method for lossless compression of a point cloud comprising a plurality of points in a multi-dimensional space, in particular a 2D or 3D space, is provided. Each point of the point cloud has a position defined by a plurality of coordinate values associated with a corresponding plurality of coordinate axes. The encoding method comprises the steps of: determining a space filling curve connecting the points of the point cloud in a linear order, wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud along the space filling curve; determining for each point of the point cloud a residual position vector between the position of the point and the position of a preceding point of the point cloud along the space filling curve; and encoding the plurality of residual position vectors.

The encoding method according to the second aspect of the present disclosure can be performed by the encoding apparatus according to the first aspect of the present disclosure. Thus, further features of the encoding method according to the second aspect of the present disclosure result directly from the functionality of the encoding apparatus according to the first aspect of the present disclosure as well as its different implementation forms described above and below.

According to a third aspect a decoding apparatus for reconstructing a point cloud based on an encoded bitstream in a multi-dimensional space is provided. Each point of the point cloud has a position defined by a plurality of coordinate values associated with a plurality of coordinate axes.

The decoding apparatus comprises a processing circuitry configured to decode a plurality of residual position vectors based on the bitstream and a space filling curve, wherein for each point of the point cloud the residual position vector defines a difference between the position of the currently processed point and the position of a preceding point of the point cloud along the space filling curve, wherein the space filling curve defines a coding order of the points of the point cloud and wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud along the space filling curve.

The processing circuitry of the decoding apparatus is further configured to reconstruct the currently processed point on the basis of the position of the preceding point of the point cloud along the space filling curve and the residual position vector.

In a further possible implementation form of the third aspect, the processing circuitry is configured to decode the plurality of residual position vectors using an entropy encoding scheme.

In a further possible implementation form of the third aspect, the processing circuitry is further configured to decode one or more parameters defining the space filling curve from the bitstream.

In a further possible implementation form of the third aspect, for the at least one of the plurality of coordinate axes the processing circuitry is configured to decode only an absolute value of the corresponding component of a respective residual position vector.

In a further possible implementation form of the third aspect, the point cloud is bounded by a bounding box defined by a plurality of boundaries and wherein the space filling curve wraps at least once around one of the boundaries of the bounding box.

In a further possible implementation form of the third aspect, the point cloud is a sub-cloud of a larger point cloud, wherein the processing circuitry is further configured to decode a further space filling curve defining a further coding order of the points of a further sub-cloud of the larger point cloud, wherein for the at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the further sub-cloud along the further space filling curve; and wherein the processing circuitry is further configured to decode for each point of the further sub-cloud a residual position vector between the position of the point and the position of a preceding point of the further sub-cloud along the further space filling curve.

According to a fourth aspect a decoding method for reconstructing a point cloud based on a bitstream in a multi-dimensional space is provided. Each point of the point cloud has a position defined by a plurality of coordinate values associated with a plurality of coordinate axes. The decoding method comprises the steps of: decoding a plurality of residual position vectors based on the bitstream and a space filling curve, wherein for each point of the point cloud the residual position vector defines a difference between the position of the currently processed point and the position of a preceding point of the point cloud along the space filling curve, wherein the space filling curve defines a coding order of the points of the point cloud and wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud along the space filling curve; and reconstructing the currently processed point on the basis of the position of the preceding point of the point cloud along the space filling curve and the residual position vector.

According to a fifth aspect a computer program product is provided, comprising program code which causes a computer or a processor to perform the encoding method according to the second aspect or the decoding method according to the fourth aspect, when the program code is executed by the computer or the processor.

Details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following, embodiments of the invention are described in more detail with reference to the attached figures and drawings, in which:

Figure 1 is a schematic diagram illustrating a point cloud encoding apparatus and decoding apparatus according to an embodiment;

Figure 2 is a schematic diagram illustrating further details of the encoding apparatus according to an embodiment;

Figure 3 is a schematic diagram illustrating further details of the decoding apparatus according to an embodiment;

Figure 4 is a flow diagram illustrating a conventional floating-point encoding method;

Figure 5 is a schematic diagram illustrating the joint encoding of a most significant bit and the absolute value of a residual signal;

Figure 6 is a schematic diagram illustrating a conventional decoding mechanism;

Figure 7 is a diagram illustrating a Lorenzo predictor;

Figure 8 is a diagram illustrating a further mechanism for generating a predictor;

Figure 9 is a diagram illustrating a further mechanism for generating a predictor implemented by the G-PCC standard;

Figure 10 is a schematic diagram illustrating processing blocks implemented by an encoding apparatus according to an embodiment;

Figure 11 shows a 2D point cloud encoded by an encoding apparatus according to an embodiment using a space filling curve;

Figure 12 is a flow diagram illustrating processing blocks implemented by an encoding apparatus according to an embodiment;

Figures 13a and 13b show flow diagrams illustrating processing blocks implemented by a decoding apparatus according to an embodiment;

Figure 14 shows a 2D point cloud encoded by an encoding apparatus according to an embodiment using a wrapped space filling curve;

Figure 15 is a flow diagram illustrating processing blocks, including a context coding, implemented by an encoding apparatus according to an embodiment and a decoding apparatus according to an embodiment;

Figure 16 shows a point cloud including three sub-clouds encoded by an encoding apparatus according to an embodiment;

Figures 17a and 17b show different space filling curves used by the encoding apparatus according to an embodiment for encoding a point cloud;

Figure 18 is a flow diagram illustrating a point cloud encoding method according to an embodiment; and

Figure 19 is a flow diagram illustrating a point cloud decoding method according to an embodiment.

In the following, identical reference signs refer to identical or at least functionally equivalent features.

DETAILED DESCRIPTION OF THE EMBODIMENTS

In the following description, reference is made to the accompanying figures, which form part of the disclosure, and which show, by way of illustration, specific aspects of embodiments of the invention or specific aspects in which embodiments of the present invention may be used. It is understood that embodiments of the invention may be used in other aspects and comprise structural or logical changes not depicted in the figures. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.

For instance, it is to be understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if one or a plurality of specific method steps are described, a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a specific apparatus is described based on one or a plurality of units, e.g. functional units, a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units), even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless specifically noted otherwise.

Figure 1 shows an encoding apparatus 110 and a decoding apparatus 130 according to an embodiment. The encoding apparatus 110 is configured to encode, i.e. compress, a point cloud 100 with a plurality of points 101 into a bitstream 120. Each of the plurality of points 101 of the point cloud 100 may be associated with a position, e.g. 2D or 3D coordinates, and one or more attributes. The bitstream 120 may comprise a geometry bitstream 120a and an attribute bitstream 120b. The decoding apparatus 130 is configured to decode, i.e. decompress, the bitstream into a reconstructed point cloud 100 comprising a plurality of reconstructed points 101.

As illustrated in figure 1, the encoding apparatus 110 comprises a processing circuitry or processor 111 for encoding the point cloud 100 into the bitstream 120. The processor 111 may be implemented in hardware and/or software, which when executed causes the encoding apparatus 110 to perform the functions and methods described herein. The hardware may comprise digital circuitry, or both analog and digital circuitry. Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or general-purpose processors. Moreover, the encoding apparatus 110 may comprise a communication interface 113, such as a wireless communication interface 113 for transmitting the bitstream 120 to the decoding apparatus 130, as well as an electronic memory 115 for storing data, in particular the bitstream 120. Likewise, the decoding apparatus 130 comprises a processing circuitry or processor 131 for reconstructing the point cloud 100 based on the bitstream 120. Moreover, the decoding apparatus 130 may comprise a communication interface 133, such as a wireless communication interface 133 for receiving the bitstream 120 from the encoding apparatus 110, as well as an electronic memory 135 for storing data, in particular the reconstructed point cloud 100.

Figure 2 shows the encoding apparatus (or, for short, encoder) 110 according to an embodiment that produces a geometry bitstream 120a and an attribute bitstream 120b based on the positions 100a and attributes 100b of the input point cloud 100. In an embodiment, the encoder 110 may comprise one or more of the processing blocks shown in figure 2, which may be implemented in hardware and/or software, for instance, by means of the processing circuitry 111 of the encoder 110 shown in figure 1.

The processing block 201 is performed for the set of original point positions 100a to remove bias (the minimum value of the points' positions within a cloud could be non-zero) and to convert the floating-point representation of the positions 100a to an integer representation. More specifically, a parameter T ("translation") may be used to remove bias (so that the minimal value of each coordinate within the output point cloud is zero) and scaling parameters may be used to convert the floating-point representation of the positions 100a of the points into a fixed-point representation, as will be described in the following in more detail.

The original point positions 100a are generally represented by floating point numbers and need not have any structure, lying in an original (or world) coordinate system, denoted as X_orig,n, n = 1, ..., N. Internal (or frame) coordinates, denoted as X_int,n, may be obtained by the processing block 201 based on the original coordinates by the coordinate transformation X_int,n = (X_orig,n - T) / s, where T = (t_x, t_y, t_z) is the translation vector and s is a scaling parameter. The parameters T and s may be selected such that the point positions X_int,n, n = 1, ..., N, lie in a bounding cube [0, 2^d)^3 for some non-negative integer parameter d. Point positions in the internal coordinate system that have been compressed (i.e. encoded) and decompressed (i.e. decoded) are denoted herein as X_n, n = 1, ..., N_out, where N is the number of points of the original point cloud 100 and N_out is the number of points in the decoded point cloud 100. As will be appreciated, N_out may not be the same as N. The decoded point positions in the original coordinate system may be obtained from the decoded point positions in the internal coordinate system by the following coordinate transformation: X_dec,n = s · X_n + T.

This can alternatively be expressed by the following homogeneous transformation from internal to original coordinates: (X_dec,n, 1)^T = [[s·I_3, T], [0, 1]] · (X_n, 1)^T, where I_3 denotes the 3 x 3 identity matrix.
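As an illustration only, the translation and scaling of processing block 201 and its inverse might be sketched as follows in Python. Choosing T as the per-axis minimum and deriving a single scale s from the cloud extent and the bit depth d are assumptions made for this sketch, not the codec's normative parameter derivation.

```python
import numpy as np

def to_internal(points_orig, d):
    """Map original (world) positions to internal frame coordinates.

    Sketch of processing block 201: T removes the per-axis bias and s scales
    the cloud so that all positions fall into the bounding cube [0, 2^d)^3.
    """
    T = points_orig.min(axis=0)                       # translation (bias removal)
    extent = (points_orig - T).max()
    s = extent / (2 ** d - 1) if extent > 0 else 1.0  # scale into [0, 2^d)
    internal = (points_orig - T) / s
    return internal, T, s

def to_original(internal, T, s):
    """Inverse transform from internal back to original coordinates."""
    return s * internal + T

if __name__ == "__main__":
    pts = np.array([[1.25, -3.0, 7.5], [2.0, 0.5, 9.0], [1.75, -1.0, 8.25]])
    x_int, T, s = to_internal(pts, d=10)
    assert np.allclose(to_original(x_int, T, s), pts)   # lossless round trip
```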

The next processing block 202 “Quantize and remove points (voxelize)” may perform the following processing steps. The point positions in the internal coordinate system are rounded. More specifically, if X_int,n denotes a point position in the internal coordinate system, then its representation as a non-negative d-bit integer is obtained as Round(X_int,n), where Round(·) is the function that rounds the components of a vector to the nearest integer. After such a quantization, there may be multiple points with the same position (referred to as duplicate points). These duplicate points may be optionally removed. If enabled, points with the same quantized coordinates are removed. Multiple points with the same quantized position and different attributes may be merged in a single point. The process of position quantization, duplicate point removal, and assignment of attributes to the remaining points is called voxelization, which may be considered as the process of grouping points together into voxels. The set of voxels may be the unit cubes [i - 0.5, i + 0.5) x [j - 0.5, j + 0.5) x [k - 0.5, k + 0.5) for integer values of i, j, and k between 0 and 2^d - 1. Specifically, the locations of all the points within a voxel may be quantized to the voxel centre, and the attributes of all the points within the voxel may be combined (e.g., averaged) and assigned to the voxel. A voxel is said to be occupied if it contains any point of the point cloud 100.
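A minimal sketch of this voxelization step, assuming rounding to the nearest integer and per-voxel attribute averaging; the function name and data layout are illustrative only.

```python
import numpy as np

def voxelize(internal_positions, attributes):
    """Quantize positions to integer voxel coordinates and merge duplicates.

    Points that fall into the same voxel are merged into one point whose
    attribute is the average of the merged attributes.
    """
    quantized = np.round(internal_positions).astype(np.int64)
    voxels = {}
    for pos, attr in zip(map(tuple, quantized), attributes):
        voxels.setdefault(pos, []).append(attr)
    positions = np.array(list(voxels.keys()))
    merged_attributes = np.array([np.mean(a, axis=0) for a in voxels.values()])
    return positions, merged_attributes
```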

If an octree geometry codec is used, then the geometry encoding proceeds as follows. A processing block 203 ("Analyze octree") first may define a cubical axis-aligned bounding box B by the two extreme points (0,0,0) and (2^d, 2^d, 2^d). An octree structure may then be built by recursively subdividing the bounding box B. At each stage, a cube may be subdivided into 8 sub-cubes. An 8-bit code, named an occupancy code, may then be generated by associating a 1-bit value with each sub-cube in order to indicate whether it contains points (i.e., full and has value 1) or not (i.e., empty and has value 0). Only full sub-cubes with a size greater than 1 (i.e., non-voxels) are further subdivided. Since points may be duplicated, multiple points may be mapped to the same sub-cube of size 1 (i.e., the same voxel). In order to handle such a situation, the number of points for each sub-cube of dimension 1 may also be arithmetically encoded by the processing block 205. In an embodiment, the arithmetic encoder provided by the processing block 205 may be used to encode all information in the geometry bitstream 120a.
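The generation of occupancy codes can be sketched as follows. The child-index bit assignment and the depth-first traversal used here are assumptions made for brevity; the actual codec's traversal order and bitstream layout may differ.

```python
def analyze_octree(points, d):
    """Emit 8-bit occupancy codes for a cube of side 2**d containing `points`.

    Each occupied cube is split into 8 sub-cubes; one bit per sub-cube signals
    whether it contains points, and only occupied sub-cubes larger than a
    single voxel are subdivided further. Points are integer (x, y, z) tuples.
    """
    codes = []

    def subdivide(pts, origin, size):
        if size == 1 or not pts:
            return
        half = size // 2
        children = [[] for _ in range(8)]
        for x, y, z in pts:
            idx = (((x - origin[0]) >= half) << 2) | \
                  (((y - origin[1]) >= half) << 1) | \
                  ((z - origin[2]) >= half)
            children[idx].append((x, y, z))
        codes.append(sum(1 << i for i, c in enumerate(children) if c))  # occupancy code
        for i, child in enumerate(children):
            if child:
                off = (origin[0] + half * ((i >> 2) & 1),
                       origin[1] + half * ((i >> 1) & 1),
                       origin[2] + half * (i & 1))
                subdivide(child, off, half)

    subdivide(list(points), (0, 0, 0), 1 << d)
    return codes
```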

In order to guarantee encoder/decoder synchronization, the point order defined by the decoding process may be used during the level of detail generation process. The processing block 204 (“Analyze surface approximation”) may set the block sizes to 2 x 2 x 2 or larger, and the collection of voxels within the block may be represented by some model. The geometry may be represented within each block as a surface that intersects each edge of the block at most once. Since there are 12 edges of a block, there can be at most 12 such intersections within a block. Each such intersection is called a vertex. A vertex along an edge is detected if and only if there is at least one occupied voxel adjacent to the edge among all blocks that share the edge. The position of a detected vertex along an edge may be the average position along the edge of all such voxels adjacent to the edge among all blocks that share the edge.

The processing block 204 may be used to encode a part of octree that is built by processing block 203. The output of processing block 204 (surface parameters) may be encoded by the processing block 205 using entropy coding as follows.

Vertices, nominally being intersections of a surface with edges of a block, are shared across neighbouring blocks, not only guaranteeing continuity across blocks of the reconstructed surface, but also reducing the number of bits required to code the collection of vertices. The set of vertices may be coded in two steps. In a first processing stage implemented by the processing block 205, the set of all the unique edges (or segments) of occupied blocks is computed, and a bit vector (or segment indicator) determines which segments contain a vertex and which do not. In a second processing stage implemented by the processing block 205, for each segment that contains a vertex, the position of the vertex along the segment is uniformly scalar quantized to a small number of levels (typically, but not necessarily equal to the block width if the geometric spatial resolution is desired to approximate the voxel resolution). The segment indicators and the vertex positions are entropy coded by an arithmetic coder implemented by the processing block 205. Thus, the geometry bitstream 120a itself may be a compound bitstream comprising octree, segment indicator, and vertex position bitstreams.

The processing block 207 of the encoder 110 is configured to reconstruct point positions in the same order and to the same values as it would be performed by the decoder 130 when processing, i.e. decoding the geometry bitstream 120a. Each point may have a set of associated values, i.e. each of the values corresponds to a point parameter or attribute, such as color, transparency, etc.

The encoder 110 may comprise a processing block 206 “Transfer attributes” (also referred to as “recolouring”). Given the input point cloud positions/attributes and the reconstructed positions, the purpose of the processing block 206 is to determine the attribute values that minimize the attribute distortions. In an embodiment, the processing block 206 of the encoder 110 may implement the following processing steps:

• Let (X_i), i = 1, ..., N, and (X_rec,i), i = 1, ..., N_rec, be the input and the reconstructed positions, respectively.

• Let N and N_rec be the number of points in the original point cloud 100 and the reconstructed point cloud 100, respectively. If duplicated points are merged, then N_rec < N, otherwise N_rec = N.

• For each point X_rec,i in the reconstructed point cloud 100, let X_i* be its nearest neighbour in the original point cloud 100 and a_i* the attribute value associated with X_i*.

• For each point X_rec,i in the reconstructed point cloud 100, let Q+(i) = {X_i+(h)}, h in {1, ..., H(i)}, be the set of points in the original point cloud 100 that share X_rec,i as their nearest neighbour in the reconstructed point cloud 100, wherein H(i) is the number of elements in Q+(i) and X_i+(h) is one of the elements of Q+(i). Note that Q+(i) could be empty or could have one or multiple elements.

• If Q+(i) is empty, then the attribute value a_i* is associated with the point X_rec,i.

• If Q+(i) is not empty, then the attribute value a_i associated with the point X_rec,i may be determined based on the following equation: a_i = (1/H(i)) · Σ_{h=1}^{H(i)} a_i+(h), where the attribute value a_i+(h) is associated with the point X_i+(h) of the set of points Q+(i).
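A brute-force sketch of this attribute transfer, assuming the averaging rule above; a real implementation would use an accelerated nearest-neighbour search rather than a full distance matrix.

```python
import numpy as np

def transfer_attributes(orig_pos, orig_attr, rec_pos):
    """Attribute transfer ("recolouring") sketch.

    For each reconstructed point, average the attributes of the original
    points that elect it as nearest neighbour (the set Q+(i)); fall back to
    the attribute of the nearest original point when Q+(i) is empty.
    """
    # pairwise distances: rows = reconstructed points, columns = original points
    dists = np.linalg.norm(rec_pos[:, None, :] - orig_pos[None, :, :], axis=2)
    nearest_orig = dists.argmin(axis=1)   # nearest original point -> a_i*
    nearest_rec = dists.argmin(axis=0)    # nearest reconstructed point per original point

    rec_attr = np.empty((len(rec_pos),) + orig_attr.shape[1:], dtype=float)
    for i in range(len(rec_pos)):
        members = np.flatnonzero(nearest_rec == i)            # Q+(i)
        if members.size:
            rec_attr[i] = orig_attr[members].mean(axis=0)
        else:
            rec_attr[i] = orig_attr[nearest_orig[i]]
    return rec_attr
```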

The encoder 110 may further comprise a processing block 213 (“RAHT”; Region Adaptive Haar Transform) configured to transform a list of attributes A_n, n = 1, ..., N, which is an output of processing block 206, into a list of transform coefficients T_n, n = 1, ..., N, given a list of associated voxel locations (x_n, y_n, z_n), n = 1, ..., N, as side information, and to invert the transform. In an embodiment, the processing block 213 may be configured to perform RAHT and its inverse with respect to a hierarchy defined by the Morton codes of the voxel locations. The Morton code of d-bit non-negative integer coordinates x, y, and z is a 3d-bit non-negative integer obtained by interleaving the bits of x, y, and z. More specifically, the Morton code M = morton(x, y, z) of non-negative d-bit integer coordinates x, y, and z with bits x_l, y_l, z_l ∈ {0, 1} from l = 1 (high order) to l = d (low order) may be the non-negative 3d-bit integer M = Σ_{l=1}^{d} 2^{3(d-l)} · (4·x_l + 2·y_l + z_l) = Σ_{l=1}^{3d} 2^{3d-l} · m_l, where m_l ∈ {0, 1} are the bits of M from l = 1 (high order) to l = 3d (low order). Let prefix_l(M) = floor(2^{-(3d-l)} · M) denote the l-bit prefix of M. Let m be such a prefix.
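The bit interleaving and the prefix operation just defined can be illustrated in a few lines; the ordering of the x, y and z bits within each triple follows the formula above and is otherwise an assumption of this sketch.

```python
def morton(x, y, z, d):
    """Interleave the bits of three d-bit coordinates into a 3d-bit Morton code."""
    m = 0
    for l in range(d - 1, -1, -1):      # from high-order to low-order bit positions
        m = (m << 3) | (((x >> l) & 1) << 2) | (((y >> l) & 1) << 1) | ((z >> l) & 1)
    return m

def prefix(m, l, d):
    """Return the l-bit prefix of a 3d-bit Morton code, i.e. floor(m / 2**(3*d - l))."""
    return m >> (3 * d - l)

# x = 101b, y = 011b, z = 110b interleave level by level into 101 011 110b
assert morton(0b101, 0b011, 0b110, d=3) == 0b101_011_110
```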

Define the block at level l with prefix m to be the set of all points (x, y, z) for which m = prefix_l(morton(x, y, z)). Two blocks at level l are considered to be sibling blocks if they have the same (l - 1)-bit prefix. The union of two sibling blocks at level l is a block at level (l - 1) called their parent block. The RAHT (Region Adaptive Haar Transform) of the sequence A_n, n = 1, ..., N, and its inverse as implemented by the processing block 213 may be defined recursively as follows.

Let A_n be the attribute of a point and let T_n be its transform. Then T_n = A_n. Consider two sibling blocks and their parent block. Let (A_01, A_02, ..., A_0w0) and (A_11, A_12, ..., A_1w1) be the attributes of the points (x_n, y_n, z_n) in the sibling blocks, listed in increasing Morton order, and let (T_01, T_02, ..., T_0w0) and (T_11, T_12, ..., T_1w1) be their respective transforms. Similarly, let (A_1, A_2, ..., A_w0+w1) be the attributes of all points (x_n, y_n, z_n) in their parent block, listed in increasing Morton order, and let (T_1, T_2, ..., T_w0+w1) be its transform. In other words, in an embodiment, the transform of the parent block is the concatenation of the transforms of the two sibling blocks, with the exception that the first (DC) components of the transforms of the two sibling blocks are replaced by their weighted sum and difference, and inversely the transforms of the two sibling blocks are copied from the first and last parts of the transform of the parent block, with the exception that the DC components of the transforms of the two sibling blocks are replaced by their weighted difference and sum. These weighted sum and difference operations are known as Givens rotations.
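A minimal sketch of the forward and inverse rotation applied to the DC coefficients of two sibling blocks. The normalisation by sqrt(w0 + w1), with w0 and w1 the numbers of points in the sibling blocks, follows the commonly used RAHT formulation and is an assumption here, not a quotation from the patent.

```python
import math

def raht_butterfly(dc0, dc1, w0, w1):
    """Forward RAHT step: weighted sum/difference (Givens rotation) of the DC terms."""
    a = math.sqrt(w0 / (w0 + w1))
    b = math.sqrt(w1 / (w0 + w1))
    parent_dc = a * dc0 + b * dc1       # weighted sum -> DC of the parent block
    parent_ac = -b * dc0 + a * dc1      # weighted difference -> new AC coefficient
    return parent_dc, parent_ac

def raht_butterfly_inverse(parent_dc, parent_ac, w0, w1):
    """Invert the rotation: recover the DC coefficients of the two sibling blocks."""
    a = math.sqrt(w0 / (w0 + w1))
    b = math.sqrt(w1 / (w0 + w1))
    return a * parent_dc - b * parent_ac, b * parent_dc + a * parent_ac
```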

As illustrated in figure 2, in addition or as an alternative to the RAHT processing block 213, the encoder 110 may further comprise a processing block 208 ("Generate LOD") configured to encode the attributes 100b using the Level-of-Detail (LOD) approach and a processing block 209 implementing a lifting operation. In an embodiment, the selection whether the processing block 213 or the processing blocks 208, 209 are used may be performed by the encoder 110 and signalled in the attribute bitstream 120b.

In an embodiment, the LOD generation processing block 208 is configured to re-organize the points into a set of refinement levels according to a set of Euclidean distances (d_l), l = 0, ..., L-1, specified, for instance, by a user. The distances (d_l), l = 0, ..., L-1, should satisfy the following two conditions: d_l < d_{l-1} and d_{L-1} = 0.

The re-ordering process is deterministic and operates on the quantized positions ordered according to the octree decoding process. It may be applied at both the encoder 110 and the decoder 130 side. More specifically, the re-ordering process implemented by the processing block 208 may comprise the following stages:

• First, all the points are marked as non-visited and the set of visited points, denoted as V, is set as empty.

• Iteratively, at each iteration l, the refinement level R_l is generated as follows: o Iteration is performed over all the points. o If the current point has been visited, then it is ignored. o Otherwise, the minimum distance D of the current point to the set V is computed. o If D is strictly lower than d_l, then the current point is ignored. o If D is higher than or equal to d_l, then the current point is marked as visited and added to both R_l and V. o This is repeated until all the points have been traversed.

• The level of detail l, LOD_l, is obtained by taking the union of the refinement levels R_0, R_1, ..., R_l.

• This is repeated until all the LODs are generated or until all the vertices have been visited.

As will be appreciated, the LOD-based prediction strategy implemented by the processing block 208 makes points in lower LODs more influential, since they are used more often for prediction. More specifically, let w(P) be the influence weight associated with a point P. Then w(P) may be computed by applying the following recursive procedure:

• Set w(P) = 1 for all points

• Traverse the points according to the inverse of the order defined by the LOD structure

• For every point Q(i,j), update the weights of its neighbors P ∈ V(Q(i,j)) as follows: w(P) ← w(P) + w(Q(i,j))

In an embodiment, the processing block 209 may implement an update operator that uses the prediction residuals D(Q(i,j)) to update the attribute values of LOD(j). More specifically, let Δ(P) be the set of points Q(i,j) such that P ∈ V(Q(i,j)). Then, the update operation for P implemented by the processing block 209 may be defined as follows: Update(P) = ( Σ_{Q ∈ Δ(P)} w(Q) · D(Q) ) / ( Σ_{Q ∈ Δ(P)} w(Q) ).
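Returning to the LOD generation of processing block 208, the distance-based re-ordering described above can be summarised in a short sketch; the brute-force distance computation and the helper names are illustrative only.

```python
import math

def generate_refinement_levels(points, distances):
    """Distance-based re-ordering into refinement levels.

    `points` is a list of coordinate tuples and `distances` plays the role of
    the thresholds (d_l). A point enters the current refinement level only if
    its minimum distance to the already visited set V is at least d_l.
    """
    visited = []                         # the set V, kept in visiting order
    refinement_levels = []
    remaining = set(range(len(points)))

    for d_l in distances:
        level = []
        for idx in sorted(remaining):
            d_min = min((math.dist(points[idx], points[v]) for v in visited),
                        default=float("inf"))
            if d_min >= d_l:             # otherwise the point is ignored at this level
                visited.append(idx)
                level.append(idx)
        remaining -= set(level)
        refinement_levels.append(level)  # LOD_l is the union of levels 0..l
        if not remaining:
            break
    return refinement_levels
```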

The encoder 110 may further comprise a processing block 211 ("Quantize coefficients") for quantizing the transform coefficients. The influence weights computed during the transform process may be leveraged in order to guide the quantization process. More specifically, the coefficients associated with a point P are multiplied by a factor of sqrt(w(P)). An inverse scaling process by the same factor is applied after inverse quantization on the decoder 130 side. The scaling factors are completely determined by the reconstructed geometry and, thus, do not need to be encoded in the bitstream 120.

The encoder 110 may further comprise a processing block 212 ("Arithmetic encode") for encoding the quantized coefficients provided by processing block 211 and thereby generating the attribute bitstream 120b.

Figure 3 shows the decoding apparatus (or, for short, decoder) 130 according to an embodiment that is configured to generate decoded positions 100a' and attributes 100b', i.e. the reconstructed point cloud 100, based on the geometry bitstream 120a and the attribute bitstream 120b (provided, for instance, by the encoder 110 of figure 2 based on the input point cloud 100). In an embodiment, the decoder 130 may comprise one or more of the processing blocks shown in figure 3, which may be implemented in hardware and/or software, for instance, by means of the processing circuitry 131 of the decoder 130 shown in figure 1. As will be appreciated, many of the processing blocks of the decoder 130 may be identical to or the inverse of the corresponding processing blocks of the encoder 110 shown in figure 2. Therefore, in the following mainly the differences between the decoder 130 shown in figure 3 and the encoder 110 shown in figure 2 will be described in more detail.

Processing blocks 305 and 312 of the decoder 130 receive the geometry bitstream 120a and the attribute bitstream 120b and perform arithmetic decoding. The results of the arithmetic decoding for the geometry bitstream 120a are further used by the processing block 303 to synthesize an octree. A processing block 304 ("Synthesize surface approximation") is configured to perform surface approximations in some of the parts of the synthesized tree. The processing blocks 303 and 304 operate as the processing blocks 203 and 204 of the encoder 110 described above. The difference between the decoder-side and encoder-side steps is that for the decoder-side process decisions are taken in accordance with the contents of the bitstream 120a, 120b. Moreover, the processing block 307 ("Reconstruct geometry") is identical to the processing block 207 described above. The processing block 301 (“Inverse transform of coordinates”) of the decoder 130 is configured to perform operations inverse to the operations performed by the processing block 201 of the encoder 110, so that the decoded positions 100a' match the positions 100a provided as one of the inputs to the encoder 110.

For decoding the attribute bitstream 120b the decoder 130 comprises an arithmetic decoding processing block 312 and an inverse quantization processing block 311, which is configured to restore decoded coefficients from their quantized representation that is signaled in the attribute bitstream 120b. The attribute transforms are applied in the same way as it is done by the encoder 110, i.e. the processing blocks 308, 309 and 313 of the decoder 130 are identical to the corresponding processing blocks 208, 209 and 213 of the encoder 110 shown in figure 2 and described above. The only difference is that the selection between the processing block 213 and the processing blocks 208, 209 is signaled in the attribute bitstream 120b.

Finally, the processing block 321 of the decoder 130 provides the mapping which is inverse to the mapping provided by the processing block 221 of the encoder 110. In doing so, the decoded attributes 100b' are recovered from the attribute bitstream 120b.

Before describing different embodiments of the encoding apparatus 110 and the decoding apparatus 130 in more detail, in the following some technical background as well as terminology concerning conventional point cloud compression will be introduced.

Figure 4 is a flow diagram illustrating a conventional floating-point method for encoding a point cloud. Each coordinate of a point of a 2D or 3D point cloud may be compressed using the conventional processing steps illustrated in figure 4. The residual signal d is obtained by subtracting a predictor p from the current value r. Once the current position has been obtained (step 401) and a predictor p has been generated (step 403), the residual signal (or short residual) is determined as d = r - p (step 405). A further step 407 shown in figure 4 comprises determining the most significant bit (MSB) of the residual signal d. This may be done by taking the log2 value of the absolute value of d, i.e. k = floor(log2(|d|)).

The steps 409 and 411 shown in figure 4 comprise encoding the sign value of d and k bits of |d|. As the MSB cannot assume the value of 0, only bits at positions [0, k - 1] are usually encoded.

Figure 5 is a schematic diagram illustrating the joint encoding of the most significant bit, MSB, and the absolute value of the residual signal d. As can be taken from figure 5, a single symbol S' ∈ [0, 2·k_max] may include both the sign and the MSB position. The middle value k_max of the allowed range [0, 2·k_max] may be selected as the "zero symbol" S_zero that indicates a perfect prediction and a zero value of the residual signal d. Symbols smaller than S_zero denote negative signs and larger symbols are for positive values. The value of k may be restored from a bitstream, such as the bitstream 120, based on the following equation: k = |S' - S_zero| - 1.
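A sketch of this joint sign/MSB symbol, under the assumption that the symbol is formed as S' = S_zero ± (k + 1) with S_zero = k_max; the exact mapping used by the conventional scheme is not spelled out above, so the formulas here are illustrative.

```python
def residual_to_symbol(d, k_max):
    """Map a residual d to the joint sign/MSB symbol S' in [0, 2*k_max]."""
    s_zero = k_max
    if d == 0:
        return s_zero                     # perfect prediction, zero residual
    k = abs(d).bit_length() - 1           # position of the most significant bit of |d|
    return s_zero + (k + 1) if d > 0 else s_zero - (k + 1)

def symbol_to_sign_and_k(symbol, k_max):
    """Recover the sign and the MSB position k from a decoded symbol."""
    s_zero = k_max
    if symbol == s_zero:
        return 0, None                    # zero residual, no further bits follow
    sign = 1 if symbol > s_zero else -1
    k = abs(symbol - s_zero) - 1
    return sign, k
```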

Figure 6 is a schematic diagram illustrating a conventional point cloud decoding mechanism. In step 601 the residual signal d is decoded from the bitstream by decoding the following data: the sign value, the MSB value k and the k bits of |d|. The step 605 for obtaining the predictor p may be performed in the same way as at the encoder side (step 403 of figure 4). For instance, a previously reconstructed value may be used as the predictor. The prediction step 603 of figure 6 comprises summing the predictor p (obtained in step 605) and the residual signal d (obtained in step 601), i.e. r = d + p. By performing the process shown in figure 6 for each coordinate axis, all the coordinates of a reconstructed point of a point cloud may be obtained.

Figure 7 is a diagram illustrating an exemplary Lorenzo predictor, which may be employed in step 403 of figure 4 or step 603 of figure 6. Using the Lorenzo predictor, a predicted point is obtained from previously decoded points as the fourth corner of a parallelogram, wherein three corners of the parallelogram are defined by the three already decoded points, i.e. the points available for obtaining the predictor (the predicted point).
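The parallelogram rule underlying the Lorenzo predictor can be written in a few lines; representing points as coordinate tuples is an assumption of this sketch.

```python
def lorenzo_predictor(p0, p1, p2):
    """Parallelogram (Lorenzo) prediction from three already decoded points.

    The predicted point is the fourth corner of the parallelogram spanned by
    p0, p1 and p2, with p2 at the corner opposite the predicted one.
    """
    return tuple(a + b - c for a, b, c in zip(p0, p1, p2))

# e.g. the fourth corner of the unit square from (1, 0), (0, 1) and (0, 0):
assert lorenzo_predictor((1, 0), (0, 1), (0, 0)) == (1, 1)
```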

Figure 8 is a diagram illustrating a further mechanism for generating a predictor that enables wrapping around the minimum and maximum allowed values of the reconstructed signal (and the source signal r) and that may be employed in step 403 of figure 4 or step 603 of figure 6. In this case, the residual signal d (also referred to as "Res" in figure 8) is represented as a sum of two values: D1 and D2. A reconstructed coordinate is obtained from the predicted value as shown in figure 8, i.e. by adding D1 to the minimum allowed value.

Figure 9 is a diagram illustrating a further mechanism for generating a predictor implemented by the G-PCC standard that may be employed in step 403 of figure 4 or step 603 of figure 6. This mode in G-PCC is called the "predictive geometry coding" mode and may comprise one of the following prediction modes: no prediction; delta prediction (i.e., p0); linear prediction (i.e., 2p0-p1); or parallelogram predictor (i.e., 2p0+p1-p2).

As will be described in more detail in the following under further reference to figures 10 and 11, for lossless compression of the point cloud 100, which is defined in a multi-dimensional space, e.g. a 2D or 3D space, the processing circuitry 111 of the encoding apparatus 110 is configured to determine a space filling curve, such as the exemplary space filling curve 1100 shown in figure 11, connecting the points 101 of the point cloud 100 in a linear order such that for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point 101 of the point cloud 100 along the space filling curve 1100. For instance, for the exemplary space filling curve 1100 shown in figure 11 the coordinate value is monotonically increasing from bottom to top along the y axis. Arranging the points 101 of the point cloud 100 in this order ensures that every preceding point 101 has a smaller coordinate value for the y axis than a currently processed point 101 of the point cloud 100.

The processing circuitry 111 of the encoding apparatus 110 is further configured to determine for each point 101 of the point cloud 100 a residual position vector between the position of the point 101 and the position of a preceding point of the point cloud 100 along the space filling curve 1100. Moreover, the processing circuitry 111 of the encoding apparatus 110 is configured to encode the plurality of residual position vectors. As will be appreciated, by ordering the points 101 of the point cloud along one of the coordinate axes, the residual component for that coordinate axis is guaranteed to be non-negative, and therefore sign coding may be skipped for this component in the binarization mechanism.
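A minimal sketch of this ordering and residual computation, assuming a plain sort along one coordinate axis as the coding order; the actual space filling curve 1100 may additionally define the order of points that share the same sorting-coordinate value.

```python
def sort_and_residuals(points, sort_axis=1):
    """Order points so one coordinate is monotone, then form residual vectors.

    Sorting by `sort_axis` makes that coordinate monotonically non-decreasing
    along the coding order, so the corresponding residual component never
    needs a sign.
    """
    order = sorted(points, key=lambda p: p[sort_axis])
    residuals = []
    prev = tuple(0 for _ in order[0])          # predictor for the first point
    for p in order:
        residuals.append(tuple(c - q for c, q in zip(p, prev)))
        prev = p
    return order, residuals

pts = [(3, 5), (1, 2), (4, 2), (0, 7)]
_, res = sort_and_residuals(pts, sort_axis=1)
assert all(r[1] >= 0 for r in res)             # sorted-axis residuals are non-negative
```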

Complementary to the encoding apparatus 110 the processing circuitry 131 of the decoding apparatus 130 is configured to decode the plurality of residual position vectors based on the bitstream 120 and the space filling curve 1100, wherein for each point 101 of the point cloud 100 the residual position vector defines a difference between the position of the currently processed point and the position of a preceding point of the point cloud 100 along the space filling curve 1100. As already described above, the space filling curve 1100 defines a coding order of the points of the point cloud 100, wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud 100 along the space filling curve 1100. The processing circuitry 131 of the decoding apparatus 130 is further configured to reconstruct the currently processed point on the basis of the position of the preceding point of the point cloud 100 along the space filling curve 1100 and the residual position vector.

Figure 10 is a schematic diagram illustrating processing blocks implemented by the encoding apparatus 110 according to an embodiment. In processing block 1001 ("Fetch data points") the encoding apparatus 110 obtains the points 101 of the point cloud 100 to be encoded. In processing block 1003 ("Sort data points") the processing circuitry 111 of the encoding apparatus 110 is configured to sort, i.e. arrange, the points 101 of the point cloud 100 using a space filling curve, such as the space filling curve 1100 shown in figure 11, in such a way that for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point 101 of the point cloud 100 along the space filling curve 1100. In processing block 1007 ("Get predictor") the previously processed point of the point cloud 100 is retrieved as the predictor for determining the residual in processing block 1005 of figure 10. In processing block 1009 ("Entropy coding") the processing circuitry 111 of the encoding apparatus 110 is configured to encode, using entropy coding, the residuals, i.e. the residual position vector between the position of the currently processed point 101 and the position of the preceding point of the point cloud 100 along the space filling curve 1100. As illustrated in figure 10, one or more parameters defining the space filling curve 1100, such as the sorting type, the sorting coordinate, whether wrapping occurs, and a sign-coding parameter of the space filling curve, may be exchanged between the processing blocks 1003, 1007 and 1009, for instance, for reconstructing the space filling curve 1100. In an embodiment, the processing circuitry 111 of the encoding apparatus 110 is configured to include one or more of these parameters defining the space filling curve 1100 in the bitstream 120.

Figure 12 is a flow diagram illustrating processing blocks implemented by the encoding apparatus 110 according to an embodiment. More specifically, in block 1201 input points 101 of the point cloud 100 are fetched in the specified coding order. An index i identifying a respective point 101 being coded is initialized to 0 (block 1203). Further, a loop condition is checked to determine whether all the points 101 of the point cloud have already been processed. This could be performed by comparing the value of the point index i with the total number of points N (block 1205). Processing block 1217 allows the following steps to be repeated for the next point of the point cloud 100. The blocks 1207 ("Get prediction point p") and 1209 ("Get residual point r") may be performed in the same way as for the conventional point cloud encoder shown in figure 4. In block 1211 an index k identifying the current coordinate axis is initialized to 1, and the value of abs(r(0)) is written to the bitstream 120 (block 1213). Here r(0) denotes the coordinate with index 0 of the residual signal r. As already described above, the order in which block 1201 ("fetch points") is performed guarantees that r(0) is always non-negative and hence the encoding apparatus 110 may output only the absolute value of r(0) to the bitstream 120. In processing block 1215 a condition is checked, namely whether the current dimension index k is smaller than the number of dimensions K, i.e. k<K. For a 2D case, the value of K is equal to 2 and for a 3D case, the value of K is equal to 3. This condition check is the condition of the inner loop that outputs the signs and absolute values of r(k) for k>0 (blocks 1219 and 1221). Processing block 1223 allows the process to be repeated for the next coordinate axis.
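
A sketch of this per-point coding loop, writing symbols to a toy stand-in for the bitstream 120 instead of a real binarization and entropy coding stage (the writer interface is an assumption of this sketch):

    # Toy stand-in for the bitstream 120: records the written symbols in a list.
    class ListWriter:
        def __init__(self):
            self.symbols = []
        def write_uint(self, value):
            self.symbols.append(("uint", value))
        def write_sign(self, positive):
            self.symbols.append(("sign", positive))

    # Per-point coding loop of figure 12: abs(r(0)) is written without a sign,
    # while sign and absolute value are written for every r(k) with k > 0.
    def encode_residuals(residuals, bitstream, K):
        N = len(residuals)
        i = 0
        while i < N:                              # loop condition (block 1205)
            r = residuals[i]
            bitstream.write_uint(abs(r[0]))       # block 1213: no sign for r(0)
            k = 1                                 # block 1211
            while k < K:                          # block 1215
                bitstream.write_sign(r[k] >= 0)   # block 1219
                bitstream.write_uint(abs(r[k]))   # block 1221
                k += 1                            # block 1223
            i += 1                                # block 1217

    writer = ListWriter()
    encode_residuals([(0, 0), (1, 2), (0, -3)], writer, K=2)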

Figures 13a and 13b show flow diagrams illustrating processing blocks implemented by the decoding apparatus 130 according to an embodiment. In a first step 1301 information about the width and the height of the 2D point cloud 100 may be obtained from the bitstream 120. For a 3D point cloud 100 this step 1301 may further include obtaining information about a depth of the point cloud 100 from the bitstream 120. In a further step 1303 the type of sort order is obtained from the bitstream 120. For instance, for a 2D point cloud 100, this type could be specified as “sorting by x” or “sorting by y”. For a 3D point cloud 100, a “sorting by z” type could also be specified. In steps 1305, 1307, 1309a, 1309b, 1311a and 1311b the coordinate axis is set either to the x axis or the y axis depending on the information obtained in step 1303. In step 1313 the position of the currently processed point 101 may be predicted as described above, i.e. as a sum of the residual extracted from the bitstream 120 and the previously reconstructed point, i.e. the predictor. Step 1315 illustrates the possibility of wrap prediction implemented by the decoding apparatus 130 according to an embodiment (and the encoding apparatus 110 according to an embodiment). More specifically, in step 1315 wrap prediction may be performed along the specified axis. In an embodiment, the wrap axis k_w may be specified in dependence on the type of sort order. In an embodiment, k_w = (k + 1) mod K, wherein k is the index of the dimension along which the ordering is done and K is the total number of dimensions. In other words, as shown in figures 13a and 13b for a 2D point cloud 100, when the sort direction is vertical, the wrap axis may be set to the x axis and the maximum value may be set to the width (“W”); otherwise, i.e. when the sort direction is horizontal, the wrap axis may be set to the y axis and the maximum value is set to the height (“H”). For a 3D point cloud 100, when the sort direction is vertical, the wrap axis may be set to the x axis and the maximum value is set to the width (“W”); when the sort direction is horizontal, the wrap axis may be set to the z axis and the maximum value may be set to the depth (“D”); and when the sort direction is along the depth axis, the wrap axis may be set to the y axis and the maximum value may be set to the height (“H”). The prediction in block 1315 is further performed with the wrap parameters “min”, “max” and the selected wrap axis k_w that have been determined in accordance with the previous steps.
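
As a minimal sketch of such wrap prediction for a single coordinate (the helper name and the half-open wrap range are assumptions of this sketch):

    # Illustrative wrap prediction along the wrap axis k_w: when adding the
    # decoded residual would move the coordinate outside [min_val, max_val),
    # the value wraps around, so a short wrapped residual can represent a
    # large jump across the picture area.
    def wrap_reconstruct(pred, residual, min_val, max_val):
        span = max_val - min_val
        return min_val + (pred - min_val + residual) % span

    # 2D example: the ordering is done along axis k = 1 (the y axis), so the
    # wrap axis is k_w = (k + 1) mod K = 0 (the x axis) with range [0, W).
    K, k = 2, 1
    k_w = (k + 1) % K
    W = 8
    # Predictor x = 7 and a wrapped residual of 2 reconstruct x = 1, instead of
    # coding the larger direct residual of -6.
    assert wrap_reconstruct(7, 2, 0, W) == 1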

Figure 14 shows a 2D point cloud 100 encoded by an encoding apparatus according to an embodiment using a wrapped space filling curve 1100. As can be taken from the equation shown in figure 14, without wrapping the difference d0 may be larger than the sum of distances d1 + d2 that is encoded when wrapping is enabled. As will be appreciated, wrapping is beneficial only in a direction that is not sorted. However, when it is known from the sorting order that for the axis k the coordinates of the points of the point cloud 100 are non-decreasing, there is no need to perform wrapping. Wrapping affects the binarization process and hence, when it is guaranteed that no wrapping can occur, the signaling related to wrapping could be bypassed, thus reducing the overall bitrate of lossless coding.
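
By way of a purely hypothetical numeric example of this relation (the concrete values are assumptions, not taken from figure 14): for a row of width W = 10, a current point at x = 9 and a next point at x = 1, the direct difference is d0 = |1 - 9| = 8, whereas with wrapping the coded distance is d1 + d2 = (10 - 9) + 1 = 2, so the wrapped representation is shorter.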

Figure 15 is a flow diagram illustrating processing blocks implemented by the encoding apparatus 110 according to an embodiment and the decoding apparatus 130 according to an embodiment, where the context for sign coding may be determined based on the sort type and the dimension index k. More specifically, in steps 1501 and 1503 the process starts with an initialization of the axis index k. As already described above, each value of the axis index k may be associated with a respective axis of the point coordinate (e.g., k=0 is the ‘x’ axis, k=1 is the ‘y’ axis, k=2 is the ‘z’ axis). In the next step 1505 (“Sort type”) the encoding apparatus 110 writes the information about the sort type to the bitstream 120 and the decoding apparatus 130 extracts this information from the bitstream 120. In this embodiment, the sort type may not guarantee a non-decreasing order of the positions of the points of the point cloud 100 for any axis. Instead, the probability of a positive prediction error could be determined based on the indicated sort type. For example, a sort type may be selected in such a way that the points are sorted in accordance with their ‘x’ coordinates, but some exceptional points for which the ‘x’ component decreases may also occur. In this case, the prediction error is positive most of the time, with exceptions for some of the points 101 of the point cloud. The frequency of these exceptions may determine the probability of positive and negative sign values for a respective axis index k. This probability may be used when obtaining the context model in step 1507 (“get ctx(k)”), where ctx(k) stands for the context of the sign value of the prediction error along axis k. In the next step 1509 (“Sign with ctx(k)”) encoding of a binary value associated with the sign of the prediction error (e.g., 1 for positive and 0 for negative) may be performed. This value may be processed by Context-Adaptive Binary Arithmetic Coding (CABAC) with the obtained context model ctx(k). Thereafter, the magnitude of the prediction error for axis k may be encoded (step 1511) and the next axis with index k+1 (step 1513) is processed in the same way, until all coordinate axes have been processed (step 1515).

In another embodiment step 1505 (“Sort type”) may comprise indicating the probabilities for each coordinate axis, i.e., a set of fixed-point values within the range [0, 1], wherein each value is the probability that the prediction error for the respective axis is positive. These probability estimates are then used to select the context in step 1507.
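
A minimal sketch of such context-based sign coding, using a simple adaptive per-axis probability estimate in place of a full CABAC engine (the model, its update rule and the initial probabilities are assumptions of this sketch only):

    # Illustrative per-axis sign contexts: each axis k keeps an estimate of the
    # probability that the prediction error along that axis is positive. The
    # estimate may be initialized from signalled values (see step 1505) and is
    # adapted as signs are coded; a real codec would feed the sign bin and the
    # context state into a binary arithmetic coder.
    class SignContext:
        def __init__(self, p_positive=0.5, rate=0.05):
            self.p_positive = p_positive
            self.rate = rate

        def code_sign(self, is_positive):
            # Placeholder for arithmetic coding of the bin with the current
            # probability; here only the context model is updated.
            self.p_positive += self.rate * ((1.0 if is_positive else 0.0)
                                            - self.p_positive)
            return is_positive

    contexts = [SignContext(p_positive=0.9),   # ctx(0): mostly positive errors
                SignContext(p_positive=0.5)]   # ctx(1): no prior preference
    for k, r_k in enumerate((3, -1)):          # hypothetical residual (3, -1)
        contexts[k].code_sign(r_k >= 0)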

Figure 16 shows a point cloud 100 including three sub-clouds encoded by the encoding apparatus 110 according to an embodiment. The point cloud 100 may contain a large set of points, and it is known that the sorting operation could be a bottleneck for processing large sets. For this reason, in an embodiment, the encoding apparatus 110 is configured to split a point cloud into sub-clouds, wherein each of the sub-clouds is sorted independently using a respective space filling curve 1100, 1100'. In the embodiment shown in figure 16 the point cloud 100 is split into several sub-clouds (e.g., three sub-clouds). The points are predicted in a coding order that assumes the points to be sorted along the x axis. However, the encoding apparatus is configured to apply the sorting only within the sub-clouds and hence, for the first point within a non-first sub-cloud, the sign value sign(r(k)) should not be skipped. In this case, the further steps of the conventional procedure shown in figure 4 may be used. The coding of all points of a respective sub-cloud may be performed as described above in the context of figures 10 to 12. Figures 17a and 17b show different space filling curves used by the encoding apparatus 110 according to an embodiment for encoding a point cloud similar to the embodiment shown in figure 16. In the embodiment shown in figures 17a and 17b the point cloud 100 is divided into sub-clouds just like a picture is divided into slices (figure 17a) or tiles (figure 17b) in video coding. The picture area is split into rectangular areas and sorting is applied only within a respective sub-cloud, i.e. the points of the respective sub-cloud lie inside one of the rectangular areas shown in figures 17a and 17b. In this embodiment, the sign coding is not skipped for the first points of the sub-clouds encoded after encoding a first sub-cloud. For the coding of these "boundary" points, the further steps of the conventional procedure shown in figure 4 may be used. The coding of all points of a respective sub-cloud may be performed as described above in the context of figures 10 to 12, which may feature both sign skipping and sign coding. The subdivision of the point cloud 100 into a plurality of sub-clouds with point sorting could be performed prior to encoding, i.e. by means of preprocessing of the point cloud 100. Alternatively, this subdivision could be considered as a part of the encoding process.
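
A sketch of such a subdivision for a 2D point cloud, splitting it into vertical slices of equal width along the x axis and sorting each sub-cloud independently (the slice width and the grouping rule are illustrative assumptions):

    # Illustrative subdivision of a 2D point cloud into sub-clouds ("slices")
    # of width slice_w along the x axis; each sub-cloud is sorted independently
    # along the x axis. The first point of every non-first sub-cloud keeps its
    # sign, since monotonicity is only guaranteed within a sub-cloud.
    def split_into_subclouds(points, slice_w):
        subclouds = {}
        for x, y in points:
            subclouds.setdefault(x // slice_w, []).append((x, y))
        # Sort each sub-cloud independently along the x axis.
        return [sorted(subclouds[s], key=lambda p: (p[0], p[1]))
                for s in sorted(subclouds)]

    points = [(9, 2), (1, 5), (4, 0), (12, 3), (3, 1)]
    for sub in split_into_subclouds(points, slice_w=5):
        print(sub)   # [(1, 5), (3, 1), (4, 0)], then [(9, 2)], then [(12, 3)]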

Figure 18 is a flow diagram illustrating an encoding method 1800 for lossless compression of a point cloud comprising a plurality of points 101 in a multi-dimensional space, in particular a 2D or 3D space, wherein each point 101 of the point cloud 100 has a position defined by a plurality of coordinate values associated with a corresponding plurality of coordinate axes. The encoding method 1800 comprises the steps of: determining 1801 a space filling curve 1100 connecting the points 101 of the point cloud 100 in a linear order, wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point 101 of the point cloud 100 along the space filling curve 1100; determining 1803 for each point 101 of the point cloud 100 a residual position vector between the position of the point 101 and the position of a preceding point of the point cloud 100 along the space filling curve 1100; and encoding 1805 the plurality of residual position vectors.

Figure 19 is a flow diagram illustrating a decoding method 1900 for reconstructing the point cloud 100 based on the bitstream 120. The decoding method 1900 comprises the steps of: decoding 1901 the plurality of residual position vectors based on the bitstream 120 and the space filling curve 1100, wherein for each point 101 of the point cloud 100 the residual position vector defines a difference between the position of the currently processed point and the position of a preceding point of the point cloud 100 along the space filling curve 1100, wherein the space filling curve 1100 defines a coding order of the points of the point cloud 100 and wherein for at least one of the plurality of coordinate axes the associated coordinate value is monotonically increasing or monotonically decreasing for each point of the point cloud 100 along the space filling curve 1100; and reconstructing 1903 the currently processed point on the basis of the position of the preceding point of the point cloud 100 along the space filling curve 1100 and the residual position vector.

The person skilled in the art will understand that the "blocks" ("units") of the various figures (method and apparatus) represent or describe functionalities of embodiments of the present disclosure (rather than necessarily individual "units" in hardware or software) and thus describe equally functions or features of apparatus embodiments as well as method embodiments (unit = step).

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described embodiment of an apparatus is merely exemplary. For example, the unit division is merely logical function division and may be another division in an actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, functional units in the embodiments of the invention may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.