Method and apparatus for binary entropy encoding and decoding of point clouds

Document No.: 639582  Publication date: 2021-05-11

Summary: This technology, "Method and apparatus for binary entropy encoding and decoding of point clouds", was created by S. Lasserre on 2019-10-02. Its main content is as follows. Methods and devices for coding a point cloud are described. A bit sequence representing the occupancy pattern of the sub-volumes of a volume is coded using binary entropy coding. For a given bit in the bit sequence, the context may be based on a sub-volume neighbor configuration of the sub-volume corresponding to that bit. The sub-volume neighbor configuration depends on the occupancy pattern of a set of sub-volumes of the neighboring volumes of the volume, the set being those sub-volumes that neighbor the sub-volume corresponding to the given bit. The context may also be based on a partial sequence of previously coded bits of the bit sequence.

1. A method of encoding a point cloud to generate a bitstream of compressed point cloud data, the point cloud being defined in a tree structure having a plurality of nodes having a parent-child relationship and the plurality of nodes representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of a sub-volume of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of a respective sub-volume within the volume in scan order, and wherein a volume has a plurality of adjacent volumes, the method comprising:

for a current node associated with a current volume split into sub-volumes, wherein each sub-volume corresponds to a child node of the current node, determining the bit sequence indicative of the occupancy of the sub-volumes of the current volume; and

for at least one bit of the bit sequence of the current volume,

determining a sub-volume neighbor configuration based on occupancy data of sub-volumes of at least one neighboring volume of the current volume, the sub-volume neighbor configuration being dependent on an occupancy pattern of a set of sub-volumes of the at least one neighboring volume, the set of sub-volumes being neighboring the sub-volume of the current volume corresponding to the bit in the bit sequence;

selecting a probability for entropy encoding of the bits in the bit sequence, wherein the selecting is based at least in part on the sub-volume neighbor configuration; and

entropy encoding the bits in the sequence of bits using a binary entropy encoder based on the selected probabilities to produce encoded data for the bitstream.

2. A method of decoding a bitstream of compressed point cloud data to generate a reconstructed point cloud, the point cloud being defined in a tree structure having a plurality of nodes having a parent-child relationship and representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of a sub-volume of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of the respective sub-volume within the volume in scan order, and wherein a volume has a plurality of neighboring volumes, the method comprising:

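for a current node associated with a current volume split into sub-volumes, wherein each sub-volume corresponds to a child node of the current node, and for at least one bit of the bit sequence of the current volume,

determining a sub-volume neighbor configuration based on occupancy data of sub-volumes of at least one neighboring volume of the current volume, the sub-volume neighbor configuration being dependent on an occupancy pattern of a set of sub-volumes of the at least one neighboring volume, the set of sub-volumes being neighboring the sub-volume of the current volume corresponding to the bit in the bit sequence;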

selecting a probability for entropy decoding of the bits in the sequence of bits, wherein the selecting is based at least in part on the sub-volume neighbor configuration; and

entropy decoding the at least one bit based on the selected probability using a binary entropy decoder to generate reconstructed bits from the bitstream.

3. The method of claim 1 or 2, wherein determining the sub-volume neighbor configuration involves:

determining a number of sub-volumes of the at least one neighboring volume that are neighboring the sub-volume of the current volume corresponding to the bit in the bit sequence based on the occupancy data of the sub-volumes of the at least one neighboring volume of the current volume; and

applying a threshold function to the determined number.

4. The method of any preceding claim, wherein the sub-volume neighbor configuration of a given sub-volume in a given volume corresponds to an occupancy pattern of sub-volumes in adjacent volumes of the given volume that are adjacent to the given sub-volume.

5. The method according to any of the preceding claims, wherein determining the sub-volume neighbor configuration is based on occupancy data of sub-volumes of those neighboring volumes of the current volume that have been coded.

6. The method according to any of the preceding claims, wherein selecting the probability is further based on a partial sequence of already coded bits of the bit sequence and/or a neighboring configuration of the current volume, wherein the neighboring configuration of the current volume corresponds to an occupancy pattern of the neighboring volume of the current volume.

7. The method of any preceding claim, wherein determining the sub-volume neighbor configuration involves:

determining all those sub-volumes of the at least one neighboring volume that intersect the sub-volume of the current volume corresponding to the bit in the bit sequence based on the occupancy data of the sub-volumes of the at least one neighboring volume of the current volume;

applying respective weight factors to the determined sub-volumes, wherein each weight factor depends on an intersection of the respective determined sub-volume and the sub-volume of the current volume corresponding to the bit in the bit sequence; and

determining, based on the determined sub-volumes and their respective weight factors, a weighted number of sub-volumes of the at least one neighboring volume that intersect the sub-volume of the current volume that corresponds to the bit in the bit sequence.

8. The method of any preceding claim, wherein the scan order within the current volume is determined such that the maximum possible number of neighboring sub-volumes in the already-coded neighboring volumes of the current volume does not increase from one sub-volume to the next in the scan order.

9. The method of any preceding claim, wherein the occupancy data for the sub-volumes of a given adjacent volume of the current volume comprises an occupancy state for each of the sub-volumes of the given adjacent volume.

10. The method according to any of the preceding claims, wherein the tree structure represents an octree.

11. The method of claim 2 or any one of claims 3 to 10 when dependent on claim 2, further comprising: decoding a marker from the bitstream, the marker indicating that the probability for entropy decoding of at least one bit should be selected based at least in part on the sub-volume neighbor configuration.

12. An encoder for encoding a point cloud to generate a bitstream of compressed point cloud data, the point cloud being defined in a tree structure having a plurality of nodes having a parent-child relationship and the plurality of nodes representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of a sub-volume of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of a respective sub-volume within the volume in scan order, and wherein a volume has a plurality of neighboring volumes, the encoder comprising:

a processor;

a memory; and

a coding application containing instructions executable by the processor, which when executed cause the processor to perform the method of claim 1 or any one of claims 3 to 10 when dependent on claim 1.

13. A decoder for decoding a bitstream of compressed point cloud data to produce a reconstructed point cloud, the point cloud being defined in a tree structure having a plurality of nodes having parent-child relationships and the plurality of nodes representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of a sub-volume of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of a respective sub-volume within the volume in scan order, and wherein a volume has a plurality of neighboring volumes, the decoder comprising:

a processor;

a memory; and

a decoding application containing instructions executable by the processor, which when executed cause the processor to perform the method of claim 2 or any of claims 3 to 11 when dependent on claim 2.

14. A non-transitory processor-readable medium storing processor-executable instructions that, when executed by a processor, cause the processor to perform the method of any of claims 1-11.

15. A computer readable signal containing program instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 11.

Technical Field

The present application relates generally to point cloud compression, and in particular to methods and apparatus for binary entropy coding of point clouds.

Background

Data compression is used in communications and computer networking to efficiently store, transmit, and reproduce information. There is an increasing interest in the representation of three-dimensional objects or spaces, which may involve large data sets, and efficient and effective compression for such representations would be very useful and valuable. In some cases, a three-dimensional object or space may be represented using a point cloud, which is a collection of points having three coordinate locations (X, Y, Z), respectively, and in some cases other attributes, such as color data (e.g., luminance and chrominance), transparency, reflectivity, normal vector, and so forth. The point cloud may be static (a still object or a snapshot of the environment/object at a single point in time) or dynamic (a time-sequential sequence of point clouds).

Example applications for point clouds include topography and mapping applications. Autonomous vehicles and other machine vision applications may rely on point cloud sensor data in the form of a 3D scan of an environment, such as from a LiDAR scanner. Virtual reality simulations may also rely on point clouds.

It should be appreciated that point clouds can involve a large amount of data, and it is of great interest to compress (encode and decode) this data quickly and accurately. Accordingly, it would be advantageous to provide methods and apparatus for more efficiently and/or effectively compressing data of a point cloud. Moreover, it would be advantageous to find methods and apparatus for coding point clouds that can be implemented using context-adaptive binary entropy coding without the need to manage an excessive number of contexts.

Drawings

Reference will now be made, by way of example, to the accompanying drawings, which illustrate example embodiments of the present application, and in which:

FIG. 1 shows a simplified block diagram of an example point cloud encoder;

FIG. 2 shows a simplified block diagram of an example point cloud decoder;

FIG. 3 illustrates an example partial sub-volume and associated tree structure for coding;

FIG. 4 illustrates recursive splitting and coding of octrees;

FIG. 5 shows an example scan pattern within an example cube from an octree;

FIG. 6 illustrates an example occupancy pattern within an example cube;

FIG. 7 illustrates, in flow diagram form, an example method for encoding a point cloud;

FIG. 8 illustrates a portion of an example octree;

FIG. 9 illustrates an example of neighboring sub-volumes;

FIG. 10 illustrates an example neighbor configuration showing occupancy between neighboring nodes;

FIG. 11 diagrammatically illustrates one illustrative embodiment of a process for point cloud entropy encoding using a parent-pattern-dependent context;

FIG. 12 shows an illustrative embodiment of a process for point cloud entropy encoding using a context dependent on a neighbor configuration;

FIG. 13 illustrates, in flow diagram form, one example method for decoding a bitstream of compressed point cloud data;

FIG. 14 shows an example simplified block diagram of an encoder;

FIG. 15 shows an example simplified block diagram of a decoder;

FIG. 16 illustrates an example Cartesian coordinate system and example rotations and/or reflections about an axis;

FIG. 17 illustrates classes of invariance of neighbor configurations under one or several iterations of the rotation about the Z axis;

FIG. 18 illustrates classes of invariance of neighbor configurations under vertical reflection;

FIG. 19 illustrates classes of invariance under both rotation and reflection;

FIG. 20 illustrates classes of invariance under three rotations and reflection;

FIG. 21 illustrates the equivalence between non-binary coding and cascaded binary coding of an occupancy pattern;

FIG. 22 illustrates, in flow chart form, an example method for coding occupancy patterns in a tree-based point cloud coder using binary coding;

FIG. 23 shows a simplified block diagram of portions of an example encoder;

FIG. 24 graphically illustrates an example context reduction operation based on neighbor screening;

FIG. 25 illustrates another example context reduction operation based on neighbor screening;

FIG. 26 illustrates, in flow diagram form, one example of a method for binary coding occupancy patterns using combined context reduction;

FIG. 27 illustrates an example of neighboring volumes, some of which have already been coded;

FIG. 28 illustrates an example of neighboring volumes and those of their sub-volumes that have already been coded;

FIG. 29 illustrates, in flow chart form, a method of encoding an occupancy pattern of a current node based at least in part on a sub-volume neighbor configuration;

FIG. 30 illustrates, in flow chart form, a method of decoding an occupancy pattern of a current node based at least in part on a sub-volume neighbor configuration;

FIG. 31 shows an example of how the volume associated with the current node is split into sub-volumes;

FIGS. 32 to 35 show examples of sub-volume neighbor configurations for different child nodes of the current node;

FIG. 36 illustrates, in flow chart form, a method of determining a sub-volume neighbor configuration for a child node of a current node; and

FIG. 37 illustrates, in flow chart form, one example of a method of binary coding occupancy patterns using combined context reduction and context determination based on sub-volume neighbor configurations.

Similar reference numerals may have been used in different figures to denote similar components.

Detailed Description

Methods of encoding and decoding point clouds, and encoders and decoders for encoding and decoding point clouds, are described. An entropy coder (e.g., a binary entropy coder) may be used to code a bit sequence representing the occupancy pattern of the sub-volumes of a current volume. The probability (e.g., context) used for entropy coding may be based on the neighbor configuration of the current volume and on the partial sequence of previously coded bits of the bit sequence. The probability may also be based on occupancy data of sub-volumes of at least one neighboring volume of the current volume. In particular, the probability for coding a bit in the bit sequence may be selected based at least in part on a sub-volume neighbor configuration, which depends on the occupancy pattern of a set of sub-volumes of the at least one neighboring volume, the set being those sub-volumes that neighbor the sub-volume of the current volume corresponding to that bit.
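The way a context can combine these inputs may be sketched as follows. This is only an illustrative model, not the coder described in the application: the class `ContextModel`, the function `estimate_cost`, the count-based probability update and the tuple used as context key are all assumptions introduced here. Rather than running an arithmetic coder, the sketch reports the ideal code length (the sum of -log2 p over the coded bits), which is what a binary entropy coder approaches.

```python
import math
from collections import defaultdict


class ContextModel:
    """Hypothetical adaptive model: one pair of bit counts per context."""

    def __init__(self):
        # Counts start at [1, 1], so every new context begins at P(1) = 0.5.
        self.counts = defaultdict(lambda: [1, 1])

    def p_one(self, ctx):
        zeros, ones = self.counts[ctx]
        return ones / (zeros + ones)

    def update(self, ctx, bit):
        self.counts[ctx][bit] += 1


def estimate_cost(bits, neighbor_config, sub_configs, model):
    """Ideal code length, in bits, of one occupancy pattern.

    bits            -- the 8 occupancy bits of the current volume, in scan order
    neighbor_config -- neighbor configuration of the current volume
    sub_configs     -- one sub-volume neighbor configuration per bit
    """
    total = 0.0
    for i, bit in enumerate(bits):
        # Assumed context key: volume-level configuration, bit position,
        # previously coded bits, and the sub-volume neighbor configuration.
        ctx = (neighbor_config, i, tuple(bits[:i]), sub_configs[i])
        p1 = model.p_one(ctx)
        total += -math.log2(p1 if bit else 1.0 - p1)
        model.update(ctx, bit)
    return total
```

Coding the same pattern twice with the same model costs less the second time, because each context has adapted towards the bit it observed. The large number of distinct keys this produces is the reason the application also discusses context reduction.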

In an example useful for understanding the present application, a determination may be made as to whether a context reduction operation is to be applied and, if so, the operation reduces the number of available contexts. Example context reduction operations include: reducing neighbor configurations based on screening by sub-volumes associated with previously coded bits, special handling of empty neighbor configurations, and statistics-based context merging. The reduction may be applied before coding, and a determination may be made during coding as to whether a condition for using the reduced context set is satisfied.

In one aspect, the present application provides a method of encoding a point cloud to generate a bitstream of compressed point cloud data, the point cloud being defined in a tree structure having a plurality of nodes having a parent-child relationship and representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of the sub-volumes of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of the respective sub-volume within the volume in a scan order, and wherein the volume has a plurality of neighboring volumes. The method includes, for a current node associated with a current volume split into sub-volumes, wherein each sub-volume corresponds to a child node of the current node, determining the bit sequence that indicates occupancy of the sub-volumes of the current volume. The method further includes, for at least one bit in the bit sequence of the current volume, determining a sub-volume neighbor configuration based on occupancy data of sub-volumes of at least one neighboring volume of the current volume, the sub-volume neighbor configuration being dependent on an occupancy pattern of a set of sub-volumes of the at least one neighboring volume, the set of sub-volumes being neighboring the sub-volume of the current volume corresponding to the bit in the bit sequence. The method further includes selecting a probability (e.g., context) for entropy encoding of the at least one bit, wherein the selecting is based at least in part on the sub-volume neighbor configuration. The method further includes entropy encoding the at least one bit using a binary entropy encoder based on the selected probability to produce encoded data for the bitstream.

Another aspect relates to a method of decoding a bitstream of compressed point cloud data to generate a reconstructed point cloud, the point cloud being defined in a tree structure having a plurality of nodes having a parent-child relationship and representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of the sub-volumes of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of the respective sub-volume within the volume in a scan order, and wherein the volume has a plurality of neighboring volumes. The method includes, for a current node associated with a current volume split into sub-volumes, wherein each sub-volume corresponds to a child node of the current node, and for at least one bit in the bit sequence of the current volume, determining a sub-volume neighbor configuration based on occupancy data of sub-volumes of at least one neighboring volume of the current volume, the sub-volume neighbor configuration being dependent on an occupancy pattern of a set of sub-volumes of the at least one neighboring volume, the set of sub-volumes being neighboring the sub-volume of the current volume corresponding to the bit in the bit sequence. The method further includes selecting a probability (e.g., context) for entropy decoding of the at least one bit, wherein the selecting is based at least in part on the sub-volume neighbor configuration. The method further includes entropy decoding the at least one bit based on the selected probability using a binary entropy decoder to generate a reconstructed bit from the bitstream.

In some implementations, the sub-volume neighbor configuration for a given sub-volume in a given volume may correspond to an occupancy pattern for sub-volumes in adjacent volumes of the given volume that are adjacent to the given sub-volume.

In some implementations, determining the sub-volume neighbor configuration can be based on occupancy data of sub-volumes of those neighboring volumes of the current volume that have been coded.
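One possible reading of such a configuration, a count of occupied neighboring sub-volumes followed by a threshold function, may be sketched as below. The sketch assumes an octree, integer sub-volume coordinates at child resolution (so a sub-volume at `(x, y, z)` belongs to the volume at `(x // 2, y // 2, z // 2)`), face adjacency only, and a clamp as the threshold function. The names `sub_volume_neighbor_config`, `coded_occupied` and `threshold` are hypothetical.

```python
# Offsets to the six face-adjacent sub-volumes of a sub-volume.
FACE_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def parent_of(coord):
    """Coordinates of the volume that contains a child-resolution sub-volume."""
    return tuple(c // 2 for c in coord)


def sub_volume_neighbor_config(child, coded_occupied, threshold=2):
    """Clamped count of occupied, already-coded neighboring sub-volumes.

    child          -- (x, y, z) of the sub-volume being coded
    coded_occupied -- set of (x, y, z) of occupied sub-volumes that belong
                      to already-coded neighboring volumes
    threshold      -- assumed clamp, so the result lies in 0..threshold
    """
    current_volume = parent_of(child)
    count = 0
    for dx, dy, dz in FACE_OFFSETS:
        neighbor = (child[0] + dx, child[1] + dy, child[2] + dz)
        if parent_of(neighbor) == current_volume:
            continue  # a sibling inside the current volume, not a neighboring volume
        if neighbor in coded_occupied:
            count += 1
    return min(count, threshold)
```

With `threshold=2` the configuration takes only three values, which keeps the number of contexts small while still distinguishing "no", "one" and "several" occupied neighbors.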

In some embodiments, selecting the probability may also be based on a partial sequence of already coded bits of the bit sequence and/or a neighbor configuration of the current volume. The neighbor configuration of the current volume may correspond to an occupancy pattern of the neighboring volumes of the current volume.

In some implementations, determining a sub-volume neighbor configuration can involve determining, based on occupancy data of sub-volumes of at least one neighboring volume of the current volume, all those sub-volumes of the at least one neighboring volume that intersect the sub-volume of the current volume corresponding to the bit in the bit sequence. The determining may further involve applying respective weight factors to the determined sub-volumes, wherein each weight factor depends on the intersection of the respective determined sub-volume with the sub-volume of the current volume corresponding to the bit in the bit sequence. The determining may further involve determining, based on the determined sub-volumes and their respective weight factors, a weighted number of sub-volumes of the at least one neighboring volume that intersect the sub-volume of the current volume corresponding to the bit in the bit sequence.
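A weighted count of this kind may be sketched as follows, under stated assumptions: sub-volumes are unit cubes on an integer grid at child resolution, two cubes "intersect" when they share a face, an edge or a corner, and the weight depends only on which of the three it is. The weight values and the name `weighted_neighbor_count` are hypothetical and are not taken from the application.

```python
from itertools import product

# Hypothetical weights keyed by the number of axes along which the neighbor
# is offset: 1 = shared face, 2 = shared edge, 3 = shared corner.
WEIGHTS = {1: 4, 2: 2, 3: 1}


def weighted_neighbor_count(child, coded_occupied):
    """Weighted number of occupied neighboring sub-volumes touching `child`.

    child          -- (x, y, z) of the sub-volume being coded
    coded_occupied -- set of (x, y, z) of occupied sub-volumes in
                      already-coded neighboring volumes
    """
    current_volume = tuple(c // 2 for c in child)
    total = 0
    for offset in product((-1, 0, 1), repeat=3):
        if offset == (0, 0, 0):
            continue
        neighbor = tuple(c + d for c, d in zip(child, offset))
        if tuple(c // 2 for c in neighbor) == current_volume:
            continue  # sibling sub-volume of the current volume
        if neighbor in coded_occupied:
            total += WEIGHTS[sum(abs(d) for d in offset)]
    return total
```

A larger shared boundary gets a larger weight here on the assumption that face neighbors are more predictive of occupancy than corner neighbors; the weighted number could then be passed through a threshold function before it is used as a context.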

In some embodiments, the scan order within the current volume for determining the bit sequence indicative of the occupancy pattern may be determined such that the maximum possible number of adjacent sub-volumes in the adjacent volumes of the current volume that have been coded does not increase from one sub-volume to the next in the scan order.
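A scan order with this property may be sketched for an octree as follows. The sketch assumes that neighboring volumes at lower x, y or z have already been coded, and that only face-adjacent sub-volumes are counted, so a child on the low side of an axis has exactly one potential already-coded neighbor along that axis. Both assumptions, and the names `coded_face_neighbors` and `scan_order`, are introduced here for illustration.

```python
from itertools import product


def coded_face_neighbors(child):
    """Maximum number of face-adjacent sub-volumes in already-coded volumes.

    child -- (ix, iy, iz), each 0 (low half) or 1 (high half) of the volume.
    Assumption: volumes at lower x, y or z are coded before the current one.
    """
    return sum(1 for index in child if index == 0)


def scan_order():
    """Children sorted so the count above never increases along the scan."""
    children = list(product((0, 1), repeat=3))
    return sorted(children, key=coded_face_neighbors, reverse=True)
```

Under these assumptions the scan starts at the child with three potential already-coded neighbors and ends at the child with none, so the bits with the most neighbor information are coded first.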

In some implementations, the occupancy data for the sub-volumes of a given neighboring volume of the current volume may include an occupancy state for each of the sub-volumes of the given neighboring volume.

In some embodiments, the tree structure may represent an octree.

In some embodiments, the encoding method may further include: encoding a marker indicating that the probability for entropy encoding of at least one bit has been selected based at least in part on the sub-volume neighbor configuration.

In some embodiments, the decoding method may further include: decoding a marker from the bitstream, the marker indicating that the probability for entropy decoding of at least one bit should be selected based at least in part on the sub-volume neighbor configuration.

In another aspect, the present application provides a method of encoding a point cloud to generate a bitstream of compressed point cloud data, the point cloud being defined in a tree structure having a plurality of nodes having a parent-child relationship and representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of the sub-volumes of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of the respective sub-volume within the volume in a scan order, and wherein the volume has a plurality of neighboring volumes, an occupancy pattern of the neighboring volumes being a neighbor configuration. The method includes: determining, for at least one bit of the bit sequence of the volume, that a context reduction condition is satisfied and, on that basis, selecting a reduced context set that contains fewer contexts than the product of the number of neighbor configurations and the number of previously coded bits in the sequence; selecting, from the reduced context set, a context for coding the at least one bit based on an occupancy state of at least some of the neighboring volumes and at least one previously coded bit of the bit sequence; entropy encoding the at least one bit based on the selected context using a binary entropy encoder to produce encoded data for the bitstream; and updating the selected context.

In another aspect, the present application provides a method of decoding a bitstream of compressed point cloud data to produce a reconstructed point cloud, the point cloud being defined in a tree structure having a plurality of nodes having a parent-child relationship and representing a geometry of a volumetric space that is recursively split into sub-volumes and contains points of the point cloud, wherein occupancy of the sub-volumes of a volume is indicated using a bit sequence, wherein each bit of the bit sequence indicates occupancy of the respective sub-volume within the volume in a scan order, and wherein the volume has a plurality of neighboring volumes, an occupancy pattern of the neighboring volumes being a neighbor configuration. The decoding method includes: determining, for at least one bit of the bit sequence of the volume, that a context reduction condition is satisfied and, on that basis, selecting a reduced context set that contains fewer contexts than the product of the number of neighbor configurations and the number of previously coded bits in the sequence; selecting, from the reduced context set, a context for coding the at least one bit based on an occupancy state of at least some of the neighboring volumes and at least one previously coded bit of the bit sequence; entropy decoding the at least one bit based on the selected context using a binary entropy decoder to generate a reconstructed bit from the bitstream; and updating the selected context.

In some embodiments, the context reduction condition may include determining that one or more previously coded occupancy bits are associated with one or more respective sub-volumes positioned between the sub-volume associated with the at least one bit and one or more of the neighboring volumes. In some cases, this may include determining that four sub-volumes associated with previously coded bits each share a face with a particular neighboring volume.

In some embodiments, the context reduction condition may include determining that at least four bits of the bit sequence have been previously coded.

In some implementations, determining that the context reduction condition is satisfied may include determining that the occupancy pattern of the neighboring volumes indicates that the plurality of neighboring volumes are unoccupied. In some of those cases, the selected reduced context set may include a number of contexts corresponding to the number of previously coded bits in the bit sequence, and optionally selecting a context may include selecting a context based on a sum of previously coded bits in the bit sequence.
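The saving from selecting a context by the sum of the previously coded bits, rather than by their full pattern, can be illustrated with a short sketch. The function name `context_for_empty_neighborhood` and the pairing of the bit position with the sum are assumptions made here for illustration.

```python
from itertools import product


def context_for_empty_neighborhood(prev_bits):
    """Assumed context key when all neighboring volumes are unoccupied.

    Only the number of previously coded bits and how many of them are 1
    are kept; the exact positions of the 1s are discarded.
    """
    return (len(prev_bits), sum(prev_bits))


def count_contexts(num_prev_bits):
    """Distinct contexts by full pattern versus by sum, for one bit position."""
    patterns = list(product((0, 1), repeat=num_prev_bits))
    by_pattern = len(set(patterns))
    by_sum = len({context_for_empty_neighborhood(p) for p in patterns})
    return by_pattern, by_sum
```

For the last bit of an eight-bit occupancy pattern there are seven previously coded bits, giving 128 possible patterns but only 8 possible sums, so the context count for that bit grows linearly rather than exponentially with the number of previously coded bits.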

In some embodiments, the context reduction condition may include determining that at least a threshold number of bits in the bit sequence have been previously coded, and the reduced context set may include a lookup table that maps each possible combination of neighbor configuration and pattern of previously coded bits in the bit sequence to the fewer contexts. In some examples, the lookup table may be generated by iteratively grouping the available contexts into a plurality of classes on the basis of determining that a distance measure between respective pairs of available contexts is less than a threshold; each class of the plurality of classes may include a respective context in the reduced set, and there may be an available context for each possible combination of neighbor configuration and pattern of previously coded bits in the bit sequence.
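Building such a lookup table may be sketched as below. This is a simplified, single-pass greedy grouping, not the iterative procedure of the application, and the distance measure used here (the absolute difference between a context's estimated probability of a 1 bit and the mean probability of a class) is an assumption. The name `build_context_lut` and the input format are hypothetical.

```python
def build_context_lut(probabilities, threshold):
    """Map each available context to the index of a merged class.

    probabilities -- dict: context key -> estimated P(bit = 1), e.g. gathered
                     from training statistics (assumed input)
    threshold     -- contexts closer than this to a class mean join that class
    """
    classes = []  # each class is a list of the context keys merged into it
    lut = {}
    # Visiting contexts in order of probability keeps similar contexts adjacent.
    for ctx, p in sorted(probabilities.items(), key=lambda item: item[1]):
        for index, members in enumerate(classes):
            mean = sum(probabilities[m] for m in members) / len(members)
            if abs(p - mean) < threshold:
                members.append(ctx)
                lut[ctx] = index
                break
        else:
            # No existing class is close enough: open a new class.
            classes.append([ctx])
            lut[ctx] = len(classes) - 1
    return lut
```

During coding, the combination of neighbor configuration and pattern of previously coded bits would be used as the key, and the returned class index would identify the context in the reduced set.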

In some embodiments, at least some of the adjacent volumes are adjacent volumes that share at least one face with the volume.

In another aspect, the present application describes an encoder and decoder configured to implement such encoding and decoding methods.

In yet another aspect, the present application describes a non-transitory computer-readable medium storing computer-executable program instructions that, when executed, cause one or more processors to perform the described encoding and/or decoding methods.

In yet another aspect, the present application describes a computer-readable signal containing program instructions that, when executed by a computer, cause the computer to perform the described encoding and/or decoding method.

The present application also describes computer-implemented applications, including topography applications, cartography applications, automotive industry applications, autonomous driving applications, virtual reality applications, and cultural heritage applications, among others. These computer-implemented applications include the following processes: receiving a data stream or data file, unpacking the data stream or data file to obtain a bitstream of compressed point cloud data, and decoding the bitstream as described in the above aspects and embodiments thereof. Thus, these computer-implemented applications make use of point cloud compression techniques in accordance with the aspects and embodiments described throughout this application.

Methods of encoding and decoding point clouds, and encoders and decoders for encoding and decoding point clouds, are also described. In some embodiments, a receiving unit receives multiplexed data obtained by multiplexing coded point cloud data with other coded data types (such as metadata, images, video, audio, and/or graphics). The receiving unit includes a demultiplexing unit for separating the multiplexed data into the coded point cloud data and the other coded data, and at least one decoding unit (or decoder) for decoding the coded point cloud data. In some other embodiments, a transmitting unit transmits multiplexed data obtained by multiplexing coded point cloud data with other coded data types (such as metadata, images, video, audio, and/or graphics). The transmitting unit includes at least one encoding unit (or encoder) for encoding the point cloud data, and a multiplexing unit for combining the coded point cloud data and the other coded data into the multiplexed data.

Other aspects and features of the present application will become apparent to those ordinarily skilled in the art upon review of the following description of examples in conjunction with the accompanying figures.

Any feature described with respect to one aspect or embodiment of the invention may also be used with respect to one or more other aspects/embodiments. These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described herein.

At times in the following description, the terms "node", "volume", and "sub-volume" may be used interchangeably. It should be appreciated that a node is associated with a volume or sub-volume. A node is a particular point on the tree and may be an internal node or a leaf node. A volume or sub-volume is the bounded physical space that the node represents. In some cases, the term "volume" may be used to refer to the largest bounded space defined to contain the point cloud. A volume may be recursively divided into sub-volumes for the purpose of building a tree structure of interconnected nodes for coding the point cloud data.
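The relationship between a volume, its sub-volumes and the occupancy bit sequence may be sketched for the octree case as follows. The child indexing convention (`4*ix + 2*iy + iz`) and the name `occupancy_pattern` are assumptions made for this illustration; the application's own scan order is described with reference to the figures.

```python
def occupancy_pattern(points, origin, size):
    """Return the 8 occupancy bits of a cubic volume split into octants.

    points -- iterable of (x, y, z) points lying inside the volume
    origin -- (x, y, z) of the volume's minimum corner
    size   -- edge length of the volume
    Assumed bit order: index = 4*ix + 2*iy + iz, where each of ix, iy, iz
    is 0 for the lower half and 1 for the upper half along that axis.
    """
    half = size / 2.0
    ox, oy, oz = origin
    bits = [0] * 8
    for x, y, z in points:
        ix = 1 if x >= ox + half else 0
        iy = 1 if y >= oy + half else 0
        iz = 1 if z >= oz + half else 0
        bits[4 * ix + 2 * iy + iz] = 1  # this sub-volume holds at least one point
    return bits
```

Each sub-volume whose bit is 1 corresponds to a child node and can be split again in the same way, which is how the tree of interconnected nodes is built.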

In this application, the term "and/or" is intended to cover all possible combinations and subcombinations of the listed elements, including any one, any subcombination, or all elements listed individually, but not necessarily excluding additional elements.

In this application, the phrase "at least one of ... or ..." is intended to cover any one or more of the listed elements, including any one of the listed elements alone, any sub-combination, or all of the elements, without necessarily excluding any additional elements, and without necessarily requiring all of the elements.

The point cloud is a collection of points in a three-dimensional coordinate system. These points are generally intended to represent the exterior surface of one or more objects. Each point has a location (position) in a three-dimensional coordinate system. The position may be represented by three coordinates (X, Y, Z), which may be cartesian or any other coordinate system. These points may have other associated attributes (such as color), and in some cases these attributes may also be three component values, such as R, G, B or Y, Cb, Cr. Other associated attributes may include transparency, reflectivity, normal vector, etc., depending on the desired application of the point cloud data.

The point cloud may be static or dynamic. For example, the detailed scan or mapping of the object or terrain may be static point cloud data. LiDAR-based environmental scans for machine vision purposes may be dynamic in that the point cloud changes (at least potentially) over time, for example, with each successive scan of a volume. Thus, a dynamic point cloud is a time-sequential sequence of point clouds.

Point cloud data may be used in several applications, including conservation (scanning of historical or cultural objects), mapping, machine vision (such as autonomous or semi-autonomous cars), and virtual reality systems, to give some examples. Dynamic point cloud data for applications such as machine vision may be quite different from static point cloud data for conservation purposes. For example, automotive vision typically involves relatively low-resolution, colorless, highly dynamic point clouds obtained by LiDAR (or similar) sensors at a high capture frequency. Such point clouds are not intended for human consumption or viewing, but rather for machine object detection/classification in a decision process. By way of example, a typical LiDAR frame contains tens of thousands of points, whereas a high-quality virtual reality application requires millions of points. It is expected that higher-resolution data will be required over time as computational speed increases and new applications are found.

While point cloud data is useful, the lack of efficient and effective compression (i.e., encoding and decoding processes) may prevent adoption and deployment. A particular challenge that does not arise in the case of other data compression (such as audio or video) when coding a point cloud is coding the geometry of the point cloud. Point clouds tend to be sparsely distributed, making it more challenging to efficiently encode and decode the locations of the points.

One of the more common mechanisms for coding point cloud data is through the use of a tree-based structure. In a tree-based structure, the bounded three-dimensional volume of the point cloud is recursively divided into sub-volumes. The nodes of the tree correspond to sub-volumes. Whether to further divide a sub-volume may be decided based on the resolution of the tree and/or whether there are any points contained in the sub-volume. A leaf node may have an occupancy flag that indicates whether its associated sub-volume contains a point or not. A split flag may indicate whether a node has child nodes (i.e., whether the current volume has been further split into sub-volumes). These flags may be entropy coded, and in some cases predictive coding may be used.

A commonly used tree structure is an octree. In this structure, the volumes/sub-volumes are all cubes, and each split of a sub-volume results in eight other sub-volumes/sub-cubes. Another commonly used tree structure is the KD-tree, in which a volume (cube or cuboid) is recursively divided into two parts by a plane orthogonal to one of the axes. Octree is a special case of a KD-tree, where a volume is divided by three planes, each orthogonal to one of the three axes. Both examples relate to cubes or cuboids; however, the present application is not limited to such tree structures, and in some applications the volumes and sub-volumes may have other shapes. The volume does not have to be divided into two sub-volumes (KD-trees) or eight sub-volumes (octree), but may involve other partitioning, including dividing into non-rectangular shapes or involving non-adjacent sub-volumes.

For ease of explanation, and because octrees are popular candidate tree structures for automotive applications, this application may refer to octrees, but it should be understood that the methods and apparatus described herein may be implemented using other tree structures.

Referring now to fig. 1, a simplified block diagram of a point cloud encoder 10 is shown, according to an aspect of the present application. Point cloud encoder 10 includes a tree building module 12 for receiving point cloud data and generating a tree (in this example, an octree) that represents the geometry of the volumetric space containing the point cloud and indicates the locations or positions of points from the point cloud in the geometry.

A basic process for creating an octree to code a point cloud may include:

1. Starting from a bounding volume (cube) containing the point cloud in a coordinate system

2. Splitting the volume into 8 subvolumes (eight subcubes)

3. For each sub-volume, if the sub-volume is empty, then mark the sub-volume with 0, or if there is at least one point in the sub-volume, then mark the sub-volume with 1

4. Repeating (2) for all subvolumes labeled 1 to split those subvolumes until a maximum split depth is reached

5. For all leaf volumes (subcubes) of maximum depth, if it is not empty, then label the leaf cube with 1, otherwise label the leaf cube with 0.
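
By way of illustration only, the five steps above may be sketched in Python as follows. The function name `octree_patterns`, the use of integer point coordinates in a cube of side 2^depth, and the child ordering (child index = 4·bx + 2·by + bz, with child i written to bit i of the pattern) are assumptions made for this sketch; the description does not prescribe them. The sketch also assumes a non-empty point cloud and a depth of at least 1.

```python
from collections import deque

def octree_patterns(points, depth):
    """Return per-node occupancy patterns in breadth-first order.

    points: iterable of integer (x, y, z) with 0 <= coordinate < 2**depth.
    """
    patterns = []
    # Step 1: the bounding cube; each entry is (origin, size, points inside).
    queue = deque([((0, 0, 0), 1 << depth, list(points))])
    while queue:
        (ox, oy, oz), size, pts = queue.popleft()
        half = size // 2
        # Step 2: split the cube into eight subcubes.
        children = [[] for _ in range(8)]
        for (x, y, z) in pts:
            idx = (4 * ((x - ox) >= half)
                   + 2 * ((y - oy) >= half)
                   + ((z - oz) >= half))
            children[idx].append((x, y, z))
        # Steps 3 and 5: one flag per subcube, 1 if it holds a point.
        patterns.append(sum(1 << i for i, c in enumerate(children) if c))
        # Step 4: only occupied subcubes are split, until unit cubes remain.
        if half > 1:
            for i, c in enumerate(children):
                if c:
                    origin = (ox + half * ((i >> 2) & 1),
                              oy + half * ((i >> 1) & 1),
                              oz + half * (i & 1))
                    queue.append((origin, half, c))
    return patterns

print(octree_patterns([(0, 0, 0), (3, 3, 3)], depth=2))  # [129, 1, 128]
```

In this example the root has two occupied subcubes (the first and the last in the assumed order), and each of those contains a single occupied unit cube.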

The above process may be described as an occupancy-equals-splitting process, in which splitting implies occupancy, subject to the constraint that there is a maximum depth or resolution beyond which no further splitting occurs. In this case, a single flag signals whether a node is split, and hence whether it is occupied by at least one point, and vice versa. At the maximum depth, the flag signals occupancy, with no further splitting possible.

In some embodiments, splitting and occupying are independent, such that a node may be occupied and may or may not be split. There are two variations of this embodiment:

1. Split-then-occupied. A signal flag indicates whether a node is split. If it is split, the node must contain a point; that is, splitting implies occupancy. Otherwise, if the node is not split, a further occupancy flag signals whether the node contains at least one point. Thus, when a node is not further split (i.e., it is a leaf node), the leaf node must have an associated occupancy flag to indicate whether it contains any points.

2. Occupied-then-split. A single flag indicates whether the node is occupied. If it is not occupied, no splitting occurs. If it is occupied, a split flag is coded to indicate whether the node is further split or not.

Regardless of which of the above-described processes is used to build the tree, the tree may be traversed in a predefined order (breadth-first or depth-first, and in accordance with a scan pattern/order within each divided sub-volume) to produce a sequence of bits from the flags (occupancy and/or split flags). This may be termed the serialization or binarization of the tree. As shown in fig. 1, in this example, the point cloud encoder 10 includes a binarizer 14 for binarizing the octree to produce a bitstream of binarized data representing the tree.

The bit sequence may then be encoded using an entropy encoder 16 to produce a compressed bitstream. The entropy encoder 16 may encode the bit sequence using a context model 18 that specifies probabilities for coding the bits, based on a context determination made by the entropy encoder 16. The context model 18 may be adaptively updated after each bit or defined set of bits is coded. In some cases, the entropy encoder 16 may be a binary arithmetic encoder. In some embodiments, the binary arithmetic encoder may employ Context Adaptive Binary Arithmetic Coding (CABAC). In some embodiments, coders other than arithmetic coders may be used.

In some cases, the entropy encoder 16 may not be a binary coder, but may instead operate on non-binary data. The octree data output by the tree building module 12 may not be evaluated in binary form, but may instead be encoded as non-binary data. For example, in the case of an octree, the eight flags (e.g., occupancy flags) within a sub-volume, taken in their scan order, may be treated as a single eight-bit number with 2⁸ − 1 possible values (e.g., an integer having a value between 1 and 255, since the value 0 is not possible for a split sub-volume; that is, a sub-volume that is entirely unoccupied would not have been split). In some implementations, this number may be encoded by the entropy encoder using a multi-symbol arithmetic coder. Within a sub-volume (e.g., a cube), the sequence of flags that defines this integer may be termed a "pattern".

As with video or image codecs, point cloud codecs may include predictive operations in which an effort is made to predict the pattern of a sub-volume. The prediction may be spatial (dependent on previously coded sub-volumes in the same point cloud) or temporal (dependent on previously coded point clouds in a time-ordered sequence of point clouds).

A block diagram of an example point cloud decoder 50 corresponding to the encoder 10 is shown in fig. 2. The point cloud decoder 50 includes an entropy decoder 52 that uses the same context model 54 used by the encoder 10. The entropy decoder 52 receives an input bitstream of compressed data and entropy decodes the data to produce an output sequence of decompressed bits. The sequence is then converted into reconstructed point cloud data by a tree reconstructor 56. The tree reconstructor 56 reconstructs the tree structure from the decompressed data and knowledge of the scan order in which the tree data was binarized. Thus, the tree reconstructor 56 is able to reconstruct the location of the points from the point cloud (limited by the resolution of the tree codec).

An example partial sub-volume 100 is shown in fig. 3. In this example, the sub-volume 100 is shown in two dimensions for ease of illustration, and the size of the sub-volume 100 is 16 x 16. It will be noted that the sub-volume has been divided into four 8 x 8 sub-squares, two of these have been further subdivided into 4 x 4 sub-squares, three of the 4 x 4 sub-squares have been further divided into 2 x 2 sub-squares, and one of the 2 x 2 sub-squares has then been divided into 1 x 1 squares. The 1 x 1 squares are at the maximum depth of the tree and represent the finest resolution for positional point data. Points from the point cloud are shown as dots in the figure.

The structure of the tree 102 is shown to the right of the sub-volume 100. To the right of the tree 102 are shown the sequence of split flags 104 and the corresponding sequence of occupancy flags 106, obtained in a predefined breadth-first scan order. It will be observed that in this illustrative example, there is an occupancy flag for each sub-volume (node) that is not split (i.e., that has an associated split flag set to zero). These sequences may be entropy encoded.

Another example, which employs the occupancy ≡ splitting condition, is shown in fig. 4. Fig. 4 illustrates the recursive splitting and coding of an octree 150. Only a portion of the octree 150 is shown. A FIFO 152 is shown as processing the nodes for splitting, to illustrate the breadth-first nature of the present process. The FIFO 152 outputs an occupied node 154 that was queued in the FIFO 152 for further splitting after processing of its parent node 156. The tree builder splits the sub-volume associated with the occupied node 154 into eight sub-volumes (cubes) and determines their occupancy. The occupancy may be indicated by an occupancy flag for each sub-volume. Taken in a prescribed scan order, the flags may be referred to as the occupancy pattern of the node 154. The pattern may be specified by the integer representing the sequence of occupancy flags associated with the sub-volumes in the predefined scan order. In the case of an octree, the pattern is an integer in the range [1, 255].

The entropy encoder then encodes the pattern using a non-binary arithmetic encoder based on probabilities specified by the context model. In this example, the probabilities may be a pattern distribution that is based on an initial distribution model and is adaptively updated. In one embodiment, the pattern distribution is effectively a counter of the number of times each pattern (an integer from 1 to 255) has been encountered during coding. The pattern distribution may be updated after each sub-volume is coded. Since it is the relative frequency of the patterns, and not their absolute counts, that is germane to the probability estimates, the pattern distribution may be normalized as needed.
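
A count-based pattern distribution of the kind just described might be sketched as follows. The class name `PatternDistribution`, the equiprobable initialization, and the halving rule used for normalization are assumptions chosen for illustration; the description only requires that relative frequencies be maintained.

```python
class PatternDistribution:
    """Adaptive counts for occupancy patterns 1..255 (illustrative only)."""

    def __init__(self):
        # counts[p - 1] holds the count of pattern p; start equiprobable.
        self.counts = [1] * 255

    def probability(self, pattern):
        # Only the relative frequency matters for the probability estimate.
        return self.counts[pattern - 1] / sum(self.counts)

    def update(self, pattern):
        # Called after each sub-volume has been coded.
        self.counts[pattern - 1] += 1

    def normalize(self, limit=1 << 16):
        # Assumed policy: halve all counts once the total exceeds `limit`,
        # keeping every count at least 1 so no pattern becomes impossible.
        if sum(self.counts) > limit:
            self.counts = [max(1, c // 2) for c in self.counts]


dist = PatternDistribution()
dist.update(85)
print(dist.probability(85))  # 2 / 256 = 0.0078125
```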

Based on the pattern, those child nodes that are occupied (e.g., with flag = 1) are then pushed into the FIFO 152 for further splitting in turn (provided the nodes are not at the maximum depth of the tree).

Referring now to FIG. 5, an example cube 180 from an octree is shown. The cube 180 is subdivided into eight subcubes. The scan order used to read the flags results in an eight-bit string, which can be read as an integer in [1, 255] in binary form. Based on the scan order and the resulting bit position of each subcube's flag in the string, the subcubes have the values shown in fig. 5. The scan order may be any sequence of the subcubes, provided that both the encoder and decoder use the same scan order.

By way of example, FIG. 6 shows the cube 180 in which the four "front" subcubes are occupied. This corresponds to pattern 85, on the basis that the occupied subcubes are cubes 1 + 4 + 16 + 64. The integer pattern number specifies the pattern of occupancy in the subcubes.
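
The relationship between occupied subcubes and the pattern number can be checked with a short sketch. It assumes that the subcube at scan position i carries the value 2^i, which matches the values 1, 4, 16, and 64 quoted above; the function names are hypothetical.

```python
def pattern_number(occupied_positions):
    """Pattern number from the scan positions (0..7) of occupied subcubes."""
    return sum(1 << i for i in set(occupied_positions))

def occupied_positions_of(pattern):
    """Inverse mapping: scan positions of the occupied subcubes."""
    return [i for i in range(8) if (pattern >> i) & 1]

# Subcubes with values 1, 4, 16 and 64 sit at scan positions 0, 2, 4 and 6.
print(pattern_number([0, 2, 4, 6]))   # 85
print(occupied_positions_of(85))      # [0, 2, 4, 6]
```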

Because a tree tends to factor out the higher-order bits of the point coordinates, an octree representation, or more generally any tree representation, is efficient at representing points with spatial correlation. For an octree, each depth level refines the coordinates of the points within a sub-volume by one bit per component, at a cost of eight bits per refinement. Further compression is obtained by entropy coding the split information, i.e., the pattern, associated with each tree node. This further compression is possible because the pattern distribution is not uniform, the non-uniformity being another consequence of the correlation.

One potential inefficiency in current systems is that the pattern distribution (e.g., the histogram of pattern numbers seen in previously coded nodes of the tree) is developed over the course of coding the point cloud. In some cases, the pattern distribution may be initialized as equiprobable, or may be initialized to some other predetermined distribution; but the use of a single pattern distribution means that the context model does not account for, or exploit, local geometric correlation.

In European patent application No. 18305037.6, the applicant describes methods and devices for selecting among available pattern distributions to be used in coding the occupancy pattern of a particular node, based on some occupancy information from previously coded nodes near that node. In one example embodiment, the occupancy information is obtained from the occupancy pattern of the parent of the particular node. In another example embodiment, the occupancy information is obtained from one or more nodes neighboring the particular node. The contents of European patent application No. 18305037.6 are incorporated herein by reference.

Referring now to fig. 7, an example method 200 of encoding a point cloud is shown in flowchart form. In this example, the method 200 involves recursive splitting of occupied nodes (sub-volumes) and a breadth-first traversal of the tree for coding.

In operation 202, the encoder determines an occupancy pattern of the current node. The current node is an occupied node that has been split into eight child nodes, each corresponding to a respective subcube. The occupancy pattern of the current node specifies the occupancy of eight child nodes in scan order. As described above, an integer between 1 and 255 (e.g., an eight-bit binary string) may be used to indicate the occupancy pattern.

In operation 204, the encoder selects a probability distribution from among a set of probability distributions. The selection is based on some occupancy information from nearby previously coded nodes (i.e., at least one node that is a neighbor of the current node). In some embodiments, two nodes are neighboring if they are associated with respective sub-volumes that share at least one face. In a broader definition, nodes are neighboring if they share at least one edge. In a yet broader definition, two nodes are neighboring if they share at least one vertex. The parent pattern, i.e., the occupancy pattern of the parent node of which the current node is a child, provides occupancy data for the current node and its seven sibling nodes. In some implementations, the occupancy information is the parent pattern. In some implementations, the occupancy information is occupancy data for a set of neighbor nodes that includes nodes at the same depth level of the tree as the current node but having a different parent node. In some cases, combinations of these are possible. For example, the set of neighbor nodes may include some sibling nodes and some non-sibling nodes.

As indicated by operation 206, once the probability distribution has been selected, the encoder then entropy encodes the occupancy pattern of the current node using the selected probability distribution. The encoder then updates the selected probability distribution in operation 208 based on the occupancy pattern, e.g., the encoder may increment a count corresponding to the occupancy pattern. In operation 210, the encoder evaluates whether there are other nodes to be coded and, if so, returns to operation 202 to code the next node.
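
The select/encode/update loop of operations 204 to 210 may be illustrated with the sketch below. It does not implement an arithmetic coder; instead it accumulates the ideal code length, -log2(p), that an arithmetic coder would approach. The function name `estimate_bits`, the representation of each node as a (context index, pattern) pair, and the equiprobable initialization are assumptions made for this sketch.

```python
import math

def estimate_bits(nodes, num_contexts):
    """Ideal cost in bits of coding `nodes` with context-selected distributions.

    nodes: list of (context_index, pattern) pairs in coding order,
           with 0 <= context_index < num_contexts and 1 <= pattern <= 255.
    """
    # One adaptive count table per selectable probability distribution.
    tables = [[1] * 255 for _ in range(num_contexts)]
    total_bits = 0.0
    for context, pattern in nodes:
        counts = tables[context]                  # operation 204: select
        p = counts[pattern - 1] / sum(counts)
        total_bits += -math.log2(p)               # operation 206: encode (ideal cost)
        counts[pattern - 1] += 1                  # operation 208: update
    return total_bits                             # operation 210: loop until done

# Two kinds of neighborhood, each strongly associated with one pattern.
with_contexts = [(0, 85), (1, 170)] * 50
single_context = [(0, p) for _, p in with_contexts]
print(estimate_bits(with_contexts, 2) < estimate_bits(single_context, 1))  # True
```

The comparison at the end illustrates why selection helps: when the context correlates with the pattern, each selected distribution adapts faster than one shared distribution.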

The probability distribution selection in operation 204 is based on occupancy data of nearby previously coded nodes. This allows both the encoder and the decoder to make the same selection independently. For the following discussion of probability distribution selection, reference will be made to FIG. 8, which diagrammatically illustrates a partial octree 300 including a current node 302. The current node 302 is an occupied node and is being evaluated for coding. The current node 302 is one of the eight children of a parent node 306, which in turn is a child of a grandparent node (not shown). The current node 302 is divided into eight child nodes 304. The occupancy pattern of the current node 302 is based on the occupancy of the child nodes 304. For example, using the illustrated convention in which a black dot denotes an occupied node, the occupancy pattern may be 00110010, i.e., pattern 50.

The current node 302 has sibling nodes 308 that have the same parent node 306. The parent pattern is the occupancy pattern of the parent node 306, which as illustrated would be 00110000, i.e., pattern 48. The parent pattern may serve as the basis for selecting a suitable probability distribution for entropy encoding the occupancy pattern of the current node.

FIG. 9 illustrates a set of neighbors surrounding a current node, where neighbors are defined as nodes that share a face. In this example, the nodes/sub-volumes are cubes, and the cube at the center of the image has six neighbors, one for each face. In an octree, it will be appreciated that the neighbors of the current node will include three sibling nodes. They will also include three nodes that do not have the same parent node. Accordingly, the occupancy data of some of the neighboring nodes will be available because they are siblings, but the occupancy data of the other neighboring nodes may or may not be available, depending on whether those nodes have been previously coded. Special handling may be applied to deal with missing neighbors. In some embodiments, a missing neighbor may be presumed to be occupied or may be presumed to be unoccupied. It will be appreciated that the neighbor definition may be broadened to include neighboring nodes based on a shared edge or a shared vertex, so as to include additional adjacent sub-volumes in the assessment.
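
The statement that three of the six face neighbors are siblings can be verified with a small sketch. It assumes nodes at a given depth are addressed by integer grid coordinates, so that two nodes share a parent exactly when their coordinates agree after the lowest bit of each component is dropped; this addressing scheme and the function names are illustrative assumptions.

```python
# The six face-sharing offsets, one per face of the cube.
FACE_OFFSETS = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]

def face_neighbors(x, y, z):
    """Grid coordinates of the six nodes sharing a face with node (x, y, z)."""
    return [(x + dx, y + dy, z + dz) for dx, dy, dz in FACE_OFFSETS]

def is_sibling(a, b):
    """True if nodes a and b (same depth) have the same parent node."""
    return all((p >> 1) == (q >> 1) for p, q in zip(a, b))

node = (2, 5, 7)
siblings = [n for n in face_neighbors(*node) if is_sibling(node, n)]
print(len(siblings))  # 3
```

Along each axis exactly one of the two face neighbors lies inside the same parent cube, which gives three siblings and three non-siblings regardless of the node's position.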

It should be appreciated that the foregoing processes look at the occupancy of nearby nodes in an attempt to determine the likelihood of occupancy of the current node 302, so as to select more suitable context(s) and use more accurate probabilities for entropy coding the occupancy data of the current node 302. The occupancy status of neighboring nodes that share a face with the current node 302 may be a more accurate indicator of whether the current node 302 is likely to be isolated than the occupancy status of sibling nodes, three of which share only an edge and one of which shares only a vertex (in the case of an octree). However, assessing the occupancy status of siblings has the advantage of being modular, since all the relevant data is part of the parent node, which means a small memory footprint for implementation, whereas assessing neighbor occupancy status involves buffering tree occupancy data in case it is needed when determining neighbor occupancy status in connection with coding a future nearby node.

The occupancy of the neighbors may be read in a scan order that effectively assigns a value to each neighbor, much like what is described above with respect to occupancy patterns. As illustrated, the neighboring nodes effectively take the values 1, 2, 4, 8, 16, or 32, and there are therefore 64 (0 to 63) possible neighbor occupancy configurations. This value may be referred to herein as the "neighbor configuration". As an example, fig. 10 illustrates neighbor configuration 15, in which neighbors 1, 2, 4, and 8 are occupied, while neighbors 16 and 32 are empty.
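
A neighbor configuration could be computed along the following lines. The assignment of the values 1, 2, 4, 8, 16, and 32 to particular faces is fixed by the figure and is not reproduced here, so the mapping used in this sketch is a hypothetical one; missing neighbors are presumed unoccupied, which is one of the options mentioned above.

```python
def neighbor_configuration(occupied, node):
    """Neighbor configuration (0..63) of `node`.

    occupied: set of (x, y, z) grid coordinates of occupied, already coded
              nodes at the same depth as `node`.
    """
    x, y, z = node
    # Hypothetical face-to-value mapping; the actual one is given by the figure.
    weighted_neighbors = [
        ((x - 1, y, z), 1), ((x + 1, y, z), 2),
        ((x, y - 1, z), 4), ((x, y + 1, z), 8),
        ((x, y, z - 1), 16), ((x, y, z + 1), 32),
    ]
    # Neighbors that are absent from `occupied` contribute nothing.
    return sum(value for pos, value in weighted_neighbors if pos in occupied)

occupied = {(0, 1, 1), (2, 1, 1), (1, 0, 1), (1, 2, 1)}
print(neighbor_configuration(occupied, (1, 1, 1)))  # 15
```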

In some cases, the two above criteria (parent pattern and neighbor configuration) may both be applied, or a selection may be made between them. For example, if the neighbors are available, the probability distribution selection may be made based on the neighboring nodes; however, if one or more of the neighbors are unavailable because they belong to nodes that have not yet been coded, the probability distribution selection may revert to an analysis based on sibling nodes (the parent pattern).

In yet another embodiment, the probability distribution selection may alternatively or additionally be based on the grandparent pattern. In other words, the probability distribution selection may be based on the occupancy status of the uncle nodes, that is, the nodes that are siblings of the parent node 306.

In yet another embodiment, additional or alternative evaluations may be taken into account in the selection of the probability distribution. For example, the probability distribution selection may look at the occupancy states of the neighbor nodes of the parent node or the neighbor nodes of the grandparent node.

Any two or more of the above criteria for evaluating local occupancy states may be used in combination in some implementations.

In the case of a non-binary entropy coder, the occupancy data of the current node may be coded by selecting a probability distribution. The probability distribution contains a number of probabilities equal to the number of possible occupancy patterns of the current node. For example, when coding the occupancy pattern of an octree, there are 2⁸ − 1 = 255 possible patterns, which means that each probability distribution comprises 255 probabilities. In some embodiments, the number of probability distributions may be equal to the number of possible occupancy outcomes in the selection criterion, i.e., using neighbor, sibling, and/or parent occupancy data. For example, where the parent pattern of an octree is used as the selection criterion for determining the probability distribution to use, there will be 255 probability distributions, each comprising 255 probabilities. In the case of neighbor configuration, if neighbors are defined as sharing a face, there will be 64 probability distributions, each comprising 255 probabilities.

It will be understood that too many distributions may result in slow adaptation due to scarcity of data (i.e., context dilution). Accordingly, in some embodiments, similar patterns may be grouped so as to use the same probability distribution. For example, separate distributions may be used for patterns corresponding to fully occupied, vertically oriented, horizontally oriented, and mostly empty cases, and then for all other cases. This may reduce the number of probability distributions to about five. It will be appreciated that different groupings of patterns may be formed, resulting in a different number of probability distributions.

Referring now to FIG. 11, one illustrative embodiment of a process 400 for point cloud entropy encoding using parent-pattern-dependent context is diagrammatically shown. In this example, a current node 402 has been split into eight child nodes, and its occupancy pattern 404 is to be encoded using a non-binary entropy encoder 406. The non-binary entropy encoder 406 uses a probability distribution selected from one of six possible probability distributions 408. The selection is based on the parent pattern; that is, the selection is based on occupancy information from the parent node of the current node 402. The parent pattern is identified by an integer between 1 and 255.

The selection of the probability distribution may be a decision tree that assesses whether the pattern corresponds to a full node (e.g., pattern 255), a horizontal structure (e.g., pattern 170 or 85, assuming the Z-axis is vertical), a vertical structure (e.g., pattern 3, 12, 48, or 192), a sparsely populated distribution (e.g., pattern 1, 2, 4, 8, 16, 32, 64, or 128; i.e., none of the sibling nodes is occupied), a semi-sparsely populated distribution (total number of occupied nodes among the current node and its sibling nodes ≤ 3), or any other case. The example patterns indicated for the different categories are merely examples. For instance, the "horizontal" category may include patterns involving two or three occupied cubes on the same horizontal level. The "vertical" category may include patterns involving three or four occupied cubes in a wall-like arrangement. It should also be appreciated that finer gradations may be used. For example, the "horizontal" category may be further subdivided into horizontal in the upper part of the cube and horizontal in the lower part of the cube, with a different probability distribution for each. Other groupings of occupancy patterns having some correlation may be made and allocated to a corresponding probability distribution. Grouping of patterns in the context of neighbor configurations, and invariance between neighbor configurations, are discussed further below.
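
One possible reading of this decision tree, using only the example patterns listed above, is sketched below. The ordering of the tests, the category names, and the function name `select_category` are assumptions; in particular, the vertical test is placed before the semi-sparse test because the example vertical patterns each have only two occupied nodes.

```python
CATEGORIES = ["full", "horizontal", "vertical", "sparse", "semi-sparse", "other"]

def select_category(parent_pattern):
    """Index (0..5) of the probability distribution for a parent pattern 1..255."""
    occupied = bin(parent_pattern).count("1")  # current node plus occupied siblings
    if parent_pattern == 255:
        return 0                               # full node
    if parent_pattern in (85, 170):
        return 1                               # horizontal structure
    if parent_pattern in (3, 12, 48, 192):
        return 2                               # vertical structure
    if occupied == 1:
        return 3                               # no sibling is occupied
    if occupied <= 3:
        return 4                               # semi-sparsely populated
    return 5                                   # all other cases

for pattern in (255, 85, 48, 64, 50, 240):
    print(pattern, CATEGORIES[select_category(pattern)])
```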

Fig. 12 shows an illustrative embodiment of a process 500 for point cloud entropy encoding using a context dependent on a neighbor configuration. This example assumes the definition of the neighbors and neighbor configuration numbers used above in connection with fig. 9. This example also assumes that each neighbor configuration has a dedicated probability distribution, which means that there are 64 different probability distributions. The current node 502 has an occupancy pattern 504 to be encoded. The probability distribution is selected based on the nodes in the vicinity of the current node 502. That is, the neighbor configuration NC in [0,63] is found and used to select the associated probability distribution.

It will be appreciated that in some embodiments, neighbor configurations may be grouped such that more than one neighbor configuration uses the same probability distribution, based on similarities in the patterns. In some embodiments, the process may use a different arrangement of neighbors for contextualizing (selecting) the distribution. Additional neighbors may be added, such as the eight neighbors that are diagonally adjacent on all three axes, or the twelve neighbors that are diagonally adjacent on two axes. Embodiments that avoid particular neighbors may also be used, for example to avoid using neighbors that introduce additional dependencies in a depth-first scan, or to introduce dependencies only on particular axes so as to reduce codec state for large trees.

In this example, the case where NC = 0 is handled in a specific manner. If there are no occupied neighbors, this may indicate that the current node 502 is isolated. Accordingly, the process 500 further checks how many of the child nodes of the current node 502 are occupied. If only one child node is occupied (i.e., the number of occupied child nodes (NO) is equal to 1), a flag is encoded indicating that a single child node is occupied, and the index of that node is coded using 3 bits. If more than one child node is occupied, the process 500 uses the NC = 0 probability distribution to code the occupancy pattern.
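
The handling of an isolated node might look like the following sketch. The flag polarity (1 for a single occupied child), the most-significant-bit-first order of the 3-bit index, and the emission of a 0 flag before falling back to pattern coding are assumptions; the description states only that a flag and a 3-bit index are coded in the single-child case.

```python
def code_isolated_node(pattern):
    """Handle a node whose neighbor configuration is NC = 0.

    Returns (bits, pattern_to_entropy_code). When exactly one child is
    occupied, the bits are a flag followed by the 3-bit child index and no
    pattern remains to be coded; otherwise the pattern is passed on to be
    coded with the NC = 0 probability distribution.
    """
    children = [i for i in range(8) if (pattern >> i) & 1]
    if len(children) == 1:
        index = children[0]
        bits = [1, (index >> 2) & 1, (index >> 1) & 1, index & 1]
        return bits, None
    return [0], pattern

print(code_isolated_node(32))  # ([1, 1, 0, 1], None): child 5 is the only one
print(code_isolated_node(85))  # ([0], 85): coded with the NC = 0 distribution
```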

Referring now to fig. 13, an example method 600 for decoding a bitstream of encoded point cloud data is shown in flow chart form.

In operation 602, the decoder selects one of the probability distributions based on occupancy information from one or more nodes in the vicinity of the current node. As described above, the occupancy information may be a parent pattern from the parent node to the current node (i.e., occupancy of the current node and its siblings), or it may be occupancy of neighboring nodes to the current node, which may include some of the sibling nodes. Other or additional occupancy information may be used in some embodiments.

Once the probability distribution has been selected, the decoder entropy decodes a portion of the bitstream using the selected probability distribution to reconstruct the occupancy pattern of the current node in operation 604. The occupancy pattern is used by the decoder to reconstruct the tree in order to reconstruct the encoded point cloud data. Once the point cloud data is decoded, it may be output from the decoder for use, such as for rendering views, segmentation/classification, or other applications.

In operation 606, the decoder updates the probability distribution based on the reconstructed occupancy pattern, and then if there are other nodes to decode, it moves to the next node in the buffer and returns to operation 602.

Example embodiments of the above-described methods have been shown to improve compression with a negligible increase in coding complexity. Neighbor-based selection shows better compression performance than parent-pattern-based selection, although it has greater computational complexity and memory usage. In some tests, the relative improvement in bits per point over the MPEG point cloud test model was between 4% and 20%. It has been noted that initializing the probability distributions based on a distribution derived from test data leads to improved performance compared with initializing them with a uniform distribution.

Some of the above examples are based on a tree coding process that uses a non-binary coder to signal the occupancy pattern. New developments that employ binary entropy coders are presented further below.

In one variation of neighbor-based probability distribution selection, the number of distributions may be reduced by exploiting the symmetry of the neighborhood. Structurally similar configurations that have lines of symmetry can reuse the same distribution, by permuting either the neighborhood or the pattern distribution. In other words, neighbor configurations that can use the same pattern distribution may be grouped into a class. A class that contains more than one neighbor configuration may itself be referred to herein as a "neighbor configuration", in that one of the neighbor configurations effectively subsumes the others by way of reflection or permutation of those other configurations.

As an example, consider the eight corner configurations NC ∈ [21, 22, 25, 26, 37, 38, 41, 42], which are symmetric variants of the same corner neighbor pattern. It is possible that these values of NC correlate well with particular, but different, patterns of a node. It is also possible that these correlated patterns follow the same symmetries as the neighbor patterns. As an example, a method may be implemented in which a single distribution is reused to represent multiple values of NC, the reuse being achieved by permuting the probabilities of that distribution.

The encoder derives a pattern number for the node based on the occupancy of its child nodes. The encoder selects a distribution and a permutation function according to the neighbor configuration. The encoder reorders the probabilities contained in the distribution according to the permutation function and then arithmetically encodes the pattern number using the permuted distribution. Updates made by the arithmetic coder to the probabilities of the permuted distribution are mapped back to the original distribution by the inverse permutation function.

The corresponding decoder first selects the same distribution and permutation function according to the neighbor configuration. The permuted distribution is generated in the same manner as in the encoder and is used by the arithmetic decoder to entropy decode the pattern number. The bits comprising the pattern number are then each assigned to the corresponding child.

It should be noted that the same permutation may be achieved without reordering the data of the distribution itself, by instead introducing a level of indirection and using the permutation function to permute the look-up of a given index in the distribution.

Alternative embodiments consider permutations of the patterns themselves rather than of the distributions, allowing the reordering to take place before entropy encoding or after entropy decoding, respectively. This approach may be more amenable to efficient implementation through bit-wise reordering operations. In this case, neither the encoder nor the decoder reorders the distribution; instead, the computation of the encoded pattern number is modified to [equation missing in source], where cᵢ is the occupancy state of the i-th child and σ(i) is the permutation function. One such example permutation function [equation missing in source] allows the distribution for NC = 22 to be used for NC = 41. The permutation function can be used by a decoder [equation missing in source] to derive the occupancy states of the child nodes from the encoded pattern number.

The required permutation may be derived from the rotational symmetry of the neighbor configuration or from reflections along a particular axis. Furthermore, it is not necessary for the permutation to permute all positions according to, for example, the symmetry; instead, a partial permutation may be used. For example, when permuting NC = 22 to NC = 41, the positions on the axis of symmetry may be left unpermuted, resulting in a mapping [equation missing in source] in which positions 0, 2, 4, and 6 are not permuted. In other embodiments, only the pair 1 and 7 is transposed.
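The pattern-level permutation described above can be illustrated with a short sketch. The equations are not reproduced in this text, so the sketch assumes that the encoded pattern number is formed as the sum over i of 2ⁱ·c_σ(i); that formula, the function names, and the list-based representation of σ are assumptions made for illustration. The example permutation is the transposition of positions 1 and 7 mentioned above, not the full NC = 22 to NC = 41 mapping.

```python
# Sketch only: assumes pn = sum_i 2**i * c[sigma[i]] (an assumption, since the
# source equations are missing). Function names are hypothetical.

IDENTITY = list(range(8))
SWAP_1_7 = [0, 7, 2, 3, 4, 5, 6, 1]  # transposes positions 1 and 7 only


def pattern_number(occupancy, sigma=IDENTITY):
    """Encoder side: pack the child occupancy bits into a pattern number,
    reading child sigma[i] into bit position i."""
    return sum(occupancy[sigma[i]] << i for i in range(8))


def occupancy_from_pattern(pn, sigma=IDENTITY):
    """Decoder side: recover the child occupancy bits by sending bit i of the
    pattern number back to child sigma[i]."""
    occupancy = [0] * 8
    for i in range(8):
        occupancy[sigma[i]] = (pn >> i) & 1
    return occupancy


children = [1, 1, 0, 0, 0, 0, 0, 0]            # children 0 and 1 occupied
plain = pattern_number(children)               # bits 0 and 1 set
permuted = pattern_number(children, SWAP_1_7)  # bits 0 and 7 set
restored = occupancy_from_pattern(permuted, SWAP_1_7)
```

Because only the bit positions are reordered, the probability distribution itself is never touched, which is the point of this alternative.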

An example of an embodiment based on rotational symmetry and reflection is provided below for the special case of an octree with six neighbors sharing a common plane with the current cube. Without loss of generality, the Z-axis extends perpendicularly with respect to the direction of viewing the figure, as shown in fig. 16. Then, the relative position of the neighbors, such as "above" (respectively "below") should be understood as being in the increasing (respectively decreasing) Z direction along the Z axis. The same comments apply to left/right along the X-axis and front/back along the Y-axis.

Fig. 16 shows three rotations 2102, 2104, and 2106 about the Z, Y, and X axes, respectively. Each of the three rotations is by 90 degrees, i.e., each performs a quarter turn about its respective axis.

Fig. 17 shows the invariance categories of neighbor configurations at one or several iterations of rotation 2102 along the Z-axis. This invariance represents the same statistical behavior of the point cloud geometry along any direction belonging to the XY plane. This is particularly true for the use case of a car moving on the earth's surface locally approximated by the XY plane. The horizontal configuration is a given occupancy of four neighbors (located to the left, right, front, and back of the current cube), independent of the occupancy of the upper neighbor (2202) and the lower neighbor (2204). Under rotation 2102, the four horizontal configurations 2206, 2208, 2210, and 2212 belong to the same class of invariance. Similarly, the two configurations 2214 and 2216 belong to the same category of invariance. There are only six categories of invariance under rotation 2102 (grouped under category set 2218).

The vertical configuration is a given occupancy of the two neighbors 2202 and 2204, independent of the occupancy of the four neighbors located to the left, right, front, and back of the current cube. As shown in fig. 18, there are four possible vertical configurations. Thus, if one considers invariance with respect to rotation 2102 along the Z-axis, there are 24 possible configurations, 6 × 4.

The reflection 2108 along the Z-axis is shown in fig. 16. The vertical configurations 2302 and 2304 depicted in fig. 18 belong to the same class of invariance under reflection 2108. There are three categories of invariance under reflection 2108 (grouped under category set 2306). Invariance under reflection 2108 means that in terms of point cloud geometry statistics, the behavior in the up and down directions is substantially the same. This is an accurate assumption of a moving car on the road.

If one assumes invariance under both rotation 2102 and reflection 2108, then there are 18 classes of invariance resulting from the product of the two sets 2218 and 2306. These 18 categories are shown in fig. 19.

If additional invariance is applied under the two other rotations 2104 and 2106, the two configurations 2401 and 2402 belong to the same category of invariance. Likewise, the two configurations 2411 and 2412, the two configurations 2421 and 2422, the three configurations 2431, 2432, and 2433, the two configurations 2441 and 2442, the two configurations 2451 and 2452, and finally the two configurations 2461 and 2462 each belong to the same category. Thus, invariance under the three rotations (2102, 2104, and 2106) and the reflection 2108 yields 10 classes of invariance, as shown in fig. 20.

According to the examples provided above, the number of effective neighbor configurations (i.e., the number of classes into which the 64 neighbor configurations can be grouped) is 64, 24, 18, or 10, depending on which invariances under the three rotations and the reflection are assumed.
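The class counts 64, 24, 18, and 10 can be checked by enumerating the 64 occupancy configurations of the six face-sharing neighbors and grouping them into orbits under the chosen symmetries. The following is a minimal sketch; the representation of neighbors as unit direction vectors and all function names are assumptions made for illustration and are not taken from the described embodiments.

```python
from itertools import combinations

# The six face-sharing neighbors, as unit offsets from the current cube.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def rot_z(v):   # quarter turn about the Z axis
    x, y, z = v
    return (-y, x, z)


def rot_y(v):   # quarter turn about the Y axis
    x, y, z = v
    return (z, y, -x)


def rot_x(v):   # quarter turn about the X axis
    x, y, z = v
    return (x, -z, y)


def refl_z(v):  # reflection exchanging "above" and "below"
    x, y, z = v
    return (x, y, -z)


def count_classes(generators):
    """Count invariance classes of the 64 neighbor configurations under the
    group generated by the given transformations."""
    configs = [frozenset(c) for k in range(7) for c in combinations(DIRECTIONS, k)]
    seen, classes = set(), 0
    for cfg in configs:
        if cfg in seen:
            continue
        classes += 1
        seen.add(cfg)
        stack = [cfg]
        while stack:  # flood-fill the orbit of cfg
            cur = stack.pop()
            for g in generators:
                nxt = frozenset(g(v) for v in cur)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return classes


print(count_classes([]))                            # 64
print(count_classes([rot_z]))                       # 24
print(count_classes([rot_z, refl_z]))               # 18
print(count_classes([rot_z, rot_y, rot_x, refl_z])) # 10
```

Since every transformation has finite order, repeatedly applying the generators alone reaches the whole orbit, so no explicit inverses are needed.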

Before entropy coding, the pattern undergoes the same transformation (i.e., rotations and reflection) as the one that maps the neighbor configuration to its invariance class. This preserves the statistical consistency between the invariant neighbor configuration and the coded pattern.

It should also be understood that during traversal of the tree, a child node will have certain neighboring nodes at the same tree depth that have been previously visited and can be causally used as dependencies. For these same-level neighbors, the same-level neighbors themselves can be used instead of consulting the collocated neighbors of the parent. Since same-level neighbors have half the dimensions of the parent, one configuration considers a neighbor occupied if any of the four directly adjacent neighboring child nodes (i.e., the four child nodes sharing a face with the current node) is occupied.

Entropy coding tree occupancy patterns using binary coding

The above techniques of using neighbor occupancy information for coding tree occupancy are detailed in European patent application No. 18305037.6. The described embodiments focus on non-binary entropy coding of the occupancy pattern, where the pattern distribution is selected based on neighbor occupancy information. However, in some cases, the use of binary coders may be more efficient in terms of hardware implementation. Moreover, on-the-fly updates to many probabilities may require fast access to memory and computation within the core of the arithmetic coder. Accordingly, it may be advantageous to find methods and devices for entropy coding the occupancy pattern using binary arithmetic coders. It would be advantageous to use binary coders if this can be done without significantly degrading compression performance and while avoiding an excessive number of contexts to track.

The use of binary codecs instead of non-binary codecs is reflected in the entropy formula:

H(X₁,X₂|Y) = H(X₁|Y) + H(X₂|Y,X₁)

where X = (X₁,X₂) is the non-binary information to be coded and Y is the context used for coding, i.e., the neighbor configuration or the selected pattern distribution. To convert the non-binary coding of X into binary coding, the information (X₁,X₂) is split into information X₁ and X₂, which can be coded separately without increasing the entropy. To do so, one of the two must be coded dependent on the other, here X₂ dependent on X₁. This can be extended to n bits of information in X. For example, for n = 3:

H(X₁,X₂,X₃|Y) = H(X₁|Y) + H(X₂|Y,X₁) + H(X₃|Y,X₁,X₂)
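The additive form of this decomposition (the chain rule of conditional entropy) can be checked numerically. The sketch below uses an arbitrary, made-up joint distribution over binary (Y, X₁, X₂); the distribution and the helper names are assumptions chosen only to illustrate that coding X₂ conditionally on X₁ costs no extra entropy.

```python
import math
from itertools import product


def marginal(joint, keep):
    """Marginalize a joint distribution onto the variable positions in keep."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def cond_entropy(joint, target, given):
    """H(target | given) = -sum p(t, g) * log2(p(t, g) / p(g)), in bits."""
    p_tg = marginal(joint, target + given)
    p_g = marginal(joint, given)
    n = len(target)
    return -sum(p * math.log2(p / p_g[key[n:]]) for key, p in p_tg.items() if p > 0)


# Hypothetical joint distribution over (Y, X1, X2), positions 0, 1, 2.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
joint = {o: w / sum(weights) for o, w in zip(product((0, 1), repeat=3), weights)}

lhs = cond_entropy(joint, (1, 2), (0,))                                       # H(X1,X2|Y)
rhs = cond_entropy(joint, (1,), (0,)) + cond_entropy(joint, (2,), (0, 1))    # H(X1|Y) + H(X2|Y,X1)
```

The two sides agree up to floating-point rounding for any joint distribution, not only for the one chosen here.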

It should be understood that as the occupancy pattern (i.e., bit sequence X) becomes longer, there are more conditions for coding later bits in the sequence. For binary coders (e.g., CABAC), this means that the number of contexts to be tracked and managed increases dramatically. Take an octree as an example, where the occupancy pattern is an eight-bit sequence b = b₀…b₇. The bit sequence can be split into eight binary information bits b₀, …, b₇. The coding may use the neighbor configuration N (or NC) to determine the context. Assuming that the neighbor configurations can be reduced to 10 effective neighbor configurations by grouping them into invariance classes, as described above, N is an integer belonging to {0, 1, 2, …, 9}. For brevity, a "class of invariant neighbor configurations" may sometimes be referred to herein simply as a "neighbor configuration", but it should be appreciated that this reduced number of neighbor configurations is obtained by grouping neighbor configurations into classes according to invariance.

Fig. 21 illustrates splitting an eight-bit pattern or sequence into eight separate bits for binary entropy coding. It should be noted that the first bit of the sequence, b₀, is coded based on the neighbor configuration alone, so there are ten contexts available in total. The next bit of the sequence, b₁, is coded based on the neighbor configuration and any previously coded bits (i.e., bit b₀). This involves a total of 20 available contexts: the product of 10 from N and 2 from b₀. The final bit b₇ is entropy coded using a context selected from 1280 available contexts: the product of 10 from N and 128 from the partial pattern given by the previously coded bits b₀, …, b₆. That is, for each bit, the number of contexts (i.e., possible combinations of conditions/dependencies) is the product of the defined number of neighbor configurations (10 in this example, based on grouping the 64 neighbor configurations into classes) and the number of possible partial patterns formed by the ordered sequence of the n−1 previously coded bits (given by 2ⁿ⁻¹).

Thus, there are 2550 contexts in total (10 × (1 + 2 + 4 + … + 128) = 10 × 255) to maintain for the binary coding of the occupancy pattern. This is a large number of contexts to track, and the relative scarcity of data for each context can lead to poor performance due to context dilution, especially for later bits in the sequence.
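The context counts quoted above follow directly from multiplying the number of neighbor configurations by the number of partial patterns of the previously coded bits. A minimal sketch of the arithmetic, with hypothetical function and variable names:

```python
def contexts_for_bit(i, num_neighbor_configs=10):
    """Contexts available for bit b_i: one per neighbor configuration and per
    partial pattern of the i previously coded bits (2**i patterns)."""
    return num_neighbor_configs * (1 << i)


per_bit = [contexts_for_bit(i) for i in range(8)]
total = sum(per_bit)

print(per_bit)  # [10, 20, 40, 80, 160, 320, 640, 1280]
print(total)    # 2550
```

Without grouping into invariance classes (64 neighbor configurations instead of 10), the same computation gives 64 × 255 = 16320 contexts, which shows why both the grouping and further context reduction matter.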

Thus, in one aspect, the present application discloses encoders and decoders that determine whether the set of contexts can be reduced and, if so, apply a context reduction operation to obtain a smaller set of available contexts for entropy coding at least part of an occupancy pattern using a binary coder. In another aspect, the present application discloses encoders and decoders that apply one or more rounds of state reduction using the same context reduction operations in order to perform efficient context selection from a fixed number of contexts. In some embodiments, the context reduction is applied a priori when generating look-up tables of contexts and/or algorithmic conditions, which are then used by the encoder or decoder in selecting the appropriate context. The reduction is based on testable conditions that the encoder and decoder evaluate to determine which look-up table to select from, or how to index into that table, to obtain the selected context.

Referring now to fig. 22, an example method 3000 for coding occupancy patterns in a tree-based point cloud coder using binary coding is shown in flow chart form. Method 3000 may be implemented by an encoder or a decoder. In the case of an encoder, the coding operation is encoding, and in the case of a decoder, the coding operation is decoding. The encoding and decoding are context-based entropy encoding and decoding.

The example method 3000 is used for entropy coding of occupancy patterns (i.e., bit sequences) for a particular node/volume. The occupancy pattern represents the occupancy state of a sub-node (sub-volume) of the node/volume. In the case of an octree, there are eight child nodes/sub-volumes. In operation 3002, a neighbor configuration is determined. A neighbor configuration is an occupancy state for one or more volumes that are adjacent to the volume for which the occupancy pattern is to be coded. As discussed above, there are various possible embodiments for determining the neighbor configuration. In some examples, there are 10 neighbor configurations, and the neighbor configuration for the current volume is identified based on the occupancy of six volumes that share a face with the current volume.

In operation 3004, an index i of the child nodes of the current volume is set to 0. Then, in operation 3006, it is evaluated whether context reduction is possible. Different possible context reduction operations are discussed in more detail below. Whether context reduction is possible may be evaluated based on, for example, which bit (e.g., index value) in the bit sequence is being coded. In some cases, context reduction may be possible for later bits in the sequence but not for the first few bits. Evaluating whether context reduction is possible may also be based on, for example, the neighbor configuration, since some neighbor configurations may allow simplification. In some implementations, additional factors may be used to evaluate whether context reduction is possible. For example, an upper limit Bo may be provided as the maximum number of contexts that the binary coder can use to code a bit; if the initial number of contexts for coding the bit is higher than Bo, context reduction is applied (otherwise it is not applied) such that the number of contexts after reduction is at most Bo. Such a limit Bo may be defined in the encoder and/or decoder specification in order to ensure that a software or hardware implementation capable of handling Bo contexts will always be able to encode and/or decode a point cloud without overflowing in terms of the number of contexts. Knowing the limit Bo beforehand also allows the complexity and memory footprint of the binary entropy coder to be anticipated, thus facilitating hardware design. Typical values for Bo range from ten to several hundred.

If context reduction is determined to be available, then in operation 3008 a context reduction operation is applied. The context reduction operation reduces the number of available contexts in the set of available contexts to a smaller set containing fewer total contexts. It will be recalled that, since a context may depend on the partial pattern of previously coded bits of the bit sequence, the number of available contexts may depend in part on the bit position in the sequence, i.e., the index. In some embodiments, prior to reduction, the number of contexts available in the set may be the number of neighbor configurations multiplied by the number of possible partial patterns of the previously coded bits. For a bit at index i (where i ranges from 0 to n), the number of partial patterns is given by 2ⁱ.

As mentioned above, in some embodiments, the context reduction operations are carried out prior to coding, and the resulting reduced context sets are the context sets available to the encoder and decoder during the coding operation. The use and/or selection of a reduced context set during coding may be based on evaluating one or more conditions that correspond to the conditions evaluated in operation 3006 for determining whether the number of contexts can be reduced. For example, in the case of a particular neighbor configuration that permits the use of a reduced context set, the encoder and/or decoder may first determine whether that neighbor configuration condition is satisfied and, if so, use the corresponding reduced context set.

In operation 3010, the context for bit bᵢ is determined, i.e., selected from the set of available contexts (or the reduced set, if any), based on the neighbor configuration and the partial pattern of previously coded bits in the bit sequence. The current bit is then entropy coded by the binary coder using the selected context in operation 3012.

In operation 3014, if the index i indicates that the currently coded bit is the last bit in the sequence (i.e., i equals i_max), then the coding process proceeds to the next node. Otherwise, the index i is incremented in operation 3016, and the process returns to operation 3006.

It should be appreciated that in some embodiments, context selection may not depend on the neighbor configuration. In some cases, it may depend only on the partial pattern (if any) of previously coded bits in the sequence.
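The bit-by-bit loop of method 3000 can be sketched as follows. This is an illustration under stated assumptions, not the described coder: the binary arithmetic coder is replaced by a simple adaptive frequency model whose ideal code length (−log₂ p) is accumulated, no context reduction is applied, and the class and function names are hypothetical.

```python
import math
from collections import defaultdict


class AdaptiveBinaryModel:
    """Hypothetical stand-in for a set of adaptive binary contexts: each
    context keeps counts of zeros and ones, initialized to a uniform prior."""

    def __init__(self):
        self.counts = defaultdict(lambda: [1, 1])

    def prob(self, ctx, bit):
        c0, c1 = self.counts[ctx]
        return (c1 if bit else c0) / (c0 + c1)

    def update(self, ctx, bit):
        self.counts[ctx][bit] += 1


def code_occupancy_pattern(bits, neighbor_config, model):
    """Walk the bit sequence b_0..b_7, selecting for each bit a context from the
    neighbor configuration and the partial pattern of previously coded bits.
    Returns the ideal code length in bits (no real bitstream is produced)."""
    cost = 0.0
    partial = 0  # previously coded bits, packed as an integer
    for i, bit in enumerate(bits):
        ctx = (neighbor_config, i, partial)       # context selection (cf. operation 3010)
        cost += -math.log2(model.prob(ctx, bit))  # entropy coding (cf. operation 3012)
        model.update(ctx, bit)                    # adapt the selected context
        partial |= bit << i
    return cost


model = AdaptiveBinaryModel()
pattern = [1, 0, 0, 1, 0, 0, 0, 1]
first_cost = code_occupancy_pattern(pattern, neighbor_config=3, model=model)
second_cost = code_occupancy_pattern(pattern, neighbor_config=3, model=model)
```

Coding the same pattern a second time under the same neighbor configuration is cheaper because the eight contexts touched on the first pass have adapted. With 2550 contexts and sparse data, many contexts would rarely be revisited, which is the dilution problem that context reduction is meant to address.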

A simplified block diagram of a portion of an example encoder 3100 is illustrated in fig. 23. In this illustration, it is understood that the occupancy pattern 3102 is obtained as the corresponding volume is partitioned into child nodes and is cycled through a FIFO buffer 3104 that holds the geometry of the point cloud. The coding of the occupancy pattern 3102 is illustrated as involving a cascade of binary coders 3106, one binary coder for each bit of the pattern. Between at least some of the binary coders 3106 are context reduction operations 3108, which operate to reduce the available contexts to a smaller set of available contexts.

Although fig. 23 illustrates a series of binary coders 3106, in some embodiments only one binary coder is used. Where more than one coder is used, the coding may be (partially) parallelized. Given the context dependency of a bit on the preceding bits in the bit sequence, the coding of a pattern cannot necessarily be fully parallelized, but pipelining with cascaded binary coders may achieve a certain degree of parallelization and speed-up.

Context determination based on sub-volume neighbor configuration

In the above, the context (i.e., the probability associated with the respective entropy coder) for coding the current occupancy bit bᵢ of the bit sequence is determined, i.e., selected from the set of available contexts (or the reduced set, if any), based on the neighbor configuration and the partial pattern of previously coded bits in the bit sequence. The current bit is then entropy coded by the binary coder using the selected context (probability). Additionally or alternatively, the occupancy states of the child nodes of neighboring nodes of the current node (e.g., of occupied neighboring nodes that have already been coded) may be taken into account to determine the context (e.g., probability) for coding bit bᵢ of the bit sequence for the current node. Next, embodiments that determine the context by taking into account the occupancy states of the child nodes of the neighboring nodes of the current node will be described.

In some embodiments, the method of encoding or decoding a bit sequence may use knowledge of the occupancy states of the child nodes of the nodes neighboring the current node to determine the context. That is, knowledge of the occupancy of the child nodes of already-coded occupied neighboring nodes (possibly all neighboring nodes, or all neighboring nodes that have already been coded) may be used to drive (possibly together with the neighbor configuration N, e.g., reduced to ten configurations N₁₀) the entropy coder that codes the occupancy bit bᵢ of the current node.

Hereinafter, without intended limitation, a method of encoding or decoding a bit sequence that uses knowledge of the occupancy states of the child nodes of the neighboring nodes of the current node to determine a context will be described with reference to an octree, with cubes as the volumes associated with the nodes of the octree.

Referring now to fig. 27, there is shown a current node (i.e., its associated volume, or current volume) 4000 and its six neighbors 4010, 4020, 4030, 4040, 4050, and 4060. For the present example of an octree, the neighbors of the current node may be defined as those nodes (at the same level or depth of the tree) whose associated volumes share a face with the current volume. Volumes that share a face with each other may also be referred to as volumes that are in contact with each other. Other definitions of neighboring nodes are also possible. For example, the neighbors of the current node may be those nodes (at the same level or depth of the tree) whose associated volumes share an edge (or a vertex) with the current volume. In general, regardless of the tree structure, neighboring nodes may be those nodes (at the same level or depth of the tree) whose associated volumes intersect the current volume (e.g., at least in some predefined non-empty set).

In the context of the present application, it is understood that volumes (nodes) that intersect each other are neighboring volumes (nodes). Thus, the terms "intersect with … …" and "adjacent to … …" may be considered synonymous in the context of the present application.

It is noted that the expressions "volume" and "sub-volume" may be used somewhat interchangeably in the sense that each sub-volume is itself a volume that may be subdivided into sub-volumes. In any case, the volume/sub-volume relationships can be clearly understood by specifying parent-child relationships between the involved nodes/volumes.

Assume that the nodes are scanned in a breadth-first manner, in increasing X order, then in increasing Y order, and finally in increasing Z order. In that case, the three neighbors with the lowest X coordinate (i.e., neighbor 4010), the lowest Y coordinate (i.e., neighbor 4030), and the lowest Z coordinate (i.e., neighbor 4050) have already been coded. Thus, if one of these three neighbors is occupied, the configuration of occupied sub-volumes associated with that occupied neighbor is known. Although the present example defines the scan order as increasing X, then increasing Y, and finally increasing Z, other breadth-first scan orders may be used for this purpose. Regardless of the particular breadth-first scan order used, three neighbors have already been coded for any current node that is not at the boundary of the 3D space.

Referring now to fig. 28, an example is shown in which all three already-coded neighbors of the current volume (i.e., neighbors 4010, 4030, and 4050) are occupied. The occupied sub-volumes of neighbor 4010 are sub-volumes 4011, 4012, and 4013; the occupied sub-volumes of neighbor 4030 are sub-volumes 4031, 4032, and 4033; and the occupied sub-volumes of neighbor 4050 are sub-volumes 4051 and 4052. In this example, all three already-coded neighbors are occupied, but it should be understood that in general only two or only one of these three neighbors may actually be occupied, or even none of them.

A set of sub-volume neighbor configurations may be defined using knowledge of the occupied sub-volumes of the already-coded occupied neighbors, e.g., one sub-volume neighbor configuration for each sub-volume of the current volume. The sub-volume neighbor configuration may be used as a criterion for selecting the context (e.g., the probability associated with the corresponding binary entropy coder) to be applied to entropy coding of bit bᵢ of the bit sequence of the current volume. Determining the sub-volume neighbor configuration for a given sub-volume of the current volume may be based on occupancy data for the sub-volumes of those neighboring volumes of the current volume that have already been coded. Additionally, the sub-volume neighbor configuration for a given sub-volume of the current volume may correspond to the occupancy pattern of all those sub-volumes in the neighboring (already coded) volumes of the current volume that neighbor the given sub-volume.

Referring now to fig. 29, an example of a method 4100 of encoding a point cloud to generate a bitstream of compressed point cloud data is shown in flow chart form. The point cloud is defined in a tree structure having a plurality of nodes that have parent-child relationships and that represent the geometry of a volumetric space that is recursively split into sub-volumes and contains the points of the point cloud. A bit sequence is used to indicate occupancy of the sub-volumes of a volume, wherein each bit of the bit sequence indicates occupancy of a respective sub-volume within the volume in a scan order, and the volume has a plurality of neighboring volumes. The operations of method 4100 described below are performed for a current node associated with a current volume split into sub-volumes, where each sub-volume corresponds to a child node of the current node. In operation 4110, the bit sequence indicating occupancy of the sub-volumes of the current volume is determined. Thereafter, the following is performed for at least one bit of the bit sequence of the current volume. In operation 4120, a sub-volume neighbor configuration is determined based on occupancy data for sub-volumes of at least one neighboring volume of the current volume. The sub-volume neighbor configuration depends on the occupancy pattern of a set of sub-volumes of the at least one neighboring volume that neighbor a given sub-volume of the current volume. The given sub-volume of the current volume is the sub-volume corresponding to the bit in the bit sequence. For example, for bit bᵢ of the bit sequence, the given sub-volume is the sub-volume corresponding to bit bᵢ, i.e., the sub-volume whose occupancy state is indicated by bit bᵢ. A sub-volume neighbor configuration is thus determined for the considered sub-volume of the current volume. It is worth noting that operation 4120 may be performed for any, some, or all bits of the bit sequence. In operation 4130, a probability (e.g., context) for entropy coding of the bit in the bit sequence is selected. Here, the selection is based at least in part on the sub-volume neighbor configuration that has been determined for the sub-volume corresponding to the bit in the bit sequence. In operation 4140, the bit in the bit sequence is entropy encoded using a binary entropy encoder based on the selected probability (e.g., context) to produce encoded data for the bitstream.

In some embodiments, the method 4100 may also include an operation (not shown in fig. 29) to update the selected probability (e.g., context).

As already described above, the occupancy pattern of the current node may be entropy coded using a cascade of one or more binary entropy codecs. Accordingly, operation 4130 of method 4100 may involve: for at least one bit of the bit sequence, a respective probability for coding the bit (and correspondingly, an associated entropy codec) is selected based at least in part on the sub-volume neighbor configuration that has been determined for the sub-volume corresponding to the bit to be entropy coded. Furthermore, the selection of the probability may be based on a partial sequence of already coded bits of the bit sequence and/or a neighbor configuration of the current volume. In other words, for each bit of the bit sequence, a context may be selected based on the sub-volume neighbor configuration, and further, selecting a context may be based on a partial sequence of bits of the bit sequence that have been coded and/or a neighbor configuration of the current volume. In context, it may be said that operation 4130 of method 4100 involves selecting a context for entropy coding bits based at least in part on sub-volume neighbor configurations that have been determined for the sub-volume corresponding to the bits to be coded. Then, in some embodiments, the context may be updated after operation 4140.

Referring now to fig. 30, an example of a method 4200 of decoding a bitstream of compressed point cloud data to produce a reconstructed point cloud is shown in flow chart form. The point cloud is defined in a tree structure (e.g., an octree) having a plurality of nodes that have parent-child relationships and that represent the geometry of a volumetric space that is recursively split into sub-volumes and contains the points of the point cloud. A bit sequence is used to indicate occupancy of the sub-volumes of a volume, wherein each bit of the bit sequence indicates occupancy of a respective sub-volume within the volume in a scan order, and the volume has a plurality of neighboring volumes. The operations of method 4200 described below are performed for a current node associated with a current volume split into sub-volumes, where each sub-volume corresponds to a child node of the current node, and are performed for at least one bit of the bit sequence of the current volume. In operation 4210, a sub-volume neighbor configuration is determined based on occupancy data for sub-volumes of at least one neighboring volume of the current volume. The sub-volume neighbor configuration depends on the occupancy pattern of a set of sub-volumes of the at least one neighboring volume that neighbor a given sub-volume of the current volume, the given sub-volume being the sub-volume of the current volume that corresponds to the bit in the bit sequence. For example, for bit bᵢ of the bit sequence, the given sub-volume is the sub-volume corresponding to bit bᵢ, i.e., the sub-volume whose occupancy state is indicated by bit bᵢ. A sub-volume neighbor configuration is thus determined for the considered sub-volume of the current volume. It is noted that operation 4210 may be performed for any, some, or all bits of the bit sequence. In operation 4220, a probability (e.g., context) for entropy decoding of the bit in the bit sequence is selected. The selection is based at least in part on the sub-volume neighbor configuration. In operation 4230, the at least one bit is entropy decoded using a binary entropy decoder based on the selected probability (e.g., context) to produce a reconstructed bit from the bitstream.

In certain embodiments, method 4200 may further include an operation (not shown in fig. 30) to update the selected probabilities (e.g., contexts).

In the same manner as encoding, operation 4220 of method 4200 may involve: for at least one bit of the bit sequence representing the occupancy pattern, a respective probability for coding the bit (and correspondingly, the associated entropy codec) is selected based at least in part on the sub-volume neighbor configuration that has been determined for the sub-volume corresponding to the bit to be entropy coded. Furthermore, selecting the probability may be based on a partial sequence of already coded bits in the bit sequence and/or a neighbor configuration of the current volume. In other words, for each bit of the bit sequence, a context may be selected based on the sub-volume neighbor configuration, and further, selecting a context may be based on a partial sequence of bits in the bit sequence that have been coded and/or a neighbor configuration of the current volume. In context, it may be said that operation 4220 of method 4200 involves selecting a context for entropy coding bits based at least in part on sub-volume neighbor configurations that have been determined for the sub-volume corresponding to the bits to be coded. Then, in some embodiments, the context may be updated after operation 4230.

An example of determining the sub-volume neighbor configuration will be described below.

In some embodiments, this may involve: determining, based on occupancy data of sub-volumes of at least one neighboring volume of the current volume, the number of sub-volumes of the at least one neighboring volume (possibly all neighboring volumes, or all neighboring volumes that have already been coded) that are adjacent to the sub-volume of the current volume corresponding to the bit in the bit sequence. This number will be referred to as NT[i] hereinafter, where i is an index indicating the sub-volume of the current volume (or the corresponding bit b_i in the bit sequence). In this sense, selecting the probability in operation 4130 of method 4100 and in operation 4220 of method 4200 may be said to involve selecting the probability based at least in part on the number NT[i].

In some implementations, determining the sub-volume neighbor configuration can further involve: a threshold function is applied to the determined number NT [ i ].

Referring to fig. 31, an example of splitting the current volume (current node) B into eight sub-volumes (child nodes) SB0 to SB7 is shown. For this example, neighboring volumes are defined as volumes that share a face with each other (i.e., that "touch" each other). Each sub-volume of the current volume B is adjacent to (touches) three sub-volumes of the neighbors of the current volume. The sub-volume neighbor configuration of each sub-volume SBi (i = 0, …, 7) may depend on the occupancy pattern of the set of sub-volumes touching the respective sub-volume SBi of the current volume. Depending on the occupancy pattern of the occupied neighbors of the current volume, at most three occupied sub-volumes of the occupied neighbors of the current volume may be adjacent to (touch, where touching/adjacent is defined as sharing a face) the respective sub-volume SBi.

One example process of determining the sub-volume neighbor configuration of a given sub-volume SBi of the current volume, using the above definition of neighboring volumes, involves determining the number 0 ≤ NT[i] ≤ 3 of occupied sub-volumes in the neighbors of the current volume that touch the respective sub-volume SBi. In other words, the number NT[i] for sub-volume SBi of the current volume indicates the number of occupied sub-volumes, among the sub-volumes of the neighbors of the current volume, that are adjacent to (touch, i.e., share a face with) the sub-volume SBi.

By definition above, the integer number 0 ≦ NT [ i ] ≦ 3 of the sub-volume SBi of the current volume is the number of occupied sub-volumes (sub-nodes) that are located in the (already coded) neighborhood of the current volume and intersect the sub-volume SBi on the face (i.e., the face that touches the sub-volume SBi).

In a preferred embodiment, the neighbors are the six neighbors that share a face with the current cube, and a sub-volume of an occupied neighbor is said to be adjacent to the child node SBi if and only if one of the faces of that child of the occupied neighbor is included in a face of the cube associated with the child node SBi (i.e., if the child node of the occupied neighbor intersects the child node SBi in a face). However, it should be understood that the present application is not limited to the foregoing octree embodiment, and is not limited to defining neighbors as nodes whose associated volume shares a face with the associated volume of the current node at the same level. In fact, the present application relates to any tree representing the geometry of a point cloud. Furthermore, the neighbors of the current node may generally be defined as nodes at the same depth or level (relative to the root node of the tree) as the current node whose associated volumes intersect the volume of the current node (e.g., at least in a predefined non-empty set). In a tree where the volume associated with a node is a cube (such as an octree), the intersection may be any non-empty set of faces, edges, vertices, or points.

Similarly, if the intersection of the volume associated with a child node with the volume associated with the child node SB is a non-empty set of points, then the occupied child nodes of the current node's neighbors are said to intersect with the child node SB of the current node B. Further, in a tree where the volume associated with a node is a cube (such as, for example, an octree), the intersection may be any non-empty set of faces, edges, vertices, or points.

It should also be understood that the neighbor definitions for the neighbor and the child nodes of the neighbor may be different. For example, in an octree, neighbors may be defined as nodes that share a face with the current node, resulting in six neighbors; however, the number NT [ i ] may count the number of neighbor occupied child nodes that have been coded to share at least one edge with the child node SBi. However, the preferred embodiment may use the definition of the intersection as a "shared surface" for both the neighbor and the child nodes of the neighbor.

As mentioned above, different definitions of neighboring nodes or volumes may be used in the context of the present application. Depending on the neighbor definition, the possible range of the number NT [ i ] will be different from 0 ≦ NT [ i ] ≦ 3 (this applies to the case where the neighboring volume is defined as a shared surface or "touching" volume). For example, if adjacent volumes are defined as volumes that intersect each other at least in the edges, then there will be 0 ≦ NT [ i ] ≦ 12. Similarly, if neighboring volumes are defined as volumes that intersect at least in the vertex, then there will be 0 ≦ NT [ i ] ≦ 19.

As mentioned above, with neighboring volumes defined as volumes sharing a face, at most three occupied sub-volumes of the occupied neighbors of the current volume may be adjacent to the respective sub-volume (child node) SBi, depending on the occupancy pattern of the occupied neighbors of the current volume. Fig. 32 shows that for the child node SB0 of the current node B, assuming the scan order described above with reference to fig. 27, there are at most three occupied child nodes of the neighbor nodes that may be adjacent to the sub-volume associated with the child node SB0 (i.e., that may intersect the sub-volume associated with the child node SB0 in a face). The eight possible configurations of the occupied child nodes are depicted in the figure, with associated values NT[0] of 0, 1, 2, or 3.

FIG. 33 shows that for the child node SB1 of the current node B, there are at most two occupied child nodes of the neighbor nodes that may be adjacent to (i.e., intersect in a face) the sub-volume associated with the child node SB1. The four possible configurations of the occupied child nodes are depicted, along with associated values NT[1] of 0, 1, or 2.

FIG. 34 shows that for the child node SB3 of the current node B, there is at most one occupied child node of the neighbor nodes that may be adjacent to (i.e., intersect in a face) the sub-volume associated with the child node SB3. The two possible configurations of the occupied child node are depicted, along with associated values NT[3] of 0 or 1.

FIG. 35 shows that for the child node SB7 of the current node B, there are no occupied child nodes of the neighbor nodes that may be adjacent to (i.e., intersect in a face) the sub-volume associated with the child node SB7. Thus, there is only one possible configuration, as depicted in the figure, and NT[7] must have a value of 0. In this case, no additional information is available for context determination when coding the occupancy information of the child node SB7.

Once the number NT[i] has been determined for the child node SBi of the current node, the sub-volume neighbor configuration value C[i] can be determined from NT[i] by applying a threshold function. The threshold function may be a function that outputs its input value up to a certain threshold and outputs the threshold for input values exceeding it. For example, the threshold may be set to 2, so that

· C[i] = 0 if NT[i] = 0

· C[i] = 1 if NT[i] = 1

· C[i] = 2 if NT[i] ≥ 2

With the thresholding operation, possible values for C [0] include 0,1, or 2, while C [7] is always 0. It should be understood that other thresholding operations and threshold functions are possible. Specifically, for other neighbor definitions, the value range for the number NT [ i ] will be different from 0 ≦ NT [ i ] ≦ 3, and different thresholds may be applied.
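The thresholding described above amounts to clamping the count. A minimal sketch follows, assuming the example threshold of 2; the function name `threshold` and its `limit` parameter are illustrative and do not come from the application.

```python
def threshold(nt: int, limit: int = 2) -> int:
    """Return nt unchanged up to `limit`, and `limit` for anything larger."""
    if nt < 0:
        raise ValueError("NT[i] is a count and cannot be negative")
    return min(nt, limit)


# With face-sharing neighbors, NT[i] ranges over 0..3.
c_values = [threshold(nt) for nt in range(4)]
print(c_values)  # [0, 1, 2, 2]
```

Other neighbor definitions would give a wider range for NT[i], in which case a different `limit` could be passed.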

Then, selecting a probability (e.g., context) in operation 4130 of method 4100 and in operation 4220 of method 4200 may involve selecting the probability based at least in part on the sub-volume neighbor configuration value C[i]. In this sense, the sub-volume neighbor configuration value C[i] may be said to correspond to the sub-volume neighbor configuration.

Referring now to FIG. 36, one example of a method 4300 of determining the numbers NT[i] and the sub-volume neighbor configuration values C[i] for the sub-volumes of the current volume is shown in flowchart form. That is, the method 4300 operates on the sub-volumes of the current volume. In operation 4310, the number NT[i] is initialized to zero for all sub-volumes of the current volume. Then, in operation 4320, a neighbor of the current volume is selected. In operation 4330, it is checked whether the selected neighbor is occupied. If the selected neighbor is occupied ("yes" in operation 4330), the method proceeds to operation 4340. In operation 4340, it is checked whether the selected occupied neighbor has already been coded. If so ("yes" in operation 4340), the method proceeds to operation 4350. In operation 4350, it is determined, for each occupied child node (sub-volume) of the neighbor, which sub-volumes of the current volume are adjacent to that child node (depending on the neighbor definition, e.g., intersecting at least in a predefined non-empty set). For each sub-volume of the current volume adjacent to the child node, the corresponding number NT[i] is incremented by 1. Subsequently, the method proceeds to operation 4360. If the selected neighbor is not occupied ("no" in operation 4330) or has not yet been coded ("no" in operation 4340), the method also proceeds to operation 4360. In operation 4360, it is checked whether there are neighbors of the current volume that have not yet been selected. If there are ("yes" in operation 4360), the method returns to operation 4320 to select the next neighbor of the current volume. Once all neighbors have been processed ("no" in operation 4360), the sub-volume neighbor configuration value C[i] is calculated in operation 4370 for each sub-volume of the current volume. This may involve applying a threshold function. Then, in operation 4380, the corresponding bit b_i of the bit sequence may be entropy coded using a probability (e.g., context) that is based on the sub-volume neighbor configuration value C[i] and possibly on a partial sequence of already coded bits of the bit sequence and/or a (reduced) neighbor configuration.
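The loop over neighbors in method 4300 can be sketched as follows for the face-sharing definition. This is only an illustration under stated assumptions: the index-to-position mapping i = 4x + 2y + z, the dictionary layout for neighbor data, and the convention that bit j of `children` marks child j as occupied are all hypothetical and are not taken from the application.

```python
from typing import Dict, List, Tuple

Offset = Tuple[int, int, int]


def child_position(i: int) -> Tuple[int, int, int]:
    """Hypothetical index-to-position mapping, i = 4*x + 2*y + z."""
    return (i >> 2) & 1, (i >> 1) & 1, i & 1


def count_adjacent_children(
    neighbors: Dict[Offset, dict], limit: int = 2
) -> Tuple[List[int], List[int]]:
    nt = [0] * 8                                  # operation 4310
    for offset, info in neighbors.items():        # operations 4320 / 4360
        if not info["occupied"]:                  # operation 4330
            continue
        if not info["coded"]:                     # operation 4340
            continue
        for j in range(8):                        # operation 4350
            if not (info["children"] >> j) & 1:
                continue
            cx, cy, cz = child_position(j)
            # Position of the neighbor's child on the grid of half-size cells.
            gx = 2 * offset[0] + cx
            gy = 2 * offset[1] + cy
            gz = 2 * offset[2] + cz
            for i in range(8):
                x, y, z = child_position(i)
                # Two cells share a face when they are one step apart on one axis.
                if abs(gx - x) + abs(gy - y) + abs(gz - z) == 1:
                    nt[i] += 1
    c = [min(v, limit) for v in nt]               # operation 4370
    return nt, c


FULL = 0xFF  # all eight children occupied
neighbors = {
    (-1, 0, 0): {"occupied": True, "coded": True, "children": FULL},
    (0, -1, 0): {"occupied": True, "coded": True, "children": FULL},
    (0, 0, -1): {"occupied": True, "coded": True, "children": FULL},
    (1, 0, 0): {"occupied": True, "coded": False, "children": FULL},
    (0, 1, 0): {"occupied": False, "coded": True, "children": 0},
}
nt, c = count_adjacent_children(neighbors)
print(nt)  # [3, 2, 2, 1, 2, 1, 1, 0]
print(c)   # [2, 2, 2, 1, 2, 1, 1, 0]
```

Under these assumptions the maxima for SB0, SB1, SB3, and SB7 (3, 2, 1, and 0) agree with the values described for those child nodes above; the occupied but not yet coded neighbor contributes nothing.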

In an alternative embodiment, instead of looping over the neighbors of the current volume, a loop over the sub-volume SBi of the current volume (or equivalently, over the index i) may be performed. Then, for each sub-volume SBi, the occupied sub-nodes of the neighbors that have been coded are tested to determine whether they are adjacent to the sub-volume SBi (e.g., intersect at least in a predefined non-empty set), and NT [ i ] is obtained as the number of such sub-nodes adjacent to the sub-volume SBi.

In the above determination of NT [ i ], all neighbors of a sub-volume of the current volume are treated equally regardless of their intersection with the sub-volume. In some implementations, different weights may be applied to neighbors that have different intersections with the sub-volume in question. For example, when applying a broader neighborhood definition, a neighboring sub-volume sharing a face with the sub-volume in question may have a higher weight than a neighboring sub-volume sharing (only) an edge with the sub-volume in question, which may have a higher weight than a neighboring sub-volume sharing (only) a vertex with the sub-volume in question. These different weights can then be applied in determining the weighted number NT '[ i ], wherein each neighbor of the sub-volume in question is counted with its corresponding weight in the sum resulting in the weighted number NT' [ i ].

Thus, determining a sub-volume neighbor configuration in operation 4120 of method 4100 and in operation 4210 of method 4200 may involve: determining, based on occupancy data of sub-volumes of the at least one neighboring volume of the current volume, all those sub-volumes of the at least one neighboring volume (possibly all neighboring volumes, or all neighboring volumes that have already been coded) that intersect the sub-volume of the current volume corresponding to the bit in the bit sequence. Additionally, a respective weight factor may be applied to each determined sub-volume. Each weight factor may depend on the intersection of the respective determined sub-volume with the sub-volume of the current volume that corresponds to the bit in the bit sequence. For example, a sub-volume that intersects the sub-volume in question in a face may have the highest weight, a sub-volume that intersects it (only) in an edge may have a medium weight, and a sub-volume that intersects it (only) in a vertex may have the lowest weight. In some embodiments, only two of the three neighbor definitions may be considered (which in effect corresponds to setting the weight factor for neighbors under the third definition to zero). For example, the respective weight factors applied to sub-volumes that intersect the sub-volume in question in a face, an edge, and a vertex may be set to 2, 1, and 0, respectively.

Then, determining the sub-volume neighbor configuration may further involve: determining, based on the determined sub-volumes and their respective weight factors, a weighted number NT'[i] of sub-volumes of the at least one neighboring volume that intersect the sub-volume of the current volume that corresponds to the bit in the bit sequence. That is, when summing the determined sub-volumes of the at least one neighboring volume that intersect the sub-volume of the current volume corresponding to the bit in the bit sequence, each determined sub-volume may be counted with its respective weight factor. In addition, thresholding may be applied to the weighted number NT'[i] to obtain a weighted sub-volume neighbor configuration value C'[i].
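A weighted count of this kind can be sketched as below, using the example weights 2, 1, and 0 for face, edge, and vertex intersections. The representation of sub-volumes as integer cell positions on a common grid and the names `WEIGHTS` and `weighted_count` are assumptions made for illustration.

```python
# Number of axes on which two touching cells differ -> weight.
# 1 axis: shared face, 2 axes: shared edge, 3 axes: shared vertex.
WEIGHTS = {1: 2, 2: 1, 3: 0}


def weighted_count(sub_pos, occupied_neighbor_children, weights=WEIGHTS):
    """Weighted number NT'[i] for the sub-volume located at `sub_pos`."""
    total = 0
    for pos in occupied_neighbor_children:
        deltas = [abs(a - b) for a, b in zip(sub_pos, pos)]
        if max(deltas) != 1:
            continue  # same cell, or too far away to touch
        # With every delta in {0, 1}, the sum is the number of differing axes.
        total += weights[sum(deltas)]
    return total


occupied = [(-1, 0, 0), (-1, -1, 0), (-1, -1, -1), (-2, 0, 0)]
nt_weighted = weighted_count((0, 0, 0), occupied)
print(nt_weighted)  # 3  (face 2 + edge 1 + vertex 0; the last cell does not touch)
```

A threshold could then be applied to `nt_weighted` to obtain C'[i], in the same way as for the unweighted count.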

In some embodiments, the probability (e.g., context) associated with the binary entropy coder that codes a given occupancy bit b_i of the bit sequence may be selected depending on the following values:

· b₀,…,b_{i-1}

· N₁₀

· C[i]

That is, once determined, the sub-volume neighbor configuration value C[i] (or the weighted sub-volume neighbor configuration value C'[i]) can be used as an input to the entropy coder that codes the occupancy bit b_i associated with the child node SBi of the current node B.
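To make the dependency concrete, the sketch below keys an adaptive probability on the triple (b₀,…,b_{i-1}, N₁₀, C[i]). The count-based estimator and the class name `ContextModel` are assumptions chosen for simplicity; the application does not prescribe how the probability attached to a context is estimated or updated.

```python
from collections import defaultdict


class ContextModel:
    """Toy adaptive model: one pair of counts per context key."""

    def __init__(self):
        # [number of zeros seen, number of ones seen], starting from 1 and 1.
        self.counts = defaultdict(lambda: [1, 1])

    @staticmethod
    def key(prev_bits, n10, c_i):
        return (tuple(prev_bits), n10, c_i)

    def p_zero(self, key):
        zeros, ones = self.counts[key]
        return zeros / (zeros + ones)

    def update(self, key, bit):
        self.counts[key][bit] += 1


model = ContextModel()
k = ContextModel.key([1, 0], n10=3, c_i=2)
p_before = model.p_zero(k)   # 0.5 before any bit has been seen
model.update(k, 0)
model.update(k, 0)
p_after = model.p_zero(k)    # 0.75 after two zeros
```

An encoder and a decoder that apply the same updates in the same order would stay synchronized, which is why the context must depend only on already coded information.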

Context reduction operations

The above examples propose that the coding process include a context reduction operation with respect to at least one bit of the occupancy pattern in order to reduce the set of available contexts to a smaller set. In this sense, a "context reduction operation" may be understood as identifying and merging contexts that are considered duplicative or redundant in the context of coding a particular bit b_i. As mentioned above, the reduced context set may be determined prior to coding and provided to the encoder and decoder, and the encoder and decoder determine whether to use the reduced context set based on the same conditions, described below, for reducing the context set.

Neighbor configuration reduction by screening/masking

A first example context reduction operation involves reducing the number of neighbor configurations based on screening/masking. In principle, the neighbor configuration factors the occupancy states of the neighboring volumes into the context selection process on the basis that the neighboring volumes help indicate whether the current volume or sub-volume is likely to be occupied. As the bits associated with sub-volumes of the current volume are coded, they too are taken into account for context selection; however, information from nearby sub-volumes may be more significant and more informative than the occupancy information of a neighboring volume that lies on the other side of those sub-volumes from the current sub-volume. In this sense, the previously coded bits are associated with sub-volumes that "screen" or "mask" the neighboring volume. This may mean that in such cases the occupancy of the neighboring volume can be ignored, since the relevance of its occupancy state is subsumed by the occupancy states of the sub-volumes between the current sub-volume and the neighboring volume, which allows the number of neighbor configurations to be reduced.

Referring now to FIG. 24, an example context reduction operation based on neighbor screening is graphically illustrated. The example relates to coding the occupancy pattern of a volume 3200. The occupancy pattern represents the occupancy states of the eight sub-volumes within the volume 3200. In this example, the four sub-volumes in the upper half of the volume 3200 have already been coded, so their occupancy states are known. The bit of the occupancy pattern being coded is associated with a fifth sub-volume 3204, which is located in the lower half of the volume 3200 below the four previously coded sub-volumes.

In this example, coding includes determining the context based on the neighbor configuration. The 10 neighbor configurations 3202 are shown. The volume 3200 containing the fifth sub-volume 3204 to be coded is shown in light gray. The neighbor configurations 3202 are based on the occupancy states of the volumes that are adjacent to the volume 3200 and share a face with it. The neighboring volumes include a top neighboring volume 3206.

In this example, the number of neighbor configurations may be reduced from 10 to 7 by ignoring the top neighboring volume 3206 in at least some of the configurations. As shown in fig. 24, three of the four configurations that show the top neighboring volume 3206 may be subsumed under equivalent configurations that do not include the top neighboring volume 3206, thereby reducing the number of neighbor configurations to a total of 7. It may still be advantageous to keep the configuration showing all six neighboring volumes, since there is no existing 5-volume neighbor configuration into which the 6-volume configuration could be merged (the one 5-element configuration having been eliminated), which means that even if the top neighboring volume were removed, a new 5-element neighbor configuration would result and no overall reduction in contexts would occur.

In this example, the top neighboring volume 3206 may be eliminated from the neighbor configuration because the context determination for encoding the occupancy bits associated with the fifth sub-volume 3204 will have considered the occupancy states of the four previously encoded sub-volumes directly above it, which better indicate the likelihood and directionality of occupancy of the fifth sub-volume than the occupancy states of the more distant top neighboring volume 3206.

The above example of screening or masking the top contiguous volume 3206 when encoding the occupancy bits corresponding to the fifth sub-volume 3204 by a previously encoded sub-volume is merely one example. Depending on the codec order within the volume 3200, a variety of other possible screening/masking scenarios may be implemented and utilized to reduce the available neighbor configurations.

Referring now to fig. 25, a second example of screening/masking is shown. In this example, the occupancy pattern of the volume 3200 has been almost completely coded. The sub-volume to be coded is the eighth sub-volume, which is hidden in the figure at the rear bottom corner (not visible). In this case, the occupancy states of all seven other sub-volumes have been coded, specifically the sub-volumes along the top (thereby reducing the neighbor configurations to a total of seven) and those along the right and front sides. Thus, in addition to screening the top neighboring volume, the sub-volumes with previously coded occupancy bits mask the front neighboring volume 3210 and the right neighboring volume 3212. This may allow the neighbor configurations to be reduced from a total of seven to a total of five, as illustrated.

It will be appreciated that the two foregoing examples of shielding are illustrative, and in some cases, different configurations may be incorporated to address different shielding scenarios. The context reduction operation based on masking/screening by the sub-volumes of the previous codec is general and not limited to these two examples, but it should be appreciated that the context reduction operation cannot be applied in the case of the first sub-volume to be coded, since there needs to be at least one previously coded occupancy bit associated with the nearby sub-volume in order to be used for any masking/screening.

It should also be appreciated that the degree of masking/screening that justifies a neighbor configuration reduction may differ between embodiments. In both of the above examples, all four sub-volumes sharing a face with a neighboring volume had been previously coded before that neighboring volume was considered masked/screened and thus removed from the neighbor configuration. In other examples, partial masking/screening may be sufficient, for example by one to three previously coded sub-volumes sharing a face with the neighboring volume.

Context reduction through special case handling

There are certain situations in which context reduction can occur without loss of useful information. In the example context determination process described above, the context for coding an occupancy bit is based on the neighbor configuration (i.e., the occupancy pattern of the volumes neighboring the current volume) and on the partial pattern attributable to the occupancy of the previously coded sub-volumes in the current volume. The latter results in 2⁷ = 128 contexts to be tracked for the eighth bit of the occupancy pattern bit sequence. Even if the neighbor configurations are reduced to a total of five, this means that 640 contexts are to be tracked.

The number of contexts is large because the previously coded bits of the bit sequence have an order, and that order is relevant when evaluating the context. However, in some cases the order may not contain useful information. For example, when the neighbor configuration is empty (i.e., N₁₀ = 0), any points within the volume may be presumed to be sparsely populated, meaning that the points do not have strong enough directionality to justify tracking separate contexts for different occupancy patterns in the sibling sub-volumes. With an empty neighborhood, the point cloud has no local orientation or topology, which means that the 2^j conditions based on the j previously coded bits of the bit sequence can be reduced to j + 1 conditions. That is, the context for coding one of the bits of the bit sequence is based on the previously coded bits, though not on their ordered pattern but on their sum. In other words, the entropy expression in this particular case can be expressed as:

H(b|n)≈H(b₀|0)H(b₁|0,b₀)H(b₂|0,b₀+b₁)…H(b₇|0,b₀+b₁+…+b₆)
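The reduction from 2^j ordered conditions to j + 1 sum-based conditions can be illustrated with a short sketch. The function name `empty_neighbor_context` and the choice of (bit position, sum) as the context key are illustrative assumptions.

```python
def empty_neighbor_context(prev_bits):
    """Context key when N10 = 0: bit position and number of ones so far."""
    return len(prev_bits), sum(prev_bits)


# Two different orderings with the same sum map to the same context.
same = empty_neighbor_context([1, 0, 1]) == empty_neighbor_context([0, 1, 1])

# Conditions over all eight bit positions, with and without the reduction.
ordered = sum(2 ** j for j in range(8))
reduced = sum(j + 1 for j in range(8))
print(same, ordered, reduced)  # True 255 36
```

For the empty neighbor configuration alone, the 255 order-dependent conditions collapse to 36.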

In some embodiments, similar observations may be made with respect to a full neighbor configuration. In some examples, the full neighbor configuration lacks directionality, which means that the order of the previously coded bits need not be considered in determining the context. In some examples, the context reduction operation may be applied to only some of the bits in the bit sequence, such as some of the later bits in the sequence. In some cases, applying this context reduction operation to later bits may be conditioned on determining that the earlier bits associated with previously coded sub-volumes are also all occupied.

Statistical-based context reduction

Statistical analysis can be used to reduce contexts by determining which contexts cause approximately the same statistical behavior, and then combining the contexts. This analysis can be performed a priori using test data to develop a reduced set of contexts, which are then provided to both the encoder and decoder. In some cases, analysis may be performed on the current point cloud using a two-pass codec to develop a custom reduced set of contexts for the particular point cloud data. In some such cases, a mapping from a non-reduced context set to a custom-reduced context set may be indicated to the decoder by using a dedicated syntax that is coded into a bitstream.

Two contexts can be compared through the concept of a "distance". A first context c has a probability p that bit b is equal to zero, and a second context c' has a probability p' that bit b' is equal to zero. The distance between c and c' is given by:

d(c,c’)＝|p log₂p–p’log₂p’|+|(1-p)log₂(1-p)–(1-p’)log₂(1-p’)|

Using this measure of similarity (distance), the contexts can then be grouped in a process such as:

1. Start with M₁ contexts and fix a threshold level ε

2. For a given context, regroup into a class all contexts whose distance from the given context is less than the threshold level ε

3. Repeat step 2 for all non-regrouped contexts until every context has been placed into a class

4. Label the M₂ classes from 1 to M₂; this results in a brute force reduction function that maps [1,2,…,M₁] → [1,2,…,M₂], where M₁ ≥ M₂.
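The distance and the grouping steps above can be sketched as follows. The greedy choice of the first ungrouped context as the seed of each class, the example probabilities, and the value ε = 0.05 are assumptions for illustration only.

```python
import math


def xlog2x(p: float) -> float:
    """p * log2(p), with the usual convention that the value is 0 at p = 0."""
    return 0.0 if p == 0.0 else p * math.log2(p)


def distance(p: float, q: float) -> float:
    """d(c, c') for contexts whose probabilities of a zero bit are p and q."""
    return abs(xlog2x(p) - xlog2x(q)) + abs(xlog2x(1 - p) - xlog2x(1 - q))


def group_contexts(probs, eps):
    """Return (labels, M2): labels[c] in 1..M2 is the class of context c."""
    labels = [0] * len(probs)          # 0 means "not yet regrouped"
    n_classes = 0
    for c, p in enumerate(probs):      # step 3: visit every ungrouped context
        if labels[c]:
            continue
        n_classes += 1                 # step 4: classes are labeled 1..M2
        for d in range(c, len(probs)):
            # step 2: collect ungrouped contexts closer than eps to the seed
            if not labels[d] and distance(p, probs[d]) < eps:
                labels[d] = n_classes
    return labels, n_classes


probs = [0.10, 0.11, 0.50, 0.52, 0.90]           # step 1: M1 = 5 contexts
labels, m2 = group_contexts(probs, eps=0.05)
print(labels, m2)  # [1, 1, 2, 2, 3] 3
```

The resulting `labels` list plays the role of a lookup table from the original context index to the reduced one.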

A brute force reduction function for mapping a context set to a smaller context set may be stored in memory for application by the encoder/decoder as a context reduction operation during coding. The mapping may be stored as a lookup table or other data structure. For example, the brute force reduction function may be applied only to later bits in the bit sequence (pattern).

Combinations and subcombinations of context reduction operations

Three example context reduction operations are described above. Each of these context reduction operations may be applied separately and independently in some embodiments. Any two or more of these context reduction operations may be combined in some embodiments. The additional context reduction operations may be implemented alone or in combination with any one or more of the context reduction operations described above.

Fig. 26 illustrates, in flowchart form, one example of a method 3300 of occupancy pattern binary coding involving combined context reductions. Method 3300 codes the 8-bit binary pattern b₀,b₁,…,b₇ given a 10-element neighbor configuration N₁₀ in {0,1,2,…,9}. The first condition evaluated is whether the neighbor configuration is empty, i.e., N₁₀ = 0. If the neighbor configuration is empty, the bits are coded without reference to their order, as indicated by reference numeral 3302. Otherwise, the bits are coded in the normal manner until bit b₄, at which point the encoder and decoder begin applying the brute force context reduction functions BR_i to reduce the number of contexts by mapping the context set defined by the neighbor configuration and the partial pattern of previously coded bits to a smaller context set with substantially similar statistics.

In this example, the last two bits b₆ and b₇ are coded using reduced neighbor configurations based on screening/masking.

The functions may be implemented as lookup tables (LUTs) for reducing the size of the context set. In a practical implementation, all the reductions are factorized into a reduction function (i.e., simply a LUT) that takes the contexts as input and provides the reduced contexts as output. In this example embodiment, the total number of contexts has been reduced from 2550 to 576, with the output sizes of the reduction functions BR_i being 70, 106, 110, and 119, respectively.

Overall, the dependencies (or conditions) in the conditional entropy

H(b_i | b₀,…,b_{i-1}, N₁₀)

can be reduced by a reduction function BR_i to obtain a conditional entropy relationship with fewer dependencies:

H(b_i | BR_i[b₀,…,b_{i-1}, N₁₀]).

As described above, sub-volume neighbor configurations (e.g., the sub-volume neighbor configuration value C[i] or the weighted sub-volume neighbor configuration value C'[i]) can be used for coding the occupancy bit b_i. In this case, the conditional entropy relationship becomes

H(b_i | b₀,…,b_{i-1}, N₁₀, C[i])

and a reduction similar to the above may be performed to obtain

H(b_i | BR_i[b₀,…,b_{i-1}, N₁₀], C[i]).

In the above, the reduction function BR_i is applied, via BR_i[b₀,…,b_{i-1}, N₁₀], to the partial bit sequence b₀,…,b_{i-1} and the 10-element neighbor configuration N₁₀. In some embodiments, the reduction function BR_i may additionally be applied to the sub-volume neighbor configuration (e.g., to the sub-volume neighbor configuration value C[i] or the weighted sub-volume neighbor configuration value C'[i]), for example via BR_i[b₀,…,b_{i-1}, N₁₀, C[i]], to obtain the conditional entropy relationship:

H(b_i | BR_i[b₀,…,b_{i-1}, N₁₀, C[i]]).

In this case, the probability p (e.g., context) of the binary entropy coder that codes the occupancy bit b_i may be selected depending on (BR_i[b₀,…,b_{i-1}, N₁₀], C[i]).

If the sub-volume neighbor configuration is taken into account when entropy coding the occupancy bit b_i, the above-described fig. 26 is modified to fig. 37, which shows, in flowchart form, one example of a method 4400 of occupancy pattern binary coding that involves combined context reductions and is based on sub-volume neighbor configurations (e.g., sub-volume neighbor configuration values C[i] or weighted sub-volume neighbor configuration values C'[i]). Method 4400 codes the 8-bit binary pattern b₀,b₁,…,b₇ given a 10-element neighbor configuration N₁₀ in {0,1,2,…,9} and the sub-volume neighbor configuration values C[i] (or weighted sub-volume neighbor configuration values C'[i]). The first condition evaluated is whether the neighbor configuration is empty, i.e., N₁₀ = 0. If the neighbor configuration is empty, the bits are coded without reference to their order, as indicated by reference numeral 4402. Otherwise, the bits are coded in the normal manner until bit b₄, at which point the encoder and decoder begin applying the brute force context reduction functions BR_i to reduce the number of contexts by mapping the context set defined by the neighbor configuration and the partial pattern of previously coded bits to a smaller context set with substantially similar statistics.

In this example, the last two bits b₆ and b₇ are coded using reduced neighbor configurations based on screening/masking.

Selection of scan order within a current volume

Introducing sub-volume neighbor configurations (e.g., sub-volume neighbor configuration values C [ i ] or weighted sub-volume neighbor configuration values C' [ i ]) in the entropy codec increases the possible configurations for conditional coding.

However, this increase can be limited by carefully selecting the scan order of the child nodes SBi within the current volume B, which determines the occupancy pattern and thus the bit sequence to be coded. Specifically, the coding of bit b_i depends on:

- bits b₀ to b_{i-1}, which give rise to 2ⁱ configurations,

- the 10 neighbor configurations N₁₀, and

- the sub-volume neighbor configuration value C[i] (or the weighted sub-volume neighbor configuration value C'[i]).

It has been shown above how the 10·2ⁱ configurations arising from the first two bulleted points can be reduced to 10, 20, 39, 76, 149, 294, 391, and 520 configurations for the sub-volumes SB0 through SB7, respectively.

It is therefore advantageous to scan first those child nodes for which the number of possible values of C[i] is the largest (e.g., three in some of the above examples). In so doing, the number of configurations for the first scanned child node becomes 10 · 3 = 30, which can be regarded as a small increase. If, however, the child node SB0 were scanned last, 520 · 3 = 1560 configurations would be obtained. Thus, by carefully selecting the scan order within the current volume (which determines the occupancy pattern and thus the bit sequence to be coded), introducing sub-volume neighbor configurations (e.g., sub-volume neighbor configuration values C[i] or weighted sub-volume neighbor configuration values C'[i]) in the entropy codec has a limited impact on the number of configurations (contexts) used in entropy coding.
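The arithmetic behind this argument can be checked with a short Python sketch. The per-position counts are the reduced figures quoted in the text; the list of C[i] value counts per child node is a purely hypothetical example, and the function names are illustrative.

```python
from typing import Sequence

# Reduced configuration counts per scan position 0..7, as quoted in the text.
REDUCED = [10, 20, 39, 76, 149, 294, 391, 520]


def configurations(position: int, c_values: int) -> int:
    """Configurations for one scan position when C[i] can take c_values values."""
    return REDUCED[position] * c_values


def total_configurations(c_values_in_scan_order: Sequence[int]) -> int:
    """Total over all eight positions for a given assignment of C[i] ranges."""
    return sum(configurations(k, n) for k, n in enumerate(c_values_in_scan_order))


# Hypothetical numbers of possible C[i] values for the eight child nodes.
c_ranges = [3, 3, 2, 2, 2, 1, 1, 1]

# Largest ranges scanned first versus largest ranges scanned last.
best = total_configurations(sorted(c_ranges, reverse=True))
worst = total_configurations(sorted(c_ranges))
```

With these illustrative ranges, scanning the widest-range child nodes first gives 1823 configurations in total, against 3840 when they are scanned last, because the large multipliers then meet the small per-position counts.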

Thus, in some embodiments, the scan order within the current volume may be determined such that from one sub-volume to the next sub-volume in the scan order, the maximum possible number of adjacent sub-volumes in the adjacent volume of the current volume that have been coded does not increase. In other words, the scan order may be determined such that the maximum possible value of the number NT [ i ] (or the weighted number NT' [ i ]) does not increase from one sub-volume to the next in the scan order.
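A scan order satisfying this non-increasing condition can be sketched in Python as follows. The per-child maxima of NT[i] used here are an assumption made only for illustration: they suppose that the already-coded neighboring volumes lie on the lower x, y and z sides of the current volume, and that the child node with index 4x + 2y + z touches one such volume for each of its coordinates equal to 0.

```python
from typing import List, Sequence


def max_nt_by_child() -> List[int]:
    """Illustrative maxima of NT[i] for SB0..SB7 (assumed geometry, see lead-in)."""
    result = []
    for index in range(8):
        x, y, z = (index >> 2) & 1, (index >> 1) & 1, index & 1
        result.append([x, y, z].count(0))
    return result


def scan_order(max_nt: Sequence[int]) -> List[int]:
    """Order child nodes so that the maximum possible NT never increases."""
    # sorted() is stable, so ties keep their original index order.
    return sorted(range(len(max_nt)), key=lambda k: -max_nt[k])


def is_non_increasing(order: Sequence[int], max_nt: Sequence[int]) -> bool:
    """Check the condition stated in the text for a candidate scan order."""
    values = [max_nt[k] for k in order]
    return all(a >= b for a, b in zip(values, values[1:]))


max_nt = max_nt_by_child()
order = scan_order(max_nt)
```

Under these assumptions the maxima are [3, 2, 2, 1, 2, 1, 1, 0] and the resulting order is SB0, SB1, SB2, SB4, SB3, SB5, SB6, SB7; the plain index order 0 to 7 would violate the condition at SB4.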

Markers indicating use of sub-volume neighbor configurations

To allow for a low-complexity profile of the codec, it may be advantageous to code a marker (flag) that signals whether the sub-volume neighbor configuration is taken into account (activated) or not (deactivated) when determining the probability (e.g., context) for entropy coding the bit sequence.

Thus, in some embodiments, the encoding method may further include: encoding a marker indicating that a probability of entropy encoding for at least one bit has been selected based at least in part on the sub-volume neighbor configuration. Also, the decoding method may further include: decoding a marker from the bitstream, the marker indicating that a probability of entropy decoding for at least one bit should be selected based at least in part on the sub-volume neighbor configuration.
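The signaling can be illustrated with a minimal Python sketch. The representation of the marker as a single bit, its position in the bitstream, and all names below are assumptions made for illustration; the text does not specify them.

```python
from typing import List, Tuple


def write_marker(stream: List[int], use_subvolume_neighbors: bool) -> None:
    """Encoder side: append the marker as one bit (assumed representation)."""
    stream.append(1 if use_subvolume_neighbors else 0)


def read_marker(stream: List[int], pos: int) -> Tuple[bool, int]:
    """Decoder side: read the marker and return it with the next read position."""
    return stream[pos] == 1, pos + 1


def context_key(partial: Tuple[int, ...], n10: int, c_i: int,
                use_subvolume_neighbors: bool) -> Tuple:
    """Include C[i] in the context only when the marker activates it."""
    if use_subvolume_neighbors:
        return (partial, n10, c_i)
    return (partial, n10)


# Round trip: the decoder recovers the same setting the encoder signaled.
stream: List[int] = []
write_marker(stream, True)
enabled, next_pos = read_marker(stream, 0)
```

Because both sides derive the context from the same marker value, a low-complexity decoder can be served by a bitstream in which the marker is off and C[i] is never computed.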

Context selection in a system with a fixed number of contexts

Each of the previously described context reduction operations may be further used in a compression system having a static (fixed) minimum number of contexts. In such a design, for a given symbol in an 8-bit binary pattern, one or more reduction operations are applied to determine a context probability model for encoding or decoding the symbol.

Influence on compression performance

The use of 10 neighbor configurations with non-binary coding provides a compression gain over the current implementation of the MPEG test model for point cloud coding. However, the use proposed above of 10 neighbor configurations with cascaded binary coding using 2550 contexts yields an even greater improvement in compression efficiency. Even when context reduction (such as the three techniques detailed above) is used to bring the total down to 576 contexts, binary coding still compresses slightly better than implementations using non-binary coding and much better than the test model. This observation has been shown to be consistent across different test point cloud data.

Referring now to Fig. 14, a simplified block diagram of an example embodiment of an encoder 1100 is shown. The encoder 1100 includes a processor 1102, a memory 1104, and an encoding application 1106. The encoding application 1106 may include a computer program or application stored in the memory 1104 and containing instructions that, when executed, cause the processor 1102 to perform operations such as those described herein. For example, the encoding application 1106 may encode a bitstream and output an encoded bitstream according to the processes described herein. It should be appreciated that the encoding application 1106 may be stored on a non-transitory computer-readable medium such as an optical disc, a flash memory device, random access memory, a hard drive, and the like. When executing instructions, the processor 1102 performs the operations and functions specified in the instructions so as to operate as a special purpose processor that implements the described process(es). In some examples, this processor may be referred to as "processor circuitry".

Reference is now also made to Fig. 15, which shows a simplified block diagram of an example embodiment of a decoder 1200. The decoder 1200 includes a processor 1202, a memory 1204, and a decoding application 1206. The decoding application 1206 may include a computer program or application stored in the memory 1204 and containing instructions that, when executed, cause the processor 1202 to perform operations such as those described herein. It should be appreciated that the decoding application 1206 may be stored on a computer-readable medium such as an optical disc, a flash memory device, random access memory, a hard drive, and the like. When executing instructions, the processor 1202 performs the operations and functions specified in the instructions so as to operate as a special purpose processor that implements the described process(es). In some examples, this processor may be referred to as "processor circuitry".

It should be appreciated that a decoder and/or encoder in accordance with the present application may be implemented in a number of computing devices, including but not limited to servers, appropriately programmed general purpose computers, machine vision systems, and mobile devices. The decoder or encoder may be implemented by software containing instructions for configuring one or more processors to perform the functions described herein. The software instructions may be stored in any suitable non-transitory computer readable memory, including CD, RAM, ROM, flash memory, etc.

It will be appreciated that the decoders and/or encoders described herein, as well as the modules, routines, processes, threads or other software components implementing the described methods/processes for configuring an encoder or decoder, may be implemented using standard computer programming techniques and languages. The application is not limited to a particular processor, computer language, computer programming convention, data structure, or other such implementation detail. Those skilled in the art will appreciate that the processes described may be implemented as part of computer-executable code stored in volatile or non-volatile memory, as part of an Application Specific Integrated Chip (ASIC), etc.

The present application also provides a computer readable signal encoding data generated by applying an encoding process according to the present application.

Certain adaptations and modifications of the described embodiments can be made. The embodiments discussed above are therefore to be considered in all respects as illustrative and not restrictive.
