Method for improving V-PCC (Video-based Point Cloud Compression) inter-frame prediction by three-dimensional inter-frame prediction

Document No.: 1802491  Publication date: 2021-11-05  Views: 30  Language: Chinese

Note: This technology, "A method of improving V-PCC inter-frame prediction by three-dimensional inter-frame prediction", was designed and created by Zheng Mingkui, Wang Shi, Qiu Xin, Huang Xin and Wang Zefeng on 2021-07-27. Its main content is as follows:

The invention relates to a method for improving V-PCC inter-frame prediction by three-dimensional inter-frame prediction. First, for each point of the current-frame point cloud, the nearest point in the previous-frame point cloud is found and its distance recorded. Then the points of the current point cloud are sorted in ascending order with priority (x, y, z), and an 8×8×8 bounding box is constructed with the first point as the coordinate origin, extending in the direction of increasing coordinate values. Next, a tolerable point cloud distortion threshold is set, and the mean nearest-point distance inside the bounding box is computed; by comparing the mean with the distortion threshold, the point cloud is divided into static point clouds (mean below the threshold, effectively predictable by three-dimensional inter-frame prediction) and dynamic point clouds. For a static point cloud, a cost function is constructed to extend the bounding box along the best dimension, and each time the points inside the bounding box are removed from the current frame. Finally, the above steps are repeated until all points are classified; the dynamic point clouds are fused and encoded with V-PCC, while for the static point clouds the minimum and maximum coordinates of their bounding boxes are entropy-coded.

1. A method for improving V-PCC (Video-based Point Cloud Compression) inter-frame prediction by three-dimensional inter-frame prediction, characterized by comprising the following encoding process:

step M1, finding, for each point of the current-frame point cloud, the closest point in the previous-frame point cloud, and recording the closest-point distance;

step M2, sorting the points of the current point cloud in ascending order with priority x, y, z, and constructing an 8×8×8 bounding box with the first point as the coordinate origin, extending in the direction of increasing coordinate values;

step M3, first setting the maximum tolerable point-to-point distortion of the point cloud as a threshold mse_th, computing the mean closest-point distance of the points inside the bounding box and comparing it with mse_th; if the mean is smaller than mse_th, the point cloud can be effectively predicted by three-dimensional inter-frame prediction and is called a static point cloud; if the mean is larger than mse_th, the motion of the point cloud is large, it cannot be effectively predicted by three-dimensional inter-frame prediction, and it is called a dynamic point cloud;

step M4, extending the static point cloud's bounding box by one unit along each of the x, y and z dimensions to obtain different candidate bounding boxes, where the extended boxes must be non-empty; computing the mean square distance after each extension, the dimension with the minimum mean square distance being the finally selected extension dimension;

step M5, judging whether the extended mean square distance is smaller than the threshold mse_th; if so, updating the static point cloud and returning to step M4; if not, the bounding box cannot be extended further;

step M6, for a static point cloud that cannot be extended further, entropy-coding the minimum and maximum coordinates of its bounding box;

step M7, fusing the dynamic point clouds in all bounding boxes that cannot be effectively predicted by three-dimensional inter-frame prediction into a new frame of point cloud;

step M8, repeating steps M2 to M7, each time removing from the current coding frame the points inside the determined bounding boxes, until all points are coded;

and step M9, performing inter-frame prediction coding on the finally fused point cloud using V-PCC.

2. A method for improving V-PCC inter-frame prediction by three-dimensional inter-frame prediction, characterized by further comprising the following decoding process:

step S1, judging the name of the bitstream and selecting the corresponding decoding mode;

step S2, since there are only two types of point cloud, if the current point cloud is not a static point cloud it must be a dynamic point cloud, so the corresponding decoding method, V-PCC decoding, is adopted;

step S3, for the static point cloud bitstream, recovering the minimum and maximum coordinates of the bounding box by entropy decoding;

step S4, determining the boundary of the bounding box from the minimum and maximum coordinate points, and taking the points of the already-decoded previous frame that lie inside the bounding box to restore the static point cloud;

and step S5, fusing the decoded dynamic point cloud and the decoded static point cloud to recover the final decoded point cloud.

Technical Field

The invention relates to a method for improving V-PCC (Video-based Point Cloud Compression) inter-frame prediction by three-dimensional inter-frame prediction.

Background

A point cloud is a massive collection of points sampled from a target's surface, each carrying geometric information (x, y, z) and attribute information (e.g., R, G, B, reflection intensity). Compared with a traditional 2D image, a 3D point cloud can express a target object or scene more accurately, and is widely applied in virtual reality, augmented reality, autonomous driving, medical treatment, high-precision maps and other fields. However, the number of points in a point cloud exceeds the number of pixels in a conventional 2D image by at least an order of magnitude, and the points are unordered; efficient compression of point clouds is therefore very challenging and is essential for their storage and transmission.

The Moving Picture Experts Group (MPEG) designed the following scheme for the compression of dynamic point clouds: a normal vector is first computed for each point of a point cloud frame; neighbouring points with similar normal vectors are aggregated and projected onto a 2D plane to form irregular image blocks (patches) comprising a geometry image and a texture image, which are assembled into video sequences and compressed with HEVC. This patch-based method solves the problem that some occluded points cannot be recovered one-to-one, but it also destroys the continuity of the point cloud, which hinders the removal of spatio-temporal redundancy and affects the efficiency of the subsequent video coding.

The present invention first estimates the relative motion between point cloud frames and segments out the static point cloud whose inter-frame change is close to 0, so that the static point cloud can be recovered directly from the already-encoded previous frame and the entropy-coded boundary information. The dynamic point cloud is fused and compressed with the V-PCC method. A large number of points thus need not be coded, saving bit overhead.

Disclosure of Invention

The invention aims to provide a method for improving V-PCC inter-frame prediction by three-dimensional inter-frame prediction, so as to remedy the problem that the patch method adopted by V-PCC cannot fully exploit the spatio-temporal correlation of a point cloud sequence.

In order to achieve this purpose, the technical scheme of the invention is as follows: a method for improving V-PCC inter-frame prediction by three-dimensional inter-frame prediction, comprising the following encoding process:

step M1, finding, for each point of the current-frame point cloud, the closest point in the previous-frame point cloud, and recording the closest-point distance;

step M2, sorting the points of the current point cloud in ascending order with priority x, y, z, and constructing an 8×8×8 bounding box with the first point as the coordinate origin, extending in the direction of increasing coordinate values;

step M3, first setting the maximum tolerable point-to-point distortion of the point cloud as a threshold mse_th, computing the mean closest-point distance of the points inside the bounding box and comparing it with mse_th; if the mean is smaller than mse_th, the point cloud can be effectively predicted by three-dimensional inter-frame prediction and is called a static point cloud; if the mean is larger than mse_th, the motion of the point cloud is large, it cannot be effectively predicted by three-dimensional inter-frame prediction, and it is called a dynamic point cloud;

step M4, extending the static point cloud's bounding box by one unit along each of the x, y and z dimensions to obtain different candidate bounding boxes, where the extended boxes must be non-empty; computing the mean square distance after each extension, the dimension with the minimum mean square distance being the finally selected extension dimension;

step M5, judging whether the extended mean square distance is smaller than the threshold mse_th; if so, updating the static point cloud and returning to step M4; if not, the bounding box cannot be extended further;

step M6, for a static point cloud that cannot be extended further, entropy-coding the minimum and maximum coordinates of its bounding box;

step M7, fusing the dynamic point clouds in all bounding boxes that cannot be effectively predicted by three-dimensional inter-frame prediction into a new frame of point cloud;

step M8, repeating steps M2 to M7, each time removing from the current coding frame the points inside the determined bounding boxes, until all points are coded;

and step M9, performing inter-frame prediction coding on the finally fused point cloud using V-PCC.

In an embodiment of the present invention, the following decoding process is further included:

step S1, judging the name of the bitstream and selecting the corresponding decoding mode;

step S2, since there are only two types of point cloud, if the current point cloud is not a static point cloud it must be a dynamic point cloud, so the corresponding decoding method, V-PCC decoding, is adopted;

step S3, for the static point cloud bitstream, recovering the minimum and maximum coordinates of the bounding box by entropy decoding;

step S4, determining the boundary of the bounding box from the minimum and maximum coordinate points, and taking the points of the already-decoded previous frame that lie inside the bounding box to restore the static point cloud;

and step S5, fusing the decoded dynamic point cloud and the decoded static point cloud to recover the final decoded point cloud.

Compared with the prior art, the invention has the following beneficial effects: the method first performs three-dimensional inter-frame prediction, avoiding the loss of point cloud continuity, and hence of removable redundancy, caused by decomposing the point cloud into 2D irregular image blocks. Because the three-dimensional inter-frame prediction is applied on top of the current V-PCC inter-frame prediction, the method can only be superior to the V-PCC method.

Drawings

FIG. 1 is a general flow diagram of the present invention.

Detailed Description

The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.

Fig. 1 is the general flowchart of the method for improving V-PCC inter-frame prediction by three-dimensional inter-frame prediction according to the present invention, which is divided into an encoding process and a decoding process. The method comprises the following steps:

1) The encoding process is shown in the encoding flow chart of Fig. 1(1):

Step M1, find, for each point of the current-frame point cloud, the closest point in the previous-frame point cloud, and record the closest-point distance.
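Step M1 can be sketched as follows (a minimal brute-force sketch in Python; the function name and the use of squared distances are illustrative assumptions, and a k-d tree would replace the inner loop for real point clouds):

```python
def nearest_sq_distances(current, previous):
    """Step M1: for every point of the current frame, find the squared
    distance to its nearest neighbour in the previous frame."""
    result = []
    for px, py, pz in current:
        # brute-force nearest-neighbour search over the previous frame
        result.append(min((px - qx) ** 2 + (py - qy) ** 2 + (pz - qz) ** 2
                          for qx, qy, qz in previous))
    return result

prev_frame = [(0, 0, 0), (1, 0, 0), (0, 2, 0)]
curr_frame = [(0, 0, 1), (1, 0, 0)]
print(nearest_sq_distances(curr_frame, prev_frame))  # [1, 0]
```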

Step M2, sort the points of the current point cloud in ascending order with priority x, y, z, and construct an 8×8×8 bounding box with the first point as the coordinate origin, extending in the direction of increasing coordinate values.
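The sorting and initial box construction of step M2 can be sketched as follows (function name is illustrative; Python's tuple comparison gives exactly the x-then-y-then-z priority):

```python
def first_bounding_box(points, size=8):
    """Step M2: sort points in ascending (x, y, z) priority and build a
    size*size*size box whose origin is the first (smallest) point."""
    pts = sorted(points)   # tuple comparison = priority x, then y, then z
    origin = pts[0]
    inside = [p for p in pts
              if all(origin[i] <= p[i] < origin[i] + size for i in range(3))]
    return origin, inside

pts = [(9, 9, 9), (0, 1, 2), (3, 4, 5), (0, 0, 0)]
origin, inside = first_bounding_box(pts)
print(origin)   # (0, 0, 0)
print(inside)   # [(0, 0, 0), (0, 1, 2), (3, 4, 5)]
```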

Step M3, first set the maximum tolerable point-to-point distortion of the point cloud as a threshold mse_th, compute the mean closest-point distance of the points in the bounding box and compare it with mse_th. A mean smaller than the threshold indicates that the point cloud can be predicted by three-dimensional inter-frame prediction (the relative motion of the points in the bounding box is small), so it is called a static point cloud; a mean larger than the threshold indicates that the motion is large and three-dimensional inter-frame prediction is ineffective, so it is called a dynamic point cloud.

Step M4, extend the static point cloud's bounding box by one unit along each of the x, y and z dimensions to obtain different candidate bounding boxes, where the extended boxes must be non-empty; compute the mean square distance after each extension, the dimension with the minimum mean square distance being the finally selected extension dimension.

Step M5, judge whether the extended mean square distance is smaller than the threshold mse_th; if so, update the static point cloud and return to step M4; if not, the bounding box cannot be extended further.
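Steps M4 and M5 together form a greedy growth loop. A sketch under two stated assumptions: "non-empty" is read as "the extension adds at least one new point", and the cost function is simply the mean squared nearest-point distance inside the grown box (the patent does not spell out its exact form):

```python
def extend_box(box_min, box_max, point_dists, mse_th):
    """Steps M4-M5: repeatedly try to grow the box by one unit along x, y
    or z, keep the axis giving the smallest mean squared nearest-point
    distance, and stop once no admissible extension stays below mse_th.
    point_dists maps (x, y, z) -> squared nearest-point distance."""
    def inside(mn, mx):
        return [d for p, d in point_dists.items()
                if all(mn[i] <= p[i] <= mx[i] for i in range(3))]

    while True:
        base_count = len(inside(box_min, box_max))
        best = None
        for axis in range(3):                    # M4: try x, y, z
            grown = list(box_max)
            grown[axis] += 1
            pts = inside(box_min, tuple(grown))
            if len(pts) <= base_count:           # extension adds no new point
                continue
            mean = sum(pts) / len(pts)
            if best is None or mean < best[0]:
                best = (mean, tuple(grown))
        if best is None or best[0] >= mse_th:    # M5: threshold test
            return box_min, box_max              # cannot be extended further
        box_max = best[1]                        # update and loop back to M4

dists = {(0, 0, 0): 0, (1, 0, 0): 0, (2, 0, 0): 10}
print(extend_box((0, 0, 0), (0, 0, 0), dists, mse_th=1.0))
# ((0, 0, 0), (1, 0, 0))  -- grows along x once, then stops at the moving point
```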

Step M6, for a static point cloud that cannot be extended further, entropy-code the minimum and maximum coordinates of its bounding box.

Step M7, fuse the dynamic point clouds in all bounding boxes that cannot be effectively predicted by three-dimensional inter-frame prediction into a new frame of point cloud.

Step M8, repeat steps M2 to M7, each time removing from the current coding frame the points inside the determined bounding boxes, until all points are coded.

Step M9, perform inter-frame prediction coding on the finally fused point cloud using V-PCC.

Steps M6 and M8 produce two encoded bitstreams, named the static point cloud bitstream and the dynamic point cloud bitstream respectively.

2) The decoding process is shown in the decoding flow chart of Fig. 1(2):

Step S1, judge the name of the bitstream and select the corresponding decoding mode.

Step S2, since there are only two types of point cloud, if the current point cloud is not a static point cloud it must be a dynamic point cloud, so the corresponding decoding method, V-PCC decoding, is adopted.

Step S3, recover the minimum and maximum coordinates of the bounding box by entropy-decoding the static point cloud bitstream.

Step S4, determine the boundary of the bounding box from the minimum and maximum coordinate points, and take the points of the already-decoded previous frame that lie inside the bounding box to restore the static point cloud.
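Step S4 is the heart of the bit savings: a static region is reconstructed simply by copying previous-frame points, with no residual transmitted. A sketch (function name is illustrative):

```python
def restore_static(prev_frame, box_min, box_max):
    """Step S4: reconstruct a static region by copying the already-decoded
    previous frame's points that fall inside the decoded bounding box."""
    return [p for p in prev_frame
            if all(box_min[i] <= p[i] <= box_max[i] for i in range(3))]

prev_frame = [(0, 0, 0), (5, 5, 5), (9, 9, 9)]
print(restore_static(prev_frame, (0, 0, 0), (7, 7, 7)))
# [(0, 0, 0), (5, 5, 5)]
```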

Step S5, fuse the decoded dynamic point cloud and the decoded static point cloud to recover the final decoded point cloud.

Although the present invention has been described with reference to preferred embodiments, they are not intended to limit it; those skilled in the art may make variations and modifications using the methods and technical content disclosed above without departing from the spirit and scope of the invention. The above is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the invention fall within its scope.
