VR-oriented real-time image compression method, system and storage medium

Document No.: 1046905    Publication date: 2020-10-09

Abstract: This invention, a VR-oriented real-time image compression method, system and storage medium, was created by 郭玉会, 周鹏 and 冀德 on 2019-03-27. The invention comprises: obtaining a frame of VR original image to be compressed, dividing the original image into a plurality of macroblocks, performing mode prediction on the pixels in each macroblock to obtain a predicted value for each macroblock, and taking the difference between each macroblock's predicted value and its own original pixel values to obtain residual data; transforming the residual data from the time domain to the frequency domain by discrete cosine transform, numerically quantizing the resulting frequency-domain data to obtain quantized data, entropy coding the quantized data to obtain a code stream for each macroblock, and splicing the code streams of all macroblocks in order to obtain the code stream of the original image as the compression result. The whole process does not buffer any data that would affect real-time performance, so good real-time performance is ensured.

1. A VR-oriented real-time image compression method, characterized by comprising the following steps:

step 1, obtaining a frame of VR original image to be compressed, dividing the original image into a plurality of macroblocks, performing mode prediction on the pixels in each macroblock to obtain a predicted value for each macroblock, and taking the difference between the predicted value of each macroblock and its own original pixel values to obtain residual data;

step 2, transforming the residual data from the time domain to the frequency domain by discrete cosine transform, numerically quantizing the frequency-domain data obtained by the transform to obtain quantized data, entropy coding the quantized data to obtain a code stream for each macroblock, and splicing the code streams of all macroblocks in order to obtain the code stream of the original image as the compression result.

2. The VR-oriented real-time image compression method of claim 1, wherein the mode prediction in step 1 specifically comprises: performing an inverse-quantization restoration operation on the quantized data of the frame preceding the original image to obtain a reconstructed pixel value for each macroblock of that preceding frame; collecting the reconstructed pixel values of the macroblocks of the preceding frame into a prediction set; finding, in the prediction set, the reconstructed pixel value closest in color to the current macroblock as a comparison value; and using the difference between the comparison value and the pixel values of the current macroblock as the predicted value.

3. The VR-oriented real-time image compression method of claim 2, wherein the numerical quantization in step 2 specifically comprises: dividing the coefficients of the residual data under a standard basis matrix by an integer value, and using the resulting sparse matrix as the quantized data.

4. The VR-oriented real-time image compression method of any one of claims 1-3, wherein step 1 further includes:

step 11, after the VR video data to be compressed is obtained, dividing the VR video data into a plurality of image segments according to the content characteristics of the VR video data, and obtaining a frame of VR original image to be compressed from each image segment.

5. A VR-oriented real-time image compression system, comprising:

a module 1, configured to obtain a frame of VR original image to be compressed, divide the original image into a plurality of macroblocks, perform mode prediction on the pixels in each macroblock to obtain a predicted value for each macroblock, and take the difference between the predicted value of each macroblock and its own original pixel values to obtain residual data;

a module 2, configured to transform the residual data from the time domain to the frequency domain by discrete cosine transform, numerically quantize the frequency-domain data obtained by the transform to obtain quantized data, entropy code the quantized data to obtain a code stream for each macroblock, and splice the code streams of all macroblocks in order to obtain the code stream of the original image as the compression result.

6. The VR-oriented real-time image compression system of claim 5, wherein the mode prediction in module 1 specifically comprises: performing an inverse-quantization restoration operation on the quantized data of the frame preceding the original image to obtain a reconstructed pixel value for each macroblock of that preceding frame; collecting the reconstructed pixel values of the macroblocks of the preceding frame into a prediction set; finding, in the prediction set, the reconstructed pixel value closest in color to the current macroblock as a comparison value; and using the difference between the comparison value and the pixel values of the current macroblock as the predicted value.

7. The VR-oriented real-time image compression system of claim 6, wherein the numerical quantization in module 2 specifically comprises: dividing the coefficients of the residual data under a standard basis matrix by an integer value, and using the resulting sparse matrix as the quantized data.

8. The VR-oriented real-time image compression system of any one of claims 5-7, wherein module 1 further comprises:

a module 11, configured to, after acquiring the VR video data to be compressed, divide the VR video data into a plurality of image segments according to the content characteristics of the VR video data, and obtain a frame of VR original image to be compressed from each image segment.

9. An implementation method for the VR-oriented real-time image compression system of any one of claims 5 to 8.

10. A storage medium storing a program for executing the VR-oriented real-time image compression method of any one of claims 1 to 4.

Technical Field

The present invention relates to the field of wireless VR transmission and the field of video compression, and in particular, to a real-time image compression method, system and storage medium for VR.

Background

Two VR image compression techniques have commonly been used in the past. Both aim to reduce the data volume and facilitate network transmission, but their specific configurations differ, as follows:

1) acquire image data, down-sample each frame of the acquired image data, send the sampled data to a network sending module for transmission, and restore the data at the receiving end by interpolation for display;

2) acquire image data and compress it; during compression, divide the frames into three frame formats, I frames, P frames and B frames, and compress them with the corresponding methods, namely intra-frame compression, forward inter-frame compression and bidirectional inter-frame compression, respectively; then send the compressed data to a network sending module for transmission.

The two commonly used VR image compression techniques are explained in more detail below:

1) First, image data generated by the VR program is obtained, and each frame of image data is down-sampled, which greatly reduces the data volume. The sampled data is sent to the network sending module and transmitted to the VR receiving end, where the received data is interpolated to restore a complete image for display. This technique focuses on down-sampling of the image data: the amount of image data is greatly reduced by the down-sampling operation, which facilitates the subsequent network transmission.

2) First, image data generated by the VR program is obtained; the frames are then divided into I-frame, P-frame and B-frame data, and different compression methods are applied to each: intra-frame compression for I frames, inter-frame compression with forward prediction for P frames, and inter-frame compression with bidirectional prediction for B frames. Because of the bidirectional prediction, multiple frames of images must be buffered in advance. After the frames are compressed, the resulting code stream data is sent to the network sending module for transmission.

Both prior VR image compression techniques have problems:

1) The first prior technique reduces the transmitted data volume by down-sampling. Down-sampling does reduce the data volume, but it discards a large amount of original image information, and the interpolation used at the receiving end to restore the image introduces considerable noise, so the image quality, and with it the user experience, is greatly degraded. In contrast, the present invention divides the acquired image data according to the characteristics of VR images so that the divided image data are mutually independent and better suited to subsequent compression; it then analyzes and removes the correlation of the divided images in the two dimensions of space and time, extracts the image features, and encodes them based on information entropy.

2) The second prior technique uses B-frame compression, so multiple frames of data must be buffered in advance for bidirectional prediction and bidirectional interpolation. This introduces a large amount of delay: every buffered frame adds one frame of latency, which is unacceptable for VR applications. In contrast, the present invention abandons B-frame compression and divides VR images according to the characteristics of VR content; moreover, the whole process does not buffer any data that would affect real-time performance, and the divided blocks are mutually independent, so the real-time requirement is well met without degrading image quality.

Disclosure of Invention

The main purpose of the invention is as follows: compared with prior VR image compression techniques, the method provided by the invention divides the acquired image data, removes the correlation of the divided image data in the spatial and temporal dimensions, and encodes the divided image data in the frequency domain based on information entropy, thereby greatly improving the quality of the image restored at the receiving end while well meeting the real-time requirement.

Specifically, the invention discloses a VR-oriented real-time image compression method, comprising the following steps:

step 1, obtaining a frame of VR original image to be compressed, dividing the original image into a plurality of macroblocks, performing mode prediction on the pixels in each macroblock to obtain a predicted value for each macroblock, and taking the difference between the predicted value of each macroblock and its own original pixel values to obtain residual data;

step 2, transforming the residual data from the time domain to the frequency domain by discrete cosine transform, numerically quantizing the frequency-domain data obtained by the transform to obtain quantized data, entropy coding the quantized data to obtain a code stream for each macroblock, and splicing the code streams of all macroblocks in order to obtain the code stream of the original image as the compression result.

In the VR-oriented real-time image compression method, the mode prediction in step 1 specifically comprises: performing an inverse-quantization restoration operation on the quantized data of the frame preceding the original image to obtain a reconstructed pixel value for each macroblock of that preceding frame; collecting the reconstructed pixel values of the macroblocks of the preceding frame into a prediction set; finding, in the prediction set, the reconstructed pixel value closest in color to the current macroblock as a comparison value; and using the difference between the comparison value and the pixel values of the current macroblock as the predicted value.

In the VR-oriented real-time image compression method, the numerical quantization in step 2 specifically comprises: dividing the coefficients of the residual data under a standard basis matrix by an integer value, and using the resulting sparse matrix as the quantized data.

In the VR-oriented real-time image compression method, step 1 further comprises:

step 11, after the VR video data to be compressed is obtained, dividing the VR video data into a plurality of image segments according to the content characteristics of the VR video data, and obtaining a frame of VR original image to be compressed from each image segment.

The invention also discloses a VR-oriented real-time image compression system, comprising:

a module 1, configured to obtain a frame of VR original image to be compressed, divide the original image into a plurality of macroblocks, perform mode prediction on the pixels in each macroblock to obtain a predicted value for each macroblock, and take the difference between the predicted value of each macroblock and its own original pixel values to obtain residual data;

a module 2, configured to transform the residual data from the time domain to the frequency domain by discrete cosine transform, numerically quantize the frequency-domain data obtained by the transform to obtain quantized data, entropy code the quantized data to obtain a code stream for each macroblock, and splice the code streams of all macroblocks in order to obtain the code stream of the original image as the compression result.

In the VR-oriented real-time image compression system, the mode prediction in module 1 specifically comprises: performing an inverse-quantization restoration operation on the quantized data of the frame preceding the original image to obtain a reconstructed pixel value for each macroblock of that preceding frame; collecting the reconstructed pixel values of the macroblocks of the preceding frame into a prediction set; finding, in the prediction set, the reconstructed pixel value closest in color to the current macroblock as a comparison value; and using the difference between the comparison value and the pixel values of the current macroblock as the predicted value.

In the VR-oriented real-time image compression system, the numerical quantization in module 2 specifically comprises: dividing the coefficients of the residual data under a standard basis matrix by an integer value, and using the resulting sparse matrix as the quantized data.

In the VR-oriented real-time image compression system, module 1 further comprises:

a module 11, configured to, after acquiring the VR video data to be compressed, divide the VR video data into a plurality of image segments according to the content characteristics of the VR video data, and obtain a frame of VR original image to be compressed from each image segment.

The invention also discloses an implementation method for the VR-oriented real-time image compression system.

The invention also discloses a storage medium for storing a program for executing the VR-oriented real-time image compression method.

In VR wireless transmission applications, compared with the prior art, the invention devises an appropriate division of VR images so that the divided image content is better suited to compression. Methods that remove pixel correlation, namely mode prediction, numerical quantization and numerical reconstruction, are adopted in the compression process, and the B-frame compression mode, which is unfavorable to real-time performance, is abandoned, so the real-time requirement of VR applications is well met. In addition, the entropy-based coding method brings better compression efficiency, so the whole system can compress images in real time and send them to the VR receiving end with almost no loss of image quality.

Drawings

FIG. 1 is a schematic diagram of the process of the present invention;

FIG. 2 and FIG. 3 are schematic diagrams of the two conventional techniques.

Detailed Description

The invention realizes a VR-oriented real-time image compression method. The method mines and exploits the correlation between image pixels in two dimensions, space and time: 1) in the spatial dimension, it analyzes the color correlation of adjacent pixel regions within the same picture (I frame); 2) in the temporal dimension, it analyzes the color correlation of pixels in the same region across adjacent pictures in the image sequence (P frame). The correlation analysis comprises three parts: 1) mode prediction, 2) numerical quantization and 3) numerical reconstruction. After the correlation features of the data are extracted, the system encodes the features based on information entropy to obtain encoded data of smaller size.

Conventional video compression methods divide video frames into three types, reference frames (I frames), forward-predicted frames (P frames) and bidirectionally predicted frames (B frames), and apply different compression methods to each. B frames use bidirectionally predicted inter-frame compression, which encodes a frame from the data of the adjacent forward frame, the current frame and the backward frame; this requires that multiple frames of images be buffered in advance. For VR, the rendering pipeline generates on average one frame every 11 milliseconds, so buffering one extra frame adds 11 milliseconds of delay from motion to display. To guarantee real-time performance, the present system divides images in the temporal dimension only into reference frames (I frames) and forward-predicted frames (P frames): the coding of an I frame refers only to the correlation features of the spatial dimension (i.e., intra-frame), and the coding of a P frame may refer to the correlation features of the preceding I frame at the same spatial position (i.e., inter-frame). The whole process does not need to buffer any data that would affect real-time performance, so no delay is introduced.
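
As a rough illustration of the latency argument above, the following sketch in Python computes the extra motion-to-display delay introduced by buffering future frames for B-frame prediction; the 90 Hz refresh rate is an assumed figure consistent with the roughly 11 ms frame time mentioned above.

    # Illustrative arithmetic only; the 90 Hz refresh rate is an assumption.
    REFRESH_HZ = 90
    FRAME_TIME_MS = 1000.0 / REFRESH_HZ   # ~11.1 ms per frame

    def added_delay_ms(buffered_frames: int) -> float:
        # Each future frame held back for bidirectional prediction delays the
        # current frame by one full frame time before it can be encoded and sent.
        return buffered_frames * FRAME_TIME_MS

    print(added_delay_ms(0))   # I/P-only pipeline: no extra buffering delay
    print(added_delay_ms(2))   # holding 2 future frames for B frames: ~22.2 ms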

In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.

The technical framework of the invention comprises five parts: 1) an image dividing module based on VR content characteristics; 2) a mode prediction module; 3) a numerical transform/quantization module; 4) a numerical encoding module; and 5) a numerical reconstruction module. The overall technical flow is shown in FIG. 1.
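
To make the data flow concrete, the following is a minimal Python sketch of the division into macroblocks and the in-order splicing of per-macroblock code streams, assuming a grayscale frame stored as a numpy array, 16x16 macroblocks and frame dimensions divisible by 16; all names are illustrative choices, not terms from the patent.

    import numpy as np

    MB = 16  # assumed macroblock size in pixels

    def macroblocks(frame: np.ndarray):
        # Yield (row, col) position and the MB x MB pixel block, in raster order,
        # so that the per-macroblock code streams can later be spliced in order.
        height, width = frame.shape
        for y in range(0, height, MB):
            for x in range(0, width, MB):
                yield (y, x), frame[y:y + MB, x:x + MB]

    def splice(per_block_streams) -> bytes:
        # Concatenate the per-macroblock code streams in order into the code
        # stream of the whole frame, which is the compression result.
        out = bytearray()
        for stream in per_block_streams:
            out += stream
        return bytes(out)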

VR images have two features. 1) Image masking: part of the image is always a black region, because that region is not visible after mapping through the VR lens and therefore never participates in rendering. 2) Non-uniform gaze-point importance: the visual importance of the image is not uniform. The pixels at the image center have the highest visual importance, because they lie on the central axis of the human gaze point, undergo the greatest deformation when magnified by the lens, and thus have the highest resolution requirement. Conversely, the farther a pixel is from the image center, the lower its visual importance and the lower its resolution requirement.

After the VR image data to be compressed is obtained, the image is first divided, according to VR content characteristics such as transition time, into mutually independent image segments of smaller size. Mode prediction is then performed on each image segment; it comprises intra-frame prediction in the spatial domain (I frame) and forward inter-frame prediction in the temporal domain (P frame). During prediction the image segment is divided into a number of image blocks (the image blocks are macroblocks, for example a 16×16-pixel region; the whole image is divided into contiguous, non-overlapping macroblocks), and the pixels in each image block are predicted using the reconstructed pixel values produced by numerical reconstruction, giving a predicted value. Mode prediction means finding an already-coded macroblock whose color is closest to that of the current macroblock and representing the current macroblock by their difference. Provided the difference is small, the amount of coded data is smaller than if the original colors of the macroblock were coded directly. It follows that the mode-prediction process needs previously coded macroblocks; the numerical-reconstruction process reconstructs these historical (already coded) macroblocks and supplies them as input to mode prediction.
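
A minimal sketch of the mode-prediction step described above, assuming grayscale 16x16 macroblocks held as numpy arrays and a prediction set given as a dict mapping block positions to reconstructed blocks from the numerical-reconstruction module; the mean absolute difference used as the "closest color" criterion is an illustrative choice, not one specified here.

    import numpy as np

    def mode_predict(block: np.ndarray, prediction_set: dict):
        # Find the already-coded (reconstructed) macroblock closest in color to
        # the current one and represent the current macroblock by the difference.
        best_pos = None
        best_pred = np.zeros_like(block, dtype=np.int32)
        best_dist = np.inf
        for pos, recon in prediction_set.items():
            dist = np.abs(recon.astype(np.int32) - block.astype(np.int32)).mean()
            if dist < best_dist:
                best_pos, best_pred, best_dist = pos, recon.astype(np.int32), dist
        residual = block.astype(np.int32) - best_pred   # small-range residual data
        return best_pos, best_pred, residual

When the prediction set is empty (for example, at the very first macroblocks), this sketch falls back to a zero prediction, so the residual is simply the original block.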

Taking the difference between the predicted values and the original pixel values yields a series of residual data with a small dynamic range. The residual data is sent to the numerical transform/quantization module, which transforms it from the time domain to the frequency domain by discrete cosine transform; thanks to the good energy-compaction property of the residual data, further correlation can be removed in the frequency domain. The transformed frequency-domain data is then numerically quantized. Numerical quantization divides the coefficients of the residual (the difference between the color values predicted for the macroblock by the prediction mode and its original color values), i.e. the expansion coefficients of the macroblock's residual matrix under the standard basis matrix, by an integer value, so that most of the residual coefficients become 0 and the matrix becomes sparse and easy to compress. This process reduces precision, and the degree of reduction is selected by a quantization parameter (QP). Quantization further narrows the dynamic range of the input data, so the quantized output can be represented with only a few bits.
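
A minimal sketch of the transform and quantization described above, assuming a 16x16 integer residual block; mapping QP directly to a single uniform integer step size is an illustrative simplification.

    import numpy as np
    from scipy.fftpack import dct

    def dct2(residual: np.ndarray) -> np.ndarray:
        # Orthonormal 2-D discrete cosine transform: rows, then columns.
        return dct(dct(residual.astype(np.float64), axis=0, norm='ortho'),
                   axis=1, norm='ortho')

    def quantize(coeffs: np.ndarray, qp: int) -> np.ndarray:
        # Divide every frequency-domain coefficient by an integer step so that
        # most small coefficients round to 0, leaving a sparse integer matrix.
        step = max(1, int(qp))          # assumed QP-to-step mapping
        return np.round(coeffs / step).astype(np.int32)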

The quantized output data is sent directly to the numerical encoding module for entropy-based coding, i.e. the data is encoded into a sequence of binary bits. Entropy coding yields a higher compression ratio, and the entire coding process is lossless and reversible: the coded binary bits can be decoded back into the original pre-coding data without loss, which is very important for the compressed image quality. The sequence of binary bits obtained after numerical encoding is called a code stream; sending the code stream data to the network transmission module starts the network transmission. The code stream may be handled per macroblock: either the complete code stream of the original image, obtained after splicing, is sent to the network transmission module, or the code stream of each macroblock is sent directly to the network transmission module and spliced later by the display end.
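
The specific entropy code is not named above. As one illustration of a lossless, reversible bit-level code of the kind described, the Python sketch below encodes quantized coefficients with signed Exponential-Golomb codes (as used, for example, in H.264) and decodes them back exactly.

    def ue(n: int) -> str:
        # Unsigned Exp-Golomb: (len-1) leading zeros followed by binary of n+1.
        b = bin(n + 1)[2:]
        return "0" * (len(b) - 1) + b

    def se(v: int) -> str:
        # Signed mapping: v>0 -> 2v-1, v<=0 -> -2v, then unsigned Exp-Golomb.
        return ue(2 * v - 1 if v > 0 else -2 * v)

    def entropy_encode(values) -> str:
        return "".join(se(int(v)) for v in values)

    def entropy_decode(bits: str, count: int):
        values, i = [], 0
        for _ in range(count):
            zeros = 0
            while bits[i] == "0":
                zeros += 1
                i += 1
            n = int(bits[i:i + zeros + 1], 2) - 1
            i += zeros + 1
            values.append((n + 1) // 2 if n % 2 == 1 else -((n + 1) // 2))
        return values

    coeffs = [5, -2, 0, 0, 1, 0, -1]
    bits = entropy_encode(coeffs)
    assert entropy_decode(bits, len(coeffs)) == coeffs   # lossless and reversible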

During encoding, the quantized data is also sent to the numerical reconstruction module, which applies restoration operations such as inverse quantization and inverse transform to its input to obtain reconstructed pixel values, and feeds them to the mode-prediction module as prediction references. A reconstructed value belongs to an earlier macroblock that is a neighbor of the current macroblock; mode prediction selects the neighbor closest in color to the current macroblock. Because encoding is a streaming process, the historical state, i.e. the original colors of old macroblocks, is not kept, so it must be reconstructed; this is a trade of time for space.
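
A minimal sketch of the numerical-reconstruction step, mirroring the transform/quantization sketch above (same assumptions: 16x16 blocks, a single integer step derived from QP, 8-bit pixel values).

    import numpy as np
    from scipy.fftpack import idct

    def idct2(coeffs: np.ndarray) -> np.ndarray:
        # Inverse orthonormal 2-D DCT: columns, then rows.
        return idct(idct(coeffs, axis=1, norm='ortho'), axis=0, norm='ortho')

    def reconstruct(quantized: np.ndarray, qp: int, prediction: np.ndarray) -> np.ndarray:
        # Inverse quantization, inverse transform, then add back the prediction,
        # giving the reconstructed pixel values used as future prediction references.
        step = max(1, int(qp))
        residual = idct2(quantized.astype(np.float64) * step)
        recon = prediction.astype(np.float64) + residual
        return np.clip(np.round(recon), 0, 255).astype(np.uint8)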

Regarding real-time performance, the invention uses only the I-frame and P-frame compression modes in the compression process and does not use the B-frame mode, thereby avoiding the buffering of multiple frames required for bidirectional B-frame prediction; every frame of data generated by the VR program can be compressed in time, with no impact on real-time performance. The front end of the whole compression method connects to the image capture system and the back end connects to the network sending system, so the image data can be sent at a low bit rate to the subsequent decoding and VR display end while meeting the VR real-time requirement; the whole process has almost no effect on image quality, and the user experience is good.

The following are system embodiments corresponding to the above method embodiments; this embodiment can be implemented in cooperation with the above embodiments. The technical details mentioned in the above embodiments remain valid in this embodiment and, to reduce repetition, are not described again here; conversely, the technical details mentioned in this embodiment can also be applied to the above embodiments.

The invention also discloses a VR-oriented real-time image compression system, comprising:

a module 1, configured to obtain a frame of VR original image to be compressed, divide the original image into a plurality of macroblocks, perform mode prediction on the pixels in each macroblock to obtain a predicted value for each macroblock, and take the difference between the predicted value of each macroblock and its own original pixel values to obtain residual data;

a module 2, configured to transform the residual data from the time domain to the frequency domain by discrete cosine transform, numerically quantize the frequency-domain data obtained by the transform to obtain quantized data, entropy code the quantized data to obtain a code stream for each macroblock, and splice the code streams of all macroblocks in order to obtain the code stream of the original image as the compression result.

In the VR-oriented real-time image compression system, the mode prediction in module 1 specifically comprises: performing an inverse-quantization restoration operation on the quantized data of the frame preceding the original image to obtain a reconstructed pixel value for each macroblock of that preceding frame; collecting the reconstructed pixel values of the macroblocks of the preceding frame into a prediction set; finding, in the prediction set, the reconstructed pixel value closest in color to the current macroblock as a comparison value; and using the difference between the comparison value and the pixel values of the current macroblock as the predicted value.

In the VR-oriented real-time image compression system, the numerical quantization in module 2 specifically comprises: dividing the coefficients of the residual data under a standard basis matrix by an integer value, and using the resulting sparse matrix as the quantized data.

In the VR-oriented real-time image compression system, module 1 further comprises:

a module 11, configured to, after acquiring the VR video data to be compressed, divide the VR video data into a plurality of image segments according to the content characteristics of the VR video data, and obtain a frame of VR original image to be compressed from each image segment.

The invention also discloses an implementation method for the VR-oriented real-time image compression system.

The invention also discloses a storage medium for storing a program for executing the VR-oriented real-time image compression method.
