Image depth value acquisition method, device, equipment, coder-decoder and storage medium

Document No.: 1538294 | Published: 2020-02-14

Summary: This technology, Image depth value acquisition method, device, equipment, coder-decoder and storage medium (图像深度值获取方法、装置、设备、编解码器及存储介质), was designed by Song Jianjun, Zeng Xing, Wang Ning, and Hu Xiangbin (宋剑军, 曾幸, 王宁, 胡祥斌) on 2018-08-01. Its main content is as follows. Embodiments of the invention provide an image depth value obtaining method, apparatus, device, encoder, and storage medium. For each known pixel point in a unit to be predicted in a current viewpoint image frame, the method obtains characteristic information of the known pixel point and first characteristic information of each of its first reference pixel points in that image frame, and calculates, based on the first characteristic information and the characteristic information of the known pixel point, a weight value of the relevance between each first reference pixel point and the known pixel point. According to the weight values of the relevance, a first target reference pixel point whose relevance to the known pixel point meets a first preset condition is selected from the first reference pixel points, and the depth value of the first target reference pixel point is taken as the final depth value of the known pixel point. A known pixel point is a pixel point in the unit to be predicted in the current viewpoint image frame that, according to the parallax between viewpoints, can be projected and matched by at least one pixel point in a preset reference image area in another viewpoint image frame. This avoids introducing new depth values.

1. An image depth value obtaining method, wherein, for each known pixel point in a unit to be predicted in a current viewpoint image frame, the process of obtaining the depth value comprises the following steps:

acquiring first characteristic information of each first reference pixel point of the known pixel point in the image frame and characteristic information of the known pixel point;

calculating a weight value of the relevance between each first reference pixel point and the known pixel point based on the first characteristic information and the characteristic information of the known pixel point;

according to the weight value of the relevance, selecting a first target reference pixel point, of which the relevance with the known pixel point meets a first preset condition, from the first reference pixel points, and taking the depth value of the first target reference pixel point as the final depth value of the known pixel point;

wherein the known pixel points are pixel points, in the unit to be predicted in the current viewpoint image frame, that can be projected and matched by at least one pixel point in a preset reference image area in another viewpoint image frame according to the parallax between the current viewpoint and the other viewpoint.

2. The method according to claim 1, wherein the first feature information includes texture information of the first reference pixel and coordinate position information in the viewpoint image frame to which the first reference pixel belongs, and the feature information of the known pixel includes texture information of the known pixel and coordinate position information in the viewpoint image frame to which the known pixel belongs.

3. The method as claimed in claim 2, wherein the weight value of the relevance between the first reference pixel point and the known pixel point comprises a first weight value, and the calculating the weight value of the relevance between each first reference pixel point and the known pixel point based on the first characteristic information and the characteristic information of the known pixel point comprises:

calculating the correlation of texture information between the first reference pixel point and the known pixel point, calculating the correlation of coordinate position information of a viewpoint image frame between the first reference pixel point and the known pixel point, and obtaining the first weight value of the correlation of the first reference pixel point and the known pixel point according to the correlation of the texture information and the correlation of the coordinate position information of the viewpoint image frame.

4. The method of claim 3, wherein the selecting, according to the weight value of the relevance, a first target reference pixel point whose relevance to the known pixel point meets a first preset condition from the first reference pixel points comprises:

selecting a first reference pixel point with the maximum first weight value of the relevance between the first reference pixel point and the known pixel point as a first target reference pixel point;

or, according to a first weighted value of the relevance between each first reference pixel and the known pixel, calculating a weighted average value of the depth values of the first reference pixels, calculating an absolute value of a difference value between the depth value of each first reference pixel and the weighted average value, and taking the first reference pixel corresponding to the depth value with the minimum absolute value as the first target reference pixel.

5. The method as claimed in claim 2, wherein the first characteristic information further includes a depth value of the first reference pixel, and the characteristic information of the known pixel further includes an initial depth value of the known pixel.

6. The image depth value acquisition method according to claim 4, wherein the acquisition process of the initial depth value includes:

taking pixel points with coded depth values in the preset reference image area as second reference pixel points;

acquiring second characteristic information of each second reference pixel point;

and calculating a second weight value of the relevance between each second reference pixel point and the known pixel point by using the second characteristic information, selecting a second target reference pixel point of which the relevance with the known pixel point meets a second preset condition from each second reference pixel point, and taking the depth value of the second target reference pixel point as the initial depth value of the known pixel point.

7. The image depth value acquisition method according to claim 6, wherein the second preset condition includes:

selecting a second reference pixel point with the maximum second weight value of the relevance between the second reference pixel point and the known pixel point as a second target reference pixel point;

or, according to a second weight value of the relevance between each second reference pixel and the known pixel, calculating a weighted average value of the depth values of the second reference pixels, calculating an absolute value of a difference value between the depth value of each second reference pixel and the weighted average value, and taking the second reference pixel corresponding to the depth value with the minimum absolute value as the second target reference pixel.

8. The image depth value obtaining method according to claim 6, wherein the second characteristic information includes texture information of the second reference pixel point and position information of the viewpoint to which the second reference pixel point belongs; the calculating a second weight value of the relevance between each second reference pixel point and the known pixel point by using the second characteristic information includes:

and calculating the correlation of the texture information between the second reference pixel points and the known pixel points and the correlation of the viewpoint position information between the second reference pixel points and the known pixel points, and obtaining a second weight value of the correlation of the second reference pixel points and the known pixel points according to the correlation of the texture information and the correlation of the viewpoint position information.

9. The image depth value acquisition method according to claim 1, wherein the preset reference image area is an image area corresponding to the coordinate position of the unit to be predicted in the other viewpoint image frame; or, the preset reference image area comprises an image area corresponding to the coordinate position of the unit to be predicted in the other viewpoint image frames and an extended image area within a preset adjacent range of the image area.

10. The method according to any one of claims 5 to 9, wherein the weighted value of the association between the first reference pixel point and the known pixel point includes a third weighted value, and the calculating the weighted value of the association between each of the first reference pixel point and the known pixel point based on the first characteristic information and the characteristic information of the known pixel point includes:

calculating the correlation of the depth value between the first reference pixel point and the known pixel point, calculating the correlation of texture information between the first reference pixel point and the known pixel point, calculating the correlation of coordinate position information of the viewpoint image frame between the first reference pixel point and the known pixel point, and obtaining the third weight value of the correlation of the first reference pixel point and the known pixel point according to the correlation of the depth value, the correlation of the texture information and the correlation of the coordinate position information of the viewpoint image frame.

11. The method for obtaining the depth value of an image according to claim 10, wherein a third weight value of the relevance between the first reference pixel point and the known pixel point is calculated by the following formula:

[formula missing in source]

where $q_s$ represents the set of the first reference pixel points, $q_{w,2}$ represents the known pixel point, [...] is a kernel function representing the similarity of the depth values of the pixel points, [...]

12. The image depth value acquisition method according to claim 10, wherein the first preset condition includes:

selecting a first reference pixel point with the maximum third weighted value of the relevance between the first reference pixel point and the known pixel point as a first target reference pixel point;

or, according to a third weighted value of the relevance between each first reference pixel and the known pixel, calculating a weighted average value of the depth values of the first reference pixels, calculating an absolute value of a difference value between the depth value of each first reference pixel and the weighted average value, and taking the first reference pixel corresponding to the depth value with the minimum absolute value as the first target reference pixel.

13. The image depth value acquisition method according to claim 10, further comprising:

in the current viewpoint image frame, determining a pixel point that cannot be projected and matched by any pixel point in the preset reference image area in the other viewpoint image frames, according to the parallax between the current viewpoint and the other viewpoints, and taking that pixel point as an unknown pixel point;

acquiring third characteristic information of third reference pixel points, calculating a fourth weighted value of the relevance between each third reference pixel point and the unknown pixel point, selecting a third target reference pixel point of which the relevance between the third reference pixel point and the unknown pixel point meets a third preset condition from each third reference pixel point, and taking the depth value of the third target reference pixel point as the depth value of the unknown pixel point; the third reference pixel points comprise known pixel points and/or the first reference pixel points of the current viewpoint image frame, wherein the depth values of the known pixel points are coded.

14. The image depth value acquisition method according to claim 13, wherein the third preset condition includes:

selecting a third reference pixel point with the highest fourth weighted value of the relevance between the third reference pixel point and the unknown pixel point as a third target reference pixel point;

or, according to a fourth weighted value of the relevance between each third reference pixel and the unknown pixel, calculating a weighted average value of the depth values of the third reference pixels, calculating an absolute value of a difference value between the depth value of each third reference pixel and the weighted average value, and taking the third reference pixel corresponding to the depth value with the minimum absolute value as the third target reference pixel.

15. An image depth value acquiring apparatus comprising:

an obtaining unit, configured to obtain, for each known pixel point in a unit to be predicted in a current viewpoint image frame, first characteristic information of each first reference pixel point of the known pixel point in the image frame and characteristic information of the known pixel point;

a calculating unit, configured to calculate a weight value of the relevance between each first reference pixel and the known pixel based on the first feature information and the feature information of the known pixel;

a selecting unit, configured to select, according to the weighted value of the relevance, a first target reference pixel having a relevance to the known pixel that meets a first preset condition from the first reference pixels, and use a depth value of the first target reference pixel as a final depth value of the known pixel;

16. An image depth value acquiring apparatus comprising:

the information acquisition unit is used for acquiring first characteristic information of each first reference pixel point of a known pixel point in an image frame and characteristic information of the known pixel point aiming at each known pixel point in a unit to be predicted in the current viewpoint image frame;

the processing unit is used for calculating the weight value of the relevance between each first reference pixel point and the known pixel point based on the first characteristic information and the characteristic information of the known pixel point; selecting a first target reference pixel point with the relevance meeting a first preset condition with the known pixel point from the first reference pixel points according to the weighted value of the relevance, and taking the depth value of the first target reference pixel point as the final depth value of the known pixel point;

17. The image depth value obtaining apparatus according to claim 16, wherein the first feature information includes a depth value of the first reference pixel point, texture information, and coordinate position information within a viewpoint image frame to which it belongs; the characteristic information of the known pixel point comprises an initial depth value, texture information and coordinate position information in a viewpoint image frame to which the known pixel point belongs; the processing unit is further configured to calculate a correlation between depth values of the first reference pixel and the known pixel, calculate a correlation between texture information of the first reference pixel and the known pixel, calculate a correlation between coordinate position information of the viewpoint image frame between the first reference pixel and the known pixel, and obtain the third weight value of the correlation between the first reference pixel and the known pixel according to the correlation between the depth values, the correlation between the texture information, and the correlation between the coordinate position information of the viewpoint image frame.

18. The image depth value acquiring device according to claim 17, wherein the processing unit is further configured to select, as the first target reference pixel, a first reference pixel having a maximum third weight value of the relevance between the first reference pixel and the known pixel;

or, according to a third weighted value of the relevance between each first reference pixel and the known pixel, calculating a weighted average value of the depth values of the first reference pixels, calculating an absolute value of a difference value between the depth value of each first reference pixel and the weighted average value, and taking the first reference pixel corresponding to the depth value with the minimum absolute value as the first target reference pixel.

19. A codec, comprising a processor, a memory, and a communication bus;

the communication bus is used for realizing connection communication between the processor and the memory;

the processor is configured to execute one or more programs stored in the memory to implement the steps of the image depth value obtaining method as claimed in any one of claims 1 to 14.

20. A storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the image depth value obtaining method according to any one of claims 1 to 14.
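The weighting and selection steps recited in claims 3, 4, 10 and 12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the text does not give the form of the correlation measures (the formula of claim 11 is not reproduced here), so the Gaussian kernels, the product used to combine them, the sigma parameters, and all names (`Pixel`, `gaussian`, `relevance_weight`, `select_by_max_weight`, `select_by_weighted_mean`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Pixel:
    x: int          # column within the viewpoint image frame
    y: int          # row within the viewpoint image frame
    texture: float  # texture value, e.g. luma
    depth: float    # depth value (initial depth value for a known pixel point)


def gaussian(diff_sq: float, sigma: float) -> float:
    # Assumed kernel: similarity decays as the squared difference grows.
    return math.exp(-diff_sq / (2.0 * sigma * sigma))


def relevance_weight(ref: Pixel, known: Pixel, sigma_pos: float = 3.0,
                     sigma_tex: float = 10.0,
                     sigma_depth: Optional[float] = 8.0) -> float:
    # Position and texture terms play the role of the "first weight value".
    w = gaussian((ref.x - known.x) ** 2 + (ref.y - known.y) ** 2, sigma_pos)
    w *= gaussian((ref.texture - known.texture) ** 2, sigma_tex)
    # Adding the depth term plays the role of the "third weight value".
    if sigma_depth is not None:
        w *= gaussian((ref.depth - known.depth) ** 2, sigma_depth)
    return w


def select_by_max_weight(refs: Sequence[Pixel],
                         weights: Sequence[float]) -> Pixel:
    # Option 1: the reference pixel point with the maximum weight value.
    best = max(range(len(refs)), key=lambda i: weights[i])
    return refs[best]


def select_by_weighted_mean(refs: Sequence[Pixel],
                            weights: Sequence[float]) -> Pixel:
    # Option 2: the reference pixel point whose depth value is closest to
    # the weighted average of all reference depth values.
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights are zero")
    mean = sum(w * r.depth for w, r in zip(weights, refs)) / total
    return min(refs, key=lambda r: abs(r.depth - mean))


known = Pixel(x=5, y=5, texture=100.0, depth=60.0)
refs = [
    Pixel(4, 5, 101.0, 62.0),   # similar texture and depth
    Pixel(6, 5, 180.0, 120.0),  # across an edge: very different values
    Pixel(5, 7, 99.0, 61.0),    # similar, but farther away
]
weights = [relevance_weight(r, known) for r in refs]
final_by_max = select_by_max_weight(refs, weights).depth
final_by_mean = select_by_weighted_mean(refs, weights).depth
```

Both rules return the depth value of an existing reference pixel point, so the final depth value is always one that already occurs among the references. With the hypothetical values above, both rules select the first reference pixel point.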

Technical Field

The present invention relates to the field of multi-view video encoding and decoding, and in particular, to a method, an apparatus, a device, a codec, and a storage medium for obtaining a depth value of an image.

Background

Multi-view video is a set of video information obtained by a camera array shooting the same scene from different angles. Compared with single-view information, it captures three-dimensional information about the objects in the scene and can therefore reproduce a stereoscopic scene more vividly. Among existing multi-view coding schemes, those based on view synthesis make full use of the correlation within multi-view video data and of the implicit three-dimensional information about the scene. They offer an efficient, flexible approach with strong view scalability, and have attracted increasing attention from researchers.

The multi-view video + depth map coding and decoding format contains a large amount of data with a large amount of redundant information among the data. According to the kind of correlation that can be exploited, this redundancy can be divided into spatial correlation, temporal correlation, depth-texture correlation, and inter-viewpoint correlation.

Spatial correlation is the correlation between adjacent pixels in the same frame of the same viewpoint. Based on how close two pixel points are in the image plane, the correlation between them is considered to decrease as their Euclidean distance increases.

Temporal correlation is the correlation between pixels at the same position in different frames of the same viewpoint. Based on the time difference between two pixel points, the correlation between them is considered to decrease as the time difference increases.

Depth-texture correlation is the correlation between the depth information and the texture information at corresponding positions of the depth map and the texture map of the same viewpoint at the same time. Based on how similar the depth values of two pixel points are, the correlation between the texture information of the corresponding texture-map pixels is considered to decrease as the difference between their depth values increases.

Inter-viewpoint correlation is the correlation between the depth information and texture information of different viewpoints at the same time. Based on the difference between the viewpoints of two pixel points at the same time, the correlation between the pixel points is considered to decrease as the viewpoint difference increases.
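The four correlations above share one pattern: correlation is taken to fall as some distance grows. The following sketch illustrates only that pattern. The exponential form, the scale values, the sample distances, and the function name `correlation` are illustrative assumptions and do not come from the source.

```python
import math


def correlation(distance: float, scale: float) -> float:
    # Assumed model: a value in (0, 1] that decreases monotonically
    # as the distance between two pixel points increases.
    return math.exp(-abs(distance) / scale)


# Hypothetical distances between two pixel points for each correlation type.
spatial = correlation(math.hypot(3, 4), scale=5.0)  # 5 pixels apart in the frame
temporal = correlation(2, scale=4.0)                # 2 frames apart
depth_texture = correlation(12, scale=16.0)         # depth values differ by 12
inter_view = correlation(1, scale=2.0)              # adjacent viewpoints
```

Any monotonically decreasing function would express the same idea; the exponential is used here only because it is simple and bounded.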

A single viewpoint performs inter prediction using motion vectors (MVs), which are obtained by the merge technique and the Advanced Motion Vector Prediction (AMVP) technique. In addition to MVs, inter prediction in the multi-view video + depth map coding and decoding algorithm may use disparity vectors (DVs), which are obtained by the Neighboring Block Disparity Vector (NBDV) technique and the Depth-oriented Neighboring Block based Disparity Vector (DoNBDV) technique.

At present, depth map coding generally adopts a multi-view coding framework, which divides the viewpoints into a base viewpoint and several non-base viewpoints. The texture map and depth map of the base viewpoint are coded independently using a standard single-viewpoint coding method, while the texture maps and depth maps of the non-base viewpoints depend on base viewpoint information, so that inter-viewpoint correlation is better exploited and coding efficiency is greatly improved. In actual encoding, however, the depth information of an object in the scene of the current viewpoint is not necessarily consistent with the depth information of the same object in other viewpoints, so inter-viewpoint or inter-frame prediction of the depth map is likely to introduce new depth values, which significantly reduces coding performance.

Disclosure of Invention

Embodiments of the invention provide an image depth value obtaining method, apparatus, device, codec, and storage medium, which mainly address the following technical problem: existing depth value obtaining schemes introduce new depth values for the current viewpoint, which reduces coding performance.

In order to solve the foregoing technical problem, an embodiment of the present invention provides an image depth value obtaining method, where, for each known pixel point in a unit to be predicted in a current viewpoint image frame, a process of obtaining a depth value includes:

acquiring first characteristic information of each first reference pixel point of the known pixel point in the image frame and characteristic information of the known pixel point;

An embodiment of the present invention further provides an apparatus for obtaining an image depth value, including:

An embodiment of the present invention further provides an image depth value obtaining apparatus, including:

The embodiment of the invention also provides a coder-decoder, which comprises a processor, a memory and a communication bus;

the communication bus is used for realizing connection communication between the processor and the memory;

the processor is configured to execute one or more programs stored in the memory to implement the steps of the image depth value obtaining method as described in any one of the above.

Embodiments of the present invention also provide a storage medium storing one or more programs, which are executable by one or more processors to implement the steps of the image depth value obtaining method as described above.

The invention has the beneficial effects that:

According to the method, apparatus, device, codec, and storage medium for obtaining an image depth value provided by the embodiments of the invention, the process of obtaining the depth value for each known pixel point in a unit to be predicted in a current viewpoint image frame comprises the following steps: acquiring first characteristic information of each first reference pixel point of the known pixel point in the image frame and characteristic information of the known pixel point; calculating a weight value of the relevance between each first reference pixel point and the known pixel point based on the first characteristic information and the characteristic information of the known pixel point; and selecting, according to the weight value of the relevance, a first target reference pixel point whose relevance to the known pixel point meets a first preset condition from the first reference pixel points, and taking the depth value of the first target reference pixel point as the final depth value of the known pixel point. The known pixel points are pixel points, in the unit to be predicted in the current viewpoint image frame, that can be projected and matched by at least one pixel point in a preset reference image area in other viewpoint image frames according to the parallax between the current viewpoint and the other viewpoints. Because the final depth value of a known pixel point is selected from the depth values of the first reference pixel points in its own image frame, according to the weight value of the relevance between the known pixel point and each first reference pixel point, the introduction of a new depth value can be avoided and coding performance can be improved.
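The definition of a known pixel point can be illustrated with a small sketch. It assumes rectified views with purely horizontal, integer disparity, so that a reference pixel point at (x, y) projects to (x + disparity, y) in the current viewpoint image frame; the source does not specify the projection, so this convention and the names `classify_unit_pixels`, `unit` and `ref_pixels` are hypothetical.

```python
from typing import Iterable, Set, Tuple

Position = Tuple[int, int]


def classify_unit_pixels(unit: Tuple[int, int, int, int],
                         ref_pixels: Iterable[Tuple[int, int, int]]
                         ) -> Tuple[Set[Position], Set[Position]]:
    """Split the unit to be predicted into known and unknown positions.

    unit: (x0, y0, width, height) in the current viewpoint image frame.
    ref_pixels: (x, y, disparity) for each pixel point in the preset
        reference image area of another viewpoint image frame.
    Assumption: a reference pixel at (x, y) projects to (x + disparity, y).
    """
    x0, y0, width, height = unit
    unit_positions = {(x, y)
                      for y in range(y0, y0 + height)
                      for x in range(x0, x0 + width)}
    # Positions in the current frame hit by at least one projected pixel.
    projected = {(x + d, y) for x, y, d in ref_pixels}
    known = unit_positions & projected
    unknown = unit_positions - known
    return known, unknown


known, unknown = classify_unit_pixels(
    unit=(0, 0, 4, 1),
    ref_pixels=[(0, 0, 1), (1, 0, 1), (2, 0, 0), (5, 0, -2)],
)
```

In this hypothetical 4x1 unit, three positions are reached by at least one projected reference pixel point and are therefore known, while position (0, 0) is reached by none and is treated as unknown.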

Additional features and corresponding advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

FIG. 1 is a flowchart illustrating a method for obtaining depth values of an image according to a first embodiment of the present invention;

FIG. 2 is a flowchart illustrating a method for obtaining depth values of an image according to a second embodiment of the present invention;

FIG. 3 is a schematic diagram of a current viewpoint image frame and other viewpoint image frames according to a second embodiment of the present invention;

FIG. 4 is a flowchart illustrating a detailed process of an image depth value obtaining method according to a second embodiment of the present invention;

FIG. 5 is a diagram illustrating a known pixel and a second reference pixel according to a second embodiment of the present invention;

FIG. 6 is a diagram illustrating an unknown pixel according to a second embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an image depth value obtaining apparatus according to a third embodiment of the present invention;

FIG. 8 is a schematic structural diagram of an image depth value obtaining apparatus according to a third embodiment of the present invention;

FIG. 9 is a third schematic view illustrating an image depth value obtaining apparatus according to a third embodiment of the present invention;

FIG. 10 is a schematic structural diagram of an image depth value obtaining apparatus according to a fourth embodiment of the present invention;

FIG. 11 is a schematic structural diagram of a codec according to a fifth embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
