Improved light field depth estimation algorithm based on energy enhanced defocusing response

Document No.: 1467022    Publication date: 2020-02-21

Reading note: this technique, "Improved light field depth estimation algorithm based on energy enhanced defocusing response" (基于能量增强散焦响应的改进光场深度估计算法), was designed and created by 武迎春, 程星, 张娟, 李素月, 宁爱平 and 王安红 on 2019-11-06. Its main content is as follows. The invention belongs to the field of light field image processing and depth estimation, and in particular discloses an improved light field depth estimation algorithm based on an energy-enhanced defocus response. The Laplacian operator used by the existing DCDC algorithm to establish its defocus response function takes second derivatives in only a limited set of directions, and the summed second derivatives cancel one another's energy; the invention therefore improves the defocus response function and proposes a light field depth estimation algorithm based on an energy-enhanced defocus response. The algorithm fully considers the influence of surrounding pixels on the defocus degree at the current position, achieving energy enhancement by increasing the number and directions of second derivatives with differing weights. Experiments verify the effectiveness of the method: for light field images with complex depth information, the depth map obtained by the algorithm has a better visual effect, and compared with the algorithm combining defocus and correlation evaluation, the mean square error of the depth maps obtained by the improved algorithm decreases by 3.95% on average.

1. An improved light field depth estimation algorithm based on an energy-enhanced defocus response, characterized in that it comprises the following steps:

1) After microlens-centre calibration of the light field camera, the 4D light field L0(x0, y0, s0, t0) is decoded from the light field raw image. According to the light field biplane representation model and digital refocusing theory, when the imaging-plane position (image distance) of the light field camera changes, the coordinates of the light field L_αn(x, y, s, t) recorded by the camera at the new imaging depth (object distance) are related to those of the original light field L0(x0, y0, s0, t0) by

L_αn(x, y, s, t) = L0(s(1 - 1/αn) + x/αn, t(1 - 1/αn) + y/αn, s, t)    (1)

where αn is the proportionality coefficient of the image-plane displacement and n = 0, 1, 2, ..., 255 is the index corresponding to each αn value. Double integration of L_αn(x, y, s, t) along the s and t directions yields the refocused image corresponding to each αn value:

Ī_αn(x, y) = (1/(Ns·Nt)) ∬ L_αn(x, y, s, t) ds dt    (2)

where Ns, Nt denote the angular resolution of the 4D light field;

2) The defocus response D̄_αn(x, y) of the 4D light fields at different focus depths achieves energy enhancement by increasing the number and directions of second derivatives with differing weights; the defocus response function is

D̄_αn(x, y) = (1/|Ω'_D|) Σ_{(x', y') ∈ Ω'_D} |Δ_w Ī_αn(x', y')|    (3)

where Ω'_D is the size of the window around the current pixel and Δ_w is the energy-enhanced second derivative operator:

Δ_w Ī_αn(x, y) = [equation (4); given as an image in the original document]

where s and t denote the second derivative step lengths in the horizontal and vertical directions, ω denotes a weight factor that is larger for points closer to the centre point (contributing more to the operator value) and smaller for points farther from it, S and T can only take odd values, and S × T is the size of the window formed by the convolution of the refocused image with the second derivative operator;

3) The correlation response σ_αn(x, y) of the 4D light fields at different focus depths is obtained by computing the angular variance of the 4D light field:

σ_αn(x, y) = (1/(Ns·Nt)) Σ_s Σ_t (L_αn(x, y, s, t) - Ī_αn(x, y))²    (5)

4) The index value n of the αn that maximises the defocus response D̄_αn(x, y) forms the defocus depth map D1(x, y), and the index value n of the αn that minimises the correlation response σ_αn(x, y) forms the correlation depth map D2(x, y):

D1(x, y) = argmax_n D̄_αn(x, y)    (6)

D2(x, y) = argmin_n σ_αn(x, y)    (7)

5) Fusion of the defocus depth map and the correlation depth map is realised using Markov optimisation theory. First, the confidences corresponding to the two depth maps are computed:

C1(x,y) = D1(x,y)/D1′(x,y)    (8)

C2(x,y) = D2(x,y)/D2′(x,y)    (9)

where D1′(x, y) denotes the depth map obtained from equation (6) when the defocus response takes its second-largest value, and D2′(x, y) denotes the depth map obtained from equation (7) when the correlation response takes its second-smallest value; the final fused depth map D(x, y) is obtained by solving the optimisation problem

D(x, y) = [equation (10); given as an image in the original document]

Technical Field

The invention belongs to the field of light field image processing and depth estimation, and particularly relates to an improved light field depth estimation algorithm based on energy enhanced defocusing response.

Background

With the continuous development of light field rendering theory and the continuous evolution of the plenoptic function, light field imaging has become a hot topic in modern computational photography. Unlike the structural design of a traditional camera, the microlens light field camera acquires 4-dimensional (4D) light field data by inserting a microlens array between the main lens and the imaging plane, based on the light field biplane representation model. The 4D light field records not only the position information (x, y) of spatial rays, as in conventional imaging, but also their direction information (s, t). Because this multi-dimensional spatial light information is captured in a single exposure, light field images have wide application value in later processing, such as digital refocusing, multi-viewpoint image extraction and all-in-focus image fusion. Depth estimation algorithms developed on this basis have the advantage that a depth map can be estimated from a single light field image, and in recent years they have attracted broad attention from scholars in the field of optical three-dimensional sensing.

Currently, depth estimation algorithms based on light field images fall mainly into three categories: stereo matching, EPI (epipolar-plane image) methods, and depth from defocus (DFD). Stereo matching methods are mainly based on disparity calculation between light field sub-aperture images, completing depth map estimation through the correspondence between disparity and depth information; the depth acquisition method based on sub-pixel phase-shift computation between sub-aperture images proposed by Jeon et al. follows this principle. Owing to the structural constraints of the microlens light field camera, the depth acquisition accuracy of such algorithms is limited by the small number of viewpoints and the limited resolution of the decoded sub-aperture images.

EPI algorithms stack multiple sub-aperture images with parallax in a single direction into a cube and slice the cube along a given direction to form an EPI section, completing depth estimation from the proportional relation between the slope of the epipolar lines in the EPI and the depth of the target scene. For example, Li et al. obtain smooth depth maps through EPI structure-tensor computation and a sparse linear solver, and Zhang et al. obtain the EPI slope with a spinning parallelogram operator, achieving better depth reconstruction in discontinuous regions. These algorithms must accurately estimate the epipolar-line slope at every pixel of the EPI image, and local imperfections of image features are the main reason for their high complexity and poor real-time performance.

DFD completes depth estimation by comparing the defocus degree across multiple images of the same target scene focused at different depths; the regularised depth estimation based on non-local means filtering proposed by Favaro follows this principle.

For the characteristics of light field images, after fully analysing the advantages and disadvantages of the various depth acquisition algorithms, Tao et al. proposed a depth acquisition algorithm combining defocus and correlation cues (DCDC). The algorithm fully exploits the strong local noise resistance of the defocus depth map and the accurate global boundaries of the correlation depth map, and completes the optimisation of the final depth map using Markov optimisation theory. However, when depth is obtained through defocus evaluation, the Laplacian operator used considers only the second derivatives of the spatial information in the horizontal and vertical directions, and the energies of these second derivatives cancel each other when summed, which degrades the depth acquisition accuracy of the algorithm in complex shooting scenes. On this basis, the invention improves the defocus evaluation function: an energy-enhanced defocus evaluation method raises the quality of the depth map obtained by the defocus method and thereby improves the acquisition accuracy of the final depth map.

Disclosure of Invention

The invention aims to overcome the defects of the prior art, solving the problems that, when the existing DCDC algorithm establishes its defocus response function, second derivatives are taken in only a limited set of directions and the summed second derivatives cancel one another's energy, and providing an improved light field depth estimation algorithm based on an energy-enhanced defocus response.

In order to solve the technical problems, the technical scheme protected by the invention is as follows: the improved light field depth estimation algorithm based on the energy enhanced defocusing response is carried out according to the following steps:

1) After microlens-centre calibration of the light field camera, the 4D light field L0(x0, y0, s0, t0) is decoded from the light field raw image. According to the light field biplane representation model and digital refocusing theory, when the imaging-plane position (image distance) of the light field camera changes, the coordinates of the light field L_αn(x, y, s, t) recorded by the camera at the new imaging depth (object distance) are related to those of the original light field L0(x0, y0, s0, t0) by

L_αn(x, y, s, t) = L0(s(1 - 1/αn) + x/αn, t(1 - 1/αn) + y/αn, s, t)    (1)

where αn is the proportionality coefficient of the image-plane displacement and n = 0, 1, 2, ..., 255 is the index corresponding to each αn value. Double integration of L_αn(x, y, s, t) along the s and t directions yields the refocused image corresponding to each αn value:

Ī_αn(x, y) = (1/(Ns·Nt)) ∬ L_αn(x, y, s, t) ds dt    (2)

where Ns, Nt denote the angular resolution of the 4D light field;

2) The defocus response D̄_αn(x, y) of the 4D light fields at different focus depths achieves energy enhancement by increasing the number and directions of second derivatives with differing weights; the expression of the defocus response function is as follows:

D̄_αn(x, y) = (1/|Ω'_D|) Σ_{(x', y') ∈ Ω'_D} |Δ_w Ī_αn(x', y')|    (3)

where Ω'_D is the size of the window around the current pixel, and the expression of the energy-enhanced second derivative operator Δ_w is as follows:

Δ_w Ī_αn(x, y) = [equation (4); given as an image in the original document]

where s and t denote the second derivative step lengths in the horizontal and vertical directions and ω denotes a weight factor: the closer a point is to the centre point, the larger its weight factor and the larger its contribution to the operator value; conversely, the farther a point is from the centre point, the smaller its contribution. S and T can only take odd values, and S × T is the size of the window formed by the convolution of the refocused image with the second derivative operator;

3) The correlation response σ_αn(x, y) of the 4D light fields at different focus depths is obtained by computing the angular variance of the 4D light field:

σ_αn(x, y) = (1/(Ns·Nt)) Σ_s Σ_t (L_αn(x, y, s, t) - Ī_αn(x, y))²    (5)

4) The index value n of the αn that maximises the defocus response D̄_αn(x, y) forms the defocus depth map D1(x, y), and the index value n of the αn that minimises the correlation response σ_αn(x, y) forms the correlation depth map D2(x, y); the specific expressions are as follows:

D1(x, y) = argmax_n D̄_αn(x, y)    (6)

D2(x, y) = argmin_n σ_αn(x, y)    (7)

5) Fusion of the defocus depth map and the correlation depth map is realised using Markov optimisation theory. First, the confidences corresponding to the two depth maps are computed:

C1(x,y) = D1(x,y)/D′1(x,y)    (8)

C2(x,y) = D2(x,y)/D′2(x,y)    (9)

where D′1(x, y) denotes the depth map obtained from equation (6) when the defocus response takes its second-largest value, and D′2(x, y) denotes the depth map obtained from equation (7) when the correlation response takes its second-smallest value; the final fused depth map D(x, y) is obtained by solving the following optimisation problem:

D(x, y) = [equation (10); given as an image in the original document]

Compared with the prior art, the invention improves the defocus response function in the existing defocus depth acquisition algorithm and proposes an improved, energy-enhanced defocus response function that fully considers the influence of surrounding pixels on the defocus degree at the current position, achieving energy enhancement by increasing the number and directions of second derivatives with differing weights. Experiments verify the effectiveness of the proposed method: for light field images with complex depth information, the depth map obtained by the proposed algorithm has a better visual effect, and compared with the algorithm combining defocus and correlation evaluation, the mean square error of the depth maps obtained by the improved algorithm decreases by 3.95% on average.

Drawings

The present invention will be described in further detail with reference to the accompanying drawings.

FIG. 1 is a data processing flow of the improved light field depth estimation algorithm based on the energy enhanced defocus response of the present invention.

In fig. 2, (a) is a conventional laplacian operator, and (b) is an energy-enhanced second derivative operator proposed by the present invention.

Fig. 3 is a comparison of depth estimation effects based on "shoe" images, where (a) is a light field original image, (b) is a depth map obtained by a conventional DCDC algorithm, and (c) is a depth map obtained by the algorithm of the present invention.

Fig. 4 is a comparison of depth acquisition effects based on the "stapler" image, wherein (a) is the light field original image, (b) is the depth map acquired by the DCDC algorithm, and (c) is the depth map acquired by the algorithm of the present invention.

FIG. 5 is a sample graph of a "benchmark" dataset, where (a) is a "boxes" scene and (b) is a "dino" scene.

Fig. 6 is a comparison of depth acquisition effects of "boxes" scene, wherein (a) is a real depth map, (b) is a depth map acquired by DCDC algorithm, and (c) is a depth map acquired by the algorithm of the present invention.

Fig. 7 is a comparison of depth acquisition effects of a "dino" scene, where (a) is a standard depth map, (b) is a depth map acquired by a DCDC algorithm, and (c) is a depth map acquired by an algorithm of the present invention.

Detailed Description

In order to make the objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.

The improved light field depth estimation algorithm based on the energy enhanced defocusing response is carried out according to the following steps:

1) After microlens-centre calibration of the light field camera, the 4D light field L0(x0, y0, s0, t0) can be decoded from the light field raw image. According to the light field biplane representation model and digital refocusing theory, when the imaging-plane position (image distance) of the light field camera changes, the coordinates of the light field L_αn(x, y, s, t) recorded by the camera at the new imaging depth (object distance) are related to those of the original light field L0(x0, y0, s0, t0) by

L_αn(x, y, s, t) = L0(s(1 - 1/αn) + x/αn, t(1 - 1/αn) + y/αn, s, t)    (1)

where αn is the proportionality coefficient of the image-plane displacement and n = 0, 1, 2, ..., 255 is the index corresponding to each αn value. Double integration of L_αn(x, y, s, t) along the s and t directions yields the refocused image corresponding to each αn value:

Ī_αn(x, y) = (1/(Ns·Nt)) ∬ L_αn(x, y, s, t) ds dt    (2)

where Ns, Nt denote the angular resolution of the 4D light field;
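The shift-and-sum refocusing of equations (1) and (2) can be sketched in a few lines. This is a minimal illustration, assuming the 4D light field is stored as a (Ns, Nt, H, W) array; it uses nearest-integer pixel shifts rather than the sub-pixel interpolation a practical decoder would apply.

```python
import numpy as np

def refocus(L0, alpha):
    """Shift-and-sum refocusing of a 4D light field.

    L0 has shape (Ns, Nt, H, W): angular axes (s, t) by spatial axes (x, y).
    alpha scales the image-plane distance; alpha = 1 keeps the original
    focal plane. Integer shifts are a simplifying assumption.
    """
    Ns, Nt, H, W = L0.shape
    out = np.zeros((H, W))
    for s in range(Ns):
        for t in range(Nt):
            # Shift each sub-aperture view by (1 - 1/alpha) times its
            # angular offset from the array centre, per equation (1).
            ds = int(round((s - Ns // 2) * (1 - 1.0 / alpha)))
            dt = int(round((t - Nt // 2) * (1 - 1.0 / alpha)))
            out += np.roll(L0[s, t], shift=(ds, dt), axis=(0, 1))
    # Equation (2): average over the angular axes.
    return out / (Ns * Nt)
```

For alpha = 1 the shifts vanish and the result reduces to the plain angular mean of the light field, i.e. the conventionally focused image.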

2) The defocus response D̄_αn(x, y) of the 4D light fields at different focus depths achieves energy enhancement by increasing the number and directions of second derivatives with differing weights; the expression of the defocus response function is as follows:

D̄_αn(x, y) = (1/|Ω'_D|) Σ_{(x', y') ∈ Ω'_D} |Δ_w Ī_αn(x', y')|    (3)

where Ω'_D is the size of the window around the current pixel (introduced to improve robustness), and the expression of the energy-enhanced second derivative operator Δ_w is as follows:

Δ_w Ī_αn(x, y) = [equation (4); given as an image in the original document]

where s and t denote the second derivative step lengths in the horizontal and vertical directions and ω denotes a weight factor: the closer a point is to the centre point, the larger its weight factor and the larger its contribution to the operator value; conversely, the farther a point is from the centre point, the smaller its contribution. S and T can only take odd values, and S × T is the size of the window formed by the convolution of the refocused image with the second derivative operator;
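The behaviour described above can be sketched as follows. Note the exact operator of equation (4) is given in the patent's figure; this sketch assumes an inverse-squared-distance weight, which matches the stated closer-points-weigh-more behaviour but is an assumption, not the patented formula.

```python
import numpy as np

def energy_enhanced_response(img, S=5, T=5):
    """Sketch of an energy-enhanced defocus response (assumed form).

    Accumulates |second derivative| over many directions and step lengths
    inside an S x T window (S, T odd). Each pair of opposite neighbours is
    weighted by 1/(s^2 + t^2), an assumed stand-in for the patent's weight
    factor. Taking absolute values before summing avoids the sign
    cancellation of the plain Laplacian. (The full loop visits each
    symmetric offset pair twice, which only scales the response.)
    """
    resp = np.zeros_like(img, dtype=float)
    for s in range(-(S // 2), S // 2 + 1):
        for t in range(-(T // 2), T // 2 + 1):
            if s == 0 and t == 0:
                continue
            w = 1.0 / (s * s + t * t)  # assumed distance-based weight
            fwd = np.roll(img, shift=(s, t), axis=(0, 1))
            bwd = np.roll(img, shift=(-s, -t), axis=(0, 1))
            # |second derivative| along the (s, t) direction
            resp += w * np.abs(2 * img - fwd - bwd)
    return resp
```

On a constant image the response is zero everywhere, while any local intensity variation in any direction contributes positive energy.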

3) The correlation response σ_αn(x, y) of the 4D light fields at different focus depths is obtained by computing the angular variance of the 4D light field:

σ_αn(x, y) = (1/(Ns·Nt)) Σ_s Σ_t (L_αn(x, y, s, t) - Ī_αn(x, y))²    (5)
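The angular variance of equation (5) is a one-liner under the same (Ns, Nt, H, W) array layout assumed above: for an in-focus pixel all angular samples observe the same scene point, so the variance across (s, t) is small, while defocused pixels mix different scene points and show a larger variance.

```python
import numpy as np

def correlation_response(L_alpha):
    """Correlation response (equation (5)): per-pixel variance of the
    refocused light field L_alpha, shape (Ns, Nt, H, W), across the
    two angular axes."""
    return L_alpha.var(axis=(0, 1))
```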

4) The index value n of the αn that maximises the defocus response D̄_αn(x, y) forms the defocus depth map D1(x, y), and the index value n of the αn that minimises the correlation response σ_αn(x, y) forms the correlation depth map D2(x, y); the specific expressions are as follows:

D1(x, y) = argmax_n D̄_αn(x, y)    (6)

D2(x, y) = argmin_n σ_αn(x, y)    (7)
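Equations (6) and (7) are per-pixel winner-take-all selections over the αn index; assuming the responses for all n are stacked into (N, H, W) arrays, they reduce to:

```python
import numpy as np

def extract_depth_maps(defocus_stack, corr_stack):
    """Per-pixel depth indices, equations (6)-(7).

    Both stacks have shape (N, H, W), indexed by the alpha_n index n.
    The defocus depth map takes the n maximising the defocus response;
    the correlation depth map takes the n minimising the angular variance.
    """
    D1 = np.argmax(defocus_stack, axis=0)  # defocus depth map, eq. (6)
    D2 = np.argmin(corr_stack, axis=0)     # correlation depth map, eq. (7)
    return D1, D2
```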

5) Fusion of the defocus depth map and the correlation depth map is realised using Markov optimisation theory. First, the confidences corresponding to the two depth maps are computed:

C1(x,y) = D1(x,y)/D′1(x,y)    (8)

C2(x,y) = D2(x,y)/D′2(x,y)    (9)

where D′1(x, y) denotes the depth map obtained from equation (6) when the defocus response takes its second-largest value, and D′2(x, y) denotes the depth map obtained from equation (7) when the correlation response takes its second-smallest value; the final fused depth map D(x, y) is obtained by solving the following optimisation problem:

D(x, y) = [equation (10); given as an image in the original document]
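The patent solves a global Markov optimisation (equation (10)) for the fusion; the sketch below deliberately replaces it with a much simpler per-pixel confidence-weighted blend, purely to illustrate how the confidences of equations (8)-(9) steer the fusion. It is a simplification, not the patented fusion step.

```python
import numpy as np

def fuse_depths(D1, D2, C1, C2):
    """Simplified stand-in for the Markov-random-field fusion.

    Blends the defocus depth map D1 and the correlation depth map D2
    per pixel in proportion to their confidences C1, C2 (the ratio of
    best to second-best response: higher = more trustworthy estimate).
    """
    w1 = C1 / (C1 + C2)            # relative trust in the defocus cue
    return w1 * D1 + (1.0 - w1) * D2
```

With equal confidences the blend reduces to the plain average of the two maps; the real MRF formulation additionally enforces spatial smoothness, which this per-pixel sketch does not.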

Compared with the traditional DCDC algorithm, the main difference of the algorithm of the invention lies in the defocus response function of step 2). The defocus response in the DCDC algorithm is obtained by computing the Laplacian of the refocused images at the different focus depths:

D̄_αn(x, y) = (1/|Ω_D|) Σ_{(x', y') ∈ Ω_D} |∇²Ī_αn(x', y')|    (11)

where ∇² is the Laplacian operator and Ω_D, introduced to increase the robustness of the algorithm, is the size of the window over which the Laplacian mean corresponding to the current pixel is computed.

The defocus response function established by equation (11) uses the traditional Laplacian operator to complete the focus evaluation at each pixel of the refocused image; the main function of the Laplacian operator is to take the second derivatives in the horizontal and vertical directions:

∇²Ī_αn(x, y) = ∂²Ī_αn(x, y)/∂x² + ∂²Ī_αn(x, y)/∂y²    (12)

where:

∂²Ī_αn(x, y)/∂x² = Ī_αn(x + t, y) - 2Ī_αn(x, y) + Ī_αn(x - t, y)    (13)

∂²Ī_αn(x, y)/∂y² = Ī_αn(x, y + t) - 2Ī_αn(x, y) + Ī_αn(x, y - t)    (14)

where t denotes the step length of the second derivative in the horizontal and vertical directions.

As can be seen from equations (13) and (14), the traditional Laplacian operator only evaluates the energy changes of the four pixels in the horizontal and vertical directions relative to the central pixel, as shown in Fig. 2(a). When the second derivatives in the horizontal and vertical directions have opposite signs, their energies cancel on summation, which reduces the depth acquisition accuracy of the algorithm when applied to complex scenes.
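This sign cancellation can be demonstrated numerically. On a saddle-shaped intensity pattern f(x, y) = x² - y², the horizontal second derivative is +2 and the vertical one is -2 everywhere: the local structure is strong, yet the plain Laplacian (their sum) vanishes, while summing their magnitudes, the idea behind the energy-enhanced operator, does not.

```python
import numpy as np

# Saddle surface: columns contribute x^2, rows contribute -y^2.
x = np.arange(-3, 4)
f = x[None, :] ** 2 - x[:, None] ** 2

centre = f[1:-1, 1:-1]
d2x = f[1:-1, 2:] - 2 * centre + f[1:-1, :-2]  # horizontal 2nd derivative (= 2)
d2y = f[2:, 1:-1] - 2 * centre + f[:-2, 1:-1]  # vertical 2nd derivative (= -2)

lap = d2x + d2y                       # plain Laplacian: cancels to zero
enhanced = np.abs(d2x) + np.abs(d2y)  # magnitudes survive: 4 everywhere
print(lap.max(), enhanced.min())      # prints: 0 4
```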

To address this problem, the invention improves the defocus response function in the defocus depth acquisition algorithm and proposes an improved defocus response function based on energy enhancement. A schematic diagram of the improved energy-enhanced second derivative operator is shown in Fig. 2(b), taking S = 5 and T = 5 as an example and labelling the number and directions of the second derivatives. When (s = 0, t = 1) and (s = 1, t = 0), the operator calculates the energy changes of the 4 pixels adjacent in the horizontal and vertical directions (the four solid circles connected by solid lines in the figure) relative to the current pixel (the hollow circle at the very centre of the figure); its function is then equivalent to the traditional Laplacian operator with a second derivative step length of 1. When (s = 0, t = 2) and (s = 1, t = 1), the operator calculates, in the vertical direction and the 45° direction, the energy changes of the pixels at distances 2 and √2 from the current pixel (the four hollow circles connected by dashed lines in the figure) relative to the current pixel, and so on; when (s = 2, t = 2) and (s = 2, t = -2), the operator calculates, in the 45° and 135° directions, the energy changes of the pixels at distance 2√2 from the current pixel (the four solid circles connected by dashed lines in the figure) relative to the current pixel. This process shows that the improved second derivative operator achieves energy enhancement by increasing the number and directions of the second derivatives while continuously varying their weights.

To demonstrate that the energy-enhanced defocus response function is more robust for depth acquisition in complex scenes, images with relatively complex textures were selected from the experimental database as test objects, and the scenes recorded by the microlens-array light field camera were depth-reconstructed using the DCDC algorithm and the algorithm of the invention respectively.

The first test image is the "shoe" image shown in Fig. 3(a). The surface of the recorded shoe contains dense holes, which appear as black-and-white jumps in the texture image; according to human prior knowledge, corresponding jumps in depth value should also appear at the holes in the reconstructed depth map. After depth reconstruction with the DCDC algorithm and with the method of the invention, the resulting depth maps, shown in Fig. 3(b) and Fig. 3(c), are compared in detail in two magnified local areas (region A, where the upper rectangular box lies, and region B, where the lower square box lies): in region A, the spatial arrangement of the holes is clearer in the depth map obtained by the algorithm of the invention; in region B, the depth map obtained by the DCDC algorithm is over-smoothed, while the depth map obtained by the present method better reflects the changes of local depth levels. The depth map obtained by the proposed method is thus more consistent with the texture map and better matches human prior judgement.

The second test image is the "stapler", shown in Fig. 4(a); the depth maps obtained by the DCDC algorithm and by the algorithm of the invention are shown in Fig. 4(b) and Fig. 4(c) respectively. In the magnified region A (the lower rectangular box of Fig. 4(a)), the middle area of the stapler body is black with little colour contrast, yet a jump in depth value exists there; the corresponding magnified area of the depth map in Fig. 4(b) shows a boundary artifact, which is clearly reduced in the corresponding area of Fig. 4(c). In the magnified region B (the upper rectangular box of Fig. 4(a)), there is a significant depth jump between the pink cup and the background; the depth information of the cup is lost in the corresponding magnified area of Fig. 4(b), while the detailed outline of the cup is visible in the corresponding area of Fig. 4(c). This experiment further proves that the proposed algorithm achieves higher depth reconstruction accuracy in local regions with complex depth information.

Because the existing image databases captured with microlens-array light field cameras lack corresponding standard depth maps, the two groups of experiments above can compare the superiority of the proposed algorithm only in terms of visual effect. To further evaluate the depth reconstruction accuracy quantitatively, experiments were carried out on the "benchmark" data set of Stanford University, which was captured with an array light field camera; each scene contains 81 multi-view images and 1 corresponding standard depth map. Two scenes with complex depth details were selected for the experiment, as shown in Fig. 5. During the experiment, the 81 multi-view images were treated as the 81 sub-aperture images decoded from one light field raw image of a microlens-array light field camera to construct the 4D light field data. Scene depth estimation was performed with the DCDC algorithm and with the method of the invention; the resulting depth maps and the corresponding standard depth maps are shown in Fig. 6 and Fig. 7.

In terms of overall depth reconstruction, the depth maps obtained by the proposed algorithm show fewer edge artifacts than those of the DCDC algorithm and better highlight the depth-level changes of the scene. Comparing the rectangular boxes in Fig. 6(b) and Fig. 6(c), the depth map recovered by the proposed algorithm reflects the slight jump of depth level in the central area and is closer to the standard depth map, whereas the DCDC algorithm loses these depth details. Comparing the rectangular boxes in Fig. 7(b) and Fig. 7(c), the boundary of the tooth-shaped model is clearer in the depth map recovered by the proposed algorithm and closer to the standard depth map.

Finally, taking the standard depth maps as reference, the peak signal-to-noise ratio (PSNR) and the mean square error (MSE) were selected as evaluation indices to quantitatively evaluate the accuracy of the depth maps acquired by the DCDC algorithm and by the proposed algorithm; the evaluation results are shown in Table 1. Comparison of the data in the table shows that the depth maps obtained by the proposed algorithm have higher PSNR and lower MSE.

TABLE 1 quantitative evaluation of depth map accuracy

[Table 1: numerical results provided as an image in the original document]

Aiming at the problems in the defocus depth estimation of the existing DCDC algorithm, namely that the defocus response function takes second derivatives in only a limited set of directions and that the horizontal and vertical second derivatives cancel one another's energy, an energy-enhanced second derivative operator is designed to improve the defocus response function and raise the depth acquisition accuracy in complex scenes. The operator fully considers the influence of surrounding pixels on the defocus degree at the current position, increases the number and directions of the second derivatives to achieve energy enhancement, and balances the summation of second derivative energies by setting weight coefficients. Experiments demonstrate the effectiveness of the proposed method: the depth maps obtained by the invention have clearer overall depth levels, obviously suppressed edge artifacts and more faithful local depth details; the average PSNR is improved by 0.3616 dB and the average mean square error is reduced by 3.95%.

While the embodiments of the present invention have been described in detail with reference to the drawings, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
