Binocular distance measuring method and device

Document No. 187678 · Published 2021-11-02

Note: this technical solution, Binocular distance measuring method and device (双目测距方法及装置), was designed and created by 陈汉清, 孔方琦, 沈丽萍, 李先红 and 郭宏瑞 on 2021-07-30. Its main content is as follows.

The application provides a binocular ranging method and apparatus. The method may include: determining, according to a user's calibration of the binocular camera, the correspondence between camera intrinsic parameters and subject distance as well as the relative positional relationship between the two cameras of the binocular camera; acquiring images of a target object captured by the two cameras; processing the images separately through a pre-constructed depth estimation network and determining, from the processing result, an estimated distance between the target object and the binocular camera; determining, according to the correspondence, the target intrinsic parameters corresponding to the estimated distance; and determining the depth information of the target object according to the target intrinsic parameters, the relative positional relationship, and the images. This technical solution reduces the binocular ranging error caused by variation of the camera intrinsic parameters and improves ranging accuracy.

1. A binocular ranging method, applied to an electronic device equipped with a binocular camera, the method comprising:

determining, according to a user's camera calibration results for a first camera and a second camera of the binocular camera, a correspondence between camera intrinsic parameters and subject distance, and a relative positional relationship between the first camera and the second camera;

acquiring a first image and a second image captured of a target object by the first camera and the second camera, respectively;

processing the first image and the second image through a pre-constructed depth estimation network, and determining, from the processing result, a first estimated distance between the target object and the first camera and a second estimated distance between the target object and the second camera;

determining, according to the correspondence, first target intrinsic parameters corresponding to the first estimated distance and second target intrinsic parameters corresponding to the second estimated distance;

and determining depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image.

2. The method according to claim 1, wherein the camera calibration results are obtained by the user calibrating, according to Zhang's calibration method, multiple sets of calibration-board images captured by the first camera and the second camera; the multiple sets of calibration-board images are obtained by the first camera and the second camera each photographing calibration boards placed at different distances;

and the camera calibration results comprise the intrinsic parameters of the first camera when photographing calibration boards at different distances and the intrinsic parameters of the second camera when photographing calibration boards at different distances.

3. The method according to claim 1, wherein determining the correspondence between camera intrinsic parameters and subject distance according to the user's camera calibration results for the first camera and the second camera of the binocular camera comprises:

fitting, from the user-calibrated intrinsic parameters of the first camera and the second camera when photographing calibration boards at different distances, a first functional relationship between the intrinsic parameters of the first camera and the subject distance and a second functional relationship between the intrinsic parameters of the second camera and the subject distance, and taking the first functional relationship and the second functional relationship as the correspondence between camera intrinsic parameters and subject distance.

4. The method of claim 1, wherein processing the first image and the second image through the pre-constructed depth estimation network and determining the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera comprises:

converting the first image and the second image into a first depth image and a second depth image, respectively, through the pre-constructed depth estimation network, wherein the first depth image and the second depth image are grayscale images;

determining first coordinate information of the target object in the first image and second coordinate information of the target object in the second image;

determining a first gray value of the target object in the first depth image according to the first coordinate information, and determining a second gray value of the target object in the second depth image according to the second coordinate information;

and determining, according to the first gray value and the second gray value respectively, the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera.

5. The method of claim 4, wherein the depth estimation network comprises a Monodepth2 network, and converting the first image and the second image into the first depth image and the second depth image through the pre-constructed depth estimation network comprises:

converting the first image and the second image into a first RGB depth image and a second RGB depth image, respectively, through a pre-constructed Monodepth2 network;

and performing grayscale conversion on the first RGB depth image and the second RGB depth image to generate the first depth image and the second depth image.

6. The method of claim 4, wherein determining, according to the first gray value and the second gray value respectively, the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera comprises:

determining, according to a correspondence between gray value and distance, the first estimated distance corresponding to the first gray value and the second estimated distance corresponding to the second gray value;

wherein the correspondence between gray value and distance is obtained by fitting multiple sets of pre-acquired depth information and gray value pairs, the depth information being obtained by measuring the distance between a test object and a test camera photographing the test object, and the gray values being obtained by processing test images captured by the test camera through the depth estimation network.

7. The method of claim 4, wherein determining, according to the first gray value and the second gray value respectively, the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera comprises:

calculating depth information of an object located at the bottom edge of the first image according to acquired height information and a shooting angle of the first camera;

acquiring a gray value of the object located at the bottom edge of the first image from the first depth image;

and estimating, according to the depth information and the gray value of the object located at the bottom edge of the first image, the first estimated distance between the target object and the first camera corresponding to the first gray value and the second estimated distance between the target object and the second camera corresponding to the second gray value.

8. The method of claim 1, wherein determining the depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image comprises:

performing binocular rectification on the first image and the second image according to the first target intrinsic parameters, the second target intrinsic parameters, and the relative positional relationship;

performing binocular matching on the rectified first image and second image to generate a disparity map;

and determining the depth of the target object according to the disparity map, the first target intrinsic parameters, the second target intrinsic parameters, and the relative positional relationship.

9. The method of claim 1, wherein the electronic device is a medical device and the target object is a lesion, the method further comprising:

establishing a three-dimensional model of the site where the lesion is located according to the depth information of the lesion;

and determining position information of the lesion according to the three-dimensional model.

10. A binocular ranging apparatus, applied to an electronic device equipped with a binocular camera, comprising:

a calibration result determining unit, configured to determine, according to a user's camera calibration results for a first camera and a second camera of the binocular camera, a correspondence between camera intrinsic parameters and subject distance and a relative positional relationship between the first camera and the second camera;

an image acquisition unit, configured to acquire a first image and a second image captured of a target object by the first camera and the second camera, respectively;

a distance estimation unit, configured to process the first image and the second image through a pre-constructed depth estimation network and determine, from the processing result, a first estimated distance between the target object and the first camera and a second estimated distance between the target object and the second camera;

an intrinsic parameter determining unit, configured to determine, according to the correspondence, first target intrinsic parameters corresponding to the first estimated distance and second target intrinsic parameters corresponding to the second estimated distance;

and a depth information determining unit, configured to determine depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image.

11. An electronic device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor implements the method of any one of claims 1-9 by executing the executable instructions.

12. A computer-readable storage medium having computer instructions stored thereon which, when executed by a processor, implement the steps of the method according to any one of claims 1-9.

Technical Field

The application relates to the technical field of machine vision, in particular to a binocular distance measuring method and device.

Background

With the development of the medical industry, emerging medical devices can perform surgical treatment on patients. Such devices mainly rely on a machine vision system to measure the distance between a lesion and the device and to locate the lesion based on the measured distance, so that surgery can be performed at the lesion site. The success rate of surgery therefore depends largely on the accuracy of the medical device's machine vision.

At present, the machine vision system of a medical device relies mainly on traditional binocular ranging technology: a binocular ranging apparatus fitted to the device measures the distance between the device and the target lesion. However, the measurement accuracy of traditional binocular ranging often fails to meet surgical requirements.

Disclosure of Invention

In view of the above, the present application provides a binocular ranging method and apparatus.

Specifically, the method is realized through the following technical scheme:

according to a first aspect of the present application, a binocular ranging method is provided, which is applied to an electronic device configured with a binocular camera, and includes:

determining, according to a user's camera calibration results for a first camera and a second camera of the binocular camera, a correspondence between camera intrinsic parameters and subject distance, and a relative positional relationship between the first camera and the second camera;

acquiring a first image and a second image captured of a target object by the first camera and the second camera, respectively;

processing the first image and the second image through a pre-constructed depth estimation network, and determining, from the processing result, a first estimated distance between the target object and the first camera and a second estimated distance between the target object and the second camera;

determining, according to the correspondence, first target intrinsic parameters corresponding to the first estimated distance and second target intrinsic parameters corresponding to the second estimated distance;

and determining depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image.

According to a second aspect of the present application, a binocular ranging apparatus is provided, applied to an electronic device equipped with a binocular camera, comprising:

a calibration result determining unit, configured to determine, according to a user's camera calibration results for a first camera and a second camera of the binocular camera, a correspondence between camera intrinsic parameters and subject distance and a relative positional relationship between the first camera and the second camera;

an image acquisition unit, configured to acquire a first image and a second image captured of a target object by the first camera and the second camera, respectively;

a distance estimation unit, configured to process the first image and the second image through a pre-constructed depth estimation network and determine, from the processing result, a first estimated distance between the target object and the first camera and a second estimated distance between the target object and the second camera;

an intrinsic parameter determining unit, configured to determine, according to the correspondence, first target intrinsic parameters corresponding to the first estimated distance and second target intrinsic parameters corresponding to the second estimated distance;

and a depth information determining unit, configured to determine depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image.

According to a third aspect of the present application, there is provided an electronic device comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor implements the method as described in the embodiments of the first aspect above by executing the executable instructions.

According to a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method as described in the embodiments of the first aspect above.

In the above technical solution, the correspondence between camera intrinsic parameters and subject distance is determined in advance; when measuring, the distance between the target object and the binocular camera is first roughly estimated, and the depth is then calculated using the intrinsic parameters corresponding to that estimated distance. In this way the intrinsic parameters actually in effect when the binocular camera photographs the target object can be determined accurately, reducing binocular ranging error and improving measurement accuracy.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.

Fig. 1 is a schematic view of an imaging model of a binocular camera according to an exemplary embodiment of the present application;

fig. 2 is a flow chart illustrating a binocular ranging method according to an exemplary embodiment of the present application;

Fig. 3 is a graph illustrating a set of correspondences between camera intrinsic parameters and subject distance according to an exemplary embodiment of the present application;

Fig. 4 is a diagram illustrating a fitted function of camera intrinsic parameters versus subject distance according to an exemplary embodiment of the present application;

FIG. 5 is a schematic diagram of a captured image and corresponding depth image shown in accordance with an exemplary embodiment of the present application;

Fig. 6 is a schematic diagram of a measurement model for the depth information of an object at the bottom edge of an image according to an exemplary embodiment of the present application;

FIG. 7 is a schematic view of a binocular ranging electronic device shown in accordance with an exemplary embodiment of the present application;

fig. 8 is a block diagram illustrating a binocular ranging apparatus according to an exemplary embodiment of the present application.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described below do not represent all embodiments consistent with the present application; rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may also be referred to as first information without departing from the scope of the present application. The word "if" as used herein may be interpreted as "upon", "when", or "in response to determining", depending on the context.

Next, examples of the present application will be described in detail.

The ranging principle of a binocular camera is similar to that of human eyes. Because the two eyes see slightly different images of the same object, a difference called parallax (or disparity), and because the disparity decreases as the object moves farther away and increases as it moves closer, human eyes can perceive distance. Similarly, by calculating the disparity between the two images captured by the two cameras of a binocular camera, the distance between an object visible in both images and the binocular camera can be measured directly. Fig. 1 is a schematic view of the imaging model of a binocular camera. As shown in Fig. 1, point P is the target object to be measured, C_L denotes the optical center of the first camera, and C_R denotes the optical center of the second camera. Point P is imaged at point P_L on the first camera's image and at point P_R on the second camera's image, with abscissas x_L and x_R respectively, which represent the horizontal positions of these points on the two images. The disparity is defined as d = x_L − x_R, i.e., the difference between the x coordinates of the same spatial point in the two camera images; it can be read from a disparity map. The distance between the target object P and the binocular camera, i.e., the depth Z, can then be derived by the similar-triangle theorem:

    Z = f · b / (x_L − x_R) = f · b / d        (1)

In formula (1), f is the focal length of the camera, and b is the distance between the optical center C_L of the first camera and the optical center C_R of the second camera, i.e., the baseline length of the first and second cameras; x_L − x_R is the disparity. In traditional binocular ranging, the focal length f and the baseline length b are treated as fixed parameters, obtained as camera intrinsic parameters by calibrating the binocular camera once in advance.

However, in the actual ranging process, the camera intrinsic parameters are not fixed values; they change with the distance of the photographed object. For example, the farther the subject, the larger the focal length the camera uses when photographing it, and the closer the subject, the more likely the camera is to be out of focus. Therefore, determining fixed intrinsic parameters through a single camera calibration, as traditional binocular ranging does, easily introduces large errors into the ranging results.

To solve this problem, the present application uses different camera intrinsic parameters for target objects at different distances, reducing the ranging error caused by intrinsic-parameter variation. Fig. 2 is a flowchart illustrating a binocular ranging method according to an exemplary embodiment of the present application. As shown in Fig. 2, the method is applied to an electronic device equipped with a binocular camera and may include the following steps:

step 202: and determining the corresponding relation between the internal parameters of the camera and the distance of the object to be shot and the relative position relation between the first camera and the second camera according to the camera calibration result of the user on the first camera and the second camera in the binocular camera.

In this application, a binocular camera usually consists of two cameras, and the first camera and the second camera can photograph the target object simultaneously. The two cameras may be of the same model with identical parameters or may have different parameters; the application places no restriction on this. In a binocular camera, the characteristics of the optical lens cause radial distortion of the image, and because assembly errors leave the sensor and the optical lens not perfectly parallel, tangential distortion also exists. Therefore, before binocular ranging is performed, camera calibration must be carried out on the first camera and the second camera to determine their intrinsic and extrinsic parameters: the intrinsic parameters may include the camera focal length, distortion parameters, and imaging origin, while the extrinsic parameters may include the relative positional relationship between the first camera and the second camera, which can be represented by a rotation matrix and a translation vector of the first camera relative to the second camera.

The calibration procedure itself is the same as conventional camera calibration: Zhang's calibration method is used, and the binocular camera photographs a chessboard calibration board (a board of alternating black and white squares) to obtain calibration-board images. The camera can be calibrated from several images of the board taken from different directions and from the correspondence between the feature points on the board and the image points on the image plane, i.e., the homography matrix of each image. For the detailed calibration computation, refer to the related art; it is not repeated here.

The difference from the single calibration used by traditional binocular ranging is that this application calibrates the cameras multiple times: multiple sets of calibration-board images are captured with the board placed at different distances, and calibration is performed separately on each set to determine the intrinsic parameters the camera uses when photographing a board at each distance, yielding multiple pairs of intrinsic parameters and board distances. Camera calibration may be implemented, for example, with the stereoCameraCalibrator app provided by MATLAB, or with the find4QuadCornerSubpix and calibrateCamera functions provided by OpenCV, among others; the application is not limited in this respect. For example, a calibration board may be placed every 2 cm at distances from 50 cm to 90 cm from the camera, images taken and calibration performed for each of the 20 board positions, yielding 20 pairs of intrinsic parameters and subject distances. Fig. 3 shows one such set of correspondences between camera intrinsic parameters and subject distance, with subject distance on the horizontal axis and camera focal length on the vertical axis; the inflection points of the curve represent the focal lengths obtained by calibrating the first camera against boards at the various distances between 50 cm and 90 cm. The curve in Fig. 3 shows that, although the calibrated focal lengths fluctuate due to measurement error, the focal length generally increases as the subject distance increases.
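
For illustration only, this multi-distance calibration might be sketched as follows with Python and OpenCV; the directory layout, file naming, and board geometry (9 × 6 inner corners, 20 mm squares) are assumptions, not requirements of the application:

    import glob
    import cv2
    import numpy as np

    # Assumed board geometry: 9 x 6 inner corners, 20 mm squares.
    PATTERN = (9, 6)
    SQUARE_MM = 20.0
    objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

    pairs = []  # (board distance in cm, calibrated focal length fx in pixels)
    for dist_cm in range(50, 90, 2):  # 20 board positions, every 2 cm
        obj_pts, img_pts, size = [], [], None
        # Assumed layout: one folder of board images per distance.
        for path in glob.glob(f"calib/left/{dist_cm}cm/*.png"):
            gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
            found, corners = cv2.findChessboardCorners(gray, PATTERN)
            if not found:
                continue
            obj_pts.append(objp)
            img_pts.append(corners)
            size = gray.shape[::-1]
        if obj_pts:
            _, K, _, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
            pairs.append((dist_cm, K[0, 0]))  # fx from the intrinsic matrix K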

In an embodiment, the correspondence between camera intrinsic parameters and subject distance may be fitted in advance from the discrete intrinsic parameters calibrated when the camera photographs boards at different distances, so that after the fitting operation the intrinsic parameters and the subject distance follow a functional relationship. The intrinsic parameters the camera uses when photographing a target object can then be determined from the target's distance and this functional relationship, and binocular ranging can be computed with those parameters. Continuing the example of calibrating against boards at 50 cm to 90 cm, the value pairs of camera focal length and subject distance obtained by calibration can be fitted with a Gaussian curve to produce the focal-length curve in the fitted-function diagram of Fig. 4. All of the experimental data can be used in the fitting so that as many data points as possible lie on or near the fitted curve, achieving the best fit. Note that Gaussian curve fitting is only one example; in fact, any fitting algorithm in the related art may be used to fit the calibrated correspondence between camera intrinsic parameters and subject distance, and the application places no restriction on this. By fitting the calibration data, a continuous function of camera intrinsic parameters over subject distance is obtained, which reduces the influence of calibration error on actual ranging and improves the accuracy of the intrinsic parameters obtained in subsequent binocular measurement.
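
A hedged sketch of the fitting step follows; the Gaussian form mirrors the example above, while the placeholder (distance, focal length) pairs merely stand in for real calibration output:

    import numpy as np
    from scipy.optimize import curve_fit

    # Placeholder pairs standing in for calibration output:
    # (board distance in cm, calibrated fx in pixels).
    pairs = [(d, 1150.0 + 0.8 * d) for d in range(50, 90, 2)]

    def gaussian(d, a, mu, sigma, c):
        return a * np.exp(-((d - mu) ** 2) / (2.0 * sigma ** 2)) + c

    dist = np.array([p[0] for p in pairs], dtype=float)
    focal = np.array([p[1] for p in pairs], dtype=float)
    params, _ = curve_fit(gaussian, dist, focal, maxfev=10000,
                          p0=(focal.max() - focal.min(), dist.mean(), 20.0, focal.min()))

    def focal_at(distance_cm):
        """Continuous intrinsic model: focal length at an arbitrary subject distance."""
        return float(gaussian(distance_cm, *params))

As noted above, any fitting algorithm may be substituted for the Gaussian; all that step 208 later needs is the continuous function focal_at.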

Step 204: acquire a first image and a second image captured of a target object by the first camera and the second camera, respectively.

In the actual ranging process, the first camera and the second camera of the binocular camera photograph the target object simultaneously; the images containing the target object, collected in real time, are stored, yielding the first image captured by the first camera and the second image captured by the second camera.

Furthermore, the first camera and the second camera should be placed as parallel as possible and at equal heights during shooting, so that their imaging planes for the target object are coplanar. This reduces the subsequent computation needed to rectify the first image and the second image, speeds up image processing, and reduces measurement error.

Step 206: process the first image and the second image through the pre-constructed depth estimation network, and determine, from the processing result, a first estimated distance between the target object and the first camera and a second estimated distance between the target object and the second camera.

In the technical solution of this application, before the target object's depth is computed by binocular ranging, the first image and the second image captured by the binocular camera are input into a pre-trained depth estimation network, which processes them into corresponding depth images whose pixel values represent the distances from the image collector to points in the scene. For ease of computation, the depth images used in this application are grayscale images whose pixel values are single-channel gray values; Fig. 5 is a schematic diagram of a captured image containing a calibration board and its corresponding depth image. Various unsupervised and self-supervised deep learning models for predicting image depth exist in the related art, and this application places no restriction on which depth estimation network model is used. After the depth images are acquired, the first coordinate information of the target object in the first image and the second coordinate information in the second image can be determined, and the gray values at those coordinates read from the depth images; these gray values represent the depth of the target object. For example, when the target object is a calibration board, the coordinates of the four vertices of the chessboard in the first image can be detected with OpenCV's findChessboardCorners function; the quadrilateral enclosed by the four vertex coordinates is the chessboard region of the board, and the gray value at those vertex coordinates in the first depth image is the gray value at the board's chessboard vertices. Further, to reduce error, the mean gray value of the pixels inside the quadrilateral enclosed by the four vertices can be computed and used as the gray value of the calibration board. Note that the gray values discussed in this application are the gray values representing depth in the depth image, unrelated to the color of the photographed image itself.
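
As an illustrative sketch of reading the target's gray value from a depth image (Python with OpenCV; the file names are placeholders, and the chessboard case from the example above is assumed):

    import cv2
    import numpy as np

    img = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)               # captured image
    depth_gray = cv2.imread("left_depth.png", cv2.IMREAD_GRAYSCALE)  # its depth image

    found, corners = cv2.findChessboardCorners(img, (9, 6))
    if found:
        # Quadrilateral enclosing the detected chessboard corners.
        quad = cv2.convexHull(corners.reshape(-1, 2).astype(np.int32))
        mask = np.zeros_like(depth_gray)
        cv2.fillConvexPoly(mask, quad, 255)
        # Mean gray value inside the board region, to reduce per-pixel error.
        gray_value = cv2.mean(depth_gray, mask=mask)[0]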

In an embodiment, the depth estimation network may be a Monodepth2 network: after the binocular camera captures the first image and the second image, they are input into the Monodepth2 network (for its training method, refer to the related art; it is not repeated here). Taking the first image as an example, the Monodepth2 network converts it, based on user-set parameters, into a first RGB depth image, in which depth is represented by color and brightness: objects closer to the camera appear nearer to bright yellow, and objects farther away appear nearer to black. Furthermore, since the pixel values of an RGB depth map are three-channel RGB values, which are inconvenient for computation, a built-in function of the Monodepth2 network can be used to convert the first RGB depth image into a first depth image with a single gray channel, in which the gray value of each pixel represents that pixel's depth value.
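
The exact built-in conversion of the Monodepth2 implementation is not reproduced here; a common, assumed normalisation from a network's raw single-channel output to an 8-bit gray depth image would be:

    import numpy as np

    def to_gray_depth(disp):
        """Normalise a raw disparity/depth map to a single-channel 8-bit image."""
        d = disp.astype(np.float32)
        d = (d - d.min()) / max(float(d.max() - d.min()), 1e-6)
        return (d * 255.0).astype(np.uint8)  # one gray value per pixel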

Although the depth image produced by the depth estimation network expresses the depth of objects through gray values, common distance units are metric, so the pixel units and distance units must be unified: the gray value is converted into a metric distance, which is the estimated distance of the target object.

In one embodiment, an experiment can be run in advance on test images containing objects at different distances: the distance between each test object and the shooting camera is measured as that object's depth information; the test images are input into the depth estimation network used for binocular ranging and converted into test depth images as described in step 206 above; and the gray value of each test object is determined from its coordinate information in the test image, yielding multiple gray value-distance mappings. The correspondence between gray value and distance is then fitted from these mappings, so that after the fitting operation the gray value and the distance follow a functional relationship. During binocular ranging, the estimated distance can be determined from this functional relationship and the target object's gray value, giving an accurate estimate of the distance between the target object and the binocular camera.
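
A sketch of this fitting, with made-up test measurements in place of real experiment data and a first-order polynomial standing in for whatever fitting algorithm is chosen:

    import numpy as np

    # Made-up (gray value, measured distance) test pairs for illustration;
    # brighter gray = nearer object, matching the depth images above.
    grays = np.array([230.0, 185.0, 150.0, 120.0, 95.0])
    dists_cm = np.array([50.0, 60.0, 70.0, 80.0, 90.0])
    coeffs = np.polyfit(grays, dists_cm, deg=1)

    def estimate_distance_cm(gray_value):
        """Estimated metric distance for a target's gray value."""
        return float(np.polyval(coeffs, gray_value))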

In another embodiment, a linear relationship y = kx between gray value and depth may be preset, where x represents the gray value, y the metric distance, and k a constant to be determined. Taking the first image as an example, during its processing the depth information of an object located at the bottom edge of the first image can be calculated trigonometrically from the first camera's height information and shooting angle. Fig. 6 is a schematic diagram of the model of the first camera capturing the first image, where 601 is the first camera, 602 is the first image, P is the target object, Q is the object located at the bottom edge of the first image, h is the first camera's height information, α is the shooting angle, and l is the depth of the bottom-edge object to be calculated. The camera's height information can be obtained from a pose sensor fitted to the binocular camera, and the shooting angle can be computed from the pose sensor and the camera's field of view. For ease of calculation, in this embodiment the bottom-edge object of the first image and the reference plane of the camera height may lie in the same plane. From the depth information and gray value of the object at the bottom edge of the image, the constant k of the preset linear function can be determined; substituting the target object's first gray value and second gray value into the functional relationship then yields the first estimated distance and the second estimated distance.
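
A sketch of this bootstrap under one assumed angle convention (α measured from the vertical through the camera, so that l = h · tan α; the application leaves the exact convention to the pose-sensor setup, and all values below are placeholders):

    import math

    h = 1.2                   # camera height in metres, from the pose sensor
    alpha = math.radians(50)  # shooting angle (assumed convention: from vertical)
    l = h * math.tan(alpha)   # depth of the object at the image's bottom edge

    gray_bottom = 230.0       # gray value of that object in the first depth image
    k = l / gray_bottom       # constant of the preset linear model y = k * x

    # Estimated distances of the target from its gray values in the two depth images.
    first_estimate = k * 180.0   # placeholder first gray value
    second_estimate = k * 176.0  # placeholder second gray value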

Step 208: determine, according to the correspondence, first target intrinsic parameters corresponding to the first estimated distance and second target intrinsic parameters corresponding to the second estimated distance.

After the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera have been roughly estimated by the depth estimation network, the first target intrinsic parameters corresponding to the first estimated distance and the second target intrinsic parameters corresponding to the second estimated distance are determined from the correspondence between camera intrinsic parameters and subject distance established in step 202: the first target intrinsic parameters are those the first camera uses when photographing the target object, and the second target intrinsic parameters are those the second camera uses. For example, if the correspondence follows a functional relationship with the subject distance as the independent variable and the intrinsic parameters as the dependent variable, the first and second estimated distances produced by the depth estimation network can each be substituted into the relationship as the subject distance, and the corresponding first and second target intrinsic parameters obtained by calculation.

Step 210: determine the depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image.

Once the target intrinsic parameters in effect for the current binocular camera have been determined, binocular rectification can be performed on the first image and the second image using those parameters and the relative positional relationship between the two cameras: distortion is removed and the rows are aligned so that the imaging origins of the two images coincide, the optical axes of the first and second cameras are parallel, the imaging planes are coplanar, and the epipolar lines are aligned. In this application the binocular rectification process is the same as in conventional binocular ranging and is not repeated here.

After rectification, stereo matching can be performed on the rectified first and second images to obtain a disparity map; the stereo matching here is the same as in conventional binocular ranging algorithms and is not repeated. Stereo matching yields the disparity map of the target object as photographed by the first and second cameras. From the disparity map and the target intrinsic parameters determined from the intrinsic parameters-subject distance correspondence, the depth information of the target object can be calculated by the similar-triangle theorem, following the model of Fig. 1.
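
An end-to-end sketch of step 210 in Python with OpenCV follows; the intrinsic matrices, distortion coefficients, rotation, translation, image size, and file names are placeholders, with the fitted focal lengths of step 208 slotted into K1 and K2:

    import cv2
    import numpy as np

    # Placeholder intrinsics: fx/fy would come from the fitted function of
    # step 208 evaluated at the estimated distances; R, T from calibration.
    K1 = np.array([[1210.0, 0.0, 640.0], [0.0, 1210.0, 360.0], [0.0, 0.0, 1.0]])
    K2 = np.array([[1205.0, 0.0, 640.0], [0.0, 1205.0, 360.0], [0.0, 0.0, 1.0]])
    d1 = d2 = np.zeros(5)                  # distortion coefficients
    R = np.eye(3)                          # rotation between the two cameras
    T = np.array([[-60.0], [0.0], [0.0]])  # translation: 60 mm baseline
    size = (1280, 720)

    # Binocular rectification (distortion removal and row alignment).
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, size, R, T)
    m1 = cv2.initUndistortRectifyMap(K1, d1, R1, P1, size, cv2.CV_32FC1)
    m2 = cv2.initUndistortRectifyMap(K2, d2, R2, P2, size, cv2.CV_32FC1)
    left = cv2.remap(cv2.imread("left.png", 0), m1[0], m1[1], cv2.INTER_LINEAR)
    right = cv2.remap(cv2.imread("right.png", 0), m2[0], m2[1], cv2.INTER_LINEAR)

    # Binocular (stereo) matching to obtain the disparity map.
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    disp = sgbm.compute(left, right).astype(np.float32) / 16.0  # SGBM is fixed-point

    # Depth per pixel from formula (1): Z = f * b / d (here in millimetres).
    f, b = P1[0, 0], 60.0
    with np.errstate(divide="ignore", invalid="ignore"):
        depth_mm = np.where(disp > 0, f * b / disp, 0.0)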

In the above technical solution, the correspondence between camera intrinsic parameters and subject distance is determined in advance; when measuring, the distance between the target object and the binocular camera is first roughly estimated, and the depth is then calculated using the intrinsic parameters corresponding to that estimated distance. In this way the intrinsic parameters actually in effect when the binocular camera photographs the target object can be determined accurately, reducing binocular ranging error and improving measurement accuracy.

In an embodiment where the electronic device is a medical device fitted with a binocular camera and the target object is a lesion, the depth information of the lesion can be measured accurately by the binocular measurement method disclosed in this application, and a three-dimensional model of the site where the lesion is located can be established from that depth information. Based on the established model, the medical device and the physician can accurately determine the position information of the lesion, improving the success rate of surgery at the lesion site.

Corresponding to the method embodiments, the present specification also provides an embodiment of an apparatus.

Fig. 7 is a schematic structural diagram of a binocular ranging electronic device according to an exemplary embodiment of the present application. Referring to Fig. 7, at the hardware level the electronic device includes a processor 702, an internal bus 704, a network interface 706, a memory 708, and a non-volatile storage 710, and may also include hardware required for other services. The processor 702 reads the corresponding computer program from the non-volatile storage 710 into the memory 708 and runs it. Of course, besides a software implementation, the present application does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the following processing flow is not limited to logic units and may also be hardware or logic devices.

Fig. 8 is a block diagram illustrating a binocular ranging apparatus according to an exemplary embodiment of the present application. Referring to Fig. 8, the apparatus includes a calibration result determining unit 802, an image acquisition unit 804, a distance estimation unit 806, an intrinsic parameter determining unit 808, and a depth information determining unit 810, in which:

the calibration result determining unit 802 is configured to determine a corresponding relationship between camera parameters and a subject distance and a relative position relationship between a first camera and a second camera in the binocular camera according to camera calibration results of a user on the first camera and the second camera.

The image acquisition unit 804 is configured to acquire a first image and a second image captured of a target object by the first camera and the second camera, respectively.

The distance estimation unit 806 is configured to process the first image and the second image through the pre-constructed depth estimation network and to determine, from the processing result, a first estimated distance between the target object and the first camera and a second estimated distance between the target object and the second camera.

The intrinsic parameter determining unit 808 is configured to determine, according to the correspondence, first target intrinsic parameters corresponding to the first estimated distance and second target intrinsic parameters corresponding to the second estimated distance.

The depth information determining unit 810 is configured to determine the depth information of the target object from the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image.

Optionally, the camera calibration results are obtained by the user calibrating, according to Zhang's calibration method, multiple sets of calibration-board images captured by the first camera and the second camera; the multiple sets of calibration-board images are obtained by the first camera and the second camera each photographing calibration boards placed at different distances; and the camera calibration results comprise the intrinsic parameters of the first camera when photographing calibration boards at different distances and the intrinsic parameters of the second camera when photographing calibration boards at different distances.

Optionally, determining the correspondence between camera intrinsic parameters and subject distance according to the user's camera calibration results for the first camera and the second camera of the binocular camera comprises: fitting, from the user-calibrated intrinsic parameters of the first camera and the second camera when photographing calibration boards at different distances, a first functional relationship between the intrinsic parameters of the first camera and the subject distance and a second functional relationship between the intrinsic parameters of the second camera and the subject distance, and taking the first functional relationship and the second functional relationship as the correspondence between camera intrinsic parameters and subject distance.

Optionally, processing the first image and the second image through the pre-constructed depth estimation network and determining the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera comprises: converting the first image and the second image into a first depth image and a second depth image, respectively, through the pre-constructed depth estimation network, wherein the first depth image and the second depth image are grayscale images; determining first coordinate information of the target object in the first image and second coordinate information of the target object in the second image; determining a first gray value of the target object in the first depth image according to the first coordinate information, and a second gray value of the target object in the second depth image according to the second coordinate information; and determining, according to the first gray value and the second gray value respectively, the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera.

Optionally, the depth estimation network comprises a Monodepth2 network, and converting the first image and the second image into the first depth image and the second depth image through the pre-constructed depth estimation network comprises: converting the first image and the second image into a first RGB depth image and a second RGB depth image, respectively, through a pre-constructed Monodepth2 network; and performing grayscale conversion on the first RGB depth image and the second RGB depth image to generate the first depth image and the second depth image.

Optionally, determining, according to the first gray value and the second gray value respectively, the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera comprises: determining, according to a correspondence between gray value and distance, the first estimated distance corresponding to the first gray value and the second estimated distance corresponding to the second gray value; wherein the correspondence between gray value and distance is obtained by fitting multiple sets of pre-acquired depth information and gray value pairs, the depth information being obtained by measuring the distance between a test object and a test camera photographing the test object, and the gray values being obtained by processing test images captured by the test camera through the depth estimation network.

Optionally, determining, according to the first gray value and the second gray value respectively, the first estimated distance between the target object and the first camera and the second estimated distance between the target object and the second camera comprises: calculating depth information of an object located at the bottom edge of the first image according to acquired height information and a shooting angle of the first camera; acquiring a gray value of the object located at the bottom edge of the first image from the first depth image; and estimating, according to the depth information and the gray value of the object located at the bottom edge of the first image, the first estimated distance between the target object and the first camera corresponding to the first gray value and the second estimated distance between the target object and the second camera corresponding to the second gray value.

Optionally, determining the depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image comprises: performing binocular rectification on the first image and the second image according to the first target intrinsic parameters, the second target intrinsic parameters, and the relative positional relationship; performing binocular matching on the rectified first image and second image to generate a disparity map; and determining the depth of the target object according to the disparity map, the first target intrinsic parameters, the second target intrinsic parameters, and the relative positional relationship.

Optionally, the electronic device is a medical device, the target object is a lesion, and the apparatus further includes:

a three-dimensional model establishing unit 812 configured to establish a three-dimensional model of a location where the lesion is located according to the depth information of the lesion.

A lesion position determination unit 814 configured to determine position information of the lesion according to the three-dimensional model.

The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.

For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.

In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions, such as a memory, is also provided, the instructions being executable by a processor of a binocular ranging apparatus to implement the method of any of the above embodiments. For example, the method may include:

determining, according to a user's camera calibration results for a first camera and a second camera of the binocular camera, a correspondence between camera intrinsic parameters and subject distance, and a relative positional relationship between the first camera and the second camera;

acquiring a first image and a second image captured of a target object by the first camera and the second camera, respectively;

processing the first image and the second image through a pre-constructed depth estimation network, and determining, from the processing result, a first estimated distance between the target object and the first camera and a second estimated distance between the target object and the second camera;

determining, according to the correspondence, first target intrinsic parameters corresponding to the first estimated distance and second target intrinsic parameters corresponding to the second estimated distance;

and determining depth information of the target object according to the first target intrinsic parameters, the second target intrinsic parameters, the relative positional relationship, the first image, and the second image.

The non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc., which is not limited in this application.

The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.
