Mine robot positioning and mapping method based on lidar and binocular camera

Document No.: 1797710 | Publication date: 2021-11-05

Note: this technology, "A mine robot positioning and mapping method based on a lidar and a binocular camera", was created by Wang Yankun, Han Shuo, Fu Jinyu, Guo Shuai, Li Xiang, Lyu Xiaobo, and Zhang Bing on 2021-08-03. Its main content is as follows: a mine robot positioning and mapping method based on a lidar and a binocular camera, in the technical field of machine vision, addresses the inaccurate visual depth estimation of prior-art methods in mine environments. Owing to the particularity of the mine environment, poor illumination or missing environmental texture underground causes inaccurate depth estimation and feature scarcity, which in turn make pose estimation inaccurate or lose tracking entirely. Lidar, meanwhile, has weak relocalization ability and is ill-suited to long, straight tunnels. For these situations, the application supplements the scarce point features with line features and uses sensor fusion to overcome the problems any single sensor encounters, achieving more stable, higher-precision positioning and mapping and resolving the large errors of vision-only or lidar-only positioning and mapping in mine environments.

1. A mine robot positioning and mapping method based on a lidar and a binocular camera, characterized by comprising the following steps:

step one: performing joint calibration of the lidar and the binocular camera to obtain a transformation matrix;

step two: sensing the surrounding environment with the lidar and the binocular camera to obtain point-cloud depth information and image information;

step three: performing point feature extraction and line feature extraction on the image information to obtain feature points and feature lines;

step four: projecting the point-cloud depth information through the transformation matrix to obtain a depth image; using the depth image to correct the depth estimates of the feature points; establishing constraint relations from the corrected feature points and the feature lines; and estimating the initial pose of the binocular camera from the established constraints;

step five: fusing the corrected feature points and feature lines, constructing a point-line error model from the fused features, and optimizing the pose of the binocular camera with the point-line error model;

step six: repeating steps two to five, estimating the pose transformation of consecutive frames, performing loop detection and relocalization, and finally completing positioning and mapping.

2. The mine robot positioning and mapping method based on the lidar and the binocular camera according to claim 1, wherein the joint calibration of the lidar and the binocular camera comprises the following steps:

firstly calibrating the intrinsic parameters of the binocular camera, and then calibrating the camera-lidar extrinsics with a checkerboard and an automatic calibration tool to obtain the transformation between the lidar and the binocular camera.

3. The mine robot positioning and mapping method based on the lidar and the binocular camera as claimed in claim 1, wherein correcting the depth estimates of the feature points with the depth image comprises the following steps:

firstly selecting all lidar point-cloud points within a 10×10 region around the feature point; then fitting several planes to all the points in that region; computing the distance from the feature point to each plane; and selecting the plane with the minimum distance and projecting along the ray through the optical center onto it; the resulting point gives the depth estimate.

4. The mine robot positioning and mapping method based on the lidar and the binocular camera according to claim 1, wherein the point feature extraction adopts the ORB algorithm.

5. The mine robot positioning and mapping method based on the lidar and the binocular camera as claimed in claim 1, wherein the line feature extraction adopts an improved LSD algorithm that uses length suppression, broken-line splicing, and near-line merging strategies to optimize the extracted features.

6. The mine robot positioning and mapping method based on the lidar and the binocular camera according to claim 1, wherein the point-line error model is constructed as:

C = Σ_{k,i} ρ_p(‖e_pk,i‖²) + Σ_{k,j} ρ_l(‖e_lk,j‖²)

where ρ_p, ρ_l denote Huber robust kernel functions, e_pk,i the error of a point, e_lk,j the error of a line, i the i-th feature point, j the j-th feature line, and k the k-th key frame.

7. The mine robot positioning and mapping method based on the lidar and the binocular camera according to claim 6, wherein e_pk,i is expressed as:

e_pk,i = p_k,i − π(R_k X_i + t_k)

where p_k,i is the i-th point of the k-th frame image, R_k is the rotation matrix taking 3D points into the image frame, t_k is the translation vector, X_i is the i-th point in the map, and π(·) denotes projection onto the image.

8. The mine robot positioning and mapping method based on the lidar and the binocular camera according to claim 6, wherein e_lk,j is expressed as:

e_lk,j = ( d(m_j,k, l_c), d(n_j,k, l_c) )ᵀ

where d(·, l_c) is the algebraic distance of a matched endpoint to the projected line l_c.

9. The mine robot positioning and mapping method based on the lidar and the binocular camera according to claim 1, wherein the method further comprises rejecting short line segments before the line feature extraction; a segment is retained only if it satisfies the short-segment rejection criterion:

len_i ≥ len_min = ω × min(W_I, H_I), i = 1, …, n

where W_I, H_I denote the width and height of the image, ω is a scale factor, and len_i is the length of the i-th line segment.

10. The mine robot positioning and mapping method based on the lidar and the binocular camera according to claim 1, wherein the loop detection comprises the following steps:

firstly constructing a point-line integrated visual dictionary and training it offline; converting each inserted key frame into a bag-of-words vector with the trained dictionary; building an online database from these vectors and storing it as an inverted index; retrieving through the inverted index all key frames containing a given visual word; and then, whenever a key frame in the map shares words with the current frame of the binocular camera, computing the similarity score of their point-line features:

s = α·s_p + β·s_l

where α + β = 1 and s_p, s_l are the point-feature and line-feature similarities of the two frames, respectively; finally, temporal consistency, geometric, and continuity checks are performed to obtain the final loop-closure result.

Technical Field

The invention relates to the technical field of machine vision, and in particular to a mine robot positioning and mapping method based on a lidar and a binocular camera.

Background

According to statistics, coal is the most important energy source in China's energy production. Mining conditions in China are relatively difficult, yet mining technology remains quite limited: coal miners largely work manually, and the harsh underground environment poses great danger to life, so once an accident occurs the losses are immeasurable. Mobile robots can take over some work in such special scenarios; in mine environments in particular, robots can replace people in difficult operations such as digging and transporting coal. For a coal-mine robot, however, a crucial technology is navigation and positioning: the robot must have good localization and mapping capability to plan routes accurately and avoid obstacles. Because illumination and object texture in the environment degrade visual depth estimation, purely visual positioning and mapping suffers large errors.

Disclosure of Invention

The purpose of the invention is to address the inaccurate estimation of visual depth in mine environments in the prior art by providing a mine robot positioning and mapping method based on a lidar and a binocular camera.

The technical scheme adopted by the invention to solve the technical problems is as follows:

A mine robot positioning and mapping method based on a lidar and a binocular camera comprises the following steps:

step one: performing joint calibration of the lidar and the binocular camera to obtain a transformation matrix;

step two: sensing the surrounding environment with the lidar and the binocular camera to obtain point-cloud depth information and image information;

step three: performing point feature extraction and line feature extraction on the image information to obtain feature points and feature lines;

step four: projecting the point-cloud depth information through the transformation matrix to obtain a depth image; using the depth image to correct the depth estimates of the feature points; establishing constraint relations from the corrected feature points and the feature lines; and estimating the initial pose of the binocular camera from the established constraints;

step five: fusing the corrected feature points and feature lines, constructing a point-line error model from the fused features, and optimizing the pose of the binocular camera with the point-line error model;

step six: repeating steps two to five, estimating the pose transformation of consecutive frames, performing loop detection and relocalization, and finally completing positioning and mapping.

Further, the joint calibration of the lidar and the binocular camera comprises the following steps:

firstly calibrating the intrinsic parameters of the binocular camera, and then calibrating the camera-lidar extrinsics with a checkerboard and an automatic calibration tool to obtain the transformation between the lidar and the binocular camera.

Further, correcting the depth estimates of the feature points with the depth image comprises the following steps:

firstly selecting all lidar point-cloud points within a 10×10 region around the feature point; then fitting several planes to all the points in that region; computing the distance from the feature point to each plane; and selecting the plane with the minimum distance and projecting along the ray through the optical center onto it; the resulting point gives the depth estimate.

Further, the point feature extraction adopts an ORB algorithm.

Further, the line feature extraction adopts an improved LSD algorithm that uses length suppression, broken-line splicing, and near-line merging strategies to optimize the extracted features.

Further, the point-line error model is constructed as:

C = Σ_{k,i} ρ_p(‖e_pk,i‖²) + Σ_{k,j} ρ_l(‖e_lk,j‖²)

where ρ_p, ρ_l denote Huber robust kernel functions, e_pk,i the error of a point, e_lk,j the error of a line, i the i-th feature point, j the j-th feature line, and k the k-th key frame.

Further, e_pk,i is expressed as:

e_pk,i = p_k,i − π(R_k X_i + t_k)

where p_k,i is the i-th point of the k-th frame image, R_k is the rotation matrix taking 3D points into the image frame, t_k is the translation vector, X_i is the i-th point in the map, and π(·) denotes projection onto the image.

Further, e_lk,j is expressed as:

e_lk,j = ( d(m_j,k, l_c), d(n_j,k, l_c) )ᵀ

where d(·, l_c) is the algebraic distance of a matched endpoint to the projected line l_c.

further, the line feature extraction also comprises a step of eliminating short line segment features before the line feature extraction, wherein the short line segment features are based on a short line segment elimination criterion, and the short line segment elimination criterion is as follows:

leni≥(lenmin=ω×min(WI,HI))i=1....n

WI,HIdenotes the size of the image, ω denotes the scale factor, leniThe ith line segment is represented.

Further, the loop detection comprises the following steps:

firstly constructing a point-line integrated visual dictionary and training it offline; converting each inserted key frame into a bag-of-words vector with the trained dictionary; building an online database from these vectors and storing it as an inverted index; retrieving through the inverted index all key frames containing a given visual word; and then, whenever a key frame in the map shares words with the current frame of the binocular camera, computing the similarity score of their point-line features:

s = α·s_p + β·s_l

where α + β = 1 and s_p, s_l are the point-feature and line-feature similarities of the two frames, respectively; finally, temporal consistency, geometric, and continuity checks are performed to obtain the final loop-closure result.

The invention has the beneficial effects that:

the method solves the problem that the traditional method for positioning and mapping the mobile robot in the mine is poor in positioning and mapping capacity, and can effectively and reliably estimate the mileage information of the mobile robot by utilizing sensor fusion and point-line characteristic processing.

Owing to the particularity of the mine environment, poor illumination or missing environmental texture underground causes inaccurate depth estimation and feature scarcity, which in turn make pose estimation inaccurate or lose tracking entirely. Lidar, meanwhile, has weak relocalization ability and is ill-suited to long, straight tunnels. For these situations, the application supplements the scarce point features with line features and uses sensor fusion to overcome the problems any single sensor encounters, achieving more stable, higher-precision positioning and mapping and resolving the large errors of vision-only or lidar-only positioning and mapping in mine environments.

Drawings

FIG. 1 is an overall flow chart of the present application;

FIG. 2 is a schematic view of depth correction according to the present application;

FIG. 3 is a schematic view of an observation model of a spatial straight line according to the present application;

FIG. 4 is a schematic diagram of the point-line integrated graph model of the present application.

Detailed Description

It should be noted that the embodiments disclosed in the present application may be combined with each other as long as there is no conflict.

The first embodiment: referring to fig. 1, the mine robot positioning and mapping method based on a lidar and a binocular camera of this embodiment comprises the following steps:

step one: performing joint calibration of the lidar and the binocular camera to obtain a transformation matrix;

step two: sensing the surrounding environment with the lidar and the binocular camera to obtain point-cloud depth information and image information;

step three: performing point feature extraction and line feature extraction on the image information to obtain feature points and feature lines;

step four: projecting the point-cloud depth information through the transformation matrix to obtain a depth image; using the depth image to correct the depth estimates of the feature points; establishing constraint relations from the corrected feature points and the feature lines; and estimating the initial pose of the binocular camera from the established constraints;

step five: fusing the corrected feature points and feature lines, constructing a point-line error model from the fused features, and optimizing the pose of the binocular camera with the point-line error model (in fig. 1, this corresponds to the adjacent-frame matching performed by the binocular camera);

step six: repeating steps two to five, estimating the pose transformation of consecutive frames, performing loop detection and relocalization, and finally completing positioning and mapping.

Lidar is scarcely affected by factors such as illumination and texture, but it struggles to complete loop detection and relocalization, so the two sensors are fused for positioning and mapping. Moreover, mines contain a large number of supports such as hydraulic props; these have distinct line features that complement the scarce point features well and increase positioning and mapping accuracy.

The second embodiment: this embodiment further describes the first embodiment; the difference is in the joint calibration of the lidar and the binocular camera, which comprises the following steps:

firstly calibrating the intrinsic parameters of the binocular camera, and then calibrating the camera-lidar extrinsics with a checkerboard and an automatic calibration tool to obtain the transformation between the lidar and the binocular camera.

The third embodiment: this embodiment further describes the first embodiment; the difference is in the correction of the feature-point depths with the depth image, which comprises the following steps:

firstly selecting all lidar point-cloud points within a 10×10 region around the feature point; then fitting several planes to all the points in that region; computing the distance from the feature point to each plane; and selecting the plane with the minimum distance and projecting along the ray through the optical center onto it; the resulting point gives the depth estimate.

The fourth embodiment: this embodiment further describes the first embodiment; the difference is that the point feature extraction adopts the ORB algorithm.

The fifth embodiment: this embodiment further describes the first embodiment; the difference is that the line feature extraction adopts an improved LSD algorithm that uses length suppression, broken-line splicing, and near-line merging strategies to optimize the extracted features.

The sixth embodiment: this embodiment further describes the first embodiment; the difference is that the point-line error model is constructed as:

C = Σ_{k,i} ρ_p(‖e_pk,i‖²) + Σ_{k,j} ρ_l(‖e_lk,j‖²)

where ρ_p, ρ_l denote Huber robust kernel functions, e_pk,i the error of a point, e_lk,j the error of a line, i the i-th feature point, j the j-th feature line, and k the k-th key frame.

The seventh embodiment: this embodiment further describes the sixth embodiment; the difference is that e_pk,i is expressed as:

e_pk,i = p_k,i − π(R_k X_i + t_k)

where p_k,i is the i-th point of the k-th frame image, R_k is the rotation matrix taking 3D points into the image frame, t_k is the translation vector, X_i is the i-th point in the map, and π(·) denotes projection onto the image.

The eighth embodiment: this embodiment further describes the sixth embodiment; the difference is that e_lk,j is expressed as:

e_lk,j = ( d(m_j,k, l_c), d(n_j,k, l_c) )ᵀ

where d(·, l_c) is the algebraic distance of a matched endpoint to the projected line l_c.

the specific implementation method nine: the present embodiment is a further description of the first specific embodiment, and the difference between the present embodiment and the first specific embodiment is that the step of removing short-segment features is further included before the line feature extraction, the short-segment features are based on a short-segment removal criterion, and the short-segment removal criterion is:

leni≥(lenmin=ω×min(WI,HI))i=1....n

WI,HIdenotes the size of the image, ω denotes the scale factor, leniThe ith line segment is represented.

The tenth embodiment: this embodiment further describes the first embodiment; the difference is that the loop detection comprises the following steps:

firstly constructing a point-line integrated visual dictionary and training it offline; converting each inserted key frame into a bag-of-words vector with the trained dictionary; building an online database from these vectors and storing it as an inverted index; retrieving through the inverted index all key frames containing a given visual word; and then, whenever a key frame in the map shares words with the current frame of the binocular camera, computing the similarity score of their point-line features:

s = α·s_p + β·s_l

where α + β = 1 and s_p, s_l are the point-feature and line-feature similarities of the two frames, respectively; finally, temporal consistency, geometric, and continuity checks are performed to obtain the final loop-closure result.

Example:

Step one: perform joint calibration of the lidar and the binocular vision system to obtain the transformation between their coordinate systems. Calibration uses the Calibration Toolkit module of Autoware and requires a checkerboard with 9 rows and 7 columns; it proceeds in two steps: first calibrate the camera intrinsics, then calibrate the camera-lidar extrinsics.
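As a minimal illustrative sketch (not code from the application), the calibrated extrinsics and intrinsics can be used to project lidar points into the camera image; the function name, the 4×4 extrinsic convention T_cam_lidar, and the intrinsic matrix K are assumptions:

```python
import numpy as np

def project_lidar_to_image(points_lidar, T_cam_lidar, K):
    """Project Nx3 lidar points into the image using the calibrated
    extrinsics T_cam_lidar (4x4, lidar -> camera) and intrinsics K (3x3)."""
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])  # homogeneous coordinates
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]          # transform to camera frame
    in_front = pts_cam[:, 2] > 0                        # keep points ahead of the camera
    pts_cam = pts_cam[in_front]
    uvw = (K @ pts_cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]                       # perspective division
    return uv, pts_cam[:, 2]                            # pixel coordinates and depths
```

The returned depths are what populate the depth image used in step three.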

Step two: start the system and run the lidar and the binocular camera to acquire data; the lidar collects point-cloud depth information and the camera collects image information. The images are preprocessed; point features are extracted with the ORB algorithm and line features with an improved LSD algorithm. The improved LSD algorithm optimizes the extracted line features with strategies such as length suppression, broken-line splicing, and near-line merging.

Too many short segments not only increase the computational cost of line detection and matching but also raise the probability of mismatches; long segments are relatively more stable, easier to detect, and contribute more to pose estimation. A short-segment rejection criterion is therefore established:

len_i ≥ len_min = ω × min(W_I, H_I), i = 1, …, n

W_I, H_I denote the width and height of the image, and ω is a scale factor chosen so that segments suitably long for the image size are retained. Furthermore, the LSD algorithm often splits a long segment into several shorter ones, so some elongated edges are detected repeatedly. Such near-duplicate segments are usually of low quality and increase the uncertainty of feature detection; the LSD algorithm is therefore improved with near-line merging and broken-line connection: a threshold on the angle difference between segments decides which to merge, and least-squares fitting produces the new feature segment.
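The short-segment rejection criterion above can be sketched as a simple filter; the function name and the default value of ω are illustrative assumptions:

```python
import math

def reject_short_segments(segments, w_img, h_img, omega=0.05):
    """Keep only segments satisfying len_i >= omega * min(W_I, H_I).
    segments: list of ((x1, y1), (x2, y2)) endpoint pairs."""
    len_min = omega * min(w_img, h_img)          # len_min = w x min(W_I, H_I)
    kept = []
    for (x1, y1), (x2, y2) in segments:
        if math.hypot(x2 - x1, y2 - y1) >= len_min:
            kept.append(((x1, y1), (x2, y2)))
    return kept
```

For a 640×480 image with ω = 0.05, segments shorter than 24 pixels would be discarded.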

Step three: project the point cloud into a depth image using the lidar-to-camera transformation matrix obtained by calibration, and then estimate the depth of each feature point. As shown in fig. 2, the estimation proceeds as follows: first select all lidar points within a specific region around the feature point; then fit several planes to those points; compute the distance from the feature point to each plane; select the plane with the minimum distance and project along the ray through the optical center onto it; the resulting point gives a relatively accurate depth estimate.
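A hedged sketch of the plane-based depth correction described above, assuming a least-squares plane fit via SVD and a ray through the optical center; for brevity a single plane is fitted here, whereas the application fits several planes and keeps the nearest one. Function names are illustrative:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through Nx3 points: returns a unit normal n and a
    point c on the plane, so the plane is n . (x - c) = 0."""
    c = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - c)
    return vt[-1], c    # normal = right singular vector of the smallest singular value

def depth_from_plane(ray_dir, normal, c):
    """Intersect the ray x = s * ray_dir from the optical center with the plane;
    return s, the corrected depth along the ray (None if the ray is parallel)."""
    denom = normal @ ray_dir
    if abs(denom) < 1e-9:
        return None
    return (normal @ c) / denom
```
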

According to the principle of point-line triangulation, constraint relations are established between spatial points and spatial lines and their 2D projections in the image plane. The point-line features between tracked frames are thereby accurately matched, and the initial pose of the camera is estimated.

Step four: fuse the point-line features, construct the point-line error model, and then optimize the poses and landmarks to estimate the camera pose. The procedure is as follows:

(1) Establishing the error model of spatial points

Let X_i be the i-th point in the map; for the k-th key frame, its coordinates on the image are:

p_k,i = K T_k X_i

where K is the camera intrinsic matrix and T_k = [R_k | t_k] is the rotation-translation matrix of frame k.

The error can be expressed as:

e_pk,i = p_k,i − π(R_k X_i + t_k)

where π(·) denotes projection onto the image.
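The point reprojection error just defined can be written compactly as follows; the function name and the use of an intrinsic matrix K inside π(·) are assumptions for illustration:

```python
import numpy as np

def point_reprojection_error(p_obs, X, R, t, K):
    """e_pk,i = p_k,i - pi(R_k X_i + t_k): observed pixel minus the projection
    of map point X into frame k with rotation R, translation t, intrinsics K."""
    Xc = R @ X + t            # map point expressed in the camera frame
    uvw = K @ Xc
    proj = uvw[:2] / uvw[2]   # pi(.): perspective projection to pixel coordinates
    return p_obs - proj
```
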

(2) Establishing the error model of spatial lines

The observation model of a spatial line adopts the higher-precision 3D-2D approach: the spatial line is projected into the image, and the error between the projected line and the matched line in the image is computed. Analogously to the transformation of a 3D point from the world coordinate system to the image coordinate system, the line L_w is transformed into the camera coordinate system and denoted L_c. In Plücker coordinates:

L_c = H_cw L_w, H_cw = [ R_cw (t_cw)^ R_cw ; 0 R_cw ]

In the formula above, H_cw is the line transformation matrix, composed of the rotation matrix R_cw and the translation vector t_cw, and (t_cw)^ is the antisymmetric matrix of the translation vector. Writing L_c = (n_c; d_c), projecting the line L_c onto the image plane yields the line l_c = K_L n_c, where K_L is the line projection matrix.

The spatial line L_w projected into the current frame's image coordinate system gives l_c, which is matched against the line feature l_c* in the current frame image. With the algebraic distances d_m_j,k, d_n_j,k from the two endpoints m_j,k, n_j,k of the matched segment to the projected line, the error model of the line features is expressed as:

e_lk,j = ( d(m_j,k, l_c), d(n_j,k, l_c) )ᵀ

(2) establishing a point-line comprehensive error model

The error of a point is expressed as:

e_pk,i = p_k,i − π(R_k X_i + t_k)

The error of a line is expressed as:

e_lk,j = ( d(m_j,k, l_c), d(n_j,k, l_c) )ᵀ

The solution is obtained by BA (bundle adjustment) optimization, and the cost can be written as:

C = Σ_{k,i} ρ_p(‖e_pk,i‖²) + Σ_{k,j} ρ_l(‖e_lk,j‖²)

where ρ_p, ρ_l are Huber robust kernel functions. The Huber function is introduced to reduce the influence of outliers in the error function, since mismatches could otherwise drive the optimization to wrong values. The error function E is then linearized, and its Jacobian matrix J with respect to the state variables is computed.
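A minimal sketch of the Huber robust kernel ρ applied to a squared error s = ‖e‖²; the threshold δ = 1 is an illustrative default:

```python
def huber(s, delta=1.0):
    """Huber robust kernel on a squared error s = ||e||^2:
    quadratic near zero, linear for large errors, which damps outliers."""
    if s <= delta * delta:
        return s                                   # inlier region: unchanged
    return 2.0 * delta * (s ** 0.5) - delta * delta  # outlier region: linear growth
```

The kernel is continuous at s = δ², so switching between the two regimes does not introduce jumps into the cost.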

Step five: construct a point-line integrated visual dictionary; using the dictionary obtained by offline training, convert each inserted key frame into a bag-of-words vector; build an online database from these vectors, stored as an inverted index, through which all key frames containing a given visual word can be retrieved quickly. Then, whenever a key frame in the map and the current camera frame share vocabulary, compute the similarity score of their point-line features; the point and line scores are summed with certain weights:

s = α·s_p + β·s_l

where α + β = 1 and s_p, s_l are the point-feature and line-feature similarities of the two frames, respectively. Finally, temporal consistency, geometric, and continuity checks are performed to obtain the final loop-closure result.
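The weighted similarity s = α·s_p + β·s_l can be sketched as follows, assuming cosine similarity between bag-of-words vectors; the weight α = 0.6 is an arbitrary illustrative value, not one given by the application:

```python
import numpy as np

def combined_similarity(v_point_a, v_point_b, v_line_a, v_line_b, alpha=0.6):
    """s = alpha * s_p + beta * s_l with alpha + beta = 1, where s_p and s_l
    are cosine similarities between point and line bag-of-words vectors."""
    beta = 1.0 - alpha                             # enforce alpha + beta = 1
    cos = lambda u, v: float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    return alpha * cos(v_point_a, v_point_b) + beta * cos(v_line_a, v_line_b)
```
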

Repeat steps two to five, estimating the pose transformation of consecutive frames, and finally complete positioning and mapping through loop-closure detection and relocalization.

It should be noted that the detailed description serves only to explain the technical solution of the invention and does not limit the scope of the claims; all modifications and variations falling within the claims and the description are intended to be included within the scope of the invention.
