Loop detection method based on RGB-D camera

Document No.: 551883    Publication date: 2021-05-14

Abstract: This invention, "Loop detection method based on RGB-D camera" (一种基于RGB-D相机的回环检测方法), was created by 刘屿, 潘文钊, 蔡鹤, 何畅然, 刘涛 and 胡国强 on 2021-02-04. The invention discloses a loop detection method based on an RGB-D camera. The method first divides the current frame into two regions and then performs bag-of-words-based image similarity matching on each region separately to find the several loop candidate frames most similar to each region; the two loop candidate frames finally used for loop detection are then obtained by building a structural matrix between each region and its loop candidate frames. The invention effectively solves the problem of loop detection failure in traditional loop detection methods caused by a small overlapping area between images. In addition, the invention performs loop detection for the RGB-D camera according to both the similarity of the feature points of the current frame and the topological structure among those feature points, so it can effectively detect loop frames whose feature points are similar to those of the current frame and whose feature point topology is also similar.

1. A loop detection method based on an RGB-D camera, characterized by comprising the following steps:

S1, extracting ORB feature points from the RGB image of the current frame acquired by the RGB-D camera, and dividing the current frame into 2 rectangular regions, wherein the length of each region is equal to 55-65% of the length of the current frame and the width is equal to 95-100% of the width of the current frame;

S2, calculating the bag-of-words vector of each region of the current frame, wherein v_k is the bag-of-words vector of the k-th region of the current frame, the value range of k is {1, 2}, n_k^i is the number of feature points of the i-th class in the k-th region, N is the number of feature point classes, and w_k^i is the bag-of-words weight of the i-th class feature of the k-th region of the current frame, expressed as:

w_k^i = n_k^i / n_k

wherein n_k is the number of feature points of the k-th region of the current frame;

S3, calculating the image similarity between the k-th region of the current frame and each key frame from their bag-of-words vectors, wherein v_ki is the i-th component of the bag-of-words vector corresponding to the k-th region of the current frame, v_j is the bag-of-words vector corresponding to the j-th key frame, v_ji is the i-th component of v_j, and s(k, j) is the image similarity between the k-th region of the current frame and the j-th key frame;

S4, finding the N_s key frames with the greatest image similarity to the k-th region of the current frame, which are marked as loop candidate frames, wherein N_s is an integer greater than 1;

S5, calculating the structural similarity between the k-th region of the current frame and each of its corresponding N_s loop candidate frames, finding the loop candidate frame with the greatest structural similarity to the k-th region of the current frame, and recording the loop candidate frames obtained for the 2 regions as the final loop frames;

S6, matching each region of the current frame with its corresponding final loop frame, re-projecting the map points corresponding to the feature points of the final loop frame onto the corresponding region of the current frame, and calculating the pose of the current frame by minimizing the re-projection error e_all, wherein T_cw is the final pose of the current frame of the RGB-D camera, K is the intrinsic parameter matrix of the RGB-D camera, p_{k,i} is the i-th feature point of the k-th region of the current frame, z_{k,i} is the map point corresponding to feature point p_{k,i}, and N_k is the number of map points in the k-th region of the current frame;

S7, minimizing the re-projection error e_all to obtain the final pose T_cw of the current frame of the RGB-D camera.

2. The loop detection method based on an RGB-D camera as claimed in claim 1, wherein the procedure of step S5 is as follows:

S51, in the comparison between the current frame and the s-th loop candidate frame, wherein the value range of s is {1, 2, ..., N_s}: calculating the Hamming distance between the c-th feature point of the i-th class in the k-th region of the current frame and each feature point of the i-th class of the s-th loop candidate frame, and finding the feature point f_s of the s-th loop candidate frame with the minimum Hamming distance to it, wherein the value range of c is {1, 2, ..., n_k^i} and n_k^i is the number of feature points of the i-th class in the k-th region of the current frame;

S52, calculating the Hamming distance between feature point f_s and each feature point of the i-th class in the k-th region of the current frame; if the minimum of these Hamming distances is attained at the same c-th feature point used in step S51, recording f_s and that feature point as common feature points;

S53, establishing the structural matrix between the k-th region of the current frame and each corresponding loop candidate frame: in the comparison between each region of the current frame and the s-th loop candidate frame, the number of rows and the number of columns of the structural matrix M(k, s) of the k-th region of the current frame are both equal to the number of common feature points between the k-th region of the current frame and the s-th loop candidate frame, and the element in row i and column j of M(k, s) is expressed as:

M(k, s)_{i,j} = |d_k(i, j) - d_s(i, j)|

wherein d_k(i, j) is the distance between the i-th common feature point and the j-th common feature point of the k-th region of the current frame, expressed as:

d_k(i, j) = ||P_k^i - P_k^j||

wherein P_k^i is the three-dimensional coordinate of the i-th common feature point of the k-th region of the current frame; d_s(i, j) is the distance between the i-th common feature point and the j-th common feature point of the s-th loop candidate frame, expressed as:

d_s(i, j) = ||P_s^i - P_s^j||

wherein P_s^i is the three-dimensional coordinate of the i-th common feature point of the s-th loop candidate frame;

S54, traversing all elements of the structural matrix M(k, s), and putting the row index and the column index of every element larger than T_D into the row set set_row and the column set set_col respectively, wherein set_row and set_col are initially empty sets and T_D is a constant greater than 0;

S55, setting all elements of the structural matrix M(k, s) to 1;

S56, setting to 0 every element of the structural matrix M(k, s) whose row index belongs to set_row, and setting to 0 every element of M(k, s) whose column index belongs to set_col;

S57, recording the sum of all elements of the structural matrix M(k, s) as the loop weight between the k-th region of the current frame and the s-th loop candidate frame, finding the loop candidate frame with the largest loop weight for the k-th region of the current frame, and recording it as the final loop frame of the k-th region.

3. The loop detection method based on an RGB-D camera as claimed in claim 1, wherein in step S1 each rectangular region has a length equal to 60% of the length of the current frame and a width equal to 100% of the width of the current frame.

Technical Field

The invention relates to the technical field of computer vision, and in particular to a loop detection method based on an RGB-D camera.

Background

Over the years, SLAM (simultaneous localization and mapping) technology has matured and been successfully applied in many areas. As a key component of SLAM, loop detection allows the camera to recognize a place it has already visited, thereby eliminating accumulated error and reducing the positioning error.

Traditional loop detection methods generally use bag-of-words vectors as the measure of image similarity: the more similar the bag-of-words vectors of two images, the higher their image similarity. Using bag-of-words vectors to measure image similarity is computationally cheap, but a bag-of-words vector measures the similarity between images only by the classes and numbers of their feature points and ignores the topological structure among the feature points. Therefore, if only bag-of-words vectors are used for loop detection, an image whose feature point classes and counts are similar to those of the current frame but whose feature point topology differs greatly may be taken as the loop frame of the current frame, causing loop detection to fail. In addition, traditional loop detection generally matches candidate frames against the current camera frame as a whole; if the image overlap between a candidate frame and the current frame is too small, loop detection is likely to fail.

Disclosure of Invention

The present invention aims to overcome the above-mentioned shortcomings of the prior art and provides a loop detection method based on an RGB-D camera, which not only effectively solves the problem of loop detection failure caused by a small overlapping area between images in traditional loop detection methods, but also effectively detects loop frames whose feature points are similar to those of the current frame and whose topological structure among feature points is also similar.

The purpose of the invention can be achieved by adopting the following technical scheme:

a loopback detection method based on an RGB-D camera comprises the following steps:

S1, extracting ORB feature points from the RGB image of the current frame acquired by the RGB-D camera, and dividing the current frame into 2 rectangular regions, wherein the length of each region is equal to 55-65% of the length of the current frame and the width is equal to 95-100% of the width of the current frame;

S2, calculating the bag-of-words vector of each region of the current frame, wherein v_k is the bag-of-words vector of the k-th region of the current frame, the value range of k is {1, 2}, n_k^i is the number of feature points of the i-th class in the k-th region, N is the number of feature point classes, and w_k^i is the bag-of-words weight of the i-th class feature of the k-th region of the current frame, expressed as:

w_k^i = n_k^i / n_k

wherein n_k is the number of feature points of the k-th region of the current frame;

S3, calculating the image similarity between the k-th region of the current frame and each key frame from their bag-of-words vectors, wherein v_ki is the i-th component of the bag-of-words vector corresponding to the k-th region of the current frame, v_j is the bag-of-words vector corresponding to the j-th key frame, v_ji is the i-th component of v_j, and s(k, j) is the image similarity between the k-th region of the current frame and the j-th key frame;

S4, finding the N_s key frames with the greatest image similarity to the k-th region of the current frame, which are marked as loop candidate frames, wherein N_s is an integer greater than 1;

S5, calculating the structural similarity between the k-th region of the current frame and each of its corresponding N_s loop candidate frames, finding the loop candidate frame with the greatest structural similarity to the k-th region of the current frame, and recording the loop candidate frames obtained for the 2 regions as the final loop frames;

S6, matching each region of the current frame with its corresponding final loop frame, re-projecting the map points corresponding to the feature points of the final loop frame onto the corresponding region of the current frame, and calculating the pose of the current frame by minimizing the re-projection error e_all, wherein T_cw is the final pose of the current frame of the RGB-D camera, K is the intrinsic parameter matrix of the RGB-D camera, p_{k,i} is the i-th feature point of the k-th region of the current frame, z_{k,i} is the map point corresponding to feature point p_{k,i}, and N_k is the number of map points in the k-th region of the current frame;

S7, minimizing the re-projection error e_all to obtain the final pose T_cw of the current frame of the RGB-D camera.

Further, the step S5 process is as follows:

S51, in the comparison between the current frame and the s-th loop candidate frame, wherein the value range of s is {1, 2, ..., N_s}: calculating the Hamming distance between the c-th feature point of the i-th class in the k-th region of the current frame and each feature point of the i-th class of the s-th loop candidate frame, and finding the feature point f_s of the s-th loop candidate frame with the minimum Hamming distance to it, wherein the value range of c is {1, 2, ..., n_k^i} and n_k^i is the number of feature points of the i-th class in the k-th region of the current frame;

S52, calculating the Hamming distance between feature point f_s and each feature point of the i-th class in the k-th region of the current frame; if the minimum of these Hamming distances is attained at the same c-th feature point used in step S51, recording f_s and that feature point as common feature points;

S53, establishing the structural matrix between the k-th region of the current frame and each corresponding loop candidate frame: in the comparison between each region of the current frame and the s-th loop candidate frame, the number of rows and the number of columns of the structural matrix M(k, s) of the k-th region of the current frame are both equal to the number of common feature points between the k-th region of the current frame and the s-th loop candidate frame, and the element in row i and column j of M(k, s) is expressed as:

M(k, s)_{i,j} = |d_k(i, j) - d_s(i, j)|

wherein d_k(i, j) is the distance between the i-th common feature point and the j-th common feature point of the k-th region of the current frame, expressed as:

d_k(i, j) = ||P_k^i - P_k^j||

wherein P_k^i is the three-dimensional coordinate of the i-th common feature point of the k-th region of the current frame; d_s(i, j) is the distance between the i-th common feature point and the j-th common feature point of the s-th loop candidate frame, expressed as:

d_s(i, j) = ||P_s^i - P_s^j||

wherein P_s^i is the three-dimensional coordinate of the i-th common feature point of the s-th loop candidate frame;

S54, traversing all elements of the structural matrix M(k, s), and putting the row index and the column index of every element larger than T_D into the row set set_row and the column set set_col respectively, wherein set_row and set_col are initially empty sets and T_D is a constant greater than 0;

S55, setting all elements of the structural matrix M(k, s) to 1;

S56, setting to 0 every element of the structural matrix M(k, s) whose row index belongs to set_row, and setting to 0 every element of M(k, s) whose column index belongs to set_col;

S57, recording the sum of all elements of the structural matrix M(k, s) as the loop weight between the k-th region of the current frame and the s-th loop candidate frame, finding the loop candidate frame with the largest loop weight for the k-th region of the current frame, and recording it as the final loop frame of the k-th region.

Further, the length of the rectangular area in step S1 is equal to 60% of the length of the current frame, and the width is equal to 100% of the width of the current frame.

Compared with the prior art, the invention has the following advantages and effects:

(1) Traditional loop detection techniques based on the bag-of-words model generally match candidate frames against the current camera frame as a whole; if the image overlap between a candidate frame and the current frame is too small, loop detection easily fails. Compared with the traditional bag-of-words-based loop detection technique, the present method divides the current frame into two regions, performs bag-of-words-based image similarity matching on each region separately, and finds the several loop candidate frames most similar to each region, which effectively solves the problem of loop detection failure caused by a small overlapping area between images in traditional loop detection methods.

(2) Traditional loop detection methods generally use bag-of-words vectors as the measure of image similarity: the more similar the bag-of-words vectors of two images, the higher their image similarity. However, such methods measure the similarity between images only by the classes and numbers of their feature points and neglect the topological structure among the feature points, so an image whose feature point classes and counts are similar to those of the current frame but whose feature point topology differs greatly may be taken as the loop frame of the current frame, causing loop detection to fail. Compared with the traditional bag-of-words-based loop detection technique, the present method establishes a structural matrix between each region and its loop candidate frames according to the topological relations among the feature points, and can therefore effectively detect loop frames whose feature points are similar to those of the current frame and whose feature point topology is also similar.

Drawings

FIG. 1 is a flow chart of a method for detecting a loop based on an RGB-D camera according to the present invention;

FIG. 2 is a diagram illustrating all ORB feature points in a reference scenario in an embodiment of the present invention;

FIG. 3 is a schematic diagram of region segmentation in a reference scenario according to an embodiment of the present invention;

FIG. 4 is a flow chart of a method for selecting a final loop frame in an embodiment of the invention;

fig. 5 is a schematic diagram of a loopback detection result in a certain reference scenario in the embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Examples

As shown in FIG. 1, the present embodiment discloses a loop detection method based on an RGB-D camera, which includes the following steps:

S1, extracting ORB feature points from the RGB image of the current frame acquired by the RGB-D camera (as shown in FIG. 2), and dividing the current frame into 2 rectangular regions (as shown in FIG. 3), wherein the length of each region is equal to 55-65% of the length of the current frame and the width is equal to 95-100% of the width of the current frame; the purpose of dividing the current frame into 2 regions is to prevent loop detection from failing because the image overlap between a candidate frame and the current frame is too small;
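
As an illustration of the region split in step S1, the following is a minimal Python sketch (not part of the patent text), assuming that "length" refers to the horizontal image dimension and using OpenCV's ORB detector for the per-region feature extraction:

```python
import cv2
import numpy as np

def split_into_regions(rgb, length_ratio=0.60, width_ratio=1.00):
    # Assumption: "length" is the horizontal image dimension and "width" the
    # vertical one; the two regions are anchored at the left and right edges
    # of the frame, so they overlap in the middle.
    h, w = rgb.shape[:2]
    region_w = int(round(w * length_ratio))   # 55-65% of the frame length
    region_h = int(round(h * width_ratio))    # 95-100% of the frame width
    top = (h - region_h) // 2
    left = rgb[top:top + region_h, :region_w]
    right = rgb[top:top + region_h, w - region_w:]
    return left, right

frame = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for an RGB frame
orb = cv2.ORB_create()
for region in split_into_regions(frame):
    gray = cv2.cvtColor(region, cv2.COLOR_BGR2GRAY)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
```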

S2, calculating the bag-of-words vector of each region of the current frame, wherein v_k is the bag-of-words vector of the k-th region of the current frame, the value range of k is {1, 2}, n_k^i is the number of feature points of the i-th class in the k-th region, N is the number of feature point classes, and w_k^i is the bag-of-words weight of the i-th class feature of the k-th region of the current frame, expressed as:

w_k^i = n_k^i / n_k

wherein n_k is the number of feature points of the k-th region of the current frame;
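
A minimal sketch of step S2, assuming each ORB feature has already been quantized to a visual-word index (its feature class) against a pre-trained vocabulary, and assuming the term-frequency weighting w_k^i = n_k^i / n_k given above:

```python
import numpy as np

def bow_vector(word_ids, vocab_size):
    # word_ids: visual-word index (feature class) of every ORB feature in one
    # region; vocab_size: N, the number of feature point classes.
    ids = np.asarray(word_ids, dtype=int)
    counts = np.bincount(ids, minlength=vocab_size).astype(float)   # n_k^i
    n_k = counts.sum()                                              # n_k
    return counts / n_k if n_k > 0 else counts                      # w_k^i = n_k^i / n_k
```

Each region of the current frame thus gets its own N-dimensional vector v_k.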

S3, calculating the image similarity between the k-th region of the current frame and each key frame from their bag-of-words vectors, wherein v_ki is the i-th component of the bag-of-words vector corresponding to the k-th region of the current frame, v_j is the bag-of-words vector corresponding to the j-th key frame, v_ji is the i-th component of v_j, and s(k, j) is the image similarity between the k-th region of the current frame and the j-th key frame;
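
The exact similarity expression is not legible in the source text; the sketch below therefore uses the L1 score commonly paired with bag-of-words vectors (as in DBoW2) purely as a stand-in, and also shows how the N_s most similar key frames of step S4 would be selected:

```python
import numpy as np

def l1_similarity(v_a, v_b):
    # Stand-in score: s = 1 - 0.5 * || v_a/|v_a|_1 - v_b/|v_b|_1 ||_1
    na, nb = np.abs(v_a).sum(), np.abs(v_b).sum()
    if na == 0 or nb == 0:
        return 0.0
    return 1.0 - 0.5 * np.abs(v_a / na - v_b / nb).sum()

def loop_candidates(v_region, keyframe_vectors, n_s):
    # Step S4: indices of the N_s key frames most similar to this region.
    scores = [l1_similarity(v_region, v_kf) for v_kf in keyframe_vectors]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:n_s]
```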

S4, finding the N_s key frames with the greatest image similarity to each region of the current frame, which are marked as loop candidate frames, wherein N_s is an integer greater than 1;

S5, calculating the structural similarity between each region of the current frame and its corresponding N_s loop candidate frames, and finding the loop candidate frame with the greatest structural similarity to each region of the current frame; these 2 loop candidate frames are marked as the final loop frames; the flowchart of step S5 is shown in FIG. 4;

in this embodiment, the step S5 includes the following steps:

S51, in the comparison between the current frame and the s-th loop candidate frame, wherein the value range of s is {1, 2, ..., N_s}: calculating the Hamming distance between the c-th feature point of the i-th class in the k-th region of the current frame and each feature point of the i-th class of the s-th loop candidate frame, and finding the feature point f_s of the s-th loop candidate frame with the minimum Hamming distance to it, wherein the value range of c is {1, 2, ..., n_k^i} and n_k^i is the number of feature points of the i-th class in the k-th region of the current frame;

S52, calculating the Hamming distance between feature point f_s and each feature point of the i-th class in the k-th region of the current frame; if the minimum of these Hamming distances is attained at the same c-th feature point used in step S51, recording f_s and that feature point as common feature points;
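
Steps S51-S52 amount to a mutual nearest-neighbour (cross-checked) match on Hamming distance. The sketch below assumes the descriptors of one feature class are 256-bit ORB descriptors packed as rows of a uint8 array; a current-frame feature and a candidate-frame feature are kept as common feature points only when each is the other's closest descriptor:

```python
import numpy as np

def hamming(d1, d2):
    # Hamming distance between two packed binary ORB descriptors.
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

def common_feature_points(desc_cur, desc_cand):
    # desc_cur:  descriptors of the i-th class in the current-frame region
    # desc_cand: descriptors of the i-th class in the s-th loop candidate frame
    pairs = []
    if len(desc_cur) == 0 or len(desc_cand) == 0:
        return pairs
    for c, dc in enumerate(desc_cur):
        s_best = min(range(len(desc_cand)), key=lambda s: hamming(dc, desc_cand[s]))
        c_back = min(range(len(desc_cur)), key=lambda c2: hamming(desc_cand[s_best], desc_cur[c2]))
        if c_back == c:               # mutual nearest neighbour: common feature points
            pairs.append((c, s_best))
    return pairs
```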

S53, establishing the structural matrix between the k-th region of the current frame and each loop candidate frame: the traditional loop detection method generally uses bag-of-words vectors as the measure of image similarity, where the more similar the bag-of-words vectors of two images, the higher the image similarity. However, the bag-of-words vector measures the similarity between images only by the classes and numbers of their feature points and therefore ignores the topological information among the feature points. If only the bag-of-words vector is used for loop detection, an image whose feature point classes and counts are similar to those of the current frame but whose feature point topology differs greatly may be taken as the loop frame of the current frame, causing loop detection to fail. Therefore, the similarity between the RGB-D images is measured by establishing a structural matrix (which encodes the topological information among the feature points) between the k-th region of the current frame and each loop candidate frame. In the comparison between each region of the current frame and the s-th loop candidate frame, the number of rows and the number of columns of the structural matrix M(k, s) of the k-th region of the current frame are both equal to the number of common feature points between the k-th region of the current frame and the s-th loop candidate frame, and the element in row i and column j of M(k, s) is expressed as:

M(k, s)_{i,j} = |d_k(i, j) - d_s(i, j)|

wherein d_k(i, j) is the distance between the i-th common feature point and the j-th common feature point of the k-th region of the current frame, expressed as:

d_k(i, j) = ||P_k^i - P_k^j||

wherein P_k^i is the three-dimensional coordinate of the i-th common feature point of the k-th region of the current frame; d_s(i, j) is the distance between the i-th common feature point and the j-th common feature point of the s-th loop candidate frame, expressed as:

d_s(i, j) = ||P_s^i - P_s^j||

wherein P_s^i is the three-dimensional coordinate of the i-th common feature point of the s-th loop candidate frame;

d_k(i, j) represents the distance between any two common feature points of the k-th region of the current frame, and d_s(i, j) represents the distance between the two corresponding common feature points of the s-th loop candidate frame. Therefore, the smaller the absolute value of d_k(i, j) - d_s(i, j), the smaller the difference between the two distances and the more similar the topological structure of the corresponding pair of common feature points;
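
A sketch of the structural matrix of step S53, under the reading above that the element in row i and column j is |d_k(i, j) - d_s(i, j)| with Euclidean distances between the three-dimensional coordinates of the common feature points:

```python
import numpy as np

def structure_matrix(pts_cur, pts_cand):
    # pts_cur, pts_cand: (N, 3) arrays of the 3D coordinates of the N common
    # feature points, ordered so that row i of pts_cur corresponds to row i
    # of pts_cand.
    d_cur = np.linalg.norm(pts_cur[:, None, :] - pts_cur[None, :, :], axis=-1)     # d_k(i, j)
    d_cand = np.linalg.norm(pts_cand[:, None, :] - pts_cand[None, :, :], axis=-1)  # d_s(i, j)
    return np.abs(d_cur - d_cand)                                                  # M(k, s)
```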

S54, traversing all elements of the structural matrix M(k, s), and putting the row index and the column index of every element larger than T_D into the row set set_row and the column set set_col respectively, wherein set_row and set_col are initially empty sets and T_D is a constant greater than 0;

S55, setting all elements of the structural matrix M(k, s) to 1;

S56, setting to 0 every element of the structural matrix M(k, s) whose row index belongs to set_row, and setting to 0 every element of M(k, s) whose column index belongs to set_col;

S57, the more elements of the structural matrix M(k, s) that are equal to 1, the more pairs of topologically similar common feature points there are and the higher the topological similarity between the images. Therefore, the sum of all elements of M(k, s) is recorded as the loop weight between the k-th region of the current frame and the s-th loop candidate frame, and the loop candidate frame with the largest loop weight for the k-th region of the current frame is found and recorded as the final loop frame of that region, as shown in FIG. 5.

S6, matching each region of the current frame with its corresponding final loop frame, re-projecting the map points corresponding to the feature points of the final loop frame onto the corresponding region of the current frame, and calculating the pose of the current frame by minimizing the re-projection error e_all, wherein T_cw is the final pose of the current frame of the RGB-D camera, K is the intrinsic parameter matrix of the RGB-D camera, p_{k,i} is the i-th feature point of the k-th region of the current frame, z_{k,i} is the map point corresponding to feature point p_{k,i}, and N_k is the number of map points in the k-th region of the current frame;

S7, minimizing the re-projection error e_all to obtain the final pose T_cw of the current frame of the RGB-D camera.
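
The exact expression for e_all is not legible in the source text; the sketch below assumes the usual pinhole reprojection residual, summing the squared pixel error between each observed feature point p_{k,i} and the projection of its map point z_{k,i} under the pose T_cw over both regions. In practice T_cw would be obtained by passing this residual to a nonlinear least-squares solver (e.g. Gauss-Newton or Levenberg-Marquardt):

```python
import numpy as np

def region_reprojection_error(T_cw, K, pixels, map_points):
    # T_cw: 4x4 world-to-camera pose, K: 3x3 intrinsic matrix,
    # pixels: (N_k, 2) observed feature points p_{k,i},
    # map_points: (N_k, 3) associated map points z_{k,i} in world coordinates
    # (assumed to lie in front of the camera).
    pts_h = np.hstack([map_points, np.ones((len(map_points), 1))])   # homogeneous coords
    cam = (T_cw @ pts_h.T)[:3]                                       # points in camera frame
    proj = K @ cam
    proj = (proj[:2] / proj[2]).T                                    # projected pixel coords
    return float(((proj - pixels) ** 2).sum())

def total_error(T_cw, K, regions):
    # regions: iterable of (pixels, map_points) for the 2 regions of the frame.
    return sum(region_reprojection_error(T_cw, K, p, z) for p, z in regions)
```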

In summary, the loop detection method disclosed in this embodiment can effectively solve the problem of loop detection failure caused by a small overlapping area between images in traditional loop detection methods. In addition, the method performs loop detection on the current frame of the RGB-D camera according to both the similarity of the feature points of the current frame and the topological structure among the feature points, and can therefore effectively detect loop frames whose feature points are similar to those of the current frame and whose feature point topology is also similar.

The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
