Visual SLAM positioning method and device integrating MSCKF and graph optimization

Document No.: 114063 · Published: 2021-10-19

Note: this technology, "Visual SLAM positioning method and device integrating MSCKF and graph optimization", was designed and created by Gao Hongchen and Li Li on 2021-04-23. Its main content is as follows: the current image and inertial data are acquired in real time; a plurality of feature points on the current image are tracked based on the current inertial data; the information of the feature points and the inertial data are then input into a filter for filter prediction and update, and the current pose, the key-frame information, and the information of the feature points are output. Meanwhile, a target key frame is extracted from the key frame queue and processed, and loop detection is performed on the target key frame. If loop detection fails, the map points, the target key frame, and its co-visibility relationships are updated into the map library; if loop detection succeeds, the key-frame poses and the map points are globally optimized by global bundle adjustment (BA), the optimization result is fed back to the MSCKF for state fusion, and the poses of the key frames and the map points in the sliding window are updated.

1. A visual SLAM positioning method fusing MSCKF and graph optimization, characterized by comprising the following steps:

acquiring a current image and inertial data in real time;

each time a current image is obtained, tracking a plurality of feature points on the current image based on the current inertial data;

inputting the information of the plurality of feature points and the inertial data into a filter, performing filter prediction and update, and outputting the current pose, the information of the key frame, and the information of the plurality of feature points; wherein the information of the key frame is output when the current image is determined to be a key frame;

extracting a target key frame from the key frame queue; wherein the key frame queue comprises all key frames to be processed, and the target key frame is the key frame in the queue whose acquisition time is furthest from the present (i.e., the oldest key frame);

restoring the feature points that are co-visible between the target key frame and the multiple key frames in the key frame queue onto the key frame most recently added to the key frame list;

screening a plurality of map points from the feature points on each key frame of the key frame queue;

performing loop detection on the target key frame;

if the loop detection fails, updating the map points, the target key frame, and its co-visibility relationships into a map library;

if the loop detection succeeds, inputting the information of the relevant key frames into a global bundle adjustment (BA) model, and optimizing the relevant key frames and the loop map points through the global BA model; wherein the relevant key frames include the loop frames found by loop detection and the target key frame, and the loop map points are the map points detected by the loop closure;

and updating the optimized relevant key frames and the loop map points into the map library, and feeding them back to the filter for state updating.

2. The method of claim 1, wherein inputting the information of the plurality of feature points and the inertial data into a filter, performing filter prediction and update, and outputting the pose, the key-frame information, and the information of the plurality of feature points comprises:

inputting the information of the plurality of feature points and the inertial data into a filter, and performing filter prediction and update to obtain the current pose;

dividing the plurality of feature points into two categories according to whether their tracking frame count is smaller than a preset frame count, and determining whether the current image is a key frame based on a plurality of preset types of information of the current image;

if the current image does not belong to a key frame, outputting the current pose;

and if the current image belongs to a key frame, outputting the current pose, the information of the key frame, and the information of the two categories of feature points.

3. The method of claim 2, wherein the screening of a plurality of map points from the feature points on each key frame of the key frame queue comprises:

for any feature point on any key frame of the key frame queue, if the feature point belongs to the first category and appears on at least two key frames in the key frame queue, or the feature point belongs to the second category and has a new co-visibility relationship, determining the feature point as a map point; wherein the first category refers to feature points whose tracking frame count is not less than the preset frame count, and the second category refers to feature points whose tracking frame count is less than the preset frame count.

4. The method of claim 1, wherein the updating of the optimized relevant key frames and the loop map points into the map library, and the feeding back to the filter for state updating, comprises:

updating the optimized relevant key frames and the loop map points into the map library;

judging whether the relevant key frames and the loop map points exist in the sliding window of the filter;

if a relevant key frame or a loop map point exists in the sliding window of the filter, correspondingly updating the relevant key frames and loop map points existing in the filter to the optimized relevant key frames or the optimized loop map points;

and if no relevant key frame or loop map point exists in the sliding window of the filter, updating the current inertial data into the filter.

5. A visual SLAM positioning device integrating MSCKF and graph optimization, comprising:

an acquisition unit, configured to acquire a current image and inertial data in real time;

a tracking unit, configured to, each time a current image is obtained, track a plurality of feature points on the current image based on the current inertial data;

a positioning unit, configured to input the information of the feature points and the inertial data into a filter, perform filter prediction and update, and output the current pose, the information of a key frame, and the information of the feature points; wherein the information of the key frame is output when the current image is determined to be a key frame;

an extraction unit, configured to extract a target key frame from the key frame queue; wherein the key frame queue comprises all key frames to be processed, and the target key frame is the key frame in the queue whose acquisition time is furthest from the present (i.e., the oldest key frame);

a restoring unit, configured to restore the feature points co-visible between the target key frame and the multiple key frames in the key frame queue onto the key frame most recently added to the key frame list;

a screening unit, configured to screen a plurality of map points from the feature points on each key frame of the key frame queue;

a loop detection unit, configured to perform loop detection on the target key frame;

a first updating unit, configured to update the map points, the target key frame, and its co-visibility relationships into a map library when loop detection fails;

an optimization unit, configured to, when loop detection succeeds, input the information of the relevant key frames into the global bundle adjustment (BA) model, and optimize the relevant key frames and the loop map points through the global BA model; wherein the relevant key frames include the loop frames found by loop detection and the target key frame, and the loop map points are the map points detected by the loop closure;

and a second updating unit, configured to update the optimized relevant key frames and the loop map points into the map library, and feed them back to the filter for state updating.

6. The apparatus of claim 5, wherein the positioning unit comprises:

a state updating unit, configured to input the information of the feature points and the inertial data into a filter, and perform filter prediction and update to obtain the current pose;

a matching unit, configured to divide the plurality of feature points into two categories according to whether their tracking frame count is smaller than a preset frame count, and to determine whether the current image is a key frame based on a plurality of preset types of information of the current image;

a first output unit, configured to output the current pose when the current image does not belong to a key frame;

and a second output unit, configured to output the current pose, the information of the key frame, and the information of the two categories of feature points when the current image belongs to a key frame.

7. The apparatus of claim 6, wherein the screening unit comprises:

a screening subunit, configured to, for any feature point on any key frame of the key frame queue, determine that feature point as a map point if it belongs to the first category and appears on at least two key frames in the key frame queue, or if it belongs to the second category and has a new co-visibility relationship; wherein the first category refers to feature points whose tracking frame count is not less than the preset frame count, and the second category refers to feature points whose tracking frame count is less than the preset frame count.

8. The apparatus of claim 5, wherein the second updating unit comprises:

a map updating unit, configured to update the optimized relevant key frames and the loop map points into the map library;

a judging unit, configured to judge whether the relevant key frames and the loop map points exist in the sliding window of the filter;

a first front-end updating unit, configured to, when the judging unit determines that a relevant key frame or a loop map point exists in the sliding window of the filter, correspondingly update the relevant key frames and loop map points existing in the filter to the optimized relevant key frames or the optimized loop map points;

and a second front-end updating unit, configured to update the current inertial data into the filter when the judging unit determines that no relevant key frame or loop map point exists in the sliding window of the filter.

Technical Field

The application relates to the technical field of simultaneous localization and mapping, and in particular to a visual SLAM positioning method and device integrating MSCKF and graph optimization.

Background

Currently, there are two main ways to realize simultaneous localization and mapping (SLAM): approaches based on filtering algorithms and approaches based on nonlinear optimization.

Filtering-based approaches mainly rely on the observations in the most recently acquired multi-frame images; the state estimator performs simultaneous localization and mapping using a filter such as the Multi-State Constraint Kalman Filter (MSCKF), so the computational efficiency is extremely high. Nonlinear-optimization-based approaches mainly search for co-visible feature points across historical multi-frame images, detect constraint relationships between frames through loop detection, and finally solve for an optimal solution over multiple iterations; accumulated errors can thus be eliminated, so such approaches have high positioning accuracy and are generally used for building scene maps.

However, filtering-based approaches have poorer accuracy, and errors accumulate continuously during long-term positioning, reducing positioning accuracy and robustness. Nonlinear-optimization-based approaches, on the other hand, have high complexity, so positioning is slow and demands high computing power. Existing approaches therefore cannot simultaneously deliver real-time performance and high positioning accuracy.

Disclosure of Invention

To address the defects of the prior art, the application provides a visual SLAM positioning method and device fusing MSCKF and graph optimization, so as to solve the problem that existing approaches cannot guarantee computational efficiency and accuracy at the same time.

In order to achieve the above object, the present application provides the following technical solutions:

the application provides a visual SLAM positioning method integrating MSCKF and graph optimization, which comprises the following steps:

acquiring a current image and inertial data in real time;

each time a current image is obtained, tracking a plurality of feature points on the current image based on the current inertial data;

inputting the information of the plurality of feature points and the inertial data into a filter, performing filter prediction and update, and outputting the current pose, the information of the key frame, and the information of the plurality of feature points; wherein the information of the key frame is output when the current image is determined to be a key frame;

extracting a target key frame from the key frame queue; wherein the key frame queue comprises all key frames to be processed, and the target key frame is the key frame in the queue whose acquisition time is furthest from the present (i.e., the oldest key frame);

restoring the feature points that are co-visible between the target key frame and the multiple key frames in the key frame queue onto the key frame most recently added to the key frame list;

screening a plurality of map points from the feature points on each key frame of the key frame queue;

performing loop detection on the target key frame;

if the loop detection fails, updating the map points, the target key frame, and its co-visibility relationships into a map library;

if the loop detection succeeds, inputting the information of the relevant key frames into a global bundle adjustment (BA) model, and optimizing the relevant key frames and the loop map points through the global BA model; wherein the relevant key frames include the loop frames found by loop detection and the target key frame, and the loop map points are the map points detected by the loop closure;

and updating the optimized relevant key frames and the loop map points into the map library, and feeding them back to the filter for state updating.

Optionally, in the above method, the inputting of the information of the plurality of feature points and the inertial data into a filter, the performing of filter prediction and update, and the outputting of the pose, the key-frame information, and the information of the plurality of feature points includes:

inputting the information of the plurality of feature points and the inertial data into a filter, and performing filter prediction and update to obtain the current pose;

dividing the plurality of feature points into two categories according to whether their tracking frame count is smaller than a preset frame count, and determining whether the current image is a key frame based on a plurality of preset types of information of the current image;

if the current image does not belong to a key frame, outputting the current pose;

and if the current image belongs to a key frame, outputting the current pose, the information of the key frame, and the information of the two categories of feature points.

Optionally, in the foregoing method, the screening of a plurality of map points from the feature points on each key frame of the key frame queue includes:

for any feature point on any key frame of the key frame queue, if the feature point belongs to the first category and appears on at least two key frames in the key frame queue, or the feature point belongs to the second category and has a new co-visibility relationship, determining the feature point as a map point; wherein the first category refers to feature points whose tracking frame count is not less than the preset frame count, and the second category refers to feature points whose tracking frame count is less than the preset frame count.

Optionally, in the foregoing method, the updating of the optimized relevant key frames and the loop map points into the map library, and the feeding back to the filter for state updating, includes:

updating the optimized relevant key frames and the loop map points into the map library;

judging whether the relevant key frames and the loop map points exist in the sliding window of the filter;

if a relevant key frame or a loop map point exists in the sliding window of the filter, correspondingly updating the relevant key frames and loop map points existing in the filter to the optimized relevant key frames or the optimized loop map points;

and if no relevant key frame or loop map point exists in the sliding window of the filter, updating the current inertial data into the filter.

The application further provides a visual SLAM positioning device integrating MSCKF and graph optimization, comprising:

an acquisition unit, configured to acquire a current image and inertial data in real time;

a tracking unit, configured to, each time a current image is obtained, track a plurality of feature points on the current image based on the current inertial data;

a positioning unit, configured to input the information of the feature points and the inertial data into a filter, perform filter prediction and update, and output the current pose, the information of a key frame, and the information of the feature points; wherein the information of the key frame is output when the current image is determined to be a key frame;

an extraction unit, configured to extract a target key frame from the key frame queue when a preset first wake-up time is reached; wherein the key frame queue comprises all key frames to be processed, and the target key frame is the key frame in the queue whose acquisition time is furthest from the present (i.e., the oldest key frame);

a restoring unit, configured to restore the feature points co-visible between the target key frame and the multiple key frames in the key frame queue onto the key frame most recently added to the key frame list;

a screening unit, configured to screen a plurality of map points from the feature points on each key frame of the key frame queue;

a loop detection unit, configured to perform loop detection on the target key frame;

a first updating unit, configured to update the map points, the target key frame, and its co-visibility relationships into a map library when loop detection fails;

an optimization unit, configured to, when loop detection succeeds, input the information of the relevant key frames into the global bundle adjustment (BA) model, and optimize the relevant key frames and the loop map points through the global BA model; wherein the relevant key frames include the loop frames found by loop detection and the target key frame, and the loop map points are the map points detected by the loop closure;

and a second updating unit, configured to update the optimized relevant key frames and the loop map points into the map library and the filter.

Optionally, in the above apparatus, the positioning unit includes:

a state updating unit, configured to input the information of the feature points and the inertial data into a filter, and perform filter prediction and update to obtain the current pose;

a matching unit, configured to divide the plurality of feature points into two categories according to whether their tracking frame count is smaller than a preset frame count, and to determine whether the current image is a key frame based on a plurality of preset types of information of the current image;

a first output unit, configured to output the current pose when the current image does not belong to a key frame;

and a second output unit, configured to output the current pose, the information of the key frame, and the information of the two categories of feature points when the current image belongs to a key frame.

Optionally, in the above apparatus, the screening unit includes:

a screening subunit, configured to, for any feature point on any key frame of the key frame queue, determine that feature point as a map point if it belongs to the first category and appears on at least two key frames in the key frame queue, or if it belongs to the second category and has a new co-visibility relationship; wherein the first category refers to feature points whose tracking frame count is not less than the preset frame count, and the second category refers to feature points whose tracking frame count is less than the preset frame count.

Optionally, in the above apparatus, the second updating unit includes:

a map updating unit, configured to update the optimized relevant key frames and the loop map points into the map library;

a judging unit, configured to judge whether the relevant key frames and the loop map points exist in the sliding window of the filter;

a first front-end updating unit, configured to, when the judging unit determines that a relevant key frame or a loop map point exists in the sliding window of the filter, correspondingly update the relevant key frames and loop map points existing in the filter to the optimized relevant key frames or the optimized loop map points;

and a second front-end updating unit, configured to update the current inertial data into the filter when the judging unit determines that no relevant key frame or loop map point exists in the sliding window of the filter.

According to the visual SLAM positioning method fusing MSCKF and graph optimization provided by the application, the current image and inertial data are acquired in real time; each time a current image is obtained, a plurality of feature points on the current image are tracked based on the current inertial data; the information of the feature points and the inertial data are then input into a filter for filter prediction and update, and the current pose, the key-frame information, and the information of the feature points are output. The efficiency of filtering-based simultaneous localization is thereby retained. When the first wake-up time is reached, a target key frame is extracted from the key frame queue, the multi-frame key frames having a co-visibility relationship with the target key frame are searched out of the key frame queue, and the feature points co-visible between those key frames and the target key frame are restored onto the most recent key frame in the key frame list. A plurality of map points are screened from the feature points on each key frame of the key frame queue, and loop detection is performed on the target key frame. If loop detection fails, the map points, the target key frame, and its co-visibility relationships are updated into the map library. If loop detection succeeds, the information of the relevant key frames is input into the global bundle adjustment model, the relevant key frames and the loop map points are optimized through that model, and finally the optimized relevant key frames and loop map points are used to update the map library and the filter; the map is thus periodically optimized by a nonlinear algorithm, guaranteeing positioning accuracy. The filtering algorithm and nonlinear optimization are thereby effectively fused for simultaneous localization, ensuring both positioning efficiency and positioning accuracy.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only embodiments of the present application; those skilled in the art can obtain other drawings from the provided drawings without creative effort.

Fig. 1 is a schematic structural diagram of a simultaneous location system according to an embodiment of the present disclosure;

fig. 2 is a flowchart of a visual SLAM positioning method for integrating MSCKF and map optimization according to another embodiment of the present application;

FIG. 3 is a flow chart of filter prediction and update according to another embodiment of the present application;

FIG. 4 is a flowchart of a method for updating a map library and a filter according to another embodiment of the present application;

FIG. 5 is a block diagram of a simultaneous positioning system according to another embodiment of the present application;

FIG. 6 is a schematic structural diagram of a simultaneous positioning device according to another embodiment of the present application;

fig. 7 is a schematic structural diagram of a positioning unit according to another embodiment of the present application;

fig. 8 is a schematic structural diagram of a second update unit according to another embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

In this application, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The application provides a visual SLAM positioning method fusing MSCKF and graph optimization, and aims to solve the problem that the operation efficiency and the operation precision cannot be guaranteed simultaneously in the prior art.

Optionally, in order to implement the visual SLAM positioning method fusing MSCKF and graph optimization disclosed in the present application, an embodiment of the present application provides a visual SLAM positioning system fusing MSCKF and graph optimization, as shown in fig. 1, which specifically includes: a camera 101, an inertial sensor 102, a front end 103, and a back end 104.

The front end 103 includes a filter, which may be a Multi-State Constraint Kalman Filter (MSCKF). It is mainly used to process the visual features extracted from images captured by the camera 101 together with the inertial data collected by the Inertial Measurement Unit (IMU) 102, so as to predict and update the system state and output it to the back end 104 in real time, thereby tightly coupling and fusing the visual features and the inertial data.

The back end 104 includes a bundle adjustment (BA) module 1041 and a loop closure detection module 1042. The loop closure detection module 1042 is mainly configured to perform loop detection using the visual features and feed the resulting loop detection information back to the BA module 1041, where the loop detection information consists of the visual features obtained by loop detection. The BA module 1041 mainly performs loop correction according to the loop detection information and globally optimizes the filter using historical global information.

Optionally, a map library may further be included for storing key frame information, key motion information, sparse point cloud maps, and other data generated in the course of processing.

Based on the above simultaneous positioning system, another embodiment of the present application provides a visual SLAM positioning method integrating MSCKF and graph optimization, as shown in fig. 2, which specifically includes the following steps:

s201, acquiring a current image and current inertial data in real time.

It should be noted that simultaneous localization in the present application refers to simultaneous localization and mapping (SLAM), also referred to as concurrent mapping and localization (CML).

The inertial data are measured by the inertial sensor and specifically include acceleration and angular velocity.

S202, each time a current image is obtained, tracking a plurality of feature points on the current image based on the current inertial data.

Specifically, the pose is first predicted by integrating the angular velocity in the inertial data, and the optical flow method is then used to track, on the current image, the two-dimensional coordinates of all the feature points from the previously acquired frame. If the number of feature points tracked on the current image does not meet a preset requirement, new feature points can be extracted from the current image.
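To make this tracking step concrete, the following is a minimal Python sketch of IMU-aided optical flow tracking using OpenCV's pyramidal Lucas-Kanade tracker; the constant MIN_FEATURES, the function name, and the IMU-predicted seed positions passed in as predicted_pts are illustrative assumptions rather than details published in the application.

```python
# Sketch of step S202 (assumptions noted above; not the applicant's code).
import cv2
import numpy as np

MIN_FEATURES = 150  # assumed "preset requirement" on the feature count

def track_features(prev_img, cur_img, prev_pts, predicted_pts):
    """Track prev_pts (float32, shape Nx1x2) into cur_img, seeding the search
    with IMU-predicted positions, then top up with new corners if needed."""
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_img, cur_img, prev_pts, predicted_pts.copy(),
        winSize=(21, 21), maxLevel=3,
        flags=cv2.OPTFLOW_USE_INITIAL_FLOW)
    tracked = cur_pts[status.ravel() == 1]
    if len(tracked) < MIN_FEATURES:
        # Mask out neighborhoods of surviving points, then extract new corners.
        mask = np.full(cur_img.shape[:2], 255, np.uint8)
        for x, y in tracked.reshape(-1, 2):
            cv2.circle(mask, (int(x), int(y)), 15, 0, -1)
        new = cv2.goodFeaturesToTrack(
            cur_img, MIN_FEATURES - len(tracked), 0.01, 15, mask=mask)
        if new is not None:
            tracked = np.vstack([tracked, new.astype(np.float32)])
    return tracked
```

Seeding the search with IMU-predicted positions (OPTFLOW_USE_INITIAL_FLOW) shrinks the search region and helps the tracker survive fast rotation, which is the point of using the inertial data here.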

S203, inputting the information of the plurality of feature points and the inertial data into a filter, performing filter prediction and update, and outputting the current pose, the information of the key frame, and the information of the plurality of feature points.

The information of the key frame is output when the current image is determined to be a key frame.

Specifically, the filter predicts and updates the state variables according to the information of each feature point and the inertial data. The state variables of the filter consist of a group of frame information of length N, where the frame information includes the position and attitude of each frame image, the camera intrinsics and extrinsics, and the state variables of the inertial data.
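As an illustration of this state layout, the sketch below models the sliding window in Python; every field name and the window length N = 10 are assumptions, since the application does not fix a concrete data structure.

```python
# Sketch of the filter state (field names and N are assumed).
from dataclasses import dataclass, field
import numpy as np

@dataclass
class FrameState:
    frame_id: int
    position: np.ndarray   # 3-vector: translation of the frame
    attitude: np.ndarray   # 4-vector: orientation quaternion

@dataclass
class FilterState:
    N: int = 10                                  # sliding-window length
    window: list = field(default_factory=list)   # up to N FrameState entries
    cam_intrinsics: np.ndarray = field(default_factory=lambda: np.eye(3))
    cam_extrinsics: np.ndarray = field(default_factory=lambda: np.eye(4))
    gyro_bias: np.ndarray = field(default_factory=lambda: np.zeros(3))
    accel_bias: np.ndarray = field(default_factory=lambda: np.zeros(3))

    def push_frame(self, frame: FrameState):
        """Append a new frame state; drop the oldest when the window is full
        (a real MSCKF marginalizes the old state rather than just dropping it)."""
        self.window.append(frame)
        if len(self.window) > self.N:
            self.window.pop(0)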

It should be noted that, when step S203 is executed, the filter determines whether the current image is a key frame. If the current image is determined to be a key frame, the current pose, the information of the plurality of feature points, and the information of the key frame (i.e., the current image just determined to be a key frame) are output. If the current image is determined not to be a key frame, it can be ignored, and only the current pose is output.

Specifically, as shown in fig. 3, one implementation of step S203 includes the following steps:

and S301, inputting the information of the plurality of characteristic points and the inertia data into a filter, and predicting and updating the filter to obtain the current pose.

S302, dividing the plurality of feature points into two categories according to whether the tracking frame number is smaller than a preset frame number.

The tracking frame number of the feature point refers to the number of frames to which one feature point is continuously tracked, that is, the feature point exists, and is the number of continuously acquired images.

Because the tracking frame number can influence the processing mode of the feature points, the feature points are divided into the feature points of which the tracking frame number is not less than the preset frame number and the feature points of which the tracking frame number is less than the preset frame number according to the tracking frame number.
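A minimal sketch of this split; the threshold value and the track_count attribute are assumptions, since the application only speaks of a "preset frame count".

```python
# Sketch of step S302: split tracked features by tracking frame count.
PRESET_FRAME_COUNT = 5  # assumed threshold

def categorize(features):
    """features: iterable of objects with a .track_count attribute."""
    first, second = [], []
    for f in features:
        (first if f.track_count >= PRESET_FRAME_COUNT else second).append(f)
    return first, second  # long-tracked vs. short-tracked feature points
```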

And S303, determining whether the current image is a key frame or not based on a plurality of preset types of information of the current image.

Among the preset types of information, the most important is the time interval between the current image and the previous image that was determined to be a key frame. The displacement length, the number of new feature points on the image, and the like can of course also be included. Based on these preset types of information, it is determined whether the current image is sufficiently important for simultaneous localization, and hence whether it is a key frame.
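For illustration, one possible key-frame test over the information types named above; all thresholds are assumptions, as the application only calls them "preset".

```python
# Sketch of step S303: any criterion firing marks the image as a key frame.
def is_keyframe(dt_since_last_kf, displacement, num_new_features,
                max_dt=0.5, min_disp=0.1, min_new=30):  # assumed thresholds
    return (dt_since_last_kf > max_dt
            or displacement > min_disp
            or num_new_features > min_new)
```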

If step S303 determines that the current image does not belong to a key frame, step S304 is executed. If step S303 determines that the current image is a key frame, step S305 is executed.

S304, outputting the current pose.

S305, outputting the current pose, the information of the key frame, and the information of the two categories of feature points.

S204, extracting a target key frame from the key frame queue; the target key frame is the key frame in the queue whose acquisition time is furthest from the current time (i.e., the oldest key frame in the queue).

Wherein the key frame queue comprises all key frames to be processed.

It should be noted that, in this embodiment, steps S201 to S203 are executed continuously in real time, so the pose can be updated continuously and simultaneous localization is achieved. The subsequent key-frame processing and loop detection can optimize the front end and thus ensure the accuracy of simultaneous localization, but they take relatively long. Optionally, therefore, the subsequent steps may be executed selectively, i.e., at a preset first wake-up time. Hence, in this embodiment, every key frame determined in step S203 is added to the key frame queue for subsequent processing.

S205, restoring the feature points co-visible between the target key frame and the multiple key frames in the key frame queue onto the key frame most recently added to the key frame list.

Specifically, other key frames having a co-visibility relationship with the target key frame are found in the key frame queue, i.e., key frames that observe the same scene as the target key frame. The co-visible feature points are then restored, through the co-visibility relationship, onto the key frame most recently added to the key frame list.
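The sketch below illustrates one plausible realization of this restoration step; the data layout (feature_ids, observation, add_observation) is assumed purely for illustration.

```python
# Sketch of step S205: propagate co-visible observations onto the newest
# key frame in the list (data layout assumed).
def restore_covisible_points(target_kf, keyframe_queue, keyframe_list):
    newest_kf = max(keyframe_list, key=lambda kf: kf.timestamp)
    target_ids = set(target_kf.feature_ids)
    for kf in keyframe_queue:
        if kf is target_kf:
            continue
        shared = target_ids & set(kf.feature_ids)  # co-visibility test
        for fid in shared:
            newest_kf.add_observation(fid, kf.observation(fid))
    return newest_kf
```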

S206, screening a plurality of map points from the feature points on each key frame of the key frame queue.

Specifically, one implementation of step S206 is: for any feature point on any key frame of the key frame queue, if the feature point belongs to the first category and appears on at least two key frames in the key frame queue, or the feature point belongs to the second category and has a new co-visibility relationship, the feature point is determined to be a map point. The first category refers to feature points whose tracking frame count is not less than the preset frame count, and the second category refers to feature points whose tracking frame count is less than the preset frame count; both the quantity and the quality of the selected map points can thus be effectively ensured.
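A sketch of these two admission rules, reusing the assumed PRESET_FRAME_COUNT and attribute names from the earlier sketches:

```python
# Sketch of step S206: admit a feature as a map point under either rule.
def select_map_points(keyframe_queue):
    map_points = []
    for kf in keyframe_queue:
        for f in kf.features:
            long_tracked = f.track_count >= PRESET_FRAME_COUNT
            on_two_kfs = sum(f.id in k.feature_ids
                             for k in keyframe_queue) >= 2
            if (long_tracked and on_two_kfs) or \
               (not long_tracked and f.has_new_covisibility):
                map_points.append(f)
    return map_points
```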

S207, performing loop detection on the target key frame, and judging whether the loop detection succeeds.

Optionally, loop detection can also be switched on selectively. Loop detection can be based on a bag-of-words (BoW) model. The specific detection procedure is the same as in the open-source algorithm ORB-SLAM3 and is therefore not described in detail here.
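Since the application defers the detection procedure to ORB-SLAM3, the following is only a highly simplified stand-in that shows the bag-of-words scoring idea, not the real DBoW2 API:

```python
# Simplified BoW loop scoring: cosine similarity of word histograms.
import numpy as np

def bow_score(hist_a, hist_b):
    a = hist_a / (np.linalg.norm(hist_a) + 1e-12)
    b = hist_b / (np.linalg.norm(hist_b) + 1e-12)
    return float(a @ b)

def detect_loop(target_hist, database, threshold=0.3):  # threshold assumed
    """database: dict frame_id -> BoW histogram. Returns (frame_id, score)
    of the best match above the threshold, or None if detection fails."""
    if not database:
        return None
    best_id, best_hist = max(database.items(),
                             key=lambda kv: bow_score(target_hist, kv[1]))
    score = bow_score(target_hist, best_hist)
    return (best_id, score) if score >= threshold else None
```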

If loop detection is determined to have failed, step S208 is executed. If loop detection is successful, step S209 is executed.

S208, updating the map points, the target key frame, and its co-visibility relationships into the map library.

Optionally, step S209 may be executed only after loop detection has run several times and a relatively large number of loop frames have been accumulated, so as to avoid overly frequent optimization.

S209, inputting the information of the relevant key frames into a global bundle adjustment (BA) model, and optimizing the relevant key frames and the loop map points through the global BA model; the relevant key frames include the loop frames found by loop detection and the target key frame, and the loop map points are the map points detected by the loop closure.

Specifically, a cost function is constructed over the new map points and the relevant key frames and iteratively minimized, thereby optimizing the state variables of the relevant key frames and the map points.
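As one plausible realization of this optimization (the application names no solver), the sketch below minimizes the reprojection cost over key-frame poses and map points with SciPy; poses are parameterized as a rotation vector plus a translation, and K is the 3x3 camera intrinsic matrix.

```python
# Sketch of step S209: global bundle adjustment as nonlinear least squares.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def reproj_residuals(params, n_kf, n_pts, K, observations):
    """observations: list of (kf_index, pt_index, u, v) pixel measurements."""
    poses = params[:n_kf * 6].reshape(n_kf, 6)   # [rotvec | translation]
    points = params[n_kf * 6:].reshape(n_pts, 3)
    res = []
    for kf_i, pt_i, u, v in observations:
        rvec, t = poses[kf_i, :3], poses[kf_i, 3:]
        p_cam = Rotation.from_rotvec(rvec).apply(points[pt_i]) + t
        uv = K @ p_cam
        res.extend([uv[0] / uv[2] - u, uv[1] / uv[2] - v])
    return np.asarray(res)

def global_ba(init_poses, init_points, K, observations):
    """init_poses: (n_kf, 6); init_points: (n_pts, 3). Returns optimized copies."""
    x0 = np.hstack([init_poses.ravel(), init_points.ravel()])
    sol = least_squares(reproj_residuals, x0,
                        args=(len(init_poses), len(init_points), K, observations))
    n = len(init_poses) * 6
    return sol.x[:n].reshape(-1, 6), sol.x[n:].reshape(-1, 3)
```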

S210, updating the optimized relevant key frames and the loop map points into the map library, and feeding them back to the filter for state updating.

After the relevant key frames and the loop map points have been optimized, the filter can be updated accordingly, thereby ensuring the accuracy of simultaneous localization.

Optionally, in another embodiment of the present application, as shown in fig. 4, step S210 specifically includes the following steps:

S401, updating the optimized relevant key frames and the loop map points into the map library.

S402, judging whether relevant key frames and loop map points exist in the sliding window of the filter.

If it is determined that a relevant key frame or a loop map point exists in the sliding window of the filter, step S403 is executed. If it is determined that no relevant key frame or loop map point exists in the sliding window of the filter, step S404 is executed.

S403, correspondingly updating the relevant key frames and loop map points existing in the filter to the optimized relevant key frames or the optimized loop map points.

S404, updating the current inertial data into the filter.
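A sketch of this feedback path (steps S401-S404), reusing the FilterState sketch from earlier; replace_frame, replace_points, and propagate are assumed helper methods, not published interfaces.

```python
# Sketch of step S210's feedback: update the map library, then refresh
# whatever still lives inside the filter's sliding window.
def feed_back(optimized_kfs, optimized_points, map_library, filter_state, imu):
    map_library.update(optimized_kfs, optimized_points)        # S401
    in_window = {f.frame_id for f in filter_state.window}      # S402
    hit = False
    for kf in optimized_kfs:
        if kf.frame_id in in_window:                           # S403
            filter_state.replace_frame(kf.frame_id, kf.position, kf.attitude)
            hit = True
    if hit:
        filter_state.replace_points(optimized_points)
    else:                                                      # S404
        filter_state.propagate(imu)  # no overlap: just feed in current IMU data
```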

According to the visual SLAM positioning method fusing MSCKF and graph optimization provided by the application, the current image and inertial data are acquired in real time; each time a current image is obtained, a plurality of feature points on the current image are tracked based on the current inertial data; the information of the feature points and the inertial data are then input into a filter for filter prediction and update, and the current pose, the key-frame information, and the information of the feature points are output. The efficiency of filtering-based simultaneous localization is thereby retained. When the first wake-up time is reached, a target key frame is extracted from the key frame queue, the multi-frame key frames having a co-visibility relationship with the target key frame are searched out of the key frame queue, and the feature points co-visible between those key frames and the target key frame are restored onto the most recent key frame in the key frame list. A plurality of map points are screened from the feature points on each key frame of the key frame queue, and loop detection is performed on the target key frame. If loop detection fails, the map points, the target key frame, and its co-visibility relationships are updated into the map library. If loop detection succeeds, the information of the relevant key frames is input into the global bundle adjustment model, the relevant key frames and the loop map points are optimized through that model, and finally the map library is updated with the optimized relevant key frames and loop map points and the filter is updated; the map is thus periodically optimized by a nonlinear algorithm, guaranteeing positioning accuracy. The filtering algorithm and nonlinear optimization are thereby effectively fused for simultaneous localization, ensuring both positioning efficiency and positioning accuracy.

From the visual SLAM positioning method integrating MSCKF and graph optimization provided by the above embodiment, it can be seen that the method provided by the present application mainly divides into three parts: the first part processes the current image and the inertial information in real time to realize simultaneous localization; the second part processes the key-frame information and the feature-point information and performs loop detection on the key frames; the last part optimizes the whole map once enough loop frames and co-view frames have been found. The simultaneous positioning system provided by the embodiment of the application can accordingly be divided into three functional areas and a map library, where each functional area is implemented by a corresponding thread. Specifically, as shown in fig. 5, a visual-inertial odometry (VIO) thread implements the first part; it is the thread in which the MSCKF filtering algorithm runs and is responsible for rapidly processing the input current image and inertial data and outputting the pose in real time. A post-processing thread implements the second part and is responsible for processing the information output by the VIO thread and detecting loops. A global bundle adjustment thread implements the third part; it is the thread in which the nonlinear optimization algorithm runs and is responsible for starting the optimization of the whole map after the post-processing thread has found enough loop frames and co-view frames. The map library stores key frames, map points, co-visibility relationships, dependency relationships among key frames, and the like; the co-visibility relationships are stored in a co-visibility graph, and the dependencies are stored in a spanning tree.
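The three-thread layout of fig. 5 can be sketched as follows; the two queues mirror the figure, while the callables passed into each thread stand in for the processing steps described above and are assumptions:

```python
# Sketch of the fig. 5 thread layout: VIO -> post-processing -> global BA.
import queue
import threading

keyframe_queue: queue.Queue = queue.Queue()  # VIO -> post-processing
loop_queue: queue.Queue = queue.Queue()      # post-processing -> global BA

def vio_thread(read_frame, msckf_step, publish_pose):
    """S201-S203: real-time filtering; pushes key frames to the back end."""
    while True:
        image, imu = read_frame()
        pose, keyframe = msckf_step(image, imu)
        publish_pose(pose)                 # pose is output for every frame
        if keyframe is not None:
            keyframe_queue.put(keyframe)   # queued for S204 and onward

def postprocess_thread(process_and_detect_loop):
    """S204-S208: key-frame processing and loop detection."""
    while True:
        kf = keyframe_queue.get()          # oldest pending key frame first
        loop = process_and_detect_loop(kf)
        if loop is not None:
            loop_queue.put(loop)

def global_ba_thread(run_global_ba_and_feed_back):
    """S209-S210: global BA once enough loop/co-view frames accumulate."""
    while True:
        run_global_ba_and_feed_back(loop_queue.get())

# Each thread is started once, e.g.:
# threading.Thread(target=vio_thread, args=(read, step, pub), daemon=True).start()
```

Decoupling the threads with queues keeps the VIO thread's real-time output independent of how long loop detection and global BA take, which is the core of the fusion scheme.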

Another embodiment of the present application provides a visual SLAM positioning device integrating MSCKF and graph optimization, which, as shown in fig. 6, includes the following units:

the acquisition unit 601 is configured to acquire a current image and inertial data in real time.

A tracking unit 602, configured to, each time a current image is obtained, track a plurality of feature points on the current image based on the current inertial data.

The positioning unit 603 is configured to input information of the plurality of feature points and the inertial data into the filter, perform filter prediction and update, and output information of the current pose, the key frame, and the plurality of feature points.

Wherein the information of the key frame is output when the current image is determined to be the key frame.

An extracting unit 604, configured to extract the target key frame from the key frame queue.

The key frame queue comprises all key frames to be processed. The target key frame is the key frame in the queue whose acquisition time is furthest from the current time (i.e., the oldest key frame).

A restoring unit 605, configured to restore the feature points co-visible between the target key frame and the multiple key frames in the key frame queue onto the key frame most recently added to the key frame list.

A screening unit 606, configured to screen a plurality of map points from the feature points on each key frame of the key frame queue.

A loop detection unit 607, configured to perform loop detection on the target key frame.

A first updating unit 608, configured to update the map points, the target key frame, and its co-visibility relationships into the map library when loop detection fails.

An optimizing unit 609, configured to, when loop detection succeeds, input the information of the relevant key frames into the global bundle adjustment (BA) model, and optimize the relevant key frames and the loop map points through the global BA model.

The relevant key frames include the loop frames found by loop detection and the target key frame. The loop map points are the map points detected by the loop closure.

A second updating unit 610, configured to update the optimized relevant key frames and the loop map points into the map library, and feed them back to the filter for state updating.

Optionally, in the visual SLAM positioning device integrating MSCKF and graph optimization provided by another embodiment of the present application, the positioning unit, as shown in fig. 7, includes the following units:

A state updating unit 701, configured to input the information of the plurality of feature points and the inertial data into the filter, and perform filter prediction and update to obtain the current pose.

A matching unit 702, configured to divide the plurality of feature points into two categories according to whether their tracking frame count is less than the preset frame count, and determine whether the current image is a key frame based on a plurality of preset types of information of the current image.

A first output unit 703, configured to output the current pose when the current image does not belong to the key frame.

And a second output unit 704, configured to output the current pose, information of the key frame, and information of the two types of feature points when the current image belongs to the key frame.

Optionally, the screening unit in the visual SLAM positioning device integrating MSCKF and graph optimization provided by another embodiment of the present application includes:

A screening subunit, configured to, for any feature point on any key frame of the key frame queue, determine that feature point as a map point if it belongs to the first category and appears on at least two key frames in the key frame queue, or if it belongs to the second category and has a new co-visibility relationship.

The first category refers to feature points whose tracking frame count is not less than the preset frame count, and the second category refers to feature points whose tracking frame count is less than the preset frame count.

Optionally, in the visual SLAM positioning device integrating MSCKF and graph optimization provided by another embodiment of the present application, the second updating unit, as shown in fig. 8, includes the following units:

A map updating unit 801, configured to update the optimized relevant key frames and the loop map points into the map library.

A judging unit 802, configured to judge whether relevant key frames and loop map points exist in the sliding window of the filter.

A first front-end updating unit 803, configured to, when the judging unit determines that a relevant key frame or a loop map point exists in the sliding window of the filter, correspondingly update the relevant key frames and loop map points existing in the filter to the optimized relevant key frames or the optimized loop map points.

A second front-end updating unit 804, configured to update the current inertial data into the filter when the judging unit determines that no relevant key frame or loop map point exists in the sliding window of the filter.

It should be noted that the units provided in the foregoing embodiments are the units of the modules in the visual SLAM positioning system integrating MSCKF and graph optimization provided in the foregoing embodiment; for the specific working process of each unit, reference may be made to the implementation of the corresponding step in the foregoing method embodiment, which is not repeated here.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
