Visual SLAM (simultaneous localization and mapping) initialization method, system and device in dynamic environment

Document No.: 1599715    Publication date: 2020-01-07

Reading note: this technology, "Visual SLAM (simultaneous localization and mapping) initialization method, system and device in dynamic environment," was created by 汤淑明, 卢晓昀, 顿海洋, 黄馨 and 张力夫 on 2019-09-27. Abstract: The invention belongs to the fields of robotics, autonomous driving and AR vision, and in particular relates to a visual SLAM initialization method, system and device in a dynamic environment, aiming to solve the problem that SLAM fails to extract static feature points in a dynamic environment. The method comprises: obtaining two image frames with parallax; obtaining the matching feature points of the previous frame; dividing the previous frame into equal parts to obtain a plurality of image blocks, taking image blocks containing more matching feature points than a set threshold as lattice models, and obtaining the interior points and centroid of each lattice model; calculating the coupling degree between lattice models from their interior points, and selecting lattice models based on a preset coupling-degree threshold to construct corresponding lattice model sets; calculating the distribution variance of each lattice model set from the centroids, and selecting the set with the largest value to construct a static feature set; triangulating the static feature set and initializing the SLAM through nonlinear optimization. The invention can extract sufficient static feature points.

1. A visual SLAM initialization method in a dynamic environment, the method comprising:

step S10, acquiring a first image frame and a second image frame having a parallax from an input video;

step S20, respectively extracting ORB feature points of the first image frame and the second image frame, and acquiring matching feature points of the first image frame by a feature point matching method;

step S30, dividing the first image frame into equal parts to obtain a plurality of image blocks, selecting image blocks whose number of matching feature points is larger than a set threshold as lattice models, and acquiring the interior points of each lattice model and the centroid of those interior points through a RANSAC algorithm;

step S40, respectively calculating the coupling degree of each lattice model with other lattice models based on the inner point of each lattice model, selecting the lattice model based on a preset coupling degree threshold value, and constructing a corresponding lattice model set;

step S50, for each lattice model set, calculating the distribution variance based on the centroids of the interior points of its lattice models, selecting the lattice model set corresponding to the maximum distribution variance, and constructing a static feature set from the interior points of each lattice model in that set;

and step S60, triangulating the interior points in the static feature set, refining their three-dimensional coordinates through a nonlinear optimization method, and initializing the SLAM based on the three-dimensional coordinates.

2. The visual SLAM initialization method in a dynamic environment of claim 1, wherein the first image frame is a temporally previous image.

3. The visual SLAM initialization method in a dynamic environment of claim 1, wherein in step S20, "acquiring matching feature points of the first image frame by a feature point matching method" comprises: acquiring the ORB feature points matched between the first image frame and the second image frame as the matching feature points of the first image frame.

4. The visual SLAM initialization method in a dynamic environment of claim 1, wherein in step S30, "selecting image blocks whose number of matching feature points is larger than a set threshold as lattice models, and acquiring the interior points and the centroid of each lattice model through a RANSAC algorithm" comprises:

selecting image blocks whose number of matching feature points is larger than the set threshold as lattice models, and counting the number of lattice models;

if the number of lattice models is larger than a preset value, acquiring the interior points of each lattice model and the centroids of those interior points through the RANSAC algorithm based on the matching feature points of each lattice model, and executing step S40; otherwise executing step S10.

5. The visual SLAM initialization method in a dynamic environment of claim 1, wherein in step S40, "respectively calculating the coupling degree of each lattice model with the other lattice models based on its interior points" comprises:

taking each lattice model in turn as a first model, and any one of the other lattice models as a second model;

based on the interior points of the second model, screening, through the RANSAC algorithm, those interior points that also conform to the model solved for the first model, and counting their number;

taking the ratio of this number to the number of interior points of the second model as the coupling degree of the first model with the second model.

6. The visual SLAM initialization method in a dynamic environment of claim 1, wherein in step S50, "for each lattice model set, calculating the distribution variance based on the centroids of the interior points of its lattice models" uses the calculation:

F_distribution = X_variance + Y_variance

wherein X_variance is the variance in the x-direction of the centroids of the interior points of the lattice models in the set, Y_variance is the variance in the y-direction of those centroids, and F_distribution is the distribution variance.

7. A visual SLAM initialization system in a dynamic environment, characterized by comprising an acquisition module, a feature matching module, a screening module, a set construction module, a distribution variance calculation module and an initialization module;

the acquisition module is configured to acquire a first image frame and a second image frame with parallax from an input video;

the feature matching module is configured to extract ORB feature points of the first image frame and the second image frame respectively, and obtain matching feature points of the first image frame by a feature point matching method;

the screening module is configured to divide the first image frame into equal parts to obtain a plurality of image blocks, select image blocks whose number of matching feature points is larger than a set threshold as lattice models, and acquire the interior points of each lattice model and the centroid of those interior points through a RANSAC algorithm;

the set construction module is configured to calculate, for each lattice model, its coupling degree with the other lattice models based on its interior points, select lattice models based on a preset coupling-degree threshold, and construct the corresponding lattice model sets;

the distribution variance calculation module is configured to calculate, for each lattice model set, the distribution variance based on the centroids of the interior points of its lattice models, select the lattice model set corresponding to the maximum distribution variance, and construct a static feature set from the interior points of each lattice model in that set;

the initialization module is configured to triangulate the interior points in the static feature set, refine their three-dimensional coordinates through a nonlinear optimization method, and initialize the SLAM based on the three-dimensional coordinates.

8. A storage device having a plurality of programs stored therein, wherein the programs are adapted to be loaded and executed by a processor to implement the visual SLAM initialization method in a dynamic environment of any one of claims 1-6.

9. A processing device comprising a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; characterized in that said program is adapted to be loaded and executed by a processor to implement the visual SLAM initialization method in a dynamic environment as claimed in any one of claims 1 to 6.

Technical Field

The invention belongs to the fields of robotics, autonomous driving and AR vision, and particularly relates to a visual SLAM initialization method, system and device in a dynamic environment.

Background

Visual SLAM is a piece of systems engineering, and the classic visual SLAM framework usually assumes that the environment is static, or at least that static features occupy the majority of it. Although mature visual SLAM systems such as ORB-SLAM, DSO and LSD have appeared, their robustness in dynamic scenes is very poor because of this static-environment assumption. Whether based on the direct method or the indirect method, visual SLAM systems lack means of suppressing interference from moving objects: during SLAM operation, the camera pose solution depends entirely on the RANSAC method, and if static feature points do not occupy the majority of the whole image, or occupy only a low proportion, a correct camera motion model cannot be obtained within a limited number of iterations.

The traditional factorization method needs to know the number of moving objects in advance and requires batch data for its solution, so it is difficult to solve the camera pose in a dynamic environment in real time and robustly. Background-modeling methods such as Gaussian modeling and Gaussian mixture modeling are of little use to SLAM, because the real significance of SLAM lies in exploring an unknown environment; if SLAM operated in a known environment, the problem would degenerate into a localization problem, since the environment could then be fully surveyed in advance to acquire high-precision maps in various ways, after which the camera would be localized by map matching.

At present, more and more deep-learning-based methods are applied to moving object detection. In some visual SLAM frameworks, moving targets are first identified, the feature points in the moving-target regions are then removed, and finally the camera motion model is solved using the remaining feature points. Deep-learning-based methods greatly improve the accuracy and efficiency of moving-target detection, but they still have great limitations, because the recognition is usually based on the texture, color, gray level and so on of the target, and this information has no absolute relation to whether the target is actually moving. Judging whether a target moves then amounts to finding and removing objects that could possibly move in a limited environment, and the defects of this approach are obvious, mainly in two respects. First, a large amount of static feature point information that could otherwise be used for localization may be lost. For example, in an indoor environment, since people are objects with a movable attribute, such a framework will systematically delete the feature points in every region identified as a person, ignoring that in some cases people may be stationary; when a stationary person occupies a large portion of the image field of view, much feature point information that could have been used for localization is lost. Second, in unfamiliar environments many moving objects cannot be recognized, owing to the limitations of the training data of the deep-learning network; in other words, deep learning cannot recognize all moving objects, so the network can only remove a limited subset of moving objects or moving feature points, and the unrecognized ones still greatly affect visual localization.

How to effectively obtain a set of static feature points with a low outlier rate in an image (outliers being points that do not conform to the static model), rather than judging motion attributes from textures, colors, gray levels and so on as in image-based deep learning, is of great significance for solving the camera pose. Therefore, the invention proposes a visual SLAM initialization method in a dynamic environment.

Disclosure of Invention

In order to solve the above problems in the prior art, that is, to solve the problem that the pose of a camera in an existing visual SLAM system cannot be solved in a real-time and robust manner due to lack of extraction of static feature points in a dynamic environment, a first aspect of the present invention provides a visual SLAM initialization method in a dynamic environment, including:

step S10, acquiring a first image frame and a second image frame having a parallax from an input video;

step S20, respectively extracting ORB feature points of the first image frame and the second image frame, and acquiring matching feature points of the first image frame by a feature point matching method;

step S30, dividing the first image frame into equal parts to obtain a plurality of image blocks, selecting image blocks whose number of matching feature points is larger than a set threshold as lattice models, and acquiring the interior points of each lattice model and the centroid of those interior points through a RANSAC algorithm;

step S40, respectively calculating the coupling degree of each lattice model with other lattice models based on the inner point of each lattice model, selecting the lattice model based on a preset coupling degree threshold value, and constructing a corresponding lattice model set;

step S50, for each lattice model set, calculating the distribution variance based on the centroids of the interior points of its lattice models, selecting the lattice model set corresponding to the maximum distribution variance, and constructing a static feature set from the interior points of each lattice model in that set;

and step S60, triangulating the interior points in the static feature set, refining their three-dimensional coordinates through a nonlinear optimization method, and initializing the SLAM based on the three-dimensional coordinates.

In some preferred embodiments, the first image frame is a temporally previous image.

In some preferred embodiments, in step S20, "acquiring matching feature points of the first image frame by a feature point matching method" comprises: acquiring the ORB feature points matched between the first image frame and the second image frame as the matching feature points of the first image frame.

In some preferred embodiments, in step S30, "selecting image blocks whose number of matching feature points is larger than a set threshold as lattice models, and acquiring the interior points and the centroid of each lattice model through a RANSAC algorithm" comprises:

selecting image blocks whose number of matching feature points is larger than the set threshold as lattice models, and counting the number of lattice models;

if the number of lattice models is larger than a preset value, acquiring the interior points of each lattice model and the centroids of those interior points through the RANSAC algorithm, and executing step S40; otherwise executing step S10.

In some preferred embodiments, in step S40, "respectively calculating the coupling degree of each lattice model with the other lattice models based on its interior points" comprises:

taking each lattice model in turn as a first model, and any one of the other lattice models as a second model;

based on the interior points of the second model, screening, through the RANSAC algorithm, those interior points that also conform to the model solved for the first model, and counting their number;

taking the ratio of this number to the number of interior points of the second model as the coupling degree of the first model with the second model.

In some preferred embodiments, in step S50, "for each lattice model set, calculating the distribution variance based on the centroids of the interior points of its lattice models" uses the calculation:

F_distribution = X_variance + Y_variance

wherein X_variance is the variance in the x-direction of the centroids of the interior points of the lattice models in the set, Y_variance is the variance in the y-direction of those centroids, and F_distribution is the distribution variance.

A second aspect of the invention provides a visual SLAM initialization system in a dynamic environment, comprising an acquisition module, a feature matching module, a screening module, a set construction module, a distribution variance calculation module and an initialization module;

the acquisition module is configured to acquire a first image frame and a second image frame with parallax from an input video;

the feature matching module is configured to extract ORB feature points of the first image frame and the second image frame respectively, and obtain matching feature points of the first image frame by a feature point matching method;

the screening module is configured to divide the first image frame into equal parts to obtain a plurality of image blocks, select image blocks whose number of matching feature points is larger than a set threshold as lattice models, and acquire the interior points of each lattice model and the centroid of those interior points through a RANSAC algorithm;

the set construction module is configured to calculate, for each lattice model, its coupling degree with the other lattice models based on its interior points, select lattice models based on a preset coupling-degree threshold, and construct the corresponding lattice model sets;

the distribution variance calculation module is configured to calculate, for each lattice model set, the distribution variance based on the centroids of the interior points of its lattice models, select the lattice model set corresponding to the maximum distribution variance, and construct a static feature set from the interior points of each lattice model in that set;

the initialization module is configured to triangulate the interior points in the static feature set, refine their three-dimensional coordinates through a nonlinear optimization method, and initialize the SLAM based on the three-dimensional coordinates.

In a third aspect of the present invention, a storage device is provided, in which a plurality of programs are stored, the programs being adapted to be loaded and executed by a processor to implement the visual SLAM initialization method in a dynamic environment described above.

In a fourth aspect of the present invention, a processing apparatus is provided, which includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the visual SLAM initialization method in a dynamic environment as described above.

The invention has the beneficial effects that:

the method can extract enough static characteristic points and solve the pose of the camera in the SLAM in real time in a robust mode. The invention provides a visual SLAM initialization method in a dynamic environment based on region segmentation, global coupling and distribution degree judgment, which screens out a static characteristic point set by using the difference of the distribution of static characteristic points and moving objects on an image and ensures that enough static characteristic points are extracted. And resolving the pose of the camera according to the static feature point set, and then triangularizing the static feature points to finish the initialization of the surrounding environment. Static semantic information is provided for the SLAM system in a static feature point set mode, and the system can operate in a dynamic environment in a robust and real-time mode.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.

FIG. 1 is a flow chart of a visual SLAM initialization method in a dynamic environment according to one embodiment of the present invention;

FIG. 2 is a block diagram of a visual SLAM initialization system in a dynamic environment according to one embodiment of the present invention;

FIG. 3 is an exemplary diagram of the equal division of the first image frame in the visual SLAM initialization method in a dynamic environment according to an embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

The visual SLAM initialization method in a dynamic environment, as shown in FIG. 1, comprises the following steps:

step S10, acquiring a first image frame and a second image frame having a parallax from an input video;

step S20, respectively extracting ORB feature points of the first image frame and the second image frame, and acquiring matching feature points of the first image frame by a feature point matching method;

step S30, dividing the first image frame into equal parts to obtain a plurality of image blocks, selecting image blocks whose number of matching feature points is larger than a set threshold as lattice models, and acquiring the interior points of each lattice model and the centroid of those interior points through a RANSAC algorithm;

step S40, respectively calculating the coupling degree of each lattice model with other lattice models based on the inner point of each lattice model, selecting the lattice model based on a preset coupling degree threshold value, and constructing a corresponding lattice model set;

step S50, for each lattice model set, calculating the distribution variance based on the centroids of the interior points of its lattice models, selecting the lattice model set corresponding to the maximum distribution variance, and constructing a static feature set from the interior points of each lattice model in that set;

and step S60, triangulating the interior points in the static feature set, refining their three-dimensional coordinates through a nonlinear optimization method, and initializing the SLAM based on the three-dimensional coordinates.

In order to more clearly describe the visual SLAM initialization method in the dynamic environment of the present invention, the following describes the steps in an embodiment of the method of the present invention in detail with reference to fig. 1.

In step S10, a first image frame and a second image frame having a parallax are acquired from an input video.

SLAM (simultaneous localization and mapping) can be described as: placing a robot at an unknown position in an unknown environment, is there a way to let the robot move while gradually building a complete map of that environment?

In this embodiment, a video stream of a surrounding environment is acquired by a camera, and RGB images having a time interval and a parallax between two frames are acquired based on the video stream, and the two frames of images are arranged in a time sequence and are respectively denoted as a first image frame and a second image frame.

Step S20, performing ORB feature point extraction on the first image frame and the second image frame, respectively, and obtaining matching feature points of the first image frame by a feature point matching method.

In this embodiment, the ORB feature points of the two frames of RGB images obtained in step S10 are extracted first, and then the feature points of the RGB images of the first frame and the second frame are matched.

The detection and extraction of image features is one of the most important research fields of computer vision. The ORB (Oriented FAST and Rotated BRIEF) algorithm is among the fastest and most stable feature point detection and description algorithms currently available; it combines the FAST feature point detector with the BRIEF feature descriptor, improving and optimizing both on the basis of the original methods. Feature points are detected with the FAST (Features from Accelerated Segment Test) algorithm: a circle of pixel values around each candidate feature point is examined, and if the gray values of enough pixels on that circle differ sufficiently from the gray value of the candidate point, the candidate is considered a feature point.

After obtaining the feature points, we need to describe their attributes in some way; the encoding of these attributes is called the descriptor of the feature point (feature descriptor). In real life, when an object is observed from different distances, directions, angles and illumination conditions, its size, shape and brightness differ, yet our brain can still judge it to be the same object. An ideal feature descriptor should possess the same property: in images of different scales, orientations and illumination, the same feature point should have sufficiently similar descriptors, which is referred to as the reproducibility of the descriptor. ORB uses the BRIEF algorithm to compute the descriptor of a feature point.

In this embodiment, descriptors are computed for the feature points of the first image frame and the second image frame, matching is performed according to descriptor similarity to judge whether two points correspond to the same feature, and the ORB feature points matched between the first and second image frames are taken as the matching feature points of the first image frame.
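As a concrete illustration of the matching step, the following is a minimal sketch of brute-force matching of binary (ORB/BRIEF-style) descriptors by Hamming distance with a mutual-nearest-neighbour check. The function name, the mutual check and the `max_dist` cutoff are illustrative assumptions, not the patent's prescribed matcher; in practice one would typically use OpenCV's ORB detector together with a Hamming-distance brute-force matcher.

```python
import numpy as np

def hamming_match(desc1, desc2, max_dist=64):
    """Brute-force match binary descriptors (rows of packed uint8 bytes,
    e.g. 32 bytes per point for standard ORB) by Hamming distance,
    keeping only mutual nearest neighbours under a distance cutoff."""
    # Pairwise Hamming distances via XOR followed by a bit count.
    xor = desc1[:, None, :] ^ desc2[None, :, :]
    dist = np.unpackbits(xor, axis=2).sum(axis=2)
    nn12 = dist.argmin(axis=1)   # best frame-2 match for each frame-1 point
    nn21 = dist.argmin(axis=0)   # best frame-1 match for each frame-2 point
    # Keep (i, j) only when the choice is mutual and close enough.
    return [(i, j) for i, j in enumerate(nn12)
            if nn21[j] == i and dist[i, j] <= max_dist]
```

The mutual check is one common way to suppress the mismatched pairs that the later RANSAC step would otherwise have to reject.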

Step S30, equally dividing the first image frame to obtain a plurality of image blocks, selecting the image blocks with the matching feature points larger than a set threshold as lattice models, and obtaining the interior points of each lattice model and the centroid of the interior points through a RANSAC algorithm.

How to effectively obtain static feature points with a low outlier rate in an image, rather than judging motion attributes from textures, colors, gray levels and so on as in image-based deep learning, is of great significance for solving the camera pose of the robot. In fact, distinguishing static from dynamic objects in the field of view of a moving observer is not a simple matter. The reason is that the ratio of dynamic to static objects is not well defined, and object motion and rest are interconvertible. We can, however, make the following observation: feature points on a moving object and feature points on a static object each lie on their own object, so the closer two neighboring feature points are, the more likely they belong to the same object and hence to the same motion model.

In this embodiment, the first image frame is divided into m x n image blocks, where m is the number of equal divisions of the image in the y-direction and n is the number in the x-direction, as shown in FIG. 3. The m x n ordering converts the sequence of image blocks into a one-dimensional index. Image blocks whose number of matching feature points is larger than a set threshold are selected as lattice models, and the number of lattice models is counted. If this number is larger than a preset value, the feature points in each such image block are normalized, the motion model of the block is solved with the RANSAC algorithm from the normalized feature points (that is, mismatched feature points are screened out and removed while model interior points conforming to the lattice model are retained), and step S40 is executed; if the number of lattice models is smaller than the preset value, step S10 is executed to re-acquire two RGB frames and recompute the feature points.
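The partition-and-select part of this step can be sketched as follows. Only the grid assignment and the per-block point count are shown; the per-block RANSAC model solution is left to a standard routine (e.g. fitting a fundamental matrix or homography per block), since the model form is not fixed here. The function name and return convention are illustrative assumptions.

```python
import numpy as np

def select_lattice_blocks(points, img_w, img_h, m, n, min_pts):
    """Assign matched feature points (N x 2 array of pixel coordinates
    in the first frame) to an m x n grid and keep the blocks containing
    more than min_pts points.  Returns {1-D block index: point indices},
    using block = row * n + col as the one-dimensional ordering."""
    col = np.clip((points[:, 0] * n // img_w).astype(int), 0, n - 1)
    row = np.clip((points[:, 1] * m // img_h).astype(int), 0, m - 1)
    block = row * n + col
    lattice = {}
    for b in np.unique(block):
        idx = np.flatnonzero(block == b)
        if len(idx) > min_pts:           # block qualifies as a lattice model
            lattice[int(b)] = idx
    return lattice
```

Each surviving block's points would then be fed to RANSAC to obtain the block's interior points and their centroid.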

And step S40, respectively calculating the coupling degree of each lattice model with other lattice models based on the inner point of each lattice model, selecting the lattice model based on a preset coupling degree threshold value, and constructing a corresponding lattice model set.

In SLAM, the motion states of different moving objects are mutually exclusive, and even more so with the motion state of static objects, so the model distribution of a moving object is concentrated. Conversely, even though in a real environment the static feature points obtained may not occupy the main part of the image, they are distributed across the whole image, so the static model has the characteristic of cross-region coupling.

In this embodiment, based on the lattice models obtained in step S30, a coupling matrix of the lattice models is constructed; the coupling degree between the current lattice model and each other lattice model is calculated through the interior points, and if the coupling degree is greater than a preset coupling-degree threshold, the other lattice model is added to the set of the current lattice model, yielding in turn the image block set of each lattice model. The specific steps are as follows:

step S41, setting z as the number of image blocks meeting the model resolving requirement, namely a lattice model, and establishing a z x z dimension coupling matrix for the z image blocks;

in step S42, (i, j) elements in the coupling matrix are j-th lattice model interior points and are also the ratio of the number of the i-th lattice model interior points to the total number of the j-th lattice model interior points, which is also referred to as the degree of coupling between the image blocks in this patent. For example, the coupling degree of the first lattice model and the second lattice model is: based on the interior points of the second lattice model, the number of the interior points which accord with the first lattice model is obtained through a RANSAC algorithm, and the ratio of the number to the number of the interior points of the second lattice model is used as the coupling degree of the first lattice model and the second lattice model.

step S43, set a coupling-degree threshold; for the i-th image block, the image blocks in the i-th row whose coupling degree is greater than the threshold are assigned to the i-th model's image block set.
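Steps S41 to S43 can be sketched as follows, assuming a predicate `fits_model(i, p)` stands in for the RANSAC check of whether point p conforms to the i-th lattice model; that predicate and the function names are assumptions for illustration.

```python
import numpy as np

def coupling_sets(inliers, fits_model, tau):
    """inliers[j]      : the interior points of lattice model j
    fits_model(i, p): True if point p conforms to model i (stand-in
                      for the RANSAC model check)
    tau             : coupling-degree threshold
    Returns the z x z coupling matrix C and the per-model sets."""
    z = len(inliers)
    C = np.zeros((z, z))
    for i in range(z):
        for j in range(z):
            hits = sum(fits_model(i, p) for p in inliers[j])
            # share of model j's interior points that also fit model i
            C[i, j] = hits / len(inliers[j])
    # row i: blocks whose coupling with model i exceeds the threshold
    sets = [[j for j in range(z) if C[i, j] > tau] for i in range(z)]
    return C, sets
```

Blocks sharing one rigid motion couple strongly with each other and weakly with everything else, which is exactly what the thresholded rows capture.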

And step S50, for each grid model set, calculating distribution variance based on the centroid of each grid model inner point, selecting the grid model set corresponding to the maximum distribution variance value, and constructing a static characteristic set based on the inner points of each grid model in the set.

From the viewpoint of the moving camera (the subject of SLAM), only relative motion states can be resolved; in this case, a moving object differs from a stationary object only in its relative motion state.

Although all motion in space is relative, we still wish to find a motion state that is both stable and widely distributed, and to establish the coordinate system on the objects in that state, because doing so greatly simplifies the description of every other object's motion. While the scene in the camera's field of view changes constantly, a coordinate system that is stable in motion and persists over time is desirable. The degree of spatial distribution is therefore an important criterion for judging whether a candidate motion coordinate system is the one we need to solve for. In a terrestrial environment, this coordinate system is generally the static world coordinate system.

In this embodiment, based on the image block sets of the z lattice models obtained in step S40, the distribution function of each image block set, i.e., the distribution variance F_distribution, is calculated as follows:

respectively calculating the mean x-coordinate and mean y-coordinate of the centroids of the lattice models in the image block set;

calculating, based on these means, the variance X_variance of the centroid x-coordinates and the variance Y_variance of the centroid y-coordinates of the lattice models in the image block set;

Based on the above variances, the distribution function is computed as shown in equation (1):

F_distribution = X_variance + Y_variance    (1)

The image block set with the largest distribution function value is selected as the motion model of the camera, and the static feature point set is constructed from the interior points of that image block set.
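The distribution-variance selection of step S50 can be sketched as follows. This assumes the combination in equation (1) is the sum of the x and y centroid variances (the patent's equation image is not reproduced here, so that form is an assumption), and the function names are hypothetical.

```python
import numpy as np

def distribution_variance(centroids):
    """centroids: (n, 2) array of lattice-model interior-point centroids in one image block set.
    Returns the assumed F_distribution = X_variance + Y_variance of equation (1)."""
    c = np.asarray(centroids, dtype=float)
    x_var = c[:, 0].var()  # variance of centroid x-coordinates about their mean
    y_var = c[:, 1].var()  # variance of centroid y-coordinates about their mean
    return x_var + y_var

def select_static_set(model_sets):
    """model_sets: list of centroid arrays, one per lattice model set.
    Returns the index of the set with the largest distribution variance."""
    scores = [distribution_variance(c) for c in model_sets]
    return int(np.argmax(scores))
```

A set of blocks spread across the whole image (the static scene) scores higher than a set concentrated on one moving object, which is exactly the cross-region coupling property the method exploits.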

Step S60, triangulating the interior points in the static feature set, solving their three-dimensional coordinates through a nonlinear optimization method, and initializing the SLAM system based on these three-dimensional coordinates.

In this embodiment, based on the static feature set obtained in step S50, the interior points in the static feature point set are triangulated; the camera motion model and the three-dimensional coordinates of the interior points are then jointly refined by nonlinear optimization iterations, and the SLAM system is initialized with the resulting three-dimensional coordinates.
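The triangulation in step S60 can be illustrated with the standard linear (DLT) method: the result typically serves as the initial value for the nonlinear optimization mentioned above. This is a generic sketch, not the patent's specific solver; the projection matrices and function name are illustrative.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one static-feature correspondence.
    P1, P2: 3x4 camera projection matrices for the two frames;
    x1, x2: the matched 2D image coordinates (length-2 arrays).
    Returns the 3D point as the null vector of the stacked constraints."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],   # x1 * (row 3) - (row 1) of P1
        x1[1] * P1[2] - P1[1],   # y1 * (row 3) - (row 2) of P1
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                   # homogeneous solution minimizing |A X|
    return X[:3] / X[3]          # dehomogenize
```

In practice this would be run on every interior point of the static feature set, after which the camera pose and all triangulated points are refined together (bundle adjustment) to initialize the SLAM map.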

A visual SLAM initialization system in a dynamic environment according to a second embodiment of the present invention, as shown in fig. 2, includes: the system comprises an acquisition module 100, a feature matching module 200, a screening module 300, a set construction module 400, a distribution variance calculation module 500 and an initialization module 600;

an obtaining module 100 configured to obtain a first image frame and a second image frame having a parallax from an input video;

a feature matching module 200, configured to extract ORB feature points of the first image frame and the second image frame, respectively, and obtain matching feature points of the first image frame by a feature point matching method;

the screening module 300 is configured to divide the first image frame into equal parts to obtain a plurality of image blocks, select the image blocks whose number of matched feature points exceeds a set threshold as lattice models, and obtain the interior points of each lattice model and the centroid of those interior points through the RANSAC algorithm;

a set construction module 400 configured to calculate, for each lattice model, its degree of coupling with every other lattice model based on its interior points, select lattice models based on a preset coupling threshold, and construct the corresponding lattice model set;

a distribution variance calculating module 500 configured to calculate, for each lattice model set, a distribution variance based on the centroids of the interior points of its lattice models, select the lattice model set with the largest distribution variance, and construct a static feature set from the interior points of the lattice models in that set;

the initialization module 600 is configured to triangulate the interior points in the static feature set, solve their three-dimensional coordinates by a nonlinear optimization method, and initialize the SLAM system based on these three-dimensional coordinates.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.

It should be noted that, the visual SLAM initialization system under the dynamic environment provided by the foregoing embodiment is only illustrated by the division of the foregoing functional modules, and in practical applications, the foregoing function allocation may be completed by different functional modules according to needs, that is, the modules or steps in the embodiment of the present invention are further decomposed or combined, for example, the modules in the foregoing embodiment may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.

A storage device according to a third embodiment of the present invention stores therein a plurality of programs adapted to be loaded by a processor and to implement the above-described visual SLAM initialization method in a dynamic environment.

A processing apparatus according to a fourth embodiment of the present invention includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the visual SLAM initialization method in a dynamic environment as described above.

It can be clearly understood by those skilled in the art that, for convenience and brevity, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method examples, and are not described herein again.

Those of skill in the art would appreciate that the various illustrative modules and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the programs corresponding to the software modules and method steps may be located in random access memory (RAM), read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. To clearly illustrate this interchangeability of electronic hardware and software, various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as electronic hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.

The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
