Target detection method, device, equipment and system

Document No.: 946224 | Publication date: 2020-10-30

Reading note: this technique, "A target detection method, apparatus, device and system" (Target detection method, device, equipment and system), was designed and created by 张经纬, 方梓成, 邝宏武 and 赵显� on 2019-04-11. Its main content is as follows: an embodiment of the present invention provides a target detection method, apparatus, device, and system. The method includes: matching a target detected in a visible light image with a target detected based on radar data; after the matching succeeds, performing feature extraction and feature fusion on the two targets; and judging, based on the fused feature, whether a detected target is a false detection target. In this scheme, target detection combines the visible light image and the radar data; the two kinds of data carry richer features than either kind alone, so target detection using features extracted from both is more accurate. Moreover, identifying false detection targets based on both the visible light image and the radar data reduces the target false detection rate and further improves detection accuracy.

1. A method of object detection, comprising:

acquiring a visible light image and radar data, wherein the radar data comprises a radar image;

detecting a target in the visible light image as a first candidate target; and detecting a second candidate target based on the radar data;

matching the first candidate target with the second candidate target;

if the matching is successful, extracting a feature of the first candidate target in the visible light image as a first target feature, and extracting a feature of the second candidate target in the radar image as a second target feature;

performing feature fusion on the first target feature and the second target feature to obtain a first fused feature;

and performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

2. The method of claim 1, wherein matching the first candidate target with the second candidate target comprises:

converting the first candidate target and the second candidate target to the same coordinate system;

in the same coordinate system, determining whether the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold;

and if so, determining that the first candidate target is successfully matched with the second candidate target.

3. The method of claim 2, further comprising:

tracking a first candidate target in the visible light image to obtain the speed of the first candidate target as a first speed;

determining a speed of a second candidate target as a second speed based on the radar data;

determining whether the difference between the first speed and the second speed is smaller than a preset speed difference threshold;

if so, the determining that the first candidate target is successfully matched with the second candidate target comprises:

and if the distance between the position of the first candidate target and the position of the second candidate target is smaller than the preset distance threshold and the difference between the first speed and the second speed is smaller than the preset speed difference threshold, determining that the first candidate target and the second candidate target are successfully matched.

4. The method of claim 1, wherein matching the first candidate target with the second candidate target comprises:

performing timestamp alignment on the first candidate target and the second candidate target;

and matching the first candidate target and the second candidate target that have the same timestamp.

5. The method of claim 1, further comprising:

mapping a second candidate target that is not successfully matched into the visible light image according to calibration data between the camera and the radar, wherein the camera is the camera used to acquire the visible light image, and the radar is the radar used to collect the radar data;

determining, according to a mapping result, a first candidate target that was missed in the visible light image;

extracting a feature of the missed first candidate target in the visible light image as a third target feature, and extracting a feature of the unmatched second candidate target in the radar image as a fourth target feature;

performing feature fusion on the third target feature and the fourth target feature to obtain a second fused feature;

and performing false detection identification, based on the second fused feature, on the unmatched second candidate target and/or the missed first candidate target.

6. The method according to claim 1, wherein the extracting the feature of the first candidate target in the visible light image as the first target feature and the extracting the feature of the second candidate target in the radar image as the second target feature comprises:

inputting a first candidate target in the visible light image and a second candidate target in the radar image into a neural network model obtained through pre-training;

performing feature extraction on the first candidate target by using a first level in the neural network model to obtain a first target feature; performing feature extraction on the second candidate target by using a second level in the neural network model to obtain a second target feature;

the performing feature fusion on the first target feature and the second target feature to obtain a first fused feature comprises:

fusing the first target feature and the second target feature by using a third level in the neural network model to obtain a first fused feature;

the performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature comprises:

classifying the first fused feature by using a fourth level in the neural network model to obtain a classification result indicating whether the first candidate target and/or the second candidate target is a false detection target.

7. A target detection apparatus, comprising:

the acquisition module is used for acquiring a visible light image and radar data, wherein the radar data comprises a radar image;

the detection module is used for detecting a target in the visible light image as a first candidate target; and detecting a second candidate target based on the radar data;

a matching module for matching the first candidate target with the second candidate target; if the matching is successful, triggering a feature fusion module;

the feature fusion module is used for extracting a feature of the first candidate target in the visible light image as a first target feature, and extracting a feature of the second candidate target in the radar image as a second target feature; and performing feature fusion on the first target feature and the second target feature to obtain a first fused feature;

and the false detection identification module is used for performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

8. The apparatus of claim 7, wherein the matching module is specifically configured to:

converting the first candidate target and the second candidate target to the same coordinate system;

in the same coordinate system, determining whether the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold;

and if so, determining that the first candidate target is successfully matched with the second candidate target.

9. The apparatus of claim 8, further comprising:

the tracking module is used for tracking a first candidate target in the visible light image to obtain the speed of the first candidate target as a first speed; and determining a speed of a second candidate target as a second speed based on the radar data;

the judging module is used for determining whether the difference between the first speed and the second speed is smaller than a preset speed difference threshold;

the matching module is further configured to: determine that the first candidate target and the second candidate target are successfully matched if the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold and the difference between the first speed and the second speed is smaller than a preset speed difference threshold.

10. The apparatus of claim 7, wherein the matching module is specifically configured to:

performing timestamp alignment on the first candidate target and the second candidate target;

and matching the first candidate target and the second candidate target that have the same timestamp.

11. The apparatus of claim 7, further comprising: a mapping module and a determination module, wherein,

the mapping module is used for mapping a second candidate target that is not successfully matched into the visible light image according to calibration data between the camera and the radar, wherein the camera is the camera used to acquire the visible light image, and the radar is the radar used to collect the radar data;

the determining module is used for determining, according to a mapping result, a first candidate target that was missed in the visible light image;

the feature fusion module is further configured to extract a feature of the missed first candidate target in the visible light image as a third target feature, and extract a feature of the unmatched second candidate target in the radar image as a fourth target feature; and perform feature fusion on the third target feature and the fourth target feature to obtain a second fused feature;

and the false detection identification module is further configured to perform false detection identification, based on the second fused feature, on the unmatched second candidate target and/or the missed first candidate target.

12. The apparatus of claim 7, wherein the feature fusion module is specifically configured to:

inputting a first candidate target in the visible light image and a second candidate target in the radar image into a neural network model obtained through pre-training;

performing feature extraction on the first candidate target by using a first level in the neural network model to obtain a first target feature; performing feature extraction on the second candidate target by using a second level in the neural network model to obtain a second target feature;

fusing the first target feature and the second target feature by using a third level in the neural network model to obtain a first fused feature;

the false detection identification module is specifically configured to:

classify the first fused feature by using a fourth level in the neural network model to obtain a classification result indicating whether the first candidate target and/or the second candidate target is a false detection target.

13. An electronic device comprising a processor and a memory;

a memory for storing a computer program;

a processor for implementing the method steps of any of claims 1-6 when executing a program stored in the memory.

14. A target detection system, comprising: a camera, a radar, and a detection device; wherein the camera and the radar perform data acquisition for the same scene;

the detection device is used for acquiring a visible light image collected by the camera and radar data collected by the radar, wherein the radar data comprises a radar image; detecting a target in the visible light image as a first candidate target; detecting a second candidate target based on the radar data; matching the first candidate target with the second candidate target; if the matching is successful, extracting a feature of the first candidate target in the visible light image as a first target feature, and extracting a feature of the second candidate target in the radar image as a second target feature; performing feature fusion on the first target feature and the second target feature to obtain a first fused feature; and performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

15. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 6.

Technical Field

The present invention relates to the field of target detection technologies, and in particular, to a target detection method, apparatus, device, and system.

Background

In some scenes, after the image acquisition device acquires the visible light image, objects such as people, vehicles, obstacles and the like can be detected in the visible light image. In other scenarios, objects such as people, vehicles, obstacles, etc. may be detected by radar.

Detecting targets from visible light images is susceptible to environmental factors: in poorly lit environments, such as at night or on rainy days, the captured visible light image has poor clarity, so the accuracy of the detection result is also poor. Detecting targets by radar, on the other hand, suffers from larger noise, higher missed detection and false detection rates, and poorer accuracy of the detection result.

Disclosure of Invention

An object of embodiments of the present invention is to provide a target detection method, apparatus, device, and system, so as to improve detection accuracy.

In order to achieve the above object, an embodiment of the present invention provides a target detection method, including:

acquiring a visible light image and radar data, wherein the radar data comprises a radar image;

detecting a target in the visible light image as a first candidate target; and detecting a second candidate target based on the radar data;

matching the first candidate target with the second candidate target;

if the matching is successful, extracting a feature of the first candidate target in the visible light image as a first target feature, and extracting a feature of the second candidate target in the radar image as a second target feature;

performing feature fusion on the first target feature and the second target feature to obtain a first fused feature;

and performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

Optionally, the matching the first candidate target and the second candidate target includes:

converting the first candidate target and the second candidate target to the same coordinate system;

in the same coordinate system, determining whether the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold;

and if so, determining that the first candidate target is successfully matched with the second candidate target.

Optionally, the method further includes:

tracking a first candidate target in the visible light image to obtain the speed of the first candidate target as a first speed;

determining a speed of a second candidate target as a second speed based on the radar data;

determining whether the difference between the first speed and the second speed is smaller than a preset speed difference threshold;

if so, the determining that the first candidate target is successfully matched with the second candidate target comprises:

and if the distance between the position of the first candidate target and the position of the second candidate target is smaller than the preset distance threshold and the difference between the first speed and the second speed is smaller than the preset speed difference threshold, determining that the first candidate target and the second candidate target are successfully matched.

Optionally, the matching the first candidate target and the second candidate target includes:

performing timestamp alignment on the first candidate target and the second candidate target;

and matching the first candidate target and the second candidate target that have the same timestamp.

Optionally, the method further includes:

mapping a second candidate target that is not successfully matched into the visible light image according to calibration data between the camera and the radar, wherein the camera is the camera used to acquire the visible light image, and the radar is the radar used to collect the radar data;

determining, according to a mapping result, a first candidate target that was missed in the visible light image;

extracting a feature of the missed first candidate target in the visible light image as a third target feature, and extracting a feature of the unmatched second candidate target in the radar image as a fourth target feature;

performing feature fusion on the third target feature and the fourth target feature to obtain a second fused feature;

and performing false detection identification, based on the second fused feature, on the unmatched second candidate target and/or the missed first candidate target.

Optionally, the extracting the feature of the first candidate target in the visible light image as the first target feature, and extracting the feature of the second candidate target in the radar image as the second target feature include:

inputting a first candidate target in the visible light image and a second candidate target in the radar image into a neural network model obtained through pre-training;

performing feature extraction on the first candidate target by using a first level in the neural network model to obtain a first target feature; performing feature extraction on the second candidate target by using a second level in the neural network model to obtain a second target feature;

the performing feature fusion on the first target feature and the second target feature to obtain a first fused feature includes:

fusing the first target feature and the second target feature by using a third level in the neural network model to obtain a first fused feature;

the performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature includes:

classifying the first fused feature by using a fourth level in the neural network model to obtain a classification result indicating whether the first candidate target and/or the second candidate target is a false detection target.

In order to achieve the above object, an embodiment of the present invention further provides a target detection apparatus, including:

the acquisition module is used for acquiring a visible light image and radar data, wherein the radar data comprises a radar image;

the detection module is used for detecting a target in the visible light image as a first candidate target; and detecting a second candidate target based on the radar data;

a matching module for matching the first candidate target with the second candidate target; if the matching is successful, triggering a feature fusion module;

the feature fusion module is used for extracting a feature of the first candidate target in the visible light image as a first target feature, and extracting a feature of the second candidate target in the radar image as a second target feature; and performing feature fusion on the first target feature and the second target feature to obtain a first fused feature;

and the false detection identification module is used for performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

Optionally, the matching module is specifically configured to:

converting the first candidate target and the second candidate target to the same coordinate system;

in the same coordinate system, determining whether the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold;

and if so, determining that the first candidate target is successfully matched with the second candidate target.

Optionally, the apparatus further comprises:

the tracking module is used for tracking a first candidate target in the visible light image to obtain the speed of the first candidate target as a first speed; and determining a speed of a second candidate target as a second speed based on the radar data;

the judging module is used for determining whether the difference between the first speed and the second speed is smaller than a preset speed difference threshold;

the matching module is further configured to: determine that the first candidate target and the second candidate target are successfully matched if the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold and the difference between the first speed and the second speed is smaller than a preset speed difference threshold.

Optionally, the matching module is specifically configured to:

performing timestamp alignment on the first candidate target and the second candidate target;

and matching the first candidate target and the second candidate target that have the same timestamp.

Optionally, the apparatus further comprises: a mapping module and a determination module, wherein,

the mapping module is used for mapping a second candidate target that is not successfully matched into the visible light image according to calibration data between the camera and the radar, wherein the camera is the camera used to acquire the visible light image, and the radar is the radar used to collect the radar data;

the determining module is used for determining, according to a mapping result, a first candidate target that was missed in the visible light image;

the feature fusion module is further configured to extract a feature of the missed first candidate target in the visible light image as a third target feature, and extract a feature of the unmatched second candidate target in the radar image as a fourth target feature; and perform feature fusion on the third target feature and the fourth target feature to obtain a second fused feature;

and the false detection identification module is further configured to perform false detection identification, based on the second fused feature, on the unmatched second candidate target and/or the missed first candidate target.

Optionally, the feature fusion module is specifically configured to:

inputting a first candidate target in the visible light image and a second candidate target in the radar image into a neural network model obtained through pre-training;

performing feature extraction on the first candidate target by using a first level in the neural network model to obtain a first target feature; performing feature extraction on the second candidate target by using a second level in the neural network model to obtain a second target feature;

fusing the first target feature and the second target feature by using a third level in the neural network model to obtain a first fused feature;

the false detection identification module is specifically configured to:

classify the first fused feature by using a fourth level in the neural network model to obtain a classification result indicating whether the first candidate target and/or the second candidate target is a false detection target.

In order to achieve the above object, an embodiment of the present invention further provides an electronic device, including a processor and a memory;

a memory for storing a computer program;

and a processor for implementing any one of the above-described object detection methods when executing the program stored in the memory.

In order to achieve the above object, an embodiment of the present invention further provides a target detection system, including: a camera, a radar, and a detection device; wherein the camera and the radar perform data acquisition for the same scene;

the detection device is used for acquiring a visible light image collected by the camera and radar data collected by the radar, wherein the radar data comprises a radar image; detecting a target in the visible light image as a first candidate target; detecting a second candidate target based on the radar data; matching the first candidate target with the second candidate target; if the matching is successful, extracting a feature of the first candidate target in the visible light image as a first target feature, and extracting a feature of the second candidate target in the radar image as a second target feature; performing feature fusion on the first target feature and the second target feature to obtain a first fused feature; and performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

To achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any one of the above object detection methods.

By applying the embodiment of the present invention, a target detected in the visible light image is matched with a target detected based on the radar data; after the matching succeeds, feature extraction and feature fusion are performed on the two targets, and whether a detected target is a false detection target is judged based on the fused feature. In this scheme, target detection combines the visible light image and the radar data; the two kinds of data carry richer features than either kind alone, so target detection using features extracted from both is more accurate. Moreover, identifying false detection targets based on both the visible light image and the radar data reduces the target false detection rate and further improves detection accuracy.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a first schematic flowchart of a target detection method according to an embodiment of the present invention;

Fig. 2 is a second schematic flowchart of a target detection method according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a virtual device according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an image processing module according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a radar processing module according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of a target matching process according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of a process of feature extraction, fusion, and classification according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of an object detection apparatus according to an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;

Fig. 10 is a schematic structural diagram of a target detection system according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to solve the above technical problems, embodiments of the present invention provide a target detection method, apparatus, device, and system. The method and apparatus may be applied to vehicle-mounted devices, or to various other electronic devices such as mobile phones and computers; no specific limitation is imposed. The target detection method provided by the embodiment of the present invention is first described in detail below.

Fig. 1 is a first flowchart of a target detection method according to an embodiment of the present invention, including:

S101: acquiring a visible light image and radar data, wherein the radar data comprises a radar image.

For example, if the executing subject is an on-board device, the on-board device may be communicatively connected to an on-board camera and an on-board radar, obtaining the visible light image through the on-board camera and the radar data through the on-board radar. Alternatively, the on-board device may itself include a camera component and a radar component, acquiring the visible light image through the camera component and the radar data through the radar component. The executing subject may also be another electronic device that acquires visible light images captured by a camera and radar data collected by a radar. For convenience of description, the executing subject is referred to below as the electronic device.

In one case, the radar data may include raw data and a power distribution map. For example, the radar may transmit a modulated continuous wave and receive the echo signal reflected back; the radar then samples the echo signal to obtain a data block, which may be understood as the raw data.

In one embodiment, the electronic device may obtain raw data sent by a radar; and performing conversion processing on the original data by utilizing a Fourier transform algorithm to obtain a power distribution diagram, and obtaining radar data comprising the original data and the power distribution diagram.

For example, the radar transmits the raw data to the electronic device, and the electronic device may perform time-frequency conversion on the raw data using a Fourier transform algorithm, an FFT (Fast Fourier Transform) algorithm, or the like, to obtain a power distribution map that expresses range and velocity, also called a range-velocity power distribution map.

For example, the color of a pixel in the power distribution map indicates the power level: a background object with power 0 may appear blue, while a moving object with higher power may appear red; the specific form of the power distribution map is not limited.
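
As a hedged sketch (the patent does not give the FFT details), the following Python code computes a range-velocity power map of the kind described above from a block of raw FMCW radar samples with two FFTs; the array layout and sizes are assumptions for this example.

```python
import numpy as np

def range_velocity_power_map(raw_block: np.ndarray) -> np.ndarray:
    """Convert a raw radar data block into a range-velocity power map.

    raw_block: complex samples with shape (num_chirps, samples_per_chirp),
               i.e. slow time x fast time (shapes are assumed for this sketch).
    Returns a 2-D array of power values: rows = velocity (Doppler) bins,
    columns = range bins.
    """
    # FFT over fast time (within each chirp) resolves range.
    range_fft = np.fft.fft(raw_block, axis=1)
    # FFT over slow time (across chirps) resolves radial velocity (Doppler).
    doppler_fft = np.fft.fftshift(np.fft.fft(range_fft, axis=0), axes=0)
    # Power of each range-velocity cell; this is the "power distribution map".
    return np.abs(doppler_fft) ** 2

# Example: 128 chirps of 256 samples each (synthetic noise stands in for echoes).
raw = np.random.randn(128, 256) + 1j * np.random.randn(128, 256)
power_map = range_velocity_power_map(raw)  # shape (128, 256)
```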

The visible light image obtained in S101 may be a sequence of video frames or a single frame, and the radar data obtained in S101 may likewise be continuous or single-frame. The visible light image acquired in S101 depicts the same scene as that targeted by the radar data.

S102: detecting a target in the visible light image as a first candidate target; and detecting a second candidate target based on the radar data.

For example, the detected target may be a vehicle, an obstacle, a person, and the like, and is not limited in particular. Since the target detected in S102 may be a false detection target and is not the final detection result, the target detected in S102 is referred to as a candidate target.

For the visible light image: if the visible light image acquired in S101 is a single frame, the target in it may be detected using a target detection algorithm; for convenience of description, a target detected based on the visible light image is referred to as a first candidate target. The detection yields a target frame (bounding box) in the visible light image.

If the visible light image obtained in S101 is a sequence of video frames, each frame may be detected with a target detection algorithm to obtain the target frame, and the first candidate target may then be tracked across the frames using monocular ranging and monocular tracking algorithms, or other target tracking algorithms, based on the calibration data of the camera. By tracking the first candidate target, its speed can be obtained; for convenience of description, this speed is referred to as the first speed.

For the radar data, the raw data may be processed through Constant False Alarm Rate (CFAR) detection, clustering or Non-Maximum Suppression (NMS), parameter calculation, and other steps to obtain information such as the position, speed, and angle of the second candidate target. For convenience of description, a target detected based on the radar data is referred to as a second candidate target, and its speed as the second speed. A minimal CFAR sketch is given below.
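
The CFAR step is only named, not specified; a minimal 1-D cell-averaging CFAR (one common variant) over a row of the power map could look like this, with the training/guard cell counts and threshold factor chosen arbitrarily for the sketch.

```python
import numpy as np

def ca_cfar_1d(power, num_train=8, num_guard=2, scale=4.0):
    """Cell-averaging CFAR over a 1-D power profile.

    For each cell, the noise level is estimated from `num_train` training
    cells on each side (skipping `num_guard` guard cells around the cell
    under test); the cell is declared a detection if its power exceeds
    `scale` times that estimate. Parameter values are illustrative.
    """
    n = len(power)
    detections = np.zeros(n, dtype=bool)
    half = num_train + num_guard
    for i in range(half, n - half):
        lead = power[i - half : i - num_guard]          # leading training cells
        lag = power[i + num_guard + 1 : i + half + 1]   # lagging training cells
        noise = (lead.sum() + lag.sum()) / (2 * num_train)
        detections[i] = power[i] > scale * noise
    return detections
```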

Alternatively, a Kalman filtering tracking algorithm may be used to track the second candidate target and obtain its position, speed, angle, and other information. The Kalman filtering tracking algorithm suppresses falsely detected targets and improves detection accuracy. One possible formulation is sketched below.
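
The Kalman formulation is likewise not spelled out in the text; the sketch below assumes a constant-velocity state [x, y, vx, vy] updated from radar position measurements, which is one plausible instantiation rather than the patent's definition.

```python
import numpy as np

class KalmanCV:
    """Constant-velocity Kalman filter for one radar target (illustrative)."""

    def __init__(self, dt=0.1):
        self.x = np.zeros(4)                      # state: [x, y, vx, vy]
        self.P = np.eye(4) * 10.0                 # state covariance
        self.F = np.eye(4)                        # transition: x += vx*dt, ...
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                     # we measure position only
        self.Q = np.eye(4) * 0.01                 # process noise (assumed)
        self.R = np.eye(2) * 0.5                  # measurement noise (assumed)

    def step(self, z):
        """One predict/update cycle with a position measurement z = [x, y]."""
        # Predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update
        y = np.asarray(z) - self.H @ self.x       # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x                             # smoothed position + velocity
```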

S103: matching the first candidate target with the second candidate target. If the matching is successful, S104 is executed.

As described above, the visible light image acquired in S101 depicts the same scene as that targeted by the radar data; ideally, the first candidate targets detected in the visible light image and the second candidate targets detected based on the radar data correspond one to one.

For example, assume that the camera and the radar are both aimed at scene A, in which there are three targets: target 1, target 2, and target 3. The visible light image collected by the camera for scene A is acquired, and radar data is acquired through the radar. Target detection is performed on the visible light image; assuming a missed detection occurs, only two first candidate targets are detected: target 1 and target 2. Target detection is performed based on the radar data; assuming a false detection occurs, four second candidate targets are detected: target 1, target 2, target 3, and target 4, where target 4 is a false detection target.

The two first candidate targets detected in the visible light image are then matched against the four second candidate targets detected based on the radar data.

In one embodiment, S103 may include: converting the first candidate target and the second candidate target into the same coordinate system; determining, in the same coordinate system, whether the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold; and if so, determining that the first candidate target and the second candidate target are successfully matched.

This embodiment is illustrated by continuing the above example. The two first candidate targets detected in the visible light image and the four second candidate targets detected based on the radar data are first converted into the same coordinate system. Suppose target 1 among the first candidate targets is processed first: for each of the four second candidate targets, it is determined whether the distance between the position of target 1 and the position of that second candidate target is smaller than the preset distance threshold. Assuming the distance to target 1 among the second candidate targets is smaller than the preset distance threshold while the distances to the other three second candidate targets are not, target 1 among the first candidate targets is successfully matched with target 1 among the second candidate targets.

Target 2 among the first candidate targets is then processed in the same way: assuming the distance between its position and the position of target 2 among the second candidate targets is smaller than the preset distance threshold while the distances to the other three second candidate targets are not, target 2 among the first candidate targets is successfully matched with target 2 among the second candidate targets.

After every first candidate target has been processed, the matching process may be considered finished. Alternatively, the four second candidate targets may each be matched in turn; the processing is similar and is not repeated here. A coordinate-conversion and distance-matching sketch follows below.
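
For illustration only (not from the original disclosure), the sketch below converts radar (range, azimuth) measurements to a ground-plane Cartesian frame and applies the preset distance threshold; the threshold value, the greedy pairing strategy, and the assumption that camera detections have already been projected into the same frame are all choices made for this example.

```python
import numpy as np

DIST_THRESHOLD = 2.0  # preset distance threshold in metres (illustrative)

def radar_to_cartesian(rng, azimuth_rad):
    """Convert a radar (range, azimuth) measurement to ground-plane x/y."""
    return np.array([rng * np.sin(azimuth_rad), rng * np.cos(azimuth_rad)])

def match_by_distance(cam_positions, radar_positions):
    """Greedy position matching: a camera target and a radar target match
    if their distance in the shared coordinate system is below the threshold.
    `cam_positions` are assumed already converted (via camera calibration)
    into the same ground-plane frame as the radar targets."""
    matches, used = [], set()
    for j, cp in enumerate(cam_positions):
        for i, rp in enumerate(radar_positions):
            if i in used:
                continue
            if np.linalg.norm(np.asarray(cp) - np.asarray(rp)) < DIST_THRESHOLD:
                matches.append((j, i))  # (first candidate, second candidate)
                used.add(i)
                break
    return matches
```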

In some related schemes, the entire visible light image and all of the radar data are converted into the same coordinate system, that is, every pixel in the visible light image and every coordinate point in the radar data undergo coordinate conversion; this requires a large amount of calculation and yields poor conversion accuracy. In this embodiment, only a partial region of the visible light image and part of the radar data, namely the image region where the first candidate target is located and the radar data related to the second candidate target, undergo coordinate conversion, which reduces the amount of calculation and improves conversion accuracy.

In another embodiment, the method further includes: tracking the first candidate target in the visible light image to obtain the speed of the first candidate target as a first speed; determining the speed of the second candidate target as a second speed based on the radar data; and determining whether the difference between the first speed and the second speed is smaller than a preset speed difference threshold. In this embodiment, if the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold and the difference between the first speed and the second speed is smaller than a preset speed difference threshold, it is determined that the first candidate target and the second candidate target are successfully matched.

As described above, if the visible light image acquired in S101 is a sequence of video frames, the first candidate target may be tracked across the frames using monocular ranging and monocular tracking algorithms, or other target tracking algorithms, based on the calibration data of the camera, to obtain the speed of the first candidate target as the first speed. In addition, the raw data sent by the radar may be processed through constant false alarm rate detection, clustering or non-maximum suppression, parameter calculation, and other steps to obtain the speed of the second candidate target, referred to as the second speed; alternatively, the speed of the second candidate target may be obtained using a Kalman filtering tracking algorithm.

Thus, target matching can be performed based on both the position and the speed of the target. For example, in one case, whether a target is static or dynamic may be determined based on its speed. If the distance between the position of the first candidate target and the position of the second candidate target is smaller than the preset distance threshold and both candidates are static targets, they may be considered successfully matched. Conversely, if the distance is smaller than the preset distance threshold but the first candidate target is static while the second candidate target is dynamic, the two are considered not to match.

Alternatively, in another case, the target speed may be compared more precisely by setting a speed difference threshold. If the distance between the position of the first candidate target and the position of the second candidate target is smaller than the preset distance threshold and the difference between the first speed and the second speed is smaller than the preset speed difference threshold, the two candidates may be considered successfully matched; if the distance is below the threshold but the speed difference is not, they are considered not to match. A minimal gating check along these lines is sketched below.
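
A minimal gating check combining both thresholds might read as follows; the numeric threshold values are placeholders for this sketch.

```python
import numpy as np

DIST_THRESHOLD = 2.0   # metres, illustrative
SPEED_THRESHOLD = 1.0  # metres/second, illustrative

def is_match(cam_pos, cam_speed, radar_pos, radar_speed):
    """A first/second candidate pair matches only if BOTH the position gap
    and the speed gap fall below their preset thresholds."""
    close = np.linalg.norm(np.asarray(cam_pos) - np.asarray(radar_pos)) < DIST_THRESHOLD
    similar = abs(cam_speed - radar_speed) < SPEED_THRESHOLD
    return close and similar
```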

For example, assuming there are N first candidate targets, they may be represented as {V_j}, where j denotes the index of a first candidate target and V_j denotes the j-th first candidate target; similarly, the M second candidate targets may be represented as {R_i}, where i denotes the index of a second candidate target and R_i denotes the i-th second candidate target; M and N are both positive integers. The matching process may be understood as finding a matching relationship {{R_i, V_j}_k} between the first candidate targets and the second candidate targets, where k denotes the index of a successfully matched target pair, a target pair comprises one first candidate target and one second candidate target, and {R_i, V_j}_k denotes the k-th target pair.

In one case, a cost function may be constructed based on the distance between the position of the first candidate target and the position of the second candidate target and the difference between the first speed and the second speed; a cost matrix of pairwise matches between the first candidate targets and the second candidate targets is constructed based on the cost function; and the cost matrix is solved with the Hungarian algorithm to obtain the optimal matching result, i.e., the matching relationship {{R_i, V_j}_k} between the first candidate targets and the second candidate targets.
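
The patent names the Hungarian algorithm but gives no formula for the cost function; the following sketch assumes a simple weighted sum of position distance and speed difference, and uses SciPy's linear_sum_assignment as the Hungarian solver. The weights w_pos and w_speed and the gating value max_cost are illustrative choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_targets(cam_pos, cam_speed, radar_pos, radar_speed,
                  w_pos=1.0, w_speed=0.5, max_cost=5.0):
    """Build the N x M cost matrix from position distance and speed
    difference, then solve the assignment with the Hungarian algorithm.
    Positions are 2-D points; all weights here are assumptions."""
    N, M = len(cam_pos), len(radar_pos)
    cost = np.zeros((N, M))
    for j in range(N):
        for i in range(M):
            d = np.linalg.norm(np.asarray(cam_pos[j]) - np.asarray(radar_pos[i]))
            dv = abs(cam_speed[j] - radar_speed[i])
            cost[j, i] = w_pos * d + w_speed * dv
    rows, cols = linear_sum_assignment(cost)
    # Keep only pairs whose cost is acceptable; the rest stay unmatched.
    return [(j, i) for j, i in zip(rows, cols) if cost[j, i] < max_cost]
```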

In one embodiment, the first candidate target may be timestamp-aligned with the second candidate target; S103 may then include: matching the first candidate target and the second candidate target that have the same timestamp.

In this embodiment, performing timestamp alignment on the first candidate target and the second candidate target before matching them can improve the matching accuracy.

For example, the electronic device (the executing subject) may determine whether the data received from the camera and the data received from the radar are synchronized in time. If they are, then for the visible light image and the radar data at the same time, the first candidate target detected based on the visible light image is matched with the second candidate target detected based on the radar data. If they are not synchronized, the radar and the camera may be configured for synchronization; alternatively, a visible light image and radar data having the same timestamp may be identified among the unsynchronized data, and the first candidate target detected from that visible light image is then matched with the second candidate target detected from that radar data.
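
As an illustration of the timestamp alignment (the patent does not fix a tolerance), the sketch below pairs each image frame with the nearest radar frame in time and keeps a pair only when the gap is within a tolerance; the frame representation and tolerance value are assumptions of this example.

```python
def align_by_timestamp(image_frames, radar_frames, tolerance=0.02):
    """Pair each image frame with the radar frame closest in time.

    Each element is assumed to be a (timestamp_seconds, data) tuple; a pair
    is kept only if the timestamps differ by less than `tolerance` seconds
    (value illustrative). Both lists are assumed sorted by timestamp.
    """
    pairs, i = [], 0
    for t_img, img in image_frames:
        # Advance the radar pointer while the next frame is closer in time.
        while i + 1 < len(radar_frames) and \
                abs(radar_frames[i + 1][0] - t_img) <= abs(radar_frames[i][0] - t_img):
            i += 1
        t_rad, rad = radar_frames[i]
        if abs(t_rad - t_img) < tolerance:
            pairs.append((img, rad))
    return pairs
```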

S104: extracting a feature of the first candidate target in the visible light image as a first target feature, and extracting a feature of the second candidate target in the radar image as a second target feature.

As described above, the radar data acquired in S101 includes a radar image, so feature extraction may be performed on two images: the visible light image and the radar image.

For example, features of the target may be extracted using algorithms such as color histograms, wavelet transforms, and invariant moments, or using a deep learning algorithm; the specific algorithm is not limited. For convenience of description, the feature extracted for the first candidate target is referred to as the first target feature, and the feature extracted for the second candidate target as the second target feature.

S105: performing feature fusion on the first target feature and the second target feature to obtain a first fused feature.

The first target feature and the second target feature are then fused to obtain the first fused feature. For example, the two features may be fused in a weighted manner, or fused using a fusion level of a neural network model; the specific fusion manner is not limited.
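
Both fusion options mentioned above can be pictured in a few lines; the weight value and feature dimensions below are arbitrary choices for the example, not values from the patent.

```python
import numpy as np

def fuse_weighted(f_img, f_radar, alpha=0.7):
    """Weighted fusion: both feature vectors must share one dimension.
    The weight alpha is an illustrative choice."""
    return alpha * f_img + (1.0 - alpha) * f_radar

def fuse_concat(f_img, f_radar):
    """Concatenation fusion, as a fusion level of a network might do."""
    return np.concatenate([f_img, f_radar])

f1 = np.random.rand(128)        # first target feature (visible light image)
f2 = np.random.rand(128)        # second target feature (radar image)
fused = fuse_concat(f1, f2)     # first fused feature, length 256
```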

S106: performing false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

Performing false detection identification on the first candidate target and/or the second candidate target means judging whether the first candidate target and/or the second candidate target is a false detection target. Based on the recognition result of S106, false detection targets can be removed from the detection result, improving the accuracy of target detection.

In one case, a classification model obtained by pre-training may be used to classify the first fused feature, and the classification result may include "is a false detection target" and "is not a false detection target", so the false detection target can be identified from the classification result. Alternatively, the classification result may further include a specific object category, such as whether the target is a vehicle or an obstacle; the specific classification result is not limited.

In one embodiment, if there is a second candidate target that is not successfully matched in S103, the unmatched second candidate target may be mapped into the visible light image according to calibration data between the camera and the radar, where the camera is the camera used to acquire the visible light image and the radar is the radar used to collect the radar data. A missed first candidate target is then determined in the visible light image according to the mapping result; a feature of the missed first candidate target is extracted from the visible light image as a third target feature, and a feature of the unmatched second candidate target is extracted from the radar image as a fourth target feature; feature fusion is performed on the third target feature and the fourth target feature to obtain a second fused feature; and false detection identification is performed, based on the second fused feature, on the unmatched second candidate target and/or the missed first candidate target.

Here, performing false detection identification on the unmatched second candidate target and/or the missed first candidate target means judging whether the unmatched second candidate target and/or the missed first candidate target is a false detection target.

Due to the influence of environmental factors, missed targets may exist in the visible light image. Continuing the above example: target detection is performed on the visible light image and, with a missed detection, only two first candidate targets are detected: target 1 and target 2; with a false detection, four second candidate targets are detected: target 1, target 2, target 3, and target 4, where target 4 is a false detection target. In this case, two second candidate targets are not successfully matched: target 3 and target 4.

In this embodiment, the two unmatched second candidate targets (target 3 and target 4) are mapped into the visible light image according to the calibration data between the camera and the radar, and the corresponding regions of the visible light image can be detected again according to the mapping result. For example, the confidence threshold may be lowered during re-detection so that a missed first candidate target can be recovered. Through re-detection, a missed first candidate target, target 3, is determined in the visible light image. In this case, the unmatched second candidate target "target 3" and the missed first candidate target "target 3" form a successfully matched target pair, and feature extraction, feature fusion, and classification are performed for this target pair; the processing is similar to the above and is not repeated here.

By applying this embodiment, the unmatched second candidate target is mapped into the visible light image, and a missed first candidate target is determined in the visible light image according to the mapping result, thereby reducing missed detections in the visible light image. A mapping and re-detection sketch is given below.
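
The mapping and re-detection steps can be pictured with the following sketch; the 3x3 homography H (standing in for the camera-radar calibration data), the `detector` callable, the ROI size, and the lowered confidence threshold are all hypothetical stand-ins introduced for this example.

```python
import numpy as np

def project_radar_to_image(radar_xy, H):
    """Map a radar ground-plane point into pixel coordinates using a 3x3
    homography H obtained from camera-radar calibration (assumed given)."""
    p = H @ np.array([radar_xy[0], radar_xy[1], 1.0])
    return p[:2] / p[2]  # pixel coordinates (u, v)

def redetect_around(image, center_uv, detector, roi_size=96, low_conf=0.3):
    """Re-run the detector on a region around the projected point with a
    lowered confidence threshold, to recover a possibly missed target.
    `detector` is a hypothetical callable(image_patch, conf_threshold)."""
    u, v = int(center_uv[0]), int(center_uv[1])
    half = roi_size // 2
    patch = image[max(v - half, 0): v + half, max(u - half, 0): u + half]
    return detector(patch, conf_threshold=low_conf)
```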

In one embodiment, a neural network model may be used for feature extraction, feature fusion, and classification. For example, a first candidate target in the visible light image and a second candidate target in the radar image may be input to a neural network model obtained through pre-training; performing feature extraction on the first candidate target by using a first level in the neural network model to obtain a first target feature; performing feature extraction on the second candidate target by using a second level in the neural network model to obtain a second target feature; fusing the first target feature and the second target feature by using a third level in the neural network model to obtain a first fused feature; and classifying the first fused features by utilizing a fourth level in the neural network model to obtain a classification result of whether the first candidate target and/or the second candidate target is a false detection target.

In the present embodiment, the neural network model includes four levels, or may be considered to include four sub-neural network models.

Compared with the visible light image, the radar image is more intuitive. In one case, therefore, the first candidate target in the visible light image may be processed with a network structure of higher complexity, and the second candidate target in the radar image with a network structure of lower complexity; this makes the network structure more reasonable and the feature extraction more effective. For example, the first level may employ a ResNet-50 (residual network) structure, and the second level a ResNet-18 structure. The ResNet structure introduces residual connections and has good classification performance.

For example, in the ResNet network, the first target feature and the second target feature may be concatenated (concat) before global average pooling (avg-pooling) is performed; this concatenation is the feature fusion.

The first fused feature is input into the fourth level, which can be understood as a classifier that outputs the classification result. The classification result may include "is a false detection target" and "is not a false detection target", so the false detection target can be identified from the classification result. Alternatively, the classification result may further include a specific object category, such as whether the target is a vehicle or an obstacle; the specific classification result is not limited.
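
As a non-authoritative sketch of the four-level model described above (the patent fixes neither input sizes nor head design), the following PyTorch code builds a ResNet-50 branch for the visible-light target crop and a ResNet-18 branch for the radar-image crop, concatenates the feature maps before global average pooling as the text suggests, and classifies the fused feature. Treating the radar crop as a 3-channel 224x224 image is an assumption of this example.

```python
import torch
import torch.nn as nn
from torchvision import models

class FusionClassifier(nn.Module):
    """Four-level sketch: ResNet-50 branch (visible light), ResNet-18 branch
    (radar image), channel-wise concat before avg-pooling, then a classifier."""

    def __init__(self):
        super().__init__()
        r50 = models.resnet50(weights=None)  # first level: higher complexity
        r18 = models.resnet18(weights=None)  # second level: lower complexity
        # Keep each backbone up to its last conv stage (drop avgpool and fc).
        self.branch_img = nn.Sequential(*list(r50.children())[:-2])  # (B,2048,7,7)
        self.branch_rad = nn.Sequential(*list(r18.children())[:-2])  # (B,512,7,7)
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling
        # Fourth level: "is a false detection target" vs "is not".
        self.classifier = nn.Linear(2048 + 512, 2)

    def forward(self, img_crop, radar_crop):
        f1 = self.branch_img(img_crop)       # first target feature maps
        f2 = self.branch_rad(radar_crop)     # second target feature maps
        fused = torch.cat([f1, f2], dim=1)   # third level: concat fusion
        pooled = self.pool(fused).flatten(1)
        return self.classifier(pooled)       # classification logits

model = FusionClassifier()
logits = model(torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224))
```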

In one case, after obtaining the classification result, the neural network model may be corrected based on the classification result to obtain a more accurate neural network model.

In the above embodiment, the unmatched second candidate target and the first candidate target determined to have been missed in the visible light image also form a successfully matched target pair, so this target pair may likewise be input into the pre-trained neural network model.

That is, the missed first candidate target determined in the visible light image and the unmatched second candidate target in the radar image are input into the pre-trained neural network model; feature extraction is performed on the missed first candidate target using the first level of the neural network model to obtain a third target feature; feature extraction is performed on the unmatched second candidate target using the second level to obtain a fourth target feature; the third target feature and the fourth target feature are fused using the third level to obtain a second fused feature; and the second fused feature is classified using the fourth level to obtain a classification result indicating whether the unmatched second candidate target and/or the missed first candidate target determined in the visible light image is a false detection target.

The processing of the missed first candidate target in the visible light image and the unmatched second candidate target in the radar image by the neural network model is similar to that of a successfully matched pair of first and second candidate targets, and is not repeated here.

By applying the embodiment shown in Fig. 1, a target detected in the visible light image is matched with a target detected based on the radar data; after the matching succeeds, feature extraction and feature fusion are performed on the two targets, and whether a detected target is a false detection target is judged based on the fused feature. In this scheme, target detection combines the visible light image and the radar data; the two kinds of data carry richer features than either kind alone, so target detection using features extracted from both is more accurate. Moreover, identifying false detection targets based on both the visible light image and the radar data reduces the target false detection rate and further improves detection accuracy.

Fig. 2 is a schematic flow chart of a second target detection method according to an embodiment of the present invention, including:

s201: and acquiring a visible light image acquired by the camera and original data sent by the radar.

In one case, the radar data may include the raw data and a power distribution map. For example, the radar may transmit a modulated continuous wave and then receive the echo signal of the modulated continuous wave after reflection; the radar then samples the echo signal to obtain a block of data, which can be understood as the raw data. The radar transmits the raw data to the electronic device (the execution subject).

The visible light image obtained in S201 may be continuous video frame images or a single-frame image, and the raw data obtained in S201 may be continuous data or single-frame raw data. The visible light image and the raw data obtained in S201 are directed to the same scene.

S202: and performing conversion processing on the original data by utilizing a Fourier transform algorithm to obtain a power distribution diagram as a radar image.

For example, after the radar transmits the raw data to the electronic device, the electronic device may perform time-frequency conversion on the raw data by using a Fourier transform algorithm, an FFT (Fast Fourier Transform) algorithm, or the like, to obtain a power distribution map that expresses distance and velocity, also called a range-velocity power distribution map.
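As a rough illustration of this conversion (a sketch only, assuming one frame of raw data is a num_chirps × samples_per_chirp array; windowing, calibration, and clutter-removal details vary by radar), the range-velocity power map could be computed as follows:

```python
import numpy as np

def range_doppler_map(raw, window=True):
    """Sketch: turn one frame of FMCW raw data of shape
    (num_chirps, samples_per_chirp) into a range-velocity power map."""
    if window:
        raw = raw * np.hanning(raw.shape[1])[None, :]   # range window
        raw = raw * np.hanning(raw.shape[0])[:, None]   # Doppler window
    range_fft = np.fft.fft(raw, axis=1)                 # 1st FFT: distance
    doppler_fft = np.fft.fft(range_fft, axis=0)         # 2nd FFT: velocity
    # center zero velocity and convert to dB power
    power = 20 * np.log10(np.abs(np.fft.fftshift(doppler_fft, axes=0)) + 1e-12)
    return power  # power distribution over (velocity, range)
```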

For example, the radar image may be the power distribution map, in which the color of a pixel indicates the power level: the power of a background object is 0 and its color may be blue, while the power of a moving object is higher and its color may be red, and the like; the specific power distribution map is not limited.

S203: detecting a first candidate target in the visible light image, and tracking the first candidate target to obtain a first speed; and determining a velocity of the second candidate target as the second velocity based on the radar data.

For example, the detected target may be a vehicle, an obstacle, and the like, and is not limited in particular. Since the target detected in S203 may be a false detection target and is not the final detection result, the target detected in S203 is referred to as a candidate target.

For the visible light image, the visible light image acquired in S201 may be a continuous video frame image, and each frame of the visible light image may be detected by using a target detection algorithm to obtain a target frame in each frame of the visible light image; and then tracking the first candidate target in the continuous video frame images by using algorithms such as monocular distance measurement, monocular tracking and the like or other target tracking algorithms based on the calibration data of the camera. By tracking the first candidate target, the velocity of the first candidate target may be obtained, and for convenience of description, the velocity is referred to as a first velocity.

For radar data, the raw data may be processed through the steps of Constant False Alarm Rate (CFAR), clustering or Non-Maximum Suppression (NMS), parameter calculation, and the like, to obtain information such as a position, a speed, and an angle of a second candidate target, and for convenience of description, the speed of the second candidate target is referred to as a second speed.
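The constant false alarm rate step might, for example, be the cell-averaging variant sketched below; the guard/training cell counts and the scale factor are illustrative, as the embodiment does not specify the CFAR variant:

```python
import numpy as np

def ca_cfar_1d(power, guard=2, train=8, scale=3.0):
    """Sketch of cell-averaging CFAR along one dimension of the power map.
    A cell is a detection when it exceeds scale times the mean power of
    its training cells (guard cells are excluded from the estimate)."""
    n = len(power)
    detections = np.zeros(n, dtype=bool)
    half = guard + train
    for i in range(half, n - half):  # edge cells are simply skipped here
        window = np.r_[power[i - half:i - guard],
                       power[i + guard + 1:i + half + 1]]
        noise = window.mean()                  # local noise estimate
        detections[i] = power[i] > scale * noise
    return detections
```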

Or, in another case, a kalman filtering tracking algorithm may be adopted to track the second candidate target, so as to obtain information such as the position, the speed, the angle, and the like of the second candidate target. By adopting the Kalman filtering tracking algorithm, the detected false target can be suppressed, and the detection accuracy of the target is improved.
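A minimal constant-velocity Kalman filter for one radar target might look like the following sketch; the state layout [x, y, vx, vy], the assumption that the radar measurement supplies both position and velocity, and the noise levels are all illustrative:

```python
import numpy as np

class CVKalman:
    """Sketch of a constant-velocity Kalman filter for one radar target."""
    def __init__(self, x0, dt=0.05):
        self.x = np.asarray(x0, dtype=float)   # state [x, y, vx, vy]
        self.P = np.eye(4)                     # state covariance
        self.F = np.eye(4)                     # constant-velocity motion model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(4)                     # assumed: radar measures pos + vel
        self.Q = np.eye(4) * 0.01              # process noise (illustrative)
        self.R = np.eye(4) * 0.1               # measurement noise (illustrative)

    def step(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update with the new radar measurement z = [x, y, vx, vy]
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x  # corrected position and second speed
```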

S204: and performing time stamp alignment on the first candidate target and the second candidate target, and converting the first candidate target and the second candidate target to the same coordinate system.

S204 may be understood as data alignment between the first candidate target and the second candidate target, where the alignment includes temporal alignment and spatial alignment.

For example, the electronic device (the execution subject) may determine whether the received data transmitted by the camera and the data transmitted by the radar are synchronized in time; if so, the subsequent steps are continued. If not, the radar and the camera can be configured for synchronization, and the subsequent steps are then executed.

Spatial alignment may include converting the first candidate target and the second candidate target to the same coordinate system. Taking a vehicle-mounted scene as an example, the first candidate target and the second candidate target may be converted into the vehicle coordinate system.
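Converting to the same coordinate system amounts to applying each sensor's extrinsic calibration; a minimal sketch, assuming a known rotation R and translation t from the sensor frame to the vehicle frame:

```python
import numpy as np

def to_vehicle_frame(points, R, t):
    """Sketch of spatial alignment: map sensor-frame points into the
    vehicle coordinate system with extrinsic calibration (R, t).

    points: (N, 3) positions in the sensor (camera or radar) frame
    R:      (3, 3) rotation from sensor frame to vehicle frame
    t:      (3,)   translation of the sensor origin in the vehicle frame
    """
    return points @ R.T + t

# usage: identity calibration leaves points unchanged
p = to_vehicle_frame(np.array([[10.0, 2.0, 0.0]]), np.eye(3), np.zeros(3))
```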

S205: in the same coordinate system, judging whether the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold; if so, S206 is performed; if not, it indicates that the first candidate target and the second candidate target are not successfully matched, in which case S210-S213 are performed.

S206: judging whether the difference value of the first speed and the second speed is smaller than a preset speed difference threshold value or not; if less, it means that the first candidate object is successfully matched with the second candidate object, in which case S207-S209 are performed, and if not, it means that the first candidate object is not successfully matched with the second candidate object, in which case S210-S213 are performed.

The execution order of S205 and S206 is not limited.
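Combining S205 and S206, the matching decision for one candidate pair might be sketched as follows; the threshold values are illustrative placeholders for the presets:

```python
import numpy as np

def is_match(pos1, speed1, pos2, speed2, dist_thresh=2.0, speed_thresh=1.5):
    """Sketch of the S205/S206 checks: a first and second candidate target
    match only when both the position gap and the speed gap are below
    their preset thresholds (values here are illustrative)."""
    dist_ok = np.linalg.norm(np.asarray(pos1) - np.asarray(pos2)) < dist_thresh
    speed_ok = abs(speed1 - speed2) < speed_thresh
    return dist_ok and speed_ok
```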

S207: and extracting the characteristic of a first candidate target in the visible light image as a first target characteristic, and extracting the characteristic of a second candidate target in the radar image as a second target characteristic.

S208: and performing feature fusion on the first target feature and the second target feature to obtain a first fused feature.

S209: and false detection and identification are carried out on the first candidate target and/or the second candidate target based on the first fused feature.

And performing false detection identification on the first candidate target and/or the second candidate target, namely judging whether the first candidate target and/or the second candidate target is a false detection target or not.

S210: mapping the second candidate target which is not successfully matched into the visible light image according to calibration data between the camera and the radar; and determining a first candidate target which is missed to be detected in the visible light image according to the mapping result.

S211: and extracting the characteristics of the missed first candidate target in the visible light image as third target characteristics, and extracting the characteristics of the unmatched second candidate target in the radar image as fourth target characteristics.

S212: and performing feature fusion on the third target feature and the fourth target feature to obtain a second fused feature.

S213: and based on the second fused feature, carrying out false detection and identification on the second candidate target which is not successfully matched and/or the first candidate target which is missed to be detected.

And performing false detection identification on the second candidate target which is not successfully matched and/or the first candidate target which is missed, namely judging whether the second candidate target which is not successfully matched and/or the first candidate target which is missed is/are a false detection target.

For example, various algorithms such as color histograms, wavelet transforms, and invariant moments may be used to extract features of the target, or a deep learning algorithm may be used; the specific algorithm is not limited. For convenience of description, the feature extracted from the first candidate target is referred to as the first target feature, and the feature extracted from the second candidate target is referred to as the second target feature.

And then fusing the first target feature and the second target feature to obtain a first fused feature. For example, the first target feature and the second target feature may be fused in a weighted manner, or the first target feature and the second target feature may also be fused by using a fusion hierarchy in the neural network model, and a specific fusion manner is not limited.
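As a sketch of the weighted option (assuming the two features were first brought to the same dimensionality; the weights are illustrative and could also be learned):

```python
import numpy as np

def weighted_fusion(f1, f2, w1=0.6, w2=0.4):
    """Sketch of weighted feature fusion: a convex combination of the
    first and second target features of equal dimensionality."""
    return w1 * np.asarray(f1) + w2 * np.asarray(f2)
```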

In one case, the classification model obtained by pre-training may be used to classify the first fused feature, and the classification result may include: is a false detection target, and is not a false detection target. Thus, the false detection target can be identified according to the classification result. Alternatively, the classification result may further include a specific object category, such as whether the object is a vehicle or an obstacle, and the like, and the specific classification result is not limited.

Based on the recognition results of S209 and S213, the false detection target in the detection result can be eliminated, and the accuracy of target detection can be improved.

By applying the embodiment shown in fig. 2 of the present invention: in the first aspect, target detection is performed by combining the visible light image and the radar data; the two kinds of data carry richer features than a single kind, so detection using features extracted from both is more accurate, and identifying false detection targets based on both the visible light image and the radar data reduces the target false detection rate and further improves detection accuracy. In the second aspect, target matching is performed based on both the position and the speed of the target, which improves matching accuracy. In the third aspect, the first candidate target and the second candidate target are aligned in time and converted into the same coordinate system before matching, which also improves matching accuracy. In the fourth aspect, the second candidate target that is not successfully matched is mapped into the visible light image, and the missed first candidate target in the visible light image is determined according to the mapping result, which reduces missed detections in the visible light image.

One specific embodiment is described below with reference to fig. 3-7:

Referring to fig. 3, the electronic device includes an image processing module, a radar processing module, a data alignment module, a feature fusion module, and a detection result output module. The image processing module is in communication connection with the camera, and the camera sends the collected visible light image to the image processing module. The radar processing module is in communication connection with the millimeter wave radar, and the millimeter wave radar sends the acquired original data to the radar processing module.

The image processing module analyzes each frame of the visible light image sent by the camera. The analysis process may include: detecting a target frame of a target such as a vehicle or an obstacle through a target detection algorithm. According to the calibration data of the camera and the detected target frame, the position information, speed information, and the like of the target can be obtained by utilizing algorithms such as monocular distance measurement and monocular speed measurement. For convenience of description, the target detected in the visible light image is referred to as a first candidate target, and the velocity of the first candidate target is referred to as a first velocity.

Specifically, referring to fig. 4, the image processing module may include: an off-line training unit, an on-line detection unit, and a first information acquisition unit.

An off-line training unit: a detection model for detecting visible light images may be trained. For example, some visible light image samples may be collected and manually calibrated, where manual calibration may be understood as framing the various target frames. Then, based on the manually calibrated images, a detection model is obtained through training with an image detection algorithm. The image detection algorithm may be YOLO (You Only Look Once) v2 (Version 2), which offers good real-time performance and accuracy.

An online detection unit: the detection model obtained by the off-line training unit can be used for processing the visible light image acquired by the camera in real time to obtain the target frame of the first candidate target.

A first information acquisition unit: the position and velocity information (first velocity) of the first candidate target may be obtained through monocular distance measurement and monocular velocity measurement algorithms. Specifically, tracking processing may be performed on the first candidate target detected by the on-line detection unit to obtain the trajectory of the first candidate target; the position and the first speed of the first candidate target may then be derived based on the trajectory. In addition, the first information acquisition unit may also convert the position and speed information (first speed) of the first candidate target into the vehicle coordinate system according to the calibration data of the camera, to facilitate subsequent data alignment.

The radar processing module processes the raw data sent by the millimeter wave radar to obtain a power distribution map. For example, the color of a pixel in the power distribution map indicates the power level: the power of a background object is 0 and its color may be blue, while the power of a moving object is higher and its color may be red; the specific power distribution map is not limited.

Specifically, referring to fig. 5, the radar processing module may include: a signal processing unit, a target tracking unit, and a second information acquisition unit.

A signal processing unit: a two-dimensional FFT (Fast Fourier Transform) operation can be carried out on the raw data sent by the millimeter wave radar, clutter is removed, and a range-velocity power distribution map is obtained. The raw data is processed through steps such as constant false alarm rate detection, clustering or non-maximum suppression, and parameter calculation, so that information such as the position, speed (second speed), and angle of the second candidate target can be obtained.

A target tracking unit: the second candidate target may be tracked using a Kalman filter tracking algorithm to obtain its trajectory, and the obtained position and second speed of the second candidate target may be corrected based on the trajectory. In this way, detected false targets can be suppressed and target detection accuracy improved.

A second information acquisition unit: according to the calibration parameters of the millimeter wave radar, the corrected position and second speed of the second candidate target can be converted into the vehicle coordinate system, so that data alignment can be performed subsequently.

A data alignment module: the data converted into the vehicle coordinate system described above (the position and the first velocity of the first candidate object, and the position and the second velocity of the second candidate object) may be subjected to an alignment process for subsequent fusion processing.

In some cases, since the refresh frame rates of the camera and the millimeter wave radar are different, temporal alignment processing needs to be performed on the camera-related data and the radar-related data, where the camera-related data includes the position and first speed of the first candidate target converted into the vehicle coordinate system, and the radar-related data includes the position and second speed of the second candidate target converted into the vehicle coordinate system. Additionally, data alignment may also include the matching process between the first candidate target and the second candidate target.

The specific alignment process can refer to fig. 6:

s601: the input data is the data converted into the vehicle coordinate system and comprises radar related data and camera related data, wherein the camera related data comprises: a position and a first velocity of the first candidate target, the radar-related data comprising: a position of the second candidate object and a second velocity.

S602: and judging whether the radar related data and the camera related data are synchronized in time.

If there is a frame rate difference between the millimeter wave radar and the camera, time asynchronism may result. If the time is not synchronized, a synchronization configuration may be performed, and the radar-related data and camera-related data are then re-input, i.e., the position and first velocity of the first candidate target and the position and second velocity of the second candidate target, converted into the vehicle coordinate system, are re-obtained. If the time is synchronized, S603 is performed.

S603: and matching the first candidate target and the second candidate target by using a matching algorithm.

For example, assuming that there are N first candidate targets, the first candidate targets may be represented as {V_j}, j = 1, …, N, where j denotes the sequence number of a first candidate target and V_j represents the j-th first candidate target. Similarly, there are M second candidate targets, which may be represented as {R_i}, i = 1, …, M, where i denotes the sequence number of a second candidate target and R_i represents the i-th second candidate target; M and N are both positive integers. The matching process may be understood as finding the matching relationship {{R_i, V_j}_k} between the first candidate targets and the second candidate targets, where k indexes the successfully matched target pairs, a target pair comprises one first candidate target and one second candidate target, and {R_i, V_j}_k denotes the k-th target pair.

In one case, a cost function may be constructed based on the distance between the position of the first candidate target and the position of the second candidate target, and on the difference between the first speed and the second speed; a cost matrix for pairwise matching of the first candidate targets and the second candidate targets is constructed based on the cost function; and the cost matrix is solved using the Hungarian algorithm to obtain the optimal matching result, i.e., the matching relationship {{R_i, V_j}_k} between the first candidate targets and the second candidate targets.
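A sketch of this cost-matrix construction and Hungarian solution, using scipy's linear_sum_assignment; the weights of the position and speed terms are illustrative, since the embodiment does not fix the exact cost function:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_targets(pos_v, vel_v, pos_r, vel_r, w_pos=1.0, w_vel=0.5):
    """Sketch of Hungarian matching between candidate targets.

    pos_v: (N, 2) positions of first candidate targets V_j; vel_v: (N,)
    pos_r: (M, 2) positions of second candidate targets R_i; vel_r: (M,)
    Returns the matched index pairs (i, j), i.e. the k-th pair {R_i, V_j}_k.
    """
    dist = np.linalg.norm(pos_r[:, None, :] - pos_v[None, :, :], axis=2)  # (M, N)
    dvel = np.abs(vel_r[:, None] - vel_v[None, :])                        # (M, N)
    cost = w_pos * dist + w_vel * dvel          # cost matrix of pairwise matching
    rows, cols = linear_sum_assignment(cost)    # optimal assignment
    return list(zip(rows, cols))
```

In practice a gating step would typically follow, rejecting assigned pairs whose cost still exceeds the distance/speed thresholds; such pairs count as unmatched.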

S604: and judging whether a first candidate target successfully matched with each second candidate target exists or not. If not, go to S605; if so, S606 is performed.

S605: and according to parameters jointly calibrated by the millimeter wave radar and the camera, projecting the second candidate target which is not successfully matched to a front view (visible light image) of the camera to obtain a target frame in the visible light image, wherein the target frame is used as a first candidate target for missed detection.

In some cases, due to poor imaging quality of the visible light image or limitation of the detection model, there may be missed first candidate targets in the visible light image. In this step, the second candidate target which is not successfully matched is projected into the visible light image, and the obtained target frame is the first candidate target which is possibly missed.
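The projection in S605 can be sketched with a standard pinhole model, assuming jointly calibrated extrinsics (R, t) and camera intrinsics K, and ignoring lens distortion:

```python
import numpy as np

def project_to_image(p_radar, R, t, K):
    """Sketch of S605: project an unmatched radar target into the camera
    front view using the joint radar-camera calibration.

    p_radar: (3,) target position in the radar frame
    R, t:    extrinsics mapping the radar frame into the camera frame
    K:       (3, 3) camera intrinsic matrix (pinhole model)
    Returns pixel coordinates (u, v), around which a target frame can be drawn.
    """
    p_cam = R @ p_radar + t          # radar frame -> camera frame
    u, v, w = K @ p_cam              # perspective projection
    return u / w, v / w
```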

S606: and outputting the matching relation of the first candidate target and the second candidate target.

The matching relationship includes the first candidate targets and second candidate targets successfully matched in S604, and also includes the second candidate targets unsuccessfully matched and the first candidate targets obtained by projecting them in S605; in other words, it is the matching relationship after data alignment.

A feature fusion module: feature extraction and fusion may be performed on the first candidate target and the second candidate target included in the matching relationship output in S606.

The feature fusion module may include two processes: off-line classification training and on-line classification. First, corresponding sample data is collected off-line and manually labeled, and a classification model is obtained by training with the manually labeled sample data. The classification model may then be used to process the data collected on-line, that is, to process the first candidate targets and second candidate targets having the matching relationship output in S606.

The specific processing procedure can refer to fig. 7, and includes: feature extraction, feature fusion and classified output.

Feature extraction: algorithms such as color histograms, wavelet transforms, invariant moments, etc., or deep learning algorithms such as Convolutional Neural Networks (CNN) may be used to extract features of the first candidate object in the visible light image and to extract features of the second candidate object in the radar image.

In one case, feature extraction may be performed on the first candidate target using the network structure of Resnet50, and on the second candidate target using the network structure of Resnet18. In the first aspect, the Resnet structure introduces residual connections and has good classification performance. In the second aspect, the radar image is simpler and more intuitive than the visible light image, so the first candidate target in the visible light image is processed using the network structure with higher complexity (Resnet50), while the second candidate target in the radar image is processed using the network structure with lower complexity (Resnet18); this makes the network structures more reasonable and the feature extraction effect better. In the third aspect, convolutional neural network feature extraction is strongly robust, learns essential features better, and yields features with stronger separability.
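A sketch of this asymmetric backbone choice using torchvision; the weights argument and the assumption that the radar power map is rendered as a 3-channel color image are illustrative (older torchvision versions use pretrained=False instead of weights=None):

```python
import torch.nn as nn
from torchvision.models import resnet50, resnet18

def make_backbones():
    """Sketch: truncate the torchvision ResNets before their avgpool/fc
    layers so each returns a feature map (2048 channels for the
    visible-light branch, 512 channels for the radar branch)."""
    image_backbone = nn.Sequential(*list(resnet50(weights=None).children())[:-2])
    radar_backbone = nn.Sequential(*list(resnet18(weights=None).children())[:-2])
    return image_backbone, radar_backbone
```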

Feature fusion: the features extracted from the first candidate target and the features extracted from the second candidate target are fused. In one case, before the global average pooling (avg pooling) in the Resnet network, a concatenation (concat) process, that is, feature fusion, may be performed on the features extracted from the first candidate target and the features extracted from the second candidate target.

Classification output: the fused features are classified using the classification model obtained by the above training, and a classification result is output. The classification result may include: being a false detection target, and not being a false detection target. The false detection target can thus be identified according to the classification result. Alternatively, the classification result may further include a specific object category, such as whether the target is a vehicle or an obstacle; the specific classification result is not limited.

A detection result output module: the classification result output by the feature fusion module can be utilized to correct the detection result obtained by the image processing module, for example, by eliminating false detection targets and supplementing missed targets, so as to improve the quality of target detection.

Corresponding to the above method embodiment, an embodiment of the present invention provides an object detection apparatus, as shown in fig. 8, including:

an obtaining module 801, configured to obtain a visible light image and radar data, where the radar data includes a radar image;

a detection module 802, configured to detect a target in the visible light image as a first candidate target; and detecting a second candidate target based on the radar data;

a matching module 803, configured to match the first candidate target with the second candidate target; if the matching is successful, triggering a feature fusion module 804;

A feature fusion module 804, configured to extract a feature of the first candidate target in the visible light image as a first target feature, and extract a feature of the second candidate target in the radar image as a second target feature; and perform feature fusion on the first target feature and the second target feature to obtain a first fused feature;

a false detection identification module 805, configured to perform false detection identification on the first candidate target and/or the second candidate target based on the first fused feature.

The detection module 802 may correspond to the image processing module and the radar processing module in the embodiment of fig. 3, the matching module 803 may correspond to the data alignment module in the embodiment of fig. 3, the feature fusion module 804 may correspond to the feature fusion module in the embodiment of fig. 3, and the false detection identification module 805 may correspond to the detection result output module in the embodiment of fig. 3.

As an embodiment, the matching module 803 is specifically configured to:

converting the first candidate target and the second candidate target to the same coordinate system;

in the same coordinate system, judging whether the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold value or not;

And if so, determining that the first candidate target is successfully matched with the second candidate target.

As an embodiment, the apparatus further comprises: a tracking module and a decision module (not shown), wherein,

the tracking module is used for tracking a first candidate target in the visible light image to obtain the speed of the first candidate target as a first speed; determining a velocity of a second candidate target as a second velocity based on the radar data;

the judging module is used for judging whether the difference value of the first speed and the second speed is smaller than a preset speed difference threshold value or not;

the matching module 803 is further configured to: determining that the first candidate target and the second candidate target are successfully matched if the distance between the position of the first candidate target and the position of the second candidate target is smaller than a preset distance threshold and the difference between the first speed and the second speed is smaller than a preset speed difference threshold.

As an embodiment, the matching module 803 is specifically configured to:

time stamp aligning the first candidate target with the second candidate target;

and matching the first candidate target with the same timestamp with the second candidate target.

As an embodiment, the apparatus further comprises: a mapping module and a determination module (not shown), wherein,

the mapping module is used for mapping the second candidate target which is not successfully matched into the visible light image according to calibration data between the camera and the radar; wherein the camera is a camera for acquiring the visible light image; the radar is used for collecting the radar data;

the determining module is used for determining a first candidate target which is missed to be detected in the visible light image according to a mapping result;

the feature fusion module 804 is further configured to extract features of the missed first candidate target in the visible light image, to serve as third target features, and extract features of the second candidate target that is not successfully matched in the radar image, to serve as fourth target features; performing feature fusion on the third target feature and the fourth target feature to obtain a second fused feature;

the false detection identification module 805 is further configured to perform false detection identification on the second candidate target that is not successfully matched and/or the first candidate target that is missed to be detected based on the second fused feature.

As an embodiment, the feature fusion module 804 is specifically configured to:

Inputting a first candidate target in the visible light image and a second candidate target in the radar image into a neural network model obtained through pre-training;

performing feature extraction on the first candidate target by using a first level in the neural network model to obtain a first target feature; performing feature extraction on the second candidate target by using a second level in the neural network model to obtain a second target feature;

fusing the first target feature and the second target feature by using a third level in the neural network model to obtain a first fused feature;

the false detection identification module is specifically configured to:

and classifying the first fused features by utilizing a fourth level in the neural network model to obtain a classification result of whether the first candidate target and/or the second candidate target is a false detection target.

As an embodiment, the obtaining module 801 is further configured to:

acquiring raw data sent by the radar;

and performing conversion processing on the raw data by utilizing a Fourier transform algorithm to obtain a power distribution map, thereby obtaining radar data comprising the raw data and the power distribution map.

By applying the embodiments of the present invention, the target detected in the visible light image is matched with the target detected based on the radar data; after the matching succeeds, feature extraction and feature fusion are performed on the two targets, and whether the detected target is a false detection target is judged based on the fused feature. In this scheme, target detection is performed by combining the visible light image and the radar data: the two kinds of data carry richer features than a single kind, so detection using features extracted from both is more accurate; moreover, identifying false detection targets based on both the visible light image and the radar data reduces the target false detection rate and further improves detection accuracy.

An embodiment of the present invention further provides an electronic device, as shown in fig. 9, including a processor 901 and a memory 902;

a memory 902 for storing a computer program;

the processor 901 is configured to implement any of the above-described object detection methods when executing the program stored in the memory 902.

The electronic device may be a vehicle-mounted device, and may also be applied to various devices such as a mobile phone and a computer, and is not particularly limited.

The Memory mentioned in the above electronic device may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.

An embodiment of the present invention further provides a target detection system, as shown in fig. 10, including: a camera, radar, and detection device; wherein the camera and the radar perform data acquisition for the same scene;

the detection equipment is used for acquiring visible light images acquired by the camera and radar data through the radar, and the radar data comprise radar images; detecting a target in the visible light image as a first candidate target; and detecting a second candidate target based on the radar data; matching the first candidate target with the second candidate target; if the matching is successful, extracting the feature of a first candidate target in the visible light image as a first target feature, and extracting the feature of a second candidate target in the radar image as a second target feature; performing feature fusion on the first target feature and the second target feature to obtain a first fused feature; and carrying out false detection and identification on the first candidate target and/or the second candidate target based on the first fused feature.

The detection device may also implement any of the above-described target detection methods. The radar may be a millimeter-wave radar, and is not particularly limited.

An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the computer program implements any one of the above-mentioned target detection methods.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, apparatus embodiments, device embodiments, system embodiments, and computer-readable storage medium embodiments are substantially similar to method embodiments and therefore are described with relative ease, as appropriate, with reference to the partial description of the method embodiments.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
