Target detection method and device, computer storage medium and electronic equipment

Document No.: 96189    Publication date: 2021-10-12

Note: This technology, "Target detection method and device, computer storage medium and electronic equipment" (目标检测方法及装置、计算机存储介质、电子设备), was designed and created by 杨磊, 吴凯 and 许新玉 on 2020-04-08. Its main content is as follows: The present disclosure relates to the technical field of artificial intelligence and provides a target detection method, a target detection apparatus, a computer storage medium, and an electronic device. The target detection method comprises: generating an exposure control signal for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar; controlling the corresponding camera module to perform an exposure operation according to the exposure control signal, so as to obtain image data of a target area; performing target detection on the image data using the image detection module associated with each camera module, so as to determine a two-dimensional bounding box of a target object in the image data; and determining a three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by radar scanning. The target detection method of the present disclosure not only enables image data from multiple viewing angles to be acquired and processed, but also minimizes processing delay and improves processing efficiency.

1. A method of object detection, comprising:

generating an exposure control signal for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar;

controlling the corresponding camera module to perform an exposure operation according to the exposure control signal, so as to obtain image data of a target area;

performing target detection on the image data using the image detection module associated with each camera module, so as to determine a two-dimensional bounding box of a target object in the image data;

and determining a three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by the radar scanning.

2. The method of claim 1, wherein the camera modules in the surround-view camera comprise at least a rear-view camera, a left-view camera, a front-view camera, and a right-view camera;

the image data of the plurality of viewing angles comprises at least a rear-view image, a left-view image, a front-view image, and a right-view image.

3. The method of claim 2, wherein when the rear-view camera is disposed at a first scanning angle of the radar, the left-view camera is disposed at a second scanning angle of the radar, the front-view camera is disposed at a third scanning angle of the radar, and the right-view camera is disposed at a fourth scanning angle of the radar, the method further comprises:

generating a first exposure control signal when the radar is detected to reach the first scanning angle, and controlling the rear-view camera to perform an exposure operation according to the first exposure control signal to obtain a rear-view image of the target area;

generating a second exposure control signal when the radar is detected to reach the second scanning angle, and controlling the left-view camera to perform an exposure operation according to the second exposure control signal to obtain a left-view image of the target area;

generating a third exposure control signal when the radar is detected to reach the third scanning angle, and controlling the front-view camera to perform an exposure operation according to the third exposure control signal to obtain a front-view image of the target area;

and generating a fourth exposure control signal when the radar is detected to reach the fourth scanning angle, and controlling the right-view camera to perform an exposure operation according to the fourth exposure control signal to obtain a right-view image of the target area.

4. The method of claim 1, wherein determining the three-dimensional bounding box of the target object from the two-dimensional bounding box and the point cloud data from the radar scan comprises:

projecting the two-dimensional bounding box to a three-dimensional space to obtain a three-dimensional space region corresponding to the two-dimensional bounding box;

extracting target point cloud data in the point cloud data, wherein the target point cloud data is point cloud data located in the three-dimensional space area;

inputting the target point cloud data into a neural network model, and determining the output of the neural network model as the three-dimensional bounding box of the target object, wherein the neural network model is configured to predict the three-dimensional bounding box of the target object from the target point cloud data.

5. The method of any of claims 1 to 3, further comprising:

when the number of the camera modules is n and the camera modules are uniformly distributed in the circumferential direction, the time interval between the exposure control signals of adjacent camera modules is T/n, where T is the scanning period of the radar.

6. The method according to any one of claims 1 to 3, wherein the radar rotates at a uniform speed.

7. The method of any of claims 1 to 3, further comprising: phase-locking the radar by means of a GPS synchronization signal or a simulated GPS signal during the rotation of the radar.

8. An object detection device, comprising:

a signal generation module, configured to generate an exposure control signal for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar;

a control module, configured to control the corresponding camera module to perform an exposure operation according to the exposure control signal, so as to obtain image data of a target area;

a detection module, configured to perform target detection on the image data using the image detection module associated with each camera module, so as to determine a two-dimensional bounding box of a target object in the image data;

and a determining module, configured to determine a three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by the radar scanning.

9. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the object detection method of any one of claims 1 to 7.

10. An electronic device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the object detection method of any one of claims 1-7 via execution of the executable instructions.

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a target detection method, a target detection apparatus, a computer storage medium, and an electronic device.

Background

With the rapid development of artificial intelligence technology, related autonomous driving technologies are also advancing rapidly. Detecting target objects in the driving environment of an unmanned vehicle is one of the key technologies of its environment perception system, and achieving 360-degree, blind-spot-free detection of targets around the vehicle is essential for ensuring the safe driving of the unmanned vehicle.

At present, related target detection schemes can generally only fuse an image from a single viewing angle with the corresponding radar point cloud, and all steps are processed serially, so the detection viewing angle is limited and the processing efficiency is low.

In view of the above, there is a need in the art to develop a new target detection method and apparatus.

It is to be noted that the information disclosed in the background section above is only used to enhance understanding of the background of the present disclosure.

Disclosure of Invention

The present disclosure is directed to a target detection method, a target detection apparatus, a computer storage medium, and an electronic device, so as to overcome, at least to a certain extent, the defect of a single detection viewing angle in the prior art.

Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.

According to a first aspect of the present disclosure, there is provided an object detection method, comprising: generating an exposure control signal for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar; controlling the corresponding camera module to perform an exposure operation according to the exposure control signal, so as to obtain image data of a target area; performing target detection on the image data using the image detection module associated with each camera module, so as to determine a two-dimensional bounding box of a target object in the image data; and determining a three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by the radar scanning.

In an exemplary embodiment of the present disclosure, the camera modules in the surround-view camera include at least a rear-view camera, a left-view camera, a front-view camera, and a right-view camera; the image data of the plurality of viewing angles includes at least a rear-view image, a left-view image, a front-view image, and a right-view image.

In an exemplary embodiment of the present disclosure, when the rear-view camera is disposed at a first scanning angle of the radar, the left-view camera is disposed at a second scanning angle of the radar, the front-view camera is disposed at a third scanning angle of the radar, and the right-view camera is disposed at a fourth scanning angle of the radar, the method further includes: generating a first exposure control signal when the radar is detected to reach the first scanning angle, and controlling the rear-view camera to perform an exposure operation according to the first exposure control signal to obtain a rear-view image of the target area; generating a second exposure control signal when the radar is detected to reach the second scanning angle, and controlling the left-view camera to perform an exposure operation according to the second exposure control signal to obtain a left-view image of the target area; generating a third exposure control signal when the radar is detected to reach the third scanning angle, and controlling the front-view camera to perform an exposure operation according to the third exposure control signal to obtain a front-view image of the target area; and generating a fourth exposure control signal when the radar is detected to reach the fourth scanning angle, and controlling the right-view camera to perform an exposure operation according to the fourth exposure control signal to obtain a right-view image of the target area.

In an exemplary embodiment of the present disclosure, determining the three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by the radar scanning includes: projecting the two-dimensional bounding box to a three-dimensional space to obtain a three-dimensional space region corresponding to the two-dimensional bounding box; extracting target point cloud data from the point cloud data, the target point cloud data being the point cloud data located in the three-dimensional space region; and inputting the target point cloud data into a neural network model and determining the output of the neural network model as the three-dimensional bounding box of the target object, the neural network model being configured to predict the three-dimensional bounding box of the target object from the target point cloud data.

In an exemplary embodiment of the present disclosure, the method further comprises: when the number of the camera modules is n and the camera modules are uniformly distributed in the circumferential direction, the time interval between the exposure control signals of adjacent camera modules is T/n, where T is the scanning period of the radar.

In an exemplary embodiment of the present disclosure, the radar rotates at a uniform speed.

In an exemplary embodiment of the present disclosure, the method further comprises: phase-locking the radar by means of a GPS synchronization signal or a simulated GPS signal during the rotation of the radar.

According to a second aspect of the present disclosure, there is provided an object detection apparatus comprising: a signal generation module, configured to generate an exposure control signal for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar; a control module, configured to control the corresponding camera module to perform an exposure operation according to the exposure control signal, so as to obtain image data of a target area; a detection module, configured to perform target detection on the image data using the image detection module associated with each camera module, so as to determine a two-dimensional bounding box of a target object in the image data; and a determining module, configured to determine a three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by the radar scanning.

According to a third aspect of the present disclosure, there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the object detection method of the first aspect described above.

According to a fourth aspect of the present disclosure, there is provided an electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the object detection method of the first aspect described above via execution of the executable instructions.

As can be seen from the foregoing technical solutions, the object detection method, the object detection apparatus, the computer storage medium and the electronic device in the exemplary embodiments of the present disclosure have at least the following advantages and positive effects:

in the technical solutions provided in some embodiments of the present disclosure, on the one hand, an exposure control signal is generated for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar, and the corresponding camera module is controlled to perform an exposure operation according to its exposure control signal to obtain image data of the target area. In this way, image data from multiple viewing angles is obtained and the surround-view camera is time-synchronized with the radar, so that data is acquired synchronously at no additional cost; moreover, the scheme can be freely adjusted according to the arrangement of the camera modules and the scanning angle of the radar, making it flexible to operate and widely applicable (to different radar scanning frequencies and different camera mounting arrangements). Furthermore, target detection is performed on the image data by the image detection module associated with each camera module to determine the two-dimensional bounding box of the target object in the image data, so the image from each viewing angle can be detected by its own detection module as soon as it is acquired. This avoids the processing delay caused in the prior art by passing all images serially through a single image detection module, and improves image detection efficiency. On the other hand, the three-dimensional bounding box of the target object is determined according to the two-dimensional bounding boxes and the point cloud data obtained by radar scanning; with multi-view two-dimensional bounding boxes and the point cloud data available, the accuracy of labeling the target object in the three-dimensional scene is further improved, which helps guarantee the safe driving of the unmanned vehicle.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.

FIG. 1 shows a schematic flow chart of a target detection method in an exemplary embodiment of the present disclosure;

FIG. 2 shows a schematic diagram of a target detection method in an exemplary embodiment of the present disclosure;

FIG. 3 illustrates a sub-flow diagram of a method of object detection in an exemplary embodiment of the present disclosure;

FIG. 4 illustrates an overall flow diagram of a target detection method in an exemplary embodiment of the present disclosure;

FIG. 5A shows a schematic diagram of a prior art target detection method;

FIG. 5B shows a schematic diagram of a target detection method in an example embodiment of the present disclosure;

FIG. 6 shows a schematic structural diagram of an object detection apparatus in an exemplary embodiment of the present disclosure;

FIG. 7 shows a schematic diagram of a structure of a computer storage medium in an exemplary embodiment of the disclosure;

FIG. 8 shows a schematic structural diagram of an electronic device in an exemplary embodiment of the present disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.

The terms "a," "an," "the," and "said" are used in this specification to denote the presence of one or more elements/components/parts/etc.; the terms "comprising" and "having" are intended to be inclusive and mean that there may be additional elements/components/etc. other than the listed elements/components/etc.; the terms "first" and "second", etc. are used merely as labels, and are not limiting on the number of their objects.

Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities.

Currently, the general target detection process in the related art is as follows: point cloud data is collected by a laser radar and image data from a single viewing angle is collected by a camera; the image data is then detected by a Convolutional Neural Network (CNN) to obtain a 2D detection box marking the target object; the collected point cloud data is post-processed by segmentation or clustering; and a 3D detection box marking the position and coordinates of the target object in the surrounding scene is obtained from the point cloud processing result together with the 2D detection box. Obviously, these steps are executed sequentially and serially, so if images from multiple viewing angles need to be processed, the serial processing leads to long delays and low processing efficiency.

In the embodiments of the present disclosure, an object detection method is first provided, which overcomes, at least to some extent, the defect of the single detection viewing angle of the target detection methods provided in the prior art.

Fig. 1 is a flowchart illustrating an object detection method according to an exemplary embodiment of the present disclosure; the object detection method may be executed by, for example, a server that performs object detection.

Referring to fig. 1, an object detection method according to one embodiment of the present disclosure includes the steps of:

step S110, generating an exposure control signal for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar;

step S120, controlling the corresponding camera module to perform an exposure operation according to the exposure control signal to obtain image data of a target area;

step S130, performing target detection on the image data using the image detection module associated with each camera module to determine a two-dimensional bounding box of a target object in the image data;

and step S140, determining a three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by radar scanning.

In the technical solution provided in the embodiment shown in fig. 1, on the one hand, an exposure control signal is generated for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar, and the corresponding camera module is controlled to perform an exposure operation according to its exposure control signal to obtain image data of the target area. In this way, image data from multiple viewing angles is obtained and the surround-view camera is time-synchronized with the radar, so that data is acquired synchronously at no additional cost; moreover, the scheme can be freely adjusted according to the arrangement of the camera modules and the scanning angle of the radar, making it flexible to operate and widely applicable (to different radar scanning frequencies and different camera mounting arrangements). Furthermore, target detection is performed on the image data by the image detection module associated with each camera module to determine the two-dimensional bounding box of the target object in the image data, so the image from each viewing angle can be detected by its own detection module as soon as it is acquired. This avoids the processing delay caused in the prior art by passing all images serially through a single image detection module, and improves image detection efficiency. On the other hand, the three-dimensional bounding box of the target object is determined according to the two-dimensional bounding boxes and the point cloud data obtained by radar scanning; with multi-view two-dimensional bounding boxes and the point cloud data available, the accuracy of labeling the target object in the three-dimensional scene is further improved, which helps guarantee the safe driving of the unmanned vehicle.

The following describes the specific implementation of each step in fig. 1 in detail:

the all-round camera in the present disclosure may be a multi-camera suite system that is composed of a plurality of camera modules (common monocular camera) and can realize 360-degree perception, thereby realizing 360-degree all-round effect. The panoramic camera is used for monitoring objects around the vehicle in real time and calculating the distance between the objects and the vehicle by matching with a calibration algorithm, so that the functions of lane departure warning, front vehicle collision avoidance, pedestrian detection and the like are realized. A detailed recognition of the object content can be achieved by looking around the camera, for example: recognizing license plate numbers, signs, etc.

Radar (from "radio detection and ranging", i.e., finding objects and determining their spatial positions by radio, and hence also called "radio positioning") measures the distance of a target object accurately by measuring the time interval between an emitted signal and its reflection. Because the radar actively emits its detection signal, it is resistant to environmental influences; in particular, in dim-light environments the detection performance of the radar remains good. The radar in the present disclosure may be, for example, a laser radar, a millimeter-wave radar, or an ultrasonic radar, and may be chosen according to the actual situation; all such choices fall within the protection scope of the present disclosure.

In step S110, an exposure control signal for each camera module is generated according to the arrangement of each camera module in the panoramic camera and the scanning angle of the radar.

In an exemplary embodiment of the present disclosure, the radar is configured to obtain point cloud data of the target area; each frame of point cloud data is obtained after the radar completes one scanning cycle over its scanning angle range, and the point cloud is a large set of points representing the surface characteristics of objects in the target area.

The radar may rotate at a constant speed, and during this constant-speed rotation it may be phase-locked by a GPS (Global Positioning System) synchronization signal or a simulated GPS signal. This allows the zero position of the radar to be calibrated and improves the precision of the radar's rotary motion.

The camera modules in the above-mentioned surround-view camera may include a rear-view camera, a left-view camera, a front-view camera, and a right-view camera. The rear-view camera monitors objects and target objects behind the vehicle; the left-view camera monitors objects and target objects on the left side of the vehicle; the front-view camera monitors objects and target objects in front of the vehicle; and the right-view camera monitors objects and target objects on the right side of the vehicle.

It should be noted that the surround-view camera may further include a bottom-view camera (for monitoring objects and target objects underneath the vehicle), so that target objects on the road surface can be effectively avoided and the driving safety of the unmanned vehicle is ensured. The specific configuration can be set according to the actual situation and falls within the protection scope of the present disclosure.

The radar scans a full circle (360 degrees) per cycle. In order to match the image data acquired by the camera modules of the surround-view camera with the point cloud data acquired over one radar scan circle into one frame of the scene, reference is made to fig. 2, which shows a schematic diagram of an arrangement of the camera modules in the surround-view camera. In fig. 2, 201 denotes the radar (Lidar), 202 the rear-view camera, 203 the left-view camera, 204 the front-view camera, and 205 the right-view camera. The time (about 1 to 5 ms) taken by the radar to sweep the angles β1, β2, β3, and β4 corresponds to the exposure durations of the four cameras, with β1 = β2 = β3 = β4 (the four angles are equal); see also the explanation of step S120 below. It should be noted that the specific arrangement of the camera modules can be set according to the actual situation and falls within the protection scope of the present disclosure.

For example, the initial phase of the radar may be set to 135 degrees (it may also be set to 45 degrees or another angle, according to the actual situation, and such settings are within the protection scope of the present disclosure), and the radar scans counterclockwise. Further, the rear-view camera (202) may be disposed at a first scanning angle of the radar (the 180° central angle position), the left-view camera (203) at a second scanning angle (the 270° central angle position), the front-view camera (204) at a third scanning angle (the 0° central angle position), and the right-view camera (205) at a fourth scanning angle (the 90° central angle position).

When the number of the camera modules is n and the camera modules are uniformly distributed in the circumferential direction, the time interval between the exposure control signals of adjacent camera modules is T/n, where T is the scanning period of the radar. Specifically, following the example above, when the scanning period of the radar is 100 ms and the 4 camera modules are uniformly distributed in the circumferential direction, the interval between consecutive exposure control signals is 100/4 = 25 ms.

Further, when the radar is detected to reach the first scanning angle (180 degrees, corresponding to 12.5 ms), the first exposure control signal S1 for controlling the rear-view camera to perform an exposure operation may be generated; when the radar is detected to reach the second scanning angle (270 degrees, corresponding to 37.5 ms), the second exposure control signal S2 for controlling the left-view camera to perform an exposure operation may be generated; when the radar is detected to reach the third scanning angle (0 degrees, corresponding to 62.5 ms), the third exposure control signal S3 for controlling the front-view camera to perform an exposure operation may be generated; and when the radar is detected to reach the fourth scanning angle (90 degrees, corresponding to 87.5 ms), the fourth exposure control signal S4 for controlling the right-view camera to perform an exposure operation may be generated.
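As a minimal illustration of this trigger-time calculation, the following Python sketch derives the exposure instants from the scan period, initial phase, and camera mounting angles of the example above (the function and variable names are illustrative assumptions, not part of the disclosure):

    # Sketch: derive camera trigger times from the radar scan period and mounting angles.
    # Assumes the example values above: 100 ms scan period, 135-degree initial phase,
    # counterclockwise rotation; names are illustrative only.
    SCAN_PERIOD_MS = 100.0          # T, radar scanning period
    INITIAL_PHASE_DEG = 135.0       # radar angle at t = 0 ms
    CAMERA_ANGLES_DEG = {           # central angle at which each camera is mounted
        "rear": 180.0, "left": 270.0, "front": 0.0, "right": 90.0,
    }

    def trigger_time_ms(camera_angle_deg: float) -> float:
        """Time after scan start at which the radar reaches the camera's angle."""
        delta_deg = (camera_angle_deg - INITIAL_PHASE_DEG) % 360.0  # counterclockwise sweep
        return SCAN_PERIOD_MS * delta_deg / 360.0

    for name, angle in CAMERA_ANGLES_DEG.items():
        print(f"{name:>5}-view camera: expose at {trigger_time_ms(angle):.1f} ms")
    # Prints 12.5, 37.5, 62.5 and 87.5 ms, i.e. the T/n = 100/4 = 25 ms spacing above.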

With reference to fig. 1, in step S120, the corresponding camera module is controlled to perform an exposure operation according to the exposure control signal, so as to obtain image data of the target area.

In the exemplary embodiment of the disclosure, after the exposure control signal is obtained, the corresponding camera module may be controlled according to the exposure control signal to perform an exposure operation (exposure refers to the process in which, when the shutter is released, the camera's light-sensitive element is irradiated by light, undergoes a physicochemical reaction and records the light information), so as to obtain image data of the target area. For example, the image data of the target area may include a rear-view image, a left-view image, a front-view image, and a right-view image corresponding to the above four camera modules.

Referring to the explanation of step S110 above, the rear-view camera may be controlled to perform an exposure operation according to the first exposure control signal S1 (the time taken by the radar to sweep the angle β1 corresponds to the exposure duration of the rear-view camera, about 1-5 ms; see the included angle β1 in fig. 2), so as to obtain a rear-view image of the target area; the left-view camera may be controlled to perform an exposure operation according to the second exposure control signal S2 (the time taken by the radar to sweep the angle β2 corresponds to the exposure duration of the left-view camera, about 1-5 ms; see the included angle β2 in fig. 2), so as to obtain a left-view image of the target area; the front-view camera may be controlled to perform an exposure operation according to the third exposure control signal S3 (the time taken by the radar to sweep the angle β3 corresponds to the exposure duration of the front-view camera, about 1-5 ms; see the included angle β3 in fig. 2), so as to obtain a front-view image of the target area; and the right-view camera may be controlled to perform an exposure operation according to the fourth exposure control signal S4 (the time taken by the radar to sweep the angle β4 corresponds to the exposure duration of the right-view camera, about 1-5 ms; see the included angle β4 in fig. 2), so as to obtain a right-view image of the target area. In this way, image data from multiple viewing angles is acquired while the surround-view camera is time-synchronized with the radar, so the data is acquired synchronously at no additional cost; the scheme can also be freely adjusted according to the arrangement of the camera modules and the scanning angle of the radar, making it flexible to operate and widely applicable (to different radar scanning frequencies and different camera installation arrangements).
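In formula form, the exposure duration of each camera corresponds to the angle β swept by the radar during that exposure; the relation below is not stated explicitly in the disclosure but follows directly from the example values above:

    t_{\mathrm{exp}} = \frac{\beta}{360^{\circ}} \cdot T
    \quad\Longrightarrow\quad
    \beta = \frac{t_{\mathrm{exp}}}{T} \cdot 360^{\circ} \approx 3.6^{\circ} \text{ to } 18^{\circ}
    \quad (t_{\mathrm{exp}} = 1 \text{ to } 5\ \mathrm{ms},\ T = 100\ \mathrm{ms})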

In step S130, target detection is performed on the image data based on each image detection module associated with each camera module to determine a two-dimensional bounding box of a target object in the image data.

In an exemplary embodiment of the present disclosure, target detection may be performed on the image data using the image detection module associated with each camera module, so as to obtain the two-dimensional bounding box corresponding to the target object in the image data. The target object may be an object in the surroundings of the vehicle during driving (for example, a person, a tree, another vehicle, or an obstacle such as a wall), and each image detection module may be a detection module built on a CNN model.

Referring to the above explanation of step S120, after the rear-view image is obtained, it may be detected by the first image detection module associated with the rear-view camera to obtain the two-dimensional bounding box corresponding to the target object (i.e., a rear-view image containing the two-dimensional bounding box of the target object). After the left-view image is obtained, it may be detected by the second image detection module associated with the left-view camera to obtain the two-dimensional bounding box corresponding to the target object (i.e., a left-view image containing the two-dimensional bounding box of the target object). After the front-view image is obtained, it may be detected by the third image detection module associated with the front-view camera to obtain the two-dimensional bounding box corresponding to the target object (i.e., a front-view image containing the two-dimensional bounding box of the target object). After the right-view image is obtained, it may be detected by the fourth image detection module associated with the right-view camera to obtain the two-dimensional bounding box corresponding to the target object (i.e., a right-view image containing the two-dimensional bounding box of the target object).

Therefore, after the image data of each viewing angle is acquired, it can be detected by its corresponding image detection module without interfering with the others. This avoids the processing delay caused in the prior art by processing the images serially through a single image detection module, and improves image detection efficiency.

By providing a plurality of image detection modules, one associated with each camera module of the surround-view camera, target detection on the four images (the rear-view, left-view, front-view, and right-view images) can be processed in parallel while the radar is still scanning its circle of point cloud data, yielding four images with two-dimensional bounding boxes. The length, width, and two-dimensional coordinate position of the target object are thus obtained, and once the radar completes its scan circle and produces one frame of point cloud data, the three-dimensional bounding box of the target object can be marked directly from the four images containing two-dimensional bounding boxes and that frame of point cloud data, without any waiting delay.
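A minimal Python sketch of this per-camera, per-detector parallel dispatch is given below; the Detector class, its detect() method and the thread-pool scheduling are illustrative assumptions rather than the specific implementation of the disclosure:

    # Sketch: one image detection module per camera view, dispatched in parallel so
    # each image is processed as soon as its exposure completes rather than serially.
    from concurrent.futures import ThreadPoolExecutor

    class Detector:
        """Placeholder for a CNN-based 2D detector bound to one camera view."""
        def __init__(self, view: str):
            self.view = view

        def detect(self, image):
            # Would run the CNN and return 2D bounding boxes of the target objects.
            return [{"view": self.view, "bbox_2d": (0, 0, 0, 0)}]

    detectors = {v: Detector(v) for v in ("rear", "left", "front", "right")}
    executor = ThreadPoolExecutor(max_workers=len(detectors))
    pending = {}

    def on_image_ready(view: str, image) -> None:
        """Called as each exposure finishes; detection for that view starts immediately."""
        pending[view] = executor.submit(detectors[view].detect, image)

    def collect_2d_boxes():
        """Called once the radar sweep ends; detections are already finished or in flight."""
        return {view: fut.result() for view, fut in pending.items()}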

In step S140, a three-dimensional bounding box of the target object is determined from the two-dimensional bounding box and the point cloud data.

In an exemplary embodiment of the present disclosure, after the two-dimensional bounding box is obtained, the radar completes its scanning cycle after a further interval of about 7.5 to 11.5 milliseconds, yielding one frame of point cloud data. The three-dimensional bounding box of the target object can then be determined from the two-dimensional bounding box and the point cloud data.

Specifically, fig. 3 shows a sub-flow of the target detection method in an exemplary embodiment of the present disclosure, namely the sub-flow of determining the three-dimensional bounding box of the target object from the two-dimensional bounding box and the point cloud data, comprising steps S301 to S303; step S140 is explained below with reference to fig. 3.

In step S301, the two-dimensional bounding box is projected to a three-dimensional space, so as to obtain a three-dimensional space region corresponding to the two-dimensional bounding box.

In an exemplary embodiment of the disclosure, after the two-dimensional bounding box is obtained, it may be projected into three-dimensional space to obtain the three-dimensional space region corresponding to the two-dimensional bounding box (which may be, for example, a viewing frustum).

In step S302, target point cloud data in the point cloud data is extracted, where the target point cloud data is point cloud data located in a three-dimensional space region.

In an exemplary embodiment of the present disclosure, target point cloud data may be extracted from the point cloud data, the target point cloud data being the point cloud data located within the above three-dimensional space region.

In step S303, the target point cloud data is input into the neural network model, and the output of the neural network model is determined as the three-dimensional bounding box of the target object; the neural network model is used to predict the three-dimensional bounding box of the target object from the target point cloud data.

In the exemplary embodiment of the present disclosure, after the target point cloud data is extracted, it may be input into a neural network model, and the output of the neural network model is determined as the three-dimensional bounding box of the target object (its length, width, height, and specific three-dimensional position). On the basis of the multi-view two-dimensional bounding boxes and the point cloud data, this further improves the accuracy of labeling the target object in the three-dimensional scene, so that the labeling better fits the actual operating environment and the safe driving of the unmanned vehicle is ensured.

The neural network model may be a PointNet network model (a neural network that processes point clouds directly and is invariant to the ordering of the input points); the target point cloud data can be classified and segmented in three dimensions by the PointNet model, which then outputs the three-dimensional bounding box of the target object. The specific neural network model can be chosen according to the actual situation and falls within the protection scope of the present disclosure.
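As an illustration of the frustum extraction and prediction steps above, the following Python sketch projects lidar points into the image, keeps those inside the 2D bounding box, and hands them to a point-cloud network; the camera intrinsics K, the extrinsic transform T_cam_lidar, and the model object are assumptions introduced for the example, not parameters defined in the disclosure:

    # Sketch: extract the target point cloud inside the 2D box's frustum and let a
    # point-cloud network (e.g. a PointNet-style model) regress the 3D bounding box.
    import numpy as np

    def frustum_points(points_xyz, box_2d, K, T_cam_lidar):
        """Target point cloud: the points whose image projection falls inside the 2D box."""
        u_min, v_min, u_max, v_max = box_2d
        pts_h = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
        cam = (T_cam_lidar @ pts_h.T).T[:, :3]            # lidar points in camera coordinates
        in_front = cam[:, 2] > 0                          # keep only points in front of the camera
        proj = (K @ cam.T).T
        z = np.clip(proj[:, 2:3], 1e-9, None)             # avoid division by zero / negative depth
        uv = proj[:, :2] / z
        in_box = in_front & (uv[:, 0] >= u_min) & (uv[:, 0] <= u_max) \
                          & (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max)
        return points_xyz[in_box]

    def predict_3d_box(model, points_xyz, box_2d, K, T_cam_lidar):
        """Hand the frustum points to the point-cloud network; it returns the predicted 3D box."""
        target_points = frustum_points(points_xyz, box_2d, K, T_cam_lidar)
        return model(target_points)   # e.g. (x, y, z, length, width, height, yaw)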

Exemplarily, fig. 4 shows the overall flow of a target detection method in an exemplary embodiment of the present disclosure. In fig. 4, the horizontal axis (horizontal arrow) is a time axis (unit: ms) reflecting the chronological order of the data acquired by the sensors (i.e., the rear-view, left-view, front-view, and right-view camera modules and the radar): the data arrive strictly in the order image (rear view → left view → front view → right view) → point cloud, with an interval of 25 ms between consecutive images and 12.5 ms between the last (right-view) image and the point cloud data.

The vertical axis (vertical arrow) is a process axis, covering the sensor data (i.e., the rear-view, left-view, front-view, and right-view images acquired by the camera modules and the point cloud data acquired by the radar), four image detection modules (a first, second, third, and fourth image detection module, each used to perform target detection on an image), the image detection results (i.e., the two-dimensional bounding box of the target contained in the image of each viewing angle), a point cloud processing module (i.e., the post-processing module in fig. 4), and the point cloud processing result (i.e., the three-dimensional bounding box obtained from the two-dimensional bounding boxes and the point cloud data).

The overall target detection process is as follows. The radar takes 100 ms to scan one circle. When the radar sweeps from its initial phase of 135 degrees (0 ms) to 180 degrees (12.5 ms), the exposure control signal S1 can be generated to make the rear-view camera perform an exposure operation and obtain a rear-view image; while the radar continues scanning toward 270 degrees, the first image detection module detects the rear-view image and obtains its two-dimensional bounding box. When the radar reaches 270 degrees (37.5 ms), the exposure control signal S2 can be generated to make the left-view camera perform an exposure operation and obtain a left-view image; while the radar scans toward 0 degrees, the second image detection module detects the left-view image and obtains its two-dimensional bounding box. When the radar reaches 0 degrees (62.5 ms), the exposure control signal S3 can be generated to make the front-view camera perform an exposure operation and obtain a front-view image; while the radar scans toward 90 degrees, the third image detection module detects the front-view image and obtains its two-dimensional bounding box. When the radar reaches 90 degrees (87.5 ms), the exposure control signal S4 can be generated to make the right-view camera perform an exposure operation and obtain a right-view image; while the radar scans toward 135 degrees, the fourth image detection module detects the right-view image and obtains its two-dimensional bounding box. When the radar returns to 135 degrees (i.e., it has completed one scan circle and obtained one frame of point cloud data), no waiting is needed: the post-processing module can directly process the two-dimensional bounding boxes and the point cloud data to obtain the three-dimensional bounding box of the target object.
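The staggered schedule described above can be summarized in a small Python sketch; the event times follow from the example values (100 ms period, 135-degree initial phase), while the listing itself is only illustrative:

    # Sketch of the peak-staggering timeline: each image enters its own detector as soon
    # as it is exposed, so when the radar closes its sweep at 100 ms the post-processing
    # (2D box + point cloud fusion) can start immediately, with no idle waiting.
    EVENTS = [  # (time in ms, event)
        (12.5,  "rear-view image exposed  -> image detection module 1 starts"),
        (37.5,  "left-view image exposed  -> image detection module 2 starts"),
        (62.5,  "front-view image exposed -> image detection module 3 starts"),
        (87.5,  "right-view image exposed -> image detection module 4 starts"),
        (100.0, "point cloud frame complete -> fuse 2D boxes and points into 3D boxes"),
    ]

    for t, event in EVENTS:
        print(f"t = {t:6.1f} ms  {event}")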

Exemplarily, fig. 5A shows a schematic diagram of a target detection method in the prior art, specifically the time occupation of a single-view, single-module pipeline (before the improvement). The grid-shaded portion represents the 2D image box detection stage (CNN-based image detection) and the black-shaded portion represents the point cloud processing stage. Since a single processing module handles both the image and the point cloud data, the data processing stage starts only after the data of both channels, image and point cloud, have been updated (after 100 ms). In the related art, the detection delay for single-view image data and point cloud data is about 30 to 40 milliseconds.

Fig. 5B shows a schematic diagram of a target detection method in an exemplary embodiment of the disclosure, specifically the time occupation of the improved multi-view, multi-module pipeline based on the peak-staggering mechanism. The grid-shaded portions represent the 2D image box detection stages (CNN-based image detection) and the black-shaded portion represents the point cloud processing stage. Since the four image detection modules process the rear/left/front/right image data in parallel and independently of one another, each image (at 12.5 ms, 37.5 ms, 62.5 ms, and 87.5 ms) enters its corresponding image detection module as soon as it is available. The point cloud processing module receives the four image detection results (two-dimensional bounding boxes) and the point cloud data (the image data may arrive before the point cloud data or vice versa; the specific receiving order can be set according to the actual situation and falls within the protection scope of the present disclosure), and point cloud processing is performed once both the image detection results and the point cloud data have been updated. The detection delay for the multi-view image data and the point cloud data in the present disclosure is thus about 50 milliseconds.

Comparing the detection delay conditions of fig. 5A and 5B, the present disclosure has the following advantages:

(1) Invalid waiting delay is avoided: as soon as image or point cloud data is released, it can immediately enter the image detection model or the point cloud processing stage. This effectively improves the time utilization of the system, solves the delay problem in the prior art where image data can only be processed some time after it is acquired, and improves processing efficiency.

(2) The detection viewing angle is widened from a single view to a 360-degree panorama, and the processing of all image and point cloud data is guaranteed to be completed within 100 ms, so the detection frame rate reaches 10 Hz and detection efficiency is guaranteed.

The present disclosure also provides an object detection apparatus, and fig. 6 shows a schematic structural diagram of the object detection apparatus in an exemplary embodiment of the present disclosure; as shown in fig. 6, the object detection apparatus 600 may include a signal generation module 601, a control module 602, a detection module 603, and a determination module 604. Wherein:

The signal generation module 601 is configured to generate an exposure control signal for each camera module according to the arrangement of the camera modules in the surround-view camera and the scanning angle of the radar.

In an exemplary embodiment of the present disclosure, the camera modules in the surround-view camera include at least a rear-view camera, a left-view camera, a front-view camera, and a right-view camera; the image data of the plurality of viewing angles includes at least a rear-view image, a left-view image, a front-view image, and a right-view image.

In an exemplary embodiment of the present disclosure, when the rear-view camera is disposed at a first scanning angle of the radar, the left-view camera is disposed at a second scanning angle of the radar, the front-view camera is disposed at a third scanning angle of the radar, and the right-view camera is disposed at a fourth scanning angle of the radar, the signal generation module is configured to: generate a first exposure control signal when the radar is detected to reach the first scanning angle, and control the rear-view camera to perform an exposure operation according to the first exposure control signal, so as to obtain a rear-view image of the target area; generate a second exposure control signal when the radar is detected to reach the second scanning angle, and control the left-view camera to perform an exposure operation according to the second exposure control signal, so as to obtain a left-view image of the target area; generate a third exposure control signal when the radar is detected to reach the third scanning angle, and control the front-view camera to perform an exposure operation according to the third exposure control signal, so as to obtain a front-view image of the target area; and generate a fourth exposure control signal when the radar is detected to reach the fourth scanning angle, and control the right-view camera to perform an exposure operation according to the fourth exposure control signal, so as to obtain a right-view image of the target area.

In an exemplary embodiment of the disclosure, the signal generation module is configured such that, when the number of the camera modules is n and the camera modules are uniformly distributed in the circumferential direction, the time interval between the exposure control signals of adjacent camera modules is T/n, where T is the scanning period of the radar.

In an exemplary embodiment of the present disclosure, the radar rotates at a uniform speed.

In an exemplary embodiment of the present disclosure, the radar is further phase-locked by a GPS synchronization signal or a simulated GPS signal during its rotation.

The control module 602 is configured to control the corresponding camera module to perform an exposure operation according to the exposure control signal, so as to obtain image data of the target area.

In an exemplary embodiment of the disclosure, the control module is configured to control the corresponding camera module to perform an exposure operation according to the exposure control signal, so as to obtain image data of the target area.

A detection module 603, configured to perform target detection on the image data based on each image detection module associated with each camera module, so as to determine a two-dimensional bounding box of a target object in the image data.

In an exemplary embodiment of the disclosure, the detection module is configured to perform target detection on the image data based on each image detection module associated with each camera module to determine a two-dimensional bounding box of a target object in the image data.

The determining module 604 is configured to determine a three-dimensional bounding box of the target object according to the two-dimensional bounding box and the point cloud data obtained by radar scanning.

In an exemplary embodiment of the present disclosure, the determining module is configured to project the two-dimensional bounding box into three-dimensional space to obtain a three-dimensional space region corresponding to the two-dimensional bounding box; extract target point cloud data from the point cloud data, the target point cloud data being the point cloud data located in the three-dimensional space region; and input the target point cloud data into a neural network model and determine the output of the neural network model as the three-dimensional bounding box of the target object, the neural network model being used to predict the three-dimensional bounding box of the target object from the target point cloud data.

The specific details of each module in the target detection apparatus have been described in detail in the corresponding target detection method, and therefore are not described herein again.

It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.

Moreover, although the steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a mobile terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.

In an exemplary embodiment of the present disclosure, there is also provided a computer storage medium capable of implementing the above method. On which a program product capable of implementing the above-described method of the present specification is stored. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure described in the "exemplary methods" section above of this specification, when the program product is run on the terminal device.

Referring to fig. 7, a program product 700 for implementing the above method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the internet using an internet service provider).

In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.

As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, a method, or a program product. Accordingly, various aspects of the present disclosure may be embodied in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," a "module," or a "system."

An electronic device 800 according to this embodiment of the disclosure is described below with reference to fig. 8. The electronic device 800 shown in fig. 8 is only an example and should not impose any limitation on the functionality or scope of use of the embodiments of the present disclosure.

As shown in fig. 8, the electronic device 800 is in the form of a general-purpose computing device. The components of the electronic device 800 may include, but are not limited to: at least one processing unit 810, at least one storage unit 820, a bus 830 connecting the various system components (including the storage unit 820 and the processing unit 810), and a display unit 840.

The storage unit 820 stores program code that is executable by the processing unit 810, so that the processing unit 810 performs the steps according to various exemplary embodiments of the present disclosure as described in the "exemplary methods" section above in this specification. For example, the processing unit 810 may perform the following steps as shown in fig. 1: step S110, generating exposure control signals of all camera modules according to the arrangement mode of all camera modules in the panoramic camera and the scanning angle of the radar; step S120, controlling the corresponding camera module to perform exposure operation according to the exposure control signal to obtain image data of a target area; step S130, performing target detection on the image data based on each image detection module associated with each camera module to determine a two-dimensional boundary frame of a target object in the image data; and step S140, determining a three-dimensional boundary frame of the target object according to the two-dimensional boundary frame and the point cloud data obtained by radar scanning.
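To make these steps concrete, the sketch below mocks, in Python, how a radar-synchronized exposure trigger, a per-camera 2D detector, and a frustum-based fusion of the 2D boundary frame with the point cloud could fit together. The trigger angles, camera intrinsics, the dummy detector, and the frustum-filtering strategy are all assumptions introduced purely for illustration; they are not taken from, and do not limit, the implementation described above.

```python
import numpy as np

# Scan angles (degrees) at which each camera module is assumed to be mounted.
# These values and names are illustrative only.
CAMERA_TRIGGER_ANGLES = {"rear": 0.0, "left": 90.0, "front": 180.0, "right": 270.0}


def exposure_signals(radar_angle, tolerance=1.0):
    """Steps S110/S120: emit an exposure control signal for every camera whose
    mounting angle matches the radar's current scan angle (within a tolerance)."""
    signals = []
    for cam, trigger in CAMERA_TRIGGER_ANGLES.items():
        diff = abs(radar_angle - trigger) % 360.0
        if min(diff, 360.0 - diff) < tolerance:
            signals.append(cam)
    return signals


def detect_2d(image):
    """Step S130 placeholder: a real system would run the image detection module
    associated with this camera (e.g. a neural-network detector). Boxes are
    returned as (x_min, y_min, x_max, y_max) in pixel coordinates."""
    return [(100, 120, 220, 300)]  # dummy 2D boundary frame for illustration


def points_in_frustum(points_cam, box_2d, intrinsics):
    """Step S140, one common fusion strategy: keep the points (already expressed
    in the camera frame) whose projection falls inside the 2D boundary frame."""
    x_min, y_min, x_max, y_max = box_2d
    fx, fy, cx, cy = intrinsics
    z = points_cam[:, 2]
    in_front = z > 1e-6                                  # only points in front of the camera
    safe_z = np.where(in_front, z, 1.0)                  # avoid division by zero
    u = fx * points_cam[:, 0] / safe_z + cx
    v = fy * points_cam[:, 1] / safe_z + cy
    keep = in_front & (u >= x_min) & (u <= x_max) & (v >= y_min) & (v <= y_max)
    return points_cam[keep]


def box_3d_from_points(points):
    """Fit an axis-aligned 3D boundary frame (min corner, max corner) to the points."""
    if len(points) == 0:
        return None
    return points.min(axis=0), points.max(axis=0)


if __name__ == "__main__":
    radar_angle = 90.0                                    # radar reaches the left-view angle
    print("exposure signals for:", exposure_signals(radar_angle))  # -> ['left']

    image = np.zeros((480, 640, 3), dtype=np.uint8)       # stand-in exposure result
    intrinsics = (500.0, 500.0, 320.0, 240.0)             # fx, fy, cx, cy (assumed)
    cloud = np.random.uniform(-5.0, 5.0, size=(2000, 3))  # stand-in point cloud, camera frame

    for box_2d in detect_2d(image):
        frustum_points = points_in_frustum(cloud, box_2d, intrinsics)
        print("3D boundary frame:", box_3d_from_points(frustum_points))
```

In this sketch the 2D boundary frame is used only to crop the point cloud before fitting an axis-aligned 3D box; a production system could instead regress an oriented 3D boundary frame from the cropped points, but the overall flow from steps S110 to S140 would remain the same.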

The storage unit 820 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 8201 and/or a cache memory unit 8202, and may further include a read-only memory unit (ROM) 8203.

The storage unit 820 may also include a program/utility 8204 having a set (at least one) of program modules 8205, such program modules 8205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

Bus 830 may be any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 800 may also communicate with one or more external devices 900 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 800, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 800 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 850. Also, the electronic device 800 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 860. As shown, the network adapter 860 communicates with the other modules of the electronic device 800 via the bus 830. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 800, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.


Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
