License plate real-time detection method based on edge-guided sparse attention mechanism

Document No.: 1354693  Publication date: 2020-07-24

Description: this technique, "License plate real-time detection method based on edge-guided sparse attention mechanism", was designed and created by Qin Huabiao and Liang Jing on 2020-03-22. Main content: the invention discloses a license plate real-time detection method based on an edge-guided sparse attention mechanism, belonging to the technical field of target detection. The method first processes the input image with a convolutional neural network to extract semantic features; it then uses a new edge-guided sparse attention mechanism, consisting of an edge-guided component and a sparse attention component, to rapidly capture the salient region, namely the license plate region; cascade multi-task learning is then adopted to assist accurate license plate detection; finally, a loss mask method is adopted to suppress low-quality prediction boxes and improve system performance. The invention achieves real-time license plate detection in a variety of natural scenes with high accuracy, high recall, and high robustness, which is of practical importance. It achieves state-of-the-art performance on CCPD, the largest and most diverse public dataset; in particular, the detection accuracy on the CCPD-Base (100k) test set reaches 99.9%.

1. A license plate real-time detection method based on an edge-guided sparse attention mechanism is characterized by comprising the following steps: the edge-guided sparse attention mechanism comprises an edge-guided component and a sparse attention component, and the real-time detection method comprises the following steps:

s1, processing the input image by using a convolutional neural network, and extracting a semantic feature map X;

s2, the edge-guided sparse attention mechanism captures a license plate region, wherein,

the edge guide component is used for enhancing target edge information and reducing noise interference, and specifically operates as follows:

s21, extracting edge information of the image by using a convolutional neural network to generate an edge guide image I;

s22, obtaining linear model coefficients (a, b) by the aid of the semantic feature map X and the edge guide map I through a convolutional neural network;

s23, constructing a linear model g_i = a_i * I_i + b_i by using the linear model coefficients (a, b) and the edge guide map I, and obtaining a feature map X1 through the linear model;

the sparse attention component is used for reducing the computational complexity of a self-attention mechanism, the feature map X1 is input in the sparse attention component, and the specific operation in the sparse attention component is as follows:

s24, finding K most similar target pixels for each source pixel of the input feature map X1;

s25, for each source pixel, calculating an attention map by using K target pixels most similar to the source pixel;

s26, aggregating the K target pixels by using the attention map to obtain the corresponding output features;

s3, adopting cascade multi-task learning to assist accurate detection of the license plate;

s4, using a loss mask method to suppress the low-quality prediction box.

2. The real-time license plate detection method based on the edge-guided sparse attention mechanism as claimed in claim 1, wherein: the specific steps of finding the K most similar target pixels in step S24 are as follows:

s241, predicting an offset map offset_(k,c',i,j) with 2K channels through a convolutional neural network, where K denotes the K target pixels most similar to the corresponding source pixel and 2 denotes the x-axis and the y-axis; a basic grid basic_(c,i,j) is generated from the feature map X1 and represents the original coordinates of each pixel; the basic grid has 2 channels, representing the x-axis and the y-axis respectively;

s242, summing the original coordinates of each target pixel in the basic grid and the offset coordinates of the corresponding K pixels in the offset map to obtain the absolute coordinates abs_offset_(k,c',i,j), according to the formula:

abs_offset_(k,c',i,j) = offset_(k,c',i,j) + basic_(c,i,j)

c = 0, 1;  c' = 2(k-1), 2(k-1)+1;  k = 1, 2, ..., K

wherein c and c' both represent channels, k represents the kth target pixel corresponding to the source pixel point located at the ith row and the jth column;

s243, based on the feature map X1 and the offset map offset_(k,c',i,j), finding the corresponding K most similar target pixels for each source pixel point by sampling, and obtaining the feature map X2.

3. The real-time license plate detection method based on the edge-guided sparse attention mechanism as claimed in claim 2, wherein: the attention map is calculated using a dot product, a Gaussian function, or an embedded Gaussian function.

4. The real-time license plate detection method based on the edge-guided sparse attention mechanism as claimed in claim 3, wherein: the specific calculation formula using the dot product is as follows:

a_(k,i,j) = Σ_c X1_(c,i,j) * X2_(k,c,i,j)

wherein a_(k,i,j) denotes the attention weight between the source pixel point located at the ith row and the jth column and its corresponding kth target pixel, and "*" denotes multiplication of corresponding positions.

5. The real-time license plate detection method based on the edge-guided sparse attention mechanism as claimed in claim 4, wherein: the formula for obtaining the feature output is as follows:

o_(c,i,j) = Σ_k a_(k,i,j) * X2_(k,c,i,j)

wherein o_(c,i,j) denotes the aggregated output feature of the source pixel located at the ith row and the jth column with channel c; note that different channels of the feature map X2 at the same location share the same attention weight.

6. The real-time license plate detection method based on the edge-guided sparse attention mechanism as claimed in claim 1, wherein: step S3 includes first-level task learning and second-level task learning; the task branches of the first-level task learning are the classification confidence prediction of the license plate, the relative position regression prediction of the bounding box, the classification confidence prediction of the keypoints, and the relative position regression of the keypoints, where the relative position regression of the keypoints is optional; the second-level task learning selectively fuses the prediction feature maps obtained by the first-level multi-task learning and further fine-tunes the target detection to obtain an accurate position.

7. The real-time license plate detection method based on the edge-guided sparse attention mechanism as claimed in claim 1, wherein: the loss mask method uses a water ripple loss mask: the closer a bounding box is to the target center point, the larger its regression loss weight, while the regression loss weight of bounding boxes far from the target center is reduced.

8. The real-time license plate detection method based on the edge-guided sparse attention mechanism as claimed in claim 7, wherein: the water ripple loss mask is defined as:

Technical Field

The invention belongs to the technical field of target detection, and particularly relates to a license plate real-time detection method based on an edge-guided sparse attention mechanism.

Background

In recent years, the self-attention mechanism, whose core operation is a weighted summation over all positions of a feature map, has helped deep learning models capture long-range dependencies and focus on salient features, and it has advanced many computer vision tasks such as object detection, semantic segmentation, and human pose estimation. From an image filtering point of view, its essence is to reduce noise and to reorganize the most important long-range contextual semantic information.

Although popular, the self-attention mechanism also has limitations. Many self-attention-based methods, such as the non-local network proposed by Wang X. et al. in Non-local Neural Networks (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 7794-7803), OCNet proposed in Object Context Network for Scene Parsing, and DANet proposed by Fu J. et al. in Dual Attention Network for Scene Segmentation, are designed to achieve excellent performance without considering speed and storage cost, and their computational complexity is O(N²C). Furthermore, from the work of Xie C. et al. in Feature Denoising for Improving Adversarial Robustness (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019: 501-509) and from our own experiments, we find that although denoising improves the robustness of the model, the attention mechanism blurs object contours on the feature map while reducing noise, which limits further improvement of detection accuracy.

In the last two decades, automatic license plate detection has been an active research topic, with applications such as speeding enforcement, highway tolling, and vehicle access management. Inspired by YOLO, which achieves a strong speed/accuracy trade-off, most license plate detection networks are based on YOLO. It is noted, however, that YOLO-based methods obtain a lower recall when detecting license plates far from the camera, because YOLO networks have difficulty detecting small objects. Hence Silva S. M. et al. in License Plate Detection and Recognition in Unconstrained Scenarios (European Conference on Computer Vision, Springer, Cham, 2018), Laroca R. et al. in A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector (2018 International Joint Conference on Neural Networks, IEEE, 2018), and the document License Plate Recognition Based on Temporal Redundancy propose detecting the vehicle before detecting the license plate, which improves accuracy and recall.

Disclosure of Invention

In order to solve the problem that second-order license plate detection methods, which first detect the vehicle and then detect the license plate, require higher computational cost and more parameters and are difficult to run in real time, the invention provides a license plate real-time detection method based on an edge-guided sparse attention mechanism, in which the edge-guided sparse attention mechanism is embedded into the detection backbone network so that the license plate is detected in real time. To overcome the drawbacks that the existing self-attention mechanism has high computational complexity and blurs object contours on the feature map while reducing noise, the edge-guided component uses the edge guide feature map and the strong fitting capacity of a neural network to construct a linear model, which enhances the edge contour features of the target in the feature map and suppresses noise interference, thereby improving the accuracy and robustness of target detection. The sparse attention component greatly reduces the computational complexity to only O(NKC), where K ≪ N.

The purpose of the invention is realized by at least one of the following technical solutions.

A license plate real-time detection method based on an edge-guided sparse attention mechanism, wherein the edge-guided sparse attention mechanism comprises an edge-guided component and a sparse attention component, and the real-time detection method comprises the following steps:

s1, processing the input image by using a convolutional neural network, and extracting a semantic feature map X;

s2, the edge-guided sparse attention mechanism captures a license plate region, wherein,

the edge guide component is used for enhancing target edge information and reducing noise interference, and specifically operates as follows:

s21, extracting edge information of the image by using a convolutional neural network to generate an edge guide image I;

s22, obtaining linear model coefficients (a, b) by the aid of the semantic feature map X and the edge guide map I through a convolutional neural network;

s23, constructing a linear model g_i = a_i * I_i + b_i by using the linear model coefficients (a, b) and the edge guide map I, and obtaining a feature map X1 through the linear model;

the sparse attention component is used for reducing the computational complexity of a self-attention mechanism, the feature map X1 is input in the sparse attention component, and the specific operation in the sparse attention component is as follows:

s24, finding K most similar target pixels for each source pixel of the input feature map X1;

s25, for each source pixel, calculating an attention map by using K target pixels most similar to the source pixel;

s26, aggregating the K target pixels by using the attention map to obtain the corresponding output features;

s3, adopting cascade multi-task learning to assist accurate detection of the license plate;

s4, using a loss mask method to suppress the low-quality prediction box.

Further, the specific steps of finding the K most similar target pixels in step S24 are as follows:

s241, predicting an offset map offset_(k,c',i,j) with 2K channels through a convolutional neural network, where K denotes the K target pixels most similar to the corresponding source pixel and 2 denotes the x-axis and the y-axis; a basic grid basic_(c,i,j) is generated from the feature map X1 and represents the original coordinates of each pixel; the basic grid has 2 channels, representing the x-axis and the y-axis respectively;

s242, summing the original coordinates of each target pixel in the basic grid and the offset coordinates of the corresponding K pixels in the offset map to obtain the absolute coordinates abs_offset_(k,c',i,j), according to the formula:

abs_offset_(k,c',i,j) = offset_(k,c',i,j) + basic_(c,i,j)

c = 0, 1;  c' = 2(k-1), 2(k-1)+1;  k = 1, 2, ..., K

wherein c and c' both represent channels, and k represents the kth target pixel corresponding to the source pixel point located at the ith row and the jth column;

s243, based on the feature map X1 and the offset map offset_(k,c',i,j), finding the corresponding K most similar target pixels for each source pixel point by sampling, and obtaining the feature map X2.

Further, the attention map may be calculated using a dot product, a Gaussian function, or an embedded Gaussian function.

Further, the specific calculation formula using the dot product is as follows:

a_(k,i,j) = Σ_c X1_(c,i,j) * X2_(k,c,i,j)

wherein a_(k,i,j) denotes the attention weight between the source pixel point located at the ith row and the jth column and its corresponding kth target pixel, and "*" denotes multiplication of corresponding positions.

Further, the formula for obtaining the feature output is as follows:

o_(c,i,j) = Σ_k a_(k,i,j) * X2_(k,c,i,j)

wherein o_(c,i,j) denotes the aggregated output feature of the source pixel located at the ith row and the jth column with channel c; note that different channels of the feature map X2 at the same location share the same attention weight.

Further, step S3 includes first-level task learning and second-level task learning; the task branches of the first-level task learning are the classification confidence prediction of the license plate, the relative position regression prediction of the bounding box, the classification confidence prediction of the keypoints, and the relative position regression of the keypoints, where the relative position regression of the keypoints is optional; the second-level task learning selectively fuses the prediction feature maps obtained by the first-level multi-task learning and further fine-tunes the target detection to obtain an accurate position.

Further, the loss mask method uses a water ripple loss mask: the closer a bounding box is to the target center point, the larger its regression loss weight, while the regression loss weight of bounding boxes far from the target center is reduced.

Further, the water ripple loss mask is defined as:

compared with the prior art, the invention has the following beneficial effects:

(1) The edge-guided sparse attention mechanism greatly reduces the computational complexity of the self-attention mechanism while making up for the self-attention mechanism's tendency to blur object edges: it greatly enhances the edge contour of the target, suppresses noise interference, and improves target detection accuracy and robustness.

(2) The edge-guided sparse attention mechanism can quickly capture the salient target region, namely the license plate region, so embedding the edge-guided sparse attention module into the detection backbone network allows license plates to be detected directly in real time with high accuracy and recall. Compared with the mainstream second-order detection approach of first detecting the vehicle or a related region and then detecting the license plate, the method maintains high precision while greatly reducing the amount of computation and the number of network parameters, which benefits real-time detection.

Drawings

FIG. 1 is a general flowchart of a license plate detection method based on an edge-guided sparse attention mechanism according to the present invention.

Fig. 2 is a detailed flow diagram of the edge-guided component.

FIG. 3 is a detailed flow diagram of the sparse attention component.

Detailed Description

The following further describes the embodiments of the present invention with reference to the drawings, which are provided for illustration only and are not intended to limit the scope of the present invention.

Most existing license plate detection methods are based on YOLO and adopt a second-order approach, that is, the vehicle is detected first and then the license plate is further detected, which greatly increases the amount of computation and the number of network parameters, reduces detection speed, and makes real-time detection difficult.

A license plate real-time detection method based on an edge-guided sparse attention mechanism comprises the following steps:

step 1: the input image is processed by using a convolutional neural network (Backborn), and a semantic feature Map X (feature Map X) of the image is extracted. Semantic feature map X is extracted, for example, using the top 8 convolutional layers of VGG-19.

Step 2: a salient region, namely a license plate region, is rapidly captured by utilizing a novel Edge-Guided Sparse Attention mechanism (Edge-Guided Sparse Attention module), wherein the Edge-Guided Sparse Attention mechanism comprises two parts, namely an Edge-Guided Component (Edge-Guided Component) and a Sparse Attention Component (Sparse Attention Component).

The operation of the edge-guided component includes 3 steps:

(1) An edge guide map I (Edge-Guided Map I) is generated by a convolutional layer (CNN1), where the input to the convolution is a grayscale image converted from the original input image.

(2) The edge guide map I and the semantic feature map X generated in step 1 are concatenated (CONCAT) and then fed into a small convolutional network (CNN2) to obtain the linear model coefficients (a, b).

(3) The linear model coefficients (a, b) and the edge guide map I are fed into a linear layer (Linear Layer), and a linear model g_i = a_i * I_i + b_i is constructed for each pixel point in the edge guide map I, where g represents the output of the linear model and i represents the index of the pixel. Through this linear model, a feature map X1 (Feature Map X1) with low noise and salient contour information is generated.
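For illustration only, the edge-guided component can be sketched as follows; the layer widths, kernel sizes, and the stride used to match the spatial size of X are assumptions made for the example.

    # Hedged sketch of the edge-guided component (all layer dimensions are assumptions).
    import torch
    import torch.nn as nn

    class EdgeGuidedComponent(nn.Module):
        def __init__(self, feat_channels=256):
            super().__init__()
            # CNN1: edge guide map I from the grayscale image; stride 4 matches X's H/4 x W/4 size.
            self.cnn1 = nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=4, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(16, 1, 3, padding=1))
            # CNN2: predicts the per-pixel linear-model coefficients (a, b) from [X, I].
            self.cnn2 = nn.Conv2d(feat_channels + 1, 2 * feat_channels, 3, padding=1)

        def forward(self, x, gray):
            i = self.cnn1(gray)                              # edge guide map I
            a, b = self.cnn2(torch.cat([x, i], dim=1)).chunk(2, dim=1)
            return a * i + b                                 # linear model g_i = a_i * I_i + b_i -> X1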

The operation of the sparse attention component comprises 3 steps:

(1) The feature map X1 enters the sparse attention component, and the K most similar target pixels are found for each source pixel (Source Pixel) of the input feature map X1, which can be achieved by, but is not limited to, the following steps (a code sketch of these steps is given after step ③):

① An offset map (Offset Map) offset_(k,c',i,j) with 2K channels is predicted in a data-driven manner using a convolutional network (CNN3), where K denotes the K target pixels most similar to the corresponding source pixel, 2 denotes the x-axis and the y-axis, c' denotes the channel, and k denotes the kth target pixel corresponding to the source pixel point located at the ith row and the jth column. To obtain the absolute coordinates of the target pixels, a basic grid (Basic Grid) basic_(c,i,j) is generated from the input feature map X1 of the sparse attention component to represent the original coordinates of each pixel. The basic grid is generated by normalizing the coordinates of each pixel of the feature map X1 to [-1, 1]: the coordinates of the top-left pixel are (-1, -1) and those of the bottom-right pixel are (1, 1). Pixels at the same location on different channels have the same coordinates. The basic grid has 2 channels, representing the x-axis and the y-axis respectively.

② The original coordinates of each target pixel in the basic grid and the offset coordinates of the corresponding K pixels in the offset map are summed element-wise (Element-wise Sum) to obtain the absolute coordinate map (Abs_offset Map) abs_offset_(k,c',i,j), according to the formula:

abs_offset_(k,c',i,j) = offset_(k,c',i,j) + basic_(c,i,j)

c = 0, 1;  c' = 2(k-1), 2(k-1)+1;  k = 1, 2, ..., K

c and c' both represent channels, and k represents the kth target pixel corresponding to the source pixel point located at the ith row and jth column.

③ The corresponding K most similar target pixels (The K Similar Pixels) are found for each source pixel point by sampling (Sampling), thereby obtaining the feature map X2 (Feature Map X2).
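For illustration only, steps ① to ③ can be sketched as follows; the use of F.grid_sample as the sampling operator, the value of K, and the layer dimensions are assumptions made for the example.

    # Hedged sketch of offset prediction, basic grid, and sampling (dimensions are assumptions).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseSampler(nn.Module):
        def __init__(self, channels=256, k=9):
            super().__init__()
            self.k = k
            self.cnn3 = nn.Conv2d(channels, 2 * k, 3, padding=1)   # offset map with 2K channels

        def forward(self, x1):
            b, c, h, w = x1.shape
            # Basic grid: original (x, y) coordinates of every pixel, normalized to [-1, 1].
            ys = torch.linspace(-1, 1, h, device=x1.device)
            xs = torch.linspace(-1, 1, w, device=x1.device)
            gy, gx = torch.meshgrid(ys, xs, indexing="ij")
            basic = torch.stack([gx, gy], dim=-1)                               # (h, w, 2)
            # Offset map: per source pixel, (x, y) offsets to its K most similar target pixels.
            offset = self.cnn3(x1).view(b, self.k, 2, h, w).permute(0, 1, 3, 4, 2)
            abs_offset = offset + basic                                         # absolute coordinates
            # Sample the K target pixels for every source pixel (differentiable bilinear sampling).
            x2 = F.grid_sample(
                x1.unsqueeze(1).expand(-1, self.k, -1, -1, -1).reshape(b * self.k, c, h, w),
                abs_offset.reshape(b * self.k, h, w, 2),
                align_corners=True)
            return x2.view(b, self.k, c, h, w)                                  # feature map X2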

(2) For each source pixel, an attention map (Attention Map) is calculated using the K target pixels most similar to that source pixel. The calculation formula may be a dot product, a Gaussian function, an embedded Gaussian function, and so on; taking the dot product as an example:

a_(k,i,j) = Σ_c X1_(c,i,j) * X2_(k,c,i,j)

where a_(k,i,j) denotes the attention weight between the source pixel point located at the ith row and the jth column and its corresponding kth target pixel, X2_(k,c,i,j) denotes the cth channel value in the feature map X2 of the kth target pixel corresponding to the source pixel point located at the ith row and the jth column, X1_(c,i,j) denotes the cth channel value of the source pixel point located at the ith row and the jth column in the feature map X1, and "*" denotes multiplication of corresponding positions.
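For illustration only, the dot-product attention weights can be sketched as follows; the softmax normalization over the K candidates is an assumption, since the exact normalization is not reproduced in the source.

    # a[k, i, j] = sum_c X1[c, i, j] * X2[k, c, i, j]; softmax over K is an assumption.
    import torch

    def attention_weights(x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        # x1: (b, c, h, w) source features; x2: (b, k, c, h, w) sampled target features
        a = (x1.unsqueeze(1) * x2).sum(dim=2)      # (b, k, h, w) raw dot-product similarities
        return torch.softmax(a, dim=1)             # normalize across the K target pixels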

(3) The computed attention map is used to aggregate the K target pixels and obtain the corresponding discriminative output feature map (Feature Map O), using the following formula:

o_(c,i,j) = Σ_k a_(k,i,j) * X2_(k,c,i,j)

where o_(c,i,j) denotes the aggregated output feature of the source pixel located at the ith row and the jth column with channel c. Note that different channels of the feature map X2 at the same location share the same attention weight.
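For illustration only, the aggregation step can be sketched as follows, matching the formula above.

    # o[c, i, j] = sum_k a[k, i, j] * X2[k, c, i, j]; one weight is shared by all channels of a location.
    import torch

    def aggregate(a: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        # a: (b, k, h, w) attention weights; x2: (b, k, c, h, w) -> output feature map O: (b, c, h, w)
        return (a.unsqueeze(2) * x2).sum(dim=1)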

Step 3: cascade multi-task learning (Cascade Multi-Task Detection Head) is adopted to assist accurate license plate detection.

The task branches in the first-level multi-task learning are the classification confidence prediction of the license plate (Classification), the relative position regression prediction of the bounding box (Regression), the keypoint classification confidence prediction (Landmark Classification), and the keypoint relative position regression (Landmark Regression), where the keypoint relative position regression is optional.

The second-level task learning fuses the license plate classification confidence prediction feature map and the keypoint classification confidence prediction obtained from the first-level multi-task learning, for example with a CONCAT operation, and performs further fine-tuning (Refined Classification) of the target detection to obtain an accurate position.
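For illustration only, the cascade multi-task head can be sketched as follows; the channel widths, the number of keypoints, and the refinement layer are assumptions made for the example.

    # Hedged sketch of the cascade multi-task detection head (all dimensions are assumptions).
    import torch
    import torch.nn as nn

    class CascadeHead(nn.Module):
        def __init__(self, channels=256, num_landmarks=4):
            super().__init__()
            self.cls = nn.Conv2d(channels, 1, 1)                       # plate classification confidence
            self.box = nn.Conv2d(channels, 4, 1)                       # box relative-position regression
            self.lmk_cls = nn.Conv2d(channels, num_landmarks, 1)       # landmark classification confidence
            self.lmk_reg = nn.Conv2d(channels, 2 * num_landmarks, 1)   # optional landmark regression
            # Second level: fuse the two confidence maps (CONCAT) and refine the classification.
            self.refine = nn.Conv2d(1 + num_landmarks, 1, 3, padding=1)

        def forward(self, feat):
            cls, box = self.cls(feat), self.box(feat)
            lmk_cls, lmk_reg = self.lmk_cls(feat), self.lmk_reg(feat)
            refined = self.refine(torch.cat([cls, lmk_cls], dim=1))
            return refined, box, lmk_cls, lmk_reg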

Step 4: a new loss mask method is used to suppress low-quality prediction boxes and improve system performance. The core of the new water ripple loss mask is that the closer a bounding box is to the target center point, the larger its regression loss weight, while the regression loss weight of bounding boxes far from the target center is reduced. Through back-propagation, low-quality prediction boxes are suppressed, further improving detection performance.

The water ripple loss mask is defined as follows:

wherein the first term denotes the predicted license plate classification confidence and the second denotes the ground-truth label; (x_i, y_i) denotes the coordinates of pixel point i, and the corresponding term denotes the mask value at pixel point i; (c_x, c_y) denotes the object center point obtained by downsampling the ground-truth annotated bounding box (by a factor of 2^2) and mapping it to the same size as the predicted license plate classification confidence map; the loss values of pixel points located in the gray-scale region do not need to be calculated; and σ denotes a hyper-parameter.
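For illustration only, and because the exact water-ripple formula is not reproduced here, the sketch below uses a Gaussian-shaped center-weighted mask as a stand-in; it only mirrors the stated behavior that the weight grows toward the center (c_x, c_y) and decays with distance, controlled by σ.

    # Assumption: a Gaussian-shaped mask standing in for the unreproduced water-ripple formula.
    import torch

    def center_weighted_mask(h: int, w: int, cx: float, cy: float, sigma: float) -> torch.Tensor:
        ys, xs = torch.meshgrid(torch.arange(h).float(), torch.arange(w).float(), indexing="ij")
        d2 = (xs - cx) ** 2 + (ys - cy) ** 2
        return torch.exp(-d2 / (2 * sigma ** 2))   # weight -> 1 at the center, decays with distance

    # The mask multiplies the per-pixel box-regression loss so that low-quality predictions
    # far from the object center are down-weighted during back-propagation.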

The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core ideas. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
