Bank card number positioning and end-to-end identification method based on CNN and RNN

Document No.: 1505621; Publication date: 2020-02-07

Abstract: The invention "Bank card number positioning and end-to-end identification method based on CNN and RNN" was created by Ni Jianjun (倪建军), Jiang Juyong (江聚勇), Zhu Jinxiu (朱金秀) and Chen Peng (陈鹏) on 2019-09-29. It discloses a CNN- and RNN-based method for locating a bank card number and recognizing it end to end, addressing the low card-number recognition accuracy of prior methods when faced with complex background patterns, varied printed and embossed fonts, multiple font colors, and complex shooting scenes. The method comprises four steps: step 1, build a bank card image dataset; step 2, apply data augmentation to the dataset; step 3, locate and extract the card-number region from the bank card image; step 4, perform end-to-end character recognition on the card-number region image. The method can locate and recognize card numbers under a variety of complex conditions and is applicable to other digit-recognition tasks such as ID number, license plate, scoreboard, and bill serial number recognition.

1. A bank card number positioning and end-to-end identification method based on CNN and RNN, characterized by comprising the following steps:

(1) building a bank card image dataset;

(2) applying data augmentation to the bank card image dataset;

(3) locating and extracting the card-number region image from the bank card image;

(4) performing end-to-end character recognition on the card-number region image.

2. The CNN- and RNN-based bank card number positioning and end-to-end identification method according to claim 1, wherein step (1) specifically comprises:

(1a) collecting a set of bank card images and, after manually enlarging each image to read it accurately, using the bank card number as the file name of the image, with the spaces between groups of digits represented by designated English letters;

(1b) for the resulting bank card image dataset with card-number labels, creating localization labels with the open-source annotation tool LabelImg.

3. The CNN- and RNN-based bank card number positioning and end-to-end identification method according to claim 1, wherein step (2) specifically comprises:

(2a) randomly flipping the bank card image with the flip function of OpenCV to simulate different card orientations at capture time, yielding images flipped horizontally, vertically, or both;

(2b) randomly cropping the bank card image by randomly selecting pixel ranges along its height and width to simulate different positions at capture time, yielding images at different crop scales;

(2c) randomly rotating the bank card image with the getRotationMatrix2D and warpAffine functions of OpenCV to simulate different shooting angles, yielding images at different rotation angles;

(2d) applying a Gamma brightness transform to the bank card image with the LUT function of OpenCV to simulate different illumination intensities, yielding images of different brightness;

(2e) adding random salt-and-pepper noise to the bank card image by randomly setting some pixel values to 0 or 255, to simulate noise produced at capture time, yielding images with different amounts of noise;

(2f) randomly adding Gaussian noise of different magnitudes to the image to simulate noise from the complex surroundings and the capture device, yielding images with different Gaussian noise;

(2g) applying random Gaussian blur to the bank card image with the GaussianBlur function of OpenCV to simulate complex surroundings and the noise introduced by the capture device, yielding images with different degrees of Gaussian blur;

(2h) applying color jitter of varying degree to the bank card image with the cvtColor function of OpenCV to simulate the diversity and complexity of card background patterns and surrounding scene colors, yielding images in different color spaces;

(2i) normalizing the size of the bank card image with the resize function of OpenCV to obtain images of a specified size.

4. The CNN- and RNN-based bank card number positioning and end-to-end identification method according to claim 1, wherein step (3) specifically comprises:

(3a) using a CNN to automatically extract features of the bank card image, obtaining a feature map;

(3b) extracting further features from the feature map of step (3a) with a sliding window, and using them to predict the category information of the k anchors at each window position, thereby defining candidate target regions;

(3c) feeding the feature map obtained in step (3b) into a bidirectional RNN to obtain a feature map carrying character-sequence features;

(3d) feeding the feature map of step (3c) into a one-dimensional CNN for further feature extraction, obtaining a higher-level semantic feature map;

(3e) feeding the high-level semantic feature map of step (3d) into a fully connected (FC) layer for classification and regression, obtaining: the heights and center y-coordinates of the k selection boxes; the category information of the k selection boxes, indicating whether each box contains text; and the horizontal offsets of the k selection boxes;

(3f) merging the k selection boxes into a single text-sequence box with a text-line construction algorithm, thereby obtaining the localization box of the bank card number region; the loss function Loss is computed as shown in formula (1):

$$\mathrm{Loss}(s_i, v_j, o_k) = \frac{1}{N_s}\sum_i L_s^{cl}(s_i, s_i^*) + \frac{\lambda_1}{N_v}\sum_j L_v^{re}(v_j, v_j^*) + \frac{\lambda_2}{N_o}\sum_k L_o^{re}(o_k, o_k^*) \tag{1}$$

wherein [equation image FDA0002220975650000022 not recovered]; symbols marked with an asterisk in formula (1) denote ground truth, each anchor is a training sample, $i$ is the index of an anchor in the mini-batch, $s_i$ is the predicted probability that anchor $i$ is text, and $s_i^* \in \{0, 1\}$ is the corresponding ground truth.

5. The CNN- and RNN-based bank card number positioning and end-to-end identification method according to claim 4, wherein step (4) specifically comprises:

(4a) converting the color card-number region image obtained in step (3f) to grayscale with the cvtColor function of OpenCV, and feeding the grayscale image into a CNN, which automatically extracts its features to obtain a feature map;

(4b) feeding the feature map of step (4a) into a bidirectional RNN to obtain a feature map carrying character-sequence features;

(4c) mapping the character-sequence feature map of step (4b) to the final label sequence with a CTC transcription layer.

Technical Field

The invention relates to a CNN and RNN-based bank card number positioning and end-to-end identification method, and belongs to the technical field of deep learning and computer vision.

Background

With the rapid development of the mobile internet, mobile payment has become one of the mainstream payment methods, and in many scenarios users must bind a bank card before they can transfer funds. For example, when binding a card, the user scans it, the card number is recognized automatically, and verification follows; likewise, before handling business at a bank, the customer presents the card, which is scanned so that the card number can be recognized. Automatic bank card detection and recognition lets users of payment platforms photograph the card with the camera of a mobile device and have the number recognized automatically. Compared with entering the card number by hand, this improves efficiency, reduces labor cost, and improves the user experience.

Bank card designs increasingly pursue novelty, individuality, and fashion, which results in complex background patterns, varied printed and embossed characters, and many character colors. As a consequence, bank card numbers are recognized with low accuracy in complex natural scenes.

Chinese patent CN109034145A discloses a bank card number recognition method based on OpenCV. This digital image processing method takes the influence of illumination intensity on recognition accuracy into account and preprocesses images with different binarization algorithms for different lighting conditions; it then obtains the card-number region by contour extraction after dilation and erosion, segments the region into characters by row projection, and finally recognizes the characters with a template-matching algorithm.

Chinese patent CN109242047A discloses a bank card number detection and recognition method based on K-means++ clustering and residual network classification. The method locates each digit of the bank card number with the K-means++ clustering algorithm and feeds the cropped digits into a residual network for classification.

The methods proposed in the above patents rely on a large number of manually set, fixed parameter thresholds to locate and recognize the bank card number. Given the complexity of natural scenes, one manually tuned set of fixed thresholds cannot handle every case. These models therefore have limited robustness and resistance to interference and a weak ability to extract image features automatically. They remain susceptible to complex background patterns, differing card-number fonts, the shooting angle and position, and the complexity of the surrounding scene, which leads to low card-number recognition accuracy.

In addition, some application scenarios require the spaces between groups of digits in the card number to be recognized. Because the methods in the above patents all segment and extract the individual digits, they cannot recognize those spaces.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art by providing a CNN- and RNN-based method for bank card number positioning and end-to-end identification, addressing the low card-number recognition accuracy of the prior art and its inability to recognize the spaces within a bank card number.

To achieve this purpose, the invention adopts the following technical solution:

A bank card number positioning and end-to-end identification method based on CNN and RNN, comprising the following steps:

(1) building a bank card image dataset;

(2) applying data augmentation to the bank card image dataset;

(3) locating and extracting the card-number region image from the bank card image;

(4) performing end-to-end character recognition on the card-number region image.

The specific steps of step (1) are as follows:

(1a) collecting a set of bank card images and, after manually enlarging each image to read it accurately, using the bank card number as the file name of the image, with the spaces between groups of digits represented by designated English letters;

(1b) for the resulting bank card image dataset with card-number labels, creating localization labels with the open-source annotation tool LabelImg.
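As a minimal sketch of the labeling convention in step (1a), the snippet below recovers a label string from an image file name. The text does not say which letter stands for a space, so the marker `s`, the example file name, and the helper name `label_from_filename` are all assumptions made for illustration only.

```python
from pathlib import Path

SPACE_MARKER = "s"  # hypothetical letter standing in for a space in file names


def label_from_filename(path: str, space_marker: str = SPACE_MARKER) -> str:
    """Return the card-number label (digits and spaces) encoded in a file name."""
    stem = Path(path).stem                      # drop directory and extension
    label = stem.replace(space_marker, " ")     # restore the spaces
    if not all(ch.isdigit() or ch == " " for ch in label):
        raise ValueError(f"unexpected characters in label: {stem!r}")
    return label


# Made-up example number, not taken from the source.
print(label_from_filename("data/6222s0212s3456s7890.jpg"))
```

Keeping the spaces in the label is what later allows the recognizer to be trained to output them as an ordinary class alongside the ten digits.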

The specific steps of step (2) are as follows:

(2a) randomly flipping the bank card image with the flip function of OpenCV to simulate different card orientations at capture time, yielding images flipped horizontally, vertically, or both;

(2b) randomly cropping the bank card image by randomly selecting pixel ranges along its height and width to simulate different positions at capture time, yielding images at different crop scales;

(2c) randomly rotating the bank card image with the getRotationMatrix2D and warpAffine functions of OpenCV to simulate different shooting angles, yielding images at different rotation angles;

(2d) applying a Gamma brightness transform to the bank card image with the LUT function of OpenCV to simulate different illumination intensities, yielding images of different brightness;

(2e) adding random salt-and-pepper noise to the bank card image by randomly setting some pixel values to 0 or 255, to simulate noise produced at capture time, yielding images with different amounts of noise;

(2f) randomly adding Gaussian noise of different magnitudes to the image to simulate noise from the complex surroundings and the capture device, yielding images with different Gaussian noise;

(2g) applying random Gaussian blur to the bank card image with the GaussianBlur function of OpenCV to simulate complex surroundings and the noise introduced by the capture device, yielding images with different degrees of Gaussian blur;

(2h) applying color jitter of varying degree to the bank card image with the cvtColor function of OpenCV to simulate the diversity and complexity of card background patterns and surrounding scene colors, yielding images in different color spaces;

(2i) normalizing the size of the bank card image with the resize function of OpenCV to obtain images of a specified size.
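Four of the augmentations in step (2) can be sketched as follows: flipping, the Gamma lookup table, salt-and-pepper noise, and Gaussian noise. To keep the example runnable without OpenCV, it uses NumPy equivalents of the OpenCV calls named in the text (array slicing in place of flip, table indexing in place of LUT). The function names, parameter values, and the random test image are assumptions, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_flip(img, rng):
    """Flip horizontally, vertically, both, or not at all (chosen at random)."""
    mode = rng.integers(0, 4)
    if mode == 1:
        return img[:, ::-1]      # horizontal flip
    if mode == 2:
        return img[::-1, :]      # vertical flip
    if mode == 3:
        return img[::-1, ::-1]   # both
    return img


def gamma_transform(img, gamma):
    """Apply a gamma curve through a 256-entry lookup table (uint8 input)."""
    table = np.round(255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)
    return table[img]


def salt_and_pepper(img, amount, rng):
    """Set roughly a fraction `amount` of the pixels to 0 or 255."""
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < amount / 2] = 0          # pepper
    out[mask > 1 - amount / 2] = 255    # salt
    return out


def gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to the uint8 range."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


# Random stand-in for a card photo (height 54, width 86, 3 channels).
img = rng.integers(0, 256, size=(54, 86, 3), dtype=np.uint8)
aug = random_flip(img, rng)
aug = gamma_transform(aug, gamma=0.8)
aug = salt_and_pepper(aug, amount=0.02, rng=rng)
aug = gaussian_noise(aug, sigma=5.0, rng=rng)
print(aug.shape, aug.dtype)
```

Note that flipping or rotating the image changes where the card number lies, so in a localization dataset the LabelImg boxes would have to be transformed with the same operation; that bookkeeping is omitted here.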

The specific steps of step (3) are as follows:

(3a) using a CNN (convolutional neural network) to automatically extract features of the bank card image, obtaining a feature map;

(3b) extracting further features from the feature map of step (3a) with a sliding window, and using them to predict the category information of the k anchors at each window position, thereby defining candidate target regions;

(3c) feeding the feature map obtained in step (3b) into a bidirectional RNN (recurrent neural network) to obtain a feature map carrying character-sequence features;

(3d) feeding the feature map of step (3c) into a one-dimensional CNN for further feature extraction, obtaining a higher-level semantic feature map;

(3e) feeding the high-level semantic feature map of step (3d) into an FC (fully connected) layer for classification and regression, obtaining: the heights and center y-coordinates of the k selection boxes; the category information of the k selection boxes, indicating whether each box contains text; and the horizontal offsets of the k selection boxes;

(3f) merging the k selection boxes into a single text-sequence box with a text-line construction algorithm, thereby obtaining the localization box of the bank card number region; the loss function Loss is computed as shown in formula (1):

$$\mathrm{Loss}(s_i, v_j, o_k) = \frac{1}{N_s}\sum_i L_s^{cl}(s_i, s_i^*) + \frac{\lambda_1}{N_v}\sum_j L_v^{re}(v_j, v_j^*) + \frac{\lambda_2}{N_o}\sum_k L_o^{re}(o_k, o_k^*) \tag{1}$$

where [symbol not recovered from equation image] is computed as shown in formula (2):

[Formula (2): equation image not recovered]

In formula (1), symbols marked with an asterisk denote ground truth. Each anchor is a training sample, and $i$ is the index of an anchor in the mini-batch; $s_i$ is the predicted probability that anchor $i$ is text, and $s_i^* \in \{0, 1\}$ is the ground truth. $j$ is the index of an anchor in the set of valid anchors for y-coordinate regression, where a valid anchor is either a positive anchor ($s_j^* = 1$) or one whose IoU with a ground-truth text proposal is greater than 0.5; $v_j$ and $v_j^*$ are the predicted and ground-truth y-coordinates of the $j$-th anchor. $k$ is the index of an anchor in the set of anchors lying within a horizontal distance of the left or right side of the ground-truth text-line bounding box; $o_k$ and $o_k^*$ are the predicted and ground-truth x-offsets of the $k$-th anchor. $L_s^{cl}$ is the classification loss, which uses the Softmax loss to distinguish text from non-text; $L_v^{re}$ and $L_o^{re}$ are the regression losses, computed with the smooth L1 function. $\lambda_1$ and $\lambda_2$ are loss weights that balance the different tasks. $N_s$, $N_v$ and $N_o$ are normalization parameters denoting the total number of anchors used by $L_s^{cl}$, $L_v^{re}$ and $L_o^{re}$, respectively.
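The three-term loss described above can be illustrated with a small pure-Python sketch. Several things here are assumptions rather than statements of the source: formula (2) is not recoverable from this text, so the widely used smooth L1 definition is substituted; the weights `lam1=1.0` and `lam2=2.0` are illustrative defaults; each regression target is treated as a single scalar for brevity; and the normalizers are simply the number of anchors passed to each term.

```python
import math


def smooth_l1(x: float) -> float:
    """Common smooth L1 form: quadratic near zero, linear elsewhere (assumed)."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def softmax_ce(p_text: float, is_text: int) -> float:
    """Cross-entropy of a two-class text / non-text probability."""
    p = p_text if is_text == 1 else 1.0 - p_text
    return -math.log(max(p, 1e-12))   # clamp to avoid log(0)


def total_loss(s, s_true, v, v_true, o, o_true, lam1=1.0, lam2=2.0):
    """Weighted sum of the classification term and the two regression terms."""
    cls = sum(softmax_ce(p, t) for p, t in zip(s, s_true)) / max(len(s), 1)
    reg_v = sum(smooth_l1(a - b) for a, b in zip(v, v_true)) / max(len(v), 1)
    reg_o = sum(smooth_l1(a - b) for a, b in zip(o, o_true)) / max(len(o), 1)
    return cls + lam1 * reg_v + lam2 * reg_o


# Three anchors for classification, two valid anchors for the y-regression,
# one side anchor for the x-offset regression (all numbers are made up).
loss = total_loss(
    s=[0.9, 0.2, 0.7], s_true=[1, 0, 1],
    v=[0.10, -0.30], v_true=[0.00, -0.25],
    o=[0.4], o_true=[0.1],
)
print(round(loss, 4))
```

The three lists have different lengths on purpose: each term sums over its own subset of anchors, which is why the formula carries three separate normalizers.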

The specific steps of step (4) are as follows:

(4a) converting the color card-number region image obtained in step (3f) to grayscale with the cvtColor function of OpenCV, and feeding the grayscale image into a CNN, which automatically extracts its features to obtain a feature map;

(4b) feeding the feature map of step (4a) into a bidirectional RNN to obtain a feature map carrying character-sequence features;

(4c) mapping the character-sequence feature map of step (4b) to the final label sequence with a CTC transcription layer.
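The mapping in step (4c) from per-frame scores to a label sequence can be sketched with greedy CTC decoding: take the best class in each frame, merge consecutive repeats, then drop the blank symbol. The label set (ten digits plus a space class), the position of the blank index, and the helper names are assumptions for illustration. The source does not specify the decoding strategy, and beam search is a common alternative.

```python
from itertools import groupby

ALPHABET = "0123456789 "   # assumed label set: ten digits plus a space class
BLANK = len(ALPHABET)      # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores):
    """Best class per frame, merge consecutive repeats, then remove blanks."""
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    collapsed = [k for k, _ in groupby(best)]
    return "".join(ALPHABET[k] for k in collapsed if k != BLANK)


def one_hot(k, n=BLANK + 1):
    """Helper that builds a score vector whose maximum is at class k."""
    return [1.0 if i == k else 0.0 for i in range(n)]


# Frame-wise best classes: 6 6 _ 2 2 _ 2 <space> _ 0   (_ is the blank)
path = [6, 6, BLANK, 2, 2, BLANK, 2, 10, BLANK, 0]
decoded = ctc_greedy_decode([one_hot(k) for k in path])
print(repr(decoded))   # '622 0'
```

The blank between the two runs of `2` is what lets the decoder keep a genuinely repeated digit, while the space is decoded like any other character, with no character segmentation step.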

The invention achieves the following beneficial effects:

1. For data augmentation, the invention applies a series of operations to the bank card images using OpenCV image processing functions, including random flipping, random cropping, random rotation, Gamma brightness transformation, random salt-and-pepper noise, random Gaussian blur, and color jitter. This effectively mitigates the low card-number recognition accuracy that the prior art suffers because bank card datasets are hard to collect, card designs are diverse, and shooting scenes are complex.

2. The combination of a CNN, a bidirectional RNN, and a fully connected layer automatically extracts effective localization features from the image and accurately locates the bank card number region despite interference from complex background patterns and variation in printed fonts, shooting scenes, and shooting angles and positions.

3. The invention adopts an end-to-end recognition model. It thereby avoids the drawbacks of the prior-art pipeline that first segments individual characters and then recognizes them: incomplete character segmentation, the inability to segment the spaces between characters, and complex background patterns degrading the character recognition rate.
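For the localization described in effect 2, the merging of narrow selection boxes into one card-number box in step (3f) can be sketched as a simple left-to-right chaining. The source only names a "text construction algorithm", so the chaining rule, the thresholds `max_gap=50` and `min_overlap=0.7`, and the function names are assumptions chosen for illustration.

```python
def vertical_overlap(a, b):
    """Overlap of two (x1, y1, x2, y2) boxes along y, relative to the shorter box."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(inter, 0.0) / shorter if shorter > 0 else 0.0


def build_text_lines(boxes, max_gap=50, min_overlap=0.7):
    """Chain horizontally adjacent boxes and return one bounding box per chain."""
    chains, chain = [], []
    for box in sorted(boxes, key=lambda b: b[0]):       # scan left to right
        if chain and (box[0] - chain[-1][2] > max_gap
                      or vertical_overlap(chain[-1], box) < min_overlap):
            chains.append(chain)                         # close the current line
            chain = []
        chain.append(box)
    if chain:
        chains.append(chain)
    return [(min(b[0] for b in c), min(b[1] for b in c),
             max(b[2] for b in c), max(b[3] for b in c)) for c in chains]


# Three adjacent 16-pixel-wide proposals plus one far-away false positive.
proposals = [(0, 10, 16, 30), (16, 11, 32, 31), (32, 10, 48, 30), (200, 100, 216, 120)]
lines = build_text_lines(proposals)
print(lines)   # [(0, 10, 48, 31), (200, 100, 216, 120)]
```

In a card-number setting one would then typically keep the widest resulting box as the card-number region, though the source does not state how the final box is selected.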

Drawings

FIG. 1 is a flow chart of an implementation of the present invention;

FIG. 2 is a flow chart of the bank card number area location of the present invention;

FIG. 3 is a flow chart of the end-to-end identification of the bank card number of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
