Self-learning character recognition method


Abstract: This technology, a self-learning character recognition method (一种自学习字符识别方法), was created by 任玉荣, 梁晓锋, 王伟平, 段智强, and 杨涛 on 2020-12-23. The invention belongs to the technical field of character recognition and specifically relates to a self-learning character recognition method comprising the following steps: (1) image preprocessing; (2) training server; (3) character detection; (4) feature extraction; (5) classifier processing; (6) confidence judgment; (7) model training and result recognition. The method combines template matching with deep learning and splits the end-to-end character recognition network apart, independently training two models: character detection and recognition classification. The invention can respond quickly to newly added character recognition requirements, and its recognition accuracy improves continuously as data accumulate; the method also introduces a position detection function, which improves the accuracy of program execution.

1. A self-learning character recognition method, characterized by comprising the following steps:

(1) image pre-processing

Acquiring image data through a camera, and preprocessing the image data;

(2) training server

The training server comprises a data set, a model training module, labels, a position detection module, a character detection model, and a classification model; the image data acquired by the camera in step (1) are simultaneously uploaded to the data set; when the number of samples in the data set exceeds a preset threshold, model training is started, and the character detection model and the classification model are updated once training finishes; during use, position detection assists the character detection model and the classification model;

(3) character detection

Inputting the image preprocessed in step (1) into a character detection algorithm while loading the data of the character detection model, retrieving the positions of the characters from the full image, and then cropping the pixel regions where the characters are located from the full image; at the same time, backing up the cropped character pixel regions in an update template library;

(4) feature extraction

Feeding each of the character pixel regions obtained in step (3) into a feature extraction algorithm, and outputting a feature vector corresponding to each character;

(5) classifier processing

Inputting the feature vector of each character obtained in step (4) into a classifier algorithm while loading the data of the classification model for classification, and outputting the class information and confidence corresponding to each character;

(6) confidence determination

Checking the confidence of each character detected in step (5) one by one;

when the confidence is greater than a preset threshold, the character is a correctly recognized character and its class is the detected class, thereby giving the recognition result;

(7) model training, result recognition

Transmitting the recognition result of step (6) to the data set for model training in the training server, the transmitted data comprising the images and character labels.

2. The self-learning character recognition method of claim 1, wherein in step (1), the preprocessing comprises resizing the image, cropping the image by percentage, adjusting the image brightness, and adding a marker frame to the image or applying histogram equalization.

3. The self-learning character recognition method of claim 1, wherein in step (6), when the confidence is less than the preset threshold, the character is a misrecognized character; a template matching algorithm is started for recognition, and the recognition result is then transmitted to the data set for model training in the training server, the transmitted data comprising the images and character labels.

4. The self-learning character recognition method of claim 3, wherein the template matching recognition proceeds as follows: the pixel regions whose confidence is below the preset threshold in step (6) are recognized by the template matching algorithm, which outputs a classification result and an error value; when the error value is less than a preset threshold, the character is assigned the corresponding category in the template library.

5. The self-learning character recognition method of claim 4, wherein during template matching recognition, when the error value is greater than the preset threshold, the character must be manually labeled as a new category, and the character pixel region backed up in step (3) is assigned to the new category and added to the update template library.

6. The self-learning character recognition method of claim 4, wherein the calculation process of the template matching algorithm is as follows:

Each template in the template library is an H × W matrix; the detected character image region is resized to H × W using bilinear interpolation, and the category and error are then calculated by the following formula:

E_k = \sum_{i=1}^{H} \sum_{j=1}^{W} \left( T_k(i,j) - P(i,j) \right)^2, \qquad k^{*} = \arg\min_{k \in \{1,\dots,N\}} E_k

where k runs from 1 to N over the N templates in the template library; after comparing the character to be recognized with all templates one by one, the category corresponding to the template with the minimum error value E is taken as the recognition result; T_k is the k-th template matrix, P is the matrix cropped from the original image, and i and j are the indexes along the two dimensions of the matrices.

7. The self-learning character recognition method of claim 1, wherein in step (3), the character detection algorithm trains a character position detection model using the end-to-end YOLOv3 deep learning algorithm: varied character data are first collected, annotated, and augmented; the model is then trained with a gradient descent algorithm based on mean square error; after training, the model is packaged into a binary model file, which is loaded during inference to perform character detection.

8. The self-learning character recognition method of claim 1, wherein in step (4), the feature extraction algorithm uses HOG features to describe the character image: the character image is mapped by HOG into a feature vector, which serves as the input of the classifier algorithm.

9. The self-learning character recognition method of claim 1, wherein in step (5), the classifier algorithm takes the feature vector from the feature extraction algorithm as input and performs the classification operation through a neural network; the output of the neural network is a probability distribution vector, the category at the position of the maximum probability value is the category of the character, and the maximum probability value is the confidence of the recognition.

10. The self-learning character recognition method of claim 1, wherein the training server trains the character detection model using the character detection algorithm and trains the classification model using the classifier algorithm.

Technical Field

The invention belongs to the technical field of character recognition, and particularly relates to a self-learning character recognition method.

Background

Existing monitoring systems make wide use of character recognition, which provides an understanding of the deeper semantics of a monitored scene. In a typical monitoring workflow, a camera photographs the marked image information on an object, and an algorithm in a recognition device then recognizes the characters in the image. However, the set of recognizable character types is limited, and if new characters appear while the system is running, misrecognition occurs.

Existing character recognition methods include template matching and deep-learning network recognition. Template matching is simple and easy to use: when new characters appear, recognition of the added characters can be achieved by quickly calibrating new templates. However, because the generalization ability of the template matching algorithm is weak, its accuracy is low when recognizing nonstandard characters in complex environments.

Deep-learning network recognition is an end-to-end method that performs character localization and classification simultaneously and generalizes well, but the network can only be used after training on a large amount of data, so it cannot respond to newly added character recognition requirements in real time. Moreover, because a feature conflict exists between target detection and classification in an end-to-end deep learning network, there is still room to improve recognition accuracy.

Disclosure of Invention

The invention aims to provide a self-learning character recognition method that solves the problem of a low recognition rate for new characters when recognizing characters on target objects in a monitoring management system.

The invention is realized as follows:

A self-learning character recognition method comprises the following steps:

(1) image pre-processing

Acquiring image data through a camera, and preprocessing the image data;

(2) training server

The training server comprises a data set, a model training module, labels, a position detection module, a character detection model, and a classification model; the image data acquired by the camera in step (1) are simultaneously uploaded to the data set; when the number of samples in the data set exceeds a preset threshold, model training is started, and the character detection model and the classification model are updated once training finishes; during use, position detection assists the character detection model and the classification model;

(3) character detection

Inputting the image preprocessed in step (1) into a character detection algorithm while loading the data of the character detection model, retrieving the positions of the characters from the full image, and then cropping the pixel regions where the characters are located from the full image; at the same time, backing up the cropped character pixel regions in an update template library;

(4) feature extraction

Feeding each of the character pixel regions obtained in step (3) into a feature extraction algorithm, and outputting a feature vector corresponding to each character;

(5) classifier processing

Inputting the feature vector of each character obtained in step (4) into a classifier algorithm while loading the data of the classification model for classification, and outputting the class information and confidence corresponding to each character;

(6) confidence determination

Checking the confidence of each character detected in step (5) one by one;

when the confidence is greater than a preset threshold, the character is a correctly recognized character and its class is the detected class, thereby giving the recognition result;

(7) model training, result recognition

Transmitting the recognition result of step (6) to the data set for model training in the training server, the transmitted data comprising the images and character labels.

Further, in step (1), the preprocessing comprises resizing the image, cropping the image by percentage, adjusting the image brightness, and adding a marker frame to the image or applying histogram equalization.

Further, in step (6), when the confidence is less than the preset threshold, the character is a misrecognized character; a template matching algorithm is started for recognition, and the recognition result is then transmitted to the data set for model training in the training server, the transmitted data comprising the images and character labels.

Further, the template matching recognition proceeds as follows: the pixel regions whose confidence is below the preset threshold in step (6) are recognized by the template matching algorithm, which outputs a classification result and an error value; when the error value is less than a preset threshold, the character is assigned the corresponding category in the template library.

Further, during template matching recognition, when the error value is greater than the preset threshold, the character must be manually labeled as a new category, and the character pixel region backed up in step (3) is assigned to the new category and added to the update template library.

Further, the calculation process of the template matching algorithm is as follows:

Each template in the template library is an H × W matrix; the detected character image region is resized to H × W using bilinear interpolation, and the category and error are then calculated by the following formula:

E_k = \sum_{i=1}^{H} \sum_{j=1}^{W} \left( T_k(i,j) - P(i,j) \right)^2, \qquad k^{*} = \arg\min_{k \in \{1,\dots,N\}} E_k

where k runs from 1 to N over the N templates in the template library; after comparing the character to be recognized with all templates one by one, the category corresponding to the template with the minimum error value E is taken as the recognition result; T_k is the k-th template matrix, P is the matrix cropped from the original image, and i and j are the indexes along the two dimensions of the matrices.

Further, in step (3), the character detection algorithm trains a character position detection model using the end-to-end YOLOv3 deep learning algorithm: varied character data are first collected, annotated, and augmented; the model is then trained with a gradient descent algorithm based on mean square error; after training, the model is packaged into a binary model file, which is loaded during inference to perform character detection.

Further, in step (4), the feature extraction algorithm uses HOG features to describe the character image: HOG maps the character image into a feature vector, which serves as the input of the classifier algorithm.

Further, in step (5), the classifier algorithm takes the feature vector from the feature extraction algorithm as input and performs the classification operation through a neural network; the output of the neural network is a probability distribution vector, the category at the position of the maximum probability value is the category of the character, and the maximum probability value is the confidence of the current recognition.

Further, the training server trains the character detection model using the character detection algorithm and trains the classification model using the classifier algorithm.

The invention has the following positive effects:

(1) The method combines template matching with deep learning, splits the end-to-end character recognition network apart, and independently trains two models: character detection and recognition classification.

(2) The method introduces a position detection function, which improves the accuracy of program execution.

(3) In combination with the template matching algorithm, feature vectors are extracted from the image to be recognized and compared with the feature vectors of the templates; the distance between the image and each template feature is computed, and the minimum distance determines the category to which the image belongs, enhancing the accuracy of character recognition in the image.

Drawings

FIG. 1 is a flow chart of the steps of the method of the present invention;

FIG. 2 is a schematic view of an image after preprocessing in Embodiment 1;

FIG. 3 is a diagram illustrating the detection of character positions in Embodiment 1;

FIG. 4 is a schematic diagram of the pixel regions where the characters are located in the full image in Embodiment 1.

Detailed Description

The present invention will be further described with reference to the following examples.

Embodiment 1

The self-learning character recognition method of this embodiment, as shown in FIG. 1, includes the following steps:

(1) image pre-processing

Acquiring image data through a camera and preprocessing the image data, as shown in FIG. 2; the preprocessing comprises resizing the image, cropping the image by percentage, adjusting the image brightness, and adding a marker frame to the image or applying histogram equalization.
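As a minimal sketch of such a preprocessing stage (assuming OpenCV; the crop percentage, target size, and brightness gain below are illustrative placeholders, not values specified by the method):

```python
import cv2

def preprocess(image, target_size=(640, 480), crop_pct=0.1,
               brightness_gain=1.2, equalize=True):
    """Resize, percentage-crop, brightness-adjust, and optionally equalize an image."""
    h, w = image.shape[:2]
    # Crop a fixed percentage away from each border.
    dy, dx = int(h * crop_pct), int(w * crop_pct)
    image = image[dy:h - dy, dx:w - dx]
    # Resize to the size expected by the detector.
    image = cv2.resize(image, target_size, interpolation=cv2.INTER_LINEAR)
    # Simple brightness adjustment.
    image = cv2.convertScaleAbs(image, alpha=brightness_gain, beta=0)
    if equalize:
        # Histogram equalization on the luminance channel only.
        ycrcb = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)
        ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
        image = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
    return image
```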

(2) Training server

The training server comprises a data set, a model training module, labels, a position detection module, a character detection model, and a classification model; the image data acquired by the camera in step (1) are simultaneously uploaded to the data set; when the number of samples in the data set exceeds a preset threshold, model training is started, and the character detection model and the classification model are updated once training finishes; during use, position detection assists the character detection model and the classification model.
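A minimal sketch of the training trigger on the server side (the threshold value and the `trainer` API are hypothetical; the method only specifies "a preset threshold"):

```python
SAMPLE_THRESHOLD = 500  # assumed value; the method only states "preset threshold"

def on_sample_uploaded(dataset, sample, trainer):
    """Accumulate uploaded (image, character_label) samples; retrain past a threshold."""
    dataset.append(sample)
    if len(dataset) > SAMPLE_THRESHOLD:
        # Retrain both models and push the updated model files to the devices.
        detector, classifier = trainer.train(dataset)  # hypothetical trainer API
        trainer.publish(detector, classifier)
```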

(3) character detection

Inputting the image preprocessed in step (1) into a character detection algorithm while loading the data of the character detection model, retrieving the positions of the characters from the full image (see FIG. 3), and then cropping the pixel regions where the characters are located from the full image (see FIG. 4); at the same time, backing up the cropped character pixel regions in an update template library;

(4) feature extraction

Feeding each of the character pixel regions obtained in step (3) into a feature extraction algorithm, and outputting a feature vector corresponding to each character;

(5) classifier processing

Inputting the feature vector of each character obtained in step (4) into a classifier algorithm while loading the data of the classification model for classification, and outputting the class information and confidence corresponding to each character;

(6) confidence determination

Checking the confidence of each character detected in step (5) one by one;

when the confidence is greater than a preset threshold, the character is a correctly recognized character and its class is the detected class, thereby giving the recognition result;

when the confidence is less than the preset threshold, the character is a misrecognized character, and the template matching algorithm is started for recognition; the recognition result is then transmitted to the data set in the training server, the transmitted data comprising the images and character labels. The template matching recognition proceeds as follows: the pixel regions whose confidence is below the preset threshold are recognized by the template matching algorithm, which outputs a classification result and an error value; when the error value is less than a preset threshold, the character is assigned the corresponding category in the template library. During template matching recognition, when the error value is greater than the preset threshold, the character must be manually labeled as a new category, and the character pixel region backed up in step (3) is assigned to the new category and added to the update template library. This decision flow is sketched below.
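A schematic of the step (6) decision flow under assumed names (`CONF_THRESHOLD` and `ERROR_THRESHOLD` stand in for the unspecified preset thresholds; `match_templates` is sketched under the template matching algorithm further below, and `request_manual_label` is a hypothetical labeling hook):

```python
CONF_THRESHOLD = 0.9   # assumed value
ERROR_THRESHOLD = 1e3  # assumed value

def decide(char_region, cls, confidence, templates):
    """Accept the classifier result, fall back to template matching,
    or flag the character for manual labeling as a new category."""
    if confidence > CONF_THRESHOLD:
        return cls                                # correctly recognized character
    # Classifier is unsure: fall back to template matching.
    best_cls, error = match_templates(char_region, templates)
    if error < ERROR_THRESHOLD:
        return best_cls                           # matches an existing template category
    # No template fits either: request a manual label and grow the template library.
    new_cls = request_manual_label(char_region)   # hypothetical UI hook
    templates[new_cls] = char_region              # backed-up region becomes a new template
    return new_cls
```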

(7) Model training, result recognition

Transmitting the recognition result of step (6) to the data set for model training in the training server, the transmitted data comprising the images and character labels.

Several algorithms involved in this embodiment are detailed below:

(I) The calculation process of the template matching algorithm is as follows:

Each template in the template library is an H × W matrix; the detected character image region is resized to H × W using bilinear interpolation, and the category and error are then calculated by the following formula:

E_k = \sum_{i=1}^{H} \sum_{j=1}^{W} \left( T_k(i,j) - P(i,j) \right)^2, \qquad k^{*} = \arg\min_{k \in \{1,\dots,N\}} E_k

where k runs from 1 to N over the N templates in the template library; after comparing the character to be recognized with all templates one by one, the category corresponding to the template with the minimum error value E is taken as the recognition result; T_k is the k-th template matrix, P is the matrix cropped from the original image, and i and j are the indexes along the two dimensions of the matrices.
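A direct NumPy transcription of this computation (assuming the squared-difference error form written above; `cv2.resize` with bilinear interpolation performs the conversion to H × W):

```python
import cv2
import numpy as np

def match_templates(region, templates):
    """Return (best_category, min_error) over all templates T_k, k = 1..N."""
    best_cls, best_err = None, float("inf")
    for cls, T in templates.items():
        H, W = T.shape
        # Bilinear resize of the detected character region to H x W.
        P = cv2.resize(region, (W, H), interpolation=cv2.INTER_LINEAR)
        # E_k = sum over i, j of (T_k(i, j) - P(i, j))^2
        err = float(np.sum((T.astype(np.float32) - P.astype(np.float32)) ** 2))
        if err < best_err:
            best_cls, best_err = cls, err
    return best_cls, best_err
```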

(II) The character detection algorithm trains a character position detection model using the end-to-end YOLOv3 deep learning algorithm: varied character data are collected, annotated, and augmented; the model is trained with a mini-batch gradient descent (Mini-Batch GD) algorithm based on mean square error; after training, the model is packaged into a binary model file, which is loaded during inference to perform character detection.
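At inference time the packaged model only needs to yield character bounding boxes; a sketch assuming a generic `detector.detect` wrapper (the method does not specify the model file format or its API):

```python
def detect_and_crop(image, detector, template_backup):
    """Run the character position detector and crop each detected character region."""
    boxes = detector.detect(image)        # hypothetical wrapper around the YOLOv3 model
    regions = []
    for (x, y, w, h) in boxes:
        region = image[y:y + h, x:x + w]  # pixel area where the character is located
        regions.append(region)
        template_backup.append(region)    # back up the crop in the update template library
    return regions
```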

(III) The feature extraction algorithm uses HOG features to describe the character image: HOG maps the character image into a feature vector, which serves as the input of the classifier algorithm.
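For example, with scikit-image's HOG implementation (the cell sizes and the 32 × 64 normalization size are illustrative choices, not values from the method):

```python
import cv2
from skimage.feature import hog

def extract_features(char_region):
    """Map a cropped character image to a fixed-length HOG feature vector."""
    gray = cv2.cvtColor(char_region, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (32, 64))  # fixed size so all vectors share one length
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")
```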

(IV) The classifier algorithm takes the feature vector from the feature extraction algorithm as input and performs the classification operation through a neural network; the output of the neural network is a probability distribution vector, the category at the position of the maximum probability value is the category of the character, and the maximum probability value is the confidence of the recognition.
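A minimal sketch of how class and confidence are read off a probability distribution vector (a single softmax layer in plain NumPy; the method does not specify the network architecture):

```python
import numpy as np

def classify(feature_vec, W, b, classes):
    """Single-layer softmax classifier: return (class, confidence)."""
    logits = W @ feature_vec + b
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                     # probability distribution vector
    idx = int(np.argmax(probs))
    return classes[idx], float(probs[idx])   # argmax class and its probability
```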

In the training server of this embodiment, the character detection algorithm is used to train the character detection model, and the classifier algorithm is used to train the classification model.

Template matching can respond to newly added character recognition requirements in real time, but with lower accuracy. The end-to-end deep learning recognition scheme suffers from the feature conflict problem; its accuracy is higher than that of template matching, but it needs a large accumulation of data for training. The method combines the two and splits the end-to-end deep learning recognition process into two parts: the first is character position detection, and the second is character classification recognition. When new characters appear, the template matching recognition process is started and the newly recognized results are stored in the data set; once the newly added data reach a certain quantity, an automatic training process is started on a remote server, recognition capability for the new characters is added to the detection model, and the updated model is then issued remotely. The overall processing flow of the method differs completely from the traditional flow: position detection is introduced, and the template library concept and template matching recognition are proposed and implemented; within the processing pipeline, a template matching algorithm is introduced and, for the first time, combined with the end-to-end YOLOv3 algorithm.

The foregoing is a detailed description of the invention in connection with specific preferred embodiments, and the concrete implementation of the invention is not limited to this description. Those skilled in the art to which the invention pertains may make several simple deductions or substitutions without departing from the spirit of the invention, and all such variants shall fall within the protection scope of the invention.
