Electronic invoice information extraction method and electronic equipment

Document No.: 1490763 Publication date: 2020-02-04

Abstract: The present technology, "Electronic invoice information extraction method and electronic equipment", was designed by 王志鹏, 朱西华 and 张胜娜 on 2019-09-26. The embodiment of the invention discloses an electronic invoice information extraction method and electronic equipment, which realize text recognition of designated areas of a value-added tax invoice and thereby improve the recognition efficiency of bill information. The method of the embodiment comprises: selecting a value-added tax electronic invoice of a target type, selecting a standard picture as a reference picture, and making a custom identification template; importing a value-added tax electronic invoice file; preprocessing and geometrically correcting the target value-added tax electronic invoice file, and scaling and cropping it to a uniform standard size to obtain a processed value-added tax electronic invoice file; aligning the processed value-added tax electronic invoice file with the reference picture; cropping, as regions of interest (ROI), the regions of the aligned picture that correspond to the regions to be identified in the template picture, to obtain the ROI intercepted region; recognizing the ROI intercepted region to obtain a recognition result; and proofreading and structuring the result, then outputting and storing it in a specified format.

1. A method for extracting electronic invoice information is characterized by comprising the following steps:

selecting a value-added tax electronic invoice of a target type, selecting a standard picture of the value-added tax electronic invoice of the target type as a reference picture, and making a custom identification template;

importing a value-added tax electronic invoice file, wherein the value-added tax electronic invoice file comprises a value-added tax electronic invoice file in a PDF format and a first value-added tax electronic invoice file in a picture format, and each value-added tax electronic invoice file is named as a uniform format;

performing structural analysis on the PDF value-added tax electronic invoice file, and converting the PDF value-added tax electronic invoice file into a second value-added tax electronic invoice file in a picture format;

preprocessing and geometrically correcting a target value-added tax electronic invoice file, and processing the target value-added tax electronic invoice file into a uniform standard size through scaling and cutting to obtain a processed value-added tax electronic invoice file, wherein the target value-added tax electronic invoice file comprises a first value-added tax electronic invoice file in a picture format and a second value-added tax electronic invoice file in the picture format;

aligning the processed value-added tax electronic invoice file with the reference picture;

cropping, as regions of interest (ROI), the regions of the aligned picture that correspond to the regions to be identified in the template picture, to obtain the ROI intercepted region;

performing character detection and character recognition operation on the ROI intercepted area by utilizing an end-to-end character recognition algorithm based on deep learning to obtain a recognition result;

and checking the identification result, performing structuring processing, and outputting and storing in a specified format.

2. The method according to claim 1, wherein the imported value-added tax electronic invoice file supports single and multiple file uploads; and after the identification is finished, respectively storing the identification results of the imported value-added tax electronic invoice files according to the file attributes.

3. The method as claimed in claim 1, wherein the selecting the value-added tax electronic invoice of the target type and the selecting the standard picture of the value-added tax electronic invoice of the target type as the reference picture, and the making of the custom identification template comprises:

selecting a standard value-added tax electronic invoice picture, wherein the standard value-added tax electronic invoice picture is complete, clear, correct and pollution-free and is used as a reference picture for manufacturing a custom identification template;

selecting 4 corners on an external frame of a table on a picture of the standard value-added tax electronic invoice as reference points for picture alignment transformation, and storing coordinate points of the 4 corners;

and selecting an area to be identified in the electronic invoice according to the requirement, storing the coordinate position of the area to be identified and the content represented by the area as the label information of the structured result, and obtaining the custom identification template.

4. The method according to any one of claims 1-3, wherein the pre-processing and geometry correcting the target value-added tax electronic invoice file comprises:

carrying out gray processing on the target value-added tax electronic invoice file, and converting the target value-added tax electronic invoice file into a single-channel gray image;

performing smooth noise reduction processing on the single-channel gray level image by adopting Gaussian filtering processing to obtain a smooth noise reduction image;

extracting 4 outer frame straight lines in the smooth noise reduction image through Hough transformation, and further calculating to obtain coordinate positions of 4 corner points of the outer frame; comparing the coordinate positions of the 4 angular points with the 4 angular points in the custom identification template, and performing geometric correction through multi-level perspective transformation to obtain a corrected picture;

and comparing the reference picture, cutting off redundant boundary regions and ROI regions in the corrected picture, keeping the size of the corrected picture consistent with that of the self-defined identification template, and obtaining a picture of the region to be identified.

5. A method according to any of claims 1-3, characterized in that said deep learning based end-to-end character recognition OCR technique relies on two deep learning models, respectively: a text detection model and a text recognition model.

6. The method of claim 4,

the character detection model is as follows: collecting pictures with various scenes and containing different characters, manually calibrating character areas, and dividing the character areas into a training set and a testing set according to a ratio of 9: 1; performing model training through a CTPN algorithm in deep learning; the detection model can detect the character area in the picture of the area to be identified, detect the character area in a line mode, and position and visually display the character area in a rectangular frame.

7. The method of claim 4,

the character recognition model is as follows: collecting a Chinese language and English language database, generating a character recognition sample set containing fixed word number length, and performing model training through a CRNN algorithm in deep learning; the recognition model can recognize the character information in the picture of the area to be recognized, does not need to perform text line segmentation and character segmentation, and outputs the character information in a character string format.

8. An electronic device, comprising:

the acquisition module is used for selecting the value-added tax electronic invoice of the target type, selecting a standard picture of the value-added tax electronic invoice of the target type as a reference picture, and making a self-defined identification template; importing a value-added tax electronic invoice file, wherein the value-added tax electronic invoice file comprises a value-added tax electronic invoice file in a PDF format and a first value-added tax electronic invoice file in a picture format, and each value-added tax electronic invoice file is named as a uniform format;

the processing module is used for carrying out structural analysis on the PDF value-added tax electronic invoice files and converting the PDF value-added tax electronic invoice files into second value-added tax electronic invoice files in a picture format; preprocessing and geometrically correcting a target value-added tax electronic invoice file, and processing the target value-added tax electronic invoice file into a uniform standard size through scaling and cutting to obtain a processed value-added tax electronic invoice file, wherein the target value-added tax electronic invoice file comprises a first value-added tax electronic invoice file in a picture format and a second value-added tax electronic invoice file in the picture format; aligning the processed value-added tax electronic invoice file with the reference picture; cutting the region to be identified in the aligned picture corresponding to the template picture into the region of interest ROI to obtain the ROI intercepted region; performing character detection and character recognition operation on the ROI intercepted area by utilizing an end-to-end character recognition algorithm based on deep learning to obtain a recognition result; and checking the identification result, performing structuring processing, and outputting and storing in a specified format.

9. An electronic device, comprising:

a transceiver, a processor, and a memory, wherein the transceiver, the processor, and the memory are connected by a bus;

the memory is used for storing operation instructions;

the transceiver is used for selecting the value-added tax electronic invoice of the target type, selecting the standard picture of the value-added tax electronic invoice of the target type as a reference picture, and making a self-defined identification template; importing a value-added tax electronic invoice file, wherein the value-added tax electronic invoice file comprises a value-added tax electronic invoice file in a PDF format and a first value-added tax electronic invoice file in a picture format, and each value-added tax electronic invoice file is named as a uniform format;

the processor is used for calling the operation instruction to execute the steps of the electronic invoice information extraction method according to any one of claims 1-7.

10. A readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of extraction of electronic invoice information, according to any one of claims 1 to 7.

Technical Field

The invention relates to the technical field of image recognition, in particular to an electronic invoice information extraction method and electronic equipment.

Background

In daily work, a company's financial staff must process many invoices every day, including electronic invoices as well as paper special value-added tax invoices, general invoices, transportation invoices and the like. Entering invoices manually is time-consuming and error-prone, and keeping both the paper originals and scanned electronic copies adds considerably to the workload of financial staff. Automated bill processing therefore remains an urgent problem.

The conventional Optical Character Recognition (OCR) processing flow comprises the following stages: image input, preprocessing, layout analysis, row and column segmentation, character recognition, and post-processing correction. Each stage strongly affects the recognition result: if one stage is handled improperly, the output of the next stage is directly affected, which in turn degrades the final recognition result.

At present, bill recognition mainly relies on traditional OCR to convert an unstructured bill image into structured data and thereby extract the bill information. The approach is mainly as follows: locate the information line of each item on the bill, locate the information boxes to be recognized among all information boxes, and recognize all the text in those boxes using character segmentation and OCR. This approach suits bills with a tidy layout and little content. When the layout is complex and the content is dense, preprocessing and recognition take longer, and positioning errors occur easily and cause recognition to fail.

In addition, although electronic invoices have become widespread in recent years, they are generally downloaded in Portable Document Format (PDF). Enterprise financial reimbursement, however, requires only paper copies and electronic copies in picture format, and most bill recognition technologies operate on pictures. For electronic invoices, financial staff therefore still have to capture the invoice face manually and convert it into a picture, which is tedious and prone to omissions.

Disclosure of Invention

The embodiment of the invention provides an electronic invoice information extraction method and electronic equipment, which use a custom identification template, a technique for converting PDF-format electronic invoices into picture format, and a deep-learning-based end-to-end OCR technique to realize text recognition of designated areas of a value-added tax invoice, thereby improving the recognition efficiency of bill information and the office efficiency of financial staff.

In view of this, a first aspect of the present invention provides an electronic invoice information extraction method, which may include:

selecting a value-added tax electronic invoice of a target type, selecting a standard picture of the value-added tax electronic invoice of the target type as a reference picture, and making a custom identification template;

importing a value-added tax electronic invoice file, wherein the value-added tax electronic invoice file comprises a value-added tax electronic invoice file in a PDF format and a first value-added tax electronic invoice file in a picture format, and each value-added tax electronic invoice file is named as a uniform format;

performing structural analysis on the PDF value-added tax electronic invoice file, and converting the PDF value-added tax electronic invoice file into a second value-added tax electronic invoice file in a picture format;

preprocessing and geometrically correcting a target value-added tax electronic invoice file, and processing the target value-added tax electronic invoice file into a uniform standard size through scaling and cutting to obtain a processed value-added tax electronic invoice file, wherein the target value-added tax electronic invoice file comprises a first value-added tax electronic invoice file in a picture format and a second value-added tax electronic invoice file in the picture format;

aligning the processed value-added tax electronic invoice file with the reference picture;

cropping, as regions of interest (ROI), the regions of the aligned picture that correspond to the regions to be identified in the template picture, to obtain the ROI intercepted region;

performing character detection and character recognition operation on the ROI intercepted area by utilizing an end-to-end character recognition algorithm based on deep learning to obtain a recognition result;

and checking the identification result, performing structuring processing, and outputting and storing in a specified format.
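The ROI cropping step listed above can be sketched as follows. This is a minimal illustration only: it assumes the aligned picture is held as a list of pixel rows and that each template region is stored as a hypothetical `(label, x, y, width, height)` tuple, neither of which is specified in the source.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]                   # grayscale picture as rows of pixel values (assumed)
Region = Tuple[str, int, int, int, int]   # (label, x, y, width, height): hypothetical layout


def crop_rois(aligned: Image, regions: List[Region]) -> Dict[str, Image]:
    """Cut each template region out of a picture already aligned to the template."""
    height, width = len(aligned), len(aligned[0])
    crops: Dict[str, Image] = {}
    for label, x, y, w, h in regions:
        # Reject regions that do not fit, which would indicate a failed alignment.
        if x < 0 or y < 0 or x + w > width or y + h > height:
            raise ValueError(f"region {label!r} falls outside the aligned picture")
        crops[label] = [row[x:x + w] for row in aligned[y:y + h]]
    return crops


# Toy 4x6 picture whose pixel value encodes its position (10 * row + column).
aligned = [[10 * r + c for c in range(6)] for r in range(4)]
regions = [("invoice_number", 1, 0, 3, 2), ("total_amount", 4, 2, 2, 2)]
rois = crop_rois(aligned, regions)
```

Because the crops are keyed by the template label, the later structuring step can attach each recognized string to the field it belongs to.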

Optionally, in some embodiments of the present invention, importing the value-added tax electronic invoice file supports uploading a single file or multiple files; after recognition is finished, the recognition results of the imported value-added tax electronic invoice files are stored separately according to the file attributes.

Optionally, in some embodiments of the present invention, the selecting the target type of the value-added tax electronic invoice and the selecting the standard picture of the target type of the value-added tax electronic invoice as a reference picture, and making a custom identification template includes:

selecting a standard value-added tax electronic invoice picture, wherein the standard value-added tax electronic invoice picture is complete, clear, correct and pollution-free and is used as a reference picture for manufacturing a custom identification template;

selecting 4 corners on an external frame of a table on a picture of the standard value-added tax electronic invoice as reference points for picture alignment transformation, and storing coordinate points of the 4 corners;

and selecting an area to be identified in the electronic invoice according to the requirement, storing the coordinate position of the area to be identified and the content represented by the area as the label information of the structured result, and obtaining the custom identification template.

Optionally, in some embodiments of the present invention, the preprocessing and the geometric correction processing on the target value-added tax electronic invoice file include:

carrying out gray processing on the target value-added tax electronic invoice file, and converting the target value-added tax electronic invoice file into a single-channel gray image;

performing smooth noise reduction processing on the single-channel gray level image by adopting Gaussian filtering processing to obtain a smooth noise reduction image;

extracting 4 outer frame straight lines in the smooth noise reduction image through Hough transformation, and further calculating to obtain coordinate positions of 4 corner points of the outer frame; comparing the coordinate positions of the 4 angular points with the 4 angular points in the custom identification template, and performing geometric correction through multi-level perspective transformation to obtain a corrected picture;

and comparing the reference picture, cutting off redundant boundary regions and ROI regions in the corrected picture, keeping the size of the corrected picture consistent with that of the self-defined identification template, and obtaining a picture of the region to be identified.
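The geometric correction described above maps the 4 detected corner points onto the 4 corner points stored in the template. The sketch below computes a single 3x3 perspective matrix from those 4 point pairs in plain Python. It is an illustration under assumptions: the source speaks of a "multi-level" perspective transformation without detailing it, and the corner coordinates used here are invented. In practice a library routine such as OpenCV's `cv2.getPerspectiveTransform` would typically do this work.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corner points are degenerate (collinear or repeated)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def perspective_matrix(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix that maps each src corner onto the matching dst corner."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Apply the perspective matrix to one point."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Invented coordinates: corners found in a skewed scan, and the template's corners.
detected = [(12.0, 8.0), (205.0, 15.0), (198.0, 130.0), (5.0, 120.0)]
template = [(0.0, 0.0), (200.0, 0.0), (200.0, 120.0), (0.0, 120.0)]
H = perspective_matrix(detected, template)
```

Applying `warp_point(H, p)` to every pixel position (or, more commonly, the inverse mapping with interpolation) yields the corrected picture.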

Optionally, in some embodiments of the present invention, the deep-learning-based end-to-end OCR technique relies on two deep learning models, namely a text detection model, CTPN, and a text recognition model, CRNN.

Optionally, in some embodiments of the present invention, the text detection model is: collecting pictures with various scenes and containing different characters, manually calibrating character areas, and dividing the character areas into a training set and a testing set according to a ratio of 9: 1; performing model training through a CTPN algorithm in deep learning; the detection model can detect the character area in the picture of the area to be identified, detect the character area in a line mode, and position and visually display the character area in a rectangular frame.
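The 9:1 division of the annotated pictures into a training set and a testing set can be sketched as below. The file names, the fixed random seed and the function name are assumptions made for illustration; the source does not say how the split is drawn.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_test(samples: Sequence[T], train_ratio: float = 0.9,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the samples reproducibly, then cut them at train_ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for manually annotated pictures.
annotated = [f"scene_{i:04d}.jpg" for i in range(1000)]
train_set, test_set = split_train_test(annotated)
```

Shuffling before the cut keeps pictures from the same scene from clustering in one set, and the seed makes the split repeatable between training runs.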

Optionally, in some embodiments of the present invention, the text recognition model is: collecting a Chinese language and English language database, generating a character recognition sample set containing fixed word number length, and performing model training through a CRNN algorithm in deep learning; the recognition model can recognize the character information in the picture of the area to be recognized, does not need to perform text line segmentation and character segmentation, and outputs the character information in a character string format.
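CRNN models are commonly trained with a CTC (connectionist temporal classification) output layer, which is what lets them emit a character string without segmenting individual characters. The source does not describe the decoding step, so the following greedy CTC decoder is only a sketch of one common choice; the alphabet, the blank index and the score values are assumptions.

```python
from typing import List, Sequence

BLANK = 0  # class index reserved for the CTC "blank" symbol (assumed convention)


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Pick the best class per time step, merge repeats, then drop blanks."""
    best_path = [max(range(len(step)), key=step.__getitem__) for step in scores]
    chars: List[str] = []
    previous = BLANK
    for index in best_path:
        if index != previous and index != BLANK:
            chars.append(alphabet[index - 1])  # class k maps to alphabet[k - 1]
        previous = index
    return "".join(chars)


# Invented per-time-step scores: one-hot rows following a chosen best path.
alphabet = "0123456789."
path = [2, 2, 0, 2, 0, 11, 6, 6]
scores = [[1.0 if i == k else 0.0 for i in range(len(alphabet) + 1)] for k in path]
text = ctc_greedy_decode(scores, alphabet)
```

The blank between the two runs of class 2 is what allows the repeated character "1" to survive the merge, giving "11.5" for this path.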

A second aspect of the present invention provides an electronic device, which may include:

the acquisition module is used for selecting the value-added tax electronic invoice of the target type, selecting a standard picture of the value-added tax electronic invoice of the target type as a reference picture, and making a self-defined identification template; importing a value-added tax electronic invoice file, wherein the value-added tax electronic invoice file comprises a value-added tax electronic invoice file in a PDF format and a first value-added tax electronic invoice file in a picture format, and each value-added tax electronic invoice file is named as a uniform format;

the processing module is used for carrying out structural analysis on the PDF value-added tax electronic invoice files and converting the PDF value-added tax electronic invoice files into second value-added tax electronic invoice files in a picture format; preprocessing and geometrically correcting a target value-added tax electronic invoice file, and processing the target value-added tax electronic invoice file into a uniform standard size through scaling and cutting to obtain a processed value-added tax electronic invoice file, wherein the target value-added tax electronic invoice file comprises a first value-added tax electronic invoice file in a picture format and a second value-added tax electronic invoice file in the picture format;

aligning the processed value-added tax electronic invoice file with the reference picture; cropping, as regions of interest (ROI), the regions of the aligned picture that correspond to the regions to be identified in the template picture, to obtain the ROI intercepted region; performing character detection and character recognition on the ROI intercepted region by using a deep-learning-based end-to-end character recognition algorithm to obtain a recognition result; and checking the recognition result, performing structuring processing, and outputting and storing the result in a specified format.

Optionally, in some embodiments of the present invention, importing the value-added tax electronic invoice file supports uploading a single file or multiple files; after recognition is finished, the recognition results of the imported value-added tax electronic invoice files are stored separately according to the file attributes.

Optionally, in some embodiments of the present invention,

the acquisition module is specifically used for selecting a standard value-added tax electronic invoice picture which is complete, clear, correct and pollution-free and is used as a reference picture for manufacturing a custom identification template; selecting 4 corners on an external frame of a table on a picture of the standard value-added tax electronic invoice as reference points for picture alignment transformation, and storing coordinate points of the 4 corners; and selecting an area to be identified in the electronic invoice according to the requirement, storing the coordinate position of the area to be identified and the content represented by the area as the label information of the structured result, and obtaining the custom identification template.

Optionally, in some embodiments of the present invention,

the processing module is specifically used for carrying out gray processing on the target value-added tax electronic invoice file and converting the target value-added tax electronic invoice file into a single-channel gray image; performing smooth noise reduction processing on the single-channel gray level image by adopting Gaussian filtering processing to obtain a smooth noise reduction image; extracting 4 outer frame straight lines in the smooth noise reduction image through Hough transformation, and further calculating to obtain coordinate positions of 4 corner points of the outer frame; comparing the coordinate positions of the 4 angular points with the 4 angular points in the custom identification template, and performing geometric correction through multi-level perspective transformation to obtain a corrected picture; and comparing the reference picture, cutting off redundant boundary regions and ROI regions in the corrected picture, keeping the size of the corrected picture consistent with that of the self-defined identification template, and obtaining a picture of the region to be identified.
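Once the Hough transform has produced the 4 straight lines of the outer frame, the 4 corner points follow from intersecting adjacent lines. The sketch below assumes each line is given in the usual normal form `(rho, theta)`, with `x*cos(theta) + y*sin(theta) = rho`, which is the form returned by, for example, OpenCV's `cv2.HoughLines`. The line values and the function names are invented for illustration.

```python
import math
from typing import List, Tuple

Line = Tuple[float, float]    # (rho, theta): x*cos(theta) + y*sin(theta) = rho
Point = Tuple[float, float]


def intersect(l1: Line, l2: Line) -> Point:
    """Intersection of two lines in normal form, solved with Cramer's rule."""
    (r1, t1), (r2, t2) = l1, l2
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < 1e-9:
        raise ValueError("lines are parallel and have no single intersection")
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return x, y


def frame_corners(top: Line, bottom: Line, left: Line, right: Line) -> List[Point]:
    """Corners in the order top-left, top-right, bottom-right, bottom-left."""
    return [intersect(top, left), intersect(top, right),
            intersect(bottom, right), intersect(bottom, left)]


# Invented, perfectly axis-aligned frame: x = 10, x = 210, y = 20, y = 140.
corners = frame_corners(top=(20.0, math.pi / 2), bottom=(140.0, math.pi / 2),
                        left=(10.0, 0.0), right=(210.0, 0.0))
```

Returning the corners in a fixed order matters, because the perspective correction pairs each detected corner with the template corner at the same position in the list.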

Optionally, in some embodiments of the present invention, the deep-learning-based end-to-end OCR technique relies on two deep learning models, namely a text detection model, CTPN, and a text recognition model, CRNN.

Optionally, in some embodiments of the present invention,

the character detection model is as follows: collecting pictures with various scenes and containing different characters, manually calibrating character areas, and dividing the character areas into a training set and a testing set according to a ratio of 9: 1; performing model training through a CTPN algorithm in deep learning; the detection model can detect the character area in the picture of the area to be identified, detect the character area in a line mode, and position and visually display the character area in a rectangular frame.

Optionally, in some embodiments of the present invention,

the character recognition model is as follows: collecting a Chinese language and English language database, generating a character recognition sample set containing fixed word number length, and performing model training through a CRNN algorithm in deep learning; the recognition model can recognize the character information in the picture of the area to be recognized, does not need to perform text line segmentation and character segmentation, and outputs the character information in a character string format.

A third aspect of the present invention provides an electronic device, which may include:

a transceiver, a processor, and a memory, wherein the transceiver, the processor, and the memory are connected by a bus;

the memory is used for storing operation instructions;

the transceiver is used for selecting the value-added tax electronic invoice of the target type, selecting the standard picture of the value-added tax electronic invoice of the target type as a reference picture, and making a self-defined identification template; importing a value-added tax electronic invoice file, wherein the value-added tax electronic invoice file comprises a value-added tax electronic invoice file in a PDF format and a first value-added tax electronic invoice file in a picture format, and each value-added tax electronic invoice file is named as a uniform format;

the processor is configured to invoke the operation instructions and execute the steps of the electronic invoice information extraction method described in the first aspect or in any optional implementation manner of the first aspect of the embodiment of the present invention.

A fourth aspect of the present invention provides a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for extracting electronic invoice information as described in the first aspect and any optional implementation manner of the first aspect in the embodiment of the present invention.

According to the technical scheme, the embodiment of the invention has the following advantages:

in the embodiment of the invention, a value-added tax electronic invoice of a target type is selected, a standard picture of the value-added tax electronic invoice of the target type is selected as a reference picture, and a user-defined identification template is manufactured; importing a value-added tax electronic invoice file, wherein the value-added tax electronic invoice file comprises a value-added tax electronic invoice file in a PDF format and a first value-added tax electronic invoice file in a picture format, and each value-added tax electronic invoice file is named as a uniform format; performing structural analysis on the PDF value-added tax electronic invoice file, and converting the PDF value-added tax electronic invoice file into a second value-added tax electronic invoice file in a picture format; preprocessing and geometrically correcting a target value-added tax electronic invoice file, and processing the target value-added tax electronic invoice file into a uniform standard size through scaling and cutting to obtain a processed value-added tax electronic invoice file, wherein the target value-added tax electronic invoice file comprises a first value-added tax electronic invoice file in a picture format and a second value-added tax electronic invoice file in the picture format; aligning the processed value-added tax electronic invoice file with the reference picture; cutting the region to be identified in the aligned picture corresponding to the template picture into the region of interest ROI to obtain the ROI intercepted region; performing character detection and character recognition operation on the ROI intercepted area by utilizing an end-to-end character recognition algorithm based on deep learning to obtain a recognition result; and checking the identification result, performing structuring processing, and outputting and storing in a specified format. 
Through the technology of converting the PDF format electronic invoice into the image format and the end-to-end OCR technology based on deep learning, the text recognition of the designated area of the value-added tax invoice is realized, the recognition efficiency of bill information is further improved, and the office efficiency of financial staff is improved.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments and of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and other drawings can be obtained from them.

FIG. 1 is a schematic diagram of an embodiment of an extraction method of electronic invoice information in an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a method for extracting electronic invoice information according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a value-added tax electronic invoice customization template in an embodiment of the invention;

FIG. 4 is a diagram of an embodiment of an electronic device in an embodiment of the invention;

FIG. 5 is a schematic diagram of another embodiment of the electronic device in the embodiment of the present invention.

Detailed Description

The embodiment of the invention provides an electronic invoice information extraction method and electronic equipment, which use a custom identification template, a technique for converting PDF-format electronic invoices into picture format, and a deep-learning-based end-to-end OCR technique to realize text recognition of designated areas of a value-added tax invoice, thereby improving the recognition efficiency of bill information and the office efficiency of financial staff.

In order that those skilled in the art may better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention are described below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Embodiments based on the present invention shall fall within the protection scope of the present invention.

FIG. 1 is a schematic diagram of an embodiment of the method for extracting electronic invoice information in an embodiment of the present invention. As shown in FIG. 1, the method may include:

101. Select a value-added tax electronic invoice of a target type, select a standard picture of that invoice type as a reference picture, and make a custom identification template.

Selecting the value-added tax electronic invoice of the target type, selecting the standard picture of that invoice type as the reference picture, and making the custom identification template may include: selecting a standard value-added tax electronic invoice picture that is complete, clear, correct and free of stains, to serve as the reference picture for making the custom identification template; selecting the 4 corner points of the outer frame of the table on the standard picture as reference points for the picture alignment transformation, and storing the coordinates of the 4 corner points; and selecting the areas to be identified in the electronic invoice as required, storing the coordinate position of each area together with the content it represents as the label information of the structured result, thereby obtaining the custom identification template.

Illustratively, a custom template is made as follows. A complete, clear, correct and stain-free standard picture of the value-added tax invoice is selected as the reference picture for the template. The alignment reference points of the reference picture, namely the 4 corner points of the outer frame of the table, are calibrated manually and their coordinates are stored. The text areas to be identified are then selected manually, the attribute content represented by each area (the label information) is marked, and the coordinate position of each rectangular area is stored. Finally, the custom identification template is named and saved.
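By way of illustration only, the information that the template is said to store (a name, 4 corner points, and labeled rectangular areas) could be held in a small serializable structure. The following is a minimal sketch; the class names `Region` and `Template`, the JSON layout, the `size` field, and the example coordinates are assumptions made for this example and are not taken from the embodiment.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Tuple


@dataclass
class Region:
    label: str                      # attribute content the area represents
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in reference-picture pixels


@dataclass
class Template:
    name: str
    size: Tuple[int, int]           # (width, height) of the reference picture
    corners: List[Tuple[int, int]]  # 4 corner points of the table's outer frame
    regions: List[Region] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses; tuples become JSON arrays.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)

    @staticmethod
    def from_json(text: str) -> "Template":
        raw = json.loads(text)
        regions = [Region(r["label"], tuple(r["box"])) for r in raw["regions"]]
        corners = [tuple(c) for c in raw["corners"]]
        return Template(raw["name"], tuple(raw["size"]), corners, regions)


# Hypothetical template with one labeled area.
template = Template(
    name="vat_default",
    size=(1000, 600),
    corners=[(10, 10), (990, 10), (990, 590), (10, 590)],
    regions=[Region("invoice_number", (700, 20, 950, 60))],
)
restored = Template.from_json(template.to_json())
```

Storing the template as text keeps it easy to name, save, and reload, which is all the embodiment requires of it.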

FIG. 2 is a design flowchart of the method for extracting electronic invoice information according to the embodiment of the present invention. FIG. 3 is a schematic diagram of a custom template for a value-added tax electronic invoice in the embodiment of the present invention.

102. Import value-added tax electronic invoice files, wherein the value-added tax electronic invoice files comprise value-added tax electronic invoice files in PDF format and first value-added tax electronic invoice files in picture format, and each value-added tax electronic invoice file is named in a uniform format.

It can be understood that importing the value-added tax electronic invoice files supports uploading a single file or multiple files; after recognition is finished, the recognition results of the imported files are stored separately according to the file attributes.

Illustratively, value-added tax electronic invoice files can be imported singly or in batches, and the supported formats are picture formats (JPG and PNG) and PDF (text version and scanned version). The name of each file should follow a fixed, uniform format, for example "owner_department_project_reimbursement category_index value", so that the system can automatically classify and summarize the files according to their names.
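As an illustration of how such a uniform name could drive automatic classification, the sketch below splits a file name into its five fields. The function name `parse_invoice_name`, the English field names, and the choice to reject names that do not have exactly five underscore-separated fields are assumptions for this example.

```python
from pathlib import Path
from typing import NamedTuple

# Formats named in the text: PDF plus JPG and PNG pictures.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".png"}


class InvoiceName(NamedTuple):
    owner: str
    department: str
    project: str
    category: str   # reimbursement category
    index: str      # index value, kept as text to preserve leading zeros
    extension: str


def parse_invoice_name(filename: str) -> InvoiceName:
    """Split 'owner_department_project_category_index.ext' into its fields."""
    path = Path(filename)
    extension = path.suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {filename}")
    parts = path.stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 underscore-separated fields: {filename}")
    return InvoiceName(*parts, extension)


# Hypothetical file name used only to show the call.
info = parse_invoice_name("ZhangSan_Finance_ProjectA_Travel_003.pdf")
```

Grouping the parsed tuples by `department` or `project` would then give the per-project summaries the text refers to.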

103. Perform structure analysis on the value-added tax electronic invoice files in PDF format and convert them into second value-added tax electronic invoice files in picture format.

Illustratively, structure analysis is performed on each uploaded PDF-format electronic invoice and each page is converted into a separate picture, so that one picture contains the content of exactly one invoice. Each picture keeps the name of the PDF file, with only a sub-index value appended after the index value, which facilitates later statistics and summaries of project amounts.
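The naming rule for the converted pages can be sketched as follows. The embodiment states only that a sub-index is appended after the index value; the hyphen separator, the 1-based page numbering, the PNG output extension, and the function name `page_picture_names` are assumptions for this example. Rendering the PDF pages themselves is outside the scope of the sketch.

```python
from pathlib import Path
from typing import List


def page_picture_names(pdf_name: str, page_count: int, image_ext: str = ".png") -> List[str]:
    """Return one picture name per PDF page, appending a sub-index to the PDF's name."""
    stem = Path(pdf_name).stem
    # A hyphen keeps the sub-index inside the last underscore-separated field,
    # so the name still splits into the same five fields as the source PDF.
    return [f"{stem}-{page}{image_ext}" for page in range(1, page_count + 1)]


names = page_picture_names("ZhangSan_Finance_ProjectA_Travel_003.pdf", 2)
```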

104. Preprocess and geometrically correct a target value-added tax electronic invoice file, scale and crop it to a uniform standard size to obtain a processed value-added tax electronic invoice file, and align the processed value-added tax electronic invoice file with the reference picture; then crop, as regions of interest (ROI), the regions of the aligned picture that correspond to the regions to be identified in the template picture, to obtain ROI cropped regions.

The target value-added tax electronic invoice file comprises a first value-added tax electronic invoice file in the picture format and a second value-added tax electronic invoice file in the picture format.
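The ROI cropping at the end of step 104 can be pictured with a small sketch. Once the picture is aligned to the reference picture, the rectangles stored in the template can be applied to it directly. Here a nested list stands in for a real image array, and the function names, the `(x_min, y_min, x_max, y_max)` box convention, and the clamping to the picture bounds are assumptions for this example.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max values exclusive


def crop_roi(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Crop one rectangular region, clamping the box to the picture bounds."""
    x_min, y_min, x_max, y_max = box
    height = len(image)
    width = len(image[0]) if height else 0
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    return [list(row[x_min:x_max]) for row in image[y_min:y_max]]


def crop_template_regions(
    image: Sequence[Sequence[int]], regions: Dict[str, Box]
) -> Dict[str, List[List[int]]]:
    """Crop every labeled template region from an already aligned picture."""
    return {label: crop_roi(image, box) for label, box in regions.items()}


# Tiny 4 x 5 grayscale "picture" whose pixel value encodes (row, column).
picture = [[row * 10 + col for col in range(5)] for row in range(4)]
crops = crop_template_regions(picture, {"amount": (1, 1, 3, 3)})
```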

The preprocessing and geometric correction processing of the target value-added tax electronic invoice file may include:

converting the target value-added tax electronic invoice file to grayscale to obtain a single-channel grayscale image; applying Gaussian filtering to the single-channel grayscale image for smoothing and noise reduction to obtain a smoothed, denoised image; extracting the 4 straight lines of the outer frame in the smoothed, denoised image by Hough transformation, and calculating from them the coordinate positions of the 4 corner points of the outer frame; comparing the coordinate positions of the 4 corner points with the 4 corner points in the custom identification template, and performing geometric correction through multi-level perspective transformation to obtain a corrected picture; and, with reference to the reference picture, cropping off the redundant border regions of the corrected picture so that its size is consistent with that of the custom identification template.

Exemplarily, the picture to be recognized is preprocessed and geometrically corrected as follows: the invoice picture is converted to grayscale to obtain a single-channel image; Gaussian filtering is applied to the grayscale image for smoothing and noise reduction; the 4 straight lines of the outer frame in the invoice image are extracted by Hough transformation, and the coordinate positions of the 4 corner points of the outer frame are calculated from them; the 4 corner points are compared with the 4 corner points in the template, and geometric correction is performed through multi-level perspective transformation; and, with reference to the reference picture, the redundant border regions and the ROI regions in the corrected picture are cropped so that the size of the corrected picture is consistent with that of the custom identification template, thereby obtaining the picture of the region to be identified.
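The step of calculating the 4 corner points from the 4 outer-frame lines can be illustrated as a pairwise line intersection. The embodiment does not state how the lines are parameterized; the sketch below assumes the normal form commonly used with the Hough transformation, in which a line is given by (rho, theta) and satisfies x·cos(theta) + y·sin(theta) = rho. The function names and the assumption that the caller has already identified which line is the top, bottom, left and right edge are hypothetical.

```python
import math
from typing import List, Tuple

Line = Tuple[float, float]   # (rho, theta): x*cos(theta) + y*sin(theta) = rho
Point = Tuple[float, float]  # (x, y) in pixels


def intersect(a: Line, b: Line) -> Point:
    """Intersect two lines in normal form by solving the 2x2 linear system."""
    (rho_a, theta_a), (rho_b, theta_b) = a, b
    det = math.cos(theta_a) * math.sin(theta_b) - math.sin(theta_a) * math.cos(theta_b)
    if abs(det) < 1e-9:
        raise ValueError("lines are parallel and do not intersect")
    x = (rho_a * math.sin(theta_b) - rho_b * math.sin(theta_a)) / det
    y = (rho_b * math.cos(theta_a) - rho_a * math.cos(theta_b)) / det
    return (x, y)


def frame_corners(top: Line, bottom: Line, left: Line, right: Line) -> List[Point]:
    """Return the corners in the order top-left, top-right, bottom-right, bottom-left."""
    return [
        intersect(top, left),
        intersect(top, right),
        intersect(bottom, right),
        intersect(bottom, left),
    ]


# An axis-aligned frame: vertical edges have theta = 0, horizontal edges theta = pi/2.
corners = frame_corners(
    top=(20.0, math.pi / 2),
    bottom=(580.0, math.pi / 2),
    left=(10.0, 0.0),
    right=(990.0, 0.0),
)
```

The resulting 4 points, paired with the 4 corner points stored in the template, are the correspondences that a perspective transformation needs.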

105. Perform text detection and text recognition on the ROI cropped regions using an end-to-end text recognition algorithm based on deep learning to obtain a recognition result.

The deep-learning-based end-to-end OCR technology relies on two deep learning models: a text detection model and a text recognition model.

The text detection model is obtained as follows: pictures from a variety of scenes containing different text are collected, the text regions are calibrated manually, and the samples are divided into a training set and a test set at a ratio of 9:1; the model is trained with the CTPN algorithm from deep learning. The detection model can detect the text regions in the picture of the region to be identified line by line, and locates and displays each text region with a rectangular box.

The text recognition model is obtained as follows: Chinese and English corpora are collected, a text recognition sample set in which each sample contains a fixed number of characters is generated, and the model is trained with the CRNN algorithm from deep learning. The recognition model can recognize the text in the picture of the region to be identified without text-line segmentation or character segmentation, and outputs the text as a character string.

Exemplarily, using the text detection (CTPN) model, the preprocessed and geometrically corrected picture is taken as input and text detection is performed to obtain the position information of the text lines, which may include:

collecting relevant bill pictures and natural-scene pictures containing text; annotating the text regions in the pictures with the labeling tool LabelImg; constructing a CTPN model, dividing the annotated sample set into a training set and a validation set at a ratio of 9:1, and training the network model; if the model converges, saving the model; if not, stopping training, adjusting the parameters, and retraining until the algorithm converges.
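The 9:1 division of the annotated sample set can be sketched as a shuffled split. The function name `split_samples`, the fixed random seed, and the use of shuffling before the split are assumptions for this example; the embodiment specifies only the ratio.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(
    samples: Sequence[T], train_ratio: float = 0.9, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the samples reproducibly and split them into two sets by ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for annotated pictures.
train_set, validation_set = split_samples(range(100))
```

Fixing the seed keeps the split stable across retraining runs, so that parameter adjustments are compared on the same held-out samples.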

Using the text recognition (CRNN) model, the cropped picture of each detected text-line region is taken as input and text recognition is performed to obtain a character string of the text, which may include: extracting background pictures from the bills, adding the fonts that appear on the bills, and synthesizing training samples with a fixed number of characters; adding noise so that the training samples are closer to real data; constructing a CRNN model, dividing the sample set into a training set and a validation set at a ratio of 9:1, and training the network model; if the model converges, saving the model; if not, stopping training, adjusting the parameters, and retraining until the algorithm converges.
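The embodiment does not describe how the output of the CRNN becomes a character string. CRNN models are commonly trained with a CTC (connectionist temporal classification) loss, and under that assumption the simplest decoding is greedy: take the best label for each frame, merge consecutive repeats, and drop the blank label. The sketch below shows only that decoding step; the alphabet, the blank index 0, and the function name are assumptions for this example.

```python
from typing import Sequence

BLANK = 0  # assumed index of the CTC blank label; label k maps to alphabet[k - 1]


def ctc_greedy_decode(frame_labels: Sequence[int], alphabet: str) -> str:
    """Collapse per-frame best labels into text: merge repeats, then drop blanks."""
    chars = []
    previous = BLANK
    for label in frame_labels:
        if label != BLANK and label != previous:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)


# Hypothetical alphabet and per-frame labels for a short numeric string.
digits = "0123456789."
decoded = ctc_greedy_decode([0, 2, 2, 0, 2, 3, 3, 0], digits)
```

Because the blank label separates genuine repeats, the two "1" characters survive while the duplicated frames are merged, which is why no character segmentation is needed beforehand.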

106. Proofread the recognition result, perform structuring processing, and output and store the result in a specified format.

Illustratively, structured text information is constructed according to the obtained text information and the label information of the area, and the structured text information is output and saved in a specified format.

It can be understood that the structuring process refers to the labels of the rectangular regions in the template: the text recognized in the corresponding rectangular region of the picture to be identified is assigned to the label of that region, forming the structured data. For example, the character strings recognized by the CRNN are converted into a user-specified format and output. For the amount part of an invoice, if both the amount in capital (Chinese financial) numerals and the amount in figures are recognized, the capital amount is output to the capital-amount column and the amount in figures is output to the lowercase-amount column.
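The structuring step and the amount example can be sketched together. The first function pairs each template label with the text recognized in its rectangle; the second routes amount strings to separate columns. The function names, the column names, and the rule used to tell the two amounts apart (presence of Chinese financial numeral characters versus a parseable number) are assumptions for this example.

```python
import json
from typing import Dict, Iterable, List, Tuple

Box = Tuple[int, int, int, int]
CAPITAL_NUMERALS = set("零壹贰叁肆伍陆柒捌玖拾佰仟万亿圆元角分整")


def build_record(regions: List[Tuple[str, Box]], texts_by_box: Dict[Box, str]) -> Dict[str, str]:
    """Assign the text recognized in each template rectangle to that rectangle's label."""
    return {label: texts_by_box.get(box, "") for label, box in regions}


def route_amounts(strings: Iterable[str]) -> Dict[str, str]:
    """Send capital-numeral amounts and figure amounts to separate columns."""
    columns = {"amount_in_words": "", "amount_in_figures": ""}
    for raw in strings:
        text = raw.strip()
        if any(ch in CAPITAL_NUMERALS for ch in text):
            columns["amount_in_words"] = text
            continue
        figures = text.lstrip("¥￥").replace(",", "")
        try:
            float(figures)  # keep only strings that parse as a number
        except ValueError:
            continue
        columns["amount_in_figures"] = figures
    return columns


# Hypothetical template regions and recognition results.
regions = [("invoice_number", (700, 20, 950, 60)), ("seller", (100, 500, 600, 540))]
record = build_record(regions, {(700, 20, 950, 60): "12345678"})
record.update(route_amounts(["壹佰贰拾叁圆整", "¥123.00"]))
output = json.dumps(record, ensure_ascii=False)
```

JSON is used here only as one possible "specified format"; the same dictionary could equally be written to a spreadsheet row.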

In the embodiment of the present invention, a value-added tax electronic invoice of a target type is selected, a standard picture of that invoice type is selected as a reference picture, and a custom identification template is made; value-added tax electronic invoice files are imported, wherein the files comprise value-added tax electronic invoice files in PDF format and first value-added tax electronic invoice files in picture format, and each file is named in a uniform format; structure analysis is performed on the PDF-format files, which are converted into second value-added tax electronic invoice files in picture format; the target value-added tax electronic invoice file, which comprises the first and the second value-added tax electronic invoice files in picture format, is preprocessed and geometrically corrected, then scaled and cropped to a uniform standard size to obtain a processed value-added tax electronic invoice file; the processed file is aligned with the reference picture; the regions of the aligned picture that correspond to the regions to be identified in the template picture are cropped as regions of interest (ROI) to obtain ROI cropped regions; text detection and text recognition are performed on the ROI cropped regions using an end-to-end text recognition algorithm based on deep learning to obtain a recognition result; and the recognition result is proofread, structured, and output and stored in a specified format.
The invention allows the identification-region template to be set according to the user's own needs, imports electronic invoices in PDF format and picture format directly, and outputs structured results in a custom format using the deep-learning-based end-to-end text recognition model. This improves the office efficiency of financial staff, shortens the reimbursement cycle, ensures recognition accuracy, and helps promote office automation.

It can be understood that adding structure analysis of PDF-format electronic invoice files removes the tedious work of manually taking screenshots of electronic invoices; adding the custom template function allows the picture to be located quickly and accurately for the best recognition effect, and allows regions of interest to be extracted according to the user's own needs; and using a deep learning algorithm for end-to-end recognition of the text content avoids the complicated design and inefficient processing flow of traditional OCR methods and improves recognition accuracy.

FIG. 4 is a schematic diagram of an embodiment of an electronic device provided in an embodiment of the present invention. As shown in FIG. 4, the electronic device may include:

the acquiring module 401, configured to select a value-added tax electronic invoice of a target type, select a standard picture of that invoice type as a reference picture, and make a custom identification template; and to import value-added tax electronic invoice files, wherein the files comprise value-added tax electronic invoice files in PDF format and first value-added tax electronic invoice files in picture format, and each value-added tax electronic invoice file is named in a uniform format;

the processing module 402, configured to perform structure analysis on the value-added tax electronic invoice files in PDF format and convert them into second value-added tax electronic invoice files in picture format; to preprocess and geometrically correct a target value-added tax electronic invoice file, and scale and crop it to a uniform standard size to obtain a processed value-added tax electronic invoice file, wherein the target value-added tax electronic invoice file comprises the first and the second value-added tax electronic invoice files in picture format; to align the processed value-added tax electronic invoice file with the reference picture; to crop, as regions of interest (ROI), the regions of the aligned picture that correspond to the regions to be identified in the template picture, to obtain ROI cropped regions; to perform text detection and text recognition on the ROI cropped regions using an end-to-end text recognition algorithm based on deep learning to obtain a recognition result; and to proofread the recognition result, perform structuring processing, and output and store the result in a specified format.

Optionally, in some embodiments of the present invention, importing the value-added tax electronic invoice files supports uploading a single file or multiple files; after recognition is finished, the recognition results of the imported files are stored separately according to the file attributes. Optionally, in some embodiments of the present invention,

the acquiring module 401 is specifically configured to select a standard value-added tax electronic invoice picture that is complete, clear, correct and free of stains, to serve as the reference picture for making the custom identification template; to select the 4 corner points of the outer frame of the table on the standard picture as reference points for the picture alignment transformation, and store the coordinates of the 4 corner points; and to select the areas to be identified in the electronic invoice as required, store the coordinate position of each area together with the content it represents as the label information of the structured result, and thereby obtain the custom identification template.

Optionally, in some embodiments of the present invention,

the processing module 402 is specifically configured to convert the target value-added tax electronic invoice file to grayscale to obtain a single-channel grayscale image; to apply Gaussian filtering to the single-channel grayscale image for smoothing and noise reduction to obtain a smoothed, denoised image; to extract the 4 straight lines of the outer frame in the smoothed, denoised image by Hough transformation, and calculate from them the coordinate positions of the 4 corner points of the outer frame; to compare the coordinate positions of the 4 corner points with the 4 corner points in the custom identification template, and perform geometric correction through multi-level perspective transformation to obtain a corrected picture; and, with reference to the reference picture, to crop the redundant border regions and the ROI regions in the corrected picture so that the size of the corrected picture is consistent with that of the custom identification template, thereby obtaining the picture of the region to be identified.

Optionally, in some embodiments of the present invention, the deep-learning-based end-to-end OCR technology relies on two deep learning models: a text detection model and a text recognition model.

Optionally, in some embodiments of the present invention,

the text detection model is obtained as follows: pictures from a variety of scenes containing different text are collected, the text regions are calibrated manually, and the samples are divided into a training set and a test set at a ratio of 9:1; the model is trained with the CTPN algorithm from deep learning. The detection model can detect the text regions in the picture of the region to be identified line by line, and locates and displays each text region with a rectangular box.

Optionally, in some embodiments of the present invention,

the text recognition model is obtained as follows: Chinese and English corpora are collected, a text recognition sample set in which each sample contains a fixed number of characters is generated, and the model is trained with the CRNN algorithm from deep learning. The recognition model can recognize the text in the picture of the region to be identified without text-line segmentation or character segmentation, and outputs the text as a character string.

FIG. 5 is a schematic diagram of another embodiment of an electronic device in an embodiment of the present invention. As shown in FIG. 5, the electronic device may include:

a transceiver 501, a processor 502 and a memory 503, wherein the transceiver 501, the processor 502 and the memory 503 are connected through a bus;

a memory 503 for storing operation instructions;

the transceiver 501 is configured to select a value-added tax electronic invoice of a target type, select a standard picture of that invoice type as a reference picture, and make a custom identification template; and to import value-added tax electronic invoice files, wherein the supported formats comprise PDF (Portable Document Format) and picture formats, and each value-added tax electronic invoice file is named in a uniform format;

the processor 502 is configured to invoke the operation instructions to execute the steps of the method for extracting electronic invoice information shown in FIG. 1 according to the embodiment of the present invention.

The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented wholly or partially in the form of a computer program product.

The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
