Bill recognition system
Reading note: This technology, Bill recognition system (票据识别系统), was designed by 野田享弘 on 2019-03-29. Main content: The invention provides a bill recognition system that can improve the recognition accuracy of handwritten characters filled in by a user and reduce manual correction work. A central server and a system terminal are communicably connected to each other via a public communication network, wherein the central server recognizes handwritten characters of a user written in a bill and the system terminal includes an image scanner that reads the handwritten characters filled in the bill by the user. In the bill recognition system, the central server includes a handwritten character recognition unit that receives, from the system terminal, image data of the bill read by the image scanner and recognizes the user's handwritten characters in the received image data by each of at least two OCR recognition programs using different algorithms. Portions where the recognition results agree are confirmed as the handwritten characters written in the bill, and portions where the recognition results disagree become targets of correction processing.
1. A bill recognition system, characterized in that
a central server that recognizes handwritten characters of a user written in a bill and a system terminal including an image scanner that reads the handwritten characters filled in the bill by the user are communicably connected to each other via a public communication network,
in the bill recognition system,
the central server includes a handwritten character recognition unit that receives, from the system terminal, image data of the bill read by the image scanner, recognizes the user's handwritten characters in the received image data by each of at least two OCR recognition programs using different algorithms, confirms the handwritten characters written in the bill for portions where the recognition results agree, and treats portions where the recognition results disagree as targets of correction processing.
2. The bill recognition system according to claim 1, wherein
the handwritten character recognition unit performs a number specifying process of extracting numerals from the handwritten characters written in the bill and analyzing features of the user's handwritten numerals, thereby specifying the numerals of portions that could not be recognized in the recognition results of the OCR recognition programs.
3. The bill recognition system according to claim 1 or 2, wherein
the handwritten character recognition unit performs a first user name correction process of acquiring a matching user name from the bank name, branch name, subject, and account number written in the bill and specified as recognition results of the OCR recognition programs, and correcting the user name by comparing the acquired user name with the user name specified as a recognition result of the OCR recognition programs.
4. The bill recognition system according to any one of claims 1 to 3, wherein
the handwritten character recognition unit performs a second user name correction process of extracting the phonetic kana of the kanji recognized for the user name written in the bill and specified as the recognition result of each of the OCR recognition programs, and correcting the user name by comparing the extracted phonetic kana with the user name specified as the recognition result of each of the OCR recognition programs.
5. The bill recognition system according to any one of claims 1 to 4, wherein
the central server specifies a bill layout from among a plurality of types of bill layouts based on the image data of the bill, extracts the user's handwritten characters from the filling columns for first user information, second user information, and amount information of the specified bill layout, and recognizes the characters by the OCR recognition programs.
Technical Field
The present invention relates to a bill recognition system, and more particularly, to a bill recognition system capable of recognizing handwritten characters written in a standard bill read by an image scanner with high accuracy by optical character recognition processing.
Background
Conventionally, in banks and the like, a user writes handwritten characters (kanji, katakana, numerals, and the like) on a standard bill, and remittance and other transactions are processed based on the bill. In recent years, it has become common to read the handwritten bill with an image scanner and perform Optical Character Recognition (OCR) processing to register its contents in a computer system.
In recent years, in a program that performs optical character recognition processing (hereinafter abbreviated as OCR), in particular, recognition accuracy (discrimination accuracy) of printed characters has been improved. However, the handwritten character filled in by the user is not recognized with sufficient accuracy, and for example, in katakana, numerals, and the like, handwritten characters that cannot be recognized by OCR occur with a considerable probability. In order to cope with this, the handwritten character which cannot be recognized by OCR is corrected by the judgment of the bank clerk, and the correction work becomes a burden on window business of the bank and the like.
Therefore, a character recognition system and a character recognition method are disclosed that can efficiently perform character recognition or correction based on the contents described in a specific filling field for a bill having a plurality of filling fields (see patent document 1).
Patent document 1: Japanese Patent Laid-Open No. 2007-011656
Disclosure of Invention
However,
Therefore, an object of the present invention is to provide a bill recognition system that can improve recognition accuracy of handwritten characters written by a user and reduce manual modification work.
In a bill recognition system of the present invention in which a center server for recognizing handwritten characters of a user written in a bill and a system terminal including an image scanner for reading handwritten characters filled in the bill by the user are communicably connected to each other via a public communication network, the center server includes a handwritten character recognition unit for receiving image data of the bill read by the image scanner from the system terminal, recognizing the handwritten characters of the user of the received image data of the bill by OCR recognition programs of at least two or more different algorithms, respectively, a part of the recognition results being identical determines the handwritten characters written in the bill, and a part of the recognition results being not identical is an object of correction processing.
The handwritten character recognition means performs a number specifying process of extracting a number from the handwritten character described in the form, and analyzing a feature of the handwritten number of the user, thereby specifying a number of a part that cannot be recognized in the recognition results of the OCR recognition programs.
The handwritten character recognition means performs a first user name correction process of acquiring a corresponding user name from a bank name, branch store name, subject, and account number described in the form and specified as a result of recognition by the OCR recognition program, and correcting the user name by comparing the acquired user name with the user name specified as a result of recognition by the OCR recognition program.
The handwritten character recognition means may perform a second user name correction process of extracting phonetic kana based on the kanji recognition of the user name specified as the recognition result of each of the OCR recognition programs described in the ticket, and comparing the extracted phonetic kana with the user name specified as the recognition result of each of the OCR recognition programs to correct the user name.
Further, a plurality of types of ticket layouts are registered in advance, and the central server specifies a ticket layout from among the plurality of types of ticket layouts based on image data of the ticket, extracts handwritten characters of a user from a filling field of first user information, second user information, and amount information of the specified ticket layout, and recognizes the characters by the OCR recognition program.
According to the invention, the central server comprises at least two or more OCR recognition programs with different algorithms. Then, the handwritten characters of the user written (filled) in the bill (for example, a remittance request book of a financial institution such as a bank) are recognized by a plurality of OCR recognition programs based on the image data of the bill received from the system terminal, the handwritten characters whose recognition results of two or more OCR recognition programs having different algorithms match are identified, and the handwritten characters whose recognition results do not match are targeted for correction processing. Thus, the handwritten character of the user recorded in the bill can be quickly discriminated. The handwritten character to be subjected to the correction processing is finally corrected by a judgment or the like of a person (for example, a staff of a financial institution). Thus, according to the present invention, since handwritten characters written on a form by a user are automatically recognized and confirmed by an OCR recognition program having at least two different algorithms, it is possible to improve recognition accuracy of handwritten characters.
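The agreement check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes both OCR recognition programs return strings that are already aligned character by character, and the `UNRECOGNIZED` marker and the `reconcile` function are hypothetical names introduced only for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical marker that an OCR program emits for a character it cannot read.
UNRECOGNIZED = "?"


def reconcile(first: str, second: str) -> Tuple[List[Optional[str]], List[int]]:
    """Compare two OCR results position by position.

    Returns the confirmed characters (None where the results disagree)
    and the positions that become targets of correction processing.
    """
    length = max(len(first), len(second))
    confirmed: List[Optional[str]] = []
    to_correct: List[int] = []
    for i in range(length):
        a = first[i] if i < len(first) else UNRECOGNIZED
        b = second[i] if i < len(second) else UNRECOGNIZED
        if a == b and a != UNRECOGNIZED:
            confirmed.append(a)      # both programs agree: character is confirmed
        else:
            confirmed.append(None)   # disagreement or unreadable: needs correction
            to_correct.append(i)
    return confirmed, to_correct


if __name__ == "__main__":
    # Illustrative account numbers; the third digit differs between the programs.
    print(reconcile("1234567", "1284567"))
```

Positions listed in `to_correct` would then go to the correction processes or, finally, to a staff member's judgment. Real OCR output may differ in length or alignment, which this sketch does not handle.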
The handwritten character recognition means extracts numerals from the handwritten characters described in the form, analyzes the features of the handwritten numerals of the user, and performs numeral identification processing to identify numerals of parts that cannot be identified in each recognition result of at least two or more OCR recognition programs (hereinafter, simply referred to as a plurality of OCR recognition programs) having different algorithms. That is, in the present invention, only the number is extracted from the handwritten character described in the form, and the feature of the extracted handwritten number of the user is determined (for example, the number "7" is described in a wide width and the circle above the number "9" is small). Then, the feature is added to the number of the unrecognizable part in the recognition results of the plurality of OCR recognition programs, and the handwritten number is recognized and specified. In this way, by adding the recognition pattern of the handwritten numeral to the conventional OCR recognition program and performing determination in accordance with the feature of the handwritten numeral of each user, it is possible to automatically recognize and confirm the numerals of the portions which cannot be recognized by the plurality of OCR recognition programs.
The handwritten character recognition means performs a first user name correction process: it acquires a user name (the payee name when the bill is a remittance request form) from the bank name, branch name, subject, and account number written in the bill and specified as recognition results of the plurality of OCR recognition programs, and corrects the user name by comparing the acquired user name with the user names specified as recognition results of the plurality of OCR recognition programs. That is, a financial institution (the own bank, another bank with which it has a business partnership, etc.) normally registers, in a customer management computer or the like, the user name (katakana) together with the bank name, branch name, subject, and account number of each account opened by a user. Therefore, when the bank name, branch name, subject, and account number are specified as the recognition results of each of the plurality of OCR recognition programs, the formal user name (katakana) corresponding to them is acquired from the customer management computer of the financial institution where the account was opened. The acquired user name (katakana) is then compared with the user names (katakana) identified by the plurality of OCR recognition programs, and when they differ, the user name is corrected to the acquired formal user name (katakana). This makes it possible to correct the user name (katakana) identified by the plurality of OCR recognition programs to the registered formal user name, thereby improving the reliability of identifying the user name (katakana).
The handwritten character recognition unit performs a second user name correction process: it extracts the phonetic kana of the kanji recognized for a user name (for example, the requester name and the payee name when the bill is a remittance request form) specified as recognition results of the plurality of OCR recognition programs, and corrects the user name by comparing the extracted phonetic kana with the user name (katakana) specified as recognition results of the plurality of OCR recognition programs. That is, on a bill (remittance request form) of a financial institution (the own bank, another bank with which it has a business partnership, etc.), the user name is usually handwritten in both kanji and katakana. Therefore, when the kanji have been recognized by each of the plurality of OCR recognition programs, the phonetic kana of the kanji is compared with the user name (katakana) determined as the recognition result of each of the plurality of OCR recognition programs, and when they differ, the user name (katakana) is corrected to the phonetic kana of the kanji. Consequently, when the recognition results of the plurality of OCR recognition programs contain a portion of the user name (katakana) that could not be recognized or that was erroneously recognized, the user name can be specified based on the phonetic kana of the kanji. That is, the reliability of specifying the user name (katakana) can be improved.
The central server registers a plurality of bill layouts in advance, specifies a bill layout from among them based on the image data of a bill, extracts the user's handwritten characters from the filling columns for first user information, second user information, and amount information of the specified bill layout, and recognizes the characters by the plurality of OCR recognition programs. That is, there are various types of bill layouts. For example, when the bill is a remittance request form, the positions of the filling columns differ from form to form: payee information (bank name, branch name, subject, account number, name, etc.) as first user information, requester information (address, name, telephone number, etc.) as second user information, and amount information (remittance amount, remittance commission, etc.). Therefore, using the characters printed on the bill (requester, payee, amount, etc.) as a trigger condition, the bill layout is identified from the image data read by the image scanner, and it is determined which of the plurality of bill layouts registered on the central server it corresponds to. The handwritten characters filled in by the user are then extracted according to the determined bill layout and recognized by the first OCR recognition program and the second OCR recognition program. This makes it possible to reliably recognize the first user information, the second user information, and the amount information.
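The layout determination described above can be sketched as follows. This is a hedged illustration only: the `Layout` structure, the coordinate boxes, and the assumption that the printed labels and their positions have already been detected in the image are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom), in pixels
Point = Tuple[int, int]           # (x, y), in pixels


@dataclass(frozen=True)
class Layout:
    name: str
    triggers: Dict[str, Box]  # printed label -> region where that label is expected
    fields: Dict[str, Box]    # information group -> filling column to crop for OCR


def contains(region: Box, point: Point) -> bool:
    left, top, right, bottom = region
    x, y = point
    return left <= x <= right and top <= y <= bottom


def identify_layout(printed: Dict[str, Point],
                    layouts: Iterable[Layout]) -> Optional[Layout]:
    """Return the first registered layout whose printed labels all appear
    where that layout expects them, or None if no layout matches."""
    for layout in layouts:
        if all(label in printed and contains(box, printed[label])
               for label, box in layout.triggers.items()):
            return layout
    return None


# Two made-up registered layouts that place the "payee" label differently.
LAYOUT_A = Layout("A", {"payee": (0, 0, 100, 50), "amount": (0, 200, 100, 250)},
                  {"first_user": (100, 0, 600, 50), "amount": (100, 200, 600, 250)})
LAYOUT_B = Layout("B", {"payee": (0, 300, 100, 350), "amount": (0, 0, 100, 50)},
                  {"first_user": (100, 300, 600, 350), "amount": (100, 0, 600, 50)})

if __name__ == "__main__":
    detected = {"payee": (40, 320), "amount": (30, 20)}
    match = identify_layout(detected, [LAYOUT_A, LAYOUT_B])
    print(match.name if match else "no registered layout matched")
```

Once a layout is matched, the boxes in `fields` indicate where the handwritten entries would be cropped before being passed to the OCR recognition programs.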
Drawings
Fig. 1 is a diagram showing a configuration of a bill identifying system according to the present embodiment.
Fig. 2 is a block diagram illustrating an electrical configuration of the center server of the bill identifying system according to the present embodiment.
Fig. 3 is a flowchart of various processes of the central server of the bill identifying system according to the present embodiment.
Fig. 4 is a diagram illustrating a number determination process of the bill identifying system according to the present embodiment.
Fig. 5 is a list of similar characters of katakana in the bill identifying system according to the present embodiment.
Fig. 6 is a diagram illustrating a first user name correction process of the bill identifying system according to the present embodiment.
Fig. 7 is a diagram illustrating a second user name correction process of the bill identifying system according to the present embodiment.
Fig. 8 is a diagram illustrating a bill layout of the bill identifying system according to the present embodiment.
Description of the symbols
1 Bill recognition system
10 System terminal
20 Image scanner
25 Central server
30 System server
100 Bank
Detailed Description
The present invention is a bill recognition system in which a central server for recognizing handwritten characters of a user written in a bill and a system terminal including an image scanner for reading handwritten characters filled in the bill by the user are communicably connected to each other via a public communication network, wherein the central server includes a handwritten character recognition unit for receiving image data of the bill read by the image scanner from the system terminal, respectively recognizing the handwritten characters of the user of the received image data of the bill by OCR recognition programs of at least two different algorithms, a part of the recognition results being coincident determines the handwritten characters written in the bill, and a part of the recognition results being non-coincident is an object of correction processing.
Next, an embodiment of the present invention will be described in detail with reference to the drawings.
[ Structure of Bill identification System ]
Next, an outline of the configuration of the bill identifying system in the present embodiment will be described with reference to fig. 1. In the following description, the operator of the present system is assumed to be a bank, which is a financial institution, and recognition of handwritten characters written by a user on a remittance request form, which is one kind of bill, is described as an example.
As shown in fig. 1, the
As the
The
The
In the above configuration, in the present embodiment, the
Further, in the
In this way, by recognizing the user's handwritten character described in the image data of the remittance request book by the first OCR recognition program and the second OCR recognition program having different algorithms, it is possible to quickly recognize the user's handwritten character described in the specific ticket.
Further, in the present embodiment, various correction processes (a number specifying process, a first user name correction process, and a second user name correction process, which will be described later) are executed on the recognition results of the handwritten characters in the first OCR recognition program and the second OCR recognition program, and the handwritten characters in the unrecognizable portion and the misrecognized portion are recognized and corrected. By executing the various correction processes in this way, the accuracy of automatically recognizing and specifying the handwritten character written by the user can be improved, and therefore, the
[ Structure of Central Server 25 ]
Next, an electrical configuration of the
The
The input/
The
The external communication I/
[ number specifying processing ]
Next, the number specifying process in the
As shown in fig. 4, the number of the portion that cannot be recognized by the first OCR recognition program and the second OCR recognition program is a
Then, when the extracted
In this way, by adding the recognition pattern of the handwritten numeral to the handwritten numeral of each of the conventional first and second OCR recognition programs and performing the numeral specification processing determined in accordance with the feature of the handwritten numeral of each of the users, the numerals of the part which cannot be recognized by each of the first and second OCR recognition programs can be automatically recognized and specified.
The features of the user's handwritten numerals described above are merely examples. The numerals of the parts that cannot be recognized by the first OCR recognition program and the second OCR recognition program can also be specified by detecting various other features of a user's handwritten numerals, such as the shape of the lower circle of a handwritten "8" or a strong rightward slant of a handwritten "1". When there are only a few samples of the user's handwritten numerals, or no feature is found in them, the numeral is left as a numeral of an unrecognizable part.
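One way to picture the number specifying process is as nearest-profile matching, sketched below. The specification does not state how features are measured or compared, so everything here is an assumption: each handwritten numeral is reduced to a hypothetical feature vector (for example, relative stroke width and loop size), and the `min_samples` and `max_distance` thresholds are arbitrary illustrative values.

```python
from collections import defaultdict
from math import dist
from statistics import fmean
from typing import Dict, List, Optional, Sequence, Tuple

Features = Tuple[float, ...]  # hypothetical measurements of one handwritten numeral


def build_profiles(samples: Sequence[Tuple[str, Features]],
                   min_samples: int = 2) -> Dict[str, Features]:
    """Average the features of the numerals already recognized on this bill.

    Numerals with fewer than `min_samples` examples get no profile,
    mirroring the case where too few samples exist to find a feature.
    """
    grouped: Dict[str, List[Features]] = defaultdict(list)
    for digit, feats in samples:
        grouped[digit].append(feats)
    return {digit: tuple(fmean(column) for column in zip(*rows))
            for digit, rows in grouped.items() if len(rows) >= min_samples}


def specify_digit(unknown: Features, profiles: Dict[str, Features],
                  max_distance: float = 0.15) -> Optional[str]:
    """Return the numeral whose profile is closest to the unrecognized one,
    or None (left as unrecognizable) when nothing is close enough."""
    if not profiles:
        return None
    digit, profile = min(profiles.items(), key=lambda item: dist(unknown, item[1]))
    return digit if dist(unknown, profile) <= max_distance else None


if __name__ == "__main__":
    # (numeral, (relative width, relative loop size)): made-up measurements.
    recognized = [("7", (0.9, 0.0)), ("7", (0.8, 0.0)),
                  ("9", (0.5, 0.2)), ("9", (0.5, 0.3)),
                  ("1", (0.1, 0.0))]
    profiles = build_profiles(recognized)
    print(specify_digit((0.82, 0.02), profiles))  # close to this writer's "7"
    print(specify_digit((0.10, 0.00), profiles))  # no usable profile nearby
```

Returning `None` keeps the numeral among the unrecognizable parts, so it remains a target of correction processing rather than being guessed.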
[ first user name correction processing ]
Next, the first user name correction processing in the
In a receipt such as a remittance request form, columns are provided in which a name (user name) of a recipient, a bank name of an account opened by the recipient, a branch name, a subject, and an account number are written by hand. Then, in the bank where the account of the payee is opened, the formal user name (katakana) of the payee is registered in the customer management computer (the
As shown in fig. 6 (a), in the first and second OCR recognition programs, a bank name "ニホン", a branch name "ギンザ", a subject "フツウ", and an account number "9999999" of the payee-opened account are specified. When the payee opens an account in the own bank, the central server 25 (customer management computer) is inquired about the formal user name of the payee, and when the payee opens an account in the other bank, the system server 30 (customer management computer) is inquired about the formal user name of the payee.
As a result, the bank name, the branch name, the subject, the account number, and the user name of the payee shown in table 51 are acquired. Here, the payee's user name identified by the first OCR recognition program or the second OCR recognition program is "ワキタアイ", as shown in table 50, whereas the user name in table 51 is "クキタマイ". In general, katakana contains many similar-looking characters, so the recognition rate of katakana in an OCR recognition program is low. Fig. 5 shows examples of similar katakana characters as 26 kinds of similar character data. The similar character data shown in fig. 5 is data registered in advance in the
That is, when the acquired user name "クキタマイ" of the payee in table 51 is compared with the user name "ワキタアイ" of the payee shown in table 50, the katakana "ワ" and "ク" (17 in fig. 5) and "ア" and "マ" (1 in fig. 5), surrounded by M1 in the figure, are similar katakana characters, so it is judged that the first OCR recognition program or the second OCR recognition program misrecognized them. Since table 51 confirms that the formal user name of the payee is "クキタマイ", the erroneously recognized user name is corrected to the user name shown in table 51 (the part surrounded by the circle M1 in the drawing). This eliminates the erroneous recognition by the OCR recognition program and specifies the formal user name of the payee.
Similarly, as shown in fig. 6 (b), the first OCR recognition program and the second OCR recognition program specify the bank name "ニホン", the branch name "ギンザ", the subject "フツウ", and the account number "0000000" of the account opened by the payee. When the payee opens an account in the own bank, the central server 25 (customer management computer) is inquired about the formal user name of the payee, and when the payee opens an account in the other bank, the system server 30 (customer management computer) is inquired about the formal user name of the payee.
As a result, the bank name, the branch name, the subject, the account number, and the user name of the payee shown in table 54 are acquired. In this case, the payee's user name identified by the first OCR recognition program and the second OCR recognition program is "ス?キイ?ロー" in table 53, whereas it is "スズキイチロー" in table 54. Thus, the "?" (or blank) portions are found to be "ズ" and "チ", surrounded by the circle M2 in the figure, and the user name of the payee can be determined to be "スズキイチロー".
In this way, when the payee's user name (katakana) in the recognition results of the first OCR recognition program and the second OCR recognition program has been erroneously recognized or could not be recognized, it is corrected to the formal user name of the payee, so that the payee's user name that the user wrote in katakana on the remittance request form can be determined accurately.
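The two cases of fig. 6 can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary stands in for the customer management computer; `REGISTRY`, `SIMILAR_PAIRS`, and `correct_payee_name` are hypothetical names. Only the two similar-character pairs quoted in the text are listed, and the plausibility check (accepting the registered name only when every difference is a known similar pair or an unrecognized mark) is a conservative assumption of this sketch rather than something the specification requires.

```python
from typing import Dict, Optional, Tuple

# (bank name, branch name, subject, account number)
AccountKey = Tuple[str, str, str, str]

# Two of the similar katakana pairs mentioned in the text; fig. 5 lists 26 kinds.
SIMILAR_PAIRS = {frozenset("ワク"), frozenset("アマ")}

# Stand-in for the customer management computer (hypothetical data store).
REGISTRY: Dict[AccountKey, str] = {
    ("ニホン", "ギンザ", "フツウ", "9999999"): "クキタマイ",
    ("ニホン", "ギンザ", "フツウ", "0000000"): "スズキイチロー",
}


def correct_payee_name(key: AccountKey, ocr_name: str,
                       registry: Dict[AccountKey, str],
                       unrecognized: str = "?") -> Tuple[Optional[str], bool]:
    """Look up the registered name and decide whether to adopt it.

    Returns (name, corrected). name is None when the case should be left
    for manual correction (no registration, or an implausible difference).
    """
    registered = registry.get(key)
    if registered is None or len(registered) != len(ocr_name):
        return None, False
    for seen, official in zip(ocr_name, registered):
        if seen == official or seen == unrecognized:
            continue  # identical, or a gap that the registered name fills in
        if frozenset((seen, official)) not in SIMILAR_PAIRS:
            return None, False  # difference is not a known look-alike
    return registered, registered != ocr_name


if __name__ == "__main__":
    print(correct_payee_name(("ニホン", "ギンザ", "フツウ", "9999999"),
                             "ワキタアイ", REGISTRY))
    print(correct_payee_name(("ニホン", "ギンザ", "フツウ", "0000000"),
                             "ス?キイ?ロー", REGISTRY))
```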
[ second user name correction processing ]
Next, a second user name correction process in the
As shown in fig. 7 (a), when the
As shown in fig. 7 (b), "アキヤアマキユ" in
In the
In the
According to the second user name correction processing, even when the katakana user name specified as the recognition result of each of the first OCR recognition program and the second OCR recognition program could not be recognized or was recognized incorrectly, the user name can be specified based on the phonetic kana of the kanji. That is, the reliability of specifying the correct user name (katakana) can be improved.
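The comparison between the kanji reading and the katakana entry can be sketched as follows. This is an illustration under stated assumptions: the `READINGS` dictionary is a hypothetical two-entry stand-in for a real kanji-to-kana resource, the name used is illustrative, and the greedy longest-match lookup is a simplification, since real names can have several possible readings.

```python
from typing import Dict, List, Optional

# Hypothetical reading dictionary; a real system would need a far larger resource.
READINGS: Dict[str, str] = {"鈴木": "スズキ", "一郎": "イチロー"}


def kana_reading(kanji_name: str, readings: Dict[str, str]) -> Optional[str]:
    """Build the phonetic kana of a kanji name by greedy longest match.

    Returns None when some part of the name has no known reading.
    """
    parts: List[str] = []
    i = 0
    while i < len(kanji_name):
        for j in range(len(kanji_name), i, -1):
            chunk = kanji_name[i:j]
            if chunk in readings:
                parts.append(readings[chunk])
                i = j
                break
        else:
            return None  # no dictionary entry starts at position i
    return "".join(parts)


def correct_by_reading(kanji_name: str, ocr_kana: str,
                       readings: Dict[str, str]) -> str:
    """Adopt the kanji reading when it is available and differs from the
    katakana recognized by OCR; otherwise keep the OCR result unchanged."""
    reading = kana_reading(kanji_name, readings)
    if reading is None:
        return ocr_kana
    return reading


if __name__ == "__main__":
    print(correct_by_reading("鈴木一郎", "ス?キイ?ロー", READINGS))
```

When no reading can be derived, the OCR result is returned unchanged, so the name would still go through the remaining correction steps.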
[ handwritten character recognition processing ]
Next, a handwritten character recognition process for recognizing a handwritten character written in a remittance application as a ticket by a user in the
As shown in fig. 3, the
The
That is, as shown in fig. 8 (a) to 8 (c), the
The
When there are digits of the part that cannot be recognized by the first OCR recognition program, the
The
The
As described above, in steps S12 to S15, the results of execution of various correction processes (the number specifying process, the first user name correcting process, and the second user name correcting process) are reflected on the recognition of the image data of the first OCR recognition program and the recognition result thereof, and stored in the predetermined area of the
The
When there are numbers of the parts that cannot be recognized by the second OCR recognition program, the
The
The
As described above, in steps S16 to S19, the results of execution of various correction processes (the number specifying process, the first user name correction process, and the second user name correction process) are reflected on the recognition of the image data of the second OCR recognition program and the recognition results, and stored in the predetermined area of the
The
The
The processing of steps S20 to S23 compares the recognition results of the image data of the first OCR recognition program and the second OCR recognition program, and stores the results of the secondary determination processing of the first OCR recognition program and the second OCR recognition program in a predetermined area of the
As described above, in the handwritten character recognition process according to the present embodiment, the image data of the remittance request read by the
Although the embodiments of the present invention have been described above, the specific configuration of the present invention is not limited to the above embodiments, and design changes and the like without departing from the scope of the subject matter of the present invention are also included in the present invention.