Price identification method and device

Document No.: 1937511 Publication date: 2021-12-07

Reading note: this technology, "Price identification method and device", was designed by 薄晰月 and 陈华昌 on 2020-07-20. Main content: The invention discloses a price identification method and device, and relates to the field of computer technology. One embodiment of the method comprises: performing text detection and recognition on a target picture to obtain a recognition result; sorting the recognized texts according to position information and processing the sorted texts by regular-expression matching to filter out suspected price texts; and determining a second suspected price text that intersects or is adjacent to a first suspected price text in position and that, together with it, meets a predetermined merging rule, and merging the first suspected price text and the second suspected price text to obtain the price text of the target picture. This embodiment addresses the recognition of decimal prices in complex layouts across many e-commerce scenarios. It introduces the concept of an attention value, which considers the special position information of a decimal price together with the prominent area share of its integer part, so as to select a valid and real price.

1. A price identification method, comprising:

performing text detection and recognition on a target picture to obtain a recognition result, wherein the recognition result comprises texts and position information, and the position information represents coordinate values of corner points of each text in the target picture;

sorting the recognized texts according to the position information, and processing the sorted texts by regular-expression matching to filter out suspected price texts;

and determining a second suspected price text that intersects or is adjacent to a first suspected price text in position and that, together with the first suspected price text, meets a predetermined merging rule, and merging the first suspected price text and the second suspected price text to obtain the price text of the target picture.

2. The method according to claim 1, wherein after processing the sorted text based on the regular matching and filtering out the suspected price text, further comprising:

for a field in a single suspected price text, if the field is found in a non-price field library, determining that the single suspected price text is not a price text and filtering it out; and/or

if the tail field of the single suspected price text comprises a unit-of-measurement field, determining that the single suspected price text is not a price text and filtering it out.

3. The method according to claim 1 or 2, wherein after processing the sorted text based on the regular matching manner and filtering out the suspected price text, the method further comprises:

determining context information adjacent to a single suspected price text, and if the preceding context and/or the following context contains a non-price field, determining that the single suspected price text is not a price text and filtering it out.

4. The method of claim 1, wherein determining a second suspected price text that intersects or is adjacent to the first suspected price text location and that together meet a predetermined merge rule comprises:

processing attribute information of each suspected price text in the target picture according to an attention value calculation mode to obtain an attention value of each suspected price text;

determining a third suspected price text with the largest attention value, and selecting, from the remaining suspected price texts, those that intersect or are adjacent to the third suspected price text and lie to its right, so as to determine a fourth suspected price text that meets the predetermined merging rule;

and removing the third suspected price text and the fourth suspected price text, determining a fifth suspected price text with the largest attention value from the remaining suspected price texts, and repeating the position filtering and merging rule judgment until only one suspected price text remains.

5. The method according to claim 4, wherein the attribute information includes a pixel height, a pixel width, and a text length;

before processing the attribute information of each suspected price text in the target picture according to the attention value calculation mode, the method further includes:

for the position information of a single suspected price text, taking the difference between the maximum and minimum coordinate values along the horizontal axis as the pixel width, and taking the difference between the maximum and minimum coordinate values along the vertical axis as the pixel height; and

calculating the text length of the single suspected price text by using a text length calculation mode.

6. The method according to claim 4 or 5, wherein filtering out, from the remaining suspected price texts, intersecting or adjacent suspected price texts positioned at the right side of the third suspected price text comprises:

calculating the distance between the coordinate value of the upper-right or lower-right corner point of the third suspected price text and the coordinate value of the upper-left or lower-left corner point of each remaining suspected price text, and determining the suspected price texts for which the distance is within a preset range.

7. The method of claim 1, wherein the recognition result further comprises a confidence level, the confidence level representing the reliability of the recognized text;

the step of determining a second suspected price text which is intersected or adjacent with the first suspected price text and accords with a preset merging rule together comprises the following steps:

judging whether the tail part of the first suspected price text is a number or a decimal point;

judging whether the head of the second suspected price text is a number or a decimal point;

and judging whether the confidence degrees of the first suspected price text and the second suspected price text are both greater than or equal to a preset value.

8. The method of claim 7, wherein the obtaining the price text of the target picture further comprises:

and averaging the confidence levels of the first suspected price text and the second suspected price text to obtain the confidence level of the price text.

9. The method of claim 1, wherein said merging the first suspected price text and the second suspected price text comprises:

and combining the numeric parts of the first suspected price text and the second suspected price text, with a decimal point added between them.

10. The method of claim 1, further comprising:

when the target picture does not contain a suspected price text or a price text, carrying out item price search based on the identified text to obtain a price search result; wherein the text includes at least one of an item brand, an item class, a name, and a model number.

11. A price identifying device, comprising:

the recognition module is used for carrying out text detection and recognition on the target picture to obtain a recognition result; the identification result comprises a text and position information, and the position information represents coordinate values of corner points of the text in the target picture;

the filtering module is used for sorting the recognized texts according to the position information, processing the sorted texts in a regular matching mode and filtering out suspected price texts;

and the merging module is used for determining a second suspected price text which is intersected or adjacent with the first suspected price text and accords with a preset merging rule together, and merging the first suspected price text and the second suspected price text to obtain the price text of the target picture.

12. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-10.

13. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-10.

Technical Field

The invention relates to the technical field of computers, in particular to a price identification method and device.

Background

With the continued growth of the e-commerce industry, price comparison systems have come into use to compare the prices of items across e-commerce platforms and so give consumers the best recommendation. A method for identifying the price shown in an item's main picture is increasingly needed. The approach generally adopted at present detects and recognizes the price text in the main picture based on Optical Character Recognition (OCR), with the following specific steps:

first, numeric fields in the item main picture are detected with a deep learning method; the prices in those numeric fields are then recognized; finally, non-price numeric fields are filtered out by regular-expression matching of amounts, by sorting on area, or by similar means, and a valid and real price is selected and output.

In the process of implementing the invention, the inventor finds that the prior art has at least the following problems:

1. the item main picture usually contains some interfering items; if only numeric fields are detected, a large number of non-price fields appear in the subsequent OCR recognition result, which makes later rule-based extraction more difficult;

2. without relevant context key information, a price cannot be matched to its item; for example, when a main picture contains several similar items of different models, with the models shown at the top and the prices at the bottom, detecting only the price fields loses the model information, so the prices cannot be matched one-to-one with the item models;

3. a main picture without a price cannot be used for subsequent price identification; although some main pictures do not state a price, their text fields usually specify attribute information such as the item's name, model, and capacity; if only prices are detected and recognized, the price of similar items cannot be looked up in a database from the other key information in the picture, which limits the room for further development of price identification;

4. position information is not considered, so decimal prices in special layouts cannot be extracted accurately; for an amount with a decimal point, the main picture usually enlarges the integer part and shrinks the decimal part, and such a layout is usually split into separate boxes during detection, so the recognition results are also split and the current rules cannot merge them back into one price.

Disclosure of Invention

In view of this, embodiments of the present invention provide a price identification method and apparatus, which can at least solve the problems in the prior art that non-price fields cannot be filtered out and that the recognition result of a decimal price field in a special layout cannot be extracted correctly.

To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a price identifying method including:

performing text detection and identification on the target picture to obtain an identification result; the identification result comprises a text and position information, and the position information represents coordinate values of corner points of the text in the target picture;

sorting the recognized texts according to the position information, and processing the sorted texts by regular-expression matching to filter out suspected price texts;

and determining a second suspected price text that intersects or is adjacent to a first suspected price text in position and that, together with the first suspected price text, meets a predetermined merging rule, and merging the first suspected price text and the second suspected price text to obtain the price text of the target picture.

Optionally, after the sorted texts are processed based on the regular matching manner and suspected price texts are filtered out, the method further includes:

for a field in a single suspected price text, if the field is found in a non-price field library, determining that the single suspected price text is not a price text and filtering it out; and/or

if the tail field of the single suspected price text comprises a unit-of-measurement field, determining that the single suspected price text is not a price text and filtering it out.

Optionally, after the sorted texts are processed based on the regular matching manner and suspected price texts are filtered out, the method further includes:

determining context information adjacent to a single suspected price text, and if the preceding context and/or the following context contains a non-price field, determining that the single suspected price text is not a price text and filtering it out.

Optionally, the determining a second suspected price text which is intersected with or adjacent to the first suspected price text and meets a predetermined merging rule together includes:

processing attribute information of each suspected price text in the target picture according to an attention value calculation mode to obtain an attention value of each suspected price text;

determining a third suspected price text with the largest attention value, and selecting, from the remaining suspected price texts, those that intersect or are adjacent to the third suspected price text and lie to its right, so as to determine a fourth suspected price text that meets the predetermined merging rule;

and removing the third suspected price text and the fourth suspected price text, determining a fifth suspected price text with the largest attention value from the remaining suspected price texts, and repeating the position filtering and merging rule judgment until only one suspected price text remains.
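The selection loop described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the attention values are taken as already computed, the names `pair_by_attention`, `is_right_neighbor`, and `meets_merge_rule` are hypothetical, and dropping a lead text that finds no partner is an assumption, since the text only describes removing a matched pair.

```python
from typing import Callable, List, Tuple


def pair_by_attention(
    attention: List[float],
    is_right_neighbor: Callable[[int, int], bool],
    meets_merge_rule: Callable[[int, int], bool],
) -> List[Tuple[int, int]]:
    """Greedily pair suspected price texts, highest attention value first.

    attention[i] is the precomputed attention value of text i. The two
    callables are hypothetical stand-ins for the position filter and the
    merging rule judgment.
    """
    remaining = set(range(len(attention)))
    pairs: List[Tuple[int, int]] = []
    while len(remaining) > 1:
        # Pick the remaining text with the largest attention value.
        lead = max(remaining, key=lambda i: attention[i])
        remaining.discard(lead)
        # Look for a text to its right that also satisfies the merging rule.
        partner = next(
            (j for j in sorted(remaining)
             if is_right_neighbor(lead, j) and meets_merge_rule(lead, j)),
            None,
        )
        if partner is not None:
            remaining.discard(partner)
            pairs.append((lead, partner))
        # Assumption: a lead without a partner is simply dropped.
    return pairs


# Texts 0 and 2 are large integer parts; 1 and 3 sit to their right.
neighbors = {(0, 1), (2, 3)}
result = pair_by_attention(
    [9.0, 2.0, 5.0, 1.0],
    lambda i, j: (i, j) in neighbors,
    lambda i, j: True,
)
```

Passing the two checks in as callables keeps the loop independent of how adjacency and the merging rule are actually defined.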

Optionally, the attribute information includes a pixel height, a pixel width, and a text length;

before processing the attribute information of each suspected price text in the target picture according to the attention value calculation mode, the method further includes:

for the position information of a single suspected price text, taking the difference between the maximum and minimum coordinate values along the horizontal axis as the pixel width, and taking the difference between the maximum and minimum coordinate values along the vertical axis as the pixel height; and

calculating the text length of the single suspected price text by using a text length calculation mode.
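As an illustration, the pixel width and height can be derived from the corner points as in the following minimal sketch. The function names are hypothetical, and counting characters in `text_length` is an assumption, since the text length calculation mode is not specified here.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates of one corner point


def box_size(corners: List[Point]) -> Tuple[float, float]:
    """Return (pixel_width, pixel_height) of a text box from its corner points."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    # Width: max minus min along the horizontal axis; height: along the vertical axis.
    return max(xs) - min(xs), max(ys) - min(ys)


def text_length(text: str) -> int:
    # Assumption: the text length is taken to be the character count.
    return len(text)


# A slightly tilted box: upper-left, upper-right, lower-right, lower-left.
corners = [(10, 20), (110, 22), (112, 62), (12, 60)]
width, height = box_size(corners)
```

Using the extreme coordinates rather than a single edge keeps the measurement stable for tilted boxes.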

Optionally, filtering out, from the remaining suspected price texts, the intersecting or adjacent suspected price texts positioned at the right side of the third suspected price text includes:

calculating the distance between the coordinate value of the upper-right or lower-right corner point of the third suspected price text and the coordinate value of the upper-left or lower-left corner point of each remaining suspected price text, and determining the suspected price texts for which the distance is within a preset range.
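A minimal sketch of this position filter follows. The box layout with keys `"tl"`, `"tr"`, `"br"`, `"bl"`, the function name `right_neighbors`, and the choice to compare top corners with top corners and bottom corners with bottom corners are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Dict[str, Point]  # hypothetical layout: "tl", "tr", "br", "bl" corner points


def right_neighbors(anchor: Box, others: List[Box], max_dist: float) -> List[int]:
    """Indices of boxes whose left corners lie within max_dist of the anchor's right corners."""
    hits = []
    for i, box in enumerate(others):
        d_top = math.dist(anchor["tr"], box["tl"])      # upper-right to upper-left
        d_bottom = math.dist(anchor["br"], box["bl"])   # lower-right to lower-left
        # Either corner pair being close enough counts as adjacent.
        if min(d_top, d_bottom) <= max_dist:
            hits.append(i)
    return hits


anchor = {"tl": (0, 0), "tr": (100, 0), "br": (100, 60), "bl": (0, 60)}
near = {"tl": (104, 0), "tr": (140, 0), "br": (140, 25), "bl": (104, 25)}
far = {"tl": (300, 200), "tr": (340, 200), "br": (340, 230), "bl": (300, 230)}
found = right_neighbors(anchor, [near, far], max_dist=10)
```

Checking both corner pairs lets the filter catch a small decimal part whether it is aligned with the top or with the bottom of the larger integer part.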

Optionally, the recognition result further includes a confidence level, which represents the reliability of the recognized text;

the step of determining a second suspected price text which is intersected or adjacent with the first suspected price text and accords with a preset merging rule together comprises the following steps:

judging whether the tail part of the first suspected price text is a number or a decimal point;

judging whether the head of the second suspected price text is a number or a decimal point;

and judging whether the confidence degrees of the first suspected price text and the second suspected price text are both greater than or equal to a preset value.
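The three checks above can be combined into one predicate, as in the following sketch. The function name `can_merge` and the default threshold of 0.8 are assumptions; the text only speaks of a preset value.

```python
PRICE_CHARS = set("0123456789.")


def can_merge(first_text: str, first_conf: float,
              second_text: str, second_conf: float,
              min_conf: float = 0.8) -> bool:
    """Check the merging rule for two suspected price texts.

    min_conf stands in for the preset value; 0.8 is an assumed default.
    """
    # The tail of the first text must be a digit or a decimal point.
    tail_ok = bool(first_text) and first_text[-1] in PRICE_CHARS
    # The head of the second text must be a digit or a decimal point.
    head_ok = bool(second_text) and second_text[0] in PRICE_CHARS
    # Both confidence levels must reach the preset value.
    confident = first_conf >= min_conf and second_conf >= min_conf
    return tail_ok and head_ok and confident
```

For example, `can_merge("¥89", 0.95, ".90", 0.9)` is true, while a first text ending in a non-numeric character or a low-confidence text makes the check fail.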

Optionally, obtaining the price text of the target picture further includes: averaging the confidence levels of the first suspected price text and the second suspected price text to obtain the confidence level of the price text.

Optionally, merging the first suspected price text and the second suspected price text includes: combining the numeric parts of the first suspected price text and the second suspected price text, with a decimal point added between them.
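The merge and the confidence averaging can be sketched together as follows. The function name `merge_price` is hypothetical, and taking the numeric part as every digit found in a text is an assumption.

```python
import re
from typing import Tuple


def merge_price(first_text: str, first_conf: float,
                second_text: str, second_conf: float) -> Tuple[str, float]:
    """Merge two suspected price texts into one price text with a confidence level."""
    # Assumption: the numeric part of a text is all of its digits, in order.
    integer_part = "".join(re.findall(r"\d+", first_text))
    decimal_part = "".join(re.findall(r"\d+", second_text))
    # Add a decimal point between the two numeric parts.
    price = f"{integer_part}.{decimal_part}"
    # The merged confidence is the average of the two confidence levels.
    confidence = (first_conf + second_conf) / 2
    return price, confidence
```

For example, `merge_price("¥89", 0.875, "90", 0.625)` yields the price text `"89.90"` with a confidence level of 0.75.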

Optionally, the method further includes: when the target picture does not contain a suspected price text or a price text, carrying out item price search based on the identified text to obtain a price search result; wherein the text includes at least one of an item brand, an item class, a name, and a model number.

To achieve the above object, according to another aspect of an embodiment of the present invention, there is provided a price identifying apparatus including:

the recognition module is used for carrying out text detection and recognition on the target picture to obtain a recognition result; the identification result comprises a text and position information, and the position information represents coordinate values of corner points of the text in the target picture;

the filtering module is used for sorting the recognized texts according to the position information, and processing the sorted texts by regular-expression matching to filter out suspected price texts;

and the merging module is used for determining a second suspected price text that intersects or is adjacent to a first suspected price text in position and that, together with the first suspected price text, meets a predetermined merging rule, and merging the first suspected price text and the second suspected price text to obtain the price text of the target picture.

Optionally, the filtering module is further configured to:

for a field in a single suspected price text, if the field is found in a non-price field library, determining that the single suspected price text is not a price text and filtering it out; and/or

if the tail field of the single suspected price text comprises a unit-of-measurement field, determining that the single suspected price text is not a price text and filtering it out.

Optionally, the filtering module is further configured to: determine context information adjacent to a single suspected price text, and if the preceding context and/or the following context contains a non-price field, determine that the single suspected price text is not a price text and filter it out.

Optionally, the merging module is configured to:

processing attribute information of each suspected price text in the target picture according to an attention value calculation mode to obtain an attention value of each suspected price text;

determining a third suspected price text with the largest attention value, and selecting, from the remaining suspected price texts, those that intersect or are adjacent to the third suspected price text and lie to its right, so as to determine a fourth suspected price text that meets the predetermined merging rule;

and removing the third suspected price text and the fourth suspected price text, determining a fifth suspected price text with the largest attention value from the remaining suspected price texts, and repeating the position filtering and merging rule judgment until only one suspected price text remains.

Optionally, the attribute information includes a pixel height, a pixel width, and a text length;

the merging module is further configured to:

for the position information of a single suspected price text, taking the difference between the maximum and minimum coordinate values along the horizontal axis as the pixel width, and taking the difference between the maximum and minimum coordinate values along the vertical axis as the pixel height; and

calculating the text length of the single suspected price text by using a text length calculation mode.

Optionally, the merging module is configured to:

calculating the distance between the coordinate value of the upper-right or lower-right corner point of the third suspected price text and the coordinate value of the upper-left or lower-left corner point of each remaining suspected price text, and determining the suspected price texts for which the distance is within a preset range.

Optionally, the recognition result further includes a confidence level, which represents the reliability of the recognized text;

the merging module is configured to:

judging whether the tail part of the first suspected price text is a number or a decimal point;

judging whether the head of the second suspected price text is a number or a decimal point;

and judging whether the confidence degrees of the first suspected price text and the second suspected price text are both greater than or equal to a preset value.

Optionally, the merging module is further configured to: average the confidence levels of the first suspected price text and the second suspected price text to obtain the confidence level of the price text.

Optionally, the merging module is configured to: combine the numeric parts of the first suspected price text and the second suspected price text, with a decimal point added between them.

Optionally, the system further includes a price searching module, configured to:

when the target picture does not contain a suspected price text or a price text, carrying out item price search based on the identified text to obtain a price search result; wherein the text includes at least one of an item brand, an item class, a name, and a model number.

To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided a price recognition electronic device.

The electronic device of the embodiment of the invention comprises: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the price identification methods described above.

To achieve the above object, according to a further aspect of the embodiments of the present invention, there is provided a computer-readable medium having a computer program stored thereon, the program, when executed by a processor, implementing any one of the above price identification methods.

According to the scheme provided by the invention, one embodiment of the invention has the following advantages or beneficial effects: it addresses the recognition of decimal prices in complex layouts across many e-commerce scenarios. Its innovation is that it does not rely solely on simple rules applied to the recognition result, such as regular-expression matching, but instead introduces the concept of an attention value, which considers the special position information of a decimal price together with the prominent area share of its integer part, so as to select a valid and real price.

Further effects of the above non-conventional optional implementations are described below in connection with the specific embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a schematic diagram of a main flow of a price identification method according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart diagram of an alternative price identification method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the main modules of a price identifying apparatus according to an embodiment of the present invention;

FIG. 4 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;

FIG. 5 is a schematic block diagram of a computer system suitable for use with a mobile device or server implementing an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

It should be noted that the target picture in the embodiment of the present invention may be an item main picture in an e-commerce scenario. The price in the picture is identified so that a price comparison system can compare the prices of items across different e-commerce platforms. This scheme is described using the item main picture as an example.

Item main pictures cover a wide range of categories, and prices appear in them in many forms. Some main pictures contain a price, while others contain only the item's name and model. Main pictures that contain a price also vary in layout, for example in fonts, character spacing, font colors, question mark prices (such as 890), original prices struck through by a horizontal line, decimal prices whose integer and decimal parts differ in size, and prices rendered in outlined artistic lettering.

Referring to fig. 1, a main flowchart of a price identification method provided by an embodiment of the present invention is shown, including the following steps:

S101: performing text detection and recognition on a target picture to obtain a recognition result, wherein the recognition result comprises texts and position information, and the position information represents coordinate values of corner points of each text in the target picture;

S102: sorting the recognized texts according to the position information, and processing the sorted texts by regular-expression matching to filter out suspected price texts;

S103: determining a second suspected price text that intersects or is adjacent to a first suspected price text in position and that, together with the first suspected price text, meets a predetermined merging rule, and merging the first suspected price text and the second suspected price text to obtain the price text of the target picture.

In the above embodiment, for step S101: existing approaches rely solely on deep learning to detect and then directly output the numbers recognized in the picture. To address this problem, the present solution proposes a new recognition strategy for the text fields of the item main picture, so that price extraction can then be carried out more effectively by rules.

Text detection and recognition on the item main picture using deep learning models proceeds as follows:

1) Price text detection based on the deep learning method EAST (An Efficient and Accurate Scene Text Detector, a scene text detection model)

Being efficient and accurate, the EAST detector is well suited to detecting price fields in main pictures with complex layouts. It can use angle information and corner-point offsets as key information to locate the four corner points of the initially predicted rotated rectangular box more precisely, so it handles many complex main-picture price detection tasks involving rotation and strong background interference.

2) Price text recognition based on the deep learning method CRNN (Convolutional Recurrent Neural Network)

Mainstream character recognition techniques can be divided into single-character recognition and character-string recognition. CRNN belongs to the latter and has clear advantages for recognition tasks with variable character spacing, varied font styles, and artistic effects. Main-picture prices are mostly bold numbers and often carry additional effects such as artistic outlining, rendering, and thickening, which places high demands on the generalization ability and robustness of the recognizer. With recurrent units that have memory, such as LSTM/GRU, CRNN can train a model with relatively good generalization ability to meet many of the challenges of main-picture price recognition.

The recognition process outputs all texts in the item main picture, together with the confidence level and position information of each text. The confidence level indicates the reliability of the recognized text field and ranges from 0 to 1. The position information is the coordinate values of the corner points of the text region in the item main picture, where the corner points comprise at least the upper-left, lower-left, upper-right, and lower-right corner points.

For step S102, the texts are sorted from top to bottom based on the recognized position information, and suspected price texts are filtered out by regular-expression matching, for example re.findall (r '\ d + \?d', text).
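A minimal sketch of this sorting and matching step is shown below. The pattern `\d+(?:\.\d+)?` is an assumed stand-in for the expression used in the text, and the item layout, the function name `suspected_prices`, and the sample texts are hypothetical.

```python
import re
from typing import List, Tuple

Point = Tuple[float, float]
Item = Tuple[str, List[Point]]  # (recognized text, corner points)

# Assumed pattern: an integer with an optional decimal part.
PRICE_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def suspected_prices(items: List[Item]) -> List[Item]:
    """Sort texts from top to bottom, then keep those containing a number."""
    ordered = sorted(
        items,
        # Top edge first; the left edge breaks ties (the tie-break is an assumption).
        key=lambda it: (min(y for _, y in it[1]), min(x for x, _ in it[1])),
    )
    return [it for it in ordered if PRICE_PATTERN.search(it[0])]


items = [
    ("¥89", [(40, 200), (120, 200), (120, 260), (40, 260)]),
    ("Brand X kettle", [(10, 10), (300, 10), (300, 40), (10, 40)]),
    ("1.5L", [(10, 100), (80, 100), (80, 130), (10, 130)]),
]
candidates = [text for text, _ in suspected_prices(items)]
```

At this stage a text such as "1.5L" still passes as a suspected price text; removing it is the job of the later non-price filtering.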

Some suspected price texts are not actually price texts, so before the subsequent operations a screening step can be performed to filter out texts that are clearly not prices, such as "50 yuan off", "original price 998", "5V", and "36 mA".

1) A non-price field library can be built in advance, containing non-price fields such as "discount", "original price", "full", and "subtract". Each suspected price text is checked against the non-price fields in the library one by one, and any suspected price text that hits a field (that is, a non-price text) is filtered out.

In addition, the tail field following the matched price in the suspected price text is checked for unit-of-measurement fields such as mA and ml, so that clearly non-price texts such as "5V" and "36mA" are filtered out.

2) Context information is identified for each suspected price text one by one. For example, a prompt above some numbers makes clear that the number is not a price, such as "capacity", "highest temperature", or "maximum number of uses". Note that context information refers to any available text information, whether preceding or following; as long as one of its fields contains a non-price field, the text can be determined not to be a price text and filtered out.

The non-price fields used in approach 2) may be the same as or different from those used in approach 1).
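The two screening approaches can be sketched as a single check, as below. The field lists repeat the examples given in the description; the inclusion of "V" as a unit (inferred from the "5V" example), the case-insensitive substring matching, and the function name `is_non_price` are assumptions.

```python
import re
from typing import Iterable

# Example fields taken from the description; a real library would be larger.
NON_PRICE_FIELDS = ["discount", "original price", "full", "subtract"]
CONTEXT_FIELDS = ["capacity", "highest temperature", "maximum number of uses"]
# "mA" and "ml" are named in the text; "V" is inferred from the "5V" example.
UNIT_SUFFIXES = ["mA", "ml", "V"]

# A number followed by a unit at the very end of the text.
UNIT_TAIL = re.compile(
    r"\d\s*(?:" + "|".join(re.escape(u) for u in UNIT_SUFFIXES) + r")$"
)


def is_non_price(text: str, context: Iterable[str] = ()) -> bool:
    """Return True if a suspected price text is clearly not a price."""
    lowered = text.lower()
    # Approach 1: the text itself hits the non-price field library.
    if any(field in lowered for field in NON_PRICE_FIELDS):
        return True
    # Approach 1 (tail check): the text ends with a unit of measurement.
    if UNIT_TAIL.search(text.strip()):
        return True
    # Approach 2: preceding or following context contains a non-price field.
    return any(
        field in neighbor.lower()
        for neighbor in context
        for field in CONTEXT_FIELDS
    )
```

For example, `is_non_price("36 mA")` and `is_non_price("1200", context=["Capacity"])` are both true, while `is_non_price("89.9")` is false.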

For step S103, the suspected price texts remaining after the two filtering passes are output to the next stage, where price texts with a special format are merged. For details, see the description of the attention value calculation and merging rule judgment given with reference to fig. 2 below, which is not repeated here.

Detecting and recognizing all text fields in the item main image facilitates the later design of rules for extracting and merging prices. In addition, for items whose main image contains no price text, this approach can still provide effective key information, such as the item brand, item type, name, and model, which can then be used to search for the price across the whole network.

The method provided by this embodiment addresses decimal price recognition for complex layouts in numerous e-commerce scenarios. Its innovation is that it does not rely solely on simple rules applied to the raw recognition results, such as regular matching; instead, it introduces the concept of an attention value, which considers the special position information of a decimal price together with the prominent area share of its integer part, so as to select the valid, real price.

Referring to fig. 2, a schematic flow chart of an alternative price identification method according to an embodiment of the present invention is shown, including the following steps:

S201: performing text detection and recognition on the target picture to obtain a recognition result, where the recognition result comprises text and position information, and the position information represents the coordinate values of the corner points of the text in the target picture;

S202: sorting the recognized texts according to the position information, and processing the sorted texts by regular matching to obtain the suspected price texts;

S203: processing the attribute information of each suspected price text in the target picture according to the attention value calculation to obtain an attention value for each suspected price text;

S204: determining a third suspected price text with the largest attention value, and selecting, from the remaining suspected price texts, those that intersect or are adjacent to the third suspected price text on its right side, so as to determine a fourth suspected price text that satisfies the predetermined merging rule;

S205: merging the third suspected price text and the fourth suspected price text to obtain a price text of the target picture, and removing the third suspected price text and the fourth suspected price text;

S206: determining a fifth suspected price text with the largest attention value from the remaining suspected price texts, and repeating the position filtering and merging rule judgment operations until only one suspected price text remains.

In the above embodiment, for steps S201 and S202, reference may be made to the description of steps S101 and S102 shown in fig. 1, and details are not repeated here.

In the above embodiment, regarding step S203, the item main image often contains decimal amounts in special formats whose parts differ in size, and such prices cannot be extracted using common basic extraction rules alone. The scheme therefore provides a new approach, specifically:

After the suspected price texts are filtered, an attention value, attention_score, is first calculated for each suspected price text. The calculation rule is expressed in terms of the following quantities:

text refers to any one of the suspected price texts output by the preceding stage; text_height refers to the pixel height of the text region in the original item main image; text_width refers to the pixel width of the text region in the original item main image; and text_length refers to the text length of the recognition result of the text region.

Among the attribute information, the pixel height and pixel width are derived from the position information, which gives the position of the text region in the item main image as the specific coordinates of its four corner points; the minimum circumscribed rectangle of the text region can be constructed from these coordinates, yielding the width and height. The text length is usually computed in code, for example len(text).
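The attribute extraction can be sketched as below. The width and height follow the coordinate-difference reading of the circumscribed rectangle, and the text length uses len(text). The way the three attributes are combined into a score is NOT given in this passage: the area-per-character ratio used here is purely an assumption chosen so that large, short texts (such as a prominent integer part) score highest, and the function names are hypothetical.

```python
def box_size(corners):
    """Width and height of the axis-aligned rectangle enclosing the corner points."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return max(xs) - min(xs), max(ys) - min(ys)


def attention_score(text, corners):
    """ASSUMED scoring rule: region area divided by the number of characters."""
    width, height = box_size(corners)
    length = len(text)
    if length == 0:
        return 0.0  # an empty recognition result cannot be a price fragment
    return width * height / length


# Hypothetical regions: a large integer part and a small fractional part.
integer_part = ("36", [(10, 90), (60, 90), (60, 130), (10, 130)])
fraction_part = ("9", [(62, 100), (75, 100), (75, 115), (62, 115)])

print(box_size(integer_part[1]))
print(attention_score(*integer_part) > attention_score(*fraction_part))
```

Any other monotone combination of the same three attributes could be substituted in attention_score without changing the rest of the flow.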

For steps S204 and S205, the suspected price texts are sorted in descending order of the attention value attention_score to obtain a "potential decimal queue" (potential_dot_num_parts), in which the suspected price text at the head of the queue is the one most likely to belong to a decimal price in a special format.

1. Pop the third suspected price text, located at the head of the queue and referred to below as A.

2. From the suspected price texts remaining in the potential decimal queue, select the elements that intersect or are adjacent to A on its right side to obtain a temporary right queue. The specific criterion is as follows: according to the position information in the recognition result, if the upper-left or lower-left corner of element B lies in the vicinity (for example, within 20 pixels) of the upper-right or lower-right corner of element A, element B is considered a right-side intersecting or adjacent element of element A.

3. Perform the merging rule judgment between A and each suspected price text Bi (1 ≤ i ≤ total number of elements in the temporary right queue), according to the following rules:

1) the tail of the recognition result of A is a digit or a decimal point, which can be judged according to its Unicode code point;

2) the head of the recognition result of Bi may only be a digit or a decimal point;

3) the confidences of suspected price text A and suspected price text Bi are both greater than or equal to 0.5.

4. Merge the two suspected price texts that satisfy the merging rules (namely the third suspected price text and the fourth suspected price text) to obtain A + Bi. For example, if A is 36 and B is 9, A + B gives 36.9; if A is "36." and B is ".9", A + B is still 36.9, because only the numeric parts of A and B are taken before merging (this is done in code, not manually), and the decimal point is added at merge time once the three rules above are satisfied.
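The three merging rules and the merge itself can be sketched as follows. The function names and the use of a regular expression to strip non-digit characters are assumptions; the 0.5 threshold and the digit-or-decimal-point checks come from the rules above.

```python
import re

MIN_CONFIDENCE = 0.5  # threshold from rule 3)


def is_digit_or_dot(ch):
    """Code-point comparison: an ASCII digit or a decimal point."""
    return ch == "." or "0" <= ch <= "9"


def can_merge(a_text, a_conf, b_text, b_conf):
    """Apply rules 1) to 3) to the head-of-queue text A and a right neighbor B."""
    if not a_text or not b_text:
        return False
    return (
        is_digit_or_dot(a_text[-1])      # rule 1): tail of A
        and is_digit_or_dot(b_text[0])   # rule 2): head of B
        and a_conf >= MIN_CONFIDENCE     # rule 3): both confidences
        and b_conf >= MIN_CONFIDENCE
    )


def merge_texts(a_text, b_text):
    """Keep only the digits of A and B and join them with a single decimal point."""
    a_digits = re.sub(r"\D", "", a_text)
    b_digits = re.sub(r"\D", "", b_text)
    return f"{a_digits}.{b_digits}"


print(can_merge("¥36", 0.9, "9", 0.8), merge_texts("¥36", "9"))
```

Because merge_texts discards everything except digits before inserting the decimal point, fragments whose tail or head is already a decimal point produce the same merged text as fragments without one.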

Further, after the third suspected price text and the fourth suspected price text are merged, the recognition result of the third suspected price text is updated, and the confidence of the merged price text is obtained by averaging its confidence with that of the fourth suspected price text.

It should be noted that the averaged confidence is only output for use by the business side. Because the merged price text is the product of a post-processing merge operation rather than a direct recognition result, its confidence is deliberately adjusted downward.

For step S206, after the third suspected price text and the fourth suspected price text are merged, both texts are removed from the potential decimal queue. The process is then repeated: the suspected price text at the head of the queue is popped each time, and the position filtering, merging rule judgment, and text merging operations are repeated until only one text remains in the queue, at which point the loop ends and the last suspected price text is output.
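The overall loop of steps S204 to S206 can be sketched as below. The neighbor test, rule check, and text merge are passed in as callables so the sketch stays independent of how they are implemented; all names are hypothetical. The text does not say what happens to a head-of-queue text that finds no mergeable neighbor, so setting it aside in a leftover list is an assumption.

```python
def merge_queue(candidates, is_right_neighbor, can_merge, merge_texts):
    """Repeatedly pop the highest-scoring candidate and merge it with a right neighbor.

    candidates: dicts with at least 'text', 'conf' and 'score' keys (assumed layout).
    Returns (merged_prices, leftover_candidates).
    """
    # Potential decimal queue: descending attention value.
    queue = sorted(candidates, key=lambda c: c["score"], reverse=True)
    merged, unmerged = [], []
    while len(queue) > 1:
        head = queue.pop(0)
        # Temporary right queue: remaining candidates adjacent to the head on its right.
        right_queue = [c for c in queue if is_right_neighbor(head, c)]
        partner = next((c for c in right_queue if can_merge(head, c)), None)
        if partner is None:
            unmerged.append(head)  # assumption: keep it as a standalone candidate
            continue
        # Remove the partner from the queue by identity, not by equality.
        queue = [c for c in queue if c is not partner]
        merged.append({
            "text": merge_texts(head["text"], partner["text"]),
            # Confidence of the merged text: average of the two confidences.
            "conf": (head["conf"] + partner["conf"]) / 2,
        })
    return merged, unmerged + queue


# Hypothetical candidates; 'x' stands in for real position information.
candidates = [
    {"text": "36", "conf": 0.75, "score": 1000.0, "x": 0},
    {"text": "9", "conf": 0.5, "score": 195.0, "x": 1},
    {"text": "128", "conf": 0.8, "score": 50.0, "x": 50},
]
merged, leftover = merge_queue(
    candidates,
    is_right_neighbor=lambda a, b: b["x"] - a["x"] == 1,
    can_merge=lambda a, b: a["conf"] >= 0.5 and b["conf"] >= 0.5,
    merge_texts=lambda a, b: f"{a}.{b}",
)
print(merged, [c["text"] for c in leftover])
```

The loop stops as soon as at most one candidate is left in the queue, which matches the stated termination condition.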

The method provided by this embodiment introduces the concept of an attention value for decimal prices in special formats and defines a new set of merging rules, so that suspected decimal prices are captured more accurately and the price texts are merged.

Compared with the prior art, the method provided by the embodiment of the invention has at least the following beneficial effects:

1. Before the suspected price texts are processed and merged, filtering is performed based on regular matching, non-price fields, unit-of-measurement names, and associated context information, removing interference items and reducing the difficulty of the subsequent rule-based extraction;

2. The position information of the different suspected price texts is taken into account to determine which suspected price texts are more likely to be decimals, and the merging rule judgment is then applied so that the price texts are merged accurately;

3. When the picture contains no price text or suspected price text, the price of items of the same kind is retrieved from a database based on information in the text such as the item name, model, and capacity, which broadens the applicability of the scheme.

Referring to fig. 3, a schematic diagram of main modules of a price identifying apparatus 300 according to an embodiment of the present invention is shown, including:

the identification module 301 is configured to perform text detection and identification on the target picture to obtain an identification result; the identification result comprises a text and position information, and the position information represents coordinate values of corner points of the text in the target picture;

the filtering module 302 is configured to sort the identified texts according to the position information, process the sorted texts in a regular matching-based manner, and filter out suspected price texts;

the merging module 303 is configured to determine a second suspected price text which is intersected with or adjacent to the first suspected price text and meets a predetermined merging rule together, and merge the first suspected price text and the second suspected price text to obtain a price text of the target picture.

In the implementation apparatus of the present invention, the filtering module 302 is further configured to:

for the fields in a single suspected price text, if any of the fields is found in a non-price field library, determining that the single suspected price text is not a price text and filtering it out; and/or

if the tail field of the single suspected price text includes a unit-of-measurement field, determining that the single suspected price text is not a price text and filtering it out.

In the implementation apparatus of the present invention, the filtering module 302 is further configured to: determine the context information adjacent to a single suspected price text, and if the preceding and/or following context information contains a non-price field, determine that the single suspected price text is not a price text and filter it out.

In the device for implementing the present invention, the merging module 303 is configured to:

processing the attribute information of each suspected price text in the target picture according to the attention value calculation to obtain an attention value for each suspected price text;

determining a third suspected price text with the largest attention value, and selecting, from the remaining suspected price texts, those that intersect or are adjacent to the third suspected price text on its right side, so as to determine a fourth suspected price text that satisfies the predetermined merging rule;

and removing the third suspected price text and the fourth suspected price text, determining a fifth suspected price text with the largest attention value from the remaining suspected price texts, and repeating the position filtering and merging rule judgment operations until only one suspected price text remains.

In the implementation device of the invention, the attribute information comprises pixel height, pixel width and text length;

the merging module 303 is configured to: regarding the position information of a single suspected price text, taking the difference value of the maximum coordinate value and the minimum coordinate value in the direction of the horizontal axis as the pixel width, and taking the difference value of the maximum coordinate value and the minimum coordinate value in the direction of the longitudinal axis as the pixel height; and

and calculating the text length of the single suspected price text by using a text length calculation mode.

In the device for implementing the present invention, the merging module 303 is configured to: calculate the distance between the coordinate value of the upper-right or lower-right corner point of the third suspected price text and the coordinate value of the upper-left or lower-left corner point of each remaining suspected price text, and determine the suspected price texts whose distance is within a predetermined range.
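A minimal sketch of this corner-distance check is shown below. The corner ordering (upper-left, upper-right, lower-right, lower-left), the use of Euclidean distance, and the function name are assumptions; the 20-pixel default follows the example value given in the method description.

```python
import math

MAX_DISTANCE = 20  # pixels; example value for the predetermined range


def is_right_neighbor(a_corners, b_corners, max_distance=MAX_DISTANCE):
    """True if a left corner of B lies within max_distance of a right corner of A.

    Corners are assumed to be ordered upper-left, upper-right, lower-right, lower-left.
    """
    a_right = (a_corners[1], a_corners[2])  # upper-right and lower-right of A
    b_left = (b_corners[0], b_corners[3])   # upper-left and lower-left of B
    return any(math.dist(p, q) <= max_distance for p in a_right for q in b_left)


# Hypothetical regions: B sits just to the right of A, C is far away.
a = [(10, 90), (60, 90), (60, 130), (10, 130)]
b = [(62, 100), (75, 100), (75, 115), (62, 115)]
c = [(200, 90), (240, 90), (240, 130), (200, 130)]

print(is_right_neighbor(a, b), is_right_neighbor(a, c))
```

The check is directional: swapping the arguments tests whether A is a right neighbor of B, which is false for the regions above.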

In the implementation device of the invention, the recognition result further comprises a confidence coefficient, and the confidence coefficient represents the confidence degree of the recognized text;

the merging module 303 is configured to:

judging whether the tail part of the first suspected price text is a number or a decimal point;

judging whether the head of the second suspected price text is a number or a decimal point;

and judging whether the confidence degrees of the first suspected price text and the second suspected price text are both greater than or equal to a preset value.

In the device for implementing the present invention, the merging module 303 is configured to: take the average of the confidences of the first suspected price text and the second suspected price text to obtain the confidence of the price text.

In the device for implementing the present invention, the merging module 303 is configured to: combine the numeric parts of the first suspected price text and the second suspected price text, and add a decimal point between them.

The device further includes a price searching module 304 (not shown) for: when the target picture does not contain a suspected price text or a price text, carrying out item price search based on the identified text to obtain a price search result; wherein the text includes at least one of an item brand, an item class, a name, and a model number.

In addition, since the specific implementation of the apparatus in the embodiment of the present invention has been described in detail in the method above, it is not repeated here.

FIG. 4 illustrates an exemplary system architecture 400 to which embodiments of the invention may be applied.

As shown in fig. 4, the system architecture 400 may include terminal devices 401, 402, 403, a network 404, and a server 405 (by way of example only). The network 404 serves as a medium for providing communication links between the terminal devices 401, 402, 403 and the server 405. The network 404 may include various types of connections, such as wired links, wireless communication links, or fiber optic cables.

A user may use terminal devices 401, 402, 403 to interact with a server 405 over a network 404 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 401, 402, 403.

The terminal devices 401, 402, and 403 may be various electronic devices having display screens and supporting web browsing, and the server 405 may be a server that provides various services.

It should be noted that the method provided by the embodiment of the present invention is generally executed by the server 405, and accordingly, the apparatus is generally disposed in the server 405.

It should be understood that the number of terminal devices, networks, and servers in fig. 4 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.

As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU)501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.

The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as necessary, so that a computer program read from it can be installed into the storage section 508 as necessary.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 501.

It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an identification module, a filtering module, and a merging module. Where the names of these modules do not in some cases constitute a limitation on the modules themselves, for example, a merge module may also be described as a "text merge module".

As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform:

performing text detection and recognition on the target picture to obtain a recognition result, where the recognition result comprises text and position information, and the position information represents the coordinate values of the corner points of the text in the target picture;

sorting the recognized texts according to the position information, and processing the sorted texts by regular matching to obtain the suspected price texts;

and determining a second suspected price text which intersects or is adjacent to the first suspected price text and which, together with it, satisfies a predetermined merging rule, and merging the first suspected price text and the second suspected price text to obtain the price text of the target picture.

According to the technical scheme of the embodiment of the invention, compared with the prior art, the method has the following beneficial effects:

1. Before the suspected price texts are processed and merged, filtering is performed based on regular matching, non-price fields, unit-of-measurement names, and associated context information, removing interference items and reducing the difficulty of the subsequent rule-based extraction;

2. The position information of the different suspected price texts is taken into account to determine which suspected price texts are more likely to be decimals, and the merging rule judgment is then applied so that the price texts are merged accurately;

3. When the picture contains no price text or suspected price text, the price of items of the same kind is retrieved from a database based on information in the text such as the item name, model, and capacity, which broadens the applicability of the scheme.

The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
