Picture information extraction method and device, computer equipment and storage medium

Document No.: 1875550  Publication date: 2021-11-23

Reading note: This technology, "图片信息抽取方法、装置、计算机设备及存储介质" (Picture information extraction method and device, computer equipment and storage medium), was created by 欧光礼 on 2021-08-31. Abstract: The embodiments of this application belong to the field of artificial intelligence and are applied in the medical field. The application relates to a picture information extraction method comprising: acquiring a picture to be processed, and performing text recognition on the picture to be processed to obtain a text to be processed; performing feature extraction on the text to be processed according to a preset bag-of-words model to obtain feature data, and classifying the picture to be processed based on the feature data; when the picture type is a non-standard file type, inputting the feature data into a preset extractor to obtain a first extracted text, and inputting the feature data into a target extraction model to obtain a second extracted text; and performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, and inputting the target extracted text into a target knowledge base to generate structured data. The application further provides a picture information extraction device, a computer device, and a storage medium. In addition, the application relates to blockchain technology: the structured data may be stored in a blockchain. The application achieves efficient extraction of picture information.

1. A picture information extraction method, characterized by comprising the following steps:

acquiring a picture to be processed, and performing text recognition on the picture to be processed to obtain a text to be processed;

acquiring a preset bag-of-words model, performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classifying the picture to be processed based on the feature data, and determining the picture type of the picture to be processed;

when the picture type is a non-standard file type, inputting the feature data to a preset extractor, extracting to obtain a first extracted text, inputting the feature data to a target extraction model, and extracting to obtain a second extracted text;

and performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, inputting the target extracted text to a target knowledge base, and generating structured data corresponding to the picture to be processed.

2. The method for extracting picture information according to claim 1, wherein the step of performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data comprises:

performing lemmatization, part-of-speech tagging and number normalization on the text to be processed to obtain a preprocessed text of the text to be processed;

and inputting the preprocessed text to the preset bag-of-words model, and calculating to obtain the feature data.

3. The method according to claim 1, wherein the step of classifying the picture to be processed based on the feature data and determining the picture type of the picture to be processed comprises:

acquiring normalized coordinate information of the text to be processed;

and classifying the picture to be processed according to the normalized coordinate information and the feature data to obtain the picture type of the picture to be processed.

4. The method according to claim 1, wherein the step of inputting the feature data to a preset extractor to obtain a first extracted text comprises:

matching the feature data with word vectors corresponding to preset standard keywords according to the preset extractor to obtain hit keywords matched with the standard keywords in the text to be processed;

and performing neighborhood search on the hit keywords to obtain the first extracted text.

5. The method for extracting picture information according to claim 1, wherein the step of inputting the feature data to a target extraction model and extracting to obtain a second extracted text comprises:

wherein the target extraction model comprises a bidirectional long short-term memory network and a conditional random field model; inputting the feature data into the bidirectional long short-term memory network, and calculating to obtain a feature vector of the text to be processed;

and calculating the feature vector according to the conditional random field model to obtain entity information corresponding to the feature vector, and determining the entity information as the second extracted text.

6. The method for extracting picture information according to claim 1, wherein the step of text-screening the first extracted text and the second extracted text comprises:

obtaining confidence values corresponding to the first extracted text and the second extracted text respectively;

and ranking the first extracted text and the second extracted text by confidence, and selecting a preset number of texts in descending order of confidence as the target extracted text.

7. The method for extracting picture information according to claim 1, further comprising, after the step of determining the picture type of the picture to be processed:

when the picture type is a standard file type, acquiring a preset extraction template;

and performing text extraction on the text to be processed according to the preset extraction template to obtain a target extraction text corresponding to the text to be processed.

8. A picture information extraction device, characterized by comprising:

the identification module is used for acquiring a picture to be processed and performing text recognition on the picture to be processed to obtain a text to be processed;

the acquisition module is used for acquiring a preset bag-of-words model, performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classifying the picture to be processed based on the feature data, and determining the picture type of the picture to be processed;

the extraction module is used for inputting the feature data to a preset extractor when the picture type is a non-standard file type, extracting to obtain a first extracted text, inputting the feature data to a target extraction model, and extracting to obtain a second extracted text;

and the generating module is used for performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, inputting the target extracted text to a target knowledge base, and generating structured data corresponding to the picture to be processed.

9. A computer device comprising a memory and a processor, wherein the memory stores computer readable instructions, and the processor implements the steps of the picture information extraction method according to any one of claims 1 to 7 when executing the computer readable instructions.

10. A computer-readable storage medium, having computer-readable instructions stored thereon, which, when executed by a processor, implement the steps of the picture information extraction method according to any one of claims 1 to 7.

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for extracting picture information, a computer device, and a storage medium.

Background

In the process of evaluating a patient's disease risk, the patient is often required to submit relevant medical record data, and the disease risk of the patient is judged according to that data. The medical record data comprises patient identity document data, medical diagnosis and treatment data, and physical examination data; the medical diagnosis and treatment data comprises different data pictures such as outpatient medical records, inpatient medical records, and pathological reports.

The traditional risk decision-making method relies on manual review, and the whole process is time-consuming and labor-intensive; moreover, because patient information varies enormously in type and format, the information provided by a patient cannot be extracted accurately, which ultimately causes low accuracy of patient information extraction.

Disclosure of Invention

An embodiment of the present application provides a method and an apparatus for extracting picture information, a computer device, and a storage medium, so as to solve the technical problem of inaccurate extraction of picture information.

In order to solve the above technical problem, an embodiment of the present application provides a method for extracting picture information, which adopts the following technical solutions:

acquiring a picture to be processed, and performing text recognition on the picture to be processed to obtain a text to be processed;

acquiring a preset bag-of-words model, performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classifying the picture to be processed based on the feature data, and determining the picture type of the picture to be processed;

when the picture type is a non-standard file type, inputting the feature data to a preset extractor, extracting to obtain a first extracted text, inputting the feature data to a target extraction model, and extracting to obtain a second extracted text;

and performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, inputting the target extracted text to a target knowledge base, and generating structured data corresponding to the picture to be processed.

Further, the step of extracting the features of the text to be processed according to the preset bag-of-words model to obtain feature data includes:

performing lemmatization, part-of-speech tagging and number normalization on the text to be processed to obtain a preprocessed text of the text to be processed;

and inputting the preprocessed text to the preset bag-of-words model, and calculating to obtain the feature data.

Further, the step of classifying the picture to be processed based on the feature data and determining the picture type of the picture to be processed includes:

acquiring normalized coordinate information of the text to be processed;

and classifying the picture to be processed according to the normalized coordinate information and the feature data to obtain the picture type of the picture to be processed.

Further, the step of inputting the feature data to a preset extractor and extracting to obtain a first extracted text includes:

matching the feature data with word vectors corresponding to preset standard keywords according to the preset extractor to obtain hit keywords matched with the standard keywords in the text to be processed;

and performing neighborhood search on the hit keywords to obtain the first extracted text.

Further, the step of inputting the feature data to a target extraction model and extracting to obtain a second extracted text includes:

wherein the target extraction model comprises a bidirectional long short-term memory network and a conditional random field model; inputting the feature data into the bidirectional long short-term memory network, and calculating to obtain a feature vector of the text to be processed;

and calculating the feature vector according to the conditional random field model to obtain entity information corresponding to the feature vector, and determining the entity information as the second extracted text.

Further, the step of text screening the first extracted text and the second extracted text includes:

obtaining confidence values corresponding to the first extracted text and the second extracted text respectively;

and ranking the first extracted text and the second extracted text by confidence, and selecting a preset number of texts in descending order of confidence as the target extracted text.

Further, after the step of determining the picture type of the picture to be processed, the method further includes:

when the picture type is a standard file type, acquiring a preset extraction template;

and performing text extraction on the text to be processed according to the preset extraction template to obtain a target extraction text corresponding to the text to be processed.

In order to solve the above technical problem, an embodiment of the present application further provides a picture information extraction device, which adopts the following technical solutions:

the identification module is used for acquiring a picture to be processed and performing text recognition on the picture to be processed to obtain a text to be processed;

the acquisition module is used for acquiring a preset bag-of-words model, performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classifying the picture to be processed based on the feature data, and determining the picture type of the picture to be processed;

the extraction module is used for inputting the feature data to a preset extractor when the picture type is a non-standard file type, extracting to obtain a first extracted text, inputting the feature data to a target extraction model, and extracting to obtain a second extracted text;

and the generating module is used for performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, inputting the target extracted text to a target knowledge base, and generating structured data corresponding to the picture to be processed.

In order to solve the above technical problem, an embodiment of the present application further provides a computer device, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, implement the following steps:

acquiring a picture to be processed, and performing text recognition on the picture to be processed to obtain a text to be processed;

acquiring a preset bag-of-words model, performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classifying the picture to be processed based on the feature data, and determining the picture type of the picture to be processed;

when the picture type is a non-standard file type, inputting the feature data to a preset extractor, extracting to obtain a first extracted text, inputting the feature data to a target extraction model, and extracting to obtain a second extracted text;

and performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, inputting the target extracted text to a target knowledge base, and generating structured data corresponding to the picture to be processed.

In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium having computer-readable instructions stored thereon which, when executed by a processor, implement the following steps:

acquiring a picture to be processed, and performing text recognition on the picture to be processed to obtain a text to be processed;

acquiring a preset bag-of-words model, performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classifying the picture to be processed based on the feature data, and determining the picture type of the picture to be processed;

when the picture type is a non-standard file type, inputting the feature data to a preset extractor, extracting to obtain a first extracted text, inputting the feature data to a target extraction model, and extracting to obtain a second extracted text;

and performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, inputting the target extracted text to a target knowledge base, and generating structured data corresponding to the picture to be processed.

In this application, a picture to be processed is acquired, and text recognition is performed on the picture to be processed to obtain a text to be processed. A preset bag-of-words model is then acquired, feature extraction is performed on the text to be processed according to the preset bag-of-words model to obtain feature data, the picture to be processed is classified based on the feature data, and the picture type of the picture to be processed is determined; because the text to be processed is then handled differently according to its picture type, picture information extraction efficiency is improved. When the picture type is a non-standard file type, the feature data is input to a preset extractor to extract a first extracted text, and the feature data is input to a target extraction model to extract a second extracted text, so that the target extracted text can be obtained accurately from the first extracted text and the second extracted text. Finally, text screening is performed on the first extracted text and the second extracted text to obtain the target extracted text, which is input to a target knowledge base to generate the structured data corresponding to the picture to be processed.

Drawings

In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a flow diagram of one embodiment of a method for picture information extraction according to the present application;

FIG. 3 is a schematic structural diagram of an embodiment of a picture information extraction apparatus according to the present application;

FIG. 4 is a schematic block diagram of one embodiment of a computer device according to the present application.

Reference numerals: the picture information extraction device 300, the identification module 301, the acquisition module 302, the extraction module 303 and the generation module 304.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The patient may use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.

It should be noted that the picture information extraction method provided in the embodiment of the present application is generally executed by a server/terminal device, and accordingly, the picture information extraction apparatus is generally disposed in the server/terminal device.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continuing reference to FIG. 2, a flow diagram of one embodiment of a method of picture information extraction in accordance with the present application is shown. The picture information extraction method comprises the following steps:

step S201, acquiring a picture to be processed, and performing text recognition on the picture to be processed to obtain a text to be processed;

in this embodiment, the picture to be processed is a picture with patient information that needs to be subjected to information extraction. For example, in the medical field, the picture to be processed may be various pictures containing patient diagnosis information or identity information. The method comprises the steps of obtaining a picture to be processed, and performing text Recognition on the picture to be processed to obtain a text to be processed, wherein the text to be processed can be obtained by recognizing the picture to be processed through an OCR (Optical Character Recognition) technology.

Step S202, acquiring a preset bag-of-words model, performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classifying the picture to be processed based on the feature data, and determining the picture type of the picture to be processed;

in this embodiment, a Bag of words model (Bag of words model) is a document representation manner, and in the information retrieval, the Bag of words model assumes that, for a document, the document is regarded as a set of a plurality of words by ignoring elements such as word order, grammar and syntax of the document, each word in the document is independent of other words, and for any word in the document, the word is not independently selected by the influence of the document semantics. Therefore, a preset bag-of-words model is obtained, the text to be processed is used as a text set according to the preset bag-of-words model, the occurrence frequency of each word is calculated from the text set, and the occurrence frequency of each word is used as a word vector corresponding to each word in the text to be processed. Specifically, when a text to be processed is obtained, preprocessing the text to be processed according to sentences to obtain a preprocessed text to be processed; inputting the preprocessed text to be processed into a preset word bag model, and performing vector representation on words in the preprocessed text to be processed through the preset word bag model to obtain a word vector corresponding to each word, wherein the word vector is the feature data. And then, classifying the picture to be processed according to the characteristic data, and determining the picture type of the picture to be processed. The image types comprise a non-standard file type and a standard file type, the standard file type is an image type with a fixed standard such as an identity card, and the non-standard file type is all image types except the card, such as a patient diagnosis information image. When the feature data is obtained, matching the feature data with element words in an element library, namely calculating the similarity of the feature data and vectors corresponding to the element words in the element library to obtain an element matching degree; determining the element words with the element matching degree larger than or equal to a preset threshold value as the element words matched with the feature data, acquiring the image types of the element words, and determining the image types of the current to-be-processed images according to the image types of the element words.

Step S203, when the picture type is a non-standard file type, inputting the feature data to a preset extractor, extracting to obtain a first extracted text, inputting the feature data to a target extraction model, and extracting to obtain a second extracted text;

in this embodiment, when the picture type is a non-standard file type, the text to be processed is input to a preset extractor, the preset extractor is a preset text extractor, multiple groups of extraction instructions are set in the preset extractor, and text extraction is performed on the text to be processed according to extraction instructions corresponding to different extraction rules set in the preset extractor, so as to obtain a first extracted text. Meanwhile, the text to be processed is input into a target extraction model, which is a preset neural network text extraction model, such as a text extraction model composed of a Bi-directional Long Short-Term Memory network (BILSTM) and a Conditional Random field model (CRF). And extracting the text to be processed according to the target extraction model to obtain a second extracted text.

Step S204, performing text screening on the first extracted text and the second extracted text to obtain a target extracted text, inputting the target extracted text to a target knowledge base, and generating the structured data corresponding to the picture to be processed.

In this embodiment, when the first extracted text and the second extracted text are obtained, text screening is performed on them to obtain the target extracted text. Specifically, the occurrence frequencies of all texts in the first extracted text and the second extracted text are sorted from high to low, and a preset number of texts is selected, from high frequency to low, as the target extracted text. The target extracted text is then input to a target knowledge base in which a knowledge graph is stored; the knowledge graph comprises all data related to the target extracted text. Through the knowledge graph, all data related to the current target extracted text can be found and combined with the target extracted text to form the structured data corresponding to the picture to be processed.
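The final assembly step might look like the following sketch; the knowledge_graph object and its related() method are hypothetical placeholders for whatever knowledge-base access the implementation provides.

```python
# Turning the target extracted texts into structured data via the knowledge
# graph; knowledge_graph.related() is a hypothetical accessor.
def build_structured_data(target_texts: list[str], knowledge_graph) -> dict:
    structured = {}
    for text in target_texts:
        # Gather every datum the knowledge graph relates to this text.
        structured[text] = knowledge_graph.related(text)
    return structured
```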

It is emphasized that to further ensure the privacy and security of the structured data, the structured data may also be stored in a node of a blockchain.

The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated with one another by cryptographic methods, each data block containing the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.

The method and the device of the present application thus realize efficient extraction of picture information, improve the extraction efficiency and accuracy of picture information, further improve the efficiency of intelligent decision-making based on the extracted text information, and shorten the decision cycle.

In some optional implementation manners of this embodiment, the step of performing feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data includes:

performing lemmatization, part-of-speech tagging and number normalization on the text to be processed to obtain a preprocessed text of the text to be processed;

and inputting the preprocessed text to the preset bag-of-words model, and calculating to obtain the feature data.

In this embodiment, when the text to be processed is obtained, lemmatization, part-of-speech tagging and number normalization are performed on it to obtain the preprocessed text. Specifically, lemmatization is the process of restoring each English word, after tokenization, to its dictionary base form according to its part of speech; part-of-speech tagging is the process of tagging each word according to a preset dictionary to obtain its part of speech (such as noun or verb); and number normalization is the process of mapping the numbers in the text to be processed into a normalized format so that they conform to a preset format. The text to be processed after lemmatization, part-of-speech tagging and number normalization is the preprocessed text. When the preprocessed text is obtained, it is input into the preset bag-of-words model, and the feature data of the text to be processed is obtained by calculation.
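For English text, the three preprocessing operations might be sketched with NLTK as follows; the tagger/lemmatizer choice and the number format are assumptions for illustration (the punkt, averaged_perceptron_tagger and wordnet data packages must be downloaded first).

```python
# Lemmatization, part-of-speech tagging and number normalization; NLTK and
# the comma-stripping number format are illustrative assumptions.
import re
import nltk
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

def preprocess(text: str) -> list[tuple[str, str]]:
    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)                        # part-of-speech tagging
    processed = []
    for word, tag in tagged:
        pos = {"V": "v", "J": "a", "R": "r"}.get(tag[0], "n")
        lemma = lemmatizer.lemmatize(word.lower(), pos)  # lemmatization
        lemma = re.sub(r"\d[\d,]*(?:\.\d+)?",            # number normalization
                       lambda m: m.group().replace(",", ""), lemma)
        processed.append((lemma, tag))
    return processed
```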

In this embodiment, the preprocessed text is obtained by performing lemmatization, part-of-speech tagging and number normalization on the text to be processed, and the feature data is then obtained by performing feature extraction on the preprocessed text, so that the text can be represented accurately by the feature data, which further improves the accuracy of classifying the picture to be processed.

In some optional implementations of this embodiment, the step of classifying the picture to be processed based on the feature data and determining the picture type of the picture to be processed includes:

acquiring normalized coordinate information of the text to be processed;

and classifying the picture to be processed according to the normalized coordinate information and the feature data to obtain the picture type of the picture to be processed.

In this embodiment, in order to classify the picture to be processed more accurately, when the feature data is obtained, the normalized coordinate information of the text to be processed is also obtained, and the picture to be processed is classified according to the feature data and the normalized coordinate information together. Specifically, the normalized coordinate information is the coordinate information obtained by normalizing the characters in the picture to be processed: the pixel information of each character in the picture is acquired and mapped onto a uniform coordinate dimension according to a preset mapping standard, yielding the normalized coordinate information of the text to be processed. The normalized coordinate information is compared with standard coordinate information; when they match, the feature data corresponding to the normalized coordinate information is acquired. The matching degree between that feature data and the standard word vector corresponding to the standard coordinate information is then calculated, and if the match succeeds, the picture type to which the standard word vector belongs is determined to be the picture type of the picture to be processed.
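A sketch of the coordinate normalization, assuming character boxes in pixels; the unit-square [0, 1] mapping standard is an assumption.

```python
# Map each character box (x1, y1, x2, y2) in pixels onto the unit square,
# giving coordinates that are comparable across pictures of any size.
def normalize_coordinates(boxes: list[tuple[int, int, int, int]],
                          width: int, height: int) -> list[tuple[float, ...]]:
    return [(x1 / width, y1 / height, x2 / width, y2 / height)
            for x1, y1, x2, y2 in boxes]
```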

According to the embodiment, the normalized coordinate information of the text to be processed is obtained, and the pictures to be processed are classified according to the normalized coordinate information and the characteristic data, so that the accuracy of classifying the pictures to be processed is improved.

In some optional implementation manners of this embodiment, the step of inputting the feature data to a preset extractor, and extracting to obtain a first extracted text includes:

matching the feature data with word vectors corresponding to preset standard keywords according to the preset extractor to obtain hit keywords matched with the standard keywords in the text to be processed;

and performing neighborhood search on the hit keywords to obtain the first extracted text.

In this embodiment, when the text to be processed is extracted according to the preset extractor, preset standard keywords are obtained, and the cosine similarity between the feature data of the text to be processed and the word vectors of the standard keywords is calculated, so as to obtain the hit keywords in the text to be processed that match the standard keywords. When the hit keywords are obtained, a neighborhood search is performed on them to obtain the first extracted text. Specifically, for each hit keyword, the corresponding neighborhood mapping set is obtained, and the optimal solution of the hit keyword over that neighborhood mapping set is calculated according to a preset neighborhood search algorithm, thereby obtaining the first extracted text. Neighborhood search algorithms include local search, variable neighborhood search, and the like, and different search algorithms correspond to different neighborhood functions; the optimal solution of the hit keyword over its neighborhood mapping set is calculated according to the neighborhood function, yielding the first extracted text.
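The keyword path might be sketched as below, with the neighborhood search simplified to a fixed token window around each hit; the 0.9 threshold and 5-token window are illustrative assumptions.

```python
# Cosine-similarity keyword matching followed by a simple window
# ("neighborhood") search; threshold and window size are assumptions.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def first_extraction(tokens: list[str], vectors: dict[str, np.ndarray],
                     standard_keywords: dict[str, np.ndarray],
                     threshold: float = 0.9, window: int = 5) -> list[str]:
    snippets = []
    for i, token in enumerate(tokens):
        vec = vectors.get(token)
        if vec is None:
            continue
        if any(cosine(vec, kw) >= threshold for kw in standard_keywords.values()):
            # Neighborhood search: keep the tokens surrounding the hit keyword.
            snippets.append(" ".join(tokens[max(0, i - window): i + window + 1]))
    return snippets
```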

In this application, the neighborhood search on the hit keywords yields the first extracted text, so that the first extracted text is extracted precisely, improving both the accuracy and the efficiency of its extraction.

In some optional implementation manners of this embodiment, the step of inputting the feature data to the target extraction model and extracting to obtain the second extracted text includes:

wherein the target extraction model comprises a bidirectional long short-term memory network and a conditional random field model; inputting the feature data into the bidirectional long short-term memory network, and calculating to obtain a feature vector of the text to be processed;

and calculating the feature vector according to the conditional random field model to obtain entity information corresponding to the feature vector, and determining the entity information as the second extracted text.

In this embodiment, the target extraction model comprises a bidirectional long short-term memory network and a conditional random field model. The bidirectional long short-term memory network extracts semantic features from the input feature data, so that the feature vectors obtained can better represent the semantic information of the sentence context. When the feature data corresponding to the text to be processed is obtained, it is input into the bidirectional long short-term memory network of the target extraction model, and the feature vector is obtained by calculation; the feature vector is then input into the conditional random field model of the target extraction model, which outputs the optimal entity information corresponding to the feature vector, yielding the second extracted text corresponding to the text to be processed.
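A compact sketch of such a model, assuming PyTorch together with the pytorch-crf package; the layer sizes and tag set are illustrative assumptions.

```python
# BiLSTM computes context-aware feature vectors; the CRF decodes the optimal
# entity tag sequence. pytorch-crf's CRF layer is assumed to be installed.
import torch
import torch.nn as nn
from torchcrf import CRF

class BiLstmCrf(nn.Module):
    def __init__(self, feature_dim: int, hidden_dim: int, num_tags: int):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim // 2,
                            batch_first=True, bidirectional=True)
        self.hidden2tag = nn.Linear(hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, feature_data: torch.Tensor) -> list[list[int]]:
        lstm_out, _ = self.lstm(feature_data)    # feature vectors
        emissions = self.hidden2tag(lstm_out)
        return self.crf.decode(emissions)        # optimal entity tags
```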

In this embodiment, the second extracted text is obtained efficiently through the target extraction model, so that the picture information can subsequently be extracted and determined accurately from the second extracted text.

In some optional implementation manners of this embodiment, the text screening for the first extracted text and the second extracted text includes:

obtaining confidence values corresponding to the first extracted text and the second extracted text respectively;

and ranking the first extracted text and the second extracted text by confidence, and selecting a preset number of texts in descending order of confidence as the target extracted text.

In this embodiment, the confidence corresponding to the first extracted text is a first confidence, namely the similarity between the first extracted text and the standard keywords as calculated by the preset extractor; the confidence corresponding to the second extracted text is a second confidence, namely the predicted value of the second extracted text as calculated by the target extraction model. Specifically, when the first extracted text and the second extracted text are obtained, the similarity between the first extracted text and the standard keywords is taken as the first confidence; meanwhile, the predicted value corresponding to the second extracted text is obtained, where the predicted value is the probability information of the entity corresponding to the second extracted text when the target extraction model outputs that entity. The first extracted text and the second extracted text are then ranked together according to the first confidence and the second confidence, and a preset number of texts is selected, from high confidence to low, as the target extracted text.
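A sketch of the unified ranking; the preset number of 5 is an illustrative assumption.

```python
# Merge both candidate lists and keep the top-k by confidence.
import heapq

def screen_texts(first_extracted: list[tuple[str, float]],
                 second_extracted: list[tuple[str, float]],
                 preset_number: int = 5) -> list[str]:
    # First-path confidences are keyword similarities; second-path confidences
    # are the model's entity probabilities. They are ranked together.
    candidates = first_extracted + second_extracted
    top = heapq.nlargest(preset_number, candidates, key=lambda pair: pair[1])
    return [text for text, _ in top]
```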

In this application, determining the target extracted text through confidence values improves the accuracy of extracting the text information in the picture to be processed.

In some optional implementation manners of this embodiment, after the step of determining the picture type of the picture to be processed, the method further includes:

when the picture type is a standard file type, acquiring a preset extraction template;

and performing text extraction on the text to be processed according to the preset extraction template to obtain a target extraction text corresponding to the text to be processed.

In this embodiment, when the picture type of the picture to be processed is a standard file type, a preset extraction template is obtained; the preset extraction templates include a text template corresponding to each standard file type. Text extraction is performed on the text to be processed according to the preset extraction template to obtain the target extracted text corresponding to the text to be processed. Specifically, when the preset extraction template is obtained, its template fields are obtained and matched against the text fields in the text to be processed; for each text field matching a template field, the text information under that field is acquired, yielding the target extracted text.
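Template-based extraction might be sketched as follows; the identity-card fields and regular expressions shown are hypothetical examples, not fields given by the application.

```python
# Match template fields against text fields and collect the text information
# under each matched field; the template content is hypothetical.
import re

ID_CARD_TEMPLATE = {
    "name":   r"姓名[:：]?\s*(\S+)",
    "number": r"公民身份号码[:：]?\s*(\w{18})",
}

def template_extract(text: str, template: dict[str, str]) -> dict[str, str]:
    extracted = {}
    for field, pattern in template.items():
        match = re.search(pattern, text)
        if match:
            extracted[field] = match.group(1)
    return extracted
```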

According to the method and the device, the text information of the to-be-processed picture of the standard file type is extracted through the preset extraction template, and the text information extraction efficiency of the to-be-processed picture of the standard file type is improved.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by instructing relevant hardware through computer-readable instructions, which can be stored in a computer-readable storage medium; when executed, the instructions may include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).

It should be understood that, although the steps in the flowcharts of the figures are shown in an order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a portion of the steps in the flowchart may include multiple sub-steps or multiple stages, which are not necessarily completed at the same moment but may be executed at different moments, and which are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a picture information extraction apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.

As shown in fig. 3, the picture information extraction apparatus 300 according to the present embodiment includes: an identification module 301, an acquisition module 302, an extraction module 303, and a generation module 304. Wherein:

The identification module 301 is configured to acquire a to-be-processed picture and perform text recognition on the to-be-processed picture to obtain a to-be-processed text;

in this embodiment, the picture to be processed is a picture with patient information that needs to be subjected to information extraction. For example, in the medical field, the picture to be processed may be various pictures containing patient diagnosis information or identity information. The method comprises the steps of obtaining a picture to be processed, and performing text Recognition on the picture to be processed to obtain a text to be processed, wherein the text to be processed can be obtained by recognizing the picture to be processed through an OCR (Optical Character Recognition) technology.

The acquisition module 302 is configured to acquire a preset bag-of-words model, perform feature extraction on the text to be processed according to the preset bag-of-words model to obtain feature data, classify the picture to be processed based on the feature data, and determine the picture type of the picture to be processed;

in this embodiment, a Bag of words model (Bag of words model) is a document representation manner, and in the information retrieval, the Bag of words model assumes that, for a document, the document is regarded as a set of a plurality of words by ignoring elements such as word order, grammar and syntax of the document, each word in the document is independent of other words, and for any word in the document, the word is not independently selected by the influence of the document semantics. Therefore, a preset bag-of-words model is obtained, the text to be processed is used as a text set according to the preset bag-of-words model, the occurrence frequency of each word is calculated from the text set, and the occurrence frequency of each word is used as a word vector corresponding to each word in the text to be processed. Specifically, when a text to be processed is obtained, preprocessing the text to be processed according to sentences to obtain a preprocessed text to be processed; inputting the preprocessed text to be processed into a preset word bag model, and performing vector representation on words in the preprocessed text to be processed through the preset word bag model to obtain a word vector corresponding to each word, wherein the word vector is the feature data. And then, classifying the picture to be processed according to the characteristic data, and determining the picture type of the picture to be processed. The image types comprise a non-standard file type and a standard file type, the standard file type is an image type with a fixed standard such as an identity card, and the non-standard file type is all image types except the card, such as a patient diagnosis information image. When the feature data is obtained, matching the feature data with element words in an element library, namely calculating the similarity of the feature data and vectors corresponding to the element words in the element library to obtain an element matching degree; determining the element words with the element matching degree larger than or equal to a preset threshold value as the element words matched with the feature data, acquiring the image types of the element words, and determining the image types of the current to-be-processed images according to the image types of the element words.

In some optional implementations of this embodiment, the acquisition module 302 includes:

the preprocessing unit is used for performing lemmatization, part-of-speech tagging and number normalization on the text to be processed to obtain a preprocessed text of the text to be processed;

and the processing unit is used for inputting the preprocessed text to the preset bag-of-words model and calculating to obtain the feature data.

In this embodiment, the preprocessing unit and the processing unit carry out the lemmatization, part-of-speech tagging, number normalization and bag-of-words calculation described above for the corresponding method steps.

In some optional implementations of this embodiment, the acquisition module 302 further includes:

the first acquisition unit is used for acquiring the normalized coordinate information of the text to be processed;

and the classification unit is used for classifying the picture to be processed according to the normalized coordinate information and the feature data to obtain the picture type of the picture to be processed.

In this embodiment, the first acquisition unit and the classification unit carry out the coordinate normalization and coordinate-based classification described above for the corresponding method steps.

The extraction module 303 is configured to, when the picture type is a non-standard file type, input the feature data to a preset extractor to extract a first extracted text, and input the feature data to a target extraction model to extract a second extracted text;

in this embodiment, when the picture type is a non-standard file type, the text to be processed is input to a preset extractor, the preset extractor is a preset text extractor, multiple groups of extraction instructions are set in the preset extractor, and text extraction is performed on the text to be processed according to extraction instructions corresponding to different extraction rules set in the preset extractor, so as to obtain a first extracted text. Meanwhile, the text to be processed is input into a target extraction model, which is a preset neural network text extraction model, such as a text extraction model composed of a Bi-directional Long Short-Term Memory network (BILSTM) and a Conditional Random field model (CRF). And extracting the text to be processed according to the target extraction model to obtain a second extracted text.

In some optional implementations of this embodiment, the extraction module 303 includes:

the matching unit is used for matching the feature data with word vectors corresponding to preset standard keywords according to the preset extractor to obtain hit keywords matched with the standard keywords in the text to be processed;

and the searching unit is used for performing neighborhood searching on the hit keywords to obtain the first extracted text.

In this embodiment, the matching unit and the searching unit carry out the keyword matching and neighborhood search described above for the corresponding method steps.

In some optional implementations of this embodiment, the extracting module 303 further includes:

the first calculation unit is used for inputting the feature data to the bidirectional long short-term memory network and calculating to obtain a feature vector of the text to be processed, wherein the target extraction model comprises the bidirectional long short-term memory network and the conditional random field model;

and the second calculation unit is used for calculating the feature vector according to the conditional random field model to obtain entity information corresponding to the feature vector and determining the entity information as the second extracted text.

In this embodiment, the first calculation unit and the second calculation unit carry out the BiLSTM-CRF extraction described above for the corresponding method steps.

The generating module 304 is configured to perform text screening on the first extracted text and the second extracted text to obtain a target extracted text, input the target extracted text to a target knowledge base, and generate structured data corresponding to the to-be-processed picture.

In this embodiment, the generating module 304 performs the text screening and knowledge-base generation described above with reference to step S204.

As noted above, to further ensure privacy and security, the structured data may also be stored in a node of a blockchain.

In some optional implementations of this embodiment, the generating module 304 includes:

The second acquisition unit is configured to acquire the confidence degrees respectively corresponding to the first extracted text and the second extracted text;

The sorting unit is configured to sort the first extracted text and the second extracted text according to their confidence degrees, and to select a preset number of texts, from the highest confidence downward, as the target extracted text.

In this embodiment, the confidence corresponding to the first extracted text is a first confidence, namely the similarity between the first extracted text and the standard keyword as calculated by the preset extractor; the confidence corresponding to the second extracted text is a second confidence, namely the predicted value of the second extracted text as calculated by the target extraction model. Specifically, when the first extracted text and the second extracted text are obtained, the similarity between the first extracted text and the standard keyword is taken as the first confidence; meanwhile, the predicted value corresponding to the second extracted text is obtained, where the predicted value is the probability information of the entity output by the target extraction model for the second extracted text. The first extracted text and the second extracted text are then sorted together according to the first confidence and the second confidence, and a preset number of texts is selected, from the highest confidence downward, as the target extracted text.
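By way of illustration only, the unified confidence-based ranking can be sketched as follows; the function name and the (text, confidence) pair format are assumptions made for the example.

def rank_by_confidence(first_with_conf, second_with_conf, top_n):
    # first_with_conf: (text, similarity to the standard keyword) pairs
    # from the preset extractor; second_with_conf: (text, predicted
    # probability) pairs from the target extraction model.
    merged = list(first_with_conf) + list(second_with_conf)
    merged.sort(key=lambda pair: pair[1], reverse=True)
    # Keep the preset number of highest-confidence candidates.
    return [text for text, _ in merged[:top_n]]

Note that this treats the two confidence scales as directly comparable; if the similarity scores and the model's predicted probabilities lie on different scales, a normalization step would be needed first.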

In some optional implementations of this embodiment, the picture information extraction apparatus 300 further includes:

The third obtaining unit is configured to obtain a preset extraction template when the picture type is a standard file type;

The extraction unit is configured to perform text extraction on the text to be processed according to the preset extraction template to obtain the target extracted text corresponding to the text to be processed.

In this embodiment, when the picture type of the picture to be processed is a standard file type, a preset extraction template is obtained, where the preset extraction template includes a text template corresponding to each standard file type. Text extraction is performed on the text to be processed according to the preset extraction template to obtain the target extracted text corresponding to the text to be processed. Specifically, the template fields of the preset extraction template are obtained and matched against the text fields in the text to be processed; for each text field that matches a template field, the text information under that field is collected, yielding the target extracted text.
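By way of illustration only, the field-matching step can be sketched as follows; the field names and the dictionary input format are assumptions made for the example.

def extract_by_template(template_fields, text_fields):
    # template_fields: the field names of the preset extraction template;
    # text_fields: {field name: text information} recognized from the
    # text to be processed. Keep only the fields matching the template.
    return {field: value
            for field, value in text_fields.items()
            if field in template_fields}

# Example with a hypothetical invoice template:
template = {"Invoice No", "Date", "Amount"}
recognized = {"Invoice No": "2021-0831", "Date": "2021-08-31", "Remark": "n/a"}
extract_by_template(template, recognized)
# -> {"Invoice No": "2021-0831", "Date": "2021-08-31"}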

The picture information extraction device provided by this embodiment realizes efficient extraction of picture information, improves the extraction efficiency and extraction accuracy, further improves the efficiency of intelligent decision-making based on the extracted text information, and shortens the decision period.

In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.

The computer device 6 includes a memory 61, a processor 62, and a network interface 63, which are communicatively connected to each other via a system bus. It is noted that only a computer device 6 having the components 61-63 is shown; it should be understood that not all of the illustrated components must be implemented, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device can perform human-computer interaction with a patient through a keyboard, a mouse, a remote controller, a touch panel, a voice control device, or the like.

The memory 61 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or a memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 6. Of course, the memory 61 may also include both an internal storage unit of the computer device 6 and an external storage device thereof. In this embodiment, the memory 61 is generally used for storing the operating system installed on the computer device 6 and various application software, such as the computer-readable instructions of the picture information extraction method. The memory 61 may also be used to temporarily store various types of data that have been output or are to be output.

The processor 62 may, in some embodiments, be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 62 is typically used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to execute the computer-readable instructions stored in the memory 61 or to process data, for example, to execute the computer-readable instructions of the picture information extraction method.

The network interface 63 may comprise a wireless network interface or a wired network interface, and the network interface 63 is typically used for establishing a communication connection between the computer device 6 and other electronic devices.

The computer device provided by this embodiment realizes efficient extraction of picture information, improves the extraction efficiency and extraction accuracy, further improves the efficiency of intelligent decision-making based on the extracted text information, and shortens the decision period.

The present application further provides another embodiment, namely a computer-readable storage medium storing computer-readable instructions executable by at least one processor, so as to cause the at least one processor to execute the steps of the picture information extraction method described above.

The computer-readable storage medium provided by this embodiment realizes efficient extraction of picture information, improves the extraction efficiency and extraction accuracy, further improves the efficiency of intelligent decision-making based on the extracted text information, and shortens the decision period.

Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and certainly also by hardware, although in many cases the former is the better implementation. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and including instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods of the embodiments of the present application.

It should be understood that the above-described embodiments are merely illustrative and not restrictive, and that the appended drawings illustrate preferred embodiments of the application without limiting its scope. The present application can be embodied in many different forms; the embodiments are provided so that the disclosure of the application will be thorough. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. All equivalent structures made using the contents of the specification and drawings of the present application, applied directly or indirectly in other related technical fields, fall within the protection scope of the present application.
