Entity and semantic relation recognition method and device, electronic equipment and storage medium

Document number: 1378991  Publication date: 2020-08-14  Views: 6  Language: Chinese

Reading note: This technology, "Entity and semantic relation recognition method and device, electronic equipment and storage medium", was designed and created by Zou Jing, Zhou Yingneng and Xiao Ting on 2020-04-17. Main content: the embodiment of the invention relates to the field of natural language processing and discloses an entity and semantic relation recognition method. The method converts the corpus to be recognized into word vectors to obtain a word vector set to be recognized; performs original encoding on that set with a pre-trained coding layer to obtain an original encoding word vector set; calculates attention values of the original encoding word vector set and obtains an encoding word vector set from the attention values and the original encoding word vector set; uses a pre-trained decoding layer to recognize the entity sequence of the encoding word vector set, yielding an entity set and a decoding word vector set; and inputs the decoding word vector set into a probability distribution function to obtain a semantic relation set. The invention also provides an entity and semantic relation recognition apparatus, an electronic device and a storage medium. The invention can solve the problems that numerous parameters occupy computing resources during training and that text features are not finely constructed, which affects recognition accuracy.

1. An entity and semantic relationship recognition method, the method comprising:

performing word vector conversion on the corpus to be recognized to obtain a word vector set to be recognized;

performing original coding on the word vector set to be recognized by utilizing a coding layer which is trained in advance to obtain an original coding word vector set, calculating multiple groups of attention values of the original coding word vector set, and obtaining a coding word vector set according to the multiple groups of attention values and the original coding word vector set;

recognizing an entity sequence of the encoding word vector set by using a decoding layer which is trained in advance to obtain an entity set, and performing decoding operation on the encoding word vector set to obtain a decoding word vector set;

and inputting the decoded word vector set to a preset probability distribution function to obtain a semantic relation set.

2. The entity and semantic relationship recognition method according to claim 1, further comprising constructing an entity and semantic relationship recognition model comprising the coding layer and the decoding layer, and training the coding layer and the decoding layer, wherein the training comprises:

step A: acquiring a training corpus and a training label set, and dividing the training label set into an entity label set and a semantic relation label set;

step B: respectively converting the training corpus set, the entity tag set and the semantic relation tag set into a training vector set, an entity vector set and a semantic relation vector set;

step C: inputting the training vector set into the coding layer to carry out coding operation to obtain a coding training set;

step D: inputting the coding training set into the decoding layer to perform the decoding operation to obtain a predicted entity set and a predicted semantic relation set;

step E: calculating first loss values of the predicted entity set and the entity vector set, and calculating second loss values of the predicted semantic relation set and the semantic relation vector set;

step F: calculating a total loss value according to the first loss value and the second loss value;

step G: optimizing the internal parameters of the coding layer and the decoding layer according to a pre-constructed optimization function under the condition that the total loss value is greater than a preset loss value, and returning to the step C;

step H: and under the condition that the total loss value is smaller than the preset loss value, obtaining the trained coding layer and decoding layer.

3. The entity and semantic relationship recognition method according to claim 1 or 2, wherein the obtaining a set of encoded word vectors according to the plurality of sets of attention values and the set of original encoded word vectors comprises:

carrying out dimension splicing on the multiple groups of attention values to obtain an encoding adjusting word vector set;

and performing integration operation on the original encoding word vector set and the encoding adjusting word vector set to obtain the encoding word vector set.

4. The entity and semantic relationship recognition method of claim 3, wherein the calculating the plurality of sets of attention values of the original encoded word vector set comprises:

initializing a plurality of groups of attention matrixes;

calculating a projection matrix of a plurality of groups of attention matrixes according to a pre-constructed projection equation;

and inputting the plurality of groups of projection matrixes and the vector dimension of the original coding word vector set into a pre-constructed attention calculation function to obtain the plurality of groups of attention values.

5. The entity and semantic relationship recognition method of claim 4, wherein the attention calculation function is:

head_i = Attention(Q′_i, K′_i, V′_i) = softmax(Q′_i · K′_i^T / √(d_bilstm)) · V′_i

wherein head_i is the i-th attention value, d_bilstm represents the vector dimension of the original encoding word vector set, softmax represents the normalized exponential function, and Q′_i, K′_i and V′_i represent the projection matrices.

6. The entity and semantic relationship recognition method according to claim 1 or 2, wherein the recognizing the entity sequence of the encoded word vector set by using the pre-trained decoding layer to obtain an entity set, and performing a decoding operation on the encoded word vector set to obtain a decoded word vector set, comprises:

identifying entity vectors included in the encoding word vector set according to a pre-constructed conditional random field model to obtain an entity vector set;

and according to a pre-constructed fully-connected neural network, calculating the semantic relations among the encoding word vectors in the encoding word vector set to obtain a semantic relation score set, and cleaning the semantic relation score set according to a pre-constructed threshold function to obtain the decoding word vector set.

7. The entity and semantic relationship recognition method according to claim 6, wherein the cleaning the semantic relation score set according to a pre-constructed threshold function to obtain the decoding word vector set comprises:

taking the semantic relation score set as an input parameter of the threshold function;

calculating the threshold function to obtain a probability distribution of the semantic relation score set;

and cleaning the semantic relation score set according to the probability distribution to obtain the decoding word vector set.

8. An entity and semantic relationship recognition apparatus, the apparatus comprising:

the word vector conversion module is used for carrying out word vector conversion on the linguistic data to be recognized to obtain a word vector set to be recognized;

the coding module is used for carrying out original coding on the word vector set to be recognized by utilizing a coding layer which is trained in advance to obtain an original coding word vector set, calculating multiple groups of attention values of the original coding word vector set, and obtaining a coding word vector set according to the multiple groups of attention values and the original coding word vector set;

the entity identification module is used for identifying the entity sequence of the encoding word vector set by utilizing the decoding layer which is trained in advance to obtain an entity set, and decoding the encoding word vector set to obtain a decoding word vector set;

and the semantic relation recognition module is used for inputting the decoded word vector set to a preset probability distribution function to obtain a semantic relation set.

9. An electronic device, characterized in that the electronic device comprises:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the entity and semantic relationship identification method of any one of claims 1 to 7.

10. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the entity and semantic relationship recognition method according to any one of claims 1 to 7.

Technical Field

The embodiment of the invention relates to the field of natural language processing, and in particular to an entity and semantic relation recognition method, apparatus, electronic device and computer-readable storage medium.

Background

In the field of natural language processing, the process of extracting valuable information by analyzing text is called information extraction. By recognizing the entities and semantic relations of unstructured text with non-uniform structure, simple and clear structured data can be obtained, allowing people to retrieve and manage data efficiently; being able to recognize entities and semantic relations quickly is therefore of great significance.

At present, commonly used entity and semantic relation recognition mainly relies on sequence labeling methods based on neural networks and on attention mechanisms. The inventors have found that although both approaches can recognize entities and semantic relations, they still leave room for improvement: their numerous parameters occupy computing resources during training, and their text-feature construction is not fine-grained enough, which affects recognition accuracy.

Disclosure of Invention

The embodiment of the invention aims to provide an entity and semantic relation recognition method, electronic device, apparatus and computer-readable storage medium, so as to effectively solve the problems that, in entity and semantic relation recognition, numerous parameters occupy computing resources during training and text features are not constructed finely enough, which affects recognition accuracy.

In order to solve the above technical problem, an embodiment of the present invention provides an entity and semantic relationship identification method, where the method includes:

performing word vector conversion on the corpus to be recognized to obtain a word vector set to be recognized;

performing original coding on the word vector set to be recognized by utilizing a coding layer which is trained in advance to obtain an original coding word vector set, calculating multiple groups of attention values of the original coding word vector set, and obtaining a coding word vector set according to the multiple groups of attention values and the original coding word vector set;

recognizing an entity sequence of the encoding word vector set by using a decoding layer which is trained in advance to obtain an entity set, and decoding the encoding word vector set to obtain a decoding word vector set;

and inputting the decoded word vector set to a preset probability distribution function to obtain a semantic relation set.

In order to solve the above problem, the present invention further provides an entity and semantic relationship recognition apparatus, including:

the word vector conversion module is used for carrying out word vector conversion on the linguistic data to be recognized to obtain a word vector set to be recognized;

the coding module is used for carrying out original coding on the word vector set to be recognized by utilizing a coding layer which is trained in advance to obtain an original coding word vector set, calculating multiple groups of attention values of the original coding word vector set, and obtaining the coding word vector set according to the multiple groups of attention values and the original coding word vector set;

the entity identification module is used for identifying the entity sequence of the encoding word vector set by utilizing a decoding layer which is trained in advance to obtain an entity set, and decoding the encoding word vector set to obtain a decoding word vector set;

and the semantic relation recognition module is used for inputting the decoded word vector set to a preset probability distribution function to obtain a semantic relation set.

In order to solve the above problem, the present invention also provides an electronic device, including:

a memory storing at least one instruction; and

and a processor executing the instructions stored in the memory to implement the entity and semantic relation identification method described above.

In order to solve the above problem, the present invention further provides a computer-readable storage medium, where at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the entity and semantic relationship identification method described above.

According to the embodiment of the invention, the pre-trained coding layer performs original encoding to obtain original encoding word vectors, and the feature relations between the original encoding word vectors are further calculated with the attention-value method, which remedies the deficiency in feature expression, makes the text-feature construction finer, and improves subsequent recognition accuracy. In addition, the decoding operation is performed by the pre-trained decoding layer, and the semantic relations are computed directly through a preset probability distribution function, which avoids generating redundant information among semantic relations and reduces computing resource overhead.

Preferably, the method further includes constructing an entity and semantic relationship recognition model including the encoding layer and the decoding layer, and training the encoding layer and the decoding layer, wherein the training includes:

step A: acquiring a training corpus and a training label set, and dividing the training label set into an entity label set and a semantic relation label set;

step B: respectively converting the training corpus set, the entity tag set and the semantic relation tag set into a training vector set, an entity vector set and a semantic relation vector set;

step C: inputting the training vector set into the coding layer to carry out coding operation to obtain a coding training set;

step D: inputting the coding training set into the decoding layer to perform the decoding operation to obtain a predicted entity set and a predicted semantic relation set;

step E: calculating first loss values of the predicted entity set and the entity vector set, and calculating second loss values of the predicted semantic relation set and the semantic relation vector set;

step F: calculating a total loss value according to the first loss value and the second loss value;

step G: optimizing the internal parameters of the coding layer and the decoding layer according to a pre-constructed optimization function under the condition that the total loss value is greater than a preset loss value, and returning to the step C;

step H: and under the condition that the total loss value is smaller than the preset loss value, obtaining the trained coding layer and decoding layer.
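Steps C through H above can be illustrated with a minimal sketch. The coding and decoding layers are reduced here to two scalar parameters, and the quadratic losses and hand-derived gradient step standing in for the pre-constructed optimization function are illustrative assumptions, not the patent's actual model.

```python
# Toy sketch of training steps C-H. The "coding layer" and "decoding layer"
# are collapsed into two scalar parameters w and b; losses and gradients are
# illustrative assumptions only.

def train(xs, ent_targets, rel_targets, preset_loss=1e-6, lr=0.02, max_steps=5000):
    w, b = 0.0, 0.0
    total = float("inf")
    for _ in range(max_steps):
        ents = [w * x for x in xs]              # steps C+D: predicted entity set
        rels = [w * x + b for x in xs]          # step D: predicted relation set
        l1 = sum((p - t) ** 2 for p, t in zip(ents, ent_targets))  # step E
        l2 = sum((p - t) ** 2 for p, t in zip(rels, rel_targets))
        total = l1 + l2                         # step F: total loss
        if total < preset_loss:                 # step H: training finished
            break
        # step G: optimize internal parameters, then loop back to step C
        gw = sum(2 * (w * x - t) * x for x, t in zip(xs, ent_targets)) \
           + sum(2 * (w * x + b - t) * x for x, t in zip(xs, rel_targets))
        gb = sum(2 * (w * x + b - t) for x, t in zip(xs, rel_targets))
        w -= lr * gw
        b -= lr * gb
    return w, b, total
```

On a toy dataset where the entity targets are 2x and the relation targets 2x + 1, the loop converges to w ≈ 2, b ≈ 1.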

Preferably, the obtaining of the encoding word vector set according to the multiple groups of attention values and the original encoding word vector set includes:

carrying out dimension splicing on the multiple groups of attention values to obtain an encoding adjusting word vector set;

and performing integration operation on the original encoding word vector set and the encoding adjusting word vector set to obtain the encoding word vector set.

Preferably, the calculating of the multiple groups of attention values of the original encoding word vector set according to its vector dimension includes:

initializing a plurality of groups of attention matrixes;

calculating a projection matrix of a plurality of groups of attention matrixes according to a pre-constructed projection equation;

and inputting the plurality of groups of projection matrixes and the vector dimension of the original coding word vector set into a pre-constructed attention calculation function to obtain the plurality of groups of attention values.

Preferably, the attention calculation function is:

head_i = Attention(Q′_i, K′_i, V′_i) = softmax(Q′_i · K′_i^T / √(d_bilstm)) · V′_i

wherein head_i is the i-th attention value, d_bilstm represents the vector dimension of the original encoding word vector set, softmax represents the normalized exponential function, and Q′_i, K′_i and V′_i represent the projection matrices.

According to the embodiment of the invention, the BiLSTM extracts the temporal sequence features of the text word vectors, and the attention values are further used to construct the text features.

Preferably, the identifying the entity sequence of the encoded word vector set by using the pre-trained decoding layer to obtain an entity set, and performing a decoding operation on the encoded word vector set to obtain a decoded word vector set includes:

identifying entity vectors included in the encoding word vector set according to a pre-constructed conditional random field model to obtain an entity vector set;

and according to a pre-constructed fully-connected neural network, calculating the semantic relations among the encoding word vectors in the encoding word vector set to obtain a semantic relation score set, and cleaning the semantic relation score set according to a pre-constructed threshold function to obtain the decoding word vector set.

Preferably, the cleaning of the semantic relation score set according to a pre-constructed threshold function to obtain the decoding word vector set includes:

taking the semantic relation score set as an input parameter of the threshold function;

calculating the threshold function to obtain a probability distribution of the semantic relation score set;

and cleaning the semantic relation score set according to the probability distribution to obtain the decoding word vector set.

According to the embodiment of the invention, the word vectors that the probability distribution function determines deserve strong attention are given larger weights, and the parts to be ignored are given smaller weights. Weighting clauses by attention in this way reflects the importance of each word in the sentence, completes the semantic relation extraction task, and improves recognition accuracy.

Drawings

One or more embodiments are illustrated by way of example in the accompanying drawings, in which like reference numerals refer to similar elements; the figures are not to scale unless otherwise specified.

FIG. 1 is a diagram illustrating examples of entities and semantic relationships according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a training coding layer and a decoding layer in the entity and semantic relationship recognition method according to the embodiment of the present invention;

FIG. 3 is a detailed flowchart illustrating an implementation procedure of S2 in the training encoding layer and the decoding layer provided in FIG. 2 according to an embodiment of the present invention;

FIG. 4 is a detailed flowchart illustrating an implementation procedure of S3 in the training encoding layer and the decoding layer provided in FIG. 2 according to an embodiment of the present invention;

fig. 5 is a schematic flowchart of a method for performing entity and semantic relationship recognition by using the trained coding layer and decoding layer obtained in fig. 2 according to an embodiment of the present invention;

FIG. 6 is a block diagram of an entity and semantic relationship recognition apparatus according to an embodiment of the present invention;

fig. 7 is a schematic diagram of an internal structure of an electronic device implementing an entity and semantic relationship recognition method according to an embodiment of the present invention;

the objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the various embodiments in order to provide a better understanding of the present application; the technical solution claimed in the present application can nevertheless be implemented without these technical details, and with various changes and modifications based on the following embodiments.

The core of the embodiment of the invention is to use a pre-constructed and trained entity and semantic relation recognition model to recognize the entities and semantic relations of a corpus to be recognized, thereby effectively solving the problems that traditional entity and semantic relation recognition models have numerous parameters, occupy computing resources during training, and construct text features imperfectly, which affects recognition accuracy.

The following describes implementation details of the entity and semantic relationship recognition in this embodiment in detail, and the following description is only provided for the convenience of understanding and is not necessary for implementing this embodiment.

For a better understanding of the present invention, the terms referred to in the examples of the present invention will be explained as follows:

An entity is a noun, formed from one or more meaningful words, that denotes a specific object; an entity semantic relation represents the semantic relation existing between entities; and entity and semantic relation recognition is the process of first distinguishing entity information and entity relation information and then integrating the two. For example, in the sentence "Xiao Ming purchased a property in Shanghai, China", there are three entity words, "Xiao Ming", "China" and "Shanghai", whose entity types are "person name", "place name" and "place name" respectively, and three pairs of entity relation information, represented by the relation triplets (Xiao Ming, China, resident), (Xiao Ming, Shanghai, resident) and (Shanghai, China, located). Further, the entity and semantic relation recognition can be shown as in Fig. 1 of the specification, which shows the relations between the three entity words and the relation information: Xiao Ming is a person-name entity, China and Shanghai are place names, and the semantic relations are that Xiao Ming resides in China and Shanghai, and that Shanghai is located in China.
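The example's entities and relation triplets can be written down as plain data; the `relations_of` helper below is a hypothetical illustration, not part of the disclosed method.

```python
# Entities and relation triplets from the example sentence above.
entities = {"Xiao Ming": "person name", "China": "place name", "Shanghai": "place name"}

triples = [
    ("Xiao Ming", "China", "resident"),
    ("Xiao Ming", "Shanghai", "resident"),
    ("Shanghai", "China", "located"),
]

def relations_of(head, triples):
    """All triplets in which the given entity appears as the head (illustrative helper)."""
    return [t for t in triples if t[0] == head]
```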

In the embodiment of the present invention, entity and semantic relationship recognition is performed on a corpus to be recognized by using a pre-constructed and trained entity and semantic relationship recognition model, where the entity and semantic relationship recognition model includes a coding layer and a decoding layer, and before performing recognition of an entity set and a semantic relationship set in the corpus to be recognized by using the entity and semantic relationship recognition model, the coding layer and the decoding layer need to be trained, as shown in fig. 2, training the coding layer and the decoding layer includes:

s1, acquiring a training corpus and a training label set, and dividing the training label set into an entity label set and a semantic relation label set.

The training corpus set and the training label set correspond one to one, and include various corpora and training labels. For example, corpus A: "The winter is so beautiful"; the training labels of corpus A include the entity "winter" and the semantic relation "winter - beautiful". Regarding entity labels, the embodiment of the present invention may convert each entity label into the BIO sequence labeling form in advance: if an entity consists of a single word, that word is labeled "B-entity category"; if an entity consists of multiple words, the first word of the entity is labeled "B-entity category" and the remaining words "I-entity category"; all other non-entity words are labeled "O".
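The BIO conversion described above can be sketched minimally, assuming entities are given as (start, end, category) token spans; the span encoding is an assumption made for illustration.

```python
# Minimal BIO conversion: entity spans are (start, end_exclusive, category).
def to_bio(tokens, entity_spans):
    labels = ["O"] * len(tokens)           # non-entity words default to "O"
    for start, end, cat in entity_spans:
        labels[start] = "B-" + cat         # first word of the entity
        for i in range(start + 1, end):
            labels[i] = "I-" + cat         # remaining words of the entity
    return labels
```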

Regarding semantic relation labels, the embodiment of the present invention divides them into the following categories: first, if a word has no semantic relation with any word in the sentence, its semantic relation label is the word itself, corresponding to "None"; second, if a dependency exists between a word and another word, its semantic relation label is the related word, corresponding to the relation class between the two words.

Preferably, for convenience of calculation, all entities and semantic relations in the training corpus can be expressed in matrix form. For example, for a sentence of length n, assuming there are M entity classes and N semantic relation classes in total, the mixed representation matrix of entities and semantic relations has dimension n × n, and its (i, j)-th element m indicates that the i-th word and the j-th word hold the semantic relation with sequence number m, where i, j ≤ n and m ≤ N.
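The mixed matrix representation can be sketched as follows; using index 0 to mean "no relation" is an assumption made for illustration.

```python
# Sketch of the n x n mixed entity/relation matrix described above.
def relation_matrix(n, relation_entries):
    """relation_entries: list of (i, j, m) meaning words i and j hold the
    semantic relation with sequence number m; 0 means no relation."""
    mat = [[0] * n for _ in range(n)]
    for i, j, m in relation_entries:
        mat[i][j] = m
    return mat
```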

S2, according to the pre-constructed word vector conversion method, the training corpus set, the entity label set and the semantic relation label set are respectively converted into a training vector set, an entity vector set and a semantic relation vector set.

In the preferred embodiment of the present invention, the pre-constructed word vector conversion method can use the currently open-source ELMo model (Embeddings from Language Models). The embodiment of the invention inputs the training corpus set, the entity label set and the semantic relation label set into the ELMo model respectively to obtain the training vector set, the entity vector set and the semantic relation vector set.
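ELMo itself produces contextual embeddings from a pretrained language model; purely to show the corpus-to-vector-set conversion shape, the sketch below substitutes a toy static lookup table (not ELMo), and every name in it is an assumption.

```python
# Stand-in for word-vector conversion: a toy random lookup table, NOT ELMo.
import numpy as np

def to_word_vectors(tokens, vocab, dim=4, seed=0):
    rng = np.random.default_rng(seed)
    table = {w: rng.standard_normal(dim) for w in vocab}
    unk = np.zeros(dim)                                  # out-of-vocabulary fallback
    return np.stack([table.get(t, unk) for t in tokens])
```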

And S3, inputting the training vector set to a pre-constructed coding layer for coding operation to obtain a coding training set.

The coding layer is mainly used to represent the information of the input word vectors. In the embodiment of the invention, the coding layer comprises a BiLSTM encoding stage, an attention calculation stage, a dimension splicing stage and an integration stage.

In detail, referring to fig. 3, the detailed implementation flow of step S3 includes:

S31, inputting the training vector set into a BiLSTM coding network for original encoding to obtain an original encoding word vector set.

In a preferred embodiment of the present invention, the BiLSTM coding network is a known technology; a currently published, pre-trained BiLSTM coding network can be obtained from the internet or a public forum and used to encode the training vector set, thereby obtaining the original encoding word vector set.

S32, calculating a plurality of groups of attention values of the original encoding word vector set according to the vector dimensions of the original encoding word vector set.

Further, S32 includes: initializing multiple groups of attention matrices; calculating projection matrices of the multiple groups of attention matrices according to a pre-constructed projection equation; and taking the multiple groups of projection matrices and the vector dimension of the original encoding word vector set as input values of a pre-constructed attention calculation function, and calculating the attention calculation function to obtain the multiple groups of attention values.

In detail, the attention matrices are Q, K and V, and the pre-constructed projection equations are as follows:

Q′_i = Q · W_i^Q

K′_i = K · W_i^K

V′_i = V · W_i^V

where i = 1, 2, …, h, h being the preset number of attention heads; W_i^Q represents the i-th conversion matrix relative to the attention matrix Q, W_i^K the i-th conversion matrix relative to the key matrix K, and W_i^V the i-th conversion matrix relative to V; Q′_i, K′_i and V′_i are the projection matrices. Further, Q is also called the query matrix related to the original encoding word vector set, K is the key matrix of the query matrix, and V is an additional (value) matrix related to the query matrix and the key matrix.

Further, in the preferred embodiment of the present invention, the attention calculation function is as follows:

head_i = Attention(Q′_i, K′_i, V′_i) = softmax(Q′_i · K′_i^T / √(d_bilstm)) · V′_i

wherein head_i is the i-th attention value, Attention represents the attention calculation function, and d_bilstm represents the vector dimension of the original encoding word vector set.
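Under the assumption that Q = K = V = the original encoding word vector set (a common self-attention choice; the text leaves the initialization open), the projection equations and the attention calculation function can be sketched with NumPy:

```python
# Sketch of multi-head projections and scaled dot-product attention.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))   # numerically stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def attention_heads(H, num_heads=2, seed=0):
    """H: (seq_len, d) original encoding word vector set. Returns the head_i list."""
    seq_len, d = H.shape
    rng = np.random.default_rng(seed)
    heads = []
    for _ in range(num_heads):
        Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
        Qp, Kp, Vp = H @ Wq, H @ Wk, H @ Wv           # Q'_i, K'_i, V'_i projections
        scores = softmax(Qp @ Kp.T / np.sqrt(d))      # scaled dot-product weights
        heads.append(scores @ Vp)                     # head_i
    return heads
```

With identical input rows, every output row of a head is identical, since the attention weights are uniform over identical keys.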

And S33, carrying out dimension splicing on the plurality of groups of attention values to obtain a coding adjusting word vector set.

There are various dimension splicing methods; in the embodiment of the present invention, the encoding adjusting word vector set can be obtained by arranging all the attention values according to the specified rows and columns, i.e., concatenating the heads along the feature dimension.

S34, integrating the original encoding word vector set and the encoding adjusting word vector set to obtain the encoding training set.

In the embodiment of the present invention, an integration operation of adding the original encoding word vector set and the encoding adjusting word vector set may be performed to obtain the encoding training set. In detail, the mathematical expression of the integration operation is as follows:

h_self = h_mul + h_BiLSTM

wherein h_BiLSTM represents the original encoding word vector set, h_mul represents the encoding adjusting word vector set, and h_self represents the encoding training set.
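Steps S33 and S34 can be sketched together: the heads are spliced along the feature dimension, then integrated with the BiLSTM output by addition (h_self = h_mul + h_BiLSTM). The output projection Wo that restores the BiLSTM width before the residual addition is an assumption made here so that the shapes are compatible.

```python
# Sketch of dimension splicing (S33) + integration (S34).
import numpy as np

def splice_and_integrate(h_bilstm, heads, seed=0):
    h_cat = np.concatenate(heads, axis=-1)          # dimension splicing of heads
    d = h_bilstm.shape[-1]
    rng = np.random.default_rng(seed)
    Wo = rng.standard_normal((h_cat.shape[-1], d)) * 0.1  # assumed output projection
    h_mul = h_cat @ Wo                              # encoding adjusting vectors
    return h_mul + h_bilstm                         # integration: h_self
```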

And S4, inputting the coding training set to a pre-constructed decoding layer for decoding operation to obtain a prediction entity set and a prediction semantic relation set.

The decoding layer is mainly used for extracting information from the coding training set. In the embodiment of the invention, the decoding layer comprises a conditional random field model, a fully connected neural network, a threshold function cleaning stage and a screening stage.

In detail, referring to fig. 4, the detailed implementation flow of step S4 includes:

and S41, identifying the entity vectors included in the coding training set according to the conditional random field model to obtain an entity vector set.

In detail, the Conditional Random Field (CRF) model is a currently published model capable of automatically identifying entities.

And S42, calculating semantic relationship scores among the code word vectors in the code word vector set according to the pre-constructed fully-connected neural network to obtain semantic relationship diversity.

In detail, suppose there is a sentence A consisting of n words w_1, w_2, …, w_n, and that any two words w_k and w_i correspond to the encoded word vectors h_k and h_i. The semantic relationship score between the encoded word vectors in the encoded word vector set may then be calculated as follows:

s^(r)(h_k, h_i) = V^(r)·f(W^(r)·h_k + U^(r)·h_i + G^(r)·c_ki + b^(r))

wherein s^(r)(h_k, h_i) represents the semantic relationship score of h_k and h_i, c_ki represents the clause of sentence A lying between the two words, V^(r), W^(r), U^(r), G^(r) and b^(r) are internal parameters of the fully connected neural network, and f represents an activation function. Further, c_ki is composed as follows:

c_ki = mean(h_cut)

In detail, h_cut = [h_{st+1}, h_{st+2}, …, h_{end-1}], mean(h_cut) denotes the mean value of h_cut, st = min(k, i) and end = max(k, i); when no word lies between w_k and w_i, a preset parameter value of dimension d_encoder is used instead, where d_encoder represents the dimension of the encoded word vectors.

And S43, clearing the semantic relationship diversity according to a pre-constructed threshold function to obtain a decoded word vector set.

In detail, the step of clearing the semantic relationship diversity according to the pre-constructed threshold function to obtain the decoded word vector set includes: taking the semantic relationship diversity as an input parameter of the threshold function, calculating the threshold function to obtain the probability distribution of the semantic relationship diversity, and clearing the semantic relationship diversity according to the probability distribution to obtain the decoded word vector set.

In the preferred embodiment of the present invention, the threshold function is a sigmoid function that is currently disclosed.

S44, screening the encoding word vector set according to the decoding word vector set to obtain a semantic relation set.

In the preferred embodiment of the present invention, it can be known from S43 that the threshold function is a sigmoid function, whose output values lie in the range (0, 1). The screening rule in the embodiment of the present invention may retain the encoded word vectors corresponding to the decoded word vectors whose values exceed a preset threshold (for example, 0.5), thereby obtaining the semantic relationship set.
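The threshold screening of S43-S44 can be sketched as follows; the 0.5 cutoff is an assumed convention for sigmoid outputs, not a value fixed by the source, and the triple keys are illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def screen(scores, threshold=0.5):
    """Keep candidate relations whose sigmoid probability exceeds threshold.

    scores: dict mapping a (word_k, word_i, relation) triple to a raw score.
    Returns the retained triples (the semantic relationship set).
    """
    return {t for t, s in scores.items() if sigmoid(s) > threshold}

kept = screen({("I", "Guangdong", "like"): 2.3, ("I", "cold", "like"): -1.8})
print(kept)  # {('I', 'Guangdong', 'like')}
```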

S5, calculating a first loss value of the predicted entity set and the entity vector set, calculating a second loss value of the predicted semantic relationship set and the semantic relationship vector set, and calculating a total loss value according to the first loss value and the second loss value.

In detail, the first loss value may be calculated by using a log-likelihood function that is currently disclosed, the second loss value may be calculated by using a cross-entropy function, and the total loss value may be calculated by adding the first loss value and the second loss value.
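The loss computation in S5 can be sketched as follows, assuming a negative log-likelihood for the entity branch and a binary cross-entropy for the relation branch; the concrete probability inputs are illustrative:

```python
import math

def entity_nll(pred_probs):
    """First loss value: negative log-likelihood of the gold entity tags."""
    return -sum(math.log(p) for p in pred_probs)

def relation_ce(y_true, y_prob, eps=1e-12):
    """Second loss value: binary cross-entropy over candidate relations."""
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(y_true, y_prob))

def total_loss(pred_probs, y_true, y_prob):
    # Total loss value = first loss value + second loss value
    return entity_nll(pred_probs) + relation_ce(y_true, y_prob)

print(round(total_loss([0.9, 0.8], [1, 0], [0.7, 0.2]), 4))  # 0.9083
```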

And S6, judging whether the total loss value is larger than a preset loss value.

And S7, if the total loss value is larger than the preset loss value, optimizing the internal parameters of the coding layer and the decoding layer according to a pre-constructed optimization function, and returning to S3.

Wherein, the pre-constructed optimization function can adopt the Adam algorithm, the random gradient descent method and the like which are disclosed currently.

And S8, if the total loss value is smaller than the preset loss value, outputting the trained coding layer and the trained decoding layer.
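The stopping logic of S6-S8 can be sketched as a loop that keeps optimizing until the total loss no longer exceeds the preset loss value; plain gradient descent on a toy quadratic loss stands in for the Adam optimizer here:

```python
def train(step_fn, loss_fn, params, preset_loss, max_iters=1000):
    """Repeat: compute total loss; if above preset_loss, optimize and retry;
    otherwise output the trained parameters (steps S6-S8).
    """
    for _ in range(max_iters):
        if loss_fn(params) <= preset_loss:
            break
        params = step_fn(params)
    return params

# Toy example: minimize (w - 3)^2 by gradient descent (stand-in for Adam).
loss = lambda w: (w - 3.0) ** 2
step = lambda w: w - 0.1 * 2.0 * (w - 3.0)
w = train(step, loss, 0.0, preset_loss=1e-4)
print(abs(w - 3.0) < 0.01)  # True
```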

Referring to fig. 5, a schematic flow chart of a method for performing entity and semantic relationship recognition by using the coding layer and the decoding layer obtained in fig. 2 according to an embodiment of the present invention is shown, where the method for performing entity and semantic relationship recognition by using the coding layer and the decoding layer trained in fig. 2 includes:

s10, obtaining the linguistic data to be recognized, and performing word vector conversion on the linguistic data to be recognized according to a pre-constructed word vector conversion method to obtain a word vector set to be recognized.

The corpus to be recognized may be presented in various forms, such as a piece of speech or a piece of text. For example, a user inputs a piece of speech: "I like Guangdong and Fujian, but do not like Heilongjiang, because Heilongjiang is too cold."

In the embodiment of the present invention, the pre-constructed word vector conversion method described here is the same as that described in S2 above.

S20, carrying out original coding on the word vector set to be recognized by utilizing the coding layer finished by pre-training to obtain an original coding word vector set, calculating multiple groups of attention values of the original coding word vector set, and obtaining a coding word vector set according to the multiple groups of attention values and the original coding word vector set.

In the embodiment of the present invention, the detailed execution method of S20 may refer to the encoding operation method in S3.

S30, recognizing the entity sequence of the encoding word vector set by using the decoding layer finished by pre-training to obtain an entity set, and decoding the encoding word vector set to obtain a decoding word vector set.

And S40, inputting the decoded word vector set to a preset probability distribution function to obtain a semantic relation set.

In the embodiment of the present invention, the detailed implementation methods of S30 and S40 may refer to the description of S4.

Through the scheme provided by the embodiment of the present invention, when the speech "I like Guangdong and Fujian, do not like Heilongjiang, because Heilongjiang is too cold" is input in S10, four semantic relationship groups can be obtained: "I like Guangdong", "I like Fujian", "I do not like Heilongjiang" and "Heilongjiang is cold".
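As a toy illustration of that final output, the four semantic relationship groups can be assembled from predicted (head, tail, relation) links; the link triples below are hand-written for illustration, not produced by the model:

```python
def relation_groups(words, links):
    """Render predicted (head_idx, tail_idx, relation) links as readable groups."""
    return [f"{words[k]} {rel} {words[i]}" for k, i, rel in links]

words = ["I", "like", "Guangdong", "Fujian", "Heilongjiang", "cold"]
links = [(0, 2, "like"), (0, 3, "like"), (0, 4, "do not like"), (4, 5, "is")]
print(relation_groups(words, links))
# ['I like Guangdong', 'I like Fujian', 'I do not like Heilongjiang', 'Heilongjiang is cold']
```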

FIG. 6 is a functional block diagram of the entity and semantic relation recognition apparatus according to the present invention.

The entity and semantic relationship recognition apparatus 100 of the present invention can be installed in an electronic device. According to the implemented functions, the entity and semantic relationship recognition apparatus may include a model training module 101, a word vector conversion module 102, an encoding module 103, an entity recognition module 104, and a semantic relationship recognition module 105. A module according to the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device and that can perform a fixed function, and that are stored in a memory of the electronic device.

In the present embodiment, the functions regarding the respective modules/units are as follows:

the model training module 101 is configured to construct an entity and semantic relationship recognition model including the coding layer and the decoding layer, and train the coding layer and the decoding layer.

The word vector conversion module 102 is configured to perform word vector conversion on the corpus to be recognized to obtain a word vector set to be recognized.

The encoding module 103 is configured to perform original encoding on the word vector set to be recognized by using a pre-trained encoding layer to obtain an original encoded word vector set, calculate multiple groups of attention values of the original encoded word vector set, and obtain an encoded word vector set according to the multiple groups of attention values and the original encoded word vector set.

And the entity identification module 104 is configured to identify an entity sequence of the encoded word vector set by using the pre-trained decoding layer to obtain an entity set, and perform a decoding operation on the encoded word vector set to obtain a decoded word vector set.

And the semantic relation recognition module 105 is configured to input the decoded word vector set to a preset probability distribution function to obtain a semantic relation set.

When the modules in the apparatus provided by the present application are used, entity and semantic relationship recognition can be performed on the corpus to be recognized by using the pre-trained entity and semantic relationship recognition model, based on the same entity and semantic relationship recognition method. In specific use, the same technical effect as the method embodiments can be achieved, namely effectively solving the problems that numerous parameters occupy computing resources during training and that coarse text feature construction affects recognition accuracy.

Fig. 7 is a schematic structural diagram of an electronic device implementing the entity and semantic relationship recognition method according to the present invention.

The electronic device 1 may comprise a processor 12, a memory 11 and a bus, and may further comprise a computer program stored in the memory 11 and executable on the processor 12, such as an entity and semantic relationship recognition program.

The memory 11 includes at least one type of readable storage medium, which includes flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the entity and semantic relationship recognition program, but also to temporarily store data that has been output or is to be output.

The processor 12 may be formed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be formed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 12 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (e.g., an execution entity and a semantic relationship recognition program) stored in the memory 11 and calling data stored in the memory 11.

The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 12 or the like.

Fig. 7 only shows an electronic device with components, and it will be understood by a person skilled in the art that the structure shown in fig. 7 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.

For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 12 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.

Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.

It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.

The entity and semantic relationship recognition program stored in the memory 11 of the electronic device 1 is a combination of a plurality of instructions which, when executed by the processor 12, can implement the following:

the method comprises the following steps of firstly, constructing an entity and semantic relation recognition model comprising a coding layer and a decoding layer, and training the coding layer and the decoding layer, wherein the training comprises the following steps:

step a, acquiring a training corpus and a training label set, and dividing the training label set into an entity label set and a semantic relation label set.

The training corpus set and the training label set correspond one-to-one and include various corpora and training labels. For example, corpus a: "the winter is too beautiful"; the training labels of corpus a include the entity "winter" and the semantic relationship "beautiful in winter". With respect to entity labels, the embodiment of the present invention may convert each entity label into the sequence-labeling BIO form in advance, i.e., if an entity consists of only one word, that word is labeled "B-entity category"; if an entity consists of multiple words, the first word of the entity is labeled "B-entity category" and the remaining words are labeled "I-entity category"; all other non-entity words are labeled "O".
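The BIO conversion described above can be sketched as follows; the token list and the entity category name "TIME" are illustrative choices, not from the source:

```python
def bio_tags(tokens, entities):
    """Convert (start, end, category) entity spans to BIO sequence labels.

    end is exclusive; non-entity tokens are labeled 'O'.
    """
    tags = ["O"] * len(tokens)
    for start, end, cat in entities:
        tags[start] = f"B-{cat}"          # first word of the entity
        for j in range(start + 1, end):
            tags[j] = f"I-{cat}"          # remaining words of the entity
    return tags

# Hypothetical corpus "the winter is too beautiful" with entity "winter" (TIME).
print(bio_tags(["the", "winter", "is", "too", "beautiful"], [(1, 2, "TIME")]))
# ['O', 'B-TIME', 'O', 'O', 'O']
```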

Regarding the semantic relationship labels, the embodiment of the present invention divides them into the following two categories: first, if no semantic relationship exists between a word and any other word in a sentence, the semantic relationship label of the word points to the word itself and corresponds to "None"; second, if a dependency relationship exists between a word and another word, the semantic relationship label of the word points to the related word and corresponds to the relationship category between the two words.

Preferably, for convenience of calculation, all entities and semantic relationships in the training corpus may be represented in matrix form. For example, for a sentence of length N, assuming there are M semantic relationship classes in total, the mixed representation matrix of entities and semantic relationships has dimension N × N, and its (i, j)-th element m indicates that the i-th word and the j-th word hold the semantic relationship with sequence number m, where i, j ∈ [1, N] and m ∈ [1, M].
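The mixed representation described above can be sketched as an N × N integer matrix; using 0 for "no relation" and 0-based indices are assumed conventions for this illustration:

```python
def relation_matrix(n, relations):
    """Build the N x N mixed entity/relation matrix: element (i, j) = m means
    words i and j hold the relation with sequence number m (0 = no relation).
    Indices here are 0-based.
    """
    mat = [[0] * n for _ in range(n)]
    for i, j, m in relations:
        mat[i][j] = m
    return mat

# 4-word sentence; words 0 and 2 hold the relation with sequence number 3.
mat = relation_matrix(4, [(0, 2, 3)])
print(mat[0][2], mat[1][1])  # 3 0
```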

And b, respectively converting the training corpus set, the entity tag set and the semantic relation tag set into a training vector set, an entity vector set and a semantic relation vector set according to a pre-constructed word vector conversion method.

In the preferred embodiment of the present invention, the pre-constructed word vector transformation method can use the currently open-source ELMo Model (Embedding from Language Model). The embodiment of the invention respectively inputs the training corpus set, the entity tag set and the semantic relationship tag set into the ELMo model to obtain the training vector set, the entity vector set and the semantic relationship vector set.

And c, inputting the training vector set into a pre-constructed coding layer to perform coding operation to obtain a coding training set.

The coding layer is mainly used for representing the information of the input word vectors. In the embodiment of the present invention, the coding layer comprises a BiLSTM encoding stage, an attention calculation stage, a dimension splicing stage and an integration stage.

In detail, the detailed implementation flow of step c includes:

and c1, inputting the training vector set into a BilSTM coding network to obtain an original coding word vector set.

In a preferred embodiment of the present invention, the BiLSTM coding network is a known technology; a currently published, trained BiLSTM coding network can be obtained from the Internet or a public forum, and the training vector set can be encoded with it to obtain the original encoded word vector set.

And c2, calculating multiple groups of attention values of the original encoding word vector set according to the vector dimensions of the original encoding word vector set.

Further, the calculating multiple sets of attention values of the original encoding word vector set according to the vector dimensions of the original encoding word vector set includes: initializing a plurality of groups of attention matrixes; calculating a projection matrix of a plurality of groups of attention matrixes according to a pre-constructed projection equation; and taking the vector dimensions of the plurality of groups of projection matrixes and the original coding word vector set as input values of a pre-constructed attention calculation function, and calculating the attention calculation function to obtain a plurality of groups of attention values.

In detail, the attention matrices are Q, K and V, and the pre-constructed projection equations are as follows:

Q′_i = Q·W_i^Q

K′_i = K·W_i^K

V′_i = V·W_i^V

wherein K is a key value, which may be preset; i = 1, 2, …, h, where h is the preset number of attention heads; W_i^Q represents the i-th transformation matrix relative to the attention matrix Q, W_i^K represents the i-th transformation matrix relative to the key value K, W_i^V represents the i-th transformation matrix relative to the value matrix V, and Q′_i, K′_i and V′_i represent the projection matrices. Further, Q is also referred to as the query matrix related to the original encoded word vector set, K is the key matrix of the query matrix, and V is the value matrix related to the query matrix and the key matrix.

Further, in the preferred embodiment of the present invention, the attention calculation function is as follows:

head_i = Attention(Q′_i, K′_i, V′_i) = softmax(Q′_i·K′_i^T / √d_bilstm)·V′_i

wherein head_i is the attention value of the i-th head, Attention represents the attention calculation function, and d_bilstm represents the vector dimension of the original encoded word vector set.

And c3, carrying out dimension splicing on the plurality of groups of attention values to obtain a coding adjusting word vector set.

There are various dimension splicing methods. In the embodiment of the present invention, the encoding adjustment word vector set may be obtained by arranging all the attention values according to the specified rows and columns, i.e., concatenating the outputs of the attention heads along the feature dimension.

And c4, integrating the original encoding word vector set and the encoding adjusting word vector set to obtain the encoding training set.

In the embodiment of the present invention, an integration operation of adding the original encoded word vector set and the encoding adjustment word vector set may be performed to obtain the encoding training set. In detail, the mathematical expression of the integration operation is as follows:

h_self = h_mul + h_BiLSTM

wherein h_BiLSTM represents the original encoded word vector set, h_self represents the encoding training set, and h_mul represents the encoding adjustment word vector set.

And d, inputting the coding training set into a pre-constructed decoding layer for decoding operation to obtain a predicted entity set and a predicted semantic relation set.

The decoding layer is mainly used for extracting information of a coding training set, and in the embodiment of the invention, the decoding layer comprises a conditional random field model, a fully-connected neural network, a threshold function cleaning stage and a screening stage.

In detail, the detailed implementation flow of the step d comprises:

and d1, recognizing the entity vectors included in the coding training set according to the conditional random field model to obtain an entity vector set.

In detail, the Conditional Random Field (CRF) model is a currently published model capable of automatically identifying entities.

And d2, calculating semantic relationship scores among the encoded word vectors in the encoded word vector set according to the pre-constructed fully connected neural network to obtain the semantic relationship diversity.

further, suppose that there is a sentence A consisting of n w1,w2,…,wnThe words are formed, then any word wkAnd the word wiIf the corresponding code word vector is hkAnd hiThen, the following calculation method may be adopted to calculate the semantic relationship score between the encoding word vectors in the encoding word vector set:

wherein s is(r)(hk,hi) Represents hkAnd hiThe score of the semantic relationship of (a),representing clauses, V, composing sentence A(r)、W(r)、U(r)、G(r)、b(r)For the internal parameters of the fully-connected neural network, f represents an activation function.

Further, theThe composition mode is as follows:

wherein:

in detail, hcut=[hst+1,hst+2,…,hend-1],mean(hcut) Denotes the mean value, st is min (k, i), end is max (k, i),representing a value of a preset parameter, dencoderRepresenting the encoded word vector. And d3, cleaning the semantic relation according to a pre-constructed threshold function to obtain diversity, and obtaining a vector set of the decoded words.

In detail, the step of clearing the semantic relationship diversity according to the pre-constructed threshold function to obtain the decoded word vector set includes: taking the semantic relationship diversity as an input parameter of the threshold function, calculating the threshold function to obtain the probability distribution of the semantic relationship diversity, and clearing the semantic relationship diversity according to the probability distribution to obtain the decoded word vector set.

And d4, screening the encoding word vector set according to the decoding word vector set to obtain a semantic relation set.

In the preferred embodiment of the present invention, it can be seen from step d3 that the threshold function is a sigmoid function, whose output values lie in the range (0, 1). The screening rule of the embodiment of the present invention may therefore retain the encoded word vectors corresponding to the decoded word vectors whose values exceed a preset threshold (for example, 0.5), thereby obtaining the semantic relationship set.

And e, calculating first loss values of the predicted entity set and the entity vector set, calculating second loss values of the predicted semantic relation set and the semantic relation vector set, and calculating a total loss value according to the first loss values and the second loss values.

In detail, the first loss value may be calculated by using a log-likelihood function that is currently disclosed, the second loss value may be calculated by using a cross-entropy function, and the total loss value may be calculated by adding the first loss value and the second loss value.

F, judging whether the total loss value is greater than a preset loss value;

and g, if the total loss value is greater than the preset loss value, optimizing the internal parameters of the coding layer and the decoding layer according to a pre-constructed optimization function, and returning to the step c.

Wherein, the pre-constructed optimization function can adopt the Adam algorithm, the random gradient descent method and the like which are disclosed currently.

And h, outputting the trained coding layer and the trained decoding layer if the total loss value is smaller than the preset loss value.

And step two, obtaining the linguistic data to be recognized, and performing word vector conversion on the linguistic data to be recognized according to a pre-constructed word vector conversion method to obtain a word vector set to be recognized.

The corpus to be recognized may be presented in various forms, such as a piece of speech or a piece of text. For example, a user inputs a piece of speech: "I like Guangdong and Fujian, but do not like Heilongjiang, because Heilongjiang is too cold."

In the embodiment of the present invention, the pre-constructed word vector conversion method described here is the same as that described in step b above.

And thirdly, carrying out original coding on the word vector set to be recognized by utilizing a coding layer which is trained in advance to obtain an original coding word vector set, calculating multiple groups of attention values of the original coding word vector set, and solving the coding word vector set according to the multiple groups of attention values and the original coding word vector set.

In the embodiment of the present invention, the encoding operation may refer to the encoding operation method in step c.

And step four, recognizing the entity sequence of the encoding word vector set by using the decoding layer finished by pre-training to obtain an entity set, and decoding the encoding word vector set to obtain a decoding word vector set.

And step five, inputting the vector set of the decoded words into a preset probability distribution function to obtain a semantic relation set.

In the embodiment of the present invention, the detailed execution method of the step four and the step five may refer to the description in the step d.

Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).

The computer-readable storage medium has stored thereon an entity and semantic relationship recognition program that is executable by one or more processors to perform operations comprising:

constructing an entity and semantic relation recognition model, and training a coding layer and a decoding layer which are included in the entity and semantic relation recognition model;

performing word vector conversion on the corpus to be recognized to obtain a word vector set to be recognized;

performing original coding on the word vector set to be recognized by utilizing a coding layer which is trained in advance to obtain an original coding word vector set, calculating multiple groups of attention values of the original coding word vector set, and solving a coding word vector set according to the multiple groups of attention values and the original coding word vector set;

identifying an entity sequence of the encoding word vector set by using a decoding layer which is pre-trained to obtain an entity set, and performing decoding operation on the encoding word vector set to obtain a decoding word vector set;

and inputting the decoded word vector set to a preset probability distribution function to obtain a semantic relation set.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not denote any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
