Named entity recognition method and device for Chinese sentences

Document No.: 661727  Publication date: 2021-04-27

Reading note: this technology, "A named entity recognition method and device for Chinese sentences," was designed and created by 吴旭, 颉夏青, 吴京宸, 彭湃, 邱莉榕, 张勇东, 方滨兴 and 张熙 on 2020-12-22. Its main content is as follows: the invention discloses a named entity recognition method for Chinese sentences, comprising: a Chinese character sequence is input into a recognition model; a character embedding layer in the model converts the sequence into character vectors and outputs them to a convolutional network in the model; the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector and outputs it to an adaptive combination layer in the model; the adaptive combination layer performs attention calculation on each character's local semantic vector, concatenates the result with the corresponding character vector to obtain a representation vector, and outputs it to a sequence modeling network in the model; the sequence modeling network performs hidden-layer modeling on each character's representation vector and outputs the resulting hidden vector to a label inference layer in the model, which computes the label corresponding to each character's hidden vector.
After the convolutional network extracts each character's local semantic information, character-word attention fuses it with the latent words, so that latent word information is exploited and the propagation of word-boundary errors is avoided.

1. A method for identifying a named entity of a Chinese sentence is characterized by comprising the following steps:

inputting a Chinese character sequence into a trained entity recognition model, wherein a character embedding layer in the entity recognition model converts each character in the Chinese character sequence into a character vector and outputs it to a convolutional network in the entity recognition model; the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector and outputs it to an adaptive combination layer in the entity recognition model; the adaptive combination layer performs attention calculation on the local semantic vector of each character and then concatenates the result with the corresponding character vector to obtain a representation vector, which is output to a sequence modeling network in the entity recognition model; and the sequence modeling network performs hidden-layer modeling on the representation vector of each character and outputs the resulting hidden vector to a label inference layer in the entity recognition model, which computes the label corresponding to the hidden vector of each character;

and obtaining the label sequence output by the entity recognition model as the named entity recognition result.

2. The method of claim 1, wherein the character embedding layer converts each character in the Chinese character sequence into a character vector, comprising:

for each character in the Chinese character sequence, looking up the character vector corresponding to that character in a trained character vector table.

3. The method of claim 1, wherein the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector, comprising:

for each character vector, performing a convolution operation on the character vector through a first convolutional layer in the convolutional network and outputting the result to a second convolutional layer of the convolutional network;

and performing, by the second convolutional layer, a convolution operation on the vector obtained by the first convolutional layer to obtain the local semantic vector of the character vector.

4. The method of claim 1, wherein the adaptive combination layer performs attention calculation on the local semantic vector of each character and then concatenates the result with the corresponding character vector to obtain a representation vector, comprising:

receiving a word vector matrix of all latent words corresponding to each character, output by a latent word embedding layer in the entity recognition model;

and for each character, performing attention calculation between the local semantic vector of the character and the word vector matrix, and concatenating the attention result with the character vector of the character to obtain the representation vector of the character.

5. The method of claim 4, wherein the latent word embedding layer obtains the word vector matrix of all latent words corresponding to each character, comprising:

matching the Chinese character sequence against a pre-constructed trie to obtain substrings;

matching each substring against a trained dictionary to obtain the successfully matched substrings;

associating each successfully matched substring with the characters it contains to obtain the latent word set of each character;

and for each character, looking up the word vector of each latent word in the character's latent word set in a word vector table to form the word vector matrix of the character.

6. The method of claim 1, wherein the sequence modeling network performs hidden-layer modeling on the representation vector of each character, comprising:

performing hidden-layer modeling on the representation vector of each character through a forward long short-term memory network in the sequence modeling network to obtain a forward hidden vector for each character;

performing hidden-layer modeling on the representation vector of each character through a backward long short-term memory network in the sequence modeling network to obtain a backward hidden vector for each character;

and concatenating the forward hidden vector and the backward hidden vector of each character through a concatenation layer in the sequence modeling network to obtain the hidden vector of each character.

7. An apparatus for recognizing a named entity of a Chinese sentence, the apparatus comprising:

an entity recognition module configured to input a Chinese character sequence into a trained entity recognition model, so that a character embedding layer in the entity recognition model converts each character in the Chinese character sequence into a character vector and outputs it to a convolutional network in the entity recognition model; the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector and outputs it to an adaptive combination layer in the entity recognition model; the adaptive combination layer performs attention calculation on the local semantic vector of each character and then concatenates the result with the corresponding character vector to obtain a representation vector, which is output to a sequence modeling network in the entity recognition model; and the sequence modeling network performs hidden-layer modeling on the representation vector of each character and outputs the resulting hidden vector to a label inference layer in the entity recognition model, which computes the label corresponding to the hidden vector of each character;

and an obtaining module configured to obtain the label sequence output by the entity recognition model as the named entity recognition result.

8. The apparatus of claim 7, wherein the entity recognition module is specifically configured to, in the process of converting each character of the Chinese character sequence into a character vector through the character embedding layer, look up, for each character in the Chinese character sequence, the character vector corresponding to that character in a trained character vector table.

9. An electronic device comprising a readable storage medium and a processor;

wherein the readable storage medium is configured to store machine executable instructions;

the processor configured to read the machine executable instructions on the readable storage medium and execute the instructions to implement the steps of the method of any one of claims 1-6.

10. A chip comprising a readable storage medium and a processor;

wherein the readable storage medium is configured to store machine executable instructions;

the processor configured to read the machine executable instructions on the readable storage medium and execute the instructions to implement the steps of the method of any one of claims 1-6.

Technical Field

The invention relates to the technical field of natural language processing, in particular to a method and a device for recognizing a named entity of a Chinese sentence.

Background

The main task of named entity recognition is to identify entities with specific meanings in unstructured text, chiefly names of people, places, and organizations, proper nouns, and the like. Together with tasks such as word segmentation and dependency parsing, it is among the most fundamental tasks in natural language processing, serving as a cornerstone for many downstream tasks, whose attainable performance is to a great extent determined by the quality of entity recognition. In the information extraction task especially, it is a decisive foundational task.

Named entity recognition of Chinese sentences is an important sub-topic in the field of Chinese natural language processing. However, owing to the diversity of Chinese expressions, an entity's semantics are usually highly dependent on the surrounding context, and because Chinese text lacks separators between words, word boundaries are fuzzy, which makes Chinese entity recognition difficult. In addition, since mainstream entity recognition is performed as sequence labeling, annotating training sets is costly, so many entity recognition tasks suffer limited model performance for lack of sufficient training data.

Disclosure of Invention

The invention aims to provide a method and a device for recognizing named entities of Chinese sentences that address the defects of the prior art; this aim is achieved by the following technical scheme.

A first aspect of the invention provides a named entity recognition method for Chinese sentences, comprising the following steps:

inputting a Chinese character sequence into a trained entity recognition model, wherein a character embedding layer in the entity recognition model converts each character in the Chinese character sequence into a character vector and outputs it to a convolutional network in the entity recognition model; the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector and outputs it to an adaptive combination layer in the entity recognition model; the adaptive combination layer performs attention calculation on the local semantic vector of each character and then concatenates the result with the corresponding character vector to obtain a representation vector, which is output to a sequence modeling network in the entity recognition model; and the sequence modeling network performs hidden-layer modeling on the representation vector of each character and outputs the resulting hidden vector to a label inference layer in the entity recognition model, which computes the label corresponding to the hidden vector of each character;

and obtaining the label sequence output by the entity recognition model as the named entity recognition result.

A second aspect of the present invention provides a named entity recognition apparatus for a Chinese sentence, the apparatus comprising:

an entity recognition module configured to input a Chinese character sequence into a trained entity recognition model, so that a character embedding layer in the entity recognition model converts each character in the Chinese character sequence into a character vector and outputs it to a convolutional network in the entity recognition model; the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector and outputs it to an adaptive combination layer in the entity recognition model; the adaptive combination layer performs attention calculation on the local semantic vector of each character and then concatenates the result with the corresponding character vector to obtain a representation vector, which is output to a sequence modeling network in the entity recognition model; and the sequence modeling network performs hidden-layer modeling on the representation vector of each character and outputs the resulting hidden vector to a label inference layer in the entity recognition model, which computes the label corresponding to the hidden vector of each character;

and an obtaining module configured to obtain the label sequence output by the entity recognition model as the named entity recognition result.

A third aspect of the invention provides an electronic device comprising a readable storage medium and a processor;

wherein the readable storage medium is configured to store machine executable instructions;

the processor is configured to read the machine executable instructions on the readable storage medium and execute the instructions to implement the steps of the method according to the first aspect.

A fourth aspect of the invention proposes a chip comprising a readable storage medium and a processor;

wherein the readable storage medium is configured to store machine executable instructions;

the processor is configured to read the machine executable instructions on the readable storage medium and execute the instructions to implement the steps of the method according to the first aspect.

The method and device for recognizing named entities of Chinese sentences described above have the following advantages:

the Chinese character sequence is input into the entity recognition model; after the convolutional network extracts each character's local semantic information, adaptive inter-word attention between the characters and their corresponding latent words fuses character and word information. Latent word information is thereby exploited fully and reasonably, the propagation of word-boundary errors is avoided, and the Chinese entity recognition task is optimized.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a flowchart illustrating an embodiment of a method for named entity recognition of a Chinese sentence according to an exemplary embodiment of the present invention;

FIG. 2 is a diagram illustrating an entity recognition model architecture in accordance with an exemplary embodiment of the present invention;

FIG. 3 is a diagram illustrating an example of dictionary matching in accordance with one illustrative embodiment of the present invention;

FIG. 4 is a diagram illustrating a hardware configuration of an electronic device in accordance with an exemplary embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a named entity recognition apparatus for a Chinese sentence according to an exemplary embodiment of the present invention.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present invention. The word "if" as used herein may be interpreted as "upon", "when", or "in response to determining", depending on the context.

In the prior art, the word-boundary ambiguity problem significantly affects Chinese entity recognition. For example, the sentence "Beijing city Changchun bridge" may be segmented during Chinese word segmentation as "Beijing / city-long / spring bridge", so that "Beijing city (ns) / Changchun bridge (ns)" is misrecognized as "Beijing (ns) / city-long / spring bridge (nr)", resulting in a loss of word information.

In order to solve the above technical problems, the present invention provides an improved method for identifying a named entity in chinese, which is described in detail in the following embodiments.

Fig. 1 is a flowchart of an embodiment of a method for recognizing named entities of a Chinese sentence according to an exemplary embodiment of the present invention; the method may be applied to an electronic device. It is described in combination with the entity recognition model structure illustrated in Fig. 2, which is obtained by training in advance and comprises a character embedding layer, a convolutional network, an adaptive combination layer, a sequence modeling network, a label inference layer, and a latent word embedding layer. As shown in Fig. 1, the method for recognizing named entities of a Chinese sentence includes the following steps:

step 101: inputting Chinese character sequence into trained entity recognition model, converting each character in the Chinese character sequence into word vector by the entity recognition model through character embedding layer, and outputting to convolution network in the entity recognition model, the convolution network carries out convolution operation on each word vector to obtain a local semantic vector and outputs the local semantic vector to the self-adaptive combined layer in the entity recognition model, the local semantic vector of each character is subjected to attention calculation by the self-adaptive binding layer and then spliced with the corresponding word vector to obtain a characterization vector, and the characterization vector is output to a sequence modeling network in the entity recognition model, and performing hidden layer modeling on the characterization vector of each character by using the sequence modeling network, and outputting the hidden layer vector obtained by modeling to a label inference layer in the entity recognition model to calculate a label corresponding to the hidden layer vector of each character.

In one embodiment, the character embedding layer converts each character in the Chinese character sequence into a character vector by looking up, for each character, the corresponding character vector in a trained character vector table.

Specifically, referring to the output of the character embedding layer shown in Fig. 2, an input Chinese sentence can be regarded as a Chinese character sequence $s = \{c_1, c_2, \ldots, c_n\} \in V_c$, where $n$ is the length of the input sentence and $V_c$ is the character dictionary. Each character $c_i$ is embedded according to the pre-trained character vector table $e^c \in \mathbb{R}^{m_c \times d_c}$ to obtain the corresponding character vector, where $d_c$ is the dimension of the character vector and $m_c$ is the size of the character vocabulary:

$$x_i^c = e^c(c_i)$$

The character vector sequence output by the character embedding layer is thus $\{x_1^c, x_2^c, \ldots, x_n^c\}$.
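As a minimal, hypothetical sketch of this lookup (the table contents, dimension, and function name are illustrative stand-ins, not taken from the patent):

```python
import numpy as np

def embed_characters(sentence, char_table, unk="<UNK>"):
    """Look up the character vector x_i^c for each character c_i in a table e^c."""
    vectors = []
    for ch in sentence:
        # unknown characters fall back to the <UNK> row of the table
        vec = char_table.get(ch, char_table[unk])
        vectors.append(vec)
    return np.stack(vectors)  # shape (n, d_c)

# Toy table with d_c = 4; values stand in for trained embeddings.
rng = np.random.default_rng(0)
table = {ch: rng.normal(size=4) for ch in "北京市长春桥"}
table["<UNK>"] = np.zeros(4)

X = embed_characters("北京市", table)  # (3, 4) matrix of character vectors
```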

In this embodiment, the local semantic information of a sentence plays an important role in entity recognition: for example, "zhang sui lu" and "zhuang sui" share the same characters, yet the former is a place name and the latter a person name. At the same time, local semantic information also provides local semantic support for the subsequent inter-word attention calculation.

Accordingly, when the convolutional network performs a convolution operation on each character vector to obtain its local semantic vector, each character vector may first be convolved by the first convolutional layer in the convolutional network and output to the second convolutional layer; the second convolutional layer then convolves the vector produced by the first layer to obtain the local semantic vector of the character vector.

In the convolutional network, the first convolutional layer comprises several convolution kernels of one size and the second convolutional layer comprises several convolution kernels of another size; the kernel sizes of the two layers differ.

Specifically, as shown in Fig. 2, the input Chinese character sequence $s$ passes through the character embedding layer to yield the character vector sequence $\{x_1^c, x_2^c, \ldots, x_n^c\}$. Let $W \in \mathbb{R}^{k \times d_c}$ denote a convolution kernel in the convolutional network, where $k$ is the size of the kernel and $d_c$ the dimension of the character vector. The vector of local semantics contained in the $i$-th character is computed by this kernel as:

$$a_i = f(W \otimes X_i + b)$$

where $X_i$ is the character-embedding concatenation matrix of the context window of width $k$ centered on the $i$-th character, and $f$ is an activation function, specifically ReLU.

The second convolutional layer in the convolutional network learns local semantic information using multiple convolution kernels; let their number be $d_{cnn}$. The local semantic vector of the $i$-th character is the concatenation of the outputs of all convolution kernels in the second convolutional layer, so the output of the convolutional network is $a = \{a_1, a_2, \ldots, a_n\}$, where $a_i \in \mathbb{R}^{d_{cnn}}$.
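The two-layer convolution described above can be sketched as follows; this is a rough illustration (kernel sizes, dimensions, and the zero-padding choice are assumptions, not the patent's exact configuration):

```python
import numpy as np

def conv1d_same(X, W, b, f=lambda z: np.maximum(z, 0)):
    """1-D convolution over the character axis with zero padding ('same'),
    computing a_i = f(W (*) X_i + b) for each position i. W has shape
    (k, d_in, d_out); X has shape (n, d_in)."""
    k, d_in, d_out = W.shape
    n = X.shape[0]
    pad = k // 2
    Xp = np.vstack([np.zeros((pad, d_in)), X, np.zeros((pad, d_in))])
    out = np.empty((n, d_out))
    for i in range(n):
        window = Xp[i:i + k]  # (k, d_in) context window centered on char i
        out[i] = f(np.tensordot(window, W, axes=([0, 1], [0, 1])) + b)
    return out

n, d_c = 6, 8
X = np.random.default_rng(1).normal(size=(n, d_c))
# First layer: kernel size 3; second layer: a different size (5), d_cnn = 16.
W1, b1 = np.random.default_rng(2).normal(size=(3, d_c, 12)), np.zeros(12)
W2, b2 = np.random.default_rng(3).normal(size=(5, 12, 16)), np.zeros(16)
A = conv1d_same(conv1d_same(X, W1, b1), W2, b2)  # local semantic vectors (n, d_cnn)
```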

It should be noted that, because Chinese words are usually the smallest semantic units for understanding Chinese, the invention introduces Chinese word information to improve the effect of Chinese entity recognition while avoiding the propagation of word-boundary errors.

Accordingly, referring to the latent word embedding layer and the adaptive combination layer shown in Fig. 2, the use of Chinese word information (i.e., latent word information) proceeds in two steps:

First, for the latent word embedding layer to obtain the word vector matrix of all latent words corresponding to each character: the Chinese character sequence is matched against a pre-constructed trie to obtain substrings; each substring is matched against a trained dictionary to identify the successfully matched substrings; each successfully matched substring is then associated with the characters it contains, yielding the latent word set of each character; finally, for each character, the word vector of each latent word in the character's latent word set is looked up in a word vector table to form the word vector matrix of the character.

The trie is constructed from the words of the pre-trained dictionary.

Specifically, a large dictionary $D$ and a word vector table $e^w \in \mathbb{R}^{m_w \times d_w}$ are obtained by pre-training on a large-scale corpus, where $m_w$ is the size of the word vector table and $d_w$ the dimension of a word vector. All substrings of the Chinese character sequence are matched against the dictionary $D$ to obtain all latent words, and each successfully matched substring is associated with the characters it contains, yielding the latent word set of each character:

$$A(c_i) = \{\, w_{b,e} \mid w_{b,e} \in D,\ b \le i \le e \,\}$$

where $w_{b,e}$ denotes the substring from the $b$-th to the $e$-th character. It should be noted that if the latent word set of a character is empty, it is filled with "NONE".

For a non-empty latent word set, the word vector matrix is obtained by querying the word vector table $e^w$ for each latent word:

$$W_i = [\, e^w(w) \mid w \in A(c_i) \,]$$

referring to fig. 2 and fig. 3, taking "zhang san jia is on the north and east of the river" as an example, after dictionary matching, a potential word set of each character can be obtained, for example, the potential word set corresponding to "east" is:

A(c6)={w5,6("Jiangdong"), w6,6("east"), w6,7("northeast"), w5,8("north and east of the river") }
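The trie-based latent-word matching can be sketched as below. This is a hypothetical minimal implementation: the toy dictionary and function names are illustrative, and empty sets are filled with "NONE" as the text specifies.

```python
def build_trie(words):
    """Build a character trie; '$' marks the end of a dictionary word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = w
    return root

def latent_word_sets(sentence, trie):
    """Match every substring against the trie and associate each matched
    word w_{b,e} with every character position b..e it covers."""
    n = len(sentence)
    sets = [[] for _ in range(n)]
    for b in range(n):
        node = trie
        for e in range(b, n):
            node = node.get(sentence[e])
            if node is None:
                break  # no dictionary word starts with this prefix
            if "$" in node:                   # sentence[b:e+1] is a word
                for i in range(b, e + 1):     # associate with covered chars
                    sets[i].append(node["$"])
    return [s or ["NONE"] for s in sets]      # empty sets filled with "NONE"

trie = build_trie(["北京", "北京市", "市长", "长春", "长春桥", "春桥"])
sets = latent_word_sets("北京市长春桥", trie)
```

Each character's latent word set would then be mapped through the word vector table to form its word vector matrix.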

Second, for the adaptive combination layer to perform attention calculation on the local semantic vector of each character and then concatenate the result with the corresponding character vector to obtain the representation vector: the layer receives the word vector matrix of all latent words corresponding to each character, output by the latent word embedding layer in the entity recognition model; then, for each character, it performs attention calculation between the character's local semantic vector and the word vector matrix, and concatenates the attention result with the character's character vector to obtain the character's representation vector.

Specifically, because the latent words corresponding to a character are mutually exclusive and only one of them matches the true semantics, the method weights the latent words by computing inter-word attention. The attention between the local semantic vector $a_i$ of a character and its word vector matrix $W_i$ takes the standard multi-head scaled-dot-product form:

$$\mathrm{Attn}(a_i, W_i) = \mathrm{softmax}\!\left(\frac{(a_i W^Q)(W_i W^K)^{\top}}{\sqrt{d_{head1}}}\right) W_i W^V$$

where $W^Q$, $W^K$, and $W^V$ are parameter matrices of the adaptive combination layer, $d_{model1} = h_{cw} \times d_{head1}$ with $d_{model1}$ equal to the dimension of the local semantic vector output by the convolutional network, $h_{cw}$ is the number of attention heads, $d_{head1}$ is the vector dimension of one attention head, and softmax is the normalization function.
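A rough sketch of this inter-word attention follows, reduced to a single head for brevity; the matrix names and dimensions are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def char_word_attention(a_i, W_words, Wq, Wk, Wv):
    """Single-head scaled dot-product attention: the character's local
    semantic vector a_i queries the latent-word vector matrix W_words."""
    q = a_i @ Wq                                   # query       (d_head,)
    K = W_words @ Wk                               # keys        (m, d_head)
    V = W_words @ Wv                               # values      (m, d_head)
    scores = softmax(K @ q / np.sqrt(q.shape[0]))  # weights     (m,)
    return scores @ V                              # attended    (d_head,)

rng = np.random.default_rng(4)
d_cnn, d_w, d_head, m = 16, 8, 16, 4
a_i = rng.normal(size=d_cnn)             # local semantic vector of char i
W_words = rng.normal(size=(m, d_w))      # vectors of the m latent words
Wq = rng.normal(size=(d_cnn, d_head))
Wk = rng.normal(size=(d_w, d_head))
Wv = rng.normal(size=(d_w, d_head))
attn = char_word_attention(a_i, W_words, Wq, Wk, Wv)
```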

The attention result is then combined with the character's local semantic vector through a residual connection and concatenated with the character's character vector to obtain the character's representation vector $y_i$:

$$y_i = \left[\, a_i + \mathrm{Attn}(a_i, W_i)\ ;\ x_i^c \,\right]$$

in one embodiment, referring to FIG. 2, the token vector for each character output by the adaptive join layer is input into a sequence modeling network to better model the sequence dependency between characters. The specific implementation process comprises the following steps: the method comprises the steps of performing hidden layer modeling on a characterization vector of each character through a forward long-short time memory network in a sequence modeling network to obtain a forward hidden layer vector of each character, performing hidden layer modeling on the characterization vector of each character through a backward long-short time memory network in the sequence modeling network to obtain a backward hidden layer vector of each character, and finally splicing the forward hidden layer vector and the backward hidden layer vector of each character through a splicing layer in the sequence modeling network to obtain a hidden layer vector of each character, wherein the sequence of hidden layer vectors is represented as H { H ═ H { (H) } H { (H } H1,h2,...,hn}。

The forward and backward networks are both LSTM (Long Short-Term Memory) networks.
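The bidirectional modeling above can be sketched with a minimal LSTM cell; this is a toy implementation under assumed dimensions, not the patent's trained network:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_run(Y, W, U, b, d_h):
    """Run a single-layer LSTM over the rows of Y; return all hidden states.
    Gate weights are stacked as (i, f, o, g) in W, U, b."""
    h, c = np.zeros(d_h), np.zeros(d_h)
    out = []
    for y in Y:
        z = W @ y + U @ h + b
        i = sigmoid(z[0 * d_h:1 * d_h])   # input gate
        f = sigmoid(z[1 * d_h:2 * d_h])   # forget gate
        o = sigmoid(z[2 * d_h:3 * d_h])   # output gate
        g = np.tanh(z[3 * d_h:])          # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
        out.append(h)
    return np.stack(out)

def bilstm(Y, params_fwd, params_bwd, d_h):
    Hf = lstm_run(Y, *params_fwd, d_h)              # forward pass
    Hb = lstm_run(Y[::-1], *params_bwd, d_h)[::-1]  # backward pass, realigned
    return np.concatenate([Hf, Hb], axis=1)         # h_i = [h_i_fwd ; h_i_bwd]

rng = np.random.default_rng(5)
n, d_y, d_h = 6, 10, 7
Y = rng.normal(size=(n, d_y))  # representation vectors y_1..y_n
mk = lambda: (rng.normal(size=(4 * d_h, d_y)) * 0.1,
              rng.normal(size=(4 * d_h, d_h)) * 0.1,
              np.zeros(4 * d_h))
H = bilstm(Y, mk(), mk(), d_h)  # hidden vector sequence, shape (n, 2*d_h)
```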

As will be understood by those skilled in the art, the label inference layer in the entity recognition model may use a conditional random field (CRF) algorithm to find, among all possible label sequences, the sequence with the highest conditional probability, which is taken as the final label sequence.
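The highest-probability sequence search a CRF decoder performs is the Viterbi algorithm; a minimal sketch with made-up scores (the label set and score matrices here are illustrative):

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring label sequence given per-character emission
    scores (n, L) and label-transition scores (L, L)."""
    n, L = emissions.shape
    score = emissions[0].copy()        # best score ending in each label at t=0
    back = np.zeros((n, L), dtype=int)
    for t in range(1, n):
        # total[p, q] = best path ending in label p at t-1, then moving to q
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):      # trace back pointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]

rng = np.random.default_rng(6)
em = rng.normal(size=(6, 5))  # 6 characters, 5 labels
tr = rng.normal(size=(5, 5))
labels = viterbi_decode(em, tr)
```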

Step 102: and acquiring a label sequence output by the entity recognition model and taking the label sequence as a named entity recognition result.

Based on Fig. 2 above, for the input Chinese character sequence "Zhang San in the northeast of the Yangtze River", the named entity recognition result output by the entity recognition model shown in Fig. 2 is B-NR, E-NR, O, B-NS, M-NS, E-NS, where "B" denotes the beginning of an entity, "E" the end of an entity, "M" a middle position of an entity, "NR" a person-name entity, "NS" a place-name entity, and "O" other.
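Turning such a tag sequence into entity spans can be sketched as follows; the function name is illustrative, and single-character "S-" tags are handled for completeness even though the example above does not use them:

```python
def tags_to_entities(tags):
    """Decode a B/M/E/S + type tag sequence (with O = other) into
    (start, end, type) entity spans, inclusive on both ends."""
    entities, start, etype = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
        elif tag.startswith("B-"):
            start, etype = i, tag[2:]           # open a new entity
        elif tag.startswith("E-") and start is not None and tag[2:] == etype:
            entities.append((start, i, etype))  # close the open entity
            start = None
        elif tag.startswith("S-"):
            entities.append((i, i, tag[2:]))    # single-character entity
            start = None
        # M-* tags simply continue the open entity
    return entities

tags = ["B-NR", "E-NR", "O", "B-NS", "M-NS", "E-NS"]
ents = tags_to_entities(tags)  # person at chars 0-1, place at chars 3-5
```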

For the process of steps 101 to 102, the proposed scheme was compared with existing entity recognition models on the same data set, evaluated by precision (P), recall (R), and F1 score. As shown in Table 1 below, the proposed model outperforms the existing models in precision, recall, and F1 alike.

Model              P       R       F1
Existing model 1   93.66   93.31   93.48
Existing model 2   94.81   94.11   94.46
Proposed model     95.60   95.95   95.77

Table 1

In addition, ablation experiments were conducted by removing the model's main improvements one at a time, to demonstrate each improvement's contribution to the overall performance. As shown in Table 2 below, the first row gives the F1 of the complete model. The second row removes the convolutional network from the complete model and uses the character embedding layer's output directly as the input of the adaptive combination layer; removing it costs 0.3 F1, showing that the convolutional network plays an important role in learning local character information, and that character features fused with local information provide a basis for latent-word attention. The third row removes the adaptive combination layer and uses the convolutional network's output directly as the input of the sequence modeling network; removing it costs 6.6 F1, so the inter-word-attention-based adaptive combination layer is the main contributor to the model's recognition improvement.

Model                                         F1
This scheme (complete model)                  61.01
-CNN (convolutional network removed)          60.70
-CAW (adaptive combination layer removed)     54.41

TABLE 2
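The F1 losses cited in the ablation discussion follow directly from Table 2:

```python
# F1 values from Table 2.
full, no_cnn, no_caw = 61.01, 60.70, 54.41

# Drop in F1 caused by removing each component.
print(f"removing the CNN costs {full - no_cnn:.2f} F1")  # prints 0.31
print(f"removing the CAW costs {full - no_caw:.2f} F1")  # prints 6.60
```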

This completes the process shown in FIG. 1. The Chinese character sequence is input into the entity recognition model; after the local semantic information of each character is extracted by the convolutional network, each character is adaptively combined with its corresponding potential words through attention between characters and words to fuse character and word information. Potential-word information is thereby exploited fully and reasonably, the problem of erroneous propagation of word boundaries is avoided, and the Chinese entity recognition task is optimized.

Fig. 4 is a hardware block diagram of an electronic device according to an exemplary embodiment of the present invention. The electronic device includes: a communication interface 401, a processor 402, a machine-readable storage medium 403, and a bus 404, wherein the communication interface 401, the processor 402, and the machine-readable storage medium 403 communicate with one another via the bus 404. The processor 402 may execute the named entity recognition method for Chinese sentences described above by reading and executing, from the machine-readable storage medium 403, machine-executable instructions corresponding to the control logic of the method; the details are described in the above embodiments and are not repeated here.

The machine-readable storage medium 403 referred to in this disclosure may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: volatile memory, non-volatile memory, or similar storage media. In particular, the machine-readable storage medium 403 may be a RAM (Random Access Memory), a flash Memory, a storage drive (e.g., a hard disk drive), any type of storage disk (e.g., an optical disk, a DVD, etc.), or similar storage medium, or a combination thereof.

Corresponding to the embodiments of the named entity recognition method for Chinese sentences, the present invention further provides embodiments of a named entity recognition apparatus for Chinese sentences.

Fig. 5 is a block diagram of an embodiment of a named entity recognition apparatus for Chinese sentences according to an exemplary embodiment of the present invention. As shown in Fig. 5, the named entity recognition apparatus for Chinese sentences includes:

an entity recognition module 510, configured to input a Chinese character sequence into a trained entity recognition model, wherein: the entity recognition model converts each character in the Chinese character sequence into a character vector through a character embedding layer and outputs the character vectors to a convolutional network in the entity recognition model; the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector and outputs it to an adaptive combination layer in the entity recognition model; the adaptive combination layer performs attention calculation on the local semantic vector of each character and splices the result with the corresponding character vector to obtain a characterization vector, which is output to a sequence modeling network in the entity recognition model; and the sequence modeling network performs hidden-layer modeling on the characterization vector of each character and outputs the resulting hidden-layer vectors to a label inference layer in the entity recognition model, which computes the label corresponding to the hidden-layer vector of each character;

an obtaining module 520, configured to obtain the tag sequence output by the entity recognition model as the named entity recognition result.

In an optional implementation, the entity recognition module 510 is specifically configured to, when converting each character in the Chinese character sequence into a character vector at the character embedding layer, look up, for each character in the Chinese character sequence, the character vector corresponding to that character in a trained character vector table.

In an optional implementation, the entity recognition module 510 is specifically configured to, when the convolutional network performs a convolution operation on each character vector to obtain a local semantic vector, perform a convolution operation on each character vector through a first convolutional layer in the convolutional network and output the result to a second convolutional layer of the convolutional network; the second convolutional layer performs a convolution operation on the vector obtained by the first convolutional layer to obtain the local semantic vector of that character vector.
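The two stacked convolutional layers can be sketched with a toy numpy implementation. All dimensions, weights, and the `conv1d` helper here are illustrative assumptions, not from the patent; "same" padding keeps one local semantic vector per character.

```python
import numpy as np

def conv1d(x, w, b):
    """'Same'-padded 1-D convolution over a (seq_len, dim_in) sequence.
    w: (kernel, dim_in, dim_out); returns (seq_len, dim_out) after ReLU."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([np.tensordot(xp[i:i + k], w, axes=([0, 1], [0, 1]))
                    for i in range(x.shape[0])]) + b
    return np.maximum(out, 0.0)  # ReLU non-linearity

rng = np.random.default_rng(0)
seq_len, emb, hid = 6, 8, 8           # 6 characters, toy dimensions
chars = rng.standard_normal((seq_len, emb))          # character vectors
w1, b1 = rng.standard_normal((3, emb, hid)) * 0.1, np.zeros(hid)
w2, b2 = rng.standard_normal((3, hid, hid)) * 0.1, np.zeros(hid)

# First layer output feeds the second layer, as described above.
local = conv1d(conv1d(chars, w1, b1), w2, b2)
print(local.shape)  # (6, 8): one local semantic vector per character
```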

In an optional implementation, the entity recognition module 510 is specifically configured to, when the adaptive combination layer performs attention calculation on the local semantic vector of each character and splices the result with the corresponding character vector to obtain a characterization vector, receive the word vector matrix of all potential words corresponding to each character output by a potential-word embedding layer in the entity recognition model; and, for each character, perform attention calculation between the local semantic vector of the character and the word vector matrix, and splice the attention calculation result with the character vector of the character to obtain the characterization vector of the character.
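A minimal numpy sketch of this attention-based character-word fusion follows. The function name `fuse`, the dot-product scoring, and all dimensions are assumptions for illustration only.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fuse(local_vec, char_vec, word_mat):
    """Attend from a character's local semantic vector over its potential-word
    vectors, then splice the attention result with the character vector."""
    scores = word_mat @ local_vec            # one score per potential word
    attn = softmax(scores) @ word_mat        # weighted sum of word vectors
    return np.concatenate([char_vec, attn])  # characterization vector

rng = np.random.default_rng(1)
d = 8
local_vec = rng.standard_normal(d)       # from the convolutional network
char_vec = rng.standard_normal(d)        # from the character embedding layer
word_mat = rng.standard_normal((3, d))   # 3 potential words for this character

rep = fuse(local_vec, char_vec, word_mat)
print(rep.shape)  # (16,): character vector spliced with word attention result
```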

In an optional implementation, the entity recognition module 510 is specifically configured to, when the potential-word embedding layer obtains the word vector matrix of all potential words corresponding to each character, match the Chinese character sequence against a pre-constructed trie (dictionary tree) to obtain substrings; match each substring against a trained dictionary to obtain the successfully matched substrings; associate each successfully matched substring with the characters it contains to obtain a potential-word set for each character; and, for each character, look up the word vector of each potential word in the character's potential-word set in a word vector table to form the word vector matrix of the character.
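The trie matching and character association steps can be illustrated with a minimal sketch. Latin letters stand in for Chinese characters, and the `Trie` class and toy dictionary are assumptions for illustration.

```python
class Trie:
    """Minimal dictionary trie for enumerating potential words."""
    def __init__(self, words):
        self.root = {}
        for w in words:
            node = self.root
            for ch in w:
                node = node.setdefault(ch, {})
            node["$"] = True  # end-of-word marker

    def matches_from(self, s, i):
        """All dictionary words in s starting at position i."""
        node, out = self.root, []
        for j in range(i, len(s)):
            node = node.get(s[j])
            if node is None:
                break
            if "$" in node:
                out.append(s[i:j + 1])
        return out

# Toy dictionary and sentence.
trie = Trie(["ab", "abc", "bc", "cd"])
sentence = "abcd"

# Potential-word set per character: every matched substring is associated
# with each character position it covers.
potential = {i: set() for i in range(len(sentence))}
for i in range(len(sentence)):
    for w in trie.matches_from(sentence, i):
        for j in range(i, i + len(w)):
            potential[j].add(w)
print(potential)
```

Character position 1 ("b"), for example, ends up associated with the potential words "ab", "abc", and "bc", all of which cover it.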

In an optional implementation, the entity recognition module 510 is specifically configured to, when the sequence modeling network performs hidden-layer modeling on the characterization vector of each character, perform hidden-layer modeling on the characterization vector of each character through a forward long short-term memory (LSTM) network in the sequence modeling network to obtain a forward hidden-layer vector for each character; perform hidden-layer modeling on the characterization vector of each character through a backward LSTM network in the sequence modeling network to obtain a backward hidden-layer vector for each character; and splice the forward hidden-layer vector and the backward hidden-layer vector of each character through a splicing layer in the sequence modeling network to obtain the hidden-layer vector of each character.
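The bidirectional modeling and splicing can be sketched as follows. A plain tanh recurrent cell stands in for the LSTM, and all dimensions are toy values; this is a structural illustration, not the patent's implementation.

```python
import numpy as np

def run_rnn(xs, w_x, w_h):
    """Simple tanh recurrent cell (stands in for an LSTM here)."""
    h = np.zeros(w_h.shape[0])
    out = []
    for x in xs:
        h = np.tanh(w_x @ x + w_h @ h)
        out.append(h)
    return np.stack(out)

rng = np.random.default_rng(2)
seq_len, d_in, d_h = 6, 16, 8
reps = rng.standard_normal((seq_len, d_in))   # characterization vectors
w_x = rng.standard_normal((d_h, d_in)) * 0.1
w_h = rng.standard_normal((d_h, d_h)) * 0.1

fwd = run_rnn(reps, w_x, w_h)              # forward pass over the sequence
bwd = run_rnn(reps[::-1], w_x, w_h)[::-1]  # backward pass, re-aligned
hidden = np.concatenate([fwd, bwd], axis=1)  # splice per character
print(hidden.shape)  # (6, 16): forward and backward halves per character
```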

The implementation of the functions and roles of each unit in the above apparatus is described in detail in the implementation of the corresponding steps of the above method and is not repeated here.

For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
