Semantic role labeling method and device, electronic equipment and computer readable medium

Document No.: 361615    Publication date: 2021-12-07

Reading note: This technique, "Semantic role labeling method and device, electronic equipment and computer readable medium," was designed and created by Qian Ye on 2020-11-13. Its main content is as follows: Embodiments of the present disclosure disclose a semantic role labeling method, apparatus, electronic device, and computer readable medium. One embodiment of the method comprises: extracting, within a target text, context information of each word in a word set corresponding to the target text to generate a first word vector, so as to obtain a first word vector set; extracting, within the target text, context information of each first word vector in the first word vector set to generate a second word vector, so as to obtain a second word vector set; and performing semantic role labeling on the word corresponding to each second word vector in the second word vector set to generate words labeled with semantic roles, so as to obtain a word set labeled with semantic roles. By extracting the context information of each word in the text multiple times, this embodiment can improve the accuracy of semantic role labeling.

1. A semantic role labeling method, comprising the following steps:

extracting, within a target text, context information of each word in a word set corresponding to the target text to generate a first word vector, so as to obtain a first word vector set;

extracting, within the target text, context information of each first word vector in the first word vector set to generate a second word vector, so as to obtain a second word vector set;

and performing semantic role labeling on the word corresponding to each second word vector in the second word vector set to generate words labeled with semantic roles, so as to obtain a word set labeled with semantic roles.

2. The method of claim 1, wherein said extracting context information of each word in the word set corresponding to the target text to generate a first word vector, so as to obtain a first word vector set, comprises:

performing a masking operation on target words in the word set corresponding to the target text to obtain a masked word set;

performing word embedding on each word in the masked word set to generate a third word vector, so as to obtain a third word vector set;

and encoding each third word vector in the third word vector set to obtain the first word vector set.

3. The method of claim 2, wherein said encoding each third word vector in the third set of word vectors to obtain the first set of word vectors comprises:

and inputting each third word vector in the third word vector set to a pre-trained coding network to obtain the first word vector set, wherein the coding network comprises a preset number of coding layers.

4. The method of claim 1, wherein said extracting context relevant information of the target text for each first word vector in the first set of word vectors to generate a second word vector, resulting in a second set of word vectors, comprises:

and inputting each word vector in the first word vector set to a pre-trained bidirectional gated recurrent unit (GRU) network to obtain the second word vector set.

5. The method of claim 1, wherein said performing semantic role labeling on the word corresponding to each second word vector in the second word vector set to generate words labeled with semantic roles, so as to obtain a word set labeled with semantic roles, comprises:

and inputting each word vector in the second word vector set to a pre-trained conditional random field to obtain the word set labeled with the semantic role.

6. The method of claim 3, wherein the coding layer is generated by:

inputting each fourth word vector in a fourth word vector set to a self-attention layer to obtain a fifth word vector set;

inputting the fifth word vector set to a dropout layer to obtain a sixth word vector set;

inputting each fourth word vector in the fourth word vector set and the corresponding sixth word vector in the sixth word vector set to an addition layer to be added, so as to generate a seventh word vector and obtain a seventh word vector set;

inputting the seventh word vector set to a normalization layer for normalization processing to obtain an eighth word vector set;

inputting the eighth word vector set to a linear transformation layer to obtain a ninth word vector set;

inputting the ninth word vector set to the dropout layer to obtain a tenth word vector set;

inputting each eighth word vector in the eighth word vector set and the corresponding tenth word vector in the tenth word vector set to the addition layer to be added, so as to generate an eleventh word vector and obtain an eleventh word vector set;

and inputting the eleventh word vector set to the normalization layer for normalization processing to obtain a twelfth word vector set as the output of the coding layer.

7. A semantic role labeling apparatus, comprising:

the first extraction unit is configured to extract context associated information of each word in a word set corresponding to a target text in the target text to generate a first word vector, so as to obtain a first word vector set;

the second extraction unit is configured to further extract context associated information of a word corresponding to each first word vector in the first word vector set in the target text to generate a second word vector, so as to obtain a second word vector set;

and the semantic role labeling unit is configured to perform semantic role labeling on the words corresponding to each second word vector in the second word vector set so as to generate words labeled with semantic roles, and obtain the word set labeled with the semantic roles.

8. An electronic device, comprising:

one or more processors;

storage means for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method recited in any of claims 1-6.

9. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-6.

Technical Field

The embodiment of the disclosure relates to the technical field of computers, in particular to a semantic role labeling method, a semantic role labeling device, electronic equipment and a computer readable medium.

Background

Semantic Role Labeling (SRL) is a form of shallow semantic analysis: centered on the predicate of a sentence, it analyzes the relationship between each constituent of the sentence and the predicate without deeply parsing the semantic information the sentence contains. That is, it identifies the predicate-argument structure of a sentence and describes these structural relationships with semantic roles, which is an important intermediate step in many natural language understanding tasks (e.g., information extraction, discourse analysis, deep question answering). At present, when semantic role labeling is performed on a text, the following approach is generally adopted: features of the text are obtained with deep learning, the feature results are input to a conditional random field, and the conditional random field outputs the label sequence with the maximum probability.

However, when semantic role labeling is performed on a text in the above manner, the following technical problems often exist:

First, when the features of a text are obtained by deep learning, the context information of the text is not well preserved, which in turn affects the result of semantic labeling. Moreover, the extracted text context information contains some redundant information.

Second, much of the redundant information contained in the text context information cannot be effectively removed, and its presence interferes with downstream text-processing tasks.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

Some embodiments of the present disclosure propose semantic role labeling methods, apparatuses, devices and computer readable media to solve the technical problems mentioned in the background section above.

In a first aspect, some embodiments of the present disclosure provide a semantic role labeling method, including: extracting, within a target text, context information of each word in a word set corresponding to the target text to generate a first word vector, so as to obtain a first word vector set; extracting, within the target text, context information of each first word vector in the first word vector set to generate a second word vector, so as to obtain a second word vector set; and performing semantic role labeling on the word corresponding to each second word vector in the second word vector set to generate words labeled with semantic roles, so as to obtain a word set labeled with semantic roles.

Optionally, the extracting context information of each word in the word set corresponding to the target text to generate a first word vector, so as to obtain a first word vector set, includes: performing a masking operation on target words in the word set corresponding to the target text to obtain a masked word set; performing word embedding on each word in the masked word set to generate a third word vector, so as to obtain a third word vector set; and encoding each third word vector in the third word vector set to obtain the first word vector set.

Optionally, encoding each third word vector in the third word vector set to obtain the first word vector set includes: inputting each third word vector in the third word vector set to a pre-trained coding network to obtain the first word vector set, wherein the coding network comprises at least one coding layer.

Optionally, the extracting context information within the target text of each first word vector in the first word vector set to generate a second word vector, so as to obtain a second word vector set, includes: inputting each word vector in the first word vector set to a pre-trained bidirectional gated recurrent unit (GRU) network to obtain the second word vector set.

Optionally, performing semantic role labeling on a word corresponding to each second word vector in the second word vector set to generate a word labeled with a semantic role, and obtaining a word set labeled with a semantic role, including: and inputting each word vector in the second word vector set to a pre-trained conditional random field to obtain the word set labeled with the semantic role.

Optionally, the coding layer is generated by: inputting each fourth word vector in a fourth word vector set to a self-attention layer to obtain a fifth word vector set; inputting the fifth word vector set to a dropout layer to obtain a sixth word vector set; inputting each fourth word vector in the fourth word vector set and the corresponding sixth word vector in the sixth word vector set to an addition layer to be added, so as to generate a seventh word vector and obtain a seventh word vector set; inputting the seventh word vector set to a normalization layer for normalization processing to obtain an eighth word vector set; inputting the eighth word vector set to a linear transformation layer to obtain a ninth word vector set; inputting the ninth word vector set to the dropout layer to obtain a tenth word vector set; inputting each eighth word vector in the eighth word vector set and the corresponding tenth word vector in the tenth word vector set to the addition layer to be added, so as to generate an eleventh word vector and obtain an eleventh word vector set; and inputting the eleventh word vector set to the normalization layer for normalization processing to obtain a twelfth word vector set as the output of the coding layer.

In a second aspect, some embodiments of the present disclosure provide a semantic role labeling apparatus, including: the first extraction unit is configured to extract context associated information of each word in a word set corresponding to a target text in the target text to generate a first word vector, so as to obtain a first word vector set; the second extraction unit is configured to further extract context associated information of a word corresponding to each first word vector in the first word vector set in the target text to generate a second word vector, so as to obtain a second word vector set; and the semantic role labeling unit is configured to perform semantic role labeling on the words corresponding to each second word vector in the second word vector set so as to generate words labeled with semantic roles, and obtain a word set labeled with semantic roles.

Optionally, the first extraction unit is further configured to: perform a masking operation on target words in the word set corresponding to the target text to obtain a masked word set; perform word embedding on each word in the masked word set to generate a third word vector, so as to obtain a third word vector set; and encode each third word vector in the third word vector set to obtain the first word vector set.

Optionally, the first extraction unit is further configured to: and inputting each third word vector in the third word vector set to a pre-trained coding network to obtain the first word vector set, wherein the coding network comprises at least one coding layer.

Optionally, the second extraction unit is further configured to: input each word vector in the first word vector set to a pre-trained bidirectional gated recurrent unit (GRU) network to obtain the second word vector set.

Optionally, the semantic role labeling unit is further configured to: and inputting each word vector in the second word vector set to a pre-trained conditional random field to obtain the word set labeled with the semantic role.

Optionally, the coding layer is generated by the following steps: inputting each fourth word vector in a fourth word vector set to a self-attention layer to obtain a fifth word vector set; inputting the fifth word vector set to a dropout layer to obtain a sixth word vector set; inputting each fourth word vector in the fourth word vector set and the corresponding sixth word vector in the sixth word vector set to an addition layer to be added, so as to generate a seventh word vector and obtain a seventh word vector set; inputting the seventh word vector set to a normalization layer for normalization processing to obtain an eighth word vector set; inputting the eighth word vector set to a linear transformation layer to obtain a ninth word vector set; inputting the ninth word vector set to the dropout layer to obtain a tenth word vector set; inputting each eighth word vector in the eighth word vector set and the corresponding tenth word vector in the tenth word vector set to the addition layer to be added, so as to generate an eleventh word vector and obtain an eleventh word vector set; and inputting the eleventh word vector set to the normalization layer for normalization processing to obtain a twelfth word vector set as the output of the coding layer.

In a third aspect, some embodiments of the present disclosure provide an electronic device, comprising: one or more processors; a storage device having one or more programs stored thereon which, when executed by one or more processors, cause the one or more processors to implement a method as in any one of the first aspects.

In a fourth aspect, some embodiments of the disclosure provide a computer readable medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as in any one of the first aspect.

The above embodiments of the present disclosure have the following beneficial effects: the semantic role labeling method of some embodiments of the present disclosure yields a word set labeled with semantic roles and improves the accuracy of semantic role labeling. Specifically, the inventors found that semantic role labeling is inaccurate because, when the features of a text are obtained with deep learning, the context information of the text is not well preserved, which affects the labeling result, and because the extracted context information contains some redundant information. Based on this, the semantic role labeling method of some embodiments of the present disclosure strengthens the extraction, within the target text, of context information for each word in the word set corresponding to the target text, thereby reducing the loss of text context information. In addition, the redundant information present in the extracted context information can be reduced by further extracting the context information of the vector corresponding to each word in the target text. The accuracy of semantic role labeling can thus be improved.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.

FIG. 1 is a schematic diagram of an application scenario of the semantic role tagging method of some embodiments of the present disclosure;

FIG. 2 is a flow diagram of some embodiments of a semantic role tagging method according to the present disclosure;

FIG. 3 illustrates a schematic diagram of an occlusion operation on a target text corresponding word set in a semantic role labeling method according to some embodiments of the present disclosure;

FIG. 4 illustrates a schematic diagram of an encoding network in a semantic role labeling method according to some embodiments of the present disclosure;

FIG. 5 is a flow diagram of further embodiments of a semantic role labeling method according to the present disclosure;

FIG. 6 is a schematic block diagram of some embodiments of a semantic role labeling apparatus according to the present disclosure;

FIG. 7 is a schematic structural diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.

It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.

It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 is a schematic diagram of an application scenario of a semantic role labeling method according to some embodiments of the present disclosure.

As shown in fig. 1, the electronic device 101 may first extract context information, within the target text 102, of each word in the word set 103 corresponding to the target text 102 to generate a first word vector, resulting in a first word vector set 104. In this application scenario, the target text 102 may be: "Li Ming met Zhang San on the pedestrian street yesterday evening." The word set 103 may be: "Li Ming", "yesterday", "evening", "at", "pedestrian street", "met", "le" (the aspect particle), "Zhang San". In the first word vector set 104, (12, 2, 4, 22, 45) corresponds to "Li Ming" in the word set 103; (13, 23, 5, 36, 2) corresponds to "yesterday"; (23, 38, 23, 3, 5) corresponds to "evening"; (67, 23, 9, 36, 4) corresponds to "at"; (45, 12, 5, 68, 3) corresponds to "pedestrian street"; (2, 43, 19, 88, 2) corresponds to "met"; (98, 23, 45, 8, 9) corresponds to "le"; and (12, 53, 3, 88, 7) corresponds to "Zhang San".

Then, context information within the target text 102 of each first word vector in the first word vector set 104 is extracted to generate a second word vector, resulting in a second word vector set 105. In this application scenario, the first word vector (12, 2, 4, 22, 45) corresponds to the second word vector (89, 2, 4, 32, 41); (13, 23, 5, 36, 2) corresponds to (13, 23, 5, 36, 2); (23, 38, 23, 3, 5) corresponds to (24, 8, 213, 3, 5); (67, 23, 9, 36, 4) corresponds to (67, 23, 9, 36, 4); (45, 12, 5, 68, 3) corresponds to (23, 9, 200, 8, 9); (2, 43, 19, 88, 2) corresponds to (2, 43, 19, 88, 2); (98, 23, 45, 8, 9) corresponds to (92, 33, 65, 8, 9); and (12, 53, 3, 88, 7) corresponds to (42, 53, 3, 88, 7).

Finally, semantic role labeling is performed on the word corresponding to each second word vector in the second word vector set 105 to generate words labeled with semantic roles, so as to obtain a word set 106 labeled with semantic roles. In this application scenario, the word set 106 labeled with semantic roles includes: "met", "met" -> "Li Ming", "met" -> "evening", "met" -> "at", "met" -> "Zhang San", "evening" -> "yesterday", "at" -> "pedestrian street". Here "met" is the core of the target text 102. "met" -> "Li Ming" characterizes a subject-predicate relationship between the two. "met" -> "evening" characterizes an adverbial-head relationship. "met" -> "at" characterizes an adverbial relationship. "met" -> "le" characterizes a right-adjunct relationship. "met" -> "Zhang San" characterizes a verb-object relationship. "evening" -> "yesterday" characterizes an attribute-head relationship. "at" -> "pedestrian street" characterizes a preposition-object relationship.

The electronic device 101 may be hardware or software. When the electronic device 101 is hardware, it may be implemented as a distributed cluster formed by a plurality of servers or terminal devices, or may be implemented as a single server or a single terminal device. When the electronic device 101 is embodied as software, it may be installed in the hardware devices listed above. It may be implemented, for example, as multiple software or software modules to provide distributed services, or as a single software or software module. And is not particularly limited herein.

It should be understood that the number of electronic devices 101 in fig. 1 is merely illustrative. There may be any number of electronic devices, as desired for implementation.

With continued reference to fig. 2, a flow 200 of some embodiments of a semantic role annotation method in accordance with the present disclosure is illustrated. The semantic role labeling method comprises the following steps:

step 201, extracting context associated information of each word in the word set corresponding to the target text in the target text to generate a first word vector, so as to obtain a first word vector set.

In some embodiments, an execution subject of the semantic role labeling method (e.g., the electronic device 101 shown in fig. 1) may extract context information, within the target text, of each word in the word set corresponding to the target text to generate a first word vector, resulting in a first word vector set. The word set is obtained by segmenting the target text. As an example, the execution subject may input the word set to a pre-trained Long Short-Term Memory (LSTM) network to generate the first word vectors and obtain the first word vector set.

In some optional implementation manners of some embodiments, the extracting context associated information of each word in the word set corresponding to the target text in the target text to generate a first word vector, and obtaining the first word vector set may include the following steps:

First, a masking operation is performed on target words in the word set corresponding to the target text to obtain a masked word set. The masking operation may mask a determined phrase or entity composed of a plurality of words as a single unit. As an example, the word set corresponding to the target text may be masked phrase by phrase. As another example, the word set corresponding to the target text may be masked entity by entity.

As shown in fig. 3, the target text 301 may be: "Li Ming met Zhang San on the pedestrian street yesterday evening." The word set 302 may be: "Li Ming", "yesterday", "evening", "at", "pedestrian street", "met", "le", "Zhang San". Masking the target words in the word set 302 yields the masked word set 303.
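The masking step can be sketched in plain Python. The `mask_words` helper, the `[MASK]` token, and the example span are illustrative assumptions for this sketch, not the patent's exact procedure:

```python
# Illustrative sketch of the masking operation: a multi-word phrase or named
# entity in the segmented word list is merged and masked as a single unit.

MASK = "[MASK]"  # assumed placeholder token

def mask_words(words, spans_to_mask):
    """Replace each (start, end) span (end exclusive) with one [MASK] unit."""
    out, i = [], 0
    for start, end in sorted(spans_to_mask):
        out.extend(words[i:start])
        out.append(MASK)          # the whole phrase/entity becomes one unit
        i = end
    out.extend(words[i:])
    return out

words = ["Li Ming", "yesterday", "evening", "at", "pedestrian street",
         "met", "le", "Zhang San"]
# mask the adverbial phrase "yesterday evening" as one unit
masked = mask_words(words, [(1, 3)])
print(masked)
# ['Li Ming', '[MASK]', 'at', 'pedestrian street', 'met', 'le', 'Zhang San']
```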

Second, Word Embedding is performed on each word in the masked word set to generate a third word vector, so as to obtain a third word vector set. Word embedding is a method of converting the words in a text into numeric vectors: a high-dimensional space whose dimension equals the vocabulary size is embedded into a continuous vector space of much lower dimension, each word or phrase is mapped to a vector over the real numbers, and the word vector is the result of the embedding.

As an example, the execution subject may perform one-hot encoding on each word in the masked word set to generate a third word vector, resulting in a third word vector set.

As another example, the execution subject may perform word embedding on each word in the masked word set according to the Word2vec (word to vector) algorithm to generate a third word vector, so as to obtain a third word vector set.
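Both embedding variants can be sketched with NumPy. The vocabulary, random seed, and 5-dimensional embedding size below are illustrative assumptions (a real Word2vec matrix would be trained, not randomly initialized):

```python
import numpy as np

vocab = ["[MASK]", "Li Ming", "at", "pedestrian street", "met", "le", "Zhang San"]
word2id = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # (a) sparse one-hot vector: 1 at the word's vocabulary index, 0 elsewhere
    v = np.zeros(len(vocab))
    v[word2id[word]] = 1.0
    return v

# (b) dense embeddings: rows of a |V| x d matrix, as Word2vec training would learn
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 5))  # d = 5, illustrative

def embed(word):
    return embedding_matrix[word2id[word]]           # a "third word vector"

sentence = ["Li Ming", "[MASK]", "at", "pedestrian street", "met", "le", "Zhang San"]
third_word_vectors = np.stack([embed(w) for w in sentence])
print(third_word_vectors.shape)  # (7, 5): one 5-dimensional vector per word
```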

Third, each third word vector in the third word vector set is encoded to obtain the first word vector set.

In some optional implementations of some embodiments, each third word vector in the third word vector set is input to a pre-trained coding network to obtain the first word vector set. The coding network may be a Transformer coding network. The coding network includes a preset number of coding layers. As an example, the preset number of layers may be 12.

As an example, each third word vector in the third word vector set may be input to a Recurrent Neural Network (RNN) to encode the third word vector, so as to obtain the first word vector set.

As an example, as shown in fig. 4, the third word vector set 401 is first input to the first coding layer 4021 of the coding network 402 to obtain the output vector of the first coding layer 4021. Then, the output of the first coding layer 4021 is input to the second coding layer 4022 of the coding network 402 to obtain the output vector of the second coding layer 4022. Further, the output of the second coding layer 4022 is input to the third coding layer 4023 of the coding network 402 to obtain the output vector of the third coding layer 4023. Finally, the output of the third coding layer 4023 is input to the fourth coding layer 4024 of the coding network 402 to obtain the first word vector set 403.
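The layer chaining of fig. 4 can be sketched as follows. Here `encoder_layer` is an arbitrary placeholder transform standing in for a real coding layer, and the layer count is illustrative; the point is only that each layer consumes the previous layer's output:

```python
import numpy as np

def encoder_layer(x, k):
    # placeholder transform standing in for one coding layer; a real coding
    # layer would be the self-attention block described in the text
    return np.tanh(x + 0.1 * (k + 1))

def encode(third_word_vectors, num_layers=4):
    h = third_word_vectors
    for k in range(num_layers):   # e.g. 4 layers as drawn in fig. 4, or 12
        h = encoder_layer(h, k)   # layer k+1 consumes layer k's OUTPUT
    return h                      # the last layer's output: first word vectors

x = np.zeros((7, 5))              # 7 words, 5-dimensional vectors
first_word_vectors = encode(x)
print(first_word_vectors.shape)   # (7, 5)
```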

In some optional implementations of some embodiments, the coding layer is generated by:

step one, inputting each fourth word vector in the fourth word vector set to the self-attention layer to obtain a fifth word vector set. Wherein the Self-Attention layer is a Self-Attention layer.

And step two, inputting the fifth word vector set to a discarding layer to obtain a sixth word vector set. The discarding layer may be a random deactivation (Dropout) layer.

And thirdly, inputting each fourth vector in the fourth vector set and a sixth word vector corresponding to the sixth word vector set into an addition layer for addition to generate a seventh vector, so as to obtain a seventh vector set.

And fourthly, inputting the seventh vector set to a normalization layer for normalization processing to obtain an eighth vector set.

And fifthly, inputting the eighth vector set to a linear conversion layer to obtain a ninth vector set. As an example, the eighth vector set may be input to a feed-forward Neural Network (feed-forward Neural Network), resulting in a ninth vector set.

And sixthly, inputting the ninth word vector set to the discarding layer to obtain a tenth word vector set.

And seventhly, inputting each eighth vector in the eighth vector set and a tenth word vector corresponding to the tenth word vector set to an addition layer for addition to generate an eleventh vector, so as to obtain an eleventh vector set.

And step eight, inputting the eleventh vector set into the normalization layer for normalization processing to obtain a twelfth vector set as the output of the coding layer.
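The eight steps can be sketched with NumPy as a single-head, inference-time coding layer. Dropout is taken as the identity (its inference-time behavior), attention is single-head, and all weights and dimensions are illustrative assumptions rather than the patent's trained parameters:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalization layer: zero mean, unit variance per vector
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def self_attention(x, Wq, Wk, Wv):
    # single-head scaled dot-product self-attention
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)    # row-wise softmax
    return w @ v

def coding_layer(x4, Wq, Wk, Wv, W1, b1):
    x5 = self_attention(x4, Wq, Wk, Wv)   # step 1: self-attention layer
    x6 = x5                               # step 2: dropout (identity at inference)
    x7 = x4 + x6                          # step 3: addition (residual) layer
    x8 = layer_norm(x7)                   # step 4: normalization layer
    x9 = x8 @ W1 + b1                     # step 5: linear transformation layer
    x10 = x9                              # step 6: dropout (identity at inference)
    x11 = x8 + x10                        # step 7: addition (residual) layer
    return layer_norm(x11)                # step 8: normalization -> layer output

d = 5
rng = np.random.default_rng(42)
Wq, Wk, Wv, W1 = (rng.normal(scale=0.1, size=(d, d)) for _ in range(4))
b1 = np.zeros(d)
x4 = rng.normal(size=(7, d))              # fourth word vector set: 7 words
output = coding_layer(x4, Wq, Wk, Wv, W1, b1)
print(output.shape)                       # (7, 5)
```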

Step 202, extracting context associated information of the target text of each first word vector in the first word vector set to generate a second word vector, so as to obtain a second word vector set.

In some embodiments, the execution subject may extract context information within the target text of each first word vector in the first word vector set to generate a second word vector, resulting in a second word vector set. As an example, each first word vector in the first word vector set may be input to a pre-trained Long Short-Term Memory (LSTM) network to obtain the second word vector set. As another example, each first word vector in the first word vector set may be input to a pre-trained Gated Recurrent Unit (GRU) network to obtain the second word vector set.
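A from-scratch sketch of a bidirectional GRU pass over the first word vectors follows. The gate equations follow the standard GRU formulation; all shapes and the randomly drawn weights are illustrative assumptions (a real implementation would use a trained network):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    z = sigmoid(x @ Wz + h @ Uz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1 - z) * h + z * h_tilde

def bigru(xs, params_fwd, params_bwd, hidden):
    T = xs.shape[0]
    hf, hb = np.zeros((T, hidden)), np.zeros((T, hidden))
    h = np.zeros(hidden)
    for t in range(T):                        # forward direction
        h = gru_step(xs[t], h, *params_fwd); hf[t] = h
    h = np.zeros(hidden)
    for t in reversed(range(T)):              # backward direction
        h = gru_step(xs[t], h, *params_bwd); hb[t] = h
    return np.concatenate([hf, hb], axis=-1)  # second word vectors

d_in, d_h = 5, 4
rng = np.random.default_rng(1)
def make_params():
    # (Wz, Uz, Wr, Ur, Wh, Uh) with input-to-hidden and hidden-to-hidden shapes
    return tuple(rng.normal(scale=0.1, size=s)
                 for s in [(d_in, d_h), (d_h, d_h)] * 3)

second = bigru(rng.normal(size=(7, d_in)), make_params(), make_params(), d_h)
print(second.shape)   # (7, 8): forward and backward states concatenated
```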

Step 203, performing semantic role labeling on the words corresponding to each second word vector in the second word vector set to generate words labeled with semantic roles, and obtaining a word set labeled with semantic roles.

In some embodiments, the execution subject may perform semantic role labeling on the word corresponding to each second word vector in the second word vector set to generate a word labeled with a semantic role, so as to obtain a word set labeled with semantic roles. Semantic role labeling is a shallow semantic analysis technique that analyzes the predicate-argument structure of a sentence, taking the sentence as the unit of analysis. Specifically, the task of semantic role labeling is to study, centering on the predicate of a sentence, the relationship between each component of the sentence and that predicate, and to describe the relationship of each word in the sentence by means of semantic roles.
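To make the predicate-argument structure concrete, consider a hypothetical English sentence labeled in the common BIO tagging scheme (the sentence, role names, and spans here are illustrative only, not taken from the disclosure):

```python
sentence = ["Police", "arrested", "the", "suspect", "yesterday"]
# Centering on the predicate "arrested": ARG0 = agent, ARG1 = patient,
# ARGM-TMP = temporal modifier, V = the predicate itself.
# B- marks the beginning of a role span, I- marks its continuation.
roles = ["B-ARG0", "B-V", "B-ARG1", "I-ARG1", "B-ARGM-TMP"]
labeled = list(zip(sentence, roles))
```

Each word thus receives exactly one tag, and multi-word arguments such as "the suspect" are recovered by grouping B-/I- runs.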

As an example, the execution subject may input each second word vector in the second word vector set to a pre-trained bidirectional Recurrent Neural Network (RNN) to obtain the word set labeled with semantic roles.

As another example, the executing entity may input each second word vector in the second word vector set to a pre-trained Long Short-Term Memory (LSTM) network to obtain the word set labeled with semantic roles.

As another example, each word vector in the second word vector set may be input to a pre-trained Hidden Markov Model (HMM) to obtain the word set labeled with semantic roles.

In some optional implementations of some embodiments, each word vector in the second word vector set is input to a pre-trained Conditional Random Field (CRF) to obtain the word set labeled with the semantic role.
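A CRF output layer scores whole tag sequences rather than independent per-word labels, and at inference time the best sequence is recovered with Viterbi decoding. A minimal NumPy sketch of the decoding step (the emission and transition scores are assumed to come from the trained model; the function name is illustrative):

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the most likely tag sequence under a linear-chain CRF.

    emissions: (n_words, n_tags) per-word tag scores.
    transitions: (n_tags, n_tags) score of moving from tag i to tag j.
    """
    n, k = emissions.shape
    score = emissions[0].copy()          # best score ending in each tag
    back = np.zeros((n, k), dtype=int)   # backpointers
    for t in range(1, n):
        # total[i, j] = best path ending in i, then transition i->j, emit j
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Backtrack from the best final tag
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

The transition matrix is what lets the CRF forbid invalid tag sequences (for example, an I- tag that does not follow a matching B- tag) by assigning them very low scores.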

The above embodiments of the present disclosure have the following beneficial effects: with the semantic role labeling method of some embodiments of the present disclosure, a word set labeled with semantic roles can be obtained, and the accuracy of semantic role labeling is improved. Specifically, the inventors found that semantic role labeling is inaccurate for two reasons: first, when the features of a text are obtained by deep learning, the context information of the text is not well retained, which affects the result of semantic annotation; second, the extracted text context information contains partially redundant information, which likewise affects the result. Based on this, the semantic role labeling method of some embodiments of the present disclosure enhances the extraction, for each word in the word set corresponding to the target text, of context information in the target text, so as to reduce the loss of text context information. In addition, by further extracting the context information in the target text of the vector corresponding to each word, redundant information can be reduced. Furthermore, the accuracy of semantic role labeling can be improved.

With continued reference to FIG. 5, a flow 500 of further embodiments of a semantic role tagging method according to the present disclosure is illustrated. The semantic role labeling method comprises the following steps:

step 501, extracting context associated information of each word in a word set corresponding to a target text in the target text to generate a first word vector, and obtaining a first word vector set.

Step 502, inputting each word vector in the first word vector set to a pre-trained bidirectional gated cyclic unit network to obtain the second word vector set.

In some embodiments, an executing entity (e.g., the electronic device 101 shown in fig. 1) may input each word vector in the first word vector set to a pre-trained Bidirectional Gated Recurrent Unit (BiGRU) network to obtain the second word vector set. The basic unit of the BiGRU consists of a forward GRU unit and a backward GRU unit; the output at the current time step is jointly determined by the two unidirectional GRUs, so the BiGRU can use both past and future information to obtain the mapping relationship between input and output information.
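The forward/backward structure described above can be sketched as follows in NumPy. This is a minimal illustration under assumed shapes and weight names, not the disclosure's trained network: each word vector is processed once left-to-right and once right-to-left, and the two hidden states are concatenated per position.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    z = sigmoid(x @ Wz + h @ Uz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1 - z) * h + z * h_tilde

def bigru(xs, params_f, params_b, hidden):
    hf, hb = np.zeros(hidden), np.zeros(hidden)
    fwd, bwd = [], []
    for x in xs:                  # forward pass: past context
        hf = gru_step(hf, x, *params_f)
        fwd.append(hf)
    for x in xs[::-1]:            # backward pass: future context
        hb = gru_step(hb, x, *params_b)
        bwd.append(hb)
    bwd = bwd[::-1]
    # Concatenate forward and backward states at each position
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```

Each output vector has twice the hidden size, since the position sees both directions of context.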

Step 503, performing semantic role labeling on the words corresponding to each second word vector in the second word vector set to generate words labeled with semantic roles, and obtaining a word set labeled with semantic roles.

In some embodiments, specific implementation of steps 501 and 503 and technical effects brought by the implementation may refer to steps 201 and 203 in those embodiments corresponding to fig. 2, and are not described herein again.

One inventive aspect of the embodiments of the present disclosure solves the technical problem mentioned in the background art, namely that "much redundant information contained in the text context information cannot be effectively removed, and the presence of this redundant information may interfere with downstream text processing tasks". The factors that prevent redundant information from being effectively removed are often as follows: in the prior art, a unidirectional GRU is often adopted to extract text context information. Although the unidirectional GRU requires fewer parameters, reduces computational complexity, speeds up model training, and lowers the risk of overfitting, it can only learn information before the current time step and cannot learn information after it, whereas understanding the semantics of a word requires placing the word in its full context. To address this problem, the present disclosure uses a BiGRU to further extract text context information; the BiGRU uses both past and future information to derive the mapping between input and output information. Thus, redundant information contained in the text context information can be effectively reduced.

With continued reference to fig. 6, as an implementation of the above method for the above figures, the present disclosure provides some embodiments of a semantic role labeling apparatus, which correspond to the above method embodiments of fig. 2, and which can be applied to various electronic devices.

As shown in fig. 6, the semantic role labeling apparatus 600 of some embodiments includes: a first extraction unit 601, a second extraction unit 602, and a semantic character labeling unit 603. The first extraction unit 601 is configured to extract context associated information of each word in a word set corresponding to a target text in the target text, so as to generate a first word vector, resulting in a first word vector set. A second extracting unit 602, configured to extract context associated information of the target text of each first word vector in the first word vector set to generate a second word vector, resulting in a second word vector set. The semantic role labeling unit 603 is configured to perform semantic role labeling on a word corresponding to each second word vector in the second word vector set to generate a word labeled with a semantic role, so as to obtain a word set labeled with a semantic role.

In some optional implementations of some embodiments, the first extraction unit 601 may be further configured to: carrying out shielding operation on target words in the word set corresponding to the target text to obtain a word set subjected to shielding operation; performing word embedding on each word in the word set after the shielding operation to generate a third word vector to obtain a third word vector set; and coding each third word vector in the third word vector set to obtain the first word vector set.
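The shielding (masking) operation in the first step above can be sketched as replacing a target word with a special mask token before word embedding. The token string and function name below are illustrative assumptions, not the disclosure's exact implementation:

```python
def mask_words(words, target, mask_token="[MASK]"):
    """Replace each occurrence of the target word with a mask token."""
    return [mask_token if w == target else w for w in words]
```

For example, `mask_words(["the", "cat", "sat"], "cat")` yields `["the", "[MASK]", "sat"]`; the masked sequence is then embedded and encoded so the model must reconstruct the masked word from context.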

In some optional implementations of some embodiments, the first extraction unit 601 may be further configured to: and inputting each third word vector in the third word vector set to a pre-trained coding network to obtain the first word vector set, wherein the coding network comprises a predetermined number of layers of coding layers.

In some optional implementations of some embodiments, the second extraction unit 602 may be further configured to: and inputting each word vector in the first word vector set into a pre-trained bidirectional gating circulating unit network to obtain the second word vector set.

In some optional implementations of some embodiments, the semantic role annotation unit 603 may be further configured to: and inputting each word vector in the second word vector set to a pre-trained conditional random field to obtain the word set labeled with the semantic role.

In some optional implementations of some embodiments, the coding layer is generated by: inputting each fourth word vector in the fourth word vector set to a self-attention layer to obtain a fifth word vector set; inputting the fifth word vector set to a discarding layer to obtain a sixth word vector set; inputting each fourth word vector in the fourth word vector set and the corresponding sixth word vector in the sixth word vector set to an addition layer for addition to generate a seventh vector, so as to obtain a seventh vector set; inputting the seventh vector set to a normalization layer for normalization processing to obtain an eighth vector set; inputting the eighth vector set to a linear transformation layer to obtain a ninth vector set; inputting the ninth vector set to the discarding layer to obtain a tenth vector set; inputting each eighth vector in the eighth vector set and the corresponding tenth vector in the tenth vector set to the addition layer for addition to generate an eleventh vector, so as to obtain an eleventh vector set; and inputting the eleventh vector set to the normalization layer for normalization processing to obtain a twelfth vector set as the output of the coding layer.

It will be understood that the elements described in the apparatus 600 correspond to various steps in the method described with reference to fig. 2. Thus, the operations, features and resulting advantages described above with respect to the method are also applicable to the apparatus 600 and the units included therein, and are not described herein again.

Referring now to fig. 7, shown is a schematic diagram of an electronic device 700 suitable for use in implementing some embodiments of the present disclosure. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 7, electronic device 700 may include a processing means (e.g., central processing unit, graphics processor, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage device 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the electronic device 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.

Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 7 may represent one device or may represent multiple devices as desired.

In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In some such embodiments, the computer program may be downloaded and installed from a network via communications means 709, or may be installed from storage 708, or may be installed from ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of some embodiments of the present disclosure.

It should be noted that the computer readable medium described above in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.

The computer readable medium may be embodied in the apparatus; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: extracting context associated information of each word in a word set corresponding to a target text in the target text to generate a first word vector to obtain a first word vector set; extracting context associated information of the target text of each first word vector in the first word vector set to generate a second word vector to obtain a second word vector set; and performing semantic role labeling on the words corresponding to each second word vector in the second word vector set to generate words labeled with semantic roles, so as to obtain a word set labeled with semantic roles.

Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in some embodiments of the present disclosure may be implemented by software, and may also be implemented by hardware. The described units may also be provided in a processor, and may be described as: a processor includes a first extraction unit, a second extraction unit, and a semantic role labeling unit. The names of the units do not form a limitation on the units themselves in some cases, for example, the first extraction unit may also be described as a unit that extracts context associated information of each word in a word set corresponding to the target text in the target text to generate a first word vector, resulting in a first word vector set.

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is made without departing from the inventive concept defined above. For example, a technical solution may be formed by mutually replacing the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.
