Dependency relationship detection method, device and equipment

Document No.: 1889945 · Publication date: 2021-11-26

Reading note: This technique, "Dependency relationship detection method, device and equipment" (依赖关系的检测方法、装置及设备), was created by 刘万青 on 2021-08-31. Abstract: The application provides a method, an apparatus, and a device for detecting dependency relationships. The method includes: obtaining a target training word based on sentence training data, determining the target part of speech of the target training word, and determining the target dependency relationship between the target training word and an associated training word; determining a character vector for each character in the target training word, and determining the training word feature corresponding to the target training word based on the sum of the character vectors of all its characters; determining the training part-of-speech feature corresponding to the target part of speech and the training dependency feature corresponding to the target dependency relationship; constructing a target training feature from the training word feature, the training part-of-speech feature, and the training dependency feature, and training an initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model. With this technical solution, the detection accuracy of the target dependency grammar model is high, the trained model is compact, and training accuracy is improved.

1. A method for detecting dependency relationships, the method comprising:

obtaining sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word, and the dependency relationship of at least one training group, each training group comprising two training words;

obtaining a target training word based on the sentence training data, determining a target part of speech of the target training word, and determining a target dependency relationship between the target training word and an associated training word;

determining a character vector for each character in the target training word, and determining the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word; determining the training part-of-speech feature corresponding to the target part of speech, and determining the training dependency feature corresponding to the target dependency relationship;

constructing a target training feature based on the training word feature, the training part-of-speech feature, and the training dependency feature, and training a configured initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model; the target dependency grammar model is used to detect dependency relationships among detection words in sentence detection data.

2. The method of claim 1, wherein determining the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word comprises:

determining the average of the character vectors of all characters in the target training word based on their sum, and determining the training word feature corresponding to the target training word based on that average.
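The averaging step of claim 2 can be sketched as follows. This is an illustration, not part of the claims: the embedding dimension and the toy character-vector table are assumptions, since the patent does not specify how character vectors are produced.

```python
# Sketch of claim 2: the word feature is the mean of the character vectors
# of the word's characters. DIM and the toy table are illustrative only.
from typing import Dict, List

DIM = 4  # assumed embedding dimension

# toy character-embedding table (in practice these vectors would be learned)
char_vectors: Dict[str, List[float]] = {
    "木": [1.0, 0.0, 2.0, 0.0],
    "头": [3.0, 2.0, 0.0, 4.0],
}

def word_feature(word: str) -> List[float]:
    """Sum the character vectors of `word`, then divide by the character count."""
    total = [0.0] * DIM
    for ch in word:
        vec = char_vectors.get(ch, [0.0] * DIM)  # unseen characters -> zero vector
        total = [t + v for t, v in zip(total, vec)]
    return [t / len(word) for t in total]

print(word_feature("木头"))  # element-wise mean of the two character vectors
```

Averaging (rather than a bare sum) keeps the feature scale independent of word length, which is one plausible reason the claim normalizes by the character count.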

3. The method of claim 1, wherein determining the training part-of-speech feature corresponding to the target part of speech and determining the training dependency feature corresponding to the target dependency relationship comprise:

querying a part-of-speech mapping table with the target part of speech to obtain the training part-of-speech feature corresponding to the target part of speech; and querying a dependency relationship mapping table with the target dependency relationship to obtain the training dependency feature corresponding to the target dependency relationship; the part-of-speech mapping table records the correspondence between parts of speech and part-of-speech features, and the dependency relationship mapping table records the correspondence between dependency relationships and dependency features.
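The two table lookups in claim 3 amount to simple keyed maps. In the sketch below, the tag set (NN, VV, JJ), the label set (SBV, VOB), and the one-hot feature encoding are all assumptions for illustration; the patent only requires that each part of speech and each dependency relationship map to a feature.

```python
# Illustrative mapping tables for claim 3: features are obtained by lookup.
pos_feature_table = {
    "NN": [1.0, 0.0, 0.0],   # noun
    "VV": [0.0, 1.0, 0.0],   # verb
    "JJ": [0.0, 0.0, 1.0],   # adjective
}
dep_feature_table = {
    "SBV": [1.0, 0.0],       # subject-verb relation
    "VOB": [0.0, 1.0],       # verb-object relation
}

def lookup_features(pos: str, dep: str):
    """Query both mapping tables with the target POS and dependency relation."""
    return pos_feature_table[pos], dep_feature_table[dep]

pos_feat, dep_feat = lookup_features("NN", "SBV")
print(pos_feat, dep_feat)
```

In a trained model these tables would typically be learned embedding matrices rather than fixed one-hot rows; the lookup mechanism is the same either way.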

4. The method of claim 1,

obtaining the target training word based on the sentence training data comprises: segmenting the sentence training data into a plurality of action sequences using a transition-based dependency parsing method, wherein each action sequence comprises a transition action and configuration data, and the configuration data comprises stack data, sequence data, and a dependency result; the sequence data stores the plurality of training words in the sentence training data, the stack data stores the training words taken out of the sequence data, and the dependency result stores the dependency relationships between training words in the stack data; and selecting K1 training words from the stack data and K2 training words from the sequence data as target training words;

determining the target dependency relationship between the target training word and an associated training word comprises:

determining the target dependency relationship between the target training word and an associated training word based on the dependency relationships between training words stored in the dependency result of the configuration data.
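The configuration described in claim 4 (stack data, sequence data, dependency result) can be sketched with the standard transition actions of a transition-based parser. The SHIFT / LEFT-ARC / RIGHT-ARC scheme, the sample sentence, and the K1/K2 choices below are assumptions for illustration, not details taken from the patent.

```python
# Hypothetical parser configuration for claim 4.
class Configuration:
    def __init__(self, words):
        self.stack = []            # stack data: words taken out of the sequence
        self.buffer = list(words)  # sequence data: remaining words of the sentence
        self.deps = []             # dependency result: (head, dependent, label)

    def shift(self):
        """Move the next word from the sequence data onto the stack."""
        self.stack.append(self.buffer.pop(0))

    def left_arc(self, label):
        """Second-from-top depends on the top of the stack."""
        dep = self.stack.pop(-2)
        self.deps.append((self.stack[-1], dep, label))

    def right_arc(self, label):
        """Top of the stack depends on the word beneath it."""
        dep = self.stack.pop()
        self.deps.append((self.stack[-1], dep, label))

    def targets(self, k1=2, k2=2):
        """K1 words from the stack data plus K2 words from the sequence data."""
        return self.stack[-k1:] + self.buffer[:k2]

c = Configuration(["I", "eat", "apples"])
c.shift(); c.shift()
c.left_arc("SBV")          # "I" depends on "eat"
c.shift()
c.right_arc("VOB")         # "apples" depends on "eat"
print(c.deps)
```

During training, the target dependency relationship for a pair of target words would be read out of `deps`, exactly as the claim describes for the dependency result.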

5. The method of claim 1, wherein training the configured initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model comprises:

inputting the target training feature into the initial dependency grammar model, the initial dependency grammar model processing the target training feature with a cubic function to obtain a target feature value;

adjusting the initial dependency grammar model based on the target feature value to obtain an adjusted dependency grammar model, and determining whether the adjusted dependency grammar model has converged;

if not, taking the adjusted dependency grammar model as the initial dependency grammar model and returning to the step of inputting the target training feature into the initial dependency grammar model;

and if so, taking the adjusted dependency grammar model as the target dependency grammar model.
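Claim 5's iterate-until-convergence loop can be sketched with a deliberately tiny model. The single scalar weight, the squared loss, and the learning rate below are toy assumptions; only the cubic activation, the adjust step, and the convergence test come from the claim.

```python
# Toy sketch of claim 5: score with a cubic function, adjust, repeat until
# the adjustment falls below a tolerance (convergence).
def cube(x):
    return x ** 3  # cubic activation applied to the model's score

def train(feature, target, w=0.5, lr=0.005, tol=1e-6, max_iter=10000):
    """Gradient descent on a one-weight model; an assumed stand-in for
    'adjusting the dependency grammar model'."""
    for _ in range(max_iter):
        value = cube(w * feature)                         # target feature value
        grad = 2 * (value - target) * 3 * (w * feature) ** 2 * feature
        new_w = w - lr * grad                             # adjust the model
        if abs(new_w - w) < tol:                          # convergence check
            return new_w
        w = new_w          # adjusted model becomes the new "initial" model
    return w

w = train(feature=1.0, target=8.0)
print(round(cube(w * 1.0), 2))  # close to the target 8.0 once converged
```

The loop structure mirrors the claim: each pass produces a target feature value, adjusts the model, and either stops (converged) or feeds the adjusted model back in as the new initial model.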

6. A method for detecting dependency relationships, the method comprising:

obtaining sentence detection data, wherein the sentence detection data comprises a plurality of detection words and the part of speech of each detection word; obtaining a target detection word based on the sentence detection data, determining a target part of speech of the target detection word, and determining a target dependency relationship between the target detection word and an associated detection word;

determining a character vector for each character in the target detection word, and determining the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word; determining the detection part-of-speech feature corresponding to the target part of speech, and determining the detection dependency feature corresponding to the target dependency relationship;

and constructing a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, and inputting the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.
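The feature-construction step of claim 6 combines three sub-features into one input vector. Concatenation and the toy sub-feature values below are assumptions; the patent states only that the target detection feature is built from the three features.

```python
# Illustrative construction of the target detection feature (claim 6).
word_feature = [2.0, 1.0, 1.0, 2.0]   # assumed: from the character-vector average
pos_feature = [1.0, 0.0, 0.0]         # assumed: from the part-of-speech mapping table
dep_feature = [1.0, 0.0]              # assumed: from the dependency mapping table

def build_target_feature(word_f, pos_f, dep_f):
    """Concatenate the three sub-features into the target detection feature."""
    return word_f + pos_f + dep_f

target = build_target_feature(word_feature, pos_feature, dep_feature)
print(len(target))  # 9 = 4 + 3 + 2
```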

7. The method of claim 6, wherein determining the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word comprises:

determining the average of the character vectors of all characters in the target detection word based on their sum, and determining the detection word feature corresponding to the target detection word based on that average.

8. The method of claim 6, wherein determining the detection part-of-speech feature corresponding to the target part of speech and determining the detection dependency feature corresponding to the target dependency relationship comprise:

querying a part-of-speech mapping table with the target part of speech to obtain the detection part-of-speech feature corresponding to the target part of speech; and querying a dependency relationship mapping table with the target dependency relationship to obtain the detection dependency feature corresponding to the target dependency relationship; the part-of-speech mapping table records the correspondence between parts of speech and part-of-speech features, and the dependency relationship mapping table records the correspondence between dependency relationships and dependency features.

9. The method of claim 6,

obtaining the target detection word based on the sentence detection data comprises: segmenting the sentence detection data into a plurality of action sequences using a transition-based dependency parsing method, wherein each action sequence comprises a transition action and configuration data, and the configuration data comprises stack data, sequence data, and a dependency result; the sequence data stores the plurality of detection words in the sentence detection data, the stack data stores the detection words taken out of the sequence data, and the dependency result stores the dependency relationships between detection words in the stack data; and selecting K1 detection words from the stack data and K2 detection words from the sequence data as target detection words;

determining the target dependency relationship between the target detection word and an associated detection word comprises:

determining the target dependency relationship between the target detection word and an associated detection word based on the dependency relationships between detection words stored in the dependency result of the configuration data.

10. The method of claim 6,

wherein inputting the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data comprises:

inputting the target detection feature into the target dependency grammar model, the target dependency grammar model processing the target detection feature with a cubic function to obtain a target feature value;

classifying the target feature value with the target dependency grammar model to obtain a confidence for each of M categories, wherein M is a positive integer and each category corresponds to one dependency relationship;

and determining the maximum of the M confidences, and taking the dependency relationship of the category with the maximum confidence as the dependency relationship between the two detection words in the sentence detection data.
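The classification step of claim 10 can be sketched as a confidence distribution over M categories followed by an argmax. Softmax as the confidence function and the three-label set below are assumptions for illustration; the claim specifies only that each category gets a confidence and the maximum wins.

```python
# Sketch of claim 10: confidences over M dependency categories, pick the max.
import math

labels = ["SBV", "VOB", "ATT"]  # hypothetical label set, M = 3

def softmax(scores):
    m = max(scores)                      # subtract max for numeric stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return the dependency relationship whose category has maximum confidence."""
    conf = softmax(scores)
    best = max(range(len(conf)), key=conf.__getitem__)
    return labels[best], conf[best]

label, confidence = classify([2.0, 0.5, 0.1])
print(label)  # the highest-scoring category's dependency relationship
```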

11. An apparatus for detecting dependency relationship, the apparatus comprising:

an acquisition module, configured to obtain sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word, and the dependency relationship of at least one training group, each training group comprising two training words; and to obtain a target training word based on the sentence training data, determine a target part of speech of the target training word, and determine a target dependency relationship between the target training word and an associated training word;

a determining module, configured to determine a character vector for each character in the target training word, and determine the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word; and to determine the training part-of-speech feature corresponding to the target part of speech and the training dependency feature corresponding to the target dependency relationship;

and a training module, configured to construct a target training feature based on the training word feature, the training part-of-speech feature, and the training dependency feature, and to train a configured initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model; the target dependency grammar model is used to detect dependency relationships among detection words in sentence detection data.

12. An apparatus for detecting dependency relationship, the apparatus comprising:

an acquisition module, configured to obtain sentence detection data, wherein the sentence detection data comprises a plurality of detection words and the part of speech of each detection word; and to obtain a target detection word based on the sentence detection data, determine a target part of speech of the target detection word, and determine a target dependency relationship between the target detection word and an associated detection word;

a determining module, configured to determine a character vector for each character in the target detection word, and determine the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word; and to determine the detection part-of-speech feature corresponding to the target part of speech and the detection dependency feature corresponding to the target dependency relationship;

and a detection module, configured to construct a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, and to input the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.

13. A dependency detection apparatus, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; wherein the processor is configured to execute the machine executable instructions to perform the steps of:

obtaining sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word, and the dependency relationship of at least one training group, each training group comprising two training words;

obtaining a target training word based on the sentence training data, determining a target part of speech of the target training word, and determining a target dependency relationship between the target training word and an associated training word;

determining a character vector for each character in the target training word, and determining the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word; determining the training part-of-speech feature corresponding to the target part of speech, and determining the training dependency feature corresponding to the target dependency relationship;

constructing a target training feature based on the training word feature, the training part-of-speech feature, and the training dependency feature, and training a configured initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model; the target dependency grammar model is used to detect dependency relationships among detection words in sentence detection data;

alternatively:

obtaining sentence detection data, wherein the sentence detection data comprises a plurality of detection words and the part of speech of each detection word; obtaining a target detection word based on the sentence detection data, determining a target part of speech of the target detection word, and determining a target dependency relationship between the target detection word and an associated detection word;

determining a character vector for each character in the target detection word, and determining the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word; determining the detection part-of-speech feature corresponding to the target part of speech, and determining the detection dependency feature corresponding to the target dependency relationship;

and constructing a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, and inputting the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, and a device for detecting a dependency relationship.

Background

Machine learning is one way of realizing artificial intelligence. It is a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, and algorithmic complexity theory, among other subjects. Machine learning studies how computers can simulate or implement human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to improve performance. It focuses on algorithm design, so that a computer can automatically learn patterns from data and use those patterns to make predictions on unknown data. Machine learning is widely applied in deep learning, data mining, computer vision, natural language processing, biometric recognition, search engines, medical diagnosis, and speech and handwriting recognition, among other areas.

In machine learning, enabling a machine to understand natural language is fundamental to artificial intelligence. To that end, natural language processing typically performs syntactic analysis, which analyzes the words and grammar of a sentence and plays a key role in understanding the sentence as a whole. Dependency grammar analysis is a main implementation of syntactic analysis: its structure is clear, it is easy to understand and annotate, and it can capture long-distance collocation and modification relationships between words, so it is widely applied in many fields of natural language processing.

In dependency grammar analysis, a "dependency" is an asymmetric relationship between two words in a sentence: the relationship has a direction, the word that governs is called the head (governor), and the word that is governed is called the dependent.
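The head/dependent direction described above can be represented as directed, labeled arcs. The sentence and the relation labels in this sketch are illustrative assumptions, not examples from the application.

```python
# Hypothetical directed dependency arcs: (head, dependent, label).
arcs = [
    ("eat", "I", "SBV"),       # "eat" governs "I": the subject depends on the verb
    ("eat", "apples", "VOB"),  # "eat" governs "apples": the object depends on the verb
]
for head, dependent, label in arcs:
    print(f"{head} --{label}--> {dependent}")
```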

In summary, dependency grammar analysis is a main implementation of natural language processing technology, but there is currently no satisfactory way to implement it: when dependency grammar analysis is used to analyze a sentence, an accurate and reliable analysis result cannot be obtained, and analysis accuracy is low.

Disclosure of Invention

The application provides a method for detecting dependency relationship, which comprises the following steps:

obtaining sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word, and the dependency relationship of at least one training group, each training group comprising two training words;

obtaining a target training word based on the sentence training data, determining a target part of speech of the target training word, and determining a target dependency relationship between the target training word and an associated training word;

determining a character vector for each character in the target training word, and determining the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word; determining the training part-of-speech feature corresponding to the target part of speech, and determining the training dependency feature corresponding to the target dependency relationship;

constructing a target training feature based on the training word feature, the training part-of-speech feature, and the training dependency feature, and training a configured initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model; the target dependency grammar model is used to detect dependency relationships among detection words in sentence detection data.

The application provides a method for detecting dependency relationship, which comprises the following steps:

obtaining sentence detection data, wherein the sentence detection data comprises a plurality of detection words and the part of speech of each detection word; obtaining a target detection word based on the sentence detection data, determining a target part of speech of the target detection word, and determining a target dependency relationship between the target detection word and an associated detection word;

determining a character vector for each character in the target detection word, and determining the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word; determining the detection part-of-speech feature corresponding to the target part of speech, and determining the detection dependency feature corresponding to the target dependency relationship;

and constructing a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, and inputting the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.

The application provides an apparatus for detecting dependency relationships, the apparatus comprising:

an acquisition module, configured to obtain sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word, and the dependency relationship of at least one training group, each training group comprising two training words; and to obtain a target training word based on the sentence training data, determine a target part of speech of the target training word, and determine a target dependency relationship between the target training word and an associated training word;

a determining module, configured to determine a character vector for each character in the target training word, and determine the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word; and to determine the training part-of-speech feature corresponding to the target part of speech and the training dependency feature corresponding to the target dependency relationship;

and a training module, configured to construct a target training feature based on the training word feature, the training part-of-speech feature, and the training dependency feature, and to train a configured initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model; the target dependency grammar model is used to detect dependency relationships among detection words in sentence detection data.

The application provides an apparatus for detecting dependency relationships, the apparatus comprising:

an acquisition module, configured to obtain sentence detection data, wherein the sentence detection data comprises a plurality of detection words and the part of speech of each detection word; and to obtain a target detection word based on the sentence detection data, determine a target part of speech of the target detection word, and determine a target dependency relationship between the target detection word and an associated detection word;

a determining module, configured to determine a character vector for each character in the target detection word, and determine the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word; and to determine the detection part-of-speech feature corresponding to the target part of speech and the detection dependency feature corresponding to the target dependency relationship;

and a detection module, configured to construct a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, and to input the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.

The application provides a dependency detection device, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; wherein the processor is configured to execute the machine-executable instructions to perform the following steps:

obtaining sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word, and the dependency relationship of at least one training group, each training group comprising two training words;

obtaining a target training word based on the sentence training data, determining a target part of speech of the target training word, and determining a target dependency relationship between the target training word and an associated training word;

determining a character vector for each character in the target training word, and determining the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word; determining the training part-of-speech feature corresponding to the target part of speech, and determining the training dependency feature corresponding to the target dependency relationship;

constructing a target training feature based on the training word feature, the training part-of-speech feature, and the training dependency feature, and training a configured initial dependency grammar model on the target training feature to obtain a trained target dependency grammar model, where the target dependency grammar model is used to detect dependency relationships among detection words in sentence detection data; alternatively:

obtaining sentence detection data, wherein the sentence detection data comprises a plurality of detection words and the part of speech of each detection word; obtaining a target detection word based on the sentence detection data, determining a target part of speech of the target detection word, and determining a target dependency relationship between the target detection word and an associated detection word;

determining a character vector for each character in the target detection word, and determining the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word; determining the detection part-of-speech feature corresponding to the target part of speech, and determining the detection dependency feature corresponding to the target dependency relationship;

and constructing a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, and inputting the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.

Based on the above technical solution, in the embodiments of the present application, the training word feature can be determined from the sum of the character vectors of all characters in the target training word, and the initial dependency grammar model can be trained on the training word feature, the training part-of-speech feature corresponding to the target part of speech, and the training dependency feature corresponding to the target dependency relationship, yielding a trained target dependency grammar model. When this model is used to detect dependency relationships among detection words in sentence detection data, accurate and reliable detection results are obtained; that is, the detection accuracy of the target dependency grammar model is high, so sentences can be analyzed by dependency grammar analysis with accurate and reliable results. Representing a word by adding its character vectors effectively improves the accuracy of the dependency grammar, speeds up training, and keeps the trained target dependency grammar model small, improving both training accuracy and model accuracy.

Drawings

FIG. 1 is a flow chart of a method for dependency detection in one embodiment of the present application;

FIG. 2 is a flow chart of a method for dependency detection in one embodiment of the present application;

FIG. 3 is a diagram of dependency grammar analysis in one embodiment of the present application;

FIG. 4 is a flow chart of a method for dependency detection in one embodiment of the present application;

FIG. 5 is a diagram of sentence training data in one embodiment of the present application;

FIG. 6A is a schematic diagram of the prediction process for a "character" in one embodiment of the present application;

FIG. 6B is a schematic diagram of the prediction process for a word in one embodiment of the present application;

FIG. 7 is a block diagram illustrating a dependency grammar model in one embodiment of the present application;

FIG. 8 is a flow chart of a method for dependency detection in one embodiment of the present application;

fig. 9A and 9B are structural diagrams of a dependency detecting apparatus according to an embodiment of the present application;

fig. 9C is a block diagram of a dependency detection apparatus according to an embodiment of the present application.

Detailed Description

The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Depending on the context, moreover, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".

The embodiments of the present application provide a dependency relationship detection method. In this detection method, a target dependency grammar model can be trained, and the dependency relationships between detection words in sentence detection data are detected based on the target dependency grammar model, so that accurate and reliable detection results are obtained; that is, the detection accuracy of the target dependency grammar model is high, and the target dependency grammar model is small. Sentences can therefore be analyzed by dependency grammar analysis, obtaining accurate and reliable analysis results with high analysis accuracy.

The method for detecting a dependency relationship provided in the embodiment of the present application may be applied to a device for detecting a dependency relationship, where the device for detecting a dependency relationship may be any type of device, such as an intelligent terminal, a server, a notebook Computer, a host, a PC (Personal Computer), and the like, and the type of the device is not limited.

The embodiments of the present application involve a training process and a detection process. In the training process, a dependency grammar model is constructed in advance; for ease of distinction, this model is called the initial dependency grammar model. The initial dependency grammar model is trained to obtain a trained dependency grammar model, which, again for ease of distinction, is called the target dependency grammar model. In the detection process, sentence detection data is processed based on the target dependency grammar model, i.e., the dependency relationships between detection words are detected.

For example, the training process and the detection process may be implemented by the same executing agent or different executing agents. For example, the device 1 executes a training process to obtain a target dependency grammar model, and executes a detection process based on the target dependency grammar model after obtaining the target dependency grammar model. For another example, the device 1 executes the training process to obtain the target dependent grammar model, after obtaining the target dependent grammar model, the target dependent grammar model is deployed to the device 2, and the device 2 executes the detection process based on the target dependent grammar model.

Referring to fig. 1, a flowchart of a method for detecting a dependency relationship provided in an embodiment of the present application is shown for a training process of a target dependency grammar model, where the method may include the following steps:

101, obtaining sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word and the dependency relationship of at least one training set, and the training set comprises two training words.

For example, for convenience of distinction, in the embodiments of the present application, sentence data of a training process is referred to as sentence training data, and words of the training process (located in the sentence training data) are referred to as training words.

102, acquiring a target training word based on the sentence training data, determining the target part of speech of the target training word, and determining the target dependency relationship between the target training word and the associated training word.

For example, a transfer-based semantic dependency analysis method may be employed to split the sentence training data into a plurality of action sequences; each action sequence may include a transfer action and configuration data, and the configuration data may include stack data, sequence data, and a dependency result. The sequence data stores the training words in the sentence training data, the stack data stores the training words extracted from the sequence data, and the dependency result stores the dependency relationships between the training words in the stack data. On this basis, K1 training words can be selected from the stack data as target training words, and K2 training words can be selected from the sequence data as target training words. For example, K1 may be a positive integer, K2 may be a positive integer, and K1 and K2 may be the same or different.

For example, determining the target dependency relationship between the target training word and the associated training word may include, but is not limited to: and determining the target dependency relationship between the target training words and the associated training words based on the dependency relationship between the training words stored in the dependency result in the configuration data.

Step 103: determine a character vector for each character in the target training word, and determine the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word. In addition, determine the training part-of-speech feature corresponding to the target part of speech, and determine the training dependency feature corresponding to the target dependency relationship.

For example, determining the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word may include, but is not limited to: determining the mean of the character vectors of all characters based on their sum, and determining the training word feature corresponding to the target training word based on this mean, i.e., taking the character-vector mean as the training word feature.
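A minimal sketch of this character-averaging scheme follows; the character-vector table, its dimension, and its contents are made-up placeholders for illustration, since in practice the character vectors would be learned during training:

```python
import numpy as np

DIM = 4  # illustrative character-vector dimension

# Hypothetical character-vector table; a real system learns these vectors.
char_vectors = {
    "北": np.array([0.1, 0.2, 0.3, 0.4]),
    "京": np.array([0.5, 0.6, 0.7, 0.8]),
}

def word_feature(word):
    """Training word feature = mean of the character vectors of all characters."""
    vectors = [char_vectors.get(ch, np.zeros(DIM)) for ch in word]
    return sum(vectors) / len(vectors)

print(word_feature("北京"))  # element-wise mean of the two character vectors
```

Averaging (rather than plain summing) keeps the feature scale independent of word length, which is one way to read "determining the training word feature based on the character-vector mean".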

For example, determining the training part-of-speech features corresponding to the target part-of-speech and determining the training dependency features corresponding to the target dependency relationship may include, but are not limited to: and querying a part-of-speech mapping table through the target part-of-speech to obtain training part-of-speech characteristics corresponding to the target part-of-speech, and querying a dependency relationship mapping table through the target dependency relationship to obtain training dependency characteristics corresponding to the target dependency relationship. The part-of-speech mapping table comprises a corresponding relation between parts-of-speech and part-of-speech characteristics, and the dependency mapping table comprises a corresponding relation between a dependency and dependency characteristics.
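The two table lookups can be as simple as dictionary indexing; a hedged sketch, in which the symbols, feature vectors, and the "<UNK>" fallback entry are placeholders rather than the application's actual tables:

```python
# Hypothetical part-of-speech and dependency mapping tables: each maps a
# symbol to its feature vector. "<UNK>" is an assumed fallback entry.
pos_table = {"noun": [0.1, 0.9], "verb": [0.8, 0.2], "<UNK>": [0.0, 0.0]}
dep_table = {"SBV": [0.3, 0.7], "VOB": [0.6, 0.4], "<UNK>": [0.0, 0.0]}

def lookup(table, key):
    """Query a mapping table; fall back to the <UNK> entry for unseen keys."""
    return table.get(key, table["<UNK>"])

training_pos_feature = lookup(pos_table, "noun")  # feature for the target part of speech
training_dep_feature = lookup(dep_table, "SBV")   # feature for the target dependency
```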

Step 104: construct a target training feature based on the training word feature, the training part-of-speech feature and the training dependency feature, and train the configured initial dependency grammar model based on the target training feature to obtain the trained target dependency grammar model. For example, the target dependency grammar model may be used to detect dependency relationships between detection words in sentence detection data.

For example, training the configured initial dependency grammar model based on the target training feature to obtain a trained target dependency grammar model may include, but is not limited to: inputting the target training feature into the initial dependency grammar model, which processes the target training feature using a cubic function to obtain a target feature value; adjusting the initial dependency grammar model based on the target feature value to obtain an adjusted dependency grammar model; and determining whether the adjusted dependency grammar model has converged.

If not, the adjusted dependency grammar model is taken as the initial dependency grammar model, and the operation of inputting the target training feature into the initial dependency grammar model is performed again. If so, the adjusted dependency grammar model is taken as the target dependency grammar model, and the trained target dependency grammar model is obtained.
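A toy numerical sketch of such a training step: a hidden layer with a cubic activation followed by a softmax over transition classes, adjusted by repeated gradient steps as in the loop above. All sizes, initializations, and the learning rate are illustrative assumptions, and only the output layer is updated for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_HID, N_CLASS = 6, 8, 3          # illustrative sizes
W1 = rng.normal(0, 0.5, (D_HID, D_IN))  # hidden layer (kept fixed here)
b1 = np.zeros(D_HID)
W2 = rng.normal(0, 0.1, (N_CLASS, D_HID))

def forward(x):
    h = (W1 @ x + b1) ** 3              # cubic activation on the hidden layer
    scores = W2 @ h
    e = np.exp(scores - scores.max())
    return h, e / e.sum()               # confidences over transition classes

def train_step(x, y, lr=0.001):
    """One cross-entropy gradient step on the output layer; returns the loss."""
    global W2
    h, p = forward(x)
    loss = -np.log(p[y])
    grad = p.copy()
    grad[y] -= 1.0                      # d(loss)/d(scores)
    W2 -= lr * np.outer(grad, h)
    return float(loss)

x = rng.normal(size=D_IN)               # a stand-in target training feature
losses = [train_step(x, y=1) for _ in range(50)]
```

On this toy example the loss decreases across the 50 steps, mimicking the adjust-until-convergence loop described above.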

For the detection process of the target dependency grammar model, referring to fig. 2, which is a flowchart of the detection method of the dependency relationship proposed in the embodiment of the present application, the method may include the following steps:

step 201, obtaining statement detection data, where the statement detection data includes a plurality of detection words and part of speech of each detection word. For convenience of distinction, in the embodiments of the present application, sentence data in a detection process is referred to as sentence detection data, and words (located in the sentence detection data) in the detection process are referred to as detection words.

Step 202, based on the sentence detection data, obtaining a target detection word, determining a target part of speech of the target detection word, and determining a target dependency relationship between the target detection word and the associated detection word.

For example, a transfer-based semantic dependency analysis method may be employed to split the sentence detection data into a plurality of action sequences; each action sequence may include a transfer action and configuration data, and the configuration data may include stack data, sequence data, and a dependency result. The sequence data stores the detection words in the sentence detection data, the stack data stores the detection words extracted from the sequence data, and the dependency result stores the dependency relationships between the detection words in the stack data. On this basis, K1 detection words can be selected from the stack data as target detection words, and K2 detection words can be selected from the sequence data as target detection words. For example, K1 may be a positive integer, K2 may be a positive integer, and K1 and K2 may be the same or different.

For example, determining the target dependency relationship between the target detection word and the associated detection word may include, but is not limited to: and determining the target dependency relationship between the target detection words and the associated detection words based on the dependency relationship between the stored detection words in the dependency result in the configuration data.

Step 203: determine a character vector for each character in the target detection word, and determine the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word. In addition, determine the detection part-of-speech feature corresponding to the target part of speech, and determine the detection dependency feature corresponding to the target dependency relationship.

For example, determining the detection word feature corresponding to the target detection word based on the sum of the character vectors of all characters in the target detection word may include, but is not limited to: determining the mean of the character vectors of all characters based on their sum, and determining the detection word feature corresponding to the target detection word based on this mean, i.e., taking the character-vector mean as the detection word feature.

For example, determining the detection part-of-speech characteristics corresponding to the target part-of-speech and determining the detection dependency characteristics corresponding to the target dependency relationship may include, but are not limited to: and querying a part-of-speech mapping table through the target part-of-speech to obtain a detection part-of-speech characteristic corresponding to the target part-of-speech, and querying a dependency relationship mapping table through the target dependency relationship to obtain a detection dependency characteristic corresponding to the target dependency relationship. The part-of-speech mapping table comprises a corresponding relation between parts-of-speech and part-of-speech characteristics, and the dependency mapping table comprises a corresponding relation between a dependency and dependency characteristics.

Step 204: construct a target detection feature based on the detection word feature, the detection part-of-speech feature and the detection dependency feature, and input the target detection feature into the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.

For example, the target detection feature may be input to a target dependent grammar model, and the target dependent grammar model may process the target detection feature by using a cubic function to obtain a target feature value. Classifying the target characteristic value through a target dependency grammar model to obtain confidence degrees corresponding to M categories respectively; wherein, M may be a positive integer, and each category corresponds to a dependency relationship. On this basis, the maximum confidence coefficient of the M confidence coefficients can be determined, and the dependency relationship corresponding to the category corresponding to the maximum confidence coefficient is determined as the dependency relationship between two detection terms in the sentence detection data.
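The category-selection step can be sketched as a softmax over the M scores followed by an argmax; the scores and label set below are hypothetical:

```python
import numpy as np

def pick_dependency(scores, labels):
    """Turn the target feature values into confidences over M categories and
    return the dependency label of the category with maximum confidence."""
    e = np.exp(scores - np.max(scores))  # numerically stable softmax
    conf = e / e.sum()
    i = int(np.argmax(conf))
    return labels[i], float(conf[i])

label, confidence = pick_dependency(
    np.array([1.2, 3.4, 0.5]), ["SBV", "VOB", "ATT"])
print(label)  # "VOB" has the maximum confidence here
```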


The following describes the technical solution of the embodiment of the present application with reference to a specific application scenario.

In natural language processing technology, syntactic analysis plays a key role in understanding a whole sentence, and dependency grammar analysis is the main implementation of syntactic analysis: its structure is clear, it is easy to understand and annotate, and it can capture long-distance collocation or modification relations between words, so it is widely applied in many fields of natural language processing. In dependency grammar analysis, a "dependency" is a non-equivalent, directed relationship between two words in a sentence: the governing word is called the head (the dominant), and the governed word is called the dependent (the subordinate). Referring to FIG. 3, a diagram of dependency grammar analysis showing the dependency relationships between words, the direction of each arrow points from the head (i.e., the governing word) to the dependent (i.e., the governed word). As can be seen from FIG. 3, the dependency between "is" and "Beijing" is SBV (subject-verb relation), with "is" as the head and "Beijing" as the dependent. The dependency between "is" and "capital" is VOB (verb-object relation), with "is" as the head and "capital" as the dependent. The dependency between "of" and "China" is DE (the 的 construction), with "of" as the head and "China" as the dependent. The dependency between "capital" and "of" is ATT (attributive relation), with "capital" as the head and "of" as the dependent. As FIG. 3 shows, each word acts as a dependent at most once, but a word may act as a head multiple times.
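The parse in FIG. 3 can be written down as plain data; a small sketch encoding each dependency as a (head, label, dependent) triple, with arrows running from head to dependent:

```python
# Dependency arcs for "Beijing is the capital of China", per FIG. 3.
arcs = [
    ("is", "SBV", "Beijing"),    # subject-verb: "is" governs "Beijing"
    ("is", "VOB", "capital"),    # verb-object: "is" governs "capital"
    ("capital", "ATT", "of"),    # attributive: "capital" governs "of"
    ("of", "DE", "China"),       # DE construction: "of" governs "China"
]

dependents = [dep for _, _, dep in arcs]
heads = [head for head, _, _ in arcs]

# Each word is a dependent at most once, but may serve as head several times.
print(len(dependents) == len(set(dependents)), heads.count("is"))
```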

In summary, dependency grammar analysis is a main implementation of syntactic analysis in natural language processing, but there is currently no reasonable way to implement it; that is, when dependency grammar analysis is used to analyze a sentence, accurate and reliable analysis results cannot be obtained, and the analysis accuracy is not high.

To solve the above problems, the embodiments of the present application provide a dependency relationship detection method that trains a target dependency grammar model and detects the dependency relationships between detection words in sentence detection data based on that model, so as to obtain accurate and reliable detection results with high detection accuracy; that is, the detection accuracy of the target dependency grammar model is high and the model itself is small, so sentences can be analyzed by dependency grammar analysis, obtaining accurate and reliable analysis results with high analysis accuracy.

In the embodiment of the application, a training process and a detection process can be involved, wherein in the training process, an initial dependency grammar model is constructed in advance, and the initial dependency grammar model is trained to obtain a trained target dependency grammar model. Fig. 4 is a schematic diagram of a training process according to an embodiment of the present application.

Step 401, obtaining sentence training data, where the sentence training data includes a plurality of training words, a part of speech of each training word, and a dependency relationship of at least one training set, and the training set includes two training words.

Referring to fig. 5, a schematic diagram of the sentence training data "Beijing is the capital of China" (北京是中国的首都), the sentence training data may include training words (i.e., vocabulary), parts of speech, position numbers, and dependency relationships (i.e., the dependencies of multiple training groups, each group comprising two training words).

The sentence training data may include the following training words: "Beijing" (北京), "is" (是), "China" (中国), "of" (的), and "capital" (首都). In the sentence training data, the position number indicates which of all the training words a given training word is: for example, the position number of "Beijing" is "0", that of "is" is "1", that of "China" is "2", that of "of" is "3", and that of "capital" is "4".

The sentence training data may further include the part of speech of each training word: for example, the part of speech of "Beijing" is "noun", that of "is" is "verb", that of "China" is "noun", that of "of" is "auxiliary word", and that of "capital" is "noun".

The sentence training data may also include the dependency relationship between two training words: for example, the dependency between "is" and "Beijing" is SBV (subject-verb relation), the dependency between "is" and "capital" is VOB (verb-object relation), the dependency between "of" and "China" is DE (the 的 construction), and the dependency between "capital" and "of" is ATT (attributive relation). Of course, the above only shows several dependencies, such as SBV, VOB, DE and ATT; in practical applications there may be other types, such as IOB (indirect-object relation), CMP (complement structure), HED (head relation), etc., and the dependency types are not limited.

Step 402, based on the sentence training data, obtaining a target training word, determining a target part of speech of the target training word, and determining a target dependency relationship between the target training word and the associated training word.

For example, a training word (e.g., at least one training word) in the sentence training data may be used as the target training word: any of "Beijing", "is", "China", "of" and "capital" may be used. For example, "Beijing" may be used as the target training word; in this case the target part of speech of the target training word is "noun", and since there is a dependency relationship between "is" and "Beijing", the associated training word of the target training word is "is", and the target dependency relationship between the target training word and the associated training word is "SBV". For another example, "China" may be used as the target training word; in this case the target part of speech is "noun", and since there is a dependency relationship between "of" and "China", the associated training word is "of", and the target dependency relationship between the target training word and the associated training word is "DE".

In one possible implementation, a transfer-based semantic dependency analysis method may be adopted to obtain a target training word, determine a target part of speech of the target training word, and determine a target dependency relationship between the target training word and an associated training word, which is described below.

A transfer-based semantic dependency analysis method may be employed to segment the statement training data into a plurality of action sequences, which may include, for each action sequence, a transfer action and configuration data, which may include stack data, sequence data, and dependency results. The sequence data is used for storing a plurality of training words in sentence training data, the stack data is used for storing the training words extracted from the sequence data, and the dependency result is used for storing the dependency relationship between the training words in the stack data.

For example, for sentence training data "beijing is the capital of china", a semantic dependency analysis method based on transfer may be adopted to segment the sentence training data into a plurality of action sequences shown in table 1, each row of table 1 represents one action sequence, and the action sequence may include a transfer action (transition), stack data S, sequence data B (i.e., node sequence B), and a dependency result a, and the stack data S, the sequence data B, and the dependency result a constitute configuration data (configuration). The sequence data B is used for storing a plurality of training words, the training words in the sequence data B are sequentially extracted and placed into stack data S, the stack data S is used for storing the training words extracted from the sequence data, ROOT represents a virtual ROOT node, called a virtual ROOT for short, and the dependency result A is used for storing the dependency relationship among the training words in the stack data S.

TABLE 1

Transfer action (transition)   Stack data S              Sequence data B                  Dependency result A
(initial)                      [ROOT]                    [Beijing is China of capital]    Φ (empty set)
SHIFT                          [ROOT Beijing]            [is China of capital]
SHIFT                          [ROOT Beijing is]         [China of capital]
LEFT-ARC(SBV)                  [ROOT is]                 [China of capital]               A ∪ SBV(is, Beijing)
SHIFT                          [ROOT is China]           [of capital]
SHIFT                          [ROOT is China of]        [capital]
LEFT-ARC(DE)                   [ROOT is of]              [capital]                        A ∪ DE(of, China)
SHIFT                          [ROOT is of capital]      []
LEFT-ARC(ATT)                  [ROOT is capital]         []                               A ∪ ATT(capital, of)
RIGHT-ARC(VOB)                 [ROOT is]                 []                               A ∪ VOB(is, capital)
RIGHT-ARC(ROOT)                [ROOT]                    []                               A ∪ ROOT(ROOT, is)

For example, a transfer-based semantic dependency analysis method may be used to perform serialized segmentation on the sentence training data "Beijing is the capital of China", splitting it into combinations of configuration data (configuration) and a transfer action (transition) as described in Table 1, where each row is referred to as an action sequence. Then, features are extracted from the configuration data as input to the model, and the dependency relationships between the words are predicted.

As described in Table 1, taking the action sequence in row 4 as an example, the transfer action (transition) may be "LEFT-ARC(SBV)", and the configuration data (configuration) may include stack data S, sequence data B, and dependency result A, where the stack data S may be "[ROOT is]", the sequence data B may be "[China of capital]", and the dependency result A may be "A ∪ SBV(is, Beijing)".

The following describes the transfer-based semantic dependency analysis method with reference to a specific application scenario. However, the following description is only an example; the transfer-based semantic dependency analysis method is not limited, as long as configuration data can be obtained that includes stack data S, sequence data B, and a dependency result A.

Illustratively, the configuration data (configuration) may be expressed as c = (s, b, A), where s denotes the stack, i.e., the stack data S; b denotes the buffer queue, i.e., the sequence data B; and A denotes the currently obtained set of dependency arcs (dependency arcs), i.e., the dependency result A.

Suppose the sentence training data is w1, w2, …, wn, where each wi is a training word in the sentence training data. Then, in the initial state, the configuration data is s = [ROOT], b = [w1, w2, …, wn], and A = Φ. If the sequence data B is empty and s = [ROOT], this is the last configuration data, i.e., the terminal state; at this point the whole decision process ends, and the segmentation into configuration data is complete.

The transfer action (transition) can have three states: LEFT-ARC, RIGHT-ARC, and SHIFT. In the following, si denotes the i-th element of the stack data S (the stack is last-in first-out; elements are counted in pop order, from the top), and bi denotes the i-th element of the sequence data B (the buffer is first-in first-out; elements are counted in pop order, from the front). Then:

LEFT-ARC(l): when the number of elements in the stack data S is greater than or equal to 2, a dependency arc (i.e., dependency relationship) s1 → s2 may be added, with the dependency label of the arc being l, and then s2 is removed from the stack data S.

RIGHT-ARC(l): when the number of elements in the stack data S is greater than or equal to 2, a dependency arc (i.e., dependency relationship) s2 → s1 may be added, with the dependency label of the arc being l, and then s1 is removed from the stack data S.

SHIFT: when the number of elements in the sequence data B (i.e., the buffer) is greater than or equal to 1, b1 is removed from the sequence data B and pushed onto the stack data S (i.e., the stack).

Let Nl denote the total number of dependency label types l. Then each configuration corresponds to 2Nl + 1 possible transitions, i.e., each decision step is a (2Nl + 1)-way classification problem.
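The three transfer actions can be sketched directly from these definitions; replaying the action sequence of Table 1 on the example sentence reproduces its dependency result (plain word strings stand in for real tokens):

```python
def shift(stack, buffer, arcs):
    stack.append(buffer.pop(0))            # move b1 onto the stack

def left_arc(stack, buffer, arcs, label):
    head = stack[-1]
    dep = stack.pop(-2)                    # add arc s1 -> s2, remove s2
    arcs.append((head, label, dep))

def right_arc(stack, buffer, arcs, label):
    dep = stack.pop()                      # add arc s2 -> s1, remove s1
    arcs.append((stack[-1], label, dep))

stack, buffer, arcs = ["ROOT"], ["Beijing", "is", "China", "of", "capital"], []
shift(stack, buffer, arcs)
shift(stack, buffer, arcs)
left_arc(stack, buffer, arcs, "SBV")       # SBV(is, Beijing)
shift(stack, buffer, arcs)
shift(stack, buffer, arcs)
left_arc(stack, buffer, arcs, "DE")        # DE(of, China)
shift(stack, buffer, arcs)
left_arc(stack, buffer, arcs, "ATT")       # ATT(capital, of)
right_arc(stack, buffer, arcs, "VOB")      # VOB(is, capital)
right_arc(stack, buffer, arcs, "ROOT")     # ROOT(ROOT, is)
print(stack, arcs[-1])                     # terminal: ['ROOT'] ('ROOT', 'ROOT', 'is')
```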

Based on the configuration data of each row shown in table 1, K1 training words may be selected from stack data S as target training words, and K2 training words may be selected from sequence data B as target training words, where K1 may be a positive integer, K2 may also be a positive integer, and K1 and K2 may be the same or different. See the subsequent examples for how to select training words as target training words.

For example, after the target training word is obtained, as shown in fig. 5, since each training word has a part-of-speech, the part-of-speech of the target training word may be used as the target part-of-speech.

For example, based on each row of configuration data shown in Table 1, a dependency relationship may be selected from the dependency result A as a target dependency relationship between the target training word and the associated training word. For example, when a training word is selected from the configuration data in row 4 (e.g., stack data S and sequence data B) as a target training word, a dependency relationship is selected from the dependency result a of the configuration data in row 4 as a target dependency relationship, when a training word is selected from the configuration data in row 5 as a target training word, a dependency relationship is selected from the dependency result a of the configuration data in row 5 as a target dependency relationship, and so on.

Step 403: determine a character vector for each character in the target training word, and determine the training word feature corresponding to the target training word based on the sum of the character vectors of all characters in the target training word. For example, the mean of the character vectors of all characters is determined from their sum, and the training word feature corresponding to the target training word is determined from this mean, i.e., the character-vector mean is taken as the training word feature.

In one possible implementation, based on each row configuration data shown in table 1, training word features corresponding to the row configuration data may be determined, that is, each row configuration data corresponds to a set of training word features, and assuming that the number of target training words is M, each row configuration data corresponds to M training word features.

For convenience of description, assume that each row of configuration data corresponds to the following 18 training word features, i.e., the value of M is 18. Of course, the training word features corresponding to each row of configuration data may also be a subset of these 18 features, or other training word features; this is not limited.

S1.w, S2.w, S3.w, b1.w, b2.w, b3.w, lc1(S1).w, lc1(S2).w, rc1(S1).w, rc1(S2).w, lc2(S1).w, lc2(S2).w, rc2(S1).w, rc2(S2).w, lc1(lc1(S1)).w, lc1(lc1(S2)).w, rc1(rc1(S1)).w, rc1(rc1(S2)).w

The above-mentioned 18 training word features are described below with reference to table 2 (i.e., the third row of table 1).

TABLE 2

Transfer action | Stack data S      | Sequence data B    | Dependency result A
SHIFT           | [ROOT Beijing is] | [China of capital] | (empty)

In the above 18 training word features, w represents a word, i.e., the word at the corresponding position. S1.w represents the training word feature corresponding to the first word (in order from right to left) in the stack data S, i.e., the first word in the stack data S is taken as the target training word, namely "is" in Table 2. S2.w represents the training word feature corresponding to the second word in the stack data S, i.e., the second word in the stack data S is taken as the target training word, namely "Beijing" in Table 2. S3.w represents the training word feature corresponding to the third word in the stack data S, i.e., the third word in the stack data S is taken as the target training word, namely "ROOT" in Table 2. In practical applications, when there is no word at the corresponding position, it is expressed as "None".

b1.w represents the training word feature corresponding to the first word (in left-to-right order) in the sequence data B, i.e., the first word in the sequence data B is taken as the target training word, namely "China" in Table 2. b2.w represents the training word feature corresponding to the second word in the sequence data B, i.e., the second word in the sequence data B is taken as the target training word, namely "of" in Table 2. b3.w represents the training word feature corresponding to the third word in the sequence data B, i.e., the third word in the sequence data B is taken as the target training word, namely "capital" in Table 2.

Illustratively, lc1 denotes the left-most child, i.e., the first word to the left of a word; rc1 denotes the right-most child, i.e., the first word to the right of a word; lc2 denotes the second word to the left of a word; and rc2 denotes the second word to the right of a word. On this basis:

lc1(S1).w represents the training word feature corresponding to the first word to the left of the first word in the stack data S, i.e., that word is taken as the target training word, namely "Beijing" in Table 2. Furthermore, lc1(S2).w represents the training word feature corresponding to the first word to the left of the second word in the stack data S, i.e., that word is taken as the target training word, namely "ROOT" in Table 2. Furthermore, rc1(S1).w represents the training word feature corresponding to the first word to the right of the first word in the stack data S, i.e., that word is taken as the target training word, namely "None". Furthermore, rc1(S2).w represents the training word feature corresponding to the first word to the right of the second word in the stack data S, i.e., that word is taken as the target training word, namely "is" in Table 2.

lc2(S1).w denotes the training word feature corresponding to the second word to the left of the first word in the stack data S, lc2(S2).w denotes the training word feature corresponding to the second word to the left of the second word in the stack data S, rc2(S1).w denotes the training word feature corresponding to the second word to the right of the first word in the stack data S, and rc2(S2).w denotes the training word feature corresponding to the second word to the right of the second word in the stack data S. lc1(lc1(S1)).w denotes the first word to the left of the candidate word (the first word to the left of the first word in the stack data S), lc1(lc1(S2)).w denotes the first word to the left of the candidate word (the first word to the left of the second word in the stack data S), rc1(rc1(S1)).w denotes the first word to the right of the candidate word (the first word to the right of the first word in the stack data S), and rc1(rc1(S2)).w denotes the first word to the right of the candidate word (the first word to the right of the second word in the stack data S).

In summary, 18 target training words are involved, which may be selected from the stack data S and the sequence data B. For each target training word, the training word feature corresponding to it may be determined; that is, the 18 target training words correspond to the 18 training word features listed above. The following describes the process of determining a training word feature.
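The selection of the 18 feature positions from a parser configuration can be sketched as follows. This is an illustrative reading of the template above, not code from the source; `stack`, `buffer`, and the left/right child maps are hypothetical names, with "None" standing in for an empty position as the text describes.

```python
# Sketch: collect the 18 words of the template S1..S3, b1..b3, lc/rc combinations.
# The stack top is the last list element; child maps go from a word to its
# children ordered outward (index 0 = closest to the edge).

NONE = "None"  # placeholder used when no word occupies a position

def nth_from_stack_top(stack, n):
    return stack[-n] if len(stack) >= n else NONE

def nth_from_buffer(buffer, n):
    return buffer[n - 1] if len(buffer) >= n else NONE

def word_positions(stack, buffer, left_child, right_child):
    """Return the 18 words (or "None") in the template order."""
    def lc(w, k):  # k-th left-most child of word w
        kids = left_child.get(w, [])
        return kids[k - 1] if len(kids) >= k else NONE

    def rc(w, k):  # k-th right-most child of word w
        kids = right_child.get(w, [])
        return kids[k - 1] if len(kids) >= k else NONE

    s1, s2, s3 = (nth_from_stack_top(stack, n) for n in (1, 2, 3))
    b1, b2, b3 = (nth_from_buffer(buffer, n) for n in (1, 2, 3))
    return [
        s1, s2, s3, b1, b2, b3,
        lc(s1, 1), lc(s2, 1), rc(s1, 1), rc(s2, 1),
        lc(s1, 2), lc(s2, 2), rc(s1, 2), rc(s2, 2),
        lc(lc(s1, 1), 1), lc(lc(s2, 1), 1),
        rc(rc(s1, 1), 1), rc(rc(s2, 1), 1),
    ]

# The Table 2 configuration: stack [ROOT Beijing is], sequence [China of capital].
positions = word_positions(["ROOT", "Beijing", "is"],
                           ["China", "of", "capital"], {}, {})
```

With no dependency arcs recorded yet, all twelve child positions come out as "None", matching the rc1(S1).w = "None" example in the text.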

First, a word vector file may be stored in advance. The word vector file may be obtained by training with the CBOW (Continuous Bag-of-Words) algorithm, or with other algorithms; the training process of the word vector file is not limited. The word vector file includes a mapping from "words" to "word vectors", i.e., the input of the word vector file is a "word" and its output is the corresponding "word vector".

For example, each word in the word vector file may correspond to one word vector of 1 × 100 dimensions, and may also correspond to word vectors of other dimensions, which is not limited in this respect.

Next, for each word (character) in the target training word, the word vector corresponding to it can be obtained by querying the word vector file, so that the word vector of each word in the target training word is determined. For example, the target training word "Beijing" (北京) comprises the characters "北" and "京"; by querying the word vector file, the word vector corresponding to "北" and the word vector corresponding to "京" are obtained. Likewise, the target training word "is" (是) contains a single character, whose word vector is obtained by querying the word vector file.

Then, for a target training word, if the target training word includes only one word, the word vector corresponding to that word is used as the training word feature corresponding to the target training word. For example, for the target training word "is", the word vector corresponding to "is" is used as the training word feature.

For a target training word that includes at least two words, the training word feature corresponding to the target training word is determined based on the sum of the word vectors of all its words, i.e., the average of the word vectors of all the words is used as the training word feature. For example, for the target training word "Beijing" (北京), the average of the word vector corresponding to "北" and the word vector corresponding to "京" may be used as the training word feature, i.e., the sum of the two word vectors divided by 2.

In summary, in this embodiment, the training word feature of a target training word may be represented in the form of word vector addition, as shown in the following formula:

v̄ = (v1 + v2 + … + vn) / n

In the above formula, vj represents the word vector of the j-th word in the target training word, n represents the total number of words in the target training word, j ranges from 1 to n, and v̄ represents the training word feature of the target training word.

For example, for the target training word "Beijing" (北京), n is 2 and j ranges from 1 to 2. When j is 1, vj represents the word vector corresponding to "北"; when j is 2, vj represents the word vector corresponding to "京". That is, the sum of the word vectors corresponding to "北" and "京", divided by 2, yields the training word feature v̄ corresponding to the target training word "Beijing".
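The averaging above can be sketched as follows. This is illustrative only: the word vector file is modeled as a plain dict from character to a 1×100 vector, and the vector values are toy data, not trained CBOW output.

```python
import numpy as np

def word_feature(word, char_vectors, dim=100):
    """Average the character vectors of `word`; zeros if no characters are found."""
    vecs = [char_vectors[c] for c in word if c in char_vectors]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)  # sum of the vectors divided by n

# Toy stand-ins for vectors looked up in the word vector file.
char_vectors = {
    "北": np.full(100, 1.0),
    "京": np.full(100, 3.0),
}
feat = word_feature("北京", char_vectors)  # (1.0 + 3.0) / 2 = 2.0 in every dimension
```

A single-character word simply gets its own vector back, matching the one-word case described above.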

In summary, in this embodiment, the prediction process over whole words (see Fig. 6A) is optimized into a prediction process over the characters that make up each word (see Fig. 6B), so as to improve the prediction accuracy.

Step 404, determining the training part-of-speech feature corresponding to the target part of speech of the target training word.

Referring to step 403, M target training words may be obtained. Taking M as 18 as an example, the 18 target training words correspond to 18 target parts of speech, i.e., each target training word corresponds to one target part of speech. The training part-of-speech features corresponding to the target parts of speech may be represented as:

S1.t, S2.t, S3.t, b1.t, b2.t, b3.t, lc1(S1).t, lc1(S2).t, rc1(S1).t, rc1(S2).t, lc2(S1).t, lc2(S2).t, rc2(S1).t, rc2(S2).t, lc1(lc1(S1)).t, lc1(lc1(S2)).t, rc1(rc1(S1)).t, rc1(rc1(S2)).t

In the above 18 training part-of-speech features, t represents the part of speech, i.e., the part of speech of the word at the corresponding position. S1.t represents the training part-of-speech feature corresponding to the first word (i.e., the target training word) in the stack data S, S2.t represents the training part-of-speech feature corresponding to the second word in the stack data S, S3.t represents the training part-of-speech feature corresponding to the third word in the stack data S, b1.t represents the training part-of-speech feature corresponding to the first word in the sequence data B, b2.t represents the training part-of-speech feature corresponding to the second word in the sequence data B, b3.t represents the training part-of-speech feature corresponding to the third word in the sequence data B, lc1(S1).t represents the training part-of-speech feature corresponding to the first word to the left of the first word in the stack data S, and so on. The training part-of-speech features corresponding to the other target training words are not repeated here; see step 403.

In summary, 18 target training words are involved, which may be selected from the stack data S and the sequence data B. For each target training word, the training part-of-speech feature corresponding to it may be determined, i.e., the 18 target training words correspond to 18 training part-of-speech features in total. The following describes the process of determining a training part-of-speech feature.

First, a part-of-speech mapping table (i.e., a part-of-speech feature file) may be stored in advance, and the part-of-speech mapping table may include a correspondence between parts-of-speech and part-of-speech features. For example, for each part of speech (such as noun, verb, auxiliary word, adjective, etc.), part of speech features may be randomly generated for the part of speech, and the part of speech features may be 1 × 100 dimensional features, or may be features of other dimensions, and the generation manner is not limited.

Secondly, for each target part-of-speech (e.g., 18 target parts-of-speech corresponding to 18 target training words), the part-of-speech mapping table may be queried through the target part-of-speech to obtain a training part-of-speech feature corresponding to the target part-of-speech, and obviously, 18 target parts-of-speech may correspond to 18 training part-of-speech features.
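The part-of-speech mapping table can be sketched as a lookup that assigns each part of speech a randomly generated 1×100 feature on first use and returns the same stored feature thereafter. The tag names and seeding are illustrative, not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
pos_table = {}  # the part-of-speech mapping table: tag -> feature vector

def pos_feature(tag, dim=100):
    """Return (creating on first use) the feature vector for a part of speech."""
    if tag not in pos_table:
        pos_table[tag] = rng.standard_normal(dim)  # randomly generated feature
    return pos_table[tag]

noun_feat = pos_feature("noun")   # created and stored
again = pos_feature("noun")       # same stored feature is returned
```

The dependency relationship mapping table of step 405 can follow the same pattern, with dependency labels (SBV, VOB, …) as keys.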

Step 405, determining training dependency characteristics corresponding to the target dependency relationship corresponding to the target training words.

In one possible implementation, based on each row of configuration data shown in Table 1, the training dependency features corresponding to that row may be determined; that is, each row of configuration data corresponds to a set of training dependency features. Assuming the number of training dependency features is N, N training dependency features may be obtained. For convenience of description, N is taken as 12. Of course, the training dependency features corresponding to each row of configuration data may also be only a part of the following features, or other training dependency features may be used, which is not limited here.

lc1(S1).l, lc1(S2).l, rc1(S1).l, rc1(S2).l, lc2(S1).l, lc2(S2).l, rc2(S1).l, rc2(S2).l, lc1(lc1(S1)).l, lc1(lc1(S2)).l, rc1(rc1(S1)).l, rc1(rc1(S2)).l

In the above 12 training dependency features, l represents a dependency relationship between words, i.e., the target dependency relationship between a target training word and an associated training word. For example, lc1(S1).l represents the training dependency feature corresponding to the target dependency relationship between S1.w and lc1(S1).w, where S1.w denotes the first word in the stack data S (i.e., the target training word) and lc1(S1).w denotes the first word to the left of the first word in the stack data S (i.e., the associated training word). Furthermore, lc1(S2).l represents the training dependency feature corresponding to the target dependency relationship between S2.w and lc1(S2).w, and so on. The training dependency features corresponding to the other target dependency relationships are not repeated here; the meanings of the related target training words/associated training words may be found in step 403.

In summary, 12 training-dependent features are shown, and the determination of the training-dependent features is explained below.

First, a dependency map (i.e., a dependency feature file) may be stored in advance, and the dependency map may include a correspondence between dependencies and dependency features. For example, for each dependency (e.g., SBV, VOB, DE, ATT, IOB, CMP, HED, etc.), a dependency feature may be randomly generated for the dependency, where the dependency feature may be a 1 × 100-dimensional dependency feature, or may be a dependency feature of other dimensions, and the generation manner of the dependency feature is not limited.

Secondly, for each target dependency relationship (e.g. 12 target dependency relationships), the dependency relationship mapping table may be queried through the target dependency relationship to obtain the training dependency features corresponding to the target dependency relationship, and obviously, 12 target dependency relationships may correspond to 12 training dependency features.

Step 406, constructing a target training feature based on the training word feature, the training part-of-speech feature and the training dependency feature, that is, combining the training word feature, the training part-of-speech feature and the training dependency feature to obtain the target training feature. For example, 18 training word features, 18 training part-of-speech features, and 12 training dependent features may be combined to obtain 48 target training features.
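The combination in step 406 can be sketched as a simple concatenation of the three feature groups. The 1×100 dimension per feature follows the example dimensions in the text; the zero vectors are placeholders.

```python
import numpy as np

def build_input(word_feats, pos_feats, dep_feats):
    """Concatenate 18 word, 18 part-of-speech and 12 dependency features."""
    assert len(word_feats) == 18 and len(pos_feats) == 18 and len(dep_feats) == 12
    return np.concatenate(word_feats + pos_feats + dep_feats)

x = build_input([np.zeros(100)] * 18,
                [np.zeros(100)] * 18,
                [np.zeros(100)] * 12)
# 48 features x 100 dimensions each -> a 4800-dimensional model input
```

Under these assumed dimensions, the input layer of the model in step 407 would receive a 4800-dimensional vector per configuration.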

Step 407, training the configured initial dependency grammar model based on the target training feature to obtain a trained target dependency grammar model. For example, the target dependency grammar model may be used to detect dependencies between detection words in the sentence detection data, see the following embodiments.

Illustratively, for step 407, the following steps may be taken to implement the training process:

step 4071, obtain the configured initial dependency grammar model, where the initial dependency grammar model may be configured arbitrarily according to experience, and the initial dependency grammar model is not limited, for example, the initial dependency grammar model may be based on deep learning, or may be based on a neural network.

Referring to Fig. 7, which is a schematic structural diagram of the initial dependency grammar model, the model may include an Input layer, a Hidden layer, and a Softmax layer, where the Softmax layer is used to normalize the classification scores into class probabilities. Of course, in practical applications, the initial dependency grammar model may also include other types of network layers, which is not limited here.

Step 4072, inputting the target training features into the initial dependency grammar model.

For example, the target training features may be used as input data of an input layer of the initial dependency grammar model, and the target training features are input to the input layer of the initial dependency grammar model, that is, 18 training word features, 18 training part-of-speech features, and 12 training dependency features are used as input data of the input layer.

Step 4073, processing the target training feature with an activation function (the activation function may be a cubic function or other types of functions) through the initial dependency grammar model to obtain a target feature value.

For example, the input layer of the initial dependency grammar model may input the target training feature to the hidden layer of the initial dependency grammar model, and the hidden layer may process the target training feature using the activation function to obtain the target feature value. For example, the activation function of the hidden layer may be a cubic function, which is only an example and can be fitted to various combinations of the above 48 features (18 training word features, 18 training part-of-speech features, and 12 training dependent features), as shown in the following formula:

in the above formula, x1、x2...xmThe 48 characteristics are shown, namely m is 48, w1、w2、…、wmAnd b is the network parameters of the hidden layer of the initial dependency grammar model, which are the parameters to be optimized, and the network parameters are adjusted in the adjusting process of the initial dependency grammar model. x is the number ofixjxkThe expression refers to one of the feature combinations after fitting, in other words, the cubic formula can make a combination of every three features in the 48 features, and the cubic formula can contain all combinations after expansion.

Obviously, substituting the 48 features x1, x2, …, xm into the above formula yields the target feature value h.
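A minimal numerical sketch of this cubic activation, with illustrative sizes (scalar features instead of 100-dimensional ones, and an arbitrary hidden width):

```python
import numpy as np

def hidden_cube(x, W, b):
    """Hidden layer with cube activation: elementwise (W @ x + b) ** 3."""
    return np.power(W @ x + b, 3)

x = np.ones(48)              # the 48 input features (scalar stand-ins here)
W = np.full((200, 48), 0.1)  # hidden layer weights, 200 hidden units
b = np.zeros(200)
h = hidden_cube(x, W, b)     # each unit computes (sum of 48 * 0.1)^3
```

Cubing the affine map is what produces the xi·xj·xk cross terms mentioned above, since expanding the cube of a sum yields all three-way products.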

Step 4074, adjusting the initial dependency grammar model based on the target feature value to obtain an adjusted dependency grammar model. For example, the loss function may be pre-constructed, and the loss function is not limited and may be configured empirically. The input of the loss function is a target characteristic value, and the output of the loss function is a loss value, so that after the target characteristic value is substituted into the loss function, the loss value can be obtained, and the network parameters of the initial dependency grammar model are adjusted based on the loss value to obtain an adjusted dependency grammar model, and the adjustment mode is not limited, such as a gradient descent method.

Step 4075, determine whether the adjusted dependent grammar model has converged.

If not, step 4076 is performed, and if so, step 4077 is performed.

For example, if the loss value determined based on the target feature value is less than the threshold value, it is determined that the adjusted dependent grammar model has converged, otherwise, it is determined that the adjusted dependent grammar model has not converged.

For another example, if the number of iterations of the dependent grammar model reaches a preset number threshold, it is determined that the adjusted dependent grammar model has converged, otherwise, it is determined that the adjusted dependent grammar model has not converged.

For another example, if the iteration duration of the dependent grammar model reaches the preset duration threshold, it is determined that the adjusted dependent grammar model has converged, otherwise, it is determined that the adjusted dependent grammar model has not converged.

Of course, the above are just a few examples; the way of determining whether the model has converged is not limited.
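The loop of steps 4072-4076 with the three convergence conditions above can be sketched as follows. The loss function is a toy stand-in; a real model would update its parameters by gradient descent inside `step_fn`.

```python
import time

def train(step_fn, loss_threshold=0.01, max_iters=1000, max_seconds=60.0):
    """Run adjust steps until loss, iteration count, or time budget converges."""
    start = time.monotonic()
    for it in range(1, max_iters + 1):
        loss = step_fn(it)                          # one adjustment, returns loss
        if loss < loss_threshold:                   # condition: loss below threshold
            return "loss", it
        if time.monotonic() - start > max_seconds:  # condition: time budget reached
            return "time", it
    return "iters", max_iters                       # condition: iteration cap reached

outcome, iters = train(lambda it: 1.0 / it)  # toy loss that shrinks each step
```

Whichever condition fires first ends training, and the model at that point is taken as the trained target dependency grammar model (step 4077).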

Step 4076, the adjusted dependent grammar model is used as the initial dependent grammar model, and the process returns to step 4072, that is, the target training features are input to the adjusted initial dependent grammar model.

Step 4077, the adjusted dependency grammar model is used as the trained target dependency grammar model.

And finishing the training process to obtain the trained target dependency grammar model.

In the detection process, sentence detection data can be detected based on the target dependency grammar model, namely, the dependency relationship among detection words is detected. Fig. 8 is a schematic diagram of a detection process according to an embodiment of the present application.

Step 801, obtaining statement detection data, where the statement detection data includes a plurality of detection words and part of speech of each detection word. For example, sentence detection data of the dependency relationship to be detected is obtained, and the sentence detection data is segmented by using a segmentation mode to obtain a plurality of detection words and the part of speech of each detection word.

Taking the sentence detection data "Beijing is the capital of China" as an example, the sentence detection data includes the following detection words: "Beijing", "is", "China", "of", "capital". The part of speech of each detection word is as follows: the part of speech of "Beijing" is "noun", the part of speech of "is" is "verb", the part of speech of "China" is "noun", the part of speech of "of" is "auxiliary word", and the part of speech of "capital" is "noun".

Step 802, based on the sentence detection data, obtaining a target detection word, determining a target part of speech of the target detection word, and determining a target dependency relationship between the target detection word and the associated detection word.

In one possible implementation, a semantic dependency analysis method based on transfer may be adopted to obtain a target detection word, determine a target part of speech of the target detection word, and determine a target dependency relationship between the target detection word and an associated detection word, which is described below.

The statement detection data may be split into multiple action sequences using a transfer-based semantic dependency analysis method, and for each action sequence, the action sequence may include a transfer action and configuration data, which may include stack data, sequence data, and dependency results. The sequence data is used for storing a plurality of detection words in statement detection data, the stack data is used for storing the detection words extracted from the sequence data, and the dependency result is used for storing the dependency relationship between the detection words in the stack data.

For example, for statement detection data "beijing is the capital of china", a semantic dependency analysis method based on transfer may be adopted to segment the statement detection data into a plurality of action sequences shown in table 1, each row in table 1 represents one action sequence, the action sequence may include a transfer action, stack data S, sequence data B, and dependency result a, and the stack data S, the sequence data B, and the dependency result a constitute configuration data. The sequence data B is used for storing a plurality of detection words, the detection words in the sequence data B are sequentially extracted and placed in the stack data S, the stack data S is used for storing the detection words extracted from the sequence data, and the dependence result A is used for storing the dependence relationship among the detection words in the stack data S.
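The splitting into action sequences can be illustrated with a generic arc-standard transition sketch. This is a common transfer-based scheme consistent with the rows of Table 1, but the function names and the "SBV" label here are illustrative, not taken from the source.

```python
# SHIFT moves the next word from the sequence data onto the stack;
# LEFT_ARC / RIGHT_ARC record a dependency between the top two stack words.

def shift(stack, buffer, arcs):
    stack.append(buffer.pop(0))

def left_arc(stack, buffer, arcs, label):
    dep = stack.pop(-2)                  # second-from-top becomes a dependent of top
    arcs.append((stack[-1], label, dep))

def right_arc(stack, buffer, arcs, label):
    dep = stack.pop()                    # top becomes a dependent of second-from-top
    arcs.append((stack[-1], label, dep))

stack, buffer, arcs = ["ROOT"], ["Beijing", "is", "China", "of", "capital"], []
shift(stack, buffer, arcs)              # stack: [ROOT Beijing]
shift(stack, buffer, arcs)              # stack: [ROOT Beijing is]
left_arc(stack, buffer, arcs, "SBV")    # "Beijing" recorded as a dependent of "is"
```

Each (action, stack, buffer, arcs) snapshot corresponds to one row of configuration data; the arcs list plays the role of the dependency result A.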

When the sentence detection data is split into the plurality of action sequences shown in Table 1, a difference from step 402 arises: in step 402 the dependency relationships in the dependency result A can be obtained from the sentence training data (e.g., the dependency relationship "SBV" between "is" and "Beijing" is obtained from the sentence training data), whereas in step 802 the dependency relationships cannot be obtained from the sentence detection data. Therefore, the dependency relationship corresponding to the previous piece of configuration data (i.e., the detection result of the target dependency grammar model) is stored in the dependency result A as the dependency relationship corresponding to the current configuration data, as described below.

For the first row of configuration data, the dependency result A is an empty set; steps 802-807 are executed based on the first row of configuration data to obtain the dependency relationship between detection words, and this dependency relationship is stored in the dependency result A of the second row of configuration data. For the second row of configuration data, the dependency result A is the dependency relationship corresponding to the first row of configuration data; steps 802-807 are executed based on the second row of configuration data to obtain the dependency relationship between detection words, which is stored in the dependency result A of the third row of configuration data, and so on.

Based on each row of configuration data shown in table 1, K1 detection terms can be selected from the stack data S as target detection terms, and K2 detection terms can be selected from the sequence data B as target detection terms, K1 can be a positive integer, K2 can also be a positive integer, and K1 and K2 can be the same or different.

For example, after the target detection word is obtained, since each detection word has a part-of-speech, the part-of-speech of the target detection word may be used as the target part-of-speech. Further, based on each row configuration data shown in table 1, a dependency relationship may be selected from the dependency result a as a target dependency relationship between the target detection word and the associated detection word. For example, when a detection word is selected as a target detection word from the 4 th row of configuration data (e.g., stack data S and sequence data B), a dependency relationship may be selected as a target dependency relationship from the dependency result a of the 4 th row of configuration data, and so on.

Step 803, determining a word vector of each word in the target detection word, and determining the detection word characteristic corresponding to the target detection word based on the sum of the word vectors of all the words in the target detection word. For example, a word vector average value of all words is determined based on the sum of the word vectors of all words, and a detection word feature corresponding to the target detection word is determined based on the word vector average value, that is, the word vector average value is used as the detection word feature.

For an example, the implementation process of step 803 may refer to step 403, and will not be described herein.

And step 804, determining detection part-of-speech characteristics corresponding to the target part-of-speech corresponding to the target detection words.

For example, a part-of-speech mapping table is queried through the target part-of-speech, and a detected part-of-speech feature corresponding to the target part-of-speech is obtained, where the part-of-speech mapping table includes a correspondence between the part-of-speech and the part-of-speech feature.

For an exemplary implementation of step 804, refer to step 404, which is not described herein again.

And 805, determining detection dependency characteristics corresponding to the target dependency relationship corresponding to the target detection words.

For example, the dependency relationship mapping table is queried through the target dependency relationship to obtain the detection dependency characteristics corresponding to the target dependency relationship, and the dependency relationship mapping table includes the corresponding relationship between the dependency relationship and the dependency characteristics.

For an exemplary implementation of step 805, refer to step 405, which is not described herein.

Step 806, constructing a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, that is, combining the detection word feature, the detection part-of-speech feature, and the detection dependency feature to obtain the target detection feature. For example, 18 detection word features, 18 detection part-of-speech features, and 12 detection-dependent features may be combined to obtain 48 target detection features.

And step 807, inputting the target detection characteristics to the trained target dependency grammar model to obtain the dependency relationship between two detection words in the sentence detection data.

Illustratively, for step 807, the following steps may be employed to implement the detection process:

Step 8071, obtaining the trained target dependency grammar model. Referring to Fig. 7, which is a schematic structural diagram of the target dependency grammar model, the model may include, but is not limited to, an Input layer, a Hidden layer, and a Softmax layer.

Step 8072, inputting the target detection characteristics to the target dependency grammar model.

Step 8073, processing the target detection feature by using an activation function (the activation function may be a cubic function or other types of functions) through the target dependency grammar model to obtain a target feature value.

For example, the input layer of the target dependency grammar model may input the target detection feature to the hidden layer, and the hidden layer may process the target detection feature by using a cubic function to obtain the target feature value h.

Step 8074, classifying the target feature value through the target dependency grammar model to obtain the confidences corresponding to M categories, where M is a positive integer and each category corresponds to a dependency relationship.

For example, after the target feature value h is obtained, the hidden layer of the target dependency grammar model may input h to the Softmax layer, whose function is to normalize the classification scores into probabilities, so the Softmax layer performs the classification based on the target feature value. Assuming the target dependency grammar model is used to give dependency relationships of M categories (such as SBV, VOB, DE, ATT, IOB, CMP, HED, etc.), the Softmax layer can output the confidences corresponding to the M categories respectively, i.e., M confidences, with each category corresponding to one dependency relationship.

In the above embodiment, the layers from the input layer to the hidden layer are fully connected; a fully connected layer and a Softmax layer are added after the hidden layer, and the Softmax layer outputs the probability of belonging to one of 2Nl + 1 classes (where Nl denotes the number of dependency label types).

Step 8075, determining the maximum confidence level of the M confidence levels, and determining the dependency relationship corresponding to the category corresponding to the maximum confidence level as the dependency relationship between two detection terms in the statement detection data.
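Steps 8074-8075 reduce to an argmax over the per-category confidences. The sketch below uses the example labels from the text and toy confidence values.

```python
import numpy as np

categories = ["SBV", "VOB", "DE", "ATT", "IOB", "CMP", "HED"]

def pick_dependency(confidences):
    """Return the dependency whose category has the maximum confidence."""
    return categories[int(np.argmax(confidences))]

conf = np.array([0.05, 0.1, 0.02, 0.6, 0.03, 0.1, 0.1])  # toy Softmax output
best = pick_dependency(conf)  # "ATT" has the maximum confidence here
```

The selected label is taken as the dependency relationship between the two detection words for the current configuration.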

Based on the technical scheme, in the embodiment of the application, the training word characteristics can be determined based on the sum of the word vectors of all the words in the target training word, and the initial dependency grammar model is trained based on the training word characteristics, the training part-of-speech characteristics corresponding to the target part-of-speech and the training dependency characteristics corresponding to the target dependency relationship, so that the trained target dependency grammar model is obtained. When the target dependency grammar model is adopted to detect the dependency relationship among the detection words in the sentence detection data, an accurate and reliable detection result can be obtained, the detection accuracy is high, namely the detection accuracy of the target dependency grammar model is high, so that the sentence can be analyzed by adopting dependency grammar analysis, an accurate and reliable analysis result is obtained, and the analysis accuracy is high. The words are expressed in a word vector addition mode, so that the accuracy of the dependency grammar can be effectively improved, the training speed is high, the trained target dependency grammar model is small, the training accuracy is improved, and the model accuracy is improved.

In the embodiment, on the basis of the transfer-based method, words are represented in a word vector addition mode, which improves the training accuracy; the word vectors are obtained by training with the CBOW algorithm, so the method is a word-vector-based method for fast dependency grammar analysis. For example, the performance of dependency grammar analysis can be evaluated using two types of metrics: LAS (Labeled Attachment Score), the percentage of words for which the correct head word is found and the dependency label type is also correct; and UAS (Unlabeled Attachment Score), the percentage of words for which the correct head word is found. They may be expressed by the following formulas: LAS = (number of words with correct head and correct label) / (total number of words), and UAS = (number of words with correct head) / (total number of words).
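The CBOW pre-training mentioned above predicts a center word from the average of its context word vectors. The following is a minimal sketch of that objective (not the application's implementation; the toy corpus and hyperparameters are chosen only for illustration):

```python
import numpy as np

def train_cbow(corpus, dim=16, window=2, lr=0.05, epochs=50, seed=0):
    """Minimal CBOW: the averaged context word vectors predict the
    center word through a softmax over the vocabulary."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    rng = np.random.default_rng(seed)
    Win = rng.normal(scale=0.1, size=(V, dim))    # input (context) vectors
    Wout = rng.normal(scale=0.1, size=(V, dim))   # output (center) vectors
    for _ in range(epochs):
        for sent in corpus:
            for pos, center in enumerate(sent):
                ctx = [idx[sent[j]]
                       for j in range(max(0, pos - window),
                                      min(len(sent), pos + window + 1))
                       if j != pos]
                if not ctx:
                    continue
                h = Win[ctx].mean(axis=0)              # averaged context vectors
                z = Wout @ h
                p = np.exp(z - z.max()); p /= p.sum()  # softmax over vocabulary
                p[idx[center]] -= 1.0                  # cross-entropy gradient
                Wout -= lr * np.outer(p, h)
                Win[ctx] -= lr * (Wout.T @ p) / len(ctx)
    return Win, idx

corpus = [["the", "dog", "barks"], ["the", "cat", "meows"]]
Win, idx = train_cbow(corpus, dim=8)
```

In practice a library Word2Vec implementation in CBOW mode would be run on a large corpus; this sketch only shows the objective being optimized.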

Before the technical scheme of the application is adopted, tests show that the LAS accuracy is 60% and the UAS accuracy is 66%. After the technical scheme of the application is adopted, namely words are represented in a word vector addition mode and a word vector pre-training model is adopted, the LAS accuracy is 76% and the UAS accuracy is 80%. Compared with the former, the LAS is improved by 16 percentage points and the UAS by 14 percentage points, so the effect is obviously improved on the whole.
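The LAS and UAS figures above can be computed in a few lines; `gold` and `pred` below are hypothetical (head index, label) pairs per word, not data from the application's tests:

```python
def attachment_scores(gold, pred):
    """gold/pred: one (head_index, label) pair per word.
    UAS: fraction of words whose predicted head is correct.
    LAS: fraction whose head AND dependency label are both correct."""
    assert len(gold) == len(pred)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / len(gold)
    las = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return las, uas

gold = [(2, "ATT"), (0, "HED"), (2, "VOB"), (3, "ATT")]
pred = [(2, "ATT"), (0, "HED"), (2, "SBV"), (1, "ATT")]
las, uas = attachment_scores(gold, pred)
```

Here word 3 has the right head but the wrong label, so it counts toward UAS but not LAS, and word 4 counts toward neither.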

Based on the same application concept as the method, an apparatus for detecting dependency relationship is provided in the embodiment of the present application, as shown in fig. 9A, which is a schematic structural diagram of the apparatus, and the apparatus may include:

an obtaining module 911, configured to obtain sentence training data, where the sentence training data includes a plurality of training words, a part of speech of each training word, and a dependency relationship of at least one training group, and the training group includes two training words; obtaining a target training word based on the sentence training data, determining a target part of speech of the target training word, and determining a target dependency relationship between the target training word and the associated training word;

a determining module 912, configured to determine a word vector of each word in the target training word, and determine a training word feature corresponding to the target training word based on a sum of the word vectors of all words in the target training word; determining training part-of-speech characteristics corresponding to the target part-of-speech, and determining training dependent characteristics corresponding to the target dependent relationship;

a training module 913, configured to construct a target training feature based on the training word feature, the training part-of-speech feature, and the training dependency feature, and train the configured initial dependency grammar model based on the target training feature to obtain a trained target dependency grammar model; the target dependency grammar model is used for detecting the dependency relationship among detection words in the sentence detection data.

In a possible implementation manner, the determining module 912 is specifically configured to, based on a sum of word vectors of all words in the target training word, determine training word features corresponding to the target training word: determining a word vector average value of all words based on the sum of the word vectors of all words in the target training words, and determining training word characteristics corresponding to the target training words based on the word vector average value.
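A minimal sketch of the determining module's averaging step, assuming each character of the word has a pre-trained vector (the lookup table, the zero-vector fallback for unknown characters, and the dimension are illustrative assumptions):

```python
import numpy as np

def word_feature(word, char_vectors, dim=8):
    """Training word feature: sum the vectors of all characters in the
    word, then divide by the character count (the word vector average)."""
    vecs = [char_vectors.get(ch, np.zeros(dim)) for ch in word]
    return np.sum(vecs, axis=0) / len(vecs)   # sum, then mean over characters

chars = {"依": np.ones(8), "赖": np.full(8, 3.0)}  # hypothetical pre-trained vectors
feat = word_feature("依赖", chars)
```

Averaging (rather than only summing) keeps features of short and long words on the same scale.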

In a possible implementation manner, the determining module 912 determines the training part-of-speech feature corresponding to the target part-of-speech, and when determining the training dependency feature corresponding to the target dependency relationship, is specifically configured to: inquiring a part-of-speech mapping table through the target part-of-speech to obtain training part-of-speech characteristics corresponding to the target part-of-speech; querying a dependency relationship mapping table through the target dependency relationship to obtain training dependency characteristics corresponding to the target dependency relationship; the part-of-speech mapping table comprises a corresponding relation between parts-of-speech and part-of-speech characteristics, and the dependency mapping table comprises a corresponding relation between dependency and dependency characteristics.

In a possible implementation manner, the obtaining module 911 obtains the target training words based on the sentence training data, and is specifically configured to: adopt a transfer-based semantic dependency analysis method to divide the sentence training data into a plurality of action sequences, wherein the action sequences comprise transfer actions and configuration data, and the configuration data comprises stack data, sequence data and dependency results; the sequence data is used for storing a plurality of training words in the sentence training data, the stack data is used for storing the training words taken out of the sequence data, and the dependency result is used for storing the dependency relationship between the training words in the stack data; and select K1 training words from the stack data as target training words, and select K2 training words from the sequence data as target training words;

the obtaining module 911, when determining the target dependency relationship between the target training words and the associated training words, is specifically configured to: determining a target dependency between the target training word and an associated training word based on a dependency between training words stored in the dependency results in the configuration data.
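The configuration data (stack data, sequence data, dependency results) and the selection of K1/K2 target words can be sketched as below. The arc-standard actions used here are one common transfer scheme and are an assumption, since the application does not fix the exact action set:

```python
class Configuration:
    """Transfer-based parsing state: stack data, sequence (buffer) data,
    and the dependency results accumulated so far."""
    def __init__(self, words):
        self.stack = []             # stack data: words taken out of the sequence
        self.buffer = list(words)   # sequence data: remaining words
        self.arcs = []              # dependency results: (head, dependent, label)

    def shift(self):
        self.stack.append(self.buffer.pop(0))

    def left_arc(self, label):
        dep = self.stack.pop(-2)    # second-from-top depends on the top word
        self.arcs.append((self.stack[-1], dep, label))

    def right_arc(self, label):
        dep = self.stack.pop()      # top word depends on second-from-top
        self.arcs.append((self.stack[-1], dep, label))

    def target_words(self, k1=3, k2=1):
        # K1 words from the top of the stack plus K2 from the front of the buffer.
        return self.stack[-k1:] + self.buffer[:k2]

c = Configuration(["我", "爱", "北京"])
c.shift(); c.shift()
c.left_arc("SBV")       # 我 <- 爱
c.shift()
c.right_arc("VOB")      # 爱 -> 北京
```

The dependency results stored in `arcs` are exactly what the obtaining module reads when determining the target dependency relationship between a target word and its associated word.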

In a possible implementation, the training module 913 trains the configured initial dependency grammar model based on the target training features, and when obtaining the trained target dependency grammar model, the training module is specifically configured to: inputting the target training characteristics to an initial dependency grammar model, and processing the target training characteristics by the initial dependency grammar model by adopting a cubic function to obtain target characteristic values;

adjusting the initial dependency grammar model based on the target characteristic value to obtain an adjusted dependency grammar model, and determining whether the adjusted dependency grammar model is converged;

if not, taking the adjusted dependent grammar model as an initial dependent grammar model, and returning to execute the operation of inputting the target training characteristics to the initial dependent grammar model;

and if so, taking the adjusted dependent grammar model as a target dependent grammar model.
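The train-adjust-converge loop of the training module 913 might be sketched as follows. The architecture, the cross-entropy loss, the gradient-descent adjustment, and the toy data are all assumptions for illustration; the application only specifies the cubic function and the convergence check:

```python
import numpy as np

def softmax(Z):
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def train(X, y, M, lr=0.05, tol=1e-6, max_iter=300, seed=0):
    """Repeat: feed the target training features to the model, compute the
    target feature values with the cubic function, adjust the model, and stop
    once the loss change drops below `tol` (the model has converged)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, d))   # input -> hidden (fully connected)
    W2 = rng.normal(scale=0.1, size=(d, M))   # hidden -> Softmax layer
    losses = []
    for _ in range(max_iter):
        A = X @ W1
        H = A ** 3                            # cubic activation: target feature values
        P = softmax(H @ W2)                   # M confidences per sample
        loss = -np.log(P[np.arange(n), y] + 1e-12).mean()
        losses.append(loss)
        if len(losses) > 1 and losses[-2] - loss < tol:
            break                             # converged: keep the adjusted model
        G = P.copy(); G[np.arange(n), y] -= 1.0; G /= n
        dW2 = H.T @ G                         # gradients w.r.t. both layers
        dW1 = X.T @ ((G @ W2.T) * 3 * A ** 2)
        W2 -= lr * dW2                        # the adjusted model becomes the
        W1 -= lr * dW1                        # new initial model for the next pass
    return W1, W2, losses

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 6))                  # toy target training features
y = rng.integers(0, 3, size=30)               # toy category labels
W1, W2, losses = train(X, y, M=3)
```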

Based on the same application concept as the method, an apparatus for detecting dependency relationship is provided in the embodiment of the present application, as shown in fig. 9B, which is a schematic structural diagram of the apparatus, and the apparatus may include:

the obtaining module 921 is configured to obtain statement detection data, where the statement detection data includes a plurality of detection words and part of speech of each detection word; acquiring target detection words based on the statement detection data, determining target part of speech of the target detection words, and determining target dependency between the target detection words and the associated detection words;

the determining module 922 is configured to determine a word vector of each word in a target detection word, and determine a detection word feature corresponding to the target detection word based on a sum of the word vectors of all words in the target detection word; determining detection part-of-speech characteristics corresponding to the target part-of-speech, and determining detection dependency characteristics corresponding to the target dependency relationship;

the detection module 923 is configured to construct a target detection feature based on the detection word feature, the detection part-of-speech feature, and the detection dependency feature, and input the target detection feature to a target dependency grammar model that has been trained, to obtain a dependency relationship between two detection words in the sentence detection data.

In a possible implementation manner, the determining module 922 is specifically configured to, based on a sum of word vectors of all words in the target detection word, determine a detection word feature corresponding to the target detection word: determining a word vector average value of all words based on the sum of the word vectors of all words in the target detection word, and determining the detection word characteristics corresponding to the target detection word based on the word vector average value.

In a possible implementation manner, the determining module 922 determines the detection part-of-speech feature corresponding to the target part-of-speech, and when determining the detection dependency feature corresponding to the target dependency relationship, the determining module is specifically configured to: querying a part-of-speech mapping table through the target part-of-speech to obtain a detection part-of-speech characteristic corresponding to the target part-of-speech; querying a dependency relationship mapping table through the target dependency relationship to obtain a detection dependency characteristic corresponding to the target dependency relationship; the part-of-speech mapping table comprises a corresponding relation between parts-of-speech and part-of-speech characteristics, and the dependency mapping table comprises a corresponding relation between dependency and dependency characteristics.

In a possible implementation manner, the obtaining module 921 is specifically configured to, based on the sentence detection data, obtain the target detection word: the statement detection data is segmented into a plurality of action sequences by adopting a semantic dependency analysis method based on transfer, the action sequences comprise transfer actions and configuration data, and the configuration data comprise stack data, sequence data and dependency results; the sequence data is used for storing a plurality of detection words in the statement detection data, the stack data is used for storing detection words taken out of the sequence data, and the dependency result is used for storing the dependency relationship between the detection words in the stack data; selecting K1 detection words from the stack data as target detection words, and selecting K2 detection words from the sequence data as target detection words;

the obtaining module 921 is specifically configured to, when determining the target dependency relationship between the target detection word and the associated detection word: determining a target dependency relationship between the target detection word and an associated detection word based on a dependency relationship between the detection words stored in the dependency result in the configuration data.

In a possible implementation manner, the detection module 923 is specifically configured to input the target detection feature to a trained target dependency grammar model, and obtain a dependency relationship between two detection terms in the sentence detection data: inputting the target detection characteristics to the target dependency grammar model, and processing the target detection characteristics by the target dependency grammar model by adopting a cubic function to obtain target characteristic values; classifying the target characteristic values through the target dependency grammar model to obtain confidence degrees corresponding to M categories respectively; wherein M is a positive integer, and each category corresponds to a dependency relationship; and determining the maximum confidence coefficient of the M confidence coefficients, and determining the dependency relationship corresponding to the category corresponding to the maximum confidence coefficient as the dependency relationship between two detection words in the statement detection data.

Based on the same application concept as the method described above, the embodiment of the present application provides a device (i.e., an electronic device) for detecting a dependency relationship, and as shown in fig. 9C, the electronic device includes: a processor 931 and a machine-readable storage medium 932, the machine-readable storage medium 932 storing machine-executable instructions executable by the processor 931; the processor 931 is configured to execute machine executable instructions to implement the following steps:

obtaining sentence training data, wherein the sentence training data comprises a plurality of training words, the part of speech of each training word and the dependency relationship of at least one training set, and the training set comprises two training words;

obtaining a target training word based on the sentence training data, determining a target part of speech of the target training word, and determining a target dependency relationship between the target training word and the associated training word;

determining a word vector of each word in the target training words, and determining training word characteristics corresponding to the target training words based on the sum of the word vectors of all the words in the target training words; determining training part-of-speech characteristics corresponding to the target part-of-speech, and determining training dependent characteristics corresponding to the target dependent relationship;

constructing a target training characteristic based on the training word characteristic, the training part-of-speech characteristic and the training dependent characteristic, and training the configured initial dependency grammar model based on the target training characteristic to obtain a trained target dependency grammar model; the target dependency grammar model is used for detecting the dependency relationship among detection words in statement detection data; alternatively,

obtaining statement detection data, wherein the statement detection data comprise a plurality of detection words and part of speech of each detection word; acquiring target detection words based on the statement detection data, determining target part of speech of the target detection words, and determining target dependency between the target detection words and the associated detection words;

determining a word vector of each word in the target detection words, and determining detection word characteristics corresponding to the target detection words based on the sum of the word vectors of all the words in the target detection words; determining detection part-of-speech characteristics corresponding to the target part-of-speech, and determining detection dependency characteristics corresponding to the target dependency relationship;

and constructing target detection characteristics based on the detection word characteristics, the detection part-of-speech characteristics and the detection dependency characteristics, and inputting the target detection characteristics to a trained target dependency grammar model to obtain the dependency relationship between two detection words in the statement detection data.

Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, where a plurality of computer instructions are stored on the machine-readable storage medium, and when the computer instructions are executed by a processor, the method for detecting a dependency relationship disclosed in the above examples of the present application can be implemented.

The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), a similar storage medium, or a combination thereof.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.
