Natural language reasoning method and system

Document No.: 1242922    Publication date: 2020-08-18

Note: This technology, "Natural language reasoning method and system" (一种自然语言推理方法及系统), was designed and created by 许光銮, 孙显, 苏武运, 于泓峰, 李沛光, 姚方龙, 刘那与, 王铁平, and 田雨 on 2020-04-24. Its main content is as follows: the invention provides a natural language reasoning method and system, comprising: acquiring and processing natural language sentence pairs that require language reasoning; and predicting the relationship of the processed sentence pairs with a pre-trained algorithm model. The algorithm model introduces an intra-sentence attention module and an inter-sentence attention module into the natural language reasoning model; the sentence-pair relationships include contradiction, entailment, and neutral. The invention enhances the performance of the natural language reasoning model while improving the interpretability of its results: the intra-sentence attention module improves the expressive capability of sentence representations, and the inter-sentence attention module promotes interaction between the sentences.

1. A natural language reasoning method, comprising:

acquiring and processing natural language sentence pairs needing language reasoning;

adopting a pre-trained algorithm model to predict the relationship of the processed sentence pairs;

wherein the algorithm model introduces an intra-sentence attention module and an inter-sentence attention module into the natural language reasoning model; the sentence-pair relationships include: contradiction, entailment, and neutral.

2. The natural language reasoning method according to claim 1, wherein the training of the algorithm model comprises:

carrying out data processing on the labeled natural language sentence pairs to obtain a sample set;

dividing the sample set into a training set and a test set;

using transfer learning to migrate an inter-sentence attention module, pre-trained with a supervision mechanism, into the natural language reasoning model, and capturing alignment information between the sentences in the training set;

performing constituency parsing on the training set containing the alignment information, and extracting the parsing results with an analysis algorithm to obtain the supervision signal for the intra-sentence attention module;

performing joint training on the intra-sentence attention module and the natural language reasoning model with a multi-task learning method, and optimizing the inter-sentence attention module during training;

wherein the choice of the analysis algorithm is determined by the constituency parsing results.

3. The natural language reasoning method according to claim 2, wherein the training of the algorithm model further comprises:

optimizing the algorithm model with the training set.

4. The natural language reasoning method according to claim 1 or 2, wherein the data processing comprises:

segmenting the sentences with the Stanford word segmentation tool, and after segmentation, performing word-embedding processing with a bidirectional LSTM network;

counting the occurrence frequency of all words in the corpus, and retaining the words in the sentences whose frequency exceeds a threshold as high-frequency words;

retrieving the pre-trained word vector corresponding to each high-frequency word from the vocabulary of the pre-trained word vectors;

wherein the threshold is greater than or equal to 80%.

5. The natural language reasoning method of claim 2, wherein training the inter-sentence attention module with the supervision mechanism comprises:

segmenting the sentences in the training set with the Stanford word segmentation tool, and after segmentation, performing word-embedding processing with a bidirectional LSTM network;

for each sentence pair in the processed training set, selecting words from the sentence pair in sequence, extracting from the hypothesis sentence of the pair the set of words aligned with each selected word according to the training-set labels, and using that word set as the supervision signal for the word in the inter-sentence attention module, until all supervision signals have been extracted;

wherein the alignment data set in the training set is annotated with alignment information between the two sentences; the training-set labels comprise the relationship of each sentence pair and the word alignments.

6. The natural language reasoning method of claim 2, wherein performing constituency parsing on the training set containing the alignment information and extracting the parsing results with an analysis algorithm to obtain the supervision signal of the intra-sentence attention module comprises: generating, with constituency parsing, a constituency parse tree with words as nodes from the natural language sentences in the training set;

analyzing the constituency parse tree with the analysis algorithm, obtaining the leaf nodes contained in the subtrees whose roots are words reflecting the sentence-pair relationship, and extracting multi-scale supervision signals for those words.

7. The natural language reasoning method of claim 2, wherein optimizing the inter-sentence attention module during training comprises:

performing word segmentation and word embedding on the original training set or test set to obtain the raw data;

after processing by one bidirectional LSTM layer, inputting the processed raw data into the inter-sentence attention module to obtain the predicted alignment relationships between the words of each sentence;

calculating a mean square error loss between the predicted alignment relationships and the supervision signal;

optimizing the inter-sentence attention module via supervised training based on the mean square error loss.

8. The natural language reasoning method of claim 7, wherein the loss function of the inter-sentence attention module is calculated as follows:

MSE_inter = (1/n) Σ_{j=1..n} (s_inter,j − β_j)²

where n is the number of words in the sentence, j indexes the current word, s_inter,j is the preset expected alignment weight distribution of the current word j in the current sentence over all words in the other sentence of the pair, and β_j is the alignment weight distribution of the current word j over all words in the other sentence, as learned by the inter-sentence attention module.

9. The natural language reasoning method of claim 6, wherein the loss function of the intra-sentence attention module is calculated as follows:

MSE_intra = (1/n) Σ_{i=1..n} (s_intra,i − α_i)²

where n is the number of words in the sentence, s_intra,i is the preset expected alignment weight distribution of the current word i over all words in the current sentence, and α_i is the alignment weight distribution of the current word over all words in the current sentence, as learned by the intra-sentence attention module;

preferably, the objective function of the natural language inference model is calculated as follows:

L = −Σ_{j=1..C} t_j log(y_j)

where L denotes the objective function and C is the number of label categories; for the current j-th category, t_j is the annotated result from the sample set and y_j is the model's prediction for the current sentence pair;

preferably, the loss function for performing joint training on the intra-sentence attention module and the natural language inference model by using the multitask learning method is as follows:

J(θ) = λ1·L + λ2·MSE_intra

where λ1 and λ2 are hyperparameters that control the relative weight of the two loss functions.

10. A natural language reasoning system, comprising:

a candidate sentence pair selection module, used for acquiring and processing natural language sentence pairs that require language reasoning;

a reasoning module, used for predicting the relationship of the processed sentence pairs with a pre-trained algorithm model;

wherein the algorithm model introduces an intra-sentence attention module and an inter-sentence attention module into the natural language reasoning model; the sentence-pair relationships include: contradiction, entailment, and neutral.

Technical Field

The invention relates to the field of artificial intelligence, in particular to processing natural language with artificial intelligence, and specifically to a natural language reasoning method and system.

Background

Natural language processing is a core direction of artificial intelligence, aiming at exploring and implementing theories and methods for efficient communication between humans and computers using natural language. Natural language processing has rich application scenarios, among which natural language reasoning is one of the important research topics: it aims to judge the relationship between a "premise" and a "hypothesis", and is widely applied to question answering, sentiment analysis, dialogue systems, and the like.

Currently, attention mechanisms are an important component of models for natural language reasoning tasks. Attention mechanisms can be divided into intra-sentence attention, which captures the correlations between the words within a sentence, and inter-sentence attention, which captures the alignment information between two sentences. Although attention mechanisms are widely applied in natural language reasoning models, the attention methods adopted by existing models are non-parametric or unsupervised: the attention modules are unconstrained within the model, and no objective guides their training, so the models have poor interpretability.
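As an illustration of the two attention types just described, the following sketch computes unsupervised dot-product attention weights in pure Python; the toy word vectors and the dot-product scoring function are illustrative assumptions, not the patent's exact parameterization.

```python
import math

def softmax(scores):
    # Numerically stable softmax: exponentiate shifted scores, normalize to sum 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    # Dot-product attention: one weight per key vector, summing to 1.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

# Toy word vectors for a premise/hypothesis pair (illustrative values).
premise = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hypothesis = [[1.0, 0.0], [0.5, 0.5]]

# Intra-sentence attention: each premise word attends over its own sentence.
alpha = [attention_weights(w, premise) for w in premise]
# Inter-sentence attention: each premise word attends over the hypothesis.
beta = [attention_weights(w, hypothesis) for w in premise]
```

Because nothing constrains `alpha` or `beta` here, the weights carry no interpretable meaning; the patent's contribution is to supervise exactly these distributions.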

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a natural language reasoning method, which comprises the following steps:

acquiring and processing natural language sentence pairs needing language reasoning;

adopting a pre-trained algorithm model to predict the relationship of the processed sentence pairs;

wherein the algorithm model introduces an intra-sentence attention module and an inter-sentence attention module into the natural language reasoning model; the sentence-pair relationships include: contradiction, entailment, and neutral.

Preferably, the training of the algorithm model comprises:

carrying out data processing on the labeled natural language sentence pairs to obtain a sample set;

dividing the sample set into a training set and a test set;

using transfer learning to migrate an inter-sentence attention module, pre-trained with a supervision mechanism, into the natural language reasoning model, and capturing alignment information between the sentences in the training set;

performing constituency parsing on the training set containing the alignment information, and extracting the parsing results with an analysis algorithm to obtain the supervision signal for the intra-sentence attention module;

performing joint training on the intra-sentence attention module and the natural language reasoning model with a multi-task learning method, and optimizing the inter-sentence attention module during training;

wherein the choice of the analysis algorithm is determined by the constituency parsing results.

Preferably, the training of the algorithm model further includes:

optimizing the algorithm model with the training set.

Preferably, the data processing comprises:

segmenting the sentences with the Stanford word segmentation tool, and after segmentation, performing word-embedding processing with a bidirectional LSTM network;

counting the occurrence frequency of all words in the corpus, and retaining the words in the sentences whose frequency exceeds a threshold as high-frequency words;

retrieving the pre-trained word vector corresponding to each high-frequency word from the vocabulary of the pre-trained word vectors;

wherein the threshold is greater than or equal to 80%.
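A minimal Python sketch of the vocabulary step above. The patent does not define what the 80% threshold is measured against, so this sketch assumes it means keeping the most frequent words that together cover at least 80% of all token occurrences; `pretrained` is a hypothetical word-to-vector dictionary standing in for a real pre-trained embedding table.

```python
from collections import Counter

def build_high_freq_vocab(corpus_tokens, coverage=0.8):
    # Keep the most frequent words until they cover `coverage` of all tokens
    # (one reading of the patent's 80% threshold).
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab, covered = set(), 0
    for word, count in counts.most_common():
        vocab.add(word)
        covered += count
        if covered / total >= coverage:
            break
    return vocab

def lookup_embeddings(vocab, pretrained):
    # Retrieve pre-trained vectors for the high-frequency words only.
    return {w: pretrained[w] for w in vocab if w in pretrained}
```

Words outside the vocabulary would then be handled however the model treats out-of-vocabulary tokens (not specified here).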

Preferably, training the inter-sentence attention module with the supervision mechanism comprises:

segmenting the sentences in the training set with the Stanford word segmentation tool, and after segmentation, performing word-embedding processing with a bidirectional LSTM network;

for each sentence pair in the processed training set, selecting words from the sentence pair in sequence, extracting from the hypothesis sentence of the pair the set of words aligned with each selected word according to the training-set labels, and using that word set as the supervision signal for the word in the inter-sentence attention module, until all supervision signals have been extracted;

wherein the alignment data set in the training set is annotated with alignment information between the two sentences; the training-set labels comprise the relationship of each sentence pair and the word alignments.
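One possible reading of the supervision-signal extraction above, sketched in Python: the alignment annotation is assumed to be a set of (premise index, hypothesis index) pairs, and the expected distribution is assumed uniform over the aligned hypothesis words. Both the data format and the uniform weighting are illustrative assumptions, not details fixed by the patent.

```python
def inter_sentence_supervision(premise, hypothesis, alignments):
    # For each premise word, build its expected alignment distribution
    # over the hypothesis words from the annotated alignment pairs.
    signals = []
    for i in range(len(premise)):
        aligned = [j for (pi, j) in alignments if pi == i]
        dist = [0.0] * len(hypothesis)
        for j in aligned:
            dist[j] = 1.0 / len(aligned)  # uniform weight over aligned words
        signals.append(dist)
    return signals
```

A premise word with no annotated alignment simply receives an all-zero target distribution here; how unaligned words should be supervised is left open by the claim.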

Preferably, performing constituency parsing on the training set containing the alignment information and extracting the parsing results with an analysis algorithm to obtain the supervision signal of the intra-sentence attention module comprises: generating, with constituency parsing, a constituency parse tree with words as nodes from the natural language sentences in the training set;

analyzing the constituency parse tree with the analysis algorithm, obtaining the leaf nodes contained in the subtrees whose roots are words reflecting the sentence-pair relationship, and extracting multi-scale supervision signals for those words.
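A sketch of the multi-scale supervision-signal extraction, assuming a constituency parse tree represented as nested ('label', children) tuples with leaf words as plain strings. The nesting format and the "leaf sets of every constituent containing the word" interpretation of multi-scale signals are assumptions for illustration.

```python
def leaves(node):
    # Collect the leaf words of a subtree in left-to-right order.
    if isinstance(node, str):
        return [node]
    _, children = node
    out = []
    for child in children:
        out.extend(leaves(child))
    return out

def multiscale_supervision(tree, word):
    # Leaf sets of every constituent containing `word`, ordered from the
    # whole sentence down to the word itself (a multi-scale signal).
    def walk(node, path):
        if isinstance(node, str):
            return path + [node] if node == word else None
        _, children = node
        for child in children:
            found = walk(child, path + [node])
            if found is not None:
                return found
        return None
    chain = walk(tree, [])
    return [leaves(n) for n in chain] if chain else []
```

For example, with `tree = ("S", [("NP", ["the", "dog"]), ("VP", ["runs"])])`, the signal for "dog" is the chain of constituents S, NP, and the word itself.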

Preferably, optimizing the inter-sentence attention module during training comprises:

performing word segmentation and word embedding on the original training set or test set to obtain the raw data;

after processing by one bidirectional LSTM layer, inputting the processed raw data into the inter-sentence attention module to obtain the predicted alignment relationships between the words of each sentence;

calculating a mean square error loss between the predicted alignment relationships and the supervision signal;

optimizing the inter-sentence attention module via supervised training based on the mean square error loss;

preferably, the loss function of the inter-sentence attention module is calculated as follows:

MSE_inter = (1/n) Σ_{j=1..n} (s_inter,j − β_j)²

where n is the number of words in the sentence, j indexes the current word, s_inter,j is the preset expected alignment weight distribution of the current word j in the current sentence over all words in the other sentence of the pair, and β_j is the alignment weight distribution of the current word j over all words in the other sentence, as learned by the inter-sentence attention module.
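The mean square error loss described above can be sketched as follows, averaging over words the squared difference between each word's expected and learned alignment distributions; treating the per-word error as a sum over the distribution's entries is an assumption about how the distributions are compared.

```python
def mse_attention_loss(expected, learned):
    # MSE = (1/n) * sum over the n words of the squared distance between
    # the expected distribution s_j and the learned distribution beta_j.
    n = len(expected)
    total = 0.0
    for s, b in zip(expected, learned):
        total += sum((si - bi) ** 2 for si, bi in zip(s, b))
    return total / n
```

The same function serves for both attention modules: for MSE_inter the distributions range over the other sentence's words, for MSE_intra over the current sentence's words.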

Preferably, the loss function of the intra-sentence attention module is calculated as follows:

MSE_intra = (1/n) Σ_{i=1..n} (s_intra,i − α_i)²

where n is the number of words in the sentence, s_intra,i is the preset expected alignment weight distribution of the current word i over all words in the current sentence, and α_i is the alignment weight distribution of the current word over all words in the current sentence, as learned by the intra-sentence attention module;

preferably, the objective function of the natural language inference model is calculated as follows:

L = −Σ_{j=1..C} t_j log(y_j)

where L denotes the objective function and C is the number of label categories; for the current j-th category, t_j is the annotated result from the sample set and y_j is the model's prediction for the current sentence pair;

preferably, the loss function for performing joint training on the intra-sentence attention module and the natural language inference model by using the multitask learning method is as follows:

J(θ) = λ1·L + λ2·MSE_intra

where λ1 and λ2 are hyperparameters that control the relative weight of the two loss functions.
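The joint objective J(θ) = λ1·L + λ2·MSE_intra can be sketched in Python with a standard cross-entropy for L; the λ values and the small eps guard are illustrative choices, not values from the patent.

```python
import math

def cross_entropy(t, y, eps=1e-12):
    # L = -sum_j t_j * log(y_j) over the C label categories
    # (contradiction, entailment, neutral); eps guards against log(0).
    return -sum(tj * math.log(yj + eps) for tj, yj in zip(t, y))

def joint_loss(t, y, mse_intra, lam1=1.0, lam2=0.5):
    # J(theta) = lam1 * L + lam2 * MSE_intra
    return lam1 * cross_entropy(t, y) + lam2 * mse_intra
```

In the multi-task setup, both terms are minimized together, so gradients from the classification objective and from the intra-sentence attention supervision flow through the shared encoder simultaneously.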

Based on the same inventive concept, the invention also provides a natural language reasoning system, which comprises:

an acquisition module, used for acquiring and processing natural language sentence pairs that require language reasoning;

a determining module, used for predicting the relationship of the processed sentence pairs with a pre-trained algorithm model;

wherein the algorithm model introduces an intra-sentence attention module and an inter-sentence attention module into the natural language reasoning model; the sentence-pair relationships include: contradiction, entailment, and neutral.

The invention has the beneficial effects that:

the invention provides a natural language reasoning method and a system, comprising the following steps: acquiring and processing natural language sentence pairs needing language reasoning; adopting a pre-trained algorithm model to predict the relationship of the processed sentence pairs; the algorithm model introduces an intra-sentence attention module and an inter-sentence attention module into the natural language reasoning model; the sentence pair relationship includes: contradiction, implication, irrelevance; the invention enhances the natural language reasoning model performance and improves the interpretability of the result; the algorithm model introduces an intra-sentence attention module in the natural language reasoning model to improve the expression capability of sentences, adopts an inter-sentence attention module to promote the interaction among the sentences, enhances the performance of the natural language reasoning model and improves the interpretability of results.

Drawings

FIG. 1 is a flow chart of a natural language reasoning method provided by the present invention;

FIG. 2 is a schematic structural diagram of a natural language inference method provided by the present invention;

FIG. 3 shows the constituency parsing results in example 1;

FIG. 4 shows the sentence alignment format in example 1;

FIG. 5 is a block diagram of the natural language inference system of the present invention.

Detailed Description

The invention provides a novel natural language reasoning framework. In this framework, the intra-sentence attention module is trained under supervision from the constituency parsing results, and the inter-sentence attention module is trained under supervision from the inter-sentence alignment information. The method adopts multi-task learning and transfer learning to fuse the supervised training of the intra-sentence and inter-sentence attention modules, respectively, into the training of the natural language reasoning model, thereby improving the performance of the natural language reasoning model while solving the poor interpretability of existing models.

For a better understanding of the present invention, reference is made to the following description taken in conjunction with the accompanying drawings and examples.
