Natural language generation method based on a dynamic deliberation network

Document No.: 1271766    Publication date: 2020-08-25

Abstract: A natural language generation method based on a dynamic deliberation network (一种基于动态推敲网络的自然语言生成方法), created by 王春辉 and 胡勇 on 2020-05-13. The method comprises: establishing an Encoder-Attention-Decoder model; calling the model with an original document x as input to generate K initial sentences; randomly selecting one sentence y_c from the K sentences; calling the model again with x and y_c, wherein two Attention modules in the model process x and y_c respectively and their results are fused together to generate K new sentences; and repeating the above steps until a convergence condition is met. By calling the model repeatedly and using its Attention modules to repeatedly deliberate over and polish the generated sentences, the invention can output high-quality sentences.

1. A natural language generation method based on a dynamic deliberation network, characterized by comprising the following steps:

step 1, establishing an Encoder-Attention-Decoder model, wherein the input of the model is an original document, and the output of the model is a document meeting the task requirement;

step 2, calling the model by taking the original document x as input to generate K initial sentences Y_0 = {y_01, y_02, …, y_0K};

step 3, randomly selecting one sentence y_c from the K sentences;

step 4, calling the model with x and y_c, wherein two Attention modules in the model process x and y_c respectively, and the processing results are fused together to generate K new sentences Y_c = {y_c1, y_c2, …, y_cK};

step 5, repeating steps 3 and 4 until the convergence condition is met.

2. The natural language generation method based on a dynamic deliberation network as claimed in claim 1, wherein said convergence condition is:

with the input x unchanged, the conditional probability P(y|x) of the sentence y generated by two consecutive calls of the model is unchanged; or the Levenshtein distance between the sentences generated by two consecutive calls of the model is unchanged.

Technical Field

The invention belongs to the technical field of natural language understanding, and particularly relates to a natural language generation method based on a dynamic deliberation network.

Background

Currently, Natural Language Generation (NLG) is a part of natural language processing: it generates natural language from a machine representation system such as a knowledge base or a logical form. A natural language generation system can be regarded as a translator that converts material into natural language expressions, and can be interpreted as the inverse of natural language understanding. In natural language generation, neither the input sequence nor the output sequence has a fixed length, as in machine translation, automatic summarization, etc. To handle such variable-length inputs and outputs, a Recurrent Neural Network (RNN) is typically used. In a simple multi-layer feedforward neural network, the intermediate state of the network is recalculated for each input and is not affected by the intermediate state computed for the previous sample. An RNN, by contrast, stores this historical information and computes the current state from the current input and the historical state, so an RNN can process inputs of any length. The main idea of the RNN is to cyclically compress the input sequence into a vector of fixed dimension, i.e. the intermediate state of the network, by constantly combining the input at the current time with the historical state.
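For illustration only (not part of the invention), the following is a minimal sketch of how a plain RNN cell compresses a variable-length sequence into a fixed-dimension state vector; the weight names W_xh, W_hh, b_h and all dimensions are assumptions of this sketch:

import numpy as np

def rnn_encode(inputs, W_xh, W_hh, b_h):
    # Compress a variable-length sequence of input vectors into one
    # fixed-dimension state vector by repeatedly combining the current
    # input with the previous hidden state (Elman-style RNN cell).
    h = np.zeros(W_hh.shape[0])          # initial hidden state
    for x_t in inputs:                   # inputs: list of 1-D arrays
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    return h                             # fixed-dimension summary of the sequence

# Usage: a sequence of five 8-dimensional word vectors -> one 16-dimensional state.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(16, 8))
W_hh = rng.normal(size=(16, 16))
b_h = np.zeros(16)
state = rnn_encode([rng.normal(size=8) for _ in range(5)], W_xh, W_hh, b_h)
print(state.shape)                       # (16,)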

From the machine learning perspective, natural language generation can be viewed as a supervised learning process that maps one sequence of arbitrary length to another sequence of arbitrary length. If a typical Encoder-Decoder structure is used, the Encoder must compress the entire input word sequence into a vector of fixed dimension, from which the Decoder then decodes the entire output word sequence. This requires the fixed-dimension vector to contain all the information of the input sentence, which is obviously difficult to achieve; it also becomes a performance bottleneck of the Encoder-Decoder structure and prevents it from handling long sentences well. For this reason, it has been proposed to introduce an Attention mechanism into the Encoder-Decoder framework so that the Decoder pays more attention to particular word segments at the input end, thereby alleviating the problem caused by compressing the input sequence into a fixed-dimension vector. At present, methods that generate natural language based on the Encoder-Attention-Decoder framework usually decode only once and lack a model of the repeated deliberation that occurs in the human writing process, so the generated sentences suffer from insufficient fluency and low quality.
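For illustration only, a minimal sketch of the attention idea described above, assuming simple dot-product scoring (the patent does not fix a particular scoring function):

import numpy as np

def attention(decoder_state, encoder_states):
    # Weight every encoder state by its relevance to the current decoder state,
    # so the decoder is not limited to a single fixed-dimension vector.
    scores = encoder_states @ decoder_state            # one score per input position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                           # softmax over input positions
    return weights @ encoder_states                    # context vector for this decoding step

The decoder would call such a function at every decoding step, so different output words can focus on different parts of the input.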

Disclosure of Invention

In order to solve the above problems in the prior art, the present invention provides a natural language generation method based on a dynamic deliberation network.

In order to achieve the purpose, the invention adopts the following technical scheme:

A natural language generation method based on a dynamic deliberation network comprises the following steps:

step 1, establishing an Encoder-Attention-Decoder model, wherein the input of the model is an original document, and the output of the model is a document meeting the task requirement;

step 2, calling the model by taking the original document x as input to generate K initial sentences Y_0 = {y_01, y_02, …, y_0K};

step 3, randomly selecting one sentence y_c from the K sentences;

step 4, calling the model with x and y_c, wherein two Attention modules in the model process x and y_c respectively, and the processing results are fused together to generate K new sentences Y_c = {y_c1, y_c2, …, y_cK};

step 5, repeating steps 3 and 4 until the convergence condition is met.

Compared with the prior art, the invention has the following beneficial effects:

The invention establishes an Encoder-Attention-Decoder model, calls the model with an original document x as input to generate K initial sentences, randomly selects one sentence y_c from the K sentences, calls the model again with x and y_c, wherein the two Attention modules in the model process x and y_c respectively and the processing results are fused together to generate K new sentences, and repeats these steps until a convergence condition is met. By calling the model repeatedly and using the Attention modules to repeatedly deliberate over the generated sentence, the invention can output high-quality sentences.

Drawings

Fig. 1 is a flowchart of the natural language generation method based on a dynamic deliberation network according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings.

An embodiment of the invention provides a natural language generation method based on a dynamic deliberation network; its flowchart is shown in Fig. 1, and the method comprises the following steps:

S101, establishing an Encoder-Attention-Decoder model, wherein the input of the model is an original document, and the output of the model is a document meeting the task requirement;

S102, calling the model by taking the original document x as input to generate K initial sentences Y_0 = {y_01, y_02, …, y_0K};

S103, randomly selecting one sentence y_c from the K sentences;

S104, calling the model with x and y_c, wherein two Attention modules in the model process x and y_c respectively, and the processing results are fused together to generate K sentences Y_c = {y_c1, y_c2, …, y_cK};

S105, repeating steps S103 and S104 until a convergence condition is met.

In this embodiment, step S101 is mainly used to establish the Encoder-Attention-Decoder model. The model is obtained by introducing an Attention mechanism into the classic Encoder-Decoder structure, wherein both the Encoder and the Decoder adopt a Recurrent Neural Network (RNN), and the output of each RNN is connected to an Attention module. The Decoder of the model of this embodiment needs to be called multiple times to realize repeated deliberation over and polishing of the generated sentence, thereby outputting a high-quality sentence.
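For illustration only, a minimal sketch of such an Encoder-Attention-Decoder model with two Attention modules is given below, assuming PyTorch, GRU cells, bilinear attention scoring and concatenation as the fusion operation; all layer sizes and design details beyond what is stated above are assumptions of this sketch rather than requirements of the invention:

import torch
import torch.nn as nn

class DeliberationModel(nn.Module):
    # Sketch: one attention module attends over the encoded source x, the other
    # over the encoded previously generated sentence y_c; the two context vectors
    # are fused by concatenation and fed to the decoder at every step.
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.enc_x = nn.GRU(emb_dim, hid_dim, batch_first=True)   # encoder RNN for x
        self.enc_y = nn.GRU(emb_dim, hid_dim, batch_first=True)   # encoder RNN for y_c
        self.attn_x = nn.Linear(hid_dim, hid_dim, bias=False)     # Attention module over x
        self.attn_y = nn.Linear(hid_dim, hid_dim, bias=False)     # Attention module over y_c
        self.dec = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)     # decoder RNN
        self.out = nn.Linear(hid_dim, vocab_size)

    def _context(self, dec_h, enc_states, proj):
        # Score every encoder state against the decoder state, normalize with
        # softmax, and return the weighted sum (context vector).
        scores = torch.einsum("bd,btd->bt", proj(dec_h), enc_states)
        weights = torch.softmax(scores, dim=-1)
        return torch.einsum("bt,btd->bd", weights, enc_states)

    def forward(self, x, y_c, y_shifted):
        # y_shifted: gold target tokens shifted right (teacher forcing);
        # logits[t] predicts the t-th target token.
        hx, _ = self.enc_x(self.embed(x))        # (batch, len_x, hid)
        hy, _ = self.enc_y(self.embed(y_c))      # (batch, len_yc, hid)
        dec_h = hx[:, -1]                        # initialize decoder from the source encoding
        logits = []
        for t in range(y_shifted.size(1)):
            cx = self._context(dec_h, hx, self.attn_x)     # context from x
            cy = self._context(dec_h, hy, self.attn_y)     # context from y_c
            step_in = torch.cat([self.embed(y_shifted[:, t]), cx, cy], dim=-1)
            dec_h = self.dec(step_in, dec_h)
            logits.append(self.out(dec_h))
        return torch.stack(logits, dim=1)        # softmax over this gives P(y | x, y_c)

In the first pass (step S102) no y_c is available yet; the same structure can be used with only the attention over x, or with y_c initialized from x itself, a detail the patent does not fix.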

In this embodiment, step S102 is mainly used to generate K initial sentences. The K initial sentences are obtained by calling the model with the original document x as input. For example, the input is the Chinese sentence meaning "Beijing is the capital of China", the task is Chinese-to-English machine translation, the model is called with K = 3, and 3 initial sentences are generated. For example, the 3 initial sentences may be: "Beijing is the capital", "Beijing is capital of China" and "Beijing is the capital of China".

In this embodiment, step S103 is mainly used to randomly select one sentence from the K sentences generated in the previous step. For example, the second sentence, "Beijing is capital of China", is selected; of course, the first sentence, "Beijing is the capital", could equally be selected.

In this embodiment, steps S104 and S105 are mainly used to realize repeated deliberation over the generated sentence so as to produce a high-quality sentence, which is common practice when translating and writing. To this end, a deliberation process is added to the encoder-decoder framework, allowing the decoder to operate in two stages: the decoder in the first stage decodes and generates an original sequence; the decoder in the second stage, through repeated deliberation, polishes and refines the original sentence, and is able to produce a better sentence by observing future words in the first-stage original sentence. Steps S104 and S105 constitute the second stage of the decoder operation. The specific method is as follows: the model is called again with the input x unchanged, the two Attention modules process x and the sentence y_c selected in the previous step respectively (feature extraction, compression, etc.), the processing results are fused (concatenated end to end), and K new sentences are output; then one sentence is randomly selected from the K new sentences. Repeating steps S104 and S105 realizes the deliberation over and polishing of the generated sentence. When the convergence condition is met, the deliberation stops and the polished high-quality sentence is output. For example, based on the original Chinese document meaning "Beijing is the capital of China" and the selected sentence "Beijing is capital of China", the model is called again and 3 new sentences are output. Suppose that, after the model has been called many times, the 3 output sentences are: "Beijing is the capital of China", "Beijing is the capital of China" and "Beijing is the capital of China". Since the 3 sentences output at this point are exactly the same, the convergence condition is satisfied and the model calling stops. The final output is the most accurate translation result, "Beijing is the capital of China".
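A minimal sketch of this iterative procedure follows; the callable generate(x, y_c, K), which would run the model sketched above and return K candidate sentences (e.g. by sampling or beam search), is a hypothetical helper, and the converged test is sketched after the convergence conditions below:

import random

def deliberate(generate, converged, x, K, max_rounds=10):
    # Iterative deliberation loop of steps S102-S105.
    # generate(x, y_c, K) -> list of K candidate sentences (y_c=None means the
    # initial pass); converged(old, new) -> bool.
    candidates = generate(x, None, K)              # S102: K initial sentences
    while max_rounds > 0:
        max_rounds -= 1
        y_c = random.choice(candidates)            # S103: pick one candidate at random
        new_candidates = generate(x, y_c, K)       # S104: re-decode, attending to both x and y_c
        if converged(candidates, new_candidates):  # S105: stop once converged
            return new_candidates
        candidates = new_candidates
    return candidates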

As an alternative embodiment, the convergence condition is:

with the input x unchanged, the conditional probability P(y|x) of the sentence y generated by two consecutive calls of the model is unchanged; or the Levenshtein distance between the sentences generated by two consecutive calls of the model is unchanged.

This embodiment gives two specific convergence conditions. The two conditions are in a logical-or relationship, i.e. the sentence generation process stops as soon as either condition is satisfied. The first condition checks whether the conditional probability P(y|x) of the sentence y generated by two consecutive calls of the model has changed; P(y|x) is obtained by the output layer through the softmax activation function. The second condition checks whether the Levenshtein distance between the sentences generated by two consecutive calls of the model has changed. The Levenshtein distance is also known as the string edit distance: the Levenshtein distance between strings A and B is the minimum number of edit operations required to convert string A into string B, where an edit operation is deleting, inserting, or modifying a single character. In general, the smaller the Levenshtein distance between two strings, the more similar they are; when two strings are equal, their Levenshtein distance is 0.
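For illustration, the second condition can be sketched as follows. This sketch interprets "the Levenshtein distance is unchanged" in line with the worked example above, i.e. two consecutive rounds produce identical sentences (pairwise distance 0); that interpretation, and the pairing of candidates by index, are assumptions of the sketch:

def levenshtein(a, b):
    # String edit distance: minimum number of single-character insertions,
    # deletions and substitutions needed to turn string a into string b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                   # deletion
                            curr[j - 1] + 1,               # insertion
                            prev[j - 1] + (ca != cb)))     # substitution
        prev = curr
    return prev[-1]

def converged(old_candidates, new_candidates):
    # Stop when every candidate is identical to its counterpart from the
    # previous round (edit distance 0 for every pair).
    return all(levenshtein(o, n) == 0
               for o, n in zip(old_candidates, new_candidates))

# Usage: levenshtein("Beijing is capital of China", "Beijing is the capital of China") == 4.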

The above description covers only a few embodiments of the present invention and should not be taken as limiting its scope; all equivalent changes and modifications made in accordance with the spirit of the present invention shall be considered to fall within the scope of the present invention.
