Text abstract generation method and device, electronic equipment and storage medium

Document No.: 1831798    Publication date: 2021-11-12

Note: this technique, 文本摘要生成方法、装置、电子设备及存储介质 (Text abstract generation method and device, electronic equipment and storage medium), was created by 念天磊 on 2021-08-20. Its main content is as follows: the disclosure provides a text summary generation method and apparatus, an electronic device, and a storage medium, relating to computer technology and in particular to the technical field of natural language processing. The specific implementation scheme is: input a target text into a text summary generation model; for the candidate word set output by the model in each round of word prediction, select from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and select the current round's predicted word from that set; in response to the selected predicted word being an end identifier, concatenate the predicted words of the rounds in sequence to obtain the summary text of the target text. The number of words in the target word set adjusts dynamically with the probability distribution of the candidate words, and selecting predicted words from this set improves the diversity of the summary text.

1. A text summary generation method, comprising the following steps:

inputting a target text into a text summary generation model;

for the candidate word set output by the text summary generation model in each round of word prediction, selecting from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and selecting a predicted word of the current round from the target word set; and

in response to the selected predicted word being an end identifier, concatenating the predicted words of the rounds in sequence to obtain a summary text of the target text.

2. The method of claim 1, wherein

the candidate word set output by the text summary generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.

3. The method of claim 1, wherein the text summary generation model is pre-trained by:

obtaining an initial model;

acquiring a plurality of sample text sets, each sample text set comprising a sample target text and a plurality of sample summary texts of different text styles corresponding to the sample target text;

taking, in turn, each sample target text and a sample summary text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model; and

determining whether a termination condition of the model training is met and, if so, determining the current model as the text summary generation model.

4. A text summary generation model training method, comprising the following steps:

obtaining an initial model;

acquiring a plurality of sample text sets, each sample text set comprising a sample target text and a plurality of sample summary texts of different text styles corresponding to the sample target text;

taking, in turn, each sample target text and a sample summary text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model; and

determining whether a termination condition of the model training is met and, if so, determining the current model as the text summary generation model.

5. The method of claim 4, further comprising obtaining the plurality of sample summary texts of different text styles corresponding to the sample target text by:

acquiring the sample target text and an initial summary text corresponding to the sample target text;

rewriting the initial summary text into different text styles to obtain a plurality of rewritten summary texts; and

determining the initial summary text and the rewritten summary texts as the sample summary texts of different text styles corresponding to the sample target text.

6. A text summary generation apparatus, comprising:

an input module, configured to input a target text into a text summary generation model;

a prediction module, configured to, for the candidate word set output by the text summary generation model in each round of word prediction, select from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and select a predicted word of the current round from the target word set; and

a concatenation module, configured to, in response to the selected predicted word being an end identifier, concatenate the predicted words of the rounds in sequence to obtain a summary text of the target text.

7. The apparatus of claim 6, wherein

the candidate word set output by the text summary generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.

8. The apparatus of claim 6, further comprising a training module,

wherein the training module is configured to pre-train the text summary generation model by:

obtaining an initial model;

acquiring a plurality of sample text sets, each sample text set comprising a sample target text and a plurality of sample summary texts of different text styles corresponding to the sample target text;

taking, in turn, each sample target text and a sample summary text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model; and

determining whether a termination condition of the model training is met and, if so, determining the current model as the text summary generation model.

9. A text summary generation model training apparatus, comprising:

a first acquisition module, configured to obtain an initial model;

a second acquisition module, configured to acquire a plurality of sample text sets, each sample text set comprising a sample target text and a plurality of sample summary texts of different text styles corresponding to the sample target text;

an iterative training module, configured to take, in turn, each sample target text and a sample summary text corresponding to the sample target text as training samples of the initial model, and to iteratively train the initial model; and

a judging module, configured to determine whether a termination condition of the model training is met and, if so, to determine the current model as the text summary generation model.

10. The apparatus of claim 9, further comprising a rewrite module, configured to obtain the plurality of sample summary texts of different text styles corresponding to the sample target text by:

acquiring the sample target text and an initial summary text corresponding to the sample target text;

rewriting the initial summary text into different text styles to obtain a plurality of rewritten summary texts; and

determining the initial summary text and the rewritten summary texts as the sample summary texts of different text styles corresponding to the sample target text.

11. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.

12. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-5.

13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.

Technical Field

The present disclosure relates to the field of computer technology, and more particularly, to the field of natural language processing.

Background

Text summary generation is an important research area in natural language processing: it converts a longer text into a shorter text containing the key information, and plays an important role in fields such as intelligent question-answering robots, news summaries, and landing-page summaries.

Disclosure of Invention

The disclosure provides a text summary generation method, apparatus, electronic device, and storage medium.

According to an aspect of the present disclosure, there is provided a text summary generation method, including:

inputting a target text into a text summary generation model;

for the candidate word set output by the text summary generation model in each round of word prediction, selecting from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and selecting a predicted word of the current round from the target word set; and

in response to the selected predicted word being an end identifier, concatenating the predicted words of the rounds in sequence to obtain a summary text of the target text.

According to another aspect of the present disclosure, there is provided a text summary generation model training method, including:

obtaining an initial model;

acquiring a plurality of sample text sets, each sample text set comprising a sample target text and a plurality of sample summary texts of different text styles corresponding to the sample target text;

taking, in turn, each sample target text and a sample summary text corresponding to the sample target text as training samples of the initial model, and iteratively training the initial model; and

determining whether a termination condition of the model training is met and, if so, determining the current model as the text summary generation model.

According to another aspect of the present disclosure, there is provided a text summary generation apparatus, including:

an input module, configured to input a target text into a text summary generation model;

a prediction module, configured to, for the candidate word set output by the text summary generation model in each round of word prediction, select from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and select a predicted word of the current round from the target word set; and

a concatenation module, configured to, in response to the selected predicted word being an end identifier, concatenate the predicted words of the rounds in sequence to obtain a summary text of the target text.

According to another aspect of the present disclosure, there is provided a text summary generation model training apparatus, including:

a first acquisition module, configured to obtain an initial model;

a second acquisition module, configured to acquire a plurality of sample text sets, each sample text set comprising a sample target text and a plurality of sample summary texts of different text styles corresponding to the sample target text;

an iterative training module, configured to take, in turn, each sample target text and a sample summary text corresponding to the sample target text as training samples of the initial model, and to iteratively train the initial model; and

a judging module, configured to determine whether a termination condition of the model training is met and, if so, to determine the current model as the text summary generation model.

According to still another aspect of the present disclosure, there is provided an electronic device including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a text summary generation method and/or a text summary generation model training method.

According to yet another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the text summary generation method and/or the text summary generation model training method.

According to yet another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements a text summary generation method and/or a text summary generation model training method.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. In the drawings:

FIG. 1 is a schematic flowchart of a text summary generation method according to an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of a text summary generation model training method according to an embodiment of the present disclosure;

FIG. 3 is a block diagram of an apparatus for implementing the text summary generation method of an embodiment of the present disclosure;

FIG. 4 is a block diagram of an apparatus for implementing the text summary generation model training method of an embodiment of the present disclosure;

FIG. 5 is a block diagram of an electronic device for implementing the text summary generation method according to an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Text summary generation is an important research area in natural language processing: it converts a longer text into a shorter text containing the key information, and plays an important role in fields such as intelligent question-answering robots, news summaries, and landing-page summaries.

In the related art, a long text is input into a text summary generation model, the model computes a prediction result in each round of word prediction, a decoding module then selects a high-probability word from the prediction result as the predicted word, and the predicted words of the rounds are concatenated to obtain the text summary.

Here, decoding can be understood as selecting one word from the several predicted words as the output word of the current round. The decoding algorithm used by the decoding module directly affects the finally output text summary; commonly used decoding algorithms include greedy search, beam search, and top-k sampling.

Greedy search selects the word with the highest conditional probability at each step as the current output. Beam search keeps the several highest-probability paths during decoding and selects the single highest-probability path as the current output. Top-k sampling takes the k words with the highest conditional probability, renormalizes them into a new probability distribution, and samples from that distribution for the current output.

Among these decoding techniques, the text summaries generated by the first two algorithms are not diverse enough. In top-k sampling the value of k is fixed, so a k suitable for every scenario is hard to find: if k is too large, low-probability words are easily introduced and the text becomes unsmooth; if k is too small, usable words are lost. Moreover, the probability distribution at the decoding end varies with the input text, making a universal k value difficult to determine.
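
For concreteness, the following is a minimal sketch, not taken from the disclosure, contrasting the two fixed strategies criticized above given one round's probability vector; the function names and example probabilities are illustrative assumptions.

```python
import numpy as np

def greedy_pick(probs):
    # Always take the single highest-probability word: fluent,
    # but deterministic, so repeated runs yield the same summary.
    return int(np.argmax(probs))

def topk_pick(probs, k, rng=None):
    # Keep only the k most probable words, renormalize, and sample.
    # If k is too large, low-probability words slip in and the text
    # reads unsmoothly; if k is too small, usable words are lost,
    # and the right k shifts with every input's distribution.
    rng = rng or np.random.default_rng()
    top = np.argsort(probs)[::-1][:k]
    renorm = probs[top] / probs[top].sum()
    return int(rng.choice(top, p=renorm))

probs = np.array([0.5, 0.3, 0.12, 0.03, 0.012, 0.038])
print(greedy_pick(probs))      # always index 0
print(topk_pick(probs, k=3))   # one of the 3 highest-probability indices
```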

A typical application scenario in the text summary generation field is summary generation for landing pages. A landing page can take various forms, such as an advertisement landing page or an enterprise promotion landing page.

The text in a landing page is usually summarized, and the resulting summary text serves as the title of the landing page's click link to attract users to click.

However, in the related art, once model training is completed, few candidate summary texts are generated for a landing page, and their phrasing style is monotonous, which is unfavorable for optimizing diverse landing pages.

To improve the diversity of summary texts, the present disclosure provides a text summary generation method and apparatus, an electronic device, and a storage medium.

In an embodiment of the present disclosure, a text summary generation method is provided, the method including:

inputting a target text into a text summary generation model;

for the candidate word set output by the text summary generation model in each round of word prediction, selecting from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and selecting a predicted word of the current round from the target word set; and

when the selected predicted word is the end identifier, concatenating the predicted words of the rounds in sequence to obtain the summary text of the target text.

As can be seen, a threshold on the cumulative probability is preset. In each round of word prediction, a target word set whose cumulative probability exceeds this threshold is selected from the candidate word set in descending order of prediction probability, so the number of words in the target word set adjusts dynamically with the probability distribution of the candidate words; selecting one predicted word per round from the several words in the target word set ensures the diversity of the summary text.

Moreover, the target word set is a small core subset of the higher-probability words. Because predicted words are drawn only from this subset, very low-probability words are never selected, so the summary text does not become unsmooth; and because words are taken in descending order of prediction probability, usable higher-probability words are not lost.

The text summary generation method, apparatus, electronic device, and storage medium provided by the embodiments of the disclosure are described in detail below.

Referring to FIG. 1, which is a schematic flowchart of the text summary generation method provided by an embodiment of the present disclosure, the method may include the following steps:

S101: Input the target text into a text summary generation model.

In the embodiment of the present disclosure, the target text is the text to be summarized, for example, the text in a landing page.

The text summary generation model may be an autoregressive language model, which can predict the next likely word based on the preceding text. The model includes a prediction module and a decoding module: the prediction module outputs a candidate word set in each round of word prediction, and the decoding module selects one word from the candidate word set, based on a decoding algorithm, as the predicted word of the current round.

S102: For the candidate word set output by the text summary generation model in each round of word prediction, select from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and select the predicted word of the current round from the target word set.

Here a word may be a whole word or a single character.

In the embodiment of the disclosure, for the candidate word set output by the text summary generation model in each round of word prediction, a target word set whose cumulative probability exceeds the preset threshold is selected from the candidate word set in descending order of each candidate word's prediction probability.

As an example, suppose the preset threshold is 0.9. In a certain round of word prediction, the candidate word set is {A, B, C, D, E, …} with prediction probabilities {0.5, 0.3, 0.12, 0.03, 0.012, …}. Since 0.5 + 0.3 < 0.9 and 0.5 + 0.3 + 0.12 > 0.9, the target word set whose cumulative probability exceeds 0.9, taken in descending order of prediction probability, is {A, B, C}; in this round the target word set contains 3 words.

As another example, again with a preset threshold of 0.9, the candidate word set in a certain round is {A, B, C, D, E, …} with prediction probabilities {0.4, 0.3, 0.12, 0.09, 0.05, …}. Since 0.4 + 0.3 + 0.12 < 0.9 and 0.4 + 0.3 + 0.12 + 0.09 > 0.9, the target word set whose cumulative probability exceeds 0.9 is {A, B, C, D}; in this round the target word set contains 4 words.
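
The selection of the target word set can be sketched as follows. This is an illustrative implementation of the rule just described; names such as select_target_set and threshold are assumptions, not from the disclosure. It reproduces the two examples above.

```python
def select_target_set(candidates, probs, threshold=0.9):
    """Smallest prefix, in descending order of prediction probability,
    whose cumulative probability exceeds the preset threshold."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    target, cumulative = [], 0.0
    for i in order:
        target.append(candidates[i])
        cumulative += probs[i]
        if cumulative > threshold:  # stop once the threshold is crossed
            break
    return target

# The set size adapts to the distribution, as in the two examples:
print(select_target_set(list("ABCDE"), [0.5, 0.3, 0.12, 0.03, 0.012]))
# -> ['A', 'B', 'C']       (0.5 + 0.3 + 0.12 = 0.92 > 0.9)
print(select_target_set(list("ABCDE"), [0.4, 0.3, 0.12, 0.09, 0.05]))
# -> ['A', 'B', 'C', 'D']  (0.4 + 0.3 + 0.12 + 0.09 = 0.91 > 0.9)
```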

Thus the number of words in the target word set adjusts dynamically with the probability distribution of the candidate words, and selecting one predicted word per round from the several words in the target word set ensures the diversity of the summary text.

When the predicted word of the current round is selected from the target word set, it may be chosen at random or sampled according to the prediction probabilities.

S103: In response to the selected predicted word being the end identifier, concatenate the predicted words of the rounds in sequence to obtain the summary text of the target text.

In the embodiment of the disclosure, the candidate word set output by the text summary generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.

That is, in each round of word prediction, the words already predicted serve as the preceding text, and the next word is predicted in combination with the target text.

When the predicted word selected in a certain round is the end identifier, the summary text has ended, and the concatenation of the predicted words of all rounds is the summary text of the target text.
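
Putting S101-S103 together, the overall loop might look like the following sketch, which reuses select_target_set from above. The model interface predict_candidates, the end-identifier token, and the max_rounds safety cap are assumptions for illustration; the disclosure does not fix an API.

```python
import random

END_IDENTIFIER = "<eos>"  # assumed end identifier token

def generate_summary(model, target_text, threshold=0.9, max_rounds=64):
    predicted = []
    for _ in range(max_rounds):  # safety cap; generation ends at <eos>
        # Each round's candidate set is conditioned on the target text
        # plus the predicted words of previous rounds.
        candidates, probs = model.predict_candidates(target_text, predicted)
        target_set = select_target_set(candidates, probs, threshold)
        word = random.choice(target_set)  # or sample by prediction probability
        if word == END_IDENTIFIER:
            break  # S103: the summary text has ended
        predicted.append(word)
    # Concatenate the rounds' predicted words to obtain the summary text
    # (no separator, word-piece style; use " ".join for spaced languages).
    return "".join(predicted)
```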

As can be seen, a threshold on the cumulative probability is preset. In each round of word prediction, a target word set whose cumulative probability exceeds this threshold is selected from the candidate word set in descending order of prediction probability, so the number of words in the target word set adjusts dynamically with the probability distribution of the candidate words; selecting one predicted word per round from the several words in the target word set ensures the diversity of the summary text.

Moreover, the target word set is a small core subset of the higher-probability words. Because predicted words are drawn only from this subset, very low-probability words are never selected, so the summary text does not become unsmooth; and because words are taken in descending order of prediction probability, usable higher-probability words are not lost.

The text summary generation method provided by the disclosure overcomes the poor diversity of conventional decoding strategies, requires no model retraining, can be used with existing models, and is applicable to all summary-text generation tasks.

When the method provided by the embodiment of the disclosure is applied to summary generation for landing pages, the target text is the text in a landing page, and the summary text generated from it has better diversity and can serve as the title of the landing page's click link, better attracting users to click.

In one embodiment of the present disclosure, to further improve the diversity of the generated summaries, the text summary generation model may be trained on multiple sample text sets, each containing a sample target text and several sample summary texts of different text styles.

Specifically, referring to FIG. 2, which is a schematic flowchart of the text summary generation model training method provided by an embodiment of the present disclosure, the model may be trained with the following steps:

s201: an initial model is obtained.

In the embodiment of the present disclosure, an RNN (Recurrent Neural Networks) model, an Encoder-Decoder (encoding-decoding) model, and the like may be selected as the initial model.

S202: acquiring a plurality of groups of sample text sets; each sample text set comprises a sample target text and a plurality of sample abstract texts with different text styles corresponding to the sample target text.

In the embodiment of the present disclosure, a plurality of sample abstract texts with different text styles corresponding to the sample target text may be obtained as follows: and acquiring a sample target text and an initial abstract text corresponding to the sample target text. And carrying out text style rewriting on the initial abstract text to obtain a plurality of rewritten abstract texts. The initial abstract text and the rewritten abstract text are determined as sample abstract texts of a plurality of different text styles corresponding to the sample target text.

Wherein the sample target text may be text in a landing page, and the initial summary text may be a title of the landing page.

The text style may also be understood as a language organization style, or a conversational style. Such as a terse style, a rich style, a spoken style, etc.

In order to obtain abstract texts with different text styles, the abstract texts are used as training samples of the model, and the text style rewriting can be performed on the initial abstract text based on a relevant algorithm or model for text rewriting; and a manual rewriting mode can be adopted, and the text semantics are removed in the rewriting process.

As an example, the sample target text is an introduction to the commodity "XX" in the landing page, and the initial summary text is a link title of the landing page, specifically, "professional production category XX". In order to enrich the language and technology organization style of the linked title of the landing page, the text style rewriting can be carried out on the initial abstract text to obtain abstract texts with different language and technology organization styles, such as 'professional XX solution', 'professional production XX, direct sale of manufacturers, welcome incoming call consultation' or 'high-quality XX, wherein the price is more substantial'. As can be seen, the rewritten abstract text contains a variety of text styles.
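
As an assumed, illustrative data layout (the disclosure does not prescribe one), a single sample text set from this example could be represented as follows, yielding one training pair per text style:

```python
# Hypothetical structure for one sample text set (S202); the strings
# paraphrase the "XX" landing-page example above.
sample_text_set = {
    "sample_target_text": "Introduction to the product XX on the landing page ...",
    "sample_summary_texts": [
        "Professional producer of XX",           # initial summary text (link title)
        "Professional XX solutions",             # rewrite: concise style
        "Professionally produced XX, factory-direct, call us",  # rich style
        "Quality XX at a more affordable price",                # colloquial style
    ],
}

# One sample target text corresponds to several sample summary texts,
# so it appears once per style among the training samples (S203).
training_samples = [
    (sample_text_set["sample_target_text"], summary)
    for summary in sample_text_set["sample_summary_texts"]
]
```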

In the embodiment of the present disclosure, both the initial summary text and the rewritten summary texts are used as sample summary texts of the sample target text; that is, one sample target text corresponds to several sample summary texts of different text styles.

S203: Take, in turn, each sample target text and a sample summary text corresponding to it as training samples of the initial model, and iteratively train the initial model.

In the embodiment of the disclosure, each training iteration takes a sample target text and one of its corresponding sample summary texts as input to train the initial model.

Specifically, in each training iteration the sample target text is input into the initial model, and the model parameters are adjusted based on the difference between the output summary text and the sample summary text.

For the process of training the model on a single sample target text and a single sample summary text, reference may be made to the related art.

S204: Determine whether a termination condition of the model training is met; if so, determine the current model as the text summary generation model.

The termination condition may be that the number of iterations reaches a preset count, or that the model's loss function value falls below a preset threshold.
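
A minimal training-loop sketch for S201-S204 follows; the train_step interface returning a per-sample loss is an assumption, since the disclosure specifies only the data flow and the two termination conditions.

```python
def train_summary_model(model, training_samples,
                        max_iterations=10_000, loss_threshold=0.01):
    """Iterate over (sample target text, sample summary text) pairs
    until a termination condition of the model training is met."""
    iteration = 0
    while True:
        for target_text, sample_summary in training_samples:
            # S203: adjust model parameters based on the difference
            # between the output summary and the sample summary.
            loss = model.train_step(target_text, sample_summary)
            iteration += 1
            # S204: terminate when the preset iteration count is
            # reached or the loss falls below the preset threshold.
            if iteration >= max_iterations or loss < loss_threshold:
                return model  # current model = text summary generation model
```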

Because one sample target text corresponds to several sample summary texts of different text styles, after many training iterations the trained text summary generation model has learned different text styles, which further improves the diversity of the output summaries.

That is, after training is completed, when the target text is input into the text summary generation model, the prediction module in the model can output, in each round of word prediction, several candidate words with a relatively flat probability distribution.

It is also worth noting that the present disclosure optimizes the decoding algorithm on the one hand and the model's training data on the other; the two optimizations combine organically and complement each other.

Specifically, the initial summary text is style-rewritten to obtain several sample summary texts of different text styles for training the model. The trained model has learned different text styles, so in each round of word prediction the prediction module can output several relatively evenly distributed candidate words, which may belong to different text styles.

Then, in each round of decoding, a target word set whose cumulative probability exceeds the preset threshold is determined; this target word set is a small core subset of higher-probability words of different text styles, from which the predicted word is selected. Consequently, if the same target text is input into the summary generation model several times, summary texts of different text styles can be generated.

As an example, if the model is trained without the optimized training data, i.e., on sample data of a single text style, the model learns only that style and its output is relatively uniform: the candidate words output by the prediction module tend toward a peaked distribution, for example with probabilities {0.7, 0.1, 0.08, …}. In that case essentially only the word with probability 0.7 is selected; the generated summary text is fluent but hardly varies.

In the present disclosure, by contrast, the text summary generation model is trained with sample summary texts of several text styles and thus learns different text styles, so the candidate words output by the prediction module are rich, i.e., relatively evenly distributed, for example with probabilities {0.35, 0.3, 0.28, …}. The predicted word is then selected with the decoding algorithm provided by the disclosure: with a preset probability threshold of 0.9, since 0.35 + 0.3 < 0.9 and 0.35 + 0.3 + 0.28 > 0.9, the first three candidate words form the target word set, and one word is selected from it as the predicted word. This ensures the diversity of the generated text, and the generated summary text remains fluent whichever predicted word is selected.
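
Reusing the select_target_set sketch from earlier makes the contrast concrete; the peaked probabilities below are assumed for illustration, while the flat ones are the figures from this paragraph.

```python
# Sharply peaked distribution (single-style training, illustrative numbers):
# 0.91 alone exceeds the 0.9 threshold, so only one word is selectable.
print(select_target_set(list("ABCDE"), [0.91, 0.04, 0.03, 0.01, 0.01]))
# -> ['A']

# Flat distribution (multi-style training, figures from the text):
# 0.35 + 0.30 < 0.9 and 0.35 + 0.30 + 0.28 > 0.9, so three words qualify.
print(select_target_set(list("ABCDE"), [0.35, 0.30, 0.28, 0.04, 0.03]))
# -> ['A', 'B', 'C']
```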

Thus, in the present disclosure, combining the two optimizations yields an effect greater than the sum of its parts, and the diversity of the generated summary texts is markedly improved.

According to experimental measurements on the same target text: with the beam search decoding algorithm, 1 candidate summary is generated at 95% fluency; with the top-k sampling decoding algorithm, 6 candidate summaries are generated at 85% fluency; with the method provided by the disclosure, 8 candidate summaries are generated at 93% fluency. The text summary generation method provided by the disclosure therefore improves the diversity of summary texts while maintaining high fluency.

Applied to the field of advertisement landing pages, the text summary generation method provided by the disclosure can generate advertisement landing-page link titles in different phrasing styles, increasing their appeal to users and improving the landing pages' conversion rate.

Referring to FIG. 3, which is a block diagram of an apparatus implementing the text summary generation method according to an embodiment of the present disclosure, the apparatus may include:

an input module 301, configured to input a target text into a text summary generation model;

a prediction module 302, configured to, for the candidate word set output by the text summary generation model in each round of word prediction, select from the candidate word set, in descending order of each candidate word's prediction probability, a target word set whose cumulative probability exceeds a preset threshold, and select the predicted word of the current round from the target word set; and

a concatenation module 303, configured to, in response to the selected predicted word being the end identifier, concatenate the predicted words of the rounds in sequence to obtain the summary text of the target text.

In one embodiment of the present disclosure, the candidate word set output by the text summary generation model in each round of word prediction is determined based on the target text and the predicted words selected in previous rounds.

In one embodiment of the present disclosure, on the basis of the apparatus shown in FIG. 3, the apparatus further includes a training module, configured to pre-train the text summary generation model by:

obtaining an initial model;

acquiring multiple sample text sets, each comprising a sample target text and several sample summary texts of different text styles corresponding to the sample target text;

taking, in turn, each sample target text and a sample summary text corresponding to it as training samples of the initial model, and iteratively training the initial model; and

determining whether a termination condition of the model training is met and, if so, determining the current model as the text summary generation model.

Referring to FIG. 4, which is a block diagram of an apparatus implementing the text summary generation model training method according to an embodiment of the present disclosure, the apparatus may include:

a first acquisition module 401, configured to obtain an initial model;

a second acquisition module 402, configured to acquire multiple sample text sets, each comprising a sample target text and several sample summary texts of different text styles corresponding to the sample target text;

an iterative training module 403, configured to take, in turn, each sample target text and a sample summary text corresponding to it as training samples of the initial model, and to iteratively train the initial model; and

a judging module 404, configured to determine whether a termination condition of the model training is met and, if so, to determine the current model as the text summary generation model.

In one embodiment of the present disclosure, the apparatus further includes a rewrite module, configured to obtain the several sample summary texts of different text styles corresponding to the sample target text as follows:

acquiring the sample target text and an initial summary text corresponding to the sample target text;

rewriting the initial summary text into different text styles to obtain several rewritten summary texts; and

determining the initial summary text and the rewritten summary texts as the sample summary texts of different text styles corresponding to the sample target text.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

The present disclosure provides an electronic device, including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a text summary generation method and/or a text summary generation model training method.

The present disclosure provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the text summary generation method and/or the text summary generation model training method.

The present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements a text summary generation method and/or a text summary generation model training method.

FIG. 5 illustrates a schematic block diagram of an example electronic device 500 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in FIG. 5, the device 500 includes a computing unit 501, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the device 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

A number of components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, or the like; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508, such as a magnetic disk, optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

The computing unit 501 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 501 performs the methods and processes described above, such as the text summary generation method. For example, in some embodiments the text summary generation method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the text summary generation method described above can be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the text summary generation method in any other suitable manner (e.g., by means of firmware).

Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
