Reading understanding model construction method and device, electronic equipment and storage medium

Document No.: 661717; Publication date: 2021-04-27

Note: This technology, "Reading understanding model construction method and device, electronic equipment and storage medium" (一种阅读理解模型构建方法、装置、电子设备及存储介质), was designed and created by Lv Xiangnan (吕向楠) on 2021-03-26. Its main content is as follows: The invention provides a reading understanding model construction method, a reading understanding model construction device, electronic equipment and a storage medium, wherein the method comprises the following steps: obtaining a training set of a first field scene according to the first field scene data set; performing secondary training on a general reading understanding model according to the training set of the first field scene to obtain a special reading understanding model of the first field scene; the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set. According to the invention, the training time and the labeling cost of training data are reduced by performing enhanced training on the basis of the general reading understanding model, and the accuracy for a single field is improved by independently establishing a special reading understanding model for each field scene.

1. A reading understanding model building method, characterized by comprising the following steps:

obtaining a training set of a first field scene according to a first field scene data set;

performing secondary training on a general reading understanding model according to the training set of the first field scene to obtain a special reading understanding model of the first field scene;

the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

2. The reading understanding model building method of claim 1, wherein the obtaining a training set of the first domain scene from the first domain scene data set comprises:

if the domain scene category of the first domain scene data set is determined to be the same as the domain scene category of the second domain scene data set, merging the first domain scene data set and the second domain scene data set to obtain a training set of the first domain scene; wherein the second domain scene data set is existing domain scene data.

3. The reading understanding model building method of claim 1, wherein the obtaining a training set of the first domain scene from the first domain scene data set comprises:

if it is determined that no existing domain scene data set has the same domain scene category as the first domain scene data set, taking the first domain scene data set as the training set of the first domain scene.

4. The reading understanding model building method according to claim 3, wherein, after determining that there is no existing domain scene data set of the same domain scene category as the first domain scene data set and taking the first domain scene data set as the training set of the first domain scene, the method further comprises:

adding the first domain scene data set to the existing domain scene data.

5. The reading understanding model building method of claim 1, wherein the obtaining step of the universal reading understanding model comprises:

acquiring a universal field scene data set;

splitting the general field scene data set into a general training set, a general verification set and a general test set;

and training, verifying and testing an initial deep learning model according to the universal training set, the universal verification set and the universal test set to obtain the universal reading understanding model.

6. The reading understanding model constructing method of claim 5,

the universal field scene data set is a data set of a multi-field scene reading understanding sample after data annotation;

the multi-field scene reading understanding sample at least comprises a question, a document, an answer and a document position corresponding to the question and the answer.

7. The reading understanding model building method according to claim 5, wherein the training, verifying and testing the initial deep learning model according to the universal training set, the universal verification set and the universal test set to obtain the universal reading understanding model comprises:

training an initial deep learning model according to the general training set, and optimizing parameters of the model;

verifying the deep learning model after training optimization according to the general verification set to optimize the hyper-parameters of the model;

testing and evaluating the deep learning model after the verification and optimization according to the general test set to obtain the generalization error of the deep learning model after the verification and optimization;

and if the generalization error is determined to be smaller than a preset threshold value, taking the deep learning model after verification optimization as the general reading understanding model.

8. A reading understanding model building apparatus, comprising:

the training set generation module is used for obtaining a training set of the first field scene according to the first field scene data set;

the special reading understanding model generating module is used for carrying out secondary training on the general reading understanding model according to the training set of the first field scene to obtain the special reading understanding model of the first field scene;

the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the reading understanding model building method according to any one of claims 1 to 7 when executing the program.

10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the reading understanding model building method according to any one of claims 1 to 7.

Technical Field

The invention relates to the technical field of computer application, in particular to a reading understanding model construction method and device, electronic equipment and a storage medium.

Background

Machine reading comprehension refers to using a model to predict the answer to a question, given the question and a document.

An existing reading understanding model simply divides the training data into a training set, a verification set and a test set and trains and optimizes a single model. To meet the reading comprehension requirements of different field scenes, the sample data is expanded only by directly adding to the training set, the verification set and the test set so that the model acquires question-answering capability for the new field scenes; as a result, the model's answer accuracy for any single field scene fluctuates as the model is continuously built and used. Specially training a reading understanding model for a single field, on the other hand, takes a long time, and the cost of labeling data is high.

Disclosure of Invention

The invention provides a reading understanding model construction method, which is intended to overcome the defects of the prior art that a reading understanding model has low answer accuracy for a single field scene and that specially training a reading understanding model for a single field is costly.

In a first aspect, the present invention provides a reading understanding model construction method, including:

obtaining a training set of a first field scene according to the first field scene data set;

performing secondary training on a general reading understanding model according to the training set of the first field scene to obtain a special reading understanding model of the first field scene;

the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

According to the reading understanding model construction method provided by the invention, the obtaining of the training set of the first field scene according to the first field scene data set comprises the following steps:

if the domain scene category of the first domain scene data set is determined to be the same as the domain scene category of the second domain scene data set, merging the first domain scene data set and the second domain scene data set to obtain a training set of the first domain scene; wherein the second domain scene data set is existing domain scene data.

According to the reading understanding model construction method provided by the invention, the obtaining of the training set of the first field scene according to the first field scene data set comprises the following steps:

if it is determined that no existing domain scene data set has the same domain scene category as the first domain scene data set, taking the first domain scene data set as the training set of the first domain scene.

According to the reading understanding model construction method provided by the invention, after determining that there is no existing field scene data set with the same field scene category as the first field scene data, and taking the first field scene data set as a training set of the first field scene, the method further comprises:

adding the first domain scene data set to the existing domain scene data.

According to the reading understanding model construction method provided by the invention, the step of acquiring the general reading understanding model comprises the following steps:

acquiring a universal field scene data set;

splitting the general field scene data set into a general training set, a general verification set and a general test set;

and training, verifying and testing the initial deep learning model according to the universal training set, the universal verification set and the universal test set to obtain the universal reading understanding model.

According to the reading understanding model construction method provided by the invention, the general field scene data set is a data set of a multi-field scene reading understanding sample after data labeling;

the multi-field scene reading understanding sample at least comprises a question, a document, an answer and a document position corresponding to the question and the answer.

According to the reading understanding model construction method provided by the invention, the training, verifying and testing of the initial deep learning model according to the universal training set, the universal verification set and the universal test set to obtain the universal reading understanding model comprises the following steps:

training an initial deep learning model according to the general training set, and optimizing parameters of the model;

verifying the deep learning model after training optimization according to the general verification set to optimize the hyper-parameters of the model;

testing and evaluating the deep learning model after the verification and optimization according to the general test set to obtain the generalization error of the deep learning model after the verification and optimization;

and if the generalization error is determined to be smaller than a preset threshold value, taking the deep learning model after verification optimization as the general reading understanding model.

In a second aspect, the present invention further provides a reading understanding model building apparatus, including:

the training set generation module is used for obtaining a training set of the first field scene according to the first field scene data set;

the special reading understanding model generating module is used for carrying out secondary training on the general reading understanding model according to the training set of the first field scene to obtain the special reading understanding model of the first field scene;

the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

In a third aspect, the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the reading understanding model building method according to the first aspect are implemented.

In a fourth aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the reading understanding model building method according to the first aspect.

According to the reading understanding model construction method and device, the electronic equipment and the storage medium, a training set of a first field scene is obtained according to a first field scene data set and used as the basis for secondary training of a general reading understanding model; the secondary training is performed on the general reading understanding model according to the training set of the first field scene to obtain the special reading understanding model of the first field scene. Independently establishing a special reading understanding model for each field scene improves the accuracy for a single field, and performing enhancement training on the basis of the general reading understanding model reduces the training time and the labeling cost of training data.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a flow chart of a reading understanding model building method provided by the invention;

FIG. 2 is a flow chart of a method for obtaining a generic reading understanding model according to the present invention;

FIG. 3 is a schematic diagram of a training method of a universal reading understanding model provided by the present invention;

FIG. 4 is a schematic structural diagram of a reading understanding model building apparatus provided by the present invention;

FIG. 5 is a schematic structural diagram of an electronic device for building a reading understanding model provided by the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention is used to build special reading understanding models for multi-field scene situations, and is described below with reference to FIGS. 1 to 5.

In a first aspect, as shown in fig. 1, the reading understanding model building method provided by the present invention includes:

s11, obtaining a training set of the first field scene according to the first field scene data set;

s12, performing secondary training on the general reading understanding model according to the training set of the first field scene to obtain a special reading understanding model of the first field scene;

the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

Specifically, the first field scene data set in step S11 is newly input scene data. Because the invention builds special reading understanding models for multi-field scene situations, a training set of the first field scene needs to be obtained according to the scene category of the first field scene data set and used as the training basis for the corresponding special reading understanding model.

Then, in step S12, secondary training is performed on the general reading understanding model based on the training set of the first field scene, so as to obtain the special reading understanding model of the first field scene.

The universal field scene data set is reading comprehension question-answer training data covering a plurality of field scenes, and the general reading understanding model is obtained by training a deep learning model on this data set. The general reading understanding model can answer reading comprehension questions on its own, is applicable to multiple field scenes and has strong generalization capability; however, its answer accuracy for any single field scene is limited, because the training data set also contains scene data from other fields.

Performing secondary training on the general reading understanding model according to the training set of the first field scene enhances the model's question-answering capability in the first field scene, improves its accuracy, and yields the special reading understanding model of the first field scene.

It should be understood that the first field scene may be one of the field scenes covered by the general field scene data set, or a field related to those scenes, so that there is a basis for such "enhancement". For example, if the general field scene data set covers fuses, contactors, PLCs and alarms, the first field scene may be a related field scene such as frequency converters or transformers, which offers a certain basis for enhancement training because all of these belong to the electric circuit field. Secondary training is performed for each acquired field scene data set on the basis of the general reading understanding model to obtain the special reading understanding models of the different field scenes.
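
To make the secondary training step concrete, a minimal PyTorch-style sketch is given below. It assumes the general reading understanding model is a question-answering network whose forward pass accepts a batch of tensors and returns a dictionary containing a "loss" computed from the labeled answer span; the function name, batch size, learning rate and epoch count are illustrative assumptions rather than values prescribed by the invention.

```python
import copy

import torch
from torch.utils.data import DataLoader


def secondary_training(general_model, domain_train_set, epochs=3, lr=3e-5):
    """Fine-tune a copy of the general reading understanding model on the
    training set of one field scene to obtain its special model (sketch)."""
    model = copy.deepcopy(general_model)   # keep the general model reusable
    loader = DataLoader(domain_train_set, batch_size=16, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    model.train()
    for _ in range(epochs):
        for batch in loader:
            # Assumed interface: the model consumes a batch dict and returns
            # a dict whose "loss" is computed against the labeled answer span.
            loss = model(**batch)["loss"]
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model                           # the special reading understanding model
```

Starting from the general model rather than from random initialization is what shortens training and reduces the amount of labeled domain data required.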

According to the method, a training set of a first field scene is obtained according to a first field scene data set and used as the basis for secondary training of the general reading understanding model; the secondary training is performed on the general reading understanding model according to the training set of the first field scene to obtain the special reading understanding model of the first field scene. Independently establishing a special reading understanding model for each field scene improves the accuracy for a single field, and performing enhancement training on the basis of the general reading understanding model reduces the training time and the labeling cost of training data.

In an embodiment of the present invention, the obtaining a training set of a first domain scene according to a first domain scene data set includes: if the domain scene category of the first domain scene data set is determined to be the same as the domain scene category of the second domain scene data set, merging the first domain scene data set and the second domain scene data set to obtain a training set of the first domain scene; wherein the second domain scene data set is existing domain scene data.

Specifically, determining that the domain scene category of the first domain scene data set is the same as that of the second domain scene data set (the second domain scene data set being existing domain scene data) means that the first domain scene belongs to an existing category that already has a corresponding domain scene data set. In this case the two data sets are merged, and the merged data is used as the training set of that domain scene category for the secondary training of the general reading understanding model.

In this embodiment, the training set of the first domain scene is obtained by determining that the domain scene category of the first domain scene data set is the same as that of the second domain scene data set and merging the two data sets. This enlarges the amount of training data for the first field scene and improves the answer accuracy of the special reading understanding model.
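
As an illustration of this merging step, the sketch below keeps the existing domain scene data in a plain dictionary keyed by domain scene category; the key and field names (category, samples) are assumptions made for the example, not terms defined by the invention.

```python
def build_training_set(first_set, existing_sets):
    """Build the training set of the first domain scene: merge with the
    existing data set of the same category when one exists (sketch)."""
    category = first_set["category"]
    samples = list(first_set["samples"])
    if category in existing_sets:            # same domain scene category found
        # merge: second domain scene data set + first domain scene data set
        samples = list(existing_sets[category]) + samples
    return {"category": category, "samples": samples}
```

When no existing data set shares the category, the function simply returns the first domain scene data set unchanged, which corresponds to the new-category case described below.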

In an embodiment of the present invention, if it is determined that the domain scene category of the first domain scene data set is the same as the domain scene category of the second domain scene data set, the reading understanding model specific to the second domain scene is trained in an enhanced manner according to the first domain scene data, so as to obtain an optimized reading understanding model specific to the second domain scene.

Compared with the previous embodiment, the total data set used for training the reading understanding model is the same, and the finally obtained special reading understanding model is also the same, but the embodiment directly performs the enhancement training based on the special reading understanding model of the second field scene according to the first field scene data set, so that the training time is saved.

In an embodiment of the present invention, the obtaining a training set of a first domain scene according to a first domain scene data set includes: if it is determined that no existing domain scene data set has the same domain scene category as the first domain scene data set, taking the first domain scene data set as the training set of the first domain scene.

Specifically, determining that no existing domain scene data set has the same domain scene category as the first domain scene data means that the first domain scene is a new domain scene category. In this case, the dedicated reading understanding model of the first domain scene has to be newly built on the basis of the general reading understanding model, so the first domain scene data set is directly used as the training set of the first domain scene. When the first field scene is a new field scene, establishing its special reading understanding model on the basis of the general reading understanding model saves model training time and the cost of acquiring training data.

In an embodiment of the present invention, after determining that there is no existing domain scene data set with the same domain scene category as the first domain scene data set and using the first domain scene data set as the training set of the first domain scene, the method further includes: adding the first domain scene data set to the existing domain scene data.

Specifically, after the dedicated reading understanding model is generated for the first domain scene data, the first domain scene data set needs to be added to the existing domain scene data, so that subsequently received new field scene data can be detected, merged or used to build a new model. For example, if subsequently received new domain scene data belongs to the same category as the first domain scene data set that was added to the existing domain scene data, the newly received data is added to the first domain scene data set.

In the embodiment, the first field scene data set is added into the existing field scene data, so that the operations of scene data set management, subsequent field scene data detection and the like are facilitated, and the field scene data are enriched.
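
Continuing the dictionary-based bookkeeping assumed in the earlier sketch, the registration step might look like the following; again, the structure and names are illustrative.

```python
def register_domain_data(existing_sets, first_set):
    """Add the first domain scene data set to the existing domain scene data
    so that later data of the same category is detected and merged (sketch)."""
    category = first_set["category"]
    existing_sets.setdefault(category, [])
    existing_sets[category].extend(first_set["samples"])


# Example: a later delivery with the same category lands in the same entry.
existing = {}
register_domain_data(existing, {"category": "transformer", "samples": ["sample 1"]})
register_domain_data(existing, {"category": "transformer", "samples": ["sample 2"]})
assert existing["transformer"] == ["sample 1", "sample 2"]
```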

As shown in fig. 2, in an embodiment of the present invention, the step of obtaining the general reading understanding model includes:

and S21, acquiring the universal field scene data set.

The general field scene data set is a field scene data set covering a plurality of field scenes. It is used to train the initial deep learning model, with the aim of improving the generalization capability of the question-answering model and giving it universality.

S22, splitting the general field scene data set into a general training set, a general verification set and a general test set.

The splitting of the general field scene data set is performed to prevent model overfitting and reduce generalization errors. The specific proportion of splitting the general field scene data set can be set according to the total amount of the general field scene data as required.
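
For illustration, the sketch below splits the general field scene data set with a default 8:1:1 ratio; the ratio, like the function and variable names, is an assumption, since the text leaves the proportion to be chosen according to the total amount of data.

```python
import random


def split_dataset(samples, train_ratio=0.8, val_ratio=0.1, seed=42):
    """Split the general field scene data set into general training,
    verification and test sets (sketch; 8:1:1 is only a default)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)   # shuffle reproducibly before splitting
    n_train = int(len(samples) * train_ratio)
    n_val = int(len(samples) * val_ratio)
    train_set = samples[:n_train]
    val_set = samples[n_train:n_train + n_val]
    test_set = samples[n_train + n_val:]
    return train_set, val_set, test_set
```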

S23, training, verifying and testing the initial deep learning model according to the universal training set, the universal verification set and the universal test set to obtain the universal reading understanding model.

In the embodiment, the initial deep learning model is trained by using the general field scene data set, so that the general reading understanding model with question answering capability for a plurality of field scenes is obtained as the basic model, and the training time of the special reading understanding model and the cost for acquiring training data are saved.

In an embodiment of the invention, the universal field scene data set is a data set of a multi-field scene reading comprehension sample after data labeling; the multi-field scene reading understanding sample at least comprises a question, a document, an answer and a document position corresponding to the question and the answer.

Data labeling means annotating the reading understanding samples, with each label representing a question-answer related attribute of the sample. The labeled data is used to train the initial deep learning model and obtain a universal reading understanding model capable of answering questions about scenes in the covered fields.

For example, the labeled content may include attributes such as the domain, the scene, the question content and the answer content. In the document "Schneider EA9AN product manual", the question "price of the Schneider EA9AN (20A) breaker" may be labeled with "brand, model, specification, product, price", and the answer "the EA9AN (20A) costs 80 yuan, appearing on page 3, paragraph 2" may be labeled with "model, specification, price value, currency, document position". The general reading understanding model is then trained by inputting the document and the labeled question into the initial deep learning model and correcting the model according to the labeled answer and document position.

The general field scene data set in this embodiment is a data set of multi-field scene reading understanding samples after data labeling, which facilitates model training. Each multi-field scene reading understanding sample comprises at least a question, a document, an answer and the document position corresponding to the question and the answer, so that the trained general reading understanding model can deduce the answer from the input question and document and locate the positions in the document where the question and the answer appear, which improves the readability of the answer.
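
A single labeled multi-field scene reading understanding sample could be represented as follows; the field names and layout are illustrative assumptions, and only the four required elements (question, document, answer and document position) come from the text.

```python
# One annotated reading understanding sample (illustrative field names).
sample = {
    "domain": "electric circuit",                     # domain / scene labels
    "scene": "product manual question answering",
    "question": "Price of the Schneider EA9AN (20A) breaker",
    "document": "Schneider EA9AN product manual",     # document identifier or text
    "answer": "The EA9AN (20A) costs 80 yuan",
    "answer_position": {"page": 3, "paragraph": 2},   # document position of the answer
}
```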

In an embodiment of the present invention, the training, verifying and testing the deep learning type according to the general training set, the general verification set and the general test set to obtain the general reading understanding model further includes: s231, training the initial deep learning model according to the general training set, and optimizing parameters of the model; s232, verifying the deep learning model after training optimization according to the general verification set, and optimizing the hyper-parameters of the model; s233, testing and evaluating the verified and optimized deep learning model according to the general test set to obtain the generalization error of the verified and optimized deep learning model; and S234, determining that the generalization error is smaller than a preset threshold value, and taking the verified and optimized deep learning model as the general reading understanding model.

Specifically, in S231, the initial deep learning model is trained according to the universal training set and the parameters of the model are optimized so that the model acquires learning capability. In S232, the trained and optimized deep learning model is verified according to the general verification set and the hyper-parameters of the model are optimized; the verification set is used to prevent the model from over-fitting the training set, that is, from learning features that are too specific to the training set. The hyper-parameters may include regularization parameters, the number of layers of the neural network, the number of neurons in each hidden layer, and the like. In S233, the universal test set is used to test the performance of the model. In S234, if the generalization error is determined to be smaller than a preset threshold, the verified and optimized deep learning model is taken as the general reading understanding model; in other words, a model with good generalization capability is selected as the general reading understanding model.

In the embodiment, the initial deep learning model is optimized by using the universal training set, the universal verification set and the universal test set, and a universal reading understanding model with better generalization capability is screened out.
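
The selection logic of steps S231 to S234 could be sketched as follows; train_fn, evaluate_fn, the hyper-parameter grid and the 0.15 error threshold are placeholders for whatever training routine, evaluation metric and preset threshold are actually used.

```python
def build_general_model(train_fn, evaluate_fn, train_set, val_set, test_set,
                        hyperparameter_grid, error_threshold=0.15):
    """Sketch of S231-S234: fit parameters on the training set, choose
    hyper-parameters on the verification set, then accept the model only if
    its generalization error on the test set is below a preset threshold."""
    best_model, best_val_error = None, float("inf")
    for hp in hyperparameter_grid:            # S232: tune hyper-parameters
        model = train_fn(train_set, hp)       # S231: optimize model parameters
        val_error = evaluate_fn(model, val_set)
        if val_error < best_val_error:
            best_model, best_val_error = model, val_error

    generalization_error = evaluate_fn(best_model, test_set)    # S233
    if generalization_error < error_threshold:                  # S234
        return best_model                     # accepted as the general model
    return None                               # otherwise adjust and retrain
```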

In another series of embodiments of the present invention, the training method of the dedicated reading understanding model refers to the training method of the general reading understanding model, and is not described herein again.

The reading understanding model building device provided by the invention is described below, and the reading understanding model building device described below and the reading understanding model building method described above can be referred to correspondingly.

In a second aspect, as shown in fig. 4, the reading understanding model building apparatus provided by the present invention includes: a training set generation module 41 and a special reading understanding model generation module 42.

The training set generating module 41 is configured to obtain a training set of a first field scene according to a first field scene data set; the special reading understanding model generating module 42 is configured to perform secondary training on the general reading understanding model according to the training set of the first field scene to obtain a special reading understanding model of the first field scene; the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

In this embodiment, a training set of a first field scene is obtained according to a first field scene data set and used as the basis for secondary training of the general reading understanding model; the secondary training is performed on the general reading understanding model according to the training set of the first field scene to obtain the special reading understanding model of the first field scene. Independently establishing a special reading understanding model for each field scene improves the accuracy for a single field, and performing enhancement training on the basis of the general reading understanding model reduces the training time and the labeling cost of training data.

In a third aspect, FIG. 5 illustrates a schematic physical structure diagram of an electronic device. As shown in FIG. 5, the electronic device may include: a processor 510, a communications interface 520, a memory 530 and a communication bus 540, wherein the processor 510, the communications interface 520 and the memory 530 communicate with each other via the communication bus 540. The processor 510 may invoke logic instructions in the memory 530 to perform a reading understanding model construction method, the method comprising: obtaining a training set of a first field scene according to the first field scene data set; performing secondary training on a general reading understanding model according to the training set of the first field scene to obtain a special reading understanding model of the first field scene; the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

In a fourth aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the reading understanding model building method provided in the above aspects, the method including: obtaining a training set of a first field scene according to the first field scene data set; performing secondary training on a general reading understanding model according to the training set of the first field scene to obtain a special reading understanding model of the first field scene; the general reading understanding model is obtained by training a deep learning model in advance according to a general field scene data set.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
