Model training and keyword extraction method and device

Document No.: 486985. Publication date: 2022-01-04.

Reading note: This technology, Model training and keyword extraction method and device (一种模型训练及关键词提取方法及装置), was designed by 校娅沈元童咏之奚骏泉汤彪张敏 on 2021-09-15. The specification discloses a method and a device for model training and keyword extraction. A training sample is determined, and the keyword of the training sample, the entity classification result of the keyword and the emotion classification result of the keyword are used as the first label, second label and third label of the training sample. The word vector of each character in each training sample is determined based on the position of the character and the position of the keyword in the training sample; the keyword of each training sample is determined based on these word vectors; and the entity classification result and the emotion classification result of each keyword are determined according to the word vector of the keyword. The keyword extraction model is then trained on the keywords, entity classification results and emotion classification results extracted from the training samples, together with their labels. As a result, recommendation based on the determined keywords can use both the entity classification and the emotion classification corresponding to each keyword, which improves recommendation accuracy.

1. A method for training a keyword extraction model, the method comprising:

acquiring a plurality of pieces of non-structural information, using the sentences in the plurality of pieces of non-structural information as training samples, and, for each training sample, determining the keywords in the training sample, the entity classification results of the keywords and the emotion classification results of the keywords as the first label, second label and third label of the training sample;

inputting the training sample into a preprocessing module in a keyword extraction model to be trained, determining a content vector and a position vector of each character in the training sample, and determining a word vector of each character according to the content vector and the position vector of the character, wherein the position vector is determined based on the position of the character in the training sample and the position of the first label of the training sample in the training sample;

inputting each word vector corresponding to the training sample into an extraction module in the keyword extraction model to obtain a keyword of the training sample output by the extraction module, and determining a word vector corresponding to the keyword according to each character corresponding to the keyword;

inputting the word vector corresponding to the keyword into an entity classification module and an emotion classification module in the keyword extraction model respectively, to obtain an entity classification result of the training sample output by the entity classification module and an emotion classification result of the training sample output by the emotion classification module;

determining a loss according to the keywords and the first label, the entity classification result and the second label, and the emotion classification result and the third label corresponding to the training sample, and adjusting model parameters in the keyword extraction model with minimizing the loss as the optimization target, wherein the model parameters at least comprise position parameters, and the keyword extraction model is used for determining the keywords of the non-structural information and their classification.

2. The method of claim 1, wherein determining, for each training sample, an entity classification result of the keyword in the training sample as the second label of the training sample specifically comprises:

for each training sample, inputting the keywords of the training sample into a pre-trained entity classification model, and determining an entity classification result corresponding to the keywords as the second label of the training sample;

the entity classification model is obtained by learning based on a small sample of keywords marked with entity classification results.

3. The method of claim 1, wherein determining, for each training sample, the emotion classification result for the keyword in the training sample comprises:

for each training sample, inputting the training sample into a pre-trained emotion classification model, and determining an emotion classification result corresponding to the training sample as the emotion classification result of the keyword;

the emotion classification model is obtained by learning based on a small sample of sentences marked with emotion classification results.

4. The method of claim 1, wherein determining, for each training sample, the keyword in the training sample as the first label of the training sample specifically comprises:

for each training sample, determining a keyword corresponding to the training sample through a pre-trained pre-extraction model, and using the keyword as the first label of the training sample; wherein

the pre-extraction model is trained in the following way:

acquiring non-structural information;

segmenting the non-structural information, determining each word corresponding to the non-structural information, and determining each candidate word according to the occurrence frequency of each word;

for each candidate word, judging whether the candidate word exists in a preset keyword dictionary;

if yes, determining the candidate word as a positive sample;

if not, determining that the candidate word is a negative sample;

and inputting each candidate word into a pre-extraction model to be trained to obtain a pre-extraction result of each candidate word output by the pre-extraction model, and training the pre-extraction model according to the label and the pre-extraction result of each candidate word.

5. The method of claim 1, wherein determining the position vector for each character in the training sample comprises:

for each character in the training sample, determining a first position vector of the character according to the position of the character in the training sample;

determining a second position vector of the character according to the position of the character in the training sample and the position of the first label of the training sample in the training sample;

and determining the position vector of the character according to the first position vector and the second position vector of the character, wherein the second position vector is positively correlated with the distance between the character and the first label.

6. The method of claim 5, wherein determining the word vector for the character based on the content vector and the position vector for the character comprises:

determining an auxiliary vector of the character according to content vectors of other characters in the training sample;

and determining a word vector of the character according to the content vector, the position vector and the auxiliary vector corresponding to the character.

7. A keyword extraction method, characterized in that the method comprises:

acquiring unstructured information, wherein the unstructured information at least comprises one statement;

inputting each sentence in the non-structural information into a pre-processing module of a pre-trained keyword extraction model, determining a content vector and a position vector of each character in the sentence, and determining a word vector of each character according to the content vector and the position vector of the character for each character, wherein the position vector is determined based on position parameters of the keyword extraction model;

inputting each word vector corresponding to the sentence into an extraction module in the keyword extraction model, and determining a keyword corresponding to the sentence;

and respectively inputting the keywords into an entity classification module and an emotion classification module of the keyword extraction model, and determining entity classification results and emotion classification results corresponding to the keywords, wherein the keywords and the entity classification results and emotion classification results corresponding to the keywords are used for recommending the non-structural information to a user.

8. An apparatus for training a keyword extraction model, the apparatus comprising:

a sample determining module, configured to acquire a plurality of pieces of non-structural information, use the sentences in the plurality of pieces of non-structural information as training samples, and, for each training sample, determine the keywords in the training sample, the entity classification results of the keywords and the emotion classification results of the keywords as the first label, second label and third label of the training sample;

a preprocessing module, configured to input the training sample into a preprocessing module in a keyword extraction model to be trained, determine a content vector and a position vector of each character in the training sample, and determine a word vector of each character according to the content vector and the position vector of the character, wherein the position vector is determined based on the position of the character in the training sample and the position of the first label of the training sample in the training sample;

an extraction module, configured to input each word vector corresponding to the training sample into the extraction module in the keyword extraction model to obtain the keywords of the training sample output by the extraction module, and determine the word vector corresponding to the keywords according to each character corresponding to the keywords;

a classification module, configured to input the word vector corresponding to the keywords into an entity classification module and an emotion classification module in the keyword extraction model respectively, and obtain an entity classification result of the training sample output by the entity classification module and an emotion classification result of the training sample output by the emotion classification module;

and a training module, configured to determine a loss according to the keywords and the first label, the entity classification result and the second label, and the emotion classification result and the third label corresponding to the training sample, and adjust model parameters in the keyword extraction model with minimizing the loss as the optimization target, wherein the model parameters at least comprise position parameters, and the keyword extraction model is used for determining the keywords of the non-structural information and their classification.

9. A keyword extraction apparatus, the apparatus comprising:

an acquisition module, configured to acquire unstructured information, wherein the unstructured information comprises at least one sentence;

a preprocessing module, configured to input each sentence in the non-structural information into a preprocessing module of a pre-trained keyword extraction model, determine a content vector and a position vector of each character in the sentence, and, for each character, determine a word vector of the character according to the content vector and the position vector of the character, wherein the position vector is determined based on a position parameter of the keyword extraction model;

an extraction module, configured to input each word vector corresponding to the sentence into the extraction module in the keyword extraction model and determine the keyword corresponding to the sentence;

and a classification module, configured to input the keywords into the entity classification module and the emotion classification module of the keyword extraction model respectively, and determine the entity classification results and emotion classification results corresponding to the keywords, wherein the keywords and the entity classification results and emotion classification results corresponding to the keywords are used for recommending the non-structural information to a user.

10. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1 to 6 or 7.

11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 6 or 7 when executing the program.

Technical Field

The specification relates to the technical field of computers, in particular to a model training and keyword extraction method and device.

Background

Currently, with the development of computer technology, information generated by users has become one of the information sources of service providers. However, most information generated by users, such as user comments, is unstructured and cannot be used directly, so how to extract key information from unstructured information has become one of the problems that service providers need to solve. A keyword extraction method can extract keywords from sentences and determine the categories of those keywords, and is widely applied in scenarios where a service provider recommends content to users.

In the prior art, a commonly used keyword extraction method is implemented based on a keyword extraction model. Specifically, for each sentence in the unstructured information requiring keyword extraction, feature extraction is performed on the sentence, a sentence vector corresponding to the sentence is determined, and then the sentence vector is input into a keyword extraction model trained in advance as an input, so that a keyword corresponding to each sentence in the unstructured information output by the keyword extraction model and a category corresponding to the keyword are obtained.

However, in the prior art, the emotion classification of each keyword, that is, whether the keyword represents a positive emotion or a negative emotion in the unstructured information, is not considered when the keyword extraction model is trained. As a result, when unstructured information is recommended to a user based on keywords extracted by such a keyword extraction model, the determined recommended content is inaccurate and the recommendation precision is low.

Disclosure of Invention

The present specification provides a method and an apparatus for model training and keyword extraction, which partially solve the above problems in the prior art.

The technical scheme adopted by the specification is as follows:

the present specification provides a training method of a keyword extraction model, including:

Optionally, for each training sample, determining an entity classification result of the keyword in the training sample, as a second label of the training sample, specifically including:

the entity classification model is obtained by learning based on small sample keywords marked with entity classification results.

Optionally, for each training sample, determining an emotion classification result of the keyword in the training sample specifically includes:

the emotion classification model is obtained by learning based on a small sample sentence marked with an emotion classification result.

Optionally, for each statement, determining each keyword in the statement, as a first label of the statement, specifically including:

for each statement, inputting the statement into a pre-trained pre-extraction model, and determining a keyword corresponding to the statement as the first label of the statement; wherein

the pre-extraction model is trained in the following way:

acquiring non-structural information;

for each candidate word, judging whether the candidate word exists in a preset keyword dictionary;

if yes, determining the candidate word as a positive sample;

if not, determining that the candidate word is a negative sample;

Optionally, determining a position vector of each character in the training sample specifically includes:

for each character in the training sample, determining a first position vector of the character according to the position of the character in the training sample;

Optionally, determining a word vector corresponding to the character according to the content vector and the position vector of the character specifically includes:

determining an auxiliary vector of the character according to content vectors of other characters in the training sample;

and determining a word vector of the character according to the content vector, the position vector and the auxiliary vector corresponding to the character.

The present specification provides a keyword extraction method, including:

acquiring unstructured information, wherein the unstructured information at least comprises one statement;

inputting each word vector corresponding to the sentence into an extraction module in the keyword extraction model, and determining a keyword corresponding to the sentence;

The present specification provides a training apparatus for a keyword extraction model, including:

The present specification provides a keyword extraction apparatus, including:

The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described training method of the keyword extraction model or the keyword extraction method.

The present specification provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above-described training method of the keyword extraction model or the keyword extraction method when executing the program.

The technical scheme adopted by the specification can achieve the following beneficial effects:

In the method for training a keyword extraction model provided in this specification, a training sample is determined, and the keyword of the training sample, the entity classification result of the keyword and the emotion classification result of the keyword are used as the first label, second label and third label of the training sample. The word vector of each character in each training sample is determined based on the position of the character and the position of the keyword in the training sample, and the keyword of each training sample is determined based on these word vectors, so that the entity classification result and the emotion classification result of each keyword are determined according to the word vector of the keyword. The keyword extraction model is then trained on the keywords, entity classification results and emotion classification results extracted from the training samples, together with their labels. When recommendation is performed based on the determined keywords, it can be based on both the entity classification and the emotion classification corresponding to the keywords, which improves recommendation accuracy.

According to the method, when recommendation is performed based on the keywords determined by the method, recommendation can be performed based on entity classification corresponding to the keywords and emotion classification corresponding to the keywords, and recommendation accuracy is improved.
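As a rough illustration of how the three labels could jointly supervise training, the following sketch sums three cross-entropy terms. It assumes that keyword extraction is scored as per-character tagging and that the three terms are combined with fixed weights; neither detail is stated in this specification, and the names `cross_entropy`, `joint_loss` and `weights` are hypothetical.

```python
import math


def cross_entropy(probabilities, target_index):
    """Negative log-likelihood of the target class (clamped to avoid log(0))."""
    return -math.log(max(probabilities[target_index], 1e-12))


def joint_loss(keyword_probs, keyword_tags,
               entity_probs, entity_label,
               emotion_probs, emotion_label,
               weights=(1.0, 1.0, 1.0)):
    """Combine the three supervision signals into one scalar loss.

    keyword_probs / keyword_tags: per-character tag distributions and the
    tags derived from the first label (assumed tagging formulation).
    entity_* and emotion_*: predictions versus the second and third labels.
    """
    extraction = sum(
        cross_entropy(p, t) for p, t in zip(keyword_probs, keyword_tags)
    ) / len(keyword_tags)
    entity = cross_entropy(entity_probs, entity_label)
    emotion = cross_entropy(emotion_probs, emotion_label)
    w_kw, w_ent, w_emo = weights  # hypothetical weighting, not from the source
    return w_kw * extraction + w_ent * entity + w_emo * emotion


loss = joint_loss(
    keyword_probs=[[0.9, 0.1], [0.2, 0.8]], keyword_tags=[0, 1],
    entity_probs=[0.7, 0.2, 0.1], entity_label=0,
    emotion_probs=[0.6, 0.4], emotion_label=0,
)
print(round(loss, 4))
```

Minimizing such a combined value adjusts the extraction, entity classification and emotion classification parts of the model together, which is the effect the paragraphs above describe.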

Drawings

The accompanying drawings, which are included to provide a further understanding of the specification and constitute a part of this specification, illustrate embodiments of the specification and, together with the description, serve to explain the specification without unduly limiting it. In the drawings:

FIG. 1 is a schematic flow chart of a method for training a keyword extraction model provided herein;

FIG. 2 is a schematic structural diagram of a keyword extraction model provided in the present specification;

FIG. 3 is a schematic diagram of a method of generating training samples provided herein;

FIG. 4 is a schematic diagram of a keyword extraction process provided in the present specification;

FIG. 5 is a schematic diagram of a device for training a keyword extraction model provided in the present specification;

FIG. 6 is a schematic diagram of a keyword extraction apparatus provided in the present specification;

FIG. 7 is a schematic diagram of an electronic device corresponding to FIG. 1 or FIG. 5 provided in the present specification.

Detailed Description

In order to make the objects, technical solutions and advantages of the present disclosure more clear, the technical solutions of the present disclosure will be clearly and completely described below with reference to the specific embodiments of the present disclosure and the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without any creative effort belong to the protection scope of the present specification.

The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.

Fig. 1 is a schematic flowchart of a method for training a keyword extraction model provided in this specification, and specifically includes the following steps:

S100: obtaining a plurality of pieces of non-structural information, using the sentences in the plurality of pieces of non-structural information as training samples, and, for each training sample, determining the keywords in the training sample, the entity classification results of the keywords and the emotion classification results of the keywords as the first label, second label and third label of the training sample.

Generally, in the content recommendation field, keyword extraction and classification may be performed on each piece of unstructured information through a keyword extraction model, and a keyword and a type of the keyword of each piece of unstructured information are determined, so as to recommend related content and the like to a user according to the determined content of the keyword and the type of the keyword.

Generally, the keyword extraction model is obtained by a server for training the model, and is trained in advance based on training samples. The present specification provides a method for training a keyword extraction model, and as such, the process of training the keyword extraction model may be performed by a server for training the model.

Model training can be divided into a sample generation phase and a model training phase, and the samples used for training the model can be determined in the sample generation phase according to model requirements and training requirements. In this specification, the server may first determine training samples for training the keyword extraction model. Since the keyword extraction model generally performs extraction and classification on pieces of non-structural information, the server may first determine the pieces of non-structural information in order to determine the training samples.

Based on this, the server may obtain a plurality of pieces of non-structural information and use the sentences in the plurality of pieces of non-structural information as training samples. Each piece of non-structural information may be the product introduction of a product on the platform of a service provider, or a user's comment on a product. The specific source and form of the non-structural information may be set as needed, which is not limited in this specification.

In one or more embodiments provided in this specification, since each keyword corresponding to the non-structural information exists in each sentence, and the difference between the categories of the keywords in different sentences may be large, the server may determine, for each training sample, the keyword corresponding to the training sample, and the entity classification and the emotion classification corresponding to the keyword, so as to determine the first label, the second label, and the third label of the training sample.

Specifically, for each training sample, the server may determine, through a pre-trained pre-extraction model, the keyword corresponding to the training sample as the first label of the training sample. The pre-extraction model can be obtained by training in the following way:

First, for each piece of non-structural information, the server may perform word segmentation on the non-structural information, count the frequency of occurrence of each word, and select the words with higher frequencies of occurrence as candidate words.

Second, for each candidate word, the server judges whether the candidate word exists in a dictionary prestored in the server; if so, the candidate word is taken as a positive sample, and if not, as a negative sample.

Then, for each candidate word, a score corresponding to the candidate word is determined according to parameters such as the occurrence frequency of the candidate word, whether it is a new word, whether its meaning is complete, and its degree of association with other words. In general, the higher the score, the more likely the candidate word is a keyword, and vice versa.

Finally, the server can determine whether each candidate word is a keyword according to a preset keyword score threshold, and train the pre-extraction model based on the score corresponding to each candidate word and its label.
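The candidate selection and labelling steps above can be sketched as follows. This is a minimal illustration that assumes word segmentation has already produced a list of words and that "higher frequency" means a simple count threshold; `build_candidate_samples` and `min_count` are hypothetical names, and the scoring step is left out.

```python
from collections import Counter


def build_candidate_samples(words, keyword_dictionary, min_count=2):
    """Pick frequent words as candidates and label them against a dictionary.

    Returns (word, count, label) tuples, where label 1 marks a positive
    sample (word found in the dictionary) and 0 a negative sample.
    """
    counts = Counter(words)
    samples = []
    for word, count in counts.items():
        if count < min_count:
            continue  # assumed threshold for "higher frequency"
        label = 1 if word in keyword_dictionary else 0
        samples.append((word, count, label))
    # Sort for a deterministic order: most frequent first, then alphabetical.
    samples.sort(key=lambda s: (-s[1], s[0]))
    return samples


# Toy segmented text; in practice this would come from a word segmenter.
words = ["hotpot", "tasty", "hotpot", "queue", "queue", "tasty", "hotpot", "parking"]
samples = build_candidate_samples(words, keyword_dictionary={"hotpot", "parking"})
print(samples)
```

Here "parking" is in the dictionary but occurs only once, so it never becomes a candidate; only frequent words are labelled.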

In one or more embodiments provided in this specification, after determining the keyword corresponding to each training sample, the server may further determine an entity classification corresponding to each keyword as the second label corresponding to the training sample.

Specifically, in the present specification, eight entity label categories may be defined according to business needs: concept, food, commodity, facility, scene, environment, service and unknown. Concept represents the business type of a merchant, such as "internet-famous shop" or "Sichuan restaurant". Food covers food-related categories such as dishes, drinks and ingredients. Commodity is defined as entities sold by the merchant other than food, such as souvenirs. Facility covers the equipment available at the merchant, such as "non-smoking area" or "booth seating". Scene describes the occasions the merchant is suited to, such as "suitable for team building" or "suitable for taking photos". Environment and service respectively describe the environmental features of the merchant and the types of service it provides, for example "by the river" or "boating available". Unknown identifies keywords that cannot be assigned to any of the other seven labels.

The server may obtain a pre-trained entity classification model, and obtain entity classification results corresponding to the keywords output by the entity classification model by using the keywords as input.

Because the labels are limited, the entity classification model can be obtained by training on a small sample of keywords marked with entity classifications:

a small number of labelled keywords are acquired, each keyword is input into the entity classification model to be trained, and the entity classification result corresponding to each keyword is determined. When a new keyword is received, the entity classification result corresponding to that keyword is determined according to the similarity between the keyword and each pre-stored keyword.

Of course, the server may also perform entity classification on each keyword according to preset entity classification rules, for example a preset rule such as "spicy rabbit head - Sichuan cuisine".
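A minimal sketch of classifying a new keyword by similarity to stored, labelled keywords is given below. The specification does not say how similarity is computed; character-set Jaccard overlap is used here purely as a stand-in, and `char_jaccard`, `classify_entity` and `min_similarity` are hypothetical names.

```python
def char_jaccard(a, b):
    """Jaccard overlap of the character sets of two strings (assumed measure)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def classify_entity(keyword, labelled_keywords, min_similarity=0.2):
    """Return the label of the most similar stored keyword, or "unknown"."""
    best_label, best_score = "unknown", 0.0
    for stored, label in labelled_keywords.items():
        score = char_jaccard(keyword, stored)
        if score > best_score:
            best_label, best_score = label, score
    # Fall back to "unknown" when nothing is similar enough.
    return best_label if best_score >= min_similarity else "unknown"


# A tiny labelled set, standing in for the small-sample keywords.
labelled = {
    "spicy hotpot": "food",
    "smoke-free area": "facility",
    "souvenir mug": "commodity",
}
print(classify_entity("beef hotpot", labelled))
```

A trained model would replace the hand-written similarity with learned representations; the fallback to "unknown" mirrors the eighth label category.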

In one or more embodiments provided in this specification, when determining the entity classification corresponding to each keyword, the server may further determine, according to the information in the non-structural information, the emotion classification result corresponding to each keyword, that is, positive emotion or negative emotion, and use the emotion classification result corresponding to each keyword as the third label of the training sample.

Specifically, the server may determine, according to preset emotion classification rules, whether a sentence or character matching any emotion classification rule exists in the non-structural information, and thereby determine the emotion classification result corresponding to each keyword. For example, in "the sweet and sour tenderloin is not good at all", the keyword "sweet and sour tenderloin" belongs to negative emotion. In "sweet and sour tenderloin is recommended", the keyword "sweet and sour tenderloin" belongs to positive emotion.
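Rule-based emotion labelling of this kind might look like the sketch below. The cue phrases, the rule order and the function name `classify_emotion` are assumptions made for illustration; the specification only states that preset rules are matched against the text.

```python
# Hypothetical cue phrases; a real rule set would be far larger.
NEGATIVE_CUES = ("not good", "not recommended", "disappointing")
POSITIVE_CUES = ("recommended", "delicious", "worth trying")


def classify_emotion(sentence, keyword):
    """Label the keyword's emotion from cue phrases found in its sentence.

    Returns "negative", "positive", or None when the keyword is absent or
    no rule matches.
    """
    text = sentence.lower()
    if keyword.lower() not in text:
        return None
    # Negative cues are checked first so that "not recommended" is not
    # caught by the positive cue "recommended".
    if any(cue in text for cue in NEGATIVE_CUES):
        return "negative"
    if any(cue in text for cue in POSITIVE_CUES):
        return "positive"
    return None


dish = "sweet and sour tenderloin"
print(classify_emotion("The sweet and sour tenderloin is not good at all", dish))
print(classify_emotion("Sweet and sour tenderloin is recommended", dish))
```

The result for each keyword would then serve as the third label of the corresponding training sample.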

S102: inputting the training sample into a preprocessing module in a keyword extraction model to be trained, determining a content vector and a position vector of each character in the training sample, and, for each character, determining a word vector of the character according to the content vector and the position vector of the character, wherein the position vector is determined based on the position of the character in the training sample and the position of the first label of the training sample in the training sample.

In one or more embodiments provided in this specification, for each training sample, the keyword is easier to determine if characters closer to the position of the keyword carry higher weight. Therefore, after the training samples are determined, the server may input each training sample into the preprocessing module of the keyword extraction model, and obtain a content vector of each character in the training sample and a position vector that can be used to represent the distance between the character and the keyword. The position vector is inversely related to the distance between the character and the keyword; when training is performed based on the position vector, a position parameter can be determined, and the position parameter can be used to represent the distance from the keyword.

Specifically, the server may determine, for each character in each training sample, a content vector according to content corresponding to the character, determine a position vector of the character according to a position of the character and a position of a keyword in the training sample, and then fuse the content vector and the position vector of the character to determine a word vector of the character.
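The fusion of a content vector with a position vector can be sketched as follows. The two-component position vector, the `1 / (1 + distance)` proximity weight and fusion by concatenation are all assumptions chosen for illustration, as are the names `keyword_distance`, `position_vector`, `char_vectors` and `content_table`; the specification does not fix any of these details.

```python
def keyword_distance(index, kw_start, kw_end):
    """Distance in characters from `index` to the keyword span [kw_start, kw_end)."""
    if index < kw_start:
        return kw_start - index
    if index >= kw_end:
        return index - kw_end + 1
    return 0  # the character is inside the keyword


def position_vector(index, length, kw_start, kw_end):
    """Two assumed components: relative position and proximity to the keyword."""
    first = index / max(length - 1, 1)
    second = 1.0 / (1.0 + keyword_distance(index, kw_start, kw_end))
    return [first, second]


def char_vectors(sentence, keyword, content_table):
    """Concatenate each character's content vector with its position vector.

    Raises ValueError if the keyword does not occur in the sentence.
    """
    kw_start = sentence.index(keyword)
    kw_end = kw_start + len(keyword)
    vectors = []
    for i, ch in enumerate(sentence):
        content = content_table.get(ch, [0.0, 0.0])  # toy embedding lookup
        vectors.append(content + position_vector(i, len(sentence), kw_start, kw_end))
    return vectors


vectors = char_vectors("good tea here", "tea", content_table={"t": [1.0, 0.0]})
print(vectors[5])  # first character of the keyword
print(vectors[8])  # the space right after the keyword
```

In a trained model the proximity weighting would be governed by learned position parameters rather than a fixed formula.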

Further, because the model requires the training samples to have equal dimensions, the server may determine the longest sentence among the training samples and unify the dimensions of the other sentences based on the number of characters in that sentence. For example, assuming the reference number of characters is 20, if a sentence has 10 characters, the 11th to 20th positions of the sentence can be padded. Therefore, the word vector of a character may further include a mask vector that characterizes whether a character actually exists at that position in the sentence.
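Padding to the longest sentence and recording a mask can be illustrated with a short sketch. The padding symbol and the names `PAD` and `pad_with_mask` are hypothetical; the mask here is a simple 0/1 flag per position.

```python
PAD = "\0"  # hypothetical padding symbol, assumed not to occur in real text


def pad_with_mask(sentences, pad=PAD):
    """Pad every sentence to the longest length and build matching masks.

    A mask entry is 1 where a real character exists and 0 where the
    position was filled with padding.
    """
    max_len = max(len(s) for s in sentences)
    padded, masks = [], []
    for s in sentences:
        missing = max_len - len(s)
        padded.append(list(s) + [pad] * missing)
        masks.append([1] * len(s) + [0] * missing)
    return padded, masks


padded, masks = pad_with_mask(["abc", "a"])
print(masks)
```

Downstream modules can use the mask to ignore padded positions when computing weights or losses.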

Furthermore, in the same sentence, the position of a character may also affect its part of speech, so the position vector may also be determined only according to the position of the character in the sentence.

In addition, the subsequent emotion classification of a keyword is based on the word vector of the keyword, whereas an emotion classification is normally determined from the whole sentence. Therefore, after determining the word vector corresponding to each character in the training sample, the server can perform semantic encoding on the training sample, so that the word vector corresponding to each character in the encoded training sample contains an auxiliary vector capable of representing the influence of the other characters in the training sample on that character.

Specifically, the server may determine, for each character, a weight corresponding to each other character according to a distance between each character and the character, and determine an auxiliary vector of the character by weighted summation according to a word vector corresponding to each other character and a weight thereof.

Of course, when determining the auxiliary vector, the word vectors corresponding to the other characters may instead be multiplied by their corresponding weights and then spliced to determine the auxiliary vector of the character. Alternatively, the auxiliary vector of the character may be determined through a neural network model, an attention-based encoder-decoder network model, and the like. The specific method for determining the auxiliary vector can be set as needed, which is not limited in this specification.
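The weighted-sum variant of the auxiliary vector can be illustrated as follows. This is a sketch only: the inverse-distance weighting and the normalization of the weights to sum to 1 are assumptions, since this specification only states that the weights depend on the distance between characters.

```python
from typing import List


def auxiliary_vector(vectors: List[List[float]], index: int) -> List[float]:
    """Weighted sum of the other characters' vectors, closer characters weighing more."""
    dim = len(vectors[index])
    others = [(j, v) for j, v in enumerate(vectors) if j != index]
    if not others:
        return [0.0] * dim
    # Assumed weighting: inverse of the character distance, normalized to sum to 1.
    raw = [1.0 / abs(j - index) for j, _ in others]
    total = sum(raw)
    aux = [0.0] * dim
    for w, (_, v) in zip(raw, others):
        for k in range(dim):
            aux[k] += (w / total) * v[k]
    return aux
```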

S104: and inputting each word vector corresponding to the training sample as input into an extraction module in the keyword extraction model to obtain the keywords of the training sample output by the extraction module, and determining the word vector corresponding to the keywords according to each character corresponding to the keywords.

In one or more embodiments provided in this specification, after determining the word vector of each character corresponding to the training sample, the server may determine, for each training sample, the keyword in the training sample based on each word vector in the training sample.

Specifically, the server may input each word vector corresponding to the training sample into the extraction module of the keyword extraction model, obtain, for each character, a determination result of whether the character and the previous character form an entity word, and use the determined entity word as the keyword of the training sample. In step S100, the first label may likewise be determined according to whether each character and the previous character form an entity word. For example, assuming that A indicates that the current character and the previous character do not form an entity word and B indicates that the current character and the previous character form an entity word, the first label of "balloon missing" should be "BABBB".

In addition, when determining the keywords of the training sample, there may be a problem that the boundaries of the keywords cannot be determined. For example, in the sentence "tesla releases the latest product," both "tesla" and "pull" belong to entity words, but obviously "tesla" is the real keyword corresponding to the sentence. Therefore, when determining the keywords, the influence of Chinese word segmentation on the determined keywords may also be considered. That is, a fourth label may be determined for each character, indicating the position of the character within the word to which it belongs, i.e., start, middle, or end.
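As an illustration of the fourth label, the sketch below derives a start/middle/end tag for each character from an already segmented sentence. The tag names and the extra `SINGLE` tag for one-character words are assumptions that are not taken from this specification, and the segmentation in the usage note is hypothetical.

```python
from typing import List


def boundary_tags(words: List[str]) -> List[str]:
    """Tag every character with its position inside the word it belongs to."""
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # ignore empty segments
        if len(word) == 1:
            tags.append("SINGLE")  # assumed extra tag for one-character words
        else:
            tags.extend(["START"] + ["MIDDLE"] * (len(word) - 2) + ["END"])
    return tags
```

For a hypothetical segmentation such as `["特斯拉", "发布", "新"]`, the last character of the first word is tagged `END` rather than as the start of a separate word, which is the boundary information the extraction module needs.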

After determining each keyword, the server may determine, for each keyword, a word vector corresponding to the keyword according to a word vector of each character corresponding to the keyword. The determination can be carried out in various ways such as pooling and splicing, and the specification does not limit the determination.
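The pooling and splicing options mentioned above can be sketched as follows, assuming each character vector is a plain list of floats of equal length and the keyword contains at least one character. The function names are hypothetical.

```python
from typing import List


def mean_pool(char_vectors: List[List[float]]) -> List[float]:
    """Average the character vectors dimension by dimension."""
    n = len(char_vectors)
    return [sum(col) / n for col in zip(*char_vectors)]


def max_pool(char_vectors: List[List[float]]) -> List[float]:
    """Take the maximum of each dimension across the characters."""
    return [max(col) for col in zip(*char_vectors)]


def splice(char_vectors: List[List[float]]) -> List[float]:
    """Concatenate the character vectors in order."""
    return [x for v in char_vectors for x in v]
```

Pooling yields a fixed-size keyword vector regardless of keyword length, whereas splicing preserves character order but produces a vector whose size grows with the keyword.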

S106: and taking the word vector as input, and respectively inputting the word vector into an entity classification module and an emotion classification module in the keyword extraction model to respectively obtain an entity classification result of the training sample output by the entity classification module and an emotion classification result of the training sample output by the emotion classification module.

In one or more embodiments provided herein, after determining each word vector, the server may determine an entity classification result and an emotion classification result corresponding to each word vector based on each word vector.

Specifically, the server may, for each training sample, input the word vector corresponding to the training sample into the entity classification module and the emotion classification module of the keyword extraction model, respectively, to obtain the entity classification result of the keyword output by the entity classification module, which is used as the entity classification result of the training sample, and the emotion classification result of the keyword output by the emotion classification module, which is used as the emotion classification result of the training sample, as shown in fig. 2.

Fig. 2 is a schematic structural diagram of a keyword extraction model provided in this specification, and it can be seen that a training sample is input to a preprocessing module in the keyword extraction model to determine each word vector corresponding to the training sample, and then the keyword in the training sample is extracted by the extraction module based on each word vector, and the word vector of the keyword is determined, and the word vectors are respectively input to an entity classification module and an emotion classification module to determine an entity classification result and an emotion classification result of the training sample.
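The data flow between the four modules can be sketched structurally as below. This is not the model of fig. 2 itself: each module is represented by a caller-supplied function, the extraction module is assumed to return a character span `(start, end)`, and mean pooling is assumed for the keyword vector. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]


@dataclass
class KeywordExtractionModel:
    preprocess: Callable[[str], List[Vector]]            # sentence -> one word vector per character
    extract: Callable[[List[Vector]], Tuple[int, int]]   # word vectors -> keyword span [start, end)
    classify_entity: Callable[[Vector], str]             # keyword vector -> entity class
    classify_emotion: Callable[[Vector], str]            # keyword vector -> emotion class

    def forward(self, sentence: str) -> Tuple[str, str, str]:
        vectors = self.preprocess(sentence)
        start, end = self.extract(vectors)
        chars = vectors[start:end]  # assumes a non-empty span
        keyword_vector = [sum(col) / len(chars) for col in zip(*chars)]
        return (
            sentence[start:end],
            self.classify_entity(keyword_vector),
            self.classify_emotion(keyword_vector),
        )
```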

S108: determining loss according to the keywords and the first label, the entity classification result and the second label, and the emotion classification result and the third label corresponding to the training sample, and adjusting model parameters in the keyword extraction model by taking the minimum loss as an optimization target, wherein the model parameters at least comprise position parameters, and the keyword extraction model is used for determining the keywords of the non-structural information and classification thereof.

In one or more embodiments provided in this specification, after determining the keywords, the entity classification results, and the emotion classification results of each training sample, the server may train the keyword extraction model to be trained.

Specifically, the server may determine a first loss according to the keyword and the first label corresponding to the training sample, determine a second loss according to the entity classification result and the second label corresponding to the training sample, determine a third loss according to the emotion classification result and the third label corresponding to the training sample, and then determine a total loss according to a sum of the first loss, the second loss, and the third loss. And adjusting model parameters in the keyword extraction model by taking the minimum total loss as an optimization target.

Of course, the total loss can also be determined as a weighted sum of the losses using preset weights corresponding to each loss. Alternatively, for each loss, the weight corresponding to that loss may be determined according to the other losses, and the total loss determined accordingly. The specific manner of determining the total loss may be set as required, which is not limited in this specification.
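Both the plain sum and the preset-weight variant of the total loss can be written in a few lines. The function name and the default of equal weights are assumptions made for illustration.

```python
from typing import List, Optional


def total_loss(losses: List[float], weights: Optional[List[float]] = None) -> float:
    """Combine the keyword, entity, and emotion losses into one training objective."""
    if weights is None:
        weights = [1.0] * len(losses)  # plain sum of the individual losses
    if len(weights) != len(losses):
        raise ValueError("one weight is required per loss")
    return sum(w * l for w, l in zip(weights, losses))
```

Calling `total_loss([first, second, third])` corresponds to the simple sum, while passing a second list applies the preset weights.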

Based on the training method of the keyword extraction model shown in fig. 1, training samples are determined, and the keywords of each training sample, the entity classification results of the keywords, and the emotion classification results of the keywords are used as the first label, second label, and third label of the training sample. Word vectors of the characters of each training sample are determined based on the positions of the characters and the positions of the keywords in the training sample, and the keywords of each training sample are determined based on the word vectors. The entity classification results and emotion classification results of the keywords are then determined according to the word vectors of the keywords, and the keyword extraction model is trained based on the keywords, entity classification results, and emotion classification results extracted for each training sample, together with the labels. As a result, when recommendation is performed based on the determined keywords, it can be based not only on the entity classification corresponding to each keyword but also on the emotion classification corresponding to each keyword, which improves recommendation precision.

In addition, when the keywords corresponding to each training sample are determined in step S100, the server may further extract each keyword using a knowledge graph. Specifically, the server may first define an undirected weighted graph, then perform word segmentation on each statement in the unstructured information, and determine each candidate word corresponding to the unstructured information as a word segmentation result corresponding to the unstructured information.

Secondly, the server can judge, for each candidate word, whether the candidate word meets the filtering condition, and if so, each of the specified number of words that follow the candidate word can be added to the dictionary together with the candidate word. The content in the dictionary is stored in the form (word 1, word 2) - number of occurrences.

Then, after determining the dictionary, the server may traverse the dictionary, and add each content in the dictionary, word 1 and word 2, as the starting point and the ending point of an edge in the graph, and the number of occurrences as the weight of the edge, to the well-defined undirected weighted graph.

And finally, iterating the undirected weighted graph, determining the weight value corresponding to each candidate word, and selecting the specified number of candidate words with higher weight values as the keywords of the statement.
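The graph-based steps above can be sketched as follows. This specification does not name the iteration scheme, so a weighted PageRank-style update (as used in TextRank) is swapped in here as an assumption. The `keep` filter, the `window` size, the damping factor, and the choice to apply the filter to both words of a pair are likewise hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def build_pair_counts(words: List[str], window: int,
                      keep: Callable[[str], bool]) -> Dict[Tuple[str, str], int]:
    """Dictionary of (word 1, word 2) -> number of co-occurrences within `window`."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for i, word in enumerate(words):
        if not keep(word):
            continue
        for other in words[i + 1:i + 1 + window]:
            if keep(other) and other != word:
                counts[tuple(sorted((word, other)))] += 1  # undirected edge
    return dict(counts)


def rank_words(pair_counts: Dict[Tuple[str, str], int],
               damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Iterate the undirected weighted graph and return a score per word."""
    neighbors: Dict[str, Dict[str, int]] = defaultdict(dict)
    for (a, b), count in pair_counts.items():
        neighbors[a][b] = count
        neighbors[b][a] = count
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        updated = {}
        for w in neighbors:
            incoming = sum(weight / sum(neighbors[n].values()) * scores[n]
                           for n, weight in neighbors[w].items())
            updated[w] = (1.0 - damping) + damping * incoming
        scores = updated
    return scores


def top_keywords(scores: Dict[str, float], k: int) -> List[str]:
    """Select the k highest-scoring candidate words (ties broken alphabetically)."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```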

Further, as mentioned above, the keywords corresponding to the unstructured information are located in individual sentences, and the categories of keywords in different sentences may differ greatly, so the server may determine the keywords corresponding to each sentence. After determining the keywords corresponding to each piece of unstructured information, the server may determine the sentences of the unstructured information according to the punctuation in the unstructured information, a preset sentence segmentation rule, and the like. For example, assume that the unstructured information is "A, B. C." and that the preset rule is that a period indicates the end of a sentence. The server may then separate the unstructured information into two sentences, "A, B" and "C". Of course, the preset sentence segmentation rule and the like can be set as needed, which is not limited in this specification.
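The sentence segmentation rule in this example can be sketched as below. The function name is hypothetical, and treating both the ASCII period and the Chinese full stop as terminators by default is an assumption.

```python
import re
from typing import List


def split_sentences(text: str, terminators: str = ".。") -> List[str]:
    """Split text at every terminator character and drop empty pieces."""
    pattern = "[" + re.escape(terminators) + "]"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]
```

With this rule, `split_sentences("A, B. C.")` yields the two sentences "A, B" and "C" from the example above, because the comma is not a terminator.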

Further, when determining the third label of each training sample in step S100, the third label can also be obtained by means of a model.

Specifically, the server may obtain a previously trained emotion classification model, and use each training sample as an input to obtain an emotion classification corresponding to each training sample output by the emotion classification model, as an emotion classification result corresponding to a keyword of each training sample.

Because the emotion classification model has only a limited set of labels, i.e., positive and negative emotions, it can be trained on a small sample of sentences labeled with emotion classes:

A small number of labeled sentences are acquired, each sentence is input into the emotion classification model to be trained, and the emotion classification result corresponding to each sentence is determined. A loss is determined according to the emotion classification result and the label of each sentence, and the model parameters of the emotion classification model are adjusted with minimizing the loss as the goal.

It should be noted that the entity classification model and the emotion classification model may have the same model structure or different model structures, and the specific model structure may be set as required, which is not limited in this specification.

As shown in fig. 3, the training sample is input into a pre-extraction model, and the keyword of the training sample output by the pre-extraction model is determined and used as the first label; the keyword is input into an entity classification model, and the entity classification result corresponding to the keyword is determined and used as the second label; the training sample is input into an emotion classification model, and the emotion classification result of the training sample is determined and used as the third label. In the figure, the entity classification is the entity classification result, and the emotion classification is the emotion classification result.

Based on the training method of the keyword extraction model shown in fig. 1, the present specification further provides a keyword extraction method, as shown in fig. 4.

Fig. 4 is a schematic flow chart of keyword extraction provided in this specification, including:

S200: acquiring unstructured information, wherein the unstructured information at least comprises one statement.

S202: and for each sentence in the non-structural information, inputting the sentence into a pre-processing module of a pre-trained keyword extraction model, determining a content vector and a position vector of each character in the sentence, and for each character, determining a word vector of the character according to the content vector and the position vector of the character, wherein the position vector is determined based on position parameters of the keyword extraction model.

S204: and inputting each word vector corresponding to the sentence into an extraction module in the keyword extraction model, and determining the keyword corresponding to the sentence.

S206: and respectively inputting the keywords into an entity classification module and an emotion classification module of the keyword extraction model, and determining entity classification results and emotion classification results corresponding to the keywords, wherein the keywords and the entity classification results and emotion classification results corresponding to the keywords are used for recommending the non-structural information to a user.

In one or more embodiments provided in this specification, because the complaint rate prediction model used in this specification is obtained by training based on the model parameters of the prediction models of other sub-scenes belonging to the same business general scene as the target sub-scene, the scene characteristics of at least some other sub-scenes belonging to the same business general scene, the user characteristics of the user, and the like, when complaint rate prediction is performed, the user characteristics of each user, the scene characteristics of the target sub-scene, and the scene characteristics of the sub-scenes having an association relationship with each user can be obtained, and the complaint rate of each user with respect to the target sub-scene can be determined.

For a specific method for determining word vectors corresponding to each sentence, and determining keywords according to each word vector, and further determining an entity classification result and an emotion classification result, reference may be made to the contents of steps S102 to S106, which is not described herein again.

Based on the same idea, the present specification further provides a training apparatus and a keyword extraction apparatus for a corresponding keyword extraction model, as shown in fig. 5 or 6.

Fig. 5 is a training apparatus of a keyword extraction model provided in this specification, including:

the sample determination module 300 is configured to obtain a plurality of non-structural information, use statements in the plurality of non-structural information as training samples, and determine, for each training sample, a keyword in the training sample, an entity classification result of the keyword, and an emotion classification result of the keyword, as a first label, a second label, and a third label of the training sample;

a preprocessing module 302, configured to input the training sample as an input into a preprocessing module in a keyword extraction model to be trained, determine a content vector and a position vector of each character in the training sample, and determine, for each character, a word vector of the character according to the content vector and the position vector of the character, where the position vector is determined based on a position of the character in the training sample and a position of a first label of the training sample in the training sample;

an extraction module 304, configured to input each word vector corresponding to the training sample as input to an extraction module in the keyword extraction model, obtain a keyword of the training sample output by the extraction module, and determine a word vector corresponding to the keyword according to each character corresponding to the keyword;

a classification module 306, configured to input the word vector as input to an entity classification module and an emotion classification module in the keyword extraction model, respectively, and obtain an entity classification result of the training sample output by the entity classification module and an emotion classification result of the training sample output by the emotion classification module;

the training module 308 is configured to determine a loss according to the keyword and the first label, the entity classification result and the second label, and the emotion classification result and the third label corresponding to the training sample, and adjust a model parameter in the keyword extraction model with the minimum loss as an optimization goal, where the model parameter at least includes a position parameter, and the keyword extraction model is used to determine a keyword of non-structural information and a classification thereof.

Optionally, the sample determining module 300 is specifically configured to, for each training sample, input the keyword of the training sample as an input into a pre-trained entity classification model, and determine an entity classification result corresponding to the keyword as a second label of the training sample, where the entity classification model is obtained based on a small sample keyword labeled with the entity classification result.

Optionally, the sample determining module 300 is specifically configured to, for each training sample, input the training sample as an input into a pre-trained emotion classification model, and determine an emotion classification result corresponding to the training sample as an emotion classification result of the keyword, where the emotion classification model is obtained by learning based on a small sample sentence labeled with the emotion classification result.

Optionally, the sample determining module 300 is specifically configured to determine, for each training sample, a keyword corresponding to the training sample through a pre-extraction model trained in advance, where the keyword is used as a first label of the training sample; wherein the pre-extraction model is trained in the following way: acquiring non-structural information; segmenting the non-structural information, determining each word corresponding to the non-structural information, and determining each candidate word according to the occurrence frequency of each word; judging whether the candidate word exists in a preset keyword dictionary or not aiming at each candidate word; if yes, determining the candidate word as a positive sample; if not, determining that the candidate word is a negative sample; and taking each candidate word as input, inputting the input into a pre-extraction model to be trained to obtain a pre-extraction result of each candidate word output by the pre-extraction model, and training the pre-extraction model according to the label and the pre-extraction result of each candidate word.

Optionally, the preprocessing module 302 is specifically configured to, for each character in the training sample, determine a first position vector of the character according to a position of the character in the training sample, determine a second position vector of the character according to the position of the character in the training sample and a position of a first label of the training sample in the training sample, and determine a position vector of the character according to the first position vector and the second position vector of the character, where the second position vector is positively correlated with a distance between the character and the first label.

Optionally, the preprocessing module 302 is specifically configured to determine an auxiliary vector of the character according to content vectors of other characters in the training sample, and determine a word vector corresponding to the character according to the content vector, the position vector, and the auxiliary vector corresponding to the character.
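The construction of positive and negative samples for the pre-extraction model described above can be illustrated as follows. The function name and the `min_count` frequency threshold are hypothetical; this specification only states that candidate words are determined according to the number of occurrences and labeled by membership in a preset keyword dictionary.

```python
from collections import Counter
from typing import List, Set, Tuple


def label_candidates(words: List[str], keyword_dictionary: Set[str],
                     min_count: int = 2) -> List[Tuple[str, int]]:
    """Return (candidate word, label) pairs; 1 = positive sample, 0 = negative sample."""
    counts = Counter(words)
    # Assumed candidate rule: keep words that occur at least `min_count` times.
    candidates = sorted(w for w, c in counts.items() if c >= min_count)
    return [(w, 1 if w in keyword_dictionary else 0) for w in candidates]
```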

Fig. 6 is a keyword extraction apparatus provided in this specification, including:

an obtaining module 400, configured to obtain unstructured information, where the unstructured information includes at least one statement;

a preprocessing module 402, configured to input each sentence in the unstructured information into a preprocessing module of a pre-trained keyword extraction model, determine a content vector and a position vector of each character in the sentence, and determine, for each character, a word vector of the character according to the content vector and the position vector of the character, where the position vector is determined based on a position parameter of the keyword extraction model;

an extraction module 404, configured to input each word vector corresponding to the statement into an extraction module in the keyword extraction model, and determine a keyword corresponding to the statement;

a classification module 406, configured to input the keyword into an entity classification module and an emotion classification module of the keyword extraction model, respectively, and determine an entity classification result and an emotion classification result corresponding to the keyword, where the keyword and the entity classification result and emotion classification result corresponding to the keyword are used to recommend the non-structural information to a user.

The present specification also provides a computer-readable storage medium storing a computer program, which can be used to execute the method for training the keyword extraction model provided in fig. 1 and the method for extracting keywords provided in fig. 4.

This specification also provides a schematic block diagram of the electronic device shown in fig. 7. As shown in fig. 7, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, but may also include hardware required for other services. The processor reads a corresponding computer program from the nonvolatile memory into the memory and then runs the computer program to implement the method for training the keyword extraction model shown in fig. 1 and the method for extracting keywords shown in fig. 4. Of course, besides the software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may be hardware or logic devices.

In the 1990s, an improvement in a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement in a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement in a method flow). However, as technology has advanced, many of today's improvements in method flows can be regarded as direct improvements in hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement in a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer programs a digital system onto a single PLD without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, nowadays, instead of manually making integrated circuit chips, such programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development, and the source code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), of which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by slightly logically programming the method flow into an integrated circuit using one of the hardware description languages described above.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be implemented by logically programming the method steps such that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be considered a hardware component, and the means included therein for performing the various functions may also be considered structures within the hardware component. Or even, the means for performing the functions may be regarded as being both software modules for performing the method and structures within the hardware component.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing this specification, the functions of the various units may be implemented in the same one or more pieces of software and/or hardware.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

As will be appreciated by one skilled in the art, embodiments of the present specification may be provided as a method, a system, or a computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present specification may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present specification may be described in the general context of computer-executable instructions, such as program modules, executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.

The embodiments in the present specification are described in a progressive manner. For the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant points, refer to the corresponding description of the method embodiment.

The above description is only an example of the present specification and is not intended to limit it. Various modifications and variations of the present specification will be apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present specification shall fall within the scope of the claims of the present specification.
