Entity relationship type determination method, device and equipment and storage medium

Document No.: 361616  Publication date: 2021-12-07  Views: 4  Language: Chinese

Reading note: This technology, "Entity relationship type determination method, device and equipment and storage medium", was designed and created by Yang Tao on 2021-05-20. Main content: The application discloses a method, a device, equipment and a storage medium for determining entity relationship types, relating to the technical field of artificial intelligence and intended to improve the accuracy of entity relationship type determination. The method comprises: acquiring a target sentence bag associated with a target entity pair; inputting the target sentence bag into a trained entity relationship determination model and, for each sentence in the target sentence bag, obtaining a sentence representation vector of the sentence based on the character representation vectors corresponding to the characters in that sentence; determining a sentence weight value for each sentence based on the obtained sentence representation vectors, where each sentence weight value characterizes how important that sentence is for determining the relationship of the target entity pair; determining a sentence bag representation vector of the target sentence bag based on the sentence representation vectors and sentence weight values corresponding to the respective sentences; and determining, based on the sentence bag representation vector, the target relationship type between the two entities included in the target entity pair.

1. A method for determining entity relationship types, the method comprising:

acquiring a target sentence bag associated with a target entity pair; the target sentence bag comprises a plurality of sentences, and each sentence contains the target entity pair;

inputting the target sentence bag into a trained entity relationship determination model, and respectively executing the following operations for each sentence in the target sentence bag: for a sentence, obtaining a sentence representation vector of the sentence based on the character representation vector corresponding to each character in the sentence;

determining, by the trained entity relationship determination model, a sentence weight value for each sentence based on the obtained sentence representation vectors, wherein each sentence weight value characterizes the importance of a sentence for determining the relationship of the target entity pair;

and determining, by the trained entity relationship determination model, a sentence bag representation vector of the target sentence bag based on the sentence representation vector and sentence weight value corresponding to each sentence, and determining, based on the sentence bag representation vector, a target relationship type between the two entities included in the target entity pair.

2. The method of claim 1, wherein the training process of the entity-relationship determination model comprises:

determining a plurality of relationship types to be output by a preset entity relationship determination model, and acquiring a plurality of triples; each triple comprises an entity pair, and one of the plurality of relationship types is labeled in association with the corresponding entity pair;

for each triple, performing the following operation: carrying out sentence matching using the entity pair contained in the triple, to obtain sentence samples containing that entity pair;

constructing corresponding training samples respectively based on the obtained sentence samples, wherein each training sample comprises a plurality of sentence samples containing the same entity pair;

for each obtained training sample, labeling it with the relationship type of the entity pair associated with that training sample;

and carrying out iterative training on the entity relationship determination model to be trained based on the marked training samples until a convergence condition is met, and obtaining the trained entity relationship determination model.

3. The method of claim 2, wherein prior to constructing respective training samples based on the obtained plurality of sentence samples, the method further comprises:

performing word segmentation operation on the obtained multiple sentence samples to obtain multiple word segments;

for the multiple relation types, the following operations are respectively executed:

for a relationship type, determining a mutual information coefficient between each of the plurality of word segments and that relationship type, wherein a mutual information coefficient characterizes the importance of a word segment to the relationship type;

selecting, based on the obtained mutual information coefficients, at least one word segment whose mutual information coefficient is greater than a set threshold;

filtering out, from the plurality of sentence samples corresponding to the relationship type, the sentence samples that do not contain any of the at least one word segment;

wherein constructing the corresponding training samples based on the obtained sentence samples comprises:

constructing the corresponding training samples based on the remaining sentence samples.

4. The method of claim 3, wherein determining, for a relationship type, the mutual information coefficients between each of the plurality of word segments and the relationship type comprises:

for each of the plurality of word segments, performing the following operations:

for one word segment, determining a first probability that the word segment occurs;

determining a second probability that the relationship type occurs, and determining a third probability that the word segment occurs when the relationship type occurs;

and determining the mutual information coefficient corresponding to the word segment based on the first probability, the second probability and the third probability.

5. The method of claim 1, wherein before obtaining the sentence-representation vector of the one sentence based on the character-representation vectors corresponding to the respective characters in the one sentence, the method further comprises:

performing character splitting on the sentence to obtain a plurality of characters included in the sentence;

for the plurality of characters, the following operations are respectively executed:

performing feature coding on one character to obtain a content representation vector, a position representation vector and a source representation vector of the character; the content representation vector is used for representing content corresponding to the character, the position representation vector represents the position of the character in the sentence, and the source representation vector represents the sentence from which the character comes;

and obtaining a character representation vector of the character based on the content representation vector, the position representation vector and the source representation vector.

6. The method of claim 1, wherein before obtaining the sentence-representation vector of the one sentence based on the character-representation vectors corresponding to the respective characters in the one sentence, the method further comprises:

performing character splitting on the sentence to obtain a plurality of characters included in the sentence;

according to the order in which the characters appear in the sentence, sequentially performing feature coding on each of the characters to obtain the character representation vector corresponding to each character; when one character is feature-coded, feature extraction is performed on that character to obtain a basic representation vector of the character, and the character representation vector of the character is obtained based on the basic representation vector and the character representation vector of the character preceding it.

7. The method according to any one of claims 1-5, wherein obtaining the sentence representation vector of the one sentence based on the character representation vectors corresponding to the respective characters in the one sentence comprises:

performing mean pooling on the obtained character representation vectors to obtain the sentence representation vector; or,

determining a character weight value for each character, and obtaining the sentence representation vector based on the character representation vector and the character weight value corresponding to each character; wherein each character weight value characterizes the importance of one character to the one sentence.

8. The method according to any one of claims 1-5, wherein determining a sentence weight value for a respective sentence based on the obtained respective sentence representation vector comprises:

for each sentence representation vector, the following operations are respectively executed:

for one sentence representation vector, obtaining an intermediate representation vector based on that sentence representation vector and a pre-trained weight matrix included in the entity relationship determination model;

obtaining a vector dot product between the intermediate representation vector and a pre-trained parameter vector included in the entity relationship determination model;

and normalizing the obtained vector dot products to obtain the sentence weight value corresponding to each sentence representation vector.

9. An entity relationship type determination apparatus, the apparatus comprising:

an acquisition unit, configured to acquire a target sentence bag associated with a target entity pair and input the target sentence bag into a trained entity relationship determination model; the target sentence bag comprises a plurality of sentences, and each sentence contains the target entity pair;

a sentence coding unit, configured to use the trained entity relationship determination model to perform the following operations for each sentence in the target sentence bag, respectively: for a sentence, obtaining a sentence representation vector of the sentence based on the character representation vector corresponding to each character in the sentence;

a sentence bag encoding unit, configured to determine, using the trained entity relationship determination model, a sentence weight value for each sentence based on the obtained sentence representation vectors, each sentence weight value characterizing the importance of a sentence for determining the relationship of the target entity pair, and to determine, using the trained entity relationship determination model, a sentence bag representation vector of the target sentence bag based on the sentence representation vectors and sentence weight values corresponding to the respective sentences;

and a prediction unit, configured to determine, using the trained entity relationship determination model, a target relationship type between the two entities included in the target entity pair based on the sentence bag representation vector.

10. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor,

the processor, when executing the computer program, realizes the steps of the method of any one of claims 1 to 8.

11. A computer storage medium having computer program instructions stored thereon, wherein,

the computer program instructions, when executed by a processor, implement the steps of the method of any one of claims 1 to 8.

Technical Field

The application relates to the technical field of computers, in particular to the technical field of Artificial Intelligence (AI), and provides a method, a device and equipment for determining entity relationship types and a storage medium.

Background

With the development of network technology, a great deal of knowledge is contained in ordinary text, and mining relevant knowledge from text is very necessary work. Entity relationship extraction is the work of mining the relationships between entities from ordinary text to construct triple data that enriches the knowledge graph, and it belongs to the basic technologies of Natural Language Processing (NLP). For example, the sentence "Zhang San was born in City A on September 27, 1961" contains the three entities "Zhang San", "September 27, 1961" and "City A", and these entities have certain association relationships: from the sentence, "Zhang San" and "September 27, 1961" are in a "time of birth" relationship, and "Zhang San" and "City A" are in a "place of birth" relationship. Therefore, after entity relationship extraction based on this sentence, the two triples (Zhang San, time of birth, September 27, 1961) and (Zhang San, place of birth, City A) can be obtained, and these triples can then be added to the knowledge graph.

Knowledge graphs have a wide range of applications in many fields. For example, in the search field, a user may ask knowledge-based questions such as "When was Zhang San born?" or "How high is Mount Qomolangma?"; these two questions can be resolved into the two queries (Zhang San, time of birth) and (Qomolangma, altitude) and answered from the knowledge graph. In the recommendation field, the knowledge in the knowledge graph can be combined with a recommendation model to provide better recommendation results for the user. In the conversation field, when a user asks a relevant question, it can be answered accurately based on the knowledge graph.

Therefore, the accuracy of the relationship extraction task directly determines the accuracy of the knowledge graph, which in turn affects the experience of downstream applications. How to improve the accuracy of the relationship extraction task so as to construct a high-quality knowledge graph is a problem that needs to be considered.

Disclosure of Invention

The embodiment of the application provides a method, a device and equipment for determining an entity relationship type and a storage medium, which are used for improving the accuracy of determining the entity relationship type.

In one aspect, a method for determining an entity relationship type is provided, where the method includes:

acquiring a target sentence bag associated with a target entity pair; the target sentence bag comprises a plurality of sentences, and each sentence contains the target entity pair;

inputting the target sentence bag into a trained entity relationship determination model, and respectively executing the following operations for each sentence in the target sentence bag: for a sentence, obtaining a sentence representation vector of the sentence based on the character representation vector corresponding to each character in the sentence;

determining, by the trained entity relationship determination model, a sentence weight value for each sentence based on the obtained sentence representation vectors, wherein each sentence weight value characterizes the importance of a sentence for determining the relationship of the target entity pair;

and determining, by the trained entity relationship determination model, a sentence bag representation vector of the target sentence bag based on the sentence representation vector and sentence weight value corresponding to each sentence, and determining, based on the sentence bag representation vector, a target relationship type between the two entities included in the target entity pair.

In one aspect, an entity relationship type determining apparatus is provided, the apparatus includes:

an acquisition unit, configured to acquire a target sentence bag associated with a target entity pair and input the target sentence bag into a trained entity relationship determination model; the target sentence bag comprises a plurality of sentences, and each sentence contains the target entity pair;

a sentence coding unit, configured to use the trained entity relationship determination model to perform the following operations for each sentence in the target sentence bag, respectively: for a sentence, obtaining a sentence representation vector of the sentence based on the character representation vector corresponding to each character in the sentence;

a sentence bag encoding unit, configured to determine, using the trained entity relationship determination model, a sentence weight value for each sentence based on the obtained sentence representation vectors, each sentence weight value characterizing the importance of a sentence for determining the relationship of the target entity pair, and to determine, using the trained entity relationship determination model, a sentence bag representation vector of the target sentence bag based on the sentence representation vectors and sentence weight values corresponding to the respective sentences;

and a prediction unit, configured to determine, using the trained entity relationship determination model, a target relationship type between the two entities included in the target entity pair based on the sentence bag representation vector.

Optionally, the apparatus further includes a model training unit, configured to:

determining a plurality of relationship types to be output by a preset entity relationship determination model, and acquiring a plurality of triples; each triple comprises an entity pair, and one of the plurality of relationship types is labeled in association with the corresponding entity pair;

for each triple, performing the following operation: carrying out sentence matching using the entity pair contained in the triple, to obtain sentence samples containing that entity pair;

constructing corresponding training samples respectively based on the obtained sentence samples, wherein each training sample comprises a plurality of sentence samples containing the same entity pair;

for each obtained training sample, labeling it with the relationship type of the entity pair associated with that training sample;

and carrying out iterative training on the entity relationship determination model to be trained based on the marked training samples until a convergence condition is met, and obtaining the trained entity relationship determination model.

Optionally, the model training unit is further configured to:

performing word segmentation operation on the obtained multiple sentence samples to obtain multiple word segments;

for the multiple relation types, the following operations are respectively executed:

for a relationship type, determining a mutual information coefficient between each of the plurality of word segments and that relationship type, wherein a mutual information coefficient characterizes the importance of a word segment to the relationship type;

selecting, based on the obtained mutual information coefficients, at least one word segment whose mutual information coefficient is greater than a set threshold;

filtering out, from the plurality of sentence samples corresponding to the relationship type, the sentence samples that do not contain any of the at least one word segment;

and constructing the corresponding training samples based on the remaining sentence samples.
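As an illustrative sketch of the filtering step described above (not taken from the patent; the function name, the example data and the simple substring containment test are all hypothetical):

```python
# Illustrative sketch: given per-word-segment mutual information
# coefficients for one relationship type, keep only the sentence
# samples that contain at least one high-MI word segment.

def filter_sentence_samples(sentences, mi_coefficients, threshold):
    """Drop sentences that contain none of the word segments whose
    mutual information coefficient exceeds the set threshold."""
    # Select word segments strongly associated with the relationship type.
    keywords = {w for w, mi in mi_coefficients.items() if mi > threshold}
    # Keep a sentence only if it mentions at least one selected segment.
    return [s for s in sentences if any(w in s for w in keywords)]

sentences = [
    "Zhang San was born in City A.",
    "Zhang San visited City A last week.",
]
mi = {"born": 0.9, "visited": 0.1}
remaining = filter_sentence_samples(sentences, mi, threshold=0.5)
# Only the first sentence mentions a high-MI segment ("born").
```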

Optionally, the model training unit is further configured to:

for each of the plurality of word segments, performing the following operations:

for one word segment, determining a first probability that the word segment occurs;

determining a second probability that the relationship type occurs, and determining a third probability that the word segment occurs when the relationship type occurs;

and determining the mutual information coefficient corresponding to the word segment based on the first probability, the second probability and the third probability.
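One common way to combine the three probabilities named above is pointwise mutual information; the text does not fix the exact formula, so the following is a hedged sketch under that assumption:

```python
import math

def mutual_information(p_word, p_relation, p_word_given_relation):
    """Pointwise mutual information between a word segment and a
    relationship type, from the three probabilities named in the text:
      p_word                -- first probability  P(w)
      p_relation            -- second probability P(r)
      p_word_given_relation -- third probability  P(w | r)
    PMI(w, r) = log( P(w, r) / (P(w) * P(r)) )
    with P(w, r) = P(w | r) * P(r).
    """
    joint = p_word_given_relation * p_relation
    return math.log(joint / (p_word * p_relation))

# A word segment that appears in 1% of all sentences but in 20% of the
# sentences labeled with the relationship type is strongly indicative.
score = mutual_information(p_word=0.01, p_relation=0.05,
                           p_word_given_relation=0.20)
# score = log(0.20 / 0.01) = log(20) ≈ 3.0
```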

Optionally, the sentence encoding unit is specifically configured to:

performing character splitting on the sentence to obtain a plurality of characters included in the sentence;

for the plurality of characters, the following operations are respectively executed:

performing feature coding on one character to obtain a content representation vector, a position representation vector and a source representation vector of the character; the content representation vector is used for representing content corresponding to the character, the position representation vector represents the position of the character in the sentence, and the source representation vector represents the sentence from which the character comes;

and obtaining a character representation vector of the character based on the content representation vector, the position representation vector and the source representation vector.
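The three-vector character encoding above can be sketched as follows. This is illustrative only: the dimensions, the random lookup tables and the element-wise summation used to combine the three vectors are assumptions, not details given in the text:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8           # embedding dimension (illustrative)
vocab_size = 100  # character vocabulary size (illustrative)
max_len = 32      # maximum sentence length
max_sents = 16    # maximum number of sentences in a bag

# Three lookup tables, one per representation named in the text.
content_emb = rng.normal(size=(vocab_size, dim))   # what the character is
position_emb = rng.normal(size=(max_len, dim))     # where it sits in the sentence
source_emb = rng.normal(size=(max_sents, dim))     # which sentence it came from

def character_vector(char_id, position, sentence_id):
    """Character representation obtained from the content, position and
    source representation vectors. Element-wise summation is one common
    way to combine them; the text does not fix the combination operator."""
    return content_emb[char_id] + position_emb[position] + source_emb[sentence_id]

v = character_vector(char_id=42, position=3, sentence_id=1)
```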

Optionally, the sentence encoding unit is specifically configured to:

performing character splitting on the sentence to obtain a plurality of characters included in the sentence;

according to the order in which the characters appear in the sentence, sequentially performing feature coding on each of the characters to obtain the character representation vector corresponding to each character; when one character is feature-coded, feature extraction is performed on that character to obtain a basic representation vector of the character, and the character representation vector of the character is obtained based on the basic representation vector and the character representation vector of the character preceding it.
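The sequential encoding described above, where each character's representation depends on its own basic vector and on the preceding character's representation, has the shape of a plain recurrent cell. The sketch below is illustrative; the tanh cell and the random weight matrices are assumptions, not the architecture specified by the text:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 8
W_in = rng.normal(size=(dim, dim)) * 0.1   # transforms the basic vector
W_rec = rng.normal(size=(dim, dim)) * 0.1  # carries the previous character's vector

def encode_sequence(basic_vectors):
    """Encode characters left to right: each character representation
    vector is computed from that character's basic representation vector
    and the representation vector of the character before it."""
    h = np.zeros(dim)  # there is no character before the first one
    outputs = []
    for x in basic_vectors:
        h = np.tanh(W_in @ x + W_rec @ h)
        outputs.append(h)
    return outputs

chars = [rng.normal(size=dim) for _ in range(5)]
vectors = encode_sequence(chars)
```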

Optionally, the sentence encoding unit is specifically configured to:

performing mean pooling on the obtained character representation vectors to obtain the sentence representation vector; or,

determining a character weight value for each character, and obtaining the sentence representation vector based on the character representation vector and the character weight value corresponding to each character; wherein each character weight value characterizes the importance of one character to the one sentence.
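The two pooling options above can be sketched as follows (illustrative only; normalizing the character weights to sum to one is an assumption):

```python
import numpy as np

def mean_pool(char_vectors):
    """Sentence representation vector as the mean of the character vectors."""
    return np.mean(char_vectors, axis=0)

def weighted_pool(char_vectors, char_weights):
    """Sentence representation vector as a weighted sum of the character
    vectors, each weight reflecting how important that character is to
    the sentence."""
    w = np.asarray(char_weights, dtype=float)
    w = w / w.sum()                      # normalize so the weights sum to 1
    return (w[:, None] * np.asarray(char_vectors)).sum(axis=0)

chars = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
s_mean = mean_pool(chars)                           # -> [2/3, 2/3]
s_weighted = weighted_pool(chars, [1.0, 1.0, 2.0])  # emphasizes the third character
```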

Optionally, the sentence bag encoding unit is specifically configured to:

for each sentence representation vector, the following operations are respectively executed:

for one sentence representation vector, obtaining an intermediate representation vector based on that sentence representation vector and a pre-trained weight matrix included in the entity relationship determination model;

obtaining a vector dot product between the intermediate representation vector and a pre-trained parameter vector included in the entity relationship determination model;

and normalizing the obtained vector dot products to obtain the sentence weight value corresponding to each sentence representation vector.
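The three steps above (intermediate vector, dot product with a parameter vector, normalization over the bag) have the shape of a standard attention mechanism over the sentences of a bag. The sketch below is illustrative: the tanh non-linearity, the softmax normalization and the random stand-ins for the pre-trained weight matrix and parameter vector are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8
W = rng.normal(size=(dim, dim)) * 0.1  # stand-in for the pre-trained weight matrix
q = rng.normal(size=dim)               # stand-in for the pre-trained parameter vector

def sentence_weights(sentence_vectors):
    """Sentence weight values for a bag: intermediate representation,
    dot product with the parameter vector, then normalization (softmax)
    across all sentences of the bag."""
    S = np.asarray(sentence_vectors)
    inter = np.tanh(S @ W.T)      # intermediate representation vectors
    scores = inter @ q            # dot product with the parameter vector
    scores = scores - scores.max()  # numerical stability before exponentiation
    exp = np.exp(scores)
    return exp / exp.sum()        # weights sum to 1 across the bag

def bag_vector(sentence_vectors, weights):
    """Bag representation: sentence vectors weighted by their importance."""
    return (np.asarray(weights)[:, None] * np.asarray(sentence_vectors)).sum(axis=0)

sents = [rng.normal(size=dim) for _ in range(3)]
w = sentence_weights(sents)
b = bag_vector(sents, w)
```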

In one aspect, a computer device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the above methods when executing the computer program.

In one aspect, a computer storage medium is provided having computer program instructions stored thereon that, when executed by a processor, implement the steps of any of the above-described methods.

In one aspect, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps of any of the methods described above.

In the embodiment of the application, when the relationship type of a target entity pair is determined, a plurality of sentences corresponding to the target entity pair are used as the input of the entity relationship determination model. When the model determines the entity relationship based on the target sentence bag, it obtains a sentence representation vector from the characters included in each sentence, determines a sentence weight value for each sentence on that basis, and obtains a bag representation vector from the sentence representation vectors and sentence weight values. Because each sentence weight value characterizes how important a sentence is to the relationship determination of the target entity pair, noisy sentences receive smaller weights, so the noise sentences in the sentence bag are filtered to a certain extent and their influence on the determination of the relationship type is reduced. The accuracy of determining the entity relationship type is thereby improved, and correspondingly the accuracy of the constructed knowledge graph is also improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments or the related art, the drawings needed in the description of the embodiments or the related art are briefly introduced below. It is obvious that the drawings in the following description show only embodiments of the present application, and that those skilled in the art can obtain other drawings from the provided drawings without creative effort.

Fig. 1 is an application scenario diagram provided in an embodiment of the present application;

fig. 2 is a schematic diagram of a training process of an entity relationship determination model according to an embodiment of the present application;

fig. 3 is a schematic structural diagram of an entity relationship determination model according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a pooling process provided by an embodiment of the present application;

FIG. 5 is a schematic diagram illustrating a generation process of training samples according to an embodiment of the present disclosure;

FIG. 6 is a schematic flowchart of a process for constructing training samples according to an embodiment of the present disclosure;

FIG. 7 is a flowchart illustrating feature encoding of each character according to an embodiment of the present application;

fig. 8 is a schematic diagram of obtaining a character representation vector of each character according to an embodiment of the present application;

fig. 9 is another schematic flowchart of feature encoding of each character according to the embodiment of the present application;

fig. 10 is a schematic flowchart of an entity relationship type determination method according to an embodiment of the present application;

fig. 11 is a schematic structural diagram of an entity relationship type determining apparatus according to an embodiment of the present application;

fig. 12 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be described clearly and completely with reference to the accompanying drawings. It is obvious that the described embodiments are only some, and not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application. In the present application, the embodiments and the features of the embodiments may be combined with each other without conflict. Also, although a logical order is shown in the flow diagrams, in some cases the steps shown or described may be performed in a different order.

For the convenience of understanding the technical solutions provided by the embodiments of the present application, some key terms used in the embodiments of the present application are explained first:

Entity pair: an entity pair includes two entities, also called named entities, which refer to names of persons, organizations, places and other entities identified by a name, and may also include numbers, dates, currencies, addresses, and so on.

For example, the sentence "Zhang San was born in City A on September 27, 1961" contains the three entities "Zhang San", "September 27, 1961" and "City A", and every two of these entities can form an entity pair: for example, "Zhang San" and "September 27, 1961" can form an entity pair, and "Zhang San" and "City A" can form an entity pair.

Entity relationship type: characterizes the correlation attribute between the two entities included in an entity pair. In the above example, "Zhang San" and "September 27, 1961" are in a "time of birth" relationship, so the entity relationship type of the entity pair consisting of "Zhang San" and "September 27, 1961" is time of birth; "Zhang San" and "City A" are in a "place of birth" relationship, so the entity relationship type of the entity pair consisting of "Zhang San" and "City A" is place of birth.

Knowledge graph: combines theories and methods from disciplines such as mathematics, graphics, information visualization and information science with methods such as bibliometric citation analysis and co-occurrence analysis, and uses a visual map to display the core structure and overall knowledge framework of a discipline. A knowledge graph is a huge semantic network graph in which each entity is regarded as a node, and the nodes are connected by entity relationship types.

Triple: a triple includes an entity pair and the relationship type between the two entities of the pair, and can be represented as (entity 1, relationship type, entity 2). Following the above example, the two triples (Zhang San, time of birth, September 27, 1961) and (Zhang San, place of birth, City A) can be obtained.

Sentence bag (bag): a sentence bag is made up of a plurality of sentences containing the same entity pair.
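To make the triple and sentence-bag definitions concrete, here is a small illustrative sketch (the example sentences and the simple substring containment test are hypothetical):

```python
from collections import defaultdict

# Triples as (entity 1, relationship type, entity 2), matching the text.
triples = [
    ("Zhang San", "time of birth", "September 27, 1961"),
    ("Zhang San", "place of birth", "City A"),
]

sentences = [
    "Zhang San was born in City A.",
    "Zhang San has lived in City A since birth.",
    "Zhang San was born on September 27, 1961.",
]

# A sentence bag groups all sentences that contain the same entity pair.
bags = defaultdict(list)
for e1, _, e2 in triples:
    for s in sentences:
        if e1 in s and e2 in s:
            bags[(e1, e2)].append(s)

# bags[("Zhang San", "City A")] holds the first two sentences.
```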

Entity relationship determination model: a trained network model for determining the relationship type between entity pairs. The entity relationship determination model is based on Machine Learning (ML); it processes and understands the meaning of sentences and sentence bags, and determines the relationship type of the corresponding entity pair based on the sentence and sentence bag representations. In addition, the entity relationship determination model can identify how important different sentences are for determining the relationship of one entity pair and, on that basis, attenuate the noise data in the sentence bags to a certain extent, thereby improving the accuracy of the finally determined relationship type. After the entity relationship determination model is trained, training parameters such as the pre-trained weight matrix and the pre-trained parameter vector included in the model are obtained and used in the subsequent process of actually determining relationship types.

The embodiments of the present application relate to artificial intelligence and machine learning technology, and are mainly designed based on machine learning within artificial intelligence.

Artificial intelligence is a body of theory, methods, techniques, and application systems that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the capabilities of perception, reasoning, and decision making.

Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, at both the hardware level and the software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, and machine learning/deep learning.

Entity relationship type determination, or entity relationship extraction, is a basic technology in Natural Language Processing (NLP), an important direction in the fields of computer science and artificial intelligence. NLP studies theories and methods that enable effective communication between humans and computers in natural language; it is a science integrating linguistics, computer science, and mathematics. Research in this field involves natural language, that is, the language people use every day, and is therefore closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.

Machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer can simulate or realize human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its performance. Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching-based learning.

An Artificial Neural Network (ANN) abstracts the human brain's neuron network from an information-processing perspective, builds a simple model, and forms different networks according to different connection modes. A neural network is a computational model formed by a large number of interconnected nodes (neurons). Each node represents a specific output function called an activation function; each connection between two nodes carries a weight for the signal passing through it, which is equivalent to the memory of the artificial neural network. The output of the network differs according to its connection mode, weights, and activation functions. The network itself is usually an approximation of some algorithm or function in nature, and may also be the expression of a logic strategy.

The embodiments of the present application use a machine learning method to train an entity relationship determination model, and then apply the trained model to actual entity relationship prediction. Specifically, entity relationship type determination in the embodiments of the present application can be divided into two parts: training and application. In the training part, an artificial neural network model (the entity relationship determination model mentioned below) is trained with machine learning technology on the sentence-bag-based training samples given in the embodiments of the present application, and the model parameters are continuously adjusted by an optimization algorithm until the model converges. In the application part, the trained artificial neural network model represents the sentence bag corresponding to an actual entity pair, and predicts the relationship type of the entity pair based on the obtained sentence bag representation vector. In addition, in the embodiments of the present application the artificial neural network model may be trained online or offline, which is not limited herein; offline training is taken as the example here.

The following briefly introduces the design concept of the embodiments of the present application.

At present, knowledge graphs are widely used in many fields, so constructing a high-quality knowledge graph that meets user requirements is a great challenge. The traditional way to acquire knowledge is to collect encyclopedia information from the Internet; such web page data is mainly structured, so structured knowledge can be obtained by parsing the data and then performing processing such as denoising. However, the amount of knowledge in web page data is limited: compared with the vast corpus data on the Internet, structured data accounts for only a small part, while a large amount of knowledge is contained in plain text, so mining knowledge from text is very necessary. Entity relationship extraction is the work of mining relationships between entities from plain text and constructing triple data to enrich the knowledge graph.

A common relationship extraction method first collects training data, generally generated with human review, and then trains a classification model with that data to classify the entity pairs in each sentence, i.e., identify entities from the sentences and classify any two entities to determine the relationship between them. However, manually constructed training data contains relatively large noise, and how to alleviate the influence of such noise data on relationship type determination is a problem to be considered in entity relationship extraction.

In view of this, an embodiment of the present application provides an entity relationship type determination method. When determining the relationship type of a target entity pair, a plurality of sentences corresponding to the target entity pair are used as inputs to an entity relationship determination model. When the model determines the entity relationship based on the target sentence bag, a sentence representation vector is obtained from the characters of each sentence, a sentence weight value is determined for each sentence accordingly, and the sentence bag representation vector is obtained from the sentence representation vectors and sentence weight values. Each sentence weight value represents how important a sentence is for determining the relationship of the target entity pair, so noise sentences receive smaller weights and are filtered out of the sentence bag to a certain extent. This reduces the influence of noise sentences on relationship type determination, improves the accuracy of entity relationship type determination, and accordingly improves the accuracy of the constructed knowledge graph.

In addition, in the embodiment of the present application, training samples are constructed by remote supervision based on an existing knowledge graph, which solves the problem of scarce training data caused by the difficulty of manual labeling. For the noise introduced by remote supervision, noise sentence samples are filtered by a mutual-information-based filtering method, which reduces the loss of model accuracy caused by noise sentences and improves the accuracy of entity relationship type determination.

After introducing the design concept of the embodiments of the present application, some brief descriptions are given below of application scenarios to which the technical solution can be applied. It should be noted that the application scenarios described below are only used to describe the embodiments of the present application and are not limiting. In a specific implementation, the technical scheme provided by the embodiments of the present application can be flexibly applied according to actual needs.

The scheme provided by the embodiment of the present application may be applied to knowledge-graph-related application scenarios. As shown in fig. 1, the application scenario may include a terminal device 101 and a server 102.

The terminal device 101 may be, for example, a mobile phone, a tablet computer (PAD), a personal computer (PC), a smart television, a smart in-car device, a wearable device, and the like. The terminal device 101 may install applications, for example a chat robot application, a content recommendation application, or a search application.

The server 102 may be a background server corresponding to an application installed on the terminal device 101, for example, an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform, but is not limited thereto.

The server 102 may include one or more processors 1021, a memory 1022, and an I/O interface 1023 for interacting with the terminal, among other things. In addition, the server 102 may be configured with a database 1024, which may store model data, data related to the target entity pairs to be predicted (e.g., sentences, texts, and prediction results), knowledge graph data, and the like. The memory 1022 may further store program instructions of the entity relationship type determination method provided in the embodiments of the present application; when executed by the processor 1021, these instructions implement the steps of the method to determine the relationship type of a target entity pair, add the determined relationship type to the knowledge graph, and extend the knowledge graph so that it can be used by downstream services to implement the corresponding service applications.

Taking the example that the application installed on the terminal device 101 is a chat robot application, the user may initiate conversation content in the application; correspondingly, the server 102 performs semantic recognition on the conversation content with its stored semantic recognition model and recognizes the semantics expressed by the user. For example, if the user's conversation content is "who is the author of article A?", a query such as (article A, author, ?) can be constructed, and the answer retrieved from the knowledge graph and returned to the user.

Of course, the knowledge graph may also be applied to other scenarios; for example, a recommendation model may be constructed in combination with the knowledge context of the knowledge graph to obtain better recommendation results for the user, which is not limited in the embodiment of the present application.

Terminal device 101 and server 102 may be communicatively coupled directly or indirectly through one or more networks 103. The network 103 may be a wired or wireless network; for example, the wireless network may be a mobile cellular network or a Wireless Fidelity (WIFI) network, and may of course be another possible network, which is not limited in this embodiment of the present application.

In the embodiment of the present application, the server 102 may be divided into different sub-servers according to function, for example a sub-server 1 that provides the background service for the application and a sub-server 2 that constructs the knowledge graph. Sub-server 1 and sub-server 2 may be different functional modules of the same physical server, or different physical servers.

In a possible application scenario, in the embodiment of the present application, the model data, the data related to the target entity pairs to be predicted, and the knowledge graph data may be stored with cloud storage technology. A distributed cloud storage system is a storage system that aggregates a large number of storage devices (also called storage nodes) of different types in a network, through application software or application interfaces, using functions such as cluster applications, grid technology, and distributed storage file systems, so that they work together and provide data storage and service access functions externally.

In a possible application scenario, the servers 102 may be deployed in different regions to reduce communication delay, or different servers 102 may each serve the region corresponding to its terminal devices 101 for load balancing. The plurality of servers 102 share data through a blockchain; that is, the servers 102 located in various regions constitute a blockchain-based data sharing system. For example, a terminal device 101 located at site a is communicatively connected to one server 102, and a terminal device 101 located at site b is communicatively connected to another server 102.

Each server 102 in the data sharing system has a corresponding node identifier, and each server 102 may store the node identifiers of the other servers 102 in the system, so that a generated block can be broadcast to the other servers 102 according to their node identifiers. Each server 102 may maintain a node identifier list as shown in Table 1, storing server names and node identifiers. A node identifier may be an Internet Protocol (IP) address or any other information that can identify the node; Table 1 uses IP addresses only as an example.

Server name    Node identification
Node 1         119.115.151.174
Node 2         118.116.189.145
Node N         119.124.789.258

TABLE 1

Of course, the method provided in the embodiment of the present application is not limited to the application scenario shown in fig. 1 and may also be used in other possible application scenarios, which is not limited by the embodiment of the present application. The functions that each device in the application scenario of fig. 1 can implement are described in the following method embodiments and are not detailed here. The subsequent method flow may be executed by the server 102 or the terminal device 101 in fig. 1, or by both; the following description mainly takes execution by the server 102 as an example.

In this embodiment of the present application, the entity relationship type may be determined by a trained entity relationship determination model; therefore, before describing the flow of the entity relationship type determination method, the training process of the entity relationship determination model is described here.

Please refer to fig. 2, which is a schematic diagram of a training process of an entity relationship determination model according to an embodiment of the present application.

Step 201: a plurality of training samples are obtained.

A traditional entity relationship classification model classifies each single sentence independently to judge the relationship of the entity pair in that sentence. This, however, may introduce a noise problem: the relationship type reflected by a particular sentence may be inconsistent with the labeled relationship type, and such noise sentences strongly influence the model during training, making it less accurate. Therefore, the embodiment of the present application adopts a sentence-bag-based training mode: each training sample is a sentence bag, each sentence bag includes a plurality of sentences containing the same entity pair, and the entity pair corresponding to each sentence bag is labeled with a relationship type label.

For example, one possible training sample is the following example:

sentence 1: xiaozhuangzhang and wife Xiao Wu's eye with tear

Sentence 2: apartment with small apartment living under the name of wu xianzi

Sentence 3: small Zhang and Wu to hold a great wedding ceremony in a certain hotel

Sentence 4: registration of marriage with xiao wu in 2017 on 8.1.month.

Each sentence includes the two entities Xiao Zhang and Xiao Wu, and from the meaning of the sentences it can be seen that the relationship between Xiao Zhang and Xiao Wu is that of a couple. Therefore, the sentence bag formed by these sentences can serve as a training sample corresponding to the entity pair (Xiao Zhang, Xiao Wu), and this entity pair is labeled with the relationship type "couple".

It should be noted that the number of sentences in the above example is only one possibility; in practical applications, the number of sentences in each training sample may be set according to the actual situation, which is not limited in the embodiment of the present application.
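The bag construction described above can be sketched as follows. The function and field names are hypothetical, and a real system would match entities on character positions rather than raw substring tests:

```python
# Sketch: group sentences containing the same entity pair into one labeled bag.
# Field names ("entity_pair", "sentences", "label") are illustrative only.
def build_bag(entity_pair, sentences, relation_label):
    """Keep only the sentences that mention both entities; attach the label."""
    e1, e2 = entity_pair
    bag = [s for s in sentences if e1 in s and e2 in s]
    return {"entity_pair": entity_pair, "sentences": bag, "label": relation_label}

bag = build_bag(
    ("Xiao Zhang", "Xiao Wu"),
    [
        "Xiao Zhang and his wife Xiao Wu appeared with tears in their eyes",
        "Xiao Zhang registered his marriage with Xiao Wu on August 1, 2017",
        "An unrelated sentence about someone else",
    ],
    "couple",
)
```

Under remote supervision, the label would come from an existing knowledge-graph triple rather than manual annotation.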

In the embodiment of the present application, after the training samples are obtained, the initial entity relationship determination model may be iteratively trained on them until it satisfies the convergence condition. Since each iteration is similar, one iteration of the training process is taken as the example here.

Step 202: for each sentence in each training sample, obtain a sentence representation vector of the sentence based on the character representation vectors of the characters in the sentence.

In the embodiment of the present application, the entity relationship determination model needs to represent each training sample and then predict the corresponding relationship type based on that representation. Since the encoding process is similar for the sentence representation vectors of the sentences within each sentence bag, and for the bag representation vectors of different sentence bags, the following description mainly takes one sentence bag and one sentence as examples; other sentence bags and sentences are handled in the same way.

Referring to fig. 3, a schematic structural diagram of the entity relationship determination model provided in the embodiment of the present application, the model includes an input layer, a character feature encoding layer, a character feature fusion layer, a sentence feature fusion layer, and an output layer; the use of each layer is introduced one by one below.

Specifically, after each training sample is input to the input layer, for each sentence in the training sample, identifiers are used to mark the entity pair in the sentence.

In one possible implementation, different entities in the entity pair may be identified by different identifiers inserted into the sentence to mark where each entity is located. Referring to fig. 3, the two entities in the entity pair can be identified by "<e1>" and "<e2>" respectively. For the entity pair "Xiao Zhang" and "Xiao Wu", the entity "Xiao Zhang" can be marked by inserting "<e1>" before its starting position and "</e1>" after its ending position, and the entity "Xiao Wu" can be marked by inserting "<e2>" before its starting position and "</e2>" after its ending position.

In another possible embodiment, the entity pair of a sentence may be identified by adding an additional label to the sentence, such as a label at the beginning or end of the sentence, indicating the two entities included in the entity pair and their positions in the sentence.
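A minimal sketch of the first marking scheme (inserting "<e1>"/"<e2>" markers around the entities) might look like this, assuming each entity occurs once and matching on raw substrings rather than true character positions:

```python
# Sketch only: wrap the two entities of a sentence in <e1>/<e2> markers.
# Assumes each entity string appears once; a real pipeline would use
# character offsets from the entity linker instead of str.replace.
def mark_entities(sentence, e1, e2):
    """Insert <e1>...</e1> and <e2>...</e2> around the entity pair."""
    sentence = sentence.replace(e1, "<e1>" + e1 + "</e1>", 1)
    sentence = sentence.replace(e2, "<e2>" + e2 + "</e2>", 1)
    return sentence
```

The marked sentence is then split into characters and fed to the character feature encoding layer.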

In the embodiment of the present application, since a sentence is composed of individual characters, in order to obtain the sentence representation vector, the sentence may first be split into individual characters (tokens), a character representation vector may be obtained for each character, and the sentence representation vector may then be obtained from these character representation vectors.

Specifically, the character representation vectors can be obtained by the character feature encoding layer of the entity relationship determination model. The character feature encoding layer includes a character feature encoding module, and the encoding modules for the sentences can share network parameters. The network parameters can be assigned randomly at the beginning, assigned by an initialization scheme, or transplanted from other pre-trained models.

In this embodiment of the present application, the character feature encoding module may adopt any possible encoding model, for example a Bidirectional Encoder Representations from Transformers (BERT) model, a Gated Recurrent Unit (GRU) model, or a Long Short-Term Memory (LSTM) network; of course, other possible models may also be adopted, which is not limited in this embodiment of the present application.

In the embodiment of the present application, referring to fig. 3, for each sentence the character encoding module obtains the character representation vector of each character in the sentence, and the sentence representation vector of the sentence can then be obtained from these character representation vectors.

The process of obtaining sentence representation vectors from the character representation vectors can be realized by the character feature fusion layer of the entity relationship determination model. The character feature fusion layer includes a character feature fusion module, and the fusion modules for the sentences can also share network parameters; the network parameters can be assigned randomly at the beginning, assigned by an initialization scheme, or transplanted from other pre-trained models.

Specifically, the character feature fusion module may obtain the sentence representation vector in a variety of ways, including but not limited to the following:

(1) Mean pooling (mean-pooling)

Specifically, mean pooling is performed over the character representation vectors of a sentence to obtain the sentence representation vector.

In mean pooling, the values at each position of the character representation vectors are averaged, and each mean is the feature value at the corresponding position of the sentence representation vector. As shown in fig. 4, for a sentence containing m characters, the values at the 1st position of the m character representation vectors are added and averaged to obtain the feature value at the 1st position of the sentence representation vector, and the feature values at the 2nd position, the 3rd position, and so on are obtained in the same way, yielding the sentence representation vector.

In practical applications, the sentence representation vector may also be obtained by another pooling process such as max pooling.
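The mean pooling described above can be sketched in a few lines of pure Python (the function name is illustrative; a real model would do this with tensor operations):

```python
# Sketch: feature j of the sentence vector is the mean of feature j
# over all m character representation vectors, as described above.
def mean_pool(char_vectors):
    """Mean-pool character representation vectors into one sentence vector."""
    m = len(char_vectors)
    dim = len(char_vectors[0])
    return [sum(v[j] for v in char_vectors) / m for j in range(dim)]
```

Max pooling would simply replace the mean at each position with the maximum.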

(2) Weighted summation

Considering that different characters differ in importance for understanding the semantics of a sentence, the characters can be treated differently according to their importance when representing the sentence. Therefore, after the character representation vector of each character is obtained, a character weight value can be determined for each character, and the sentence representation vector can then be obtained from the character representation vectors and character weight values, where each character weight value represents the importance of one character to the sentence.

In one possible approach, a self-attention mechanism may be employed to determine the character weight values for individual characters.

Specifically, the character fusion module may include a weight vector matrix and a score conversion vector. For character A, a weight representation vector is obtained from the character representation vector of character A and the weight vector matrix; a vector dot product of the transpose of the score conversion vector and the weight representation vector then gives the weight score of character A. The weight scores of the other characters in the sentence are obtained in the same way, and the weight scores of all characters are normalized to obtain the character weight values. The weight vector matrix and the score conversion vector are trainable parameters learned during model training: the weight vector matrix re-models the character representation vector, and the score conversion vector converts the weight representation vector into a floating point number.
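A pure-Python sketch of this weighted-summation fusion follows; here `W` plays the role of the weight vector matrix and `v` the score conversion vector, both of which would be trainable parameters in the real model rather than fixed inputs:

```python
import math

def matvec(W, x):
    """Multiply matrix W by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Normalize weight scores into weights that sum to 1."""
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(char_vectors, W, v):
    """Score each character with v . (W x), normalize the scores,
    and take the weighted sum of the character vectors."""
    scores = [dot(v, matvec(W, x)) for x in char_vectors]
    weights = softmax(scores)
    dim = len(char_vectors[0])
    return [sum(w * x[j] for w, x in zip(weights, char_vectors)) for j in range(dim)]
```

The same self-attention pattern reappears at the sentence level in step 203.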

(3) Convolution

When convolution is used, the character representation vectors of the characters in a sentence can be concatenated into a sentence representation matrix, and the character fusion module performs feature extraction on the sentence representation matrix to obtain the sentence representation vector.

The character fusion module can include convolution kernels of various sizes, so features of receptive fields of different sizes in the sentence representation matrix can be extracted.
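The convolution fusion can be sketched as follows; each kernel is a list of k weight vectors (one per character position), and max pooling reduces each kernel's responses to one feature. This is an illustrative single-channel version, not the model's actual implementation:

```python
# Sketch: slide kernels of several sizes over the stacked character vectors
# (the sentence representation matrix) and max-pool each kernel's responses,
# so kernels of different sizes cover receptive fields of different sizes.
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conv_features(sent_matrix, kernels):
    """Return one max-pooled feature per kernel."""
    feats = []
    for kernel in kernels:          # kernel: list of k weight vectors
        k = len(kernel)
        responses = [
            sum(_dot(kernel[i], sent_matrix[pos + i]) for i in range(k))
            for pos in range(len(sent_matrix) - k + 1)
        ]
        feats.append(max(responses))
    return feats
```

Concatenating the features of all kernels yields the sentence representation vector.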

Step 203: for each training sample, determine the sentence weight value of each sentence from the obtained sentence representation vectors, where each sentence weight value represents how important the sentence is for determining the relationship of the entity pair.

In the embodiment of the present application, considering the existence of noise sentences, in order to improve the accuracy of the final sentence bag representation vector of the whole sentence bag, noise sentences may be given smaller weights so that only a small part of their features enters the sentence bag representation vector. To this end, a sentence weight value must first be determined for each sentence in the sentence bag.

In the embodiment of the present application, the sentence weight value of each sentence may be obtained with a self-attention mechanism.

In one possible implementation, the following self-attention mechanism may be employed to calculate the sentence weight value for each sentence.

Specifically, the sentence fusion module may include a pre-training weight matrix and a pre-training parameter vector. The attention weight parameters that training is expected to finally learn, namely the pre-training weight matrix and pre-training parameter vector, can distinguish noise sentences from normal sentences, so that after processing, the weight difference between noise sentences and normal sentences becomes larger.

Here, one sentence bag is taken as an example to describe how the sentence weight values are obtained. For each sentence representation vector in the sentence bag, an intermediate representation vector is obtained from the sentence representation vector and the pre-training weight matrix, and the vector dot product between the intermediate representation vector and the pre-training parameter vector included in the entity relationship determination model is computed. The pre-training weight matrix and pre-training parameter vector are trainable parameters learned during model training: the pre-training weight matrix re-models the sentence representation vector, and the pre-training parameter vector converts the intermediate representation vector into a floating point number.

Based on the above, a vector dot product is obtained for each sentence, which represents the weight score of the sentence; the vector dot products are then normalized to obtain the sentence weight value corresponding to each sentence representation vector.

The sentence weight value calculation for each sentence in the sentence bag may be expressed as follows:

s_i = v^T tanh(W x_i)

A=softmax(S)

wherein W represents the pre-training weight matrix, v represents the pre-training parameter vector, s_i and x_i respectively represent the vector dot product and the sentence representation vector of the i-th sentence, S represents the set of vector dot products of all sentences in the sentence bag, and A represents the set of sentence weight values of all sentences in the sentence bag.
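The two formulas can be sketched directly in pure Python; W and v stand for the trainable pre-training weight matrix and parameter vector, which are fixed inputs here only for illustration:

```python
import math

def sentence_weights(X, W, v):
    """Compute s_i = v^T tanh(W x_i) for each sentence vector x_i in X,
    then A = softmax(S), matching the two formulas above."""
    def matvec(M, x):
        return [sum(m * xj for m, xj in zip(row, x)) for row in M]
    S = [sum(vk * math.tanh(hk) for vk, hk in zip(v, matvec(W, x))) for x in X]
    mx = max(S)                      # stabilized softmax
    exps = [math.exp(s - mx) for s in S]
    total = sum(exps)
    return [e / total for e in exps]
```

Sentences whose score s_i is low (the intended behavior for noise sentences) receive correspondingly small weights after the softmax.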

Step 204: and determining a sentence bag representation vector of the target sentence bag based on the sentence representation vector and the sentence weight value respectively corresponding to each sentence.

In the embodiment of the present application, the sentence bag representation vector can be expressed as follows:

Y=AX

where X represents the set of sentence representation vectors of all sentences in the sentence bag, and Y represents the sentence bag representation vector. During calculation, the sentence representation vectors of the sentences are weighted and summed with their respective sentence weight values, yielding the sentence bag representation vector of the sentence bag.
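The attention weighting and the weighted sum above can be sketched in pure Python. This is a minimal toy illustration, not the model's actual implementation; `W`, `v`, and the sentence vectors stand in for trained parameters and encoder outputs.

```python
import math

def sentence_bag_vector(sentence_vecs, W, v):
    """Compute s_i = v^T tanh(W x_i), A = softmax(S), then Y = A X."""
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]
    # Vector dot product (weight score) for each sentence.
    scores = [sum(vj * math.tanh(h) for vj, h in zip(v, matvec(W, x)))
              for x in sentence_vecs]
    # Softmax normalization yields the sentence weight values A.
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    weights = [e / sum(exps) for e in exps]
    # Weighted sum of sentence vectors yields the bag representation Y.
    dim = len(sentence_vecs[0])
    bag = [sum(w * x[d] for w, x in zip(weights, sentence_vecs))
           for d in range(dim)]
    return weights, bag
```

Because the weights are softmax-normalized, they always sum to 1, so a noise sentence with a low score contributes proportionally little to the bag vector.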

Step 205: and determining the prediction relation type of the entity pair corresponding to each training sample based on the sentence bag expression vector corresponding to each training sample.

Specifically, the relationship type may be predicted by a multi-classifier. For example, a fully connected layer may map the sentence bag representation vector to each relationship type, yielding a probability value of the sentence bag representation vector for each predicted relationship type; the predicted relationship type is then determined according to these probability values.

Of course, other similar multi-classifiers may also be used, such as a Support Vector Machine (SVM) or Softmax, which is not limited in this embodiment of the present application.
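As an illustration of the fully-connected-layer-plus-softmax option, the sketch below maps a bag vector to relation probabilities. The weight matrix, bias, and relation names are hypothetical toy values, not the model's real parameters.

```python
import math

def predict_relation(bag_vec, W_cls, b_cls, relation_names):
    """Linear layer + softmax over relationship types; returns the argmax."""
    logits = [sum(w * x for w, x in zip(row, bag_vec)) + b
              for row, b in zip(W_cls, b_cls)]
    mx = max(logits)
    exps = [math.exp(l - mx) for l in logits]
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return relation_names[best], probs
```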

Step 206: and determining a model loss value of the entity relationship determination model according to the prediction relationship type and the labeling relationship type of each training sample.

In the embodiments of the present application, each training sample is labeled with a relationship type in advance, so a model loss (loss) value of the entity relationship determination model can be determined from the predicted relationship type and the labeled relationship type of each training sample. The model loss value may be obtained with a Cross Entropy Loss function, a Mean Squared Error (MSE) Loss function, or a Mean Absolute Error (MAE) Loss function, or with other possible loss functions, which is not limited in the embodiments of the present application.
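A minimal sketch of the cross-entropy option over predicted probability vectors; the function names are illustrative, and a real implementation would typically work on logits for numerical stability.

```python
import math

def cross_entropy(probs, label_idx, eps=1e-12):
    """Negative log-probability of the labeled relationship type."""
    return -math.log(max(probs[label_idx], eps))

def batch_loss(batch_probs, labels):
    """Average cross-entropy loss over a batch of training samples."""
    return sum(cross_entropy(p, y) for p, y in zip(batch_probs, labels)) / len(labels)
```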

Step 207: determining whether the entity relationship determination model satisfies a convergence condition.

The convergence condition may include any one of the following conditions:

(1) The model loss value is less than the set loss threshold. The model loss value represents the degree of difference between the predicted relationship type and the labeled relationship type of each training sample. When the model loss value is smaller than the set loss threshold, the difference between the predicted and labeled relationship types is small enough (or zero), indicating that the accuracy of the entity relationship determination model is high; the entity relationship determination model can therefore be considered to satisfy the convergence condition.

(2) The iterative training times of the model reach a set upper limit value.

Step 208: and when the entity relationship determination model is determined not to meet the convergence condition, adjusting model parameters of the entity relationship determination model according to the model loss value, and skipping to the step 202 to continue the next training process.

Step 209: and when the entity relation determination model is determined to meet the convergence condition, finishing the training.

When either of the above convergence conditions is satisfied, the training process of the entity relationship determination model ends, i.e., the trained entity relationship determination model is obtained; it can then be used in the subsequent actual relationship type determination process.
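The training loop with the two convergence conditions (loss threshold, iteration cap) from steps 202-209 might be skeletonized as follows; `model_step` is a hypothetical callback standing in for one forward pass, loss computation, and parameter update.

```python
def train(model_step, loss_threshold=0.01, max_iters=100):
    """Iterate until the loss drops below the threshold or the cap is hit."""
    loss = float("inf")
    for it in range(1, max_iters + 1):
        loss = model_step()          # one training iteration (steps 202-206)
        if loss < loss_threshold:    # convergence condition (1)
            break
    return loss, it                  # loop exhaustion = condition (2)
```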

In the embodiment of the application, a sample acquisition mode based on remote supervision is adopted in consideration of huge workload of manually labeling the relationship types of the entity pairs in each sentence one by one. Referring to fig. 5, a schematic diagram of a generation process of the training samples is shown.

Step 2011: Determine the multiple relationship types preset as outputs of the entity relationship determination model, and acquire multiple triples.

Each triple comprises an entity pair, and one relationship type in a plurality of relationship types is labeled in association with the corresponding entity pair.

Specifically, before training, the multiple relationship types that the model outputs are preset for the entity relationship determination model to be trained, i.e., which relationship types the model is able to predict; based on these preset relationship types, the corresponding triples can be obtained from existing data in the knowledge base.

For example, if the relationship types preset for the entity relationship determination model to be trained include "couple", then triples already labeled in association with the relationship type "couple" may be obtained from the knowledge base, such as (Xiao Zhang, couple, Xiao Wu) and (Xiao Qi, couple, Xiao Ba).

Step 2012: and for each triple, carrying out sentence matching on the entity pairs contained in the triple, and obtaining a sentence sample containing the entity pairs in the triple.

In the embodiments of the present application, the acquired triples are used to back-label acquired unsupervised texts; since a large amount of unsupervised text exists on the network, this back-labeling turns the large amount of unsupervised text into supervised data.

Specifically, for the "couple" attribute mentioned above, the entity pair (Xiao Zhang, Xiao Wu) is used to match sentences; if both entities appear in a sentence, the semantics of the sentence are considered likely to express the information of the triple (Xiao Zhang, couple, Xiao Wu). For example, the back-labeled sentences are as follows:

Sentence 1: Xiao Zhang and his wife Xiao Wu watched with tears in their eyes

Sentence 2: Xiao Zhang lives in an apartment under Xiao Wu's name

From the two sentences above, it can be seen that relatively large noise actually exists: sentence 1 expresses the "couple" attribute, but sentence 2 does not express it at all, i.e., sentence 2 is a noise sentence.
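The back-labeling step can be sketched as a substring match between each triple's entity pair and an unsupervised corpus. This is a toy illustration with made-up names; a production system would use proper entity linking rather than raw substring matching.

```python
def back_label(triples, corpus):
    """For each (head, relation, tail), keep sentences containing both entities."""
    bags = {}
    for head, relation, tail in triples:
        matched = [s for s in corpus if head in s and tail in s]
        if matched:
            bags[(head, tail)] = {"relation": relation, "sentences": matched}
    return bags
```

Note that this matching is exactly what introduces noise: a sentence can mention both entities without expressing the relation.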

Step 2013: and constructing corresponding training samples respectively based on the obtained sentence samples, wherein each training sample comprises a plurality of sentence samples containing the same entity pair.

In the embodiment of the present application, a bag-of-sentences-based model training is adopted, so that each training sample contains a plurality of sentence samples, and the plurality of sentence samples have the same entity pair, such as the bag-of-sentences sample mentioned in the introduction of the embodiment section shown in fig. 2.

Specifically, the number of sentence samples included in a sentence bag can be set, and a sentence bag, i.e., a training sample, is then formed by randomly selecting the specified number of sentence samples containing the same entity pair. Of course, in practical applications, the number of sentence samples included in each training sample may also differ, in which case each sentence bag may still be formed by random selection; this is not limited in the embodiments of the present application.

Step 2014: and respectively labeling the corresponding relation types of the entity pairs associated with the corresponding training samples aiming at the obtained training samples.

The labeled relationship type of each training sample is the relationship type of the shared entity pair in that training sample. Continuing the "couple" example above, after a training sample is constructed from sentences such as "Xiao Zhang and his wife Xiao Wu watched with tears in their eyes" and "Xiao Zhang lives in an apartment under Xiao Wu's name", since the triple (Xiao Zhang, couple, Xiao Wu) indicates that Xiao Zhang and Xiao Wu are a couple, the relationship type can be labeled "couple".

And then, performing iterative training on the entity relationship determination model to be trained based on the marked multiple training samples.

In the embodiments of the present application, as seen in the above process, a large number of noise sentences still exist among the back-labeled sentences. Therefore, before training samples are constructed, certain means can be adopted to filter out noise sentences, and training samples are constructed from the sentence samples remaining after filtering. Referring to fig. 6, a schematic flow chart of constructing training samples is shown.

Step 20131: and performing word segmentation operation on the obtained multiple sentence samples to obtain multiple word segments.

The word segmentation operation can be performed with word segmentation tools such as the jieba word segmentation tool, StandardAnalyzer, or ChineseAnalyzer.

For the sentence "Xiao Zhang lives in an apartment under Xiao Wu's name", the sentence can be divided into participles such as "Xiao Zhang", "lives in", "Xiao Wu", "name", and "apartment".

Step 20132: and determining mutual information coefficients of each participle in the participles and the relation type aiming at each relation type, wherein one mutual information coefficient is used for representing the importance degree of one participle to one relation type.

In the embodiments of the present application, considering that some words occur with very high probability under a given relationship type, those words can be considered closely related to that relationship type; for example, under the "couple" attribute, words such as "wife" and "husband" appear with very high probability. Sentence samples can therefore be screened with such important words, alleviating the noise problem caused by the remote supervision approach.

Specifically, taking relationship type A as an example, when determining the mutual information coefficients of the participles with respect to relationship type A, a first probability of each participle occurring, a second probability of relationship type A occurring, and a third probability of each participle occurring when relationship type A exists are determined; based on the first, second, and third probabilities, the mutual information coefficient corresponding to each participle is then determined.

The mutual information coefficient may be calculated as follows:

MI(e_t, e_c) = P(U = e_t, C = e_c) · log [ P(U = e_t, C = e_c) / (P(U = e_t) · P(C = e_c)) ]

where U denotes a participle and C denotes a relationship type; P(U = e_t) represents the first probability of participle e_t occurring; P(C = e_c) represents the second probability of relationship type e_c occurring; and P(U = e_t, C = e_c) represents the third probability of participle e_t occurring when relationship type e_c exists.
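The exact formula is garbled in the source; assuming the coefficient takes a pointwise-mutual-information form over the three probabilities described above, it could be computed as:

```python
import math

def mutual_information(p_joint, p_word, p_rel):
    """MI-style coefficient from joint and marginal probabilities (assumed form).

    p_joint: probability of the participle occurring with the relationship type
    p_word:  first probability (participle occurring)
    p_rel:   second probability (relationship type occurring)
    """
    if p_joint == 0:
        return 0.0
    return p_joint * math.log(p_joint / (p_word * p_rel))
```

When the participle and the relationship type are independent (p_joint = p_word * p_rel), the coefficient is 0; a positive value indicates the participle co-occurs with the relationship type more often than chance.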

Step 20133: and selecting at least one word segmentation corresponding to the mutual information coefficient larger than the set threshold value based on the obtained multiple mutual information coefficients.

Similarly, taking relationship type A as an example, after the mutual information coefficients of the participles with respect to relationship type A are obtained, the participles can be sorted by mutual information coefficient in descending order, and at least one participle whose mutual information coefficient is greater than the set threshold can be selected.

The set threshold may be a preset fixed threshold; alternatively, the number of participles to select may be preset, in which case that number of participles is taken from the sorted sequence.

Step 20134: and screening out sentence samples which do not contain any participle in at least one participle from a plurality of sentence samples corresponding to each relationship type.

Similarly, taking relationship type A as an example: the at least one participle selected for relationship type A consists of participles of high importance to relationship type A, which occur with high probability whenever relationship type A exists. The multiple sentence samples corresponding to relationship type A can therefore be filtered with these participles: sentence samples containing any of the at least one participle are retained, and sentence samples containing none of them are screened out.
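The retain/screen-out rule reduces to a keyword filter; a minimal sketch (the keyword list is illustrative):

```python
def filter_noise(sentences, keywords):
    """Keep only sentence samples containing at least one high-MI participle."""
    return [s for s in sentences if any(k in s for k in keywords)]
```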

Step 20135: and respectively constructing corresponding training samples based on the plurality of residual sentence samples.

Constructing training samples from the sentences remaining after screening filters out some noise sentences to a certain extent, improving the accuracy of the training samples, correspondingly improving the accuracy of the trained entity relationship determination model, and further improving the accuracy of the finally predicted entity relationship types.

Next, the acquisition process of the character representation vector is described.

In one possible embodiment, the process of obtaining the character representation vector may be performed as follows. Here, a sentence, such as sentence a, is taken as an example, and the process of obtaining the vector representing each character in the sentence is described. Referring to fig. 7, a flow chart of feature coding of each character of the sentence a is shown.

S2021 a: and performing character splitting on the sentence A to obtain a plurality of characters included in the sentence A.

After the identifier of the entity is inserted into the sentence a, the sentence a may be split into individual characters, which may also include the inserted identifier.

S2022 a: and respectively carrying out feature coding on each character to obtain a content representation vector, a position representation vector and a source representation vector of each character.

The content representation vector represents the content corresponding to a character, and can be obtained based on the meaning that each character expresses, or by querying an existing lexicon that maps a character to a vector. The position representation vector represents the position of a character in the sentence, capturing the relative positional relationship between the character and the other characters in sentence A; the position can be represented by the character's serial number in sentence A or by the word vectors before and after the character. The source representation vector characterizes the sentence from which a character is derived, i.e., sentence A.

S2023 a: based on the content representation vector, the position representation vector and the source representation vector, a character representation vector of a character is obtained.

Referring to fig. 8, a schematic diagram of character representation vector acquisition for each character is shown. After the content representation vector, position representation vector, and source representation vector of each character are obtained, the character representation vector of the corresponding character may be obtained from these three vectors; the character representation vector is a vector that simultaneously represents the information of the content, position, and source representation vectors.

Specifically, the content representation vector, the position representation vector, and the source representation vector may be superimposed (summed element-wise) to obtain the character representation vector of the corresponding character. As shown in fig. 8, taking character 1 as an example, the content representation vector Ec1, position representation vector Eb1, and source representation vector Ea1 of character 1 are superimposed to obtain the character representation vector E1 of character 1.

Specifically, the content representation vector, the position representation vector, and the source representation vector may be further subjected to a splicing process to obtain a character representation vector of a corresponding character, for example, the position representation vector Eb1 of the character 1 may be spliced to the rear of the content representation vector Ec1, and the source representation vector Ea1 may be spliced to the rear of the position representation vector Eb1 to obtain the character representation vector E1 of the character 1.

Specifically, the content representation vector, the position representation vector, and the source representation vector of each character may be pooled to obtain the character representation vector of the corresponding character. Also taking character 1 as an example, when the maximum pooling is performed, the values of the content expression vector Ec1, the position expression vector Eb1, and the source expression vector Ea1 of character 1 at the same position are maximized, thereby obtaining a character expression vector E1 of character 1.
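The three fusion options just described (superposition, concatenation, max pooling) can be sketched in one helper; the vector lengths are toy values:

```python
def fuse(content, position, source, mode="sum"):
    """Combine the three per-character vectors into one character representation."""
    if mode == "sum":       # element-wise superposition
        return [c + p + s for c, p, s in zip(content, position, source)]
    if mode == "concat":    # splice position and source behind content
        return content + position + source
    if mode == "maxpool":   # element-wise maximum across the three vectors
        return [max(c, p, s) for c, p, s in zip(content, position, source)]
    raise ValueError(mode)
```

Superposition and max pooling keep the original dimensionality, while concatenation triples it; which to use is a design choice the embodiments leave open.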

In the embodiment of the application, in order to obtain the sentence expression vector more accurately, information of other characters can be blended into each character expression vector. For example, after the character representation vector is obtained through the above-mentioned process, information of other characters can be merged by using an attention mechanism.

Specifically, the character encoding layer may further include at least one attention vector matrix, so that at least one attention vector corresponding to each character is obtained according to each character representation vector and the corresponding attention vector matrix in sentence a. For example, the at least one attention vector matrix may include a request (query) vector matrix, a key (key) vector matrix, and a value (value) vector matrix, and accordingly, the at least one attention vector includes a query vector, a key vector, and a value vector.

Furthermore, an attention weight vector of each character may be obtained based on the at least one attention vector of each character, where each value in a character's attention weight vector represents the attention weight of one character with respect to that character. For example, if sentence A contains 4 characters, then for character 1, the attention weight vector of character 1 contains 4 values, each representing the attention weight of one character contained in sentence A with respect to character 1.

Specifically, the attention weight of the character 2 to the character 1 can be obtained through the similarity between the key vector of the character 2 and the query vector of the character 1, and similarly, the attention weight of the character 1 to the character 1 can be obtained through the similarity between the key vector of the character 1 and the query vector of the character 1.

Finally, the final character representation vector of a character is obtained by performing weighted summation with the corresponding attention vector according to each attention weight in the attention weight vector of the character, for example, the character representation vector corresponding to the character 1 is obtained by performing weighted summation with each value in the attention weight vector of the character 1 and the corresponding value vector.

Further, the finally obtained character representation vector is used for participating in the synthesis of the sentence representation vector.
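The query/key/value refinement described above corresponds to standard scaled dot-product self-attention; a minimal pure-Python sketch with hypothetical projection matrices:

```python
import math

def self_attention(X, Wq, Wk, Wv):
    """Refine each character vector with attention over all characters."""
    def matmul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        mx = max(scores)
        exps = [math.exp(s - mx) for s in scores]
        w = [e / sum(exps) for e in exps]   # attention weight vector
        # Weighted sum of value vectors gives the refined character vector.
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out
```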

In another possible embodiment, the process of obtaining the character representation vector may also be performed as follows. Here, also taking sentence a as an example, see fig. 9, which is another flow diagram for feature coding of each character of sentence a.

S2021 b: and performing character splitting on the sentence A to obtain a plurality of characters included in the sentence A.

S2022 b: and performing feature coding on the first character in the sentence A to obtain a character representation vector corresponding to the first character.

S2023 b: and extracting the characteristics of the current character to be processed in the sentence A to obtain a basic expression vector of the character.

In the embodiments of the present application, feature coding is performed on each character sequentially, in the order of the characters in the sentence, obtaining the character representation vector of each character in turn. Thus, if the previous character is the first character, the current character to be processed is the second character; similarly, if the previous character is the second character, the current character to be processed is the third character, and so on.

The basic expression vector is used for representing the content corresponding to one character, and can be obtained based on the meaning of each character representation, or can be obtained by querying an existing word stock, wherein the word stock is a mapping word stock between one character and one vector.

S2024 b: based on the obtained base representation vector and the character representation vector of the previous character, a character representation vector of the character is obtained.

For example, if the current character is the second character, then the character representation vector for the second character may be obtained based on the base representation vector obtained for the current character and the character representation vector for the first character.

S2025 b: and judging whether the character representation vector acquisition of all the characters is finished.

If the judgment result of step S2025b is yes, i.e., the character representation vectors of all the characters have been acquired, the process ends; otherwise, the process jumps back to step S2023b, i.e., the character representation vector of the next character after the current one is acquired.
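Steps S2023b to S2025b amount to a sequential (RNN-style) pass over the characters; the sketch below assumes a simple tanh cell, which is one plausible realization of fusing each base vector with the previous character's representation.

```python
import math

def encode_sequence(base_vecs, W_in, W_rec):
    """Sequentially fuse each character's base vector with the previous
    character's representation vector."""
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]
    reps = []
    prev = [0.0] * len(W_rec)          # the first character has no predecessor
    for x in base_vecs:
        h = [math.tanh(a + b)
             for a, b in zip(matvec(W_in, x), matvec(W_rec, prev))]
        reps.append(h)                  # character representation vector
        prev = h                        # carried into the next character
    return reps
```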

In the embodiment of the present application, after the training of the entity relationship determination model is finished, the trained entity relationship determination model may be used to participate in the prediction of the actual relationship type. Please refer to fig. 10, which is a flowchart illustrating a method for determining an entity relationship type according to an embodiment of the present application, where the method includes the following steps.

Step 1001: and acquiring a target sentence bag associated with the target entity pair, wherein the target sentence bag comprises a plurality of sentences, and each sentence comprises the target entity pair.

The target entity pair is an entity pair whose relationship type is to be determined. A plurality of sentences containing the target entity pair can be collected; similarly, certain noise sentences can be screened out in advance using mutual information, and the target sentence bag associated with the target entity pair is then constructed based on the remaining sentences.

For example, for the entity pair "Zhang San" and "April 2, 1956", a plurality of sentences containing the two entities may be collected, for example:

Sentence 1: Zhang San was born on April 2, 1956

Sentence 2: Zhang San's birthday is April 2, 1956

Sentence 3: Zhang San was born in a hospital on April 2, 1956

The above sentences are only a partial example; further, a target sentence bag may be constructed based on the collected sentences for determining the relationship type between "Zhang San" and "April 2, 1956".

Step 1002: and inputting the target sentence bag into the trained entity relationship determination model, and aiming at each sentence in the target sentence bag, obtaining a sentence expression vector of one sentence based on the character expression vector corresponding to each character in the sentence.

Step 1003: and respectively determining sentence weight values of corresponding sentences based on the obtained sentence expression vectors by adopting a trained entity relationship determination model, wherein each sentence weight value represents the importance degree of a sentence for determining the relationship of the target entity pair.

Step 1004: and determining a sentence bag expression vector of the target sentence bag based on the sentence expression vector and the sentence weight value respectively corresponding to each sentence by adopting the trained entity relation determination model.

Step 1005: and determining a target relation type between the two entities included in the target entity pair based on the sentence bag representation vector by adopting the trained entity relation determination model.

The processes of steps 1002 to 1005 are similar to the corresponding description in model training, so reference may be made to the training description; details are not repeated here.

Taking the entity pair "Zhang San" and "April 2, 1956" as an example: when the trained entity relationship determination model is accurate enough, it may output the predicted relationship type "birth time" for this pair, yielding a new triple (Zhang San, birth time, April 2, 1956) that can be added to the knowledge graph for downstream application scenarios. For example, when a user later queries Zhang San's birth time, the triple can be matched and the answer "April 2, 1956" successfully output, implementing accurate question answering for question search.
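The downstream question-answering use of the new triple reduces to a lookup over the stored triples; a toy sketch:

```python
def answer(kg_triples, head, relation):
    """Return the tail of the first triple matching the queried head and relation."""
    for h, r, t in kg_triples:
        if h == head and r == relation:
            return t
    return None
```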

In summary, in the embodiments of the present application, to address the difficulty of manual labeling and the scarcity of training data, a large amount of training data is constructed via knowledge-graph-based remote supervision. To address the noise problem of training data constructed via remote supervision, on one hand the noise is alleviated through mutual information, and on the other hand the noise in the training data is further effectively alleviated by an entity relationship determination model based on attention and sentence-bag prediction.

Referring to fig. 11, based on the same inventive concept, an embodiment of the present application further provides an entity relationship type determining apparatus 110, including:

an obtaining unit 1101, configured to obtain a target sentence bag associated with the target entity pair, and input the target sentence bag to the trained entity relationship determination model; the target sentence bag comprises a plurality of sentences, and each sentence comprises a target entity pair;

a sentence encoding unit 1102, configured to use the trained entity relationship determination model to perform the following operations for each sentence in the target sentence bag: aiming at a sentence, obtaining a sentence expression vector of the sentence based on the character expression vector corresponding to each character in the sentence;

a sentence bag encoding unit 1103, configured to determine, using the trained entity relationship determination model, sentence weight values of corresponding sentences, respectively, based on the obtained respective sentence expression vectors, each sentence weight value representing an importance degree of a sentence determined for a relationship of the target entity pair, and determine, using the trained entity relationship determination model, a sentence bag expression vector of the target sentence bag, based on the sentence expression vector and the sentence weight value corresponding to each sentence, respectively;

and a prediction unit 1104, configured to determine a target relationship type between two entities included in the target entity pair based on the bag of sentences representation vector by using the trained entity relationship determination model.

Optionally, the apparatus further comprises a model training unit 1105 configured to:

determining the multiple relationship types preset as outputs of the entity relationship determination model, and acquiring multiple triples; each triple comprises an entity pair, and one of the multiple relationship types is labeled in association with the corresponding entity pair;

for a plurality of triples, the following operations are respectively executed: aiming at a triple, carrying out sentence matching by adopting an entity pair contained in the triple to obtain a sentence sample containing the entity pair contained in the triple;

constructing corresponding training samples respectively based on the obtained sentence samples, wherein each training sample comprises a plurality of sentence samples containing the same entity pair;

respectively labeling the corresponding relation types of entity pairs associated with the corresponding training samples aiming at the obtained training samples;

and carrying out iterative training on the entity relationship determination model to be trained based on the marked training samples until a convergence condition is met, and obtaining the trained entity relationship determination model.

Optionally, the model training unit 1105 is further configured to:

performing word segmentation operation on the obtained multiple sentence samples to obtain multiple word segments;

for multiple relation types, the following operations are respectively executed:

determining mutual information coefficients of each participle in a plurality of participles and a relation type aiming at the relation type, wherein one mutual information coefficient is used for representing the importance degree of one participle to the relation type;

selecting at least one word segmentation corresponding to the mutual information coefficient larger than a set threshold value based on the obtained multiple mutual information coefficients;

screening out sentence samples which do not contain any participle in at least one participle from a plurality of sentence samples corresponding to a relation type;

and respectively constructing corresponding training samples based on the plurality of residual sentence samples.

Optionally, the model training unit 1105 is specifically configured to:

for a plurality of word segments, the following operations are respectively executed:

determining a first probability of occurrence of a word segmentation for a word segmentation;

determining a second probability of occurrence of a relationship type and determining a third probability of occurrence of a word-segmentation when a relationship type exists;

and determining a mutual information coefficient corresponding to a word segmentation based on the first probability, the second probability and the third probability.

Optionally, the sentence encoding unit 1102 is specifically configured to:

splitting one sentence into characters to obtain a plurality of characters included in the one sentence;

for each of the plurality of characters, the following operations are respectively performed:

performing feature encoding on one character to obtain a content representation vector, a position representation vector and a source representation vector of the one character; the content representation vector represents the content of the one character, the position representation vector represents the position of the one character in the one sentence, and the source representation vector represents the sentence from which the one character comes;

and obtaining a character representation vector of the one character based on the content representation vector, the position representation vector and the source representation vector.
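The application only says the character representation vector is obtained "based on" the three feature vectors; elementwise addition, as in BERT-style token, position and segment embeddings, is one common choice and is the assumption made in this sketch:

```python
def character_representation(content_vec, position_vec, source_vec):
    """Combine the content, position and source vectors of one character
    into its character representation vector. Elementwise addition is an
    assumption here; the application does not specify the combination."""
    assert len(content_vec) == len(position_vec) == len(source_vec)
    return [c + p + s
            for c, p, s in zip(content_vec, position_vec, source_vec)]
```

With this choice all three vectors must share one dimensionality, and the result keeps that dimensionality, so downstream encoders need no extra projection.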

Optionally, the sentence encoding unit 1102 is specifically configured to:

splitting one sentence into characters to obtain a plurality of characters included in the one sentence;

performing feature encoding on each of the plurality of characters in sequence, according to the order of the plurality of characters in the one sentence, to obtain the character representation vectors respectively corresponding to the plurality of characters; when feature encoding is performed on one character, feature extraction is performed on the one character to obtain a basic representation vector of the one character, and the character representation vector of the one character is obtained based on the basic representation vector and the character representation vector of the character preceding the one character.
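This sequential dependence on the preceding character is the shape of a recurrent encoder. A deliberately tiny scalar recurrence is sketched below; the real model's cell type and weights are not specified in the application, so `w_in` and `w_rec` are illustrative assumptions:

```python
import math

def recurrent_encode(basic_vectors, w_in=0.5, w_rec=0.5):
    """Encode characters left to right: each character representation is
    computed from its own basic representation and the representation of
    the preceding character (toy scalar RNN, for illustration only)."""
    h = 0.0  # "previous character" representation; zero for the first character
    outputs = []
    for x in basic_vectors:
        # combine the basic representation with the preceding character's vector
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(h)
    return outputs
```

Because each output feeds into the next step, the vector of the last character summarizes the whole prefix, which is why order-sensitive information survives this encoding.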

Optionally, the sentence encoding unit 1102 is specifically configured to:

performing mean pooling on the obtained character representation vectors to obtain the sentence representation vector; or,

determining a character weight value of each character, and obtaining the sentence representation vector based on the character representation vector and the character weight value corresponding to each character; where each character weight value characterizes the importance of one character to the one sentence.
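The two pooling alternatives above can be sketched as follows; how the character weight values are produced is left open here, matching the application:

```python
def mean_pool(char_vectors):
    """Mean pooling: average the character vectors dimension by dimension."""
    n = len(char_vectors)
    return [sum(dim_values) / n for dim_values in zip(*char_vectors)]

def weighted_pool(char_vectors, char_weights):
    """Weighted pooling: each character contributes to the sentence vector
    in proportion to its character weight value."""
    return [sum(w * x for w, x in zip(char_weights, dim_values))
            for dim_values in zip(*char_vectors)]
```

Mean pooling is the special case of weighted pooling in which every character weight equals 1/n, so the weighted variant strictly generalizes it.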

Optionally, the sentence bag encoding unit 1103 is specifically configured to:

for each sentence representation vector, the following operations are respectively executed:

for one sentence representation vector, obtaining an intermediate representation vector of the one sentence representation vector based on the one sentence representation vector and a pre-trained weight matrix included in the entity relationship determination model;

obtaining a vector dot product between the intermediate representation vector and a pre-trained parameter vector included in the entity relationship determination model;

and normalizing the obtained vector dot products to obtain the sentence weight values corresponding to the sentence representation vectors.
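The three steps above form a standard attention scoring pass, sketched here with plain lists. Whether a nonlinearity follows the projection is not specified in the application; none is applied in this sketch, and softmax is assumed as the normalization:

```python
import math

def sentence_weights(sentence_vectors, weight_matrix, param_vector):
    """Attention-style sentence weighting:
    (1) project each sentence vector with the pre-trained weight matrix,
    (2) take the dot product with the pre-trained parameter vector,
    (3) normalize all scores across the sentence bag (softmax assumed)."""
    scores = []
    for s in sentence_vectors:
        # step 1: intermediate representation = weight_matrix @ s
        intermediate = [sum(w * x for w, x in zip(row, s))
                        for row in weight_matrix]
        # step 2: dot product with the parameter vector
        scores.append(sum(v * h for v, h in zip(param_vector, intermediate)))
    # step 3: softmax over all sentences (max-shifted for numerical stability)
    m = max(scores)
    exps = [math.exp(sc - m) for sc in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

The resulting weights sum to one across the bag, so the bag representation vector can then be formed as the weight-sum of the sentence representation vectors.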

The apparatus may be configured to perform the methods shown in the embodiments of fig. 2 to 10; therefore, for the functions that can be realized by each functional module of the apparatus, reference may be made to the description of the embodiments shown in fig. 2 to 10, which is not repeated here. Since the model training unit 1105 is not an essential functional module, the model training unit 1105 is shown with a dotted line in fig. 11.

Referring to fig. 12, based on the same technical concept, an embodiment of the present application further provides a computer device 120, which may include a memory 1201 and a processor 1202.

The memory 1201 is used for storing the computer programs executed by the processor 1202. The memory 1201 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the computer device, and the like. The processor 1202 may be a Central Processing Unit (CPU), a digital processing unit, or the like. The embodiment of the present application does not limit the specific connection medium between the memory 1201 and the processor 1202. In the embodiment of the present application, the memory 1201 and the processor 1202 are connected by a bus 1203, the bus 1203 is represented by a thick line in fig. 12, and the connection manner between other components is only schematically illustrated and is not limited thereto. The bus 1203 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 12, but this does not mean that there is only one bus or one type of bus.

The memory 1201 may be a volatile memory, such as a random-access memory (RAM); the memory 1201 may also be a non-volatile memory, such as, but not limited to, a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD), or any other medium which can be used to carry or store desired program code in the form of instructions or data structures and which can be accessed by a computer. The memory 1201 may also be a combination of the above memories.

A processor 1202, configured to execute the method performed by the apparatus in the embodiments shown in fig. 2 to fig. 10 when calling the computer program stored in the memory 1201.

In some possible embodiments, various aspects of the methods provided by the present application may also be implemented in the form of a program product including program code for causing a computer device to perform the steps of the methods according to various exemplary embodiments of the present application described above in this specification when the program product is run on the computer device, for example, the computer device may perform the methods performed by the devices in the embodiments shown in fig. 2-10.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
