Label prediction method, apparatus, storage medium and program product

Document No.: 1953956    Publication date: 2021-12-10

Reading note: This technique, Label prediction method, apparatus, storage medium and program product, was designed and created by Zhan Yibing, Wan Sheng, and Gong Chen on 2021-09-28. Abstract: The embodiments of the present application provide a label prediction method, apparatus, storage medium, and program product. Node information of a graph structure is obtained, where the graph structure comprises labeled nodes and unlabeled nodes, and the node information comprises node features, an adjacency matrix, and observation labels of the labeled nodes; an overall loss function of a preset Poisson graph network model and a preset graph neural network model is obtained according to the node information; model parameters of the Poisson graph network model and the graph neural network model are optimized according to the overall loss function; and the predicted labels of the unlabeled nodes are determined according to the node information and the optimized Poisson graph network model. The Poisson graph network effectively solves the semi-supervised learning problem when labeled nodes are extremely scarce, and the originally intractable posterior probability distribution is estimated through variational inference based on the Poisson graph network model and the graph neural network model, which jointly improves the learning ability of the models, gives them higher robustness and confidence, and thus predicts the labels of unlabeled nodes more accurately.

1. A label prediction method, comprising:

acquiring node information of a graph structure, wherein the graph structure comprises labeled nodes and unlabeled nodes, and the node information comprises node features, an adjacency matrix and observation labels of the labeled nodes;

acquiring an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information; wherein the Poisson graph network model and the graph neural network model are used for estimating a posterior probability distribution of labels of the unlabeled nodes through variational inference;

optimizing model parameters of the Poisson graph network model and the graph neural network model according to the overall loss function;

and determining predicted labels of the unlabeled nodes according to the node information and the optimized Poisson graph network model.

2. The method of claim 1, wherein the acquiring an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information comprises:

inputting the node features, the adjacency matrix and the observation labels of the labeled nodes into a preset Poisson graph network model to obtain a first posterior probability distribution of node labels;

inputting the node features and the adjacency matrix into a preset graph neural network model to obtain a second posterior probability distribution of node labels;

and constructing an overall loss function of the preset Poisson graph network model and the preset graph neural network model based on the first posterior probability distribution and the second posterior probability distribution.

3. The method of claim 2, wherein the inputting the node features, the adjacency matrix and the observation labels of the labeled nodes into a preset Poisson graph network model to obtain a first posterior probability distribution of node labels comprises:

acquiring attention coefficients between the nodes according to an attention mechanism based on the node features, and replacing the edge weights in the adjacency matrix with them to obtain an attention map;

constructing a label matrix of all nodes according to the observation labels of the labeled nodes, wherein the values corresponding to the unlabeled nodes are 0;

and inputting the node features, the attention map and the label matrix into the Poisson graph network model, and obtaining the first posterior probability distribution of node labels through multiple Poisson convolutional layers of the Poisson graph network model.

4. The method of claim 3, wherein the inputting the node features, the attention map, and the label matrix into the Poisson graph network model, and obtaining the first posterior probability distribution of node labels through multiple Poisson convolutional layers of the Poisson graph network model comprises:

for any one of the Poisson convolutional layers, acquiring the output result of the current Poisson convolutional layer on the basis of the output result of the previous Poisson convolutional layer, the label matrix, and the diagonal matrix and Laplacian matrix of the attention map.

5. The method of claim 4, wherein the acquiring the output result of the current Poisson convolutional layer on the basis of the output result of the previous Poisson convolutional layer, the label matrix, and the diagonal matrix and Laplacian matrix of the attention map comprises:

obtaining the output result of the current Poisson convolutional layer through the following formula:

H^{(t)} = H^{(t-1)} + \hat{D}^{-1}\left(B^{T} - \hat{L}\,H^{(t-1)}\right)

wherein H^{(t)} denotes the output result of the t-th Poisson convolutional layer; \hat{D} and \hat{L} denote the diagonal matrix and the Laplacian matrix of the attention map, respectively; and B^{T} denotes the label matrix of all nodes, in which the values corresponding to the unlabeled nodes are 0.

6. The method of claim 4 or 5, further comprising:

performing label prediction according to the node features to obtain a label prediction result;

and inputting the label prediction result into at least one of the Poisson convolutional layers, and superposing it on the output result of that layer as the final output result of that layer.

7. The method of claim 3, wherein the graph neural network model is a graph attention network, and the attention coefficients in the graph neural network model adopt the attention coefficients between the nodes.

8. The method according to claim 2, wherein the constructing an overall loss function of the preset Poisson graph network model and the preset graph neural network model based on the first posterior probability distribution and the second posterior probability distribution comprises:

acquiring one or more of an evidence lower bound loss function, a contrastive loss function and a cross entropy loss function based on the first posterior probability distribution and the second posterior probability distribution, so as to construct an overall loss function of the preset Poisson graph network model and the preset graph neural network model according to the one or more of the evidence lower bound loss function, the contrastive loss function and the cross entropy loss function.

9. The method according to claim 8, wherein the acquiring one or more of an evidence lower bound loss function, a contrastive loss function, and a cross entropy loss function based on the first posterior probability distribution and the second posterior probability distribution comprises:

determining KL divergence between the first posterior probability distribution and the second posterior probability distribution according to the first posterior probability distribution and the second posterior probability distribution, and determining the evidence lower bound loss function according to the KL divergence; and/or

Determining a contrastive loss function between the Poisson graph network model and the graph neural network model based on contrastive learning according to the first posterior probability distribution and the second posterior probability distribution; and/or

And determining a cross entropy loss function according to the first posterior probability distribution of the labeled nodes and the observation label data.

10. The method according to claim 8, wherein the determining a KL divergence between the first posterior probability distribution and the second posterior probability distribution according to the first posterior probability distribution and the second posterior probability distribution, and determining the evidence lower bound loss function according to the KL divergence comprises:

determining the evidence lower bound loss function \mathcal{L}_{ELBO} through the following formula:

\mathcal{L}_{ELBO} = \mathbb{E}_{q_{\phi}(Y_U \mid X, A, Y_L)}\left[\log p_{\theta}(Y_L \mid Y_U, X, A)\right] - D_{KL}\left(q_{\phi}(Y_U \mid X, A, Y_L)\,\|\,p_{\theta}(Y_U \mid X, A)\right)

wherein q_{\phi}(Y_U | X, A, Y_L) is the first posterior probability of the labels of the unlabeled nodes, p_{\theta}(Y_U | X, A) is the second posterior probability of the labels of the unlabeled nodes, p_{\theta}(Y_L | X, A) is the second posterior probability of the labels of the labeled nodes, whose logarithm the evidence lower bound bounds from below, X denotes the node features, Y_L denotes the observation labels of the labeled nodes, Y_U denotes the labels of the unlabeled nodes, A is the adjacency matrix, θ and φ are the model parameters, and D_{KL}(·‖·) denotes the KL divergence between two distributions.

11. The method of claim 8, wherein the determining a contrastive loss function between the Poisson graph network model and the graph neural network model based on contrastive learning according to the first posterior probability distribution and the second posterior probability distribution comprises:

obtaining a pairwise contrastive loss function \ell(z_i, \hat{z}_i) according to the following formula:

\ell(z_i, \hat{z}_i) = -\log \frac{\exp(\langle z_i, \hat{z}_i\rangle/\tau)}{\exp(\langle z_i, \hat{z}_i\rangle/\tau) + \sum_{k \neq i}\exp(\langle z_i, \hat{z}_k\rangle/\tau)}

and obtaining the contrastive loss function L_{Cont} according to the pairwise contrastive loss function \ell(z_i, \hat{z}_i) as follows:

L_{Cont} = \frac{1}{n}\sum_{i=1}^{n} \ell(z_i, \hat{z}_i)

wherein z_i and \hat{z}_i are the outputs of node x_i in q_{\phi}(Y_U | X, A, Y_L) and p_{\theta}(Y | X, A), respectively, z_k and \hat{z}_k are the outputs of node x_k in q_{\phi}(Y_U | X, A, Y_L) and p_{\theta}(Y | X, A), respectively, node x_k is any node other than node x_i, ⟨·,·⟩ denotes the inner product, τ is the temperature parameter, and n is the number of nodes.

12. The method according to any one of claims 8 to 11, wherein the constructing an overall loss function of the preset Poisson graph network model and the preset graph neural network model according to one or more of the evidence lower bound loss function, the contrastive loss function, and the cross entropy loss function comprises:

determining the overall loss function L according to the following formula:

L = -\mathcal{L}_{ELBO} + \lambda_1 L_{CE}(Z_L, Y_L) + \lambda_2 L_{Cont}

wherein \mathcal{L}_{ELBO} is the evidence lower bound loss function, L_{CE}(Z_L, Y_L) is the cross entropy loss function, Z_L denotes the predicted labels, produced by q_{\phi}, of the labeled samples X_L, L_{Cont} is the contrastive loss function, and λ_1 and λ_2 are coefficients.

13. A label prediction apparatus, characterized by comprising:

an acquisition module, configured to acquire node information of a graph structure, wherein the graph structure comprises labeled nodes and unlabeled nodes, and the node information comprises node features, an adjacency matrix and observation labels of the labeled nodes;

a model optimization module, configured to acquire an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information, wherein the Poisson graph network model and the graph neural network model are used for estimating a posterior probability distribution of labels of the unlabeled nodes through variational inference, and to optimize model parameters of the Poisson graph network model and the graph neural network model according to the overall loss function;

and a prediction module, configured to determine predicted labels of the unlabeled nodes according to the node information and the optimized Poisson graph network model.

14. An electronic device, comprising: at least one processor; and a memory;

the memory stores computer-executable instructions;

the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of any one of claims 1-12.

15. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1-12.

16. A computer program product comprising computer instructions, characterized in that the computer instructions, when executed by a processor, implement the method according to any of claims 1-12.

Technical Field

The embodiments of the present application relate to the technical field of computers and artificial intelligence, and in particular to a label prediction method, device, storage medium, and program product.

Background

Most artificial intelligence algorithms cannot be trained without labeled data; however, high-precision manual labeling is difficult to obtain, so semi-supervised learning, which uses only a small amount of labeled data, is more and more widely applied. For example, for a commodity classification model, although there are numerous types of commodities, only a small number of commodities need to be labeled: a relatively reliable commodity classification model can be obtained by training the model with the labeled commodities together with many more unlabeled commodities.

The graph-based semi-supervised learning algorithm is a semi-supervised learning algorithm for data organized in a graph structure; it has attracted great attention from researchers at home and abroad by virtue of its solid mathematical foundation and superior algorithm performance, and its task objective is consistent with that of general semi-supervised learning. However, graph-based semi-supervised learning algorithms have an inherent defect: enough labeled samples are needed to train the network effectively and obtain a relatively robust result. In practical applications, the cost of data labeling is very high, so even in the semi-supervised setting the scale of labeled data is usually very limited; in this case, the learning ability of the graph neural network is greatly limited, and especially when labeled data are extremely rare, the accuracy of graph neural network algorithms drops rapidly.

Disclosure of Invention

Embodiments of the present application provide a label prediction method, device, storage medium, and program product, which are used to predict the labels of unlabeled nodes more accurately when labeled nodes are extremely rare in a graph structure.

In a first aspect, an embodiment of the present application provides a label prediction method, including:

acquiring node information of a graph structure, wherein the graph structure comprises labeled nodes and unlabeled nodes, and the node information comprises node features, an adjacency matrix and observation labels of the labeled nodes;

acquiring an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information; wherein the Poisson graph network model and the graph neural network model are used for estimating a posterior probability distribution of labels of the unlabeled nodes through variational inference;

optimizing model parameters of the Poisson graph network model and the graph neural network model according to the overall loss function;

and determining predicted labels of the unlabeled nodes according to the node information and the optimized Poisson graph network model.

In a second aspect, an embodiment of the present application provides a label prediction apparatus, including:

an acquisition module, configured to acquire node information of a graph structure, wherein the graph structure comprises labeled nodes and unlabeled nodes, and the node information comprises node features, an adjacency matrix and observation labels of the labeled nodes;

a model optimization module, configured to acquire an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information, wherein the Poisson graph network model and the graph neural network model are used for estimating a posterior probability distribution of labels of the unlabeled nodes through variational inference, and to optimize model parameters of the Poisson graph network model and the graph neural network model according to the overall loss function;

and a prediction module, configured to determine predicted labels of the unlabeled nodes according to the node information and the optimized Poisson graph network model.

In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor; and a memory;

the memory stores computer-executable instructions;

the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of the first aspect.

In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, in which computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the method according to the first aspect is implemented.

In a fifth aspect, the present application provides a computer program product comprising computer instructions which, when executed by a processor, implement the method according to the first aspect.

According to the label prediction method, device, storage medium, and program product, node information of a graph structure is obtained, where the graph structure includes labeled nodes and unlabeled nodes, and the node information includes node features, an adjacency matrix, and observation labels of the labeled nodes; an overall loss function of a preset Poisson graph network model and a preset graph neural network model is obtained according to the node information, where the Poisson graph network model and the graph neural network model are used for estimating the posterior probability distribution of unlabeled node labels through variational inference; model parameters of the Poisson graph network model and the graph neural network model are optimized according to the overall loss function; and the predicted labels of the unlabeled nodes are determined according to the node information and the optimized Poisson graph network model. The Poisson graph network effectively solves the semi-supervised learning problem when labeled nodes are scarce, and the originally intractable posterior probability distribution is estimated through variational inference based on the Poisson graph network model and the graph neural network model, which jointly improves the learning ability of the models, gives them higher robustness and confidence, and thus allows the labels of unlabeled nodes to be predicted more accurately.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

Fig. 1 is a schematic view of an application scenario of a tag prediction method according to an embodiment of the present application;

FIG. 2 is a flowchart illustrating a tag prediction method according to an embodiment of the present application;

FIG. 3 is a diagram illustrating a model architecture of a tag prediction method according to an embodiment of the present application;

FIG. 4 is a flow chart of a tag prediction method according to another embodiment of the present application;

FIG. 5 is a flow chart of a tag prediction method according to another embodiment of the present application;

FIG. 6 is a block diagram of a tag prediction device according to an embodiment of the present application;

fig. 7 is a block diagram of an electronic device according to an embodiment of the present application.

With the foregoing drawings in mind, certain embodiments of the disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

Most artificial intelligence algorithms cannot be trained without labeled data; however, high-precision manual labeling is difficult to obtain, so semi-supervised learning, which uses only a small amount of labeled data, is more and more widely applied. For example, for a commodity classification model, although there are numerous types of commodities, only a small number of commodities need to be labeled: a relatively reliable commodity classification model can be obtained by training the model with the labeled commodities together with many more unlabeled commodities. The graph structure is currently a relatively advanced data structure description model: all data are represented as graph nodes, and the relationships between the data are described as graph edges. Graph-based semi-supervised learning algorithms have attracted great attention from researchers at home and abroad by virtue of their solid mathematical foundation and superior algorithm performance.

Early graph-based semi-supervised learning algorithms mainly relied on the assumption that adjacent nodes are highly likely to belong to the same category, using algorithms such as low-dimensional embedding with Laplacian eigenmaps, spectral kernels, and Markov random walks. To further improve the performance of semi-supervised learning, many approaches consider joint modeling of data features and graph structure, such as deep semi-supervised embedding and the Planetoid algorithm. In recent years, inspired by convolutional neural networks, researchers have proposed various graph neural network algorithms that deal with the graph-based semi-supervised learning problem with great success; such methods can be roughly divided into two categories: spatial-domain graph neural network algorithms and spectral-domain graph neural network algorithms.

1) Spatial-domain graph neural network algorithms. In spatial-domain approaches, the graph convolution operation is defined as a weighted averaging operation whose inputs are the neighbors of each node and whose output aggregates the influence of each neighbor on the central target node.

2) Spectral-domain graph neural network algorithms. Unlike spatial-domain methods, spectral-domain methods are typically based on eigendecomposition, considering the local nature of graph convolution from a spectral analysis perspective.

Although graph neural networks have achieved great success in graph-based semi-supervised learning, current graph neural network algorithms (such as the graph convolutional neural network and the graph attention network) have an inherent defect: enough labeled samples are required to train the network effectively and obtain a relatively robust result. In practical applications, the cost of data labeling is very high, so even in the semi-supervised setting the scale of labeled data is usually very limited. In this case, the learning ability of the graph neural network is greatly limited, and especially when labeled data are extremely rare, the accuracy of graph neural network algorithms drops rapidly.

In order to solve the above technical problem, the embodiments of the present application provide a graph neural network framework that specifically addresses the semi-supervised learning problem when the number of labels is extremely small by means of a Poisson graph network and, based on variational inference theory, approximately estimates the intractable posterior probability distribution of unlabeled node labels with the help of two graph neural networks, namely a Poisson graph network model and a graph neural network model.

Specifically, in order to enable the limited label information to spread over the whole graph, a Poisson graph network model is designed. By means of Poisson learning, a Poisson graph network with an attention mechanism can flexibly model the label propagation process and use graph structure information to guide it, while another graph neural network instantiates the variational inference framework together with it. Based on this framework, label prediction results are obtained from two perspectives, so a contrastive objective can naturally be used to optimize the Poisson graph network and the graph neural network model at the same time; supervision signals can thus be mined from the large number of unlabeled node samples to assist model training and make up for the scarcity of label information, yielding a more robust result with higher confidence.

A specific application scenario of the embodiments of the application is shown in fig. 1 and includes a database 101 and a server 102. The database 101 may provide graph structure data including node information of a graph structure, where the node information includes node features, an adjacency matrix, and observation labels of the labeled nodes. After obtaining the node information of the graph structure, the server 102 may obtain an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information, where the Poisson graph network model and the graph neural network model are used for estimating the posterior probability distribution of unlabeled node labels through variational inference; optimize model parameters of the Poisson graph network model and the graph neural network model according to the overall loss function; determine the predicted labels of the unlabeled nodes according to the node information and the optimized Poisson graph network model; and output the predicted labels of the unlabeled nodes.

The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.

Fig. 2 is a flowchart of a tag prediction method according to an embodiment of the present application. The embodiment provides a label prediction method whose execution subject is an electronic device; the method specifically comprises the following steps:

s201, obtaining node information of a graph structure, wherein the graph structure comprises labeled nodes and non-labeled nodes, and the node information comprises node characteristics, an adjacent matrix and observation labels of the labeled nodes.

In this embodiment, a graph structure is a relatively advanced data structure description model: all data are represented as nodes of a graph, and the relationships between the data are described as edges of the graph. The nodes include labeled nodes and unlabeled nodes. Labeled nodes carry observation labels, which may be manually annotated or determined in some special way; unlabeled nodes carry no observation labels and need to be predicted according to the method of this embodiment. In addition, the adjacency matrix represents the adjacency relationships between nodes, and node features of both the labeled and the unlabeled nodes can be obtained.
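As a concrete illustration, the node information described above can be assembled as in the following minimal sketch; all names and sizes are illustrative and not taken from this application, and Python with PyTorch is assumed here and in the later sketches:

```python
import torch

# n nodes with d-dimensional features, an adjacency matrix, and observation
# labels for only the first l nodes (the labeled nodes).
n, d, num_classes, l = 100, 16, 3, 5               # only 5 labeled nodes
X = torch.randn(n, d)                              # node features
A = (torch.rand(n, n) < 0.05).float()
A = torch.maximum(A, A.T)                          # undirected adjacency matrix
A.fill_diagonal_(0)
Y_L = torch.randint(0, num_classes, (l,))          # observation labels of labeled nodes
```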

S202, acquiring an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information, wherein the Poisson graph network model and the graph neural network model are used for estimating the posterior probability distribution of unlabeled node labels through variational inference.

In this embodiment, since the number of labeled nodes in the graph structure is extremely small, the posterior probability distribution of unlabeled node labels is difficult to determine directly and accurately by semi-supervised learning. Therefore, based on variational inference theory, this embodiment approximately estimates the intractable posterior probability distribution by means of two graph neural networks, namely a Poisson graph network model and a graph neural network model. Variational inference finds, for a target posterior probability distribution that is difficult to express or solve, a posterior probability distribution that is easy to express or solve, such that the distance between the two is small enough for the latter to serve as an approximation of the target posterior probability distribution.

In this embodiment, a Poisson graph network model and a graph neural network model are constructed, and the posterior probability distribution of unlabeled node labels is estimated through variational inference.

Here, the Poisson graph network model builds on the Poisson learning algorithm recently proposed to deal with semi-supervised learning scenarios with an extremely rare number of labels, which has been proven superior to the traditional Laplace learning method (see: Jeff Calder, Brendan Cook, Matthew Thorpe, Dejan Slepčev. Poisson Learning: Graph Based Semi-Supervised Learning at Very Low Label Rates. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1306-1316, 2020). The graph neural network model can be almost any graph neural network model, such as a graph convolutional neural network model or a graph attention network model; it is only necessary to ensure that its inputs are the adjacency matrix and the node features and that its output is the predicted labels of the nodes.

In this embodiment, the predicted labels of the nodes are produced from two views to perform variational inference, and a common objective, namely the overall loss function, can be used to optimize the Poisson graph network model and the graph neural network model at the same time. The overall loss function can be created by combining the output results of the Poisson graph network model and the graph neural network model; optionally, it may include one or more component terms, for example one or more of an evidence lower bound loss function, a contrastive loss function, and a cross entropy loss function. Through these loss functions, the Poisson graph network model and the graph neural network model can be constrained from different angles, making up for the scarcity of labeled nodes and giving both models higher robustness and confidence.

S203, optimizing model parameters of the Poisson graph network model and the graph neural network model according to the overall loss function.

In this embodiment, the model parameters of the Poisson graph network model and the graph neural network model are optimized simultaneously based on the overall loss function, and are improved continuously through iteration until the Poisson graph network model and the graph neural network model converge, for example when the models reach a target accuracy or the number of iterations reaches a target number.
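A hedged sketch of this joint optimization follows; `poisson_net`, `gnn` and `overall_loss_fn` are assumed helpers (the loss terms are sketched later in this description), not names defined by the patent:

```python
import torch

def train(poisson_net, gnn, overall_loss_fn, X, A, Y_L, epochs=200):
    # The parameters of both models are updated jointly by minimizing the
    # overall loss with a single optimizer.
    params = list(poisson_net.parameters()) + list(gnn.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-2)
    for _ in range(epochs):
        optimizer.zero_grad()
        q = poisson_net(X, A, Y_L)      # first posterior distribution
        p = gnn(X, A)                   # second posterior distribution
        loss = overall_loss_fn(q, p, Y_L)
        loss.backward()                 # gradients reach both models at once
        optimizer.step()
```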

S204, determining the predicted labels of the unlabeled nodes according to the node information and the optimized Poisson graph network model.

In this embodiment, after the model parameters of the Poisson graph network model and the graph neural network model are optimized, considering that the Poisson graph network model performs label propagation better and its prediction results have higher confidence, the predicted labels of the unlabeled nodes are finally determined according to the node information and the optimized Poisson graph network model.
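Continuing the earlier sketches, this final prediction step could be realized as follows; `poisson_net` and the labeled count `l` are illustrative assumptions:

```python
import torch

# Each unlabeled node takes the class with the highest first posterior
# probability as its predicted label.
with torch.no_grad():
    q = poisson_net(X, A, Y_L)           # (n, num_classes) first posterior
    predictions = q.argmax(dim=1)        # most probable label for every node
    unlabeled_predictions = predictions[l:]
```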

In the label prediction method provided by this embodiment, node information of a graph structure is obtained, where the graph structure includes labeled nodes and unlabeled nodes, and the node information includes node features, an adjacency matrix, and observation labels of the labeled nodes; an overall loss function of a preset Poisson graph network model and a preset graph neural network model is obtained according to the node information, where the Poisson graph network model and the graph neural network model are used for estimating the posterior probability distribution of unlabeled node labels through variational inference; model parameters of the Poisson graph network model and the graph neural network model are optimized according to the overall loss function; and the predicted labels of the unlabeled nodes are determined according to the node information and the optimized Poisson graph network model. The Poisson graph network effectively solves the semi-supervised learning problem when labeled nodes are scarce, and the originally intractable posterior probability distribution is estimated through variational inference based on the Poisson graph network model and the graph neural network model, which jointly improves the learning ability of the models, gives them higher robustness and confidence, and thus allows the labels of unlabeled nodes to be predicted more accurately.

On the basis of the foregoing embodiment, the network model architecture is shown in fig. 3, and the acquiring an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information in S202 may specifically include, as shown in fig. 4:

S301, inputting the node features, the adjacency matrix and the observation labels of the labeled nodes into a preset Poisson graph network model to obtain a first posterior probability distribution of node labels;

S302, inputting the node features and the adjacency matrix into a preset graph neural network model to obtain a second posterior probability distribution of node labels;

S303, constructing an overall loss function of the preset Poisson graph network model and the preset graph neural network model based on the first posterior probability distribution and the second posterior probability distribution.

In this embodiment, the inputs of the Poisson graph network model are the node features X, the adjacency matrix A, and the observation labels Y_L of the labeled nodes, and the output is the first posterior probability distribution of node labels, i.e., the conditional probability distribution of the predicted labels given X, A, and Y_L. The inputs of the graph neural network model are the node features X and the adjacency matrix A, and the output is the second posterior probability distribution of node labels, i.e., the conditional probability distribution of the predicted labels given X and A. An overall loss function of the preset Poisson graph network model and the preset graph neural network model is then constructed based on the first and second posterior probability distributions, so as to optimize the model parameters of both models at the same time.

As shown in fig. 5, the inputting the node features, the adjacency matrix, and the observation labels of the labeled nodes into a preset Poisson graph network model to obtain a first posterior probability distribution of node labels may specifically include:

S401, acquiring attention coefficients between the nodes according to an attention mechanism based on the node features, and replacing the edge weights in the adjacency matrix with them to obtain an attention map;

S402, constructing a label matrix of all nodes according to the observation labels of the labeled nodes, wherein the values corresponding to the unlabeled nodes are 0;

S403, inputting the node features, the attention map and the label matrix into the Poisson graph network model, and obtaining the first posterior probability distribution of node labels through multiple Poisson convolutional layers of the Poisson graph network model.

In this embodiment, a Poisson learning algorithm is applied to cope with semi-supervised learning scenarios with an extremely small number of labeled nodes, but Poisson learning still has defects: it cannot effectively utilize the graph structure to guide the label propagation process. Specifically, first, it relies on a fixed input graph, and in practical applications such a graph may contain noise and may not depict the true relationships between samples; second, Poisson learning does not consider the structural information constructed from neighborhood features, because it focuses mainly on the propagation of label information. Because of these drawbacks, the prediction results of Poisson learning may not be accurate. Therefore, a flexible graph neural network model, namely the Poisson graph network, is proposed in this embodiment.

In order to avoid the influence of interference information in the graph, this embodiment first utilizes the attention mechanism to adaptively capture the association information between neighboring nodes, that is, different weights are assigned according to the different degrees of importance between neighboring nodes. Specifically, the attention coefficient e_ij between nodes x_i and x_j can be calculated as follows:

e_{ij} = \vec{a}^{T}\left[W x_i \,\|\, W x_j\right] \quad (1)

In formula (1), \vec{a} is a trainable weight vector and W is a trainable weight matrix, i.e., a parameterized linear transformation matrix that maps the input features into a higher-dimensional output feature space; \| denotes the concatenation operation. The attention coefficients e_ij are usually normalized for the subsequent calculation:

\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})} \quad (2)

In formula (2), N_i denotes the index set of the neighbors of node x_i, and e_ik denotes the attention coefficient between node x_i and neighbor k in that index set. In this embodiment the obtained attention coefficients α_ij are used to replace the essentially fixed edge weights A_ij in the adjacency matrix, so that the edge weights are continuously improved as the network trains, thereby describing the association information between nodes more faithfully.

In addition, in this embodiment a label matrix of all nodes may also be constructed according to the observation labels of the labeled nodes, where the values corresponding to the unlabeled nodes are 0. Specifically, B^T denotes the label matrix of all nodes and contains N rows of data, where N is the total number of nodes: L rows are the observation labels of the L labeled nodes, and the remaining N−L rows, corresponding to the labels of the N−L unlabeled nodes, are set to 0. The order of S401 and S402 is not limited.
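A sketch of this construction, assuming one-hot encoding for the multi-class case (the text does not fix the encoding):

```python
import torch

def label_matrix(Y_L, n, num_classes):
    # B^T: N rows in total; the first L rows are one-hot observation labels
    # of the labeled nodes, the remaining N-L rows (unlabeled nodes) stay 0.
    B = torch.zeros(n, num_classes)
    B[torch.arange(len(Y_L)), Y_L] = 1.0
    return B
```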

Further, the node features, the attention map and the label matrix are input into the Poisson graph network model, which may comprise multiple Poisson convolutional layers, and the first posterior probability distribution of node labels can be obtained through the iteration of these Poisson convolutional layers.

For any one of the Poisson convolutional layers, the output result of the current Poisson convolutional layer is acquired on the basis of the output result of the previous Poisson convolutional layer, the label matrix, and the diagonal matrix and Laplacian matrix of the attention map.

Specifically, the Poisson convolution of the current layer can be performed through the following formula to obtain the output result of that Poisson convolutional layer:

H^{(t)} = H^{(t-1)} + \hat{D}^{-1}\left(B^{T} - \hat{L}\,H^{(t-1)}\right) \quad (3)

where H^{(t)} denotes the output result of the t-th Poisson convolutional layer; \hat{D} and \hat{L} denote the diagonal matrix and the Laplacian matrix of the attention map, respectively; and B^T denotes the label matrix of all nodes, in which the values corresponding to the unlabeled nodes are 0.
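Under the reconstruction of formula (3) above, the multilayer Poisson convolution could be sketched as follows; this is a dense, illustrative implementation, not the patent's own code:

```python
import torch

def poisson_convolutions(B, alpha, T=10):
    # T Poisson convolutional layers on the attention map alpha (n x n);
    # B is the (n x c) label matrix with zeros for unlabeled nodes.
    deg = alpha.sum(dim=1)
    D_inv = torch.diag(1.0 / deg)              # inverse of the diagonal matrix
    L_hat = torch.diag(deg) - alpha            # Laplacian of the attention map
    H = torch.zeros_like(B)                    # H^(0)
    for _ in range(T):
        H = H + D_inv @ (B - L_hat @ H)        # formula (3)
    return H
```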

On the basis of the above embodiment, in order to exploit the structural information formed by the node features, this embodiment may further introduce a feature transformation module f_FT. f_FT performs label prediction based on the node features to obtain the corresponding label prediction result for each node, where f_FT can be regarded as a single-layer perceptron. Further, the label prediction result obtained by f_FT is input into at least one Poisson convolutional layer and superposed on the output result of that layer as its final output result, so that the node feature information in the neighborhood is effectively utilized in the iterative label propagation process to further improve the label prediction result.

Preferably, the label prediction result obtained by f_FT is superposed on the output of one of the intermediate Poisson convolutional layers. If that layer is too close to the front, the prediction result is likely to become excessively smooth; if it is too close to the back, more node feature information is likely to fail to be captured and propagated. Therefore the third-from-last or fourth-from-last Poisson convolutional layer may preferably be selected; for example, for 10 Poisson convolutional layers, the 7th or 8th layer may be selected. Taking the third-from-last Poisson convolutional layer as an example, its final output result can be expressed as follows:

H^{(t-2)} = H^{(t-3)} + \hat{D}^{-1}\left(B^{T} - \hat{L}\,H^{(t-3)}\right) + f_{FT}(X) \quad (4)

where t denotes the total number of Poisson convolutional layers, i.e., the number of Poisson convolution iterations. With the help of formula (4), the node feature information can continue to propagate along the neighborhood and, together with the label information, generate more meaningful label predictions. In addition, introducing the feature transformation module and superposing f_FT(X) on the basis of the Poisson convolution can speed up the convergence of the iterative process.
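The superposition of formula (4) then amounts to one extra addition inside the same loop; the default layer index below mirrors the "8th of 10 layers" example and is illustrative:

```python
import torch

def poisson_convolutions_with_ft(B, alpha, X, f_FT, T=10, inject_at=None):
    # f_FT is the feature transformation module (e.g. a single-layer
    # perceptron); its prediction is superposed on one intermediate layer.
    if inject_at is None:
        inject_at = T - 2                      # third-from-last layer
    deg = alpha.sum(dim=1)
    D_inv = torch.diag(1.0 / deg)
    L_hat = torch.diag(deg) - alpha
    H = torch.zeros_like(B)
    for t in range(1, T + 1):
        H = H + D_inv @ (B - L_hat @ H)        # formula (3)
        if t == inject_at:
            H = H + f_FT(X)                    # formula (4): superposition
    return H
```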

On the basis of any of the above embodiments, the graph neural network model of this embodiment may adopt any graph neural network; optionally, a graph convolutional neural network model or a graph attention network model may be adopted. If a graph attention network is selected as the graph neural network model, the attention coefficients between nodes computed by the Poisson graph network can be shared with the graph attention network model, that is, the graph attention network model directly adopts the attention coefficients between the nodes described in the above embodiments. It is then unnecessary to obtain the attention coefficients between the nodes repeatedly, which reduces the number of parameters and accelerates network training.

On the basis of any of the above embodiments, the constructing an overall loss function of the preset Poisson graph network model and the preset graph neural network model based on the first posterior probability distribution and the second posterior probability distribution includes:

acquiring one or more of an evidence lower bound loss function, a contrastive loss function and a cross entropy loss function based on the first posterior probability distribution and the second posterior probability distribution, so as to construct an overall loss function of the preset Poisson graph network model and the preset graph neural network model according to the one or more of the evidence lower bound loss function, the contrastive loss function and the cross entropy loss function.

In this embodiment, the model parameters of the Poisson graph network model and the graph neural network model can be optimized simultaneously based on the overall loss function. Since the overall loss function includes one or more of the evidence lower bound loss function, the contrastive loss function and the cross entropy loss function, the Poisson graph network model and the graph neural network model can be better constrained, improving the robustness and accuracy of the models.

In the above embodiments, the specific determination process of the evidence lower bound loss function is as follows: determining a KL divergence between the first posterior probability distribution and the second posterior probability distribution, and determining the evidence lower bound loss function according to the KL divergence, which may specifically include:

determining the evidence lower bound loss function \mathcal{L}_{ELBO} through the following formula:

\mathcal{L}_{ELBO} = \mathbb{E}_{q_{\phi}(Y_U \mid X, A, Y_L)}\left[\log p_{\theta}(Y_L \mid Y_U, X, A)\right] - D_{KL}\left(q_{\phi}(Y_U \mid X, A, Y_L)\,\|\,p_{\theta}(Y_U \mid X, A)\right) \quad (5)

where q_{\phi}(Y_U | X, A, Y_L) is the first posterior probability of the labels of the unlabeled nodes, p_{\theta}(Y_U | X, A) is the second posterior probability of the labels of the unlabeled nodes, p_{\theta}(Y_L | X, A) is the second posterior probability of the labels of the labeled nodes, whose logarithm formula (5) bounds from below, X denotes the node features, Y_L denotes the observation labels of the labeled nodes, Y_U denotes the labels of the unlabeled nodes, A is the adjacency matrix, θ and φ are the model parameters, and D_{KL}(·‖·) denotes the KL divergence between two distributions.

In this embodiment, the KL divergence (Kullback-Leibler divergence) is used in variational inference as an asymmetric measure of the difference between two probability distributions; variational inference is equivalent to minimizing the KL divergence, and the evidence lower bound loss function (ELBO) shown in formula (5) is constructed based on the KL divergence, where minimizing the KL divergence is equivalent to maximizing the evidence lower bound.
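For categorical posteriors, the ELBO of formula (5) could be computed along the following lines; this is a sketch in which the expected log-likelihood term is assumed precomputed and passed in:

```python
import torch

def elbo_loss(q_U, p_U, log_lik_L):
    # q_U and p_U are (n_U, c) probability tables for the unlabeled nodes
    # produced by the two models; log_lik_L stands in for
    # E_q[log p_theta(Y_L | Y_U, X, A)].
    kl = (q_U * (torch.log(q_U + 1e-12) - torch.log(p_U + 1e-12))).sum(1).mean()
    return log_lik_L - kl   # maximizing the ELBO minimizes the KL divergence
```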

In the above embodiments, the specific determination process of the contrastive loss function is as follows: determining a contrastive loss function between the Poisson graph network model and the graph neural network model based on contrastive learning according to the first posterior probability distribution and the second posterior probability distribution.

Specifically, in addition to guiding the network training with the limited label information as in the above embodiments, this embodiment also aims to mine the supervision signals embedded in the unlabeled data. Concretely, contrastive learning is used to explore supervision signals in the large amount of unlabeled data to further optimize the prediction results: this embodiment maximizes the similarity between the outputs of q_{\phi}(Y_U | X, A, Y_L) and p_{\theta}(Y | X, A) for the same node and minimizes the similarity between the outputs of different nodes. Thereby, the pairwise contrastive loss function \ell(z_i, \hat{z}_i) can be expressed as follows:

\ell(z_i, \hat{z}_i) = -\log \frac{\exp(\langle z_i, \hat{z}_i\rangle/\tau)}{\exp(\langle z_i, \hat{z}_i\rangle/\tau) + \sum_{k \neq i}\exp(\langle z_i, \hat{z}_k\rangle/\tau)} \quad (6)

According to the pairwise contrastive loss function \ell(z_i, \hat{z}_i), the overall contrastive loss function L_{Cont} is obtained as follows:

L_{Cont} = \frac{1}{n}\sum_{i=1}^{n} \ell(z_i, \hat{z}_i) \quad (7)

where z_i and \hat{z}_i are the outputs of node x_i in q_{\phi}(Y_U | X, A, Y_L) and p_{\theta}(Y | X, A), respectively, z_k and \hat{z}_k are the outputs of node x_k in q_{\phi}(Y_U | X, A, Y_L) and p_{\theta}(Y | X, A), respectively, node x_k is any node other than node x_i, ⟨·,·⟩ denotes the inner product, τ is the temperature parameter, and n is the number of nodes.
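A sketch of formulas (6) and (7) as an InfoNCE-style objective over the two models' outputs; the value of `tau` is illustrative:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(Z, Z_hat, tau=0.5):
    # Z (from q_phi) and Z_hat (from p_theta) are (n, c) output matrices.
    sim = (Z @ Z_hat.T) / tau                  # <z_i, z_hat_k> / tau
    log_prob = F.log_softmax(sim, dim=1)       # formula (6), for every i at once
    return -log_prob.diagonal().mean()         # formula (7): average over nodes
```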

In the above embodiments, the cross entropy loss function can be expressed as L_{CE}(Z_L, Y_L), where Z_L denotes the predicted labels, produced by q_{\phi}, of the labeled samples X_L.

Based on the above embodiments, in an optional embodiment the overall loss function includes the evidence lower bound loss function, the contrastive loss function, and the cross entropy loss function, so the overall loss function L can be determined according to the following formula:

L = -\mathcal{L}_{ELBO} + \lambda_1 L_{CE}(Z_L, Y_L) + \lambda_2 L_{Cont} \quad (8)

where \mathcal{L}_{ELBO} is the evidence lower bound loss function, which enters with a minus sign because it is maximized while the overall loss is minimized, L_{CE}(Z_L, Y_L) is the cross entropy loss function, Z_L denotes the predicted labels, produced by q_{\phi}, of the labeled samples X_L, L_{Cont} is the contrastive loss function, and λ_1 and λ_2 are coefficients. The model parameters of the Poisson graph network model and the graph neural network model can be optimized based on this overall loss function; by optimizing the contrastive objective between the Poisson graph network and the graph neural network, the utilization of mutual information can be promoted and the label prediction ability improved.
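Combining the three terms as in formula (8) is then a one-liner; the coefficient values below are illustrative:

```python
def total_loss(elbo, ce, cont, lam1=1.0, lam2=1.0):
    # Formula (8): lam1 and lam2 are the balancing coefficients. The ELBO is
    # negated because it is maximized while the overall loss is minimized.
    return -elbo + lam1 * ce + lam2 * cont
```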

Fig. 6 is a block diagram of a tag prediction apparatus according to an embodiment of the present application. The tag prediction apparatus provided in this embodiment may execute the processing flow provided in the method embodiment, as shown in fig. 6, where the tag prediction apparatus 600 includes: an obtaining module 601, a model optimizing module 602, and a predicting module 603.

An obtaining module 601, configured to obtain node information of a graph structure, where the graph structure includes labeled nodes and unlabeled nodes, and the node information includes node features, an adjacency matrix, and observation labels of the labeled nodes;

a model optimization module 602, configured to obtain an overall loss function of a preset Poisson graph network model and a preset graph neural network model according to the node information, where the Poisson graph network model and the graph neural network model are used for estimating the posterior probability distribution of unlabeled node labels through variational inference, and to optimize model parameters of the Poisson graph network model and the graph neural network model according to the overall loss function;

a prediction module 603, configured to determine the predicted labels of the unlabeled nodes according to the node information and the optimized Poisson graph network model.

On the basis of any of the above embodiments, when obtaining the overall loss function of the preset Poisson graph network model and the preset graph neural network model according to the node information, the model optimization module 602 is configured to:

input the node features, the adjacency matrix and the observation labels of the labeled nodes into a preset Poisson graph network model to obtain a first posterior probability distribution of node labels;

input the node features and the adjacency matrix into a preset graph neural network model to obtain a second posterior probability distribution of node labels;

and construct an overall loss function of the preset Poisson graph network model and the preset graph neural network model based on the first posterior probability distribution and the second posterior probability distribution.

On the basis of any of the above embodiments, when inputting the node features, the adjacency matrix, and the observation labels of the labeled nodes into a preset Poisson graph network model to obtain a first posterior probability distribution of node labels, the model optimization module 602 is configured to:

acquire attention coefficients between the nodes according to an attention mechanism based on the node features, and replace the edge weights in the adjacency matrix with them to obtain an attention map;

construct a label matrix of all nodes according to the observation labels of the labeled nodes, wherein the values corresponding to the unlabeled nodes are 0;

and input the node features, the attention map and the label matrix into the Poisson graph network model, and obtain the first posterior probability distribution of node labels through multiple Poisson convolutional layers of the Poisson graph network model.

On the basis of any of the above embodiments, when inputting the node features, the attention map, and the label matrix into the Poisson graph network model and obtaining the first posterior probability distribution of node labels through multiple Poisson convolutional layers of the Poisson graph network model, the model optimization module 602 is configured to:

for any one of the Poisson convolutional layers, acquire the output result of the current Poisson convolutional layer on the basis of the output result of the previous Poisson convolutional layer, the label matrix, and the diagonal matrix and Laplacian matrix of the attention map.

On the basis of any of the above embodiments, when acquiring, for any one of the Poisson convolutional layers, the output result of the current Poisson convolutional layer on the basis of the output result of the previous Poisson convolutional layer, the label matrix, and the diagonal matrix and Laplacian matrix of the attention map, the model optimization module 602 is configured to:

obtain the output result of the current Poisson convolutional layer through the following formula:

H^{(t)} = H^{(t-1)} + \hat{D}^{-1}\left(B^{T} - \hat{L}\,H^{(t-1)}\right)

where H^{(t)} denotes the output result of the t-th Poisson convolutional layer; \hat{D} and \hat{L} denote the diagonal matrix and the Laplacian matrix of the attention map, respectively; and B^T denotes the label matrix of all nodes, in which the values corresponding to the unlabeled nodes are 0.

On the basis of any of the above embodiments, the model optimization module 602 is further configured to:

perform label prediction according to the node features to obtain a label prediction result;

and input the label prediction result into at least one of the Poisson convolutional layers, and superpose it on the output result of that layer as the final output result of that layer.

On the basis of any one of the above embodiments, the graph neural network model is a graph attention network, and the attention coefficients in the graph neural network model adopt the attention coefficients between the nodes.

On the basis of any of the above embodiments, when constructing the overall loss function of the preset Poisson graph network model and the preset graph neural network model based on the first posterior probability distribution and the second posterior probability distribution, the model optimization module 602 is configured to:

acquire one or more of an evidence lower bound loss function, a contrastive loss function and a cross entropy loss function based on the first posterior probability distribution and the second posterior probability distribution, so as to construct an overall loss function of the preset Poisson graph network model and the preset graph neural network model according to the one or more of the evidence lower bound loss function, the contrastive loss function and the cross entropy loss function.

On the basis of any of the above embodiments, when acquiring one or more of the evidence lower bound loss function, the contrastive loss function, and the cross entropy loss function based on the first posterior probability distribution and the second posterior probability distribution, the model optimization module 602 is configured to:

determine a KL divergence between the first posterior probability distribution and the second posterior probability distribution, and determine the evidence lower bound loss function according to the KL divergence; and/or

determine a contrastive loss function between the Poisson graph network model and the graph neural network model based on contrastive learning according to the first posterior probability distribution and the second posterior probability distribution; and/or

determine a cross entropy loss function according to the first posterior probability distribution of the labeled nodes and the observation label data.

On the basis of any of the above embodiments, the model optimization module 602, when determining the KL divergence between the first posterior probability distribution and the second posterior probability distribution, and determining the evidence lower bound loss function according to the KL divergence, is configured to:

determining the evidence lower bound loss function $L_{ELBO}$ by the following formula:

$$L_{ELBO} = D_{KL}\left(q_{\phi}(Y_U \mid X, A, Y_L) \,\|\, p_{\theta}(Y_U \mid X, A)\right) - \mathbb{E}_{q_{\phi}(Y_U \mid X, A, Y_L)}\left[\log p_{\theta}(Y_L \mid X, A)\right]$$

where $q_{\phi}(Y_U \mid X, A, Y_L)$ is the first posterior probability of the labels of the unlabeled nodes, $p_{\theta}(Y_U \mid X, A)$ is the second posterior probability of the labels of the unlabeled nodes, $p_{\theta}(Y_L \mid X, A)$ is the second posterior probability of the labels of the labeled nodes, $X$ is the node features, $Y_L$ is the observed labels of the labeled nodes, $Y_U$ is the labels of the unlabeled nodes, $A$ is the adjacency matrix, $\theta$ and $\phi$ are the model parameters, and $D_{KL}(\cdot \,\|\, \cdot)$ denotes the KL divergence between two distributions.
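A toy PyTorch sketch of these two terms follows, under the simplifying assumption that both models output per-node class probabilities; the random tensors merely stand in for actual model outputs:

```python
import torch

n, c = 4, 2
q = torch.softmax(torch.randn(n, c), dim=1)    # q_phi(Y_U | X, A, Y_L): Poisson graph network
p = torch.softmax(torch.randn(n, c), dim=1)    # p_theta(Y_U | X, A): graph neural network

# KL(q || p) for per-node categorical distributions, averaged over nodes
kl = (q * (q.log() - p.log())).sum(dim=1).mean()

# log-likelihood of the observed labels Y_L under p_theta on the labeled nodes
y_L = torch.tensor([0, 1])                     # two labeled nodes
p_L = torch.softmax(torch.randn(2, c), dim=1)  # p_theta(Y_L | X, A)
loglik = p_L.log()[torch.arange(2), y_L].mean()

elbo_loss = kl - loglik                        # evidence lower bound loss
```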

On the basis of any of the above embodiments, the model optimization module 602, when determining the contrastive loss function between the Poisson graph network model and the graph neural network model based on contrastive learning according to the first posterior probability distribution and the second posterior probability distribution, is configured to:

obtaining the pairwise contrastive loss $\ell_i$ for each node according to the following formula:

$$\ell_i = -\log \frac{\exp\left(\langle z_i, \tilde{z}_i \rangle / \tau\right)}{\sum_{k \neq i} \exp\left(\langle z_i, \tilde{z}_k \rangle / \tau\right)}$$

and obtaining the contrastive loss function $L_{Cont}$ according to the pairwise contrastive losses $\ell_i$ as follows:

$$L_{Cont} = \frac{1}{n} \sum_{i=1}^{n} \ell_i$$

where $z_i$ and $\tilde{z}_i$ are the outputs of node $x_i$ in $q_{\phi}(Y \mid X, A, Y_L)$ and $p_{\theta}(Y \mid X, A)$ respectively, $z_k$ and $\tilde{z}_k$ are the outputs of node $x_k$ in $q_{\phi}(Y \mid X, A, Y_L)$ and $p_{\theta}(Y \mid X, A)$ respectively, node $x_k$ is any node other than node $x_i$, $\langle \cdot, \cdot \rangle$ denotes the inner product, $\tau$ is the temperature parameter, and $n$ is the number of nodes.
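This is the InfoNCE form of contrastive loss; a self-contained sketch (the function name and the random stand-in outputs are hypothetical) might look like:

```python
import torch

def contrastive_loss(Z1, Z2, tau=0.5):
    """InfoNCE-style loss: node i's outputs under the two models form
    a positive pair; every other node's output is a negative."""
    n = Z1.shape[0]
    sim = (Z1 @ Z2.T) / tau                               # <z_i, z~_k> / tau
    pos = sim.diagonal()                                  # positive pairs
    diag = torch.eye(n, dtype=torch.bool)
    neg = torch.logsumexp(sim.masked_fill(diag, float('-inf')), dim=1)
    return (neg - pos).mean()                             # mean of -log ratios

Z1 = torch.randn(4, 2)   # node outputs under q_phi (Poisson graph network)
Z2 = torch.randn(4, 2)   # node outputs under p_theta (graph neural network)
loss = contrastive_loss(Z1, Z2)
```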

On the basis of any of the foregoing embodiments, the model optimization module 602, when constructing the total loss function of the preset Poisson graph network model and the preset graph neural network model according to one or more of the evidence lower bound loss function, the contrastive loss function, and the cross-entropy loss function, is configured to:

determining the total loss function $L$ according to the following formula:

$$L = L_{ELBO} + \lambda_1 L_{CE}(Z_L, Y_L) + \lambda_2 L_{Cont}$$

where $L_{ELBO}$ is the evidence lower bound loss function, $L_{CE}(Z_L, Y_L)$ is the cross-entropy loss function, $Z_L$ is the predicted labels of the labeled samples $X_L$ obtained from $q_{\phi}$, $L_{Cont}$ is the contrastive loss function, and $\lambda_1$ and $\lambda_2$ are weighting coefficients.
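To make the combination concrete, a trivial sketch with placeholder scalars for the three terms (the λ values are purely illustrative hyperparameters):

```python
import torch

# placeholders for the three terms computed as sketched above
elbo_loss = torch.tensor(0.8, requires_grad=True)  # L_ELBO
ce_loss   = torch.tensor(0.5, requires_grad=True)  # L_CE(Z_L, Y_L)
cont_loss = torch.tensor(1.2, requires_grad=True)  # L_Cont

lam1, lam2 = 1.0, 0.1                              # illustrative coefficients
total = elbo_loss + lam1 * ce_loss + lam2 * cont_loss
total.backward()                                   # backpropagate the combined loss
```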

The label prediction device provided in the embodiment of the present application may be specifically configured to execute the method embodiments provided in fig. 2 to 5, and specific functions are not described herein again.

According to the label prediction device provided by the embodiments of the present application, node information of a graph structure is obtained, wherein the graph structure includes labeled nodes and unlabeled nodes, and the node information includes node features, an adjacency matrix, and the observed labels of the labeled nodes; a total loss function of a preset Poisson graph network model and a preset graph neural network model is obtained according to the node information, where the Poisson graph network model and the graph neural network model are used to estimate the posterior probability distribution of the labels of unlabeled nodes through variational inference; model parameters of the Poisson graph network model and the graph neural network model are optimized according to the total loss function; and the predicted labels of the unlabeled nodes are determined according to the node information and the optimized Poisson graph network model. The Poisson graph network effectively handles semi-supervised learning with extremely few labeled nodes, and the otherwise hard-to-solve posterior probability distribution is estimated through variational inference based on the Poisson graph network model and the graph neural network model, which jointly improves the learning capability of the models, gives them higher robustness and confidence, and thus allows the labels of unlabeled nodes to be predicted more accurately.

Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device provided in the embodiment of the present application may execute the processing flow provided in the embodiments of the label prediction method. As shown in fig. 7, the electronic device 70 includes a memory 71, a processor 72, and a computer program, wherein the computer program is stored in the memory 71 and configured to be executed by the processor 72 to perform the label prediction method described in the above embodiments. The electronic device 70 may also have a communication interface 73 for transmitting control commands and/or data.

The electronic device in the embodiment shown in fig. 7 may be configured to execute the technical solution of the above label prediction method embodiments; the implementation principle and technical effect are similar and are not described herein again.

In addition, the present embodiment also provides a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the method of the above embodiment.

In addition, the present embodiment also provides a computer program product, which includes a computer program, and the computer program is executed by a processor to implement the method of the above embodiment.

In the several embodiments provided in the embodiments of the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.

The above embodiments are only used for illustrating the technical solutions of the embodiments of the present application, and are not limited thereto; although the embodiments of the present application have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims. It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
