Network representation learning algorithm across medical data sources

Document No.: 1942818 Publication date: 2021-12-07

Summary: Network representation learning algorithm across medical data sources (跨医疗数据源的网络表示学习算法), created by Wang Chaokun (王朝坤), Yan Bencheng (严本成), Lou Yunkai (楼昀恺), Shi Gengyuan (石耕源), Chen Jun (陈俊), Huang Haifeng (黄海峰) and Lu Chao (陆超) on 2020-04-03. A network representation learning algorithm across medical data sources, comprising: S1, generating medical network data comprising a source network and a target network; S2, randomly sampling a set number of nodes from the source network and the target network; S3, obtaining an L-layer neural network, calculating for each layer the structural features and expression features of the source network and the target network, and calculating the distance loss between the network features of the source network and the target network; S4, obtaining the output of the source network from the L-layer neural network, calculating a loss value from the classification loss and the distance loss, and updating the parameters of the algorithm by back propagation; S5, repeating steps S2-S4 until the whole algorithm converges, so that the accuracy of the algorithm for disease classification no longer rises over several iterations. Beneficial effects: the algorithm accounts for the inconsistent data distributions of different hospital data sources, compensates for information loss by extracting the structural information and node attribute information of the network and minimizing the feature distance, and has broad application prospects.

A network representation learning algorithm across medical data sources, comprising:

S1, generating medical network data including a source network and a target network, wherein the source network is generated from the treatment records of one hospital, the target network is generated from the treatment records of a different hospital, the medical network data includes the treatment record information of patients, and network relations among symptoms, diseases, medicines and diagnosis methods are constructed;

S2, randomly sampling a set number of nodes from the source network and from the target network, wherein the number of sampled nodes is related to the degree of the medical network;

S3, obtaining an L-layer neural network from step S2, calculating for each layer the structural features and expression features of the source network and the target network, and calculating the distance loss between the network features of the source network and the target network;

S4, obtaining the output of the source network from the L-layer neural network of S3, calculating a loss value from the classification loss and the distance loss, and updating the parameters of the algorithm by back propagation;

S5, repeating steps S2-S4 until the whole algorithm converges, so that the accuracy of the algorithm for disease classification no longer rises over several iterations.
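The stopping rule in S5 can be sketched as a patience-based loop. This is a minimal illustration, not the patented implementation: the function name `train_until_converged`, its callback parameters, and the `patience` and `max_iters` values are all hypothetical, and the three callbacks stand in for the sampling, training and evaluation logic of S2-S4.

```python
def train_until_converged(sample_batch, train_step, evaluate, patience=3, max_iters=1000):
    """Repeat S2-S4 until accuracy has not risen for `patience` iterations."""
    best_accuracy = float("-inf")
    stale = 0          # iterations since the last improvement
    history = []
    for _ in range(max_iters):
        batch = sample_batch()      # S2: sample nodes from both networks
        train_step(batch)           # S3-S4: forward pass, loss, back propagation
        accuracy = evaluate()       # disease-classification accuracy
        history.append(accuracy)
        if accuracy > best_accuracy:
            best_accuracy, stale = accuracy, 0
        else:
            stale += 1
            if stale >= patience:   # S5: accuracy has stopped rising
                break
    return best_accuracy, history


# Stub callbacks (assumptions) that replay a fixed accuracy curve.
fake_accuracies = iter([0.5, 0.6, 0.7, 0.7, 0.69, 0.7, 0.9])
best, history = train_until_converged(
    sample_batch=lambda: None,
    train_step=lambda batch: None,
    evaluate=lambda: next(fake_accuracies),
    patience=3,
)
```

With this curve the loop stops after the sixth iteration, because the accuracy fails to exceed 0.7 three times in a row; the later value 0.9 is never reached.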

The network representation learning algorithm across medical data sources as claimed in claim 1, wherein step S3, obtaining an L-layer neural network from step S2, calculating for each layer the structural features and expression features of the source network and the target network, and calculating the distance loss between the network features of the source network and the target network, comprises:

S30, inputting the node features of the source network and the target network into the L-layer neural network;

S31, in each layer of the L-layer neural network, passing the node expression feature vectors of each network through a message routing module to obtain the structural features, and passing the structural features through a message aggregation module to obtain the new expression feature vector of the current node;

S32, calculating the distance loss value between the node features of the source network and the target network at the current layer through a cross-medical-data-source network alignment module;

and S33, repeating steps S31 to S32 L times to obtain the final node feature vectors of the source network and the target network, together with the structural feature distance loss and the expression feature distance loss accumulated over the L layers.
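The control flow of S30-S33 can be sketched as a loop that runs both networks through the same layers and accumulates the two distance losses. This is a minimal sketch under stated assumptions: `forward_with_alignment`, `toy_layer` and `mean_gap` are hypothetical names, node features are single numbers rather than vectors, and the toy layer and distance are simple stand-ins for the routing, aggregation and alignment modules.

```python
def forward_with_alignment(source, target, layers, distance):
    """Run both networks through the shared layers, accumulating distance losses."""
    h_src, adj_src = source                    # S30: input node features
    h_tgt, adj_tgt = target
    structural_loss = 0.0
    expression_loss = 0.0
    for layer in layers:                       # S33: repeat L times
        r_src, h_src = layer(h_src, adj_src)   # S31: structural, then expression features
        r_tgt, h_tgt = layer(h_tgt, adj_tgt)
        structural_loss += distance(r_src, r_tgt)   # S32: per-layer distances
        expression_loss += distance(h_src, h_tgt)
    return h_src, h_tgt, structural_loss, expression_loss


def toy_layer(h, adj):
    """Stand-in layer: structural feature = neighbor mean, new feature = average with it."""
    r = {v: sum(h[u] for u in adj[v]) / len(adj[v]) for v in h}
    h_new = {v: 0.5 * (h[v] + r[v]) for v in h}
    return r, h_new


def mean_gap(a, b):
    """Stand-in distance: absolute difference of the two feature means."""
    return abs(sum(a.values()) / len(a) - sum(b.values()) / len(b))


# Two tiny two-node networks with scalar features (made-up values).
source = ({0: 1.0, 1: 3.0}, {0: [1], 1: [0]})
target = ({0: 0.0, 1: 2.0}, {0: [1], 1: [0]})
h_src, h_tgt, structural_loss, expression_loss = forward_with_alignment(
    source, target, [toy_layer, toy_layer], mean_gap
)
```

Passing the layers and the distance in as callables keeps the loop independent of how the routing, aggregation and alignment modules are actually defined.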

The network representation learning algorithm across medical data sources as claimed in claim 2, wherein step S31, in which, in each layer of the L-layer neural network, the node expression feature vectors of each network pass through a message routing module to obtain the structural features, and the structural features pass through a message aggregation module to obtain the new expression feature vector of the current node, comprises:

the message routing module of each layer is represented as:

[formula not reproduced in this text]

where the symbols denote, in order: the structural feature vector of node i at the l-th layer of the L-layer neural network; the expression feature vectors of the source network and the target network at layer l-1, the layer-0 expression feature vector being the original feature vector x_i of the node; the parameter matrix of the layer-l message routing module; a^(l)T, a further parameter matrix of the layer-l message routing module; σ, an activation function; ||, the concatenation of two vectors; N(v), the set of neighbors directly connected to node v; and the message weight passed from node u to node v;

the message aggregation module of each layer is represented as:

[formula not reproduced in this text]

where the first two symbols are the parameter matrices of the message aggregation module, and the third is a vector representing the node aggregation level.
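The symbols listed above (a parameter matrix, a vector a^(l)T applied to a concatenation, an activation σ, a neighbor set N(v) and per-neighbor message weights) resemble an attention-style message-passing layer. The sketch below assumes that form; it is not the patent's formula. The function names, the tanh and ReLU activations, the softmax normalization and the matrices `W`, `W_self` and `W_struct` are all assumptions made for illustration.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def route_messages(h, adj, W, a):
    """Structural feature r_v: weighted sum of transformed neighbor features."""
    z = {v: matvec(W, h[v]) for v in h}
    r = {}
    for v in h:
        # Score each neighbor u from the concatenation z_u || z_v (assumed form).
        scores = [math.tanh(sum(ai * zi for ai, zi in zip(a, z[u] + z[v])))
                  for u in adj[v]]
        total = sum(math.exp(s) for s in scores)
        weights = [math.exp(s) / total for s in scores]   # message weights, sum to 1
        r[v] = [sum(w * z[u][k] for w, u in zip(weights, adj[v]))
                for k in range(len(z[v]))]
    return r


def aggregate(h, r, W_self, W_struct):
    """New expression feature: ReLU(W_self h_v + W_struct r_v) (assumed form)."""
    return {v: [max(0.0, s + t)
                for s, t in zip(matvec(W_self, h[v]), matvec(W_struct, r[v]))]
            for v in h}


# Made-up three-node network; identity matrices and a zero attention vector
# make every neighbor weight equal, so the result is easy to check by hand.
identity = [[1.0, 0.0], [0.0, 1.0]]
h = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
r = route_messages(h, adj, identity, [0.0, 0.0, 0.0, 0.0])
h_new = aggregate(h, r, identity, identity)
```

In this toy run node "a" receives the plain average of its two neighbors as its structural feature, and its new expression feature adds that average to its own feature.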

The network representation learning algorithm of claim 3, wherein the step S32 of calculating, by the network alignment module across the medical data sources, the distance loss value between the node features of the current layer from the source network and the target network comprises:

the structural feature distance of each layer is:

[formula not reproduced in this text]

where P_r and Q_r are the distributions of the structural feature vectors of the source network and the target network, respectively, and the distance function computes the expected distance between these structural feature vectors;

the expression feature distance loss of each layer is:

[formula not reproduced in this text]

where P_a and Q_a are the distributions of the node expression feature vectors of the source network and the target network, respectively, and the distance function computes the expected distance between these node expression feature vectors.
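The text does not say which distance function is used between the source and target feature distributions. As one possible stand-in, the sketch below uses the squared distance between the two sample means, which equals the squared maximum mean discrepancy with a linear kernel. The function names and the sample vectors are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def linear_mmd(source_vectors, target_vectors):
    """Squared distance between the two sample means (linear-kernel MMD^2)."""
    mu_s = mean_vector(source_vectors)
    mu_t = mean_vector(target_vectors)
    return sum((s - t) ** 2 for s, t in zip(mu_s, mu_t))


# Made-up feature vectors for nodes sampled from each network.
source_feats = [[1.0, 2.0], [3.0, 4.0]]   # mean is [2.0, 3.0]
target_feats = [[0.0, 0.0], [2.0, 2.0]]   # mean is [1.0, 1.0]
gap = linear_mmd(source_feats, target_feats)
same = linear_mmd(source_feats, source_feats)
```

The same function could be applied once to the structural feature vectors and once to the expression feature vectors of a layer. A kernel-based or adversarial distance could be swapped in without changing the surrounding loop.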

The network representation learning algorithm of claim 4, wherein the step S33 of repeating steps S31 to S32 L times to obtain the final node feature vectors of the source network and the target network, together with the structural feature distance loss and the expression feature distance loss accumulated over the L layers, comprises:

the cumulative structural feature distance loss over the L layers is:

[formula not reproduced in this text]

the cumulative expression feature distance loss over the L layers is:

[formula not reproduced in this text]
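Read literally, accumulating over the L layers suggests a plain sum of the per-layer losses, and step S4 then combines the result with the classification loss. A sketch in LaTeX under that assumption follows. The symbols $\mathcal{L}_r^{(l)}$, $\mathcal{L}_a^{(l)}$ and $\mathcal{L}_{cls}$ are assumed notation for the per-layer structural loss, the per-layer expression loss and the classification loss. The weights $\lambda_r$ and $\lambda_a$ are hypothetical, since the text does not state how the terms are balanced.

```latex
\mathcal{L}_r = \sum_{l=1}^{L} \mathcal{L}_r^{(l)}, \qquad
\mathcal{L}_a = \sum_{l=1}^{L} \mathcal{L}_a^{(l)}, \qquad
\mathcal{L} = \mathcal{L}_{cls} + \lambda_r \mathcal{L}_r + \lambda_a \mathcal{L}_a
```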
