Community division method of social network based on node cluster

Document No.: 1922979. Published: 2021-12-03.

Reading note: this technique, "Community division method of social network based on node cluster" (一种基于节点簇的社交网络的社区划分方法), was designed by 王爱莲, 王星魁, 崔璐, 张林娟 and 刘春莲 on 2021-09-10.

Abstract: The invention discloses a community division method for social networks based on node clusters. The method first calculates the similarity among the node clusters in the social network to obtain the most similar node cluster of each node cluster. It then updates the attributes of the node clusters, namely their states and their node sets, according to the similarity relation between each node cluster and its most similar node cluster, the state of each node cluster, and the node-cluster closed condition. When all node clusters in the social network are in the closed state, each node cluster is set as a community of the social network. Compared with existing community discovery algorithms, the method is better suited to large, complex networks and has lower time complexity, and in practical comparisons it achieved higher accuracy.

1. A community division method of a social network based on a node cluster is characterized by comprising the following steps:

Proc1:

Step 1: initializing each node as an independent node cluster, and setting the state of each node cluster to the default state;

Step 2: calculating the similarity among the node clusters, and recording the most similar node cluster of each node cluster according to the similarity; a node cluster comprises at least one node; the most similar node cluster of a given node cluster is the one with the highest similarity to it among the node clusters whose similarity to it is larger than a preset value;

Step 3: performing Proc2 for each node cluster;

Step 4: if every node cluster is in the closed state, ending and setting each node cluster in the social network as a community of the social network; otherwise executing Step 2;

Proc2:

Step 5: if the node cluster is in the closed state, ending;

Step 6: if the node cluster is in the following state, ending;

Step 7: if the most similar node cluster of the node cluster is C and the most similar node cluster of C is the node cluster, that is, the two are mutually most similar, merging the node cluster with C, traversing the following linked lists of the node cluster and of C in turn, and merging in all node clusters on those lists; after the merge, evaluating the closed condition, and setting the state to the closed state if the condition is satisfied and to the default state otherwise; the closed condition is that the intra-cluster average connection density of the node cluster is greater than the inter-cluster average connection density;

Step 8: if the most similar node cluster of the node cluster is C but the most similar node cluster of C is not the node cluster, that is, the two are not mutually most similar, the node cluster enters the following state and is added to the following linked list of C.

2. The community division method according to claim 1, wherein the complex network includes three basic entities of a node, a node cluster and a community, which are in a progressive relationship; the node is the most original node in the complex network; the node cluster is an entity formed by combining similar node clusters with smaller scale; the nodes are combined to form a node cluster, and then the node cluster forms a community, wherein the community is a final node cluster after the node clusters cannot be combined with each other.

3. The community partitioning method according to claim 1, wherein the state of a node cluster describes how the node cluster interacts with other node clusters, and includes four states: a. a default state, b. a closed state, c. a following state, and d. an expanded state; a single node cluster is in the default state when it has just been initialized, and its state is restored to the default state after its expansion is finished; the closed state indicates that the internal connections of the node cluster are relatively tight and that it can neither follow other node clusters nor be merged by other node clusters, the node cluster at that moment being a final community.

4. The community division method according to claim 1, wherein the similarity describes the degree of similarity between node clusters, and is measured as the proportion of the neighbor nodes shared by two node clusters among all the neighbor nodes of the two node clusters; the formula is shown in formula 1-1:

$$S = \frac{|N_i \cap N_j|}{|N_i \cup N_j|} \tag{1-1}$$

wherein $N_i$ is the neighbor node set of node cluster i.

5. The community division method according to claim 1, wherein the intra-cluster average connection density is the average, over the nodes in the cluster, of the ratio of each node's connections inside the cluster to all of its connections; the formula is shown in formula 1-2:

$$\rho = \frac{1}{N}\sum_{i=1}^{N}\frac{C_{i1}}{C_{i2}} \tag{1-2}$$

wherein $C_{i1}$ is the number of connections of node i within the node cluster, $C_{i2}$ is the total number of connections of node i, and N is the total number of nodes contained in the node set of the node cluster.

6. The community division method according to claim 1, wherein the inter-cluster average connection density is the average of the intra-cluster average connection densities of the respective node clusters; the formula is shown in formula 1-3:

$$\bar{\rho} = \frac{1}{M}\sum_{i=1}^{M}\rho_i \tag{1-3}$$

wherein $\rho_i$ is the intra-cluster average connection density of the i-th node cluster, and M is the total number of current node clusters.

7. The community partitioning method according to claim 1, wherein modularity is adopted as the community division quality index, and the modularity is defined as shown in formula 1-4:

$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(C_i, C_j) \tag{1-4}$$

wherein $A_{ij}$ indicates whether node i and node j are adjacent, with $A_{ij} = 1$ if they are and $A_{ij} = 0$ otherwise; in the corresponding reference network, the probability that an edge (i, j) exists is $k_i k_j / (2m)$; m is the total number of connections; $k_i$ is the degree of node i; and $\delta(C_i, C_j)$ is a binary function whose value is 1 when node i and node j belong to the same community and 0 otherwise.

Technical Field

The invention relates to the technical field of computer online social networks, in particular to a community division method of a social network based on a node cluster.

Background

In recent years, with the explosive growth of social networking services represented by Facebook and Twitter, research on online social networks has become a challenging and promising field, and community discovery is an important part of social network analysis. Research on the community structure of a social network helps in understanding the characteristics of the network topology, discovering how users aggregate and what influences them, revealing the inherent functional characteristics of complex networks, and understanding the relationships, behavior and trends of individuals within a community. It also promotes applications on social networks such as information propagation, information recommendation, and the management and control of public security events.

Relationships in a social network are unevenly distributed: some individuals are densely connected with one another while connections between other individuals are sparse, and this gives rise to community structure. Network nodes are the mapping of people in real society onto the virtual network, and the edges of the network represent communication between network users. Since a social network can be regarded as a collection of node sets with high cohesion in the network topology, a community structure can be defined as a number of subsets of the node set of a complex network, such that the nodes within each subset are densely connected to each other while the edges between nodes in different subsets are relatively sparse, as shown in FIG. 1.

To measure the quality of a community structure, Mark Newman et al. defined modularity, by analogy with the definition of variance, from the difference between the actual network topology and a random network topology. The modularity index measures the quality of a community division using global information about the network topology. Communities can also be measured by node similarity: nodes within a community are assumed to be similar, nodes in different communities are assumed to have low similarity, and the community structure is divided accordingly.

In essence, community discovery in an online social network is the process of dividing the network nodes into subgraphs according to how tightly they are connected in the underlying topology. In computer science, with the help of mathematical tools such as graph theory, the problem of finding tightly connected sets of nodes in a graph was formulated as a graph partitioning problem, and representative algorithms based on greedy optimization strategies and the spectral bisection method appeared. Since the beginning of the 21st century, with the development of complex network research, community discovery has attracted the attention of experts in many fields. The GN algorithm, a new divisive algorithm, opened this line of development, and the concept of modularity proposed in that work became an important index for evaluating community discovery. Since then, many scholars have proposed community discovery algorithms based on optimization theory that take modularity as the objective function. Shi Chuan et al. proposed a community discovery algorithm based on multi-objective optimization, which uses a genetic algorithm combined with multiple objective functions to characterize community structure and partition communities. Gergely Palla et al. approached the community discovery problem through the notion of cliques, emphasizing nodes that belong to overlapping communities and lie on community boundaries. The Infomap algorithm uses data coding from information theory to convert the network topology into a coding structure and thereby discover communities. There are also community discovery algorithms based on probabilistic models, which partition communities by maximum likelihood analysis.

These algorithms have relatively high time complexity and focus more on the accuracy of the community structure, so they are suited to community discovery in small social networks. In larger social networks, where node information is highly varied, many scholars have sought community discovery algorithms with higher accuracy and lower time complexity, and have proposed new algorithms from different perspectives.

According to their mechanism, community discovery algorithms can be divided into two kinds: static computation and dynamic computation.

In static computation, every node in the network is analyzed at each step, and a division is accepted as the final result when the chosen assignment of all nodes satisfies a global optimization objective. In dynamic computation, the algorithm starts from local nodes, updates node states according to specific rules, and derives the final global division step by step.

Disclosure of Invention

The invention aims to solve the technical problem of providing a community division method of a social network based on a node cluster aiming at the defects of the prior art.

The technical scheme of the invention is as follows:

A community division method of a social network based on a node cluster comprises the following steps:

Proc1:

Step 1: initializing each node as an independent node cluster, and setting the state of each node cluster to the default state;

Step 2: calculating the similarity among the node clusters, and recording the most similar node cluster of each node cluster according to the similarity; a node cluster comprises at least one node; the most similar node cluster of a given node cluster is the one with the highest similarity to it among the node clusters whose similarity to it is larger than a preset value;

Step 3: executing Proc2 on each node cluster;

Step 4: if every node cluster is in the closed state, ending and setting each node cluster in the social network at that moment as a community of the social network; otherwise executing Step 2;

Proc2:

Step 5: if the node cluster is in the closed state, ending;

Step 6: if the node cluster is in the following state, ending;

Step 7: if the most similar node cluster of the node cluster is C and the most similar node cluster of C is the node cluster, that is, the two are mutually most similar, merging the node cluster with C, traversing the following linked lists of the node cluster and of C in turn, and merging in all node clusters on those lists; after the merge, evaluating the closed condition, and setting the state to the closed state if the condition is satisfied and to the default state otherwise; the closed condition is that the intra-cluster average connection density of the node cluster is greater than the inter-cluster average connection density;

Step 8: if the most similar node cluster of the node cluster is C but the most similar node cluster of C is not the node cluster, that is, the two are not mutually most similar, the node cluster enters the following state and is added to the following linked list of C.

In the community division method, the complex network comprises three basic entities, namely nodes, node clusters and communities, which are in progressive relation; the node is the most original node in the complex network; the node cluster is an entity formed by combining similar node clusters with smaller scale; the nodes are combined to form a node cluster, and then the node cluster forms a community, wherein the community is a final node cluster after the node clusters cannot be combined with each other.

In the community division method, the state of a node cluster describes how the node cluster interacts with other node clusters, and includes four states: a. a default state, b. a closed state, c. a following state, and d. an expanded state; a single node cluster is in the default state when it has just been initialized, and its state is restored to the default state after its expansion is finished; the closed state indicates that the internal connections of the node cluster are relatively tight and that it can neither follow other node clusters nor be merged by other node clusters, the node cluster at that moment being a final community.

In the community division method, the similarity describes the degree of similarity between node clusters, and is measured as the proportion of the neighbor nodes shared by two node clusters among all the neighbor nodes of the two node clusters; the formula is shown in formula 1-1:

$$S = \frac{|N_i \cap N_j|}{|N_i \cup N_j|} \tag{1-1}$$

wherein $N_i$ is the neighbor node set of node cluster i.

In the community division method, the intra-cluster average connection density is the average, over the nodes in the cluster, of the ratio of each node's connections inside the cluster to all of its connections; the formula is shown in formula 1-2:

$$\rho = \frac{1}{N}\sum_{i=1}^{N}\frac{C_{i1}}{C_{i2}} \tag{1-2}$$

wherein $C_{i1}$ is the number of connections of node i within the node cluster, $C_{i2}$ is the total number of connections of node i, and N is the total number of nodes contained in the node set of the node cluster.

In the community division method, the inter-cluster average connection density is the average of the intra-cluster average connection densities of the respective node clusters; the formula is shown in formula 1-3:

$$\bar{\rho} = \frac{1}{M}\sum_{i=1}^{M}\rho_i \tag{1-3}$$

wherein $\rho_i$ is the intra-cluster average connection density of the i-th node cluster, and M is the total number of current node clusters.

The community division method adopts modularity as the community division quality index, and the modularity is defined as shown in formula 1-4:

$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(C_i, C_j) \tag{1-4}$$

wherein $A_{ij}$ indicates whether node i and node j are adjacent, with $A_{ij} = 1$ if they are and $A_{ij} = 0$ otherwise; in the corresponding reference network, the probability that an edge (i, j) exists is $k_i k_j / (2m)$; m is the total number of connections; $k_i$ is the degree of node i; and $\delta(C_i, C_j)$ is a binary function whose value is 1 when node i and node j belong to the same community and 0 otherwise.

The invention provides a recursive merging community discovery algorithm based on node clusters, which comprises the steps of firstly calculating the similarity among the node clusters in a social network to obtain the most similar node cluster of each node cluster; and then updating the attributes of the node clusters in the social network according to the similarity relation between each node cluster and the most similar node cluster of each node cluster, the state of each node cluster and the closed condition of the node clusters, namely updating the state of the node clusters and the node set of the node clusters, and when all the node clusters in the social network are in a closed state, respectively setting each node cluster in the social network as a community in the social network. Because the node cluster is formed by at least one node, and the community is formed by the node cluster, the complex network is abstracted into the node, the node cluster and the community, so that the community discovery process is simplified.

The method of the invention is compared with two typical existing algorithms: the label propagation algorithm and the cellular automata learning algorithm. The label propagation algorithm predicts the labels of unlabeled nodes from the labels of labeled nodes; the cellular automata-based algorithm describes the structural characteristics of communities from several aspects; both obtain the community division by optimizing modularity. The node-cluster-based recursive merging algorithm instead merges existing node clusters step by step by checking whether they satisfy the preset merging conditions, so that communities are divided quickly. In experiments on the commonly used Zachary karate club network, the dolphin network and the American college football network, the community division obtained by the node cluster aggregation algorithm was more stable and was reached faster than with the label propagation algorithm and the cellular automata learning algorithm, and it achieved the highest modularity on all three data sets, which indicates that it is more effective for community division than the other two algorithms.

Drawings

FIG. 1 is a simple social network;

FIG. 2 is the community structure of node clusters;

FIG. 3 is the community structure of the Football network;

FIG. 4 is the community structure of the karate Club network;

FIG. 5 is the community structure of the Dolphins network;

FIG. 6 is a comparison of different algorithms for Football networks;

FIG. 7 is a comparison of different algorithms for a karate club network;

FIG. 8 is a comparison of different algorithms for dolphin networks;

Detailed Description

The present invention will be described in detail with reference to specific examples.

Example 1

Recursive merging community discovery algorithm based on node cluster

Take a human social network as an example: the employees of one company form a social network that can be divided into virtual communities by department, and the virtual community of each department can be further divided into sub-communities by project group. In physical terms, a community structure represents a set of elements in a complex system or complex network that have the same or similar functions; these elements interact or cooperate with each other to form a relatively independent organizational structure within the whole system, or jointly carry out a relatively independent function of the system. Discovering such structure is important for understanding the topological characteristics of the whole network and for mining the function of each of its organizational modules. To address this problem, many scholars have proposed virtual community discovery algorithms based on node similarity, such as similarity-based agglomerative algorithms. The EAGLE algorithm adopts an agglomerative framework: it finds all maximal cliques with a maximal clique search technique and then merges these maximal sub-communities. Another approach uses a vector similarity formula based on the cosine formula: each network node is treated as a community containing one node, tightly connected nodes within a local range are found and merged through the concept of the locally strongest edge, and each merged community then takes part in subsequent merging as a virtual node. These algorithms have relatively high complexity and are not very stable.

As shown in fig. 2, the present invention focuses on abstract modeling of node, node cluster, and connection relationship in a complex network, and proposes three basic entities: the first is a node, namely the most original node in a complex network; secondly, a node cluster is an entity formed by combining similar node clusters with smaller scale; and the third is community. The three are in progressive relation, the nodes are combined to form a node cluster, and then the node cluster forms a community, wherein the community is the final node cluster after the node clusters cannot be combined with each other.

The smallest-scale node cluster is a single node, and since a node can be regarded as a node cluster formed by a single node, the relationship between nodes or the relationship between a node and a node cluster is not considered independently, but only the relationship between node clusters is considered.

Properties and parameters of a node cluster

The main attributes of a node cluster are its state, its neighbor node set and its node set. The state of a node cluster describes how it interacts with other node clusters, and takes one of four values: a. default state, b. closed state, c. following state, d. expanded state. A single node cluster is in the default state when it has just been initialized, and its state is restored to the default state after its expansion is finished. The closed state indicates that the internal connections of the node cluster are relatively tight and that it can neither follow other node clusters nor be merged by other node clusters; the node cluster at that moment is a final community. When a node cluster in the default state finds its most similar node cluster C, but the most similar node cluster of C is not the current node cluster, the current node cluster enters the following state. It no longer merges with other node clusters on its own; it is placed on the following linked list of C and is merged together with C when C merges, whether that happens directly or after C has first merged with other node clusters. When a node cluster in the default state and node cluster C are mutually most similar, the node cluster switches to the expanded state and merges with C.
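The attributes and states described above could be represented as follows. This is only an illustrative sketch: the names `ClusterState`, `NodeCluster` and `followers` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ClusterState(Enum):
    DEFAULT = auto()    # just initialised, or expansion has finished
    CLOSED = auto()     # final community: neither follows nor is merged
    FOLLOWING = auto()  # waiting on the following list of another cluster
    EXPANDED = auto()   # currently merging with its mutually most similar cluster


@dataclass
class NodeCluster:
    nodes: set                                     # node set of the cluster
    neighbors: set = field(default_factory=set)    # neighbor node set
    state: ClusterState = ClusterState.DEFAULT     # state after initialisation
    followers: list = field(default_factory=list)  # stands in for the following linked list


# A freshly initialised single-node cluster starts in the default state.
cluster = NodeCluster(nodes={1})
```

A plain Python list is used here in place of a linked list purely for brevity.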

The index parameters of a node cluster include: similarity, intra-cluster average connection density, and inter-cluster average connection density.

The similarity describes the degree of similarity between node clusters, and is measured as the proportion of the neighbor nodes shared by two node clusters among all the neighbor nodes of the two node clusters. The formula is shown in formula 1-1:

$$S = \frac{|N_i \cap N_j|}{|N_i \cup N_j|} \tag{1-1}$$

wherein $N_i$ is the neighbor node set of node cluster i.
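As a small worked illustration of formula 1-1, the similarity can be computed directly on two neighbor sets. The function name `similarity` and the convention of returning 0 when both sets are empty are assumptions made for this sketch.

```python
def similarity(neighbors_i: set, neighbors_j: set) -> float:
    """Share of common neighbors among all neighbors of two node clusters."""
    union = neighbors_i | neighbors_j
    if not union:
        # Assumption: two clusters without any neighbors have similarity 0.
        return 0.0
    return len(neighbors_i & neighbors_j) / len(union)


# Two clusters sharing neighbors 3 and 4 out of the four neighbors 2, 3, 4, 5.
n_a = {2, 3, 4}
n_b = {3, 4, 5}
s = similarity(n_a, n_b)  # 2 / 4 = 0.5
```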

The intra-cluster average connection density is the average, over the nodes in the cluster, of the ratio of each node's connections inside the cluster to all of its connections. The formula is shown in formula 1-2:

$$\rho = \frac{1}{N}\sum_{i=1}^{N}\frac{C_{i1}}{C_{i2}} \tag{1-2}$$

wherein $C_{i1}$ is the number of connections of node i within the node cluster, $C_{i2}$ is the total number of connections of node i, and N is the total number of nodes contained in the node set of the node cluster.
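The intra-cluster average connection density can be sketched on an adjacency-set representation of the graph. The function name `intra_cluster_density`, the `adj` dictionary layout and the treatment of isolated nodes (ratio 0) are assumptions for illustration only.

```python
def intra_cluster_density(adj: dict, cluster: set) -> float:
    """Mean over cluster nodes of (links inside the cluster) / (all links)."""
    if not cluster:
        return 0.0
    ratios = []
    for node in cluster:
        total = len(adj[node])             # all connections of the node
        inside = len(adj[node] & cluster)  # connections that stay in the cluster
        # Assumption: a node without any connection contributes a ratio of 0.
        ratios.append(inside / total if total else 0.0)
    return sum(ratios) / len(ratios)


# Triangle 1-2-3 with one extra edge 3-4 leaving the cluster {1, 2, 3}.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
rho = intra_cluster_density(adj, {1, 2, 3})  # (1 + 1 + 2/3) / 3 = 8/9
```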

The inter-cluster average connection density is the average of the intra-cluster average connection densities of the respective node clusters. The formula is shown in formula 1-3:

$$\bar{\rho} = \frac{1}{M}\sum_{i=1}^{M}\rho_i \tag{1-3}$$

wherein $\rho_i$ is the intra-cluster average connection density of the i-th node cluster, and M is the total number of current node clusters.
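Given the intra-cluster densities of all current node clusters, the inter-cluster average and the closed condition used by the method (intra-cluster density greater than the inter-cluster average) reduce to a few lines. The function names and the sample density values below are hypothetical.

```python
def inter_cluster_density(intra_densities: list) -> float:
    """Average of the intra-cluster densities of all current node clusters."""
    if not intra_densities:
        return 0.0
    return sum(intra_densities) / len(intra_densities)


def is_closed(rho: float, intra_densities: list) -> bool:
    """Closed condition: the cluster's own density exceeds the average."""
    return rho > inter_cluster_density(intra_densities)


# Illustrative densities of three clusters; their average is about 0.6.
densities = [0.9, 0.5, 0.4]
closed_flags = [is_closed(r, densities) for r in densities]
```

With these values only the first cluster satisfies the closed condition. Because the comparison is strict, a cluster whose density merely equals the average stays open.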

Community division quality index

Modularity was originally proposed by Mark Newman and is a commonly used measure of the strength of community structure in a network. It evaluates the quality of a community division by comparing the connection density of the actual network with that of a reference network under the same community division. The modularity value depends mainly on how the nodes of the network are assigned to communities; the closer the value is to 1, the higher the quality of the division.

The modularity is defined as shown in formula 1-4:

$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(C_i, C_j) \tag{1-4}$$

$A_{ij}$ indicates whether node i and node j are adjacent: $A_{ij} = 1$ if they are, and $A_{ij} = 0$ otherwise. In the corresponding reference network, the probability that an edge (i, j) exists is $k_i k_j / (2m)$, where m is the total number of connections and $k_i$ is the degree of node i. $\delta(C_i, C_j)$ is a binary function whose value is 1 when node i and node j belong to the same community and 0 otherwise.

The formula expresses the difference between the fraction of edges that fall within the same community in the network and the expected value of that fraction in a reference network under the same community division. The higher the modularity value, the better the community division of the complex network.
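The modularity definition can be evaluated directly as a double sum over node pairs. The sketch below assumes a simple undirected graph without self-loops, given as an edge list; the names `modularity`, `edges` and `community_of` are illustrative.

```python
def modularity(edges: list, community_of: dict) -> float:
    """Modularity of a division, summed over all ordered node pairs (i, j)."""
    m = len(edges)  # total number of connections
    if m == 0:
        return 0.0
    adjacent = {frozenset(e) for e in edges}
    degree = {n: 0 for n in community_of}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = 0.0
    for i in community_of:
        for j in community_of:
            if community_of[i] != community_of[j]:
                continue  # delta(C_i, C_j) = 0
            a_ij = 1.0 if i != j and frozenset((i, j)) in adjacent else 0.0
            total += a_ij - degree[i] * degree[j] / (2 * m)
    return total / (2 * m)


# Two triangles joined by the single edge 3-4, divided into their triangles.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
split = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
q = modularity(edges, split)  # 5/14, about 0.357
```

Placing all six nodes in a single community gives a modularity of 0 for the same graph, which illustrates why a higher value indicates a better division.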

Description of the procedures

Proc1:

Step 1: initialize each node as an independent node cluster and set its state to the default state.

Step 2: calculate the similarity among the node clusters, and record the most similar node cluster of each node cluster according to the similarity. A node cluster comprises at least one node. The most similar node cluster of a given node cluster is the one with the highest similarity to it among the node clusters whose similarity to it is larger than a preset value.

Step 3: execute Proc2 for each node cluster.

Step 4: if every node cluster is in the closed state, end and set each node cluster in the social network at that moment as a community of the social network; otherwise, execute Step 2.

Proc2:

Step 5: if the node cluster is in the closed state, end.

Step 6: if the node cluster is in the following state, end.

Step 7: if the most similar node cluster of the node cluster is C and the most similar node cluster of C is the node cluster, that is, the two are mutually most similar, merge the node cluster with C, traverse the following linked lists of the node cluster and of C in turn, and merge in all node clusters on those lists. After the merge, evaluate the closed condition: if it is satisfied, set the state to the closed state; otherwise, set it to the default state. The closed condition is that the intra-cluster average connection density of the node cluster is greater than the inter-cluster average connection density.

Step 8: if the most similar node cluster of the node cluster is C but the most similar node cluster of C is not the node cluster, that is, the two are not mutually most similar, the node cluster enters the following state and is added to the following linked list of C.
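The steps of Proc1 and Proc2 can be sketched roughly as below. This is one possible reading, not the patented implementation, and it rests on several assumptions that the text does not settle: the neighbor set of a cluster is taken to be the nodes adjacent to a member but outside the cluster; closed clusters are excluded from the similarity search; a cluster with no candidate above the preset value is marked closed so that the loop terminates; ties are broken by the lowest cluster id; following relationships are resolved within each round; and the inter-cluster average is taken over all current clusters after the merges of a round. The names `detect_communities`, `adj` and `threshold` are hypothetical.

```python
def detect_communities(adj: dict, threshold: float = 0.0) -> list:
    """Sketch of Proc1/Proc2; returns the communities as sorted node lists."""
    clusters = {i: {node} for i, node in enumerate(sorted(adj))}  # Step 1
    closed = set()  # ids of clusters in the closed state

    def neighbors(members):
        # Assumption: nodes adjacent to a member but outside the cluster.
        return set().union(*(adj[n] for n in members)) - members

    def density(members):
        # Mean share of each member's connections that stay inside the cluster.
        return sum(len(adj[n] & members) / len(adj[n]) if adj[n] else 0.0
                   for n in members) / len(members)

    while len(closed) < len(clusters):  # Step 4
        open_ids = sorted(c for c in clusters if c not in closed)
        nbrs = {c: neighbors(clusters[c]) for c in open_ids}
        best = {}  # Step 2: most similar cluster above the preset value
        for c in open_ids:
            top = threshold
            for d in open_ids:
                union = nbrs[c] | nbrs[d]
                s = len(nbrs[c] & nbrs[d]) / len(union) if d != c and union else 0.0
                if s > top:  # strict comparison keeps the lowest id on ties
                    top, best[c] = s, d
        # Assumption: a cluster with no candidate left can no longer merge.
        closed.update(c for c in open_ids if c not in best)
        # Step 8: a non-mutual cluster follows its most similar cluster.
        followers = {c: [] for c in open_ids}
        for c, d in best.items():
            if best.get(d) != c:
                followers[d].append(c)
        # Step 7: merge each mutual pair together with everything following it.
        merged = []
        for c, d in best.items():
            if best.get(d) == c and c < d:
                stack, members = [c, d], set()
                while stack:
                    x = stack.pop()
                    members |= clusters.pop(x)
                    stack.extend(followers[x])
                clusters[c] = members
                merged.append(c)
        # Closed condition: intra-cluster density above the inter-cluster average.
        rho = {c: density(m) for c, m in clusters.items()}
        inter = sum(rho.values()) / len(rho)
        closed.update(c for c in merged if rho[c] > inter)
    return sorted(sorted(m) for m in clusters.values())


# Two 4-cliques (nodes 1-4 and 5-8) joined by the single edge 4-5.
adj = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3, 5},
       5: {4, 6, 7, 8}, 6: {5, 7, 8}, 7: {5, 6, 8}, 8: {5, 6, 7}}
communities = detect_communities(adj)
```

On this small graph the sketch returns the two cliques as separate communities. Different choices for the assumptions listed above may give different divisions on other graphs.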

The algorithm is shown in Table 1-1 below.

Table 1-1 recursive merging algorithm for node clusters

Results and analysis of the experiments

Experimental data

To test the performance of community discovery algorithms, scholars in the social sciences have abstracted a number of network topologies with typical community structure. Because these networks are derived from actual social networks, their community structures usually have a clear practical meaning. The growth of social networks has also provided large-scale network data for research on community discovery algorithms, and researchers have collected and curated data from many social networks as test data for community division. Three data sets commonly used in complex network analysis are selected here as experimental data: the American college football network, the karate club network and the dolphin network. FIG. 3, FIG. 4 and FIG. 5 show the structure of the three social networks, in which the members of each network are mapped to nodes and the relationships between members are mapped to edges; the network graphs are constructed on this basis. The cellular automata algorithm, the label propagation algorithm and the node cluster recursive merging algorithm are applied to each network for comparison, with the closed condition that the intra-cluster average connection density is greater than the inter-cluster average connection density.

The American College Football data set is the network of games between NCAA Division I-A college football teams during the fall 2000 regular season. Mark Newman et al. studied the schedule of each team in the league in order to examine large-scale network structure, and abstracted the NCAA football network, an undirected network with 115 nodes and 616 edges. In the football network, each node represents a football team, and an edge between two nodes indicates that the two teams played a game against each other.

Zachary's Karate Club network is a classic data set that is often used in social network analysis. In the early 1970s, the sociologist Zachary studied the social relationships among the members of a karate club at a university in the United States. Over two years he analyzed the social relationships among its 34 members, including their interactions inside and outside the club, and built a social network of the members. The network has 34 nodes, representing the 34 club members, and 78 edges; an edge between two nodes indicates that the corresponding two members were in close contact and interacted frequently.

The Dolphin network is also frequently used in social network analysis to verify the validity of community discovery algorithms. In 2003, Lusseau et al. studied the interactions among dolphins in New Zealand. Using several years of records on the habits of 62 bottlenose dolphins, they identified how the dolphins associated with one another and constructed a social network with 62 nodes and 159 edges. An edge between two nodes indicates that the two corresponding dolphins were frequently observed together.

The node attributes of the three data sets are shown in Table 1-2 below.

TABLE 1-2 node attributes in a network

Results of the experiment

A label propagation algorithm and a cellular automata algorithm are selected for comparison with the node cluster aggregation algorithm. The community discovery processes of the three algorithms on the American College Football network, the Karate Club network and the Dolphin network are shown in FIG. 6, FIG. 7 and FIG. 8, respectively. The horizontal axis is the iteration number, the vertical axis is the node number, and different colors represent different algorithms. The community discovery results are evaluated with the modularity index: the higher the modularity, the better the community division.
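The text names modularity as the evaluation index but does not give its formula. The sketch below assumes the standard Newman definition for an unweighted, undirected graph without self-loops or duplicate edges, Q = Σ_c [ l_c/m − (d_c/(2m))² ], where m is the total number of edges, l_c the number of edges inside community c, and d_c the sum of the degrees of the nodes in c. The function name `modularity` and the argument names are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of a partition of an unweighted, undirected graph.

    edges        -- list of (u, v) pairs, each undirected edge listed once
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    internal = defaultdict(int)  # l_c: edges with both endpoints in c
    degree = defaultdict(int)    # d_c: total degree of the nodes in c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1

    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_communities = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_community = {node: "A" for node in range(6)}

q_split = modularity(edges, two_communities)
q_single = modularity(edges, one_community)
```

On this toy graph the two-triangle partition scores 5/14 (about 0.357), while placing every node in one community scores 0, which illustrates why a higher value indicates a better division.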

As can be seen from FIG. 6, FIG. 7 and FIG. 8, the node cluster aggregation algorithm iterates slightly faster than the label propagation algorithm and also converges faster. The cellular-automata-based algorithm is essentially a multi-objective optimization algorithm; it is relatively complex, and both its convergence and its iteration are slow. The experimental results of the three algorithms are shown in Table 1-3.

TABLE 1-3 comparison of the results

As can be seen from Table 1-3, the modularity of the node cluster aggregation algorithm is higher than that of both the label propagation algorithm and the cellular automata algorithm.

Analysis of experiments

The method takes the set of node clusters each formed by a single node as the initial set, calculates the pairwise similarities, constructs the following linked lists, and gradually merges the existing node clusters by recursion. A preset closed condition decides whether a node cluster continues to merge, so that the communities are divided quickly. Because the iterative process of the label propagation algorithm is random, repeated runs on the same data set with asynchronous label updates can produce several different community structures that all satisfy the stop condition, although these structures are relatively similar to one another. No such multiple community structures satisfying the end condition appeared in the experiments with the node cluster aggregation algorithm, so it does not share the instability of the label propagation algorithm. Comparing the experimental processes in FIG. 6, FIG. 7 and FIG. 8 shows that the node cluster aggregation algorithm is stable and fast. When the experimental results are evaluated with the modularity index, its modularity is much higher than that of the label propagation algorithm. Overall, the node cluster aggregation algorithm gives the best results on all three data sets.

Example 2

The processing apparatus for community division in a social network provided by this embodiment includes:

an initialization module, configured to initialize each node in the social network as a separate node cluster and to set each node cluster to a default state;

a computing module, configured to compute the similarity among the node clusters in the social network to obtain the most similar node cluster of each node cluster; each node cluster comprises at least one node; the most similar node cluster of a given node cluster is the node cluster that has the highest similarity to it among those node clusters whose similarity to it is greater than a preset value;

an updating module, configured to update the attributes of the node clusters in the social network according to the similarity relation between each node cluster and its most similar node cluster, the state of each node cluster, and the node cluster closed condition; the attributes of a node cluster comprise the state of the node cluster and the node set of the node cluster;

when the node cluster and its most similar node cluster are not mutually most similar, the state of the node cluster is set to a following state, and the node cluster is added to the following linked list of its most similar node cluster;

when the node cluster and its most similar node cluster are mutually most similar, the state of the node cluster is converted to an expansion state; the node cluster merges its most similar node cluster and also merges the node clusters in the following linked list of that most similar node cluster; when the merged node cluster meets the closed condition, its state is converted to a closed state; when the merged node cluster does not meet the closed condition, its state is converted to the default state;

and a setting module, configured to set each node cluster in the social network as a community in the social network when all node clusters in the social network are in the closed state.
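The state transitions performed by the updating module can be sketched as follows. This is an illustrative reading of the description, not the apparatus itself: the class `Cluster`, the enum `State`, the function `update_cluster`, the mapping `best` (each cluster's most similar cluster, assumed to be already computed) and the callable `is_closed` (the closed condition) are all hypothetical names, and a Python list stands in for the following linked list.

```python
from enum import Enum


class State(Enum):
    DEFAULT = "default"
    FOLLOWING = "following"
    EXPANSION = "expansion"
    CLOSED = "closed"


class Cluster:
    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.state = State.DEFAULT
        self.followers = []  # stands in for the "following linked list"


def update_cluster(cluster, best, is_closed):
    """Apply one update to `cluster`, given its most similar cluster."""
    target = best[cluster]

    # Not mutually most similar: follow the target and wait to be merged.
    if best.get(target) is not cluster:
        cluster.state = State.FOLLOWING
        target.followers.append(cluster)
        return cluster

    # Mutually most similar: expand by absorbing the target cluster
    # and every cluster recorded in the target's following list.
    cluster.state = State.EXPANSION
    cluster.nodes |= target.nodes
    for follower in target.followers:
        cluster.nodes |= follower.nodes
    target.followers.clear()

    # After merging, either close the cluster or return it to the default state.
    cluster.state = State.CLOSED if is_closed(cluster) else State.DEFAULT
    return cluster


# Toy run: a and b are mutually most similar, c's most similar cluster is b.
a, b, c = Cluster({1}), Cluster({2}), Cluster({3})
best = {a: b, b: a, c: b}

update_cluster(c, best, lambda cl: True)                 # c follows b
followers_of_b = list(b.followers)                       # snapshot before merging
update_cluster(a, best, lambda cl: len(cl.nodes) >= 3)   # a absorbs b and c
```

In the toy run the closed condition is replaced by a simple size check so that the transitions are easy to follow. A full implementation would also recompute similarities after each merge and remove absorbed clusters from the working set, which this sketch omits.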

The above modules may be integrated and implemented either in hardware or as software functional modules. If the integrated module is implemented as a software functional module and sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium.

It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.
