Group detection method, group detection device, electronic equipment and computer storage medium

Document No.: 1939239 Publication date: 2021-12-07

Reading note: this technology, "Group detection method, group detection device, electronic equipment and computer storage medium", was designed and created by 饶玮 on 2021-01-25. Its main content is as follows: an embodiment of the invention provides a group detection method, a group detection device, electronic equipment and a computer storage medium, wherein the method comprises: obtaining a prior distribution of community sizes in a social relationship network; determining a calculation mode of the modularity of the social relationship network according to the prior distribution; and performing community division on the social relationship network with the goal of maximizing the modularity of the social relationship network, to obtain a community division result.

1. A group detection method, comprising:

obtaining prior distribution of community sizes in a social relationship network;

determining a calculation mode of the modularity of the social relationship network according to the prior distribution;

and carrying out community division on the social relationship network by taking the maximization of the modularity of the social relationship network as a target to obtain a community division result.

2. The method of claim 1, wherein, when the social relationship network is a non-weighted network, the determining a calculation mode of the modularity of the social relationship network according to the prior distribution comprises:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the total number of edges of the social relationship network, the number of edges of each community in the social relationship network and the sum of the node degrees of each community in the social relationship network.

3. The method of claim 1, wherein, when the social relationship network is a weighted network, the determining a calculation mode of the modularity of the social relationship network according to the prior distribution comprises:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the social relationship network and the sum of the weights of the edges in each community in the social relationship network.

4. The method as claimed in any one of claims 1 to 3, wherein the dividing the social relationship network into communities with the goal of maximizing the modularity of the social relationship network to obtain the community division result comprises:

determining a modularity change function according to the weight coefficients of any two communities in the social relationship network and the weight coefficient of a new community after the two communities are combined; wherein the weight coefficient of a community in the social relationship network is related to the number of nodes of the community;

and according to the calculation mode of the modularity and the modularity change function, carrying out community division on the social relationship network by adopting a Louvain method to obtain a community division result.

5. The method of claim 4, wherein determining a modularity variation function according to the weighting coefficients of any two communities in the social relationship network and the weighting coefficient of a new community after the two communities are merged comprises:

when the social relationship network is a non-weighted network, determining the modularity degree change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained by combining any two communities, the total number of edges of the social relationship network, the sum of the node degrees of each community in any two communities, the number of edges of each community in any two communities and the number of connecting edges between any two communities.

6. The method of claim 4, wherein determining a modularity variation function according to the weighting coefficients of any two communities in the social relationship network and the weighting coefficient of a new community after the two communities are merged comprises:

when the social relationship network is a weighted network, determining the modularity change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained after the two communities are combined, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the two communities, the sum of the weights of the edges of each community in the two communities and the sum of the weights of the connecting edges between the two communities.

7. The method of any of claims 1 to 3, wherein obtaining the prior distribution of community sizes in the social relationship network comprises:

and counting distribution information of the community size in sample data, and constructing prior distribution of the community size in the social relationship network according to the distribution information of the community size in the sample data.

8. A group detection device, the device comprising:

the acquisition module is used for acquiring prior distribution of community sizes in the social relationship network;

the first processing module is used for determining a calculation mode of the modularity of the social relationship network according to the prior distribution;

and the second processing module is used for carrying out community division on the social relationship network with the goal of maximizing the modularity of the social relationship network, to obtain a community division result.

9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the group detection method of any one of claims 1 to 7 when executing the program.

10. A computer storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing the group detection method of any one of claims 1 to 7.

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a group detection method and apparatus, an electronic device, and a computer storage medium.

Background

In the related art, group detection is a technique for dividing the nodes of a social relationship network into a plurality of communities; after the communities are divided using a group detection technique, the connections between nodes within each community are strong connections, and the connections between communities are weak connections. In the related art, group detection of a social relationship network can be realized by a modularity-based method. Modularity was proposed not only as an objective function for optimization, but is also currently one of the most popular criteria for measuring the quality of a community result, and the introduction of the modularity index is regarded as a milestone in the history of group detection research. However, the inventors found that the community division result obtained by a modularity-based method does not completely conform to the real community structure of the social relationship network, that is, the community division result obtained by a modularity-based method is not accurate enough.

Disclosure of Invention

The embodiment of the invention is expected to provide a technical scheme for group detection, and can solve the problem that the community division result obtained in the related technology is not accurate enough.

The embodiment of the invention provides a group detection method, which comprises the following steps:

obtaining prior distribution of community sizes in a social relationship network;

determining a calculation mode of the modularity of the social relationship network according to the prior distribution;

and carrying out community division on the social relationship network by taking the maximization of the modularity of the social relationship network as a target to obtain a community division result.

In some embodiments, when the social relationship network is a non-weighted network, the determining a calculation mode of the modularity of the social relationship network according to the prior distribution includes:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the total number of edges of the social relationship network, the number of edges of each community in the social relationship network and the sum of the node degrees of each community in the social relationship network.

In some embodiments, when the social relationship network is a weighted network, the determining a calculation mode of the modularity of the social relationship network according to the prior distribution includes:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the social relationship network and the sum of the weights of the edges in each community in the social relationship network.

In some embodiments, the performing community division on the social relationship network with the goal of maximizing the modularity of the social relationship network to obtain a community division result includes:

determining a modularity change function according to the weight coefficients of any two communities in the social relationship network and the weight coefficient of a new community after the two communities are combined; wherein the weight coefficient of a community in the social relationship network is related to the number of nodes of the community;

and according to the calculation mode of the modularity and the modularity change function, carrying out community division on the social relationship network by adopting a Louvain method to obtain a community division result.

In some embodiments, the determining a modularity variation function according to the weight coefficients of any two communities in the social relationship network and the weight coefficient of a new community after the two communities are merged includes:

when the social relationship network is a non-weighted network, determining the modularity degree change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained by combining any two communities, the total number of edges of the social relationship network, the sum of the node degrees of each community in any two communities, the number of edges of each community in any two communities and the number of connecting edges between any two communities.

In some embodiments, the determining a modularity variation function according to the weight coefficients of any two communities in the social relationship network and the weight coefficient of a new community after the two communities are merged includes:

when the social relationship network is a weighted network, determining the modularity change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained after the two communities are combined, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the two communities, the sum of the weights of the edges of each community in the two communities and the sum of the weights of the connecting edges between the two communities.

In some embodiments, the obtaining an a priori distribution of community sizes in a social relationship network includes:

and counting distribution information of the community size in sample data, and constructing prior distribution of the community size in the social relationship network according to the distribution information of the community size in the sample data.

The embodiment of the invention also provides a group detection device, which comprises:

the acquisition module is used for acquiring prior distribution of community sizes in the social relationship network;

the first processing module is used for determining a calculation mode of the modularity of the social relationship network according to the prior distribution;

and the second processing module is used for carrying out community division on the social relationship network with the goal of maximizing the modularity of the social relationship network, to obtain a community division result.

In some embodiments, when the social relationship network is a non-weighted network, the first processing module is configured to determine a calculation manner of the modularity of the social relationship network according to the prior distribution, and includes:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the total number of edges of the social relationship network, the number of edges of each community in the social relationship network and the sum of the node degrees of each community in the social relationship network.

In some embodiments, when the social relationship network is a weighted network, the first processing module is configured to determine a calculation manner of the modularity of the social relationship network according to the prior distribution, and includes:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the social relationship network and the sum of the weights of the edges in each community in the social relationship network.

In some embodiments, the second processing module is configured to perform community division on the social relationship network with a goal of maximizing modularity of the social relationship network, and obtain a community division result, and includes:

determining a modularity change function according to the weight coefficients of any two communities in the social relationship network and the weight coefficient of a new community after the two communities are combined; wherein the weight coefficient of a community in the social relationship network is related to the number of nodes of the community;

and according to the calculation mode of the modularity and the modularity change function, carrying out community division on the social relationship network by adopting a Louvain method to obtain a community division result.

In some embodiments, the second processing module is configured to determine a modularity varying function according to a weighting factor of any two communities in the social relationship network and a weighting factor of a new community after the any two communities are merged, and includes:

when the social relationship network is a non-weighted network, determining the modularity degree change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained by combining any two communities, the total number of edges of the social relationship network, the sum of the node degrees of each community in any two communities, the number of edges of each community in any two communities and the number of connecting edges between any two communities.

In some embodiments, the second processing module is configured to determine a modularity varying function according to a weighting factor of any two communities in the social relationship network and a weighting factor of a new community after the any two communities are merged, and includes:

when the social relationship network is a weighted network, determining the modularity change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained after the two communities are combined, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the two communities, the sum of the weights of the edges of each community in the two communities and the sum of the weights of the connecting edges between the two communities.

In some embodiments, the obtaining module, configured to obtain the prior distribution of community sizes in the social relationship network, includes:

and counting distribution information of the community size in sample data, and constructing prior distribution of the community size in the social relationship network according to the distribution information of the community size in the sample data.

The embodiment of the invention also provides electronic equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the program, any one of the group detection methods is realized.

An embodiment of the present invention further provides a computer storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements any one of the above-mentioned group detection methods.

In the group detection method, the group detection device, the electronic equipment and the computer storage medium provided by the embodiment of the invention, firstly, prior distribution of the sizes of communities in a social relationship network is obtained; then, determining a calculation mode of the modularity of the social relationship network according to the prior distribution; and finally, carrying out community division on the social relationship network by taking the maximization of the modularity of the social relationship network as a target to obtain a community division result.

Therefore, in the embodiment of the invention, the calculation mode of the modularity of the social relationship network can be determined on the basis of considering the prior distribution of the sizes of the communities in the social relationship network, and then the community division is performed, that is, the finally obtained community division result can reflect the sizes of the communities in the social relationship network, and the accuracy of the community division result is improved to a certain extent.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.

FIG. 1 is a schematic diagram of group detection using a modularity-based method in the related art;

FIG. 2 is a flow chart of a group detection method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram illustrating a process of performing community division on a social relationship network by using a Louvain method according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a group detection device according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In the related art, a social relationship network represents a network formed by different users and reflecting relationships among the users, and a community may reflect structural characteristics of the social relationship network, embody relationships among the users, display user interest characteristics, and the like. By group detection of the social relationship network, communities (user groups) with common interests can be identified in the social relationship network, correct information can be spread to correct audiences, and the method plays an important role in realizing accurate information push.

In the related art, group detection methods for social relationship networks include graph partitioning, clustering, spectral methods, label propagation, modularity-based methods and the like. Graph partitioning divides the nodes of a graph into groups of predetermined size such that the number of edges between the groups is minimized; this is an NP-hard problem. Clustering regards a community as a set of objects with similar content and focuses on the definition of node similarity. Spectral methods need to compute the eigenvalues of a matrix, which is costly, so they are difficult to apply to large-scale social relationship networks. Label propagation requires little computation, can be used for fast community detection, and is suitable for community detection in large-scale networks; however, it often produces communities whose sizes differ greatly. Modularity was proposed not only as an objective function for optimization, but is also one of the most popular criteria for measuring the quality of community results.

Modularity-based methods generally suffer from the resolution limit problem, that is, the community division result obtained by a modularity-based method is not accurate enough. As shown in FIG. 1, the ring network has 24 cliques; each clique has 5 nodes, and the nodes in each clique are pairwise connected (the nodes within the cliques are not shown in the figure). Here, a clique is a subset of the nodes of a graph in which any two distinct nodes are adjacent, also referred to as a complete graph. Adjacent cliques are connected by a single link, so that the cliques form a ring. Intuitively, each clique should be divided into a separate community, which gives a modularity of Q1 = 0.8674. However, according to the modularity index, the optimal community division merges the cliques in pairs, and the corresponding modularity is Q2 = 0.8712. This means that the division with optimal modularity does not correspond to the best community detection result. It can be seen that community structures smaller than a certain scale cannot be found by a modularity-based method. The resolution limit arises mainly because the definition of modularity contains no information about the number of nodes in a community, and because the community division result is highly sensitive to the connection weights in the network.
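The two modularity values quoted for the ring of cliques can be checked with a few lines of arithmetic. The sketch below assumes the standard per-community form of modularity, Q = Σ_c [e_c/m − (d_c/2m)²], where e_c is the number of edges inside community c and d_c is the sum of its node degrees; the function name and its parameters are illustrative and do not come from the patent.

```python
def ring_of_cliques_modularity(num_cliques: int, clique_size: int, group: int) -> float:
    """Modularity when every `group` consecutive cliques form one community.

    Assumes a ring in which neighbouring cliques are joined by a single edge.
    """
    if group < 1 or group >= num_cliques or num_cliques % group != 0:
        raise ValueError("group must divide num_cliques and be smaller than it")
    edges_per_clique = clique_size * (clique_size - 1) // 2
    # Total edges: all clique edges plus one ring link per clique.
    m = num_cliques * edges_per_clique + num_cliques
    # Edges inside one community: its cliques plus the links joining them.
    e = group * edges_per_clique + (group - 1)
    # Degree sum of one community: twice its internal edges plus 2 outgoing links.
    d = 2 * e + 2
    num_communities = num_cliques // group
    return num_communities * (e / m - (d / (2 * m)) ** 2)


q1 = ring_of_cliques_modularity(24, 5, group=1)  # one community per clique
q2 = ring_of_cliques_modularity(24, 5, group=2)  # cliques merged in pairs
print(f"Q1 = {q1:.4f}, Q2 = {q2:.4f}")  # Q1 = 0.8674, Q2 = 0.8712
```

Because Q2 exceeds Q1, a pure modularity maximizer prefers the merged division even though the single cliques are the intuitive communities.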

In the related art, social relationships between users, such as family relationships and co-person relationships, may be predicted using call information between users, location information and device login information. Ideally, the predicted data set would be fully connected within communities and have no connections between communities, so that user group detection could obtain a result directly through a single round of connected-component analysis. In practice, however, this ideal situation does not occur: the quality of the data sources, the limited variety of available data, the performance limits of the algorithm model and the like may all produce connections between different communities. Therefore, a community detection algorithm is further required to perform group division. In some scenarios, experience shows that the number of members in a family is usually 2 to 6; if no limit is imposed on community size, the communities obtained by a modularity-based method are often far larger than this (2 to 6). The fundamental reason is that the related art uses little prior information about communities, so the community division result may not agree with what is known about the communities and is not accurate enough.

In the related art, the modularity index is defined by comparison with the link structure of a random network, and the modularity Q is calculated using the following formula (1):

$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(c_i, c_j) = \sum_{i}\left[\frac{e_i}{m} - \left(\frac{d_i}{2m}\right)^2\right] \tag{1}$$

wherein m represents the total number of edges of the social relationship network, each node represents a user in the social relationship network, and the connections between nodes are the edges of the social relationship network; $k_i$ and $k_j$ represent the degrees of node i and node j, respectively, the degree of a node being the number of edges associated with the node; $c_i$ and $c_j$ represent the communities to which node i and node j belong, respectively; when $c_i = c_j$, $\delta(c_i, c_j) = 1$, otherwise $\delta(c_i, c_j) = 0$; in the second sum, i ranges over communities, $e_i$ represents the number of edges inside community $c_i$, and $d_i$ represents the sum of the degrees of all nodes in community $c_i$; the value of $A_{ij}$ is given by formula (2):

$$A_{ij} = \begin{cases} 1, & \text{if node } i \text{ and node } j \text{ are connected by an edge} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

When the modularity-based method is adopted to realize group detection of the social relationship network, the greater the modularity corresponding to the community division result, the more reasonable the community division result is.
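As an illustration of how a modularity value is obtained for a given division, the following minimal sketch evaluates the standard per-community form Q = Σ_c [e_c/m − (d_c/2m)²] on an edge list. It assumes an undirected, unweighted graph without self-loops; the function name, the `community_of` mapping and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Standard modularity of a division of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community_of: dict mapping node -> community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # e_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: sum of node degrees in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles (0, 1, 2) and (3, 4, 5) joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
q_all = modularity(edges, {node: "a" for node in range(6)})
print(q_split, q_all)
```

Splitting the toy graph into its two triangles gives Q = 5/14, while placing all nodes in one community gives Q = 0, so the split is the more reasonable division under this criterion.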

However, based on the above description, it can be seen that, in the related art, the definition of the modularity only considers the structural information of the social relationship network, and does not consider the prior distribution of the community sizes, so how to accurately obtain the community division result on the basis of considering the prior distribution of the community sizes is an urgent technical problem to be solved.

The technical scheme of the embodiment of the invention is provided for solving the problem that the community division result obtained by adopting a modularity-based method in the related art is not accurate enough.

The present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the examples provided herein are merely illustrative of the present invention and are not intended to limit the present invention. In addition, the following embodiments are provided as partial embodiments for implementing the present invention, not all embodiments for implementing the present invention, and the technical solutions described in the embodiments of the present invention may be implemented in any combination without conflict.

It should be noted that, in the embodiments of the present invention, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that a method or apparatus including a series of elements includes not only the explicitly recited elements but also other elements not explicitly listed or inherent to the method or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional related elements (e.g., steps in a method or elements in a device, such as portions of circuitry, processors, programs, software, etc.) in the method or device that includes the element.

The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; e.g., A and/or C may mean: A exists alone, A and C exist simultaneously, or C exists alone. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality.

For example, the group detection method provided by the embodiment of the present invention includes a series of steps, but the group detection method provided by the embodiment of the present invention is not limited to the described steps, and similarly, the group detection device provided by the embodiment of the present invention includes a series of modules, but the group detection device provided by the embodiment of the present invention is not limited to include the explicitly described modules, and may include modules that are required to acquire relevant information or perform processing based on the information.

Embodiments of the invention may be implemented on a terminal and/or a server, where the terminal may be a thin client, a thick client, a hand-held or laptop device, a microprocessor-based system, a set-top box, a programmable consumer electronics, a network personal computer, a small computer system, and so forth. The server may be a small computer system, a mainframe computer system, a distributed cloud computing environment including any of the systems described above, and so forth.

The electronic devices, such as servers, may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.

An embodiment of the present invention provides a group detection method. FIG. 2 is a flowchart of the group detection method according to the embodiment of the present invention; as shown in FIG. 2, the flow may include:

Step 201: obtaining a prior distribution of community sizes in a social relationship network.

In the embodiment of the application, the prior distribution of the community sizes in the social relationship network can be obtained based on sample data, and the sample data represents known social relationship network data. In some embodiments, the distribution information of the community sizes in the sample data may be counted, and the prior distribution of the community sizes in the social relationship network may be constructed according to the distribution information of the community sizes in the sample data.

It can be seen that the prior distribution of the community sizes in the social relationship network can be easily obtained by counting the distribution information of the community sizes in the sample data.

In some embodiments, a histogram distribution of community sizes may be statistically calculated from the sample data, and then a prior distribution of community sizes in the social relationship network may be determined based on the histogram distribution.
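A histogram-based prior of this kind can be sketched as follows. This is a minimal illustration that assumes the sample data has already been reduced to a list of community sizes; the function name and the sample values are made up for the example.

```python
from collections import Counter


def histogram_prior(sample_sizes):
    """Return the relative frequency of each community size in the samples."""
    counts = Counter(sample_sizes)
    total = len(sample_sizes)
    return {size: count / total for size, count in sorted(counts.items())}


# Hypothetical sizes of known communities (e.g. families) in the sample data.
sizes = [2, 3, 3, 4, 4, 4, 5, 6]
prior = histogram_prior(sizes)
print(prior)
```

The resulting dictionary assigns zero probability to any size that never occurs in the samples, which is one reason a smoothed estimate may be preferred.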

Illustratively, a probability density function p(|c|) of the community size may be estimated from the sample data as the prior distribution, where |c| represents the number of nodes in community c.

Illustratively, the prior distribution of community sizes in the social relationship network may be constructed using a Parzen window method, also known as Kernel Density Estimation (Kernel Density Estimation), which is a method used in probability theory to estimate an unknown Density function, belonging to one of the non-parametric inspection methods, and the Parzen window function may be a gaussian function or other functions. The Parzen window function can be expressed by equation (3).

Where n denotes the number of samples, σ denotes the standard deviation, x_i denotes the value of the i-th sample in the sample data, x is the independent variable, and p(x) denotes the prior probability that the value of the sample is x.

Referring to formula (3), the probability density function of the community size in the social relationship network is approximated by the average of Gaussian functions centered on the samples. The smaller σ is (the closer it is to 0), the closer p(x) is to the histogram distribution; the larger σ is, the smoother p(x) is. Illustratively, σ may be empirically set to 0.6.
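The effect of σ can be explored with a short sketch. It assumes a Gaussian window and a list of observed community sizes; the function name `parzen_prior` and the sample values are hypothetical.

```python
import math


def parzen_prior(samples, sigma=0.6):
    """Build p(x) as the mean of Gaussian windows centered on each sample."""
    n = len(samples)
    norm = 1.0 / (n * sigma * math.sqrt(2.0 * math.pi))

    def p(x):
        return norm * sum(
            math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in samples
        )

    return p


# Hypothetical sample of community sizes from known network data.
sample_sizes = [3, 3, 4, 5, 5, 5, 8]
p = parzen_prior(sample_sizes, sigma=0.6)
```

Calling `p(5)` gives the estimated density for a community of five nodes. Passing a larger `sigma` spreads each sample's contribution over neighboring sizes and yields a smoother curve.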

Step 202: Determine a calculation manner of the modularity of the social relationship network according to the prior distribution.

In the related art, the modularity of the social relationship network may be calculated according to the above formula (1), and the prior distribution of the community sizes in the social relationship network is not considered in the formula (1).

In the embodiment of the present application, the calculation manner of the modularity of the social relationship network may be re-determined according to the prior distribution, that is, the embodiment of the present application corrects the definition of the modularity of the social relationship network in the related art, and the prior distribution of the community size in the social relationship network is reflected in the definition of the corrected modularity.

Step 203: Perform community division on the social relationship network with the goal of maximizing the modularity of the social relationship network, to obtain a community division result.

In some embodiments, the modularity of the social relationship network may be used as an objective function of the optimization, function values of the modularity corresponding to various community division results may be determined, and a community division result corresponding to a maximum function value may be used as a finally obtained community division result.

It is understood that if the modularity of the social relationship network is used directly as the optimization objective to obtain the community division result, the computational complexity may be too high. Therefore, in other embodiments, the greedy optimization strategy of the Louvain method may be borrowed: the modularity is optimized locally by finding small communities, and the community division result is finally obtained through aggregation and iteration.

Illustratively, a modularity change function may be determined according to the weight coefficients of any two communities in the social relationship network and the weight coefficient of the new community obtained by merging the two communities, where the weight coefficient of a community in the social relationship network is related to the number of nodes of the community.

Then, according to the calculation manner of the modularity and the modularity change function, community division is performed on the social relationship network using the Louvain method to obtain the community division result.

In one implementation, after the number of nodes of the community is obtained, the weight coefficient of the community in the social relationship network can be obtained according to a preset weight coefficient calculation formula of the community; the value of the community weight coefficient may range from 0 to 1.

Therefore, the embodiment of the application uses the idea of the Louvain algorithm for reference, and performs local optimization through an improved modularity change function, so that the generation of the community division result is ensured, and meanwhile, the rapid calculation is facilitated.

In practical applications, the steps 201 to 203 may be implemented based on a Processor of an electronic Device, where the Processor may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.

It can be understood that, in the embodiment of the present invention, a calculation manner of modularity of the social relationship network may be determined based on consideration of prior distribution of sizes of communities in the social relationship network, so as to perform community division, that is, a finally obtained community division result may reflect the size of the community in the social relationship network, so that accuracy of the community division result is improved to a certain extent.

In some embodiments, the social relationship network may be a weighted network or an unweighted network. Each edge in a weighted network has a corresponding weight, and the weights of different edges may differ; in an unweighted network, all edges have the same weight.

In some embodiments, when the social relationship network is a non-weighted network, the calculation manner of the modularity of the social relationship network may be determined according to the prior distribution, the total number of edges of the social relationship network, the number of edges of each community in the social relationship network, and the sum of the node degrees of each community in the social relationship network.

For example, in order to determine the calculation manner of the modularity of the social relationship network, the number of nodes of each community in the social relationship network may be constrained by using the prior distribution; when the social relationship network is an unweighted network, the calculation formula of the modularity of the social relationship network may be formula (4).
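Formula (4) is not reproduced in this text. One form consistent with the symbols explained in the following paragraph, assuming that each community's term in the conventional modularity sum is scaled by that community's weight coefficient, is:

```latex
Q_v = \sum_{i} W(|c_i|) \left[ \frac{e_i}{m} - \left( \frac{d_i}{2m} \right)^{2} \right]
```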

Where Q_v denotes the modularity of the social relationship network when it is an unweighted network; m denotes the total number of edges of the social relationship network; W(|c_i|) denotes the weight coefficient of community c_i in the social relationship network, with W(|c_i|) = p(|c_i|), so the weight coefficient of community c_i reflects the probability of the number of nodes in community c_i; e_i denotes the number of edges of community c_i; and d_i denotes the sum of the degrees of all nodes in community c_i.

For an unweighted network, the number of communities in the network can be considered to be uniformly distributed, that is, the sizes of the communities are subject to uniform distribution, and the probability density functions of the sizes of the communities corresponding to different communities are the same.

It can be seen that, for the unweighted network, the calculation method of the modularity of the social relationship network can be easily modified in combination with the prior distribution of the community sizes and the graph structure of the network.

In some embodiments, when the social relationship network is a non-weighted network, the modularity degree variation function may be determined according to a weight coefficient of any two communities in the social relationship network, a weight coefficient of a new community obtained by merging the any two communities, a total number of edges of the social relationship network, a sum of degrees of nodes of each of the any two communities, a number of edges of each of the any two communities, and a number of connecting edges between the any two communities.

When the Louvain method is adopted to divide the social relationship network into communities, two communities in the social relationship network need to be merged to obtain a new merged community; the connecting edge represents an edge for realizing direct connection of any two communities, if any two communities of the social relationship network are directly connected, the connecting edge is considered to exist between any two communities, and the number of the connecting edge between any two communities can be determined at the moment; if any two communities of the social relationship network are not directly connected, the number of connecting edges between any two communities can be considered as 0.

In some embodiments, when the Louvain method is used to divide communities in the unweighted network, if community c_i and community c_j are merged, the modularity variation function resulting from merging community c_i and community c_j can be represented by equation (5).
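The equation itself is not reproduced in this text. Assuming the modularity is a sum over communities of W(|c|)[e/m − (d/2m)^2], the change caused by a merge is the merged community's term minus the two original terms. The merged community contains e_i + e_j + e_ij edges and has a degree sum of d_i + d_j, which gives:

```latex
\Delta Q = W_{ij} \left[ \frac{e_i + e_j + e_{ij}}{m} - \left( \frac{d_i + d_j}{2m} \right)^{2} \right]
         - W_i \left[ \frac{e_i}{m} - \left( \frac{d_i}{2m} \right)^{2} \right]
         - W_j \left[ \frac{e_j}{m} - \left( \frac{d_j}{2m} \right)^{2} \right]
```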

Where ΔQ denotes the modularity variation function; e_j denotes the number of edges of community c_j; e_ij denotes the number of connecting edges between community c_i and community c_j; W_i denotes the weight coefficient of community c_i; W_j denotes the weight coefficient of community c_j; W_ij denotes the weight coefficient of the new community obtained by merging community c_i and community c_j; and d_j denotes the sum of the degrees of all nodes in community c_j.

It can be seen that, for the unweighted network, the social relationship network can be subjected to community division by using a Louvain method based on the constructed modularity change function, which is beneficial to quickly obtaining the community division result of the social relationship network.
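The relationship between the modularity and its variation function can be checked numerically. The sketch below assumes that each community contributes W(|c|)·(e/m − (d/2m)^2) to the modularity, which is an assumed form rather than the formula of the embodiment; all function names and the toy prior are hypothetical.

```python
def community_term(weight, edges_inside, degree_sum, m):
    """One community's contribution: W * (e/m - (d/2m)^2)."""
    return weight * (edges_inside / m - (degree_sum / (2.0 * m)) ** 2)


def modularity_unweighted(communities, m, prior):
    """communities: list of (size, edges_inside, degree_sum) tuples."""
    return sum(community_term(prior(n), e, d, m) for n, e, d in communities)


def delta_q_unweighted(ci, cj, e_ij, m, prior):
    """Change in modularity when communities ci and cj are merged."""
    (ni, ei, di), (nj, ej, dj) = ci, cj
    merged = community_term(prior(ni + nj), ei + ej + e_ij, di + dj, m)
    return (merged
            - community_term(prior(ni), ei, di, m)
            - community_term(prior(nj), ej, dj, m))


def toy_prior(size):
    """Hypothetical prior: communities of about four nodes are most likely."""
    return {2: 0.2, 3: 0.3, 4: 0.4, 5: 0.1}.get(size, 0.0)


# Toy network with m = 7 edges and three communities (size, edges, degree sum).
m = 7
c1 = (2, 1, 3)  # one internal edge, one edge to c2
c2 = (2, 1, 4)  # one internal edge, one edge to c1, one edge to c3
c3 = (3, 3, 7)  # a triangle with one edge to c2

before = modularity_unweighted([c1, c2, c3], m, toy_prior)
after = modularity_unweighted([(4, 3, 7), c3], m, toy_prior)
gain = delta_q_unweighted(c1, c2, 1, m, toy_prior)
```

Here `gain` equals `after - before` up to floating-point error, so a merge can be evaluated from the two communities involved without recomputing the modularity of the whole network.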

In some embodiments, when the social relationship network is a weighted network, the calculation manner of the modularity of the social relationship network is determined according to the prior distribution, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the social relationship network, and the weighted sum of the edges in each community in the social relationship network.

For example, when the social relationship network is a weighted network, the adjacency matrix A of the weighted network may first be calculated. The adjacency matrix is a matrix representing the adjacency relation between nodes in the network. When i and j are both integers greater than 0, the element in the i-th row and j-th column of matrix A can be denoted a_ij, which represents the adjacency relation between node i and node j in the social relationship network. The strength of node i can be defined as s_i = Σ_j a_ij. Illustratively, the calculation formula of the modularity of the social relationship network may be formula (6).
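Formula (6) is not reproduced in this text. One form consistent with the symbols explained in the following paragraph, assuming the weighted case mirrors the unweighted one with edge counts replaced by edge weights, is given below. Note that W without an argument is half the total node strength and is distinct from the weight coefficient W(|c_i|).

```latex
Q_w = \sum_{i} W(|c_i|) \left[ \frac{l_i}{W} - \left( \frac{w_i}{2W} \right)^{2} \right]
```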

Where Q_w denotes the modularity of the social relationship network when it is a weighted network; 2W denotes the sum of the strengths of all nodes of the social relationship network; W(|c_i|) denotes the weight coefficient of community c_i in the social relationship network; w_i denotes the sum of the strengths of all nodes of community c_i; and l_i denotes the sum of the weights of the edges within community c_i.

It can be seen that, for the weighted network, the calculation manner of the modularity of the social relationship network can be easily modified by combining the prior distribution of the community sizes with the graph structure of the network.

In some embodiments, when the social relationship network is a weighted network, the modularity change function may be determined according to a weight coefficient of any two communities in the social relationship network, a weight coefficient of a new community obtained by merging the any two communities, a sum of node strengths of the social relationship network, a sum of node strengths of each of the any two communities, a sum of weights of edges of each of the any two communities, and a sum of weights of connecting edges between the any two communities.

In some embodiments, when the Louvain method is used to divide communities in the weighted network, if community c_i and community c_j are merged, the modularity variation function resulting from merging community c_i and community c_j can be represented by equation (7).
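The equation itself is not reproduced in this text. Assuming the weighted modularity is a sum over communities of W(|c|)[l/W − (w/2W)^2], the change caused by a merge is the merged community's term minus the two original terms, where the merged community has internal weight l_i + l_j + l_ij and strength sum w_i + w_j:

```latex
\Delta Q_w = W_{ij} \left[ \frac{l_i + l_j + l_{ij}}{W} - \left( \frac{w_i + w_j}{2W} \right)^{2} \right]
           - W_i \left[ \frac{l_i}{W} - \left( \frac{w_i}{2W} \right)^{2} \right]
           - W_j \left[ \frac{l_j}{W} - \left( \frac{w_j}{2W} \right)^{2} \right]
```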

Where ΔQ_w denotes the modularity variation function; W_i denotes the weight coefficient of community c_i; W_j denotes the weight coefficient of community c_j; W_ij denotes the weight coefficient of the new community obtained by merging community c_i and community c_j; l_i denotes the sum of the weights of the edges within community c_i; l_j denotes the sum of the weights of the edges within community c_j; l_ij denotes the sum of the weights of the connecting edges between community c_i and community c_j; w_i denotes the sum of the strengths of all nodes of community c_i; and w_j denotes the sum of the strengths of all nodes of community c_j.

It can be seen that, for the weighted network, the social relationship network can be divided into communities by the Louvain method based on the constructed modularity variation function, which helps to obtain the community division result quickly.
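The weighted quantities can be computed directly from an adjacency matrix. The sketch below assumes that each community contributes W(|c|)·(l/W − (w/2W)^2) to the modularity, where W is half the total node strength; this is an assumed form, and the function names, matrix, and toy prior are hypothetical.

```python
def strengths(adj):
    """Node strength s_i = sum_j a_ij for a symmetric adjacency matrix."""
    return [sum(row) for row in adj]


def community_stats(adj, members):
    """Return (size, internal edge weight, strength sum) of one community."""
    s = strengths(adj)
    l_in = sum(adj[a][b] for a in members for b in members if a < b)
    return len(members), l_in, sum(s[a] for a in members)


def term(weight, l_in, w_sum, total):
    """One community's contribution; total is half the sum of all strengths."""
    return weight * (l_in / total - (w_sum / (2.0 * total)) ** 2)


def modularity_weighted(adj, partition, prior):
    total = sum(strengths(adj)) / 2.0
    stats = (community_stats(adj, c) for c in partition)
    return sum(term(prior(n), l_in, w, total) for n, l_in, w in stats)


def delta_q_weighted(adj, ci, cj, prior):
    """Change in modularity when node lists ci and cj are merged."""
    total = sum(strengths(adj)) / 2.0
    ni, li, wi = community_stats(adj, ci)
    nj, lj, wj = community_stats(adj, cj)
    lij = sum(adj[a][b] for a in ci for b in cj)
    merged = term(prior(ni + nj), li + lj + lij, wi + wj, total)
    return merged - term(prior(ni), li, wi, total) - term(prior(nj), lj, wj, total)


def toy_prior(size):
    """Hypothetical prior over community sizes."""
    return {1: 0.1, 2: 0.5, 3: 0.2, 4: 0.2}.get(size, 0.0)


# Hypothetical 4-node weighted network (symmetric, no self-loops).
adj = [
    [0.0, 2.0, 0.5, 0.0],
    [2.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 1.5],
    [0.0, 0.0, 1.5, 0.0],
]
gain = delta_q_weighted(adj, [0], [1], toy_prior)
```

The value of `gain` matches the difference in `modularity_weighted` between the partitions `[[0, 1], [2, 3]]` and `[[0], [1], [2, 3]]`.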

In some embodiments, for a non-weighted network or a weighted network, the modularity variation function is a key point for community division of the network by using the Louvain method.

Referring to fig. 3, the process of performing community division on the social relationship network by using the Louvain method may include:

Step 301: Treat each node in the social relationship network as an independent community.

It can be seen that at the initial time, the number of nodes in the social relationship network is equal to the number of communities divided at the initial time.

Step 302: For each community, determine the community to merge with, and perform community merging.

In some embodiments, for each community, merging with each adjacent community (the adjacent community represents the community where the adjacent node of the corresponding community is located) is sequentially tried, and after each attempt to merge the communities, the value of the corresponding modularity varying function is determined according to the modularity varying function; for each community, a neighboring community that maximizes the value of the modularity variation function is recorded.

In the embodiment of the invention, for community c_i, the maximum value of the modularity variation function may be denoted max ΔQ. When max ΔQ is greater than 0, community c_i may be merged with the adjacent community that maximizes the value of the modularity variation function; when max ΔQ is less than or equal to 0, no community merging is performed for community c_i.

As can be seen from the above description, in order to calculate the value of the modularity variation function, the prior distribution of the community sizes in the social relationship network must first be obtained.

Step 303: Determine whether the community to which any node belongs has changed; if so, return to step 302; if not, execute step 304.

Step 304: Compress the nodes of the social relationship network.

In some embodiments, nodes in the same community may be compressed into a new node, and at this time, the sum of the weights of the edges between the nodes in the same community may be converted into the weight coefficient of the new node, and the weight of the edge between adjacent communities may be converted into the weight of the corresponding edge between the new nodes.
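A minimal sketch of this compression step follows. It assumes the network is given as a list of weighted undirected edges and a node-to-community mapping, and it stores the internal weight of a community under a self-pair key; the function name `compress` and this representation are assumptions made for illustration.

```python
from collections import defaultdict


def compress(edges, community_of):
    """Collapse each community into a single node.

    edges: iterable of (u, v, weight) for an undirected network.
    community_of: dict mapping node -> community id.
    Returns {(a, b): weight} with a <= b. An (a, a) entry holds the summed
    weight of the edges inside community a.
    """
    merged = defaultdict(float)
    for u, v, weight in edges:
        a, b = sorted((community_of[u], community_of[v]))
        merged[(a, b)] += weight
    return dict(merged)


# Hypothetical network: nodes n1..n3 form community 0, n4..n5 community 1.
edges = [("n1", "n2", 1.0), ("n2", "n3", 1.0), ("n3", "n4", 2.0), ("n4", "n5", 1.0)]
community_of = {"n1": 0, "n2": 0, "n3": 0, "n4": 1, "n5": 1}
compressed = compress(edges, community_of)
```

For this input, community 0 keeps an internal weight of 2.0, community 1 keeps 1.0, and the single edge between them carries a weight of 2.0.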

Step 305: Determine whether the modularity of the social relationship network has changed; if so, return to step 301; if not, execute step 306.

In some embodiments, after step 304 is executed each time, the modularity of the social relationship network may be calculated once, and when the modularity of the social relationship network calculated this time is different from the modularity of the social relationship network calculated last time, it is considered that the modularity of the social relationship network changes; and when the modularity of the social relationship network obtained by the current calculation is the same as that of the social relationship network obtained by the last calculation, determining that the modularity of the social relationship network is unchanged.

Step 306: Perform community division according to the current community merging result.
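The flow of steps 301 to 306 can be approximated by the simplified sketch below for an unweighted network without self-loops. It is not the embodiment's implementation: it assumes each community contributes W(|c|)·(e/m − (d/2m)^2) to the modularity, it keeps per-community aggregates instead of explicitly compressing nodes, and it stops when a full pass produces no merge rather than recomputing the modularity. The name `detect_communities` and the toy prior are hypothetical.

```python
def detect_communities(edges, prior):
    """Greedily merge adjacent communities while a merge increases modularity."""
    m = len(edges)
    members, e_in, deg, link = {}, {}, {}, {}
    for u, v in edges:
        for node in (u, v):
            if node not in members:  # step 301: one community per node
                members[node], e_in[node], deg[node], link[node] = {node}, 0, 0, {}
        deg[u] += 1
        deg[v] += 1
        link[u][v] = link[u].get(v, 0) + 1
        link[v][u] = link[v].get(u, 0) + 1

    def term(size, e, d):
        return prior(size) * (e / m - (d / (2.0 * m)) ** 2)

    def delta(a, b):
        merged = term(len(members[a]) + len(members[b]),
                      e_in[a] + e_in[b] + link[a][b], deg[a] + deg[b])
        return (merged - term(len(members[a]), e_in[a], deg[a])
                - term(len(members[b]), e_in[b], deg[b]))

    changed = True
    while changed:  # step 303: repeat until no community changes
        changed = False
        for a in list(members):
            if a not in members or not link[a]:
                continue
            best = max(link[a], key=lambda b: delta(a, b))  # step 302
            if delta(a, best) <= 0:
                continue
            # Merge community `best` into `a` and update the aggregates.
            members[a] |= members.pop(best)
            e_in[a] += e_in.pop(best) + link[a].pop(best)
            deg[a] += deg.pop(best)
            for nb, count in link.pop(best).items():
                if nb == a:
                    continue
                link[a][nb] = link[a].get(nb, 0) + count
                link[nb][a] = link[nb].get(a, 0) + count
                del link[nb][best]
            changed = True
    return list(members.values())  # step 306


def toy_prior(size):
    """Hypothetical prior: communities of three nodes are most likely."""
    return {1: 0.1, 2: 0.3, 3: 0.5, 4: 0.05, 5: 0.03, 6: 0.02}.get(size, 0.0)


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
result = detect_communities(edges, toy_prior)
```

With this prior, the two triangles are recovered as separate communities, because merging them would produce a community of six nodes, which the prior considers unlikely.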

On the basis of the group detection method provided by the foregoing embodiment, the embodiment of the present invention also provides a group detection apparatus.

Fig. 4 is a schematic structural diagram of a group detection device according to an embodiment of the present invention, and as shown in fig. 4, the device may include:

an obtaining module 401, configured to obtain prior distribution of community sizes in a social relationship network;

a first processing module 402, configured to determine a calculation manner of modularity of the social relationship network according to the prior distribution;

the second processing module 403 is configured to perform community division on the social relationship network with the goal of maximizing the modularity of the social relationship network, so as to obtain a community division result.

In some embodiments, when the social relationship network is a non-weighted network, the first processing module 402 is configured to determine a calculation manner of the modularity of the social relationship network according to the prior distribution, and includes:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the total number of edges of the social relationship network, the number of edges of each community in the social relationship network and the sum of the node degrees of each community in the social relationship network.

In some embodiments, when the social relationship network is a weighted network, the first processing module 402 is configured to determine a calculation manner of the modularity of the social relationship network according to the prior distribution, and includes:

and determining a calculation mode of the modularity of the social relationship network according to the prior distribution, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the social relationship network and the sum of the weights of the edges in each community in the social relationship network.

In some embodiments, the second processing module 403 is configured to perform community division on the social relationship network with the goal of maximizing the modularity of the social relationship network, and obtain a community division result, including:

determining a modularity change function according to the weight coefficients of any two communities in the social relationship network and the weight coefficient of a new community after the two communities are combined; wherein the weight coefficient of a community in the social relationship network is related to the number of nodes of the community;

and according to the calculation mode of the modularity and the modularity change function, carrying out community division on the social relationship network by adopting a Louvain method to obtain a community division result.

In some embodiments, the second processing module 403 is configured to determine a modularity degree variation function according to the weighting coefficients of any two communities in the social relationship network and the weighting coefficient of a new community after the any two communities are merged, and includes:

when the social relationship network is a non-weighted network, determining the modularity degree change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained by combining any two communities, the total number of edges of the social relationship network, the sum of the node degrees of each community in any two communities, the number of edges of each community in any two communities and the number of connecting edges between any two communities.

In some embodiments, the second processing module 403 is configured to determine a modularity degree variation function according to the weighting coefficients of any two communities in the social relationship network and the weighting coefficient of a new community after the any two communities are merged, and includes:

when the social relationship network is a weighted network, determining the modularity change function according to the weight coefficients of any two communities in the social relationship network, the weight coefficient of a new community obtained after the two communities are combined, the sum of the node strengths of the social relationship network, the sum of the node strengths of each community in the two communities, the sum of the weights of the edges of each community in the two communities and the sum of the weights of the connecting edges between the two communities.

In some embodiments, the obtaining module 401 is configured to obtain the prior distribution of community sizes in the social relationship network, and includes:

and counting distribution information of the community size in sample data, and constructing prior distribution of the community size in the social relationship network according to the distribution information of the community size in the sample data.

The obtaining module 401, the first processing module 402, and the second processing module 403 may be implemented by a processor located in an electronic device, where the processor is at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.

In addition, each functional module in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware or a form of a software functional module.

Based on this understanding, the technical solution of this embodiment, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method of this embodiment. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Specifically, the computer program instructions corresponding to a group detection method in the present embodiment may be stored on a storage medium such as an optical disc, a hard disc, a usb disk, or the like, and when the computer program instructions corresponding to a group detection method in the storage medium are read or executed by an electronic device, any one of the group detection methods of the foregoing embodiments is implemented.

Based on the same technical concept as the foregoing embodiments, Fig. 5 shows an electronic device 50 provided by an embodiment of the present invention, which may include: a memory 501, a processor 502, and a computer program stored on the memory 501 and executable on the processor 502; wherein:

a memory 501 for storing computer programs and data;

a processor 502 for executing the computer program stored in the memory to implement any one of the group detection methods of the foregoing embodiments.

In practical applications, the memory 501 may be a volatile memory (volatile memory), such as a RAM; or a non-volatile memory (non-volatile memory) such as a ROM, a flash memory (flash memory), a Hard Disk (Hard Disk Drive, HDD) or a Solid-State Drive (SSD); or a combination of the above types of memories and provides instructions and data to the processor 502.

The processor 502 may be at least one of ASIC, DSP, DSPD, PLD, FPGA, CPU, controller, microcontroller, and microprocessor.

In some embodiments, the functions of the apparatus provided in the embodiments of the present invention, or the modules included in the apparatus, may be used to execute the method described in the above method embodiments. For the specific implementation, reference may be made to the description of the above method embodiments; for brevity, details are not repeated here.

The foregoing description of the various embodiments is intended to highlight the differences between them. For the same or similar parts, the embodiments may be referred to each other; for brevity, these parts are not repeated here.

The methods disclosed in the method embodiments provided by the present application can be combined arbitrarily without conflict to obtain new method embodiments.

Features disclosed in various product embodiments provided by the application can be combined arbitrarily to obtain new product embodiments without conflict.

The features disclosed in the various method or apparatus embodiments provided herein may be combined in any combination to arrive at new method or apparatus embodiments without conflict.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.
