cdlib.algorithms.surprise_communities¶
-
surprise_communities
(g_original: object, initial_membership: list = None, weights: list = None, node_sizes: list = None) → cdlib.classes.node_clustering.NodeClustering¶ Surprise_communities is a model where the quality function to optimize is:
\[Q = m D(q \parallel \langle q \rangle)\]where \(m\) is the number of edges, \(q = \frac{\sum_c m_c}{m}\), is the fraction of internal edges, \(\langle q \rangle = \frac{\sum_c \binom{n_c}{2}}{\binom{n}{2}}\) is the expected fraction of internal edges, and finally
\(D(x \parallel y) = x \ln \frac{x}{y} + (1 - x) \ln \frac{1 - x}{1 - y}\) is the binary Kullback-Leibler divergence.
For directed graphs we can multiplying the binomials by 2, and this leaves \(\langle q \rangle\) unchanged, so that we can simply use the same formulation. For weighted graphs we can simply count the total internal weight instead of the total number of edges for \(q\) , while \(\langle q \rangle\) remains unchanged.
Parameters: - g_original – a networkx/igraph object
- initial_membership – list of int Initial membership for the partition. If
None
then defaults to a singleton partition. Deafault None - weights – list of double, or edge attribute Weights of edges. Can be either an iterable or an edge attribute. Deafault None
- node_sizes – list of int, or vertex attribute Sizes of nodes are necessary to know the size of communities in aggregate graphs. Usually this is set to 1 for all nodes, but in specific cases this could be changed. Deafault None
Returns: NodeClustering object
Example: >>> from cdlib import algorithms >>> import networkx as nx >>> G = nx.karate_club_graph() >>> coms = algorithms.surprise_communities(G)
References: Traag, V. A., Aldecoa, R., & Delvenne, J.-C. (2015). Detecting communities using asymptotical surprise. Physical Review E, 92(2), 022816. 10.1103/PhysRevE.92.022816
Note
Reference implementation: https://github.com/vtraag/leidenalg