# cdlib.algorithms.surprise_communities¶

surprise_communities(g_original: object, initial_membership: list = None, weights: list = None, node_sizes: list = None) → cdlib.classes.node_clustering.NodeClustering

Surprise_communities is a model where the quality function to optimize is:

$Q = m D(q \parallel \langle q \rangle)$

where $$m$$ is the number of edges, $$q = \frac{\sum_c m_c}{m}$$, is the fraction of internal edges, $$\langle q \rangle = \frac{\sum_c \binom{n_c}{2}}{\binom{n}{2}}$$ is the expected fraction of internal edges, and finally

$$D(x \parallel y) = x \ln \frac{x}{y} + (1 - x) \ln \frac{1 - x}{1 - y}$$ is the binary Kullback-Leibler divergence.

For directed graphs we can multiplying the binomials by 2, and this leaves $$\langle q \rangle$$ unchanged, so that we can simply use the same formulation. For weighted graphs we can simply count the total internal weight instead of the total number of edges for $$q$$ , while $$\langle q \rangle$$ remains unchanged.

Supported Graph Types

Undirected Directed Weighted
Yes Yes Yes
Parameters: g_original – a networkx/igraph object initial_membership – list of int Initial membership for the partition. If None then defaults to a singleton partition. Deafault None weights – list of double, or edge attribute Weights of edges. Can be either an iterable or an edge attribute. Deafault None node_sizes – list of int, or vertex attribute Sizes of nodes are necessary to know the size of communities in aggregate graphs. Usually this is set to 1 for all nodes, but in specific cases this could be changed. Deafault None NodeClustering object
>>> from cdlib import algorithms
>>> import networkx as nx
>>> G = nx.karate_club_graph()
>>> coms = algorithms.surprise_communities(G)


Traag, V. A., Aldecoa, R., & Delvenne, J.-C. (2015). Detecting communities using asymptotical surprise. Physical Review E, 92(2), 022816. 10.1103/PhysRevE.92.022816

Note

Reference implementation: https://github.com/vtraag/leidenalg