cdlib.algorithms.graph_entropy

graph_entropy(g_original: object, weights: str = None) → cdlib.classes.node_clustering.NodeClustering

This method takes advantage of the use of entropy with regard to information theory. Entropy is a measure of uncertainty involved in a random variable.

This approach uses a new definition, Graph Entropy, as a measure of structural complexity in a graph. This algorithm incorporates a seed-growth technique. Unlike the other seed-growth style methods, however, the graph entropy approach does not require any predetermined threshold because it searches for an optimal solution by minimizing graph entropy.

This method finds locally optimal clusters with minimal graph entropy. A seed vertex is selected at random from a candidate set of seed vertices. Then, an initial cluster which is composed of the seed vertex and its immediate neighbors is created. Next, the neighbors are iteratively evaluated for removal to minimize the initial entropy of the graph. Finally, outer boundary vertices are added recursively if their addition causes the entropy of the graph to decrease.

Supported Graph Types

Undirected Directed Weighted
Yes No No
Parameters:
  • g_original – a networkx/igraph object
  • weights – label used for the edge weights.. Default, None
Returns:

NodeClustering object

Example:
>>> from cdlib import algorithms
>>> import networkx as nx
>>> G = nx.karate_club_graph()
>>> coms = algorithms.graph_entropy(G)
References:

Kenley, E.C., Cho, Y.-R. 2011. Detecting protein complexes and functional modules from protein interaction networks: A graph entropy approach. Proteomics 11, 3835-3844.