cdlib.algorithms.markov_clustering¶

markov_clustering(g_original: object, expansion: int = 2, inflation: int = 2, loop_value: int = 1, iterations: int = 100, pruning_threshold: float = 0.001, pruning_frequency: int = 1, convergence_check_frequency: int = 1) → cdlib.classes.node_clustering.NodeClustering¶

The Markov clustering algorithm (MCL) is based on simulation of (stochastic) flow in graphs. The MCL algorithm finds cluster structure in graphs by a mathematical bootstrapping procedure. The process deterministically computes (the probabilities of) random walks through the graph, and uses two operators transforming one set of probabilities into another. It does so using the language of stochastic matrices (also called Markov matrices) which capture the mathematical concept of random walks on a graph. The MCL algorithm simulates random walks within a graph by alternation of two operators called expansion and inflation.

Supported Graph Types

Undirected	Directed	Weighted
Yes	No	No

Parameters:	g_original – a networkx/igraph object expansion – The cluster expansion factor inflation – The cluster inflation factor loop_value – Initialization value for self-loops iterations – Maximum number of iterations (actual number of iterations will be less if convergence is reached) pruning_threshold – Threshold below which matrix elements will be set set to 0 pruning_frequency – Perform pruning every ‘pruning_frequency’ iterations. convergence_check_frequency – Perform the check for convergence every convergence_check_frequency iterations
Returns:	NodeClustering object
Example:

>>> from cdlib import algorithms
>>> import networkx as nx
>>> G = nx.karate_club_graph()
>>> coms = algorithms.markov_clustering(G)

References:

Enright, Anton J., Stijn Van Dongen, and Christos A. Ouzounis. An efficient algorithm for large-scale detection of protein families. Nucleic acids research 30.7 (2002): 1575-1584.

Note

Reference implementation: https://github.com/GuyAllard/markov_clustering