Quick Start¶
CDlib
is a Python library that allows network partition extraction, comparison, and evaluation.
We designed it to be agnostic w.r.t. the data structure used to represent the network to be clustered: all the algorithms it implements accept interchangeably igraph/networkx objects.
Of course, such a choice comes with advantages as well as drawbacks. Here are the main ones you have to be aware of:
Advantages - Easy integration of existing/novel (python implementation of) CD algorithms; - Standardization of input and output; - Zero-configuration user interface (e.g., you do not have to reshape your data!)
Drawbacks - Algorithm performances are not comparable (execution time, scalability… they all depend on how each algorithm was originally implemented); - Memory (in)efficiency: Depending by the type of structure each algorithm requires, memory consumption could be high; - Hidden transformation times: usually not a bottleneck, moving from a graph representation to another can take “some” time (usually linear in the graph size)
Most importantly, remember that i) each algorithm will be able to handle graphs up to a given size, and ii) that maximum size may vary greatly across the exposed algorithms.
Tutorial¶
Extracting communities using CDlib
is easy as:
from cdlib import algorithms
import networkx as nx
G = nx.karate_club_graph()
coms = algorithms.louvain(G, weight='weight', resolution=1., randomize=False)
Of course, you can choose among all the algorithms available (taking care of specifying the correct parameters). As a result, you will get a Clustering object (or a more specific subclass).
Clustering objects exposes a set of methods to perform evaluation and comparisons. For instance, to get the partition modularity, write:
mod = coms.newman_girvan_modularity(g)
or, equivalently
from cdlib import evaluation
mod = evaluation.newman_girvan_modularity(g,communities)
Moreover, you can also visualize networks and communities, plot indicators, and similarity matrices… take a look at the module reference to get a few examples.
I know plain tutorials are overrated: if you want to explore CDlib
functionalities, please start playing around with our interactive Google Colab Notebook !
FAQ¶
Q1. I developed a novel Community Discovery algorithm/evaluation/visual analytics method and would like to see it integrated into CDlib
. What should I do?
A1. That is great! Just open an issue on the project GitHub briefly describing the method (provide a link to the paper where it was first introduced) and links to a Python implementation (if available). We will return to you soon to discuss the next steps.
Q2. Can you add method XXX to your library?
A2. It depends. Do you have a link to a Python implementation, or are you willing to help us implement it? If so, that is perfect. If not, everything is possible, but it will likely require some time.