# Node Clustering¶

Algorithms falling in this category generate communities composed by nodes. The communities can represent neat, crisp, partition as well as overlapping or even fuzzy ones.

Note

The following lists are aligned to CD methods available in the GitHub main branch of CDlib.

In particular, the current version of the library on pypl (that can be installed through pip) does not include the following algorithms: belief, ga.

## Crisp Communities¶

A clustering is said to be a partition if each node belongs to one and only one community. Methods in this subclass return as result a NodeClustering object instance.

 agdl(g_original, number_communities, kc) AGDL is a graph-based agglomerative algorithm, for clustering high-dimensional data. aslpaw(g_original) ASLPAw can be used for disjoint and overlapping community detection and works on weighted/unweighted and directed/undirected networks. async_fluid(g_original, k) Fluid Communities (FluidC) is based on the simple idea of fluids (i.e., communities) interacting in an environment (i.e., a non-complete graph), expanding and contracting. belief(g_original[, max_it, eps, …]) Belief community seeks the consensus of many high-modularity partitions. cpm(g_original[, initial_membership, …]) CPM is a model where the quality function to optimize is: chinesewhispers(g_original[, weighting, …]) Fuzzy graph clustering that (i) creates an intermediate representation of the input graph, which reflects the “ambiguity” of its nodes, and (ii) uses hard clustering to discover crisp clusters in such “disambiguated” intermediate graph. der(g_original[, walk_len, threshold, …]) DER is a Diffusion Entropy Reducer graph clustering algorithm. edmot(g_original[, component_count, cutoff]) The algorithm first creates the graph of higher order motifs. eigenvector(g_original) Newman’s leading eigenvector method for detecting community structure based on modularity. em(g_original, k) EM is based on based on a mixture model. ga(g_original[, population, generation, r]) Genetic based approach to discover communities in social networks. gdmp2(g_original[, min_threshold]) Gdmp2 is a method for identifying a set of dense subgraphs of a given sparse graph. girvan_newman(g_original, level) The Girvan–Newman algorithm detects communities by progressively removing edges from the original graph. greedy_modularity(g_original[, weight]) The CNM algorithm uses the modularity to find the communities strcutures. infomap(g_original) Infomap is based on ideas of information theory. label_propagation(g_original) The Label Propagation algorithm (LPA) detects communities using network structure alone. leiden(g_original[, initial_membership, weights]) The Leiden algorithm is an improvement of the Louvain algorithm. louvain(g_original[, weight, resolution, …]) Louvain maximizes a modularity score for each community. markov_clustering(g_original[, expansion, …]) The Markov clustering algorithm (MCL) is based on simulation of (stochastic) flow in graphs. rber_pots(g_original[, initial_membership, …]) rber_pots is a model where the quality function to optimize is: rb_pots(g_original[, initial_membership, …]) Rb_pots is a model where the quality function to optimize is: scan(g_original, epsilon, mu) SCAN (Structural Clustering Algorithm for Networks) is an algorithm which detects clusters, hubs and outliers in networks. significance_communities(g_original[, …]) Significance_communities is a model where the quality function to optimize is: spinglass(g_original) Spinglass relies on an analogy between a very popular statistical mechanic model called Potts spin glass, and the community structure. surprise_communities(g_original[, …]) Surprise_communities is a model where the quality function to optimize is: walktrap(g_original) walktrap is an approach based on random walks. sbm_dl(g_original[, B_min, B_max, deg_corr]) Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. sbm_dl_nested(g_original[, B_min, B_max, …]) Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models.

## Overlapping Communities¶

A clustering is said to be overlapping if any generic node can be assigned to more than one community. Methods in this subclass return as result a NodeClustering object instance.

 angel(g_original, threshold[, …]) Angel is a node-centric bottom-up community discovery algorithm. big_clam(g_original[, dimensions, …]) BigClam is an overlapping community detection method that scales to large networks. conga(g_original, number_communities) CONGA (Cluster-Overlap Newman Girvan Algorithm) is an algorithm for discovering overlapping communities. congo(g_original, number_communities[, height]) CONGO (CONGA Optimized) is an optimization of the CONGA algortithm. danmf(g_original[, layers, pre_iterations, …]) The procedure uses telescopic non-negative matrix factorization in order to learn a cluster memmbership distribution over nodes. demon(g_original, epsilon[, min_com_size]) Demon is a node-centric bottom-up overlapping community discovery algorithm. ego_networks(g_original[, level]) Ego-networks returns overlapping communities centered at each nodes within a given radius. egonet_splitter(g_original[, resolution]) The method first creates the egonets of nodes. kclique(g_original, k) Find k-clique communities in graph using the percolation method. lais2(g_original) LAIS2 is an overlapping community discovery algorithm based on the density function. lemon(g_original, seeds[, min_com_size, …]) Lemon is a large scale overlapping community detection method based on local expansion via minimum one norm. lfm(g_original, alpha) LFM is based on the local optimization of a fitness function. multicom(g_original, seed_node) MULTICOM is an algorithm for detecting multiple local communities, possibly overlapping, by expanding the initial seed set. nmnf(g_original[, dimensions, clusters, …]) The procedure uses joint non-negative matrix factorization with modularity based regul;arization in order to learn a cluster memmbership distribution over nodes. nnsed(g_original[, dimensions, iterations, seed]) The procedure uses non-negative matrix factorization in order to learn an unnormalized cluster membership distribution over nodes. node_perception(g_original, threshold, …) Node perception is based on the idea of joining together small sets of nodes. overlapping_seed_set_expansion(g_original, seeds) OSSE is an overlapping community detection algorithm optimizing the conductance community score The algorithm uses a seed set expansion approach; the key idea is to find good seeds, and then expand these seed sets using the personalized PageRank clustering procedure. percomvc(g_original) The PercoMVC approach composes of two steps. slpa(g_original[, t, r]) SLPA is an overlapping community discovery that extends tha LPA. wCommunity(g_original[, min_bel_degree, …]) Algorithm to identify overlapping communities in weighted graphs

## Fuzzy Communities¶

A clustering is said to be a fuzzy if each node can belongs (with a different degree of likelihood) to more than one community. Methods in this subclass return as result a FuzzyNodeClustering object instance.

 frc_fgsn(g_original, theta, eps, r) Fuzzy-Rough Community Detection on Fuzzy Granular model of Social Network.

## Node Attribute¶

Methods in this subclass return as result a AttrNodeClustering object instance.

 eva(g_original, labels[, weight, …]) The Eva algorithm extends the Louvain approach in order to deal with the attributes of the nodes (aka Louvain Extended to Vertex Attributes). ilouvain(g_original, labels, id) The I-Louvain algorithm extends the Louvain approach in order to deal only with the scalar attributes of the nodes.

## Bipartite Graph Communities¶

Methods in this subclass return as result a BiNodeClustering object instance.

 bimlpa(g_original[, theta, lambd]) BiMLPA is designed to detect the many-to-many correspondence community in bipartite networks using multi-label propagation algorithm. CPM_Bipartite(g_original, …[, …]) CPM_Bipartite is the extension of CPM to bipartite graphs infomap_bipartite(g_original) Infomap is based on ideas of information theory.

## Antichain Communities¶

Methods in this subclass are designed to extract communities from Directed Acyclic Graphs (DAG) and return as result a NodeClustering object instance.

 siblinarity_antichain(g_original[, …]) The algorithm extract communities from a DAG that (i) respects its intrinsic order and (ii) are composed of similar nodes.