cdlib.algorithms.gemsec

gemsec(g_original: object, walk_number: int = 5, walk_length: int = 80, dimensions: int = 32, negative_samples: int = 5, window_size: int = 5, learning_rate: float = 0.1, clusters: int = 10, gamma: float = 0.1, seed: int = 42) → cdlib.classes.node_clustering.NodeClustering

GEMSEC (Graph Embedding with Self Clustering) uses random walks to approximate the pointwise mutual information matrix obtained by pooling normalized adjacency matrix powers. This matrix is decomposed by an approximate factorization technique that is combined with a k-means-like clustering cost.
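The random-walk step described above can be illustrated with a small, standard-library-only sketch. It is not the code cdlib or the reference implementation runs: the helper names sample_walks and context_pairs, the adjacency-dictionary input, and the toy graph are assumptions made for illustration. The sketch samples uniform random walks from every node and then lists the (center, context) node pairs that fall within a window, which is the kind of co-occurrence data a walk-based embedding would typically be trained on.

```python
import random


def sample_walks(adjacency, walk_number=5, walk_length=80, seed=42):
    """Sample uniform random walks starting from every node.

    `adjacency` maps each node to a list of its neighbours. Hypothetical
    helper for illustration, not the routine used by cdlib or karateclub.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walk_number):
        for start in adjacency:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


def context_pairs(walk, window_size=5):
    """Yield (center, context) pairs at most `window_size` steps apart."""
    for i, center in enumerate(walk):
        lo = max(0, i - window_size)
        hi = min(len(walk), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, walk[j]


# Toy graph (assumed for the example): the 4-cycle 0-1-2-3-0.
toy_graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
walks = sample_walks(toy_graph, walk_number=2, walk_length=6, seed=42)
pairs = list(context_pairs(walks[0], window_size=2))
```

In this sketch, walk_number is interpreted as the number of walks started from each node, and window_size as the largest distance along a walk at which two nodes still count as co-occurring. Both readings are assumptions of the sketch rather than statements about the library's internals.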

Supported Graph Types

Undirected   Directed   Weighted
Yes          Yes        No
Parameters:
  • g_original – A networkx/igraph graph object.
  • walk_number – Number of random walks. Default is 5.
  • walk_length – Length of random walks. Default is 80.
  • dimensions – Dimensionality of embedding. Default is 32.
  • negative_samples – Number of negative samples. Default is 5.
  • window_size – Matrix power order. Default is 5.
  • learning_rate – Gradient descent learning rate. Default is 0.1.
  • clusters – Number of cluster centers. Default is 10.
  • gamma – Clustering cost weight coefficient. Default is 0.1.
  • seed – Random seed value. Default is 42.
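To make the roles of clusters and gamma more concrete, the following sketch computes a k-means-like clustering cost for a set of node embeddings. It assumes the term is the sum of Euclidean distances from each embedding to its nearest cluster center, scaled by gamma. The function name clustering_cost and the toy vectors are hypothetical, and the library's actual loss may differ in detail.

```python
import math


def clustering_cost(embeddings, centers, gamma=0.1):
    """Return gamma times the summed distance of each node to its nearest center.

    Hypothetical helper: `embeddings` maps node -> vector, `centers` is a
    list of cluster-center vectors (one per cluster).
    """
    total = 0.0
    for vector in embeddings.values():
        # Distance from this node's embedding to the closest cluster center.
        total += min(math.dist(vector, center) for center in centers)
    return gamma * total


# Toy data (assumed): three 2-dimensional embeddings and two cluster centers.
embeddings = {"a": (0.0, 0.0), "b": (0.0, 1.0), "c": (4.0, 0.0)}
centers = [(0.0, 0.0), (4.0, 3.0)]
cost = clustering_cost(embeddings, centers, gamma=0.1)
```

Under this assumption, a larger gamma pulls embeddings more strongly toward the cluster centers, while a gamma of 0 would leave only the embedding objective.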
Returns:

NodeClustering object

Example:
>>> from cdlib import algorithms
>>> import networkx as nx
>>> G = nx.karate_club_graph()
>>> coms = algorithms.gemsec(G)
References:

Rozemberczki, B., Davies, R., Sarkar, R., & Sutton, C. (2019, August). GEMSEC: Graph embedding with self clustering. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (pp. 65-72).

Note

Reference implementation: https://karateclub.readthedocs.io/