cdlib.benchmark.XMark
XMark(n: int = 2000, gamma: float = 3, beta: float = 2, m_cat: tuple = ('auto', 'auto'), theta: float = 0.3, mu: float = 0.5, avg_k: int = 10, min_com: int = 20, type_attr: str = 'categorical') → [<class 'object'>, <class 'object'>]
Returns the XMark benchmark annotated graph and planted communities.
Parameters:
- n – Number of nodes in the created graph.
- gamma – Power law exponent for the degree distribution of the created graph. This value must be strictly greater than one.
- beta – Power law exponent for the community size distribution in the created graph. This value must be strictly greater than one.
- m_cat – If the attribute type is categorical, it is the number of values in the domain of the attribute.
- m_cont – If the attribute type is continuous, it is the number of peaks in the distribution (at least a bimodal distribution, i.e., m_cont=2).
- theta – If the attribute type is categorical, it specifies the percentage of noise within a cluster.
- sigma – If the attribute type is continuous, it is the standard deviation.
- mu – Fraction of inter-community edges incident to each node (the mixing parameter). This value must be in the interval [0, 1].
- avg_k – Desired average degree of nodes in the created graph. This value must be in the interval [0, n]. Exactly one of this and min_degree must be specified, otherwise a NetworkXError is raised.
- min_com – Minimum size of communities in the graph. If not specified, this is set to min_degree.
- type_attr – The attribute type. It can be “categorical” or “continuous”.
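The value constraints listed above can be checked before calling the generator. The following is a minimal sketch; `check_xmark_params` is a hypothetical helper, not part of cdlib, and it covers only the constraints stated in this parameter list.

```python
def check_xmark_params(n=2000, gamma=3.0, beta=2.0, mu=0.5, avg_k=10,
                       type_attr="categorical"):
    """Hypothetical pre-flight check (not a cdlib function).

    Returns a list of human-readable problems; an empty list means the
    values satisfy the constraints documented for XMark.
    """
    problems = []
    if gamma <= 1:
        problems.append("gamma must be strictly greater than 1")
    if beta <= 1:
        problems.append("beta must be strictly greater than 1")
    if not 0 <= mu <= 1:
        problems.append("mu must be in the interval [0, 1]")
    if not 0 <= avg_k <= n:
        problems.append("avg_k must be in the interval [0, n]")
    if type_attr not in ("categorical", "continuous"):
        problems.append("type_attr must be 'categorical' or 'continuous'")
    return problems


# The documented defaults pass every check.
default_problems = check_xmark_params()

# An out-of-range mixing value and exponent are both reported.
bad_problems = check_xmark_params(gamma=1.0, mu=1.5)
```

Returning a list rather than raising on the first failure lets a caller report every invalid value at once.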
Returns: A synthetic networkx graph and the set of planted communities (a NodeClustering object).
Example:
>>> from cdlib.benchmark import XMark
>>> N = 2000
>>> gamma = 3
>>> beta = 2
>>> m_cat = ["auto", "auto"]
>>> theta = 0.3
>>> mu = 0.5
>>> avg_k = 10
>>> min_com = 20
>>> g, coms = XMark(n=N, gamma=gamma, beta=beta, mu=mu,
...                 m_cat=m_cat,
...                 theta=theta,
...                 avg_k=avg_k, min_com=min_com,
...                 type_attr="categorical")
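One way to inspect the effect of theta on a generated benchmark is to measure how many nodes carry a label different from the majority label of their community. The sketch below uses only the standard library and works on plain Python data: a list of node lists (such as the `communities` attribute of a NodeClustering) and a node-to-label dictionary. How labels are stored on the generated graph is not stated here, so building that dictionary is left to the reader; `attribute_noise` is a hypothetical helper, and the measure is illustrative rather than XMark's own definition of theta.

```python
from collections import Counter


def attribute_noise(communities, labels):
    """Fraction of nodes whose label differs from the majority label
    of their community (hypothetical helper, not part of cdlib)."""
    mismatched = 0
    total = 0
    for community in communities:
        if not community:
            continue  # skip empty communities
        counts = Counter(labels[node] for node in community)
        majority_count = counts.most_common(1)[0][1]
        mismatched += len(community) - majority_count
        total += len(community)
    return mismatched / total if total else 0.0


# Toy data: two communities of four nodes; node 3 deviates from its
# community's majority label, so 1 node out of 8 is counted as noise.
toy_communities = [[0, 1, 2, 3], [4, 5, 6, 7]]
toy_labels = {0: "a", 1: "a", 2: "a", 3: "b",
              4: "b", 5: "b", 6: "b", 7: "b"}
noise = attribute_noise(toy_communities, toy_labels)
```

On the toy data the function returns 0.125. Applied to a generated benchmark, lower values indicate communities that are more homogeneous in the chosen attribute.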
References: Salvatore Citraro and Giulio Rossetti. “XMark: A Benchmark For Node-Attributed Community Discovery Algorithms”, 2021 (to appear)
Note
Reference implementation: https://github.com/dsalvaz/XMark