-
Notifications
You must be signed in to change notification settings - Fork 229
Open
Description
Hi Graphies,
Current state
If I understand correctly there is currently no method for community detection in BGL (but see clustering methods). Community detection algorithms like Louvain and Leiden automatically infer hidden groupings in very large networks, identifying gene modules, customer segments, fraud rings, or system components without requiring labeled data.
Common methods include:
- K-core Decomposition
- Label Propagation
- the Louvain algorithm (popular, ~30000 citations, some statistical issues)
- the Leiden algorithm (extends Louvain, solves some issues, more complex, less standard ~ 6000 citations)
Scope
I plan to submit a PR to implement the simplest case of Louvain modularity optimization for undirected graphs only:
- minimal BGL-style API
- regression + correctness tests to reproduce validation methods/results in Blondel et al 2008
- no metric generalization yet
The benefit of starting with Louvain is that it builds requried components for Leiden, that is also more complex.
Algorithm figure from Blondel at al. 2008
Litterature and context
Related papers and ressource:
- Newman and Girvan, 2004: paper for modularity metrics, with per-community version too
- Blondel et al, 2008: Louvain foundational paper using modularity metrics
- Campigotto et al, 2014: Generalizing Louvain to multiple quality metrics + C++ implementation using Strategy pattern (see reference 4)
- Traag et al, 2019 : Leiden paper + Louvain background
- Louvain Method CS class video
- Comparative evaluation of community detection methods : methods, performances, generated community sizes distributions
Possible implementation
I will just throw a quick idea of the interface just to be sure it fits well wit BGL spirit
template <typename Graph, typename CommunityMap, typename WeightMap>
double compute_modularity(const Graph& g, const CommunityMap& communities, const WeightMap& weights)
{
// probably needs vertices(), out_edges(), target(), source()
BOOST_CONCEPT_ASSERT((boost::IncidenceGraphConcept<Graph>));
BOOST_CONCEPT_ASSERT((boost::VertexListGraphConcept<Graph>));
// directed louvain seems to require modifsm, see https://github.com/nicolasdugue/DirectedLouvain
typedef typename boost::graph_traits<Graph>::directed_category directed_category;
BOOST_STATIC_ASSERT_MSG(
(boost::is_same<directed_category, boost::undirected_tag>::value),
"compute_modularity requires an undirected graph"
);
// impl from Blondel 2008 works with weighted graphs
}
template <typename Graph, typename CommunityMap, typename WeightMap>
void louvain_clustering(const Graph& g, CommunityMap& communities, const WeightMap&) // note the write param
{
BOOST_CONCEPT_ASSERT((boost::IncidenceGraphConcept<Graph>));
BOOST_CONCEPT_ASSERT((boost::VertexListGraphConcept<Graph>));
typedef typename boost::graph_traits<Graph>::directed_category directed_category;
BOOST_STATIC_ASSERT_MSG(
(boost::is_same<directed_category, boost::undirected_tag>::value),
"louvain_clustering requires an undirected graph."
);
// hide state from user ?
std::vector<double> internal_edges_per_community(num_vertices(g));
std::vector<double> total_degree_per_community(num_vertices(g));
std::vector<double> neigh_community_weight(num_vertices(g));
// impl
// return void like djikstra ? or an enriched structured with internal results from the algo for debug ?
}
// unweighted overload
template<typename Graph, typename CommunityMap>
void louvain_clustering(const Graph& g, CommunityMap& communities)
{
louvain_clustering(g, communities, boost::make_constant_property_map(1.0));
}
// usage:
louvain_clustering(g, comm_map);
double quality = compute_modularity(g, comm_map);Metadata
Metadata
Assignees
Labels
No labels