Skip to content

Proposal: Implement Distributed Balanced Partitioning via Linear Embedding (DBP-LE) #4875

@c0dew0rm

Description

@c0dew0rm

Describe the feature:
This proposal suggests implementing the Distributed Balanced Partitioning via Linear Embedding (DBP-LE) algorithm — originally developed by Aydin, Bateni & Mirrokni (Google Research, WSDM 2016) — as an optional graph partitioning strategy in JanusGraph.

DBP-LE addresses the challenge of balanced partitioning in large distributed graphs by:

  1. Embedding vertices into a one-dimensional linear order using affinity (common-neighbor similarity).
  2. Performing local refinements via Minimum Linear Arrangement (MinLA) and RankSwap.
  3. Applying imbalance-aware postprocessing (sliding-window min-cut / DP optimization).

The output is a vertex ordering and balanced partition boundaries that minimize edge cuts between partitions.

This algorithm is highly scalable (MapReduce-friendly), easy to integrate with JanusGraph’s backend partitioner, and has been validated on real-world production graphs (Google Maps, Twitter, Friendster).


Describe a specific use case for the feature:
Distributed graph deployments in JanusGraph currently rely on simple hash or range partitioning, or external partitioners such as METIS.
These methods either:

  • don’t scale to billions of edges, or
  • produce suboptimal cut quality (leading to excessive cross-partition traversals).

Integrating DBP-LE as a native partitioning strategy would:

  • Improve query performance and reduce network I/O by minimizing cross-machine edges.
  • Provide balanced data distribution across storage backends.
  • Enable better performance for iterative analytics (e.g., PageRank, Connected Components, community detection).
  • Support machine-learning workloads (e.g., GNN training) that require balanced graph mini-batches with minimal boundary communication.

Example configuration:

storage.partition.strategy: dbp_linear_embedding
storage.partition.alpha: 0.03
storage.partition.parts: 16

Additional context / references:

Paper: Distributed Balanced Partitioning via Linear Embedding — Aydin, Bateni, Mirrokni, WSDM 2016. DOI: [10.1145/2835776.2835829](https://doi.org/10.1145/2835776.2835829)

Expected benefits:

15–25 % reduction in edge cut size vs. METIS/FENNEL.

40 % fewer cross-shard queries in Google Maps case study.

Linear scalability to hundreds of millions of vertices.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions