Partitioning algorithms and the multilevel framework

ParMETIS provides several partitioning algorithms for different scenarios. All of them build on the same multilevel framework — a three-phase approach that produces high-quality partitions efficiently — but they differ in how they initialize the partition and how much graph structure or coordinate data they use.

The multilevel framework

Every graph partitioning and ordering algorithm in ParMETIS follows the same three-phase multilevel structure. This approach scales to large distributed graphs because the expensive initial partitioning happens only on a much smaller coarsened graph.

Coarsening

ParMETIS repeatedly collapses the graph by finding a matching — a set of edges or vertex pairs — and merging each matched pair into a single coarser vertex. The weight of the new vertex is the sum of its constituents’ weights, and the edges of the new vertex are the union of their neighbors (excluding internal edges).This process repeats until the graph has fewer than CoarsenTo vertices (computed internally as a function of nparts and ncon). The sequence of graphs from finest to coarsest forms a multilevel hierarchy.Two matching strategies control the coarsening quality:

Constant	Value	Behavior
`PARMETIS_MTYPE_LOCAL`	1	Each vertex can only match with vertices on the same process. Faster but may produce lower-quality coarsenings at process boundaries.
`PARMETIS_MTYPE_GLOBAL`	2	Vertices can match with vertices on neighboring processes. Produces better coarsening quality, especially for highly connected distributed graphs.

The coarsening stops when the graph size reduction between successive levels falls below COARSEN_FRACTION (0.75), indicating that further coarsening is no longer effective.

Initial partitioning

Once the coarse graph is small enough, ParMETIS computes an initial partition. For k-way partitioning, it runs NIPARTS (8) random initial partition trials using a greedy algorithm and retains the best result.If the coarse graph fits on a single process, ParMETIS calls the serial METIS library (METIS_PartGraphKway) directly. This produces a high-quality initial partition at low cost because the graph is small.

Uncoarsening and refinement

ParMETIS projects the partition back up through the coarsening hierarchy. At each level, it runs k-way FM (Fiduccia-Mattheyses) refinement (KWayFM) to improve the partition. The refinement moves boundary vertices between partitions to reduce the edge cut while keeping partition weights within the balance tolerance.For multi-constraint problems (ncon > 1), KWayBalance runs a balancing pass before the FM pass whenever the imbalance exceeds ubavg + 0.035.

The coarsening target is computed as min(gnvtxs + 1, 25 * ncon * max(npes, nparts)). For typical problems this means the coarse graph has on the order of tens of thousands of vertices regardless of the original size.

k-way partitioning: `ParMETIS_V3_PartKway`

ParMETIS_V3_PartKway is the primary partitioning function. It computes a k-way partition directly — without recursively bisecting — using the full multilevel framework described above.

int ParMETIS_V3_PartKway(
    idx_t *vtxdist, idx_t *xadj, idx_t *adjncy, idx_t *vwgt,
    idx_t *adjwgt, idx_t *wgtflag, idx_t *numflag, idx_t *ncon, idx_t *nparts,
    real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t *edgecut, idx_t *part,
    MPI_Comm *comm);

Use ParMETIS_V3_PartKway when:

You do not have vertex coordinate data.
You need a fresh partition from scratch (not a refinement of an existing one).
You have nparts that may differ from npes.

If the total graph has fewer than SMALLGRAPH (10,000) vertices or fewer than 20 vertices per process, ParMETIS partitions it serially using METIS instead of running the full parallel algorithm.

Geometry-assisted partitioning

When vertex coordinates are available, ParMETIS can use spatial information to seed a better initial partition or to compute a fast geometry-only partition.

ParMETIS_V3_PartGeom
ParMETIS_V3_PartGeomKway

ParMETIS_V3_PartGeom partitions the graph using space-filling curves (SFC) on coordinate data alone, without looking at the graph edges at all.

int ParMETIS_V3_PartGeom(
    idx_t *vtxdist, idx_t *ndims, real_t *xyz,
    idx_t *part, MPI_Comm *comm);

The algorithm:

Maps each vertex’s coordinates into a Z-order (Morton) curve value using ndims-dimensional bit interleaving with 2^9 = 512 bins per dimension.
Sorts vertices globally along the SFC using parallel sample sort (PseudoSampleSort).
Assigns consecutive ranges of the sorted order to each of the npes partitions.

This produces a partition in O(n log n) time with minimal communication. The partition quality is lower than graph-based methods because edge connectivity is ignored, but the speed makes it useful for quick load balancing or as a building block.

ParMETIS_V3_PartGeom always produces exactly npes partitions (one per process). Use ParMETIS_V3_PartGeomKway if you need nparts != npes.

ParMETIS_V3_PartGeomKway combines geometry-based initial partitioning with multilevel k-way refinement. It runs ParMETIS_V3_PartGeom to quickly redistribute the graph, then applies the full multilevel refinement loop.

int ParMETIS_V3_PartGeomKway(
    idx_t *vtxdist, idx_t *xadj, idx_t *adjncy, idx_t *vwgt,
    idx_t *adjwgt, idx_t *wgtflag, idx_t *numflag, idx_t *ndims, real_t *xyz,
    idx_t *ncon, idx_t *nparts, real_t *tpwgts, real_t *ubvec, idx_t *options,
    idx_t *edgecut, idx_t *part, MPI_Comm *comm);

The xyz array stores coordinates in interleaved order: vertex i has coordinates xyz[i * ndims] through xyz[i * ndims + ndims - 1].

idx_t  ndims  = 3;        /* 3D mesh coordinates */
real_t xyz[6] = {
    1.0, 2.0, 0.0,        /* vertex 0: (x=1, y=2, z=0) */
    3.0, 1.5, 1.0,        /* vertex 1: (x=3, y=1.5, z=1) */
};

Use ParMETIS_V3_PartGeomKway when you have 2D or 3D mesh coordinates and want the best partition quality. The geometry guides the initial partition into a configuration close to optimal, and the subsequent k-way refinement polishes it.

ParMETIS_V3_RefineKway refines an existing partition without repartitioning from scratch. You provide a part array that already assigns each local vertex to a partition, and ParMETIS improves it using the multilevel k-way refinement loop.

int ParMETIS_V3_RefineKway(
    idx_t *vtxdist, idx_t *xadj, idx_t *adjncy, idx_t *vwgt,
    idx_t *adjwgt, idx_t *wgtflag, idx_t *numflag, idx_t *ncon, idx_t *nparts,
    real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t *edgecut,
    idx_t *part, MPI_Comm *comm);

Use ParMETIS_V3_RefineKway when:

You already have a reasonable partition and want to improve it without the overhead of full repartitioning.
You are iterating: you computed a partition, modified some vertex weights slightly, and want to recover balance with minimal movement.

The options array element options[PMV3_OPTION_PSR] controls the processor-subdomain relationship:

Constant	Value	Meaning
`PARMETIS_PSR_COUPLED`	1	The number of partitions equals the number of processes (`nparts == npes`). Each process manages exactly one partition during refinement.
`PARMETIS_PSR_UNCOUPLED`	2	The number of partitions differs from the number of processes. ParMETIS maps partitions to processes as needed.

Use PARMETIS_PSR_COUPLED when running a simulation where each MPI rank corresponds to one partition. Use PARMETIS_PSR_UNCOUPLED when you compute more partitions than processes, such as when preparing data for a run with more ranks.

Adaptive repartitioning: `ParMETIS_V3_AdaptiveRepart`

ParMETIS_V3_AdaptiveRepart is designed for evolving meshes — simulations where vertex weights change over time as load shifts (for example, due to adaptive mesh refinement). It balances the quality of the new partition against the cost of redistributing data.

int ParMETIS_V3_AdaptiveRepart(
    idx_t *vtxdist, idx_t *xadj, idx_t *adjncy, idx_t *vwgt,
    idx_t *vsize, idx_t *adjwgt, idx_t *wgtflag, idx_t *numflag, idx_t *ncon,
    idx_t *nparts, real_t *tpwgts, real_t *ubvec, real_t *ipc2redist,
    idx_t *options, idx_t *edgecut, idx_t *part, MPI_Comm *comm);

The key parameter is ipc2redist (inter-partition communication to redistribution cost ratio). It controls the trade-off:

Low ipc2redist (e.g., 0.001) — redistribution is cheap; aggressively minimize edge cut regardless of how many vertices move.
High ipc2redist (e.g., 1000.0) — redistribution is expensive; prefer keeping vertices in place even if it results in a higher edge cut.

/* Treat redistribution as moderately expensive */
real_t ipc2redist = 100.0;

/* vsize describes the data volume to move per vertex */
idx_t vsize[nvtxs]; /* populate with per-vertex data sizes */

ParMETIS_V3_AdaptiveRepart(
    vtxdist, xadj, adjncy,
    vwgt, vsize, adjwgt,
    &wgtflag, &numflag, &ncon, &nparts,
    tpwgts, ubvec, &ipc2redist,
    options, &edgecut, part, &comm
);

Separator refinement types

The ordering algorithms (ParMETIS_V3_NodeND, ParMETIS_V32_NodeND) use separator-based refinement instead of FM refinement. Two strategies are available via the rtype parameter:

Constant	Value	Behavior
`PARMETIS_SRTYPE_GREEDY`	1	Visits boundary vertices from highest to lowest gain and moves each one that improves the separator. Fast but may miss improvements that require simultaneous moves.
`PARMETIS_SRTYPE_2PHASE`	2	First builds the separator, then refines it using `PARMETIS_SRTYPE_GREEDY` in a second phase. Generally produces smaller, better-balanced separators at the cost of additional computation.

PARMETIS_SRTYPE_2PHASE is the default when using ParMETIS_V3_NodeND (which calls ParMETIS_V32_NodeND with rtype = NULL).

Processor-subdomain coupling (PSR)

The PARMETIS_PSR_COUPLED and PARMETIS_PSR_UNCOUPLED constants are relevant for ParMETIS_V3_RefineKway and ParMETIS_V3_AdaptiveRepart. They control how ParMETIS internally maps the nparts subdomains onto the npes processes during refinement.In the coupled case, each process is responsible for exactly one partition, which simplifies communication patterns and can improve performance. In the uncoupled case, the mapping is computed dynamically, allowing nparts to be any value.Set options[PMV3_OPTION_PSR] to one of these constants. PMV3_OPTION_PSR equals 3, the same index as PMV3_OPTION_IPART.

Debug levels

Set options[PMV3_OPTION_DBGLVL] to a bitwise OR of these flags to enable diagnostic output:

Flag	Value	Output
`PARMETIS_DBGLVL_TIME`	1	Per-phase timing
`PARMETIS_DBGLVL_INFO`	2	General information
`PARMETIS_DBGLVL_PROGRESS`	4	Coarsening progress per level
`PARMETIS_DBGLVL_REFINEINFO`	8	Refinement statistics
`PARMETIS_DBGLVL_MATCHINFO`	16	Matching information
`PARMETIS_DBGLVL_RMOVEINFO`	32	Vertex movement information
`PARMETIS_DBGLVL_REMAP`	64	Remapping decisions

options[0] = 1;  /* enable options array */
options[PMV3_OPTION_DBGLVL] = PARMETIS_DBGLVL_TIME | PARMETIS_DBGLVL_PROGRESS;

Get Started

Concepts

Guides

Configuration

Partitioning algorithms and the multilevel framework

The multilevel framework

k-way partitioning: `ParMETIS_V3_PartKway`

Geometry-assisted partitioning

k-way refinement: `ParMETIS_V3_RefineKway`

Adaptive repartitioning: `ParMETIS_V3_AdaptiveRepart`

Build docs developers (and LLMs) love

Get Started

Concepts

Guides

Configuration

Documentation Index

​The multilevel framework

​k-way partitioning: ParMETIS_V3_PartKway

​Geometry-assisted partitioning

​k-way refinement: ParMETIS_V3_RefineKway

​Adaptive repartitioning: ParMETIS_V3_AdaptiveRepart

Build docs developers (and LLMs) love

The multilevel framework

k-way partitioning: `ParMETIS_V3_PartKway`

Geometry-assisted partitioning

k-way refinement: `ParMETIS_V3_RefineKway`

Adaptive repartitioning: `ParMETIS_V3_AdaptiveRepart`