Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/KarypisLab/ParMETIS/llms.txt

Use this file to discover all available pages before exploring further.

ParMETIS distributes a graph across MPI processes so that each process owns a contiguous range of vertices and only the adjacency data for those vertices. Before you call any ParMETIS function, you must build four core arrays — vtxdist, xadj, adjncy, and optionally vwgt and adjwgt — and populate them on every process.

CSR format

ParMETIS stores the local adjacency structure in Compressed Sparse Row (CSR) format, the same format used throughout the sparse linear algebra community. Two arrays together describe the edges:
  • xadj — row pointers of length (nvtxs + 1). xadj[i] is the index into adjncy where the neighbor list of local vertex i begins, and xadj[i+1] - xadj[i] is the degree of vertex i. xadj[0] is always 0 and xadj[nvtxs] equals the total number of local edges.
  • adjncy — flat adjacency list of length equal to the number of local edges. Each entry is the global index of a neighboring vertex.

Example: 4-vertex undirected graph

Consider a 4-vertex graph with these edges: 0–1, 0–3, 1–2, 1–3, 2–3.
Vertex 0: neighbors 1, 3
Vertex 1: neighbors 0, 2, 3
Vertex 2: neighbors 1, 3
Vertex 3: neighbors 0, 1, 2
The single-process CSR layout is:
//        v0  v1  v2  v3  end
idx_t xadj[5]   = { 0,  2,  5,  7,  10 };

//           v0's nbrs   v1's nbrs      v2's nbrs   v3's nbrs
idx_t adjncy[10] = { 1, 3,   0, 2, 3,   1, 3,   0, 1, 2 };
Reading the layout:
  • Vertex 0 uses adjncy[xadj[0]..xadj[1]-1] = adjncy[0..1] = {1, 3}.
  • Vertex 1 uses adjncy[xadj[1]..xadj[2]-1] = adjncy[2..4] = {0, 2, 3}.
  • Vertex 2 uses adjncy[xadj[2]..xadj[3]-1] = adjncy[5..6] = {1, 3}.
  • Vertex 3 uses adjncy[xadj[3]..xadj[4]-1] = adjncy[7..9] = {0, 1, 2}.
adjncy entries are global vertex indices, even for vertices owned by the same process. When numflag = 0 (the default), indices are 0-based.

Distributed representation with vtxdist

In a parallel run, the full graph is split across nprocs processes. The vtxdist array tells every process the global vertex range owned by each process:
  • vtxdist has length nprocs + 1.
  • Process p owns global vertices vtxdist[p] through vtxdist[p+1] - 1.
  • Every process holds an identical copy of vtxdist.
Each process independently builds its own xadj and adjncy covering only its local vertices, but using global indices in adjncy.

Example: 4-vertex graph on 2 processes

/* Both processes hold this identical array */
idx_t vtxdist[3] = { 0, 2, 4 };
/* Process 0 owns vertices 0..1, process 1 owns vertices 2..3 */

/* Process 0: local vertices 0 and 1 */
idx_t xadj_p0[3]   = { 0, 2, 5 };
idx_t adjncy_p0[5] = { 1, 3,   0, 2, 3 };
/*                      ^v0^    ^^^v1^^^ */

/* Process 1: local vertices 2 and 3 */
idx_t xadj_p1[3]   = { 0, 2, 5 };
idx_t adjncy_p1[5] = { 1, 3,   0, 1, 2 };
/*                      ^v2^    ^^^v3^^^ */
The local xadj on each process always starts at 0 and counts edges relative to the local vertex list. The adjncy entries remain global vertex IDs.
You can derive the local vertex count for process p as vtxdist[p+1] - vtxdist[p]. The local edge count is xadj[nvtxs].

Vertex and edge weights

ParMETIS supports optional integer weights on vertices and edges, plus a communication-size hint for vertices.
The vwgt array stores integer vertex weights. It has length nvtxs * ncon, where ncon is the number of balance constraints. For the common case of a single constraint (ncon = 1), vwgt[i] is the weight of local vertex i.
idx_t ncon = 1;
/* Local vertices 0 and 1 have weights 3 and 5 */
idx_t vwgt[2] = { 3, 5 };
ParMETIS balances partitions so that the total vertex weight in each partition is within ubvec of the target fraction specified by tpwgts.

The wgtflag parameter

wgtflag is a bitmask that tells ParMETIS which weight arrays you are providing:
wgtflagVertex weights (vwgt)Edge weights (adjwgt)
0Not usedNot used
1Not usedUsed
2UsedNot used
3UsedUsed
The bit encoding is: bit 0 (value 1) enables edge weights, bit 1 (value 2) enables vertex weights.
/* Provide both vertex and edge weights */
idx_t wgtflag = 3;

ParMETIS_V3_PartKway(
    vtxdist, xadj, adjncy,
    vwgt,    /* vertex weights — used because bit 1 is set */
    adjwgt,  /* edge weights   — used because bit 0 is set */
    &wgtflag, ...
);
When wgtflag includes vertex weights (wgtflag = 2 or 3), you must also pass the correct ncon value. The vwgt array must have exactly nvtxs * ncon entries.

The numflag parameter

numflag controls the indexing convention used throughout vtxdist, xadj, and adjncy:
  • numflag = 0C-style 0-based indexing (default). Vertex indices run from 0 to gnvtxs - 1.
  • numflag = 1Fortran-style 1-based indexing. Vertex indices run from 1 to gnvtxs. vtxdist, xadj, and all entries in adjncy must be incremented by 1.
ParMETIS converts 1-based input to 0-based internally at the start of each call, and converts back on return.
/* Fortran-style layout for the 4-vertex example, single process */
idx_t vtxdist[2]  = { 1, 5 };           /* vertices 1..4 */
idx_t xadj[5]     = { 1, 3, 6, 8, 11 }; /* 1-based offsets */
idx_t adjncy[10]  = { 2, 4, 1, 3, 4, 2, 4, 1, 2, 3 }; /* global 1-based IDs */
idx_t numflag     = 1;

Multi-constraint vertex weights

When you set ncon > 1, each vertex carries ncon weight values — one per balance constraint. The values are stored interleaved in vwgt: vertex i has weights at vwgt[i * ncon] through vwgt[i * ncon + ncon - 1].
idx_t ncon = 2;  /* Two balance constraints */

/*
 * Local vertices 0 and 1, each with 2 weights:
 *   vertex 0: constraint-0 weight = 3, constraint-1 weight = 7
 *   vertex 1: constraint-0 weight = 5, constraint-1 weight = 2
 */
idx_t vwgt[4] = { 3, 7,   5, 2 };
/*               ^--- v0 ---^  ^--- v1 ---^ */
ParMETIS balances each constraint independently: partition loads for constraint c must satisfy the ubvec[c] tolerance relative to the target fractions in tpwgts[p * ncon + c].
The graph_t structure in libparmetis/struct.h stores the distributed graph internally. The key fields mirror the external API arrays:
typedef struct graph_t {
    idx_t gnvtxs;   /* total global vertex count */
    idx_t nvtxs;    /* local vertex count on this process */
    idx_t nedges;   /* local edge count */
    idx_t ncon;     /* number of balance constraints */
    idx_t *xadj;    /* local CSR row pointers */
    idx_t *adjncy;  /* local adjacency list (global IDs) */
    idx_t *vwgt;    /* vertex weights, length nvtxs * ncon */
    idx_t *adjwgt;  /* edge weights, length nedges */
    idx_t *vsize;   /* vertex communication sizes */
    idx_t *vtxdist; /* global vertex distribution, length npes+1 */
    /* ... communication, coarsening, and partition fields ... */
} graph_t;
ParMETIS copies your input arrays into this structure at the start of each API call via SetupGraph(). The free_xadj, free_adjncy, free_vwgt, and free_adjwgt flags track whether ParMETIS owns the allocations and should free them.
  • Allocate vtxdist with nprocs + 1 elements and broadcast it so every process has the same values.
  • Allocate xadj with nvtxs + 1 elements on each process. xadj[0] must be 0.
  • Allocate adjncy with xadj[nvtxs] elements (the total local edge count).
  • If you provide vwgt, allocate nvtxs * ncon elements. If you provide adjwgt, allocate xadj[nvtxs] elements.
  • ParMETIS does not modify your input arrays except when numflag = 1, in which case it converts to 0-based numbering on entry and back to 1-based on return.

Build docs developers (and LLMs) love