Quickstart: partition a graph in parallel with ParMETIS

This guide walks you through writing a minimal C program that distributes a graph across MPI processes and calls ParMETIS_V3_PartKway to partition it. By the end you will have a working program you can compile and run, and an understanding of the key data structures ParMETIS expects.

Complete the installation guide before starting. You need mpicc, libparmetis, and libmetis installed and on your compiler’s search path.

The example graph

The example uses a simple 8-vertex ring graph (each vertex connected to its two neighbors), distributed evenly across 4 MPI processes—2 vertices per process. ParMETIS will assign each vertex to one of 4 partitions.

Vertices:  0 - 1 - 2 - 3 - 4 - 5 - 6 - 7 - (back to 0)
Processes: [0: 0,1] [1: 2,3] [2: 4,5] [3: 6,7]

Steps

Understand the key data structures

ParMETIS represents the distributed graph using three arrays that every process must populate before calling any partitioning function.vtxdist — a global array of length nprocs + 1 that describes how vertices are distributed. Process p owns vertices vtxdist[p] through vtxdist[p+1] - 1. Every process holds an identical copy.xadj — the local CSR row-pointer array. For a process owning n vertices, xadj has length n + 1. xadj[i] is the index into adjncy where the neighbors of local vertex i begin.adjncy — the local adjacency list. Each entry is the global index of a neighboring vertex.

All vertex indices in vtxdist and adjncy use 0-based (C) numbering by default. Set numflag = 1 to use 1-based (Fortran) numbering.

Write the example program

Create a file named example.c with the following content:

example.c

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <parmetis.h>

int main(int argc, char *argv[]) {
    MPI_Comm comm;
    int nprocs, rank;

    /* 1. Initialize MPI */
    MPI_Init(&argc, &argv);
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    MPI_Comm_size(comm, &nprocs);
    MPI_Comm_rank(comm, &rank);

    if (nprocs != 4) {
        if (rank == 0)
            fprintf(stderr, "This example requires exactly 4 processes.\n");
        MPI_Finalize();
        return 1;
    }

    /*
     * 2. Set up vtxdist.
     *    8 vertices distributed 2 per process: [0,2), [2,4), [4,6), [6,8)
     */
    idx_t vtxdist[5] = {0, 2, 4, 6, 8};

    /*
     * 3. Set up local adjacency (xadj + adjncy) for a ring graph.
     *    Each process owns 2 vertices. Each vertex has 2 neighbors.
     *
     *    Process 0: vertex 0 -> {7, 1},  vertex 1 -> {0, 2}
     *    Process 1: vertex 2 -> {1, 3},  vertex 3 -> {2, 4}
     *    Process 2: vertex 4 -> {3, 5},  vertex 5 -> {4, 6}
     *    Process 3: vertex 6 -> {5, 7},  vertex 7 -> {6, 0}
     */
    idx_t xadj[3]    = {0, 2, 4};   /* 2 local vertices, each with 2 edges */
    idx_t adjncy_p0[4] = {7, 1, 0, 2};
    idx_t adjncy_p1[4] = {1, 3, 2, 4};
    idx_t adjncy_p2[4] = {3, 5, 4, 6};
    idx_t adjncy_p3[4] = {5, 7, 6, 0};

    idx_t *adjncy;
    switch (rank) {
        case 0: adjncy = adjncy_p0; break;
        case 1: adjncy = adjncy_p1; break;
        case 2: adjncy = adjncy_p2; break;
        default: adjncy = adjncy_p3; break;
    }

    /*
     * 4. Call ParMETIS_V3_PartKway.
     *
     *    vwgt    = NULL  (no vertex weights)
     *    adjwgt  = NULL  (no edge weights)
     *    wgtflag = 0     (no weights)
     *    numflag = 0     (0-based indexing)
     *    ncon    = 1     (one balance constraint)
     *    nparts  = 4     (four partitions)
     *    tpwgts  = equal fractions (1/nparts each)
     *    ubvec   = 1.05  (5% imbalance tolerance)
     *    options = {0}   (default options)
     */
    idx_t  wgtflag = 0;
    idx_t  numflag = 0;
    idx_t  ncon    = 1;
    idx_t  nparts  = 4;
    idx_t  edgecut;
    idx_t  part[2];           /* one partition label per local vertex */

    real_t tpwgts[4];         /* target partition weights */
    real_t ubvec = 1.05f;
    idx_t  options[3] = {0, 0, 0};

    for (int i = 0; i < nparts; i++)
        tpwgts[i] = 1.0f / (real_t)nparts;

    ParMETIS_V3_PartKway(
        vtxdist, xadj, adjncy,
        NULL,       /* vwgt   */
        NULL,       /* adjwgt */
        &wgtflag, &numflag, &ncon, &nparts,
        tpwgts, &ubvec, options,
        &edgecut, part, &comm
    );

    /* 5. Print the resulting partition assignment */
    for (int p = 0; p < nprocs; p++) {
        if (rank == p) {
            idx_t base = vtxdist[rank];
            for (int i = 0; i < 2; i++)
                printf("vertex %lld -> partition %lld\n",
                       (long long)(base + i), (long long)part[i]);
            fflush(stdout);
        }
        MPI_Barrier(comm);
    }

    if (rank == 0)
        printf("Edge cut: %lld\n", (long long)edgecut);

    /* 6. Finalize MPI */
    MPI_Comm_free(&comm);
    MPI_Finalize();
    return 0;
}

Compile the program

Use mpicc to compile and link against ParMETIS and METIS. If you installed the libraries to ~/local, pass the header and library paths explicitly.

mpicc -o example example.c \
    -I~/local/include \
    -L~/local/lib \
    -lparmetis -lmetis

If you built ParMETIS as a shared library, add ~/local/lib to LD_LIBRARY_PATH before running the program:

export LD_LIBRARY_PATH=~/local/lib:$LD_LIBRARY_PATH

Run the program

Launch the program with 4 MPI processes:

mpirun -np 4 ./example

You should see output similar to the following (process ordering may vary):

vertex 0 -> partition 3
vertex 1 -> partition 0
vertex 2 -> partition 0
vertex 3 -> partition 1
vertex 4 -> partition 1
vertex 5 -> partition 2
vertex 6 -> partition 2
vertex 7 -> partition 3
Edge cut: 4

For an 8-vertex ring with 4 partitions, the minimum edge cut is 4 (two edges per partition boundary). ParMETIS may produce different but equally valid partition assignments across runs.

Parameter reference

The full signature of ParMETIS_V3_PartKway is:

int ParMETIS_V3_PartKway(
    idx_t *vtxdist, idx_t *xadj, idx_t *adjncy, idx_t *vwgt,
    idx_t *adjwgt, idx_t *wgtflag, idx_t *numflag, idx_t *ncon, idx_t *nparts,
    real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t *edgecut, idx_t *part,
    MPI_Comm *comm);

vtxdist

idx_t *

required

Global vertex distribution array of length nprocs + 1. Every process must hold the same copy. Process p owns vertices vtxdist[p] to vtxdist[p+1] - 1.

xadj

idx_t *

required

Local CSR row-pointer array of length (local vertex count) + 1. xadj[i] is the start index in adjncy for local vertex i.

adjncy

idx_t *

required

Local adjacency list. Each entry is the global index of a neighboring vertex.

vwgt

idx_t *

Vertex weight array of length (local vertex count) * ncon. Pass NULL to assign uniform weight 1 to all vertices.

adjwgt

idx_t *

Edge weight array of length equal to the local number of edges. Pass NULL to assign uniform weight 1 to all edges.

wgtflag

idx_t *

required

Controls which weights are used: 0 = no weights, 1 = edge weights only, 2 = vertex weights only, 3 = both vertex and edge weights.

numflag

idx_t *

required

Indexing convention: 0 for C-style 0-based, 1 for Fortran-style 1-based.

ncon

idx_t *

required

Number of balance constraints. Use 1 for a standard single-constraint partitioning.

nparts

idx_t *

required

Number of partitions to create.

tpwgts

real_t *

required

Array of length nparts * ncon specifying the desired fractional weight for each partition and constraint. Values must sum to 1.0 per constraint. Pass equal fractions (1.0 / nparts) for a balanced partition.

ubvec

real_t *

required

Array of length ncon specifying the allowed load imbalance per constraint. A value of 1.05 allows up to 5% imbalance.

options

idx_t *

required

Options array of length 3. Set options[0] = 0 to use all defaults, or options[0] = 1 to enable options[1] (debug level) and options[2] (random seed).

edgecut

idx_t *

required

Output. Set by ParMETIS to the total number of edges that cross partition boundaries.

part

idx_t *

required

Output array of length equal to the local vertex count. After the call, part[i] is the partition assigned to local vertex i.

comm

MPI_Comm *

required

Pointer to the MPI communicator that includes all processes participating in the partitioning. All processes in the communicator must call ParMETIS_V3_PartKway collectively.

Next steps

API reference

Full reference documentation for ParMETIS_V3_PartKway and all other API functions.

Graph partitioning guide

Learn about geometry-assisted partitioning and multi-constraint partitioning for more complex scenarios.

Get Started

Concepts

Guides

Configuration

Quickstart: partition a graph in parallel with ParMETIS

The example graph

Steps

Parameter reference

Next steps

API reference

Graph partitioning guide

Build docs developers (and LLMs) love

Get Started

Concepts

Guides

Configuration

Documentation Index

​The example graph

​Steps

​Parameter reference

​Next steps

API reference

Graph partitioning guide

Build docs developers (and LLMs) love

The example graph

Steps

Parameter reference

Next steps