Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/KarypisLab/ParMETIS/llms.txt

Use this file to discover all available pages before exploring further.

This guide walks you through writing a minimal C program that distributes a graph across MPI processes and calls ParMETIS_V3_PartKway to partition it. By the end you will have a working program you can compile and run, and an understanding of the key data structures ParMETIS expects.
Complete the installation guide before starting. You need mpicc, libparmetis, and libmetis installed and on your compiler’s search path.

The example graph

The example uses a simple 8-vertex ring graph (each vertex connected to its two neighbors), distributed evenly across 4 MPI processes—2 vertices per process. ParMETIS will assign each vertex to one of 4 partitions.
Vertices:  0 - 1 - 2 - 3 - 4 - 5 - 6 - 7 - (back to 0)
Processes: [0: 0,1] [1: 2,3] [2: 4,5] [3: 6,7]

Steps

1

Understand the key data structures

ParMETIS represents the distributed graph using three arrays that every process must populate before calling any partitioning function.vtxdist — a global array of length nprocs + 1 that describes how vertices are distributed. Process p owns vertices vtxdist[p] through vtxdist[p+1] - 1. Every process holds an identical copy.xadj — the local CSR row-pointer array. For a process owning n vertices, xadj has length n + 1. xadj[i] is the index into adjncy where the neighbors of local vertex i begin.adjncy — the local adjacency list. Each entry is the global index of a neighboring vertex.
All vertex indices in vtxdist and adjncy use 0-based (C) numbering by default. Set numflag = 1 to use 1-based (Fortran) numbering.
2

Write the example program

Create a file named example.c with the following content:
example.c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <parmetis.h>

int main(int argc, char *argv[]) {
    MPI_Comm comm;
    int nprocs, rank;

    /* 1. Initialize MPI */
    MPI_Init(&argc, &argv);
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    MPI_Comm_size(comm, &nprocs);
    MPI_Comm_rank(comm, &rank);

    if (nprocs != 4) {
        if (rank == 0)
            fprintf(stderr, "This example requires exactly 4 processes.\n");
        MPI_Finalize();
        return 1;
    }

    /*
     * 2. Set up vtxdist.
     *    8 vertices distributed 2 per process: [0,2), [2,4), [4,6), [6,8)
     */
    idx_t vtxdist[5] = {0, 2, 4, 6, 8};

    /*
     * 3. Set up local adjacency (xadj + adjncy) for a ring graph.
     *    Each process owns 2 vertices. Each vertex has 2 neighbors.
     *
     *    Process 0: vertex 0 -> {7, 1},  vertex 1 -> {0, 2}
     *    Process 1: vertex 2 -> {1, 3},  vertex 3 -> {2, 4}
     *    Process 2: vertex 4 -> {3, 5},  vertex 5 -> {4, 6}
     *    Process 3: vertex 6 -> {5, 7},  vertex 7 -> {6, 0}
     */
    idx_t xadj[3]    = {0, 2, 4};   /* 2 local vertices, each with 2 edges */
    idx_t adjncy_p0[4] = {7, 1, 0, 2};
    idx_t adjncy_p1[4] = {1, 3, 2, 4};
    idx_t adjncy_p2[4] = {3, 5, 4, 6};
    idx_t adjncy_p3[4] = {5, 7, 6, 0};

    idx_t *adjncy;
    switch (rank) {
        case 0: adjncy = adjncy_p0; break;
        case 1: adjncy = adjncy_p1; break;
        case 2: adjncy = adjncy_p2; break;
        default: adjncy = adjncy_p3; break;
    }

    /*
     * 4. Call ParMETIS_V3_PartKway.
     *
     *    vwgt    = NULL  (no vertex weights)
     *    adjwgt  = NULL  (no edge weights)
     *    wgtflag = 0     (no weights)
     *    numflag = 0     (0-based indexing)
     *    ncon    = 1     (one balance constraint)
     *    nparts  = 4     (four partitions)
     *    tpwgts  = equal fractions (1/nparts each)
     *    ubvec   = 1.05  (5% imbalance tolerance)
     *    options = {0}   (default options)
     */
    idx_t  wgtflag = 0;
    idx_t  numflag = 0;
    idx_t  ncon    = 1;
    idx_t  nparts  = 4;
    idx_t  edgecut;
    idx_t  part[2];           /* one partition label per local vertex */

    real_t tpwgts[4];         /* target partition weights */
    real_t ubvec = 1.05f;
    idx_t  options[3] = {0, 0, 0};

    for (int i = 0; i < nparts; i++)
        tpwgts[i] = 1.0f / (real_t)nparts;

    ParMETIS_V3_PartKway(
        vtxdist, xadj, adjncy,
        NULL,       /* vwgt   */
        NULL,       /* adjwgt */
        &wgtflag, &numflag, &ncon, &nparts,
        tpwgts, &ubvec, options,
        &edgecut, part, &comm
    );

    /* 5. Print the resulting partition assignment */
    for (int p = 0; p < nprocs; p++) {
        if (rank == p) {
            idx_t base = vtxdist[rank];
            for (int i = 0; i < 2; i++)
                printf("vertex %lld -> partition %lld\n",
                       (long long)(base + i), (long long)part[i]);
            fflush(stdout);
        }
        MPI_Barrier(comm);
    }

    if (rank == 0)
        printf("Edge cut: %lld\n", (long long)edgecut);

    /* 6. Finalize MPI */
    MPI_Comm_free(&comm);
    MPI_Finalize();
    return 0;
}
3

Compile the program

Use mpicc to compile and link against ParMETIS and METIS. If you installed the libraries to ~/local, pass the header and library paths explicitly.
mpicc -o example example.c \
    -I~/local/include \
    -L~/local/lib \
    -lparmetis -lmetis
If you built ParMETIS as a shared library, add ~/local/lib to LD_LIBRARY_PATH before running the program:
export LD_LIBRARY_PATH=~/local/lib:$LD_LIBRARY_PATH
4

Run the program

Launch the program with 4 MPI processes:
mpirun -np 4 ./example
You should see output similar to the following (process ordering may vary):
vertex 0 -> partition 3
vertex 1 -> partition 0
vertex 2 -> partition 0
vertex 3 -> partition 1
vertex 4 -> partition 1
vertex 5 -> partition 2
vertex 6 -> partition 2
vertex 7 -> partition 3
Edge cut: 4
For an 8-vertex ring with 4 partitions, the minimum edge cut is 4 (two edges per partition boundary). ParMETIS may produce different but equally valid partition assignments across runs.

Parameter reference

The full signature of ParMETIS_V3_PartKway is:
int ParMETIS_V3_PartKway(
    idx_t *vtxdist, idx_t *xadj, idx_t *adjncy, idx_t *vwgt,
    idx_t *adjwgt, idx_t *wgtflag, idx_t *numflag, idx_t *ncon, idx_t *nparts,
    real_t *tpwgts, real_t *ubvec, idx_t *options, idx_t *edgecut, idx_t *part,
    MPI_Comm *comm);
vtxdist
idx_t *
required
Global vertex distribution array of length nprocs + 1. Every process must hold the same copy. Process p owns vertices vtxdist[p] to vtxdist[p+1] - 1.
xadj
idx_t *
required
Local CSR row-pointer array of length (local vertex count) + 1. xadj[i] is the start index in adjncy for local vertex i.
adjncy
idx_t *
required
Local adjacency list. Each entry is the global index of a neighboring vertex.
vwgt
idx_t *
Vertex weight array of length (local vertex count) * ncon. Pass NULL to assign uniform weight 1 to all vertices.
adjwgt
idx_t *
Edge weight array of length equal to the local number of edges. Pass NULL to assign uniform weight 1 to all edges.
wgtflag
idx_t *
required
Controls which weights are used: 0 = no weights, 1 = edge weights only, 2 = vertex weights only, 3 = both vertex and edge weights.
numflag
idx_t *
required
Indexing convention: 0 for C-style 0-based, 1 for Fortran-style 1-based.
ncon
idx_t *
required
Number of balance constraints. Use 1 for a standard single-constraint partitioning.
nparts
idx_t *
required
Number of partitions to create.
tpwgts
real_t *
required
Array of length nparts * ncon specifying the desired fractional weight for each partition and constraint. Values must sum to 1.0 per constraint. Pass equal fractions (1.0 / nparts) for a balanced partition.
ubvec
real_t *
required
Array of length ncon specifying the allowed load imbalance per constraint. A value of 1.05 allows up to 5% imbalance.
options
idx_t *
required
Options array of length 3. Set options[0] = 0 to use all defaults, or options[0] = 1 to enable options[1] (debug level) and options[2] (random seed).
edgecut
idx_t *
required
Output. Set by ParMETIS to the total number of edges that cross partition boundaries.
part
idx_t *
required
Output array of length equal to the local vertex count. After the call, part[i] is the partition assigned to local vertex i.
comm
MPI_Comm *
required
Pointer to the MPI communicator that includes all processes participating in the partitioning. All processes in the communicator must call ParMETIS_V3_PartKway collectively.

Next steps

API reference

Full reference documentation for ParMETIS_V3_PartKway and all other API functions.

Graph partitioning guide

Learn about geometry-assisted partitioning and multi-constraint partitioning for more complex scenarios.

Build docs developers (and LLMs) love