Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/KarypisLab/ParMETIS/llms.txt

Use this file to discover all available pages before exploring further.

Multi-physics simulations assign more than one type of work to each mesh vertex. For example, a vertex might carry both a memory footprint (from field storage) and a compute cost (from the numerical scheme). Balancing only one of these while ignoring the other can overload processes in unexpected ways. ParMETIS supports multi-constraint partitioning — setting ncon > 1 — so that each constraint is balanced independently across all partitions. The same PartKway, PartGeomKway, and AdaptiveRepart functions accept multi-constraint input; only the layout of the weight arrays changes.

What multi-constraint means

With ncon constraints, each vertex has ncon weights rather than one. The partitioner finds a k-way partition such that, for every constraint j, the total weight of constraint j assigned to each partition is close to the target fraction tpwgts[i*ncon+j].

Vertex weight layout

vwgt is laid out in row-major (interleaved) order: all constraints for vertex 0 first, then all constraints for vertex 1, and so on.
vwgt[i * ncon + j]  =  weight of constraint j for local vertex i
For ncon = 2 and 3 local vertices:
/* vertex 0: memory=4, compute=2
   vertex 1: memory=1, compute=5
   vertex 2: memory=3, compute=3 */
idx_t vwgt[6] = {4, 2,
                 1, 5,
                 3, 3};

Target weight layout (tpwgts)

tpwgts has nparts * ncon entries. tpwgts[i*ncon+j] is the target fraction of total constraint-j weight that should end up in partition i. For each constraint, the values across all partitions must sum to 1.0.
/* 3 partitions, 2 constraints, uniform targets */
real_t tpwgts[6];
for (int i = 0; i < 3; i++) {
    tpwgts[i*2 + 0] = 1.0f / 3.0f;   /* constraint 0 */
    tpwgts[i*2 + 1] = 1.0f / 3.0f;   /* constraint 1 */
}

Imbalance tolerance per constraint (ubvec)

ubvec has ncon entries. Each entry is the maximum allowed imbalance ratio for the corresponding constraint.
real_t ubvec[2] = {1.05f, 1.05f};  /* 5% tolerance per constraint */
You can use different tolerances per constraint if one resource is more critical than another.

Complete example with ncon=2

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <parmetis.h>

int main(int argc, char *argv[])
{
    MPI_Comm comm;
    int npes, mype;

    MPI_Init(&argc, &argv);
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    MPI_Comm_size(comm, &npes);
    MPI_Comm_rank(comm, &mype);

    /* 8-vertex ring, 4 vertices per process, 2 constraints */
    idx_t vtxdist[3]   = {0, 4, 8};
    idx_t nvtxs_local  = vtxdist[mype+1] - vtxdist[mype];

    idx_t xadj_p0[]   = {0, 2, 4, 6, 8};
    idx_t adjncy_p0[] = {7, 1,  0, 2,  1, 3,  2, 4};
    idx_t xadj_p1[]   = {0, 2, 4, 6, 8};
    idx_t adjncy_p1[] = {3, 5,  4, 6,  5, 7,  6, 0};

    idx_t *xadj   = (mype == 0) ? xadj_p0   : xadj_p1;
    idx_t *adjncy = (mype == 0) ? adjncy_p0 : adjncy_p1;

    /*
     * Two constraints per vertex:
     *   constraint 0 = memory footprint
     *   constraint 1 = compute cost
     *
     * vwgt[i*2+0] = memory weight for vertex i
     * vwgt[i*2+1] = compute weight for vertex i
     */
    idx_t ncon = 2;
    idx_t *vwgt = (idx_t *)malloc(nvtxs_local * ncon * sizeof(idx_t));

    for (idx_t i = 0; i < nvtxs_local; i++) {
        vwgt[i * ncon + 0] = 1 + i;             /* memory: varies per vertex */
        vwgt[i * ncon + 1] = nvtxs_local - i;   /* compute: inverse pattern  */
    }

    /* Uniform edge weights */
    idx_t *adjwgt = (idx_t *)malloc(xadj[nvtxs_local] * sizeof(idx_t));
    for (idx_t i = 0; i < xadj[nvtxs_local]; i++)
        adjwgt[i] = 1;

    idx_t  wgtflag = 3, numflag = 0, nparts = 2;

    /* tpwgts: nparts * ncon = 4 entries */
    real_t tpwgts[4];
    for (int i = 0; i < nparts; i++) {
        tpwgts[i * ncon + 0] = 1.0f / nparts;
        tpwgts[i * ncon + 1] = 1.0f / nparts;
    }

    /* ubvec: one tolerance per constraint */
    real_t ubvec[2] = {1.05f, 1.05f};

    idx_t options[3] = {0, 0, 0};
    idx_t edgecut;
    idx_t *part = (idx_t *)malloc(nvtxs_local * sizeof(idx_t));

    ParMETIS_V3_PartKway(vtxdist, xadj, adjncy, vwgt, adjwgt,
                         &wgtflag, &numflag, &ncon, &nparts,
                         tpwgts, ubvec, options, &edgecut, part, &comm);

    if (mype == 0)
        printf("Edge cut: %d\n", (int)edgecut);

    for (int p = 0; p < npes; p++) {
        if (mype == p) {
            for (idx_t i = 0; i < nvtxs_local; i++)
                printf("process %d: vertex %d (mem=%d, cmp=%d) -> partition %d\n",
                       mype,
                       (int)(vtxdist[mype] + i),
                       (int)vwgt[i*ncon+0],
                       (int)vwgt[i*ncon+1],
                       (int)part[i]);
        }
        MPI_Barrier(comm);
    }

    free(vwgt); free(adjwgt); free(part);
    MPI_Comm_free(&comm);
    MPI_Finalize();
    return 0;
}

Compilation

mpicc -o multiconstraint multiconstraint.c -lparmetis -lmetis
Diffusion-based refinement is only available when ncon <= 2. This is defined by MAX_NCON_FOR_DIFFUSION = 2 in the ParMETIS source. For problems with 3 or more constraints, the partitioner falls back to a different refinement strategy.
Ensure that for each constraint j, the sum of tpwgts[i*ncon+j] over all partitions i equals exactly 1.0. Floating-point rounding can cause issues — if needed, set the last entry explicitly to 1.0 - sum_of_previous rather than computing it independently.

Build docs developers (and LLMs) love