GridPACK targets high-performance computing platforms and is built on two parallel runtimes: MPI for inter-process messaging and the Global Arrays (GA) toolkit for one-sided remote memory access. This combination lets the framework balance ease of use — familiar MPI communicator semantics — with high-throughput data operations for tasks like ghost cell synchronization and distributed key-value stores. Application developers rarely interact with GA directly; instead they work through framework abstractions such as
Communicator, TaskManager, GlobalStore, and GlobalVector.
MPI + Global Arrays
MPI provides point-to-point and collective communication used by the solver libraries (PETSc), the partitioner, and explicit ghost exchange operations. Global Arrays extends MPI with shared-memory-style access to distributed arrays, enabling efficient one-sided reads and atomic operations without explicit message passing. Every GridPACK component that performs parallel operations takes a Communicator as input. The Communicator maintains both an MPI communicator handle (via boost::mpi) and an internal GA process group handle, so the same object drives both runtimes.
The Communicator class
Communicator wraps a boost::mpi::communicator and a GA process group. It lives in the gridpack::parallel namespace.
Communicator objects are passed by value throughout GridPACK; copies share the same underlying communicator via reference counting.
Every GridPACK object that uses Global Arrays — including networks, task managers, GlobalStore, and GlobalVector — must be constructed with a Communicator. The communicator ties the GA process group to the object's lifetime.
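A minimal usage sketch follows. The rank(), size(), and divide() calls reflect the Communicator interface described on this page; the header paths and the Environment initialization boilerplate are assumptions.

```cpp
#include <iostream>
#include <gridpack/environment/environment.hpp>
#include <gridpack/parallel/communicator.hpp>

int main(int argc, char **argv)
{
  // Environment brings up MPI and Global Arrays together
  gridpack::Environment env(argc, argv);

  // A default-constructed Communicator corresponds to the world group:
  // it holds both the MPI communicator and the matching GA process group
  gridpack::parallel::Communicator world;
  std::cout << "rank " << world.rank() << " of " << world.size() << std::endl;

  // Carve the world into sub-communicators of (at most) 4 processes each;
  // the returned Communicator drives an MPI sub-communicator and its own
  // GA process group
  gridpack::parallel::Communicator task_comm = world.divide(4);

  return 0;
}
```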
Ghost cell exchange pattern
The most frequent parallel operation in a GridPACK solve loop is updating ghost buses and branches so that locally-owned components can read up-to-date neighbor data. This is a push-then-read pattern:
Allocate exchange buffers
After network->partition(), call factory.setExchange(). This queries getXCBufSize() on each component and allocates a buffer of that size. The buffer address is then pushed into the component via setXCBuf(void *buf).
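As a rough illustration of what the factory queries on the component side, here is a hypothetical bus class; the ExchangeData struct and member names are illustrative only, and BaseBusComponent is assumed as the usual base class.

```cpp
#include <gridpack/component/base_component.hpp>

// Data this bus wants mirrored onto its ghost copies (illustrative struct)
struct ExchangeData {
  double voltage;
  double angle;
};

class MyBus : public gridpack::component::BaseBusComponent {
public:
  // Called by factory.setExchange() to size this component's slot
  // in the exchange buffer
  int getXCBufSize(void) { return sizeof(ExchangeData); }

  // Receives the address of the allocated slot; keep it for later writes
  void setXCBuf(void *buf) { p_xcbuf = static_cast<ExchangeData *>(buf); }

private:
  ExchangeData *p_xcbuf;
};
```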
Initialize update data structures
Call network->initBusUpdate() and/or network->initBranchUpdate() once. This builds the internal communication maps used for subsequent updates. Omit branch updates if no branch data needs synchronization.
Write state into exchange buffer
At the start of each iteration, each active component writes its current state values into the exchange buffer using the internal pointers set during setXCBuf.
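Putting the three steps together, a driver-side sketch of the loop might look like the following. The network and factory types are placeholders, and the updateBuses() call is assumed to be the network method that pushes the exchange buffers to ghost copies.

```cpp
// Sketch only: MyNetworkPtr and MyFactory stand in for the application's
// network (shared pointer) and factory types.
template <typename MyNetworkPtr, typename MyFactory>
void runSolveLoop(MyNetworkPtr network, MyFactory &factory, int max_iter)
{
  factory.setExchange();           // sizes and assigns exchange buffers
  network->initBusUpdate();        // build bus communication maps once
  // network->initBranchUpdate();  // only if branch data is exchanged

  for (int iter = 0; iter < max_iter; iter++) {
    // Each active component writes its current state into its exchange
    // buffer (e.g. via a helper defined by the application)
    // ...

    network->updateBuses();        // push buffers to ghost copies (assumed call)

    // Locally-owned components can now read up-to-date neighbor data;
    // perform the solve step and test for convergence here
  }
}
```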
The Shuffler for data redistribution
When GridPACK needs to move an arbitrary set of objects across processors — not just ghost updates, but wholesale redistribution — it uses the Shuffler utility (in gridpack/parallel/shuffler.hpp). The shuffler accepts a list of (object, destination-rank) pairs and performs an all-to-all redistribution in a single pass. This is used internally by the partitioner to move BusData and BranchData structs after the graph is partitioned.
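A rough sketch of how such a redistribution call could look; the template parameters and the functor-style call are assumptions inferred from the description above, not a verified signature from shuffler.hpp.

```cpp
#include <vector>
#include <gridpack/parallel/communicator.hpp>
#include <gridpack/parallel/shuffler.hpp>

// Assumed interface: move things[i] to rank dest[i] in one collective pass
void redistribute(gridpack::parallel::Communicator comm,
                  std::vector<double> &things,
                  std::vector<int> &dest)
{
  gridpack::parallel::Shuffler<double, int> shuffle(comm);
  shuffle(things, dest);   // things now holds the objects assigned to this rank
}
```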
The TaskManager pattern
Many power grid analyses involve processing a large collection of independent cases — contingency analysis is the canonical example, where each contingency is an independent network solve. The TaskManager class distributes these tasks across processors using a Global Arrays atomic counter, ensuring that each task is processed by exactly one processor with minimal synchronization overhead.
nextTask(Communicator &comm, int *next) ensures all processors in the sub-communicator receive the same task index atomically, as in the sketch below.
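Here is a sketch of the resulting task loop in the contingency-analysis style described above. The set() call, the bool return of nextTask(), and the divide() helper for building sub-communicators follow the interfaces discussed on this page; header paths are assumptions.

```cpp
#include <gridpack/parallel/communicator.hpp>
#include <gridpack/parallel/task_manager.hpp>

void runAllCases(gridpack::parallel::Communicator &world, int ntasks)
{
  // Groups of 4 processes each work on one case at a time
  gridpack::parallel::Communicator task_comm = world.divide(4);

  gridpack::parallel::TaskManager taskmgr;
  taskmgr.set(ntasks);                 // total number of independent cases

  int task_id;
  // Every process in task_comm receives the same task_id; nextTask()
  // returns false once the atomic counter passes ntasks
  while (taskmgr.nextTask(task_comm, &task_id)) {
    // ... solve case task_id using the processes in task_comm ...
  }
}
```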
TaskManager uses NGA_Read_inc — an atomic fetch-and-increment on a one-element GA array — so task distribution scales to large processor counts without a central scheduler bottleneck.
GlobalStore — distributed indexed vectors
GlobalStore<T> lets each processor contribute variable-length vectors indexed by an integer key. After all contributions are added, a single upload() call makes every vector accessible to any processor via one-sided GA reads.
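A sketch of the contribute, upload, and read cycle. The addVector() and getVector() names are assumptions consistent with the description above rather than a verified interface.

```cpp
#include <vector>
#include <gridpack/parallel/communicator.hpp>
#include <gridpack/parallel/global_store.hpp>

void storeResults(gridpack::parallel::Communicator comm)
{
  gridpack::parallel::GlobalStore<double> store(comm);

  // Each processor contributes a variable-length vector under an integer key
  std::vector<double> mine(10 + comm.rank(), 1.0 * comm.rank());
  store.addVector(comm.rank(), mine);    // assumed contribution call

  // After upload(), every key is readable from any processor
  store.upload();

  std::vector<double> theirs;
  store.getVector(0, theirs);            // assumed one-sided read of key 0
}
```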
GlobalVector — distributed linear arrays
GlobalVector<T> targets the case where each processor contributes a contiguous block to a global linear array, and all processors need to read back arbitrary ranges after upload, as in the sketch below.
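A sketch of the intended flow; addElements() and getData() are assumed method names based on the description here, not a confirmed interface.

```cpp
#include <vector>
#include <gridpack/parallel/communicator.hpp>
#include <gridpack/parallel/global_vector.hpp>

void gatherBlocks(gridpack::parallel::Communicator comm)
{
  gridpack::parallel::GlobalVector<double> gvec(comm);

  // Each rank contributes a contiguous block of 100 elements
  const int block = 100;
  int offset = block * comm.rank();
  std::vector<int> idx(block);
  std::vector<double> vals(block);
  for (int i = 0; i < block; i++) {
    idx[i] = offset + i;                       // global index of element
    vals[i] = static_cast<double>(offset + i); // value from this rank
  }
  gvec.addElements(idx, vals);   // assumed contribution call
  gvec.upload();

  // Any rank can now read back an arbitrary range with a one-sided get
  std::vector<int> want(50);
  for (int i = 0; i < 50; i++) want[i] = i;
  std::vector<double> out;
  gvec.getData(want, out);       // assumed read call
}
```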
GlobalVector is used, for example, to collect per-bus results from all processors into a single vector that can be exported to a file or fed into a subsequent analysis stage.
PROGRESS_RANKS: asynchronous communication
Global Arrays supports multiple communication runtimes. The default uses MPI two-sided messaging, which is straightforward but does not scale well beyond approximately a dozen processors. For large core counts, the progress ranks runtime delivers significantly higher throughput by dedicating one MPI process per SMP node to managing communication asynchronously. To use it, build the Global Arrays library with the --with-mpi-pr configure flag instead of --with-mpi-ts.
When GridPACK is built with USE_PROGRESS_RANKS=TRUE, it adjusts internal communicator construction so that the GA process group excludes the reserved communication ranks, keeping the difference transparent to application code.
Related pages
Framework overview: the four-layer architecture and Communicator usage patterns
Network model: how ghost buses and branches are created during partitioning
Bus and branch components: exchange buffer implementation in component classes