ExecutionPlan to disk and reloading it when the graph is unchanged.
The cache stores the compiled ExecutionPlan (the buffer layout and GPU dispatch sequence), not the model weights. You still need to call session.set_parameter(...) to load weights after building from cache.
Using build_session_cached
Replace build_session with build_session_cached and pass a path for the cache file:
build_session_cached runs the full pipeline (autodiff → optimize → compile) and saves the resulting ExecutionPlan to model.plan.ron. On subsequent runs, it reads the file, checks the graph hash, and — if the graph is unchanged — skips straight to Session::new. The compile time reported in the benchmarks table is 0 s because Meganeura always caches after the first build.
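The load-or-compile flow described above can be sketched in plain Rust. This is a minimal illustration using only the standard library, not Meganeura's implementation: `Plan`, `compile`, and `build_cached` are hypothetical names, and the one-value-per-line file layout is an assumption made for brevity (the real cache is a RON file).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a compiled plan (not Meganeura's ExecutionPlan).
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    dispatches: Vec<String>,
}

/// Stand-in for the expensive autodiff -> optimize -> compile pipeline.
fn compile(ops: &[&str]) -> Plan {
    Plan {
        dispatches: ops.iter().map(|op| format!("dispatch:{}", op)).collect(),
    }
}

/// Returns the plan and a flag telling whether it was served from the cache.
/// Assumed file layout: first line is the graph hash, then one dispatch per line.
fn build_cached(ops: &[&str], graph_hash: u64, path: &Path) -> io::Result<(Plan, bool)> {
    if let Ok(text) = fs::read_to_string(path) {
        let mut lines = text.lines();
        let stored = lines.next().and_then(|h| h.parse::<u64>().ok());
        if stored == Some(graph_hash) {
            // Cache hit: the graph is unchanged, so compilation is skipped.
            let plan = Plan {
                dispatches: lines.map(str::to_owned).collect(),
            };
            return Ok((plan, true));
        }
    }
    // Cache miss or stale hash: compile, then overwrite the cache file.
    let plan = compile(ops);
    let mut out = graph_hash.to_string();
    for d in &plan.dispatches {
        out.push('\n');
        out.push_str(d);
    }
    fs::write(path, out)?;
    Ok((plan, false))
}
```

Calling `build_cached` twice with the same hash compiles once and then reads the file; calling it with a different hash compiles again and replaces the file.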
How cache invalidation works
The cache stores a fingerprint of your forward graph alongside the plan. The fingerprint hashes:
- The number of nodes in the graph
- The operation type at each node (matmul, relu, parameter, etc.)
- The input edges of each node
- The tensor shape at each node
- The name of every Parameter and Input node
- The graph output list
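A fingerprint over the properties listed above might look like the following sketch. The `Node` struct and `fingerprint` function are hypothetical and do not mirror Meganeura's internal types. The sketch also uses the standard library's `DefaultHasher`, whose algorithm is not guaranteed to stay the same across Rust releases, so a real on-disk cache would likely need a hash with a stable definition.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical graph node; field names are illustrative only.
#[derive(Hash)]
struct Node {
    op: String,           // operation type, e.g. "matmul" or "relu"
    inputs: Vec<usize>,   // indices of the nodes feeding this one
    shape: Vec<usize>,    // tensor shape produced by this node
    name: Option<String>, // set for parameter and input nodes only
}

/// Hashes node count, every node's fields, and the output list into one u64.
fn fingerprint(nodes: &[Node], outputs: &[usize]) -> u64 {
    let mut hasher = DefaultHasher::new();
    nodes.len().hash(&mut hasher);
    for node in nodes {
        node.hash(&mut hasher);
    }
    outputs.hash(&mut hasher);
    hasher.finish()
}
```

Two structurally identical graphs produce the same value within one program run, while a change to any hashed field (for example a different shape) produces a different value with overwhelming probability.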
If any of the fingerprinted properties changes, build_session_cached recompiles and overwrites the cache. A log message is emitted when this happens.
Cache file format
The cache is serialized as a RON (Rusty Object Notation) text file. RON is a human-readable format structurally similar to Rust expressions. The file contains a CachedPlan struct with two fields: the graph hash and the ExecutionPlan.
If the cache file is missing (for example, because you deleted it), build_session_cached will simply recompile.
Lower-level cache functions
If you need finer control, you can call the cache functions directly from the meganeura::cache module:
save_plan computes the graph hash and writes it alongside the ExecutionPlan in RON format. load_plan returns Ok(None) for a missing file (not an error), and Err(...) only for a corrupt or unreadable file.
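The return convention described above (a missing file is `Ok(None)`, a corrupt or unreadable file is `Err`) can be sketched with the standard library alone. `Entry`, `save_entry`, and `load_entry` are hypothetical names chosen for this illustration; they are not the signatures of `save_plan` and `load_plan`, and the two-part text layout is an assumption standing in for RON.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Hypothetical cache entry: a graph hash plus an opaque plan payload.
#[derive(Debug, PartialEq)]
struct Entry {
    hash: u64,
    plan: String,
}

/// Writes the hash on the first line and the plan payload after it.
fn save_entry(path: &Path, entry: &Entry) -> io::Result<()> {
    fs::write(path, format!("{}\n{}", entry.hash, entry.plan))
}

/// Ok(None) for a missing file, Err for an unreadable or malformed one.
fn load_entry(path: &Path) -> io::Result<Option<Entry>> {
    let text = match fs::read_to_string(path) {
        Ok(text) => text,
        // A missing cache is an expected state, not a failure.
        Err(e) if e.kind() == ErrorKind::NotFound => return Ok(None),
        Err(e) => return Err(e),
    };
    let (hash, plan) = text
        .split_once('\n')
        .ok_or_else(|| io::Error::new(ErrorKind::InvalidData, "missing plan body"))?;
    let hash = hash
        .parse::<u64>()
        .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))?;
    Ok(Some(Entry {
        hash,
        plan: plan.to_owned(),
    }))
}
```

Separating "not found" from other failures lets a caller treat the first run as a normal cache miss while still surfacing a damaged file.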
What is and is not cached
| Cached | Not cached |
|---|---|
| Buffer layout (sizes, dtypes) | Model weights (set_parameter data) |
| GPU dispatch sequence | Input data |
| Kernel indices and workgroup sizes | Optimizer state (Adam moments) |
| Fused op selections | GPU device objects (recreated in Session::new) |