Pareto frontier analysis helps you identify the set of non-dominated solutions - model variants where you cannot improve one metric (like accuracy) without degrading another (like latency or energy). This is essential for selecting the best deployment configuration for your edge device.
What is a Pareto Frontier?
A Pareto frontier is the set of points that no other point dominates: no alternative is at least as good on every metric and strictly better on at least one. In edge AI optimization, a model is Pareto optimal if there is no other model that:
- Has lower latency (or energy) AND
- Has equal or better accuracy

For example:
- Model A: 10ms latency, 85% accuracy
- Model B: 15ms latency, 90% accuracy
- Model C: 12ms latency, 87% accuracy

All three are Pareto optimal: each faster model is less accurate, so none of them dominates another.
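The dominance rule can be checked directly on the three example models (A, B, and C). The snippet below is a minimal sketch in plain Python; the `Variant` type and the `dominates` helper are illustrative names, not part of the project.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    # Hypothetical container used only for this illustration.
    name: str
    latency_ms: float
    accuracy: float


def dominates(a: Variant, b: Variant) -> bool:
    """True if `a` has lower latency than `b` and equal or better accuracy."""
    return a.latency_ms < b.latency_ms and a.accuracy >= b.accuracy


variants = [
    Variant("A", 10.0, 0.85),
    Variant("B", 15.0, 0.90),
    Variant("C", 12.0, 0.87),
]

# A variant is Pareto optimal when no other variant dominates it.
pareto_optimal = [
    v.name
    for v in variants
    if not any(dominates(other, v) for other in variants if other is not v)
]
print(pareto_optimal)
```

A made-up fourth model at 13ms and 86% accuracy would be dominated by Model C, which is both faster and more accurate, so it would not appear in the list.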
The pareto_frontier Function
The pareto_frontier function is defined in src/edge_opt/experiments.py:91-99.
Function Signature
df: pd.DataFrame
The results DataFrame from run_sweep(), containing all model variants with their metrics.

x_col: str
The metric to minimize (x-axis). Common choices:
- "latency_ms" for latency-accuracy frontiers
- "energy_proxy_j" for energy-accuracy frontiers
Algorithm Walkthrough
The function uses a greedy sweep algorithm to identify Pareto optimal points:

Step 1: Filter to Accepted Variants (Line 92)
Why filter to accepted variants?
The accepted column indicates whether a model variant satisfies the memory budget constraint (see memory-budgets.mdx). Only variants that fit on the target device are considered.

Step 2: Sort by x_col then Accuracy
- Primary sort: ascending by x_col (lower latency/energy first)
- Secondary sort: descending by accuracy (higher accuracy first)

| Index | latency_ms | accuracy | Reasoning |
|---|---|---|---|
| 0 | 8.2 | 0.89 | Lowest latency, high accuracy |
| 1 | 8.2 | 0.85 | Same latency, lower accuracy |
| 2 | 10.5 | 0.91 | Higher latency, highest accuracy |
| 3 | 12.1 | 0.88 | Even higher latency, lower accuracy |
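Steps 1 and 2 can be sketched with pandas as follows. This is an illustration under assumptions, not the project's code: the rows mirror the table above, the rejected variant is made up to show the filter at work, and the `reset_index` call is only there so the printed index matches the table.

```python
import pandas as pd

# Illustrative rows; the last one is a made-up variant that exceeds the budget.
df = pd.DataFrame(
    {
        "latency_ms": [12.1, 8.2, 10.5, 8.2, 6.0],
        "accuracy": [0.88, 0.85, 0.91, 0.89, 0.93],
        "accepted": [True, True, True, True, False],
    }
)

x_col = "latency_ms"

# Step 1: keep only variants that satisfy the memory budget.
accepted = df[df["accepted"]]

# Step 2: lowest x_col first; ties broken by highest accuracy first.
ordered = accepted.sort_values(
    [x_col, "accuracy"], ascending=[True, False]
).reset_index(drop=True)

print(ordered[[x_col, "accuracy"]])
```

Note that the rejected variant never reaches the sort, even though it is both the fastest and the most accurate row in the input.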
Step 3: Greedy Frontier Selection (Lines 93-98)
Initialize best_accuracy to -1.0
Start with an impossibly low accuracy so the first row is always selected.

Check for improvement
If row["accuracy"] > best_accuracy, this row offers better accuracy than every previous row. Because those rows all have an equal or lower x_col value, none of them dominates it, so the row joins the frontier and best_accuracy is updated.

Visual Example
Using the table from Step 2:

| Index | latency_ms | accuracy | Selected? | Reason |
|---|---|---|---|---|
| 0 | 8.2 | 0.89 | ✅ Yes | First point, accuracy = 0.89 > -1.0 |
| 1 | 8.2 | 0.85 | ❌ No | Accuracy 0.85 < 0.89 (dominated by index 0) |
| 2 | 10.5 | 0.91 | ✅ Yes | Accuracy 0.91 > 0.89 (new best) |
| 3 | 12.1 | 0.88 | ❌ No | Accuracy 0.88 < 0.91 (dominated by index 2) |
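Putting the three steps together, the whole procedure might look like the sketch below. It is a re-implementation written from the description on this page, not a copy of the code in src/edge_opt/experiments.py; the name `pareto_frontier_sketch` is chosen to make that distinction explicit.

```python
import pandas as pd


def pareto_frontier_sketch(df: pd.DataFrame, x_col: str) -> pd.DataFrame:
    """Illustrative greedy frontier selection following the steps above."""
    # Step 1: only variants within the memory budget are candidates.
    accepted = df[df["accepted"]]
    # Step 2: ascending x_col, then descending accuracy.
    ordered = accepted.sort_values([x_col, "accuracy"], ascending=[True, False])

    # Step 3: keep a row only if it beats the best accuracy seen so far.
    best_accuracy = -1.0
    keep = []
    for accuracy in ordered["accuracy"]:
        is_better = accuracy > best_accuracy
        keep.append(bool(is_better))
        if is_better:
            best_accuracy = accuracy

    mask = pd.Series(keep, index=ordered.index, dtype=bool)
    return ordered[mask]


# Same rows as the walkthrough table, deliberately shuffled.
results = pd.DataFrame(
    {
        "latency_ms": [12.1, 8.2, 10.5, 8.2],
        "accuracy": [0.88, 0.85, 0.91, 0.89],
        "accepted": [True, True, True, True],
    }
)
frontier = pareto_frontier_sketch(results, "latency_ms")
print(frontier[["latency_ms", "accuracy"]])
```

Because the rows are already sorted by x_col, a single pass with one running maximum is enough; no pairwise comparison between variants is needed.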
Computing Multiple Frontiers
In practice, you typically compute two frontiers from the same sweep results: one for latency and one for energy.

The latency and energy frontiers may contain different models. A model that is Pareto optimal for latency may not be optimal for energy, and vice versa.
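As a rough illustration of how the two frontiers can diverge, the sketch below runs a compact stand-in for the frontier function over a made-up results table. The `frontier_sketch` helper and all numbers are assumptions for demonstration; in the project you would call pareto_frontier on the DataFrame returned by run_sweep().

```python
import pandas as pd


def frontier_sketch(df: pd.DataFrame, x_col: str) -> pd.DataFrame:
    """Compact stand-in: keep rows whose accuracy beats every earlier row."""
    ordered = df[df["accepted"]].sort_values(
        [x_col, "accuracy"], ascending=[True, False]
    )
    # Best accuracy among strictly earlier rows (-1.0 before the first row).
    previous_best = ordered["accuracy"].cummax().shift(fill_value=-1.0)
    return ordered[ordered["accuracy"] > previous_best]


# Made-up sweep results; v0 is fast but comparatively energy hungry.
results = pd.DataFrame(
    {
        "latency_ms": [5.0, 8.0, 12.0, 18.0],
        "energy_proxy_j": [0.30, 0.20, 0.45, 0.60],
        "accuracy": [0.82, 0.86, 0.89, 0.91],
        "accepted": [True, True, True, True],
    },
    index=["v0", "v1", "v2", "v3"],
)

latency_frontier = frontier_sketch(results, "latency_ms")
energy_frontier = frontier_sketch(results, "energy_proxy_j")

print("latency frontier:", list(latency_frontier.index))
print("energy frontier: ", list(energy_frontier.index))
```

Here v0 is on the latency frontier because nothing is faster, but it drops off the energy frontier because v1 uses less energy and is more accurate.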
Interpreting the Results
Each row in the returned frontier DataFrame represents a deployment option:

Example Output
| pruning_level | precision | latency_ms | accuracy | memory_mb |
|---|---|---|---|---|
| 0.8 | int8 | 5.2 | 0.8245 | 4.1 |
| 0.6 | int8 | 8.7 | 0.8678 | 7.8 |
| 0.4 | fp16 | 12.3 | 0.8921 | 11.2 |
| 0.2 | fp32 | 18.9 | 0.9103 | 14.7 |
Fastest Option (Row 1)
Configuration: 80% pruning + int8 quantization
Tradeoffs:
- Latency: 5.2ms (fastest)
- Accuracy: 82.45% (lowest)
- Memory: 4.1MB (smallest)
Balanced Option (Rows 2-3)
Configuration: 40-60% pruning + int8 or fp16 precision
Tradeoffs:
- Latency: 8-12ms (moderate)
- Accuracy: 86-89% (good)
- Memory: 7-11MB (moderate)
Most Accurate Option (Row 4)
Configuration: 20% pruning + fp32
Tradeoffs:
- Latency: 18.9ms (slowest)
- Accuracy: 91.03% (highest)
- Memory: 14.7MB (largest)
Visualization with save_plots
The save_plots function (defined in src/edge_opt/experiments.py:102-143) visualizes the Pareto frontiers:
1. Accuracy vs Latency (Lines 107-118)
- Blue dots: Accepted variants (within memory budget)
- Gray X’s: Rejected variants (exceed memory budget)
- Red line: Pareto frontier connecting optimal points
2. Accuracy vs Energy (Lines 120-131)
3. Accuracy vs Memory (Lines 133-143)
Plots are saved as PNG files with 180 DPI resolution in the specified output directory.
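For readers who want to reproduce a similar figure outside the project, here is a minimal matplotlib sketch of the accuracy-vs-latency plot using the styling described above. It is not the save_plots implementation: the `plot_accuracy_vs` helper, the sample data, the hand-picked frontier rows, and the output file name are all assumptions.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_accuracy_vs(df, frontier, x_col, out_path):
    """Scatter accepted/rejected variants and overlay the frontier line."""
    accepted = df[df["accepted"]]
    rejected = df[~df["accepted"]]
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.scatter(accepted[x_col], accepted["accuracy"], color="tab:blue", label="accepted")
    ax.scatter(rejected[x_col], rejected["accuracy"], color="gray", marker="x", label="rejected")
    ax.plot(frontier[x_col], frontier["accuracy"], color="red", label="Pareto frontier")
    ax.set_xlabel(x_col)
    ax.set_ylabel("accuracy")
    ax.legend()
    fig.savefig(out_path, dpi=180)  # 180 DPI, matching the description above
    plt.close(fig)


# Made-up data; the last row is a rejected variant.
df = pd.DataFrame(
    {
        "latency_ms": [8.2, 8.2, 10.5, 12.1, 6.0],
        "accuracy": [0.89, 0.85, 0.91, 0.88, 0.93],
        "accepted": [True, True, True, True, False],
    }
)
frontier = df.iloc[[0, 2]]  # frontier rows picked by hand for this illustration

out_path = Path(tempfile.mkdtemp()) / "accuracy_vs_latency.png"
plot_accuracy_vs(df, frontier, "latency_ms", out_path)
print(out_path)
```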
Practical Selection Strategy
Here’s how to choose a model from the Pareto frontier:

Define your primary constraint
Determine your hard requirement:
- Latency budget: “Must be < 20ms”
- Energy budget: “Must be < 0.5J per inference”
- Accuracy requirement: “Must be > 85%”
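One possible way to apply hard requirements like these is to filter the frontier and then pick among the survivors. The sketch below uses the rows from the Example Output table and the example thresholds from the list; the final choice between "most accurate" and "fastest" is an assumption about your priorities, not a rule stated by the project.

```python
import pandas as pd

# Rows copied from the Example Output table.
frontier = pd.DataFrame(
    {
        "pruning_level": [0.8, 0.6, 0.4, 0.2],
        "precision": ["int8", "int8", "fp16", "fp32"],
        "latency_ms": [5.2, 8.7, 12.3, 18.9],
        "accuracy": [0.8245, 0.8678, 0.8921, 0.9103],
        "memory_mb": [4.1, 7.8, 11.2, 14.7],
    }
)

max_latency_ms = 20.0  # "Must be < 20ms"
min_accuracy = 0.85    # "Must be > 85%"

# Keep only frontier points that satisfy every hard requirement.
feasible = frontier[
    (frontier["latency_ms"] < max_latency_ms)
    & (frontier["accuracy"] > min_accuracy)
]

# Two common tie-breakers among the feasible points.
most_accurate = feasible.loc[feasible["accuracy"].idxmax()]
fastest = feasible.loc[feasible["latency_ms"].idxmin()]

print(feasible)
print("most accurate:", most_accurate["precision"], most_accurate["latency_ms"])
print("fastest:", fastest["precision"], fastest["latency_ms"])
```

With these thresholds the 80% pruning variant is excluded for missing the accuracy requirement, leaving three candidates.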
Advanced Use Cases
Multi-Objective Optimization
Compute frontiers for multiple metrics and find the intersection:
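A hedged sketch of the intersection idea: compute one frontier per metric, then keep the variants that appear on both. The `frontier_sketch` helper is a compact stand-in for pareto_frontier, and the data is made up for illustration.

```python
import pandas as pd


def frontier_sketch(df: pd.DataFrame, x_col: str) -> pd.DataFrame:
    """Compact stand-in: keep rows whose accuracy beats every earlier row."""
    ordered = df[df["accepted"]].sort_values(
        [x_col, "accuracy"], ascending=[True, False]
    )
    previous_best = ordered["accuracy"].cummax().shift(fill_value=-1.0)
    return ordered[ordered["accuracy"] > previous_best]


# Made-up sweep results indexed by a variant label.
results = pd.DataFrame(
    {
        "latency_ms": [5.0, 8.0, 12.0, 18.0],
        "energy_proxy_j": [0.30, 0.20, 0.45, 0.60],
        "accuracy": [0.82, 0.86, 0.89, 0.91],
        "accepted": [True, True, True, True],
    },
    index=["v0", "v1", "v2", "v3"],
)

latency_frontier = frontier_sketch(results, "latency_ms")
energy_frontier = frontier_sketch(results, "energy_proxy_j")

# Variants that are Pareto optimal for both latency and energy.
common = latency_frontier.index.intersection(energy_frontier.index)
on_both = results.loc[common]
print(sorted(on_both.index))
```

The intersection can be empty in real sweeps; in that case you would need to prioritize one metric or relax a constraint.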
Per-Device Frontiers
Compute separate frontiers for different deployment scenarios:
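One way to sketch this is to recompute the accepted flag per device and build a frontier for each. Everything device-specific here is hypothetical: the device names, the memory budgets, and the shortcut of deriving `accepted` from `memory_mb` (in the project that status comes from memory_violations()). The rows reuse the Example Output table.

```python
import pandas as pd


def frontier_sketch(df: pd.DataFrame, x_col: str) -> pd.DataFrame:
    """Compact stand-in: keep rows whose accuracy beats every earlier row."""
    ordered = df[df["accepted"]].sort_values(
        [x_col, "accuracy"], ascending=[True, False]
    )
    previous_best = ordered["accuracy"].cummax().shift(fill_value=-1.0)
    return ordered[ordered["accuracy"] > previous_best]


results = pd.DataFrame(
    {
        "latency_ms": [5.2, 8.7, 12.3, 18.9],
        "accuracy": [0.8245, 0.8678, 0.8921, 0.9103],
        "memory_mb": [4.1, 7.8, 11.2, 14.7],
    }
)

# Hypothetical per-device memory budgets in MB.
device_budgets_mb = {"small_mcu": 8.0, "edge_gpu": 16.0}

frontiers = {
    device: frontier_sketch(
        # Assumption: a variant is accepted when it fits the device's budget.
        results.assign(accepted=results["memory_mb"] <= budget),
        "latency_ms",
    )
    for device, budget in device_budgets_mb.items()
}

for device, frontier in frontiers.items():
    print(device, frontier["latency_ms"].tolist())
```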
Custom Frontier Metrics
You can use pareto_frontier with any metric column in your DataFrame as x_col.

Common Issues and Solutions
Related Functions
- run_sweep(): Generates the input DataFrame (src/edge_opt/experiments.py:47)
- save_plots(): Visualizes Pareto frontiers (src/edge_opt/experiments.py:102)
- memory_violations(): Determines accepted/rejected status (src/edge_opt/metrics.py:66)