Interpreting Metrics

This guide helps you understand what simulation metrics tell you about system performance and how to use them to identify issues and optimize configurations.

Key Performance Metrics

SimulationBank tracks four primary metric categories:

  • Wait Time: How long customers wait before service begins
  • Queue Length: Number of customers waiting at any time
  • Throughput: Rate at which customers are served
  • Saturation: Percentage of time tellers are busy

Wait Time Analysis

What is Wait Time?

Definition:
wait_time = service_start_time - arrival_time
The duration a customer spends in the waiting_queue before being assigned to a teller.

Interpreting Average Wait Time

Average Wait      Interpretation   System State
< 10 seconds      Excellent        Low load, excess capacity
10-30 seconds     Good             Balanced system
30-60 seconds     Acceptable       Moderate load
60-120 seconds    Poor             High load, consider adding tellers
> 120 seconds     Unacceptable     System struggling, immediate action needed
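
As a minimal sketch, the bands above can be encoded as a simple lookup. The function name `classify_average_wait` is hypothetical and not part of SimulationBank. Because adjacent bands share their endpoints, the sketch assumes a boundary value (for example exactly 30 seconds) falls into the higher band, except 120 seconds, which stays in Poor because the last band is strictly greater than 120.

```python
def classify_average_wait(avg_wait_seconds: float) -> str:
    """Map an average wait in seconds to the interpretation bands above.

    Hypothetical helper; boundary handling is an assumption (see lead-in).
    """
    if avg_wait_seconds < 0:
        raise ValueError("wait time cannot be negative")
    if avg_wait_seconds < 10:
        return "Excellent"
    if avg_wait_seconds < 30:
        return "Good"
    if avg_wait_seconds < 60:
        return "Acceptable"
    if avg_wait_seconds <= 120:
        return "Poor"
    return "Unacceptable"


print(classify_average_wait(8.3))
```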

Wait Time Distribution

Why it matters: Average alone doesn’t tell the full story. Consider:
# Scenario A
avg_wait = 30   # seconds
min_wait = 25   # seconds
max_wait = 35   # seconds
# Interpretation: Consistent, predictable

# Scenario B
avg_wait = 30   # seconds
min_wait = 0    # seconds
max_wait = 180  # seconds
# Interpretation: Highly variable, some customers wait 3 minutes!
Look at both the average and the maximum wait time. A maximum far above the average indicates bursty arrivals or variable service times.
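
To make the comparison concrete, here is a small sketch that summarizes a list of per-customer wait times. The helper `summarize_waits` and both sample lists are hypothetical; the samples were chosen only so that their average, minimum, and maximum match Scenarios A and B. The 95th percentile and standard deviation are extra spread indicators, not metrics the guide says SimulationBank reports.

```python
import statistics


def summarize_waits(waits):
    """Return average, min, max, 95th percentile, and spread (seconds)."""
    if not waits:
        raise ValueError("need at least one wait time")
    ordered = sorted(waits)
    # Nearest-rank 95th percentile; integer arithmetic avoids float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "average": statistics.fmean(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "p95": ordered[rank - 1],
        "stdev": statistics.pstdev(ordered),
    }


# Hypothetical samples matching the two scenarios above.
scenario_a = [25, 28, 30, 32, 35]
scenario_b = [0, 0, 5, 10, 10, 15, 20, 180]

for name, waits in (("A", scenario_a), ("B", scenario_b)):
    print(name, summarize_waits(waits))
```

Both samples average 30 seconds, but the standard deviation and maximum separate them clearly.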

Wait Time by Priority

Expected pattern:
High Priority:    0.5-2 seconds   (served almost immediately)
Medium Priority:  2-10 seconds    (short wait)
Low Priority:     10-30 seconds   (longer wait)
Red flags:
  • High priority > 10s: System overloaded, even priority customers waiting
  • Low priority > 5 minutes: Potential starvation, too many high-priority arrivals
  • All priorities similar: Priority system not working (check sorting logic)
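
The red flags above can be checked mechanically. The following is a sketch only: `priority_red_flags` is a hypothetical helper, the input mirrors the `by_priority` object shown in the example output, and the `similarity_ratio` used to decide that "all priorities are similar" is an assumed threshold that the guide does not specify.

```python
def priority_red_flags(by_priority, similarity_ratio=1.5):
    """Return a list of warnings for average waits keyed by priority (seconds).

    similarity_ratio is an assumption: if the longest average wait is less
    than this multiple of the shortest, priorities are treated as similar.
    """
    flags = []
    if by_priority["high"] > 10:
        flags.append("high priority > 10s: system overloaded")
    if by_priority["low"] > 300:
        flags.append("low priority > 5 minutes: potential starvation")
    waits = list(by_priority.values())
    shortest, longest = min(waits), max(waits)
    if shortest > 0 and longest / shortest < similarity_ratio:
        flags.append("all priorities similar: check sorting logic")
    return flags


print(priority_red_flags({"high": 1.2, "medium": 5.8, "low": 15.7}))
```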

Example Interpretations

{
  "wait_time": {
    "average": 8.3,
    "min": 0.0,
    "max": 42.1,
    "by_priority": {
      "high": 1.2,
      "medium": 5.8,
      "low": 15.7
    }
  }
}
Analysis:
  • ✅ Average wait < 10s (excellent)
  • ✅ Clear priority differentiation (1.2s vs 15.7s)
  • ✅ Max wait < 1 minute (acceptable peak)
  • Verdict: Well-configured system

Queue Length Analysis

What is Queue Length?

Definition:
queue_length = len(waiting_queue)
The number of customers waiting for service at any instant.

Interpreting Queue Length

Queue Length   Interpretation   Recommended Action
0-2            Underutilized    Consider reducing tellers
3-8            Optimal          Good balance
9-15           High             Monitor closely, may need more tellers
16-30          Very high        Add tellers soon
> 30           Critical         Immediate intervention needed
Queue Length Over Time:

15 |
10 |    *  *
 5 | * *  *  * *
 0 |*           *
   +---------------
   0  10  20  30 min
Pattern: Oscillates around a constant value
Interpretation:
  • System is stable (ρ < 1)
  • Arrival and service rates balanced
  • Normal random variation
Action: None needed
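
One way to tell a stable oscillation from a growing queue is to fit a straight line to the sampled queue lengths and look at its slope. This is a sketch under assumptions: `queue_trend` is a hypothetical helper, and it expects `(time_seconds, queue_length)` samples that you have collected yourself; the guide does not say how SimulationBank exposes the time series.

```python
def queue_trend(samples):
    """Least-squares slope of queue length over time (customers per second).

    samples: iterable of (time_seconds, queue_length) pairs.
    A slope near zero suggests oscillation around a constant level;
    a persistently positive slope suggests the queue is growing.
    """
    points = list(samples)
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in points) / n
    mean_q = sum(q for _, q in points) / n
    numerator = sum((t - mean_t) * (q - mean_q) for t, q in points)
    denominator = sum((t - mean_t) ** 2 for t, _ in points)
    if denominator == 0:
        raise ValueError("samples must span more than one timestamp")
    return numerator / denominator


# Hypothetical series: one oscillating, one growing steadily.
stable = [(0, 0), (300, 5), (600, 10), (900, 5), (1200, 10), (1500, 5), (1800, 0)]
growing = [(0, 0), (600, 10), (1200, 20), (1800, 30)]
print(queue_trend(stable), queue_trend(growing))
```

How large a slope counts as "growing" depends on run length and load, so treat any cutoff as a tuning choice rather than a fixed rule.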

Average vs. Maximum Queue Length

# Example metrics
queue_metrics = {
    "current": 7,
    "average": 5.3,
    "max": 18
}
Interpretation:
  • Average (5.3): Typical queue size - acceptable
  • Max (18): Worst-case observed - indicates occasional congestion
  • Current (7): Snapshot at simulation end
A high max relative to average indicates high variability. This is normal with exponential arrivals and service times.

Throughput Analysis

What is Throughput?

Definition:
throughput = total_customers_served / simulation_time
The rate at which customers complete service (customers per second).

Theoretical Maximum Throughput

max_throughput = num_tellers * service_rate
             = num_tellers / service_mean

# Example: 5 tellers, 6-second service
max_throughput = 5 / 6.0 = 0.833 customers/second
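
The same calculation as a runnable sketch, together with the ratio of observed to maximum throughput. The function names are hypothetical; `service_mean` is the mean service time in seconds, as in the formula above, and the observed counts in the usage line are made-up values.

```python
def max_throughput(num_tellers: int, service_mean: float) -> float:
    """Theoretical maximum throughput in customers per second."""
    if num_tellers <= 0 or service_mean <= 0:
        raise ValueError("num_tellers and service_mean must be positive")
    return num_tellers / service_mean


def throughput_ratio(total_served, simulation_time, num_tellers, service_mean):
    """Observed throughput as a fraction of the theoretical maximum."""
    observed = total_served / simulation_time
    return observed / max_throughput(num_tellers, service_mean)


print(round(max_throughput(5, 6.0), 3))
# Hypothetical run: 2,400 customers served in one hour by 5 tellers.
print(round(throughput_ratio(2400, 3600, 5, 6.0), 2))
```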

Interpreting Throughput

Throughput / Max   Interpretation          System State
< 50%              Very underutilized      Reduce tellers
50-70%             Moderate utilization    Good balance
70-85%             High utilization        Efficient operation
85-95%             Very high utilization   Near capacity
> 95%              Saturated               At theoretical limit

Throughput vs. Arrival Rate

arrival_rate = 1.0   # customers/second
throughput = 0.998   # customers/second

# throughput ≈ arrival_rate
Interpretation:
  • System keeping up with arrivals
  • Minimal queue accumulation
  • Stable operation

Cumulative Customers Served

# After 1-hour simulation
total_served = 3,587 customers
simulation_time = 3,600 seconds

throughput = 3587 / 3600 = 0.996 customers/second
           = 59.8 customers/minute
           = 3,587 customers/hour
Use this to estimate daily/monthly volumes:
# If this throughput is sustained for an 8-hour workday
daily_capacity = (3587 / 3600) * 3600 * 8 = 28,696 customers/day

Saturation (Utilization) Analysis

What is Saturation?

Definition:
saturation = total_busy_time / total_available_time
         = ρ (traffic intensity)
Fraction of available teller time spent serving customers, usually reported as a percentage (0.78 = 78%).
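
A minimal sketch of this definition, assuming you have each teller's total busy time in seconds. The helper name and the input shape are hypothetical; total available time is taken to be the number of tellers multiplied by the simulation time.

```python
def system_saturation(busy_seconds_per_teller, simulation_time):
    """Total busy time divided by total available teller time."""
    if not busy_seconds_per_teller or simulation_time <= 0:
        raise ValueError("need at least one teller and a positive duration")
    total_busy_time = sum(busy_seconds_per_teller)
    total_available_time = len(busy_seconds_per_teller) * simulation_time
    return total_busy_time / total_available_time


# Hypothetical one-hour run with four tellers.
print(system_saturation([2952, 2844, 2700, 2736], 3600))
```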

Interpreting System Saturation

Saturation   Interpretation   Teller State
< 30%        Very low         Mostly idle, overstaffed
30-50%       Low              Frequent idle periods
50-70%       Optimal          Balanced busy/idle
70-85%       High             Mostly busy, short idle periods
85-95%       Very high        Constantly busy, minimal breaks
95-100%      Saturated        Always busy, queue growing

Per-Teller Utilization

{
  "saturation": {
    "system": 0.78,
    "per_teller": [
      {"id": "T-1", "utilization": 0.82, "sessions_served": 295},
      {"id": "T-2", "utilization": 0.79, "sessions_served": 284},
      {"id": "T-3", "utilization": 0.75, "sessions_served": 270},
      {"id": "T-4", "utilization": 0.76, "sessions_served": 273}
    ]
  }
}
Analysis:
  • System: 78% utilization (good)
  • Variation: 75-82% (normal random variation)
  • Sessions: 270-295 (fairly balanced)
If one teller has significantly different utilization (e.g., 95% while others are at 70%), there may be a bug in the assignment logic.
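
A quick imbalance check might compare each teller against the median utilization. This is a sketch: `find_outlier_tellers` is a hypothetical helper, the input mirrors the `per_teller` list shown above, and the 0.15 tolerance is an assumed cutoff that the guide does not define.

```python
import statistics


def find_outlier_tellers(per_teller, tolerance=0.15):
    """Return ids of tellers whose utilization is far from the median.

    tolerance is an assumed absolute difference in utilization.
    """
    median_utilization = statistics.median(t["utilization"] for t in per_teller)
    return [
        t["id"]
        for t in per_teller
        if abs(t["utilization"] - median_utilization) > tolerance
    ]


balanced = [
    {"id": "T-1", "utilization": 0.82},
    {"id": "T-2", "utilization": 0.79},
    {"id": "T-3", "utilization": 0.75},
    {"id": "T-4", "utilization": 0.76},
]
skewed = [
    {"id": "T-1", "utilization": 0.95},
    {"id": "T-2", "utilization": 0.70},
    {"id": "T-3", "utilization": 0.70},
    {"id": "T-4", "utilization": 0.70},
]
print(find_outlier_tellers(balanced), find_outlier_tellers(skewed))
```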

Idle Time

idle_percentage = (1 - saturation) * 100

# Example: saturation = 0.78
idle_percentage = (1 - 0.78) * 100 = 22%
Interpretation:
  • Tellers idle 22% of the time
  • This is healthy - allows for breaks, unexpected delays
  • Too little idle time (< 5%) means no buffer capacity

Combining Metrics for Insights

Pattern 1: High Utilization, Low Wait Times

saturation = 0.85      # 85% busy
avg_wait_time = 5.2    # 5 seconds
queue_length_avg = 2.3 # ~2 customers
Interpretation:
  • Efficient system: High throughput without excessive waits
  • Tellers kept busy but customers served quickly
  • Verdict: Well-optimized configuration

Pattern 2: Low Utilization, High Wait Times

saturation = 0.35      # Only 35% busy
avg_wait_time = 45.3   # 45 seconds!
queue_length_avg = 8.5 # ~9 customers
Interpretation:
  • Paradox! Tellers idle yet customers waiting
  • Likely cause: Bug in _assign_free_teller() logic
  • Tellers not being assigned to waiting customers
  • Action: Debug simulation code

Pattern 3: High Saturation, Growing Queue

saturation = 0.98      # 98% busy
avg_wait_time = 180    # 3 minutes
queue_length = growing # Increasing over time
throughput < arrival_rate
Interpretation:
  • Unstable system: offered load ρ ≥ 1 (measured saturation tops out near 100%)
  • Tellers maxed out but can’t keep up
  • Queue will grow indefinitely
  • Action: URGENT - Add tellers

Pattern 4: Balanced System

saturation = 0.72
avg_wait_time = 8.5
queue_length_avg = 4.2
queue_length: stable oscillation
throughput ≈ arrival_rate
Interpretation:
  • Ideal configuration
  • All metrics in healthy ranges
  • System stable and efficient
  • Action: None needed, monitor over time
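
The four patterns can be turned into a rough triage function. This is a sketch only: `diagnose` is a hypothetical helper, and every threshold in it is an assumption loosely taken from the interpretation tables earlier in this guide, not a rule that SimulationBank enforces.

```python
def diagnose(saturation, avg_wait_time, queue_growing=False):
    """Match metrics against the four patterns above (assumed thresholds)."""
    if saturation >= 0.95 and queue_growing:
        return "Pattern 3: unstable, add tellers"
    if saturation < 0.50 and avg_wait_time > 30:
        return "Pattern 2: idle tellers but long waits, debug assignment logic"
    if saturation >= 0.80 and avg_wait_time < 10:
        return "Pattern 1: efficient, well-optimized"
    if 0.50 <= saturation < 0.80 and avg_wait_time < 30:
        return "Pattern 4: balanced, keep monitoring"
    return "No pattern matched: inspect metrics individually"


print(diagnose(0.85, 5.2))
print(diagnose(0.35, 45.3))
print(diagnose(0.98, 180, queue_growing=True))
print(diagnose(0.72, 8.5))
```

Combinations that fall between the patterns return the fallback message, which is a cue to read the individual metric sections rather than a verdict.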

Using Metrics to Optimize

Scenario: Reduce Costs (Minimize Tellers)

Current state:
num_tellers = 8
saturation = 0.45      # Only 45% busy
avg_wait_time = 3.2    # Very low
Analysis:
  • Overstaffed - wasting resources
  • Can reduce tellers without hurting service
Action:
# Try with 6 tellers
expected_saturation = 0.45 * (8/6) = 0.60  # Still acceptable

# Run simulation to verify
# If avg_wait_time stays < 15s, accept change
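
The projection above generalizes to any candidate teller count. The sketch below assumes the offered load stays the same when tellers are removed, so saturation scales with the ratio of teller counts. Both function names are hypothetical, and the 0.70 cap is an assumption taken from the top of the "Optimal" saturation band.

```python
def rescale_saturation(current_saturation, current_tellers, new_tellers):
    """Projected saturation if the same load is spread over new_tellers."""
    if new_tellers <= 0:
        raise ValueError("new_tellers must be positive")
    return current_saturation * current_tellers / new_tellers


def candidate_teller_counts(current_saturation, current_tellers, max_saturation=0.70):
    """Smaller teller counts whose projected saturation stays under the cap."""
    return [
        n
        for n in range(1, current_tellers)
        if rescale_saturation(current_saturation, current_tellers, n) <= max_saturation
    ]


print(rescale_saturation(0.45, 8, 6))
print(candidate_teller_counts(0.45, 8))
```

Any candidate still needs a simulation run to confirm that wait times stay acceptable, since the projection says nothing about queueing delay.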

Scenario: Improve Service (Reduce Wait Times)

Current state:
num_tellers = 5
saturation = 0.88
avg_wait_time = 52.3   # Long: close to the 60-second "Poor" threshold
Analysis:
  • High utilization causing long waits
  • Need more capacity
Action:
# Add 2 tellers (5 → 7)
expected_saturation = 0.88 * (5/7) = 0.63  # Better

# Expected wait time reduction:
# Rule of thumb (rough approximation): wait_time is inversely proportional to (1 - ρ)
# Old: 1 - 0.88 = 0.12
# New: 1 - 0.63 = 0.37
# Improvement factor: 0.37 / 0.12 ≈ 3x
# Expected new wait: 52.3 / 3 ≈ 17 seconds
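
The rule of thumb is only a rough guide. If arrivals and service times are exponential, as noted earlier in this guide, the standard Erlang C formula for an M/M/c queue gives an analytical estimate of the mean wait. The sketch below is a swapped-in textbook technique, not SimulationBank's own calculation: the function name is hypothetical, it ignores customer priorities, and the 6-second `service_mean` in the usage example is borrowed from the throughput example rather than from this scenario.

```python
import math


def erlang_c_wait(arrival_rate, service_mean, num_tellers):
    """Mean wait in queue (seconds) for an M/M/c system via Erlang C.

    arrival_rate: customers per second; service_mean: seconds per customer.
    Returns math.inf when the system is unstable (rho >= 1).
    """
    offered_load = arrival_rate * service_mean      # in Erlangs
    rho = offered_load / num_tellers                # per-teller utilization
    if rho >= 1:
        return math.inf
    top = offered_load ** num_tellers / math.factorial(num_tellers)
    series = sum(offered_load ** k / math.factorial(k) for k in range(num_tellers))
    p_wait = top / ((1 - rho) * series + top)       # probability of waiting
    return p_wait * service_mean / (num_tellers * (1 - rho))


# Assumed load: rho = 0.88 with 5 tellers and a 6-second mean service time.
arrival_rate = 0.88 * 5 / 6.0
for tellers in (5, 7):
    print(tellers, erlang_c_wait(arrival_rate, 6.0, tellers))
```

Treat the result as a cross-check on the simulation, not a replacement for running it.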

Scenario: Handle Peak Hours

Off-peak state:
arrival_rate = 0.8
num_tellers = 4
saturation = 0.67
Peak projection:
peak_arrival_rate = 2.5  # about 3.1x the off-peak rate

# If we keep 4 tellers:
expected_saturation = 0.67 * (2.5 / 0.8) = 2.09  # UNSTABLE!

# Required tellers for a target saturation of 0.75:
required = 4 * 2.09 / 0.75 ≈ 11.15  # round up to 12 tellers
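
The same projection as a runnable sketch that rounds up to a whole number of tellers. The function name is hypothetical, and the 0.75 target saturation is the value used in the calculation above; pick a different target if your service goals differ.

```python
import math


def required_tellers(current_tellers, current_saturation,
                     current_arrival_rate, peak_arrival_rate,
                     target_saturation=0.75):
    """Tellers needed at peak so projected saturation stays at the target."""
    # Saturation scales with arrival rate when the teller count is fixed.
    projected = current_saturation * peak_arrival_rate / current_arrival_rate
    exact = current_tellers * projected / target_saturation
    return math.ceil(exact)  # a fractional teller must be rounded up


print(required_tellers(4, 0.67, 0.8, 2.5))
```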

Metric Benchmarks

Excellent Performance

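{
  "wait_time": {"average": 5.0, "max": 25.0},
  "queue_length": {"average": 2.5, "max": 8},
  "throughput": 0.85,
  "saturation": 0.68
}
Here throughput is a fraction of the theoretical maximum (0.85 = 85% of max) and saturation is a fraction of teller time (0.68 = 68% utilization).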

Acceptable Performance

{
  "wait_time": {"average": 15.0, "max": 60.0},
  "queue_length": {"average": 6.0, "max": 18},
  "throughput": 0.75,
  "saturation": 0.80
}

Poor Performance

{
  "wait_time": {"average": 90.0, "max": 300.0},
  "queue_length": {"average": 20.0, "max": 65},
  "throughput": 0.50,
  "saturation": 0.95
}

Further Reading

  • Metrics Dashboard: Technical details of metric calculations
  • Configuring Parameters: How to adjust config based on metrics
  • Advanced Scenarios: Complex optimization examples
  • Running Simulations: Collecting and exporting metrics
