performance-efficiency: AWS Performance Efficiency Review

The performance-efficiency skill runs a focused assessment of your workload’s performance posture. It analyzes resource selection, scaling configurations, caching, and data access patterns in your codebase to identify bottlenecks and improvement opportunities. Every finding includes a specific file path, evidence, and an estimated performance impact.

Use performance-efficiency when you need a dedicated performance review, latency analysis, scaling assessment, or caching evaluation. For a multi-pillar review that includes performance alongside security, cost, and other pillars, use wa-review instead.

What the Agent Analyzes

The skill runs a structured discovery across six performance domains before evaluating the workload against the AWS Well-Architected (WA) Framework questions.

Compute Selection Discovery

The agent examines compute resource selection and configuration:

EC2 instance types and families (general purpose vs compute/memory/storage-optimized)
Lambda memory and timeout configurations
ECS/Fargate task CPU and memory allocations
Container base images (Alpine, distroless, full OS)
Graviton (ARM) vs x86 architecture selection
Provisioned concurrency configurations (Lambda)
Batch vs real-time processing selection

Automatically flagged:

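Lambda left at the default 128 MB memory (Lambda allocates CPU in proportion to memory, so the function is likely under-provisioned and slower)
Lambda timeout ≥ caller’s timeout (the caller gives up first, so a slow invocation surfaces as a caller-side timeout while the function keeps running)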
General-purpose instances for compute-heavy or memory-heavy workloads
x86 where Graviton provides better price-performance
Over-provisioned Fargate tasks (large CPU/memory for simple workloads)
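
The first two flags above can be pictured as simple rule checks over a function's configuration. The following is only a minimal sketch of that idea, not the skill's actual implementation; the `LambdaConfig` type, its field names, and the `flag_lambda` helper are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

DEFAULT_LAMBDA_MEMORY_MB = 128  # Lambda's default memory setting


@dataclass
class LambdaConfig:
    """Hypothetical, simplified view of one function's settings."""
    name: str
    memory_mb: int
    timeout_s: int
    caller_timeout_s: int  # timeout of whatever invokes the function


def flag_lambda(cfg: LambdaConfig) -> list[str]:
    """Return human-readable findings for one function configuration."""
    findings = []
    if cfg.memory_mb <= DEFAULT_LAMBDA_MEMORY_MB:
        findings.append(f"{cfg.name}: memory left at the {cfg.memory_mb} MB default")
    if cfg.timeout_s >= cfg.caller_timeout_s:
        findings.append(
            f"{cfg.name}: timeout {cfg.timeout_s}s >= caller timeout {cfg.caller_timeout_s}s"
        )
    return findings


for finding in flag_lambda(LambdaConfig("api-handler", 128, 30, 29)):
    print(finding)
```

A real review would read these values from infrastructure-as-code files and attach the file path and line as evidence.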

Storage and Database Performance Discovery

The agent analyzes storage and database configurations for performance:

Database engine selection vs access patterns
Read replica configurations
Connection pooling (RDS Proxy, application-level pooling)
DynamoDB table design (partition key distribution, GSI configurations)
Caching layers (ElastiCache, DAX, CloudFront)
EBS volume types and IOPS provisioning
Database query patterns in application code (N+1 queries, missing indexes)

Automatically flagged:

Relational database for key-value access patterns (DynamoDB may be a better fit)
No connection pooling in serverless → RDS architectures
DynamoDB with hot partition patterns (poor key design)
No caching layer for read-heavy, low-change data
gp2 EBS volumes (gp3 offers a higher baseline than gp2 for volumes under 1,000 GiB, at a lower price per GiB)
Missing DAX for DynamoDB read-heavy patterns
N+1 query patterns or unbounded queries in application code
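
To make the N+1 flag concrete, here is a small self-contained sketch using Python's built-in `sqlite3` module. The `authors` and `posts` tables and both function names are invented for illustration; the point is only the difference in round trips between a per-row lookup and a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ada'), (2, 'lin');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'l1');
""")


def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author."""
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries = 1
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_single_join(conn):
    """Same data fetched in a single round trip with a JOIN."""
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1


print(titles_n_plus_one(conn)[1], "queries vs", titles_single_join(conn)[1])
```

With two authors the difference is small, but the per-row version grows linearly with the number of parent rows, and against a networked database each extra query adds a round trip.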

Networking and Content Delivery Discovery

The agent analyzes networking configurations for performance:

CloudFront distributions (or absence for static/API content)
API Gateway caching configurations
VPC endpoint usage (reduced latency vs internet path)
Placement groups for latency-sensitive workloads
Global Accelerator configurations
Connection settings (keep-alive, HTTP/2, compression)
DNS resolution (Route 53 latency-based routing)

Automatically flagged:

No CDN for static content delivery
API responses without compression (gzip/brotli)
No API Gateway caching for cacheable GET endpoints
Cross-region calls that could use local endpoints
Missing VPC endpoints (adds NAT Gateway latency)
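
As a rough illustration of why uncompressed API responses are flagged, the sketch below measures how much a repetitive JSON payload shrinks under gzip, using only the standard library. The payload is made up, and real responses will compress more or less depending on their content.

```python
import gzip
import json


def compressed_ratio(payload: bytes) -> float:
    """Return the gzip-compressed size as a fraction of the original size."""
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical API response: a list of similar records, which compresses well.
body = json.dumps(
    [{"id": i, "status": "active", "region": "us-east-1"} for i in range(500)]
).encode("utf-8")

ratio = compressed_ratio(body)
print(f"{len(body)} bytes -> {ratio:.0%} of original size after gzip")
```

In practice compression is usually enabled at the CDN, API gateway, or web framework layer rather than in handler code.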

Scaling and Elasticity Discovery

The agent examines scaling configurations and responsiveness:

Auto Scaling policies (target tracking vs step scaling)
Scaling metrics used (CPU only vs custom metrics like queue depth, latency)
Cooldown periods and scaling speed
Scheduled scaling for predictable patterns
Provisioned concurrency for latency-sensitive Lambda functions
Database auto-scaling (Aurora, DynamoDB)
Scaling bottlenecks (connection limits, DNS propagation, cold starts)

Automatically flagged:

CPU-only scaling metrics (ignores memory, queue depth, latency)
Long cooldown periods that prevent rapid response
No provisioned concurrency on latency-sensitive Lambda functions
Fixed-capacity databases with variable load patterns
Scaling max too low for expected peak
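
One way to see why CPU-only metrics and a low scaling maximum matter is the common backlog-per-instance calculation for queue-driven workers. This is a simplified sketch with hypothetical function names and numbers, not the skill's own logic.

```python
import math


def acceptable_backlog(latency_target_ms: int, ms_per_message: int) -> int:
    """Messages one instance can hold in backlog and still meet the latency target."""
    return latency_target_ms // ms_per_message


def desired_capacity(queue_depth: int, backlog_per_instance: int,
                     min_size: int, max_size: int) -> int:
    """Instances needed to drain the queue in time, clamped to the group limits."""
    if backlog_per_instance <= 0:
        raise ValueError("backlog_per_instance must be positive")
    needed = math.ceil(queue_depth / backlog_per_instance)
    return max(min_size, min(needed, max_size))


per_instance = acceptable_backlog(latency_target_ms=10_000, ms_per_message=100)
print(desired_capacity(1500, per_instance, min_size=1, max_size=50))
print(desired_capacity(1500, per_instance, min_size=1, max_size=10))
```

The second call shows the "scaling max too low" case: the workload needs more instances than the configured maximum allows, so the backlog keeps growing even though scaling is working as configured.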

Application Performance Pattern Discovery

The agent analyzes application code for performance patterns:

Synchronous vs asynchronous processing patterns
Batch processing opportunities (batch writes, bulk APIs)
Pagination implementations (cursor vs offset)
Data serialization formats (JSON vs binary protocols)
Caching usage in application code (local caches, distributed caches)
Warm-up and connection reuse patterns (Lambda handler initialization)
Parallel vs sequential processing of independent operations

Automatically flagged:

Synchronous processing that could be async (SQS, EventBridge)
Sequential API calls that could be parallelized
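Offset-based pagination on large datasets (the database scans and discards every skipped row, while cursor-based pagination seeks directly through an index)
Lambda initialization patterns (creating SDK clients or connections inside the handler instead of at module scope, so the setup cost is paid on every invocation rather than once per cold start)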
Missing batch operations (individual PutItem instead of BatchWriteItem)
Unbounded data retrieval without pagination
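
The handler-initialization flag can be illustrated without any AWS dependency. In the sketch below, `FakeClient` is a stand-in for an SDK client or database connection and simply counts how often it is constructed; all names are hypothetical.

```python
class FakeClient:
    """Stand-in for an SDK client; counts how many times it is constructed."""
    created = 0

    def __init__(self):
        FakeClient.created += 1

    def get(self, key):
        return f"value-for-{key}"


def handler_slow(event, context=None):
    """Anti-pattern: the client is rebuilt on every invocation."""
    client = FakeClient()
    return client.get(event["key"])


# Built once when the module loads, then reused by every warm invocation.
_client = FakeClient()


def handler_fast(event, context=None):
    """Preferred: reuse the module-scope client."""
    return _client.get(event["key"])


for _ in range(3):
    handler_slow({"key": "a"})
    handler_fast({"key": "a"})
print("clients constructed:", FakeClient.created)
```

Moving construction to module scope does not remove the cold start itself; it only ensures the setup cost is paid once per execution environment instead of on every request.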

Performance Monitoring Discovery

The agent analyzes performance monitoring configurations:

CloudWatch metrics and alarms (latency percentiles, errors)
X-Ray / OpenTelemetry tracing configuration
Custom metrics for performance-sensitive operations
Performance budgets and alerts
Dashboard definitions tracking latency/throughput

Automatically flagged:

No latency alarms or dashboards at the p95/p99 level
No distributed tracing across service boundaries
No performance alarms (only error alarms)
Average-only metrics (averages hide tail latency)
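
A short worked example shows why averages are flagged. The latency samples are invented, and `percentile` is a simple nearest-rank helper written for this sketch rather than any monitoring service's exact method.

```python
import statistics


def percentile(samples, pct: int):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


# Hypothetical request latencies: 90 fast requests and 10 very slow ones.
latencies_ms = [20] * 90 + [2000] * 10

print("mean:", statistics.mean(latencies_ms))
print("p50: ", percentile(latencies_ms, 50))
print("p95: ", percentile(latencies_ms, 95))
print("p99: ", percentile(latencies_ms, 99))
```

Here the mean sits between the two groups and describes no real request, while the median looks healthy. Only the high percentiles reveal that one request in ten takes two seconds.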

WA Framework Coverage: PERF 1–5

After discovery, the agent evaluates your workload against all 5 Performance Efficiency pillar questions.

Question	Focus Area
PERF 1	Appropriate resource selection — instance types, compute families, storage tiers, selection justification
PERF 2	Compute solution — Lambda vs ECS vs EC2 selection, memory/CPU configs, cold start handling
PERF 3	Storage solution — storage types matching access patterns, IOPS configs, tiering strategies
PERF 4	Database solution — engine selection, read replicas, connection pooling, caching layers
PERF 5	Networking solution — CloudFront, VPC endpoints, placement groups, compression

Output Format

The skill produces a structured performance assessment including:

Performance Scorecard — 1–5 score across six domains (Compute Selection, Storage & Database, Networking & CDN, Scaling & Elasticity, Application Patterns, Monitoring)
Critical and High Risk Findings — with domain, title, description, evidence (file:line), performance impact, recommendation, expected improvement, effort, and relevant AWS services
Medium and Low Risk Findings — in condensed format
Optimization Opportunities table — current config, recommended config, expected improvement, and evidence for each resource
Prioritized Remediation Plan — Quick Wins (< 1 week), Foundation (1–4 weeks), Strategic (1–3 months)

How to Invoke

performance review
performance efficiency assessment
evaluate our performance posture

Example Optimization Opportunities

## Optimization Opportunities

| Resource | Current Config | Recommended | Expected Improvement | Evidence |
|----------|---------------|-------------|---------------------|----------|
| Lambda (api-handler) | 128 MB memory | 512 MB memory | ~60% reduction in duration | functions.ts:23 |
| RDS PostgreSQL | No connection pooling | RDS Proxy | ~40% reduction in connection overhead | database.tf:45 |
| API responses | No compression | gzip enabled on CloudFront | ~70% reduction in transfer size | cloudfront.tf:12 |
| DynamoDB reads | No caching | DAX cluster | Sub-millisecond read latency | dynamo.tf:67 |
| Lambda init | SDK client in handler | SDK client at module scope | ~200ms cold start reduction | handler.ts:5 |

Performance Trade-off Awareness

Performance optimizations often involve trade-offs with cost, complexity, or consistency. The agent explicitly notes these trade-offs for every recommendation. For example, enabling DAX for DynamoDB adds cost and operational overhead — the agent will estimate the latency benefit alongside the cost increase so you can make an informed decision.

Common trade-offs the agent surfaces:

Optimization	Performance Gain	Trade-off
Caching (ElastiCache/DAX)	Lower read latency	Added cost, eventual consistency, cache invalidation complexity
Async processing (SQS)	Higher write throughput	Increased latency for consumers, eventual consistency
Provisioned concurrency	Avoids cold starts for requests served by pre-initialized environments	Higher Lambda cost (charged even when idle)
Read replicas	Distributed read load	Replication lag, additional cost
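gp3 over gp2 EBS	3,000 IOPS baseline regardless of volume size	Minimal; gp3 is also cheaper per GiB, though gp2 volumes above 1,000 GiB need extra provisioned IOPS on gp3 to match their baseline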
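
The gp2-to-gp3 row can be checked with simple arithmetic. The sketch below assumes the published baselines (gp2: 3 IOPS per GiB with a floor of 100 and a cap of 16,000; gp3: a flat 3,000) and ignores gp2 burst credits and throughput limits; the function names are made up for illustration.

```python
GP3_BASELINE_IOPS = 3_000  # gp3 baseline, independent of volume size


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, with a floor of 100 and a cap of 16,000."""
    return max(100, min(3 * size_gib, 16_000))


def gp3_needs_extra_iops(size_gib: int) -> bool:
    """True when a gp2 volume's baseline exceeds what gp3 provides by default."""
    return gp2_baseline_iops(size_gib) > GP3_BASELINE_IOPS


for size in (20, 100, 1_000, 2_000):
    print(size, "GiB:", gp2_baseline_iops(size), "IOPS on gp2,",
          "extra gp3 IOPS needed" if gp3_needs_extra_iops(size) else "gp3 baseline is enough")
```

The crossover sits at 1,000 GiB: below that size the gp3 baseline is equal or higher, and above it a like-for-like migration should provision additional IOPS on the gp3 volume.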

Benchmark Results

Evaluated with Claude Opus 4.8, 16K output tokens, paired comparison (same prompt with and without skill):

Baseline	With Skill	Delta
90%	100%	+10 percentage points

The skill brings a strong baseline model to 100% on behavioral assertions covering resource selection analysis, caching coverage, scaling configuration depth, and estimated performance improvements.
