Query optimization is crucial for maintaining good performance in Snuba. This guide covers techniques, best practices, and common pitfalls to help you write efficient queries.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/getsentry/snuba/llms.txt
Use this file to discover all available pages before exploring further.
Query Processing Pipeline
Understanding how Snuba processes queries helps you write better ones:- Parsing: Query string → AST
- Validation: Check entity requirements
- Logical Processing: Apply entity-specific transformations
- Storage Selection: Choose optimal table/view
- Translation: Logical → Physical query
- Physical Processing: Apply optimizations
- Execution: Run on ClickHouse
Key Optimization Principles
1. Always Include Required Conditions
Most entities require specific conditions to prevent expensive queries.2. Use Appropriate Time Ranges
Smaller time ranges are more efficient:- ✅ Query hours or days for real-time dashboards
- ✅ Query weeks for trend analysis
- ⚠️ Query months sparingly
- ❌ Avoid querying years of data
3. Leverage Indexes
ClickHouse indexes certain columns for faster queries: Indexed Columns (typically):project_idtimestamp- Tags (when promoted)
event_id,group_id,transaction_id
4. Use Appropriate Limits
Always specify reasonable limits:Default limit is 1000 rows. For pagination, use
LIMIT with OFFSET:SnQL Optimization Techniques
Tag Queries
Snuba optimizes tag queries in different ways:- Use equality (
=) when possible - Use
INfor multiple values:tags[environment] IN array('prod', 'staging') - Avoid
LIKEon tags unless necessary
Aggregation Optimization
Use Appropriate Aggregation Functions
- Fast Aggregations
- Slower Aggregations
count()sum()min(),max()uniq()- Uses HyperLogLog (approximate)
Limit Grouping Cardinality
High-cardinality grouping is expensive:- Group by categorical data (environment, status codes)
- Avoid grouping by UUIDs, timestamps, or high-cardinality strings
- Use
LIMIT BYto limit results per group
Sampling for Exploration
Use sampling for exploratory queries:turbo flag:
Subquery Optimization
Subqueries add overhead. Use them strategically:Join Optimization
Joins are expensive. Optimize by:- Push filters into subqueries: Filter before joining
- Use appropriate join types: Defined by entity relationships
- Join on indexed columns: Join keys are typically indexed
MQL Optimization Techniques
Choose Appropriate Granularity
Granularity affects query performance:- Match
granularityto storage configuration - Use larger intervals for longer time ranges
- 1 minute (60s) for real-time dashboards
- 1 hour (3600s) for daily views
- 1 day (86400s) for weekly/monthly views
Filter Early
Filters in the MQL string are more efficient than post-processing:Use Appropriate Aggregations
Match aggregation to metric type:- Counters:
sum(),count() - Distributions:
quantiles(),avg(),sum() - Sets:
uniq() - Gauges:
avg(),last(),min(),max()
Limit Grouping Dimensions
Limit the number of group-by tags:Storage-Level Optimizations
Final Queries
Snuba may applyFINAL to ClickHouse queries to handle deduplication:
- Forces merge of duplicate rows
- Increases query time
- Applied automatically when necessary
- Queries on ReplacingMergeTree tables
- When consistent results are required
- Using
turbo: true(disables FINAL, may return duplicates) - Querying read-replica storages (already deduplicated)
Storage Selection
Snuba may have multiple storages for an entity:- Primary: For consistent queries
- Replica: For high-throughput queries
- Materialized Views: Pre-aggregated for specific queries
- Query pattern
consistentflag- Available storages
Monitoring Query Performance
Use debug mode to understand query performance:Interpret Timing Breakdown
execute: Time spent in ClickHouse - optimize query if highprepare_query: Query processing time - usually lowvalidate_schema: Validation time - should be minimal
Interpret Stats
final: true- Consider using turbo mode if duplicates are acceptableresult_rows: 100000- Reduce limit or use pagination- Many results with no grouping - Add filtering or grouping
Review Generated SQL
Debug mode includes the generated SQL:PREWHEREvsWHERE- PREWHERE is faster for indexed columnsFINALkeyword - Indicates deduplication overhead- Index usage - Conditions on indexed columns
Common Performance Issues
Issue: Slow Tag Queries
- Problem
- Solution
Issue: High Cardinality Grouping
- Problem
- Solution
Issue: Missing Required Conditions
- Problem
- Solution
Issue: Large Time Range
- Problem
- Solution
Query Optimization Checklist
✅ Required Conditions
- Include
project_idcondition - Include timestamp range
- Use
IN array()for multiple projects
✅ Time Ranges
- Use smallest reasonable time range
- Prefer hours/days over weeks/months
- Use appropriate datetime format
✅ Filtering
- Filter on indexed columns when possible
- Use equality over LIKE for tags
- Push filters before joins/aggregations
✅ Aggregations
- Use appropriate functions for metric types
- Prefer approximate functions (uniq vs uniqExact)
- Group by low-cardinality columns
✅ Limits
- Always use reasonable limits (≤1000)
- Use LIMIT BY for per-group limits
- Paginate large result sets
✅ Sampling
- Use SAMPLE for exploratory queries
- Enable turbo mode during development
- Scale results appropriately
Advanced Techniques
Concurrent Query Limits
Snuba tracks concurrent queries per project and globally:- Caching results
- Batching queries
- Reducing query frequency
Rate Limiting
Snuba has rate limits per referrer and organization:- Using appropriate polling intervals
- Caching query results
- Batching similar queries
Allocation Policies
Snuba uses allocation policies to manage resources:- Bytes Scanned: Rejects queries scanning too much data
- Concurrent Queries: Limits parallel queries
- Query Duration: May throttle long-running queries
Testing Query Performance
When optimizing queries:- Enable debug mode: See timing and generated SQL
- Compare variations: Test different approaches
- Check stats: Look for red flags (FINAL, high row counts)
- Use sampling first: Test on small data samples
- Monitor production: Track query performance over time
Query performance can vary based on:
- Time of day (concurrent load)
- Data volume
- ClickHouse cluster health
- Recent cache state
Next Steps
SnQL Syntax
Learn complete SnQL syntax
MQL Syntax
Learn complete MQL syntax
Query Overview
Understand query fundamentals