Overview
The Aggregate component provides several strategies for optimizing throughput. This guide covers techniques for both read and write optimization.
Write Optimization Strategies
1. Use Bounds to Reduce Read Dependencies
To reduce the read dependency footprint of your queries, partition your aggregate space and use bounds whenever possible.
Bounds limit the portion of the aggregate tree that queries depend on, reducing spurious reruns and OCC conflicts.
Example: Range-Based Bounds
// Only reads scores above 95 and up to (and including) 100
// In a query: only reruns when a score in that range changes
// In a mutation: only conflicts with mutations modifying scores in that range
await aggregateByScore.count(ctx, {
  bounds: {
    lower: { key: 95, inclusive: false },
    upper: { key: 100, inclusive: true },
  },
});
Example: Prefix-Based Bounds
// Only reads data from a specific user
// Will only rerun or conflict when a mutation modifies that user's data
await aggregateScoreByUser.count(ctx, {
  bounds: { prefix: [username] },
});
Benefits:
- Queries depend on a smaller subset of the tree
- Reduced spurious query reruns
- Lower OCC conflict rates in mutations
- Multiple mutations can run concurrently if they operate on different ranges
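The bound semantics used above (exclusive lower, inclusive upper) can be checked with a plain-TypeScript sketch — a toy model with no Convex runtime, not the component's actual implementation:

```typescript
// Toy model of bound matching: lower { inclusive: false } excludes the
// boundary key, upper { inclusive: true } includes it.
type Bound = { key: number; inclusive: boolean };

function inBounds(key: number, lower?: Bound, upper?: Bound): boolean {
  if (lower && (lower.inclusive ? key < lower.key : key <= lower.key)) return false;
  if (upper && (upper.inclusive ? key > upper.key : key >= upper.key)) return false;
  return true;
}

const scores = [90, 95, 96, 100];
const selected = scores.filter((s) =>
  inBounds(s, { key: 95, inclusive: false }, { key: 100, inclusive: true })
);
// 95 is excluded (exclusive lower); 96 and 100 fall inside the bounds
```

Only keys inside these bounds contribute to the query's read set, which is why narrowing them reduces reruns and conflicts.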
2. Choose Namespacing Over Bounds for High Throughput
When you have natural data partitions and don’t need cross-partition aggregation, use namespaces instead of bounds.
const scoresByGame = new TableAggregate<{
  Namespace: Id<"games">;
  Key: number;
  DataModel: DataModel;
  TableName: "scores";
}>(components.scoresByGame, {
  namespace: (doc) => doc.gameId,
  sortKey: (doc) => doc.score,
});
// Each game has its own isolated tree
const footballCount = await scoresByGame.count(ctx, {
  namespace: footballGameId,
});
Each namespace gets its own internal data structure with zero overlap between namespaces, providing complete isolation.
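That isolation can be pictured with a plain-TypeScript sketch — a toy stand-in using a map of independent arrays, not the component's actual B-tree storage:

```typescript
// Each namespace owns an independent structure; a write in one namespace
// never touches another's, so there is nothing to contend on.
const trees = new Map<string, number[]>();

function insert(namespace: string, score: number): void {
  if (!trees.has(namespace)) trees.set(namespace, []);
  trees.get(namespace)!.push(score);
}

insert("football", 21);
insert("chess", 1);
insert("football", 14);

// Counting one namespace reads only that namespace's structure
const footballCount = trees.get("football")?.length ?? 0;
```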
When to use namespaces:
- Data is naturally partitioned (e.g., per user, per tenant, per game)
- You don’t need to aggregate across partitions
- You want maximum write throughput
- You want zero contention between partitions
When to use bounds instead:
- You need to aggregate globally
- Your partitions are temporary or dynamic
- You want flexible querying across different ranges
3. Avoid Sequential Keys
Don’t use strictly increasing values like _creationTime as your only sort key:
// ❌ Bad: All writes hit the same part of the tree
const aggregate = new TableAggregate<{
  Key: number;
  DataModel: DataModel;
  TableName: "events";
}>(components.aggregate, {
  sortKey: (doc) => doc._creationTime,
});
Why this is problematic:
- All new inserts target the end of the tree
- Every write contends with every other write
- No parallelism possible
Using _creationTime as the sole sort key serializes all write operations, severely limiting throughput.
Solutions
Option 1: Add a randomizing component
// ✅ Better: Add randomness to distribute writes
const aggregate = new TableAggregate<{
  Key: [number, string]; // [rounded timestamp, random ID]
  DataModel: DataModel;
  TableName: "events";
}>(components.aggregate, {
  sortKey: (doc) => [
    Math.floor(doc._creationTime / 3600000), // Round to the hour
    doc._id,
  ],
});
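The bucketing arithmetic above can be verified in plain TypeScript (no Convex runtime needed):

```typescript
// Rounding timestamps down to the hour groups concurrent inserts into a
// shared bucket; the random document ID then spreads writes within it.
const HOUR_MS = 3_600_000;
const hourBucket = (creationTime: number): number =>
  Math.floor(creationTime / HOUR_MS);

const t1 = Date.UTC(2024, 0, 15, 10, 5);  // 10:05 UTC
const t2 = Date.UTC(2024, 0, 15, 10, 55); // 10:55 UTC
const t3 = Date.UTC(2024, 0, 15, 11, 5);  // 11:05 UTC

const sameBucket = hourBucket(t1) === hourBucket(t2); // same hour
const nextBucket = hourBucket(t3) === hourBucket(t1) + 1; // next hour
```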
Option 2: Use namespacing with time ranges
// ✅ Better: Namespace by time period
const aggregate = new TableAggregate<{
  Namespace: string; // "2024-01-15"
  Key: number;
  DataModel: DataModel;
  TableName: "events";
}>(components.aggregate, {
  namespace: (doc) => {
    const date = new Date(doc._creationTime);
    return date.toISOString().split("T")[0];
  },
  sortKey: (doc) => doc._creationTime,
});
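The namespace function above is plain date arithmetic, so it can be exercised directly in TypeScript:

```typescript
// Timestamps on the same UTC day map to one namespace; a new day starts a
// fresh, independent tree.
function dayNamespace(creationTime: number): string {
  return new Date(creationTime).toISOString().split("T")[0];
}

const a = dayNamespace(Date.UTC(2024, 0, 15, 3, 0, 0));  // early Jan 15
const b = dayNamespace(Date.UTC(2024, 0, 15, 23, 0, 0)); // late Jan 15
const c = dayNamespace(Date.UTC(2024, 0, 16, 1, 0, 0));  // early Jan 16
```

Note that namespaces accumulate over time with this scheme; old per-day trees simply go quiet rather than being merged.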
4. Tune Lazy Aggregation Settings
Adjust maxNodeSize and rootLazy for your workload. Smaller nodes hold fewer items each, so concurrent writes are less likely to touch the same node; larger nodes mean fewer database reads per query but more write contention:
// For write-heavy workloads
await aggregate.clear(ctx, {
  maxNodeSize: 4, // Smaller nodes = fewer items per node = less write contention
  rootLazy: true, // Default: avoids every write touching the root node
});
// For read-heavy workloads
await aggregate.clear(ctx, {
  maxNodeSize: 16, // Default; larger nodes mean fewer reads per query
  rootLazy: false, // Faster counts at the root, but every write contends on it
});
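As a back-of-envelope model (a sketch, not the component's actual implementation), the chance that two concurrent writes collide on the same leaf node shrinks as nodes get smaller:

```typescript
// With N items spread over leaves holding up to maxNodeSize keys each,
// a second random write lands on the first write's leaf with probability
// roughly 1 / (number of leaves).
function conflictChance(totalItems: number, maxNodeSize: number): number {
  const leaves = Math.ceil(totalItems / maxNodeSize);
  return 1 / leaves;
}

const smallNodes = conflictChance(10_000, 4);  // many leaves, rare collisions
const largeNodes = conflictChance(10_000, 64); // fewer leaves, more collisions
```

The flip side, not modeled here, is that a query over a range must read more leaves when nodes are small.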
See Lazy Aggregation for detailed guidance.
Read Optimization Strategies
1. Use Batch Operations
For multiple similar queries, use batch operations to reduce overhead:
// ❌ Inefficient: Multiple separate calls
const counts = await Promise.all([
  aggregate.count(ctx, { bounds: bounds1 }),
  aggregate.count(ctx, { bounds: bounds2 }),
  aggregate.count(ctx, { bounds: bounds3 }),
]);
// ✅ Efficient: Single batch call
const counts = await aggregate.countBatch(ctx, [
  { bounds: bounds1 },
  { bounds: bounds2 },
  { bounds: bounds3 },
]);
Available batch operations:
- countBatch() - Count items for multiple bounds
- sumBatch() - Sum items for multiple bounds
- atBatch() - Get items at multiple offsets
Benefits:
- Reduced function call overhead
- Optimized database access
- Improved transaction efficiency
- Lower latency for multiple queries
Batch operations can be significantly faster than individual calls, especially when querying multiple ranges or offsets.
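The batch shape can be mocked in plain TypeScript — a toy stand-in for the API, not the component itself — to show why one call beats many: every request is answered from a single pass over the data:

```typescript
// Mock of batch semantics: one invocation resolves many bound sets at once
// instead of paying per-call overhead for each.
type MockBounds = { lower: number; upper: number };
const data = [10, 20, 30, 40, 50];

function mockCountBatch(requests: MockBounds[]): number[] {
  // One traversal of the data serves every request
  return requests.map(({ lower, upper }) =>
    data.filter((x) => x >= lower && x <= upper).length
  );
}

const counts = mockCountBatch([
  { lower: 0, upper: 25 },  // matches 10, 20
  { lower: 25, upper: 45 }, // matches 30, 40
  { lower: 0, upper: 100 }, // matches all five
]);
```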
2. Minimize Aggregate Read Scope
Only read what you need:
// ❌ Bad: Depends on the entire aggregate
const totalCount = await aggregate.count(ctx);
// ✅ Better: Only depends on the relevant subset
const recentCount = await aggregate.count(ctx, {
  bounds: {
    lower: { key: Date.now() - 86400000 }, // Last 24 hours
  },
});
3. Cache Expensive Aggregations
For aggregations that don’t need to be real-time, consider caching:
import { internalMutation, query } from "./_generated/server";
// Cache the count in a document
export const updateCachedCount = internalMutation({
  handler: async (ctx) => {
    const count = await aggregate.count(ctx);
    await ctx.db.patch(cachedStatsId, { totalCount: count });
  },
});
// Read from cache instead of aggregating
export const getTotalCount = query({
  handler: async (ctx) => {
    const stats = await ctx.db.get(cachedStatsId);
    return stats?.totalCount ?? 0;
  },
});
Schedule periodic updates:
import { cronJobs } from "convex/server";
const crons = cronJobs();
crons.interval(
  "update cached counts",
  { minutes: 5 },
  internal.stats.updateCachedCount
);
export default crons;
Caching is most useful for global aggregations that would otherwise depend on the entire tree.
Combined Strategy Example
Here’s a comprehensive example combining multiple optimization strategies:
import { v } from "convex/values";
import { internalMutation, query } from "./_generated/server";
import { components } from "./_generated/api";
import { DataModel, Id } from "./_generated/dataModel";
import { TableAggregate } from "@convex-dev/aggregate";
// Optimized leaderboard design: score-first key for ranking queries
const leaderboard = new TableAggregate<{
  Namespace: Id<"games">; // Isolate by game
  Key: [number, string]; // [score, userId] for stable sorting
  DataModel: DataModel;
  TableName: "scores";
}>(components.leaderboard, {
  namespace: (doc) => doc.gameId,
  sortKey: (doc) => [doc.score, doc.userId],
  sumValue: (doc) => doc.score,
});
// A second, user-first aggregate for per-user queries. Prefix bounds only
// match leading key components, so the [score, userId] key above cannot be
// narrowed to a single user on its own.
const scoresByUser = new TableAggregate<{
  Namespace: Id<"games">;
  Key: [string, number]; // [userId, score]
  DataModel: DataModel;
  TableName: "scores";
}>(components.scoresByUser, {
  namespace: (doc) => doc.gameId,
  sortKey: (doc) => [doc.userId, doc.score],
  sumValue: (doc) => doc.score,
});
// Initialize with optimized settings
export const initLeaderboard = internalMutation({
  handler: async (ctx) => {
    await leaderboard.clear(ctx, {
      maxNodeSize: 8, // Smaller nodes reduce write contention
      rootLazy: true, // Enable lazy root
    });
  },
});
// Efficient batch query for multiple users' ranks
export const getUserRankings = query({
  args: { gameId: v.id("games"), userIds: v.array(v.string()) },
  handler: async (ctx, { gameId, userIds }) => {
    // Look up each user's score first (assumes a "by_game_user" index on
    // "scores"; missing-score handling is omitted for brevity)
    const docs = await Promise.all(
      userIds.map((userId) =>
        ctx.db
          .query("scores")
          .withIndex("by_game_user", (q) =>
            q.eq("gameId", gameId).eq("userId", userId)
          )
          .unique()
      )
    );
    // One batch call counts, for each user, the entries ranked above them
    const above = await leaderboard.countBatch(
      ctx,
      docs.map((doc) => ({
        namespace: gameId,
        bounds: {
          lower: { key: [doc!.score, doc!.userId], inclusive: false },
        },
      }))
    );
    return userIds.map((userId, i) => ({
      userId,
      rank: above[i] + 1, // 1-based: entries above this user, plus one
    }));
  },
});
// User-specific query with minimal scope
export const getUserStats = query({
  args: { gameId: v.id("games"), userId: v.string() },
  handler: async (ctx, { gameId, userId }) => {
    // Only depends on this user's data in this game
    const count = await scoresByUser.count(ctx, {
      namespace: gameId,
      bounds: { prefix: [userId] },
    });
    const sum = await scoresByUser.sum(ctx, {
      namespace: gameId,
      bounds: { prefix: [userId] },
    });
    return {
      gamesPlayed: count,
      totalScore: sum,
      averageScore: count > 0 ? sum / count : 0,
    };
  },
});
Monitoring Performance
Watch these metrics in the Convex dashboard:
- OCC Error Rate: High rate indicates write contention
- Function Duration: Slow aggregation queries may need optimization
- Function Call Volume: High query rerun rate suggests scope issues
- Mutation Throughput: Low throughput suggests contention problems
If you see frequent OCC conflicts or slow mutation throughput, revisit your key design and aggregation settings.
See Also