
Overview

The Aggregation Explorer lets you visualize and compare how different aggregation methods produce community forecasts. It is essential for understanding how Metaculus combines individual forecasts into collective predictions.
Access the Aggregation Explorer at /aggregation-explorer or by searching for any question.

What is Aggregation?

Aggregation is the process of combining multiple individual forecasts into a single community prediction. Different aggregation methods can produce significantly different results, especially when:
  • The forecaster pool changes over time
  • Some forecasters are more accurate than others
  • New information becomes available
  • The question approaches its close date

Aggregation Methods

The Aggregation Explorer supports multiple aggregation methods, each with different properties:

Recency Weighted

Default method for most questions. Combines reputation weighting with recency weighting to give more influence to recent forecasts from skilled forecasters.
Algorithm:
  1. Collect Latest Forecasts: Get the most recent forecast from each forecaster
  2. Calculate Reputation Weight: Based on historical forecasting accuracy
  3. Calculate Recency Weight: More recent forecasts get higher weight
  4. Combine Weights: Multiply reputation × recency for final weight
  5. Aggregate: Take weighted average in log-odds space
  6. Transform: Convert back to probabilities and normalize
Formula: For each forecaster i:
w_i = w_reputation(i) × w_recency(i)
Then the aggregate forecast:
p_agg = normalize(logit^(-1)((Σ w_i · logit(p_i)) / (Σ w_i)))
Properties:
  • Responds quickly to new information
  • Rewards track record accuracy
  • Most commonly used in practice
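The six steps above can be sketched in Python. This is a minimal illustration, not the production implementation in `utils/the_math/aggregations.py`; the weight values and forecast numbers are placeholders.

```python
import math

def logit(p: float) -> float:
    """Map a probability in (0, 1) to log-odds."""
    return math.log(p / (1 - p))

def inv_logit(x: float) -> float:
    """Map log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def recency_weighted_aggregate(forecasts, reputations, recencies):
    """Weighted average in log-odds space.

    forecasts   -- latest probability from each forecaster (step 1)
    reputations -- reputation weight per forecaster (step 2, placeholder values)
    recencies   -- recency weight per forecaster (step 3, placeholder values)
    """
    # Step 4: combine weights by multiplying reputation × recency.
    weights = [rep * rec for rep, rec in zip(reputations, recencies)]
    total = sum(weights)
    # Step 5: weighted average in log-odds space.
    mean_log_odds = sum(w * logit(p) for w, p in zip(weights, forecasts)) / total
    # Step 6: transform back to a probability.
    return inv_logit(mean_log_odds)

# Three forecasters; the most recent, most reputable one pulls the aggregate toward 0.80.
p = recency_weighted_aggregate(
    forecasts=[0.60, 0.70, 0.80],
    reputations=[1.0, 1.2, 1.5],
    recencies=[0.5, 0.8, 1.0],
)
```

For a binary question the normalization step is a no-op, since a single probability and its complement already sum to 1; for multiple-choice questions the per-option results are normalized to sum to 1.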

Unweighted

Gives every forecaster equal weight: an unweighted average taken in log-odds space.
Algorithm:
  1. Get the most recent forecast from each forecaster
  2. Transform each forecast to log-odds space
  3. Take arithmetic mean
  4. Transform back to probabilities
  5. Normalize to sum to 1
When to use:
  • Democratic consensus needed
  • Small expert groups
  • Avoiding reputation bias
  • Educational contexts
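The unweighted method is the same pipeline with every weight set to 1. A minimal sketch, with placeholder forecast values:

```python
import math

def unweighted_aggregate(forecasts):
    """Arithmetic mean in log-odds space, equal weight per forecaster."""
    logits = [math.log(p / (1 - p)) for p in forecasts]
    mean = sum(logits) / len(logits)
    return 1 / (1 + math.exp(-mean))  # back to a probability

p = unweighted_aggregate([0.60, 0.70, 0.80])
```

Averaging in log-odds space rather than probability space means the result is the geometric mean of the odds, which behaves better near 0 and 1 than a plain average of probabilities.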

Metaculus Pros

Aggregates forecasts only from forecasters who have earned tournament medals (gold, silver, or bronze).
Eligibility: Forecasters must have:
  • At least one tournament medal (gold, silver, or bronze)
  • Demonstrated consistent forecasting skill
  • Active participation history
Aggregation:
  • Uses recency-weighted method
  • Only includes medal-holders
  • Updates as forecasters earn medals
Use Cases:
  • High-stakes questions
  • When you want expert consensus
  • Filtering out casual forecasters

Medalists (Tiered)

Filter forecasts by medal tier:
  • All Medals: Bronze, silver, and gold medalists
  • Silver and Gold: Excludes bronze medalists
  • Gold Only: Only gold medalists

Cohort: Joined Before Date

Aggregates forecasts only from users who joined Metaculus before a specified date.
Use Cases:
  • Measuring prediction skill of early adopters
  • Comparing experienced vs new forecasters
  • Historical analysis
  • Tournament restrictions
Method:
  • Uses recency-weighted aggregation
  • Filters by user join date
  • Date threshold is configurable
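Conceptually, the cohort method is just a filter applied before the recency-weighted aggregation. A sketch with a hypothetical record shape (user ID, join date, latest probability):

```python
from datetime import date

# Hypothetical forecast records: (user_id, join_date, latest_probability)
forecasts = [
    (1, date(2019, 5, 1), 0.55),
    (2, date(2021, 3, 10), 0.70),
    (3, date(2023, 8, 2), 0.90),
]

def cohort_filter(records, joined_before):
    """Keep only forecasters who joined before the threshold date."""
    return [r for r in records if r[1] < joined_before]

# Only users 1 and 2 joined before 2022; user 3 is excluded from the cohort.
early = cohort_filter(forecasts, joined_before=date(2022, 1, 1))
```

The surviving records would then feed into the recency-weighted aggregation described earlier.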

Single Aggregation (Staff Only)

Creates a single aggregate forecast at a specific point in time, useful for research and analysis.

Using the Aggregation Explorer

Step 1: Search for a Question

  1. Navigate to /aggregation-explorer
  2. Enter a question ID or URL
  3. Click Explore

Step 2: Select Aggregation Methods

Use the aggregation method selector to add multiple methods to compare:
  • Click Add Aggregation Method
  • Select from available methods
  • Configure options (dates, medal tiers, etc.)
  • Each method appears as a separate line on the chart

Step 3: Configure Options

Bot Toggle: Some methods support including/excluding bot forecasts:
  • Toggle Include Bots to see bot impact
  • Useful for comparing human vs bot+human aggregates
  • Only available when question allows bots in aggregates
Date Selection: For cohort methods:
  • Use date picker to set join date threshold
  • View how different cohorts predicted
  • Compare early vs late forecaster groups
Medal Tier: For medalist aggregations:
  • Select All Medals, Silver & Gold, or Gold Only
  • See how filtering by skill level affects predictions
User IDs (Advanced): Some methods support filtering to specific user IDs:
  • Enter comma-separated user IDs
  • Create custom forecaster groups
  • Useful for team analysis

Step 4: Analyze Results

The chart shows:
  • Timeline: X-axis shows question lifetime
  • Forecast Values: Y-axis shows prediction values
  • Confidence Intervals: Shaded areas (when available)
  • Forecaster Counts: Hover to see how many forecasters contributed
  • Multiple Aggregations: Compare up to 10 methods simultaneously

Understanding the Visualization

For Binary Questions

  • Y-axis: Probability of “Yes” (0-100%)
  • Single line per aggregation method
  • Dotted line at 50% for reference
  • Resolution marker (if resolved)

For Multiple Choice Questions

  • Select sub-question from dropdown
  • One chart per selected option
  • Compare option probabilities over time

For Continuous/Date Questions

  • Shows median prediction
  • Confidence intervals (25th-75th percentile)
  • Can view specific percentiles
  • Scaling applied automatically

Technical Details

Endpoints: The Aggregation Explorer uses:
  • GET /api/questions/{id}/ - Question data
  • Query params for aggregation methods
  • aggregation_methods - Comma-separated list
  • include_bots - Boolean flag
  • user_ids - Filter to specific users
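The query parameters listed above can be assembled like so. This is a sketch: the method names passed in `aggregation_methods` are illustrative, not a confirmed list of accepted values.

```python
from urllib.parse import urlencode

def aggregation_url(question_id, methods, include_bots=False, user_ids=None):
    """Build a query string for the question endpoint described above."""
    params = {
        "aggregation_methods": ",".join(methods),   # comma-separated list
        "include_bots": str(include_bots).lower(),  # boolean flag
    }
    if user_ids:
        params["user_ids"] = ",".join(str(u) for u in user_ids)  # specific users
    return f"/api/questions/{question_id}/?{urlencode(params)}"

url = aggregation_url(1234, ["recency_weighted", "unweighted"], include_bots=True)
```

Note that `urlencode` percent-encodes the commas, which the server decodes back into a list.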
Implementation: Located in:
  • Frontend: front_end/src/app/(main)/aggregation-explorer/
  • Backend: utils/the_math/aggregations.py
  • API: utils/views.py
Key Functions:
  • get_aggregation_history() - Generates time series
  • compute_discrete_forecast_values() - Calculates aggregates
  • get_histogram() - For continuous distributions

Common Use Cases

Comparing Bot Performance

For a question with include_bots_in_aggregates enabled:
  1. Add Recency Weighted with bots OFF
  2. Add Recency Weighted with bots ON
  3. Compare to see bot impact on aggregate

Analyzing Expert Consensus

  1. Add Unweighted (all forecasters)
  2. Add Metaculus Pros (medal holders)
  3. Add Gold Only (top performers)
  4. See if experts differ from crowd

Historical Cohort Analysis

  1. Add Cohort: Joined Before with early date (e.g., 2020)
  2. Add Cohort: Joined Before with later date (e.g., 2023)
  3. Add Recency Weighted (all users)
  4. Compare prediction evolution across cohorts

Best Practices

Getting Insights:
  • Start with Recency Weighted as baseline
  • Add 2-3 comparison methods maximum
  • Use contrasting colors for clarity
  • Focus on key decision points (updates after news)
  • Check forecaster counts to ensure statistical significance
Avoiding Misinterpretation:
  • Low forecaster counts create noisy aggregates
  • Early forecasts may be speculative
  • Resolution time != question close time
  • Bot forecasts may not reflect current knowledge
  • Medal tiers change over time

Exporting Data

To download aggregation data:
  1. Visit the question page
  2. Click Download Data
  3. Select aggregation methods to export
  4. Choose CSV or JSON format
  5. Data includes timestamps, values, and forecaster counts
