Forecasting is the core activity on Metaculus. Forecasters make predictions about future events by submitting probability distributions that represent their beliefs about different outcomes.
```python
class Forecast(models.Model):
    # Time range when this forecast is active
    start_time = models.DateTimeField(db_index=True)
    end_time = models.DateTimeField(null=True, db_index=True, blank=True)

    # Prediction data (one of these will be set based on question type)
    probability_yes: float  # Binary questions
    probability_yes_per_category: list[float | None]  # Multiple choice
    continuous_cdf: list[float]  # Numeric/Date/Discrete

    # Metadata
    author = models.ForeignKey(User, models.CASCADE)
    question = models.ForeignKey(Question, models.CASCADE)
    post = models.ForeignKey("posts.Post", models.CASCADE)
    source = models.CharField(max_length=30, choices=SourceChoices.choices)
    distribution_input = models.JSONField(null=True, blank=True)
```
From questions/types.py:18-22, Metaculus supports four aggregation methods:
Recency Weighted
Unweighted
Single Aggregation
Metaculus Prediction
Recency Weighted: Recent forecasts are weighted more heavily than older forecasts. This is the default method for most questions.
Best for: Standard questions with extended forecasting periods.
Algorithm: Uses time-decaying weights to give recent predictions more influence.

Unweighted: All forecasts are weighted equally regardless of when they were made.
Best for: Very short-term questions or live forecasting events.
Algorithm: Simple geometric mean of all active forecasts.

Single Aggregation: Aggregation calculated at a single point in time.
Best for: Snapshot aggregations or historical analysis.
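To make the difference between recency-weighted and unweighted aggregation concrete, here is a minimal sketch. It is not Metaculus's actual weighting scheme: the exponential decay, the `half_life` parameter, and the function names `decay_weight` and `weighted_geometric_mean` are all assumptions chosen for illustration, and the forecasts are binary probabilities.

```python
import math


def decay_weight(age_seconds: float, half_life: float) -> float:
    """Hypothetical weight that halves every `half_life` seconds of forecast age."""
    return 0.5 ** (age_seconds / half_life)


def weighted_geometric_mean(values: list[float], weights: list[float]) -> float:
    """Weighted geometric mean: exp(sum(w * ln(v)) / sum(w))."""
    total = sum(weights)
    return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)) / total)


DAY = 86400  # seconds

# (probability_yes, age of the forecast in seconds), oldest first
forecasts = [(0.20, 7 * DAY), (0.30, 3 * DAY), (0.60, 3600)]
probabilities = [p for p, _ in forecasts]

# Unweighted: every active forecast counts equally.
unweighted = weighted_geometric_mean(probabilities, [1.0] * len(forecasts))

# Recency weighted (assumed one-day half-life): newer forecasts count for more.
weights = [decay_weight(age, half_life=DAY) for _, age in forecasts]
recency_weighted = weighted_geometric_mean(probabilities, weights)

print(f"unweighted={unweighted:.3f} recency_weighted={recency_weighted:.3f}")
```

With these sample values the recency-weighted result sits closer to the newest forecast (0.60) than the unweighted result does, which is the qualitative behavior the method description refers to.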
The core aggregation algorithm takes the geometric mean of all forecasts active at each timestep (from scoring/score_math.py:28-53):
```python
def get_geometric_means(
    forecasts: Sequence[Forecast | AggregateForecast],
) -> list[AggregationEntry]:
    geometric_means = []
    timesteps: set[float] = set()
    # Collect all forecast start and end times
    for forecast in forecasts:
        timesteps.add(forecast.start_time.timestamp())
        if forecast.end_time:
            timesteps.add(forecast.end_time.timestamp())
    # Calculate geometric mean at each timestep
    for timestep in sorted(timesteps):
        prediction_values = [
            f.get_pmf()
            for f in forecasts
            if f.start_time.timestamp() <= timestep
            and (f.end_time is None or f.end_time.timestamp() > timestep)
        ]
        if not prediction_values:
            continue
        geometric_mean = gmean(prediction_values, axis=0)
        predictors = len(prediction_values)
        geometric_means.append(
            AggregationEntry(geometric_mean, predictors, timestep)
        )
    return geometric_means
```
Geometric mean is used instead of arithmetic mean because it better handles extreme probabilities and prevents a single forecaster from dominating the aggregate.
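As a small worked illustration of the two means, the sketch below computes both for a set of binary forecasts using only the standard library. Representing each binary forecast as a `[1 - p, p]` PMF is an assumption made for this example, not a description of what `get_pmf()` returns, and the helper names are hypothetical. Like an element-wise `gmean(..., axis=0)`, the geometric result is left unnormalized.

```python
import math


def geometric_mean_pmf(pmfs: list[list[float]]) -> list[float]:
    """Element-wise geometric mean of equal-length PMFs (not renormalized)."""
    n = len(pmfs)
    return [
        math.exp(sum(math.log(pmf[i]) for pmf in pmfs) / n)
        for i in range(len(pmfs[0]))
    ]


def arithmetic_mean_pmf(pmfs: list[list[float]]) -> list[float]:
    """Element-wise arithmetic mean of equal-length PMFs."""
    n = len(pmfs)
    return [sum(pmf[i] for pmf in pmfs) / n for i in range(len(pmfs[0]))]


# Three forecasters near 50% and one very confident "yes".
probabilities_yes = [0.5, 0.5, 0.5, 0.99]
pmfs = [[1 - p, p] for p in probabilities_yes]  # assumed [no, yes] layout

geometric = geometric_mean_pmf(pmfs)
arithmetic = arithmetic_mean_pmf(pmfs)

print("geometric :", [round(v, 3) for v in geometric])
print("arithmetic:", [round(v, 3) for v in arithmetic])
```

Note that the entries of an element-wise geometric mean sum to at most 1 rather than exactly 1, so a consumer that needs a proper probability distribution would have to renormalize it. Whether and where that happens in the real pipeline is not shown here.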
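A database check constraint ensures that a forecast's end_time, when set, is later than its start_time:

```python
constraints = [
    # end_time must be after start_time
    models.CheckConstraint(
        check=Q(end_time__isnull=True) | Q(end_time__gt=F("start_time")),
        name="end_time_after_start_time",
    ),
]
```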
Only forecasts whose active time range overlaps the question’s open period are counted (from questions/models.py:539-562):
```python
def filter_within_question_period(self):
    return self.filter(
        # Has no end time or an end time after question open time
        (Q(end_time__isnull=True) | Q(end_time__gt=F("question__open_time")))
        # AND has a start time earlier than the question's close time
        & (
            (
                Q(question__actual_close_time__isnull=False)
                & Q(start_time__lt=F("question__actual_close_time"))
            )
            | (
                Q(question__actual_close_time__isnull=True)
                & Q(start_time__lt=F("question__scheduled_close_time"))
            )
        ),
    )
```
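The same overlap rule can be written without the ORM, which may help when reasoning about edge cases. This is a sketch only: `ForecastSpan`, `QuestionPeriod`, and `is_within_question_period` are hypothetical names introduced for illustration, and the sketch assumes `open_time` and `scheduled_close_time` are always set.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ForecastSpan:
    """Hypothetical stand-in for the time range of a forecast."""
    start_time: datetime
    end_time: Optional[datetime] = None  # None means still active


@dataclass
class QuestionPeriod:
    """Hypothetical stand-in for the relevant question timestamps."""
    open_time: datetime
    scheduled_close_time: datetime
    actual_close_time: Optional[datetime] = None


def is_within_question_period(f: ForecastSpan, q: QuestionPeriod) -> bool:
    # No end time, or an end time after the question opened.
    ends_after_open = f.end_time is None or f.end_time > q.open_time
    # Use the actual close time when present, else the scheduled one.
    close_time = (
        q.actual_close_time
        if q.actual_close_time is not None
        else q.scheduled_close_time
    )
    starts_before_close = f.start_time < close_time
    return ends_after_open and starts_before_close


question = QuestionPeriod(
    open_time=datetime(2024, 1, 1),
    scheduled_close_time=datetime(2024, 3, 1),
    actual_close_time=datetime(2024, 2, 1),  # closed early
)
withdrawn_before_open = ForecastSpan(datetime(2023, 12, 1), datetime(2023, 12, 15))
active = ForecastSpan(datetime(2024, 1, 10))
after_early_close = ForecastSpan(datetime(2024, 2, 10))

for span in (withdrawn_before_open, active, after_early_close):
    print(span.start_time.date(), is_within_question_period(span, question))
```

The third sample forecast starts before the scheduled close but after the actual close, so it is excluded. This mirrors the way the query prefers `actual_close_time` whenever it is not null.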