nightlyIrt: 2PL IRT Item Parameter Calibration Worker

nightlyIrt is the nightly Item Response Theory calibration job. It runs 15 minutes after nightlyBkt, once per day, and fits a 2-Parameter Logistic (2PL) IRT model for every exercise that has accumulated at least 50 attempts. For each qualifying item it fetches the attempt outcomes together with the student’s current BKT p_known estimate (used as a proxy for latent ability θ), runs maximum-likelihood estimation via L-BFGS-B, and writes back the discrimination a and difficulty b parameters directly to the items table. The backend then uses these parameters to compute Fisher information I(θ) = a²·P(θ)·(1−P(θ)) for adaptive item selection. Like nightlyBkt, this function is purely deterministic — no LLM, no S3, no SQS.

Trigger and configuration

Property	Value
Trigger	EventBridge scheduled rule
Schedule	`cron(15 7 * * ? *)` (every day at 07:15 UTC)
Timeout	900 s
Memory	1024 MB
Handler	`src.pipeline.nightly_irt.handler`
Event payload	(none read — scheduled invocation)

The 15-minute offset after nightlyBkt (cron(0 7 * * ? *)) is intentional. The IRT fit uses student_skill_mastery.p_known as the θ estimate; freshly recalibrated BKT parameters make those estimates more accurate, improving the quality of the 2PL fit.

Execution flow

Fetch items with sufficient attempt count

The handler queries all items that have accumulated at least 50 attempts, tracked by the attempt_count denormalised column:

SELECT id FROM items WHERE attempt_count >= 50

Items below this threshold are never fetched, avoiding unnecessary per-item queries.

Fetch attempts with student theta estimates

For each qualifying item, the worker fetches attempts from the last 90 days, joined to student_skill_mastery to retrieve the student’s current p_known value (the BKT posterior, used as θ):

SELECT m.p_known AS theta, a.is_correct
FROM attempts a
JOIN student_skill_mastery m
  ON m.student_id = a.student_id
 AND m.skill_id = (SELECT skill_id FROM items WHERE id = a.item_id)
WHERE a.item_id = $1
  AND a.created_at > NOW() - INTERVAL '90 days'

The p_known value lies in [0, 1], so it is rescaled to the IRT θ range [-1, 1] via theta = 2 * p_known - 1 before being passed to the fitting routine.
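The rescaling is a single affine map. A minimal sketch, assuming the rows arrive as (p_known, is_correct) pairs; the helper names rescale_p_known and to_observations are hypothetical and not taken from the worker's source:

```python
def rescale_p_known(p_known: float) -> float:
    """Map a BKT mastery probability in [0, 1] onto theta in [-1, 1]."""
    return 2.0 * p_known - 1.0


def to_observations(rows):
    """Convert (p_known, is_correct) rows into (theta, is_correct) pairs."""
    return [(rescale_p_known(p), bool(correct)) for p, correct in rows]


rows = [(0.0, False), (0.5, True), (1.0, True)]
observations = to_observations(rows)
# p_known = 0.5 maps to theta = 0.0, the midpoint of the theta range
```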

Fit 2PL model via L-BFGS-B MLE

fit_2pl(item_id, attempts) minimises the negative log-likelihood of the logistic response function:

P(correct | θ) = 1 / (1 + exp(−a · (θ − b)))

using scipy.optimize.minimize with the L-BFGS-B method and bounds:

a (discrimination): [0.5, 3.0]
b (difficulty): [-3.0, 3.0]

The optimiser starts from the neutral point x0 = [1.0, 0.0].

params = fit_2pl(item_id=str(item_id), attempts=observations)

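If fewer than MIN_ATTEMPTS (50) observations reach the fit, fit_2pl returns calibrated=False with the safe defaults a=1.0, b=0.0 instead of raising. This can happen even when attempt_count >= 50, because only attempts from the last 90 days with a matching student_skill_mastery row are fetched.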
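The fitting step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the worker's actual fit_2pl: the names fit_2pl_sketch and neg_log_likelihood, the tuple return type, and the use of np.logaddexp for numerical stability are all assumptions. The bounds, starting point, and MIN_ATTEMPTS fallback follow the text above.

```python
import numpy as np
from scipy.optimize import minimize

MIN_ATTEMPTS = 50                     # threshold stated in the text
BOUNDS = [(0.5, 3.0), (-3.0, 3.0)]    # (a, b) optimiser bounds from the text


def neg_log_likelihood(params, theta, y):
    """Negative log-likelihood of the 2PL model for a single item."""
    a, b = params
    z = a * (theta - b)
    # log P = -log(1 + e^-z) and log(1 - P) = -log(1 + e^z);
    # logaddexp evaluates both without overflow for large |z|.
    log_p = -np.logaddexp(0.0, -z)
    log_q = -np.logaddexp(0.0, z)
    return -np.sum(y * log_p + (1.0 - y) * log_q)


def fit_2pl_sketch(observations):
    """Return (a, b, calibrated) for a list of (theta, is_correct) pairs."""
    if len(observations) < MIN_ATTEMPTS:
        return 1.0, 0.0, False        # safe defaults, no exception raised
    theta = np.array([t for t, _ in observations], dtype=float)
    y = np.array([float(c) for _, c in observations])
    result = minimize(
        neg_log_likelihood,
        x0=np.array([1.0, 0.0]),      # neutral starting point
        args=(theta, y),
        method="L-BFGS-B",
        bounds=BOUNDS,
    )
    a, b = result.x
    return float(a), float(b), True
```

Because L-BFGS-B is a bounded method, the returned a and b always fall inside BOUNDS, even when the data are perfectly separable and the unconstrained estimate of a would diverge.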

Write IRT parameters back to items table

The fitted parameters are written directly to the items table:

UPDATE items
SET irt_a=$1, irt_b=$2, irt_calibrated_at=NOW()
WHERE id=$3

A structured log event irt_calibrated is emitted per item with item_id, a, and b.

Return payload

{
  "calibrated_items": 312
}

The single integer calibrated_items is the count of items for which the UPDATE was executed (i.e., items that had attempt_count >= 50 at query time).

IrtItemParams schema

Field	Type	Required	Description
item_id	str	yes	UUID of the exercise (items.id). Used only internally; not written to the database column.
a	float	yes	Discrimination parameter. Higher values mean the item is better at distinguishing students near the difficulty level. Bounded [0.1, 3.0]; defaults to 1.0 for uncalibrated items.
b	float	yes	Difficulty parameter. Represents the θ value at which a student has a 50% chance of answering correctly. Bounded [-3.0, 3.0]; defaults to 0.0 for uncalibrated items.
calibrated	bool	no	true when the parameters were fit from real data via MLE; false when the item had fewer than 50 attempts and the safe defaults a=1.0, b=0.0 were used instead. Defaults to false.

Items with calibrated=False still participate in the adaptive selection engine. The defaults a=1.0, b=0.0 represent a neutral, moderately discriminating item at average difficulty, which is a reasonable prior until more data is collected. The backend reads irt_a, irt_b, and irt_calibrated_at directly from the items table.
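To make the role of the defaults concrete, here is a minimal sketch of the Fisher information formula I(θ) = a²·P(θ)·(1−P(θ)) applied to a few items. The function names and the candidate list are hypothetical illustrations; how the backend actually ranks items is not described here.

```python
import math


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta: float, a: float, b: float) -> float:
    """I(theta) = a^2 * P(theta) * (1 - P(theta))."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


# An uncalibrated item (a=1.0, b=0.0) is most informative at theta = b,
# where P = 0.5 and I = a^2 / 4.
default_info = fisher_information(0.0, a=1.0, b=0.0)

# Hypothetical candidates as (label, a, b); pick the one with the highest
# information for a student at theta = 0.4.
candidates = [("default", 1.0, 0.0), ("sharp", 2.0, 0.5), ("hard", 1.5, 2.0)]
best = max(candidates, key=lambda c: fisher_information(0.4, c[1], c[2]))
```

In this example the "sharp" item wins because its difficulty is close to the student's θ and its discrimination is high, while the default item still contributes a moderate amount of information.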

Local invocation

uv run python -c "from src.pipeline.nightly_irt import handler; handler({}, None)"

Ensure DATABASE_URL is set in your .env file. The function requires items, attempts, and student_skill_mastery tables to be populated.
