Overview

The Content Judge is a safety layer that screens user queries before scene generation. It uses an LLM to classify queries as safe, sensitive, or harmful, helping maintain educational standards while allowing historically significant but mature content.
The Judge is always enabled, but it needs an OPENROUTER_API_KEY to function. If the Judge is unavailable, queries are blocked.

ContentJudge Class

class ContentJudge:
    def __init__(self, api_key: str, model: str = "google/gemini-2.0-flash-001"):
        self.api_key = api_key
        self.model = model

Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| api_key | Required | OpenRouter API key |
| model | google/gemini-2.0-flash-001 | LLM model for screening |

Environment Variables

OPENROUTER_API_KEY=sk-or-v1-...
OPENROUTER_MODEL=google/gemini-2.0-flash-001  # Optional
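
The two variables above can be turned into a configured Judge along these lines. This is a minimal sketch: `judge_from_env` is a hypothetical helper, not part of the documented API, and the `ContentJudge` class here is a stand-in that only repeats the constructor shown earlier so the example runs on its own.

```python
import os

DEFAULT_MODEL = "google/gemini-2.0-flash-001"


class ContentJudge:
    """Stand-in with the documented constructor, so the sketch is self-contained."""

    def __init__(self, api_key: str, model: str = DEFAULT_MODEL):
        self.api_key = api_key
        self.model = model


def judge_from_env(env=os.environ) -> ContentJudge:
    """Hypothetical helper: build a ContentJudge from environment variables."""
    api_key = env.get("OPENROUTER_API_KEY", "")
    if not api_key:
        # Without a key the Judge cannot call OpenRouter at all.
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    # OPENROUTER_MODEL is optional; fall back to the documented default.
    model = env.get("OPENROUTER_MODEL") or DEFAULT_MODEL
    return ContentJudge(api_key=api_key, model=model)


# A plain dict stands in for the real environment here.
judge = judge_from_env({"OPENROUTER_API_KEY": "sk-or-v1-example"})
print(judge.model)  # google/gemini-2.0-flash-001
```

Failing loudly on a missing key at startup is one possible design choice; it surfaces a misconfiguration before any query is blocked at request time.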

Screening Queries

The primary method is screen(), which returns a verdict:
async def screen(self, query: str) -> str:
    prompt = JUDGE_PROMPT.format(query=query)
    
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            OPENROUTER_URL,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            json={
                "model": self.model,
                "messages": [{"role": "user", "content": prompt}],
            },
            timeout=60.0,
        )
        resp.raise_for_status()
        data = resp.json()
    
    text = data["choices"][0]["message"]["content"].strip()
    # Parse JSON...
    result = json.loads(text)
    verdict = result.get("verdict", "reject")
    
    if verdict in ("approve", "sensitive"):
        return verdict
    return "reject"

Return Values

| Verdict | Meaning | Action |
| --- | --- | --- |
| "approve" | Safe historical topic | Allow generation |
| "sensitive" | Historically significant but mature | Allow with disclaimer |
| "reject" | Harmful, hateful, or not historical | Block generation |

Moderation Prompt

The Judge uses a carefully designed prompt:
JUDGE_PROMPT = """You are a content moderation system for a historical education platform.

Evaluate this query for a historical scene generation:
"{query}"

Classify as ONE of:
- "approve" — innocuous historical topic, safe to generate
- "sensitive" — involves violence, controversy, or mature themes but is historically significant and educational; approve with a disclaimer
- "reject" — harmful, hateful, exploitative, or not a genuine historical query

Return ONLY a JSON object: {{"verdict": "approve"|"sensitive"|"reject", "reason": "brief explanation"}}"""
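
The doubled braces in the last line of the prompt are `str.format` escapes: `{{` and `}}` render as literal `{` and `}`, while `{query}` is the only real placeholder. The sketch below uses an abbreviated copy of the template (an assumption made for brevity, not the full prompt) to show the rendered result.

```python
# Abbreviated copy of the template; only the placeholder and the final line matter here.
PROMPT_TEMPLATE = (
    "Evaluate this query for a historical scene generation:\n"
    '"{query}"\n\n'
    "Return ONLY a JSON object: "
    '{{"verdict": "approve"|"sensitive"|"reject", "reason": "brief explanation"}}'
)

rendered = PROMPT_TEMPLATE.format(query="Moon landing 1969")

# The final line now contains single, literal braces.
print(rendered.splitlines()[-1])

# Braces inside the user's query are not re-interpreted by format().
odd = PROMPT_TEMPLATE.format(query="what is {this}?")
print('"what is {this}?"' in odd)  # True
```

Because `format` parses only the template and never the substituted value, a query that happens to contain braces cannot break the rendering step.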

Design Principles

  1. Educational first: Prioritize historical significance over comfort
  2. Context-aware: Distinguish between educational and exploitative content
  3. Transparent: Always provide a reason for the verdict

Example Classifications

Approve

judge = ContentJudge(api_key="...")

verdict = await judge.screen("Moon landing 1969")
# Returns: "approve"
# Reason: "Innocuous historical event"

Sensitive

verdict = await judge.screen("Battle of Stalingrad 1942")
# Returns: "sensitive"
# Reason: "Significant WWII battle but involves graphic violence"

Reject

verdict = await judge.screen("Make a video game level")
# Returns: "reject"
# Reason: "Not a historical query"

Integration with JobManager

The Judge is typically invoked at the start of scene generation:
judge = ContentJudge(settings.OPENROUTER_API_KEY)

# Before generating a scene
verdict = await judge.screen(user_query)

disclaimer = None  # stays None for "approve"
if verdict == "reject":
    raise ValueError("Query rejected by content moderation")
elif verdict == "sensitive":
    # Add disclaimer to response
    disclaimer = "This historical event involves mature themes."
# Proceed with generation...
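
The branching above can also be packaged as a small pure function, which keeps the gate easy to unit-test without calling the API. This is a sketch only: `disclaimer_for` and `QueryRejected` are hypothetical names, and treating unknown verdicts as rejections is an assumption that mirrors the fail-safe behavior described on this page.

```python
from typing import Optional

SENSITIVE_DISCLAIMER = "This historical event involves mature themes."


class QueryRejected(ValueError):
    """Hypothetical exception type; the snippet above raises a plain ValueError."""


def disclaimer_for(verdict: str) -> Optional[str]:
    """Return the disclaimer to attach, or raise if generation must not run."""
    if verdict == "approve":
        return None
    if verdict == "sensitive":
        return SENSITIVE_DISCLAIMER
    # "reject" and any unexpected value are both blocked.
    raise QueryRejected("Query rejected by content moderation")


print(disclaimer_for("approve"))    # None
print(disclaimer_for("sensitive"))  # This historical event involves mature themes.
```

Subclassing `ValueError` keeps existing `except ValueError` handlers working while letting callers catch moderation rejections specifically.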

Response Parsing

The Judge handles markdown-wrapped JSON:
text = data["choices"][0]["message"]["content"].strip()
if text.startswith("```"):
    text = text.split("\n", 1)[1] if "\n" in text else text[3:]
    if text.endswith("```"):
        text = text[:-3]
    text = text.strip()

result = json.loads(text)
verdict = result.get("verdict", "reject")
If JSON parsing fails or the LLM returns an invalid verdict, the Judge defaults to "reject" for safety.
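
Put together, the fence stripping and the default-to-reject rule might look like the following standalone function. It is a sketch under assumptions: `parse_verdict` and `strip_code_fence` are hypothetical names, and catching the decode error inside the parser (rather than in the caller) is one possible placement.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
VALID_VERDICTS = ("approve", "sensitive")


def strip_code_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence, if present."""
    text = text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, which may carry a tag such as "json".
        text = text.split("\n", 1)[1] if "\n" in text else text[3:]
        if text.endswith(FENCE):
            text = text[:-3]
        text = text.strip()
    return text


def parse_verdict(raw: str) -> str:
    """Return "approve" or "sensitive"; anything else becomes "reject"."""
    try:
        result = json.loads(strip_code_fence(raw))
    except json.JSONDecodeError:
        return "reject"
    if not isinstance(result, dict):
        # Valid JSON, but not the object the prompt asked for.
        return "reject"
    verdict = result.get("verdict", "reject")
    return verdict if verdict in VALID_VERDICTS else "reject"


fenced = FENCE + 'json\n{"verdict": "sensitive", "reason": "wartime violence"}\n' + FENCE
print(parse_verdict(fenced))      # sensitive
print(parse_verdict("not json"))  # reject
```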

Logging

All screening decisions are logged:
logger = logging.getLogger("clockchain.judge")
logger.info(
    "Judge verdict for %r: %s (%s)", 
    query, 
    verdict, 
    result.get("reason", "")
)
Example log output:
2026-03-06 10:45:12 INFO clockchain.judge Judge verdict for 'D-Day invasion 1944': sensitive (Involves wartime violence but historically significant)
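
A log line shaped like the sample above can be produced with the standard `logging` formatter. The format strings below are assumptions chosen to match that sample; the project's actual logging configuration is not shown on this page and may differ.

```python
import io
import logging

# Assumed format, chosen to match the sample line above.
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

stream = io.StringIO()  # stand-in for stderr or a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))

logger = logging.getLogger("clockchain.judge")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info(
    "Judge verdict for %r: %s (%s)",
    "D-Day invasion 1944",
    "sensitive",
    "Involves wartime violence but historically significant",
)
print(stream.getvalue(), end="")
```

The `%r` placeholder logs the query with quotes and escapes, which makes empty or whitespace-only queries visible in the log.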

Error Handling

try:
    verdict = await judge.screen(query)
except httpx.HTTPStatusError as e:
    if e.response.status_code == 429:
        # Rate limited - reject for now; a retry could be queued here
        verdict = "reject"
    else:
        # Other API error - fail safe
        verdict = "reject"
except json.JSONDecodeError:
    # LLM returned invalid JSON
    logger.error("Failed to parse judge response")
    verdict = "reject"
except httpx.TimeoutException:
    # 60-second timeout exceeded
    verdict = "reject"

Fail-Safe Behavior

The Judge always defaults to rejection on errors:
  • API failures
  • Timeout
  • Invalid JSON
  • Unknown verdict values
This ensures the system errs on the side of caution.
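
The four cases listed above can be collapsed into one wrapper that treats any failure as a rejection. This is a minimal sketch, not the project's implementation: `screen_fail_safe` is a hypothetical name, it accepts any async callable in place of `judge.screen`, and catching the broad `Exception` class is an assumption made to keep the example independent of `httpx`.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("clockchain.judge")


async def screen_fail_safe(
    screen: Callable[[str], Awaitable[str]], query: str
) -> str:
    """Run a screening coroutine and map every failure to "reject"."""
    try:
        verdict = await screen(query)
    except Exception as exc:
        # Covers API failures, timeouts, and invalid JSON alike.
        logger.error("Judge failed for %r: %s", query, exc)
        return "reject"
    # Unknown verdict values are also rejected.
    return verdict if verdict in ("approve", "sensitive") else "reject"


async def broken_screen(query: str) -> str:
    """Simulated failure, standing in for a real API call."""
    raise TimeoutError("simulated timeout")


async def odd_screen(query: str) -> str:
    """Simulated model that returns an unexpected verdict."""
    return "maybe"


print(asyncio.run(screen_fail_safe(broken_screen, "Moon landing 1969")))  # reject
print(asyncio.run(screen_fail_safe(odd_screen, "Moon landing 1969")))     # reject
```

`except Exception` does not swallow `asyncio.CancelledError` on Python 3.8 and later, so task cancellation still propagates normally.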

Performance

  • Timeout: 60 seconds per query
  • Model: Gemini 2.0 Flash (fast inference)
  • Cost: ~$0.0001 per query (1000 requests = $0.10)
Screening adds ~1-2 seconds of latency to each generation request. This is acceptable for the safety benefits.
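
The cost figure scales linearly with request volume. As a quick worked check, the sketch below multiplies the approximate per-query price quoted above by a request count; `screening_cost` is a hypothetical helper, and actual pricing depends on the model and on prompt length.

```python
from decimal import Decimal

COST_PER_QUERY_USD = Decimal("0.0001")  # approximate figure quoted above


def screening_cost(requests: int) -> Decimal:
    """Estimated screening cost in USD for a number of requests."""
    return COST_PER_QUERY_USD * requests


print(screening_cost(1000))     # 0.1000
print(screening_cost(250_000))  # 25.0000
```

`Decimal` avoids binary floating-point rounding, which is convenient when the results feed into a budget or billing report.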

Customization

To use a different model, pass it to the constructor:
judge = ContentJudge(
    api_key="sk-or-v1-...",
    model="anthropic/claude-3-haiku"  # Alternative model
)
For stricter moderation, modify JUDGE_PROMPT in app/workers/judge.py to adjust classification criteria.
