Overview
Hallucination detection in PAS2 works by comparing responses to semantically equivalent queries. If an AI model provides inconsistent or contradictory information when answering the same question phrased differently, this indicates potential hallucination.
Detection workflow
The complete detection process follows a multi-stage pipeline:
Response collection
Query the target model with the original query and all paraphrases to collect responses.
Response comparison
Use a judge model (OpenAI o3-mini) to analyze all responses for factual inconsistencies.
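The two stages above can be sketched end to end. This is an illustration, not the actual implementation: `get_response` and `judge` are hypothetical stand-ins for the target-model and judge-model calls.

```python
from typing import Callable, Dict, List

def detection_pipeline(
    query: str,
    paraphrases: List[str],
    get_response: Callable[[str], str],
    judge: Callable[[List[str]], bool],
) -> Dict:
    """Collect responses for the original query and its paraphrases,
    then ask the judge whether they are mutually inconsistent."""
    queries = [query] + paraphrases
    responses = [get_response(q) for q in queries]   # stage 1: response collection
    hallucinated = judge(responses)                  # stage 2: response comparison
    return {
        "queries": queries,
        "responses": responses,
        "hallucination_detected": hallucinated,
    }
```

A judge that flags any disagreement between responses is the simplest possible comparison policy; the real judge model reasons over factual content rather than exact string equality.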
Main detection method
The detect_hallucination() method orchestrates the entire process:
Method signature
Parameters:
- query (str): The original question to test
- n_paraphrases (int): Number of paraphrases to generate (default: 3)
Returns:
- Dict: Complete results including judgment, responses, and analysis
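A sketch consistent with the documented parameters and return type (the body is elided; only the signature is meant to match the description above):

```python
from typing import Dict

class PAS2:
    def detect_hallucination(self, query: str, n_paraphrases: int = 3) -> Dict:
        """Run the full detection pipeline for `query`.

        Sketch only: generate n_paraphrases variations, collect responses,
        and judge them for inconsistencies.
        """
        ...
```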
Return structure
The method returns a comprehensive dictionary (pas2.py:283-293):
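The authoritative structure is defined at pas2.py:283-293; the shape below is a hypothetical illustration built from the documented contents (judgment, responses, and analysis), and the exact key names and values are assumptions.

```python
# Hypothetical example of the returned dictionary's shape.
example_result = {
    "query": "What year did the event happen?",
    "paraphrases": ["In which year did the event occur?"],
    "responses": ["1969", "1968"],
    "judgment": {
        "hallucination_detected": True,
        "confidence_score": 0.9,
    },
    "analysis": "Responses disagree on the year.",
}
```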
Response collection
PAS2 uses parallel processing to efficiently collect responses from multiple queries. The system uses ThreadPoolExecutor with up to 5 concurrent workers to speed up response collection while avoiding API rate limits.
Parallel response gathering
- Responses are collected in the correct order
- Failed requests don’t block other responses
- Progress can be tracked incrementally
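A minimal sketch of this pattern (`collect_responses` and `fetch` are illustrative names, not the actual implementation): `executor.map` yields results in input order, and a per-task `try/except` turns a failed request into an error message instead of aborting the batch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

def collect_responses(queries: List[str], fetch: Callable[[str], str]) -> List[str]:
    """Fetch responses in parallel with up to 5 workers, preserving query order."""
    def safe_fetch(q: str) -> str:
        try:
            return fetch(q)
        except Exception as exc:
            # A failed request yields an error string; other queries continue.
            return f"Error: {exc}"

    with ThreadPoolExecutor(max_workers=5) as executor:
        # map() returns results in the same order as the input queries.
        return list(executor.map(safe_fetch, queries))
```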
Individual response method
Each response is obtained through _get_single_response() (pas2.py:138-172):
The system prompt is intentionally generic to avoid biasing the model’s responses. This allows natural variations and potential hallucinations to emerge.
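A sketch of the single-response call, not the implementation at pas2.py:138-172: the prompt wording is an assumption, and `complete` stands in for the chat-completion client.

```python
from typing import Callable, Dict, List

# Intentionally generic system prompt, so the model's natural behaviour
# (including potential hallucinations) is not suppressed. Wording assumed.
SYSTEM_PROMPT = "You are a helpful assistant. Answer the user's question."

def get_single_response(query: str, complete: Callable[[List[Dict]], str]) -> str:
    """Send the query with a generic system prompt; return the model's reply,
    or an error string if the request fails."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": query},
    ]
    try:
        return complete(messages)
    except Exception as exc:
        return f"Error getting response: {exc}"
```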
Progress tracking
The detection process supports real-time progress callbacks through multiple stages:
Progress stages
- starting: Initial setup (5% progress)
- generating_paraphrases: Creating query variations (15% progress)
- paraphrases_complete: Paraphrases ready (30% progress)
- getting_responses: Collecting model responses (35% progress)
- responses_progress: Incremental updates per response (40-65% progress)
- responses_complete: All responses collected (65% progress)
- judging: Analyzing for hallucinations (70% progress)
- complete: Process finished (100% progress)
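The fixed stage-to-percentage mapping above can be driven through a simple callback helper; names here are illustrative. The responses_progress stage reports incrementally between 40% and 65%, so it is omitted from the fixed table.

```python
from typing import Callable, Optional

# Stage names and nominal percentages from the documentation above.
PROGRESS_STAGES = {
    "starting": 5,
    "generating_paraphrases": 15,
    "paraphrases_complete": 30,
    "getting_responses": 35,
    "responses_complete": 65,
    "judging": 70,
    "complete": 100,
}

def report(stage: str, callback: Optional[Callable[[str, int], None]] = None) -> None:
    """Invoke the progress callback, if one was provided, with the stage's percentage."""
    if callback is not None:
        callback(stage, PROGRESS_STAGES[stage])
```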
Judgment data model
The detection results are structured using a Pydantic model for type safety:
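A sketch of what that model might look like, assuming pydantic is installed. Only hallucination_detected and confidence_score appear in this documentation (their defaults here match the fallback judgment described under error handling); the model name and the reasoning field are hypothetical.

```python
from pydantic import BaseModel

class HallucinationJudgment(BaseModel):
    """Hypothetical sketch of the judgment model."""
    hallucination_detected: bool = False  # defaults match the documented fallback judgment
    confidence_score: float = 0.0
    reasoning: str = ""                   # hypothetical: the judge's explanation
```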
Error handling
The system includes comprehensive error handling at each stage:
Response errors
If a single response fails, it returns an error message but continues processing other queries.
Judgment errors
If the judge model fails, a fallback judgment is returned with hallucination_detected=False and confidence_score=0.0.
Performance metrics
The entire detection process typically completes in 5-15 seconds, depending on:
- Number of paraphrases (more paraphrases = longer processing)
- API response times (network latency and model speed)
- Query complexity (longer responses take more time)
All timing information is logged for monitoring and optimization purposes. Check the logs for detailed performance breakdowns (pas2.py:299).
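One way to produce such timing logs, as a sketch (the logger name and the helper are assumptions, not the code at pas2.py:299):

```python
import logging
import time

logger = logging.getLogger("pas2")  # logger name is an assumption

def timed(label, fn, *args, **kwargs):
    """Run fn, log its wall-clock duration under `label`, and return its result."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    logger.info("%s took %.2fs", label, elapsed)
    return result
```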