AgenticRetriever

Overview

The AgenticRetriever class selects document sections using the results of query decomposition. It combines intent-based targeting, entity matching, and query term matching to score each section, then returns the 3-5 most relevant ones.

Class Definition

class AgenticRetriever:
    INTENT_SECTION_MAP = {
        "penalty": ["Late Payment Penalties", "Payment Terms"],
        "payment_terms": ["Payment Terms", "Late Payment Penalties"],
        "intellectual_property": ["Intellectual Property Rights"],
        "indemnification": ["Indemnification"],
        "termination": ["Termination for Convenience"],
        "confidentiality": ["Confidentiality"],
        "scope_of_services": ["Scope of Services"],
    }
    
    def __init__(self, doc_store: MockDocumentStore) -> None: ...

Parameters

doc_store

MockDocumentStore

required

Document store instance for accessing document sections

Methods

retrieve

Retrieves the most relevant document sections based on query decomposition.

async def retrieve(self, query: str, decomposition: Dict) -> List[Dict]

query

str

required

The original user query

decomposition

Dict

required

Query decomposition from QueryDecomposer.decompose()

intent

str

Detected intent category

entities

List[str]

Extracted entities

constraints

Dict

Extracted constraints

sections

List[Dict]

Return value. List of the most relevant sections (normally 3-5; a single top-scoring section when none reach the relevance threshold), each containing:

title

str

Section title from the document

content

str

Section content text

page_num

int

Page number where the section appears

_relevance_score

float

Computed relevance score (higher is better)
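
The input and output shapes described above can be written down as typed dictionaries. This is only an illustrative sketch: the names Decomposition, RetrievedSection, REQUIRED_SECTION_KEYS, and has_required_keys are hypothetical and are not part of the documented API. The field names and types are taken from the field descriptions above.

```python
from typing import Dict, List, Mapping, TypedDict


# Hypothetical type for the `decomposition` argument of retrieve().
# total=False because the documentation does not say which keys are mandatory.
class Decomposition(TypedDict, total=False):
    intent: str           # detected intent category
    entities: List[str]   # extracted entities
    constraints: Dict     # extracted constraints


# Hypothetical type for each item in the list returned by retrieve().
class RetrievedSection(TypedDict):
    title: str               # section title from the document
    content: str             # section content text
    page_num: int            # page number where the section appears
    _relevance_score: float  # computed relevance score (higher is better)


REQUIRED_SECTION_KEYS = ("title", "content", "page_num", "_relevance_score")


def has_required_keys(section: Mapping) -> bool:
    """Return True if a returned section carries every documented key."""
    return all(key in section for key in REQUIRED_SECTION_KEYS)
```

A check like has_required_keys can be handy in tests that consume the output of retrieve(), since downstream code reads all four keys.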

Example

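import asyncio

from components import AgenticRetriever
from mock_data import MockDocumentStore


async def main():
    retriever = AgenticRetriever(MockDocumentStore())

    query = "What is the late payment penalty?"
    # Normally produced by QueryDecomposer.decompose(); written out by hand here
    decomposition = {
        "intent": "penalty",
        "entities": ["late", "payment", "penalty"],
        "constraints": {},
        "temporals": []
    }

    sections = await retriever.retrieve(query, decomposition)

    for section in sections:
        print(f"{section['title']} (page {section['page_num']})")
        print(f"Relevance: {section['_relevance_score']}")
        print(section['content'][:200])
        print()


# retrieve() is a coroutine, so it has to run inside an event loop
asyncio.run(main())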

Intent-Section Mapping

The retriever uses a predefined mapping to target specific sections based on the detected intent. The first section in each list is the primary target and receives a +7.0 score boost; any later section in the list is a secondary target and receives +5.0.

INTENT_SECTION_MAP = {
    "penalty": ["Late Payment Penalties", "Payment Terms"],
    "payment_terms": ["Payment Terms", "Late Payment Penalties"],
    "intellectual_property": ["Intellectual Property Rights"],
    "indemnification": ["Indemnification"],
    "termination": ["Termination for Convenience"],
    "confidentiality": ["Confidentiality"],
    "scope_of_services": ["Scope of Services"],
}
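
To make the primary/secondary distinction concrete, here is a minimal sketch of how a boost could be looked up from the mapping. The function name intent_boost and the two constants are hypothetical, only two of the map entries are copied for brevity, and case-insensitive exact title comparison is an assumption; the real class may match titles differently.

```python
from typing import Dict, List

# Two entries copied from INTENT_SECTION_MAP to keep the sketch short.
INTENT_SECTION_MAP: Dict[str, List[str]] = {
    "penalty": ["Late Payment Penalties", "Payment Terms"],
    "payment_terms": ["Payment Terms", "Late Payment Penalties"],
}

PRIMARY_BOOST = 7.0    # 5.0 base + 2.0 bonus for the first section in the list
SECONDARY_BOOST = 5.0  # any other section listed for the intent


def intent_boost(intent: str, title: str) -> float:
    """Return the intent-based boost for a section title (0.0 if not targeted)."""
    targets = INTENT_SECTION_MAP.get(intent, [])
    for position, target in enumerate(targets):
        # Assumption: titles are compared case-insensitively and in full.
        if target.lower() == title.lower():
            return PRIMARY_BOOST if position == 0 else SECONDARY_BOOST
    return 0.0
```

Because list order decides which section is primary, "penalty" and "payment_terms" target the same two sections but rank them differently.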

Relevance Scoring Algorithm

_score_section

Computes a multi-factor relevance score for each section.

def _score_section(self, section: Dict, query: str, decomposition: Dict) -> float

Scoring Weights:

Intent-Based Scoring (Highest Priority)
- Primary intent match: +7.0 (5.0 + 2.0 bonus)
- Secondary intent match: +5.0
Entity Matching
- Entity in content: +1.0 per entity
- Entity in title: +1.5 per entity
Query Term Matching
- Term in content (>3 chars): +0.5 per term
Text Search Boost
- Applied from section["_text_boost"] if present
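
The weights above can be combined as in the following sketch. It is a standalone approximation, not the class's actual _score_section: the function name score_section is hypothetical, and it assumes case-insensitive substring matching, that title and content hits for the same entity both count, and that query terms are deduplicated before matching.

```python
import re
from typing import Dict

# One entry copied from INTENT_SECTION_MAP to keep the sketch short.
INTENT_SECTION_MAP = {"penalty": ["Late Payment Penalties", "Payment Terms"]}


def score_section(section: Dict, query: str, decomposition: Dict) -> float:
    """Approximate the documented multi-factor relevance score."""
    title = section["title"].lower()
    content = section["content"].lower()
    score = 0.0

    # Intent-based scoring: +5.0 for any targeted section, +2.0 more if primary.
    intent = decomposition.get("intent", "")
    targets = [t.lower() for t in INTENT_SECTION_MAP.get(intent, [])]
    if title in targets:
        score += 5.0
        if targets[0] == title:
            score += 2.0

    # Entity matching: +1.0 per entity found in content, +1.5 per entity in title.
    for entity in decomposition.get("entities", []):
        needle = entity.lower()
        if needle in content:
            score += 1.0
        if needle in title:
            score += 1.5

    # Query term matching: +0.5 per distinct term longer than 3 characters.
    terms = {t for t in re.findall(r"\w+", query.lower()) if len(t) > 3}
    score += 0.5 * sum(1 for t in terms if t in content)

    # Text search boost, if the document store attached one.
    score += section.get("_text_boost", 0.0)
    return score
```

Note that plain substring matching does not match the entity "penalty" against the title word "Penalties", so the exact numbers depend on how the real implementation normalises word forms.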

Example Scoring:

# Query: "What is the late payment penalty?"
# Intent: "penalty"
# Entities: ["late", "payment", "penalty"]

# Section: "Late Payment Penalties"
score = 7.0  # Primary intent match
score += 1.5 * 3  # 3 entities in title
score += 0.5 * 4  # 4 query terms in content
# Subtotal from these three factors: 7.0 + 4.5 + 2.0 = 13.5

Private Helper Methods

_filter_irrelevant

Removes sections below the relevance threshold.

def _filter_irrelevant(self, sections: List[Dict], threshold: float = 1.0) -> List[Dict]

sections

List[Dict]

required

Sections with _relevance_score field

threshold

float

default:"1.0"

Minimum relevance score a section needs in order to be kept. The signature default is 1.0, but retrieve() passes 2.0.

Threshold in retrieve(): 2.0. Sections must score at least 2.0 to be considered relevant. If no section meets this threshold, the top-scoring section is returned as a fallback.
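
The threshold filter and the fallback can be sketched as two small functions. This is an assumption-laden illustration rather than the class's code: filter_irrelevant mirrors the documented signature as a free function, while filter_with_fallback is a hypothetical helper standing in for the logic that retrieve() applies around it.

```python
from typing import Dict, List


def filter_irrelevant(sections: List[Dict], threshold: float = 1.0) -> List[Dict]:
    """Keep sections whose _relevance_score is at least `threshold`."""
    return [s for s in sections if s.get("_relevance_score", 0.0) >= threshold]


def filter_with_fallback(sections: List[Dict], threshold: float = 2.0) -> List[Dict]:
    """Apply the threshold, falling back to the single best section if none pass."""
    kept = filter_irrelevant(sections, threshold)
    if kept or not sections:
        return kept
    # Nothing met the threshold: return only the top-scoring section.
    best = max(sections, key=lambda s: s.get("_relevance_score", 0.0))
    return [best]
```

The comparison is inclusive, so a section scoring exactly 2.0 is kept.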

Retrieval Strategy

1. Full-text search: Query the document store for sections matching the query.
2. Fallback: If the search returns no results, retrieve all sections from the sample contract.
3. Scoring: Apply multi-factor scoring to all candidate sections.
4. Filtering: Keep only sections with score ≥ 2.0.
5. Selection: Return the top 5 relevant sections, or the single top-scoring section if none meet the threshold.
6. Logging: Log the retrieval result with the intent and section count.
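
The six steps can be sketched end to end as follows. Everything here is a stand-in: InMemoryStore, its search and all_sections methods, retrieve_sketch, and the score_fn parameter are hypothetical names chosen for illustration, since the MockDocumentStore interface is not described on this page.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("retriever_sketch")


class InMemoryStore:
    """Hypothetical stand-in for the document store (not MockDocumentStore)."""

    def __init__(self, sections: List[Dict]):
        self._sections = sections

    async def search(self, query: str) -> List[Dict]:
        # Naive full-text search: any query word longer than 3 chars in content.
        words = [w for w in query.lower().split() if len(w) > 3]
        return [dict(s) for s in self._sections
                if any(w in s["content"].lower() for w in words)]

    async def all_sections(self) -> List[Dict]:
        return [dict(s) for s in self._sections]


async def retrieve_sketch(store: InMemoryStore, query: str, decomposition: Dict,
                          score_fn: Callable[[Dict, str, Dict], float],
                          threshold: float = 2.0, top_k: int = 5) -> List[Dict]:
    candidates = await store.search(query)                  # 1. full-text search
    if not candidates:
        candidates = await store.all_sections()             # 2. fallback to all
    for section in candidates:                              # 3. scoring
        section["_relevance_score"] = score_fn(section, query, decomposition)
    ranked = sorted(candidates, key=lambda s: s["_relevance_score"], reverse=True)
    relevant = [s for s in ranked if s["_relevance_score"] >= threshold]  # 4. filter
    selected = relevant[:top_k] if relevant else ranked[:1]               # 5. select
    logger.info("intent=%s sections=%d",                                  # 6. log
                decomposition.get("intent"), len(selected))
    return selected
```

Passing the scoring function in as a parameter keeps the sketch independent of any particular weighting; it can be driven with asyncio.run(retrieve_sketch(store, query, decomposition, score_fn)).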

Usage Example

import asyncio

from components import QueryDecomposer, AgenticRetriever
from mock_data import MockDocumentStore


async def main():
    # Initialize components
    decomposer = QueryDecomposer()
    doc_store = MockDocumentStore()
    retriever = AgenticRetriever(doc_store)

    # Process query
    query = "What intellectual property rights does the client retain?"
    decomposition = await decomposer.decompose(query)

    # Retrieve sections
    sections = await retriever.retrieve(query, decomposition)

    print(f"Found {len(sections)} relevant sections")
    for section in sections:
        print(f"\n{section['title']} (Score: {section['_relevance_score']:.1f})")
        print(f"Page {section['page_num']}")


# decompose() and retrieve() are coroutines, so run them inside an event loop
asyncio.run(main())

Performance Characteristics

Target sections: 3-5 per query
Minimum threshold: 2.0 relevance score
Fallback behavior: Returns 1 section if none meet threshold
Primary intent boost: 7.0 points
Entity title match: 1.5 points each

Integration

The retriever sits between the QueryDecomposer and the response generation pipeline:

# Full pipeline
decomposition = await decomposer.decompose(query)
sections = await retriever.retrieve(query, decomposition)  # <- AgenticRetriever
response = generator.generate(sections)
verdict = await judge.evaluate(response, sections)
