Every URL that surfaces in a search result is scored before the agent decides whether to fetch it. The score_source function produces a single float between 0.0 and 1.0 by combining three independent signals: how trustworthy the domain is as a category, how relevant the result snippet is to the research question, and whether the snippet contains recency indicators. Sources that score at or below the minimum threshold of 0.5 are blocked: the agent never downloads their content, and the block is recorded in the final report’s Methodology section.
The Scoring Formula
| Component | Weight | What it measures |
|---|---|---|
| Domain Authority | 40% | Trustworthiness of the source’s domain or TLD |
| Snippet Relevance | 50% | Word overlap between the snippet and the research question |
| Recency Signals | 10% | Presence of a publication year or relative time expression |
Component 1: Domain Authority (weight 0.4)
Domain authority is determined by matching the URL against a tiered list of known domains and TLDs. The matching is applied in order: TLD rules are checked first, then exact domain membership in the major-domains list, then social-media domains, and finally everything else falls into the base tier.
Domain Tiers
| Tier | Score | Domains / TLDs |
|---|---|---|
| Academic / Government | 0.9 | *.edu, *.gov |
| Major Recognised Sources | 0.8 | nature.com, science.org, wikipedia.org, arxiv.org, reuters.com, apnews.com, bloomberg.com, nytimes.com, wsj.com, bbc.com, techcrunch.com, wired.com, github.com, medium.com, scholar.google.com |
| Non-profit / Organisation | 0.7 | *.org (not already matched above) |
| General Web | 0.4 | All other domains not matched by a higher tier |
| Social Media | 0.3 | twitter.com, x.com, facebook.com, instagram.com |
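The tier lookup in the table can be sketched in a few lines of Python. This is an illustrative sketch, not the project's implementation: the name domain_authority, the stripping of a leading "www.", and the exact position of the *.org check (after the major-domains list, before social media) are assumptions chosen to agree with the table.

```python
from urllib.parse import urlparse

# Domain lists copied from the tier table.
MAJOR_DOMAINS = {
    "nature.com", "science.org", "wikipedia.org", "arxiv.org", "reuters.com",
    "apnews.com", "bloomberg.com", "nytimes.com", "wsj.com", "bbc.com",
    "techcrunch.com", "wired.com", "github.com", "medium.com",
    "scholar.google.com",
}
SOCIAL_DOMAINS = {"twitter.com", "x.com", "facebook.com", "instagram.com"}


def domain_authority(url: str) -> float:
    """Return the tier score for a URL (hypothetical helper)."""
    host = (urlparse(url).hostname or "").lower()
    # Assumption: a leading "www." is ignored before matching.
    if host.startswith("www."):
        host = host[4:]
    # Academic / government TLDs take priority over everything else.
    if host.endswith((".edu", ".gov")):
        return 0.9
    # Exact membership only: other subdomains do not match in this sketch.
    if host in MAJOR_DOMAINS:
        return 0.8
    # Remaining *.org domains that are not on the major list.
    if host.endswith(".org"):
        return 0.7
    if host in SOCIAL_DOMAINS:
        return 0.3
    # Base tier: general web.
    return 0.4
```

Because the major-domains check runs before the *.org check, a URL on wikipedia.org resolves to 0.8 rather than 0.7 in this sketch.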
The .org tier (0.7) only applies to domains that did not already match the Major Recognised Sources list. For example, wikipedia.org scores 0.8 (major domain), not 0.7 (.org TLD).
Component 2: Snippet Relevance (weight 0.5)
Snippet relevance measures how much vocabulary the search result snippet shares with the research question, after removing common stopwords from both strings. Stopwords excluded from matching:
Component 3: Recency Signals (weight 0.1)
The recency component adds a flat bonus of 0.1 to the score if the snippet contains any of these patterns:
| Pattern type | Examples |
|---|---|
| Four-digit year starting with 202 | 2024, 2023, 2025 |
| Relative time expressions | "hours ago", "days ago", "weeks ago", "minutes ago" |
If either pattern matches, +0.1 is added. If neither matches, the recency contribution is 0.0.
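The recency check can be approximated with two regular expressions. The sketch below is a guess at how such a check might look. The function name recency_score and the exact regexes are assumptions, and only the plural forms listed in the table are matched.

```python
import re

# Assumption: these regexes approximate the two pattern types in the table.
YEAR_PATTERN = re.compile(r"\b202\d\b")
RELATIVE_PATTERN = re.compile(
    r"\b(?:minutes|hours|days|weeks) ago\b", re.IGNORECASE
)


def recency_score(snippet: str) -> float:
    """Return 0.1 if the snippet shows any recency signal, else 0.0."""
    if YEAR_PATTERN.search(snippet) or RELATIVE_PATTERN.search(snippet):
        return 0.1
    return 0.0
```

The bonus is flat rather than cumulative: a snippet that contains both a year and a relative time expression still contributes 0.1.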
Complete Score Examples
The formula is: (domain_score * 0.4) + (relevance_score * 0.5) + recency_score, where recency_score is either the flat 0.1 bonus or 0.0.
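To make the arithmetic concrete, the sketch below evaluates the formula for three hypothetical results. The domain scores come from the tier table, but the relevance values are invented inputs for illustration, and combine is a hypothetical helper name rather than part of the project.

```python
MIN_CREDIBILITY_SCORE = 0.5


def combine(domain_score: float, relevance_score: float, recency_score: float) -> float:
    """Apply the weights from the scoring table; recency is already 0.1 or 0.0."""
    return domain_score * 0.4 + relevance_score * 0.5 + recency_score


# (label, domain score, hypothetical relevance, recency bonus)
examples = [
    ("*.edu page, relevance 0.6, recent", 0.9, 0.6, 0.1),
    ("general web, relevance 0.5, not recent", 0.4, 0.5, 0.0),
    ("social media, relevance 0.8, recent", 0.3, 0.8, 0.1),
]

for label, domain, relevance, recency in examples:
    score = combine(domain, relevance, recency)
    # Scores at or below the threshold are blocked, following the text above.
    status = "blocked" if score <= MIN_CREDIBILITY_SCORE else "fetchable"
    print(f"{label}: {score:.2f} ({status})")

# *.edu page, relevance 0.6, recent: 0.76 (fetchable)
# general web, relevance 0.5, not recent: 0.41 (blocked)
# social media, relevance 0.8, recent: 0.62 (fetchable)
```

With relevance weighted at 0.5, a highly relevant snippet can lift a low-tier domain above the threshold, while a moderately relevant general-web result with no recency signal falls below it.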
Threshold Enforcement
The minimum credibility score is MIN_CREDIBILITY_SCORE = 0.5. Enforcement happens inside fetch_page, not at search time: the agent always scores and stores metadata for every search result, but only fetches pages that clear the threshold.
When a fetch is blocked:
- A "block" progress event fires (displayed as 🚫 Blocked in the CLI).
- An error string is returned to the Gemini tool-use loop instead of page content.
- The URL remains in context.source_metadata with its score, so it appears in the final report’s Blocked Sources count.
- The blocked fetch does not consume a page from MAX_SOURCES_PER_QUERY.
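The blocking behaviour listed above can be sketched as a gate at the top of the fetch function. Everything here is a stand-in under stated assumptions: ResearchContext is reduced to two fields, pages_fetched is an assumed counter for the MAX_SOURCES_PER_QUERY budget, the emit and download callbacks are hypothetical, and the wording of the error string is invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

MIN_CREDIBILITY_SCORE = 0.5


@dataclass
class ResearchContext:
    """Hypothetical, reduced stand-in for the agent's context object."""
    source_metadata: Dict[str, dict] = field(default_factory=dict)
    pages_fetched: int = 0  # assumed counter behind MAX_SOURCES_PER_QUERY


def fetch_page_sketch(
    url: str,
    context: ResearchContext,
    emit: Callable[[str, str], None],
    download: Callable[[str], str],
) -> str:
    """Return page content, or an error string if the URL is blocked."""
    score = context.source_metadata[url]["score"]
    if score <= MIN_CREDIBILITY_SCORE:
        # Fire the "block" progress event; the metadata entry is left in
        # place so the report can still count the blocked source.
        emit("block", url)
        # Returned to the tool-use loop instead of page content.
        return f"Blocked: credibility score {score:.2f} does not clear the threshold"
    # Only successful fetches count against the per-query budget.
    context.pages_fetched += 1
    return download(url)
```

Returning a string instead of raising keeps the tool-use loop running, so the model can move on to another search result after a block.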
Score Persistence in Reports
Every score computed by score_source is stored in ResearchContext.source_metadata and is included verbatim in the final report’s Sources section: