
Spy Search is built on a two-tier architecture: a React frontend that provides the user interface, and a FastAPI backend that houses the entire agentic search pipeline. When a query arrives at the backend, it moves through a chain of coordinated agents — Planner, Searcher, and Reporter — each fulfilling a discrete role before the final synthesized report is returned to the client.

Component Breakdown

FastAPI Backend (main.py)

The entry point of the application is main.py, which creates the FastAPI app instance, mounts the aggregated API router from src/api/app.py, and applies CORS middleware scoped to http://localhost:8080, the port served by the React dev server and production build.

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from src.api.app import router  # aggregated APIRouter; import path assumed from the module location

app = FastAPI(title="Your API", description="API Documentation", version="1.0.0")
app.include_router(router)

# Allow cross-origin requests only from the frontend origin.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:8080"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

src/api/app.py aggregates five route modules (files, messages, agents, streaming, and misc) into a single APIRouter that is mounted on the app.

React Frontend (frontend/)

The frontend is a TypeScript + Vite application built with the shadcn/ui component library. It communicates with the FastAPI backend over HTTP and Server-Sent Events (SSE) for streaming responses.
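As a rough illustration of what a streamed response looks like on the wire, the sketch below serializes one payload as a Server-Sent Events frame. The helper name `format_sse` and the payload keys are hypothetical and are not taken from the Spy Search codebase; only the `event:` / `data:` framing and the terminating blank line come from the SSE format itself.

```python
import json
from typing import Any, Dict, Optional


def format_sse(data: Dict[str, Any], event: Optional[str] = None) -> str:
    """Serialize one payload as a Server-Sent Events frame (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # JSON-encode so the payload always fits on a single "data:" line.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line marks the end of the frame.
    return "\n".join(lines) + "\n\n"
```

A browser client reading the stream would receive each frame as one message and parse the `data` field as JSON.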

Agent Pipeline

The search pipeline is composed of five components working in sequence:
  • Router — wraps each agent and handles message passing to and from the Server
  • Server — holds the registry of all Routers, dispatches messages, and drives the pipeline loop
  • Planner — breaks the user query into an ordered task queue and assigns each task to the correct agent
  • Agents — execute domain-specific workflows (web search, RAG retrieval, report writing)
  • Reporter — collects all gathered data and synthesizes the final structured report
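The relationship between the Server, its Routers, and the agents can be sketched as a small dispatch loop. This is a minimal sketch under stated assumptions, not the project's implementation: the method names `handle`, `register`, and the `max_steps` guard are hypothetical, and the three stub agents stand in for the real Planner, searcher, and Reporter. Only the message shape (`agent`, `task`, `data`) and the `TERMINATE` sentinel follow the description on this page.

```python
from typing import Any, Callable, Dict

Message = Dict[str, Any]


class Router:
    """Wraps one agent and forwards messages to it (illustrative)."""

    def __init__(self, name: str, agent: Callable[[Message], Message]) -> None:
        self.name = name
        self.agent = agent

    def handle(self, message: Message) -> Message:
        return self.agent(message)


class Server:
    """Holds the Router registry and loops until an agent returns TERMINATE."""

    def __init__(self) -> None:
        self.routers: Dict[str, Router] = {}

    def register(self, router: Router) -> None:
        self.routers[router.name] = router

    def start(self, query: str, max_steps: int = 50) -> Message:
        message: Message = {"agent": "planner", "task": query, "data": []}
        for _ in range(max_steps):  # guard against a plan that never finishes
            if message["agent"] == "TERMINATE":
                return message
            # The "agent" field of each response names the next Router to call.
            message = self.routers[message["agent"]].handle(message)
        raise RuntimeError("pipeline did not terminate")


# Stub agents: each returns a message naming the next agent.
def planner(msg: Message) -> Message:
    if not msg["data"]:  # nothing gathered yet: send the task to the searcher
        return {"agent": "searcher", "task": msg["task"], "data": []}
    return {"agent": "reporter", "task": "", "data": msg["data"]}


def searcher(msg: Message) -> Message:
    return {"agent": "planner", "task": "", "data": [f"result for {msg['task']}"]}


def reporter(msg: Message) -> Message:
    return {"agent": "TERMINATE", "task": "", "data": "; ".join(msg["data"])}
```

Registering the three stubs with `Server.register(Router(name, fn))` and calling `start("q")` walks planner, searcher, planner, reporter, and then stops on `TERMINATE`.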

Browser Engine

The web retrieval layer offers two modes:
  • DuckDuckGo (src/browser/duckduckgo.py) — fast keyword search via langchain_community, with async content extraction capped at a 1.5-second total budget
  • crawl4ai / Playwright (src/browser/crawl_ai.py) — optional deep-crawl mode using a headless Chromium browser for JavaScript-rendered pages and LLM-guided content extraction
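One way to enforce a shared time budget across several concurrent page fetches is `asyncio.wait` with a timeout. The sketch below is an assumption about how such a cap could work, not a copy of src/browser/duckduckgo.py; `fetch_within_budget` and its `fetch` callback are hypothetical names, and only the 1.5-second default comes from the text above.

```python
import asyncio
from typing import Any, Callable, Coroutine, List


async def fetch_within_budget(
    urls: List[str],
    fetch: Callable[[str], Coroutine[Any, Any, str]],
    budget_seconds: float = 1.5,
) -> List[str]:
    """Run all fetches concurrently and keep only those done within the budget."""
    tasks = [asyncio.create_task(fetch(url)) for url in urls]
    if not tasks:
        return []  # asyncio.wait rejects an empty collection
    done, pending = await asyncio.wait(tasks, timeout=budget_seconds)
    for task in pending:
        task.cancel()  # drop anything that overran the shared budget
    # Preserve input order and skip fetches that raised an exception.
    return [t.result() for t in tasks if t in done and t.exception() is None]
```

The budget here is total rather than per-URL: every fetch starts at the same moment, and whatever has not finished when the timer expires is cancelled.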

Vector Store

src/RAG/chrome.py wraps ChromaDB for local retrieval-augmented generation (RAG). The RAG_agent walks a configured file directory, converts each document to Markdown via markitdown, chunks text into 1,500-character segments, and indexes them in ChromaDB. At query time it retrieves the top-k most relevant chunks.
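The chunking step can be illustrated with a fixed-size splitter. This is a minimal sketch that assumes non-overlapping chunks cut purely by character count; the page does not say how the actual RAG_agent picks chunk boundaries, and `chunk_text` is a hypothetical name.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1500) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

For example, a 3,200-character document yields three chunks of 1,500, 1,500, and 200 characters.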

Model Layer

src/model/model.py defines the abstract Model base class with a unified interface (completion, completion_stream, get_llm_config, etc.). Concrete implementations for Gemini, Ollama, Deepseek, Grok, and OpenAI all satisfy this contract, so any agent can switch providers without code changes. The Factory class instantiates the correct implementation at runtime based on configuration.
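The abstract-base-plus-factory pattern described above can be sketched as follows. The method names `completion`, `completion_stream`, and `get_llm_config` come from the text, but their signatures, the `EchoModel` stand-in provider, and the registry-based `get_model` are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Type


class Model(ABC):
    """Unified interface that every provider is expected to satisfy."""

    @abstractmethod
    def completion(self, message: str) -> str: ...

    @abstractmethod
    def completion_stream(self, message: str) -> Iterator[str]: ...

    @abstractmethod
    def get_llm_config(self) -> Dict[str, str]: ...


class EchoModel(Model):
    """Stand-in provider (hypothetical) that echoes its input instead of calling an LLM."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    def completion(self, message: str) -> str:
        return f"[{self.model_name}] {message}"

    def completion_stream(self, message: str) -> Iterator[str]:
        # Yield one word at a time to mimic token streaming.
        yield from message.split()

    def get_llm_config(self) -> Dict[str, str]:
        return {"provider": "echo", "model": self.model_name}


class Factory:
    """Maps a provider name from configuration to a concrete Model class."""

    _registry: Dict[str, Type[EchoModel]] = {"echo": EchoModel}

    @staticmethod
    def get_model(provider: str, model_name: str) -> Model:
        try:
            model_cls = Factory._registry[provider]
        except KeyError:
            raise ValueError(f"unknown provider: {provider}") from None
        return model_cls(model_name)
```

Because agents depend only on the `Model` interface, swapping providers in this sketch means changing the string passed to `Factory.get_model`, not the agent code.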

Request Flow

  1. User sends a query to a FastAPI endpoint (e.g., /report/{query} or a streaming endpoint).
  2. Factory.get_model() instantiates the configured Model; Factory.get_agent() constructs the required Agent objects (Planner, one or more searchers, Reporter).
  3. generate_report(query, planner, agents) wires each agent into a Router, registers all Routers with a Server, and calls server.start(query).
  4. Planner receives the query, calls the LLM with a planning prompt that lists available agents and their descriptions, and returns an ordered task queue.
  5. Server dispatches each task by following the "agent" field in every response — routing to quick-searcher, searcher, or local-retrieval as instructed.
  6. Each agent executes its workflow (DuckDuckGo search, Playwright crawl, or ChromaDB retrieval) and returns {"agent": "planner", "task": "", "data": [...]} so the Server routes back to the Planner.
  7. Planner pops the next task from its queue; once the queue is empty it routes to Reporter.
  8. Reporter synthesizes a section-by-section report using targeted LLM calls, concatenates the sections, and returns {"agent": "TERMINATE", "data": <report>, "task": ""}.
  9. Server detects TERMINATE and returns the final response dict to generate_report(), which extracts the "data" field (the finished report string) and returns it to the calling API handler.
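Steps 4 through 7 hinge on the Planner's task queue. The sketch below shows that queue behavior in isolation, assuming the plan is already available as a list of `(agent, task)` pairs rather than produced by an LLM call, and assuming the Planner accumulates each agent's `data` before handing it to the Reporter. `QueuePlanner` and `step` are hypothetical names, and the agent name `reporter` in the hand-off message is also an assumption.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Tuple


class QueuePlanner:
    """Hypothetical planner that replays a precomputed plan (no LLM call)."""

    def __init__(self, plan: List[Tuple[str, str]]) -> None:
        # Each entry is (agent_name, task), in execution order.
        self.queue: Deque[Tuple[str, str]] = deque(plan)
        self.collected: List[Any] = []

    def step(self, response: Dict[str, Any]) -> Dict[str, Any]:
        # Keep whatever the previous agent returned.
        self.collected.extend(response.get("data", []))
        if self.queue:
            agent, task = self.queue.popleft()
            return {"agent": agent, "task": task, "data": []}
        # Queue exhausted: hand everything gathered so far to the reporter.
        return {"agent": "reporter", "task": "", "data": self.collected}
```

Each call to `step` corresponds to the Server routing a response back to the Planner; the returned `agent` field tells the Server where to go next.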

Key Source Modules

| Module | Path | Role |
| --- | --- | --- |
| API App | src/api/app.py | FastAPI router aggregation across five route modules |
| Factory | src/factory/factory.py | Agent and Model instantiation by name/provider |
| Server | src/router/server.py | Agent pipeline orchestration and termination detection |
| Router | src/router/router.py | Per-agent message routing between Server and Agent |
| DuckSearch | src/browser/duckduckgo.py | DuckDuckGo-backed web search engine |
| VectorSearch | src/RAG/chrome.py | ChromaDB wrapper for local RAG |

Server and Router are internal coordination primitives, not HTTP routers. They have no relationship to FastAPI’s APIRouter; they exist solely to pass messages between agents inside the pipeline.
