Overview
When you send a message in the UI, the app calls POST /api/chat. At a high level, three things happen: the query is classified, widgets and research run in parallel, and the answer is generated and streamed back.

Let's walk through each step in detail.
Step 1: Classification
Before searching or answering, Perplexica runs a classification step to understand the question and plan the response.

What the classifier decides
The classifier (src/lib/agents/search/classifier.ts) analyzes the query and determines:
- Should we do research? Some questions don’t need web search (e.g., “What did we discuss earlier?”)
- Which widgets are relevant? Weather, stocks, or calculations
- What sources to use? Web, academic papers, or discussions
- How to rewrite the query into a clearer, standalone form that works without conversation context
The classifier uses a structured schema with boolean flags for each decision. This ensures consistent, predictable behavior.
Classification output example
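The exact schema lives in src/lib/agents/search/classifier.ts; the sketch below is illustrative only, with assumed field names rather than the real schema:

```typescript
// Illustrative classifier output for a query like "how's nvidia doing?"
// sent mid-conversation. Field names are assumptions, not the real schema.
interface ClassifierOutput {
  skipSearch: boolean; // true when chat history alone can answer
  widgets: {
    weather: boolean;
    stocks: boolean;
    calculation: boolean;
  };
  sources: Array<'web' | 'academic' | 'discussions'>;
  standaloneQuery: string; // rewritten to work without prior context
}

const example: ClassifierOutput = {
  skipSearch: false,
  widgets: { weather: false, stocks: true, calculation: false },
  sources: ['web'],
  standaloneQuery: 'NVIDIA stock performance and recent news',
};

console.log(JSON.stringify(example, null, 2));
```

The boolean flags are what make the behavior predictable: downstream code branches on them directly instead of re-interpreting free-form model output.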
Step 2: Parallel execution
After classification, Perplexica runs two processes in parallel for optimal performance:

Widgets

Widgets are small, structured helpers that provide real-time data:
- Weather: Current conditions and forecasts based on location
- Stocks: Real-time market data and stock prices
- Calculations: Evaluate mathematical expressions
Key properties:
- Run independently of research
- Show structured UI cards while the answer is being generated
- Provide helpful context but are not cited as sources
- Executed by src/lib/agents/search/widgets/executor.ts
Widgets complete quickly and appear in the UI before the final answer, giving users immediate value.
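As a sketch of the idea (not the actual executor code in src/lib/agents/search/widgets/executor.ts), each widget is an independent task that returns a structured card. Here is a minimal calculation widget:

```typescript
// Hypothetical widget result shape, for illustration only.
interface WidgetResult {
  type: 'weather' | 'stocks' | 'calculation';
  data: Record<string, unknown>;
}

// A minimal calculation widget: evaluates a basic arithmetic expression.
function calculationWidget(expression: string): WidgetResult {
  // Allow only digits, whitespace, and basic operators before evaluating.
  if (!/^[\d\s+\-*\/().]+$/.test(expression)) {
    throw new Error('unsupported expression');
  }
  const value = Function(`"use strict"; return (${expression});`)();
  return { type: 'calculation', data: { expression, value } };
}

console.log(calculationWidget('12 * (3 + 4)').data);
```

Because a widget like this needs no network round-trip, its card can render almost immediately, well before the streamed answer begins.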
Research
If research is needed (based on classification), the researcher gathers information in the background.

Research capabilities (src/lib/agents/search/researcher/actions/):
- Web search: General information via SearXNG meta-search
- Academic search: Scholarly papers and research articles
- Social search: Discussion forums and community insights
- Upload search: Semantic search over user-uploaded files (PDFs, documents)
- URL scraping: Direct content extraction from specific URLs
The research process:
- The researcher receives the standalone query and enabled sources
- It selects appropriate tools based on the classification
- Tools run and gather relevant content
- Results are deduplicated and ranked
- A set of “search findings” is returned with metadata (title, URL, content)
src/lib/agents/search/researcher/index.ts
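Putting Step 2 together: widgets and research fan out concurrently, which can be sketched with Promise.all (the function names here are illustrative, not the actual code in the files above):

```typescript
// Simplified sketch of the parallel fan-out described in Step 2.
interface Finding {
  title: string;
  url: string;
  content: string;
}

async function runWidgets(query: string): Promise<string[]> {
  // Stand-in for the real widget executor.
  return [`widget card for: ${query}`];
}

async function runResearch(query: string): Promise<Finding[]> {
  // Stand-in for the real researcher (search, dedupe, rank).
  return [{ title: 'Example', url: 'https://example.com', content: 'snippet' }];
}

async function handleQuery(standaloneQuery: string) {
  // Both start immediately; neither blocks the other.
  const [widgets, findings] = await Promise.all([
    runWidgets(standaloneQuery),
    runResearch(standaloneQuery),
  ]);
  return { widgets, findings };
}

handleQuery('example query').then((r) => console.log(r.findings.length));
```

The key property is that total latency is max(widgets, research) rather than their sum.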
Step 3: Answer generation
Once Perplexica has enough context from research and widgets, it generates the final response.

Context assembly

The system combines two types of context:
- Search results (citable): findings from the researcher, each with an index the model can cite
- Widget results (non-citable): structured data that informs the answer but is not cited as a source
Optimization modes
You can control the tradeoff between speed and quality using optimizationMode:
- Speed: Fast responses with lighter processing
- Balanced: Default mode balancing speed and thoroughness
- Quality: Deep analysis with more comprehensive answers
The mode affects:
- How much context is gathered
- The complexity of the writer prompt
- Model parameters (temperature, top-p, etc.)
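A mode-to-settings mapping along these lines is one way to picture it; the values below are invented for illustration, not Perplexica's real tuning:

```typescript
type OptimizationMode = 'speed' | 'balanced' | 'quality';

interface ModeSettings {
  maxSources: number; // how much context is gathered
  temperature: number; // model parameter
  deepPrompt: boolean; // whether to use the more elaborate writer prompt
}

// Illustrative values only; the real tuning lives in the source.
const MODE_SETTINGS: Record<OptimizationMode, ModeSettings> = {
  speed: { maxSources: 5, temperature: 0.7, deepPrompt: false },
  balanced: { maxSources: 10, temperature: 0.7, deepPrompt: true },
  quality: { maxSources: 20, temperature: 0.5, deepPrompt: true },
};

console.log(MODE_SETTINGS['balanced'].maxSources);
```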
Streaming response
The answer is streamed to the user in real-time:
- The writer prompt is constructed with search context and system instructions
- LLM begins generating the response
- Each chunk is emitted to the user via Server-Sent Events (SSE)
- The UI updates progressively as text arrives
- When complete, the full response is saved to the database
src/lib/agents/search/index.ts:122-166
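The SSE side of the steps above follows a generic pattern, sketched here (this is not the actual code in that file):

```typescript
// Generic SSE formatting for streamed answer chunks.
function formatSseEvent(data: unknown): string {
  // Each SSE event is a "data:" line followed by a blank line.
  return `data: ${JSON.stringify(data)}\n\n`;
}

// Emit one event per chunk, then a terminal "done" event.
// The { type, data } shape is an assumption for illustration.
async function* streamAnswer(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    yield formatSseEvent({ type: 'response', data: chunk });
  }
  yield formatSseEvent({ type: 'done' });
}

// Demo: collect the events a client would receive.
(async () => {
  const events: string[] = [];
  for await (const e of streamAnswer(['Hello', ', world'])) events.push(e);
  console.log(events.join(''));
})();
```

On the client, the UI appends each `response` chunk as it arrives and finalizes (and persists) the message on `done`.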
How citations work
Perplexica prompts the model to cite the references it uses. The citation system works as follows:
- Source numbering: Each search result has an index
- Inline citations: The model references sources by index (e.g., [1], [2])
- UI rendering: Citations are rendered as clickable links alongside the answer
- Supporting links: Each citation links to the original source
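As an illustration of the UI-rendering step, a minimal function that turns inline [n] markers into links (a sketch, not the actual renderer):

```typescript
interface Source {
  title: string;
  url: string;
}

// Replace [1], [2], … with markdown links to the corresponding source.
// Indices are 1-based, matching the numbering shown to the model.
function renderCitations(answer: string, sources: Source[]): string {
  return answer.replace(/\[(\d+)\]/g, (match, n) => {
    const source = sources[Number(n) - 1];
    return source ? `[[${n}]](${source.url})` : match; // leave unknown indices as-is
  });
}

const sources = [{ title: 'Example', url: 'https://example.com' }];
console.log(renderCitations('Water boils at 100°C [1].', sources));
// → Water boils at 100°C [[1]](https://example.com).
```

Leaving unmatched indices untouched is a deliberate safety choice: if the model cites a source number that doesn't exist, the text still renders sensibly.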
The writer prompt explicitly instructs the model to cite sources. This is handled by the prompt template in
src/lib/prompts/search/writer.ts.

Media search (images and videos)
Image and video search use separate, specialized endpoints:

How it differs
- Endpoint: POST /api/images and POST /api/videos
- Process:
- Generate a focused query using the chat model
- Fetch matching results from the search backend
- Return structured media results
- No research phase: These are pure search operations
- No citations: Results are displayed as a gallery
src/lib/agents/media/image.ts and src/lib/agents/media/video.ts
Search API for integrations
If you're integrating Perplexica into another product, use POST /api/search.
Response format
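The response carries the generated answer plus the sources behind it. The shape below is an illustrative sketch with assumed field names; check the API reference for the exact contract:

```typescript
// Illustrative response shape for POST /api/search; field names are
// assumptions to be verified against the API reference.
interface SearchResponse {
  message: string; // the generated answer, with [n] citations
  sources: Array<{
    title: string;
    url: string;
    content: string; // snippet used as context
  }>;
}

const example: SearchResponse = {
  message: 'TypeScript is a typed superset of JavaScript [1].',
  sources: [
    {
      title: 'TypeScript documentation',
      url: 'https://www.typescriptlang.org',
      content: 'TypeScript is JavaScript with syntax for types.',
    },
  ],
};

console.log(example.sources.length);
```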
Streaming mode
Enable streaming by setting stream: true in your request:
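On the client side, a streamed body can be consumed as newline-delimited JSON events; the { type, data } event shape below is an assumption for illustration, not the documented wire format:

```typescript
// Parse newline-delimited JSON events from a streamed response body.
interface StreamEvent {
  type: string;
  data?: string;
}

function parseEvents(buffer: string): StreamEvent[] {
  return buffer
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as StreamEvent);
}

// Accumulate answer text from "response" events.
function collectAnswer(events: StreamEvent[]): string {
  return events
    .filter((e) => e.type === 'response')
    .map((e) => e.data ?? '')
    .join('');
}

const raw =
  '{"type":"response","data":"Hello"}\n' +
  '{"type":"response","data":", world"}\n' +
  '{"type":"done"}\n';
console.log(collectAnswer(parseEvents(raw))); // → Hello, world
```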
Complete flow diagram
Here's the complete flow from question to answer:

Performance optimizations
Perplexica is designed for speed:
- Parallel execution: Widgets and research run simultaneously
- Streaming: Users see responses as they’re generated
- Efficient prompts: Prompts are optimized per mode
- Caching: Provider connections are reused
- Database indexes: Fast chat and message lookups
Next steps
Architecture
Deep dive into components and code structure
API Reference
Integrate Perplexica into your applications