Perplexica provides a RESTful API that enables you to integrate AI-powered search capabilities into your applications. The API supports multiple search modes, various AI model providers, and both standard and streaming responses.

Base URL

When running Perplexica locally, the default base URL is:
http://localhost:3000
Replace localhost:3000 with your Perplexica instance URL if running on a different host or port.

API architecture

Perplexica’s API is built on Next.js and provides several key endpoints:
  • Search API (/api/search) - Execute AI-powered searches with customizable sources and models
  • Chat API (/api/chat) - Interactive chat interface with conversation history
  • Providers API (/api/providers) - Retrieve available AI model providers and their models
  • Additional endpoints - Images, videos, suggestions, and more

Key concepts

Providers and models

Before making search requests, you need to understand Perplexica’s provider system:
  • Providers are AI service platforms (OpenAI, Anthropic, Ollama, etc.)
  • Each provider offers chat models (for generating responses) and embedding models (for semantic search)
  • You must specify both a chat model and an embedding model in each request
  • Provider IDs are UUIDs that you obtain from the /api/providers endpoint
Use the /api/providers endpoint to get a list of all configured providers and their available models before making search requests.
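As a sketch of how a client might consume the providers list: the field names below (`providers`, `id`, `chatModels`, `embeddingModels`, `key`) are assumptions about the response shape, not a documented contract — inspect your own `/api/providers` output before relying on them.

```python
import json

# A fabricated sample response; the real shape may differ.
sample = json.loads("""
{
  "providers": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "name": "OpenAI",
      "chatModels": [{"key": "gpt-4o-mini"}],
      "embeddingModels": [{"key": "text-embedding-3-large"}]
    }
  ]
}
""")

def list_models(payload):
    """Map each provider UUID to its chat and embedding model keys."""
    return {
        p["id"]: {
            "chat": [m["key"] for m in p.get("chatModels", [])],
            "embedding": [m["key"] for m in p.get("embeddingModels", [])],
        }
        for p in payload["providers"]
    }

models = list_models(sample)
```

The resulting mapping gives you the `providerId`/`key` pairs that later search requests require.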

Search sources

Perplexica can search different types of content:
  • web - General web search results
  • academic - Academic papers and scholarly content
  • discussions - Forum discussions and community content
You can enable multiple sources in a single request.

Optimization modes

Control the balance between speed and quality:
  • speed - Fastest responses, ideal for quick queries
  • balanced - Good balance between speed and quality (default)
  • quality - Highest quality responses, may take longer
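A minimal helper for assembling a request body that combines sources and an optimization mode might look like the following. The field names mirror the search request shown in the workflow below; the helper itself is a sketch, not part of Perplexica.

```python
def build_search_payload(query, provider_id, chat_key, embedding_key,
                         sources=("web",), optimization_mode="balanced"):
    """Assemble a /api/search request body.

    Multiple sources may be enabled at once; optimization_mode must be
    one of speed, balanced, or quality.
    """
    if optimization_mode not in {"speed", "balanced", "quality"}:
        raise ValueError(f"unknown optimization mode: {optimization_mode}")
    return {
        "chatModel": {"providerId": provider_id, "key": chat_key},
        "embeddingModel": {"providerId": provider_id, "key": embedding_key},
        "optimizationMode": optimization_mode,
        "sources": list(sources),
        "query": query,
    }

payload = build_search_payload(
    "What is Perplexica?",
    "550e8400-e29b-41d4-a716-446655440000",
    "gpt-4o-mini",
    "text-embedding-3-large",
    sources=["web", "discussions"],
)
```

Validating the mode client-side turns a silent 400 from the server into an immediate, descriptive error.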

Response formats

Standard responses

By default, the API returns complete JSON responses with:
  • message - The generated answer
  • sources - Array of sources used, including content snippets and metadata
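Pulling the answer and its citations out of a standard response might look like this. Only `message` and `sources` are documented above; the `pageContent` and `metadata` fields inside each source entry are assumptions based on a typical response and may differ in your version.

```python
import json

# A fabricated standard (non-streaming) response.
raw = json.loads("""
{
  "message": "Perplexica is an AI-powered search engine.",
  "sources": [
    {
      "pageContent": "Perplexica is an open-source AI search tool...",
      "metadata": {"title": "Perplexica", "url": "https://example.com"}
    }
  ]
}
""")

answer = raw["message"]
# Collect per-source metadata (assumed shape) for rendering citations.
citations = [s.get("metadata", {}) for s in raw["sources"]]
```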

Streaming responses

Set stream: true to receive real-time updates via Server-Sent Events (SSE):
  • Progressive text generation
  • Lower perceived latency
  • Better user experience for long responses
Streaming is recommended for interactive applications where you want to display responses as they’re generated.
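A streaming consumer reads `data:` lines off the SSE connection and reassembles the text. The parser below is a generic SSE sketch; the event payload shape (`type`/`data` fields) is an assumption about Perplexica's stream format, so check a real stream before depending on it.

```python
import json

def iter_sse_data(lines):
    """Yield the JSON payload of each `data:` line in an SSE stream,
    skipping blank keep-alive lines and comments."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())

# A fabricated stream fragment; the real event shape may differ.
stream = [
    'data: {"type": "response", "data": "Perplexica is "}',
    "",
    'data: {"type": "response", "data": "an AI search engine."}',
]

chunks = [e["data"] for e in iter_sse_data(stream) if e.get("type") == "response"]
text = "".join(chunks)
```

In a real client you would iterate over the HTTP response body line by line instead of a list, appending each chunk to the UI as it arrives.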

Example workflow

  1. Get available providers and models
    curl http://localhost:3000/api/providers
    
  2. Extract provider IDs and model keys from the response
  3. Make a search request using the obtained IDs:
    curl -X POST http://localhost:3000/api/search \
      -H "Content-Type: application/json" \
      -d '{
        "chatModel": {
          "providerId": "550e8400-e29b-41d4-a716-446655440000",
          "key": "gpt-4o-mini"
        },
        "embeddingModel": {
          "providerId": "550e8400-e29b-41d4-a716-446655440000",
          "key": "text-embedding-3-large"
        },
        "optimizationMode": "balanced",
        "sources": ["web"],
        "query": "What is Perplexica?"
      }'
    
  4. Process the response containing the answer and sources
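The same workflow in Python, using only the standard library, might look like this. It is a sketch: the endpoint paths and request fields come from the examples above, but the response shape and error behavior should be verified against your instance.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # adjust for your instance

def post_json(path, body):
    """Build a JSON POST request for a Perplexica endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def search(query, provider_id, chat_key, embedding_key):
    """Run steps 3-4 of the workflow: POST a search and parse the reply."""
    req = post_json("/api/search", {
        "chatModel": {"providerId": provider_id, "key": chat_key},
        "embeddingModel": {"providerId": provider_id, "key": embedding_key},
        "optimizationMode": "balanced",
        "sources": ["web"],
        "query": query,
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    result = search(
        "What is Perplexica?",
        "550e8400-e29b-41d4-a716-446655440000",  # from /api/providers
        "gpt-4o-mini",
        "text-embedding-3-large",
    )
    print(result["message"])
```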

Error handling

The API uses standard HTTP status codes:
  • 200 - Successful request
  • 400 - Invalid request (missing required fields, malformed JSON)
  • 500 - Internal server error
Error responses include a message field describing the issue:
{
  "message": "Missing sources or query"
}
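A client can turn these status codes into descriptive exceptions, surfacing the `message` field when present. This wrapper is a sketch of one reasonable approach, not part of Perplexica:

```python
def check_response(status, body):
    """Return the body on success; raise a descriptive error otherwise."""
    if status == 200:
        return body
    detail = body.get("message", "unknown error") if isinstance(body, dict) else "unknown error"
    if status == 400:
        raise ValueError(f"Invalid request: {detail}")
    raise RuntimeError(f"Server error {status}: {detail}")
```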

Rate limits

Perplexica does not enforce built-in rate limits, but your requests may still be constrained by:
  • The AI provider’s rate limits (OpenAI, Anthropic, etc.)
  • Your server’s resource capacity
  • Network bandwidth for streaming responses
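Because limits come from the upstream provider rather than Perplexica itself, clients typically retry rate-limited requests with exponential backoff. A minimal schedule sketch:

```python
def backoff_delays(base=1.0, factor=2.0, cap=30.0, retries=5):
    """Exponential backoff schedule (in seconds), capped at `cap`.

    Sleep for delays[n] before retry n when the upstream provider
    rejects a request for rate-limit reasons.
    """
    return [min(cap, base * factor ** n) for n in range(retries)]

delays = backoff_delays()  # [1.0, 2.0, 4.0, 8.0, 16.0]
```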

Next steps

  • Authentication - Learn about API authentication requirements
  • Search endpoint - Detailed documentation for the search API
