Perplexica’s Search API supports streaming responses, allowing you to receive results incrementally as they’re generated. This provides a better user experience by displaying partial results immediately rather than waiting for the complete response.

Enable streaming

To enable streaming, set the stream parameter to true in your search request:
{
  "chatModel": {
    "providerId": "550e8400-e29b-41d4-a716-446655440000",
    "key": "gpt-4o-mini"
  },
  "embeddingModel": {
    "providerId": "550e8400-e29b-41d4-a716-446655440000",
    "key": "text-embedding-3-large"
  },
  "sources": ["web"],
  "query": "What is Perplexica",
  "stream": true
}
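
As a quick sketch, the request above can be sent with curl (assuming Perplexica is running on the default local endpoint used elsewhere in these docs; `-N` disables output buffering so chunks print as they arrive):

```shell
curl -N http://localhost:3000/api/search \
  -H 'Content-Type: application/json' \
  -d '{
    "chatModel": {
      "providerId": "550e8400-e29b-41d4-a716-446655440000",
      "key": "gpt-4o-mini"
    },
    "embeddingModel": {
      "providerId": "550e8400-e29b-41d4-a716-446655440000",
      "key": "text-embedding-3-large"
    },
    "sources": ["web"],
    "query": "What is Perplexica",
    "stream": true
  }'
```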

Response format

When streaming is enabled, the API returns a stream using Server-Sent Events (SSE) with Content-Type: text/event-stream. Each line in the stream contains a complete, valid JSON object.

Stream headers

The streaming response includes these headers:
  • Content-Type: text/event-stream
  • Cache-Control: no-cache, no-transform
  • Connection: keep-alive

Message types

The stream sends different message types during the search process:

init

Sent when the stream connection is established:
{"type":"init","data":"Stream connected"}

sources

Sent once with all sources used to generate the response:
{
  "type": "sources",
  "data": [
    {
      "content": "Perplexica is an innovative, open-source AI-powered search engine...",
      "metadata": {
        "title": "What is Perplexica",
        "url": "https://example.com/perplexica"
      }
    }
  ]
}
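
As one possible use of this message, the snippet below sketches turning a sources message into a numbered citation list. The field names match the message format above; the formatting helper itself is hypothetical, not part of the API:

```javascript
// Sample "sources" message, with the shape shown above.
const message = {
  type: 'sources',
  data: [
    {
      content: 'Perplexica is an innovative, open-source AI-powered search engine...',
      metadata: { title: 'What is Perplexica', url: 'https://example.com/perplexica' }
    }
  ]
};

// Build one "[n] title - url" line per source.
const citations = message.data.map(
  (source, i) => `[${i + 1}] ${source.metadata.title} - ${source.metadata.url}`
);
console.log(citations.join('\n'));
// → [1] What is Perplexica - https://example.com/perplexica
```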

response

Sent multiple times with chunks of the generated answer:
{"type":"response","data":"Perplexica is an "}
{"type":"response","data":"innovative, open-source "}
{"type":"response","data":"AI-powered search engine..."}

done

Sent when the stream is complete:
{"type":"done"}

Complete example

Here’s what a complete streaming response looks like:
{"type":"init","data":"Stream connected"}
{"type":"sources","data":[{"content":"...","metadata":{"title":"...","url":"..."}}]}
{"type":"response","data":"Perplexica is an "}
{"type":"response","data":"innovative, open-source "}
{"type":"response","data":"AI-powered search engine..."}
{"type":"done"}

Consuming the stream

Here's an example of how to consume the streaming API in JavaScript:
const response = await fetch('http://localhost:3000/api/search', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    chatModel: {
      providerId: '550e8400-e29b-41d4-a716-446655440000',
      key: 'gpt-4o-mini'
    },
    embeddingModel: {
      providerId: '550e8400-e29b-41d4-a716-446655440000',
      key: 'text-embedding-3-large'
    },
    sources: ['web'],
    query: 'What is Perplexica',
    stream: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // Decode with stream: true so multi-byte characters split across
  // network chunks are handled correctly.
  buffer += decoder.decode(value, { stream: true });

  // A chunk may end mid-line, so keep any trailing partial line in
  // the buffer until the next chunk arrives.
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (!line.trim()) continue;
    const message = JSON.parse(line);

    switch (message.type) {
      case 'init':
        console.log('Stream connected');
        break;
      case 'sources':
        console.log('Sources:', message.data);
        break;
      case 'response':
        process.stdout.write(message.data);
        break;
      case 'done':
        console.log('\nStream complete');
        break;
    }
  }
}
Each line in the stream is a complete JSON object. Make sure to parse each line separately rather than treating the entire stream as a single JSON document.

Stream lifecycle

  1. Connection - The stream begins with an init message
  2. Sources - A sources message contains all references used
  3. Content - Multiple response messages deliver the answer incrementally
  4. Completion - A done message signals the end of the stream
If the client disconnects before the stream completes, the server will automatically clean up resources and stop processing the request.
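One standard way to disconnect deliberately from the client side is the fetch API's AbortController: pass its signal to fetch, and calling abort() cancels the request and the pending reads. A minimal sketch (the fetch call is elided because it needs a running server):

```javascript
// Create a controller and hand its signal to fetch, e.g.:
//   fetch('http://localhost:3000/api/search', { signal: controller.signal, ... })
const controller = new AbortController();

// Aborting cancels the in-flight request; pending reader.read() calls
// reject with an AbortError, and the server stops processing.
controller.abort();
console.log(controller.signal.aborted);
// → true
```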

Non-streaming mode

If you prefer to receive the complete response at once, set stream to false or omit the parameter (defaults to false):
{
  "chatModel": { ... },
  "embeddingModel": { ... },
  "sources": ["web"],
  "query": "What is Perplexica",
  "stream": false
}
The response will be a standard JSON object:
{
  "message": "Perplexica is an innovative, open-source AI-powered search engine...",
  "sources": [
    {
      "content": "...",
      "metadata": {
        "title": "...",
        "url": "..."
      }
    }
  ]
}
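With streaming off, the whole payload arrives as a single JSON object, so the client parses it once instead of line by line. A small sketch using placeholder values in the shape shown above:

```javascript
// Placeholder body in the non-streaming response shape (not real API output).
const body = JSON.parse(
  '{"message":"Perplexica is an innovative, open-source AI-powered search engine...",' +
  '"sources":[{"content":"...","metadata":{"title":"...","url":"..."}}]}'
);

// The answer and its sources are available immediately.
console.log(body.message);
console.log(body.sources.length);
```

In a real client this would be `const body = await response.json();` on a fetch response with `stream: false`.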
