The `/api/generate` endpoint uses streaming responses via Server-Sent Events (SSE) to provide real-time feedback as code is generated.
When Streaming is Used
Streaming is only used for initial generation:

- When `isFollowUp` is `false` or not provided
- When creating a new animation from scratch
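The gate above can be sketched as a small predicate. The `isFollowUp` field comes from this doc; the function name and request shape are illustrative:

```typescript
// Illustrative request shape; only isFollowUp is documented here.
interface GenerateRequest {
  prompt: string;
  isFollowUp?: boolean;
}

// Stream only for initial generation: isFollowUp is false or not provided.
function shouldStream(body: GenerateRequest): boolean {
  return body.isFollowUp !== true;
}
```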
Response Format
The endpoint uses the Vercel AI SDK’s `toUIMessageStreamResponse()` method, which returns a stream of Server-Sent Events.
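Concretely, each event in an SSE body is a `data:` line carrying a JSON payload, terminated by a blank line. A minimal encoder, shown for illustration rather than as the SDK’s actual internals:

```typescript
// Encode one event as an SSE frame: a "data:" line plus a blank-line terminator.
function toSSEFrame(event: object): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}
```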
Stream Structure
Metadata Event
Before the AI-generated code stream begins, the endpoint prepends a metadata event containing detected skills:

Metadata Event Structure
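A plausible shape for that metadata event — the field names are assumptions, since the doc does not spell them out:

```typescript
// Assumed shape of the prepended metadata event (illustrative field names).
interface MetadataEvent {
  type: "metadata";
  detectedSkills: string[];
}

// As it might appear on the wire, and how a client would decode it:
const frame = 'data: {"type":"metadata","detectedSkills":["rotation","easing"]}\n\n';
const parsed: MetadataEvent = JSON.parse(frame.slice("data: ".length));
```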
AI SDK Stream Events
After the metadata event, the stream contains events from the Vercel AI SDK:

Text Delta Events
As the code is generated token-by-token:

Reasoning Events (Optional)
If the model supports reasoning (like o1-mini), reasoning tokens are included when `sendReasoning: true` is set:
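The event kinds above can be modeled as a small union. The event names below are assumptions — check the exact types emitted by the AI SDK version in use:

```typescript
// Assumed event shapes (names illustrative, not confirmed by this doc).
type StreamEvent =
  | { type: "text-delta"; delta: string }       // one chunk of generated code
  | { type: "reasoning-delta"; delta: string }  // present when sendReasoning: true
  | { type: "finish" };

// A client accumulates code by concatenating text deltas, ignoring reasoning.
function collectCode(events: StreamEvent[]): string {
  let code = "";
  for (const e of events) {
    if (e.type === "text-delta") code += e.delta;
  }
  return code;
}
```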
Finish Event
When generation completes:

Client-Side Consumption
Here’s how to consume the streaming response in the browser:

Using Fetch API
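A minimal sketch: read the response body with a stream reader and decode `data:` frames. The parsing helper assumes each chunk contains whole frames (a production client should buffer partial frames across reads), and the payload shapes are assumed, not confirmed:

```typescript
// Split a decoded chunk into SSE frames and parse each "data:" payload as JSON.
// Naive: assumes frames are not split across chunk boundaries.
function parseSSEChunk(chunk: string): object[] {
  return chunk
    .split("\n\n")
    .filter((frame) => frame.startsWith("data: "))
    .map((frame) => JSON.parse(frame.slice("data: ".length)));
}

// Illustrative consumer loop for the streaming endpoint.
async function consume(prompt: string): Promise<void> {
  const res = await fetch("/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const event of parseSSEChunk(decoder.decode(value, { stream: true }))) {
      console.log(event); // e.g. append text-delta payloads to the editor
    }
  }
}
```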
Using Vercel AI SDK (React)
The Vercel AI SDK provides React hooks that handle streaming automatically:

Event Sequence Example
Here’s a complete example of the event sequence for a simple animation:

Benefits of Streaming
- Real-time Feedback: Users see code appearing as it’s generated
- Better UX: No waiting for complete response before showing anything
- Early Errors: Syntax errors can be detected before generation completes
- Progress Indication: Natural loading state without custom spinners
Why Follow-ups Don’t Stream
Follow-up edits use non-streaming responses because:

- Edit operations (find/replace) must be atomic
- Structured output is required (edits array or full code)
- Faster response time (no token-by-token overhead)
- Deterministic result needed before applying changes
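For contrast, the follow-up path can be pictured as a single JSON payload applied atomically. The response shape and helper below are illustrative assumptions, not the endpoint’s confirmed contract:

```typescript
// Assumed non-streaming follow-up response: either find/replace edits or full code.
type FollowUpResponse =
  | { edits: { find: string; replace: string }[] }
  | { code: string };

// Apply edits atomically: verify every target exists before changing anything,
// so a partial match never leaves the source half-edited.
function applyEdits(source: string, edits: { find: string; replace: string }[]): string {
  for (const e of edits) {
    if (!source.includes(e.find)) {
      throw new Error(`edit target not found: ${e.find}`);
    }
  }
  let out = source;
  for (const e of edits) out = out.replace(e.find, e.replace);
  return out;
}
```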