Video upload
Splyce uploads the video file directly to the Gemini Files API using the google-genai SDK. Gemini processes the video natively, so no frame extraction or separate transcription is needed at this stage.
Supported formats: .mp4, .mpeg, .mov, .webm, .mkv, .avi
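As a rough sketch of this step, the snippet below checks a file against the supported extensions and then uploads it with the google-genai SDK. The helper names `is_supported_video` and `upload_video` are hypothetical and not part of Splyce. The sketch also assumes the API key is available in the environment, and Splyce's actual upload code may differ.

```python
from pathlib import Path

# Extensions taken from the supported-formats list in this section.
SUPPORTED_EXTENSIONS = {".mp4", ".mpeg", ".mov", ".webm", ".mkv", ".avi"}


def is_supported_video(path: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def upload_video(path: str):
    """Upload a video via the Gemini Files API (hypothetical helper)."""
    if not is_supported_video(path):
        raise ValueError(f"Unsupported video format: {Path(path).suffix or '(none)'}")

    # Imported lazily so the format check works without the SDK installed.
    from google import genai

    client = genai.Client()  # assumption: API key is read from the environment
    return client.files.upload(file=path)
```

Rejecting unsupported extensions before the upload avoids a network round trip for files that would not be accepted anyway.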
Two-pass analysis
analyze_video_for_ad_placement() runs two sequential prompts against the uploaded video.
Pass 1: Scene breakdown
The first prompt instructs Gemini to produce a timestamped breakdown of the entire video: what is happening in each scene, who is on screen, what dialogue or audio is present, and what body parts are visible. This breakdown is stored in the scene_breakdown field of the response.
Pass 2: Ad placement selection
The second prompt uses the scene breakdown as context and asks Gemini to select the single best moment to insert a 3-second ad for the given product. Gemini evaluates each scene against three criteria:
- Character body position: a wrist, hand, or other body part is clearly visible and unobstructed
- Natural audio pause: dialogue or ambient sound has a gap that can accommodate a voiceover
- Scene plausibility: the product’s presence would feel contextually natural, not forced
Gemini returns its selection as an ad_placement JSON object.
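The two-pass flow can be sketched as follows. This is an illustration only: `two_pass_analysis` and the `ask` callable (a stand-in for one model call against the uploaded video) are hypothetical names, and the prompt wording is invented for the example rather than taken from Splyce.

```python
import json
from typing import Callable


def two_pass_analysis(ask: Callable[[str], str], product: str) -> dict:
    """Run a scene-breakdown prompt, then a placement prompt that reuses it.

    `ask` is a hypothetical stand-in for a single model call against the
    uploaded video: it takes a prompt and returns the reply text.
    """
    # Pass 1: timestamped breakdown of the whole video.
    scene_breakdown = ask(
        "Produce a timestamped breakdown of the entire video: what happens "
        "in each scene, who is on screen, what dialogue or audio is present, "
        "and what body parts are visible."
    )

    # Pass 2: feed the breakdown back in and ask for one placement as JSON.
    placement_prompt = (
        f"Scene breakdown:\n{scene_breakdown}\n\n"
        f"Select the single best moment to insert a 3-second ad for {product}. "
        "Weigh character body position, natural audio pauses, and scene "
        "plausibility. Reply with a JSON object only."
    )
    ad_placement = json.loads(ask(placement_prompt))

    return {"scene_breakdown": scene_breakdown, "ad_placement": ad_placement}
```

Passing the model call in as a parameter keeps the sequencing logic testable without network access. A real implementation would also need to handle replies that are not valid JSON.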
Ad placement JSON structure
The ad_placement object contains all information needed by the video generation step.
Field reference
| Field | Type | Description |
|---|---|---|
| ad_timestamp | string | Human-readable timestamp in MM:SS format |
| ad_timestamp_seconds | number | Machine-readable timestamp in seconds; used by the generation step for frame extraction and splicing |
| placement_rationale | string | Gemini’s explanation of why this moment was selected |
| scene_context | string | Description of what is happening on screen at the chosen moment |
| ad_description.visual | string | Specifies where on the character’s body the product should appear |
| ad_description.text_or_voiceover | string | The voiceover line to be generated by ElevenLabs |
| ad_description.transition_in | string | How the ad segment enters; typically "seamless" |
| ad_description.transition_out | string | How the ad segment exits; typically "seamless" |
| ad_description.style_notes | string | Visual style guidance for frame editing (color grade, grain, lighting) |
| edit_instruction.timestamp | string | Redundant MM:SS timestamp for human reference |
| edit_instruction.whats_happening | string | What is visible at this frame |
| edit_instruction.adjustment | string | Exact edit instruction passed to the Gemini image model |
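A client consuming this object might sanity-check it before handing it to the generation step. The sketch below assumes that the dotted names in the table correspond to nested JSON objects and that every listed field is required. Neither assumption is confirmed here, and `missing_fields` and `mmss_to_seconds` are hypothetical helpers.

```python
from typing import List

# Field paths from the reference table; dots are assumed to mean nesting.
EXPECTED_FIELDS = [
    "ad_timestamp",
    "ad_timestamp_seconds",
    "placement_rationale",
    "scene_context",
    "ad_description.visual",
    "ad_description.text_or_voiceover",
    "ad_description.transition_in",
    "ad_description.transition_out",
    "ad_description.style_notes",
    "edit_instruction.timestamp",
    "edit_instruction.whats_happening",
    "edit_instruction.adjustment",
]


def missing_fields(ad_placement: dict) -> List[str]:
    """Return the expected field paths that are absent from the object."""
    missing = []
    for dotted in EXPECTED_FIELDS:
        node = ad_placement
        for key in dotted.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(dotted)
                break
            node = node[key]
    return missing


def mmss_to_seconds(timestamp: str) -> int:
    """Convert an MM:SS string to whole seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)
```

Comparing `mmss_to_seconds(ad_timestamp)` with `ad_timestamp_seconds` is one cheap way to catch a response where the two timestamps disagree.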
Video caching and video_id
After the video is uploaded and analyzed, Splyce stores the local file path and analysis result on the server, indexed by a generated video_id. This ID is returned alongside the analysis in the /api/analyze-video response.
You must pass this video_id to /api/generate-ad-video. Splyce uses it to locate the cached video for frame extraction and final splicing — the video file is not re-uploaded in step 3.
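To illustrate the idea, a server-side cache of this kind could be a dictionary keyed by video_id with a time-to-live. The `VideoCache` class below is a hypothetical sketch, not Splyce's implementation. It defaults to the 30-minute expiry stated in this section and raises a "video not found" error for unknown or expired IDs.

```python
import time
import uuid
from typing import Callable, Dict, Tuple


class VideoCache:
    """In-memory cache keyed by video_id with time-based expiry (illustrative)."""

    def __init__(
        self,
        ttl_seconds: float = 30 * 60,  # 30 minutes, per the stated expiry
        clock: Callable[[], float] = time.monotonic,
    ):
        self._ttl = ttl_seconds
        self._clock = clock
        # video_id -> (expiry time, local video path, analysis result)
        self._entries: Dict[str, Tuple[float, str, dict]] = {}

    def put(self, video_path: str, analysis: dict) -> str:
        """Store a path and analysis, returning a freshly generated video_id."""
        video_id = uuid.uuid4().hex
        self._entries[video_id] = (self._clock() + self._ttl, video_path, analysis)
        return video_id

    def get(self, video_id: str) -> Tuple[str, dict]:
        """Return (video_path, analysis), or raise KeyError if missing or expired."""
        entry = self._entries.get(video_id)
        if entry is None or self._clock() >= entry[0]:
            self._entries.pop(video_id, None)  # drop stale entries eagerly
            raise KeyError("video not found")
        return entry[1], entry[2]
```

The clock is injectable so that expiry can be exercised in tests without waiting. A production cache would also need to be safe under concurrent requests.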
The server cache expires after 30 minutes. If you call /api/generate-ad-video after the cache has expired, Splyce will return a “video not found” error. Re-run /api/analyze-video to obtain a new video_id.
Full response structure
The /api/analyze-video endpoint returns the analysis together with the generated video_id.
Pass the returned analysis as the analysis field in the /api/generate-ad-video request body.