Video ad placement is the second step in the Splyce pipeline. It takes a video file and a product name, runs them through Gemini, and returns both a full scene breakdown and a precise ad placement decision, including the exact timestamp, the visual edit description, and the body location where the product should appear.

Video upload

Splyce uploads the video file directly to the Gemini Files API using the google-genai SDK. Gemini processes the video natively — no frame extraction or separate transcription is needed at this stage.
The maximum upload size is 200 MB by default, set by MAX_VIDEO_UPLOAD_MB. Larger files are rejected before upload, so trim or transcode any clip that exceeds the limit before sending it.
Supported formats: .mp4, .mpeg, .mov, .webm, .mkv, .avi
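
A pre-upload check along these lines can catch rejected files on the client side. This is a minimal sketch, not Splyce's actual validation: the function name check_video is hypothetical, and it assumes MAX_VIDEO_UPLOAD_MB is read from an environment variable and that 1 MB means 1024 × 1024 bytes.

```python
import os

# Assumption: the limit comes from an environment variable with this name;
# the docs only state the name and the 200 MB default.
MAX_VIDEO_UPLOAD_MB = int(os.environ.get("MAX_VIDEO_UPLOAD_MB", "200"))
SUPPORTED_EXTENSIONS = {".mp4", ".mpeg", ".mov", ".webm", ".mkv", ".avi"}


def check_video(path: str, size_bytes: int, max_mb: int = MAX_VIDEO_UPLOAD_MB) -> None:
    """Raise ValueError if the file would be rejected before upload."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or '(none)'}")
    # Assumption: 1 MB is counted as 1024 * 1024 bytes.
    if size_bytes > max_mb * 1024 * 1024:
        raise ValueError(f"file is larger than {max_mb} MB")
```

In practice the size would come from os.path.getsize(path); it is passed in here so the check can be exercised without a real file.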

Two-pass analysis

analyze_video_for_ad_placement() runs two sequential prompts against the uploaded video.

Pass 1: Scene breakdown

The first prompt instructs Gemini to produce a timestamped breakdown of the entire video: what is happening in each scene, who is on screen, what dialogue or audio is present, and what body parts are visible. This breakdown is stored in the scene_breakdown field of the response.

Pass 2: Ad placement selection

The second prompt uses the scene breakdown as context and asks Gemini to select the single best moment to insert a 3-second ad for the given product. Gemini evaluates each scene against three criteria:
  1. Character body position — a wrist, hand, or other body part is clearly visible and unobstructed
  2. Natural audio pause — dialogue or ambient sound has a gap that can accommodate a voiceover
  3. Scene plausibility — the product’s presence would feel contextually natural, not forced
The result is the ad_placement JSON object.
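
The two-pass flow can be sketched as follows. This is an illustration under stated assumptions, not the body of analyze_video_for_ad_placement(): the prompt texts are placeholders, the function name two_pass_analysis is hypothetical, and ask stands in for a single model call against the uploaded video.

```python
import json
from typing import Callable

# Hypothetical prompt texts; the actual Splyce prompts are not shown here.
SCENE_PROMPT = "Produce a timestamped scene breakdown of this video."
PLACEMENT_PROMPT = (
    "Scene breakdown:\n{breakdown}\n\n"
    "Select the single best moment for a 3-second ad for {product}. "
    "Reply with JSON only."
)


def two_pass_analysis(ask: Callable[[str], str], product: str) -> dict:
    """Run the breakdown prompt, then feed its output into the placement prompt.

    `ask` is a stand-in for one model call against the uploaded video.
    """
    # Pass 1: free-text scene breakdown.
    scene_breakdown = ask(SCENE_PROMPT)
    # Pass 2: the breakdown becomes context for the placement decision.
    raw = ask(PLACEMENT_PROMPT.format(breakdown=scene_breakdown, product=product))
    return {
        "product": product,
        "scene_breakdown": scene_breakdown,
        "ad_placement": json.loads(raw),
    }
```

Injecting ask keeps the sequencing visible and lets the flow run with a stub in place of a real Gemini call.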

Ad placement JSON structure

The ad_placement object contains all information needed by the video generation step:
{
  "ad_timestamp": "MM:SS",
  "ad_timestamp_seconds": 12.5,
  "placement_rationale": "why this moment was chosen",
  "scene_context": "what is happening on screen at this moment",
  "ad_description": {
    "visual": "product ON CHARACTER — which wrist, hand, or body part",
    "text_or_voiceover": "Oh wow, a Patek",
    "transition_in": "seamless",
    "transition_out": "seamless",
    "style_notes": "color grade, grain, and lighting to match the scene"
  },
  "edit_instruction": {
    "timestamp": "MM:SS",
    "whats_happening": "description of what is on screen",
    "adjustment": "add product_name ON the character at [exact body location]"
  }
}

Field reference

| Field | Type | Description |
| --- | --- | --- |
| ad_timestamp | string | Human-readable timestamp in MM:SS format |
| ad_timestamp_seconds | number | Machine-readable timestamp in seconds; used by the generation step for frame extraction and splicing |
| placement_rationale | string | Gemini’s explanation of why this moment was selected |
| scene_context | string | Description of what is happening on screen at the chosen moment |
| ad_description.visual | string | Specifies where on the character’s body the product should appear |
| ad_description.text_or_voiceover | string | The voiceover line to be generated by ElevenLabs |
| ad_description.transition_in | string | How the ad segment enters; typically "seamless" |
| ad_description.transition_out | string | How the ad segment exits; typically "seamless" |
| ad_description.style_notes | string | Visual style guidance for frame editing (color grade, grain, lighting) |
| edit_instruction.timestamp | string | Redundant MM:SS timestamp for human reference |
| edit_instruction.whats_happening | string | What is visible at this frame |
| edit_instruction.adjustment | string | Exact edit instruction passed to the Gemini image model |
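
Because the placement object is model output, a caller may want to sanity-check it before starting the generation step. The sketch below is one way to do that; the helper names are hypothetical, and the rule that ad_timestamp and ad_timestamp_seconds should agree to within one second is an assumption, not documented Splyce behavior.

```python
from typing import List

TOP_LEVEL = ["ad_timestamp", "ad_timestamp_seconds", "placement_rationale", "scene_context"]
NESTED = {
    "ad_description": ["visual", "text_or_voiceover", "transition_in", "transition_out", "style_notes"],
    "edit_instruction": ["timestamp", "whats_happening", "adjustment"],
}


def mmss_to_seconds(ts: str) -> int:
    """Convert an MM:SS string such as "01:05" to whole seconds."""
    minutes, seconds = ts.split(":")
    return int(minutes) * 60 + int(seconds)


def find_problems(placement: dict) -> List[str]:
    """Return a list of problems; an empty list means the object looks usable."""
    problems = [f"missing {k}" for k in TOP_LEVEL if k not in placement]
    for parent, keys in NESTED.items():
        child = placement.get(parent, {})
        problems += [f"missing {parent}.{k}" for k in keys if k not in child]
    if not problems:
        # Assumption: the MM:SS value is the seconds value to within one second.
        drift = abs(mmss_to_seconds(placement["ad_timestamp"]) - placement["ad_timestamp_seconds"])
        if drift >= 1:
            problems.append("ad_timestamp and ad_timestamp_seconds disagree")
    return problems
```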

Video caching and video_id

After the video is uploaded and analyzed, Splyce stores the local file path and analysis result on the server, indexed by a generated video_id. This ID is returned alongside the analysis in the /api/analyze-video response. You must pass this video_id to /api/generate-ad-video. Splyce uses it to locate the cached video for frame extraction and final splicing — the video file is not re-uploaded in step 3.
The server cache expires after 30 minutes. If you call /api/generate-ad-video after the cache has expired, Splyce will return a “video not found” error. Re-run /api/analyze-video to obtain a new video_id.
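
The caching behavior described above can be pictured as an in-memory store keyed by video_id with a 30-minute lifetime. This is a sketch of the idea only: the class VideoCache, its methods, and the use of a UUID for the ID are assumptions, and the real server may store and expire entries differently.

```python
import time
import uuid
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 30 * 60  # the 30-minute expiry described above


class VideoCache:
    """Hypothetical in-memory cache of (local path, analysis) keyed by video_id."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str, dict]] = {}

    def put(self, path: str, analysis: dict) -> str:
        # Assumption: the generated video_id is a random UUID string.
        video_id = str(uuid.uuid4())
        self._entries[video_id] = (self._clock(), path, analysis)
        return video_id

    def get(self, video_id: str) -> Tuple[str, dict]:
        entry = self._entries.get(video_id)
        if entry is None or self._clock() - entry[0] > CACHE_TTL_SECONDS:
            # Drop expired entries so they behave like unknown IDs.
            self._entries.pop(video_id, None)
            raise KeyError("video not found")
        return entry[1], entry[2]
```

The clock is injectable so that expiry can be tested without waiting 30 minutes.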

Full response structure

The /api/analyze-video endpoint returns:
{
  "video_id": "uuid-string",
  "product": "Patek Philippe Aquanaut 5167A-001",
  "scene_breakdown": "...",
  "ad_placement": { ... }
}
Pass the entire response object as the analysis field in the /api/generate-ad-video request body.
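
A client could build the generation request body as sketched below. The function name build_generate_request is hypothetical, and the sketch assumes the body is JSON and shows only the analysis field named above; any other fields that /api/generate-ad-video may require are not covered here.

```python
import json


def build_generate_request(analyze_response: dict) -> str:
    """Wrap the full /api/analyze-video response under the `analysis` key."""
    if "video_id" not in analyze_response:
        raise ValueError("analyze response has no video_id; re-run /api/analyze-video")
    # Assumption: the endpoint accepts a JSON body; only `analysis` is shown.
    return json.dumps({"analysis": analyze_response})
```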
