Video ad placement

Video ad placement is the second step in the Splyce pipeline. It takes a video file and a product name, sends the video to Gemini, and returns both a full scene breakdown and a precise ad placement decision, including the exact timestamp, a description of the visual edit, and the body location where the product should appear.

Video upload

Splyce uploads the video file directly to the Gemini Files API using the google-genai SDK. Gemini processes the video natively — no frame extraction or separate transcription is needed at this stage.
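A minimal sketch of this upload step, not Splyce's actual code. The helper name `upload_video`, the polling interval, and the timeout are illustrative assumptions. The `files.upload` and `files.get` calls come from the google-genai SDK; uploaded videos usually spend a short time in a `PROCESSING` state before they can be referenced in a prompt, so the sketch waits for `ACTIVE`.

```python
import time


def upload_video(client, path, poll_seconds=2.0, timeout_seconds=300.0):
    """Upload a video through the Gemini Files API and wait until it is usable.

    `client` is expected to be a google-genai client, for example
    `genai.Client(api_key=...)` after `from google import genai`.
    """
    video = client.files.upload(file=path)
    deadline = time.monotonic() + timeout_seconds

    # Poll until the file leaves the PROCESSING state.
    while video.state.name == "PROCESSING":
        if time.monotonic() > deadline:
            raise TimeoutError(f"{video.name} still processing after {timeout_seconds}s")
        time.sleep(poll_seconds)
        video = client.files.get(name=video.name)

    if video.state.name != "ACTIVE":
        raise RuntimeError(f"upload ended in state {video.state.name}")
    return video
```

The client is passed in rather than created inside the helper so that the same function can be exercised with a stub during testing.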

The maximum supported upload size is 200 MB by default (MAX_VIDEO_UPLOAD_MB). Files larger than this limit will be rejected before upload. If your clips exceed this size, trim or transcode them before sending.

Supported formats: .mp4, .mpeg, .mov, .webm, .mkv, .avi
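To catch oversized or unsupported clips before calling the API, a client-side pre-check along these lines may help. This is a sketch under assumptions: the function name `check_video` is hypothetical, and treating 1 MB as 1024 × 1024 bytes is a guess, since the text above does not say how Splyce counts megabytes.

```python
from pathlib import Path

MAX_VIDEO_UPLOAD_MB = 200  # default stated above
SUPPORTED_EXTENSIONS = {".mp4", ".mpeg", ".mov", ".webm", ".mkv", ".avi"}


def check_video(path, size_bytes=None, max_mb=MAX_VIDEO_UPLOAD_MB):
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    problems = []

    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {p.suffix or '(none)'}")

    # Read the size from disk unless the caller already knows it.
    if size_bytes is None:
        size_bytes = p.stat().st_size

    # Assumption: 1 MB = 1024 * 1024 bytes.
    if size_bytes > max_mb * 1024 * 1024:
        size_mb = size_bytes / (1024 * 1024)
        problems.append(f"file is {size_mb:.1f} MB; limit is {max_mb} MB")

    return problems
```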

Two-pass analysis

analyze_video_for_ad_placement() runs two sequential prompts against the uploaded video.

Pass 1: Scene breakdown

The first prompt instructs Gemini to produce a timestamped breakdown of the entire video: what is happening in each scene, who is on screen, what dialogue or audio is present, and what body parts are visible. This breakdown is stored in the scene_breakdown field of the response.

Pass 2: Ad placement selection

The second prompt uses the scene breakdown as context and asks Gemini to select the single best moment to insert a 3-second ad for the given product. Gemini evaluates each scene against three criteria:

- Character body position: a wrist, hand, or other body part is clearly visible and unobstructed
- Natural audio pause: dialogue or ambient sound has a gap that can accommodate a voiceover
- Scene plausibility: the product’s presence would feel contextually natural, not forced

The result is the ad_placement JSON object.
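The two-pass flow can be sketched as below. This is an illustration, not the body of `analyze_video_for_ad_placement()`: the `generate` callable is a hypothetical stand-in for "send this prompt together with the uploaded video to Gemini and return the text", and the prompt wording is paraphrased from the descriptions above rather than copied from Splyce.

```python
import json


def analyze_two_pass(generate, product):
    """Run the scene-breakdown prompt, then the placement prompt.

    `generate` is a hypothetical callable (prompt: str) -> str.
    """
    breakdown_prompt = (
        "Produce a timestamped breakdown of the entire video: what happens in "
        "each scene, who is on screen, what dialogue or audio is present, and "
        "which body parts are visible."
    )
    scene_breakdown = generate(breakdown_prompt)

    # Pass 2 receives the pass 1 output as context.
    placement_prompt = (
        f"Scene breakdown:\n{scene_breakdown}\n\n"
        f"Select the single best moment to insert a 3-second ad for {product}. "
        "Judge each scene on character body position, natural audio pause, "
        "and scene plausibility. Respond with one JSON object only."
    )
    raw = generate(placement_prompt)

    # Keep only the outermost JSON object in case the model adds extra text.
    ad_placement = json.loads(raw[raw.index("{"): raw.rindex("}") + 1])

    return {"scene_breakdown": scene_breakdown, "ad_placement": ad_placement}
```

Running the passes sequentially, rather than in one prompt, lets the placement decision refer to a complete breakdown instead of reasoning about the whole video in a single step.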

Ad placement JSON structure

The ad_placement object contains all information needed by the video generation step:

```json
{
  "ad_timestamp": "MM:SS",
  "ad_timestamp_seconds": 12.5,
  "placement_rationale": "why this moment was chosen",
  "scene_context": "what is happening on screen at this moment",
  "ad_description": {
    "visual": "product ON CHARACTER — which wrist, hand, or body part",
    "text_or_voiceover": "Oh wow, a Patek",
    "transition_in": "seamless",
    "transition_out": "seamless",
    "style_notes": "color grade, grain, and lighting to match the scene"
  },
  "edit_instruction": {
    "timestamp": "MM:SS",
    "whats_happening": "description of what is on screen",
    "adjustment": "add product_name ON the character at [exact body location]"
  }
}
```

Field reference

| Field | Type | Description |
| --- | --- | --- |
| `ad_timestamp` | string | Human-readable timestamp in `MM:SS` format |
| `ad_timestamp_seconds` | number | Machine-readable timestamp in seconds; used by the generation step for frame extraction and splicing |
| `placement_rationale` | string | Gemini’s explanation of why this moment was selected |
| `scene_context` | string | Description of what is happening on screen at the chosen moment |
| `ad_description.visual` | string | Specifies where on the character’s body the product should appear |
| `ad_description.text_or_voiceover` | string | The voiceover line to be generated by ElevenLabs |
| `ad_description.transition_in` | string | How the ad segment enters; typically `"seamless"` |
| `ad_description.transition_out` | string | How the ad segment exits; typically `"seamless"` |
| `ad_description.style_notes` | string | Visual style guidance for frame editing (color grade, grain, lighting) |
| `edit_instruction.timestamp` | string | Redundant `MM:SS` timestamp for human reference |
| `edit_instruction.whats_happening` | string | What is visible at this frame |
| `edit_instruction.adjustment` | string | Exact edit instruction passed to the Gemini image model |
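Because the object is produced by a model, it can be worth checking its shape before handing it to the generation step. The sketch below checks the fields and types listed above. The function names are hypothetical, and the rule that the two timestamps should differ by less than one second is an assumption, not documented Splyce behavior.

```python
REQUIRED_FIELDS = {
    "ad_timestamp": str,
    "ad_timestamp_seconds": (int, float),
    "placement_rationale": str,
    "scene_context": str,
    "ad_description.visual": str,
    "ad_description.text_or_voiceover": str,
    "ad_description.transition_in": str,
    "ad_description.transition_out": str,
    "ad_description.style_notes": str,
    "edit_instruction.timestamp": str,
    "edit_instruction.whats_happening": str,
    "edit_instruction.adjustment": str,
}


def mmss_to_seconds(stamp):
    """Convert an "MM:SS" string to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def validate_ad_placement(placement):
    """Return a list of problems found in an ad_placement dict."""
    problems = []
    for dotted, expected in REQUIRED_FIELDS.items():
        value = placement
        for key in dotted.split("."):  # walk nested keys such as a.b
            value = value.get(key) if isinstance(value, dict) else None
        if not isinstance(value, expected):
            problems.append(f"{dotted}: missing or wrong type")
    if problems:
        return problems

    # Assumption: both timestamps describe the same moment.
    try:
        whole = mmss_to_seconds(placement["ad_timestamp"])
    except ValueError:
        return ["ad_timestamp: not in MM:SS format"]
    if abs(whole - placement["ad_timestamp_seconds"]) >= 1:
        problems.append("ad_timestamp and ad_timestamp_seconds disagree")
    return problems
```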

Video caching and video_id

After the video is uploaded and analyzed, Splyce stores the local file path and analysis result on the server, indexed by a generated video_id. This ID is returned alongside the analysis in the /api/analyze-video response. You must pass this video_id to /api/generate-ad-video. Splyce uses it to locate the cached video for frame extraction and final splicing — the video file is not re-uploaded in step 3.

The server cache expires after 30 minutes. If you call /api/generate-ad-video after the cache has expired, Splyce will return a “video not found” error. Re-run /api/analyze-video to obtain a new video_id.
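The caching behavior described here can be pictured as a small in-memory store with a time-to-live. The sketch below is illustrative only: the class name `VideoCache`, the in-memory dictionary, and the use of UUID4 for IDs are assumptions, and Splyce's real storage may differ.

```python
import time
import uuid

CACHE_TTL_SECONDS = 30 * 60  # 30-minute expiry stated above


class VideoCache:
    """Map a generated video_id to a local file path and analysis result."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested
        self._entries = {}

    def put(self, video_path, analysis):
        video_id = str(uuid.uuid4())
        self._entries[video_id] = (self._clock(), video_path, analysis)
        return video_id

    def get(self, video_id):
        entry = self._entries.get(video_id)
        if entry is None or self._clock() - entry[0] > self._ttl:
            # Drop expired entries so they behave like unknown IDs.
            self._entries.pop(video_id, None)
            raise KeyError("video not found")
        return entry[1], entry[2]
```

A caller that receives the "video not found" error has no way to refresh the entry; the only recovery is the one described above, which is to re-run the analysis and use the new ID.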

Full response structure

The /api/analyze-video endpoint returns:

```json
{
  "video_id": "uuid-string",
  "product": "Patek Philippe Aquanaut 5167A-001",
  "scene_breakdown": "...",
  "ad_placement": { ... }
}
```

Pass the entire response object as the analysis field in the /api/generate-ad-video request body.
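As a sketch of how a client might assemble the follow-up request, the helper below wraps the analyze-video response in an `analysis` field. The helper name is hypothetical, and sending `video_id` as a separate top-level key is an assumption; this page says the ID must be passed but does not name the request field.

```python
def build_generate_request(analyze_response):
    """Build a request body for /api/generate-ad-video.

    `analyze_response` is the parsed JSON returned by /api/analyze-video.
    """
    missing = [k for k in ("video_id", "ad_placement") if k not in analyze_response]
    if missing:
        raise ValueError(f"analyze-video response is missing: {', '.join(missing)}")

    return {
        # Assumption: the ID is sent as a top-level "video_id" key.
        "video_id": analyze_response["video_id"],
        # The entire response object goes in the analysis field.
        "analysis": analyze_response,
    }
```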
