Video Generation

POST /api/v1/videos

Generate a complete short video. The request is processed asynchronously: the response returns a task_id that you poll for status.

curl -X POST http://localhost:8080/api/v1/videos \
  -H "Content-Type: application/json" \
  -d '{
    "video_subject": "benefits of morning exercise",
    "video_aspect": "9:16",
    "video_count": 1
  }'

{
  "status": 200,
  "message": "success",
  "data": {
    "task_id": "6c85c8cc-a77a-42b9-bc30-947815aa0558"
  }
}
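The same call can be made from a script. The following is a minimal Python sketch using only the standard library, assuming the server runs at the host and port shown in the curl example. The helper names (build_video_request, extract_task_id, submit_video) are illustrative and not part of the API, and treating any body status other than 200 as a failure is an assumption based on the sample response.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # host and port taken from the curl example


def build_video_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/videos."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/videos",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_task_id(response_text: str) -> str:
    """Read data.task_id from a response shaped like the sample above."""
    parsed = json.loads(response_text)
    # Assumption: a body status other than 200 signals a failed request.
    if parsed.get("status") != 200:
        raise RuntimeError(f"request failed: {parsed.get('message')}")
    return parsed["data"]["task_id"]


def submit_video(payload: dict) -> str:
    """Send the request and return the task_id (requires a running server)."""
    with urllib.request.urlopen(build_video_request(payload)) as resp:
        return extract_task_id(resp.read().decode("utf-8"))
```

Calling submit_video({"video_subject": "benefits of morning exercise"}) against a running server should return the task_id string from the response.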

Request parameters

video_subject

string

required

The topic or keyword for the video. The LLM uses this to write the script and to generate the search terms used to find footage.

video_script

string

A custom script to use instead of AI generation. When provided, the LLM script generation step is skipped.

video_terms

string or array

Custom search keywords for finding footage. If omitted, the LLM generates them from the subject and script.

video_aspect

string

default:"9:16"

Output video aspect ratio. Options: "9:16" (portrait, 1080×1920), "16:9" (landscape, 1920×1080), "1:1" (square, 1080×1080).
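A client that needs the output dimensions ahead of time (for example, to size a preview) can mirror the mapping listed above. This is a small sketch; resolution_for is an illustrative helper, not an API function.

```python
# Aspect ratio to (width, height), as listed in the video_aspect description.
ASPECT_RESOLUTIONS = {
    "9:16": (1080, 1920),  # portrait
    "16:9": (1920, 1080),  # landscape
    "1:1": (1080, 1080),   # square
}


def resolution_for(video_aspect: str = "9:16") -> tuple[int, int]:
    """Return (width, height) for a documented video_aspect value."""
    try:
        return ASPECT_RESOLUTIONS[video_aspect]
    except KeyError:
        raise ValueError(f"unsupported video_aspect: {video_aspect!r}") from None
```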

video_concat_mode

string

default:"random"

How footage clips are ordered. "random" shuffles clips; "sequential" uses them in retrieval order.
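The difference between the two modes can be illustrated as follows. This is only a sketch of the described behavior, not the server's implementation; order_clips and its seed argument are hypothetical.

```python
import random


def order_clips(clips, mode="random", seed=None):
    """Return clips in the order implied by video_concat_mode."""
    ordered = list(clips)  # copy so the caller's list is left untouched
    if mode == "random":
        # A seed is accepted only to make this illustration repeatable.
        random.Random(seed).shuffle(ordered)
    elif mode != "sequential":
        raise ValueError(f"unsupported video_concat_mode: {mode!r}")
    return ordered
```

With mode="sequential" the input order is returned unchanged; with mode="random" the same clips come back in a shuffled order.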

video_transition_mode

string

default:"null"

Transition effect between clips. Options: null (none), "Shuffle", "FadeIn", "FadeOut", "SlideIn", "SlideOut".

video_clip_duration

integer

default:"5"

Duration in seconds of each footage clip before cutting to the next.

video_count

integer

default:"1"

Number of video variants to generate. Each variant uses a different random selection of footage clips.

video_source

string

default:"pexels"

Where to source footage. Options: "pexels", "pixabay", "local".

video_materials

array

List of specific local material objects. Each object has provider (string), url (string), and duration (integer) fields. Used when video_source is "local".
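A request that uses local footage might be assembled as in the sketch below. The field names come from the description above; the provider value "local" and the example file path are assumptions made for illustration, and both helper functions are hypothetical.

```python
def local_material(url: str, duration: int, provider: str = "local") -> dict:
    """Build one material object with the three documented fields.

    Assumption: "local" is an acceptable provider value for local files.
    """
    return {"provider": provider, "url": url, "duration": duration}


def local_video_payload(subject: str, materials: list) -> dict:
    """Build a /api/v1/videos payload that sources footage locally."""
    return {
        "video_subject": subject,
        "video_source": "local",
        "video_materials": materials,
    }


# Hypothetical path; replace with a file that exists on the server.
payload = local_video_payload(
    "benefits of morning exercise",
    [local_material("/path/to/clip1.mp4", 8)],
)
```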

custom_audio_file

string

Path to a custom audio file on the server. When set, TTS is skipped and subtitles are disabled.

video_language

string

default:""

Language code (e.g. "en", "zh", "fr"). Leave empty for auto-detection from the script.

voice_name

string

Name of the TTS voice to use. See Voice Synthesis for available voices.

voice_volume

float

default:"1.0"

Voice narration volume. Range: 0.0–1.0.

voice_rate

float

default:"1.0"

Speech speed multiplier. 1.2 is 20% faster than normal.
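To estimate how the multiplier affects narration length, the sketch below assumes voice_rate scales speaking speed linearly, so duration is divided by the rate. The TTS engine may not behave exactly this way, and narration_seconds is a hypothetical helper.

```python
def narration_seconds(base_seconds: float, voice_rate: float = 1.0) -> float:
    """Estimate narration length at a given voice_rate.

    Assumption: speed scales linearly, so duration = base / rate.
    """
    if voice_rate <= 0:
        raise ValueError("voice_rate must be positive")
    return base_seconds / voice_rate
```

Under that assumption, a narration that takes 60 seconds at the default rate takes about 50 seconds at voice_rate 1.2.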

bgm_type

string

default:"random"

Background music selection mode. "random" picks a random track; "" disables music.

bgm_file

string

Specific MP3 filename from the resource/songs/ directory to use as background music.

bgm_volume

float

default:"0.2"

Background music volume. Range: 0.0–1.0.

subtitle_enabled

boolean

default:"true"

Whether to generate and burn subtitles into the video.

subtitle_position

string

default:"bottom"

Subtitle vertical position: "bottom", "top", or "center".

font_name

string

default:"STHeitiMedium.ttc"

Subtitle font filename. The file must exist in resource/fonts/.

text_fore_color

string

default:"#FFFFFF"

Subtitle text color as a hex string.

text_background_color

boolean or string

default:"true"

Subtitle background fill. true for default, false to disable, or a hex color string.
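Because this field accepts two types, a client may want to check the value before sending it. The sketch below assumes hex colors use the six-digit "#RRGGBB" form seen in the other color defaults; the server may accept other forms. check_background_color is a hypothetical helper.

```python
import re

# Assumption: colors follow the "#RRGGBB" form used by the documented defaults.
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def check_background_color(value):
    """Return value unchanged if it is a bool or a "#RRGGBB" string."""
    if isinstance(value, bool):
        return value  # True selects the default fill, False disables it
    if isinstance(value, str) and HEX_COLOR.match(value):
        return value
    raise ValueError(f"invalid text_background_color: {value!r}")
```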

font_size

integer

default:"60"

Subtitle font size in pixels.

stroke_color

string

default:"#000000"

Subtitle text outline color.

stroke_width

float

default:"1.5"

Subtitle text outline width in pixels.

paragraph_number

integer

default:"1"

Number of script paragraphs the LLM should generate when writing the script.
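The constraints listed above can be checked on the client before a request is sent. The following sketch covers only the required field, the documented option lists, and the two documented volume ranges. It is not the server's validation logic, and validate_video_payload is a hypothetical helper.

```python
# Allowed values, copied from the parameter descriptions above.
ALLOWED = {
    "video_aspect": ("9:16", "16:9", "1:1"),
    "video_concat_mode": ("random", "sequential"),
    "video_transition_mode": (
        None, "Shuffle", "FadeIn", "FadeOut", "SlideIn", "SlideOut",
    ),
    "video_source": ("pexels", "pixabay", "local"),
    "subtitle_position": ("bottom", "top", "center"),
}

# Fields documented with a range of 0.0 to 1.0.
VOLUME_FIELDS = ("voice_volume", "bgm_volume")


def validate_video_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if not payload.get("video_subject"):
        problems.append("video_subject is required")
    for field, allowed in ALLOWED.items():
        if field in payload and payload[field] not in allowed:
            problems.append(f"{field}: unsupported value {payload[field]!r}")
    for field in VOLUME_FIELDS:
        if field in payload and not 0.0 <= payload[field] <= 1.0:
            problems.append(f"{field}: must be between 0.0 and 1.0")
    return problems
```

For example, validate_video_payload({"video_subject": "x", "video_aspect": "4:3"}) reports one problem for the unsupported aspect ratio.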

POST /api/v1/subtitle

Generate subtitle and audio files only, without compositing a full video. Useful for previewing or post-processing subtitles separately.

curl -X POST http://localhost:8080/api/v1/subtitle \
  -H "Content-Type: application/json" \
  -d '{
    "video_script": "Morning exercise improves energy levels and mental clarity.",
    "voice_name": "en-US-JennyNeural"
  }'

Accepts the same subtitle and voice parameters as /api/v1/videos. Returns a task_id.
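If a full /api/v1/videos payload already exists, the subtitle request can be derived from it by keeping only the relevant keys. Which keys the endpoint reads is an assumption here: the list below combines video_script and video_language with the voice and subtitle styling parameters documented above. subtitle_payload is a hypothetical helper.

```python
# Assumption: these are the keys /api/v1/subtitle honors.
SUBTITLE_KEYS = (
    "video_script", "video_language",
    "voice_name", "voice_volume", "voice_rate",
    "subtitle_enabled", "subtitle_position", "font_name",
    "text_fore_color", "text_background_color",
    "font_size", "stroke_color", "stroke_width",
)


def subtitle_payload(video_payload: dict) -> dict:
    """Keep only the script, voice, and subtitle keys from a video payload."""
    return {k: v for k, v in video_payload.items() if k in SUBTITLE_KEYS}
```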

POST /api/v1/audio

Generate a voice narration audio file only, without video or subtitles.

curl -X POST http://localhost:8080/api/v1/audio \
  -H "Content-Type: application/json" \
  -d '{
    "video_script": "Morning exercise improves energy levels and mental clarity.",
    "voice_name": "en-US-JennyNeural",
    "voice_rate": 1.2
  }'

Returns a task_id. Poll /api/v1/tasks/{task_id} for the audio file URL.
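Polling can be sketched as a simple loop. The shape of the task status response is not described in this section, so the completion check is passed in by the caller rather than hard-coded. poll_task, task_url, and fetch_task are hypothetical helpers, and the base URL is taken from the curl examples.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8080"  # host and port taken from the curl examples


def task_url(task_id: str) -> str:
    """Build the polling URL for a task."""
    return f"{BASE_URL}/api/v1/tasks/{task_id}"


def fetch_task(task_id: str) -> dict:
    """GET the task status as parsed JSON (requires a running server)."""
    with urllib.request.urlopen(task_url(task_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_task(
    fetch: Callable[[], dict],
    is_done: Callable[[dict], bool],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until is_done(result) is true or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch()
        if is_done(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next status check
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

A caller would pass something like lambda: fetch_task(task_id) as fetch, together with an is_done function that matches the actual status fields returned by the server.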
