
POST /api/v1/videos

Generate a complete short video. The request is processed asynchronously: the response returns a task_id, which you then poll for status.
curl -X POST http://localhost:8080/api/v1/videos \
  -H "Content-Type: application/json" \
  -d '{
    "video_subject": "benefits of morning exercise",
    "video_aspect": "9:16",
    "video_count": 1
  }'
{
  "status": 200,
  "message": "success",
  "data": {
    "task_id": "6c85c8cc-a77a-42b9-bc30-947815aa0558"
  }
}
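The same call can be made from Python. The following is a minimal sketch using only the standard library, assuming the server from the curl example is reachable at http://localhost:8080 and that failures are reported with a non-200 "status" field (the source only shows the success shape). The helper names build_video_request, extract_task_id, and submit_video are illustrative and not part of the API.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host as the curl example


def build_video_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for /api/v1/videos."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/videos",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_task_id(response_body: str) -> str:
    """Pull data.task_id out of a response shaped like the example above."""
    body = json.loads(response_body)
    # Assumption: any status other than 200 means the request was rejected.
    if body.get("status") != 200:
        raise RuntimeError(f"request failed: {body.get('message')}")
    return body["data"]["task_id"]


def submit_video(payload: dict) -> str:
    """Send the request and return the task_id (needs a running server)."""
    with urllib.request.urlopen(build_video_request(payload), timeout=30) as resp:
        return extract_task_id(resp.read().decode("utf-8"))
```

Building the request and parsing the response are kept separate from the network call so that each part can be checked without a running server.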

Request parameters

video_subject
string
required
The topic or keyword for the video. The LLM uses this to write the script and find footage search terms.
video_script
string
A custom script to use instead of AI generation. When provided, the LLM script generation step is skipped.
video_terms
string or array
Custom search keywords for finding footage. If omitted, the LLM generates them from the subject and script.
video_aspect
string
default:"9:16"
Output video aspect ratio. Options: "9:16" (portrait, 1080×1920), "16:9" (landscape, 1920×1080), "1:1" (square, 1080×1080).
video_concat_mode
string
default:"random"
How footage clips are ordered. "random" shuffles clips; "sequential" uses them in retrieval order.
video_transition_mode
string
default:"null"
Transition effect between clips. Options: null (none), "Shuffle", "FadeIn", "FadeOut", "SlideIn", "SlideOut".
video_clip_duration
integer
default:"5"
Duration in seconds of each footage clip before cutting to the next.
video_count
integer
default:"1"
Number of video variants to generate. Each variant uses a different random selection of footage clips.
video_source
string
default:"pexels"
Where to source footage. Options: "pexels", "pixabay", "local".
video_materials
array
List of specific local material objects. Each object has provider (string), url (string), and duration (integer) fields. Used when video_source is "local".
custom_audio_file
string
Path to a custom audio file on the server. When set, TTS is skipped and subtitles are disabled.
video_language
string
default:""
Language code (e.g. "en", "zh", "fr"). Leave empty for auto-detection from the script.
voice_name
string
Name of the TTS voice to use. See Voice Synthesis for available voices.
voice_volume
float
default:"1.0"
Voice narration volume. Range: 0.0 to 1.0.
voice_rate
float
default:"1.0"
Speech speed multiplier. 1.2 is 20% faster than normal.
bgm_type
string
default:"random"
Background music selection mode. "random" picks a random track; "" disables music.
bgm_file
string
Specific MP3 filename from the resource/songs/ directory to use as background music.
bgm_volume
float
default:"0.2"
Background music volume. Range: 0.0 to 1.0.
subtitle_enabled
boolean
default:"true"
Whether to generate and burn subtitles into the video.
subtitle_position
string
default:"bottom"
Subtitle vertical position: "bottom", "top", or "center".
font_name
string
default:"STHeitiMedium.ttc"
Subtitle font filename. The file must exist in resource/fonts/.
text_fore_color
string
default:"#FFFFFF"
Subtitle text color as a hex string.
text_background_color
boolean or string
default:"true"
Subtitle background fill. true for default, false to disable, or a hex color string.
font_size
integer
default:"60"
Subtitle font size in pixels.
stroke_color
string
default:"#000000"
Subtitle text outline color.
stroke_width
float
default:"1.5"
Subtitle text outline width in pixels.
paragraph_number
integer
default:"1"
Number of script paragraphs the LLM should generate when writing the script.
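
Several of the parameters above accept only a fixed set of values, so it can help to check a payload on the client before sending it. The sketch below is one possible pre-flight check, not the server's own validation: the function name validate_video_params is hypothetical, the allowed values are copied from the descriptions above, and the volume bounds assume the documented range is 0.0 to 1.0.

```python
# Allowed values, copied from the parameter descriptions above.
ASPECTS = {"9:16": (1080, 1920), "16:9": (1920, 1080), "1:1": (1080, 1080)}
ENUM_PARAMS = {
    "video_aspect": tuple(ASPECTS),
    "video_concat_mode": ("random", "sequential"),
    "video_transition_mode": (None, "Shuffle", "FadeIn", "FadeOut", "SlideIn", "SlideOut"),
    "video_source": ("pexels", "pixabay", "local"),
    "subtitle_position": ("bottom", "top", "center"),
}
# Assumption: both volume parameters are limited to 0.0 through 1.0.
VOLUME_PARAMS = ("voice_volume", "bgm_volume")


def validate_video_params(params: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not params.get("video_subject"):
        errors.append("video_subject is required")
    for name, allowed in ENUM_PARAMS.items():
        if name in params and params[name] not in allowed:
            errors.append(f"{name} must be one of {allowed}")
    for name in VOLUME_PARAMS:
        if name in params and not 0.0 <= params[name] <= 1.0:
            errors.append(f"{name} must be between 0.0 and 1.0")
    return errors
```

For example, validate_video_params({"video_subject": "benefits of morning exercise", "video_aspect": "9:16"}) returns an empty list, while a payload with "video_aspect": "4:3" produces one error for that field.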

POST /api/v1/subtitle

Generate subtitle and audio files only, without compositing a full video. Useful for previewing or post-processing subtitles separately.
curl -X POST http://localhost:8080/api/v1/subtitle \
  -H "Content-Type: application/json" \
  -d '{
    "video_script": "Morning exercise improves energy levels and mental clarity.",
    "voice_name": "en-US-JennyNeural"
  }'
Accepts the same subtitle and voice parameters as /api/v1/videos. Returns a task_id.
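
If you already have a full /api/v1/videos payload, one way to reuse it here is to keep only the script, voice, and subtitle styling fields. This is a rough sketch: the exact set of keys this endpoint accepts is an assumption based on the parameter names documented for /api/v1/videos, and to_subtitle_payload is a hypothetical helper.

```python
# Assumption: these are the fields /api/v1/subtitle cares about, chosen from
# the script, voice, and subtitle parameters documented for /api/v1/videos.
SUBTITLE_VOICE_KEYS = (
    "video_script", "video_language",
    "voice_name", "voice_volume", "voice_rate",
    "subtitle_enabled", "subtitle_position",
    "font_name", "font_size",
    "text_fore_color", "text_background_color",
    "stroke_color", "stroke_width",
)


def to_subtitle_payload(video_params: dict) -> dict:
    """Copy only the script, voice, and subtitle fields that are present."""
    return {k: video_params[k] for k in SUBTITLE_VOICE_KEYS if k in video_params}
```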

POST /api/v1/audio

Generate a voice narration audio file only, without video or subtitles.
curl -X POST http://localhost:8080/api/v1/audio \
  -H "Content-Type: application/json" \
  -d '{
    "video_script": "Morning exercise improves energy levels and mental clarity.",
    "voice_name": "en-US-JennyNeural",
    "voice_rate": 1.2
  }'
Returns a task_id. Poll /api/v1/tasks/{task_id} for the audio file URL.
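
Polling can be sketched as a small loop. The shape of the /api/v1/tasks/{task_id} response is not described in this section, so the sketch takes two caller-supplied callables: fetch, which performs the GET and returns the parsed JSON, and is_done, which decides whether the task has finished. Both, along with the function name poll_task and the default interval and timeout, are assumptions for illustration.

```python
import time
from typing import Callable


def poll_task(
    task_id: str,
    fetch: Callable[[str], dict],
    is_done: Callable[[dict], bool],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch on the task path until is_done accepts the result."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(f"/api/v1/tasks/{task_id}")
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(interval)  # wait before asking again
```

Passing fetch and is_done in keeps the loop independent of any particular HTTP client or response schema; sleep is injectable so the loop can be exercised without real delays.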
