Kanvas is WZRD Studio’s generative visual playground. It combines AI image and video generation across multiple creative studios (cinematic stills, talking-head lip-sync, image-to-image transformations, and character-consistent generation) with two composition workflows: Remix, which applies visual style transformations to existing footage, and Lyrics, which builds audio-reactive animated lyric videos from your own audio with a Remotion-powered preview engine.

Routes

Path                                  Description
/kanvas                               Main generative studio hub
/kanvas?studio=image                  Image Studio (text-to-image / image-to-image)
/kanvas?studio=video                  Video Studio (text-to-video / image-to-video)
/kanvas?studio=cinema                 Cinema Studio (cinematic stills with camera controls)
/kanvas?studio=edit                   Edit Studio (image editing with fal.ai models)
/kanvas?studio=lipsync                LipSync Studio (talking-head and lip-sync)
/kanvas?studio=worldview              Worldview
/kanvas?studio=character-creation     Character Creation
/kanvas/remix                         Remix: apply AI styles to footage
/kanvas/remix/:templateId             Remix with a pre-loaded template
/kanvas/remix/jobs/:jobId             Track a specific remix job
/kanvas/lyrics                        Lyrics landing: browse templates
/kanvas/lyrics/new                    Lyrics wizard: upload audio and build a new video
/kanvas/lyrics/templates/:templateId  Open a saved lyrics template

The Kanvas Studio Hub

The main Kanvas page (/kanvas) is a tabbed generative studio. The active studio is controlled by the ?studio= search param, so deep-linking always lands on the right studio.
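
As a rough illustration of how a ?studio= deep link could be resolved, the sketch below reads the search param and falls back to a default. The studio keys come from the route table; the names STUDIOS, StudioKey, DEFAULT_STUDIO, and studioFromSearch are hypothetical, and the choice of 'image' as the fallback is an assumption, not documented behavior.

```typescript
// Studio keys taken from the route table; the type name is an assumption.
const STUDIOS = [
  'image',
  'video',
  'cinema',
  'edit',
  'lipsync',
  'worldview',
  'character-creation',
] as const;
type StudioKey = (typeof STUDIOS)[number];

// Hypothetical fallback used when ?studio= is missing or unrecognized.
const DEFAULT_STUDIO: StudioKey = 'image';

// Resolve the active studio from a location.search string such as "?studio=cinema".
function studioFromSearch(search: string): StudioKey {
  const raw = new URLSearchParams(search).get('studio');
  const match = STUDIOS.find((s) => s === raw);
  return match ?? DEFAULT_STUDIO;
}
```

Validating the param against a fixed list means a mistyped link still lands on a working studio instead of an empty page.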

Studios at a Glance

Image

Generate images from text prompts (text-to-image) or transform reference images (image-to-image). Supports multi-image references and @mention character injection for consistent subjects.

Video

Generate short videos from text (text-to-video), animate a reference image (image-to-video), or apply motion to an existing video (reference-to-video).

Cinema

Cinematic still generation with a full director’s camera kit — choose camera body, lens, focal length, and aperture to influence the visual language of each output.

Edit

Powered by fal.ai models. Perform structural image edits: color grade, color key, blur, rotate, flip, aspect ratio change, sketch, and more.

LipSync

Talking-head mode: portrait image + audio → animated face. Lip-sync mode: source video + audio → mouth-synced replacement.

Character Creation

Build reusable character blueprints that can be @mentioned in any prompt to inject consistent subject references across generations.

Model Selection

Models are fetched per-studio at page load via fetchKanvasModels(studio) from the kanvas-generate edge function. The UI prefers fal.ai models first (sortKanvasModelsFalFirst), then the catalog default, then GMI Cloud models.
// src/features/kanvas/types.ts
interface KanvasModel {
  id: string;
  name: string;
  studio: KanvasStudio;
  mode: KanvasMode;      // 'text-to-image' | 'image-to-video' | 'talking-head' | ...
  credits: number;       // credit cost per generation
  requiresAssets: KanvasAssetType[];
  controls: KanvasControlDefinition[];
  defaults: Record<string, unknown>;
}
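
The ordering described above (fal.ai first, then the catalog default, then GMI Cloud) can be pictured as a rank-based sort. This is only a sketch under assumptions: the provider and isDefault fields are not part of the KanvasModel interface shown here, and the real sortKanvasModelsFalFirst may identify providers differently.

```typescript
// Hypothetical minimal shape: `provider` and `isDefault` are NOT fields of
// KanvasModel in the source; they exist only to make the ordering concrete.
interface RankedModel {
  id: string;
  provider: 'fal' | 'gmi' | 'other';
  isDefault?: boolean;
}

// Lower rank sorts earlier: fal.ai, then the catalog default, then GMI Cloud.
function rank(m: RankedModel): number {
  if (m.provider === 'fal') return 0;
  if (m.isDefault) return 1;
  if (m.provider === 'gmi') return 2;
  return 3;
}

function sortFalFirst<T extends RankedModel>(models: readonly T[]): T[] {
  // Array.prototype.sort is stable, so models with equal rank keep their catalog order.
  return [...models].sort((a, b) => rank(a) - rank(b));
}
```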

Credit System

Each generation deducts credits based on the selected model. If your balance is insufficient, an InsufficientCreditsError is thrown by submitKanvasJob and a dialog appears showing required vs available credits with a link to the billing page.
// src/features/kanvas/service.ts
export class InsufficientCreditsError extends Error {
  constructor(public payload: { required: number; available: number }) { ... }
}
Credits are consumed at generation time. Queued or failed jobs do not return credits automatically. Check the Recent Jobs panel for job status before re-submitting.
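
To show how a caller might tell a credit failure apart from other errors, here is a minimal sketch. The class is a local stand-in for the one above (its constructor body is elided in the source, so the message text is an assumption), and creditShortfall is a hypothetical helper, not part of the Kanvas service.

```typescript
// Local stand-in for the class shown above. The constructor body is elided in
// the source, so the message text and prototype fix here are assumptions.
class InsufficientCreditsError extends Error {
  constructor(public payload: { required: number; available: number }) {
    super(`Insufficient credits: need ${payload.required}, have ${payload.available}`);
    this.name = 'InsufficientCreditsError';
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on ES5 targets
  }
}

// Hypothetical helper: returns how many credits are missing, or null for any other error.
function creditShortfall(err: unknown): number | null {
  if (err instanceof InsufficientCreditsError) {
    return Math.max(0, err.payload.required - err.payload.available);
  }
  return null;
}
```

In a try/catch around submitKanvasJob, a non-null shortfall would be the cue to open the credits dialog, while any other error is rethrown or reported as a normal failure.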

Asset Management

Upload images, videos, and audio through the in-page asset selector. Assets are stored in WZRD’s asset library via uploadKanvasAsset and persist across sessions. The last 6 assets of each type are shown as clickable thumbnails for quick selection.
// Upload an asset
// assetType is one of 'image' | 'video' | 'audio'
const asset = await uploadKanvasAsset(file, { assetType: 'image' });
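
The "last 6 assets of each type" rule can be sketched as a filter, sort, and slice. The AssetSummary shape (including createdAt) and the recentAssets name are assumptions for illustration; the actual asset record returned by the library is not shown on this page.

```typescript
type AssetType = 'image' | 'video' | 'audio';

// Hypothetical asset record; the real library shape is not documented here.
interface AssetSummary {
  id: string;
  assetType: AssetType;
  createdAt: string; // ISO 8601 timestamp
}

// Return the newest `limit` assets of one type, newest first.
function recentAssets(
  assets: readonly AssetSummary[],
  type: AssetType,
  limit: number = 6,
): AssetSummary[] {
  return assets
    .filter((a) => a.assetType === type) // filter() copies, so the sort below does not mutate the input
    .sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt))
    .slice(0, limit);
}
```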

@Mention Character References

Prompts support @character-name mentions that expand to character blueprint descriptions and reference image URLs before the request is sent to the generation model. Pin a character to inject it automatically into every prompt in the session.
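
One way to picture mention expansion is a regex replace over the prompt that also collects reference image URLs. This is a sketch only: CharacterBlueprint, expandMentions, and the mention pattern (letters, digits, underscores, hyphens) are assumptions, and the real expansion may format the injected description differently.

```typescript
// Hypothetical blueprint shape; field names are assumptions.
interface CharacterBlueprint {
  name: string;
  description: string;
  referenceImageUrls: string[];
}

// Replace each known @name with its description and gather reference image URLs.
function expandMentions(
  prompt: string,
  blueprints: readonly CharacterBlueprint[],
): { prompt: string; referenceImageUrls: string[] } {
  const byName = new Map<string, CharacterBlueprint>();
  for (const b of blueprints) byName.set(b.name.toLowerCase(), b);

  const urls = new Set<string>();
  const expanded = prompt.replace(/@([A-Za-z0-9_-]+)/g, (whole: string, name: string) => {
    const blueprint = byName.get(name.toLowerCase());
    if (!blueprint) return whole; // leave unknown mentions untouched
    blueprint.referenceImageUrls.forEach((u) => urls.add(u));
    return blueprint.description;
  });

  return { prompt: expanded, referenceImageUrls: Array.from(urls) };
}
```

Collecting URLs in a Set keeps a character that is mentioned twice from sending the same reference image twice.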

Job History and Polling

All submitted jobs are tracked in the per-studio Recent Jobs rail. Active jobs (status queued or processing) are polled every 4 seconds via refreshKanvasJobStatus. After 3 consecutive poll failures, the job is marked as failed locally so that the loading spinner clears.
// Job status lifecycle
type KanvasJobStatus = 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';
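
The polling rule above (poll active jobs every 4 seconds, give up locally after 3 consecutive failures) can be modeled as a small pure state update. PollState, applyPollResult, and nextPollDelayMs are hypothetical names for this sketch; whether a successful poll resets the failure counter is an assumption implied by the word "consecutive".

```typescript
type KanvasJobStatus = 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';

const POLL_INTERVAL_MS = 4000;
const MAX_CONSECUTIVE_FAILURES = 3;

// Hypothetical client-side bookkeeping for one job.
interface PollState {
  status: KanvasJobStatus;
  consecutiveFailures: number;
}

const isActive = (s: KanvasJobStatus): boolean => s === 'queued' || s === 'processing';

// `result` is the status from a successful poll, or null when the poll request itself failed.
function applyPollResult(state: PollState, result: KanvasJobStatus | null): PollState {
  if (!isActive(state.status)) return state; // finished jobs are no longer polled
  if (result !== null) return { status: result, consecutiveFailures: 0 };
  const failures = state.consecutiveFailures + 1;
  return failures >= MAX_CONSECUTIVE_FAILURES
    ? { status: 'failed', consecutiveFailures: failures } // local-only failure to clear the spinner
    : { status: state.status, consecutiveFailures: failures };
}

// Delay before the next poll, or null when polling should stop.
function nextPollDelayMs(state: PollState): number | null {
  return isActive(state.status) ? POLL_INTERVAL_MS : null;
}
```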

Voice Actions

The Kanvas page registers two voice actions with useRegisterVoiceActions:
  • kanvas_set_studio — Switch to a studio and optionally pre-fill the prompt.
  • kanvas_generate — Trigger a generation (requires confirmation, as it spends credits).

Remix Mode

Remix (/kanvas/remix) applies AI visual style transformations to existing video or image sources using the footage asset library and lyric template system.
The left panel lists available footage assets from listFootageAssets(), filterable by:
  • Aspect ratio: all, 9:16, 16:9
  • Category: e.g. bay-area-8mm (from listFootageCategories())
  • Tag: free-text tag filter
  • Sort: newest, oldest, shortest, longest
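
The four filters above could be applied client-side roughly as follows. This is a sketch under assumptions: FootageAsset and FootageFilter are hypothetical shapes (the return type of listFootageAssets() is not shown on this page), and treating the tag filter as a case-insensitive substring match is a guess at what "free-text" means.

```typescript
// Hypothetical asset shape; the real return type of listFootageAssets() is not shown here.
interface FootageAsset {
  id: string;
  aspectRatio: '9:16' | '16:9';
  category: string;
  tags: string[];
  durationSeconds: number;
  createdAt: string; // ISO 8601 timestamp
}

interface FootageFilter {
  aspectRatio: 'all' | '9:16' | '16:9';
  category?: string;
  tag?: string;
  sort: 'newest' | 'oldest' | 'shortest' | 'longest';
}

function filterFootage(assets: readonly FootageAsset[], f: FootageFilter): FootageAsset[] {
  const tag = f.tag?.trim().toLowerCase();
  const matches = assets.filter(
    (a) =>
      (f.aspectRatio === 'all' || a.aspectRatio === f.aspectRatio) &&
      (!f.category || a.category === f.category) &&
      (!tag || a.tags.some((t) => t.toLowerCase().includes(tag))),
  );
  const byDate = (a: FootageAsset, b: FootageAsset) =>
    Date.parse(a.createdAt) - Date.parse(b.createdAt);
  const byLength = (a: FootageAsset, b: FootageAsset) => a.durationSeconds - b.durationSeconds;
  switch (f.sort) {
    case 'newest':
      return matches.sort((a, b) => byDate(b, a));
    case 'oldest':
      return matches.sort(byDate);
    case 'shortest':
      return matches.sort(byLength);
    case 'longest':
      return matches.sort((a, b) => byLength(b, a));
  }
}
```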

Remix Job Tracking

/kanvas/remix/jobs/:jobId shows the status of a single in-progress or completed remix job, including progress percentage and the final download URL when complete.

Lyrics Mode

Lyrics mode (/kanvas/lyrics) is a step-by-step wizard for creating audio-reactive animated lyric videos. Upload your own audio, receive a word-level transcription, set cut markers, pick a visual template, and preview the result in a live Remotion player.

Wizard Steps

1. Upload Audio

Drop an audio file (MP3, WAV, AAC, etc.) into the Audio Panel or select from previously uploaded assets. The audio is uploaded to WZRD storage via uploadTemplateAudio and registered via the kanvas-lyrics-audio-register edge function. Waveform peaks are decoded client-side by decodeWaveform (using the Web Audio API) and displayed as a scrollable waveform for scrubbing.

2. Transcribe & Edit Lyrics

Click Transcribe to send the audio to the kanvas-lyrics-transcribe edge function. The result is a word-level LyricBlock[] with start/end timestamps and confidence scores. Edit the transcript in the Lyrics Panel: adjust word timing, merge or split lyric blocks, and correct any transcription errors before proceeding.
// Word-level lyric data
interface LyricBlock {
  id: string;
  startTime: number;  // seconds
  endTime: number;
  words: Array<{
    id: string;
    text: string;
    startTime: number;
    endTime: number;
    confidence?: number;
  }>;
}

3. Set Cut Markers

The Markers Panel lets you place cut markers on the waveform timeline. Markers snap to 50 ms increments and are deduplicated within a 250 ms window. These markers define where the visual footage cuts in the final composition.

4. Choose a Template

Browse visual templates from kanvas-lyrics-template. Each template defines a Remotion composition, a set of footage slots, and default styling. Selecting a template loads it from getTemplate(templateId) and updates the live preview.
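
The marker rules from step 3 (snap to 50 ms, deduplicate within 250 ms) can be sketched as a normalization pass over marker times in milliseconds. The function name and the tie-breaking choice (keep the earlier marker, drop any marker closer than 250 ms to the last kept one) are assumptions; the Markers Panel may resolve near-duplicates differently.

```typescript
const SNAP_MS = 50;
const DEDUPE_WINDOW_MS = 250;

// Snap each marker to the 50 ms grid, then drop markers that land within
// 250 ms of the previously kept marker (assumed rule: the earlier one wins).
function normalizeMarkers(markersMs: readonly number[]): number[] {
  const snapped = markersMs
    .map((t) => Math.round(t / SNAP_MS) * SNAP_MS)
    .sort((a, b) => a - b);

  const kept: number[] = [];
  for (const t of snapped) {
    const last = kept[kept.length - 1];
    if (last === undefined || t - last >= DEDUPE_WINDOW_MS) kept.push(t);
  }
  return kept;
}
```

For example, markers at 1012, 1030, 1180, 1262, and 2000 ms snap to 1000, 1050, 1200, 1250, and 2000 ms, and the pass keeps 1000, 1250, and 2000.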

Lyrics Template Landing

/kanvas/lyrics (without a wizard route) shows the TemplatesLanding component — a gallery of pre-built visual templates. Clicking a template navigates to /kanvas/lyrics/templates/:templateId to preview it, or to /kanvas/lyrics/new to start a fresh wizard with that template as the base style.

Remotion Preview and Export

The Lyrics Remix composer uses @remotion/player (pinned at 4.0.424) for real-time preview:
import { Player, type PlayerRef } from '@remotion/player';

<Player
  component={LyricRemixComposition}
  inputProps={{ template, audioUrl, timelineSlots, selectedStyleId, aspectRatio }}
  durationInFrames={totalFrames}
  fps={30}
  compositionWidth={width}
  compositionHeight={height}
/>
Final renders are triggered by the ExportModal, which calls the kanvas-generate edge function with the Remotion render preset. The render job is tracked in the standard Kanvas job polling loop.
The Remotion player preview updates live as you adjust styles, swap footage clips, or toggle the aspect ratio — no re-submission needed. Only the final export triggers credit consumption.
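
The Player props above need an integer durationInFrames and pixel dimensions. A minimal sketch of how totalFrames, width, and height might be derived is shown below; secondsToFrames and dimensionsFor are hypothetical helpers, and the 1080 by 1920 and 1920 by 1080 sizes are assumed values, since this page only names the 9:16 and 16:9 ratios.

```typescript
const FPS = 30; // matches fps={30} in the Player example

// Convert an audio duration to a whole number of frames, rounding up so the
// last partial frame of audio is not cut off.
function secondsToFrames(durationSeconds: number, fps: number = FPS): number {
  // The small epsilon keeps floating-point noise in the product from adding an extra frame.
  // Math.max(1, ...) keeps the result positive, since a Player needs at least one frame.
  return Math.max(1, Math.ceil(durationSeconds * fps - 1e-6));
}

// Assumed pixel sizes for each aspect ratio.
function dimensionsFor(aspectRatio: '9:16' | '16:9'): { width: number; height: number } {
  return aspectRatio === '9:16'
    ? { width: 1080, height: 1920 }
    : { width: 1920, height: 1080 };
}
```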

Lyrics Template Service

// src/features/kanvas-lyrics/service.ts (key operations)
createTemplate(data)         // Create a new template with audio asset
uploadTemplateAudio(file)    // Upload audio and get a storage URL
transcribeTemplate(id)       // Trigger transcription → word-level LyricBlocks
updateTemplate(id, data)     // Persist lyric edits and cut markers
finalizeTemplate(id)         // Mark as ready for rendering
getTemplate(id)              // Fetch a saved template by ID
Template status lifecycle:
new → audio_ready → lyrics_processing → lyrics_ready → markers_ready → saved
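
The lifecycle above can be expressed as an ordered list so callers can ask what comes next or whether a template has reached a given stage. This sketch assumes the progression is strictly linear, which the arrow diagram suggests but does not guarantee; the helper names are hypothetical.

```typescript
// Statuses in the order given by the lifecycle line above.
const TEMPLATE_STATUSES = [
  'new',
  'audio_ready',
  'lyrics_processing',
  'lyrics_ready',
  'markers_ready',
  'saved',
] as const;
type TemplateStatus = (typeof TEMPLATE_STATUSES)[number];

// Next status in the assumed linear order, or null once the template is saved.
function nextStatus(status: TemplateStatus): TemplateStatus | null {
  const index = TEMPLATE_STATUSES.indexOf(status);
  return TEMPLATE_STATUSES[index + 1] ?? null;
}

// True when `current` is at or past `target` in the lifecycle.
function hasReached(current: TemplateStatus, target: TemplateStatus): boolean {
  return TEMPLATE_STATUSES.indexOf(current) >= TEMPLATE_STATUSES.indexOf(target);
}
```

A check such as hasReached(status, 'lyrics_ready') could gate the Markers Panel until transcription has finished.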

Cinema Studio: Camera Controls

The Cinema studio exposes a director’s camera kit beyond just a text prompt:

Camera

Select from KANVAS_CAMERAS — film and digital camera bodies that influence grain, dynamic range, and colour science.

Lens

Choose from KANVAS_LENSES — prime and zoom lens characters.

Focal Length

Pick from KANVAS_FOCAL_LENGTHS to control perspective compression.

Aperture

Select from KANVAS_APERTURES to dial in depth of field.
These settings are assembled by buildCinemaRequest({ prompt, cinema, modelId, settings }) before the job is submitted.

Generation Request Types

All studio submissions go through submitKanvasJob(request). The request shape varies by studio:
// Text-to-image
buildImageRequest({ modelId, prompt, settings, imageIds?, referenceAssets? })

// Image-to-image (requires at least one imageId)
buildImageRequest({ modelId, prompt, settings, imageIds: ['<assetId>'] })

// Text-to-video
buildVideoRequest({ modelId, prompt, settings, mode: 'text-to-video' })

// Image-to-video (animate a still frame)
buildVideoRequest({ modelId, prompt, settings, mode: 'image-to-video', imageId: '<assetId>' })

// Reference-to-video (motion from any media)
buildVideoRequest({ modelId, prompt, settings, mode: 'reference-to-video', referenceAssetId: '<assetId>' })

// Talking-head (portrait + audio → animated face)
buildLipSyncRequest({ mode: 'talking-head', modelId, prompt, imageId, audioId, settings })

// Lip-sync (replace mouth movement in existing video)
buildLipSyncRequest({ mode: 'lip-sync', modelId, videoId, audioId, settings })

// Cinema (cinematic still with camera settings)
buildCinemaRequest({
  modelId, prompt, settings,
  cinema: { camera, lens, focalLength, aperture },
  imageIds?,           // optional reference images
  elementIds?,         // @mention character element IDs
  referenceBlueprintIds?,
})

Kanvas Job Lifecycle

submitKanvasJob(request)
  → POST /functions/v1/kanvas-generate
  → returns KanvasJob { id, status: 'queued' }

  [polling every 4s]
refreshKanvasJobStatus(jobId)
  → GET /functions/v1/kanvas-job-status?id=...
  → returns updated KanvasJob

  [on status === 'completed']
getJobPrimaryUrl(job)
  → job.resultPayload.primaryUrl  (video or image CDN URL)
Jobs survive page navigation. The Recent Jobs rail re-hydrates from listKanvasJobs() on page load, so in-progress generations continue updating even after a browser refresh.
