Flock can transcribe audio files directly inside a SQL query and return the resulting text as a column. You can join transcripts with structured data, filter rows based on what was said, summarize calls, or generate embeddings for similarity search, all in standard DuckDB SQL.
Supported providers
Supported providers and their transcription endpoints:

| Provider | Transcription endpoint | Example model |
|---|---|---|
| OpenAI | audio/transcriptions | whisper-1 |
| Azure OpenAI | Azure audio transcription endpoint | whisper-1 |
Using audio in context_columns
To transcribe audio, add an entry with type: 'audio' and a transcription_model to the context_columns array:
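As a minimal sketch of such an entry, the query below transcribes each row's audio with llm_complete. The table calls, its columns call_id and audio_path, the model name 'gpt-4o', and the struct keys data and name are illustrative assumptions that this page does not confirm; check them against your Flock version.

```sql
-- Hypothetical table: calls(call_id INTEGER, audio_path VARCHAR).
-- Assumed (not confirmed by this page): the 'data' and 'name' keys
-- and the 'gpt-4o' model name.
SELECT
    call_id,
    llm_complete(
        {'model_name': 'gpt-4o'},
        {
            'prompt': 'Return the transcript of {call} verbatim.',
            'context_columns': [
                {
                    'data': audio_path,                  -- audio file path or URL
                    'type': 'audio',                     -- marks the column as audio input
                    'transcription_model': 'whisper-1',  -- required when type is 'audio'
                    'name': 'call'                       -- alias referenced as {call}
                }
            ]
        }
    ) AS transcript
FROM calls;
```

Each row yields one transcript column, which can then be joined or filtered like any other text column.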
Audio context column properties
- Data source: the SQL column containing the audio source, either a local file path or a URL, depending on the provider.
- type: must be 'audio' to identify this column as an audio input.
- transcription_model: the provider-specific transcription model to use. Required whenever type is 'audio'. For OpenAI and Azure, use 'whisper-1'.
- Alias: an optional alias for referencing the transcribed text in your prompt template, e.g., {call}.

Validation rules
Flock enforces these rules at bind time:

- If type is 'audio', transcription_model must be provided; omitting it raises an error.
- If transcription_model is set but type is not 'audio', Flock raises an error.
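The two rules can be illustrated with a rough sketch. The calls table, the data key, and the 'gpt-4o' model name are assumptions made for illustration; the rejected entries are shown only as comments because the exact error text is not documented here.

```sql
-- Accepted: type is 'audio' and transcription_model is provided.
-- Hypothetical table and model name; the 'data' key is an assumption.
SELECT
    llm_complete(
        {'model_name': 'gpt-4o'},
        {
            'prompt': 'Summarize the call.',
            'context_columns': [
                {'data': audio_path, 'type': 'audio', 'transcription_model': 'whisper-1'}
            ]
        }
    ) AS summary
FROM calls;

-- Rejected at bind time (rule 1): type is 'audio' but transcription_model is missing.
--   {'data': audio_path, 'type': 'audio'}

-- Rejected at bind time (rule 2): transcription_model is set but type is not 'audio'.
--   {'data': audio_path, 'transcription_model': 'whisper-1'}
```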
Examples
Basic transcription
Transcribe a list of audio files and return the raw transcripts by passing each file to llm_complete as an audio context column.
Chaining transcription and summarization
Use a CTE to transcribe first, then pass the text to a second llm_complete call:
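One way this two-step pattern might look is sketched below. The calls table, the 'gpt-4o' model name, the prompts, and the data and name keys are assumptions for illustration, not confirmed parts of the Flock API.

```sql
-- Step 1: transcribe each audio file (hypothetical table and model names).
WITH transcripts AS (
    SELECT
        call_id,
        llm_complete(
            {'model_name': 'gpt-4o'},
            {
                'prompt': 'Return the transcript of {call} verbatim.',
                'context_columns': [
                    {'data': audio_path, 'type': 'audio',
                     'transcription_model': 'whisper-1', 'name': 'call'}
                ]
            }
        ) AS transcript
    FROM calls
)
-- Step 2: summarize the transcribed text with a second llm_complete call.
SELECT
    call_id,
    llm_complete(
        {'model_name': 'gpt-4o'},
        {
            'prompt': 'Summarize {transcript} in two sentences.',
            'context_columns': [
                {'data': transcript, 'name': 'transcript'}
            ]
        }
    ) AS summary
FROM transcripts;
```

Keeping transcription in its own CTE means the transcript text is an ordinary column in the second step, so no audio-specific keys are needed there.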
Filtering based on audio content
Use llm_filter to keep only rows whose audio meets a semantic criterion:
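A possible shape for such a filter is sketched here, assuming llm_filter takes the same two arguments as llm_complete and returns a value usable as a WHERE predicate. The calls table, the 'gpt-4o' model name, and the data and name keys are likewise assumptions.

```sql
-- Keep only calls where the speaker asks for a refund.
-- Hypothetical table, model name, and prompt.
SELECT call_id, audio_path
FROM calls
WHERE llm_filter(
    {'model_name': 'gpt-4o'},
    {
        'prompt': 'Does the customer in {call} ask for a refund?',
        'context_columns': [
            {'data': audio_path, 'type': 'audio',
             'transcription_model': 'whisper-1', 'name': 'call'}
        ]
    }
);
```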
Generating embeddings from audio
There is no direct audio-to-embedding API in Flock. Use a two-step approach: transcribe the audio, then embed the resulting text with llm_embedding:
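The two steps could be combined roughly as follows. The argument shape of llm_embedding, the embedding model name 'text-embedding-3-small', the calls table, and the data and name keys are all assumptions for illustration; verify them against the llm_embedding reference.

```sql
-- Step 1: transcribe the audio (hypothetical table and model names).
WITH transcripts AS (
    SELECT
        call_id,
        llm_complete(
            {'model_name': 'gpt-4o'},
            {
                'prompt': 'Return the transcript of {call} verbatim.',
                'context_columns': [
                    {'data': audio_path, 'type': 'audio',
                     'transcription_model': 'whisper-1', 'name': 'call'}
                ]
            }
        ) AS transcript
    FROM calls
)
-- Step 2: embed the transcript text, not the raw audio.
SELECT
    call_id,
    llm_embedding(
        {'model_name': 'text-embedding-3-small'},
        {'context_columns': [{'data': transcript}]}
    ) AS embedding
FROM transcripts;
```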
Function support matrix
| Function | Audio support | Notes |
|---|---|---|
| llm_complete | Full | Transcribe and optionally transform content |
| llm_filter | Full | Filter rows based on audio-derived semantics |
| llm_reduce | Full | Summarize or aggregate transcripts |
| llm_rerank | Via text | Rerank based on transcript features |
| llm_first | Via text | Pick top row based on transcript criteria |
| llm_last | Via text | Pick bottom row based on transcript criteria |
| llm_embedding | Via text | Embed transcripts, not raw audio |