Flock is a DuckDB community extension that adds LLM-powered SQL functions directly into your database. Once loaded, you can call
llm_complete, llm_filter, llm_embedding, and more from any SQL query — no extra services or infrastructure required. This guide takes you from a fresh DuckDB install to a working LLM query.
Prerequisites
Before you begin, make sure you have:
- DuckDB v1.5.0 or later — download from duckdb.org/docs/installation
- An API key for at least one supported provider: OpenAI, Azure OpenAI, or Anthropic — or Ollama running locally if you prefer local models
Install Flock
Install and load the extension
The recommended way to install Flock is from the DuckDB community extension catalog. Run the following two commands in your DuckDB session:
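For a community extension, these are the standard DuckDB install-and-load pair (the extension name flock is taken from this guide):

```sql
-- Fetch the extension from the DuckDB community catalog, then load it
INSTALL flock FROM community;
LOAD flock;
```

After LOAD succeeds, the llm_* functions are available in the current session.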
Building from source requires CMake 3.5+, a C++ compiler (GCC, Clang, or MSVC), and either Ninja or Make. The build_and_run.sh script handles vcpkg setup and dependency management automatically.
Configure a provider
Flock uses DuckDB’s native CREATE SECRET mechanism to store your API credentials. Each provider has its own secret type. The example below configures OpenAI — see the provider guides for Azure, Ollama, and Anthropic.
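A minimal sketch for OpenAI — the OPENAI secret type and API_KEY parameter follow the usual Flock provider pattern, but check the OpenAI setup guide for the exact names; the key value is a placeholder:

```sql
-- Store the OpenAI credential as a DuckDB secret
-- (TYPE and parameter names assumed; replace the placeholder key)
CREATE SECRET (
    TYPE OPENAI,
    API_KEY 'sk-your-key-here'
);
```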
Create a model
After configuring your secret, register a named model in Flock’s model manager. The model name is what you reference in every query.
Run your first query
With your model registered, call llm_complete from a SELECT statement; table columns can be supplied to the prompt through the context_columns argument.
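Putting model registration and a first call together, a hedged end-to-end sketch — the CREATE MODEL argument order (name, provider model, provider), the llm_complete struct arguments, and the context_columns shape are assumptions to verify against the SQL functions reference; the model name, table, and column are hypothetical:

```sql
-- Register a named model (arguments assumed: name, provider model, provider)
CREATE MODEL('quickstart-model', 'gpt-4o-mini', 'openai');

-- Scalar completion referencing the registered model by name
SELECT llm_complete(
    {'model_name': 'quickstart-model'},
    {'prompt': 'Write a one-sentence tagline for DuckDB.'}
) AS tagline;

-- Feeding a table column into the prompt via context_columns
-- (hypothetical products table and product_name column)
SELECT llm_complete(
    {'model_name': 'quickstart-model'},
    {'prompt': 'Summarize this product in five words.',
     'context_columns': [{'data': product_name}]}
) AS summary
FROM products;
```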
What’s next
- OpenAI setup: full guide for configuring the OpenAI provider and available models
- Azure setup: configure Azure OpenAI with resource name, deployment, and API version
- Ollama setup: run models locally with Ollama, no API key required
- Anthropic setup: use Claude models with the Anthropic provider
- SQL functions reference: explore all scalar functions, including llm_complete, llm_filter, llm_embedding, and more
- Structured output: enforce JSON schemas on LLM responses for reliable downstream processing