
Flock is a DuckDB community extension that adds LLM-powered SQL functions directly into your database. Once loaded, you can call llm_complete, llm_filter, llm_embedding, and more from any SQL query — no extra services or infrastructure required. This guide takes you from a fresh DuckDB install to a working LLM query.

Prerequisites

Before you begin, make sure you have:
  • DuckDB v1.5.0 or later — download from duckdb.org/docs/installation
  • An API key for at least one supported provider: OpenAI, Azure OpenAI, or Anthropic — or Ollama running locally if you prefer local models

Install Flock

Step 1: Install and load the extension

The recommended way to install Flock is from the DuckDB community extension catalog. Run the following two commands in your DuckDB session:
INSTALL flock FROM community;
LOAD flock;
Building from source requires CMake 3.5+, a C++ compiler (GCC, Clang, or MSVC), and either Ninja or Make. The build_and_run.sh script handles vcpkg setup and dependency management automatically.
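For reference, a from-source build typically looks like the following sketch. The repository URL is inferred from this site's path and the steps follow the common DuckDB extension template, so treat them as assumptions and check the project README before running.

```shell
# Sketch only: URL and steps assumed from the standard DuckDB
# extension workflow; consult the repository README for specifics.
git clone --recurse-submodules https://github.com/dais-polymtl/flock.git
cd flock
./build_and_run.sh   # helper script; handles vcpkg setup and dependencies
```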
Step 2: Verify the extension loaded

Confirm Flock is active by checking the list of loaded extensions:
SELECT extension_name, loaded, installed
FROM duckdb_extensions()
WHERE extension_name = 'flock';
You should see loaded = true and installed = true.

Configure a provider

Flock uses DuckDB’s native CREATE SECRET mechanism to store your API credentials. Each provider has its own secret type. The example below configures OpenAI — see the provider guides for Azure, Ollama, and Anthropic.
CREATE SECRET (
    TYPE openai,
    API_KEY 'your-api-key'
);
CREATE SECRET creates a session-scoped secret (lost when DuckDB closes). Use CREATE PERSISTENT SECRET to write it to disk so it reloads automatically in future sessions.
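For example, the persistent variant of the secret above uses the same fields (sketch; substitute your real key):

```sql
-- Written to DuckDB's on-disk secret storage, so it is
-- reloaded automatically in future sessions.
CREATE PERSISTENT SECRET (
    TYPE openai,
    API_KEY 'your-api-key'
);
```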

Create a model

After configuring your secret, register a named model in Flock’s model manager. The model name is what you reference in every query.
CREATE MODEL(
    'MyModel',
    'gpt-4o',
    'openai',
    {"tuple_format": "json", "batch_size": 32, "model_parameters": {"temperature": 0.7}}
);
The four arguments are: model name, provider model ID, provider, and a JSON config object.

Run your first query

With your model registered, call llm_complete from a SELECT statement:
SELECT llm_complete(
    {'model_name': 'MyModel'},
    {'prompt': 'Write a one-sentence description of what a database is.'}
);
To use column data as context, pass context_columns:
SELECT llm_complete(
    {'model_name': 'MyModel'},
    {
        'prompt': 'Summarize this review in one sentence: {review}',
        'context_columns': [{'data': review_text, 'name': 'review'}]
    }
) AS summary
FROM customer_reviews;
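The other scalar functions follow the same calling shape. As a sketch, llm_filter returns a boolean, so it can drive a WHERE clause; the argument structure here is assumed to mirror llm_complete above, so confirm against the SQL functions reference:

```sql
-- Sketch: keep only the reviews the model judges to be negative.
-- Argument layout assumed to match llm_complete.
SELECT review_text
FROM customer_reviews
WHERE llm_filter(
    {'model_name': 'MyModel'},
    {
        'prompt': 'Is this review negative? {review}',
        'context_columns': [{'data': review_text, 'name': 'review'}]
    }
);
```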

What’s next

  • OpenAI setup: full guide for configuring the OpenAI provider and available models
  • Azure setup: configure Azure OpenAI with resource name, deployment, and API version
  • Ollama setup: run models locally with Ollama, no API key required
  • Anthropic setup: use Claude models with the Anthropic provider
  • SQL functions reference: explore all scalar functions, including llm_complete, llm_filter, and llm_embedding
  • Structured output: enforce JSON schemas on LLM responses for reliable downstream processing
