Flock’s aggregate functions work like any SQL aggregate: they process a group of rows defined by GROUP BY and return one result per group. The difference is that the “aggregation logic” is delegated to a language model. You supply a prompt describing what you want, and Flock batches the rows in the group and sends them to the model.
All aggregate functions share the same two-argument signature as the scalar functions:
- A model configuration struct with model_name and an optional secret_name
- A prompt configuration struct with a prompt or prompt_name, an optional version, and a context_columns array
The context_columns API is identical to that of the scalar functions.
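As a rough illustration of the two-argument shape described above, the following Python sketch assembles the SQL text of an aggregate call. The helper names (`sql_string`, `aggregate_call`), the model name `my-model`, and the `reviews` table with its columns are hypothetical. The `{'data': column}` form of each `context_columns` entry is an assumption, not something this page states, so check it against your Flock version.

```python
from typing import Optional, Sequence


def sql_string(value: str) -> str:
    """Quote a Python string as a SQL string literal (single quotes doubled)."""
    return "'" + value.replace("'", "''") + "'"


def aggregate_call(
    function: str,
    model_name: str,
    prompt: str,
    columns: Sequence[str],
    secret_name: Optional[str] = None,
) -> str:
    """Build the SQL text for one aggregate call (hypothetical helper)."""
    # First argument: the model configuration struct.
    model_fields = ["'model_name': " + sql_string(model_name)]
    if secret_name is not None:
        model_fields.append("'secret_name': " + sql_string(secret_name))

    # Second argument: the prompt configuration struct.
    # Assumption: each context column is wrapped as {'data': <column reference>}.
    context = ", ".join("{'data': " + column + "}" for column in columns)
    prompt_fields = [
        "'prompt': " + sql_string(prompt),
        "'context_columns': [" + context + "]",
    ]

    model_struct = "{" + ", ".join(model_fields) + "}"
    prompt_struct = "{" + ", ".join(prompt_fields) + "}"
    return function + "(" + model_struct + ", " + prompt_struct + ")"


call = aggregate_call("llm_reduce", "my-model", "Summarize these reviews", ["review_text"])
query = "SELECT product_id, " + call + " AS summary FROM reviews GROUP BY product_id"
print(query)
```

The printed query is plain SQL text; running it would still require a DuckDB session with Flock loaded and a model registered under the chosen name.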
llm_reduce
llm_reduce collapses all rows in a group into a single text output. The model receives every row’s column values and the prompt describing the aggregation — summarization, consolidation, opinion synthesis, and similar tasks.
Return type: JSON
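This page says only that rows are batched before being sent to the model; it does not describe how partial results are combined. One plausible scheme, shown here purely as an assumption, is a running fold in which each batch is reduced together with the result so far. `batched_reduce` and `fake_model` are hypothetical names, and `fake_model` is a stand-in that joins strings instead of calling a language model.

```python
from typing import Callable, List, Optional, Sequence


def batched_reduce(
    rows: Sequence[str],
    reduce_batch: Callable[[List[str]], str],
    batch_size: int,
) -> Optional[str]:
    """Collapse rows into one text by reducing fixed-size batches in turn."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    result: Optional[str] = None
    for start in range(0, len(rows), batch_size):
        batch = list(rows[start:start + batch_size])
        if result is not None:
            # Carry the running result forward so earlier rows are not lost.
            batch.insert(0, result)
        result = reduce_batch(batch)
    return result  # None when the group is empty


def fake_model(batch: List[str]) -> str:
    """Stand-in for a model call: simply joins the inputs."""
    return " | ".join(batch)


reviews = ["great battery", "screen too dim", "fast shipping", "good value", "runs hot"]
summary = batched_reduce(reviews, fake_model, batch_size=2)
print(summary)
```

With a real model in place of `fake_model`, each call would see at most `batch_size` rows plus the running summary, which keeps every request within a bounded size.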
Parameters
Model configuration (first argument)
- model_name: The registered model name to use for aggregation.
- secret_name: The DuckDB secret holding the API key for this model.
Prompt configuration (second argument)
- prompt: An inline prompt instructing the model how to aggregate the rows. Mutually exclusive with prompt_name.
- prompt_name: The name of a pre-configured prompt in Flock’s prompt registry. Mutually exclusive with prompt.
- version: The version of the named prompt to use. Only valid with prompt_name.
- context_columns: Columns whose values are passed to the model for each row in the group.
Examples
llm_rerank
llm_rerank reorders the rows in a group by relevance to a query prompt and returns the full set of rows as a JSON array sorted from most to least relevant. It is built on the sliding-window listwise reranking method described by Ma et al. (2023), which handles groups larger than a model’s context window by progressively ranking overlapping subsets.
Return type: JSON (array of row objects, ordered by relevance)
llm_rerank returns all rows in the group as a JSON array. If you only need the single best or worst match, use llm_first or llm_last instead; they are more efficient for that case.
Sliding window mechanism
When a group contains more rows than a model can rank in one call, llm_rerank uses a sliding window strategy:
- Rank the last m documents in a window.
- Shift the window toward the beginning of the list by m/2.
- Repeat until the window covers the start of the list.
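The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Flock's implementation: `sliding_window_rerank` and `rank_window` are hypothetical names, and the ranker here simply sorts integers in descending order where a real system would ask the model to order the window.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sliding_window_rerank(
    docs: Sequence[T],
    rank_window: Callable[[List[T]], List[T]],
    window_size: int,
) -> List[T]:
    """Rerank docs back to front with overlapping windows of window_size."""
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    ranked = list(docs)
    step = window_size // 2  # shift by m/2 so consecutive windows overlap
    end = len(ranked)
    while True:
        start = max(0, end - window_size)
        # Step 1: rank the documents inside the current window.
        ranked[start:end] = rank_window(ranked[start:end])
        # Step 3: stop once the window has covered the start of the list.
        if start == 0:
            break
        # Step 2: move the window toward the beginning of the list.
        end -= step
    return ranked


def by_score(window: List[int]) -> List[int]:
    """Stand-in ranker: higher numbers count as more relevant."""
    return sorted(window, reverse=True)


result = sliding_window_rerank([3, 1, 4, 1, 5, 9, 2, 6], by_score, window_size=4)
print(result)  # [9, 6, 3, 1, 4, 1, 5, 2]
```

In this sketch a single pass reliably carries the best m/2 or so items to the front, because the top half of each window is included in the next one. The tail is not fully sorted, as the position of 5 in the output shows.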
Parameters
Model configuration (first argument)
- model_name: The registered model name to use for reranking.
- secret_name: The DuckDB secret holding the API key for this model.
Prompt configuration (second argument)
- prompt: The query or relevance criterion to rank against. Mutually exclusive with prompt_name.
- prompt_name: The name of a pre-configured ranking prompt. Mutually exclusive with prompt.
- version: The version of the named prompt to use. Only valid with prompt_name.
- context_columns: Columns whose values the model uses to assess relevance for each row.
Examples
Output format
llm_rerank returns a JSON array of objects. Each object mirrors the columns provided in context_columns, and the array is ordered from most to least relevant.
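To make the shape concrete, the sketch below parses a made-up result on the client side. The payload, its `title` and `abstract` keys, and the variable names are illustrative assumptions; real output depends on the columns you pass in context_columns.

```python
import json

# Hypothetical llm_rerank result for context columns named title and abstract.
raw_result = """
[
  {"title": "Intro to vector search", "abstract": "Covers approximate nearest neighbours."},
  {"title": "Query planning basics", "abstract": "Explains join ordering."},
  {"title": "Cooking with cast iron", "abstract": "Seasoning and care tips."}
]
"""

ranked = json.loads(raw_result)

# The array is already ordered, so position carries the ranking.
titles = [row["title"] for row in ranked]
most_relevant = ranked[0]
least_relevant = ranked[-1]

print(titles)
```

The first and last elements of this array correspond to the rows that llm_first and llm_last would return for the same prompt.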
llm_first
llm_first reranks the rows in a group by relevance to a prompt and returns only the most relevant row as a JSON object. It is equivalent to running llm_rerank and taking the first element, but avoids materializing the full ranked list.
Return type: JSON (single row object)
Parameters
Model configuration (first argument)
- model_name: The registered model name to use for selection.
- secret_name: The DuckDB secret holding the API key for this model.
Prompt configuration (second argument)
- prompt: The relevance criterion. The model selects the row that best matches this prompt. Mutually exclusive with prompt_name.
- prompt_name: The name of a pre-configured selection prompt. Mutually exclusive with prompt.
- version: The version of the named prompt to use. Only valid with prompt_name.
- context_columns: Columns the model uses to assess relevance.
Examples
Output format
llm_first returns a single JSON object containing the column values of the most relevant row.
llm_last
llm_last is the complement of llm_first — it reranks the rows in a group by relevance to a prompt and returns the least relevant row. Use it to identify outliers, flag low-quality entries, or find the weakest match in a group.
Return type: JSON (single row object)
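How llm_first and llm_last avoid building the full ranking is not described here. One way to picture it, offered as an assumption only, is to carry a single candidate through the group batch by batch. In the sketch below `select_one` is a hypothetical helper, and the precomputed `relevance` score stands in for the model's judgement.

```python
from typing import Callable, Dict, List, Optional, Sequence

Row = Dict[str, float]


def select_one(
    rows: Sequence[Row],
    pick: Callable[[List[Row]], Row],
    batch_size: int,
) -> Optional[Row]:
    """Scan rows in batches, keeping only the current best candidate."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    candidate: Optional[Row] = None
    for start in range(0, len(rows), batch_size):
        batch = list(rows[start:start + batch_size])
        if candidate is not None:
            batch.append(candidate)  # the previous winner competes again
        candidate = pick(batch)
    return candidate


def most_relevant(batch: List[Row]) -> Row:
    return max(batch, key=lambda row: row["relevance"])


def least_relevant(batch: List[Row]) -> Row:
    return min(batch, key=lambda row: row["relevance"])


rows = [
    {"id": 1, "relevance": 0.40},
    {"id": 2, "relevance": 0.90},
    {"id": 3, "relevance": 0.10},
    {"id": 4, "relevance": 0.70},
    {"id": 5, "relevance": 0.55},
]

first = select_one(rows, most_relevant, batch_size=2)
last = select_one(rows, least_relevant, batch_size=2)
full_ranking = sorted(rows, key=lambda row: row["relevance"], reverse=True)

print(first["id"], last["id"])  # 2 3
```

Both selections agree with the two ends of `full_ranking`, yet neither one needs the rows in between to be ordered.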
Parameters
Model configuration (first argument)
- model_name: The registered model name to use for selection.
- secret_name: The DuckDB secret holding the API key for this model.
Prompt configuration (second argument)
- prompt: The relevance criterion. The model selects the row that least matches this prompt. Mutually exclusive with prompt_name.
- prompt_name: The name of a pre-configured selection prompt. Mutually exclusive with prompt.
- version: The version of the named prompt to use. Only valid with prompt_name.
- context_columns: Columns the model uses to assess relevance.
Examples
Output format
llm_last returns a single JSON object containing the column values of the least relevant row.