SearchType is an enum exported from the top-level quackir package. It is passed as the method argument to Searcher.search() to select which retrieval strategy to execute.
from quackir import SearchType

Members

| Member | Value | Description |
| --- | --- | --- |
| SearchType.SPARSE | 'sparse' | BM25 full-text search against a contents column. |
| SearchType.DENSE | 'dense' | Cosine similarity search against an embedding column. |
| SearchType.HYBRID | 'hybrid' | Reciprocal Rank Fusion (RRF) combining sparse and dense results. |
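
To make the HYBRID description concrete, here is a minimal sketch of the standard Reciprocal Rank Fusion formula, where each document scores the sum of 1 / (k + rank) over the rankings it appears in. This illustrates the general technique only; QuackIR's own implementation may differ in detail, and the function name `reciprocal_rank_fusion` and the document IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs using the standard RRF formula.

    Illustrative helper only, not part of the quackir API.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ranked results from a sparse and a dense search.
sparse_ranking = ["d1", "d2", "d3"]
dense_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([sparse_ranking, dense_ranking], k=60)
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Documents that rank well in both lists (here `d1` and `d3`) rise above documents that appear in only one. A larger k flattens the difference between adjacent ranks.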

Usage

Pass a SearchType member as the method argument to searcher.search():
from quackir import SearchType
from quackir.search import DuckDBSearcher

searcher = DuckDBSearcher("my_index.db")

# Sparse BM25 search
results = searcher.search(
    method=SearchType.SPARSE,
    query_string="neural information retrieval",
    top_n=10,
)

# Dense vector search
results = searcher.search(
    method=SearchType.DENSE,
    query_embedding=[0.12, -0.34, ...],  # list of floats
    top_n=10,
)

# Hybrid RRF search (requires both query_string and query_embedding)
results = searcher.search(
    method=SearchType.HYBRID,
    query_string="neural information retrieval",
    query_embedding=[0.12, -0.34, ...],
    top_n=10,
    table_names=["sparse_corpus", "dense_corpus"],
    rrf_k=60,
)

searcher.close()
When method is SearchType.SPARSE or SearchType.HYBRID, the query_string is automatically tokenized with Pyserini’s Lucene Analyzer unless tokenize_query=False is passed.
SQLiteSearcher supports only SearchType.SPARSE. Calling search() with SearchType.DENSE or SearchType.HYBRID on a SQLiteSearcher returns no results because its embedding_search and rrf_search methods are not implemented.
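
Because an unsupported method returns no results rather than raising, it may be worth failing fast in calling code. The sketch below is one way to do that. It uses a stand-in `SearchType` enum that mirrors the Members table so the snippet runs without quackir installed. The `SUPPORTED_METHODS` map and `ensure_supported` helper are hypothetical and cover only the two backends named on this page.

```python
from enum import Enum


class SearchType(Enum):
    """Stand-in mirroring the Members table; the real enum comes from quackir."""

    SPARSE = "sparse"
    DENSE = "dense"
    HYBRID = "hybrid"


# Hypothetical capability map, built only from the statements on this page.
SUPPORTED_METHODS = {
    "duckdb": {SearchType.SPARSE, SearchType.DENSE, SearchType.HYBRID},
    "sqlite": {SearchType.SPARSE},
}


def ensure_supported(backend: str, method: SearchType) -> None:
    """Raise early instead of silently receiving no results."""
    if method not in SUPPORTED_METHODS[backend]:
        raise ValueError(f"{backend} backend does not support {method.name} search")


ensure_supported("duckdb", SearchType.HYBRID)  # supported, so nothing happens

try:
    ensure_supported("sqlite", SearchType.DENSE)
except ValueError as err:
    message = str(err)
    print(message)
```

In real code, the check would run just before `searcher.search()` with the `SearchType` imported from quackir.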

Query tokenization behavior

| method | tokenize_query=True (default) | tokenize_query=False |
| --- | --- | --- |
| SPARSE | query_string is tokenized | query_string used as-is |
| DENSE | No tokenization applied | No tokenization applied |
| HYBRID | query_string is tokenized | query_string used as-is |
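
The table above can be read as a small decision function. The following sketch assumes a stand-in `SearchType` enum and a placeholder tokenizer (lowercasing plus whitespace splitting) in place of Pyserini's Lucene Analyzer, so the tokens it produces do not match what QuackIR generates. `prepare_query` and `simple_tokenize` are hypothetical names, not part of the quackir API.

```python
from enum import Enum


class SearchType(Enum):
    """Stand-in mirroring the Members table; the real enum comes from quackir."""

    SPARSE = "sparse"
    DENSE = "dense"
    HYBRID = "hybrid"


def simple_tokenize(text: str) -> list[str]:
    """Placeholder tokenizer; the real pipeline uses Pyserini's Lucene Analyzer."""
    return text.lower().split()


def prepare_query(method: SearchType, query_string: str, tokenize_query: bool = True) -> str:
    """Return the query text as the table says each method would see it."""
    if method is SearchType.DENSE:
        # Dense search works on embeddings, so the string is never tokenized.
        return query_string
    if tokenize_query:
        return " ".join(simple_tokenize(query_string))
    # SPARSE or HYBRID with tokenize_query=False: use the string as-is.
    return query_string


print(prepare_query(SearchType.SPARSE, "Neural Information Retrieval"))
print(prepare_query(SearchType.HYBRID, "Neural Information Retrieval", tokenize_query=False))
```

Passing `tokenize_query=False` is mainly useful when the query has already been analyzed upstream, since the string is then used exactly as given.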
