SQLiteSearcher queries BM25 full-text indexes stored in SQLite FTS5 virtual tables created by SQLiteIndexer. It implements the abstract Searcher base class. Dense and hybrid retrieval are not supported.
from quackir.search import SQLiteSearcher
SQLiteSearcher supports only SearchType.SPARSE. embedding_search and rrf_search are not implemented and return None, so calling search() with SearchType.DENSE or SearchType.HYBRID yields no results.

Constructor

SQLiteSearcher(db_path="sqlite.db")
Opens a SQLite connection to the specified file.
db_path
string
default:"sqlite.db"
Path to the SQLite database file produced by SQLiteIndexer.

Methods

search

searcher.search(
    method,
    query_id=None,
    query_string=None,
    query_embedding=None,
    top_n=5,
    tokenize_query=True,
    table_names=["corpus"],
    rrf_k=60,
)
Main entry point for retrieval. For SQLite, only SearchType.SPARSE produces results. When query_id is provided, the result with that ID is removed from the returned list.
method
SearchType
required
Should be SearchType.SPARSE for SQLiteSearcher.
query_id
string
default:"None"
Document ID to exclude from results.
query_string
string
default:"None"
Text query for sparse search. Required when method is SPARSE.
query_embedding
number[]
default:"None"
Unused by SQLiteSearcher.
top_n
number
default:"5"
Maximum number of results to return.
tokenize_query
boolean
default:"true"
When True, the query_string is tokenized with Pyserini’s Lucene Analyzer before querying.
table_names
string[]
default:"[\"corpus\"]"
List containing the name of the sparse table to search.
rrf_k
number
default:"60"
Unused by SQLiteSearcher.
return
list
List of (doc_id, score) tuples ordered by descending BM25 score.

fts_search

searcher.fts_search(query_string, top_n=5, table_name="corpus")
Executes a BM25 search against the fts_{table_name} FTS5 virtual table. Query terms are individually quoted and joined with OR to support any-term matching.
By convention, SQLite FTS5’s bm25() function returns a value that is numerically lower (more negative) for better matches. SQLiteSearcher negates the score so that higher values indicate better matches, consistent with the rest of the QuackIR API.
query_string
string
required
Query string. Individual terms are quoted and OR-joined.
top_n
number
default:"5"
Maximum number of results to return.
table_name
string
default:"corpus"
Name of the base table. The FTS5 virtual table fts_{table_name} is queried.
return
list
List of (id, score) tuples.
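
The quoting, OR-joining, and score negation described above can be sketched with the standard sqlite3 module. This is a minimal illustration, not QuackIR's actual implementation: the helper names build_match_expression and fts_search_sketch, the two-column table layout (id, contents), and the sample documents are all assumptions made for the example. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def build_match_expression(query_string):
    """Quote each whitespace-separated term and join the terms with OR.

    Hypothetical helper; QuackIR's real query construction may differ.
    """
    # FTS5 string literals escape an embedded double quote by doubling it.
    terms = ['"' + term.replace('"', '""') + '"' for term in query_string.split()]
    return " OR ".join(terms)


def fts_search_sketch(conn, query_string, top_n=5, table_name="corpus"):
    """Return (id, score) tuples, best match first, with bm25() negated."""
    # The table name is interpolated into SQL, so pass only trusted identifiers.
    fts_table = f"fts_{table_name}"
    sql = (
        f"SELECT id, -bm25({fts_table}) AS score "
        f"FROM {fts_table} WHERE {fts_table} MATCH ? "
        # Ascending raw bm25() is the same order as descending negated score.
        f"ORDER BY bm25({fts_table}) LIMIT ?"
    )
    return conn.execute(sql, (build_match_expression(query_string), top_n)).fetchall()


# Assumed schema and toy data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE fts_corpus USING fts5(id UNINDEXED, contents)")
conn.executemany(
    "INSERT INTO fts_corpus (id, contents) VALUES (?, ?)",
    [
        ("d1", "information retrieval benchmarks"),
        ("d2", "retrieval of lost luggage"),
        ("d3", "cooking with cast iron"),
        ("d4", "gardening in small spaces"),
        ("d5", "history of the bicycle"),
    ],
)
results = fts_search_sketch(conn, "information retrieval")
conn.close()
```

Because the terms are OR-joined, a document matching only one term is still returned; it simply ranks below documents that match more of the query.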

get_search_type

searcher.get_search_type(table_name)
Inspects the column names of the given table. Returns SearchType.SPARSE if a contents column is present; otherwise raises ValueError.
table_name
string
required
Table to inspect.
return
SearchType
Always SearchType.SPARSE for valid SQLite tables.
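
One way such a column check could work is to read the column names from PRAGMA table_info. The sketch below is an assumption about the mechanism, not QuackIR's code: infer_search_type is a hypothetical name, the SearchType enum is a local stand-in so the example runs without QuackIR installed, and the error message is invented.

```python
import sqlite3
from enum import Enum


class SearchType(Enum):
    # Local stand-in for quackir.SearchType so the sketch is self-contained.
    SPARSE = "sparse"


def infer_search_type(conn, table_name):
    """Return SearchType.SPARSE if the table has a contents column."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table_name})")}
    if "contents" in columns:
        return SearchType.SPARSE
    raise ValueError(f"Cannot infer search type for table {table_name!r}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE corpus (id TEXT, contents TEXT)")
conn.execute("CREATE TABLE vectors (id TEXT, embedding BLOB)")

kind = infer_search_type(conn, "corpus")

# A table without a contents column is rejected.
try:
    infer_search_type(conn, "vectors")
    rejected = False
except ValueError:
    rejected = True
conn.close()
```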

filter_id

SQLiteSearcher.filter_id(results, query_id)
Static method. Removes the entry whose id matches query_id. Called automatically by search().
results
list
required
List of (id, score) tuples.
query_id
string
default:"None"
Document ID to remove.
return
list
Filtered results list.
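
The filtering step is small enough to sketch directly. The function below, filter_id_sketch, is a hypothetical equivalent written for illustration, and its handling of query_id=None (returning the list unchanged) is an assumption based on the parameter's default.

```python
def filter_id_sketch(results, query_id=None):
    """Drop the (id, score) entry whose id equals query_id, keeping order."""
    if query_id is None:
        # Nothing to exclude, so the results pass through unchanged.
        return results
    return [(doc_id, score) for doc_id, score in results if doc_id != query_id]


hits = [("d1", 2.4), ("d7", 1.9), ("d3", 0.8)]
filtered = filter_id_sketch(hits, query_id="d7")
```

This kind of filter is typically useful when the query itself is a document from the indexed corpus, where the document would otherwise retrieve itself.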

close

searcher.close()
Closes the underlying SQLite connection.
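
Since the searcher exposes a close() method, contextlib.closing from the standard library can ensure the connection is released even if a search raises. The sketch below uses StubSearcher, a hypothetical stand-in with the same close() method, so it runs without QuackIR or a database file. Whether SQLiteSearcher also supports the with statement directly is not stated here.

```python
from contextlib import closing


class StubSearcher:
    """Hypothetical stand-in exposing close(), like SQLiteSearcher."""

    def __init__(self):
        self.closed = False

    def search(self, query_string):
        if not query_string:
            raise ValueError("empty query")
        return [("d1", 1.0)]

    def close(self):
        self.closed = True


searcher = StubSearcher()
try:
    # closing() calls searcher.close() on exit, including on exceptions.
    with closing(searcher) as s:
        s.search("")
except ValueError:
    pass
```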

Example

from quackir import SearchType
from quackir.search import SQLiteSearcher

searcher = SQLiteSearcher("sparse.db")

results = searcher.search(
    method=SearchType.SPARSE,
    query_string="information retrieval benchmarks",
    top_n=10,
)

for doc_id, score in results:
    print(doc_id, score)

searcher.close()
