PostgresSearcher queries full-text (sparse), pgvector (dense), or Reciprocal Rank Fusion (hybrid) indexes stored in a PostgreSQL database. It implements the abstract Searcher base class.
Dense and hybrid retrieval require the pgvector extension to be installed in your PostgreSQL instance.
Constructor
Opens a psycopg2 connection to the specified PostgreSQL database.
Name of the PostgreSQL database to connect to.
PostgreSQL username.
Methods
search
Dispatches to fts_search, embedding_search, or rrf_search based on method, then filters the query_id document from results.
One of SearchType.SPARSE, SearchType.DENSE, or SearchType.HYBRID.
Document ID to exclude from results.
Text query for sparse or hybrid search.
Query vector for dense or hybrid search.
Maximum number of results to return.
When True and method is SPARSE or HYBRID, the query_string is tokenized with Pyserini’s Lucene Analyzer before querying.
Table(s) to search. For HYBRID, provide [sparse_table, dense_table].
RRF rank smoothing constant. Only used when method is SearchType.HYBRID.
List of (doc_id, score) tuples ordered by descending score.
fts_search
Runs a full-text query built with to_tsquery('simple', …) and ranks results with ts_rank. The query string is sanitized before use: non-word characters are stripped and the remaining terms are joined with | (OR).
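The sanitization step described above can be sketched in a few lines. This is a minimal illustration, not the library's code: the helper name to_or_tsquery is hypothetical, and it assumes "stripped" means the non-word characters are deleted rather than replaced with spaces.

```python
import re


def to_or_tsquery(query_string: str) -> str:
    """Hypothetical helper: build an OR-joined tsquery string from free text."""
    # Delete every character that is neither a word character nor whitespace
    # (assumption: the library removes these characters outright).
    cleaned = re.sub(r"[^\w\s]", "", query_string)
    # split() with no arguments discards empty terms left by repeated whitespace.
    terms = cleaned.split()
    # Join the surviving terms with the tsquery OR operator.
    return " | ".join(terms)


print(to_or_tsquery("hello, world!"))
```

A string built this way can be passed as the second argument of to_tsquery, so a document matching any one of the terms is a candidate and ts_rank decides the ordering.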
Query string. Cleaned and converted to a tsquery expression internally.
Maximum number of results to return.
Name of the sparse table with a GIN index on contents.
List of (id, score) tuples.
embedding_search
Ranks rows by cosine distance to the query vector using the pgvector <=> distance operator. The score returned is 1 - cosine_distance, so higher values indicate greater similarity.
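To make the score convention concrete, the sketch below recomputes it in plain Python. The function name similarity_score is hypothetical, and the code only mirrors the arithmetic (cosine distance, then 1 - distance); in the library the distance is computed inside PostgreSQL by pgvector.

```python
import math


def similarity_score(query, doc):
    """Hypothetical helper: return 1 - cosine_distance for two equal-length vectors."""
    dot = sum(q * d for q, d in zip(query, doc))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_d = math.sqrt(sum(d * d for d in doc))
    # Cosine distance is 1 minus the cosine of the angle between the vectors.
    cosine_distance = 1.0 - dot / (norm_q * norm_d)
    # The reported score inverts the distance so that larger means more similar.
    return 1.0 - cosine_distance


print(similarity_score([1.0, 0.0], [1.0, 0.0]))
print(similarity_score([1.0, 0.0], [0.0, 1.0]))
```

Identical directions score close to 1, orthogonal vectors close to 0, and opposite directions close to -1.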
Query vector. Passed directly as the ::vector cast argument.
Maximum number of results to return.
Name of the dense table with a vector column.
List of (id, score) tuples ordered by descending similarity.
rrf_search
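Fuses the full-text and vector rankings with Reciprocal Rank Fusion. It determines which table is sparse and which is dense from table_names using get_search_type.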
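As a rough illustration of the fusion step, the sketch below applies the standard Reciprocal Rank Fusion formula, where each document receives 1 / (k + rank) from every ranking it appears in. The function name rrf_fuse, the 1-based ranks, and the default k=60 are assumptions taken from the common formulation of RRF, not confirmed details of this library.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=10):
    """Hypothetical helper: fuse ranked lists of document IDs with standard RRF."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1 (assumption), so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first, truncated to the requested number of results.
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:top_n]


sparse_ranking = ["a", "b", "c"]  # IDs from keyword retrieval, best first
dense_ranking = ["b", "d", "a"]   # IDs from vector retrieval, best first
for doc_id, score in rrf_fuse([sparse_ranking, dense_ranking]):
    print(doc_id, round(score, 6))
```

A larger k flattens the difference between adjacent ranks, so documents that appear in both rankings benefit more than documents that rank first in only one.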
Query string for keyword retrieval. Cleaned to a tsquery expression internally.
Query vector for semantic retrieval.
Number of candidates fetched from each sub-ranker before fusion.
RRF rank smoothing constant.
Two table names. The method auto-detects which is sparse and which is dense.
List of (id, rrf_score) tuples ordered by descending RRF score.
get_search_type
Queries information_schema.columns to detect the table type.
Table to inspect.
SearchType.SPARSE if a contents column exists; SearchType.DENSE if an embedding column exists.
ValueError if neither column is found.
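The detection rule can be sketched without a database by working on a list of column names. Everything here is illustrative: the SearchType enum is a local stand-in for the library's own, detect_search_type is a hypothetical name, and checking contents before embedding is an assumption based on the order in which the return values are listed.

```python
from enum import Enum


class SearchType(Enum):
    """Local stand-in for the library's enum, for illustration only."""
    SPARSE = "sparse"
    DENSE = "dense"


def detect_search_type(column_names):
    """Hypothetical helper: classify a table from its column names.

    In a live setup the names could come from a query such as
    SELECT column_name FROM information_schema.columns WHERE table_name = %s
    (an assumed query, not necessarily the one the library issues).
    """
    columns = set(column_names)
    if "contents" in columns:
        return SearchType.SPARSE
    if "embedding" in columns:
        return SearchType.DENSE
    raise ValueError("table has neither a contents nor an embedding column")


print(detect_search_type(["id", "contents"]))
print(detect_search_type(["id", "embedding"]))
```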
filter_id
Removes any result whose id matches query_id. Called automatically by search().
List of (id, score) tuples.
Document ID to remove.
Filtered results list.
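The filtering behavior amounts to a single pass over the results. The sketch below uses a hypothetical function name, drop_query_id, and assumes IDs are compared with plain equality.

```python
def drop_query_id(results, query_id):
    """Hypothetical helper: remove the entry whose ID equals query_id."""
    # Keep the original order, which is already sorted by descending score.
    return [(doc_id, score) for doc_id, score in results if doc_id != query_id]


results = [("d1", 0.9), ("q7", 0.8), ("d3", 0.5)]
print(drop_query_id(results, "q7"))
```

Because filtering happens after retrieval, the caller may get one result fewer than requested whenever the query document appears among the hits.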
close
Closes the psycopg2 connection.