PostgresSearcher API reference

PostgresSearcher runs sparse (full-text), dense (pgvector), or hybrid (Reciprocal Rank Fusion) retrieval over tables stored in a PostgreSQL database. It implements the abstract Searcher base class.

from quackir.search import PostgresSearcher

Dense and hybrid retrieval require the pgvector extension to be installed in your PostgreSQL instance.

Constructor

PostgresSearcher(db_name="quackir", user="postgres")

Opens a psycopg2 connection to the specified PostgreSQL database.

db_name

string

default:"quackir"

Name of the PostgreSQL database to connect to.

user

string

default:"postgres"

PostgreSQL username.

Methods

search

searcher.search(
    method,
    query_id=None,
    query_string=None,
    query_embedding=None,
    top_n=5,
    tokenize_query=True,
    table_names=["corpus"],
    rrf_k=60,
)

Main entry point for retrieval. Dispatches to fts_search, embedding_search, or rrf_search based on method, then removes the document whose ID matches query_id from the results.

method

SearchType

required

SearchType.SPARSE, SearchType.DENSE, or SearchType.HYBRID.

query_id

string

default:"None"

Document ID to exclude from results.

query_string

string

default:"None"

Text query for sparse or hybrid search.

query_embedding

number[]

default:"None"

Query vector for dense or hybrid search.

top_n

number

default:"5"

Maximum number of results to return.

tokenize_query

boolean

default:"true"

When True and method is SPARSE or HYBRID, the query_string is tokenized with Pyserini’s Lucene Analyzer before querying.

table_names

string[]

default:"[\"corpus\"]"

Table(s) to search. For HYBRID, provide [sparse_table, dense_table].

rrf_k

number

default:"60"

RRF rank smoothing constant. Only used when method is SearchType.HYBRID.

return

list

List of (doc_id, score) tuples ordered by descending score.
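The dispatch-then-filter flow described above can be sketched as follows. This is a minimal illustration, not the library's code: the SearchType enum here is a local stand-in, and dispatch, sparse, dense, and hybrid are hypothetical names for callables that take top_n and return (doc_id, score) tuples.

```python
from enum import Enum


class SearchType(Enum):
    # Local stand-in for quackir's SearchType so the sketch runs on its own.
    SPARSE = "sparse"
    DENSE = "dense"
    HYBRID = "hybrid"


def dispatch(method, query_id=None, top_n=5, *, sparse, dense, hybrid):
    """Pick a ranker by method, then drop the query document from its output."""
    rankers = {
        SearchType.SPARSE: sparse,
        SearchType.DENSE: dense,
        SearchType.HYBRID: hybrid,
    }
    if method not in rankers:
        raise ValueError(f"Unknown search method: {method!r}")
    results = rankers[method](top_n)
    # Filtering happens after ranking, so the list may hold fewer than top_n items.
    return [(doc_id, score) for doc_id, score in results if doc_id != query_id]
```

If filtering does run after the ranker has already truncated to top_n, as in this sketch, a query whose own document ranks highly would come back with top_n - 1 results.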

fts_search

searcher.fts_search(query_string, top_n=5, table_name="corpus")

Executes a PostgreSQL full-text search using to_tsquery('simple', …) and ranks results with ts_rank. The query string is sanitized before use: non-word characters are stripped and remaining terms are joined with | (OR).

query_string

string

required

Query string. Cleaned and converted to a tsquery expression internally.

top_n

number

default:"5"

Maximum number of results to return.

table_name

string

default:"corpus"

Name of the sparse table with a GIN index on contents.

return

list

List of (id, score) tuples.
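To make the sanitization step concrete, here is a rough sketch of how a query string could be turned into an OR-joined tsquery expression. The helper name to_or_tsquery is hypothetical, and replacing each non-word character with a space (rather than deleting it outright) is an assumption; the library's exact rules may differ.

```python
import re


def to_or_tsquery(query_string):
    """Strip non-word characters, then join the remaining terms with ' | '."""
    # Assumption: punctuation becomes whitespace, so "a-b" yields two terms.
    cleaned = re.sub(r"[^\w\s]", " ", query_string)
    terms = cleaned.split()
    return " | ".join(terms)


print(to_or_tsquery("information retrieval (benchmarks)!"))
# information | retrieval | benchmarks
```

Joining with | means a document matches if it contains any of the terms; ts_rank then orders the matches.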

embedding_search

searcher.embedding_search(query_embedding, top_n=5, table_name="corpus")

Computes cosine similarity using pgvector’s <=> distance operator. The score returned is 1 - cosine_distance, so higher values indicate greater similarity.

query_embedding

number[]

required

Query vector. Passed directly as the ::vector cast argument.

top_n

number

default:"5"

Maximum number of results to return.

table_name

string

default:"corpus"

Name of the dense table with a vector column.

return

list

List of (id, score) tuples ordered by descending similarity.
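The relationship between the returned score and cosine distance can be checked in plain Python. The sketch below only mirrors the arithmetic (score = 1 - cosine_distance); it does not call pgvector, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, the quantity pgvector's <=> operator computes."""
    return 1.0 - cosine_similarity(a, b)


def score(a, b):
    """Score as described above: 1 - cosine_distance, higher is more similar."""
    return 1.0 - cosine_distance(a, b)
```

Under this definition, vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1.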

rrf_search

searcher.rrf_search(query_string, query_embedding, top_n=5, k=60, table_names=["sparse", "dense"])

Combines full-text and semantic ranking using Reciprocal Rank Fusion. Each result’s RRF score is:

rrf_score = 1 / (k + keyword_rank) + 1 / (k + semantic_rank)

Sparse and dense tables are auto-detected from table_names using get_search_type.

query_string

string

required

Query string for keyword retrieval. Cleaned to a tsquery expression internally.

query_embedding

number[]

required

Query vector for semantic retrieval.

top_n

number

default:"5"

Number of candidates fetched from each sub-ranker before fusion.

k

number

default:"60"

RRF rank smoothing constant.

table_names

string[]

default:"[\"sparse\", \"dense\"]"

Two table names. The method auto-detects which is sparse and which is dense.

return

list

List of (id, rrf_score) tuples ordered by descending RRF score.
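The fusion formula can be illustrated with a small in-memory sketch. It assumes 1-based ranks and assumes that a document missing from one ranking simply contributes nothing for that ranking; neither detail is stated above, and rrf_fuse is a hypothetical helper rather than part of the library.

```python
def rrf_fuse(keyword_ids, semantic_ids, k=60, top_n=5):
    """Fuse two ranked ID lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranked in (keyword_ids, semantic_ids):
        # Assumption: ranks start at 1; absent documents add 0 for that list.
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:top_n]


fused = rrf_fuse(["a", "b", "c"], ["b", "c", "a"])
print([doc_id for doc_id, _ in fused])
# ['b', 'a', 'c']
```

With k=60, "b" (ranks 2 and 1) edges out "a" (ranks 1 and 3). A larger k flattens the difference between adjacent ranks, so a single top placement counts for less.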

get_search_type

searcher.get_search_type(table_name)

Queries information_schema.columns to detect the table type.

table_name

string

required

Table to inspect.

return

SearchType

SearchType.SPARSE if a contents column exists; SearchType.DENSE if an embedding column exists.

Raises ValueError if neither column is found.
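The detection rule can be sketched as a pure function over a table's column names. This is an illustration only: SearchType is a local stand-in, search_type_from_columns is a hypothetical helper, and checking contents before embedding is an assumption about precedence when a table has both columns.

```python
from enum import Enum


class SearchType(Enum):
    # Local stand-in for quackir's SearchType so the sketch runs on its own.
    SPARSE = "sparse"
    DENSE = "dense"


def search_type_from_columns(column_names):
    """Classify a table from the column names reported for it."""
    columns = set(column_names)
    if "contents" in columns:
        return SearchType.SPARSE
    if "embedding" in columns:
        return SearchType.DENSE
    raise ValueError("Table has neither a contents nor an embedding column")
```

In practice the column names would come from a query against information_schema.columns filtered by table name.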

filter_id

PostgresSearcher.filter_id(results, query_id)

Static method. Removes the entry whose id matches query_id. Called automatically by search().

results

list

required

List of (id, score) tuples.

query_id

string

default:"None"

Document ID to remove.

return

list

Filtered results list.

close

searcher.close()

Closes the underlying psycopg2 connection.
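Because close() must run even when a query fails, one option is to wrap the searcher in contextlib.closing from the standard library. The sketch below uses a hypothetical FakeSearcher stand-in so it runs without a database; whether PostgresSearcher itself supports the with statement directly is not stated here, so closing() is used instead.

```python
from contextlib import closing


class FakeSearcher:
    """Hypothetical stand-in exposing the same close() method as PostgresSearcher."""

    def __init__(self):
        self.closed = False

    def search(self, query_string):
        if self.closed:
            raise RuntimeError("searcher is closed")
        return [("doc1", 1.0)]

    def close(self):
        self.closed = True


def run_query(searcher, query_string):
    # closing() calls searcher.close() on exit, even if search() raises.
    with closing(searcher) as s:
        return s.search(query_string)
```

The same pattern should apply to any object with a close() method, so swapping FakeSearcher for a real searcher instance would not change run_query.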

Examples

Sparse full-text search

from quackir import SearchType
from quackir.search import PostgresSearcher

searcher = PostgresSearcher(db_name="mydb", user="myuser")

results = searcher.search(
    method=SearchType.SPARSE,
    query_string="information retrieval benchmarks",
    top_n=10,
)

for doc_id, score in results:
    print(doc_id, score)

searcher.close()

Dense cosine similarity search

from quackir import SearchType
from quackir.search import PostgresSearcher

query_vector = [0.12, -0.34, 0.56]  # replace with a real embedding

searcher = PostgresSearcher(db_name="mydb", user="myuser")

results = searcher.search(
    method=SearchType.DENSE,
    query_embedding=query_vector,
    top_n=10,
    table_names=["dense_corpus"],
)

for doc_id, score in results:
    print(doc_id, score)

searcher.close()

Hybrid RRF search

from quackir import SearchType
from quackir.search import PostgresSearcher

searcher = PostgresSearcher(db_name="mydb", user="myuser")

results = searcher.search(
    method=SearchType.HYBRID,
    query_string="neural retrieval",
    query_embedding=[0.12, -0.34, 0.56],
    top_n=10,
    table_names=["sparse_corpus", "dense_corpus"],
    rrf_k=60,
)

for doc_id, score in results:
    print(doc_id, score)

searcher.close()
