AI Pipeline: Gemini 2.5 Flash, Genkit, and Vector Search

TenderCheck AI uses a two-stage AI pipeline built on top of Google Genkit. Stage 1 extracts mandatory requirements from a tender document using a strict Legal Auditor persona. Stage 2 validates a vendor proposal against those requirements using a more nuanced Senior Evaluator persona. Both stages call the same model — Gemini 2.5 Flash via the @genkit-ai/google-genai Genkit plugin — but with very different system prompts and output schemas. LangSmith provides end-to-end observability across all AI calls via the traceable wrapper.

Stage 1 — Tender Analysis (Legal Auditor)

The first stage treats the tender document as a legal instrument and extracts only the clauses that carry real compliance weight.

AI role: A strict legal and technical auditor whose only concern is rules that cause disqualification or affect scoring.

Model: googleai/gemini-2.5-flash, called via ai.generate() from the Genkit SDK configured in genkit.config.ts.

Extraction focus: The prompt instructs the model to search for imperative phrases ("deberá", "será obligatorio", "se requiere", "es indispensable", "must", "shall") and to ignore introductory text, filler, or general descriptions that are not rules.

Output schema (Zod-validated):

const AnalysisSchema = z.object({
  summary: z.string(),
  requirements: z.array(
    z.object({
      id: z.string(),
      text: z.string(),
      type: z.enum(["TECHNICAL", "ADMINISTRATIVE", "LEGAL", "FINANCIAL"]),
      confidence: z.number(),
      keywords: z.array(z.string()),
      pageNumber: z.number(),
      sourceText: z.string(),
    }),
  ),
});

Each extracted requirement maps to a Requirement entity with:

Field	Description
`id`	UUID generated at runtime
`text`	Complete, exact technical demand
`type`	`TECHNICAL`, `ADMINISTRATIVE`, `LEGAL`, or `FINANCIAL`
`confidence`	`1.0` for clear mandates (`"deberá"`), `0.5` for desirable clauses
`keywords`	3–4 keywords used for vector search
`pageNumber`	Absolute page from `--- PAGE X ---` markers embedded by `PdfParserAdapter`
`sourceText`	Literal 1–2 sentence fragment from the document

The model output is returned as a typed object via Genkit’s structured output support — no manual JSON parsing is required.
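Because pageNumber is reported by the model, it can be useful to cross-check it against the `--- PAGE X ---` markers described in the table above. The following is a minimal sketch of such a check, not code from the project: the function name findPageOf and the idea of locating sourceText by exact substring match are assumptions made for illustration.

```typescript
// Hypothetical helper: returns the page on which `sourceText` first appears,
// based on the most recent "--- PAGE X ---" marker before it.
// Returns null when the fragment is not found verbatim or precedes any marker.
function findPageOf(markedText: string, sourceText: string): number | null {
  const at = markedText.indexOf(sourceText);
  if (at === -1) return null;

  // A fresh regex per call avoids sharing lastIndex state between calls.
  const marker = /^--- PAGE (\d+) ---$/gm;
  let page: number | null = null;
  let m: RegExpExecArray | null;
  while ((m = marker.exec(markedText)) !== null) {
    if (m.index > at) break; // marker belongs to a later page
    page = Number(m[1]);
  }
  return page;
}
```

A caller could compare the returned value with the model's pageNumber and flag mismatches for review. An exact substring match is strict, so a production check would probably need to normalise whitespace first.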

Stage 2 — Proposal Validation (Senior Evaluator)

The second stage compares a vendor’s proposal against the requirements extracted in Stage 1.

AI role: A senior tender auditor who understands technical synonyms, partial compliance, and implicit evidence. The system prompt explicitly forbids answering “not specified” when numerical requirements are present in the context.

Model: googleai/gemini-2.5-flash, the same model as Stage 1 with a different system prompt.

Input: The requirement text and the full proposal text, truncated to PROPOSAL_TRUNCATE_SINGLE = 500000 characters to fit Gemini 2.5 Flash’s 1 million token context window.

Output per requirement (ComparisonResult):

const ComparisonSchema = z.object({
  status: z.enum(["COMPLIANT", "NON_COMPLIANT", "PARTIAL"]),
  reasoning: z.string(),
  score: z.number(),
  sourceQuote: z.string(),
});

Batch processing: To reduce latency and API call overhead, requirements are validated in batches:

// constants.ts
export const BATCH_CHUNK_SIZE = 3;     // requirements per AI call
export const MAX_AI_CONCURRENCY = 3;   // parallel batch calls

The compareBatch() method sends BATCH_CHUNK_SIZE requirements in a single ai.generate() call. Up to MAX_AI_CONCURRENCY batches are processed concurrently using Promise.all. All reasoning output is returned strictly in Spanish.
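One way to combine the two constants is to split the requirements into batches and then run the batches in waves, awaiting each wave with Promise.all. The sketch below illustrates that pattern under stated assumptions: chunk, processInWaves, and the worker callback are hypothetical names, and the real compareBatch() internals are not shown in this document.

```typescript
const BATCH_CHUNK_SIZE = 3;   // requirements per AI call
const MAX_AI_CONCURRENCY = 3; // parallel batch calls

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Run `worker` over the batches, at most `limit` at a time.
// Each wave is awaited with Promise.all before the next wave starts,
// and results keep the original batch order.
async function processInWaves<T, R>(
  batches: T[][],
  limit: number,
  worker: (batch: T[]) => Promise<R[]>,
): Promise<R[]> {
  const results: R[] = [];
  for (const wave of chunk(batches, limit)) {
    const waveResults = await Promise.all(wave.map((b) => worker(b)));
    for (const r of waveResults) results.push(...r);
  }
  return results;
}
```

With ten requirements this yields four batches (3, 3, 3, 1) processed in two waves. Waves are simple but wait for the slowest call in each group. A sliding-window limiter would keep all slots busy at the cost of more code.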

Vector Embeddings

Before sending requirements to the LLM for validation, VectorSearchService uses semantic search to pre-filter only the requirements relevant to the proposal chunk being evaluated. This reduces LLM calls by 60–80%.

Model: googleai/gemini-embedding-001, called via ai.embed().

Dimensions: gemini-embedding-001 produces 3072-dimensional vectors. This is the active embedding model used by VectorSearchService (set in the constructor: this.dimensions = 3072). Note that constants.ts also defines a legacy constant VECTOR_DIMENSIONS = 768, left over from an earlier nomic-embed-text model; that constant is not used by the active embedding service.

Storage: Vectors are stored as BLOBs in the requirements table of Turso (SQLite). Because SQLite has no native cosine similarity function, all similarity computation happens in JavaScript.

Serialize / deserialize:

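// Serialize Float32Array → Buffer for SQLite BLOB storage.
// byteOffset and byteLength are passed explicitly so that a Float32Array
// that is a view over a larger ArrayBuffer is not over-read.
serializeEmbedding(embedding: Float32Array): Buffer {
  return Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength);
}

// Deserialize Buffer from SQLite → Float32Array
deserializeEmbedding(buffer: Buffer): Float32Array {
  // Missing or empty BLOB: fall back to a zero vector of the expected size
  if (!buffer || !buffer.buffer || buffer.length === 0) {
    return new Float32Array(this.dimensions);
  }
  // Copy the bytes first. A Float32Array view needs a 4-byte-aligned offset,
  // which a pooled or sliced Buffer does not guarantee.
  const bytes = new Uint8Array(buffer);
  return new Float32Array(
    bytes.buffer,
    0,
    Math.floor(bytes.byteLength / Float32Array.BYTES_PER_ELEMENT),
  );
}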

Similarity search thresholds (from constants.ts):

export const SIMILARITY_THRESHOLD = 0.3;  // cosine similarity floor
export const TOP_K_SIMILAR = 5;           // max results returned

The findSimilar() method on VectorSearchService computes pairwise cosine similarity between the proposal embedding and every stored requirement embedding, filters by SIMILARITY_THRESHOLD, and returns up to TOP_K_SIMILAR results sorted by descending similarity score.
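The filter-and-rank step described above can be sketched as follows. This is an illustrative version, not the project's implementation: the StoredRequirement and SimilarityHit shapes, the standalone function signatures, and the choice to score zero-length vectors as 0 are assumptions.

```typescript
const SIMILARITY_THRESHOLD = 0.3; // cosine similarity floor
const TOP_K_SIMILAR = 5;          // max results returned

// Hypothetical shapes, used only for this sketch.
interface StoredRequirement { id: string; embedding: Float32Array; }
interface SimilarityHit { id: string; score: number; }

// cos(a, b) = (a · b) / (|a| * |b|); returns 0 if either vector has zero length.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every stored embedding, drop those below the threshold,
// and return the best `topK` in descending score order.
function findSimilar(
  query: Float32Array,
  stored: StoredRequirement[],
  threshold: number = SIMILARITY_THRESHOLD,
  topK: number = TOP_K_SIMILAR,
): SimilarityHit[] {
  return stored
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

This is a linear scan, so cost grows with the number of stored requirements multiplied by the vector dimension. That is usually acceptable for the requirements of a single tender. Scoring a zero vector as 0 also means a fallback embedding from deserializeEmbedding() never passes the threshold.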

LangSmith Observability

All three core GeminiGenkitService methods — _analyze, _compareProposal, and _compareBatch — are wrapped with traceable from the LangSmith SDK:

import { traceable } from "langsmith/traceable";

private _analyze = traceable(
  async (text: string): Promise<TenderAnalysis> => { /* ... */ },
  { name: "analyze_tender" },
);

private _compareProposal = traceable(
  async (requirementText: string, proposalText: string) => { /* ... */ },
  { name: "compare_proposal" },
);

private _compareBatch = traceable(
  async (requirements: { id: string; text: string }[], proposalText: string) => { /* ... */ },
  { name: "compare_batch" },
);

This sends a trace for every AI invocation to your LangSmith project. Enable tracing by setting these environment variables on the backend:

LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_API_KEY=<your-langsmith-api-key>
LANGCHAIN_PROJECT=tendercheck-ai

With tracing active you can inspect prompt versions, per-call latency, token usage, and failure rates from the LangSmith dashboard. The server logs whether tracing is enabled at startup:

🔍 [LangSmith] Tracing Enabled: true
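A startup check along these lines can produce that log line and surface missing variables early. This is a sketch only: how the server actually decides whether tracing is enabled is not shown in this document, and the tracingStatus function and its return shape are assumptions.

```typescript
// The four variables listed above.
const TRACING_VARS = [
  "LANGCHAIN_TRACING_V2",
  "LANGCHAIN_ENDPOINT",
  "LANGCHAIN_API_KEY",
  "LANGCHAIN_PROJECT",
];

type Env = Record<string, string | undefined>;

// Hypothetical helper: reports whether the tracing flag is set to "true"
// and which of the expected variables are unset or empty.
function tracingStatus(env: Env): { enabled: boolean; missing: string[]; logLine: string } {
  const enabled = env["LANGCHAIN_TRACING_V2"] === "true";
  const missing = TRACING_VARS.filter((name) => !env[name]);
  return {
    enabled,
    missing,
    logLine: `🔍 [LangSmith] Tracing Enabled: ${enabled}`,
  };
}
```

In a Node backend this would typically be called once at boot with process.env, logging logLine and warning if missing is non-empty while enabled is true.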

Chunked Processing for Large PDFs

PDFs exceeding LARGE_PDF_THRESHOLD = 15 pages are automatically split into overlapping page chunks before being sent to the LLM.

// constants.ts
export const PAGES_PER_CHUNK = 10;          // pages per chunk
export const CHUNK_MAX_CHARS = 500000;       // character ceiling per chunk
export const CHUNK_PARALLEL_PROCESSING = 3; // max concurrent chunk calls
export const LARGE_PDF_THRESHOLD = 15;      // page count trigger

The chunkPages() utility in infrastructure/utils/chunking.ts inserts --- PAGE X --- markers at each page boundary so the model can accurately report pageNumber for every extracted requirement. Up to CHUNK_PARALLEL_PROCESSING chunks are analysed concurrently via Promise.all inside GeminiGenkitService.analyzeChunks(). Results from all chunks are then aggregated into a single TenderAnalysis before being persisted.

// GeminiGenkitService.ts
async analyzeChunks(chunks: PageChunk[]): Promise<ChunkAnalysisResult[]> {
  const results = await Promise.all(
    chunks.map((chunk) => this.analyzeChunk(chunk)),
  );
  // ...aggregate and return
  return results;
}
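For the chunking step itself, the following sketch shows one way overlapping page chunks with `--- PAGE X ---` markers could be built. It is not the code in infrastructure/utils/chunking.ts: the signature, the SketchPageChunk shape, and the one-page overlap are assumptions, and the CHUNK_MAX_CHARS ceiling is left out for brevity.

```typescript
const PAGES_PER_CHUNK = 10; // pages per chunk

// Hypothetical chunk shape for this sketch; page numbers are 1-based and inclusive.
interface SketchPageChunk { startPage: number; endPage: number; text: string; }

// Split per-page text into chunks of `pagesPerChunk` pages, where each chunk
// repeats the last `overlap` pages of the previous one. Every page is prefixed
// with a marker carrying its absolute page number.
function chunkPages(
  pages: string[],
  pagesPerChunk: number = PAGES_PER_CHUNK,
  overlap: number = 1, // assumed value; the real overlap is not documented here
): SketchPageChunk[] {
  if (overlap < 0 || pagesPerChunk <= overlap) {
    throw new Error("pagesPerChunk must be greater than overlap");
  }
  const step = pagesPerChunk - overlap;
  const chunks: SketchPageChunk[] = [];
  for (let start = 0; start < pages.length; start += step) {
    const end = Math.min(start + pagesPerChunk, pages.length);
    const text = pages
      .slice(start, end)
      .map((body, i) => `--- PAGE ${start + i + 1} ---\n${body}`)
      .join("\n");
    chunks.push({ startPage: start + 1, endPage: end, text });
    if (end === pages.length) break; // last page reached, stop before emitting a redundant tail
  }
  return chunks;
}
```

Under these assumptions a 25-page document yields three chunks covering pages 1–10, 10–19, and 19–25. Because overlapping pages are analysed twice, the aggregation step would need to de-duplicate requirements extracted from the shared pages.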

Genkit’s ai.generate() function accepts a Zod schema via the output: { schema } option. When a schema is provided, Genkit automatically instructs the model to return structured JSON and deserialises the response into a fully-typed object — eliminating the need for any manual JSON parsing or try-catch around JSON.parse().
