

GenkitService is the sole LLM gateway in Salud IA Bot. It is a NestJS @Injectable() service that wraps the official openai Node.js SDK, pointed at https://openrouter.ai/api/v1, and invokes Meta LLaMA 3.1 70B Instruct by default. Every prompt sent through the bot that cannot be answered by structured data alone is routed here. The service is kept deliberately minimal: no injected dependencies, just an env-driven OpenAI client instance and a retry loop.

Configuration

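GenkitService reads two environment variables: OPENROUTER_API_KEY once, when the service is constructed, and OPENROUTER_MODEL on every call to generateResponse. Neither is enforced at startup. If a variable is absent the service falls back to a default instead of crashing, which is useful for tests.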
OPENROUTER_API_KEY (string, required)
Your OpenRouter API key. Falls back to 'test' if unset; requests will then fail authentication, but the service will not crash on startup.
OPENROUTER_MODEL (string, optional)
The OpenRouter model identifier to use for completions. Defaults to 'meta-llama/Meta-Llama-3.1-70B-Instruct' if not set.
The underlying OpenAI client is configured with:
private readonly openai = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY ?? 'test',
  baseURL: 'https://openrouter.ai/api/v1',
});
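The fallbacks described above can be sketched in isolation. This is a minimal illustration, not code from the service: resolveOpenRouterConfig, OpenRouterConfig, and usingPlaceholderKey are hypothetical names introduced here so the two defaulting rules (`??` for the key, `||` for the model) can be inspected or unit-tested without constructing a client.

```typescript
// Hypothetical helper, not part of GenkitService. It only mirrors the
// fallbacks documented above.
interface OpenRouterConfig {
  apiKey: string;
  baseURL: string;
  model: string;
  usingPlaceholderKey: boolean; // true when OPENROUTER_API_KEY was not set
}

type Env = Record<string, string | undefined>;

function resolveOpenRouterConfig(env: Env): OpenRouterConfig {
  // `??` falls back only when the variable is undefined, so an empty
  // OPENROUTER_API_KEY would be passed through unchanged.
  const apiKey = env.OPENROUTER_API_KEY ?? 'test';
  // `||` also falls back when the variable is an empty string.
  const model =
    env.OPENROUTER_MODEL || 'meta-llama/Meta-Llama-3.1-70B-Instruct';
  return {
    apiKey,
    baseURL: 'https://openrouter.ai/api/v1',
    model,
    usingPlaceholderKey: env.OPENROUTER_API_KEY === undefined,
  };
}

// Example call in a Node.js process:
//   const config = resolveOpenRouterConfig(process.env);
```

A startup check could log a warning when usingPlaceholderKey is true, so a missing key is noticed before the first request fails with a 401.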

Class: GenkitService

import { Injectable, Logger } from '@nestjs/common';
import OpenAI from 'openai';

@Injectable()
export class GenkitService {
  private readonly logger = new Logger(GenkitService.name);
  private readonly openai = new OpenAI({
    apiKey: process.env.OPENROUTER_API_KEY ?? 'test',
    baseURL: 'https://openrouter.ai/api/v1',
  });

  private async sleep(ms: number) {
    return new Promise((resolve) => setTimeout(resolve, ms));
  }

  async generateResponse(prompt: string): Promise<string> {
    const MAX_RETRIES = 3;
    let lastError: any;
    const model =
      process.env.OPENROUTER_MODEL ||
      'meta-llama/Meta-Llama-3.1-70B-Instruct';

    for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
      try {
        const response = await this.openai.chat.completions.create({
          model,
          messages: [{ role: 'user', content: prompt }],
        });
        return response.choices[0].message.content ?? '';
      } catch (error: any) {
        lastError = error;
        const isTransient = error?.status === 429 || error?.status === 503;
        if (isTransient && attempt < MAX_RETRIES) {
          const delay = Math.pow(2, attempt) * 1000;
          this.logger.warn(
            `OpenRouter transient error (${error.status || error.code}). ` +
            `Retrying in ${delay}ms... (Attempt ${attempt + 1}/${MAX_RETRIES})`,
          );
          await this.sleep(delay);
          continue;
        }
        this.logger.error(
          `OpenRouter API failed after ${attempt} retries: ${error.message}`,
        );
        throw error;
      }
    }
    throw lastError;
  }
}

Method: generateResponse

generateResponse is the only public method. It builds a single-turn user message and calls openai.chat.completions.create().
Parameter: prompt (string, required)
The full prompt string, sent to the model as a single user-role message. Salud IA Bot assembles context (retrieved SIVIGILA facts, conversation history, system instruction) before passing it here.
Returns: Promise<string>
The LLM’s text reply (response.choices[0].message.content). Resolves to an empty string if the model returns a null content field.
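A caller has two outcomes to plan for: an empty string (null content) and a thrown error. The sketch below shows one way to handle both. It is an assumption about how a caller might be written, not code from the bot; PromptResponder and replyOrFallback are hypothetical names, and the interface is just the structural shape of generateResponse.

```typescript
// Minimal structural type; GenkitService satisfies it via generateResponse().
interface PromptResponder {
  generateResponse(prompt: string): Promise<string>;
}

// Hypothetical caller-side helper (not in the bot's source).
async function replyOrFallback(
  llm: PromptResponder,
  prompt: string,
  fallback: string,
): Promise<string> {
  try {
    const text = await llm.generateResponse(prompt);
    // generateResponse resolves to '' when the model's content is null.
    return text.trim().length > 0 ? text : fallback;
  } catch {
    // Reached for non-transient errors, or once retries are exhausted.
    return fallback;
  }
}
```

Depending on a narrow interface instead of the concrete class also makes it easy to substitute a stub in unit tests.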

Retry Logic

The method uses an attempt counter (attempt = 0 … MAX_RETRIES), so it makes at most MAX_RETRIES + 1 = 4 calls: one initial call plus up to three retries. It only retries on transient HTTP errors.
| Attempt | Delay before retry | Retried? |
| --- | --- | --- |
| 0 -> 1 | 1000 ms (2^0 x 1000) | HTTP 429 or 503 only |
| 1 -> 2 | 2000 ms (2^1 x 1000) | HTTP 429 or 503 only |
| 2 -> 3 | 4000 ms (2^2 x 1000) | HTTP 429 or 503 only |
| 3 (final) | - | Error is re-thrown |
Non-transient errors (e.g. HTTP 401 Unauthorized, 400 Bad Request) are not retried. They are logged at error level and immediately re-thrown to the caller.
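The delay schedule can be reproduced with the same formula the method uses, 2^attempt x 1000 ms. The helper name backoffDelays and its baseMs parameter are illustrative only; the service computes each delay inline.

```typescript
// Delay before each retry, using the formula from generateResponse:
// Math.pow(2, attempt) * 1000, where attempt counts from 0.
function backoffDelays(maxRetries: number, baseMs = 1000): number[] {
  return Array.from(
    { length: maxRetries },
    (_, attempt) => Math.pow(2, attempt) * baseMs,
  );
}

const delays = backoffDelays(3); // [1000, 2000, 4000]

// Total time spent sleeping if every attempt hits a transient error.
const worstCaseWaitMs = delays.reduce((sum, d) => sum + d, 0); // 7000
```

In the worst case a caller therefore waits 7 seconds of backoff on top of the time taken by the four requests themselves.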
If OPENROUTER_API_KEY is not set, every call will fail with a 401. Set the variable before starting the NestJS app in production.

Error Handling

const isTransient = error?.status === 429 || error?.status === 503;
if (isTransient && attempt < MAX_RETRIES) {
  const delay = Math.pow(2, attempt) * 1000;
  this.logger.warn(`OpenRouter transient error ...`);
  await this.sleep(delay);
  continue;
}
this.logger.error(`OpenRouter API failed after ${attempt} retries: ${error.message}`);
throw error;
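When the final attempt (attempt === MAX_RETRIES) also fails, the isTransient && attempt < MAX_RETRIES check is false, so the catch block logs at error level and re-throws the error to the bot handler. The throw lastError after the loop is a defensive fallback: with the logic above it is never reached, but it guarantees that every code path either returns a string or throws.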
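The same retry pattern can be written as a standalone helper, which makes the control flow easier to test. This is a sketch under stated assumptions, not the service's code: withRetry, the Sleep type, and the injectable sleep parameter are hypothetical, and errors are assumed to expose a numeric status field as the checks above imply. Logging is omitted for brevity.

```typescript
type Sleep = (ms: number) => Promise<void>;

// Assumed error shape: HTTP errors carry a numeric `status`.
function isTransient(error: unknown): boolean {
  const status = (error as { status?: number } | null)?.status;
  return status === 429 || status === 503;
}

// Hypothetical generic helper mirroring the loop in generateResponse.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 3,
  sleep: Sleep = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on non-transient errors or when the retry budget is spent.
      if (!isTransient(error) || attempt >= maxRetries) throw error;
      await sleep(Math.pow(2, attempt) * 1000);
    }
  }
}
```

Because the loop has no exit condition other than return or throw, this form needs no trailing throw after it. Passing sleep as a parameter lets a test record the delays instead of actually waiting.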

Integration with BotUpdate

BotUpdate calls GenkitService.generateResponse() only after the structured-data RAG layer fails to produce a direct answer. This keeps LLM usage minimal, and because answers found in the structured SIVIGILA data never pass through the model, those statistics cannot be altered by model hallucination.
// Simplified excerpt from bot.update.ts
const structuredResult = await saludPublicaService.procesarPregunta(text);
if (structuredResult.encontrado) {
  return ctx.reply(buildMessage(structuredResult.evento));
}
// Fallback to LLM
const llmResponse = await genkitService.generateResponse(buildPrompt(text, context));
return ctx.reply(llmResponse);
To swap the model (e.g. to meta-llama/Meta-Llama-3.1-8B-Instruct for lower latency), set OPENROUTER_MODEL in your .env file without touching the service code.
