Welcome to the RAG Support System documentation. This system combines semantic retrieval over an indexed knowledge base with LLM-based generation to produce grounded, cited answers while mitigating hallucinations and adversarial inputs.

What is the RAG Support System?

The RAG Support System is an AI-powered customer support platform that:
  • Retrieves relevant documentation from a vector store (Chroma) using semantic search
  • Triages incoming tickets with ML models to predict category and priority
  • Generates grounded, cited answers using large language models
  • Evaluates answer quality with offline faithfulness and relevance metrics
  • Flags low-confidence cases for human review
Built with Python, FastAPI, LangChain, Chroma, and OpenAI, the system prioritizes correctness, modularity, and production readiness.

Core capabilities

Semantic retrieval

Vector-based search over your knowledge base using OpenAI embeddings and Chroma for fast, relevant results

ML-powered triage

Automatic classification of support tickets by category and priority with confidence scoring

Grounded answers

LLM-generated responses with citations and internal next steps, backed by retrieved context

Production safeguards

Prompt injection protection, adversarial testing, and human-in-the-loop workflows for uncertain cases

Key features

  • Document ingestion — Chunk and embed markdown files into Chroma with configurable chunking strategies
  • RAG agent — Retrieval-augmented generation pipeline with category-aware filtering and low-latency responses
  • Triage models — TF-IDF + Logistic Regression models for category and priority prediction
  • Structured outputs — JSON-formatted citations, internal next steps, and review flags
  • Offline evaluation — Relevance, faithfulness, and adversarial robustness testing with audit-ready reports
  • FastAPI endpoints — Production-ready HTTP API for ingestion, question answering, and triage
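As an illustration of what a configurable chunking strategy can look like, here is a minimal sketch of fixed-size chunking with overlap in plain Python. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are assumptions made for this example, not the project's actual ingestion code, which may rely on LangChain text splitters instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; sizes are counted in characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no chunk is made up only of already-covered overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of more embeddings to store; each resulting chunk would then be embedded and written to the vector store.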

Get started

Quickstart

Go from zero to your first RAG query in under 5 minutes

Installation

Set up Python, dependencies, and environment variables

Architecture

Understand system components and request flow

API Reference

Explore endpoints, request models, and examples

Architecture overview

The system follows a modular architecture with clear separation of concerns:
Client Request
  ↓
FastAPI Layer (validation, routing)
  ↓
Triage Service (category + priority prediction)
  ↓
RAG Service (embed query → retrieve chunks → generate answer)
  ↓
Response (draft_reply, citations, internal_next_steps, needs_human_review)
See the Architecture page for detailed component descriptions and request flow diagrams.
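The request flow can be sketched in plain Python as follows. Only the four response field names come from this page; the functions `triage`, `retrieve`, `generate`, and `handle_ticket`, the `min_confidence` threshold, and the toy knowledge base are hypothetical stand-ins for the real services, models, and vector store.

```python
from dataclasses import dataclass, field


@dataclass
class SupportResponse:
    # Field names mirror the response shape listed in the overview.
    draft_reply: str
    citations: list[str] = field(default_factory=list)
    internal_next_steps: list[str] = field(default_factory=list)
    needs_human_review: bool = False


def triage(ticket: str) -> tuple[str, str, float]:
    """Stand-in for the triage service: returns (category, priority, confidence)."""
    if "refund" in ticket.lower():
        return "billing", "high", 0.92
    return "general", "low", 0.40


def retrieve(ticket: str, category: str) -> list[dict]:
    """Stand-in for retrieval with category-aware filtering (made-up data)."""
    knowledge_base = [
        {"source": "billing/refunds.md", "category": "billing",
         "text": "Example policy text about refunds."},
    ]
    return [chunk for chunk in knowledge_base if chunk["category"] == category]


def generate(ticket: str, chunks: list[dict]) -> str:
    """Stand-in for the LLM call: simply joins the retrieved text."""
    return " ".join(chunk["text"] for chunk in chunks)


def handle_ticket(ticket: str, min_confidence: float = 0.6) -> SupportResponse:
    category, priority, confidence = triage(ticket)
    chunks = retrieve(ticket, category)
    if not chunks:
        # No supporting context: do not draft an answer, escalate instead.
        return SupportResponse(
            draft_reply="",
            internal_next_steps=["Escalate: no supporting documentation found"],
            needs_human_review=True,
        )
    return SupportResponse(
        draft_reply=generate(ticket, chunks),
        citations=[chunk["source"] for chunk in chunks],
        internal_next_steps=[f"Route to {category} queue with {priority} priority"],
        needs_human_review=confidence < min_confidence,
    )
```

In this sketch a ticket is sent to human review either when triage confidence falls below the assumed threshold or when retrieval returns nothing to ground the answer.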

Design principles

The RAG Support System is built on these core principles:
  1. Correctness first — Answers must be supported by retrieved knowledge; hallucinations are unacceptable
  2. Modularity — Retrieval, generation, and evaluation are independently testable
  3. Cost awareness — Predictable and controllable LLM usage with bounded retrieval and caching
  4. Security — Resilience against prompt injection and misuse with explicit refusal behavior
  5. Production readiness — Observable, scalable, and maintainable with structured logging and metrics
This system prioritizes faithfulness over creativity. Lower temperature and constrained prompts reduce expressive freedom, but they lower the risk of hallucinations in support contexts.
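To make the cost-awareness principle concrete, the sketch below caps the number of retrieved chunks and caches repeated queries using the standard library's `functools.lru_cache`. The names `MAX_TOP_K`, `search`, and `_expensive_search`, and the cap of 5, are assumptions for illustration; a real deployment would call the vector store where the placeholder search is.

```python
from functools import lru_cache

MAX_TOP_K = 5  # assumed upper bound on chunks retrieved per query

CALLS = {"count": 0}  # counts how often the expensive backend is actually hit


def _expensive_search(query: str, top_k: int) -> tuple[str, ...]:
    """Placeholder for an embedding call plus vector-store lookup."""
    CALLS["count"] += 1
    return tuple(f"chunk-{i} for {query}" for i in range(top_k))


@lru_cache(maxsize=256)
def _cached_search(normalized_query: str, top_k: int) -> tuple[str, ...]:
    # Results are tuples so that cached values cannot be mutated by callers.
    return _expensive_search(normalized_query, top_k)


def search(query: str, top_k: int = 3) -> tuple[str, ...]:
    """Bound top_k and normalize the query so equivalent requests share a cache entry."""
    bounded_k = max(1, min(top_k, MAX_TOP_K))
    normalized = " ".join(query.lower().split())
    return _cached_search(normalized, bounded_k)
```

Bounding `top_k` keeps prompt size, and therefore token cost, predictable, while the cache avoids paying twice for queries that differ only in case or whitespace.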

Technology stack

  • Python 3.12+ — Core language with type hints and async support
  • FastAPI — High-performance API framework with automatic OpenAPI docs
  • LangChain — LLM orchestration and document processing
  • Chroma — Vector database for semantic search
  • OpenAI — Embeddings (text-embedding-3-small) and LLM (GPT-4.1)
  • scikit-learn — ML models for triage classification
  • uv — Fast Python package installer and dependency manager

Next steps

  1. Follow the quickstart: Install dependencies, ingest documents, and make your first RAG query in minutes
  2. Read the architecture guide: Understand how the system components work together
  3. Explore the API reference: Learn about available endpoints and request/response models
  4. Run evaluations: Test answer quality with offline metrics and adversarial robustness checks