NISIRA’s knowledge base is built from documents stored in Google Drive and indexed as vector embeddings in a PostgreSQL pgvector table. This page walks through every step of the document lifecycle — from uploading a file to verifying that its embeddings are ready for retrieval — and covers the CLI tools available for programmatic corpus management.

Uploading documents

You can add documents to the corpus in two ways:

Option A: Upload through the Admin Panel

Navigate to Admin Panel → Google Drive tab and use the upload card on the right side of the screen. The component accepts .pdf, .txt, .md, .doc, and .docx files up to 50 MB via drag-and-drop or a file-picker dialog. When you click Subir documento, the panel posts the file to the backend:
POST /api/admin/drive/upload/
Content-Type: multipart/form-data
Authorization: Bearer <admin_token>

file=<binary>
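The same request can be assembled outside the panel. The following is a minimal sketch using only the Python standard library; the base URL, the helper name `build_upload_request`, and the `application/octet-stream` part type are assumptions for illustration, while the path, the `file` form field, and the Bearer header come from the request above.

```python
import urllib.request
import uuid


def build_upload_request(base_url, admin_token, file_name, file_bytes):
    """Build (but do not send) a multipart/form-data upload request.

    Hypothetical helper: only the endpoint path, the "file" field name and
    the Bearer scheme are taken from the documentation.
    """
    boundary = uuid.uuid4().hex
    # One form part named "file"; file_name must not contain double quotes.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/admin/drive/upload/",
        data=head + file_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Example (assumed local deployment); sending is left to the caller:
# req = build_upload_request("http://localhost:8000", token, "notes.txt", data)
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```

Building the request separately from sending it keeps the sketch easy to inspect before any file reaches the server.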
The backend saves the file to Google Drive (if authenticated) and also writes it to local storage at data/documents/<file_name>. Embedding generation for that single file starts immediately in a background daemon thread, so the upload response returns at once with "processing": "background".

Option B: Google Drive folder sync

Files placed directly in the configured Drive folder are pulled in when you trigger a sync. Navigate to the Google Drive tab and click Sincronizar (or call the sync endpoint directly). Only new files are downloaded; duplicates are identified by MD5 hash and skipped.
POST /api/admin/drive/sync/
Authorization: Bearer <admin_token>
When a sync downloads one or more new files it automatically launches background embedding generation for those files. You do not need to trigger generation manually after a sync completes.
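The MD5-based duplicate check used by the sync can be illustrated with a short sketch. How the backend actually obtains and compares hashes is not described here, so the function names and the in-memory inputs below are assumptions made for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a file's contents."""
    return hashlib.md5(data).hexdigest()


def select_new_files(candidates, known_hashes):
    """Return the names of files whose content has not been seen before.

    candidates:   dict mapping file name -> file bytes (hypothetical input)
    known_hashes: iterable of MD5 digests already in the corpus
    """
    seen = set(known_hashes)
    new_files = []
    for name, data in candidates.items():
        digest = md5_hex(data)
        if digest in seen:
            continue  # same content already present: skip as duplicate
        seen.add(digest)
        new_files.append(name)
    return new_files
```

Because the comparison is on content rather than name, a renamed copy of an existing document is still treated as a duplicate in this sketch.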

End-to-end upload and embed workflow

1. Sync or upload your documents

Either trigger a Drive sync from the Google Drive tab or upload files directly. Wait until the progress bar shows completed and the log stream shows [OK] Sincronización completa.

2. Open the Embeddings tab

Switch to Admin Panel → Embeddings tab. Review the stat cards to confirm that the total chunk count and table size reflect the documents you just added.

3. Generate embeddings (if needed)

If your upload went through the sync path and new files were detected, generation has already started. If you uploaded files manually and the background thread did not cover all files, click Generar to process all remaining unprocessed documents:
POST /api/admin/embeddings/generate/
Authorization: Bearer <admin_token>
The endpoint returns immediately. Poll for progress:
GET /api/admin/embeddings/progress/
Authorization: Bearer <admin_token>
The response includes status (starting, running, completed, or error), current, total, current_file, processed, errors, and a rolling logs array. The panel polls this endpoint every 1.5 seconds and renders the results in a live progress bar and log stream.

4. Verify embeddings

Once generation finishes, run a verification pass to confirm index integrity:
POST /api/admin/embeddings/verify/
Authorization: Bearer <admin_token>
A successful response includes collections_verified and per-collection status. The panel displays a success notification with the count of verified collections.

5. Confirm indexed documents

Click Ver documentos indexados in the Embeddings tab to open the processed-files list. Each row shows the filename, file type icon, and chunk count. Files marked Sin archivo en Drive exist in the vector index but have been removed from Drive; you can delete their embeddings individually (see below).
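The polling in step 3 can also be scripted. The sketch below waits until the progress endpoint reports a terminal status (completed or error), using the same 1.5-second interval as the panel. The helper names, the base URL, and the timeout are assumptions; only the endpoint path, the status values, and the Bearer header come from the steps above.

```python
import json
import time
import urllib.request

TERMINAL_STATUSES = {"completed", "error"}


def http_progress_fetcher(base_url, admin_token):
    """Return a callable that GETs the progress endpoint and parses the JSON."""
    url = base_url.rstrip("/") + "/api/admin/embeddings/progress/"

    def fetch():
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {admin_token}"}
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    return fetch


def wait_for_generation(fetch_progress, interval=1.5, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until status is 'completed' or 'error'; return the last payload."""
    deadline = clock() + timeout
    while True:
        progress = fetch_progress()
        if progress.get("status") in TERMINAL_STATUSES:
            return progress
        if clock() >= deadline:
            raise TimeoutError("embedding generation did not finish in time")
        sleep(interval)


# Example (assumed local deployment):
# final = wait_for_generation(http_progress_fetcher("http://localhost:8000", token))
# print(final["status"], final.get("errors"))
```

Passing the fetcher in as a callable keeps the loop independent of the HTTP client, so the loop can be exercised with canned responses.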

Checking embedding status

Overall status

Retrieve aggregate statistics for the entire vector store:
GET /api/admin/embeddings/status/
Authorization: Bearer <admin_token>
Response fields: success, backend (postgres or chroma), total_collections, total_documents, collections (a list of entries with name and document_count), and, for PostgreSQL backends, a storage_info object with detailed table stats including table_size.

Per-file status

List every file that has embeddings along with its chunk count:
GET /api/admin/embeddings/processed/
Authorization: Bearer <admin_token>
Each entry in the files array contains file_name, file_type, and chunks_count.
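As an illustration of consuming the per-file response, the sketch below totals the chunk counts and flags files that report zero chunks. The function name and the sample payload are hypothetical; the field names (files, file_name, file_type, chunks_count) are the ones listed above.

```python
def summarize_processed(payload):
    """Summarize a /embeddings/processed/ style payload (already parsed JSON)."""
    files = payload.get("files", [])
    total_chunks = sum(entry.get("chunks_count", 0) for entry in files)
    # Files that are listed but contributed no chunks may need a re-embed.
    empty = [e["file_name"] for e in files if e.get("chunks_count", 0) == 0]
    return {"files": len(files), "chunks": total_chunks, "empty": empty}


# Hypothetical sample data, for illustration only.
sample = {
    "files": [
        {"file_name": "manual.pdf", "file_type": ".pdf", "chunks_count": 120},
        {"file_name": "notes.md", "file_type": ".md", "chunks_count": 0},
    ]
}
summary = summarize_processed(sample)
```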

Verifying embeddings

The verify endpoint checks the integrity of every collection in the vector store and returns a list of results:
POST /api/admin/embeddings/verify/
Authorization: Bearer <admin_token>
{
  "success": true,
  "backend": "postgres",
  "collections_verified": 1,
  "results": [
    {
      "collection": "rag_embeddings",
      "document_count": 15799,
      "status": "OK",
      "backend": "postgres"
    }
  ]
}
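A script can act on this response by listing any collection whose status is not "OK". This is a sketch under the assumption that "OK" is the only healthy status value; the function name and the "ERROR" value in the usage comment are hypothetical.

```python
def failed_collections(verify_response):
    """Return names of collections whose status is not "OK".

    Raises RuntimeError when the response itself reports success == False.
    """
    if not verify_response.get("success"):
        raise RuntimeError("verification request did not succeed")
    return [
        result["collection"]
        for result in verify_response.get("results", [])
        if result.get("status") != "OK"  # assumption: "OK" means healthy
    ]


# The example response shown above, as a Python dict.
example = {
    "success": True,
    "backend": "postgres",
    "collections_verified": 1,
    "results": [
        {
            "collection": "rag_embeddings",
            "document_count": 15799,
            "status": "OK",
            "backend": "postgres",
        }
    ],
}
problems = failed_collections(example)  # empty list when everything is OK
```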

Clearing all embeddings

Clearing embeddings deletes all vector data from the store. The assistant will be unable to retrieve any documents until embeddings are fully regenerated. Only use this if you need to rebuild the index from scratch.
POST /api/admin/embeddings/clear/
Authorization: Bearer <admin_token>
For a PostgreSQL backend the response includes embeddings_deleted (the count of rows removed). For ChromaDB backends the response lists deleted collection names.

Deleting a single document’s embeddings

To remove only the vectors for one file — for example, to force a re-embed after the source document changes — use the per-document delete endpoint:
DELETE /api/admin/embeddings/delete/<file_name>/
Authorization: Bearer <admin_token>
URL-encode the file_name if it contains spaces or special characters. The response includes deleted_embeddings (chunk count removed) and confirms the operation. In the Admin Panel you can trigger this from the trash icon next to any row in the indexed-documents list.
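The URL-encoding step can be done with `urllib.parse.quote` from the Python standard library. The helper name below is hypothetical; passing `safe=""` also encodes `/`, which is an assumption about how the route should treat slashes inside a file name.

```python
from urllib.parse import quote


def build_delete_path(file_name: str) -> str:
    """Return the per-document delete path with file_name percent-encoded."""
    # safe="" encodes every reserved character, including "/" and spaces.
    return f"/api/admin/embeddings/delete/{quote(file_name, safe='')}/"


path = build_delete_path("Informe final 2024.pdf")
# path is "/api/admin/embeddings/delete/Informe%20final%202024.pdf/"
```

Non-ASCII characters are encoded as UTF-8 bytes, so a name such as "año.pdf" becomes "a%C3%B1o.pdf".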

UploadedDocument tracking

Every file uploaded through the Admin Panel API is tracked in the UploadedDocument Django model with the following fields:
| Field | Type | Description |
| --- | --- | --- |
| file_name | CharField | Original filename |
| file_path | CharField | Local path at data/documents/<name> |
| file_size | BigIntegerField | Size in bytes |
| file_type | CharField | File extension (e.g. .pdf) |
| drive_file_id | CharField | Google Drive file ID (null if not uploaded to Drive) |
| drive_uploaded | BooleanField | Whether the file was successfully uploaded to Drive |
| processed | BooleanField | Whether embedding generation has run |
| chunks_created | IntegerField | Number of text chunks extracted |
| embeddings_generated | IntegerField | Number of embedding vectors stored |
| uploaded_at | DateTimeField | Upload timestamp |
| processed_at | DateTimeField | Embedding completion timestamp (null if not yet processed) |
You can inspect these records directly at /admin/api/uploadeddocument/.

CLI alternatives

For scripted workflows or initial corpus loading, two management commands are available:

Programmatic RAG control

python manage.py rag_manage

Provides sub-commands for corpus inspection and embedding operations without going through the HTTP layer.

Full one-shot Drive sync

python manage.py sync_drive_full

Pulls all files from the configured Google Drive folder and triggers embedding generation in a single blocking operation. Useful for initial setup or scheduled nightly jobs.
The pipeline uses a chunk size of 500 tokens with a 50-token overlap, and retrieval blends 60% semantic search (vector cosine similarity, minimum threshold 0.65) with 40% lexical search (BM25). These parameters are visible in the Embeddings tab → Pipeline parameters info card.
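To make these parameters concrete, here is a small sketch of overlapping chunking and weighted score blending. It is not NISIRA's implementation: the function names are hypothetical, tokens are treated as an already-tokenized list, the lexical score is assumed to be normalized to the range 0 to 1, and the 0.65 threshold is assumed to apply to the semantic similarity before blending.

```python
def chunk_tokens(tokens, size=500, overlap=50):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def hybrid_score(semantic, lexical, w_semantic=0.6, w_lexical=0.4,
                 min_semantic=0.65):
    """Blend semantic and lexical scores; return None below the threshold.

    Assumes both scores are in [0, 1] and that the threshold filters on the
    semantic (cosine) similarity alone.
    """
    if semantic < min_semantic:
        return None
    return w_semantic * semantic + w_lexical * lexical


# A 1200-token document yields windows starting at tokens 0, 450 and 900.
chunks = chunk_tokens(list(range(1200)))
```

With these defaults each chunk begins 450 tokens after the previous one, so consecutive chunks repeat the 50 tokens at their boundary.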
