NISIRA’s knowledge base is built from documents stored in Google Drive and indexed as vector embeddings in a PostgreSQL pgvector table. This page walks through every step of the document lifecycle, from uploading a file to verifying that its embeddings are ready for retrieval, and covers the CLI tools available for programmatic corpus management.
Uploading documents
You can add documents to the corpus in two ways:
Option A — Upload through the Admin Panel
Navigate to Admin Panel → Google Drive tab and use the upload card on the right side of the screen. The component accepts .pdf, .txt, .md, .doc, and .docx files up to 50 MB via drag-and-drop or a file-picker dialog. When you click Subir documento, the panel posts the file to the backend, which saves it to data/documents/<file_name>. Embedding generation for that single file starts immediately in a background daemon thread, so the upload response returns at once with "processing": "background".
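The upload behaviour described above (type and size limits, then a daemon thread that lets the response return at once) can be sketched roughly as follows. This is a minimal illustration, not the backend's actual code: validate_upload, handle_upload, and the embed_one_file callable are hypothetical names, and treating 50 MB as 50 × 1024 × 1024 bytes is an assumption.

```python
import threading
from pathlib import Path

# Extensions and size limit taken from the upload card's description.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".md", ".doc", ".docx"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" means binary megabytes


def validate_upload(file_name: str, size_bytes: int) -> None:
    """Raise ValueError if the file falls outside the documented limits."""
    extension = Path(file_name).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 50 MB limit")


def handle_upload(file_name: str, size_bytes: int, embed_one_file):
    """Validate the file, start embedding in a daemon thread, return at once.

    embed_one_file is a hypothetical callable standing in for the real
    embedding pipeline. The thread is returned only so callers can join it.
    """
    validate_upload(file_name, size_bytes)
    worker = threading.Thread(target=embed_one_file, args=(file_name,), daemon=True)
    worker.start()
    return {"processing": "background"}, worker
```

Because the worker is a daemon thread, it does not keep the process alive on shutdown, which is why a later status check is still worthwhile after an upload.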
Option B — Google Drive folder sync
Files placed directly in the configured Drive folder are pulled in automatically when you trigger a sync. Navigate to the Google Drive tab and click Sincronizar (or call the sync endpoint directly). The system detects only new files — duplicates are identified by MD5 hash and skipped.
When a sync downloads one or more new files it automatically launches background embedding generation for those files. You do not need to trigger generation manually after a sync completes.
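The MD5-based duplicate check mentioned above can be illustrated with a short sketch. It assumes the files are available locally and that the hashes of already-synced files are known as a set of hex digests; md5_of_file and select_new_files are hypothetical helper names, not part of NISIRA's code.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB blocks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def select_new_files(candidates, known_hashes):
    """Keep only files whose content hash has not been seen before."""
    seen = set(known_hashes)
    new_files = []
    for path in candidates:
        file_hash = md5_of_file(path)
        if file_hash in seen:
            continue  # duplicate content, skipped regardless of file name
        seen.add(file_hash)
        new_files.append(path)
    return new_files
```

Hashing content rather than comparing names means a renamed copy of an existing document is still treated as a duplicate.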
End-to-end upload and embed workflow
Sync or upload your documents
Either trigger a Drive sync from the Google Drive tab or upload files directly. Wait until the progress bar shows completed and the log stream shows [OK] Sincronización completa.
Open the Embeddings tab
Switch to Admin Panel → Embeddings tab. Review the stat cards to confirm that the total chunk count and table size reflect the documents you just added.
Generate embeddings (if needed)
If your upload went through the sync path and new files were detected, generation has already started. If you uploaded files manually and the background thread did not cover all files, click Generar to process all remaining unprocessed documents. The endpoint returns immediately, so poll for progress. The response includes status (starting → running → completed or error), current, total, current_file, processed, errors, and a rolling logs array. The panel polls this endpoint every 1.5 seconds and renders the results in a live progress bar and log stream.
Verify embeddings
Once generation finishes, run a verification pass to confirm index integrity. A successful response includes collections_verified and per-collection status. The panel displays a success notification with the count of verified collections.
Confirm indexed documents
Click Ver documentos indexados in the Embeddings tab to open the processed-files list. Each row shows the filename, file type icon, and chunk count. Files marked Sin archivo en Drive exist in the vector index but have been removed from Drive; you can delete their embeddings individually (see below).
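The polling step in the workflow above can be reproduced outside the panel. The sketch below is a minimal example under stated assumptions: fetch_status is a hypothetical callable that returns the progress response as a dict (the actual endpoint URL and authentication are not shown here), and the timeout value is arbitrary. The 1.5-second interval and the terminal states come from the description above.

```python
import time

# "completed" and "error" end the starting -> running -> ... sequence.
TERMINAL_STATES = {"completed", "error"}


def wait_for_generation(fetch_status, interval=1.5, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status() until generation reaches a terminal state.

    fetch_status is a hypothetical callable returning the progress dict.
    sleep is injectable so the loop can be exercised without real delays.
    """
    waited = 0.0
    while True:
        progress = fetch_status()
        if progress.get("status") in TERMINAL_STATES:
            return progress
        if waited >= timeout:
            raise TimeoutError("embedding generation did not finish in time")
        sleep(interval)
        waited += interval
```

A caller would typically inspect the returned status and the logs array to decide whether to proceed to verification or investigate an error.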
Checking embedding status
Overall status
Retrieve aggregate statistics for the entire vector store. The response includes success, backend (postgres or chroma), total_collections, total_documents, collections (a list with name and document_count), and, for PostgreSQL backends, a storage_info object with detailed table stats including table_size.
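As an illustration of how the overall status fields fit together, the sketch below turns a response dict into a short text summary. The field names are the ones listed above; summarize_status is a hypothetical helper, and the format of table_size is not specified here, so it is printed as-is.

```python
def summarize_status(payload: dict) -> str:
    """Build a short, human-readable summary of an overall status response."""
    if not payload.get("success"):
        return "status request failed"
    lines = [
        f"backend={payload['backend']} "
        f"collections={payload['total_collections']} "
        f"documents={payload['total_documents']}"
    ]
    for collection in payload.get("collections", []):
        lines.append(f"  {collection['name']}: {collection['document_count']}")
    # storage_info is only present for PostgreSQL backends.
    storage = payload.get("storage_info")
    if storage:
        lines.append(f"  table_size: {storage['table_size']}")
    return "\n".join(lines)
```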
Per-file status
List every file that has embeddings along with its chunk count. Each entry in the files array contains file_name, file_type, and chunks_count.
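One use of the per-file list is to spot entries that the panel labels Sin archivo en Drive, meaning files that still have embeddings but are no longer in Drive. A minimal sketch, assuming you already hold the files array and a collection of file names currently in Drive (how the latter is obtained is outside this example); find_orphaned_files is a hypothetical helper name.

```python
def find_orphaned_files(indexed_files, drive_file_names):
    """Return indexed entries whose file_name is absent from Drive."""
    drive_names = set(drive_file_names)
    return [entry for entry in indexed_files if entry["file_name"] not in drive_names]
```

Each returned entry keeps its chunks_count, which indicates how many vectors a per-document delete would remove.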
Verifying embeddings
The verify endpoint checks the integrity of every collection in the vector store and returns a list of results.
Clearing all embeddings
The response to a clear operation includes embeddings_deleted (the count of rows removed). For ChromaDB backends the response lists deleted collection names.
Deleting a single document’s embeddings
To remove only the vectors for one file (for example, to force a re-embed after the source document changes), use the per-document delete endpoint. URL-encode file_name if it contains spaces or special characters. The response includes deleted_embeddings (the number of chunks removed) and confirms the operation. In the Admin Panel you can trigger this from the trash icon next to any row in the indexed-documents list.
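Encoding a file name for use in a URL path can be done with Python's standard library. The sketch below only shows the encoding step; the endpoint path itself is not reproduced, and encode_file_name is a hypothetical helper name.

```python
from urllib.parse import quote


def encode_file_name(file_name: str) -> str:
    """Percent-encode a file name so it is safe as a single URL path segment."""
    # safe="" also encodes "/" so the name cannot be read as extra path segments.
    return quote(file_name, safe="")
```

For example, encode_file_name("Informe final (v2).pdf") produces "Informe%20final%20%28v2%29.pdf".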
UploadedDocument tracking
Every file uploaded through the Admin Panel API is tracked in the UploadedDocument Django model with the following fields:
| Field | Type | Description |
|---|---|---|
| file_name | CharField | Original filename |
| file_path | CharField | Local path at data/documents/<name> |
| file_size | BigIntegerField | Size in bytes |
| file_type | CharField | File extension (e.g. .pdf) |
| drive_file_id | CharField | Google Drive file ID (null if not uploaded to Drive) |
| drive_uploaded | BooleanField | Whether the file was successfully uploaded to Drive |
| processed | BooleanField | Whether embedding generation has run |
| chunks_created | IntegerField | Number of text chunks extracted |
| embeddings_generated | IntegerField | Number of embedding vectors stored |
| uploaded_at | DateTimeField | Upload timestamp |
| processed_at | DateTimeField | Embedding completion timestamp (null if not yet processed) |
These records can be browsed in the Django admin at /admin/api/uploadeddocument/.