Signia is a Colombian Sign Language (LSC) translation web application built on Django 5.2 and deployed on Railway via Gunicorn. The project is split into focused Django apps that each own a clear domain — authentication, text-to-sign translation, webcam sign recognition, and history — plus a standalone grammar module that bridges Spanish text to the LSC gloss order through the Groq LLM API. Understanding how these pieces fit together is essential before modifying any part of the stack.

Project Structure

The repository root doubles as the Django project root. The directory layout below matches the structure documented in AGENTS.md:
Signia/          # Django project package (settings, urls, wsgi, asgi)
usuarios/        # Auth, profiles, OTP email verification, OAuth (Google/Facebook), contact form
traduccion/      # Text/audio → LSC sign videos (Whisper STT + Groq LSC grammar + DB video lookup)
reconocimientos/ # Camera hand-sign → text (MediaPipe landmarks + sklearn RandomForest)
historial/       # Per-user translation/recognition history
lsc_grammar.py   # Standalone LSC grammar layer (Groq API, fallback chain of 4 Llama models)
ffmpeg/          # Bundled ffmpeg binaries (required by faster-whisper)
media/           # User-uploaded sign videos + reference videos
static/          # Dev static files (templates reference this)
staticfiles/     # collectstatic output (served by WhiteNoise in prod)

App Responsibilities

| App / Module | Responsibility |
| --- | --- |
| Signia/ | Django project config: settings.py, root urls.py, wsgi.py, asgi.py |
| usuarios/ | Custom user model (usuarios.Usuario), email-based auth, OTP verification, Google/Facebook OAuth via django-allauth, contact form |
| traduccion/ | Text and audio input → LSC sign video sequence; calls Whisper STT for audio, then passes text to lsc_grammar.convertir_a_lsc(), then looks up matching video files in the DB |
| reconocimientos/ | Webcam hand-sign recognition; MediaPipe HandLandmarker extracts landmarks per frame, a trained RandomForest classifies the sequence into a sign label |
| historial/ | Stores per-user translation and recognition events in EntradaHistorial records |
| lsc_grammar.py | Standalone module at the project root; converts Spanish text to LSC gloss order via the Groq API with a four-model fallback chain and a rule-based safety net |

Data Flows

Text or Audio → LSC Sign Videos

This path is handled by traduccion/views.py:buscar_video.
1. Input arrives

The user submits either typed text (POST field palabra) or a microphone recording (POST file audio). Audio is transcribed to Spanish text by faster-whisper (base model, CPU, int8).

2. Vocabulary cache lookup

_obtener_vocabulario_bd() fetches all sign names from the video table. Results are cached in Django’s cache backend under the key vocabulario_lsc for 10 minutes to avoid repeated DB queries.

3. LSC grammar conversion

lsc_grammar.convertir_a_lsc(text, vocabulario_bd) sends the Spanish text to the Groq API, which returns a structured JSON payload containing an ordered list of LSC gloss tokens, sentence type, facial expression markers, and strategies for handling missing signs.

4. Token extraction

lsc_grammar.tokens_para_busqueda(resultado_lsc) strips non-searchable tokens (facial expressions, aspect markers, topic markers) and returns an ordered list of plain gloss strings.

5. Video DB lookup

Tokens are processed in order, with compound matches of up to 3 tokens attempted before single-token matches. For each candidate, _buscar_token_con_fallbacks() queries the video table by nombre__iexact. If the AI returned a synonym strategy for a token, the synonym is tried next. Tokens with no match are collected in faltantes.

6. Response rendered

The matched video objects, LSC token metadata, missing tokens, and the AI model used are passed to the traduccion/traductor.html template for display.
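
The lookup in step 5 can be sketched without Django as follows. This is an illustration only, not the project's code: match_tokens, find_video, the in-memory videos dict, and the choice to join compound tokens with a space are all assumptions made for the example, and the dict lookup merely stands in for a nombre__iexact query.

```python
from typing import Dict, List, Optional, Tuple

MAX_COMPOUND = 3  # the text says compounds of up to 3 tokens are tried first


def find_video(name: str, videos: Dict[str, str]) -> Optional[str]:
    """Case-insensitive exact match, standing in for a nombre__iexact query.

    `videos` is a hypothetical {casefolded sign name: file path} mapping.
    """
    return videos.get(name.casefold())


def match_tokens(
    tokens: List[str],
    videos: Dict[str, str],
    synonyms: Dict[str, str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (matched (name, path) pairs in order, tokens with no match)."""
    matched: List[Tuple[str, str]] = []
    missing: List[str] = []
    i = 0
    while i < len(tokens):
        hit: Optional[Tuple[str, str, int]] = None
        # Try the longest compound first, shrinking down to a single token.
        for size in range(min(MAX_COMPOUND, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])  # assumed join rule
            path = find_video(candidate, videos)
            if path is not None:
                hit = (candidate, path, size)
                break
        if hit is None:
            # Fall back to a synonym suggested for this token, if any.
            synonym = synonyms.get(tokens[i].casefold())
            if synonym is not None:
                path = find_video(synonym, videos)
                if path is not None:
                    hit = (synonym, path, 1)
        if hit is None:
            missing.append(tokens[i])
            i += 1
        else:
            matched.append((hit[0], hit[1]))
            i += hit[2]  # skip every token consumed by the match
    return matched, missing
```

Trying the longest compound first means a multi-word sign is not split into separate single-word videos when a dedicated video for the whole expression exists.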

Webcam Signs → Text

This path is handled by reconocimientos/views.py:predecir_landmarks (or the fallback predecir endpoint for raw frame data).
1. Client captures landmarks

The browser runs MediaPipe HandLandmarker in JavaScript. For each webcam frame it extracts 21 hand landmarks × 3 coordinates × up to 2 hands = 126 floats. The landmark sequence is accumulated client-side.

2. Landmarks POST to server

The client POSTs { "secuencia": [[126 floats], ...] } to /reconocimientos/predecir_landmarks/.

3. Centroid normalization

Each frame’s landmark array is passed to _normalizar_landmarks_centroide(), which subtracts the mean position of each hand’s 21 landmarks. This makes predictions position-invariant: the model does not care where on screen the signer’s hands appear.

4. Sequence normalization

normalizar_secuencia() linearly interpolates the variable-length frame sequence to exactly 30 frames using numpy.interp.

5. Feature construction

construir_features() concatenates the flattened normalized positions, per-frame deltas, and delta magnitudes into a single feature vector.

6. RandomForest prediction

The feature vector is classified by the loaded RandomForestClassifier. The top predicted class is decoded into a sign label by the LabelEncoder and returned together with its probability, for example { "seña": "...", "confianza": 95.3 }.
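
Steps 3 to 5 can be sketched in dependency-free Python. The project is described as using numpy.interp; the version below swaps in a hand-written linear interpolation so that it runs with the standard library alone. The function names, the interleaved x, y, z layout with hand one before hand two, and the reading of "delta magnitude" as one Euclidean norm per frame transition are assumptions, not details confirmed by the source.

```python
import math
from typing import List

Frame = List[float]
LANDMARKS, COORDS, HANDS = 21, 3, 2  # 21 * 3 * 2 = 126 floats per frame
TARGET_FRAMES = 30


def center_hands(frame: Frame) -> Frame:
    """Subtract each hand's mean x, y and z (assumed layout: x,y,z interleaved)."""
    out = list(frame)
    per_hand = LANDMARKS * COORDS
    for hand in range(HANDS):
        base = hand * per_hand
        for axis in range(COORDS):
            idx = range(base + axis, base + per_hand, COORDS)
            mean = sum(frame[i] for i in idx) / LANDMARKS
            for i in idx:
                out[i] = frame[i] - mean
    return out


def resample(seq: List[Frame], target: int = TARGET_FRAMES) -> List[Frame]:
    """Linearly interpolate a variable-length sequence to `target` frames."""
    if not seq or target < 2:
        raise ValueError("need at least one frame and target >= 2")
    n = len(seq)
    if n == 1:
        return [list(seq[0]) for _ in range(target)]
    out = []
    for t in range(target):
        pos = t * (n - 1) / (target - 1)  # fractional index into seq
        lo = min(int(pos), n - 2)
        frac = pos - lo
        out.append([a + (b - a) * frac for a, b in zip(seq[lo], seq[lo + 1])])
    return out


def build_features(seq: List[Frame]) -> List[float]:
    """Concatenate positions, per-frame deltas and one magnitude per delta."""
    positions = [v for frame in seq for v in frame]
    deltas: List[float] = []
    magnitudes: List[float] = []
    for prev, cur in zip(seq, seq[1:]):
        d = [c - p for p, c in zip(prev, cur)]
        deltas.extend(d)
        magnitudes.append(math.sqrt(sum(x * x for x in d)))
    return positions + deltas + magnitudes
```

Under these assumptions a 30-frame sequence yields 30 × 126 positions, 29 × 126 deltas, and 29 magnitudes, so the vector has 7463 entries. The real construir_features() may produce a different length.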

Key Technical Constraints

MediaPipe HandLandmarker is not thread-safe. reconocimientos/views.py uses threading.local() to maintain one HandLandmarker instance per Django worker thread. Never replace this with a single shared instance — doing so causes deadlocks and dropped frames under concurrent requests.
Training deletes all VideoSeña records. After a training run completes, every VideoSeña DB record and its corresponding file on disk is permanently deleted. Export any training videos you want to keep before triggering a retrain.
Use CompressedStaticFilesStorage, not ManifestStaticFilesStorage. The static files storage backend is set to WhiteNoise’s CompressedStaticFilesStorage. Manifest mode appends content hashes to filenames, which breaks MediaPipe’s WASM loader because it resolves worker and model files by their exact, unhashed names.
ML model loads at import time. reconocimientos/views.py calls _cargar_modelo() at module load. If reconocimientos/modelo/model_seq.pkl is absent, modelo and encoder are set to None and every prediction endpoint returns HTTP 503 until a model is trained.
Training runs in a daemon thread inside the Django process. The entrenar_modelo view spawns a threading.Thread(daemon=True). The _entrenando flag prevents concurrent training runs. Because it is a daemon thread, it is killed if the Gunicorn worker exits — avoid restarting the server while training.
ffmpeg/ is prepended to PATH. traduccion/views.py prepends the local ffmpeg/ directory to os.environ["PATH"] at module load so that faster-whisper can locate the bundled ffmpeg binary. Do not move or rename this directory.
Sessions expire after 20 minutes. SESSION_COOKIE_AGE = 1200 and SESSION_EXPIRE_AT_BROWSER_CLOSE = True are both active. SESSION_SAVE_EVERY_REQUEST = True resets the 20-minute clock on every request.
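
The per-thread instance pattern from the first constraint can be illustrated as follows. This is a minimal sketch, not the code in reconocimientos/views.py: _DummyLandmarker is a hypothetical stand-in for a real MediaPipe HandLandmarker, and get_landmarker is an invented helper name.

```python
import threading

# One storage slot per thread; attributes set here are invisible to other threads.
_thread_state = threading.local()


class _DummyLandmarker:
    """Hypothetical placeholder for an object that is not thread-safe."""


def get_landmarker(factory=_DummyLandmarker):
    """Return this thread's instance, creating it on first use."""
    instance = getattr(_thread_state, "landmarker", None)
    if instance is None:
        instance = factory()
        _thread_state.landmarker = instance
    return instance
```

Repeated calls on one thread return the same object, while each new worker thread lazily builds its own, so no instance is ever shared across threads.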

Database

| Environment | Backend | Notes |
| --- | --- | --- |
| Local development | SQLite (db.sqlite3) | Default when DATABASE_URL is not set |
| Production (Railway) | PostgreSQL via Neon | DATABASE_URL env var; sslmode=require; DISABLE_SERVER_SIDE_CURSORS=True required for Neon’s connection pooler |

DISABLE_SERVER_SIDE_CURSORS = True is set unconditionally in settings.py because Neon uses PgBouncer connection pooling, which does not support named server-side cursors.
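
A settings fragment with this behaviour might look like the sketch below. It is an assumption about the shape of the configuration, not a copy of the project's settings.py: build_databases is a hypothetical helper, and the URL is parsed by hand with the standard library. The project may use a dedicated package for this instead.

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def build_databases(database_url: Optional[str]) -> dict:
    """Build a Django DATABASES dict: SQLite by default, PostgreSQL from a URL."""
    if not database_url:
        default = {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": "db.sqlite3",
        }
    else:
        url = urlparse(database_url)
        default = {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": url.path.lstrip("/"),
            "USER": unquote(url.username or ""),
            "PASSWORD": unquote(url.password or ""),
            "HOST": url.hostname or "",
            "PORT": str(url.port or ""),
            "OPTIONS": {"sslmode": "require"},
        }
    # Set unconditionally, as the text describes.
    default["DISABLE_SERVER_SIDE_CURSORS"] = True
    return {"default": default}
```

In a settings module this would be used as DATABASES = build_databases(os.environ.get("DATABASE_URL")).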

Static Files

In production, WhiteNoise serves static files directly from the Gunicorn workers via whitenoise.middleware.WhiteNoiseMiddleware, so no separate static-file server or CDN is required. Files are compressed at collectstatic time. .wasm and .task files are excluded from compression (configured in WHITENOISE_SKIP_COMPRESS_EXTENSIONS) to avoid HTTP 416 Range Not Satisfiable errors when Chromium makes byte-range requests for MediaPipe’s WASM binary and model file.
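
The relevant settings could be sketched as follows. This is a hypothetical excerpt, not the project's actual settings.py: the middleware list is shortened, the extension list is an illustrative set of already-compressed formats rather than WhiteNoise's exact default, and the backend is expressed through the STORAGES setting that Django 4.2 and later use for static files. WHITENOISE_SKIP_COMPRESS_EXTENSIONS is assumed to replace the default list rather than extend it, which is why common formats are repeated.

```python
# Hypothetical settings.py excerpt (shortened for illustration).
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise is placed directly after SecurityMiddleware.
    "whitenoise.middleware.WhiteNoiseMiddleware",
]

STORAGES = {
    "default": {
        "BACKEND": "django.core.files.storage.FileSystemStorage",
    },
    "staticfiles": {
        # Compression only; filenames are left unhashed.
        "BACKEND": "whitenoise.storage.CompressedStaticFilesStorage",
    },
}

# Extensions are written without the leading dot.
WHITENOISE_SKIP_COMPRESS_EXTENSIONS = (
    "jpg", "jpeg", "png", "gif", "webp",
    "zip", "gz", "tgz", "bz2", "xz", "br",
    "woff", "woff2",
    "wasm",  # MediaPipe WASM binary
    "task",  # MediaPipe model file
)
```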
