Signia’s text-to-signs pipeline bridges spoken Spanish and Colombian Sign Language by combining speech transcription, AI-powered grammar reordering, and a database of recorded sign videos. When a user submits a sentence, whether typed or spoken, the system restructures it according to LSC’s Subject–Object–Verb grammar and plays back the matching sign videos in sequence, enabling fluid, linguistically accurate communication.
How It Works
Submit text or audio at /traductor/
Users can either type a Spanish sentence into the text field or record audio directly in the browser. Both inputs converge into the same processing pipeline.
Audio transcription with faster-whisper
If the user submits audio, the uploaded .webm file is passed to a lazily loaded faster-whisper model (base, device="cpu", compute_type="int8"). Whisper transcribes the audio with language='es' and beam_size=5, producing a plain-text Spanish sentence. The temporary file is deleted from disk after transcription.
LSC grammar conversion via Groq
The transcribed or typed text is sent to lsc_grammar.convertir_a_lsc(), which calls the Groq API with a detailed LSC linguistic system prompt. The model reorders the input into LSC’s SOV structure and returns a structured JSON list of gloss tokens.
Video lookup for each LSC token
tokens_para_busqueda() extracts the word tokens from the Groq response, filtering out non-lexical markers like facial expressions [EF:...] and aspect markers [ASP:...]. Each token is looked up in the video model using a case-insensitive exact match on the nombre field. Multi-word expressions (e.g. CON_GUSTO) are tried first as compound phrases before falling back to individual tokens.
LSC Grammar Conversion
LSC uses SOV (Subject–Object–Verb) word order, which differs fundamentally from Spanish’s SVO structure. The lsc_grammar.py module handles this reordering through a 15-module linguistic system prompt grounded in Alejandro Oviedo’s (2001) LSC research and the INSOR/Caro y Cuervo dictionary.
Key grammatical transformations applied:
| Rule | Description | Example |
|---|---|---|
| SOV order | Subject first, then object, then verb | Yo como arroz → YO ARROZ COMER |
| Temporal markers first | Time expressions precede the subject | Mañana voy al médico → MAÑANA YO MEDICO IR |
| Tópico-comentario | Topic/theme leads; marked with [TOPIC] | El carro rojo, yo lo compré → CARRO ROJO [TOPIC] YO COMPRAR |
| Negation at end | NO always follows the verb (and modal) | No puedo ir → YO IR PODER NO |
| WH-questions at end | Interrogative pronoun moves to final position | ¿Cómo te llamas? → TU LLAMAR COMO [EF:CEJAS_FRUNCIDAS] |
| Copula deletion | Empty ser/estar is dropped | Él es médico → EL MEDICO |
| Modals after verb | Modal verbs follow the main verb | Quiero salir → YO SALIR QUERER |
Alongside the gloss tokens, the response includes sentence_type (e.g. declarative, question_wh), facial_expression metadata, and notes with linguistic observations. Non-lexical tokens ([EF:CEJAS_FRUNCIDAS], [ASP:COMPLETADO], [TOPIC]) are stripped before the database lookup.
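The stripping of non-lexical tokens can be sketched as follows. This is a minimal illustration, not the project's tokens_para_busqueda(): the function name filtrar_tokens_lexicos and the assumption that every non-lexical marker is a single bracketed token are both hypothetical.

```python
import re

# Assumption: every non-lexical marker is one bracketed token,
# e.g. [EF:CEJAS_FRUNCIDAS], [ASP:COMPLETADO] or [TOPIC].
MARCADOR_NO_LEXICO = re.compile(r"^\[[^\]]+\]$")


def filtrar_tokens_lexicos(glosas):
    """Keep only the gloss tokens that can map to a sign video."""
    return [g for g in glosas if not MARCADOR_NO_LEXICO.match(g)]


glosas = ["TU", "LLAMAR", "COMO", "[EF:CEJAS_FRUNCIDAS]"]
tokens = filtrar_tokens_lexicos(glosas)
print(tokens)
```

Only the remaining word tokens would then be looked up in the video table; the markers stay available as metadata for the template.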
Fallback Chain
If the primary Groq model is unavailable or rate-limited, lsc_grammar.py automatically tries four models in order.
If none of them responds, a rule-based fallback (_fallback_sin_ia) applies basic LSC ordering (time → subject → rest → negation → question). The view’s modelo_usado context variable is set to 'fallback' so the template can detect and surface this condition to the user.
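The general shape of such a chain can be sketched in a few lines. Everything below is illustrative: convertir_con_respaldo, respaldo_basico, the llamar_modelo callable, and the small word lists are hypothetical stand-ins, not the contents of lsc_grammar.py, and the real model names are not reproduced here.

```python
# Hypothetical word lists, for illustration only; not the project's vocabulary.
TIEMPO = {"AYER", "HOY", "MAÑANA"}
SUJETOS = {"YO", "TU", "EL", "ELLA"}
NEGACION = {"NO"}
PREGUNTAS = {"QUE", "COMO", "DONDE", "CUANDO", "QUIEN"}


def respaldo_basico(texto):
    """Rule-based ordering: time, subject, rest, negation, question."""
    tiempo, sujeto, resto, negacion, pregunta = [], [], [], [], []
    for palabra in texto.upper().split():
        if palabra in TIEMPO:
            tiempo.append(palabra)
        elif palabra in SUJETOS:
            sujeto.append(palabra)
        elif palabra in NEGACION:
            negacion.append(palabra)
        elif palabra in PREGUNTAS:
            pregunta.append(palabra)
        else:
            resto.append(palabra)
    return tiempo + sujeto + resto + negacion + pregunta


def convertir_con_respaldo(texto, modelos, llamar_modelo, respaldo=respaldo_basico):
    """Try each model in order; use the rule-based fallback if all fail."""
    for modelo in modelos:
        try:
            return llamar_modelo(modelo, texto), modelo
        except Exception:
            # Sketch only: real code would catch the API client's specific errors.
            continue
    return respaldo(texto), "fallback"
```

The second element of the returned tuple plays the role of modelo_usado: it names the model that answered, or 'fallback' when the rules were used.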
Audio Input
Audio is captured in the browser as .webm and uploaded via multipart/form-data. The file is saved to a temporary directory (temp/) with a UUID filename, then transcribed and deleted.
The faster-whisper model is loaded lazily on first use (guarded by threading.Lock()), so it does not block Gunicorn startup. Files smaller than 1,000 bytes are silently skipped to avoid processing empty recordings.
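A minimal sketch of the lazy-loading pattern described above, assuming the faster-whisper package is installed. The names obtener_modelo, transcribir, cargar_whisper, and TAMANO_MINIMO are hypothetical, and deleting the file even when it is skipped is an assumption of this sketch rather than confirmed behaviour.

```python
import os
import threading

_modelo = None
_modelo_lock = threading.Lock()
TAMANO_MINIMO = 1000  # bytes; smaller uploads are treated as empty recordings


def cargar_whisper():
    """Build the model. The import lives here so startup does not pay for it."""
    from faster_whisper import WhisperModel  # needs the faster-whisper package
    return WhisperModel("base", device="cpu", compute_type="int8")


def obtener_modelo(cargador=cargar_whisper):
    """Return the shared model, creating it once on first use."""
    global _modelo
    if _modelo is None:
        with _modelo_lock:
            if _modelo is None:  # re-check: another thread may have loaded it
                _modelo = cargador()
    return _modelo


def transcribir(ruta, cargador=cargar_whisper):
    """Transcribe a saved upload, then remove the temporary file."""
    try:
        if os.path.getsize(ruta) < TAMANO_MINIMO:
            return None  # skip near-empty recordings without loading the model
        segmentos, _info = obtener_modelo(cargador).transcribe(
            ruta, language="es", beam_size=5
        )
        return " ".join(s.text.strip() for s in segmentos)
    finally:
        # Assumption: the temp file is removed whether or not it was transcribed.
        if os.path.exists(ruta):
            os.remove(ruta)
```

The double check around the lock keeps concurrent first requests from loading the model twice, while later calls skip the lock entirely.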
Vocabulary System
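Sign videos are stored using the video model in the traduccion app.
The nombre field is the lookup key. Tokens from the LSC grammar layer are matched against it with nombre__iexact, so MEDICO, medico, and Medico all resolve to the same record. Note that iexact folds case only: an accented form such as Médico matches the same record only if the database collation is accent-insensitive or the token is normalized beforehand. Multi-word expressions use spaces: the token CON_GUSTO is looked up as "con gusto" after replacing underscores.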
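The lookup rules (case-insensitive match, underscores read as spaces, compound phrase before its parts) can be sketched without Django. In this illustration a plain dict keyed by lowercase nombre stands in for a nombre__iexact query on the video model; normalizar and buscar_videos are hypothetical names.

```python
def normalizar(token):
    """Lowercase and turn underscores into spaces: CON_GUSTO -> 'con gusto'."""
    return token.replace("_", " ").lower()


def buscar_videos(tokens, videos_por_nombre):
    """Map tokens to videos, trying the compound phrase before its parts."""
    encontrados, faltantes = [], []
    for token in tokens:
        clave = normalizar(token)
        if clave in videos_por_nombre:
            encontrados.append(videos_por_nombre[clave])
            continue
        # Compound not found: fall back to each individual word.
        for parte in clave.split():
            if parte in videos_por_nombre:
                encontrados.append(videos_por_nombre[parte])
            else:
                faltantes.append(parte)
    return encontrados, faltantes


videos = {"yo": "yo.mp4", "con gusto": "con_gusto.mp4", "buenos": "buenos.mp4"}
print(buscar_videos(["YO", "CON_GUSTO", "BUENOS_DIAS"], videos))
```

Here CON_GUSTO resolves as a single sign, while BUENOS_DIAS has no compound entry and is split, leaving "dias" without a video.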
Admins upload, rename, and delete sign videos through the /admin-videos/ panel, which provides a full CRUD interface (/reconocimientos/traductor/crear/, .../editar/<id>/, .../eliminar/<id>/).
The full vocabulary list is cached in Django’s cache backend under the key vocabulario_lsc for 10 minutes (600 seconds). This avoids a database round-trip on every translation request. When the vocabulary changes (after adding or removing a video), the cache expires naturally or you can clear it manually with cache.delete('vocabulario_lsc').
Fallback for Missing Tokens
When the Groq model identifies a token that has no matching video in the database, it sets a per-token strategy in the estrategia_faltantes dict of the response. The view applies this strategy via _buscar_token_con_fallbacks():
| Strategy | Behaviour |
|---|---|
| synonym:ALTERNATIVA | Looks up the alternative token in the database instead |
| spell | Marks the token as missing; frontend can render it as fingerspelling |
| record | Marks the token as a candidate for a new sign recording |
| fingerspell | Short acronym to be fingerspelled character by character |
Tokens that still have no video are collected in the faltantes list, which is passed to the template for display.
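One way to dispatch on these strategies is sketched below. It is not the project's _buscar_token_con_fallbacks(): the function name resolver_faltante, the dict standing in for the video table, and the shape of the returned note are all assumptions made for illustration.

```python
def resolver_faltante(token, estrategia, videos_por_nombre):
    """Apply one per-token strategy.

    Returns (video, nota): a video if one was found, otherwise None plus a
    note describing what the template could do with the missing token.
    """
    if estrategia.startswith("synonym:"):
        alternativa = estrategia.split(":", 1)[1].replace("_", " ").lower()
        video = videos_por_nombre.get(alternativa)
        if video is not None:
            return video, None
        return None, {"token": token, "accion": "missing"}
    if estrategia == "spell":
        return None, {"token": token, "accion": "spell"}
    if estrategia == "fingerspell":
        # Short acronym: spell it out character by character.
        return None, {"token": token, "accion": "fingerspell", "letras": list(token)}
    if estrategia == "record":
        return None, {"token": token, "accion": "record"}
    # Unknown strategy: treat the token as plainly missing.
    return None, {"token": token, "accion": "missing"}


videos = {"doctor": "doctor.mp4"}
print(resolver_faltante("MEDICO", "synonym:DOCTOR", videos))
```

In this sketch, every note returned with a None video would be appended to the list of missing tokens shown to the user.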
Translation History
Every successful translation (at least one video found) is recorded for authenticated users. Saved entries are listed at /historial/ and are paginated at 15 items per page.