The TTS sidecar is a standalone FastAPI application (Servertts/app/main.py) that runs on port 8010 alongside the main Flask app. Its job is to pre-generate three MP3 narration files per species using Microsoft Edge TTS, index them by QR code, and serve them on demand, so that audio playback on an ESP32 or in a browser never waits for on-the-fly synthesis.
Authentication Schemes
The service has two separate authentication paths depending on who is calling.
Public Endpoints: ?key=<TTS_API_KEY>
Every endpoint under /tts/*, /qr/*, and /ws/* is authenticated via the key query parameter. Its value must match the TTS_API_KEY environment variable configured in Servertts/.env. This key is intended for external consumers such as ESP32 devices, mobile apps, and browser clients.
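As an illustration, a client could attach the key to a request URL as follows. This is a minimal sketch: the helper name `with_api_key`, the host, and the example path are hypothetical; only the `key` query parameter and port 8010 come from the description above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_api_key(url: str, api_key: str) -> str:
    """Return `url` with its `key` query parameter set to `api_key`."""
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any stale `key` value.
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != "key"
    ]
    query.append(("key", api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical host and path, for illustration only.
signed_url = with_api_key("http://localhost:8010/tts/example", "my-secret")
```

Because the key travels in the URL, it can end up in access logs and browser history, so it is worth treating it as a low-privilege credential that is separate from the internal shared secret.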
Internal Endpoints: X-API-Key Header
Endpoints under /internal/* are reserved for the Flask app and require the shared secret to be sent in the X-API-Key request header. The header value must match the MUSEO_API_KEY environment variable (set equal to MUSEO_TTS_SHARED_KEY in the Docker Compose stack). Requests with a missing or mismatched header receive 401 unauthorized.
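The header check can be sketched in plain Python as shown here. This is not the service's actual code: the function name `is_internal_request_authorized` is hypothetical, and the constant-time comparison is a suggested practice rather than something the source confirms.

```python
import hmac
from typing import Mapping


def is_internal_request_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Return True only when the X-API-Key header equals the shared secret."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    supplied = normalised.get("x-api-key", "")
    # A missing header or an unset secret must never authorize a request.
    if not supplied or not expected_key:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

A caller that gets `False` back would respond with the 401 described above.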
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| TTS_API_KEY | (required) | Authenticates public /tts/*, /qr/*, /ws/* callers |
| MUSEO_API_KEY | (required) | Authenticates internal /internal/* calls from Flask |
| MUSEO_TTS_PUBLIC_BASE_URL | "" | Public base URL used to build audio_url in responses |
| EDGE_TTS_VOICE | es-CO-GonzaloNeural | Microsoft Edge TTS voice |
| EDGE_TTS_RATE | +0% | Speech rate adjustment |
| EDGE_TTS_VOLUME | +0% | Volume adjustment |
| AUDIO_CACHE_DIR | ./cache_audio | Root directory for pre-generated MP3 files |
| DEBUG_FRAMES_DIR | ./debug_frames | Directory where received JPEG frames are saved |
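One way to read these variables with the defaults from the table is sketched here. The `TTSSettings` class and `load_settings` function are hypothetical names, and failing fast on a missing required key is an assumption about the desired behaviour, not documented behaviour of the service.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TTSSettings:
    tts_api_key: str
    museo_api_key: str
    public_base_url: str
    voice: str
    rate: str
    volume: str
    audio_cache_dir: str
    debug_frames_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> TTSSettings:
    """Build settings from environment variables, applying the documented defaults."""
    # Assumption: the two required keys should abort startup when absent.
    missing = [name for name in ("TTS_API_KEY", "MUSEO_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return TTSSettings(
        tts_api_key=env["TTS_API_KEY"],
        museo_api_key=env["MUSEO_API_KEY"],
        public_base_url=env.get("MUSEO_TTS_PUBLIC_BASE_URL", ""),
        voice=env.get("EDGE_TTS_VOICE", "es-CO-GonzaloNeural"),
        rate=env.get("EDGE_TTS_RATE", "+0%"),
        volume=env.get("EDGE_TTS_VOLUME", "+0%"),
        audio_cache_dir=env.get("AUDIO_CACHE_DIR", "./cache_audio"),
        debug_frames_dir=env.get("DEBUG_FRAMES_DIR", "./debug_frames"),
    )
```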
Audio Cache Structure
Pre-generated audio files are stored under AUDIO_CACHE_DIR. The _qr_index.json file maps every registered qr_id to its canonical species_id. When a QR code is scanned, the service looks up this index to locate the correct audio directory.
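The lookup can be sketched as follows. Two details are assumptions made for illustration and are not confirmed by this page: that _qr_index.json sits at the root of AUDIO_CACHE_DIR as a flat JSON object of the form `{qr_id: species_id}`, and that each species directory is named after its species_id. The function name `resolve_audio_dir` is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def resolve_audio_dir(cache_dir: Path, qr_id: str) -> Optional[Path]:
    """Map a scanned qr_id to the directory holding that species' audio."""
    # Assumed location and shape of the index: {"<qr_id>": "<species_id>", ...}
    index_path = cache_dir / "_qr_index.json"
    if not index_path.exists():
        return None
    index = json.loads(index_path.read_text(encoding="utf-8"))
    species_id = index.get(qr_id)
    if species_id is None:
        return None  # QR code was never registered
    # Assumed layout: one sub-directory per species, named by species_id.
    return cache_dir / species_id
```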
Narration Styles
Three styles are built by build_text_from_species() and pre-generated at sync time:
| Style | Format | Description |
|---|---|---|
| ficha | Factual | Opens with name and scientific name, then description, habitat, diet, and up to 3 curiosities |
| narrativo | Story | Opens with “Te cuento sobre…”, then scientific name, description, habitat, diet, and curiosities |
| corto | Short summary | Name, scientific name, brief description, habitat, diet, and only the first curiosity |
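To make the differences between the three styles concrete, here is a rough sketch of how such text could be assembled. It is not the real build_text_from_species(): the function name `sketch_narration_text`, the dictionary keys, and the exact sentence wording are all assumptions. Only the ordering, the “Te cuento sobre” opening, and the curiosity limits come from the table.

```python
from typing import Any, Dict, List


def sketch_narration_text(species: Dict[str, Any], style: str) -> str:
    """Assemble narration text for one style (illustrative only)."""
    name = species["name"]                      # assumed key
    scientific = species["scientific_name"]     # assumed key
    curiosities: List[str] = list(species.get("curiosities", []))

    if style == "ficha":
        parts = [f"{name} ({scientific})."]
        curiosities = curiosities[:3]           # up to 3 curiosities
    elif style == "narrativo":
        parts = [f"Te cuento sobre {name}.", f"{scientific}."]
    elif style == "corto":
        parts = [f"{name}, {scientific}."]
        curiosities = curiosities[:1]           # only the first curiosity
    else:
        raise ValueError(f"unknown style: {style!r}")

    # The sketch reuses the full description for every style; the real
    # "corto" style is described as using a brief description.
    parts += [species[key] for key in ("description", "habitat", "diet") if species.get(key)]
    parts += curiosities
    return " ".join(parts)
```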
Flask Integration Flow
When a species is deleted, Flask calls POST /internal/species/delete, which removes the audio directory and purges all QR index entries for that species.
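The effect of the delete call on the cache can be sketched like this. It is an illustration under assumptions, not the service's handler: the function name `delete_species_audio` is hypothetical, the QR index is assumed to be a flat JSON object at `<cache_dir>/_qr_index.json`, and the species directory is assumed to be named after its species_id.

```python
import json
import re
import shutil
from pathlib import Path


def delete_species_audio(cache_dir: Path, species_id: str) -> int:
    """Remove a species' audio directory and its QR index entries.

    Returns the number of QR entries purged.
    """
    # Reject anything outside the documented ID alphabet before touching disk,
    # so a crafted ID cannot escape the cache directory.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", species_id):
        raise ValueError("invalid_species_id")

    # Assumed layout: <cache_dir>/<species_id>/ holds the MP3 files.
    shutil.rmtree(cache_dir / species_id, ignore_errors=True)

    index_path = cache_dir / "_qr_index.json"  # assumed index location
    if not index_path.exists():
        return 0
    index = json.loads(index_path.read_text(encoding="utf-8"))
    kept = {qr: sp for qr, sp in index.items() if sp != species_id}
    index_path.write_text(json.dumps(kept, indent=2), encoding="utf-8")
    return len(index) - len(kept)
```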
ID Validation
All species_id and qr_id values must match the regular expression ^[A-Za-z0-9_-]+$. Requests containing IDs with spaces, slashes, or special characters will receive a 400 invalid_species_id or 400 invalid_qr_id error.
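A client can apply the same rule before sending a request. This is a minimal sketch: the helper name `validate_id` is hypothetical, while the pattern and the error codes are the ones quoted above.

```python
import re
from typing import Optional

ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]+$")


def validate_id(value: str, kind: str) -> Optional[str]:
    """Return None when `value` is valid, else the matching error code.

    `kind` is either "species_id" or "qr_id".
    """
    # fullmatch must consume the whole string, so a trailing newline
    # (which `$` alone would tolerate with re.match) is rejected too.
    if ID_PATTERN.fullmatch(value):
        return None
    return f"invalid_{kind}"
```

For example, `validate_id("jaguar_01", "species_id")` returns `None`, while `validate_id("a/b", "qr_id")` returns `"invalid_qr_id"`.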
Health Check
A lightweight health endpoint is available without authentication.
Endpoint Groups
Internal Sync
Flask-to-TTS calls that pre-generate and delete species audio. Protected by the X-API-Key header.
TTS by QR
Serve pre-generated MP3 by QR ID or synthesize ad-hoc speech from arbitrary text.
Frame / QR Resolution
POST raw JPEG bytes to detect a QR code and retrieve species text or audio in one step.
Debug
Browser-viewable debug page plus JSON status endpoint — no auth required.