When a child taps an available or completed phoneme station on their learning path, the app pushes ExerciseScreen with the selected Phoneme as a parameter. The screen runs a session of 10 words, one at a time. For each word the child can listen to a native pronunciation, then attempt to say it into the microphone. The app evaluates the spoken word using the same rules as VOZI iOS and shows friendly feedback — no scores, no letter grades, just “¡Muy bien!” or “Casi, intenta otra vez”. After 10 words, a completion dialog shows how many the child got right and whether they earned the reward for that sound.

The Exercise Screen

The exercise screen is laid out as a single vertical column with five distinct zones:

Progress Bar

A colored LinearProgressIndicator at the top showing (index + 1) / 10. Below it, a small label reads “Palabra N de 10”. The bar color is the phoneme’s identity color from VoziTheme.phonemeColor().

Word Card

A large rounded white card (_WordCard) showing the word’s image from assets/words/word_<normalized>.png (resolved via PracticeWord.imageKey) and the word in 42sp bold text below it. If the image asset doesn’t exist, a placeholder icon is shown without crashing.

Feedback Banner

A _FeedbackBanner widget that displays contextual messages (see Speak Mode section). Its height is reserved at 54dp so the layout doesn’t jump between states.

Listen / Speak Buttons

Two side-by-side _BigActionButton widgets: Escuchar (peach tint) and Hablar/Detener (phoneme color, pulsing while active). Both show an icon above a label for children who can’t read.
A full-width Siguiente / Terminar button at the bottom is disabled until the child has attempted the current word at least once (_answered = true).

Listen Mode

Tapping Escuchar calls _listen(), which runs four steps:

1. Cancel any active mic session
   Calls _stt.cancel() and _tts.stop() to free the microphone before playing audio.

2. Play the MP3 asset
   Calls _audio.playWord(word.audioKey), which attempts to play the file at assets/audio/words/<normalized>.mp3 (e.g., ratón → raton.mp3). These are the same audio files used by VOZI iOS.

3. TTS fallback
   If the MP3 asset does not exist (playWord returns false), _tts.speak(word.text) is called as a fallback using the device’s system text-to-speech engine in Spanish.

4. Restore UI state
   The _listening flag is set to false and the button label returns to “Escuchar”.

The Hablar button is disabled while audio is playing so the microphone is not activated simultaneously.
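
The listen steps above can be sketched as a small function. This is only an illustration: listenToWord, AudioSource, and the callback parameters are hypothetical stand-ins for _listen(), _stt, _tts, and _audio, whose real signatures are not shown on this page.

```dart
/// Which path ended up producing sound (hypothetical helper type).
enum AudioSource { recording, tts }

/// Sketch of the listen flow: free the mic, try the MP3, fall back to TTS.
/// The callbacks stand in for _stt.cancel, _tts.stop, _audio.playWord and
/// _tts.speak; their real signatures are assumptions.
Future<AudioSource> listenToWord({
  required Future<void> Function() cancelMic,
  required Future<void> Function() stopSpeech,
  required Future<bool> Function(String audioKey) playWord,
  required Future<void> Function(String text) speak,
  required String text,
  required String audioKey,
}) async {
  // Step 1: release the microphone and silence any speech in progress.
  await cancelMic();
  await stopSpeech();

  // Step 2: try the bundled recording first.
  final played = await playWord(audioKey);
  if (played) return AudioSource.recording;

  // Step 3: no asset was found, so fall back to system text-to-speech.
  await speak(text);
  return AudioSource.tts;
}
```

Step 4 (resetting the _listening flag) would happen in the caller once the returned future completes, for example in a finally block so the button is restored even if playback throws.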

Speak Mode (Voice Recognition)

VOZI uses a two-click manual flow designed to prevent false triggers and give the child clear control. The _attempt state machine drives the feedback banner and button label.

Attempt State Machine

| State | Banner Message | Description |
| --- | --- | --- |
| none | (empty, reserved height) | Initial state; no attempt started |
| preparing | “Preparando micrófono…” | First tap: STT engine initializing |
| ready | “Habla ahora” | Microphone is actively listening |
| tooFast | “Espera un momentito, todavía estoy preparando el micrófono.” | Second tap arrived before mic was ready |
| heard | “Te escuché” | Transcription received; brief transition state (~700 ms) |
| passed | “¡Muy bien!” | Word evaluated as correct |
| almost | “Casi, intenta otra vez” | Word evaluated as incorrect |
| empty | “No detecté voz, intentemos otra vez” | STT returned empty transcription |
| unavailable | “Micrófono no disponible aquí” | STT engine unavailable (no permission / emulator) |
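
One way to keep the banner text next to each state is an enhanced enum (Dart 2.17 or later). The sketch below mirrors the table; the name Attempt and the banner field are illustrative, and the app's private _Attempt type may be organized differently.

```dart
/// Sketch of the attempt states with their banner text attached.
/// The real _Attempt enum may keep the messages elsewhere.
enum Attempt {
  none(''),
  preparing('Preparando micrófono…'),
  ready('Habla ahora'),
  tooFast('Espera un momentito, todavía estoy preparando el micrófono.'),
  heard('Te escuché'),
  passed('¡Muy bien!'),
  almost('Casi, intenta otra vez'),
  empty('No detecté voz, intentemos otra vez'),
  unavailable('Micrófono no disponible aquí');

  const Attempt(this.banner);

  /// Text shown in the feedback banner for this state.
  final String banner;
}
```

With this shape the banner widget only needs to read attempt.banner, and adding a state without a message becomes a compile-time error.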

Two-Click Flow

```dart
// First tap — start
setState(() => _attempt = _Attempt.preparing);
await _stt.listen(
  onReady: () => setState(() => _attempt = _Attempt.ready),
  onResult: _evaluate,
  targetWord: _current.text,
);

// Second tap — stop and evaluate
await _stt.stopAndReport(); // triggers onResult callback
```
STT runs entirely on-device using the Android speech recognition engine. The recognized text is never uploaded anywhere. Only the text string and a pass/fail boolean are stored locally as part of the attempt history, which parents can review in the dashboard.
Every attempt — including empty ones and unavailable states — is saved to the local attempt history via ProfileScope.of(context).recordAttempt(...).
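
The decision made on each tap of Hablar can be modeled as a pure function, which makes the two-click flow easy to unit test without a microphone. The sketch below is an assumption about how the pieces fit together: MicPhase, TapAction, Outcome, onSpeakTap, and outcomeFor are hypothetical names, not identifiers from the app.

```dart
/// Coarse microphone phases (hypothetical; the app tracks these via _attempt).
enum MicPhase { idle, preparing, ready }

/// What a tap on the speak button should do in each phase.
enum TapAction { startListening, warnTooFast, stopAndEvaluate }

/// Final result of one attempt, matching the passed/almost/empty banners.
enum Outcome { passed, almost, empty }

/// Decides how to react to a tap, following the two-click flow.
TapAction onSpeakTap(MicPhase phase) {
  switch (phase) {
    case MicPhase.idle:
      return TapAction.startListening; // first tap: begin preparing the mic
    case MicPhase.preparing:
      return TapAction.warnTooFast; // tap arrived before the mic was ready
    case MicPhase.ready:
      return TapAction.stopAndEvaluate; // second tap: stop and evaluate
  }
}

/// Maps a transcription and the evaluator's verdict to a banner outcome.
Outcome outcomeFor(String transcription, {required bool passed}) {
  if (transcription.trim().isEmpty) return Outcome.empty;
  return passed ? Outcome.passed : Outcome.almost;
}
```

For example, onSpeakTap(MicPhase.preparing) yields TapAction.warnTooFast, which corresponds to the tooFast banner in the state table.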

Word Evaluation Algorithm

WordEvaluator.evaluate() applies three rules in sequence, mirroring the logic in VOZI iOS’s PhonemeWordEvaluator. All three must pass for a word to be marked correct.
```dart
WordResult result = WordEvaluator.evaluate(
  phoneme: Phoneme.r,
  target: 'rana',
  transcription: 'la rana salta',
);
// result.passed → true
// result.score  → 1.0
```
Signature:
```dart
static WordResult evaluate({
  required Phoneme phoneme,
  required String target,
  required String transcription,
});
```
WordResult fields:

| Field | Type | Description |
| --- | --- | --- |
| passed | bool | true if all three rules are satisfied |
| score | double | Levenshtein similarity 0.0–1.0 (used as support rule) |

The Three Rules

1. Rule 1: Exact token match
   The normalized target word must appear as a complete token in the normalized transcription. Partial substring matches don’t count. For example, if the target is "rana" and the transcription is "la rana", the token "rana" is present. If the transcription is "arana" (as one token), it fails.

2. Rule 2: Phoneme sound preserved
   The target word must contain the phoneme’s characteristic sound pattern. This is checked against the normalized target word, not the transcription:

| Phoneme | Condition |
| --- | --- |
| R | normalized target starts with 'r' |
| RR | normalized target contains 'rr' |
| S | normalized target starts with 's' |
| L | normalized target starts with 'l' |
| TR | normalized target contains 'tr' |
| PR | normalized target contains 'pr' |
| PL | normalized target contains 'pl' |
| BR | normalized target contains 'br' |
| BL | normalized target contains 'bl' |

3. Rule 3: Levenshtein similarity ≥ 0.8
   _similarity(target, transcription) computes a 0.0–1.0 score. It checks per-word similarity against each token in the transcription and takes the maximum. If any token exactly matches the target, the score is 1.0. A minimum of 0.8 is required as support.
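
As an illustration of Rule 3, here is a minimal per-token similarity sketch. It assumes both inputs are already normalized and that the score is 1 − distance / max(length); the page does not state the exact formula _similarity uses, so treat that scaling as an assumption.

```dart
import 'dart:math' as math;

/// Classic two-row Levenshtein edit distance between [a] and [b].
int levenshtein(String a, String b) {
  if (a.isEmpty) return b.length;
  if (b.isEmpty) return a.length;
  var previous = List<int>.generate(b.length + 1, (j) => j);
  for (var i = 1; i <= a.length; i++) {
    final current = List<int>.filled(b.length + 1, 0);
    current[0] = i;
    for (var j = 1; j <= b.length; j++) {
      final cost = a.codeUnitAt(i - 1) == b.codeUnitAt(j - 1) ? 0 : 1;
      current[j] = math.min(
        math.min(current[j - 1] + 1, previous[j] + 1), // insert / delete
        previous[j - 1] + cost, // substitute or match
      );
    }
    previous = current;
  }
  return previous[b.length];
}

/// Best similarity (0.0 to 1.0) between [target] and any token of
/// [transcription]. Scaling by the longer length is an assumption.
double similarity(String target, String transcription) {
  final tokens = transcription.split(' ').where((t) => t.isNotEmpty);
  var best = 0.0;
  for (final token in tokens) {
    if (token == target) return 1.0; // an exact token match short-circuits
    final longest = math.max(target.length, token.length);
    if (longest == 0) continue;
    final score = 1.0 - levenshtein(target, token) / longest;
    best = math.max(best, score);
  }
  return best;
}
```

Under this scaling, "rama" against the target "rana" scores 0.75 (one substitution over four letters), which falls below the 0.8 threshold.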
Normalization (_normalize) lowercases the input, strips diacritics (e.g. á → a, ñ → n), removes non-alphanumeric characters, and collapses whitespace — ensuring accented transcriptions from STT don’t fail unnecessarily.
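
The normalization step, together with Rules 1 and 2, can be sketched as follows. The diacritic map covers only the Spanish characters that appear in the word bank, phonemes are represented as plain strings rather than the app's Phoneme type, and spaces are kept so the text can still be split into tokens; all three are assumptions of this sketch.

```dart
// Spanish accented letters mapped to their base letters (assumed subset).
const _diacritics = {
  'á': 'a', 'é': 'e', 'í': 'i', 'ó': 'o', 'ú': 'u', 'ü': 'u', 'ñ': 'n',
};

/// Lowercases, strips diacritics, drops punctuation, collapses whitespace.
String normalize(String input) {
  final buffer = StringBuffer();
  for (final rune in input.toLowerCase().runes) {
    final ch = String.fromCharCode(rune);
    buffer.write(_diacritics[ch] ?? ch);
  }
  return buffer
      .toString()
      .replaceAll(RegExp(r'[^a-z0-9\s]'), '') // keep letters, digits, spaces
      .replaceAll(RegExp(r'\s+'), ' ')
      .trim();
}

/// Rule 1: the target must appear as a whole token, not as a substring.
bool containsToken(String target, String transcription) {
  return normalize(transcription).split(' ').contains(normalize(target));
}

/// Rule 2: the normalized target carries the phoneme's letter pattern.
bool soundPreserved(String phoneme, String target) {
  final word = normalize(target);
  switch (phoneme) {
    case 'R':
    case 'S':
    case 'L':
      return word.startsWith(phoneme.toLowerCase());
    case 'RR':
    case 'TR':
    case 'PR':
    case 'PL':
    case 'BR':
    case 'BL':
      return word.contains(phoneme.toLowerCase());
    default:
      return false; // unknown phoneme id
  }
}
```

For instance, normalize('¡Ratón!') yields 'raton', so an accented or punctuated transcription still matches the target token.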

Session Completion

Once all 10 words have been attempted, the bottom button reads Terminar; tapping it calls _finish().
```dart
final correct  = _passed.where((p) => p).length;
final rewarded = correct >= _requiredCorrect; // (10 * 0.9).ceil() = 9
```
A session is rewarded when the child gets 9 or more of 10 words correct. The completion dialog (_CompletionDialog) shows:
  • 🎉 or 💪 emoji depending on outcome
  • Correct count: “Acertaste N de 10 palabras.”
  • Points earned: “Ganaste 10 puntos. ⭐” (rewarded) or how many more are needed (not rewarded)
  • A single Volver al camino button
ProfileStore.finishPhoneme() is always called with the rewarded flag:
  • Always: phoneme added to practicedPhonemes (unlocks the next station regardless of score)
  • Only if rewarded: phoneme added to completedPhonemes, 10 points added to profile
If the session was rewarded, showVoziConfetti(context) fires a Flutter overlay of 34 animated confetti particles that runs once (~1.6 s) and auto-destroys.
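
The reward rule and the profile update can be sketched together. Profile and finishSession below are hypothetical stand-ins for the profile model and ProfileStore.finishPhoneme(), and phonemes are plain strings. The page does not say whether repeating an already rewarded phoneme awards points again; this sketch simply adds them every time.

```dart
/// Minimal stand-in for the stored profile (hypothetical shape).
class Profile {
  final Set<String> practicedPhonemes = {};
  final Set<String> completedPhonemes = {};
  int points = 0;
}

const int wordsPerSession = 10;
const int rewardPoints = 10;

/// 90% of the session rounded up: (10 * 0.9).ceil() = 9.
final int requiredCorrect = (wordsPerSession * 0.9).ceil();

/// Applies a finished session to [profile] and reports whether it was rewarded.
bool finishSession(Profile profile, String phoneme, List<bool> passed) {
  final correct = passed.where((p) => p).length;
  final rewarded = correct >= requiredCorrect;

  // Always recorded, so the next station unlocks regardless of score.
  profile.practicedPhonemes.add(phoneme);

  // Only a rewarded session marks the phoneme complete and grants points.
  if (rewarded) {
    profile.completedPhonemes.add(phoneme);
    profile.points += rewardPoints;
  }
  return rewarded;
}
```

A session with 8 correct words therefore unlocks the next station but leaves completedPhonemes and points unchanged.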

Word Bank

The word bank is a static Map<String, List<String>> in WordBank._words. Each phoneme has exactly 10 words, taken directly from VOZI iOS’s ContentBank.swift.
| Phoneme | Words |
| --- | --- |
| R | rana, rosa, ratón, ratita, rueda, rama, remo, río, ropa, radio |
| RR | perro, carro, torre, burro, gorra, jarra, tierra, barro, zorro, cerro |
| S | sapo, sol, silla, sopa, sal, saco, sueño, salsa, sala, sirena |
| L | luna, lechuga, loro, leche, lámpara, libro, limón, llave, lobo, lata |
| TR | tren, trapo, trono, trigo, trompo, tres, trozo, trucha, trenza, trofeo |
| PR | proa, presa, prisa, prado, prosa, presto, prendedor, prensa, pronto, praga |
| PL | plato, pluma, playa, plaza, pleno, plata, plancha, plano, plaga, plomo |
| BR | brazo, brisa, brocha, brasa, bravo, brillo, broma, cebra, libro, cabra |
| BL | blanco, blíster, bloque, blando, cable, tabla, pueblo, mueble, habla, establo |
Each word is wrapped in a PracticeWord object when loaded. The image asset path is assets/words/word_<normalized>.png (the imageKey property strips diacritics, lowercases, and prepends word_; e.g., ratón → word_raton.png). The audio asset path is assets/audio/words/<normalized>.mp3 with no prefix (e.g., ratón → raton.mp3).
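
A minimal sketch of how the asset keys could be derived from a word. The imageKey and audioKey names come from this page, but whether they return a bare key or a full path is not stated; the imagePath and audioPath getters, the wordsFor helper, and the one-phoneme wordBankExcerpt map are hypothetical additions for illustration.

```dart
/// Excerpt of the word bank: only the R row is reproduced here.
const Map<String, List<String>> wordBankExcerpt = {
  'R': [
    'rana', 'rosa', 'ratón', 'ratita', 'rueda',
    'rama', 'remo', 'río', 'ropa', 'radio',
  ],
};

class PracticeWord {
  const PracticeWord(this.text);

  /// The word as shown to the child, accents included.
  final String text;

  /// Lowercased text with Spanish diacritics removed (assumed mapping).
  String get _normalized {
    const map = {
      'á': 'a', 'é': 'e', 'í': 'i', 'ó': 'o', 'ú': 'u', 'ü': 'u', 'ñ': 'n',
    };
    return text.toLowerCase().split('').map((c) => map[c] ?? c).join();
  }

  String get imageKey => 'word_$_normalized'; // images carry a word_ prefix
  String get audioKey => _normalized; // audio files carry no prefix

  String get imagePath => 'assets/words/$imageKey.png';
  String get audioPath => 'assets/audio/words/$audioKey.mp3';
}

/// Wraps every word of [phoneme] in a PracticeWord (empty if unknown).
List<PracticeWord> wordsFor(String phoneme) {
  final words = wordBankExcerpt[phoneme] ?? const <String>[];
  return words.map((w) => PracticeWord(w)).toList();
}
```

For example, PracticeWord('ratón').imagePath resolves to assets/words/word_raton.png and audioPath to assets/audio/words/raton.mp3.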
