VOZI: Interactive Spanish Phoneme Learning for Kids

VOZI is an educational Flutter app built for Android that helps children between the ages of 4 and 7 learn to pronounce Spanish phonemes through play. Each practice session pairs a picture card and audio playback with on-device speech recognition, so children hear a word, say it aloud, and receive instant feedback — all without an internet connection. VOZI is a learning tool, not a diagnostic instrument; it is designed to build phoneme awareness through repetition and positive reinforcement.

What VOZI Teaches

VOZI covers nine Spanish phonemes and consonant clusters arranged as a sequential learning path. Children progress through each phoneme in order, practicing 10 real Spanish words per phoneme before unlocking the next stage.

Order	Phoneme	Sample Words
1	R	rana, rosa, ratón, rueda, río…
2	RR	perro, carro, torre, burro, zorro…
3	S	sapo, sol, silla, sopa, sirena…
4	L	luna, lechuga, loro, leche, lobo…
5	TR	tren, trapo, trono, trigo, trofeo…
6	PR	proa, presa, prisa, prado, pronto…
7	PL	plato, pluma, playa, plaza, plancha…
8	BR	brazo, brisa, brocha, cebra, cabra…
9	BL	blanco, bloque, blando, cable, tabla…

The phoneme path is linear: a child must complete all 10 words for R before RR becomes available, RR before S, and so on. Each phoneme’s word list is the same static set used in VOZI iOS (ContentBank.swift), ensuring content parity across platforms. The word bank lives in lib/data/word_bank.dart and can be migrated to Supabase in a future phase without affecting local progress data.
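The linear unlock rule described above can be sketched in a few lines of Dart. This is a minimal illustration, not the contents of lib/data/word_bank.dart: the names phonemePath, wordsPerPhoneme, and isUnlocked are hypothetical, and the sketch assumes progress is tracked as a count of completed words per phoneme.

```dart
// Hypothetical sketch; identifiers are illustrative and not taken from VOZI's source.

/// The nine stages, in the order given in the table above.
const List<String> phonemePath = [
  'R', 'RR', 'S', 'L', 'TR', 'PR', 'PL', 'BR', 'BL',
];

/// Words a child must complete in a stage before the next one opens.
const int wordsPerPhoneme = 10;

/// Returns true when [phoneme] is available, given the number of words
/// completed so far for each phoneme (assumed progress format).
bool isUnlocked(String phoneme, Map<String, int> completedWords) {
  final index = phonemePath.indexOf(phoneme);
  if (index < 0) return false; // not part of the path

  // Every earlier stage must be fully completed.
  for (var i = 0; i < index; i++) {
    final done = completedWords[phonemePath[i]] ?? 0;
    if (done < wordsPerPhoneme) return false;
  }
  return true;
}
```

Under these assumptions, R is always available, RR opens only once all 10 R words are done, and S stays locked until both R and RR are complete.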

How a Practice Session Works

When a child selects a phoneme, each of its 10 words is presented as an image card. Two interaction modes drive the session:

Listen (Escuchar): The app plays the correct pronunciation of the word using the device’s Text-to-Speech engine (flutter_tts) or a pre-recorded MP3 from assets/audio/words/. The child can tap Listen as many times as needed before attempting to speak.
Speak (Hablar): The app activates the on-device speech recognizer (speech_to_text). The microphone listens, transcribes the child’s utterance locally, and evaluates it against three criteria: the target word must appear as a complete token, the phoneme sound must be preserved, and the edit-distance similarity must be 0.8 (80%) or above. A word that passes all three checks counts as correct. No audio recording is retained after evaluation; only the resulting metrics, such as the similarity score and the pass/fail result, are kept.

When a child completes all 10 words in a phoneme, the next phoneme on the path is unlocked. If the child answered at least 9 out of 10 words correctly (≥ 90% session accuracy), VOZI also plays a session-complete audio cue from assets/audio/feedback/session/ and displays a confetti celebration driven by an animated Rive character from assets/rive/. The child receives a reward that appears in their personal rewards gallery.
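The three checks and the 90% celebration rule described above can be sketched as follows. This is a rough illustration under stated assumptions, not VOZI's evaluator. It assumes the similarity is a normalized Levenshtein distance between the whole transcript and the target word, and that "phoneme preserved" means the phoneme's spelling appears in the transcript. All function names here are hypothetical.

```dart
import 'dart:math' as math;

/// Classic dynamic-programming Levenshtein distance between two strings.
int levenshtein(String a, String b) {
  var previous = List<int>.generate(b.length + 1, (j) => j);
  for (var i = 1; i <= a.length; i++) {
    final current = List<int>.filled(b.length + 1, 0);
    current[0] = i;
    for (var j = 1; j <= b.length; j++) {
      final cost = a.codeUnitAt(i - 1) == b.codeUnitAt(j - 1) ? 0 : 1;
      current[j] = math.min(
        math.min(current[j - 1] + 1, previous[j] + 1),
        previous[j - 1] + cost,
      );
    }
    previous = current;
  }
  return previous[b.length];
}

/// Similarity in the range 0.0 to 1.0 (assumed normalization by longest string).
double similarity(String a, String b) {
  final longest = math.max(a.length, b.length);
  if (longest == 0) return 1.0;
  return 1.0 - levenshtein(a, b) / longest;
}

/// Lowercases, trims, and collapses runs of whitespace.
String normalize(String text) =>
    text.toLowerCase().trim().replaceAll(RegExp(r'\s+'), ' ');

/// Applies the three checks; every rule here is an assumption about their meaning.
bool isCorrect({
  required String transcript,
  required String target,
  required String phoneme,
  double threshold = 0.8,
}) {
  final heard = normalize(transcript);
  final word = normalize(target);
  final hasToken = heard.split(' ').contains(word); // complete-token match
  final keepsPhoneme = heard.contains(phoneme.toLowerCase()); // spelling proxy
  final closeEnough = similarity(heard, word) >= threshold;
  return hasToken && keepsPhoneme && closeEnough;
}

/// True when at least 90% of the session's words were correct.
/// Integer arithmetic avoids floating-point edge cases at the boundary.
bool earnsCelebration(int correct, int total) =>
    total > 0 && correct * 10 >= total * 9;
```

With these assumptions, a transcript of "Perro" passes for the target "perro" and phoneme "RR", while "pero" fails because the target never appears as a complete token and the "rr" spelling is missing, even though the two words are only one edit apart.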

VOZI is an educational app, not a medical or clinical tool. It is designed to build phoneme awareness through positive reinforcement and is not a substitute for evaluation or therapy by a licensed speech-language pathologist.

Privacy by Design

VOZI enforces a strict local-processing rule for all voice data. The speech recognizer runs entirely on-device: raw audio is never recorded to disk, never transmitted over the network, and never sent to Supabase. The only data that may be synced to the backend are structured metrics:

Phoneme practiced (e.g., "RR")
Word practiced (e.g., "perro")
Similarity score (numeric, 0.0–1.0)
Pass / fail result
Timestamp and child profile ID

This means a parent or guardian can review progress data without any audio trail existing. The supabase_config.dart source explicitly notes: “VOZI no sube audio a Supabase. VOZI no sube transcripciones.” (In English: “VOZI does not upload audio to Supabase. VOZI does not upload transcriptions.”)
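One way to picture the syncable record is a small immutable class whose fields mirror the list above and nothing more. The class name, field names, and JSON keys below are assumptions made for illustration; they are not taken from VOZI's source or its Supabase schema.

```dart
/// Hypothetical shape of one synced attempt. It deliberately has no field
/// for audio or for the transcript, matching the privacy rule above.
class AttemptMetric {
  const AttemptMetric({
    required this.phoneme,
    required this.word,
    required this.similarity,
    required this.passed,
    required this.timestamp,
    required this.childProfileId,
  });

  final String phoneme; // e.g. "RR"
  final String word; // e.g. "perro"
  final double similarity; // 0.0 to 1.0
  final bool passed;
  final DateTime timestamp;
  final String childProfileId;

  /// Serializes only the structured metrics (key names are assumed).
  Map<String, Object> toJson() => {
        'phoneme': phoneme,
        'word': word,
        'similarity': similarity,
        'passed': passed,
        'timestamp': timestamp.toUtc().toIso8601String(),
        'child_profile_id': childProfileId,
      };
}
```

Because the payload is built from a fixed set of fields, there is no path by which raw audio or recognized text could be attached to a sync request in this sketch.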

Who Manages the App

VOZI is designed so that a child can use it independently after a guardian completes the initial setup. Children do not log in — they simply select their profile from the profiles screen. Adults access a separate dashboard by entering a 4-digit PIN (1234 in the academic demo). The adult panel provides:

An overview of each child’s phoneme progress and attempt history
A gateway to the Premium features demo
Access to the adult Supabase account screen for optional backend sync

The PIN is a lightweight barrier intended for academic demonstration. It is defined as a compile-time constant in AppConfig.parentPin and is not a substitute for full authentication.
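A compile-time PIN gate of this kind might look like the sketch below. Only the name AppConfig.parentPin and the demo value 1234 come from the text above; the String type, the helper functions, and the format check are assumptions.

```dart
/// Stand-in for the app's configuration class (assumed shape).
class AppConfig {
  /// Demo PIN from the documentation; the String type is an assumption.
  static const String parentPin = '1234';
}

/// Accepts exactly four ASCII digits.
bool isValidPinFormat(String input) => RegExp(r'^\d{4}$').hasMatch(input);

/// True when the entered PIN is well formed and matches the constant.
bool unlocksAdultPanel(String input) =>
    isValidPinFormat(input) && input == AppConfig.parentPin;
```

Since the PIN is a constant baked into the binary, anyone with access to the build can recover it, which is why the text treats it as a demonstration barrier rather than authentication.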

Key Features at a Glance

On-Device Speech Recognition

Pronunciation is evaluated locally using the speech_to_text package. No audio ever leaves the device.

TTS + MP3 Audio Playback

Words are voiced via flutter_tts and pre-recorded MP3s so children hear accurate pronunciation before speaking.

9-Phoneme Learning Path

Nine phonemes and clusters (R through BL), each with 10 curated practice words, unlock sequentially as children progress.

Rive Animated Rewards

Completing a phoneme with at least 90% session accuracy triggers a Rive character animation and a session-complete audio cue to celebrate the child’s achievement.

Child Profiles, No Login

Multiple child profiles are stored locally. Children tap their avatar to start — no passwords or accounts required.

Optional Supabase Sync

Progress metrics can sync to a Supabase backend for adult review. The app runs fully offline when no .env credentials are present.

Adult PIN Dashboard

A 4-digit PIN gates the adult panel, keeping settings and progress data separate from the child-facing interface.

Image Card UI

Each practice word is paired with a PNG illustration from assets/words/ to reinforce meaning alongside sound.
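The offline fallback mentioned under Optional Supabase Sync can be pictured as a simple guard over the loaded environment values. The key names SUPABASE_URL and SUPABASE_ANON_KEY and the function syncEnabled are assumptions made for this sketch; the documentation only states that the app runs offline when no .env credentials are present.

```dart
/// Returns true only when both credentials are present and non-blank.
/// The environment key names are assumed, not confirmed by the source.
bool syncEnabled(Map<String, String> env) {
  final url = env['SUPABASE_URL']?.trim() ?? '';
  final key = env['SUPABASE_ANON_KEY']?.trim() ?? '';
  return url.isNotEmpty && key.isNotEmpty;
}
```

A caller could check this once at startup and skip creating any backend client when it returns false, so local progress keeps working without a network.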
