EV Sum 2 uses Android’s TextToSpeech API to convert text into spoken audio using the Spanish (Chile) locale, so users can hear their saved phrases aloud.

Overview

The text-to-speech feature synthesizes speech for Spanish text and handles engine initialization and cleanup on its own.
The TTS engine is configured for Chilean Spanish (es-CL) so that pronunciation matches local usage.

Architecture

The feature consists of a single controller class:
  • TextToSpeechController (services/TextToSpeechController.kt:7) - Manages TTS lifecycle and playback

Implementation

The TextToSpeechController class wraps Android’s TextToSpeech API:
import android.content.Context
import android.speech.tts.TextToSpeech
import java.util.Locale

class TextToSpeechController(context: Context) {
    private var tts: TextToSpeech? = null
    private var ready = false

    init {
        tts = TextToSpeech(context) { status ->
            ready = status == TextToSpeech.SUCCESS
            if (ready) {
                tts?.language = Locale.forLanguageTag("es-CL")
            }
        }
    }

    fun speak(text: String) {
        if (!ready) return
        val clean = text.trim()
        if (clean.isBlank()) return
        tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, "phrase_tts")
    }

    fun stop() {
        tts?.stop()
    }

    fun destroy() {
        tts?.stop()
        tts?.shutdown()
        tts = null
    }
}

Key features

Chilean Spanish

Optimized for Spanish (Chile) locale

Auto-initialization

Automatic TTS engine setup on creation

Queue management

QUEUE_FLUSH mode for immediate playback

Lifecycle aware

Proper cleanup to prevent memory leaks

Usage

1. Create controller

Initialize the controller with context:
val context = LocalContext.current
val tts = remember { TextToSpeechController(context) }

2. Speak text

Convert text to speech:
Button(onClick = { tts.speak("Hola, mundo!") }) {
    Text("Speak")
}

3. Clean up

Destroy the controller when done:
DisposableEffect(Unit) {
    onDispose { tts.destroy() }
}

Initialization process

The TTS engine initializes asynchronously:
init {
    tts = TextToSpeech(context) { status ->
        ready = status == TextToSpeech.SUCCESS
        if (ready) {
            tts?.language = Locale.forLanguageTag("es-CL")
        }
    }
}
The controller tracks initialization state with the ready flag. Speech requests made before initialization completes are silently ignored.

Initialization states

| State | Description | Behavior |
| --- | --- | --- |
| Initializing | TTS engine loading | speak() calls ignored |
| Ready | Locale set to es-CL | speak() calls processed |
| Failed | Initialization error | speak() calls ignored |
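
The three states in the table above can be made explicit in code. The following is a minimal sketch, not the controller's actual implementation: TtsState, stateFor, and acceptsSpeech are hypothetical names, and the two status constants mirror the values of TextToSpeech.SUCCESS (0) and TextToSpeech.ERROR (-1) so that the snippet compiles without the Android SDK.

```kotlin
// Hypothetical model of the three initialization states.
enum class TtsState { INITIALIZING, READY, FAILED }

// Mirrors of TextToSpeech.SUCCESS and TextToSpeech.ERROR, copied here so the
// sketch compiles on a plain JVM.
const val SUCCESS = 0
const val ERROR = -1

// Maps the status passed to the init callback to a state.
// A null status means the callback has not fired yet.
fun stateFor(status: Int?): TtsState = when (status) {
    null -> TtsState.INITIALIZING
    SUCCESS -> TtsState.READY
    else -> TtsState.FAILED
}

// speak() is only processed in the READY state.
fun acceptsSpeech(state: TtsState): Boolean = state == TtsState.READY
```

Compared with a single Boolean flag, a three-valued state lets the UI distinguish "still loading" from "failed", for example to show an error message only in the second case.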

Speaking text

The speak() method handles text-to-speech conversion:
fun speak(text: String) {
    if (!ready) return
    val clean = text.trim()
    if (clean.isBlank()) return
    tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, "phrase_tts")
}

Parameters explained

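text: The text to be spoken. Automatically trimmed and validated.
queueMode: Playback mode. QUEUE_FLUSH clears any pending speech and starts immediately. Alternative: QUEUE_ADD appends to the queue instead of replacing it.
params: Optional parameters bundle (null in this implementation).
utteranceId: Identifier for this speech request, reported to utterance callbacks. This implementation always passes the constant "phrase_tts", so it does not distinguish individual requests.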

Queue modes

QUEUE_FLUSH interrupts any ongoing speech and starts immediately:
tts.speak("New text", TextToSpeech.QUEUE_FLUSH, null, "id")
Use case: User clicks a new phrase while another is playing.
EV Sum 2 uses QUEUE_FLUSH to ensure immediate feedback when users tap a phrase, preventing confusion from queued speech.
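
The difference between the two queue modes can be illustrated with a small model. This is a simplified simulation on a plain queue, not the engine's real implementation: UtteranceQueue and its methods are hypothetical, and the constants mirror TextToSpeech.QUEUE_FLUSH (0) and TextToSpeech.QUEUE_ADD (1).

```kotlin
// Mirrors of TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
const val QUEUE_FLUSH = 0
const val QUEUE_ADD = 1

// Hypothetical stand-in for the engine's playback queue.
class UtteranceQueue {
    private val pending = ArrayDeque<String>()

    fun speak(text: String, queueMode: Int) {
        // QUEUE_FLUSH drops everything that is waiting (the real engine also
        // interrupts the utterance currently playing) before enqueueing.
        if (queueMode == QUEUE_FLUSH) pending.clear()
        pending.addLast(text)
    }

    // Snapshot of what would be spoken, in order.
    fun contents(): List<String> = pending.toList()
}
```

In this model, adding "uno" and "dos" with QUEUE_ADD leaves both queued in order, while a following call with "tres" and QUEUE_FLUSH leaves only "tres".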

Stopping speech

Stop playback immediately:
fun stop() {
    tts?.stop()
}
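Example usage. Note that isSpeaking is not reset when playback finishes on its own; the Utterance callbacks section shows how to detect completion: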
var isSpeaking by remember { mutableStateOf(false) }

Button(
    onClick = {
        if (isSpeaking) {
            tts.stop()
            isSpeaking = false
        } else {
            tts.speak(text)
            isSpeaking = true
        }
    }
) {
    Text(if (isSpeaking) "Stop" else "Speak")
}

Lifecycle management

Properly clean up resources when the controller is no longer needed:
fun destroy() {
    tts?.stop()
    tts?.shutdown()
    tts = null
}

In Compose

val tts = remember { TextToSpeechController(context) }

DisposableEffect(Unit) {
    onDispose {
        tts.destroy()
    }
}
Always call destroy() to release TTS resources. Failing to do so leaves the connection to the TTS engine service open, which can cause memory leaks.

Integration with phrase management

The home screen demonstrates TTS integration with saved phrases (ui/home/HomeScreen.kt:88):
val tts = remember { TextToSpeechController(context) }

DisposableEffect(Unit) {
    onDispose { tts.destroy() }
}

LazyColumn {
    items(phrases) { phrase ->
        Card {
            Text(phrase.text)
            
            Button(
                onClick = { tts.speak(phrase.text) },
                modifier = Modifier.fillMaxWidth().height(56.dp)
            ) {
                Icon(Icons.AutoMirrored.Filled.VolumeUp, null)
                Spacer(Modifier.width(12.dp))
                Text("PLAY VOICE")
            }
        }
    }
}

Validation

The controller includes automatic validation:
fun speak(text: String) {
    if (!ready) return           // Skip if not initialized
    val clean = text.trim()
    if (clean.isBlank()) return  // Skip if empty/whitespace
    tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, "phrase_tts")
}

Validation checks

  1. Initialization check: if (!ready) return
  2. Text cleaning: text.trim()
  3. Empty check: if (clean.isBlank()) return
Validation is automatic - you don’t need to check these conditions before calling speak().

Error handling

The TTS engine may fail to initialize on some devices:
tts = TextToSpeech(context) { status ->
    ready = status == TextToSpeech.SUCCESS
    if (ready) {
        tts?.language = Locale.forLanguageTag("es-CL")
    } else {
        // Initialization failed
        Log.e("TTS", "TextToSpeech initialization failed")
    }
}

Common initialization failures

Cause: Device doesn’t have a TTS engine installed. Solution: Prompt the user to install Google Text-to-Speech from the Play Store.
Cause: TTS engine doesn’t support Spanish (Chile). Solution: Fall back to generic Spanish or prompt a language download. This case does not fail the init callback; it is reported by the return value of setLanguage() (LANG_NOT_SUPPORTED or LANG_MISSING_DATA).
Cause: TTS engine is busy or being used by another app. Solution: Retry initialization or show an error message.
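
The fallback to generic Spanish mentioned above could be sketched as follows. The pickLocale helper, the spanishCandidates list, and the availability callback are assumptions made for illustration; in an app the callback would typically wrap TextToSpeech.isLanguageAvailable, whose result codes are mirrored here (LANG_AVAILABLE = 0, with negative values such as LANG_MISSING_DATA = -1 and LANG_NOT_SUPPORTED = -2 meaning the locale cannot be used).

```kotlin
import java.util.Locale

// Mirror of TextToSpeech.LANG_AVAILABLE; results at or above this value mean
// the language (and possibly the country or variant) is usable.
const val LANG_AVAILABLE = 0

// Returns the first candidate the engine reports as available, or null.
// `availability` is a hypothetical hook; on Android it could be
// { locale -> tts.isLanguageAvailable(locale) }.
fun pickLocale(
    candidates: List<Locale>,
    availability: (Locale) -> Int
): Locale? = candidates.firstOrNull { availability(it) >= LANG_AVAILABLE }

// Preferred order: Chilean Spanish first, then generic Spanish.
val spanishCandidates = listOf(
    Locale.forLanguageTag("es-CL"),
    Locale.forLanguageTag("es")
)
```

A null result means that no Spanish voice is usable, which is the point at which prompting for a language download makes sense.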

Advanced usage

Speech rate and pitch

Customize voice characteristics:
init {
    tts = TextToSpeech(context) { status ->
        if (status == TextToSpeech.SUCCESS) {
            tts?.language = Locale.forLanguageTag("es-CL")
            tts?.setSpeechRate(1.0f)  // 0.5 = slower, 2.0 = faster
            tts?.setPitch(1.0f)       // 0.5 = lower, 2.0 = higher
        }
    }
}

Utterance callbacks

Track speech progress:
fun speak(text: String, onComplete: () -> Unit) {
    if (!ready) return
    val clean = text.trim()
    if (clean.isBlank()) return
    
    val utteranceId = "phrase_${System.currentTimeMillis()}"
    
    tts?.setOnUtteranceProgressListener(object : UtteranceProgressListener() {
        override fun onStart(utteranceId: String?) {
            // Speech started
        }
        
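        override fun onDone(utteranceId: String?) {
            // Invoked on a background thread; switch to the main thread
            // inside onComplete before touching UI state.
            onComplete()
        }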
        
        override fun onError(utteranceId: String?) {
            // Speech error
        }
    })
    
    tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, utteranceId)
}

Testing

Test TTS availability

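// Inside an Activity. CHECK_TTS_DATA is an arbitrary request code defined by the app.
fun checkTtsData() {
    startActivityForResult(Intent(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA), CHECK_TTS_DATA)
}

// The result is delivered asynchronously here; startActivityForResult does not return it.
override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)
    if (requestCode != CHECK_TTS_DATA) return
    if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
        // TTS is available
    } else {
        // Prompt to install TTS data
        startActivity(Intent(TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA))
    }
}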

Emulator testing

  • Most emulators include Google TTS engine
  • Spanish voice may need to be downloaded
  • Test with various text lengths and special characters
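
Because TextToSpeech needs a device or emulator, the validation logic is easier to unit-test when the engine sits behind an interface. The sketch below is one possible arrangement, not how EV Sum 2 is structured: SpeechEngine, PhraseSpeaker, and FakeEngine are hypothetical names, and the ready flag is set by hand instead of by the engine's init callback.

```kotlin
// Hypothetical abstraction over the platform TTS engine.
interface SpeechEngine {
    fun speak(text: String)
}

// Applies the same checks as the controller's speak(): readiness, trim, blank.
class PhraseSpeaker(private val engine: SpeechEngine) {
    var ready = false

    // Returns true when the text was handed to the engine.
    fun speak(text: String): Boolean {
        if (!ready) return false
        val clean = text.trim()
        if (clean.isBlank()) return false
        engine.speak(clean)
        return true
    }
}

// Test double that records what it was asked to speak.
class FakeEngine : SpeechEngine {
    val spoken = mutableListOf<String>()
    override fun speak(text: String) { spoken += text }
}
```

A JVM unit test can then confirm that calls made before ready is set are dropped, that whitespace-only input is skipped, and that surrounding whitespace is trimmed before the text reaches the engine.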

Best practices

Always clean up

Call destroy() in onDispose or component cleanup to release resources.

Handle initialization

TTS initializes asynchronously - calls before ready are ignored.

Validate input

The controller trims text and skips blank input automatically, but it cannot tell whether the text is meaningful to speak; check that in the caller.

Consider queue mode

Use QUEUE_FLUSH for immediate feedback, QUEUE_ADD for sequences.
The TTS engine runs in the background. If your app is backgrounded during speech, playback continues until completion.

Dependencies

No additional dependencies required - TextToSpeech is part of the Android framework.

Locale support

The controller is configured for Chilean Spanish:
tts?.language = Locale.forLanguageTag("es-CL")
To support different locales:
class TextToSpeechController(
    context: Context,
    private val locale: Locale = Locale.forLanguageTag("es-CL")
) {
    private var tts: TextToSpeech? = null
    private var ready = false

    init {
        tts = TextToSpeech(context) { status ->
            if (status == TextToSpeech.SUCCESS) {
                // setLanguage reports whether the requested locale is usable
                val result = tts?.setLanguage(locale)
                ready = result != null &&
                        result != TextToSpeech.LANG_MISSING_DATA &&
                        result != TextToSpeech.LANG_NOT_SUPPORTED
            }
        }
    }
}

Phrase Management

Store and manage phrases to speak

Speech-to-Text

Convert speech to text input
