Text-to-speech - EV Sum 2 Android

EV Sum 2 uses Android’s TextToSpeech API to convert text into spoken audio with Spanish (Chile) locale, enabling users to hear their saved phrases aloud.

Overview

The text-to-speech feature provides natural voice synthesis for Spanish text, with automatic initialization and lifecycle management.

The TTS engine is configured for Chilean Spanish (es-CL) to provide natural pronunciation for local users.

Architecture

The feature consists of a single controller class:

TextToSpeechController (services/TextToSpeechController.kt:7) - Manages TTS lifecycle and playback

Implementation

The TextToSpeechController class wraps Android’s TextToSpeech API:

import android.content.Context
import android.speech.tts.TextToSpeech
import java.util.Locale

class TextToSpeechController(context: Context) {
    private var tts: TextToSpeech? = null
    private var ready = false

    init {
        tts = TextToSpeech(context) { status ->
            ready = status == TextToSpeech.SUCCESS
            if (ready) {
                tts?.language = Locale.forLanguageTag("es-CL")
            }
        }
    }

    fun speak(text: String) {
        if (!ready) return
        val clean = text.trim()
        if (clean.isBlank()) return
        tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, "phrase_tts")
    }

    fun stop() {
        tts?.stop()
    }

    fun destroy() {
        tts?.stop()
        tts?.shutdown()
        tts = null
    }
}

Key features

Chilean Spanish

Optimized for Spanish (Chile) locale

Auto-initialization

Automatic TTS engine setup on creation

Queue management

QUEUE_FLUSH mode for immediate playback

Lifecycle aware

Proper cleanup to prevent memory leaks

Usage

Create controller

Initialize the controller with context:

val context = LocalContext.current
val tts = remember { TextToSpeechController(context) }

Speak text

Convert text to speech:

Button(onClick = { tts.speak("Hola, mundo!") }) {
    Text("Speak")
}

Clean up

Destroy the controller when done:

DisposableEffect(Unit) {
    onDispose { tts.destroy() }
}

Initialization process

The TTS engine initializes asynchronously:

init {
    tts = TextToSpeech(context) { status ->
        ready = status == TextToSpeech.SUCCESS
        if (ready) {
            tts?.language = Locale.forLanguageTag("es-CL")
        }
    }
}

The controller tracks initialization state with the ready flag. Speech requests made before initialization completes are silently ignored.

Initialization states

State	Description	Behavior
Initializing	TTS engine loading	`speak()` calls ignored
Ready	Locale set to `es-CL`	`speak()` calls processed
Failed	Initialization error	`speak()` calls ignored
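
Because requests made while the engine is still initializing are dropped, a phrase tapped during startup is never spoken. One possible refinement, sketched here as a plain-Kotlin model with no Android dependency, is to remember the most recent request and replay it once initialization succeeds. PendingSpeech and its speakNow callback are hypothetical names, not part of EV Sum 2. In a real controller, speakNow would wrap tts?.speak(...) and onInitResult would be called from the TextToSpeech init listener.

```kotlin
/** Hypothetical helper: holds the latest request until the engine reports ready. */
class PendingSpeech(private val speakNow: (String) -> Unit) {
    private var ready = false
    private var failed = false
    private var pending: String? = null

    /** Speaks immediately when ready; otherwise keeps only the most recent text. */
    fun speak(text: String) {
        val clean = text.trim()
        if (clean.isEmpty() || failed) return
        if (ready) speakNow(clean) else pending = clean
    }

    /** Call once from the init callback with the outcome of initialization. */
    fun onInitResult(success: Boolean) {
        ready = success
        failed = !success
        val text = pending
        pending = null
        if (success && text != null) speakNow(text)
    }
}
```

Keeping only the latest request mirrors the QUEUE_FLUSH behavior used elsewhere: a user who taps several phrases during startup hears the last one rather than a backlog.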

Speaking text

The speak() method handles text-to-speech conversion:

fun speak(text: String) {
    if (!ready) return
    val clean = text.trim()
    if (clean.isBlank()) return
    tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, "phrase_tts")
}

Parameters explained

text (String)

The text to be spoken. Automatically trimmed and validated.

QUEUE_FLUSH

Playback mode that clears any pending speech and starts immediately. Alternative: QUEUE_ADD appends to the queue instead of replacing it.

params (null)

Optional parameters bundle (not used in this implementation).

utteranceId ("phrase_tts")

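Identifier passed to utterance progress callbacks for this speech request. This implementation uses the constant "phrase_tts" for every request, so individual requests cannot be told apart.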
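
The implementation passes the same ID ("phrase_tts") for every request. If individual requests ever need to be distinguished in callbacks, a small generator could hand out distinct IDs. This is a minimal sketch; UtteranceIds is a hypothetical helper, not something the app ships.

```kotlin
import java.util.concurrent.atomic.AtomicLong

/** Hypothetical helper: produces a distinct utterance ID for each request. */
class UtteranceIds(private val prefix: String = "phrase_tts") {
    private val counter = AtomicLong(0)

    /** Returns "phrase_tts_1", "phrase_tts_2", and so on; safe to call from any thread. */
    fun next(): String = "${prefix}_${counter.incrementAndGet()}"
}
```

A counter avoids the collisions that a timestamp-based ID can produce when two requests land in the same millisecond.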

Queue modes

QUEUE_FLUSH

Interrupts any ongoing speech and starts immediately:

tts.speak("New text", TextToSpeech.QUEUE_FLUSH, null, "id")

Use case: User clicks a new phrase while another is playing.

QUEUE_ADD

Adds to the end of the speech queue:

tts.speak("New text", TextToSpeech.QUEUE_ADD, null, "id")

Use case: Reading multiple phrases in sequence.

EV Sum 2 uses QUEUE_FLUSH to ensure immediate feedback when users tap a phrase, preventing confusion from queued speech.
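
The difference between the two modes can be illustrated with a toy model of the utterance queue. This is a simplified sketch in plain Kotlin, not the engine's actual implementation: QueueMode and UtteranceQueueModel are hypothetical stand-ins, and the model ignores timing and audio entirely.

```kotlin
/** Simplified stand-ins for TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD. */
enum class QueueMode { FLUSH, ADD }

/** Toy model of the utterance queue; the first element is the one being spoken. */
class UtteranceQueueModel {
    private val queue = ArrayDeque<String>()

    fun speak(text: String, mode: QueueMode) {
        // FLUSH drops the current and pending utterances before enqueuing.
        if (mode == QueueMode.FLUSH) queue.clear()
        queue.addLast(text)
    }

    /** Snapshot of what would be spoken, in order. */
    fun contents(): List<String> = queue.toList()
}

/** Returns the queue after two ADD calls, and after an ADD followed by a FLUSH. */
fun queueModeDemo(): Pair<List<String>, List<String>> {
    val added = UtteranceQueueModel().apply {
        speak("uno", QueueMode.ADD)
        speak("dos", QueueMode.ADD)
    }
    val flushed = UtteranceQueueModel().apply {
        speak("uno", QueueMode.ADD)
        speak("dos", QueueMode.FLUSH)
    }
    return added.contents() to flushed.contents()
}
```

With ADD both phrases remain queued in order; with FLUSH only the most recent phrase survives, which matches the tap-to-hear behavior described above.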

Stopping speech

Stop playback immediately:

fun stop() {
    tts?.stop()
}

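Example usage. Note that isSpeaking is only updated on button taps here, so it stays true after a phrase finishes on its own; the utterance callbacks under Advanced usage can be used to reset it: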

var isSpeaking by remember { mutableStateOf(false) }

Button(
    onClick = {
        if (isSpeaking) {
            tts.stop()
            isSpeaking = false
        } else {
            tts.speak(text)
            isSpeaking = true
        }
    }
) {
    Text(if (isSpeaking) "Stop" else "Speak")
}

Lifecycle management

Properly clean up resources when the controller is no longer needed:

fun destroy() {
    tts?.stop()
    tts?.shutdown()
    tts = null
}

In Compose

val tts = remember { TextToSpeechController(context) }

DisposableEffect(Unit) {
    onDispose {
        tts.destroy()
    }
}

Always call destroy() to release TTS resources. Otherwise the controller keeps its connection to the TTS engine open, which can leak memory and engine resources.

Integration with phrase management

The home screen demonstrates TTS integration with saved phrases (ui/home/HomeScreen.kt:88):

val tts = remember { TextToSpeechController(context) }

DisposableEffect(Unit) {
    onDispose { tts.destroy() }
}

LazyColumn {
    items(phrases) { phrase ->
        Card {
            Text(phrase.text)
            
            Button(
                onClick = { tts.speak(phrase.text) },
                modifier = Modifier.fillMaxWidth().height(56.dp)
            ) {
                Icon(Icons.AutoMirrored.Filled.VolumeUp, null)
                Spacer(Modifier.width(12.dp))
                Text("PLAY VOICE")
            }
        }
    }
}

Validation

The controller includes automatic validation:

fun speak(text: String) {
    if (!ready) return           // Skip if not initialized
    val clean = text.trim()
    if (clean.isBlank()) return  // Skip if empty/whitespace
    tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, "phrase_tts")
}

Validation checks

Initialization check: if (!ready) return
Text cleaning: text.trim()
Empty check: if (clean.isBlank()) return

Validation is automatic - you don’t need to check these conditions before calling speak().
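
If the same checks are needed outside the controller (for example in a unit test that runs without an Android device), the text-cleaning step can be pulled out into a pure function. The following is a sketch under that assumption; cleanForSpeech is a hypothetical name and does not exist in the project.

```kotlin
/** Returns the trimmed text, or null when there is nothing to speak. */
fun cleanForSpeech(text: String?): String? {
    val clean = text?.trim().orEmpty()
    return clean.ifEmpty { null }
}
```

A caller could then write cleanForSpeech(input)?.let { tts.speak(it) }, although speak() already performs the same checks internally.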

Error handling

The TTS engine may fail to initialize on some devices:

tts = TextToSpeech(context) { status ->
    ready = status == TextToSpeech.SUCCESS
    if (ready) {
        tts?.language = Locale.forLanguageTag("es-CL")
    } else {
        // Initialization failed
        Log.e("TTS", "TextToSpeech initialization failed")
    }
}

Common initialization failures

Missing TTS engine

Cause: Device doesn’t have a TTS engine installed.
Solution: Prompt the user to install Google Text-to-Speech from the Play Store.

Language not supported

Cause: TTS engine doesn’t support Spanish (Chile).
Solution: Fall back to generic Spanish or prompt a language download.

Resource unavailable

Cause: TTS engine is busy or being used by another app.
Solution: Retry initialization or show an error message.
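
The fallback to generic Spanish could be expressed as an ordered list of candidate locales. The sketch below is plain Kotlin and assumes the caller supplies the support check; pickSpeechLocale and isSupported are hypothetical names. On a device, isSupported might be implemented as tts.isLanguageAvailable(locale) >= TextToSpeech.LANG_AVAILABLE.

```kotlin
import java.util.Locale

/**
 * Returns the first candidate the engine supports, or null if none is.
 * isSupported is a stand-in for a real engine query supplied by the caller.
 */
fun pickSpeechLocale(
    isSupported: (Locale) -> Boolean,
    candidates: List<Locale> = listOf(
        Locale.forLanguageTag("es-CL"), // preferred: Chilean Spanish
        Locale.forLanguageTag("es")     // fallback: generic Spanish
    )
): Locale? = candidates.firstOrNull(isSupported)
```

A null result means no Spanish voice is usable, which is the point at which prompting for a language download would make sense.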

Advanced usage

Speech rate and pitch

Customize voice characteristics:

init {
    tts = TextToSpeech(context) { status ->
        ready = status == TextToSpeech.SUCCESS
        if (ready) {
            tts?.language = Locale.forLanguageTag("es-CL")
            tts?.setSpeechRate(1.0f)  // 1.0 = normal; 0.5 = slower, 2.0 = faster
            tts?.setPitch(1.0f)       // 1.0 = normal; 0.5 = lower, 2.0 = higher
        }
    }
}
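
If the speech rate is exposed to users, for instance through a slider, the raw value should probably be bounded before it reaches setSpeechRate. The sketch below maps a slider position to a rate. The 0.5 and 2.0 bounds are taken from the comments above and are treated here as app-level limits, which is an assumption; sliderToRate, MIN_RATE and MAX_RATE are hypothetical names.

```kotlin
/** App-level bounds taken from the example comments; not limits imposed by the engine. */
const val MIN_RATE = 0.5f
const val MAX_RATE = 2.0f

/** Maps a slider position in 0..1 to a speech rate; the midpoint gives normal speed (1.0). */
fun sliderToRate(position: Float): Float {
    val p = position.coerceIn(0f, 1f)
    return if (p <= 0.5f) {
        MIN_RATE + (1.0f - MIN_RATE) * (p / 0.5f)      // 0.0..0.5 maps to 0.5..1.0
    } else {
        1.0f + (MAX_RATE - 1.0f) * ((p - 0.5f) / 0.5f) // 0.5..1.0 maps to 1.0..2.0
    }
}
```

Splitting the range at the midpoint keeps normal speed in the center of the slider, so slowing down and speeding up get equal travel.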

Utterance callbacks

Track speech progress:

fun speak(text: String, onComplete: () -> Unit) {
    if (!ready) return
    val clean = text.trim()
    if (clean.isBlank()) return
    
    val utteranceId = "phrase_${System.currentTimeMillis()}"
    
    tts?.setOnUtteranceProgressListener(object : UtteranceProgressListener() {
        override fun onStart(utteranceId: String?) {
            // Speech started
        }
        
        override fun onDone(utteranceId: String?) {
            // May be called off the main thread; switch to the UI thread before touching UI state
            onComplete()
        }
        
        override fun onError(utteranceId: String?) {
            // Speech error
        }
    })
    
    tts?.speak(clean, TextToSpeech.QUEUE_FLUSH, null, utteranceId)
}
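
The version above installs a new listener on every call, so an onDone for an earlier utterance would run the most recent onComplete. One way to avoid that, sketched here in plain Kotlin, is to set a single listener once and route each callback by utterance ID. UtteranceCallbacks is a hypothetical helper; the listener's onDone and onError would forward to dispatchDone and discard.

```kotlin
import java.util.concurrent.ConcurrentHashMap

/** Hypothetical registry: one completion callback per utterance ID. */
class UtteranceCallbacks {
    private val onDoneById = ConcurrentHashMap<String, () -> Unit>()

    /** Remember what to run when the utterance with this ID finishes. */
    fun register(utteranceId: String, onDone: () -> Unit) {
        onDoneById[utteranceId] = onDone
    }

    /** Call from onDone(utteranceId); runs the matching callback at most once. */
    fun dispatchDone(utteranceId: String?) {
        if (utteranceId == null) return
        onDoneById.remove(utteranceId)?.invoke()
    }

    /** Call from onError, or after stop(), so stale callbacks never fire. */
    fun discard(utteranceId: String?) {
        if (utteranceId != null) onDoneById.remove(utteranceId)
    }
}
```

A concurrent map is used because registration happens on the caller's thread while dispatch happens on whichever thread the engine uses for callbacks.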

Testing

Test TTS availability

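// Inside an Activity. CHECK_TTS_DATA is an arbitrary request code chosen by the app.
private val CHECK_TTS_DATA = 1001

fun checkTtsData() {
    // The result is delivered asynchronously to onActivityResult, not returned here
    startActivityForResult(Intent(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA), CHECK_TTS_DATA)
}

override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)
    if (requestCode != CHECK_TTS_DATA) return

    if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
        // TTS is available
    } else {
        // Prompt to install TTS data
        startActivity(Intent(TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA))
    }
}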

Emulator testing

Most emulators include the Google TTS engine
Spanish voice may need to be downloaded
Test with various text lengths and special characters
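
When testing long inputs, keep in mind that the engine limits how much text a single speak() call accepts (TextToSpeech.getMaxSpeechInputLength() reports the limit). A possible way to handle long phrases is to split them into chunks and speak the first with QUEUE_FLUSH and the rest with QUEUE_ADD. The splitter below is a plain-Kotlin sketch; chunkForSpeech is a hypothetical helper, and the caller is assumed to pass the limit reported by the engine.

```kotlin
/**
 * Splits text into chunks of at most maxLength characters, preferring to break
 * after sentence punctuation, then at a space. Longer unbroken runs are cut.
 */
fun chunkForSpeech(text: String, maxLength: Int): List<String> {
    require(maxLength > 0) { "maxLength must be positive" }
    val chunks = mutableListOf<String>()
    var rest = text.trim()
    while (rest.length > maxLength) {
        val window = rest.substring(0, maxLength)
        // Prefer the last sentence end in the window, then the last space.
        val sentenceEnd = window.lastIndexOfAny(charArrayOf('.', '!', '?'))
        val space = window.lastIndexOf(' ')
        val cut = when {
            sentenceEnd >= 0 -> sentenceEnd + 1
            space > 0 -> space
            else -> maxLength
        }
        chunks.add(rest.substring(0, cut).trim())
        rest = rest.substring(cut).trim()
    }
    if (rest.isNotEmpty()) chunks.add(rest)
    return chunks
}
```

Breaking at sentence boundaries keeps the pauses between chunks in places where a listener would expect them.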

Best practices

Always clean up

Call destroy() in onDispose or component cleanup to release resources.

Handle initialization

TTS initializes asynchronously - calls before ready are ignored.

Validate input

The controller trims text and skips blank input automatically, but it cannot tell whether the text is meaningful. Check that before calling speak().

Consider queue mode

Use QUEUE_FLUSH for immediate feedback, QUEUE_ADD for sequences.

The TTS engine runs in the background. If your app is backgrounded during speech, playback continues until completion.

Dependencies

No additional dependencies are required: TextToSpeech is part of the Android framework (android.speech.tts).

Locale support

The controller is configured for Chilean Spanish:

tts?.language = Locale.forLanguageTag("es-CL")

To support different locales:

class TextToSpeechController(
    context: Context,
    private val locale: Locale = Locale.forLanguageTag("es-CL")
) {
    private var tts: TextToSpeech? = null
    private var ready = false

    init {
        tts = TextToSpeech(context) { status ->
            if (status == TextToSpeech.SUCCESS) {
                // Only mark the controller ready if the engine accepted the locale
                val result = tts?.setLanguage(locale)
                ready = result != TextToSpeech.LANG_MISSING_DATA &&
                        result != TextToSpeech.LANG_NOT_SUPPORTED
            }
        }
    }
}

Related features

Phrase Management

Store and manage phrases to speak

Speech-to-Text

Convert speech to text input
