Voice Assistant Component for ESP32 ESPHome Devices

ESPHome’s voice_assistant component turns any ESP32 device with a microphone into a local voice assistant. It streams microphone audio to Home Assistant’s Assist pipeline, which handles wake-word detection, speech-to-text, intent processing, and text-to-speech, all of which can run locally without cloud services. Responses can be played back through a connected speaker or media player.

Voice Assistant requires Home Assistant 2023.5 or later.

Audio and voice components consume significant RAM and CPU. Crashes may occur if you include too many additional components, especially Bluetooth/BLE. If you experience crashes, consult the Troubleshooting guide for backtrace instructions.

Minimal Example

i2s_audio:
  i2s_lrclk_pin: GPIO25
  i2s_bclk_pin: GPIO26

microphone:
  - platform: i2s_audio
    id: board_mic
    i2s_din_pin: GPIO23
    adc_type: external

speaker:
  - platform: i2s_audio
    id: board_speaker
    i2s_dout_pin: GPIO22

voice_assistant:
  microphone: board_mic
  speaker: board_speaker

Configuration Variables

microphone

Microphone ID or list

The microphone source(s) for audio input. A single source can be provided directly, or a list of up to two sources for dual-channel streaming (see Dual Microphone below).

speaker

The speaker component to use for TTS response playback. Cannot be used together with media_player.

media_player

The media player component to use for TTS response playback. Cannot be used together with speaker.

micro_wake_word

The micro_wake_word component for on-device wake-word detection. When configured, Home Assistant can remotely change which wake-word model is active.

use_wake_word

boolean

Enable wake-word detection in the Home Assistant Assist pipeline. Defaults to false.

conversation_timeout

Time

How long to preserve conversation context before resetting the conversation_id. Defaults to 300s.

noise_suppression_level

int 0–4

Noise suppression level applied in the Assist pipeline. 0 = disabled (default).

auto_gain

dBFS 0–31

Automatic gain control level. 0dBFS = disabled (default).

volume_multiplier

float

Volume scaling multiplier applied to the microphone. Must be > 0. Defaults to 1 (disabled).
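
As a rough sketch of how the tuning options above fit together, the fragment below sets each of them on the component from the Minimal Example. The specific values are illustrative assumptions, not recommendations; suitable levels depend on your microphone and room.

```yaml
voice_assistant:
  microphone: board_mic
  speaker: board_speaker        # or media_player: <id>, but not both
  use_wake_word: false          # same as the default
  conversation_timeout: 300s    # same as the default
  noise_suppression_level: 2    # illustrative; range is 0 (off) to 4
  auto_gain: 31dBFS             # illustrative; range is 0dBFS (off) to 31dBFS
  volume_multiplier: 2.0        # illustrative; must be > 0
```

Starting from the defaults and raising one option at a time makes it easier to tell which setting helps.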

Automation Triggers

Pipeline State Triggers

voice_assistant:
  microphone: board_mic
  speaker: board_speaker
  on_listening:
    - light.turn_on:
        id: status_led
        effect: "pulse"
  on_stt_vad_start:
    - logger.log: "Voice activity detected"
  on_stt_end:
    - logger.log:
        format: "You said: %s"
        args: [x.c_str()]
  on_tts_start:
    - logger.log:
        format: "Response: %s"
        args: [x.c_str()]
  on_tts_end:
    - logger.log:
        format: "Audio URL: %s"
        args: [x.c_str()]
  on_end:
    - light.turn_off: status_led
  on_error:
    - logger.log:
        format: "Error: %s - %s"
        args: [code.c_str(), message.c_str()]
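
One possible use of the text passed to on_stt_end is to expose it to Home Assistant. The sketch below assumes a template text sensor with the hypothetical ID last_request; neither the sensor nor its name is part of the voice_assistant component.

```yaml
text_sensor:
  - platform: template
    id: last_request            # hypothetical ID, chosen for this example
    name: "Last Voice Request"

voice_assistant:
  microphone: board_mic
  speaker: board_speaker
  on_stt_end:
    # x holds the recognized text, as in the logging example above
    - text_sensor.template.publish:
        id: last_request
        state: !lambda 'return x;'
```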

Wake Word Triggers

voice_assistant:
  on_wake_word_detected:
    - logger.log: "Wake word heard!"
    - light.turn_on: status_led

Client Connection Triggers

voice_assistant:
  on_client_connected:
    - logger.log: "Home Assistant connected"
  on_client_disconnected:
    - logger.log: "Home Assistant disconnected"

Intent Triggers

voice_assistant:
  on_intent_start:
    - logger.log: "Processing intent..."
  on_intent_progress:
    - lambda: |-
        if (!x.empty()) {
          ESP_LOGI("va", "Streaming TTS URL: %s", x.c_str());
        }
  on_intent_end:
    - logger.log: "Intent complete"

TTS Stream Triggers (requires `speaker`)

voice_assistant:
  speaker: board_speaker
  on_tts_stream_start:
    - logger.log: "TTS audio starting"
  on_tts_stream_end:
    - light.turn_off: status_led

Timer Triggers

voice_assistant:
  on_timer_started:
    - logger.log:
        format: "Timer started: %d seconds"
        args: [timer.total_seconds]
  on_timer_finished:
    - rtttl.play: "alert:d=4,o=5,b=100:e,e,e"
  on_timer_cancelled:
    - logger.log: "Timer cancelled"
  on_timer_tick:
    - lambda: |-
        for (auto &t : timers) {
          ESP_LOGI("timer", "Remaining: %d s", t.seconds_left);
        }

Voice Assistant Actions

`voice_assistant.start`

Listen for a single voice command, then stop. Silence detection automatically determines when the user has finished speaking.

on_...:
  - voice_assistant.start:
      silence_detection: true    # default: true
      wake_word: "hey jarvis"    # optional: wake word that triggered start

`voice_assistant.start_continuous`

Continuously listen for commands. Automatically restarts after each response. Call voice_assistant.stop to end the cycle.

on_...:
  - voice_assistant.start_continuous:

`voice_assistant.stop`

Stop the current listening session or continuous cycle.

on_...:
  - voice_assistant.stop:
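
The start_continuous and stop actions can also be paired behind a single control. The following is a minimal sketch that assumes a template switch (the name "Continuous Listening" is made up for this example) is an acceptable way to drive them from Home Assistant.

```yaml
switch:
  - platform: template
    name: "Continuous Listening"   # hypothetical name
    optimistic: true               # switch state simply mirrors the last command
    turn_on_action:
      - voice_assistant.start_continuous:
    turn_off_action:
      - voice_assistant.stop:
```

Because the switch is optimistic, it reports the last command sent rather than whether the assistant is actually running.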

Conditions

`voice_assistant.is_running`

Returns true if the voice assistant is currently active.

`voice_assistant.connected`

Returns true if Home Assistant is connected and ready.
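
The two conditions can be combined to guard an action. The sketch below starts a session only when Home Assistant is connected and no session is already active; the button pin and name are assumptions carried over from the other examples on this page.

```yaml
binary_sensor:
  - platform: gpio
    pin: GPIO0
    name: "Talk Button"
    on_press:
      - if:
          condition:
            and:
              - voice_assistant.connected:
              - not:
                  voice_assistant.is_running:
          then:
            - voice_assistant.start:
          else:
            - logger.log: "Voice assistant busy or Home Assistant not connected"
```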

Usage Patterns

Push to Talk

Hold a button to listen, release to send.

voice_assistant:
  microphone:
    microphone: board_mic
    channels: 0
    gain_factor: 4
  speaker: board_speaker

binary_sensor:
  - platform: gpio
    pin: GPIO0
    name: "PTT Button"
    on_press:
      - voice_assistant.start:
          silence_detection: false   # manual release controls end
    on_release:
      - voice_assistant.stop:

Click to Toggle

Click once to start listening; click again to stop.

binary_sensor:
  - platform: gpio
    pin: GPIO0
    name: "VA Button"
    on_click:
      - if:
          condition: voice_assistant.is_running
          then:
            - voice_assistant.stop:
          else:
            - voice_assistant.start_continuous:

Always-On with Wake Word (Micro Wake Word)

micro_wake_word:
  id: mww_component
  models:
    - model: hey_jarvis
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;

voice_assistant:
  microphone: board_mic
  speaker: board_speaker
  micro_wake_word: mww_component
  on_end:
    - micro_wake_word.start:    # re-arm wake word after each interaction

esphome:
  on_boot:
    - micro_wake_word.start:    # start listening immediately on boot

Dual Microphone Channel

Provide two microphone sources for improved voice pipeline performance. Home Assistant 2026.6+ is required to use the second channel.

voice_assistant:
  id: va
  microphone:
    - microphone: i2s_mics
      channels: 0    # processed audio (with gain, noise suppression)
    - microphone: i2s_mics
      channels: 1    # raw audio
  speaker: board_speaker

Both microphone sources must provide 16 kHz, 16-bit, mono audio as required by the Assist pipeline.
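
The example above refers to i2s_mics without defining it. One way it might be declared is sketched below, assuming a stereo I2S microphone pair on the pins from the Minimal Example and an ESPHome release whose i2s_audio microphone platform accepts channel: stereo. The pins and the bit depth are assumptions; some I2S microphones need a different bits_per_sample setting.

```yaml
i2s_audio:
  i2s_lrclk_pin: GPIO25
  i2s_bclk_pin: GPIO26

microphone:
  - platform: i2s_audio
    id: i2s_mics
    i2s_din_pin: GPIO23
    adc_type: external
    channel: stereo          # capture both channels so each source can pick one
    sample_rate: 16000       # 16 kHz, as required by the Assist pipeline
    bits_per_sample: 16bit   # assumption; depends on the microphone hardware
```

Each entry under voice_assistant then selects a single channel (0 or 1), so the audio sent to the pipeline stays mono.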

LED Feedback Example

Provide clear visual feedback throughout the voice pipeline.

light:
  - platform: neopixelbus
    id: ring_light
    type: GRB
    pin: GPIO4
    num_leds: 12
    effects:
      - pulse:
          name: "listening_pulse"
          transition_length: 500ms
          update_interval: 500ms

voice_assistant:
  microphone: board_mic
  speaker: board_speaker
  on_listening:
    - light.turn_on:
        id: ring_light
        blue: 100%
        brightness: 80%
        effect: "listening_pulse"
  on_stt_vad_end:
    - light.turn_on:
        id: ring_light
        red: 100%
        green: 100%
        blue: 0%
        brightness: 60%
        effect: none
  on_tts_stream_start:
    - light.turn_on:
        id: ring_light
        green: 100%
        brightness: 60%
  on_end:
    - light.turn_off: ring_light
  on_error:
    - light.turn_on:
        id: ring_light
        red: 100%
        brightness: 80%
    - delay: 2s
    - light.turn_off: ring_light
