What’s in this example?
This example demonstrates how to build an interactive voice chat application using:

- Audio Recording: Records audio from your microphone with automatic silence detection
- Cloud Processing: Uses Modal to run the LFM2-Audio-1.5B model on GPU in the cloud
- Audio Playback: Automatically plays the generated audio response

The example walks through:

- Recording your voice question from the microphone (with auto-stop on silence)
- Uploading the audio to a Modal volume
- Processing the audio with LFM2-Audio-1.5B on a GPU instance to generate an interleaved text and audio response
- Downloading the generated audio response
- Playing the response through your speakers
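The auto-stop-on-silence behavior in the first step is typically implemented by watching the loudness (RMS amplitude) of incoming audio chunks and stopping once enough consecutive chunks fall below a threshold. A minimal sketch of that logic, independent of the microphone loop (the threshold and chunk-count values are assumptions, not the example's actual settings):

```python
import math
import struct

SILENCE_THRESHOLD = 500     # RMS amplitude below which a chunk counts as silence (assumed value)
SILENT_CHUNKS_TO_STOP = 30  # consecutive silent chunks before recording stops (assumed value)

def chunk_rms(chunk: bytes) -> float:
    """Root-mean-square amplitude of a chunk of 16-bit little-endian PCM samples."""
    samples = struct.unpack(f"<{len(chunk) // 2}h", chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def should_stop(chunks: list[bytes]) -> bool:
    """True once the most recent SILENT_CHUNKS_TO_STOP chunks are all below the threshold."""
    if len(chunks) < SILENT_CHUNKS_TO_STOP:
        return False
    return all(chunk_rms(c) < SILENCE_THRESHOLD for c in chunks[-SILENT_CHUNKS_TO_STOP:])
```

In the real example, a check like `should_stop` would sit inside the PyAudio read loop, evaluated after each `stream.read` call.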
Tools
This example uses the following tools and libraries:

- liquid-audio: Python library for working with LFM2-Audio models
- Modal: Serverless cloud platform for running GPU workloads
- PyAudio: Cross-platform audio I/O library for recording from microphone
- pygame: Audio playback library
- torchaudio: Audio processing utilities for PyTorch
- rich: Terminal formatting library for audio visualization
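To show how Modal ties these pieces together, a skeleton of the cloud-side GPU function might look like this (a sketch only: the app, volume, and function names, the GPU type, and the mount path are all assumptions, not the example's actual definitions):

```python
import modal

# Hypothetical app and volume names for illustration.
app = modal.App("voice-chat-example")
image = modal.Image.debian_slim().pip_install("liquid-audio", "torchaudio")
volume = modal.Volume.from_name("voice-chat-audio", create_if_missing=True)

@app.function(gpu="A10G", image=image, volumes={"/data": volume})
def generate_answer(question_filename: str) -> str:
    # Load LFM2-Audio-1.5B, read the uploaded question from /data,
    # generate the interleaved text/audio response, write the answer
    # WAV back to the volume, and return its filename.
    ...
```

The local client would upload the recorded question to the volume, call the function remotely, and then download the answer file it names.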
Prerequisites
Set up Modal
- Create a Modal account at modal.com
- Install the Modal CLI:
  uv add modal
- Authenticate:
  uv run modal token new
How to run it
The generated audio file will be saved locally as answer_YYYYMMDD_HHMMSS.wav for each session.