Voice input
Instead of typing a prompt, you can speak your description. Vibra Code records your voice, sends the audio to AssemblyAI for transcription, and automatically inserts the transcribed text into the message input.
Voice input is handled by two native iOS services:
- EXAudioRecorderService: captures the microphone audio on-device
- EXAssemblyAIService: streams the audio to the AssemblyAI API and receives the transcription
Open a session
Start a new session or open an existing one. The bottom bar must be visible — swipe up or tap the chevron to open chat mode if needed.
Tap the mic button
Tap the microphone icon in the bottom bar. Vibra Code requests microphone permission the first time you use voice input. Once you grant it, recording begins immediately.
Speak your description
Describe what you want to build or change. Speak naturally — you do not need to use any special commands or keywords. The longer and more detailed your description, the better the result.
Tap the mic button again to stop
Tap the mic button a second time to stop recording. The audio is sent to AssemblyAI for transcription.
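The tap-to-start, tap-to-stop flow in the steps above can be sketched as a small state machine. This is an illustrative sketch only: the `AudioRecorder` and `Transcriber` interfaces and the `VoiceInputController` class are hypothetical stand-ins, since this guide does not show the actual APIs of EXAudioRecorderService or EXAssemblyAIService.

```typescript
// Hypothetical interfaces standing in for the two native services.
// The real method names and signatures are assumptions.
interface AudioRecorder {
  start(): Promise<void>;
  stop(): Promise<Uint8Array>; // resolves with the recorded audio
}

interface Transcriber {
  transcribe(audio: Uint8Array): Promise<string>;
}

type VoiceState = "idle" | "recording" | "transcribing";

class VoiceInputController {
  private state: VoiceState = "idle";
  private readonly recorder: AudioRecorder;
  private readonly transcriber: Transcriber;
  private readonly insertText: (text: string) => void;

  constructor(
    recorder: AudioRecorder,
    transcriber: Transcriber,
    insertText: (text: string) => void,
  ) {
    this.recorder = recorder;
    this.transcriber = transcriber;
    this.insertText = insertText;
  }

  getState(): VoiceState {
    return this.state;
  }

  // Called on every tap of the mic button.
  async toggle(): Promise<void> {
    if (this.state === "idle") {
      // First tap: start recording.
      await this.recorder.start();
      this.state = "recording";
      return;
    }
    if (this.state === "recording") {
      // Second tap: stop, transcribe, and fill the message input.
      this.state = "transcribing";
      try {
        const audio = await this.recorder.stop();
        const text = await this.transcriber.transcribe(audio);
        this.insertText(text);
      } finally {
        // Return to idle even if recording or transcription fails.
        this.state = "idle";
      }
    }
    // Taps that arrive while a transcription is in progress are ignored.
  }
}
```

Injecting the recorder and transcriber keeps the tap logic testable without a microphone or network access, which matters because transcription depends on an external API.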
Voice transcription requires an internet connection and depends on the AssemblyAI API. The native EXAssemblyAIService runs on the device and calls AssemblyAI directly; no additional backend environment variable is required.
Image attachments
You can attach one or more images to any message. This is useful for:
- Mockup screenshots: show the AI exactly what the UI should look like
- Reference designs: attach a screenshot from another app you want to replicate
- Error screenshots: show a visual bug that is hard to describe in words
Attached images are recorded in the images field on the messages table. The AI agent receives the image paths and uses them as visual context when generating code.
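As a rough illustration of a message that carries image paths, the sketch below models one row of the messages table. Only the images field comes from this guide; `MessageRecord`, `createMessage`, and the `sessionId` and `text` fields are assumptions made for the example, not the actual schema.

```typescript
// Hypothetical shape of a row in the messages table. Only `images` is
// documented; the other field names are assumptions for illustration.
interface MessageRecord {
  sessionId: string;
  text: string;
  images: string[]; // paths of attached images; empty when none are attached
}

function createMessage(
  sessionId: string,
  text: string,
  imagePaths: string[] = [],
): MessageRecord {
  // Drop blank entries so the agent only receives usable paths.
  const images = imagePaths
    .map((path) => path.trim())
    .filter((path) => path.length > 0);
  const trimmedText = text.trim();

  // Text is optional when an image is attached, but a message needs one of the two.
  if (trimmedText.length === 0 && images.length === 0) {
    throw new Error("A message needs text, at least one image, or both.");
  }
  return { sessionId, text: trimmedText, images };
}
```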
How to attach an image
- Tap the image icon in the bottom bar
- Choose an image from your photo library or camera
- The image thumbnail appears above the text input
- Type any additional text instructions (optional) and tap Send
Combining voice and images
Voice input and image attachments work together. A common pattern is:
- Attach a mockup image of the screen you want to build
- Tap the mic button and describe it verbally — “Build this screen. The top section has a profile photo with the user’s name and bio below it. The button at the bottom should be the brand’s primary color.”
- The transcription fills in the text field alongside the attached image
- Send the combined message
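The pattern above amounts to building up one draft from two inputs before sending. The sketch below shows one way to model that; the `Draft` type and the helper functions are hypothetical, and appending the transcript to any text already typed (rather than replacing it) is an assumption about the behavior.

```typescript
// Hypothetical draft model for the message input; not the app's actual state shape.
interface Draft {
  text: string;
  images: string[];
}

const emptyDraft: Draft = { text: "", images: [] };

// Returns a new draft with the image path added.
function attachImage(draft: Draft, path: string): Draft {
  return { ...draft, images: [...draft.images, path] };
}

// Returns a new draft with the transcript placed in the text field.
// Assumption: the transcript is appended to any text already typed.
function applyTranscription(draft: Draft, transcript: string): Draft {
  const spoken = transcript.trim();
  if (spoken.length === 0) {
    return draft;
  }
  const text = draft.text.length > 0 ? `${draft.text} ${spoken}` : spoken;
  return { ...draft, text };
}

// A draft can be sent once it has text, an image, or both.
function canSend(draft: Draft): boolean {
  return draft.text.trim().length > 0 || draft.images.length > 0;
}

// Usage: attach the mockup first, then let the transcript fill the text field.
let draft = attachImage(emptyDraft, "/photos/profile-mockup.png");
draft = applyTranscription(draft, "Build this screen with the profile photo at the top.");
```

Returning a new draft from each step, instead of mutating one object, keeps the order of attaching and speaking interchangeable.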
Tips for effective descriptions
Whether you use voice, text, or images, a few practices consistently produce better app output:
- Name the screen: “Build a settings screen with…” gives the agent a frame of reference
- Describe layout explicitly: “The header is at the top with a back button on the left. Below it is a scrollable list of items.”
- Specify colors and fonts: “Use a dark navy background with white text and an orange accent”
- Mention interactions: “Tapping a list item navigates to a detail screen”
- Iterate in small steps: make one focused change per message rather than requesting many changes at once
- Reference existing elements: “Keep the existing navigation bar and just change the tab icons”