Voice input and image attachments

Vibra Code supports two input methods beyond typing: voice recording and image attachments. Both are available from the bottom bar in any open session, and you can combine them in a single message.

Voice input

Instead of typing a prompt, you can speak your description. Vibra Code records your voice, sends the audio to AssemblyAI for transcription, and automatically inserts the transcribed text into the message input. Voice input is handled by two native iOS services:

EXAudioRecorderService — captures the microphone audio on-device
EXAssemblyAIService — streams the audio to the AssemblyAI API and receives the transcription

Open a session

Start a new session or open an existing one. The bottom bar must be visible — swipe up or tap the chevron to open chat mode if needed.

Tap the mic button

Tap the microphone icon in the bottom bar. Vibra Code will request microphone permission the first time. Once granted, recording begins immediately.

Speak your description

Describe what you want to build or change. Speak naturally; you do not need any special commands or keywords. Specific, detailed descriptions generally produce better results.

Tap the mic button again to stop

Tap the mic button a second time to stop recording. The audio is sent to AssemblyAI for transcription.

Review and send

The transcribed text is inserted into the message input field automatically once transcription completes. The message itself is not sent until you tap Send, so you can edit the text first or send it as is if it looks correct.

Voice transcription requires an internet connection and depends on the AssemblyAI API. The native EXAssemblyAIService sends the request to AssemblyAI from the device, so no additional backend environment variable is required.
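The steps above can be read as a small state machine: idle, recording, transcribing, then review. The sketch below is only an illustration of that flow in TypeScript. The type names, event names, and the nextVoiceState function are hypothetical and are not taken from Vibra Code's source or from the native services.

```typescript
// Hypothetical sketch: state and event names are illustrative only.
type VoiceState =
  | { phase: "idle" }
  | { phase: "recording" }
  | { phase: "transcribing" }
  | { phase: "review"; draft: string };

type VoiceEvent =
  | { type: "micTapped" }
  | { type: "transcriptReady"; text: string }
  | { type: "transcriptFailed" }
  | { type: "sent" };

function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  switch (state.phase) {
    case "idle":
      // First tap on the mic button starts recording.
      return event.type === "micTapped" ? { phase: "recording" } : state;
    case "recording":
      // Second tap stops recording; the audio goes off for transcription.
      return event.type === "micTapped" ? { phase: "transcribing" } : state;
    case "transcribing":
      // The transcript lands in the input field as an editable draft.
      if (event.type === "transcriptReady") {
        return { phase: "review", draft: event.text };
      }
      // Assumption: a failed transcription simply returns to idle.
      if (event.type === "transcriptFailed") {
        return { phase: "idle" };
      }
      return state;
    case "review":
      // Nothing is sent until the user taps Send.
      return event.type === "sent" ? { phase: "idle" } : state;
  }
}

// Usage: tap, tap, then the transcript arrives.
const voiceEvents: VoiceEvent[] = [
  { type: "micTapped" },
  { type: "micTapped" },
  { type: "transcriptReady", text: "Build a settings screen" },
];
const finalVoiceState = voiceEvents.reduce<VoiceState>(
  (state, event) => nextVoiceState(state, event),
  { phase: "idle" }
);
```

After these three events the state is in the review phase with the transcript as its draft. It stays there until a sent event arrives, which mirrors the review step described above.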

Image attachments

You can attach one or more images to any message. This is useful for:

Mockup screenshots — show the AI exactly what the UI should look like
Reference designs — attach a screenshot from another app you want to replicate
Error screenshots — show a visual bug that is hard to describe in words

Images are stored in Convex storage via the images field on the messages table. The AI agent receives the image paths and uses them as visual context when generating code.
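As a rough picture of what such a message might look like, the TypeScript sketch below models a record with an images field. Only the images field name and the messages table come from the description above. The other field names, the MessageRecord type, and the buildMessage helper are hypothetical, and the real schema may differ.

```typescript
// Only `images` is named in the documentation; the other fields are
// hypothetical placeholders for illustration.
interface MessageRecord {
  sessionId: string; // hypothetical
  text: string;      // hypothetical
  images: string[];  // paths or storage references for attached images
}

// Builds a message that carries text, images, or both.
function buildMessage(
  sessionId: string,
  text: string,
  images: string[] = []
): MessageRecord {
  const trimmed = text.trim();
  // Text is optional when an image is attached, but an empty message is rejected.
  if (trimmed.length === 0 && images.length === 0) {
    throw new Error("A message needs text, at least one image, or both");
  }
  // Copy the array so later changes by the caller do not affect the record.
  return { sessionId, text: trimmed, images: [...images] };
}

// Usage: one mockup screenshot plus a short instruction.
const mockupMessage = buildMessage("session-1", "  Match this mockup ", [
  "mockup.png",
]);
```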

How to attach an image

Tap the image icon in the bottom bar
Choose an image from your photo library or camera
The image thumbnail appears above the text input
Type any additional text instructions (optional) and tap Send

You can attach multiple images to a single message by tapping the image button again after the first attachment.

For best results, attach a clean, cropped screenshot that focuses on the specific UI element or screen you want the AI to reference. Avoid attaching large photos with lots of irrelevant content.

Combining voice and images

Voice input and image attachments work together. A common pattern is:

Attach a mockup image of the screen you want to build
Tap the mic button and describe it verbally — “Build this screen. The top section has a profile photo with the user’s name and bio below it. The button at the bottom should be the brand’s primary color.”
The transcription fills in the text field alongside the attached image
Send the combined message

This gives the AI both visual and textual context, which often produces more accurate results than either method alone.
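The pattern above amounts to building up one draft from two sources before sending. The TypeScript sketch below illustrates that idea with a hypothetical MessageDraft type and two helper functions. None of these names come from Vibra Code, and it is an assumption here that a transcript is appended to any text already in the input field.

```typescript
// Hypothetical draft model; none of these names come from Vibra Code itself.
interface MessageDraft {
  text: string;
  imagePaths: string[];
}

const emptyDraft: MessageDraft = { text: "", imagePaths: [] };

// Adds one image; calling it again models attaching a second image.
function withImage(draft: MessageDraft, path: string): MessageDraft {
  return { ...draft, imagePaths: [...draft.imagePaths, path] };
}

// Assumption: the transcript is appended to any text already typed.
function withTranscript(draft: MessageDraft, transcript: string): MessageDraft {
  const parts = [draft.text.trim(), transcript.trim()].filter(
    (part) => part.length > 0
  );
  return { ...draft, text: parts.join(" ") };
}

// Usage: attach the mockup first, then let the transcription fill in the text.
const combinedDraft = withTranscript(
  withImage(emptyDraft, "profile-mockup.png"),
  "Build this screen. The top section has a profile photo."
);
```

Each helper returns a new draft instead of mutating the old one, so the image and the transcript can arrive in either order without one overwriting the other.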

Tips for effective descriptions

Whether you use voice, text, or images, a few practices consistently produce better results:

Name the screen — “Build a settings screen with…” gives the agent a frame of reference
Describe layout explicitly — “The header is at the top with a back button on the left. Below it is a scrollable list of items.”
Specify colors and fonts — “Use a dark navy background with white text and an orange accent”
Mention interactions — “Tapping a list item navigates to a detail screen”
Iterate in small steps — Make one focused change per message rather than requesting many changes at once
Reference existing elements — “Keep the existing navigation bar and just change the tab icons”
