Voice call transcripts automatically turn spoken exchanges into Chatwoot messages. Every completed turn (what the visitor said and what the AI agent replied) is appended to the visitor’s conversation in real time, so your support team has a full record of the call without any manual note-taking.

How transcripts flow from call to conversation

1. The embed fires an event per completed turn

The <elevenlabs-convai> web component dispatches a custom event on its host element after each turn completes. ElevenLabsVoiceButton.vue registers listeners for three event names to cover different SDK versions:
  • convai-message
  • message
  • transcript
Each event’s detail object identifies the speaker in a source or role field (user or ai) and carries the spoken text in a message, text, or content field.
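
Because the field names vary between SDK versions, it can help to normalize each event's detail into one shape before sending it. The following is a minimal sketch of that idea, not the code in ElevenLabsVoiceButton.vue: the TurnDetail and Turn types and the normalizeTurn function are hypothetical names, and the exact payload of each SDK version is an assumption based on the field names listed above.

```typescript
// Hypothetical shapes: the field names come from the text above, but the
// exact payload emitted by each SDK version is an assumption.
type Speaker = "user" | "ai";

interface TurnDetail {
  source?: string;
  role?: string;
  message?: string;
  text?: string;
  content?: string;
}

interface Turn {
  source: Speaker;
  content: string;
}

// Returns a normalized turn, or null when the detail has no usable
// speaker or no spoken text.
function normalizeTurn(detail: TurnDetail | null | undefined): Turn | null {
  if (!detail) return null;
  // Accept either field name for the speaker.
  const speaker = detail.source || detail.role;
  // Take the first non-empty text field and drop surrounding whitespace.
  const spoken = (detail.message || detail.text || detail.content || "").trim();
  if (speaker !== "user" && speaker !== "ai") return null;
  if (spoken === "") return null;
  return { source: speaker, content: spoken };
}
```

A listener registered for any of the three event names could pass event.detail through a function like this and skip the POST whenever it returns null.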

2. The button POSTs to the transcript endpoint

For each event, the component calls:
POST /api/v1/widget/conversations/voice_transcript?website_token=<token>

{
  "source": "user",
  "content": "Hello, I need help with my order."
}
The source field is "user" for a visitor turn or "ai" for an agent turn. The widget appends ?website_token=<token> to the URL automatically so the Rails controller can identify the inbox.
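
As an illustration of the request described above, the sketch below assembles the URL and JSON body without sending anything. The buildTranscriptRequest function, the TranscriptRequest type, and the baseUrl parameter are hypothetical; any authentication headers the real widget sends are omitted.

```typescript
// Hypothetical request description; only the path, the website_token query
// parameter, and the two body fields come from the text above.
interface TranscriptRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildTranscriptRequest(
  baseUrl: string,
  websiteToken: string,
  source: "user" | "ai",
  content: string,
): TranscriptRequest {
  const path = "/api/v1/widget/conversations/voice_transcript";
  // Strip trailing slashes so the joined URL has exactly one separator.
  const origin = baseUrl.replace(/\/+$/, "");
  const url = `${origin}${path}?website_token=${encodeURIComponent(websiteToken)}`;
  return {
    url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ source, content }),
  };
}
```

In a browser, the result could be handed to fetch(request.url, request); building the request separately keeps the URL and payload easy to inspect when debugging.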

3. The Rails controller writes the message

Api::V1::Widget::ConversationsController#voice_transcript does the following on each POST:
  1. Validates that source is either user or ai and that content is not blank.
  2. Looks up the visitor’s existing open conversation, or calls build_conversation_for_voice to create one.
  3. Creates a message with:
    • message_type: :incoming for source == "user", or :outgoing for the AI turn
    • sender: @contact for user turns; nil for AI turns
    • content_attributes: { voice_transcript: true, role: "<source>" }
  4. Returns { id: <message_id>, conversation_id: <conversation_id> } with HTTP 200.
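
The validation and mapping rules in steps 1 and 3 can be modelled as a small pure function, which may be useful for pre-checking a turn on the client or for writing tests. This is a TypeScript model of the rules as described, not the Rails controller code; toTranscriptMessage, TranscriptMessage, and the senderIsContact flag are hypothetical names, and treating whitespace-only content as blank is an assumption.

```typescript
type MessageType = "incoming" | "outgoing";

// Hypothetical model of the message the controller is described as creating.
interface TranscriptMessage {
  message_type: MessageType;
  senderIsContact: boolean; // true: sender is the contact; false: sender is nil
  content: string;
  content_attributes: { voice_transcript: true; role: "user" | "ai" };
}

// Returns the message attributes for a valid turn, or null when the turn
// would fail validation (unknown source or blank content).
function toTranscriptMessage(source: string, content: string): TranscriptMessage | null {
  if (source !== "user" && source !== "ai") return null;
  if (content.trim() === "") return null; // assumption: whitespace-only is blank
  const fromVisitor = source === "user";
  return {
    message_type: fromVisitor ? "incoming" : "outgoing",
    senderIsContact: fromVisitor,
    content,
    content_attributes: { voice_transcript: true, role: source },
  };
}
```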

4. The widget syncs messages immediately

After each successful POST, the button component dispatches conversation/syncLatestMessages. This triggers a fetch of the latest messages in the widget’s conversation view, so the visitor sees their voice turns appear in the chat window in real time — alongside any text messages already in the thread.
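
The ordering described above (POST first, sync only on success) can be sketched with injected functions so that it runs without a real store or network. The sendTurnAndSync function and the PostFn and DispatchFn types are hypothetical; only the action name conversation/syncLatestMessages comes from the text, and how the real component handles a failed POST is an assumption.

```typescript
interface PostedTurn {
  source: "user" | "ai";
  content: string;
}

// Hypothetical stand-ins for the HTTP call and the store's dispatch function.
type PostFn = (turn: PostedTurn) => Promise<{ id: number; conversation_id: number }>;
type DispatchFn = (action: string) => Promise<void> | void;

// Posts one turn and, only if the POST succeeds, asks the widget to refresh
// its message list. Resolves to true when the sync was dispatched.
async function sendTurnAndSync(
  turn: PostedTurn,
  post: PostFn,
  dispatch: DispatchFn,
): Promise<boolean> {
  try {
    await post(turn);
  } catch {
    // Assumption: a failed POST skips the sync rather than retrying.
    return false;
  }
  await dispatch("conversation/syncLatestMessages");
  return true;
}
```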

Voice-first flow

If a visitor starts a voice call before typing any text message, there is no existing conversation to attach transcripts to. In that case build_conversation_for_voice creates a new Chatwoot conversation tagged with:
additional_attributes: { initiated_from: 'voice_agent' }
This ensures transcripts always have a conversation to land in, and the initiated_from attribute lets you identify voice-originated conversations in reports.
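
The find-or-create behaviour can be pictured with a small in-memory model. This is a sketch of the logic as described, not build_conversation_for_voice itself; the VoiceConversation type, the conversationForVoice function, and the nextId parameter are hypothetical.

```typescript
// Hypothetical, simplified view of a conversation record.
interface VoiceConversation {
  id: number;
  status: string;
  additional_attributes: Record<string, string>;
}

// Reuses the visitor's open conversation when one exists; otherwise creates
// a new one tagged as voice-initiated.
function conversationForVoice(
  existing: VoiceConversation[],
  nextId: number,
): VoiceConversation {
  for (const conversation of existing) {
    if (conversation.status === "open") return conversation;
  }
  return {
    id: nextId,
    status: "open",
    additional_attributes: { initiated_from: "voice_agent" },
  };
}
```

Note that in this model a conversation that started with text keeps its original attributes, so only voice-first conversations carry the initiated_from tag.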

What agents see

Voice transcript messages appear as ordinary message bubbles in the Chatwoot dashboard conversation view. They are interleaved with any text messages the visitor typed. Each bubble carries content_attributes.voice_transcript: true and content_attributes.role (user or ai), which can be used for custom styling or filtering in reports.
User turns (what the visitor said) are stored as incoming messages. AI turns (what the ElevenLabs agent said) are stored as outgoing messages. This matches the standard Chatwoot convention where incoming means from the visitor and outgoing means from the agent/bot side.
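
As one way to use these attributes, the sketch below picks the voice transcript turns out of a mixed message list. The ChatMessage shape is a simplified assumption about the message payload, and voiceTranscriptTurns is a hypothetical helper.

```typescript
// Simplified, assumed message shape; real payloads carry more fields.
interface ChatMessage {
  id: number;
  message_type: "incoming" | "outgoing";
  content: string;
  content_attributes?: { voice_transcript?: boolean; role?: string };
}

// Returns only voice transcript messages, optionally restricted to one role.
function voiceTranscriptTurns(
  messages: ChatMessage[],
  role?: "user" | "ai",
): ChatMessage[] {
  return messages.filter((message) => {
    const attrs = message.content_attributes;
    if (!attrs || attrs.voice_transcript !== true) return false;
    return role === undefined || attrs.role === role;
  });
}
```

Calling voiceTranscriptTurns(messages, "ai") would return only what the ElevenLabs agent said, leaving typed text messages out.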

Checking transcript delivery

If transcripts are not appearing in the conversation, check the Rails logs:
docker compose logs rails | grep voice_transcript
A successful POST logs nothing at the info level and returns HTTP 200 with { id, conversation_id }. If you see a 500 error, the full exception message is logged at the error level with the prefix [VOICE-AGENT] voice_transcript failed:. See Troubleshooting for a list of common failure modes and how to resolve them.
