Introduction

What is speak-mintlify?

speak-mintlify is a CLI tool that automatically creates voice narration for your documentation pages. It converts your MDX content into natural-sounding audio files, uploads them to S3-compatible storage, and injects audio player components directly into your documentation.

The tool is built to avoid unnecessary work. It tracks a hash of each file and regenerates audio only when the content changes, which keeps API calls to a minimum, and it fits into your existing CI/CD pipelines.
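The hash-based tracking described above can be sketched roughly as follows. This is a minimal illustration, not speak-mintlify's actual implementation: the function names, the choice of SHA-256, and the shape of the stored hash mapping are all assumptions made for the example.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a SHA-256 hex digest of a page's text (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_pages(
    pages: dict[str, str], previous_hashes: dict[str, str]
) -> tuple[list[str], dict[str, str]]:
    """Compare current page hashes with the hashes recorded on the last run.

    `pages` maps a page path to its text. `previous_hashes` maps a page path
    to the digest stored earlier (hypothetical layout). Returns the sorted
    list of pages whose audio would need regenerating, plus the new hashes.
    """
    current = {path: content_hash(text) for path, text in pages.items()}
    changed = sorted(
        path for path, digest in current.items()
        if previous_hashes.get(path) != digest
    )
    return changed, current
```

A caller would persist the returned hash mapping only after audio generation succeeds, so that a failed run is retried the next time.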

Why use voice narration?

Voice narration transforms how developers interact with your documentation:

Accessibility

Make your documentation accessible to developers with visual impairments or reading disabilities

Multitasking

Let developers listen while coding, debugging, or working on other tasks

Learning styles

Support auditory learners who retain information better through listening

Developer engagement

Increase time-on-page and improve the overall documentation experience

Key features

Smart regeneration - Only generates audio when content changes, using hash-based tracking

Multiple voices - Configure different voices for variety or different content types

S3-compatible storage - Works with AWS S3, Cloudflare R2, MinIO, and other S3-compatible services

Automatic injection - Injects audio player components into your MDX files automatically

CI/CD ready - Designed to integrate seamlessly with GitHub Actions and other automation tools

Flexible configuration - Use YAML config files, environment variables, or CLI flags

Dry run mode - Preview changes before making them with the --dry-run flag

How it works

Extract content

speak-mintlify parses your MDX files and extracts readable text content, removing code blocks and other non-narrative elements

Generate audio

The extracted content is sent to the Fish Audio TTS API, which generates natural-sounding voice narration

Upload to S3

Audio files are uploaded to your S3-compatible storage with intelligent caching

Inject component

Audio player components are automatically added to your MDX files with the correct voice URLs
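The first step above, extracting readable text from MDX, can be approximated with a few lines of text processing. This is a simplified sketch under stated assumptions, not the parser speak-mintlify uses: the function name is hypothetical, and the rules for what counts as non-narrative (frontmatter, fenced code, import/export lines, JSX tags) are guesses for illustration.

```python
import re


def extract_narration_text(mdx: str) -> str:
    """Return the prose of an MDX page with non-narrative elements removed."""
    kept = []
    in_fence = False
    in_frontmatter = False
    for index, line in enumerate(mdx.splitlines()):
        stripped = line.strip()
        # Skip YAML frontmatter delimited by '---' at the top of the file.
        if index == 0 and stripped == "---":
            in_frontmatter = True
            continue
        if in_frontmatter:
            if stripped == "---":
                in_frontmatter = False
            continue
        # Skip fenced code blocks, including the fence lines themselves.
        if stripped.startswith("`" * 3):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        # Skip MDX import/export statements (this also drops prose lines
        # that happen to start with these words, a known limitation).
        if stripped.startswith(("import ", "export ")):
            continue
        kept.append(line)

    text = "\n".join(kept)
    text = re.sub(r"<[^>]+>", "", text)                    # drop JSX/HTML tags, keep inner text
    text = re.sub(r"`([^`]*)`", r"\1", text)               # inline code keeps its text
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", "", text)       # drop images
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)   # links keep their label
    text = re.sub(r"^#{1,6}[ \t]+", "", text, flags=re.MULTILINE)  # heading markers
    text = re.sub(r"\n{3,}", "\n\n", text)                 # collapse extra blank lines
    return text.strip()
```

A regular-expression approach like this is easy to follow but fragile; a production tool would more likely walk a proper MDX syntax tree.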

Powered by Fish Audio

speak-mintlify uses Fish Audio for its affordable, high-quality, natural-sounding voices. Unlike other providers that can be expensive and difficult to use at scale, Fish Audio makes it easy to add voice narration to your documentation without breaking the bank.

Get started in minutes

Follow our quickstart guide to add voice narration to your documentation
