Applio exposes its entire voice-conversion pipeline through a single entry-point script, core.py, which uses Python’s built-in argparse library. Every feature available in the Gradio web UI — inference, batch processing, training, text-to-speech, model blending, and more — can be driven from the command line without launching a browser. This makes core.py ideal for automation scripts, CI/CD pipelines, cloud batch jobs, and any headless environment where a graphical interface is not practical.

Basic usage

The CLI follows the standard subcommand pattern. Run the script with a subcommand name, followed by the flags specific to that subcommand:
python core.py <subcommand> [flags]
To display the full help text for any subcommand, append --help:
python core.py infer --help
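
For automation scripts, the same invocation pattern can be assembled programmatically instead of typed by hand. The following is a minimal sketch and not part of Applio: the helper names `build_command` and `help_command` are hypothetical, and the sketch assumes every flag is written in `--name value` form.

```python
import sys
from typing import List


def build_command(subcommand: str, **flags: object) -> List[str]:
    """Assemble `python core.py <subcommand> [flags]` as an argv list."""
    # sys.executable is the interpreter running this script, so an
    # activated virtual environment is reused automatically.
    argv = [sys.executable, "core.py", subcommand]
    for name, value in flags.items():
        # Assumption: each flag takes exactly one value (`--name value`).
        argv += [f"--{name}", str(value)]
    return argv


def help_command(subcommand: str) -> List[str]:
    """Argv that prints the help text for one subcommand."""
    return [sys.executable, "core.py", subcommand, "--help"]
```

Passing the resulting list to `subprocess.run(argv, check=True)` from the Applio root directory would execute the command. Using a list rather than a single string avoids shell quoting problems with paths that contain spaces.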

Environment setup

All commands must be run from the Applio root directory (the folder that contains core.py) with the project’s virtual environment activated. Running from a different working directory or without the virtual environment will cause import errors.
# Linux / macOS
source .venv/bin/activate
python core.py infer --help

# Windows
.venv\Scripts\activate
python core.py infer --help
If you have not yet installed Applio’s dependencies, run python core.py prerequisites first. This downloads the required pretrained models and executables into the correct locations.
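
A wrapper script can verify both preconditions before it calls the CLI. This is a sketch under stated assumptions: the function name `environment_problems` is hypothetical, and the virtual-environment check relies on `sys.prefix` differing from `sys.base_prefix`, which detects `venv`-style environments only (not, for example, Conda environments).

```python
import sys
from pathlib import Path
from typing import List


def environment_problems(root: Path) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # The CLI must be launched from the folder that contains core.py.
    if not (root / "core.py").is_file():
        problems.append(f"core.py not found in {root}")
    # Inside an activated venv, sys.prefix points at the environment,
    # while sys.base_prefix points at the base Python installation.
    if sys.prefix == sys.base_prefix:
        problems.append("no virtual environment appears to be active")
    return problems
```

Calling `environment_problems(Path.cwd())` at the start of a script and aborting on a non-empty result gives a clearer message than the import error that would otherwise follow.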

Available subcommands

infer

Convert a single audio file to a target voice using a trained RVC model.

batch_infer

Convert an entire folder of audio files in one pass using the same model and settings.

tts

Synthesize text with Edge TTS, then immediately apply RVC voice conversion — all in one command.

preprocess

Slice and clean a raw audio dataset to prepare it for feature extraction and training.

extract

Extract pitch (F0) and speaker embeddings from a preprocessed dataset.

train

Train an RVC model from extracted features. Automatically runs index generation on completion.

index

Manually (re-)generate the FAISS index file for a trained model.

model_information

Print metadata stored inside a .pth model file — architecture, epoch, sample rate, and more.

model_blender

Fuse two RVC models together at a configurable blend ratio to create a hybrid voice.

tensorboard

Launch a TensorBoard server pointed at the logs/ directory to monitor training progress.

download

Download and extract a model archive from a direct URL or Hugging Face link into logs/.

prerequisites

Download pretrained models, additional support models (RMVPE, etc.), and required executables.

audio_analyzer

Analyze an audio file and print sample rate, duration, bit depth, channel count, and RMS level.
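
The descriptions above imply an order for the training subcommands: preprocess prepares the dataset, extract works on the preprocessed dataset, and train consumes the extracted features and generates the index on completion. The sketch below builds those three invocations in sequence, using only the required flags listed in the quick reference table. The function name `training_pipeline` is hypothetical, and all values are passed through unchanged as placeholders; run each subcommand with `--help` for the accepted values.

```python
import sys
from typing import List


def training_pipeline(model_name: str, dataset_path: str, sample_rate: str,
                      cut_preprocess: str, include_mutes: str,
                      save_every_epoch: str) -> List[List[str]]:
    """Return argv lists for preprocess, extract and train, in that order."""
    base = [sys.executable, "core.py"]
    return [
        # Step 1: slice and clean the raw dataset.
        base + ["preprocess", "--model_name", model_name,
                "--dataset_path", dataset_path,
                "--sample_rate", sample_rate,
                "--cut_preprocess", cut_preprocess],
        # Step 2: extract pitch and speaker embeddings.
        base + ["extract", "--model_name", model_name,
                "--sample_rate", sample_rate,
                "--include_mutes", include_mutes],
        # Step 3: train; index generation runs when training completes.
        base + ["train", "--model_name", model_name,
                "--save_every_epoch", save_every_epoch,
                "--sample_rate", sample_rate],
    ]
```

Running the returned lists one after another with `subprocess.run(step, check=True)` stops the pipeline at the first failing step, so a later step never starts from incomplete output.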

Quick reference table

| Subcommand | Purpose | Required flags |
| --- | --- | --- |
| infer | Single-file voice conversion | --input_path, --output_path, --pth_path, --index_path |
| batch_infer | Folder voice conversion | --input_folder, --output_folder, --pth_path, --index_path |
| tts | TTS + voice conversion | --tts_text/--tts_file, --tts_voice, --output_tts_path, --output_rvc_path, --pth_path, --index_path |
| preprocess | Dataset preprocessing | --model_name, --dataset_path, --sample_rate, --cut_preprocess |
| extract | Feature extraction | --model_name, --sample_rate, --include_mutes |
| train | Model training | --model_name, --save_every_epoch, --sample_rate |
| index | Index generation | --model_name |
| model_information | Model metadata | --pth_path |
| model_blender | Model fusion | --model_name, --pth_path_1, --pth_path_2 |
| tensorboard | TensorBoard server | (none) |
| download | Model download | --model_link |
| prerequisites | Dependency setup | (none required) |
| audio_analyzer | Audio analysis | --input_path |

Pass --help to any subcommand to see the exact accepted values and defaults printed directly in your terminal, e.g. python core.py train --help.
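
A script that generates commands can check them against the quick reference table before launching a long job. The sketch below copies the required flags from the table into a dictionary and reports which ones are absent from an argument list. The names `REQUIRED_FLAGS` and `missing_flags` are hypothetical, and reading `--tts_text/--tts_file` as "at least one of the two" is an assumption.

```python
from typing import Dict, List, Tuple

# Required flags per subcommand, copied from the quick reference table.
# Each tuple is a group of alternatives; one flag from the group suffices.
REQUIRED_FLAGS: Dict[str, List[Tuple[str, ...]]] = {
    "infer": [("--input_path",), ("--output_path",),
              ("--pth_path",), ("--index_path",)],
    "batch_infer": [("--input_folder",), ("--output_folder",),
                    ("--pth_path",), ("--index_path",)],
    # Assumption: either --tts_text or --tts_file satisfies the first group.
    "tts": [("--tts_text", "--tts_file"), ("--tts_voice",),
            ("--output_tts_path",), ("--output_rvc_path",),
            ("--pth_path",), ("--index_path",)],
    "preprocess": [("--model_name",), ("--dataset_path",),
                   ("--sample_rate",), ("--cut_preprocess",)],
    "extract": [("--model_name",), ("--sample_rate",), ("--include_mutes",)],
    "train": [("--model_name",), ("--save_every_epoch",), ("--sample_rate",)],
    "index": [("--model_name",)],
    "model_information": [("--pth_path",)],
    "model_blender": [("--model_name",), ("--pth_path_1",), ("--pth_path_2",)],
    "tensorboard": [],
    "download": [("--model_link",)],
    "prerequisites": [],
    "audio_analyzer": [("--input_path",)],
}


def missing_flags(subcommand: str, argv: List[str]) -> List[str]:
    """List required flag groups that do not appear in argv.

    Only the `--flag value` form is recognized, not `--flag=value`.
    """
    groups = REQUIRED_FLAGS[subcommand]
    return ["/".join(g) for g in groups if not any(f in argv for f in g)]
```

For example, `missing_flags("infer", ["--input_path", "a.wav"])` reports the three flags still to be supplied. The check covers flag presence only; argparse still validates the values when the command runs.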
