Enable GPU acceleration for face swap

By default, Deep-Live-Cam runs inference on the CPU, which is reliable but slow for real-time use. Enabling a GPU execution provider offloads the ONNX model inference to your hardware accelerator, significantly improving frame rates. The provider you use depends on your hardware: NVIDIA cards use CUDA, Apple Silicon uses CoreML, AMD and Intel GPUs on Windows can use DirectML, and Intel hardware can also use OpenVINO.

Deep-Live-Cam auto-detects the best available provider at startup in this order: cuda → rocm → coreml → dml → cpu. You can override this with the --execution-provider flag.
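The selection order above can be illustrated with a short sketch. This is not Deep-Live-Cam's actual startup code: the function name pick_provider and the PREFERENCE table are hypothetical, and the only real API used is onnxruntime.get_available_providers(), which returns the provider identifiers compiled into the installed ONNX Runtime build.

```python
# Preference order taken from the text above. The long names are the
# identifiers ONNX Runtime reports from get_available_providers().
PREFERENCE = [
    ("cuda", "CUDAExecutionProvider"),
    ("rocm", "ROCMExecutionProvider"),
    ("coreml", "CoreMLExecutionProvider"),
    ("dml", "DmlExecutionProvider"),
    ("cpu", "CPUExecutionProvider"),
]


def pick_provider(available):
    """Return (short_name, onnx_name) for the first preferred provider present.

    Hypothetical helper for illustration only.
    """
    for short_name, onnx_name in PREFERENCE:
        if onnx_name in available:
            return short_name, onnx_name
    # Nothing matched, so fall back to the CPU provider as the last resort.
    return "cpu", "CPUExecutionProvider"


if __name__ == "__main__":
    try:
        import onnxruntime
        available = onnxruntime.get_available_providers()
    except ImportError:
        # Assumption: with no runtime installed, report CPU only.
        available = ["CPUExecutionProvider"]
    print(pick_provider(available))
```

Running the script inside your virtual environment prints the provider this ordering would choose for the onnxruntime package currently installed.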

Choosing a provider

CUDA is the recommended provider for NVIDIA GPUs (Turing architecture or newer for best performance). It requires installing the CUDA Toolkit and cuDNN before updating the Python packages.

1. Install system dependencies

Download and install CUDA Toolkit 12.8.0
Download cuDNN v8.9.7 for CUDA 12.x and ensure the cuDNN bin directory is added to your system PATH

2. Install Python packages

With your virtual environment active, install PyTorch with CUDA 12.8 support and the GPU-enabled ONNX Runtime:

pip install -U torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip uninstall onnxruntime onnxruntime-gpu
pip install onnxruntime-gpu==1.21.0

3. Run with CUDA

python run.py --execution-provider cuda
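Before launching the app, it can help to confirm that the GPU build of ONNX Runtime is the one your virtual environment imports. The sketch below is an assumption-laden check, not part of Deep-Live-Cam: cuda_status is a hypothetical helper, and it only confirms that the CUDA provider is compiled into the installed package. It does not prove that the CUDA Toolkit and cuDNN libraries load at session creation time.

```python
def cuda_status(available_providers):
    """Summarize whether the CUDA provider is listed (hypothetical helper)."""
    if "CUDAExecutionProvider" in available_providers:
        return "CUDA provider available"
    return (
        "CUDA provider missing; check the CUDA Toolkit, the cuDNN PATH entry, "
        "and that onnxruntime-gpu replaced the CPU-only onnxruntime package"
    )


if __name__ == "__main__":
    try:
        import onnxruntime
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        print("onnxruntime version:", onnxruntime.__version__)
        print(cuda_status(onnxruntime.get_available_providers()))
```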

When CUDA is active, Deep-Live-Cam automatically selects the FP16 inswapper model (inswapper_128_fp16.onnx) when it is present, falling back to the FP32 variant (inswapper_128.onnx) on older GPUs. It also promotes libx264 to h264_nvenc for hardware-accelerated video encoding.
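The FP16-then-FP32 fallback described above could look roughly like this. The two file names come from the text. Everything else is an assumption for illustration: the function name choose_swapper_model, the prefer_fp16 flag (standing in for whatever GPU capability check the app performs), and the idea that both files live in one models directory.

```python
import tempfile
from pathlib import Path

FP16_MODEL = "inswapper_128_fp16.onnx"
FP32_MODEL = "inswapper_128.onnx"


def choose_swapper_model(models_dir, prefer_fp16=True):
    """Return the FP16 model path if wanted and present, else the FP32 path.

    Hypothetical helper; prefer_fp16=False models the "older GPU" case.
    """
    models_dir = Path(models_dir)
    fp16_path = models_dir / FP16_MODEL
    if prefer_fp16 and fp16_path.is_file():
        return fp16_path
    return models_dir / FP32_MODEL


if __name__ == "__main__":
    # Demonstrate the fallback using an empty temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        print(choose_swapper_model(tmp).name)   # FP32: no FP16 file exists yet
        (Path(tmp) / FP16_MODEL).touch()
        print(choose_swapper_model(tmp).name)   # FP16: file is now present
```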

CoreML gives the best performance on Apple Silicon (M1/M2/M3/M4) by routing inference through the Neural Engine and GPU shader cores.

Prerequisite: Complete the macOS-specific setup first. Install Python 3.11 via Homebrew and create your virtual environment with python3.11 -m venv venv.

1. Install the Silicon-specific ONNX Runtime

pip uninstall onnxruntime onnxruntime-silicon
pip install onnxruntime-silicon==1.13.1

2. Run with CoreML

python3.11 run.py --execution-provider coreml

On macOS, if multiple Python versions are installed, run the app with python3.11 rather than python or python3. Using the wrong interpreter can cause import errors or load the wrong onnxruntime package.
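If you are unsure which interpreter and which onnxruntime package a given command picks up, a small diagnostic script can show it. This is a sketch and not part of Deep-Live-Cam. The helper interpreter_matches and the EXPECTED constant are hypothetical names, and 3.11 is taken from the prerequisite above.

```python
import sys

EXPECTED = (3, 11)  # Python version named in the macOS prerequisite


def interpreter_matches(version_info=sys.version_info, expected=EXPECTED):
    """Return True when the major.minor version equals the expected pair."""
    return tuple(version_info[:2]) == expected


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version ok:", interpreter_matches())
    try:
        import onnxruntime
        # The file path reveals which site-packages directory was used.
        print("onnxruntime loaded from:", onnxruntime.__file__)
    except ImportError:
        print("onnxruntime is not importable from this interpreter")
```

Run it once with each candidate command (python, python3, python3.11) and compare the printed paths.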

For older Intel-based Macs or when using the legacy CoreML provider, use the standard onnxruntime-coreml package instead of the Silicon-specific build.

1. Install dependencies

pip uninstall onnxruntime onnxruntime-coreml
pip install onnxruntime-coreml==1.21.0

2. Run with CoreML

python run.py --execution-provider coreml

DirectML works on any DirectX 12-capable GPU on Windows, including AMD and Intel integrated and discrete GPUs. It does not require vendor-specific drivers beyond what Windows already provides.

1. Install dependencies

pip uninstall onnxruntime onnxruntime-directml
pip install onnxruntime-directml==1.21.0

2. Run with DirectML

python run.py --execution-provider directml

DirectML sets the execution thread count to 1 automatically to avoid serialization issues with the DirectML API.
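The thread-count rule can be sketched as a small function. This is an illustration under stated assumptions, not the project's code: resolve_execution_threads is a hypothetical name, and the default of 8 threads for other providers is a placeholder rather than a documented value.

```python
def resolve_execution_threads(providers, requested=None, default=8):
    """Pick a worker thread count for the given ONNX Runtime providers.

    Hypothetical helper: DirectML is forced to one thread, as described
    above. The default of 8 for other providers is an assumed placeholder.
    """
    if "DmlExecutionProvider" in providers:
        return 1
    if requested is not None:
        return max(1, requested)
    return default


if __name__ == "__main__":
    print(resolve_execution_threads(["DmlExecutionProvider"], requested=8))
    print(resolve_execution_threads(["CUDAExecutionProvider"], requested=4))
```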

OpenVINO is Intel’s inference optimization toolkit. It works on Intel CPUs, integrated graphics, and discrete Arc GPUs.

1. Install dependencies

pip uninstall onnxruntime onnxruntime-openvino
pip install onnxruntime-openvino==1.21.0

2. Run with OpenVINO

python run.py --execution-provider openvino

If you do not have a supported GPU, Deep-Live-Cam will run on the CPU using the default onnxruntime package installed with requirements.txt. No additional installation is needed.

python run.py

CPU mode is significantly slower than GPU-accelerated providers and is not suitable for real-time webcam use on most hardware.

Performance notes

Provider	Hardware	Real-time capable
CUDA	NVIDIA GTX 10xx+	Yes
CoreML (Silicon)	Apple M1–M4	Yes
CoreML (Legacy)	Intel Mac	Limited
DirectML	AMD/Intel (Windows)	Varies
OpenVINO	Intel	Varies
CPU	Any	No

For the fastest inference on NVIDIA hardware, use CUDA with onnxruntime-gpu==1.21.0. When CUDA is active, Deep-Live-Cam automatically selects the FP16 inswapper model when available, which reduces memory bandwidth requirements, and promotes video encoding to h264_nvenc or hevc_nvenc, keeping encoding off the CPU.
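The encoder promotion can be pictured as a lookup from software encoder to NVENC equivalent. This is a minimal sketch with a hypothetical promote_encoder helper. The libx264 to h264_nvenc pairing comes from this page. The libx265 to hevc_nvenc pairing is an assumption inferred from the mention of hevc_nvenc.

```python
# Software encoder -> NVENC hardware encoder.
# The libx265 entry is an assumption; only libx264 is named in the text.
NVENC_EQUIVALENTS = {
    "libx264": "h264_nvenc",
    "libx265": "hevc_nvenc",
}


def promote_encoder(encoder, providers):
    """Return the NVENC equivalent when CUDA is active, else the input encoder."""
    if "CUDAExecutionProvider" in providers:
        # Encoders without a known NVENC equivalent pass through unchanged.
        return NVENC_EQUIVALENTS.get(encoder, encoder)
    return encoder


if __name__ == "__main__":
    print(promote_encoder("libx264", ["CUDAExecutionProvider"]))
    print(promote_encoder("libx264", ["CPUExecutionProvider"]))
```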
