
Troubleshooting

This page covers common issues you might encounter when using AudioSeal and how to resolve them.

Common Errors

Problem: This error occurs when passing audio tensors without the required batch dimension.

Solution: AudioSeal expects a batch of audio tensors as input. Add a batch dimension to your input tensor:
# If your audio tensor has shape (channels, samples)
wav = wav.unsqueeze(0)  # Now it has shape (batch, channels, samples)

# Example usage
watermark = model.get_watermark(wav)
See the Getting Started notebook for complete examples.
Problem: On Windows machines, you may encounter this error due to an old checkpoint that is not compatible with Windows.

Solution: Invalidate the cache by removing the cached files and re-running:
  1. Navigate to C:\Users\<USER>\.cache\audioseal
  2. Delete all files in this directory
  3. Run your code again to re-download the checkpoint
This issue occurs with older checkpoint versions uploaded to the model hub. Clearing the cache will download the updated, Windows-compatible version.
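If you prefer to clear the cache programmatically, the steps above can be sketched with the Python standard library. This is a minimal sketch assuming the default cache location mentioned above (`.cache/audioseal` under your home directory):

```python
import shutil
from pathlib import Path

# Default AudioSeal checkpoint cache (C:\Users\<USER>\.cache\audioseal on Windows)
cache_dir = Path.home() / ".cache" / "audioseal"

# Remove the stale cached checkpoints; they are re-downloaded on the next run
if cache_dir.exists():
    shutil.rmtree(cache_dir)
```

Running your code afterwards will re-download the updated checkpoint into the same directory.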
Problem: Newer versions of torchaudio don't handle the default audio backend properly.

Solution: You have two options:

Option 1: Downgrade torchaudio
pip install torchaudio==2.1.0
Option 2: Install soundfile as your audio backend
pip install soundfile
If you’re using torchaudio version 2.2.0 or later, we recommend installing soundfile as the audio backend for better compatibility.

Installation Issues

Problem: Pip reports conflicts between package versions during installation.

Solution: Ensure you have the minimum required versions:
  • Python >= 3.8 (>= 3.10 for streaming support)
  • PyTorch >= 1.13.0
Try installing in a fresh virtual environment:
python -m venv audioseal_env
source audioseal_env/bin/activate  # On Windows: audioseal_env\Scripts\activate
pip install audioseal
Problem: AudioSeal doesn't utilize your GPU for inference.

Solution: Ensure you have the correct PyTorch version with CUDA support:
# Check PyTorch CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# If False, install PyTorch with CUDA support
# Visit https://pytorch.org/get-started/locally/ for your specific CUDA version
Move your model and audio tensors to GPU:
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AudioSeal.load_generator("audioseal_wm_16bits").to(device)
wav = wav.to(device)

Model Loading Issues

Problem: Connection errors or timeouts when downloading model checkpoints.

Solution: Try the following approaches:
  1. Check your internet connection
  2. Set a longer timeout:
    import os
    os.environ['HF_HUB_DOWNLOAD_TIMEOUT'] = '300'
    
  3. Manually download from Hugging Face Hub and load locally:
    model = Watermarker.from_pretrained("/path/to/checkpoint", device=device)
    
Problem: CUDA out of memory errors when processing large audio files.

Solution: Process audio in smaller chunks:
# Split long audio into smaller chunks
chunk_size = 16000 * 30  # 30 seconds at 16kHz
chunks = torch.split(wav, chunk_size, dim=-1)

watermarked_chunks = []
for chunk in chunks:
    watermark = model.get_watermark(chunk)
    watermarked_chunks.append(chunk + watermark)

watermarked_audio = torch.cat(watermarked_chunks, dim=-1)
For streaming applications, consider using the streaming API available in AudioSeal 0.2+.

Detection Issues

Problem: The detector outputs low probabilities even for known watermarked audio.

Possible causes and solutions:
  1. Sample rate mismatch: Ensure the audio sample rate matches the model’s expected rate (16kHz by default)
    import julius
    if sample_rate != 16000:
        wav = julius.resample_frac(wav, sample_rate, 16000)
    
  2. Audio has been heavily modified: The watermark may have been degraded by compression, noise, or other attacks
  3. Using wrong detector: Ensure you’re using the matching detector for your generator model
    # Use matching pair
    generator = AudioSeal.load_generator("audioseal_wm_16bits")
    detector = AudioSeal.load_detector("audioseal_detector_16bits")
    
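For cause 1, a quick plain-Python sanity check (not an AudioSeal or julius API) is to confirm that resampling produced the expected number of samples:

```python
def resampled_length(n_samples, orig_sr, target_sr=16000):
    # Expected sample count after resampling from orig_sr to target_sr
    return round(n_samples * target_sr / orig_sr)

# One second of 44.1 kHz audio should become 16000 samples at 16 kHz
```

If the resampled tensor's last dimension doesn't match this, the audio was resampled with the wrong rates.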

Performance Issues

Problem: Watermarking or detection is slower than expected.

Solutions:
  1. Use GPU: Move models to GPU for faster inference
  2. Disable gradient computation:
    model.eval()
    with torch.no_grad():
        watermark = model.get_watermark(wav)
    
  3. Batch processing: Process multiple files in batches when possible
  4. Use streaming mode: For real-time applications, use the streaming API (AudioSeal 0.2+)
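For item 3, batching typically means padding clips to a common length so they can be stacked along the batch dimension. A minimal plain-Python sketch of the padding step (a hypothetical helper, not part of AudioSeal):

```python
def pad_to_common_length(clips, pad_value=0.0):
    """Right-pad each clip (a list of samples) to the longest length in the batch."""
    max_len = max(len(clip) for clip in clips)
    return [clip + [pad_value] * (max_len - len(clip)) for clip in clips]
```

With real tensors you would then stack the padded clips (e.g. with torch.stack) into shape (batch, channels, samples) before calling model.get_watermark.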

Streaming Support Issues

Problem: Running the streaming API in certain PyTorch versions causes hanging in Jupyter notebooks.

Solution: Disable torch dynamo before starting Jupyter:
export NO_TORCH_COMPILE=1
jupyter notebook
Or set it in Python:
import os
os.environ['NO_TORCH_COMPILE'] = '1'
This issue affects specific PyTorch versions. Ensure you’re using a compatible version or disable torch compile as shown above.

Getting Help

If you encounter issues not covered here:
  1. Check the GitHub Issues page for similar problems
  2. Review the example notebooks for working code
  3. Open a new issue with:
    • Your AudioSeal version
    • Python and PyTorch versions
    • Complete error message and stack trace
    • Minimal code to reproduce the issue
When reporting issues, always include your environment details and a reproducible example to help maintainers diagnose the problem quickly.
