Denoising Diffusion Implicit Models for faster, deterministic sampling from diffusion models
DDIM (Denoising Diffusion Implicit Models), introduced by Song et al. in 2021, is a faster sampling method for diffusion models. Unlike DDPM, which typically requires around 1000 denoising steps to generate a sample, DDIM can produce high-quality samples in as few as 10-50 steps.
DDPM’s reverse process is stochastic—it adds noise at each denoising step. DDIM makes a key observation: we can define a deterministic reverse process that produces the same marginal distributions but allows skipping timesteps.
DDIM doesn’t require retraining the model. You can use a DDPM-trained model and sample with DDIM immediately.
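Skipping timesteps simply means running the sampler on a strided subsequence of the original training schedule. A minimal sketch (the function name and defaults here are illustrative, not from any particular library):

```python
# Sketch: pick a strided subsequence of a DDPM-trained model's timesteps.
# A model trained on 1000 steps can be sampled on, say, 50 of them.

def ddim_timesteps(num_train_steps: int = 1000, ddim_steps: int = 50):
    """Evenly strided subset of the training timesteps, in descending order."""
    stride = num_train_steps // ddim_steps
    return list(range(num_train_steps - 1, -1, -stride))[:ddim_steps]

print(ddim_timesteps(1000, 10))  # 10 timesteps instead of 1000
```

The DDIM update then jumps directly from each timestep in this list to the next, rather than visiting every intermediate step.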
DDIM introduces a parameter η ∈ [0, 1] that controls stochasticity:
η = 0: Fully deterministic (standard DDIM)
η = 1: Recovers stochastic DDPM
0 < η < 1: Interpolates between deterministic and stochastic
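Concretely, η scales the per-step noise level σ_t. The mapping used in the DDIM paper can be sketched in plain Python; the ᾱ (alpha_bar) values below are illustrative placeholders, not a real schedule:

```python
import math

# Sketch: how η maps to the per-step noise scale σ_t in DDIM.
# alpha_bar_t / alpha_bar_prev are cumulative noise-schedule products.

def sigma_t(eta: float, alpha_bar_t: float, alpha_bar_prev: float) -> float:
    return (eta
            * math.sqrt((1 - alpha_bar_prev) / (1 - alpha_bar_t))
            * math.sqrt(1 - alpha_bar_t / alpha_bar_prev))

alpha_bar_t, alpha_bar_prev = 0.5, 0.7  # illustrative values
print(sigma_t(0.0, alpha_bar_t, alpha_bar_prev))  # 0.0 -> deterministic
print(sigma_t(1.0, alpha_bar_t, alpha_bar_prev))  # DDPM posterior std
```

At η = 0 the noise term vanishes entirely; at η = 1 the expression equals the DDPM posterior standard deviation, recovering stochastic DDPM.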
Deterministic sampling (η=0) is preferred for most applications: it tends to give better quality at low step counts and is reproducible—the same noise seed always produces the same image.
The optimal number of steps depends on your dataset and model. MNIST can achieve good results with 50 steps, while CIFAR-10 may need 100-250 steps for comparable quality to DDPM.
Because DDIM is deterministic, it enables smooth interpolations in latent space:
```python
import torch

# Generate two random starting noises
z1 = torch.randn(1, channels, size, size)
z2 = torch.randn(1, channels, size, size)

# Interpolate in noise space
alphas = torch.linspace(0, 1, steps=10)
interpolated = [(1 - a) * z1 + a * z2 for a in alphas]

# Each interpolated noise produces a deterministic image
images = [sample_ddim(z, ddim_steps=50) for z in interpolated]
```
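In practice, spherical interpolation (slerp) often works better than linear interpolation for Gaussian noise: linear midpoints have a smaller norm than typical noise samples, while slerp keeps interpolants on a comparable shell. A sketch (the `slerp` helper is our own, not a library function):

```python
import torch

# Sketch: spherical interpolation between two noise tensors.

def slerp(a: float, z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    z1_flat, z2_flat = z1.flatten(), z2.flatten()
    # Angle between the two noise vectors
    cos_omega = torch.dot(z1_flat, z2_flat) / (z1_flat.norm() * z2_flat.norm())
    omega = torch.arccos(torch.clamp(cos_omega, -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < 1e-8:  # nearly parallel: fall back to linear interpolation
        return (1 - a) * z1 + a * z2
    return (torch.sin((1 - a) * omega) / so) * z1 + (torch.sin(a * omega) / so) * z2

z1, z2 = torch.randn(1, 3, 32, 32), torch.randn(1, 3, 32, 32)
interpolated = [slerp(a.item(), z1, z2) for a in torch.linspace(0, 1, 10)]
```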
With stochastic DDPM, the same noise seed produces different images each time due to sampling randomness.
When comparing DDPM vs DDIM, ensure you’re using the same random seed for the initial noise. Otherwise, differences in sample quality may be due to lucky/unlucky noise samples rather than the algorithm itself.
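A fair comparison can be set up by drawing the initial noise once and passing it to both samplers. The `noise=` keyword on the sampler calls below is an assumed interface, shown only to illustrate the pattern:

```python
import torch

# Fix the seed so the starting noise is identical across runs
torch.manual_seed(42)
noise = torch.randn(16, 3, 32, 32)

# Assumed sampler interfaces accepting a pre-drawn starting noise:
# images_ddpm = diffusion.sample_ddpm(noise=noise)
# images_ddim = diffusion.sample_ddim(noise=noise, ddim_steps=50)
```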
The DDIM reverse update is

x_{t−1} = √ᾱ_{t−1} · ( (x_t − √(1 − ᾱ_t) · ε_θ(x_t, t)) / √ᾱ_t ) + √(1 − ᾱ_{t−1} − σ_t²) · ε_θ(x_t, t) + σ_t · ε_t

where ᾱ_t is the cumulative noise schedule, ε_θ is the trained noise predictor, and σ_t can be chosen freely. When σ_t = 0, the process becomes deterministic. When σ_t matches the DDPM posterior variance, it recovers DDPM exactly. The key insight is that the marginal distributions q(x_t | x_0) remain the same, so a model trained with DDPM’s objective can be used with DDIM sampling.
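A single deterministic step (σ_t = 0) of this update can be sketched as follows, assuming a noise predictor `eps_model(x, t)` trained with the standard DDPM objective and a precomputed tensor `alpha_bar` of cumulative products of (1 − β):

```python
import torch

# Sketch: one deterministic DDIM step (σ_t = 0). `eps_model` and `alpha_bar`
# are assumed to come from a DDPM-trained setup.

@torch.no_grad()
def ddim_step(eps_model, x_t, t, t_prev, alpha_bar):
    eps = eps_model(x_t, t)
    a_t = alpha_bar[t]
    a_prev = alpha_bar[t_prev] if t_prev >= 0 else torch.tensor(1.0)
    # Predict x_0 from the current sample and the noise estimate
    x0_pred = (x_t - torch.sqrt(1 - a_t) * eps) / torch.sqrt(a_t)
    # Jump to the previous (possibly much earlier) timestep deterministically
    return torch.sqrt(a_prev) * x0_pred + torch.sqrt(1 - a_prev) * eps
```

Because `t_prev` need not be `t − 1`, the same step works across the strided timestep subsequence, which is what makes few-step sampling possible.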
Start with 50 steps and adjust based on your needs:
```python
# Fast preview
images = diffusion.sample_ddim(num_samples=16, ddim_steps=20)

# Production quality
images = diffusion.sample_ddim(num_samples=16, ddim_steps=100)

# Maximum quality (close to DDPM)
images = diffusion.sample_ddim(num_samples=16, ddim_steps=250)
```
If you need diverse samples, generate multiple images with different noise seeds rather than increasing η. This gives you explicit control over diversity.