🪢 Diffusion models
Diffusion models are a class of generative models that convert Gaussian noise into samples from a learned data distribution via an iterative denoising process. These models can be conditional on class labels, text, or low-resolution images. A diffusion model $\hat{x}_\theta$ is trained on a denoising objective of the form:

$$\mathbb{E}_{x, c, \epsilon, t}\left[ w_t \left\| \hat{x}_\theta(\alpha_t x + \sigma_t \epsilon, c) - x \right\|_2^2 \right]$$

where $(x, c)$ are data-conditioning pairs, $t \sim \mathcal{U}([0, 1])$, $\epsilon \sim \mathcal{N}(0, I)$, and $\alpha_t, \sigma_t, w_t$ are functions of $t$ that influence sample quality. Intuitively, $\hat{x}_\theta$ is trained to denoise $z_t := \alpha_t x + \sigma_t \epsilon$ into $x$ using a squared error loss, weighted to emphasize certain values of $t$.
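To make the objective concrete, the following is a minimal PyTorch sketch of one training step. The variance-preserving cosine schedule for $\alpha_t, \sigma_t$, the uniform weighting $w_t = 1$, and the `x_hat_model(z_t, c, t)` signature are illustrative assumptions, not the choices of any specific model.

```python
import math
import torch

def alpha_sigma(t):
    # Hypothetical variance-preserving cosine schedule: alpha_t^2 + sigma_t^2 = 1.
    return torch.cos(0.5 * math.pi * t), torch.sin(0.5 * math.pi * t)

def denoising_loss(x_hat_model, x, c):
    """One step of E_{x,c,eps,t}[ w_t * || x_hat(alpha_t x + sigma_t eps, c) - x ||_2^2 ]."""
    b = x.shape[0]
    t = torch.rand(b)                                 # t ~ U([0, 1])
    eps = torch.randn_like(x)                         # eps ~ N(0, I)
    alpha, sigma = alpha_sigma(t)
    shape = (b,) + (1,) * (x.dim() - 1)               # broadcast over non-batch dims
    z_t = alpha.view(shape) * x + sigma.view(shape) * eps
    x_pred = x_hat_model(z_t, c, t)                   # network's x-prediction
    w_t = 1.0                                         # uniform weighting (assumption)
    return (w_t * (x_pred - x) ** 2).mean()
```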
Samplers such as the ancestral sampler and DDIM start from pure noise $z_1 \sim \mathcal{N}(0, I)$ and iteratively generate points $z_{t_1}, \ldots, z_{t_T}$, where $1 = t_1 > \cdots > t_T = 0$, that gradually decrease in noise content. These points are functions of the x-predictions $\hat{x}_0^t := \hat{x}_\theta(z_t, c)$.
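Assuming the same `alpha_sigma` schedule and model signature as in the sketch above, a deterministic DDIM-style sampler can be sketched as follows; each intermediate point is computed from the current x-prediction, and the step count is arbitrary.

```python
import torch

@torch.no_grad()
def ddim_sample(x_hat_model, c, shape, num_steps=50):
    """Deterministic DDIM-style sampling: each new point is a function of the x-prediction."""
    z = torch.randn(shape)                            # z_1 ~ N(0, I), pure noise
    ts = torch.linspace(1.0, 0.0, num_steps + 1)      # 1 = t_1 > ... > t_T = 0
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        alpha, sigma = alpha_sigma(t_cur)
        t_batch = torch.full((shape[0],), t_cur.item())
        x_pred = x_hat_model(z, c, t_batch)           # x-prediction at the current noise level
        eps_pred = (z - alpha * x_pred) / sigma       # implied epsilon-prediction
        alpha_n, sigma_n = alpha_sigma(t_next)
        z = alpha_n * x_pred + sigma_n * eps_pred     # step to the next, less noisy point
    return z                                          # at t = 0 this equals the final x-prediction
```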
Classifier guidance is a technique to improve sample quality while reducing diversity in conditional diffusion models using gradients from a pre-trained model $p(c \mid z_t)$ during sampling. Classifier-free guidance is an alternative technique that avoids this pre-trained model by instead jointly training a single diffusion model on conditional and unconditional objectives via randomly dropping $c$ during training (e.g. with 10% probability).
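A sketch of this conditioning dropout, assuming the conditioning is a batch of embeddings and a hypothetical `null_c` embedding stands in for the dropped condition:

```python
import torch

def maybe_drop_conditioning(c, null_c, drop_prob=0.1):
    """Replace conditioning with a null embedding for ~10% of examples so the
    same model also learns the unconditional objective."""
    drop = torch.rand(c.shape[0]) < drop_prob         # per-example Bernoulli(drop_prob)
    mask = drop.view(-1, *([1] * (c.dim() - 1)))      # broadcast over embedding dims
    return torch.where(mask, null_c, c)
```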
Sampling is performed using the adjusted x-prediction $(z_t - \sigma_t \tilde{\epsilon}_\theta)/\alpha_t$, where

$$\tilde{\epsilon}_\theta(z_t, c) = w\,\epsilon_\theta(z_t, c) + (1 - w)\,\epsilon_\theta(z_t)$$

Here, $\epsilon_\theta(z_t, c)$ and $\epsilon_\theta(z_t)$ are conditional and unconditional $\epsilon$-predictions, given by $\epsilon_\theta := (z_t - \alpha_t \hat{x}_\theta)/\sigma_t$, and $w$ is the guidance weight. Setting $w = 1$ disables classifier-free guidance, while increasing $w > 1$ strengthens the effect of guidance. Imagen depends critically on classifier-free guidance for effective text conditioning.
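Putting the guidance formula into code, here is a sketch that reuses the assumed `alpha_sigma` schedule and model signature from above; the default guidance weight is illustrative:

```python
import torch

def guided_x_prediction(x_hat_model, z_t, c, null_c, t, w=7.5):
    """Classifier-free guidance at sampling time:
    eps_tilde = w * eps(z_t, c) + (1 - w) * eps(z_t), then the adjusted
    x-prediction (z_t - sigma_t * eps_tilde) / alpha_t."""
    alpha, sigma = alpha_sigma(t)                      # t: shared scalar noise level (assumption)
    x_cond = x_hat_model(z_t, c, t)                    # conditional x-prediction
    x_uncond = x_hat_model(z_t, null_c, t)             # unconditional x-prediction (dropped c)
    eps_cond = (z_t - alpha * x_cond) / sigma          # eps_theta(z_t, c)
    eps_uncond = (z_t - alpha * x_uncond) / sigma      # eps_theta(z_t)
    eps_tilde = w * eps_cond + (1.0 - w) * eps_uncond  # guided epsilon; w = 1 disables guidance
    return (z_t - sigma * eps_tilde) / alpha           # adjusted x-prediction
```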