It's Never Too Late:
Noise Optimization for Collapse Recovery in Trained Diffusion Models

¹UC Berkeley, ²University of Tübingen, Tübingen AI Center, ³Technical University of Munich, MCML
*Equal contribution
Figure: Teaser showing the noise optimization process

TL;DR: We optimize the initial random noise to recover diversity from collapsed diffusion models. Initializing with pink noise instead of white noise gives more diversity for free.

Abstract

Contemporary text-to-image models exhibit a surprising degree of mode collapse, as can be seen when sampling several images given the same text prompt. While previous work has attempted to address this issue by steering the model using guidance mechanisms, or by generating a large pool of candidates and refining them, in this work we take a different direction and aim for diversity in generations via noise optimization. Specifically, we show that a simple noise optimization objective can mitigate mode collapse while preserving the fidelity of the base model. We also analyze the frequency characteristics of the noise and show that initializing with pink noise can improve both optimization and search. Our experiments demonstrate that noise optimization yields superior results in terms of generation quality and variety.

Approach: From Collapsed to Diverse


Given a fixed text prompt and diffusion model, we optimize the noise initialization to increase visual diversity. Starting from i.i.d. noise samples, we generate a set of images. Using diversity and quality objectives (e.g. DINO dissimilarity, HPSv2), we update the noise to produce output images that capture more diversity per text prompt.
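
As a rough illustration of the loop described above, the following toy sketch replaces the diffusion model and feature extractor with a fixed map `tanh(z @ W)` and uses mean pairwise squared distance as the diversity objective, in place of DINO dissimilarity. The names `optimize_noise`, `project_to_shell`, and `W`, the step size, and the norm projection (a simple stand-in for keeping the noise close to the Gaussian prior) are all assumptions made for illustration, not the authors' implementation. The quality objective (e.g. HPSv2) is omitted.

```python
import numpy as np

def diversity(features):
    """Mean pairwise squared distance between feature rows (higher = more diverse)."""
    n = features.shape[0]
    centered = features - features.mean(axis=0, keepdims=True)
    return 2.0 / (n - 1) * np.sum(centered ** 2)

def project_to_shell(z):
    """Rescale each noise vector to norm sqrt(d), where Gaussian samples concentrate.

    Assumption: a crude way to keep the optimized noise plausible under the prior.
    """
    d = z.shape[1]
    return z * np.sqrt(d) / np.linalg.norm(z, axis=1, keepdims=True)

def optimize_noise(z0, W, steps=200, lr=0.1):
    """Gradient ascent on the diversity of tanh(z @ W) with respect to the noise z."""
    z = project_to_shell(z0)
    n = z.shape[0]
    history = []
    for _ in range(steps):
        f = np.tanh(z @ W)  # toy stand-in for generator + feature extractor
        history.append(diversity(f))
        # Gradient of the diversity score with respect to the features.
        grad_f = 4.0 / (n - 1) * (f - f.mean(axis=0, keepdims=True))
        # Chain rule back through tanh and the linear map W.
        grad_z = (grad_f * (1.0 - f ** 2)) @ W.T
        z = project_to_shell(z + lr * grad_z)
    history.append(diversity(np.tanh(z @ W)))
    return z, history

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8)) / 4.0  # fixed toy "model" (hypothetical)
z0 = rng.standard_normal((8, 16))       # 8 i.i.d. noise samples for one prompt
z_opt, history = optimize_noise(z0, W)
print(f"diversity before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

With a real diffusion model, the gradient would presumably come from backpropagating through the sampler and the scoring models with an autodiff framework, rather than from a hand-derived formula as in this toy.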


Figure: Pipeline overview of the noise optimization approach

Pink noise gives image diversity for free

Results generated with SDXL-Turbo

The choice of noise distribution itself affects the diversity of generated images. Pretrained diffusion models are typically sampled from i.i.d. Gaussian (white) noise. Initializing instead with pink noise (which has enhanced low-frequency content) gives more diverse samples before any optimization, and makes the subsequent noise optimization easier.
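
One common way to construct such noise is to shape white Gaussian noise in the Fourier domain so that its amplitude spectrum falls off as 1/f^alpha. The sketch below follows that recipe under stated assumptions: the function name `pink_noise`, the exponent `alpha=1.0`, the handling of the DC bin, the unit-variance rescaling, and the latent shape are illustrative choices, not settings taken from the paper.

```python
import numpy as np

def pink_noise(shape, alpha=1.0, rng=None):
    """Gaussian noise whose amplitude spectrum falls off as 1/f^alpha over the last two axes.

    alpha=0 gives ordinary white noise; alpha=1 is the usual "pink" choice.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = shape[-2:]
    white = rng.standard_normal(shape)

    # Radial spatial frequency of every FFT bin.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = radius[radius > 0].min()  # avoid dividing by zero at the DC bin

    # Boost low frequencies, then rescale so each pixel keeps unit variance in expectation.
    scale = radius ** (-alpha)
    scale /= np.sqrt(np.mean(scale ** 2))

    spectrum = np.fft.fft2(white, axes=(-2, -1)) * scale
    return np.fft.ifft2(spectrum, axes=(-2, -1)).real

rng = np.random.default_rng(0)
latents = pink_noise((4, 4, 64, 64), alpha=1.0, rng=rng)  # hypothetical batch of latent noise
print(latents.shape, float(latents.std()))
```

Keeping the per-pixel variance at one is a design choice meant to leave the overall noise scale unchanged, so that only the frequency balance differs from the white noise a sampler normally receives.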

"A photo of a car"

White noise: i.i.d. (top) and Ours (bottom)

Pink noise: i.i.d. (top) and Ours (bottom)

Sequential Diverse Generation

Results generated with Flux.1 [schnell]

Figure panels: i.i.d. samples, our generations, and mixed samples

BibTeX

@inproceedings{harrington2026noisediv,
  author    = {Harrington, Anne and Koepke, A. Sophia and Karthik, Shyamgopal and Darrell, Trevor and Efros, Alexei A.},
  title     = {It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}