Introduction to Stable Diffusion XL 0.9

6 min read · Jul 12, 2023

Improving Latent Diffusion Models for High-Resolution Image Synthesis

In this article, you will learn how to generate high-resolution images with the new Stable Diffusion XL 0.9 architecture.

Note that this tutorial is based on the diffusers package rather than the original implementation.

SDXL is a new latent diffusion model from StabilityAI, currently in pre-release. Compared to previous models (SD1.5, SD2.1, etc.), SDXL 0.9 has the following characteristics:

leverages a three times larger UNet backbone (more attention blocks)
has a second text encoder and tokenizer
is trained on multiple aspect ratios
has a refinement model to improve visual fidelity (applied post hoc as an image-to-image step)
works on a 128 x 128 latent image and produces a final image resolution of 1024 x 1024
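
The relationship between the 128 x 128 latent and the 1024 x 1024 output can be checked with simple arithmetic. The sketch below is illustrative only: the downsampling factor of 8 is inferred from the two sizes quoted above (1024 / 128), the 4 latent channels are an assumption carried over from earlier Stable Diffusion models, and `latent_shape` is a hypothetical helper rather than part of any library.

```python
# Assumption: the factor is inferred from the sizes in the article (1024 / 128 = 8).
VAE_SCALE_FACTOR = 8
# Assumption: 4 latent channels, as in earlier Stable Diffusion models.
LATENT_CHANNELS = 4


def latent_shape(height, width, scale_factor=VAE_SCALE_FACTOR, channels=LATENT_CHANNELS):
    """Return the (channels, height, width) of the latent for a given image size."""
    if height % scale_factor != 0 or width % scale_factor != 0:
        raise ValueError(f"height and width must be multiples of {scale_factor}")
    return (channels, height // scale_factor, width // scale_factor)


print(latent_shape(1024, 1024))  # (4, 128, 128)
```

Since the model is trained on multiple aspect ratios, the same helper can be used to see what latent size a non-square resolution would correspond to under these assumptions.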

As illustrated in the image above, SDXL 0.9 comes with the following checkpoints:

Text-to-Image (1024x1024 resolution): stabilityai/stable-diffusion-xl-base-0.9
Image-to-Image / Refiner…
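
As a rough sketch of how the base text-to-image checkpoint listed above might be loaded through diffusers, the snippet below uses `DiffusionPipeline.from_pretrained`. The helper names `load_base_pipeline` and `generate`, the prompt, the step count, and the output filename are all hypothetical. It also assumes a CUDA GPU, that fp16 weight files are available in the repository, and that your Hugging Face account has access to the 0.9 weights.

```python
BASE_MODEL_ID = "stabilityai/stable-diffusion-xl-base-0.9"


def load_base_pipeline(model_id=BASE_MODEL_ID, device="cuda"):
    """Load the SDXL base pipeline in half precision (hypothetical helper)."""
    # Imported lazily so the module can be read without torch/diffusers installed.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to reduce GPU memory use
        use_safetensors=True,
        variant="fp16",  # assumption: the repository ships fp16 weight files
    )
    return pipe.to(device)


def generate(pipe, prompt, steps=30):
    """Run text-to-image and return the first generated PIL image."""
    return pipe(prompt=prompt, num_inference_steps=steps).images[0]


# Example usage (downloads several GB of weights on first run):
# pipe = load_base_pipeline()
# image = generate(pipe, "an astronaut riding a horse, highly detailed")
# image.save("sdxl_base.png")
```

Keeping loading and generation in separate functions means the pipeline is created once and reused across prompts, which matters because loading is by far the slower step.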

Written by Ng Wai Foong
