How to Fine-tune Stable Diffusion using LoRA

Ng Wai Foong
8 min read · Feb 21, 2023

Generating personalized images with custom datasets

Image by the author

In previous articles, I covered fine-tuning the Stable Diffusion model to generate personalized images.

By default, a full-fledged fine-tuning run requires about 24 to 30 GB of VRAM. However, with the introduction of Low-Rank Adaptation (LoRA), a technique originally proposed for large language models, it is now possible to fine-tune with consumer GPUs.
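
For intuition, here is a minimal PyTorch sketch of the idea (the `LoRALinear` class and its hyperparameters are my own illustration, not part of diffusers): the pre-trained weight matrix is frozen, and only a pair of small low-rank matrices A and B is trained, so the learned update is their product.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        # A starts with small random values, B with zeros, so the update is
        # initially zero and the model behaves exactly like the base model
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + scale * B (A x); only lora_a and lora_b receive gradients
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768), rank=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 6,144 LoRA parameters vs. 590,592 in the frozen base layer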

Based on a local experiment, a single-process training run with a batch size of 2 fits on a single 12 GB GPU, peaking at about 10 GB of VRAM without xformers and about 6 GB with xformers enabled.
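
As a point of reference, such a run could be launched with the train_text_to_image_lora.py example script from the diffusers repository, roughly as sketched below; the dataset and output names are illustrative placeholders, not values prescribed by this article:

```bash
accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --dataset_name="lambdalabs/pokemon-blip-captions" \
  --resolution=512 \
  --train_batch_size=2 \
  --enable_xformers_memory_efficient_attention \
  --learning_rate=1e-4 \
  --max_train_steps=15000 \
  --output_dir="sd-model-finetuned-lora"
```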

LoRA offers the following benefits:

  • less likely to suffer catastrophic forgetting, since the pre-trained weights are kept frozen
  • LoRA weights have far fewer parameters than the original model, which makes them small and easy to port
  • allows you to control the extent to which the model is adapted toward the new training images (supports interpolation; see the sketch after this list)
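
To illustrate the interpolation point, here is a hedged inference sketch against the diffusers API from around this article's publication; the LoRA directory name and prompt are placeholders, and the scale entry in cross_attention_kwargs blends the frozen base model (0.0) with the fully adapted one (1.0):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load the trained LoRA attention weights (the directory name is a placeholder)
pipe.unet.load_attn_procs("sd-model-finetuned-lora")

# scale interpolates between the base model (0.0) and the LoRA-adapted model (1.0)
image = pipe(
    "a photo of a green creature with menacing eyes",
    num_inference_steps=30,
    cross_attention_kwargs={"scale": 0.7},
).images[0]
image.save("output.png")
```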

This tutorial is strictly based on the diffusers package. Training and inference…
