TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Follow publication

How to Fine-tune Stable Diffusion using Dreambooth

Personalized generated images with custom styles or objects

Ng Wai Foong
TDS Archive
Published in
10 min readNov 15, 2022
Image by the author

Previously, I have covered an article on fine-tuning Stable Diffusion using textual inversion. This tutorial focuses on how to fine-tune Stable Diffusion using another method called Dreambooth. Unlike textual inversion method which train just the embedding without modification to the base model, Dreambooth fine-tune the whole text-to-image model such that it learns to bind a unique identifier with a specific concept (object or style). As a result, the generated images is more personalized to the object or style compared to textual inversion.

This tutorial is based on a forked version of Dreambooth implementation by HuggingFace. The original implementation requires about 16GB to 24GB in order to fine-tune the model. The maintainer ShivamShrirao optimized the code to reduce VRAM usage to under 16GB. Depending on your needs and settings, you can fine-tune the model with 10GB to 16GB GPU. I have personally tested the training to be feasible on Tesla T4 GPU.

Please note that all the existing implementation is not by the original author of Dreambooth. As a result, there might be slight difference in terms of reproducibility.

Let’s proceed to the next section to setup all the necessary modules.

Setup

It is recommended to create a new virtual environment before you continue with the installation.

Python packages

In your working directory, create a new file called requirements.txt with the following code:

accelerate==0.12.0
torchvision
transformers>=4.21.0
ftfy
tensorboard
modelcards

Activate your virtual environment and run the following command one by one to install all the necessary modules:

pip install git+https://github.com/ShivamShrirao/diffusers.git
pip install -r requirements.txt

NOTE: You need to install diffusers using the url above instead of installing it directly from pypi.

bitsandbytes package

There is an optional package called bitsandbytes, which can reduce the VRAM usage further…

Create an account to read the full story.

The author made this story available to Medium members only.
If you’re new to Medium, create a new account to read this story on us.

Or, continue in mobile web

Already have an account? Sign in

TDS Archive
TDS Archive

Published in TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Ng Wai Foong
Ng Wai Foong

Written by Ng Wai Foong

Senior AI Engineer@Yoozoo | Content Writer #NLP #datascience #programming #machinelearning | Linkedin: https://www.linkedin.com/in/wai-foong-ng-694619185/

Responses (7)

Write a response