How to Fine-tune Stable Diffusion using Textual Inversion
Personalize generated images with custom styles or objects

On 22 Aug 2022, Stability.AI announced the public release of Stable Diffusion, a powerful latent text-to-image diffusion model. The model can generate many variations of an image from any text or image input.
Please note that the model is released under a Creative ML OpenRAIL-M license. The license allows both commercial and non-commercial usage, and it is the responsibility of developers to use the model ethically. This includes derivatives of the model.
This tutorial focuses on how to fine-tune the embedding to create personalized images based on a custom style or object. Instead of re-training the model, we represent the custom style or object as a new word in the embedding space of the model. The new word then guides the creation of new images in an intuitive way.
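To make the idea concrete, here is a minimal sketch of how such a pseudo-word can be registered in the text encoder's vocabulary before training. It uses the Hugging Face transformers CLIP classes rather than the forked repository's own training code, and the token name <my-style> and the initializer word painting are hypothetical placeholders:

import torch
from transformers import CLIPTextModel, CLIPTokenizer

# Load the CLIP text encoder family used by Stable Diffusion v1.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

# Register the pseudo-word and grow the embedding table to match.
tokenizer.add_tokens("<my-style>")
text_encoder.resize_token_embeddings(len(tokenizer))

# Warm-start the new embedding from a related word.
token_id = tokenizer.convert_tokens_to_ids("<my-style>")
init_ids = tokenizer.encode("painting", add_special_tokens=False)
assert len(init_ids) == 1  # the initializer must map to a single token
with torch.no_grad():
    embeddings = text_encoder.get_input_embeddings().weight
    embeddings[token_id] = embeddings[init_ids[0]]

During fine-tuning, only this single embedding row is optimized; the diffusion model and the rest of the text encoder stay frozen, which is why the method works with just a handful of example images.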
Have a look at the following comparison between a real-life object and an image generated using the Stable Diffusion model with a fine-tuned embedding:

Let’s proceed to the next section for setup and installation.
Installation
Please note that this tutorial is based on a forked version of Stable Diffusion that integrates textual inversion to personalize image generation. The official repository recommends using Anaconda for the installation.
Refer to the following guide to install all the required dependencies.
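For reference, in the upstream repository the Anaconda setup reduces to creating and activating the bundled environment. The environment name ldm comes from the official environment.yaml; double-check it against the fork you cloned:

conda env create -f environment.yaml
conda activate ldm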
Use the following requirements file if you intend to install the dependencies via pip instead:
pip install -r requirements.txt
If you have trouble with the installation, refer to the following troubleshooting guide.
Make sure that you have all the required models and checkpoints in the working directory or .cache folder.
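As a point of reference, the upstream Stable Diffusion repository expects the v1 weights at a fixed relative path. The exact layout may differ in the fork, and sd-v1-4.ckpt below stands for whichever checkpoint file you downloaded:

mkdir -p models/ldm/stable-diffusion-v1/
ln -s <path/to/sd-v1-4.ckpt> models/ldm/stable-diffusion-v1/model.ckpt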