Introduction to UniDiffuser
A unified diffusion framework
By reading this piece, you will learn about UniDiffuser, “… a unified diffusion framework to fit all distributions relevant to a set of multi-modal data in one model.”
From version 0.17.0 onward, the diffusers package officially supports the UniDiffuserPipeline, which allows users to perform the following tasks:
- Joint Image-Text Generation
- Text-to-Image
- Image-to-Text
- Image Variation
- Text Variation
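As a rough illustration of how these five tasks map onto the pipeline, the sketch below wraps each one in a small function. It assumes `pipe` is an already loaded UniDiffuserPipeline and relies on the pipeline inferring the task from which inputs are passed. The function names, step count, and guidance scale are illustrative choices, not something prescribed by the library. The two variation tasks are written as round trips (image to caption to image, and prompt to image to caption), which is one common way to express them.

```python
# Illustrative defaults (assumptions); tune them for your own use.
STEPS = 20
GUIDANCE = 8.0


def joint_generation(pipe):
    """No inputs: sample an image and a matching caption together."""
    out = pipe(num_inference_steps=STEPS, guidance_scale=GUIDANCE)
    return out.images[0], out.text[0]


def text_to_image(pipe, prompt):
    """Only a prompt is passed, so the pipeline generates an image."""
    out = pipe(prompt=prompt, num_inference_steps=STEPS, guidance_scale=GUIDANCE)
    return out.images[0]


def image_to_text(pipe, image):
    """Only an image is passed, so the pipeline generates a caption."""
    out = pipe(image=image, num_inference_steps=STEPS, guidance_scale=GUIDANCE)
    return out.text[0]


def image_variation(pipe, image):
    """Round trip: caption the image, then render that caption."""
    return text_to_image(pipe, image_to_text(pipe, image))


def text_variation(pipe, prompt):
    """Round trip: render the prompt, then caption the result."""
    return image_to_text(pipe, text_to_image(pipe, prompt))
```

Each helper returns only the first sample of the batch; the output object holds lists, so several samples per call can be read from the same fields.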
At the time of this writing, UniDiffuser comes with the following pre-trained models:
- UniDiffuser-v0: This version is trained at 512x512 resolution on LAION-5B, which contains noisy webdata of text-image pairs.
- UniDiffuser-v1: This version is resumed from UniDiffuser-v0, and is further trained with a set of less noisy internal text-image pairs. It uses a flag as its input to distinguish webdata and internal data during training.
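To make the choice between the two checkpoints concrete, here is a minimal loading sketch. The Hugging Face Hub IDs under the thu-ml organization are an assumption about where the weights are hosted at the time of reading, and `load_unidiffuser` is a hypothetical helper name. The heavy imports are deferred into the function so the snippet can be pasted into a module before torch and diffusers are installed.

```python
# Hub repositories for the two checkpoints (assumed locations; adjust
# them if the weights have moved).
CHECKPOINTS = {
    "v0": "thu-ml/unidiffuser-v0",
    "v1": "thu-ml/unidiffuser-v1",
}


def load_unidiffuser(version="v1", device="cuda"):
    """Load a UniDiffuser checkpoint and move it to `device`.

    torch and diffusers are imported lazily so that defining this
    helper does not require either package.
    """
    import torch
    from diffusers import UniDiffuserPipeline

    # Half precision is a common choice on GPU; fall back to float32 elsewhere.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = UniDiffuserPipeline.from_pretrained(CHECKPOINTS[version], torch_dtype=dtype)
    return pipe.to(device)
```

Calling `load_unidiffuser("v0")` swaps in the earlier checkpoint without touching the rest of the code, which is convenient when comparing the two versions.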
Let’s proceed to the next section to install all the necessary modules.
Setup
Before installing anything, it is highly recommended to create a new virtual environment.
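One way to do this without leaving Python is the standard library's venv module. The snippet below is a small sketch; the directory name `unidiffuser-env` and the helper `create_env` are placeholders chosen for illustration.

```python
import venv
from pathlib import Path


def create_env(path="unidiffuser-env", with_pip=True):
    """Create a virtual environment at `path` and return its absolute location."""
    env_dir = Path(path).resolve()
    # with_pip=True bootstraps pip inside the new environment.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

Calling `create_env()` has the same effect as running `python -m venv unidiffuser-env` on the command line. The environment still has to be activated from the shell before any packages are installed into it.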