
Introduction to UniDiffuser

Ng Wai Foong
4 min read · Jun 16, 2023

A unified diffusion framework

Image by the author

By reading this piece, you will learn about UniDiffuser,

… a unified diffusion framework to fit all distributions relevant to a set of multi-modal data in one model.

From version 0.17.0 onward, the diffusers package officially supports the UniDiffuserPipeline, which allows users to perform the following tasks:

  • Joint Image-Text Generation
  • Text-to-Image
  • Image-to-Text
  • Image Variation
  • Text Variation
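As a rough sketch of how these tasks map onto the pipeline, the snippet below switches modes on a UniDiffuserPipeline and runs it. The mode setters (set_joint_mode, set_text_to_image_mode, set_image_to_text_mode) and the prompt and image arguments belong to the diffusers pipeline. The helper names load_pipeline, run_task, image_variation and text_variation are hypothetical, and the checkpoint id, float16 precision, CUDA device, step count and guidance scale are assumptions rather than required settings. The two variation tasks are written as round trips (image to text to image, and text to image to text), which is one way to compose them from the basic modes.

```python
# Task names mapped to the UniDiffuserPipeline methods that switch modes.
MODE_SETTERS = {
    "joint": "set_joint_mode",
    "text2img": "set_text_to_image_mode",
    "img2text": "set_image_to_text_mode",
}


def load_pipeline(model_id="thu-ml/unidiffuser-v1", device="cuda"):
    """Load the pipeline (assumed checkpoint id; needs diffusers >= 0.17.0)."""
    # Imported lazily so the helpers below can be used without a GPU setup.
    import torch
    from diffusers import UniDiffuserPipeline

    pipe = UniDiffuserPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    return pipe.to(device)


def run_task(pipe, task, **inputs):
    """Switch the pipeline to `task` mode, then run it with the given inputs."""
    if task not in MODE_SETTERS:
        raise ValueError(f"Unknown task {task!r}; expected one of {sorted(MODE_SETTERS)}")
    getattr(pipe, MODE_SETTERS[task])()
    # Step count and guidance scale are illustrative values, not tuned ones.
    return pipe(num_inference_steps=20, guidance_scale=8.0, **inputs)


def image_variation(pipe, image):
    """Round trip: caption the image, then generate a new image from the caption."""
    caption = run_task(pipe, "img2text", image=image).text[0]
    return run_task(pipe, "text2img", prompt=caption).images[0]


def text_variation(pipe, prompt):
    """Round trip: render the prompt as an image, then caption that image."""
    image = run_task(pipe, "text2img", prompt=prompt).images[0]
    return run_task(pipe, "img2text", image=image).text[0]
```

With a loaded pipeline, run_task(pipe, "joint") should return an output that carries both images and text, while run_task(pipe, "text2img", prompt="an astronaut riding a horse") should return only images.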

At the time of this writing, UniDiffuser comes with the following pre-trained models:

  • UniDiffuser-v0: This version is trained at 512x512 resolution on LAION-5B, which contains noisy web data of text-image pairs.
  • UniDiffuser-v1: This version resumes training from UniDiffuser-v0 and is further trained on a set of less noisy internal text-image pairs. It takes a flag as an input to distinguish web data from internal data during training.

Let’s proceed to the next section to install all the necessary modules.

Setup

Before installing anything, it is highly recommended to create a new virtual environment.
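Once the environment is ready and the packages are installed, it can be handy to confirm that the installed diffusers release meets the 0.17.0 minimum mentioned earlier. The following is a minimal sketch using only the standard library. The helper names version_tuple, supports_unidiffuser and installed_diffusers_version are hypothetical, and the parsing deliberately ignores pre-release suffixes such as ".dev0", so treat it as a quick sanity check rather than a full version parser.

```python
import re
from importlib import metadata

# Minimum diffusers release that ships UniDiffuserPipeline, per the text above.
MIN_DIFFUSERS = "0.17.0"


def version_tuple(version):
    """Return the leading numeric release parts, padded to three fields.

    Suffixes such as '.dev0' or 'rc1' are ignored (a simplifying assumption).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Cannot parse version string: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (3 - len(parts))


def supports_unidiffuser(installed, minimum=MIN_DIFFUSERS):
    """True if the `installed` version string is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_diffusers_version():
    """Return the installed diffusers version, or None if it is not installed."""
    try:
        return metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_diffusers_version()
    if found is None:
        print("diffusers is not installed in this environment")
    elif supports_unidiffuser(found):
        print(f"diffusers {found} should include UniDiffuserPipeline")
    else:
        print(f"diffusers {found} is older than {MIN_DIFFUSERS}; consider upgrading")
```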
