Introduction to MultiDiffusion: Text to Panorama Image Generation

Ng Wai Foong
5 min read · Feb 22, 2023

Fusing diffusion paths for panorama image generation

Image by the author

By reading this piece, you will learn to use MultiDiffusion for panorama image generation. According to the official documentation, MultiDiffusion is

… a unified framework that enables versatile and controllable image generation. MultiDiffusion can be readily applied to any pre-trained text-to-image diffusion model, without any further training or finetuning. It is capable of generating high quality and diverse images that adhere to user-provided controls.

From version 0.13.0 onward, the diffusers package officially supports MultiDiffusion via StableDiffusionPanoramaPipeline. The pipeline can be used with any Stable Diffusion model to generate panorama images (2048 x 512 resolution by default).
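As a rough illustration of how the pipeline can be called, here is a minimal sketch. It assumes a CUDA GPU and diffusers 0.13.0 or later. The checkpoint name, the DDIM scheduler, and the helper name generate_panorama are choices made for this example, not requirements stated above.

```python
# Default panorama size mentioned above (width x height, in pixels).
PANORAMA_WIDTH = 2048
PANORAMA_HEIGHT = 512


def generate_panorama(prompt, model_id="stabilityai/stable-diffusion-2-base"):
    """Generate one panorama image for `prompt` (hypothetical helper).

    `model_id` is an assumed checkpoint; any Stable Diffusion model
    should work in its place.
    """
    # Imported lazily so the module can be loaded without a GPU stack.
    import torch
    from diffusers import DDIMScheduler, StableDiffusionPanoramaPipeline

    # Assumption: use the DDIM scheduler shipped with the checkpoint.
    scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
    pipe = StableDiffusionPanoramaPipeline.from_pretrained(
        model_id, scheduler=scheduler, torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")

    # height and width are passed explicitly to make the defaults visible.
    result = pipe(prompt, height=PANORAMA_HEIGHT, width=PANORAMA_WIDTH)
    return result.images[0]


# Example usage (downloads model weights on first run):
# image = generate_panorama("a photo of the dolomites")
# image.save("panorama.png")
```

The first call downloads the model weights, so expect it to take a while.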

One of the major problems with the existing Stable Diffusion pipeline is that generating images at a resolution higher than that of the training images introduces unwanted artifacts. For example:

  • weird aspect ratio
  • duplication of subjects

There have been various attempts (high res fix, text2img2img, etc.) to fix these issues. MultiDiffusion is one approach to generating wide-view (panorama) images that look seamless.
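To give a feel for why the result can look seamless, here is a small sketch of the idea as I understand it: the wide canvas is split into overlapping windows at the model's native size, each window is denoised separately, and the overlapping regions are averaged at every step. The code only computes the window layout, not the denoising itself. The window size of 64, the stride of 8, and the VAE downscale factor of 8 are assumptions for this illustration, and the function names are hypothetical.

```python
def window_views(height_px, width_px, window=64, stride=8, vae_scale=8):
    """Return (top, bottom, left, right) windows in latent coordinates."""
    latent_h = height_px // vae_scale
    latent_w = width_px // vae_scale
    rows = (latent_h - window) // stride + 1
    cols = (latent_w - window) // stride + 1
    views = []
    for r in range(rows):
        for c in range(cols):
            top, left = r * stride, c * stride
            views.append((top, top + window, left, left + window))
    return views


def coverage_per_column(views, latent_w):
    """Count how many windows cover each latent column."""
    counts = [0] * latent_w
    for _, _, left, right in views:
        for x in range(left, right):
            counts[x] += 1
    return counts


views = window_views(512, 2048)
counts = coverage_per_column(views, 2048 // 8)
print(len(views))                # 25 overlapping windows
print(min(counts), max(counts))  # 1 8
```

With these settings, most columns of the panorama are covered by eight windows, so neighbouring regions share most of their content. That overlap is what discourages visible seams between them.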
