Skip to yearly menu bar Skip to main content


Poster

A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting

Wouter Van Gansbeke · Bert De Brabandere

[ ]
Wed 2 Oct 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

Panoptic and instance segmentation networks are often trained with specialized object detection modules, complex loss functions, and ad-hoc post-processing steps to handle the permutation-invariance of the instance masks. This work builds upon Stable Diffusion and proposes a latent diffusion approach for panoptic segmentation, resulting in a simple architecture which omits these complexities. Our training process consists of two steps: (1) training a shallow autoencoder to project the segmentation masks to latent space; (2) training a diffusion model to allow image-conditioned sampling in latent space. The use of a generative model unlocks the exploration of mask completion or inpainting, which has applications in interactive segmentation. The experimental validation on COCO and ADE20k yields strong results for segmentation tasks. Finally, we demonstrate the approach's adaptability to a multi-task setting by introducing learnable task embeddings.

Live content is unavailable. Log in and register to view live content