Poster

A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting

Junhao Zhuang · Yanhong Zeng · Wenran Liu · Chun Yuan · Kai Chen

Tue 1 Oct 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

Achieving high-quality versatile image inpainting, in which user-specified regions are seamlessly filled with plausible content that matches user intent, remains a significant challenge. Existing methods typically train separate models for distinct inpainting tasks, such as context-aware image inpainting and text-guided object inpainting, because each task calls for a different optimal training strategy. To address this, we introduce PowerPaint, the first high-quality and versatile inpainting model that excels in both tasks. First, we introduce learnable task prompts along with tailored fine-tuning strategies to explicitly guide the model's focus on different inpainting targets. This enables PowerPaint to accomplish various inpainting tasks by switching task prompts, resulting in state-of-the-art performance. Second, we demonstrate the versatility of the task prompt by showing that it is effective as a negative prompt for object removal. Moreover, we use prompt interpolation to enable controllable shape-guided object inpainting. Finally, we extensively evaluate PowerPaint on various inpainting benchmarks to demonstrate its superior performance for versatile image inpainting. We will release the code and models publicly to facilitate further research in the field.
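
The abstract names two mechanisms, prompt interpolation and the use of a task prompt as a negative prompt, without giving their exact form. The following is a minimal sketch of what they could look like, assuming linear interpolation between two prompt embeddings and the standard classifier-free guidance combination. The function names, the toy vectors, and the linear form are illustrative assumptions, not the authors' released implementation.

```python
from typing import List

Vector = List[float]


def interpolate_prompts(p_a: Vector, p_b: Vector, alpha: float) -> Vector:
    """Linearly blend two prompt embeddings (hypothetical helper).

    alpha = 1.0 returns p_a unchanged, alpha = 0.0 returns p_b unchanged.
    """
    if len(p_a) != len(p_b):
        raise ValueError("prompt embeddings must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_a, p_b)]


def guided_prediction(eps_pos: Vector, eps_neg: Vector, scale: float) -> Vector:
    """Classifier-free guidance: start from the negative-prompt prediction
    and step toward the positive-prompt prediction by `scale`."""
    return [n + scale * (p - n) for p, n in zip(eps_pos, eps_neg)]


# Toy 2-dimensional stand-ins for two learned task-prompt embeddings (assumed values).
p_task_a = [1.0, 0.0]
p_task_b = [0.0, 1.0]
blended = interpolate_prompts(p_task_a, p_task_b, alpha=0.25)

# Toy noise predictions under a positive and a negative prompt (assumed values).
guided = guided_prediction([1.0, 2.0], [0.0, 1.0], scale=2.0)
```

In this sketch, `alpha` plays the role of the control knob for shape-guided inpainting: sliding it between 0 and 1 moves the conditioning smoothly from one task prompt to the other. In `guided_prediction`, placing a task prompt on the negative side pushes the output away from what that prompt would generate, which is the general idea behind using a negative prompt for object removal.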
