Poster
Responsible Visual Editing
Minheng Ni · Yeli Shen · Yabin Zhang · Wangmeng Zuo
# 203
With recent advances in visual synthesis, there is a growing risk of encountering synthesized images with detrimental effects, such as hate, discrimination, and privacy violations. Unfortunately, how to avoid synthesizing harmful images and how to convert them into responsible ones remains unexplored. In this paper, we present responsible visual editing, which modifies risky concepts within an image into more responsible ones with minimal content changes. However, the concepts that need to be edited are often abstract, making them hard to locate and modify. To tackle these challenges, we propose a Cognitive Editor (CoEditor), which harnesses large multimodal models through a two-stage cognitive process: (1) a perceptual cognitive process to locate what needs to be modified and (2) a behavioral cognitive process to strategize how to modify it. To mitigate the negative implications of harmful images on research, we build a transparent and public dataset, namely AltBear, which expresses harmful information using teddy bears instead of humans. Experiments demonstrate that CoEditor can effectively comprehend abstract concepts in complex scenes, significantly surpassing the baseline models for responsible visual editing. Moreover, we find that the AltBear dataset corresponds well to the harmful content found in real images, providing a safe and effective benchmark for future research. Our source code and dataset can be found at https://github.com/kodenii/Responsible-Visual-Editing.