


Data Augmentation via Latent Diffusion for Saliency Prediction

Bahar Aydemir · Deblina Bhattacharjee · Tong Zhang · Mathieu Salzmann · Sabine Süsstrunk

Poster #255
Strong Double Blind: This paper was not made available on public preprint services during the review process.
[ Project Page ] [ Paper PDF ]
Fri 4 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Saliency prediction models are constrained by the limited diversity and quantity of labeled data. Standard data augmentation techniques, such as rotating and cropping, change the scene composition and hence affect saliency. In this work, we propose a novel data augmentation method for deep saliency prediction that edits natural images while retaining the complexity and variability of real-world visual scenes. Since saliency depends on both high-level features, such as semantics, and low-level features, such as photometric properties, our approach learns both by incorporating attributes such as color, contrast, brightness, and object class. To that end, we introduce a saliency-guided cross-attention mechanism that enables targeted edits of photometric properties, thereby enhancing saliency within specific image regions and giving our model controllability in the context of saliency prediction. Experimental results demonstrate that our data augmentation method consistently improves the performance of various saliency models. Moreover, leveraging the features that generate these augmentations for saliency prediction yields superior performance on publicly available saliency benchmarks. Our saliency predictions are highly aligned with human visual attention patterns in the edited images, as validated by a user study. We will make our code publicly available upon acceptance.
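The abstract does not detail the cross-attention mechanism, so the following is only a minimal PyTorch sketch of one plausible form of saliency-guided cross-attention: latent image tokens attend to embeddings of the target attributes, and the saliency map gates the attention update so edits concentrate on salient regions. The class name, the gating formulation, and all shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SaliencyGuidedCrossAttention(nn.Module):
    """Illustrative sketch (not the paper's actual layer): cross-attention
    between diffusion latents and edit-condition embeddings, with the
    attention update gated by a per-token saliency map."""

    def __init__(self, dim: int, context_dim: int, heads: int = 8):
        super().__init__()
        assert dim % heads == 0
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(context_dim, dim, bias=False)
        self.to_v = nn.Linear(context_dim, dim, bias=False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x, context, saliency):
        # x:        (B, N, dim)          latent image tokens (N = H*W)
        # context:  (B, M, context_dim)  embeddings of the target attributes
        #                                (e.g. color, contrast, brightness, class)
        # saliency: (B, N)               per-token saliency in [0, 1]
        B, N, _ = x.shape
        h, d = self.heads, x.shape[-1] // self.heads

        q = self.to_q(x).view(B, N, h, d).transpose(1, 2)         # (B, h, N, d)
        k = self.to_k(context).view(B, -1, h, d).transpose(1, 2)  # (B, h, M, d)
        v = self.to_v(context).view(B, -1, h, d).transpose(1, 2)  # (B, h, M, d)

        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, h * d)

        # Gate the cross-attention update by saliency: highly salient tokens
        # receive the full photometric/semantic edit, while non-salient
        # tokens are left (almost) unchanged.
        return x + saliency.unsqueeze(-1) * self.to_out(out)
```

Gating the residual update, rather than biasing the attention logits, is one simple way to make the edit strength spatially controllable: scaling the saliency map directly trades off how strongly a region is modified against how much of the original scene composition is preserved.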
