Poster

Frequency-Spatial Entanglement Learning for Camouflaged Object Detection

Yanguang Sun ⋅ Chunyan Xu ⋅ Jian Yang ⋅ Hanyu Xuan ⋅ Lei Luo

Strong blind review: This paper was not made available on public preprint services during the review process


2024 Poster


Abstract

Camouflaged object detection (COD) has attracted considerable attention in computer vision. The main challenge lies in the high degree of similarity between camouflaged objects and their surroundings in the spatial domain, which makes identification difficult. Existing methods attempt to reduce the impact of pixel similarity by maximizing the distinguishing ability of spatial features through complicated designs, but they often ignore the sensitivity and locality of features in the spatial domain, leading to sub-optimal results. In this paper, we address this issue by jointly exploring representations in the frequency and spatial domains, and introduce the Frequency-Spatial Entanglement Learning (FSEL) method. FSEL consists of a series of Entanglement Transformer Blocks (ETB) for representation learning, a Joint Domain Perception Module (JDPM) for semantic enhancement, and a Dual-domain Reverse Parser (DRP) for feature integration in the frequency and spatial domains. Specifically, the ETB uses frequency self-attention (FSA) to characterize the relationship between different frequency bands, while the entanglement feed-forward network (EFFN) facilitates information interaction between features of different domains through entanglement learning. Extensive experiments demonstrate the superiority of FSEL over 21 state-of-the-art (SOTA) methods through comprehensive quantitative and qualitative comparisons on three widely used COD datasets.
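
To make the idea of attention computed in the frequency domain more concrete, the following is a minimal NumPy sketch. It is not the authors' FSA implementation: the function name `frequency_self_attention`, the projection matrices `wq`, `wk`, `wv`, the choice of treating each 2D frequency bin as a token, and the use of the real part of the complex inner product as the attention score are all assumptions made purely for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def frequency_self_attention(x, wq, wk, wv):
    """Illustrative attention over frequency bins (hypothetical, not the paper's FSA).

    x          : real feature map of shape (H, W, C)
    wq, wk, wv : hypothetical (C, C) projection matrices
    Returns the real-valued output (H, W, C) and the attention map (H*W, H*W).
    """
    h, w, c = x.shape

    # Linear projections in the spatial domain.
    q, k, v = x @ wq, x @ wk, x @ wv

    # 2D FFT over the spatial axes; each (u, v) frequency bin becomes one token.
    qf = np.fft.fft2(q, axes=(0, 1), norm="ortho").reshape(h * w, c)
    kf = np.fft.fft2(k, axes=(0, 1), norm="ortho").reshape(h * w, c)
    vf = np.fft.fft2(v, axes=(0, 1), norm="ortho").reshape(h * w, c)

    # Assumption: score = real part of the complex inner product between bins.
    scores = (qf @ kf.conj().T).real / np.sqrt(c)
    attn = softmax(scores, axis=-1)

    # Mix the complex value spectra across frequency bins, then return to space.
    out_f = (attn @ vf).reshape(h, w, c)
    out = np.fft.ifft2(out_f, axes=(0, 1), norm="ortho").real
    return out, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 8, 16))
    wq, wk, wv = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))
    out, attn = frequency_self_attention(x, wq, wk, wv)
    print(out.shape, attn.shape)
```

Because each row of `attn` weights every frequency bin, a single output bin can draw on both low- and high-frequency content, which is one plausible reading of "relationships between frequency bands". Taking the real part after the inverse FFT discards any imaginary residue; a full model would need a more careful treatment of complex values.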
