

Poster

A Rotation-invariant Texture ViT for Fine-Grained Recognition of Esophageal Cancer Endoscopic Ultrasound Images

Tianyi Liu · Shuaishuai S Zhuang · Jiacheng Nie · Geng Chen · Yusheng Guo · Guangquan Zhou · Jean-Louis Coatrieux · Yang Chen

Strong blind review: This paper was not made available on public preprint services during the review process.
Tue 1 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Endoscopic Ultrasound (EUS) is advantageous for perceiving hierarchical changes in the esophageal tract wall when diagnosing submucosal tumors. However, lesions often disrupt the structural integrity and fine-grained texture information of the esophageal layer, impeding accurate diagnosis. Moreover, owing to the characteristics of EUS imaging, lesions can appear at any radial position, further increasing the difficulty of diagnosis. In this study, we propose SRRM-ViT, an automatic classification model that equips the Vision Transformer (ViT), a recent state-of-the-art model, with a novel statistical rotation-invariant reinforcement mechanism. Specifically, the model adaptively selects crucial regions to avoid interference from irrelevant information in the image. It also integrates rotation-invariant histogram statistical features into the self-attention mechanism, achieving bias-free capture of fine-grained lesion information at arbitrary radial positions. Validated on in-house clinical data and public data, SRRM-ViT shows remarkable performance improvements, demonstrating the efficacy and potential of our approach for EUS image classification.

Keywords: Fine-Grained Visual Classification (FGVC), Endoscopic Ultrasound (EUS), Rotation Invariant, Token Selection.
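The two ideas named in the abstract, rotation-invariant histogram statistics and token selection, can be illustrated with a minimal NumPy sketch. This is not the authors' SRRM-ViT implementation: the function names `patch_histogram` and `select_top_k`, the bin count, and the use of plain top-k scoring are all assumptions made for illustration. The sketch only shows why an intensity histogram is unchanged when a patch is rotated by a multiple of 90 degrees (the pixels are permuted, not altered), and how a subset of tokens might be kept by score.

```python
import numpy as np


def patch_histogram(patch, bins=16):
    """Normalized intensity histogram of a patch with values in [0, 1].

    Hypothetical helper: the histogram ignores pixel positions, so any
    rotation that only permutes pixels leaves it unchanged.
    """
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()


def select_top_k(scores, k):
    """Indices of the k highest-scoring tokens, returned in original order.

    Hypothetical stand-in for an adaptive region-selection step.
    """
    idx = np.argpartition(scores, -k)[-k:]
    return np.sort(idx)


rng = np.random.default_rng(0)
patch = rng.random((16, 16))

# Histograms before and after a 90-degree rotation are identical.
h0 = patch_histogram(patch)
h90 = patch_histogram(np.rot90(patch))
same_after_rotation = np.array_equal(h0, h90)

# Keep the two tokens with the largest (made-up) importance scores.
scores = np.array([0.1, 0.9, 0.3, 0.7])
kept = select_top_k(scores, 2)
```

Rotations by arbitrary angles involve interpolation, so in that case the histogram is preserved only approximately; how the paper handles this, and how the statistics enter self-attention, is not specified in the abstract.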
