Poster

TAG: Text Prompt Augmentation for Zero-Shot Out-of-Distribution Detection

Xixi Liu · Christopher Zach

Strong blind review: This paper was not made available on public preprint services during the review process

Strong Double Blind

2024 Poster

Paper PDF [ Poster] [ Supplemental]

Abstract

Out-of-distribution (OOD) detection has been extensively studied for the reliable deployment of deep-learning models. Despite great progress in this research direction, most works focus on discriminative classifiers and perform OOD detection based on single-modal representations that consist of either visual or textual features. Moreover, they rely on training with in-distribution (ID) data. The emergence of vision-language models (e.g. \CLIPc) allows to perform zero-shot OOD detection by leveraging multi-modal feature embeddings and therefore only rely on labels defining ID data. Several approaches have been devised but these either need a given OOD label set, which might deviate from real OOD data, or fine-tune CLIP, which potentially has to be done for different ID datasets. In this paper, we first adapt various OOD scores developed for discriminative classifiers to \CLIP. Further, we propose an enhanced method named \emph{TAG} based on Text prompt AuGmentation to amplify the separation between ID and OOD data, which is simple but effective, and can be applied on various score functions. Its performance is demonstrated on CIFAR-100 and large-scale ImageNet-1k OOD detection benchmarks. It consistently improves AUROC and FPR95 on CIFAR-100 across five commonly used architectures over four baseline OOD scores. The average AUROC and FPR95 improvements are 6.35 % and 10.67 %, respectively. The results for ImageNet-1k follow a similar, but less pronounced pattern.

Chat is not available.