

Poster

UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening

Siyuan Cheng · Guangyu Shen · Kaiyuan Zhang · Guanhong Tao · Shengwei An · Hanxi Guo · Shiqing Ma · Xiangyu Zhang

Strong Double Blind: this paper was not made available on public preprint services during the review process.
Thu 3 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Deep neural networks (DNNs) have demonstrated effectiveness in various fields. However, DNNs are vulnerable to backdoor attacks, which inject a unique pattern, called a trigger, into the input to cause misclassification to an attack-chosen target label. While existing works have proposed various methods to mitigate backdoor effects in poisoned models, they tend to be less effective against recent advanced attacks. In this paper, we introduce UNIT, a novel post-training defense technique that can effectively remove backdoors for a variety of attacks. Specifically, UNIT approximates a unique and tight activation distribution for each neuron in the model. It then proactively suppresses substantially large activation values that exceed the approximated boundaries. Our experimental results demonstrate that UNIT outperforms 9 popular defense methods against 14 existing backdoor attacks, including 2 advanced attacks, using only 5% of clean training data.
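The clipping idea can be illustrated concretely. The following is a minimal PyTorch sketch, not the authors' implementation: it approximates a per-channel upper bound from a small clean set using a simple quantile (where UNIT instead tightens a learned per-neuron distribution) and installs a forward hook that suppresses activations above that bound. The function names and the quantile heuristic are illustrative assumptions.

    # Hypothetical sketch of per-neuron activation bounding in the spirit of
    # UNIT. A per-channel clean-data quantile stands in for UNIT's tightened
    # per-neuron distribution bounds.
    import torch
    import torch.nn as nn

    @torch.no_grad()
    def estimate_bounds(model, layer, clean_loader, device, q=0.99):
        """Collect activations of `layer` on a small clean set and take a
        per-channel upper quantile as the clipping boundary (a stand-in
        for UNIT's approximated tight bounds)."""
        acts = []
        handle = layer.register_forward_hook(
            # per-sample, per-channel max over spatial positions
            lambda m, inp, out: acts.append(out.detach().flatten(2).amax(-1))
        )
        for x, _ in clean_loader:
            model(x.to(device))
        handle.remove()
        return torch.cat(acts).quantile(q, dim=0)  # shape: [channels]

    def attach_clipper(layer, bounds):
        """Clip each channel's activations to its estimated upper bound,
        suppressing the abnormally large values a trigger tends to induce."""
        def clip_hook(module, inp, out):
            return torch.minimum(out, bounds.view(1, -1, 1, 1).to(out.device))
        return layer.register_forward_hook(clip_hook)

In this sketch, attaching the clipper to a convolutional layer and re-evaluating on held-out clean data gives a quick check that the bounds are loose enough to preserve benign accuracy while cutting off trigger-induced activation spikes.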
