

Poster

WBP: Training-time Backdoor Attacks through Hardware-based Weight Bit Poisoning

Kunbei Cai · Zhenkai Zhang · Qian Lou · Fan Yao

Strong Double Blind: This paper was not made available on public preprint services during the review process.
Thu 3 Oct 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

Pre-trained models are widely used in machine learning (ML) due to their minimal demand for computational resources and training data. Recent studies show that pre-trained models are vulnerable to backdoor attacks. Additionally, prior studies on hardware security have indicated that ML systems can be compromised through bit-flip attacks using Rowhammer. In this paper, we introduce \textbf{WBP} (i.e., weight bit poisoning), a novel attack framework that allows an attacker to implant a task-agnostic backdoor into the victim model \emph{during the fine-tuning process} through a limited number of \emph{weight bit flips}. Notably, WBP aims to directly maximize the distance between the output representations of normal and triggered inputs. We evaluate WBP on state-of-the-art CNNs and Vision Transformer models across a variety of downstream tasks. Our experimental results demonstrate that, without any prior knowledge of the fine-tuning datasets, WBP can compromise a wide range of downstream tasks with a 99.3% attack success rate on average by flipping as few as 11 bits among millions of parameters.
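To make the task-agnostic objective concrete, the sketch below illustrates the general idea of maximizing the gap between clean and triggered output representations and ranking weights by gradient sensitivity as candidates for bit flipping. This is a minimal, hypothetical illustration, not the authors' actual WBP procedure: the helper names (apply_trigger, representation_gap, most_sensitive_weight), the trigger pattern, and the toy backbone are all assumptions made for demonstration, and a real attack would additionally map the chosen weight to its bit-level representation and flip a specific bit via Rowhammer.

```python
# Hedged sketch of a representation-gap objective and gradient-based weight ranking.
# All function names, the trigger, and the toy model are hypothetical; see the paper
# for the actual WBP bit-search procedure.
import torch
import torch.nn as nn


def apply_trigger(x, patch_value=1.0, size=4):
    """Stamp a small square patch (a stand-in trigger) onto a batch of images."""
    x = x.clone()
    x[:, :, :size, :size] = patch_value
    return x


def representation_gap(model, x_clean):
    """Negative mean L2 distance between clean and triggered output representations.
    Minimizing this loss maximizes the representation gap (task-agnostic objective)."""
    z_clean = model(x_clean)
    z_trig = model(apply_trigger(x_clean))
    return -torch.norm(z_clean - z_trig, dim=1).mean()


def most_sensitive_weight(model, x_clean):
    """Rank weights by gradient magnitude of the gap loss and return the top candidate.
    A bit-level attack would then inspect the candidate's floating-point encoding and
    pick the single bit flip that best reduces the loss."""
    model.zero_grad()
    loss = representation_gap(model, x_clean)
    loss.backward()
    best = None
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        idx = p.grad.abs().argmax()
        score = p.grad.abs().flatten()[idx].item()
        if best is None or score > best[0]:
            best = (score, name, idx.item())
    return best


if __name__ == "__main__":
    # Toy feature extractor standing in for a pre-trained CNN/ViT backbone.
    model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 16))
    x = torch.rand(8, 3, 32, 32)
    score, name, idx = most_sensitive_weight(model, x)
    print(f"candidate weight for bit flipping: {name}[{idx}] (|grad| = {score:.4f})")
```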
