Diffusion models have shown superior performance on unsupervised anomaly detection tasks. Since trained with normal data only, diffusion models tend to reconstruct normal counterparts of test images with certain noises added. However, these methods treat all potential anomalies equally, which may cause two main problems. On the one hand, the difficulty of reconstructing images with different anomalies is uneven. For example, adding back a missing element is harder than dealing with a scratch, thus requiring larger denoising steps in diffusion models. Therefore, instead of utilizing the same setting for all samples, we propose to predict a particular denoising step for each sample by evaluating the difference between image contents and the priors extracted from diffusion models. On the other hand, even in the same image, reconstructing abnormal regions differs from normal areas. Theoretically, the diffusion model predicts a noise for each step, typically following a standard Gaussian distribution. However, due to the difference between the anomaly and the potential normal sample, the predicted noise in abnormal regions will inevitably deviate from the standard Gaussian distribution. To this end, we propose introducing synthetic abnormal samples in training to encourage the diffusion models to break through the limitation of standard Gaussian distribution, and an adaptive feature fusion scheme is utilized during inference. With the above modifications, we propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection, which introduces appealing flexibility and achieves anomaly-free reconstruction while retaining as much normal information as possible. Extensive experiments are conducted on two commonly used anomaly detection datasets (MVTec-AD and MPDD), showing the effectiveness of the proposed method. The source code and pre-trained models will be publicly available.
Live content is unavailable. Log in and register to view live content