Robustness is the most important property of watermarking schemes. In practice, the watermarking mechanism shall be robust to both geometric and non-geometric distortions. In deep learning-based watermarking frameworks, robustness can be ensured by end-to-end training with different noise layers. However, most of the current CNN-based watermarking frameworks, even trained with targeted distortions, cannot well adapt to geometric distortions due to the architectural design. Since the traditional convolutional layer's position structure is relatively fixed, it lacks the flexibility to capture the influence of geometric distortion, making it difficult to train for corresponding robustness. To address such limitations, we propose a Swin Transformer and Deformable Convolutional Network (DCN)-based watermark model backbone. The attention mechanism and the deformable convolutional window effectively improve the feature processing flexibility, greatly enhancing the robustness, especially for geometric distortions. Besides, for non-geometric distortions, aiming at improving the generalizability for more distortions, we also provide a distortion-style-ensembled noise layer, including an image encoder, an image decoder, and distortion-style layers that can effectively simulate styles of different kinds of distortions. In the final watermark model training stage, we can simply train with our proposed noise layer for overall robustness. Extensive experiments illustrate that compared to existing state-of-the-art (SOTA) works, with better visual quality, our method achieves explicitly superior performance under tested geometric distortions, with 100.00% watermark extraction accuracy in almost all cases. Relevant codes will be released after review.
Live content is unavailable. Log in and register to view live content