In this paper, we introduce D4-VTON, a novel solution for image-based virtual try-on that seamlessly replaces a person's original garments with target garments while preserving pose and identity. We address two challenges encountered in prior work: artifacts caused by inaccurate clothing parsers and the failure to ensure faithful semantic alignment. We additionally tackle the difficulties diffusion models face in this specific task, which combines inpainting and denoising. To achieve these goals, we employ two self-contained techniques. First, we propose a Dynamic Group Warping Module (DGWM) that disentangles semantic information and guides the warping flows toward authentic warped garments. Second, we deploy a Differential Noise Restoration Process (DNRP) that captures the differential noise between the incomplete try-on input and its complete counterpart, yielding lifelike final results with negligible overhead. Extensive experiments demonstrate that D4-VTON surpasses state-of-the-art methods both quantitatively and qualitatively by a significant margin, showcasing its strength in generating realistic images with precise semantic alignment.
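The abstract does not include code, but the group-wise warping idea behind DGWM can be illustrated with a minimal PyTorch-style sketch: feature channels are split into groups, and each group is warped by its own flow field. All names, shapes, and parameters here (`group_warp`, `flows`, `num_groups`) are our own assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def group_warp(garment_feat, flows, num_groups=4):
    """Hypothetical sketch of group-wise warping.

    garment_feat: (B, C, H, W) garment features; C assumed divisible by num_groups.
    flows:        (B, num_groups, 2, H, W) per-group offset fields in
                  normalized [-1, 1] coordinates.
    """
    B, C, H, W = garment_feat.shape
    chunks = garment_feat.chunk(num_groups, dim=1)  # one channel group per flow

    # Base identity sampling grid in [-1, 1], shape (B, H, W, 2).
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(B, -1, -1, -1)
    base = base.to(garment_feat)

    warped = []
    for g, feat in enumerate(chunks):
        # Offset the base grid by this group's flow, then resample the features.
        offset = flows[:, g].permute(0, 2, 3, 1)  # (B, H, W, 2)
        warped.append(F.grid_sample(feat, base + offset, align_corners=True))
    return torch.cat(warped, dim=1)
```

Under this reading, each group's flow can specialize to a different semantic region of the garment, which is one plausible way to realize the "disentangled" warping the abstract describes.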