Neural Radiance Fields (NeRF) have achieved remarkable progress on dynamic scenes with deformable objects. Nonetheless, most previous works require multi-view inputs or long training times (several hours), making them hard to apply to real-world scenarios. Recently, a series of works has addressed the blurry artifacts that appear in novel views synthesized from monocular input of dynamic scenes. However, these methods may fail to predict stable and accurate deformations while preserving high-frequency details when rendering at varying resolutions. To this end, we introduce DMiT (Deformable Mipmapped Tri-Plane), a novel framework that adopts mipmaps to render dynamic scenes at various resolutions from novel views. With the help of hierarchical mipmapped tri-planes, we incorporate an MLP to effectively predict a mapping from the observation space to the canonical space, enabling not only high-fidelity dynamic scene rendering but also high-performance training and inference. Moreover, we design a training scheme for joint geometry and deformation refinement that regularizes the canonical space and reconstructs high-quality geometry. Extensive experiments on both synthetic and real dynamic scenes demonstrate the efficacy and efficiency of our method.
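Only the abstract is available here, so the following is a minimal sketch, not the authors' implementation, of the general idea it describes: a deformation MLP that maps observed points (plus time) into a canonical space, composed with a feature lookup on a hierarchy of mipmapped tri-planes. All names (DeformationMLP, MipTriPlane), dimensions, and the naive averaging across mip levels are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformationMLP(nn.Module):
    """Maps a point in observation space (plus time) to canonical space.

    Hypothetical stand-in for the deformation MLP the abstract mentions.
    """
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # predicted per-point offset
        )

    def forward(self, x, t):
        # x: (N, 3) observed points, t: (N, 1) timestamps
        # canonical point = observed point + predicted offset
        return x + self.net(torch.cat([x, t], dim=-1))

class MipTriPlane(nn.Module):
    """Hierarchical (mipmapped) tri-plane feature grid in canonical space."""
    def __init__(self, feat_dim=16, base_res=256, num_levels=3):
        super().__init__()
        # one XY/XZ/YZ plane triplet per mip level, halving resolution per level
        self.levels = nn.ParameterList([
            nn.Parameter(0.01 * torch.randn(3, feat_dim, base_res >> l, base_res >> l))
            for l in range(num_levels)
        ])

    def forward(self, x):
        # x: (N, 3) canonical points in [-1, 1]^3,
        # projected onto the three axis-aligned planes
        coords = torch.stack([x[:, [0, 1]], x[:, [0, 2]], x[:, [1, 2]]])  # (3, N, 2)
        feats = []
        for planes in self.levels:
            # bilinear lookup on each plane: (3, C, N, 1) -> (N, 3*C)
            f = F.grid_sample(planes, coords.unsqueeze(2), align_corners=True)
            feats.append(f.squeeze(-1).permute(2, 0, 1).flatten(1))
        # naive fusion: average across mip levels; a real mipmapped renderer
        # would instead select/blend levels by each sample's pixel footprint
        return torch.stack(feats).mean(0)

# Usage: deform observed ray samples to canonical space, then query features
# that a downstream decoder would turn into density and color.
deform, triplane = DeformationMLP(), MipTriPlane()
pts = torch.rand(1024, 3) * 2 - 1     # sampled ray points in observation space
t = torch.full((1024, 1), 0.5)        # normalized frame time
features = triplane(deform(pts, t))   # (1024, 3 * feat_dim) canonical features
```

The key property this sketch tries to convey is that the deformation network only has to output a 3D mapping, while all appearance and geometry detail lives in the multi-resolution canonical tri-planes, which is what allows rendering at different resolutions by reading from different mip levels.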