Acquiring 3D data for fully-supervised monocular 3D hand reconstruction is often difficult, as it requires specialized equipment operated in a controlled environment. Yet the fundamental principles governing hand movement are well established through our understanding of the human hand's unique structure and function. In this paper, we leverage these foundational insights to train 3D hand reconstruction models. Specifically, we systematically study hand knowledge from different sources, including hand biomechanics, functional anatomy, and physics, and incorporate it into the reconstruction models through a corresponding set of differentiable training losses. Instead of relying on 3D supervision, we consider a challenging weakly-supervised setting in which models are trained solely on 2D hand landmark annotations, which are considerably more accessible in practice. Moreover, unlike existing methods that neglect the inherent uncertainty in image observations, we explicitly model this uncertainty and enhance training with a simple yet effective Negative Log-Likelihood (NLL) loss that incorporates the captured uncertainty into the objective. Through extensive experiments, we demonstrate that our method significantly outperforms state-of-the-art weakly-supervised methods; for example, it achieves nearly a 21% performance improvement on the widely adopted FreiHAND dataset.
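To illustrate the kind of uncertainty-aware objective the abstract refers to, the sketch below shows a heteroscedastic Gaussian NLL over predicted 2D landmarks. This is only a minimal illustration of the general technique, not the paper's exact formulation; the function and tensor names (e.g. `gaussian_nll_2d`, `pred_log_var`) are hypothetical.

```python
import torch

def gaussian_nll_2d(pred_kpts, pred_log_var, gt_kpts):
    """Hypothetical sketch: per-coordinate Gaussian NLL on 2D hand landmarks.

    pred_kpts:    (B, K, 2) predicted 2D keypoints
    pred_log_var: (B, K, 2) predicted log-variance (the modeled uncertainty)
    gt_kpts:      (B, K, 2) 2D landmark annotations (the weak supervision)
    """
    # NLL of an independent Gaussian per coordinate, constants dropped:
    #   0.5 * (log sigma^2 + (x - mu)^2 / sigma^2)
    sq_err = (gt_kpts - pred_kpts) ** 2
    nll = 0.5 * (pred_log_var + sq_err * torch.exp(-pred_log_var))
    return nll.mean()
```

Predicting a log-variance rather than the variance itself keeps the term numerically stable and lets the network down-weight landmarks it is uncertain about while being penalized for claiming high uncertainty everywhere.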