This work introduces MotionLCM, which extends controllable motion generation to real-time performance. Existing methods for spatial-temporal control in text-conditioned motion generation suffer from significant runtime inefficiency. To address this issue, we first propose the motion latent consistency model (MotionLCM), built upon the motion latent diffusion model (MLD). By employing one-step (or few-step) inference, we further improve the runtime efficiency of motion latent diffusion. To ensure effective controllability, we incorporate a motion ControlNet within the latent space of MotionLCM, allowing explicit control signals to directly influence the generation process, much as control signals guide latent-free diffusion models for motion generation. Together, these techniques enable real-time generation of human motion from text conditions and control signals. Experimental results demonstrate the remarkable generation and control capabilities of MotionLCM while maintaining real-time efficiency.
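To make the two key ideas concrete, the following is a minimal, self-contained sketch of few-step latent consistency sampling with a ControlNet-style branch that injects control residuals in the latent space. All module names, shapes, and the re-noising schedule (TextEncoder-style embedding, LatentDenoiser, MotionControlNet, the stand-in decoder) are illustrative assumptions, not the released MotionLCM implementation.

```python
# Hypothetical sketch: few-step latent consistency sampling with a ControlNet-style
# branch operating in the motion latent space. Shapes and module names are assumptions.
import torch
import torch.nn as nn

LATENT_DIM, TEXT_DIM, MOTION_DIM, FRAMES = 256, 512, 263, 196

class LatentDenoiser(nn.Module):
    """Consistency function f(z_t, t, text): maps a noisy latent directly to a clean latent."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM + TEXT_DIM + 1, 512),
                                 nn.SiLU(), nn.Linear(512, LATENT_DIM))
    def forward(self, z_t, t, text_emb, control_residual=None):
        h = torch.cat([z_t, text_emb, t.expand(z_t.size(0), 1)], dim=-1)
        out = self.net(h)
        if control_residual is not None:   # control signal injected directly in latent space
            out = out + control_residual
        return out

class MotionControlNet(nn.Module):
    """Encodes sparse spatial control signals (e.g., a joint trajectory) into latent residuals."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FRAMES * 3, 512), nn.SiLU(),
                                 nn.Linear(512, LATENT_DIM))
    def forward(self, trajectory):          # trajectory: (B, FRAMES, 3)
        return self.net(trajectory.flatten(1))

@torch.no_grad()
def few_step_sample(denoiser, controlnet, decoder, text_emb, trajectory,
                    timesteps=(999, 499, 0)):
    """Few-step consistency sampling: each step predicts a clean latent, then re-noises
    toward the next (lower) timestep; one-step inference uses a single timestep."""
    z = torch.randn(text_emb.size(0), LATENT_DIM)
    control = controlnet(trajectory)
    for i, t in enumerate(timesteps):
        t_scaled = torch.tensor([[t / 1000.0]])
        z0 = denoiser(z, t_scaled, text_emb, control_residual=control)
        if i + 1 < len(timesteps):          # re-noise for the next consistency step
            sigma = timesteps[i + 1] / 1000.0
            z = z0 + sigma * torch.randn_like(z0)
        else:
            z = z0
    return decoder(z)                        # decode latent -> motion features

decoder = nn.Linear(LATENT_DIM, MOTION_DIM * FRAMES)   # stand-in for a motion VAE decoder
motion = few_step_sample(LatentDenoiser(), MotionControlNet(), decoder,
                         text_emb=torch.randn(1, TEXT_DIM),
                         trajectory=torch.zeros(1, FRAMES, 3))
print(motion.shape)  # (1, MOTION_DIM * FRAMES)
```

Because sampling here collapses to one to three denoiser evaluations in a compact latent space, the per-sequence cost is dominated by a handful of forward passes, which is what makes real-time, controllable generation plausible under this design.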