Poster

Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders

Lucas Stoffl ⋅ Andy Bonnetto ⋅ Stéphane D'Ascoli ⋅ Alexander Mathis

Strong blind review: This paper was not made available on public preprint services during the review process


2024 Poster


Abstract

Natural behavior is hierarchical, yet large-scale benchmarks that address this aspect are scarce. To fill this gap, we create a novel synthetic basketball playing benchmark (Shot7M2). Beyond synthetic data, we extend BABEL into a hierarchical action segmentation benchmark (hBABEL). We then develop a masked autoencoder framework (hBehaveMAE) to elucidate the hierarchical nature of motion capture data in an unsupervised fashion. We find that hBehaveMAE learns interpretable latents on Shot7M2 and hBABEL: lower encoder levels better represent fine-grained movements, while higher encoder levels capture complex actions and activities. Additionally, we evaluate hBehaveMAE on MABe22, a representation learning benchmark with short- and long-term behavioral states. There, hBehaveMAE achieves state-of-the-art performance without domain-specific feature extraction. Together, these contributions help reveal the hierarchical organization of natural behavior. Models and benchmarks are available at https://github.com/amathislab/BehaveMAE
