

E3V-K5: An Authentic Benchmark for Redefining Video-Based Energy Expenditure Estimation

Shengxuming Zhang · Lei Jin · Yifan Wang · Xinyu Wang · Xu Wen · Zunlei Feng · Mingli Song

Strong Double Blind: This paper was not made available on public preprint services during the review process.
Tue 1 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract: Accurately estimating energy expenditure (EE) is crucial for optimizing athletic training, monitoring daily activity levels, and preventing sports-related injuries. Estimating energy expenditure from video (E$^\mathit{3}$V) is an appealing research direction. This paper introduces E3V-K5, an authentic dataset of sports videos that significantly enhances the accuracy of EE estimation. The dataset comprises 16,526 video clips covering a variety of sports categories and intensities, with continuous calorie readings obtained from the COSMED K5 indirect calorimeter, recognized as the most reliable standard in sports research. Augmented with the heart rate and physical attributes of each subject, E3V-K5 surpasses all previous E$^\mathit{3}$V video datasets in volume, diversity, and authenticity, making it a cornerstone for this field and facilitating future research. Furthermore, we propose E3SFormer, a novel approach specifically designed for the E3V-K5 dataset that estimates EE from human skeleton data. E3SFormer consists of two Transformer branches that perform action recognition and EE regression simultaneously. The joint attention learned by the action recognition branch is used to assist the EE regression branch. Extensive experiments validate E3SFormer's effectiveness, demonstrating superior performance over existing skeleton-based action recognition models. Our dataset and code will be publicly accessible.
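
The idea of letting joint attention from a recognition branch guide an EE regression branch can be illustrated with a toy sketch. This is not the authors' E3SFormer implementation, which the abstract does not specify. It assumes, purely for illustration, that the attention is a softmax over per-joint scores, that it is applied as a weighted pooling of per-joint regression features, and that a linear head produces the EE value. All function names, shapes, and numbers here are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_guided_pool(joint_features, joint_scores):
    """Pool per-joint feature vectors using attention weights.

    joint_features: J feature vectors of D floats each, standing in for
        the per-joint features of an EE regression branch (assumption).
    joint_scores: J raw attention scores, standing in for the joint
        attention of an action recognition branch (assumption).
    """
    weights = softmax(joint_scores)
    dim = len(joint_features[0])
    pooled = [0.0] * dim
    for w, feat in zip(weights, joint_features):
        for d in range(dim):
            pooled[d] += w * feat[d]
    return pooled, weights


def regress_ee(pooled, head_weights, bias=0.0):
    """Hypothetical linear head mapping pooled features to one EE value."""
    return sum(p * w for p, w in zip(pooled, head_weights)) + bias


# Toy example: 3 joints with 2-dimensional features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores give uniform attention
pooled, weights = attention_guided_pool(features, scores)
ee = regress_ee(pooled, head_weights=[3.0, 3.0])
```

With equal scores every joint contributes equally. Raising one joint's score shifts the pooled feature, and therefore the regressed value, toward that joint, which is the intuition behind sharing attention between the two branches.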
