ECCV Poster Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking

Jiyao Zhang · Weiyao Huang · Bo Peng · Mingdong Wu · Fei Hu · Zijian Chen · Bo Zhao · Hao Dong

2024 Poster

Abstract:

6D object pose estimation is a critical yet challenging task in computer vision, distinguished from more traditional 2D tasks by its lack of large-scale datasets. This scarcity hampers comprehensive evaluation of model performance and consequently limits research development; the limited number of instances and categories available also restricts the applicability of research across diverse domains. To address these issues and facilitate progress in this field, this paper introduces Omni6DPose, a substantial dataset characterized by its diversity in object categories, large scale, and variety in object materials. Omni6DPose is divided into three main components: ROPE (Real 6D Object Pose Estimation Dataset), which includes 270,000 images annotated with over one million annotations across 600 instances in 140 categories; SOPE (Simulated 6D Object Pose Estimation Dataset), consisting of 350,000 images created in a mixed-reality setting with depth simulation, annotated with over one million annotations across 4,000 instances in the same 140 categories; and the manually aligned real scanned objects used in both ROPE and SOPE. Omni6DPose is inherently challenging due to its substantial variations and ambiguities. To address this challenge, we propose GenPose++, a novel framework that incorporates a pre-trained 2D foundation model to enhance generalization and employs a diffusion-based generative approach to handle ambiguity. This paper also provides a comprehensive benchmarking analysis that evaluates the performance of previous methods on this large-scale dataset for both 6D object pose estimation and pose tracking.
