Poster

MegaScenes: Scene-Level View Synthesis at Scale

Joseph Tung · Gene Chou · Ruojin Cai · Guandao Yang · Kai Zhang · Gordon Wetzstein · Bharath Hariharan · Noah Snavely

Tue 1 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics applications. Recently, pose-conditioned diffusion models have led to significant progress by extracting 3D information from 2D foundation models, but these methods are limited by the lack of scene-level training data. Common dataset choices consist either of isolated objects (Objaverse) or of object-centric scenes with limited pose distributions (DTU, CO3D). In this paper, we create a large-scale scene-level dataset from Internet photo collections, called MegaScenes, which contains over 100K SfM reconstructions from around the world. Internet photos represent a scalable data source but come with challenges such as varying lighting and transient objects. We address these issues to further create a subset suitable for the task of NVS. Additionally, we analyze failure cases of state-of-the-art NVS methods and significantly improve generation consistency. Through extensive experiments, we validate the effectiveness of both our dataset and method on generating in-the-wild scenes.
