Poster

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

Sehwan Choi ⋅ Jun Won Choi ⋅ Jungho Kim ⋅ Hongjae Shin

Strong blind review: This paper was not made available on public preprint services during the review process


2024 Poster


Abstract

Predicting vectorized high-definition (HD) maps online is useful for autonomous driving, as such maps provide detailed geometric and semantic information about the surrounding road environment. In this paper, we introduce Mask2Map, a novel end-to-end online HD map construction method. Our approach identifies semantic components within a scene represented in the bird's eye view (BEV) domain and then generates a precise vectorized map topology based on this information. Mask2Map comprises two main components: an Instance-level Mask Prediction Network (IMPNet) and a Mask-Driven Map Prediction Network (MMPNet). IMPNet generates a mask-aware query capable of producing BEV segmentation masks, while MMPNet constructs vectorized map components by leveraging the semantic and geometric information provided by the mask-aware query. To enhance HD map predictions, we design modules for MMPNet that build on the outputs of IMPNet. We present a Positional Feature Generator that generates instance-level positional features by utilizing the comprehensive spatial context from the semantic components of each instance. We also propose a Geometric Feature Extractor that extracts point-level geometric features using sparse key points pooled from the segmentation masks. Furthermore, we present a denoising training strategy for inter-network consistency to boost map construction performance. Our evaluation on the nuScenes and Argoverse2 benchmarks demonstrates that Mask2Map improves over previous state-of-the-art methods by 10.1 mAP and 4.1 mAP, respectively. The code will be available soon.
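As a rough illustration of what "sparse key points pooled from the segmentation masks" could look like, the following Python sketch picks a few foreground cells from a binary BEV mask and gathers the feature vectors at those cells. This is not the authors' implementation: the function names `pool_key_points` and `gather_point_features`, the evenly strided sampling rule, and the nested-list data layout are all assumptions made for illustration only.

```python
from typing import List, Tuple

Mask = List[List[int]]                 # binary BEV mask: mask[row][col] in {0, 1}
FeatureMap = List[List[List[float]]]   # BEV features: features[row][col] -> channel vector
Point = Tuple[int, int]                # (row, col) index of a BEV cell


def pool_key_points(mask: Mask, num_points: int) -> List[Point]:
    """Pick up to num_points foreground cells, evenly strided in row-major order.

    Hypothetical sampling rule; the paper may pool key points differently.
    """
    foreground = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not foreground or num_points <= 0:
        return []
    if len(foreground) <= num_points:
        return foreground
    # Evenly spaced indices over the list of foreground cells.
    step = len(foreground) / num_points
    return [foreground[int(i * step)] for i in range(num_points)]


def gather_point_features(features: FeatureMap, points: List[Point]) -> List[List[float]]:
    """Look up the feature vector stored at each key point."""
    return [features[r][c] for r, c in points]


# Toy example: a 4x4 mask containing one diagonal map instance.
mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
# Toy feature map whose 2-channel feature at (r, c) is simply [r, c].
features = [[[float(r), float(c)] for c in range(4)] for r in range(4)]

key_points = pool_key_points(mask, num_points=2)
point_features = gather_point_features(features, key_points)
```

In this toy setup, the point-level features are just the feature-map entries at the sampled mask cells; a real model would typically sample from learned BEV features and pass the result to later network stages.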
