The informative power of traffic analysis can be enhanced by considering changes in both time and space. Vehicle tracking algorithms applied to drone videos provide a better overview than street-level surveillance cameras. However, existing aerial MOT datasets cover only stationary settings, leaving performance in moving-camera scenarios, which cover a considerably larger area, unknown. To fill this gap, we present VETRA, a dataset for vehicle tracking in aerial imagery that introduces heterogeneity in camera movement, frame rate, and the type, size, and number of objects. When dealing with these challenges, state-of-the-art online MOT algorithms exhibit a significant decrease in performance compared to other benchmark datasets. Despite the performance gains achieved by our baseline method through the integration of camera motion compensation, there remains potential for improvement, particularly in situations with visually similar vehicles, prolonged occlusions, and complex urban driving patterns. Making VETRA available to the community adds a missing building block for both testing and developing vehicle tracking algorithms for versatile real-world applications.