**One Thousand and One Hours: Self-driving Motion Prediction Dataset**
John Houston (Lyft); Guido Zuidhof (Lyft); Luca Bergamini (Lyft); Yawei Ye (Lyft); Long Chen (Lyft); Ashesh Jain (Lyft); Sammy Omari (Lyft); Vladimir Iglovikov (Lyft); Peter Ondruska (Lyft)*
Motivated by the impact of large-scale datasets on ML systems, we present the largest self-driving dataset for motion prediction to date, containing over 1,000 hours of data. The data was collected by a fleet of 20 autonomous vehicles along a fixed route in Palo Alto, California, over a four-month period. It consists of 170,000 scenes, each 25 seconds long, that capture the perception output of the self-driving system, which encodes the precise positions and motions of nearby vehicles, cyclists, and pedestrians over time. In addition, the dataset contains a high-definition semantic map with 15,242 labelled elements and a high-definition aerial view of the area. We show that using a dataset of this size dramatically improves performance for key self-driving problems. Combined with the provided software kit, this collection forms the largest and most detailed dataset to date for the development of self-driving machine learning tasks, such as motion forecasting, motion planning and simulation.
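
As a quick sanity check on the headline figure, the scene count and scene length quoted in the abstract can be multiplied out. This is only a sketch using the rounded numbers above; the constant and function names are illustrative assumptions, and the exact total in the released data may differ from this nominal value.

```python
# Nominal figures quoted in the abstract (rounded, not exact dataset statistics).
NUM_SCENES = 170_000
SCENE_LENGTH_S = 25  # seconds per scene


def total_hours(num_scenes: int, scene_length_s: float) -> float:
    """Convert a scene count and per-scene duration into total hours."""
    return num_scenes * scene_length_s / 3600


hours = total_hours(NUM_SCENES, SCENE_LENGTH_S)
print(f"Nominal dataset duration: {hours:.1f} hours")
```

The nominal total is consistent with the "over 1,000 hours" stated in the abstract.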