Lyft Makes Self-Driving Research Public

Lyft has released self-driving data in a bid to accelerate the development of autonomous cars, a technology seen as critical to the ride-hailing giant’s future viability.

The company is releasing a subset of its autonomous driving data, the Level 5 Dataset, and will be sponsoring a research competition.

The Level 5 Dataset includes over 55,000 human-labeled 3D annotated frames, a drivable surface map, and an underlying HD spatial semantic map to contextualize the data.

Autonomous vehicle research requires costly data that is out of reach for most academic teams. Sensor hardware must be built and properly calibrated, a localization stack is needed, and an HD semantic map must be created. Only then can researchers unlock higher-level functionality like 3D perception, prediction, and planning.

"At Lyft Level 5, we’ve been perfecting our hardware and autonomy stack for the last two years. We want to share some of the data we have collected or derived along the way, to help level the playing field for all researchers interested in autonomous technology," said Lyft's Luc Vincent, EVP Autonomous Technology.

Lyft currently operates a self-driving employee shuttle, and the company's fleet is accruing tens of thousands of autonomous miles to train its system.

The company is already iterating on the third generation of its self-driving car and has built a perception suite, patenting a new sensor array and a proprietary ultra-high dynamic range (100+ dB) camera.

Because HD mapping is crucial to autonomous vehicles, Lyft's teams in Munich and Palo Alto have been building high-quality lidar-based geometric maps as well as high-definition semantic maps that are used by the autonomy stack.

Meanwhile, Lyft's team in London (formerly Blue Vision Labs) has been working to harness the scale of the Lyft fleet to build high-quality, cost-effective geometric maps, using only a camera phone to capture the source data. This effort is essential for Lyft to build large-scale mapping and data collection infrastructure.

Finally, Lyft’s autonomous platform team has been deploying partner vehicles on the Lyft network. With its partner Aptiv, Lyft has provided over 50,000 self-driving rides to Lyft passengers in Las Vegas, making this the largest paid commercial self-driving service in operation. Waymo vehicles are also now available on the Lyft network in Arizona, expanding the opportunity for Lyft's passengers to experience self-driving rides.

Lyft is also launching a competition for individuals to train algorithms on the dataset.

For reference, the Lyft Level 5 Dataset includes:

Over 55,000 human-labeled 3D annotated frames;
Data from 7 cameras and up to 3 lidars;
A drivable surface map; and,
An underlying HD spatial semantic map (including lanes, crosswalks, etc.)

Lyft's dataset makes it possible for researchers to work on a variety of problems, including prediction of agents over time; scene depth estimation from cameras with lidar as ground truth; object detection in 3D over the semantic map; scene segmentation using lidar and semantic maps; agent behavior classification; and many more.

Lyft has segmented this dataset into training, validation, and testing sets. The company will release the validation and testing sets once the competition opens.
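
Lyft has not said how it assigned frames to the three sets. For readers who want to build a comparable split on their own data, the sketch below shows one common approach: assigning whole scenes, rather than individual frames, to a set, so that frames from the same drive never appear in more than one set. The scene identifiers, the 80/10/10 ratios, and the function names are illustrative assumptions, not part of the Level 5 Dataset or its tooling.

```python
import hashlib


def assign_split(scene_id: str, train: float = 0.8, val: float = 0.1) -> str:
    """Map a (hypothetical) scene identifier to 'train', 'val', or 'test'.

    The mapping is deterministic: the same scene_id always lands in the
    same set. The default ratios are assumptions chosen for illustration.
    """
    digest = hashlib.sha256(scene_id.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an integer and scale it to [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"


def split_frames(frames):
    """Group (scene_id, frame_id) pairs into the three sets by scene."""
    splits = {"train": [], "val": [], "test": []}
    for scene_id, frame_id in frames:
        splits[assign_split(scene_id)].append((scene_id, frame_id))
    return splits
```

Hashing the scene identifier keeps the assignment stable as new scenes are added, although the resulting set sizes only approximate the target ratios.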

There will be $25,000 in prizes, and Lyft will be flying the top researchers to the NeurIPS Conference in December, as well as allowing the winners to interview with Lyft's team.