
Online to offline: Turning real-world experience into virtual tests

May 12, 2020 | 4 min. read

By the Aurora Team


On-road events become more impactful when we add them to our Virtual Testing Suite and use them to continually improve the Aurora Driver.

At Aurora, we’ve adopted a “smarter, not farther” approach to on-road testing. That is, instead of blindly pushing to drive more and more miles, we’ve continued to focus on collecting quality real-world data and on getting the most value out of every data point. For example, we amplify the impact of real-world experience by flagging interesting or novel events and incorporating them into our Virtual Testing Suite.

On-road events aren’t valuable as a measure of progress, but they can be incredibly valuable as learning opportunities. Our triage team reviews flagged events and then works with our engineering teams to identify which ones offer opportunities to improve the Aurora Driver.

One real-life situation can inspire tens or even hundreds of permutations in our Virtual Testing Suite, all of which can be continually used to fine-tune existing capabilities. In this way, one on-road experience becomes a multifaceted feedback loop for future versions of the Aurora Driver.

Read on to learn about Aurora’s "online-to-offline pipeline," our process for rapidly converting on-road events into virtual tests.

Why we need on-road testing

We drive the vast majority of our miles in our Virtual Testing Suite, which lets us make rapid progress at scale. Even so, high-quality real-world experience is still important for developing the Aurora Driver.

First, real-world tests allow us to assess whether successes in virtual testing translate to the road. For example, will other drivers understand the Aurora Driver’s intentions when merging? We can simulate what other actors might do in various situations, but it’s important to observe how they actually interact with our vehicles.

Second, real-world data is useful for thoroughly training and testing our perception system. While we’re working at the cutting edge of sensor simulation, it’s difficult to accurately simulate all artifacts of a particular sensor and the nuances of environmental conditions like dust, smog, and pollen. For now, real-world data remains important.

For example, the video below shows vehicle exhaust on a cold, rainy day in Pittsburgh. Early on, our perception system might have perceived the exhaust as an obstacle, causing the Aurora Driver to brake or nudge (adjust its trajectory) around it. We used real-world data from scenes like this to teach our perception system to identify and ignore exhaust, resulting in a better driving experience.

The Aurora Driver’s perception system requires real-world examples of environmental elements like exhaust to learn to recognize and respond to them appropriately.

Third, real-world testing helps us continually create new and more lifelike virtual tests. Taking inspiration from our experiences on the road means we don’t have to rely solely on engineers’ imagination to capture important edge cases in our Virtual Testing Suite.

On-road sources for virtual tests

The on-road events that we turn into virtual tests come from two sources:

  • Copilot annotations: Vehicle operator copilots, who provide support from the passenger’s seat, routinely flag experiences that are interesting, uncommon, or novel. We often recreate these in our Virtual Testing Suite to prepare the Aurora Driver for a diverse set of situations on the road.

  • Disengagements: Throughout this post, we use “disengagements” to mean instances when our vehicle operators proactively retake control because they believe an unsafe situation might occur or they don’t like how the vehicle is driving.

Types of virtual tests

As we’ve said before, our Virtual Testing Suite contains complementary tests that assess how the software works at every level. We therefore convert on-road events into one or more of the following types of virtual tests:

  • Perception tests: For example, say a bicyclist passes one of our vehicles. Specialists review log footage from the event and then label things like object category (bicyclist), velocity (3 mph), etc. We can then use that “absolute truth” to test how well new versions of perception software can determine what really happened on the road.

  • Manual driving evaluations: We assess how the Aurora Driver’s planned trajectory compares to the vehicle operator’s actual trajectory.

  • Simulations: Simulations are virtual models of the world where we can test how the Aurora Driver reacts in many permutations of the same situation. They also let us model a wide variety of interactions between the Aurora Driver and other actors in the virtual world. For example, when the Aurora Driver stops for a jaywalker, how will that jaywalker respond? And how will other actors in the simulation respond when the jaywalker crosses the street?
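
To make the first two test types more concrete, here is a minimal Python sketch. It is not Aurora’s implementation: the `LabeledObject` record, the 1 mph tolerance, the mean-displacement metric, and the sample coordinates are all assumptions chosen for illustration. Only the bicyclist and 3 mph values come from the example above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledObject:
    """Hypothetical label record: what specialists say was really on the road."""
    category: str
    speed_mph: float


def perception_matches(predicted, truth, speed_tolerance_mph=1.0):
    """Pass if the category matches and the speed is within tolerance of the label.

    The tolerance is an assumed value, not a published threshold.
    """
    return (
        predicted.category == truth.category
        and abs(predicted.speed_mph - truth.speed_mph) <= speed_tolerance_mph
    )


def mean_displacement(planned, driven):
    """Average distance between time-aligned (x, y) points on two trajectories.

    One simple (assumed) way to compare a planned trajectory with the
    trajectory the vehicle operator actually drove.
    """
    if len(planned) != len(driven) or not planned:
        raise ValueError("trajectories must be non-empty and the same length")
    return sum(math.dist(p, d) for p, d in zip(planned, driven)) / len(planned)


# Perception test: compare software output with the labeled bicyclist.
truth = LabeledObject("bicyclist", 3.0)
print(perception_matches(LabeledObject("bicyclist", 3.4), truth))
print(perception_matches(LabeledObject("pedestrian", 3.0), truth))

# Manual driving evaluation: planned path vs. operator path (made-up points).
planned_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
operator_path = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
print(mean_displacement(planned_path, operator_path))
```

A real evaluation would likely track many more attributes and use richer metrics; the point here is only that labeled logs and operator trajectories give each software version something fixed to be scored against.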

Aurora’s online-to-offline (O2O) pipeline

So how do we convert on-road events into virtual tests? By leveraging our "online-to-offline (O2O) pipeline."

To illustrate this process, let’s follow a disengagement that helped the Aurora Driver learn how to nudge. Again, nudging refers to times when the Aurora Driver adjusts its trajectory to move around obstacles, like large delivery trucks partially blocking the road.

1. On-road event

Vehicle operators annotate disengagements and flag scenes that are interesting, uncommon, or novel. In this situation, the Aurora Driver hesitates to nudge around a vehicle that suddenly veers out of the roadway and into an on-street parking space. To avoid disrupting traffic, our trained vehicle operators quickly take over and drive around the parking vehicle. 

2. Triage

Our triage team reviews on-road events and provides an initial diagnosis.

Triage watches logged footage from this disengagement and confirms that the Aurora Driver didn’t nudge smoothly.

3. Create virtual tests

We create one or more virtual tests, including perception tests, manual driving evaluations, and/or simulations.

This nudging situation was caused by an uncommon interaction (another vehicle suddenly pulling over after a turn), so it’s the kind of event we use to augment our Virtual Testing Suite. We used it as inspiration for 50 new nudging simulations, including:

  • A reproduction of the exact scene from the disengagement log footage.

  • Variations created by changing variables like how fast the vehicle in front of us is moving.


A composite of the simulations where we varied the speed of the parking vehicle.
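
Varying the parking vehicle’s speed, as described above, amounts to a parameter sweep over one scenario. The sketch below shows that idea in Python. `NudgeScenario`, its fields, and every speed value are hypothetical placeholders, not Aurora’s actual scenario format or data.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NudgeScenario:
    """Hypothetical scenario description; field names are illustrative only."""
    name: str
    parking_vehicle_speed_mph: float


def speed_variations(base, speeds_mph):
    """Return one copy of the base scenario per speed, leaving the base untouched."""
    return [
        replace(
            base,
            name=f"{base.name}_speed_{speed:g}mph",
            parking_vehicle_speed_mph=speed,
        )
        for speed in speeds_mph
    ]


# The exact reproduction of the logged scene (15 mph is a made-up value).
base = NudgeScenario("logged_pull_over", parking_vehicle_speed_mph=15.0)

# The reproduction plus variations that change only the other vehicle's speed.
variants = [base] + speed_variations(base, [5.0, 10.0, 20.0, 25.0])
for scenario in variants:
    print(scenario.name, scenario.parking_vehicle_speed_mph)
```

Keeping the base scenario immutable and deriving each variant from it means the exact reproduction stays available as a fixed reference while other variables are swept in the same way.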

4. Iterate

Engineers use our diverse Virtual Testing Suite to fine-tune new and existing capabilities. The engineering team used the nudging simulations inspired by this disengagement along with many other codebase tests (unit and regression), perception tests, and manual driving evaluations to fine-tune the Aurora Driver’s ability to nudge.

To date, we've run tens of millions of virtual tests on this capability. For example, the Aurora Driver has already practiced nudging more than 20 million times in simulation alone.

5. Test on the road

We test improvements in the real world and continue collecting useful data. Here’s the Aurora Driver nudging smoothly in complex situations.


Our O2O pipeline allows us to squeeze the maximum value out of every real-world mile and continually expand our Virtual Testing Suite. These types of investments have allowed us to make significant progress and will ultimately allow us to deliver the Aurora Driver safely, quickly, and broadly.

We’re hiring vehicle operators, triage specialists, and engineers to help us advance the Aurora Driver. Visit our Careers page to learn more about open positions and what it’s like to work at Aurora.
