Capability spotlight: Fault management
Aurora Driver | July 29, 2022
By Tony Stentz
Starting with a thought experiment
You are driving along on the highway when your vehicle’s check engine light comes on. Like most drivers, you figure it can wait and continue on your way. Then the light starts blinking, signaling a critical problem that requires immediate attention. If you continue to ignore the light, it’s possible that you could be putting yourself or other road users in danger. As an alert and responsible driver, you quickly pull over to a safe spot on the side of the road.
Modern vehicles are equipped with electronic control systems and onboard diagnostics to signal to the human driving the vehicle that something has gone awry. So when a problem does happen, human drivers can make smart, informed decisions about whether or not they feel it’s safe to continue driving. But what happens when the engine of an autonomous vehicle experiences a problem?
Unlike traditional vehicles, Aurora Driver-powered vehicles will not be able to rely on human intervention. Instead, the Aurora Driver must be able to detect, diagnose, and respond to anomalous conditions that interfere with operations and pose a safety risk. To that end, we recently announced the early release of the Aurora Driver’s Fault Management System, or FMS.
What is a fault?
A fault occurs when some part of the Aurora Driver’s system or the vehicle it’s controlling becomes impaired. A fault could include anything from an obstructed camera to a blown tire or an anomaly in the code of an autonomy sub-system.
A system is fault-tolerant if, in the event of a fault, it remains capable of safe operation—though perhaps with reduced capabilities. For the Aurora Driver, this means that the vehicle is still able to safely stop or pull over when something goes wrong.
Fault tolerance is built into the software, hardware, and embedded systems that make up the Aurora Driver. All of the safety-critical functions (including those embedded in perception, forecasting, motion planning, localization, control, steering, braking, communication, and power supply and distribution) continue to work even when something breaks.
What is fault management?
To operate safely in autonomy, the Aurora Driver will be equipped with a Fault Management System that actively detects and mitigates faults. The Aurora Driver’s FMS and fault tolerance support the second principle of our Safety Case Framework: Fail-Safe. Under this principle, we consider the Aurora Driver acceptably safe only if, even when parts of the autonomy system fail, it continues to behave in a way that does not endanger passengers or other road users. The FMS does this in three stages.
Detection: Aurora’s FMS is designed to actively monitor the health of the vehicle, including the self-driving software, sensors, and onboard computer. Each component of the Aurora Driver is constantly reporting diagnostic health checks to the other components, ensuring that all systems are meeting the right conditions for safe autonomous operation.
Diagnosis: When a fault is detected, the FMS will evaluate its severity and determine the impact it will have on the Aurora Driver’s ability to drive safely. If it is not safe to continue normal operations, the FMS will produce a mitigation strategy.
Response: The result of the diagnosis will trigger one of a number of mitigation strategies, such as continuing to drive but at a reduced speed or pulling over to the shoulder. The FMS will consider the state of the entire system to decide on the safest fault response. The Aurora Driver’s motion planner will then execute that strategy according to the vehicle’s environment.
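As a rough illustration of how the three stages fit together, here is a minimal Python sketch. It is not the Aurora Driver's implementation: the severity levels, the component names, and the mapping from severity to strategy are all assumptions made for this example. Only the two strategies named above (driving at reduced speed and pulling over) come from the text.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity levels; the article does not define any."""
    NONE = 0
    DEGRADED = 1
    CRITICAL = 2


class Mitigation(IntEnum):
    """Strategies ordered from least to most conservative."""
    CONTINUE = 0
    REDUCE_SPEED = 1
    PULL_OVER = 2


@dataclass(frozen=True)
class HealthReport:
    component: str                 # illustrative name, e.g. "front_center_lidar"
    healthy: bool                  # result of the component's health check
    severity_if_failed: Severity   # assumed to be known per component


def detect(reports):
    """Detection: keep only the reports from components that failed their check."""
    return [r for r in reports if not r.healthy]


def diagnose(faults):
    """Diagnosis: rate the overall impact as the worst severity among active faults."""
    return max((f.severity_if_failed for f in faults), default=Severity.NONE)


# Assumed mapping from diagnosed severity to strategy, for illustration only.
STRATEGY = {
    Severity.NONE: Mitigation.CONTINUE,
    Severity.DEGRADED: Mitigation.REDUCE_SPEED,
    Severity.CRITICAL: Mitigation.PULL_OVER,
}


def respond(reports):
    """Response: choose one strategy from the state of the whole system."""
    return STRATEGY[diagnose(detect(reports))]


# Example: two simultaneous faults; the more severe one decides the response.
reports = [
    HealthReport("front_center_lidar", healthy=False, severity_if_failed=Severity.CRITICAL),
    HealthReport("side_camera", healthy=False, severity_if_failed=Severity.DEGRADED),
    HealthReport("onboard_computer", healthy=True, severity_if_failed=Severity.CRITICAL),
]
print(respond(reports).name)  # PULL_OVER
```

Taking the worst severity across all active faults is one simple way to model "considering the state of the entire system". A real system would likely also account for combinations of faults and for the driving environment, which this sketch leaves out.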
Safely testing fault management
Like all of the Aurora Driver’s software, the FMS was developed and extensively tested in our Virtual Testing Suite before being validated on closed tracks and then on public roads. This process allows us to quickly and responsibly implement new capabilities.
In June, we began testing the FMS on public highways in Texas. To do so safely, Aurora engineers created artificial faults of varying levels of severity and significance. When these faults are injected into the system, the Aurora Driver responds to them as though they were real. Here is how one such test unfolds:
1. To initiate a test, one of the two vehicle operators in the truck injects a fault, using the autonomy interface to cut off data from one of the Aurora Driver's sensors.
2. The FMS sub-systems monitoring the vehicle's sensors immediately detect that there is an issue with perception and locate the cause—missing data from the front center lidar sensor.
3. The FMS evaluates the functional integrity of the Aurora Driver, determines that loss of the front center lidar sensor could pose a critical safety risk, and triggers a fault mitigation strategy to minimize that risk.
4. The Aurora Driver executes the fault mitigation strategy by first activating the truck’s hazard lights, reducing its speed, and beginning to look for a safe place on the side of the road to pull over.
5. When it finds a safe place with a wide-enough shoulder, the Aurora Driver maneuvers the vehicle onto the side of the road and brings it to a stop.
6. After confirming that the Aurora Driver has successfully completed the test by executing the fault mitigation strategy, the onboard vehicle operator takes control of the truck and merges back onto the highway. With additional training, the Aurora Driver will learn how to re-enter the road after pulling over, allowing it to autonomously resume operating after recovering from faults.
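The test sequence above can be mimicked in a few lines of Python. This is a toy simulation under stated assumptions, not Aurora's test tooling. The `SimulatedSensor` class, the set of sensors treated as critical, and the step names in `PULL_OVER_PLAN` are all hypothetical, chosen to mirror the steps described in the list.

```python
class SimulatedSensor:
    """Stand-in for one sensor feed; `cut_off` models the injected fault."""

    def __init__(self, name):
        self.name = name
        self.cut_off = False

    def read(self):
        # A cut-off sensor yields no data, just as a genuinely failed one would.
        return None if self.cut_off else {"sensor": self.name, "points": 1024}


def monitor(sensors):
    """Return the names of sensors whose data is missing."""
    return [s.name for s in sensors if s.read() is None]


# Assumption: which sensors count as safety-critical is illustrative.
CRITICAL_SENSORS = {"front_center_lidar"}

# Ordered mitigation steps, named after the behavior described in steps 4 and 5.
PULL_OVER_PLAN = ["activate_hazard_lights", "reduce_speed",
                  "find_safe_shoulder", "pull_over", "stop"]


def plan_response(missing):
    """Return the mitigation steps, or an empty plan if nothing critical is missing."""
    if CRITICAL_SENSORS.intersection(missing):
        return list(PULL_OVER_PLAN)
    return []


def run_fault_injection_test(sensors, target):
    """Inject a fault into `target`, then return what was detected and planned."""
    assert plan_response(monitor(sensors)) == [], "system must start healthy"
    for s in sensors:
        if s.name == target:
            s.cut_off = True          # step 1: the operator injects the fault
    missing = monitor(sensors)        # step 2: detect the issue and locate the cause
    plan = plan_response(missing)     # steps 3-5: assess the risk, choose the mitigation
    return missing, plan


sensors = [SimulatedSensor("front_center_lidar"), SimulatedSensor("left_camera")]
missing, plan = run_fault_injection_test(sensors, "front_center_lidar")
print(missing)   # ['front_center_lidar']
print(plan[-1])  # stop
```

Because the monitor only sees missing data, it cannot tell an injected fault from a real one. That property is what makes this kind of test meaningful.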
Aurora Driver-powered trucks complete dozens of these fault injection tests every day, allowing us to measure and verify the Aurora Driver’s ability to complete safety maneuvers in response to faults it could encounter while hauling goods for our customers. Testing this fault management capability provides critical evidence to satisfy our Safety Case for autonomous operations on public roads.
Stay tuned as we continue to release increasingly capable and mature autonomous vehicles in preparation for the launch of our Aurora Horizon and Aurora Connect products.
For more on our safety strategy, check out this introduction to the principles that make up our Safety Case Framework.
Tony Stentz is Vice President, Systems, Safety Engineering, and Validation.