Waymo Co-CEO Dmitri Dolgov on AI, Simulation, and Building the Future of Autonomous Driving

Building the World’s Most Trusted Driver: A Conversation with Waymo Co-CEO Dmitri Dolgov

In this conversation, a16z General Partner David George and Waymo Co-CEO Dmitri Dolgov discuss the progress and challenges of autonomous driving, with a particular focus on the role of artificial intelligence (AI) and generative AI (GenAI).

The Early Days of Autonomous Vehicles

Dmitri Dolgov's work on autonomous vehicles began during his postdoctoral research at Stanford, which coincided with the DARPA Grand Challenges. He highlights the 2007 DARPA Urban Challenge as a pivotal moment: a staged urban environment, shared by autonomous and human-driven vehicles, gave the field a foundational experience. The 2007 systems consisted of vehicle instrumentation, inertial measurement systems, GPS, sensors (radar, lidar, cameras), and a computer running software for perception, decision-making, and planning. Dolgov notes that both AI and hardware have changed dramatically since then, and that today's AI bears little resemblance to that of 2007.

After the DARPA challenges, Dolgov joined the founding team of the Google Self-Driving Car Project in 2009. The project became Waymo in 2016, a significant step in advancing autonomous driving technology.

Layering GenAI into Traditional AI/ML

Dolgov discusses how generative AI is being layered onto the existing AI/ML frameworks for autonomous vehicles. He explains that AI has been integral to self-driving technology from its inception, initially in the form of classical AI, decision trees, and hand-engineered features. A major breakthrough came around 2012 with the advancement of convolutional neural networks (CNNs), exemplified by AlexNet's success in the ImageNet competition. CNNs significantly improved computer vision, enabling object detection and classification from camera, lidar, and radar data.

Another critical development was the advent of transformers around 2017, which revolutionized natural language processing. Dolgov sees parallels between language processing and autonomous driving tasks, such as predicting human behavior, planning trajectories, and generating realistic simulation scenarios. He notes that sequences of object states and scene context in driving are analogous to sequences of words in language.
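The analogy between driving sequences and sentences can be made concrete with a small sketch. The Python below is illustrative only and is not Waymo's method: it assumes an object's state is just an (x, y) position in meters, and it uses hypothetical helpers (tokenize_trajectory, detokenize) with made-up parameters (step, max_bins) to turn a trajectory into a sequence of discrete tokens, much as text is turned into word tokens before a transformer models it.

```python
from typing import List, Tuple

# Assumption for illustration: an object state is only an (x, y) position in meters.
State = Tuple[float, float]


def tokenize_trajectory(states: List[State], step: float = 0.5,
                        max_bins: int = 8) -> List[int]:
    """Map each displacement between consecutive states to one discrete token.

    Displacements are quantized to multiples of `step` and clipped to
    [-max_bins, max_bins] per axis, so the vocabulary has (2*max_bins + 1)**2 tokens.
    """
    side = 2 * max_bins + 1
    tokens = []
    for (x0, y0), (x1, y1) in zip(states, states[1:]):
        dx = max(-max_bins, min(max_bins, round((x1 - x0) / step)))
        dy = max(-max_bins, min(max_bins, round((y1 - y0) / step)))
        tokens.append((dx + max_bins) * side + (dy + max_bins))
    return tokens


def detokenize(tokens: List[int], start: State, step: float = 0.5,
               max_bins: int = 8) -> List[State]:
    """Rebuild an approximate trajectory from a start state and a token sequence."""
    side = 2 * max_bins + 1
    x, y = start
    states = [start]
    for token in tokens:
        dx = token // side - max_bins
        dy = token % side - max_bins
        x += dx * step
        y += dy * step
        states.append((x, y))
    return states


trajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5), (3.0, 1.5)]
tokens = tokenize_trajectory(trajectory)
print(tokens)
print(detokenize(tokens, trajectory[0]))
```

Once motion is expressed as tokens like these, predicting what a road user does next has the same shape as predicting the next word in a sentence. A production system would encode far richer state (heading, speed, object type, map context) than this sketch assumes.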

More recently, Waymo has focused on combining its established AI backbone for autonomous driving with the general world knowledge and understanding offered by vision-language models (VLMs). The aim is to draw on the strengths of both traditional AI and modern GenAI, including large language models (LLMs) and VLMs, to enhance the capabilities of autonomous systems.

The Value of Simulation

Simulation plays a crucial role in evaluating and improving autonomous driving systems. Dolgov emphasizes that real-world testing alone is insufficient for comprehensive evaluation. Realistic, closed-loop simulations are essential for building confidence in system performance. Simulation also enables the generation of synthetic data, allowing for the exploration of rare or