Research Leaders Urge Tech Industry to Monitor AI's 'Thoughts'

AI researchers from leading institutions including OpenAI, Google DeepMind, and Anthropic, alongside a broad coalition of companies and nonprofit groups, have published a position paper advocating for enhanced investigation into techniques for monitoring the internal reasoning processes of AI models, often referred to as their 'thoughts'. This call to action, detailed in a paper released on Tuesday, highlights the critical importance of understanding and preserving the transparency of AI's decision-making pathways.

The Significance of Chain-of-Thought (CoT) Monitoring

A key feature of advanced AI reasoning models, such as OpenAI's o3 and DeepSeek's R1, is their use of 'chains-of-thought' (CoTs). CoTs represent an externalized process where AI models work through problems step-by-step, akin to a human using a scratchpad for complex calculations. These reasoning models are fundamental to powering AI agents, and the paper's authors argue that monitoring CoTs could be a primary method for maintaining control over these agents as they become more widespread and capable.

"CoT monitoring presents a valuable addition to safety measures for frontier AI, offering a rare glimpse into how AI agents make decisions," stated the researchers in their position paper. "Yet, there is no guarantee that the current degree of visibility will persist. We encourage the research community and frontier AI developers to make the best use of CoT monitorability and study how it can be preserved."

A Call for Transparency and Preservation

The position paper specifically urges leading AI model developers to investigate the factors that contribute to CoT 'monitorability' – essentially, what makes AI models' reasoning processes transparent and understandable. The authors emphasize that while CoT monitoring offers a vital method for understanding AI reasoning, it is a potentially fragile capability. They caution against any interventions that might reduce this transparency or reliability.

Furthermore, the paper calls on AI model developers to actively track CoT monitorability and explore its potential implementation as a robust safety measure.

Key Signatories and Industry Unity

The paper boasts notable signatories, including Mark Chen (OpenAI Chief Research Officer), Ilya Sutskever (Safe Superintelligence CEO), Nobel laureate Geoffrey Hinton, Shane Legg (Google DeepMind Co-founder), Dan Hendrycks (xAI Safety Adviser), and John Schulman (Thinking Machines Co-founder). First authors hail from the U.K. AI Security Institute and Apollo Research, with other signatories representing institutions like METR, Amazon, Meta, and UC Berkeley.

This collaborative effort signifies a moment of unity among AI industry leaders aiming to bolster research in AI safety. It emerges amidst fierce competition within the tech sector, which has seen companies like Meta actively recruiting top AI researchers from OpenAI, Google DeepMind, and Anthropic with substantial offers. The most sought-after talent often focuses on developing AI agents and reasoning models.

The Race for AI Understanding

OpenAI's release of its first AI reasoning model, o1, in September 2024, spurred rapid development of competing models from Google DeepMind, xAI, and Anthropic, many demonstrating advanced performance. However, understanding the inner workings of these AI reasoning models remains a significant challenge. Despite advancements in AI performance, a deeper comprehension of their decision-making processes has lagged.

Anthropic, a leader in AI interpretability, has committed to cracking open the 'black box' of AI models by 2027. CEO Dario Amodei has urged OpenAI and Google DeepMind to increase their research in this area. Early Anthropic research suggests CoTs might not always be a fully reliable indicator of model reasoning, contrasting with OpenAI's view that CoT monitoring could be a reliable way to track AI alignment and safety.

The Role of Position Papers in Research

Position papers like this serve to amplify nascent research areas, such as CoT monitoring, and attract greater attention and funding. While major companies are already investing in these topics, this paper is expected to encourage further research and investment.

Bowen Baker, an OpenAI researcher involved in the paper, highlighted its purpose in an interview with TechCrunch: "Publishing a position paper like this, to me, is a mechanism to get more research and attention on this topic before that happens." He noted the critical juncture at which the industry finds itself with CoT technology, emphasizing its potential utility and the need for focused effort to ensure its longevity.

People walking in a maze shaped as a brain

TechCrunch All Stage Event

TechCrunch is hosting its 'All Stage' event in Boston on July 15, 2025, focusing on strategies, workshops, and connections for founders and VCs across all stages. Early bird registration offers significant savings.

Research Leaders Urge Tech Industry to Monitor AI's 'Thoughts'