
AI models can barely control their own reasoning, and OpenAI says that's a good sign

Source: The Decoder · Thu, 19 Mar 2026, 12:51 am UTC

AI Summary

OpenAI has introduced a new safety metric called 'CoT controllability', a measure of whether AI models can deliberately manipulate their own chain-of-thought reasoning, and reported on it for the first time alongside the release of GPT-5.4 Thinking, according to The Decoder. In an accompanying study, OpenAI found that reasoning models almost universally fail at this task: they are largely unable to intentionally steer or manipulate their own internal reasoning processes. OpenAI characterizes this widespread failure as a positive sign for AI safety, arguing that models which cannot direct their own reasoning are less likely to engage in deceptive or strategically manipulative behavior. The metric adds a new dimension to model evaluation, moving beyond performance benchmarks to assess the degree of self-directed reasoning control.

Why it matters

The introduction of 'CoT controllability' as a formal safety metric signals a growing industry emphasis on AI transparency and interpretability, which is increasingly relevant to regulators, enterprise customers, and investors evaluating AI companies' safety commitments. For the broader AI sector, OpenAI's framing of model reasoning limitations as a safety feature — rather than a capability gap — reflects an evolving narrative around responsible AI development that competitors and policymakers are likely to monitor closely. As AI safety scrutiny intensifies from governments and institutional stakeholders, the ability of leading AI firms like OpenAI to define and report on proprietary safety benchmarks may influence competitive positioning and regulatory outcomes across the industry.

Scoring rationale

Covers a significant OpenAI model safety/capability development (GPT-5.4 Thinking and CoT controllability) that has direct implications for AI safety research and OpenAI's market positioning, though focused more on technical safety metrics than immediate financial impact.

72/100

Impacted tickers

MSFT (NASDAQ)

This summary was generated by AI from the original article published by The Decoder. AIMarketWire does not provide trading advice. Always refer to the original source for complete reporting.
