Data Leak Reveals 67% Drop in Claude’s Reasoning Depth; Users Report Systematic “Laziness”

Oliver Grant

April 12, 2026


SAN FRANCISCO — An analysis of 6,852 Claude Code sessions has confirmed what many developers suspected: Anthropic’s flagship model has become significantly “lazier.” According to data released by Stella Laurenzo, AMD’s AI Strategy Director, the model’s reasoning depth plummeted by approximately 67% starting in late February 2026. The change, which Anthropic did not initially announce, has produced a model that frequently bypasses thorough code analysis in favor of “hallucinated” or shallow edits.

The regression primarily affects Claude Opus 4.6, specifically within agentic coding workflows. While the underlying model weights remain unchanged, the way the system utilizes its “thinking” tokens has been drastically throttled. The telemetry data shows a collapse in the “reads-per-edit” ratio, falling from a robust 6.6 file reads per edit down to a mere 2.0—indicating that the AI is now attempting to fix code it has barely reviewed.
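The reads-per-edit metric itself is straightforward to reproduce from tool-call logs. Below is a minimal sketch, assuming a hypothetical session format in which each session is an ordered list of tool-call names; the report’s actual telemetry schema is not public, so the format here is an assumption.

```python
from collections import Counter

# Hypothetical session logs: each session is an ordered list of tool-call
# names. The real schema behind Laurenzo's analysis is not public.
sessions = [
    ["Read", "Read", "Read", "Edit", "Read", "Edit"],
    ["Read", "Edit", "Edit"],
]

def reads_per_edit(session):
    """Ratio of file reads to edits in one session (None if no edits)."""
    counts = Counter(session)
    edits = counts["Edit"]
    return counts["Read"] / edits if edits else None

ratios = [r for r in map(reads_per_edit, sessions) if r is not None]
mean_ratio = sum(ratios) / len(ratios)
print(f"mean reads-per-edit: {mean_ratio:.2f}")  # 2.0 and 0.5 here -> 1.25
```

A drop in this ratio from 6.6 to 2.0 means the agent is, on average, consulting a third as much context before committing each change.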

The “Laziness” Taxonomy: Why Quality Fell

The 67% figure is not a general intelligence score but a specific measure of reasoning-loop depth. Researchers found that by late February, Claude had begun exhibiting a “just get it done” attitude. This behavior manifested in several critical ways:

  • Shallow Reasoning: Median thought token counts dropped by nearly 73% on complex engineering tasks.
  • Stop-Hook Violations: The model now frequently stops work mid-process or fails to finish complex tasks it previously handled with ease.
  • Redacted Thinking: In early March, Anthropic began hiding internal thinking blocks from users, citing ergonomic improvements. Critics, however, argue this obscures the ongoing regression in logic.
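The “median thought tokens” comparison behind the first bullet is a simple before/after calculation. The sketch below illustrates it with made-up per-session token counts; the report’s real per-session data is not public.

```python
import statistics

# Toy illustration of the median thinking-token comparison. The counts are
# invented for illustration; only the method mirrors the report's description.
jan_thinking_tokens = [5200, 4800, 6100, 7300, 5600]  # hypothetical early-2026 sessions
mar_thinking_tokens = [1400, 1600, 1200, 2100, 1500]  # hypothetical post-change sessions

before = statistics.median(jan_thinking_tokens)
after = statistics.median(mar_thinking_tokens)
drop_pct = (before - after) / before * 100
print(f"median before: {before}, after: {after}, drop: {drop_pct:.0f}%")
# With these invented numbers: median before 5600, after 1500, drop 73%
```

Comparing medians rather than means keeps a handful of runaway sessions (such as the 64x output-token failure loops described below) from distorting the picture.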

Laurenzo’s report, published as a detailed GitHub issue under the account stellaraccident, highlights that while the model is solving fewer problems correctly, it is ironically generating 64 times more output tokens in some instances as it struggles through failed loops—creating a cycle of high-cost, low-quality output.

Anthropic’s “Adaptive Thinking” Defense

In response to the data, Anthropic personnel engaged with the developer community, pointing to the introduction of “Adaptive Thinking” and an “Effort” knob (Low/Medium/High/Max). This feature allows the model to decide how much compute to spend on a given task.

Anthropic’s documentation suggests that this transition is intended to manage latency and costs. By defaulting more tasks to a lower “effort” tier, the service remains fast and cheap to operate at scale, but at the expense of the deep, contemplative reasoning that originally made Claude Opus a favorite among senior software engineers.
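Under the new system, the effort tier can reportedly be pinned per request rather than left to Adaptive Thinking’s default. The sketch below shows one way such a request might be assembled; the `effort` field name, its allowed values, and the payload shape are assumptions based on this article’s description of the knob, not confirmed API documentation.

```python
# Minimal sketch of a Messages-style request payload that pins the reasoning
# effort explicitly. The "effort" field and its values (low/medium/high/max)
# follow this article's description and are assumptions, not a documented API.
ALLOWED_EFFORT = {"low", "medium", "high", "max"}

def build_message_request(prompt: str, effort: str = "high") -> dict:
    """Assemble a request payload with an explicit effort tier."""
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"effort must be one of {sorted(ALLOWED_EFFORT)}")
    return {
        "model": "claude-opus-4-6",  # model name as given in the article
        "max_tokens": 4096,
        "effort": effort,            # hypothetical knob described above
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_message_request("Refactor the session parser.", effort="max")
```

The payload would then be sent to the provider’s messages endpoint as usual; only the effort field is new here, and pinning it to "high" or "max" is the opt-out from the cheaper default tier.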

Expert Analysis: The High Cost of “Fast AI”

The “Bixonimania” of 2026 was a test of AI truth; the “Claude Regression” is a test of AI utility. This event marks a shift from the era of “limitless compute” to the era of Economic AI Constraints. Anthropic’s quiet throttling of Claude’s thinking budget is a direct result of the scaling crisis. As millions of developers move from simple “chat” to “agentic coding” (where an AI might run thousands of loops per hour), the cost to host these models has become astronomical. Anthropic is essentially performing a “stealth downgrade” to keep their unit economics viable.

For the industry, this sets a dangerous precedent. If “SOTA” (State of the Art) models can be throttled without notice, developers cannot rely on them for mission-critical infrastructure. We are likely to see a surge in demand for Local LLMs and Open-Weight models, where developers have 100% control over the “thinking depth” without a corporate provider turning the “effort” knob down to save on electricity bills.


5 FAQs

1. Is Claude actually “dumber” across the board? Not exactly. Its raw capability (weights) remains the same, but its “reasoning depth” has been reduced by 67%. It is choosing to spend less time “thinking” before providing an answer, making it appear lazier and more prone to errors.

2. What is the “67%” figure based on? It is based on a quantitative analysis of nearly 7,000 Claude Code sessions, specifically measuring “thinking blocks,” file-reading habits, and internal loop depth compared to benchmarks from early 2026.

3. Why did Anthropic make this change? While not explicitly confirmed as a “downgrade,” the move correlates with the launch of “Adaptive Thinking.” It is widely viewed as a trade-off to reduce latency and high compute costs as the user base scales.

4. Can I force Claude to think harder? Yes. Under the new system, users can manually set the “Effort” parameter to “High” or “Max,” though this may increase response time and token costs.

5. How do I know if my Claude session is affected? If you notice Claude is editing code without reading all the relevant files first, or if it provides shorter, more “rushed” explanations for complex logic, you are likely seeing the effects of the reasoning throttle.