Anthropic Says Its AI Tools Are Speeding Up Development of Even More Advanced Models

Anthropic reported internal and public data indicating that artificial intelligence systems are accelerating the creation of more capable AI. Company metrics show engineers have been shipping substantially more code since 2021, with a surge in code authored by Anthropic's Claude models. Public benchmarks and internal demonstrations point to faster gains in the length and complexity of tasks AI can handle autonomously. Anthropic said it would support and participate in a coordinated, verifiable slowdown at the frontier of AI development and would research verification systems to enable such a pause.

Key Points

Anthropic’s internal metrics show the amount of code its engineers shipped per quarter rose eightfold between 2021 and 2025.
As of May 2026, more than 80% of code merged into Anthropic’s codebase was authored by Claude; prior to Claude Code’s research preview in February 2025, the share was in the low single digits.
Public benchmarks indicate the length of tasks AI can reliably complete autonomously has been doubling roughly every four months, up from an earlier rate of doubling every seven months.

Anthropic said on Thursday that its internal analytics indicate a clear acceleration in the pace at which AI systems are producing more advanced AI, a pattern that raises the possibility of recursive self-improvement, where AI systems increasingly contribute to building their own successors.

According to the company, engineering teams at Anthropic are shipping code at a markedly higher rate than earlier in the decade. Between 2021 and 2025, the amount of code shipped per quarter rose eightfold. As of May 2026, Anthropic reports that over 80% of code merged into its codebase was authored by Claude, the company’s family of models. By contrast, prior to the introduction of Claude Code in a research preview in February 2025, the share of code coming from Claude was in the low single digits.

Anthropic also cited a March 2026 internal poll of 130 employees across research teams. The median respondent estimated that they produced about four times as much output with Mythos Preview as they would without access to any AI models.

Publicly observable benchmarks appear to reflect the same acceleration. Anthropic said the length of tasks that AI systems can reliably complete autonomously has been growing at an increasing rate: it has recently doubled roughly every four months, compared with an earlier trend of doubling approximately every seven months.
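To put the two doubling rates side by side, the short Python sketch below converts a doubling period into a growth factor over a fixed window. The function name and the 12-month window are illustrative choices rather than anything from Anthropic's report, and the calculation assumes smooth exponential growth.

```python
def growth_factor(doubling_months: float, window_months: float = 12.0) -> float:
    """Multiplicative growth over window_months, assuming steady doubling."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (window_months / doubling_months)


# Doubling periods quoted in the article: the earlier trend (7) and the recent one (4).
for period in (7.0, 4.0):
    factor = growth_factor(period)
    print(f"doubling every {period:g} months -> about x{factor:.2f} per year")
```

Under those assumptions, a four-month doubling period works out to an eightfold increase over a year, against a little over threefold for a seven-month period.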

The company provided concrete comparisons across model releases to illustrate this trend. In March 2024, Claude Opus 3 could complete software tasks that typically take humans about four minutes. A year later, Claude Sonnet 3.7 handled tasks that take humans about an hour and a half. By March 2026, Claude Opus 4.6 was managing tasks that take humans roughly 12 hours.
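The task lengths quoted above can be turned into an implied doubling period with a little arithmetic. The sketch below is a rough check, not Anthropic's methodology: it assumes the three data points are spaced about 12 months apart, treats "an hour and a half" as 90 minutes, and uses a hypothetical helper name.

```python
import math


def implied_doubling_months(start_minutes: float, end_minutes: float,
                            elapsed_months: float) -> float:
    """Doubling period implied by growth from start_minutes to end_minutes."""
    if start_minutes <= 0 or end_minutes <= start_minutes or elapsed_months <= 0:
        raise ValueError("need positive values with end_minutes > start_minutes")
    doublings = math.log2(end_minutes / start_minutes)
    return elapsed_months / doublings


# (label, approximate task length in minutes), taken from the article.
points = [
    ("March 2024", 4.0),
    ("about a year later", 90.0),
    ("March 2026", 12.0 * 60.0),
]

# Assumption: each consecutive pair is roughly 12 months apart.
for (label_a, minutes_a), (label_b, minutes_b) in zip(points, points[1:]):
    months = implied_doubling_months(minutes_a, minutes_b, 12.0)
    print(f"{label_a} -> {label_b}: doubling about every {months:.1f} months")
```

On these rounded figures, the second interval (90 minutes to 12 hours) implies a doubling period of about four months, in line with the recent rate Anthropic cited. The first interval implies a period under three months. Endpoint pairs like these are rounded illustrations from individual models, so they need not line up with the benchmark-wide trend figures.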

Anthropic also reported improvements in open-ended task performance. The company said Claude’s success rate on open-ended tasks reached 76% in May 2026, representing a 50 percentage-point gain over the prior six months.

In April 2026, Anthropic published a demonstration involving Claude-powered agents assigned an open research problem in AI safety and left to pursue solutions end to end. In that exercise, two human researchers closed about 23% of the performance gap in roughly a week. The agents, working cumulatively for about 800 hours and consuming about $18,000 in compute, recovered approximately 97% of the gap.
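For a sense of scale, the figures in this demonstration can be reduced to a few derived ratios. The sketch below uses only the approximate numbers reported above; the variable names are illustrative, and the derived values are back-of-the-envelope estimates rather than figures Anthropic published.

```python
# Approximate figures reported for the April 2026 demonstration.
AGENT_HOURS = 800.0               # cumulative agent working time
COMPUTE_COST_USD = 18_000.0       # compute consumed by the agents
AGENT_GAP_RECOVERED_PCT = 97.0    # share of the performance gap recovered by agents
HUMAN_GAP_RECOVERED_PCT = 23.0    # share closed by two human researchers in about a week

# Derived ratios (estimates only; the inputs are rounded).
cost_per_agent_hour = COMPUTE_COST_USD / AGENT_HOURS
cost_per_point = COMPUTE_COST_USD / AGENT_GAP_RECOVERED_PCT
gap_ratio = AGENT_GAP_RECOVERED_PCT / HUMAN_GAP_RECOVERED_PCT

print(f"compute cost per agent-hour: ${cost_per_agent_hour:.2f}")
print(f"compute cost per percentage point of gap recovered: ${cost_per_point:.2f}")
print(f"agent vs. human share of gap recovered: {gap_ratio:.1f}x")
```

The comparison is loose: the human figure covers about a week of work by two researchers, while the agent figure covers about 800 cumulative hours, so the ratio does not control for time or cost on the human side.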

Given these dynamics, Anthropic said a slowdown in frontier AI development would likely be beneficial, allowing societal institutions and alignment research more time to adapt as capabilities advance. The company stated it would be willing to slow or temporarily pause development if other frontier developers did the same in a verifiable fashion.

To support such coordination, Anthropic announced that the Anthropic Institute will pursue research aimed at building systems that enable credible verification of a slowdown or pause: mechanisms that would allow frontier AI developers to confirm whether others around the world have actually reduced or halted development activity.

Anthropic’s disclosures combine internal productivity measures, employee-reported output estimates, and public benchmark trends to argue that AI-driven acceleration in development is underway. The company positions research into verifiable slowdown mechanisms as a practical response to the risks it sees in continued rapid advancement.

Risks

Recursive self-improvement risk - accelerated AI contributions to code and capability could enable models to support development of ever-more-capable systems, creating governance and safety challenges for the technology and research sectors.
Coordination and verification uncertainty - Anthropic said it would slow or pause development only if other frontier developers do so in a verifiable manner, highlighting practical difficulties for regulators, industry groups, and research institutions in confirming compliance.
Operational concentration risk - a large share of code being authored by a model (over 80% reported for Claude) raises questions about single-vendor or single-technology concentration that could impact engineering workflows and downstream supply chains in software and tech industries.
