Anthropic urges ‘temporary pause’ on AI development to discuss risks

Anthropic has floated the idea of a worldwide “temporary pause” on AI development – and said it was going to convene “policymakers” to discuss the dangers of advanced AI – in its latest release touting the capabilities of its products.

In a long post on Thursday, Anthropic detailed the progress of its AI model, Claude, towards “recursive self improvement” – that is, being able to make better and more powerful versions of itself. Recursive self-improvement is a bugbear of AI safety researchers, viewed as the key step for AI to become superintelligent and therefore unleash widespread consequences on humanity.

The idea features heavily in the widely read AI 2027 doomsday scenario of last year, which imagines AI agents designing more and more intelligent versions of themselves, one of which eventually kills all of humanity with a bioweapon in order to make room for more datacentres and solar panels.

Anthropic’s post notes a “trend” of increasing capability in Claude which, “taken far enough and given enough compute … points to an AI system capable of fully autonomously designing and developing its own successor”. This, Anthropic said, may increase the risk of “humans losing control over AI systems”.

To deal with this, Anthropic proposed to organise conversations where “policymakers, researchers, civil society and other AI companies can help answer some of the questions this piece raises”.

The news comes alongside a separate report, from the Financial Times, that the US AI company has embedded engineers inside the National Security Agency despite a legal battle with the Pentagon over the use of its tools. The engineers are reportedly helping the NSA use Anthropic’s model, Mythos, for offensive cybersecurity operations.

If calling for a worldwide conversation on AI risk is in contradiction with supporting a US spy agency to – potentially – attack Iran and China with cyberweapons, neither development is “surprising” given the AI company’s past actions, said Steven Murdoch, a professor at University College London.

“Anthropic might give the impression of being warm and fuzzy, but their definition of AI safety is narrow. Supporting US authorities in the development of offensive capabilities has never been something they have spoken against,” he said.

Murdoch said that Anthropic’s post did not offer evidence of any step changes in the progress of AI capabilities.

“It is true that there’s some evidence that AI capabilities have increased and continue to increase with no limits becoming immediately clear,” he said, but he added: “I don’t think anything has fundamentally changed today that has caused Anthropic to publish this article.”

The advances that Anthropic appears to detail in its post do not amount to AI recursively improving itself – at least, not yet. Instead, the company reports that a substantial amount of the work done on improving its AI systems is now done with AI. Claude is good at “running experiments”, it says, or at least speeding certain sections of code.

Like other AI systems, Claude appears to be improving at tackling more challenging tasks. Anthropic described it “steering research” and “proposing its own experiments”, although these accomplishments appear to have taken place within strict confines, and to be confined to coding-related tasks.

The quality of code written by AI is also improving, said Anthropic: “As of May 2026, more than 80% of the code we merge into Anthropic’s codebase was authored by Claude.”

Murdoch said that Anthropic’s call for a “temporary pause” on AI echoed other proposals on AI safety the company had made throughout the years – as did its plan to engage policymakers. “It’s a reminder of what they are concerned about, and have been concerned about for many years.”

“I’m sure the attention is welcome, but again this isn’t a new thing. Anthropic have been trying to get the attention of policymakers since they were founded.”

Two months ago, Anthropic announced – but declined to release – Mythos, an AI model that it implied was too powerful to unleash upon the public, owing to cybersecurity concerns. The announcement led to widespread hubbub and earned the attention of US treasury secretary, Scott Bessent, and MI5.

Some experts, however, suggested that there could be more hype behind Anthropic’s Mythos announcement than substance, especially given the vagueness with which the company described some of Mythos’s capabilities. Heidy Khlaaf, chief AI scientist at the AI Now Institute, called the announcement of Mythos “a marketing post.”

Anthropic this week filed for an IPO that could value the company at $1tn. The company was approached for comment.

View original source — The Guardian ↗

ShareShare on X Share on Facebook