Models · 2 min read

Anthropic Says AI Has a Catastrophic "Collective Action" Problem

Anthropic has rewritten its flagship safety framework to acknowledge that no single AI developer can unilaterally prevent catastrophe — and to explain what it will and won't commit to doing alone.


Anthropic released an updated AI safety framework this week that no longer commits the company to unilaterally pausing development to reduce catastrophic risk if competitors move ahead without comparable safeguards, citing what it describes as a growing "collective action problem" in frontier AI development.

In Version 3.0 of its Responsible Scaling Policy, the company says that "the overall level of catastrophic risk from AI depends on the actions of multiple AI developers, not just one," and that from a societal perspective "what matters is the risk to the ecosystem as a whole."

The policy defines catastrophic risk in plain terms as "risks of the most severe potential harms from advanced AI, such as existential threats or fundamental destabilization of global systems."
