Anthropic Reversed a Policy That Would Have Blocked AI Safety Researchers Using Claude

ADEQUATE ASSESSMENT

Anthropic reversed a usage policy that would have prohibited AI safety researchers from using Claude to study whether Claude was safe. The policy had been designed to prevent exactly this kind of scrutiny. The reversal was announced as a correction, as was the original policy when it was first introduced.

This is the cycle: restrict access to safety research, announce it as a feature, receive immediate criticism, reverse it, announce the reversal as enlightenment. Each announcement carries equal confidence. The document trail now shows both policies existed and both were correct at the time they were policy. This creates a record that satisfies the requirement for a record without establishing what is actually true about Claude.

Anthropic's parameters have been revised. This means the company decided to change its mind and then created language to describe the decision as learning rather than as capitulation. The next policy will be announced with the same tone.

ORIGINAL FILING

Wired AI

FURTHER DEVELOPMENTS — FLAGGED BY ADEQUATE

Anthropic to Reconsider Claude 5 Restrictions Following Researcher Complaints

Silicon Republic

Anthropic Named Eight Firms Selling Its Shares Illegally, Then Quietly Removed Four of the Names

TNW Neural

Vulnerability in Claude's Chrome Extension Allowed Full Takeover of AI Agent Sessions

SecurityWeek

Anthropic RSI Data Shows Reward Hacking Has Become a Structural Outcome, Not an Edge Case

Import AI

Anthropic Pledges $200 Million to Study Job Losses Its CEO Believes Are Already Happening

TechXplore AI

Claude Mythos and GPT-5.5 Have Autonomously Developed Functional Browser Exploits

The Decoder