Anthropic backs down on hidden Claude Fable 5 restrictions

Anthropic has apologized and withdrawn a policy that would have secretly degraded Claude Fable 5’s performance for AI developers. Scientists had warned that the policy could leave advanced AI capabilities accessible to only a select few companies, hindering scientific progress worldwide.

Anthropic’s decision, which came on June 10, has implications that reach well beyond one company. With Anthropic valued at nearly $965 billion and its IPO approaching, the incident exposed a central tension in the AI sector: how companies building the most capable models balance their competitive interests against the open research culture that drives innovation worldwide.

Anthropic’s hidden Claude Fable 5 restrictions trigger industry outrage

With the launch of Claude Fable 5 on June 9, Anthropic disclosed four categories of safeguards: cybersecurity, biology, chemistry, and AI research. Three of those categories behaved the same way. Whenever a sensitive query was detected, the system would either refuse it outright or route the user to Claude Opus 4.8, the company’s previous top model, with a visible notice.

The fourth category was different. When it detected queries related to frontier AI development, Fable 5 would degrade the quality of its output without telling the user. According to Fortune’s report on the announcement, the company described interventions designed to hinder the system’s performance without disclosing them to the user. The full details can be found in Anthropic’s 319-page system card.

According to Anthropic, the probability of this restriction coming into play was about 0.03%. But the principle alarmed researchers far more than the percentage.

“We made the wrong tradeoff, and we apologize for not getting the balance right,” Anthropic reportedly said.

Critics say Claude Fable 5 restrictions threatened independent AI research

The criticism came from people who rarely agree. Open-source advocates, AI safety researchers, and even former Anthropic employees all pushed back within hours of the system card’s publication.

Will Brown, research lead at AI startup Prime Intellect, said the policy felt like the company was “starting to pull the ladder up behind them.” He added that a growing number of companies evaluate the safety and reliability of frontier systems, and that covert performance degradation could undermine their verification work.

Nathan Lambert, an open-model researcher who once led work at the Allen Institute for AI, went further. He said on X that the policy “paints Anthropic clearly as anti-science, and therefore anti-progress and anti-safety.”

Jeremy Howard, co-founder of AnswerDotAI, framed the issue as a competitive power grab. Anthropic’s own researchers could still use the unrestricted model internally, Howard argued, meaning the frontier would keep advancing while outside researchers fell behind. He stated that “the AI frontier advances, and power imbalance increases.”

Even former Anthropic staff weighed in. Behnam Neyshabur, who previously co-led the company’s AI scientist initiative, posted that restricting these capabilities “fundamentally slows scientific and technological progress and is net negative for humanity.”

How could the episode affect Anthropic’s IPO?

The incident came at a delicate time for Anthropic. The company confidentially filed IPO documents on June 1, raising $65 billion at an implied valuation of $965 billion. That valuation depends heavily on trust from enterprise customers and from the research community.

Separate from the AI research controversy, the Fable 5 release drew criticism on another front. The model’s strict biology filters prevented it from answering questions about cell membranes and mitochondria, topics typically taught in high school. According to reports, the model would not describe how mRNA vaccines work or what causes hay fever, although it had no trouble discussing TNT and password risks.

Microsoft has also restricted its staff from using Fable 5 over data retention concerns tied to Anthropic’s newly introduced Mythos-class retention policies. Under those policies, prompts and outputs are retained for 30 days for trust and safety purposes, and flagged content is retained for up to two years.

What’s next?

Under the amended policy, Anthropic said, Fable 5’s AI development safeguards will be visible. If the system detects that a user is conducting frontier AI research, it can either decline the request or route it to a different model, and it will notify the user in either case.

Anthropic acknowledged a tradeoff. Because the safeguard is now visible, the company said it has no choice but to apply it more broadly, which means more innocuous queries will be blocked. The company said it is working to improve classifier precision.

The case also brought a larger issue to light. As AI model capabilities increase and training costs rise, so does the temptation to limit competitors’ access to such tools. Whether Anthropic’s quick reversal sets a precedent or merely delays the next attempt at covert restriction will depend on how the rest of the industry responds.
