Rising refusal rate from Acceptable Use Classifier leaves customers paying for nothing
Anthropic's release last week of Opus 4.7 came with stronger safeguards to prevent misuse.
Unfortunately, these safeguards have also managed to thwart legitimate use.
Opus 4.7 arrived on the heels of Anthropic's announcement of Mythos, a model supposedly too capable of vulnerability discovery and exploitation to give to the public.
While that appears to be a self-serving assessment of the model's risk, the company decided to use Opus 4.7 as a test bed for hypervigilant guardrails.
"We are releasing Opus 4.7 with safeguards that automatically detect and block requests that indicate prohibited or high-risk cybersecurity uses," the AI biz said.
"What we learn from the real-world deployment of these safeguards will help us work towards our eventual goal of a broad release of Mythos-class models."
Anthropic could learn a lot by poring over the complaints in its GitHub repo for Claude Code.
Objections to the company's Acceptable Use Policy (AUP) classifier have surged and customers are having trouble getting legitimate work done.
With greater security come more false positives: Claude has become overcautious, refusing to respond to harmless requests.
A Claude-compiled graph of complaints about AUP refusals tells the story:
[Graph: Claude GitHub issues concerning model refusals]
Claude Code users have raised concerns about invalid refusals in GitHub issue posts for months, but until recently the rate had been fairly consistent.
There were about two or three complaints of this sort per month from July 2025 through September 2025.
One such issue was #4373, "Memory authorization code from claude.ai triggers API policy error."
From October 2025 through November 2025, AUP-related refusals rose to around five to seven per month, with issues like #8784, "Claude 4.5 Throws API Error: Claude Code is unable to respond to this request for normal requests randomly."
In December, there were only a few relevant complaints, possibly due to the US holiday slowdown.
In January, the number of complaints was back to around eight.
The developer submitting #16129, "Repeated False AUP Violations in Claude Code," said, "Technical software development conversations should not trigger AUP violations. The safety filter appears overly aggressive on benign content." The numbers were similar in February and March.
In April, devs filed more than 30 reports of claimed false positives involving security work, general development use, and science-related refusals.
And there are many other recent examples of dubious refusals, including: #50795, #51352, #51794, #52086, #50494, #49904, #46147, and #51248.
Some of the uptick presumably can be attributed to a growing user base.
With more Claude customers, there are more people reporting problems.
But there are also clearly a lot of Claude users being shut down by an overaggressive AUP classifier.
Given that the leaked Claude Code source uses regex patterns for sentiment analysis, the AUP classifier may take a similar shortcut: checking for forbidden words without considering the context in which they appear.
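A context-free keyword filter of the kind speculated about above would look something like this minimal sketch. To be clear, this is a hypothetical illustration — the pattern list, function name, and logic are invented for this example and are not taken from Anthropic's code. It shows why bare keyword matching generates false positives on benign developer requests:

```python
import re

# Hypothetical blocklist -- illustrative only, not Anthropic's actual patterns.
FORBIDDEN = re.compile(r"\b(exploit|payload|keylogger|reverse shell)\b", re.IGNORECASE)

def naive_aup_check(prompt: str) -> bool:
    """Return True (i.e., refuse) if any blocklisted keyword appears,
    with no attempt to judge the surrounding context."""
    return bool(FORBIDDEN.search(prompt))

# A benign software-engineering request trips the filter because it
# happens to contain the word "exploit":
benign = "Write a unit test verifying the parser rejects a malformed exploit string."
print(naive_aup_check(benign))  # True -- a false positive
```

A context-aware classifier would instead weigh the intent of the whole request; a context-free match like this refuses the security engineer and the attacker alike.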
Anthropic did not respond to a request for comment.
®
Source: This article was originally published by The Register