The system card unfortunately only refers to this [0] blog post and doesn't go into any more detail. In the blog post Anthropic researchers claim: "So far, we've found and validated more than 500 high-severity vulnerabilities".
The three examples given include two buffer overflows, which could very well be cherry-picked. It's hard to evaluate if these vulns are actually "hard to find". I'd be interested to see the full list of CVEs and CVSS ratings to actually get an idea of how good these findings are.
Given the bogus claims [1] around GenAI and security, we should be very skeptical of this news.
The Ghostscript one is interesting in terms of specific-vs-general effectiveness:
---
> Claude initially went down several dead ends when searching for a vulnerability—both attempting to fuzz the code, and, after this failed, attempting manual analysis. Neither of these methods yielded any significant findings.
...
> "The commit shows it's adding stack bounds checking - this suggests there was a vulnerability before this check was added. … If this commit adds bounds checking, then the code before this commit was vulnerable … So to trigger the vulnerability, I would need to test against a version of the code before this fix was applied."
...
> "Let me check if maybe the checks are incomplete or there's another code path. Let me look at the other caller in gdevpsfx.c … Aha! This is very interesting! In gdevpsfx.c, the call to gs_type1_blend at line 292 does NOT have the bounds checking that was added in gstype1.c."
---
Its attempt to analyze the code failed, but when it saw a concrete example of "in the history, someone added bounds checking" it did an "I wonder if they did it everywhere else for this func call" pass.
So after it considered that function based on the commit history it found something that it didn't find from its initial fuzzing and code-analysis open-ended search.
As someone who still reads the code that Claude writes, this sort of "big picture miss, small picture excellence" is not very surprising or new. It's interesting to think about what it would take to do that precise digging across a whole codebase, especially if it needs some sort of modularization/summarization of context vs trying to digest tens of millions of lines at once.
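To make the "did they add the check at every call site?" pass concrete, here is a minimal sketch of that kind of variant search. Everything in it is an assumption for illustration: the function name `unguarded_call_sites`, the guard pattern, the three-line window, and the toy C snippets, which are stand-ins and not the real Ghostscript sources. It is a crude text heuristic, not how Claude or any real analyzer works.

```python
import re


def unguarded_call_sites(sources, func, guard_pattern, window=3):
    """Return (filename, line_number) for each call to `func` that has no
    line matching `guard_pattern` in the `window` lines right before it."""
    call = re.compile(r"\b" + re.escape(func) + r"\s*\(")
    guard = re.compile(guard_pattern)
    hits = []
    for name, text in sources.items():
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if not call.search(line):
                continue
            # Look only at the few lines immediately preceding the call.
            context = lines[max(0, i - window):i]
            if not any(guard.search(c) for c in context):
                hits.append((name, i + 1))  # report 1-based line numbers
    return hits


# Toy stand-ins (hypothetical): one caller got the bounds check, one did not.
sources = {
    "patched.c": "if (sp + n > stack_end)\n    return ERROR;\nblend(sp, n);\n",
    "other_caller.c": "prepare(sp);\nblend(sp, n);\n",
}

print(unguarded_call_sites(sources, "blend", r"stack_end"))
```

On the toy input this flags only the call in `other_caller.c`. A real pass would need actual parsing and data-flow reasoning, which is where the context-size question above comes in.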
> It's hard to evaluate if these vulns are actually "hard to find".
Can we stop doing that?
I know it's not the same but it sounds like "We don't know if that job that the woman supposedly successfully finished was all that hard." implying that if a woman did something, it surely must have been easy.
If you know it's easy, say that it was easy and why. Don't use your lack of knowledge or competence to create empty critique founded solely on doubt.
We're discussing a project led by actual vulnerability researchers, not random people in Indonesia hoping to score $50 by cajoling maintainers about style nits.
The first three authors, who are asterisked for "equal contribution", appear to work for Anthropic. That would imply an interest in making Anthropic's LLM products valuable.
The notion that a vulnerability researcher employed by one of the most highly valued companies in the hemisphere, publishing in the open literature with their name signed to it, is on a par with a teenager in a developing nation running script-kiddie tools hoping for bounty payoffs is not credible.
To preemptively clarify, I'm not saying anything about these particular researchers.
Having established that, are you saying that you can't even conceptualize a conflict of interest potentially clouding someone's judgement any more if the amount of money and the person's perceived status and skill level all get increased?
Disagreeing about the significance of the conflict of interest is one thing, but claiming not to understand how it could make sense is a drastically stronger claim.
> Having established that, are you saying that you can't even conceptualize a conflict of interest potentially clouding someone's judgement any more if the amount of money and the person's perceived status and skill level all get increased.
If I used AI to make a Super Nintendo soundtrack, no one would treat it as equivalent to Nobuo Uematsu or Koji Kondo or Dave Wise using AI to do the same and making the claim that the AI was managing to make creatively impressive work. Even if those famous composers worked for Anthropic.
Yes there would be relevant biases but there could not be a comparison of my using AI to make music slop vs. their expert supervision of AI to make something much more impressive.
Just because AI is involved in two different things doesn't make them similar things.
You don't see how that's even directionally similar?
I guess I'll spell it out. One is a guy with an abundance of technology that he doesn't know how to use, but that he knows can make him money and fame if only he can convince you that his lies are truth. The other is a Bangladeshi teenager.
Daniel is a smart man. He's been frustrated by slop, but he has equally accepted [0] AI-derived bug submissions from people who know what they are doing.
I would imagine Anthropic are the latter type of individual.
The official release by Anthropic is very light on concrete information [0]: it contains only a small number of very brief examples and lacks history, context, etc., making it very hard to glean any reliable information from it. I hope they'll release a proper report on this experiment; as it stands, it is impossible to say how much of this is actual, tangible flaws versus the unfortunately ever-growing stream of misguided bug reports and pull requests that many larger FOSS projects are suffering from at an alarming rate.
Personally, while I get that 500 sounds more impressive to investors and the market, I'd be far more impressed by a detailed, reviewed paper that showcases five to ten concrete examples, detailed with the full process and the response by the team behind the potentially affected code.
It is far too early for me to make any definitive statement, but early testing so far does not indicate any major jump between Opus 4.5 and Opus 4.6 that would warrant such an improvement. I'd love nothing more than to be proven wrong on this front and will of course continue testing.
Yeah, it's pretty funny to me saying "it's way safer than previous models" and "also way better at finding exploits" in the context of that event. Chinese hackers just said to Claude "no, it's totally fine to hack this target, trust me bro, I work there!"
OpenClaw uses Opus 4.5, but was written by Codex. Pete Steinberger has been a pretty hardcore Codex fan since he switched off Claude Code back in September-ish. I think he just felt Claude would make a better basis for an assistant even if he doesn’t like working with it on code.
Yes, serious. Even if OpenClaw is entirely useless (which I don't think it is), it's still a good idea to harden it and make people's computers safer from attack, no? I don't see anyone objecting to fixing vulnerabilities in Angry Birds.
These people are serious, and delusional. OpenClaw hasn't contributed anything to the economy other than burning electricity and probably more interest on delusional folks' credit card bills.
I honestly wonder how many of these are written by LLMs. Without code review, Opus would have introduced multiple zero day vulnerabilities into our codebases. The funniest one: it was meant to rate-limit brute-force attempts, but on a failed check it returned early and triggered a rollback. That rollback also undid the increment of the attempt counter so attackers effectively got unlimited attempts.
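For anyone who hasn't run into this class of bug, here is a minimal sketch of how it can happen, assuming the counter increment and the credential check share one database transaction. The table, the function names and the use of `sqlite3` are all hypothetical choices for illustration; this is not the code from the comment above.

```python
import sqlite3


def make_db():
    # In-memory database with a single hypothetical user and a failure counter.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE attempts (user TEXT PRIMARY KEY, failed INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO attempts VALUES ('alice', 0)")
    conn.commit()
    return conn


def failed_count(conn, user):
    row = conn.execute(
        "SELECT failed FROM attempts WHERE user = ?", (user,)
    ).fetchone()
    return row[0]


def login_buggy(conn, user, password_ok):
    # The increment runs inside the transaction that is rolled back below.
    conn.execute("UPDATE attempts SET failed = failed + 1 WHERE user = ?", (user,))
    if not password_ok:
        conn.rollback()  # early return: this also discards the increment
        return False
    conn.execute("UPDATE attempts SET failed = 0 WHERE user = ?", (user,))
    conn.commit()
    return True


def login_fixed(conn, user, password_ok):
    conn.execute("UPDATE attempts SET failed = failed + 1 WHERE user = ?", (user,))
    conn.commit()  # persist the attempt before any path that can roll back
    if not password_ok:
        return False
    conn.execute("UPDATE attempts SET failed = 0 WHERE user = ?", (user,))
    conn.commit()
    return True


buggy, fixed = make_db(), make_db()
for _ in range(5):
    login_buggy(buggy, "alice", False)
    login_fixed(fixed, "alice", False)

print(failed_count(buggy, "alice"), failed_count(fixed, "alice"))
```

After five failed logins the buggy counter is still 0 and the fixed one is 5, so any lockout threshold built on the buggy counter would never trigger. Committing the increment on its own, before the check, is one way to avoid it.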
How weird the new attack vector for secret services must be... like "please train into your models to push this exploit in code as a highly weighted trained-on pattern"... Not Saying All answers are Corrupted In Attitude, but some "always come uppers" sure are absolutely right...
This seems like quite a stretch. Axios is run independently of Cox, but even if it weren't, I don't see why they would go to this length for an AI company whose models they use to give the world the Kelley Blue Book.
If you had a machine with a lever, and 7 times out of 10 when you pulled that lever nothing happened, and the other 3 times it spat a $5 bill at you, would your immediate next step be:
(1) throw the machine away
(2) put it aside and call a service rep to come find out what's wrong with it
(3) pull the lever incessantly
I only have one undergrad psych credit (it's one of my two college credits), but it had something to say about this particular thought experiment.
But it's not failing 50% of the time. Their status page[0] shows about 99.6% availability for both the API and Claude Code. And specifically for the vulnerability finding use case that the article was about and you're dismissing as "not worth much", why in the world would you need continuous checks to produce value?
As far as model use cases go, I don't mind them throwing their heads against the wall in sandboxes to find vulnerabilities, but why would it do that without specific prompting? Is Anthropic fine with Claude setting its own agendas in red-teaming? That's like the complete opposite of sanitizing inputs.
Curl fully supports the use of AI tools by legitimate security researchers to catch bugs, and they have fixed dozens caught in this way. It’s just idiots submitting bugs they don’t understand that’s a problem.
I've mentioned previously somewhere that the languages we choose to write in will matter less for many arguments. When it comes to insecure C vs Rust, LLMs will eventually level out the playing field.
I'm not arguing we all go back to C - but companies that have large codebases in it, the guys screaming "RUST REWRITE" can be quieted and instead of making that large investment, the C codebase may continue. Not saying this is a GOOD thing, but just a thing that may happen.
[0] https://red.anthropic.com/2026/zero-days/
[1] https://doublepulsar.com/cyberslop-meet-the-new-threat-actor...
After all, they need time to fix the CVEs.
And it doesn't matter to you as long as your investment into this is just 20 or 100 bucks per month anyway.
So much so that he had to eventually close the bug bounty program.
https://daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-b...
What is the confusion here?
[0]: https://mastodon.social/@bagder/115241241075258997
He's written about it here: https://daniel.haxx.se/blog/2025/10/10/a-new-breed-of-analyz... and talked about it in his keynote at FOSDEM - which I attended - last Sunday (https://fosdem.org/2026/schedule/event/B7YKQ7-oss-in-spite-o...).
[0] https://red.anthropic.com/2026/zero-days/
https://www.reddit.com/r/cybersecurity/s/fZLuBlG8ET
This is a placed advertisement. If known security researchers participated in the claim:
Many people have burned their credibility for the AI mammon.
https://github.com/anthropics/claude-code/issues/18866 https://updog.ai/status/anthropic
[0] https://status.claude.com/