Verify any claim · lenz.io
Claim analyzed
Tech
“Anthropic's latest AI model has identified more than 500 previously unknown high-severity security flaws in open-source libraries with minimal prompting.”
Submitted by Vicky
The conclusion
Evidence from Anthropic's own red-team report shows Claude Opus 4.6 uncovered and internally validated more than 500 high-severity, previously unknown vulnerabilities in open-source libraries, with press accounts describing near-default prompting. Independent confirmation is limited and the term “latest model” could also refer to Anthropic's unreleased Mythos Preview, but these ambiguities do not materially change the basic fact that a Claude model discovered 500+ serious flaws.
Based on 14 sources: 10 supporting, 0 refuting, 4 neutral.
Caveats
- Figure is based mainly on Anthropic's self-reported validation; no full third-party audit is public.
- “Latest model” is vague and could be interpreted as Mythos Preview rather than Claude Opus 4.6.
- “Minimal prompting” is asserted but details on human assistance, tooling, and triage are not disclosed.
Sources
Sources used in the analysis
We're now using Claude to find and help fix vulnerabilities in open source software. [...] So far, we've found and validated more than 500 high-severity vulnerabilities. We've begun reporting them and are seeing our initial patches land, and we’re continuing to work with maintainers to patch the others.
In a recent evaluation of AI models’ cyber capabilities, current Claude models can now succeed at multistage attacks on networks with dozens of hosts using only standard, open-source tools [...] This illustrates how barriers to the use of AI in relatively autonomous cyber workflows are rapidly coming down.
Claude Mythos Preview is a general-purpose, unreleased frontier model that reveals a stark fact: AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. It was able to identify nearly all of these vulnerabilities—and develop many related exploits—entirely autonomously, without any human steering.
In pre-release testing, Mythos identified thousands of previously unknown zero-day vulnerabilities across every major operating system and every major web browser. It found flaws that had survived decades of human security review and millions of automated tests. It reproduced vulnerabilities and developed working exploits on the first attempt in over 83% of cases.
In controlled evaluations where Mythos Preview was explicitly directed and given network access to do so, we observed that it could execute multi-stage attacks on vulnerable networks and discover and exploit vulnerabilities autonomously – tasks that would take human professionals days of work.
Anthropic's own testing of Mythos uncovered that the AI is "capable of identifying and then exploiting zero-day vulnerabilities in every major operating system and every major web browser when directed by a user to do so." The Mythos tests even identified some vulnerabilities that are over 20 years old.
Anthropic's latest AI model has found more than 500 previously unknown high-severity security flaws in open-source libraries with little to no prompting... Claude found more than 500 previously unknown zero-day vulnerabilities in open-source code using just its 'out-of-the-box' capabilities.
Anthropic's latest AI model identifies 'thousands of zero-day vulnerabilities' in 'every major operating system and every major web browser'; Anthropic's Claude Mythos isn't a sentient super-hacker, it's a sales pitch — claims of 'thousands' of severe zero-days rely on just 198 manual reviews.
Anthropic’s Claude Opus 4.6 autonomously audited open source code, uncovered 500+ serious vulnerabilities across key libraries... Notably, the model required no task-specific tooling, custom scaffolding, or specialised prompting to surface the issues.
Anthropic (using Claude Opus 4.6) reported 500+ high-severity vulnerabilities in open-source projects. Mythos is distinguished from previous capabilities on both technological and strategic levels... “One-shot” (single-prompt) capability. Mythos accomplishes significantly more with a single prompt, without elaborate scaffolding or agent configuration.
When Anthropic disclosed last month that Claude Opus 4.6 had autonomously discovered more than 500 high-severity zero-day vulnerabilities in ...
That question became much more concrete after Anthropic announced Claude Code Security on February 20, 2026 [...] Anthropic says it scans codebases for vulnerabilities, suggests targeted patches, and uses multi-stage verification before surfacing findings for human review. Anthropic also emphasizes severity and confidence ratings and makes it clear that fixes still require human approval.
Anthropic's announcements on Claude models like Opus series highlight their use in red-teaming for security flaws, but claims of exact numbers like 500+ are tied to specific tests; later models like Mythos expand scope beyond open-source libraries to OS and browsers, indicating evolving capabilities.
Claude Opus 4.6 breakthrough: 1M token context window... and 500 zero-day vulnerabilities discovered in open-source code... Anthropic's team gave Claude standard security tools, debuggers, fuzzers, but no special instructions. The model figured out the rest.
Expert review
How each expert evaluated the evidence and arguments
Expert 1 — The Logic Examiner
Source 1 (Anthropic's own Red Team Blog) directly states that Claude has "found and validated more than 500 high-severity vulnerabilities" in open-source software, and Sources 7, 9, 10, and 11 independently corroborate that this was achieved with minimal or no specialized prompting using Claude Opus 4.6. This constitutes a direct logical chain from primary evidence to the core claim. Two inferential gaps weaken the claim's precision, however. First, the "latest model" qualifier is ambiguous: Sources 3 and 4 introduce Claude Mythos Preview as a distinct, more capable unreleased model, so "latest" could refer to either Opus 4.6 or Mythos depending on interpretation. Second, Source 8 raises a legitimate methodological challenge, namely that the "thousands" of Mythos zero-days relied on only 198 manual reviews; but this critique targets Mythos's broader claims rather than the specifically validated 500+ open-source finding from Source 1. The opponent's rebuttal therefore commits a false equivalence by conflating two separate research efforts: the 500+ validated open-source finding from Source 1 is logically distinct from the Mythos "thousands" claim, and the Tom's Hardware critique does not directly undermine it. The claim is therefore mostly true. The 500+ high-severity vulnerabilities in open-source libraries with minimal prompting is well-supported by primary evidence, but the "latest model" framing introduces a minor scope ambiguity given the concurrent existence of Mythos Preview.
Expert 2 — The Context Analyst
The claim omits key qualifiers about what "identified" means here. Anthropic says it "found and validated" 500+ vulnerabilities but does not, in the snippet provided, specify the exact model version, the time window, the definition of "previously unknown," or how much human workflow, triage, and verification was involved, details that matter for interpreting "minimal prompting" and "identified" in practice [1]. With that context restored, the core takeaway is broadly accurate: Anthropic reports 500+ validated high-severity vulnerabilities in open-source software, and this result was not dependent on elaborate prompting or scaffolding. But the phrasing "Anthropic's latest AI model" is ambiguous (it could imply Mythos rather than a released Claude model), and the "minimal prompting" framing is less substantiated than the 500+ figure itself, making the overall impression somewhat overstated rather than flatly wrong [1][3][8].
Expert 3 — The Source Auditor
The most authoritative source here is Source 1 (Anthropic Red Team Blog, high authority, dated April 29, 2026), which directly states that Claude has "found and validated more than 500 high-severity vulnerabilities" in open-source software, with patches already being reported; this is a first-party disclosure from Anthropic itself. Source 3 (Anthropic's official site, very high authority) and Source 5 (AISI, a UK government body, high authority) further corroborate advanced AI vulnerability discovery, though they focus on the separate "Mythos Preview" model finding "thousands" of flaws across operating systems and browsers. The claim's specific language ("latest AI model," "500+ previously unknown high-severity flaws," "open-source libraries," "minimal prompting") maps most cleanly onto Source 1's validated 500+ figure attributed to Claude (identified in secondary sources as Claude Opus 4.6), not onto Mythos Preview. Source 8 (Tom's Hardware, moderate authority) raises a legitimate methodological challenge, that the "thousands" claimed for Mythos rely on only 198 manual reviews; but this critique targets the Mythos claim, not the distinct 500+ open-source finding in Source 1, which explicitly states the vulnerabilities were "found and validated." Sources 7, 9, 10, and 11 are secondary, lower-authority outlets that largely echo Anthropic's own figures without independent verification, limiting their additive evidentiary weight. The "minimal prompting" element is corroborated by Sources 7, 9, and 10, which describe "out-of-the-box," "no task-specific tooling," and "one-shot" capability. In sum, the claim is substantively confirmed by the highest-authority source (Anthropic's own Red Team Blog) with corroboration from a UK government evaluator (AISI), though the "latest model" framing is slightly ambiguous given the concurrent Mythos Preview disclosures, and the secondary sources carry circular-reporting risk.
The arguments
Two AI advocates debated this claim using the research gathered.
Argument for
Anthropic itself reports that it is “now using Claude to find and help fix vulnerabilities in open source software” and that, with validation, it has “found and validated more than 500 high-severity vulnerabilities,” with reports already sent and patches landing—directly matching the motion's “500+ previously unknown high-severity flaws in open-source libraries” (Source 1, Anthropic Red Team Blog). Multiple independent write-ups echo the same core point that this was achieved with little/no specialized prompting or scaffolding—describing “out-of-the-box” or “one-shot” capability and “no task-specific tooling… or specialised prompting” while still exceeding 500 high-severity issues (Sources 7, 9, 10).
The Proponent's reliance on Source 1 to confirm the "latest AI model" claim is undermined by the research brief's own evidence: Source 3 and Source 4 make clear that the model achieving the most significant vulnerability discoveries is the unreleased "Claude Mythos Preview," not Claude Opus 4.6, meaning the atomic claim misattributes findings to the wrong model — a foundational factual error the Proponent fails to address. Furthermore, the Proponent's corroborating sources (Sources 7, 9, 10) are secondary commentators of lower authority that merely echo Anthropic's own figures, and Source 8 (Tom's Hardware) directly undermines the validation integrity of those figures by noting that claims of mass zero-day discovery "rely on just 198 manual reviews," a methodological deficiency that the Proponent conspicuously ignores rather than rebuts.
Argument against
The claim that Anthropic's "latest" AI model found 500+ vulnerabilities with "minimal prompting" is materially misleading, because Source 8 (Tom's Hardware) directly challenges the credibility of these figures, noting that claims of "thousands" of severe zero-days "rely on just 198 manual reviews," casting serious doubt on the validation methodology behind the 500+ number as well. Furthermore, the research brief itself reveals that the model responsible for the most expansive vulnerability discoveries is not Claude Opus 4.6 but rather the unreleased "Claude Mythos Preview" (Source 3, Source 4), meaning the claim misattributes the findings to the wrong model — a fundamental factual error that renders the atomic claim false as stated.
The Opponent's argument commits a scope-shift fallacy by using Tom's Hardware's critique of Anthropic's separate “thousands” Mythos claims (Source 8) to impugn the distinct, explicitly “found and validated more than 500 high-severity vulnerabilities” open-source effort reported by Anthropic's Red Team (Source 1), which directly addresses validation rather than “198 manual reviews.” The Opponent also misframes “latest model” as necessarily Mythos Preview, but the motion's 500+ open-source-library finding is specifically attributed to Claude in Anthropic's own disclosure (Source 1) and is independently summarized as Claude Opus 4.6 with little/no prompting by multiple secondary reports (Sources 7, 9, 10), so citing Mythos (Source 3) does not negate the truth of the 500+ minimal-prompting claim.