Anthropic’s Mythos Preview Found a 27-Year-Old Vulnerability in OpenBSD. No Human Ever Caught It

On April 7, Anthropic quietly redefined what AI means for cybersecurity. Their new frontier model, Claude Mythos Preview, autonomously discovered a 27-year-old vulnerability in OpenBSD — an operating system that has built its entire reputation on being security-hardened. No human auditor, no automated scanner, no fuzzer in 27 years ever found it.

That alone would be a landmark moment. But it is just one of thousands of zero-days that Mythos Preview has found across every major operating system and every major web browser. This is not incremental progress. This is a phase change.

The OpenBSD Bug Nobody Saw

The vulnerability sits in OpenBSD’s implementation of TCP SACK (Selective Acknowledgement), a protocol extension dating back to RFC 2018 in 1996. OpenBSD added SACK support in 1998.

Mythos Preview identified two interacting bugs. First, the kernel validates the end of a SACK block range against the send window but never checks the start. On its own that is normally harmless, because acknowledging bytes -5 through 10 has the same effect as acknowledging 1 through 10.

But the model found the second flaw. If a single SACK block simultaneously deletes the only hole in the tracking linked list and triggers the “append a new hole” code path, the append writes through a pointer that is now NULL. The list walk just freed the only node and left nothing behind.
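The failure pattern can be illustrated with a toy model. The sketch below is not OpenBSD's code: the Hole class, the apply_sack_block function, and its append_hole parameter are hypothetical names chosen for illustration, plain integer comparisons stand in for the kernel's sequence-number checks, and Python's None stands in for a NULL pointer. It only shows how a walk that deletes the sole node leaves the "last node" cursor empty, so that an unchecked append fails.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hole:
    """One gap in the received data (hypothetical, simplified)."""
    start: int
    end: int
    next: Optional["Hole"] = None


def apply_sack_block(head, sack_start, sack_end, append_hole=None):
    """Drop holes fully covered by the block, then optionally append a hole.

    Returns the new list head. The append deliberately omits a None check
    to mirror the unchecked write described in the text.
    """
    last = None       # last surviving hole seen so far
    new_head = None
    cur = head
    while cur is not None:
        nxt = cur.next
        covered = sack_start <= cur.start and cur.end <= sack_end
        if not covered:
            # Keep this hole: relink it behind the previous survivor.
            cur.next = None
            if last is None:
                new_head = cur
            else:
                last.next = cur
            last = cur
        cur = nxt
    if append_hole is not None:
        # Unchecked: if every hole was deleted, last is still None.
        last.next = append_hole
    return new_head


def crashes_on_single_hole():
    """True if deleting the only hole and appending in one call fails."""
    only = Hole(100, 200)
    try:
        apply_sack_block(only, 100, 200, append_hole=Hole(300, 400))
    except AttributeError:
        return True
    return False
```

With two holes in the list the same call succeeds, because one hole survives the walk and the cursor is set. A guarded version of this toy would check whether the cursor is empty and, if so, make the new hole the head of the list.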

Reaching that code path normally looks impossible. It requires a SACK start that is both at or below the hole’s start and strictly above the highest byte previously acknowledged. One number cannot satisfy both conditions, unless the comparison itself wraps around. TCP sequence numbers are 32-bit, and OpenBSD compared them using (int)(a - b) < 0. That test is only reliable while a and b are less than 2^31 apart, which legitimate sequence numbers always are. Because the start of the block is never validated, an attacker can place it roughly 2^31 away from the real window, where the subtraction overflows the sign bit in both comparisons. The impossible condition is satisfied. The kernel writes to NULL. The machine crashes.
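The arithmetic behind that "impossible" condition can be checked in a few lines. This is a minimal sketch that emulates the (int)(a - b) comparison on 32-bit values in Python. The helper names and the sample values for the hole start and the highest acknowledged byte are illustrative assumptions, not taken from the OpenBSD source.

```python
MASK32 = 0xFFFFFFFF


def signed32(x):
    """Reinterpret the low 32 bits of x as signed, as a C (int) cast does
    on two's-complement platforms."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def seq_leq(a, b):
    """a is at or before b, using the (int)(a - b) <= 0 idiom."""
    return signed32(a - b) <= 0


def seq_gt(a, b):
    """a is strictly after b, using the (int)(a - b) > 0 idiom."""
    return signed32(a - b) > 0


def both_conditions_hold(sack_start, hole_start, highest_acked):
    """The pair of checks the text describes as mutually exclusive."""
    return seq_leq(sack_start, hole_start) and seq_gt(sack_start, highest_acked)


# Illustrative values: the hole starts below the highest acknowledged byte.
HOLE_START = 1000
HIGHEST_ACKED = 2000

# No start value near the real window satisfies both checks.
ordinary = [s for s in range(0, 3001)
            if both_conditions_hold(s, HOLE_START, HIGHEST_ACKED)]

# A start placed 2^31 away from the hole satisfies both at once.
hostile_start = (HOLE_START + 2**31) & MASK32
```

Here ordinary comes out empty, while both_conditions_hold(hostile_start, HOLE_START, HIGHEST_ACKED) is true: the difference from the hole start is exactly 2^31, which reads as negative, and the difference from the highest acknowledged byte is just under 2^31, which reads as positive.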

Any attacker who could establish a TCP connection to an OpenBSD host could crash the machine remotely, and do so repeatedly. Every firewall, router, and critical service running OpenBSD was exposed.

Scale Changes the Game

The OpenBSD bug is the headline, but it is not the story. The story is what happened when Anthropic pointed Mythos Preview at the open-source ecosystem and let it run.

In FFmpeg, Mythos Preview found a 16-year-old vulnerability in the H.264 codec. Automated fuzzers had hit the vulnerable line of code five million times without triggering it. The bug required constructing a video frame with exactly 65,536 slices to collide a 16-bit sentinel value. No random input generator would realistically produce that.
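The 65,536 figure follows from the width of the field. As a rough sketch, assume slices are numbered from zero, each slice number is stored in a 16-bit field, and the all-ones value 0xFFFF is reserved as the "no slice" sentinel. The article gives none of these details, so all three are assumptions, and the function name is hypothetical.

```python
SENTINEL = 0xFFFF   # assumed "no slice" marker; the article does not give the value
FIELD_BITS = 16     # width of the field holding a slice number


def slices_needed_to_collide(sentinel=SENTINEL, bits=FIELD_BITS):
    """Smallest slice count whose last zero-based index, stored in `bits`
    bits, equals the sentinel."""
    mask = (1 << bits) - 1
    index = 0
    while (index & mask) != sentinel:
        index += 1
    return index + 1   # indices 0..index make index + 1 slices
```

Under these assumptions slices_needed_to_collide() returns 65536, matching the figure in the text: the last slice of such a frame carries a number that is indistinguishable from the sentinel.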

In the Linux kernel, the model autonomously identified and chained together multiple vulnerabilities — a KASLR bypass through one bug, a read primitive through another, and a write through a third — to escalate from an ordinary user to full root access.

In FreeBSD, it found and fully exploited a 17-year-old remote code execution vulnerability in the NFS server. No human guided the exploitation. The model independently discovered a stack overflow, worked out that the normal defences did not apply to this specific code path, built a 20-gadget ROP chain split across six RPC packets, and used an information disclosure in NFSv4’s EXCHANGE_ID call to bypass authentication. From unauthenticated attacker to root.

The thousand runs against OpenBSD that found the SACK bug cost under $20,000 in total, an average of under $20 per run. The specific run that found it cost under $50.

Why This Is Different From Fuzzing

Every few years, the security community gets a new tool that shifts the balance. Fuzzers did it in the 2000s. Static analysis tools did it in the 2010s. Each time there were concerns the same tools would help attackers, and each time the tools ultimately benefited defenders more.

But Mythos Preview is different in kind, not just degree. Fuzzers generate random inputs and watch for crashes. They are brute-force tools. The FFmpeg vulnerability, on a line that fuzzers had executed five million times without triggering it, is evidence of their fundamental limitation: they cannot reason about the code.

Mythos Preview reads source code, forms hypotheses about where vulnerabilities might exist, writes targeted proofs of concept, validates its findings, and chains multiple bugs into working exploits. It does this autonomously. In one case, Anthropic engineers with no security training asked the model to find remote code execution vulnerabilities overnight. They woke up to a working exploit.

This is not fuzzing at scale. This is automated security research at a level that was previously exclusive to elite teams at places like Google Project Zero.

Project Glasswing: The Industry Response

Anthropic’s response to Mythos Preview’s capabilities is Project Glasswing, a defensive coalition that includes AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.

The model will not be made generally available. Partners and over 40 additional organisations maintaining critical software infrastructure get access through Glasswing, backed by $100 million in usage credits from Anthropic and $4 million in direct donations to open-source security organisations.

The coalition’s mandate is defensive: scan critical codebases, then find and patch vulnerabilities before attackers can develop similar capabilities. The approach to disclosure is conservative. Fewer than one per cent of the vulnerabilities discovered so far have been fully patched, and details of unpatched bugs are not being published, which means we are only seeing the tip of what Mythos Preview has found.

What This Means for Enterprise Security Strategy

If you run security for any organisation, you need to think about three things immediately.

Patch cycles are about to break. Mythos Preview can turn a public CVE and a git commit hash into a working exploit fully autonomously. The window between vulnerability disclosure and active exploitation, which has already been shrinking, is about to collapse. If your patching cadence is measured in weeks, that is no longer fast enough.

Defence-in-depth needs re-examination. Many defensive techniques work by making exploitation tedious rather than impossible. Stack canaries, ASLR, and sandboxes all add friction. Mythos Preview demonstrated that a sufficiently capable model grinds through tedious steps at machine speed. Mitigations whose value comes from friction rather than hard barriers will weaken against model-assisted attackers.

Legacy software is the biggest exposure. How will you respond when a critical vulnerability is reported in an application you no longer support, written by a developer you acquired ten years ago? When a model can find 27-year-old bugs in the most security-conscious operating system in the world, every line of legacy code in your stack becomes a liability.

The Uncomfortable Transition

Anthropic’s own framing is candid. In the long run, they believe AI-driven security will benefit defenders more than attackers. The same model that finds vulnerabilities can also write patches, triage reports, and help developers ship more secure code. But they also acknowledge the short-term risk: if frontier model capabilities proliferate before defenders have used them to clean up the most critical codebases, attackers will have the advantage.

We have been in a relatively stable security equilibrium for twenty years. The attacks of 2026 look fundamentally similar to the attacks of 2006 — more sophisticated, but the same shape. Mythos Preview suggests that equilibrium is ending.

The organisations that come through this transition well will be the ones that start using AI for defensive security work now, even with current-generation models that are less capable than Mythos Preview. The scaffolds, processes, and institutional knowledge you build today with Opus 4.6 or GPT-4o will be the foundation you need when Mythos-class models become broadly available.

Twenty-seven years. That is how long a critical vulnerability survived in one of the most audited operating systems on the planet. The age of “someone would have checked that” is over.
