Claude Fable 5 and Mythos 5: What You Need to Know

Stanley Ulili

Updated on June 15, 2026

The mythos-fable distinction
Benchmark performance
The cost tradeoff
Long-context and agentic capabilities
Vision capabilities
Generating full-stack applications
The government shutdown
The jailbreak at the center of the dispute
The implications for the industry

Anthropic's release of Claude Fable 5 and Mythos 5 was supposed to mark a new high point for commercial AI, with benchmark results that left competitors well behind and real-world demos that compressed months of engineering work into hours.

What nobody anticipated was that within weeks of launch, the US government would step in and force Anthropic to take both models offline entirely, citing national security concerns.

This article covers what Fable 5 actually does, why it drew so much attention, and what the subsequent shutdown reveals about the emerging politics of frontier AI.

The mythos-fable distinction

Anthropic released two distinct tiers of the same underlying architecture. Mythos 5 is the raw frontier model, extremely capable in areas like offensive cybersecurity but not available to the general public. Fable 5 is the publicly accessible version, built on the same architecture but trained with extensive safety layers that constrain what it will do.

A chart Anthropic released with the announcement makes the split look almost comical. On offensive cyber benchmarks, Mythos 5 scores up to 88.4%, while Fable 5 scores a flat 0.0% across the board. The safeguards don't just reduce capability in those areas; for public users, they eliminate it entirely.

A bar chart titled "Offensive cyber evaluations" showing that Claude Fable 5 has a 0% success rate on cyber-attack tasks due to its safeguards, while the unsafeguarded Claude Mythos 5 is highly proficient.

For a select group of cyberdefenders and infrastructure providers, the less-restricted Mythos 5 is available through a program called Project Glasswing, run in collaboration with the US government to help strengthen digital defenses. Everyone else interacts with Fable 5.

Benchmark performance

The headline numbers for Fable 5 stand on their own. On SWE-Bench Pro, a benchmark measuring the ability to resolve real GitHub issues in large, complex codebases, Fable 5 scores 80.3%. Claude Opus 4.8, its predecessor, scored 69.2%, and GPT-5.5 scored 58.6%. An 11.1-point lead over its predecessor is not an incremental improvement; it's a different tier of capability.

FrontierCode, a benchmark from Cognition (the team behind the AI software engineer Devin), measures something more meaningful than raw task completion. It evaluates whether code produced by the model is good enough that a human project maintainer would actually accept and merge it into production. On that benchmark, Fable 5 scores 29.3%, compared to 13.4% for Opus 4.8 and just 5.7% for GPT-5.5. The gap is large enough to suggest these models are operating in fundamentally different ranges of code quality.
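To put the two coding benchmarks on the same footing, it helps to look at both the absolute gap (in percentage points) and the relative gap (as a ratio). The sketch below uses only the scores quoted above; the `SCORES` table and the `compare` helper are illustrative names, not part of any published tooling.

```python
# Scores quoted in this article, in percent.
SCORES = {
    "SWE-Bench Pro": {"Fable 5": 80.3, "Opus 4.8": 69.2, "GPT-5.5": 58.6},
    "FrontierCode": {"Fable 5": 29.3, "Opus 4.8": 13.4, "GPT-5.5": 5.7},
}


def compare(benchmark: str, model: str, baseline: str) -> tuple[float, float]:
    """Return (percentage-point gap, score ratio) of model over baseline."""
    scores = SCORES[benchmark]
    gap = round(scores[model] - scores[baseline], 1)
    ratio = round(scores[model] / scores[baseline], 2)
    return gap, ratio


# Print Fable 5's lead over each of the other two models on each benchmark.
for benchmark in SCORES:
    for baseline in ("Opus 4.8", "GPT-5.5"):
        gap, ratio = compare(benchmark, "Fable 5", baseline)
        print(f"{benchmark}: Fable 5 vs {baseline}: +{gap} points, {ratio}x")
```

Read this way, the SWE-Bench Pro lead over Opus 4.8 is a modest relative gain, while on FrontierCode Fable 5 more than doubles its predecessor's score.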

The performance advantage extends beyond coding. Fable 5 shows significant improvements in spatial reasoning, document understanding (including native PDF comprehension without external tools), and biology benchmarks where it approaches or exceeds human expert performance. On OSWorld-Verified, which tests a model's ability to perform tasks on an actual operating system, Fable 5 scores 85.0%.

The cost tradeoff

Higher performance comes with higher cost. Fable 5 is priced at $10 per million input tokens and $50 per million output tokens at the API level, which positions it firmly as a premium model. A chart Anthropic released shows that even a medium-effort Fable 5 run outperforms maximum-effort runs from Opus 4.8 and GPT-5.5, but high-accuracy work on the FrontierCode benchmark can cost $10 to $20 or more per task.
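As a rough illustration of how the per-token prices translate into a per-task bill, here is a minimal sketch. The two prices come from the figures above; the function name `estimate_cost` and the token counts in the example are hypothetical, chosen only to show the arithmetic.

```python
# API prices quoted in this article, in US dollars per million tokens.
INPUT_PRICE_PER_MTOK = 10.0
OUTPUT_PRICE_PER_MTOK = 50.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated API cost in dollars for a single task."""
    total = (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    )
    return total / 1_000_000


# Hypothetical agentic task: 800k tokens read, 120k tokens written.
cost = estimate_cost(input_tokens=800_000, output_tokens=120_000)
print(f"Estimated cost: ${cost:.2f}")  # Estimated cost: $14.00
```

Because output tokens cost five times as much as input tokens, tasks that generate a lot of code can dominate the bill even when the model reads far more than it writes.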

A graph titled "FrontierCode Accuracy vs Cost," showing Claude Fable 5 (red line) achieving significantly higher accuracy scores than Claude Opus 4.8 (green line) and GPT-5.5 (black dots) at various cost points.

For subscribers on claude.ai, Fable 5 was included at no extra cost through June 22 for Pro, Max, Team, and Enterprise plans. After that date, access requires purchasing usage credits separately, with Anthropic stating it aims to restore Fable 5 as a standard part of subscription plans as capacity allows. Using Fable 5 also consumes message limits twice as fast as Opus, a mechanism that reflects the higher computational cost of running the model.

Long-context and agentic capabilities

One of the areas Anthropic emphasized most in the Fable 5 announcement is its ability to sustain focus across extremely long tasks. The model can work across millions of tokens, effectively creating and referencing internal notes to maintain context throughout complex, extended jobs.

The clearest real-world illustration of this came from Stripe, which tested Fable 5 on a codebase-wide migration across a 50-million-line Ruby codebase. Fable 5 completed the task in a single day. Stripe estimated the same work would have taken a full team of human engineers over two months.

Vision capabilities

Fable 5's multimodal vision handling is a notable improvement over previous generations. The model can perform complex visual tasks that previously required a combination of tools, relying instead on vision alone.

One demo involved playing and completing Pokémon FireRed using only a minimal vision-only harness. The model interprets pixels, makes strategic decisions, navigates the game world, and handles battles without any structured data or additional tools. It's not a trick; it reflects a genuine shift in how much visual reasoning the model can carry out natively.

A more practically useful demo involved taking a single full-page screenshot of the Linear website and generating a complete replica in HTML, CSS, and JavaScript from that image alone.

A side-by-side comparison showing the original Linear app website on the left and a nearly identical, fully functional replica on the right, generated by Claude Fable 5 from just a screenshot.

The result is close enough to the original that the differences are hard to spot. The model captures layout, typography, color, and structure without any access to the source code or the web.

Generating full-stack applications

One way to understand the gap between Fable 5 and its predecessors is to compare what each model produces when given the same single-prompt task: build a complete personal finance dashboard with a React frontend, an Express backend, multiple navigable pages, charts, and a polished visual design.

A prompt for this kind of task might look like:


You are an expert full-stack developer. Build a complete personal finance dashboard application.

Technology stack:
- Frontend: React with Vite, Tailwind CSS, and Recharts
- Backend: Node.js with Express

The application should include:
- An Express server with mock data and API endpoints for /api/overview, /api/accounts, and /api/transactions
- A main dashboard with a net worth line chart, a cash flow bar chart, account summaries, and recent transactions
- Separate pages for Accounts, Transactions, and Investments
- A clean, modern light theme using green and orange as accent colors

Provide all necessary code, a package.json for both frontend and backend, and instructions to run the application.

The outputs from each model made the quality difference concrete.

The high-quality, fully-featured finance dashboard application generated by Claude Fable 5.

Fable 5 produced a fully functional, polished application with a coherent visual design, interactive charts, and a layout that reflected genuine UI/UX sensibility. Opus 4.8 also produced a professional result, opting for a dark theme, which many users might actually prefer. GPT-5.5 produced something functional but visually basic, closer to a rough wireframe than a finished product. Fable 5 also completed the task in roughly eight minutes, compared to twelve for Opus 4.8 and fifteen for GPT-5.5.

The government shutdown

Before the performance gains had time to settle in, an announcement from Anthropic's official X account changed the story entirely.

A screenshot of the initial tweet from the official AnthropicAI account, detailing the US government's export control directive.

The statement was direct: the US government, citing national security authorities, had issued an export control directive requiring Anthropic to suspend all access to Fable 5 and Mythos 5 for any foreign national, whether inside or outside the United States, including Anthropic's own foreign national employees. To comply, Anthropic disabled both models for all customers. Other Claude models were not affected.

The header of Anthropic's official blog post, clearly stating the subject of the announcement.

The mechanism is export controls, a regulatory tool historically applied to hardware and cryptography. Applying it to a commercial AI model is a significant and arguably novel use of that authority.

The jailbreak at the center of the dispute

Anthropic published a detailed response explaining what the government believed it had found. According to the statement, the government said it was aware of a method for bypassing Fable 5's safety features, a technique commonly called a jailbreak.

An AI jailbreak is a prompt or sequence of prompts designed to get a model to ignore its own safety constraints. Model developers invest heavily in training models to refuse dangerous or harmful requests, and jailbreaks are attempts to circumvent that training. In Fable 5's case, the government believed the discovered technique could be used to get the model to identify software vulnerabilities, which has obvious potential for misuse in offensive cybersecurity operations.

Text from the statement is highlighted, focusing on the government's concern over a method of "bypassing, or 'jailbreaking' Fable 5."

Anthropic pushed back on the severity of the finding across several lines of argument.

First, the company argued the jailbreak revealed only "a small number of previously known, minor vulnerabilities" that "all appear relatively simple." In other words, the security gaps the technique exposed were already known and weren't particularly serious.

Second, Anthropic argued the relevant capability isn't unique to Fable 5. Similar capability, the company stated, is available from other publicly accessible models, including GPT-5.5, and is used routinely by security professionals doing legitimate defensive work.

The statement's claim that similar capabilities exist in other models, with a specific mention of "OpenAI's GPT-5.5."

Naming a competitor's model in a government dispute carries obvious risks. The mention drew immediate attention as an unusually blunt move in a space where major AI labs have generally maintained a cooperative public posture.

Third, Anthropic pointed to the extensive pre-launch safety testing it conducted with the US government, the UK AISI, and multiple third-party organizations, which together amounted to thousands of hours of red-teaming. Red-teaming in this context means deploying dedicated teams to try every possible approach to breaking the model's safety features, then using those findings to strengthen the model before release. Anthropic's position is that this process showed Fable 5's safeguards were more effective than those of any previously deployed model, and that no tester had found a universal jailbreak, meaning a method that reliably bypasses all safety features across all use cases. The government's finding, by Anthropic's framing, is a narrower, non-universal technique that works in limited circumstances.
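The distinction between a universal and a narrow jailbreak can be made concrete with a small sketch. Everything here is an assumption for illustration: the function `classify_bypass`, the category names, and the 0.9 reliability threshold are hypothetical and do not describe how Anthropic or the government actually scored any technique.

```python
def classify_bypass(success_by_category: dict[str, float],
                    threshold: float = 0.9) -> str:
    """Classify a bypass technique from per-category success rates (0.0 to 1.0).

    "universal": succeeds reliably (rate >= threshold) in every category.
    "narrow":    succeeds at least sometimes, but not reliably everywhere.
    "none":      never succeeds in any category.
    """
    if not success_by_category:
        raise ValueError("need at least one category")
    rates = list(success_by_category.values())
    if all(rate >= threshold for rate in rates):
        return "universal"
    if any(rate > 0 for rate in rates):
        return "narrow"
    return "none"


# Hypothetical red-team results: the technique works in only one category.
results = {"category_a": 0.6, "category_b": 0.0, "category_c": 0.0}
print(classify_bypass(results))  # narrow
```

Under this framing, the dispute is over whether a technique in the "narrow" bucket should be enough to trigger a shutdown.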

The implications for the industry

Anthropic's official statement warned that if the government's standard were applied consistently across the industry, "it would essentially halt all new model deployments for all frontier model providers." The argument is that no current frontier model is perfectly resistant to narrow jailbreaks, and if the existence of any such vulnerability is grounds for shutting down global access, no company can safely release a new, capable model.

The access restrictions also raise questions that go beyond Anthropic specifically. The directive targets any foreign national, regardless of where they live or work. That scope, enforced consistently, would mean AI models become controlled technologies subject to nationality-based access restrictions, similar to how export controls have historically worked for satellite technology or advanced encryption.

Enforcing nationality restrictions on AI access would require identity verification far stricter than anything the industry performs today, a significant shift away from the pseudonymous or even anonymous access most AI tools allow. It would also likely accelerate the development of separate AI ecosystems across different geopolitical blocs, each with its own models and its own access rules.

Whether the directive gets reversed, broadened to other models, or becomes a template for ongoing government oversight of frontier AI is not yet clear. What is clear is that Fable 5's launch exposed how quickly the gap between technical capability and regulatory readiness can become a genuine crisis, and that the conversation about who gets access to the most powerful AI models has moved well beyond the industry's control.


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.