What Shipped Yesterday
Anthropic released Claude Opus 4.8 on May 28, 2026, alongside two features that change what agentic coding work looks like in practice: dynamic workflows in Claude Code and effort controls in the chat interface. The model itself is a “modest but tangible” step up from Opus 4.7 in Anthropic’s own words — pricing is unchanged at $5 per million input tokens and $25 per million output — but it’s the surrounding infrastructure that warrants attention.
Dynamic workflows, in research preview as of today, let Claude Code spin up and coordinate up to 1,000 subagents in a single session. Effort controls let users dial how hard Claude thinks before responding. And a quiet but useful API change lets developers update system instructions mid-task without blowing up the prompt cache.
What’s New in Opus 4.8
The headline benchmark numbers hold up well. Opus 4.8 scores 84% on Online-Mind2Web (browser agent tasks), which Anthropic reports as a meaningful jump over both Opus 4.7 and GPT-5.5. On CursorBench — Cursor’s internal coding eval — it exceeds prior Opus models at every effort level. It’s also the first model to break 10% overall on the Legal Agent Benchmark’s all-pass standard, which matters because that benchmark requires end-to-end task completion, not just individual step accuracy.
The most important improvement is less headline-grabbing: honesty about its own work. Anthropic’s evaluations show Opus 4.8 is roughly four times less likely than Opus 4.7 to let code flaws pass unremarked. In practice, early testers note it’s more willing to flag uncertainty, push back on unsound plans, and refuse to paper over ambiguity with false confidence. That matters more when agents run for hours without human check-ins.
Fast mode — where the model operates at roughly 2.5x speed — is now three times cheaper than it was for Opus 4.7, at $10/$50 per million input/output tokens. If you’ve been avoiding fast mode on cost grounds, that’s worth revisiting.
Dynamic Workflows: How It Actually Works
The architecture is straightforward: when you trigger a workflow in Claude Code, Claude writes an orchestration script, breaks the task into subtasks, and fans them out across subagents running in parallel. Each subagent independently works its slice, outputs are verified before being folded in, and Claude reports back only when it has a coherent result. Up to 16 agents run concurrently (fewer on lower-core machines); the total per-session cap is 1,000 agents.
Progress is checkpointed throughout, so an interrupted workflow picks up where it left off instead of restarting. This is not a trivial property — codebase migrations that take hours would otherwise be impractical to run.
Anthropic describes the adversarial verification step as key: agents tackle a problem from independent angles, other agents try to refute their findings, and the loop iterates until answers converge. This is how a workflow can reach results that a single long-context pass cannot, not through brute parallel speed alone, but through independent verification.
Dynamic workflows are available in research preview in Claude Code CLI, Desktop, and the VS Code extension for Max, Team, and Enterprise plans (if admin-enabled). They’re also accessible via the Claude API.
The Bun Port: 750,000 Lines in 11 Days
The most credible stress test of dynamic workflows so far is Jarred Sumner’s Bun port. Sumner — the creator of Bun, the fast JavaScript runtime — used dynamic workflows to port Bun’s codebase from Zig to Rust. The result: roughly 750,000 lines of Rust, 99.8% of the existing test suite passing, with 11 days from first commit to merge.
The workflow decomposed into three phases. First, a workflow that mapped the correct Rust lifetime for every struct field in the Zig codebase — pure analysis work that benefited from parallel agents cross-checking each other. Second, a generation phase where hundreds of agents wrote .rs files as behavior-identical ports of their .zig counterparts, with two reviewer agents per file. Third, a fix loop that ran the build and test suite iteratively until both were clean.
Before citing this as proof that agentic AI can handle production migrations, two caveats are worth noting. Sumner has described this as an experiment, not a production deployment. The Rust codebase exists to compare against the Zig version on performance, memory, and maintainability — there is, he said, a high chance the Rust code gets discarded. What the port demonstrates is not that dynamic workflows ship production code automatically, but that they can tackle a scale of work — 750K lines, 11 days — that would have been effectively out of reach for a single-context-window agent loop.
For context on how this compares to other agentic coding tools, the vortx.ch comparison of Claude Code, Cursor, and Copilot from earlier this month covered the gap in parallel agent capability — dynamic workflows make that gap wider.
Effort Controls and a Quiet API Change
Effort controls are now surfaced in claude.ai and Cowork as a slider alongside the model selector. The four levels are low, default (high), extra, and max. In Claude Code, the extra level maps to xhigh. Anthropic recommends extra or max for difficult tasks and long-running async workflows; default is the right choice for most interactive coding sessions.
The rate limit increase that ships alongside this matters: higher effort levels consume more tokens per task, and Anthropic has adjusted Claude Code rate limits upward to accommodate that. The effective throughput on Max and Team plans is higher than before even at the same nominal rate.
The API change is subtle but saves real headaches for agent developers: the Messages API now accepts system entries inside the messages array. Previously, updating Claude’s instructions mid-task — to change permissions, token budgets, or environment context — required routing the update through a user turn, which disrupted the conversation structure. It now works cleanly without touching the prompt cache. This is the kind of thing that doesn’t make headlines but eliminates a class of workaround that every serious agent harness was doing differently.
The broader trajectory here is the same one noted in earlier analysis of agentic engineering adoption: tooling is maturing faster than team processes. Dynamic workflows can handle the migration; knowing when to use them, what to verify, and how to structure the handback to human engineers is still largely uncharted.
What’s Coming Next: Project Glasswing
Anthropic’s announcement included a preview of what follows Opus: a “new class of model with even higher intelligence than Opus.” Under Project Glasswing, a small number of organizations are already using Claude Mythos Preview for cybersecurity work. Anthropic says that Mythos-class models require stronger cyber safeguards before general availability, and that those safeguards are close — they expect Mythos to reach all customers “in the coming weeks.”
The alignment assessment for Opus 4.8 notes its misaligned behavior rates are similar to Mythos Preview, which suggests the gap between the two on safety properties is narrower than on raw capability. The practical implication: the caution around releasing Mythos is specifically about offensive cyber capability, not about general alignment.
Further Reading
- Introducing Claude Opus 4.8 — Anthropic’s official release post with benchmark tables and full feature details.
- Introducing Dynamic Workflows in Claude Code — the deeper technical walkthrough of how orchestration and verification work.
- The New Stack: Claude Opus 4.8 Analysis — solid independent breakdown of the effort controls and what the honesty improvements mean in practice.
- Simon Willison’s Opus 4.8 Notes — concise developer-focused take on what actually changed and what remains to be seen.

