Shelly Palmer - Claude is now a day worker

Think about this: Evolving from a five-minute attention span to what feels like a full work shift at silicon speed.

Shelly Palmer May 23, 2025 5:00 PM

Opus 4 can pursue a complex coding task for about seven consecutive hours without losing context. Photo by Chris Ried on Unsplash

Yesterday at Anthropic’s first “Code with Claude” conference in San Francisco, the company introduced Claude Opus 4 and its companion, Claude Sonnet 4. The headline is clear: Opus 4 can pursue a complex coding task for about seven consecutive hours without losing context. That leap takes us from last year’s five-minute attention span to what feels like a full work shift at silicon speed.

Why This Matters

In the GPT-3 era, a five-second task could be completed in roughly five seconds. I projected that by early 2026, a five-month task might be completed in five seconds. Anthropic’s announcement forces me to redraw that curve. Sustained multi-hour autonomy is exactly what a practical agentic system requires. Continuous context, tool use and memory enable an AI worker to accept a ticket, call the right APIs, refactor dozens of files, run tests, and open a pull request while you sleep.

What Anthropic Shipped

Extended thinking and parallel tool use. Opus 4 interleaves reasoning with simultaneous tool calls and stores each 64,000-token thinking segment for later steps.
200,000-token context window. The model retains about 500 pages of text in active memory, which means that long documents, entire codebases and detailed test suites no longer need aggressive pruning.
Pricing parity with Claude 3. Opus 4 remains at $15 per million input tokens and $75 per million output tokens. Sonnet 4 stays at $3 and $15, respectively. The premium over GPT-4o persists, but there is no launch surcharge.
AI Safety Level 3. Anthropic added stronger jailbreak detection, enhanced tool sandboxes and a public bug-bounty program. Whether regulators find that sufficient remains to be seen, yet the posture exceeds the industry baseline.
Early customer evidence. Rakuten let Opus 4 refactor an open-source project for seven hours without human intervention, a task earlier Claude versions abandoned after forty-five minutes. Engineers at Sourcegraph report that the model stays on track longer and produces more elegant patches. GitHub plans to integrate Sonnet 4 into Copilot’s forthcoming coding agent.
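
To make the list prices above concrete, here is a minimal sketch of the arithmetic. The per-million-token prices are the ones quoted in this article. The dictionary keys are plain labels chosen for illustration (not official API model identifiers), and the token counts in the example workload are hypothetical.

```python
# List prices quoted in the article, in US dollars per million tokens.
# The keys are illustrative labels, not official API model identifiers.
PRICES_PER_MILLION = {
    "opus-4": {"input": 15.00, "output": 75.00},
    "sonnet-4": {"input": 3.00, "output": 15.00},
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the token cost in dollars for one run at list price."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical long-running task: 2M input tokens, 300k output tokens.
    for model in PRICES_PER_MILLION:
        print(model, round(run_cost(model, 2_000_000, 300_000), 2))
```

Under these assumed token counts, the same workload costs five times as much on Opus 4 as on Sonnet 4, which is simply the ratio of the list prices. Real workloads will differ, so treat the numbers as a template for your own estimates.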

Strategic Lens for the C-Suite

Agent pipelines become real. If a large language model can own a ticket for most of a business day, leaders can design workflows in which human review brackets the work instead of supervising each step. Productivity gains will appear first in code maintenance, data cleanup, and document generation.
Cost calculus shifts to output quality. Opus 4 costs roughly 3-4x more per token than GPT-4o. If it closes tickets with fewer iterations, the blended cost per deliverable can still fall. Run side-by-side pilots that measure human review minutes rather than only model latency.
Risk and compliance require new guardrails. Seven-hour autonomy expands the blast radius of an error. Instrument agent chains with hard policy stops, such as repository write limits, approval gates and indemnity clauses, before allowing the model to touch production assets.
Procurement gets easier. Immediate availability on AWS Bedrock and Google Vertex AI removes a major deployment hurdle. If your cloud teams have already connected security controls to Bedrock, adopting Opus 4 is essentially an API call.
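
The hard policy stops mentioned above (repository write limits and approval gates) can be sketched in a few lines. This is an illustration under stated assumptions, not a reference to any vendor's tooling: the class name `WriteGuard`, the exception `PolicyViolation`, and the protected path prefixes are all hypothetical, and a real deployment would enforce these checks outside the agent's own process.

```python
from dataclasses import dataclass, field


class PolicyViolation(Exception):
    """Raised when an agent action crosses a hard policy stop."""


@dataclass
class WriteGuard:
    """Hypothetical guard that an agent harness calls before each file write."""

    max_files_written: int                            # repository write limit
    protected_prefixes: tuple = ("prod/", "infra/")   # assumed protected paths
    approved_paths: set = field(default_factory=set)  # human-approved paths
    files_written: set = field(default_factory=set)   # distinct files touched

    def approve(self, path: str) -> None:
        """Record a human approval for one protected path."""
        self.approved_paths.add(path)

    def check_write(self, path: str) -> None:
        """Raise PolicyViolation if the write is not allowed, else record it."""
        # Approval gate: protected paths need an explicit human sign-off.
        if path.startswith(self.protected_prefixes) and path not in self.approved_paths:
            raise PolicyViolation(f"approval required for {path}")
        # Write limit: cap the number of distinct files the agent may touch.
        is_new_file = path not in self.files_written
        if is_new_file and len(self.files_written) >= self.max_files_written:
            raise PolicyViolation("repository write limit reached")
        self.files_written.add(path)
```

A harness would call `check_write` before every write and halt the run on `PolicyViolation`, so a long autonomous session stops at the first out-of-policy action instead of continuing for hours.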

The Bigger Picture

Autonomy is compounding faster than most enterprise road maps anticipate. With Opus 4, the gap between call-and-response chatbots and a junior colleague that finishes a ticket has narrowed sharply. Incremental upgrades (such as parallel tool calls, longer context, and local memory files) add up to a qualitative leap.

Today, the ceiling is seven hours; tomorrow, it could span several days or even months. Leaders should plan for a near-term future in which AI agents persist across multiple business cycles, from campaign planning to quarterly closes and research sprints. Redraw your graphs, allocate budget for rapid model churn, and begin experimenting with agentic workflows before competitors do.

As always, your thoughts and comments are both welcome and encouraged.

About Shelly Palmer

Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications and CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. He covers tech and business, is a regular commentator on CNN, and is the creator of a popular, free online course.
