The Digital Data Design Institute at Harvard is now the Harvard Business School AI Institute.

Codex for (Almost) Everything, And the End of the AI Subsidy Era


On April 16, 2026, OpenAI shipped a major update titled Codex for (Almost) Everything, transforming what was once a developer-only coding tool into a general-purpose agentic super-app that runs alongside your system. With computer use, it can control your operating system; it browses the web inside its own window; it generates images, remembers your preferences, and orchestrates work across more than 90 plugins.[1] Four weeks later, on May 14, OpenAI extended Codex to mobile: a QR-code pairing flow inside the ChatGPT app on iPhone, iPad, and Android lets you supervise, approve, and redirect Codex from anywhere.[2] If that pattern sounds familiar, it should. It is the same QR-pairing architecture Anthropic introduced with Claude Cowork Dispatch in March (highlighted in last month’s newsletter). That’s why Codex deserves a deeper look this month.

From Coding Tool to Digital Employee

Codex features full local file access, persistent memory of your preferences and recurring workflows, slash-commands for reusable skills, @-mentions for plugins, scheduled automations, and direct computer control through a virtual cursor.[3] The mobile extension allows you to scan a QR code and turn your phone into a Chief-of-Staff dashboard that streams screenshots, terminal output, diffs, and approval prompts from Codex on your desktop.[2] OpenAI reports more than 3 million weekly active developers on Codex as of April. That is nearly double the 1.6 million in March.[3][4]

The Hidden Cost of Mainstream Agents

Stanford’s Digital Economy Lab, working from a Microsoft Research study of agentic coding tasks, found that even simple tool-calling agents burn 5,000 to 15,000 tokens per task, and complex multi-agent workflows routinely consume 200,000 to over 1 million tokens for a single goal. The same paper shows that running the same agent on the same task can vary in cost by up to 30x.[5] A cautionary tale: one developer’s LangChain agent entered a recursive loop overnight, made 14,000 redundant tool calls, and ran up a $437 bill before hitting a quota.[6] Stories like this show us that the opportunity is real, but so is the meter.

In “The Agentic AI Advantage,” a webinar in the HBS AI Institute series The Prompt, executive fellow Tim Sanders argued that agentic AI will resolve the iron triangle: faster, better, and cheaper. However, one of his key takeaways for leaders was that AI tools, at their core, aren’t cost-saving mechanisms; they are productivity and growth tools. The challenge for organizations will be to control the meter without shrinking the ambition.

Why the Subsidy Era Is Ending

The math of flat-rate subscriptions does not survive recursive token consumption, and the major vendors have all moved away from it. GitHub announced that every Copilot plan migrates to usage-based billing on June 1, 2026.[7] On April 8, 2026, Anthropic launched Managed Agents in public beta. The pricing structure is a hybrid: standard per-token costs for model inference, plus $0.08 per session-hour while an agent is actively running. OpenAI introduced Workspace Agents this spring, moving to credit-based pricing in May 2026 after a free introductory period. Frontier model pricing itself, which is currently $5 per million input and $25 per million output tokens on Claude Opus 4.7, looks modest until you multiply it by the recursive workloads above.[8] OpenAI now caps ChatGPT Plus at 3,000 GPT-5 Thinking messages per week, after which traffic downgrades to a mini model.[9] This is the AI industry catching up to the reality that compute is a scarce, billable resource.
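To make the multiplication concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only the prices quoted above ($5 and $25 per million tokens, plus $0.08 per session-hour). The function name, the 80/20 split between input and output tokens, and the one-hour runtime are illustrative assumptions, not measured figures.

```python
# Prices quoted in the text (USD). Everything else below is an assumption.
INPUT_PRICE_PER_M = 5.00     # per million input tokens
OUTPUT_PRICE_PER_M = 25.00   # per million output tokens
SESSION_HOUR_PRICE = 0.08    # per hour an agent is actively running


def task_cost(input_tokens: int, output_tokens: int, session_hours: float = 0.0) -> float:
    """Estimate the USD cost of one agent task under hybrid pricing."""
    inference = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return inference + session_hours * SESSION_HOUR_PRICE


# Hypothetical 80/20 input/output split for a 10,000-token tool-calling task.
simple = task_cost(input_tokens=8_000, output_tokens=2_000)

# Hypothetical 1-million-token multi-agent workflow that runs for one hour.
heavy = task_cost(input_tokens=800_000, output_tokens=200_000, session_hours=1.0)

print(f"simple task:    ${simple:.2f}")
print(f"heavy workflow: ${heavy:.2f}")
```

Under these assumptions the simple task costs about nine cents and the heavy workflow about $9.08, a roughly hundredfold gap before any retries or loops are counted.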

This Month’s Action Item for Leaders

Treat token economics as an architectural problem. I suggest three moves to make before your next budget cycle. First, run a cost audit on every agentic workflow in production, measuring tokens-per-task and dollars-per-outcome, not just tasks-completed. Second, adopt the KPMG Build / Buy / Borrow framework: custom builds where IP control matters, off-the-shelf tools where time-to-value dominates, and open-source where you need a hedge.[10] Third, route through an LLM gateway so your applications never hard-code a single vendor’s API; if a vendor spikes prices or rate-limits agents, an “escape hatch” reroutes the workload to a cheaper model automatically. Companies that do this in 2026 will not be at the mercy of any single lab’s pricing decisions in 2027.
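For readers who want to see what the “escape hatch” in the third move might look like, here is a minimal sketch in Python. It is not any vendor’s API: the Route and Gateway classes, the route names, and the prices are all hypothetical, and the two provider functions are stubs standing in for real vendor calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class ProviderUnavailable(Exception):
    """Raised by a provider stub when it is rate-limited or down."""


@dataclass
class Route:
    name: str                    # hypothetical label, not a real model ID
    price_per_m_tokens: float    # illustrative blended USD price
    call: Callable[[str], str]   # sends the prompt to that vendor


class Gateway:
    """Tries routes in preference order and falls through on failure."""

    def __init__(self, routes: List[Route], max_price_per_m: float):
        self.routes = routes
        self.max_price_per_m = max_price_per_m

    def complete(self, prompt: str) -> Tuple[str, str]:
        for route in self.routes:
            # Skip any route whose price has risen above the ceiling.
            if route.price_per_m_tokens > self.max_price_per_m:
                continue
            try:
                return route.name, route.call(prompt)
            except ProviderUnavailable:
                continue  # escape hatch: move on to the next route
        raise RuntimeError("no route available under the price ceiling")


# Stub providers for illustration only.
def primary(prompt: str) -> str:
    raise ProviderUnavailable("rate limited")


def fallback(prompt: str) -> str:
    return f"[fallback] {prompt}"


gateway = Gateway(
    routes=[
        Route("vendor-a-frontier", 15.0, primary),
        Route("vendor-b-small", 1.0, fallback),
    ],
    max_price_per_m=20.0,
)
used, reply = gateway.complete("Summarize Q2 agent spend")
print(used, "->", reply)
```

The point of the design is that application code only ever calls gateway.complete; which vendor answers is a configuration decision that can change without touching the application.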

One Other Thing

Watch the AI Cost Scoreboard pattern emerging in enterprise dashboards. The elements include burn-rate per agent workflow, token spend by vendor, and automatic loop-pausing alerts when a recursive agent exceeds a threshold.[11] The CFO conversation about AI is rapidly converging with the conversation about cloud FinOps. The leaders who will capture the majority of the $3T agentic dividend (discussed in last month’s newsletter) are those who know, to the dollar, what each agent they run is worth.
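A loop-pausing alert of the kind described above can be sketched in a few lines of Python. This is an illustrative guard, not a feature of any particular dashboard product: the AgentBudget class, its limits, and the three-cent cost per tool call are all assumptions chosen for the example.

```python
class BudgetExceeded(Exception):
    """Signals that an agent should be paused for human review."""


class AgentBudget:
    """Tracks tool calls and spend for one agent workflow."""

    def __init__(self, max_tool_calls: int, max_spend_usd: float):
        self.max_tool_calls = max_tool_calls
        self.max_spend_usd = max_spend_usd
        self.tool_calls = 0
        self.spend_usd = 0.0

    def record(self, cost_usd: float) -> None:
        """Log one tool call and raise once either limit is exceeded."""
        self.tool_calls += 1
        self.spend_usd += cost_usd
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded(f"tool calls exceeded {self.max_tool_calls}")
        if self.spend_usd > self.max_spend_usd:
            raise BudgetExceeded(f"spend exceeded ${self.max_spend_usd:.2f}")


# Simulate a runaway loop: the agent keeps calling the same tool.
budget = AgentBudget(max_tool_calls=50, max_spend_usd=5.00)
paused = False
for _ in range(14_000):            # the runaway size from the story above
    try:
        budget.record(cost_usd=0.03)   # hypothetical cost per call
    except BudgetExceeded as alert:
        paused = True
        print(f"Paused after {budget.tool_calls} calls: {alert}")
        break
```

With limits like these, the overnight loop described earlier would have been stopped within dozens of calls rather than thousands. In practice the alert would notify an owner and wait for approval instead of simply breaking out of the loop.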

Next Month at a Glance

The most significant announcement for business leaders at the recent annual Google I/O was Gemini Spark, unveiled on May 19. Spark is Google’s entry into what it is calling the “always-on agent” category: a personal AI that runs 24/7 on Google Cloud virtual machines, taking action on your behalf even while your phone and laptop are off. Next month, we will look at what Spark means for the business leader who lives inside Google Workspace.


Mike Grandinetti is an Executive Fellow at Harvard Business School. He’s a serial tech entrepreneur, board member, AI & innovation consultant, VC EIR, and award-winning professor in the practice. A former Silicon Valley engineer and McKinsey consultant, Mike has held C-suite leadership roles across 8 tech startups, resulting in 2 NASDAQ IPOs and 7 strategic exits.

He’s led senior executive workshops for Berkeley, Brown, Carnegie Mellon, Columbia, Cornell, Harvard P&ED, NYU & Oxford. He’s been a senior advisor and organizing team member for the MIT CIO Symposium for a decade.

https://www.linkedin.com/in/mikegrandinetti
www.mikegrandinetti.com


Notes

[1] https://www.macrumors.com/2026/04/16/openai-codex-mac-update/

[2] https://9to5mac.com/2026/05/14/openai-brings-codex-control-to-chatgpt-for-iphone-and-android/

[3] https://openai.com/index/codex-for-almost-everything/

[4] https://fortune.com/2026/03/04/openai-codex-growth-enterprise-ai-agents/

[5] https://digitaleconomy.stanford.edu/news/how-are-ai-agents-spending-your-tokens/

[6] https://earezki.com/ai-news/2026-04-29-i-let-my-ai-agent-run-overnight-it-cost-437/

[7] https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/

[8] https://platform.claude.com/docs/en/about-claude/pricing

[9] https://help.openai.com/en/articles/6825453-chatgpt-release-notes

[10] https://kpmg.com/us/en/articles/2026/agentic-ai-untangled.html

[11] https://www.razor.co.uk/insights/the-era-of-agentic-token-economics