The Meter Is Running

For the past two years, companies have been told to use AI everywhere: in coding, customer support, legal work, research, operations, sales, and internal productivity. The pitch was simple: AI would reduce labor costs, accelerate output, and turn every employee into a power user.

Now the meter is running.

Corporate AI usage is exploding, especially around coding agents and autonomous workflows. But the bills are rising faster than the measured value. Axios reported this week that companies are now “bargain hunting” for cheaper AI models as usage blows out IT budgets and ROI remains uncertain. The same Axios reporting notes that enterprises are increasingly reluctant to standardize on one AI provider because they fear future price increases once they are locked into a platform. (axios.com)

The deeper problem is that today’s corporate AI bills may still be artificially low. OpenAI, Anthropic, Google, and others are fighting for adoption, enterprise share, and public-market credibility. The labs need massive usage numbers to justify near-trillion-dollar valuations and expected IPOs. That means enterprise customers are getting a subsidized version of the future. The scary part is not the bill companies are seeing now. It is the bill they may see when the subsidy fades.

What changed

Enterprise AI sticker shock is now mainstream. Axios reported on May 28 that corporate leaders are starting to question whether AI spending is delivering enough return, with Microsoft canceling most of its Claude Code licenses partly over cost and Uber leadership saying AI costs are getting harder to justify. Axios described the corporate pullback as a “healthy swing” away from AI overuse, or “tokenmaxxing,” where teams burn through tokens because usage itself became the internal success metric. (axios.com)

Uber became the warning case. Uber reportedly burned through its full 2026 AI budget in four months after Claude Code adoption surged across engineering. Tom’s Hardware reported that Uber leadership could not yet draw a clear connection between heavy Claude Code usage and successful product output, while Forbes separately framed the episode as a budgeting failure caused by token-based pricing moving faster than enterprise finance controls. (tomshardware.com)

The problem is not just price per token. It is tokens per task. Agentic AI can consume far more tokens than simple chat because the system plans, calls tools, reads files, rewrites code, checks errors, retries, and runs multi-step loops. Tom’s Hardware reported that agentic AI can consume up to 1,000 times more tokens than standard AI workflows, which is why corporate cost models built around seat licenses are breaking. (tomshardware.com)
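
To make the tokens-per-task point concrete, here is a minimal sketch in Python. Every number in it is a hypothetical placeholder, not a vendor rate or a figure from the reporting above, and the `Pricing` class and `agent_task_tokens` function are illustrative names. It also assumes a simple agent that re-reads its full context on every step, which is one common pattern rather than a description of any specific product.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Hypothetical per-million-token prices in dollars (assumed, not vendor rates)."""
    input_per_million: float
    output_per_million: float


def call_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of the given input and output token counts."""
    return (input_tokens * pricing.input_per_million
            + output_tokens * pricing.output_per_million) / 1_000_000


def agent_task_tokens(steps: int, base_context: int, output_per_step: int) -> tuple[int, int]:
    """Total (input, output) tokens for one agent task.

    Assumes each step re-reads the base context plus every earlier step's
    output, so input tokens grow roughly with the square of the step count.
    """
    total_in = 0
    total_out = 0
    for step in range(steps):
        total_in += base_context + step * output_per_step
        total_out += output_per_step
    return total_in, total_out


pricing = Pricing(input_per_million=3.0, output_per_million=15.0)

# One chat exchange versus one 40-step agent task at the same prices.
chat_cost = call_cost(pricing, input_tokens=2_000, output_tokens=500)
agent_in, agent_out = agent_task_tokens(steps=40, base_context=20_000, output_per_step=1_000)
agent_cost = call_cost(pricing, agent_in, agent_out)

print(f"chat:  ${chat_cost:.4f}")
print(f"agent: ${agent_cost:.2f} ({agent_in + agent_out:,} tokens)")
print(f"ratio: {agent_cost / chat_cost:.0f}x at the same price per token")
```

Under these assumed inputs the agent task costs roughly 400 times the chat exchange without any change in price per token. The multiple comes entirely from step count and context size, which is the variable most seat-based budgets never modeled.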

The “all you can eat” model is breaking down. Axios reported earlier this month that Anthropic was tightening limits on paying Claude users and that OpenAI was using the moment to pull power users toward Codex. The strategic point is clear: flat monthly subscriptions do not survive when agents can burn compute faster than a human user ever could. (axios.com)

Anthropic’s compute costs show the scale of the problem. TechCrunch reported that Anthropic will pay xAI roughly $1.25 billion per month, or about $15 billion a year, through May 2029 for access to Colossus compute capacity. That is not an ordinary vendor bill. It is infrastructure at national-industrial scale. It also explains why AI providers cannot subsidize unlimited enterprise usage forever. (techcrunch.com)

The labs are still racing toward public-market validation. Axios reported that Anthropic raised $65 billion at a $965 billion post-money valuation, overtaking OpenAI’s most recent reported valuation. Axios also reported that the search for cheaper AI subscriptions could threaten near-trillion-dollar AI valuations just as the major labs near record IPOs. (axios.com)

The subsidy is becoming visible at the edges. Tom’s Hardware reported that an OpenAI employee working on OpenClaw generated more than $1.3 million in OpenAI API token usage in one month, covering 603 billion tokens across 7.6 million requests. The same report noted that Codex, Claude Code, and Cursor have been competing aggressively for developer adoption and subsidizing inference costs below API rates to attract and retain users. (tomshardware.com)
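
The reported OpenClaw figures can be turned into unit numbers with simple division. The sketch below uses only the three figures cited above; because the report says "more than $1.3 million," the dollar results should be read as lower bounds, and the variable names are illustrative.

```python
# Figures as reported by Tom's Hardware. "More than $1.3 million" is treated
# as a lower bound, so every dollar result below is also a lower bound.
spend_usd = 1_300_000
tokens = 603_000_000_000
requests = 7_600_000

usd_per_million_tokens = spend_usd / tokens * 1_000_000
tokens_per_request = tokens / requests
usd_per_request = spend_usd / requests

print(f"${usd_per_million_tokens:.2f} per million tokens")
print(f"{tokens_per_request:,.0f} tokens per request")
print(f"${usd_per_request:.3f} per request")
```

That works out to a blended rate of a little over $2 per million tokens, about 79,000 tokens per request, and roughly 17 cents per request. No single request is expensive. The seven-figure bill comes from volume.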

Orthogonal Take

The enterprise AI cost problem is not just “AI is expensive.”

The more precise problem is that companies adopted AI as if it were SaaS, but AI behaves more like electricity.

A SaaS seat is predictable. You buy the license, assign the user, and the cost is mostly fixed. Heavy usage is usually good because it means adoption is high and the marginal cost to the vendor is low.

AI is different. Every prompt, document, tool call, agent loop, code review, retry, and output consumes compute. The more successful adoption becomes, the more variable the cost becomes. In the old software model, usage often improved unit economics. In the AI model, usage can destroy them if it is not routed, capped, cached, and measured.
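
The contrast between the two cost models can be sketched in a few lines of Python. All inputs are hypothetical placeholders chosen for illustration (user counts, seat price, tokens per task, and token price are assumptions, not figures from any vendor), and the function names are illustrative.

```python
def seat_cost(users: int, price_per_seat: float) -> float:
    """Monthly cost under a flat per-seat license: independent of usage."""
    return users * price_per_seat


def metered_cost(users: int, tasks_per_user: int, tokens_per_task: int,
                 usd_per_million_tokens: float) -> float:
    """Monthly cost under token billing: scales with every usage factor."""
    total_tokens = users * tasks_per_user * tokens_per_task
    return total_tokens * usd_per_million_tokens / 1_000_000


# All figures below are hypothetical placeholders.
USERS = 1_000
for tasks_per_user in (50, 500, 5_000):
    flat = seat_cost(USERS, price_per_seat=30.0)
    metered = metered_cost(USERS, tasks_per_user, tokens_per_task=50_000,
                           usd_per_million_tokens=3.0)
    print(f"{tasks_per_user:>5} tasks/user: seat ${flat:,.0f}  metered ${metered:,.0f}")
```

The seat line stays flat no matter how heavily people use the product. The metered line is a product of three factors, so a tenfold rise in adoption, or in tokens per task, moves the bill tenfold.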

That is the structural shift.

Companies are now discovering that there are three different AI economies:

  1. The demo economy. AI looks cheap when one person uses a chatbot to draft a memo, summarize a PDF, or write a small script.
  2. The team economy. AI gets more expensive when hundreds or thousands of employees use premium models all day for coding, analysis, research, and workflow automation.
  3. The agent economy. AI gets materially harder to control when autonomous systems loop through tasks, call tools, generate intermediate work, and consume tokens without a human feeling each incremental cost.

That third economy is where the budget failures happen.

The trap is that corporate behavior was shaped by the demo economy. Leaders saw impressive outputs, assumed the unit economics would scale, and encouraged broad usage. But enterprise reality is moving into the agent economy, where compute spend can grow faster than headcount savings, especially when usage is not tied to measurable business outcomes.

There is a second trap underneath that one: the IPO subsidy.

The AI labs need adoption, usage, revenue growth, developer loyalty, and enterprise lock-in before they go public. That gives them every reason to make AI feel cheaper than its fully loaded economics. But public markets will eventually demand gross margin discipline. Once that happens, enterprise customers should expect more usage-based billing, more tiering, more rate limits, more model segmentation, and more pressure to pay real prices for premium intelligence.

The bill companies are getting now is not necessarily the final bill. It may be the subsidized acquisition price.

Why this matters for operators

This is where the AI strategy has to mature.

The right answer is not “use less AI.” That would be as crude as saying “use less electricity.” The right answer is to stop treating intelligence as free once it enters the workflow.

AI usage now needs the same discipline companies eventually developed around cloud spend:

  • Model routing by task value
  • Premium models only where premium judgment matters
  • Cheaper models for routine work
  • Usage caps by team and workflow
  • Token budgets tied to business outcomes
  • Prompt caching and reuse
  • Human approval before expensive agent loops
  • Cost attribution by project, customer, team, and deliverable
  • ROI review before expanding from pilot to production
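
Several of these controls (team caps, cost attribution, and approval gates) can live in one small ledger. What follows is a minimal sketch under stated assumptions, not a production design: `TokenLedger`, `BudgetExceeded`, the cap amounts, and the approval threshold are all hypothetical names and numbers introduced for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a charge would push a team past its cap."""


@dataclass
class TokenLedger:
    """Tracks spend per (team, project) and enforces a per-team cap."""
    team_caps_usd: dict[str, float]
    approval_threshold_usd: float = 50.0  # hypothetical sign-off threshold
    spend: dict[tuple[str, str], float] = field(
        default_factory=lambda: defaultdict(float))

    def team_total(self, team: str) -> float:
        """Sum of all recorded charges for one team across its projects."""
        return sum(v for (t, _), v in self.spend.items() if t == team)

    def needs_approval(self, estimated_cost_usd: float) -> bool:
        """Flag runs expensive enough to warrant a human sign-off first."""
        return estimated_cost_usd >= self.approval_threshold_usd

    def charge(self, team: str, project: str, cost_usd: float) -> None:
        """Record a charge, refusing it if the team cap would be exceeded."""
        cap = self.team_caps_usd.get(team, 0.0)
        if self.team_total(team) + cost_usd > cap:
            raise BudgetExceeded(f"{team} would exceed its ${cap:,.2f} cap")
        self.spend[(team, project)] += cost_usd


ledger = TokenLedger(team_caps_usd={"platform": 100.0})
ledger.charge("platform", "migration", 60.0)
ledger.charge("platform", "code-review", 30.0)

# A 75-dollar agent run is above the approval threshold.
print(ledger.needs_approval(75.0))

# A further 20 dollars would take the team past its 100-dollar cap.
try:
    ledger.charge("platform", "migration", 20.0)
except BudgetExceeded as exc:
    print(exc)
```

Keying spend by team and project is the design choice that matters here. Once every charge carries that pair, attribution reports and ROI reviews become queries over the ledger rather than forensic work after the invoice arrives.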

The companies that win will not be the ones that use the most AI. They will be the ones that use the right AI at the right point in the workflow.

That is also why the “one model for everything” strategy is economically fragile. If every task goes to the most expensive frontier model, the budget will eventually break. If every task goes to the cheapest model, quality will eventually break. The advantage is in routing, judgment, and architecture.
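
A routing rule does not have to be elaborate to be useful. The sketch below is one hedged way to express "premium only where judgment matters and the value justifies it." The `Task` fields, the two tiers, the per-task cost estimates, and the 10x value multiple are all assumptions made for illustration. A real router would need its own estimates.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CHEAP = "cheap"
    PREMIUM = "premium"


@dataclass(frozen=True)
class Task:
    name: str
    needs_judgment: bool        # ambiguity, synthesis, or trust-sensitive output
    business_value_usd: float   # rough estimate of what a good answer is worth


# Hypothetical per-task cost estimates for each tier.
EST_COST_USD = {Tier.CHEAP: 0.02, Tier.PREMIUM: 1.50}


def route(task: Task, min_value_multiple: float = 10.0) -> Tier:
    """Send a task to the premium tier only when it needs judgment and its
    estimated value covers the premium cost by the required multiple."""
    premium_cost = EST_COST_USD[Tier.PREMIUM]
    if task.needs_judgment and task.business_value_usd >= min_value_multiple * premium_cost:
        return Tier.PREMIUM
    return Tier.CHEAP


tasks = [
    Task("contract risk review", needs_judgment=True, business_value_usd=500.0),
    Task("log summarization", needs_judgment=False, business_value_usd=500.0),
    Task("ambiguous internal FAQ", needs_judgment=True, business_value_usd=5.0),
]
for task in tasks:
    print(f"{task.name}: {route(task).value}")
```

Only the first task goes to the premium tier. The second is routine despite its value, and the third needs judgment but is not worth enough to justify the premium price.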

Bottom Line

The free-sample era is ending.

Enterprises are realizing that token consumption is not an abstract technical metric. It is a budget line. Uber blew through its AI budget. Microsoft pulled back on Claude Code licenses. Companies are shopping for cheaper models. Anthropic is buying compute at a scale that makes permanent subsidies impossible. OpenAI, Anthropic, and the rest are racing toward IPOs that will force the economics into daylight.

The implication is direct: the winning product architecture is not maximal AI usage. It is disciplined intelligence deployment.

Use premium models where judgment, synthesis, ambiguity, and trust matter. Use cheaper models where the task is routine. Preserve human judgment at the center. Treat friction as a cost-control signal, not just a workflow problem. And never let token consumption become a vanity metric.

The next phase of AI will not be defined by who burns the most tokens.

It will be defined by who turns the fewest necessary tokens into the most valuable work.
