The AI Pilot Succeeded. The Economics Did Not.

The demonstration was impressive. An AI agent completed in minutes what normally took an employee several hours. The model answered questions accurately, generated polished outputs and navigated a complex workflow with minimal intervention. Senior leaders watched the presentation and immediately asked the inevitable question: “How quickly can we deploy this across the company?”

A few months later, the project was still technically alive. Employees had access to the tool. Usage was growing. The company could point to the number of licenses activated, prompts submitted and tokens consumed.

But the business had not materially improved. The process still required the same number of people. Customers were not receiving meaningfully better service. Decisions were not becoming faster. Costs were increasing rather than falling. Nobody could clearly connect the initiative to revenue, margin or customer retention.

The AI pilot had succeeded. The AI transformation had not.

This gap is becoming one of the defining challenges of enterprise AI. KPMG’s 2026 AI Pulse research found that organizations expect their average AI spending to reach approximately $207 million over the next 12 months, nearly twice the level reported a year earlier. Yet the same research found that 65 percent of business leaders identified difficulty scaling use cases as a barrier to demonstrating returns, while 62 percent cited skills gaps.

The problem is no longer whether companies can deploy AI. It is whether they can convert rising investment into sustained business performance.

At the same time, another problem has surfaced: companies are discovering that AI consumption can grow much faster than the value it creates. Welcome to the age of tokenmaxxing.

When AI Consumption Became the Goal

For traditional enterprise software, adoption was relatively easy to measure. A company bought licenses. Employees logged into the application. Leaders tracked monthly active users.
Because most software was sold through predictable per-seat subscriptions, greater utilization was generally viewed positively.

Generative AI introduced a different economic model. Many AI systems charge according to the number of tokens they process. Every prompt, response, document retrieval, reasoning step and agent interaction consumes tokens. A simple employee query might consume a small amount. An agent independently working through a complicated task may make hundreds of model calls, repeatedly process large volumes of context and consume millions of tokens.

This created a tempting new metric: token usage.

Some organizations began treating AI consumption as evidence of adoption. Teams created usage dashboards. Managers compared employees based on how frequently they used AI. In some cases, companies encouraged employees to maximize token consumption on the assumption that greater AI usage must translate into greater productivity.

Technology publication TechCrunch described this emerging behavior as “tokenmaxxing”: using the number of tokens consumed as a proxy for how productively an employee is using AI.

The logic appears reasonable. If AI improves productivity, then employees using more AI must be more productive. But that conclusion confuses an input with an outcome.

Using more electricity does not necessarily mean a factory is producing more valuable goods. Conducting more meetings does not mean an organization is making better decisions. Sending more emails does not mean a sales team is building stronger customer relationships. Likewise, consuming more AI tokens does not mean a company is creating more value.

Once token usage becomes a target, employees have an incentive to increase consumption whether or not it improves performance. AI agents can be assigned unnecessary tasks. Large reasoning models can be used when smaller models would be sufficient. Employees can repeatedly regenerate acceptable outputs in pursuit of marginal improvements.
What begins as an attempt to encourage experimentation can gradually become tokenmaxxing: maximizing AI usage without establishing whether the additional consumption produces corresponding business impact.

The Token Bill Is Beginning to Arrive

The financial consequences are now becoming visible. TechCrunch reported in June 2026 that Uber had exhausted its entire annual budget for AI coding tools by April, only four months into the year. The publication also reported that Microsoft withdrew access to Claude Code from some developers after previously enabling it, while a Priceline employee said that a routine renewal for the Cursor coding platform came back four to five times more expensive.

These examples do not necessarily mean that the tools failed to create value. They reveal something more fundamental: even sophisticated companies are struggling to forecast AI consumption and control its economics.

The paradox is striking. The cost of processing an individual token may decline, yet the total enterprise AI bill can continue to rise.

Deloitte has warned that AI spending is becoming increasingly nonlinear, volatile and difficult to predict. Costs vary according to model choice, reasoning complexity, context length, workload design and usage patterns. More capable reasoning models may complete harder tasks, but they can also consume substantially more tokens than models handling simpler requests.

As AI becomes more capable, employees use it more frequently. Applications absorb larger volumes of company data. Models receive longer context windows. Agents perform chains of actions rather than generating one response. The cost of each unit of intelligence may decline while the number of units consumed grows exponentially.

This resembles the Jevons paradox: making a resource more efficient and affordable can increase its total consumption rather than reduce it.
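
The arithmetic behind that paradox can be sketched in a few lines of Python. The prices and volumes below are hypothetical, chosen only to illustrate the dynamic; they are not drawn from any of the reports cited above.

```python
def total_bill(price_per_million_tokens: float, tokens_consumed: int) -> float:
    """Total spend in dollars for a given unit price and token volume."""
    return price_per_million_tokens * tokens_consumed / 1_000_000


# Hypothetical figures: the unit price halves while consumption grows sixfold.
year_one = total_bill(price_per_million_tokens=10.0, tokens_consumed=2_000_000_000)
year_two = total_bill(price_per_million_tokens=5.0, tokens_consumed=12_000_000_000)

print(f"Year one bill: ${year_one:,.0f}")
print(f"Year two bill: ${year_two:,.0f}")
print(f"Bill multiple: {year_two / year_one:.1f}x")
```

Under these assumed numbers, each token becomes 50 percent cheaper and the total bill still triples, because volume grows faster than price falls.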

Companies are therefore facing two simultaneous failures:

They cannot always demonstrate the value their AI investments are producing.

And they cannot confidently predict what those investments will cost.

Both failures stem from the same underlying problem. Organizations are managing AI as a technology rollout rather than as a business operating model.

The Economics Gap

The economics of AI are fundamentally different from those of conventional enterprise software. Traditional software spending is often based on a known number of users and a relatively predictable license fee. AI costs can depend on consumption, model choice, context length, task complexity and the number of reasoning or tool-use steps performed.

Agentic systems add another layer of variability. A 2026 research study led by Longju Bai, with co-authors including Erik Brynjolfsson and Alex Pentland, examined token consumption across eight frontier models performing agentic coding tasks. The researchers found that agentic tasks consumed approximately 1,000 times more tokens than conventional code reasoning and chat tasks. More importantly, repeated runs of the same task could vary by as much as 30 times in total token consumption.

Spending more did not guarantee better results. The study found that higher token consumption did not consistently produce greater accuracy. Performance often peaked at an intermediate level of expenditure and then plateaued, meaning additional token use could add cost without adding quality. The models were also unable to forecast their own consumption reliably and systematically underestimated the tokens they would ultimately use.

This makes traditional budgeting especially difficult. A company may know what a million tokens costs, but not how many millions an agent will consume to complete a workflow, particularly when the agent independently decides how many steps, searches, revisions and tool calls to perform.
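
To see why a known unit price does not produce a known budget, consider a minimal Python sketch that turns token counts from repeated runs of one workflow into a cost range. The run data and the price are invented for illustration (the spread is set to 30x only to echo the variation described above), and `cost_range` is a hypothetical helper, not part of any vendor tool.

```python
from statistics import median


def cost_range(run_tokens: list[int], price_per_million_tokens: float) -> dict[str, float]:
    """Summarize the dollar cost of repeated runs of the same task."""
    if not run_tokens:
        raise ValueError("at least one run is required")
    costs = sorted(t * price_per_million_tokens / 1_000_000 for t in run_tokens)
    return {
        "min": costs[0],
        "median": median(costs),
        "max": costs[-1],
        "spread": costs[-1] / costs[0],  # most expensive run relative to cheapest
    }


# Hypothetical token counts from five runs of one agentic workflow.
runs = [400_000, 900_000, 1_500_000, 3_000_000, 12_000_000]
summary = cost_range(runs, price_per_million_tokens=5.0)

for label, value in summary.items():
    print(f"{label}: {value:.2f}")
```

With figures like these, a budget built on the median run would be overshot eightfold by the most expensive one, which is why a point estimate per task is a weak planning basis.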

Companies therefore cannot manage AI economics solely through an annual procurement process. They need an operational discipline similar to cloud FinOps: continuous visibility into consumption, cost and value.

Deloitte describes AI as an economic system that can introduce volatility, margin pressure and capital risk when left unmanaged. Its guidance to finance and technology leaders is to connect token consumption directly to financial outcomes rather than treating AI expenditure as another static software line item.

Every significant AI use case should therefore have a clear economic unit. For a customer-service agent, it might be cost per successfully resolved interaction. For a sales assistant, it might be cost per qualified opportunity created. For a software-development tool, it might be cost per accepted code contribution or production release. For a churn intervention system, it might be cost per customer retained.

The denominator matters. Cost per token reveals what the technology consumed. Cost per business outcome reveals what the company received.

A Better Model: From Token Consumption to Value Realization

Companies need to replace activity-led AI management with a value-realization system. That system can be organized around five questions.

1. What Business Outcome Will Change?

Start with a measurable outcome, not a model capability. Define the operational or financial baseline before launching the pilot. If the goal is to reduce customer-service costs, establish the current cost per successful resolution. If the goal is to improve retention, document current churn and intervention performance. If the goal is to accelerate software development, measure cycle time, accepted code and production quality. Without a baseline, almost any improvement can be claimed and none can be proven.

2. What Workflow Must Be Redesigned?

Map the complete process surrounding the AI use case. Identify upstream inputs, downstream decisions, handoffs, controls and exceptions.
Determine which steps should be automated, eliminated or retained. The objective should not be to insert AI somewhere into an existing process. It should be to design the best possible process now that AI exists.

3. Who Owns the Outcome?

Assign one business leader responsibility for value realization. Technology may build the system. Finance may validate the economics. Risk may establish boundaries. Human resources may support role redesign. But one owner must be accountable for the final business result.

A useful test is simple: If the AI initiative fails to produce its intended outcome, whose performance evaluation is affected? If the answer is “nobody’s,” the initiative does not truly have an owner.

4. How Will Adoption Change Behavior?

Define adoption in behavioral terms. Do not merely ask whether employees used the tool. Determine whether they made different decisions, served customers differently or redirected their time toward more valuable work.

Time saved is not automatically value captured. If an AI assistant saves a salesperson five hours per week, the business case depends on what happens to those hours. Do they conduct more customer meetings, pursue more leads and improve account planning? Or does the time simply disappear into a busier calendar? Leaders must design the value-capture mechanism rather than assume that it will emerge spontaneously.

5. What Is the Cost per Outcome?

Track AI consumption at the use-case and workflow level. Use the smallest model capable of completing the task. Route straightforward work to less expensive models. Limit unnecessary context. Cache repeated information. Set budgets for agents and require escalation when consumption exceeds defined thresholds.

But do not optimize tokens in isolation. An expensive AI task may be attractive if it creates a sufficiently valuable outcome. A cheap AI task may still be wasteful if it produces nothing useful. The goal is not token minimization. The goal is value maximization.
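
One way to make the fifth question concrete is a small Python sketch that computes cost per outcome and net value for a use case, plus a simple budget check. Everything here is an assumption for illustration: the `UseCase` fields, the helper names, the dollar figures and the token budget are hypothetical, and estimating the value of an outcome is itself a judgment the business owner has to make.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UseCase:
    name: str
    ai_spend: float           # dollars spent on model usage in the period
    outcomes: int             # completed business outcomes in the period
    value_per_outcome: float  # estimated dollar value of one outcome


def cost_per_outcome(uc: UseCase) -> Optional[float]:
    """AI spend divided by outcomes; None when nothing useful was produced."""
    return uc.ai_spend / uc.outcomes if uc.outcomes else None


def net_value(uc: UseCase) -> float:
    """Estimated value created minus what the AI consumed."""
    return uc.outcomes * uc.value_per_outcome - uc.ai_spend


def needs_escalation(tokens_used: int, token_budget: int) -> bool:
    """True once an agent run has exceeded its assumed token budget."""
    return tokens_used > token_budget


# Hypothetical use cases: one expensive but valuable, one cheap but useless.
retention = UseCase("churn intervention", ai_spend=50_000.0, outcomes=200, value_per_outcome=1_200.0)
summaries = UseCase("unread summaries", ai_spend=500.0, outcomes=0, value_per_outcome=0.0)

for uc in (retention, summaries):
    print(uc.name, cost_per_outcome(uc), net_value(uc))

print(needs_escalation(tokens_used=2_500_000, token_budget=2_000_000))
```

In this sketch the expensive use case justifies its spend, while the cheap one loses money because its denominator is zero. Ranking use cases by spend or token count alone would get that ordering backwards.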

The New AI Dashboard

The typical enterprise AI dashboard is dominated by activity:

- Number of AI users
- Number of prompts
- Tokens consumed
- Agents deployed
- Hours allegedly saved

A value-oriented dashboard would look different. It would include:

- Business outcomes improved
- Workflow cycle time
- Incremental revenue or cost reduction
- Customer or employee behavior changed
- AI cost per completed outcome
- Percentage of AI outputs accepted or acted upon
- Human intervention and exception rates
- Benefits realized against the original business case

Usage would still matter, but it would be treated as a diagnostic metric rather than the definition of success. High usage with low impact would signal waste. Low usage with high impact might indicate a targeted and valuable application. High cost with high value could justify expansion. High cost with uncertain value would require intervention.

This is the distinction many companies missed during the first wave of enterprise AI adoption. They assumed more usage meant more transformation. It did not.

The Coming Shift From AI Adoption to AI Productivity

The first phase of enterprise generative AI was about access. Could employees use the technology safely? Which platforms should companies approve? How quickly could organizations distribute licenses and experiment with use cases?

The next phase will be about productivity. Not whether employees use AI, but whether the organization performs better because they use it.

This transition will be uncomfortable. Some highly visible pilots will lose funding because their economics cannot be demonstrated. Some organizations will introduce token budgets, consumption limits and model-routing requirements. Employees who were encouraged to maximize AI usage may be told to use it more selectively.

There is a danger that companies will overcorrect. Tokenmaxxing is not a reason to suppress experimentation or ration intelligence indiscriminately.
Excessive cost controls could prevent employees from discovering valuable applications and push organizations back toward risk-averse technology governance.

The answer is not simply to use less AI. It is to use AI with greater economic intent.

Companies should experiment aggressively where the potential value is high, but expand selectively once the workflow, adoption model and economics are understood. They should encourage employees to solve meaningful problems, not compete to generate the largest token count.

The most important AI metric is not how much intelligence the organization consumes. It is how intelligently the organization converts that consumption into better performance.

The Pilot Was Never the Finish Line

A successful AI pilot proves that a technology can work under controlled conditions. It does not prove that employees will adopt it. It does not prove that a workflow will improve. It does not prove that customers will notice. It does not prove that costs will remain manageable. And it certainly does not prove that the company will create value.

The hardest part of enterprise AI begins after the demonstration. It requires leaders to choose economically meaningful problems, redesign workflows, establish ownership, change employee behavior and manage a new category of variable cost. These are not primarily technology challenges. They are strategy, operating-model and leadership challenges.

The companies that succeed will not necessarily be those that deploy the most agents, purchase the most advanced models or consume the most tokens. They will be the companies that can draw the clearest line from every AI investment to a better decision, a stronger customer outcome or a measurable improvement in the P&L.

Your AI pilot worked. Now comes the part that actually matters.

View original source — Hacker Noon ↗
