Executives chase elusive AI gains, reality checks needed
Executives are chasing an AI productivity boom, and their struggle to find it should surprise no one. The real question: why would boards rewrite metrics and operating models after a single AI pilot?
The piece that ran with the headline “6,000 execs struggle to find the AI productivity boom” makes a blunt point and then expects you to be surprised. Don’t be. The real surprise would be if thousands of boards suddenly rewired their performance metrics and operating models the week after they greenlit an AI pilot. The interesting question isn’t why executives are “struggling,” but why that struggle is being framed as a failure instead of a predictable pause.
The article’s framing gives the game away: executives “can’t find” the productivity boom. That assumes productivity is the right thing to be looking for first. It’s a narrow question and, for many early deployments, almost a category error.
Executives mean something very specific when they say “productivity”: output per labor hour, cash flow per headcount, return on invested capital. AI often shows up somewhere else entirely: shorter decision cycles, cleaner shortlists in hiring, better customer triage, fewer ugly mistakes, slightly sharper first drafts. These show up as friction reductions, not as dramatic jumps in revenue per employee.
They also show up slowly.
Executives read vendor decks promising step-function gains and let themselves believe the fantasy. They were expecting big, clean, spreadsheet-ready uplifts. What they got instead was better drafts, smarter searches, and fewer customer escalations. Small wins. Real wins. But not the headline-grabbing productivity lift some of them had been implicitly promised. That mismatch between marketing and accounting is a governance failure, not a technology failure.
If you measure the wrong outcome, you’ll reallocate the wrong resources. Treat AI as a plug-and-play efficiency tool and you get exactly what a lot of executives are reporting: cost, complexity, and not much else. Treat it as an operations problem and the picture changes.
Integration, process redesign, data plumbing, and incentives are the unglamorous work that actually drives returns. You can embed a model into chat, into a CRM, into whatever platform is fashionable, but if managers don’t change KPIs, if data remains balkanized, and if frontline workers aren’t trained, that model becomes expensive cruft in the tech stack.
In my Goldman days, I saw a simple pattern: the firms that actually got returns from automation never sold it as “transformation.” They treated it as continuous improvement: narrow pilots, tied to specific metrics, run repeatedly. The article talks about executives “struggling.” Of course they are: a lot of them hired teams to pull levers without first drawing a map of the factory floor.
There’s another blind spot: the viewpoint is almost entirely top-down. Executives are natural aggregators; they look at company-wide productivity figures and expect enterprise-scale lifts. But AI’s early wins are local. They’re happening on the edge of the org chart.
Sales teams shaving cycles off proposals. Support teams deflecting repetitive tickets. Analysts spending less time on mindless document prep and more on interpretation. These don’t always move the needle at the quarterly level, at least not yet, but they move the day-to-day reality of work. If all of that is invisible to the board because it doesn’t roll up neatly into one top-line metric, the diagnosis will be “no boom here.”
Distribution matters. When gains sit at the operational edge, companies need career paths, compensation structures, and budgeting that actually recognize and recycle those gains into broader change. Instead, what often happens is that marginal time savings silently fund headcount freezes or slightly nicer margin guides. The tools that created the savings stay trapped in pilot mode.
There is a tougher counter-argument the article gestures at, and it deserves more airtime: maybe there is no big AI productivity boom waiting around the corner. Maybe these systems are just clever autocomplete engines, and the gains vanish once the novelty wears off.
Some deployments really are marginal. Plenty of AI projects add cost, latency, and risk without reducing error rates. Some “AI assistants” simply repackage existing search and knowledge tools with extra steps and higher inference bills. That’s not a boom; that’s decor.
But treating all AI like that ignores the spread in outcomes. AI isn’t a monolith. Certain applications — code helpers for routine boilerplate, document summarization in legal or compliance workflows, triage in support settings — quietly take cycle time and rework out of the system. The math doesn’t lie here: the expected value of an AI project depends on task repeatability, data quality, and the marginal cost of an error in that workflow. Many firms essentially skipped that analysis and then blamed the technology when payback didn’t materialize.
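To make that analysis concrete, here is a rough back-of-the-envelope sketch in Python. It is only one plausible way to combine the three factors. Every field name and every number below is an illustrative assumption, not data from the article or from any real firm.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Hypothetical description of one candidate workflow (all fields are assumptions)."""
    tasks_per_year: int            # task repeatability: how often the task recurs
    minutes_saved_per_task: float  # time saved when the model helps
    hourly_cost: float             # loaded labor cost per hour
    usable_data_fraction: float    # data-quality proxy: share of tasks with usable inputs (0 to 1)
    error_rate: float              # share of assisted tasks that end in a model error
    cost_per_error: float          # marginal cost of one error in this workflow
    annual_run_cost: float         # licences, inference, integration upkeep


def expected_annual_value(w: Workflow) -> float:
    """Labor savings minus expected error cost minus running cost, per year."""
    assisted = w.tasks_per_year * w.usable_data_fraction
    savings = assisted * (w.minutes_saved_per_task / 60) * w.hourly_cost
    error_cost = assisted * w.error_rate * w.cost_per_error
    return savings - error_cost - w.annual_run_cost


# Illustrative case 1: high-volume support triage, where an error is cheap to fix.
triage = Workflow(50_000, 3, 40, 0.8, 0.02, 25, 30_000)

# Illustrative case 2: low-volume bespoke legal review, where an error is expensive.
bespoke = Workflow(200, 30, 150, 0.5, 0.05, 10_000, 30_000)

for name, wf in [("support triage", triage), ("bespoke legal review", bespoke)]:
    print(f"{name}: {expected_annual_value(wf):,.0f} per year")
```

Under these made-up inputs the triage workflow comes out positive and the bespoke one comes out sharply negative, even though the same model could sit behind both. The sketch's only point is that the sign of the result is driven by volume, data, and error cost rather than by the technology itself.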
There’s a historical rhyme here. When spreadsheets first hit finance, nobody saw an immediate “spreadsheet productivity boom” on macro stats. What they did see, slowly, was a shift in who could model, how fast decisions were made, and which firms learned to trust their own numbers. AI is landing the same way: first as micro-changes in who can do what, not as instant efficiency fireworks.
So what should boards and executives actually be doing instead of complaining about a missing boom? They should stop treating “AI” as a single line item and start asking a more boring, accurate question: where are marginal gains most defensible?
That means looking for teams with repeatable tasks, usable data, and a clear tolerance for model error. It means insisting on pilots that are instrumented around the metrics those teams care about, not around vanity enterprise-level output proxies. And it means demanding an explicit repurposing plan for any time or headcount savings — whether that’s reskilling, redeploying, or, yes, sometimes reducing roles, but with intent instead of drift.
Right now, the headline captures the symptom well enough. The underlying diagnosis is still off: the bottleneck isn’t AI’s ability to help, it’s organizations’ willingness to redesign how they measure and absorb that help. Give it a couple of years and the same executives will be talking less about “finding the boom” and more about why their competitors’ supposedly “incremental” deployments quietly ate their margins.