Feature

Why Only 5% of Companies Are Seeing Real AI ROI

6 minute read
By David Gordon
Despite $40B spent on GenAI, only 5% of companies see real ROI. Learn what separates AI winners from endless pilots.

Earlier this year, researchers at MIT published a statistic that hung in the air like smoke: despite spending close to $40 billion on generative AI over two years, only 5% of enterprises could point to a real business return.

The report introduced the idea of the GenAI Divide, a growing split between a narrow band of organizations that have turned pilots into infrastructure and the larger set stuck in permanent experimentation.

Some firms reported quieter but meaningful gains: better retention, fewer back-office errors, smoother vendor relationships. Others never moved past the demo.

Researchers traced the difference to a single principle: Successful companies built systems that learned in the wild, improved with each cycle, and aligned with how people actually worked. 


Why Enterprise AI Tools Miss the Mark

The most widely used generative AI tools in the enterprise today are the same ones workers use on their phones during lunch. ChatGPT, Microsoft Copilot and Claude see daily use across teams, often without formal approval, training or integration, a pattern known as shadow AI. These tools help employees summarize documents, clean up emails and brainstorm ideas. They are easy to adopt because they don’t require permission.

Meredith Broussard, author of "Artificial Unintelligence" and associate professor at the Arthur L. Carter Journalism Institute of NYU, told us, “When a company says to employees, ‘You must use AI,’ that’s a bad idea. Saying you have to use one of these new tools is like saying you have to use a yellow highlighter. You should instead think about using the right tool for the task.”

She warned against mistaking novelty for trust. “As a business, you have to think about, ‘Are you in the business of having your customers trust you?’ If you are, you don’t want GenAI to be customer facing. No one wants to talk to a chatbot.”

Related Article: 5 AI Tools Every Marketer Needs to Track and Improve ROI

Turning AI Experiments Into Real ROI 

But this ease of use has created a strange illusion. Organizations believe they are adopting generative AI because their employees are experimenting with it. What they are actually doing is outsourcing small fragments of knowledge work to systems that remain disconnected from their core operations.

The MIT study shows that meaningful AI ROI begins when AI systems take root inside workflows that already matter. The companies that progress start with narrow use cases, avoiding broad strategies in favor of tools with a clear job, such as:

  • An invoice processor that shortens vendor disputes
  • A support agent that drafts responses and improves with use

These systems rarely impress in demos. Their value appears in metrics the following quarter.

In companies that see returns, generative systems evolve. They learn from usage, respond to edge cases and accumulate context. In companies that don’t, AI becomes a recurring topic in quarterly innovation updates. It stays trapped in slide decks and internal showcases, occasionally generating ideas but never outcomes.

Why Most AI Pilots Die in Production

The early stages of an AI pilot often move quickly. Demos generate enthusiasm. Teams imagine how the system might fit into daily work. The interface feels responsive. Early outputs look clean. But momentum fades when the tool enters production and begins to resist the environment around it.

As Omar Shanti, CTO of Hatchworks AI, explained, “Generative AI projects are easy to do but hard to do well. It’s easy to get to the pilot phase, but getting to production is an elusive goal for most enterprises.”

In real AI-augmented workflows, the tool behaves differently. It forgets recent instructions. It repeats avoidable mistakes. It needs context that people already provided. After a few weeks, teams begin to work around it. The project still exists, but progress stops.

The MIT study captured this pattern in detail. When systems fail to learn, they stop earning attention. Each session starts from scratch. Feedback evaporates. Teams lose confidence because the system doesn’t evolve with use. It behaves like a new product every time it loads.

The systems that succeed follow a different pattern. They:

  • Launch inside clear processes
  • Receive steady correction
  • Retain adjustments
  • Improve over time

What began as a small tool becomes a reliable part of the operation. It fits because it learns. Until that loop forms, most AI pilots remain technically active but functionally irrelevant.
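To make that loop concrete, here is a minimal sketch in Python of a correction-retaining assistant. Everything in it is hypothetical (there is no real CorrectionStore or draft_reply API, and the model call is stubbed out); the point is only that corrections persist across sessions instead of evaporating.

```python
# Minimal sketch of a correction-retaining feedback loop.
# CorrectionStore and draft_reply are hypothetical names, not a real API;
# the model call is stubbed out with a returned prompt string.

from dataclasses import dataclass, field


@dataclass
class CorrectionStore:
    """Retains human corrections so each new session starts with accumulated context."""
    corrections: list[str] = field(default_factory=list)

    def add(self, note: str) -> None:
        self.corrections.append(note)

    def as_context(self) -> str:
        return "\n".join(f"- {c}" for c in self.corrections)


def draft_reply(ticket: str, store: CorrectionStore) -> str:
    """Stand-in for a model call: prior corrections ride along with every request."""
    return (
        f"Apply these standing corrections:\n{store.as_context()}\n\n"
        f"Draft a response to: {ticket}"
    )


store = CorrectionStore()
draft_reply("Customer asks about a late invoice", store)

# A reviewer corrects the output once; the adjustment is retained,
# so the next session does not start from scratch.
store.add("Always reference the invoice number in the first sentence.")
draft_reply("Customer asks about a refund", store)
```

The design choice matters more than the code: the system that remembers its corrections behaves like the same tool tomorrow, which is what the MIT researchers found separates infrastructure from demos.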

Related Article: Do's, Don'ts and Must-Haves for Agentic AI

Inside the 5%: How Top AI Leaders Drive Measurable ROI


Companies that escape pilot purgatory share a common trait: they begin with workflows that deliver real value from day one. Instead of chasing trends, they anchor their efforts in tasks that move the needle.

High-Impact Use Cases That Deliver ROI

Johnson & Johnson eliminated hundreds of scattered GenAI experiments once it discovered that roughly 10–15% of them produced about 80% of the value. The company shifted toward giving individual departments, like supply chain and research, the freedom to own the AI processes and tools that aligned with their work.

Deloitte’s analysis echoes this shift. Its case studies show that focusing on a limited set of high-impact use cases, especially those layered on existing workflows, accelerates ROI. Centralizing governance helps too, by ensuring integration and scalability without letting individual initiatives sprawl.

MIT’s research paints the same portrait. The groups that produce real value tend to partner with external vendors skilled in context and domain fluency. These partnerships deliver value roughly twice as often as in-house builds. Their approach centers on adaptability, continuous improvement and deep integration into the way teams already work.

To Broussard, the mistake is aiming for disruption instead of utility. “Is email useful? Yes. Has it totally eliminated handwritten materials? No. Use GenAI the same way. Focus on the mundane, not the shiny, and then you’ll make better decisions.”

Beyond strategy and governance, smart AI deployments embrace learning loops.


One academic study of corporate expense processing at a large Korean company showed how combining generative AI with intelligent document processing cut processing time by more than 80%. The system handled exceptions, learned from human corrections and steadily improved with use, exactly the kind of compounding value that moves pilots into production.
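As a loose illustration of that pattern, and assuming nothing about the study's actual system (names like extract_fields and KNOWN_CATEGORIES are invented here), an exception-handling loop might look like this:

```python
# Hypothetical sketch of an expense pipeline with human-in-the-loop exception
# handling; invented for illustration, not the study's actual system.

KNOWN_CATEGORIES: dict[str, str] = {}  # accumulated from past human corrections


def extract_fields(document: str) -> dict:
    """Stand-in for an intelligent-document-processing step (OCR plus an LLM)."""
    return {"vendor": "Acme Corp", "amount": "1200.00", "category": None}


def process_expense(document: str, ask_reviewer) -> dict:
    fields = extract_fields(document)
    if fields["category"] is None:  # an exception the extractor could not resolve
        vendor = fields["vendor"]
        if vendor in KNOWN_CATEGORIES:
            # A past correction resolves the same exception automatically.
            fields["category"] = KNOWN_CATEGORIES[vendor]
        else:
            # Route to a human once, then retain the answer for next time.
            fields["category"] = ask_reviewer(fields)
            KNOWN_CATEGORIES[vendor] = fields["category"]
    return fields


# The first document triggers a review; the second from the same vendor does not.
process_expense("invoice-001.pdf", ask_reviewer=lambda f: "travel")
process_expense("invoice-002.pdf", ask_reviewer=lambda f: "travel")
```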

In summary, here’s how the top performers operate:

  • They choose high-value workflows
  • They pair carefully chosen external tools with internal workflows, creating feedback loops for continuous improvement
  • Their systems accumulate context from usage instead of fading back into slide decks

AI Readiness Isn’t What You Think It Is

By the time most organizations approve a generative AI pilot, they assume they are ready. They have a budget. They have leadership support. They may even have vendor agreements and toolkits. But the readiness required to launch a pilot is different from the readiness required to generate value.

Vivar Aval, CFO and COO of Avidbots, said, “I’m not surprised so many companies aren’t measuring ROI. Early adoption is iterative. The pilot stage is where you learn your baseline, define the right metrics and only then can you prove the value.”

The companies that reach sustained returns tend to invest in different infrastructure. They treat feedback as a requirement. They design handoffs between people and systems that create shared responsibility. The strongest systems begin inside real processes and improve through repeated use. Each correction stays in the system. Each result becomes slightly more aligned with the task. Over time, the tool becomes easier to trust, because it adjusts in ways people can see and verify.

Readiness is visible in the details. Who owns the feedback? Where does improvement show up on the dashboard? Which teams gain leverage when the tool succeeds? These answers reveal whether a company is prepared to scale AI or just prepared to try it.

Related Article: Why Bad Data Is Blocking AI Success — and How to Fix It

Closing the Gap Between AI Pilots and Enterprise Performance

The companies that see results from generative AI build systems that learn by doing. Their tools improve with each interaction. They stay inside the work, connected to the same metrics, teams and constraints that shape performance.

McKinsey pointed to this shift in its work on agentic AI: systems that recall, adapt and act across workflows. These tools become part of the process. They gain relevance through use.

Architecture matters too. AI delivers value when workflows support correction and systems respond to it. Feedback becomes part of the loop. And governance reinforces the effect. Teams that track outcomes and assign ownership create the conditions for improvement. They measure change. They expect learning. AI earns a place by delivering on both.

The companies that move forward begin small. They place AI inside workflows that already carry weight. They choose roles where improvement can be seen. The return comes slowly, then steadily. First the tool fits. Then it learns. Then it stays.

About the Author
David Gordon

David Gordon investigates executive leadership for a global investment firm, freelances in tech and media and writes long-form stories that ask more questions than they answer. He’s always chasing the narrative that undoes the easy version of the truth.

Main image: zenzen | Adobe Stock