Easy Value, Moonshots, and Money Pits: The AI ROI Framework
Most AI projects fail to be completed, fail to reach production, or fail to deliver expected ROI. Here is the 2×2 framework — value vs. ease of implementation — that Bain and Toptal use to prioritize AI initiatives.
Mayur Domadiya · June 11, 2026 · 8 min read
Most AI projects fail. Not eventually and not subtly — they fail to reach production, fail to be completed, or fail to deliver expected outcomes even when they ship. This is not a controversial observation. An AI and data analytics leader who spent years at Bain & Company before leading AI practice at Toptal watched a significant number of organizations launch AI projects that never crossed the finish line, or did cross it and left the business exactly where it started. The root cause is consistent: organizations evaluate AI projects on whether they seem technically impressive or strategically fashionable, not on whether they can deliver quantifiable value given the organization's actual data readiness, implementation constraints, and change management capacity. The framework that addresses this is a 2×2 that scores every proposed initiative on two dimensions before a single line of code is committed: the value it can deliver, and how difficult it will actually be to implement.
Why Most AI Projects Fail Before They Start
The failure pattern has a specific anatomy. A company identifies an area where AI seems applicable — call center operations, pricing optimization, customer churn prediction. The internal data science team builds a model. The model is accurate. It performs well in testing. And then it is impossible to act on.
An insurance company demonstrates this precisely: data scientists built an accurate model to forecast call volume by call type — exactly what the brief requested. Technically correct. But the forecast data was too granular, and the forecast window too short, for managers to make realistic staffing changes. By the time practical constraints were factored in — schedule release and review time, recruiting and training windows, minimum continuous staffing requirements — there was no remaining value to optimize. The project was complete. The ROI was zero.
The underlying problem is a structural disconnect between data science teams and business operations. Data scientists are motivated to solve technically interesting problems. Business leaders need to solve economically significant ones. Without a structured evaluation process that forces alignment before the project starts, the gap produces exactly the insurance outcome: technically correct work that cannot be translated into operational change. The engineering cost was real. The value was not.
Axis 1 — Evaluating Value: Three Questions Before You Start
Value assessment has three components, each of which needs an explicit answer before any engineering work begins.
Financial impact. What is the quantifiable upside? Cost-benefit analysis, ROI calculations, and scenario modeling are all valid tools. The discipline is to cover both sides of the equation: near-term efficiency gains (automation, cost reduction) and long-term upside (revenue growth, new product capability). Near-term automation ROI is relatively easy to calculate — if a model replaces a manual process, the math is straightforward. Long-term product ROI is harder to model but equally important to include. Organizations that only calculate automation ROI and miss the product-level opportunity will consistently under-invest in the AI initiatives with the highest long-term returns.
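As a rough illustration of the arithmetic, the sketch below computes a simple multi-year ROI that covers both sides of the equation: near-term savings and a probability-weighted long-term upside. The class name, the function name, the three-year horizon, and every figure are hypothetical placeholders, not numbers from the framework.

```python
from dataclasses import dataclass


@dataclass
class InitiativeEstimate:
    """Hypothetical inputs for a back-of-envelope ROI estimate."""
    build_cost: float             # one-time engineering cost
    annual_run_cost: float        # hosting, maintenance, monitoring
    annual_savings: float         # near-term efficiency gains (automation)
    annual_revenue_upside: float  # long-term upside if the product bet works
    upside_probability: float     # chance the upside materializes, 0..1


def simple_roi(e: InitiativeEstimate, years: int = 3) -> float:
    """Return (expected benefit - total cost) / total cost over the horizon."""
    total_cost = e.build_cost + e.annual_run_cost * years
    expected_annual_benefit = (
        e.annual_savings + e.annual_revenue_upside * e.upside_probability
    )
    return (expected_annual_benefit * years - total_cost) / total_cost


# Illustrative numbers only.
estimate = InitiativeEstimate(
    build_cost=200_000,
    annual_run_cost=50_000,
    annual_savings=150_000,
    annual_revenue_upside=400_000,
    upside_probability=0.25,
)
print(f"3-year ROI: {simple_roi(estimate):.0%}")
```

Setting `annual_revenue_upside` to zero shows how much of the case rests on automation alone, which is one way to see what an automation-only calculation leaves out.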
Strategic alignment. Does this project connect directly to what the CEO is being measured on? An AI initiative that a chief data officer finds technically compelling is not the same as one a CEO would fund if it were explained at a board meeting. The strongest examples of AI budget approval come from exactly this alignment: a chemical manufacturer whose CEO had mandated margin improvements approved multiple AI initiatives in pricing and supply chain — areas with direct impact on the metric already under scrutiny — while other business units faced cuts. Strategic alignment created both the budget and the organizational will to move. Without it, even a technically excellent project struggles to survive the next planning cycle.
Opportunity cost. What happens if a close competitor executes this before you do? Would they take market share? Would they be able to serve customers at a lower cost? Would they provide a more differentiated offering? These questions convert the evaluation from a passive "should we do this?" to an active "can we afford not to?" — which is the more useful framing for any market where switching costs are low and competitive differentiation is thin. McKinsey's 2023 research found that organizations report increased revenue and decreased costs in the business functions where they have implemented AI, and two-thirds of company representatives expect to increase AI integration in the next several years. The opportunity cost of inaction is compounding.
Axis 2 — Evaluating Difficulty: Four Factors That Determine Whether a Project Ships
The value axis establishes whether a project is worth doing. The difficulty axis establishes whether your organization can actually complete it.
Off-the-shelf versus custom build. The decision to use a foundation model like GPT-4 rather than train a proprietary LLM is straightforward for most organizations — the data requirements for training a production-quality LLM are prohibitive, and the existing models are capable. The buy-versus-build question becomes genuinely difficult for narrower use cases: should you buy an expense classification tool or build one? The right heuristic: if the capability you are building is core to your competitive advantage — the thing that differentiates your product in the market — build it and own the IP. If it is not a source of competitive differentiation and an affordable tool fits the need, buy. Applying this question consistently eliminates a significant fraction of unnecessary custom development work.
Data availability and quality. This is the most common bottleneck in AI project implementation, and the most consistently underestimated at the evaluation stage. An ML model that performs well in a testing environment frequently fails in production: data arriving at different intervals than the training set, missing fields, format inconsistencies between source systems, latency that exceeds what the use case requires. Data readiness should be evaluated on two dimensions: the ability to deliver an adequate signal for the problem at hand, and the ability to operate accurately in a live production environment. If the second condition cannot be met, no level of modeling sophistication compensates for it.
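One way to make the production half of that check concrete is a small audit over a sample of live records, looking for the failure modes listed above. This is a minimal sketch under stated assumptions: the field names, the 15-minute latency limit, and the `readiness_issues` helper are all hypothetical and would need to match the real use case.

```python
from datetime import datetime, timedelta

# Hypothetical schema and latency limit; replace with the real requirements.
REQUIRED_FIELDS = {"customer_id", "event_time", "amount"}
MAX_LATENCY = timedelta(minutes=15)


def readiness_issues(records: list[dict], received_at: datetime) -> list[str]:
    """List records with missing fields or that arrived later than allowed."""
    issues = []
    for idx, rec in enumerate(records):
        missing = sorted(f for f in REQUIRED_FIELDS if rec.get(f) is None)
        if missing:
            issues.append(f"record {idx}: missing {', '.join(missing)}")
            continue  # latency cannot be judged reliably on an incomplete record
        latency = received_at - rec["event_time"]
        if latency > MAX_LATENCY:
            issues.append(f"record {idx}: arrived {latency} after the event")
    return issues


now = datetime(2026, 1, 1, 12, 0)
sample = [
    {"customer_id": 1, "event_time": now - timedelta(minutes=5), "amount": 10.0},
    {"customer_id": 2, "event_time": now - timedelta(hours=1), "amount": 5.0},
    {"customer_id": 3, "event_time": now, "amount": None},
]
for issue in readiness_issues(sample, received_at=now):
    print(issue)
```

A high share of flagged records in a live sample suggests the second readiness condition is not yet met, whatever the model scored in testing.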
Technical feasibility and project complexity. More components (model types, infrastructure dependencies, data pipelines, integration layers) mean a lower probability of completion. The general principle is to start with the simplest viable approach: if traditional ML methods like regression cannot find a significant signal in the available data, deep learning is also less likely to find a meaningful one. The exception is computer vision and NLP tasks, where deep learning is usually needed to capture the nuanced relationships involved. In those cases, the right starting point is pre-trained, off-the-shelf tooling, such as the pre-trained models available through OpenCV for vision or BERT for NLP, rather than building from scratch.
Stakeholder alignment. Most of the value from an AI initiative comes not from the model itself but from the organizational and process changes built around it. A pricing model that accurately predicts optimal deal pricing delivers no value if sales managers do not trust it, do not know how to apply it in negotiation, and have no workflow that delivers the output at the moment they need it. Stakeholder alignment — identifying who needs to change their behavior for model output to translate into business value, and securing their commitment before the project starts — is a precondition of value capture, not a post-launch communication exercise.
The Four Quadrants: Where to Invest and Where to Stop
After scoring each initiative on both dimensions, every project maps to one of four quadrants.
Easy value projects are quick to implement and deliver immediate, quantifiable returns. These are the natural starting points — they build organizational confidence in AI capability, demonstrate ROI to executives, and develop the internal muscles for more ambitious work. The risk is over-indexing here: if all engineering budget goes to easy value projects, the long-term competitive opportunity in the harder quadrants goes unaddressed.
Moonshots are high-value but complex — the initiatives that could deliver substantial competitive advantage but require significant investment in data infrastructure, technical capability, and organizational change. The right response to a moonshot is not to defer it indefinitely because it appears difficult. It is to understand specifically what makes it difficult. In most cases, data is the bottleneck. If improving one or two data sources would move the initiative from blocked to buildable, that data investment becomes a priority in its own right — because it unlocks access to a disproportionately large opportunity, not just a single use case.
Data scientists will always be eager to explore and build with cutting-edge technologies, but they need coaching from business leaders on exactly which problems need to be solved.
Low value initiatives are technically easy but deliver minimal business impact. These are the projects that seem harmless to pursue but consume engineering time that could compound elsewhere. They are not priorities.
Money pits are the most dangerous quadrant: difficult to implement and low in return. The credibility-destroying AI projects — the ones that exhaust engineering resources and produce nothing the business can act on — live here. The insurance call center example lived here. Identifying money pits early, before resources are committed, is the primary reason to run this evaluation in the first place.
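The quadrant mapping above can be sketched in a few lines. This assumes each initiative has already been given a value score and a difficulty score on a 0 to 10 scale with a midpoint cutoff. The scale, the cutoff, and the example scores are illustrative assumptions, not part of the framework as described.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    name: str
    value: float       # 0..10: financial impact, strategic alignment, opportunity cost
    difficulty: float  # 0..10: build vs. buy, data, complexity, stakeholder alignment


def quadrant(i: Initiative, threshold: float = 5.0) -> str:
    """Map an initiative to one of the four quadrants (threshold is an assumption)."""
    high_value = i.value >= threshold
    hard = i.difficulty >= threshold
    if high_value and not hard:
        return "easy value"
    if high_value and hard:
        return "moonshot"
    if not high_value and not hard:
        return "low value"
    return "money pit"


# Hypothetical portfolio with made-up scores.
portfolio = [
    Initiative("invoice classification", value=7, difficulty=3),
    Initiative("dynamic pricing", value=9, difficulty=8),
    Initiative("internal FAQ bot", value=2, difficulty=2),
    Initiative("granular call forecast", value=2, difficulty=7),
]
for item in portfolio:
    print(f"{item.name}: {quadrant(item)}")
```

The scores themselves are the hard part. The code only makes the classification explicit and repeatable once they are agreed.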
Why the Data Investment Unlocks the Quadrant That Matters Most
The consistent finding across AI project evaluations is that data quality is the most frequent bottleneck for high-value moonshot initiatives. The implication is that a data investment — which may not appear to be an AI project at all — is often the highest-ROI move available in the portfolio.
A consumer product brand demonstrates this directly: the team identified that building a customer data platform (CDP) would unlock five high-value AI opportunities simultaneously — personalized marketing, promotion optimization, cross-selling, churn prediction, and attribution modeling. As a standalone project, the CDP was struggling to get budget approval because no single use case justified the investment. When it was presented as the enabler of five distinct, financially quantifiable AI opportunities that were currently blocked, the budget was approved and fast-tracked. The data investment was the correct first move because it moved multiple moonshots from blocked to buildable in a single project.
The practical diagnostic for any organization reviewing its AI project list: map every moonshot against current data infrastructure. How many of them are blocked by the same one or two missing or low-quality data sources? If the answer is more than one, improving those sources is almost certainly more valuable — and often faster — than building any individual model on top of inadequate data.
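That diagnostic amounts to counting how many moonshots each missing data source blocks. A minimal sketch, assuming the blockers have already been listed by hand. The initiative names echo the consumer brand example, but the blocker assignments are invented for illustration.

```python
from collections import Counter

# Hypothetical mapping: moonshot -> data sources that currently block it.
moonshot_blockers = {
    "personalized marketing": {"unified customer profile"},
    "churn prediction": {"unified customer profile", "support ticket history"},
    "cross-selling": {"unified customer profile"},
    "promotion optimization": {"store-level sales feed"},
}


def shared_blockers(blockers: dict[str, set[str]]) -> list[tuple[str, int]]:
    """Return data sources that block more than one moonshot, most common first."""
    counts = Counter(src for sources in blockers.values() for src in sources)
    return [(src, n) for src, n in counts.most_common() if n > 1]


for source, blocked in shared_blockers(moonshot_blockers):
    print(f"{source}: blocks {blocked} moonshots")
```

Any source that appears in this output is a candidate for a standalone data investment, since fixing it moves several initiatives at once.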
What This Means
The high failure rate of AI projects is not a model quality problem. It is a project selection problem. Organizations evaluate AI initiatives on technical ambition or strategic fashionability and skip the prior question: does this initiative deliver quantifiable value that our organization can actually capture given our current data, technical, and organizational constraints?
For founders and CTOs allocating AI engineering budget, the immediate application is to run every proposed initiative through both axes before it enters the development queue. Quantify the financial impact. Confirm the strategic alignment. Assess data readiness honestly. Identify who needs to change their behavior for the project to deliver value, and confirm those stakeholders are committed before engineering begins. If a project cannot pass that assessment, it does not become a moonshot by receiving more budget — it becomes a money pit at a higher cost.
The organizations that build durable AI capability are the ones that treat initiative selection with the same rigor they apply to engineering execution. The model is the tractable part. Knowing which model solves a real business problem, with data that exists in usable form, in an organization ready to act on the output — that is the harder and more valuable problem. It is also the one most often skipped. That is why building AI features that actually ship and deliver ROI requires answering these questions before writing any code.