5 Steps VCs Use to Evaluate AI Startups — and What to Fix
Carolyn Deng, CFA — Wharton MBA who has executed 20+ VC/PE deals across a $700M portfolio — lays out exactly how investors diligence AI startups. This is the framework to internalize before you fundraise.
Mayur Domadiya · June 12, 2026 · 10 min read
Between 2016 and 2017, global AI startup investment roughly tripled, from $5 billion to $15.2 billion (an increase of about 200%), according to CB Insights and Statista. Capital has continued pouring in since, making AI one of the most competitive fundraising environments in venture history. Founders who understand how sophisticated investors evaluate AI deals arrive at pitches better prepared: they know which diligence questions are coming, where their weakest answers are, and what to fix before the meeting. Carolyn Deng, CFA, a Wharton MBA who has executed more than 20 VC/PE deals while managing a $700M portfolio, outlines the five-step process investors use to evaluate AI dealflow. For founders, this is the checklist that matters.
Step 1: Customer Desirability — The Most Important Question
Before any financial or technical diligence begins, an experienced investor asks: what problem is this solving, and does it actually matter? Deng identifies this as the most important of the five steps — because a well-funded AI company solving a problem nobody cares about fails regardless of how sophisticated the model is.
The analysis has two dimensions. First, is the problem mass-market or niche? A startup building AI for a total addressable market of $1 million per year is not a fundable VC proposition: the return ceiling makes the math impossible regardless of execution quality. Second, and trickier, is the problem mission-critical? Deng draws the line clearly. A 0.001% error rate sounds negligible, but it is still one failure per 100,000 decisions, and software that makes decisions continuously reaches that count quickly; Deng's illustration for autonomous-vehicle AI is an accident roughly every 1,000 hours of driving. The tolerance for that is zero. By contrast, Netflix or Amazon recommending the wrong item 1% of the time has no meaningful downside. Mission-critical applications such as autonomous vehicles, medical diagnosis, and surgical robotics carry both bigger potential returns and bigger risks than non-mission-critical ones, which makes them a fundamentally different investment proposition.
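The link between an error rate and the time between failures depends on how often the system makes a decision, which the example above leaves implicit. The following is a minimal sketch of that arithmetic, not Deng's own calculation: the function name and the decisions-per-hour values are assumptions chosen purely for illustration, and errors are assumed to be independent.

```python
def hours_between_errors(error_rate: float, decisions_per_hour: float) -> float:
    """Expected operating hours between errors.

    error_rate is a fraction per decision (0.001% -> 0.00001).
    decisions_per_hour is a hypothetical workload figure, not a measured one.
    """
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    if decisions_per_hour <= 0:
        raise ValueError("decisions_per_hour must be positive")
    expected_errors_per_hour = error_rate * decisions_per_hour
    return 1.0 / expected_errors_per_hour


# 0.001% expressed as a fraction per decision.
rate = 0.001 / 100

# Illustrative workloads only: one decision per hour, 100 per hour, one per second.
for dph in (1, 100, 3600):
    hours = hours_between_errors(rate, dph)
    print(f"{dph:>5} decisions/hour -> {hours:,.1f} hours between errors")
```

The same error rate looks very different depending on the workload, which is why a mission-critical product needs its error tolerance stated per unit of real-world exposure rather than as a bare percentage.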
For near-term fundability, non-mission-critical AI solving a well-defined problem at scale is the strongest category. This includes smart customer service (AI chatbots that go beyond rule-based logic), medical imaging diagnosis where error doesn't mean immediate patient harm, machine translation, AI financial advisors, and computer vision applications. Founders building in these spaces have more investor appetite. Those building mission-critical applications need to have thought through how they will reach the error tolerance that market demands.
Three red flags investors look for in customer desirability: the company is targeting a problem that very few people will pay to solve; the problem requires solving 10 other problems before the AI can address it; or the company is trying to solve too many problems simultaneously. Any of these is a diligence blocker.
Step 2: Commercial Viability — Market Size and Investment Horizon
Once customer desirability passes, investors examine whether the business can generate enough revenue to justify the AI investment. For mature enterprises deploying AI, the requirement is a robust business case with explicit ROI. IBM CEO Virginia Rometty set a concrete goal for Watson: $10 billion in annual revenue before 2024 — a publicly stated number that shaped how IBM justified every Watson investment decision internally.
For AI startups, especially pre-revenue ones, the primary commercial viability question is market size. If the maximum addressable market is $1 million per year for a geographically and vertically constrained application, the question is not whether the AI can be built — it's whether the business can ever return the capital invested in building it.
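A rough way to see the return ceiling is to work backward from market size. The sketch below is an illustration under stated assumptions rather than a valuation method from Deng's framework: the function name, the revenue multiple, the ownership stake, and the invested amount are all hypothetical inputs.

```python
def max_return_multiple(
    tam_per_year: float,
    market_share: float,
    revenue_multiple: float,
    ownership: float,
    invested: float,
) -> float:
    """Best-case multiple on invested capital implied by a market's size.

    Exit value is approximated as (captured annual revenue) x (revenue multiple).
    Every input other than tam_per_year is a hypothetical assumption.
    """
    if invested <= 0:
        raise ValueError("invested must be positive")
    if not 0 <= market_share <= 1 or not 0 <= ownership <= 1:
        raise ValueError("market_share and ownership must be fractions in [0, 1]")
    exit_value = tam_per_year * market_share * revenue_multiple
    return exit_value * ownership / invested


# Deliberately generous case: the startup captures the entire $1M/year market.
multiple = max_return_multiple(
    tam_per_year=1_000_000,
    market_share=1.0,       # assumption: 100% of the market
    revenue_multiple=10.0,  # assumption: exit at 10x annual revenue
    ownership=0.20,         # assumption: investor holds 20% at exit
    invested=2_000_000,     # assumption: $2M invested
)
print(f"Best-case return multiple: {multiple:.2f}x")
```

Even with every input set optimistically, a market this small leaves little room for a venture-scale outcome, which is the point of the market-size test.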
The investment horizon is the second factor, and it's underestimated. Deep AI technology takes longer to develop and longer to monetize than most founders account for. Waymo has been testing autonomous driving technology since 2009. Nvidia is the canonical AI investment story — but an investor who bought in at the 1999 IPO did not see significant returns until after 2016, when deep learning became commercially dominant. That's a 17-year S-curve. Investors in deep AI technology have to believe in the destination and have the patience to hold through a long non-linear development period.
Step 3: Technical Feasibility — Data, Algorithms, Computing Power
Technical diligence for AI breaks into three components: data, algorithms, and compute. Each has specific questions that experienced investors ask, and each is a different kind of risk.
Data: Machine learning models require access to clean, well-labeled data that resembles the real world. Investors want to know whether the company has access to usable training data, how it obtained that data, and whether it can continue obtaining it at scale. The trend toward open banking and democratized consumer data creates new opportunities here, but it also brings regulatory risk around consumer data privacy.
Algorithms and talent: Robust, scalable algorithms require both the right data and the right people. Top AI talent — data scientists and engineers experienced with production AI — is heavily concentrated at Google, Facebook, Microsoft, and IBM. DeepMind employees earn approximately $345,000 per year on average. Competing for that talent requires both compensation and a compelling enough product vision to attract people away from teams with more resources. Investors ask whether deep learning is actually the right technology for the problem or whether a rule-based system would be cheaper and more appropriate. A robo-advisor for retail asset allocation may not need deep learning at all. A hedge fund algorithm that needs to improve from past mistakes does.
Computing power: AI workloads are expensive at scale. Kai-Fu Lee, one of the world's leading AI investors, describes a portfolio company that spent $1 million in its first three months exclusively on deep learning compute servers. A typical deep learning training task requires one or more machines with four to eight high-capacity GPUs each. Computer vision tasks at scale require hundreds or thousands of GPU clusters that emit ten times more heat than standard servers. Investors want to know whether the company can afford the compute its use case requires, and whether the necessary computing power is actually available at the scale the product will need to reach.
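To make a compute budget concrete, it helps to translate dollars into GPUs kept running. The sketch below is a back-of-the-envelope estimate under stated assumptions: the function names and the price per GPU-hour are hypothetical, and real pricing varies widely by provider, hardware, and contract.

```python
import math

HOURS_PER_DAY = 24


def gpus_sustained(budget_usd: float, usd_per_gpu_hour: float, days: float) -> float:
    """Number of GPUs a budget can keep running continuously for `days`."""
    if usd_per_gpu_hour <= 0 or days <= 0:
        raise ValueError("usd_per_gpu_hour and days must be positive")
    gpu_hours = budget_usd / usd_per_gpu_hour
    return gpu_hours / (days * HOURS_PER_DAY)


def machines_needed(gpus: float, gpus_per_machine: int) -> int:
    """Machines required to host `gpus`, rounding up to whole machines."""
    if gpus_per_machine <= 0:
        raise ValueError("gpus_per_machine must be positive")
    return math.ceil(gpus / gpus_per_machine)


# $1M over roughly three months, as in the example above.
# The $3.00 per GPU-hour rate is an assumed placeholder, not a quoted price.
gpus = gpus_sustained(budget_usd=1_000_000, usd_per_gpu_hour=3.00, days=90)
print(f"GPUs sustained around the clock: {gpus:.0f}")
print(f"8-GPU machines needed: {machines_needed(gpus, 8)}")
```

Running the same calculation with a founder's own quoted rates gives a quick answer to the investor question of whether the use case is affordable at the intended scale.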
The areas that have achieved the most breakthroughs with deep learning and are most suitable for the technology are natural language processing, computer vision, and game-based or evolutionary systems. These are also the areas where the talent pool is deepest and the technical risk is most manageable.
Step 4: Spotting AI Hype — The Most Dangerous Diligence Gap
Because AI companies command higher valuations and more investor attention than equivalent non-AI companies, there is a structural incentive to mislabel. Deng identifies three patterns investors watch for:
The first is companies building process automation or rule-based systems and calling it AI. If a company claims to use AI but has not hired a team of data scientists or AI engineers — either in-house or through contracted partners — that is a red flag. Asking directly what underlying technologies they are using, and what their training pipeline looks like, separates real AI from labeled automation.
The second is overstating what AI alone can accomplish without human intervention. In 2018, Chinese translation company iFlytek was caught in a controversy: their supposedly real-time simultaneous machine translation device turned out to be listening to and copying a human translator's voice. iFlytek's explanation was that real-time translation at the required speed and accuracy is not currently possible without human assistance — a legitimate technical reality, but one that had been obscured in their marketing. In many applications, human-machine hybrid systems outperform pure AI, and this should be disclosed as a feature rather than hidden as a limitation.
The third is confusing a scientific prototype with a commercially scalable system. A prototype can be built by one data scientist using MATLAB, running on 1,000 to 10,000 data points, in a few months. A commercially scalable product requires: funds to hire AI specialists (data scientists, architects, software engineers, product managers), training datasets with tens of millions of data points, production-grade code in Python or C++, and infrastructure on AWS, Google Cloud, or Microsoft Azure. These are not the same thing, and they do not cost the same to build.
Step 5: Financial and Business Metrics
AI companies are tech companies first, and are evaluated using both traditional financial metrics and technology-specific indicators. The traditional layer (revenue, net income, cash flow, growth rate, P/E and P/S ratios, competitive landscape, regulatory risk) applies to any investment. The technology-specific layer weights these differently: growth rate often matters more than current profitability, and for early-stage AI startups, user statistics like monthly active users and bookings frequently matter more than revenue or cash flow. Valuations show the same pattern: Nvidia trades at a P/E of around 30x versus approximately 20x for McDonald's, a gap that reflects the market's belief in future earnings growth over current profitability.
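One way to read a P/E gap is to ask how much earnings would have to grow, at an unchanged share price, for the higher multiple to fall to the lower one. The sketch below works through that using the approximate 30x and 20x multiples quoted above. It is an illustrative simplification, not a valuation model, and the function names and the sample share price are assumptions.

```python
def price_to_earnings(share_price: float, earnings_per_share: float) -> float:
    """P/E ratio: price paid per unit of annual earnings."""
    if earnings_per_share <= 0:
        raise ValueError("P/E is not meaningful for zero or negative earnings")
    return share_price / earnings_per_share


def required_earnings_growth(current_pe: float, target_pe: float) -> float:
    """Fractional earnings growth needed for current_pe to fall to target_pe
    if the share price stays constant."""
    if current_pe <= 0 or target_pe <= 0:
        raise ValueError("P/E ratios must be positive")
    return current_pe / target_pe - 1.0


# Hypothetical share: $60 price and $2 of earnings per share gives a 30x P/E.
pe = price_to_earnings(share_price=60.0, earnings_per_share=2.0)
growth = required_earnings_growth(current_pe=pe, target_pe=20.0)
print(f"P/E: {pe:.0f}x, earnings growth needed to reach 20x: {growth:.0%}")
```

Under these simplifying assumptions, the premium multiple amounts to a bet that earnings will grow by about half before the stock is priced like a mature business.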
For public companies, these details are available through financial filings and market data providers. For private AI startups, investors work directly with management to obtain the necessary numbers. The practical takeaway for founders: have these metrics prepared before the meeting, even if the numbers are early-stage, and be explicit about what stage of the S-curve the company occupies and what the next inflection point looks like.
What This Means for Founders
The investor's diligence checklist and the founder's product checklist are the same list. Customer desirability, commercial viability, technical feasibility, hype-resistance, and financial transparency — building a fundable AI startup means having strong answers to all five. The founders who do the diligence on themselves before investors do it to them are the ones who show up to fundraising conversations with credibility rather than gaps.
The single most important practical insight in Deng's framework is the distinction between a scientific AI prototype and a commercially scalable system. They are different projects, built by different teams, requiring different infrastructure, with costs that differ by orders of magnitude. Understanding that gap, and having a realistic plan for crossing it, is what separates founders who raise from those who don't. That gap is also where the engineering and product work that actually moves a company from prototype to production lives.