Four AI features. Ninety days. Zero new hires.
That is the headline. But the real story starts six months earlier, when the CTO of a mid-market B2B SaaS company — let's call them Meridian — sat across from her board and promised AI features would ship "next quarter." Then the quarter after that. Then the one after that.
Meridian was not slow or incompetent. They had a strong engineering team of 11. The problem was structural: their four AI feature ideas required skills their team did not have, and every attempt to close that gap — posting jobs, scoping freelancers, evaluating agencies — added weeks of delay without adding a single line of production code. By the time they found Boundev, they had lost nine months and were about to lose a $400K enterprise deal to a competitor whose AI assistant feature had just launched.
This post walks through what Meridian's CTO knew, what they did not know, how we scoped and shipped all four features inside a single quarter, and what anyone running AI roadmap work in 2026 can take from this.
The Backlog That Would Not Move
Meridian's product was a B2B workflow platform serving mid-market operations teams. Their four AI features had been on the roadmap since mid-2025:
- A smart document summarization tool for auto-summarizing uploaded PDFs and contracts
- A semantic search layer across their internal knowledge base
- An AI-drafted reply assistant integrated into their in-app inbox
- An anomaly detection module that flagged unusual workflow patterns for ops managers
None of the four were exotic. All four had clear user demand backed by support ticket data. The engineering effort was well-defined on paper. The bottleneck was not vision. It was three overlapping problems that companies hit at exactly this stage.
Problem 1: In-House Engineers Were Already at Capacity
Meridian's team of 11 engineers was running a tight sprint cycle shipping core product. Every sprint had feature work, bug fixes, and technical debt. There was no "slack" to absorb an entirely new discipline — prompt engineering, RAG architecture, embedding pipelines, vector database management — without pulling someone off a live sprint.
Problem 2: AI Hiring Was Not Moving Fast Enough
Meridian had posted two AI engineer roles in January 2026. By March — when they came to us — they had interviewed 14 candidates, made 2 offers, and had 0 acceptances. The candidates who cleared the bar were choosing bigger offers from larger companies. The candidates who were available did not have the right depth in production AI systems. Median time-to-fill for a senior AI engineer role in the US is currently 4.5 months, assuming you find the right candidate at all.
Problem 3: The Freelance and Agency Options Had Too Much Friction
Two AI freelancers had been scoped via Upwork and Toptal. One delivered a working Jupyter notebook — not production code. The other disappeared three weeks into a six-week project. A boutique AI agency quoted $280,000 for a six-month engagement to ship two of the four features. Meridian could not afford the budget, the timeline, or the risk.
By the time their CTO booked a scoping call with Boundev, the board was watching. One board member had forwarded her a competitor press release: "[Competitor] launches AI assistant — available to all enterprise tiers now."
The Scoping Call: What We Found in 22 Minutes
Our scoping call with Meridian's CTO and lead engineer took exactly 22 minutes. We asked the four questions we always ask:
- What does each feature need to do, specifically? (Not the product vision — the actual input/output behavior)
- What data does it need to run on, and where does it live?
- What does "done" look like — what is the acceptance criterion before you would ship to production?
- What does your stack look like today?
The answers came back clean. Meridian had well-defined requirements, real training data for two of the features, and a Python/FastAPI backend we could integrate with directly. The semantic search feature needed a vector database they did not have yet — we recommended Pinecone based on their scale and budget — but that was not a blocker.
We said yes on the call. We mapped all four features to our Growth tier and estimated we could complete them across a single quarter with two AI engineers on rotating focus. Three days later, we started.
How We Built It: Feature by Feature
Feature 1: Smart Document Summarization (Weeks 1–3)
This was the clearest scope of the four. Meridian users uploaded PDFs — contracts, proposals, reports — and needed auto-generated summaries with section-level highlights. We used a chunked retrieval approach with GPT-4o, running the documents through a preprocessing pipeline that stripped formatting noise before passing structured chunks to the model.
The key engineering decision was summary fidelity over speed. Meridian's users are operations managers reviewing legal documents. A summary with a hallucinated clause is worse than no summary. We built in a confidence-scoring layer that flagged low-certainty extractions for human review rather than presenting them as fact.
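The shape of that pipeline, reduced to a single chunk, looked roughly like the sketch below. The prompt wording and the simple self-reported uncertainty flag are illustrative; the production confidence-scoring layer combined several signals rather than relying on the model's own flag alone.

```python
from openai import OpenAI

client = OpenAI()


def summarize_chunk(chunk: str) -> dict:
    """Summarize one pre-cleaned document chunk and flag low-confidence output."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "Summarize the excerpt. If any statement is uncertain or not "
                    "directly supported by the text, list it under 'UNCERTAIN:'."
                ),
            },
            {"role": "user", "content": chunk},
        ],
        temperature=0,
    )
    summary = response.choices[0].message.content
    # Anything the model marks as uncertain is routed to human review
    # instead of being presented as fact.
    return {"summary": summary, "needs_review": "UNCERTAIN:" in summary}
```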
The feature passed Meridian's QA in week 3. It shipped to their beta group — 47 users — in week 4.
Feature 2: Semantic Search Across the Knowledge Base (Weeks 2–5)
This was the most architecturally involved feature. Meridian had 14,000 internal knowledge base articles across two systems — their own CMS and a legacy Confluence instance — that needed to be embedded, indexed, and made searchable via natural language queries.
We stood up a Pinecone index, built an ingestion pipeline in Python that pulled from both sources, chunked articles at the paragraph level, and generated embeddings via OpenAI's text-embedding-3-large model. Query routing used a hybrid search approach: dense retrieval for semantic queries, BM25 for exact-match keyword queries, with a re-ranking step before results were returned.
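The ingestion side, sketched below in simplified form, handled chunking and embedding in one pass per article. The record structure and field names here are illustrative rather than lifted from Meridian's codebase.

```python
import os

from openai import OpenAI
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("meridian-kb")
client = OpenAI()


def ingest(articles):
    """articles: iterable of dicts with 'id', 'source', and 'text' keys."""
    for article in articles:
        # Paragraph-level chunks keep each embedding focused on one idea.
        paragraphs = [p.strip() for p in article["text"].split("\n\n") if p.strip()]
        response = client.embeddings.create(
            input=paragraphs,
            model="text-embedding-3-large",
        )
        vectors = [
            {
                "id": f"{article['id']}-{i}",
                "values": item.embedding,
                "metadata": {"source": article["source"], "text": paragraphs[i]},
            }
            for i, item in enumerate(response.data)
        ]
        index.upsert(vectors=vectors)
```

The query path itself was kept deliberately small: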
```python
import os

from openai import OpenAI
from pinecone import Pinecone

# API keys come from the environment rather than hard-coded constants.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("meridian-kb")
client = OpenAI()


def hybrid_search(query: str, top_k: int = 5):
    # Embed the query with the same model used at ingestion time.
    dense_vec = client.embeddings.create(
        input=query,
        model="text-embedding-3-large",
    ).data[0].embedding

    # Dense retrieval leg; the BM25 keyword leg and re-ranking run separately.
    results = index.query(
        vector=dense_vec,
        top_k=top_k,
        include_metadata=True,
    )
    return results.matches
```
Notice that the embedding call and the Pinecone query are kept separate. This matters because Meridian's ops team needed to swap embedding models later without rewriting the retrieval logic.
The search feature launched in week 6, with a measured 62% reduction in "I couldn't find it" support tickets in the first two weeks post-launch.
Feature 3: AI-Drafted Reply Assistant (Weeks 4–7)
Meridian's in-app inbox let operations managers respond to workflow requests from their internal teams. The AI-drafted reply assistant analyzed the incoming message, surfaced relevant context from the knowledge base, and generated a draft reply that the manager could edit and send.
This feature was the most sensitive one to get wrong. Nobody wants to send an AI-generated response that misses the tone, gets the facts wrong, or embarrasses them with a client. We used a retrieval-augmented generation (RAG) architecture that pulled context from the same Pinecone index built for Feature 2 — giving the draft assistant grounding in Meridian's actual internal documentation rather than relying on LLM training data alone.
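In outline, the generation path reused the `hybrid_search` function and OpenAI client from the Feature 2 snippet: retrieve the most relevant KB passages, then ask the model to draft a reply grounded only in that context. The system prompt below is illustrative, not the production prompt, and it assumes each match carries its chunk text in the metadata.

```python
def draft_reply(incoming_message: str) -> str:
    # Ground the draft in Meridian's own documentation, not model priors alone.
    matches = hybrid_search(incoming_message, top_k=5)
    context = "\n\n".join(m.metadata["text"] for m in matches)

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You draft replies for an operations manager. Use only the "
                    "provided context. Keep the tone professional and concise."
                ),
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nIncoming message:\n{incoming_message}",
            },
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content
```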
We also built a lightweight eval suite: 200 real historical inbox messages and their human-written responses, used to score draft quality on coherence, factual accuracy (against the KB), and appropriate tone. Every model or prompt change had to pass the eval before deployment.
The eval suite saved us from shipping a version that was technically impressive but subtly off-tone for enterprise ops managers. That kind of near-miss is invisible without structured evals.
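For readers wondering what "passing the eval" means mechanically, here is a minimal sketch of the harness shape, using an LLM judge scored against the human-written reference reply and reusing `draft_reply` and `client` from the snippet above. The rubric, threshold, and JSON format are illustrative; the production suite also checked factual claims directly against the knowledge base.

```python
import json


def run_eval(eval_set, threshold: float = 4.0) -> bool:
    """eval_set: list of {"message": ..., "human_reply": ...} from historical threads."""
    scores = []
    for case in eval_set:
        draft = draft_reply(case["message"])
        judgment = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {
                    "role": "system",
                    "content": (
                        "Score the draft 1-5 on coherence, factual consistency with "
                        "the reference reply, and tone. Respond as JSON: "
                        '{"coherence": n, "accuracy": n, "tone": n}'
                    ),
                },
                {
                    "role": "user",
                    "content": f"Reference reply:\n{case['human_reply']}\n\nDraft:\n{draft}",
                },
            ],
            response_format={"type": "json_object"},
            temperature=0,
        )
        scores.append(json.loads(judgment.choices[0].message.content))

    avg = {k: sum(s[k] for s in scores) / len(scores) for k in ("coherence", "accuracy", "tone")}
    # A prompt or model change ships only if every dimension clears the bar.
    return all(v >= threshold for v in avg.values())
```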
The reply assistant launched in week 7 with a "draft quality" user rating of 4.3/5 in its first 30 days.
Feature 4: Anomaly Detection for Workflow Patterns (Weeks 6–10)
The fourth feature was the most data-intensive. Meridian's platform tracked workflow completion rates, step durations, and reassignment events for thousands of active workflows. Their ops managers needed to be alerted when a workflow was behaving outside its normal pattern — stalling at an unusual step, showing an unexpected spike in reassignments — before it became a problem.
We built a time-series anomaly detection layer using isolation forest for unsupervised detection on historical workflow data, with alert thresholds calibrated by workflow type. The model ran on a nightly batch job and surfaced ranked alerts into the Meridian dashboard each morning.
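A stripped-down version of that nightly job looks like the sketch below. The feature columns and contamination setting are illustrative; the production pipeline computed richer per-workflow-type features and calibrated thresholds with the ops team.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest


def score_workflows(history: pd.DataFrame, today: pd.DataFrame) -> pd.DataFrame:
    """Nightly batch: fit per-workflow-type detectors on history, score today's runs."""
    features = ["completion_rate", "avg_step_duration", "reassignment_count"]
    alerts = []
    for wf_type, group in history.groupby("workflow_type"):
        model = IsolationForest(contamination=0.01, random_state=0)
        model.fit(group[features])

        current = today[today["workflow_type"] == wf_type]
        if current.empty:
            continue
        # decision_function: lower scores mean more anomalous.
        scored = current.assign(anomaly_score=model.decision_function(current[features]))
        alerts.append(scored[scored["anomaly_score"] < 0])

    # Rank the most anomalous workflows first for the morning dashboard.
    return pd.concat(alerts).sort_values("anomaly_score") if alerts else pd.DataFrame()
```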
The tricky part was false positives. Anomaly detection models that cry wolf destroy user trust fast. We spent two full weeks with Meridian's ops team tuning the sensitivity settings by workflow category until fewer than 10% of alerts turned out to be false positives. When it launched in week 10, ops managers reported catching three workflow failures in the first two weeks that would have gone unnoticed until client escalation.
Working on something AI-shaped? We'll scope it in 20 minutes — no pitch, no pressure.
Book scoping call →

What the Quarter Looked Like in Numbers
The differences between what Meridian expected when they started and what actually shipped:
| Feature | Original Plan | Actual Delivery | Key Metric |
|---|---|---|---|
| Document summarization | Q3 2026 | Week 4 | 47 beta users, +18% session time |
| Semantic search | Q3 2026 | Week 6 | 62% drop in "can't find" tickets |
| Reply assistant | Q4 2026 | Week 7 | 4.3/5 draft quality rating |
| Anomaly detection | Q4 2026 | Week 10 | 3 failures caught in first 2 weeks |
All four features. One quarter. No new headcount. Meridian retained their $400K enterprise deal — the one at risk when we started — and used the AI features as a differentiator in two additional enterprise pitches that quarter.
What This Means for Your Roadmap
If Meridian's story sounds familiar — features on the roadmap for multiple quarters, hiring not closing fast enough, one-off freelancers not delivering production quality — the structural solution is worth understanding.
The reason Meridian could ship four features in 90 days is not that we are faster than their engineers. It is that we brought dedicated, context-loaded AI engineering capacity with no ramp-up time on tooling, no competing sprint work, and no learning curve on production RAG, vector databases, or eval infrastructure. Their engineers kept shipping core product. We shipped the AI layer.
This model works best when:
- Your AI feature requirements are clear (inputs, outputs, acceptance criteria defined)
- Your existing backend is modern enough to integrate with (Python, Node, FastAPI, Rails all work)
- You need production-quality code, not notebooks or prototypes
- You have more than one AI feature to ship in the next 6 months
It works less well when your requirements are still exploratory — when you are not sure what the AI feature should do yet. If that is where you are, the right first step is a scoping call, not an engagement.
Got an AI feature in mind?
Book a free 20-minute AI Feature Scoping Call. We'll tell you whether Boundev is the right fit, what tier you'd need, and how fast we can ship. We say no to about a third of calls — the fit either works or it doesn't.
Book scoping call →

Frequently Asked Questions
How long does it take Boundev to start shipping after a scoping call?
Typically 3–5 business days from signed agreement to first sprint kickoff. Onboarding involves a single setup session where we access your repo, review your stack, and align on acceptance criteria for the first feature.
Does Boundev require a long-term contract?
No. The subscription runs month-to-month after an initial 30-day commitment. Most customers stay 3–6 months because that is how long a typical AI feature batch takes to ship, test, and stabilize.
What if our requirements change mid-quarter?
We handle this regularly. If a feature scope shifts significantly, we reprioritize within the same sprint cycle. The model is optimized for iteration, not fixed-scope contracts.
Do we own the code Boundev writes?
Yes, fully. All code is committed to your repo, under your IP, from day one. No licensing restrictions or lock-in provisions in the engagement agreement.
What is the difference between Boundev's tiers?
Starter covers one AI feature at a time with one dedicated AI engineer. Growth covers two to three parallel features with two engineers and an AI Ops Manager. Scale handles complex, multi-system AI integrations with a full pod. The Meridian engagement ran on Growth tier throughout.

