The customer
A Series A legal-tech SaaS serving mid-market law firms. 32 employees. Their product replaces three legacy tools used in litigation document review. Customer support is a team of seven specialists fielding ~400 tickets a week — most of them already answered somewhere in the company's 12,000-page documentation archive.
What they tried first
Six months in three acts. First, an in-house attempt — their two senior engineers spent 4 weeks on it, hit a wall on chunking, and went back to the product roadmap. Then a Toptal contractor — vetted, paid hourly, vanished mid-sprint after billing $14K. Then an agency quote: $42K and 11 weeks. The CEO killed it on the call.
The task they submitted
“Build us a RAG system over our 12,000-page customer support archive. Should answer agent-side questions inside Slack, cite sources, and not hallucinate. Stack is Python on AWS; we're already on Anthropic for our other AI features.”
Our approach
Scoped and assigned in 90 minutes. Senior engineer (ex-Anthropic Solutions) on it Tuesday morning. Hybrid retrieval (BM25 + semantic), Weaviate self-hosted on their existing AWS, semantic chunking via document structure (not naive 512-token splits), Claude 3.7 Sonnet for generation, citation enforcement at the prompt layer. Eval suite built before the first generation call — 280 questions, ground-truth answers from the support team, run on every PR.
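The case study names the two retrievers (BM25 and semantic) but not how their rankings are merged. A common choice is reciprocal rank fusion (RRF); the sketch below is illustrative, not the shipped implementation, and the document IDs are hypothetical.

```python
def rrf_fuse(bm25_ranked, semantic_ranked, k=60):
    """Merge two best-first lists of document IDs with reciprocal rank fusion.

    A document's fused score is the sum of 1 / (k + rank) over every list
    it appears in; k=60 is the constant from the original RRF paper and
    damps the outsized influence of top ranks.
    """
    scores = {}
    for ranked in (bm25_ranked, semantic_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical IDs: keyword and vector search disagree on order,
# and fusion rewards the document both retrievers rank highly.
bm25 = ["faq-212", "guide-07", "faq-031"]
semantic = ["guide-07", "faq-031", "faq-212"]
print(rrf_fuse(bm25, semantic))  # "guide-07" wins: top-2 in both lists
```

RRF is attractive here because it needs no score normalization between the two retrievers, only their ranks.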
Daily 90-second Slack updates. One revision round (the customer wanted a second confidence band on citations) shipped in 4 hours.
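The eval gate above (280 ground-truth questions, run on every PR) could take many shapes; a minimal sketch is a per-question check that the answer cites its source and covers the ground truth. The keyword-overlap scorer, the 0.6 threshold, and the case fields here are all illustrative assumptions, not the real suite.

```python
import re

def score_answer(answer: str, ground_truth: str, required_citation: str) -> bool:
    """Pass iff the answer names its source and covers most of the
    ground-truth terms (a crude stand-in for a real LLM-based grader)."""
    if required_citation not in answer:
        return False  # citation enforcement: uncited answers fail outright
    key_terms = set(re.findall(r"\w+", ground_truth.lower()))
    found = set(re.findall(r"\w+", answer.lower()))
    # Require at least 60% of ground-truth terms to appear in the answer.
    return len(key_terms & found) / max(len(key_terms), 1) >= 0.6

# Hypothetical eval case in the shape such a suite might use.
case = {
    "question": "How do I export a review set?",
    "ground_truth": "Use the Export button on the review set page",
    "citation": "docs/export-guide",
}
generated = "Click the Export button on the review set page. [docs/export-guide]"
print(score_answer(generated, case["ground_truth"], case["citation"]))  # True
```

Wiring a loop like this into CI is what makes "run on every PR" cheap: a retrieval or prompt regression fails the build before it reaches the support team.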
“Boundev shipped what our last contractor couldn't in 4 months. The PR came in clean — tests, eval suite, deploy guide. We merged it the same day.”
The outcome
Shipped Tuesday → Tuesday. Six business days, including the revision round. Within 60 days of deploy: average support response time down 40%, CSAT up 18 points, and the support team escalates ~22% fewer tickets to engineering. The customer renewed at Growth and is on their fourth task.
What's next
Two follow-on tasks are queued for next month: an MCP server exposing the same retrieval inside their product (so their customers' end users can self-serve answers), and an LLM cost audit.
