Multi-hop RAG evidence tracker for legal and compliance researchers drowning in large document sets
Customer: Solo compliance analyst or legal researcher at a small law firm or boutique consultancy (1–10 person team), handling due diligence, regulatory review, or case research across 500–5,000 internal documents — technically comfortable enough to use a web UI but not a Python dev
Problem: When answering multi-part legal or compliance questions across large corpora, they waste hours re-reading the same source docs, lose track of which facts are already established, and produce memos with subtle contradictions because their RAG tool has no memory across retrieval iterations
Pricing: SaaS subscription at $99/mo; target of roughly $800 MRR within 4 months (8 customers × $99 = $792)
Why now
GPT-4o and LlamaIndex have made multi-hop RAG tractable for solo builders. The research wave targeting redundant retrieval and scattered evidence (this cluster’s focus) signals a real, named pain point that practitioners are starting to recognize. Early movers who productize a fix before the OSS ecosystem catches up have a 6–12 month window.
Go-to-market
- Post a teardown on r/legaltech and r/MachineLearning showing the evidence ledger catching a real contradiction in a public legal filing — attach a Loom, link to a waitlist
- DM 20 solo compliance consultants on LinkedIn who post about AI tools; offer a free 2-week pilot in exchange for a 30-min feedback call and a testimonial
- List on There’s An AI For That and Futurepedia under ‘Legal AI’ — these drive low-intent but high-volume discovery traffic from exactly this persona
- Offer a $49 one-time ‘upload your corpus’ onboarding call so early adopters feel hand-held; use those sessions to find the 3 most common query patterns and harden the UI around them
Moat (or lack thereof)
No real moat: a developer using LlamaIndex or LangGraph can replicate the evidence ledger pattern in a weekend, and well-funded legal AI players (Harvey, Casetext) will add multi-hop features. The defensibility is purely execution: being first to ship a polished Streamlit UI that non-engineers can use, and accumulating a library of domain-specific prompt templates and retrieval configs from real customer use cases. That’s a 6–12 month head start, not a durable moat.
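
To make the "evidence ledger pattern" concrete, here is a minimal sketch of how small the core idea is. It is not a LlamaIndex or LangGraph API. The class names, field names, and the exact-match contradiction rule are illustrative assumptions, and it assumes facts have already been extracted from retrieved chunks into key/value form. The ledger remembers which chunks were read in earlier retrieval hops (so they are not re-read) and flags a new fact whose value disagrees with one already recorded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    fact_key: str   # hypothetical normalized label, e.g. "notice_days"
    value: str      # asserted value as extracted from the source
    source_id: str  # document or chunk identifier (format is an assumption)
    hop: int        # retrieval iteration that surfaced this fact


@dataclass
class EvidenceLedger:
    seen_sources: set = field(default_factory=set)
    facts: dict = field(default_factory=dict)  # fact_key -> list of Evidence

    def filter_new(self, source_ids):
        """Return only chunks not read in an earlier hop, and mark them as read."""
        fresh = []
        for source_id in source_ids:
            if source_id not in self.seen_sources:
                self.seen_sources.add(source_id)
                fresh.append(source_id)
        return fresh

    def record(self, evidence):
        """Store a fact and return earlier entries whose value disagrees with it."""
        prior = self.facts.setdefault(evidence.fact_key, [])
        # Simplifying assumption: a contradiction is any differing string value.
        conflicts = [p for p in prior if p.value != evidence.value]
        prior.append(evidence)
        return conflicts


ledger = EvidenceLedger()
hop1 = ledger.filter_new(["contract_a#3", "policy_b#1"])
hop2 = ledger.filter_new(["policy_b#1", "memo_c#7"])  # policy_b#1 was already read

ledger.record(Evidence("notice_days", "30", "contract_a#3", hop=1))
conflicts = ledger.record(Evidence("notice_days", "60", "memo_c#7", hop=2))
```

In this toy run, the second hop skips the chunk already read in the first, and the second `record` call returns the earlier "30 days" entry as a conflict to surface in the memo. A real product would need fuzzier fact matching than string equality, which is where the accumulated domain-specific configs would come in.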