Automated physical plausibility scoring for synthetic video datasets, so ML engineers stop wasting GPU hours training on broken sim data.

Customer: A solo ML engineer or small team (2–5 people) at a robotics or AV startup who owns the synthetic data pipeline — they run a sim (Isaac Sim, CARLA, BlenderProc) at scale, generate thousands of clips per week, and currently do QA by spot-checking 50 clips manually before a training run.

Problem: Synthetic video generators produce physically broken clips — objects clipping through floors, unnatural motion blur, teleporting entities — at a non-trivial rate (~5–15%). These poison perception model training, but manual review doesn’t scale past a few hundred clips, so bad samples silently enter the dataset.
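
To make the scale of the problem concrete, here is a rough back-of-the-envelope sketch. The weekly volume of 5,000 clips is an assumption standing in for "thousands of clips per week"; the 5–15% defect range and the 50-clip spot check come from the text above. The function names are illustrative only.

```python
def expected_bad_clips(clips_per_week: int, defect_rate: float) -> int:
    """Expected number of physically broken clips in one week's output."""
    return round(clips_per_week * defect_rate)


def spot_check_coverage(sample_size: int, clips_per_week: int) -> float:
    """Fraction of the weekly output a manual spot check actually looks at."""
    return sample_size / clips_per_week


CLIPS_PER_WEEK = 5000  # assumption: stands in for "thousands of clips per week"

low = expected_bad_clips(CLIPS_PER_WEEK, 0.05)    # lower end of the quoted range
high = expected_bad_clips(CLIPS_PER_WEEK, 0.15)   # upper end of the quoted range
coverage = spot_check_coverage(50, CLIPS_PER_WEEK)

print(f"Expected bad clips per week: {low} to {high}")
print(f"Share of clips reviewed by a 50-clip spot check: {coverage:.1%}")
```

Under these assumptions, a few hundred broken clips reach the training set each week while the spot check touches only a small slice of the output.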

Pricing: SaaS, monthly recurring revenue. Target: roughly $800 MRR within 4 months (8 customers at $99/mo, or 4 at $199/mo on a higher-throughput tier).
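
The arithmetic behind the target can be checked with a few lines. This is only a sanity check of the two price points mentioned above; the scenario labels are illustrative.

```python
def mrr(customers: int, price_per_month: int) -> int:
    """Monthly recurring revenue in dollars for a single pricing tier."""
    return customers * price_per_month


scenarios = {
    "8 customers at $99": mrr(8, 99),
    "3 customers at $199": mrr(3, 199),
    "4 customers at $199": mrr(4, 199),
}

for label, revenue in scenarios.items():
    print(f"{label}: ${revenue}/mo")
```

Eight customers at $99 or four at $199 both land just under $800; three customers at $199 fall about $200 short, so the higher tier needs four customers to match the target.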

Why now

NVIDIA’s newly released Cosmos 3 video foundation model provides a capable, accessible backbone for physical plausibility scoring, with no need to train a custom model from scratch. The same week Cosmos 3 dropped, NVIDIA also pushed Isaac Sim updates, so the sim-to-real pipeline community is actively evaluating new tooling right now. The window for capturing early adopters is short.

Go-to-market

Post a detailed teardown on Hacker News / r/MachineLearning showing a real before/after: take a public synthetic dataset (e.g., VIPER or SyntheticDriving), run the filter, show the score distribution and the worst-offending clips side-by-side. No product pitch — just the result. Link to a waitlist.
Find 3 robotics or AV ML engineers on LinkedIn or in community channels (Roboflow and Hugging Face Discords, NVIDIA developer forums) who publicly complain about synthetic data quality. Offer each of them, by DM, a free 500-clip batch analysis in exchange for a 20-minute call about their current workflow.
Offer a free, self-hosted open-core CLI (MIT license on GitHub) that scores clips locally using a lighter model (e.g., Video-LLaVA on a consumer GPU). The SaaS product is the hosted API with Cosmos 3 as the backend, higher throughput, and a SQLite-backed audit dashboard, which gives a clear, organic upsell path.
Target NVIDIA’s Inception Program and Jetson / Isaac Sim developer newsletters for a guest post or tool spotlight — NVIDIA has strong incentive to promote tooling that demonstrates Cosmos 3’s utility in production workflows.
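
As a rough illustration of the scoring CLI and the SQLite-backed audit trail mentioned above, the sketch below filters clips against a plausibility threshold and logs every decision. It is a minimal sketch under stated assumptions, not the product's implementation: `filter_clips`, the `audit` table layout, the 0.5 threshold, and the stand-in scorer are all hypothetical, and the real scorer (a call to Video-LLaVA or Cosmos 3) is deliberately left as a pluggable function.

```python
import sqlite3
from typing import Callable, Iterable, List, Tuple


def filter_clips(
    clip_paths: Iterable[str],
    score_fn: Callable[[str], float],   # hypothetical: returns plausibility in [0, 1]
    threshold: float = 0.5,             # assumption: cut-off would be tuned per dataset
    db_path: str = ":memory:",
) -> Tuple[List[str], sqlite3.Connection]:
    """Score each clip, record the decision in SQLite, and return the kept clips."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit (clip TEXT, score REAL, kept INTEGER)"
    )
    kept: List[str] = []
    for path in clip_paths:
        score = score_fn(path)
        keep = score >= threshold
        # Every clip is logged, kept or rejected, so the decision can be audited later.
        conn.execute(
            "INSERT INTO audit (clip, score, kept) VALUES (?, ?, ?)",
            (path, score, int(keep)),
        )
        if keep:
            kept.append(path)
    conn.commit()
    return kept, conn


# Stand-in scorer with made-up scores, used only to exercise the pipeline.
fake_scores = {"a.mp4": 0.92, "b.mp4": 0.18, "c.mp4": 0.67}
kept, conn = filter_clips(fake_scores.keys(), fake_scores.__getitem__)
rejected = conn.execute("SELECT clip FROM audit WHERE kept = 0").fetchall()
print("kept:", kept)
print("rejected:", [row[0] for row in rejected])
```

Keeping the scorer behind a plain callable means the local CLI and the hosted API could share the same filtering and audit code while swapping the model underneath.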

Moat (or lack thereof)

No real moat. The core scoring logic is a thin wrapper around a foundation model any competitor can also call. The defensible bits — if any — are (1) the scoring rubric and prompt engineering tuned to robotics/AV failure modes, which takes iteration time to get right, and (2) the audit trail / dataset provenance features that become sticky once a team’s QA workflow depends on them. Realistically this is a first-mover + distribution play, not a technical moat.