Unified Gymnasium wrapper lets one RL agent train across MuJoCo, PyBullet, and Brax without custom infra

Customer: Solo robotics ML engineer at a 5-20 person deep-tech startup or university lab — has working sim pipeline in one physics engine, needs sim-to-real robustness but can’t justify 2-week infra detour to abstract across engines

Problem: Switching or combining physics simulators (MuJoCo vs PyBullet vs Brax) requires rewriting env wrappers, observation spaces, reward shaping, and vectorization logic per engine — kills momentum on actual research
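
A minimal sketch of the abstraction this implies, assuming each engine sits behind a thin adapter. `EngineBackend`, `ToyBackend`, and `UnifiedEnv` are hypothetical names invented for illustration, and no real simulator binding is called. The wrapper follows the Gymnasium convention of `reset` returning `(obs, info)` and `step` returning `(obs, reward, terminated, truncated, info)`, and it keeps observation normalization and reward shaping in one place so they are not rewritten per engine.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class EngineBackend(Protocol):
    """Hypothetical per-engine adapter; one implementation per simulator."""

    name: str
    obs_scale: float

    def reset(self, seed: int | None) -> list[float]: ...
    def advance(self, action: list[float]) -> tuple[list[float], bool]: ...


@dataclass
class ToyBackend:
    """Stand-in for a real engine binding; it only integrates a 1-D position."""

    name: str
    obs_scale: float  # models engines reporting the same quantity in different units
    position: float = 0.0

    def reset(self, seed=None):
        self.position = 0.0
        return [self.position * self.obs_scale]

    def advance(self, action):
        self.position += action[0]
        done = self.position >= 3.0  # toy termination condition
        return [self.position * self.obs_scale], done


class UnifiedEnv:
    """Gymnasium-style reset/step on top of any backend."""

    def __init__(self, backend: EngineBackend, max_steps: int = 100) -> None:
        self.backend = backend
        self.max_steps = max_steps
        self._steps = 0

    def _normalize(self, raw):
        # Map engine-specific units back to one shared observation space.
        return [x / self.backend.obs_scale for x in raw]

    def reset(self, seed=None):
        self._steps = 0
        obs = self._normalize(self.backend.reset(seed))
        return obs, {"engine": self.backend.name}

    def step(self, action):
        raw, terminated = self.backend.advance(action)
        obs = self._normalize(raw)
        self._steps += 1
        reward = obs[0]  # reward is defined once, on normalized observations
        truncated = self._steps >= self.max_steps and not terminated
        return obs, reward, terminated, truncated, {"engine": self.backend.name}


def rollout(env, actions):
    """Run a fixed action sequence and keep everything except the info dict."""
    env.reset(seed=0)
    return [env.step([a])[:4] for a in actions]
```

With this shape, the same action sequence produces identical transitions from two backends that differ only in units, which is the property an agent needs in order to stay engine-agnostic.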

Pricing: open-core, targeting about $800 MRR within 4 months (8 teams at $99/mo = $792, for hosted dashboard + multi-engine sweep configs; core lib stays MIT)

Why now

The sim-to-real gap is closing fast, so teams now actually ship to hardware: physics generalization stops being academic and starts being a deployment blocker. Brax on TPUs, MuJoCo 3.x, and Isaac Lab all landed within an 18-month span, and no unified abstraction across them exists yet

Go-to-market

Ship MIT-licensed pip package, post on r/reinforcementlearning and HuggingFace forums with working ant/hopper cross-engine benchmark — let organic GitHub stars do early distribution
DM 20 robotics ML engineers active on Twitter/X who complained about sim transfer; offer free 30-min setup call in exchange for honest feedback and a testimonial
Write one deeply technical blog post: ‘We trained the same SAC agent across 3 physics engines — here’s what broke’ — target HN front page, cross-post to Towards Data Science
Offer $99/mo ‘Physics Sweep Pro’ tier: hosted config UI to define regime curricula (mass ranges, friction distributions, gravity perturbations) + experiment tracking integration — sell this once 3 free users hit scaling pain
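
A regime curriculum of the kind the paid tier would manage could be expressed roughly as follows. This is an illustrative sketch only: `Regime`, `CURRICULUM`, `sample_params`, and `sample_sweep` are hypothetical names, the numeric ranges are placeholders, and the choice of a uniform mass multiplier, a Gaussian friction coefficient clipped at zero, and a uniform gravity offset around -9.81 m/s^2 is an assumption, not a spec.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Regime:
    """One stage of a physics curriculum (all fields are illustrative)."""

    name: str
    mass_scale: tuple[float, float]  # uniform range, multiplier on nominal mass
    friction_mean: float             # Gaussian over the friction coefficient
    friction_std: float
    gravity_jitter: float            # max absolute offset in m/s^2 around -9.81


def sample_params(regime: Regime, rng: random.Random) -> dict[str, float]:
    """Draw one set of physics parameters for a single training run."""
    friction = max(0.0, rng.gauss(regime.friction_mean, regime.friction_std))
    jitter = rng.uniform(-regime.gravity_jitter, regime.gravity_jitter)
    return {
        "mass_scale": rng.uniform(*regime.mass_scale),
        "friction": friction,
        "gravity_z": -9.81 + jitter,
    }


# Placeholder curriculum: widen the perturbation ranges stage by stage.
CURRICULUM = [
    Regime("nominal", (1.0, 1.0), 1.0, 0.0, 0.0),
    Regime("mild", (0.9, 1.1), 1.0, 0.1, 0.2),
    Regime("harsh", (0.7, 1.3), 1.0, 0.3, 1.0),
]


def sample_sweep(curriculum, draws_per_regime, seed):
    """Expand a curriculum into a flat, reproducible list of run configs."""
    rng = random.Random(seed)
    return [
        {"regime": regime.name, **sample_params(regime, rng)}
        for regime in curriculum
        for _ in range(draws_per_regime)
    ]
```

Seeding the generator once per sweep keeps the list of run configs reproducible, which matters if the same sweep is to be replayed on each engine.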

Moat (or lack thereof)

No real moat. API abstraction is easy to copy; any sim vendor (Isaac Lab, Brax team) could ship this themselves. Defensibility is execution speed and community trust before a big player notices: a classic indie-hacker window of 12-18 months max. Network effects are weak. Bet on being first, not being sticky.