Natural-Language-to-Simulation Scenario Expander for Embodied AI
Give it a plain-English scenario (‘robot arm retrieves a tipped-over bottle from a wet countertop’) and it outputs fully parameterized simulation configs plus Cosmos-3-validated synthetic observation videos for training embodied agents.
Difficulty: 1-month | Stack: Python, FastAPI, Nemotron 3 Ultra (structured-output mode), NVIDIA Isaac Sim Python API or PyBullet, Cosmos 3 API, Pydantic, PostgreSQL, Next.js (review UI)
Who this is for
Robotics researchers and autonomous-vehicle (AV) teams who need diverse, edge-case-rich training scenarios but lack the human hours to hand-author simulation configs at the scale domain randomization demands.
Build steps
- Build a Nemotron-powered ‘scenario compiler’: accept a natural-language description and use structured-output prompting to emit a JSON config, validated against a schema that covers the object list, material properties, initial poses, lighting conditions, and success/failure criteria.
- Write an Isaac Sim (or PyBullet) scene builder that consumes the JSON config and programmatically instantiates the scene; run a short physics rollout and export RGB + depth frame sequences as video clips.
- Pipe each generated video clip through the Cosmos 3 API (or the plausibility scorer from Project 2) to score physical realism; auto-reject clips below threshold and trigger a re-sample with perturbed parameters.
- Implement a diversity sampler: use Nemotron to generate N variations of a base scenario (different object counts, surface textures, lighting) and deduplicate via embedding similarity to prevent redundant near-copies in the dataset.
- Expose a Next.js review UI where a researcher can browse generated scenarios, inspect per-clip plausibility scores, approve/reject batches, and export a final annotated dataset manifest (COCO-style or LeRobot format).
- Add a feedback loop: approved clips get embedded and stored; new scenario requests are checked against the existing library so the system suggests augmenting an existing scenario rather than regenerating from scratch when coverage already exists.
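The deduplication in the diversity-sampler step can be sketched as a greedy filter over embedding vectors. This is a minimal illustration, not a prescribed design: the function names, the 0.95 cosine threshold, and the toy 3-dimensional vectors are all assumptions, and real embeddings would come from whatever embedding model the project adopts.

```python
from __future__ import annotations

import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.95
) -> list[int]:
    """Greedily keep the indices of variants that are not near-copies.

    A variant is kept only if its similarity to every previously kept
    variant is below `threshold` (a hypothetical default, to be tuned).
    """
    kept: list[int] = []
    for i, candidate in enumerate(embeddings):
        if all(cosine_similarity(candidate, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy embeddings standing in for three generated scenario variants.
variants = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],  # near-copy of the first variant
    [0.0, 1.0, 0.0],
]
kept_indices = deduplicate(variants)
print(kept_indices)
```

The greedy pass is quadratic in the number of variants, which is acceptable for small batches; a vector index would be the natural replacement once the approved-clip library grows.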
Risks
- Isaac Sim has a steep setup curve (license, Omniverse install, Python API version pinning); budget 3–5 days just for environment setup and hello-world physics validation before touching the LLM pipeline.
- Nemotron’s structured-output mode guarantees well-formed JSON, not physically sensible values: it may still emit impossible parameter combinations (negative friction coefficients, overlapping rigid bodies at spawn) that crash the simulator. Add a Pydantic validation layer with domain-specific constraints between the LLM output and the scene builder.
- End-to-end latency per scenario (LLM expand → sim rollout → Cosmos score) may be 2–5 minutes, making interactive iteration slow. Parallelize sim rollouts across multiple Isaac Sim instances early; otherwise the feedback loop becomes too slow to use well before the project is finished.
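The domain-specific checks mentioned in the validation risk might look like the sketch below. To keep it dependency-free it uses standard-library dataclasses as a stand-in for Pydantic; the same rules would map onto Pydantic field and model validators. The `RigidBody` fields, the bounding-sphere overlap test, and the error messages are illustrative assumptions, not the project's actual schema.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class RigidBody:
    """Hypothetical minimal object record parsed from the LLM's JSON output."""

    name: str
    position: tuple[float, float, float]  # spawn position in metres
    radius: float  # bounding-sphere radius in metres (a coarse simplification)
    friction: float  # friction coefficient, must be non-negative


def validate_scene(bodies: list[RigidBody]) -> list[str]:
    """Return human-readable constraint violations (empty list means valid)."""
    errors: list[str] = []
    for body in bodies:
        if body.friction < 0.0:
            errors.append(f"{body.name}: friction {body.friction} is negative")
        if body.radius <= 0.0:
            errors.append(f"{body.name}: radius must be positive")
    # Pairwise bounding-sphere test for bodies that interpenetrate at spawn.
    for a, b in combinations(bodies, 2):
        gap = math.dist(a.position, b.position) - (a.radius + b.radius)
        if gap < 0.0:
            errors.append(f"{a.name} and {b.name} overlap at spawn by {-gap:.3f} m")
    return errors


bottle = RigidBody("bottle", (0.0, 0.0, 0.1), radius=0.05, friction=0.4)
cup = RigidBody("cup", (0.03, 0.0, 0.1), radius=0.04, friction=-0.2)
problems = validate_scene([bottle, cup])
for message in problems:
    print(message)
```

Returning a list of messages rather than raising on the first failure lets the pipeline feed every violation back into the re-sampling prompt in one pass.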
Business Angle
Stripe for simulation configs — paste a plain-English scenario, get a production-ready Isaac Sim config + synthetic video dataset in minutes, not weeks.
Customer: Solo robotics ML engineer at a 10–50 person robotics startup (for example warehouse picking, surgical robotics, or agri-bots) who owns the sim-to-real pipeline but has no dedicated simulation engineering team, and is bottlenecked hand-authoring URDF tweaks and randomization parameters.
Pricing: SaaS subscription, targeting $1,500 MRR within 4 months (5 customers × $300/mo on a 500-scenario/month plan)