Natural-Language-to-Simulation Scenario Expander for Embodied AI

Give it a plain-English scenario (‘robot arm retrieves a tipped-over bottle from a wet countertop’) and it outputs fully parameterized simulation configs plus Cosmos-3-validated synthetic observation videos for training embodied agents.

Difficulty: 1-month | Stack: Python, FastAPI, Nemotron 3 Ultra (structured-output mode), NVIDIA Isaac Sim Python API or PyBullet, Cosmos 3 API, Pydantic, PostgreSQL, Next.js (review UI)

Who this is for

Robotics researchers and autonomous-vehicle (AV) teams who need diverse, edge-case-rich training scenarios but lack the human hours to manually author simulation configs at the scale domain randomization demands.

Build steps

Build a Nemotron-powered ‘scenario compiler’: accept a natural-language description and use structured-output prompting to emit a JSON scenario config, validated against a fixed schema, covering the object list, material properties, initial poses, lighting conditions, and success/failure criteria.
Write an Isaac Sim (or PyBullet) scene builder that consumes the JSON config and programmatically instantiates the scene; run a short physics rollout and export RGB + depth frame sequences as video clips.
Pipe each generated video clip through the Cosmos 3 API (or the plausibility scorer from Project 2) to score physical realism; auto-reject clips below threshold and trigger a re-sample with perturbed parameters.
Implement a diversity sampler: use Nemotron to generate N variations of a base scenario (different object counts, surface textures, lighting) and deduplicate via embedding similarity to prevent redundant near-copies in the dataset.
Expose a Next.js review UI where a researcher can browse generated scenarios, inspect per-clip plausibility scores, approve/reject batches, and export a final annotated dataset manifest (COCO-style or LeRobot format).
Add a feedback loop: approved clips get embedded and stored; new scenario requests are checked against the existing library so the system suggests augmenting an existing scenario rather than regenerating from scratch when coverage already exists.
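
The reject-and-resample step could look roughly like the sketch below. It uses only the standard library; `rollout_and_score` is a hypothetical callable standing in for "run the sim rollout, then score the clip with Cosmos 3", and the threshold, perturbation scale, and attempt cap are placeholder values, not recommendations.

```python
import random
from typing import Callable, Dict, Optional, Tuple

Params = Dict[str, float]


def perturb(params: Params, rng: random.Random, scale: float = 0.1) -> Params:
    """Return a copy with every value jittered by up to +/- scale (relative)."""
    return {k: v * (1.0 + rng.uniform(-scale, scale)) for k, v in params.items()}


def sample_until_plausible(
    base: Params,
    rollout_and_score: Callable[[Params], float],  # hypothetical: sim rollout + plausibility score
    threshold: float = 0.7,   # assumed cutoff; tune against real scores
    max_attempts: int = 5,
    seed: int = 0,
) -> Tuple[Optional[Params], int]:
    """Try the base parameters first, then perturbed copies of them.

    Returns (accepted_params, attempts_used), or (None, max_attempts)
    if no attempt reached the threshold.
    """
    rng = random.Random(seed)
    params = dict(base)
    for attempt in range(1, max_attempts + 1):
        if rollout_and_score(params) >= threshold:
            return params, attempt
        # Rejected: re-sample around the base scenario, not the failed sample,
        # so repeated rejections do not drift away from the original request.
        params = perturb(base, rng)
    return None, max_attempts


if __name__ == "__main__":
    # Stub scorer for illustration only: rejects the first two clips.
    calls = {"n": 0}

    def fake_score(_: Params) -> float:
        calls["n"] += 1
        return 0.9 if calls["n"] >= 3 else 0.2

    accepted, attempts = sample_until_plausible({"friction": 0.4, "mass": 0.5}, fake_score)
    print(accepted, attempts)
```

Capping the attempts matters because some scenario descriptions may never produce a plausible clip; those are better surfaced in the review UI than retried forever.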
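
The embedding-based deduplication in the diversity sampler can be sketched as a greedy filter over cosine similarity. This is a minimal illustration, assuming each scenario variation has already been embedded as a list of floats by some embedding model (not shown); the 0.95 cutoff is an arbitrary placeholder.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedupe(embeddings: Sequence[Sequence[float]], max_similarity: float = 0.95) -> List[int]:
    """Greedy filter: keep a variation only if it is not too similar to any
    variation already kept. Returns the indices of the kept variations."""
    kept: List[int] = []
    for i, emb in enumerate(embeddings):
        if all(cosine(emb, embeddings[j]) < max_similarity for j in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    # Toy 2-D vectors for illustration; real embeddings have many more dimensions.
    variations = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
    print(dedupe(variations))  # → [0, 2]
```

The greedy pass is quadratic in the number of kept variations, which is fine for batches of a few hundred. The same similarity check could also serve the library lookup in the feedback loop, comparing a new request against stored embeddings.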

Risks

Isaac Sim has a steep setup curve (license, Omniverse install, Python API version pinning); budget 3–5 days just for environment setup and hello-world physics validation before touching the LLM pipeline.
Structured output guarantees the shape of the JSON, not its physical plausibility: Nemotron may still hallucinate physically impossible parameter combinations (negative friction coefficients, overlapping rigid bodies at spawn) that crash the simulator; add a Pydantic validation layer with domain-specific constraints between the LLM output and the scene builder.
End-to-end latency per scenario (LLM expand → sim rollout → Cosmos score) may be 2–5 minutes, making interactive iteration slow; parallelize sim rollouts across multiple Isaac Sim instances early, or the feedback loop will become too slow to use well before the project is finished.
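
The domain-specific checks from the validation-layer risk above can be sketched as follows. To keep the example dependency-free it uses standard-library dataclasses as a stand-in; in the project itself the same rules would live in Pydantic validators. The `SceneObject` fields and the bounding-sphere overlap test are illustrative assumptions, not the project's actual schema.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical subset of the per-object fields in the scenario config."""
    name: str
    position: Tuple[float, float, float]  # spawn position, metres
    radius: float                         # bounding-sphere radius, metres
    friction: float                       # friction coefficient
    mass: float                           # kilograms


def validate_scene(objects: List[SceneObject]) -> List[str]:
    """Return human-readable errors; an empty list means the scene passed."""
    errors: List[str] = []
    for obj in objects:
        if obj.friction < 0:
            errors.append(f"{obj.name}: friction must be >= 0, got {obj.friction}")
        if obj.mass <= 0:
            errors.append(f"{obj.name}: mass must be > 0, got {obj.mass}")
        if obj.radius <= 0:
            errors.append(f"{obj.name}: radius must be > 0, got {obj.radius}")
    # Conservative spawn-overlap check: bounding spheres may not intersect.
    for a, b in combinations(objects, 2):
        if math.dist(a.position, b.position) < a.radius + b.radius:
            errors.append(f"{a.name} and {b.name} overlap at spawn")
    return errors


if __name__ == "__main__":
    scene = [
        SceneObject("bottle", (0.0, 0.0, 0.1), 0.05, -0.2, 0.5),
        SceneObject("cup", (0.06, 0.0, 0.1), 0.05, 0.6, 0.2),
    ]
    for err in validate_scene(scene):
        print(err)
```

Collecting all errors instead of raising on the first one makes it possible to feed the full list back to the LLM in a single repair prompt. Bounding spheres will flag some scenes that are actually fine (a bottle standing inside a rack, say), so a real implementation would likely need a tighter collision test.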
