Storyboard-to-Video Agentic Pipeline
Give an LLM a script and let it plan scenes, generate each clip via API, and stitch them into a coherent short video autonomously.
Difficulty: weekend | Stack: Python, Claude API (claude-3-5-sonnet), Replicate API (Wan2.1 or LTX-Video), FFmpeg, MoviePy
Who this is for
Indie creators and game devs who want to prototype narrative video concepts without touching a timeline editor: describe a story, get a rough animatic back.
Build steps
- Build a scene-planning prompt: feed the LLM a script/brief and have it output a structured JSON array of scene objects (description, duration_s, camera_hint, mood).
- Write a generator loop that iterates over scenes, calls Replicate’s video model API with each scene description, and downloads the resulting mp4 clips.
- Add a continuity-injection step: before each generation call, prepend a ‘persistent world context’ string (characters, color palette, setting) derived by the LLM from the script to keep visual coherence.
- Stitch clips in order with MoviePy/FFmpeg, adding simple crossfades, and write the final output file.
- Wrap in a minimal CLI (argparse) so the user can pass a text file and get an mp4 back.
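The planning, continuity, and stitching steps above can be sketched without touching any API. The block below is a minimal sketch, not a definitive implementation: the names `Scene`, `parse_scene_plan`, `build_prompt`, and `crossfade_offsets` are illustrative assumptions, and the prompt layout (world context first, then the scene description, camera hint, and mood) is one plausible choice. The offset helper assumes each crossfade overlaps two adjacent clips by `fade_s` seconds, which is how a chain of FFmpeg `xfade` filters is typically timed.

```python
import json
from dataclasses import dataclass

# Fields the scene-planning prompt asks the LLM to return for every scene.
REQUIRED_KEYS = {"description", "duration_s", "camera_hint", "mood"}


@dataclass
class Scene:
    description: str
    duration_s: float
    camera_hint: str
    mood: str


def parse_scene_plan(raw_json: str) -> list[Scene]:
    """Validate the LLM's JSON reply and convert it to Scene objects."""
    data = json.loads(raw_json)
    if not isinstance(data, list) or not data:
        raise ValueError("scene plan must be a non-empty JSON array")
    scenes = []
    for i, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"scene {i} is not a JSON object")
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"scene {i} is missing {sorted(missing)}")
        duration = float(item["duration_s"])
        if duration <= 0:
            raise ValueError(f"scene {i} has a non-positive duration")
        scenes.append(
            Scene(
                description=str(item["description"]).strip(),
                duration_s=duration,
                camera_hint=str(item["camera_hint"]).strip(),
                mood=str(item["mood"]).strip(),
            )
        )
    return scenes


def build_prompt(world_context: str, scene: Scene) -> str:
    """Prepend the persistent world context to a single scene's prompt."""
    return (
        f"{world_context.strip()} {scene.description} "
        f"Camera: {scene.camera_hint}. Mood: {scene.mood}."
    )


def crossfade_offsets(durations: list[float], fade_s: float) -> list[float]:
    """Start time of each crossfade on the output timeline.

    Transition k begins at the summed length of the first k clips
    minus k * fade_s, because every earlier fade shortened the output.
    """
    offsets = []
    elapsed = 0.0
    for k, duration in enumerate(durations[:-1], start=1):
        elapsed += duration
        offsets.append(elapsed - k * fade_s)
    return offsets
```

For example, three clips of 4, 5, and 3 seconds with a 0.5-second fade give `crossfade_offsets([4.0, 5.0, 3.0], 0.5) == [3.5, 8.0]`. Validating the plan before any generation call keeps a malformed LLM reply from costing a paid video request.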
Risks
- Video generation APIs are slow (20-60 s per clip) and expensive: a 10-scene script can cost $5-15 and take 10+ minutes, so users may abandon a run before seeing results.
- Prompt-level continuity hints rarely survive into actual visual consistency; characters and lighting will drift noticeably between clips, making the output feel disconnected.
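- Rate limits or model availability changes on the hosting provider (Replicate here, or an alternative such as RunwayML) can break the pipeline mid-run, and without a recovery checkpoint the finished clips cannot be reused when the run is restarted.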
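One way to soften the mid-run failure risk is to record each finished clip in a small manifest and skip it on the next run. The block below is a minimal sketch under stated assumptions: `run_with_checkpoint`, the `manifest.json` file name, and the `generate(prompt, out_path)` callable are all hypothetical. The callable stands in for whatever code calls the video model and downloads the mp4, so no real API signature is implied.

```python
import json
import time
from pathlib import Path
from typing import Callable


def run_with_checkpoint(
    prompts: list[str],
    generate: Callable[[str, Path], None],  # hypothetical: writes one clip to out_path
    workdir: str,
    max_retries: int = 3,
    backoff_s: float = 2.0,
) -> list[Path]:
    """Generate one clip per prompt, resuming from a JSON manifest."""
    work = Path(workdir)
    work.mkdir(parents=True, exist_ok=True)
    manifest_path = work / "manifest.json"
    # Manifest maps scene index (as a string) to the finished clip path.
    done = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}

    for i, prompt in enumerate(prompts):
        key = str(i)
        if key in done and Path(done[key]).exists():
            continue  # clip finished in an earlier run; do not pay for it again
        out_path = work / f"scene_{i:03d}.mp4"
        for attempt in range(1, max_retries + 1):
            try:
                generate(prompt, out_path)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up; the manifest still lists earlier clips
                # Exponential backoff before retrying a transient failure.
                time.sleep(backoff_s * 2 ** (attempt - 1))
        done[key] = str(out_path)
        # Persist after every clip so a crash loses at most one scene.
        manifest_path.write_text(json.dumps(done, indent=2))

    return [Path(done[str(i)]) for i in range(len(prompts))]
```

The manifest is keyed by scene index only, so editing the script between runs would reuse stale clips. Clearing the work directory, or keying on a hash of the prompt, would avoid that.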