StepPO Visualizer: Agentic Credit Assignment Explorer

An interactive tool that runs a small LLM agent on multi-step tasks and visualizes how step-level vs token-level reward signals differ across a trajectory.

Difficulty: weekend | Stack: Python, LangGraph, Gradio, OpenAI API (or local Ollama), Matplotlib/Plotly

Who this is for

ML engineers and researchers who want an intuition for why token-level RL is a poor fit for agentic tasks. Seeing the difference in gradient signal on a real trajectory is more compelling than reading the math.

Build steps

Build a minimal tool-calling agent with LangGraph that solves short multi-step tasks (e.g., calculator + web-search mock), logging every step (tool chosen, input, output) as a discrete node.
Implement two reward-attribution modes: token-spread (reward smeared across all tokens in the trajectory) vs step-aligned (reward assigned only to the action token at each step boundary).
Generate a batch of trajectories (successful and failed) using the agent, storing step metadata and token-level log-probabilities from the model.
Build a Gradio UI that replays a selected trajectory and overlays a heat-map of reward attribution per token, toggling between the two attribution modes.
Add a simple comparison panel showing mean attribution variance and signal-to-noise ratio across modes for a set of trajectories, making the granularity mismatch concrete and measurable.
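The two attribution modes and the comparison metrics can be sketched in plain Python before any agent or UI exists. This is a minimal sketch under stated assumptions, not a reference implementation: the Step class, the choice of each step's last token as its "action token", and the boundary_mass_share metric (a simple stand-in for signal-to-noise, which is not defined above) are all hypothetical.

```python
from dataclasses import dataclass
from statistics import pvariance


@dataclass
class Step:
    n_tokens: int   # tokens the model emitted during this step (assumed >= 1)
    reward: float   # per-step reward (hypothetical partial-credit score)


def token_spread(steps):
    """Smear the trajectory's total reward uniformly over every token."""
    total_tokens = sum(s.n_tokens for s in steps)
    total_reward = sum(s.reward for s in steps)
    return [total_reward / total_tokens] * total_tokens


def step_aligned(steps):
    """Put each step's reward on that step's last token; other tokens get zero.

    Assumption: the last token of a step stands in for the action token.
    """
    weights = []
    for s in steps:
        weights.extend([0.0] * (s.n_tokens - 1) + [s.reward])
    return weights


def boundary_indices(steps):
    """Index of the last token of each step within the flat token list."""
    idx, pos = [], 0
    for s in steps:
        pos += s.n_tokens
        idx.append(pos - 1)
    return idx


def boundary_mass_share(weights, steps):
    """Share of absolute attribution that lands on step-boundary tokens.

    Hypothetical proxy for signal-to-noise; returns 0.0 if there is no mass.
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        return 0.0
    return sum(abs(weights[i]) for i in boundary_indices(steps)) / total


# Toy trajectory: three steps with 4, 3, and 5 tokens.
trajectory = [Step(4, 0.0), Step(3, 1.0), Step(5, 0.5)]
spread = token_spread(trajectory)
aligned = step_aligned(trajectory)
for name, w in (("token-spread", spread), ("step-aligned", aligned)):
    print(name, round(pvariance(w), 4), round(boundary_mass_share(w, trajectory), 2))
```

On this toy trajectory the token-spread weights are constant, so their variance is zero and only a quarter of the attribution lands on step boundaries. The step-aligned weights put all of it there. The flat weight lists are also what the heat-map overlay would color per token.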

Risks

Extracting per-token log-probabilities from hosted API models (e.g., OpenAI) is subject to rate limits and may not give you everything you need; you may have to switch to a local model such as Llama via Ollama to get full logprob access.
Task design is a balancing act: tasks must be long enough to show a meaningful difference between attribution modes but short enough to run dozens of trajectories within a weekend budget.
Reward function design is deceptively hard: a naive binary success/failure reward will make both modes look similar, so you need partial-credit rewards per step to surface the difference.
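The reward-design risk can be made concrete with a small sketch. Everything here is an assumption for illustration: the function names, the per-step check results, and the simplification that a trajectory succeeds only if every step passes. With a binary outcome, every step inherits the same value, so there is nothing for a step-aligned view to distinguish; with per-step partial credit, the rewards vary across steps.

```python
from statistics import pvariance


def binary_step_rewards(n_steps, success):
    """Every step inherits the same trajectory-level outcome (1.0 or 0.0)."""
    return [1.0 if success else 0.0] * n_steps


def partial_credit_step_rewards(step_ok):
    """Score each step on its own: 1.0 if its (hypothetical) check passed."""
    return [1.0 if ok else 0.0 for ok in step_ok]


# Hypothetical per-step check results for a trajectory whose second step failed.
step_ok = [True, False, True, True]

# Simplifying assumption: the trajectory succeeds only if every step passed.
binary = binary_step_rewards(len(step_ok), success=all(step_ok))
partial = partial_credit_step_rewards(step_ok)

print("binary :", binary, "variance across steps =", pvariance(binary))
print("partial:", partial, "variance across steps =", pvariance(partial))
```

The binary rewards have zero variance across steps, so the failed second step is indistinguishable from the three good ones. The partial-credit rewards single it out, which is the signal the step-aligned mode needs.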
