Latent-State Streaming Chat UI

Build a streaming chat interface that shows a ‘thinking indicator’ driven by real concurrent reasoning tokens, not a spinner hack

Difficulty: weekend | Stack: TypeScript, Next.js, Vercel AI SDK, Claude Sonnet 4.5 (claude-sonnet-4-5, extended thinking mode), Tailwind CSS

Who this is for

Developers demoing agent UX who want to visualize concurrent reasoning state rather than hiding latency behind fake animations

Scaffold a Next.js app with the Vercel AI SDK useChat hook targeting Claude Sonnet 4.5 (claude-sonnet-4-5) with extended thinking enabled (budget_tokens: 8000)
Stream thinking blocks and text blocks separately; render thinking block content in a collapsible side panel that updates in real time
Add a per-token timing overlay: show thinking tokens/sec vs. response tokens/sec as live sparklines
Implement a ‘reasoning compression’ toggle: when on, summarize the thinking block via a second, cheaper LLM call before displaying it
Deploy to Vercel; measure perceived latency (time-to-first-text-token) with and without the thinking block visible
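
The bookkeeping behind the steps above (separate thinking and text streams, per-channel rates, time-to-first-text-token) can be sketched as a small pure module. This is a minimal sketch, not a definitive implementation. It assumes the two delta shapes Anthropic's streaming API uses for extended thinking (thinking_delta and text_delta); every other name here (StreamStats, applyDelta, tokensPerSecond, timeToFirstTextMs) is hypothetical, and the four-characters-per-token estimate is a rough assumption standing in for real token counts.

```typescript
// Subset of streamed delta shapes: thinking and text arrive as distinct types.
type StreamDelta =
  | { type: "thinking_delta"; thinking: string }
  | { type: "text_delta"; text: string };

interface ChannelStats {
  content: string;          // accumulated text for the panel or chat bubble
  approxTokens: number;     // estimated, not exact, token count
  firstAtMs: number | null; // timestamp of the first delta on this channel
  lastAtMs: number | null;  // timestamp of the most recent delta
}

interface StreamStats {
  startedAtMs: number; // when the request was sent
  thinking: ChannelStats;
  text: ChannelStats;
}

const emptyChannel = (): ChannelStats => ({
  content: "",
  approxTokens: 0,
  firstAtMs: null,
  lastAtMs: null,
});

function createStats(startedAtMs: number): StreamStats {
  return { startedAtMs, thinking: emptyChannel(), text: emptyChannel() };
}

// Assumption: roughly 4 characters per token. Swap in real counts if available.
function approxTokenCount(chunk: string): number {
  return Math.ceil(chunk.length / 4);
}

// Route one delta to its channel and return updated stats (no mutation).
function applyDelta(stats: StreamStats, delta: StreamDelta, nowMs: number): StreamStats {
  const isThinking = delta.type === "thinking_delta";
  const chunk = delta.type === "thinking_delta" ? delta.thinking : delta.text;
  const prev = isThinking ? stats.thinking : stats.text;
  const next: ChannelStats = {
    content: prev.content + chunk,
    approxTokens: prev.approxTokens + approxTokenCount(chunk),
    firstAtMs: prev.firstAtMs ?? nowMs,
    lastAtMs: nowMs,
  };
  return isThinking ? { ...stats, thinking: next } : { ...stats, text: next };
}

// Average rate over the channel's active window; 0 until two deltas have arrived.
function tokensPerSecond(channel: ChannelStats): number {
  if (channel.firstAtMs === null || channel.lastAtMs === null) return 0;
  const elapsedMs = channel.lastAtMs - channel.firstAtMs;
  return elapsedMs <= 0 ? 0 : channel.approxTokens / (elapsedMs / 1000);
}

// Perceived latency metric: request start to first visible text delta.
function timeToFirstTextMs(stats: StreamStats): number | null {
  return stats.text.firstAtMs === null ? null : stats.text.firstAtMs - stats.startedAtMs;
}
```

Keeping this free of React and network code means the same functions can feed both the sparklines and the latency measurement; a component would call applyDelta with performance.now() as each delta arrives and sample tokensPerSecond on a timer.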

Claude extended thinking adds wall-clock latency before the first text token, so the UI must handle a 5-15s blank period gracefully
Thinking block content is not always coherent prose; raw display may confuse non-technical users
Vercel Hobby plan function timeout (10s) is too short for long thinking budgets; this needs a Pro plan or an edge streaming workaround
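
One way to handle the blank period in the first risk above is to model the UI as an explicit phase machine, so the indicator always reflects what the stream is actually doing. The sketch below is illustrative only: the phase names, event names, and label strings are hypothetical choices, not part of any SDK.

```typescript
type Phase = "idle" | "waiting" | "thinking" | "responding" | "done";
type UiEvent = "submit" | "thinking_delta" | "text_delta" | "finish";

// Pure transition function: given the current phase and a stream event,
// return the next phase. Late thinking deltas never move the UI backwards.
function nextPhase(phase: Phase, event: UiEvent): Phase {
  switch (event) {
    case "submit":
      return "waiting";
    case "thinking_delta":
      return phase === "waiting" || phase === "thinking" ? "thinking" : phase;
    case "text_delta":
      return phase === "idle" || phase === "done" ? phase : "responding";
    case "finish":
      return phase === "idle" ? "idle" : "done";
  }
}

// Text for the indicator. Once response text is streaming, the indicator
// is hidden (empty string) because the text itself shows progress.
function indicatorLabel(phase: Phase, elapsedMs: number): string {
  switch (phase) {
    case "waiting":
      return "Waiting for the model...";
    case "thinking":
      return `Thinking (${Math.floor(elapsedMs / 1000)}s)`;
    default:
      return "";
  }
}
```

Separating "waiting" (nothing received yet) from "thinking" (reasoning deltas arriving) lets the UI show an honest state during the 5-15s gap: a neutral placeholder until the first delta, then an elapsed-time label driven by real stream activity.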