CoT Graph Compressor
A Streamlit app that converts a model’s chain-of-thought trace into a Mermaid reasoning graph, lets you prune redundant nodes, and re-injects the compressed graph as a structured prompt prefix.
Difficulty: 1 week | Stack: Python, Streamlit, Anthropic SDK, Mermaid.js (via streamlit-mermaid), NetworkX, spaCy, sentence-transformers
Who this is for
Prompt engineers and researchers who want to explore the Render-of-Thought hypothesis practically — making CoT reasoning inspectable and shorter without sacrificing accuracy.
Build steps
- Build a Streamlit UI that sends a user question to Claude with an explicit chain-of-thought instruction and streams back the reasoning trace.
- Use spaCy’s dependency parser to extract (subject, predicate, object) triples from each CoT sentence and build a directed NetworkX reasoning graph.
- Render the graph as a Mermaid diagram embedded in Streamlit; highlight nodes by estimated semantic redundancy (cosine similarity between adjacent node embeddings using sentence-transformers).
- Add an interactive node-pruning panel where the user can collapse or remove redundant steps, then serialize the pruned graph back to a compact bullet-point summary.
- Re-inject the compact summary as a structured prefix for a second LLM call and display accuracy delta and token savings side-by-side.
- Run 20 benchmark questions from GSM8K and record, for each one, the original CoT token count, the compressed-prefix token count, and whether the final answer stays correct.
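The flag, render, and prune steps above can be sketched without any third-party packages. This is a minimal illustration under stated assumptions, not the app's implementation: the reasoning trace is treated as a linear list of step strings instead of a spaCy-derived NetworkX graph, a bag-of-words cosine stands in for sentence-transformers embeddings, and the function names and the 0.7 threshold are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (a stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def redundancy_flags(steps, threshold=0.7):
    """Flag step i as redundant when it is too similar to step i-1."""
    vecs = [bow(s) for s in steps]
    return [
        i > 0 and cosine(vecs[i - 1], vecs[i]) >= threshold
        for i in range(len(vecs))
    ]


def to_mermaid(steps, flags):
    """Render the steps as a top-down Mermaid flowchart, styling flagged nodes."""
    lines = ["flowchart TD"]
    for i, step in enumerate(steps):
        label = step.replace('"', "'")  # double quotes would end the label early
        lines.append(f'    n{i}["{label}"]')
    for i in range(len(steps) - 1):
        lines.append(f"    n{i} --> n{i + 1}")
    redundant = [f"n{i}" for i, flag in enumerate(flags) if flag]
    if redundant:
        lines.append("    classDef redundant fill:#fdd,stroke:#c00")
        lines.append(f"    class {','.join(redundant)} redundant")
    return "\n".join(lines)


def prune_to_bullets(steps, flags):
    """Drop flagged steps and serialize the rest as a compact bullet list."""
    return "\n".join(f"- {s}" for s, flag in zip(steps, flags) if not flag)


steps = [
    "Tom has 3 boxes with 4 apples each.",
    "So Tom has 3 boxes and each box has 4 apples.",
    "3 times 4 is 12 apples in total.",
]
flags = redundancy_flags(steps)
print(to_mermaid(steps, flags))
print(prune_to_bullets(steps, flags))
```

In the app itself, the flags would only pre-select candidates in the pruning panel; the user decides what is actually removed. A bag-of-words cosine misses paraphrases that share few words, which is the gap real sentence embeddings are meant to close.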
Risks
- Triple extraction from natural language CoT is noisy — spaCy often misparses conditional or hypothetical sentences, producing a garbled graph.
- The re-injection format (bullets vs. JSON vs. pseudo-logic) significantly affects downstream accuracy and requires empirical tuning rather than a principled answer.
- Mermaid graphs become unreadable beyond ~15 nodes, so longer reasoning chains (math proofs, multi-step code) will exceed the visualization limit quickly.
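Because the re-injection format has to be tuned empirically, it may help to generate every candidate prefix from the same pruned steps so the formats can be compared on identical content. The sketch below is one possible setup, not a recommended format: the formatter names, the `reasoning_steps` JSON key, the `=>` chaining used for "pseudo-logic", and the prefix wording are all hypothetical choices.

```python
import json


def as_bullets(steps):
    """One dash-prefixed line per step."""
    return "\n".join(f"- {s}" for s in steps)


def as_json(steps):
    """A JSON object holding the ordered steps (key name is an assumption)."""
    return json.dumps({"reasoning_steps": steps}, ensure_ascii=False)


def as_pseudo_logic(steps):
    """Steps chained with '=>' as a rough stand-in for a pseudo-logic format."""
    return " => ".join(steps)


FORMATTERS = {
    "bullets": as_bullets,
    "json": as_json,
    "pseudo_logic": as_pseudo_logic,
}


def build_prefixes(steps, question):
    """Return one candidate prompt per format, all built from the same steps."""
    return {
        name: f"Reasoning so far:\n{fmt(steps)}\n\nQuestion: {question}"
        for name, fmt in FORMATTERS.items()
    }


pruned = ["3 boxes hold 4 apples each", "3 * 4 = 12"]
for name, prompt in build_prefixes(pruned, "How many apples?").items():
    print(f"--- {name} ---\n{prompt}\n")
```

Each candidate prompt can then go through the same second-call benchmark, so accuracy differences are attributable to the format and not to the content.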
Business angle
Sell CoT Graph Compressor as a token-cost reduction tool for AI teams burning money on long reasoning chains.
Customer: Solo AI engineers or CTOs of small startups (teams of 1-5) who use Claude or GPT-4o with extended thinking or chain-of-thought prompting and want to cut token bills of $500–$2,000/month.
Pricing: one-time license at $49, sold via Gumroad or Lemon Squeezy. Target: about $800 in sales within 3 months (roughly 16 licenses).