Session Memory Consolidation Service
A background service that compresses and consolidates agent conversation history into a structured memory store, then injects it as a compact context prefix at the start of the next session.
Difficulty: 1-week | Stack: Python, FastAPI, SQLite + sqlite-vec, Claude API (Haiku for compression), APScheduler
Who this is for
Developers building multi-session agents (coding assistants, research agents) where re-deriving prior context burns tokens and causes drift.
Build steps
- Build a session logger: any agent posts raw turns to POST /sessions/{id}/turns; stored in SQLite with timestamps.
- Write a consolidation job (APScheduler, runs during idle periods): calls Haiku with a compression prompt that extracts decisions, facts, and open tasks into a structured JSON memory object.
- Store versioned memory snapshots; on GET /sessions/{id}/context, return a compact Markdown summary (<500 tokens) ready to prepend to a new session's system prompt.
- Add a semantic search endpoint: POST /memory/search with a query returns top-k relevant memory chunks via sqlite-vec embeddings.
- Ship a Python SDK wrapper (3 functions: log_turn, get_context, search_memory) that any agent framework can call in 5 lines.
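The storage side of the steps above can be sketched with the standard-library sqlite3 module alone. This is a minimal sketch under stated assumptions, not a definitive design: the table names, column names, the open_store and save_snapshot helpers, and the memory keys decisions, facts, and open_tasks are all hypothetical choices for illustration. The FastAPI routes, the Haiku call, and the sqlite-vec search are left out, and the <500 token budget is not enforced here.

```python
import json
import sqlite3
import time

# Hypothetical schema: raw turns plus versioned memory snapshots.
SCHEMA = """
CREATE TABLE IF NOT EXISTS turns (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS snapshots (
    session_id TEXT NOT NULL,
    version INTEGER NOT NULL,
    memory_json TEXT NOT NULL,
    created_at REAL NOT NULL,
    PRIMARY KEY (session_id, version)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the SQLite store and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_turn(conn, session_id, role, content):
    """Append one raw turn, as POST /sessions/{id}/turns would."""
    with conn:
        conn.execute(
            "INSERT INTO turns (session_id, role, content, created_at) "
            "VALUES (?, ?, ?, ?)",
            (session_id, role, content, time.time()),
        )


def save_snapshot(conn, session_id, memory):
    """Store the memory object as a new version; older versions are kept."""
    with conn:
        row = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM snapshots WHERE session_id = ?",
            (session_id,),
        ).fetchone()
        version = row[0] + 1
        conn.execute(
            "INSERT INTO snapshots (session_id, version, memory_json, created_at) "
            "VALUES (?, ?, ?, ?)",
            (session_id, version, json.dumps(memory), time.time()),
        )
    return version


def get_context(conn, session_id):
    """Render the latest snapshot as Markdown, as GET /sessions/{id}/context would."""
    row = conn.execute(
        "SELECT memory_json FROM snapshots WHERE session_id = ? "
        "ORDER BY version DESC LIMIT 1",
        (session_id,),
    ).fetchone()
    if row is None:
        return ""
    memory = json.loads(row[0])
    sections = (("Decisions", "decisions"), ("Facts", "facts"), ("Open tasks", "open_tasks"))
    lines = []
    for heading, key in sections:
        items = memory.get(key, [])
        if items:  # skip empty sections to keep the prefix compact
            lines.append(f"## {heading}")
            lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```

In this sketch the consolidation job would call save_snapshot with the JSON object Haiku returns, and an agent would prepend the string from get_context to its system prompt. Keeping every version makes it possible to diff or roll back a bad compression.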
Risks
- Haiku compression may hallucinate or drop critical details. Mitigation: a recall eval that asks questions about the original turns and checks that answers derived from memory match.
- Memory grows without bound across many sessions. Mitigation: a tiered eviction policy (recent sessions kept in full, older ones compressed, oldest kept as summaries only).
- APScheduler job timing can conflict with active sessions. Mitigation: a session-active lock so consolidation never runs mid-conversation.
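Two of the mitigations above, the session-active lock and the tiered eviction policy, can be sketched in a few lines. This is an illustrative in-process sketch only: SessionGate, tier_for, the 15-minute idle threshold, and the 7-day and 30-day tier boundaries are hypothetical names and values, and a multi-process deployment would likely need the lock state in the database instead of in memory.

```python
import threading
import time


class SessionGate:
    """Tracks last activity per session and guards consolidation with a per-session lock."""

    def __init__(self, idle_seconds=900.0):  # assumed idle threshold: 15 minutes
        self.idle_seconds = idle_seconds
        self._last_turn = {}
        self._locks = {}
        self._guard = threading.Lock()

    def touch(self, session_id, now=None):
        """Record activity; call this whenever a turn is logged."""
        with self._guard:
            self._last_turn[session_id] = time.time() if now is None else now

    def try_begin(self, session_id, now=None):
        """Return True (and take the lock) only if the session is idle
        and no other consolidation job currently holds it."""
        now = time.time() if now is None else now
        with self._guard:
            last = self._last_turn.get(session_id)
            if last is not None and now - last < self.idle_seconds:
                return False  # session is still active
            lock = self._locks.setdefault(session_id, threading.Lock())
        return lock.acquire(blocking=False)

    def end(self, session_id):
        """Release the lock taken by a successful try_begin."""
        self._locks[session_id].release()


def tier_for(age_days, recent_days=7, compressed_days=30):
    """Map a session's age to a retention tier (thresholds are illustrative)."""
    if age_days <= recent_days:
        return "full"
    if age_days <= compressed_days:
        return "compressed"
    return "summary_only"
```

The scheduled job would call try_begin before compressing a session and end afterwards. Note that this gate does not block a turn that arrives while consolidation is already running; one option is to record the last turn id each snapshot covers so that later turns are picked up on the next run.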
Business Angle
Plug-in memory layer that compresses multi-session agent history into token-efficient context prefixes, billed per compression job.
Customer: Solo dev building a coding assistant or research agent with Claude/GPT who ships to 10–200 end users and keeps hitting context limits or paying to re-derive prior session state.
Pricing: saas-mrr — $400 MRR in 3 months (8 devs × $50/mo tier, ~500k compressions/mo each)