AI Pulse
← Projects · 1-week

Session Memory Consolidation Service

A background service that compresses and consolidates agent conversation history into a structured memory store, injected as a compact context prefix on next session.

Difficulty: 1-week | Stack: Python, FastAPI, SQLite + sqlite-vec, Claude API (haiku for compression), APScheduler

Who this is for

Developers building multi-session agents (coding assistants, research agents) where re-deriving prior context burns tokens and causes drift.

Build steps

  1. Build a session logger: any agent posts raw turns to POST /sessions/{id}/turns; stored in SQLite with timestamps.
  2. Write a consolidation job (APScheduler, runs idle periods): calls Haiku with a compression prompt that extracts decisions, facts, and open tasks into a structured JSON memory object.
  3. Store versioned memory snapshots; on GET /sessions/{id}/context return a compact Markdown summary (<500 tokens) ready to prepend to new session system prompt.
  4. Add a semantic search endpoint: POST /memory/search with a query returns top-k relevant memory chunks via sqlite-vec embeddings.
  5. Ship a Python SDK wrapper (3 functions: log_turn, get_context, search_memory) that any agent framework can call in 5 lines.

Risks

  • Haiku compression hallucinates or drops critical details — need a recall eval: ask questions about original turns and check memory answers match.
  • Memory grows unboundedly across many sessions — need a tiered eviction policy (recent full, older compressed, oldest summarized only).
  • APScheduler job timing conflicts with active sessions — need a session-active lock so consolidation never runs mid-conversation.

Business Angle

Plug-in memory layer that compresses multi-session agent history into token-efficient context prefixes, billed per compression job.

Customer: Solo dev building a coding assistant or research agent with Claude/GPT who ships to 10–200 end users and keeps hitting context limits or paying for repeated re-derivation of prior session state

Pricing: saas-mrr — $400 MRR in 3 months (8 devs × $50/mo tier, ~500k compressions/mo each)

Full business breakdown →