Agent Session Archivist
A CLI tool that captures, tags, and links AI coding-session transcripts to the git commits they produced.
Difficulty: weekend | Stack: Python, Click, SQLite, GitPython, Anthropic Claude API
Who this is for
Solo developers and small teams who use Claude Code or Cursor and want to preserve the ‘why’ behind AI-assisted commits — especially useful when returning to a codebase months later.
Build steps
- Build a CLI command (archive-session) that reads a transcript file (JSON or markdown export) and stores it in a local SQLite database alongside metadata: timestamp, summary, and file paths touched.
- Hook into git via GitPython to automatically associate the stored transcript with the most recent commit hash at archive time, writing a back-reference into the commit’s notes (git notes).
- Add a search command that does full-text SQLite FTS5 search over stored transcripts so developers can query ‘why was X changed’ and get ranked transcript excerpts back.
- Add a show command that, given a commit SHA, retrieves and pretty-prints the linked transcript summary and a link to the full stored record.
- Write a thin git post-commit hook installer (archive-session install-hook) so archiving becomes automatic on every commit.
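The storage and search steps above can be sketched with nothing but the standard-library sqlite3 module. This is a minimal sketch, not a definitive design: the table names (sessions, sessions_fts), the column layout, and the function names open_db, archive_session, and search are all hypothetical, and it assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id          INTEGER PRIMARY KEY,
    commit_sha  TEXT,              -- filled in once the commit is known
    archived_at TEXT NOT NULL,     -- ISO 8601 timestamp (UTC)
    summary     TEXT NOT NULL,
    files       TEXT NOT NULL,     -- newline-separated paths touched
    transcript  TEXT NOT NULL
);
-- Standalone FTS5 index; its rowid mirrors sessions.id.
CREATE VIRTUAL TABLE IF NOT EXISTS sessions_fts
    USING fts5(summary, transcript);
"""


def open_db(path=":memory:"):
    """Open (or create) the archive database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def archive_session(conn, transcript, summary, files, commit_sha=None):
    """Store one transcript with its metadata and index it for search."""
    archived_at = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO sessions (commit_sha, archived_at, summary, files, transcript) "
        "VALUES (?, ?, ?, ?, ?)",
        (commit_sha, archived_at, summary, "\n".join(files), transcript),
    )
    session_id = cur.lastrowid
    conn.execute(
        "INSERT INTO sessions_fts (rowid, summary, transcript) VALUES (?, ?, ?)",
        (session_id, summary, transcript),
    )
    conn.commit()
    return session_id


def search(conn, query, limit=5):
    """Return (id, commit_sha, summary, excerpt) rows, best match first."""
    rows = conn.execute(
        """
        SELECT s.id, s.commit_sha, s.summary,
               snippet(sessions_fts, 1, '[', ']', '...', 12) AS excerpt
        FROM sessions_fts
        JOIN sessions AS s ON s.id = sessions_fts.rowid
        WHERE sessions_fts MATCH ?
        ORDER BY bm25(sessions_fts)   -- lower bm25 score means a better match
        LIMIT ?
        """,
        (query, limit),
    )
    return rows.fetchall()
```

A search command built on this would pass the user's words to search() and print the excerpt column, in which matched terms are wrapped in square brackets. Raw user input can contain characters that FTS5 treats as query syntax, so a real command would likely need to quote or sanitize the query first.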
Risks
- Transcript formats differ across tools (Claude Code exports JSON, Cursor exports markdown) — parsing two or three formats adds more edge-case work than expected.
- Sensitive data (API keys, passwords pasted into the session) ends up in the archive; no built-in scrubbing means the SQLite file becomes a security liability.
- SQLite FTS5 ranks results by keyword match alone, so long, verbose transcripts can surface unhelpful excerpts unless they are split into smaller chunks or the search is supplemented with embeddings.
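One way to reduce the sensitive-data risk is to redact obvious secrets before a transcript is written to the database. The sketch below is only an illustration under stated assumptions: the RULES list and the scrub function are hypothetical names, and the three patterns (tokens with an "sk-" prefix, strings shaped like AWS access key IDs, and password-style assignments) are examples, not an exhaustive detector. Secrets in other shapes will pass through untouched.

```python
import re

# Illustrative rules only (assumptions); a real scrubber would need a broader,
# regularly reviewed pattern set.
RULES = [
    # Long tokens starting with "sk-", a prefix some API keys use.
    (re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "[REDACTED]"),
    # Strings shaped like AWS access key IDs: "AKIA" plus 16 characters.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED]"),
    # key = value or key: value pairs for common secret-like key names;
    # the key and separator are kept, only the value is replaced.
    (
        re.compile(r"(?i)\b(password|passwd|secret|token)(\s*[:=]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
]


def scrub(text):
    """Return text with every match of the rules above replaced."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Calling scrub() on the transcript and summary just before they are stored would keep matched secrets out of both the main table and the search index.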