pastclaude turns every Claude Code session on your machine into a verbatim, searchable, fact-checked memory — Postgres + pgvector, local models only, injected back into new sessions automatically. Nothing leaves your machine. Nothing is summarized away. And when a past session was wrong, the recall says so.
<pastclaude-recall hits="2" scope="cwd">
  <session id="9c41b7a2" date="2026-03-14" sim="0.81" topics="payments-api,webhooks,debugging">
    summary: Traced duplicate webhook deliveries to a missing idempotency key on the consumer side; added a processed-events table and replay guard.
  </session>
  <session id="d20effc1" date="2026-05-02" sim="0.50" source="lex" topics="payments-api,migrations">
    summary: Schema migration for processed_events; exact match on ticket PAY-218, surfaced by lexical search.
  </session>
</pastclaude-recall>
Every memory framework argues about storage, injection, and recall. pastclaude does those — and adds the one nobody talks about.
Raw session JSONL goes into Postgres untouched, on a schedule. Curation happens after capture: a local LLM scores every session 0–10 and tombstones the noise. Because the record is verbatim, every downstream decision is reversible — re-evaluate with a better model, re-embed with a better embedder, re-audit old claims. A summary written at capture time is an opinion you can never appeal.
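A minimal sketch of curation after capture, assuming a hypothetical `Session` record and a made-up cutoff (the 0–10 scale is from the text above; the threshold and every name here are illustrative, not pastclaude's actual schema). Tombstoning only sets a flag, so a later re-evaluation can undo it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff: sessions are scored 0-10, but the real threshold is not stated.
TOMBSTONE_BELOW = 3


@dataclass
class Session:
    session_id: str
    raw_jsonl: str               # verbatim capture, never rewritten
    score: Optional[int] = None  # filled in by the eval stage
    tombstoned: bool = False


def curate(session: Session, score: int) -> Session:
    """Record an eval score and flag noise without touching the raw record."""
    if not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    session.score = score
    session.tombstoned = score < TOMBSTONE_BELOW
    return session
```

Calling `curate` again with a score from a better model flips the flag either way; `raw_jsonl` is never modified.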
A UserPromptSubmit hook runs recall on each prompt and injects up to three relevant past sessions: capped, sanitized against stored prompt injection, and scoped to the project you're in. Session start gets a separate curated brief. No unbounded context dumps.
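One way to picture the cap, the scoping, and the sanitizing, as a sketch only: the `Hit` type, the character cap, and HTML-escaping as the sanitizer are all assumptions, not pastclaude's actual code.

```python
from dataclasses import dataclass
from html import escape

MAX_HITS = 3             # "up to three" per the text above
MAX_SUMMARY_CHARS = 400  # hypothetical per-session cap


@dataclass(frozen=True)
class Hit:
    session_id: str
    date: str
    sim: float
    summary: str
    cwd: str  # project directory the session ran in


def render_recall(hits: list[Hit], cwd: str) -> str:
    """Keep hits from the current project, cap them, and escape their text."""
    scoped = [h for h in hits if h.cwd == cwd]
    top = sorted(scoped, key=lambda h: h.sim, reverse=True)[:MAX_HITS]
    lines = [f'<pastclaude-recall hits="{len(top)}" scope="cwd">']
    for h in top:
        # Escaping stops stored text from closing the wrapper or opening new tags.
        summary = escape(h.summary[:MAX_SUMMARY_CHARS])
        lines.append(
            f'<session id="{escape(h.session_id)}" date="{escape(h.date)}" sim="{h.sim:.2f}">'
        )
        lines.append(f"summary: {summary}")
        lines.append("</session>")
    lines.append("</pastclaude-recall>")
    return "\n".join(lines)
```

A stored summary containing `</pastclaude-recall>` comes out as `&lt;/pastclaude-recall&gt;`, so it cannot end the block early.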
Per-turn embeddings in pgvector catch meaning; Postgres full-text catches the things embeddings systematically miss — ticket IDs, commit hashes, file paths, acronyms. The two arms are fused with reciprocal rank fusion, so an exact-identifier match can outrank a vague semantic one.
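Reciprocal rank fusion itself is small enough to sketch. Each arm contributes `1 / (k + rank)` for every session it returns, and the contributions are summed. `k = 60` is the conventional default; whether pastclaude uses that value is an assumption.

```python
from collections import defaultdict


def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


semantic = ["a", "b", "c"]  # vector-search order (illustrative IDs)
lexical = ["c", "d"]        # full-text order; "c" matched an exact identifier
fused = rrf([semantic, lexical])
```

Here `c` ranks first in `fused`: it is only third in the semantic arm, but the top lexical hit adds enough to lift it past `a`.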
An audit stage extracts testable claims from past sessions — paths, file contents, infrastructure facts — and checks them against the actual filesystem. Claims that fail are stored as corrections and prepended to every future recall of that session. Citing a source is not the same as the source being true.
"A summarizer that decides what survives is an opinion about what matters. pastclaude keeps the record and forms opinions later."
Runs from launchd a few times a day. Every stage is isolated — one failure is logged and skipped, the rest still run.
Side outputs: a daily brief (commits · kanban · Slack) and a weekly recall-quality report.
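The stage isolation described above can be sketched as a loop that logs a failure and moves on. The runner and its signature are hypothetical; only the stage names used in the example come from this page.

```python
import logging
from typing import Callable

log = logging.getLogger("pipeline")

Stage = tuple[str, Callable[[], None]]


def run_stages(stages: list[Stage]) -> list[str]:
    """Run every stage in order; log and skip failures. Returns failed stage names."""
    failed: list[str] = []
    for name, stage in stages:
        try:
            stage()
        except Exception:
            # One broken stage must not stop the ones after it.
            log.exception("stage %s failed; continuing", name)
            failed.append(name)
    return failed
```

If `embed` raises because the embedding server is down, `ingest` has already run and `audit` still runs; the failure shows up in the log and in the returned list.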
Live figures from the original install — one developer's machine, every Claude Code session since the beginning, unattended since May 2026.
* eval runs on a local Qwen via an OpenAI-compatible server; embeddings via Ollama. The pipeline has never made a paid API call.
Every memory system has a failure mode it won't mention: the memory exists, is retrieved perfectly, is cited beautifully — and is wrong. Models state file paths that don't exist, describe functions that were never written, remember infrastructure that was never deployed. Store that, and your memory system becomes a fabrication-delivery service with great UX.
pastclaude's audit stage re-opens past sessions, extracts the claims that can be tested — "X is at path Y", "config contains Z" — and tests them. Failures go into a corrections ledger. When a session with failed claims is recalled later, the corrections are printed above the session content, so the model reads "this was wrong" before it reads the thing that was wrong.
It catches its own author, regularly. That's the point.
<pastclaude-corrections count="1">
  <!-- Audited claims that DID NOT verify. Trust these over the session itself. -->
  <correction session="9c41b7a2">
    claim: retry limit is read from RETRY_LIMIT in config.py
    verdict: not found; no RETRY_LIMIT anywhere in config.py
    inspected: src/payments-api/config.py
  </correction>
</pastclaude-corrections>
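A check like the one behind that correction might look roughly like this. It is a sketch under assumptions: the `Claim` and `Correction` types and both function names are hypothetical, and real claim extraction (done by a model) is out of scope here.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class Claim:
    session_id: str
    path: str                     # file the session said exists
    needle: Optional[str] = None  # text the session said the file contains


@dataclass(frozen=True)
class Correction:
    claim: Claim
    verdict: str


def audit(claim: Claim, root: Path) -> Optional[Correction]:
    """Return a Correction if the claim fails against the filesystem, else None."""
    target = root / claim.path
    if not target.is_file():
        return Correction(claim, f"not found: {claim.path} does not exist")
    if claim.needle is not None:
        text = target.read_text(encoding="utf-8", errors="replace")
        if claim.needle not in text:
            return Correction(
                claim, f"not found: no {claim.needle} anywhere in {claim.path}"
            )
    return None


def recall_text(session_text: str, corrections: list[Correction]) -> str:
    """Put failed claims ahead of the session content so they are read first."""
    notes = [f"correction: {c.verdict}" for c in corrections]
    return "\n".join([*notes, session_text])
```

The ordering in `recall_text` is the important part: the correction lines come before the session text they contradict.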
Your transcripts contain client names, schemas, credentials-adjacent detail. pastclaude's entire pipeline — eval LLM, embeddings, search — runs on your hardware. There is no cloud tier, no telemetry, no third party to trust.
Postgres, pgvector, Ollama, ~4,600 lines of typed Python you can read in an afternoon. No framework lock-in: the engine is harness-agnostic, and Claude Code is the first adapter, wired in through two hooks and a launchd job.
Per-turn capture services bill you on every message. A batch pipeline on local models means total memory of everything you've ever done, at the price of some idle CPU three times a day.
Requirements: Python 3.10+, Postgres with pgvector, Ollama for embeddings, and any OpenAI-compatible local server for the eval model (oMLX, llama.cpp, LM Studio, Ollama).
Wire the recall hook into ~/.claude/settings.json and every new prompt arrives pre-briefed with your own history.
# index everything you've ever done with Claude Code
$ git clone https://github.com/siTOTis-ai/pastclaude
$ cd pastclaude
$ createdb pastclaude && psql pastclaude < schema.sql
$ python -m venv .venv && .venv/bin/pip install -e .
$ ollama pull mxbai-embed-large

# first full run: ingest → eval → embed → wiki → audit
$ .venv/bin/pastclaude all
$ .venv/bin/pastclaude status

# ask your own history a question
$ .venv/bin/pastclaude search "how did we fix the webhook replay bug"
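Wiring the recall hook into ~/.claude/settings.json could be scripted along these lines. Treat it as a sketch: the hook command path is a placeholder (this page does not name the entry point), and the settings layout follows Claude Code's documented `hooks` structure as best understood here, so check both against your install before running `install()`.

```python
import json
from pathlib import Path
from typing import Optional

# Placeholder: substitute the recall entry point your install actually provides.
RECALL_COMMAND = "/path/to/your/recall-hook"


def add_recall_hook(settings: dict, command: str = RECALL_COMMAND) -> dict:
    """Return a copy of settings with a UserPromptSubmit command hook added once."""
    merged = json.loads(json.dumps(settings))  # deep copy of plain JSON data
    groups = merged.setdefault("hooks", {}).setdefault("UserPromptSubmit", [])
    already = any(
        hook.get("command") == command
        for group in groups
        for hook in group.get("hooks", [])
    )
    if not already:
        groups.append({"hooks": [{"type": "command", "command": command}]})
    return merged


def install(path: Optional[Path] = None) -> None:
    """Merge the hook into the settings file, creating the file if it is missing."""
    path = path or Path.home() / ".claude" / "settings.json"
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_recall_hook(settings), indent=2) + "\n", encoding="utf-8")
```

`add_recall_hook` is idempotent and leaves existing keys untouched, so running it twice does not register the hook twice.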