grab the key — generated & reconstructed by AI

A records office for your AI

Claude forgets.
The archive doesn't.

pastclaude turns every Claude Code session on your machine into a verbatim, searchable, fact-checked memory — Postgres + pgvector, local models only, injected back into new sessions automatically. Nothing leaves your machine. Nothing is summarized away. And when a past session was wrong, the recall says so.

View the source ↗ How it works ↓

1,815 sessions 164k turns verbatim $0 API spend MIT licensed

Exhibit A — what Claude sees on your next prompt

<pastclaude-recall hits="2" scope="cwd">
  <session id="9c41b7a2" date="2026-03-14" sim="0.81" topics="payments-api,webhooks,debugging">
    summary: Traced duplicate webhook deliveries to a missing idempotency
    key on the consumer side; added a processed-events table and replay guard.
  </session>
  <session id="d20effc1" date="2026-05-02" sim="0.50" source="lex" topics="payments-api,migrations">
    summary: Schema migration for processed_events — exact match on
    ticket PAY-218, surfaced by lexical search.
  </session>
</pastclaude-recall>

Chapter 01

Memory has four jobs.
Most systems do three.

Every memory framework argues about storage, injection, and recall. pastclaude does those — and adds the one nobody talks about.

01 / STORAGE

Verbatim first. Opinions later.

Raw session JSONL goes into Postgres untouched, on a schedule. Curation happens after capture: a local LLM scores every session 0–10 and tombstones the noise. Because the record is verbatim, every downstream decision is reversible — re-evaluate with a better model, re-embed with a better embedder, re-audit old claims. A summary written at capture time is an opinion you can never appeal.

02 / INJECTION

Right context, every prompt.

A UserPromptSubmit hook runs recall on each prompt and injects up to three relevant past sessions — capped, sanitized against stored prompt-injection, scoped to the project you're in. Session start gets a separate curated brief. No unbounded context dumps.

03 / RECALL

By meaning and by name.

Per-turn embeddings in pgvector catch meaning; Postgres full-text catches the things embeddings systematically miss — ticket IDs, commit hashes, file paths, acronyms. The two arms are fused with reciprocal rank fusion, so an exact-identifier match can outrank a vague semantic one.

The job everyone skips

04 / VERIFICATION

Memories get audited.

An audit stage extracts testable claims from past sessions — paths, file contents, infrastructure facts — and checks them against the actual filesystem. Claims that fail are stored as corrections and prepended to every future recall of that session. Citing a source is not the same as the source being true.

Four antique card-catalog drawers, one sealed with a glowing red wax seal — storage · injection · recall · verification

"A summarizer that decides what survives is an opinion about what matters. pastclaude keeps the record and forms opinions later."

— design principle № 1

Chapter 02

One unattended pipeline,
nine stages.

Runs from launchd a few times a day. Every stage is isolated — one failure is logged and skipped, the rest still run.

Brass pneumatic tubes carrying glowing document capsules through a dark archive — ingest → eval → embed → wiki → audit

ingestClaude Code JSONL transcripts → Postgres, idempotent on mtime + size

notes-ingestyour hand-written Obsidian notes join the same index

evallocal LLM scores each session; keep / tombstone verdict + summary + topics

embedper-turn embeddings via Ollama into pgvector (1024-dim)

wikian Obsidian source note per kept session

compileconcept pages aggregated across sessions and notes

notes-linkdeterministic [[wikilink]] insertion — no LLM, no surprises

lintorphans, stubs, stale concepts, dangling links

auditclaim extraction + filesystem verification → corrections ledger

side outputs: a daily brief (commits · kanban · Slack) and a weekly recall-quality report

Chapter 03

Not a whiteboard.
A production system.

Live figures from the original install — one developer's machine, every Claude Code session since the beginning, unattended since May 2026.

1,815

sessions indexed

164,881

turns stored verbatim

100,966

turn embeddings

261

noise sessions tombstoned

114

tests, no live deps needed

$0^*

API spend, ever

* eval runs on a local Qwen via an OpenAI-compatible server; embeddings via Ollama. The pipeline has never made a paid API call.

Chapter 04

The part that makes it
trustworthy.

Every memory system has a failure mode it won't mention: the memory exists, is retrieved perfectly, is cited beautifully — and is wrong. Models state file paths that don't exist, describe functions that were never written, remember infrastructure that was never deployed. Store that, and your memory system becomes a fabrication-delivery service with great UX.

pastclaude's audit stage re-opens past sessions, extracts the claims that can be tested — "X is at path Y", "config contains Z" — and tests them. Failures go into a corrections ledger. When a session with failed claims is recalled later, the corrections are printed above the session content, so the model reads "this was wrong" before it reads the thing that was wrong.

It catches its own author, regularly. That's the point.

A brass magnifying loupe and red stamp on documents under desk lamplight — the audit desk

Did not verify

Exhibit B — corrections, prepended at recall

<pastclaude-corrections count="1">
  <!-- Audited claims that DID NOT verify.
       Trust these over the session itself. -->
  <correction session="9c41b7a2">
    claim: retry limit is read from RETRY_LIMIT
           in config.py
    verdict: not found — no RETRY_LIMIT anywhere
           in config.py
    inspected: src/payments-api/config.py
  </correction>
</pastclaude-corrections>

Chapter 05

Local-first isn't a preference.
It's the point.

A small home server with a glowing amber keyhole on a desk at night — your data, at home

Client work stays home.

Your transcripts contain client names, schemas, credentials-adjacent detail. pastclaude's entire pipeline — eval LLM, embeddings, search — runs on your hardware. There is no cloud tier, no telemetry, no third party to trust.

Boring, inspectable parts.

Postgres, pgvector, Ollama, ~4,600 lines of typed Python you can read in an afternoon. No framework lock-in: the engine is harness-agnostic — Claude Code is the first adapter, wired in through two hooks and a launchd job.

Costs nothing to remember.

Per-turn capture services bill you on every message. A batch pipeline on local models means total memory of everything you've ever done, at the price of some idle CPU three times a day.

Chapter 06 — Climax

Run it tonight.

Requirements: Python 3.10+, Postgres with pgvector, Ollama for embeddings, and any OpenAI-compatible local server for the eval model (oMLX, llama.cpp, LM Studio, Ollama).

Wire the recall hook into ~/.claude/settings.json and every new prompt arrives pre-briefed with your own history.

Read the docs on GitHub ↗

terminal — install

# index everything you've ever done with Claude Code
$ git clone https://github.com/siTOTis-ai/pastclaude
$ cd pastclaude
$ createdb pastclaude && psql pastclaude < schema.sql
$ python -m venv .venv && .venv/bin/pip install -e .
$ ollama pull mxbai-embed-large

# first full run: ingest → eval → embed → wiki → audit
$ .venv/bin/pastclaude all
$ .venv/bin/pastclaude status

# ask your own history a question
$ .venv/bin/pastclaude search "how did we fix the
  webhook replay bug"

Claude forgets.The archive doesn't.

Memory has four jobs.Most systems do three.