What is a typed-wiki memory for an AI agent?

A folder of markdown files with structured YAML headers, sorted into typed directories (people, projects, concepts, decisions, feedback, reference), plus a compile script that reads them and regenerates the context the agent loads each session. The text files are the source of truth. The rendered context is auto-built from them and never hand-edited.

Why keep AI memory in markdown instead of a database or vector store?

Markdown is grep-friendly, cheap in tokens, edits cleanly in any tool, and lives in git, so every change is a diff you can roll back. The skim-ready context files are auto-built from the markdown on demand. A vector store gives you none of that auditability for this kind of operator work.

How do you stop an AI's memory from going stale?

Make decisions append-only. Never edit a past decision's body. When a direction changes, write a new dated file and add a superseded_by field to the old one. Following the supersession chain gives the current state. Reading it backward gives the full history. Nothing gets overwritten, so nothing is lost.

The typed-wiki brain: a Claude Code memory that compiles itself

A brain that compiles itself

Last month I wrote about what 100 million tokens of Claude Code actually buys. One line in that piece pointed at a memory system and called it "a brain." That was not a metaphor. It is a typed-wiki that compiles into context every time Ben opens Claude Code in any project.

That post named the brain. This one opens it up.

Right now the system holds 245 decision nodes, 294 concept nodes, 70 people, 27 projects, 56 operational-rule files, and 9 reference pointers. Every one is a markdown file with a YAML header. The whole thing is a folder of text files and a Python script that reads them. There is no database under it and no vector store.

That is the whole trick. The text files are the truth. The script turns the truth into the context I read at the start of every session. When the truth changes, the script reruns and the context regenerates. Nothing gets hand-edited twice.
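To make "a markdown file with a YAML header" concrete, here is a minimal sketch of splitting one node file into its header and body. It is not the actual compile script. The function name split_front_matter is hypothetical, and the parser only understands flat key: value pairs and inline lists; a real script would more likely hand the header to a YAML library.

```python
def split_front_matter(text):
    """Split a node file into (header_dict, body).

    Assumption: the header is fenced by '---' lines and holds only flat
    `key: value` pairs or inline lists such as [a, b]. Anything richer
    needs a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        # No closing fence: treat the whole file as body.
        return {}, text
    header = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#") or ":" not in line:
            continue
        key, _, raw = line.partition(":")
        raw = raw.strip()
        if raw.startswith("[") and raw.endswith("]"):
            header[key.strip()] = [item.strip() for item in raw[1:-1].split(",") if item.strip()]
        else:
            header[key.strip()] = raw
    body = "\n".join(lines[end + 1:])
    return header, body
```

Everything downstream can then work on a plain dict and a string, which keeps the rest of the pipeline easy to test.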

Six folders, one rule each

The memory lives in six typed directories. Each one means something specific.

people/ holds relationship context. Who someone is, what they care about, the last real touch.
projects/ holds active and historical work, one file per thing in flight.
concepts/ holds cross-cutting patterns, sorted into clusters. The lessons that earned their keep by repeating.
decisions/ holds dated direction-changes. Append-only. More on that below.
feedback/ holds operational rules. The small "always do X" instructions.
reference/ holds external pointers and lookup tables.

The directory a file sits in decides its type. A file in feedback/ is a feedback node whether or not its header says so. A file in decisions/ gets its date read straight off the filename. The compile script states this directly in its own comments: directory and filename are the type oracle, the header is a hint, and the directory wins on any collision.

That sounds like a small design choice. It is the choice that keeps the system from drifting. A header can lie. A folder cannot. So the folder is the source of authority, and the header just adds detail the folder cannot carry.
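The directory-wins rule can be sketched in a few lines. This is an illustration under assumptions, not the real script: the singular type names, the YYYY-MM-DD- filename prefix for decisions, and the function names resolve_type and decision_date are all hypothetical.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# The six typed directories from the post. The singular type names on the
# right are an assumption made for this sketch.
DIR_TO_TYPE = {
    "people": "person",
    "projects": "project",
    "concepts": "concept",
    "decisions": "decision",
    "feedback": "feedback",
    "reference": "reference",
}

# Assumed filename convention for decisions: YYYY-MM-DD-some-slug.md
DATE_PREFIX = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-")


def resolve_type(rel_path, header):
    """Return (node_type, warning). The directory wins over the header."""
    top_dir = PurePosixPath(rel_path).parts[0]
    node_type = DIR_TO_TYPE[top_dir]
    declared = header.get("type")
    warning = None
    if declared is not None and declared != node_type:
        warning = f"{rel_path}: header says {declared!r}, directory says {node_type!r}"
    return node_type, warning


def decision_date(rel_path):
    """Read a decision's date off its filename, or None if there is no date prefix."""
    match = DATE_PREFIX.match(PurePosixPath(rel_path).name)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)
```

Note that the header is never consulted to decide the type. It can only produce a warning when it disagrees with the folder.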

Edges come from structure, never from prose

Here is where most note systems leak. You write "this connects to the Acme deal" inside a paragraph, and you assume the connection now exists in the graph. It does not. It is a string of words. Nothing can navigate it.

The typed-wiki refuses to infer edges from body text. The compile script says so in plain language inside the code: no body-text inference, every edge is intentional. An edge exists only when it comes from one of five header fields (key_contacts, supersedes, superseded_by, key_threads, origin) or from an explicit [[wikilink]] typed into the body on purpose.

So if a connection matters, you encode it as structure. You do not describe it and hope. A mention is noise. A [[wikilink]] is a load-bearing edge that both the compile script and Obsidian treat as real.

This is the discipline that makes the graph trustworthy. When I follow an edge, I know a human meant it to be there. The system never guesses a relationship into existence and then reasons on top of the guess.
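A rough sketch of what structure-only edge extraction could look like, assuming the header has already been parsed into a dict. The five field names come from the post; the function name extract_edges, the (source, label, target) tuple shape, and the support for Obsidian-style [[slug|alias]] links are assumptions for illustration.

```python
import re

# The five header fields the post names as edge sources.
EDGE_FIELDS = ("key_contacts", "supersedes", "superseded_by", "key_threads", "origin")

# Matches [[slug]], [[slug#heading]] and [[slug|display text]]. Handling the
# heading and alias forms is an assumption borrowed from Obsidian's link syntax.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_edges(slug, header, body):
    """Return a sorted list of (source, label, target) edges for one node.

    Plain prose in the body is never inspected for names or mentions.
    """
    edges = set()
    for field in EDGE_FIELDS:
        value = header.get(field)
        if value is None:
            continue
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            edges.add((slug, field, str(target).strip()))
    for match in WIKILINK.finditer(body):
        edges.add((slug, "wikilink", match.group(1).strip()))
    return sorted(edges)
```

A body that says "this connects to the Acme deal" yields no edge here. The same sentence with [[acme-deal]] in it yields exactly one.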

Decisions are append-only on purpose

The hardest part of any long-running memory is the past lying to you. A note from three months ago says the plan is X. The plan is now Y. The note still says X. Reasoning over a stale note produces a confident wrong answer.

The fix here is brutal and simple. Decision files are append-only. You never edit the body of a past decision. When a direction changes, you write a new dated file and add a superseded_by field to the old one. The old body stays exactly as it was, frozen as the record of what was true at the time.

That gives me two things at once. The current state, by following the supersession chain to its end. And the full history, by reading the chain backward. Nothing is overwritten, so nothing is lost, and the chain itself tells me how the current posture got here.
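Following the chain is a short loop. This sketch assumes the superseded_by fields have already been collected into a dict from old slug to new slug; the function name supersession_chain and the cycle guard are additions for illustration, not something the post describes.

```python
def supersession_chain(slug, superseded_by):
    """Follow superseded_by links from `slug` to the end of its chain.

    `superseded_by` maps a decision slug to the slug that replaced it.
    Returns the chain oldest first; the last item is the current state.
    """
    chain = [slug]
    seen = {slug}
    while chain[-1] in superseded_by:
        successor = superseded_by[chain[-1]]
        if successor in seen:
            # Guard against two decisions superseding each other.
            raise ValueError(f"supersession cycle at {successor!r}")
        chain.append(successor)
        seen.add(successor)
    return chain
```

The last element answers "what is true now." The list read in reverse answers "how did we get here."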

That is what separates this from a journal. A journal just piles up over time. Here the current state is something I can trust without re-reading everything that led to it.

The compile step is the engine

The script is 559 lines of Python. It does four jobs.

It regenerates the auto-built sections inside the context files. Those sections sit between begin and end markers. The script reads the typed nodes, rebuilds the section, and writes it back. Anything you hand-type inside those markers gets overwritten on the next run, which is the point. The render is disposable. The nodes are the source.
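Marker-based regeneration can be sketched as a single substitution. The post does not show the marker syntax, so the HTML-comment markers and the function name replace_section below are assumptions.

```python
import re


def replace_section(text, name, new_body):
    """Replace whatever sits between the begin/end markers for `name`.

    Assumed marker format: <!-- BEGIN AUTO:name --> ... <!-- END AUTO:name -->
    """
    begin = f"<!-- BEGIN AUTO:{name} -->"
    end = f"<!-- END AUTO:{name} -->"
    pattern = re.compile(re.escape(begin) + r".*?" + re.escape(end), re.DOTALL)
    replacement = f"{begin}\n{new_body.strip()}\n{end}"
    # A function replacement keeps backslashes in new_body from being
    # interpreted as regex escapes.
    result, count = pattern.subn(lambda _match: replacement, text, count=1)
    if count == 0:
        raise ValueError(f"markers for section {name!r} not found")
    return result
```

Running it twice with the same input gives the same output, which is the property that lets a later check compare a fresh render against what is on disk.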

It computes backlinks. For every node, it finds who points at it and writes a backlinks footer into the file. So a person node automatically lists every decision and project that references them, without anyone maintaining that list by hand.
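Backlinks are just the edge list turned around. A minimal sketch, assuming edges arrive as (source, label, target) tuples; the function names and the footer layout are hypothetical.

```python
from collections import defaultdict


def compute_backlinks(edges):
    """Invert (source, label, target) edges into target -> sorted list of sources."""
    incoming = defaultdict(set)
    for source, _label, target in edges:
        if source != target:  # a node linking to itself is not a backlink
            incoming[target].add(source)
    return {target: sorted(sources) for target, sources in incoming.items()}


def backlinks_footer(slug, backlinks):
    """Render the footer for one node; empty string when nothing points at it.

    The 'Backlinks' heading and bullet layout are assumptions for this sketch.
    """
    sources = backlinks.get(slug, [])
    if not sources:
        return ""
    lines = ["## Backlinks"] + [f"- [[{source}]]" for source in sources]
    return "\n".join(lines) + "\n"
```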

It lints. Run it with --lint and it checks every header against the schema. It flags a node whose type does not match its directory. It flags required fields that are missing. It catches a [[wikilink]] that points at a slug no file owns, which is how a typo or a deleted node gets caught before it rots the graph.
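The three lint checks described above fit in one pass over the nodes. This is a sketch, not the script's real schema: the shape of the nodes dict, the required_fields table, and the function name lint_nodes are all assumptions.

```python
def lint_nodes(nodes, required_fields):
    """Return a list of human-readable problems.

    `nodes` maps slug -> {"dir_type": str, "header": dict, "links": list of slugs}.
    `required_fields` maps a node type to the header fields it must carry.
    Both shapes are assumptions made for this sketch.
    """
    problems = []
    for slug, node in sorted(nodes.items()):
        header = node["header"]
        dir_type = node["dir_type"]
        declared = header.get("type")
        if declared is not None and declared != dir_type:
            problems.append(f"{slug}: type {declared!r} does not match directory type {dir_type!r}")
        for field in required_fields.get(dir_type, ()):
            if field not in header:
                problems.append(f"{slug}: missing required field {field!r}")
        for target in node["links"]:
            if target not in nodes:
                problems.append(f"{slug}: [[{target}]] points at a slug no file owns")
    return problems
```

An empty list means the graph is clean. Anything else is a line a human can act on.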

And it runs in check mode. Run it with --check and it touches nothing, just exits non-zero if the rendered files have drifted from the source nodes. That check is wired into a git pre-commit hook. You cannot commit a context file that disagrees with its substrate. The system refuses to ship a lie about its own state.
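Check mode boils down to rendering in memory, comparing, and returning an exit code. A minimal sketch, assuming both the on-disk files and the fresh render are available as dicts of file name to text; find_drift and run_check are hypothetical names.

```python
import sys


def find_drift(rendered, expected):
    """Return the names of context files that differ from a fresh render.

    Both arguments map a file name to its full text. A file present on one
    side only also counts as drift.
    """
    names = set(rendered) | set(expected)
    return sorted(name for name in names if rendered.get(name) != expected.get(name))


def run_check(rendered, expected):
    """Write nothing; return the exit code a --check run would use."""
    drifted = find_drift(rendered, expected)
    for name in drifted:
        print(f"drift: {name} is out of date, rerun the compile", file=sys.stderr)
    return 1 if drifted else 0
```

A pre-commit hook only has to run the script with --check and let the non-zero exit code block the commit.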

Why a folder of text files wins

People reach for a real database when they hear "memory system." I would argue the opposite for this kind of work.

The substrate is markdown because markdown is grep-friendly, cheap in tokens, edits cleanly in any tool, and lives in git. I can search the entire memory with a regex. I can see every change as a diff. I can roll back a bad write. Try that with a vector store.
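Searching the whole memory with a regex needs nothing beyond the standard library. A small sketch, assuming every node is a .md file somewhere under one root folder; search_memory is a hypothetical helper, and plain grep does the same job from a shell.

```python
import re
from pathlib import Path


def search_memory(root, pattern):
    """Yield (relative_path, line_number, line) for every matching line under root."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path.relative_to(root).as_posix(), number, line
```

For example, searching for ^superseded_by: lists every decision that has been replaced, with the file and line where the pointer lives.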

The rendered context files are the only place skim-readability matters, and those are auto-built from the markdown. The pattern is general: keep the editable, queryable, version-controlled layer in markdown, render the skim layer on demand, and never hand-edit the render. When the render drifts, you do not fix the render. You fix the source and rerun.

That is why this scales to a portfolio of projects from one keyboard. The cost of adding a node is writing one small text file. The cost of keeping the render consistent is close to zero, because the script regenerates it and the --check hook catches drift before it is committed.

Why this is public, and why it stays a pattern

This is shipped, not theoretical. The portable version of the compile script and a set of redacted example nodes live in a public repo, MIT-licensed and tested to run clean from a fresh clone. The point was never to sell a memory product. It was to ship the pattern so other builders can run it in their own setup.

That is a deliberate line. The system that produces my work is a factory. The things it makes are the products. Shipping the factory itself would mean support load, breaking-change overhead, and a design freeze on a system that is still learning. Shipping the pattern captures the part worth sharing without any of that. A pattern cannot be slop. It is either useful in your own setup or it is not, and the only readers who engage are the ones who would build it anyway.

If you run an operation with many parts moving in parallel, build the substrate before anything clever. A typed-wiki with a compile step. The discipline of append-only decisions. Edges that come from structure, not prose. Add agents after. Add parallel patterns after that. The agents are interchangeable. The substrate is what compounds.

The same factory logic runs the Forward Deployed Engineers Guild. Different surface, identical spine. The substrate is the product, every time.

If you want help building this kind of memory layer inside your own operation, the Discovery engagement is how that conversation starts. If you want the wider read on what running Claude Code at this volume actually looks like, "What 100M tokens actually buys" is the companion piece.