What do 500 articles on AI look like?
The wiki-as-memory pattern is one answer. Not the only one, but a structurally honest one.
Andrej Karpathy recently described shifting most of his token usage from code generation to knowledge compilation — and the distinction is worth sitting with.
His setup: raw articles land in a folder. An LLM reads them, writes structured wiki entries with backlinks and cross-references, then runs periodic health checks to find contradictions and orphaned concepts. The output is plain markdown. Files on disk. Nothing you can't read yourself.
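The health-check step is the easiest piece to make concrete. Below is a minimal sketch of one such check, finding orphaned pages that no other page links to, assuming a flat directory of `.md` files and `[[wikilink]]` syntax; Karpathy's actual tooling isn't public, so the names and conventions here are hypothetical.

```python
import re
from pathlib import Path

# Matches [[Page]] and [[Page|display label]]; ignores section anchors.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def find_orphans(wiki_dir: str) -> set[str]:
    """Return page names that exist on disk but receive no inbound links."""
    pages: set[str] = set()
    inbound: set[str] = set()
    for path in Path(wiki_dir).glob("*.md"):
        pages.add(path.stem)
        for target in WIKILINK.findall(path.read_text(encoding="utf-8")):
            inbound.add(target.strip())
    return pages - inbound
```

A contradiction check would need the LLM itself; orphan detection is just file traversal, which is part of the appeal of keeping the wiki as plain files.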
Above is what mine looks like for the AI Regulation articles I'm tracking.
He calls it an LLM Knowledge Base. The analogy is a compiler: raw/ is source code, the LLM is the compiler, the wiki is the executable, the health checks are the test suite.
What's interesting isn't that it replaces RAG — it probably doesn't, at scale. What's interesting is what it reveals about where RAG actually breaks.
RAG has a structural problem most teams discover too late: retrieval quality degrades as corpus complexity increases. Chunking breaks context. Embeddings drift. Similarity search finds the sentence you wrote, not the concept you meant. You end up with a system that's confident and wrong about what it knows.
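The "chunking breaks context" failure is easy to demonstrate. A toy example, using naive fixed-size character chunking (real pipelines split on tokens or sentences, but the failure mode is the same): the split severs a claim from its subject, so a similarity search can retrieve half a thought.

```python
import math

def chunk(text: str, size: int) -> list[str]:
    """Naive fixed-size chunking: split text every `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = ("The EU AI Act classifies systems by risk. "
       "High-risk systems require conformity assessment.")
chunks = chunk(doc, 40)
# One chunk now ends mid-sentence: "systems" is cut off from the
# obligation it carries, and no single chunk states the full rule.
```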
The markdown-first approach flips the question. Instead of "how do I retrieve the right chunks," you ask "can I build a substrate the model can reason over without retrieval at all?" For small-to-mid corpora, the answer is often yes.
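"Small-to-mid corpora" is a checkable claim. A rough sketch of the feasibility test, using the common chars-per-token heuristic (real tokenizers vary) and an assumed context budget, both of which are illustrative numbers, not anyone's published spec:

```python
from pathlib import Path

CHARS_PER_TOKEN = 4        # rough heuristic; actual tokenizers differ
CONTEXT_BUDGET = 200_000   # assumed window size; model-dependent

def fits_in_context(wiki_dir: str, budget: int = CONTEXT_BUDGET) -> tuple[bool, int]:
    """Estimate whether every wiki page fits in one prompt together."""
    total_chars = sum(len(p.read_text(encoding="utf-8"))
                      for p in Path(wiki_dir).glob("*.md"))
    est_tokens = total_chars // CHARS_PER_TOKEN
    return est_tokens <= budget, est_tokens
```

When this returns `True`, retrieval is optional: the model can reason over the whole substrate at once. When it returns `False`, you're back to selection, and the RAG question resurfaces.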
The governance implication is the part I keep coming back to.
An inspectable knowledge base — files, not vectors — means you can audit what the model incorporated, trace how it updated a concept over time, and understand why an agent reached a conclusion. That's not just a nice feature. In regulated contexts, it's the difference between a system you can defend and one you hope nobody asks about.
We spend a lot of time talking about agent outputs. We spend much less time asking: what did the agent actually know, and how do we know it knew it?
The wiki-as-memory pattern is one answer. Not the only one, but a structurally honest one.