LLM Wiki: Andrej Karpathy's Pattern for Building a Compounding Knowledge Base

Andrej Karpathy just published a 2,000-word idea file on GitHub. Within days, it got over 10 million views. The idea? Use an LLM to build and maintain a personal wiki — not just for storage, but as a living, compounding knowledge artifact that gets richer every time you use it.

No vector databases. No RAG pipelines. Just markdown files, an LLM, and Obsidian as the frontend. That’s the LLM Wiki pattern — and it’s changing how developers and researchers think about knowledge management.

The Problem With RAG

Most people’s experience with LLMs and documents looks like this: you upload files, the LLM retrieves relevant chunks at query time, and generates an answer. ChatGPT file uploads work this way. Most RAG systems work this way. It works — but there’s a fundamental inefficiency buried inside.

The LLM rediscovers knowledge from scratch on every question.

Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every single time. Nothing accumulates. Whatever the system works out while answering one question is gone by the next. There’s no compounding.

Over time, this becomes a real limitation. You read 50 articles about a topic. You ask 100 questions about it. And yet, every new question still feels like you’re starting from zero.

The LLM Wiki: A Different Model

The LLM Wiki pattern flips this on its head. Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources.

When you add a new source, the LLM:

Reads it and extracts the key information
Integrates it into the existing wiki
Updates entity pages, revises topic summaries
Notes where the new data contradicts old claims
Strengthens or challenges the evolving synthesis

A single source might touch 10–15 wiki pages. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you’ve read. You don’t just retrieve — you know.
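The ingest step above can be sketched in a few lines of Python. This is a minimal illustration, not Karpathy's implementation: `ingest`, `PageUpdater`, and `propose_updates` are hypothetical names, and `propose_updates` stands in for whatever LLM call you use. It is assumed to receive the source text plus the current wiki pages and to return the pages it wants created or rewritten.

```python
from pathlib import Path
from typing import Callable

# Hypothetical LLM interface: takes the source text and the current wiki pages
# (file name -> content) and returns the pages to create or overwrite.
PageUpdater = Callable[[str, dict[str, str]], dict[str, str]]


def ingest(source: Path, wiki_dir: Path, propose_updates: PageUpdater) -> list[str]:
    """Feed one raw source to the LLM and write back the pages it returns."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    current = {p.name: p.read_text(encoding="utf-8") for p in wiki_dir.glob("*.md")}
    updates = propose_updates(source.read_text(encoding="utf-8"), current)
    written = []
    for name, body in updates.items():
        # Path(name).name drops any directory part, so writes stay inside wiki_dir.
        target = wiki_dir / Path(name).name
        target.write_text(body, encoding="utf-8")
        written.append(target.name)
    return sorted(written)
```

The source file is only ever read, and every write lands in the wiki directory, which mirrors the split between raw inputs and LLM-owned pages.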

Karpathy put it this way: “The wiki is a persistent, compounding artifact. You never write the wiki yourself — the LLM writes and maintains all of it. You’re in charge of sourcing, exploration, and asking the right questions.”

The Three-Layer Architecture

Layer 1: Raw Sources

Your curated collection of source documents — articles, papers, images, data files. Immutable. The LLM reads from them but never modifies them. This is your source of truth. Think of it as the raw/ directory in a data pipeline: inputs only, never overwritten.
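One way to check the "never overwritten" rule is to fingerprint the raw folder before and after an agent run. The sketch below is an assumption about how you might do that, not part of the original pattern. The function names `snapshot` and `modified_or_deleted` are made up for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir (relative path) to the SHA-256 of its bytes."""
    return {
        str(path.relative_to(raw_dir)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(raw_dir.rglob("*"))
        if path.is_file()
    }


def modified_or_deleted(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return files whose content changed or that disappeared; new files are allowed."""
    return sorted(name for name, digest in before.items() if after.get(name) != digest)
```

Take a snapshot before the LLM runs and another afterwards. An empty result from `modified_or_deleted` means existing sources were left untouched, while newly added sources are still permitted.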

Layer 2: The Wiki

A directory of LLM-generated markdown files — summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent.

You read the wiki. The LLM writes it. This is the key inversion.

Layer 3: The Schema

A configuration file (e.g. CLAUDE.md or AGENTS.md) that tells the LLM how the wiki is structured, what conventions to follow, and what workflows to execute on ingest, query, or maintenance. The schema is what makes the LLM a disciplined wiki maintainer rather than a generic chatbot.

Why It’s Going Viral

Karpathy’s GitHub Gist describing the pattern went semi-viral within days. Here’s why developers and researchers are paying attention:

1. It’s ridiculously simple

No vector embeddings. No ChromaDB or Pinecone. No RAG framework. Just plain markdown files in a folder, an LLM with a good system prompt, and Obsidian as the viewer. The barrier to entry is essentially zero for anyone already using AI tools.

2. The “Agent-Forward” Implication

As one commenter put it: “This is a meta-framework. It doesn’t depend on any specific model or tech stack — it’s trying to define a way humans and AI collaborate to manage knowledge.”

But there’s a more provocative reading: in the Agent era, sharing specific code or apps is becoming less important. Share the idea, and let the recipient’s Agent build the implementation. Karpathy distributed his pattern as a single gist. You give it to your Agent, and it builds out your wiki from scratch. The knowledge artifact is now portable across Agent platforms.

3. The Compounding Effect

Traditional note-taking apps (Notion, Roam, Obsidian without the LLM) all suffer from the same problem: the maintenance burden grows faster than the value. You add 50 notes, and now you have 50 notes to keep organized. Humans give up.

LLMs don’t give up. They don’t get bored updating cross-references. They can touch 15 files in one pass. The cost of maintaining the wiki approaches zero. And the value compounds — more sources mean more synthesis, more synthesis means better answers, better answers encourage more sourcing.

4. It’s Already Working at Scale

Karpathy himself reported that after ~100 sources (~400,000 characters of text), he could ask the wiki complex questions and get answers that synthesized across the entire corpus. Not “which document mentions X” — “what is the overall thesis emerging from all my research, and where are the contradictions?”

What People Are Saying

The reaction to Karpathy’s pattern has been a mix of excitement and genuine insight:

“In the Agent era, we no longer need to share specific code or apps. Just share the idea, and let the Agent customize and implement it according to your needs.” — Sina News summarizing community discussion
“This is a meta-framework. It doesn’t depend on any specific model or tech stack — it’s trying to define how humans and AI collaborate to manage knowledge.” — Developer community response
“I have the LLM agent open on one side and Obsidian on the other. The LLM makes edits based on our conversation, and I browse the results in real time. Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.” — Karpathy himself
Multiple open-source implementations have already appeared on GitHub within days of the original Gist

Practical Applications

The pattern applies across a wide range of contexts:

Personal: Tracking goals, health, psychology, self-improvement — filing journal entries, articles, podcast notes, and building a structured picture of yourself over time
Research: Going deep on a topic over weeks or months — reading papers, articles, reports, and incrementally building a comprehensive wiki with an evolving thesis
Book reading: Filing each chapter as you go, building out pages for characters, themes, and plot threads — by the end you have a rich companion wiki, like a Tolkien Gateway built personally for each book
Business/Team: An internal wiki maintained by LLMs, fed by Slack threads, meeting transcripts, project documents, customer calls — the wiki stays current because the LLM does the maintenance nobody wants to do
Competitive analysis, due diligence, trip planning, course notes — anything where you’re accumulating knowledge over time and want it organized rather than scattered

How to Get Started

Starting your own LLM Wiki is straightforward:

Create a raw/ directory — drop in articles, papers, notes as markdown files
Create a wiki/ directory — the LLM will populate this
Set up Obsidian — open the wiki folder as a vault, enable the Graph View plugin
Give your Agent the LLM Wiki pattern — paste Karpathy’s Gist or your own version into your Agent’s context
Start ingesting — paste an article, ask the LLM to process it and file it into the wiki
Query the wiki — ask questions, explore, let the synthesis compound

Optional tools that enhance the workflow:

Obsidian Web Clipper — browser extension that converts web articles to markdown, quick way to populate your raw collection
qmd (github.com/tobi/qmd) — local search engine for markdown with BM25/vector hybrid, can be used as an MCP server so the LLM queries natively
Dataview plugin — runs queries over YAML frontmatter in wiki pages, generates dynamic tables and lists
Obsidian Graph View — visualize what’s connected to what, find hubs and orphan pages
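Orphan pages can also be found without opening Obsidian. The following is a rough sketch that assumes a flat wiki/ folder and the [[page]] link style. It ignores details such as case-insensitive matching and links that include folder paths, so treat it as an approximation of what Graph View shows. `link_graph` and `orphans` are illustrative names.

```python
import re
from pathlib import Path

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures the target only.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def link_graph(wiki_dir: Path) -> dict[str, set[str]]:
    """Map each page name (file stem) to the set of page names it links to."""
    graph = {}
    for page in wiki_dir.glob("*.md"):
        text = page.read_text(encoding="utf-8")
        graph[page.stem] = {target.strip() for target in WIKILINK.findall(text)}
    return graph


def orphans(graph: dict[str, set[str]]) -> list[str]:
    """Pages that no other page links to."""
    linked = set()
    for source, targets in graph.items():
        linked |= targets - {source}  # self-links do not count
    return sorted(name for name in graph if name not in linked)
```

An index page will usually show up as an orphan in this sketch, since it links out but nothing links back to it.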

The Relationship to Vannevar Bush’s Memex

The idea is related in spirit to Vannevar Bush’s Memex (1945) — Bush’s vision of a personal, curated knowledge store with associative trails between documents. Bush’s Memex was closer to the LLM Wiki vision than to what the web actually became: private, actively curated, with the connections between documents as valuable as the documents themselves.

The part Bush couldn’t solve: who does the maintenance? The Memex required humans to build and maintain all the associations. The maintenance burden was too high for widespread adoption.

The LLM handles that now. The knowledge compilation is free. The bookkeeping is free. The cross-referencing is free. What remains is the human work that actually matters: sourcing, thinking, and asking good questions.

Is This the Future of Personal Knowledge Management?

The LLM Wiki pattern is still new — the GitHub Gist is only a few weeks old. But the reaction from the developer community suggests this has struck a nerve. We’ve been waiting for a knowledge management solution that doesn’t require constant manual maintenance. LLMs finally make that possible.

Whether it becomes a lasting paradigm or an interesting experiment remains to be seen. But if you’re someone who reads a lot, researches deeply, or thinks carefully about complex topics, the LLM Wiki pattern is worth trying today. The tools already exist. The barrier is low. And the compounding effect — if you stick with it — is genuinely different from anything that came before.

Build Your Own LLM Wiki: A Step-by-Step Guide

This section is written for middle school students and up. No prior experience needed!

Ready to build your own LLM Wiki? Here’s how, step by step. We’ll show you two ways: a manual way (you do everything yourself) and an auto way (your AI agent does the heavy lifting).

What You’ll Need

A computer (Mac or Windows)
An AI tool — either Claude (claude.ai) or OpenClaw (your personal AI assistant)
Obsidian — a free note-taking app (download at obsidian.md)
A folder for your wiki (we’ll call it my-first-wiki/)

Step 1: Set Up Your Folder Structure

Create a new folder on your computer called my-first-wiki. Inside it, create two folders:

my-first-wiki/
├── raw/          ← Put your sources here (articles, notes)
└── wiki/         ← The AI will fill this with summaries
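If you would rather create the folders with a script, here is a small Python sketch. The function name `scaffold` is made up for this example, and the starter wiki/index.md file is an assumption based on the index mentioned later in this guide.

```python
from pathlib import Path


def scaffold(root: Path) -> Path:
    """Create raw/ and wiki/ under root, plus a starter index page, and return root."""
    (root / "raw").mkdir(parents=True, exist_ok=True)
    (root / "wiki").mkdir(parents=True, exist_ok=True)
    index = root / "wiki" / "index.md"
    if not index.exists():  # never overwrite an index that already has entries
        index.write_text("# Index\n", encoding="utf-8")
    return root
```

Calling `scaffold(Path.home() / "my-first-wiki")` would create the layout shown above in your home folder. Running it a second time changes nothing.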

Then open Obsidian and click “Open folder as vault”. Select your my-first-wiki folder.

Step 2: Add Your First Source

Drop an article, a PDF, or some notes into the raw/ folder. Give it a descriptive name like climate_change_article.md. (If it’s a web page, you can copy the text and paste it into a new .md file.)

Manual Way (You Do It)

Open the file in Obsidian
Read it carefully
Create a new note in wiki/ — e.g., climate_change_summary.md
Write a 3-sentence summary of the main points
List 2–3 key facts you want to remember
Add links to other topics using [[other topic]] syntax (e.g., [[global warming]])
Add this to your wiki/index.md file so you can find it later
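The last manual step, adding the page to the index, is easy to forget, so here is a tiny helper as a sketch. The entry format ("- [[page]]: summary") is an assumption. The guide does not prescribe one, and `add_to_index` is an illustrative name.

```python
from pathlib import Path


def add_to_index(index: Path, page: str, summary: str) -> bool:
    """Append '- [[page]]: summary' to the index unless the page is already linked."""
    link = f"[[{page}]]"
    existing = index.read_text(encoding="utf-8") if index.exists() else ""
    if link in existing:
        return False  # already listed, leave the index alone
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    index.write_text(f"{existing}{separator}- {link}: {summary}\n", encoding="utf-8")
    return True
```

For example, `add_to_index(Path("wiki/index.md"), "climate_change_summary", "Main points of the article")` adds one line the first time and does nothing on later calls.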

Auto Way (AI Does It)

Open Claude or OpenClaw
Copy-paste this prompt:

I've set up an LLM Wiki. My raw sources are in ~/my-first-wiki/raw/
and the wiki is in ~/my-first-wiki/wiki/.

Please:
1. Read the file in raw/ called "climate_change_article.md"
2. Create a wiki page in wiki/ summarizing the key points
3. Add it to wiki/index.md
4. Link it to any related topics you find

Do this now and show me what you created.

Watch the AI read your file, write the summary, and organize everything
Open Obsidian to see the results

Step 3: Search Your Wiki — A Real Example

Let’s say you’re working on a debate: “Should schools ban smartphones?”

What You Put In (Input)

First, you add sources to your raw folder:

raw/
├── smartphone_study_2025.md   ← A research article on phone bans
├── student_opinions_poll.md  ← Survey of 1,000 students
└── district_results_ca.md    ← Real data from California districts

Then you ask your AI:

I've added 3 new sources about smartphone bans in schools to my raw/ folder.
Please compile everything into my wiki and give me a structured debate brief
for the topic: "Should schools ban smartphones?"

For each side, include:
- 3 strongest arguments with source citations
- 1 counterargument to the opposing side
- Key statistics (with source numbers)
- What experts disagree on

Make it organized enough that I can use it directly in a debate.

What You Get Out (Expected Output)

The AI compiles everything into a structured page in your wiki, something like this (the figures are illustrative placeholders, not real findings):

# Debate Brief: Should Schools Ban Smartphones?

## YES — Arguments For a Ban

1. **Distraction kills learning**
   - Study: Students check phones 11 times per class on average
   - Source: smartphone_study_2025.md
   - Impact: Grades drop 23% with unrestricted phone access

2. **Mental health improves**
   - 3 in 4 students report less anxiety after phone ban
   - Source: student_opinions_poll.md
   - Note: Self-reported, not clinically verified

3. **Equalizes access**
   - All students compete on same playing field
   - Source: district_results_ca.md
   - Stat: 68% improvement in test scores in CA districts

## NO — Arguments Against a Ban

1. **Phones are learning tools**
   - Source: smartphone_study_2025.md
   - Counter-point: 40% of classwork uses digital devices

2. **Emergency communication**
   - Students need to reach parents instantly
   - Policy option: Phone stays in backpack, not in hand

3. **Life skills**
   - Teaching self-control > removing temptation
   - Source: student_opinions_poll.md

## Experts Disagree On
- Whether phone bans affect low-income students differently
- Whether teachers can effectively enforce partial bans

## Files Compiled From
- [[raw/smartphone_study_2025.md]]
- [[raw/student_opinions_poll.md]]
- [[raw/district_results_ca.md]]

Created: 2026-04-10 | Updated: 2026-04-10

This is a living document. As you add more sources, the AI updates it. New study comes out about phone bans? Drop it in raw/, ask the AI to compile, and your brief gets better automatically.
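Because the brief cites files by name, you can check that every cited source really exists in raw/. The sketch below assumes the "- Source: file.md" line format used in the sample brief above. `missing_sources` is a hypothetical helper, not a feature of any tool mentioned here.

```python
import re
from pathlib import Path

# Matches list lines such as "   - Source: smartphone_study_2025.md".
SOURCE_LINE = re.compile(r"^\s*-\s*Source:\s*(\S+\.md)\s*$", re.MULTILINE)


def missing_sources(brief: str, raw_dir: Path) -> list[str]:
    """Return cited file names that do not exist in raw_dir."""
    cited = set(SOURCE_LINE.findall(brief))
    return sorted(name for name in cited if not (raw_dir / name).is_file())
```

An empty list means every citation points at a real file. It does not prove the file actually says what the brief claims, so it is still worth opening the source for the statistics you plan to quote.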

Step 4: Use It Downstream — Where Your Wiki Goes Next

A compiled wiki is useful in many ways:

📝 In Your Schoolwork

Essays: Ask the AI, “Based on my wiki, write an outline for an essay on smartphone bans.” It works from your compiled sources rather than from memory alone, so each claim can be traced back to a file.
Presentations: Ask the AI to turn your wiki page into bullet points for a slideshow.
Study guides: “Create 10 quiz questions from my wiki on this topic.”

💬 In Debates

Your compiled brief (shown above) is your script. Practice citing sources from the wiki.
When your opponent makes a claim, search your wiki: “Do I have data on this?”
After the debate, add new sources and ask the AI to update the brief.

🧠 In Your Personal Knowledge

Ask open-ended questions: “What patterns do I see across all my sources on this topic?”
Find contradictions: “Where do my sources disagree, and why?”
Discover gaps: “What haven’t I researched yet?”

🔗 Share With Others

Your wiki/ folder is just markdown files — you can share it with classmates, email it to your teacher, or upload it to Google Drive.
No special software needed to read it — any text editor works.

The Manual vs. Auto Comparison

Task	Manual Way (You)	Auto Way (AI Agent)
Read a source	You read every word	AI reads it instantly
Write summary	You write 5-10 sentences	AI writes it in seconds
Find connections	You manually compare files	AI cross-references all files at once
Update when new info arrives	You re-read and rewrite	AI updates all relevant pages in one pass
Catch contradictions	You might miss them	AI explicitly flags contradictions
Keep index current	Easy to forget	AI updates index automatically

The best approach: Start manually to understand how it works. Once you get the hang of it, let the AI do the routine work so you can focus on thinking, sourcing, and asking good questions.

Frequently Asked Questions

What’s the difference between LLM Wiki and RAG?

RAG retrieves relevant chunks from raw documents at query time, so the LLM rediscovers knowledge from scratch on every question. LLM Wiki pre-compiles knowledge into a structured, interlinked wiki: the synthesis is already done, contradictions are already flagged, and cross-references are already established. For complex, multi-source questions, LLM Wiki answers can be deeper because the synthesis work is done once and reused.

Do I need Obsidian?

No — Obsidian is just one good option for viewing the wiki. Any folder of markdown files works. The key is that the LLM generates and maintains the wiki content; you just need a way to read and browse the results. Notion, VS Code, or even plain text editors work too.

How is this different from just taking notes in Obsidian?

Traditional note-taking requires you (a human) to maintain the wiki — updating cross-references, filing new notes, catching contradictions. Humans are bad at this at scale. In the LLM Wiki pattern, the LLM does all the maintenance. You only source and query. The workload inversion is the whole point.

What about privacy — isn’t my wiki on someone else’s server?

Not necessarily. Your raw sources and wiki are just markdown files on your local machine or your own cloud storage. The LLM processes them locally or via an API you choose. You control where the data goes. For maximum privacy, use a local LLM (Ollama) with the wiki stored on your own machine.
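As a rough sketch of the fully local option, the snippet below talks to an Ollama server on your own machine using only the Python standard library. It assumes Ollama is running at its default local address and that a model has already been pulled. The model name "llama3" is a placeholder, so substitute whichever model you have installed.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request; nothing is sent until urlopen is called."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as reply:
        return json.loads(reply.read())["response"]
```

Because the request goes to localhost, the text of your sources never leaves your computer.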

Where can I read Karpathy’s original idea file?

It’s on GitHub Gist: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f. Paste it into your AI Agent’s context and it will guide you through building your own wiki.

Disclaimer: Unless otherwise specified or noted, all articles on this site are co-publications with AI. Any individual or organization is prohibited from copying, misappropriating, collecting, or publishing the content of this site to any website, book, or other media platform without the prior consent of this site. If any content on this site infringes upon the legitimate rights and interests of the original author, please contact us for processing.
