March 29, 2026 · Mihir Choudhary · 8 min read

Nora Remembers So Claude Doesn't Have To

A real story from today: Claude spent 6 attempts figuring out how to deploy a website it had already deployed 3 hours earlier. The session history was right there in a SQLite database. Nobody asked.

The problem, in one sentence

AI coding agents are stateless. Every session starts from zero. The pattern you discovered at 2pm is forgotten by 5pm. The deployment sequence you perfected last Tuesday doesn't exist on Wednesday. Your agent makes the same mistakes, tries the same dead ends, rediscovers the same solutions — every single time.

This isn't a hypothetical. It happened to me today, while building the Kernora website. And because I was building Kernora at the time, I had the unique experience of watching the problem and the fix collide in real time.

What actually happened

I asked Claude to deploy the Kernora website to Firebase Hosting. Claude had done this exact thing three hours earlier in the same project. The deployment target was the same. The Firebase project was the same. The Cloud Shell was still open in the same browser tab.

Claude didn't remember any of it.

Here's what the next twenty minutes looked like:

Attempt 1: firebase deploy --only hosting from the VM. Failed — no auth token.
Attempt 2: Try to extract OAuth token from Firebase console via JavaScript injection. Failed — console uses proto-over-HTTP2, not standard fetch.
Attempt 3: Monkey-patch fetch and XHR to intercept tokens. Failed — no tokens captured.
Attempt 4: Paste commands into Cloud Shell via ClipboardEvent. Failed — xterm.js strips newlines, commands never execute.
Attempt 5: Synthetic KeyboardEvent dispatches. Failed — garbled output.
Attempt 6: I finally said: "review this session and look how you deployed the site just a few hours back." Claude read the transcript and found the answer in seconds.

The answer was simple. The computer tool's type action works with Cloud Shell's xterm.js terminal. Type the command, press Return. That's it. Claude had used this exact method three hours earlier. It worked on the first try.

Five failed attempts before the sixth one worked. Twenty minutes. Because the agent couldn't search its own history.

The second fumble, same session

After figuring out the deploy method, Claude needed to push code to GitHub. It didn't know the repository path. It tried kernora/kernora. Wrong. It tried user/kernora. Wrong. It dug through .git/FETCH_HEAD and found the correct path: kernora-ai/kernora.

That path had been used in every single session for the past week. Nora's database had it. Nobody queried Nora.

What Nora would have done

Kernora captures every AI coding session — the full transcript, tools used, files touched, commands run, decisions made. It stores them in a local SQLite database. It analyzes them for patterns, bugs, playbooks, and anti-patterns.

But capturing isn't enough. The data was sitting there, in ~/.nora/echo.db, with 6 sessions, 16 patterns, 11 bugs, 7 architectural decisions. Nobody asked for it because nobody knew to ask.

The missing piece was a pre-hook — a script that fires before Claude processes each prompt, searches Nora's database for anything relevant, and injects that context automatically. The user doesn't search. The agent doesn't search. The system just knows.

So I built it. Today. In this session. Here's how it works.

The pre-hook: 180 lines of stdlib Python

Claude Code supports hook events. The one we need is UserPromptSubmit, which fires when the user submits a prompt, before Claude processes it. The hook receives the prompt as JSON on stdin. Anything the hook prints to stdout becomes context that Claude sees.

The hook script, nora_context.py, does four things:

# 1. Extract keywords from the user's prompt
keywords = extract_keywords(prompt)
# "how should I handle SQLite connections" → ["sqlite", "connections", "handle"]

# 2. Search Nora's database across four tables
patterns  = search_patterns(conn, keywords)   # reusable code patterns
decisions = search_decisions(conn, keywords)  # architectural decisions + rationale
bugs      = search_bugs(conn, keywords)       # known bugs, open or resolved
insights  = search_insights(conn, keywords)   # session playbooks + anti-patterns

# 3. Format results as context
context = format_context(patterns, decisions, bugs, insights)

# 4. Print to stdout → Claude sees this before responding
print(context)

No external dependencies. No network calls. No API keys. It reads a local SQLite file and prints text. The entire script is stdlib Python, runs in under 50ms, and has a 3-second timeout in case the database is locked.
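To make steps 1 and 2 concrete, here is a minimal sketch of the keyword extraction and one of the four searches. It is not the actual nora_context.py: the table and column names (`patterns`, `description`, `effectiveness`), the stopword list, and the helpers `open_readonly` and `build_context` are assumptions for illustration, and the keyword order it produces may differ from the real script's.

```python
import json
import re
import sqlite3
from pathlib import Path

# Hypothetical stopword list; the real script's list is not shown in the post.
STOPWORDS = {"how", "should", "the", "and", "for", "with", "that", "this", "from"}


def extract_keywords(prompt: str, limit: int = 8) -> list[str]:
    """Lowercase the prompt, drop stopwords and very short tokens, keep first-seen order."""
    seen: list[str] = []
    for word in re.findall(r"[a-z0-9_]+", prompt.lower()):
        if len(word) > 2 and word not in STOPWORDS and word not in seen:
            seen.append(word)
    return seen[:limit]


def open_readonly(db_path: str, timeout: float = 2.0) -> sqlite3.Connection:
    """Open the database read-only; `timeout` bounds how long to wait on a lock."""
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True, timeout=timeout)


def search_patterns(conn: sqlite3.Connection, keywords: list[str], limit: int = 3) -> list[tuple]:
    """Return patterns matching any keyword, most effective first (assumed schema)."""
    if not keywords:
        return []
    where = " OR ".join("description LIKE ?" for _ in keywords)
    params = [f"%{kw}%" for kw in keywords]
    cursor = conn.execute(
        f"SELECT description, effectiveness FROM patterns WHERE {where} "
        "ORDER BY effectiveness DESC LIMIT ?",
        (*params, limit),
    )
    return cursor.fetchall()


def build_context(raw_payload: str, conn: sqlite3.Connection) -> str:
    """Turn the hook's stdin payload into context text; empty string if nothing applies."""
    try:
        prompt = json.loads(raw_payload).get("prompt", "")
    except (json.JSONDecodeError, AttributeError):
        return ""  # malformed payload: stay silent rather than break the prompt
    rows = search_patterns(conn, extract_keywords(prompt))
    return "\n".join(f"  • {desc} (effectiveness: {eff})" for desc, eff in rows)
```

In a real hook the last step would be something like `print(build_context(sys.stdin.read(), open_readonly("~/.nora/echo.db")))`. Opening the file read-only with a short lock timeout is one way to keep a busy daemon from stalling the prompt; whether Nora does exactly this is not stated here.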

What Claude sees

When I type "how should I handle SQLite connections in the daemon", Claude now receives this context before it starts thinking:

─── Nora Context (from past sessions) ───
[Nora Memory — Relevant Patterns]
  • Use WAL mode for SQLite when daemon has concurrent readers/writers
    (effectiveness: 0.95)
  • Always start daemon with init_db() before accepting connections
    (effectiveness: 0.9)

[Nora Memory — Past Decisions]
  • Use SQLite over Firestore for local storage
    Why: Privacy-first architecture, no cloud dependency
  • Per-request SQLite connections instead of shared connection pool

[Nora Memory — Known Bugs (unresolved)]
  • [high] SQLite locked on concurrent writes
    File: db.py
  • [high] Thread-unsafe SQLite connection sharing
    File: daemon_v2.py
─── End Nora Context ───

Claude doesn't just answer the question. It answers with institutional memory. It knows about the WAL mode pattern. It knows about the known concurrency bugs. It knows the architectural decision to use per-request connections. All from past sessions that the user has long forgotten.
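The formatting step that produces a block like the one above can be quite small. This is a sketch under assumptions, not Nora's actual formatter: the function signature, the `max_items` cap, and the choice to pass section titles in from the caller are all mine. The banner lines are copied from the sample output.

```python
HEADER = "─── Nora Context (from past sessions) ───"
FOOTER = "─── End Nora Context ───"


def format_context(sections: dict[str, list[str]], max_items: int = 3) -> str:
    """Render non-empty sections between the banner lines.

    Returns an empty string when nothing matched, so the hook prints nothing
    and Claude sees no extra context for unrelated prompts.
    """
    blocks = []
    for title, items in sections.items():
        if items:
            bullets = "\n".join(f"  • {item}" for item in items[:max_items])
            blocks.append(f"[{title}]\n{bullets}")
    if not blocks:
        return ""
    return "\n".join([HEADER, "\n\n".join(blocks), FOOTER])


if __name__ == "__main__":
    print(format_context({
        "Relevant Patterns": ["Use WAL mode for SQLite when daemon has concurrent readers/writers"],
        "Known Bugs (unresolved)": ["[high] SQLite locked on concurrent writes"],
    }))
```

Capping each section and skipping empty ones keeps the injected text short, which matters because it is prepended to every prompt.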

The deploy problem, solved

Let's replay the deploy scenario with Nora's pre-hook active.

Without Nora

1. Try firebase deploy (no auth) ✗
2. Extract OAuth token ✗
3. Monkey-patch fetch ✗
4. ClipboardEvent paste ✗
5. Synthetic keyboard events ✗
6. Read transcript manually ✓

~20 minutes, 6 attempts

With Nora

1. User types "deploy to Firebase"
2. Pre-hook fires, searches DB
3. Finds: "computer tool type action works with Cloud Shell xterm.js"
4. Claude types command in Cloud Shell ✓

~30 seconds, first attempt

The difference isn't intelligence. Claude is the same model either way. The difference is memory. One version starts from zero. The other starts from everything you've ever done.

How to set it up

The hook registration lives in .claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 ~/.claude/hooks/nora_context.py",
            "timeout": 3
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 ~/.claude/hooks/kernora_hook.py",
            "async": true
          }
        ]
      }
    ]
  }
}

Two hooks. UserPromptSubmit fires on every prompt, before Claude processes it: Nora searches and injects context. Stop fires when Claude finishes responding: Nora captures the transcript for future searches. The capture is async so it never slows down your workflow.

Or just install Kernora and it registers both automatically:

curl -fsSL https://kernora.ai/install | bash
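If you would rather not pipe a script into bash, the registration step can be approximated by hand. The sketch below merges a hook entry into an existing settings file without clobbering other hooks and without adding duplicates. It is an illustration only, not what the Kernora installer actually runs; the function name `register_hook` is hypothetical.

```python
import json
from pathlib import Path


def register_hook(settings_path: Path, event: str, command: str, **options) -> bool:
    """Add a command hook for `event` unless the same command is already registered.

    Returns True when the settings file was changed, False when it was left alone.
    """
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    groups = settings.setdefault("hooks", {}).setdefault(event, [])
    for group in groups:
        if any(hook.get("command") == command for hook in group.get("hooks", [])):
            return False  # already registered; keep the file untouched
    groups.append({
        "matcher": "",
        "hooks": [{"type": "command", "command": command, **options}],
    })
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return True
```

Calling `register_hook(Path(".claude/settings.json"), "UserPromptSubmit", "python3 ~/.claude/hooks/nora_context.py", timeout=3)` would produce the first entry in the configuration shown earlier; running it a second time changes nothing.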

Why this matters beyond deployment scripts

The deploy story is a clean example, but the real value compounds over weeks and months. Consider what Nora remembers:

Patterns with effectiveness scores. "Profile isolation with ProfileManager.getActiveProfileId()" scored 0.98 effectiveness across our sessions — it's the most reliable pattern in the codebase. When anyone on the team asks about profile management, Nora surfaces that pattern first.

Bugs that keep coming back. "SQLite locked on concurrent writes" has been reported three times across different sessions. Nora flags it every time someone touches the database layer. The bug doesn't get rediscovered — it gets addressed.

Decisions with rationale. Six months from now, when someone asks "why don't we use Firestore?", Nora surfaces the original decision: "Privacy-first architecture, no cloud dependency." The reasoning doesn't get lost when people rotate.

Anti-patterns from real failures. "Started building before reading existing code" was extracted from a session where we wasted two hours reimplementing something that already existed. Nora warns the next session before it repeats the mistake.

The architecture in 30 seconds

Session ends → hook.py captures transcript → sends to Nora daemon via Unix socket → daemon stores in SQLite → analyzer extracts patterns, bugs, decisions (two-phase: deterministic + LLM) → stored in echo.db

Next session starts → user types prompt → nora_context.py extracts keywords → searches echo.db → prints matching context to stdout → Claude sees it before responding
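The hand-off in the first flow, from the capture hook to the daemon over a Unix socket, could look roughly like this. The wire format is not described in this post, so the newline-delimited JSON framing, the function name `send_to_daemon`, and the payload shape are assumptions made for the sketch.

```python
import json
import socket


def send_to_daemon(socket_path: str, payload: dict, timeout: float = 2.0) -> bool:
    """Send one newline-terminated JSON message to a local Unix socket.

    Returns False instead of raising when the daemon is not reachable,
    so a stopped daemon never turns into a failed hook.
    """
    message = json.dumps(payload).encode("utf-8") + b"\n"
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.settimeout(timeout)
            client.connect(socket_path)
            client.sendall(message)
        return True
    except OSError:  # covers missing socket file, refused connection, timeout
        return False
```

A Unix socket keeps the traffic on the machine, which fits the "everything is local" design: there is no port to expose and nothing leaves the host.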

Everything is local. The database is a SQLite file at ~/.nora/echo.db. The LLM analysis uses your own API key via LiteLLM — Anthropic, Google, OpenAI, Bedrock, or Ollama. Kernora never sees your code, your sessions, or your API key.

What I learned building this today

The pre-hook is 180 lines of Python. It took less than an hour to write, test, and ship. The hard part wasn't the code — it was recognizing the gap.

Nora had been capturing sessions for days. Six sessions, 16 patterns, 11 bugs, 7 decisions — all sitting in a SQLite database. The data was there. The query engine was there. What was missing was the trigger: the automatic, invisible moment where the system says "I know something relevant" before anyone asks.

That's the difference between a search engine and a memory. A search engine waits for you to ask. A memory volunteers what you need before you know you need it.

Nora is the memory.

Give your AI sessions a memory

Local. Private. Automatic. Works with Claude Code, Kiro, Cursor, and VS Code agents.

curl -fsSL https://kernora.ai/install | bash