Engram Memory

Local-first memory plugin for OpenClaw. LLM-powered extraction, markdown storage, hybrid search via QMD.

Engram is a local-first memory plugin that gives OpenClaw agents persistent, searchable long-term memory across conversations, while keeping everything on your machine as plain markdown files. Built by Joshua Warren, it's the most feature-rich open-source memory solution available, with LLM-powered extraction, hybrid search via QMD (BM25 + vector + reranking), entity tracking, lifecycle management, governance systems, and over 60 configuration options.

The plugin operates in three phases: Recall (injecting relevant memories before each conversation), Buffer (accumulating content after each turn until a trigger fires), and Extract (periodically using an LLM to extract structured memories). Memories are stored as markdown files with YAML frontmatter, categorized into types like fact, decision, preference, correction, relationship, principle, and more. They're fully portable, git-friendly, and can be backed up with standard tools.

Engram differentiates itself from cloud-based alternatives by being entirely local. You can use OpenAI for extraction and reranking, or run fully offline with Ollama, LM Studio, or any OpenAI-compatible endpoint; the local-llm-heavy preset is optimized for completely offline operation. This makes it ideal for privacy-conscious users who don't want their conversations sent to external servers.

Beyond OpenClaw, Engram also works as an MCP server for Codex CLI, Claude Code, and any MCP-compatible client. The standalone HTTP server supports multi-tenant setups where multiple agent harnesses share a single Engram instance. With 672 tests, an evaluation harness with benchmark packs, and a governance system with review queues and shadow modes, it's built for serious production use. The plugin has over 4,300 weekly downloads and is one of the most actively maintained OpenClaw plugins with frequent updates.
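The three-phase loop described above can be sketched in miniature. This is an illustrative model only; the class and method names below are invented for the sketch and are not Engram's actual API, and a real extraction step calls an LLM rather than copying text:

```python
# Illustrative sketch of the Recall / Buffer / Extract loop.
# All names are hypothetical; this is not the plugin's real API.

class MemoryPipeline:
    def __init__(self, extract_after_turns=5):
        self.store = []      # extracted memories (markdown files in the real plugin)
        self.buffer = []     # raw turns accumulated since the last extraction
        self.extract_after_turns = extract_after_turns

    def recall(self, query):
        # Phase 1: before each conversation, surface relevant memories.
        # (Engram uses QMD hybrid search; substring match stands in here.)
        return [m for m in self.store if query.lower() in m.lower()]

    def buffer_turn(self, turn):
        # Phase 2: accumulate content after each turn until a trigger fires.
        self.buffer.append(turn)
        if len(self.buffer) >= self.extract_after_turns:
            self.extract()

    def extract(self):
        # Phase 3: periodically distill buffered turns into structured
        # memories. (The real plugin calls an LLM; we keep the raw text.)
        self.store.extend(self.buffer)
        self.buffer.clear()
```

The point of the structure is that recall reads only from the extracted store, never from the raw buffer, so the agent always sees distilled memories rather than full transcripts.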

Tags: memory, local-first, search

Use Cases

  • Privacy-first personal AI assistant that never sends conversation data to external servers
  • Developer assistant that remembers code decisions, debugging sessions, and project context across sessions
  • Multi-agent team where each agent has isolated memory namespaces
  • Cross-tool memory — use with OpenClaw, Codex CLI, and Claude Code via MCP
  • Research assistant that accumulates knowledge over weeks of investigation

Tips

  • Start with the 'balanced' preset and customize later as you understand what works for your use case
  • Run 'openclaw engram doctor' to diagnose configuration issues and get remediation hints
  • Use 'openclaw engram setup --json' to validate your config and scaffold directories
  • For fully offline operation, use the 'local-llm-heavy' preset with Ollama
  • Back up your memory directory with git for version control and sync across machines
  • Use namespaces for multi-agent deployments to keep memory isolated
  • Run 'openclaw engram config-review' for opinionated tuning recommendations

Known Issues & Gotchas

  • The 60+ configuration options can be overwhelming — start with a preset (balanced, conservative, research-max, or local-llm-heavy)
  • Extraction uses LLM calls which cost money if using OpenAI — monitor usage
  • Local LLM extraction via Ollama works but quality is lower than GPT-4o for nuanced memory categorization
  • Pin the plugin version on install ('--pin' flag) to avoid unexpected breaking changes on updates
  • Memory files accumulate over time — set up lifecycle management rules early
  • If using MCP standalone server, secure it with authentication tokens
  • The evaluation harness is for power users — most people should just use presets

Alternatives

  • Supermemory
  • Mem0 Memory
  • QMD (built-in)

Community Feedback

S tier — QMD. Free, local, surgical. Grabs only what the agent needs instead of loading everything. This is the one.

— Reddit r/clawdbot

Engram uses hybrid search (BM25 + vector + reranking via QMD) to find semantically relevant memories. It doesn't just match keywords — it understands what you're working on and surfaces the right context.

— GitHub

Local-first memory plugin for OpenClaw AI agents. LLM-powered extraction, plain markdown storage, hybrid search via QMD. Gives agents persistent long-term memory across conversations.

— OpenClaw Directory
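The hybrid search the reviews describe (keyword and vector signals fused, then reranked) can be sketched roughly as follows. Engram delegates this to QMD, so everything below is illustrative: the term-overlap function stands in for BM25, the fusion weight `alpha` is an invented parameter, and a real reranker would reorder the fused top-k:

```python
import math

def keyword_score(query, doc):
    # Stand-in for BM25: fraction of query terms present in the document.
    terms = query.lower().split()
    return sum(t in doc.lower() for t in terms) / len(terms)

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, query_vec, docs, k=3, alpha=0.5):
    # docs: list of (text, embedding) pairs. Fuse keyword and vector
    # signals with a weighted sum and keep the top k candidates; in a
    # full pipeline a reranker model would then reorder these.
    scored = [
        (alpha * keyword_score(query, text)
         + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in docs
    ]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]
```

The design point is the one the feedback highlights: because both lexical overlap and semantic similarity contribute to the score, a memory can surface even when it shares no exact keywords with the query.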

Frequently Asked Questions

Is Engram free?

Yes, Engram is open-source and free to use. The only cost is LLM usage for memory extraction — you can use OpenAI (paid) or run fully free with a local LLM via Ollama.

Where are memories stored?

All memories are stored as plain markdown files with YAML frontmatter on your local filesystem. No database required. You can read, edit, back up, and git-version your memories with standard tools.
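As an illustration, a stored memory might look like the file below. Only the overall shape (YAML frontmatter plus a markdown body) and the `type` categories are confirmed by this page; the other field names (`created`, `tags`) are guesses, not Engram's actual schema:

```markdown
---
type: preference
created: 2025-01-15
tags: [editor, tooling]
---

The user prefers tabs over spaces in Go files.
```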

Can I use Engram with Codex CLI or Claude Code?

Yes. Engram runs as an MCP server (HTTP or stdio) that works with Codex CLI, Claude Code, and any MCP-compatible client. Start the HTTP server and add it to your tool's MCP config.

How does Engram compare to Supermemory?

Engram is local-first, free, and open-source — all data stays on your machine. Supermemory is cloud-based, easier to set up, and scored higher on benchmarks. Choose Engram for privacy and control, Supermemory for convenience.

What presets are available?

Four presets: 'conservative' (minimal, safe defaults), 'balanced' (recommended starting point), 'research-max' (all features enabled), and 'local-llm-heavy' (optimized for fully offline operation with Ollama).

Does Engram work offline?

Yes. Use the 'local-llm-heavy' preset with Ollama or LM Studio for fully offline operation. No internet connection or API keys needed — everything runs locally on your machine.

How do I troubleshoot Engram?

Run 'openclaw engram doctor' for health diagnostics with remediation hints, 'openclaw engram setup --json' to validate config, and 'openclaw engram config-review' for tuning recommendations.

Configuration Examples

Balanced preset with OpenAI

{
  "plugins": {
    "entries": {
      "openclaw-engram": {
        "enabled": true,
        "config": {
          "preset": "balanced"
        }
      }
    }
  }
}

Fully local with Ollama

{
  "plugins": {
    "entries": {
      "openclaw-engram": {
        "enabled": true,
        "config": {
          "preset": "local-llm-heavy",
          "llm": {
            "provider": "ollama",
            "model": "llama3.1"
          }
        }
      }
    }
  }
}

MCP server for Codex CLI

# Start the server:
export OPENCLAW_ENGRAM_ACCESS_TOKEN="$(openssl rand -base64 32)"
openclaw engram access http-serve --host 127.0.0.1 --port 4318 --token "$OPENCLAW_ENGRAM_ACCESS_TOKEN"

# In ~/.codex/config.toml:
[mcp_servers.engram]
url = "http://127.0.0.1:4318/mcp"
bearer_token_env_var = "OPENCLAW_ENGRAM_ACCESS_TOKEN"

Installation

openclaw plugins install @joshuaswarren/openclaw-engram