Karakeep Semantic Search

Adds vector search to Karakeep bookmarks using Qdrant + OpenAI/Ollama embeddings.

Karakeep Semantic Search by @jamesbrooksco adds vector-based semantic search to Karakeep, the self-hosted bookmark manager. Karakeep is excellent for hoarding bookmarks, but its built-in search is keyword-based: you can only find things if you remember the exact words. This sidecar service converts bookmark content into vector embeddings using OpenAI or Ollama, stores them in an embedded Qdrant vector database, and exposes a REST API for semantic search.

The result: you can search your bookmarks by meaning rather than exact wording. 'That article about getting things done' finds your GTD resources even if they never mention those words; 'the tutorial about deploying React apps' finds relevant bookmarks whether they say 'deploy', 'ship', or 'publish'. For heavy bookmark users with thousands of saved items, this turns Karakeep from a bookmark graveyard into a genuinely useful knowledge retrieval system.

The architecture is clean and self-contained: a single Docker container bundles the search service with Qdrant, so there is no separate database to manage. It auto-syncs with Karakeep at configurable intervals, indexes new bookmarks automatically, and includes a ready-to-use OpenClaw skill in the repo. The skill lets your agent search your bookmark collection through conversation ('find my bookmarks about prompt engineering'), adding another layer of accessibility to your saved knowledge.
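A single-container deployment might look like the sketch below. The image name, port, and volume path are assumptions for illustration (check the repo's README for the real values); KARAKEEP_URL and KARAKEEP_API_KEY are placeholder names for the Karakeep connection settings, while OPENAI_API_KEY and SYNC_INTERVAL_MINUTES are the variables mentioned elsewhere in this entry:

```shell
# Sketch only: image name, port, and volume path are assumptions,
# not the project's documented values.
docker run -d \
  --name karakeep-semantic-search \
  -p 8000:8000 \
  -e KARAKEEP_URL="http://karakeep:3000" \
  -e KARAKEEP_API_KEY="your-karakeep-api-key" \
  -e OPENAI_API_KEY="sk-..." \
  -e SYNC_INTERVAL_MINUTES=10 \
  -v "$(pwd)/qdrant-data:/data" \
  ghcr.io/jamesbrooksco/karakeep-semantic-search:latest
```

The `-v` mount is what persists the embedded Qdrant store across container restarts, which is why no separate Qdrant installation is needed.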

Tags: search, vector, bookmarks, embeddings

Category: knowledge

Tips

  • Use OpenAI's text-embedding-3-small model for cost-effective embeddings — it's accurate and cheap enough for thousands of bookmarks
  • Alternatively, use Ollama with a local embedding model for fully self-hosted, zero-API-cost operation
  • Set SYNC_INTERVAL_MINUTES to 5-10 for near-real-time indexing of new bookmarks without excessive API calls
  • Install the included OpenClaw skill from the repo's skill/ directory so your agent can search bookmarks through conversation
  • Pair with Karakeep's browser extension for a smooth workflow: save bookmark → auto-indexed → searchable via agent within minutes
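Once the service is running, a semantic query is a single HTTP call. The endpoint path, parameter names, and port below are hypothetical illustrations of the REST API described above, not documented values:

```shell
# Hypothetical query sketch -- the real endpoint path and parameters
# may differ; consult the service's API documentation.
curl -s --get "http://localhost:8000/search" \
  --data-urlencode "q=that article about getting things done" \
  --data-urlencode "limit=5"
```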

Community Feedback

Karakeep is great for hoarding bookmarks, but its search is keyword-based. This sidecar service adds semantic search — find bookmarks by meaning, not just exact words.

— GitHub

Gave my AI agent persistent semantic memory on a self-hosted bookmark manager. Now I can ask 'what was that article about X' and actually find it.

— Reddit r/selfhosted

Karakeep + semantic search + OpenClaw skill = never losing a bookmark again. The Qdrant integration is smooth and the Docker setup is one-command.

— OpenClaw Community

Frequently Asked Questions

Do I need Karakeep already set up?

Yes. This is a sidecar service that enhances an existing Karakeep installation. You need a running Karakeep instance with an API key. Karakeep is a free, self-hosted bookmark manager available at github.com/karakeep-app/karakeep.

How much does the OpenAI embedding API cost?

text-embedding-3-small costs $0.02 per million tokens. A typical bookmark might use 500-1000 tokens, so indexing 10,000 bookmarks costs about $0.10-0.20. Ongoing costs for new bookmarks are negligible.
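The arithmetic behind those figures, for anyone sizing a larger collection:

```shell
# Cost of indexing 10,000 bookmarks at 500-1000 tokens each,
# with text-embedding-3-small priced at $0.02 per 1M tokens.
awk 'BEGIN {
  price = 0.02 / 1000000                        # dollars per token
  printf "low:  $%.2f\n", 10000 * 500  * price  # 5M tokens
  printf "high: $%.2f\n", 10000 * 1000 * price  # 10M tokens
}'
# low:  $0.10
# high: $0.20
```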

Can I use local embeddings instead of OpenAI?

Yes. Set OLLAMA_URL instead of OPENAI_API_KEY to use Ollama with any local embedding model. This makes the entire setup self-hosted with zero external API dependencies.
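A fully local configuration could look like the following sketch. Only OLLAMA_URL is named in this entry; the model choice (nomic-embed-text is a common Ollama embedding model) and the host address are assumptions:

```shell
# Assumption: the service reads OLLAMA_URL and uses a model already
# pulled into your local Ollama instance.
ollama pull nomic-embed-text
docker run -d \
  -e KARAKEEP_URL="http://karakeep:3000" \
  -e KARAKEEP_API_KEY="your-karakeep-api-key" \
  -e OLLAMA_URL="http://host.docker.internal:11434" \
  ghcr.io/jamesbrooksco/karakeep-semantic-search:latest
```

With this setup, embeddings never leave your machine and there are no per-token API costs.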

Does the Qdrant database need separate management?

No. Qdrant is bundled inside the single Docker container. Data is persisted via a volume mount. No separate Qdrant installation or management is needed — it's truly a single-container deployment.