Semantic Search
bkmr offers powerful semantic search capabilities, allowing you to find relevant content based on meaning rather than just keywords. This AI-powered feature helps developers locate information even when they don't remember the exact terms or tags.
Semantic search uses AI embeddings (vector representations of text) to capture the meaning of your bookmarks and queries. This allows bkmr to find content that's conceptually related, even when it doesn't contain the exact search terms.
To use semantic search, you need:
- An OpenAI API key set in the OPENAI_API_KEY environment variable
- The --openai flag when running commands that use embeddings
# Enable OpenAI embeddings and search for conceptually similar content
bkmr --openai sem-search "containerized application security"
# Limit results to top 5 matches
bkmr --openai sem-search "event-driven architecture" --limit 5
# Non-interactive mode
bkmr --openai sem-search "microservice patterns" --np
Semantic search results work seamlessly with the action system. Each result will trigger the appropriate action based on its content type:
# Find and render documentation about Kubernetes
bkmr --openai sem-search "kubernetes pod configuration"
# Find and execute shell scripts related to deployment
bkmr --openai sem-search "deployment automation script"
# Find and copy code snippets for error handling
bkmr --openai sem-search "error handling patterns"
Not all content benefits from semantic embeddings. By default, new bookmarks are not marked as embeddable to save API costs.
# Mark a bookmark as embeddable (will generate embeddings)
bkmr set-embeddable 123 --enable
# Mark a bookmark as non-embeddable
bkmr set-embeddable 123 --disable
# Backfill embeddings for all embeddable bookmarks
bkmr --openai backfill
# Preview what would be backfilled without making changes
bkmr --openai backfill --dry-run
When using semantic search without the --np flag, you'll get an interactive interface:
- Results are displayed with their similarity scores
- You can select which result(s) to open
- The appropriate action will be executed based on content type
You can import text documents to make them searchable via semantic search:
# Import text documents from an NDJSON file
bkmr --openai load-texts path/to/documents.jsonl
# Preview importing without making changes
bkmr --openai load-texts path/to/documents.jsonl --dry-run
The file should be in NDJSON format (one JSON object per line):
{"id": "doc1.md", "content": "This is the content of document 1."}
{"id": "doc2.md", "content": "This is the content of document 2."}
When working with markdown file references, bkmr can automatically embed the file content for semantic search when the file changes:
# Add a markdown file reference with embedding enabled
bkmr --openai add "~/documents/research.md" research,notes --type md
# The content is automatically read, embedded, and a content hash is stored
When you access the bookmark later:
- The file is read again
- If the content has changed (detected via content hash), a new embedding is generated
- The markdown is rendered with the updated content
This ensures your semantic search always uses the latest version of your documents without manual intervention.
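The change detection described above can be sketched in a few lines of Python. This is only an illustration of the idea, not bkmr's actual implementation: the function names `content_hash` and `needs_reembedding` are hypothetical, and SHA-256 is an assumption (the hash algorithm bkmr uses is not stated here).

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """Return a hex digest of the text (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reembedding(current_text: str, stored_hash: Optional[str]) -> bool:
    """True when no hash is stored yet or the content no longer matches it."""
    return stored_hash is None or content_hash(current_text) != stored_hash


# Simulate three accesses to the same file: new, unchanged, then edited.
stored = None
calls = 0  # counts how often an embedding request would be made
for version in ["# Notes\nfirst draft", "# Notes\nfirst draft", "# Notes\nsecond draft"]:
    if needs_reembedding(version, stored):
        calls += 1  # a real tool would call the embedding API here
        stored = content_hash(version)

print(calls)  # 2: the unchanged second access is skipped
```

Comparing hashes rather than full text keeps the stored state small, and an unchanged file never triggers an API call.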
Semantic search transforms how developers access information:
- Concept-based retrieval - Find information based on concepts, not just keywords
- Natural language queries - Search the way you think, not how you tagged content
- Comprehensive knowledge base - Build a personal AI-powered documentation system
- Action-ready results - Results are immediately actionable based on content type
- Up-to-date content - File content is automatically re-embedded when it changes
- bkmr uses OpenAI's text-embedding-ada-002 model by default
- Only portions of bookmarks marked as embeddable are sent to OpenAI for embedding generation
- Embeddings and content hashes are stored locally in your database
- Similarity is calculated using cosine similarity between vector representations
- File content is tracked using content hashes to minimize unnecessary API calls
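To make the cosine-similarity step concrete, here is a minimal Python sketch of ranking stored vectors against a query vector. It is not bkmr's code: the helper names (`cosine_similarity`, `rank`), the bookmark ids, and the 3-dimensional toy vectors are all made up for illustration. Real text-embedding-ada-002 vectors have 1536 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank(query_vec, stored, limit=5):
    """Return up to `limit` (score, id) pairs, highest similarity first."""
    scored = [(cosine_similarity(query_vec, vec), item_id)
              for item_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical toy embeddings; the values are invented for the example.
bookmarks = {
    "k8s-pods": [0.9, 0.1, 0.0],
    "docker-security": [0.7, 0.6, 0.1],
    "pasta-recipe": [0.0, 0.1, 0.9],
}
query = [0.8, 0.3, 0.0]

top = rank(query, bookmarks, limit=2)
for score, item_id in top:
    print(f"{score:.3f}  {item_id}")
```

Because cosine similarity depends only on direction, not vector length, two texts of very different sizes can still score as close matches when they cover the same topic.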
Not all content types benefit equally from embeddings. Consider enabling embeddings for:
- Technical documentation and notes
- Complex code snippets with explanatory comments
- Project descriptions and requirements
- Reference materials and guides
- Markdown files that change frequently
Content that may not benefit as much:
- Very short snippets or one-liners
- URLs without descriptive content
- Binary files or executables
When using the OpenAI integration:
- Content from your bookmarks is sent to OpenAI's API for embedding generation
- No content is stored by OpenAI, but it may be used to improve their services
- If you have privacy concerns, consider carefully which bookmarks you mark as embeddable
Semantic search works with template-enabled content but searches the template itself rather than rendered content. Keep this in mind when creating searchable templates.
- Search and Discovery - Full-text search and tags
- Configuration - OpenAI API key setup
- Content Types - Understanding content types
- Advanced Workflows - Power user patterns