Qdrant is an open-source vector search engine written in Rust. It provides a fast, scalable vector similarity search service with a convenient API.
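A vector similarity search service ranks stored vectors by closeness to a query vector. A minimal pure-Python sketch of the core idea (brute-force cosine similarity plus top-k ranking; a real engine like Qdrant uses approximate indexes such as HNSW to make this scale, so this is illustrative only):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, points, k=2):
    # Rank stored vectors by similarity to the query, best first.
    scored = sorted(points.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [point_id for point_id, _ in scored[:k]]

# Toy "collection": ids mapped to 2-d embeddings.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
print(top_k([1.0, 0.1], docs, k=2))  # "a" is the closest match
```

In practice you would call the Qdrant client over its API rather than scan vectors by hand; the brute-force loop above is only the semantics being approximated.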
The social mentions collected so far don't contain enough substantive user feedback to summarize what users think about Qdrant. They consist mainly of YouTube video titles without actual reviews or detailed discussion, and the one HackerNews mention concerns a different tool (an AI agent runtime) rather than Qdrant itself. An accurate sentiment summary would require detailed reviews, forum discussions, or social posts describing real user experiences.
Mentions (30d)
0
Reviews
0
Platforms
4
GitHub Stars
29,940
2,150 forks
Features
Use Cases
Industry
information technology & services
Employees
95
Funding Stage
Series B
Total Funding
$88.7M
1,590
GitHub followers
129
GitHub repos
29,940
GitHub stars
20
npm packages
40
HuggingFace models
423,508
npm downloads/wk
Show HN: Open-sourced AI Agent runtime (YAML-first)
Been running AI agents in production for a while and kept running into the same issues: controlling what they can do, tracking costs, debugging failures, and making it safe for real workloads.

So we built AgentRuntime, the infrastructure layer we wished we had. Not an agent framework, but the platform around agents: policies, memory, workflows, observability, cost tracking, RAG, and governance.

Agents and policies are defined in YAML, so it's infrastructure-as-code rather than a chatbot builder.

agent.yaml – declarative agent config:

name: support_agent

model:
  provider: anthropic
  name: claude-3-5-sonnet

context_assembly:
  enabled: true
  embeddings:
    provider: openai
    model: text-embedding-3-small
  providers:
    - type: knowledge
      config:
        sources: ["./docs"]
        top_k: 3

policies/safety.yaml – governance as code:

name: security-policy

rules:
  - id: block-file-deletion
    condition: tool.name == "file_delete"
    action: deny

CLI – run and inspect.

Create and run an agent:

agentctl agent create researcher --goal "Research AI safety" --llm gpt-4
agentctl agent run researcher
agentctl runs watch <run-id>

Manage policies:

agentctl policy list
agentctl policy activate security-policy 1.0.0

RAG – ingest docs and ground responses in your knowledge base:

agentctl context ingest ./docs
agentctl run --agent agent.yaml --goal "How do I deploy?"

Agent-level debugging:

agentctl debug -c agent.yaml -g "Analyze this dataset."

Cost tracking is exposed via the API (per agent/tenant), and the Web UI shows analytics. The workflow debugger (breakpoints, step-through) lives in the pkg layer; the CLI debug is for agent execution.

What's in there:

Governance: policy engine (CEL), risk scoring, encrypted audit logs, RBAC, multi-tenancy, fully YAML-configurable.

Orchestration: visual workflow designer (React Flow), DAG workflows, multi-agent coordination, conditional logic, plugin hot-reload, workflow marketplace.

Memory & Context: working memory, persistent memory, semantic memory, event log. Context assembly combines policies, workflow state, memory, tool outputs, and knowledge. RAG features: embeddings (OpenAI or local), SQLite for development, Postgres + vector stores in production.

Observability: cost attribution via API, SLA monitoring, distributed tracing (OpenTelemetry), Prometheus metrics, deterministic replay (5 modes).

Production: Kubernetes operator (Agent, Workflow, Policy CRDs), Helm charts, Istio config, auto-scaling, backup/restore, GraphQL + REST API.

Implementation: ~50k LOC of Go, hundreds of tests, built with production in mind.

Runs on: local — SQLite, in-memory runtime. Production — Postgres, Redis, Qdrant / Weaviate.

Happy to answer questions or help people get started.
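The safety policy above denies a specific tool call by matching a condition against it. As a rough illustration of how such a rule engine works (a toy stand-in for AgentRuntime's actual CEL engine; the `evaluate` function and the `field`/`equals` rule shape are invented for this sketch):

```python
def evaluate(tool_call, rules):
    # Return "deny" if any deny-rule's condition matches the tool call,
    # otherwise fall through to "allow".
    for rule in rules:
        matches = tool_call.get(rule["field"]) == rule["equals"]
        if matches and rule["action"] == "deny":
            return "deny"
    return "allow"

# Mirrors the YAML rule: tool.name == "file_delete" -> deny
rules = [{"id": "block-file-deletion", "field": "name",
          "equals": "file_delete", "action": "deny"}]
print(evaluate({"name": "file_delete"}, rules))  # deny
print(evaluate({"name": "file_read"}, rules))    # allow
```

A real CEL engine compiles arbitrary boolean expressions rather than single field comparisons, but the evaluate-then-deny flow is the same shape.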
Pricing found: $50
LLM Documentation accuracy solved for free with Buonaiuto-Doc4LLM, the MCP server that gives your AI assistant real, up-to-date docs instead of hallucinated APIs
LLMs often generate incorrect API calls because their knowledge is outdated. The result is code that looks convincing but relies on deprecated functions or ignores recent breaking changes. Buonaiuto Doc4LLM addresses this by giving AI tools accurate, version-aware documentation, for free, directly from official sources.

It fetches and stores documentation locally (React, Next.js, FastAPI, Pydantic, Stripe, Supabase, TypeScript, and more), making it available offline after the initial sync. Through the Model Context Protocol, it delivers only the relevant sections, enforces token limits, and validates library versions to prevent mismatches. The system also tracks documentation updates and surfaces only what has changed, keeping outputs aligned with the current state of each project. A built-in feedback loop measures which sources are genuinely useful, enabling continuous improvement.

Search is based on BM25 with TF-IDF scoring, with optional semantic retrieval via Qdrant and local embedding models such as sentence-transformers or Ollama. A lightweight FastAPI + HTMX dashboard provides access to indexed documentation, queries, and feedback insights. Compatible with Claude Code, Cursor, Zed, Cline, Continue, OpenAI Codex, and other MCP-enabled tools.

https://github.com/mbuon/Buonaiuto-Doc4LLM

submitted by /u/mbuon
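The post above says search is based on BM25. A minimal sketch of BM25 scoring over tokenized documents (pure Python; the defaults k1=1.5, b=0.75 are common textbook choices, not necessarily Doc4LLM's configuration):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    # docs: list of token lists; returns one BM25 score per document.
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                      # document frequency per term
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)                 # term frequency in this doc
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores

docs = [["react", "hooks", "api"], ["fastapi", "docs"], ["react", "router"]]
print(bm25_scores(["react", "api"], docs))
```

The document matching both query terms scores highest; documents sharing no terms score zero, which is why the post pairs BM25 with optional semantic retrieval for meaning-level matches.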
Built a tool to capture and search AI coding sessions across providers. Looking for feedback on the approach.
Core problem: AI sessions aren't searchable across providers. You solve something with Claude Code, need it again weeks later, can't find it. Start over.

What I built — three capture methods:

- API proxy for OpenAI/Anthropic/Google endpoints (zero code changes)
- Native hooks for Claude Code and Gemini CLI (structured session data via stdin)
- Browser extension for ChatGPT/Claude.ai

Everything flows into a unified search: hybrid semantic (embeddings) + keyword (BM25), with RRF fusion for ranking. Sub-second results across all providers.

Hook-level DLP: when Claude Code reads .env files, actual secrets never reach the model. It intercepts file reads, replaces values with [REDACTED:API_KEY] placeholders, and passes the sanitized version to Claude. The model can reason about variables without seeing credentials.

Architecture:

- Python FastAPI backend
- Qdrant for vector search (OpenAI embeddings, 1536d)
- Supabase (PostgreSQL) for session storage
- Next.js frontend

Privacy: everything runs locally or in your account. Export/delete anytime. Nothing shared.

PyPI package: https://pypi.org/project/rclm (hooks + proxy)
Live beta: reclaimllm.com

Questions for this community:

- Claude Code users: would you actually use hook-level capture, or is the transcript file enough?
- DLP approach: is interception at file-read too aggressive, or is post-hoc flagging insufficient?
- Missing features: what would make this actually useful vs. just interesting?
- Marketplace: given that sessions can be sanitized to a certain extent, would a marketplace where people share or sell their chat sessions make sense? I'm thinking primarily from an open-source perspective, since we're getting tied down to closed-source models.
- Enterprise: what enterprise uses can you think of for this service?

Honest feedback appreciated. If the approach is fundamentally wrong, I'd rather know now.

submitted by /u/Inevitable-Lack-8747
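The hybrid search in the post fuses semantic and keyword rankings with Reciprocal Rank Fusion (RRF). A minimal sketch of RRF (the constant k=60 is the value commonly used in the literature, not necessarily this tool's setting):

```python
def rrf_fuse(rankings, k=60):
    # Each ranked list contributes 1/(k + rank) to a document's score;
    # documents appearing high in multiple lists win.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["s3", "s1", "s2"]   # order from embedding search
keyword  = ["s1", "s2", "s4"]   # order from BM25
print(rrf_fuse([semantic, keyword]))
```

Note that "s1" wins despite topping only one list, because it places well in both; RRF needs only rank positions, so the two retrievers' incomparable raw scores never have to be normalized against each other.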
Gave Claude Code persistent memory across sessions — it actually remembers now
Been using Claude Code as my main coding assistant for months. The one thing that kept bugging me: every session starts blank. I'd re-explain my project structure, re-teach my conventions, and re-debug stuff we already solved together last week.

So I built a memory layer that hooks into Claude Code's session lifecycle. When a session ends, it parses the transcript and extracts useful stuff — patterns, errors, decisions, preferences. When a new session starts, it injects the relevant context automatically.

After a few sessions it gets pretty useful. Claude just knows my codebase conventions, remembers past errors, and knows which approaches worked. Like going from a stranger to a teammate who's been on the project for a while.

Setup is two config changes:

- MCP server in ~/.mcp.json (22 tools — search, save, episodes, graph, vault, etc.)
- Session hooks in ~/.claude/settings.json (start/stop triggers)

It also tracks procedure success rates with Wilson scoring, so "proven" workflows rank higher than approaches that failed before. And if you work on multiple projects, patterns that show up in 3+ projects get promoted to global scope.

Self-hosted, Rust, MIT licensed. Needs PostgreSQL + Qdrant (docker compose handles both).

GitHub: https://github.com/Nonanti/Alaz

Anyone else tried building memory/context systems around Claude Code? Curious what approaches others are taking.

submitted by /u/Nonantiy
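The Wilson scoring mentioned in the post ranks procedures by the lower bound of a confidence interval on their success rate, which penalizes small samples. A sketch of the standard Wilson lower-bound formula (z=1.96 gives a 95% interval; how the project actually applies it is not specified here):

```python
import math

def wilson_lower_bound(successes, trials, z=1.96):
    # Lower bound of the Wilson score interval for a binomial proportion.
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    center = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials
                           + z * z / (4 * trials * trials))
    return (center - margin) / denom

# A procedure that worked 9/10 times outranks one that worked 1/1:
print(wilson_lower_bound(9, 10))  # ~0.60
print(wilson_lower_bound(1, 1))   # ~0.21
```

Plain success-rate ranking would put the 1/1 procedure first; the lower bound rewards evidence, which is the point of calling a workflow "proven".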
A fully local, private alternative to Context7 that reduces your token usage
Context7 is great for pulling docs into your agent's context, but it routes everything through a cloud API and an MCP server. You have to buy a subscription, manage API keys, and work within their rate limits.

So I used Claude Code to build a local alternative. docmancer ingests documentation from GitBook, Mintlify, and other doc sites, chunks it, and indexes it locally using hybrid retrieval (BM25 + dense embeddings via Qdrant). Everything runs on your machine.

Once you've ingested a doc source, you install a skill into your agent (Claude Code, Codex, Cursor, and others), and the agent queries the CLI directly for only the chunks it needs. This drastically reduces token usage and saves a lot of context.

GitHub (MIT license): https://github.com/docmancer/docmancer

Give it a shot and let me know what you think. I'm looking for honest feedback from heavy users of Claude Code.

submitted by /u/galacticguardian90
Claude Code memory that fits in a single SQLite file
I kept re-explaining my stack to Claude every session. The memory tools I tried either spawned a process that ate gigs of RAM, or dropped vector search to stay light.

Built nan-forget with Claude Code over the last few weeks. Claude helped design the 3-stage retrieval pipeline (recognition → recall → spreading activation), wrote most of the SQLite migration from Qdrant, and caught edge cases in the vector search scoring I would have missed.

It stores memories in one SQLite file, ~3MB, no background services. npx nan-forget setup and you're done. 4 hooks save context as you work; you never call save.

"auth system" finds "We chose JWT with Clerk." Search by meaning, not keywords. Memories carry problem/solution/concepts fields, so a bug fix from March surfaces when you hit the same error in June. Old memories decay on a 30-day half-life: stale ones consolidate into summaries, active ones sharpen.

Same database across Claude Code (MCP), Codex, Cursor (REST API), and the terminal (CLI). No LLM calls for memory ops. Runs locally. Free and open source.

https://github.com/NaNMesh/nan-forget

Anyone else fighting context loss across sessions? What have you tried?

submitted by /u/NaNMesh
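The 30-day half-life mentioned above is simple exponential decay: a memory's weight halves every 30 days. A sketch of that weighting (illustrative only; nan-forget's actual decay function may differ):

```python
def decayed_weight(age_days, half_life_days=30.0):
    # Exponential decay: the weight halves every `half_life_days`.
    return 0.5 ** (age_days / half_life_days)

print(decayed_weight(0))    # 1.0  — fresh memory at full weight
print(decayed_weight(30))   # 0.5  — one half-life old
print(decayed_weight(60))   # 0.25 — two half-lives old
```

Multiplying retrieval scores by such a weight is a common way to let stale memories fade without deleting them outright, which matches the post's consolidate-rather-than-drop behavior.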
Built a memory system that actually works!!
A persistent memory system I've been building for Claude Code that gives LLM agents actual context continuity across sessions.

Benchmarks:
- LoCoMo: 90.8% (beats every published system)
- LongMemEval: 89.1%

Why it's interesting for agent builders: the architecture is adapter-based. It currently hooks into Claude Code's lifecycle events, but the core (storage, retrieval, intelligence) is framework-agnostic. The retrieval pipeline (4-channel RRF: FTS5 + Qdrant KNN + recency + graph walk) and the intelligence layer (intent classification, experience patterns, RL policy) could plug into any agent framework.

Quick setup:

ollama pull snowflake-arctic-embed2
bun install && bun run build && bun run setup
node dist/angel/index.cjs

Tech stack: TypeScript, SQLite (better-sqlite3), Qdrant, Ollama, esbuild, Vitest

Key design decisions:
- Dual-write (SQLite truth + Qdrant acceleration) with graceful degradation
- Every operation is non-throwing — individual failures never break the pipeline
- Ephemeral hooks (millisecond lifetime) for capture, persistent Angel for reflection
- RL policy models are pure TypeScript (Float32Array math, no PyTorch)
- Content-length-aware embedding backfill in the background

29K lines, 1,968 tests, MIT licensed: https://github.com/grigorijejakisic/Claudex

submitted by /u/Pristine_Use5236
I built an MCP server that gives your agent semantic search over Obsidian vaults — stop losing docs to keyword matching
I was tired of my agent doing keyword searches across my Obsidian vault and missing half the relevant docs. Searching for "API logs" wouldn't find a section titled "Execution tracking endpoints". So I built an MCP server that indexes your vault into Qdrant with local embeddings and lets any MCP-compatible agent search it semantically.

The idea is to keep a single Obsidian vault as the documentation hub for all your projects. Instead of scattering docs across repos or wikis, everything lives in one place — and the agent can search across projects or filter down to a specific one. Qdrant handles the heavy lifting, so even large vaults with hundreds of files stay fast without dumping everything into the context window.

What it does:
- Chunks markdown by headings, never breaking tables or code blocks
- Embeds everything locally with BAAI/bge-small-en-v1.5 (384 dim, no API keys)
- Auto-starts Qdrant via Docker if it's not running
- Filters by project, doc type, or frontmatter tags
- Incremental indexing — only re-embeds changed files
- Returns only the relevant chunks, not entire files

Works with Claude Code, Cursor, Windsurf, or any MCP client.

GitHub: https://github.com/Marco-O94/obsidian-qdrant-search
PyPI: https://pypi.org/project/obsidian-qdrant-search/

Would love feedback — especially on chunking strategies, embedding model choices, and bug reports. I'm sure there are edge cases I haven't hit yet. Issues and PRs welcome.

submitted by /u/Marco_o94
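Chunking markdown by headings without splitting fenced code blocks, as the post describes, can be sketched like this (a simplified illustration, not the project's actual chunker; it handles only fenced blocks and ATX headings, not tables or setext headings):

```python
def chunk_by_headings(markdown: str):
    # Start a new chunk at each '#' heading, but never while inside a
    # fenced code block, so fences are kept whole.
    chunks, current, in_fence = [], [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        if line.startswith("#") and not in_fence and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

doc = "# A\ntext\n```\n# not a heading\n```\n# B\nmore"
print(chunk_by_headings(doc))
```

The fence toggle is what keeps a "#" comment inside a code block from being mistaken for a section break, which is the failure mode naive splitters hit on technical notes.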
RT https://t.co/JBBn7XLEpq: Haystack, the #search #relevance #conference is coming to Berlin on 27 September. With a keynote by https://t.co/lTtfeY7Fp9, a presen… -- From Charlie Hull https://t.co/XuBTlUJLTV
View originalRepository Audit Available
Deep analysis of qdrant/qdrant — architecture, costs, security, dependencies & more
Yes, Qdrant offers a free tier. Pricing found: $50
Key features include: Expansive Metadata Filters, Native Hybrid Search (Dense + Sparse), Built-in Multivector, Efficient, One-Stage Filtering, Full-Spectrum Reranking, Qdrant Cloud, Qdrant Hybrid Cloud, Qdrant Private Cloud.
Qdrant is commonly used for: Build AI Search the Way You Want, Semantic Search.
Qdrant has a public GitHub repository with 29,940 stars.
Based on user reviews and social mentions, the most common pain points are: token usage, cost tracking.
Based on 14 social mentions analyzed, sentiment is 0% positive, 100% neutral, and 0% negative.