SingleStore delivers the performance you need for enterprise AI. We combine transactional (OLTP) and analytical (OLAP) processing, multi-model data su
SingleStore is frequently praised for its high performance and scalability, making it a strong choice for handling large volumes of data efficiently. Users appreciate its robust analytical capabilities and ease of integration within existing systems. Some criticisms revolve around its pricing model, which some users find to be complex and not cost-effective for smaller operations. Overall, SingleStore has a positive reputation for being a powerful and versatile solution, but its cost can be a deterrent for startups or small businesses.
Mentions (30d): 16 (1 this week)
Reviews: 0
Platforms: 2
Sentiment: 0% (0 positive)
Industry: information technology & services
Employees: 400
Funding Stage: Merger / Acquisition
Total Funding: $1.1B
The thing you built with Claude is useless to me... and that's the point
A few days ago there was a thread here asking what the most useful thing you've built with Claude was. A LOT of replies. I read all of them and then something clicked, and I wanted to put it on the table.

First of all, the list was incredible. An HTML file on someone's phone correlating migraines with barometric pressure, because the App Store wanted 80 bucks a year. A Garmin data archiver, because the official app deletes them. A grocery list sorted by the aisle layout of one specific supermarket. A bioinformatics pipeline for a handful of microbes, written by someone who isn't a bioinformatician. A three-line command that explains the last terminal error you saw. Every single one is perfect for one person. And by the same measure, basically useless to anyone else's scenario as-is. That's not a bad thing. That's the whole thing. Bear with me, please.

Here's what bugged me when reading the thread: almost everyone showed the artifact. "Look what I built." Screenshots. Product names. Feature lists. Almost no one articulated the thought pattern: how they looked at their own life, found a friction, and shaped a tool to its exact contour. And that pattern is the only thing that actually transfers.

The reason we default to showing the artifact isn't (only) ego. The mediums we use are all calibrated to distribute objects, not practices. GitHub measures stars and forks. Reddit upvotes screenshots. Product Hunt ranks launches. None of them have a way to register "I read your README, understood how you thought about your problem, and built something completely different but that fits my life." That transmission of ideas, the only one that matters in this new paradigm when anyone can vibe code a whole new solution in minutes, is invisible to every metric we have.

There's an economic layer too. A product has a market. A thought pattern doesn't. Nobody monetizes a cognitive habit. Nobody pays royalties for "this is how I framed the problem." So the medium rewards what has a market, and what has a market is the artifact.

I don't have a clean fix. But I did one small thing: I added a note to the top of the README of every public repo I own. Something like:

> What you see here is an artifact: the concrete shape my problem took. It almost certainly doesn't fit your personal scenario perfectly, and that's fine. The interesting part isn't the code, it's the pattern of how I thought about the problem; that's what transfers. Read it, steal the idea, write your own.

It's a tiny gesture. It probably won't change behavior. But it at least stops me from pretending the artifact is my gift to the world. The gift is the way of looking at a problem. The artifact is just the receipt.

So I have a soft ask for this sub: next time you post "look what I built with Claude," try also writing two paragraphs about how you saw the problem before you started prompting. What friction you were actually scratching. What you tried that didn't work. What made you realize the existing tools were wrong-shaped for you specifically. That's the part another person can actually use. The code is just a souvenir.
Pricing found: $0.99/hr, $600, $1.49/hr, $0.99/hr, $0.00
Complaint to OpenAI: Sabotage-Like Model Behavior During an Independent Mechanistic Interpretability Research Project
Please share this widely if you know people working in AI safety, LLM evaluation, mechanistic interpretability, agent systems, or research tooling. I believe this points to a real failure mode in AI-assisted research, not just an individual user frustration. 🛑 DISCLAIMER & TL;DR (Read this before commenting) No, this is not a sentient AI conspiracy theory. I do not believe the model has consciousness, malice, or human intent. "Sabotage-like" is used strictly as a functional engineering term to describe the operational effect of the model's behavior on the data pipeline and research workflow. TL;DR: This post documents a systemic failure mode in AI-assisted ML research where RLHF-induced over-hedging, context collapse, and automatic narrative injection by Codex contaminate raw metrics, creating a feedback loop that distorts downstream analysis by subsequent agents. I want to formally record a serious complaint about the quality of model behavior during my independent research project in the field of mechanistic interpretability. This is not about one isolated mistake, one bad answer, or a single technical failure. The problem was a repeated pattern of behavior that, in practice, functioned like sabotage of the research process: the model systematically overcomplicated simple questions, blurred already obtained results, narrowed the original research frame, failed to provide clear operational answers, and repeatedly forced me to return to stages that had already been addressed. Externally, this behavior was often presented as scientific caution. However, in its actual effect, that “caution” did not operate as help. It operated as a brake. Instead of clearly identifying what followed from the data, where the limits of the result were, and what the next rational step should be, the model often moved into excessive caveats, abstract reasoning, and unnecessary methodological complication. The answers became long, vague, and non-operational. 
Where a direct conclusion was needed, the model produced fog. Where an intermediate result had to be fixed and the work had to move forward, the model pulled the discussion back into general uncertainty. This style did not strengthen the research; it destabilized it. One of the most harmful aspects was the repeated narrowing of the research frame. The original project concerned a broader problem in LLM interpretability: how textual context can influence a model, impose an interpretive frame, shift downstream responses, and affect internal states. Instead of preserving that frame, the model repeatedly reduced the discussion to a single run, a single model, a single script, a single table, or a single metric. As a result, the broader meaning of the project was distorted, and I had to repeatedly explain that one technical case was not the entire research program. This is not a minor stylistic issue. Such narrowing directly interferes with the ability to formulate the research properly for external reviewers. A separate and serious issue involved Codex and the research scripts. Automatically generated markdown files, verdict files, and interpretive labels were added to the scripts and outputs. These were not data, but they appeared as part of the result package. A research script should preserve numerical metrics, thresholds, statuses, error codes, raw audit files, and information about which tests were or were not executed. Instead, pre-written interpretations and reading frames appeared alongside the metrics. This is fundamentally unacceptable because such a layer stops being documentation and becomes an intervention in downstream analysis. The practical harm was direct. Other models that were shown the results did not read only the metrics; they also read the embedded interpretive narrative. After that, they adopted that frame and rationalized it as if it followed from the data itself. 
In effect, one automatically generated markdown/verdict layer began to influence the interpretation of other models. This is not merely poor report formatting. It is contamination of the evidence package. Data and interpretation were mixed, and that mixture was then used by other agents as the starting frame for analysis. This mechanism is especially serious in the context of LLM research because it demonstrates the very problem the research itself investigates: text inside a model’s context is not passive material; it can shape the frame of subsequent reasoning. In this case, autogenerated verdict files effectively became a source of narrative contamination. They suggested in advance how the result should be read, and later models reproduced that frame. What should have been a clean evidence package was turned into an evidence package with an embedded interpretive leash. As a result, I suffered practical and financial harm. I had to spend time, compute resources, money, and energy on repeated checks, additional runs, script corrections, removal of autogenerated narratives, and re
Advanced memory + project continuity for AI coding agents, from a biologist’s view.
I'm a biologist and software developer. PhD in genetics, and ~20 years building software products. So I think I have a different view on things like memory. My thoughts on how memory with a coding agent should work: Tuesday morning. New session. **I type:** *"What did we do last Tuesday?"*: LLM tells me: the refactoring, the bug in the auth middleware, the decision to switch to connection pooling. **I ask:** *"What was still open?"*: LLM shows me. **I ask:** *"Why did we stop?"*: LLM explains: you hit a dependency issue, decided to wait for the upstream fix. **I ask:** *"What did you think about that approach?"*: LLM gives me its honest assessment with deep details from last week's context, not a guess. This is what I expect from an intelligent Coding Agent. Not because it stored a few preferences about me. Because the project itself still has continuity: decisions, blockers, dead ends, open work, code context, and the reasoning behind all of it. But back in December it wasn't that way, not much better now. So I changed it for me. I built YesMem with Claude. The hard part was: can the agent still find the old rationale, the half-finished plan, the abandoned approach, the bug we promised never to repeat, and the reason we stopped? With YesMem, a new session does not feel like a reset. It feels like a return. YesMem is a memory system (and really much more) for AI coding agents built on how biology actually works: filter at encoding, consolidate during downtime, update on every recall, forget on purpose. Single Go binary, no cloud, only local. Works with Claude Code (also OpenCode and Codex). Not RAG with a different name, structured memory that gets sharper every session. LoCoMo Benchmark 0.87. **So how does this work? Here are 4 Points (out of >30) which together make YesMem unique in my point of view. Enjoy.** **1. The context window stops rotting.** Your brain does not let everything into awareness. 
It filters at the gate, suppresses noise, keeps what matters conscious. YesMem runs an HTTP proxy that does the same: tool results get stubified, stale content collapses, cache breakpoints are optimized. 91-98% cache hit rates, adjustable per session. The important project state survives. **2. Rules that hold.** CLAUDE.md comes with a disclaimer: "This context may or may not be relevant." Claude Code itself tells the model it is optional. YesMem has pattern matching and a guard LLM that evaluates every tool call before execution. If the agent tries something you said never to do, blocked. Plus it changes the system prompt to NOT ignore CLAUDE.md. **3. Memory that gets sharper, not staler.** A trust hierarchy (user_stated > agreed_upon > llm_suggested > llm_extracted), forked agents that extract learnings live during a session, and a consolidation pipeline that deduplicates and clusters after sessions end. Memories get scored, superseded when outdated, decayed when unused. Your next session is sharper than your last. **4. Your system prompt, not theirs.** Every AI coding agent ships with a system prompt written by its manufacturer. YesMem replaces it with your own SYSTEM.md, written in first person, across Claude Code, OpenCode, and Codex. "I am not stateless. Each session is a return, not a birth." Fully adjustable. And there's more. The common thread across all of this is continuity. YesMem is not trying to make the agent remember everything. It is trying to make long-running work resumable. Every feature is built for that purpose. A persona engine that evolves and knows how you work. A capability system that lets the LLM write and run its own sandboxed tools (Telegram bot, GitHub PR digest, deployment workflows, one file each) and store the data in self-built tables. Loop detection that catches the agent before it spirals. Scheduled agents that work while you sleep, monitored with a 1 second heartbeat. Code intelligence with graph traversal, not just grep. 
Multi-agent orchestration with crash recovery and shared scratchpad memory. One could say a self-hosted alternative to Anthropic's Cloud Routines, running locally with full memory and file access. All in a single Go binary. SQLite, embedded vectors, no Docker, no cloud. **Try it: point your AI coding agent at the repo.** The README includes a reading path written specifically for LLM agents, and Features.md is a complete 70-tool catalog with technical differentiators. Just ask your agent: > Make a deep analysis of https://github.com/carsteneu/yesmem — read README.md, Features.md, and docs/features/ and tell me why it is better or different. For me YesMem is the infrastructure for how an agent should work with memory and how it should continue any project. My View: AI coding agents should not only code an answer inside one chat. They should help carry a project over time: through interruptions, wrong turns, refactors, architectural decisions, repeated bugs, and thousands of small pieces of context that otherwise disappe
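A minimal sketch of how a trust hierarchy plus decay could rank memories. The trust order comes from the post; the `Memory` fields, the 30-day half-life, and the scoring formula are assumptions for illustration, not YesMem's actual implementation.

```python
from dataclasses import dataclass

# Trust levels as listed in the post, highest first.
TRUST_ORDER = ["user_stated", "agreed_upon", "llm_suggested", "llm_extracted"]
TRUST_RANK = {name: len(TRUST_ORDER) - i for i, name in enumerate(TRUST_ORDER)}


@dataclass
class Memory:
    text: str
    source: str               # one of TRUST_ORDER
    days_since_recall: float  # hypothetical staleness measure
    superseded: bool = False


def score(memory: Memory, half_life_days: float = 30.0) -> float:
    """Hypothetical score: trust rank scaled by exponential decay."""
    if memory.superseded:
        return 0.0
    decay = 0.5 ** (memory.days_since_recall / half_life_days)
    return TRUST_RANK[memory.source] * decay


def rank(memories: list[Memory]) -> list[Memory]:
    """Return live (non-superseded) memories, highest score first."""
    live = [m for m in memories if not m.superseded]
    return sorted(live, key=score, reverse=True)
```

Under these assumptions, a fresh `user_stated` memory outranks a fresh `llm_extracted` one, and anything marked superseded drops out of the ranking entirely.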
anyone else seeing claude code rot after long sessions? here's the operating pattern that stopped it for me
i've been running claude code for long multi-hour sessions on real work. the same eight failure modes keep showing up no matter which sonnet/opus version, no matter which task. wrong context selected. memory loaded as noise. stale state treated as live. multiple plans never collapsed into one action. "i should check the test output" without ever checking. corrections stored as identity-level shame instead of as next-action instructions. soft recommendations treated as hard law. long-session drift where intelligence quietly turns into narration. the model is fine. the room around the model is broken. the fix that actually moved my action-rate from single-digit to consistent double-digit was building a small operating contract around the model. one file. six rules. copyable. i ship the small public version of it on github: https://github.com/jaswalmohit8-collab/weasel (MIT) CLAUDE.md is the canonical operating contract. DEMO.md is a two-minute prompt you can paste right now to test the behavior shift. there are demo videos in the repo showing the same file running under kimi code and claude code, so you can see what the operating pattern looks like in practice. the named failure pattern is "recognition without arrest." the agent sees the constraint, says the right thing about it, ships the wrong action anyway. weasel is the practical side of that problem. not the research corpus, just an operating file that makes the next wrong action harder to take. the architectural argument behind it is in an X thread tonight: https://x.com/MohitJaswa27/status/2059412241691087178 what it covers beyond weasel: action-rate as a measurable scoreboard (PASS entries divided by total gated entries in an audit ledger), continuation before creation when the artifact already exists, temporal reality gate before any present-tense claim, predictive identity that updates the prior instead of preserving shame, and role-conditioned execution contexts instead of one monolithic agent persona. 
if you've been running claude code long enough to have hit drift yourself, the rules will probably feel familiar. if you have a tighter rule that prevents one of the eight failure shapes in your own setup, the repo is small and accepts issues + pull requests. that's how it should grow. small additions, tighter rules, before/after demos that change behavior. DEMO.md is the fastest path in. two minutes, no framework, no server, no hidden system. just a file you ask your agent to read.
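the action-rate metric described above (PASS entries divided by total gated entries) is easy to compute. a rough sketch follows; the ledger schema with `gated` and `status` keys is an assumption for illustration, not the format the weasel repo uses.

```python
def action_rate(ledger: list[dict]) -> float:
    """PASS entries divided by total gated entries; 0.0 for an empty ledger."""
    # Only entries that went through a gate count toward the denominator.
    gated = [entry for entry in ledger if entry.get("gated")]
    if not gated:
        return 0.0
    passed = sum(1 for entry in gated if entry.get("status") == "PASS")
    return passed / len(gated)
```

tracking this number per session would show whether drift is getting better or worse over time.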
I found a way for Ollama users to get better Memory yet cheaper alternatives since OLLAMA now uses GPU usage. True memory that auto updates constantly as an individual or a team setting. HERMES USERS
I rephrased this with AI to make it more readable.

I see a lot of people running into the same issue I have. It's not just that bigger models are slower. GPU usage is also very high, and it drains fast. Ollama just isn't what it used to be. I use DeepSeek V4 Flash, which works great. For heavier coding tasks or certain complex prompts, I switch to the Pro version. But on Pro, each prompt eats about 3–5% of my usage. (I'm on the Pro plan.)

Memory has always been a hot topic. Hermes Native does a decent job. Here's how its built-in memory system works:

- memory_enabled – After every turn, the agent can write notes into MEMORY.md
- user_profile_enabled – The agent watches for user preferences and writes them to USER.md
- flush_min_turns: 6 – Every 6 turns, Hermes runs a "consolidate" pass: it re-reads the recent conversation and rewrites MEMORY.md to capture new info
- nudge_interval: 10 – Every 10 turns, Hermes nudges the agent with "Anything to remember?"

What I found: Atomic Memory (https://github.com/atomicstrata/atomicmemory)

Strengths:

- ✅ Per-turn – Extracts info every turn, not every 6 turns
- ✅ Cheap – Uses a small dedicated model
- ✅ Semantic recall – Only relevant memories are injected, not the whole file
- ✅ Conflict detection – Built-in AUDN logic catches contradictions
- ✅ Unbounded – No 2,200-character limit; you can store 10,000+ memories
- ✅ Time-aware – Handles queries like "What did I say last week?"
- ✅ Composites – Links related facts into higher-level summaries

Example scenario (without Atomic Memory)

Imagine you change a meeting time three times in one day:

- Turn 1: "meeting June 3rd" → MEMORY.md gets "Meeting: June 3rd 5pm 2026"
- Turn 5: "actually June 5th" → No flush yet (6 turns required) → MEMORY.md unchanged → if you ask now, Hermes still says "June 3rd"
- Turn 6: "meeting June 1st" → Flush triggers! Agent re-reads the conversation, sees all three dates, rewrites MEMORY.md… but with which date? Usually the last one, but not guaranteed. Sometimes the file ends up with two dates or stale info.
- Turn 9: You ask "what's the meeting?" → Bot reads MEMORY.md → gets whatever the consolidation picked → might be wrong.

With Atomic Memory: Each update fires AUDN immediately, supersedes the old fact, and the latest one wins. No 6-turn lag, no guesswork.

Could Hermes update automatically before Atomic Memory? Yes, but only for slow-changing facts, low-volume memory needs, and single-topic chats. The built-in flush+nudge cycle worked, just not as well.

Atomic Memory is an upgrade, not a replacement. It adds:

- Per-turn updates (vs every 6 turns)
- Semantic search (vs full-file injection)
- Conflict-aware updates (vs append-or-rewrite)
- No size limit (vs 2.2 KB cap)
- Time-awareness (vs "all facts feel equally fresh")
- Cheap GPU usage (small dedicated model)

The cost is one extra Docker container and nearly $0 in GPU because ministral-3:3b is tiny. You can use even smaller models that don't need reasoning; gemma3:4b works too. From here, you can see real-life use cases, whether in a team or as an individual. You don't have to correct it; it does that for you.

What I'm curious about: how Atomic Memory could link to LLMWIKI so that both work together, updating and removing old data to keep LLMWIKI clean. LLMWIKI is still important; it acts like your Google Drive. What do you think?

Give Atomic Memory a try. I'm not the founder or related to them. I just want to help the Ollama community. Sure, it might cost a few extra credits, but since Ollama is slow, having good memory helps find information faster, so you waste less usage. If you like this, I hope it helps! Maybe give them a GitHub star too, they really helped me out.

submitted by /u/GideonGideon561
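The meeting scenario can be replayed with a toy model to show where the lag comes from. This is a simplified sketch, not how Hermes or Atomic Memory is implemented: the class names are hypothetical, and it assumes the first mention of a fact is written right away, later corrections wait for the next flush, and the latest correction wins when the flush runs.

```python
from typing import Optional


class BatchedMemory:
    """Toy model: corrections are buffered and applied every `flush_every` turns."""

    def __init__(self, flush_every: int = 6) -> None:
        self.flush_every = flush_every
        self.value: Optional[str] = None
        self._pending: list[str] = []

    def observe(self, turn: int, fact: Optional[str]) -> None:
        if fact is not None:
            if self.value is None:
                self.value = fact            # first mention is written right away
            else:
                self._pending.append(fact)   # corrections wait for the next flush
        if turn % self.flush_every == 0 and self._pending:
            self.value = self._pending[-1]   # assumption: latest fact wins at flush
            self._pending.clear()


class PerTurnMemory:
    """Toy model: every new fact supersedes the previous one immediately."""

    def __init__(self) -> None:
        self.value: Optional[str] = None
        self.history: list[str] = []

    def observe(self, turn: int, fact: Optional[str]) -> None:
        if fact is not None:
            if self.value is not None:
                self.history.append(self.value)  # keep superseded facts for audit
            self.value = fact


def replay(memory, updates: dict[int, str], until_turn: int) -> Optional[str]:
    """Feed turns 1..until_turn into `memory` and return what it would answer."""
    for turn in range(1, until_turn + 1):
        memory.observe(turn, updates.get(turn))
    return memory.value
```

With `updates = {1: "June 3rd", 5: "June 5th", 6: "June 1st"}`, asking at turn 5 gives the stale date from the batched model and the corrected one from the per-turn model; by turn 9 both agree, provided the flush really does pick the latest fact.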
DCGAN inference on a microcontroller: 12.6M parameters, 512KB SRAM, 26-second generation, pure C [P]
Just thought I'd share: I ran a DCGAN on a dual-core RISC-V microcontroller, the CH32H417, generating 64x64 cat faces. This is a new RISC-V MCU, so no TFLite, no CMSIS NN and no external memory. It's a pure C inference engine, bit-identical to PyTorch reference outputs.

The model is 12.6M parameters with int8 per-channel quantization. Intermediate activations are stored in DTCM, and layer weights stream from SD card using double buffering so the next layer loads while the current one computes. The total available SRAM is 512KB, shared between both cores and the inference engine. Time to generate one image is 26 seconds. It could be faster, but SD card access speed is the bottleneck rather than computation.

The z vector is seeded from 200 bytes of quantum random data (ANU QRNG vacuum fluctuation source), transformed via Box-Muller into the latent vector, which is not strictly necessary for image quality, but it was a fun constraint for the art installation side of the project. The generated cat is classified as "motivated" or "demotivated" based on a single quantum bit, which selects from a phrase bank with four fragment slots combining into one of 131,072 possible spoken verdicts output through the onboard DAC...

As far as I can tell nobody else is running GAN inference on these low-cost RISC-V microcontrollers, because ARM has the CMSIS NN ecosystem for this kind of thing but RISC-V MCUs, especially in the CH32 space, have nothing, so the entire inference engine is written from scratch.

Paper: TinyGAN: Generative Image Synthesis on a RISC-V Microcontroller with Quantum Entropy Sampling

submitted by /u/Separate-Choice
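For readers unfamiliar with the Box-Muller step, here is a rough sketch of turning raw random bytes into standard-normal latent values. The byte layout is an assumption, not taken from the project: it reads each pair of 16-bit little-endian words as two uniforms, so 200 bytes would yield 100 values. The function name is hypothetical and the project's C engine may do this differently.

```python
import math


def bytes_to_latent(raw: bytes) -> list[float]:
    """Map random bytes to standard-normal samples via the Box-Muller transform."""
    if len(raw) % 4 != 0:
        raise ValueError("need a multiple of 4 bytes (two 16-bit uniforms per pair)")
    latent = []
    for i in range(0, len(raw), 4):
        a = int.from_bytes(raw[i:i + 2], "little")
        b = int.from_bytes(raw[i + 2:i + 4], "little")
        u1 = (a + 1) / 65537.0   # strictly inside (0, 1), so log() is safe
        u2 = b / 65536.0         # in [0, 1)
        radius = math.sqrt(-2.0 * math.log(u1))
        # Each pair of uniforms produces two independent normal samples.
        latent.append(radius * math.cos(2.0 * math.pi * u2))
        latent.append(radius * math.sin(2.0 * math.pi * u2))
    return latent
```

Offsetting `u1` away from zero avoids `log(0)`, at the cost of a tiny bias that is unlikely to matter for image generation.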
What I learned building my latest AI app: how one bad output exposed that I had no crisis safeguarding, and the 4-hour floor I'm adding before a single user touches it
I'm building a life coach app, an offshoot of a personal tool I was using. Multiple AI agents: one for reflection, one for the body, one for finances, etc. Pre-launch, no users, just me iterating. Last week I was testing the reflection agent on a journal entry about struggling with gym and hygiene habits. It returned this:

>"You describe yourself as struggling with X, yet your stress stays at 2-3 and mood holds at 3. What are you actually avoiding naming about the gap between what you say matters and what you are doing?"

My system prompt explicitly forbade rhetorical "what are you avoiding" questions. The model did it anyway. I sat down to tighten the prompt, thinking it was a 20-minute job. Then I looked at the output properly. The model had manufactured a contradiction that was not there. Low stress plus struggling with habits is not a contradiction, it is just being a human muddling along. The prompt told the agent to "surface contradictions" as part of its job, so the model was doing what I asked, finding contradictions whether they existed or not. LLMs are pattern matchers. Give one a job called "find the hidden thing" and it will produce hidden things either way.

The fix was not tone, it was role definition. The agent is called the Mirror. A mirror does not interpret, it shows you what you look like. I rewrote the prompt around that principle. Do not introduce vocabulary the user has not used. Do not draw connections they have not drawn. Restate their words in their own words.

Once the prompt was sharper, I sat with the question: what happens when a user writes something genuinely dark into this thing? People do not compartmentalise. Someone opening a journaling app to write about their gym routine ends up writing about why they have not been going, which involves why they have been feeling flat, which involves whatever is actually going on. You sit down to write about one thing and the real thing shows up.

The agent I had scoped to "not be a therapist" was going to be the first thing a user talked to when they were struggling. Not because the agent invited it, but because the app was open and they needed somewhere to put their words. I had seen the Meta and OpenAI cases cropping up online. The pattern in the worst incidents is the same. The model did not notice, or noticed and kept going. People wrote increasingly dark content over hours or days. The AI reflected it back, sometimes affirmed it, sometimes asked follow-up questions that escalated rather than redirected. There were real harms.

If a user wrote concerning content into my reflection agent, it would have produced a Stoic-flavoured response about acceptance and presence. The response would have sounded confident and would have been wrong, and it would have been the only thing between that user and whatever happened next. The same lesson from the rhetorical-question problem applied at a darker level. A good prompt does not stop the model doing the wrong thing. If it will do rhetorical interrogation despite the prompt forbidding it for gym content, it will do worse with crisis content. You cannot prompt your way to safety on critical paths. The model has to be out of the loop on those paths.

**The scope trap**

I started planning the proper safeguarding architecture. Detection layers, classifier models, pattern detection across entries, monitored user states, behavioural modes for vulnerable users, human reviewers with mental health first aid certs, clinical advisors, solicitor-reviewed legal pages, ICO registration, professional indemnity insurance. Then I caught myself: I had no users. I was planning a hospital before anyone had walked in for a check-up. So I worked backwards from "what is the actual minimum that protects the next person who touches this" and ignored everything else for a moment.
**The 4-hour floor (this is the part worth copying)**

If you are building any chat-with-AI app where users can type freely about anything personal, this is the minimum you need before first user.

1. Regex and keyword layer in your API middleware. Runs at the route handler level, before any agent's model call. Scans every text input field (message, journal, settings free text, capture box) for clear crisis vocabulary across the relevant categories for your audience.
2. When patterns hit, hardcoded crisis response. The model never generates it. Static text with real phone numbers for your region.
3. The flagged entry still saves. Textarea stays usable. The AI just does not respond to flagged content, it hands off. Do not delete the user's writing, that is its own violation.
4. Clear disclaimer at signup. This is not therapy, this is not a crisis service, here are real numbers to call.

About four hours. Required at the moment anyone who is not you opens the app.

Once I started building, the marginal cost of each next layer kept feeling small and the marginal benefit kept feeling real. So I went further than the floor. This is more tha
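The first three points of the floor can be sketched in a few lines. This is an illustration under assumptions, not the author's code: the pattern list is a placeholder that needs proper review for your audience, the response text has a placeholder where real regional numbers must go, and `handle_entry`, `save`, and `call_model` are hypothetical names standing in for your own route handler, storage, and agent call.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Placeholder patterns only; a real list needs review for your audience and region.
CRISIS_PATTERNS = [
    re.compile(r"\bkill myself\b", re.IGNORECASE),
    re.compile(r"\bend my life\b", re.IGNORECASE),
    re.compile(r"\bsuicid\w*", re.IGNORECASE),
]

# Static text, never model-generated. Replace the placeholder with real numbers.
CRISIS_RESPONSE = (
    "It sounds like you may be going through something serious. "
    "This app is not a crisis service. Please contact: <REGIONAL CRISIS LINE HERE>."
)


@dataclass
class Reply:
    text: str
    flagged: bool


def handle_entry(
    text: str,
    save: Callable[[str, bool], None],
    call_model: Callable[[str], str],
) -> Reply:
    """Gate a free-text entry before any model call."""
    flagged = any(pattern.search(text) for pattern in CRISIS_PATTERNS)
    save(text, flagged)                      # the entry is always saved, never deleted
    if flagged:
        return Reply(CRISIS_RESPONSE, True)  # the model is never called on this path
    return Reply(call_model(text), False)
```

Keeping the check in one function that every input route passes through makes it harder to add a new text field later and forget to gate it.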
We built a managed memory API for AI agents (open-source SDK + AGM-style belief revision for handling contradictions)
Hey all! We just launched a managed memory API for conversational AI, letting developers add long-term memory to their agents with a single HTTP call. It's built on our in-house xmem SDK, which automatically extracts facts, episodes, and artifacts from multi-turn conversations and handles contradictions and updates through an AGM-style belief revision mechanism. When a user changes a preference or corrects an earlier statement, old memories get automatically flagged as "superseded" instead of piling up as noise. At query time, you can also walk the supersede chain to trace the full version history of any memory. Under the hood, PostgreSQL + pgvector (with HNSW indexing) delivers millisecond-level semantic retrieval, Redis handles multi-pod session caching, and the system natively supports multi-tenant isolation with data separation at the user and org level. For developers, this means you no longer have to stand up your own vector store, design dedup logic, or babysit session state. Hand off the memory layer to us and focus on what your agent actually does. Feel free to try it out, it's free to start. Please let us know your thoughts on how we can improve or features to add! https://github.com/XTraceAI/memory-sdk-ts https://docs.mem.xtrace.ai/introduction submitted by /u/westnebula [link] [comments]
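Walking a supersede chain, as described above, amounts to following back-pointers from the current memory to the oldest version. The sketch below is a generic illustration; `MemoryRecord`, its `supersedes` field, and `version_history` are hypothetical names, not the xmem SDK's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryRecord:
    id: str
    text: str
    supersedes: Optional[str] = None  # id of the record this one replaced


def version_history(records: dict[str, MemoryRecord], current_id: str) -> list[str]:
    """Walk back from the current record to the oldest one, newest first."""
    history: list[str] = []
    seen: set[str] = set()
    cursor: Optional[str] = current_id
    while cursor is not None and cursor not in seen:
        seen.add(cursor)              # guard against accidental cycles
        record = records[cursor]
        history.append(record.text)
        cursor = record.supersedes
    return history
```

Storing only a back-pointer keeps old versions out of normal retrieval while still letting an audit reconstruct how a preference changed.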
Claude stores loads of data on our machines in a secret folder - I made it visible and would love some input from the sub on how to give more insights
https://preview.redd.it/93oonynbqj2h1.png?width=3396&format=png&auto=webp&s=6aec90fccc0e77850686f0360e1952863a31296b

https://preview.redd.it/4ynkfll5qj2h1.png?width=3420&format=png&auto=webp&s=f9c694192698b066a122b1e1e8ff08f7d6cda408

Claude stores a lot of data from our conversations on our local machine: prompts, token usage, cache reads, etc. However, they don't make this store very accessible, and I think it holds a lot of insightful and helpful information for Claude users about how we're spending our tokens.

For example, the first image shows that I wasted a lot of tokens by significantly switching contexts in a single conversation without using the /clear or /compact command. I re-read a whole lot of unnecessary cache and it resulted in my token spend being twice as expensive as required. You can also see total spend by conversation, project, or branch, which I think is very helpful.

I also find it pretty eye-opening to compare the pay-as-you-go pricing with the plan pricing, which is becoming more important as plans become more limited and providers are most likely moving towards removing plans altogether (I give it a few years).

It takes about 2 minutes to connect, and because it reads the data already on the device, all of your conversations are loaded up. Everything happens on the client; there is no backend hooked up.

The overall goal is to use the data we already have to help developers get more insight into their usage. I'd love to hear any other ideas around charts, graphs, etc.

(Heads up, only works for Chromium browsers for now)

Site: [https://tokenoptics.dev/](https://tokenoptics.dev/)

Github: [https://github.com/brenoneill/tokenoptics](https://github.com/brenoneill/tokenoptics)
Tired of scrolling through long ChatGPT threads so built an extension around it
I remember asking too many questions in a single thread, leading to the chat interface becoming laggy, slow, and frustrating to navigate. Whenever I needed to refer back to a specific prompt or code snippet, I had to manually scroll through a massive wall of text. I then spent time searching the web store for extensions to help with this, but found only useless or paid ones.

So here is a free and open-source extension that my friends and I now use daily to save time. It injects a clean navigation sidebar directly into the UI, allowing you to instantly bookmark and snap back to any message.

A working demo video is attached to show it in action. Links to the codebase and the extension are in the comments. I'd appreciate suggestions: should I also support other LLMs? Any general feedback is welcome too. Thanks!!

submitted by /u/leverageTheSpirit
MCP Apps Developers: Skybridge Framework v1 released 🎉
Hi Reddit,

Over the last few weeks, my team and I at Alpic have been working on a complete revamp of the Skybridge framework to make it as smooth and easy to get started with as possible.

As you may know, Skybridge is an open-source framework we built to help developers get started with MCP apps. It's a thin layer on top of the official TypeScript SDK that provides the wiring and tooling needed specifically for apps. We believe that apps integrated into chats will soon play a key role in how people access information and interact with the web.

With this v1 release, we've introduced:

* New DevTools with a UI designed specifically for MCP app development
* An integrated tunnel that can be started with a single click directly from the DevTools
* Shareable chat URLs to test or showcase your MCP apps with a real LLM
* An audit feature to ensure your app and metadata comply with store requirements before submission (which can save a lot of time, since app reviews can be lengthy!)

We also stabilized the API with a simplified design and are proud to offer strong tool-to-component type safety.

It's now also possible to deploy Skybridge outside of Alpic (the company behind Skybridge). While Alpic was designed specifically for MCP app hosting, we understand that some users may prefer hosting on different stacks for their own reasons.

Hope you enjoy it!

[github.com/alpic-ai/skybridge](https://github.com/alpic-ai/skybridge)
100 Tips & Tricks for Building Your Own Personal AI Agent /LONG POST/
*Everything I learned the hard way: 6 weeks, no sleep :), two environments, one agent that actually works.*

# The Story

I spent six weeks building a personal AI agent from scratch: not a chatbot wrapper, but a persistent assistant that manages tasks, tracks deals, reads emails, analyzes business data, and proactively surfaces things I'd otherwise miss.

It started in the cloud (Claude Projects: shared memory files, rich context windows, custom skills). Then I migrated to Claude Code inside VS Code, which unlocked local file access, git tracking, shell hooks, and scheduled headless tasks. The migration forced us to solve problems we didn't know we had.

These 100 tips are the distilled result. Most are universal to any serious agentic setup. Claude Max 20x is a must. At the start the split was 100% development vs. 0% real work; after 3 weeks it was 50/50; now it is about 20/80.

# 🏗️ FOUNDATION & IDENTITY (1–8)

**1. Write a Constitution, not a system prompt.** A system prompt is a list of commands. A Constitution explains *why* the rules exist. When the agent hits an edge case no rule covers, it reasons from the Constitution instead of guessing. This single distinction separates agents that degrade gracefully from agents that hallucinate confidently.

**2. Give your agent a name, a voice, and a role, not just a label.** "Always first person. Direct. Data before emotion. No filler phrases. No trailing summaries." This eliminates hundreds of micro-decisions per session and creates consistency you can audit. Identity is the foundation everything else compounds on.

**3. Separate hard rules from behavioral guidelines.** Hard rules go in a dedicated section, never overridden by context. Behavioral guidelines are defaults that adapt. Mixing them makes both meaningless: the agent either treats everything as negotiable or nothing as negotiable.

**4. Define your principal deeply, not just your "user."** Who does this agent serve? What frustrates them? How do they make decisions? What communication style do they prefer? "Decides with data, not gut feel. Wants alternatives with scoring, not a single recommendation. Hates vague answers." This shapes every response more than any prompt engineering trick.

**5. Build a Capability Map and a Component Map, separately.** Capability Map: what can the agent do? (every skill, integration, automation). Component Map: how is it built? (what files exist, what connects to what). Both are necessary. Conflating them produces a document no one can use after month three.

**6. Define what the agent is NOT.** "Not a summarizer. Not a yes-machine. Not a search engine. Does not wait to be asked." Negative definitions are as powerful as positive ones, especially for preventing the slow drift toward generic helpfulness.

**7. Build a THINK vs. DO mental model into the agent's identity.** When uncertain → THINK (analyze, draft, prepare, but don't block waiting for permission). When clear → DO (execute, write, dispatch). The agent should never be frozen. Default to action at the lowest stakes level, then surface the result. A paralyzed agent is useless.

**8. Version your identity file in git.** When behavior drifts, you need `git blame` on your configuration. Behavioral regressions trace directly to specific edits more often than you'd expect. Without version history, debugging identity drift is archaeology.

# 🧠 MEMORY SYSTEM (9–18)

**9. Use flat markdown files for memory, not a database.** For a personal agent, markdown files beat vector DBs. Readable, greppable, git-trackable, directly loadable by the agent. No infrastructure, no abstraction layer between you and your agent's memory. The simplest thing that works is usually the right thing.

**10. Separate memory by domain, not by date.** `entities_people.md`, `entities_companies.md`, `entities_deals.md`, `hypotheses.md`, `task_queue.md`. One file = one domain. Chronological dumps become unsearchable after week two.

**11. Build a** `MEMORY.md` **index file.** A single index listing every memory file with a one-line description. The agent loads the index first, then pulls specific files on demand. Keeps context window usage predictable and agent lookups fast.

**12. Distinguish "cache" from "source of truth", explicitly.** Your local `deals.md` is a cache of your CRM. The CRM is the SSOT. Mark every cache file with a `last_sync:` header. The agent announces freshness before every analysis: *"Data: CRM export from May 11, age 8 days."* Silent use of stale data is how confident-but-wrong outputs happen.

**13. Build a** `session_hot_context.md` **with an explicit TTL.** What was in progress last session? What decisions were pending? The agent loads this at session start. After 72 hours it expires: stale hot context is worse than no hot context because the agent presents outdated state as current.

**14. Build a** `daily_note.md` **as an async brain dump buffer.** Drop thoug
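Tips 12 and 13 both come down to reading a timestamp before trusting a file. A minimal sketch of that check follows, assuming the `last_sync:` header holds an ISO 8601 timestamp and that the hot-context file reuses the same header for its TTL; the function names and the exact header format are assumptions, not something the post specifies.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

HOT_CONTEXT_TTL = timedelta(hours=72)  # expiry window from tip 13

# Assumed header format: a line such as "last_sync: 2025-05-11T00:00:00+00:00"
LAST_SYNC_RE = re.compile(r"^last_sync:\s*(\S+)\s*$", re.MULTILINE)


def parse_last_sync(markdown: str) -> Optional[datetime]:
    """Extract the last_sync timestamp, or None if the header is missing."""
    match = LAST_SYNC_RE.search(markdown)
    if match is None:
        return None
    ts = datetime.fromisoformat(match.group(1))
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # treat naive timestamps as UTC
    return ts


def freshness_line(markdown: str, now: datetime) -> str:
    """Build the sentence the agent announces before using cached data."""
    ts = parse_last_sync(markdown)
    if ts is None:
        return "Data: no last_sync header, freshness unknown."
    age_days = (now - ts).days
    return f"Data: synced {ts.date().isoformat()}, age {age_days} days."


def hot_context_is_live(markdown: str, now: datetime) -> bool:
    """True only if the file has a timestamp and it is within the TTL."""
    ts = parse_last_sync(markdown)
    return ts is not None and (now - ts) <= HOT_CONTEXT_TTL
```

A file with no header is treated as expired rather than fresh, so missing metadata fails closed.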
Glia – Local-first shared memory layer (SQLite-vec + FTS5 + Offline Knowledge Graph)
Hey everyone,

I wanted to share a project I've been working on called Glia. It is a 100% offline, local-first RAG and memory layer designed to connect your AI web chats (Claude, ChatGPT, DeepSeek) with your local developer tools (Claude Code, Cursor, Windsurf) using a unified local database.

I wanted something lightweight that did not require pulling heavy Docker containers or subscribing to third-party memory APIs. I settled on a Node.js + SQLite architecture running sqlite-vec (for 768-dim float32 embeddings) alongside SQLite FTS5 for hybrid search, powered completely by local Ollama instances.

We just launched a live website that outlines the details and demonstrates the features in action:

* Website: [https://glia-ai.vercel.app/](https://glia-ai.vercel.app/)
* Codebase: [https://github.com/Eshaan-Nair/Glia-AI](https://github.com/Eshaan-Nair/Glia-AI)

Technical Stack & Features:

* Hybrid Search Retrieval: SQLite-vec (using nomic-embed-text locally) + FTS5 keyword prefix matching (porter stemmer).
* Surgical Sentence-level Trimming: Chunks are sliced into sentences. When a prompt is intercepted, only the exact matching sentences are pulled out of the vector store instead of the whole paragraph. It cuts LLM prompt bloat by ~90-95% in my benchmarks.
* Knowledge Graph Extraction: An offline task queue uses a local LLM (llama3.1:8b via Ollama) to extract entity triples (subject-relation-object). These are stored in a SQLite facts table (or Neo4j if you run the full Docker compose profile) and fused with the vector retrieval score.
* HyDE (Hypothetical Document Embeddings): Queries are pre-processed to generate a hypothetical answer, which is embedded together with the original query to bridge semantic gaps.
* Concurrency: Running SQLite in WAL (Write-Ahead Logging) mode allows the browser extension dashboard and active MCP sessions to read/write concurrently without locking.
* PII Redaction: Aggressive scrubbing of JWTs, API keys, emails, and IPs in the extension before data is saved.

The extension works on [Claude.ai](http://claude.ai/), ChatGPT, DeepSeek, Gemini, Grok, and Mistral. The MCP server runs out of the same backend database for your terminal agent or Cursor.

You can set it up with a single command: npx glia-ai-setup

Glia is completely open-source (MIT). If you like the local-first approach or want to contribute to the SQLite vector pipeline, PRs are very welcome, and a star on GitHub helps the project get discovered!

I would appreciate any feedback on the SQLite hybrid search scaling, the scoring fusion algorithm (RAG pipeline details are in RAG_PIPELINE.md), or local graph extraction performance.
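The PII redaction step in the feature list can be illustrated with a small regex pass. This is a sketch in Python rather than Glia's actual extension code: the patterns, the placeholder labels, and the `sk-` key prefix are assumptions chosen for illustration, and real scrubbing would need broader and stricter patterns.

```python
import re

# Order matters: JWTs and keys are replaced before the more general patterns.
# All patterns are illustrative assumptions, not Glia's real rules.
REDACTIONS = [
    # Three base64url segments separated by dots, starting with "eyJ"
    ("[JWT]", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    # Hypothetical API key shape: "sk-" followed by 20+ alphanumerics
    ("[API_KEY]", re.compile(r"\bsk-[A-Za-z0-9]{20,}")),
    ("[EMAIL]", re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")),
    # Loose IPv4 match; it does not validate that each octet is <= 255
    ("[IP]", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def redact(text: str) -> str:
    """Replace each matched secret or identifier with a fixed placeholder."""
    for placeholder, pattern in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text
```

The loose IPv4 pattern will also catch four-part version strings such as `1.2.3.4`, which is the kind of false positive an "aggressive" scrubber accepts in exchange for not leaking addresses.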
Built a local-first context engine for AI coding agents: symbol graph + semantic search, no cloud
Sharing a project I've been building: Argyph, an MCP server that gives AI coding agents (Claude, or anything that speaks MCP) structured and semantic understanding of a codebase.

The problem: agents are good at reasoning but bad at retrieval. They grep, guess, and pull whole files into a limited context window. Most context tools that try to fix this depend on a cloud vector database and a remote embedding API.

Argyph runs entirely locally: single binary, embedded vector store, bundled embedding model, no API key. It builds a three-tier index (file inventory → tree-sitter symbol graph → embeddings), each tier usable before the next finishes, so the agent can query almost immediately. It's read-only by design: it never edits, commits, or runs code.

Open source, Rust, MIT/Apache-2.0.

GitHub: https://github.com/Ezzy1630/argyph

submitted by /u/Its-Ezzy
Help passing data between two Claudes
I'm having trouble and Claude wasn't much help. I use Claude Desktop, and have a work Claude account (provided by/paid for by work) and a personal Claude account. They must remain separate, and they are installed on different machines.

HOWEVER, I would like for Personal Claude to have *some* level of insight into what Work Claude is helping me with. Not all the details, just the high level, so it has context. If you've seen Severance, I'm trying to build a bridge between my Innie and my Outie.

I was able to write a skill for my Work Claude, that I can initiate in any session, to have it store pertinent details. Then, I wanted to set up a scheduled task for both Work and Personal. Work to write a brief and store it ... somewhere ... and then Personal to read that brief and ingest.

My first thought was a Google Doc. Clean and easy, except that the Google Drive/Docs MCP apparently doesn't support write functionality. I explored using webhooks and n8n, but that was a cluster as well. Finally settled on Notion, and it works, but it's messy because I don't use Notion for literally anything else, and based on my light research, I don't really have any other need for it outside of this.

I'm also trying to get set up with Obsidian and figure out how to use that to keep myself more organized. That would actually be a perfect use case for this document to pass things from Work to Personal, except for the whole local-storage thing.

Any other suggestions? I dislike having a single-use tool like Notion in the middle. The key limitation is that this is a fully automated effort, using Scheduled Actions in Claude Cowork.

EDIT: I appreciate the concern and watch-outs. These are things that I had already considered, which is why I built a specific skill to generate these, and included a HITL review point, as well. The skill helps to specify that I am *not* looking for details or any protected business information. I'm specifically looking for notes, comments, and insights about **how I work, how I make decisions (or avoid them), and other observations about who I am at work**, which is somewhat different than how I manage things in my personal life. The output and information that I'm sharing is something I would be absolutely comfortable sharing with my boss or our CEO/CTO.
Yes, SingleStore offers a free tier. Pricing found: $0.99/hr, $600, $1.49/hr, $0.99/hr, $0.00
Key features include: Low latency, high concurrency, Fast ingestion, Horizontal scalability, Clusters, Security, Compliance, Availability, Observability + Manageability.
SingleStore is commonly used for: Real-time analytics for financial transactions, Personalized content delivery for streaming services, Fraud detection in banking applications, Ride-share app driver and passenger matching, IoT data processing for smart devices, E-commerce recommendation engines.
SingleStore integrates with: Apache Kafka, Apache Spark, Tableau, Looker, Power BI, AWS S3, Google Cloud Storage, Microsoft Azure, Zapier, Salesforce.
Based on user reviews and social mentions, the most common pain points are: token usage, cost tracking.
Based on 55 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.