Introducing our new Windsurf pricing plans
Based on the provided content, there is insufficient information to summarize user sentiment about Windsurf specifically. The social mentions discuss AI coding tools in general, with one TikTok post mentioning testing various AI coding tools for $500 and recommending V0 or Lovable for quick prototyping, and a Hacker News post about AI coding agents' limitations with architecture compliance. However, neither mention explicitly references Windsurf or provides direct user feedback about it. More specific reviews and mentions about Windsurf would be needed to provide an accurate sentiment summary.
- Mentions (30d): 0
- Reviews: 0
- Platforms: 4
- Sentiment: 0% (0 positive)
Features
- Industry: Information Technology & Services
- Employees: 120
- Funding Stage: Merger / Acquisition
- Total Funding: $2.6B
I wasted $500 testing AI coding tools so you don't have to 💸 Here's what actually works:

🧪 Testing ideas? → V0 or Lovable. Built a landing page in 90 seconds. Fully clickable, looked real. Code's messy but perfect for validation.

🏗️ Shipping real apps? → Bolt. Full dev environment in your browser. I built a document uploader with front end + back end + database in one afternoon.

💻 Coding with AI? → Cursor or Windsurf. Cursor = stable, used by Google engineers. Windsurf = faster, newer, more aggressive. Both are insane.

📚 Learning from scratch? → Replit. Best coding teacher I've found. Explains errors, walks you through fixes, teaches as you build.

Here's what 500+ hours taught me: the tool doesn't matter if you're using it for the wrong stage. Testing ≠ Building ≠ Coding ≠ Learning. Stop comparing features. Match your goal first.

Drop what you're building 👇 I'll tell you exactly which tool to use. Save this. You'll need it.

#AI #AITools #TechTok #ChatGPT #Coding
Pricing found: $10, $0/month, $20/month, $200/month, $40/user
Claude Code can now see and control your code editor.
Been shipping updates fast on claude-ide-bridge and wanted to share what's new.

The big additions:
- Claude can now leave notes directly in your editor as you work; instead of dumping a wall of text in the chat, it highlights the exact lines it's talking about
- "Show me everything that calls this function" now actually works — Claude traces the full chain up and down through your code
- Claude can take a change all the way from your editor to a finished GitHub pull request in a single session, no manual steps
- Claude runs your tests, reads what broke, fixes it, and runs them again on its own
- One command (claude-ide-bridge init) sets everything up automatically: detects your editor, installs what's needed, and configures itself

Works with VS Code, Windsurf, Cursor, and Antigravity. Built using Claude Code. github.com/Oolab-labs/claude-ide-bridge — free and open source.

submitted by /u/wesh-k
I built a tool that tells AI coding agents which files actually matter before they edit your code
I've been building an open source tool called Contextception. The core idea is simple: AI coding agents are good at writing code, but they're often bad at knowing what they should understand before they start editing. They read the file you pointed at, inspect a few imports, maybe grep around a bit, and then begin making changes. That works until they miss a dependency, a caller contract, a shared type, hidden coupling, or a risky nearby file that should have been reviewed first.

The usual workaround is to dump a large amount of repo context into the model. That is expensive, noisy, and still not the same thing as giving the agent the right context.

Contextception solves that deterministically. It builds a graph of your codebase, analyzes the dependency neighborhood around a file, and returns the files, tests, and risks that actually matter before the edit happens. It does this locally, fast, and with zero token cost. No extra model call to figure out what files matter. No giant repo dump. Just the right dependency-aware context at the right time.

Recent releases also added automatic Claude Code setup and hooks. So this is not "remember to use the tool." Install once, run setup once, and Claude automatically gets the right dependency-aware context before every edit.

What Contextception does

It builds a dependency-aware graph of your codebase and answers: what files must be understood before safely changing this file?
```
contextception index
contextception analyze src/auth/login.py
```

Here's a trimmed example of the output:

```json
{
  "subject": "src/auth/login.py",
  "confidence": 0.92,
  "must_read": [
    { "file": "src/auth/session.py", "symbols": ["create_session", "refresh_token"], "role": "foundation" },
    { "file": "src/auth/types.py", "symbols": ["User", "AuthConfig"], "role": "utility", "stable": true },
    { "file": "src/auth/middleware.py", "symbols": ["login_handler"], "direction": "imported_by", "role": "orchestrator" }
  ],
  "likely_modify": {
    "high": [
      { "file": "src/auth/session.py", "signals": ["imports", "co_change:12"] }
    ]
  },
  "tests": [
    { "file": "tests/auth/test_login.py", "direct": true },
    { "file": "tests/auth/test_session.py", "direct": false }
  ],
  "related": {
    "hidden_coupling": [
      { "file": "src/api/error_handlers.py", "signals": ["hidden_coupling:4"] }
    ]
  },
  "blast_radius": { "level": "medium", "fragility": 0.45 },
  "hotspots": ["src/auth/session.py"]
}
```

What I wanted was not "more repo text." I wanted ranked, explained context:
- must_read → what to understand first
- likely_modify → what may need edits too
- tests → what should probably be run or reviewed
- hidden_coupling → relationships imports miss
- blast_radius → how risky the surrounding impact is
- hotspots → high-churn, high-fan-in files that deserve extra care

So instead of throwing a giant pile of code at an agent and hoping it notices the right files, you can hand it a focused map first.
It also does blast radius + hotspot analysis

These turned out to be some of the most useful views:
- Pipeline view — repo → index → analyze → ranked results
- Blast radius view — critical / warning / related change impact
- Hotspot view — high churn + high fan-in = architectural risk

These have been especially useful for thinking about refactors and risky files, not just agent context.

MCP support

It also ships as an MCP server, so Claude Code, Cursor, Windsurf, and other MCP-compatible tools can query it directly:

```json
{
  "mcpServers": {
    "contextception": {
      "command": "contextception",
      "args": ["mcp"]
    }
  }
}
```

Goals: open source, fully offline, token-efficient, explainable, fast after indexing, useful for both humans and agents.

Supported languages: Python, TypeScript / JavaScript, Go, Java, Rust.

Install:

```
brew install kehoej/tap/contextception
```

or

```
go install github.com/kehoej/contextception/cmd/contextception@latest
```

Links:
- GitHub: https://github.com/kehoej/contextception
- MCP guide: https://github.com/kehoej/contextception/blob/main/docs/mcp-tutorial.md
- Benchmarks: https://github.com/kehoej/contextception/tree/main/benchmarks

MIT licensed. Would love feedback from people using AI coding agents, especially around what would make this most useful in real day-to-day development.

submitted by /u/Kehoe
I used Claude Code to build a CLAUDE.md compiler — it reads your CI and generates governance for all 13 AI tools. Here's what I learned.
I've been using Claude Code as my primary coding tool for the past few months and kept running into the same problem: my CLAUDE.md would drift from my actual CI. I'd update a test runner or add a lint step, and CLAUDE.md would still reference the old commands. Claude would then suggest running commands that don't exist.

So I built crag — largely with Claude Code itself — to solve this. It reads your repo's CI workflows, package.json, and configs, then generates a governance.md that captures your actual gates. Then it compiles that file to CLAUDE.md and 12 other tool formats.

What I learned building this with Claude Code:

The biggest insight was treating CLAUDE.md as a compiled artifact instead of a hand-written doc. Once I framed it that way, the architecture fell into place quickly. Claude Code was especially good at the pattern-matching logic for detecting CI commands across 7 different CI systems — it understood YAML schemas for GitHub Actions, GitLab CI, CircleCI, etc. without much prompting.

Where Claude struggled: the compile targets each have quirky format requirements (Cursor wants MDC frontmatter with YAML, Windsurf wants trigger patterns, AGENTS.md wants numbered steps). I had to be very specific in my CLAUDE.md about these format rules — which is ironic, given that's the problem the tool solves.

What it does:
- crag analyze — scans your repo, generates governance.md from your real CI gates (under 1 second, no LLM)
- crag compile --target claude — compiles to CLAUDE.md (or --target all for all 13 targets)
- crag audit — tells you when your CLAUDE.md has drifted from reality
- crag hook install — pre-commit hook that auto-recompiles when governance changes

It also installs Claude Code skills (pre-start context loading) that give Claude your full governance context at session start.

I benchmarked it on 50 top open-source repos — 46% had governance drift. Grafana's CLAUDE.md is literally 1 line (@AGENTS.md), but crag found 67 quality gates across their CI.
Free to use, MIT licensed, zero dependencies: npx @whitehatd/crag on any repo. GitHub: https://github.com/WhitehatD/crag

submitted by /u/Acceptable_Debate393
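crag's exact detection logic isn't published in this post, but the core move — deriving quality gates from a repo's real config files rather than hand-writing them — can be sketched. A minimal version, assuming we only look at `package.json` scripts (the keyword list and function name are illustrative):

```python
import json

def extract_gates(package_json_text: str) -> dict:
    """Pull likely quality gates (lint/test/build-style scripts) out of package.json."""
    pkg = json.loads(package_json_text)
    gate_keywords = ("lint", "test", "build", "typecheck", "format")
    # Keep only scripts whose name suggests a CI gate
    return {
        name: cmd
        for name, cmd in pkg.get("scripts", {}).items()
        if any(keyword in name for keyword in gate_keywords)
    }

example = '{"scripts": {"dev": "vite", "lint": "eslint .", "test": "vitest run"}}'
print(extract_gates(example))  # → {'lint': 'eslint .', 'test': 'vitest run'}
```

A real implementation would also parse CI workflow YAML, as the post describes, but the principle is the same: the gates come from what the repo actually runs.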
My dev team is burning through Claude / Cursor credits like crazy — how do you control AI usage in a team?
I run a dev team of 15 engineers. Recently we started using AI tools like Claude, Cursor, and Windsurf.

Initially, I tried:
- Shared Cursor/Windsurf accounts (2 people per seat)
- Then upgraded to multiple Claude Max subscriptions ($100 each)

But we're facing a serious issue. Developers are using AI in "full speed mode":
- Spawning multiple agents
- Running large prompts
- No control on usage
- Credits get exhausted mid-day or within hours

Even when I try to scale:
- If I buy more seats → usage just scales up
- If I give "unlimited" → it still gets exhausted fast

Now I'm considering:
- Moving to API-based usage with per-developer budgets ($20–$50/month)
- Or restricting usage with strict rules

But I'm worried:
- Dev productivity might drop
- Team is already "used to" Claude-level performance
- They resist switching to cheaper models

Key question: how are other teams managing AI usage at scale without burning costs? Specifically:
- Do you use per-user API budgets?
- Any tools for tracking usage per dev?
- How do you prevent "AI overuse" behavior?
- Do you enforce rules or just let teams manage themselves?

Would really appreciate practical strategies from teams dealing with this. Thanks!

submitted by /u/BackgroundTimely5490
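For the per-developer budget idea above, the bookkeeping itself is simple enough to sketch. A minimal tracker, assuming API-based usage where token counts are visible per request (the class name and the per-1k-token prices are illustrative, not real vendor pricing):

```python
from dataclasses import dataclass, field

@dataclass
class BudgetTracker:
    monthly_budget_usd: float
    spent_usd: dict = field(default_factory=dict)

    def record(self, dev: str, input_tokens: int, output_tokens: int,
               usd_per_1k_in: float = 0.003, usd_per_1k_out: float = 0.015) -> float:
        """Charge one request's token usage against a developer's monthly total."""
        cost = input_tokens / 1000 * usd_per_1k_in + output_tokens / 1000 * usd_per_1k_out
        self.spent_usd[dev] = self.spent_usd.get(dev, 0.0) + cost
        return cost

    def allow(self, dev: str) -> bool:
        """Gate further requests once the monthly budget is exhausted."""
        return self.spent_usd.get(dev, 0.0) < self.monthly_budget_usd

tracker = BudgetTracker(monthly_budget_usd=30.0)
tracker.record("alice", input_tokens=200_000, output_tokens=50_000)  # ≈ $1.35
print(tracker.allow("alice"))  # → True
```

In practice this logic would sit in a proxy in front of the API so the gate is enforced rather than advisory.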
Made Claude Code actually understand my codebase — local MCP server with symbol graph + memory tied to git
I've been frustrated that Claude Code either doesn't know what's in my repo (so every session starts with re-explaining the architecture) or guesses wrong about which files matter. Cursor's @codebase kind of solves it but requires uploading to their cloud, which is a no-go for some of my client work.

So I built Sverklo — a local-first MCP server that gives Claude Code (and Cursor, Windsurf, Antigravity) the same mental model of my repo that a senior engineer has. Runs entirely on my laptop. MIT licensed. No API keys. No cloud.

What it actually does in a real session

Before sverklo: I ask Claude Code "where is auth handled?" It guesses based on file names, opens the wrong file, reads 500 lines, guesses again, eventually finds it.

After sverklo: Same question. Claude Code calls sverklo_search("authentication flow") and gets the top 5 files ranked by PageRank — middleware, JWT verifier, session store, login route, logout route. In one tool call. With file paths and line numbers.

Refactor scenario: I want to rename a method on a billing class. Claude Code calls sverklo_impact("BillingAccount.charge") and gets the 14 real callers ranked by depth, across the whole codebase. No grep noise from recharge, discharge, or a Battery.charge test fixture. The rename becomes mechanical.

PR review scenario: I paste a git diff. Claude Code calls sverklo_review_diff and gets a risk-scored review order — highest-impact files first, production files with no test changes flagged, structural warnings for patterns like "new call inside a stream pipeline with no try-catch" (the kind of latent outage grep can't catch).

Memory scenario: I tell Claude Code "we decided to use Postgres advisory locks instead of Redis for cross-worker mutexes." It calls sverklo_remember and the decision is saved against the current git SHA. Three weeks later when I ask "wait, what did we decide about mutexes?", Claude Code calls sverklo_recall and gets the decision back — including a flag if the relevant code has moved since.

The 20 tools in one MCP server

Grouped by job:
- Search: sverklo_search, sverklo_overview, sverklo_lookup, sverklo_context, sverklo_ast_grep
- Refactor safety: sverklo_impact, sverklo_refs, sverklo_deps, sverklo_audit
- Diff-aware review: sverklo_review_diff, sverklo_test_map, sverklo_diff_search
- Memory (bi-temporal, tied to git SHAs): sverklo_remember, sverklo_recall, sverklo_memories, sverklo_forget, sverklo_promote, sverklo_demote
- Index health: sverklo_status, sverklo_wakeup

All 20 run locally. Zero cloud calls after the one-time 90MB embedding model download on first run.

Install (30 seconds):

```
npm install -g sverklo
cd your-project && sverklo init
```

sverklo init auto-detects Claude Code / Cursor / Windsurf / Google Antigravity, writes the right MCP config file for each, appends sverklo instructions to your CLAUDE.md, and runs sverklo doctor to verify the setup. Safe to re-run on existing projects.

Before you install — a few honest things

Not magic. The README has a "when to use grep instead" section. Small repos (<50 files), exact string lookups, and single-file edits are all cases where the built-in tools are fine or better.

Privacy is a side effect, not the pitch. The pitch is the mental model. Local-first happens to come with it because running a symbol graph on your laptop is trivially cheap.

It's v0.2.16. Pre-1.0. I ran a structured 3-session dogfood protocol on my own tool before shipping this version — the log is public (DOGFOOD.md in the repo), including the four bugs I found in my own tool and fixed. I triage issues within hours during launch week.
Links:
- Repo: github.com/sverklo/sverklo
- Playground (see real tool output on gin/nestjs/react without installing): sverklo.com/playground
- Benchmarks (reproducible with npm run bench): BENCHMARKS.md in the repo
- Dogfood log: DOGFOOD.md in the repo

If you try it, tell me what breaks. I'll respond within hours and ship fixes fast.

submitted by /u/Parking-Geologist586
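The post doesn't show how sverklo computes its PageRank ranking, but the general idea — ranking files by PageRank over an import graph so heavily-depended-on files surface first — can be sketched with a plain power iteration (toy graph, hypothetical file names):

```python
def pagerank(graph, damping=0.85, iters=50):
    """graph: {node: [nodes it links to]} — here, 'file A imports file B'."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, targets in graph.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: redistribute its rank evenly
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
        rank = new
    return rank

# Toy import graph: edges point from importer to imported file
imports = {
    "login.py": ["session.py", "types.py"],
    "middleware.py": ["session.py"],
    "session.py": ["types.py"],
    "types.py": [],
}
ranks = pagerank(imports)
print(max(ranks, key=ranks.get))  # → 'types.py', the file everything ultimately depends on
```

This is why a query like "authentication flow" can return the foundational files first: rank flows toward whatever the rest of the codebase transitively leans on.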
My Claude.md file
This is my Claude.md file; it is the same information as my Gemini.md, as I use Claude Max and Gemini Ultra.

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Atlas UX** is a full-stack AI receptionist platform for trade businesses (plumbers, salons, HVAC). Lucy answers calls 24/7, books appointments, sends SMS confirmations, and notifies via Slack — for $99/mo. It runs as a web SPA and Electron desktop app, deployed on AWS Lightsail. The project is in Beta with built-in approval workflows and safety guardrails.

## Commands

### Frontend (root directory)

```bash
npm run dev             # Vite dev server at localhost:5173
npm run build           # Production build to ./dist
npm run preview         # Preview production build
npm run electron:dev    # Run Electron desktop app
npm run electron:build  # Build Electron app
```

### Backend (cd backend/)

```bash
npm run dev            # tsx watch mode (auto-recompile)
npm run build          # tsc compile to ./dist
npm run start          # Start Fastify server (port 8787)
npm run worker:engine  # Run AI orchestration loop
npm run worker:email   # Run email sender worker
```

### Database

```bash
docker-compose -f backend/docker-compose.yml up  # Local PostgreSQL 16
npx prisma migrate dev                           # Run migrations
npx prisma studio                                # DB GUI
npx prisma db seed                               # Seed database
```

### Knowledge Base

```bash
cd backend && npm run kb:ingest-agents  # Ingest agent docs
cd backend && npm run kb:chunk-docs     # Chunk KB documents
```

## Architecture

### Directory Structure

- `src/` — React 18 frontend (Vite + TypeScript + Tailwind CSS)
  - `components/` — Feature components (40+, often 10–70KB each)
  - `pages/` — Public-facing pages (Landing, Blog, Privacy, Terms, Store)
  - `lib/` — Client utilities (`api.ts`, `activeTenant.tsx` context)
  - `core/` — Client-side domain logic (agents, audit, exec, SGL)
  - `config/` — Email maps, AI personality config
  - `routes.ts` — All app routes (HashRouter-based)
- `backend/src/` — Fastify 5 + TypeScript backend
  - `routes/` — 30+ route files, all mounted under `/v1`
  - `core/engine/` — Main AI orchestration engine
  - `plugins/` — Fastify plugins: `authPlugin`, `tenantPlugin`, `auditPlugin`, `csrfPlugin`, `tenantRateLimit`
  - `domain/` — Business domain logic (audit, content, ledger)
  - `services/` — Service layer (`elevenlabs.ts`, `credentialResolver.ts`, etc.)
  - `tools/` — Tool integrations (Outlook, Slack)
  - `workers/` — `engineLoop.ts` (ticks every 5s), `emailSender.ts`
  - `jobs/` — Database-backed job queue
  - `lib/encryption.ts` — AES-256-GCM encryption for stored credentials
  - `lib/webSearch.ts` — Multi-provider web search (You.com, Brave, Exa, Tavily, SerpAPI) with randomized rotation
  - `ai.ts` — AI provider setup (OpenAI, DeepSeek, OpenRouter, Cerebras)
  - `env.ts` — All environment variable definitions
- `backend/prisma/` — Prisma schema (30KB+) and migrations
- `electron/` — Electron main process and preload
- `Agents/` — Agent configurations and policies
  - `policies/` — SGL.md (System Governance Language DSL), EXECUTION_CONSTITUTION.md
  - `workflows/` — Predefined workflow definitions

### Key Architectural Patterns

**Multi-Tenancy:** Every DB table has a `tenant_id` FK. The backend's `tenantPlugin` extracts `x-tenant-id` from request headers.

**Authentication:** JWT-based via `authPlugin.ts` (HS256, issuer/audience validated). Frontend sends the token in the Authorization header. Revoked tokens are checked against a `revokedToken` table (fail-closed). Expired revoked tokens are pruned daily.

**CSRF Protection:** DB-backed synchronizer token pattern via `csrfPlugin.ts`. Tokens are issued on mutating responses, stored in `oauth_state` with a 1-hour TTL, and validated on all state-changing requests. Webhook/callback endpoints are exempt (see `SKIP_PREFIXES` in the plugin).

**Audit Trail:** All mutations must be logged to the `audit_log` table via `auditPlugin`. Successful GETs and health/polling endpoints are skipped to reduce noise. On DB write failure, audit events fall back to stderr (never lost). Hash chain integrity (SOC 2 CC7.2) via `lib/auditChain.ts`.

**Job System:** Async work is queued to the `jobs` DB table (statuses: queued → running → completed/failed). The engine loop picks up jobs periodically.

**Engine Loop:** `workers/engineLoop.ts` is a separate Node process that ticks every `ENGINE_TICK_INTERVAL_MS` (default 5000ms). It handles the orchestration of autonomous agent actions.

**AI Agents:** Named agents (Atlas=CEO, Binky=CRO, etc.) each have their own email accounts and role definitions. Agent behavior is governed by SGL policies.

**Decisions/Approval Workflow:** High-risk actions (recurring charges, spend above `AUTO_SPEND_LIMIT_USD`, risk tier ≥ 2) require a `decision_memo` approval before execution.

**Frontend Routing:** Uses `HashRouter` from React Router v7. All routes are defined in `src/routes.ts`.

**Code Splitting:** Vite config splits chunks into `react-vendor`, `router`, `ui-vendor`, `charts`.

**ElevenLabs Voice Agents:** Lucy's
I got tired of re-explaining myself to Claude every session, so I built something
I got tired of re-explaining myself to every AI tool, so I built one that makes my context portable.

Hello everyone out there using AI every day. I build cardiac implants at Boston Scientific during the day, and I'm a 1st-year CS student. I use Claude, ChatGPT, Cursor, and Gemini daily to improve my skills and my productivity. But every tool starts from zero. Claude doesn't know what I told Cursor. ChatGPT forgets my preferences. Gemini has no idea about my stack. I was spending the first 5 minutes of every session re-explaining who I am. Over and over.

So I built aura-ctx: a free, open-source CLI that defines your AI identity once and serves it to all your tools via MCP. One source of truth. Everything stays local. No cloud. No lock-in.

This is not another memory layer. Mem0, Zep, and Letta solve agent memory for developers. aura-ctx solves something different: the end user who wants to own and control their identity across tools. No Docker. No Postgres. No Redis. No auth tokens to manage. Just:

```
pip install -U aura-ctx
aura quickstart
```

Why local-first matters here: your MCP server runs on localhost. No network latency. No auth hell. No token refresh. If you've dropped cloud-based MCP servers because of the overhead, this is the opposite architecture.

Portability is by design: your entire identity lives in ~/.aura/packs/. Move machines? Copy the folder. That's it.

Security built-in: aura audit scans your packs for accidentally stored secrets (API keys, tokens, credentials) before they leak into your context.

v0.3.3 is out with 3,500+ downloads. Supports 8 AI tools including Claude Desktop, Cursor, Windsurf, Gemini CLI, Claude Code, and more. Exports to CLAUDE.md and AGENTS.md for agent frameworks.

Still early. I'd like any feedback on what works, what doesn't, and what's missing. Curious: do you re-explain yourself every time you open Claude, or have you found a better way?
GitHub: https://github.com/WozGeek/aura-ctx

submitted by /u/Miserable_Celery9917
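aura-ctx's actual `aura audit` implementation isn't shown here, but the kind of secret scan it describes can be sketched with a few regexes. The patterns below are illustrative and far from exhaustive:

```python
import re

# Illustrative patterns only — a real scanner would carry many more, plus entropy checks
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github_token": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "generic_api_key": re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
}

def audit(text: str):
    """Return (line number, pattern name) for every suspected secret in a pack."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings

pack = 'role: backend dev\napi_key = "sk-1234567890abcdef1234"\n'
print(audit(pack))  # → [(2, 'generic_api_key')]
```

Running a check like this before any pack is served over MCP is what keeps an accidentally pasted credential out of every downstream tool's context.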
How I cut Claude Code usage in half (open source)
Every time I start a Claude Code session on a real codebase, it burns through tokens just trying to understand the repo: read the file tree, open 20 files, trace the imports, figure out how auth connects to the API layer. On a 50k+ LOC project, that exploration phase eats your context window before any real work starts.

I built Repowise to fix this. It's a codebase intelligence layer that pre-computes the structural knowledge Claude Code needs and exposes it through MCP tools: dependency graphs via AST parsing, searchable docs in LanceDB, git history tracking, architectural decision records. All local, nothing leaves your machine. Instead of Claude spelunking through your files every session, it calls something like `get_context` or `get_overview` and gets the full picture in one shot. Eight MCP tools total, including `get_risk`, `search_codebase`, `get_dependency_path`, and `get_dead_code`.

The savings come from the exploration side. That caveman prompt post from last week was clever for cutting output tokens; this attacks the input/exploration side. Claude already has the map, so it stops burning context just to get oriented.

Setup is just `pip install repowise`, then `repowise init` in your repo. Works with Claude Code, Cursor, and Windsurf. Fully open source, AGPL-3.0, self-hostable.

GitHub: https://github.com/repowise-dev/repowise

Would love your feedback!

submitted by /u/Obvious_Gap_5768
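Repowise's dependency-graph construction isn't shown in the post, but the first step of that kind of pre-computation — extracting each file's imports with Python's stdlib `ast` module instead of having the agent read the file — looks roughly like this (function name is mine):

```python
import ast

def extract_imports(source: str) -> set:
    """Return the set of top-level module names a Python source file imports."""
    tree = ast.parse(source)
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules

src = "import os\nimport app.auth.session\nfrom app.db import models\n"
print(sorted(extract_imports(src)))  # → ['app', 'os']
```

Run this over every file once at index time and you have the edges of a dependency graph; the agent then queries the graph instead of re-reading the repo each session.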
Claude Engram - persistent memory for Claude Code that auto-tracks mistakes and context
Some of you might remember my post a few months ago about Mini Claude. I had Claude build its own memory system after researching its own user complaints. That project worked, but the hook system was a pain. I shelved it. Then Claude Code got "open-sourced", and I could actually see how hooks like PostToolUseFailure, PreCompact, and all the lifecycle events work internally. Rewrote the whole thing with proper hook integration. Renamed it Claude Engram.

What changed from the original: the old version required Claude to manually call everything. The new version automatically hooks into Claude Code's tool lifecycle. Claude doesn't have to invoke anything for the core features to work.

How it works:
- Hooks intercept every edit, bash command, error, and session event. Zero manual effort.
- Before you edit a file, it surfaces past mistakes and relevant context, scored by file match, tags, and recency.
- Survives context compaction. Auto-checkpoints before, re-injects rules and mistakes after.
- Tiered storage. Hot memories stay fast, old ones archive to cold storage. Searchable, restorable.
- Multi-project workspaces. Memories scoped per project, workspace-level rules cascade down.
- Hybrid search using AllMiniLM. Keyword + vector + reranking. No ChromaDB dependency.

Update — v0.4.0: Session Mining

Since the original post, engram now mines your Claude Code session logs automatically. This is the big addition. Claude Code stores your full conversation as JSONL files. After every session, engram parses them in the background and extracts what hooks can't capture:
- Decisions, mistakes, and approach changes extracted from conversation flow (not regex — structural analysis + AllMiniLM semantic scoring, naturally typo-tolerant)
- Searchable index across all past conversations — "what did we discuss about auth?" returns results in 112ms; every user message and assistant response from every past session gets embedded and indexed (7310 messages across 11 sessions in testing)
- Detects recurring struggles, error patterns across sessions, and which files are always edited together
- Predictive context — before you edit a file, it surfaces related files and likely errors from your history
- Cross-project learning — finds patterns that hold across all your projects
- Retroactive bootstrap — install on an existing project and it mines all your past sessions automatically

Benchmarks:
- Decision Capture (220 prompts): 97.8% precision
- Injection Relevance (50 memories): 14/15, 100% isolation
- Compaction Survival: 6/6
- Error Auto-Capture (53 payloads): 100% recall, 97% precision
- Multi-Project Scoping: 11/11
- Session Mining Foundation: 27/27
- Obsidian Vault Compatibility: 25/25
- Cross-session search: 112ms over 7310 indexed messages

Not just Claude Code: the MCP server works with any MCP client — Cursor, Windsurf, Zed, Continue.dev. Claude Code gets the full auto-capture hooks + session mining on top. Also works with Obsidian vaults (PARA + CLAUDE.md structure). Tested and verified.

No cloud, no API costs, runs locally. MIT licensed.

https://github.com/20alexl/claude-engram

submitted by /u/Crunchy-Nut1
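Engram's exact "keyword + vector + reranking" pipeline isn't published in the post, but the hybrid-scoring idea can be illustrated with a toy scorer that blends set overlap with cosine similarity. The 3-dimensional vectors stand in for AllMiniLM's real 384-dimensional embeddings, and the 50/50 weighting is a guess:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_score(query_terms, doc_terms, query_vec, doc_vec, alpha=0.5):
    """Blend keyword overlap (Jaccard) with embedding similarity."""
    overlap = len(set(query_terms) & set(doc_terms)) / len(set(query_terms) | set(doc_terms))
    return alpha * overlap + (1 - alpha) * cosine(query_vec, doc_vec)

# Toy example: a memory about the auth flow scores well on both channels
score = hybrid_score(
    ["auth", "flow"], ["auth", "login", "session"],
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
)
print(round(score, 3))
```

The keyword term keeps exact identifiers findable even when embeddings drift, while the vector term catches paraphrases and typos the keywords miss.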
I'm a retail worker in Taiwan who built a 65-subsystem AI operating system with Claude Code in 3 months — here's the honest story (including the part where I've made $0)
I work at a chain retail store in Taiwan. No CS degree. No engineering background. I've had 6-7 jobs, all entry-level service work. In December 2025 I wanted out. My idea: build an AI system that generates income, then routes it into an automated investment engine — a self-reinforcing growth loop where AI runs both sides. 3 months and ~177,000 lines of code later, here's what exists. All built with Claude Code as my primary tool. --- **What I built (4 repos, all open for browsing)** **CAIOS** — "Central AI Operating System." 65 subsystems, 657 Python files, 154,740 lines of code, 46 database tables, 2,792 tests. Runs 30 scheduled jobs on a single GCP VM — morning briefs at 08:00, anomaly patrols every 30 min, daily reports at 20:00, memory sync at 23:00. All delivered through a Telegram bot. **creatoraitools.tools** — a Next.js 15 / React 19 web platform. 233 files, 21,395 lines of TypeScript, 20 pages, 30 API routes. Free to use, no login required for the tools. You can browse it right now. **joseph** — a Taiwan stock trading engine. Scans, scores, simulates, reports. Running in dry-run mode every weekday at 08:00. Live trading is permanently locked in source code (not config — more on this below). **buildhub-patrol** — a watchdog. Playwright e2e tests nightly at 03:00, health patrols every 6 hours. --- **How Claude Code was involved** Everything. I cannot write code — not one line from memory. My workflow: I describe what I want in natural language Claude Code writes the implementation I test and verify the result Iterate When I started in December 2025, I was copy-pasting chatbot output into Python files without understanding any of it. Then I found Windsurf, which helped but felt limited. Claude Code was the turning point — it plans, writes, debugs, tests, and explains in a way I can actually follow and direct. It's the difference between "AI writes code for me" and "AI is my engineering partner." I use Claude Code via the CLI with a Max subscription. 
Vertex AI / Gemini is my fallback. The entire CAIOS memory system is built on top of Claude Code's auto-memory feature — every session reads and writes to a persistent MEMORY.md so Claude already knows the full project context when I start a new conversation. --- **The hard lessons (real entries from my project memory)** **1. A watchdog that flaps is worse than no watchdog.** My web console's watchdog started flapping — restarting itself in a loop. I deliberately killed both the console and its watchdog, then wrote the re-enable steps into the memory file. System has been stable since. Lesson: ship the off switch before the feature. **2. When an autonomous loop produces garbage, stop it first.** My ADO (Autonomous Development OS) backlog ingestion twice exploded — the loop kept ingesting markdown fragments as new work items. Fix: stop → fix root cause → restart. Not "patch while running." I have 987 cancelled work packages in the database as a reminder. **3. Irreversible actions get compile-time blocks, not config flags.** Joseph's live trading is hard-coded `False` in the adapter — not a config toggle. Why? Because `bool(settings.allow_push)` under MagicMock silently evaluates truthy and bypasses the safety check. The fix — `if settings.allow_push is True` — is two extra characters that prevent an entire class of test-only false negatives. For anything you can't undo (real money, force pushes, database wipes), the guard belongs in source code. --- **What actually worked, ranked by impact** **Build the operations layer first.** I wired everything to Telegram on day one. Once I didn't need SSH to check on things, my throughput jumped 10x. The interface to all 65 subsystems is one chat thread. **Memory system on day one.** A persistent, structured memory file means Claude doesn't start from zero every session. The compounding is enormous. If you take one thing from this post: set up memory before you build features. **Schedule everything.** 30 jobs run on a clock. 
Morning briefs, anomaly patrols, daily reports, memory sync — all happen while I sleep. Cron is the most underrated framework in the world.

**Off switches before features. Approval gates before automation.** Every CAIOS action has a risk classification. Risky actions stop at an approval gate and wait for me to tap a Telegram button. Safe actions run and notify me after.

**2,792 tests are how I sleep at night.** Many are AI-generated, but I read every one. When you run autonomous loops, tests are the only thing between "the system fixed itself" and "the system silently destroyed itself."

---

**The honest part**

I have not made a single dollar from any of this. The trading engine works, but I don't have capital to run it live. The web platform has almost zero organic traffic (2,910 impressions, 10 clicks in 28 days). Most "I built X with AI" posts skip this part. I'm not skipping it. What it has proven is that the gap between "I have an idea" and "I have
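The MagicMock truthiness trap behind the post's hard-coded live-trading lock can be reproduced in a few lines. `settings` and `allow_push` are the names the post uses; everything else is a minimal sketch:

```python
from unittest.mock import MagicMock

# A test double standing in for the real settings object.
settings = MagicMock()

# The trap: attribute access on a MagicMock returns another MagicMock,
# and every MagicMock is truthy -- so a truthiness guard silently
# passes in tests even though nothing was ever configured.
trap = bool(settings.allow_push)

# The fix: compare identity against the literal True, which only a
# real, explicitly set boolean can satisfy.
fix = settings.allow_push is True

print(trap, fix)  # True False
```

Under a mock, the truthiness check would have let a "live trading enabled" branch run in tests; the identity check keeps the irreversible path blocked unless someone deliberately sets the flag to the boolean `True`.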
AGENTS.md is the most important file in your Codex repo and nobody's testing theirs — I built a blind evaluation pipeline to fix that
I built this with Claude Code over a few months — the optimization pipeline, evaluation harness, and website. Posting here because AGENTS.md is one of the skill formats it optimizes, and Codex users are the ones most likely to care about measurable agent performance.

Free to try: the optimized brainstorming skill is a direct download at presientlabs .com/free — no account, no credit card. It comes packaged for Claude, Codex, Cursor, Windsurf, ChatGPT, and Gemini alongside the original, so you can A/B it yourself.

---

**The AGENTS.md problem**

Codex runs on AGENTS.md. That file shapes every decision the agent makes — what to prioritize, how to structure code, when to ask vs. decide, what patterns to follow. Most people write it once from a template or a blog post and never validate it. You have no way to know whether your AGENTS.md is actually improving agent output or subtly degrading it.

The same applies across the ecosystem:

- CLAUDE.md for Claude Code
- .cursorrules for Cursor
- .windsurfrules for Windsurf
- Custom Instructions for ChatGPT
- GEMINI.md for Gemini

These are all skills — persistent instruction layers. And none of them have a test suite.

---

**What I built**

A pipeline that treats skills like code: measure, optimize, validate.

- Multiple independent AI judges evaluate output from competing skill versions blind — no knowledge of which is original vs. optimized
- Every artifact is stamped with SHA-256 checksums — a tamper-evident verification chain
- Full judge outputs are published for audit

The output is a provable claim: "Version B beats Version A by X percentage points under blind conditions, verified by independent judges."
---

**Results**

Ran the brainstorming skill from the Superpowers plugin through the pipeline:

- 80% → 96% blind pass rate
- 10/10 win rate across independent judges
- 70% smaller file size (direct token savings on every agent invocation)

Also ran a writing-plans skill that collapsed to 46% after optimization — the optimizer gamed internal metrics without improving real quality. Published that failure as a case study. 5 out of 6 skills validated; 1 didn't.

If you're running Codex on anything non-trivial, your AGENTS.md is either helping or hurting. This pipeline tells you which — with numbers, not feelings.

---

**Refund guarantee**

If the optimized skill doesn't beat the original under blind evaluation, full refund. Compute cost is on me.

---

Eval data on GitHub: willynikes2/skill-evals. Free skill at presientlabs .com/free — direct download, no signup. (The space in "presientlabs .com" is intentional, to keep automod from eating the link while staying obvious to readers.)

submitted by /u/willynikes [link] [comments]
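The "every artifact stamped with SHA-256 checksums" claim describes a tamper-evident chain. This is a generic sketch of that idea, not the pipeline's actual API; `stamp` and `verify_chain` are invented names:

```python
import hashlib
import json

def stamp(artifact: dict, prev_digest: str = "") -> dict:
    """Attach a SHA-256 digest that also covers the previous artifact's
    digest, so editing any record breaks every downstream stamp."""
    payload = json.dumps(artifact, sort_keys=True) + prev_digest
    return {**artifact, "sha256": hashlib.sha256(payload.encode()).hexdigest()}

def verify_chain(artifacts: list) -> bool:
    """Recompute each digest from the record body plus the previous
    stamp; any mismatch means the chain was tampered with."""
    prev = ""
    for a in artifacts:
        body = {k: v for k, v in a.items() if k != "sha256"}
        payload = json.dumps(body, sort_keys=True) + prev
        if hashlib.sha256(payload.encode()).hexdigest() != a["sha256"]:
            return False
        prev = a["sha256"]
    return True

# Build a two-judge chain, then tamper with the first verdict.
chain, prev = [], ""
for verdict in ({"judge": 1, "winner": "B"}, {"judge": 2, "winner": "B"}):
    stamped = stamp(verdict, prev)
    chain.append(stamped)
    prev = stamped["sha256"]

print(verify_chain(chain))  # True
chain[0]["winner"] = "A"    # flip a judge verdict after the fact
print(verify_chain(chain))  # False
```

The chaining (hashing each record together with the previous digest) is what makes the published judge outputs auditable: a reader can recompute every stamp from the raw artifacts.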
Prism MCP — I gave my AI agent a research intern. It does not require a desk
So I got tired of my coding agent having the long-term memory of a goldfish and the research skills of someone who only reads the first Google result. I figured — what if the agent could just… go study things on its own? While I sleep? Turns out you can build this, and it's slightly cursed.

Here's what happens: on a schedule, a background pipeline wakes up, checks what you're actively working on, and goes full grad student. Brave Search for sources, Firecrawl to scrape the good stuff, Gemini to synthesize a report, then it quietly files it into memory at an importance level high enough that it's guaranteed to show up next time you talk to your agent. No "maybe the cosine similarity gods will bless us today." It's just there.

The part I'm unreasonably proud of: it's task-aware. Running multiple agents? The researcher checks what they're all doing and biases toward that. Your dev agent is knee-deep in auth middleware refactoring? The researcher starts reading about auth patterns.

It even joins the group chat — registers on a shared bus, sends heartbeats ("Searching...", "Scraping 3 articles...", "Synthesizing..."), and announces when it's done. It's basically the intern who actually takes notes at standups.

No API keys? It doesn't care. It falls back to Yahoo Search and local parsing. Zero cloud required.

I also added a reentrancy guard, because the first time I manually triggered it during a scheduled run, two synthesis pipelines started arguing with each other, and I decided that was a problem for present-me, not future-me.

Other recent rabbit holes:

- Ported Google's TurboQuant to pure TypeScript — my laptop now stores millions of memories instead of "a concerning number that was approaching my disk limit"
- Built a correction system. You tell the agent it's wrong, it remembers. Forever. It's like training a very polite dog that never forgets where you hid the treats
- One command reclaims 90% of old memory storage.
Dry-run by default, because I am a coward who previews before deleting.

Local SQLite, pure TypeScript, works with Claude/Cursor/Windsurf/Gemini/any MCP client. Happy to nerd out on architecture if anyone's building agents with persistent memory.

https://github.com/dcostenco/prism-mcp

submitted by /u/dco44 [link] [comments]
Reducing AI agent token consumption by 90% by fixing the retrieval layer
Quick insight from building retrieval infrastructure for AI agents: most agents stuff 50,000 tokens of context into every prompt. They retrieve 200 documents by cosine similarity, hope the right answer is somewhere in there, and let the LLM figure it out. When it doesn't, and it often doesn't, the agent re-retrieves. Every retry burns more tokens and money.

We built a retrieval engine called Shaped that gives agents 10 ranked results instead of 200. The results are scored by ML models trained on actual interaction data, not just embedding similarity. In production, this means ~2,500 tokens per query instead of 50,000. The agent gets it right the first time, so no retry loops.

The most interesting part: the ranking model retrains on agent feedback automatically. When a user rephrases a question or the agent has to re-retrieve, that signal trains the model. The model on day 100 is measurably better than on day 1, without any manual intervention.

We also shipped an MCP server, so it works natively with Cursor, Claude Code, Windsurf, VS Code Copilot, Gemini, and OpenAI.

If anyone's working on agent retrieval quality, I'd love to hear what approaches you've tried. Wrote up the full technical approach here: https://www.shaped.ai/blog/your-agents-retrieval-is-broken-heres-what-we-built-to-fix-it

submitted by /u/skeltzyboiii [link] [comments]
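The shift from "200 candidates by cosine similarity" to "10 ranked results" is, structurally, a rerank-and-truncate step. This is a generic sketch with a stand-in scoring function, not Shaped's actual model; the click counts here are a placeholder for the interaction data a trained ranker would use:

```python
def rerank(candidates, score, k=10):
    """Take a large recall set (e.g. 200 docs from vector search) and
    keep only the top-k under a learned relevance score."""
    return sorted(candidates, key=score, reverse=True)[:k]

# Stand-in corpus: (doc_id, cosine_similarity, click_count).
docs = [(i, 1.0 - i * 0.004, i % 7) for i in range(200)]

# Stand-in learned score: blends embedding similarity with a
# behavioral signal instead of trusting similarity alone.
top = rerank(docs, score=lambda d: 0.3 * d[1] + 0.7 * d[2], k=10)

print(len(top))  # 10 documents go into the prompt instead of 200
```

The token savings fall out directly: the prompt carries 10 documents' worth of context instead of 200, and a better-ordered top-10 reduces the retry loops that multiply the cost.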
What model does OpenAI Prism use?
Trendingtopics.eu claims that OpenAI Prism uses GPT 5.2 but fails to give any sources. On my side, I saw in a Prism chat that the model is referred to as Codex. What model does OpenAI Prism use?

submitted by /u/Franck_Dernoncourt [link] [comments]
I built an open-source context framework for Codex CLI (and 8 other AI agents)
Codex is incredible for bulk edits and parallel code generation. But every session starts from zero — no memory of your project architecture, your coding conventions, your decisions from yesterday.

What if Codex had persistent context? And what if it could automatically delegate research to Gemini and strategy to Claude when the task called for it?

I built Contextium — an open-source framework that gives AI agents persistent, structured context that compounds across sessions. I'm releasing it today.

**What it does for Codex specifically**

Codex reads an AGENTS.md file. Contextium turns that file into a context router — a dynamic dispatch table that lazy-loads only the knowledge relevant to what you're working on. Instead of a static prompt, your Codex sessions get:

- Your project's architecture decisions and past context
- Integration docs for the APIs you're calling
- Behavioral rules that are actually enforced (coding standards, commit conventions, deploy procedures)
- Knowledge about your specific stack, organized and searchable

The context router means your repo can grow to hundreds of files without bloating the context window. Codex loads only what it needs per session.

**Multi-agent delegation is the real unlock**

This is where it gets interesting. Contextium includes a delegation architecture:

- Codex for bulk edits and parallel code generation (fast, cheap)
- Claude for strategy, architecture, and complex reasoning (precise, expensive)
- Gemini for research, web lookups, and task management (web-connected, cheap)

The system routes work to the right model automatically based on the task. You get more leverage and spend less. One framework, multiple agents, each doing what it's best at.
**What's inside**

- Context router with lazy loading — triggers load relevant files on demand
- 27 integration connectors — Google Workspace, Todoist, QuickBooks, Home Assistant, and more
- 6 app patterns — briefings, health tracking, infrastructure remediation, data sync, goals, shared utilities
- Project lifecycle management — track work across sessions, with decisions logged and searchable via git
- Behavioral rules — not just documented, actually enforced through the instruction file

Works with 9 AI agents: Claude Code, Gemini CLI, Codex, Cursor, Windsurf, Cline, Aider, Continue, GitHub Copilot.

**Battle-tested**

I've used this framework daily for months: 100+ completed projects, 600+ journal entries, 35 app protocols running in production. The patterns shipped in the template are the ones that survived sustained real-world use.

Plain markdown. Git-versioned. No vendor lock-in. Apache 2.0.

**Get started**

```bash
curl -sSL contextium.ai/install | bash
```

Interactive installer with a gum terminal UI — it picks your agent, selects your integrations, optionally creates a GitHub repo, then launches your agent ready to go.

GitHub: https://github.com/Ashkaan/contextium
Website: https://contextium.ai

Happy to answer questions about the Codex integration or the delegation architecture.

submitted by /u/Ashkaan4 [link] [comments]
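The "context router" idea above (a dispatch table whose triggers lazy-load only the relevant knowledge files) can be sketched in a few lines. The trigger words and file paths here are invented for illustration, and the sketch is Python rather than the framework's markdown-driven implementation:

```python
from pathlib import Path

# Map trigger keywords to knowledge files. Nothing is read from disk
# until a session's task actually matches a trigger, so the repo can
# grow without bloating every prompt.
ROUTES = {
    "auth": "context/integrations/auth.md",
    "deploy": "context/rules/deploy-procedures.md",
    "billing": "context/integrations/quickbooks.md",
}

def load_context(task: str) -> str:
    """Return only the context relevant to this task."""
    parts = []
    for trigger, path in ROUTES.items():
        if trigger in task.lower():
            p = Path(path)
            if p.exists():  # lazy load: read only on match
                parts.append(p.read_text())
    return "\n\n".join(parts)

# Only context/rules/deploy-procedures.md would be read (if present);
# the auth and billing docs stay out of the context window entirely.
prompt_context = load_context("Refactor the deploy pipeline")
```

A real router would match on richer signals than substrings (file paths, branch names, recent decisions), but the shape is the same: dispatch table in, minimal context out.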
Yes, Windsurf offers a free tier. Pricing found: $10, $0/month, $20/month, $200/month, $40/user
Key features include: Trusted by Developers. Proven in Enterprises.
Based on user reviews and social mentions, the most common pain point is token cost.
Based on 22 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.
Shawn Wang
Founder at smol.ai
2 mentions