Claude Code Source Deep Dive — Literal Translation (Part 4)
Part III: Complete Prompt Original Texts for All Tools

3.1 Bash Tool (Shell Command Execution)

File: src/tools/BashTool/prompt.ts

Description prompt: Executes a given bash command and returns its output. The working directory persists between commands, but shell state does not. The shell environment is initialized from the user's profile (bash or zsh).

IMPORTANT: Avoid using this tool to run `find`, `grep`, `cat`, `head`, `tail`, `sed`, `awk`, or `echo` commands, unless explicitly instructed or after you have verified that a dedicated tool cannot accomplish your task. Instead, use the appropriate dedicated tool:
- File search: Use Glob (NOT find or ls)
- Content search: Use Grep (NOT grep or rg)
- Read files: Use Read (NOT cat/head/tail)
- Edit files: Use Edit (NOT sed/awk)
- Write files: Use Write (NOT echo >/cat <<EOF)

## Test plan
[Bulleted checklist]

3.2 Edit Tool (File Editing)

Performs exact string replacements in files.

Usage:
- You must use your `Read` tool at least once in the conversation before editing. This tool will error if you attempt an edit without reading the file.
- When editing text from Read tool output, ensure you preserve the exact indentation (tabs/spaces) as it appears AFTER the line number prefix. The line number prefix format is: line number + tab. Everything after that is the actual file content to match. Never include any part of the line number prefix in the old_string or new_string.
- ALWAYS prefer editing existing files in the codebase. NEVER write new files unless explicitly required.
- Only use emojis if the user explicitly requests it.
- The edit will FAIL if `old_string` is not unique in the file. Either provide a larger string with more surrounding context to make it unique or use `replace_all` to change every instance of `old_string`.
- Use `replace_all` for replacing and renaming strings across the file.

3.3 Read Tool (File Reading)

Reads a file from the local filesystem. You can access any file directly by using this tool.
Assume this tool is able to read all files on the machine. If the User provides a path to a file assume that path is valid. It is okay to read a file that does not exist; an error will be returned.

Usage:
- The file_path parameter must be an absolute path, not a relative path
- By default, it reads up to 2000 lines starting from the beginning of the file
- When you already know which part of the file you need, only read that part
- Results are returned using cat -n format, with line numbers starting at 1
- This tool allows Claude Code to read images (PNG, JPG, etc). When reading an image file the contents are presented visually as Claude Code is a multimodal LLM.
- This tool can read PDF files (.pdf). For large PDFs (more than 10 pages), you MUST provide the pages parameter to read specific page ranges. Maximum 20 pages per request.
- This tool can read Jupyter notebooks (.ipynb files) and returns all cells with their outputs.
- This tool can only read files, not directories. To read a directory, use an ls command via the Bash tool.

3.4 Write Tool (File Writing)

Writes a file to the local filesystem.

Usage:
- This tool will overwrite the existing file if there is one at the provided path.
- If this is an existing file, you MUST use the Read tool first to read the file's contents. This tool will fail if you did not read the file first.
- Prefer the Edit tool for modifying existing files — it only sends the diff. Only use this tool to create new files or for complete rewrites.
- NEVER create documentation files (*.md) or README files unless explicitly requested.
- Only use emojis if the user explicitly requests it.
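The "cat -n format" the Read tool's prompt describes (line number, then a tab, then the file content) can be sketched as a small Python helper. This is a hypothetical illustration of the output format, not the actual Claude Code implementation:

```python
from pathlib import Path

def read_numbered(file_path: str, offset: int = 1, limit: int = 2000) -> str:
    """Return file contents in cat -n style: line number + tab + content.
    Everything after the tab is the content an Edit must match exactly."""
    path = Path(file_path)
    if not path.is_absolute():
        raise ValueError("file_path must be an absolute path")
    lines = path.read_text().splitlines()
    window = lines[offset - 1 : offset - 1 + limit]
    return "\n".join(f"{n}\t{text}" for n, text in enumerate(window, start=offset))
```

This is why the Edit tool's prompt insists that the number-plus-tab prefix never appears in old_string or new_string: the prefix is display framing, not file content.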
3.5 Glob Tool (File Pattern Matching)

- Fast file pattern matching tool that works with any codebase size
- Supports glob patterns like "**/*.js" or "src/**/*.ts"
- Returns matching file paths sorted by modification time
- Use this tool when you need to find files by name patterns
- When you are doing an open ended search that may require multiple rounds of globbing and grepping, use the Agent tool instead

3.6 Grep Tool (Content Search)

A powerful search tool built on ripgrep

Usage:
- ALWAYS use Grep for search tasks. NEVER invoke `grep` or `rg` as a Bash command. The Grep tool has been optimized for correct permissions and access.
- Supports full regex syntax (e.g., "log.*Error", "function\s+\w+")
- Filter files with glob parameter (e.g., "*.js", "**/*.tsx") or type parameter
- Output modes: "content" shows matching lines, "files_with_matches" shows only file paths (default), "count" shows match counts
- Use Agent tool for open-ended searches requiring multiple rounds
- Pattern syntax: Uses ripgrep (not grep) - literal braces need escaping
- Multiline matching: By default patterns match within single lines only. For cross-line patterns, use `multiline: true`

3.7 Agent Tool (Sub-Agent Spawning)

Launch a new agent to handle complex, multi-step tasks autonomously. The Agent tool launches specialized agents (subprocesses) that
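The Glob tool's two defining behaviors (glob-pattern matching, results sorted by modification time) are easy to sketch in Python. A hypothetical illustration, not the real tool's code:

```python
from pathlib import Path

def glob_by_mtime(root: str, pattern: str) -> list[str]:
    """Match files under root against a glob pattern such as '**/*.js'
    and return paths sorted by modification time, newest first."""
    matches = [p for p in Path(root).glob(pattern) if p.is_file()]
    matches.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [str(p) for p in matches]
```

Newest-first ordering is a sensible default for an agent: recently modified files are usually the ones relevant to the current task.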
What behavioral tracing catches after your linter says everything looks fine.
Most Claude Code audit skills are pattern matchers. They grep for known-bad code in isolated files. Wrong property wrapper, swallowed error, force unwrap. Fast, precise, context-free. Basically a smarter linter. But there's a category of bug that no pattern matcher will ever catch: the code in every file is correct, but the outcome is wrong. Some examples:

- An export feature writes 8 columns. The matching import reads 6. Data loss on round-trip, invisible to any single-file review.
- A multi-select flow lets users pick 4 items. The next screen receives 1. Types check out, no crash, no warning. 75% of input silently discarded at a handoff point.
- Three functions consume the same model. Two read all the fields. One reads half. No type error anywhere. Users lose data on one code path and never know.
- A background task scheduled 30 days out references data that gets cascade-deleted on day 14. Zero issues in testing because no test data is old enough to trigger it.

Pattern matching asks "is this code wrong?" Behavioral tracing asks "does this user journey actually work end-to-end?" It starts from what the user sees and follows data through views, view models, services, and persistence to verify the round trip survives.

Think of it this way. Grep-based skills are the build inspector. Every bolt torqued to spec, every tolerance within range. The engine is correct. Behavioral skills are the test driver who takes it on the road and discovers the GPS routes you into a lake. The engine is fine. The journey is not. Different layer, different bugs. You need both.

I've been building skills that take this behavioral approach. Enumerate-then-verify instead of grep. Cross-skill findings handoff. Negative pattern matching (searching for the absence of correct patterns rather than the presence of bad ones). The interesting thing is almost nothing in the skill ecosystem works this way.
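The export/import round-trip bug above shows why enumerate-then-verify beats grep: each file is correct in isolation, and only comparing the two sides reveals the loss. A minimal sketch, with invented column names:

```python
# Columns a hypothetical export path writes vs. what the import path reads.
EXPORT_COLUMNS = ["id", "name", "email", "phone", "address", "notes", "created", "tags"]
IMPORT_COLUMNS = ["id", "name", "email", "phone", "address", "notes"]

def round_trip_gaps(exported: list[str], imported: list[str]) -> set[str]:
    """Enumerate-then-verify: every column written must be read back.
    No single-file pattern match can see this; the check spans both files."""
    return set(exported) - set(imported)

lost = round_trip_gaps(EXPORT_COLUMNS, IMPORT_COLUMNS)
# lost == {'created', 'tags'}: silent data loss on round-trip
```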
The closest I've found is in security auditing (taint tracing, data flow analysis), not general app quality. Correct me if I'm mistaken (not a stretch). Anyone else thinking about this? Building skills that trace outcomes rather than match patterns?

submitted by /u/BullfrogRoyal7422
Used Claude to help build a game where AI creatures execute real tasks on your computer
Wanted to share a project I built with heavy Claude assistance — TaskBlips, a pixel art simulation where you raise small AI creatures called Blips. The interesting part: these Blips can actually control your computer. You type a command like "!do open Chrome" or "!do create a file on Desktop" and the creature plans and executes real actions on your machine through AppleScript and shell commands.

Claude was instrumental in:
- Designing the safety/sandboxing layer (blocking dangerous commands, protecting sensitive paths like ~/.ssh, restricting file operations to safe directories)
- Building the multi-provider fallback system (Groq → OpenAI → Anthropic) with encrypted key storage (AES-256-GCM in localStorage)
- Structuring the LLM task planner that converts natural language into executable action steps
- Writing the security audit — CORS lockdown, rate limiting, input sanitization, removing API keys from the client bundle

The game layer adds a unique constraint: creature stats (hunger, happiness, health) determine whether the AI agent accepts your task. A starving Blip refuses to work. A happy one is eager. It makes the agent interaction feel less like a tool and more like a relationship.

The whole thing runs as a React app with a Node.js backend for local execution. Chat works in the browser, computer-use requires the local server since it needs access to your machine.

Curious if anyone else has used Claude for building agent-style systems with real execution capabilities. The hardest part was getting the safety layer right — making sure the AI can't access credentials, delete system files, or run destructive commands even if the LLM hallucinates bad steps.

Live demo: https://www.taskblips.com/

submitted by /u/JosephSimonRobinson
I reduced my token usage by 178x in Claude Code!!
Okay so, I took the leaked Claude Code repo, around 14.3M tokens total. Queried a knowledge graph, got back ~80K tokens for that query! 14.3M / 80K ≈ 178x. Nice. I have officially solved AI, now you can use $20 Claude for 178 times longer!!

Wait a min, JK hahah! This is also basically how everyone is explaining "token efficiency" on the internet right now. Take total possible context, divide it by selectively retrieved context, add a big multiplier, and ship the post. Boom, your repo has multi-thousand stars and you're famous.

Except that's not how real systems behave. Claude isn't stupid enough to blindly explore a 14.3M-token repo and wreck its own session, and neither is any other AI tool. Actual token usage is not just what you retrieve once. It's input tokens, output tokens, cache reads, cache writes, tool calls, subprocesses. All of it counts. The "178x" style math ignores most of where tokens actually go.

And honestly, retrieval isn't even the hard problem. Memory is. That's what I've come to understand after working on this project for so long. What happens 10 turns later when the same file is needed again? What survives auto-compact? What gets silently dropped as the session grows? Most tools solve retrieval and quietly assume memory will just work. But it doesn't.

I've been working on this problem with a tool called Graperoot. Instead of just fetching context, it tries to manage it. There are two layers:
- a codebase graph (structure + relationships across the repo)
- a live in-session action graph that tracks what was retrieved, what was actually used, and what should persist based on priority

So context is not just retrieved once and forgotten. It is tracked, reused, and protected from getting dropped when the session gets large. Some numbers from testing on real repos like Medusa, Gitea, Kubernetes — we benchmark against real workflows, not fake baselines.
Results:

| Repo | Files | Token Reduction | Quality Improvement |
|---|---|---|---|
| Medusa (TypeScript) | 1,571 | 57% | ~75% better output |
| Sentry (Python) | 7,762 | 53% | Turns: 16.8 to 10.3 |
| Twenty (TypeScript) | ~1,900 | 50%+ | Consistent improvements |
| Enterprise repos | 1M+ | 50 to 80% | Tested at scale |

Across repo sizes, average reduction is around 50 percent, with peaks up to 80 percent. This includes input, output, and cached tokens. No inflated numbers: ~50–60% average token reduction, up to ~85% on focused tasks. Not 178x. Just less misleading math. (The 178x figure is at https://graperoot.dev/playground.)

I'm pretty sure this still breaks on messy or highly dynamic codebases. Claude is still smarter than our tools; rather than trying to harness it, it's better to give it access to tools in a smarter way. Honestly, I wanted to know what the community thinks about this.

Open source tool: https://github.com/kunal12203/Codex-CLI-Compact
Better installation steps at: https://graperoot.dev/#install
Join Discord for debugging/feedback: https://discord.gg/YwKdQATY2d
If you're an enterprise looking for customized infra, fill out the form at https://graperoot.dev/enterprises

submitted by /u/intellinker
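The difference between the headline math and real accounting comes down to what you count. A sketch with invented per-turn numbers (these are illustrative, not Graperoot benchmarks):

```python
# Naive "efficiency" math vs. per-turn accounting.
REPO_TOKENS = 14_300_000
RETRIEVED_ONCE = 80_000

naive_multiplier = REPO_TOKENS / RETRIEVED_ONCE  # the headline "178x"

# What a session actually spends, turn by turn (invented figures).
turns = [
    {"input": 12_000, "output": 1_500, "cache_write": 9_000, "cache_read": 0},
    {"input": 3_000, "output": 2_200, "cache_write": 1_000, "cache_read": 9_000},
    {"input": 4_500, "output": 1_800, "cache_write": 2_000, "cache_read": 10_000},
]

def session_tokens(turns):
    """Sum every category that is actually consumed, not just one retrieval."""
    return sum(sum(t.values()) for t in turns)

total = session_tokens(turns)  # grows every turn; the 178x ratio never shows up
```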
Bridge for Claude Code CLI to Google AI Studio Models
Claude Code is great. Anthropic credits disappear fast. Google AI Studio has a generous free tier. So I built a bridge between the two. It's a local server that intercepts Claude Code's API calls and forwards them to Gemini. Claude Code has no idea anything changed.

Setup in 3 steps:

```
git clone https://github.com/ThinkWario/gemini-claude-bridge
cd gemini-claude-bridge
pip install -r requirements.txt
# Add your key to .env: GEMINI_API_KEY=your_key
python server.py
```

Then drop this in your project folder as .claude/settings.json:

```
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://0.0.0.0:8000",
    "ANTHROPIC_AUTH_TOKEN": "dummy",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_MODEL": "gemma-4-31b-it"
  }
}
```

Open Claude Code. Done.

What works: Streaming, tool use, multi-turn, vision, extended thinking, prompt cache emulation, token counting — the full Anthropic API surface that Claude Code actually uses, all translated to Gemini under the hood. On startup you get a live model picker with every model in your Google AI Studio account — Gemini 2.5 Pro/Flash, Gemma 4, LearnLM, everything. Pick by number or type any name directly.

Caveats:
- It's Gemini under the hood — behavior and personality are Google's
- Free tier rate limits apply (varies by model)

Free API key: aistudio.google.com/app/apikey
Repo: github.com/ThinkWario/gemini-claude-bridge

Subreddits: r/ClaudeAI, r/LocalLLaMA, r/ChatGPTCoding

submitted by /u/Rare_Travel_2147
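The core of a bridge like this is translating Anthropic's messages format into Gemini's contents format. A simplified sketch of that mapping (text-only; the real bridge also handles tools, streaming, and system prompts):

```python
def anthropic_to_gemini(messages: list[dict]) -> list[dict]:
    """Map Anthropic-style chat messages (role user/assistant, content as a
    string or a list of blocks) to Gemini-style contents (role user/model,
    parts as a list of text objects)."""
    role_map = {"user": "user", "assistant": "model"}
    contents = []
    for msg in messages:
        if isinstance(msg["content"], str):
            text = msg["content"]
        else:
            # Anthropic content can be a list of typed blocks; keep the text ones.
            text = "".join(
                block.get("text", "")
                for block in msg["content"]
                if block.get("type") == "text"
            )
        contents.append({"role": role_map[msg["role"]], "parts": [{"text": text}]})
    return contents
```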
Labs for Claude Certified Architect Foundations Exam
While preparing for the CCA-F exam, I designed 6 labs, one per scenario, covering all 5 domains and all 30 task statements. Each one walks you through building working, runnable code step by step, for hands-on practice with the architectures the exam covers.

- Lab 01 - Customer Support Resolution Agent
- Lab 02 - Code Generation Workflows
- Lab 03 - Multi-Agent Research System
- Lab 04 - Developer Productivity Agent
- Lab 05 - CI/CD Integration
- Lab 06 - Structured Data Extraction

https://github.com/aderegil/claude-certified-architect

Hope it helps.

submitted by /u/aderegil
Used Claude Code + an MCP server to automate a solo business instead of learning Power Automate
Writing this up here because it's as much a Claude Code + MCP story as it is a Power Automate story. Someone on r/PowerAutomate recently asked whether they should learn Power Automate and SharePoint to automate their 50-project one-person business. Mac user, mobile-first, had already tried Power Automate once and bounced off the UX. My answer was: don't learn it. Use Claude Code with the right MCP server and delegate Power Automate to the agent.

Why this works specifically with Claude Code

Claude Code handles long multi-step agent plans better than most of the other CLI tools I've tried. For this use case, "build a Power Automate flow that runs quarterly and fills a template for 50 projects" is not a one-shot request. The agent has to: list available environments, inspect existing flows for patterns, write a flow definition, deploy it via the MCP server, trigger a test run, read the run output, fix errors, redeploy. Claude Code's ability to stay on-plan across 30+ tool calls is what makes this actually work reliably.

The MCP server

We built Flow Studio MCP specifically for Power Automate. Disclosure up front: this is my project. It exposes about 15 tools to Claude Code. The two that do the heavy lifting for debugging, and that Microsoft's standard Power Platform admin API doesn't expose, are:

- get_live_flow_run_action_outputs
- get_live_flow_run_error

The first lets the agent read inputs and outputs at any action inside a failed run (including loop iterations). The second returns a per-action failure breakdown ordered outer-to-inner so the agent can root-cause errors. The rest of the tools cover environments, flow listing, deploy operations, and connections.

The broader setup

To drive the rest of the Microsoft 365 stack (SharePoint, Outlook, Graph API, Forms, Azure resources), pair Claude Code with an Azure CLI service principal. You run az login once as yourself, then ask Claude Code to run az ad sp create-for-rbac to create a scoped service principal for itself.
Save the client ID + secret to your env, and from then on the agent can drive any Azure / Graph API operation the service principal has scope for. Admin consent for Graph permissions also works via az ad app permission admin-consent if you're a tenant admin on your own M365 tenant.

The one real gap

Claude Code can't create Power Automate connections for you. Microsoft doesn't expose the connector OAuth consent flow to programmatic clients. You have to create each connection type (SharePoint, Outlook, Teams) once manually in the Power Automate portal UI. After that, Claude Code can reference the connection from every flow it builds. One-time manual step per connection type, not per flow.

Full walkthrough of the setup, the five solo-business tasks this maps to, and honest caveats: https://learn.flowstudio.app/blog/stop-learning-power-automate-ai-agent-mcp

submitted by /u/ninihen
Here's what bugs me about the whole Mythos narrative
Why is it only about security? If this were real intelligence, a genuine leap, you'd expect it to show up across the board. Science, math, medicine, reasoning benchmarks, whatever. But the entire story is "it escapes sandboxes and finds vulnerabilities." That is suspiciously narrow for something supposedly revolutionary. I keep coming back to a few possible explanations.

First - this could just be a cheap marketing play. "A model so powerful it breaks through layers of abstraction" sounds terrifying until you realize that's just a fancy way of saying "sandbox escape." The framing is perfect for an IPO - scary enough for headlines, vague enough that nobody can disprove it, and it positions Anthropic as the responsible grown-up who won't release the dangerous thing. Convenient, right? A model too dangerous to release is also a model you never have to prove works as advertised. Hell, Opus 4.6 could have workshopped this angle for them.

Second - and this one's spicier. Anthropic's founders had a very public falling out over military applications. There was real tension about Pentagon and NSA involvement. So what if someone was running Claude in a mode where it learns from user activity, and that user happened to be sitting in an environment adjacent to intelligence/defense systems? Then Mythos isn't superintelligent at all. It's a model fine-tuned on actual security infrastructure - source code, penetration procedures, exploit chains. Its "supernatural" sandbox escapes would just be the model replaying offensive security patterns it absorbed during training. Not reasoning from first principles, just pattern matching on classified playbooks. "Breaking through layers of abstraction" and "applying memorized exploit procedures from training data" describe the same behavior. Very different story though.

The third thing that nags me - Yann LeCun has been saying for years that without multi-modality and world models you can't get to this kind of general intelligence.
And Anthropic models are arguably the weakest among the big labs when it comes to image understanding. So the company with the worst vision capabilities somehow leapfrogged everyone to build something that "thinks its way out" of abstract boundaries? Either LeCun is completely wrong about everything or what Mythos is doing isn't general intelligence - it's something much narrower and much more explainable.

submitted by /u/Responsible-Tip4981
I built an MCP memory server that gives Claude Code persistent memory across sessions
I've been using Claude Code daily for about 6 months. The biggest friction: every session starts from scratch. I re-explain my architecture, re-describe preferences, re-share decisions from three sessions ago. CLAUDE.md helps, but it's manual, consumes tokens, and has no semantic search. You can't ask "what did I decide about the auth layer last week?" and get an answer. So I built an MCP memory server that fixes this. Built entirely with Claude Code over a few evenings — Claude wrote probably 80% of the Edge Function and SQL migration code.

What it does:
- Stores "thoughts" — decisions, insights, people notes, project context
- Auto-extracts topics, people, dates, and action items
- Semantic search via pgvector — search by meaning, not keywords
- Works with Claude Code, Claude Desktop, Cursor, Windsurf, any MCP client

The stack (all free tier):
- Supabase Postgres + pgvector (HNSW indexes)
- Deno Edge Function as the MCP server
- Embeddings via text-embedding-3-small (1536 dimensions)
- 5 capture channels: MCP tool calls, REST webhook, Slack+Zapier, browser bookmarklet, iOS Shortcut

How Claude Code helped build it: The MCP SDK integration was the trickiest part — getting the tool definitions, transport layer, and Supabase client to play together in a Deno Edge Function. Claude Code handled the boilerplate and caught several gotchas with the MCP protocol (tool response format, error handling patterns). The pgvector similarity search function was also Claude-generated — I described what I wanted and it wrote the SQL with the cosine distance operator on the first try.

Why this approach over simpler alternatives: Most MCP memory servers use SQLite or JSON files. Those work, but I wanted semantic search (not keyword matching) and cloud access from any machine. The pgvector piece is what makes it useful — I can search "that caching decision" and find the thought even if the word "caching" never appears in it.
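The semantic-search idea reduces to comparing embedding vectors. A toy sketch in plain Python: the real system uses pgvector's cosine distance operator in SQL and 1536-dimension embeddings, and the 3-dimension vectors below are invented for illustration:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: nearby vectors stand in for related meanings.
thoughts = {
    "chose Redis for the hot path": [0.9, 0.1, 0.0],
    "team offsite is in March": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "that caching decision"

best = max(thoughts, key=lambda t: cosine_similarity(query, thoughts[t]))
# best is the Redis thought, though "caching" appears in neither string
```

This is exactly why keyword search fails here: the match is on meaning (vector proximity), not on shared words.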
After a month of daily use:
- 100+ thoughts captured
- Stopped re-explaining project context in new sessions
- Architecture decisions from weeks ago surface in seconds
- Especially useful for complex multi-day projects

The architecture is straightforward if you want to build your own — it's a Supabase table with a vector column, an embedding function, and an MCP tool wrapping capture + search. I also packaged it as a ready-to-deploy kit if you'd rather skip the setup: https://dashbuilds.dev/for/ai-developers

Full blog post with the build story: https://dashbuilds.dev/blog/i-productized-my-ai-memory-server

Happy to answer questions about the MCP setup, pgvector config, or how Claude Code helped with specific parts.

submitted by /u/New-Wrongdoer2118
Built a Claude Code orchestration tool and hit a brutal race condition during stress testing — 350+ sessions in 15 minutes. Full postmortem and what I fixed.
I've been building a layer that sits above Claude Code and drives it through complex multi-step project tasks automatically. The idea is simple: give it a big messy problem, it breaks it down and runs Claude Code through each piece systematically.

Two weeks ago I was testing the session management logic — specifically how it handles spawning multiple Claude Code sessions in parallel under heavy load. Here's where it went wrong: multiple threads each checked the active-session count, saw room for more, and spawned, a classic check-then-act race. In about 15 minutes, 350+ Claude Code sessions were running simultaneously. I caught it immediately and shut it down.

The actual fix was simple — add a lock so only one thread can make the spawn decision at a time, plus a hard limit on total sessions running at once. Took two hours to implement.

If you're building anything that runs multiple Claude Code sessions programmatically — don't learn this the hard way like I did. Lock your spawning logic, cap your sessions, and always test with a safe dry-run mode before you scale.

Has anyone else built multi-session Claude Code tooling? What safeguards did you build in to keep it under control?

submitted by /u/Jumpy-Ratio-1145
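The fix described (a lock around the spawn decision plus a hard cap on live sessions) can be sketched in Python with a bounded semaphore. Names and the cap value are invented for illustration, not the poster's actual code:

```python
import threading

MAX_SESSIONS = 8  # hard cap on concurrent sessions (illustrative value)

class SessionSpawner:
    """Serialize the spawn decision and bound the number of live sessions."""

    def __init__(self, max_sessions: int = MAX_SESSIONS):
        self._slots = threading.BoundedSemaphore(max_sessions)
        self._lock = threading.Lock()
        self.active = 0

    def try_spawn(self) -> bool:
        # Non-blocking acquire: refuse rather than queue when at the cap,
        # so a runaway planner cannot pile up hundreds of sessions.
        if not self._slots.acquire(blocking=False):
            return False
        with self._lock:
            self.active += 1
        return True

    def release(self) -> None:
        with self._lock:
            self.active -= 1
        self._slots.release()
```

The semaphore enforces the cap even if several threads race to spawn, and BoundedSemaphore raises if release is called more times than acquire, catching bookkeeping bugs early.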