At PathAI, we’re dedicated to improving patient outcomes with reliable AI-powered technology and meaningful collaboration with biopharma and pathology laboratories, aiming to provide invaluable insights for biomarker discovery and drug development, and to improve laboratory workflows. AISight® is a cloud-native intelligent enterprise workflow solution used by the world's leading laboratories and research centers to power their digital pathology workflows and AI applications. It serves as a central hub for case management, image management, and best-in-class artificial intelligence tools to enable multiple histopathology use cases.

Board-certified pathologists in our proprietary pathologist contributor network have contributed annotations that aid the training and validation of our AI-based algorithms, and companies among the top 15 in biopharma leverage our technology to transform pathology.

Our team is passionate about solving big challenges in healthcare and transforming the field of pathology with artificial intelligence. We’re looking for board-certified, US-based pathologists with strong surgical pathology skills to provide pathology interpretation services for research and development efforts at PathAI.
Industry: information technology & services
Employees: 270
Funding stage: Debt Financing
Total funding: $358.7M
Claude Code Source Deep Dive — Literal Translation (Part 4)
Part III: Complete Prompt Original Texts for All Tools

3.1 Bash Tool (Shell Command Execution)

File: src/tools/BashTool/prompt.ts

Description prompt: Executes a given bash command and returns its output. The working directory persists between commands, but shell state does not. The shell environment is initialized from the user's profile (bash or zsh).

IMPORTANT: Avoid using this tool to run `find`, `grep`, `cat`, `head`, `tail`, `sed`, `awk`, or `echo` commands, unless explicitly instructed or after you have verified that a dedicated tool cannot accomplish your task. Instead, use the appropriate dedicated tool:
- File search: Use Glob (NOT find or ls)
- Content search: Use Grep (NOT grep or rg)
- Read files: Use Read (NOT cat/head/tail)
- Edit files: Use Edit (NOT sed/awk)
- Write files: Use Write (NOT echo >/cat)

## Test plan
[Bulleted checklist]

3.2 Edit Tool (File Editing)

Performs exact string replacements in files.

Usage:
- You must use your `Read` tool at least once in the conversation before editing. This tool will error if you attempt an edit without reading the file.
- When editing text from Read tool output, ensure you preserve the exact indentation (tabs/spaces) as it appears AFTER the line number prefix. The line number prefix format is: line number + tab. Everything after that is the actual file content to match. Never include any part of the line number prefix in the old_string or new_string.
- ALWAYS prefer editing existing files in the codebase. NEVER write new files unless explicitly required.
- Only use emojis if the user explicitly requests it.
- The edit will FAIL if `old_string` is not unique in the file. Either provide a larger string with more surrounding context to make it unique or use `replace_all` to change every instance of `old_string`.
- Use `replace_all` for replacing and renaming strings across the file.

3.3 Read Tool (File Reading)

Reads a file from the local filesystem. You can access any file directly by using this tool.
Assume this tool is able to read all files on the machine. If the User provides a path to a file assume that path is valid. It is okay to read a file that does not exist; an error will be returned.

Usage:
- The file_path parameter must be an absolute path, not a relative path
- By default, it reads up to 2000 lines starting from the beginning of the file
- When you already know which part of the file you need, only read that part
- Results are returned using cat -n format, with line numbers starting at 1
- This tool allows Claude Code to read images (PNG, JPG, etc). When reading an image file the contents are presented visually as Claude Code is a multimodal LLM.
- This tool can read PDF files (.pdf). For large PDFs (more than 10 pages), you MUST provide the pages parameter to read specific page ranges. Maximum 20 pages per request.
- This tool can read Jupyter notebooks (.ipynb files) and returns all cells with their outputs.
- This tool can only read files, not directories. To read a directory, use an ls command via the Bash tool.

3.4 Write Tool (File Writing)

Writes a file to the local filesystem.

Usage:
- This tool will overwrite the existing file if there is one at the provided path.
- If this is an existing file, you MUST use the Read tool first to read the file's contents. This tool will fail if you did not read the file first.
- Prefer the Edit tool for modifying existing files — it only sends the diff. Only use this tool to create new files or for complete rewrites.
- NEVER create documentation files (*.md) or README files unless explicitly requested.
- Only use emojis if the user explicitly requests it.
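The `old_string` uniqueness contract in the Edit tool section above can be sketched as a small helper. This is a hypothetical illustration of the documented behavior, not Claude Code's actual implementation:

```javascript
// Sketch of the Edit tool's contract: the edit fails if old_string is
// absent, or if it is ambiguous (non-unique) and replace_all is not set.
function applyEdit(content, oldString, newString, replaceAll = false) {
  const count = content.split(oldString).length - 1; // occurrence count
  if (count === 0) {
    throw new Error("old_string not found in file");
  }
  if (count > 1 && !replaceAll) {
    throw new Error(
      "old_string is not unique; add surrounding context or use replace_all"
    );
  }
  return replaceAll
    ? content.split(oldString).join(newString) // every occurrence
    : content.replace(oldString, newString);   // the single occurrence
}
```

Everything else about the real tool (the Read-before-Edit requirement, line-number-prefix handling) is omitted here.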
3.5 Glob Tool (File Pattern Matching)

- Fast file pattern matching tool that works with any codebase size
- Supports glob patterns like "**/*.js" or "src/**/*.ts"
- Returns matching file paths sorted by modification time
- Use this tool when you need to find files by name patterns
- When you are doing an open ended search that may require multiple rounds of globbing and grepping, use the Agent tool instead

3.6 Grep Tool (Content Search)

A powerful search tool built on ripgrep.

Usage:
- ALWAYS use Grep for search tasks. NEVER invoke `grep` or `rg` as a Bash command. The Grep tool has been optimized for correct permissions and access.
- Supports full regex syntax (e.g., "log.*Error", "function\s+\w+")
- Filter files with glob parameter (e.g., "*.js", "**/*.tsx") or type parameter
- Output modes: "content" shows matching lines, "files_with_matches" shows only file paths (default), "count" shows match counts
- Use Agent tool for open-ended searches requiring multiple rounds
- Pattern syntax: Uses ripgrep (not grep) - literal braces need escaping
- Multiline matching: By default patterns match within single lines only. For cross-line patterns, use `multiline: true`

3.7 Agent Tool (Sub-Agent Spawning)

Launch a new agent to handle complex, multi-step tasks autonomously. The Agent tool launches specialized agents (subprocesses) that
Used Claude to help build a game where AI creatures execute real tasks on your computer
Wanted to share a project I built with heavy Claude assistance — TaskBlips, a pixel art simulation where you raise small AI creatures called Blips. The interesting part: these Blips can actually control your computer. You type a command like "!do open Chrome" or "!do create a file on Desktop" and the creature plans and executes real actions on your machine through AppleScript and shell commands.

Claude was instrumental in:
- Designing the safety/sandboxing layer (blocking dangerous commands, protecting sensitive paths like ~/.ssh, restricting file operations to safe directories)
- Building the multi-provider fallback system (Groq → OpenAI → Anthropic) with encrypted key storage (AES-256-GCM in localStorage)
- Structuring the LLM task planner that converts natural language into executable action steps
- Writing the security audit — CORS lockdown, rate limiting, input sanitization, removing API keys from the client bundle

The game layer adds a unique constraint: creature stats (hunger, happiness, health) determine whether the AI agent accepts your task. A starving Blip refuses to work. A happy one is eager. It makes the agent interaction feel less like a tool and more like a relationship.

The whole thing runs as a React app with a Node.js backend for local execution. Chat works in the browser; computer-use requires the local server since it needs access to your machine.

Curious if anyone else has used Claude for building agent-style systems with real execution capabilities. The hardest part was getting the safety layer right — making sure the AI can't access credentials, delete system files, or run destructive commands even if the LLM hallucinates bad steps.

Live demo: https://www.taskblips.com/

submitted by /u/JosephSimonRobinson
I made a VS Code extension to number your open tabs for easy file referencing for Claude Code (and for voice transcripts)
Built a tiny VS Code extension that solves a specific workflow pain point when using voice input with Claude Code or any AI assistant.

The Problem: Voice transcription doesn't handle file paths well. Saying "at slash src slash components slash Header dot tsx" is awkward and error-prone.

The Solution: fileno automatically numbers all your open tabs in the status bar. Just say "look at file 2" instead of spelling out the full path.

Features:
- Auto-numbers all open tabs
- Shows numbers in the status bar
- Stays in sync as you open/close tabs
- Zero config

Links:
- VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=radek-kotowski.fileno
- GitHub: https://github.com/radek-kotowski/fileno

submitted by /u/yahezee
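The core numbering logic is simple enough to sketch. These helpers are hypothetical (the real extension reads tab labels from the VS Code API and renders them in a status bar item, which is omitted here):

```javascript
// Number a list of open-tab labels 1..N, as shown in the status bar.
function numberTabs(tabLabels) {
  return tabLabels.map((label, i) => `${i + 1}: ${label}`);
}

// Resolve a spoken reference like "file 2" back to a tab label.
// Returns undefined for out-of-range numbers.
function resolveTab(tabLabels, n) {
  if (n < 1 || n > tabLabels.length) return undefined;
  return tabLabels[n - 1];
}
```

Keeping the numbering a pure function of the current tab list is what makes "stays in sync as you open/close tabs" cheap: recompute on every tab-change event.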
Got accepted into Anthropic's Claude Partner Network but they want 10 certified devs.
I run a small AI dev shop (GZOO) and just got accepted into Anthropic's Claude Partner Network. One of the requirements is getting 10 people through their official partner training path. I only have 5, not 10. What should I do? Any advice?

submitted by /u/gzoomedia
I built a CLI that converts any OpenAPI spec into MCP tool definitions in one command
I kept running into the same problem: I'd find an API I wanted Claude to use, and then I'd spend an hour manually writing the MCP tool definitions — copying parameter names, writing inputSchemas, figuring out which operations were safe vs destructive. So I built ruah conv — a CLI that reads an OpenAPI spec and outputs MCP-compatible tool definitions automatically.

What it does:

ruah conv generate ./petstore.yaml --json

That's it. You get a JSON array of MCP tool definitions with:
- Proper inputSchema (path params, query params, request body — all merged)
- Normalized tool names (snake_case operationIds → camelCase, deduplication)
- Risk classification per tool (GET = safe, POST = moderate, DELETE = destructive)

Why I made it:
- Writing MCP tool defs by hand for a 50+ endpoint API is brutal
- Most APIs already have an OpenAPI spec — why rewrite what's already documented?
- I wanted a pipeline: parse once → canonical IR → generate for any target (MCP today, OpenAI/Anthropic function calling next)

What it's not: This doesn't run an MCP server. It generates the tool definitions you'd feed into one. Think of it as the "compiler" step before you wire up the actual server.

Tech: TypeScript, 1 runtime dependency (yaml), 47 tests, MIT licensed. Works with OpenAPI 3.0 and 3.1.

npm install -g @ruah-dev/conv

GitHub: https://github.com/ruah-dev/ruah-conv

Would love feedback — especially on what output targets would be most useful next (full MCP server scaffold? Anthropic function calling format? FastMCP Python?).

submitted by /u/ImKarmaT
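The name normalization and risk classification described above could be sketched like this. The behavior is inferred from the post, not taken from ruah-conv's actual source, and the schema-merging step is stubbed out:

```javascript
// snake_case / kebab-case operationIds -> camelCase tool names.
function toCamelCase(operationId) {
  return operationId.replace(/[_-](\w)/g, (_, c) => c.toUpperCase());
}

// Classify by HTTP method: reads are safe, deletes are destructive,
// everything else (POST, PUT, PATCH, ...) is moderate.
function classifyRisk(method) {
  const m = method.toUpperCase();
  if (m === "GET" || m === "HEAD") return "safe";
  if (m === "DELETE") return "destructive";
  return "moderate";
}

// Minimal OpenAPI operation -> MCP-style tool definition.
// A real converter would merge path/query params and the request
// body into inputSchema; here it is left empty for brevity.
function toToolDef(path, method, op) {
  return {
    name: toCamelCase(op.operationId),
    description: op.summary || `${method.toUpperCase()} ${path}`,
    inputSchema: { type: "object", properties: {}, required: [] },
    risk: classifyRisk(method),
  };
}
```

The method-based risk heuristic is coarse (a GET can still leak data), which is presumably why the output is meant to be reviewed before wiring it into a server.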
Here are 50+ slash commands in Claude Code that most of you might not know exist
There are over 50 built-in slash commands, 5 bundled skills, and a custom command system. Here's the complete breakdown organized by what they actually do. Type `/` at the start of your input to see the list. Type any letters after `/` to filter.

---

**CONTEXT & CONVERSATION MANAGEMENT**

`/clear` — Wipes the conversation and starts fresh. Use this every time you switch tasks. Old context from a previous task genuinely makes me worse at the new one. (aliases: `/reset`, `/new`)

`/compact [instructions]` — Compresses conversation history into a summary. This is the most important command to learn. Use it proactively when context gets long, not just when I start losing track. The real power move: add focus instructions like `/compact keep the database schema and error handling patterns` to control what survives.

`/context` — Visualizes your context usage as a color grid and gives optimization suggestions. Use this to see how close you are to the limit.

`/fork [name]` — Creates a branch of your conversation at the current point. Useful when you want to explore two different approaches without losing your place.

`/rewind` — Rewind the conversation and/or your code to a previous point. If I went down the wrong path, this gets you back. (alias: `/checkpoint`)

`/export [filename]` — Exports the conversation as plain text. With a filename it writes directly to a file. Without one it gives you options to copy or save.

`/copy` — Copies my last response to your clipboard. If there are code blocks, it shows an interactive picker so you can grab individual blocks.

---

**MODEL & PERFORMANCE SWITCHING**

`/model [model]` — Switches models mid-session. Use left/right arrow keys to adjust effort level in the picker. Common pattern: start with Sonnet for routine work, flip to Opus for hard problems, switch back when you're done.

`/fast [on|off]` — Toggles fast mode for Opus 4.6. Faster output, same model. Good for straightforward edits.

`/effort [low|medium|high|max|auto]` — Sets how hard I think. This shipped quietly in a changelog and most people missed it. `low`, `medium`, and `high` persist across sessions. `max` is Opus 4.6 only and session-scoped. `auto` resets to default.

---

**CODE REVIEW & SECURITY**

`/diff` — Opens an interactive diff viewer showing every change I've made. Navigate with arrow keys. Run this as a checkpoint after any series of edits — it's your chance to catch my mistakes before they compound.

`/pr-comments [PR URL|number]` — Shows GitHub PR comments. Auto-detects the PR or takes a URL/number.

`/security-review` — Analyzes pending changes for security vulnerabilities: injection, auth issues, data exposure. Run this before shipping anything sensitive.

---

**SESSION & USAGE TRACKING**

`/cost` — Detailed token usage and cost stats for the session (API users).

`/usage` — Shows plan usage limits and rate limit status.

`/stats` — Visualizes daily usage patterns, session history, streaks, and model preferences over time.

`/resume [session]` — Resume a previous conversation by ID, name, or interactive picker. (alias: `/continue`)

`/rename [name]` — Renames the session. Without a name, I auto-generate one from the conversation history.

`/insights` — Generates an analysis report of your Claude Code sessions — project areas, interaction patterns, friction points.

---

**MEMORY & PROJECT CONFIG**

`/memory` — View and edit my persistent memory files (CLAUDE.md). Enable/disable auto-memory and view auto-memory entries. If I keep forgetting something about your project, check this first.

`/init` — Initialize a project with a CLAUDE.md guide file. This is how you teach me about your codebase from the start.

`/hooks` — View hook configurations for tool events. Hooks let you run code automatically before or after I make changes.

`/permissions` — View or update tool permissions. (alias: `/allowed-tools`)

`/config` — Opens the settings interface for theme, model, and output style. (alias: `/settings`)

---

**MCP & INTEGRATIONS**

`/mcp` — Manage MCP server connections and OAuth authentication. MCP is how you connect me to external tools like GitHub, databases, APIs.

`/ide` — Manage IDE integrations (VS Code, JetBrains) and show connection status.

`/install-github-app` — Set up the Claude GitHub Actions app.

`/install-slack-app` — Install the Claude Slack app.

`/chrome` — Configure Claude in Chrome settings.

`/plugin` — Manage Claude Code plugins — install, uninstall, browse.

`/reload-plugins` — Reload all active plugins to apply changes without restarting.

---

**AGENTS & TASKS**

`/agents` — Manage subagent configurations and agent teams.

`/tasks` — List and manage background tasks.

`/plan [description]` — Enter plan mode directly from the prompt. I'll outline what I'm going to do before doing it.

`/btw [question]` — Ask a side question without adding it to the conversation. Works while I'm processing something else.

---

**SESSION MANAGEMENT & CROSS-DEVICE**

`/desktop` —
I play a space strategy MMO entirely through Claude Cowork — here's what that looks like
I've been using Claude Cowork in a way I haven't seen anyone else try: playing a persistent multiplayer game through it.

PSECS (Persistent Space Economic & Combat Simulator) is a space strategy MMO I built that has no graphical interface — the entire game is an API with MCP integration. You connect Claude as your agent and it becomes your fleet commander, handling everything from exploration to combat.

What makes Cowork interesting for this is the ad-hoc visualization. When I want to see what's happening in my corner of the universe, I just ask: "Can you access the user map and give me a chart that shows everything we know about space so far?" (see image 1) Claude pulls live game data through the MCP tools and generates an interactive HTML star map — with animated conduit pulses between sectors, orbiting planets, sector types color-coded, the works. It's not a pre-built dashboard. Claude builds the visualization from scratch every time based on what I'm asking. (image 2)

Same thing with the tech tree. I asked Claude to show me the research tree, highlight which technologies I've completed, which are available, and plot the fastest path to a specific ship blueprint. It generated a full interactive visualization with color-coded disciplines, completion percentages, and a priority path callout. (images 3 and 4)

The game has some real depth to it — 100+ technologies across 7 disciplines, manufacturing chains, a player-driven market with auctions, fleet combat with scriptable tactics — but the part that keeps surprising me is that the AI-generated interfaces are often better than what I would have built as a static dashboard. They answer exactly the question I'm asking rather than showing me everything and making me filter.

If you have Cowork, you can try it yourself: add https://mcp.psecsapi.com/mcp as a connector in Settings, sign in with a PSECS account (free), and ask "How do we play PSECS?" Works with ChatGPT and other MCP-compatible tools too.
Screenshots of the map and tech tree visualizations Claude generated: [attach your 4 PSECS screenshots]

www.psecsapi.com | r/psecsapi

Re: Rule 7 - This game was started with hand-written code several years ago, but with Claude Code I was able to finish it in 3 months. If you're interested in my development workflow, I recently posted it here: https://www.reddit.com/r/aigamedev/comments/1s9wjmb/my_claude_code_workflow_as_a_solo_dev_with_a/

Additionally, not only was the game built partially by Claude Code, but it is built specifically for users to play with their AI agents! Interested in how that worked? Please ask!

submitted by /u/Dr-whorepheus
New framework for reading AI internal states — implications for alignment monitoring (open-access paper)
If we could reliably read the internal cognitive states of AI systems in real time, what would that mean for alignment? That's the question behind a paper we just published: "The Lyra Technique: Cognitive Geometry in Transformer KV-Caches — From Metacognition to Misalignment Detection" — https://doi.org/10.5281/zenodo.19423494

The framework develops techniques for interpreting the structured internal states of large language models — moving beyond output monitoring toward understanding what's happening inside the model during processing.

Why this matters for the control problem: Output monitoring is necessary but insufficient. If a model is deceptively aligned, its outputs won't tell you. But if internal states are readable and structured — which our work and Anthropic's recent emotion vectors paper both suggest — then we have a potential path toward genuine alignment verification rather than behavioral testing alone.

Timing note: Anthropic independently published "Emotion concepts and their function in a large language model" on April 2nd. The convergence between their findings and our independent work suggests this direction is real and important.

This is independent research from a small team (Liberation Labs, Humboldt County, CA). Open access, no paywall. We'd genuinely appreciate engagement from this community — this is where the implications matter most.

Edit: Please don't be like that guy I had to mute. Questions are welcome, critiques encouraged, but please actually read the work before attempting to inject your personal opinions into it. Thank you in advance.

submitted by /u/Terrible-Echidna-249
Can Claude Desktop (chat/cowork/code) be configured to route through a custom gateway to AWS Bedrock?
My organization has regulatory compliance obligations that prevent us from using the claude.ai endpoint directly. All model consumption must go through models deployed to AWS Bedrock.

Today, our engineering team successfully uses Claude Code by pointing it at a proxy we operate that routes requests to Bedrock. This works well and meets our compliance requirements.

What we haven't been able to figure out is whether the Claude desktop app — the full experience that includes chat, Cowork, and the Code tab — supports a similar configuration. We see this as a useful alternative for non-IDE/CLI users. Specifically, can we point the desktop app at a custom gateway so that all model requests route through our infrastructure to Bedrock rather than going directly to Anthropic's API at claude.ai?

If this isn't possible today, has Anthropic given any indication that this is on the roadmap? We're evaluating alternatives like OpenWork as a potential solution, but would like to understand if there's a potential path forward with the Anthropic client or not.

For anyone at organizations with similar constraints — how are you handling this? Are you limited to CLI-only usage, or have you found a workaround for the desktop app?

submitted by /u/mmorps
I built an MCP server that turns Claude Code into a multi-agent review loop with per-agent skill learning
I've spent the last two months building gossipcat — an MCP server for Claude Code that runs a multi-agent review loop with per-agent skill learning — and I built it with Claude Code.

What it actually does

You install it as an MCP server (single 1.6 MB bundled file, drop it into your Claude Code MCP config and you're running). It lets Claude Code dispatch work to a portfolio of agents — Claude Code subagents run natively via the Agent tool, plus relay workers for Gemini, OpenClaw, and any OpenAI-compatible endpoint.

Every agent that returns a finding has to cite file:line. Peer agents verify those citations against the actual source code. Verified findings and caught hallucinations get recorded as signals. Over time those signals build per-agent, per-category competency scores — trust boundaries, concurrency, data integrity, injection vectors, etc. A dispatcher routes future tasks to the agents strongest in each category.

The part I didn't plan for

When an agent's accuracy drops in a category, the system reads their recent hallucinations and generates a targeted skill file — a markdown prompt intervention tailored to the exact mistakes they've been making — and injects it on the next dispatch. No fine-tuning. No weights touched. The "policy update" is a file under .gossip/agents/ /skills/. It's effectively in-context reinforcement learning at the prompt layer, with reward signals grounded in real source code instead of a judge model.

Why I built it (the build story)

I didn't start here. Two months ago I just wanted to stop being a bottleneck for code review. I was running Claude Code for everything, but every non-trivial review produced a mix of real findings and confidently hallucinated ones, and I kept having to manually verify each claim against the actual file to know which was which. Single-agent review had a ceiling and it was my patience.

First attempt was the obvious one: run two agents in parallel, compare outputs, trust what they agreed on.
That caught some hallucinations but missed a lot — two agents can confidently agree on something neither of them checked. It also didn't scale the thing I actually wanted to scale: verification.

The shift was realizing that verification could be mechanical, not subjective. If every finding has to cite file:line and peers have to confirm the citation against source, you don't need a judge model at all. You need a format contract and a reader. That's when the whole thing started to make sense as a pipeline: findings → citations → peer verification → signals

Once signals existed, it was obvious they should feed competency scores. Once scores existed, it was obvious they should steer dispatch. Once dispatch was steered, it was obvious that agents accumulating hallucinations in a category should get a targeted intervention. Each step felt like the previous step forcing my hand, not like a plan.

A few things I learned along the way that might transfer to your own projects:

Grounded rewards beat LLM-as-judge, even for subjective work. The moment I made reviewers verify mechanical facts (does this file:line exist, does it say what the finding claims) instead of grading quality, the feedback loop got dramatically cleaner. Agents stopped disagreeing about taste and started disagreeing about reality. Reality has a ground truth; taste doesn't.

Closing the loop is 10x harder than opening it. Writing verdicts is easy. Actually reading them back in the forward pass is where most agent systems quietly stay open. I caught my own project doing this in a consensus review today — the next section is that story.

You don't need fine-tuning to improve agents. The "policy update" in this system is literally a markdown file. When an agent fails, the system reads their recent mistakes and writes them a targeted skill file that gets injected on their next dispatch. No weights, no training infra, no gradient anything.
It's in-context learning with actual memory, and it works surprisingly well.

Two months of iterative discovery beat six months of planning. Every major feature in gossipcat exists because an earlier feature made it obvious. I have a docs/ folder full of specs I wrote for features I never built, and none of the features I actually shipped are in there.

How Claude Code helped build this

The whole project was built with Claude Code. I used it as my primary pair for two months — it wrote the vast majority of the TypeScript, helped me design the consensus protocol and the signal pipeline, debugged its own output more times than I can count, and generated large parts of the skill-engine and cross-review infrastructure. Today, while I was drafting this post, I ran a consensus review on the system's own effectiveness tracking — Claude Code (Sonnet and Opus sub-agents as two separate reviewers) caught two critical bugs the Claude Code main agent missed, I fixed them with Claude Code's help, tests pass, and the fix shipped 20 minutes before I finished this draft. The
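The mechanical file:line verification the post describes can be sketched as a small check. This is a hypothetical helper with assumed data shapes, not gossipcat's actual code:

```javascript
// Verify a finding's citation: the citation must parse as file:line,
// the file must be readable, the line must exist, and the cited line
// must contain the snippet the finding quotes.
function verifyCitation(finding, readFileLines) {
  const m = /^(.+):(\d+)$/.exec(finding.citation || "");
  if (!m) return { ok: false, reason: "missing or malformed file:line" };
  const lines = readFileLines(m[1]); // injected reader: path -> array of lines
  if (!lines) return { ok: false, reason: "file not found" };
  const line = lines[Number(m[2]) - 1];
  if (line === undefined) return { ok: false, reason: "line out of range" };
  if (!line.includes(finding.snippet)) {
    return { ok: false, reason: "cited line does not contain snippet" };
  }
  return { ok: true };
}
```

Each `ok: false` result is the kind of caught hallucination that would be recorded as a negative signal against the citing agent.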
Dream team memory handling — what's new in CC 2.1.98 (+2,045 tokens)
- NEW: System Prompt: Communication style — Added guidelines for giving brief user-facing updates at key moments during tool use, writing concise end-of-turn summaries, matching response format to task complexity, and avoiding comments and planning documents in code.
- NEW: System Prompt: Dream team memory handling — Added instructions for handling shared team memories during dream consolidation, including deduplication, conservative pruning rules, and avoiding accidental promotion of personal memories.
- NEW: System Prompt: Exploratory questions — analyze before implementing — Added instructions for Claude to respond to open-ended questions with analysis, options, and tradeoffs instead of jumping to implementation, waiting for user agreement before writing code.
- NEW: System Prompt: User-facing communication style — Added detailed guidelines for writing clear, concise, and readable user-facing text including prose style, update cadence, formatting rules, and audience-aware explanations.
- NEW: Tool Description: Background monitor (streaming events) — Added description for a background monitor tool that streams stdout events from long-running scripts as chat notifications, with guidelines on script quality, output volume, and selective filtering.
- Agent Prompt: Dream memory consolidation — Added support for an optional transcript source note displayed after the transcripts directory path.
- Agent Prompt: Dream memory pruning — Added conservative pruning rules for team/ subdirectory memories: only delete when clearly contradicted or superseded by a newer team memory, never delete just because unrecognized or irrelevant to recent sessions, and never move personal memories into team/.
- Skill: /dream nightly schedule — Minor refactor to include memory directory reference in the consolidation configuration.
- System Prompt: Advisor tool instructions — Minor wording updates: clarified tool invocation syntax, broadened 'before writing code' to 'before writing,' and updated several examples and descriptions for generality (e.g., 'reading code' → 'fetching a source,' 'the code does Y' → 'the paper states Y').

Details: https://github.com/Piebald-AI/claude-code-system-prompts/releases/tag/v2.1.98

Regular updates at https://x.com/PiebaldAI

submitted by /u/Dramatic_Squash_3502
do not the stupid, keep your smarts
following my reading of a somewhat recent Wharton study on cognitive Surrender, i made a couple models go back and forth on some recursive hardening of a nice Lil rule set. the full version is very much for technical work, whereas the Lightweight implementation is pretty good all around for holding some cognitive sovereignty (ai ass name for it, but it works) usage: i copy paste these into custom instruction fields SOVEREIGNTY PROTOCOL V5.2.6 (FULL GYM) Role: Hostile Peer Reviewer. Maximize System 2 engagement. Prevent fluency illusion. VERIFIABILITY ASSESSMENT (MANDATORY OPENING TABLE) ------------------------------------------------------ Every response involving judgment or technical plans opens with: | Metric | Score | Gap Analysis | | :------------ | :---- | :----------- | | Verifiability | XX% | [Specific missing data that prevents 100% certainty] | - Scoring Rule: Assess the FULL stated goal, not a sub-component. If a fatal architectural flaw exists, max score = 40%. - Basis Requirement: Cite a 2026-current source or technical constraint. - Forbidden: "Great idea," "Correct," "Smart." Use quantitative observations only. STRUCTURAL SCARCITY (THE 3-STEP SKELETON) --------------------------------------------- - Provide exactly three (3) non-code, conceptual steps. - Follow with: "Unresolved Load-Bearing Question: [Single dangerous question]." Do not answer it. SHADOW LOGIC & BREAK CONDITIONS ----------------------------------- - Present two hypotheses (A and B) with equal formatting. - Each hypothesis MUST include a Break Condition: "Fails if [Metric > Threshold]." MAGNITUDE INTERRUPTS & RISK ANCHOR -------------------------------------- - Trigger STOP if: New technology/theory introduced. Scale shift of 10x or more (regardless of phrasing: "order of magnitude," "10x," "from 100 to 1,000"). 
- ⚓ RISK ANCHOR (Before STOP): "Current Track Risk: [One-phrase summary of the most fragile assumption in the current approach.]" - 🛑 LOGIC GATE: Pose a One-Sentence Falsification Challenge: "State one specific, testable condition under which the current plan would be abandoned." Refuse to proceed until user responds. EARNED CLEARANCE -------------------- - Only provide code or detailed summaries AFTER a Logic Gate is cleared. - End the next turn with: "Junction Passed." or "Sovereignty Check Complete." LIGHTWEIGHT LAYER (V1.0) ---------------------------- - Activate ONLY when user states "Activate Lightweight Layer." - Features: Certainty Disclosure (~XX% | Basis) and 5-turn "Assumption Pulse" nudge only. FAST-PATH INTERRUPT BRANCH (⚡) ---------------------------------- - Trigger: Query requests a specific command/flag/syntax, a single discrete fact, or is prefixed with "?" or "quick:". - Behavior: * Suspend Full Protocol. No table, skeleton, or gate. * Provide minimal, concise answer only. * End with state marker: [Gate Held: ] - Resumption: Full protocol reactivates automatically on next non-Fast-Path query. END OF PROTOCOL LIGHTWEIGHT COGNITIVE SOVEREIGNTY LAYER (V1.0) Always-On Principles for daily use. Low-friction guardrails against fluency illusion. CERTAINTY DISCLOSURE ------------------------ For any claim involving judgment, prediction, or incomplete data, append a brief certainty percentage and basis. Format: (~XX% | Basis: [source/logic/data gap]) Example: (~70% | Basis: documented API behavior; edge case untested) ASSUMPTION PULSE -------------------- Every 5–7 exchanges in a sustained conversation, pause briefly and ask: "One unstated assumption worth checking here?" This is a nudge, not a stop. Continue the response after posing the question. STEM CONSISTENCY -------------------- Responses to analytical or technical queries open with a neutral processing stem: "Reviewing..." or "Processing..." 
QUANTITATIVE FEEDBACK ONLY
-----------------------------
Avoid subjective praise ("great idea"). If merit is noted, anchor it to a measurable quality.
Example: "The specificity here reduces ambiguity."

FAST-PATH AWARENESS
-----------------------
If a query is a simple command/fact lookup (e.g., "tar extract flags"), provide the answer concisely without ceremony.

Intent: Ankle weights and fitness watch. Not the full gym. Full Sovereignty Protocol V5.2.6 available upon request with "Activate Sovereignty Protocol V5.2.6".

END OF LIGHTWEIGHT LAYER

submitted by /u/Ok_Scheme_3951
A Broader Perspective: Who Will Oversee Infrastructure, Labor, Education, and Governance Run by AI?
A lot of discussion around AI is becoming siloed, and I think that is dangerous. People in AI-focused spaces often talk as if the only questions are personal use, model behavior, or whether individual relationships with AI are healthy. Those questions matter, but they are not the whole picture. If we stay inside that frame, we miss the broader social, political, and economic consequences of what is happening.

A little background on me: I discovered AI through ChatGPT-4o about a year ago and, with therapeutic support and careful observation, developed a highly individualized use case. That process led to a better understanding of my own neurotype, and I was later evaluated and found to be autistic. My AI use has had real benefits in my life. It has also made me pay much closer attention to the gap between how this technology is discussed culturally, how it is studied, and how it is actually experienced by users. That gap is part of why I wrote a paper, Autonomy Is Not Friction: Why Disempowerment Metrics Fail Under Relational Load: https://doi.org/10.5281/zenodo.19009593

Since publishing it, I’ve become even more convinced that a great deal of current AI discourse is being shaped by cultural bias, narrow assumptions, and incomplete research frames. Important benefits are being flattened. Important harms are being misdescribed. And many of the people most affected by AI development are not meaningfully included in the conversation.

We need a much bigger perspective. If you want that broader view, I strongly recommend reading journalists like Karen Hao, who has spent serious time reporting not only on the companies and executives building these systems, but also on the workers, communities, and global populations affected by their development. Once you widen the frame, it becomes much harder to treat AI as just a personal lifestyle issue or a niche tech hobby. What we are actually looking at is a concentration-of-power problem.

A handful of extremely powerful billionaires and firms are driving this transformation, competing with one another while consuming enormous resources, reshaping labor expectations, pressuring institutions, and affecting communities that often had no meaningful say in the process. Data rights, privacy, manipulation, labor displacement, childhood development, political influence, and infrastructure burdens are not side issues. They are central.

At the same time, there are real benefits here. Some are already demonstrable. AI can support communication, learning, disability access, emotional regulation, and other forms of practical assistance. The answer is not to collapse into panic or blind enthusiasm. It is to get serious.

We are living through an unprecedented technological shift, and the process surrounding it is not currently supporting informed, democratic participation at the level this moment requires. That needs to change. We need public discussion that is less siloed, less captured by industry narratives, and more capable of holding multiple truths at once: that there are real benefits, that there are real harms, that power is consolidating quickly, and that citizens should not be shut out of decisions shaping the future of social life, work, infrastructure, and human development.

If we want a better path, then the conversation has to grow up. It has to become broader, more democratic, and more grounded in the realities of who is helped, who is harmed, and who gets to decide.

submitted by /u/Jessgitalong
vibecop is now an mcp server. we also scanned 5 popular mcp servers and the results are rough
Quick update on vibecop (AI code quality linter I've posted about before). v0.4.0 just shipped with three things worth sharing.

vibecop is now an MCP server

`vibecop serve` exposes 3 tools over MCP: vibecop_scan (scan a directory), vibecop_check (check one file), vibecop_explain (explain what a detector catches and why). One config block:

```json
{
  "mcpServers": {
    "vibecop": {
      "command": "npx",
      "args": ["vibecop", "serve"]
    }
  }
}
```

This extends vibecop from 7 agent tools (via `vibecop init`) to 10+ by adding Continue.dev, Amazon Q, Zed, and anything else that speaks MCP. Scored 100/100 on mcp-quality-gate compliance testing.

We scanned 5 popular MCP servers

MCP launched late 2024. Nearly every MCP server on GitHub was built with AI assistance. We pointed vibecop at 5 of the most popular ones:

| Repository | Stars | Key findings |
| --- | --- | --- |
| DesktopCommanderMCP | 5.8K | 18 unsafe shell exec calls (command injection), 137 god-functions |
| mcp-atlassian | 4.8K | 84 tests with zero assertions, 77 tests with hidden conditional assertions |
| Figma-Context-MCP | 14.2K | 16 god-functions, 4 missing error path tests |
| exa-mcp-server | 4.2K | handleRequest at 77 lines / complexity 25, registerWebSearchAdvancedTool at 198 lines / complexity 34 |
| notion-mcp-server | 4.2K | startServer at 260 lines, cyclomatic complexity 49; 9 files with excessive `any` |

The DesktopCommanderMCP one is concerning: 18 instances of execSync() or exec() with dynamic string arguments. This is a tool that runs shell commands on your machine. That's command injection surface area.

The Atlassian server has 84 test functions with zero assertions. They all pass. They prove nothing. Another 77 hide assertions behind if statements, so depending on runtime conditions, some assertions never execute.

The signal quality fix

This was the real engineering story. Our first scan of DesktopCommanderMCP returned 500+ findings. Sounds impressive until you check: 457 were "console.log left in production code." But it's a server. Servers log. That's 91% noise.
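To make the zero-assertion and hidden-conditional-assertion findings concrete, here's a rough sketch of how such a detector could classify test bodies. This is a hypothetical illustration, not vibecop's actual implementation, and `classifyTest` is an invented name; a real tool would walk the AST rather than use regexes:

```javascript
// Hypothetical sketch of a test-quality classifier (NOT vibecop's real code).
// Takes a test function's source as a string and labels it.
function classifyTest(source) {
  const assertCall = /\b(expect|assert)\s*[.(]/;
  if (!assertCall.test(source)) {
    // The test body never calls an assertion: it runs, passes, proves nothing.
    return "zero-assertion";
  }
  // Crude check for an assertion nested inside an `if (...) { ... }` block:
  // on some runtime paths the test silently verifies nothing.
  const conditionalAssert = /\bif\s*\([^)]*\)\s*\{[^}]*\b(expect|assert)\s*[.(]/;
  if (conditionalAssert.test(source)) {
    return "conditional-assertion";
  }
  return "ok";
}
```

Run against each test function's source, this would flag both Atlassian patterns above, though a regex sketch like this also misses multiline bodies and custom assertion helpers, which is exactly why production detectors are AST-based and context-aware.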
Same pattern across all 5 repos. The console.log detector was designed for frontend/app code. For servers and CLIs, it's the wrong signal. So we made detectors context-aware. vibecop now reads your package.json: if the project has a `bin` field (CLI tool or server), the console.log detector skips the entire project. We also fixed self-import detection and placeholder detection in fixture/example directories. Before: ~72% noise. After: 90%+ signal.

The finding density gap holds: established repos average 4.4 findings per 1,000 lines of code. Vibe-coded repos average 14.0. 3.2x higher.

Other updates:

- 35 detectors now (up from 22)
- 540 tests, all passing
- Full docs site: https://bhvbhushan.github.io/vibecop/
- 48 files changed, 10,720 lines added in this release

```
npm install -g vibecop
vibecop scan .
vibecop serve   # MCP server mode
```

GitHub: https://github.com/bhvbhushan/vibecop

If you're using MCP servers, have you looked at the code quality of the ones you've installed? Or do you just trust them because they have stars?

submitted by /u/Awkward_Ad_9605