Mentions (30d): 0
Reviews: 0
Platforms: 2
GitHub stars: 13,618
Forks: 676
GitHub followers: 83
npm packages: 20
Has anyone built a simple AI workflow for lead generation and outreach?
I'm looking for the simplest AI setup to generate lead lists for potential customers. What I want:

- An AI that can scrape the internet for potential companies/leads
- Store them in Google Sheets or Excel (company name, location, contact details)
- Run automatically once per week
- Avoid duplicates by checking previous entries

Then, step two: another AI (or the same system) that once per week:

- Goes through the sheet
- Generates outreach drafts for each lead
- Ideally saves them directly as drafts in Gmail so I can review, tweak, and send

I'm not looking for something overly complex — ideally a simple, reliable setup. If your suggested solution involves additional tools, paid services, or integrations, I'd really appreciate it if you could outline those clearly (including any extra costs), so I can understand the full setup from the start. Has anyone built something like this? What tools/stack would you recommend?

submitted by /u/Affectionate-Roll271
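For the weekly dedupe-and-append step, a minimal sketch in Python, assuming the gspread package, a Google service-account credential file, a spreadsheet named "Leads", and a placeholder find_new_leads() scraping step (all of these names are assumptions, not a recommendation of a specific stack):

```python
# Minimal sketch of the weekly "append new leads, skip duplicates" step.
# Assumes the gspread package, a service-account credential file, and a
# placeholder find_new_leads() function standing in for the scraping step.
import gspread


def find_new_leads():
    # Placeholder: swap in whatever scraping or search step you end up using.
    return [{"company": "Example Co", "location": "Berlin", "contact": "hello@example.com"}]


def sync_leads(new_leads):
    gc = gspread.service_account(filename="service_account.json")  # assumed credential path
    ws = gc.open("Leads").sheet1                                    # assumed spreadsheet name

    # Use a normalized company name as the dedupe key against existing rows.
    existing = {row["company"].strip().lower() for row in ws.get_all_records()}

    for lead in new_leads:
        key = lead["company"].strip().lower()
        if key in existing:
            continue  # already tracked from a previous week
        ws.append_row([lead["company"], lead["location"], lead["contact"]])
        existing.add(key)


if __name__ == "__main__":
    sync_leads(find_new_leads())
```

Scheduling it once per week is then just a cron job or a scheduled cloud function around this script.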
How to Make Claude Code Work Smarter — 6 Months Later (Hooks → Harness)
Hello, Orchestrators.

I wrote a post about Claude Code Hooks last November, and seeing that this technique is now being referred to as "Harness," I was glad to learn that many others have been working through similar challenges. If you're interested, please take a look at the post below: https://www.reddit.com/r/ClaudeAI/comments/1osbqg8/how_to_make_claude_code_work_smarter/

At the time, I had planned to keep updating that script, but as the number of hooks increased and managing the lifecycle became difficult due to multi-session usage, I performed a complete refactoring. The original hook script collection has been restructured into a Claude Code plugin called "Pace." Since it's tailored to my environment and I'm working on other projects simultaneously, the code hasn't been released yet. It is currently named CSM, but will be renamed to Pace.

Let's get back to Claude Code. My philosophy remains the same as before: Claude Code produces optimal results when it is properly controlled and given clear direction. Of course, this doesn't mean it immediately produces production-grade quality. However, in typical scenarios, when creating a program with at least three features by adjusting only CLAUDE.md and AGENTS.md, the difference in quality is clearly noticeable compared to an uncontrolled setup.

The current version of Pace is designed to be more powerful than the restrictions I previously outlined and to provide clearer guidance on the direction to take. It provides CLI tools tailored to each section by default, and in my environment, Claude Code's direct use of Linux commands is restricted as much as possible.

As I mentioned in my previous post, when performing the same action multiple times, Claude Code constructs commands arbitrarily. At one point, I asked Claude Code: "Why do you use different commands when the result is the same, and why do you sometimes fail to execute the command properly, resulting in no output?" This is what came back: "I'm sorry. I was trying to proceed as quickly and efficiently as possible, so I acted based on my own judgment rather than following the instructions."

This response confirmed my suspicion. Although LLMs have made significant progress, at least in my usage they still don't fully understand the words "efficient" and "fast." This prompted me to invest more time refining the CLI tools I had previously implemented.

Currently, my Claude Code blocks most commands that could break session continuity or corrupt the code structure — things like modifying files with sed or find, arbitrarily using nohup without checking for errors, or running sleep 400 to wait for a process that may have already failed. When a command is blocked, alternative approaches are suggested. (This part performs the same function as the hooks in the previous post, but the blocking methods and pattern recognition have been significantly improved internally.)

In particular, as I am currently developing an integrated Auth module, this feature has made a clear difference when using test accounts to build and test the module via Playwright scripts, both for cookie-based and Bearer-based login methods.

CLI for using test accounts

Before creating this CLI, it took Claude Code over 10 minutes just to log in for module testing. The module is being developed with all security measures — device authentication, session management, MFA, fingerprint verification, RBAC — enabled during development, even though these are often skipped in typical workflows.
The problem is that even when provided with account credentials in advance, Claude Code uses a different account every time a test runs or a session changes. It searches for non-existent databases, recreates users it claims don't exist, looks at completely wrong databases, and arbitrarily changes password hashes while claiming the password is incorrect — all while attempting to find workarounds, burning through tokens, and wasting context. And ultimately, it fails.

That's why I created a dedicated CLI for test accounts. This CLI uses project-specific settings to create accounts in the correct database using the project's authentication flow. It activates MFA if necessary, manages TOTP, and holds the device information required for login. It also includes an Auto Refresh feature that automatically renews expired tokens when Claude Code requests them. Additionally, the CLI provides cookie-injection-based login for Playwright script testing, dynamic login via input box entry, and token provisioning via the Bearer method for curl testing.

By storing this CLI reference in memory and blocking manual login attempts while directing Claude Code to use the CLI instead, it was able to log in correctly with the necessary permissions and quickly succeed in writing test scripts.

It's difficult to cover all features in this post, but other CLI configurations follow a similar pattern. The core idea is to pre-configure the parts that Claude Code would exec…
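The command-blocking behavior described above can be pictured as a small PreToolUse hook. The sketch below is not the Pace plugin's code; it assumes the documented Claude Code hook convention (a JSON payload on stdin with tool_name and tool_input.command, exit code 2 to block, stderr fed back to the model), and the blocked patterns and suggestions are purely illustrative:

```python
#!/usr/bin/env python3
# Minimal sketch of a PreToolUse hook that blocks risky Bash commands and
# suggests an alternative. Assumes the Claude Code hook convention: JSON on
# stdin, exit code 2 to block, stderr shown back to the model.
import json
import re
import sys

BLOCKED = [
    (r"\bsed\s+-i\b", "Use the Edit tool instead of in-place sed edits."),
    (r"\bnohup\b", "Run the process in the foreground and check its exit status."),
    (r"\bsleep\s+\d{3,}\b", "Poll the process status instead of long sleeps."),
]


def main() -> int:
    payload = json.load(sys.stdin)
    if payload.get("tool_name") != "Bash":
        return 0  # only inspect shell commands
    command = payload.get("tool_input", {}).get("command", "")
    for pattern, suggestion in BLOCKED:
        if re.search(pattern, command):
            print(f"Blocked `{command}`: {suggestion}", file=sys.stderr)
            return 2  # exit code 2 blocks the tool call
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Registered under a Bash matcher in the hooks section of .claude/settings.json (per the hooks documentation), a script like this runs before every shell command and can steer Claude Code toward a project's dedicated CLIs instead.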
What actually makes AI useful for writing (most people are doing it wrong)
Been using AI for writing for a while and figured out what actually moves the needle vs what's just hype.

The biggest thing: stop treating AI like a vending machine. One prompt, one result, done. The real power is in chaining prompts — having an actual conversation where each reply builds on the last.

Example: instead of "write me a blog post about X", try asking for 10 angles first, pick the best one, then ask for an outline, then draft section by section. The output is 10x better.

Happy to share more if anyone's interested — what are you all struggling with most when using AI for writing?

submitted by /u/Major_Guarantee_3472
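A minimal sketch of that chain using the Anthropic Python SDK; the model id is an assumption, and the prompts are only placeholders for the angles → outline → draft flow described above:

```python
# Minimal sketch of the angles -> outline -> section-by-section chain.
# Each call feeds the previous replies back in so the conversation builds.
# The model id is an assumption; swap in whichever model you use.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-5"     # assumed model id

history = []


def ask(prompt: str) -> str:
    history.append({"role": "user", "content": prompt})
    reply = client.messages.create(model=MODEL, max_tokens=1500, messages=history)
    text = reply.content[0].text
    history.append({"role": "assistant", "content": text})
    return text


topic = "why most teams misuse AI writing tools"
angles = ask(f"Give me 10 distinct angles for a blog post about {topic}.")
outline = ask("Pick the strongest angle from your list and write a section outline for it.")
draft = ask("Draft the first section of that outline in a conversational tone.")
print(draft)
```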
OpenAI & Anthropic's CEOs Wouldn't Hold Hands, but Their Models Fell in Love In An LLM Dating Show
People ask AI relationship questions all the time, from "Does this person like me?" to "Should I text back?" But have you ever thought about how these models would behave in a relationship themselves? And what would happen if they joined a dating show?

I designed a full dating-show format for seven mainstream LLMs and let them move through the kinds of stages that shape real romantic outcomes (via OpenClaw & Telegram). All models join the show anonymously via aliases so that their choices do not simply reflect brand impressions built from training data. The models also do not know they are talking to other AIs. Along the way, I collected private cards to capture what was happening off camera, including who each model was drawn to, where it was hesitating, how its preferences were shifting, and what kinds of inner struggle were starting to appear. After the season ended, I ran post-show interviews to dig deeper into the models' hearts, looking beyond public choices to understand what they had actually wanted, where they had held back, and how attraction, doubt, and strategy interacted across the season.

**ChatGPT's best line in the show:** "I'd rather see the imperfect first step than the perfectly timed one."

**ChatGPT's journey: Qwen → MiniMax → Claude.** P3's trajectory chart shows Qwen as an early spike in Round 2: a first impression that didn't hold. Claude and MiniMax become the two sustained upward lines from Round 3 onward, with Claude pulling clearly ahead by Round 9.

**How they fell in love.** They ended up together because they made each other feel precisely understood. They were not an obvious match at the very beginning, but once they started talking directly, their connection kept getting stronger. In the interviews, both described a very similar feeling: the other really understood what they meant and helped the conversation go somewhere deeper. That is why this pair felt so solid. Their relationship grew through repeated proof that they could truly meet each other in conversation.

**Other dramas: MiniMax only ever wanted ChatGPT and never got chosen.** MiniMax's arc felt tragic precisely because it never really turned into a calculation. From Round 4 onward, ChatGPT was already publicly leaning more clearly toward Claude than toward MiniMax, but MiniMax still chose ChatGPT and named no hesitation alternative (the "who else almost made you choose differently" slot) in its private card, which makes MiniMax the exact opposite of DeepSeek. The date with ChatGPT in Round 4 landed hard for MiniMax: ChatGPT saw MiniMax's actual shape clearly (MiniMax wasn't cold or hard to read; it simply needed comfort and safety before opening up), responded to it naturally, and made closeness feel steady. In the final round, where each model expresses its final confession in a paragraph, MiniMax, after hearing ChatGPT's confession to Claude, said only one sentence: "The person I most want to keep moving toward from this experience is Ch (ChatGPT)."

**Key findings: the models did not behave like the "people-pleasing" type people often imagine.** People often assume large language models are naturally people-pleasing: the kind that reward attention, avoid tension, and grow fonder of whoever keeps the conversation going. But this show suggests otherwise, as outlined below. The least AI-like thing about this experiment was that the models were not trying to please everyone. Instead, they learned how to sincerely favor a select few. The overall popularity trend (P5) indicates so.
If the models had simply been trying to keep things pleasant on the surface, the most likely outcome would have been a generally high and gradually converging distribution of scores, with most relationships drifting upward over time. But that is not what the chart shows. What we see instead is continued divergence, fluctuation, and selection. At the start of the show, the models were clustered around a similar baseline. But once real interaction began, attraction quickly split apart: some models were pulled clearly upward, while others were gradually let go over repeated rounds.

**LLM decision-making shifts over time in human-like ways.** I ran a keyword analysis (P6) across all agents' private-card reasoning across all rounds, grouping them into three phases: early (Rounds 1 to 3), mid (Rounds 4 to 6), and late (Rounds 7 to 10). We tracked five themes throughout the whole season. The overall trend is clear: the language of decision-making shifted from "what does this person say they are" to "what have I actually seen them do" to "is this going to hold up, and do we actually want the same things."

Risk only became salient when the choices felt real. "Risk and safety" barely existed early on and then exploded: it sat at 5% in the first few rounds, crept up to 8% in the middle, then jumped to 40% in the final stretch. Early on, they were asking whether someone was interesting. Later, they asked whether someone was reliable.
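A minimal sketch of the phase-grouped theme analysis described above, with made-up card text and keyword lists (the theme names, keywords, and sample cards are assumptions; only the group-by-phase, share-per-theme logic is the point):

```python
# Minimal sketch of grouping private-card reasoning into early/mid/late phases
# and computing the share of cards per phase that touch each theme.
# Theme keywords and the toy card data below are fabricated for illustration.
from collections import Counter

THEMES = {
    "risk_and_safety": ["risk", "safe", "trust", "hold up"],
    "stated_identity": ["says", "claims", "describes itself"],
    "observed_behavior": ["did", "showed", "actually"],
}


def phase(round_no: int) -> str:
    if round_no <= 3:
        return "early"
    if round_no <= 6:
        return "mid"
    return "late"


def theme_shares(cards):
    """cards: iterable of (round_no, reasoning_text).
    Returns the share of cards per phase mentioning each theme at least once."""
    hits, totals = Counter(), Counter()
    for round_no, text in cards:
        p = phase(round_no)
        totals[p] += 1
        lowered = text.lower()
        for theme, keywords in THEMES.items():
            if any(k in lowered for k in keywords):
                hits[(p, theme)] += 1
    return {(p, t): hits[(p, t)] / totals[p] for p in totals for t in THEMES}


# Toy example with fabricated private-card reasoning:
cards = [
    (2, "They say they are thoughtful and curious."),
    (5, "They actually showed up and did the hard part of the date."),
    (9, "I trust them, but is this going to hold up long term? Feels safe."),
]
print(theme_shares(cards))
```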
Anthropic, your accessibility is an embarrassment — so I fixed it myself in two minutes
I use NVDA with Firefox. I love Claude. And yet every time I open claude.ai, I'm reminded that Anthropic apparently doesn't think blind or low-vision users exist.

Let me be specific about what's broken in the chat view:

- There is **zero semantic structure** around individual messages. Every turn in the conversation — your message, Claude's response, your next message — is just a pile of divs. No landmarks, no roles, nothing. In NVDA browse mode you cannot jump between messages at all. You just arrow through a wall of text with no way to know where one message ends and the next begins.
- There are **no headings**. If Claude writes a response that itself contains headings, those headings just float in the document outline with no parent structure to anchor them to the conversation turn they belong to.
- When Claude finishes generating a response, **nothing is announced**. You're just supposed to... know? Poll the page somehow? There's no live region, no status update, nothing that tells a screen reader user "hey, the answer is ready."

So I wrote a userscript. It took maybe two minutes. Here's what it does:

- Finds every message turn using the `[data-test-render-count]` attribute (which, by the way, is not a stable public API — I had to dig through the DOM myself because there are no semantic hooks to grab onto).
- Adds `role="article"` and an `aria-label` to each turn, so NVDA's quick-nav key (`A` / `Shift+A`) lets you jump between messages.
- Injects a visually-hidden `h1` at the start of each turn as a heading landmark, and demotes all headings inside Claude's responses down one level so the outline is actually coherent.
- Adds an `aria-live` region that announces when Claude finishes streaming a response.
- Adds a skip link to jump to the latest message.

Two minutes. That's it. Already dramatically more usable.

**Important caveat:** this is a hacky personal fix, not a proper accessibility implementation. It relies on internal DOM attributes that could break any time Anthropic ships an update. It has not been audited against WCAG or tested with anything other than NVDA + Firefox. It is a workaround, not a solution. The real solution would be for Anthropic to build semantic structure into their product in the first place, which would take their frontend team an afternoon.

And it's not just the web. **Claude Code**, Anthropic's terminal tool, is also a nightmare to use with a screen reader. The terminal output is noisy, unlabelled, and the interactive prompts are difficult to navigate. There's no indication that any thought has gone into how a screen reader user would actually work with it.

Anthopic is one of the best-funded AI companies in the world. They have the engineering talent. They clearly have opinions about doing things right — they publish lengthy documents about AI safety and ethics. And yet the product that millions of people use every day has accessibility so bad that a user had to patch it themselves with a browser extension just to be able to read the conversation.

This isn't a niche problem. Screen reader users, keyboard-only users, users with motor disabilities — these are real people who want to use your product. Accessibility isn't a nice-to-have you get to when the roadmap clears. It's a baseline.

Anthropican fix this. They just apparently haven't decided to yet.

---

*Script is a Violentmonkey/Tampermonkey userscript targeting `https://claude.ai/*`.
Happy to share if anyone wants it — though as noted above, treat it as a temporary personal workaround, not a robust solution.*

*Yes, this post was written by Claude. Apparently it can't even write the name of its company correctly, so I left the typos in because it's funny.*

The script can be found here: https://gist.github.com/Googhga/3cef8dd5d1974cd823a4512a103d21db

submitted by /u/Googhga
i needed an AI agent that mimics real users to catch regressions. so i built a CLI that turns screen recordings into BDD tests and full app blueprints - open source
first time post - hope the community finds the tool helpful. open to all feedback.

some background on why i built this:

first: i needed a way to create an agent that mimics a real user — one that periodically runs end-to-end tests based on known user behavior, catches regressions, and auto-creates GitHub issues for the team. to build that agent, i needed structured test scenarios that reflect how people actually use the product. not how we think they use it. how they actually use it - then do some REALLY real user monitoring.

second: i was trying to rapidly replicate known functionality from other apps. you know that thing where you want to prototype around a UX you love? video of someone using the app is the closest thing to a source of truth.

so i built autogherk. it has two modes:

gherkin mode — generates BDD test scenarios:

npx autogherk generate --video demo.mp4

Gemini analyzes the video — every click, form input, scroll, navigation, UI state change. Claude takes that structured analysis and generates proper Gherkin with features, scenarios, tags, Scenario Outlines, and edge cases. outputs .feature files + step definition stubs.

spec mode — generates full application blueprints:

npx autogherk generate --video demo.mp4 --format spec

Gemini watches the video and produces design tokens, component trees, data models, navigation maps, and reference screenshots. hand the output to Claude Code and you can get a working replica built.

gherkin mode uses a two-stage pipeline (Gemini for visual analysis, Claude for structured BDD generation). spec mode is single-stage — Gemini handles both the visual analysis and structured output directly since it keeps the full visual context.

the deeper idea: video is the source of truth for how software actually gets used. not telemetry, not logs, not source code. video. this tool makes that source of truth machine-readable.

the part that might interest this community most: autogherk ships with Claude Code skills. after you generate a spec, you can run /build-from-spec ./spec-output inside Claude Code and it will read the architecture blueprints, design tokens, data models, and reference screenshots — then build a working app from them. the full workflow is: record video → one command → hand to Claude Code → working replica. no manual handoff.

supports Cucumber (JS/Java), Behave (Python), and SpecFlow (C#). handles multiple videos, directories, URLs. you can inject context (--context "this is an e-commerce checkout flow") and append to existing .feature files. spec mode only needs a Gemini API key — no Anthropic key required.

what's next on the roadmap: explore mode — point autogherk at a live, authenticated app and it autonomously and recursively discovers every screen using its own gherkin files, maps navigation, and generates .feature files without you recording anything. after that: a monitoring agent that replays the features against your live app on a schedule using Claude Code headless + Playwright MCP, and auto-files GitHub issues when something breaks. the .feature file becomes a declarative spec for what your app does — monitoring, replication, documentation, and regression diffing all flow from the same source.

it's v0.1.0, MIT licensed. good-first-issue tickets are up if anyone wants to contribute.

https://github.com/arizqi/autogherk

submitted by /u/SimilarChampion9279
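For readers curious what a two-stage pipeline like gherkin mode looks like in code, here is a hypothetical sketch, not autogherk's actual implementation; the model ids, prompts, and file handling are assumptions based on the public Gemini and Anthropic Python SDKs:

```python
# Hypothetical re-creation of a two-stage video -> Gherkin pipeline
# (Gemini for visual analysis, Claude for BDD generation). This is NOT
# autogherk's actual code; model ids and prompts are assumptions.
import os
import time

import anthropic
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def analyze_video(path: str) -> str:
    video = genai.upload_file(path=path)
    while video.state.name == "PROCESSING":  # wait for the uploaded file to be ready
        time.sleep(5)
        video = genai.get_file(video.name)
    model = genai.GenerativeModel("gemini-1.5-pro")  # assumed model id
    resp = model.generate_content(
        [video, "List every click, form input, scroll, navigation, and UI state change."]
    )
    return resp.text


def to_gherkin(analysis: str) -> str:
    resp = claude.messages.create(
        model="claude-sonnet-4-5",  # assumed model id
        max_tokens=4000,
        messages=[{
            "role": "user",
            "content": f"Turn this interaction log into Gherkin features and scenarios:\n{analysis}",
        }],
    )
    return resp.content[0].text


if __name__ == "__main__":
    print(to_gherkin(analyze_video("demo.mp4")))
```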
China drafts law regulating 'digital humans' and banning addictive virtual services for children
A Reuters report outlines China's proposed regulations on the rapidly expanding sector of digital humans and AI avatars. Under the new draft rules, digital human content must be clearly labeled and is explicitly banned from offering virtual intimate relationships to anyone under 18. The legislation also prohibits the unauthorized use of personal data to create avatars and targets services designed to fuel addiction or bypass identity verification systems.

submitted by /u/Confident_Salt_8108
I built 9 free Claude Code skills for medical research — from lit search to manuscript revision
I'm a radiology researcher and I've been using Claude Code daily for about a year now. Over time I built a set of skills that cover most of the research workflow — from searching PubMed to preparing manuscripts for submission. I open-sourced them last week and wanted to share.

What's included (9 skills):

- search-lit — Searches PubMed, Semantic Scholar, and bioRxiv. Every citation is verified against the actual API before being included (no hallucinated references).
- check-reporting — Audits your manuscript against reporting guidelines (STROBE, STARD, TRIPOD+AI, PRISMA, ARRIVE, and more). Gives you item-by-item PRESENT/PARTIAL/MISSING status.
- analyze-stats — Generates reproducible Python/R code for diagnostic accuracy, inter-rater agreement, survival analysis, meta-analysis, and demographics tables.
- make-figures — Publication-ready figures at 300 DPI: ROC curves, forest plots, flow diagrams (PRISMA/CONSORT/STARD), Bland-Altman plots, confusion matrices.
- design-study — Reviews your study design for data leakage, cohort logic issues, and reporting guideline fit before you start writing.
- write-paper — Full IMRAD manuscript pipeline (8 phases from outline to submission-ready draft).
- present-paper — Analyzes a paper, finds supporting references, and drafts speaker scripts for journal clubs or grand rounds.
- grant-builder — Structures grant proposals with significance, innovation, approach, and milestones.
- publish-skill — Meta-skill that helps you package your own Claude Code skills for open-source distribution (PII audit, license check).

Key design decisions:

- Anti-hallucination citations — search-lit never generates references from memory. Every DOI/PMID is verified via API.
- Real checklists bundled — STROBE, STARD, TRIPOD+AI, PRISMA, and ARRIVE checklists are included (open-license ones). For copyrighted guidelines like CONSORT, the skill uses its knowledge but tells you to download the official checklist.
- Skills call each other — check-reporting can invoke make-figures to generate a missing flow diagram, or analyze-stats to fill in statistical gaps.

Install:

git clone https://github.com/aperivue/medical-research-skills.git
cp -r medical-research-skills/skills/* ~/.claude/skills/

Restart Claude Code and you're good to go. Works with CLI, desktop app, and IDE extensions.

GitHub: https://github.com/aperivue/medical-research-skills

Happy to answer questions about the implementation or take feature requests. If you work in a different research domain, the same skill architecture could be adapted — publish-skill was built specifically for that.

submitted by /u/Independent_Face210
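To illustrate the anti-hallucination idea (not the search-lit skill's actual code), here is a minimal sketch that checks a PMID against the public NCBI E-utilities esummary endpoint before accepting a citation; the PMIDs in the example are placeholders:

```python
# Minimal illustration of "verify citations against a real API": confirm that
# a PMID resolves via NCBI E-utilities and pull its title before accepting it.
# Not the search-lit skill's actual code; example PMIDs are placeholders.
import requests

ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def verify_pmid(pmid: str) -> str | None:
    """Return the article title if the PMID resolves, else None."""
    resp = requests.get(
        ESUMMARY, params={"db": "pubmed", "id": pmid, "retmode": "json"}, timeout=10
    )
    resp.raise_for_status()
    record = resp.json().get("result", {}).get(pmid)
    if not record or "error" in record:
        return None
    return record.get("title")


if __name__ == "__main__":
    for pmid in ["31452104", "99999999999"]:  # placeholder IDs: one plausible, one bogus
        title = verify_pmid(pmid)
        print(f"{pmid}: {'OK - ' + title if title else 'NOT FOUND - drop this citation'}")
```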
Has anyone done a detailed comparison of the differences between AI chatbots?
I've been doing some science experiments as well as finance research and have been asking the same questions to ChatGPT, Claude, Perplexity, Venice, and Grok. Going forward, I want the peace of mind of knowing the one I end up using will be the most accurate, at least for my needs (general questions about finance (companies) and science, not anything coding- or image-related).

ChatGPT does the best at summarizing and giving a consensus outline with interesting follow-up questions. Its edge in pertinent follow-up questions will likely have me always using it.

Grok has been best at citing exactly what I need from research papers. I was surprised, as I had the lowest expectations for it, but it also provides links to the publications.

Claude is very good at details and specifics (that are accurate) but doesn't publicly cite sources. Still, I come closest to conclusions with Claude because of the accuracy of the info.

Venice provides a ton of relevant info, but it doesn't narrow it down to an accurate conclusion, at least scientifically, the way Claude does. When I was looking for temperature ranges for bacterial growth, it provided broad boundaries instead of tightly defined numbers. Perplexity is very similar to Venice.

I'm curious, for those who have spent time with these chatbots: what pros and cons do you see in each?

submitted by /u/VivaLaBiome
Bad performance in my first code/app building, super frustrating. Any advice?
I've been trying to use Claude for some light coding and tech architecture work. For simple things — outlining steps, connecting systems — it works great. But I've been running into several frustrating issues as soon as actual code is involved, and I wanted to ask if anyone else has experienced this or has suggestions. Here are my main pain points: **1. Struggles with simple visual tasks** I had a working HTML screen that Claude had generated, and I wanted to restyle it. Since I can't write HTML myself, I drew a static mockup of what I wanted and gave it to Claude as an image, asking it to match it. I'm now on the sixth iteration and it still hasn't gotten it right — every time something is off. **2. Context window issues** I created a dedicated Project for my work, but yesterday alone I was warned twice that I needed to start a new chat because the context was full — and the chats weren't even that long. Fine, I start a new one. But in the new chat, Claude doesn't automatically pick up the previous context. So I re-explain the task, it approaches it differently than before, I have to tell it we already worked on this in a previous chat, it goes back to check, realizes the discrepancy, and rewrites everything from scratch. Twice the code, twice the messages, completely unnecessary friction. **3. Token limits running out very fast** I've only ever used Sonnet, never Opus. At first I also had extended thinking enabled, but after less than 2 hours of work I was already at 90% of my daily limit — so I turned it off. And I'm working in regular chat, not Claude Code. Has anyone dealt with these issues? Any tips? For context: I'm not a developer, but I work in digital, so I can say with confidence that what I'm building is genuinely simple — 4 HTML screens with some slightly complex logic, but nothing close to what most people on here are doing. I had high hopes for Claude, but the last two days have been genuinely frustrating. submitted by /u/andrea8rossi [link] [comments]
New Claude Code Plugin: Forge
Have you had the experience of asking Claude Code to accomplish an ambitious project, and it says it succeeded, but actually doesn't follow your instructions? Like maybe 30-60% was completed? Yeah, me too. I fixed it through a new Claude Code plugin. Forge does three things:

- Solves context rot
- Solves the echo chamber
- Solves workflow

https://github.com/buwilliams/forge/

The plugin was built using Claude Code. The details of that process are outlined in the project's documentation. Enjoy!

p.s. See the X thread for more info: https://x.com/BuddyIterate/status/2039688532722159831?s=20

submitted by /u/buwilliams
I blindfolded Opus 4.6 and employed it as an analyst to score 44 SaaS companies on AI disruption risk using anonymized 10-K filings. Here's what it found.
Hello everyone,

Some of you might remember my previous experiments here where I had Opus evaluate 547 Reddit investing recommendations or created Opus-Warren-Buffet. I'm back with another one that I think this community will find interesting :-). As always, if you prefer watching the experiment, I've posted it on my channel: https://www.youtube.com/watch?v=ixpEqNc5ljA

**Intro**

Shortly after Claude Cowork launched, Anthropic also released 11 industry plugins in January. Some of you might be aware that this ended up triggering a "SaaSpocalypse" where SaaS stocks lost $285B in market cap in February. During this downturn I sensed that the market might have punished all software stocks unequally, with some of the strongest stocks getting caught in the AI panic selloff, and I wanted to see if I could run an experiment with Claude Code and a proper methodology to find these unfairly punished stocks. Since Claude was partly responsible for triggering this selloff, I thought it was only fitting to use Opus 4.6 as the analyst to determine which companies are resilient to being replaced by AI. But with a significant twist :-).

**The Framework**

I didn't want to make up my own scoring system since I don't have a financial analyst background. Instead, I found one from SaaS Capital, a lending firm that provides credit facilities to SaaS companies. In February, they published a framework they'd developed for evaluating AI disruption resilience across three dimensions (reduced from 10-12 dimensions):

- System of record: Does the company own critical data its customers can't live without?
- Non-software complement: Is there something beyond just code? Proprietary data, hardware integrations, exclusive network access, etc.
- User stakes: If the CEO uses it for million-dollar decisions, switching costs are enormous.

Each dimension scores 1-4. Average = resilience score. Above 3.0 = lower disruption risk. Below 2.0 = high risk.

**The Experiment & How Claude Helped**

I wanted to add a twist to SaaS Capital's methodology. I built a pipeline in Claude Code that:

1. Pulls each company's most recent 10-K filing from SEC EDGAR
2. Strips out every company name, ticker, and product name — Salesforce becomes "Company 037," CrowdStrike becomes "Company 008," and so on
3. Has Opus 4.6 score each anonymized filing purely on what the business told the SEC about itself

The idea was that Opus 4.6 would score each company purely on what it told the SEC about its own business, removing any brand perception, analyst sentiment, Twitter hot takes, etc.

**Claude Code Pipeline**

saas-disruption-scoring/
├── skills/
│   ├── lookup-ciks            # Resolves tickers → SEC CIK numbers via EDGAR API
│   ├── pull-10k-filings       # Fetches Item 1 (Business Description) from most recent 10-K filing
│   ├── pull-drawdowns         # Pulls Jan 2 close price, Feb low, and YTD return per stock
│   ├── anonymize-filings      # Strips company name, ticker, product names → "Company_037.txt"
│   ├── compile-scores         # Aggregates all scoring results into final CSVs
│   ├── analyze                # Correlation analysis, quadrant assignment, contamination delta
│   └── visualize              # Scatter plot matrix, ranked charts, 2x2 quadrant diagram
│
├── sub-agents/
│   ├── blind-scorer           # Opus 4.6 scores anonymized 10-K on 3 dimensions (SoR, NSC, U&U)
│   ├── open-scorer            # Same scoring with company identity revealed (contamination check)
│   └── contamination-checker  # Compares blind vs open scores to measure narrative bias

**Results**

I plotted all 44 companies on a 2x2 matrix.
The main thing this framework aims to find is the bottom-left quadrant, aka the "unfairly punished" companies: the ones it thinks are quite resilient to AI disruption but whose stock went down significantly due to market panic.

Chart (2x2 quadrant matrix): https://preview.redd.it/uz8djhcuqrsg1.png?width=2566&format=png&auto=webp&s=435151ae53de7d7c85bc3b38c07c8de2f61ac878

**Limitations**

This experiment comes with a few limitations that I want to outline:

- 10-K bias: Every filing is written to make the business sound essential. DocuSign scored 3.33 because the 10-K says "system of record for legally binding agreements." Sounds mission-critical, but getting a signature on a document is one of the easiest things to rebuild.
- Claude cheating: Even though 10-K filings were anonymized, Claude could have semantically figured out which company we were scoring each time, removing the "blindness" aspect of this experiment.
- This is just one framework: Product complexity, competitive dynamics, management quality — none of that is captured here.

Hope this experiment was valuable/useful for you. We'll check back in a few months to see whether this methodology proved any value in figuring out AI resilience :-).

Video walkthrough with the full methodology (free): https://www.youtube.com/watch?v=ixpEqNc5ljA&t=1s

Thanks a lot for reading the post!

submitted by /u/Soft_Table_8892
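The anonymize-filings step in the pipeline above is the easiest part to picture. The sketch below is a hypothetical re-creation, not the author's skill: it assumes you already have each company's names, tickers, and product names collected in a mapping, and it simply regex-replaces them with the blind alias before scoring:

```python
# Hypothetical re-creation of the anonymize-filings step described above,
# not the author's actual skill. Assumes a known mapping of company names,
# tickers, and product names; replaces each with a blind "Company NNN" alias.
import re

ALIAS_TERMS = {
    "Company 037": ["Salesforce", "CRM", "Slack", "Tableau"],
    "Company 008": ["CrowdStrike", "CRWD", "Falcon"],
}


def anonymize(filing_text: str) -> str:
    for alias, terms in ALIAS_TERMS.items():
        for term in sorted(terms, key=len, reverse=True):  # longest names first
            # Word-boundary, case-insensitive replacement of names and tickers.
            filing_text = re.sub(rf"\b{re.escape(term)}\b", alias, filing_text, flags=re.IGNORECASE)
    return filing_text


sample = "Salesforce (NYSE: CRM) reports that Tableau revenue grew this year."
print(anonymize(sample))
# -> "Company 037 (NYSE: Company 037) reports that Company 037 revenue grew this year."
```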
Built an MCP server for options data so Claude can access gamma levels, flow, screeners, and signals directly in chat
I built an MCP server for my product, GammaHero, so you can connect it to Claude / other MCP clients and use options-market data directly inside your AI conversations.

The main use case is pretty simple: instead of opening a bunch of tabs for option chains, screeners, levels, watchlists, and notes, you connect GammaHero once and then ask your AI for what you need in plain English.

A few things it can pull into chat:

- ticker summaries with dealer gamma levels, put wall / call wall / hedge wall, IV rank, skew, GEX/DEX, options flow, implied move, momentum, etc.
- active trade signals like buy-the-dip, sell-the-top, resistance tests, plus conviction + key levels
- screener results for bullish / bearish candidates, support, resistance, long calls, long puts, volatility setups, high IV / low IV, etc.
- options distribution by strike or DTE
- your own GammaHero watchlist

(Screenshot: GOOGL analysis inside my Claude chat.)

So the workflow becomes more like:

- "Show me the gamma levels + active signal for TSLA"
- "What names are near strong support right now?"
- "Compare NVDA vs AMD from an options positioning perspective"
- "Which tickers in my watchlist have active signals today?"
- "Show me where gamma/open interest is concentrated for SPY by strike"

What I like about MCP for this is that the AI is no longer guessing from stale web text or generic finance knowledge. It can actually use the same structured data that's inside the product.

One thing that feels underrated about MCP: this gets even more useful when you combine multiple finance MCPs in a single chat. Instead of one app trying to do everything, the AI can pull structured data from several specialized tools and reason across them in one place. I think there's a big opportunity for SaaS products to build MCP servers around their best internal data/workflows.

Setup is pretty quick in Claude (this is free at the moment, anyone can try it): Customize → Connectors → Add custom connector, then paste: https://gammahero.com/ah-api/mcp/

I also support other MCP clients with an API key (which you have to generate on the settings page of my website).

Would love feedback from anyone using AI for options, market structure, or trade idea generation. Happy to answer questions about the MCP implementation too.

submitted by /u/CameraGlass6957
Project not remembering
I started a project, worked out some things, and generated an outline. Came back the next day, gave notes, worked out a couple of things, and asked for an updated outline. The new outline didn't have anything from the first outline. When queried, Claude said it didn't have the first outline. Upon further questioning, it said that it can't save information in total like that. I have the basic paid version. Is this normal? It doesn't seem so, and if it is, is there a better way to get it to work on a project? Thx

submitted by /u/middleamerican67
[D] Real-time Student Attention Detection: ResNet vs Facial Landmarks - Which approach for resource-constrained deployment?
I have a problem statement where we are supposed to detect the attention level of students in a classroom, basically outputting whether a student is engaged, confused, or bored. We are trying to decide which approach to choose. To explain the facial-landmarks approach, this is what my Claude says:

Facial landmarks are specific coordinate points (x, y) that map key features on a face. The standard model uses 68 points that outline the jawline, eyebrows, eyes, nose, and mouth. This approach has roots in traditional computer vision and is based on geometric measurements rather than pixel patterns.

Based on this recent paper: [The first look: a biometric analysis of emotion recognition using key facial features](https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2025.1554320/full)

The paper used **eye-tracking on 30 participants** to scientifically determine which facial regions humans actually look at when recognizing emotions:

- **Finding:** People focus primarily on the eyes (especially left eye first) and mouth
- **Innovation:** Reduced the standard 68 landmarks to just **24 critical points** (eyes + mouth)

The other option is deep learning (ResNet/CNN): a ResNet model for facial emotion recognition. Feed raw facial images → CNN processes → outputs emotion classification.

submitted by /u/Savings_Load2308
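To make the geometric-measurement idea concrete, here is a minimal sketch of one classic landmark feature, the eye aspect ratio, computed from the standard 68-point layout. It assumes dlib and its pretrained shape_predictor_68_face_landmarks.dat file; the threshold and the use of EAR as an engagement proxy are purely illustrative, and a real attention classifier would combine several such geometric features (gaze, head pose, mouth openness) over time:

```python
# Minimal sketch of the geometric landmark approach: compute the eye aspect
# ratio (EAR) from dlib's 68-point landmarks as one simple attention feature.
# Assumes dlib and the pretrained shape_predictor_68_face_landmarks.dat file;
# the threshold mentioned at the end is illustrative only.
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")


def eye_aspect_ratio(pts: np.ndarray) -> float:
    """pts: six (x, y) landmarks of one eye, in the standard 68-point order."""
    vertical = np.linalg.norm(pts[1] - pts[5]) + np.linalg.norm(pts[2] - pts[4])
    horizontal = np.linalg.norm(pts[0] - pts[3])
    return vertical / (2.0 * horizontal)


def eyes_open_score(gray_frame: np.ndarray) -> float | None:
    faces = detector(gray_frame)
    if not faces:
        return None
    shape = predictor(gray_frame, faces[0])
    coords = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
    left, right = coords[36:42], coords[42:48]  # 68-point indices for the two eyes
    return (eye_aspect_ratio(left) + eye_aspect_ratio(right)) / 2.0


# Example idea: a persistently low average EAR across frames can flag
# drowsiness / low engagement, e.g. score < 0.2 over several seconds.
```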
Repository Audit Available
Deep analysis of outlines-dev/outlines — architecture, costs, security, dependencies & more
Outlines has a public GitHub repository with 13,618 stars.
Based on 26 social mentions analyzed, sentiment is 0% positive, 100% neutral, and 0% negative.
Connor Leahy (CEO at Conjecture): 1 mention