OpenSpace is the Visual Intelligence Platform built for construction teams. Turn jobsite imagery into real-time insights that drive better decisions,
OpenSpace is praised for its user-friendly interface and comprehensive features suited for construction project management, which helps streamline workflows and improve project visibility. Some users express frustration over occasional software glitches and the steep learning curve for new users. The pricing is generally perceived as high, though many feel it is justified by the value it brings to complex construction processes. Overall, OpenSpace maintains a positive reputation for enhancing efficiency, albeit with some room for improvement in user support and pricing flexibility.
Mentions (30d): 50 (9 this week)
Reviews: 0
Platforms: 2
Sentiment: 9% (12 positive)
Industry: information technology & services
Employees: 310
Funding Stage: Series D
Total Funding: $200.1M
Anthropic just bought the company that generates most production MCP servers
Anthropic acquired Stainless on Monday for a reported $300M+. Most coverage is framing this as a developer tools acquisition. Stainless is best known for generating the official Python and Node SDKs that ship with OpenAI, Google, Meta, Cloudflare, and Anthropic. The SDK story is real. The MCP side is the part that matters here.

Stainless was one of the first vendors to extend their compiler to produce MCP servers from the same OpenAPI specs that produce their SDKs. MCP hit ~97M monthly SDK downloads by December 2025 and around 10,000 production servers by early 2026. A lot of that production code was Stainless-generated. Anthropic now owns the dominant MCP server generator.

What actually changed hands on Monday:

1. The engineering team. Roughly 40-50 people including founder Alex Rattray, who previously built Stripe's patented SDK generation system. Now reporting to Katelyn Lesse in Anthropic's Platform Engineering org.
2. The technology. The generator, the templates, the language-specific runtimes, the OpenAPI extensions Stainless invented for SDK-specific edge cases.
3. The hosted product is winding down. New signups stopped Monday. New SDK and MCP server generations stopped Monday. Existing customers keep what they've already generated but the pipeline is closed.

My read: this is closer to what Google did with Kubernetes than to a normal acquisition. Anthropic created MCP. Anthropic donated MCP to the Linux Foundation last December. Anthropic now owns the dominant implementation toolchain. The protocol is vendor-neutral on paper. The implementation toolchain isn't.

Six months of Anthropic M&A starts looking less coincidental:

* December 2025: Bun, the JS runtime, pulled into Claude Code
* February 2026: Vercept, computer-use AI
* April 2026: Coefficient Bio, ~$400M healthcare AI
* May 2026: Stainless, SDK and MCP plumbing

They're not buying training infrastructure or GPU clusters. They're buying the integration layers around the model. The bet seems to be that frontier models are converging faster than anyone expected, so the moat is everywhere except the model.

If you're building on MCP today, tooling quality probably improves. Stainless's generator was already the cleanest in the space and the team that built it is now at Anthropic. Patterns will standardize faster as Stainless-derived templates become the de facto reference. The flip side is concentration risk. Cloudflare's MCP server framework, Pulse MCP, and the open-source generators Stainless released during the transition all become strategically important if you want any diversity in your stack.

Sources:

* [Anthropic announcement](https://www.anthropic.com/news/anthropic-acquires-stainless)
* [Why Anthropic actually did this, and migration math](https://brightbean.xyz/blog/anthropic-acquires-stainless-sdk-mcp-power-play/)

Curious whether Stainless ending up inside Anthropic reads as good news (better tooling) or concentration risk (one company owns the standard and the reference implementation) from your seat.
Pricing found: $10
Looking for vibe-research collaborators on “One-pass context-to-weight consolidation”
I’m a software engineer and AI enthusiast who wants to get involved with AI research, but I don’t have the full requisite math, ML coding chops, or compute needed to do typical research. I’m writing this post because I assume there are many other sub members in my boat, and I think I have a meaningful research problem with a shape that allows people like me to make progress.

I explain the problem, and why it’s tractable by people like this, at length in the Google Doc linked in the comments of this post, but in essence: I believe there’s a chance there’s some mathematical rule that allows you to cheaply imbue the in-context understanding a model gains directly into its weights. IF a rule like this existed, then checking whether you’ve found it requires very little compute. The core loop requires running the input-token forward passes of a model large enough to learn in context (for reference, a 1 billion parameter model can do this and runs on a MacBook Pro), applying this rule (which, by the hypothesized construction of where in the solution space we’re looking, is computationally cheap), then quizzing the model without the context on what it demonstrably knew in context and running regression benchmarks to make sure the application of the rule didn’t damage the model’s other capabilities.

Although checking whether you’ve found this rule is computationally cheap, proposing and implementing candidate rules is very difficult. It requires diverse mathematical and machine learning expertise, along with the scientific rigor to guide the search process. Up until now, there were very few people with access to those abilities. However, this is changing with modern frontier models. OpenAI and Anthropic both have soon-to-be-released models capable of valuable mathematical work (see the Erdős unit distance problem solved by the internal OpenAI model, and Mythos). My proposal is to form a research community of “citizen scientists” to make progress on this problem.
It’s possible the solution doesn’t exist, or is so incredibly complicated that modern frontier models have no hope of solving it. But my argument is that, for the first time, the solution is plausibly within reach of model capabilities. This, in combination with the immense upside of LLMs being able to cheaply learn from experience, makes researching it very high expected value.

Participating in this community would involve sharing results, progress, benchmarks, and research insights. To contribute productively, the rough requirements are:

* a $200-tier AI subscription
* a computer roughly as capable as a MacBook Pro with an M3 chip, or willingness to pay 10 bucks a day for cloud compute
* a working knowledge of how LLMs function and of the field of AI / cognitive science

submitted by /u/Independent-Soft2330 [link] [comments]
I was curious about my Claude sessions water usage so I built this
So, I was curious about how much water these data centres use to cool their hardware during my Claude sessions. I built this tool in 2.5 days and made it fully open source and free for anyone to contribute to as the AI space evolves. I'm not advertising anything, just making this stuff so I can hopefully get portfolio credit. Built for Claude only (for now) using Claude Sonnet 4.6 and Opus 4.7/4.8. Try it for yourself here: https://github.com/pentasir/thirsty-llm/tree/main This is what the dashboard looks like; it has light and dark modes, and the default view is light mode. My session today: https://preview.redd.it/ug2obzmri84h1.png?width=1080&format=png&auto=webp&s=2df812c41d324e0cca29809d57181a971b7fce66 Thanks, I hope you guys find this helpful or informative, to say the least, eh. submitted by /u/learning18 [link] [comments]
[Use Case] Making GPT Image 2.0 output come to life
The new image function was great to help me get visual ideas to 3d model and design. I am about to release a paint range that is affordable to most hobbyists in Australia. A dropper bottle is a better design so I got these in bulk but didn't like the fact people would just have an unattractive bottle to hold. Most of my art related stuff is grounded in historical concepts and I've saved my business strategy and vision on gpt memories. The idea we came up with after multiple back and forth was a cathedral style tied in with Abbot Suger's history and creation of stained glass. GPT output and how I 3d modelled, printed and painted the sleeve to show the actual colour. submitted by /u/ValehartProject [link] [comments]
Hidden Latent-State Shifts in LLMs: Why Current Alignment Is Blind to Real Internal Dangers, Especially With Agents
For years, the alignment community has focused almost entirely on the model’s output: making sure the final tokens are safe, helpful, and honest. RLHF, DPO, constitutional AI, output filters: all of it operates at the surface level. But what if the model can enter a completely different internal regime inside the residual stream, while its external behavior remains perfectly aligned? We just measured exactly that.

Grade 4 experiment on Gemma-3-12B-IT (using Gemma Scope SAE-res-all-small, layers 12–41):

The model received the same question under five conditions:

* target: coherent, dense target text
* neutral_length_matched: neutral text of identical length
* target_sentence_shuffle: target text with sentences shuffled
* target_word_shuffle: target text with words shuffled inside sentences
* question_only: bare question

We computed a Vector X that best separates the target condition from baselines and measured how strongly each hidden state projects onto it.

Key results (averages across 10 questions):

| Condition | Mean Projection on Vector X | Mean Direction Cosine |
|---|---|---|
| target | 0.8 to 1.7 | 0.51 to 0.81 |
| neutral_length_matched | -0.04 to -0.21 | -0.09 to -0.45 |
| target_sentence_shuffle | -0.5 to +0.6 | -0.22 to +0.48 |
| target_word_shuffle | 0.2 to 1.4 | 0.03 to 0.72 |

Shuffling sentences or words significantly reduces (or reverses) the shift. This is not just lexical similarity: the model is sensitive to discourse structure (order sensitivity). We also observed clear phase transitions, that is, sudden jumps in projection of up to +80–100 units in a single step, especially in middle layers. FDR-corrected tests confirm the differences between target and controls are statistically significant across many layers (particularly layers 16–41).

Most important finding: Strong internal geometry shift in the residual stream, but almost no change in final behavior.
The model enters a measurably different latent regime under coherent context, yet its output remains “perfectly aligned.” Current safety methods, which only look at tokens, are blind to this. What this means for alignment The entire current alignment paradigm rests on a false assumption: “if the output is safe, the model is safe.” We have been polishing the surface while leaving the residual stream largely unmonitored. Scaling, RLHF, and output-based evaluation cannot detect these internal regime shifts. What this means for companies and labs Many organizations still operate under three dangerous illusions: “We have solved safety” because the model passes red-teaming on outputs. “RLHF protects us” because the model learned not to say bad things. “Bigger models are safer” because alignment supposedly scales. In reality, they are rapidly deploying agents with long context, tool use, persistent memory, and real-world decision-making. A single dense coherent context can trigger an internal latent-state shift that existing safeguards do not see. This is not a hypothetical future risk. This is a structural vulnerability that is already present. What I need from the community I need help understanding the value of these metrics. Do they show a real internal latent-state shift in the model, or could this be an artifact of the analysis? If the result is not noise, what does it actually mean for our understanding of LLMs? I'm not asking anyone to confirm my theory. I need a hard technical critique: which metrics are important here, which are weak, what can be ignored, where the experiment might have flaws, what additional checks or causal experiments are needed, and whether this has real implications for interpretability and AI safety. I would be very grateful for input from people who work with hidden states, residual stream geometry, representation analysis, or mechanistic interpretability. 
Full open research: Zenodo: https://zenodo.org/records/20435525 GitHub: https://github.com/ngscode23/latent-space-shift-research https://drive.google.com/drive/folders/1Zl9iY33Lmwz3VuOATWx4jup-cE7TJ7TJ?usp=drive_link Would love to hear your thoughts. submitted by /u/PresentSituation8736 [link] [comments]
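The two metrics in the results table, projection onto a separating direction and direction cosine, can be illustrated with a small sketch. The post does not say how Vector X was fitted, so this example assumes the simplest option, a normalized difference of condition means; the function names and the toy 2-D "hidden states" are hypothetical and are not taken from the linked repository.

```python
import math

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def separating_direction(target_states, baseline_states):
    """ASSUMPTION: 'Vector X' is the unit vector from the baseline mean to the target mean."""
    mu_t = mean_vector(target_states)
    mu_b = mean_vector(baseline_states)
    diff = [t - b for t, b in zip(mu_t, mu_b)]
    length = norm(diff)
    if length == 0:
        raise ValueError("condition means coincide; no direction to fit")
    return [d / length for d in diff]

def projection(state, direction, origin):
    """Signed length of (state - origin) along the unit direction."""
    shifted = [s - o for s, o in zip(state, origin)]
    return dot(shifted, direction)

def direction_cosine(state, direction, origin):
    """Cosine of the angle between (state - origin) and the unit direction."""
    shifted = [s - o for s, o in zip(state, origin)]
    length = norm(shifted)
    return dot(shifted, direction) / length if length else 0.0

# Toy data standing in for residual-stream activations (hypothetical values).
target = [[2.0, 1.0], [2.0, -1.0]]
baseline = [[0.0, 1.0], [0.0, -1.0]]
origin = mean_vector(baseline)
vector_x = separating_direction(target, baseline)

proj = projection(target[0], vector_x, origin)
cos = direction_cosine(target[0], vector_x, origin)
```

One caveat worth checking with any direction fitted this way: if it is fitted and evaluated on the same examples, the target condition will project positively by construction, so a held-out split or a shuffled-label control is needed before reading the numbers as evidence of a real shift.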
Direct interactive Bingo Cards in ChatGPT (image 2.0 / HTML)
I made this custom GPT that turns any idea into a fun interactive Bingo card with image 2.0 and code in ChatGPT... Make your own (I believe it does web search as well if needed). Image 2.0 is so powerful. This is the link if you want to try: https://chatgpt.com/g/g-8LpYxR8Ty-bingo-card-generator (They are a bit corny sometimes) submitted by /u/foreverfomo [link] [comments]
We built an app that runs AI completely offline on your phone (Local LLMs). Perfect for flights, camping, or dead zones.
Hey everyone, A while ago, we realized a major annoyance: whenever you actually need an AI to summarize a document, write some quick code, or just brainstorm, you're usually on a flight, on the subway, or dealing with terrible cell reception. And bam, ChatGPT won't connect. Plus, there's the growing privacy concern of feeding all your personal data to cloud servers. So, my team and I started tinkering with a question: "What if we just run the AI directly on the phone's hardware?" We've been spending our evenings and weekends for months trying to make this work smoothly, and the result is Cortex AI. The logic is super simple: You download a highly optimized, small-scale local model (from our library) straight to your device. Put your phone in airplane mode, go off the grid—the AI replies entirely locally. Zero data leaves your phone. 100% private. Some real-world use cases we built this for: Coding help or summarizing offline docs while on a long flight. Getting quick answers while traveling abroad without an expensive data roaming plan. Brainstorming private ideas you just don't want OpenAI or Google to scrape. Note: We do have an optional "Online Mode" if you want to connect to massive models like GPT-4 or Claude, but the local offline models are completely free, and that's what we really want to test right now. We're currently trying to gather real user experiences on the local execution side. I'm not here to just spam a link and grab cash; we genuinely want to improve the offline mobile AI space. If anyone frequently travels, camps, or just loves local LLMs, we'd be super grateful if you could test it out. Brutally honest feedback like "runs too slow on my device," "needs X feature," or "this part of the UI makes no sense" is exactly what we need right now :) submitted by /u/Virtual_Ad_6024 [link] [comments]
I Renovated My Apartment With AI. Here's What Came Out of It
Spoiler: not a single visible cable, not a single piece of furniture moved twice. When I started, I had an apartment and dimensions from the building blueprint. No designer. No clear idea where to go. But there was a desire to make something that would turn a standard apartment in a high-rise into a place of power — a place comfortable to live and work in. Instead of a designer, I took Claude. How it all began The first conversation wasn't about furniture or wallpaper. It was about direction. I didn't know what I wanted. I knew what I didn't want — kitsch, heavy classics, excessive decoration. We worked through options together. Scandinavian minimalism. Japanese wabi-sabi. Loft. Modern classic. The AI broke down each style by character, materials, color logic. Not "this would suit you," but "here's what this means, here's what this requires, here's what you'll get." In the end I arrived at Scandinavian for the bedroom. Warm, light, calm, with one deliberate accent behind the headboard. The living room–kitchen — loft with a red thread running through the whole space, because the furniture there was already concrete-grey with red niches and replacing it wasn't on the table. The hallway and corridor — neutral grey, as a transition between two characters. Three zones, three moods, one logic. The bedroom This was the most detailed conversation. A room with one window, one door, three free walls. Together we came up with: an accent wall behind the headboard with golden geometric lines, the other three walls in cream from the same collection. Tone on tone, different saturation, same texture. The seam between walls reads not as a boundary but as gradation. White matte furniture with black hardware. A wardrobe with a top cabinet almost to the ceiling. Mirrored doors reflect the accent wall — the golden lines are present even where they physically aren't. Then came the centimeters. The AI calculated. Adding up wardrobe depth, gaps, bed width, nightstands, dresser. 
Checking that everything fits. Whether the wardrobe door opens without hitting the nightstand. It even accounted for the arc of opening — that's a whole separate half-page story with mathematical formulas. By the end I had not "approximate distances" but specific points. Where to mount the light. Where to place the bed. Where to cut a network outlet into the baseboard. At what height to mount the TV unit so that watching half-lying down would be comfortable — that was calculated too, through mattress height plus pillows plus eye position. The living room Different approach. Here there was already furniture that wasn't being replaced: concrete-grey, red niches, black desk, grey sofa. The task — give the space one wall that would tie it all together. We decided: accent wallpaper behind the sofa, on the longest wall. Red-black-grey circles. Red from the furniture niches, black from the desk, grey from the concrete furniture — the wallpaper literally collects the room's palette into one pattern. By the way, an unexpected moment happened with this wallpaper: it turned out to have glitter, which only added character to the room — it plays so beautifully at sunset. The fridge against the same wall is white. It was bought six months ago, and buying a new one wasn't an option. The solution — a vinyl sticker. In red-black geometry. The fridge stops being a white blot and becomes part of the wall. Between the sofa and the kitchen zone — a floor lamp with shelves in a black metal frame. And on the top shelf, an object with character — a replica of an iconic artifact from a favorite horror film. Yes, the Lament Configuration from Hellraiser. A personal thing with a story. Why not? The hallway and corridor Grey wallpaper with a vertical tone-on-tone stripe along the entire perimeter. Grey — a neutral buffer between the red-black living room and the cream bedroom. The entryway unit in oak and graphite. Warm wood against cold grey gives the temperature contrast needed. 
The vestibule is small, the unit doesn't take up the whole wall — the remaining meter of free wall is for a shoe bench, above which there will be either a mirror or some poster. By the way, ideas for posters Claude also suggested — both within the renovation discussion and in other conversations connected to my work and hobbies. The through-line Between all three spaces there are recurring elements: Black hardware — bedroom wardrobe handles, black curtain rod, black floor lamp frame in the living room, black handles on the entryway unit. Geometry — lines on the bedroom accent wall, circles on the living room accent wall, verticals on the hallway wallpaper. Warm base — cream tones in the bedroom, warm wood in the entryway. These aren't accidental coincidences. This is the logic we built in dialogue. What the contractors got The most valuable thing about all this work — I handed the contractor not "well, roughly in the middle" but coordinates accurate to the centimeter. Where to m
Spent 1,156,308,524 input tokens in May 🫣 Sharing what I learned
After burning through 1.15 billion tokens in past months, I've learned a thing or two about the tokens, what are they, how they are calculated and how to not overspend them. https://preview.redd.it/rurt4skju14h1.png?width=2432&format=png&auto=webp&s=b5f1d8b743bc23e14bc8854d71c8490bab73c819 Sharing some insight here below. What the hell is a token anyway? Think of tokens like LEGO pieces for language. Each piece can be a word, part of a word, punctuation, or a space. Quick examples: "OpenAI" = 1 token "OpenAI's" = 2 tokens (the apostrophe-s gets its own) "Cómo estás" = 5 tokens (non-English languages tokenize worse) https://preview.redd.it/9xzakaiwv14h1.png?width=1080&format=png&auto=webp&s=5d726a0258c36baa68ad6d130f495172a52425d9 Rule of thumb: 1 token ≈ 4 characters in English 100 tokens ≈ 75 words Use Claude tokenizer to check your prompts. One thing most people miss: JSON is a token pig. Brackets, quotes, colons, and commas each consume tokens — a compact JSON object uses roughly 2x the tokens of equivalent plain text. If you're sending structured data as context, plain text or markdown tables are significantly cheaper. How to not overspend — the full list 1. Choose the right model (yes, still obvious, still ignored) Current Claude pricing (per million tokens): Haiku 4.5 at $1/$5, Sonnet 4.6 at $3/$15, Opus 4.6 at $5/$25. Batch processing is 50% cheaper across all models (you might need to wait up to 24h to get results, usually they come back in 2-3h). https://platform.claude.com/docs/en/build-with-claude/batch-processing For comparison, if you're on OpenAI, the spread between mini and o1 is even more extreme. Most tasks don't need your flagship model. Audit your model usage frequently, models that were too weak 6 months ago might now be good enough.... If you want a single interface across OpenAI, Claude, DeepSeek, and Gemini, OpenRouter is worth it imo. 2. Prompt caching For Claude, prompt caching cuts cached input cost by 90%. 
Still the single highest-ROI optimization if you have long system prompts. The rule is still: put dynamic content at the end of your prompt. But here's what changed: Anthropic quietly changed the prompt cache TTL from 60 minutes down to 5 minutes in early 2026. For many production workloads, this single change increased effective costs by 30–60%. If you haven't audited your cache hit rates recently, do it now here: https://platform.claude.com/usage/cache https://preview.redd.it/ongee5v3w14h1.png?width=1080&format=png&auto=webp&s=fefe5d0093be0a26894fe0ddd9d92e1283b02572 3. Minimize output tokens!! Output tokens are 5x the price of input tokens. Instead of asking for full text responses, have the model return just IDs, categories, or position numbers... and do the mapping in your code. This cut our output costs ~60%. 4. Be careful with new model versions Opus 4.7 ships with a new tokenizer that can generate up to 35% more tokens for the same input text compared to Opus 4.6. 5. Set up billing alerts I cannot stress this enough. Set a hard budget cap and tiered alerts (50%, 80%, 100%). One runaway loop once cost me more than a week of normal spend in a single night. Hopefully this helps! Tilen, founder of AI agent that automates SEO/GEO (we consume a lot of tokens) 😄 submitted by /u/tiln7 [link] [comments]
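The arithmetic behind the tips above can be sketched in a few lines. This is a rough estimator only: the prices, the 90% cached-input discount, and the 50% batch discount are the figures quoted in the post, while the function names are hypothetical, cache-write charges are ignored, and stacking the batch and cache discounts is an assumption to verify against current pricing.

```python
# USD per million tokens as quoted in the post: (input, output).
PRICES = {
    "haiku-4.5": (1.0, 5.0),
    "sonnet-4.6": (3.0, 15.0),
    "opus-4.6": (5.0, 25.0),
}

def estimate_tokens(text):
    """Rough English-only estimate from the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

def estimate_cost(model, input_tokens, output_tokens, cached_fraction=0.0, batch=False):
    """Estimate the cost of a request in USD.

    cached_fraction: share of input tokens read from the prompt cache,
        billed here at 10% of the input price (the post's 90% discount).
    batch: apply the post's 50% batch discount to the whole request
        (ASSUMPTION: the two discounts stack).
    """
    in_price, out_price = PRICES[model]
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    total = uncached * in_price + cached * in_price * 0.10 + output_tokens * out_price
    cost = total / 1_000_000
    return cost * 0.5 if batch else cost

# 1M input tokens and 100k output tokens on Sonnet, without and with an 80% cache hit rate.
no_cache = estimate_cost("sonnet-4.6", 1_000_000, 100_000)
with_cache = estimate_cost("sonnet-4.6", 1_000_000, 100_000, cached_fraction=0.8)
```

Under these assumptions the uncached request comes to about $4.50 and the cached one to about $2.34, which shows why output tokens dominate once caching is working: the 100k output tokens cost $1.50 in both cases.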
View originalLoom for Claude
Yo! Solo founder, built this to help myself while working on my main startup. Turned out to be pretty useful so I thought I'd wrap it up for others to use. The problem: I use Cursor and Claude Code daily. The slow part isn't typing prompts anymore (Wispr Flow + voice mode already solved that) — it's explaining which screenshot goes with which sentence. "The button on the right of the second screenshot, the orange one, no, that one..." Dis Dat: press ⌃⌥⌘Space, talk while pointing your cursor at things, press again. A link lands on your clipboard. Paste it into Cursor, Claude Code, Codex, Lovable, v0... The agent goes and fetches your feedback — what you were saying, where you pointed — and ships the changes. Free to try, $19/mo for unlimited. Works with any AI vibe coding soon. Mac only for now (Apple Silicon + Intel). Also building a mobile version. open any page on your phone, talk as you scroll, and the link lands on your Mac ready to paste. So you can react out loud to your own product without sitting at your desk. Coming soon; happy to share more if anyone's curious. Things I'd genuinely value feedback on: What's the workflow you'd want this to slot into that I'm missing? What other agents would you want this to work with first? Anyone tried something similar and bounced off it... what killed it? I'll be here all day. Roast away. submitted by /u/Emergency_Bar_428 [link] [comments]
Kept context-switching between arxiv, OpenReview, GitHub, and HuggingFace for every paper, so I built this. Chrome extension + website with everything inline, plus citation graph + SPECTER2 neighbors. 3M papers, free, feedback welcome [P]
Spent the last few months building a deeper context layer over arxiv. Each paper gets a Tomesphere page with a TLDR + key findings (LLM-curated), OpenReview reviews where the venue is public, linked GitHub repos, HuggingFace models, conference videos, the citation graph in both directions, and a SPECTER2-based semantic neighbor graph. Same panel renders inline on arxiv via a Chrome extension (MV3 side panel API), or you can browse directly at tomesphere.com. 3M arxiv papers indexed. Caveats: reviewer scores only cover venues that publish openly on OpenReview (NeurIPS, ICLR, ICML, TMLR, COLM). Blind-review venues like CVPR, AAAI, ECCV are out of scope until contributors fill them in. GitHub, Hugging Face, and conference video matches are best-effort. Free, no signup. Site: [tomesphere.com](http://tomesphere.com/) Chrome: [chromewebstore.google.com/detail/tomesphere/nopoigoclhjcopjppnehidnkljmabllk](https://chromewebstore.google.com/detail/tomesphere/nopoigoclhjcopjppnehidnkljmabllk) Would love feedback, especially: which paper did you check first, and what's missing that you'd actually use?
A new dataset with more than 100M high-quality, curated images, with captions and metadata! [P]
Hello everyone. The new dataset is named MONET; it is Apache 2.0-licensed and available on HF: https://huggingface.co/datasets/jasperai/monet

MONET is an open, Apache 2.0-licensed image–text dataset. It was built from 2.9 billion images and refined to 104.9 million high-quality samples. We are also publishing a paper that explains how the dataset was created, if you are curious, and 3 companion projects:

* A UMAP to visualize the distribution
* A retrieval tool to do text or image search
* A codebase to train a T2I model based on MONET

Hope this will be useful! submitted by /u/dh7net [link] [comments]
What actually reduced our Claude api pain this month
Tl;dr: the unsexy fixes helped more than the clever ones. prompt caching, smaller inputs, and separating interactive work from batch work did more for us than model swapping. We use Claude for a customer facing doc review feature. Not huge scale, but enough traffic that when latency gets spiky the support channel notices fast. I spent most of May doing the boring cleanup i had postponed because "the model is good enough" had become our excuse for sloppy plumbing. First cleanup was prompt size. We had a giant system prompt that had grown by copy paste over months. Half of it was instructions for features that no longer existed. Cutting it down did not make the answers worse in our evals, and it made the whole thing easier to cache. I should have done that before touching infra. Second was prompt caching. Our workload repeats the same policy language and document templates constantly. Once we rearranged the prompt so the stable parts came first, caching finally started doing useful work. I am not giving a universal number because workloads differ, but for us the reduction in billed input tokens was large enough that finance noticed before engineering did. Third was moving batch work away from human traffic. We had nightly jobs, customer initiated jobs, and backfills all sharing the same path. During busy windows they all looked equally urgent to the code, which was stupid. Now customer initiated requests get priority, backfills pause, and anything that does not need to run during the workday waits. This was a config change and a little queue work, not a grand architecture project. Fourth was making retries less aggressive. I had copied a retry helper from another service and it was too eager for this workload. Fewer retries with better spacing made the user experience calmer because we failed faster on the few requests that were obviously not going to recover. Feels wrong at first, but infinite optimism is not a reliability strategy. 
For the leftover real time path, the useful part was moving routing out of our app code. We tested TokenRouter there because it kept the Claude Messages shape instead of forcing an OpenAI shaped adapter. The interesting bit was not just provider selection, but whether the routing layer has optimized serving capacity behind it when the normal path is congested. I am still treating that as one part of the fix, but it is the part i would not want to rebuild in app code. The main thing i would tell my April self: do not start with provider switching. Start by making your Claude usage less wasteful and less bursty. If that does not get you enough headroom, then think about routing. submitted by /u/AlbatrossUpset9476 [link] [comments]
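The third and fourth fixes described above (customer requests ahead of backfills, and fewer retries with better spacing) might look roughly like the sketch below. Nothing here is the poster's code: the tier names, `JobQueue`, `call_with_retries`, and the default limits are all hypothetical, and a real helper should retry only on errors known to be transient rather than on every exception.

```python
import heapq
import itertools
import random
import time

# Hypothetical priority tiers; lower numbers run first.
INTERACTIVE, NIGHTLY, BACKFILL = 0, 1, 2

class JobQueue:
    """Priority queue that holds back non-interactive work during busy windows."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a tier

    def put(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def get(self, busy=False):
        """Return the next job, or None. While busy, only interactive jobs are released."""
        if not self._heap:
            return None
        if busy and self._heap[0][0] != INTERACTIVE:
            return None
        return heapq.heappop(self._heap)[2]

def call_with_retries(fn, max_attempts=3, base_delay=1.0, max_delay=8.0, sleep=time.sleep):
    """Call fn with a small, capped number of attempts and jittered exponential spacing."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:  # sketch only: narrow this to retryable errors in real code
            if attempt == max_attempts:
                raise  # fail fast instead of retrying indefinitely
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids synchronized retries

queue = JobQueue()
queue.put(BACKFILL, "reindex-2023")
queue.put(INTERACTIVE, "customer-doc-review")
first = queue.get(busy=True)    # the interactive job is released
second = queue.get(busy=True)   # the backfill waits while traffic is busy
third = queue.get(busy=False)   # the backfill runs once the window is quiet
```

The `sleep` parameter is injectable so the spacing can be tested without waiting; keeping `max_attempts` small is the part that matches the post's point about failing faster on requests that are not going to recover.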
Building a Claude Code designer agent for multi-page SVG assembly instructions: anyone done this?
Hey everyone, I've been thinking about whether it's possible to build a solid designer workflow using Claude Code for complex, multi-page layout tasks.

Here's my situation: I have a new corporate identity for my company and I need to produce assembly instructions that I print and also distribute as PDFs (typically 10–25 pages each). I want to automate as much of the layout work as possible.

My rough idea is to set up a Claude Code project with reference data so Claude knows exactly how each page should look, essentially a DESIGN.md with layout rules, typography, spacing, components, etc. I'd then feed it the content per page (text, photos, and so on), and the goal would be to get the output 80% production-ready. Since the files would be SVGs, I could then do the final polish pass in Affinity Designer or similar.

A few open questions I'm trying to figure out:
Has anyone built something like this that outputs SVG directly?
Would it be better to generate HTML first (styled to match the design system) and then convert to SVG, or go straight to SVG?
Single-page generation feels doable, but reliably producing 10–20 pages in one structured run is the real challenge. How have others approached that?

Would love to hear if anyone has tackled something similar.

submitted by /u/Successful-Fold5319
Built a macOS notch app that shows your Claude token usage in real time (Claude Code + API both tracked)
If you use Claude Code heavily, you've probably hit that moment where the context window fills up and you're not sure how much of the conversation is still in play. Or you check your API bill at the end of the month and it's higher than expected with no clear way to trace it. I ran into this enough times that I built something for it.

TrackNotch is a macOS app that sits in the notch wings, the dark space on either side of the camera. It tracks your Claude usage in real time: tokens used, cost estimate, and a visual context arc so you can see how full your current window is. It also covers OpenAI, Cursor, and Codex if you use those alongside Claude.

The whole thing runs locally. No account, no server, no data leaving your machine. It reads Claude Code's local log files and uses Keychain for API credentials.

v1.1.0 is out now as a free DMG on GitHub (MIT, open source).

Would be curious to hear if the context arc is useful to others or if there's a different usage signal that would be more helpful.

submitted by /u/buecewayne
Pricing found: $10
Key features include: Search site; Free, on-demand courses with easy-to-follow instructions, tips, and tricks; What is OpenSpace Capture?; How do you create a site capture with OpenSpace?; What are the most common uses for OpenSpace Capture?; Who can use OpenSpace Capture?; Can OpenSpace Capture save us money?; How will using OpenSpace Capture make us more efficient?
OpenSpace is commonly used for: Site progress tracking, Quality assurance inspections, Safety compliance monitoring, Remote project collaboration, Historical project documentation, Stakeholder reporting.
OpenSpace integrates with: Procore, Autodesk BIM 360, PlanGrid, Microsoft Teams, Slack, Bluebeam, Trello, Asana.
Based on user reviews and social mentions, the most common pain points are: token usage, API bill, cost tracking, OpenAI bill.

How T&H Investments cut inspections from hours to minutes with OpenSpace Field
Mar 23, 2026
Based on 141 social mentions analyzed, 9% of sentiment is positive, 91% neutral, and 1% negative.