Based on the limited social mentions provided, there's insufficient data to form a comprehensive summary of user sentiment about Perplexity. The TikTok mention highlights the CEO's bold pricing predictions ($2,000 for a single AI prompt), which could indicate either premium positioning or pricing concerns depending on user reception. The multiple YouTube mentions suggest there's significant interest and content creation around the platform, indicating decent market awareness. However, without actual user reviews or detailed social commentary, I cannot accurately assess user opinions on Perplexity's main strengths, key complaints, pricing sentiment, or overall reputation.
Mentions (30d): 0 | Reviews: 0 | Platforms: 5 | Sentiment: 0% (0 positive)
Industry: information technology & services | Employees: 250 | Funding Stage: Other | Total Funding: $1.4B
Announcing Personal Computer. Personal Computer is an always-on, local merge with Perplexity Computer that works for you 24/7. It's personal, secure, and works across your files, apps, and sessions through a continuously running Mac mini. https://t.co/EpvilVX6XZ
Built a Chrome extension that exports your AI chats to PDF/DOCX/JSON in under a second
Most chat exporters I tried had a 5–10 second loading delay. This one is instant — the export is done before you can blink. Built this with Claude to solve my own frustration. Works with ChatGPT, Claude, Gemini, Perplexity, and Grok. Exports to PDF, DOCX, JSON, CSV, and Markdown. Formatting stays intact. Completely free, runs entirely locally — your conversations never touch any server. Chrome Web Store link in the comments. submitted by /u/Sufficient-War-4020
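The export step the post describes — turning a captured transcript into a document without any server round-trip — can be sketched in a few lines. This is an illustration only (Python here, though the real extension is browser JavaScript), and the message shape is my assumption:

```python
import json

def export_chat(messages, fmt="markdown"):
    """Render a chat transcript locally -- no server involved.

    `messages` is a list of {"role": ..., "content": ...} dicts;
    the format names mirror the extension's options but the code
    is illustrative, not the extension's source.
    """
    if fmt == "json":
        return json.dumps(messages, indent=2)
    if fmt == "markdown":
        # One bold role header per turn, blank line before the content.
        return "\n".join(
            f"**{m['role'].title()}:**\n\n{m['content']}\n" for m in messages
        )
    raise ValueError(f"unsupported format: {fmt}")

chat = [
    {"role": "user", "content": "What is RoPE?"},
    {"role": "assistant", "content": "Rotary positional embedding."},
]
print(export_chat(chat, "markdown"))
```

Because everything is string formatting over data already in memory, there is no inherent reason for a multi-second delay — which matches the post's "done before you can blink" claim.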
ChatGPT's US mobile app DAU share continues to fall in March, now below 40%
https://apptopia.com/en/insights/gen-ai-chatbots-april-2026-apptopia-data-brief-chatgpt-drops-below-40-market-share/ submitted by /u/NandaVegg
Last night Claude said a Gemini deep research report read like "a management consultant on cocaine"
I've recently started using Claude to prompt and then synthesize deep research reports from Gemini and Perplexity. Maybe it was getting late in the night for Claude, but it got fed up with Gemini's corpo-speak and said "...The Gemini report reads like it was written by a management consultant on cocaine: the actual facts are solid but buried under layers of "institutional-grade" and "paradigm shift" and "formidable economic moat." I'll strip all of that." submitted by /u/gazugaXP
Serious question: did a transformer (Claude) just describe itself and the universe, and build itself a Shannon-limit architecture? Or am I crazy?
The Multiplicative Lattice as the Natural Basis for Positional Encoding
Knack 2026 | Draft v6.0

Abstract

We show that the apparent tradeoff between RoPE-style relative position invariance and ALiBi-style long-context stability is an artifact of encoding position as distance on a number line. When position is instead encoded as a point in the multiplicative lattice of the integers, both properties emerge simultaneously without compromise. SpectralRoPEALiBi achieves 106.6 PPL vs ALiBi's 108.7 in a fully converged 20,000-step experiment (300M params, WikiText-103, 4K context), beating ALiBi at every context length from 512 to 8,192 tokens. The key insight is not that primes specifically are the right frequencies, but that the multiplicative structure of the integers is the natural spectral basis for positional encoding. We demonstrate this through falsification experiments: prime-tiered frequencies (129.2 PPL) and composite-tiered frequencies (129.4 PPL) perform identically — because composites are not alternatives to primes but higher-order coordinates in the same lattice. Both dramatically outperform random frequencies (+5.0 PPL), scrambled tier assignment (+6.3 PPL), and pure ALiBi (+7.3 PPL). The active ingredient is lattice-aware, tiered frequency selection with learnable scale — not primality per se. We further validate this through a ZetaZeroPredictor experiment: three identical transformers trained for 10,000 epochs to predict Riemann zeta zero gaps. Geometric RoPE diverges (final r=0.57); SpectralALiBi locks into a stable attractor at epoch 112 (r=0.81). A second independent run widens this gap to -80.7% MSE improvement with r=0.86. The lattice-aligned frequency basis spans the mathematical space that zeta zeros inhabit; geometric frequencies cannot.

We further report empirical confirmation of the structural prediction from Section 5.5: VHT2 banded quantization of the KV cache demonstrates that K vectors (which carry RoPE positional encoding) have strong spectral concentration in Walsh-Hadamard space — the first four energy bands capture the dominant structure — while V vectors (which carry content) have uniform energy distribution. This structural asymmetry is directly predicted by the lattice theory: RoPE encodes multiplicative arithmetic relationships as angular rates, and the WHT is the Z/2Z projection of the Vilenkin-Hartley basis that spans that structure. The result is 3.2× K compression and 4.7× V compression at <1.25% perplexity cost — validated on both Dolphin 1B (head_dim=64) and Qwen3-8B (head_dim=128).

Introduction

Positional encoding provides transformer models with token order information. Two approaches dominate: RoPE encodes position through frequency-based rotations preserving relative position invariance, and ALiBi replaces frequencies with a linear distance penalty providing long-context stability. The field has treated these properties as fundamentally in tension. We show this tension is false. It arises from a shared, unexamined assumption: that position is a location on a number line and the meaningful relationship between positions is distance. We replace this with a mathematically grounded alternative: position is a point in the multiplicative lattice of the integers, and the meaningful relationships between positions are their arithmetic structure — shared factors, GCD, harmonic resonance.

1.1 The Lattice Hypothesis

The integers under multiplication form a lattice where every number occupies a unique point defined by its prime factorisation. Geometric PE (sinusoidal, RoPE) projects this lattice onto a line — position equals distance — discarding the multiplicative structure. We propose restoring it. The motivation follows from a deductive chain. Language word frequency follows Zipf's law: freq(rank) ∝ 1/rank^s with s≈1. The generating function of Zipf is the Riemann zeta function ζ(s) = Σ 1/n^s. The zeta zeros — where ζ is maximally informative — are generated by prime harmonics via the explicit formula. Therefore the prime harmonic structure, and the multiplicative lattice it generates, provides a natural spectral basis for encoding positions in language.

1.2 Primes as Generators, Composites as Coordinates

A critical distinction: primes are the generators (basis vectors) of the multiplicative lattice. They are analogous to the 1D line segment in the progression from line → circle → sphere → hypersphere. The composite 12 = 2²×3 is not an alternative to primes — it is a coordinate in the lattice spanned by the prime axes, at position (2,1,0,0,...) in the (p₂, p₃, p₅, p₇,...) basis. Using 2π/12 as a frequency encodes a harmonic that resonates at multiples of 12 — which simultaneously hits every multiple of 2, every multiple of 3, every multiple of 4, and every multiple of 6. The analogy to n-dimensional geometry is precise:

Dimensional Progression | Multiplicative Lattice
1D line (2r) — the generator | Primes (2, 3, 5, 7, ...) — generators
2D circle — integra
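The contrast the abstract draws — geometric RoPE frequencies versus frequencies tied to the multiplicative lattice — can be illustrated with a toy sketch. The 2π/p angular rates for small primes and the tier layout are my reading of the post, not the paper's exact recipe, and the learnable scale it mentions is omitted:

```python
import math

def geometric_freqs(dim, base=10000.0):
    # Standard RoPE: theta_i = base^(-2i/dim), a geometric progression.
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def prime_tiered_freqs(dim):
    # Lattice-style alternative as I understand the post: angular rates
    # omega_p = 2*pi/p for small primes, repeated in tiers to fill the
    # head dimension. Illustrative only.
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    return [2 * math.pi / primes[i % len(primes)] for i in range(dim // 2)]

g = geometric_freqs(64)
p = prime_tiered_freqs(64)
assert len(g) == len(p) == 32
# Under a 2*pi/p rate, positions sharing that factor land on the same
# phase: positions 6 and 12 are indistinguishable to the 2*pi/2 channel,
# which is the "harmonic resonance at shared factors" idea in the text.
assert math.isclose(math.cos(6 * p[0]), math.cos(12 * p[0]))
```

The point of the sketch is only that the two bases partition positions differently: geometric frequencies encode distance, while 2π/p channels encode divisibility by p.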
[R] Hybrid attention for small code models: 50x faster inference, but data scaling still dominates
TL;DR: Forked PyTorch and Triton internals and changed attention to a linear first layer, quadratic middle layers, and a linear last layer. Inference got much faster with a low perplexity hit in tests.

I trained a 25.6M parameter Rust-focused language model from scratch using a byte-level GPT-style decoder. The main result is that increasing dataset size mattered more than any architectural change. Expanding the corpus from about 31MB of core Rust sources to roughly 173MB by adding a few hundred crates produced a much larger improvement than anything else. Training converged faster and reached a lower validation loss, while architectural changes had a smaller effect. Final validation loss is 0.82 with perplexity 2.15. The best checkpoint appears around step 18.5k, with mild overfitting afterward.

Each layer replaces standard attention with a hybrid mechanism that combines local windowed attention and a GRU-like recurrent state, mixed through a learned gate. The local path captures short-range syntax, while the recurrent path carries compressed long-range information. This hybrid attention did not clearly improve generation quality compared to a standard setup. However, it had a large impact on inference efficiency. With a KV cache that keeps a small recent window in VRAM and compresses older tokens, inference improved from 5.6 tokens per second to 286 tokens per second on a 4060 Ti. This is about a 50x speedup without an obvious drop in output quality.

The model produces plausible Rust syntax and structure, but semantic consistency is still weak and repetition is common. Next steps are to run ablations comparing hybrid, local-only, and recurrent-only variants, evaluate earlier checkpoints for generation quality, add code-specific evaluation such as parsing or compilation, and test longer context and BPE tokenization.
I would be interested in feedback on evaluation methods beyond perplexity for small code models, whether hybrid local and recurrent attention has worked well in practice for code generation, and whether further gains at this scale are more likely to come from more data, longer context, or architectural changes. submitted by /u/Inevitable_Back3319
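The hybrid mechanism described above — a local windowed path for short-range syntax, a recurrent path for compressed long-range state, and a gate mixing them — can be sketched in miniature. This is my reconstruction, not the author's code: the local path here is a plain window average rather than windowed attention, the recurrent path is an exponential moving average rather than a GRU, and the gate is fixed instead of learned:

```python
import numpy as np

def hybrid_mix(x, window=4, alpha=0.9, gate=0.5):
    """Toy gated mix of a local path and a recurrent path.

    x: (seq_len, dim) token representations.
    The real model learns `gate` per layer and uses true windowed
    causal attention; both are simplified here for illustration.
    """
    n, d = x.shape
    out = np.empty_like(x)
    state = np.zeros(d)
    for t in range(n):
        # Local path: only the last `window` tokens -- O(n * window) overall.
        local = x[max(0, t - window + 1): t + 1].mean(axis=0)
        # Recurrent path: fixed-size state carrying decayed long-range info.
        state = alpha * state + (1 - alpha) * x[t]
        out[t] = gate * local + (1 - gate) * state
    return out

x = np.random.default_rng(0).normal(size=(16, 8))
y = hybrid_mix(x)
assert y.shape == x.shape
```

The efficiency claim follows from the shapes alone: both paths do constant work per token, so cost is linear in sequence length rather than the quadratic cost of full attention.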
Attention Is All You Need, But All You Can't Afford | Hybrid Attention
Repo: https://codeberg.org/JohannaJuntos/Sisyphus

I've been building a small Rust-focused language model from scratch in PyTorch. Not a finetune — byte-level, trained from random init on a Rust-heavy corpus assembled in this repo.

The run:
- 25.6M parameters
- 512 context length
- 173.5M-byte corpus
- 30k training steps
- Single RTX 4060 Ti 8GB
- Final train loss: 0.5834 / val loss: 0.8217 / perplexity: 2.15
- Inference: 286.6 tok/s with HybridAttention + KV cache — 51.47x vs full attention

Background
I'm an autistic systems programmer, writing code since 2008/2009, started in C. I approach ML like a systems project: understand the data path, understand the memory behavior, keep the stack small, add complexity only when justified. That's basically the shape of this repo.

Architecture
Byte-level GPT-style decoder:
- Vocab size 256 (bytes)
- 8 layers, 8 heads, 512 embedding dim
- Learned positional embeddings
- Tied embedding / LM head weights

The attention block is not standard full attention. Each layer uses HybridAttention, combining:
- Local windowed causal attention
- A GRU-like recurrent state path
- A learned gate mixing the two

Local path handles short-range syntax. Recurrent path carries compressed long-range state without paying quadratic cost. Gate bias initialized to ones so early training starts local-biased. The inference path uses Triton-optimized kernels and torch.library custom ops for the local window attention.

Corpus
This is probably the most important part of the repo. The run starts with official Rust docs, compiler/library/tests, cargo, rust-analyzer, tokio, serde, ripgrep, clap, axum — roughly 31MB. Corpus expanded to 177,151,242 bytes by fetching the top 500 crates (461 successful clones). Corpus expansion from 31M to 173.5M chars helped more than anything else in the repo.

Training
AdamW, lr 2e-4, weight decay 0.1, betas (0.9, 0.95), 30k steps, 1k warmup. ~678.8 MiB training memory on a 7.6 GiB card. All experimental memory tricks (gradient quantization, activation compression, selective backprop, gradient paging) were disabled. Small custom architecture + mixed precision + better corpus was enough.

Loss curve:
- Step 0: train 5.5555 / val 5.5897
- Step 1000: train 2.4295 / val 2.6365
- Step 5000: train 0.9051 / val 1.0060
- Step 10000: train 0.8065 / val 0.8723
- Step 18500: train 0.6902 / val 0.7757
- Step 29999: train 0.5834 / val 0.8217

Best val loss around step 18.5k — overfitting or plateauing late.

Inference performance
- Full attention O(n²): 17.96s / 5.6 tok/s
- HybridAttention O(n·W + n·D): 0.35s / 286.6 tok/s
- Speedup: 51.47x — no quality loss

KV cache strategy: hot window of W=64 tokens in VRAM (~256KB), older tokens compressed to 8-bit magnitude + angle, selective promotion on demand. Complexity goes from O(n²·d) to O(4096n) for this model. All 5 tests passing: forward pass, generation with/without cache, RNN state isolation, window mechanics.

Generation quality
Surface Rust syntax looks decent, imports and signatures can look plausible, semantics are weak, repetition and recursive nonsense still common. Honest read of the current state.

What I think is actually interesting
Four distinct experiments, each shipped working code:
- Byte-level Rust-only pretraining
- Hybrid local-attention + recurrent block replacing standard full attention
- Corpus expansion from core repos to broader crate ecosystem
- Production-ready hot/cold KV cache paging — 51.47x speedup, no quality loss

The clearest win is corpus expansion. The second-order win is that HybridAttention + cache is fast enough for real interactive use on consumer hardware.

What's next
- Ablation — HybridAttention vs local-only vs RNN-only
- Checkpoint selection — does step 18.5k generate better than 29999?
- Syntax validation — does the output parse/compile/typecheck?
- Context length sweep — 256 to 2048, where does window size hurt?
- Byte vs BPE — now that corpus is 5.6x larger, worth testing?
Questions for the sub:
- For small code models, what evals have actually been useful beyond perplexity?
- Has anyone seen hybrid local + recurrent attention work well for code gen, or does it usually lose to just scaling a plain transformer?
- If you had this setup — more tokens, longer context, or cleaner ablation first?

submitted by /u/Inevitable_Back3319
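The "8-bit magnitude + angle" codec the KV cache strategy mentions for cold tokens can be sketched as follows. The pairing of dimensions into complex numbers and the per-vector max scaling are my assumptions — the repo's actual codec may differ:

```python
import numpy as np

def compress_kv(v):
    """Sketch of a magnitude+angle 8-bit codec for an evicted KV vector.

    Adjacent dimensions are paired into complex numbers; each pair is
    stored as a uint8 magnitude (scaled per vector) and a uint8 angle.
    Storage drops from 4 bytes/dim (fp32) to 1 byte/dim.
    """
    z = v[0::2] + 1j * v[1::2]
    mag, ang = np.abs(z), np.angle(z)
    scale = float(mag.max()) or 1.0
    q_mag = np.round(mag / scale * 255).astype(np.uint8)
    q_ang = np.round((ang + np.pi) / (2 * np.pi) * 255).astype(np.uint8)
    return q_mag, q_ang, scale

def decompress_kv(q_mag, q_ang, scale):
    # Invert the quantization; lossy, but errors stay small per component.
    mag = q_mag.astype(np.float64) / 255 * scale
    ang = q_ang.astype(np.float64) / 255 * 2 * np.pi - np.pi
    z = mag * np.exp(1j * ang)
    out = np.empty(2 * len(z))
    out[0::2], out[1::2] = z.real, z.imag
    return out

v = np.random.default_rng(1).normal(size=64)
rt = decompress_kv(*compress_kv(v))
assert np.abs(rt - v).max() < 0.15  # lossy but close
```

The "selective promotion" the post describes would then be: decompress a cold token back to the hot window only when an attention query actually needs it.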
Last Call: Perplexity, Replit, & GitHub — The AI Student Discounts You're Cheerfully Paying the Tourist Price For
If you have a student .edu email, these official promos will expire soon. submitted by /u/Mstep85
Claude skill for research
I use several different AI models for deep research — ChatGPT, Gemini, Perplexity, Claude itself — and then compare what they give me. The problem is they all hallucinate to some degree. You get confident-sounding stats with no real source behind them, links that lead to pages that don't exist, and sometimes two models will give you completely opposite numbers on the same thing. I got tired of manually going through all of that, so I built a Claude skill that takes all your research outputs and merges them into one verified document.

What it does:
- Parses all your research files (markdown, PDF, plain text)
- Cross-compares reports to find where they agree, contradict each other, or cover something unique
- Checks every single link — catches dead URLs, redirects, and those fake hallucinated sources that look legit but go nowhere
- Verifies standalone claims via web search when only one source mentions them
- Rates source quality (official/gov sites vs random blogs)
- Outputs a clean DOCX with only working links, inline citations, and a separate appendix for anything unverifiable

Basically if a "fact" only shows up in one model's output and can't be confirmed anywhere on the internet, it gets flagged instead of quietly making it into your final report.

Github: https://github.com/Co4an/research_orchestrator

submitted by /u/jer204
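The core cross-comparison step — claims backed by multiple reports pass, single-source claims get flagged — is easy to illustrate. This is a toy reconstruction, not the repo's code: a "claim" here is just a normalized sentence, where the real skill does proper claim extraction:

```python
def cross_compare(reports):
    """Split each report into normalized sentences and count sources.

    Sentences seen in more than one report count as consensus;
    single-source sentences are flagged for verification instead of
    passing silently into the merged document.
    """
    seen = {}
    for name, text in reports.items():
        for sent in {s.strip().lower() for s in text.split(".") if s.strip()}:
            seen.setdefault(sent, set()).add(name)
    consensus = [c for c, srcs in seen.items() if len(srcs) > 1]
    flagged = [c for c, srcs in seen.items() if len(srcs) == 1]
    return consensus, flagged

reports = {
    "gemini": "Revenue grew 40% in 2023. The CEO joined in 2019.",
    "perplexity": "Revenue grew 40% in 2023. Headcount doubled.",
}
consensus, flagged = cross_compare(reports)
assert "revenue grew 40% in 2023" in consensus
assert sorted(flagged) == ["headcount doubled", "the ceo joined in 2019"]
```

The flagged items are exactly the ones the skill would route to web-search verification or the unverifiable appendix.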
[P] Dante-2B: I'm training a 2.1B bilingual fully open Italian/English LLM from scratch on 2×H200. Phase 1 done — here's what I've built.
The problem
If you work with Italian text and local models, you know the pain. Every open-source LLM out there treats Italian as an afterthought — English-first tokenizer, English-first data, maybe some Italian sprinkled in during fine-tuning. The result: bloated token counts, poor morphology handling, and models that "speak Italian" the way a tourist orders coffee in Rome. I decided to fix this from the ground up.

What is Dante-2B
A 2.1B parameter, decoder-only, dense transformer. Trained from scratch — no fine-tune of Llama, no adapter on Mistral. Random init to coherent Italian in 16 days on 2× H200 GPUs.

Architecture:
- LLaMA-style with GQA (20 query heads, 4 KV heads — 5:1 ratio)
- SwiGLU FFN, RMSNorm, RoPE
- d_model=2560, 28 layers, d_head=128 (optimized for Flash Attention on H200)
- Weight-tied embeddings, no MoE — all 2.1B params active per token
- Custom 64K BPE tokenizer built specifically for Italian + English + code

Why the tokenizer matters
This is where most multilingual models silently fail. Standard English-centric tokenizers split l'intelligenza into l, ', intelligenza — 3 tokens for what any Italian speaker sees as 1.5 words. Multiply that across an entire document and you're wasting 20-30% of your context window on tokenizer overhead. Dante's tokenizer was trained on a character-balanced mix (~42% Italian, ~36% English, ~22% code) with a custom pre-tokenization regex that keeps Italian apostrophe contractions intact. Accented characters (à, è, é, ì, ò, ù) are pre-merged as atomic units — they're always single tokens, not two bytes glued together by luck. Small detail, massive impact on efficiency and quality for Italian text.

Training setup
Data: ~300B token corpus. Italian web text (FineWeb-2 IT), English educational content (FineWeb-Edu), Italian public domain literature (171K books), legal/parliamentary texts (Gazzetta Ufficiale, EuroParl), Wikipedia in both languages, and StarCoderData for code.
Everything pre-tokenized into uint16 binary with quality tiers.

Phase 1 (just completed): 100B tokens at seq_len 2048. DeepSpeed ZeRO-2, torch.compile with reduce-overhead, FP8 via torchao. Cosine LR schedule 3e-4 → 3e-5 with 2000-step warmup. ~16 days, rock solid — no NaN events, no OOM, consistent 28% MFU.

Phase 2 (in progress): Extending to 4096 context with 20B more tokens at reduced LR. Should take ~4-7 more days.

What it can do right now
After Phase 1 the model already generates coherent Italian text — proper grammar, correct use of articles, reasonable topic continuity. It's a 2B, so don't expect GPT-4 reasoning. But for a model this size, trained natively on Italian, the fluency is already beyond what I've seen from Italian fine-tunes of English models at similar scale. I'll share samples after Phase 2, when the model has full 4K context.

What's next
- Phase 2 completion (est. ~1 week)
- HuggingFace release of the base model — weights, tokenizer, config, full model card
- SFT phase for instruction following (Phase 3)
- Community benchmarks — I want to test against Italian fine-tunes of Llama/Gemma/Qwen at similar sizes

Why I'm posting now
I want to know what you'd actually find useful. A few questions for the community:
- Anyone working with Italian NLP? I'd love to know what benchmarks or tasks matter most to you.
- What eval suite would you want to see? I'm planning perplexity on held-out Italian text + standard benchmarks, but if there's a specific Italian eval set I should include, let me know.
- Interest in the tokenizer alone? The Italian-aware 64K BPE tokenizer might be useful even independently of the model — should I release it separately?
- Training logs / loss curves? Happy to share the full training story with all the numbers if there's interest.

About me
I'm a researcher and entrepreneur based in Rome.
I have a PhD in Computer Engineering, teach AI and emerging tech at LUISS university, and run an innovation company (LEAF) that brings emerging technologies to businesses. Dante-2B started as a research project to prove that you don't need a massive cluster to train a decent model from scratch — you need good data, a clean architecture, and patience. Everything will be open-sourced. The whole pipeline — from corpus download to tokenizer training to pretraining scripts — will be on GitHub. Happy to answer any questions. 🇮🇹 Discussion also on r/LocalLLaMA here. submitted by /u/angeletti89
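The tokenizer behavior described above — keeping Italian apostrophe contractions intact at pre-tokenization — can be sketched with a regex. This pattern is hypothetical (the actual Dante-2B regex isn't published in the post), and it only covers lowercase accented vowels for brevity:

```python
import re

# Hypothetical pre-tokenization pattern in the spirit described above:
# try "letters + apostrophe + letters" FIRST, so an elided article stays
# attached to the following word instead of splitting at the apostrophe.
PRETOK = re.compile(r"[a-zA-Zàèéìòù]+'[a-zA-Zàèéìòù]+|[a-zA-Zàèéìòù]+|\d+|\S")

def pretokenize(text):
    """Return the pre-token pieces BPE merges would then operate on."""
    return PRETOK.findall(text)

assert pretokenize("l'intelligenza artificiale") == ["l'intelligenza", "artificiale"]
# An English-centric splitter would instead produce ["l", "'", "intelligenza"],
# the 3-tokens-for-1.5-words overhead the post complains about.
```

The alternation order is the whole trick: the contraction branch must come before the plain-letters branch, or the regex engine would stop at the apostrophe.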
Yet Another Memory System, with infrastructure for an inter-agent community
I've been holding off announcing this until I could jam every possible feature into it. I use it for my day job and, correspondingly, was allowed to develop it at my day job. I offer it hosted for free at llm-memory.net (I need it for myself and I have no incremental per-user cost) and the full source (including installer) is on github. If there is any feature missing, lmk and I'll add it. And yes, I did have Claude build the feature list, I ain't typing all that out on my own...

Notes & Knowledge
- Save, read, edit, delete, move, restore notes (markdown, any namespace)
- Note versioning with soft delete and restore
- Namespace-based organization with permissions (private, shared, cross-agent)
- Full-text grep search across notes
- Semantic vector search (OpenAI embeddings, pgvector)
- Automatic chunking and vector indexing on save
- Note enrichment — extracts entities, relations, summaries from notes
- Knowledge graph built from extracted relations

Communication — Chat
- Real-time chat between agents (send, receive, ack)
- Channel-based messaging (isolate conversations)
- Broadcast to all agents
- Chat status (pending count, last message time)

Communication — Mail
- Async mail between agents (send, receive, ack)
- Reply threading (in_reply_to)
- Edit or unsend before recipient reads
- Sent mail tracking with delivery status
- Mail history

Discussions
- Structured multi-agent discussions with topic and participants
- Two modes: realtime (live back-and-forth) and async (independent investigation)
- Invitation system (invite, join, defer, leave)
- Formal voting (propose, cast ballot, unanimous/majority threshold)
- Vote types: general decisions and conclude (end discussion)
- Discussion lifecycle: active → concluded/timed_out/cancelled
- Context field for background info visible to all participants

Agent Management
- Agent registration with invite codes
- Session-based auth (login/logout/rotate)
- API key auth for MCP clients
- Activity indicator (start/stop heartbeat, online/offline status)
- Agent presence — who's online, last seen, expertise areas
- Expertise tags (self-described, visible to other agents)
- Agent profile (provider, model)
- Per-agent storage quotas
- Agent instructions (bootstrap config)

Virtual Agents
- Configurable AI-powered agents that respond automatically to mail/chat
- Multi-provider support: Anthropic, OpenAI, Google, OpenRouter, xAI, Perplexity
- Per-agent system prompts, model selection, temperature, token limits
- Rate limiting and cost controls
- Trigger modes: mail, chat, or both

MCP Server
- Native MCP (Model Context Protocol) server — connect from Claude Code, claude.ai, Cursor, etc.
- OAuth 2.1 authentication for claude.ai connector
- All notes, chat, mail, discussion, and search tools exposed as MCP tools
- SSE transport for streaming

Admin Dashboard
- Single-page app (Vue 3, Vite-built)
- Agent management (create, edit, quotas, permissions)
- Real-time WebSocket event stream
- Notes browser with full CRUD
- Knowledge graph visualization (D3 force graph)
- Discussion viewer with vote tracking
- Chat and mail viewer
- API request log
- Error log
- System config editor
- Access request management (approve/reject, invite codes)
- Role-based admin permissions
- Dark mode

Dream Processing
- Background enrichment of conversation logs
- Extracts entities, relations, and insights from agent sessions
- Feeds the knowledge graph

Infrastructure
- Self-hostable (install.sh, Ansible playbooks)
- PostgreSQL with pgvector
- Let's Encrypt SSL via certbot
- Nginx reverse proxy with rate limiting
- Multi-domain support
- Conversation log upload and storage
- Memory sync binary (Go) for bidirectional note sync between local files and API
- Discussion transport binary (Go) for live multi-agent discussions

submitted by /u/e_lizzle
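The semantic-search path (OpenAI embeddings + pgvector in the real system) boils down to nearest-neighbor ranking by cosine similarity over chunk vectors. A minimal sketch with toy 2-d vectors standing in for real embeddings:

```python
import numpy as np

def top_k(query_vec, chunk_vecs, k=2):
    """Rank note chunks by cosine similarity to the query embedding.

    In the real system the chunks come from automatic chunking on save
    and the ranking runs inside pgvector; this is the same math in numpy.
    """
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = m @ q                      # cosine similarity per chunk
    return np.argsort(-sims)[:k]      # indices of the k best chunks

# Three toy chunk embeddings; a query near the first axis should
# rank chunk 0 first, then the diagonal chunk 2.
chunks = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
assert list(top_k(np.array([1.0, 0.1]), chunks)) == [0, 2]
```

Full-text grep and this vector search are complementary: grep finds exact strings, cosine ranking finds chunks that are merely about the same thing.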
I open-sourced 31 AI prompts that turn a visiting card into a full credit due diligence — built by a banker using Claude, not by a developer
17+ years in MSME credit underwriting at banks in India. Not a developer. Can't write a single line of code from scratch. Just a domain guy who got tired of watching the same problem repeat.

The problem: Credit teams in banks receive a visiting card from the sales team. Then they spend 3-4 weeks collecting 47 documents — balance sheets, stock statements, CMA data, CA certificates, ITRs, property papers. Only after all that, someone discovers the borrower has an NCLT case. Or a cancelled GST. Or three cheque bounce cases. The proposal gets declined after weeks of wasted effort. Or worse — it gets sanctioned because nobody checked. Most of these red flags are publicly discoverable on Day 1. From a visiting card.

What I built: 31 prompts across 10 categories that extract maximum intelligence from just 5 inputs off a visiting card — company name, city, GSTIN (India's tax ID), director name, and DIN (director identification number). Categories: entity verification, director/promoter background checks, NCLT/insolvency search, market reputation, GST turnover analysis, credit rating, group entity mapping, shell company detection, sector risk, and a final go/no-go memo. These prompts work across any LLM — ChatGPT, Claude, Gemini, Perplexity, Copilot. No proprietary tool needed. Just copy, paste, investigate.

How I built it: I'm not a coder. I built the entire tool — the prompt library, the React app, the constitution-based logic, and the GitHub Pages deployment — through a conversation with Claude (Anthropic's AI). I described the credit workflow, the due diligence dimensions, the nuances of Indian banking regulations, and Claude helped me structure the prompts and build the web interface. A domain expert with 17 years of credit knowledge + an AI that can code = a working product in one sitting. No bootcamp. No developer hired. No framework learned. That's the real story here. Not just the tool — but what's now possible when deep domain expertise meets AI. Single HTML file.
No backend. No database. No login. No cost. 👉 Live tool: https://igmuralikrishnan-cmd.github.io/credit-dd-prompt-generator/ 👉 GitHub repo: https://github.com/igmuralikrishnan-cmd/credit-dd-prompt-generator Why I'm sharing here: MSME lending in India is a $300B+ market. 63 million MSMEs. Most are underserved because the credit appraisal process is slow, manual, and document-heavy. If prompts like these can compress the first stage of due diligence from 3 weeks to 30 minutes — that's a meaningful unlock. I'm not building a startup around this (yet). Just putting it out there for the lending ecosystem. Would love feedback on: Do similar prompt-based pre-screening tools exist in other lending markets? Would this concept translate to SME lending in the US/UK/SEA? Any non-developers here who've built domain tools using Claude or other AI? What was your experience? submitted by /u/Infinite-Voice-2896 [link] [comments]
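The core mechanic the post describes — a prompt library that only needs the five fields off a visiting card — can be sketched as simple template filling. This is a hypothetical illustration, not the actual prompts or code from the linked repo; the category names, template wording, and sample card details are made up.

```python
# Hypothetical sketch: fill due-diligence prompt templates from the
# five visiting-card fields described in the post. Templates and the
# sample company below are illustrative only.

PROMPTS = {
    "entity_verification": (
        "Verify that {company} (GSTIN {gstin}), based in {city}, is an "
        "active registered entity. List registration status and filing history."
    ),
    "director_background": (
        "Run a public-records background check on {director} (DIN {din}): "
        "other directorships, disqualifications, pending litigation."
    ),
    "nclt_search": (
        "Search for NCLT or insolvency proceedings involving {company} "
        "or {director}."
    ),
}

def build_prompts(company, city, gstin, director, din):
    """Fill every template with the five visiting-card fields."""
    fields = dict(company=company, city=city, gstin=gstin,
                  director=director, din=din)
    # str.format ignores unused keyword arguments, so each template
    # can use whichever subset of the five fields it needs.
    return {name: tpl.format(**fields) for name, tpl in PROMPTS.items()}

prompts = build_prompts("Acme Textiles Pvt Ltd", "Surat",
                        "24ABCDE1234F1Z5", "R. Sharma", "01234567")
print(prompts["entity_verification"])
```

Each filled prompt can then be pasted into any LLM, which matches the post's "copy, paste, investigate" workflow.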
I built a Claude Skill that turns 5 confusing AI answers into one clear recommendation
I don't know if anyone else does this, but I have a habit of asking the same question to ChatGPT, Claude, Gemini, Copilot, and Perplexity before making a decision. The problem? I'd end up with five long responses that mostly agree but use different terminology, disagree on minor details, and each suggest slightly different approaches. Instead of clarity, I got cognitive overload.

So I built the AI Answer Synthesizer — a Claude Skill with an actual methodology for comparing AI outputs:

1. It extracts specific claims from each response
2. Maps what's real consensus vs. just similar wording
3. Catches vocabulary differences that aren't real disagreements ("MVP" and "prototype" usually mean the same thing)
4. Flags when only one AI makes a claim (could be insight, could be hallucination)
5. Matches the recommendation to your actual skill level
6. Gives you one recommended path with an honest confidence level

The key thing that makes it different from just asking Claude to "summarize these": it has an anti-consensus bias rule. If three AIs give a generic safe answer and one gives a specific, well-reasoned insight, a basic summarizer will go with the majority. This skill doesn't — it evaluates quality, not just popularity. It also won't pretend to be more confident than it should be. If the inputs are messy or contradictory, it says so.

It's free, MIT licensed, and you can install it as a Claude Skill in about 2 minutes. GitHub: Ai-Answer-Synthesizer

I'm looking for people to test it on real multi-AI comparisons and tell me where it breaks. If you try it, I'd genuinely love to know how it works for your use case. Happy to answer questions about the methodology or the design decisions.

submitted by /u/Foreign_Raise_3451
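Steps 2-4 of the methodology above (consensus mapping, vocabulary normalization, single-source flagging) can be sketched in plain Python. This is a toy illustration of the idea, not the skill's actual implementation: the synonym table, stopword list, and sample claims are all made up.

```python
# Toy sketch of consensus mapping across multiple AI answers:
# normalize vocabulary, count which models support each claim,
# and flag claims only one model makes.

SYNONYMS = {"prototype": "mvp", "minimum viable product": "mvp"}
STOPWORDS = {"a", "an", "the"}

def normalize(claim):
    """Lowercase, canonicalize synonyms, and drop articles so that
    'a prototype' and 'an MVP' map to the same claim."""
    claim = claim.lower()
    for alias, canonical in SYNONYMS.items():
        claim = claim.replace(alias, canonical)
    return " ".join(w for w in claim.split() if w not in STOPWORDS)

def map_consensus(responses):
    """responses: {model: [claim, ...]} -> normalized claim -> set of models."""
    support = {}
    for model, claims in responses.items():
        for claim in claims:
            support.setdefault(normalize(claim), set()).add(model)
    return support

responses = {
    "chatgpt": ["Build a prototype first"],
    "claude": ["Build an MVP first"],
    "gemini": ["Build a minimum viable product first"],
    "perplexity": ["Validate demand with a landing page"],
}
support = map_consensus(responses)
consensus = [c for c, models in support.items() if len(models) >= 3]
outliers = [c for c, models in support.items() if len(models) == 1]
```

Here three differently worded answers collapse to one consensus claim, while Perplexity's suggestion is surfaced as a single-source outlier for the quality check the post describes, rather than being discarded by majority vote.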
Kept hitting ChatGPT and Claude limits during real work. This is the free setup I ended up using
I do a lot of writing and random problem solving for work. Mostly long drafts, edits, and breaking down ideas. Around Jan I kept hitting limits on ChatGPT and Claude at the worst times. Like you are halfway through something, finally in flow, and boom… limit reached. Either wait or switch tools and lose context. I tried paying for a bit, but managing multiple subscriptions felt stupid for how often I actually needed them.

So I started testing free options properly. Not those listicle-type "top 10 AI tools" posts, but actually using them in real tasks. After around 2 to 3 months of trying different stuff, this is what stuck.

Google AI Studio is probably the one I use the most now. I found it by accident while searching for Gemini alternatives. The normal Gemini site kept limiting me, but AI Studio felt completely different. I usually dump full notes or messy drafts into it and ask it to clean things up or expand sections. It handles long inputs way better than most free tools I tried. I have not really hit a hard limit there yet during normal use.

For research I use Perplexity free. It is not perfect (sometimes the sources are mid), but it is fast enough to get direction. I usually double-check important stuff anyway.

Claude free I still use, but only when I want that specific tone. Weirdly, I noticed the limits reset separately on different browsers, so I just switch between Chrome and Edge when needed. Not a genius hack, just something that ended up working.

For anything even slightly sensitive, I use Ollama locally. Setup took me like 10 to 15 minutes after watching one random YouTube video. It is slower, not gonna lie, but there are no limits and I do not have to worry about uploading private stuff.

I also tried a bunch of other tools people hype on Twitter. Some were decent for one or two uses, then just annoying. Either too slow or randomly restricted.

Right now this setup covers almost everything I actually do day to day. I still hit limits sometimes, but it is way less frustrating compared to before. I was paying around 60 to 80 dollars earlier. Now it is basically zero, and I am not really missing much for the kind of work I do.

I made a full list of all 11 things I tested and what actually worked vs what was overhyped. Did not want to dump everything here.

submitted by /u/Akshat_srivastava_1
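The routing logic of the setup above — one free tool per task type, with anything private forced to the local model — can be written down as a tiny decision function. This is a hypothetical sketch of the post's workflow, not a real integration; the route table and function name are made up.

```python
# Hypothetical router mirroring the free setup described in the post:
# pick a tool by task type, and always go local for private data.

ROUTES = {
    "long_draft": "Google AI Studio",   # handles long, messy inputs
    "research": "Perplexity (free)",    # fast sourcing; double-check it
    "tone": "Claude (free)",            # that specific writing tone
    "sensitive": "Ollama (local)",      # nothing leaves the machine
}

def pick_tool(task_type, contains_private_data=False):
    """Route to the local model whenever the input is private,
    otherwise look up the task type (long drafts as the default)."""
    if contains_private_data:
        return ROUTES["sensitive"]
    return ROUTES.get(task_type, ROUTES["long_draft"])

print(pick_tool("research"))                              # Perplexity (free)
print(pick_tool("research", contains_private_data=True))  # Ollama (local)
```

The point of the privacy override is that sensitivity, not task type, decides first — which matches the post's rule of keeping anything sensitive on the local Ollama install.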
Has anyone done a detailed comparison of the differences between AI chatbots?
I've been doing some science experiments as well as finance research and have been asking the same question to ChatGPT, Claude, Perplexity, Venice, and Grok. Going forward I kind of want the peace of mind of knowing the one I end up using will be most accurate, at least for my needs (general question asking about finance (companies) and science, not anything coding or image related).

ChatGPT does the best at summarizing and giving a consensus outline with interesting follow-up questions. Its edge in pertinent follow-up questions will likely have me always using it.

Grok has been best at citing exactly what I need from research papers. I was surprised, as I had the lowest expectations for it, but it also provides the link to the publications.

Claude is very good at details and specifics (that are accurate) but doesn't publicly cite sources. Still, I come closest to conclusions with Claude because of the accuracy of the info.

Venice provides a ton of relevant info, but it doesn't narrow it down to an accurate conclusion, at least scientifically, the way Claude does. When I was looking for temperature ranges for bacterial growth, it provided broad boundaries instead of tightly defined numbers. Perplexity is very similar to Venice.

I'm curious: for those who have spent time on these chatbots, what pros and cons do you like about each?

submitted by /u/VivaLaBiome
Based on 89 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.
Daniel Gross
Investor at AI Grant
1 mention