We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Hugging Face is praised for its robust community involvement and contributions to open-source projects, maintaining and enhancing resources like PapersWithCode. Users appreciate its dedication to advancing AI accessibility and development. However, there are some concerns about discontinued features following acquisitions, such as the case with PapersWithCode by Meta. Pricing sentiment is generally favorable, as many of their tools and resources are freely available, and the overall reputation of Hugging Face remains positive as a leader in AI collaboration and innovation.
Mentions (30d)
15
4 this week
Reviews
0
Platforms
2
GitHub Stars
158,591
32,698 forks
Hugging Face is praised for its robust community involvement and contributions to open-source projects, maintaining and enhancing resources like PapersWithCode. Users appreciate its dedication to advancing AI accessibility and development. However, there are some concerns about discontinued features following acquisitions, such as the case with PapersWithCode by Meta. Pricing sentiment is generally favorable, as many of their tools and resources are freely available, and the overall reputation of Hugging Face remains positive as a leader in AI collaboration and innovation.
Features
Use Cases
Industry
information technology & services
Employees
730
Funding Stage
Series D
Total Funding
$395.7M
61,117
GitHub followers
402
GitHub repos
158,591
GitHub stars
20
npm packages
40
HuggingFace models
Talkie: a 13B LLM trained only on pre-1931 text used Claude Sonnet to help test the model and judge its output
Researchers Alec Radford (GPT, CLIP, Whisper), Nick Levine, and David Duvenaud just released **talkie**: a 13 billion parameter language model trained *exclusively* on text published before 1931. No internet. No Wikipedia. No World War II. Its worldview is frozen at December 31, 1930. **Why does this matter?** Every major LLM today (GPT, Claude, Gemini, Llama) ultimately shares a common ancestor: the modern web. That makes it nearly impossible to tell what these models genuinely *reason* versus what they simply *memorized*. Talkie breaks that lineage entirely. From the team: >*"It's an important question how much LM capabilities arise from memorization vs generalization. Vintage LMs enable unique generalization tests."* Interestingly, Claude has a direct role in talkie's creation: **Claude Sonnet 4.6** was used as the judge in talkie's reinforcement learning pipeline (online DPO), and Claude Opus 4.6 generated synthetic multi-turn conversations used in the final fine-tuning stage. The team even notes the irony: using a thoroughly modern LLM to help shape a model that's supposed to be frozen in 1930, and flagging it as a contamination risk they're actively working to eliminate in future versions. The most striking example: **talkie can learn to write Python code from just a few in-context examples... despite having zero modern code in its training data.** It's reasoning from 19th-century mathematics texts, not retrieval. **What it's being used to study** * **Long-range forecasting**: how well can a model "predict" the future from its frozen vantage point? * **Invention**: can it develop ideas that postdate its knowledge cutoff? * **LLM identity**: what makes a model *itself*? Talkie's alien data distribution helps isolate what's architecture vs. what's just "vibes absorbed from the web" **Links** * [Chat with talkie live](https://talkie-lm.com/chat) * [Official blog post](https://talkie-lm.com/introducing-talkie) * [Original announcement on X](https://x.com/status_effects/status/2048878495539843211?s=20) * [Discussion on r/accelerate](https://reddit.com/r/accelerate/comments/1sxmjeq/new_research_from_alec_radford_key_openai/) * [Discussion on r/singularity](https://www.reddit.com/r/singularity/s/qQnKdFHjWs) Both models are **Apache 2.0 licensed** and open-weight on Hugging Face. The team is already planning a GPT-3-scale vintage model for later this year.
View originalPricing found: $9 /month, $20 /month, $50 /month, $23, $15
Weekly AI roundup (May 23–30, 2026): Claude Opus 4.8 Fast Mode 3x cheaper, Qwen 3.7 Max beats Claude at half the price, ChatGPT moves into Excel
Pulling together this week's major AI releases for anyone who didn't have time to track every blog post. Sticking to substantive changes, not hype. Anthropic — Claude Opus 4.8 Released this week. Headline pricing unchanged, but Fast Mode dropped from $30 input / $150 output per million tokens to $10 / $50 — a 3x reduction on the premium tier. Reported improvements in "judgment" and longer autonomous runs. Also shipped 20+ legal MCP connectors and Microsoft 365 add-ins (Excel, PowerPoint, Word) in GA. Alibaba — Qwen 3.7 Max Launched May 20 at Alibaba Cloud Summit. 1M-token context. Reported to top Claude Opus 4.6 Max on Terminal-Bench 2.0, SWE-Bench Pro, and MCP-Atlas. Pricing $2.50 / $7.50 per million tokens — roughly half of Opus 4.7. Alibaba claims autonomous operation up to 35 hours without performance degradation. Alibaba is now ranked #6 lab globally on Arena text leaderboard. OpenAI — GPT-5.5 Instant Now default in ChatGPT. Reports 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts (medicine, law, finance). OpenAI also shipped a ChatGPT sidebar inside Excel and Google Sheets, plus a personal finance dashboard for Pro users (US only). Google — Gemini 3.5 Flash Reported to beat Gemini 3.1 Pro on coding and agentic benchmarks at ~4x faster output token rate. Ultra subscription cut from $250 to $200/month; new $100/month Developer tier introduced. xAI — Grok Build 0.1 Coding agent moved to public API beta May 28. Custom Skills feature added for reusable user-defined tasks. Connectors for SharePoint, OneDrive, Notion, GitHub, Linear, plus bring-your-own MCP support. Mistral Launched Vibe (unified work + code agent, replaces Le Chat). Acquired Emmi AI for physics-based simulation. Targeting €1B revenue in 2026; new 10MW inference DC announced. Hugging Face Launched an app store for the Reachy Mini robot. ~10,000 units shipped. Also reported a malicious repo masquerading as an OpenAI release that accumulated 244K downloads before takedown — relevant for anyone pinning models from HF in production. My take as someone building on top of these APIs: The 3x Opus Fast Mode price cut and Qwen 3.7 Max's pricing + autonomous duration are the real signal this week. The cost floor on premium-tier inference is dropping faster than most app-layer products have repriced for. Anyone running multi-step agent workflows needs to recompute unit economics this week — either pass through the savings or reinvest the margin. The other pattern worth noting: OpenAI and Anthropic are both pushing into Excel/M365 surfaces. Distribution is becoming the next battleground, not raw model capability. If you're building a productivity SaaS, the giants are now inside the same surface as you. submitted by /u/ksraj1001 [link] [comments]
View originalDoes anyone have a copy of the ICDAR2013 Chinese Handwriting Competition Dataset? [R]
I understand that this is a little unorthodox, but I'm desperately trying to download a copy of the ICDAR2013 Chinese Handwriting Recognition Competition Dataset. Unfortunately, the linked page in the Conference Archive: https://nlpr.ia.ac.cn/databases/handwriting/Download.html appears to be down, and has been down for the past few weeks consistently. I've checked every source I can find, like Kaggle, HuggingFace, remnant Google Drive and Baidu Netdisk links, even checking if someone's accidentally committed it to github, but no dice. I've tried every google dorking trick I know to no avail. Which brings me here. Please, if anyone has a copy of the Competition Dataset, I would be very grateful if you could share the ZIP with me. Thanks in advance! submitted by /u/Aathishs04 [link] [comments]
View originalWall-OSS-0.5: 4B VLA with open training code and zero-shot real-robot evaluation[D]
Wall-OSS-0.5 is a new 4B VLA release from X Square Robot, built on a 3B VLM backbone with action experts in a Mixture-of-Transformers layout. What caught my eye is that the report evaluates the pretrained checkpoint on real robots before task-specific fine tuning, instead of only reporting downstream fine-tuned performance. The reported numbers are: zero shot on a 17-task real-robot suite, 4 tasks above 80 task progress, including a held-out deformable task (Rope Tightening, 82). After fine tuning on a 15-task suite, they report 60.5 average task progress, +17.5pp over pi0.5, and +26pp on the 10-task manipulation subset. They also report +21.8pp on embodied grounding while general VL ability stays stable. The method bits I am trying to sanity check are the gradient bridge and the optimizer claim. They argue that discrete action-token CE is the dominant gradient into the VLM backbone, while flow matching's contribution to backbone updates collapses to roughly 5 percent within a few thousand steps. The Vision-Aligned RVQ tokenizer is supposed to make those action tokens semantically grounded instead of just numerical compression. For continuous actions, they still use flow matching, but supervise in recovered action space rather than velocity space. They also include DMuon, a distributed Muon optimizer, with a pretty aggressive overhead reduction claim. Code: https://github.com/X-Square-Robot/wall-x. Hugging Face org: https://huggingface.co/x-square-robot. Project page: https://x2robot.com/oss#resources. Paper: https://x2robot.com/api/files/file/wall_oss_05.pdf The questions I had after reading it: if you have run an analogous gradient-bridge ablation in another VLA, did action-token CE dominate in the same way? For people already using Muon, does the DMuon overhead claim sound plausible? And has anyone seen RVQ-with-vision-alignment clearly beat FAST-style tokenization outside this paper? If anyone is already trying to reproduce this on real hardware, drop notes. The third-party results will matter more than the release numbers. submitted by /u/Tall-Peak2618 [link] [comments]
View originalnoisekit - CLI for generating realistic degraded speech datasets for ASR benchmarking [P]
If you've ever tried to pick an STT vendor for a phone-based voice agent or call center product, you've probably hit this wall: you have plenty of real production audio, but it's unlabeled, so you can't compute WER on it. And the annotated public datasets (FLEURS, CommonVoice, LibriSpeech) are clean studio recordings that have nothing to do with how STT models actually handle your G.711 encoded noisy phone calls. Annotating production audio is slow, expensive, and usually a privacy headache. So most teams end up benchmarking on clean data, picking a vendor, then discovering in prod which one actually survives noise. noisekit fills that gap. Take a clean annotated dataset, apply degradations that approximate your production conditions, end up with a noisy annotated corpus you can run WER on across every STT candidate. uvx noisekit generate \ --dataset google/fleurs --config en_us --split test \ --samples 100 \ --output ./noisy-fleurs Feed ./noisy-fleurs through each STT candidate, normalize, and compute WER with the existing transcripts. The output is HuggingFace AudioFolder-compatible, so load_dataset("audiofolder", data_dir="./noisy-fleurs") works. Presets cover the conditions that actually matter for voice products: telecom: G.711 narrowband bandpass + 8-bit BitCrush + 16-32 kbps MP3 (sounds like a real phone call, not a synthetic low-pass filter) noise: real ambient mixed at 5-15 dB SNR (auto-downloads a MUSAN noise-only subset, or bring your own --noise-dir matching your domain: call center, cafe, car, street) reverb: pyroomacoustics far-field at 1-3 m mic distance low_bitrate: wideband MP3 at 16-32 kbps clipping: ADC / mic saturation clean_reference: control / WER floor compound chains stack realistically. noise_telecom = noisy room then phone codec, which is what an actual support call sounds like. Each output gets PESQ, SNR and NISQA scores in metadata.jsonl alongside the original transcript, so you can correlate WER with measured signal quality after the fact. Repo: https://github.com/karamouche/noisekit (MIT, uvx-runnable so zero install) Genuinely curious to hear from people who've benchmarked STT in production: what degradation conditions am I missing? submitted by /u/Karamouche [link] [comments]
View originalGPT-5.5 tops the benchmarks but sits at #22 for actual usage - I built a live index that tracks both (open source)
I built AgentTape to rank models on more than just benchmarks - it blends benchmark performance with who's actually using and talking about a model, plus cost and speed. It scores every public model from public signals (GitHub, Hugging Face, OpenRouter, MCP registries, npm, PyPI, arXiv, Hacker News) refreshed hourly, plus the main benchmark leaderboards daily. Right now OpenAI sits at the top: GPT-5 is #1, with 5.2, 5.1 and 5.4 Mini rounding out the top 5, and 5.2-Codex and 5.4 just behind - 6 of the top 7. The only thing breaking the run is xAI's Grok 4.20, level on score at #2. GPT-5.5 is the clearest example - it sits at #22 overall, and the breakdown shows why: * Quality: 96.4 - 2nd highest on the whole board, only pipped by Gemini 3.1 Pro Preview (97.2). On benchmarks alone it'd be near the top. * Adoption: 15 and Efficiency: 36 - both low. New release, steep price, so hardly anyone's using it day-to-day yet. * Biggest 24h climber on the board (+6) - so that's starting to shift. A benchmark-only board would put GPT-5.5 near #1 (second only to Gemini 3.1 Pro). That gap between topping the benchmarks and actually getting used is the whole reason I built this. Early days and I'm still tuning the methodology, so I'd love your thoughts - does weighting adoption alongside benchmarks match how you'd rank the GPT line-up, or would you trust the raw benchmark order?
View originalPapersWithCode new features - week 1 [P]
Hi, Niels here from the open-source team at Hugging Face. It's been one week since I launched paperswithcode.co, a revival of the website we all loved. It allows us to keep track of the state-of-the-art (SOTA) across various domains of AI, from agents to computer vision and time-series forecasting. The reception has been great, and I'm excited to extend this over the next few months. This week, I've added the following features: - Support for multiple metrics for a given benchmark: leaderboards now support multiple metrics, see e.g., the Open ASR Leaderboard for automatic speech recognition, which supports both Word Error Rate (WER) and the Inverse Real-Time Factor (RTFx) metrics, or the Object Detection leaderboard, which now also reports frames-per-second (FPS) besides mean average precision (mAP) on COCO. https://preview.redd.it/owlxn0b5u23h1.png?width=2878&format=png&auto=webp&s=1dff2f8feab4f160f77c97ceeb5d90e82382e63c - Support for external papers: We do support submitting papers beyond Arxiv, such as a Github repo, a blog post, BiorXiv, and more. You can submit a paper at paperswithcode.co/submit. AI will automatically enrich it with task and method tags, the GitHub repo, evals, and more. See e.g. DeepSeek-v4 below, which is not on Arxiv: https://preview.redd.it/uogbt0fjw23h1.png?width=2928&format=png&auto=webp&s=8b81e48af69b8935ddeb569d882d866b3e9ba216 - Support for paper lineage: whenever a paper has a follow-up or predecessor, this will be displayed with a small banner above the abstract. See e.g. Mamba-3, DINOv2 and GLM-4.5. https://preview.redd.it/f6vgtd1du23h1.png?width=2228&format=png&auto=webp&s=f8627f7669405f1766eecfd3322e925e15b4806d - New methods: support for new methods based on popularity, including Gated DeltaNet, Kimi Delta Attention, Mamba-2, and more. Each method also lists all papers that cite it. Find all supported methods here. https://preview.redd.it/6pzagifvu23h1.png?width=2984&format=png&auto=webp&s=400efdc9677d1fbd369eedf684e622dd8c807973 - Support for screenshotting a leaderboard for easy sharing on social media: each benchmark now includes a "copy image" button both on the scatter plot and table, which can be shared on social media. Try it on ClawEval, for example. https://preview.redd.it/w7y7t7xnw23h1.png?width=2950&format=png&auto=webp&s=cb70ad91c6ba075e49b743d6e34f157d22266f04 - Added many more evals: we are adding evals gradually, starting with all models supported in the Transformers library. So far, we have about 3k evals! Find them at the bottom of each paper page, e.g. Qwen 3.6. https://preview.redd.it/zao056s9x23h1.png?width=2218&format=png&auto=webp&s=540d87f473be05cb6f9c0aca88afa74fd4373e15 Happy to hear more feature requests and feedback! I will also launch a channel on the Hugging Face Discord server for easier communication. You can also chime in on the GitHub thread here. Cheers, Niels submitted by /u/NielsRogge [link] [comments]
View originalAnonymous Data Upload for Submission [D]
How do you upload data anonymously for a submission (ACL/EMNLP)? I have several models I need to upload for replication and was thinking HuggingFace, but HF offers download tracking on a paid plan. Does this violate the policy since there is the potential of tracking the download even if you do not use the service? Most grateful in advance. submitted by /u/Budget_Mission8145 [link] [comments]
View originalI built a live ranking of every AI agent and foundation model (open source)
I built AgentTape because none of the existing model leaderboards quite cover all the things that I was interested in: benchmark performance is one part, but so is who's actually using a model, who's talking about it, and how it compared on cost and speed. It pulls hourly data from GitHub, Hugging Face, OpenRouter, MCP registries, npm, PyPI, arXiv, Hacker News, and more - to score and compare each public AI agent and foundation model. I'm still tweaking the scoring methodology (it's early days), so I'd love to hear your thoughts, if it's helpful, or anything you think I've got wrong! submitted by /u/Celestialien [link] [comments]
View originalReleased a free 9.8M doc Indic multilingual corpus — Hindi, Bengali, Tamil, Telugu + 7 more (CC0, HuggingFace) [P]
Built this over the past few weeks as part of a multilingual research project. Figured I'd share it here. Check it out! ~9.8M web documents across 11 languages — hi, bn, ta, te, mr, gu, kn, ml, pa, ur, en. ~8.4B tokens. CC0 license. 🤗 https://huggingface.co/datasets/AM0908/indic-hplt-v1 submitted by /u/ashtok897 [link] [comments]
View originalReviving PapersWithCode (by Hugging Face) [P]
Hi, Niels here from the open-source team at Hugging Face. Like many others, I was a huge fan of paperswithcode. Sadly, that website is no longer maintained after its acquisition by Meta. Hence, I've been working on reviving it. I obviously use AI agents to parse papers at scale and automatically generate leaderboards (for now I'm the one verifying results). So far, I've only parsed high-impact papers for which I know they're SOTA, like Qwen 3.5 and 3.6, RF-DETR for object detection, DINOv3, SOTA embedding models from the MTEB leaderboard, the Open ASR Leaderboard for automatic speech recognition models, etc. For now, it includes the following: trending papers by default based on Github star velocity categorization by domain, e.g., OCR methods, which PwC used to have, e.g., RLVR eval results for high-impact papers, see e.g., Qwen 3.5 at the bottom leaderboards for each domain, e.g., MMTEB or COCO val 2017 support for citation counts (you can also see the most cited papers by domain!) automated linked Github, project page URLs, and artifacts (+ multiple repos are supported on a paper page) support for external papers beyond Arxiv, see e.g., DeepSeek v4 Harness reports for coding agent benchmarks, e.g., Terminal Bench "Sign in with HF" and Storage Buckets are used to store humbnails, paper PDFs, and overall data backups. I'm curious about your feedback + feature requests! Try it at paperswithcode.co https://preview.redd.it/whwji560fw1h1.png?width=3452&format=png&auto=webp&s=55bb7a30c1be58d140f7efcb07a31c6dac5693c7 See e.g. the SOTA leaderboard for Terminal Bench 2.0: https://preview.redd.it/98w9pi89fw1h1.png?width=3456&format=png&auto=webp&s=408fb64b0ba85ba24f55daa81d547d7c68e73951 A paper page looks like this: https://paperswithcode.co/paper/2602.15763 https://preview.redd.it/fiizit6dfw1h1.png?width=3450&format=png&auto=webp&s=9ea05a77ca5583a2fb395dccc95ba52c433362c5 submitted by /u/NielsRogge [link] [comments]
View originalLLM-Rosetta — format conversion library across LLM API standards, doubles as a proxy
This started because we had a proprietary internal LLM API that spoke none of the standard formats. Built an internal conversion layer to bridge it, maintained that for over a year. As colleagues started adopting more and more coding tools — Claude Code, opencode, Codex, VS Code plugins, Goose, and whatever came out that week — each with its own API format expectations, maintaining separate adapters for each became the actual problem. That's what pushed the internal conversion layer into a proper generalized design, and llm-rosetta is the result. It's a Python library that converts between LLM API formats — OpenAI Chat, Responses/Open Responses, Anthropic, and Google GenAI. The idea is you convert through a shared IR so you don't end up writing N² adapters. The key difference from LiteLLM: LiteLLM is a unified calling layer that takes OpenAI-style input and transforms it into provider-native requests — one direction. llm-rosetta uses a hub-and-spoke IR, so each provider only needs one converter, and you get any-to-any conversion for free. Anthropic → Google, OpenAI Chat → Anthropic, whatever direction you need. Use it as a library — pip install and call convert() directly, no server needed. Or run the gateway if you want a proxy that handles the format translation for you. Zero required runtime dependencies either way. The HTTP server, client, and persistence layer are vendored from zerodep (https://github.com/Oaklight/zerodep), another project of mine — stdlib-only single-file modules, not someone else's library repackaged. The gateway ships with a Docker image if you'd rather not deal with Python env setup. You can also deploy it on HuggingFace Spaces or anything similar — admin panel, dashboard, request log, config management all included. Screenshots: https://llm-rosetta.readthedocs.io/en/latest/gateway/admin-panel/ We've been running it in production for about 5 months as the conversion layer for an internal multi-model access platform — needed to support various API standards and coding tool integrations before the upstream APIs were fully standardized. The Responses converter passes all 6 official Open Responses compliance tests (schema + semantic) from the spec repo. So if you're running Ollama, vLLM, or LM Studio with Responses endpoints, it should just work as one side of the conversion. There's a shim layer for provider-specific quirks — built-in shims for OpenRouter, DeepSeek, Qwen, xAI, Volcengine, etc. Converters stay generic per API standard, shims handle the edge cases declaratively. 24 cross-provider examples in the repo covering all provider pairs, SDK + REST, streaming, tool calls, image inputs, multi-turn with provider switching mid-conversation. GitHub: https://github.com/Oaklight/llm-rosetta Docs: https://llm-rosetta.readthedocs.io arXiv: https://arxiv.org/abs/2604.09360 Gateway screenshot: https://preview.redd.it/qzzjr2dcdw1h1.png?width=949&format=png&auto=webp&s=bce4293aae81059f794909fc37f85071cee34378 submitted by /u/Oaklight_dp [link] [comments]
View originalI made a Claude skill that stops it from cloning whole repos when I just want one function
Kept hitting the same friction with Claude Code. I'd point at a GitHub repo and say "look at how this handles agent handoffs" — meaning, borrow the idea. Claude would git clone the whole repo, read 50 files, and ask which __init__.py was interesting. Or worse — it'd add the library to my package.json as a dependency. For one function. Suddenly I own the transitive deps, the CVE notifications, and a version pin I'll never upgrade. The actual problem: "use this library", "borrow an idea from this library", and "just steal that one function" deserve totally different workflows, and nothing was telling Claude which one I meant. So I wrote a skill — a single SKILL.md (surgical-github-extraction) that auto-triggers when I drop a GitHub URL as inspiration. The rule: Read the README first to get the shape. Pull 1–3 source files via raw URLs to see how the pattern is wired — prompts, schemas, the orchestration file. Never the whole repo. Pin to a commit SHA, save to /tmp (or %TEMP% on Windows). Lift the smallest useful unit — a function, a prompt, or just the pattern. Rewrite in your style. Cite the source SHA. Two concrete cases this week: Pointed it at TradingAgents (a multi-agent trading repo) asking "can we use this pattern for a job-applier?" → README plus a few agent/prompt files, proposed an analogue (JobFitAnalyst + Critic arguing against). Nothing copied into my project. Asked it to "steal the exp backoff from litl/backoff" → fetched one file (_wait_gen.py), extracted the 8-line generator, rewrote inline in my style with a provenance comment. No pip install. Sibling skill: code-graft — for when a one-off snippet isn't enough but a runtime dep is too much. Vendor only the slice of a library you use into your project, trim the rest, re-sync selectively from upstream. Think "I want one tokenizer out of HuggingFace transformers without the 2GB." Why a Skill and not an MCP: Pure discipline on tools Claude already has (WebFetch, curl, gh, Read). MCPs ship new tools; Skills ship instructions. Same shape as Anthropic's own mcp-builder — that's a Skill, not an MCP. MIT-licensed, single file install: `mkdir -p ~/.claude/skills/surgical-github-extraction` curl -fsSL https://raw.githubusercontent.com/jeet-dhandha/jd-skills/main/skills/surgical-github-extraction/SKILL.md \ -o ~/.claude/skills/surgical-github-extraction/SKILL.md Both skills (jd-skills collection): https://github.com/jeet-dhandha/jd-skills Curious if anyone has hit this and solved it differently — especially failure cases where the skill picks the wrong path (concept vs. snippet vs. full vendor). Issues welcome. submitted by /u/hone_coding_skills [link] [comments]
View originalHugging Face co-founder says Qwen 3.6 27B running on airplane mode is close to latest Opus in Claude Code
I've been using AI Desktop 98 heavily to run local llms like qwen on my iPhone. submitted by /u/ImaginaryRea1ity [link] [comments]
View originalCompiled every national AI strategy in Asia — Vietnam has the most comprehensive standalone law, Japan has no penalties, Korea just eliminated Naver from sovereign LLM competition for using Qwen weights
Compiled a tracker of every national AI strategy in Asia. Headline is that ten major Asian economies now have dedicated AI legislation or comprehensive national strategies, and they're all quite distinct from Western legislation like the EU AI Act or US executive orders. Clear that Asian governments treat AI as infrastructure, not a sector to regulate from a distance. Most national approaches lean promotional (incentives, sandboxes, sovereign LLM funding) rather than punitive (bans, heavy compliance). The exceptions are Vietnam (first standalone AI law in Asia, Dec 2025) and South Korea (Framework AI Act with high-risk-system rules). The major markets that stood out to me: China's open-source-as-industrial-policy framework. ~$98B committed to AI development. Premier Li Qiang declared at WEF 2025 that China's innovation is "open and open-source" and the country is "willing to share indigenous technologies with the world." Derivatives of Alibaba's Qwen are now the largest open-weight model ecosystem on Hugging Face — over 100,000 derivatives (USCC 2026). This is industrial policy through model release, not regulation. Two-tier system: research labs (DeepSeek-style) operate with light governance, consumer-facing apps face stricter rules. Japan's AI Promotion Act (May 2025). No penalties. It's a promotional framework — establishes the AI Strategic Headquarters as a cabinet-level body, mandates a National AI Basic Plan, aligns deployment with "Human-Centred AI Society Principles." Japan's structural problem: only 9% of individuals and 47% of companies were using gen AI as of 2024. The legislation is trying to close adoption gaps via incentives rather than gate behaviour. December 2025 commitment of ¥1 trillion (~$7B) over five years to AI + semiconductors backs it up. Vietnam's AI Law (effective March 2026). Most comprehensive standalone AI law anywhere — 36 articles, three-tier risk classification (low/medium/high), foreign AI providers must appoint a legal representative in Vietnam, max admin fines reach VNĐ 2 billion (~$76K) for orgs with serious violations capped at 2% of preceding year revenue. Plus a National AI Development Fund offering grants/loans/preferential financing, plus regulatory sandboxes for startups. Combined with the Law on Digital Technology Industry covering semiconductors and digital assets, Vietnam now has the most legible AI legal architecture in SEA. What I'm not sure about: how sustainable the "promotional, not punitive" approach is when the next major AI safety incident happens. Japan's framework explicitly has no penalties, and I think that only holds up until something goes wrong. Vietnam's law has teeth but limited enforcement bandwidth. Korea's is the only framework that has both tools and resources to enforce. For people closer to AI policy work — does the Asia approach seem more or less likely to scale globally than EU-style ex-ante rule-making? My read: Asia's bet on incentives + sandboxes + sovereign capability is more aligned with how AI is actually deploying in 2026 than EU rules-based approaches, but the governance gap shows up in the next 24 months. Fuller tracker with country-by-country breakdown: https://digitalinasia.com/2026/04/08/asia-ai-policy-tracker/ submitted by /u/tomsimps0n [link] [comments]
View originalLLM proxy that lets Claude Code talk to any model
I built rosetta-llm — an open-source multi-format LLM proxy that acts as a drop-in Claude Code gateway. Works as a Claude Code LLM gateway — set `ANTHROPIC_BASE_URL` and all configured models appear in `/model` picker Translates between formats — Anthropic Messages ↔ OpenAI Chat ↔ OpenAI Responses at the wire level Thinking blocks round-trip correctly — this is the hard part and why I built this Provider routing — `openai/gpt-5.4`, `anthropic/claude-opus-4-7`, `groq/llama-4` all through one endpoint Streaming on everything — passthrough fast path + cross-format translation with proper SSE handling The thinking-block problem Most proxies lose reasoning continuity. LiteLLM has had open PRs for thinking block handling for a long time — some dating back months — and they're still not merged. Without proper round-tripping, prompt caching breaks across turns and Claude Code loses context. Rosetta encodes encrypted reasoning into Anthropic's `signature` field and decodes it back — so multi-turn agentic workflows keep their prompt-cache hits. Zero-setup Hugging Face Space Literally a two-line Dockerfile: FROM ghcr.io/lokesh-chimakurthi/rosetta-llm:latest COPY --chown=app:app config.json /app/config.json Add config.json file and above Dockerfile into a HF Space (Docker SDK) and it's running. No clone, no build, no venv. The GHCR image has everything baked in. Make your HF space private and add api keys in hf space secrets. Check readme in github Also works with # No install — ephemeral uvx rosetta-llm # Persistent install uv tool install rosetta-llm rosetta-llm --config ~/.rosetta-llm/config.json # Docker docker run -p 7860:7860 \ -v ~/.rosetta-llm/config.json:/app/config.json \ ghcr.io/lokesh-chimakurthi/rosetta-llm:main Why another proxy? I looked at existing solutions: LiteLLM — thinking block round-trip PRs going nowhere, too many abstractions OpenRouter — great but closed-source, no self-hosting Direct passthrough proxies — don't translate between formats Nothing gave me lossless cross-format translation with proper reasoning fidelity. Links GitHub: https://github.com/Lokesh-Chimakurthi/rosetta-llm PyPI: https://pypi.org/project/rosetta-llm/ Contributions welcome I built this for myself and it works for my use cases. But there's a lot more it could do — better multimodal handling, embeddings support, rate limiting, an admin UI. If any of this sounds interesting, PRs are absolutely welcome. Happy to answer questions in the comments. submitted by /u/DataNebula [link] [comments]
View originalRepository Audit Available
Deep analysis of huggingface/transformers — architecture, costs, security, dependencies & more
Yes, Hugging Face offers a free tier. Pricing found: $9 /month, $20 /month, $50 /month, $23, $15
Key features include: Features/CrossoverSUV, bytedance-research/Lance, openbmb/MiniCPM5-1B, meituan-longcat/LongCat-Video-Avatar-1.5, NemoStation/Marlin-2B, HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive, LongCat-Video-Avatar 1.5, Wan2.2 14B Fast Preview.
Hugging Face is commonly used for: Team Enterprise.
Hugging Face integrates with: TensorFlow, PyTorch, Keras, ONNX, FastAPI, Streamlit, Gradio, Django, Flask, Apache Airflow.
Hugging Face has a public GitHub repository with 158,591 stars.
Lewis Tunstall
ML Engineer at Hugging Face
9 mentions
Based on 52 social mentions analyzed, 15% of sentiment is positive, 83% neutral, and 2% negative.