Dario Declared War on Open Source. The Real War Is Over Your AI Bill.
Dario Amodei did not just criticise open-source AI. Anthropic moved to have a slice of it banned. The company accused Chinese labs of stealing its capabilities and asked Washington to step in. On the 1 July episode of the 20VC x SaaStr show, Harry Stebbings, Jason Lemkin and Rory O'Driscoll pulled the argument apart, and the interesting part is not the geopolitics. It is what the fight reveals about who controls the cost of LLM intelligence, and how quickly that control is slipping.
This is our read as engineers who move client workloads across these models for a living, not a take on US-China policy. The thesis is simple: the open-source war is a proxy war over pricing power. Follow the money and it gets obvious.
What did Anthropic actually accuse China of?
The charge is distillation: training one model on the outputs of another, stronger one. In February 2026 Anthropic named DeepSeek, Moonshot AI and MiniMax, alleging they used Claude to extract capabilities in a way that broke its terms of service. On 24 June it escalated, accusing Alibaba of what it called the largest known distillation attack against it to date: roughly 28.8 million exchanges run through thousands of fraudulent accounts.
Read the wording carefully. Anthropic concedes that distillation itself is a widely used and legitimate training method. The complaint is about how it was done: the terms-of-service breach and the fraudulent accounts, not the technique. That distinction matters, because the fix Anthropic is pushing is much broader than the alleged crime. It has urged Congress to crack down on distillation by Chinese rivals and framed the whole thing as national security: cheap Chinese models, the argument goes, narrow the US lead and could accelerate an adversary's military and cyber AI.
If you want the mechanics of why distillation and fine-tuning make a strong model so easy to clone, that is the technical heart of the dispute. The business point is that the technique is now cheap enough that no single lab can keep a durable quality lead.
The protectionism read: is this security, or a moat?
Rory O'Driscoll's counter on the show was the sharpest. There is real hypocrisy in a US frontier lab, whose own models were trained on scraped web and third-party IP, invoking theft when a competitor learns from its outputs. And the proposed remedy looks less like security policy and more like a moat dug with regulation.
His analogy landed hardest: blocking low-cost open-weight models to protect a frontier lab's economics is like banning IBM PC clones in the 1980s to prop up IBM's stock price. Cloning is what dragged the price of computing down and put a machine on every desk. Calling that theft did not stop it then, and it will not stop it now. In his words, it is "fricking dumb."
The most plausible reading of Anthropic's move, and the one the panel kept circling, is a regulatory trade rather than a principled stand: accept some restrictions on the domestic side in exchange for a federal ban on distilled Chinese models. That is analysis, not confirmed fact. But it fits the incentives. When your product's per-unit price is falling fast, the cheapest way to defend margin is not to build a better product. It is to make the cheaper substitute illegal.

"When a product's price is collapsing, the cheapest way to defend margin is to make the cheaper substitute illegal. That is a policy fight, not a product one, and buyers should treat it as such."
Coinbase already ran the experiment
While Anthropic argues about the rules, a public company just showed what happens when you ignore the drama and optimise. In late June, Coinbase CEO Brian Armstrong said the firm cut its internal AI spend by roughly 50 percent while token usage kept climbing, and did so without capping any engineer's usage. The levers were exactly the ones we work through with clients:
- Better defaults, not usage caps. Coinbase defaulted engineers to open-weight models, specifically the GLM and Kimi classes, while leaving them free to pick a frontier model when the task needed one. Most people never hit their old caps anyway, so lowering the default price beat policing usage.
- Task-based routing. Prompts get preprocessed and routed to the cheapest model that can do the job, factoring in price and cache hits.
- Caching. The single biggest lever. Coinbase pushed its cache hit rate from around 5 percent to about 60 percent.
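The three levers above can be sketched in a few lines. This is a minimal illustration, not Coinbase's implementation: the model names, prices, difficulty tiers and the `call_model` callable are all hypothetical, and the cache is a plain exact-match response cache, which is a simplification of provider-side prompt caching that works on shared prefixes.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str               # hypothetical label, not a real model ID
    price_per_mtok: float   # assumed USD per million tokens, illustrative only
    max_difficulty: int     # hardest task tier this model is trusted with


# Illustrative catalogue: prices and tiers are invented for the sketch.
MODELS = [
    Model("open-weight-default", 0.60, 2),
    Model("frontier", 15.00, 3),
]


class CachedRouter:
    """Serve repeats from a cache, else call the cheapest capable model."""

    def __init__(self, models):
        # Sort once so the first capable model found is also the cheapest.
        self.models = sorted(models, key=lambda m: m.price_per_mtok)
        self.cache = {}
        self.hits = 0
        self.requests = 0

    def pick_model(self, difficulty):
        for model in self.models:
            if model.max_difficulty >= difficulty:
                return model
        raise ValueError(f"no model can handle difficulty {difficulty}")

    def complete(self, prompt, difficulty, call_model):
        self.requests += 1
        key = hashlib.sha256(prompt.strip().encode("utf-8")).hexdigest()
        if key in self.cache:
            self.hits += 1          # repeated prompt: no model call at all
            return self.cache[key]
        model = self.pick_model(difficulty)
        answer = call_model(model, prompt)  # call_model is supplied by the caller
        self.cache[key] = answer
        return answer

    @property
    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0


def fake_call(model, prompt):
    # Stand-in for a real inference call, used only to exercise the router.
    return f"[{model.name}] {prompt}"


router = CachedRouter(MODELS)
router.complete("rename this variable", 1, fake_call)
router.complete("rename this variable", 1, fake_call)      # served from cache
router.complete("design a sharding scheme", 3, fake_call)  # needs the frontier tier
print(f"cache hit rate: {router.hit_rate:.0%}")
```

In practice the difficulty tier would come from a preprocessing step or classifier rather than being passed in by hand; that part is left out here because the source does not describe how Coinbase scores prompts.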
This is the same playbook we lay out in how to cut LLM token costs in 2026: cache what repeats, route the easy majority to a cheap model, and reserve the frontier model for the hard tail. The router layer that makes this safe is compared in our LLM gateway and router breakdown, and the open-weight models Coinbase leaned on are exactly the ones we benchmark in the open-weight LLM showdown.
The panel split on what it meant. Lemkin dismissed Armstrong's post as "performative social media", his point being that trimming cost matters little if top-line revenue is flat. Rory read it as cost management 101 that every cash-conscious exec will copy to curb runaway model fees. Both are right, and that tension is the whole story of enterprise AI right now.
The ROI reckoning nobody wants to name
Here is the uncomfortable pattern behind the Coinbase debate. Companies are pouring millions into AI tokens and posting the same growth rates they posted before. Adding spend has not moved the top line. Boards have noticed, and the pushback against reckless "token maxing" has started. CFOs want a clear line from AI spend to either faster delivery or hard bottom-line savings, and most cannot draw it yet.
That is not an argument against AI. It is an argument for measuring it. A cheaper per-token price means nothing if your product makes hundreds of calls per task, which is why we keep hammering the difference between cost per token and cost per task. And the projects that quietly fail rarely fail on model quality. They fail on scope, evals and integration, the failure modes we walk through in why AI agent projects get cancelled. If you cannot measure the lift, cutting the bill in half like Coinbase is the rational first move, because at least the saving is real.
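The gap between cost per token and cost per task is plain arithmetic, sketched below. Every number here is an invented assumption for illustration (call counts, token counts, prices, and the `cached_price_ratio` discount for cached tokens); none of them come from Coinbase or any provider's price list.

```python
def cost_per_task(calls, tokens_per_call, price_per_mtok,
                  cache_hit_rate=0.0, cached_price_ratio=1.0):
    """Return the USD cost of one completed task.

    cached_price_ratio is the assumed price of a cached token relative to an
    uncached one (1.0 means caching gives no discount).
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    total_tokens = calls * tokens_per_call
    # Blend the cached and uncached price by the share of tokens served from cache.
    effective_price = price_per_mtok * (
        (1.0 - cache_hit_rate) + cache_hit_rate * cached_price_ratio
    )
    return total_tokens * effective_price / 1_000_000


# A cheap model inside a chatty agent loop versus a pricey model used sparingly.
chatty_cheap = cost_per_task(calls=200, tokens_per_call=4000, price_per_mtok=1.0)
sparing_frontier = cost_per_task(calls=5, tokens_per_call=4000, price_per_mtok=15.0)

# The same chatty workload at the two cache hit rates mentioned in the article,
# assuming cached tokens cost a tenth of uncached ones.
low_cache = cost_per_task(200, 4000, 1.0, cache_hit_rate=0.05, cached_price_ratio=0.1)
high_cache = cost_per_task(200, 4000, 1.0, cache_hit_rate=0.60, cached_price_ratio=0.1)

for label, value in [
    ("chatty cheap model", chatty_cheap),
    ("sparing frontier model", sparing_frontier),
    ("chatty, 5% cache hits", low_cache),
    ("chatty, 60% cache hits", high_cache),
]:
    print(f"{label}: ${value:.3f} per task")
```

Under these assumed inputs the model that is 15 times cheaper per token still costs more per task, because it makes 40 times as many calls. That is the reason to track the per-task figure rather than the rate card.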
Where the money is actually going
Two other data points from the week make the same argument from the capital side: investors are now rewarding discipline and cash generation, not token burn.
- Kalshi is chasing a $40B valuation. The prediction-market exchange is raising at roughly double its previous round, on the back of more than $2B in annualised revenue and around $178B in annualised trading volume. Worth correcting one bit of hype though: an IPO is not imminent. CEO Tarek Mansour has ruled out a listing before 2027, with late 2027 or 2028 the realistic window.
- Bending Spoons is the smartest IPO of the year. The Milan operator went public on Nasdaq on 1 July, priced at $29, and closed its first day up nearly 40 percent at about $40.50, for a market cap near $25.7B, more than double its last private mark. It did not get there on organic user growth. It buys sticky but underperforming platforms (AOL, Vimeo, WeTransfer, Eventbrite and Evernote among them), then raises prices, strips redundancy and rewrites the software.
On the show, Lemkin argued this roll-up playbook is coming for mature B2B SaaS: acquire a sticky, underperforming platform, inject hungry operators, fix retention and capture the revenue arbitrage. The names he floated (Marketo, Asana, PagerDuty) are his speculation about future targets, not companies Bending Spoons owns. The signal underneath is what matters for anyone budgeting AI: public markets are pricing profitability and operating discipline, not growth-at-any-cost. The same logic that makes a roll-up attractive makes an unmeasured AI bill a liability.
What this means for an EU buyer
You do not have to pick a side in a US-China policy fight to act. The moves that cut your bill are available today, whatever Washington decides:
- Treat open weights as a first-class option. The quality gap on real coding and reasoning work has largely closed, at a fraction of frontier prices. Coinbase proved it in production.
- Solve governance with hosting, not avoidance. The real question with a Chinese open-weight model is not quality; it is where inference runs and where data lands. Run it self-hosted on EU infrastructure and you keep the price advantage without shipping data abroad. We cost that out in self-hosting LLMs in the EU and map the compliance options in EU data residency for AI apps.
- Route by default, escalate on need. Cheap model first, frontier model only when a confidence check fails. Track the escalation rate as a KPI.
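"Route by default, escalate on need" can be sketched as a two-step cascade with a counter for the escalation KPI. This is an illustration under stated assumptions: `cheap_model`, `frontier_model` and `confidence_check` are hypothetical callables you would supply, and the 0.7 threshold is an arbitrary placeholder that would need tuning against your own evals.

```python
from dataclasses import dataclass


@dataclass
class EscalationStats:
    total: int = 0
    escalated: int = 0

    @property
    def escalation_rate(self):
        # Share of requests the cheap model could not close out on its own.
        return self.escalated / self.total if self.total else 0.0


def answer_with_escalation(prompt, cheap_model, frontier_model,
                           confidence_check, stats, threshold=0.7):
    """Try the cheap model first; fall back to the frontier model on low confidence."""
    stats.total += 1
    draft = cheap_model(prompt)
    # confidence_check is assumed to return a score between 0 and 1.
    if confidence_check(prompt, draft) >= threshold:
        return draft
    stats.escalated += 1
    return frontier_model(prompt)


# Toy stand-ins so the sketch runs end to end; replace with real calls and checks.
def toy_cheap(prompt):
    return "cheap: " + prompt


def toy_frontier(prompt):
    return "frontier: " + prompt


def toy_confidence(prompt, draft):
    # Pretend the cheap model is only unsure about prompts flagged as hard.
    return 0.2 if "hard" in prompt else 0.9


stats = EscalationStats()
for p in ["format this date", "hard: refactor the billing module", "write a docstring"]:
    answer_with_escalation(p, toy_cheap, toy_frontier, toy_confidence, stats)
print(f"escalation rate: {stats.escalation_rate:.0%}")
```

An escalated request pays for both calls, so a rising escalation rate is an early sign that the cheap default no longer fits the workload.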
If a federal ban does land, it applies to US access, not to a self-hosted open-weight model running on a server in Frankfurt. Sovereignty over your own stack is the hedge. Setting that up (the routing, the evals, the hosting decision) is exactly the work in our AI enablement engagements.
Final thoughts
The open-source war is a pricing war wearing a national-security jacket. When a product's price is falling this fast, the cheapest way to defend margin is to make the substitute illegal, and that is a policy fight, not a product one.
Buyers do not have to wait for the outcome. Coinbase already showed the move: open-weight defaults, smart routing and aggressive caching cut the bill in half with no loss of access. The projects that fail are not the ones that chose a cheaper model; they are the ones that never measured the lift. Treat open weights as a first-class option, solve governance with EU hosting rather than avoidance, and route by default. Sovereignty over your own stack is the only hedge that survives whatever Washington decides.