
There's no such thing as a free token


Jan 11, 2026

Written by Simon Spurrier

There's no such thing as a free token.

For now, LLM inference is quite expensive. It’s getting cheaper, but your near-unrestricted daily access to the frontier of AI costs someone, somewhere, quite a lot of money.

It does not cost a lot of money because AI companies are mean (though they might be for other reasons). It is because both training and serving LLMs are expensive. They require expensive hardware, expensive people, and expensive power. Increasingly, they require expensive data.

For a company to continue to operate, it must be sustainable. Anthropic, for example, is a company. If you really like the products they make, you should hope they can keep operating and keep providing the products you like.

The great Anthropic token subsidy of 2025

Recent Anthropic models, in combination with Claude Code, introduced people to the raw power of LLMs when you let them use lots of tokens in a terminal. It was so good, in fact, that despite Claude Code being an unwieldy terminal user interface, it quickly became one of the fastest-growing products of all time, reaching a $1B run rate in six months.

However, Claude Code uses a lot of tokens.

Claude Code users usually have a Claude plan, which includes unspecified but apparently generous usage limits compared to the same spend on the inference API.

I have a feeling I'll soon be looking back at my Claude Code usage for the past month with pangs of nostalgia about how I managed to get over $13k (and counting) of inference services for $600 worth of Claude Max subscriptions. There's no way this is sustainable for Anthropic.

Nobody knows just how much money Anthropic is making or losing on Claude Code but, even on the pricey $200 per month plan, there are estimates that heavy users save $3,000 per month vs API pricing. It’s also likely that the API is already subsidised.
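Those savings estimates are easy to sanity-check with back-of-envelope arithmetic. The sketch below is illustrative only: the per-token prices and the monthly token counts are assumptions I've plugged in, not Anthropic's actual rates or anyone's measured usage.

```python
# Back-of-envelope estimate of the API-equivalent cost of heavy agentic
# coding usage versus a flat monthly subscription. All numbers are
# illustrative assumptions, not real prices or measured usage.

PRICE_PER_M_INPUT = 15.0    # assumed $/million input tokens
PRICE_PER_M_OUTPUT = 75.0   # assumed $/million output tokens

def api_equivalent_cost(input_tokens_m: float, output_tokens_m: float) -> float:
    """Dollar cost if the same tokens were bought via the API."""
    return input_tokens_m * PRICE_PER_M_INPUT + output_tokens_m * PRICE_PER_M_OUTPUT

# A hypothetical heavy month: 150M input tokens, 12M output tokens.
cost = api_equivalent_cost(150, 12)
plan = 200.0  # assumed monthly subscription price
print(f"API-equivalent: ${cost:,.0f}, plan: ${plan:.0f}, "
      f"implied subsidy: ${cost - plan:,.0f}")
```

With these assumed numbers, the gap lands near the $3,000/month figure quoted above, which is the point: at list prices, heavy usage dwarfs the subscription fee.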

Given the popularity of Claude Code, it wasn’t long before alternatives started to appear.

One high profile alternative, OpenCode, offered a similarly capable harness that could be used with models other than Claude. In a popular move that likely boosted its usage, but was technically against Anthropic’s terms of service, OpenCode allowed its users to plug in their Claude subscription credentials to get Anthropic LLM usage at heavily discounted rates compared to other means. It would have been incredibly hard to compete otherwise.

Many OpenCode users (and developers) took Anthropic’s silence as a tacit endorsement of this practice.

The great blockening of 2026

On the 8th of January, Anthropic took steps to prevent Claude subscription usage in third-party apps, including OpenCode. It was big news, at least on certain parts of Twitter.

In general, people were furious. They had been enjoying near unlimited access to one of the most complex technologies available to humanity at a price point accessible to a consumer, and it was suddenly taken away. That’s understandable, and Anthropic didn’t handle it as well as they could have.

However, reasonable and intelligent people expressed a clear expectation of endless token subsidy with no conditions. That’s obviously unsustainable.

OpenAI is not your friend

Soon after the blockening, other AI labs (with a much smaller market share in coding) quickly moved to capitalise on the gap left by Claude and the furious online sentiment.

OpenAI in particular was able to curry favour quickly, particularly among the open source community, by making their Codex plan (similar in structure and pricing to Claude's) portable to products outside the Codex family. They were widely praised, and their models quickly gained greater recognition.

OpenAI is a great company, but they are not your friend. Notably, they were not as permissive as Anthropic until the blockening outcry, when an opportunity for market capture presented itself. They also make huge losses.

Other model labs highlighted their portable plans too. Z.ai has long offered cheap, portable inference plans and now has frontier level models competitive with Claude. However, they lost $343m on $27m of revenue in the first half of 2025 and are now a public company. This obviously cannot last.

What's going on then?

There is usually a strategy behind any subsidy. In this case, there are probably a few angles. This goes for pretty much any AI company losing money on inference (nearly all of them).

Data

Using AI products tends to create large amounts of useful human feedback data. This helps improve the models you are using and builds a competitive moat for the company. Access to data is increasingly deciding the winners at the frontier, so providing subsidies to acquire it makes sense.

If you use other harnesses, Anthropic still gets the actual inference request data, but loses access to whatever telemetry OpenCode decides to keep to itself. They are still paying the same subsidy, and possibly even more.

Vendor lock in

This sounds evil (and usually is) but, at least in this case, there are some good reasons to insist on vendor lock-in: control over how liberally the harness uses tokens, how well they can provide support, and how well their models are perceived by the end user.

They probably also want to lock you in to eventually make more money too, though.

Market capture

This can be as simple as getting you addicted to the harness and then jacking up the price. That’s not a great practice.

However, it can be less evil than that. Anthropic is probably trying to become the market standard. This can mean providing subsidy to individuals and small businesses using Claude to increase familiarity and market penetration, and then much more heavily monetising large companies and enterprises. They already do this by charging businesses according to API usage rather than offering per seat Claude plans.

The future of LLM costs

There may be yet one more angle, in combination with some good old-fashioned market capture: betting that training and inference costs will fall dramatically, and simulating that cheaper future until it arrives.

If trends over the last two years continue, this all gets pretty sustainable without anything else changing. You can now get frontier-level inference for 30x less than GPT-4 cost at its release.
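For a rough feel of what that trend implies, here's a small extrapolation. The 30x figure comes from the paragraph above; the two-year window, and the assumption that the decline is smooth and will continue, are mine.

```python
# Extrapolate the claimed ~30x drop in frontier inference cost.
# Assumptions: the drop happened over roughly two years and is smooth.

cheapening = 30.0   # claimed overall cost-reduction factor
years = 2.0         # assumed window for that reduction

annual_factor = cheapening ** (1 / years)  # cost reduction per year
print(f"Implied annual cost reduction: ~{annual_factor:.1f}x")

# If the trend held, the relative cost n years from now:
for n in (1, 2, 3):
    remaining = 100 / annual_factor ** n
    print(f"after {n} year(s): ~{remaining:.1f}% of today's cost")
```

Under these assumptions, costs fall by roughly 5.5x a year, so a subsidy that looks reckless today can look modest within a couple of pricing cycles.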

We may soon have Opus 4.5 level intelligence running on a laptop.

Sustainable & affordable AI

One way or another, this all needs to be sustainable so we can all continue to enjoy access to a fast moving AI frontier.

Level-headedness about the incentives of the companies providing it may help, especially when it comes to accepting the trade-offs in how we pay for it.

There are also innovative ways to broaden access, like serving ads during code generation, or letting users' valuable data work for them.

That's what we do at cto.new. We provide an extremely aggressive, model-agnostic free tier in a way that works for us and our users, without sacrificing their privacy or user experience.

It’s not perfect, but we’re working really hard to make it better. I’d love for you to try it.

Anyway, please let's try not to fall out over software terms of service.

All rights reserved 2025,
Era Technologies Holdings Inc.
