Anthropic has always been the most visibly capacity-constrained of the big three. As the pioneers of Claude Code, the makers of the first model to make it really useful (Opus 4.5), and the main target of the OpenClaw phenomenon, I think they've been in a particularly difficult and somewhat unique capacity situation over the last few months.
The open-weight, mostly Chinese models make this a more complicated picture. Top Chinese models such as DeepSeek and Kimi have similar capabilities to the top models from the frontier American labs with significantly cheaper API pricing, even when hosted by third parties. Smaller open-weight models, such as the Gemma 4 and Qwen 3.6 series, can provide GPT-4o-equivalent performance for many tasks while running on consumer hardware.
We're at the point where a lot of useful AI workloads do not require the latest frontier models, and so I think if OpenAI et al do radically increase their prices, a lot of workflows will shift to lesser-known providers, perhaps even internally managed open-weight models in some cases.
There's a popular folk belief about the frontier labs becoming techno-feudal overlords due to ironclad control over the means of AI inference, but I just don't think their products have enough differentiation for that. People have vibes-based preferences for e.g. Claude vs ChatGPT, but for most tasks these models are pretty much interchangeable, and this will be true of more and more models, especially non-frontier ones, as time goes on. Most heavy users I know already ruthlessly switch between monthly subscriptions to the frontier products.
I think we'll see a rise in companies that just provide inference using open-weight models, and can thus set prices without needing to consider the costs of their own model training. The frontier labs have no moats, the second-tier labs are months, not years, behind in capability, and we're doing more with smaller models. IMO it's just a matter of time before AI inference becomes a commodity.
Really great insights, thank you for commenting 🙏 I totally agree on the lack of moats - which is part of why I question the wisdom of incinerating so much capital so quickly. Everyone says “But then we will have all these DCs even if there is a crash” - to that I say, ask Chinese solar panel manufacturers how that plan worked out.
I realise the above para runs counter to my main thrust (ooh saucy) in the article, but I think both things can be true. The capacity of the entire ecosystem is constrained by physics - the question is whether that constraint can be passed to users or not. Given that these AI mongers are currently subsidising their products, I am sceptical they can continue to do so. But then Uber is still a going concern so lmao wtf shrug emoji.
I am very sure that all the switching and substitution effects you’re describing will become more and more common (as you say) but my gut says even that will not be enough. I grasp that cheaper models = less compute = stable prices, but I think that we’ll still run out of capacity in the short run. In my experience AI usage has a kind of totalising property. Once a person or a business starts to see and feel the tangible benefits, they tend to lean into it, and hard. In that scenario, this thing is going to keep going hockey stick until it reaches ±4 billion and that could happen by 2028 or 2029. It’s a proper boom, but also a proper bubble. RIP getting any sleep if you are on ops for a big DC.
Yes I agree with you in that these companies are going to follow the same path as pretty much every other technology for the last 20 years.
Provide a loss leader, then enshittify.
I’m curious if you would be willing to share what your business is?
It’s a bit esoteric but we essentially do plumbing and wiring for digital marketing teams. Most marketing teams - even in fairly large businesses - are fairly small and rely on all kinds of automation, analytics and optimisation tools. The difficulty is those tools are very complex and making them work together is onerous. Many generalist marketers lack either the technical skills or the time or both to make the most of this expensive kit. So we help them to make the most of those tools and systems, and we provide insights and analytics to help them keep improving. When I try to explain this to people I usually get polite smiles and moderately puzzled expressions 😂
Oh I see, so it looks like an ETL pipeline with some MCP servers attached; I previously had a company which did something similar.
In all honesty it was a glorified Jupyter notebook with a few custom endpoints attached to do sentiment analysis, but like you said, marketers aren’t exactly technical so it’s a useful product.