The future had us abandoning traditional currency in favor of Bitcoin; it had digital artists selling NFTs for their work; it had supersonic jet travel and self-driving or even flying cars. It had population centers on the moon, mines on asteroids, fusion power plants, etc.
I think large language models have the same future as supersonic jet travel. Their usefulness will fail to materialize, with traditional models being good enough at a fraction of the price, while some startups keep trying to push this technology and consumers keep rejecting it.
Even if models keep stagnating at roughly the current state of the art (with only minor gains), we are still working through the massive economic changes they will bring.
Unlike supersonic passenger jet travel, which was possible and did happen, but never had much of an impact on the wider economy because it never caught on.
Cost was what brought supersonic travel down. Comparatively speaking, it may be the cost/benefit curve that decides the limits of this generation of technology. It seems to me the stuff we are looking at now is massively subsidised by exuberant private investment. The way these things go, there will come a point where investors want to see a return, and that will be a decider on whether the wheels keep spinning in the data centre.
That said, supersonic flight is still very much a thing in military circles …
AI is a bit like railways in the 19th century: once you train the model (= once you put down the track), actually running the inference (= running your trains) is comparatively cheap.
Even if the companies later go bankrupt and investors lose interest, the trained models are still there (= the rails stay in place).
That was reasonably common in the US: some promising company would get British (and German etc) investors to put up money to lay down tracks. Later the American company would go bust, but the rails stayed in America.
I think there is a fundamental difference though. In the 19th century when you had a rail line between two places it pretty much established the only means of transport between those places. Unless there was a river or a canal in place, the alternative was pretty much walking (or maybe a horse and a carriage).
The large language models are not that much better than a single artist / programmer / technical writer (in fact they are significantly worse) working for a couple of hours. Modern tools do indeed increase the productivity of workers to the extent where AI-generated content is not worth it in most (all?) industries (unless you are very cheap; but then maybe your workers will organize against you).
If we want to keep the railway analogy, training an AI model in 2025 is like building a railway line in 2025 where there is already a highway, and the highway is already sufficient for the traffic it gets, and won’t require expansion in the foreseeable future.
> The large language models are not that much better than a single artist / programmer / technical writer (in fact they are significantly worse) working for a couple of hours.
That's like saying sitting on the train for an hour isn't better than walking for a day?
> [...] (unless you are very cheap; but then maybe your workers will organize against you).
I don't understand that. Did workers organise against vacuum cleaners? And why would, eg, new companies care about organised workers if they don't hire them in the first place?
Dock workers organised against container shipping. They mostly succeeded in old established ports being sidelined in favour of newer, less annoying ports.
> That's like saying sitting on the train for an hour isn't better than walking for a day?
No, that's not it at all. Hiring a qualified worker for a few hours (or having one on staff) is not like walking for a day vs. riding a train. First of all, the train is capable of carrying a ton of cargo, which you will never manage on foot unless you have some horses or mules with you. So having a train line offers you capabilities that simply didn't exist before (unless you had a canal or a navigable river that goes to your destination). LLMs offer no new capabilities. The content they generate is precisely the same (except it's worse) as the content a qualified worker can give you in a couple of hours.
Another difference is that most content can wait the couple of hours it takes the skilled worker to create it, while the products you can deliver via train may spoil if carried on foot (even if carried by a horse). A farmer can go back to tending the crops after having dropped the cargo at the station, but will be absent for a couple of days if they need to carry it on foot, etc. None of this applies to generated content.
> Dock workers organised against container shipping. They mostly succeeded in old established ports being sidelined in favour of newer, less annoying ports.
But this is not true. Dock workers didn't organize against mechanization and automation of ports; they organized against mass layoffs and dangerous working conditions as ports got more automated. Port companies would use the automation as an excuse to engage in mass layoffs, leaving far too few workers tending far too much cargo over far too many hours. This resulted in fatigued workers making mistakes, which often led to serious injuries and even deaths. The 2022 US railroad strike was for precisely the same reason.
No, not really. I have a more global view in mind, eg Felixstowe vs London.
And, yes, you do mechanisation so that you can save on labour. Mass layoffs are just one expression of this (when you don't have enough natural attrition from people quitting).
You seem very keen on the American labour movements? There's another interesting thing to learn from history here: industry will move elsewhere, when labour movements get too annoying. Both to other parts of the country, and to other parts of the world.
Inference on most models can be run with merely borderline-consumer hardware.
Even for the fancy models, where you need to buy compute (rails) that costs about the price of a new car, the power draw is ~700W[0] while running inference at 50 tokens/second.
But!
The constraint with current hardware isn't compute; the models are mostly constrained by RAM bandwidth. A back-of-the-envelope estimate says that, e.g., if Apple took the compute already in their iPhones and re-engineered the chips to have 256 GB of RAM and sufficient bandwidth not to be constrained by it, models that size could run locally for a few minutes before hitting thermal limits (because it's a phone), but we're still only talking one- or two-digit watts.
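To make that back-of-the-envelope concrete, here is a minimal sketch of the bandwidth-bound reasoning. It assumes a decoder that reads every active weight once per generated token; the model size, quantisation, and bandwidth figures are illustrative assumptions, not measured numbers.

```python
# Back-of-the-envelope: decode speed for a memory-bandwidth-bound LLM.
# Assumption: each generated token reads every active parameter once, so
# tokens/s ~= memory bandwidth / bytes of weights touched per token.

def tokens_per_second(params_billion: float,
                      bytes_per_param: float,
                      bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when RAM bandwidth is the bottleneck."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# A hypothetical model whose 4-bit weights fill ~256 GB (512B params),
# on an assumed 200 GB/s phone-class memory system:
print(tokens_per_second(512, 0.5, 200))   # ~0.8 tokens/s
# The same model on an assumed 800 GB/s workstation-class memory system:
print(tokens_per_second(512, 0.5, 800))   # ~3.1 tokens/s
```

Compute barely enters the estimate, which is the point: the bottleneck is moving weights through RAM, not the multiplies themselves.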
> e.g., if Apple took the compute already in their iPhones and re-engineered the chips to have 256 GB of RAM and sufficient bandwidth not to be constrained by it, models that size could run locally for a few minutes before hitting thermal limits (because it's a phone), but we're still only talking one- or two-digit watts.
That hardware cost Apple tens of billions to develop, and what you're talking about in terms of "just the hardware needed" is so far beyond consumer hardware it's funny. Fairly sure most Windows laptops are still sold with 8GB RAM and basically 512MB of VRAM (probably less), practically the same thing for Android phones.
I was thinking of building a local LLM powered search engine but basically nobody outside of a handful of techies would be able to run it + their regular software.
Apple don't sell M4 chips separately, but the general best guess I've seen is that they cost Apple around $120 each. Certainly it can't exceed the list price of the cheapest Mac mini that has one (US$599).
As bleeding-edge tech, those are expensive transistors, but still, 10 of those chips would have enough transistors for 256 GB of RAM plus all the compute each chip already has. Actual RAM is much cheaper than that.
10x the price of the cheapest Mac mini is $6k… but you could then save $400 by getting a Mac Studio with 256 GB of RAM. The max power consumption (of that desktop, but configured with double that, 512 GB of RAM) is 270 W, and that represents an absolute upper bound: if you're doing inference, you're probably using a fraction of the compute, because inference is RAM-limited, not compute-limited.
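Pulling those rough numbers together (the ~$120 chip cost, the US$599 Mac mini, the 50 tokens/second figure, and the 270 W ceiling are all the estimates from this thread; this is a sanity check, not a costing):

```python
# Sanity check on the estimates above. All inputs are the rough
# figures quoted in this thread, not measured or official numbers.
chip_cost = 120        # guessed cost to Apple per M4-class chip, USD
mini_price = 599       # list price of the cheapest Mac mini, USD
chips = 10             # enough silicon for ~256 GB of RAM-equivalent transistors

print(chips * mini_price)   # 5990 -> the "~$6k" figure
print(chips * chip_cost)    # 1200 -> implied silicon cost alone

# Energy per token, using the 270 W ceiling as a hard upper bound:
max_watts = 270
tokens_per_s = 50           # assumed decode speed from upthread
print(max_watts / tokens_per_s)   # 5.4 J/token at most, likely far less
```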
But regardless, I'd like to emphasise that these chips aren't even trying to be good at LLMs. Not even Apple's Neural Engine is really trying to do that: NPUs (like the Neural Engine) are all focused on what AI looked like it was going to be several years back, not what current models are actually like today. (And given how fast this moves, it's not even clear to me that they were wrong, or that they should be optimised for what current models look like today.)
> Fairly sure most Windows laptops are still sold with 8GB RAM and basically 512MB of VRAM (probably less), practically the same thing for Android phones.
That sounds exceptionally low even for budget laptops. The only examples I can find are the sub-€300 budget range and refurbished devices.
For phones, there is currently very little market for this; the limit is not that it's an inconceivable challenge. Same deal as thermal imaging cameras in this regard.
> I was thinking of building a local LLM powered search engine but basically nobody outside of a handful of techies would be able to run it + their regular software.
This has been a standard database tool for a while already. Vector databases, RAG, etc.
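As a minimal sketch of what that tooling does under the hood (the embeddings here are placeholder random vectors; in a real pipeline they would come from an embedding model):

```python
# Minimal local semantic search: embed documents, then rank them by
# cosine similarity against the query embedding. This is the core step
# that vector databases / RAG pipelines automate at scale.
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vec: np.ndarray,
           doc_vecs: list[np.ndarray],
           docs: list[str],
           top_k: int = 3) -> list[str]:
    scores = [cosine_sim(query_vec, v) for v in doc_vecs]
    ranked = sorted(zip(scores, docs), reverse=True)
    return [doc for _, doc in ranked[:top_k]]

# Placeholder random vectors stand in for real embeddings here.
rng = np.random.default_rng(0)
docs = ["intro to trains", "LLM inference costs", "dock worker history"]
doc_vecs = [rng.normal(size=8) for _ in docs]
print(search(rng.normal(size=8), doc_vecs, docs, top_k=2))
```

Vector databases add indexing, persistence, and approximate nearest-neighbour search on top of exactly this ranking step.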
Look at computer systems that cost $2,000 or less: they are useless at running, for example, LLM coding assistants locally. A minimal subscription to a cloud service unfortunately beats them, and even more expensive systems that can run larger models run them too slowly to be productive. Yes, you can chat with them and perform tasks slowly on low-cost hardware, but that is all. If you put local LLMs in your IDE, they slow you down or just don't work.
My understanding of train lines in America is that lots of them went to ruin and the extant network is only “just good enough” for freight. Nobody talks about Amtrak or the Southern Belle or anything any more.
Air travel taking over is of course the main reason for all of this, but the costs sunk into the rails are lost, or the ROI curtailed by market forces and obsolescence.
Completely relevant. It’s all that remains of the train tracks today. Grinding out the last drops from those sunk costs, attracting minimal investment to keep it minimally viable.
Grinding out returns from a sunk cost of a century-old investment is pretty impressive all by itself.
Very few people want to invest more: the private sector doesn't want to because they'll never see the return, the governments don't want to because the returns are spread over their great-great-grandchildren's lives and that doesn't get them re-elected in the next n<=5 (because this isn't just a USA problem) years.
Even the German government dragged its feet over rail investment, but they're finally embarrassed enough by the network problems to invest in all the things.
That's simply because capitalists really don't like investments with a 50-year horizon and no guarantees. So the infrastructure that needs to be maintained is not.
The current training method is the same as 30 years ago; it's the GPUs that changed and made it produce practical results. So we're not really that innovative with all this...
Even if the current subsidy is 50%, GPT would be cheap for many applications at twice the price. It will determine adoption, but it wouldn't prevent me having a personal assistant (and I'm not a 1%er, so that's a big change).
You are right that outside of the massive capex spending on training models, we don't see that much of an economic impact, yet. However, it's very far from zero:
Remember these outsourcing firms that essentially only offer warm bodies that speak English? They are certainly already feeling the impact. (And we see that in labour market statistics for eg the Philippines, where this is/was a big business.)
And this is just one example. You could ask your favourite LLM about a rundown of the major impacts we can already see.
But those warm bodies that speak English offer a service by being warm and able to be somewhat attuned to the distress you feel. A frigging robot solving your unsolvable problem? You can try, but witness the backlash.
We are mixing up two meanings of the word 'warm' here.
There's no emotional warmth involved in manning a call centre and explicitly being confined to a script and having no power to make your own decisions to help the customer.
'Warm body' is just a term that has nothing to do with emotional warmth. I might just as well have called them 'body shops', even though it's of no consequence that the people involved have actual bodies.
> A frigging robot solving your unsolvable problem? You can try, but witness the backlash.
Front line call centre workers aren't solving your unsolvable problems, either. Just the opposite.
And why are you talking in the hypothetical? The impact on call centres etc is already visible in the statistics.
I've seen this take a lot, but I don't know why because it's extremely divorced from reality.
Demand for AI is insanely high. They can't make chips fast enough to meet customer demand. The energy industry is transforming to try to meet the demand.
Whoever is telling you that consumers are rejecting it is lying to you, and you should honestly probably reevaluate where you get your information. Because it's not serving you well.
> Demand for AI is insanely high. They can't make chips fast enough to meet customer demand.
Woah there cowboy, slow down a little.
Demand for chips comes from the inference providers. Inference was (and still is) being sold below cost. OpenAI, for example, has a spend rate of $5b per month on revenues of $0.5b per month.
They are literally selling a dollar for actual 10c. Of course "demand" is going to be high.
> Demand for chips comes from the inference providers. Inference was (and still is) being sold below cost. OpenAI, for example, has a spend rate of $5b per month on revenues of $0.5b per month.
This is definitely wrong, last year it was $725m/month expenses and $300m/month revenue. Looks like the nearly-2:1 ratio is also expected for this year: https://taptwicedigital.com/stats/openai
This also includes the cost of training new models, so I'm still not at all sure whether inference is sold at cost or not.
> This is definitely wrong, last year it was $725m/month expenses and $300m/month revenue.
It looks like you're using "expenses" to mean "opex". I said "spend rate" because they're spending that money (i.e. the sum of both opex and capex). The reason I include the capex is that their projections towards profitability, as stated by them many times, are based on getting the compute online. They don't claim any sort of profitability without that capex (and even with that capex, it's a little bit iffy).
This includes the Stargate project (they're committed for $10b - $20b (reports vary) before the end of 2025), and they've paid roughly $10b to Microsoft for compute for 2025. Oracle is committing (or already has committed) $40b in GPUs for Stargate, and Softbank has commitments to Stargate independently of OpenAI.
I find it hard to trust these numbers[1]: the $40b funding was not all in cash up front, and depends on Softbank for $30b, with Softbank syndicating the remaining $10b. Softbank themselves don't have $30b in cash and have to take a loan to reach that amount. Softbank did provide $7.5b in cash, with milestones for the remainder. That was in May 2025. In August that money had run out and OpenAI did another raise of $8.3b.
In short, in the last two to three months, OpenAI spent $5b/month on revenues of $0.5b/m. They are also depending on Softbank coming through with the rest of the $40b before end of 2025 ($30b in cash and $10b by syndicating other investors into it) because their commitments require that extra cash.
Come Jan-2026, OpenAI would have received, and spent most of, $60b for 2025, with projected revenue of $12b-$13b.
---------------------------------
[1] Now, true, we are all going off rumours here (as this is not a public company, we don't have any visibility into the actual numbers), but some numbers match up with what public info there is and some don't.
> It looks like you're using "expenses" to mean "opex"
I took their losses and added it to their revenue. That seems like that sum would equal expenses.
> The $40b funding was not all in cash up front,
Does this matter? I'm not counting it as revenue.
> In short, in the last two to three months, OpenAI spent $5b/month on revenues of $0.5b/m.
You're repeating the same claim as before, I've not seen any evidence to support your numbers.
The evidence I linked you to suggests the 2025 average will be double that revenue, $1bn/month, at an expense of ($9bn loss + $12bn revenue) / 12 months = $21bn / 12 months = $1.75bn/month.
> Does this matter? I'm not counting it as revenue.
Well, yes, because they forecast spending all of it by end of 2025, and they moved up their last round ($8.3b) by a month or two because they needed the money.
My point was, they received a cash injection of $10b (first part of the $40b raise) and that lasted only two months.
>> In short, in the last two to three months, OpenAI spent $5b/month on revenues of $0.5b/m.
> You're repeating the same claim as before, I've not seen any evidence to support your numbers.
Briefly, we don't really have visibility into their numbers. What we do have visibility into is how much cash they needed between two points (Specifically, the months of June and July). We also know what their spending commitment is (to their capex suppliers) for 2025. That's what I'm using.
They had $10b injected at the start of June. They needed $8.3b at the end of July.
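For what it's worth, here is the arithmetic behind that inference, taking the rumoured figures in this thread at face value (none of these numbers are audited, and the revenue run rate is the commenter's own assumption):

```python
# Implied burn rate from the cash-injection timing claimed above.
# All figures are the rumoured numbers from this thread, in $bn.
cash_injected = 10.0     # received at the start of June
months_elapsed = 2       # gone by the end of July, per the claim
monthly_revenue = 0.5    # assumed revenue run rate, $bn/month

# Spend = cash drawn down per month + revenue that also got spent.
monthly_spend = cash_injected / months_elapsed + monthly_revenue
print(monthly_spend)     # 5.5 -> roughly the "$5b/month on $0.5b/m" claim
```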
It's crazy how many people are completely confident in their "knowledge" of the margins these products have despite the companies providing them not announcing those details!
(To be clear, I'm not criticising the person I'm replying to.)
I tend to rough-estimate it based on known compute/electricity costs for open weights models etc., but what evidence I do have is loose enough that I'm willing to believe a factor of 2 per standard deviation of probability in either direction at the moment, so long as someone comes with receipts.
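As an illustration of that kind of rough estimate, here is electricity cost per token using the ~700 W / 50 tokens-per-second figures from upthread and an assumed power price (every input is an assumption, and this ignores hardware amortisation entirely):

```python
# Electricity-only cost per million output tokens, assuming the
# ~700 W at ~50 tokens/second figures quoted earlier in the thread.
watts = 700
tokens_per_second = 50
usd_per_kwh = 0.10            # assumed electricity price

joules_per_token = watts / tokens_per_second              # 14 J/token
kwh_per_million_tokens = joules_per_token * 1e6 / 3.6e6   # ~3.9 kWh
print(kwh_per_million_tokens * usd_per_kwh)               # ~$0.39 per 1M tokens
```

Electricity alone comes out tiny; the big unknowns are hardware amortisation and utilisation, which is exactly why the error bars stay at a factor of 2 or more.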
Subscription revenue and corresponding service provision are also a big question, because those will almost always be either under- or over-used, never precisely balanced.
I think the above post has a fair point. Demand for chatbot customer service in various forms is surely "insanely high" - but demand from whom? Because I don't recall any end-user ever asking for it.
No, instead it'll be the new calculator that you can use to lazy-draft an email on your 1.5-hour Ryanair economy flight to the South. Both unthinkable luxuries just decades ago, but neither of which has transformed humanity profoundly.
This is just the same argument. If you believe demand for AI is low then you should be able to verify that with market data.
Currently market data is showing a very high demand for AI.
These arguments come down to "thumbs down to AI". If people just said that, it would at least be an honest argument. But pretending that consumers don't want LLMs when they're some of the most popular apps in the history of mankind is not a defensible position.
I'm not sure this works in reverse. If demand is indeed high, you could show that with market data. But if you have market data, e.g. showing high valuations of AI companies, or x many requests over some period, that doesn't necessarily mean that demand is high. In other words, market data is necessary but not sufficient to prove your claim.
Reasons for market data seemingly showing high demand without there actually being any include: market manipulation (including marketing campaigns), artificial or inflated demand, forced usage, hype, etc. As an example, NFTs, Bitcoin, and supersonic jet travel all had "insane market data" which seemed at the time to show that there was huge demand for these things.
My prediction is that we are in the early Concorde era of supersonic jet travel, and Boeing is racing to catch up to the promise of this technology. Except that in an unregulated market such as the current tech market, we have forgone all the safety and security measures, and the Concorde has made its first passenger flight in 1969 (as opposed to 1976), with tons of fanfare and all flights fully booked months in advance.
Note that in the 1960s, market forecasts had demand for the Concorde at 350 airplanes by 1980, yet at the time the first prototypes were flying they had 74 options. Only 20 were ever built for passenger flight.
As an end user I have never asked for a chatbot. And if I'm calling support, I have a weird issue that I probably need a human being to resolve.
But! We here are not necessarily typical callers. How many IT calls from the general population can be served efficiently (for both parties) with a quality chatbot?
And lest you think I'm being elitist, let's take an area I am not proficient in, such as HR, where I am "general population".
Our internal corporate chatbot has turned from an atrocious insult to man and God 7 years ago into something far more efficient than a friendly but underpaid, inexperienced human being 3 countries away answering my incessant questions: what holidays do I have again, how many sick days do I have and how do I enter them, how do I process retirement, how do I enter my expenses, what's the difference between short- and long-term disability, etc. And it has a button to "start a complex HR case / engage a human being" for edge cases, so internally it works very well.
This is narrow anecdata about the notion of service-support chatbots; don't infer (hah) any further claims about morality, the economy, or the future of LLMs.
ChatGPT, Claude, Gemini in chatbot or coding-agent form? Great stuff, saves me some googling.
The same AI popping up in an e-mail, chat, or spreadsheet tool? No thanks, normal people don't need an AI summary of a 200-word e-mail or Slack thread. And if I've paid a guy a month's salary to write a report on something, of course I'll find 30 minutes to read it cover to cover.