The future had us abandoning traditional currency in favor of Bitcoin; it had digital artists selling NFTs for their work; it had supersonic jet travel and self-driving or even flying cars. It had population centers on the moon, mines on asteroids, fusion power plants, etc.
I think large language models have the same future as supersonic jet travel. Their usefulness will fail to materialize, with traditional models being good enough at a fraction of the price, while some startups keep trying to push this technology and consumers keep rejecting it.
Even if models keep stagnating at roughly the current state of the art (with only minor gains), we are still working through the massive economic changes they will bring.
Unlike supersonic passenger jet travel, which was possible and did happen, but never had much of an impact on the wider economy because it never caught on.
Cost was what brought supersonic travel down. Comparatively speaking, it may be the cost/benefit curve that decides the limits of this generation of technology. It seems to me the stuff we are looking at now is massively subsidised by exuberant private investment. The way these things go, there will come a point where investors want to see a return, and that will be a decider on whether the wheels keep spinning in the data centre.
That said, supersonic flight is still very much a thing in military circles …
AI is a bit like railways in the 19th century: once you train the model (= once you put down the track), actually running the inference (= running your trains) is comparatively cheap.
Even if the companies later go bankrupt and investors lose interest, the trained models are still there (= the rails stay in place).
That was reasonably common in the US: some promising company would get British (and German etc) investors to put up money to lay down tracks. Later the American company would go bust, but the rails stayed in America.
I think there is a fundamental difference though. In the 19th century when you had a rail line between two places it pretty much established the only means of transport between those places. Unless there was a river or a canal in place, the alternative was pretty much walking (or maybe a horse and a carriage).
The large language models are not that much better than a single artist / programmer / technical writer (in fact they are significantly worse) working for a couple of hours. Modern tools do indeed increase the productivity of workers to the extent where AI-generated content is not worth it in most (all?) industries (unless you are very cheap; but then maybe your workers will organize against you).
If we want to keep the railway analogy, training an AI model in 2025 is like building a railway line in 2025 where there is already a highway, and the highway is already sufficient for the traffic it gets, and won’t require expansion in the foreseeable future.
> The large language models are not that much better than a single artist / programmer / technical writer (in fact they are significantly worse) working for a couple of hours.
That's like saying sitting on the train for an hour isn't better than walking for a day?
> [...] (unless you are very cheap; but then maybe your workers will organize against you).
I don't understand that. Did workers organise against vacuum cleaners? And why would, eg, new companies care about organised workers if they don't hire them in the first place?
Dock workers organised against container shipping. They mostly succeeded in old established ports being sidelined in favour of newer, less annoying ports.
> That's like saying sitting on the train for an hour isn't better than walking for a day?
No, that's not it at all. Hiring a qualified worker for a few hours (or having one on staff) is not like walking for a day vs. riding a train. First of all, the train is capable of carrying a ton of cargo, which you will never manage on foot unless you have some horses or mules with you. So having a train line offers you capabilities that simply didn't exist before (unless you had a canal or a navigable river that goes to your destination). LLMs offer no new capabilities. The content they generate is precisely the same (except it's worse) as the content a qualified worker can give you in a couple of hours.
Another difference is that most content can wait the couple of hours it takes the skilled worker to create it, while the products you can deliver via train may spoil if carried on foot (even if carried by a horse). A farmer can go back to tending the crops after having dropped the cargo at the station, but will be absent for a couple of days if they need to carry it on foot, etc. None of this applies to generated content.
> Dock workers organised against container shipping. They mostly succeeded in old established ports being sidelined in favour of newer, less annoying ports.
But this is not true. Dock workers didn't organize against mechanization and automation of ports; they organized against mass layoffs and dangerous working conditions as ports got more automated. Port companies would use the automation as an excuse to engage in mass layoffs, leaving far too few workers tending far too much cargo over far too many hours. This resulted in fatigued workers making mistakes, which often led to serious injuries and even deaths. The 2022 US railroad strike was for precisely the same reason.
No, not really. I have a more global view in mind, eg Felixstowe vs London.
And, yes, you do mechanisation so that you can save on labour. Mass layoffs are just one expression of this (when you don't have enough natural attrition from people quitting).
You seem very keen on the American labour movements? There's another interesting thing to learn from history here: industry will move elsewhere, when labour movements get too annoying. Both to other parts of the country, and to other parts of the world.
Inference on most models can be run with merely borderline-consumer hardware.
Even for the fancy models, where you need to buy compute (rails) that costs about the price of a new car, the power draw is ~700W[0] while running inference at 50 tokens/second.
But!
The constraint with current hardware isn't compute; the models are mostly constrained by RAM bandwidth. A back-of-the-envelope estimate says that, e.g., if Apple took the compute already in their iPhones and re-engineered the chips to have 256 GB of RAM and sufficient bandwidth not to be constrained by it, models that size could run locally for a few minutes before hitting thermal limits (because it's a phone), but we're still only talking one- or two-digit watts.
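To make that back-of-the-envelope concrete, here is a minimal sketch of the bandwidth-bound reasoning. It assumes a decoder that reads every active weight once per generated token; the model size, quantisation, and bandwidth figures are illustrative assumptions, not measured numbers.

```python
# Back-of-the-envelope: decode speed for a memory-bandwidth-bound LLM.
# Assumption: each generated token reads every active parameter once, so
# tokens/s ~= memory bandwidth / bytes of weights touched per token.

def tokens_per_second(params_billion: float,
                      bytes_per_param: float,
                      bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when RAM bandwidth is the bottleneck."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# A hypothetical model whose 4-bit weights fill ~256 GB (512B params),
# on an assumed 200 GB/s phone-class memory system:
print(tokens_per_second(512, 0.5, 200))   # ~0.8 tokens/s
# The same model on an assumed 800 GB/s workstation-class memory system:
print(tokens_per_second(512, 0.5, 800))   # ~3.1 tokens/s
```

Compute barely enters the estimate, which is the point: the bottleneck is moving weights through RAM, not the multiplies themselves.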
> e.g., if Apple took the compute already in their iPhones and re-engineered the chips to have 256 GB of RAM and sufficient bandwidth not to be constrained by it, models that size could run locally for a few minutes before hitting thermal limits (because it's a phone), but we're still only talking one- or two-digit watts.
That hardware cost Apple tens of billions to develop, and what you're talking about in terms of "just the hardware needed" is so far beyond consumer hardware it's funny. Fairly sure most Windows laptops are still sold with 8GB RAM and basically 512MB of VRAM (probably less), practically the same thing for Android phones.
I was thinking of building a local LLM powered search engine but basically nobody outside of a handful of techies would be able to run it + their regular software.
Apple don't sell M4 chips separately, but the general best guess I've seen is that they cost Apple around $120 each. Certainly it can't exceed the list price of the cheapest Mac mini that has one (US$599).
As bleeding-edge tech, those are expensive transistors, but still, 10 of those chips would have enough transistors for 256 GB of RAM plus all the compute each chip already has. Actual RAM is much cheaper than that.
10x the price of the cheapest Mac mini is $6k… but you could then save $400 by getting a Mac Studio with 256 GB of RAM. The max power consumption (of that desktop, but configured with double that, 512 GB of RAM) is 270 W, and that represents an absolute upper bound: if you're doing inference, you're probably using a fraction of the compute, because inference is RAM-limited, not compute-limited.
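Pulling those rough numbers together (the ~$120 chip cost, the US$599 Mac mini, the 50 tokens/second figure, and the 270 W ceiling are all the estimates from this thread; this is a sanity check, not a costing):

```python
# Sanity check on the estimates above. All inputs are the rough
# figures quoted in this thread, not measured or official numbers.
chip_cost = 120        # guessed cost to Apple per M4-class chip, USD
mini_price = 599       # list price of the cheapest Mac mini, USD
chips = 10             # enough silicon for ~256 GB of RAM-equivalent transistors

print(chips * mini_price)   # 5990 -> the "~$6k" figure
print(chips * chip_cost)    # 1200 -> implied silicon cost alone

# Energy per token, using the 270 W ceiling as a hard upper bound:
max_watts = 270
tokens_per_s = 50           # assumed decode speed from upthread
print(max_watts / tokens_per_s)   # 5.4 J/token at most, likely far less
```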
But regardless, I'd like to emphasise that these chips aren't even trying to be good at LLMs. Not even Apple's Neural Engine is really trying to do that: NPUs (like the Neural Engine) are all focused on what AI looked like it was going to be several years back, not what current models are actually like today. (And given how fast this moves, it's not even clear to me that they were wrong, or that they should be optimised for what current models look like today.)
> Fairly sure most Windows laptops are still sold with 8GB RAM and basically 512MB of VRAM (probably less), practically the same thing for Android phones.
That sounds exceptionally low even for budget laptops. The only examples I can find are the sub-€300 budget range and refurbished devices.
For phones, there is currently very little market for this; the limit is not that it's an inconceivable challenge. Same deal as thermal imaging cameras in this regard.
> I was thinking of building a local LLM powered search engine but basically nobody outside of a handful of techies would be able to run it + their regular software.
This has been a standard database tool for a while already. Vector databases, RAG, etc.
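As a minimal sketch of what that tooling does under the hood (the embeddings here are placeholder random vectors; in a real pipeline they would come from an embedding model):

```python
# Minimal local semantic search: embed documents, then rank them by
# cosine similarity against the query embedding. This is the core step
# that vector databases / RAG pipelines automate at scale.
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vec: np.ndarray,
           doc_vecs: list[np.ndarray],
           docs: list[str],
           top_k: int = 3) -> list[str]:
    scores = [cosine_sim(query_vec, v) for v in doc_vecs]
    ranked = sorted(zip(scores, docs), reverse=True)
    return [doc for _, doc in ranked[:top_k]]

# Placeholder random vectors stand in for real embeddings here.
rng = np.random.default_rng(0)
docs = ["intro to trains", "LLM inference costs", "dock worker history"]
doc_vecs = [rng.normal(size=8) for _ in docs]
print(search(rng.normal(size=8), doc_vecs, docs, top_k=2))
```

Vector databases add indexing, persistence, and approximate nearest-neighbour search on top of exactly this ranking step.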
Look at computer systems that cost $2,000 or less: they are useless at running, for example, LLM coding assistants locally. A minimal subscription to a cloud service unfortunately beats them, and even more expensive systems that can run larger models run them too slowly to be productive. Yes, you can chat with them and perform tasks slowly on low-cost hardware, but that is all. If you put local LLMs in your IDE, they slow you down or just don't work.
My understanding of train lines in America is that lots of them went to ruin and the extant network is only “just good enough” for freight. Nobody talks about Amtrak or the Southern Belle or anything any more.
Air travel taking over is of course the main reason for all of this, but the costs sunk into the rails are lost, or the ROI curtailed by market forces and obsolescence.
Completely relevant. It’s all that remains of the train tracks today. Grinding out the last drops from those sunk costs, attracting minimal investment to keep it minimally viable.
Grinding out returns from a sunk cost of a century-old investment is pretty impressive all by itself.
Very few people want to invest more: the private sector doesn't want to because they'll never see the return, the governments don't want to because the returns are spread over their great-great-grandchildren's lives and that doesn't get them re-elected in the next n<=5 (because this isn't just a USA problem) years.
Even the German government dragged its feet over rail investment, but they're finally embarrassed enough by the network problems to invest in all the things.
That's simply because capitalists really don't like investments with a 50-year horizon and no guarantees. So the infrastructure that needs to be maintained is not.
The current training method is the same as 30 years ago; it's the GPUs that changed and made it produce practical results. So we're not really that innovative with all this...
Even if the current subsidy is 50%, GPT would be cheap for many applications at twice the price. It will determine adoption, but it wouldn't prevent me having a personal assistant (and I'm not a 1%er, so that's a big change).
You are right that outside of the massive capex spending on training models, we don't see that much of an economic impact, yet. However, it's very far from zero:
Remember these outsourcing firms that essentially only offer warm bodies that speak English? They are certainly already feeling the impact. (And we see that in labour market statistics for eg the Philippines, where this is/was a big business.)
And this is just one example. You could ask your favourite LLM about a rundown of the major impacts we can already see.
But those warm bodies that speak English offer a service by being warm and able to be somewhat attuned to the distress you feel. A frigging robot solving your unsolvable problem? You can try, but witness the backlash.
We are mixing up two meanings of the word 'warm' here.
There's no emotional warmth involved in manning a call centre and explicitly being confined to a script and having no power to make your own decisions to help the customer.
'Warm body' is just a term that has nothing to do with emotional warmth. I might just as well have called them 'body shops', even though it's of no consequence that the people involved have actual bodies.
> A frigging robot solving your unsolvable problem? You can try, but witness the backlash.
Front line call centre workers aren't solving your unsolvable problems, either. Just the opposite.
And why are you talking in the hypothetical? The impact on call centres etc is already visible in the statistics.
I've seen this take a lot, but I don't know why because it's extremely divorced from reality.
Demand for AI is insanely high. They can't make chips fast enough to meet customer demand. The energy industry is transforming to try to meet the demand.
Whoever is telling you that consumers are rejecting it is lying to you, and you should honestly probably reevaluate where you get your information. Because it's not serving you well.
> Demand for AI is insanely high. They can't make chips fast enough to meet customer demand.
Woah there cowboy, slow down a little.
Demand for chips comes from the inference providers. Inference was (and still is) being sold below cost. OpenAI, for example, has a spend rate of $5b per month on revenues of $0.5b per month.
They are literally selling a dollar for actual 10c. Of course "demand" is going to be high.
> Demand for chips comes from the inference providers. Inference was (and still is) being sold below cost. OpenAI, for example, has a spend rate of $5b per month on revenues of $0.5b per month.
This is definitely wrong, last year it was $725m/month expenses and $300m/month revenue. Looks like the nearly-2:1 ratio is also expected for this year: https://taptwicedigital.com/stats/openai
This also includes the cost of training new models, so I'm still not at all sure whether inference is sold at cost or not.
> This is definitely wrong, last year it was $725m/month expenses and $300m/month revenue.
It looks like you're using "expenses" to mean "opex". I said "spend rate" because they're spending that money (i.e. the sum of both opex and capex). The reason I include the capex is that their projections towards profitability, as stated by them many times, are based on getting the compute online. They don't claim any sort of profitability without that capex (and even with that capex, it's a little bit iffy).
This includes the Stargate project (they're committed for $10b - $20b (reports vary) before the end of 2025), and they've paid roughly $10b to Microsoft for compute for 2025. Oracle is committing (or already has committed) $40b in GPUs for Stargate, and Softbank has commitments to Stargate independently of OpenAI.
I find it hard to trust these numbers[1]: the $40b funding was not all in cash up front, and depends on Softbank for $30b, with Softbank syndicating the remaining $10b. Softbank themselves don't have $30b in cash and have to take a loan to reach that amount. Softbank did provide $7.5b in cash, with milestones for the remainder. That was in May 2025. In August that money had run out and OpenAI did another raise of $8.3b.
In short, in the last two to three months, OpenAI spent $5b/month on revenues of $0.5b/m. They are also depending on Softbank coming through with the rest of the $40b before end of 2025 ($30b in cash and $10b by syndicating other investors into it) because their commitments require that extra cash.
Come Jan-2026, OpenAI would have received, and spent most of, $60b for 2025, with projected revenue of $12b-$13b.
---------------------------------
[1] Now, true, we are all going off rumours here (as this is not a public company, we don't have any visibility into the actual numbers), but some numbers match up with what public info there is and some don't.
> It looks like you're using "expenses" to mean "opex"
I took their losses and added it to their revenue. That seems like that sum would equal expenses.
> The $40b funding was not all in cash up front,
Does this matter? I'm not counting it as revenue.
> In short, in the last two to three months, OpenAI spent $5b/month on revenues of $0.5b/m.
You're repeating the same claim as before, I've not seen any evidence to support your numbers.
The evidence I linked you to suggests the 2025 average will be double that revenue, $1bn/month, at an expense of ($9bn loss + $12bn revenue) / 12 months = $21bn / 12 months = $1.75bn/month.
> Does this matter? I'm not counting it as revenue.
Well, yes, because they forecast spending all of it by end of 2025, and they moved up their last round ($8.3b) by a month or two because they needed the money.
My point was, they received a cash injection of $10b (first part of the $40b raise) and that lasted only two months.
>> In short, in the last two to three months, OpenAI spent $5b/month on revenues of $0.5b/m.
> You're repeating the same claim as before, I've not seen any evidence to support your numbers.
Briefly, we don't really have visibility into their numbers. What we do have visibility into is how much cash they needed between two points (Specifically, the months of June and July). We also know what their spending commitment is (to their capex suppliers) for 2025. That's what I'm using.
They had $10b injected at the start of June. They needed $8.3b at the end of July.
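For what it's worth, here is the arithmetic behind that inference, taking the rumoured figures in this thread at face value (none of these numbers are audited, and the revenue run rate is the commenter's own assumption):

```python
# Implied burn rate from the cash-injection timing claimed above.
# All figures are the rumoured numbers from this thread, in $bn.
cash_injected = 10.0     # received at the start of June
months_elapsed = 2       # gone by the end of July, per the claim
monthly_revenue = 0.5    # assumed revenue run rate, $bn/month

# Spend = cash drawn down per month + revenue that also got spent.
monthly_spend = cash_injected / months_elapsed + monthly_revenue
print(monthly_spend)     # 5.5 -> roughly the "$5b/month on $0.5b/m" claim
```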
It's crazy how many people are completely confident in their "knowledge" of the margins these products have despite the companies providing them not announcing those details!
(To be clear, I'm not criticising the person I'm replying to.)
I tend to rough-estimate it based on known compute/electricity costs for open weights models etc., but what evidence I do have is loose enough that I'm willing to believe a factor of 2 per standard deviation of probability in either direction at the moment, so long as someone comes with receipts.
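As an illustration of that kind of rough estimate, here is electricity cost per token using the ~700 W / 50 tokens-per-second figures from upthread and an assumed power price (every input is an assumption, and this ignores hardware amortisation entirely):

```python
# Electricity-only cost per million output tokens, assuming the
# ~700 W at ~50 tokens/second figures quoted earlier in the thread.
watts = 700
tokens_per_second = 50
usd_per_kwh = 0.10            # assumed electricity price

joules_per_token = watts / tokens_per_second              # 14 J/token
kwh_per_million_tokens = joules_per_token * 1e6 / 3.6e6   # ~3.9 kWh
print(kwh_per_million_tokens * usd_per_kwh)               # ~$0.39 per 1M tokens
```

Electricity alone comes out tiny; the big unknowns are hardware amortisation and utilisation, which is exactly why the error bars stay at a factor of 2 or more.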
Subscription revenue and corresponding service provision are also a big question, because those will almost always be either under- or over-used, never precisely balanced.
I think the above post has a fair point. Demand for chatbot customer service in various forms is surely "insanely high" - but demand from whom? Because I don't recall any end-user ever asking for it.
No, instead it'll be the new calculator that you can use to lazy-draft an email on your 1.5-hour Ryanair economy flight to the South. Both unthinkable luxuries just decades ago, but neither of which has transformed humanity profoundly.
This is just the same argument. If you believe demand for AI is low then you should be able to verify that with market data.
Currently market data is showing a very high demand for AI.
These arguments come down to "thumbs down to AI". If people just said that, it would at least be an honest argument. But pretending that consumers don't want LLMs when they're some of the most popular apps in the history of mankind is not a defensible position.
I'm not sure this works in reverse. If demand is indeed high, you could show that with market data. But if you have market data, e.g. showing high valuations of AI companies, or x many requests over some period, that doesn't necessarily mean that demand is high. In other words, market data is necessary but not sufficient to prove your claim.
Reasons for market data seemingly showing high demand without there actually being any include: market manipulation (including marketing campaigns), artificial or inflated demand, forced usage, hype, etc. As an example, NFTs, Bitcoin, and supersonic jet travel all had "insane market data" which seemed at the time to show that there was huge demand for these things.
My prediction is that we are in the early Concorde era of supersonic jet travel, and Boeing is racing to catch up to the promise of this technology. Except that in an unregulated market such as the current tech market, we have forgone all the safety and security measures, and the Concorde has made its first passenger flight in 1969 (as opposed to 1976), with tons of fanfare and all flights fully booked months in advance.
Note that in the 1960s, market forecasts had demand for the Concorde at 350 airplanes by 1980, yet at the time the first prototypes were flying they had 74 options. Only 20 were ever built for passenger flight.
As an end user I have never asked for a chatbot. And if I'm calling support, I have a weird issue that I probably need a human being to resolve.
But! We here are not necessarily typical callers. How many IT calls from the general population can be served efficiently (for both parties) with a quality chatbot?
And lest you think I'm being elitist, let's take an area I am not proficient in, such as HR, where I am "general population".
Our internal corporate chatbot has turned from an atrocious insult to man and God 7 years ago into something far more efficient than a friendly but underpaid, inexperienced human being 3 countries away answering my incessant questions: what holidays do I have again, how many sick days do I have and how do I enter them, how do I process retirement, how do I enter my expenses, what's the difference between short- and long-term disability, etc. And it has a button to "start a complex HR case / engage a human being" for edge cases, so internally it works very well.
This is narrow anecdata about the notion of service-support chatbots; don't infer (hah) any further claims about morality, the economy, or the future of LLMs.
ChatGPT, Claude, Gemini in chatbot or coding-agent form? Great stuff, saves me some googling.
The same AI popping up in an e-mail, chat, or spreadsheet tool? No thanks, normal people don't need an AI summary of a 200-word e-mail or Slack thread. And if I've paid a guy a month's salary to write a report on something, of course I'll find 30 minutes to read it cover to cover.