> Parent comment is fair and technically accurate.
In what way precisely? That local LLMs "suck"? Is that a technical argument?
Or this statement: "there's little to no economical and functional meaning to those NPUs" - is that an actual factual statement, or emotionally charged verbal flatulence? And what does "they won't help your local inference because they are not general purpose enough for that" even mean? People successfully run large-ish MoE LLMs on AMD Ryzen AI mini-PCs.
> Do you have a real argument, especially a technical one, that you can contribute?
What kind of argument do you want me to "contribute" with respect to the ideological rant the "parent comment" managed to produce?
Hey, OP here with their (apparently) controversial views. I stand firmly behind these points:
- I shouldn't be paying more for my next CPU because it has an NPU that I won't ever use. Give me the freedom of choice.
- Given that freedom of choice, it seems a majority would opt out (as seen recently with Dell), so the morals of all this are dubious.
- NPUs may not be completely stupid as a concept, in theory, but at this point in time they are proprietary black boxes purpose-built for marketing and micro-benchmarks. Give me something more general-purpose and open, and I will change my mind.
- …but the problem is, you can only build so much general-purpose computing into a bespoke processor. That's kind of its defining trait. So I won't hold my breath.
- Re: local inference for the masses, putting aside the NPU shortcomings from above: how large do you think an LLM needs to be for your average laptop user to deem it useful? What would the inference story look like, in your honest opinion (in terms of downloading the model, loading it into memory, roundtrip times)? And how often would the user realistically want to suffer through all that, versus just hopping over to ${favorite_llm.ai} from their browser?
Anyhow, if that makes me "antiai", please, sign me up!
> I shouldn't be paying more for my next CPU because it has a NPU that I won't ever use. Give me the freedom of choice.
There is plenty to choose from.
> - NPUs may not be completely stupid as a concept, in theory, but at this point in time they are proprietary black-boxes purpose-built for marketing and micro-benchmarks. Give me something more general-purpose and open, and I will change my mind
In fact, the linked article is not talking about NPUs in particular, but about Ryzen AI CPUs. These have unified memory and more compute than regular parts, which makes them very useful for inference.
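To put rough numbers on that claim: single-stream decode is mostly memory-bandwidth bound, so a ceiling estimate is just bandwidth divided by the bytes of active weights read per token. A minimal sketch with assumed figures (none of these come from the thread: ~256 GB/s unified-memory bandwidth for a Ryzen AI Max class part, ~0.55 bytes per parameter for a 4-bit quant):

```python
# Rough upper bound on decode speed for memory-bandwidth-bound inference:
# generating one token reads (roughly) every *active* weight once.
def decode_tokens_per_s(active_params_b, bw_gb_s=256.0, bytes_per_param=0.55):
    """active_params_b: billions of parameters actually read per token."""
    active_gb = active_params_b * bytes_per_param  # GB streamed per token
    return bw_gb_s / active_gb

# Dense 8B model vs. a MoE activating ~3B parameters per token:
print(f"dense 8B:   ~{decode_tokens_per_s(8):.0f} tok/s ceiling")
print(f"MoE 3B act: ~{decode_tokens_per_s(3):.0f} tok/s ceiling")
```

Real throughput lands well below the ceiling, but it shows why MoE models, which only read a small active expert set per token, run comfortably on this class of hardware.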
> how large do you think a LLM needs to be so it's deemed useful by your average laptop user?
Depends on what they need it for. Useful autocomplete in an IDE starts at around 4B weights.
> loading it in memory
Happens only once, and usually takes around 10 seconds.
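That 10-second figure is about what simple arithmetic predicts: load time is quantized model size over disk throughput. A back-of-envelope sketch with assumed numbers (~0.55 bytes/param for a 4-bit quant including overhead, ~2 GB/s NVMe sequential read; neither figure is from the thread):

```python
# Back-of-envelope model load time: quantized size / disk read speed.
def load_time_seconds(params_billions, bytes_per_param=0.55, disk_gb_per_s=2.0):
    size_gb = params_billions * bytes_per_param  # on-disk size estimate
    return size_gb / disk_gb_per_s

for b in (4, 8, 30):
    print(f"{b}B model: ~{load_time_seconds(b):.1f}s from NVMe")
```

Even a 30B-class model stays under ~10 seconds on these assumptions, and with mmap-style loading the first tokens can arrive sooner.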
> roundtrip times
Negligible? It is local, after all.
> And how often would the user realistically want to suffer
Do you have a real argument, especially a technical one, that you can contribute?