Hacker News

Yeah, the cost figures need more scrutiny: they started with Llama 3, which they got for free; had they had to build it from scratch, it would have cost more than $6M.

But as for your first paragraph: even if the "big AI players" have some secret sauce that will make their products better (and that they can actually keep secret), it seems unlikely it would be enough to command higher prices durably.

A model would have to be incredibly superior to justify paying for it, when there are so many free (or dirt cheap) alternatives that are simply good enough.



I don't know where you're getting your information from. Maybe you're confusing DeepSeek V3/R1 with the distilled R1 models.

The DeepSeek V3/R1 architecture isn't anything like Llama 3's. Llama 3 isn't even a mixture of experts, not to mention the various other differences like attention compression (multi-head latent attention), etc.
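To make the contrast concrete: in a dense model like Llama 3, every token passes through every feed-forward parameter, while a mixture-of-experts layer routes each token through only a few of many expert networks. Here's a minimal toy sketch of top-k MoE routing in NumPy; the dimensions, expert count, and weights are all made up for illustration and this is not DeepSeek's actual implementation.

```python
# Toy sketch of mixture-of-experts (MoE) routing -- the architectural
# feature contrasted with Llama 3's dense feed-forward layers above.
# All sizes and weights here are invented for the example.
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 8, 4, 2  # hidden dim, total experts, experts used per token

# Each "expert" is its own small feed-forward weight matrix.
experts = [rng.normal(size=(D, D)) for _ in range(N_EXPERTS)]
router_w = rng.normal(size=(D, N_EXPERTS))  # learned gating weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector x through its top-k experts."""
    logits = x @ router_w
    top = np.argsort(logits)[-TOP_K:]  # indices of the k highest-scoring experts
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen experts
    # Only k of the N experts actually run: per-token compute scales with k,
    # while total parameter count scales with N. That gap is the point of MoE.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.normal(size=D)
out = moe_forward(token)
print(out.shape)  # (8,)
```

In a dense model the equivalent layer would multiply every token by one large weight matrix, so compute and parameter count grow together; MoE decouples them, which is part of why DeepSeek's per-token training cost can be low relative to its total parameter count.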


Indeed I got confused. DeepSeek V3 is not based on Llama 3. Sorry about that.


You make a good point: maybe the models won't perform much better with those improvements, or at least not enough to get people to pay more.

I’m curious about the Llama 3 bit - do you have a source for that? I’ve been hearing they trained using OpenAI outputs (not sure how that would work).




