This comment is misleading. There is a "free lunch" here in the sense that serving this model at scale is far cheaper than serving worse open-source models.
Yes, they are probably more willing to go down in price due to this, but the architecture is open, and they are charging about what a 30B-50B dense model would cost, which is roughly how many active params DeepSeek-V3 has (37B).
It's a matter of degree. If 90% of the cost savings come from a new, smarter architecture, it doesn't make sense to point to the API terms as the reason it is so cheap.
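For a rough sense of the numbers, here is a back-of-the-envelope sketch. It assumes the common approximation of about 2 FLOPs per active parameter per generated token, and uses DeepSeek-V3's published figures of 671B total and 37B activated parameters. The function name and the 40B dense comparison point are my own picks (40B is just a value inside the 30B-50B range mentioned above). It ignores memory bandwidth, KV cache, and batching, so treat it as illustrative only.

```python
def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token (assumed ~2 per active param)."""
    return 2.0 * active_params


# Published DeepSeek-V3 parameter counts.
DEEPSEEK_V3_TOTAL = 671e9
DEEPSEEK_V3_ACTIVE = 37e9

# Hypothetical dense model picked from the 30B-50B range, for comparison only.
DENSE_COMPARISON = 40e9

moe = flops_per_token(DEEPSEEK_V3_ACTIVE)
dense_same_total = flops_per_token(DEEPSEEK_V3_TOTAL)
dense_mid = flops_per_token(DENSE_COMPARISON)

# Compute per token relative to the MoE model.
ratio_vs_same_total = dense_same_total / moe
ratio_vs_mid_dense = dense_mid / moe

print(f"dense 671B vs MoE: {ratio_vs_same_total:.1f}x the compute per token")
print(f"dense 40B vs MoE:  {ratio_vs_mid_dense:.2f}x the compute per token")
```

Under these assumptions, per-token compute lands next to a ~40B dense model rather than a 671B one, which is the point about pricing tracking active params.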