A pretty good read that succinctly picks apart the realities of current AI businesses. Easily something I’d reference as a “primer” for someone who is more business-minded than technically-minded.
One point I’ll agree on is his final one: that the true big players haven’t even been founded yet. Right now, the AI hype seems to still revolve around the dream of replacing humans with machines and still magically making Capitalism work in the process, which is something I (and other “contrarians”) have beaten to death in other threads. That said, what these companies have managed to demonstrate is that transformer-based predictive models are a part of the future - just not AGI.
If I were a VC, I’d be looking at startups that take the same training techniques but apply them in niche fields with higher success rates than general models. An example might be a firm that puts in the grunt work of training a foundational model in a specific realm of medicine, and then makes it easier for a hospital network to run said model locally against patient data while also continuously training and fine-tuning the underlying model. I wouldn’t want to get into the muck of SaaS in these cases, because data sovereignty is only going to become an ever-thornier issue in the coming decades, and these prediction models can leak user data like a sieve if not implemented correctly. Same goes for other narrow applications, like single-mode logistics networks or on-site hospitality interfaces. The real money will be in the ability to run foundational models against your own data in privacy and security, with inference at the edge or on-device rather than off in a hyperscaler datacenter somewhere.
Then again, I could be totally wrong. Guess we’ll all find out together.
I believe one of the real insights from the widespread adoption of LLMs across problem domains is that the general knowledge of such models actually translates into better performance on domain-specific tasks. Hence finetuning is a better approach than training from scratch, unless you have insane compute (at which point, why restrict yourself to a narrow domain?).
Aren't there already a ton of startups doing finetunes for their local niche? Many aren't even "AI" companies - it's pretty easy to slap a finetune together if you have enough data.
If you mean developing a model from scratch just for your niche - the bitter lesson is that scale is everything and that a finetune from an internet-scale model will outperform you easily.
DeepSeek has done something pretty remarkable. It’s certainly not “just” fine-tuning a Llama or a GPT prompt. More of an order-of-magnitude optimization.