This is something I've been wondering about myself. What's the "minimally viable LLM" that can hold a simple conversation? My next question is how far we can push it: if it can learn by looking up data externally, can we build a tiny model with an effectively much larger context window? I have to assume I'm not the only one who has asked or thought about these things.
Ultimately, if you can build an ultra tiny model that can talk and learn on the fly, you've just fully localized a personal assistant like Siri.
This is very similar to Karpathy's idea of a "cognitive core" [1]: an extremely small model with near-zero encyclopedic knowledge but basic reasoning and tool-use capabilities.
My personal benchmark is: can I run it on a basic smartphone, with whatever the memory "floor" is for basic smartphones under, let's say, $300 (and let's pretend RAM prices are normal)?
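For a rough sense of scale (back-of-envelope, assuming 4-bit quantized weights): a 1B-parameter model needs about 10^9 × 0.5 bytes ≈ 0.5 GB for the weights alone, plus working memory for the KV cache and whatever the OS keeps resident, so the 1B class should fit comfortably on a phone with 3-4 GB of RAM.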
Edit: the fact that this runs on a smartphone makes it highly relevant. My only question is how we give such a model an "unlimited" context window, so it can digest as much as it needs. I know some models handle multiple languages; I wouldn't be surprised if sticking to English only would shrink the model, reduce the hardware requirements, and make it even tighter.
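One common answer to the "unlimited context" question is retrieval rather than a bigger window: keep the corpus outside the model and only feed it the few chunks relevant to the current query. A minimal sketch of that idea in Python, where embed() and generate() are hypothetical stand-ins for whatever tiny on-device embedding and chat models you'd actually run, and the brute-force cosine scan is purely illustrative:

    # Retrieval sketch: the model's "memory" lives outside its context
    # window. embed() and generate() are hypothetical on-device stubs.
    import math

    def cosine(a, b):
        # Cosine similarity between two embedding vectors.
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    def answer(query, corpus, embed, generate, k=3):
        # Embed every chunk once (in practice you'd build and cache
        # this index ahead of time instead of per query).
        index = [(chunk, embed(chunk)) for chunk in corpus]
        qv = embed(query)
        # Pull only the k most relevant chunks into the context window.
        top = sorted(index, key=lambda ce: cosine(qv, ce[1]), reverse=True)[:k]
        context = "\n".join(chunk for chunk, _ in top)
        return generate(f"Context:\n{context}\n\nQuestion: {query}")

The point is that the window only ever has to hold the query plus a handful of retrieved chunks, so the corpus can grow without the model (or its context window) growing with it.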
What's fun to speculate about is whether we could have had some very basic LLMs as early as the '90s if the techniques had been invented sooner. I wonder what the world would look like now if we had.