This is something I've been wondering about myself. What's the "minimally viable LLM" that can hold a simple conversation? My next question is how far we can push it: if it can learn by looking up data externally, can we build a tiny model with an effectively much larger context window? I have to assume I'm not the only one who has asked or thought about these things.
Ultimately, if you can build an ultra tiny model that can talk and learn on the fly, you've just fully localized a personal assistant like Siri.
This is very similar to Karpathy's idea of a "cognitive core" [1]: an extremely small model with near-zero encyclopedic knowledge but basic reasoning and tool-use capabilities.
My personal benchmark is: can I run it on a basic smartphone, with whatever the memory "floor" is for basic smartphones under, let's say, $300 (and let's pretend RAM prices are normal)?
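For a rough sense of scale (back-of-envelope, assuming 4-bit quantized weights): a 1B-parameter model needs about 10^9 × 0.5 bytes ≈ 0.5 GB for the weights alone, plus working memory for the KV cache and whatever the OS keeps resident, so the 1B class should fit comfortably on a phone with 3-4 GB of RAM.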
Edit: the fact that this runs on a smartphone makes it highly relevant. My only question is how we give such a model an "unlimited" context window, so it can digest as much as it needs. I know some models handle multiple languages; I wouldn't be surprised if sticking to English only would shrink the model, reduce the hardware requirements, and make it even tighter.
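One common answer to the "unlimited context" question is retrieval rather than a bigger window: keep the corpus outside the model and only feed it the few chunks relevant to the current query. A minimal sketch of that idea in Python, where embed() and generate() are hypothetical stand-ins for whatever tiny on-device embedding and chat models you'd actually run, and the brute-force cosine scan is purely illustrative:

    # Retrieval sketch: the model's "memory" lives outside its context
    # window. embed() and generate() are hypothetical on-device stubs.
    import math

    def cosine(a, b):
        # Cosine similarity between two embedding vectors.
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    def answer(query, corpus, embed, generate, k=3):
        # Embed every chunk once (in practice you'd build and cache
        # this index ahead of time instead of per query).
        index = [(chunk, embed(chunk)) for chunk in corpus]
        qv = embed(query)
        # Pull only the k most relevant chunks into the context window.
        top = sorted(index, key=lambda ce: cosine(qv, ce[1]), reverse=True)[:k]
        context = "\n".join(chunk for chunk, _ in top)
        return generate(f"Context:\n{context}\n\nQuestion: {query}")

The point is that the window only ever has to hold the query plus a handful of retrieved chunks, so the corpus can grow without the model (or its context window) growing with it.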
What's fun to speculate about is whether we could have had some very basic LLMs as early as the '90s if the techniques had been invented sooner. I wonder what the world would look like now if we had.