If VRAM is ~$10/GB, I suspect people paying $450 for a 12GB card would be happy to pay $1200 for a 64GB card. Running a local LLM only uses about 3-6% of my GPU's compute, but all of its VRAM. A local LLM has no need for six 3090s' worth of compute to serve one user or a handful of users; it just needs the VRAM to hold the model.
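(For what it's worth, you can watch that imbalance yourself by polling the GPU while the model is generating. A minimal sketch assuming the pynvml bindings and a single NVIDIA card; the poll count and interval are arbitrary:)

    # Rough sketch: poll compute vs. VRAM usage while a local LLM generates.
    # Assumes `pynvml` (pip install nvidia-ml-py) and GPU index 0.
    import time
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    for _ in range(10):
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # % of time compute is busy
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)          # VRAM in bytes
        print(f"compute: {util.gpu:3d}%   VRAM: {mem.used / mem.total:6.1%}")
        time.sleep(1)

    pynvml.nvmlShutdown()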
Exactly. People would be thrilled with a $1200 64GB card with OK processing power and transfer speed; it's a bit of a mystery why it doesn't exist. Intel is enabling vendors to 'glue' two 24GB cards together into a 48GB card with a $1200 list price, but it's a Frankenstein's monster and will probably not actually be available at that price.
Nvidia has zero incentive to undercut their enterprise GPUs by adding more RAM to “cheap” consumer cards like the 5090.
Intel and even AMD can’t compete, or aren’t bothering. I guess we’ll see how the glued 48GB B60 does, but that’s still a relatively slow GPU regardless of memory. It might be quite competitive with Macs, though.
Plenty of people use e.g. 2, 4 or 6 3090s to run large models at acceptable speeds.
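Back-of-envelope on why: at 4-bit quantization the weights alone are roughly half a gigabyte per billion parameters, plus a few GB for KV cache and runtime overhead, so large models spill past a single 24GB card. A rough sizing sketch (the cache and overhead figures are just ballpark assumptions, not measurements):

    import math

    def vram_gb(params_b, bits_per_weight=4, kv_cache_gb=4.0, overhead_gb=2.0):
        """Very rough VRAM estimate: weights plus a guess for KV cache/overhead."""
        weights_gb = params_b * bits_per_weight / 8  # billions of params -> GB
        return weights_gb + kv_cache_gb + overhead_gb

    for params in (13, 70, 123):
        need = vram_gb(params)
        cards = math.ceil(need / 24)  # how many 24GB 3090s to hold it
        print(f"{params}B @ 4-bit: ~{need:.0f} GB -> {cards}x 3090")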
More VRAM at decent speeds (much faster than DDR5) would make cards far better for AI.