
Since quantizing a model requires more RAM than running it, why can’t the quantized models be uploaded for use by those with 16-32 GB?


It's against the license terms, but seeing how quickly the weights were leaked in the first place, I wouldn't be surprised if a torrent shows up with quantized ones soon.


They can be, and they have been. Look on HuggingFace or search torrent sites.

You actually only need about 30 GB of VRAM (or unified memory), and no system RAM, to run the largest 65B model.
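
For anyone who wants to sanity-check that figure, here is a rough back-of-the-envelope sketch. It assumes "65B" means 65e9 parameters and that the model is quantized to 4 bits per weight (the thread doesn't state the bit width, so that is my assumption). It counts the weights only and ignores activations, the KV cache, and any per-group scale overhead. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Ignores activations, KV cache, and quantization metadata.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumed parameter count for the "65B" model.
N_PARAMS = 65e9

for bits in (16, 8, 4):
    print(f"65B at {bits}-bit: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under those assumptions the weights come to roughly 130 GB at 16-bit and 32.5 GB at 4-bit. That is in the same ballpark as the "about 30 GB" above, and it also suggests why quantizing is heavier than running: a process that loads the full-precision weights has far more to hold than one that loads only the 4-bit result.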



