
If the model weights are stored as int8, does this mean that the floating-point capacity of the GPU is wasted? Or are the int8 values converted to float on the GPU?


Well, tensor cores support int8 instructions (at least from Turing onwards), so the hardware is being used, if that’s your concern.
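
To make the integer path concrete, here is a rough sketch in plain Python of one common scheme (symmetric per-tensor quantization of both weights and activations). It is an assumption for illustration, not how any particular model or library does it: the function names `quantize` and `int8_dot` are made up, and real kernels run this on tensor cores rather than in a Python loop. The multiply-accumulate happens entirely on integers, and floating point only appears in the final rescale.

```python
def quantize(values, num_bits=8):
    """Symmetric per-tensor quantization (hypothetical helper).

    Returns a list of integers in [-127, 127] and the float scale
    such that value ~= q * scale.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def int8_dot(qa, scale_a, qb, scale_b):
    """Dot product computed on the integer values, rescaled once at the end.

    On real hardware the products are accumulated in a wider integer
    (typically int32); Python ints stand in for that here.
    """
    acc = sum(x * y for x, y in zip(qa, qb))  # integer-only arithmetic
    return acc * (scale_a * scale_b)          # single float rescale


weights = [0.5, -1.0, 0.25, 0.75]
activations = [1.0, 2.0, -0.5, 0.0]

qw, sw = quantize(weights)
qx, sx = quantize(activations)

approx = int8_dot(qw, sw, qx, sx)
exact = sum(w * x for w, x in zip(weights, activations))
print(approx, exact)  # the two values should be close, not identical
```

The other common setup is weight-only quantization, where the int8 weights are converted back to a floating-point type on the GPU just before the matmul. In that case the floating-point units do the arithmetic and int8 mainly saves memory and bandwidth.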



