
If the model weights are stored as int8, does this mean that the floating-point capacity of the GPU is wasted? Or is the int8 converted to float on the GPU?


Well, tensor cores support int8 instructions (at least from Turing onwards), so the hardware is being used, if that’s your concern.
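To make that concrete, here is a minimal sketch (in NumPy, not actual tensor-core code) of what an int8 matmul path looks like: weights and activations are quantized to int8 with a per-tensor scale, the multiply-accumulate runs in integer arithmetic with an int32 accumulator (as int8 tensor cores do), and the result is rescaled back to float at the end. The quantization scheme shown (symmetric, per-tensor) is one common choice, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)
w_fp32 = rng.standard_normal((4, 8)).astype(np.float32)  # "weights"
x_fp32 = rng.standard_normal((8, 3)).astype(np.float32)  # "activations"

# Quantize: map floats to int8 with a symmetric per-tensor scale.
w_scale = np.abs(w_fp32).max() / 127.0
w_int8 = np.clip(np.round(w_fp32 / w_scale), -127, 127).astype(np.int8)

x_scale = np.abs(x_fp32).max() / 127.0
x_int8 = np.clip(np.round(x_fp32 / x_scale), -127, 127).astype(np.int8)

# int8 x int8 matmul with int32 accumulation -- the part the
# tensor core executes natively, no float conversion of the weights.
acc_int32 = w_int8.astype(np.int32) @ x_int8.astype(np.int32)

# Dequantize the accumulator back to float with the combined scale.
y_quant = acc_int32.astype(np.float32) * (w_scale * x_scale)

# Compare against the full-precision reference.
y_fp32 = w_fp32 @ x_fp32
print(np.max(np.abs(y_quant - y_fp32)))
```

So the weights stay int8 end to end; only the final accumulator is scaled back to float, which is why int8 storage doesn't imply wasting the hardware.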



