
I'm guessing that the "ML accelerator" in the CPU cores means one of ARM's matrix-multiplication extensions, such as SME. ARMv8.4-A adds the dot-product instructions (SDOT/UDOT); v8.6-A adds more, including BF16 support.

https://community.arm.com/arm-community-blogs/b/architecture...
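
To make that concrete (my own sketch, not from the linked post): the dot-product instructions perform small int8 dot products in a single instruction, which is exactly the inner loop of quantized matrix multiplication. A minimal C kernel using the vdotq_s32 intrinsic, assuming a compiler targeting armv8.2-a+dotprod or later:

    #include <arm_neon.h>
    #include <stdint.h>

    /* Dot product of two int8 vectors; n assumed to be a multiple of 16. */
    int32_t dot_i8(const int8_t *a, const int8_t *b, int n) {
        int32x4_t acc = vdupq_n_s32(0);
        for (int i = 0; i < n; i += 16) {
            int8x16_t va = vld1q_s8(a + i);
            int8x16_t vb = vld1q_s8(b + i);
            /* SDOT: four 4-way int8 dot products accumulated into int32 lanes. */
            acc = vdotq_s32(acc, va, vb);
        }
        return vaddvq_s32(acc); /* horizontal sum of the four accumulators */
    }

Compile with something like clang -O2 -march=armv8.4-a; each 16 bytes of input maps onto a single SDOT instruction.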



Apple has the NPU (also called the Apple Neural Engine), which is dedicated hardware for running inference. It can't be used for LLMs at the moment, though; maybe the M4 will be different. They also have a vector/matrix coprocessor attached to the performance cluster of the CPU; the instruction set for it is called AMX. I believe that one can be leveraged for faster LLM inference.

https://github.com/corsix/amx
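
For what it's worth (my own example, and an assumption about how software actually reaches AMX): the coprocessor isn't exposed directly, so the sanctioned route is Apple's Accelerate framework, whose BLAS routines are reported to run on the AMX units. A minimal single-precision GEMM through the standard cblas_sgemm call:

    #include <Accelerate/Accelerate.h>
    #include <stdio.h>

    int main(void) {
        /* C = A * B with A 2x3, B 3x2, C 2x2, all row-major. */
        float A[6] = {1, 2, 3, 4, 5, 6};
        float B[6] = {7, 8, 9, 10, 11, 12};
        float C[4] = {0};

        cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    2, 2, 3,      /* M, N, K */
                    1.0f, A, 3,   /* alpha, A, lda */
                    B, 2,         /* B, ldb */
                    0.0f, C, 2);  /* beta, C, ldc */

        printf("%.0f %.0f\n%.0f %.0f\n", C[0], C[1], C[2], C[3]);
        return 0;
    }

Build on macOS with clang gemm.c -framework Accelerate. As I understand it, this is also roughly the path llama.cpp's Accelerate backend takes for the big matmuls.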



