
Yes, as on all NUMA machines, one CPU can access all memory, both local (attached to that CPU) and remote (reached through the interconnect). The problem is that there is a significant latency cost when a CPU accesses non-local memory (a limitation of the interconnect). So HPC people writing their algorithms make sure this happens as little as possible, by ensuring that the data each CPU uses is allocated as locally as possible (e.g., via the affinity controls provided by libnuma).

I was just curious if these kinds of optimizations are possible in the M1 Ultra.



But IS there an interconnect?

The way Apple presented it, it sounded more like the chips talk at a lower layer, as if the whole thing were built as one physical chip, rather than two normal chips joined by an interconnect fabric.

Someone will figure it out with benchmarks or something.


There is an interconnect. They just claim it is faster than competitors'.



