
Comes with a single PCIe 3.0 x16 link on die, good for 16 GB/s.
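For a sanity check on that figure: PCIe 3.0 signals at 8 GT/s per lane with 128b/130b encoding, so per-direction throughput falls out of simple arithmetic (a sketch, not vendor numbers):

```c
/* PCIe 3.0: 8 GT/s per lane, 128b/130b line coding. */
double pcie3_lane_gbytes_per_s(void) {
    return 8.0 * (128.0 / 130.0) / 8.0;  /* GT/s * coding efficiency, bits -> bytes */
}

/* Per-direction bandwidth of a link with the given lane count. */
double pcie3_link_gbytes_per_s(int lanes) {
    return pcie3_lane_gbytes_per_s() * lanes;
}
```

An x16 link comes out to about 15.75 GB/s per direction (the "16 GB/s" above), and one E5 socket's 40 lanes moving data both ways is roughly 78.8 GB/s, which is where the "80 GB/s" ballpark comes from.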

This is the area where Intel is just killing it with their E5 chips, along with being able to write directly to the L3 from I/O. (I have no idea if AMD does this.)

The E5 is so good that it lets you do entirely different architectures from what came before it. Total game changer.



> The E5 is so good that it lets you do entirely different architectures from what came before it.

As an example: Luke Gorrie is one such person actively doing this, talking directly to Ethernet controllers via DMA from user space. Here he is in a 30-minute talk about exploiting 512 Gbit/s of PCIe in his project, Snabb Switch. He's even written a 10 Gbit/s Intel Ethernet driver in Lua. The idea, as far as I can tell, is that you can turn a commodity Xeon server into a very low-latency, zero-copy, multi-gigabit, software-defined layer 2 network appliance.

https://cast.switch.ch/vod/clips/26uo9i576i/

https://github.com/SnabbCo/snabbswitch/wiki
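The core of such a driver is a poll-mode loop over the NIC's RX descriptor ring. Here's a rough C sketch of the idea; the descriptor layout is hypothetical (real NICs like Intel's 82599 define their own), and Snabb does the equivalent from LuaJIT via FFI:

```c
#include <stdint.h>

/* Hypothetical RX descriptor ring, loosely modeled on how NICs hand
 * received packets to software. Real hardware layouts differ; this
 * only illustrates the poll-mode, zero-copy idea. */
#define RING_SIZE 8
#define DD_BIT    0x1   /* "descriptor done": NIC has filled this slot */

struct rx_desc {
    uint64_t buf_addr;  /* DMA target: where the NIC wrote the packet */
    uint16_t length;    /* bytes the NIC wrote */
    uint16_t status;    /* DD_BIT set by hardware on completion */
};

/* Walk the ring from *head, handing each completed buffer to `deliver`
 * without copying it (only the pointer moves -- zero-copy). Returns the
 * number of packets seen. A NULL `deliver` just drains the ring. */
int rx_poll(struct rx_desc ring[RING_SIZE], unsigned *head,
            void (*deliver)(uint64_t buf, uint16_t len)) {
    int n = 0;
    while (ring[*head].status & DD_BIT) {
        if (deliver)
            deliver(ring[*head].buf_addr, ring[*head].length);
        ring[*head].status = 0;            /* return the slot to the NIC */
        *head = (*head + 1) % RING_SIZE;
        n++;
    }
    return n;
}
```

A real driver would also refill buffers and bump the NIC's tail register after each batch; the point is that no kernel, no interrupt, and no copy sits between the wire and your code.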


Intel seems stuck at 2P, and HyperTransport still has massively lower latency, but 80 GB/s worth of PCIe lanes is huge. As big as main-memory throughput huge! Hence DDIO, which you reference: it lets I/O write straight into cache, skipping the historic data path through main memory. AFAIK AMD doesn't have anything equivalent. And they only have 16 lanes on chip; the rest come out of I/O hubs.

I'd love to see someone actually try to use all that Intel PCIe I/O and report on how utilized those pipes can get. Perhaps someone wants to send the PacketShader people a box loaded with GPUs? That'd be great, thanks!

http://shader.kaist.edu/packetshader/


Cool project! I wonder if you'd get similar perf from CPUs if you used Intel's ISPC compiler[0] with the same GPU algorithms. I've found that GPU algorithms also perform substantially better on plain old CPUs, IMO because they use memory bandwidth more effectively.
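One concrete reason, sketched in C (the names are illustrative, nothing here is ISPC's API): GPU-style kernels usually operate on structure-of-arrays data, where each field is a dense contiguous stream, so a pass over one field touches only the cache lines it needs and vectorizes cleanly:

```c
#include <stddef.h>

struct particle_aos { float x, y, z, mass; };    /* array of structures */

struct particles_soa {                           /* structure of arrays */
    float *x, *y, *z, *mass;
    size_t n;
};

/* AoS: summing one field drags the unused x, y, z fields through
 * the cache along with it. */
float total_mass_aos(const struct particle_aos *p, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p[i].mass;
    return sum;
}

/* SoA: the same reduction streams one dense array -- a quarter of
 * the memory traffic for this struct, and trivially vectorizable. */
float total_mass_soa(const struct particles_soa *p) {
    float sum = 0.0f;
    for (size_t i = 0; i < p->n; i++)
        sum += p->mass[i];
    return sum;
}
```

Both return the same answer; the SoA version just spends the memory bus on data it actually uses, which is the habit GPU programming forces on you.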

I too would like to see how far those PCI Express buses can be pushed. :)

BTW, we're adopting Intel's DPDK[1] approach to get massive packet-processing performance on a single machine. So far we're liking it, but we'll see; it's not in production yet.

[0] http://ispc.github.io/

[1] http://www.intel.com/content/www/us/en/intelligent-systems/i...


I don't follow hardware too closely, but I'm under the impression that the new processors have ridiculously complicated architectures now. Integrated graphics on die, PCIe controllers, I/O that writes straight to cache... I remember back in the day it was Processor / Northbridge / Southbridge. Is that still the case? Which direction are they going: system-on-a-chip?



