Comes with a single PCIe 3.0 x16 link on die, good for roughly 16 GByte/s in each direction.
This is the area where Intel is just killing it with their E5 chips, along with being able to write directly to the L3 from I/O. (I have no idea if AMD does this.)
The E5 is so good that it lets you do entirely different architectures from what came before it. Total game changer.
> The E5 is so good that it lets you do entirely different architectures from what came before it.
As an example, Luke Gorrie is actively working on doing so by talking directly to Ethernet controllers via DMA from user space. Here he is in a 30-minute talk about exploiting 512 Gbit/s of PCIe in his project, Snabb Switch. He's even written a 10 Gbit/s Intel Ethernet driver in Lua. The idea, as far as I can tell, is that you can turn a common Xeon server into a very low-latency, zero-copy, multi-gigabit, software-defined layer 2 network appliance.
Intel seems stuck at 2P, and HT still has massively lower latency, but 80 GByte/s worth of PCIe lanes is huge. As big as main-memory throughput huge! Hence DDIO, which you reference: it lets I/O write straight to cache and skip the historic data path through main memory. AFAIK AMD doesn't have anything equivalent. And they only have 16 lanes on chip; the rest come out of I/O hubs.
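For anyone wanting to check the numbers, here's a rough back-of-the-envelope sketch of where "16 GByte/s" and "80 GByte/s" come from. It assumes PCIe 3.0's 8 GT/s per lane with 128b/130b encoding, and 40 lanes per E5 socket in a 2P box (my assumption for an E5-2600-class part, not something stated above). It ignores packet/protocol overhead, so real throughput would be lower.

```python
# Rough PCIe 3.0 bandwidth estimate, per direction, ignoring TLP/protocol overhead.
GT_PER_S = 8e9          # PCIe 3.0 signalling rate per lane (transfers per second)
ENCODING = 128 / 130    # 128b/130b line encoding efficiency


def pcie3_gbytes_per_s(lanes: int) -> float:
    """Approximate one-direction bandwidth in GByte/s (1e9 bytes) for `lanes` lanes."""
    return lanes * GT_PER_S * ENCODING / 8 / 1e9


# Assumptions for illustration: 40 lanes per socket, two sockets.
LANES_PER_SOCKET = 40
SOCKETS = 2

x16_link = pcie3_gbytes_per_s(16)
two_socket = pcie3_gbytes_per_s(LANES_PER_SOCKET * SOCKETS)

print(f"x16 link:      {x16_link:.2f} GByte/s")
print(f"2P, 80 lanes:  {two_socket:.2f} GByte/s")
```

So the round figures quoted in the thread are slightly optimistic but in the right ballpark, at least under these assumptions.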
I'd love to see someone actually try and use all that Intel PCIe IO and report on how utilized those pipes can get. Perhaps someone wants to send the PacketShader people a box loaded with GPUs? That'd be great, thanks!
Cool project! I wonder if you'd get similar perf from CPUs if you used Intel's ISPC compiler[0] with the same GPU algorithms. I've found that GPU algorithms also perform substantially better on plain old CPUs, IMO because they use memory bandwidth more effectively.
I too would like to see how far those PCI Express busses can be pushed. :)
BTW We're adopting Intel's DPDK[1] approach to get massive packet processing performance on a single machine. So far we're liking it, but we'll see as it's not in production yet.
I don't follow hardware too closely, but I'm under the impression the new processors have ridiculously complicated architectures now. Integrated graphics on die, PCI bridges, write-through cache... I remember back in the day it was Processor / Northbridge / Southbridge. Is it still the case? In which direction are they going? System-on-a-chip?