
8 slow in-order A53 cores. I'd much rather take RPi4's four wide out-of-order A72 cores.


It really depends on your use. If you plan to build a desktop replacement, single-thread performance matters. If you want a small cluster node for educational purposes, more slow cores are better.


Clock for clock, A72 is 2-4x faster than A53.

But yeah, perhaps it's better to have a lot of small cores for a cluster.


That's great, but if you need crypto (because you have LUKS storage and send AES-encrypted data over the network, as in a typical NAS setup), an A53 with AES instructions may be faster than this A72 without them. And so far it seems these cores don't have AES instructions.
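For anyone who wants to check their own board: on Linux the kernel lists CPU feature flags in /proc/cpuinfo, and "aes" appears there when the crypto extensions are exposed. A minimal sketch, assuming a Linux system; the helper name has_aes is made up for illustration:

```python
def has_aes(cpuinfo_text: str) -> bool:
    """Return True if a 'Features' (ARM) or 'flags' (x86) line lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at the feature-flag lines, not e.g. the model name.
        if sep and key.strip().lower() in ("features", "flags"):
            if "aes" in value.split():
                return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AES instructions reported:", has_aes(f.read()))
    except OSError:
        # Assumption: this path only exists on Linux-like systems.
        print("no /proc/cpuinfo here (not Linux?)")
```

This only reports what the kernel exposes, so treat a negative result as a hint rather than proof about the silicon.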


Seriously? They left out Cryptography Extensions from the RPi4 SoC? Any source / references?

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc....

Edit: Ouch if this turns out to be true... https://www.raspberrypi.org/forums/viewtopic.php?t=243410

Edit 2: Ah, also RPi3B and 3B+ didn't have those extensions. Oh well.

Edit 3: Just tested my bog-standard RPi3B+ running Raspbian. "openssl speed -elapsed aes-256-cbc" gave 43 MB/s for AES-256-CBC. With "-multi 4" it reached 144 MB/s using all 4 cores. So perhaps the RPi4 will be fast enough with CPU-only crypto... would still love to have HW assist.

Edit 4: RPi4 running 32-bit OS can do about 65 MB/s per core aes-256-cbc. 85 MB/s per core for aes-128-cbc. So by using two CPU cores for encryption (+ heatsink + fan :-)) 1 Gbps ethernet can be saturated.
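The arithmetic behind the "two cores" estimate, as a rough sketch. It assumes decimal units (1 Gbps = 125 MB/s), ignores protocol overhead, and assumes throughput scales linearly with cores; cores_needed is just an illustrative helper:

```python
import math


def cores_needed(link_gbps: float, per_core_mb_s: float) -> int:
    """Cores required to encrypt at line rate, assuming linear scaling."""
    link_mb_s = link_gbps * 1000 / 8  # 1 Gbps -> 125 MB/s (decimal units)
    return math.ceil(link_mb_s / per_core_mb_s)


print(cores_needed(1, 65))  # aes-256-cbc at 65 MB/s per core -> 2
print(cores_needed(1, 85))  # aes-128-cbc at 85 MB/s per core -> 2
```

Two cores at 65 MB/s give 130 MB/s, which only just clears 125 MB/s, so there is not much headroom for aes-256-cbc.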


Nice investigation!


I believe it might be possible to use VideoCore VI for crypto acceleration.

I think it would be pretty tricky to get LUKS to use it, though. At a minimum you'd have to write a kernel module, and most likely a LUKS patch would be required too. You'd probably have to choose between 3D acceleration and full disk encryption.

Looks like VideoCore VI is getting some compute shader support:

https://gitlab.freedesktop.org/anholt/mesa/tree/v3d-cs/src/g...


I would never expect (or demand) a cluster designed for educational purposes out of sub $100 nodes to be fast.

In fact, being slow can be considered a feature in this scenario.


Yet you can get a board at the same price that does 200-800 MB/s AES-128 encryption (per core), leaving much more time for the actual useful work, or for idling (lower power consumption).

https://github.com/ThomasKaiser/sbc-bench/blob/master/Result...

See PineH64 for example.


Maybe. It has a lot of CPU power relative to IO bandwidth, so it should do OK in pure software.

Hmm, anyone know if the VideoCore VI would be any good for crypto?


Quick Googling [0] revealed VideoCore VI might have at least some support for Vulkan and/or OpenCL. If this turns out to be true, then yes, VideoCore VI might be able to function as a crypto-coprocessor.

We'll see.

[0]: https://www.phoronix.com/scan.php?page=news_item&px=Broadcom...


People have also written some native code for the VC IV in assembler (both QPU and VPU) in the past.


Depends on your use case. If you have an SSD attached over USB3, you can easily pull 400 MB/s. Software-only AES would severely limit that if it is an encrypted drive. On Allwinner H6, for example, you could get that speed on an encrypted drive while still having other cores free for other stuff.

So if you need to go through/process gigabytes of data on an encrypted drive, having to do encryption in SW will slow you down massively, especially if you also need to process the retrieved data somehow.
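To put rough numbers on that, here is a back-of-the-envelope sketch. It borrows the 65 MB/s per-core figure reported elsewhere in this thread for aes-256-cbc under openssl; a real dm-crypt setup uses a different mode and will not match exactly, and linear scaling across cores is an assumption. The function name is hypothetical:

```python
def encrypted_read_mb_s(disk_mb_s: float, per_core_mb_s: float, cores: int) -> float:
    """Throughput is capped by the slower of the disk and the software cipher."""
    return min(disk_mb_s, per_core_mb_s * cores)


for cores in (1, 2, 4):
    rate = encrypted_read_mb_s(400, 65, cores)
    print(f"{cores} core(s) on crypto: {rate} MB/s of a 400 MB/s disk")
```

Even with all four cores doing nothing but AES, the estimate tops out at 260 MB/s, and that leaves no CPU for processing the data.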



