
One thing I learned in grad school is you can do a stupendous number of floating point operations in the time it takes to serialize a matrix to ASCII numbers and deserialize it. Whatever it is you are doing with a JSON document might be slower than parsing it.
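A quick way to see this for yourself, sketched with NumPy and the standard json module (timings vary by machine and BLAS backend, but the ASCII round trip typically dwarfs the matmul):

```python
import json
import time

import numpy as np

n = 1000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
c = a @ b                          # roughly 2*n^3 = 2e9 floating point ops
t1 = time.perf_counter()
text = json.dumps(a.tolist())      # serialize the matrix to ASCII decimals
back = np.array(json.loads(text))  # ...and parse it back
t2 = time.perf_counter()

print(f"matmul:           {t1 - t0:.3f}s")
print(f"ASCII round trip: {t2 - t1:.3f}s")
```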

It's true that autovectorization accomplishes very little, but specialized libraries could have a notable effect on perceived performance if they were widely developed and used.

Frankly, Intel has been less interested in getting you to buy a new computer by making it the best computer you ever bought than in taking as much revenue as it can from the rest of the BOM: for instance, the junk integrated graphics that sabotaged the Windows Vista launch and have been making computers crash ever since. Another example is that they don't ship MKL out of the box on Windows or Linux, although they do on macOS. And Intel wonders why their sales are slipping...



Matrix multiplication and similar kernels are also among the few operations where algorithms and special-case instructions are interesting for floating point on a massive scale.

I.e., adding two arrays or computing dot products becomes memory bound as the data grows, but matrix multiplication performs enough operations per element that it is limited by arithmetic throughput as well.
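The distinction is arithmetic intensity: flops performed per value moved through memory. A back-of-the-envelope sketch (the flop counts are the standard ones for these operations):

```python
n = 4096

# Dot product: ~2n flops while streaming 2n values from memory
dot_flops = 2 * n
dot_values = 2 * n

# Matmul: ~2n^3 flops over only 3n^2 values (two inputs, one output)
mm_flops = 2 * n ** 3
mm_values = 3 * n ** 2

print(dot_flops / dot_values)  # intensity ~1 flop/value: memory bound
print(mm_flops / mm_values)    # intensity ~2n/3, grows with n: compute bound
```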


Had to look it up:

> Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL), is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.

https://en.m.wikipedia.org/wiki/Math_Kernel_Library
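For what it's worth, you may already be using MKL without knowing it: NumPy/SciPy dispatch to whatever BLAS they were built against, which may be MKL or OpenBLAS depending on the distribution. A small sketch to check, and to call a BLAS routine (double-precision GEMM) directly:

```python
import numpy as np
from scipy.linalg import blas

# Which BLAS/LAPACK is this NumPy built against? (MKL, OpenBLAS, ...)
np.show_config()

# Calling dgemm directly: c = 1.0 * (a @ b)
a = np.random.rand(200, 300)
b = np.random.rand(300, 100)
c = blas.dgemm(1.0, a, b)
```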


> One thing I learned in grad school is you can do a stupendous number of floating point operations in the time it takes to serialize a matrix to ASCII numbers and deserialize it. Whatever it is you are doing with a JSON document might be slower than parsing it.

> It's true that autovectorization accomplishes very little, but specialized libraries could have a notable effect on perceived performance if they were widely developed and used.

I mean, sure, but even if we take JSON as an example, in the vast majority of cases it gets fed to a giant blob of JS driving an even bigger blob of browser code.

The cases where you do deserialization -> very little processing -> serialization are pretty rare.

Sure, if it is already on the chip you might as well use it, but realistically the savings will be in the single-digit percents.


> The cases where you do deserialization -> very little processing -> serialization are pretty rare.

Actually I've seen a lot of systems that do that - query a datastore, do some minimal processing of the results, feed them back to the caller. Although that tends to get addressed at a higher level, e.g. MongoDB drivers shifting towards using BSON.
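Part of why a binary format like BSON helps that pattern is that the bytes mostly pass through untouched instead of being formatted and reparsed as text. A toy sketch of the difference, using the standard struct module as a stand-in for a real binary codec:

```python
import json
import struct

values = [3.141592653589793] * 1000

# ASCII round trip: format each float as text, then reparse it
ascii_blob = json.dumps(values)
ascii_back = json.loads(ascii_blob)

# Binary round trip: copy the 8-byte IEEE 754 representation directly
bin_blob = struct.pack(f"{len(values)}d", *values)
bin_back = list(struct.unpack(f"{len(values)}d", bin_blob))

# The text encoding is larger, and parsing it is where the time goes
print(len(ascii_blob), len(bin_blob))
```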



