
LUTs do well in microbenchmarks at least, but I do worry that they may do comparatively much worse in real code. That said, that's another advantage of small tables looked up with vpermi2pd, since the table can be held in registers rather than loaded from memory.

The Julia Base implementations of log and exp both use LUTs. The SIMD AVX512 implementation of exp used by LoopVectorization.jl will sometimes use the 16-element table. I experimented with a table for log too, but had some difficulty getting both accuracy and performance, so the version LoopVectorization.jl currently uses doesn't use a table.
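For anyone unfamiliar with the idea, here is a rough scalar sketch of a table-based exp with a 16-entry table. It is not the Julia Base or LoopVectorization.jl code; the names (`exp_lut`, `EXP2_TABLE`), the plain Taylor polynomial, and the single-step range reduction are all assumptions made for illustration. The point is just the decomposition exp(x) = 2^k * 2^(j/16) * exp(r): sixteen doubles fit in two 512-bit registers, which is what lets an AVX512 version do the table lookup with vpermi2pd instead of a memory gather.

```python
import math

TABLE_BITS = 4
TABLE_SIZE = 1 << TABLE_BITS  # 16 entries

# Hypothetical table: 2^(j/16) for j = 0..15.
EXP2_TABLE = [2.0 ** (j / TABLE_SIZE) for j in range(TABLE_SIZE)]

LN2_OVER_16 = math.log(2.0) / TABLE_SIZE


def exp_lut(x: float) -> float:
    """Approximate exp(x) as 2^k * 2^(j/16) * exp(r), with |r| <= ln(2)/32."""
    # Nearest integer n such that x is close to n * ln(2)/16.
    n = round(x / LN2_OVER_16)
    # Reduced argument. A production version would split ln(2)/16 into
    # high and low parts to keep this subtraction accurate for large |x|.
    r = x - n * LN2_OVER_16
    # n = 16*k + j with 0 <= j < 16 (also holds for negative n in Python).
    k = n >> TABLE_BITS
    j = n & (TABLE_SIZE - 1)
    # Degree-6 Taylor polynomial for exp(r), evaluated with Horner's rule.
    # Real implementations typically use minimax coefficients instead.
    p = 1.0 + r * (1.0 + r * (1 / 2 + r * (1 / 6 + r * (1 / 24
        + r * (1 / 120 + r * (1 / 720))))))
    # Scale by 2^k exactly.
    return math.ldexp(EXP2_TABLE[j] * p, k)


if __name__ == "__main__":
    for x in (-2.5, 0.0, 0.1, 1.0, 7.3):
        print(x, exp_lut(x), math.exp(x))
```

The table is what keeps the polynomial short: without it the reduced argument would range over ±ln(2)/2 and need a noticeably higher degree for the same accuracy. Whether that trade wins outside a microbenchmark is exactly the open question above.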



BTW, since you're apparently working on stuff like that, check out this repository:

https://github.com/Const-me/AvxMath/blob/master/AvxMath/AvxM...

The license is MIT, so it's copy-paste friendly. It doesn’t use AVX512 though, only AVX1 and, optionally, AVX2.



