Specifically, LuaJIT has a very well-designed bytecode format that is optimized for fast decoding and has type-specialized instructions. The interpreter's bytecode dispatch is hand-written in assembly, which exploits this. And that's all before your code ever sees the JIT; once it does, LuaJIT's JIT-compiled numerical code is competitive in microbenchmarks.
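
To make the decoding point concrete, here's a toy sketch in Python. It is not LuaJIT's actual instruction set: the opcode names, their numbering, and the `encode`/`run` helpers are made up for illustration, and the 32-bit word with the opcode in the low byte only loosely follows the layout in LuaJIT's bytecode docs as I understand them. The idea is that fixed-width fields decode with a few shifts and masks, and that an opcode which already encodes the operand kinds (register vs. constant) saves the handler from checking at run time.

```python
# Toy 32-bit instruction word laid out as  B | C | A | OP  (OP in the low byte).
# Opcode names and numbers are hypothetical, chosen only for this sketch.
OP_ADDVV = 0  # R[A] = R[B] + R[C]   (both operands are registers)
OP_ADDVN = 1  # R[A] = R[B] + K[C]   (right operand is a numeric constant)
OP_RET = 2    # return R[A]


def encode(op, a, b=0, c=0):
    """Pack four 8-bit fields into one 32-bit instruction word."""
    return (b << 24) | (c << 16) | (a << 8) | op


def run(code, consts, nregs=8):
    """Interpret a list of instruction words and return the RET value."""
    regs = [0.0] * nregs
    pc = 0
    while True:
        ins = code[pc]
        pc += 1
        # Fixed-width fields: decoding is just shifts and masks.
        op = ins & 0xFF
        a = (ins >> 8) & 0xFF
        c = (ins >> 16) & 0xFF
        b = (ins >> 24) & 0xFF
        # The opcode already says what kind of operand C is, so the
        # handler never has to inspect the operand to find out.
        if op == OP_ADDVV:
            regs[a] = regs[b] + regs[c]
        elif op == OP_ADDVN:
            regs[a] = regs[b] + consts[c]
        elif op == OP_RET:
            return regs[a]
        else:
            raise ValueError(f"unknown opcode {op}")


program = [
    encode(OP_ADDVN, 0, 0, 0),  # r0 = r0 + k0
    encode(OP_ADDVN, 1, 1, 1),  # r1 = r1 + k1
    encode(OP_ADDVV, 2, 0, 1),  # r2 = r0 + r1
    encode(OP_RET, 2),          # return r2
]
result = run(program, consts=[1.5, 2.0])
print(result)
```

A real interpreter would replace the `if` chain with a jump through a table of handlers, which is the part an assembly dispatch loop can make very cheap; the chain here is only to keep the sketch short.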