I use the FFI, and LuaJIT is significantly faster on my embedded platform. You can also yield from coroutines in places where the standard Lua VM does not allow it, such as across pcall, xpcall, iterators, and metamethods.
Standard Lua (5.2 and later) does have C API variants that allow this, but the C code has to call lua_pcallk instead of lua_pcall, which doesn't help me much with third-party libraries that I can't change.
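To make the difference concrete, here is a minimal sketch of yielding across pcall. The names (co, first, status_after_first) are made up for illustration. As far as I know, plain Lua 5.1 rejects the yield and pcall catches the resulting error, so the coroutine finishes right away, while LuaJIT suspends it as intended. Lua 5.2 and later also handle this pure-Lua case; there the restriction only remains for C code that calls lua_pcall.

```lua
-- Try to yield from inside a function that is running under pcall.
local co = coroutine.create(function()
  local ok, err = pcall(function()
    coroutine.yield("yielded inside pcall")
  end)
  -- Only reached immediately if the yield was refused (pcall caught the error),
  -- or later, after a second resume, if the yield worked.
  return ok, err
end)

-- Collect everything the first resume returns.
local first = { coroutine.resume(co) }
local status_after_first = coroutine.status(co)

-- "suspended" means the yield crossed the pcall boundary;
-- "dead" means the VM refused it and the coroutine ran to completion.
print(status_after_first, first[2], first[3])
```

Running the same file under both luajit and a stock lua5.1 binary is a quick way to see which behaviour a given platform gives you.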
It is fantastic software.