> It correctly handles the strange array/map duality of lua tables in the most efficient way it can do safely (serializing both variants concurrently and only writing out the correct one at the end)
The article doesn’t expand on this - what are they referring to? Are lua tables arrays?
Serializing something common like tables 2x seems like a massive overhead.
Lua doesn't have a separate "array" type, as arrays are just a special case of key->value where the key is always a positive integer. While this is nice (it simplifies the language yet Lua tables are incredibly versatile), it does indeed make it a little more work to translate structures to languages/serializations that distinguish between the two types, if the serializer needs to guess at runtime (e.g. if you don't have a predefined schema).
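A quick illustration of the duality (the table and key names are just examples):

```lua
-- A single Lua table can hold both an "array part" and a "hash part".
local t = {10, 20, 30}   -- array-style: implicit keys 1, 2, 3
t.name = "example"       -- map-style: string key in the same table

-- A schemaless serializer has to inspect the keys at runtime to decide
-- whether this should become a JSON array, a JSON object, or neither.
print(#t)       --> 3 (length of the array part)
print(t.name)   --> example
```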
I can't comment on the efficiency of whatever this implementation is doing, I haven't read the code. It does sound a little expensive. Also it's not always 100% clear which the correct mapping is, unless the developer explicitly picks one. For example, a serializer has no way to determine whether '{}' is intended to be an empty array or an empty hash.
Another common gotcha, e.g. when translating to JSON, is that Lua does not restrict the types of keys (while JSON objects only support string keys), so boolean true/false are possible keys, for example, which cannot be translated to JSON directly.
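For instance, keys that have no direct JSON counterpart (a small sketch of the gotcha above):

```lua
local t = {}
t[true] = "yes"    -- boolean key: legal in Lua, impossible in a JSON object
t[1.5]  = "half"   -- non-integer number key: also not a valid JSON key
-- A serializer must either raise an error, skip these entries,
-- or convert the keys to strings (losing round-trip fidelity).
```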
I ended up drawing the same conclusion while implementing my language. An insertion ordered hash table turned out to be just a normal array of values where the keys map to array indexes instead. Other than speed, I'm struggling to think of a good reason to keep the dedicated array type...
This is really interesting. So it doesn't hash integer keys, it uses them directly as indexes in a separate array. How is order maintained between the two arrays though?
(Answered below: the array part itself is the insertion order; the hash part only maps keys to positions in it.)
for k, v in pairs(t) do … end
for i, v in ipairs(t) do … end
ipairs just counts from 1 to n until t[n] == nil. pairs returns all keys (the array part's too) in an undefined order.
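A quick demonstration of the difference (the table with a hole is illustrative):

```lua
local t = {1, 2, nil, 4}

-- ipairs walks t[1], t[2], ... and stops at the first missing index,
-- so with t[3] == nil it visits exactly two elements:
local count = 0
for i, v in ipairs(t) do count = count + 1 end
print(count)   --> 2

-- pairs, by contrast, visits every non-nil key, including 4,
-- but in no guaranteed order.
```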
A merged dict and array looks neat at the syntactic level but is really a pain to deal with. It's a questionable design in itself, and then you go outside and realize you're incompatible with everything else because {} and [] are two different things.
Yeah, I assumed as much... Looks like order just isn't maintained at all. Stopping at the first nil is pretty confusing and surprising to me; it implies a loop over an array full of nils would not iterate even once, which makes no sense. JavaScript does it right: array length is the greatest numerical index in the object plus one.
Abstraction seemed neat at first but completely fell apart in the details. Many such cases. Looks like a true vector type is a good idea after all.
> A merged dict and array looks neat at the syntactic level but is really a pain to deal with.
It doesn't have to be that way.
An insertion ordered hash table also contains an array part, it's just that it's completely internal to the implementation. It behaves just like a normal hash table, but internally the keys hash to indexes in the array part. The order of the hashes is undefined but the order of the key-value pairs in the array part is defined to be the order they were inserted into the table.
It's really neat and it seems to work really well. It behaves exactly like an array if used like one. It definitely isn't an array though: integer keys are not offsets, they are hashed like every other key type. This doesn't seem like a huge problem to me. The lack of this property seems to be the root cause of the complexity of Lua's tables.
It's tempting to think of these insertion ordered hash tables as some kind of superset of arrays. Just like it's tempting to think of hypergraphs as supersets of graphs, of graphs as supersets of trees, and of trees as supersets of linked lists. I haven't learned of any cracks in this abstraction, at least not yet. The reason to prefer less general data structures always seems to be performance, not correctness. Offsetting a pointer is faster than hashing some bytes, indexing two arrays, comparing keys for equality and then offsetting a pointer.
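A minimal sketch of the structure described above (the names and details are mine, not from any particular implementation): a hash part that maps each key to a slot in an entries array, with the entries array preserving insertion order.

```lua
-- Insertion-ordered map sketch: `index` is the hash part (key -> slot),
-- `entries` is the array part holding {key, value} pairs in insertion order.
-- Deletion is omitted; it would need tombstones or compaction.
local function ordered_map()
  local index = {}    -- key -> position in `entries`
  local entries = {}  -- array of {key = k, value = v}
  local m = {}

  function m.set(k, v)
    local i = index[k]
    if i then
      entries[i].value = v              -- update in place, order unchanged
    else
      entries[#entries + 1] = {key = k, value = v}
      index[k] = #entries
    end
  end

  function m.get(k)
    local i = index[k]
    if i then return entries[i].value end
  end

  function m.each()                     -- iterate in insertion order
    local i = 0
    return function()
      i = i + 1
      local e = entries[i]
      if e then return e.key, e.value end
    end
  end

  return m
end
```

Used like a plain map, it hands back keys in exactly the order they were first inserted, regardless of how the hash part lays them out.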
In my opinion this is actually kinda awesome, because it’s as if every element in the domain maps to an element in the range, so there are no disjoint or undefined mappings in a table. It’s not how we’re used to thinking but when the logic is designed correctly it’s pretty useful and easy.
Should accessing an absent record really be an _error_? Shouldn't it just be a "None" or "Missing" option or something like that instead? It doesn't seem to me like indexing a table with an absent key is really an error; it's just not a value.
There are definitely different approaches with different tradeoffs. Lua tends to prioritize implementation simplicity so I would guess that's a factor here.
Ideally, yeah, you'd return the Nothing side of an Option type, but that's not representable in Lua's type system. Returning an {:ok, data} tuple Erlang-style is a pretty solid middle ground, and that is a common convention in Lua via multiple return values of ok, result.
But throwing an error has advantages too. There is a potentially important semantic difference between an unused key and null data. The difference becomes especially important when you start mixing data types, IMO. I'm begrudgingly fine with a hashmap returning null for unset keys, but not with an array returning it for an out-of-bounds index. With Lua's approach you can't easily differentiate these things, and it does cause serious problems. Everyone hated this in PHP; I don't see why it's such a popular choice when Lua does it.
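The ok, result convention can be sketched like this (fetch is a hypothetical helper, not a standard function):

```lua
-- Sketch of the common ok, result multiple-return convention.
-- Note it still can't distinguish a stored nil from an absent key,
-- which is exactly the limitation discussed above.
local function fetch(t, k)
  local v = t[k]
  if v == nil then
    return false, ("no such key: %s"):format(tostring(k))
  end
  return true, v
end

local ok, value = fetch({a = 1}, "a")   -- true, 1
local ok2, err = fetch({a = 1}, "b")    -- false, "no such key: b"
```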
The problem with bounds errors is that they bring little value to a correct program, and there are better ways to help with correctness than a runtime error (but that transcends Lua-like languages).
Looking at Python, it's more common (and irritating) to have to convert code to .get() after hitting an error than it is to receive an error and realize it was helpful. Code that optimistically assumes non-null results rarely survives the next few lines after getting nulls anyway.
There's no difference between "nil" and "exists but nil", because nil means "doesn't exist" by definition. The gotcha is in the # operator, which, when applied to a table, may return an incorrect "length" for its array part if it has holes. For example, #{1,2,nil,4} can be 2 or 4.
It happens because Lua arrays are not "just dicts with integer keys", as someone said above. A Lua table implementation has two parts: one for the dict and one for the array. # basically returns "impl(t).array.length". The array part gets populated and depopulated heuristically, creating non-deterministic outcomes for #.
This has nothing to do with bounds or existence tests. "Really null" is an attempt to have nil in an array, which Lua doesn't support by design.
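The "nil means doesn't exist" rule is easy to check directly:

```lua
local t = {}
t.x = nil                 -- assigning nil is the same as never creating the key
assert(next(t) == nil)    -- the table is still empty

t.x = 1
t.x = nil                 -- and assigning nil deletes an existing key
assert(next(t) == nil)    -- empty again
```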
It's most often seen with something like JSON serialization, where a JSON null results in the key being completely absent from the Lua table, so re-encoding into JSON will drop that key entirely instead of producing it with a null value.
It's why almost every serialization library has userdata sentinels to represent "null", and to distinguish things like empty arrays from empty objects.
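A sketch of the sentinel approach (NULL here is illustrative; real libraries expose their own sentinel value):

```lua
-- A unique table works as a sentinel because it compares equal only to itself.
local NULL = setmetatable({}, {__tostring = function() return "null" end})

-- A decoder can store the sentinel instead of dropping the key:
local decoded = {name = "x", middle = NULL}

-- Re-encoding can now tell "present but null" apart from "absent":
assert(decoded.middle == NULL)    -- present, was JSON null
assert(decoded.missing == nil)    -- genuinely absent
```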
It really is "does not exist", although JS's own lexicon uses "define property" instead of "create property". They don't hesitate to "delete" it, though.
Apart from names, I find explicit models easier to work with and they allow for better designs. E.g. in Lua if t has mt.__index = {x=1}, then t.x == nil is non-representable, because t.x = nil deletes t.x and now .x proxies into index, which returns 1. In javascript, since existence is a separate thing, you can assign literally any value to t.x and it will not be proxied to a prototype until you delete t.x. This has a whole cascade of consequences that make your meta life easier.
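The Lua side of that non-representable case can be demonstrated directly:

```lua
local base = {x = 1}
local t = setmetatable({}, {__index = base})

t.x = 2
assert(t.x == 2)   -- an own key shadows the __index fallback

t.x = nil          -- "deleting" t.x re-exposes the fallback...
assert(t.x == 1)   -- ...so t.x can never observably be nil here
```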
Undefined vs null is still a mess though, but not because the two exist, but because standard APIs use and return them mostly arbitrarily.
No matter what system you use, there are always unrepresentable states like that. If you have javascript-style properties, then t.x cannot represent "the object lacks property x".
I'm fine not having explicit property existence, and prefer throwing out the complexity needed to support it. Maybe I'm underestimating the usefulness, but I think in most situations I'd be happy with either setting to false or using a plain old initial value instead of __index, and in the rest of cases I could get the same effect by having __newindex disable __index for that key.
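That __newindex idea can be sketched like this (all names are illustrative, and this is one possible scheme, not a standard pattern):

```lua
-- Record explicitly-written keys so that writing nil masks the
-- __index fallback instead of re-exposing it.
local base = {x = 1}
local written = {}   -- keys the user has explicitly assigned
local store = {}     -- their values (possibly nil)

local t = setmetatable({}, {
  __newindex = function(_, k, v)
    written[k] = true
    store[k] = v
  end,
  __index = function(_, k)
    if written[k] then return store[k] end
    return base[k]
  end,
})

assert(t.x == 1)    -- falls through to base
t.x = nil           -- an explicit write, even of nil, masks the fallback
assert(t.x == nil)  -- the default is now disabled for this key
```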
"t lacks x" is represented by !Object.hasOwn(t, 'x'), which lives on an explicit plane together with in, delete, etc. While there is room for further argument, in my own practice the JS style is absolutely superior to Lua's minimal style when it comes to metaprogramming, and also in general (although the latter is much less critical; I rarely make such distinctions in "user" code). The goal is not to find some universal, infinitely general super-idea, but to have one that is practical and no less.
The best Lua way to do it imo is to not do it at all and use plain tables for data. Reinventing something that was omitted by design, and broken by that same design, is futile. Lua is as it is. You use it as is, or you fight with it, or you choose something else.
Okay, I guess I should have elaborated. I'm talking about the situation where you'd use "in" because you want properties on the prototype to count. HasOwn would not work there. There is no way to default via prototype to "yes it exists" but override that with "no it does not exist".
> The goal is not to find some universal, infinitely general super-idea, but to have one that is practical and no less.
I find nil plenty practical when I use Lua. Even when working in Javascript, I have never felt the need to override a prototype value with null or undefined.
In my Lua/CBOR use, what I did was use a custom tag (I think the ID was 0x7AB1E, for "TABLE") with an array type composed of a sub-array and a map. The array contains the array elements of the table up to the first nil, and the map contains the rest. Worked well enough for my use.
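That split can be sketched roughly like this (split is an illustrative name; the actual CBOR tagging and encoding are omitted):

```lua
-- Split a table into its array elements up to the first nil,
-- plus a map holding everything else.
local function split(t)
  local arr, map = {}, {}
  local n = 0
  while t[n + 1] ~= nil do      -- array part: 1..first hole
    n = n + 1
    arr[n] = t[n]
  end
  for k, v in pairs(t) do       -- everything not in 1..n goes to the map
    if not (type(k) == "number" and k >= 1 and k <= n and k % 1 == 0) then
      map[k] = v
    end
  end
  return arr, map
end

local t = {1, 2, nil, 4}
t.name = "x"
local arr, map = split(t)       -- arr = {1, 2}; map = {[4] = 4, name = "x"}
```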