This is a very interesting thought experiment. The consistent nature of Redis cluster has significant advantages and disadvantages, and an eventually-consistent solution might be better for many applications.
> * Sets are merged performing the set union of all the conflicting versions.
That's an interesting approach. Dealing with compound types is very tricky in this kind of system, because it's not clear which of the options is the one you want. Perhaps I am not following your approach correctly, but this seems to have a high probability of causing deleted items to reappear in a set when nodes come back online (or partitions heal). Obviously intersection isn't the right operation either, because that will cause similar consistency problems with added items.
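To make the concern concrete, here is a minimal sketch in Python using plain sets to stand in for two replicas of the same Redis set. The names (`merge_union`, `merge_intersection`, `replica_a`, `replica_b`) are made up for illustration and are not part of any Redis API.

```python
def merge_union(*replicas):
    """Merge conflicting versions by taking the union of all of them."""
    merged = set()
    for replica in replicas:
        merged |= replica
    return merged


def merge_intersection(*replicas):
    """Merge conflicting versions by keeping only members present in all."""
    return set.intersection(*replicas)


# Both replicas start from the same state.
replica_a = {"apple", "bread", "milk"}
replica_b = set(replica_a)

# During a partition, A sees a removal and B sees an addition.
replica_a.discard("milk")
replica_b.add("eggs")

union_result = merge_union(replica_a, replica_b)
intersection_result = merge_intersection(replica_a, replica_b)

print(sorted(union_result))         # "milk" is back even though it was removed on A
print(sorted(intersection_result))  # "eggs" is gone even though it was added on B
```

Union keeps the add but resurrects the delete; intersection keeps the delete but loses the add.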
It seems like a quorum-based approach to handling sets would give a much better consistency experience for most applications, potentially at the cost of doing more reads. I wonder if antirez considered that approach.
> The Dynamo design partially rely on the idea that writes don't modify values, but rewrite an entirely new value. In the Redis data model instead most operations modify existing values.
This is something that Rich Hickey touches on in his 'value of values' talk (http://www.infoq.com/presentations/Value-Values). Making values immutable and copying them on writes simplifies many of the complexities of merging, especially if techniques like vector clocks are used to provide ordering information.
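As a rough illustration of the ordering information vector clocks give you, here is a small Python sketch. It assumes a clock is a dict mapping a node id to a counter, with missing entries treated as 0; the function name `compare` and the node ids are hypothetical.

```python
def compare(vc_a, vc_b):
    """Return how version A relates to version B.

    'before' and 'after' mean one version supersedes the other,
    'concurrent' means neither saw the other, so a merge is needed.
    """
    nodes = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(n, 0) <= vc_b.get(n, 0) for n in nodes)
    b_le_a = all(vc_b.get(n, 0) <= vc_a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


# B was written after seeing A: A can simply be discarded.
print(compare({"n1": 1}, {"n1": 2}))

# Each side has a write the other has not seen: a real conflict.
print(compare({"n1": 2, "n2": 0}, {"n1": 1, "n2": 1}))
```

With immutable values, the "before" and "after" cases need no merge at all; only the "concurrent" case has to go through a merge function or back to the application.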
> but this seems to have a high probability of causing deleted items to reappear in a set when nodes come back online.
Basically there is no right way to do this; it depends on what the application's goal is. For instance Dynamo queries the application in this case, so the application can merge things if needed, intersect things otherwise, and so forth.
For Sets, union was picked in order to guarantee safety. For instance, if you model a shopping cart this way and there is a net split where a client ends up in a minority partition and writes a new item (the user put a new item in the shopping cart during the partition), doing a union on merge means the user will still have the item.
In other applications of course the side effect of this (resurrection of deleted items) is not a good idea, but there are different ways to deal with it. For instance in the case of the shopping cart, this can be made more resistant by adding special items that mark old items as deleted, letting the application display only the right thing.
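A minimal sketch of the "special items that mark old items as deleted" idea, in Python with plain sets standing in for Redis sets. The `del:` prefix and the helper names are assumptions made for the example, not a Redis convention. Note that in this simple form an item can't be re-added once it has been marked as deleted.

```python
TOMBSTONE_PREFIX = "del:"  # hypothetical marker for "this item was removed"


def remove_item(cart, item):
    """Record a removal by adding a marker instead of deleting the member."""
    cart.add(TOMBSTONE_PREFIX + item)


def visible_items(cart):
    """What the application should display: members without a deletion marker."""
    deleted = {
        m[len(TOMBSTONE_PREFIX):] for m in cart if m.startswith(TOMBSTONE_PREFIX)
    }
    return {
        m for m in cart
        if not m.startswith(TOMBSTONE_PREFIX) and m not in deleted
    }


# Two replicas of the same cart diverge during a partition.
cart_a = {"apple", "milk"}
cart_b = set(cart_a)
remove_item(cart_a, "milk")  # the user removes milk on one side
cart_b.add("eggs")           # and adds eggs on the other

merged = cart_a | cart_b     # union on merge, as in the proposal
print(sorted(visible_items(merged)))
```

Because the marker is itself just a set member, it survives the union, so the removal is not lost and the added item is kept.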
How about providing the ability to set merge handlers in Lua for different data types / keys?
Or if there are 3 different and well known strategies, somehow let users pick one and set it as a default. If not, let users run Lua scripts that will be executed to resolve conflicts in a custom way.
I do this for CouchDB, it has a very convenient changes feed that can also stream conflicts when they appear. So there is a custom (and separate) conflict resolver process that resolves conflicts in an application specific way.
For consistency though you'd need a way to run those synchronously somehow as soon as you detect the conflict.
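Here is roughly what I mean by pluggable merge strategies, sketched in Python rather than Lua just to show the shape of it. Everything here (`STRATEGIES`, `MergeRegistry`, the key names) is hypothetical; nothing like this exists in Redis.

```python
# A few well-known strategies the user could pick from by name.
STRATEGIES = {
    "union": lambda versions: set().union(*versions),
    "intersection": lambda versions: set.intersection(*versions),
}


class MergeRegistry:
    """Maps keys to merge handlers, falling back to a default strategy."""

    def __init__(self, default="union"):
        self.default = STRATEGIES[default]
        self.handlers = {}

    def register(self, key, handler):
        """Use a custom handler (any callable taking a list of sets) for a key."""
        self.handlers[key] = handler

    def resolve(self, key, versions):
        """Run the handler for this key over the conflicting versions."""
        handler = self.handlers.get(key, self.default)
        return handler(versions)


registry = MergeRegistry(default="union")
registry.register("acl:admins", STRATEGIES["intersection"])

conflict = [{"a", "b"}, {"b", "c"}]
print(sorted(registry.resolve("cart:1", conflict)))      # default strategy
print(sorted(registry.resolve("acl:admins", conflict)))  # per-key override
```

The handler would have to run synchronously at the moment the conflict is detected, which is the part that a separate resolver process does not give you.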
> quorum-based approach...wonder if antirez considered that approach.
I'm sure he's aware of it, but I doubt he took it incredibly seriously unless he has a very specific strategy in mind. He's explicitly avoiding the Dynamo model for a lot of reasons.
The mixture of composite types and mutation-centric semantics has boxed Redis in a bit, although I love using it.
Hickey was probably right. Pity Datomic is commercial.
In quorum-based systems a write is performed only if the majority of the players agree to accept it. If you want a system that is write-available even in a minority partition, a quorum-based system won't work for you, as in the famous case of the highly available shopping cart.
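A tiny Python sketch of the majority rule, assuming a cluster of `total` nodes of which `reachable` can be contacted from the client's side of the partition. The function names are made up for the example.

```python
def has_quorum(reachable, total):
    """A strict majority of the nodes must be reachable."""
    return reachable > total // 2


def try_write(cart, item, reachable, total):
    """Accept the write only if a majority agrees; otherwise refuse it."""
    if not has_quorum(reachable, total):
        return False
    cart.add(item)
    return True


cart = set()
# 5 nodes split 3 / 2: the majority side accepts the write...
accepted_majority = try_write(cart, "apple", reachable=3, total=5)
# ...while a client stuck on the minority side gets refused.
accepted_minority = try_write(cart, "eggs", reachable=2, total=5)
print(accepted_majority, accepted_minority, sorted(cart))
```

The refused write is exactly the item the user put in the cart during the partition, which is what the union approach is trying to keep.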