I love ES, but I don't really feel comfortable with it as a primary datastore. We tend to write to CouchDB and query against ES. The sync between the two happens automagically with a single shell command.
I won't use ES on its own, because I have run into situations where the dynamic type mapping gets confused, i.e. the first time it sees a field it indexes it as an integer, but then one of the later records has 'n/a' instead of a number. The entire record became unqueryable after that, even if ES might still have stored the original data.
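To make the failure mode concrete, here is a toy model of "first value seen decides the type". It is not Elasticsearch code and it is stricter than the real thing (ES will coerce some values); the function names and the sample `price` field are made up for illustration.

```python
# Toy model of first-seen-wins type inference. NOT Elasticsearch code.

def infer_type(value):
    """Guess a field type from a single value, loosely like dynamic mapping."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"


def index_documents(docs):
    """Index docs in order; a doc whose value disagrees with the type
    fixed by an earlier doc is rejected as a whole."""
    mapping = {}    # field name -> type fixed by the first value seen
    indexed = []
    rejected = []
    for doc in docs:
        ok = True
        for field, value in doc.items():
            expected = mapping.setdefault(field, infer_type(value))
            if infer_type(value) != expected:
                ok = False
        if ok:
            indexed.append(doc)
        else:
            rejected.append(doc)
    return mapping, indexed, rejected


mapping, indexed, rejected = index_documents(
    [{"price": 10}, {"price": "n/a"}, {"price": 12}]
)
print(mapping)
print(rejected)
```

The second record is the one that gets lost, purely because of the order in which the data arrived.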
You could fix this by creating the mapping by hand BEFORE any data has been imported, since the type of an existing field can't be changed later. But then you are in a situation where you have to maintain a schema just to keep ES from 'randomly' ignoring data.
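A hand-written mapping is just a JSON body sent when the index is created. Below is a minimal sketch that builds one, assuming the typeless layout used by recent Elasticsearch versions (older versions nest `properties` under a type name and use `string` instead of `text`/`keyword`). The helper `build_index_body` and the field names are hypothetical.

```python
import json


def build_index_body(field_types):
    """Build an index-creation body with one explicit type per field."""
    properties = {name: {"type": t} for name, t in field_types.items()}
    return {"mappings": {"properties": properties}}


# Mapping "price" as a keyword means 'n/a' and '10' are both accepted as-is.
body = build_index_body({"title": "text", "price": "keyword"})
print(json.dumps(body, indent=2))
```

The body would then be sent as the payload of the index-creation request, before the first document goes in.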
You also can't just tell ES to rebuild an index when you need to mess with the mappings. You have to actually create a new index with the changed mappings and then reimport the data into it (possibly from the existing index).
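The steps above can be sketched as a list of HTTP requests. This assumes an Elasticsearch version that ships the `_reindex` API (2.3 and later); on older versions the copy step has to be done by hand with a scroll over the old index plus bulk inserts. The function `reindex_plan` and the index names are hypothetical, and nothing here talks to a server.

```python
def reindex_plan(old_index, new_index, new_mappings):
    """Return the (method, path, body) requests needed to move data
    from old_index into a fresh index that has the corrected mappings."""
    return [
        # 1. Create the new index with the corrected mappings up front.
        ("PUT", "/" + new_index, {"mappings": new_mappings}),
        # 2. Copy every document from the old index into the new one.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
    ]


plan = reindex_plan(
    "products_v1",
    "products_v2",
    {"properties": {"price": {"type": "keyword"}}},
)
for method, path, body in plan:
    print(method, path, body)
```

Applications still have to be pointed at the new index afterwards, which is one more reason I like having the original data somewhere else.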
It also feels right to me to split storing the data and querying the data between separate applications. Their concerns are different enough that being able to scale them out independently is sometimes a boon.
Thank you for your input. I had minor issues with dynamic mapping too, but since the data is more or less just strings, I could circumvent ES' mechanism for inferring the datatype from the value by simply using an empty default-mapping.js. I'll definitely give your approach a try.