
I fully sympathize; I also still have a 1.7.x cluster that I have wanted to get rid of for years now. Several rounds of compatibility-breaking changes have made this hard for us. Impressive that you can make this work at this scale.

The thing is, at the scale you are running, you are probably paying a much higher cost than technically necessary compared to a recent 6.x setup. You would likely see orders of magnitude less memory usage.

Recent versions of ES are much better at preventing complex queries from taking down nodes. They've done a lot of good work with circuit breakers to keep queries from getting out of hand, and they are also a lot smarter about things like caching filtered queries.
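
For what it's worth, the breaker limits are ordinary dynamic cluster settings, so they can be tightened without a restart. A minimal sketch in Python against the cluster settings API (the values here are illustrative, not recommendations):

    # Hedged sketch: tightening the request/fielddata circuit breakers via the
    # cluster settings API. The setting names exist in recent ES versions; the
    # percentages are just examples.
    import requests

    settings = {
        "persistent": {
            "indices.breaker.request.limit": "40%",
            "indices.breaker.fielddata.limit": "30%",
        }
    }
    resp = requests.put("http://localhost:9200/_cluster/settings", json=settings)
    resp.raise_for_status()
    print(resp.json())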

My guess is you should not need Shardonnay on a recent setup; most of what it does should be supported natively these days. Elastic has been working with many companies on setups comparable to yours, and they've learned a lot in the last few years about how to do this properly.

Another feature that could be of interest to you is cross-cluster search, introduced in 6.x (replacing the tribe node functionality from v2 and v5). It would allow you to isolate old data on a separate cluster optimized for reads. My guess is that whenever a query hits those old indices, its cost goes through the roof because it needs to open far more shards.
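
A rough sketch of what that looks like in 6.x (note the remote-cluster setting was called search.remote.*.seeds early in the 6.x series and cluster.remote.*.seeds later; the "archive" alias and the index names here are made up):

    # Hedged sketch of cross-cluster search: register a remote, read-optimized
    # "archive" cluster and query it alongside a local index in one request.
    import requests

    ES = "http://localhost:9200"

    # Point the local cluster at the cluster holding the old, rarely-queried data.
    requests.put(f"{ES}/_cluster/settings", json={
        "persistent": {"cluster.remote.archive.seeds": ["archive-node1:9300"]}
    }).raise_for_status()

    # One search spanning a recent local index and the remote archive indices.
    resp = requests.post(f"{ES}/docs-2019.06,archive:docs-*/_search", json={
        "query": {"match": {"body": "elasticsearch"}}
    })
    resp.raise_for_status()
    print(resp.json()["hits"]["total"])  # an integer in 6.x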



We at Meltwater have some custom plugins for Elasticsearch that make a number of modifications to how queries are executed, and completely replace some low-level Elasticsearch query implementations. We're also running a custom ES 1.7 version with some features backported from version 2+. The end result is something like 5-10x lower GC pressure and radically higher performance and query throughput for our particular workload. Without these changes we'd not be able to sustain our workload without massively more hardware and cost, just like you say.

Our flavor of Elasticsearch 1.7 is faster than vanilla 2.* for our workload, though still slower than ES 2.* with our customizations applied.

Recent Elasticsearch versions still use the same basic shard allocation algorithms, as far as we know. Our workload is heavily skewed towards recent data, but it's not a binary hot/cold split; it's more of an exponential decay in workload for older indexes. We fully expect to need Shardonnay to balance the workload even with ES 7+.

We're also in early conversations with Elastic about shard placement optimization. They seem interested in applying linear optimization in a similar way, with the goal of solving the fundamental problems of shard allocation based on observed workload.
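
To make the idea concrete, here is a toy formulation (this is not Shardonnay, and the load numbers are invented): treat placement as an integer program that assigns each shard to a node while minimizing the load on the hottest node, here written with PuLP:

    # Toy sketch of shard placement as an integer program. Observed per-shard
    # query load (invented numbers) is balanced across nodes by minimizing the
    # load of the busiest node.
    from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

    shards = {"idx_2019_06_s0": 90, "idx_2019_06_s1": 80,  # hot, recent shards
              "idx_2018_01_s0": 5,  "idx_2015_03_s0": 1}   # long tail of old ones
    nodes = ["node_a", "node_b"]

    prob = LpProblem("shard_placement", LpMinimize)
    assign = {(s, n): LpVariable(f"x_{s}_{n}", cat=LpBinary)
              for s in shards for n in nodes}
    max_load = LpVariable("max_load", lowBound=0)

    prob += max_load                                   # objective: minimize the hottest node
    for s in shards:                                   # each shard is placed on exactly one node
        prob += lpSum(assign[s, n] for n in nodes) == 1
    for n in nodes:                                    # every node's load is capped by max_load
        prob += lpSum(shards[s] * assign[s, n] for s in shards) <= max_load

    prob.solve()
    for (s, n), var in assign.items():
        if var.value() and var.value() > 0.5:
            print(s, "->", n)

A real formulation would also need disk and replica anti-affinity constraints, plus a cost term for actually moving shards, which is where it gets hard.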


Honestly, I bet 6.x doesn't work well at PB scale. There may be different problems with different solutions, but I bet it would still need custom shard balancing, etc. I work at TB scale and 6.x largely works well with the exception of reindexing. Reindexing without downtime is still tricky.

The reason I think there are likely still problems at PB scale is the attitude of the ES core developers. They collectively act as if reindexing is no big deal, and their proposed solution to many things I consider bugs is "just reindex". Reindexing is the last thing I want to do, given how hard it is to do at TB scale with zero downtime. I don't think the core developers have experience with large clusters themselves, so I find it unlikely that just upgrading to 6.x would solve all the problems at PB scale.


I know people running ES at PB scale. You definitely need to know what you are doing, of course, but it is entirely possible; there are companies doing this. Any operation at this scale needs planning. So I don't think you are being entirely fair here.

I'm not saying upgrading will solve all your problems, but I sure know of a lot of problems I've had with 1.7 that are much less of a problem, or not a problem at all, with 6.x.

Reindexing is indeed time-consuming. However, if you engineer your cluster properly, you should be able to do it without downtime. For example, I've used index aliases in the past to manage this: reindex in the background, then swap over to the new index atomically when ready. Maybe don't reindex all your indices at once. They also have a new reindex API in ES 6 to support this. At PB scale this is of course going to take time. I've also considered doing the reindexing on a separate cluster and using e.g. the snapshot API to restore the end result.
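
A rough sketch of that alias pattern against the 6.x APIs (the index and alias names are made up, and error handling plus waiting on a long-running reindex task are omitted):

    # Sketch of zero-downtime reindexing: copy into a new index with _reindex,
    # then swap the read alias atomically so queries never see a gap.
    import requests

    ES = "http://localhost:9200"

    # Copy documents from the old index into the new one (new mapping/version).
    requests.post(f"{ES}/_reindex", json={
        "source": {"index": "docs_v1"},
        "dest":   {"index": "docs_v2"},
    }).raise_for_status()

    # Atomically repoint the "docs" alias from the old index to the new one.
    requests.post(f"{ES}/_aliases", json={
        "actions": [
            {"remove": {"index": "docs_v1", "alias": "docs"}},
            {"add":    {"index": "docs_v2", "alias": "docs"}},
        ]
    }).raise_for_status()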


In our case at Meltwater we also have new documents as well as updates to old documents coming in, meaning older indexes are actively being modified, with low-latency requirements on the visibility and consistency of those modifications. This makes it trickier both to reindex and to do a seamless, no-downtime-whatsoever upgrade to a new major version. It's not infeasible at all, though, and we're working on it. Reindexing a PB of data should be possible in a matter of weeks based on our estimates, but we shall see!


We're still running 1.6 and 0.90 for a couple of production workloads. Thankfully the 0.90 cluster requires no maintenance, because pre-1.0 things are pretty different.



