[Bug] ECK should only Update Elasticsearch StatefulSet version when attempting to upgrade the StatefulSet Pods #8429

@BenB196

(ECK version 2.15.0)

Background:

I was recently upgrading a rather large Elasticsearch cluster from 8.16.2 to 8.17.1, but ran into an issue where one of the dedicated master pods was recreated partway through the upgrade process.

Issue

The problem appears to be that when ECK receives an Elasticsearch version upgrade, it immediately updates the version in all StatefulSets, and only then performs the rolling restart. If a pod gets killed/recreated partway through the process, it comes back on the new version regardless of how far the rolling restart has progressed, so the "order of operations" no longer applies and nodes can be upgraded in the wrong order.
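To illustrate the behaviour described above, here is a minimal toy model in Python. It is not ECK code: the `StatefulSet` class, `eager_upgrade`, and the pod names are hypothetical stand-ins, and the only assumption it encodes is that a recreated pod always starts from the current pod template.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulSet:
    """Toy stand-in for a StatefulSet: a template version plus running pods."""
    name: str
    template_version: str
    pods: dict = field(default_factory=dict)  # pod name -> running version

    def recreate(self, pod: str) -> None:
        # A recreated pod always starts from the current pod template.
        self.pods[pod] = self.template_version


def eager_upgrade(statefulsets, new_version):
    """Models the reported behaviour: every template is bumped up front."""
    for sts in statefulsets:
        sts.template_version = new_version


masters = StatefulSet("masters", "8.16.2", {"masters-0": "8.16.2"})
data = StatefulSet("data", "8.16.2", {"data-0": "8.16.2", "data-1": "8.16.2"})

eager_upgrade([masters, data], "8.17.1")
# The rolling restart has not touched any pod yet, but a master pod is
# recreated for an unrelated reason.
masters.recreate("masters-0")

print(masters.pods)  # {'masters-0': '8.17.1'}
print(data.pods)     # {'data-0': '8.16.2', 'data-1': '8.16.2'}
```

In this model the master ends up on 8.17.1 while every data node is still on 8.16.2, which is the mixed state the report describes.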

Reproduction:

  1. Create an Elasticsearch cluster with dedicated masters
  2. Upgrade the Elasticsearch cluster
  3. Recreate one of the master pods while the upgrade is still working on non-master nodes
  4. Once the new master node gets created, create an index
  5. The index will get assigned the new Elasticsearch index version, and won't be allocatable on the lower version non-master nodes
  6. Observe that the upgrade managed via ECK deadlocks on a yellow state because of allocation issues from step 5.
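Steps 4 to 6 can be sketched with a small Python model. This is a simplification, not Elasticsearch's actual allocation logic: it assumes an index can only be allocated to nodes at least as new as the node that created it, and it compares release versions directly, whereas Elasticsearch tracks its own internal index version. The node names and the `can_allocate` helper are hypothetical.

```python
def parse(version: str) -> tuple:
    """Turn '8.17.1' into (8, 17, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def can_allocate(index_created_on: str, node_version: str) -> bool:
    # Simplifying assumption: a shard can only live on a node that is at
    # least as new as the version the index was created on.
    return parse(node_version) >= parse(index_created_on)


# State after step 3: the master was recreated on the new version,
# the non-master nodes are still waiting for their rolling restart.
nodes = {"masters-0": "8.17.1", "data-0": "8.16.2", "data-1": "8.16.2"}
index_created_on = nodes["masters-0"]  # step 4: index created via the new master

eligible = [
    name for name, version in nodes.items()
    if name.startswith("data-") and can_allocate(index_created_on, version)
]
print(eligible)  # []
```

With no eligible non-master node, the new index stays unassigned, which matches the yellow state in step 6.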

Expectation:

ECK should only update a StatefulSet's version when it's ready to perform the rolling restart of that StatefulSet, rather than at the very start of the upgrade process.
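A rough sketch of the expected behaviour, again as a toy Python model rather than ECK code. `StatefulSet`, `deferred_upgrade`, and the pod names are hypothetical, and the order (non-masters before masters) simply mirrors the situation in this report.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulSet:
    """Toy stand-in for a StatefulSet: a template version plus running pods."""
    name: str
    template_version: str
    pods: dict = field(default_factory=dict)  # pod name -> running version

    def recreate(self, pod: str) -> None:
        # A recreated pod always starts from the current pod template.
        self.pods[pod] = self.template_version


def deferred_upgrade(ordered_statefulsets, new_version):
    """Bump each template only when its turn comes; yield after each pod."""
    for sts in ordered_statefulsets:
        sts.template_version = new_version
        for pod in list(sts.pods):
            sts.recreate(pod)
            yield pod


masters = StatefulSet("masters", "8.16.2", {"masters-0": "8.16.2"})
data = StatefulSet("data", "8.16.2", {"data-0": "8.16.2", "data-1": "8.16.2"})

rollout = deferred_upgrade([data, masters], "8.17.1")
next(rollout)                  # data-0 is restarted on 8.17.1
masters.recreate("masters-0")  # unplanned recreation in the middle of the upgrade

print(masters.pods)  # {'masters-0': '8.16.2'}
print(data.pods)     # {'data-0': '8.17.1', 'data-1': '8.16.2'}

remaining = list(rollout)      # finish the upgrade in order
print(remaining)     # ['data-1', 'masters-0']
print(masters.pods)  # {'masters-0': '8.17.1'}
```

Because the masters' template is untouched until their turn, the recreated master comes back on 8.16.2 and the upgrade order is preserved.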

Workaround:

To work around the deadlock, I had to manually (and carefully) delete/recreate each of the remaining non-master pods to allow them to pick up the new version.
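For anyone hitting the same deadlock, a small Python helper along these lines could list which pods still need to be deleted. It only builds `kubectl delete pod` commands and does not run them. The pod names, the namespace, and the version map are hypothetical; in practice the versions would have to be read from the cluster.

```python
import shlex


def delete_commands(pod_versions, target_version, namespace):
    """Return one `kubectl delete pod` command per pod not yet on target_version."""
    stale = [pod for pod, version in pod_versions.items() if version != target_version]
    return [
        f"kubectl delete pod {shlex.quote(pod)} --namespace {shlex.quote(namespace)}"
        for pod in sorted(stale)
    ]


# Hypothetical snapshot of the non-master pods and the versions they run.
pods = {"es-data-0": "8.16.2", "es-data-1": "8.17.1", "es-data-2": "8.16.2"}

commands = delete_commands(pods, "8.17.1", "elastic")
for command in commands:
    print(command)
```

The commands are meant to be run one at a time, waiting for each recreated pod to rejoin the cluster before deleting the next one.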

Labels

>bug: Something isn't working
