OCPEDGE-1502: feat: podman-etcd: add support for cert rotation #2085

eggfoobar · 2025-10-17T03:43:49Z

added a cert check function to the monitor call to force a restart of etcd when the certs have been changed

knet-jenkins · 2025-10-17T03:44:36Z

Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2085/1/input

clobrano

Overall, it looks quite good, and I've added a few comments. My main concern now is that the nodes can act on certificate change at the same time, which results in the etcd-cluster being disrupted until both recover.
First of all, is it expected that both nodes can update certificate at the same time? If so, we might want to synchronize them, e.g. with CIB attributes. There are a few examples in the code, look for attribute_learner_node function for example.

heartbeat/podman-etcd

knet-jenkins · 2025-10-17T14:27:01Z

Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2085/2/input

eggfoobar · 2025-10-17T14:29:11Z

@clobrano I don't know about the CIB mechanism, looking into it now so I can port over the attribute_learner_node functionality for syncing this restart. I'll update the code soon, but the other suggestions should be up now

added a cert check function to the monitor call to force a restart of etcd when the certs have been changed Signed-off-by: ehila <[email protected]>

knet-jenkins · 2025-10-21T04:27:06Z

Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2085/3/input

eggfoobar · 2025-10-21T05:21:27Z

@clobrano I don't think we'll need the CIB, was able to validate this and the cert rotation is a rolling change, so the first node updates, then the second. I think we'll be adding extra complexity to have CIB in the mix. I haven't had a chance to look at the etcd-operator code yet, but I'll find this out so we can be sure that is the behavior. In my experimentation it was a rolling process from node to node.

clobrano · 2025-10-21T06:10:04Z

@clobrano I don't think we'll need the CIB, was able to validate this and the cert rotation is a rolling change, so the first node updates, then the second. I think we'll be adding extra complexity to have CIB in the mix. I haven't had a chance to look at the etcd-operator code yet, but I'll find this out so we can be sure that is the behavior. In my experimentation it was a rolling process from node to node.

If that so, I agree, there is no need to add complexity.

the cert rotation is a rolling change

does it mean the node also reboots, or that CEO just waits for one to be completed before moving to the next?

If the first, we need no logic at all, but I remember we reproduced a scenario when the node does not reboot after a certificate change.

If the latter, we should ensure the process is long enough for the agent to see (and signal) the file change, be stopped (which might require up to 80 seconds) and restart (which is usually much quicker), and rejoin the cluster, before the peer node is affected by the certificate change.

clobrano self-requested a review October 17, 2025 06:30

clobrano requested changes Oct 17, 2025

View reviewed changes

heartbeat/podman-etcd Show resolved Hide resolved

heartbeat/podman-etcd Outdated Show resolved Hide resolved

heartbeat/podman-etcd Outdated Show resolved Hide resolved

heartbeat/podman-etcd Show resolved Hide resolved

heartbeat/podman-etcd Outdated Show resolved Hide resolved

eggfoobar force-pushed the add-podman-etcd-file-monitor branch from 0a83ec4 to 06fd7a2 Compare October 17, 2025 14:26

feat: add support for podman-etcd cert rotation

6bfbe1d

added a cert check function to the monitor call to force a restart of etcd when the certs have been changed Signed-off-by: ehila <[email protected]>

eggfoobar force-pushed the add-podman-etcd-file-monitor branch from 06fd7a2 to 6bfbe1d Compare October 21, 2025 04:26

eggfoobar changed the title ~~WIP: feat: add support for podman-etcd cert rotation~~ OCPEDGE-1502: feat: add support for podman-etcd cert rotation Oct 21, 2025

eggfoobar mentioned this pull request Oct 21, 2025

WIP: feat: added logic to handle cert tnf cert rotation openshift/cluster-etcd-operator#1493

Closed

oalbrigt changed the title ~~OCPEDGE-1502: feat: add support for podman-etcd cert rotation~~ OCPEDGE-1502: feat: podman-etcd: add support for cert rotation Oct 21, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

OCPEDGE-1502: feat: podman-etcd: add support for cert rotation #2085

OCPEDGE-1502: feat: podman-etcd: add support for cert rotation #2085

eggfoobar commented Oct 17, 2025

Uh oh!

knet-jenkins bot commented Oct 17, 2025

Uh oh!

clobrano left a comment •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

knet-jenkins bot commented Oct 17, 2025

Uh oh!

eggfoobar commented Oct 17, 2025

Uh oh!

knet-jenkins bot commented Oct 21, 2025

Uh oh!

eggfoobar commented Oct 21, 2025

Uh oh!

clobrano commented Oct 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

OCPEDGE-1502: feat: podman-etcd: add support for cert rotation #2085

Are you sure you want to change the base?

OCPEDGE-1502: feat: podman-etcd: add support for cert rotation #2085

Conversation

eggfoobar commented Oct 17, 2025

Uh oh!

knet-jenkins bot commented Oct 17, 2025

Uh oh!

clobrano left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

knet-jenkins bot commented Oct 17, 2025

Uh oh!

eggfoobar commented Oct 17, 2025

Uh oh!

knet-jenkins bot commented Oct 21, 2025

Uh oh!

eggfoobar commented Oct 21, 2025

Uh oh!

clobrano commented Oct 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

clobrano left a comment •

edited

Loading