Contributor

@fpetkovski fpetkovski commented Aug 19, 2025

What this PR does / why we need it:

KSM supports a range of Kubernetes versions which makes it important to keep up with new versions as clusters upgrade.

In order to allow for alerting on unsupported cluster versions, this commit adds an info metric for the client-go version used by KSM. This way users can set up alerts if the cluster in which KSM is running is outside of the supported range.

Alternatively, we can directly expose the supported min and max versions as metrics instead of having users calculate the values themselves.
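To make the "calculate the values themselves" part concrete, here is a minimal sketch of how a supported range could be derived from the client-go version. It assumes that a client-go tag `v0.X.Y` tracks Kubernetes `v1.X.Y`. The `older` and `newer` skew parameters are hypothetical knobs, not KSM's actual compatibility policy, and the function names are made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// kubeMinorFromClientGo maps a client-go version such as "v0.33.2" to the
// Kubernetes minor version it tracks (33), assuming the v0.X.Y <-> v1.X.Y
// correspondence between client-go and Kubernetes releases.
func kubeMinorFromClientGo(version string) (int, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) < 2 || parts[0] != "0" {
		return 0, fmt.Errorf("unexpected client-go version %q", version)
	}
	return strconv.Atoi(parts[1])
}

// supportedRange returns an inclusive range of Kubernetes minor versions
// around the one tracked by client-go. The older and newer arguments are
// hypothetical skew values chosen by the caller.
func supportedRange(clientGoVersion string, older, newer int) (lo, hi int, err error) {
	minor, err := kubeMinorFromClientGo(clientGoVersion)
	if err != nil {
		return 0, 0, err
	}
	return minor - older, minor + newer, nil
}

func main() {
	// The skew values below are illustrative only.
	lo, hi, err := supportedRange("v0.33.2", 2, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("supported Kubernetes minors: 1.%d to 1.%d\n", lo, hi)
}
```

Exposing the two bounds as metrics would let an alert compare them against the cluster version directly, without users having to reproduce this arithmetic in their alerting rules.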

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)

Cardinality is unchanged

Which issue(s) this PR fixes: (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged)
Fixes #

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fpetkovski
Once this PR has been reviewed and has the lgtm label, please assign catherinef-dev for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 19, 2025
@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 19, 2025
KSM supports a range of Kubernetes versions which makes it important to
keep up with new versions as clusters upgrade.

In order to allow for alerting on unsupported cluster versions, this commit
adds an info metric for the client-go version used by KSM. This way users can
set up alerts if the cluster in which KSM is running is outside of the supported range.

Signed-off-by: fpetkovski <[email protected]>
@fpetkovski fpetkovski changed the title from "Add info metric for client-go version" to "feat: Add info metric for client-go version" Aug 19, 2025

clientGoVersion := promauto.With(ksmMetricsRegistry).NewGaugeVec(
prometheus.GaugeOpts{
Name: "kube_state_metrics_client_go_info",
Member

I wonder if we can make this part of the kube_state_metrics_build_info metric, but on the other hand we don't have that many metrics under this metric path.

Contributor Author

Adding the label to kube_state_metrics_build_info makes sense. Does this metric exist though? I wasn't able to find it on port 8081.

Member

prometheus/client_golang#1860 PR is here to add support.

clientGoVersion := promauto.With(ksmMetricsRegistry).NewGaugeVec(
prometheus.GaugeOpts{
Name: "kube_state_metrics_client_go_info",
Help: "An info metric for the client-go version used by kube-state-metrics",
Member

Can we mark it as experimental?

@mrueg mrueg added this to the v2.17.0 milestone Aug 21, 2025
@mrueg mrueg removed this from the v2.17.0 milestone Aug 26, 2025
Member

mrueg commented Aug 26, 2025

Dropping this from the v2.17 milestone, I would prefer to have it in the build_info metric via the client_golang patch.

Member

mrueg commented Aug 26, 2025

@fpetkovski prometheus/client_golang#1860 got merged. Do you want to update the client_golang dep to point at a commit on main and make the change here?

@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Instrumentation Aug 26, 2025
@rexagod rexagod moved this from Needs Triage to Backlog (stale) in SIG Instrumentation Sep 4, 2025
Member

rexagod commented Sep 4, 2025

/triage accepted
/assign @mrueg

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 4, 2025
@rexagod rexagod moved this from Backlog (stale) to In Progress in SIG Instrumentation Sep 4, 2025