CA incorrectly reports 0% utilisation for nodes that have running pods #8814

@aaronfern

Which component are you using?:

/area cluster-autoscaler

What version of the component are you using?:

Component version: v1.32

What environment is this in?:

Our own custom cloud-provider, but this error is part of the core CA logic

What did you expect to happen?:
When CA calculates resource utilisation to identify underutilised nodes, it should calculate that value reliably.

What happened instead?:
At random times, CA calculates "0% utilisation" for nodes. See the logs below:

2025-11-04 17:12:00 | {"log":"Node <node_name_1> is underutilized: memory requested (0% of allocatable) is below the scale-down utilization threshold","pid":"1","severity":"INFO","source":"eligibility.go:168"}
2025-11-04 17:12:00 | {"log":"Node <node_name_2> is underutilized: memory requested (0% of allocatable) is below the scale-down utilization threshold","pid":"1","severity":"INFO","source":"eligibility.go:168"}
2025-11-04 17:11:04 | {"log":"Node <node_name_3> is underutilized: memory requested (0% of allocatable) is below the scale-down utilization threshold","pid":"1","severity":"INFO","source":"eligibility.go:168"}

I've added diagnostic logs to the CalculateUtilizationOfResource() function, and it's clear that when this happens, nodeInfo.Pods() does not return anything. As a result, podsRequest ends up being 0, and that is the value returned.
I've verified that the nodes in question do in fact contain running pods (which are neither daemonset/mirror pods nor terminating).
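
To illustrate the failure mode, here is a minimal sketch of the calculation as I understand it. This is not the actual CA code: the `pod` type and the `memoryUtilization` function are hypothetical simplifications, and the real function works on NodeInfo and resource quantities. It only shows that an empty pod list leads directly to 0, with no error along the way.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for a scheduled pod.
// Only the memory request matters for this sketch.
type pod struct {
	name          string
	memRequestMiB int64
}

// memoryUtilization sums the pods' memory requests and divides by the
// node's allocatable memory. It is a simplified assumption of what the
// utilisation calculation does, not a copy of it.
func memoryUtilization(pods []pod, allocatableMiB int64) float64 {
	if allocatableMiB == 0 {
		return 0
	}
	var requested int64
	for _, p := range pods {
		requested += p.memRequestMiB
	}
	return float64(requested) / float64(allocatableMiB)
}

func main() {
	pods := []pod{{"app-a", 3072}, {"app-b", 1024}}

	// Normal case: the pod list reflects what is running on the node.
	fmt.Printf("with pods:      %.2f\n", memoryUtilization(pods, 8192))

	// Failure case: the pod list is empty even though pods are running.
	// The result is 0 and nothing signals that input was missing.
	fmt.Printf("empty pod list: %.2f\n", memoryUtilization(nil, 8192))
}
```

If this matches the real behaviour, the 0% value is a symptom of the empty pod list rather than of the arithmetic, so the place to look is wherever the NodeInfo is populated with pods.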

This "0% utilisation" happens randomly throughout the day and does not seem to follow any pattern. I also did not see any error logs around the times it occurs.

How to reproduce it (as minimally and precisely as possible):

I'm not sure this is relevant. I see this randomly and have not found a way to reproduce it.

Anything else we need to know?:

Any pointers on what else I can look at would be helpful.
