So here is my situation:
I have a somewhat heterogeneous Kubernetes cluster whose nodes have several different types of GPUs (let's assume types A and B). I would like to host some LLMs on that cluster and scale them depending on demand.
However, some of the models I want to provide need more memory than some of the GPUs offer (e.g. type A has only 48 GB while the model needs 60 GB), but those models can be sharded across multiple type-A GPUs, whereas they fit on a single GPU of type B (which provides 96 GB).
I could easily create two Deployments for such a model: one that requests multiple GPUs of type A, and another that requires only a single GPU of type B.
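For concreteness, a minimal sketch of what those two Deployments (plus a shared Service) might look like. The node label key `gpu.example.com/type`, the image name, and the port are placeholders; `nvidia.com/gpu` assumes the NVIDIA device plugin is installed, and your cluster's actual labels (e.g. from GPU Feature Discovery) will differ:

```yaml
# Same model, two Deployments, one per GPU type.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-model-x-gpu-a
spec:
  replicas: 1
  selector:
    matchLabels: {app: llm-model-x, gpu: type-a}
  template:
    metadata:
      labels: {app: llm-model-x, gpu: type-a}
    spec:
      nodeSelector:
        gpu.example.com/type: "A"     # assumed node label for the 48 GB cards
      containers:
      - name: server
        image: my-llm-server:latest   # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 2         # two 48 GB GPUs, model sharded across them
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-model-x-gpu-b
spec:
  replicas: 1
  selector:
    matchLabels: {app: llm-model-x, gpu: type-b}
  template:
    metadata:
      labels: {app: llm-model-x, gpu: type-b}
    spec:
      nodeSelector:
        gpu.example.com/type: "B"     # assumed node label for the 96 GB cards
      containers:
      - name: server
        image: my-llm-server:latest   # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 1         # one 96 GB GPU fits the whole model
---
# A single Service can front both Deployments, since both carry app: llm-model-x,
# so clients don't care which GPU variant serves their request.
apiVersion: v1
kind: Service
metadata:
  name: llm-model-x
spec:
  selector:
    app: llm-model-x
  ports:
  - port: 8000
    targetPort: 8000
```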
And here is my problem/question:
Is there a way to make the autoscaler scale the total replica count across the combination of the type-A and type-B Deployments?
And/or does someone know whether there is a good way to achieve this, or whether this is simply out of scope for Kubernetes?