Let the operator manage image tags when using custom collector image repositories #4433

@Kasper-Bizzkit

Component(s)

collector

Is your feature request related to a problem? Please describe.

We would like management of our OTel collectors to be as painless as possible when using the operator. Still, we have a desire to run different custom builds of the OTel collector for various use cases. For instance:

  1. Our sidecars should just run the default image, but from a custom container registry
  2. We have agents that should integrate with our cloud provider and hence include additional receivers
  3. We have a gateway that requires additional exporters for our telemetry backends

When setting the spec.image of the OpenTelemetryCollector resource, we also need to take ownership of the image tag. This causes us to have to disable automatic upgrades of the resource in order to ensure that the operator doesn't apply migrations that are incompatible with the collector version we are running.
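A minimal sketch of the setup described above, assuming the `opentelemetry.io/v1beta1` API. The resource name, registry, image name, and pipeline config are placeholders for illustration, not our actual values:

```yaml
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: gateway            # placeholder name
spec:
  mode: deployment
  # Custom build pulled from a private registry (placeholder repository).
  # Setting spec.image means we own the tag as well as the repository.
  image: registry.example.com/otel/collector-gateway:0.133.0
  # Automatic upgrades are disabled so the operator does not apply
  # migrations written for a newer collector version than the pinned tag.
  upgradeStrategy: none
  config:
    # Placeholder pipeline; a real gateway would use its own components.
    receivers:
      otlp:
        protocols:
          grpc: {}
    exporters:
      debug: {}
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [debug]
```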

But, even with automatic upgrades disabled, the operator still mutates our OpenTelemetryCollector resources via admission webhooks. Since we deploy our OpenTelemetryCollector via Helm, which performs a merge with the existing resource in the cluster, the only way we have found to consistently clean up such mutations applied by the admission webhook is by enabling automatic upgrades. Then the migration steps will keep the resource up to date.

So, we end up in a situation where using a custom image on OpenTelemetryCollector resources requires us to disable automatic upgrades, but we still rely on automatic upgrades to clean up mutations performed by admission webhooks. This is not painless.

We would like to use a custom image, but still benefit from automatic upgrades without worrying about compatibility issues. That would be painless 💯

Describe the solution you'd like

I can think of two ways to approach this:

Allow overriding only the image repository on OpenTelemetryCollectors, and let the operator manage the tag

We always make sure that the custom OTel collector builds we generate are in lock-step with the upstream collector versions. Hence, for us it would be fantastic if we could simply define which variant each collector uses, but let the operator take care of versioning and automatic upgrades.
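To illustrate, this option might look roughly like the sketch below. The `imageRepository` field is purely hypothetical: it does not exist in the operator today, and its name is only an assumption made for this example. The registry and image name are placeholders as well.

```yaml
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: gateway            # placeholder name
spec:
  mode: deployment
  # HYPOTHETICAL field, not part of the current CRD.
  # Only the repository is given; the operator would append the tag
  # matching the collector version it currently manages.
  imageRepository: registry.example.com/otel/collector-gateway
  # With the tag managed by the operator, automatic upgrades could stay on.
  upgradeStrategy: automatic
  # config omitted for brevity
```

This only works if the custom images are published with the same tags as the upstream collector releases, which is the lock-step assumption stated above.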

Make the operator honor the version of individual OpenTelemetryCollectors

We tried enabling automatic upgrades on an OpenTelemetryCollector using a 0.133.0 collector image while the operator defaults to version 0.136.0. According to the status on the OpenTelemetryCollector, the version is reported as 0.136.0 even though the older image is still in use.

If we could somehow tell the operator to postpone running migrations until we actually upgrade the OpenTelemetryCollector image tag to 0.136.0, that would help us during upgrades of the operator. However, I can understand how this would be problematic with admission webhooks if we apply a new OpenTelemetryCollector to the cluster that uses an older image tag.

Describe alternatives you've considered

Right now, our alternative is to generate a single custom collector image with all receivers, exporters, etc. included, and use that for every collector (sidecars, agents, and gateway) in the cluster.

Additional context

It is entirely possible that we are just missing something here. If our goal is achievable with current features in the operator, we'd love to know!
