Description
Component(s)
collector
Is your feature request related to a problem? Please describe.
We would like management of our OTel collectors to be as painless as possible when using the operator. Still, we have a desire to run different custom builds of the OTel collector for various use cases. For instance:
- Our sidecars should just run the default image, but from a custom container registry
- We have agents that should integrate with our cloud provider and hence include additional receivers
- We have a gateway that requires additional exporters for our telemetry backends
When setting the `spec.image` of the `OpenTelemetryCollector` resource, we also need to take ownership of the image tag. This causes us to have to disable automatic upgrades of the resource in order to ensure that the operator doesn't apply migrations that are incompatible with the collector version we are running.
But, even with automatic upgrades disabled, the operator still mutates our `OpenTelemetryCollector` resources via admission webhooks. Since we deploy our `OpenTelemetryCollector` via Helm, which performs a merge with the existing resource in the cluster, the only way we have found to consistently clean up such mutations applied by the admission webhook is by enabling automatic upgrades. Then the migration steps will keep the resource up to date.
So, we end up in a situation where using a custom image on `OpenTelemetryCollector` resources requires us to disable automatic upgrades, but we still rely on automatic upgrades to clean up mutations performed by admission webhooks. This is not painless.
We would like to use a custom image, but still benefit from automatic upgrades without worrying about compatibility issues. That would be painless 💯
Describe the solution you'd like
I can think of two ways to approach this:
Allow overriding only the image repository on `OpenTelemetryCollector`s, and let the operator manage the tag
We always make sure that the custom OTel collector builds we generate are in lock-step with the upstream collector versions. Hence, for us it would be fantastic if we could simply define which variant each collector uses, but let the operator take care of versioning and automatic upgrades.
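To make the first idea concrete, here is a minimal Go sketch of how a user-supplied repository could be combined with the version the operator manages. `resolveImage` and the registry names are hypothetical and exist only for illustration; this is not how the operator resolves images today.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveImage is a hypothetical helper (not part of the operator's API).
// It joins a user-chosen image repository with the collector version that
// the operator manages, so the user never has to own the tag.
func resolveImage(repository, operatorVersion string) (string, error) {
	if repository == "" {
		return "", fmt.Errorf("repository must not be empty")
	}
	// Only the last path segment can carry a tag or digest; a colon earlier
	// in the reference is a registry port such as localhost:5000.
	lastSegment := repository[strings.LastIndex(repository, "/")+1:]
	if strings.ContainsAny(lastSegment, ":@") {
		return "", fmt.Errorf("repository %q must not include a tag or digest", repository)
	}
	return repository + ":" + strings.TrimPrefix(operatorVersion, "v"), nil
}

func main() {
	// Assumed example values: a custom gateway build and the operator's default version.
	image, err := resolveImage("registry.example.com/otel/collector-gateway", "0.136.0")
	fmt.Println(image, err)
}
```

With something along these lines, each collector would only declare its variant (sidecar, agent, gateway), and the tag would follow the operator's version on every upgrade.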
Make the operator honor the version of individual OpenTelemetryCollectors
We tried enabling automatic upgrades on an `OpenTelemetryCollector` using a `0.133.0` collector image while the operator defaults to version `0.136.0`. According to the status on the `OpenTelemetryCollector`, the version is reported as `0.136.0` despite using an older image.
If we could somehow tell the operator to postpone running migrations until we actually upgrade the `OpenTelemetryCollector` image tag to `0.136.0`, that would help us during upgrades of the operator. However, I can understand how this would be problematic with admission webhooks if we apply a new `OpenTelemetryCollector` to the cluster which uses an older image tag.
Describe alternatives you've considered
Right now, our alternative is to generate a single custom collector image with all receivers, exporters, etc. included, and use that for every collector (sidecars, agents, and gateway) in the cluster.
Additional context
It is entirely possible that we are just missing something here. If our goal is achievable with current features in the operator, we'd love to know!