-
Notifications
You must be signed in to change notification settings - Fork 130
Description
Consider a DatadogAgent CRD with the following configuration:
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
spec:
features:
admissionController:
enabled: true
failurePolicy: Fail
# ...This resource will result in the creation of a "datadog-webhook" mutatingwebhookconfigurations.admissionregistration.k8s.io resource. When that webhook call fails, new Pod resources are prevented from being created because the webhook has been configured to fail-closed (failurePolicy: Fail). Normally this is a great way to ensure that new workloads have the APM agent injected as expected. However, when a DatadogAgent resource is deleted, it leaves behind the webhook configs but removes the cluster agent pods that typically handle those webhooks, thereby preventing all new Pods from being created until the "datadog-webhook" config resource is manually removed.
Now that the Cluster Agent is authorized (#1591) to delete the "datadog-webhook" mutatingwebhookconfigurations.admissionregistration.k8s.io and validatingwebhookconfigurations.admissionregistration.k8s.io resources, I believe it ought to be possible to have the Operator automatically clean up the webhook configs.
This bug affects the latest stable release of the Operator at the time of writing: v1.15.1 (via Operator Helm chart v2.10.0).