Add some exponential backoff to k8s watch restarts (#225)
We're still seeing task_proc/tron get stuck in pretty hot restart loops
for expired resource versions. Hopefully backing off a bit will help
here, since one current theory is that hitting the apiserver so hard
causes extra load, which further exacerbates the issue.
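
For illustration, a minimal sketch of what backing off between watch restarts could look like. This is not the code from this commit: `WatchExpired`, `backoff_delays`, `run_watch_with_backoff`, and the default base/factor/cap values are all hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable, Iterator, List


class WatchExpired(Exception):
    """Hypothetical stand-in for an expired resourceVersion error from the apiserver."""


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 30.0) -> Iterator[float]:
    """Yield delays that grow geometrically from `base` and never exceed `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def run_watch_with_backoff(
    start_watch: Callable[[], None],
    max_restarts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Call `start_watch`, restarting with growing delays when it expires.

    Returns the delays that were slept so callers (and tests) can inspect them.
    Re-raises WatchExpired once `max_restarts` restarts have been used up.
    """
    delays = backoff_delays()
    slept: List[float] = []
    while True:
        try:
            start_watch()
            return slept  # the watch ended without expiring
        except WatchExpired:
            if len(slept) >= max_restarts:
                raise
            delay = next(delays)
            sleep(delay)  # wait before hitting the apiserver again
            slept.append(delay)
```

In a real watcher one would probably also reset the delay after a watch has stayed healthy for a while and add some random jitter so that many clients do not restart in lockstep; both are left out here to keep the sketch short.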
If this doesn't work, we'll likely want to switch to a pattern where we
have a reconciliation thread/process periodically reconciling our state
with k8s' on top of having the watch always restart from a
resourceVersion of 0 (which skips the initial pod listing and starts the
watch "now").
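
As a rough sketch of the fallback idea, the periodic reconciliation step could boil down to diffing our locally tracked pod state against a fresh listing from k8s. Everything here is an assumption made for the example: the `reconcile` function, the name-to-phase dictionaries, and the three result buckets are hypothetical and do not come from task_proc or tron.

```python
from typing import Dict, List


def reconcile(local: Dict[str, str], cluster: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare locally tracked pod phases with the phases k8s reports.

    Both arguments map pod name -> phase (e.g. "Running").
    Returns pod names grouped by the kind of drift detected.
    """
    # Pods k8s knows about that we never recorded (missed ADDED events).
    missing = sorted(name for name in cluster if name not in local)
    # Pods we still track that k8s no longer lists (missed DELETED events).
    stale = sorted(name for name in local if name not in cluster)
    # Pods present on both sides whose phase differs (missed MODIFIED events).
    changed = sorted(
        name for name in local.keys() & cluster.keys() if local[name] != cluster[name]
    )
    return {"missing": missing, "stale": stale, "changed": changed}
```

A background thread would call something like this on a timer with the result of a pod list request, then apply the differences to local state, so that events missed between watch restarts are eventually picked up.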