@vietj commented Oct 8, 2025

Motivation:

The implementation of the Kubernetes resolver relies on watching a Kubernetes resource through a WebSocket to maintain its state. When the WebSocket fails, a reconnection is attempted, which might itself fail. In particular, the server might respond with a 410 Gone response indicating that the watched resource is no longer available.

We could handle such failure in the resolver itself and perform a new GET then Watch operation, but it turns out that the endpoint resolver implements this already.

Changes:

When the resolver resolves a service, make the WebSocket connect part of the resolution rather than a side effect of it. If the WebSocket cannot connect, the resolution fails and the HTTP client reacts accordingly.

When the WebSocket is disconnected, mark the kube service state as invalid. The HTTP client probes this state and, on finding it invalid, attempts a new resolution, which leads to a new GET then Watch sequence in the resolver.
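
For readers who want the shape of the two changes without opening the diff, here is a minimal sketch in plain Java. It is not the code from this PR and does not use the Vert.x API: `KubeApi`, `Watch`, `ServiceState`, `Client` and `FakeApi` are hypothetical names, and `CompletableFuture` stands in for whatever future type the resolver actually uses. It only illustrates that the watch connect is composed into the resolution, and that a disconnect flags the state instead of reconnecting.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical names throughout; this is an illustration, not the Vert.x resolver API.
final class ResolverSketch {

  /** Stand-in for the Kubernetes API: a GET for endpoints and a watch connect. */
  interface KubeApi {
    CompletableFuture<List<String>> getEndpoints(String service);
    CompletableFuture<Watch> connectWatch(String service);
  }

  /** Stand-in for the WebSocket watch; the handler runs when the socket closes. */
  static final class Watch {
    private Runnable closeHandler = () -> {};
    void onClose(Runnable handler) { closeHandler = handler; }
    void close() { closeHandler.run(); }
  }

  /** Resolved state for a service; valid only while its watch is connected. */
  static final class ServiceState {
    final List<String> endpoints;
    private volatile boolean valid = true;
    ServiceState(List<String> endpoints) { this.endpoints = endpoints; }
    boolean isValid() { return valid; }
    void invalidate() { valid = false; }
  }

  /** GET then Watch: the returned future fails if the watch cannot connect. */
  static CompletableFuture<ServiceState> resolve(KubeApi api, String service) {
    return api.getEndpoints(service).thenCompose(endpoints ->
        api.connectWatch(service).thenApply(watch -> {
          ServiceState state = new ServiceState(endpoints);
          watch.onClose(state::invalidate); // a disconnect only flags the state
          return state;
        }));
  }

  /** Client side: reuse the cached state while valid, otherwise resolve again. */
  static final class Client {
    private final KubeApi api;
    private ServiceState cached;
    Client(KubeApi api) { this.api = api; }
    synchronized ServiceState lookup(String service) {
      if (cached == null || !cached.isValid()) {
        cached = resolve(api, service).join(); // blocking only for brevity
      }
      return cached;
    }
  }

  /** In-memory fake used only to exercise the sketch. */
  static final class FakeApi implements KubeApi {
    int gets = 0;
    boolean failWatch = false;
    Watch lastWatch;
    public CompletableFuture<List<String>> getEndpoints(String service) {
      gets++;
      return CompletableFuture.completedFuture(List.of("10.0.0." + gets + ":8080"));
    }
    public CompletableFuture<Watch> connectWatch(String service) {
      if (failWatch) {
        return CompletableFuture.failedFuture(new IllegalStateException("410 Gone"));
      }
      lastWatch = new Watch();
      return CompletableFuture.completedFuture(lastWatch);
    }
  }

  public static void main(String[] args) {
    FakeApi api = new FakeApi();
    Client client = new Client(api);
    ServiceState first = client.lookup("my-svc");
    api.lastWatch.close(); // simulate a WebSocket disconnect
    ServiceState second = client.lookup("my-svc");
    System.out.println("re-resolved after disconnect: " + (first != second));
  }
}
```

In this sketch there is no reconnection logic in the state at all: a failed connect surfaces as a failed resolution, and a dropped watch is picked up by the next validity check, so a 410 Gone cannot drive a retry loop inside the state.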

@vietj added the bug label Oct 8, 2025
@vietj self-assigned this Oct 8, 2025
@vietj added this to the 5.0.5 milestone Oct 8, 2025
@vietj merged commit ff5872b into 5.0 Oct 8, 2025
4 checks passed
@vietj deleted the websocket-handling-5.0 branch October 8, 2025 11:18
Development

Successfully merging this pull request may close these issues.

Problem with reconnecting to websocket in KubeServiceState
Continuos loop in KubeServiceState when Kubernetes api server returns error
