Frontend-friendly Kubernetes Watch API
Problem statement
A memory leak on the environments page (gitlab-org/gitlab#437956 - closed) might be related to how the Kubernetes watch API is used today. An easy-to-use API might help avoid such bugs because it would result in simpler code, and simpler code is easier to maintain.
Proposal
It's more convenient for the frontend to have a single WebSocket connection for all watch calls (vs individual connections). However, multiple such connections are still allowed.
Introduce a new API path to kas' Kubernetes proxy: /watch. Accept WebSocket connections on that path.
Accept messages to establish a watch:
{
"type": "watch",
"watchId": "unique identifier for this watch, e.g. a random string",
"apiVersion": "...",
"resource": "...",
"namespace": "...",
// and so on
}
Bits of information to define what watch to establish:
- api version
- resource (plural of object kind)
- namespace (optional)
- label selector (optional)
- field selector (optional)
- resource version (optional)
- resource version match (optional)
- send initial events (optional)
- timeoutSeconds
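On the frontend, the watch message could be typed roughly as follows. This is a minimal sketch, not the implemented API: only type, watchId, apiVersion, resource, namespace, and timeoutSeconds are named in this proposal. The remaining key names (labelSelector, fieldSelector, resourceVersion, resourceVersionMatch, sendInitialEvents) are assumptions borrowed from the Kubernetes list/watch query parameters, and buildWatchRequest is a hypothetical helper.

```typescript
// Shape of the "watch" message. Keys marked "assumed" are not fixed by the
// proposal; they mirror the Kubernetes list/watch query parameter names.
interface WatchRequest {
  type: 'watch';
  watchId: string;
  apiVersion: string;
  resource: string; // plural of the object kind, e.g. "pods"
  namespace?: string;
  labelSelector?: string; // assumed key name
  fieldSelector?: string; // assumed key name
  resourceVersion?: string; // assumed key name
  resourceVersionMatch?: string; // assumed key name
  sendInitialEvents?: boolean; // assumed key name
  timeoutSeconds?: number;
}

type WatchParams = Omit<WatchRequest, 'type' | 'watchId'>;

// Hypothetical helper: wraps the watch parameters in the message envelope.
// Optional fields left undefined are omitted by JSON.stringify.
function buildWatchRequest(watchId: string, params: WatchParams): WatchRequest {
  return { type: 'watch', watchId, ...params };
}
```

For example, buildWatchRequest('w1', { apiVersion: 'v1', resource: 'pods', namespace: 'default' }) yields a message that can be serialized and sent over the WebSocket connection.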
Accept messages to drop a watch:
{
"type": "unwatch",
"watchId": "..."
}
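The watch and unwatch messages pair naturally into a small client-side registry that shares one connection. The sketch below is an illustration under stated assumptions: WatchRegistry and MessageSink are hypothetical names, the sink stands in for a WebSocket (only send is used), and watch IDs come from a per-registry counter rather than a random string.

```typescript
// Minimal slice of a WebSocket: only send() is needed for this sketch.
interface MessageSink {
  send(data: string): void;
}

// Hypothetical registry that multiplexes watches over a single connection.
class WatchRegistry {
  private nextId = 0;
  private readonly active = new Set<string>();
  private readonly sink: MessageSink;

  constructor(sink: MessageSink) {
    this.sink = sink;
  }

  // Sends a "watch" message and returns a function that drops the watch.
  watch(params: { apiVersion: string; resource: string; namespace?: string }): () => void {
    // Assumption: a counter is unique enough within one connection.
    const watchId = `watch-${this.nextId++}`;
    this.active.add(watchId);
    this.sink.send(JSON.stringify({ type: 'watch', watchId, ...params }));
    return () => this.unwatch(watchId);
  }

  // Number of watches that have not been dropped yet.
  get size(): number {
    return this.active.size;
  }

  // Idempotent: calling the returned function twice sends only one "unwatch".
  private unwatch(watchId: string): void {
    if (!this.active.delete(watchId)) return;
    this.sink.send(JSON.stringify({ type: 'unwatch', watchId }));
  }
}
```

Returning the cleanup function from watch keeps the watch ID out of calling code, which makes it harder to forget to drop a watch when a component is destroyed.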
Emit messages for watch events:
{
"object": "watch_event",
"watchId": "...",
"watchEvent": {...} // the watch event
}
Emit error events, e.g. when a watch couldn't be established, was interrupted, or timed out:
{
"object": "error",
"watchId": "...",
"errorType": "WATCH_NOT_ESTABLISHED", // enum-like strings: "WATCH_NOT_ESTABLISHED", "WATCH_FAILED"
"error": "error message goes here"
}
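A frontend consumer could route the two message kinds above to per-watch handlers keyed by watchId. The following is a sketch only: WatchHandlers and dispatch are hypothetical names, the watch event payload is left untyped, and removing the handlers after an error assumes that an errored watch is finished, which this proposal does not state explicitly.

```typescript
type WatchErrorType = 'WATCH_NOT_ESTABLISHED' | 'WATCH_FAILED';

interface WatchEventMessage {
  object: 'watch_event';
  watchId: string;
  watchEvent: unknown; // the Kubernetes watch event, left untyped here
}

interface WatchErrorMessage {
  object: 'error';
  watchId: string;
  errorType: WatchErrorType;
  error: string;
}

type ServerMessage = WatchEventMessage | WatchErrorMessage;

// Hypothetical per-watch callbacks.
interface WatchHandlers {
  onEvent(watchEvent: unknown): void;
  onError(errorType: WatchErrorType, error: string): void;
}

// Routes one raw message to the handlers registered for its watchId.
// Returns false when the message was not delivered to any handler.
function dispatch(raw: string, handlers: Map<string, WatchHandlers>): boolean {
  const message = JSON.parse(raw) as ServerMessage;
  const handler = handlers.get(message.watchId);
  if (handler === undefined) return false; // e.g. an event that raced with an unwatch
  switch (message.object) {
    case 'watch_event':
      handler.onEvent(message.watchEvent);
      return true;
    case 'error':
      handler.onError(message.errorType, message.error);
      // Assumption: an errored watch is over, so forget its handlers.
      handlers.delete(message.watchId);
      return true;
    default:
      return false; // unknown message kind
  }
}
```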
When implementing this in kas, add a document that describes the implemented API to the doc directory.
Background
See gitlab-org/gitlab#429531 (comment 1806482872).
Security
When kas reads and parses the event stream from the cluster, enforce a limit on the size of an object read in a single read operation. This would prevent a DoS attack from a malicious cluster.
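The idea of a per-object size limit can be illustrated with a bounded line splitter. This is a sketch of the technique only, written in TypeScript for consistency with the frontend examples; kas itself is not written in TypeScript. It assumes the event stream is newline-delimited JSON, measures size in characters rather than bytes, and BoundedLineSplitter is a hypothetical name.

```typescript
// Splits a chunked stream into newline-delimited objects and refuses to
// buffer any single object longer than maxChars.
class BoundedLineSplitter {
  private buffer = '';
  private readonly maxChars: number;

  constructor(maxChars: number) {
    this.maxChars = maxChars;
  }

  // Feeds one chunk and returns the complete lines it finished.
  // Throws when a complete or still-incomplete object exceeds the limit.
  push(chunk: string): string[] {
    const lines: string[] = [];
    let rest = this.buffer + chunk;
    let newline = rest.indexOf('\n');
    while (newline !== -1) {
      if (newline > this.maxChars) {
        this.buffer = '';
        throw new Error('object exceeds size limit');
      }
      lines.push(rest.slice(0, newline));
      rest = rest.slice(newline + 1);
      newline = rest.indexOf('\n');
    }
    if (rest.length > this.maxChars) {
      this.buffer = '';
      throw new Error('object exceeds size limit');
    }
    this.buffer = rest;
    return lines;
  }
}
```

Checking the partial remainder as well as complete lines matters: a malicious cluster could otherwise stream an object that never ends and grow the buffer without bound.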