Orphaned workspaces should be auto-terminated

MR: Pending

Description

Implement the plan to terminate orphaned workspaces as described in &11452 (comment 1559480059):

  1. We will repurpose the existing logic which detects and logs orphaned workspaces in OrphanedWorkspacesObserver. Currently this detects workspace names that are sent by the agent in the reconciliation request, but do not exist in the database.
  2. Instead of just logging the orphaned workspaces, if we detect an orphaned workspace which is NOT yet in actual_state of terminated, we will add logic to also send a rails_info record in the reconciliation response to the agent, with the desired_state set to terminated. This is essentially the same thing that happens when a user normally requests to terminate a workspace, but we will be doing it automatically based on detecting the workspace as orphaned.
  3. This will cause the agent to eventually terminate the workspace, and then it will be reported with actual_state of terminated, so we will then stop attempting to terminate it even though it remains orphaned.

Because of the way we have designed the Railway Oriented Programming pipeline for the reconciliation logic, this should be straightforward to implement and test.
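
The three steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Rails implementation: the names `WorkspaceAgentInfo`, `find_orphaned`, `build_termination_rails_infos`, and `reconcile`, the dictionary shape of each `rails_info` record, and the literal state string `"terminated"` are all assumptions made for the example. Only `actual_state`, `desired_state`, `rails_info`, and the matching by workspace name come from the description.

```python
from dataclasses import dataclass

# Assumed literal for the terminated state; the real constant may differ.
TERMINATED = "terminated"


@dataclass(frozen=True)
class WorkspaceAgentInfo:
    """Hypothetical shape of what the agent reports for one workspace."""
    name: str
    actual_state: str


def find_orphaned(agent_infos, persisted_names):
    """Step 1: workspaces reported by the agent with no database record."""
    return [info for info in agent_infos if info.name not in persisted_names]


def build_termination_rails_infos(orphaned):
    """Step 2: ask the agent to terminate orphans that are not yet terminated."""
    return [
        {"name": info.name, "desired_state": TERMINATED}
        for info in orphaned
        if info.actual_state != TERMINATED
    ]


def reconcile(agent_infos, persisted_names):
    """Combine detection (still available for logging) with the response records."""
    orphaned = find_orphaned(agent_infos, persisted_names)
    return {
        "orphaned_names": [info.name for info in orphaned],
        "rails_infos": build_termination_rails_infos(orphaned),
    }
```

In this sketch, an orphaned workspace that already reports `actual_state` of `terminated` still appears in `orphaned_names` but produces no `rails_info` record, which corresponds to step 3: the termination request stops being sent once the agent confirms termination, even though the workspace remains orphaned.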

Acceptance Criteria

  • Orphaned workspaces (workspaces that the agent knows about but that have no database record, matched by unique workspace name) should be guaranteed to be eventually terminated by the process described above.

Impact Assessment

Ensures that workspaces orphaned by cascade delete rules in the database cannot continue to run and incur infrastructure costs, or leave the UI permanently inconsistent with the state of the running Kubernetes cluster.

Edited by Vishal Tak