Backend: Add logic for using the agent's default/max cpu/memory during reconciliation
MR: Use default and max workspace resources on work... (!139209 - merged)
Description
As a user, I want to be able to specify the default and max cpu/memory to be used for all workspaces provisioned through an agent.
If any container in the generated Kubernetes Deployment does not have cpu/memory requests/limits specified, the default values from the agent's default_resources_per_workspace_container are added to it.
A Kubernetes Resource Quota is also generated using the agent's max_resources_per_workspace.
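As a rough illustration of the data involved, here is a minimal Go sketch that models the two agent settings as Kubernetes-style resource requirements. The variable names and concrete quantities below are purely illustrative assumptions, not values shipped by GitLab.

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// default_resources_per_workspace_container: applied per container when a
// cpu/memory request or limit is missing. (Values are illustrative only.)
var defaultResourcesPerWorkspaceContainer = corev1.ResourceRequirements{
	Requests: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("250m"),
		corev1.ResourceMemory: resource.MustParse("256Mi"),
	},
	Limits: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("500m"),
		corev1.ResourceMemory: resource.MustParse("512Mi"),
	},
}

// max_resources_per_workspace: the ceiling for the whole workspace, enforced
// through the generated Resource Quota. (Values are illustrative only.)
var maxResourcesPerWorkspace = corev1.ResourceRequirements{
	Requests: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("2"),
		corev1.ResourceMemory: resource.MustParse("4Gi"),
	},
	Limits: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("4"),
		corev1.ResourceMemory: resource.MustParse("8Gi"),
	},
}

func main() {}
```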
Acceptance Criteria
- If any container in the generated Kubernetes Deployment does not have cpu/memory requests/limits specified, the default values from the agent's default_resources_per_workspace_container are added to it.
- A Kubernetes Resource Quota is generated using the agent's max_resources_per_workspace. It is only sent during full reconciliation or if force_include_all_resources is true.
- Creating a workspace from a devfile with cpu/memory requests/limits higher than what the agent allows results in the workspace being reported as Starting for 600 seconds, then as Failed after 600 seconds.
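The first two criteria could look roughly like the following Go sketch, written against the Kubernetes API types. This is not the actual GitLab backend code; the function names (applyContainerDefaults, buildResourceQuota) and the quota object name are assumptions made for illustration.

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// applyContainerDefaults fills in any cpu/memory request or limit that is
// missing on a container, using the agent's
// default_resources_per_workspace_container value. Anything already set on
// the container is left untouched.
func applyContainerDefaults(c *corev1.Container, defaults corev1.ResourceRequirements) {
	if c.Resources.Requests == nil {
		c.Resources.Requests = corev1.ResourceList{}
	}
	if c.Resources.Limits == nil {
		c.Resources.Limits = corev1.ResourceList{}
	}
	for name, qty := range defaults.Requests {
		if _, ok := c.Resources.Requests[name]; !ok {
			c.Resources.Requests[name] = qty
		}
	}
	for name, qty := range defaults.Limits {
		if _, ok := c.Resources.Limits[name]; !ok {
			c.Resources.Limits[name] = qty
		}
	}
}

// buildResourceQuota converts the agent's max_resources_per_workspace value
// into a Resource Quota for the workspace namespace. Per the acceptance
// criteria, it would only be sent during full reconciliation or when
// force_include_all_resources is true.
func buildResourceQuota(namespace string, max corev1.ResourceRequirements) *corev1.ResourceQuota {
	return &corev1.ResourceQuota{
		ObjectMeta: metav1.ObjectMeta{Name: "workspace-quota", Namespace: namespace},
		Spec: corev1.ResourceQuotaSpec{
			Hard: corev1.ResourceList{
				corev1.ResourceRequestsCPU:    max.Requests[corev1.ResourceCPU],
				corev1.ResourceRequestsMemory: max.Requests[corev1.ResourceMemory],
				corev1.ResourceLimitsCPU:      max.Limits[corev1.ResourceCPU],
				corev1.ResourceLimitsMemory:   max.Limits[corev1.ResourceMemory],
			},
		},
	}
}

func main() {}
```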
Impact Assessment
If default_resources_per_workspace_container or max_resources_per_workspace is updated (and successfully stored in the DB), the change immediately applies to all existing workspaces and results in their restart.
Reason for not setting the defaults during workspace creation
If we used the defaults (default_resources_per_workspace_container) configured on the agent during workspace creation (by modifying the devfile before storing it in the DB), any later change to default_resources_per_workspace_container would not be reflected in existing workspaces, and they would not be restarted.
However, this behaviour would contrast with the way max_resources_per_workspace is handled. Since the Kubernetes Resource Quota (like the Kubernetes Deployment) is generated during each reconciliation, any update to max_resources_per_workspace results in an updated Kubernetes Resource Quota. By its nature, a Resource Quota only applies to newly created Pods, so in that sense it would not affect existing workspaces. But if the workspace pod is rescheduled by Kubernetes (which can happen at any time for various reasons), the updated Resource Quota comes into effect and the workspace ends up in a Failed state if it violates the max cpu/memory requests/limits.
To make this behaviour more consistent, applying default_resources_per_workspace_container during workspace reconciliation, as opposed to workspace creation, forces a new workspace pod to be created. Because a new pod is created, the updated Resource Quota comes into effect, and the workspace ends up in a Failed state if it violates the max cpu/memory requests/limits. Since this behaviour is deterministic, the agent administrator can schedule the activity of updating default_resources_per_workspace_container and max_resources_per_workspace: all workspaces associated with the agent are then restarted, and any workspace that violates the max cpu/memory requests/limits ends up in a Failed state.
Reason for not using Limit Range to set the defaults during workspace reconciliation
If you specify a container's limit but not its request (in the devfile) and rely on a Kubernetes Limit Range to set the defaults, the container's memory request is set to match its memory limit; the container is not assigned the default memory request value from the Limit Range.
This is standard Kubernetes behaviour and does not make sense in our case. Instead, we will deep merge the container's resources with the agent's default resources such that any key already present in the container's resources takes precedence.
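A small, hypothetical Go example of this precedence (again using Kubernetes API types rather than the actual backend code): the container below specifies only a memory limit, so after the merge it keeps that limit and receives the agent's default memory request, instead of having the request copied from its limit as a Limit Range would do. The quantities and the helper name mergeWithDefaults are illustrative.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// mergeWithDefaults deep merges the agent defaults into the container's
// resources: keys already present on the container win, missing keys are
// taken from the defaults.
func mergeWithDefaults(res, defaults corev1.ResourceRequirements) corev1.ResourceRequirements {
	merged := corev1.ResourceRequirements{
		Requests: corev1.ResourceList{},
		Limits:   corev1.ResourceList{},
	}
	for name, qty := range defaults.Requests {
		merged.Requests[name] = qty
	}
	for name, qty := range res.Requests {
		merged.Requests[name] = qty // container value takes precedence
	}
	for name, qty := range defaults.Limits {
		merged.Limits[name] = qty
	}
	for name, qty := range res.Limits {
		merged.Limits[name] = qty // container value takes precedence
	}
	return merged
}

func main() {
	// Devfile container: memory limit only, no memory request.
	container := corev1.ResourceRequirements{
		Limits: corev1.ResourceList{corev1.ResourceMemory: resource.MustParse("1Gi")},
	}
	// Agent defaults (illustrative values).
	defaults := corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("250m"),
			corev1.ResourceMemory: resource.MustParse("256Mi"),
		},
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("500m"),
			corev1.ResourceMemory: resource.MustParse("512Mi"),
		},
	}

	merged := mergeWithDefaults(container, defaults)

	memRequest := merged.Requests[corev1.ResourceMemory]
	memLimit := merged.Limits[corev1.ResourceMemory]
	// Prints "256Mi 1Gi": the request comes from the agent default and the
	// devfile's limit is kept, whereas a Limit Range would have set the
	// request to 1Gi to match the limit.
	fmt.Println(memRequest.String(), memLimit.String())
}
```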