GitLab.com KAS early-adopters (beta) programme

What is the KAS address for GitLab.com?

wss://kas.gitlab.com

What

Roll out the :kubernetes_agent_on_gitlab_com feature flag.

Owners

  • Team: ~"group::configure"
  • Most appropriate Slack channel to reach out to: #s_configure
  • Best individual to reach out to: @tkuah

Expectations

What are we expecting to happen?

We can deploy the KAS service, and only projects on the allowlist will be able to use this feature.

What might happen if this goes wrong?

  1. Clients who are not on the allowlist will see entries similar to the following in their agentk logs:
{"level":"warn","time":"2021-02-07T20:06:24.920Z","msg":"GetConfiguration.Recv failed","error":"rpc error: code = PermissionDenied desc = forbidden"}
{"level":"error","time":"2021-02-07T20:06:25.387Z","msg":"GetObjectsToSynchronize.Recv failed","project_id":"tkuah/kas-agent-test","error":"rpc error: code = PermissionDenied desc = forbidden"}

  2. The KAS service degrades or fails (https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/kas/README.md)
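
When triaging reports from users, the PermissionDenied entries shown above can be picked out of a batch of agentk log lines with a small filter. The following is a minimal sketch, assuming the logs are one JSON object per line as in the sample; the function name find_permission_denied and the third sample line are hypothetical and not part of agentk.

```python
import json


def find_permission_denied(lines):
    """Return parsed log entries whose "error" field mentions PermissionDenied."""
    hits = []
    for raw in lines:
        # Drop surrounding whitespace and any box-drawing borders a terminal UI adds.
        text = raw.strip().strip("│").strip()
        if not text:
            continue
        try:
            entry = json.loads(text)
        except json.JSONDecodeError:
            continue  # not a JSON log line, skip it
        if not isinstance(entry, dict):
            continue
        if "PermissionDenied" in str(entry.get("error", "")):
            hits.append(entry)
    return hits


# The first two lines are the samples from this issue; the third is a
# hypothetical unrelated line added to show that it is filtered out.
sample = [
    '{"level":"warn","time":"2021-02-07T20:06:24.920Z","msg":"GetConfiguration.Recv failed","error":"rpc error: code = PermissionDenied desc = forbidden"}',
    '{"level":"error","time":"2021-02-07T20:06:25.387Z","msg":"GetObjectsToSynchronize.Recv failed","project_id":"tkuah/kas-agent-test","error":"rpc error: code = PermissionDenied desc = forbidden"}',
    '{"level":"info","msg":"hypothetical unrelated line"}',
]

for entry in find_permission_denied(sample):
    print(entry["level"], entry["msg"])
```

A match on its own only indicates that the project is not on the allowlist (or the flag is off for it); it does not distinguish that case from other authorization failures.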

What can we monitor to detect problems with this?

https://dashboards.gitlab.net/d/kubernetes-pods/kubernetes-pods?orgId=1&var-datasource=Global&var-cluster=gprd-gitlab-gke&var-namespace=gitlab&var-pod=gitlab-kas-5d5f69c799-spjnb&var-container=All

https://dashboards.gitlab.net/d/kas-main/kas-overview?orgId=1

Beta groups/projects

If applicable, any groups/projects that are happy to have this feature turned on early. Some organizations may wish to test big changes they are interested in with a small subset of users ahead of time for example.

NOTE: Only public, and GitLab Premium projects (~180K) are eligible

  1. GitLab internal testing projects
  2. Beta list of users

Success criteria to roll out to GA

  • Rate limiting for KAS agents is active
  • KAS / GitLab.com handles progressively increased load well.

Roll Out Steps

  • Enable on staging for a selected project (/chatops run feature set --project=tkuah/kas-agent-test kubernetes_agent_on_gitlab_com true --staging)
  • Enable on staging (/chatops run feature set kubernetes_agent_on_gitlab_com true --staging)
  • Test on staging
  • Ensure that documentation has been updated
  • Enable on GitLab.com for individual groups/projects listed above and verify behaviour (/chatops run feature set --project=gitlab-org/gitlab kubernetes_agent_on_gitlab_com true)
  • Progressive rollout 1%, up to ~1,800 GitLab Premium projects (/chatops run feature set kubernetes_agent_on_gitlab_com 1 --actors)
  • Progressive rollout 10%, up to ~18,000 GitLab Premium projects (/chatops run feature set kubernetes_agent_on_gitlab_com 10 --actors)
  • Progressive rollout 20%
  • Progressive rollout 50%
  • Progressive rollout 100%
  • Coordinate a time to enable the flag with the SRE oncall and release managers
    • In #production mention @sre-oncall and @release-managers. Once an SRE on call and Release Manager on call confirm, you can proceed with the rollout
  • Announce on the issue an estimated time this will be enabled on GitLab.com
  • Enable on GitLab.com by running the chatops command in #production (/chatops run feature set kubernetes_agent_on_gitlab_com true)
  • Cross-post the chatops Slack command to #support_gitlab-com (the dev docs have more guidance on when this is necessary) and to your team channel
  • Announce on the issue that the flag has been enabled
  • Remove feature flag and add changelog entry
  • After the flag removal is deployed, clean up the feature flag by running the chatops command in the #production channel
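
The progressive rollout stages in the steps above can be tabulated as follows. This is a sketch under stated assumptions: the ~180K eligible-project figure comes from the note in this issue, the 1% and 10% commands are the ones listed, and the commands for 20%, 50% and 100% are extrapolated from the same pattern rather than taken from the issue. The name rollout_plan is hypothetical.

```python
ELIGIBLE_PROJECTS = 180_000  # approximate figure from the eligibility note (~180K)
FLAG = "kubernetes_agent_on_gitlab_com"
STAGES = [1, 10, 20, 50, 100]  # rollout percentages listed in the steps


def rollout_plan(total=ELIGIBLE_PROJECTS, stages=STAGES):
    """Return (percent, upper bound on affected projects, chatops command) per stage."""
    plan = []
    for pct in stages:
        upper_bound = total * pct // 100
        # Same command shape as the 1% and 10% steps; other stages are assumed.
        command = f"/chatops run feature set {FLAG} {pct} --actors"
        plan.append((pct, upper_bound, command))
    return plan


for pct, upper_bound, command in rollout_plan():
    print(f"{pct:>3}%  up to ~{upper_bound:,} projects  {command}")
```

The upper bounds are ceilings, not forecasts: they count eligible projects that could be selected at each percentage, not projects that actually run an agent.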

Rollback Steps

  • This feature can be disabled by running the following Chatops command:
/chatops run feature set --project=gitlab-org/gitlab kubernetes_agent_on_gitlab_com false
Edited by Viktor Nagy (GitLab)