Enable Auto DevOps by default for self-managed instances of GitLab

Problem to solve

Once Auto DevOps is GA and proven, it should be enabled by default for all on-prem (self-hosted) installations.

Further details

Enabling Auto DevOps by default means GitLab runners will perform additional work for any number of projects for which Auto DevOps may or may not work. To mitigate this inefficiency, we want to at least automatically disable Auto DevOps for a project if its first pipeline fails. The following are also considered important to address but not critical:

Proposal

Set Auto DevOps to "enabled" instance-wide as the default Auto DevOps setting.

Current plan for enabling Auto DevOps by default:

Aim to merge the following before first RC:
- Auto disable ADO upon first pipeline failure
- feature flag code
If staggered rollout via feature flag has no major issues and makes it to 100%, enable ADO by default for self-managed in 11.3 on 2018-09-22
Display Banner To Notify Users If The Project Is Implicitly Opted In To Auto DevOps: https://gitlab.com/gitlab-org/gitlab-ce/issues/50535
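
The staggered rollout in the plan above could be sketched roughly as follows. This is a minimal illustration of percentage-based gating, not GitLab's feature flag implementation: the flag name `auto_devops_default`, the function names, and the choice of SHA-256 bucketing are all assumptions made for the example.

```python
import hashlib


def rollout_bucket(project_id: int, flag_name: str = "auto_devops_default") -> int:
    """Map a project to a stable bucket in [0, 100).

    Hypothetical helper: hashing the flag name together with the project ID
    keeps a project in the same bucket on every evaluation.
    """
    digest = hashlib.sha256(f"{flag_name}:{project_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def implicitly_enabled(project_id: int, rollout_percentage: int) -> bool:
    """Return True if the project falls inside the current rollout percentage."""
    return rollout_bucket(project_id) < rollout_percentage


if __name__ == "__main__":
    # Count how many of 1000 hypothetical projects are opted in at each stage.
    for percentage in (1, 10, 50, 100):
        enabled = sum(implicitly_enabled(pid, percentage) for pid in range(1000))
        print(f"{percentage}% rollout: {enabled} of 1000 projects enabled")
```

Because the bucket is stable, a project that is opted in at a low percentage stays opted in as the percentage is raised, so no project flips back and forth during the rollout.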

Considerations:

Internal communication with support
External communication prior to release (Twitter, blog)

What does success look like, and how can we measure that?

Auto DevOps jobs are triggered automatically after the instance is upgraded.

Risks

There are several discussions in various threads about risks that will need to be accepted before we enable this for on-premise customers. They have been separated into known risks (mostly UX issues) and hypothetical risks, which are more complicated and relate to security, reliability, and cost. The risks I've outlined are exactly that: risks. We don't know to what extent our customers may suffer from any of these problems, but they are things we think customers may experience and be bothered by. The list is intended to be an objective account of what may frustrate customers when we enable Auto DevOps for them by default, not an exhaustive list of things Auto DevOps could do better. That judgment can occasionally be subjective, so the list may need revision.

Known Risks

Customers that don't have runners configured at all will have stuck jobs created: https://gitlab.com/gitlab-org/gitlab-ce/issues/49081
Even though we disable Auto DevOps after the first failure, our implementation does not handle the case where many pipelines are created by one push (e.g. pushing lots of branches and tags). The UX is not ideal, as many pipelines will be created that may all fail.
Auto DevOps runs stages that are not relevant for some projects, which uses excessive resources (runner time, object storage)
- https://gitlab.com/gitlab-org/gitlab-ce/issues/49223
- https://gitlab.com/gitlab-org/gitlab-ce/issues/48399
Current users of Auto DevOps believe it to be quite slow because lots of Docker images are downloaded and uploaded, which can waste a lot of runner time (the pipeline can take around 30 minutes on a fast internet connection): https://gitlab.com/gitlab-org/gitlab-ce/issues/49562
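
The multi-pipeline case in the list above can be illustrated with a small model. It is a sketch under stated assumptions, not GitLab's code: the `Project` class, `push`, and `on_pipeline_failed` are hypothetical names, and the model assumes that only implicitly opted-in projects are auto-disabled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed instance-wide default once the proposal is in effect.
INSTANCE_DEFAULT_ENABLED = True


@dataclass
class Project:
    # None means the project made no explicit choice and follows the instance default.
    auto_devops_enabled: Optional[bool] = None
    pipelines: List[str] = field(default_factory=list)


def auto_devops_active(project: Project) -> bool:
    """Resolve the effective setting: explicit choice first, instance default otherwise."""
    if project.auto_devops_enabled is None:
        return INSTANCE_DEFAULT_ENABLED
    return project.auto_devops_enabled


def push(project: Project, refs: List[str]) -> List[str]:
    """Create one pipeline per pushed ref while Auto DevOps is active."""
    created = []
    for ref in refs:
        if auto_devops_active(project):
            project.pipelines.append(ref)
            created.append(ref)
    return created


def on_pipeline_failed(project: Project) -> None:
    """Disable Auto DevOps after a failure, but only for implicit opt-ins (assumption)."""
    if project.auto_devops_enabled is None:
        project.auto_devops_enabled = False


if __name__ == "__main__":
    project = Project()
    # All pipelines for one push are created before any of them can fail.
    created = push(project, ["master", "feature-a", "feature-b", "v1.0", "v1.1"])
    print(len(created), "pipelines created by a single push")
    on_pipeline_failed(project)
    print(len(push(project, ["feature-c"])), "pipelines created after the first failure")
```

In this model the disable step only helps later pushes: every ref in the first push already has a pipeline by the time the first failure is reported.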

Hypothetical Risks

Customers making use of external CI (e.g. Jenkins) may experience strange CI results (e.g. a Merge Request marked as failed that is actually passing on Jenkins). We have not tested how Auto DevOps interacts with external CI like Jenkins.
Customers may have configured their own CI runners, which will now run the Auto DevOps pipeline, possibly unexpectedly, once we enable this setting for them. Running certain commands on their servers (runners) may cause very strange things to happen (e.g. running rspec on a server that has DATABASE_URL set can do very dangerous things, such as truncating a production DB, if the command was never meant to run on that host). This risk is higher with shell runners, as they inherit the entire environment and have wider access to the server filesystem.
Customers may experience some of the same scale problems we've predicted on GitLab.com. Rephrased for our customers, these are:
- Customers with very large numbers of repositories and high numbers of pushes may experience significant delays to their CI/CD which will potentially affect developer productivity or delay production deployments as their runners need to catch up with a very long queue of jobs
- Customers with very large numbers of repositories and high numbers of pushes may end up with a very large object storage bill as we store the docker images created in the build stage of Auto DevOps
- Depending on where the runner and the object storage are hosted, customers with very large numbers of repositories and high numbers of pushes may end up with a very large ingress/egress bill for Docker images being pushed and pulled during the build, container_scanning, and deploy phases of the Auto DevOps pipeline
There is a slim chance that somebody has an existing project configured with a Kubernetes cluster, but not using GitLab CI, that will now be deployed to a cluster with an internet-facing URL even though it was never intended to be public. This is now incredibly unlikely, given that we no longer plan to automatically set a domain name for them (see https://gitlab.com/gitlab-org/gitlab-ce/issues/45560#note_101947623). We did, however, have at least one customer complain about this on GitLab.com, as they were concerned that their application may have been made public online.
Customers may have Docker runners set up that do not support Docker in Docker, so their builds will fail in a way that is not helpful to them and will cause some frustration
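
The scale concerns above (job queues and image storage) lend themselves to a rough back-of-envelope estimate. The sketch below is only that: the function names are hypothetical, it treats each pipeline as occupying one runner for its full duration, and every input except the roughly 30-minute pipeline time mentioned earlier is a made-up placeholder that a customer would replace with their own figures.

```python
def backlog_runner_minutes(
    pushes_per_hour: float,
    pipeline_minutes: float,
    concurrent_runners: int,
    hours: float,
) -> float:
    """Estimate unserved runner-minutes after `hours` of steady pushing.

    Simplification: one pipeline occupies one runner for `pipeline_minutes`.
    """
    demand = pushes_per_hour * pipeline_minutes * hours  # runner-minutes requested
    capacity = concurrent_runners * 60 * hours           # runner-minutes available
    return max(0.0, demand - capacity)


def registry_growth_gb(pushes: int, image_size_gb: float) -> float:
    """Estimate stored image volume if every push builds and stores one image."""
    return pushes * image_size_gb


if __name__ == "__main__":
    # Placeholder inputs: 10 pushes/hour, 30-minute pipelines, 4 runners, 8-hour day.
    backlog = backlog_runner_minutes(10, 30, 4, 8)
    print(f"Backlog after one working day: {backlog:.0f} runner-minutes")
    # Placeholder inputs: 80 pushes, 0.5 GB per image.
    print(f"Image storage added: {registry_growth_gb(80, 0.5):.1f} GB")
```

Whenever pushes per hour times pipeline minutes exceeds 60 times the number of concurrent runners, the backlog grows without bound, which is the queueing delay described in the first bullet.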

Risks should be accepted before we merge #21157 (closed) (not needed until we've done some 1% testing on GitLab.com):

Edited Feb 08, 2024 by 🤖 GitLab Bot 🤖