What we can learn from Spinnaker

Background

Spinnaker is an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence. Created at Netflix, it has been battle-tested in production by hundreds of teams over millions of deployments. It combines a powerful and flexible pipeline management system with integrations to the major cloud providers.

Research

The goal of this issue is to conduct research and gain a better understanding of possible integrations and/or missing features.

We should also confirm whether Spinnaker has more default integrations and whether it is easier to customize deeply (for example, adding custom links in the UI).

We should also contact a couple of users who are using Spinnaker with GitLab instead of our CD solution. I think @bjung can get us in touch.

Outputs

The output of this research should feed into updates to the CD vision page as well as https://about.gitlab.com/devops-tools/spinnaker-vs-gitlab.html (https://gitlab.com/gitlab-com/marketing/product-marketing/issues/1085 is the associated marketing issue).

Screenshots

- Homepage
- Pipelines
- Deploy to staging: each stage is colored green once complete
- A manual judgement stage determines whether to continue with the deployment to production
- Automated rollback when the validation tests fail, plus a hot standby period after successful validation in case something unexpected happens
- The Kubernetes resources Spinnaker is managing can be viewed, with one-click access to the deployed app
- Pipeline execution windows: you can configure where and when stages are allowed to run; the window can be skipped to enforce deployment to production
- Automatic rollback: an error was discovered, which triggers the rollback pipeline
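The execution-window behavior in the screenshots can be sketched roughly as a time-of-day gate. This is a deliberately simplified model, not Spinnaker's actual implementation (real execution windows also support day-of-week restrictions and multiple ranges per stage):

```python
from datetime import datetime, time

def in_execution_window(now: datetime, start: time, end: time) -> bool:
    """Return True if `now` falls inside the allowed execution window.

    Simplified illustration of a per-stage execution window: the stage
    may only run when the current time of day is within [start, end].
    """
    return start <= now.time() <= end

# A deploy attempted at 03:00 is held until the 09:00-17:00 window opens:
ok = in_execution_window(datetime(2019, 7, 1, 3, 0), time(9, 0), time(17, 0))
```

With this model, skipping the window (as shown in the screenshots) is just bypassing the check to force the deployment through.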

Canary Analysis

- The baseline is saved (version, number of pods, etc.); then a copy of production and the canary are rolled out at the same time at the same scale, so the metrics can be verified on equal terms
- Adding a metric (integration with Datadog)
- You can configure real-time statistics (the common use case) or a retrospective analysis over time. Some canaries run for days (long-running) to check that there are no memory leaks. You can set a wait time before metrics are collected, to allow the setup to complete, and you can change the baseline version being compared against.
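The equal-scale comparison described above boils down to collecting the same metric from both the baseline and the canary and measuring how far they diverge. The following is a toy stand-in for that per-metric comparison; the function name and the plain relative-difference approach are illustrative assumptions, not Spinnaker's actual statistical algorithm:

```python
from statistics import mean

def metric_deviation(baseline: list[float], canary: list[float]) -> float:
    """Relative difference between the mean canary value and the mean
    baseline value for a single metric.

    Toy illustration of why baseline and canary must run at the same
    scale: the two series are only comparable on equal terms.
    """
    b = mean(baseline)
    return (mean(canary) - b) / b

# Canary latency running 10% above the baseline:
dev = metric_deviation([100, 102, 98], [110, 112, 108])
```

A real analysis would run a statistical test per metric and aggregate the results into a single score, which is what the thresholds below act on.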

If a threshold test fails, the canary stops immediately. If the score is in between, the canary continues but is not promoted to production. If it is green, it is promoted to production.
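That three-way decision can be sketched as a simple classification of the aggregate canary score. The 40/75 thresholds here are hypothetical illustration values, not Spinnaker defaults:

```python
def judge_canary(score: float, marginal: float = 40.0, passing: float = 75.0) -> str:
    """Map an aggregate canary score (0-100) to an outcome, mirroring
    the behaviour described above. Threshold values are hypothetical.
    """
    if score < marginal:
        return "fail"      # stop the canary immediately
    if score < passing:
        return "marginal"  # keep running, but do not promote to production
    return "pass"          # promote to production

# The 66% score mentioned below would land in the marginal band:
outcome = judge_canary(66.0)
```

With these assumed thresholds, a 66% score is neither a hard failure nor a pass, which is exactly the grey case that requires manual review.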

In the example the score was grey at 66% and needed manual review.

Failed example

Links

Edited by Orit Golowinski