Determine source of truth for deployment metadata
Use this issue to come up with a plan of action to enable auto-deploys in a succinct manner for both Kubernetes and Omnibus installs. I don't believe the Intermediate and Permanent solutions are good enough to meet our needs, as discussed further below.
Currently
We rely on a single tag to carry the metadata that determines what is deployed across our infrastructure.
sequenceDiagram
participant rt as release-tools
participant om as omnibus-gitlab
participant d as deployer
rt ->> rt: get gitlab component versions
rt ->> om: tag
om ->> om: build omnibus package
om ->> d: trigger deploy
d ->> d: deploy
The same tag used to build our omnibus packages is carried into deployer to deploy to our omnibus infrastructure. Changes made to our Kubernetes infrastructure are handled manually.
As a quick reminder: 12.9.202002191723-d42c6afcade.d99b3393e62
== <MAJOR>.<MINOR>.<TIMESTAMP>-<GitLab sha>.<omnibus sha>
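As a rough illustration of how tooling could read this format, here is a minimal Python sketch that splits such a tag into its parts. The names `parse_tag` and `AutoDeployTag` are hypothetical and not part of release-tools; the 12-digit timestamp width is an assumption taken from the example above.

```python
import re
from typing import NamedTuple

# <MAJOR>.<MINOR>.<TIMESTAMP>-<GitLab sha>.<omnibus sha>
# Assumption: the timestamp is always 12 digits, as in the example tag.
TAG_PATTERN = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<timestamp>\d{12})"
    r"-(?P<gitlab_sha>[0-9a-f]+)\.(?P<omnibus_sha>[0-9a-f]+)$"
)


class AutoDeployTag(NamedTuple):
    """Hypothetical container for the fields carried by one tag."""
    major: int
    minor: int
    timestamp: str
    gitlab_sha: str
    omnibus_sha: str


def parse_tag(tag: str) -> AutoDeployTag:
    """Split a tag into its fields, or raise ValueError if it does not match."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not an auto-deploy tag: {tag!r}")
    return AutoDeployTag(
        major=int(match["major"]),
        minor=int(match["minor"]),
        timestamp=match["timestamp"],
        gitlab_sha=match["gitlab_sha"],
        omnibus_sha=match["omnibus_sha"],
    )
```

For example, `parse_tag("12.9.202002191723-d42c6afcade.d99b3393e62").gitlab_sha` yields `d42c6afcade`.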
Coming Soon
We are adding CNG and helm to the list of components, so that release-tools will initiate builds of all GitLab components.
- Tagging CNG will enable us to build our Docker images
- Tagging our helm chart will enable us to build and utilize the chart at that ref
When the following is completed:
- #577 (closed) - tag CNG
- #623 (closed) - tag Helm Chart
We'll have 3 total tags, and we need to determine how to store/query/use this information. The only connection these tags will have with each other is the sha for the GitLab code base. The last chunk of data on the tag will be specific to the repo it operates against.
Soon we'll have something similar to this:
- 12.9.202002191723-d42c6afcade.d99b3393e62 == <MAJOR>.<MINOR>.<TIMESTAMP>-<GitLab sha>.<omnibus sha> <- GitLab and Omnibus share this one
- 12.9.202002191723-d42c6afcade.abcdef12345 == <MAJOR>.<MINOR>.<TIMESTAMP>-<GitLab sha>.<CNG sha>
- 3.0.202002191723-d42c6afcade.abcdef12345 == <MAJOR>.<MINOR>.<TIMESTAMP>-<GitLab sha>.<helm sha>
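Since the GitLab sha is the only link between the three tags, one way to relate them is to group tags by that sha. The following is a minimal Python sketch under that assumption; `gitlab_sha` and `group_by_gitlab_sha` are hypothetical helpers, and the repo names used as keys are only labels for illustration.

```python
from collections import defaultdict
from typing import Dict


def gitlab_sha(tag: str) -> str:
    """Return the GitLab sha: the text between the '-' and the following '.'."""
    _, _, shas = tag.partition("-")
    sha, _, _ = shas.partition(".")
    return sha


def group_by_gitlab_sha(tags_by_repo: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Map each GitLab sha to the {repo: tag} pairs that were built from it."""
    groups: Dict[str, Dict[str, str]] = defaultdict(dict)
    for repo, tag in tags_by_repo.items():
        groups[gitlab_sha(tag)][repo] = tag
    return dict(groups)


tags = {
    "omnibus-gitlab": "12.9.202002191723-d42c6afcade.d99b3393e62",
    "CNG": "12.9.202002191723-d42c6afcade.abcdef12345",
    "gitlab-chart": "3.0.202002191723-d42c6afcade.abcdef12345",
}
groups = group_by_gitlab_sha(tags)
```

With the example tags above, all three repos land in a single group keyed by `d42c6afcade`. Note that the tag alone still does not say which repo it belongs to; the sketch has to be told.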
Intermediate Solution
We don't worry about these tags at all, and solely rely on nightly builds of our helm chart: gitlab-org/charts/gitlab#1905
This does not fully enable auto-deploy. It will only provide a usable chart, and that chart would not be in alignment with the rest of our infrastructure. Doing this will help unblock certain work, but it is not good enough for production, as we cannot deploy to both infrastructures in a synchronized and controlled manner. Component versions will always come from master, which means there's the potential that we deploy code which has not passed tests. This can be alleviated by pinning specific versions in our values.yaml like we do today, but then this solution is a non-starter: the only benefit gained at that point would be grabbing the latest version of our chart, and not of the components running within it.
Permanent Future Proof Solution
Central Version Management: &113 (closed)
However, this may take a while.
Proposals
Option 1. Release Tools becomes Deployer Trigger
With this, we remove the existing omnibus deployment trigger. Release tools is modified with wait jobs, waiting on signals from other repos that builds have been completed. CNG, omnibus, and our helm chart repos are modified to send something to release-tools letting it know that builds are done. Afterwards, release-tools kicks off a deployment with all known tags such that it will properly deploy all components in a known manner.
sequenceDiagram
participant rt as release-tools
participant om as omnibus-gitlab
participant cng as CNG
participant h as gitlab-chart
participant d as deployer
rt ->> rt: get gitlab component versions
rt ->> om: tag
rt ->> cng: tag
rt ->> h: tag
om ->> om: build omnibus package
cng ->> cng: build Docker images
h ->> h: build helm chart
om ->> rt: build complete notification
cng ->> rt: build complete notification
h ->> rt: build complete notification
rt ->> d: trigger deploy
From here we'll modify deployer to work with k8s-workloads/gitlab-com to perform Kubernetes upgrades.
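The "wait for all builds, then trigger once" behaviour in Option 1 could look roughly like the Python sketch below. This is only an illustration of the flow in the diagram, not how release-tools is implemented: `DeployCoordinator`, `build_complete`, and the `trigger_deploy` callback are hypothetical names, and how the notifications actually arrive (pipeline triggers, API calls, etc.) is left open.

```python
from typing import Callable, Dict

# Assumption: these are the three repos that must report a finished build.
REQUIRED_COMPONENTS = frozenset({"omnibus-gitlab", "CNG", "gitlab-chart"})


class DeployCoordinator:
    """Collects build-complete notifications and triggers one deploy."""

    def __init__(self, trigger_deploy: Callable[[Dict[str, str]], None]) -> None:
        self._trigger_deploy = trigger_deploy
        self._completed: Dict[str, str] = {}
        self._triggered = False

    def build_complete(self, component: str, tag: str) -> bool:
        """Record a notification; return True if it triggered the deploy."""
        if component not in REQUIRED_COMPONENTS:
            raise ValueError(f"unknown component: {component!r}")
        self._completed[component] = tag
        all_done = REQUIRED_COMPONENTS.issubset(self._completed)
        if all_done and not self._triggered:
            self._triggered = True
            # Hand deployer every known tag so it deploys a consistent set.
            self._trigger_deploy(dict(self._completed))
            return True
        return False
```

The point of the design is that deployer receives all three tags in a single trigger, so the omnibus and Kubernetes deploys can be driven from the same set of versions.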
Option 2. Fire and forget
I don't like this option, but it would be the quickest to implement. Similar to how omnibus works today, once the helm chart has completed building, it sends a trigger to k8s-workloads that will perform a deploy.
sequenceDiagram
participant rt as release-tools
participant om as omnibus-gitlab
participant d as deployer
participant cng as CNG
participant h as gitlab-chart
participant k as k8s-workloads/gitlab-com
rt ->> rt: get gitlab component versions
rt ->> om: tag
rt ->> cng: tag
rt ->> h: tag
om ->> om: build omnibus package
cng ->> cng: build Docker images
h ->> h: build chart
om ->> d: trigger deploy
h ->> k: trigger deploy
This has the potential to leave differing Rails versions running on the two infrastructures at any given point in time. Database schema changes would probably be a good reason to avoid this route.
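To make the risk concrete, here is a minimal Python sketch of a skew check that compares the GitLab sha embedded in the tag last deployed to each infrastructure. `has_version_skew` is a hypothetical helper, and it assumes both tags follow the `<MAJOR>.<MINOR>.<TIMESTAMP>-<GitLab sha>.<repo sha>` format described earlier.

```python
def gitlab_sha(tag: str) -> str:
    """Return the GitLab sha: the text between the '-' and the following '.'."""
    _, _, shas = tag.partition("-")
    sha, _, _ = shas.partition(".")
    return sha


def has_version_skew(omnibus_tag: str, chart_tag: str) -> bool:
    """True when the two infrastructures run different GitLab code."""
    return gitlab_sha(omnibus_tag) != gitlab_sha(chart_tag)
```

With fire and forget, nothing guarantees this check returns False at any given moment, because each deploy is triggered independently.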
Option 3: Abandon this and prioritize an Operator
A PoC is covered in this issue: #674 (closed)
We could take that PoC to the finish line. This will require further refinement and discovery.
Option 4: Prioritize "Permanent Future Proof Solution"
While it may take a while to enable, this solution includes an intermediate step which would lead us down a good path to the future.
Option n: ...
Re-using the existing tag was previously discussed in #577 (comment 265956565). However, it's been determined that this would be highly limiting: when a specific repo has no changes, that would prevent a retag of the same ref.
This situation makes all of our tooling more complicated.
- Tracking deploys will need some sort of refinement, otherwise it will continue to only represent what has been deployed to omnibus, and not accurately reflect what has been deployed to Kubernetes
- Initiating deploys/rollbacks from chatops will be missing some information, and therefore we won't be able to deploy to Kubernetes
- Basic confusion over what tag belongs to what repository, since the actual tag contains only shas
- I'm sure there's plenty more that I've not thought about while writing this up
Goals:
- Avoid adding complexity with whatever solution we choose
- From the information provided, it should be easy to decipher WHAT objects are built/deployed/etc