GitLab CI/CD pipelines are a core value offering for the majority of our customers. At the same time, the GitOps workflow of the agent for Kubernetes did not integrate with pipelines. This release makes it possible to wait for a successful pipeline run before a pull-based synchronization of the manifests.
Problem to solve
As a Platform engineer, I want pull-based deployments to happen only after the pipeline runs green.
Currently the GitOps sync occurs immediately after kas detects a new commit on the master branch.
This would be the default behaviour for GitOps. No new metrics are needed.
It feels to me that the GitOps sync is already an automation that we're offering in place of something that could have been done by a .gitlab-ci.yml, like a kubectl apply command for instance. Also, because it is not directly associated with a pipeline, the user can use this feature without spending any CI minutes. So that's something to consider if we really want to tie the GitOps sync feature to pipelines.
That being said, if the GitOps sync fails, we should signal GitLab somehow, and a CI pipeline does seem to be the most natural way. But would it make sense to have other ways? Some kind of badge on the project's main page?
But I guess this would only be able to report a failed synchronization, not a specific deployment check, such as whether a pod is properly accessible via an Ingress.
If we decide to tie it to the pipeline, I think there could be some kind of API that requests KAS to start the sync. This way the user could configure a script to trigger the sync themselves, which would also reduce polling pressure on Gitaly. They could then script their own tests to check whether the deploy was successful and decide whether to fail their pipeline with exit 1.
The when: trigger is just to demonstrate the point. I think we will want more flexibility than when: trigger and need to be mindful of similarities / overlap with &4516 (closed).
@nagyv-gitlab @wleidheiser, it feels like this issue didn't get much attention from GitLab users in the last year.
Do you know if we've collected enough customer feedback in our interviews to get an idea of the preferred approach to solve this problem, and whether this is a problem to be solved at all? Or should we keep this issue open a little longer and perhaps even start more focused research with them?
Do you know if we've collected enough customer feedback in our interviews to get an idea of the preferred approach to solve this problem, and whether this is a problem to be solved at all?
The research I've done hasn't focused on this particular topic. After reading through the issue, I don't have a full understanding of what the problem is from a user's perspective. Is the problem that users expect the Kubernetes agent to be integrated into the pipeline? Do users want more flexibility in how they can use our feature set? I think I need some help understanding the underlying problem.
Currently our GitOps offering works independently from our CI/CD pipelines. In other words, the application is deployed without executing any CI/CD jobs. The code simply gets synced and updated in Kubernetes once it is merged to the default branch.
The proposal of this issue is to evaluate whether we should implement a solution that ties the synchronization of the GitOps manifest files, or Helm charts, to a CI/CD pipeline that has run successfully to completion. This would allow automated tests, automated security checks, and so on to be executed before a deploy. If the tests or security checks fail, GitLab wouldn't sync the GitOps manifest files, so the deploy would be blocked automatically and intentionally.
There's also the question of what would trigger the deployment, and how. It could be a specific .gitlab-ci.yml key in a CI/CD job. It could be triggered by adding git tags, and there are probably other alternatives.
@Alexand - I appreciate you clarifying the topic and questions for me. That really helps. I asked some of the other product designers who work within Ops and they haven't gotten any feedback on this particular topic. @nagyv-gitlab - Have you heard users discuss this topic in your continuous interviews?
It seems that problem validation research may be needed to understand how much of a problem it is to have the GitOps offering independent from our CI/CD pipelines. If it is a problem we should address, then it would make sense to evaluate potential solutions.
@Alexand @wleidheiser A specific use case is described below by a user. After conducting minimal research on the topic back in 2020, I did not focus on it in my interviews. My research in 2020 included GitLab SREs and a few people in a CNCF call.
Am I understanding correctly that the question is not the validity of the use case but the approach to take to solve it? Is expanding the CI syntax needed, or is waiting for a pipeline on the configured git ref to run green enough?
I agree with @ash2k here. There are many possible user flows, and we don't know the preferred one yet.
The simplest setup is that the user runs their CI, and the CI updates the manifests. If the CI failed, the manifest did not get updated. Problem solved.
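One variant of this setup can be sketched as a final CI job that promotes the tested commit to a dedicated branch, assuming the agent is configured to watch that branch instead of the default one. The `deploy` branch name, the `DEPLOY_TOKEN` variable (assumed to hold a token with `write_repository` scope), the test script path, and the job names are all hypothetical and not taken from this thread.

```yaml
stages:
  - test
  - promote

run-tests:
  stage: test
  script:
    # Placeholder: replace with the project's real tests and checks.
    - ./scripts/run-tests.sh

promote-manifests:
  stage: promote
  image: alpine:3.19
  variables:
    # Fetch full history so the push is not made from a shallow clone.
    GIT_DEPTH: "0"
  rules:
    # Only promote commits that landed on the default branch.
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  before_script:
    - apk add --no-cache git
  script:
    # DEPLOY_TOKEN and the "deploy" branch are assumptions for this sketch.
    - git push "https://oauth2:${DEPLOY_TOKEN}@${CI_SERVER_HOST}/${CI_PROJECT_PATH}.git" "HEAD:refs/heads/deploy"
```

Because `promote` is a later stage than `test`, the push only happens when the tests passed, so the watched branch never receives manifests from a failed pipeline.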
For Hordur's idea, I think we would need to add support for &4516 (closed).
Re: what Joao said:
That being said, if the GitOps sync fails, we should signal GitLab somehow, and a CI pipeline does seem to be the most natural way. But would it make sense to have other ways? Some kind of badge on the project's main page?
But I guess this would only be able to report a failed synchronization, not a specific deployment check, such as whether a pod is properly accessible via an Ingress.
I agree that we need to provide flexibility to our users. Some will prefer a liberal pull-based GitOps model, others won't. The more flexibility we can provide, the better. There are likely many possible user flows, and we shouldn't be dogmatic or prescriptive.
I think an option to run the pull-based sync only after a CI run is a good config option. Many will likely use it.
@brett_jacobson @olastor @Z01d-b3rg May I invite you to share your insights and details about your process while you this issue? We want to learn more about you and your related problems to ensure we can provide the best possible solution.
If you are open to a call with me, I would love to learn more about your Kubernetes-related processes and use cases. You can reach me via e-mail from my GitLab profile to schedule a call.
Now that GitOps is done by Flux, with support for OCI artifact deployments (instead of only git repos), is this issue fixed automatically?
I mean: if you need to deploy only after the pipeline is done, then use OCI mode and publish that artifact only after a successful pipeline. Problem solved. Actually, that's the current recommendation.
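As a rough sketch of that approach, a `.gitlab-ci.yml` along these lines publishes the manifests as an OCI artifact only once the earlier stage has passed. The job names, the `./manifests` path, the `manifests:latest` artifact name, the test script, and the pinned Flux CLI image tag are assumptions for illustration; the `CI_*` variables are GitLab's predefined ones.

```yaml
stages:
  - test
  - publish

run-tests:
  stage: test
  script:
    # Placeholder: replace with the project's real tests and security checks.
    - ./scripts/run-tests.sh

publish-manifests:
  stage: publish
  image:
    # Assumed image tag; pin whichever Flux CLI version matches the cluster.
    name: ghcr.io/fluxcd/flux-cli:v2.4.0
    entrypoint: [""]
  rules:
    # Only publish for commits on the default branch.
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    # Package ./manifests and push it to the project's container registry.
    - |
      flux push artifact "oci://$CI_REGISTRY_IMAGE/manifests:latest" \
        --path="./manifests" \
        --source="$CI_PROJECT_URL" \
        --revision="$CI_COMMIT_REF_NAME@sha1:$CI_COMMIT_SHA" \
        --creds="$CI_REGISTRY_USER:$CI_REGISTRY_PASSWORD"
```

Since `publish` runs after `test`, GitLab only starts `publish-manifests` when the jobs in `test` succeeded, so a Flux `OCIRepository` pointed at that tag would only ever reconcile manifests from a green pipeline.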
Still, we have plans to improve this setup so that you won't need to consume runner minutes while the reconciliation is running. This direction is tracked in External CI jobs MVC (&10866)