With the implementation of our directed acyclic graph (DAG) in GitLab CI, we introduced a syntax that makes a job run as soon as a different job finishes, without waiting for its own stage. However, if the DAG syntax is combined with the rules syntax, which determines whether a job gets added to a pipeline, a job can end up waiting for a job that does not exist in the pipeline (because it was never added), and the pipeline fails only at run time. In this release we are adding the optional keyword to any DAG job, which processes the pipeline with the condition that if the optional job exists, the job waits for it; otherwise it runs as part of its chosen stage. This lets you safely combine the very popular rules syntax with the increasingly popular DAG concept.
Problem to solve
We are implementing an MVC for a directed acyclic graph (DAG) execution model in GitLab CI via https://gitlab.com/gitlab-org/gitlab-ce/issues/47063#note_198535521, which introduces the following syntax for marking a job as running after another job, without waiting for its own stage:
```yaml
deploy:
  stage: deploy
  needs: [test]
  ...
```
A "smart default" we chose was for anything indicated as needed by a job was in fact needed, and if the job did not exist (likely because it didn't match to the pipeline's only/except rules), that it would also not run. However, there are some unusual cases where the intent would be:
If job _x_ exists, then wait for it, otherwise run as part of the chosen stage
This is not possible with the current syntax, and it significantly limits the usefulness of the whole DAG feature. Due to other limitations, teams are often forced to duplicate jobs, which in turn causes a snowball effect for DAG, since there is currently no way to express needs: [A or B]. Allowing needs to reference a job that might not exist in the given pipeline makes gitlab-ci.yml files more concise.
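For illustration, a minimal sketch of the failure case today (job names and scripts are hypothetical):

```yaml
build:
  stage: build
  script: make build
  only:
    - master              # build only exists in pipelines on master

deploy:
  stage: deploy
  script: make deploy
  needs: [build]          # on other branches the pipeline cannot be created,
                          # because the needed job is not in the pipeline
```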
Intended users
Further details
Proposal
We can make this easier by allowing you to specify that the job should go ahead if the job it refers to does not exist:
```yaml
job1:
  needs: [a, b, c, {job: d, optional: true}, e, f]
```
The default value, when not explicitly overridden, is false (which is the current behavior today).
Note that this should work in concert with #233876 in cases where both are provided.
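A minimal sketch of how this could look in practice (job names and scripts are hypothetical), combining rules with the proposed keyword:

```yaml
test:
  stage: test
  script: bin/test
  rules:
    - if: '$CI_MERGE_REQUEST_ID'   # test only exists in merge request pipelines

deploy:
  stage: deploy
  script: bin/deploy
  needs:
    - job: test
      optional: true               # wait for test when it exists, otherwise start right away
```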
Permissions and Security
Documentation
Testing
What does success look like, and how can we measure that?
Links / references
This page may contain information related to upcoming products, features and functionality.
It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes.
Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.
@jlenny Do we have a real-world use case that could explain the functional change we want to achieve?
I understand the change we want to make but not why. To me this looks like we're marking a job as a dependency with needs but also marking it as not required, which doesn't make sense.
@jlenny I agree with Matija, saying "Job A needs, but does not require, Job B" is very confusing.
I recommend not using require as a keyword together with needs as they mean almost the same thing. I'd suggest something like if_exists.
```yaml
deploy:
  stage: deploy
  needs:
    - job: test
      if_exists: true
```
This reads as "the deploy job needs the test job if it exists. If it doesn't exist, then it doesn't need it. Execute normally."
if_exists: false would read as "the deploy job needs the test job even if it does not exist. I would expect the pipeline to fail if the needed job is not there due to logic. I'd also expect this to be the default behavior."
Another option might be fail_if_missing, where fail_if_missing: true is the default behavior and fails the job if the needed job does not exist due to only/except logic. fail_if_missing: false would mean basically "if the job doesn't exist, then you don't need it. Execute like a normal pipeline waiting for the stage to complete."
The problem is that we have two jobs (simplified):
compile-assets (runs on master)
compile-assets pull-cache (runs on non-master)
They're mutually exclusive, and they produce the assets we want for a number of subsequent jobs. We want either of them, but we cannot specify that we need both, because they're mutually exclusive. We're forced to split them, replicating all the only/except behaviour we put into those jobs.
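A minimal sketch of the two mutually exclusive jobs (simplified; scripts and stage names are hypothetical):

```yaml
compile-assets:
  stage: prepare
  script: bin/compile-assets
  only:
    - master

compile-assets pull-cache:
  stage: prepare
  script: bin/compile-assets --pull-cache
  except:
    - master

# A consumer can only pick one of the two, even though either would satisfy it:
rspec:
  stage: test
  script: bin/rspec
  needs: [compile-assets]   # breaks on non-master, where only the pull-cache job exists
```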
Basically, we made everything just depend on compile-assets pull-cache and made it run on master too, then changed compile-assets to just update the cache at the end on master.
This might be even better than having two jobs and needing both of them, because now the responsibility of each job is clearer.
Returning this to the backlog as we will unfortunately not have capacity to go after this in this fiscal year. This is primarily due to higher priority items with more demonstrated demand that we really need to deliver. If you feel this decision is in error or missing something important, please do @ me to let me know and we can discuss.
@jlenny, @matteeyah, this significantly limits the usefulness of the whole DAG feature. Due to other limitations, teams are often forced to duplicate jobs, which in turn causes a snowball effect for DAG, since there is currently no way to express needs: [A or B]. Allowing needs to reference a job that might not exist in the given pipeline makes gitlab-ci.yml files more concise.
I think this use case is less unusual than the description of the issue indicates, especially for monorepos. You might have a deployment model like the following (using @williamchia's notation which I agree is more clear):
The scenario I'm thinking of here is updating a chart and deploying the updated version even though the image the chart uses hasn't changed. Changes to a helm chart or other infrastructure deployment mechanism can and should be decoupled from the build stage, but if a build is triggered the deploy should wait for it.
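A minimal sketch of such a model (job names, paths, and scripts are hypothetical; shown with the if_exists form suggested earlier in the thread):

```yaml
build-image:
  stage: build
  script: docker build -t $CI_REGISTRY_IMAGE .
  only:
    changes:
      - app/**/*            # image is only rebuilt when application code changes

deploy-chart:
  stage: deploy
  script: helm upgrade --install my-app chart/
  needs:
    - job: build-image
      if_exists: true       # wait for a new image if one is being built,
                            # otherwise deploy the chart change on its own
```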
@ayufan @jlenny If I'm understanding this correctly - this is for introducing "conditional" DAG jobs? If the needed job is there, run as DAG job, otherwise run as normal job? Does that mean that the job has to have the stage associated with it in addition to having needs?
@matteeyah this is for a behavior if you need a job that ends up not instantiating due to an only/except rule for example. In the example below, the deploy job should be free to start whenever it was going to, even if there ends up being no job called test that ran:
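A minimal sketch of that situation (rules and scripts are hypothetical; using the required: false form discussed in this thread):

```yaml
test:
  stage: test
  script: bin/test
  only:
    - merge_requests      # test is not instantiated in branch pipelines

deploy:
  stage: deploy
  script: bin/deploy
  needs:
    - job: test
      required: false     # start deploy even when no test job ran in this pipeline
```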
I believe this is what we are trying to prevent in this issue with introducing required: false, right?
> If job _x_ exists, then wait for it, otherwise run as part of the chosen stage
What happens if we have more than 1 need?
```yaml
test1:
  stage: test
  script: exit 0
  rules:
    - if: $CI_MERGE_REQUEST_ID

test2:
  stage: test
  script: exit 0

deploy:
  stage: deploy
  script: exit 0
  needs:
    - job: test1
      required: false
    - job: test2  # required: true - by default
```
If we ignore non-existing dependencies, deploy will only depend on test2. Conversely, if deploy has only one dependency, needs: [test1], we are saying it should run as part of its chosen stage. I'd suggest that we make it behave like needs: [] instead, given that it's now supported. This way a DAG job remains a DAG job and not a stage-scheduled one.
> I'd suggest that we make it behave like needs: [], instead, given that it's now supported. This way a DAG job remains a DAG job and not a stage scheduled one.
Another comment on the syntax required: true. We had discussed the same syntax for a different purpose in #213080 (comment 337501218), where required: true was being thought of as "expected to be present and executed" vs "expected to be present and skipped".
@fabiopitino @ayufan @thaoyeager I actually have no idea what the error message `deploy: needs test1` is trying to say. We might need to open up the git blame to figure out what it was implemented for (and maybe, while we're in there, make it more clear).
That said the syntax that Kamil wrote above makes sense to me for notating that the job only optionally (i.e., if it exists) needs something.
@ayufan aha, yeah. So the `if` there resolved to false? Yeah, better UX would be something like `deploy: needs job 'test1' but that job does not exist in this pipeline`. What do you think @nudalova @dimitrieh?
I have a related need, where all jobs except the first one have a needs: statement - I have jobs that depend on the code being built, while others don't.
And I only want the whole pipeline to be executed on commits that do not have a certain tag defined using a regexp, so I have a hidden job that defines an except: statement and all other jobs depend on it. Something like:
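A minimal sketch of the kind of layout described, simplified to two jobs (tag pattern, scripts, and job names are hypothetical):

```yaml
build:
  stage: build
  script: make build
  except:
    - /^release-.*$/      # the whole chain is meant to be skipped for matching tags

test:
  stage: test
  script: make test
  needs: [build]          # GitLab rejects the pipeline when build is excluded by the tag rule
```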
In my case, what I want is for jobs to be skipped if their DAG dependency is also skipped. The example above is way simpler than the actual script, but even in this simple case, GitLab complains about an invalid YAML syntax for job2 when a tag matching the regexp is present.
It's quite befuddling to see this error pop up in the pipeline too, because the CI linter will not report an error.
Moving to the backlog - we are using a new process where only major direction items are pre-scheduled and we will populate the majority of upcoming items for releases during planning.
This is a really important feature for our company. We are not able to use the benefits of needs because we have many jobs that only run in some scenarios. Thanks a lot for putting effort into this ticket.
I have exactly the same use case as this. Sometimes there are jobs in an early stage that, if they exist, other jobs need to wait on. However, in 99% of the pipelines these jobs do not exist and the other jobs can run instantly.
Right now we just put needs: [] for those jobs, which means that for the 1% of pipelines that do have the earlier jobs, the later jobs may fail because the dependency they skipped waiting on was actually required for that specific pipeline. We then have to retry the jobs in the later stage.
Having a "need the job if it exists" as a form of optional dependency would solve this problem.
@thaoyeager @jyavorska I understand the planning process you are using right now, however, would you be able to give me a rough priority of where this issue stands in your backlog? It is blocking a Category:DAST issue (#33830) that we'd like to push forward, but can't until this feature is added. I'd like to know if I just need to put our issue in the backlog for the next few months or if this feature will be a priority for your team soon.
A commercial premium customer is interested in the functionality: https://gitlab.my.salesforce.com/0016100001Eo81O
They have found the current implementation difficult to maintain. The relationships are forced on all jobs and work against the rules functionality.
The following is a simplified draft of what the customer is trying to build:
We have a problem implementing DAG pipelines, I think because of this issue. We have a base Docker image that needs to be built only rarely (using only: changes). Several other build jobs depend on it because they start from this image. So, if base:build is in the pipeline, those jobs should wait for it. If it is not in the pipeline, they can proceed, because they are configured to get the appropriate cached image.
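A minimal sketch of that setup with the proposed keyword (job names, paths, and scripts are hypothetical):

```yaml
base:build:
  stage: prepare
  script: docker build -t $CI_REGISTRY_IMAGE/base docker/base
  only:
    changes:
      - docker/base/*          # the base image is rebuilt only rarely

app:build:
  stage: build
  script: docker build -t $CI_REGISTRY_IMAGE/app .
  needs:
    - job: base:build
      optional: true           # wait for a fresh base image if one is being built,
                               # otherwise proceed with the cached image
```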
I have been reading the comments of this and other issues and I'm surprised I haven't seen a similar use case. This makes me think maybe I'm misunderstanding something and there's an easy way to accomplish this. For now, we're going back to relying on stages instead of needs.
The main problem here is that this worked in stages mode (non-DAG), where dependencies could list a job that might not exist; then we switch to DAG and find out that it doesn't work.
I don't understand how this is not getting prioritised accordingly :(
What would you think of flipping the behaviour by default? Is there a reasoning behind the "smart default"? It seems to me that it would be better (and easier to implement) to just accept and ignore jobs that aren't instantiated. In fact, that's what dependencies does: I think it's better to be consistent.
I guess nobody is relying on the current behaviour. It doesn't make sense that a pipeline that usually works just stops working one day (especially release day) because one rule/except clause now applies.
We'd like to see this implemented as well, and we'd want to make sure it supports bridge jobs as well as regular jobs.
We have a monorepo with different teams interested in different child pipelines. These child pipelines are included via rules:changes. We got a request from a team to send a single Slack notification on success if a commit lands that affects a subset of files triggering either of their child pipelines. It could look like this:
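A minimal sketch of what this could look like with optional needs on the trigger jobs (job names, paths, and scripts are hypothetical):

```yaml
child_pipeline_1:
  stage: test
  trigger:
    include: team-a/child-pipeline-1.yml
  rules:
    - changes:
        - team-a/service-1/**/*

child_pipeline_2:
  stage: test
  trigger:
    include: team-a/child-pipeline-2.yml
  rules:
    - changes:
        - team-a/service-2/**/*

notify_slack:
  stage: notify
  script: ./scripts/notify-slack.sh
  needs:
    - job: child_pipeline_1
      optional: true           # either trigger job may be absent, depending on the changes
    - job: child_pipeline_2
      optional: true
```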
I'd like to use needs, because I have an intermediate other stage in this example and I don't want to wait for all the jobs in other to complete before I send my Slack notice. But, instead I have to use dependencies, because I can't guarantee that child_pipeline_1 and child_pipeline_2 will always run together.
Another solution for this use case might be to support needing a stage instead of a job.
Hello everyone, I've updated this issue with the latest thinking of how we're going to go ahead with implementing this hopefully in 13.5. Please take a look and share your feedback.
This might make DAG a little bit more complex, but perhaps it is worth it. @furkanayhan can you take a look at this too? Do you think it is in-line with our plans to unify stages / DAG processing?
@cheryl.li @furkanayhan Since no weight is assigned to this issue, I am moving this to be ~"candidate::13.7"; let's try to get a weight estimation for %13.7.
Not having the ability to list jobs to wait for completion/success is really hurting the total run time of our CI jobs for monorepos.
Use DAG: I can't use a changes rule to only include sub-projects if they need to be built/deployed. So jobB must wait until jobA finishes, even if jobA has no need to run.
or
Use DAG: I have to script out a git check of my own, but the container/job still needs to spin up and clone the repo just to do a git check to see if it should run or not.
or
Don't use DAG: use a changes rule, but arrange the jobs into different stages so that those without dependencies, or that are the last dependency in their chain, sit in the last deployment stage. Here jobC, which isn't actually dependent on jobA, still has to wait for jobA to finish if jobA is included in the pipeline.
Lacking a way to make a dependency optional makes rules often useless for all but the last job in the pipeline. I have to put the condition into the script instead (so the job always runs, even if it does nothing), or the pipeline won't even start.
I have a very similar use case to @jacek_axeos. I have a pipeline where all the jobs are deploys.
stage3 will need to be deployed if (stage2 ran) OR (specific file changes).
First I tried to have a global variable that was set in stage2, but the jobs are independent of each other, so stage3's rules don't see it.
I then thought of this (an optional dependency): having stage2 as optional, so that if it exists, stage3 would run after it.
Will this feature help me on what I'm doing?
Additional feedback from an Ultimate customer interested in this feature:
We plan to deploy to multiple regional environments (say, europe-west1->4), where a region may be optional depending on the application (using rules). After the regional deployments, we then want to run migrations in a separate stage.
Using the DAG, we would define our needs to be all the regional environments but set as optional, so if a region is disabled the DAG will still work as intended (right now if a region is disabled the job is no longer run per its rule, and the DAG fails because the job does not exist in that pipeline context).
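A minimal sketch of that layout (region names taken from the comment; scripts, stages, and variables are hypothetical):

```yaml
deploy europe-west1:
  stage: deploy
  script: ./deploy.sh europe-west1
  rules:
    - if: '$DEPLOY_EUROPE_WEST1 == "true"'

deploy europe-west2:
  stage: deploy
  script: ./deploy.sh europe-west2
  rules:
    - if: '$DEPLOY_EUROPE_WEST2 == "true"'

migrate:
  stage: migrate
  script: ./migrate.sh
  needs:
    - job: deploy europe-west1
      optional: true          # a disabled region simply drops out of the graph
    - job: deploy europe-west2
      optional: true
```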
@dhershkovitch I see this is a candidate for 13.10. Is this still the plan? Thanks in advance for your feedback!
I think the bot needs an update of the tiers. That said, please keep that in Core. I do not think that pipeline definitions that work on one instance should break on another one, nor will testing for that feature be fun if it behaves differently.
Thanks for the heads up @furkanayhan - although we should probably have @dhershkovitch decide on the proper syntactical keyword. It sounds like more of a naming preference when that suggestion comes out of an MR.
@furkanayhan @cheryl.li I've added my comment on the discussion. I believe we should proceed with the current planning and not introduce any last-minute changes.
This script defines some expensive jobs (build-image + build-sdk) which run only in special situations. I want to skip deploy silently unless the images and the SDK have been built.
Hi, can we please update the documentation on the GitLab portal? It appears that version 13.9 does not support optional:
```yaml
build changelog:
  stage: changelog
  extends: [.build-changelog, .only-release]
  needs:
    - job: version
      optional: false
    - job: build and test jdk8
      optional: true
    - job: build and test jdk11
      optional: true
```
Error: This GitLab CI configuration is invalid: jobs:build changelog:needs:need config contains unknown keys: optional.