This has caused few troubles which demonstrate the need to enforce some rules (with tests) and provide guidelines for other teams to leverage the include feature.
Template name must be unique, whatever the subfolder it is placed in. E.g. having these two templates should result in a test failure:
Security/DAST.gitlab-ci.yml
Jobs/DAST.gitlab-ci.yml
This rule was motivated by the behavior explained in #2.
Template placed in a subfolder can be included without specifying the folder. E.G
include:-template:SAST.gitlab-ci.yml
The gotcha we just discovered is that for #2 to work, the folder actually has to be listed as a category. For Security templates, this was done in EE only by extending the list. Though having the folder listed as a category also has the side effect to display the templates it contains in the Template dropdown in GitLab UI. For this reason (AFAIR), the Jobs template where not listed as a category.
The resulting behavior is:
in CE, Security templates must be included using the folder prefix: template: Security/SAST.gitlab-ci.yml (though, we don't particularly want to include them in CE, all our features are premium tiers)
in EE, Security templates can be included both with or without the folder prefix: template: Security/SAST.gitlab-ci.yml or template: SAST.gitlab-ci.yml
in CE and EE, Jobs templates must be included using the folder prefix: template: Jobs/Build.gitlab-ci.yml
NB: The ~Secure feature have been vendored in 11.9 following these assumptions listed above and we need to keep them working as is to avoid breaking customer configs. At least, if we want to change the behavior, we need to deprecate it now and drop
support with 12.0.
We need to decide which behavior we want to provide and align all our job templates on it.
Do we want to keep allowing inclusion of templates without specifying the subfolder?
YES:
rule #1 must be enforced, by adding a unit test ensuring uniqueness of template names, whatever their subfolder (both in CE and EE), and document this.
NO:
we need to announce deprecation of what ~Secure has released in 11.9, which is using unprefixed include e.g. template: SAST.gitlab-ci.yml and promote prefixed usage everywhere. Though, to avoid redoing a change later, I'd highly recommend that instead of asking to use the Security/ prefix we actually move Security templates into Jobs subfolder directly.
we need to decouple the logic that allows unprefixed include from the one that exposes the templates in the UI (dropdown), so that without exposing the Jobs/ templates in the UI, we still allow to include them unprefixed (to avoid breaking change for what ~Secure has released in 11.9). This code could then be dropped in 12.0 to disallow unprefixed include, if not used anywhere else.
What does success look like, and how can we measure that?
Unified behavior for job templates, at least from 12.0 if it implies breaking changes for what has been already released.
@gonzoyumo thanks heaps for writing up this detailed summary. This really helped me understand the problems.
My preference would be to remove the the ability to include templates without specifying a subfolder. I do not really see much value in this and it complicates the developer experience unnecessarily.
Also if we keep it I would urge us to move this code to CE. I can't see any real value in having this minor cosmetic behaviour of including templates without a prefix only available in EE. Every feature we have that is only in EE adds extra maintenance cost and increases the risk of breaking things in master without noticing until it gets merged (or even worse deployed) in EE.
The only advantage I see to allow including templates without specifying a subfolder is to avoid messing with customers if we want to reorganize our files. There might be other legacy reasons that I'm not aware of though (@ayufan if you know some?).
It also doesn't bring value to type this prefix and add room for typos (poor argument , just trying to find some).
One thing that we could do is to split at the top level to distinguish full templates from jobs one:
templates/full/ (yeah naming...)
templates/jobs/
This would allow us to handle these two types of templates differently (e.g. selecting a full template in the dropdown will wipe out your current config, the other will just add an include statement in your existing config. Then within each type, allow to organize in subfolders for clarity/maintenance purpose but without requiring to give it when including the file.
@gonzoyumo I don't really much mind about whether we keep this logic to not require the prefix when including templates. But I do feel strongly that if we keep it then it should be moved to CE to improve maintainability. I don't see any reason for this behaviour to be EE only.
As for the organisation of full templates vs. single jobs I think this is describing a whole new feature we could implement. I'm not sure how that helps us solve the current problem we have.
But I do feel strongly that if we keep it then it should be moved to CE to improve maintainability. I don't see any reason for this behaviour to be EE only.
This is not EE only actually. This logic is part of the CE code tied to using categories (which I think the primary goal was to organize them in the dropdown selector).
The only difference between EE and CE is about the templates that ~Secure team has vendored in 11.9 in the Security subfolder. As we often do (because we only have EE features), we extended the CE code to add the Security category in EE, making this weird behavior of needing to specify the full path in CE while being able to only use the file name in EE.
I won't consider that a hack, it's really a conventional way to override included job. Though, I get your point and to address that we need to decide on whether or not we want to enable DAST on the default branch (master in AutoDevOps).
If jobs are just for internal usage, they should be hidden from general eyes, and be referenced directly,
For all jobs that we expose folders in we should forbid the duplicate names, as then the behavior of template: is inconclusive what template gonna be picked.
If the folder is used, it means that this has to be a complete template that can be used separately outside of the include, it means that it has a proper stages: and stage: definition, which can conflict with like Auto-DevOps.gitlab-ci.yml.
We do enforce today that 3. is fulfilled, so each template has to be "useable". I will likely create the MR for 2. as this seems that from past MRs doing composable configurations is confusing why it does work or not.
Thank you for jumping in @ayufan and pushing these MRs! One problem solved.
If the folder is used, it means that this has to be a complete template that can be used separately outside of the include, it means that it has a proper stages: and stage: definition, which can conflict with like Auto-DevOps.gitlab-ci.yml.
I'n not sure I follow you on this one
This introduces a convention that is not straightforward to me.
Also, even if we provide "job" templates that are meant to be used in a full config using include, we can't presume it won't be the only job there. We had this issue with our DAST job template, where we had to define the stages to make sure it could work even if it was "alone" in a CI config.
I'd rather make the include feature consistent whatever the kind of template, but improve the UX around the usage of them.
Also could you please explain a bit more about what usages you are point at:
that can be used separately outside of the include
--
So, I would prefer if we have to include files in our templates to be explicit and use full folder.
I agree this makes things more explicit and straightforward, even if less flexible for us to reorganize the template folders (but that should be very rare anyway).
But then we could have duplicate names in separate folders, right? I'm asking this for the sake of understanding, I don't see it as a usability requirement, unique name will still makes things less ambiguous.
Templates exposed in a Category are meant to be "full" config (from the code perspective). We kinda broke that rules by exposing "job" template in categories (starting with Security and now with Jobs). This notably explains some unexpected behavior in the tests.
This introduces a convention that is not straightforward to me. Also, even if we provide "job" templates that are meant to be used in a full config using include, we can't presume it won't be the only job there. We had this issue with our DAST job template, where we had to define the stages to make sure it could work even if it was "alone" in a CI config.
We cannot really compose stages: and this proven to be problem, but stages: can also be fixed by upper-most template in such case, so maybe it is not a big deal. Taking above point into consideration, it means that each template has to be valid and self-sufficient, even if it is composed into another bigger one.
that can be used separately outside of the include
It is exactly what I mention above :)
Template to be self-sufficient to run without any changes.
We cannot really compose stages: and this proven to be problem, but stages: can also be fixed by upper-most template in such case, so maybe it is not a big deal. Taking above point into consideration, it means that each template has to be valid and self-sufficient, even if it is composed into another bigger one.
As a late bird, I'd put my 2 cents here: I don't like the current approach when the particular template (e.g. DAST.gitlab-ci.yml) contain knowledge about the previous stages/jobs of Auto DevOps (e.g. DAST refers to environment_url.txt file from the review app job).
I think this is another thing that is against
Template to be self-sufficient to run without any changes.
I think it's worth an issue to have all of these Auto DevOps specific details extracted from "stages" templates into the "root" Auto DevOps template, as much as possible. @gonzoyumo@ayufan WDYT?
@kencjohnston@brendan can we discuss the product vision on that matter? %11.10 is shipping soon so we might only having %11.11 left to deprecate previous syntax (if this is what we want to do).
Discussing with @ayufan about this topic led us to interesting outcome:
It seems that we went to use Jobs folder to distinguish the fact that these templates would be partials to be included (using include feature) but reduces the relevance of the name.
This breaks the current usage of the subfolder which was about meaningful organization of the templates.
So another approach to address this would be to use a different filename convention for partial jobs from the one for full templates.
This way we could keep the folder organization as a meaningful split that helps maintenance and understanding of what that template really do.
e.g. Security/SAST.include.yml or Security/SAST.partial.yml clearly shows the intent: it's about adding SAST security feature to your CI config.
Whereas Pages/Hugo.gitlab-ci.yml clearly states that this is a full CI config to build pages with Hugo.
This could help to easily address the GitLab CI Recipes feature I think.
It seems also that Jobs/Build.gitlab-ci.yml as a way to describe the usage of the template is very ambiguous on multiple levels compared to all other templates:
Jobs/: jobs to be used for what?
Build.gitlab-ci.yml: what we actually build?
It seems that the user here is Auto-DevOps, it would make a lot of more sense to instead of using Jobs/ to use Auto-DevOps/Build.gitlab-ci.yml as this could clearly show that this is build phase of Auto-DevOps template.
However, I think that we want to focus instead of phases on something meaningful that has reference to our documentation about the feature. Let's consider https://docs.gitlab.com/ee/topics/autodevops/#auto-build as an example. This describes the Auto-Build feature, which would make a lot of sense to follow the same pattern: Auto-DevOps/Auto-Build.gitlab-ci.yml. This as a template that could be included outside of Auto-DevOps is actually a very useful feature, to use the Auto-Build feature that is part of Auto-DevOps practices.
I think that if we want to expose templates to user's eyes, they by definition should be valid and should be used without any changes to the template itself outside of the given context. The upper-level Auto-DevOps.gitlab-ci.yml would basically compose a set of independent templates into a complete workflow consisting of multiple stages, making these items independent (by also reducing complexity).
I'm globally ok with these proposals. Only few comments:
require that sub-level template never defines any of global configurations, like stages:, before_script:, etc.
We had trouble with stages when vendoring DAST template, so it could make sense to simplify the rule and enforce that. Though, restricting to build, test or deploy might be a bit too simplistic. E.g. DAST job only make sense after having deployed your app somewhere to launch the Dynamic analysis against it. With such rule our DAST job template won't be usable "as is" with an include, we'll have to ask the users to redeclare the stages explicitly in their template, and it will make harder to implement GitLab CI Recipes I think.
@ayufan what does it take to enrich the default stages? AFAIR it doesn't hurt to have all stages we need there, as they won't show up in the pipeline unless there is a job assigned to them.
if we reference composable template from our top-level template we always use full path, like Jobs/Build.gitlab-ci.yml
I'm fine with that, but we also need to define rules for public usages of include.
We also still need to address the how to distinguish composable or job template (to be used in include) from full config.
And finally, we should settle on a naming convention to make sure we clearly understand each other:
top-level or full config
sub-level, composable, or job template
And as I understand it, this is totally decoupled from the folder hierarchy. So a top-level template could reside in a sub-folder, right? If yes, then I think a distinct filename convention would be necessary.
what does it take to enrich the default stages? AFAIR it doesn't hurt to have all stages we need there, as they won't show up in the pipeline unless there is a job assigned to them.
The default? I don't think that we should do it, as this is a one-way ticket that will have to be maintained like that forever. I would really ask, where the build, test and deploy does not work and really think then if we really need to change the default. I think, that no. DAST/SAST, they all can go into test. It is kind of our expectations that sometimes we might execute that at a different phase.
I'm also okay with these requirements - I think they make sense. Also, gitlab-ce#53307 may help us down the line to have the "composible" items exposed in a better, more meaningful way.
I disagree for DAST: this is a dynamic test that runs against a deployed application. Thus you can't run DAST during the test stage that happens before the deploy stage. I think it's similar for browser performance testing.
I think at the end of the day the structure and patterns we use are Verify and @brendan call. One of hte first applications of those structures will be within Auto-Devops (Configure) and that has implications on our security jobs (Secure) but it is a Verify call to make.
I thought the edict around breaking changes was that deprecation had to be announced on major releases, however we had leeway to perform the deprecation in later, minor releases if need be.
Since "in CE, Security templates must be included using the folder prefix", it will be really awkward to have 2 different syntaxes if we start moving security features to Core later this year.
It would be bad to wait for %13.0 for this kind of change to occur.
I thought the edict around breaking changes was that deprecation had to be announced on major releases
This is the actual drop of support (the breaking change really) that should be announced in major releases (even if technically we remove the code later). The deprecation notice has to be communicated earlier to let people know the upcoming breaking change in advance and give them enough time to adapt.
Since "in CE, Security templates must be included using the folder prefix", it will be really awkward to have 2 different syntaxes if we start moving security features to Core later this year.
This behavior is actually a consequence of having the templates only available in EE. If we move this availability in CE, it's likely that we'll also move their "discoverability" in the UI's dropdown, which is also what makes them available for inclusion without subdir prefix.
This has not been addressed yet, we still have discrepancies with our job templates usage, but it's like that for the past 6 months so it doesn't look like a real issue.
@gonzoyumo what is the potential impact to the discrepancies and us adding third party partners?
if this will make instructions / integration harder i want to move it to 12.6
if it won't i think we can backlog it - and it also sounds like there are multiple items so should it actually be assigned a grooming just to make it multiple smaller are more specific issues?
@NicoleSchwartz this is unrelated to 3rd party partners to me. The discrepancy is for our GitLab users who use these vendored templates in their CI config.
so - is this something we want to fix (and work with verify on the planning / review?) or is this something we are asking them to fix and the group should change?
@connorgilbert i think sast already does this?
@derekferguson@sam.white i'm still trying to squeek this for 15 it might miss but wondering where this is on ya'lls radar
cc @matt_wilson shouldn't impact you
It looks like the Container Scanning and Cluster Image Scanning templates are correctly in the /Security folder only and not in the /Jobs folder so this would not impact me (if I am understanding this issue correctly). Lots of other scanners are impacted though.
@sam.white yes your understanding is the same as mine - we either need to be in both or /security SAST is in both as a migratory move AFAIK which i'll be mirroring (be in both, announce change and deprecation, then on remove remove from jobs)
Mark Nuzzochanged title from Organize and define behavior of jobs templates to be included to Backend: Organize and define behavior of jobs templates to be included
changed title from Organize and define behavior of jobs templates to be included to Backend: Organize and define behavior of jobs templates to be included
HI @furkanayhan - thanks for reaching out here to discuss.
It seems like this issue is to standardize templates that are used. Though there is a correlation to the component catalog feature, if I'm understanding correctly, I'm wondering if this may be out of scope for the component catalog and each project owner would be responsible for how they wish to manage their components as long as it conforms to the general structure which was noted in #352694 (closed).
Thanks for the ping @marknuzzo - I think this issue should become irrelevant when we move templates and especially Security related templates to CI components.