default:variables

mentioned in epic &6565

added to epic &6565

added automation:ml grouppipeline execution labels

This issue was automatically tagged with the label grouppipeline execution by TanukiStan, a machine learning classification model, with a probability of 1.

If this label is incorrect, please tag this issue with the correct group label as well as automation:ml wrong to help TanukiStan learn from its mistakes.

This message was generated automatically. You're welcome to improve it.

added grouppipeline authoring label and removed grouppipeline execution label

@furkanayhan I'm definitely onboard with variables moved into default to start.

If we introduce default:rules, I would expect it to work like the others, it copies the one rule out to all the other jobs (so, does not work like workflow rules, but just repeated job rules).

@marcel.amirault We'd start to have some problems with default:variables and default:rules.

Right now, the default keyword works in a very easy way. It just overrides. No merge, no addition... We are adding new behaviors with variables and rules.

variables: we expect that a job's variables will merge with the default variables.

default:
  variables:
    VAR1: default var 1
    VAR2: default var 2

test:
  variables:
    VAR1: job var 1

will become

test:
  variables:
    VAR1: job var 1
    VAR2: default var 2

rules: how does it work? rules is an array, we cannot just append it to jobs...

default:
  variables:
    VAR1: default var 1
    VAR2: default var 2
  rules:
    - if: $CI_OPEN_MERGE_REQUESTS
      variables:
        VAR1: MR default var 1

test:
  variables:
    VAR1: job var 1
  rules:
    - if: $SOME_VAR
      when: never

what's the result?

@furkanayhan Yeah, it seems like (if we added it to default) the job rules would override the whole default rules, no merging... but I'm not sure it'd be worth it. You can just define a hidden job, then extend it in every job that needs the rule, so I'm not sure there's much benefit to doing more than the original plan to just add variables to default.

Setting label(s) devopsverify sectionops based on grouppipeline authoring.

added devopsverify sectionops labels

@dhershkovitch, please can you add a type label to this issue to help with issue discovery in issue reports.

added auto updated label

mentioned in issue gitlab-org/quality/triage-reports#5342 (closed)

mentioned in issue gitlab-org/quality/triage-reports#5408 (closed)

changed the description

@fabiopitino @furkanayhan @lauraX @mbobin @matteeyah @marcel.amirault

I've updated this issue based (moved rules to be out of scope for now)

on our call @fabiopitino raised the concern regarding the precedence between variables and default: variables (e.g. in case someone on an include), as suggested on the call should we make default: variables higher precedence than top-level keyword variables (although they are practically the same)

@dhershkovitch

the precedence between variables and default: variables (e.g. in case someone on an include)

I agree that default: variables should have higher precedence but how are we going to do that? I mean, we have some options...

If we want users to migrate from top-level variables to default: variables, I think what we need to do is to ignore the top-level variables when the YAML has default: variables. (We can also show a warning message like the top-level variables has been ignored).

If we merge the top-level variables with default: variables with a precedence, I think users will continue use them together.

WDYT? @fabiopitino @marcel.amirault

@furkanayhan My first instinct was to suggest a validation that prevents them from being used together in the same config, but even our own templates use top-level variables, so that wouldn't work for anyone using those templates.

I'm not convinced your solution is the best, but the only option seems to be merging/overriding with default: variables taking precedence, and that doesn't seem any better?

Let's try to envision a real scenario

We have the following template using top-level variables as keyword

# var.yml
variables:
  VAR1: var_1
  VAR2: var_2

This template is included in a pipeline that has the following

# .gitlab-ci.yml
include: var.yml
variables:
  VAR1: My_var1
  VAR3: My_var_3

When we introduce default: variables we should prompt a warning to replace top level variables with default: variables, and users will be able to update the .gitlab-ci.yml but not the content of the templates, so they will end up with the following

# .gitlab-ci.yml
include: var.yml
default:
  variables:
    VAR1: My_var1
    VAR3: My_var_3

We need to keep both scenarios identical, so we wont break existing pipelines

@dhershkovitch @furkanayhan I know we have had several discussions already but could we have a clear and concise problem statement? This would help us frame what we are trying to solve and how to balance costs vs benefits.

After the discussions we have had I'm still wondering whether this is the right path forward. Introducing a new syntax for a feature that already exists comes with costs:

we can't deprecate variables: as we never have deprecated any keywords so far at GitLab. Unless we have a way to version the YAML (another breaking change) we have to live with both. E.g. like today for stage: and type:.
living with 2 equivalent keywords adds a complexity cost (from development side) and cognitive cost (from user perspective).
As mentioned in another issue, default:variables would behave differently (merging values) than other default: sub-keywords (replacing values). This adds both complexity and cognitive cost.

Given the costs above (maybe there is something else I'm missing) and the problem statement, are the benefits justifying the costs?

Can this be a documentation improvement?

/cc @marcel.amirault

@furkanayhan I think you suggested earlier to raise an error if default:variables and variables are used in the same YAML and I wasn't convinced about it. I'm still not convinced it's a good UX when taking in consideration CI templates or 3rd party files BUT we actually do that today with regards to other top-level keywords:

For example before_script can be defined as top-level keyword or as default:before_script. We don't allow both to exist on the same merged YAML (https://gitlab.com/fabiopitino/test/-/pipelines/420440602) where .gitlab-ci.yml uses default:before_script and the included file uses before_script:. So, technically the idea of disallowing the presence of both default:variables and variables: could be somehow consistent with whatever we are doing today. There is still the inconsistency of the values being "merged" vs "replaced". cc @marcel.amirault

@fabiopitino

we can't deprecate variables: as we never have deprecated any keywords so far at GitLab. Unless we have a way to version the YAML (another breaking change) we have to live with both. E.g. like today for stage: and type:.

I believe we have a solution here, we are experimenting this with the types keyword, and if it will succeed we could adopt it as the official CI deprecation process, with Variables it will probably take years, since we should:

Remove it from our docs
Remove it from our templates
Announce deprecation
Raise warning when it is used
Measure how often it would use
Remove the keyword however, we should start somewhere

I think you suggested earlier to raise an error if default:variables and variables are used in the same YAML and I wasn't convinced about it. I'm still not convinced it's a good UX when taking in consideration CI templates or 3rd party files BUT we actually do that today

I think this is the right way for us to go, and i mentioned that in my comment on the epic (btw - let's make sure we add comment to the issue and not the epic), this way we can keep it consistent with other defaults: sub-keys

@fabiopitino

we can't deprecate variables: as we never have deprecated any keywords so far at GitLab. Unless we have a way to version the YAML (another breaking change) we have to live with both. E.g. like today for stage: and type:.

living with 2 equivalent keywords adds a complexity cost (from development side) and cognitive cost (from user perspective).

Agree with Dov's answer about &6859 (closed).

As mentioned in another issue, default:variables would behave differently (merging values) than other default: sub-keywords (replacing values). This adds both complexity and cognitive cost.

Ah yes, this will be a new behavior for the default keyword... I have also concerns about this...

I think you suggested earlier to raise an error if default:variables and variables are used in the same YAML and I wasn't convinced about it.

In this thread, we also discussed other options; like we don't raise an error but we ignore the top-level variables when using default:variables.

@dhershkovitch @furkanayhan what about the following questions. I've missed the problem we are trying to solve, whether it's a blocker for some customers or an improvement from our point of view.

could we have a clear and concise problem statement?

Given the costs above (maybe there is something else I'm missing) and the problem statement, are the benefits justifying the costs?

Can this be a documentation improvement?

@fabiopitino The top-level variables keyword has many roles/"problems";

It contains default variables for jobs.
- Users mostly see this as pipeline variables, this can be improved by a documentation update.
Even if it's used for storing default variables for jobs, it can be overridden by workflow:rules.
- I am not sure how to fix this.
It is used for the prefill-variables feature.
- We need to extract this feature from the top-level variables keyword.

With a good documentation improvement, we definitely can fix the problem for number 1. However, I am not sure about the rest.

@furkanayhan

Even if it's used for storing default variables for jobs, it can be overridden by workflow:rules.

I am not sure how to fix this.

Why is this a problem? My understanding of workflow:rules is that sits between the default variables and the job variables. Even if we change variables: with defatult:variables, wouldn't workflow:rules work the same way?

It is used for the prefill-variables feature.

We need to extract this feature from the top-level variables keyword.

I agree. I think the use of variables: as prefill-variables feature may have contributed to confuse users that it represents pipeline-level variables as they were being defined in the YAML. The problem is that they start as default job variables and then they may become pipeline variables, which inverts the inheritance:

if left as default variables, the job variables override them
if sent via the pipeline form, the job variables are overridden by them

Having prefill variables extracted out from variables: into either a new syntax or project setting is a good move since it would keep the existing default variables behavior predictable.

workflow:
  input:
    variables:
      DEPLOY_KEY: Insert the deploy key
      QUICK_RUN: Skip integration tests and advanced checks - default false

The benefit of having prefill variables defined in the YAML over the project setting is that they can be different per ref. On the other hand, they are (arguably?) not related to a pipeline structure or behavior.

@fabiopitino

Even if it's used for storing default variables for jobs, it can be overridden by workflow:rules. Why is this a problem?

The top-level variables keyword refers to jobs. The workflow keyword refers to the pipeline (out of topic: we also want to rename workflow as pipeline). In this case, you are overriding a job default value in pipeline rules. That's the problem/confusion.

Even if we change variables: with defatult:variables, wouldn't workflow:rules work the same way?

Maybe, we should not use workflow:rules to override.

Having prefill variables extracted out from variables: into either a new syntax or project setting is a good move since it would keep the existing default variables behavior predictable.

...

The benefit of having prefill variables defined in the YAML over the project setting is that they can be different per ref. On the other hand, they are (arguably?) not related to a pipeline structure or behavior.

Honestly, I don't know the correct answer here. We should definitely separate the prefill-variables feature from the top-level variables but where to put them is the question. cc @dhershkovitch @nadia_sotnikova

I second what @fabiopitino mentioned, it'd be very helpful to define a clear problem statement first, so we can discuss the potential solutions in the context of that problem.

I've been trying to track the discussions spread over the epic, the call recording and the issues, and it's difficult to understand exactly what problem we're trying to address and what are the solutions being considered and why. @dhershkovitch

Its very hard to pinpoint the exact problem, as it touches several areas, I believe that @furkanayhan comment explain the entire confusion our users have around variables.

About moving from variables to default: variables which is the purpose of this issue I can only voice from the issues I read, the variables top-level keyword is very confusing for our users. We can try improving our documentation, however, by doing so we are drifting away from trying to make our syntax clearer and easy to use, since the top-level variables act as any other default: sub-key, however since it is being used as a top-level keyword users believe it a workflow keyword and try to use it differently.

Related issue for prefill variables #295649 (closed)

Another confusion example; #337633 (comment 743894461)

Users can misinterpret the top-level variables, expect them to become pipeline variables, and try to use them in include.

mentioned in issue #337633 (closed)

mentioned in issue gitlab-org/quality/triage-reports#5556 (closed)

added backend typemaintenance + 1 deleted label

added CI variables label

@fabiopitino @furkanayhan @marcel.amirault @nadia_sotnikova @lauraX

Coming back to this discussion where we mention that

Having prefill variables extracted out from variables: into either a new syntax or project setting is a good move

It looks like we do have an opportunity to "update" your syntax with #29159 (closed), in this issue the main problem for our users is entering the variables manually when running a pipeline and not the yaml configuration, so any syntax we'll create would likely be adopted by our users, so let's try to come up with a new syntax for the prefill variables

Maybe, the existing top-level variables keyword is the perfect keyword

I mean, let's say we extracted the responsibility of "default variables" and "rules" from it to the new default:variables keyword. Then, its only responsibility would be just prefill variables

Or, I am also okay with

workflow:
  input:
    variables:

or

pipeline:
  input:
    variables:

let's say we extracted the responsibility of "default variables" and "rules" from it to the new default:variables keyword

In a way, I kind of like the explicitness about this - the keywords are the definition of the functionality.

default:
  variables:
    VAR1: default var 1
    VAR2: default var 2

@dhershkovitch

@dhershkovitch Addressing this specific detail:

either a new syntax or project setting

I think it'd be useful to think about whether or not prefilling variables should be extended to manually run jobs as well, in a later iteration. Basically:

If we use a new keyword in the YAML, is what we choose going to work if we try to add the keyword to jobs as well?
If we add this functionally to project settings instead, I'm sure it'll be fine for pipeline-level prefilled variables, but would it work for job-level config? But if it's now in the settings UI, it means we can extract it from the YAML and simplify the variables keyword?

I'm struggling to come up with my own personal preference here, not just the place for configuration (YAML or settings), but how it could all connect together and be used.

The sentiments that I am receiving from our customers is that we should try to move the configuration as much as we can to the YAML config, from the IAC reasons (Version control, Audit trails, Ease of collaboration, Knowledge sharing), so we should stick with the yaml syntax for now, I agree with your thought @marcel.amirault that we should think about a syntax that would be applicable both for jobs and pipelines

/cc @lauraX @furkanayhan @nadia_sotnikova

Sounds good!

mentioned in issue #29159 (closed)

mentioned in merge request gitlab-com/www-gitlab-com!99050 (merged)

mentioned in issue #366438

changed the description

mentioned in issue #339174

Related issue - #387021 (comment 1242286495)

marked this issue as blocking #366438

Setting label(s) Category:Pipeline Composition based on grouppipeline authoring.

added Category:Pipeline Composition label

added grouppipeline security label and removed grouppipeline authoring label

added grouppipeline authoring label and removed grouppipeline security label

added grouppipeline security label and removed Category:Pipeline Composition grouppipeline authoring labels

mentioned in issue gitlab-org/quality/triage-reports#11917 (closed)

added Category:Secrets Management label

@dhershkovitch does it look as a duplicate of the epic &7437 to you?

@atanayno No, this issue is to add a nested variable keyword underneath default: keyword, while the linked issue is to add a default value to a variable