Duplicate includes of glob-expanding file break pipeline with invalid YAML error
### Summary

A regression in how CI/CD configuration includes are processed causes pipelines to fail validation with `Included file 'templates/bar.yml' does not have valid YAML syntax!` when two included YAML files both include a third file that itself uses a glob pattern to include additional YAML files.

This behavior started in **GitLab 18.6.0** and appears related to the changes introduced in the MR [**Add wildcard support to cache + recursion detection**](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/211424) (`location_expander.rb`).

This was reported by a customer after upgrading to 18.6. [Zendesk Ticket](https://gitlab.zendesk.com/agent/tickets/707007) - internal only

### Steps to reproduce

1. On GitLab.com, navigate to the public example project: https://gitlab.com/cmutua/nested-yaml-syntax-error (the project contains a minimal CI config to reproduce the issue).
2. Review the project structure (documented in the project README) to see the following pattern:
   * A root `.gitlab-ci.yml` includes two template files, e.g. `templates/foo.yml` and `templates/bar.yml`.
   * Both `templates/foo.yml` and `templates/bar.yml` include a shared file, e.g. `templates/x.yml`.
   * `templates/x.yml` contains a glob include such as:

     ```
     include:
       - local: templates/nested/*.yml
     ```

     which expands to files like `templates/nested/z.yml`.
3. In the example project, run a pipeline (e.g. by pushing a commit or using **Run pipeline** from the UI).
4. Observe that the pipeline configuration fails to load and GitLab shows an error for one of the includes, e.g. `Included file 'templates/bar.yml' does not have valid YAML syntax!`.
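
The include layout from step 2 can be modeled with a small Ruby sketch. This is a toy model, not the code in `location_expander.rb`: the `expand` and `resolve` helpers, the `includes` hash, and the `seen` set are all hypothetical names, and `seen` only stands in for the shared deduplication set described under the current bug behavior. It assumes that wildcard matches are skipped when another include chain has already recorded them, and it shows only how the second chain loses its glob results; it does not reproduce the YAML syntax error itself.

```ruby
require 'set'

# Expand one include entry: a wildcard entry is matched against the known
# file paths, a plain entry is returned unchanged.
def expand(includes, entry)
  return [entry] unless entry.include?('*')

  includes.keys.select { |path| File.fnmatch(entry, path) }
end

# Resolve the includes of `path` depth-first and return them in order.
# When `seen` is given, wildcard matches already recorded in it are skipped.
# This mimics a deduplication set shared across include chains (assumption).
def resolve(includes, path, seen: nil)
  includes.fetch(path).flat_map do |entry|
    matches = expand(includes, entry)
    if seen && entry.include?('*')
      # Set#add? returns nil when the element is already present.
      matches = matches.select { |match| seen.add?(match) }
    end
    matches.flat_map { |match| [match] + resolve(includes, match, seen: seen) }
  end
end

# Hypothetical stand-in for the example project: path => include entries.
includes = {
  '.gitlab-ci.yml' => ['templates/foo.yml', 'templates/bar.yml'],
  'templates/foo.yml' => ['templates/x.yml'],
  'templates/bar.yml' => ['templates/x.yml'],
  'templates/x.yml' => ['templates/nested/*.yml'],
  'templates/nested/z.yml' => []
}

shared = Set.new
foo_chain = resolve(includes, 'templates/foo.yml', seen: shared)
bar_chain = resolve(includes, 'templates/bar.yml', seen: shared)
bar_alone = resolve(includes, 'templates/bar.yml')

puts "foo.yml, shared set: #{foo_chain.inspect}"
puts "bar.yml, shared set: #{bar_chain.inspect}"
puts "bar.yml, no dedup:   #{bar_alone.inspect}"
```

In this model the `foo.yml` chain resolves to `templates/x.yml` and `templates/nested/z.yml`, while the `bar.yml` chain that reuses the same set resolves to `templates/x.yml` only. Resolving `bar.yml` without the shared set returns both files again, which matches the 18.5.5 behavior described in this issue.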
### Example Project

* https://gitlab.com/cmutua/nested-yaml-syntax-error
* The README in this project explains the file structure and how the nested includes are set up to trigger the issue.

### What is the current _bug_ behavior?

* When evaluating the pipeline configuration for the example project on **GitLab 18.6.0+** (and on GitLab.com), the pipeline fails to start and the UI reports: `Included file 'templates/bar.yml' does not have valid YAML syntax!`
* This occurs specifically when:
  * Two different included files (`foo.yml` and `bar.yml`) both include a shared file (`x.yml`), and
  * That shared file uses a glob include (`templates/nested/*.yml`) that resolves to one or more additional YAML files (e.g. `z.yml`).
* Internal analysis suggests that the **deduplication check added in** `expand_wildcard_paths` **in** `location_expander.rb` is responsible:
  * The file that contains the glob include (`templates/x.yml`) is already tracked in `expandset` after being processed via one include chain (e.g. `foo.yml → x.yml → templates/nested/*.yml`).
  * When another include chain (`bar.yml → x.yml → templates/nested/*.yml`) is processed, the glob-expanded result (`templates/nested/z.yml`) is incorrectly treated as already expanded because of the earlier run, and is therefore skipped.
  * This leaves the include chain from `bar.yml` effectively empty/broken, resulting in the `does not have valid YAML syntax` error even though the YAML is valid.
* The behavior **worked correctly in GitLab 18.5.5**; the regression is observed starting in **18.6.0**. Replacing `location_expander.rb` in an 18.5.5 instance with the 18.6.0 version reproduces the error in 18.5.5, further confirming the regression is tied to that change.

### What is the expected _correct_ behavior?

* The pipeline configuration should be evaluated successfully when:
  * Two included files both reference a shared file that uses a glob include pattern, and
  * The glob include expands to valid YAML files.
* The deduplication logic should **not** incorrectly suppress glob-expanded includes based on previous expansions in another include chain.
* In this scenario, GitLab should accept the configuration and allow the pipeline to be created, as it did in **18.5.5**.
* No `...does not have valid YAML syntax!` error should be raised as long as all referenced YAML files are syntactically valid.

### Relevant logs and/or screenshots

Pipeline configuration error message (from the GitLab UI):

`Included file 'templates/bar.yml' does not have valid YAML syntax!`

Additional internal analysis of the failure mode (from investigation):

The deduplication check added in `expand_wildcard_paths`, which skips files already present in `expandset`, causes false positives when a file that uses glob includes (e.g., `templates/x.yml`) is itself already tracked in `expandset`. The glob-expanded results (e.g., `templates/nested/z.yml`) get incorrectly suppressed, producing the error:

`Included file 'templates/bar.yml' does not have valid YAML syntax!`

So when processing `bar.yml → x.yml → templates/nested/*.yml`:

1. It searches for files matching `templates/nested/*.yml` and finds `z.yml`.
2. It then checks: "Is `z.yml` already in `expandset`?"
3. Yes, because `foo.yml → x.yml → z.yml` already ran first.
4. So it skips `z.yml`, leaving `bar.yml`'s include chain empty/broken.
5. That empty result is what causes the "does not have valid YAML syntax" error.

### Output of checks

#### Results of GitLab environment info

<details>
<summary>Expand for output related to GitLab environment info</summary>

<pre>

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

</pre>
</details>

#### Results of GitLab application Check

<details>
<summary>Expand for output related to the GitLab application check</summary>
<pre>

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:check SANITIZE=true`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`)

(we will only investigate if the tests are passing)

</pre>
</details>

### Possible fixes

### Patch release information for backports

If the bug fix needs to be backported in a [patch release](https://handbook.gitlab.com/handbook/engineering/releases/patch-releases) to a version under [the maintenance policy](https://docs.gitlab.com/policy/maintenance/), please follow the steps on the [patch release runbook for GitLab engineers](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/patch/engineers.md).

Refer to the [internal "Release Information" dashboard](https://dashboards.gitlab.net/d/delivery-release_info/delivery3a-release-information?orgId=1) for information about the next patch release, including the targeted versions, expected release date, and current status.

#### High-severity bug remediation

To remediate high-severity issues requiring an [internal release](https://handbook.gitlab.com/handbook/engineering/releases/internal-releases/) for single-tenant SaaS instances, refer to the [internal release process for engineers](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/internal-releases/engineers.md?ref_type=heads).