Docs: clarify that patterns in `cache:key:files_commits` are interpreted as git pathspecs
## Problem to solve
In `cache:key:files`, wildcard patterns are interpreted like globs.
In `cache:key:files_commits`, they're interpreted like git pathspecs.
These behave differently, e.g. `**/foo` matches top-level `foo` as a glob but not as a pathspec.
It would be helpful to document this.
I raised a similar issue before: https://gitlab.com/gitlab-org/gitlab/-/work_items/547149. It was closed because there was related work in progress (making `cache:key:files` use content-based hashing, and moving commit-based hashing to `cache:key:files_commits`) and the glob/pathspec mismatch was meant to be taken care of as part of that. That work has been merged, but the glob / pathspec mismatch remains.
When I raised #547149:
* `cache:key:files` called `last_commit_id_for_path(path)`, which delegated to Gitaly, which interpreted the path as a pathspec
Now:
* `cache:key:files` uses content-hashing instead of commit-hashing
* `cache:key:files` supports wildcards, using glob-like matching
* `cache:key:files_commits` uses the old commit-hashing
* `cache:key:files_commits` still calls `last_commit_id_for_path(path)`, as before, so its patterns are still interpreted as pathspecs
Reproducible example:
Suppose your project just has `foo` and `bar` at top-level.
```yaml
job-a:
  cache:
    key:
      files_commits:
        - "**/foo" # no match, because interpreted like pathspec, so cache key is 'default'
    paths:
      - "**/bar" # match, because interpreted like glob
  script:
    - echo "check the logs to see caching behaviour"
```
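
For illustration only, here is a rough Python sketch of the two matching styles. It is an analogy under stated assumptions, not what GitLab or Gitaly actually runs: `fnmatch.fnmatchcase` stands in for git's default pathspec matching (where `*` can match `/`, so `**/foo` still requires a literal `/` before `foo`), and `Path.glob` stands in for glob-like matching (where a leading `**/` also matches zero directories). The helper names and the extra `sub/foo` file are hypothetical, added only to show a nested match.

```python
import fnmatch
import tempfile
from pathlib import Path


def pathspec_like_match(pattern: str, path: str) -> bool:
    """Stand-in for git's default pathspec matching (assumption, not Gitaly code).

    `*` may match `/`, so "**/foo" behaves like "*/foo": a slash is required.
    """
    return fnmatch.fnmatchcase(path, pattern)


def glob_like_matches(root: Path, pattern: str) -> list:
    """Stand-in for glob-like matching: a leading "**/" also matches zero directories."""
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Same layout as the example above, plus a hypothetical nested file.
        (root / "foo").touch()
        (root / "bar").touch()
        (root / "sub").mkdir()
        (root / "sub" / "foo").touch()
        return {
            "pathspec_top_level": pathspec_like_match("**/foo", "foo"),
            "pathspec_nested": pathspec_like_match("**/foo", "sub/foo"),
            "glob": glob_like_matches(root, "**/foo"),
        }


if __name__ == "__main__":
    print(demo())
```

In this sketch the pathspec-like matcher rejects top-level `foo` but accepts `sub/foo`, while the glob-like matcher returns both, which mirrors the mismatch between `files_commits` and `paths` in the job above.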
## Proposal
Clarify in the docs that `cache:key:files_commits` patterns are interpreted as git pathspecs.
(Or change the implementation to interpret them as globs, consistent with `cache:key:files`, `cache:paths`, and other keywords that accept patterns.)
## Who can address the issue
Anyone
## Other links/references
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/203233
* `cache:key:files` switched from commit- to content-hashing
* `cache:key:files` paths are matched literally (no wildcard support)
* `cache:key:files_commits` added, using commit-hashing
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/209633
* added glob-like wildcard support to `cache:key:files`
* fixed a bug when matching wildcards, so `**/foo` matches at top-level
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/211084
* reverted !209633 because it allowed a file to include itself
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/211424
* better version of !209633
* added glob-like wildcard support to `cache:key:files` as in !209633
* but also prevented a file including itself