Improve artifact/cache path matching behaviour
Description
Previous behaviour
- `filepath.Glob()` is used (`**` is not supported).
- `filepath.Walk()` walks on returned glob matches.
- `Glob(".")` and `Glob("./")` return `.` and `./`, so the current directory is also walked.
- Directories and files are added recursively. The recursive behaviour is similar to `gitignore`'s behaviour.
- Paths provided can be absolute, and the behaviour around this is a little odd. For example, supplying `/` or `C:/` will walk the filesystem, but complain on every single file that it is "outside of the build directory", until it finally enters the build directory. In some cases, walking is so slow that a job can easily expire just performing this task.
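The first and third points above can be checked with the standard library alone. This is a minimal sketch, not the runner's code: `makeTree` and `stdlibGlob` are hypothetical helpers, and the file names are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// makeTree creates a small, hypothetical build directory containing
// a.txt, sub/b.txt and sub/deep/c.txt, and returns its root.
func makeTree() (string, error) {
	root, err := os.MkdirTemp("", "build")
	if err != nil {
		return "", err
	}
	for _, rel := range []string{"a.txt", "sub/b.txt", "sub/deep/c.txt"} {
		full := filepath.Join(root, filepath.FromSlash(rel))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return "", err
		}
		if err := os.WriteFile(full, nil, 0o644); err != nil {
			return "", err
		}
	}
	return root, nil
}

// stdlibGlob runs filepath.Glob for pattern under root and returns the
// matches relative to root, using forward slashes.
func stdlibGlob(root, pattern string) ([]string, error) {
	matches, err := filepath.Glob(filepath.Join(root, filepath.FromSlash(pattern)))
	if err != nil {
		return nil, err
	}
	rels := make([]string, 0, len(matches))
	for _, m := range matches {
		rel, err := filepath.Rel(root, m)
		if err != nil {
			return nil, err
		}
		rels = append(rels, filepath.ToSlash(rel))
	}
	return rels, nil
}

func main() {
	root, err := makeTree()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	// "**" is not special to filepath.Glob: it behaves like "*" and spans
	// exactly one path element, so only sub/b.txt matches.
	deep, _ := stdlibGlob(root, "**/*.txt")
	fmt.Println(deep) // [sub/b.txt]

	// A pattern without wildcards is returned unchanged when it exists,
	// which is why the current directory ends up being walked.
	dot, _ := filepath.Glob(".")
	fmt.Println(dot) // [.]
}
```

Neither `a.txt` nor `sub/deep/c.txt` is matched by `**/*.txt` here, which is the gap that `doublestar` closes.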
Current behaviour
- `doublestar.Glob()` is used (`**` is supported).
- `filepath.Walk()` walks on returned glob matches.
- `Glob(".")` and `Glob("./")` return nothing, so the current directory is not walked.
- Directories and files are still added recursively. Now that we support `**`, this might be slightly different behaviour than the user expects (it's not shell-like), but isn't that odd as `gitignore` is also recursive and supports `**`.
- Paths provided can still be absolute.
`exclude`s are now also supported, however:
- They're not recursive.
- Absolute paths cannot be provided.
The following configurations would still add binaries/*.
artifacts:
  paths:
    # recursive, so anything inside will match
    - binaries/
  exclude:
    # not recursive, so anything inside won't match.
    - binaries/

artifacts:
  paths:
    # included because we're inside of the build directory
    - C:\<build_directory>\binaries\
  exclude:
    # recursive due to '**', but will not work because the pattern is matched relative to the build directory.
    - C:\binaries\**
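To illustrate why neither exclude above removes anything, here is a rough sketch of a non-recursive match against a path relative to the build directory. It uses the standard library's `path.Match` as a stand-in for the runner's matcher (so `**` is not modelled), and `excluded` is a hypothetical helper, not a function from the runner.

```go
package main

import (
	"fmt"
	"path"
)

// excluded reports whether relPath (relative to the build directory, with
// forward slashes) is matched in full by any exclude pattern. There is no
// implicit recursion: matching a parent directory is not enough.
func excluded(patterns []string, relPath string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, relPath); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	file := "binaries/app.exe" // hypothetical file inside the build directory

	// The directory pattern does not cover files inside it.
	fmt.Println(excluded([]string{"binaries/"}, file)) // false

	// An absolute pattern can never equal a relative path.
	fmt.Println(excluded([]string{"C:/binaries/*"}, file)) // false

	// Only a relative pattern that covers the whole path matches.
	fmt.Println(excluded([]string{"binaries/*"}, file)) // true
}
```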
The problem
- Users might expect shell-like behaviour, but recursion is always enabled. This isn't a big deal, as it's very similar to `gitignore` and that likely makes sense.
- Excludes don't work the same way as includes (not recursive, matched only on relative paths).
- `.` and `./` support was (accidentally?) removed.
- `filepath.Walk` is slow. We have to do it several times, and now we're also globbing (also similarly slow at walking the filesystem).
Proposal
- Make `exclude`s recursive.
- Make `paths` (includes) and `exclude`s relative only. If we can only ever include files within the build directory, why allow paths from outside of the build directory?
  - We could try to solve this situation on the config's behalf, by converting an absolute path to relative before we glob/match, but there are issues with that:
    - With `gitignore`/`wildmatch`, a `/` is used to indicate the root of the repository (in our case, the build directory). This is more important than it first seems, because `files`, `/files`, `files/` and `/files/` have different meanings.
    - I don't believe `filepath.Rel()` is case-insensitive on case-insensitive filesystems. So it might not work if the build directory is `/BUILD_DIR` but the pattern provided was `/build_dir`.
    - Given the issues around absolute directories already, it's likely a rarely used feature as it is. Maybe we can just assume the path is relative moving forward, document that it is, and if a user supplies an absolute path, it simply won't match.
- Bring back support for `.` and `./`?
- Traversing the filesystem should only occur once, and we match as we go.
- Carry on with `gitignore`-like behaviour (recursive), but potentially we can make it more like git's `wildmatch` in the future.
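One possible shape for the proposal, as a sketch under stated assumptions rather than the intended implementation: walk the build directory once, match relative paths as we go, treat includes and excludes as recursive, and let absolute patterns simply never match. `matchRecursive`, `collect` and `demoTree` are hypothetical names, and `path.Match` stands in for a real `**`-aware matcher.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"path/filepath"
)

// matchRecursive reports whether pattern matches rel or any parent directory
// of rel, which gives gitignore-like recursion. Patterns starting with "/"
// never match; Windows drive paths fail to match a relative path anyway.
func matchRecursive(pattern, rel string) bool {
	pattern = path.Clean(pattern) // "./" becomes ".", "binaries/" becomes "binaries"
	if path.IsAbs(pattern) {
		return false
	}
	if pattern == "." {
		return true // the build directory itself: everything is inside it
	}
	for p := rel; p != "."; p = path.Dir(p) {
		if ok, err := path.Match(pattern, p); err == nil && ok {
			return true
		}
	}
	return false
}

// collect walks root exactly once and returns the files (relative to root)
// that match an include and no exclude.
func collect(root string, includes, excludes []string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		rel = filepath.ToSlash(rel)
		if rel == "." {
			return nil
		}
		for _, ex := range excludes {
			if matchRecursive(ex, rel) {
				if d.IsDir() {
					return filepath.SkipDir // prune the excluded subtree
				}
				return nil
			}
		}
		if d.IsDir() {
			return nil
		}
		for _, in := range includes {
			if matchRecursive(in, rel) {
				files = append(files, rel)
				break
			}
		}
		return nil
	})
	return files, err
}

// demoTree creates a hypothetical build directory for the example.
func demoTree() (string, error) {
	root, err := os.MkdirTemp("", "build")
	if err != nil {
		return "", err
	}
	for _, rel := range []string{"binaries/app.exe", "binaries/debug/app.pdb", "docs/readme.md"} {
		full := filepath.Join(root, filepath.FromSlash(rel))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return "", err
		}
		if err := os.WriteFile(full, nil, 0o644); err != nil {
			return "", err
		}
	}
	return root, nil
}

func main() {
	root, err := demoTree()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)
	files, err := collect(root, []string{"binaries/"}, []string{"binaries/debug"})
	fmt.Println(files, err) // [binaries/app.exe] <nil>
}
```

Because excluded directories are pruned with `filepath.SkipDir`, an exclude also reduces the amount of filesystem traversal, which a glob-then-walk approach cannot do.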
Edited by Darren Eastman