Improve artifact/cache path matching behaviour

Description

Previous behaviour

  • filepath.Glob() (** is not supported)
  • filepath.Walk() walks on returned glob matches.
  • Glob(".") and Glob("./") return . and ./, so the current directory is also walked.
  • Directories and files are added recursively. The recursive behaviour is similar to gitignore's behaviour.
  • Paths provided can be absolute, but the behaviour is odd. For example, supplying / or C:/ walks the whole filesystem and complains that every single file is "outside of the build directory" until the walk finally enters the build directory. In some cases the walk is so slow that a job can easily expire while performing this task alone.
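
The first and third points can be checked with a short Go sketch. The helper name globIn and the file layout are made up for illustration; only filepath.Glob itself comes from the description above. With the standard library, ** behaves like a single *, so it matches exactly one directory level.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// globIn (hypothetical helper) creates the given files under a fresh temporary
// directory, runs filepath.Glob for pattern inside it, and returns the matches
// as slash-separated paths relative to that directory.
func globIn(files []string, pattern string) ([]string, error) {
	root, err := os.MkdirTemp("", "glob-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, f := range files {
		full := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(full, nil, 0o644); err != nil {
			return nil, err
		}
	}

	matches, err := filepath.Glob(filepath.Join(root, filepath.FromSlash(pattern)))
	if err != nil {
		return nil, err
	}
	rel := make([]string, 0, len(matches))
	for _, m := range matches {
		r, err := filepath.Rel(root, m)
		if err != nil {
			return nil, err
		}
		rel = append(rel, filepath.ToSlash(r))
	}
	return rel, nil
}

func main() {
	files := []string{"top.txt", "a/x.txt", "a/b/y.txt"}

	// "**" is treated like "*": only files exactly one directory down match.
	matches, err := globIn(files, "**/*.txt")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(matches) // [a/x.txt]

	// A pattern without wildcards is returned as-is when the path exists,
	// which is why the current directory ended up being walked.
	dot, _ := filepath.Glob(".")
	fmt.Println(dot) // [.]
}
```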

Current behaviour

  • doublestar.Glob() (** is supported)
  • filepath.Walk() walks on returned glob matches.
  • Glob(".") and Glob("./") return nothing, so the current directory is not walked.
  • Directories and files are still added recursively. Now that ** is supported, this may differ slightly from what the user expects (it's not shell-like), but it isn't that odd, as gitignore is also recursive and supports **.
  • Paths provided can still be absolute.

Excludes are now also supported, however:

  • They're not recursive.
  • Absolute paths cannot be provided.

The following configurations would still add binaries/*.

artifacts:
  paths:
    # recursive, so anything inside will match
    - binaries/
  exclude:
    # not recursive, so anything inside won't match.
    - binaries/

artifacts:
  paths:
    # included because we're inside of the build directory
    - C:\<build_directory>\binaries\
  exclude:
    # recursive due to '**', but will not work because the pattern is matched relative to the build directory.
    - C:\binaries\**
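
The asymmetry in the two configurations above can be sketched in Go as follows. This is only a model of the described behaviour, not the runner's code: path.Match stands in for the real matcher (an assumption), the function names matchExact and matchRecursive are hypothetical, and forward slashes are used because a backslash is an escape character in path.Match patterns.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchExact reports whether pattern matches rel itself and nothing below it.
// This models a non-recursive exclude. path.Match is a stand-in matcher.
func matchExact(pattern, rel string) bool {
	ok, err := path.Match(strings.TrimSuffix(pattern, "/"), rel)
	return err == nil && ok
}

// matchRecursive reports whether pattern matches rel or any of its parent
// directories. This models the recursive behaviour of paths (includes).
func matchRecursive(pattern, rel string) bool {
	for p := path.Clean(rel); p != "." && p != "/"; p = path.Dir(p) {
		if matchExact(pattern, p) {
			return true
		}
	}
	return false
}

func main() {
	// A file path relative to the build directory.
	const file = "binaries/linux/app"

	// Include: the parent directory "binaries" matches, so the file is added.
	fmt.Println(matchRecursive("binaries/", file)) // true

	// Exclude: only the path itself is tested, so the file is not excluded.
	fmt.Println(matchExact("binaries/", file)) // false

	// An absolute exclude pattern never matches a relative path.
	fmt.Println(matchExact("C:/binaries/**", file)) // false
}
```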

The problem

  • Users might expect shell-like behaviour, but recursion is always enabled. This isn't a big deal, as it's very similar to gitignore and that likely makes sense.
  • Excludes don't work the same way as includes: they are not recursive and are matched only against relative paths.
  • . and ./ support was (accidentally?) removed.
  • filepath.Walk is slow. We have to walk several times, and now we're also globbing, which is similarly slow because it walks the filesystem too.

Proposal

  • Make excludes recursive.
  • Make paths (includes) and excludes relative only. If we can only ever include files within the build directory, why allow paths from outside of the build directory?
    • We could try to solve this situation on the config's behalf by converting an absolute path to a relative one before we glob/match, but there are issues with that:
      • With gitignore/wildmatch, a / is used to indicate the root of the repository (in our case, the build directory). This is more important than it first seems, because files, /files, files/ and /files/ have different meanings.
      • I don't believe filepath.Rel() is case-insensitive on case-insensitive filesystems. So it might not work if the build directory is /BUILD_DIR but the pattern provided was /build_dir.
      • Given the existing issues with absolute paths, this is likely a rarely used feature already. Maybe we can just assume the path is relative moving forward, document that it is, and if a user supplies an absolute path, it simply won't match.
  • Bring back support for . and ./?
  • Traversing the filesystem should only occur once, and we match as we go.
  • Carry on with gitignore-like behaviour (recursive), but potentially we can make it more like git's wildmatch in the future.
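
A minimal sketch of what the proposal could look like in Go, under several assumptions: path.Match stands in for the real matcher (so ** is not recursive here), the names matchAny, collect and demo are hypothetical, and treating "." and "./" as "match everything" is just one possible answer to the open question above. The filesystem is traversed once, includes and excludes are both recursive and relative to the build directory, and an absolute pattern simply never matches.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"path/filepath"
)

// matchAny reports whether any pattern matches rel or one of its parent
// directories, which makes every pattern recursive (gitignore-like).
func matchAny(patterns []string, rel string) bool {
	for _, pat := range patterns {
		pat = path.Clean(pat) // "binaries/" -> "binaries", "./" -> "."
		if pat == "." {
			return true // assumption: "." and "./" select the whole build directory
		}
		for p := rel; p != "." && p != "/"; p = path.Dir(p) {
			if ok, err := path.Match(pat, p); err == nil && ok {
				return true
			}
		}
	}
	return false
}

// collect walks root exactly once and returns the slash-separated relative
// paths of files matched by includes and not matched by excludes.
func collect(root string, includes, excludes []string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		rel = filepath.ToSlash(rel)
		if rel == "." {
			return nil // the root itself is never a candidate
		}
		if matchAny(excludes, rel) {
			if d.IsDir() {
				return fs.SkipDir // prune the excluded subtree instead of walking it
			}
			return nil
		}
		if !d.IsDir() && matchAny(includes, rel) {
			files = append(files, rel)
		}
		return nil
	})
	return files, err
}

// demo builds a small temporary tree and runs collect against it.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "build-dir")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, f := range []string{"binaries/app", "binaries/debug/app.pdb", "docs/readme.md", "logs/out.log"} {
		full := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(full, nil, 0o644); err != nil {
			return nil, err
		}
	}
	return collect(root, []string{"binaries/", "docs"}, []string{"binaries/debug/"})
}

func main() {
	files, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(files) // [binaries/app docs/readme.md]
}
```

Returning fs.SkipDir for an excluded directory means the excluded subtree is never read at all, which speaks to the performance concern as well as the matching one.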
Edited by Darren Eastman