Draft: Implement clcache in MSBuild CI jobs

The average MSBuild RelWithDebInfo job is either barely passing the 2 hour timeout, or barely failing the 2 hour timeout.

This MR adds my own fork of clcache, a compiler cache for MSVC much like ccache, to the MSBuild CI jobs. Populating the cache pushes a build that starts with an empty cache past the overall timeout, but the new build stage timeout of 1h40m guarantees 20 minutes of after_script time to upload artifacts and the cache. Subsequent jobs, which start from a populated cache, succeed in as little as 49 minutes.

Examples:

Clcache:

I chose clcache initially because it was designed for MSVC. Ccache didn't historically have MSVC support, and it still seems to treat it as a second-class citizen. Clcache is no longer seeing active development, so I had to fork it to merge patches for Python versions >=3.10 and MSVC >=2019. My fork also contains changes that could allow precompiled header caching, but I haven't been able to get this to work in a stable way.

I also explored sccache and ccache. Both seem to have issues with their MSVC implementation. Ccache precompiled header support doesn't work with MSVC, and the cache hits seem to be just as slow as misses. I've found clcache to be the most stable option, but it would ultimately be better to switch to ccache (as the better-supported tool) if we could get comparable speeds somehow.

As this is my own fork and not a package, I've targeted the specific commit in my repository. A better solution would be to have OpenMW fork the repository and target that instead.

Caveats:

  • Script time capped to 1h40m of the 2hr job time. This isn't enough for an uncached build, but it's essential to leave 20 minutes for uploading cache and artifacts. As a nice bonus, build logs are always available as artifacts.
  • /Z7 is required and precompiled headers are disabled. This means build artifacts are larger and exceed the size cap when zipped. Artifacts are now compressed as 7z at level 5 (-mx=5) to solve this, which takes longer, but the extra time is inconsequential compared to the time savings of a cached build. I've updated the compress symbols job to handle 7z format as well.

Breakdown:

if compgen -G "*sym_store.zip" > /dev/null; then
  unzip -d sym_store *sym_store.zip
elif compgen -G "*sym_store.7z" > /dev/null; then
  apt-get install -y p7zip-full
  7z x *sym_store.7z -osym_store
fi

Handle 7z format in the symbols compression job.

- choco install git --no-progress --force --params "/GitAndUnixToolsOnPath" -y --install-arguments="'/DIR=C:\Git'"
- choco install 7zip --no-progress -y
- choco install vswhere --no-progress -y
- choco install python --no-progress -y
- choco install awscli --no-progress -y --version=2.22.35

--no-progress prevents Chocolatey from filling the log with thousands of useless lines. --install-arguments="'/DIR=C:\Git'" solves a bug with our Git installation. The installed version can't be used from PATH, because the VM has tools pre-installed at C:\Git, and this location supersedes the choco install in the environment variable. Installing to the correct folder updates the existing installation and provides the newer version to the rest of the job.
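
The PATH behaviour described above can be illustrated with a minimal sketch: the first directory on PATH that contains the executable wins, so a newer copy installed further down the list is never picked up. The second directory and the `installed` mapping are hypothetical and only stand in for wherever Chocolatey would install Git by default; only C:\Git comes from the description above.

```python
from pathlib import PureWindowsPath

def resolve_on_path(command, path_dirs, installed):
    """Return the first match for `command`, scanning directories in PATH order."""
    for directory in path_dirs:
        if command in installed.get(directory, set()):
            return str(PureWindowsPath(directory) / command)
    return None

# C:\Git is the pre-installed location mentioned in the description.
# The second entry is a hypothetical default install location (assumption).
path_dirs = [r"C:\Git", r"C:\Program Files\Git"]
installed = {
    r"C:\Git": {"git.exe"},               # older, pre-installed copy
    r"C:\Program Files\Git": {"git.exe"},  # newer copy that would be shadowed
}

# The earlier PATH entry shadows the later one, which is why the MR
# installs into C:\Git instead of a second location.
print(resolve_on_path("git.exe", path_dirs, installed))
```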

- $env:CLCACHE_LOG=1
- $env:CLCACHE_COMPRESS=1
- $env:CLCACHE_DIR="$(Get-Location)\clcache"
- python -m pip install git+https://github.com/Aussiemon/clcache.git@21c50c2a699874243f7500367eb0dcce8805eb01
- $clcacheToolPath = Join-Path (Split-Path -Parent (Get-Command python).Source) "Scripts\"
- Set-Alias clcache (Join-Path $clcacheToolPath "clcache.exe")
- clcache -M 1610612736
- clcache -z
- clcache -s

  • CLCACHE_LOG enables some minor logging in clcache. My fork splits off the verbose logging into a separate variable. I'd like to eventually split this off into a build log.
  • CLCACHE_COMPRESS takes the cache size down to 750MB instead of the 6GB I was seeing previously.
  • CLCACHE_DIR sets the cache location.
  • python -m pip install ... installs my fork of clcache as an executable in <python>\Scripts\.
  • clcache -M 1610612736 sets the maximum size of the cache to 1.5GB, which seems reasonable. It could probably be set lower without issue.
  • clcache -z zeroes the statistics at the start of the build.
  • clcache -s prints cache statistics before starting.
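
For anyone adjusting the limit, here is a small sketch of where the 1610612736 figure comes from, assuming the value passed to clcache -M is a byte count and that "1.5GB" means 1.5 GiB (binary units). The helper name is made up for illustration.

```python
GIB = 1024 ** 3  # bytes per GiB
MIB = 1024 ** 2  # bytes per MiB

def bytes_for_gib(gib):
    """Convert a size in GiB to a whole number of bytes (hypothetical helper)."""
    return int(gib * GIB)

cache_limit = bytes_for_gib(1.5)
print(cache_limit)         # 1610612736
print(cache_limit // MIB)  # 1536
```

With compression enabled the cache was observed at roughly 750MB, so a 1 GiB limit (bytes_for_gib(1) in this sketch) would likely still leave headroom.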

cmake --build . --config $config --target $targets -- /p:CLToolExe="clcache.exe" /p:CLToolPath="$clcacheToolPath" /p:TrackFileAccess=false

Sets required MSBuild parameters for using clcache.

7z a -t7z -mx=5 "..\..\$(Make-SafeFileName("OpenMW_MSVC2022_64_${config}_${CI_COMMIT_REF_NAME}_${CI_JOB_ID}_symbols.7z"))" '*.pdb' CI-ID.txt

Compresses artifacts as 7z at compression level 5 (-mx=5). The level is somewhat overkill (the 7z archives are smaller than the old zip archives even though the uncompressed input is larger), but it gives us headroom for future size increases. The increased compression time is small compared to the overall time savings.


BEFORE:
Creating archive: ..\..\OpenMW_MSVC2022_64_RelWithDebInfo_werewolf-lua-api_12201901068_symbols.zip
Add new data to archive: 18 files, 1265201269 bytes (1207 MiB)
Files read from disk: 18
Archive size: 247927069 bytes (237 MiB)
AFTER:
Creating archive: ..\..\OpenMW_MSVC2022_64_RelWithDebInfo_letsAttemptCLCacheAgain_12262892436_symbols.7z
Add new data to archive: 18 files, 1838297212 bytes (1754 MiB)
Files read from disk: 18
Archive size: 247927069 bytes (237 MiB)

BEFORE:
Creating archive: ..\OpenMW_MSVC2022_64_RelWithDebInfo_werewolf-lua-api_12201901068_sym_store.zip
Add new data to archive: 69 folders, 40 files, 1379579695 bytes (1316 MiB)
Files read from disk: 40
Archive size: 297481063 bytes (284 MiB)
AFTER:
Creating archive: ..\OpenMW_MSVC2022_64_RelWithDebInfo_letsAttemptCLCacheAgain_12262892436_sym_store.7z
Add new data to archive: 69 folders, 40 files, 1952634268 bytes (1863 MiB)
Files read from disk: 39
Archive size: 265515826 bytes (254 MiB)

BEFORE:
Creating archive: ..\..\OpenMW_MSVC2022_64_RelWithDebInfo_werewolf-lua-api.zip
Add new data to archive: 62 folders, 492 files, 237174787 bytes (227 MiB)
Files read from disk: 492
Archive size: 80396297 bytes (77 MiB)
AFTER:
Creating archive: ..\..\OpenMW_MSVC2022_64_RelWithDebInfo_letsAttemptCLCacheAgain.7z
Add new data to archive: 58 folders, 489 files, 237128831 bytes (227 MiB)
Files read from disk: 488
Archive size: 50191381 bytes (48 MiB)
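
To make the comparison easier to read, the sketch below computes the archive size as a fraction of the input size for two of the before/after pairs above. The byte counts are copied from the logs; the dictionary keys and the helper name are labels chosen for this example only.

```python
def ratio(archive_bytes, input_bytes):
    """Archive size as a fraction of the uncompressed input size."""
    return archive_bytes / input_bytes

# (input bytes, archive bytes), copied from the 7z output above.
logs = {
    "sym_store": {
        "before (zip)": (1379579695, 297481063),
        "after (7z)": (1952634268, 265515826),
    },
    "main archive": {
        "before (zip)": (237174787, 80396297),
        "after (7z)": (237128831, 50191381),
    },
}

for name, runs in logs.items():
    for label, (input_bytes, archive_bytes) in runs.items():
        print(f"{name}, {label}: {ratio(archive_bytes, input_bytes):.1%} of input size")
```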

- $env:CLCACHE_DIR="$(Get-Location)\clcache"
- Set-Alias clcache (Join-Path (Split-Path -Parent (Get-Command python).Source) "Scripts\clcache.exe")
- try {clcache -s} catch {Write-Host $_.Exception.Message}

Sets up clcache in the after_script so that we can print the statistics (if they exist) after the build has succeeded or failed.

# Delete unneeded files to free up space for archiving and caching
- try {(Get-ChildItem 'MSVC2022_64' -Recurse -File -ErrorAction Continue | Where-Object {-not ($_.FullName -match 'MSVC2022_64\\deps\\Qt')} -ErrorAction Continue | Where-Object {$_.Extension -notin '.zip', '.7z', '.log'} -ErrorAction Continue | Remove-Item -Force -ErrorAction Continue)} catch {Write-Host $_.Exception.Message}

A common issue with our CI build is that the VM runs out of disk space, sometimes even after the build has otherwise finished successfully. This line ensures that enough space is free to upload the cache and artifacts. GitLab seems to stage these objects before the upload, which requires free space. The script deletes every file in the MSVC2022_64 folder that isn't under 'deps\Qt' and isn't a .zip, .7z, or .log file.
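
The filter is easier to reason about as a single predicate. The following is a sketch of the same rule in Python, not the code the job runs; the example paths are hypothetical. Matching is case-insensitive here because PowerShell's -match and -notin are case-insensitive by default.

```python
import re
from pathlib import PureWindowsPath

KEEP_EXTENSIONS = {".zip", ".7z", ".log"}
# Same pattern as the PowerShell -match expression.
QT_DEPS = re.compile(r"MSVC2022_64\\deps\\Qt", re.IGNORECASE)

def should_delete(path):
    """True if the cleanup step would remove this file."""
    if QT_DEPS.search(path):
        return False  # Qt dependencies are kept for the msbuild cache key
    return PureWindowsPath(path).suffix.lower() not in KEEP_EXTENSIONS

# Hypothetical paths, for illustration only.
for example in [
    r"C:\builds\openmw\MSVC2022_64\components\foo.obj",
    r"C:\builds\openmw\MSVC2022_64\deps\Qt\bin\Qt6Core.dll",
    r"C:\builds\openmw\MSVC2022_64\build.log",
]:
    print(example, "->", "delete" if should_delete(example) else "keep")
```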

- key: msbuild-2022-v13
  paths:
    - deps
    - MSVC2022_64/deps/Qt
- key: clcache
  paths:
    - clcache
  when: always

The clcache cache key is defined separately and is uploaded regardless of whether the job succeeds or fails. This is essential, as the cache needs to upload on timeout so that it is initialized for future jobs. The behavior for msbuild-2022-v13 is unchanged.

RUNNER_SCRIPT_TIMEOUT: 1h40m

This line caps the script at 1 hour and 40 minutes. That is below the typical uncached build time, but the remaining 20 minutes of the 2 hour job timeout are needed for uploading the cache. See: set-script-and-after_script-timeouts.
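
The time budget can be checked with a few lines of arithmetic. This sketch only uses the figures stated in this description (2 hour job timeout, 1h40m script timeout, 49 minute cached build); the function name is made up for illustration.

```python
from datetime import timedelta

JOB_TIMEOUT = timedelta(hours=2)                 # overall job timeout
SCRIPT_TIMEOUT = timedelta(hours=1, minutes=40)  # RUNNER_SCRIPT_TIMEOUT
CACHED_BUILD = timedelta(minutes=49)             # fastest cached job seen so far

def after_script_budget(job_timeout, script_timeout):
    """Time left for after_script once the script stage hits its cap."""
    return job_timeout - script_timeout

budget = after_script_budget(JOB_TIMEOUT, SCRIPT_TIMEOUT)
print(budget)                         # 0:20:00
print(CACHED_BUILD < SCRIPT_TIMEOUT)  # True
```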

Edited by Aussiemon
