Delete cache dirs after failed extraction

Matthew Bradburn requested to merge mbradburn-clean-up-failed-cache into main

What does this MR do?

It adds a feature flag which, when enabled, causes job build scripts to remove cache directories after a failed cache extraction.

I wasn't very familiar with the different cache configurations we support. The set of options is fairly complicated, which makes this change more complicated as well. As I understand it, the behavior is: if we fail to extract the cache from the first key, we remove the cache path directories and then attempt to extract from the first fallback cache key. If that also fails, we remove the same set of directories again, and so on for each remaining fallback key. We do this for each configured cache.
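To make the intended control flow easier to review, here is a minimal sketch of the extract, clean up, then fall back loop described above. It is not the runner's code: the names `restore_cache`, `extract`, and `cleanup_enabled` are hypothetical, the real change emits shell commands into the job build script rather than running Python, and `cleanup_enabled` only stands in for the new feature flag.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable, Optional, Sequence


def restore_cache(
    keys: Sequence[str],
    cache_paths: Iterable[Path],
    extract: Callable[[str], bool],
    cleanup_enabled: bool = True,
) -> Optional[str]:
    """Try each cache key in order (primary first, then fallbacks).

    `extract` is a hypothetical callback that returns True on success.
    Returns the key that extracted successfully, or None if all failed.
    """
    paths = list(cache_paths)
    for key in keys:
        if extract(key):
            return key
        if cleanup_enabled:
            # A failed extraction may have written some files already,
            # so remove every cache path before trying the next key.
            for path in paths:
                shutil.rmtree(path, ignore_errors=True)
    return None
```

With several configured caches, this function would be called once per cache, each with its own key list and paths. With `cleanup_enabled=False` the loop still falls back through the keys but leaves any partial output in place, which corresponds to the behavior with the flag disabled.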

Why was this MR needed?

In certain circumstances, cache extraction could make partial progress before dying, which could leave an inconsistent cache directory behind.

What's the best way to test this MR?

There might be a better way than how I was testing it; please clue me in. I instrumented the code so I could examine the generated build job script, and then experimented with intentionally corrupted cache archives.
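For anyone who wants to reproduce the failure mode, the following is a rough sketch of one way to build an archive that extracts partially and then fails. It assumes a zip-format archive and uses Python's `zipfile` as a stand-in extractor, not the runner's own extraction code; `make_corrupt_archive` and `try_extract` are hypothetical helpers, and the file names are made up.

```python
import zipfile
from pathlib import Path


def make_corrupt_archive(archive: Path) -> None:
    """Write a two-entry zip, then damage the second entry's data."""
    # ZIP_STORED keeps the payload uncompressed, so it can be found
    # and altered in the raw bytes without shifting any offsets.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("a.txt", "first entry, intact")
        zf.writestr("b.txt", "second entry, to be damaged")
    raw = archive.read_bytes()
    # Same-length replacement: the stored CRC no longer matches.
    archive.write_bytes(raw.replace(b"to be damaged", b"TO BE DAMAGED", 1))


def try_extract(archive: Path, dest: Path) -> bool:
    """Return True if every entry extracted, False on a bad archive."""
    try:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        return True
    except zipfile.BadZipFile:
        return False
```

Extracting such an archive writes `a.txt` before the CRC check on `b.txt` fails, so the destination directory is left half populated. That is the kind of inconsistent state this MR is meant to clean up.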

What are the relevant issue numbers?

#36988 (closed)

Edited by Matthew Bradburn