Hanging tests due to exceptions in artifact cache

Summary

This probably only affects developers, but it is very annoying. Exceptions raised in the artifact cache cause the BuildStream tests to lock up. The hung tests cannot be interrupted with ^C; they have to be killed with kill -9 from another terminal.

Steps to reproduce

Simulate a missing file in CASCache._get_subdir, like this:

diff --git a/buildstream/_artifactcache/cascache.py b/buildstream/_artifactcache/cascache.py
index 9ca757d4..29cb84a0 100644
--- a/buildstream/_artifactcache/cascache.py
+++ b/buildstream/_artifactcache/cascache.py
@@ -797,6 +797,7 @@ class CASCache():
 
     def _get_subdir(self, tree, subdir):
         head, name = os.path.split(subdir)
+        raise CASError("Subdirectory {} not found".format(name))
         if head:
             tree = self._get_subdir(tree, head)

(This exception is normally raised at the end of this function, and can genuinely occur if a badly-constructed artifact has been placed in the cache.)

Now run the test:

./setup.py test --addopts "tests/artifactcache/pull.py::test_pull --integration -s"

The test will hang at 'pull'.
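
While the hang is being investigated, it may help to run the reproduction under a wrapper that kills it after a timeout, rather than reaching for kill -9 in a second terminal. The following is a minimal sketch for POSIX systems only; run_with_timeout is a hypothetical helper, not part of BuildStream, and the timeout value is arbitrary.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout):
    """Run cmd (a list of arguments) and SIGKILL it if it exceeds timeout seconds.

    Hypothetical helper, POSIX only. Returns the exit code, or None if the
    command had to be killed.
    """
    # start_new_session=True puts the child in its own process group, so the
    # whole group can be killed in one go. Children that start their own
    # session would escape this.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The process group ID equals the child's PID in a new session.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the killed child
        return None
```

Passing the test command above as an argument list, with a timeout of a few minutes, should return None once the hang is hit.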

What is the current bug behavior?

Tests lock up and can only be cleared with SIGKILL.

What is the expected correct behavior?

Details of the exception being raised are visible to the tester.

Possible fixes

ExitStack in tests/testutils/runcli.py is likely to be relevant, although we can't simply remove it. In my case, I tracked down the underlying exception by temporarily removing ExitStack from Cli.run, which allowed the exception text and backtrace to appear in the test output.
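
As a rough illustration of a less drastic change than removing ExitStack, the sketch below prints the traceback before the stack unwinds, so the error is visible even if a later cleanup step blocks. This is not the code in Cli.run; run_with_visible_errors and its arguments are hypothetical names used only for this example, and whether this would actually avoid the hang is untested.

```python
import traceback
from contextlib import ExitStack


def run_with_visible_errors(func, *cleanups):
    """Call func inside an ExitStack, reporting any exception before cleanup.

    Hypothetical helper: cleanups are zero-argument callables standing in
    for whatever the real ExitStack manages.
    """
    with ExitStack() as stack:
        for cleanup in cleanups:
            stack.callback(cleanup)
        try:
            return func()
        except Exception:
            # Print the traceback now, while the stack is still open, then
            # re-raise so the caller still sees the failure.
            traceback.print_exc()
            raise
```

The exception still propagates after the callbacks have run, so a test using this would fail rather than pass silently.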

  • BuildStream version affected: /milestone %BuildStream_v1.3