Artifact cache expiry relies on granular mtimes
Summary
When artifacts are produced in very quick succession, a later expiry operation can process them in the wrong order if the underlying file system does not support sub-second mtimes.
This is probably exceptionally rare in practice and, even when it happens, causes no particular harm, but it is at least an annoyance for developers using regular GitLab runners (i.e., not our special fancy ones).
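To illustrate the failure mode, here is a minimal sketch that assumes expiry simply removes the artifact with the oldest mtime first. The creation times and the `expiry_order` helper are made up for illustration; they are not BuildStream's actual implementation. Once timestamps are truncated to whole seconds, artifacts built within the same second tie, and their relative order is no longer determined by build order.

```python
# Hypothetical creation times (in seconds) for artifacts built in quick succession.
created = {"dep.bst": 100.10, "unrelated.bst": 100.40, "target.bst": 100.70}


def expiry_order(mtimes):
    # Oldest first. Ties are broken by name purely to make this sketch
    # deterministic; a real cache could break them in any order.
    return sorted(mtimes, key=lambda name: (mtimes[name], name))


# With sub-second mtimes the build order is preserved.
precise = expiry_order(created)

# With whole-second mtimes every artifact appears to have the same age.
coarse = expiry_order({name: float(int(t)) for name, t in created.items()})

print(precise)
print(coarse)
```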
Steps to reproduce
- Use a system whose file system does not support sub-second mtimes
- Run the test suite's expiry ordering test
What is the current bug behavior?
The test will fail (most of the time)
What is the expected correct behavior?
The test should pass
Relevant logs and/or screenshots
> assert (tuple(cli.get_element_state(project, element) for element in
('unrelated.bst', 'target.bst', 'target2.bst', 'dep.bst', 'expire.bst')) ==
('buildable', 'buildable', 'buildable', 'cached', 'cached', ))
E AssertionError: assert ('cached', 'b...le', 'cached') == ('buildable', ...ed', 'cached')
E At index 0 diff: 'cached' != 'buildable'
E Full diff:
E - ('cached', 'buildable', 'buildable', 'buildable', 'cached')
E ? ----------
E + ('buildable', 'buildable', 'buildable', 'cached', 'cached')
E ? ++++++++++
tests/artifactcache/expiry.py:145: AssertionError
On the same machine:
$ echo 'Hello' > test
$ stat test
File: test
Size: 6 Blocks: 16 IO Block: 4096 regular file
Device: 809h/2057d Inode: 2092237 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2018-07-19 14:47:07.000000000 +0000
Modify: 2018-07-19 14:47:07.000000000 +0000
Change: 2018-07-19 14:47:07.000000000 +0000
Birth: -
$ sleep 120
$ touch -m test
$ stat test
File: test
Size: 6 Blocks: 16 IO Block: 4096 regular file
Device: 809h/2057d Inode: 2092237 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2018-07-19 14:47:07.000000000 +0000
Modify: 2018-07-19 14:49:08.000000000 +0000
Change: 2018-07-19 14:49:08.000000000 +0000
Birth: -
Possible fixes
Since BuildStream internally relies on mtimes to determine artifact events, the "correct" way to fix this is to stop using mtimes for that purpose and keep a manifest with finer-grained timestamps. This may not be worth the effort considering the bug probably has no practical impact outside of CI.
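A rough sketch of what such a manifest might look like, assuming a single JSON file that maps artifact refs to last-use times in nanoseconds. The `ArtifactManifest` class, its method names and the file layout are all hypothetical, and `time.time_ns()` requires Python 3.7 or later:

```python
import json
import os
import tempfile
import time


class ArtifactManifest:
    """Hypothetical manifest mapping artifact refs to last-use times (ns)."""

    def __init__(self, path):
        self.path = path
        self.entries = {}
        if os.path.exists(path):
            with open(path) as f:
                self.entries = json.load(f)

    def touch(self, ref):
        # Record the time ourselves instead of asking the file system.
        # If the clock did not advance, bump by 1 ns so ordering stays strict.
        latest = max(self.entries.values(), default=0)
        self.entries[ref] = max(time.time_ns(), latest + 1)
        self._save()

    def _save(self):
        # Write to a temporary file and rename it into place so that a
        # reader never sees a partially written manifest.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(self.entries, f)
        os.replace(tmp, self.path)

    def oldest_first(self):
        # Candidates for expiry, least recently used first.
        return sorted(self.entries, key=self.entries.get)
```

Because the ordering lives in the manifest rather than in file metadata, it no longer depends on the timestamp granularity of the file system, at the cost of having to keep the manifest in sync with the cache contents.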
Perhaps we can use pytest to skip the relevant test on machines that lack sub-second precision instead...
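One way the detection could work, as a sketch only: set an mtime with a non-zero fractional part on a probe file and check whether the fraction survives a round trip through the file system. The helper name `has_subsecond_mtime` and its `directory` parameter are assumptions made for this example, not existing BuildStream code.

```python
import os
import tempfile


def has_subsecond_mtime(directory=None):
    """Return True if files under `directory` keep sub-second mtimes.

    `directory` defaults to the system temporary directory, which may
    live on a different file system than the artifact cache.
    """
    with tempfile.TemporaryDirectory(dir=directory) as tmpdir:
        probe = os.path.join(tmpdir, "probe")
        with open(probe, "w") as f:
            f.write("probe")

        # Ask for an mtime that ends in .5 seconds, in nanoseconds.
        wanted_ns = 1531000000 * 10**9 + 500 * 10**6
        os.utime(probe, ns=(wanted_ns, wanted_ns))

        # A file system with whole-second granularity drops the fraction.
        return os.stat(probe).st_mtime_ns % 10**9 != 0
```

The expiry ordering test could then be decorated with `@pytest.mark.skipif(not has_subsecond_mtime(), reason="file system lacks sub-second mtimes")`. Passing the artifact cache directory as `directory` would probe the file system that actually matters for the test.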
Other relevant information
- BuildStream version affected: /milestone %BuildStream_v1.2
- BuildStream version affected: /milestone %"BuildStream_v1.3"