Project with a large number of identical files exceeds ext4 hard link limit
Summary
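For a project that builds a complicated image, such as a Flatpak runtime, elements can end up creating large numbers of identical files. BuildStream appears to hard-link every one of those files to a single object in its artifact cache, so extracting the artifacts can fail once the filesystem's limit on hard links per file is reached.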
Steps to reproduce
This element creates 65000 files with the same content:
evil-files.bst:
kind: manual
build-depends:
- base.bst
config:
  install-commands:
  - |
    for i in $(seq 1 65000); do
      echo "B" > "%{install-root}/${i}.txt";
    done
base.bst:
kind: import
description: |
  Alpine Linux base runtime
sources:
- kind: tar
  # This is a post doctored, trimmed down system image
  # of the Alpine linux distribution.
  #
  url: https://bst-integration-test-images.ams3.cdn.digitaloceanspaces.com/integration-tests-base.v1.x86_64.tar.xz
  ref: 3eb559250ba82b64a68d86d0636a6b127aa5f6d25d3601a79f79214dc9703639
To reproduce the issue, you will need to store your BuildStream cache on an ext3 or ext4 filesystem. Create a BuildStream project containing evil-files.bst and base.bst, then run the following commands:
bst build evil-files.bst
bst shell evil-files.bst
What is the current bug behavior?
When extracting artifacts from evil-files.bst, BuildStream fails with output like this:
$ bst shell evil-files.bst
[--:--:--][][] START Loading elements
[00:00:00][][] SUCCESS Loading elements
[--:--:--][][] START Resolving elements
[00:00:00][][] SUCCESS Resolving elements
[--:--:--][][] START Resolving cached state
[00:00:00][][] SUCCESS Resolving cached state
[--:--:--][555f7856][ main:evil-files.bst ] START Staging dependencies
[--:--:--][][] BUG [Errno 31] Too many links: '/home/dylan/.cache/buildstream/artifacts/cas/objects/c0/cde77fa8fef97d476c10aad3d2d54fcc2f336140d073651c2dcccf1e379fd6' -> '/home/dylan/.cache/buildstream/artifacts/extract/tmpzbdey6br/evil-bst/evil-files/555f7856d2c9e08c4c7ee7c5489c32518efeb30cd824978d08cc95c3719c3c4f/files/9999.txt'
Traceback (most recent call last):
File "/usr/bin/bst", line 8, in <module>
sys.exit(cli())
File "/home/dylan/.local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/_frontend/cli.py", line 174, in override_main
standalone_mode=standalone_mode, **extra)
File "/home/dylan/.local/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/dylan/.local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/dylan/.local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/dylan/.local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/dylan/.local/lib/python3.7/site-packages/click/decorators.py", line 27, in new_func
return f(get_current_context().obj, *args, **kwargs)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/_frontend/cli.py", line 632, in shell
command=command)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/_stream.py", line 150, in shell
return element._shell(scope, directory, mounts=mounts, isolate=isolate, prompt=prompt, command=command)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/element.py", line 1787, in _shell
with self._prepare_sandbox(scope, directory) as sandbox:
File "/usr/lib64/python3.7/contextlib.py", line 112, in __enter__
return next(self.gen)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/element.py", line 1328, in _prepare_sandbox
self.stage_dependency_artifacts(sandbox, dependency_scope)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/element.py", line 740, in stage_dependency_artifacts
update_mtimes=to_update)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/element.py", line 629, in stage_artifact
artifact_base, _ = self.__extract()
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/element.py", line 2402, in __extract
return (self.__artifacts.extract(self, key), key)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/_artifactcache/cascache.py", line 115, in extract
self._checkout(checkoutdir, tree)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/_artifactcache/cascache.py", line 643, in _checkout
self._checkout(fullpath, dirnode.digest)
File "/home/dylan/.local/lib/python3.7/site-packages/buildstream/_artifactcache/cascache.py", line 635, in _checkout
os.link(self.objpath(filenode.digest), fullpath)
OSError: [Errno 31] Too many links: '/home/dylan/.cache/buildstream/artifacts/cas/objects/c0/cde77fa8fef97d476c10aad3d2d54fcc2f336140d073651c2dcccf1e379fd6' -> '/home/dylan/.cache/buildstream/artifacts/extract/tmpzbdey6br/evil-bst/evil-files/555f7856d2c9e08c4c7ee7c5489c32518efeb30cd824978d08cc95c3719c3c4f/files/9999.txt'
It appears that all of those files (1.txt, 2.txt, and so on) are hard-linked to the same object in BuildStream's artifacts CAS directory. This eventually exceeds ext4's limit of 65000 hard links per file.
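To check whether a particular CAS object is close to the limit, its current link count can be compared with what the filesystem reports. The following is a minimal sketch, not part of BuildStream: `link_usage` is a hypothetical helper, and the value returned for `PC_LINK_MAX` comes from the C library, so it may not be accurate on every filesystem.

```python
import os
import tempfile


def link_usage(path):
    """Return (current_links, max_links) for the file at `path`.

    max_links is None when the platform cannot report a limit.
    """
    current = os.stat(path).st_nlink
    try:
        # PC_LINK_MAX is the POSIX name for the per-file hard link limit.
        limit = os.pathconf(path, "PC_LINK_MAX")
    except (AttributeError, OSError, ValueError):
        limit = None
    return current, limit


if __name__ == "__main__":
    # Demonstration on a throwaway file: one object, three extra links.
    with tempfile.TemporaryDirectory() as tmp:
        obj = os.path.join(tmp, "object")
        with open(obj, "w") as f:
            f.write("B\n")
        for i in range(1, 4):
            os.link(obj, os.path.join(tmp, f"{i}.txt"))
        print(link_usage(obj))
```

Pointing `link_usage` at the object path from the error message above should show how many links it already has.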
What is the expected correct behavior?
Instead, BuildStream should avoid creating more hard links than the filesystem allows, creating additional objects in the CAS once the limit is reached. Alternatively, it could provide a more meaningful error message in this case that might help a user to resolve the issue.
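One simple way to avoid the crash is sketched below. Note that this is a different technique from creating additional CAS objects: it falls back to copying the file into the checkout when the link limit is hit. `link_or_copy` and its `link` parameter are hypothetical names used for illustration, not BuildStream's actual code.

```python
import errno
import os
import shutil
import tempfile


def link_or_copy(src, dst, link=os.link):
    """Hard-link src to dst, copying instead if the link limit is hit.

    `link` is injectable only so that the fallback can be exercised
    without actually creating tens of thousands of links.
    Returns "linked" or "copied".
    """
    try:
        link(src, dst)
        return "linked"
    except OSError as e:
        # EMLINK is errno 31, "Too many links"; re-raise anything else.
        if e.errno != errno.EMLINK:
            raise
        shutil.copy2(src, dst)
        return "copied"


if __name__ == "__main__":
    def always_full(src, dst):
        # Simulates a filesystem whose link limit is already reached.
        raise OSError(errno.EMLINK, "Too many links")

    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "object")
        with open(src, "w") as f:
            f.write("B\n")
        print(link_or_copy(src, os.path.join(tmp, "1.txt"), link=always_full))
```

The trade-off is that copied files use extra disk space and are no longer deduplicated against the CAS object, but the checkout completes.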
Other relevant information
I first encountered this building a Flatpak runtime based on the Freedesktop SDK runtime. In particular, files such as LC_MEASUREMENT for different locales (https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/supported-locales.bst) end up being duplicated extensively.
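To find out which files in a built tree are duplicated heavily enough to matter, the files can be grouped by content digest. This is a rough sketch under the assumption that the CAS keys objects by the SHA-256 of their content; `count_by_digest` is a hypothetical helper, not a BuildStream API.

```python
import hashlib
import os
from collections import Counter


def count_by_digest(root):
    """Count regular files under `root`, grouped by SHA-256 of content."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            counts[digest.hexdigest()] += 1
    return counts
```

Calling `count_by_digest(some_checkout_dir).most_common(5)` lists the five most duplicated contents and how many files share each one.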
I should add that I originally had no cache quota in my buildstream.conf, so directories in ~/.cache/buildstream/artifacts/extract/ contained many entries. Setting a cache quota means that this is unlikely to recur for me, but having this error come out of a large cache size is a bit of a surprise, and the underlying issue could be trouble for other projects.
- BuildStream version affected: /milestone %BuildStream_v1.4