BuildStream spends a long time pulling/looking to pull before doing anything useful
Context
- I have just set up a remote artifact cache on a VM, which I intend to push built artifacts to from a project with ~1800 elements.
- Many of the elements in this project depend on a junction project which, in turn, contains ~12000 elements.
- I wish to push the ~1800 elements (and their junctioned dependencies) to the remote cache.
- My `project.conf` file was configured to push to the remote cache.
- I wrote an ad hoc stack element that `depends:` on all ~1800 elements, then tried `bst build <stack_element>.bst`.
Summary
- First, we found that building the entire stack took an extremely long time, which led us to uncover #703 (closed).
- To bypass #703 (closed), I wrote a wrapper for BuildStream's frontend which `bst build`s elements in batches of 100.
- A `bst build` of 100 elements in the first project, with dependencies, produces a pipeline of ~1000 jobs.
- Now BuildStream spends hours trying to pull, element by element, the ~1000 elements from the empty cache, and performs very few fetch jobs in the meantime.
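For anyone wanting to reproduce the batching, here is a minimal sketch of such a wrapper. It is not the actual wrapper used here: the function names, the injectable `run` parameter and the default batch size are assumptions for illustration. It only relies on `bst build` accepting several element names in one invocation.

```python
import subprocess
from typing import Callable, Iterator, List, Sequence


def batches(elements: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `elements`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(elements), size):
        yield list(elements[start:start + size])


def build_in_batches(
    elements: Sequence[str],
    size: int = 100,
    # `run` is injectable (an assumption of this sketch) so the batching
    # logic can be exercised without BuildStream installed.
    run: Callable[[List[str]], object] = lambda cmd: subprocess.run(cmd, check=True),
) -> int:
    """Invoke `bst build` once per batch and return the number of invocations."""
    count = 0
    for batch in batches(elements, size):
        run(["bst", "build", *batch])
        count += 1
    return count
```

With the default `run`, a failing `bst build` raises `subprocess.CalledProcessError` and stops the remaining batches, which seems the safer behaviour for a long unattended run.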
Steps to reproduce
- Initialise a remote artifact server on a VM: `bst-artifact-server --enable-push --port 8080 <REPO>`
- Try to build a big project (e.g. freedesktop or gnome-build-meta) with a `project.conf` which includes:

      artifacts:
        url: <ip>:<port>
        push: true

- Record how long it takes to pull
Possible fixes
- We perform a batch request to the server saying 'here are all the elements/cache keys I'm looking for'.
- Ensure that fetch jobs get a higher priority than pull jobs
  - As suggested by @juergbi on our IRC channel.
  - Note that we have tried a quick hack for this and the result was that the overall build time was longer.
- We are able to disable a cache for a build run, build locally and then `bst push`.
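To make the batch-request idea concrete, here is a rough sketch of the client-side logic. `query_remote` is hypothetical: it stands in for a single round trip that returns whichever of the requested cache keys the server holds, and no such call is claimed to exist in BuildStream today. `plan_pulls` and the shape of `cache_keys` are likewise assumptions for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple


def plan_pulls(
    cache_keys: Dict[str, str],
    query_remote: Callable[[Set[str]], Set[str]],
) -> Tuple[List[str], List[str]]:
    """Split elements into (to_pull, to_build) using one remote query.

    cache_keys maps element name -> cache key.
    query_remote is a HYPOTHETICAL batch lookup returning the subset of
    the given keys that the remote cache already has.
    """
    # One request for all keys, instead of one pull attempt per element.
    present = query_remote(set(cache_keys.values()))
    to_pull = [name for name, key in cache_keys.items() if key in present]
    to_build = [name for name, key in cache_keys.items() if key not in present]
    return to_pull, to_build
```

Against an empty cache, as in this report, `to_pull` would come back empty after a single request, so no pull jobs would need to be queued and the scheduler could go straight to fetching.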
Edited by James Ennis