
BuildStream spends a long time pulling/looking to pull before doing anything useful

Context

  • I have just set up a remote artifact cache on a VM, which I intend to push built artifacts to from a project with ~1800 elements.
  • Many of the elements in this project depend on a junction project which, in turn, contains ~12000 elements.
  • I wish to push the ~1800 elements (and their junctioned dependencies) to the remote cache.
  • My project.conf file was configured to push to the remote cache.
  • I wrote an ad hoc stack element whose depends: list contains all ~1800 elements, then ran bst build <stack_element>.bst

Summary

  • First, we found that building the entire stack took an extremely long time, which led us to uncover #703 (closed)
  • To work around #703 (closed), I wrote a wrapper around BuildStream's frontend that runs bst build on the elements in batches of 100.
  • A bst build of 100 elements in the first project, with dependencies, produces a pipeline of ~1000 jobs.
  • Now, BuildStream spends hours trying to pull, element by element, the ~1000 elements from the empty cache, and performs very few fetch jobs in the meantime.
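
For reference, the batching wrapper mentioned above can be sketched roughly as follows. This is not the actual wrapper, only a minimal illustration: the names batched, run_bst_build and build_in_batches are hypothetical, and it assumes bst build accepts several element names in a single invocation.

```python
import subprocess
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_bst_build(elements: List[str]) -> int:
    """Run one build for a batch and return the process exit code.

    Assumption: `bst build` takes multiple element names on one command line.
    """
    return subprocess.run(["bst", "build", *elements]).returncode


def build_in_batches(
    elements: Sequence[str],
    size: int = 100,
    runner: Callable[[List[str]], int] = run_bst_build,
) -> List[int]:
    """Build `elements` batch by batch and collect one exit code per batch."""
    return [runner(batch) for batch in batched(elements, size)]
```

With ~1800 elements and a batch size of 100 this would issue 18 separate builds; the runner argument only exists so the batching can be exercised without BuildStream installed.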

Steps to reproduce

  • Initialise a remote artifact server on a VM: bst-artifact-server --enable-push --port 8080 <REPO>
  • Try to build a big project (e.g. freedesktop or gnome-build-meta) with a project.conf which includes:
artifacts:
  url: <ip>:<port>
  push: true
  • Record how long it takes to pull
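
One rough way to record the time is to wrap the invocation in a small timing helper. The sketch below is only an assumption about how one might measure it: time_command is a hypothetical name, and it measures the wall-clock time of the whole command rather than the pull jobs alone, so it gives an upper bound on the time spent pulling.

```python
import subprocess
import time
from typing import List, Tuple


def time_command(argv: List[str]) -> Tuple[float, int]:
    """Run `argv` and return (elapsed wall-clock seconds, exit code)."""
    start = time.monotonic()
    completed = subprocess.run(argv)
    elapsed = time.monotonic() - start
    return elapsed, completed.returncode
```

For example, time_command(["bst", "build", "<stack_element>.bst"]) would time a full build of the stack element.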

Possible fixes

  1. We perform a batch request to the server saying 'here are all the elements/cache keys I'm looking for'.
  2. Ensure that fetch jobs get a higher priority than pull jobs
    • As suggested by @juergbi on our IRC channel.
    • Note that we have tried a quick hack for this and the result was that the overall build time was longer.
  3. We are able to disable the remote cache for a build run, build locally, and then bst push.
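
To illustrate why fix 1 could help, here is a toy comparison of per-element lookups against a single batch lookup. It does not use BuildStream's or the artifact server's real API: FakeRemoteCache, has, has_many and keys_to_build are all hypothetical names, and the cache is an in-memory stand-in that only counts round trips.

```python
from typing import Dict, Iterable, List, Set


class FakeRemoteCache:
    """In-memory stand-in for a remote artifact cache that counts round trips."""

    def __init__(self, stored_keys: Iterable[str] = ()) -> None:
        self._stored: Set[str] = set(stored_keys)
        self.round_trips = 0

    def has(self, key: str) -> bool:
        """One round trip per cache key (the element-by-element behaviour)."""
        self.round_trips += 1
        return key in self._stored

    def has_many(self, keys: Iterable[str]) -> Dict[str, bool]:
        """One round trip for the whole set of cache keys (the proposed batch)."""
        self.round_trips += 1
        return {key: key in self._stored for key in keys}


def keys_to_build(cache: FakeRemoteCache, keys: List[str]) -> List[str]:
    """Return the keys the remote does not hold, using a single batch query."""
    present = cache.has_many(keys)
    return [key for key in keys if not present[key]]
```

Against an empty cache and a pipeline of ~1000 elements, the per-element path costs ~1000 round trips before anything is known to be missing, while the batch path learns the same thing in one.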

Edited by James Ennis