Scheduler requires significant rework
Some Background
The thoughts I've written here have mostly developed from investigation into #712 (closed), in particular from discussions I've had with other contributors (primarily @danielsilverstone-ct, @WSalmon and @juergbi) while trying to reason about the performance impact of patches to the scheduler.
The Problem
1: Scaling
As it stands, the scheduler code does not scale well at all: the overhead it introduces becomes significant for large projects, and it currently increases faster than linearly with the number of elements remaining in a pipeline.
@WSalmon created a nice demonstration of this using a simple set of empty import elements. Taking a pipeline of 5000 elements in a 'fan' graph from bstgen, the element throughput of the pipeline is almost an order of magnitude higher near the end than near the start. Note this holds even if one ignores either end to account for e.g. loading time. On his mid-range laptop, this took over 20 minutes to run, without any network access and with builds which should be instantaneous. Profiling shows that over 80% of this time is spent in queue.pop_ready_jobs.
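To see why the overhead grows faster than linearly, consider a scheduler that rescans the entire waiting list every time it looks for ready jobs. The sketch below is illustrative only (it is not BuildStream code, and `total_scan_work` is a made-up helper): each individual scan is linear, but the scans accumulate to roughly n²/2 element inspections over a whole run.

```python
def total_scan_work(num_elements: int, ready_per_tick: int = 1) -> int:
    """Count element inspections if each scheduling tick rescans all
    remaining elements and dispatches `ready_per_tick` of them."""
    remaining = num_elements
    inspections = 0
    while remaining > 0:
        inspections += remaining          # full scan of the waiting list
        remaining -= min(ready_per_tick, remaining)
    return inspections

# Doubling the pipeline roughly quadruples the total scanning cost,
# which is consistent with throughput being far higher near the end
# of a large run (the list being scanned has shrunk).
small = total_scan_work(1000)
large = total_scan_work(2000)
assert 3.5 < large / small < 4.5
```

This is only a model of the scaling behaviour, but it matches the profile: most ticks do a lot of scanning and dispatch very little work.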
Issue #703 (closed) is one 'real world' encounter with this issue and my initial profiling observations support the assertion that job queues are taking a large amount of the total runtime.
Given we want BuildStream to be able to handle pipelines containing tens of thousands of elements, the scaling ability of the scheduler clearly needs improving.
2: Resource Starvation
As observed in #712 (closed), the current attempt at avoiding resource starvation by reversing the order of the queues in Scheduler._schedule_queue_jobs does not work. This is because all pull jobs are ready to be scheduled immediately and therefore are guaranteed to be scheduled almost immediately. Any fetch jobs will be added to the ready_jobs list after the pull jobs and so will tend to wait until all pull jobs are complete.
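The pattern can be shown with a toy simulation. This is a deliberate simplification, not BuildStream internals: the names are invented, and it assumes every element's fetch job only becomes ready once its pull job has completed, with all jobs feeding one shared FIFO ready list.

```python
from collections import deque

def run(num_elements: int, tokens: int) -> list:
    """Simulate a shared FIFO ready list: every pull job is ready at
    tick 0; each fetch job becomes ready only after its pull completes."""
    ready = deque("pull-%d" % i for i in range(num_elements))
    order = []
    while ready:
        for _ in range(min(tokens, len(ready))):
            job = ready.popleft()
            order.append(job)
            if job.startswith("pull-"):
                # The corresponding fetch job joins the ready list
                # *behind* every pull job that is still waiting.
                ready.append("fetch-" + job.split("-")[1])
    return order

# Every pull job runs before any fetch job, regardless of how many
# job tokens are available - reversing queue order cannot change this,
# because the ordering is imposed by when jobs become ready.
order = run(4, 2)
assert order[:4] == ["pull-0", "pull-1", "pull-2", "pull-3"]
assert all(job.startswith("fetch-") for job in order[4:])
```

The ordering here is a property of readiness times, not of queue iteration order, which is why the reversal in Scheduler._schedule_queue_jobs has no effect.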
Moving Forward
Refactoring
We're encountering enough pain here now that I believe it is worth thinking carefully about our approach to the scheduler as a whole. I've not yet spent much time thinking about alternative scheduler implementations, so I don't have much to offer on this front for now.
Benchmarking
As is demonstrated by #712 (closed) and #703 (closed), the synthetic benchmarks we have been using have not been representative of the large real world projects we are targeting. Before making any major changes we should ideally have some more realistic benchmarks which can be used to guide us.
@wsalmon has been doing some work on expanding bstgen to create more complex dependency trees. If we can show that these exhibit the same performance issues as are being encountered in real projects, then they could be used as easily repeatable, low-cost benchmarks that we know reflect real-world projects.
I believe @jennis has been working on constructing a benchmarking project from the Debian dependency graph, which could prove to be very valuable.
note: Additional links and more statistics to follow.