Push-based scheduler optimization

Background

Original proposal from @BenjaminSchubert: https://mail.gnome.org/archives/buildstream-list/2019-March/msg00034.html

I've been looking at the scheduler and the queues and from what I can see in some benchmarks and profiles, they are becoming an important bottleneck, especially with a large number of workers.

A few issues have been raised over time about that:

#824 (closed)

#943 (closed)

Currently, the queues go over every element until they find enough ready elements to fill up all resources. This works fine with few workers, since the probability of finding a ready job early in the queue is quite high. Performance degrades as more workers are added, because the probability of finding many ready jobs early in the queue decreases.

The problem becomes even worse when we are using remote execution with many workers.

I therefore think we should be pushing items into the queues as soon as they are needed.
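
To make the cost difference concrete, here is a minimal sketch that counts how many elements each approach has to look at. It is not BuildStream code: `harvest_by_polling`, `harvest_from_ready`, and the `is_ready` predicate are hypothetical names, and the ready deque is prefilled by hand where a real push-based design would fill it from callbacks.

```python
from collections import deque


def harvest_by_polling(elements, is_ready, free_slots):
    """Scan elements in order until enough ready ones are found.

    Returns the harvested jobs and the number of readiness checks made.
    """
    jobs, checks = [], 0
    for element in elements:
        if len(jobs) == free_slots:
            break
        checks += 1
        if is_ready(element):
            jobs.append(element)
    return jobs, checks


def harvest_from_ready(ready, free_slots):
    """Pop already-ready elements; the work is proportional to the jobs taken."""
    jobs = []
    while ready and len(jobs) < free_slots:
        jobs.append(ready.popleft())
    return jobs, len(jobs)


# Assumption for the demo: 1000 waiting elements, only every 100th is ready.
elements = list(range(1000))


def is_ready(element):
    return element % 100 == 99


# Stand-in for the callbacks that would push elements as they become ready.
ready = deque(e for e in elements if is_ready(e))

polled, poll_checks = harvest_by_polling(elements, is_ready, free_slots=8)
pushed, push_checks = harvest_from_ready(ready, free_slots=8)
print(poll_checks, push_checks)
```

Both calls return the same eight jobs, but the polling version inspects every element up to the eighth ready one, and that scan gets longer as the number of free slots grows.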

Tasks

Add core scheduler support for queues that are notified of ready-state changes via push/callbacks, instead of the current approach of calling Queue.status() on every element in harvest_jobs().

This first task is just about the core Scheduler/Queue code, without actually using the new functionality in the individual queues yet.

Use push/callback-based approach in:
- PullQueue
- FetchQueue
- BuildQueue
The other queues do not use QueueStatus.WAIT and thus should not require significant changes.

These tasks will require corresponding changes in the Element class.

Drop the old Queue.status()-based ready check / wait queue.

Queue.status() should likely continue to exist (possibly under a different name) to decide whether the element should completely skip the queue. It might also make sense to keep using it to determine whether an element is immediately ready when it's enqueued. However, it should no longer be called as part of harvest_jobs().

Maybe: Defer enqueuing (non-track) elements until they are marked as 'required'.

I don't think this is strictly necessary as this can also be handled by the push/callback-based approach within the pull/fetch/build queues. However, it might be slightly cleaner to do it in Stream.
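
As a rough illustration of the tasks above, the sketch below shows one possible shape for a push/callback-based queue. Everything in it is an assumption made for illustration: the `Element` and `PushQueue` classes, `set_ready_callback()`, and `mark_done()` are hypothetical and do not mirror the real BuildStream classes. The readiness check happens once at enqueue time, and after that the element pushes itself into the ready list through a callback, so `harvest_jobs()` never iterates over waiting elements.

```python
from collections import deque


class Element:
    """Hypothetical stand-in for an element with dependencies."""

    def __init__(self, name, dependencies=()):
        self.name = name
        self._pending = len(dependencies)  # dependencies not yet done
        self._dependents = []
        self._ready_callback = None
        for dep in dependencies:
            dep._dependents.append(self)

    def is_ready(self):
        return self._pending == 0

    def set_ready_callback(self, callback):
        self._ready_callback = callback

    def mark_done(self):
        """Notify dependents; fire a dependent's callback once it is ready."""
        for dependent in self._dependents:
            dependent._pending -= 1
            if dependent.is_ready() and dependent._ready_callback is not None:
                callback = dependent._ready_callback
                dependent._ready_callback = None  # fire at most once
                callback(dependent)


class PushQueue:
    """Hypothetical queue that only ever touches elements that are ready."""

    def __init__(self):
        self._ready = deque()

    def enqueue(self, element):
        # Check readiness once on entry, then rely on the callback.
        if element.is_ready():
            self._ready.append(element)
        else:
            element.set_ready_callback(self._ready.append)

    def harvest_jobs(self, free_slots):
        jobs = []
        while self._ready and len(jobs) < free_slots:
            jobs.append(self._ready.popleft())
        return jobs


base = Element("base")
lib = Element("lib", [base])
app = Element("app", [base, lib])

queue = PushQueue()
for element in (app, lib, base):
    queue.enqueue(element)

first = [e.name for e in queue.harvest_jobs(4)]
base.mark_done()
second = [e.name for e in queue.harvest_jobs(4)]
lib.mark_done()
third = [e.name for e in queue.harvest_jobs(4)]
print(first, second, third)
```

In this sketch the cost of `harvest_jobs()` depends only on the number of jobs handed out, not on how many elements are still waiting. The bookkeeping moves into the element, which is in line with the note above that these tasks need corresponding changes in the Element class.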