Element._update_state() does more work than it needs to each time it is called
Summary
Element._update_state() is a very long function that does a lot of work and is called frequently, often for very different reasons.
The full list of callers, and the reason each one calls it, is:
- Pipeline.resolve_elements(), so that we know the initial state of the elements.
- Stream._run(), so that we know the state of the elements after a run (to provide a useful summary).
- {Build,Fetch,Pull}Queue.status(), in case changes to dependencies have changed the element's state. Used by Queue.harvest_jobs() to decide whether the job should still be run.
- FetchQueue.done(), to check whether the element is now cached.
- Element._schedule_tracking(), to set the element state to be inconsistent
- Element._tracking_done(), to set the cache key now the element has been tracked.
- Element._set_required(), to schedule assembly if certain conditions are met.
- Element._schedule_assemble(), to synchronise the element state before it gets used in a subprocess.
- Element._assemble_done(), to synchronise the element state after the subprocess has finished its job.
- Element._pull_done(), to synchronise the element state after a pull attempt.
In other words, the purposes are broadly:
- Calculating the initial state
- Scheduling assembly
- Deciding whether a job should be run
- Deciding whether the whole dependency tree should be recalculated
- Synchronising the state because of things that have happened / will happen in a subprocess
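To check how the calls are actually distributed across these callers, a small counting wrapper could be applied temporarily. This is only a sketch: the `count_callers` decorator and the toy `Element` class below are hypothetical and not part of BuildStream, and `sys._getframe()` is a CPython implementation detail.

```python
import collections
import functools
import sys


def count_callers(func):
    """Wrap func and tally its invocations by the name of the calling function.

    Hypothetical debugging helper, not part of BuildStream.
    """
    counts = collections.Counter()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Frame 0 is this wrapper; frame 1 is whoever called the wrapped function.
        caller = sys._getframe(1).f_code.co_name
        counts[caller] += 1
        return func(*args, **kwargs)

    # Expose the tally so it can be inspected after a run.
    wrapper.caller_counts = counts
    return wrapper


class Element:
    """Toy stand-in used only to demonstrate the wrapper."""

    @count_callers
    def _update_state(self):
        pass  # the real function does far more work

    def _schedule_tracking(self):
        self._update_state()

    def _pull_done(self):
        self._update_state()
```

After exercising the code, `Element._update_state.caller_counts.most_common()` lists the callers from most to least frequent.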
Possible fixes
Element._update_state() does a lot of work, and most callers only need part of it, so reducing the redundant work done on each call is likely to improve performance.
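As a rough illustration of one possible direction, the update could be split into narrower steps so that each caller only runs the part it needs. Everything in this sketch is an assumption made for illustration: the toy `Element` class, the step names (`_update_cache_key`, `_update_cached`, `_update_scheduling`), the `artifact_cache` set, and the scheduling conditions do not reflect BuildStream's real implementation.

```python
class Element:
    """Toy model of an element; all names and rules here are hypothetical."""

    def __init__(self, name, dependencies=(), artifact_cache=None):
        self.name = name
        self.dependencies = list(dependencies)
        # A plain set of cache keys stands in for the artifact cache.
        self.artifact_cache = artifact_cache if artifact_cache is not None else set()
        self.cache_key = None
        self.cached = False
        self.required = False
        self.assemble_scheduled = False
        self.steps_run = []  # records which steps ran, for illustration only

    def _update_cache_key(self):
        self.steps_run.append("cache_key")
        dep_keys = [dep.cache_key for dep in self.dependencies]
        # The key can only be computed once every dependency has one.
        if None not in dep_keys:
            self.cache_key = "{}[{}]".format(self.name, ",".join(dep_keys))

    def _update_cached(self):
        self.steps_run.append("cached")
        self.cached = self.cache_key in self.artifact_cache

    def _update_scheduling(self):
        self.steps_run.append("scheduling")
        # Assumed conditions: required, has a key, and not already cached.
        if self.required and self.cache_key is not None and not self.cached:
            self.assemble_scheduled = True

    def _update_state(self):
        # The catch-all update: runs every step regardless of the caller's need.
        self._update_cache_key()
        self._update_cached()
        self._update_scheduling()

    def _pull_done(self):
        # In this toy model a pull can only change the cached status.
        self._update_cached()

    def _set_required(self):
        self.required = True
        # Only the scheduling decision depends on the required flag.
        self._update_scheduling()
```

In this model, `_pull_done()` and `_set_required()` each run one step instead of three. Whether the real callers can be narrowed this way depends on which parts of the state each of them actually needs.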
Other relevant information
- BuildStream version affected: /milestone %BuildStream_v1.x