Move pipeline graph data processing to the backend
## What
Let's move the calculation of the layers and ancestors used by the big pipeline graph from the frontend to the backend. This would include all the code called by the `listByLayers` function.

The frontend could then request a pipeline graph with a `view` argument set to either `stage` or `layers` and have the data returned in the correct format. For stages, it would look as it does now. For layers, it would include extra data:
```
# Note: This representation does not show all the data we get for a pipeline, just the relevant values.
{
  layers: [
    {
      groups: [
        {
          layer,
          jobs: [
            {
              name,
              needs: [{ name }],
              ancestors: [{ name }],
              source,
              target,
            }
          ]
        }
      ]
    }
  ]
}
```
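To illustrate the `ancestors` field above: for each job, it is the transitive closure of its `needs`. A minimal sketch of the idea, where the function name and the plain-object input shape are hypothetical, not GitLab's actual implementation:

```javascript
// Sketch: derive each job's ancestors as the transitive closure of `needs`.
// `jobs` maps job name -> array of needed job names (assumed acyclic).
function calculateAncestors(jobs) {
  const cache = new Map();

  const ancestorsOf = (name) => {
    if (cache.has(name)) return cache.get(name);
    const result = new Set();
    for (const need of jobs[name] ?? []) {
      result.add(need);
      for (const ancestor of ancestorsOf(need)) result.add(ancestor);
    }
    cache.set(name, result);
    return result;
  };

  return Object.fromEntries(
    Object.keys(jobs).map((name) => [name, [...ancestorsOf(name)].sort()]),
  );
}

// Example: deploy needs test, test needs build.
// calculateAncestors({ build: [], test: ['build'], deploy: ['test'] })
// -> { build: [], test: ['build'], deploy: ['build', 'test'] }
```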
The backend would also return a representation of the links between jobs. (This bit would need some refining between backend and frontend, but would be able to be calculated with the same methods we are using now.)
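As a rough illustration, those links can be derived directly from `needs` as source/target pairs (a hypothetical sketch, not the actual method):

```javascript
// Sketch: derive the link list the graph needs (source/target pairs) from
// each job's `needs`. Field names mirror the shape above but are illustrative.
function makeLinks(jobs) {
  return jobs.flatMap(({ name, needs = [] }) =>
    needs.map(({ name: needName }) => ({ source: needName, target: name })),
  );
}

// makeLinks([{ name: 'test', needs: [{ name: 'build' }] }])
// -> [{ source: 'build', target: 'test' }]
```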
## Why
The why breaks down into a few reasons: performance and UX, simplifying the frontend, and reducing the error footprint shared between frontend and backend.
### Performance and UX
Currently, calculating the layers and ancestors on the frontend is not terribly performant. Initial looks at the browser performance tab show that even on smaller graphs it's not just the calculation itself but the follow-on node updates in the Vue vDOM representation that take a lot of time. On a graph of around 40 jobs, the whole task takes about 1.47s. On a graph with 341 nodes, calculating ancestors comes in at around 18s! `createSankey` is stable at about 1ms.
(More stats to come here.)
This performance bottleneck is blocking requested features like showing links by default and showing the forward graph for a job.
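Timings like the ones above can be captured with the User Timing API rather than eyeballed from the performance tab. A minimal sketch, where `timed` and the label are ours, and the wrapped call stands in for whatever is being profiled:

```javascript
// Sketch: wrap an expensive call with User Timing marks so its cost shows up
// as a named measure in the browser's performance tab. `work` stands in for
// the layer/ancestor calculation being profiled.
function timed(label, work) {
  performance.mark(`${label}-start`);
  const result = work();
  performance.mark(`${label}-end`);
  const { duration } = performance.measure(label, `${label}-start`, `${label}-end`);
  return { result, duration };
}

// Usage (hypothetical): timed('calculate-layers', () => listByLayers(graph));
```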
### Simplifying the Frontend
Because the frontend is doing so much data manipulation and refinement, there is a lot of code focused on caching so as to avoid recalculation every time we poll (see: !58646 (merged), !59792 (merged)).
That same code is also a bit delicate in terms of the data structures it expects (see the next section), and it makes it challenging to update information held in the Apollo cache in a way that avoids a lot of extra processing while remaining immutable.
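As a sketch of the kind of caching involved (the factory name and cache-key strategy here are hypothetical, not the actual code from those MRs): recompute the layers only when the polled job list actually changes, and freeze the cached value so it stays immutable:

```javascript
// Sketch of the caching the frontend currently needs: recompute the layers
// only when the set of job names changes between polls. `computeLayers` is a
// stand-in for the real calculation.
function createLayersCache(computeLayers) {
  let lastKey = null;
  let lastLayers = null;

  return (jobs) => {
    const key = jobs.map((job) => job.name).join('|');
    if (key !== lastKey) {
      lastKey = key;
      lastLayers = Object.freeze(computeLayers(jobs)); // keep the cached value immutable
    }
    return lastLayers;
  };
}
```

Moving the shaping to the backend removes the need for this layer entirely: each poll would already arrive in its final form.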
Less code solving these issues means less to break. Speaking of less to break ...
### Reducing the Error Footprint Between Frontend and Backend
We've already had multiple instances where changes to a backend representation have broken the frontend data processing. Moving the processing would keep the same group responsible for a larger portion of the data, which should reduce the dangers of shared responsibility here.
## Why Not Fix the Frontend?
I do believe there is some low-hanging fruit that could speed up the frontend a bit and, depending on the prioritization of other features, that might serve as a stopgap. However, the non-performance benefits are also strong, I think, and make the change worthwhile on their own.
## Why Now?
We've done the fast iterations on the frontend to prove out this approach. Now that we know it works, it's a good time to strengthen it by moving the processing to the backend.
## Proposal
- Create a POC to evaluate the performance implications of this change (on both backend and frontend) and the UX changes
## Previous description
For the new pipeline graph, we are currently doing quite a lot of calculation on the frontend, especially around the layers view: calculating the layers and generating the ancestor lists in particular. This can cause performance issues (for instance: #330071) and leads to us solving some complex caching issues on the frontend as well (see: !58646 (merged), !59792 (merged)). This made a lot of sense while developing this work, but in the end it is likely more efficient, and would simplify the frontend code, to move it to the backend. Now that the code exists, it shows fairly well what needs to be done, and we should be able to move it into the Rails section / GraphQL resolvers. The largest part of this project would likely be replicating the d3 graph algorithm; there may be a gem for this, and if not, we can read how d3 implements it.
This would also help us solve ongoing issues where changes to backend data representation break frontend parsing. Because these are so loosely linked, it can be difficult for people to remember to check the full flow. Moving the processing would keep the same group responsible for a larger portion of the data, which should also help.
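For reference, the core of that layering is a longest-path computation over the `needs` DAG, which is straightforward to replicate server-side. A JavaScript sketch of the idea only (not d3-sankey's exact algorithm, and a Ruby port would follow the same shape):

```javascript
// Sketch: assign each job a layer equal to the longest chain of `needs`
// behind it, which is the essence of the layering the graph view uses.
// `jobs` maps job name -> array of needed job names (assumed acyclic).
function assignLayers(jobs) {
  const layers = new Map();

  const layerOf = (name) => {
    if (layers.has(name)) return layers.get(name);
    const needs = jobs[name] ?? [];
    const layer = needs.length === 0 ? 0 : 1 + Math.max(...needs.map(layerOf));
    layers.set(name, layer);
    return layer;
  };

  Object.keys(jobs).forEach(layerOf);
  return Object.fromEntries(layers);
}

// assignLayers({ build: [], test: ['build'], deploy: ['build', 'test'] })
// -> { build: 0, test: 1, deploy: 2 }
```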