Junction `overrides` feature does not resolve links

Currently, link elements are not supported in the junction overrides feature. When looking for subprojects which might be overridden, we perform a simple search of the junction names listed in the overrides, ignoring the fact that those names might contain link elements.

Working example

When overriding a subproject directly using a link, like so:

kind: junction

# subproject-link.bst is a link element to a subproject
# in the subproject this junction refers to
config:
  overrides:
    subproject-link.bst: myoverridding-junction.bst

Then the resolution works because, at the time we load the subproject, we search the subproject for subproject-link.bst exactly as given and have not yet needed to resolve the link.

Failing example

When overriding a subsubproject using a link existing in the subproject, like so:

kind: junction

# subproject-link.bst is a link element to a subproject
# in the subproject this junction refers to
config:
  overrides:
    subproject-link.bst:subsubproject.bst: myoverridding-junction.bst

This fails because subproject-link.bst was resolved, and the junction actually loaded is subproject.bst (the target of subproject-link.bst).

While there is an override specified for subproject-link.bst, and the link was used in order to load the subproject (the link is already resolved), the link name does not match the name of the loaded junction, and as such the override is ignored.
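The mismatch can be illustrated with a minimal Python sketch. This is a simplified model, not BuildStream's loader code: the dictionaries and the `blind_lookup` helper are hypothetical, and the assumption is that the search key is built from the name of the junction as it was actually loaded.

```python
# Simplified, hypothetical model: overrides are a plain dict keyed by the
# junction path exactly as written in the overriding project.
overrides = {
    "subproject-link.bst:subsubproject.bst": "myoverridding-junction.bst",
}

# Hypothetical record of the link that was resolved while loading.
resolved_links = {"subproject-link.bst": "subproject.bst"}


def blind_lookup(overrides, parent_junction, junction):
    """Search for an override by exact match on the junction path."""
    return overrides.get(f"{parent_junction}:{junction}")


# The loaded junction is known by the link target, so the search misses.
loaded_name = resolved_links["subproject-link.bst"]
missed = blind_lookup(overrides, loaded_name, "subsubproject.bst")

# Searching with the link name, as written in the override, would match.
matched = blind_lookup(overrides, "subproject-link.bst", "subsubproject.bst")
```

In this model `missed` is `None` while `matched` is the overriding junction, which is the silent failure described above.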

Proposed fix

This is a bit complicated to solve. Ideally, we should look up previously resolved links and substitute them in junction overrides. This approach would:

  • Fail to override the subproject if the link name was used to address the subproject in the override, but was not used to address elements in the subproject in question, i.e. it would depend on the link resolution having happened in advance.
    • This is a fairly safe assumption to make; if you are addressing a subproject using a link, we should have a reasonable expectation that the link name is what is used to address that subproject
  • Avoid extra work while loading, which could otherwise harm load performance
    • The alternative approach would be to actually resolve the elements specified in overrides. This would incur additional load time, especially when a pipeline of elements could otherwise be loaded without needing to load elements across the overridden junctions (this approach could lead to requiring loading of unneeded subprojects).
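A rough Python sketch of the substitution idea follows. It is only an illustration under stated assumptions: `substitute_resolved_links` is a hypothetical helper, overrides are modelled as a plain dict, and link names are treated as if they were unique across projects, which the real loader could not assume.

```python
def substitute_resolved_links(overrides, resolved_links):
    """Rewrite override keys, replacing every path component that is an
    already resolved link name with the target of that link."""
    normalized = {}
    for key, value in overrides.items():
        parts = [resolved_links.get(part, part) for part in key.split(":")]
        normalized[":".join(parts)] = value
    return normalized


overrides = {
    "subproject-link.bst:subsubproject.bst": "myoverridding-junction.bst",
}

# The link was resolved earlier, so the key is rewritten to the real name.
after_resolution = substitute_resolved_links(
    overrides, {"subproject-link.bst": "subproject.bst"}
)

# The link was never resolved, so the key is left untouched and would
# still fail to match the loaded junction name.
before_resolution = substitute_resolved_links(overrides, {})
```

The second call shows the dependency on resolution order: without a previously resolved link, nothing is substituted.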

The caveat I can see is:

  • When a subproject gets refactored, its maintainers might decide to leave a link in place for one of its subprojects. This would be a sound decision in order to avoid breaking API after some element naming churn.
  • At that point, it would make sense for the project to internally start using the new/real junction name, and leave the link name behind only for compatibility reasons
  • When a downstream project continues to access that junction name only in order to override it, but the sub-subproject is only ever loaded by request of the subproject's dependencies, then the override will be silently ignored (unless it happens to collide with another coexisting junction to the same project).

Better proposed fix

There should be a better algorithm to resolve links without making the tradeoff listed above.

At the time we search for an override for a junction we are about to load, we have already loaded all of the parent projects which have the right to decide. The link elements are therefore already staged and ready to load, even if we have not happened to load them yet. It should be worth the price of simply shallow loading the elements referred to in the overrides list, just to determine whether they are links.

In general, the algorithm would run like this:

  • Every given Loader in the hierarchy would maintain a reverse lookup table of element name -> link names
    • This way we would keep in context, every link which can resolve to any given element
    • TBD: Maybe this needs to be maintained in the global LoadContext, unclear what makes most sense at this stage
  • At the time we load a junction, we traverse the overrides table of that junction
    • If the override key is local to the project being loaded (contains no :), then we shallow load it with Loader._load_file_no_deps()
    • If the loaded LoadElement is a link, we record that link in the appropriate table
  • Whenever we search for an override in advance of loading a junction, instead of the current blind matching:
    • We build a normalized override table in the projects we search, an override table where the override keys have been resolved to full element paths
      • These might not be real element paths, in the case that we have links which traverse junction boundaries which have not yet been loaded, but this is immaterial at this stage, and will only become material at the time of crossing those junction boundaries
      • This alternative table needs to be reconstructed on the fly, as it might be resolved differently as we load subprojects further down the build graph
    • We check for a match in that normalized overrides table instead
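
The steps above could be sketched in Python roughly as follows. This is a toy model under stated assumptions, not a proposed patch: `SketchLoader`, its `elements` dict, and its methods are hypothetical stand-ins, and `shallow_load()` only imitates what `Loader._load_file_no_deps()` is expected to provide.

```python
from collections import defaultdict


class SketchLoader:
    """Toy stand-in for a project loader; not BuildStream's Loader class."""

    def __init__(self, elements, overrides):
        # Hypothetical project contents: element name -> {"kind", "target"}
        self.elements = elements
        # Overrides table of the junction which refers to this project
        self.overrides = overrides
        # Reverse lookup table: element name -> set of link names
        self.links_to = defaultdict(set)

    def shallow_load(self, name):
        # Stand-in for a shallow load which does not load dependencies
        return self.elements.get(name)

    def record_override_links(self):
        # Only keys local to this project (no ":") are shallow loaded
        for key in self.overrides:
            if ":" in key:
                continue
            element = self.shallow_load(key)
            if element is not None and element["kind"] == "link":
                self.links_to[element["target"]].add(key)

    def normalized_overrides(self):
        # Rebuilt on every call, since more links may be known later on
        targets = {
            link: target
            for target, links in self.links_to.items()
            for link in links
        }
        normalized = {}
        for key, value in self.overrides.items():
            head, sep, rest = key.partition(":")
            normalized[targets.get(head, head) + sep + rest] = value
        return normalized


loader = SketchLoader(
    elements={
        "subproject-link.bst": {"kind": "link", "target": "subproject.bst"},
        "subproject.bst": {"kind": "junction"},
    },
    overrides={"subproject-link.bst": "myoverridding-junction.bst"},
)
loader.record_override_links()
normalized = loader.normalized_overrides()
```

Here the override written against the link name ends up keyed by `subproject.bst`, so a search using the name of the loaded junction would match. Keys which contain `:` are not shallow loaded in this sketch; their leading component is only substituted if the link was already recorded, and the remaining components are left for when those junction boundaries are crossed.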
Edited by Tristan Van Berkom