Fix ValueError traceback for URL without alias or URI scheme
Description
Previously URL was split into two distinct items by :
, this caused issues when :
was not present in the string.
This went unnoticed due to the majority of cases using a URI scheme (http://
) or alias gitlab:
.
Modify logic to avoid alias, _ = ...
This merge request, when approved, will close issue/bug: #597 (closed)
Merge request reports
Activity
mentioned in issue #599 (closed)
added Bug Needs Backport labels
- Resolved by Tristan Van Berkom
So there is another thing about this patch which is really unfortunate and I wonder if we can fix in some way.
Until @jonathanmaw's work on downloading from source mirrors, all configuration was guaranteed to be read and all calls to
Source.translate_url()
guaranteed to be made atPlugin.configure()
or at the latestPlugin.preflight()
time.This would have made it possible to collect any of these instances at startup time, and issue a single warning after having loaded all the elements with a nice summary.
But instead with this patch:
- The urls without aliases are not batched into a single warning at load time
- The warnings are not guaranteed to happen at startup time
Because of how the
SourceFetcher
API works, some of these warnings will only happen once you actually get to fetching agit
source, or one of it's submodules.I think that there are two different kinds of warning here too which can occur:
- A project author specified a URL without specifying an alias
- This applies to any url, including urls that are specified in the
git
submodules configuration
- This applies to any url, including urls that are specified in the
- A project author has not specified any url at all for one of the submodules of a
git
repository- This could be a side effect of using
bst track
on agit
source and causing the newly tracked reference to point to a commit sha which introduces a new submodule.
- This could be a side effect of using
For the first class of warning; it is entirely possible to guarantee that the warnings will occur at load time, before any processing begins; saving the user valuable time - for the second class of warning there is not much we can do about it.
Is there any way that we can push this a bit further, and guarantee that this first class of warning is issued after loading of elements and before processing starts ?
changed milestone to %BuildStream_v1.2
added Urgent label
Additional note: the fact that parsing of configuration risks being delayed with the
SourceFetcher()
APIs is a bad oversight, we should guarantee that all user provided configuration is loaded and checked at startup time.This might or might not be possible to guarantee with the current
SourceFetcher()
API, but seeing as this invariant of validate-before-processing is very bad to break, I would be open to a last minuteSource
API change inbst-1.2
to accommodate fixing this.Alternatively, this might be achieved as a strong guideline for
Source
plugins; if there is no way to enforce this with the API, in which case we could add a core assertion in the case that aSource
plugin callstranslate_url()
on an unaliased URI that the core has not yet seen after loading (I.e.: We can call it a bug in the plugin, if the plugin fails to properly validate it's configured URLs at load time but circles back and tries to validate them later).It seems to me though; that probably the best thing to do is to make sure that the fetchers are instantiated once at load time. This will significantly change how the
git
source will have to be implemented, but as far as I can see:- The git source can instantiate a fetcher for the main git repo
- The git source can instantiate a fetcher for each URL that the project author specified
- At fetch time, the main fetcher can be responsible for fetching any submodules which were not specified by project configuration (possibly issuing the second class of warning mentioned in the previous comment)
- It is entirely pointless to have a separate
SourceFetcher
for a URL that is not specified in project configuration anyway - you cannot use source mirroring on a url that the project author did not alias
I think that the best approach here, is to avoid the
git
plugin using it's internalGitMirror()
helper object as a direct subclass ofSourceFetcher()
, and ensure that all fetchers have been instantiated at project load time.This should happen after
Plugin.configure()
and beforePlugin.preflight()
, giving theSource
implementations a chance to preflight with an initialized list of fetchers.I think forcing sources to instantiate their fetchers at configuration time is the right call, ensuring that urls are translated/marked for download before preflight. I think adding an assertation to ensure this is a good place to start and figure out how we can adapt the
git
source accordingly.Edited by Qinustyadded 30 commits
-
6c7e8c91...f7ef151b - 29 commits from branch
master
- 6c7ec7b5 - Prevent traceback on URL without alias
-
6c7e8c91...f7ef151b - 29 commits from branch
I have a basic implementation of enforcing url resolution being performed within
configure()
, however this will not currently work without an overhaul of the current system. As you say, the currentgit
implementation relies on being able to resolve urls throughout the main fetch logic.source.py also relies on this as source.py:912 instantiates new Sources and performs preflight for each mirror during a fetch.
Overall, I'm not too sure we're going to be able to enforce this right now until we take a step back and think it through properly.
assigned to @Qinusty