Skip to content
Snippets Groups Projects

Fix ValueError traceback for URL without alias or URI scheme

Merged Qinusty requested to merge Qinusty/597-non-alias-url-fix into master

Description

Previously URL was split into two distinct items by :, this caused issues when : was not present in the string.

This went unnoticed due to the majority of cases using a URI scheme (http://) or alias gitlab:.

Modify logic to avoid alias, _ = ...

This merge request, when approved, will close issue/bug: #597 (closed)

Edited by Qinusty

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • So there is another thing about this patch which is really unfortunate and I wonder if we can fix in some way.

    Until @jonathanmaw's work on downloading from source mirrors, all configuration was guaranteed to be read and all calls to Source.translate_url() guaranteed to be made at Plugin.configure() or at the latest Plugin.preflight() time.

    This would have made it possible to collect any of these instances at startup time, and issue a single warning after having loaded all the elements with a nice summary.

    But instead with this patch:

    • The urls without aliases are not batched into a single warning at load time
    • The warnings are not guaranteed to happen at startup time

    Because of how the SourceFetcher API works, some of these warnings will only happen once you actually get to fetching a git source, or one of it's submodules.

    I think that there are two different kinds of warning here too which can occur:

    • A project author specified a URL without specifying an alias
      • This applies to any url, including urls that are specified in the git submodules configuration
    • A project author has not specified any url at all for one of the submodules of a git repository
      • This could be a side effect of using bst track on a git source and causing the newly tracked reference to point to a commit sha which introduces a new submodule.

    For the first class of warning; it is entirely possible to guarantee that the warnings will occur at load time, before any processing begins; saving the user valuable time - for the second class of warning there is not much we can do about it.

    Is there any way that we can push this a bit further, and guarantee that this first class of warning is issued after loading of elements and before processing starts ?

  • changed milestone to %BuildStream_v1.2

  • added Urgent label

  • Additional note: the fact that parsing of configuration risks being delayed with the SourceFetcher() APIs is a bad oversight, we should guarantee that all user provided configuration is loaded and checked at startup time.

    This might or might not be possible to guarantee with the current SourceFetcher() API, but seeing as this invariant of validate-before-processing is very bad to break, I would be open to a last minute Source API change in bst-1.2 to accommodate fixing this.

    Alternatively, this might be achieved as a strong guideline for Source plugins; if there is no way to enforce this with the API, in which case we could add a core assertion in the case that a Source plugin calls translate_url() on an unaliased URI that the core has not yet seen after loading (I.e.: We can call it a bug in the plugin, if the plugin fails to properly validate it's configured URLs at load time but circles back and tries to validate them later).

    It seems to me though; that probably the best thing to do is to make sure that the fetchers are instantiated once at load time. This will significantly change how the git source will have to be implemented, but as far as I can see:

    • The git source can instantiate a fetcher for the main git repo
    • The git source can instantiate a fetcher for each URL that the project author specified
    • At fetch time, the main fetcher can be responsible for fetching any submodules which were not specified by project configuration (possibly issuing the second class of warning mentioned in the previous comment)
    • It is entirely pointless to have a separate SourceFetcher for a URL that is not specified in project configuration anyway - you cannot use source mirroring on a url that the project author did not alias

    I think that the best approach here, is to avoid the git plugin using it's internal GitMirror() helper object as a direct subclass of SourceFetcher(), and ensure that all fetchers have been instantiated at project load time.

    This should happen after Plugin.configure() and before Plugin.preflight(), giving the Source implementations a chance to preflight with an initialized list of fetchers.

  • Marked Urgent as this might require API break before next week - once we have a solid plan in place we can deescalate from Urgent.

  • Author Developer

    I think forcing sources to instantiate their fetchers at configuration time is the right call, ensuring that urls are translated/marked for download before preflight. I think adding an assertation to ensure this is a good place to start and figure out how we can adapt the git source accordingly.

    Edited by Qinusty
  • Let's start by fixing the assertions, and handle the more complex problem of providing a nice report of all the unaliased URLs in a single WARNING at load time separately.

  • Qinusty added 30 commits

    added 30 commits

    Compare with previous version

  • Author Developer

    I have a basic implementation of enforcing url resolution being performed within configure(), however this will not currently work without an overhaul of the current system. As you say, the current git implementation relies on being able to resolve urls throughout the main fetch logic.

    source.py also relies on this as source.py:912 instantiates new Sources and performs preflight for each mirror during a fetch.

    Overall, I'm not too sure we're going to be able to enforce this right now until we take a step back and think it through properly.

  • assigned to @Qinusty

  • Qinusty added 1 commit

    added 1 commit

    • e1536545 - Prevent ValueError on URLs without an alias

    Compare with previous version

  • Qinusty changed the description

    changed the description

  • Qinusty added 1 commit

    added 1 commit

    • e15d98f0 - Prevent ValueError on URLs without an alias

    Compare with previous version

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading