
Draft: GithubImporter: Refactoring representation layer

What does this MR do and why?

This is the first part of a bigger plan to improve the GithubImporter Representation layer.

Context

The GitHub importer uses a variation of the ETL architecture, where:

  • Extraction happens in Importers (with pluralized resource names);
  • Transformation happens in the Representation layer;
  • Loading happens in Importers (with singular resource names).

Work in this commit

Currently, some transformations do not happen exclusively in the Representation objects; instead, they leak into the Loading layer. This happens because some transformations depend on the context of the import, such as the project being imported.

  • This first step:

    • creates Gitlab::GithubImporter::Context to pass the import context to the representations;
    • moves the API common to the representations into Representation::Base;
    • adds more context to the representations so that they can perform all the transformations needed by the Loading layer. At this moment, the project being imported and the GitHub client being used are passed to the Representation class.
      • #initialize,
      • #parse,
      • #deserialize,
      • #github_identifiers
    • renames from_api_response to parse;
    • renames from_json_hash to deserialize;
    • adds parse_with to parse nested objects with the same context;
    • adds deserialize_with to deserialize nested objects with the same context.
  • The next step is to move the transformations that currently happen outside the Representation layer back into these classes.

  • Example of what can be removed from the Import (load) layer: !72458 (diffs)
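To make the list above more concrete, here is a minimal, self-contained Ruby sketch of how a shared base class could carry the import context through parse, deserialize, parse_with and deserialize_with. It is not the code in this MR: the Context fields (project, client), the method signatures, the attributes_from_api / attributes_from_hash hooks, and the User and Issue subclasses are all assumptions made for illustration. Only the method names come from the description.

```ruby
require "json"

# Hypothetical stand-in for Gitlab::GithubImporter::Context.
# The fields are assumed from "the project being imported and GitHub client".
Context = Struct.new(:project, :client, keyword_init: true)

# Hypothetical stand-in for Representation::Base.
class Base
  attr_reader :attributes, :context

  # Formerly from_api_response: build a representation from a raw API payload.
  # Subclasses are expected to define attributes_from_api.
  def self.parse(raw, context)
    new(attributes_from_api(raw, context), context)
  end

  # Formerly from_json_hash: rebuild a representation from a serialized hash.
  def self.deserialize(hash, context)
    new(attributes_from_hash(hash.transform_keys(&:to_sym), context), context)
  end

  # Parse a nested object, handing it the same context as its parent.
  def self.parse_with(klass, raw, context)
    klass.parse(raw, context) if raw
  end

  # Deserialize a nested object, handing it the same context as its parent.
  def self.deserialize_with(klass, hash, context)
    klass.deserialize(hash, context) if hash
  end

  # Default hook: a flat hash needs no extra work after deserialization.
  def self.attributes_from_hash(hash, _context)
    hash
  end

  def initialize(attributes, context)
    @attributes = attributes
    @context = context
  end

  # Serialize to plain hashes, descending into nested representations.
  def to_hash
    attributes.transform_values { |value| value.is_a?(Base) ? value.to_hash : value }
  end
end

class User < Base
  def self.attributes_from_api(raw, _context)
    { id: raw[:id], login: raw[:login] }
  end

  def github_identifiers
    { id: attributes[:id], login: attributes[:login] }
  end
end

class Issue < Base
  def self.attributes_from_api(raw, context)
    { iid: raw[:number], title: raw[:title], author: parse_with(User, raw[:user], context) }
  end

  def self.attributes_from_hash(hash, context)
    hash.merge(author: deserialize_with(User, hash[:author], context))
  end

  def github_identifiers
    { iid: attributes[:iid] }
  end
end

context = Context.new(project: "group/project", client: :fake_client)
issue = Issue.parse({ number: 7, title: "Bug", user: { id: 1, login: "alice" } }, context)

# Round trip through JSON, as would happen between the extraction and loading steps.
restored = Issue.deserialize(JSON.parse(JSON.generate(issue.to_hash)), context)
restored.attributes[:author].context.equal?(context) # nested object shares the context
```

In this sketch the nested User never builds its own context; it always receives the parent's, which is the property that parse_with and deserialize_with are described as providing.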

Related to: #330331

Screenshots or screen recordings

Current architecture overview
sequenceDiagram
    participant GithubAPI
    participant Stage
    participant Representation
    participant ObjectImporter

    Stage ->> GithubAPI: Fetch Collection
    activate GithubAPI
    GithubAPI ->> Stage: Collection of objects
    deactivate GithubAPI

    loop every object
        Stage ->> Representation: from_api_response (serialize)
        activate Representation
        Representation ->> Stage: serialized object
        deactivate Representation

        Stage ->> ObjectImporter: execute (serialized object)
        
        ObjectImporter ->> Representation: from_json_hash
        activate Representation
        Representation ->> ObjectImporter: deserialized object
        deactivate Representation
        
        Note right of ObjectImporter: At this point<br>the ObjectImporter<br>uses the deserialized object and some<br>transformations from the Representation<br>to build the attributes (more transformations) to<br>save the object on GitLab
    end

Proposed architecture overview
sequenceDiagram
    participant GithubAPI
    participant Stage
    participant Representation
    participant ObjectImporter

    Stage ->> GithubAPI: Fetch Collection
    activate GithubAPI
    GithubAPI ->> Stage: Collection of objects
    deactivate GithubAPI

    loop every object
        Stage ->> Representation: serialize
        activate Representation
        Representation ->> Stage: serialized object
        deactivate Representation

        Stage ->> ObjectImporter: execute (serialized object)
        
        ObjectImporter ->> Representation: deserialize
        activate Representation
        Representation ->> ObjectImporter: deserialized object
        deactivate Representation
        
        Note right of ObjectImporter: Instead of having transformations in both<br>Representation and ObjectImporter,<br>the end goal is to move all the<br>transformations to the Representation layer
    end
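
The proposed flow in the diagram above can be sketched in Ruby as follows. This is an illustration only: NoteRepresentation, the in-memory store, the JSON string passed between the two steps, and the way Stage and ObjectImporter are wired together are all hypothetical, and the import context is left out for brevity. The point it tries to show is that every transformation sits in the representation, while the importer only deserializes and saves.

```ruby
require "json"

# Hypothetical representation: owns every transformation.
class NoteRepresentation
  attr_reader :attributes

  # "serialize" step in the diagram: raw API payload -> representation.
  def self.parse(raw)
    new(note: raw[:body].strip, author_login: raw.dig(:user, :login))
  end

  # "deserialize" step: hash that crossed the queue boundary -> representation.
  def self.deserialize(hash)
    new(hash.transform_keys(&:to_sym))
  end

  def initialize(attributes)
    @attributes = attributes
  end

  def to_hash
    attributes
  end
end

# Hypothetical loader: no transformations, it only persists what it receives.
class ObjectImporter
  def initialize(store)
    @store = store
  end

  def execute(serialized)
    representation = NoteRepresentation.deserialize(JSON.parse(serialized))
    @store << representation.attributes
  end
end

# Hypothetical stage: serializes each fetched object and hands it to the importer.
class Stage
  def initialize(collection, importer)
    @collection = collection
    @importer = importer
  end

  def execute
    @collection.each do |raw|
      serialized = JSON.generate(NoteRepresentation.parse(raw).to_hash)
      @importer.execute(serialized)
    end
  end
end

store = []
collection = [{ body: "  Looks good  ", user: { login: "alice" } }]
Stage.new(collection, ObjectImporter.new(store)).execute
```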

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Kassio Borges
