Follow-up from "Refactoring rake task to import GitHub repositories"
The following discussions from !10695 (merged) should be addressed:
-
@jameslopez started a discussion: (+3 comments) Should we DRY the
reset_callbacks
call moving them to a module? Since they are all the same. Wondering if you can get rid of these classes altogether and just use them in the models with anif :importing?
or something along those lines... -
@jameslopez started a discussion: This class looks like it's following a Facade pattern but it includes all the responsibilities on itself - so it looks like it's doing way too many things. In other imports we have a class per different task... For instance you could have a repofetcher, wiki importer, etc... We already have those classes on other imports so maybe they could even be reused or extended if there are some differences... Wdyt? By doing SRP it would be way easier/cleaner to test as well.
-
@jameslopez started a discussion: (+2 comments) should we do this before instantiating a new object?
-
@jameslopez started a discussion: (+1 comment) Since you have a representation, wouldn't it much easier trying to call
to_hash
on that representation object? You can put on the base class a basicto_hash
method that does something likeinstance_variables.each
so you don't have to do any of this... -
@jameslopez started a discussion: You probably wouldn't need such a long assignment and could use
to_hash
to populate everything following https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/10695/diffs#note_27869476 -
@jameslopez started a discussion: (+1 comment) Crazy idea. Wondering if after we create the MR, we could avoid doing creating the diffs or even the comments and schedule to do that in parallel after, so once we fetch all MRs, there's another thread that just loops and creates all diffs - not sure if it's possible so I'm asking you. I know there are certain actions importing from GitHub that need to occur in a certain order, but if we can parallelize things that are not dependent on anything else, that would speed up the process
🤔 -
@jameslopez started a discussion: a lot of these methods do the same thing... If we extracted them into classes (as I suggested earlier) we could probably extract a lot of the logic into a
BaseFetcher
class if it's better to use inheritance, or into a genericUrlFetcher
that yields the response using composition instead. -
@jameslopez started a discussion: (+2 comments) What happens if we retry the import? Would it resume? Should we cache stuff using Redis?