Follow-up from "Refactoring rake task to import GitHub repositories"
The following discussions from !10695 (merged) should be addressed:
Should we DRY the
reset_callbackscall moving them to a module? Since they are all the same. Wondering if you can get rid of these classes altogether and just use them in the models with an
if :importing?or something along those lines...
This class looks like it's following a Facade pattern but it includes all the responsibilities on itself - so it looks like it's doing way too many things. In other imports we have a class per different task... For instance you could have a repofetcher, wiki importer, etc... We already have those classes on other imports so maybe they could even be reused or extended if there are some differences... Wdyt? By doing SRP it would be way easier/cleaner to test as well.
should we do this before instantiating a new object?
Since you have a representation, wouldn't it much easier trying to call
to_hashon that representation object? You can put on the base class a basic
to_hashmethod that does something like
instance_variables.eachso you don't have to do any of this...
You probably wouldn't need such a long assignment and could use
to_hashto populate everything following !10695 (diffs, comment 27869476)
Crazy idea. Wondering if after we create the MR, we could avoid doing creating the diffs or even the comments and schedule to do that in parallel after, so once we fetch all MRs, there's another thread that just loops and creates all diffs - not sure if it's possible so I'm asking you. I know there are certain actions importing from GitHub that need to occur in a certain order, but if we can parallelize things that are not dependent on anything else, that would speed up the process
a lot of these methods do the same thing... If we extracted them into classes (as I suggested earlier) we could probably extract a lot of the logic into a
BaseFetcherclass if it's better to use inheritance, or into a generic
UrlFetcherthat yields the response using composition instead.
What happens if we retry the import? Would it resume? Should we cache stuff using Redis?