Expand project Import/Export test coverage and ensure we use a data set that includes all project features
As part of our team effort to improve Project Import performance and memory consumption, we are making structural changes to the related code to make it simpler and clearer: #33226 (closed).
It is important that Import/Export is extensively tested and covered as part of our CI process, so that we avoid broken imports/exports for our users and customers.
Currently, we have a number of specs (e.g. spec/lib/gitlab/import_export/project_tree_restorer_spec.rb) that use a pre-generated exported JSON (spec/fixtures/lib/gitlab/import_export/complex/project.json and simpler versions) to check that.
Recently, we found that our project.json doesn't cover some of the cases: !18024 (comment 237306938)
It was possible to catch that by importing gitlabhq_export.tar.gz from https://gitlab.com/gitlab-org/quality/performance-data
Please note that this tar is 3 months old and seems to be updated manually. Maybe we should change that?
Also, Kamil mentioned that gitlabhq is not a complete data set, as it doesn't use all the features.
Overall, that means our CI coverage is not reliable enough and we should improve it.
We want to test Import/Export on a "complete" data set, which means that every relation (as in our Import/Export definition) and every project feature is covered.
What we need to understand as part of this issue:
- How to expand our current specs to include all relations and features? Which ones are we obviously missing now?
- How to ensure that new relations and new features will be mapped into our Import/Export specs? (it may be a separate problem to solve, but quite an important one)
- Do we need to include some tar import as part of our CI (similar to what we do with gitlabhq manually), or will specs be enough?
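
One way to approach the first two questions is a check that walks the relation tree and reports every relation that has no records in the fixture. The following is only a minimal sketch under assumptions: `RELATION_TREE` is a hypothetical, hand-written stand-in for the real Import/Export definition, the fixture is a tiny inline document rather than our project.json, and `missing_relations` is an illustrative helper, not existing code.

```ruby
require 'json'

# Hypothetical, simplified relation tree (NOT the real Import/Export definition).
# Each entry is either a relation name or a { relation => [nested relations] } hash.
RELATION_TREE = [
  :labels,
  { issues: [:notes, :events] },
  { merge_requests: [:notes] }
].freeze

# Returns the dotted paths of relations that have no records in the fixture.
# Relations nested under a missing parent are not listed separately.
def missing_relations(tree, records, prefix = nil)
  tree.flat_map do |entry|
    pairs = entry.is_a?(Hash) ? entry.to_a : [[entry, []]]

    pairs.flat_map do |name, children|
      path = [prefix, name].compact.join('.')

      # Collect this relation's records across all parent records.
      # A single nested object (has-one style) is wrapped in an array.
      nested = records.flat_map do |record|
        value = record[name.to_s]
        value.is_a?(Hash) ? [value] : Array(value)
      end

      nested.empty? ? [path] : missing_relations(children, nested, path)
    end
  end
end

# Tiny inline fixture standing in for an exported project JSON.
fixture = JSON.parse(<<~JSON)
  {
    "labels": [],
    "issues": [{ "title": "An issue", "notes": [{ "note": "hi" }] }],
    "merge_requests": []
  }
JSON

puts missing_relations(RELATION_TREE, [fixture])
```

If a spec asserted that this list is empty for the complex fixture, then adding a new relation to the definition without extending the fixture would fail CI, which speaks to the second question as well.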
Kamil suggested the following structure for the "functional" test, which I find reasonable:
you import complex json, and export complex json, they should be equal
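
The comparison step of such a test could be sketched as follows. This is an illustration under assumptions, not the actual spec: `json_diff` is a hypothetical helper, and `lossy_round_trip` is a hypothetical stand-in for "import the JSON, then export it again" that deliberately drops a relation so the diff has something to report. A real spec would run the actual import and export code instead.

```ruby
require 'json'

# Returns the paths at which two parsed JSON documents differ.
# Note: a missing key and a key with a null value are treated as equal here.
def json_diff(expected, actual, path = '$')
  if expected.is_a?(Hash) && actual.is_a?(Hash)
    (expected.keys | actual.keys).flat_map do |key|
      json_diff(expected[key], actual[key], "#{path}.#{key}")
    end
  elsif expected.is_a?(Array) && actual.is_a?(Array)
    length = [expected.length, actual.length].max
    (0...length).flat_map do |index|
      json_diff(expected[index], actual[index], "#{path}[#{index}]")
    end
  elsif expected == actual
    []
  else
    [path]
  end
end

# Hypothetical stand-in for an import followed by an export.
# It drops every "events" key to imitate a relation that gets lost on the way.
def lossy_round_trip(data)
  case data
  when Hash
    data.reject { |key, _| key == 'events' }
        .transform_values { |value| lossy_round_trip(value) }
  when Array
    data.map { |value| lossy_round_trip(value) }
  else
    data
  end
end

original = JSON.parse('{"issues":[{"title":"An issue","events":[{"action":1}]}]}')
puts json_diff(original, lossy_round_trip(original))
```

Reporting the differing paths rather than a plain equal/not-equal result makes a failure point directly at the relation that was lost. Values that are expected to change on import (for example generated IDs) would probably need to be excluded before comparing; that is an assumption and is not handled in this sketch.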
cc @ayufan