Technical Investigation: Decide if we can fix Cobertura report paths in the background

Topic to Evaluate

As per #217664 (comment 434986155), we should investigate whether we can solve most of the pathing issues with Cobertura reports in the background before we consider adding an external tool to parse the reports.

This might look like:

  1. Fetch the build's CI_PROJECT_DIR. We will use this value to strip the absolute prefix from the source nodes in the XML, so that we end up with source paths that are relative to the project root.
  2. Fetch Ci::Pipeline#all_worktree_paths and build a Set out of it for fast lookup of file paths.
  3. As we traverse each class node in the XML, we will form a path from the source (with CI_PROJECT_DIR stripped from it) and the class's filename, and then check whether that path exists in the worktree set. If it does, we will use it as the class's filename.
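
The lookup in step 3 can be sketched roughly as follows. This is a minimal illustration in plain Ruby, not GitLab's actual parser code: the helper name resolve_class_path and its keyword arguments are hypothetical, and it assumes the caller has already obtained the CI_PROJECT_DIR value, the report's source values, and a Set of repository-relative paths (for example, built from Ci::Pipeline#all_worktree_paths).

```ruby
require 'set'

# Hypothetical helper (not part of GitLab): maps a Cobertura class filename
# to a path relative to the project root, or returns nil if no match is found.
#
# project_dir    - assumed value of CI_PROJECT_DIR, e.g. "/builds/group/project"
# worktree_paths - Set of repository-relative file paths
# sources        - text of the <source> nodes in the report
# filename       - "filename" attribute of a <class> node
def resolve_class_path(project_dir:, worktree_paths:, sources:, filename:)
  prefix = project_dir.chomp('/')

  sources.each do |source|
    source = source.strip

    # Skip sources outside the project directory; they cannot be mapped.
    next unless source == prefix || source.start_with?("#{prefix}/")

    # Strip CI_PROJECT_DIR so the source is relative to the project root.
    relative_source = source.delete_prefix(prefix).delete_prefix('/')
    candidate = relative_source.empty? ? filename : File.join(relative_source, filename)

    # Only accept the candidate if the file really exists in the worktree.
    return candidate if worktree_paths.include?(candidate)
  end

  nil
end

paths = Set.new(['app/models/user.rb', 'lib/foo.rb'])

resolve_class_path(
  project_dir: '/builds/group/project',
  worktree_paths: paths,
  sources: ['/builds/group/project/app'],
  filename: 'models/user.rb'
)
# => "app/models/user.rb"
```

Using a Set keeps each membership check at constant time on average, which matters when a report contains many class nodes. Returning nil for unmatched classes leaves it to the caller to decide whether to keep the original filename or skip the entry.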

So far, there are 2 uncertain parts here:

  1. I have not found a way to access CI_PROJECT_DIR, which is a predefined CI/CD variable, from Ci::Build.variables when testing this locally on the Rails console. If we would like to go this route, making it available is a good first step.
  2. Is it safe to call Ci::Pipeline#all_worktree_paths on a huge project such as GitLab?

Overall, I like this option if we can get it to work, because then users won't have to do anything new to get their Cobertura files to work. We also won't increase their build time by rewriting their XML file with a CLI tool. Instead, we transfer the burden to our own background workers.

Tasks to Evaluate

  • Determine feasibility of the feature
  • Determine if we can access CI_PROJECT_DIR from Ci::Build.variables
  • Determine safety of calling Ci::Pipeline#all_worktree_paths on gitlab-org/gitlab
  • Update proposal in #217664 (closed)

Risks and Implementation Considerations

Team

/cc @iamricecake @jheimbuck_gl