Analyze exceptions to the `git fsck` command

Problem

As more customers import legacy repositories into GitLab, they may run into malformed commit messages or other problems detected by `git fsck`. At present, we have a set of exceptions to `git fsck` that can be configured on a per-instance basis, so self-managed users can control which failures are allowed. For our SaaS offering, however, we currently apply a standard set of allowed failures that may be more restrictive than necessary.
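
For context, Git itself exposes such exceptions through the `fsck.<msg-id>` configuration family (with `receive.fsck.<msg-id>` and `fetch.fsck.<msg-id>` covering incoming objects), where each message ID can be set to `error`, `warn`, or `ignore`. The following is a minimal Python sketch of building such overrides for a one-off check. The helper name `fsck_override_args` and the chosen message IDs are illustrative assumptions, not the current GitLab.com configuration.

```python
# Severities that Git accepts for fsck.<msg-id> settings.
ALLOWED_SEVERITIES = {"error", "warn", "ignore"}


def fsck_override_args(overrides):
    """Turn {msg_id: severity} into `-c fsck.<msg-id>=<severity>` arguments.

    Hypothetical helper for illustration only.
    """
    args = []
    for msg_id, severity in sorted(overrides.items()):
        if severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unsupported severity for {msg_id}: {severity}")
        args += ["-c", f"fsck.{msg_id}={severity}"]
    return args


# Example overrides (assumed values, chosen only to show the shape).
command = [
    "git",
    *fsck_override_args({"badTimezone": "ignore", "missingEmail": "warn"}),
    "fsck",
]
```

The resulting `command` list could be passed to `subprocess.run` inside a repository to see how a given set of exceptions changes the `git fsck` result.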

Analysis required

Before recommending changes to how GitLab.com is configured to handle `git fsck` failures, we need to evaluate every candidate exception against the following two questions:

Does allowing the error open us to an abuse or attack vector? If so, we cannot allow the exception.
If the error is acceptable from a security standpoint, does allowing it have a downstream impact? We must validate that it does not. For example, allowing malformed commit email addresses may not pose an attack vector, but may cause issues in other areas of GitLab that assume all email addresses are well formed.
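
The two questions above can be read as a pair of gates that each candidate exception must pass in order. The sketch below illustrates that reading; the class `FsckFinding`, its field names, and the example values are hypothetical placeholders, not results of the actual analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FsckFinding:
    """One candidate exception and the outcome of both reviews (hypothetical)."""
    msg_id: str
    opens_attack_vector: bool
    has_downstream_impact: bool


def may_allow(finding):
    """Return True only if the exception passes both gates."""
    # Gate 1: anything that opens an abuse or attack vector is rejected outright.
    if finding.opens_attack_vector:
        return False
    # Gate 2: a safe exception is still rejected if other parts of GitLab break.
    return not finding.has_downstream_impact


# Placeholder mirroring the email example above: safe, but with downstream impact.
bad_email = FsckFinding("badEmail", opens_attack_vector=False, has_downstream_impact=True)
print(may_allow(bad_email))
```

Recording both answers per message ID, rather than only the final verdict, would also give us the material needed to document the decisions publicly.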

Once we've completed this analysis, we should propose an updated configuration for GitLab.com and document our decisions publicly to avoid further questions on this topic.

Preferred solution

After walking through some of the options, we believe the best path forward is to upstream changes to Git so that every issue it identifies falls into one of three levels: Critical, Error, or Warning. That way, the security and safety analysis would be performed once, by the experts. We could then determine our response based on the criticality of the error received.

This would require an upstream change to Git, which we still need to identify and propose.
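
To make the idea concrete, here is a rough sketch of how a response could be derived from the three proposed levels. The level names come from the proposal above; the specific responses and the `allowed_by_admin` flag are assumptions for illustration, since the actual policy has not been decided.

```python
from enum import Enum


class Severity(Enum):
    """The three proposed levels."""
    CRITICAL = "critical"
    ERROR = "error"
    WARNING = "warning"


def response_for(severity, allowed_by_admin=False):
    """Pick a response for an issue of the given severity (assumed policy)."""
    if severity is Severity.CRITICAL:
        # Assumption: critical issues can never be overridden.
        return "reject"
    if severity is Severity.ERROR:
        # Assumption: errors are rejected unless explicitly allowed.
        return "accept" if allowed_by_admin else "reject"
    # Assumption: warnings never block an import.
    return "accept"
```

Under this sketch, only the Error level would remain configurable, which keeps the per-instance exception list short.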

Edited Sep 18, 2024 by Mark Wood