Geo - Improve checksum algorithm to take more refs into account

From a Slack thread:

@zj: I was reading the Checksum code, and I think Geo might not sync data correctly if I push a reference to for example: refs/zj-trolls/geo. The checksum will not take this reference into account, will it? And than the primary will have that data, the secondary doesn't?

@stanhu: Geo cares about these refs: https://gitlab.com/gitlab-org/gitaly/-/blob/a01c38381deef1601448447f7bde0083be78a043/internal/gitaly/service/repository/calculate_checksum.go#L19

@dbalexandre:

❯ git push origin main:refs/zj-trolls/geo
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
To dbalexandre-test-1-primary-geo.gogitlab.ml:root/test-checksum.git
 * [new reference]   main -> refs/zj-trolls/geo-1

Creating a new reference as above does not change the checksum because we don't have refs/heads/zj-trolls/geo-1 Geo does not care about it https://gitlab.com/gitlab-org/gitaly/-/blob/a01c38381deef1601448447f7bde0083be78a043/internal/gitaly/service/repository/calculate_checksum.go#L19, but Geo sync these changes to secondary nodes.

❯ git co -b refs/zj-trolls/geo-2
Switched to a new branch 'refs/zj-trolls/geo-2'
~/Sandbox/test-checksum on refs/zj-trolls/geo-2
❯ git push origin refs/zj-trolls/geo-2
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
remote:
remote: To create a merge request for refs/zj-trolls/geo-2, visit:
remote:   http://dbalexandre-test-1-primary-geo.gogitlab.ml/root/test-checksum/-/merge_requests/new?merge_request%5Bsource_branch%5D=refs%2Fzj-trolls%2Fgeo-2
remote:
To dbalexandre-test-1-primary-geo.gogitlab.ml:root/test-checksum.git
* [new branch]      refs/zj-trolls/geo-2 -> refs/zj-trolls/geo-2

Creating a new branch as above changes the checksum because we have refs/heads/zj-trolls/geo-2, and Geo sync the changes to the secondary nodes.

@fzimmer: So is this something that we need to address?

dbalexandre: The additional refs may increase load and flag more repositories as out of sync, but we actively try to correct verification failures on the secondary node. So, I think it's fine to add these refs.

@dbalexandre: We also need to consider checksum invalidation when the algorithm changes, as any checksums generated with the old algorithm, will become invalid.

/cc @fzimmer @nhxnguyen

Edited by Douglas Barbosa Alexandre