geo.staging.gitlab.com snippet repository replication failures
I've reset some of the failures and they synced fine.
valery 11 hours ago
I'm going to reset all of them.
valery 10 hours ago
Most of them failed again. All of the ones that failed now show a different error from the one I saw on those I fixed earlier: Error syncing repository: 2:fetch remote: \"fatal: the remote end hung up unexpectedly\\n\": exit status 128.
valery 10 hours ago
It may be environment-related, because the repos are invalid on staging (gstg). See: Gitlab::Git::Repository::NoRepository (5:GetRepoPath: not a git repository '/var/opt/gitlab/git-data/repositories/@snippets/fd/85/fd85acb07af078e66c6f2261ac372958cc9782cb033b3fa384ca5f130985c9a0.git'.)
valery 9 hours ago
I've asked in #staging https://gitlab.slack.com/archives/CLM200USV/p1608061024071600
valery
Any idea why half of the Snippet repositories are invalid on staging? I call Snippet[...].repository.branches and it fails with Gitlab::Git::Repository::NoRepository (5:GetRepoPath: not a git repository
Posted in #staging | Today at 11:37 AM | View message
Nick Nguyen 6 hours ago
cc @douglas in case you have an idea or know who else could help investigate
douglas 6 hours ago
https://sentry.gitlab.net/gitlab/geo-staging-gitlabcom/issues/1723366
douglas 6 hours ago
https://sentry.gitlab.net/gitlab/geo-staging-gitlabcom/issues/1536391
douglas 6 hours ago
These two Sentry errors seem related.
douglas 6 hours ago
> failed_ids = Geo::SnippetRepositoryRegistry.failed.pluck(:snippet_repository_id)
> shard_ids = SnippetRepository.where(snippet_id: failed_ids).pluck(:shard_id).uniq
> shards = Shard.find(shard_ids)
=> [#<Shard id: 88, name: "nfs-file22">, #<Shard id: 8, name: "nfs-file07">]
#
# "nfs-file07" => {"path"=>"/var/opt/gitlab/git-data-file07", "gitaly_address"=>"tcp://file-01-stor-gstg.c.gitlab-staging-1.internal:9999"}
# "nfs-file22" => {"path"=>"/var/opt/gitlab/git-data-file22", "gitaly_address"=>"tcp://i.gstg-gcp-tcp-lb-internal-praefect.il4.us-east1.lb.gitlab-staging-1.internal:2305"}
#
> SnippetRepository.where(snippet_id: failed_ids).group(:shard_id).count
=> {88=>19529, 8=>11485}
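For anyone reading along without console access, here is a rough plain-Ruby sketch of what the grouping above does: filter repositories to the failed ids, then count them per shard name. The `Shard` and `SnippetRepo` Structs, the `failed_count_by_shard` helper, and the sample rows are all hypothetical stand-ins, not the real models; only the shard ids and names are taken from the output above.

```ruby
# Hypothetical stand-ins for the ActiveRecord models used in the console.
# Only the shard ids/names (88 => "nfs-file22", 8 => "nfs-file07") come
# from the thread; the repository rows below are made-up sample data.
Shard = Struct.new(:id, :name)
SnippetRepo = Struct.new(:snippet_id, :shard_id)

# Count repositories per shard name, limited to the given failed ids.
def failed_count_by_shard(repos, shards, failed_ids)
  names = shards.to_h { |s| [s.id, s.name] }
  repos
    .select { |r| failed_ids.include?(r.snippet_id) }
    .map { |r| names.fetch(r.shard_id, "unknown shard #{r.shard_id}") }
    .tally
end

shards = [Shard.new(88, "nfs-file22"), Shard.new(8, "nfs-file07")]
repos = [
  SnippetRepo.new(1, 88),
  SnippetRepo.new(2, 88),
  SnippetRepo.new(3, 8),
  SnippetRepo.new(4, 8) # not in the failed ids below, so it is ignored
]

puts failed_count_by_shard(repos, shards, [1, 2, 3]).inspect
```

Resolving shard ids to names in the same step saves a second lookup when reading the counts.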
douglas 6 hours ago
[ gstg ] production> s = SnippetRepository.find(3945)
=> #<SnippetRepository shard_id: 88, snippet_id: 3945, disk_path: "@snippets/53/66/5366ec7df49331f43da1f43fedc75ce2b3...", verification_retry_count: nil, verification_retry_at: nil, verified_at: nil, verification_checksum: nil, verification_failure...
[ gstg ] production> s.repository
=> #<Repository:@snippets/53/66/5366ec7df49331f43da1f43fedc75ce2b333ccc2b39fd9aca07abbfee570112b>
[ gstg ] production> s.repository.exists?
=> true
[ gstg ] production> s.repository.empty?
=> true
douglas 6 hours ago
We need to double-check but it seems related to the staging environment.
douglas 6 hours ago
The repositories are empty on the primary site.
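To make the distinction explicit: the `exists?` / `empty?` checks above separate three cases (no repository on disk, a repository with no content, and a populated one). A minimal sketch, assuming an object that answers only those two predicates; `FakeRepository` and `repository_state` are hypothetical names and not part of GitLab:

```ruby
# Hypothetical stand-in for the object returned by SnippetRepository#repository.
# It answers only the two predicates used in the console session.
FakeRepository = Struct.new(:exists, :empty) do
  def exists?
    exists
  end

  def empty?
    empty
  end
end

# Collapse the two predicates into a single label.
def repository_state(repo)
  return :missing unless repo.exists?

  repo.empty? ? :empty : :populated
end

# Mirrors the console result above: exists? => true, empty? => true.
puts repository_state(FakeRepository.new(true, true)) # prints "empty"
```

Under this reading, snippet 3945 falls in the `:empty` case, which matches the observation that the repositories are empty on the primary site.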
Edited by Michael Kozono
