Intermittent Snippet repository update error

Summary

There is an intermittent error on the Staging environment, when updating Snippets.

This was initially reported due to failing QA specs (#239091 (closed)).

Although it was originally thought this was related to the visibility level, we've since discovered the error can sometimes occur no matter what the changes are.

image

Reproduction

So far, we've only been able to reproduce this error on Staging - local dev and production does not seem to happen, but due to the randomness of the error, I'm hesitatant to say this is staging-only.

To reproduce, repeatedly update any Snippet until the error occurs 🙈

Staging console

I was able to replicate the error in the staging console:

[ gstg ] production> snippet.snippet_repository.multi_files_action(user, [{previous_path: "Snippe-repo.txt", file_path: "Snippe-repo.txt", content: "new content"}], {branch_name: 'master', message: 'Testing'})
{:message=>"Snippet git error. Reason: The repository must contain at least 1 file.", :snippet=>1684278}
Traceback (most recent call last):
        5: from (irb):21
        4: from (irb):21:in `rescue in irb_binding'
        3: from app/models/snippet_repository.rb:33:in `multi_files_action'
        2: from app/models/snippet_repository.rb:40:in `capture_git_error'
        1: from app/models/snippet_repository.rb:50:in `rescue in capture_git_error'
SnippetRepository::CommitError (The repository must contain at least 1 file.)

But then the very next repetition....

[ gstg ] production> snippet.snippet_repository.multi_files_action(user, [{previous_path: "Snippe-repo.txt", file_path: "Snippe-repo.txt", content: "new content"}], {branch_name: 'master', message: 'Testing'})
=> "29bd193da850c57bb9bcab3d99408627254b5397"

Investigation

My investigation so far has led me to the following:

Three method log entries in the Gitaly kibana logs seem pertinent:

1. GetObjectPool: `alternates file does not point to valid git repository`
{
  "_index": "pubsub-gitaly-inf-gstg-000143",
  "_type": "_doc",
  "_id": "IhCzcnQBSJ4tD642OsoO",
  "_score": 1,
  "_source": {
    "@timestamp": "2020-09-09T11:49:41.425Z",
    "ecs": {
      "version": "1.4.0"
    },
    "json": {
      "grpc.start_time": "2020-09-09T11:48:54Z",
      "grpc.request.repoPath": "@snippets/7e/7a/7e7a5986b8fb50a24b70cb1788a209ebedc521262f1571d0435e03ccb7dc7610.git",
      "environment": "gstg",
      "tag": "gitaly",
      "type": "gitaly",
      "grpc.service": "gitaly.ObjectPoolService",
      "level": "warning",
      "pid": 345,
      "span.kind": "server",
      "storage": "@snippets/7e/7a/7e7a5986b8fb50a24b70cb1788a209ebedc521262f1571d0435e03ccb7dc7610.git",
      "stage": "main",
      "grpc.request.glProjectPath": "",
      "grpc.method": "GetObjectPool",
      "error": "no alternates directory exists",
      "tier": "stor",
      "time": "2020-09-09T11:48:54.935Z",
      "grpc.request.repoStorage": "file-04",
      "correlation_id": "2PtMNcdk0t2",
      "grpc.request.glRepository": "",
      "grpc.meta.auth_version": "v2",
      "peer.address": "10.224.37.7:47136",
      "fqdn": "file-04-stor-gstg.c.gitlab-staging-1.internal",
      "shard": "default",
      "system": "grpc",
      "grpc.request.topLevelGroup": "@snippets",
      "grpc.request.fullMethod": "/gitaly.ObjectPoolService/GetObjectPool",
      "grpc.meta.deadline_type": "none",
      "msg": "alternates file does not point to valid git repository",
      "hostname": "file-04-stor-gstg"
    },
    "type": "pubsubbeat-pubsub-gitaly-inf-gstg-6d8c5bdd45-fnm5s",
    "message_id": "1523148496640045",
    "publish_time": "2020-09-09T11:49:41.364Z",
    "host": {
      "name": "pubsubbeat-pubsub-gitaly-inf-gstg-6d8c5bdd45-fnm5s"
    }
  },
  "fields": {
    "json.grpc.start_time": [
      "2020-09-09T11:48:54.000Z"
    ],
    "@timestamp": [
      "2020-09-09T11:49:41.425Z"
    ],
    "json.grpc.request.deadline": [],
    "publish_time": [
      "2020-09-09T11:49:41.364Z"
    ],
    "json.time": [
      "2020-09-09T11:48:54.935Z"
    ]
  }
}
2. ListFiles: `fatal: bad object error`
{
  "_index": "pubsub-gitaly-inf-gstg-000143",
  "_type": "_doc",
  "_id": "12izcnQBdA_qmth-Fj6F",
  "_score": 1,
  "_source": {
    "@timestamp": "2020-09-09T11:49:32.473Z",
    "type": "pubsubbeat-pubsub-gitaly-inf-gstg-6d8c5bdd45-fnm5s",
    "message_id": "1523148259397119",
    "publish_time": "2020-09-09T11:49:32.450Z",
    "json": {
      "shard": "default",
      "time": "2020-09-09T11:48:53.917Z",
      "tag": "gitaly",
      "grpc.request.topLevelGroup": "@snippets",
      "grpc.method": "ListFiles",
      "grpc.meta.deadline_type": "regular",
      "grpc.request.repoStorage": "file-03",
      "type": "gitaly",
      "grpc.service": "gitaly.CommitService",
      "peer.address": "10.224.37.6:34108",
      "grpc.start_time": "2020-09-09T11:48:53Z",
      "span.kind": "server",
      "grpc.request.glProjectPath": "vij-group-test-test/test-project/@snippets/1684278",
      "fqdn": "file-03-stor-gstg.c.gitlab-staging-1.internal",
      "grpc.request.deadline": "2020-09-09T11:49:23Z",
      "grpc.request.fullMethod": "/gitaly.CommitService/ListFiles",
      "msg": "fatal: bad object a81a39336b6a24b61fa11c1806bd6da965a66a53\\n",
      "grpc.request.repoPath": "@snippets/7e/7a/7e7a5986b8fb50a24b70cb1788a209ebedc521262f1571d0435e03ccb7dc7610.git",
      "stage": "main",
      "level": "error",
      "correlation_id": "KcZAQh6fDHa",
      "environment": "gstg",
      "pid": 10482,
      "grpc.meta.client_name": "gitlab-web",
      "tier": "stor",
      "grpc.request.glRepository": "snippet-1684278",
      "hostname": "file-03-stor-gstg",
      "system": "grpc",
      "grpc.meta.auth_version": "v2"
    },
    "host": {
      "name": "pubsubbeat-pubsub-gitaly-inf-gstg-6d8c5bdd45-fnm5s"
    },
    "ecs": {
      "version": "1.4.0"
    }
  },
  "fields": {
    "json.grpc.start_time": [
      "2020-09-09T11:48:53.000Z"
    ],
    "@timestamp": [
      "2020-09-09T11:49:32.473Z"
    ],
    "json.grpc.request.deadline": [
      "2020-09-09T11:49:23.000Z"
    ],
    "publish_time": [
      "2020-09-09T11:49:32.450Z"
    ],
    "json.time": [
      "2020-09-09T11:48:53.917Z"
    ]
  }
}
3. ReplicateRepository: diskcache state change
{
  "_index": "pubsub-gitaly-inf-gstg-000143",
  "_type": "_doc",
  "_id": "xhO_cnQBSJ4tD6426ORd",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2020-09-09T12:03:32.546Z",
    "type": "pubsubbeat-pubsub-gitaly-inf-gstg-6d8c5bdd45-fnm5s",
    "message_id": "1523154185356320",
    "ecs": {
      "version": "1.4.0"
    },
    "host": {
      "name": "pubsubbeat-pubsub-gitaly-inf-gstg-6d8c5bdd45-fnm5s"
    },
    "publish_time": "2020-09-09T12:03:32.505Z",
    "json": {
      "grpc.service": "gitaly.RepositoryService",
      "grpc.request.repoStorage": "file-03",
      "grpc.start_time": "2020-09-09T12:03:29Z",
      "grpc.request.glRepository": "",
      "type": "gitaly",
      "diskcache": "065373e6-c50f-4c89-b16d-e3a711177900",
      "pid": 10482,
      "grpc.meta.auth_version": "v2",
      "span.kind": "server",
      "hostname": "file-03-stor-gstg",
      "level": "info",
      "msg": "diskcache state change",
      "grpc.meta.deadline_type": "none",
      "shard": "default",
      "tag": "gitaly",
      "time": "2020-09-09T12:03:29.539Z",
      "grpc.request.glProjectPath": "",
      "stage": "main",
      "correlation_id": "6vWwmYm0md2",
      "fqdn": "file-03-stor-gstg.c.gitlab-staging-1.internal",
      "tier": "stor",
      "grpc.method": "ReplicateRepository",
      "grpc.request.repoPath": "@snippets/7e/7a/7e7a5986b8fb50a24b70cb1788a209ebedc521262f1571d0435e03ccb7dc7610.git",
      "environment": "gstg",
      "grpc.request.topLevelGroup": "@snippets",
      "system": "grpc",
      "grpc.request.fullMethod": "/gitaly.RepositoryService/ReplicateRepository",
      "peer.address": "10.224.37.6:34108"
    }
  },
  "fields": {
    "json.grpc.start_time": [
      "2020-09-09T12:03:29.000Z"
    ],
    "@timestamp": [
      "2020-09-09T12:03:32.546Z"
    ],
    "json.grpc.request.deadline": [],
    "publish_time": [
      "2020-09-09T12:03:32.505Z"
    ],
    "json.time": [
      "2020-09-09T12:03:29.539Z"
    ]
  },
  "highlight": {
    "json.correlation_id": [
      "@kibana-highlighted-field@6vWwmYm0md2@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1599653009539
  ]
}

Potentially interesting observations

  • Both erroring snippets were on repoStorage file-03 (example)
  • A snippet that doesn't cause an error (lots of updating) is on repoStorage nfs-file07 (example)
Edited by Vijay Hawoldar