Intermittent GraphQL "Timeout on validation of query" error when querying Geo replication status in UI
Summary
In a self-managed v18.5.1 Geo deployment, the Geo site object status page intermittently shows the error "There was an error fetching the Project Repositories. The GraphQL API call to the secondary may have failed."
This correlates with the following error being logged on the Geo secondary in gitlab-rails/graphql_json.log:
{
"severity": "INFO",
"time": "2025-12-11T19:33:20.819Z",
"correlation_id": "01KC7EEKM0Y2Y0M88KKB6FZK2E",
"meta.caller_id": "graphql:unknown",
"meta.feature_category": "not_owned",
"meta.organization_id": 1,
"meta.remote_ip": "redacted",
"meta.user": "redacted",
"meta.user_id": 1645,
"meta.client_id": "user/1645",
"trace_type": "execute_query",
"query_fingerprint": "anonymous/v_ax9jPZ2bHNA3ZFAiOz5IgajOvOilvjqS2Mzb-vgpM=/8/ffCQTn8T2_vEpRSimVZfu6TYhWao1vgElruXD86OO5k=",
"duration_s": 0.7373864450055407,
"operation_name": null,
"operation_fingerprint": "anonymous/v_ax9jPZ2bHNA3ZFAiOz5IgajOvOilvjqS2Mzb-vgpM=",
"is_mutation": false,
"variables": "{\"sort\"=>\"ID_ASC\", \"before\"=>\"\", \"after\"=>\"\", \"first\"=>20, \"last\"=>nil, \"replicationState\"=>nil, \"verificationState\"=>nil, \"ids\"=>nil}",
"query_string": "query ($first: Int, $last: Int, $before: String!, $after: String!, $sort: GeoRegistrySort, $replicationState: ReplicationStateEnum, $verificationState: VerificationStateEnum, $ids: [GeoProjectRepositoryRegistryID!]) {\n geoNode {\n projectRepositoryRegistries(\n first: $first\n last: $last\n before: $before\n after: $after\n sort: $sort\n replicationState: $replicationState\n verificationState: $verificationState\n ids: $ids\n ) {\n pageInfo {\n ...PageInfo\n __typename\n }\n count\n nodes {\n id\n state\n retryCount\n lastSyncFailure\n retryAt\n lastSyncedAt\n modelRecordId\n verifiedAt @include(if: true)\n verificationState @include(if: true)\n verificationFailure @include(if: true)\n createdAt\n __typename\n }\n __typename\n }\n __typename\n }\n}\n\nfragment PageInfo on PageInfo {\n hasNextPage\n hasPreviousPage\n startCursor\n endCursor\n __typename\n}",
"graphql_errors": [
{
"message": "Timeout on validation of query",
"locations": [],
"extensions": {
"code": "validationTimeout"
}
}
]
}
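To see how often this happens, the log can be filtered for entries carrying the same error code. The following is a minimal sketch, not a supported tool: it assumes graphql_json.log is written as one JSON object per line (the entry above is shown pretty-printed), and the function name `find_validation_timeouts` is hypothetical. It relies only on fields visible in the entry above (`time`, `correlation_id`, `duration_s`, `graphql_errors[].extensions.code`).

```python
import json
from typing import Iterable, Iterator

# Value of graphql_errors[].extensions.code in the log entry above.
TIMEOUT_CODE = "validationTimeout"


def find_validation_timeouts(lines: Iterable[str]) -> Iterator[dict]:
    """Yield a short summary for every log line that reports a validation timeout.

    Assumption: each line is a single JSON object. Lines that are empty or
    cannot be parsed as a JSON object are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or non-JSON line
        if not isinstance(entry, dict):
            continue
        errors = entry.get("graphql_errors") or []
        codes = [
            (err.get("extensions") or {}).get("code")
            for err in errors
            if isinstance(err, dict)
        ]
        if TIMEOUT_CODE in codes:
            yield {
                "time": entry.get("time"),
                "correlation_id": entry.get("correlation_id"),
                "duration_s": entry.get("duration_s"),
            }


# Example usage (the log path is an assumption for an Omnibus install):
#
#   with open("/var/log/gitlab/gitlab-rails/graphql_json.log", encoding="utf-8") as fh:
#       for hit in find_validation_timeouts(fh):
#           print(hit["time"], hit["correlation_id"], hit["duration_s"])
```

The `correlation_id` values this returns can then be used to look for related entries in the other logs on the secondary.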
The Geo secondary host is well-resourced (64 cores, 64 GiB of memory) and mostly just handles Geo replication activity. The query being validated does not seem particularly complex.
In #396784 (closed), in the context of CI spec jobs failing for the GitLab project, there was some discussion of why this timeout seems to occur randomly for non-complex queries, and it was suggested that the current hard-coded timeout of 0.2 seconds could be increased.
It is not clear what the root cause of these timeouts is or what can be done to prevent them from occurring.
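One way to start narrowing down the cause is to check whether the timeouts cluster at particular times, which could then be compared against other activity on the host. The sketch below is illustrative only: the function name `timeouts_per_hour` is hypothetical, and it assumes timestamps in the format of the `time` field in the log entry above (for example `2025-12-11T19:33:20.819Z`).

```python
from collections import Counter
from datetime import datetime
from typing import Iterable

# Matches the "time" field format in graphql_json.log, e.g. "2025-12-11T19:33:20.819Z".
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def timeouts_per_hour(timestamps: Iterable[str]) -> Counter:
    """Count validation-timeout entries per UTC hour.

    Each timestamp is truncated to its hour, so a burst of errors shows up
    as a single bucket with a high count.
    """
    counts: Counter = Counter()
    for ts in timestamps:
        parsed = datetime.strptime(ts, TIME_FORMAT)
        counts[parsed.strftime("%Y-%m-%d %H:00")] += 1
    return counts


# Example usage with timestamps collected from the log:
#
#   for hour, n in sorted(timeouts_per_hour(times).items()):
#       print(hour, n)
```

An even spread across hours would point away from a scheduled job or periodic load spike, while a few dense buckets would suggest looking at what else was running at those times.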
Steps to reproduce
Unable to reproduce on demand; the error occurs at random intervals.
Example Project
What is the current bug behavior?
Intermittent errors in the UI when viewing Geo secondary object status.
What is the expected correct behavior?
No errors should occur.
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
Expand for output related to GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:env:info`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)
Results of GitLab application Check
Expand for output related to the GitLab application check
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:check SANITIZE=true`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`) (we will only investigate if the tests are passing)
Possible fixes
Patch release information for backports
If the bug fix needs to be backported in a patch release to a version under the maintenance policy, please follow the steps on the patch release runbook for GitLab engineers.
Refer to the internal "Release Information" dashboard for information about the next patch release, including the targeted versions, expected release date, and current status.
High-severity bug remediation
To remediate high-severity issues requiring an internal release for single-tenant SaaS instances, refer to the internal release process for engineers.
