
fix: race condition in TestDBLoadBalancer_ProcessQueryError

What does this MR do?

Fixes TestDBLoadBalancer_ProcessQueryError in registry/datastore/loadbalancing_test.go.

Updates #1484

To reproduce, run `go test` with the `-race` flag.

Before (run a few times):

$ go test -count=1 -v -race -shuffle=on -failfast -run=^TestDBLoadBalancer_ProcessQueryError$ ./registry/datastore/...

-test.shuffle 1759920494008548000
=== RUN   TestDBLoadBalancer_ProcessQueryError
time="2025-10-08T13:48:14+03:00" level=info msg="resolving replicas with fixed hosts list" component=registry.datastore.DBLoadBalancer go_version=go1.25.1 version=unknown
time="2025-10-08T13:48:14+03:00" level=info msg="replica is new, opening connection" component=registry.datastore.DBLoadBalancer db_replica_addr="replica1:5432" go_version=go1.25.1 version=unknown
time="2025-10-08T13:48:14+03:00" level=info msg="updating replicas list" added_hosts="replica1:5432" component=registry.datastore.DBLoadBalancer db_replica_addr="replica1:5432" go_version=go1.25.1 removed_hosts= version=unknown
time="2025-10-08T13:48:14+03:00" level=warning msg="replica database connection error during query execution; initiating liveness probe" component=registry.datastore.DBLoadBalancer db_host_addr="replica1:5432" db_host_type=replica error=":  (SQLSTATE 08006)" go_version=go1.25.1 query="SELECT 1" version=unknown
time="2025-10-08T13:48:14+03:00" level=info msg="performing liveness probe" db_host_addr="replica1:5432" go_version=go1.25.1 version=unknown
time="2025-10-08T13:48:14+03:00" level=warning msg="host failed liveness probe; invoking callback" db_host_addr="replica1:5432" duration_s=2.7125e-05 error="connection failed" go_version=go1.25.1 version=unknown
time="2025-10-08T13:48:14+03:00" level=warning msg="removing replica from pool" component=registry.datastore.DBLoadBalancer db_replica_addr="replica1:5432" go_version=go1.25.1 version=unknown
==================
WARNING: DATA RACE
Read at 0x00c00022a4c0 by goroutine 43:
  github.com/docker/distribution/registry/datastore.(*DBLoadBalancer).Replicas()
      /Users/alexandear/src/gitlab.com/gitlab-org/container-registry/registry/datastore/loadbalancing.go:845 +0xd80
  github.com/docker/distribution/registry/datastore_test.TestDBLoadBalancer_ProcessQueryError()
      /Users/alexandear/src/gitlab.com/gitlab-org/container-registry/registry/datastore/loadbalancing_test.go:3576 +0xd78
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:1934 +0x164
  testing.(*T).Run.gowrap1()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:1997 +0x3c

Previous write at 0x00c00022a4c0 by goroutine 46:
  github.com/docker/distribution/registry/datastore.(*DBLoadBalancer).removeReplica()
      /Users/alexandear/src/gitlab.com/gitlab-org/container-registry/registry/datastore/loadbalancing.go:721 +0x3bc
  github.com/docker/distribution/registry/datastore.(*DBLoadBalancer).removeReplica-fm()
      <autogenerated>:1 +0x4c
  github.com/docker/distribution/registry/datastore.(*LivenessProber).Probe()
      /Users/alexandear/src/gitlab.com/gitlab-org/container-registry/registry/datastore/loadbalancing.go:386 +0x460
  github.com/docker/distribution/registry/datastore.(*DBLoadBalancer).ProcessQueryError.gowrap2()
      /Users/alexandear/src/gitlab.com/gitlab-org/container-registry/registry/datastore/loadbalancing.go:499 +0x50

Goroutine 43 (running) created at:
  testing.(*T).Run()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:1997 +0x6e0
  testing.runTests.func1()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:2477 +0x74
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:1934 +0x164
  testing.runTests()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:2475 +0x734
  testing.(*M).Run()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:2337 +0xaf4
  main.main()
      _testmain.go:233 +0x100

Goroutine 46 (finished) created at:
  github.com/docker/distribution/registry/datastore.(*DBLoadBalancer).ProcessQueryError()
      /Users/alexandear/src/gitlab.com/gitlab-org/container-registry/registry/datastore/loadbalancing.go:499 +0x598
  github.com/docker/distribution/registry/datastore_test.TestDBLoadBalancer_ProcessQueryError()
      /Users/alexandear/src/gitlab.com/gitlab-org/container-registry/registry/datastore/loadbalancing_test.go:3570 +0xd68
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:1934 +0x164
  testing.(*T).Run.gowrap1()
      /opt/homebrew/Cellar/go/1.25.1/libexec/src/testing/testing.go:1997 +0x3c
==================
    testing.go:1617: race detected during execution of test
--- FAIL: TestDBLoadBalancer_ProcessQueryError (0.16s)
FAIL
FAIL    github.com/docker/distribution/registry/datastore       0.865s
-test.shuffle 1759920493781118000
testing: warning: no tests to run
PASS
ok      github.com/docker/distribution/registry/datastore/metrics       1.488s [no tests to run]
?       github.com/docker/distribution/registry/datastore/migrations    [no test files]
?       github.com/docker/distribution/registry/datastore/migrations/mocks      [no test files]
?       github.com/docker/distribution/registry/datastore/migrations/postmigrations     [no test files]
?       github.com/docker/distribution/registry/datastore/migrations/premigrations      [no test files]
?       github.com/docker/distribution/registry/datastore/mocks [no test files]
-test.shuffle 1759920493522414000
testing: warning: no tests to run
PASS
ok      github.com/docker/distribution/registry/datastore/models        1.227s [no tests to run]
?       github.com/docker/distribution/registry/datastore/testutil      [no test files]
FAIL

After:

$ go test -count=1 -v -race -shuffle=on -failfast -run=^TestDBLoadBalancer_ProcessQueryError$ ./registry/datastore/...

-test.shuffle 1759920438954660000
=== RUN   TestDBLoadBalancer_ProcessQueryError
time="2025-10-08T13:47:18+03:00" level=info msg="resolving replicas with fixed hosts list" component=registry.datastore.DBLoadBalancer go_version=go1.25.1 version=unknown
time="2025-10-08T13:47:18+03:00" level=info msg="replica is new, opening connection" component=registry.datastore.DBLoadBalancer db_replica_addr="replica1:5432" go_version=go1.25.1 version=unknown
time="2025-10-08T13:47:18+03:00" level=info msg="updating replicas list" added_hosts="replica1:5432" component=registry.datastore.DBLoadBalancer db_replica_addr="replica1:5432" go_version=go1.25.1 removed_hosts= version=unknown
time="2025-10-08T13:47:18+03:00" level=warning msg="replica database connection error during query execution; initiating liveness probe" component=registry.datastore.DBLoadBalancer db_host_addr="replica1:5432" db_host_type=replica error=":  (SQLSTATE 08006)" go_version=go1.25.1 query="SELECT 1" version=unknown
time="2025-10-08T13:47:18+03:00" level=info msg="performing liveness probe" db_host_addr="replica1:5432" go_version=go1.25.1 version=unknown
time="2025-10-08T13:47:18+03:00" level=warning msg="host failed liveness probe; invoking callback" db_host_addr="replica1:5432" duration_s=2.7666e-05 error="connection failed" go_version=go1.25.1 version=unknown
time="2025-10-08T13:47:18+03:00" level=warning msg="removing replica from pool" component=registry.datastore.DBLoadBalancer db_replica_addr="replica1:5432" go_version=go1.25.1 version=unknown
--- PASS: TestDBLoadBalancer_ProcessQueryError (0.15s)
PASS
ok      github.com/docker/distribution/registry/datastore       1.940s
-test.shuffle 1759920438718398000
testing: warning: no tests to run
PASS
ok      github.com/docker/distribution/registry/datastore/metrics       1.554s [no tests to run]
?       github.com/docker/distribution/registry/datastore/migrations    [no test files]
?       github.com/docker/distribution/registry/datastore/migrations/mocks      [no test files]
?       github.com/docker/distribution/registry/datastore/migrations/postmigrations     [no test files]
?       github.com/docker/distribution/registry/datastore/migrations/premigrations      [no test files]
?       github.com/docker/distribution/registry/datastore/mocks [no test files]
-test.shuffle 1759920439282367000
testing: warning: no tests to run
PASS
ok      github.com/docker/distribution/registry/datastore/models        2.116s [no tests to run]
?       github.com/docker/distribution/registry/datastore/testutil      [no test files]

Author checklist

  • Assign one of the conventional-commit prefixes to the MR.
    • fix: Indicates a bug fix, triggers a patch release.
    • feat: Signals the introduction of a new feature, triggers a minor release.
    • perf: Focuses on performance improvements that don't introduce new features or fix bugs, triggers a patch release.
    • docs: Updates or changes to documentation. Does not trigger a release.
    • style: Changes that do not affect the code's functionality. Does not trigger a release.
    • refactor: Modifications to the code that do not fix bugs or add features but improve code structure or readability. Does not trigger a release.
    • test: Changes related to adding or modifying tests. Does not trigger a release.
    • chore: Routine tasks that don't affect the application, such as updating build processes, package manager configs, etc. Does not trigger a release.
    • build: Changes that affect the build system or external dependencies. May trigger a release.
    • ci: Modifications to continuous integration configuration files and scripts. Does not trigger a release.
    • revert: Reverts a previous commit. It could result in a patch, minor, or major release.
  • MR contains database changes including schema/background migrations:
    • Do not include code that depends on the schema migrations in the same commit. Split the MR into two or more.
    • Do not include code that depends on background migrations in the same release.
    • Manually run up and down migrations in a postgres.ai production database clone and add a link for the query plan(s) to the MR.
    • If adding new schema migrations, make sure the REGISTRY_SELF_MANAGED_RELEASE_VERSION CI variable in migrate.yml points to the latest GitLab self-managed released registry version. Find the correct registry version here. Make sure to select the branch of the latest GitLab release.
    • If adding new queries, extract a query plan from postgres.ai and post the link here. If changing existing queries, also extract a query plan for the current version for comparison.
      • I do not have access to postgres.ai and have made a comment on this MR asking for these to be run on my behalf.
    • If adding a new background migration, follow the guide for performance testing new background migrations and add a report/summary to the MR with your analysis.
  • Change contains a breaking change - apply the breaking change label.
  • Change is considered high risk - apply the label high-risk-change.
  • I created or linked to an existing issue for every added or updated TODO, BUG, FIXME or OPTIMIZE prefixed comment
  • Changes cannot be rolled back
    • Apply the label cannot-rollback.
    • Add a section to the MR description that includes the following details:
      • The reasoning behind why a release containing the presented MR cannot be rolled back (e.g. schema migrations or changes to the FS structure).
      • Detailed steps to revert/disable a feature introduced by the same change where a migration cannot be rolled back. (note: ideally MRs containing schema migrations should not contain feature changes.)
      • Ensure this MR does not add code that depends on these changes that cannot be rolled back.
Documentation/resources

Code review guidelines

Go Style guidelines

Feature flags

When documentation is required

Documentation workflow

Reviewer checklist

  • Ensure the commit and MR title are still accurate.
  • If the change contains a breaking change, verify the breaking change label.
  • If the change is considered high risk, verify the label high-risk-change.
  • Identify whether the change can be rolled back safely. (note: all other reasons for not being able to roll back will be sufficiently captured by major version changes).
