Regular and predictable database latency spikes are leading to web latency slowdowns (or "the {00,04,08,12,16,20}:05 spike")

Spun out of https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/9248#note_294148937

Every day, within ~1 minute of each of the following times, we see a latency slowdown on our web service (and possibly on other services):

  • 00:05
  • 04:05
  • 08:05
  • 12:05
  • 16:05
  • 20:05
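
The times above fall at minute 5 of every fourth hour (in cron terms, `5 */4 * * *`). When correlating log lines or metrics with the spike, it can help to tag each timestamp as inside or outside the ~1 minute window. The following is a minimal sketch only: the names `nearest_spike` and `in_spike_window` are hypothetical, and it assumes timestamps are in the same timezone as the times listed above, which this issue does not state.

```python
from datetime import datetime, timedelta

# Spike schedule taken from the list above: minute 5 of every fourth hour.
SPIKE_HOURS = (0, 4, 8, 12, 16, 20)
SPIKE_MINUTE = 5
# "Within ~1 minute" from the description; adjust as needed.
TOLERANCE = timedelta(minutes=1)


def nearest_spike(ts: datetime) -> datetime:
    """Return the scheduled spike time closest to ts.

    Assumes ts is expressed in the same timezone as the schedule.
    """
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # Include the previous and next day so timestamps near midnight
    # are compared against the correct neighbouring spike.
    candidates = [
        day + timedelta(days=d, hours=h, minutes=SPIKE_MINUTE)
        for d in (-1, 0, 1)
        for h in SPIKE_HOURS
    ]
    return min(candidates, key=lambda c: abs(c - ts))


def in_spike_window(ts: datetime) -> bool:
    """True if ts is within TOLERANCE of a scheduled spike time."""
    return abs(ts - nearest_spike(ts)) <= TOLERANCE
```

For example, `in_spike_window(datetime(2020, 2, 20, 4, 5, 30))` is true, while the same call for 04:07:00 is false.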

p99.99 db durations over 24 hours

https://log.gprd.gitlab.net/goto/6f240764af89c33cbaf08ff91d92fe53

p99.9 web duration over 3 minutes during a spike

https://log.gprd.gitlab.net/goto/c414837b2116e33e0809ec871bafa946

cc @Finotto