Backlog on multiple topics in the logging pipeline
Over the last few weeks we have repeatedly seen backlog accumulate on multiple topics and subscriptions in PubSub. Initial investigations were inconclusive: they did not point to any single bottleneck.
Currently, we suspect the bottleneck might be one of the following:
- ES (utilization was high on multiple resources, but we saw no significant saturation anywhere)
- GCP rate limiting (we did not find any exceeded rate limit in the GCP web console)
The purpose of this epic is to investigate the problem in detail and identify the bottleneck so that we can scale at will.
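As a possible starting point for the investigation, the sketch below ranks subscriptions by how fast their backlog is growing, which may help narrow down where messages pile up first. It assumes backlog samples have already been exported as (timestamp, undelivered message count) pairs, for example from the `num_undelivered_messages` subscription metric; the function names and subscription names are hypothetical and not part of our pipeline.

```python
from statistics import mean


def backlog_growth_rate(samples):
    """Least-squares slope of backlog size over time, in messages per second.

    `samples` is a list of (unix_timestamp_seconds, undelivered_count) pairs.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    times = [t for t, _ in samples]
    counts = [c for _, c in samples]
    t_mean, c_mean = mean(times), mean(counts)
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        raise ValueError("samples must span more than one timestamp")
    return sum((t - t_mean) * (c - c_mean) for t, c in samples) / denom


def growing_subscriptions(series, min_rate=0.0):
    """Map subscription name to growth rate, fastest first.

    Only subscriptions whose backlog grows faster than `min_rate`
    (messages per second) are returned.
    """
    rates = {name: backlog_growth_rate(s) for name, s in series.items()}
    growing = [(name, rate) for name, rate in rates.items() if rate > min_rate]
    growing.sort(key=lambda item: item[1], reverse=True)
    return dict(growing)


if __name__ == "__main__":
    # Hypothetical sample data: one subscription falling behind, one stable.
    series = {
        "logs-to-es": [(0, 100), (60, 700), (120, 1300)],
        "logs-to-archive": [(0, 500), (60, 500), (120, 500)],
    }
    for name, rate in growing_subscriptions(series).items():
        print(f"{name}: backlog growing by {rate:.1f} msg/s")
```

Comparing these rates against ES indexing throughput and GCP quota usage over the same time window could show whether the growth correlates with either suspected bottleneck.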
epic