Job pages poll for traces before jobs start, and while the tab is hidden

In gitlab-com/gl-infra/scalability#654, we discovered two cases that, combined, account for a significant chunk of our polling traffic:

  1. On a job details page, we poll for a trace even before the job starts.
  2. We also continue to poll while the browser tab is not visible.
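
The two cases above amount to a single guard on the decision to schedule the next poll. Here is a minimal sketch of that guard, written in Ruby only to match the console session in this issue; the real poller runs in the browser. `should_poll_trace?` and the `tab_visible` flag are hypothetical names for illustration, not existing GitLab code, and treating `"running"` as the only status worth polling is an assumption.

```ruby
# Hypothetical guard, not existing GitLab code.
# Assumption: a trace is only worth polling for while the job is running.
POLLABLE_STATUS = "running"

# Returns true only when another poll can return something useful:
# the job has been picked up and the tab showing it is visible.
def should_poll_trace?(job_status:, tab_visible:)
  job_status == POLLABLE_STATUS && tab_visible
end

puts should_poll_trace?(job_status: "pending", tab_visible: true)  # case 1: job not started
puts should_poll_trace?(job_status: "running", tab_visible: false) # case 2: tab hidden
puts should_poll_trace?(job_status: "running", tab_visible: true)  # useful poll
```

Both wasteful cases return `false`, so neither would generate a request under this guard.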

Here's an example from gitlab-com/gl-infra/scalability#654 (comment 449099954):

This user's browser requested a job trace 36,609 times over the course of 27 hours. Only 753 of those requests were after the job started; the rest were all while it was waiting to be picked up:

[ gprd ] production> Ci::Build.find(845834636).created_at
=> Thu, 12 Nov 2020 14:52:53 UTC +00:00
[ gprd ] production> Ci::Build.find(845834636).started_at
=> Fri, 13 Nov 2020 16:34:39 UTC +00:00
[ gprd ] production> Ci::Build.find(845834636).finished_at
=> Fri, 13 Nov 2020 17:32:41 UTC +00:00
[ gprd ] production> Ci::Build.find(845834636).status
=> "failed"
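
To put those numbers in proportion, the timestamps and request counts above can be combined into a rough back-of-the-envelope calculation. This is only a sketch: the constant names are made up for illustration, and the average interval assumes the browser was polling for the whole time between `created_at` and `started_at`, which the logs do not confirm.

```ruby
# Values copied from the console output and request counts above.
CREATED_AT = Time.utc(2020, 11, 12, 14, 52, 53)
STARTED_AT = Time.utc(2020, 11, 13, 16, 34, 39)
TOTAL_REQUESTS = 36_609
REQUESTS_AFTER_START = 753

# Requests made while the job was still waiting to be picked up.
REQUESTS_BEFORE_START = TOTAL_REQUESTS - REQUESTS_AFTER_START

# Time the job spent waiting, in seconds (Time - Time returns a Float).
WAITING_SECONDS = STARTED_AT - CREATED_AT

# Share of all trace requests made before there was any trace to fetch.
SHARE_BEFORE_START = REQUESTS_BEFORE_START.fdiv(TOTAL_REQUESTS)

# Assumption: polling ran for the entire waiting period.
AVERAGE_INTERVAL_SECONDS = WAITING_SECONDS / REQUESTS_BEFORE_START

puts "requests before start: #{REQUESTS_BEFORE_START}"
puts format("share before start: %.1f%%", SHARE_BEFORE_START * 100)
puts format("hours waiting: %.1f", WAITING_SECONDS / 3600)
puts format("average seconds between polls: %.2f", AVERAGE_INTERVAL_SECONDS)
```

Under that assumption, roughly 98% of this user's trace requests were made before the job had started, at an average of one request every few seconds.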

It's hard to estimate the precise impact of this because our logging does not include the job status at the time of polling. However, we can see that the most popular (trace endpoint, user, IP) combinations all follow the same pattern: a job that takes a long time to be picked up and presumably isn't in a foregrounded tab, since it seems unlikely that someone would keep a tab in the foreground for a job that hasn't even started yet 🙂

Eliminating the unnecessary polling from those top users would dramatically reduce the traffic to this endpoint and the strain on Redis.