More resilient usage pings

Description

We send weekly usage pings, but if an instance misses a week for any reason, the resulting gap distorts our metrics and adds noise. We should make the data reporting more robust.

Proposal

We currently aggregate our usage data and attempt to send it in a single step once a week. We should decouple this into two services:

Collecting usage statistics on an instance
  • Create a new usage_statistics table. -> RawUsageData already exists
  • Each week, we collect the information in usage_data.rb and add it to a new row of the table. -> Done in GitlabServicePingWorker
  • For each row, we should track whether or not this information was successfully received by version.gitlab.com.
    • This information should be stored whether or not the instance is electing to share usage ping (we simply won't attempt to send it anywhere if not). -> Done
  • For each row, we'll also need to record the datetime for that activity. -> Done (there is a sent_at column in RawUsageData)
Sending usage statistics to GitLab, Inc.
  • Each day, if an instance has elected to share usage data with GitLab, a separate service will iterate through the table and attempt to send any rows not received by GitLab to version.gitlab.com.
    • We should establish some reasonable daily limit so we're not attempting to send a massive amount of information all at once if a very old instance (with a very large usage_statistics table) enables usage ping.
    • We should start with the most recent stats available and backfill from there.
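
The sending side described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the actual implementation: UsageRow is an in-memory stand-in for a RawUsageData row, UsagePingSender and DAILY_LIMIT are hypothetical names, the limit value is arbitrary, and the transport is any callable that returns true when version.gitlab.com accepts the payload.

```ruby
# Hypothetical in-memory stand-in for one row of the usage table.
# sent_at stays nil until the payload has been received.
UsageRow = Struct.new(:recorded_at, :payload, :sent_at) do
  def sent?
    !sent_at.nil?
  end
end

# Hypothetical daily sender: newest unsent rows first, capped per run.
class UsagePingSender
  DAILY_LIMIT = 5 # assumed value, chosen only for illustration

  def initialize(rows, sharing_enabled:, transport:, daily_limit: DAILY_LIMIT)
    @rows = rows
    @sharing_enabled = sharing_enabled
    @transport = transport # callable: payload -> true if received
    @daily_limit = daily_limit
  end

  # Returns the rows that were delivered during this run.
  def run(now: Time.now)
    # Rows are still collected when sharing is off; they are just never sent.
    return [] unless @sharing_enabled

    batch = @rows.reject(&:sent?)
                 .sort_by(&:recorded_at)
                 .reverse
                 .first(@daily_limit)

    batch.select do |row|
      delivered = @transport.call(row.payload)
      # Only mark the row once delivery is confirmed, so a failed
      # attempt is retried on the next daily run.
      row.sent_at = now if delivered
      delivered
    end
  end
end
```

Rows that fail to send stay unsent and are picked up again on a later run, and an instance that enables usage ping late works through its backlog a few rows per day rather than all at once.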

Links / references

Edited Nov 10, 2022 by Mikołaj Wawrzyniak