Standard usage logging
As part of the Cloud Spend working group, we are trying to understand the usage patterns of users on GitLab.com so that we can optimise our cloud costs.
Something that has become apparent while doing this analysis is that it's very difficult to attribute resource usage on GitLab.com to different types of users.
I would like to start thinking about how we could address this with the smallest viable change that would start providing us with more insight into usage patterns.
Note that any effort in this area would also be very useful for the abuse team (cc @wvandenberg @rostrander @jurbanc).
I'm open to any ideas that others have, and I have one proposal myself.
Standardised Usage Structured Logs
This approach is very similar to what @stanhu did with audit logging, except for usage: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/22471
- For events related to usage we emit usage logs in a structured (NDJSON / newline delimited JSON) format
- The usage logs contain the following common details (when appropriate):
  - timestamp
  - usage_type: the type of usage
  - user_id + username
  - project_id + full_path
  - plan_id: free tier, gold etc
  - correlation_id: to correlate usage with other log data, to allow deeper analysis in future
  - amount: the quantity of usage
  - unit: the unit of usage
- Some usage_type events would have additional details specific to their domain. For example, for runner minutes, it's important to know which runner-manager the jobs were executed on, since some runners are owned by
- These events can be ingested into the logging system for later analysis. Currently, we use Elastic for this purpose, but in future this could even be done with Splunk or Hadoop jobs.
- While probably not helpful for small installations, this data may be useful for cross-departmental billing in larger self-managed instances too
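As a rough illustration of the format described above, here is a minimal Python sketch that builds one usage event and serialises it as a single NDJSON line. The helper names `build_usage_event` and `to_ndjson_line` are hypothetical and not part of GitLab; the field names come from the list above, and the ISO 8601 timestamp format is an assumption.

```python
import json
from datetime import datetime, timezone


def build_usage_event(usage_type, amount, unit, *, user_id=None, username=None,
                      project_id=None, full_path=None, plan_id=None,
                      correlation_id=None, timestamp=None, **extra):
    """Build a usage event dict (hypothetical helper, not a GitLab API)."""
    event = {
        # Assumption: timestamps are ISO 8601 strings in UTC.
        "timestamp": (timestamp or datetime.now(timezone.utc)).isoformat(),
        "usage_type": usage_type,
        "user_id": user_id,
        "username": username,
        "project_id": project_id,
        "full_path": full_path,
        "plan_id": plan_id,
        "correlation_id": correlation_id,
        "amount": amount,
        "unit": unit,
    }
    # Common details are only included "when appropriate": drop unset ones.
    event = {key: value for key, value in event.items() if value is not None}
    # Domain-specific details, e.g. runner_manager for CI runner usage.
    event.update(extra)
    return event


def to_ndjson_line(event):
    """Serialise one event as one line of newline-delimited JSON."""
    return json.dumps(event, separators=(",", ":")) + "\n"
```

A caller could then append `to_ndjson_line(build_usage_event("ci_runner", 26, "minute", user_id=5, runner_manager="gitlab"))` to a log file that the logging system ingests.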
Anything that has a significant cost associated with it could be logged with this approach - for example CI runner minutes, artifact storage, registry images, LFS objects etc.
Here are some examples:
CI Runner Usage
{
  usage_type: "ci_runner",
  user_id: 5,
  username: "andrewn",
  project_id: 123,
  full_path: "andrewn/pirate",
  plan_id: 4, // Gold tier customer
  correlation_id: "oiquoei123as", // For further investigation
  runner_manager: "gitlab", // Using the GitLab runners
  amount: 26,
  unit: "minute" // The unit for measuring runner usage is minutes
}
LFS storage
{
  usage_type: "lfs_storage",
  user_id: 5,
  username: "andrewn",
  project_id: 123,
  full_path: "andrewn/pirate",
  plan_id: 4, // Gold tier customer
  correlation_id: "oiquoei123as", // For further investigation
  amount: 221,
  unit: "mebibyte" // The unit for measuring LFS object storage is mebibytes
}
Git storage
Git storage usage logs could be written following GC/Housekeeping operations.
{
  usage_type: "git_storage",
  user_id: 5,
  username: "andrewn",
  project_id: 123,
  full_path: "andrewn/pirate",
  plan_id: 4, // Gold tier customer
  correlation_id: "2312asdas31dasd", // For further investigation
  amount: 1503,
  unit: "mebibyte" // The unit for measuring git storage is mebibytes
}
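To make the unit handling concrete, here is a small Python sketch of how a git storage event might be derived from a repository size measured in bytes after housekeeping. The function `git_storage_event` and its parameters are hypothetical, and rounding down to whole mebibytes is an assumption rather than part of the proposal.

```python
BYTES_PER_MEBIBYTE = 1024 * 1024


def git_storage_event(repository_size_bytes, *, user_id, username, project_id,
                      full_path, plan_id, correlation_id):
    """Build a git_storage usage event (hypothetical helper)."""
    return {
        "usage_type": "git_storage",
        "user_id": user_id,
        "username": username,
        "project_id": project_id,
        "full_path": full_path,
        "plan_id": plan_id,
        "correlation_id": correlation_id,
        # Assumption: report whole mebibytes, rounding down.
        "amount": repository_size_bytes // BYTES_PER_MEBIBYTE,
        "unit": "mebibyte",
    }
```

Reporting a fixed unit per usage_type keeps later aggregation simple, since amounts for the same usage_type can be compared directly.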
Reporting
Once we're recording usage and are able to attribute it to users and plans, we can run queries such as:
- How many runner minutes is a particular user using?
- How many runner minutes are being used by each plan tier on GitLab.com?
- Which users are storing the most LFS objects?
We will be able to keep track of these and feed them into our KPIs.
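The queries above would normally run in the logging system itself (for example Elastic), but the idea can be sketched in plain Python over NDJSON lines. The functions `total_usage` and `top_users` are hypothetical. The sketch assumes each event records an increment of usage, such as minutes for one job; for snapshot-style events, such as total storage after housekeeping, taking the latest value per project would be more appropriate than summing.

```python
import json
from collections import defaultdict


def total_usage(ndjson_lines, usage_type, group_by):
    """Sum `amount` for one usage_type, grouped by a field such as
    "username" or "plan_id" (hypothetical helper)."""
    totals = defaultdict(int)
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        if event.get("usage_type") != usage_type:
            continue
        totals[event.get(group_by)] += event["amount"]
    return dict(totals)


def top_users(ndjson_lines, usage_type, limit=10):
    """Return (username, total) pairs, largest total first."""
    totals = total_usage(ndjson_lines, usage_type, "username")
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]
```

For example, `total_usage(lines, "ci_runner", "plan_id")` would give runner minutes per plan tier, and `top_users(lines, "lfs_storage")` would rank users by recorded LFS usage.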