
Runner Fleet Dashboard - Admin View: Runner Compute Costs

Release notes

{placeholder for release notes}

Problem(s) to solve

Users who host their Fleet of runners on cloud platforms (AWS, GCP, Azure, and more) have no easy way of knowing how much their runners cost. Today they look up the total compute cost in their cloud platform and then try to attribute it manually to jobs run in GitLab. They also have no easy way to identify the top users of their Instance runners (the ones any project can use), so they cannot “charge” those groups or projects accordingly, or update their Fleet to optimize pipeline performance for certain groups and projects.

Out of scope

Finally, users don't have a way to optimize their costs based on previous data. For example, if they are spending $1K in one week, how do they lower costs while still maintaining pipeline performance? As another example, if one project accounts for the majority of the runner costs, how can the user optimize their Fleet (or even their company organization) to account for that project's higher usage?

Intended users

Self-managed platform engineers who are using cloud platforms to create their Fleet of autoscaling runners (this is 80% of our users who bring their own runners).

Related research

I need to understand which projects are the biggest users of the shared runner pools that my team maintains.

For cost-distribution, the systems team wants to see who is using what type of runner and for how long.

As a GitLab admin, in order to track costs for shared runner minutes (instance) incurred by applications, I need a report of shared runner minutes usage per asset id.

Requirements

  • Only support the following Cloud providers for this feature:
    • GCP, AWS, Azure
  • Only provide cost visibility for runners that are:
  • Users must be able to download the cost report as a .csv file from within the UI.

User experience goal

The user must be able to understand, at a glance, the total compute costs for their Runner Fleet.

Proposal (TBD)

Prior art from PM

Runner_Fleet_AI

Metrics date range selection options:

The default time range for the view is the current month.

Relative dates

  • Last 7 days
  • Last 30 days

Absolute dates

  • Last month
  • This month
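The four range options above can be resolved to concrete start and end dates. The following is a minimal sketch, not the proposed implementation: the function name `resolve_range`, the option keys, and the choice to treat "Last 7 days" and "Last 30 days" as inclusive of today are all assumptions made for illustration.

```python
from datetime import date, timedelta


def resolve_range(option: str, today: date) -> tuple[date, date]:
    """Return an inclusive (start, end) date pair for a range option.

    Option keys are hypothetical; relative ranges are assumed to include today.
    """
    if option == "last_7_days":
        return today - timedelta(days=6), today
    if option == "last_30_days":
        return today - timedelta(days=29), today
    if option == "this_month":
        # From the first day of the current month up to today.
        return today.replace(day=1), today
    if option == "last_month":
        # The day before the first of this month is the last day of last month.
        end = today.replace(day=1) - timedelta(days=1)
        return end.replace(day=1), end
    raise ValueError(f"unknown range option: {option}")
```

With this sketch, the default view ("This month") on 2023-07-26 would cover 2023-07-01 through 2023-07-26.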

Filter results options

The default view will display the total runner fleet cloud costs for all projects, organized by group name.

  • Group
  • Project
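As a rough illustration of the default view and the two filters, the sketch below sums costs per group and optionally narrows the rows to one group or project. The row shape (`group`, `project`, `cost` keys) and the function name `costs_by_group` are hypothetical; the real data would come from the analytics store.

```python
from collections import defaultdict


def costs_by_group(rows, group=None, project=None):
    """Sum costs per group name, optionally filtered by group and/or project.

    `rows` is an iterable of dicts with hypothetical keys
    "group", "project", and "cost".
    """
    totals = defaultdict(float)
    for row in rows:
        if group is not None and row["group"] != group:
            continue
        if project is not None and row["project"] != project:
            continue
        totals[row["group"]] += row["cost"]
    # Sort by group name so the view is organized by group.
    return dict(sorted(totals.items()))
```

Calling `costs_by_group(rows)` with no filters corresponds to the default view; passing `group=` or `project=` corresponds to the filter options.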

Cost dashboard - Panel 1 - total costs and trends

  • Total cloud costs = sum of cloud costs for all projects.
  • Cost trends = (current_period_costs) - (previous_period_costs) for all projects.
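The two Panel 1 figures are simple arithmetic over per-job costs. A minimal sketch, assuming each period is available as a list of per-job compute costs (the function names are illustrative only):

```python
def total_cost(job_costs):
    """Total cloud costs: the sum of per-job compute costs across all projects."""
    return sum(job_costs)


def cost_trend(current_period_costs, previous_period_costs):
    """Cost trend: current period total minus previous period total.

    A positive result means spending went up; a negative result means it went down.
    """
    return total_cost(current_period_costs) - total_cost(previous_period_costs)
```

For example, if the current period totals 1000 and the previous period totals 800, the trend is +200.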

Example query and output

SELECT
    round(sum(ci_job_compute_cost)) AS cloud_costs,
    bar(cloud_costs, 0, 100, 80)
FROM ci_finished_builds
WHERE (created_at >= toDateTime('2023-07-01 00:00:00')) AND (created_at <= NOW())

Screenshot_2023-07-26_at_8.21.25_PM

Cost dashboard Panel 2 - cost chart

  • Displays the total cost per day for all projects.
  • Chart filters:
    • Daily - the default chart time period is daily.
    • Monthly - switches the chart time period to monthly.
    • Bar chart - the default chart type.
    • Line chart - changes the chart type to a line chart.
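The Daily and Monthly chart filters amount to bucketing job costs by a different date key. The sketch below is one possible way to build the chart series; the input shape (pairs of creation timestamp and cost) and the name `cost_series` are assumptions, not part of the proposal.

```python
from collections import defaultdict
from datetime import datetime


def cost_series(jobs, period="daily"):
    """Bucket (created_at, cost) pairs into a sorted list of (label, total) points.

    `period` is "daily" (the default) or "monthly".
    """
    fmt = {"daily": "%Y-%m-%d", "monthly": "%Y-%m"}[period]
    buckets = defaultdict(float)
    for created_at, cost in jobs:
        buckets[created_at.strftime(fmt)] += cost
    # ISO-style labels sort chronologically as plain strings.
    return sorted(buckets.items())
```

The same series can feed either chart type, since the bar and line options only change how the points are drawn.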

Configuration options for runner worker compute costs

Option 1 - user enters required cost attributes to runner details.

The required cost attributes are:

  1. Runner Worker Machine Type - runner_worker_compute_type
  2. Runner Worker Compute Cost Per Hour - runner_worker_compute_cost_hr
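With these two attributes on a runner, a per-job cost can be estimated from the job's duration. This is a sketch under the assumption that cost scales linearly with job duration at the entered hourly rate; it ignores idle time, per-second billing minimums, and discounts, and the function name is hypothetical.

```python
def job_compute_cost(duration_seconds, runner_worker_compute_cost_hr):
    """Estimate a job's compute cost from its duration and the hourly rate.

    Assumes simple linear pricing: hours used multiplied by cost per hour.
    """
    if duration_seconds < 0 or runner_worker_compute_cost_hr < 0:
        raise ValueError("duration and hourly cost must be non-negative")
    return duration_seconds / 3600 * runner_worker_compute_cost_hr
```

For example, a 30-minute job on a worker entered at 0.50 per hour would be estimated at 0.25.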

Option 2 - automatically retrieve the required cost attributes from the runner host

  • Theoretically, we can implement a solution that automatically retrieves the instance type information from the instance. For example, you can use the AWS CLI command `aws ec2 describe-instances --instance-ids`.

Example command with output:

aws ec2 describe-instances \
--query "Reservations[*].Instances[*].{PublicIP:PublicIpAddress,Type:InstanceType,Name:Tags[?Key=='Name']|[0].Value,Status:State.Name}" \
--filters "Name=instance-state-name,Values=running" "Name=instance-type,Values='t2.medium','t2.micro'" \
--output table

Note - OpenCost.io is "a vendor-neutral open source project for measuring and allocating infrastructure and container costs." However, as of 2023-07-26, the solution is "built for Kubernetes", so we have to explore other options for inputting the public cloud compute cost data into GitLab.

Open technical questions

  • For the MVC, should we default to USD as the currency designation?
  • For the MVC, do we attempt to use an automated solution to retrieve the compute host specs and vendor list price per hour for compute, or do we have the user enter that data manually?
  • How do we track changes in runner_worker_compute_cost?

Disclaimer

This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.

