Draft: Add Snowplow writer for usage logging

NOTE: This is an experimental change implementing path 4 described in this document.

It is not our preferred path because the Snowplow emitter has many issues; see https://gitlab.com/gitlab-org/architecture/usage-billing/design-doc/-/issues/33.

It may still be necessary as a stopgap, since the other approaches have SOX compliance implications.


Implements a new Snowplow writer that sends runner compute usage events to Snowplow collectors for billing and analytics. Events conform to the com.gitlab/billable_usage/jsonschema/1-0-1 schema and are sent asynchronously via HTTP.

Architecture changes:

  1. Multi-writer support:
    • Add UsageLogger.Writer field to select writer type ("logrotate" or "snowplow")
    • Refactor usage logger initialization in multi.go with helper methods (createLogrotateWriter, createSnowplowWriter)
    • Maintain backward compatibility with deprecated config fields
  2. New Snowplow writer implementation:
    • New snowplow package with Writer, Options, and BillingEvent types
    • Configurable via TOML: collector_uri, app_id, namespace, protocol, request_type, base64_encode, send_limit
    • Uses snowplow-golang-tracker v3 for HTTP POST with in-memory storage
    • Defaults to plain JSON encoding (base64_encode=false) for debugging
  3. Event structure:
    • Conforms to com.gitlab/billable_usage/jsonschema/1-0-1 Iglu schema
    • Required fields: event_id (UUIDv7), event_type, unit_of_measure (seconds), realm (SM), timestamp
    • Optional fields: instance_id, project_id, namespace_id, subject (runner_id), quantity
    • Flexible metadata map for job details and custom labels
  4. Concurrency protection:
    • Add RWMutex to protect usageLogger/usageLoggerLabels in RunCommand
    • Add Mutex in Snowplow writer to serialize tracker calls (workaround for upstream race condition)
    • Track the writer lifecycle with a closed flag to prevent use after close
  5. Enhanced usage_log package:
    • Add UUID and Timestamp fields to Record type
    • Update UsageLogRecordFrom() to generate UUIDv7 and accept labels
    • Simplify Storage interface (remove unused methods)
  6. Testing:
    • Comprehensive test suite with HTTP mock server
    • Verify full nested event structure including Iglu schemas
    • Test concurrent Store() calls and lifecycle edge cases
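
The concurrency protection in item 4 can be sketched roughly as follows. This is a minimal illustration, not the code from this MR: the `tracker` interface, `countingTracker`, `ErrWriterClosed`, and `storeConcurrently` are hypothetical names, and the real writer wraps the snowplow-golang-tracker client rather than a test double.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrWriterClosed is a hypothetical error returned when Store is called after Close.
var ErrWriterClosed = errors.New("snowplow writer is closed")

// tracker is a hypothetical stand-in for the upstream tracker client,
// assumed here to be unsafe for concurrent use.
type tracker interface {
	Track(payload string)
}

// countingTracker is a test double that counts the events it receives.
// The counter is deliberately unsynchronized; the Writer's mutex guards it.
type countingTracker struct {
	tracked int
}

func (c *countingTracker) Track(string) { c.tracked++ }

// Writer serializes all tracker calls behind one mutex and refuses
// to track once it has been closed.
type Writer struct {
	mu     sync.Mutex
	closed bool
	t      tracker
}

func NewWriter(t tracker) *Writer { return &Writer{t: t} }

// Store forwards one event to the tracker while holding the mutex.
func (w *Writer) Store(payload string) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.closed {
		return ErrWriterClosed
	}
	w.t.Track(payload)
	return nil
}

// Close marks the writer as closed; calling it more than once is harmless.
func (w *Writer) Close() error {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.closed = true
	return nil
}

// storeConcurrently calls Store from n goroutines and waits for all of them.
func storeConcurrently(w *Writer, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = w.Store(fmt.Sprintf("event-%d", i))
		}(i)
	}
	wg.Wait()
}

func main() {
	ct := &countingTracker{}
	w := NewWriter(ct)
	storeConcurrently(w, 50)
	_ = w.Close()
	fmt.Println("tracked:", ct.tracked, "store after close:", w.Store("late"))
}
```

Holding a single mutex across the closed check and the tracker call keeps the two atomic, so a concurrent Close cannot slip in between them. The cost is that all Store calls are serialized.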

The writer sends self-describing events with nested schemas:

  • Outer: com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0
  • Inner: com.gitlab/billable_usage/jsonschema/1-0-1 with billing data
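
As a rough illustration of that nesting, the sketch below wraps a billing payload in the two self-describing envelopes using only `encoding/json`. The struct names, Go field types, and sample values are assumptions made for the example; the field names come from the list above, and the actual `BillingEvent` type in this MR may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	outerSchema = "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"
	innerSchema = "iglu:com.gitlab/billable_usage/jsonschema/1-0-1"
)

// selfDescribing is the generic Snowplow envelope: a schema URI plus data.
type selfDescribing struct {
	Schema string `json:"schema"`
	Data   any    `json:"data"`
}

// billingEvent is an illustrative subset of the billable_usage fields.
// The Go types chosen here are assumptions, not taken from the schema.
type billingEvent struct {
	EventID       string            `json:"event_id"`
	EventType     string            `json:"event_type"`
	UnitOfMeasure string            `json:"unit_of_measure"`
	Realm         string            `json:"realm"`
	Timestamp     string            `json:"timestamp"`
	Quantity      float64           `json:"quantity,omitempty"`
	Metadata      map[string]string `json:"metadata,omitempty"`
}

// wrapBillingEvent nests the billing payload inside the unstruct_event envelope.
func wrapBillingEvent(ev billingEvent) ([]byte, error) {
	return json.Marshal(selfDescribing{
		Schema: outerSchema,
		Data:   selfDescribing{Schema: innerSchema, Data: ev},
	})
}

// schemasOf decodes a payload and returns the outer and inner schema URIs.
func schemasOf(payload []byte) (string, string, error) {
	var env struct {
		Schema string `json:"schema"`
		Data   struct {
			Schema string `json:"schema"`
		} `json:"data"`
	}
	if err := json.Unmarshal(payload, &env); err != nil {
		return "", "", err
	}
	return env.Schema, env.Data.Schema, nil
}

func main() {
	payload, err := wrapBillingEvent(billingEvent{
		EventID:       "00000000-0000-7000-8000-000000000000", // placeholder, not a generated UUIDv7
		EventType:     "example_event",                        // hypothetical value
		UnitOfMeasure: "seconds",
		Realm:         "SM",
		Timestamp:     "2025-01-01T00:00:00Z",
		Quantity:      42,
		Metadata:      map[string]string{"job_id": "123"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
}
```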

Configuration example:

  [experimental.usage_logger]
    enabled = true
    writer = "snowplow"
    [experimental.usage_logger.snowplow]
      collector_uri = "https://collector.example.com"
      app_id = "gitlab-runner"
      namespace = "production"
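
For readers unfamiliar with the writer switch, the selection logic driven by this config might look something like the sketch below. The struct and function names are hypothetical, and two behaviours are assumptions made for illustration rather than statements about this MR: that an empty `writer` falls back to "logrotate" for backward compatibility, and that a missing `collector_uri` is rejected.

```go
package main

import (
	"errors"
	"fmt"
)

// SnowplowConfig mirrors the [experimental.usage_logger.snowplow] table
// from the example above (only the three keys shown there).
type SnowplowConfig struct {
	CollectorURI string `toml:"collector_uri"`
	AppID        string `toml:"app_id"`
	Namespace    string `toml:"namespace"`
}

// UsageLoggerConfig mirrors the [experimental.usage_logger] table.
type UsageLoggerConfig struct {
	Enabled  bool            `toml:"enabled"`
	Writer   string          `toml:"writer"`
	Snowplow *SnowplowConfig `toml:"snowplow"`
}

// selectWriter returns the writer type to build, or "" when usage logging is disabled.
func selectWriter(cfg UsageLoggerConfig) (string, error) {
	if !cfg.Enabled {
		return "", nil
	}
	switch cfg.Writer {
	case "", "logrotate":
		// Assumption: configs written before the writer field existed keep using logrotate.
		return "logrotate", nil
	case "snowplow":
		// Assumption: a collector URI is mandatory for the Snowplow writer.
		if cfg.Snowplow == nil || cfg.Snowplow.CollectorURI == "" {
			return "", errors.New("snowplow writer requires collector_uri")
		}
		return "snowplow", nil
	default:
		return "", fmt.Errorf("unknown usage logger writer %q", cfg.Writer)
	}
}

func main() {
	cfg := UsageLoggerConfig{
		Enabled: true,
		Writer:  "snowplow",
		Snowplow: &SnowplowConfig{
			CollectorURI: "https://collector.example.com",
			AppID:        "gitlab-runner",
			Namespace:    "production",
		},
	}
	writer, err := selectWriter(cfg)
	fmt.Println(writer, err)
}
```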

See also:

Edited by Igor
