
Use ULIDs for CorrelationIDs

Andrew Newdigate requested to merge use-ksuid-for-correlation-ids into master

This change switches correlationIDs from random strings to ULIDs.

What is a ULID?

Universally Unique Lexicographically Sortable Identifier

UUID can be suboptimal for many use cases because:

  • It isn't the most character efficient way of encoding 128 bits of randomness
  • UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
  • UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
  • UUID v4 provides no information other than randomness, which can cause fragmentation in many data structures

Instead, herein is proposed ULID:

  • 128-bit compatibility with UUID
  • 1.21e+24 unique ULIDs per millisecond
  • Lexicographically sortable!
  • Canonically encoded as a 26 character string, as opposed to the 36 character UUID
  • Uses Crockford's base32 for better efficiency and readability (5 bits per character)
  • Case insensitive
  • No special characters (URL safe)
  • Monotonic sort order (correctly detects and handles the same millisecond)

https://github.com/oklog/ulid

ULIDs are attractive as correlationIDs because of one property in particular: they are lexicographically sortable, so it is easy to determine the order of requests given only a set of correlationIDs. The time of generation can even be recovered from the ULID itself.

Switching to ULIDs (actually, the similar KSUIDs) is something I've wanted to do for a long time, and much of the code in !78 (merged) led me to realise that now is a good time to roll this out.

SafeRandomID()

This change also deprecates correlation.RandomID() in favour of correlation.SafeRandomID(). correlation.RandomID() returned an error, even though no request should ever be cancelled because a correlationID could not be generated. As a result, different applications used different fallback strategies.

SafeRandomID() ensures that a correlationID is always returned, even in the exceedingly unlikely case that the system does not have enough entropy.

Instead, a simple fallback of the form E:<encodedtimestamp> is now used as the correlationID when the system does not have enough entropy.
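The "never return an error" behaviour can be sketched roughly as follows. This is an illustration only, not the labkit code: safeID and failingReader are hypothetical names, the normal path returns a hex string rather than a ULID to keep the example short, and encoding the timestamp as base-36 Unix nanoseconds is an assumption, since the description does not say how <encodedtimestamp> is encoded.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"strconv"
	"time"
)

// safeID is a hypothetical stand-in for an ID generator that always
// returns a value. It reads 16 bytes from the entropy source; if that
// fails, it falls back to "E:" plus an encoded timestamp.
func safeID(entropy io.Reader, nowNanos int64) string {
	buf := make([]byte, 16)
	if _, err := io.ReadFull(entropy, buf); err != nil {
		// Assumption: base-36 Unix nanoseconds as the encoded timestamp.
		return "E:" + strconv.FormatInt(nowNanos, 36)
	}
	return hex.EncodeToString(buf)
}

// failingReader simulates an entropy source that cannot supply bytes.
type failingReader struct{}

func (failingReader) Read([]byte) (int, error) {
	return 0, errors.New("no entropy available")
}

func main() {
	now := time.Now().UnixNano()
	fmt.Println(safeID(rand.Reader, now))      // normal path
	fmt.Println(safeID(failingReader{}, now))  // fallback path
}
```

Because the caller never sees an error, every application gets the same fallback behaviour instead of inventing its own.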

Benchmark

pkg: gitlab.com/gitlab-org/labkit/correlation
BenchmarkSafeRandomID-8   	 7097577	       165 ns/op
PASS

For posterity, below is the benchmark for the KSUID implementation that we used previously. ULID generation is roughly three times faster (although both are fine).

pkg: gitlab.com/gitlab-org/labkit/correlation
BenchmarkSafeRandomID-8   	 2266785	       530 ns/op
PASS

cc @ash2k

Edited by Andrew Newdigate
