Use ULIDs for CorrelationIDs (!80) · Merge requests · GitLab.org / labkit

This change switches correlationIDs from random strings to ULIDs.

What is a ULID?

Universally Unique Lexicographically Sortable Identifier

UUID can be suboptimal for many use cases because:

It isn't the most character efficient way of encoding 128 bits of randomness

UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address

UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures

UUID v4 provides no other information than randomness which can cause fragmentation in many data structures

Instead, herein is proposed ULID:

128-bit compatibility with UUID

1.21e+24 unique ULIDs per millisecond

Lexicographically sortable!

Canonically encoded as a 26 character string, as opposed to the 36 character UUID

Uses Crockford's base32 for better efficiency and readability (5 bits per character)

Case insensitive

No special characters (URL safe)

Monotonic sort order (correctly detects and handles the same millisecond)

https://github.com/oklog/ulid
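The two numeric claims in the list above can be checked with a little arithmetic. The sketch below assumes the layout described in the ULID spec (a 48-bit millisecond timestamp followed by 80 bits of randomness); the helper names `idsPerMillisecond` and `encodedLen` are made up for this illustration and are not part of labkit or oklog/ulid.

```go
package main

import (
	"fmt"
	"math/big"
)

// randomBits is the size of the randomness component of a ULID.
// The remaining 48 of the 128 bits hold the millisecond timestamp.
const randomBits = 80

// idsPerMillisecond returns 2^80, the number of distinct random
// components available for a single timestamp value.
func idsPerMillisecond() *big.Int {
	return new(big.Int).Lsh(big.NewInt(1), randomBits)
}

// encodedLen returns how many base32 characters (5 bits each)
// are needed to hold the given number of bits.
func encodedLen(bits int) int {
	return (bits + 4) / 5
}

func main() {
	n := idsPerMillisecond()
	approx, _ := new(big.Float).SetInt(n).Float64()
	fmt.Printf("%s (about %.2e) values per millisecond\n", n.String(), approx)
	fmt.Printf("%d characters for 128 bits\n", encodedLen(128))
}
```

26 characters carry 130 bits, so the two most significant bits of the first character are always zero in a valid ULID.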

ULIDs are attractive as correlationIDs because of one property in particular: they are lexicographically sortable, so the order of requests is easy to determine given only a set of correlationIDs. The time of generation can even be recovered from the ULID itself.

Switching to ULIDs (actually, the similar KSUIDs) is something I've wanted to do for a long time, and much of the code in !78 (merged) led me to realise that now is a good time to roll this out.

`SafeRandomID()`

This change also deprecates correlation.RandomID() in favour of correlation.SafeRandomID(). correlation.RandomID() returned an error, although no request should ever be cancelled due to the failure of a correlationID to be generated. This led to different applications using different fallback strategies.

SafeRandomID() ensures that a correlationID is always returned, even in the exceedingly unlikely case that the system does not have enough entropy.

In that case, a simple fallback of the form E:<encodedtimestamp> is used as the correlationID instead.

Benchmark

pkg: gitlab.com/gitlab-org/labkit/correlation
BenchmarkSafeRandomID-8   	 7097577	       165 ns/op
PASS

For posterity, below is the benchmark for the KSUID implementation that we used previously. ULID generation is roughly three times faster (although both are fine).

pkg: gitlab.com/gitlab-org/labkit/correlation
BenchmarkSafeRandomID-8   	 2266785	       530 ns/op
PASS

cc @ash2k

Edited Nov 02, 2020 by Andrew Newdigate
