Pseudo-anonymize all record IDs in experiment tracking
What does this MR do?
This MR pseudo-anonymizes, using a one-way seeded hashing strategy, any record passed to tracking via experimentation. It uses the same strategy that already anonymizes the experiment context key, which has been approved and in use for a while.
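As a rough illustration of what a one-way seeded hash looks like, here is a minimal sketch. It is not the gitlab-experiment implementation: the seed value, the `pseudonymize` helper name, and the choice of SHA-256 are all assumptions made for the example.

```ruby
require 'digest'

# Hypothetical seed. In practice this would come from a secret that is
# never stored alongside the tracking data.
EXAMPLE_SEED = 'example-seed'

# Returns a stable hex digest for a record id. The same id and seed always
# produce the same digest, and the digest cannot be reversed to the id
# without guessing both the id and the seed.
def pseudonymize(record_id, seed: EXAMPLE_SEED)
  Digest::SHA256.hexdigest("#{seed}|#{record_id}")
end

token = pseudonymize(42)
puts token.length               # SHA-256 hex digests are 64 characters
puts token == pseudonymize(42)  # same input and seed give the same token
```

Because the output is deterministic, events for the same record can still be grouped in reporting without exposing the underlying id.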
In our Snowplow Documentation it states that namespace, project and user are valid arguments for Gitlab::Tracking.event. But these arguments are then passed to, and ignored by, the Gitlab::Tracking::StandardContext. This MR bypasses that missing logic and instead anonymizes these values before passing them downstream.
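The idea of anonymizing these values before they go downstream can be sketched as follows. Everything here is hypothetical and only illustrates the shape of the approach: `anonymize_tracking_args`, `TRACKING_SEED`, and `DemoRecord` are names invented for the example, records are assumed to respond to `id`, and the real change in this MR may be structured differently.

```ruby
require 'digest'

# Hypothetical seed and key list, for illustration only.
TRACKING_SEED = 'example-seed'
RECORD_KEYS = %i[namespace project user].freeze

# Stand-in for an ActiveRecord model; only #id is needed here.
DemoRecord = Struct.new(:id)

# Replaces any namespace/project/user record in the arguments with a
# one-way seeded digest, leaving all other arguments untouched. The key
# is mixed into the digest so a user and a project that share an id do
# not produce the same token.
def anonymize_tracking_args(args, seed: TRACKING_SEED)
  args.to_h do |key, value|
    if RECORD_KEYS.include?(key) && !value.nil?
      [key, Digest::SHA256.hexdigest([seed, key, value.id].join('|'))]
    else
      [key, value]
    end
  end
end

safe_args = anonymize_tracking_args({ action: 'click', user: DemoRecord.new(7) })
puts safe_args[:action]       # unchanged
puts safe_args[:user].length  # 64-character digest instead of the record
```

The resulting hash is what would be handed to the tracking call, so no raw record or id leaves the experiment layer.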
A reasonable amount of effort and education has gone into gitlab-experiment to minimize the need to link experiment data to the wider dataset, but reporting on experiments and generating deep data about them continues to be a struggle. This MR proposes how we, as engineers doing our best to make product happy while respecting GDPR rules, our privacy policy, and commitments made to the community, might approach this issue. For performance reasons, the approach does not involve writing these values to the database, which is what currently happens.
Screenshots (strongly suggested)
Does this MR meet the acceptance criteria?
Conformity
- 📋 Does this MR need a changelog?
  - I have included a changelog entry.
  - I have not included a changelog entry because _____.
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team