Draft: Adds vector transformation for snowplow_bad_events
What does this MR do and why?
This MR adds a Vector transformation that removes PII (the enriched payload, user ID, and IP address) from `snowplow_bad_events`.
Relates to: https://gitlab.com/gitlab-org/analytics-section/product-analytics/analytics-stack/-/issues/176
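For reviewers who want a feel for the effect of the transformation, here is a rough Python sketch of the idea: drop PII-bearing keys from a decoded bad event at any nesting depth. This is not the Vector transformation added by this MR. The key names in `PII_KEYS` and the helper `strip_pii` are hypothetical and chosen only for illustration; the real field names may differ.

```python
from typing import Any

# Hypothetical key names (assumption): the actual bad-event field names
# are not listed in this MR description.
PII_KEYS = frozenset({"enriched", "userId", "ipAddress"})


def strip_pii(value: Any, pii_keys: frozenset = PII_KEYS) -> Any:
    """Return a copy of a decoded JSON value with every PII key removed.

    Dictionaries are rebuilt without the listed keys, lists are processed
    item by item, and scalar values are returned unchanged. The input is
    not modified.
    """
    if isinstance(value, dict):
        return {
            key: strip_pii(item, pii_keys)
            for key, item in value.items()
            if key not in pii_keys
        }
    if isinstance(value, list):
        return [strip_pii(item, pii_keys) for item in value]
    return value
```

Matching by key name at every depth keeps the sketch short, but it would also drop an unrelated field that happens to share a name, so a real transformation would more likely target explicit paths.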
Screenshots or screen recordings
Sample payload before/after vector transformation
| Before | After |
|---|---|
|  |  |
How to set up and validate locally
- Run `helm upgrade -f custom.values.yaml [RELEASE_NAME] [CHART]` for your cluster.
- Generate a bad event in Snowplow by sending an incorrect payload. Replace `#{collector host}` and `#{appId}` with your application's collector host and appId.

      POST /com.snowplowanalytics.snowplow/tp2 HTTP/1.1
      Host: #{collector host}
      accept: */*
      accept-language: en-GB,en-US;q=0.9,en;q=0.8
      content-type: application/json; charset=UTF-8
      origin: null
      sec-ch-ua: "Google Chrome";v="123", "Not:A-Brand";v="8", "Chromium";v="123"
      sec-ch-ua-mobile: ?0
      sec-ch-ua-platform: "macOS"
      sec-fetch-dest: empty
      sec-fetch-mode: cors
      sec-fetch-site: cross-site
      sp-anonymous: *
      user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36
      x-gitlab-appid: #{appId}

      {
        "schema": "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4",
        "data": [
          {
            "e": "pv",
            "page": "Test Page",
            "eid": "any-string",
            "tv": "js-3.12.0",
            "tna": "gitlab",
            "aid": "#{appId}",
            "p": "web",
            "cs": "UTF-8",
            "lang": "en-GB",
            "res": "2560x1440"
          }
        ]
      }

- Check the ClickHouse table `default.snowplow_bad_events`. The schema should not contain the enriched payload, user ID, or IP address.
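If you would rather script the bad-event step than paste the raw request, the following is a minimal sketch using only the Python standard library. It rebuilds the same POST with the same JSON body. The function names `build_bad_event_request` and `send`, the `https` default scheme, and the choice to send only a subset of the browser headers are assumptions made for illustration, not part of this MR.

```python
import json
import urllib.request

PAYLOAD_SCHEMA = "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4"


def build_bad_event_request(
    collector_host: str, app_id: str, scheme: str = "https"
) -> urllib.request.Request:
    """Build (but do not send) the POST request from the validation steps.

    The scheme defaults to https, which is an assumption; pass "http" for a
    collector that is exposed without TLS.
    """
    body = {
        "schema": PAYLOAD_SCHEMA,
        "data": [
            {
                "e": "pv",
                "page": "Test Page",
                "eid": "any-string",
                "tv": "js-3.12.0",
                "tna": "gitlab",
                "aid": app_id,
                "p": "web",
                "cs": "UTF-8",
                "lang": "en-GB",
                "res": "2560x1440",
            }
        ],
    }
    # Only the non-browser-specific headers from the sample request are kept.
    headers = {
        "Accept": "*/*",
        "Content-Type": "application/json; charset=UTF-8",
        "SP-Anonymous": "*",
        "X-Gitlab-Appid": app_id,
    }
    return urllib.request.Request(
        url=f"{scheme}://{collector_host}/com.snowplowanalytics.snowplow/tp2",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Call `send(build_bad_event_request(host, app_id))` with your own collector host and appId, then check `default.snowplow_bad_events` as described in the last step.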
MR acceptance checklist
- The correct type labels have been applied to this MR.
- This MR has been made as small as possible, to improve review efficiency and code quality.
- This MR has been self-reviewed per the code review guidelines.
- The changes have undergone manual testing and are functioning as intended.
- This MR has updated the `Chart.yaml` version number following SemVer versioning practices.
- This MR documents any breaking changes in the MR description, and the upgrade path has been documented in the first commit as well as in the MR description.
How to deploy upon merging
Numbered steps to explain how this change needs to be deployed. For instance, if there are any changes that should be made outside of the code changes themselves.