GA4 Client ID for Snowplow
Problem
Universal Analytics (GA3) is sunsetting on July 1, 2024. Currently, the Snowplow data contains a column populated with the GA3 client ID. We use this column to join marketing website data from Google Analytics to the Product/Trial Registration (to Paid) data in Snowplow, since the Snowplow data contains the namespace ID.
GA4 is the new version, and we should ensure the Snowplow GA client ID column(s) will pull from GA4 instead.
I was able to trace the GA client ID to this table: snowplow_gitlab_events_context_flattened (line 68).
Desired Outcome
Ensure the GA client ID column continues to be populated, with the value now coming from GA4. We should keep the original column name so that other tables do not need updating, since many models refer to this source.
Potential Solution
According to this Snowplow document, it might be as easy as switching the GA cookie plug-in schema from iglu:com.google.analytics/cookies/jsonschema/1-0-0 to iglu:com.google.ga4/cookies/jsonschema/1-0-0, but I will let the engineer be the judge of that.
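To illustrate the idea of keeping one output column while changing the source, here is a minimal JavaScript sketch of the lookup logic, not the actual model code. It prefers the GA4 cookie entity and falls back to the GA3 one. The function name `gaCookieFromContexts` is hypothetical, and the `_ga` field name inside each entity is an assumption that should be checked against the published schemas.

```javascript
// Schema prefixes taken from the two URIs above; the version suffix is matched loosely.
const GA4_SCHEMA = 'iglu:com.google.ga4/cookies/jsonschema/';
const GA3_SCHEMA = 'iglu:com.google.analytics/cookies/jsonschema/';

// Hypothetical helper: returns the raw GA cookie value from a list of
// context entities shaped like { schema: '...', data: { ... } }.
// Assumption: both schemas expose the cookie value under a "_ga" key.
function gaCookieFromContexts(contexts) {
  const find = (prefix) =>
    contexts.find((c) => typeof c.schema === 'string' && c.schema.startsWith(prefix));

  // Prefer GA4; fall back to GA3 so rows collected before the switch still resolve.
  const entity = find(GA4_SCHEMA) || find(GA3_SCHEMA);
  return entity && entity.data && entity.data._ga ? entity.data._ga : null;
}
```

Whether a GA3 fallback is wanted at all is a design choice for the engineer; dropping it reduces the function to a single GA4 lookup.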
How to verify
On about.gitlab.com, you can enter the following in the browser developer tools console (Dev > Console) to see your own GA4 client ID:
gtag('get', 'G-ENFH3X7M5Y', 'client_id', function(clientId) {
  console.log(clientId);
});
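As a cross-check, the client ID can usually also be derived from the `_ga` cookie, whose value has the shape `GA1.<domain depth>.<random id>.<timestamp>`, with the client ID being the last two segments. The sketch below assumes the default `_ga` cookie name with no custom prefix; `clientIdFromCookies` is a hypothetical helper name.

```javascript
// Extract the GA client ID from a cookie string such as document.cookie.
// Assumes the default "_ga" cookie name (no custom cookie prefix).
function clientIdFromCookies(cookieString) {
  for (const pair of cookieString.split(';')) {
    const trimmed = pair.trim();
    // Match "_ga=" exactly so "_ga_<container>" session cookies are skipped.
    if (!trimmed.startsWith('_ga=')) continue;
    const parts = trimmed.slice('_ga='.length).split('.');
    // Expected shape: GA1.<domain depth>.<random id>.<timestamp>
    if (parts.length < 4) return null;
    return parts.slice(-2).join('.');
  }
  return null;
}
```

In the browser console, `clientIdFromCookies(document.cookie)` should print the same value as the `gtag` call above if the assumption about the cookie name holds.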
If everything is set up correctly, you should then be able to find the same client ID in the Snowplow-derived tables, for example:
select gsc_google_analytics_client_id, [add desired columns here]
from PROD.COMMON.FCT_BEHAVIOR_WEBSITE_PAGE_VIEW
where gsc_google_analytics_client_id = '[your GA4 client ID here]';