Investigate performance & scalability for devops::growth
The purpose of this issue is to investigate performance concerns in the growth team's areas of responsibility.

## Areas of investigation

#### 1. `experiment_users` Postgres table and related queries

Concerns with inserts.

```
gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from experiment_users group by 1 order by 1 desc;
         period         | count
------------------------+--------
 2021-03-01 00:00:00+00 | 359462
 2021-02-01 00:00:00+00 | 680383
 2021-01-01 00:00:00+00 | 395077
 2020-12-01 00:00:00+00 |  85631
 2020-11-01 00:00:00+00 | 250651
 2020-10-01 00:00:00+00 | 449921
 2020-09-01 00:00:00+00 | 241132
 2020-08-01 00:00:00+00 | 133816
```

#### 2. `onboarding_progresses` table and related queries

```
gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from onboarding_progresses group by 1 order by 1 desc;
         period         | count
------------------------+-------
 2021-03-01 00:00:00+00 | 68095
 2021-02-01 00:00:00+00 | 60443
 2021-01-01 00:00:00+00 | 36367
```

#### 3. Redis HLL tracking and reads

Concerns with Redis overload.

#### 4. gitlab-experiment gem experiment caching

#### 5. Snowplow frontend tracking

- Browser performance and timings
- IGLU overload and mirroring

#### 6. Snowplow backend tracking performance

Threading, connections, memory use.

#### 7. Spam and abuse risks related to growth experiments & features

As the growth team removes friction in usage, spammers and abusers may discover new opportunities.

Discussed with @pcalder.

/cc @gitlab-org/growth for any other areas of investigation

#### 8. `experiment_subjects` table

```
gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from experiment_subjects group by 1 order by 1 desc;
         period         | count
------------------------+-------
 2021-03-01 00:00:00+00 | 66542
 2021-02-01 00:00:00+00 | 43812
(2 rows)
```
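
To put the monthly row counts for `experiment_users` (area 1) in terms of insert load, a rough conversion to average inserts per second may help. This is a minimal sketch, assuming traffic is spread evenly over each calendar month (real peaks will be higher) and leaving out 2021-03 on the assumption that the month was still in progress when the query ran. The helper name `avg_inserts_per_second` is hypothetical.

```python
import calendar

# Monthly row counts for experiment_users, copied from the query output in area 1.
# 2021-03 is omitted on the assumption that it was a partial month.
EXPERIMENT_USERS_MONTHLY = {
    (2021, 2): 680383,
    (2021, 1): 395077,
    (2020, 12): 85631,
    (2020, 11): 250651,
    (2020, 10): 449921,
    (2020, 9): 241132,
    (2020, 8): 133816,
}


def avg_inserts_per_second(year: int, month: int, rows: int) -> float:
    """Average inserts per second, assuming rows are spread evenly over the month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return rows / (days_in_month * 24 * 60 * 60)


if __name__ == "__main__":
    for (year, month), rows in sorted(EXPERIMENT_USERS_MONTHLY.items()):
        rate = avg_inserts_per_second(year, month, rows)
        print(f"{year}-{month:02d}: {rate:.3f} inserts/s on average")
```

Averages like these only bound the steady-state load; the insert concern is more likely driven by bursts, so per-minute or per-second counts from production would be the better follow-up measurement.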
epic