Investigate performance & scalability for devops::growth

The purpose of this issue is to investigate performance concerns in the growth team's areas of responsibility.

Areas of investigation

1. experiment_users postgres table and related queries

Concerns with inserts.

gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from experiment_users group by 1 order by 1 desc;
         period         | count  
------------------------+--------
 2021-03-01 00:00:00+00 | 359462
 2021-02-01 00:00:00+00 | 680383
 2021-01-01 00:00:00+00 | 395077
 2020-12-01 00:00:00+00 |  85631
 2020-11-01 00:00:00+00 | 250651
 2020-10-01 00:00:00+00 | 449921
 2020-09-01 00:00:00+00 | 241132
 2020-08-01 00:00:00+00 | 133816
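
To put the insert concern in perspective, the monthly counts above can be converted into an average insert rate. The following is a rough sketch, not a measurement: it assumes each listed month is complete (2021-03 is left out because it may be partial, and 2020-08 may also be a partial first month), and the helper name `average_inserts_per_second` is made up for illustration. Averages hide bursts, so peak rates during busy periods are likely higher.

```python
import calendar

# Monthly row counts copied from the experiment_users query output above.
# 2021-03 is omitted because that month may be incomplete.
monthly_counts = {
    (2021, 2): 680383,
    (2021, 1): 395077,
    (2020, 12): 85631,
    (2020, 11): 250651,
    (2020, 10): 449921,
    (2020, 9): 241132,
    (2020, 8): 133816,
}


def average_inserts_per_second(year, month, rows):
    """Average insert rate, assuming rows are spread evenly over the month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return rows / (days_in_month * 24 * 60 * 60)


rates = {
    period: average_inserts_per_second(period[0], period[1], rows)
    for period, rows in monthly_counts.items()
}

# Print newest month first, matching the ordering of the query output.
for (year, month), rate in sorted(rates.items(), reverse=True):
    print(f"{year}-{month:02d}: {rate:.3f} inserts/s on average")
```

Under these assumptions the busiest listed month (2021-02) averages well under one insert per second, so any insert concern is more likely about bursts, index maintenance, or table growth than about steady-state throughput.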

2. onboarding_progresses table and related queries

gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from onboarding_progresses group by 1 order by 1 desc;
         period         | count 
------------------------+-------
 2021-03-01 00:00:00+00 | 68095
 2021-02-01 00:00:00+00 | 60443
 2021-01-01 00:00:00+00 | 36367

3. Redis HLL tracking and reads

Concerns with Redis overload

4. gitlab-experiment gem experiment caching

5. Snowplow frontend tracking

Browser performance and timings

IGLU overload and mirroring

6. Snowplow backend tracking performance

Threading, connections, memory use

7. Spam and abuse risks related to growth experiments & features

As the growth team removes friction from usage, spammers and abusers may discover new opportunities.

Discussed with @pcalder.

/cc @gitlab-org/growth for any other areas of investigation

8. experiment_subjects table

gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from experiment_subjects group by 1 order by 1 desc;
         period         | count 
------------------------+-------
 2021-03-01 00:00:00+00 | 66542
 2021-02-01 00:00:00+00 | 43812
(2 rows)
Edited by Alper Akgun