Investigate performance & scalability for devops::growth
The purpose of this issue is to investigate performance and scalability concerns in the Growth team's areas of responsibility.
## Areas of investigation
#### 1. `experiment_users` postgres table and related queries
The concern here is the volume of inserts. Rows created per month:
```
gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from experiment_users group by 1 order by 1 desc;
period | count
------------------------+--------
2021-03-01 00:00:00+00 | 359462
2021-02-01 00:00:00+00 | 680383
2021-01-01 00:00:00+00 | 395077
2020-12-01 00:00:00+00 | 85631
2020-11-01 00:00:00+00 | 250651
2020-10-01 00:00:00+00 | 449921
2020-09-01 00:00:00+00 | 241132
2020-08-01 00:00:00+00 | 133816
```
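To put the monthly counts above in terms of write load, here is a rough sketch that converts them into average inserts per day and per second. It assumes inserts are spread evenly across each calendar month, which they almost certainly are not, and the first and last months in the output (2020-08 and 2021-03) may be partial, so their averages are likely understated. The function name `average_rates` is made up for this illustration.

```python
import calendar

# Monthly row counts copied from the experiment_users query output above,
# keyed by (year, month).
MONTHLY_INSERTS = {
    (2021, 3): 359462,
    (2021, 2): 680383,
    (2021, 1): 395077,
    (2020, 12): 85631,
    (2020, 11): 250651,
    (2020, 10): 449921,
    (2020, 9): 241132,
    (2020, 8): 133816,
}


def average_rates(year, month, rows):
    """Return (rows per day, rows per second) averaged over the full calendar month.

    Assumption: inserts are spread evenly, so this is a lower bound on peak load.
    """
    days = calendar.monthrange(year, month)[1]  # number of days in that month
    per_day = rows / days
    return per_day, per_day / 86400  # 86400 seconds in a day


for (year, month), rows in sorted(MONTHLY_INSERTS.items(), reverse=True):
    per_day, per_sec = average_rates(year, month, rows)
    print(f"{year}-{month:02d}: {per_day:10.0f} rows/day  {per_sec:.3f} rows/s")
```

Averages like these hide bursts, so the investigation would still need per-hour or per-minute figures from production to judge whether inserts are a real bottleneck.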
#### 2. `onboarding_progresses` table and related queries
```
gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from onboarding_progresses group by 1 order by 1 desc;
period | count
------------------------+-------
2021-03-01 00:00:00+00 | 68095
2021-02-01 00:00:00+00 | 60443
2021-01-01 00:00:00+00 | 36367
```
#### 3. Redis HLL (HyperLogLog) tracking and reads
Concerns with overloading Redis
#### 4. gitlab-experiment gem experiment caching
#### 5. Snowplow frontend tracking
Browser performance and timings
IGLU overload and mirroring
#### 6. Snowplow backend tracking performance
Threading, connections, memory use
#### 7. Spam and abuse risks related to growth experiments & features
As the Growth team removes friction from the product, spammers and abusers may find new opportunities to exploit.
Discussed with @pcalder.
/cc @gitlab-org/growth for any other areas of investigation
#### 8. `experiment_subjects` table
```
gitlabhq_production=> select date_trunc('month', created_at) as period, count(*) from experiment_subjects group by 1 order by 1 desc;
period | count
------------------------+-------
2021-03-01 00:00:00+00 | 66542
2021-02-01 00:00:00+00 | 43812
(2 rows)
```