Optimize deployment counters related to the ::Deployment
What does this MR do?
Optimize queries produced by
- Query no period per user
# 1 Locations https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/usage_data.rb
time_period = { }
deployments: distinct_count(::Deployment.where(time_period), :user_id),
failed_deployments: distinct_count(::Deployment.failed.where(time_period), :user_id),
successful_deployments: distinct_count(::Deployment.success.where(time_period), :user_id)
Query with no time constraint and batching per user_id
```ruby Gitlab::UsageData.distinct_count(::Deployment.where(time_period), :user_id) ``` ```sql SELECT MIN("deployments"."user_id") FROM "deployments" SELECT MAX("deployments"."user_id") FROM "deployments" SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."user_id" BETWEEN $1 AND $2 [["user_id", 0], ["user_id", 9999]] ```Gitlab::UsageData.distinct_count(::Deployment.failed.where(time_period), :user_id)
SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 3]]
SELECT MAX("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 3]]
SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."user_id" BETWEEN $2 AND $3 [["status", 3], ["user_id", 0], ["user_id", 9999]]
Gitlab::UsageData.distinct_count(::Deployment.success.where(time_period), :user_id)
SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 2]]
SELECT MAX("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 2]]
SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."user_id" BETWEEN $2 AND $3 [["status", 2], ["user_id", 0], ["user_id", 9999]]
- Query with period filter per user
# 2 Locations https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/usage_data.rb
time_period = { created_at: 28.days.ago..Time.current }
deployments: distinct_count(::Deployment.where(time_period), :user_id),
failed_deployments: distinct_count(::Deployment.failed.where(time_period), :user_id),
successful_deployments: distinct_count(::Deployment.success.where(time_period), :user_id)
query with time constraint and batching per user_id
Gitlab::UsageData.distinct_count(::Deployment.where(time_period), :user_id)
SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."created_at" BETWEEN $1 AND $2 [["created_at", "2020-02-12 12:31:33.200695"], ["created_at", "2020-03-11 12:31:33.200777"]]
SELECT MAX("deployments"."user_id") FROM "deployments" WHERE "deployments"."created_at" BETWEEN $1 AND $2 [["created_at", "2020-02-12 12:31:33.200695"], ["created_at", "2020-03-11 12:31:33.200777"]]
SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."created_at" BETWEEN $1 AND $2 AND "deployments"."user_id" BETWEEN $3 AND $4 [["created_at", "2020-02-12 12:31:33.200695"], ["created_at", "2020-03-11 12:31:33.200777"], ["user_id", 0], ["user_id", 9999]]
Gitlab::UsageData.distinct_count(::Deployment.failed.where(time_period), :user_id)
SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."created_at" BETWEEN $2 AND $3 [["status", 3], ["created_at", "2020-02-12 12:31:33.200695"], ["created_at", "2020-03-11 12:31:33.200777"]]
MAX("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."created_at" BETWEEN $2 AND $3 [["status", 3], ["created_at", "2020-02-12 12:31:33.200695"], ["created_at", "2020-03-11 12:31:33.200777"]]
SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."created_at" BETWEEN $2 AND $3 AND "deployments"."user_id" BETWEEN $4 AND $5 [["status", 3], ["created_at", "2020-02-12 12:31:33.200695"], ["created_at", "2020-03-11 12:31:33.200777"], ["user_id", 0], ["user_id", 9999]]
time_period = { created_at: 28.days.ago..Time.current }
Gitlab::UsageData.distinct_count(::Deployment.success.where(time_period), :user_id)
SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."created_at" BETWEEN $2 AND $3 [["status", 2], ["created_at", "2020-02-12 11:46:44.765586"], ["created_at", "2020-03-11 11:46:44.765767"]]
SELECT MAX("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."created_at" BETWEEN $2 AND $3 [["status", 2], ["created_at", "2020-02-12 11:46:44.765586"], ["created_at", "2020-03-11 11:46:44.765767"]]
SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."created_at" BETWEEN $2 AND $3 AND "deployments"."user_id" BETWEEN $4 AND $5 [["status", 2], ["created_at", "2020-02-12 11:46:44.765586"], ["created_at", "2020-03-11 11:46:44.765767"], ["user_id", 0], ["user_id", 9999]]
- Query count deployments
# 3 Locations https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb
deployments: count(Deployment),
successful_deployments: count(Deployment.success),
failed_deployments: count(Deployment.failed),
query count of deployments
Gitlab::UsageData.count(::Deployment)
SELECT MIN("deployments"."id") FROM "deployments"
SELECT MAX("deployments"."id") FROM "deployments"
SELECT COUNT("deployments"."id") FROM "deployments" WHERE "deployments"."id" BETWEEN $1 AND $2 [["id", 1], ["id", 100000]]
query count of failed deployments
Gitlab::UsageData.count(::Deployment.failed)
SELECT MIN("deployments"."id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 3]]
SELECT MAX("deployments"."id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 3]]
SELECT COUNT("deployments"."id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."id" BETWEEN $2 AND $3 [["status", 3], ["id", 0], ["id", 99999]]
query count of successful deployments
Gitlab::UsageData.count(::Deployment.success)
SELECT MIN("deployments"."id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 2]]
SELECT MAX("deployments"."id") FROM "deployments" WHERE "deployments"."status" = $1 [["status", 2]]
SELECT COUNT("deployments"."id") FROM "deployments" WHERE "deployments"."status" = $1 AND "deployments"."id" BETWEEN $2 AND $3 [["status", 2], ["id", 0], ["id", 99999]]
Notes: on optimization
- Using database-lab
- Add 2 indexes
exec CREATE INDEX CONCURRENTLY index_deployments_on_user_id_status_and_created_at ON deployments USING btree (user_id,status,created_at)
The query has been executed. Duration: 4.079 m
exec CREATE INDEX CONCURRENTLY index_deployments_on_id_and_status ON deployments USING btree (id,status)
The query has been executed. Duration: 2.118 min
1. MIN/MAX calculations per user_id
- no period
explain
```sql explain SELECT MIN("deployments"."user_id") FROM "deployments" ```Before: Time: 1.251 min https://explain.depesz.com/s/DN0V
After: Time: 3.471 ms https://explain.depesz.com/s/XrdC
- period
explain
```sql explain SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."created_at" BETWEEN '2020-02-10 08:14:55.773067' AND '2020-03-09 08:14:55.773185' ```Before: https://explain.depesz.com/s/lOoMK
After: Time: 10.867 ms https://explain.depesz.com/s/vWIh
2. COUNT DISTINCT per user_id
- no period
explain
```sql explain SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."user_id" BETWEEN 0 AND 9999 ```Before: Time: 9.184 min https://explain.depesz.com/s/5ikP
Filter: ((deployments.user_id >= 0) AND (deployments.user_id <= 9999))
After: Time: 224.553 ms https://explain.depesz.com/s/KqUM
- period
explain
```sql explain SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."created_at" BETWEEN '2020-02-10 08:14:55.773067' AND '2020-03-09 08:14:55.773185' AND "deployments"."user_id" BETWEEN 0 AND 9999 ```Before: Time: 9.138 s https://explain.depesz.com/s/DSrL
Filter: ((deployments.user_id >= 0) AND (deployments.user_id <= 9999))
After: Time: 10.867 ms https://explain.depesz.com/s/vWIh
3. MIN/MAX status failed per user_id
- no period
explain
```sql explain SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 3 ```Before: Time: 16.416 min https://explain.depesz.com/s/YHhP
After: Time: 0.530 ms https://explain.depesz.com/s/5y1g
- period
explain
```sql explain SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 3 AND "deployments"."created_at" BETWEEN '2020-02-10 08:14:55.773067' AND '2020-03-09 08:14:55.773185' ```Before: Time: 3.954 min https://explain.depesz.com/s/Tsc
After: Time: 0.979 ms https://explain.depesz.com/s/vT02
4. COUNT DISTINCT failed per user_id
- no period
explain
```sql explain SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 3 AND "deployments"."user_id" BETWEEN 0 AND 9999 ```Before: Time: 22.331 min https://explain.depesz.com/s/OKI8
Filter: ((deployments.user_id >= 0) AND (deployments.user_id <= 9999))
After: Time: 12.364 ms https://explain.depesz.com/s/4PRq
- period
explain
```sql explain SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 3 AND "deployments"."created_at" BETWEEN '2020-02-10 11:01:14.332176' AND '2020-03-09 11:01:14.332272' AND "deployments"."user_id" BETWEEN 0 AND 9999 ```Before: Time: 2.792 min https://explain.depesz.com/s/SG5A
Filter: ((deployments.user_id >= 0) AND (deployments.user_id <= 9999) AND (deployments.status = 3))
After: Time: 9.637 ms https://explain.depesz.com/s/5nrt
5. MIN/MAX per user_id status success
- no period
explain
```sql explain SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 2 ```Before: Time: 24.389 min https://explain.depesz.com/s/Crss
After: Time: 0.365 ms https://explain.depesz.com/s/JE39
- period
explain
```sql explain SELECT MIN("deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 2 AND "deployments"."created_at" BETWEEN '2020-02-10 11:54:39.194310' AND '2020-03-09 11:54:39.194373' ```6. DISTINCT COUNT per user_id status success
- no period
explain
```sql EXPLAIN SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 2 AND "deployments"."user_id" BETWEEN 0 AND 9999 ```Before: Time: 16.352 min https://explain.depesz.com/s/LxOK
After: Time: 35.899 ms https://explain.depesz.com/s/Kg8R
- period
explain
```sql explain SELECT COUNT(DISTINCT "deployments"."user_id") FROM "deployments" WHERE "deployments"."status" = 2 AND "deployments"."created_at" BETWEEN '2020-02-10 11:54:39.194310' AND '2020-02-10 11:54:39.194310' AND "deployments"."user_id" BETWEEN 2 AND 10000 ```7. COUNT Deployments
explain
```sql explain SELECT COUNT("deployments"."id") FROM "deployments" WHERE "deployments"."id" BETWEEN 1 AND 100000 ```Before: Time: 215.300 ms https://explain.depesz.com/s/GoEE
After: Time: 52.250 ms https://explain.depesz.com/s/4qzn
8. MIN/MAX Deployments
explain
```sql explain SELECT MIN("deployments"."id") FROM "deployments" ```Before: Time: 0.442 ms https://explain.depesz.com/s/xLnh
After: Time: 0.317 ms https://explain.depesz.com/s/p7Zi
9. MIN/MAX Deployments success
explain
```sql explain SELECT MIN("deployments"."id") FROM "deployments" WHERE "deployments"."status" = 2 ```Before: Time: 0.346 ms https://explain.depesz.com/s/nFek
Filter: (deployments.status = 2)
After: Time: 0.355 ms https://explain.depesz.com/s/luv2
10. COUNT Deployments failed
explain
```sql explain SELECT COUNT("deployments"."id") FROM "deployments" WHERE "deployments"."status" = 2 AND "deployments"."id" BETWEEN 0 AND 99999 ```Before: Time: 9.215 s https://explain.depesz.com/s/mM0r
Filter: (deployments.status = 2)
After: Time: 34.054 ms https://explain.depesz.com/s/v2Rp
11. MIN/MAX Deployments failed
explain
```sql explain SELECT MIN("deployments"."id") FROM "deployments" WHERE "deployments"."status" = 3 ```Before: Time: 30.274 s https://explain.depesz.com/s/r0CC
Filter: (deployments.status = 3)
After: Time: 427.971 ms https://explain.depesz.com/s/gR8f
12. COUNT Deployments success
explain
```sql explain SELECT COUNT("deployments"."id") FROM "deployments" WHERE "deployments"."status" = 3 AND "deployments"."id" BETWEEN 0 AND 99999 ```Before: Time: 300.198 ms https://explain.depesz.com/s/5e3i
Filter: (deployments.status = 3)
After: Time: 5.544 ms https://explain.depesz.com/s/K6Wc
Migration output
== 20200311084025 AddIndexOnUserIdStatusCreatedAtToDeployments: migrating =====
-- transaction_open?()
-> 0.0000s
-- index_exists?(:deployments, [:user_id, :status, :created_at], {:algorithm=>:concurrently})
-> 0.0080s
-- execute("SET statement_timeout TO 0")
-> 0.0005s
-- add_index(:deployments, [:user_id, :status, :created_at], {:algorithm=>:concurrently})
-> 0.0162s
-- execute("RESET ALL")
-> 0.0004s
== 20200311084025 AddIndexOnUserIdStatusCreatedAtToDeployments: migrated (0.0254s)
== 20200311094020 AddIndexOnIdAndStatusToDeployments: migrating ===============
-- transaction_open?()
-> 0.0000s
-- index_exists?(:deployments, [:id, :status], {:algorithm=>:concurrently})
-> 0.0052s
-- execute("SET statement_timeout TO 0")
-> 0.0005s
-- add_index(:deployments, [:id, :status], {:algorithm=>:concurrently})
-> 0.0026s
-- execute("RESET ALL")
-> 0.0004s
== 20200311094020 AddIndexOnIdAndStatusToDeployments: migrated (0.0091s) ======
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec -
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
Closes #208886 (closed)