Develop and execute a database remediation plan to ensure dot-com availability [Environments x DB]
From @krasio:
The following indexes were identified as generating a lot of WAL:
| Index | Type | Table |
|------------------------------------------------------------|-------|----------------|
| index_deployments_on_environment_status_sha | btree | deployments |
| index_deployments_on_environment_id_status_and_finished_at | btree | deployments |
| index_deployments_on_environment_id_and_id | btree | deployments |
| index_deployments_on_environment_id_status_and_id | btree | deployments |
| index_deployments_on_environment_id_and_ref | btree | deployments |
| index_deployments_on_project_id_and_status_and_created_at | btree | deployments |
| index_deployments_on_user_id_and_status_and_created_at | btree | deployments |
| index_deployments_on_project_id_and_iid | btree | deployments |
| index_deployments_on_archived_project_id_iid | btree | deployments |
| index_deployments_on_id_and_status_and_created_at | btree | deployments |
| index_deployments_on_created_at | btree | deployments |
| index_deployments_on_deployable_type_and_deployable_id | btree | deployments |
We should investigate is some of thse indexes can be removed.
In addition, the following queries were identified as generating a lot of WAL:
| Queryid | Table | Type |
|----------------------|-------------|--------|
| -3054865659211593778 | deployments | INSERT |
| 4019160767093606575 | deployments | UPDATE |
On average these queries generate 72 GB (of approximately 5 TB in total) per day. The normalized SQL can be found here.
As a first step of preparing a remediaton plan, we need to understand how these queries are used in the app. Once we have the complete picture, we can start looking and how to reduce the anmount of WAL they generate.
Possible solutions:
-
Execute the query less often.
This is the ideal solution, as no writes mean no WAL, and it will help for all of UDPATE, INSERT, and DELETE.
-
Reduce the number of indexes on the table
Update on indexed column generate more WAL, as it requires logging modifications for both the table and the corresponding index entries
-
Extract columns that are updated often in a separate table(s)
Updates in PostgreSQL are implemented as a combination of delete and insert. This means that when single column is updated, the whole row is logegd in WAL. Extracting such column(s) will reduce the ammount of WAL generated. An example of such vertical split - https://docs.gitlab.com/ee/development/database/layout_and_access_patterns.html#data-model-trade-offs.
-
Extract columns that are rarely updated in a separate table
Same as above.