Handling Locked Database Partitions when Truncating Legacy Tables
Background about the problem
We have two rake tasks that can be used to truncate legacy tables. These two rake tasks were initially introduced in this MR
rake gitlab:db:truncate_legacy_tables:main
rake gitlab:db:truncate_legacy_tables:ci
When we ran the rake tasks on GSTG
and GPRD
to truncate the legacy tables (gitlab_ci
tables on main
database and gitlab_main
tables on ci
database) we found a problem that was first discovered on Staging.
You can read the full details about this error here
Some database table partitions, that are not detached yet, used to be tables before. But have been converted into partitions, in a process to convert the original table into a partitioned table.
For example, the table security_findings
has at least one partition called gitlab_partitions_dynamic.security_findings_1
, but this table partition itself used to be called security_findings
and it was on the schema public
, then it was moved to the database schema gitlab_partitions_dynamic.security_findings_1
.
When we truncate legacy tables, we assume that these tables are locked for writes. Which makes sense, because they are legacy, and we wanted to make sure that we never write data to them by mistake. So when we truncate these tables, we disable the lock_writes
on them temporarily using this statement:
"SELECT set_config('lock_writes.#{table_name_without_schema}', 'false', false)"
It's temporarily, because set_config
uses this config only within the run time of the current transaction, that truncates the tables.
This all happens in this class Gitlab::Database::TablesTruncate
.
But the tables partitions are not locked for writes. We never lock them for writes. We only lock the tables
What is the problem?
The partition gitlab_partitions_dynamic.security_findings_1
happened to be also locked on GSTG
and GPRD
as well. This is because we once ran gitlab:db:lock_writes
on these environments when it was still a normal table. So when it was converted into a partition, it happened to be locked for writes.
When we truncate a parent table, the TRUNCATE STATEMENT
propagates into the table partitions as well, that's by design (we want this), and it's because we don't use ONLY
in the TRUNCATE
statement.
But because the partition is locked for writes, and we don't disable the lock writes on the partitions as well. We get errors.
Alternative Solution
We an also check for the triggers on the partitions, and remove them using the LockWritsManager.locked_for_writes?
& LockWritesManager.unlock_writes
as well.
Goal of this issue
The goal of this issue to handle all the cases when there are lock-write partitions of the table, and disable the lock-writes on them as well, in the same way as we do on the tables
This new functionality can be added here: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/database/tables_truncate.rb We can also make use of this model/table: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/database/postgres_partition.rb
CC: @tkuah @rutgerwessels