MonitorLockedTables doesn't report correct information anymore

GitLab.com has been running with multiple databases, main and ci, for almost 3 years, and we recently introduced a new sec database. All three databases still share the same schema. To prevent the application from writing to the wrong database (for example, users on ci or sec, or ci_pipelines on main or sec), we locked those tables. We refer to them as legacy tables because they are leftovers from the single-database architecture.

To know more about locking tables, see here

To make sure the legacy tables always stay locked, we have a regular cron job, MonitorLockedTables, that runs every 3 days. See details here

MonitorLockedTables works by calling TablesLocker in dry_run mode. TablesLocker in turn calls LockWritesManager.

Recently we got false alarms reporting that many tables needed to be locked or unlocked. The cause is that we removed the check for table or trigger existence in MR !188276 (diffs). The goal of that change was to make the operation faster. No incidents happened.

Therefore, we disabled the lock_tables_in_monitoring feature flag on both Staging and Production until this is resolved.

Suggested corrective action:

  1. Bring back the old checks of whether a table or trigger exists in LockWritesManager by reverting !188276 (diffs). To keep the fast path, introduce a force mode that skips these checks.
  2. The rake tasks gitlab:db:lock_writes and gitlab:db:unlock_writes should eventually call LockWritesManager in force mode, but MonitorLockedTables should set force to false.
  3. Once the logs in Kibana correctly show that no tables need to be locked or unlocked, as expected, we can re-enable the feature flag lock_tables_in_monitoring on both Staging and Production.
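
The force mode proposed above could look roughly like the following. This is a minimal sketch, not the actual LockWritesManager code: the class SketchLockWritesManager, the in-memory FakeCatalog, the method names, and the result symbols are all hypothetical and exist only to illustrate how skipping the existence checks changes what a dry run reports.

```ruby
require "set"

# Hypothetical in-memory stand-in for the database catalog. Real code would
# query the database for tables and triggers instead.
class FakeCatalog
  def initialize(tables:, locked:)
    @tables = Set.new(tables)
    @locked = Set.new(locked)
  end

  def table_exists?(name)
    @tables.include?(name)
  end

  def lock_trigger_exists?(name)
    @locked.include?(name)
  end

  def add_lock_trigger(name)
    @locked.add(name)
  end
end

# Illustrative only: the constructor arguments and return values are assumptions.
class SketchLockWritesManager
  def initialize(table_name:, catalog:, dry_run: false, force: false)
    @table_name = table_name
    @catalog = catalog
    @dry_run = dry_run
    @force = force
  end

  def lock_writes
    # With force disabled, run the existence checks before deciding anything.
    unless @force
      return result(:skipped) unless @catalog.table_exists?(@table_name)
      return result(:already_locked) if @catalog.lock_trigger_exists?(@table_name)
    end

    # A dry run only reports what would be done and changes nothing.
    return result(:needs_lock) if @dry_run

    @catalog.add_lock_trigger(@table_name)
    result(:locked)
  end

  private

  def result(action)
    { action: action, table: @table_name }
  end
end

catalog = FakeCatalog.new(tables: ["users"], locked: ["users"])

# Monitoring style call: dry run with the checks enabled.
monitor = SketchLockWritesManager.new(
  table_name: "users", catalog: catalog, dry_run: true, force: false
)
p monitor.lock_writes

# Same dry run in force mode: the checks are skipped.
forced = SketchLockWritesManager.new(
  table_name: "users", catalog: catalog, dry_run: true, force: true
)
p forced.lock_writes
```

In this sketch, the dry run with force disabled reports :already_locked for a table that already has its trigger, while the same dry run in force mode reports :needs_lock because nothing was checked. That mirrors the false alarms described above, and it is why MonitorLockedTables should not use force mode.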

CC: @theoretick @ghavenga @rutgerwessels

Edited by Lucas Charles