Resolve "Perm delete job fails for any trashable item that requires locking too many things in the Postgres in a single transaction"

Summary

This MR aims to lessen the impact of PostgreSQL exceeding its max_locks_per_transaction limit in three places:

  1. When a database snapshot is being created.
  2. When a database is being duplicated.
  3. When the trash is being permanently emptied.

Database snapshots

When a database snapshot is taken, we fetch all tables and, in a transaction, serialize them with export_serialized.

This very quickly exhausts max_locks_per_transaction. We'll now detect when this happens, raise DatabaseSnapshotMaxLocksExceededException, and make CreateSnapshotJob apply an appropriate message in its human_readable_error.

This message will then be shown in the frontend's snapshot modal.

[Screenshot_2023-01-11_at_10.44.15: the snapshot modal showing the error message]
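For reference, a minimal sketch of how the detection could work. The helper and the serialization call below are assumptions for illustration, not Baserow's actual code; the reliable signal from Postgres is an "out of shared memory" OperationalError whose hint mentions max_locks_per_transaction.

# Illustrative sketch only; the helper names and serialization call are
# assumptions, not Baserow's actual implementation.
from django.db import OperationalError, transaction


class DatabaseSnapshotMaxLocksExceededException(Exception):
    """Raised when snapshotting needs more locks than max_locks_per_transaction allows."""


def _is_max_locks_per_transaction_error(error: OperationalError) -> bool:
    # Postgres reports lock table exhaustion as "out of shared memory" and
    # hints that max_locks_per_transaction might need to be increased.
    message = str(error)
    return "out of shared memory" in message or "max_locks_per_transaction" in message


def export_snapshot(tables, export_serialized):
    """Serializes every table in one transaction, translating lock exhaustion."""
    try:
        with transaction.atomic():
            return [export_serialized(table) for table in tables]
    except OperationalError as error:
        if _is_max_locks_per_transaction_error(error):
            raise DatabaseSnapshotMaxLocksExceededException() from error
        raise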

Database duplication

When a database is duplicated, we fetch all tables in the source database and, in a transaction, serialize them with export_serialized.

This very quickly exhausts max_locks_per_transaction. We'll now detect when this happens, raise DuplicateApplicationMaxLocksExceededException, and make DuplicateApplicationJob apply an appropriate message in its human_readable_error.

This message will then be shown in the frontend's notification popup.

[Screenshot_2023-01-11_at_11.35.38: the notification popup showing the error message]
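A hedged sketch of the job-side handling follows; the job class shape, the human_readable_error attribute placement, and the duplication call are assumptions used to illustrate the approach, not Baserow's actual job API.

# Illustrative sketch; class and attribute names are assumptions, not the real job API.
class DuplicateApplicationMaxLocksExceededException(Exception):
    """Raised when duplicating a database exceeds max_locks_per_transaction."""


class DuplicateApplicationJob:
    # Illustrative job shape; in practice this would extend Baserow's job machinery.
    human_readable_error = ""

    def duplicate_database(self, database):
        # Placeholder for the real duplication logic, which serializes every
        # table of the source database inside a single transaction.
        raise NotImplementedError

    def run(self, database):
        try:
            self.duplicate_database(database)
        except DuplicateApplicationMaxLocksExceededException:
            # Stored on the job so the frontend notification popup can show it.
            self.human_readable_error = (
                "The database has too many tables to duplicate in a single "
                "transaction. Ask your administrator to increase Postgres' "
                "max_locks_per_transaction setting."
            )
            raise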

Permanent trash emptying

Currently, when the trash is permanently emptied in permanently_delete_marked_trash, if there are too many TrashEntry rows to iterate over in its single transaction, we quickly exhaust max_locks_per_transaction.

We'll now detect when this happens and raise PermanentDeletionMaxLocksExceededException. For now, we won't do anything else: we want the task's next iteration to retry rather than skip over the entries, since self-hosters could have increased max_locks_per_transaction in the meantime.

We will, however, provide a command to clear a database of its tables; see below.
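Sketched below, under the same assumptions about error detection as the snapshot example, is how the deletion loop could translate the failure while leaving the entries untouched so the next scheduled run retries them. The entry API shown is illustrative.

from django.db import OperationalError, transaction


class PermanentDeletionMaxLocksExceededException(Exception):
    """Raised when emptying the trash exceeds max_locks_per_transaction."""


def permanently_delete_marked_trash(trash_entries):
    # Illustrative version of the deletion loop; the entry API is an assumption.
    try:
        with transaction.atomic():
            for entry in trash_entries:
                entry.permanently_delete()
    except OperationalError as error:
        message = str(error)
        if "max_locks_per_transaction" in message or "out of shared memory" in message:
            # Nothing is marked as handled, so the next periodic run picks the
            # same entries up again, e.g. after the limit has been raised.
            raise PermanentDeletionMaxLocksExceededException() from error
        raise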

Bonus command

To ensure that customers can clear a database of thousands or tens of thousands of tables, we'll provide a command that doesn't delete all the tables in a single transaction, but instead deletes them table by table:

./baserow permanently_empty_database databaseId

This will report how many tables would be deleted; if you're happy to proceed, run it again with the --confirm parameter to perform the deletion:

./baserow permanently_empty_database databaseId --confirm
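A rough sketch of such a management command is below. The table lookup helper is hypothetical and stands in for however the database's tables are actually fetched; the point it illustrates is that each table is deleted in its own transaction, so the lock count per transaction stays small regardless of how many tables exist.

from django.core.management.base import BaseCommand
from django.db import transaction


def get_tables_for_database(database_id):
    # Hypothetical lookup; replace with the real query for the database's tables.
    raise NotImplementedError


class Command(BaseCommand):
    help = "Permanently deletes every table in the given database, one table at a time."

    def add_arguments(self, parser):
        parser.add_argument("database_id", type=int)
        parser.add_argument("--confirm", action="store_true")

    def handle(self, *args, **options):
        tables = list(get_tables_for_database(options["database_id"]))

        if not options["confirm"]:
            self.stdout.write(
                f"{len(tables)} tables would be permanently deleted. "
                f"Re-run with --confirm to proceed."
            )
            return

        for table in tables:
            # One transaction per table keeps the number of locks per
            # transaction small, no matter how many tables the database has.
            with transaction.atomic():
                table.delete()

        self.stdout.write(f"Permanently deleted {len(tables)} tables.")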

How to test

  1. Create a database with a lot of tables.
    • ./baserow fill_tables databaseId 10000
      • We'll call this database Large.
  2. Create a database; this automatically creates a single table.
    • We'll call this database Small.
  • Snapshot tests
    • Attempt to snapshot Large, confirm we get a nice error message in the modal.
    • Attempt to snapshot Small, confirm everything works as intended.
  • Duplication tests
    • Attempt to duplicate Large, confirm we get a nice error message in the notification.
    • Attempt to duplicate Small, confirm everything works as intended.
  • Permanent trash emptying tests
    • Trash and mark Large for deletion.
      • Wait for the next iteration, or manually enqueue permanently_delete_marked_trash.
        • Confirm PermanentDeletionMaxLocksExceededException is raised.
    • Trash and mark Small for deletion.
      • Wait for the next iteration, or manually enqueue permanently_delete_marked_trash.
        • Confirm it was successfully permanently deleted.

Merge Request Checklist

  • changelog.md has been updated if required
  • New/updated Premium features are separated correctly in the premium folder
  • The latest Chrome and Firefox have been used to test any new frontend features
  • Documentation has been updated
  • Quality Standards are met
  • Performance: tables are still fast with 100k+ rows, 100+ field tables
  • The redoc API pages have been updated for any REST API changes
  • Our custom API docs are updated for changes to endpoints accessed via api tokens
  • The UI/UX has been updated following UI Style Guide

Closes #1090
