Show error immediately if updating field is blocked by another concurrent operation
Feature description
While a database duplication, an additional import, or any other long-running task is in progress, some user actions can break the task if not prevented. We need some sort of locking mechanism to control which actions are allowed and which are forbidden.
For database duplication / Snapshot
The possible requirements from the snapshot / duplication side are:
- If a snapshot is in progress, I should get some sort of immediate error when I try to edit any field in that database (ideally a specific error saying it was due to a snapshot, but a generic "an operation is in progress" error may also be fine, not sure)
- Multiple snapshots should technically be able to run at the same time, but we may limit this at the job system level for load reasons
- Normal field updates should still block if another field update is running
For additional import
- If an import is in progress, I should get some sort of immediate error if I try to permanently delete a field, change the field type, or modify a field in an incompatible way.
- The error can be something like: "The action has failed because another operation is in progress on this table". Later we can advise the user to visit the "jobs in progress" page to check on the running job.
- We could also wait 10 seconds before failing, to give the task a chance to finish.
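The "wait 10 seconds before failing" behaviour can be sketched with a process-local lock as a stand-in for the real database lock. This is a minimal illustration, not the actual implementation; `OperationInProgress` and `update_with_timeout` are hypothetical names:

```python
import threading

class OperationInProgress(Exception):
    """Raised when the lock cannot be acquired before the timeout."""

def update_with_timeout(lock: threading.Lock, do_update, timeout: float = 10.0):
    # Wait up to `timeout` seconds for the concurrent task to finish,
    # then fail with an explicit error instead of blocking forever.
    if not lock.acquire(timeout=timeout):
        raise OperationInProgress(
            "The action has failed because another operation is "
            "in progress on this table"
        )
    try:
        return do_update()
    finally:
        lock.release()
```

With a real database the same pattern would be expressed with a lock timeout on the blocking query instead of a `threading.Lock`.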
General requirements
- The locking should be hierarchical: if we lock database actions, table and field level actions should be locked too.
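The hierarchical rule ("a database-level lock also blocks table- and field-level actions") can be sketched as a pure function over lock names. The `level__action` naming and the `blocked_by` helper are assumptions for illustration only:

```python
# Hierarchy: application > table > field. A lock taken at a level
# also blocks actions at every level below it.
HIERARCHY = {
    "application": ["table", "field"],
    "table": ["field"],
}

def blocked_by(held_locks: set, requested: str) -> bool:
    """True if `requested` (e.g. "field__update") is blocked by a held lock."""
    level, _, action = requested.partition("__")
    for lock in held_locks:
        lock_level, _, lock_action = lock.partition("__")
        # A lock blocks the request if it sits at the same level or a
        # parent level, and covers this action (or all actions).
        same_or_parent = lock_level == level or level in HIERARCHY.get(lock_level, [])
        if same_or_parent and lock_action in ("all", action):
            return True
    return False
```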
Technical API propositions
First
# In the duplication/snapshot job
with CoreHandler().advisory_lock(['application__all']):
    # Do the snapshot job
    ...

# In the import job
with TableHandler().advisory_lock(['field__deletion', 'field__incompatible_update', 'field__type_update']):
    # Do the import job
    ...

# In the field handler
def update_field():
    ...
    TableHandler().wait_lock(['field__type_update'], timeout=10)
    ...

# The lock hierarchy:
#   application > table > field
# Locks: application__all, application__update, application__creation, ...
#        table__all, table__creation, table__modification, ...
#        field__all, field__update, field__deletion, field__type_update, ...
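PostgreSQL advisory lock functions (e.g. `pg_advisory_xact_lock`) take a bigint key, so the string lock names above would need a stable mapping to a signed 64-bit integer. A possible sketch (the function name is an assumption, not an existing API):

```python
import hashlib

def advisory_lock_key(name: str) -> int:
    """Map a lock name like "field__type_update" to a signed 64-bit key
    suitable for pg_advisory_xact_lock(bigint)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Take the first 8 bytes of the hash as a signed big-endian integer.
    return int.from_bytes(digest[:8], "big", signed=True)
```

Hashing keeps the mapping deterministic across processes without a lookup table, at the (negligible) cost of a theoretical 64-bit collision.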
Second
# For snapshot/duplication
with transaction_atomic(isolation_level=REPEATABLE_READ):
    # Do stuff that needs a snapshot of the DB
    ...

# For the file import job
fields = Field.objects.select_for_share().all()
# uses -> SELECT * FROM database_field FOR SHARE
# Do the import

# For a specific field update
field = Field.objects.select_for_update(nowait=True).get(id=field_id)
# Do an update operation on the field
Proposed solutions
Using pgSQL locks
One option for importing into a single table is to take a FOR SHARE lock on the database_field rows to prevent any field edits for the table, plus an ACCESS SHARE lock on the table and possibly an ACCESS SHARE lock on any m2m tables:
SELECT * FROM database_field WHERE table_id = xyz FOR SHARE;
LOCK TABLE database_table_xyz IN ACCESS SHARE MODE;
LOCK TABLE database_relation_abc IN ACCESS SHARE MODE;
All we need to do is:
- READ operations which happen asynchronously and which we want to block start with a SELECT * FROM database_field FOR SHARE before proceeding
- WRITE operations which read the fields but don't modify them (import table etc.), and which we want to block until they can write, start with a SELECT * FROM database_field FOR SHARE
- WRITE operations on the fields themselves, which we want to block until they can write (????), start with a SELECT * FROM database_field FOR UPDATE
- WRITE operations that should exit immediately and show an error (update field) if another operation is in progress will do SELECT * FROM database_field FOR UPDATE NOWAIT
- Separately, most READ operations should probably be switched to REPEATABLE READ so they see a snapshot of the database state at the time of their first query
With this scheme, snapshots/application-level duplications wait to take a FOR SHARE lock on all of the fields in the entire database, and the import table operation waits for a FOR SHARE lock on just the fields of the table it imports into. This way you only lock the exact fields you care about, and only updates to those fields are blocked.
Ref https://code.djangoproject.com/ticket/10088
Using django-pglocks
Use this library: https://github.com/Xof/django-pglocks, which lets you create an application-specific "advisory lock". The lock still lives in the database itself, but we control locking/unlocking it.