Prevent commit_write from writing to S3 without updating Index
Context
Currently, if writing to the index fails, commit_write will already have uploaded a blob to the storage without updating the index. The written blob will be very difficult to identify or remove, since bgd cleanup will only remove indexed blobs.
- Expected behaviour: commit_write should not leak unindexed blobs to the storage
- Current behaviour: On database errors, commit_write will leak blobs
Task Description
Ultimately, we want commit_write to be an atomic operation.
There are a few possible ways we could tackle this problem. The first would be to manually implement rollback logic that deletes the blob in S3 if the database transaction fails.
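A minimal sketch of the manual rollback idea, using in-memory stand-ins rather than the real storage and index classes. The names InMemoryStorage, InMemoryIndex and commit_write_with_rollback are hypothetical and only illustrate the control flow:

```python
class InMemoryStorage:
    """Hypothetical stand-in for the S3-backed blob storage."""

    def __init__(self):
        self.blobs = {}

    def put(self, digest, data):
        self.blobs[digest] = data

    def delete(self, digest):
        self.blobs.pop(digest, None)


class InMemoryIndex:
    """Hypothetical stand-in for the database index; can simulate a DB error."""

    def __init__(self, fail=False):
        self.entries = set()
        self.fail = fail

    def add(self, digest):
        if self.fail:
            raise RuntimeError("simulated database error")
        self.entries.add(digest)


def commit_write_with_rollback(storage, index, digest, data):
    """Upload the blob, then index it; delete the blob if indexing fails."""
    storage.put(digest, data)
    try:
        index.add(digest)
    except Exception:
        # Compensating action: remove the blob we just uploaded,
        # then re-raise so the caller still sees the database error.
        storage.delete(digest)
        raise
```

Note that this is only best-effort: if the compensating delete itself fails, the blob still leaks, and blindly deleting could remove a blob that another writer had already stored under the same digest.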
Alternatively, we can upload blobs to storage within the database session. Then, if the upload to storage fails, we can just roll back the database changes.
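A rough sketch of uploading inside the database session, using the standard library sqlite3 module as a stand-in for the real index database. The table layout, FlakyStorage, make_index, commit_write_in_session and indexed are all assumptions made for illustration:

```python
import sqlite3


class FlakyStorage:
    """Hypothetical stand-in for the S3-backed storage; can simulate a failed upload."""

    def __init__(self, fail=False):
        self.blobs = {}
        self.fail = fail

    def put(self, digest, data):
        if self.fail:
            raise OSError("simulated upload failure")
        self.blobs[digest] = data


def make_index():
    """Create an in-memory index database with an assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE index_entries (digest TEXT PRIMARY KEY, size INTEGER)")
    conn.commit()
    return conn


def commit_write_in_session(conn, storage, digest, data):
    """Write the index row and upload the blob inside one transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "INSERT INTO index_entries (digest, size) VALUES (?, ?)",
            (digest, len(data)),
        )
        # If this raises, the INSERT above is rolled back.
        storage.put(digest, data)


def indexed(conn, digest):
    """Return True if the digest has a row in the index."""
    row = conn.execute(
        "SELECT 1 FROM index_entries WHERE digest = ?", (digest,)
    ).fetchone()
    return row is not None
```

One gap remains in this sketch: if the upload succeeds but the final commit fails, the blob is still left unindexed. The transaction also stays open for the duration of the upload.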
Acceptance Criteria
commit_write should be atomic. If it fails, the blob should not be in the storage or the index. If it succeeds, it should be in both.
Consider whether the following are required, and complete if so:
- Unit tests