Use _bulk_refresh_timestamps on existing blobs in bulk_read_blobs

Jeremiah Bonney requested to merge jbonney/batch-read-timestamp-update into master

Before raising this MR, consider whether the following are required, and complete if so:

  • Unit tests
  • Metrics
  • Documentation update(s)

If not required, please explain in brief why not.

Description

This PR changes the behavior of bulk_read_blobs for the index: instead of using a single transaction to update blobs and timestamps, the work is now split into two transactions. The first updates any rows that need blob content inlined, and the second refreshes timestamps for the remaining blobs using _bulk_refresh_timestamps. Because _bulk_refresh_timestamps uses skip_locked in its query, concurrent requests touching common blobs won't be serialized as they would be with _save_digests_to_index.
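To illustrate the shape of the change, here is a minimal sketch (not the actual BuildGrid code; the model, function names, and signatures such as IndexEntry, bulk_refresh_timestamps, and blobs_to_inline are assumptions) showing a two-transaction split where the timestamp refresh uses SELECT ... FOR UPDATE SKIP LOCKED so concurrent readers of the same rows are skipped instead of waited on:

```python
# Hypothetical sketch of the two-transaction pattern, using SQLAlchemy.
# Names and schema are illustrative only, not the project's real ones.
from datetime import datetime

from sqlalchemy import Column, DateTime, LargeBinary, String, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class IndexEntry(Base):
    __tablename__ = "index_entries"

    digest_hash = Column(String, primary_key=True)
    inline_blob = Column(LargeBinary, nullable=True)
    accessed_timestamp = Column(DateTime, nullable=False)


def bulk_refresh_timestamps(session: Session, digests: list[str]) -> None:
    """Refresh accessed_timestamp for the given digests.

    skip_locked means rows already locked by a concurrent request are
    skipped rather than waited on; that request refreshes them anyway.
    """
    stmt = (
        select(IndexEntry)
        .where(IndexEntry.digest_hash.in_(digests))
        .with_for_update(skip_locked=True)
    )
    now = datetime.utcnow()
    for entry in session.execute(stmt).scalars():
        entry.accessed_timestamp = now


def bulk_read_blobs(engine, digests: list[str], blobs_to_inline: dict[str, bytes]) -> None:
    # First transaction: write inline content for rows that need it.
    with Session(engine) as session, session.begin():
        for digest_hash, data in blobs_to_inline.items():
            entry = session.get(IndexEntry, digest_hash)
            if entry is not None:
                entry.inline_blob = data
                entry.accessed_timestamp = datetime.utcnow()

    # Second transaction: refresh timestamps for the remaining blobs
    # without blocking on rows locked by concurrent readers.
    remaining = [d for d in digests if d not in blobs_to_inline]
    with Session(engine) as session, session.begin():
        bulk_refresh_timestamps(session, remaining)
```

The design point is that only the (usually small) set of rows needing inlined content takes ordinary row locks; the common case, a pure timestamp refresh, no longer contends with other readers.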

With this change, the only time _save_digests_to_index is called from bulk_read_blobs with fallback_on_get disabled is when the inline blob size is adjusted. Assuming that value changes only occasionally, this path still amounts to a single transaction in practice. A similar change was made to the fallback_on_get case, but that path still calls _save_digests_to_index for any blob that is not inlined, including large blobs.
