WIP: Reimplement S3 parallelism (!582) · Merge requests · BuildGrid / buildgrid

Richard Kennedy requested to merge richken/new_s3_parrallelism into master Nov 30, 2020

Description

Previously, an MR that added support for parallelizing S3 requests was merged. That MR ended up being reverted for a few reasons:

1. There was no way to turn off the parallelism
2. A large request could monopolize all the threads, starving other requests
3. Potentially because of 2, there was no significant improvement in performance

This new MR attempts to resolve the issues with the old implementation by using a BoundedThreadPool executor instead of a regular ThreadPool executor. When no threads are available, submitting a request to the BoundedThreadPool executor fails, and the S3Storage class falls back to submitting jobs serially. This has the benefit that requests are not blocked when all threads are currently in use. In addition, a single large job can no longer monopolize all threads in the executor: any single request can use at most a fixed number of threads at any point in time (specified by self._max_s3_executor_load).
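To illustrate the behaviour described above, here is a minimal sketch of the idea, not the code in this MR. It assumes a pool that rejects work when every worker is busy, plus a per-request cap on concurrent submissions. The names PoolFullError, run_batch and max_load are hypothetical and chosen for illustration only; max_load plays the role that self._max_s3_executor_load plays in the description, and the BoundedThreadPool class below is a stand-in rather than BuildGrid's actual class.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class PoolFullError(Exception):
    """Raised when no worker slot is free (hypothetical name)."""


class BoundedThreadPool:
    """Illustrative pool that rejects work instead of queueing it."""

    def __init__(self, max_workers):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        # One semaphore slot per worker thread.
        self._slots = threading.BoundedSemaphore(max_workers)

    def submit(self, fn, *args, **kwargs):
        # Non-blocking acquire: fail fast rather than wait for a free worker.
        if not self._slots.acquire(blocking=False):
            raise PoolFullError("no free worker threads")
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            self._slots.release()
            raise
        # Free the slot once the task finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _: self._slots.release())
        return future

    def shutdown(self, wait=True):
        self._executor.shutdown(wait=wait)


def run_batch(pool, fn, items, max_load):
    """Apply fn to each item using at most max_load pool threads at once.

    Items that cannot get a pool thread are run serially in the caller.
    """
    results = [None] * len(items)
    pending = []  # (index, future) pairs handed to the pool
    for index, item in enumerate(items):
        active = sum(1 for _, future in pending if not future.done())
        if active < max_load:
            try:
                pending.append((index, pool.submit(fn, item)))
                continue
            except PoolFullError:
                pass  # pool exhausted: fall through to the serial path
        results[index] = fn(item)
    for index, future in pending:
        results[index] = future.result()
    return results
```

With this shape, a caller never blocks waiting for a thread: work either runs in the pool or runs inline, and the order of results matches the order of the input items either way.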

Changes proposed in this merge request:

Reimplement S3 Parallelism

Additional work to be done in separate MR

Separate ThreadPool Executors for reading/writing (To Do)

Edited Jun 07, 2021 by Marios Hadjimichael
