Boto3+s3 backend does not retry if connection fails during backend initialization
Summary
When there is an error during the creation of the boto3+s3 backend, retry logic does not seem to be triggered. Since I contributed the boto3+s3 backend back in December, I am happy to help. The root cause is that s3_boto3_backend.py:S3Boto3Backend.__init__ calls reset_connection(), which can fail (a pattern that the older boto backend also used).
It should be simple enough to lazily initialize the connection and remove the connection reset from the constructor (where it probably doesn't belong). Before doing that, I wanted to ask whether reentrancy needs to be considered, or whether it is safe to assume that an instance of the backend will never be accessed by multiple threads at the same time. The backends/README doesn't seem to say anything about this.
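A minimal sketch of what lazy initialization could look like, with a lock in case concurrent access does turn out to matter. The class name and the connect callable are hypothetical stand-ins for illustration, not the actual S3Boto3Backend code:

```python
import threading


class LazyConnectionBackend:
    """Hypothetical stand-in for a backend; not duplicity's actual class."""

    def __init__(self, connect):
        # `connect` is an assumed zero-argument callable that opens the
        # connection (in the real backend, roughly what reset_connection() does).
        self._connect = connect
        self._conn = None
        self._lock = threading.Lock()

    def connection(self):
        # Double-checked locking: the lock is only taken until the first
        # successful connect. If connect() raises, _conn stays None, so the
        # next call (for example from a retried operation) tries again.
        if self._conn is None:
            with self._lock:
                if self._conn is None:
                    self._conn = self._connect()
        return self._conn
```

With this shape the constructor does no network I/O, so a connection failure surfaces inside the first real operation instead. If backend instances are only ever used from one thread, the lock can simply be dropped.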
Steps to reproduce
Perform an operation with the boto3+s3 backend under network conditions that result in an error during backend initialization (the backend makes an S3 HEAD call at that point).
What is the current bug behaviour?
Duplicity exits with an error and stack trace after what appears to be a single failure from the boto3 library (regardless of what boto3 might be retrying internally).
What is the expected correct behaviour?
Connection establishment should be retried according to duplicity's normal retry logic, which is observed when the connection drops during an upload.
Relevant logs and/or screenshots
/opt/duplicity/0.8-dev/duplicity-venv/bin/duplicity --progress --progress-rate=30 --encrypt-key=MyKey --verbosity=Notice --volsize=1024 --archive-dir=/Store0/duplicity/archive --file-prefix-archive archive- --file-prefix-manifest manifest- --file-prefix-signature signature- remove-older-than 210D --force boto3+s3://my_bucket/my_backup_set
Traceback (innermost last):
File "/opt/duplicity/0.8-dev/duplicity/bin/duplicity", line 101, in <module>
with_tempdir(main)
File "/opt/duplicity/0.8-dev/duplicity/bin/duplicity", line 87, in with_tempdir
fn()
File "/opt/duplicity/0.8-dev/duplicity/duplicity/dup_main.py", line 1526, in main
action = commandline.ProcessCommandLine(sys.argv[1:])
File "/opt/duplicity/0.8-dev/duplicity/duplicity/commandline.py", line 1172, in ProcessCommandLine
globals.backend = backend.get_backend(args[0])
File "/opt/duplicity/0.8-dev/duplicity/duplicity/backend.py", line 225, in get_backend
obj = get_backend_object(url_string)
File "/opt/duplicity/0.8-dev/duplicity/duplicity/backend.py", line 211, in get_backend_object
return factory(pu)
File "/opt/duplicity/0.8-dev/duplicity/duplicity/backends/s3_boto3_backend.py", line 85, in __init__
self.reset_connection()
File "/opt/duplicity/0.8-dev/duplicity/duplicity/backends/s3_boto3_backend.py", line 96, in reset_connection
self.s3.meta.client.head_bucket(Bucket=self.bucket_name)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/client.py", line 648, in _make_api_call
operation_model, request_dict, request_context)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/client.py", line 667, in _make_request
return self._endpoint.make_request(operation_model, request_dict)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
return self._send_request(request_dict, operation_model)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/endpoint.py", line 137, in _send_request
success_response, exception):
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/endpoint.py", line 231, in _needs_retry
caught_exception=caught_exception, request_dict=request_dict)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/hooks.py", line 356, in emit
return self._emitter.emit(aliased_event_name, **kwargs)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/hooks.py", line 228, in emit
return self._emit(event_name, kwargs)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
response = handler(**kwargs)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/retryhandler.py", line 183, in __call__
if self._checker(attempts, response, caught_exception):
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/retryhandler.py", line 251, in __call__
caught_exception)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/retryhandler.py", line 277, in _should_retry
return self._checker(attempt_number, response, caught_exception)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/retryhandler.py", line 317, in __call__
caught_exception)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/retryhandler.py", line 223, in __call__
attempt_number, caught_exception)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/retryhandler.py", line 359, in _check_caught_exception
raise caught_exception
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/endpoint.py", line 200, in _do_get_response
http_response = self._send(request)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/endpoint.py", line 244, in _send
return self.http_session.send(request)
File "/opt/duplicity/0.8-dev/duplicity-venv/lib64/python3.6/site-packages/botocore/httpsession.py", line 283, in send
raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://my_bucket.s3.amazonaws.com/"
Possible fixes
We can lazy initialize some of the internals of the backend inside the _put / _get / _list / _delete / _query calls, or the caller can be restructured to retry initialization of the backends when there is an error. I imagine the former would be less work.
I've done a quick test of lazy initialization with this commit: carlalexanderadams/duplicity@c0671312
However, I have not done deep regression testing under the conditions that brought this to my attention. (That system is currently busy backing up terabytes, and I do not want to interrupt it.)
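For comparison, the second option (having the caller retry backend initialization) might look roughly like the sketch below. The function name, its parameters, and the defaults are all hypothetical; this is not duplicity's existing retry code, and a real version would presumably take its retry count from duplicity's own settings:

```python
import time


def get_backend_with_retry(factory, url, attempts=5, delay=0.0,
                           retry_on=(OSError,)):
    """Call factory(url), retrying when construction fails.

    All names and defaults here are hypothetical. `retry_on` is the tuple of
    exception types treated as transient; for the failure in the traceback
    above it would need to include botocore.exceptions.EndpointConnectionError.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return factory(url)
        except retry_on as e:
            last_error = e
            # Sleep between attempts, but not after the final one.
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed: re-raise the last error seen.
    raise last_error
```

This keeps the backends untouched but changes the call site that constructs them, which is why lazy initialization inside the backend looks like the smaller change.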