Maximum recursion error while using tar source (upstream Python issue)
Summary
When unpacking certain tar files, particularly ones containing symlinks whose source name is the same as the target name (e.g. /opt/foo/tmp -> /tmp), tarfile may recurse until the stack is exhausted due to an upstream bug: https://bugs.python.org/issue23228
Steps to reproduce
The following shell script will create and test a tar file which demonstrates this:
#!/bin/sh
mkdir -p templink
ln -f -s /tmp templink/tmp
tar cf templink.tar.gz templink
# Leaving the original templink directory here is intentional
cat > extract.py <<_end_of_python
#!/usr/bin/env python3
import tarfile
tf = tarfile.open("templink.tar.gz")
tf.extractall()
_end_of_python
python3 extract.py
Inside BuildStream, you can trigger this by staging two similar tar files inside the same directory, or using a Docker import which has two similar tar files in its layers.
What is the current bug behavior?
Traceback (most recent call last):
  File "/usr/lib/python3.6/tarfile.py", line 2212, in makelink
    os.symlink(tarinfo.linkname, targetpath)
FileExistsError: [Errno 17] File exists: '/tmp' -> './templink/tmp'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/tarfile.py", line 2212, in makelink
    os.symlink(tarinfo.linkname, targetpath)
FileExistsError: [Errno 17] File exists: '/tmp' -> './templink/tmp'
This continues until the stack runs out.
What is the expected correct behavior?
The tar file is extracted correctly, giving the same results as tar xf templink.tar.gz does.
Possible fixes
The only change we can make on our side is to stop using tarfile entirely. Otherwise, we just have to wait for upstream to resolve this, or nudge it along if we can.
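As a rough illustration of what "stop using tarfile" could look like, the sketch below shells out to the system tar binary instead. The helper name extract_with_system_tar is hypothetical and not part of BuildStream; it assumes a tar executable is available on PATH and that its default handling of existing files is acceptable.

```python
import subprocess


def extract_with_system_tar(archive, dest="."):
    """Extract `archive` into `dest` with the system tar binary.

    Hypothetical helper, not BuildStream API. Assumes `tar` is on PATH.
    """
    # -x: extract, -f: archive path, -C: switch to `dest` before extracting.
    # check=True raises CalledProcessError if tar exits with a non-zero status.
    subprocess.run(["tar", "-xf", archive, "-C", dest], check=True)
```

With the archive from the reproduction script, extract_with_system_tar("templink.tar.gz") would take the place of the tarfile.open() and extractall() calls. The trade-off is a dependency on an external binary, whose behaviour may differ between tar implementations.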
- BuildStream version affected: milestone %BuildStream_v1.2.4