Weird segfault in aarch64 chroot during "apk add phoc"
phosh and phoc have just been upgraded in the v20.05 branch of pmaports.git. phosh depends on phoc.
phoc was built for aarch64 successfully and is now in the unpublished WIP repository of builds.postmarketos.org.
When trying to build phosh for aarch64, the build installs that version of phoc from the WIP repository and segfaults during the apk add:
(81/421) Installing phoc (0.4.2-r0)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
>>> ERROR: phosh: builddeps failed
This can be reliably reproduced whenever trying to build phosh with pmbootstrap for aarch64 on builds.sr.ht:
- https://builds.sr.ht/~postmarketos/job/278326
- https://builds.sr.ht/~postmarketos/job/278328
- https://builds.sr.ht/~postmarketos/job/278329
This error does not happen on x86_64 or armv7. I was not able to reproduce it locally, even after downloading the same WIP phoc package.
I've resubmitted the build in my own sourcehut namespace and logged into the builder after the failed build. After running ulimit -c unlimited and reproducing the bug there with...
./pmbootstrap/pmbootstrap.py \
-mp https://build.postmarketos.org/wip/ \
-mp http://mirror.postmarketos.org/postmarketos/ \
--details-to-stdout \
\
build \
--arch=aarch64 \
--force \
--strict \
phosh
... I was able to magic-wormhole the core dump out. I'll attach it here.
We could build apk with debug symbols (which is the default; run make and make install in apk-tools.git), and then either try to use that core file, or generate a new one after writing a hack for pmbootstrap to install the apk with debug symbols to /sbin/apk in chroot_build_aarch64. In theory, this should give a useful backtrace.
However, I'm not sure how to make gdb or gdb-multiarch display the backtrace, since the binary ran with qemu-aarch64-static:
$ file core
core: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from '/usr/bin/qemu-aarch64-static /sbin/apk --no-progress add --wait 30 --repository', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/sbin/apk', platform: 'x86_64'
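For reference, the architecture that file reports comes from the e_machine field of the ELF header (a 16-bit value at byte offset 18; 62 is x86-64, 183 is AArch64). Since the process that got dumped is qemu-aarch64-static itself, the core describes the emulator's x86-64 state, not the emulated apk. A minimal sketch to check this by hand, assuming a little-endian ELF file as in the output above; the function name elf_machine is made up for illustration:

```shell
# Hypothetical helper (not part of pmbootstrap or apk-tools): print the
# e_machine field of an ELF file as a decimal number.
# Assumes a little-endian ELF file (the "LSB" in the file output).
elf_machine() {
    # Read the two bytes at offset 18 as unsigned decimals: low byte, high byte.
    set -- $(od -An -tu1 -j18 -N2 "$1")
    echo $(( $1 + 256 * $2 ))
}

# Usage: elf_machine core
# 62 means x86-64 (EM_X86_64), 183 means AArch64 (EM_AARCH64).
```

Running it on /sbin/apk from the aarch64 chroot and on the core file should give different numbers, which would confirm the mismatch.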
I've also noticed that if I remove --strict (so that pmbootstrap, not abuild, calls apk to install the dependencies), the bug is not triggered. So after spending multiple hours on this, I'll just configure bpo to build this package without --strict as a workaround (code to do that already exists).
If somebody is able to pull the backtrace out of the core dump, it would be highly appreciated.