x86_64 32-bit EFI boot is broken on systemd-boot
Edit: This was a kernel bug, and is fixed by https://lore.kernel.org/linux-efi/CAMj1kXEHt-UP2MWc_-jBzf22vqhFr2QykwrGwEeYAk2WSkY4Zg@mail.gmail.com/T/#me9520b3c8565fddfb47ec7deacd40ce9f376f882
On pmOS Edge, booting an x86_64 system with 32-bit EFI using systemd-boot should work. It works with grub-x86 (grub built for 32-bit EFI), and it used to work a few months ago (Oct/Nov 2023) when I ported sd-boot-254 to pmOS. (EDIT: I'm now not actually 100% sure about that ...
I have 3 systems that exhibit different (but maybe related?) behavior:
- Intel Bay Trail tablet, Dell Venue 8 pro --> hangs after selecting the entry in sd-boot menu
- Intel Bay Trail NUC, Zotac ZBOX-PI320 --> reboots after selecting the entry in sd-boot menu
- qemu + 32-bit EFI --> reboots after selecting the entry in sd-boot menu
In all of these situations, I cannot get anything to print on a display, which has seriously limited further debugging. I've been focused on trying to get more info printed when it hangs/reboots. The kernel options I am using are:
debug ignore_loglevel earlycon=efifb earlyprintk=efi,keep efi=debug PMOS_NOSPLASH PMOS_NO_OUTPUT_REDIRECT
The last two params are used in our initramfs, but I'm like 90% sure we aren't even making it to the initramfs.
Alpine Linux's linux-lts kernel config:
CONFIG_DMI=y
CONFIG_EFI=y
CONFIG_EFI_EARLYCON=y
CONFIG_EFI_ESRT=y
CONFIG_EFI_HANDOVER_PROTOCOL=y
CONFIG_EFI_MIXED=y
CONFIG_EFI_RUNTIME_WRAPPERS=y
CONFIG_EFI_STUB=y
CONFIG_EFI_VARS_PSTORE=m
CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=y
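When comparing kernels from different distros, it can help to check the mixed-mode related options mechanically instead of by eye. The following is a minimal sketch, not something used in this investigation: `check_efi_config` is a hypothetical helper name, and the set of options it checks is simply a subset of the list above.

```shell
#!/bin/sh
# Sketch: report which EFI-related options are enabled in a kernel config file.
# check_efi_config is a hypothetical helper, not part of any existing tool.
check_efi_config() {
    config_file="$1"
    for opt in CONFIG_EFI CONFIG_EFI_STUB CONFIG_EFI_MIXED \
               CONFIG_EFI_HANDOVER_PROTOCOL CONFIG_EFI_EARLYCON; do
        # Only count the option if it is set to y or m; lines such as
        # "# CONFIG_EFI_MIXED is not set" do not match.
        if grep -Eq "^${opt}=[ym]\$" "$config_file"; then
            echo "${opt}: set"
        else
            echo "${opt}: MISSING"
        fi
    done
}
```

Usage would look like `check_efi_config /path/to/kernel/config`, where the path is whatever config file ships with the kernel package being tested.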
Things I've tried
- grub-x86 --> works
- sd-boot-254 --> does not work anymore (but used to)
  - maybe this is a kernel issue?
- Alpine linux-edge (6.8) and linux-lts (6.6) in Alpine edge --> does not work
- comparing kernel config between Alpine Linux and Arch Linux kernels, wrt EFI-related config --> did not find anything to explain it
- systemd-boot-255 Arch Linux w/ Alpine kernel & pmos initramfs --> does not work
  - reasoning: it might be a build/compiler issue on Alpine?
  - since it doesn't work, maybe it's a kernel issue?
- systemd-boot-255 & 6.7.5 from Arch Linux --> does not work
  - What?!
- older kernel (6.2-zen, from Arch Linux) --> does not work
- booted with panic=0 --> kernel doesn't appear to be panicking
  - reasoning: is the reboot on the NUC caused by the kernel being configured to reboot on panic AND the kernel panicking?
  - even with that param set, the system still reboots after selecting the boot entry in sd-boot, indicating that maybe it's not panicking
- putting the kernel in <esp>/efi/boot/bootia32.efi so EFI runs it directly via efistub --> does not work
- Try <6.2 kernel --> works
  - bisected to kernel, sent message to kernel ML #2690 (comment 1826329409)
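For anyone wanting to repeat the efistub test from the list above (running the kernel directly from the firmware's fallback path, with no sd-boot involved), the step can be sketched roughly as follows. `install_efistub_fallback` is a hypothetical helper name, and the kernel and ESP paths are placeholders to be filled in for the system at hand.

```shell
#!/bin/sh
# Sketch: copy an EFI-stub kernel to the 32-bit fallback boot path on the ESP.
# install_efistub_fallback is a hypothetical helper name.
# $1: kernel image built with CONFIG_EFI_STUB, $2: mount point of the ESP.
install_efistub_fallback() {
    kernel="$1"
    esp="$2"
    mkdir -p "$esp/efi/boot" || return 1
    # bootia32.efi is the default file name 32-bit firmware looks for.
    cp "$kernel" "$esp/efi/boot/bootia32.efi"
}
```

Booted this way, the firmware most likely passes no command line, so the debug parameters above would probably need to be built into the kernel (for example via CONFIG_CMDLINE); treat that as an assumption about this setup.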
Things to try next:
- Try again with qemu x86_64 + 32-bit EFI, with a serial device and kernel configured to output via serial tty.
  - If there are GPU/graphics issues preventing us from seeing kernel output (that could help debug further), serial output might allow us to see that output.
  - Unfortunately I temporarily lost access to a system that can run x86_64 qemu 😞
- Try to figure out if the hang/reboot happens in sd-boot code or in kernel
  - Could add printfs to sd-boot code, starting(?) in logic around kernel load/exec
- Try an older kernel, <6.2
  - Will need to rebuild w/ required EFI stuff set if it's from Alpine, IIRC some stuff (EFI_MIXED?) was only enabled more recently in those packaged versions
  - Is there an old Arch kernel packaged somewhere with the required kconfig?
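For the serial-output idea in the list above, the kernel options would look roughly like this. This is an untested sketch: it assumes the guest's first serial port shows up as ttyS0 at 115200 baud, and it keeps the debug options already in use.

```text
debug ignore_loglevel efi=debug console=ttyS0,115200 earlyprintk=serial,ttyS0,115200 PMOS_NOSPLASH PMOS_NO_OUTPUT_REDIRECT
```

On the qemu side, `-serial stdio` would route that serial port to the terminal, so any early kernel output would show up there even if nothing reaches the display.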