An alternative approach to supporting ZFS
Currently, SystemRescue does not support ZFS. The reason why is explained in the FAQ. A number of feature requests for ZFS support have been closed accordingly.
I recently ran into the need for ZFS support myself, and as I would rather work with a lightweight rescue system than an Ubuntu live ISO (currently tipping the scales at 4.7 GB), I looked into compiling ZFS support for SystemRescue myself as suggested by the FAQ.
And ohhhh, holy heck. This is not a straightforward procedure. I was already at a disadvantage given that Debian/Ubuntu are my home turf, and I thus had to learn how Arch Linux handles a number of things, even operations as basic as installing a package. And then on top of that, the SystemRescue environment has its own idiosyncrasies, like several files needed for the ZFS build just not being present despite the associated packages being installed. (Reinstalling said packages restores the files.)
To make a long story short, it took me a fair amount of time to figure out, and that is with me already living and breathing this kind of thing professionally. This approach is not going to be feasible for the majority of SystemRescue users. So I would like to propose a better one.
In the course of getting that ZFS build working, I put together a script that codifies the whole process. I've tested and refined it, and would like to present it here:
https://gitlab.com/-/snippets/3614050
This script has a number of desirable properties:
- Runs on a live SystemRescue instance (requires >= 4 GB RAM);
- Also runs on a bare archlinux Docker container (though it won't produce kernel modules compatible with SystemRescue; I hope this can eventually be made possible);
- Produces a proper SRM file that can be loaded into SystemRescue in the standard way;
- Builds everything up to and including the SRM file reproducibly, so different users building this on different systems should get exactly the same result (with some potential caveats on package versions, which I'll bring up shortly);
- Prints a summary at the end with some useful details;
- Idempotent, for the most part;
- Well-commented 😄
Just making this script available to users will improve the situation tremendously, saving them the learning curve and frustration that likely await them otherwise. However, since the build is reproducible, some additional possibilities open up that are worth considering.
Because anyone who runs the build on a given SystemRescue release should get the same exact SRM file as a result, the hash/digest of that SRM could be published. (Imagine having the SHA-256/512 hashes for the SRM on the download page, albeit with no corresponding download link.) Users could look it up to verify that their build turned out correctly... or more likely, they can download that SRM file from a third party willing to eat the legal risk, and be able to verify that it is just as good as one they compiled themselves.
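To make the verification step concrete, here is a minimal sketch of checking an SRM file against a published SHA-256 digest. The function names and the `zfs.srm` file name are hypothetical, chosen only for illustration; the expected digest would come from wherever the project ends up publishing it.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read in fixed-size chunks so a large SRM never has to fit in RAM.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def srm_matches(path, published_hex):
    """True if the file's digest equals the published one.

    Whitespace and letter case in the published value are ignored.
    """
    return sha256_of_file(path) == published_hex.strip().lower()
```

A user would call something like `srm_matches("zfs.srm", "<digest from the download page>")` on either a self-built file or one obtained from a third party; the check is the same in both cases.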
Provided that the reproducibility is ironclad, you could even bake the hash into the SystemRescue image itself, so that the SRM is checked even if the user doesn't bother to do so. So instead of this being an arbitrary SRM that happens to contain ZFS, it becomes more of a "SystemRescue ZFS add-on" that is effectively part of the release, but never actually distributed by the project.
That's the sum of my proposal. Now, on the topic of reproducibility:
My understanding is that the SystemRescue ISO uses, by default, a "snapshot" pacman config that fixes the versions of all packages to those available as of the ISO release date (e.g. 10.02 uses 2023-09-16). So upgrading any package is a no-op, and if I install a package six months from now, I'll get exactly the same version as if I did it today. (Please correct me if I'm wrong, as a lot is riding on this assumption.)
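To illustrate what I mean by a snapshot config: the sketch below builds the pacman `Server` line for a dated snapshot, assuming the pinning is done by pointing pacman at the Arch Linux Archive's daily repository snapshots. Whether SystemRescue's config is laid out exactly this way is part of the assumption I'd like confirmed, and the function name is hypothetical.

```python
from datetime import date


def snapshot_server_line(snapshot):
    """Build a pacman 'Server' line for an Arch Linux Archive daily snapshot."""
    # $repo and $arch are pacman's own variables and are left for it to expand.
    return (
        "Server = https://archive.archlinux.org/repos/"
        f"{snapshot:%Y/%m/%d}/$repo/os/$arch"
    )


# Snapshot date mentioned above for SystemRescue 10.02.
print(snapshot_server_line(date(2023, 9, 16)))
```

With every repository pinned to a dated URL like this, the set of available package versions cannot change after the fact, which is what the reproducibility argument relies on.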
There is a possibility that the hardware environment (e.g. CPU features) affects the build, but I haven't seen anything in the build log to suggest this will be the case. Some additional testing may be beneficial here.
The ZFS packages come from the AUR, and currently the script just uses the latest versions in Git. So when the AUR eventually updates, the resulting SRM will be different. An alternate approach would be for the script to select the commit that was current as of the release date, which should then make the build fully reproducible over time. If you are amenable to the idea of each new SystemRescue release having a unique associated ZFS SRM build, then this would be the way to go.
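As a rough sketch of the date-based selection (not what the script currently does), the newest commit at or before a cutoff can be looked up with `git rev-list`. The function names and the `zfs-dkms` directory in the usage note are hypothetical; note that `--before` filters on committer date, so the cutoff is given with an explicit time and UTC offset to avoid ambiguity.

```python
import subprocess


def rev_list_args(cutoff, ref="HEAD"):
    """Arguments for `git rev-list` picking the newest commit at or before cutoff."""
    return ["rev-list", "-n", "1", f"--before={cutoff}", ref]


def commit_as_of(repo_dir, cutoff, ref="HEAD"):
    """Return the hash of the last commit on ref made at or before cutoff."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *rev_list_args(cutoff, ref)],
        check=True,
        capture_output=True,
        text=True,
    )
    commit = result.stdout.strip()
    if not commit:
        # git prints nothing when no commit is old enough.
        raise LookupError(f"no commit on {ref} at or before {cutoff}")
    return commit
```

For example, `commit_as_of("zfs-dkms", "2023-09-16 23:59:59 +0000")` would return the hash to check out in a cloned AUR repository before building, so that a build done months later uses the same package sources.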
Thoughts?