Alternative Title: An implementation critique of the Filesystem Hierarchy Standard
One of the biggest design goals of this project is to eliminate $PATH hell. Linux packages typically make many assumptions about the layout of a filesystem, and filesystems are fairly concrete for the most part: what you see in your directory hierarchy is the way it is actually stored. On its face, this makes a lot of sense. Packages frequently need access to external or shared resources, and they need to know where to find them. This is what the Filesystem Hierarchy Standard (hereafter "FHS") aims to standardize.
However, there are many cross-cutting concerns when it comes to deciding where files should go. We ask a lot from our filesystem. It is supposed to:
- Carve out per-user R/W spaces (under /home/...) separate from system-level spaces (everywhere else)
- Provide an installation point for package manager installed packages (under a whole slew of FS URIs)
- Make it possible for a privileged user to install packages on their own
- Make it possible to decide what files go on which physical media
Suffice it to say, there are many ways that "actual use situations" fail to fit in the neat box that FHS describes. What's more, the FHS freely admits that it simply describes the prevailing convention. Many packages freely ignore the FHS, and instead recommend that users extend their $PATH.
If you ask me, this whole situation is a big mess, and it would provide a lot of benefit to treat resource URIs and filesystem management on their own terms.
The FHS under a microscope
One fortunate thing about the FHS is that, at its core, there are approximately three root directories for all well-known paths:
- /usr, which FHS regards as shareable and static. (Honorable mention is /opt, which I am ignoring because there are popular distros that don't have that directory at all.)
- /etc, which FHS regards as unshareable and static. (The FHS also includes /boot in this category, which puzzles me, because the hierarchy within /boot is not well standardized.)
- Certain subdirectories of /var that FHS regards as variable and shareable.
Popular distros have many more roots than just these three. I am not considering them for the following reasons:
- … /tmp are all pre-existing examples of abstract filesystems. The contents of these directories are strictly laid out by filesystem drivers (/sys), are temporary ramdisks (/tmp), or walk the line between the two (…).
- … /mnt don't contain any well-known files; presumably they are specified for sysadmin convenience.
- /boot is a standardized mount point for the EFI system partition (or not, if the system is old enough to still be BIOS booting). No application on a booted system cares about these files besides the package manager.
- /home is a conventional place to store user directories, but few applications mandate this.
- … /sbin are often symlinked into the /usr hierarchy these days (thank goodness).
- /root is a conventional place for the root home directory, but I am unaware of any application that mandates this. FHS itself lists it only as optional.
Decoupling the FHS from the physical filesystem
It may not look like one, but something like /bin/bash (which so often appears in a shebang) is just as much a URI as https://gitlab.com/. For the purposes of this whitepaper, I will refer to such paths enshrined in the FHS, POSIX, and other standards as "well-known paths". This also includes standardized search paths (especially the infamous …).
Note that historically, the path in a URL corresponded to real paths on server filesystems. Design patterns like REST and URL rewriting have existed on the web for a long time now, with various goals, but they all have the side-effect of decoupling the physical storage of a resource from how you access it.
I argue there is great strength in treating the FHS as a path-URI specification, not as a mandate of how you store files.
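To make the analogy with URL rewriting concrete, here is a minimal sketch of a resolver that maps well-known paths to wherever the files are physically stored. The rewrite table, the /packages/... locations, and the function name resolve are all hypothetical, chosen only for illustration; they are not part of FHS or of this project.

```python
from pathlib import PurePosixPath

# Hypothetical rewrite table: well-known path prefix -> physical storage prefix.
# The /packages/... locations are invented for this example.
REWRITES = {
    "/bin": "/packages/bash-5.2/bin",
    "/usr/share/doc": "/packages/docs",
}


def resolve(well_known: str) -> str:
    """Return the physical path for a well-known path (longest prefix wins)."""
    path = PurePosixPath(well_known)
    # Try the path itself first, then each ancestor from longest to shortest.
    for candidate in [path, *path.parents]:
        target = REWRITES.get(str(candidate))
        if target is not None:
            return str(PurePosixPath(target) / path.relative_to(candidate))
    # No rule matched: the file is stored exactly where it appears.
    return well_known
```

A caller keeps asking for /bin/bash and never needs to know that, under these assumed rules, the bytes live under /packages/bash-5.2/bin.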
/etc is the primary place for system-level configuration. Contrary to the description in FHS, there are numerous examples of files that are shareable (like /etc/services), and what's worse, many examples of variable files (like /etc/ld.so.cache) that should really be stored in their rightful physical locations and symlinked in.
/var is a bit of a sticky wicket because applications expect to be able to write to it arbitrarily. Leaving it as-is is mostly okay, because the vast majority of applications segregate their data in eponymous directories. It can still cause problems if multiple versions of the same package need to run concurrently, since they would share one such directory.
/usr is ripe for abstraction because of its static nature. The only application that ever writes into it is the package manager. A big downside of the /usr hierarchy is that it commingles files from various packages, which makes it incredibly painful to install or remove packages your package manager doesn't support, or to do away with the package manager entirely.
OccamPath is not a package manager
This project takes a radical approach: it allows a user to place packages (unmodified) in a specific directory, and reacts by updating the … /etc trees to match. OccamPath will provide facilities for users to choose which package they want to use in cases of conflict (for example, if two versions of the same package are installed). The ideal way to achieve this is to provide a new usrfs filesystem driver for /usr (à la /proc) and a corresponding …
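The merge-with-conflicts idea can be illustrated in user space, without any filesystem driver. The sketch below is not OccamPath's implementation; the function name build_view, the package_roots mapping, and the preferred mapping are assumptions made up for this example. It scans unmodified package directories, records which package provides each relative path, and reports paths that more than one package provides unless a preference has been stated.

```python
import os
from pathlib import Path


def build_view(package_roots, preferred=None):
    """Map each relative file path to the package that should provide it.

    package_roots: dict of package name -> directory holding the unmodified package.
    preferred: dict of relative path -> package name, used to settle conflicts.
    Returns (view, conflicts); conflicts lists paths that still need a decision.
    """
    preferred = preferred or {}
    providers = {}
    for name, root in package_roots.items():
        root = Path(root)
        for dirpath, _dirs, files in os.walk(root):
            for filename in files:
                rel = (Path(dirpath) / filename).relative_to(root).as_posix()
                providers.setdefault(rel, []).append(name)

    view, conflicts = {}, []
    for rel, names in sorted(providers.items()):
        if len(names) == 1:
            view[rel] = names[0]            # only one provider, no decision needed
        elif preferred.get(rel) in names:
            view[rel] = preferred[rel]      # the user has already chosen
        else:
            conflicts.append(rel)           # ambiguous: ask the user
    return view, conflicts
```

With two versions of one package installed side by side, every path they share lands in conflicts until a preference is recorded, which is the decision the text says the user should be able to make.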
Can't you just symlink?
POSIX supports symlinks and hardlinks, which can be used to provide an indirect hierarchy, but this has limited usefulness because it moves the problem to managing the symlinks. For example, symlinks don't provide any facility to merge directory trees; you must instead symlink on a per-file basis, and somehow keep track of changes made to the physical tree.
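The bookkeeping burden is easy to see in a small sketch of the per-file approach. The helper names link_tree and stale_links are hypothetical, and the code assumes a POSIX system where creating symlinks needs no special privilege. Every file gets its own link, real directories have to be created so that several packages can share them, and a separate pass is needed to notice when the physical tree changes underneath.

```python
import os
from pathlib import Path


def link_tree(package_root, target_root):
    """Symlink every file under package_root into target_root, one link per file."""
    package_root = Path(package_root).resolve()
    target_root = Path(target_root)
    created = []
    for dirpath, _dirs, files in os.walk(package_root):
        rel_dir = Path(dirpath).relative_to(package_root)
        # Directories must be real (not links) so other packages can merge into them.
        (target_root / rel_dir).mkdir(parents=True, exist_ok=True)
        for filename in files:
            link = target_root / rel_dir / filename
            if link.is_symlink() or link.exists():
                raise FileExistsError(f"{link} is already provided by another package")
            link.symlink_to(Path(dirpath) / filename)
            created.append(link)
    return created


def stale_links(target_root):
    """Return links whose physical file has since been removed."""
    stale = []
    for dirpath, _dirs, files in os.walk(target_root):
        for filename in files:
            candidate = Path(dirpath) / filename
            # exists() follows the link, so a dangling link reports False.
            if candidate.is_symlink() and not candidate.exists():
                stale.append(candidate)
    return stale
```

Nothing here reacts on its own when a package gains or loses a file; something has to rerun both functions, which is the "managing the symlinks" problem described above.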