device mapper udev rules rework (v3)
here is a rewrite of the device mapper udev rules, as follow up to the previous thread "About DM_UDEV_DISABLE_OTHER_RULES_FLAG and DM_NOSCAN" 1, and my previous mailing list submissions (2, 3). While I have taken care to minimize the impact on other udev rule sets, some changes to both the multipath-tools and systemd rules will be necessary. Patches for these are in preparation.
In short, the effect of this patch set is as follows:
- As before, DM_UDEV_DISABLE_OTHER_RULES_FLAG is used as general flag to indicate to follow-up rules that they should avoid probing the device, and should import device properties from the udev db if necessary. The flag does not imply that existing device stacks must be destructed. -
- DM_UDEV_DISABLE_OTHER_RULES_FLAG represents the combination of multiple conditions that may make it impossible to probe the device just now. As such, it isn't restored from the udev db, but determined anew for every uevent. For saving the DM_COOKIE variable of the same name in the udev db, a new variable DM_COOKIE_DISABLE_OTHER_RULES_FLAG is introduced.
- The properties DM_SUSPENDED and DM_NOSCAN, which have always been considered internal to device-mapper and its subsystems, are renamed to .DM_SUSPENDED and .DM_NOSCAN, respectively. Thus they aren't saved in the udev db any more, and it becomes more obvious that they are not supposed to be used by non-dm udev rules.
- ID_FS_TYPE is imported from the db for "spurious" events.
- Properties are imported for DISK_RO="?" events, too.
Changes v2->v3:
- Use
$env
instead of%E
- Remove mention of
.DM_SUSPENDED
from12-dm-permissions.rules
Changes from v1 ("RFC") patch set:
- Instead of removing the DISK_RO clause entirely, just moved it further down (after the property import), and apply it for DISK_RO==0, too.
- Updated comments about device properties at the top of 10-dm.rules.
Merge request reports
Activity
added 7 commits
- ba39a463 - 13-dm-disk.rules: import ID_FS_TYPE
- 30f9c35d - 10-dm.rules: test DISK_RO after importing properties
- d4804318 - 10-dm-rules: don't restore DM_UDEV_DISABLE_OTHER_RULES_FLAG from db
- 9862ccab - 11-dm-lvm.rules: don't restore DM_UDEV_DISABLE_OTHER_RULES_FLAG from db
- 59d3a49e - dm udev rules: don't export and save DM_SUSPENDED
- 8a738de7 - dm udev rules: don't export and save DM_NOSCAN
- 7012573e - 10-dm.rules: bump DM_UDEV_RULES_VSN to 3
Toggle commit listThank you @mwilck for this work!
Do we also have the patch for systemd's 99-systemd.rules prepared? It would be better if that goes first, so we are sure we can bump the dependency on systemd in dm/lvm2.
requested review from @prajnoha
systemd PR: https://github.com/systemd/systemd/pull/31661
mentioned in merge request !5 (merged)
This has been approved since 2 weeks, but not merged ... are you waiting for something specific?
Note that this MR is independent of https://github.com/systemd/systsemd/pull/31661.Note: this MR removesDM_SUSPENDED
, which systemd depends on as long as https://github.com/systemd/systsemd/pull/31661 is not merged.Edited by Martin WilckI've just waited for systemd, if there was anything else to settle down there (any more comments from them or objections) and, ideally, expected that one to go first so we could bump dependency on systemd in lvm2 when packaging it.
This way, we could at least check that we have the recent systemd that does not depend on
DM_SUSPENDED
anymore if the new lvm2 is installed. But it's true we would still miss the check the other way round (old lvm2 + new systemd) - that would require a dependency hook on systemd side of packaging, which I don't think they would be adding.It looks the systemd has no other objections and that patch is slated for v256, so yes, we can merge also the lvm2 part now...
(As discussed further on that systemd PR and in our further communication through mail, we're planning to move the DM rule from
99-systemd.rules
to our DM/LVM repo eventually.)added 39 commits
-
7012573e...e5dc11e2 - 32 commits from branch
lvmteam:main
- 6f44e109 - 13-dm-disk.rules: import ID_FS_TYPE
- 0fe2d778 - 10-dm.rules: test DISK_RO after importing properties
- f98d020e - 10-dm-rules: don't restore DM_UDEV_DISABLE_OTHER_RULES_FLAG from db
- 2b2f11a7 - 11-dm-lvm.rules: don't restore DM_UDEV_DISABLE_OTHER_RULES_FLAG from db
- 21ca92c4 - dm udev rules: don't export and save DM_SUSPENDED
- a1967529 - dm udev rules: don't export and save DM_NOSCAN
- 038f9254 - 10-dm.rules: bump DM_UDEV_RULES_VSN to 3
Toggle commit list-
7012573e...e5dc11e2 - 32 commits from branch
WHATS_NEW update: 71bac6f4
Edited by Peter RajnohaIs systemd#31661 the only systemd-side PR/commit that is necessary for these udev changes, and the reason behind the release notes recommending systemd v256? I note 31661 applies cleanly to systemd v254.12, is this sufficient to fulfill the systemd-side requirements for LVM2 v2.03.24?
Is there any downside to running systemd with that patch using an older version of LVM2? If not then it might be a good candidate for back-porting to the systemd-stable branches.
The v256 is recommended due to the rules in 99-systemd.rules that check for DM_SUSPENDED variable which is not available with newer version of dm/LVM2. In the new LVM2 version, the DM_SUSPENDED is made internal and renamed to .DM_SUSPENDED (the dot at the beginning). A check for dm device unavailability is now covered by single DM_UDEV_DISABLE_OTHER_RULES_FLAG. Also, the SYSTEMD_READY is handled differently in case a properly activated DM device is temporarily not available for use at the moment - in that case the previous value of SYSTEMD_READY is imported. The SYSTEMD_READY is important for related systemd units to activate at right time.
If you used new dm/LVM2 with old systemd, then systemd would not correctly detect suspended state and if it performs any IO on the device, it can stall up until the DM device is resumed again. However, the suspended states of DM devices are usually kept to a minimum time as much as possible.
If you used old dm/LVM2 with new systemd, then DM_UDEV_DISABLE_OTHER_RULES_FLAG may not be set correctly for new systemd rule. If I remember correctly, that could happen in case we had suspended case mixed together with synthetic (generated) udev events which caused issues with dm-multipath where the "ready" state was incorrectly set in that case (and which was one of the reasons for this patchset).
As such, we recommend using the new versions for both libdm/LVM2 and systemd that have both new and synced udev rules.
And yes, the systemd#31661 is the only systemd-side change and it is solely related to handling dm-based devices so it is pretty isolated. Backporting that patch to older systemd versions should then just work.
Thanks, Peter. A few small additions from my side:
The recommendation to use the rules from systemd v256 with the rules from this PR is of course correct. But even if only either rule set is updated, the likelihood for actual breakage is rather low. If anything goes wrong, it will most probably be in the "udev coldplug" phase during boot.
If you used new dm/LVM2 with old systemd, then systemd would not correctly detect suspended state and if it performs any IO on the device, it can stall up until the DM device is resumed again
The systemd v255 rule that evaluates
DM_SUSPENDED
is a late rule, which is evaluated after any commands that do IO (like blkid). Thus it's not likely that udev workers get stalled. systemd itself doesn't do IO on the device either, and thus won't stall. In theory, it can happen that mount or swapon processes of dependent units may hang. But because the new rules setENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}=="1"
on these devices, blkid etc. won't be run by udev. Thus from systemd's point of view, either the device has been activated before (meaning that dependentmount
orswap
units have been started, too), or the blkid properties likeID_FS_TYPE
andID_FS_UUID_ENC
aren't set, and thus nomount
orswap
units will be started for the device.With the rule set from this PR, for suspended devices, systemd will see
ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}=="1"
withoutDM_SUSPENDED
being set. If this happens in an "add" event (coldplug during boot after switching root),SYSTEMD_READY=0
will be set. Thus if this patch set is used together with systemd v255 or older, it can happen that a device which is suspended while a coldplug event is processed is unmounted1.The only other "foreign" rule set that looks at
DM_SUSPENDED
is the ruleset for multipath and kpartx. Thus if you use either tool, you'll want to apply the related udev rule changes from multipath-tools 0.9.9 (commit 8647bfe..f8d7356).If you used old dm/LVM2 with new systemd, then DM_UDEV_DISABLE_OTHER_RULES_FLAG may not be set correctly for new systemd rule.
In the cases I can think of, if the new dm rules set
DM_UDEV_DISABLE_OTHER_RULES_FLAG
, the old rules would have set it, too (with the effect that systemd v256 tries to importSYSTEMD_READY
from the db). Before this PR, the dm rules did setENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="1"
wrongly in some cases (e.g. after restoring a state from the data base which should rather be evaluated anew on each event). This causes systemd v256 to importSYSTEMD_READY
from the db unnecessarily. If no other event for the device follows (unlikely), this may lead to a device not being activated.
Edited by Martin Wilck