rcu: Tighten rcu_advance_cbs_nowake() checks

Waiman Long requested to merge llong1/centos-stream-9:bz2026991_rcu into main

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2026991
MR: !513 (merged)
Tested: The WARNING and the subsequent RCU stall reproduced on my test VM in a matter of seconds. With this patch, the race window is closed and the system remains stable.

Upstream Status: rcu/next https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/kernel/rcu/tree.c?h=rcu/next&id=21e034adb9df3581fda926a29b3a11bda38ba93b
Related discussion: https://lore.kernel.org/all/20211118225923.GX641268@paulmck-ThinkPad-P17-Gen-1/

commit 21e034adb9df3581fda926a29b3a11bda38ba93b
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Fri Sep 17 15:04:48 2021 -0700

    rcu: Tighten rcu_advance_cbs_nowake() checks

    Currently, rcu_advance_cbs_nowake() checks that a grace period is in
    progress, however, that grace period could end just after the check.
    This commit rechecks that a grace period is still in progress while holding
    the rcu_node structure's lock.  The grace period cannot end while the current
    CPU's rcu_node structure's ->lock is held, thus avoiding false positives from
    the WARN_ON_ONCE().

    As Daniel Vacek noted, it is not necessary for the rcu_node structure
    to have a CPU that has not yet passed through its quiescent state.

    Tested-By: Guillaume Morin <guillaume@morinfr.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

(cherry picked from commit 21e034adb9df3581fda926a29b3a11bda38ba93b)
Signed-off-by: Daniel Vacek <neelx@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
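
For readers unfamiliar with the pattern, the check, trylock, recheck sequence described in the commit message can be sketched in userspace C. This is not the kernel code: `node_model`, `node_trylock()`, `node_unlock()` and `advance_cbs_nowake_model()` are hypothetical stand-ins, and the sketch assumes, as the commit message states for the real rcu_node ->lock, that the "grace period in progress" state cannot be cleared while the lock is held.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an rcu_node: a lock plus grace-period state. */
struct node_model {
    atomic_flag lock;            /* models the node's ->lock */
    atomic_bool gp_in_progress;  /* assumed to be cleared only under the lock */
    int advance_calls;           /* counts how often callbacks were advanced */
};

static void node_model_init(struct node_model *np, bool gp_in_progress)
{
    atomic_flag_clear(&np->lock);
    atomic_init(&np->gp_in_progress, gp_in_progress);
    np->advance_calls = 0;
}

/* Returns true if the lock was acquired; never blocks. */
static bool node_trylock(struct node_model *np)
{
    return !atomic_flag_test_and_set_explicit(&np->lock, memory_order_acquire);
}

static void node_unlock(struct node_model *np)
{
    atomic_flag_clear_explicit(&np->lock, memory_order_release);
}

/* Returns true if callbacks were advanced, false if the call backed off. */
static bool advance_cbs_nowake_model(struct node_model *np)
{
    bool advanced = false;

    /* Lockless pre-check: a cheap filter only, the state may change right after. */
    if (!atomic_load(&np->gp_in_progress) || !node_trylock(np))
        return false;

    /* Recheck under the lock: the grace period cannot end while we hold it. */
    if (atomic_load(&np->gp_in_progress)) {
        np->advance_calls++;     /* stand-in for advancing the callbacks */
        advanced = true;
    }

    node_unlock(np);
    return advanced;
}
```

The first check only avoids taking the lock when there is clearly nothing to do. The second check, made under the lock, is the one whose result can be relied on when advancing callbacks.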
Edited by Waiman Long
