sched: Fix balance_push() vs __sched_setscheduler()

Phil Auld requested to merge prauld/centos-stream-9:bz2100215 into main

Bugzilla: https://bugzilla.redhat.com/2100215
Tested: Ran cpu hot[un]plug for 24+ hours while stress tests were running.

commit 04193d590b390ec7a0592630f46d559ec6564ba1
Author: Peter Zijlstra <peterz@infradead.org>
Date: Tue Jun 7 22:41:55 2022 +0200

sched: Fix balance_push() vs __sched_setscheduler()  

The purpose of balance_push() is to act as a filter on task selection  
in the case of CPU hotplug, specifically when taking the CPU out.  

It does this by (ab)using the balance callback infrastructure, with  
the express purpose of keeping all the unlikely/odd cases in a single  
place.  

In order to serve its purpose, the balance_push_callback needs to be  
(exclusively) on the callback list at all times (noting that the  
callback always places itself back on the list the moment it runs,  
also noting that when the CPU goes down, regular balancing concerns  
are moot, so ignoring them is fine).  
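The "places itself back on the list" behaviour can be illustrated with a small userspace model. This is only a sketch: the struct layouts, the `filtered` counter and `run_balance_callbacks()` are simplified stand-ins invented for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct rq;

/* Simplified stand-in for the kernel's callback list node. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct rq *rq);
};

/* Simplified stand-in for a runqueue; 'filtered' is a hypothetical counter. */
struct rq {
	struct callback_head *balance_callback;
	int filtered;
};

static void balance_push(struct rq *rq);

static struct callback_head balance_push_callback = {
	.next = NULL,
	.func = balance_push,
};

/* The filter re-installs itself first, so it stays on the list after running. */
static void balance_push(struct rq *rq)
{
	rq->balance_callback = &balance_push_callback;
	rq->filtered++;
}

/* Detach the list, then run each callback (hypothetical helper). */
static void run_balance_callbacks(struct rq *rq)
{
	struct callback_head *head = rq->balance_callback;

	rq->balance_callback = NULL;
	while (head) {
		struct callback_head *next = head->next;

		head->next = NULL;
		head->func(rq);
		head = next;
	}
}
```

In this model, every call to `run_balance_callbacks()` on a runqueue whose list holds `balance_push_callback` leaves that callback installed again afterwards, so the next pass is filtered as well.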

And herein lies the problem: __sched_setscheduler()'s use of  
splice_balance_callbacks() takes the callbacks off the list across a  
lock-break, making it possible for an interleaving __schedule() to  
see an empty list and not get filtered.  
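The window can be modelled deterministically in userspace. The sketch below is an assumption-laden illustration, not the actual patch: `splice_naive()`, `splice_keep_push()`, `schedule_is_filtered()` and `filtered_during_lock_break()` are hypothetical names, and leaving the push callback in place while splicing is shown only as one plausible remedy, since this message does not spell out the change itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct callback_head { struct callback_head *next; };
struct rq { struct callback_head *balance_callback; };

static struct callback_head balance_push_callback;

/* Problematic pattern: take everything off the list, leaving it empty. */
static struct callback_head *splice_naive(struct rq *rq)
{
	struct callback_head *head = rq->balance_callback;

	rq->balance_callback = NULL;
	return head;
}

/* Assumed remedy: never take the push callback off the list. */
static struct callback_head *splice_keep_push(struct rq *rq)
{
	struct callback_head *head = rq->balance_callback;

	if (head == &balance_push_callback)
		return NULL;	/* nothing spliced, filter stays installed */

	rq->balance_callback = NULL;
	return head;
}

/* What an interleaving __schedule() would observe in this model. */
static bool schedule_is_filtered(const struct rq *rq)
{
	return rq->balance_callback == &balance_push_callback;
}

/* Splice "under the lock", then look at the list during the lock-break. */
static bool filtered_during_lock_break(struct callback_head *(*splice)(struct rq *))
{
	struct rq rq = { .balance_callback = &balance_push_callback };
	struct callback_head *head = splice(&rq);

	(void)head;	/* would be run later, after the lock is retaken */
	return schedule_is_filtered(&rq);
}
```

With `splice_naive` the observer finds an empty list during the lock-break and is not filtered; with `splice_keep_push` the filter remains visible throughout.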

Fixes: ae7927023243 ("sched: Optimize finish_lock_switch()")  
Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>  
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>  
Tested-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>  
Link: https://lkml.kernel.org/r/20220519134706.GH2578@worktop.programming.kicks-ass.net  

Signed-off-by: Phil Auld <pauld@redhat.com>

Edited by Phil Auld