Revert "sched/fair: Make sure to try to detach at least one movable task"

Phil Auld requested to merge prauld/centos-stream-9:rhel45194 into main

JIRA: https://issues.redhat.com/browse/RHEL-45194
Upstream status: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/urgent

commit 2feab2492deb2f14f9675dd6388e9e2bf669c27a
Author: Josh Don <joshdon@google.com>
Date: Thu Jun 20 14:44:50 2024 -0700

Revert "sched/fair: Make sure to try to detach at least one movable task"  

This reverts commit b0defa7ae03ecf91b8bfd10ede430cff12fcbd06.  

b0defa7ae03ec changed the load balancing logic to ignore env.max_loop if  
all tasks examined to that point were pinned. The goal of the patch was  
to make it more likely to be able to detach a task buried in a long list  
of pinned tasks. However, this has the unfortunate side effect of  
creating an O(n) iteration in detach_tasks(), as we now must fully  
iterate every task on a cpu if all or most are pinned. Since this load  
balance code is done with rq lock held, and often in softirq context, it  
is very easy to trigger hard lockups. We observed such hard lockups with  
a user who affined O(10k) threads to a single cpu.  
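
To make the cost concrete, here is a minimal toy model of the scan, not the kernel's `detach_tasks()`. The names `toy_task`, `scan_tasks`, and `ignore_limit_while_pinned` are hypothetical and exist only for this illustration. The sketch counts how many tasks are examined when the loop limit is bypassed while every task seen so far is pinned, versus when the limit is always enforced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued task; not the kernel's task_struct. */
struct toy_task {
    bool pinned; /* true if affinity forbids moving it to the destination cpu */
};

/*
 * Return how many tasks a detach-style scan examines before it stops.
 *
 * max_loop plays the role of the iteration limit described in the commit
 * message. ignore_limit_while_pinned models the behavior being reverted:
 * the limit is not enforced while every task seen so far was pinned.
 * For simplicity the toy scan stops at the first movable task.
 */
size_t scan_tasks(const struct toy_task *tasks, size_t n,
                  size_t max_loop, bool ignore_limit_while_pinned)
{
    size_t examined = 0;
    bool all_pinned = true;

    assert(tasks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Enforce the limit unless the bypass applies. */
        if (examined >= max_loop &&
            !(ignore_limit_while_pinned && all_pinned))
            break;

        examined++;

        if (!tasks[i].pinned) {
            all_pinned = false;
            break; /* found a movable task */
        }
    }
    return examined;
}
```

With 10,000 pinned tasks and an illustrative limit of 32, `scan_tasks(tasks, 10000, 32, true)` examines all 10,000 entries, while `scan_tasks(tasks, 10000, 32, false)` stops after 32. In the real code that walk happens with the rq lock held, which is why the unbounded case is dangerous.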

When I discussed this with Vincent he initially suggested that we keep  
the limit on the number of tasks to detach, but increase the number of  
tasks we can search. However, after some back and forth on the mailing  
list, he recommended we instead revert the original patch, as it seems  
likely no one was actually getting hit by the original issue.  

Fixes: b0defa7ae03e ("sched/fair: Make sure to try to detach at least one movable task")  
Signed-off-by: Josh Don <joshdon@google.com>  
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>  
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>  
Link: https://lore.kernel.org/r/20240620214450.316280-1-joshdon@google.com  

Signed-off-by: Phil Auld <pauld@redhat.com>