Offering: HULK
hulk inclusion
category: bugfix
bugzilla: IDC4VN

--------------------------------

CPU0 becomes overloaded when hosting a CPU-bound RT task, a non-CPU-bound
RT task, and a CFS task stuck in kernel space. When other CPUs switch from
RT to non-RT tasks, RT load balancing (LB) is triggered; with
HAVE_RT_PUSH_IPI enabled, they send IPIs to CPU0 to drive the execution of
rto_push_irq_work_func().

During push_rt_task() on CPU0, if next_task->prio < rq->donor->prio,
resched_curr() sets NEED_RESCHED, and after the push operation completes,
CPU0 calls rto_next_cpu(). Since only CPU0 is overloaded in this scenario,
rto_next_cpu() should ideally return -1 (no further IPI needed). However,
multiple CPUs invoking tell_cpu_to_push() during LB increment
rd->rto_loop_next. Even when rd->rto_cpu is set to -1, the mismatch
between rd->rto_loop and rd->rto_loop_next forces rto_next_cpu() to
restart its search from -1. With CPU0 still overloaded (satisfying
rt_nr_migratory && rt_nr_total > 1), it gets reselected, so CPU0 queues
irq_work to itself and sends self-IPIs repeatedly.

As long as CPU0 stays overloaded and other CPUs run pull_rt_task(), CPU0
is stuck in an infinite self-IPI loop, and the continuous self-interrupts
eventually trigger a CPU hardlockup.

The triggering scenario is as follows:

cpu0                            cpu1                cpu2
                                pull_rt_task
                                tell_cpu_to_push
<------------irq_work_queue_on
rto_push_irq_work_func
  push_rt_task
    resched_curr(rq)                                pull_rt_task
  rto_next_cpu                                      tell_cpu_to_push
<-------------------------- atomic_inc(rto_loop_next)
    rd->rto_loop != next
    rto_next_cpu
  irq_work_queue_on
rto_push_irq_work_func

Fix the redundant self-IPIs by filtering out the initiating CPU in
rto_next_cpu(). This change has been verified to eliminate the spurious
self-IPIs and prevent the CPU hardlockup.

Fixes: 4bdced5c9a29 ("sched/rt: Simplify the IPI based RT balancing logic")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Chen Jinghuang <chenjinghuang2@huawei.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/sched/rt.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7f512203d9e7..4f37f97cb908 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1952,6 +1952,7 @@ static void push_rt_tasks(struct rq *rq)
  */
 static int rto_next_cpu(struct root_domain *rd)
 {
+	int this_cpu = smp_processor_id();
 	int next;
 	int cpu;
 
@@ -1970,10 +1971,13 @@ static int rto_next_cpu(struct root_domain *rd)
 	 */
 	for (;;) {
 
-		/* When rto_cpu is -1 this acts like cpumask_first() */
-		cpu = cpumask_next(rd->rto_cpu, rd->rto_mask);
+		do {
+			/* When rto_cpu is -1 this acts like cpumask_first() */
+			cpu = cpumask_next(rd->rto_cpu, rd->rto_mask);
 
-		rd->rto_cpu = cpu;
+			rd->rto_cpu = cpu;
+			/* Do not send IPI to self */
+		} while (cpu == this_cpu);
 
 		if (cpu < nr_cpu_ids)
 			return cpu;
--
2.34.1
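
For illustration only, here is a compact user-space model of the
rto_loop/rto_loop_next retry described in the changelog and of the
self-CPU filter this patch adds to rto_next_cpu(). It is not kernel code:
fake_root_domain, next_overloaded(), THIS_CPU and NR_CPUS are invented
stand-ins that merely mirror the root_domain fields by name.

#include <stdio.h>

#define NR_CPUS   4
#define THIS_CPU  0   /* the overloaded CPU running rto_push_irq_work_func() */

struct fake_root_domain {
	int rto_cpu;              /* last CPU chosen, -1 = start of a pass    */
	int rto_loop;             /* loop generation already processed        */
	int rto_loop_next;        /* bumped by every tell_cpu_to_push()       */
	int rto_mask[NR_CPUS];    /* 1 = CPU is RT-overloaded                 */
};

/* cpumask_next() stand-in: first overloaded CPU after @prev, or NR_CPUS. */
static int next_overloaded(int prev, const int *mask)
{
	int cpu;

	for (cpu = prev + 1; cpu < NR_CPUS; cpu++)
		if (mask[cpu])
			return cpu;
	return NR_CPUS;
}

/* Simplified rto_next_cpu() with the self-CPU filter applied. */
static int model_rto_next_cpu(struct fake_root_domain *rd)
{
	for (;;) {
		int cpu;

		do {
			cpu = next_overloaded(rd->rto_cpu, rd->rto_mask);
			rd->rto_cpu = cpu;
			/* The fix: never hand back the CPU running this loop. */
		} while (cpu == THIS_CPU);

		if (cpu < NR_CPUS)
			return cpu;

		rd->rto_cpu = -1;

		/*
		 * Concurrent tell_cpu_to_push() calls bump rto_loop_next and
		 * force another pass. Without the do/while filter above, the
		 * still-overloaded THIS_CPU would be reselected here, giving
		 * the endless self-IPI loop from the changelog.
		 */
		if (rd->rto_loop == rd->rto_loop_next)
			break;
		rd->rto_loop = rd->rto_loop_next;
	}
	return -1;
}

int main(void)
{
	struct fake_root_domain rd = {
		.rto_cpu       = THIS_CPU,            /* CPU0 just finished a push   */
		.rto_loop      = 0,
		.rto_loop_next = 1,                   /* cpu1/cpu2 bumped the counter */
		.rto_mask      = { [THIS_CPU] = 1 },  /* only CPU0 is overloaded      */
	};

	/* Prints -1: no further IPI, instead of CPU0 IPI-ing itself forever. */
	printf("next IPI target: %d\n", model_rto_next_cpu(&rd));
	return 0;
}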