From: Frederic Weisbecker frederic@kernel.org
mainline inclusion from mainline-v5.9-rc1 commit 3c8920e2dbd1a55f72dc14d656df9d0097cf5c72 category: bugfix bugzilla: 45956 CVE: NA
--------------------------------
Setting a tick dependency on any task, including the case where a task sets that dependency on itself, triggers an IPI to all CPUs. That is of course suboptimal but it had previously not been an issue because it was only used by POSIX CPU timers on nohz_full, which apparently never occurs in latency-sensitive workloads in production. (Or users of such systems are suffering in silence on the one hand or venting their ire on the wrong people on the other.)
But RCU now sets a task tick dependency on the current task in order to fix stall issues that can occur during RCU callback processing. Thus, RCU callback processing triggers frequent system-wide IPIs from nohz_full CPUs. This is quite counter-productive, after all, avoiding IPIs is what nohz_full is supposed to be all about.
This commit therefore optimizes tasks' self-setting of a task tick dependency by using tick_nohz_full_kick() to avoid the system-wide IPI. Instead, only the execution of the one task is disturbed, which is acceptable given that this disturbance is well down into the noise compared to the degree to which the RCU callback processing itself disturbs execution.
Fixes: 6a949b7af82d (rcu: Force on tick when invoking lots of callbacks) Reported-by: Matt Fleming matt@codeblueprint.co.uk Signed-off-by: Frederic Weisbecker frederic@kernel.org Cc: stable@kernel.org Cc: Paul E. McKenney paulmck@kernel.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Peter Zijlstra peterz@infradead.org Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Liu Chao liuchao173@huawei.com Reviewed-by: Jian Cheng cj.chengjian@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- kernel/time/tick-sched.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 48403fb653c2..00d2418412f1 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -334,16 +334,24 @@ void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) }
/* - * Set a per-task tick dependency. Posix CPU timers need this in order to elapse - * per task timers. + * Set a per-task tick dependency. RCU need this. Also posix CPU timers + * in order to elapse per task timers. */ void tick_nohz_dep_set_task(struct task_struct *tsk, enum tick_dep_bits bit) { - /* - * We could optimize this with just kicking the target running the task - * if that noise matters for nohz full users. - */ - tick_nohz_dep_set_all(&tsk->tick_dep_mask, bit); + if (!atomic_fetch_or(BIT(bit), &tsk->tick_dep_mask)) { + if (tsk == current) { + preempt_disable(); + tick_nohz_full_kick(); + preempt_enable(); + } else { + /* + * Some future tick_nohz_full_kick_task() + * should optimize this. + */ + tick_nohz_full_kick_all(); + } + } }
void tick_nohz_dep_clear_task(struct task_struct *tsk, enum tick_dep_bits bit)