From: Cheng Jian <cj.chengjian(a)huawei.com>
hulk inclusion
category: bugfix
Bugzilla: 47618
CVE: NA
----------------------------------------
Commit e221d028bb ("sched,rt: fix isolated CPUs leaving
root_task_group indefinitely throttled") only fixes the
problem for root_task_group. It does not fix it for
ordinary task_groups.
In scenarios where tasks bound to isolated CPUs are
attached to a task_group, the same problem occurs.
Isolated CPUs and non-isolated CPUs are not in the same
root_domain, and the hrtimer only checks the cpumask of
this_rq's root_domain. So when the handler of the
RT_BANDWIDTH hrtimer runs on an isolated CPU, it leaves
the non-isolated CPUs indefinitely throttled, because the
bandwidth period hrtimer cannot resume them, and vice versa.
Let the bandwidth timer check the rt_rq of every CPU in
cpu_online_mask.
Signed-off-by: Cheng Jian <cj.chengjian(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: zhangyi (F) <yi.zhang(a)huawei.com>
Signed-off-by: Lu Jialin <lujialin4(a)huawei.com>
Reviewed-by: xiu jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Signed-off-by: Zhao Wenhui <zhaowenhui8(a)huawei.com>
---
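Note for reviewers: the failure mode described in the changelog can be
illustrated with a small userspace model. This is only a sketch, not
kernel code. The names toy_rt_rq and toy_period_timer, the 4-CPU layout
and the bitmask representation of the span are all assumptions made for
illustration. The period timer is modelled as a loop that unthrottles
only the CPUs in the span it was given.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-CPU rt_rq throttle state. */
struct toy_rt_rq {
	bool throttled;
};

/*
 * Toy version of the period timer loop: unthrottle every CPU whose
 * bit is set in span, then return how many CPUs remain throttled.
 */
static int toy_period_timer(struct toy_rt_rq rq[NR_CPUS], uint32_t span)
{
	int still_throttled = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (span & (1u << cpu))
			rq[cpu].throttled = false;
		if (rq[cpu].throttled)
			still_throttled++;
	}
	return still_throttled;
}
```

With all four CPUs throttled, passing a span that covers only an
isolated CPU (for example bit 3) leaves the other three throttled,
while passing a span that covers every online CPU clears all of them.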
kernel/sched/rt.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index c21b2da3735a..3285125a512f 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -906,16 +906,14 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
span = sched_rt_period_mask();
#ifdef CONFIG_RT_GROUP_SCHED
/*
- * FIXME: isolated CPUs should really leave the root task group,
- * whether they are isolcpus or were isolated via cpusets, lest
- * the timer run on a CPU which does not service all runqueues,
- * potentially leaving other CPUs indefinitely throttled. If
- * isolation is really required, the user will turn the throttle
- * off to kill the perturbations it causes anyway. Meanwhile,
- * this maintains functionality for boot and/or troubleshooting.
+ * The tasks of a task_group may run on isolated CPUs as well
+ * as non-isolated CPUs, whether they are isolcpus or were
+ * isolated via cpusets. Check the rt_rq of every online CPU,
+ * lest the timer run on a CPU which does not service all
+ * runqueues, potentially leaving other CPUs indefinitely
+ * throttled.
*/
- if (rt_b == &root_task_group.rt_bandwidth)
- span = cpu_online_mask;
+ span = cpu_online_mask;
#endif
for_each_cpu(i, span) {
int enqueue = 0;
--
2.34.1