From: Tejun Heo <tj@kernel.org>

mainline inclusion
from mainline-v6.12-rc1
commit 9f391f94a1730232ad2760202755b2d9baf4688d
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IDC9YK

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

--------------------------------

sched_domains regulate the load balancing for sched_classes. A machine can
be partitioned into multiple sections that are not load-balanced across
using either isolcpus= boot param or cpuset partitions. In such cases, tasks
that are in one partition are expected to stay within that partition.

cpuset configured partitions are always reflected in each member task's
cpumask. As SCX always honors the task cpumasks, the BPF scheduler is
automatically in compliance with the configured partitions.

However, for isolcpus= domain isolation, the isolated CPUs are simply
omitted from the top-level sched_domain[s] without further restrictions on
tasks' cpumasks, so, for example, a task currently running in an isolated
CPU may have more CPUs in its allowed cpumask while expected to remain on
the same CPU.

There is no straightforward way to enforce this partitioning preemptively on
BPF schedulers and erroring out after a violation can be surprising.
isolcpus= domain isolation is being replaced with cpuset partitions anyway,
so keep it simple and simply disallow loading a BPF scheduler if isolcpus=
domain isolation is in effect.

Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20240626082342.GY31592@noisy.programming.kicks-ass....
Cc: David Vernet <void@manifault.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
---
 kernel/sched/build_policy.c | 1 +
 kernel/sched/ext.c          | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
index 4ae066f08cc9..07c93846999b 100644
--- a/kernel/sched/build_policy.c
+++ b/kernel/sched/build_policy.c
@@ -16,6 +16,7 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/cputime.h>
 #include <linux/sched/hotplug.h>
+#include <linux/sched/isolation.h>
 #include <linux/sched/posix-timers.h>
 #include <linux/sched/rt.h>
 
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index a47197a820e0..c8c5bbea5466 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4403,6 +4403,12 @@ static int scx_ops_enable(struct sched_ext_ops *ops)
 	unsigned long timeout;
 	int i, cpu, ret;
 
+	if (!cpumask_equal(housekeeping_cpumask(HK_TYPE_DOMAIN),
+			   cpu_possible_mask)) {
+		pr_err("sched_ext: Not compatible with \"isolcpus=\" domain isolation");
+		return -EINVAL;
+	}
+
 	mutex_lock(&scx_ops_enable_mutex);
 
 	if (!scx_ops_helper) {
-- 
2.34.1