From: Tejun Heo <tj@kernel.org> mainline inclusion from mainline-v6.12-rc1 commit 9390a923e109f85b242bf676dc5bc81958d447fa category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/IDC9YK Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... -------------------------------- scx_task_iter_next_locked() skips tasks whose sched_class is idle_sched_class. While it has a short comment explaining why it's testing the sched_class directly isntead of using is_idle_task(), the comment doesn't sufficiently explain what's going on and why. Improve the comment. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: David Vernet <void@manifault.com> Signed-off-by: Zicheng Qu <quzicheng@huawei.com> --- kernel/sched/ext.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 645d12be385a..59f5801bff2c 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -1256,8 +1256,29 @@ scx_task_iter_next_locked(struct scx_task_iter *iter, bool include_dead) while ((p = scx_task_iter_next(iter))) { /* - * is_idle_task() tests %PF_IDLE which may not be set for CPUs - * which haven't yet been onlined. Test sched_class directly. + * scx_task_iter is used to prepare and move tasks into SCX + * while loading the BPF scheduler and vice-versa while + * unloading. The init_tasks ("swappers") should be excluded + * from the iteration because: + * + * - It's unsafe to use __setschduler_prio() on an init_task to + * determine the sched_class to use as it won't preserve its + * idle_sched_class. + * + * - ops.init/exit_task() can easily be confused if called with + * init_tasks as they, e.g., share PID 0. + * + * As init_tasks are never scheduled through SCX, they can be + * skipped safely. Note that is_idle_task() which tests %PF_IDLE + * doesn't work here: + * + * - %PF_IDLE may not be set for an init_task whose CPU hasn't + * yet been onlined. + * + * - %PF_IDLE can be set on tasks that are not init_tasks. See + * play_idle_precise() used by CONFIG_IDLE_INJECT. + * + * Test for idle_sched_class as only init_tasks are on it. */ if (p->sched_class != &idle_sched_class) break; -- 2.34.1