From: Tejun Heo <tj@kernel.org> mainline inclusion from mainline-v6.12-rc1 commit 991ef53a4832941c8130008ef35c66ec88c7fa0f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/IDC9YK Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... -------------------------------- scx_rq_online() currently only tests SCX_RQ_ONLINE. This isn't fully correct - e.g. consume_dispatch_q() uses task_run_on_remote_rq() which tests scx_rq_online() to see whether the current rq can run the task, and, if so, calls consume_remote_task() to migrate the task to @rq. While the test itself was done while locking @rq, @rq can be temporarily unlocked by consume_remote_task() and nothing prevents SCX_RQ_ONLINE from going offline before the migration takes place. To address the issue, add cpu_active() test to scx_rq_online(). There is a synchronize_rcu() between cpu_active() being cleared and the rq going offline, so if an on-going scheduling operation sees cpu_active(), the associated rq is guaranteed to not go offline until the scheduling operation is complete. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 60c27fb59f6c ("sched_ext: Implement sched_ext_ops.cpu_online/offline()") Acked-by: David Vernet <void@manifault.com> Signed-off-by: Zicheng Qu <quzicheng@huawei.com> --- kernel/sched/ext.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 00f50f967a0d..61cb72a64b0f 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -1827,7 +1827,14 @@ static void direct_dispatch(struct task_struct *p, u64 enq_flags) static bool scx_rq_online(struct rq *rq) { - return likely(rq->scx.flags & SCX_RQ_ONLINE); + /* + * Test both cpu_active() and %SCX_RQ_ONLINE. %SCX_RQ_ONLINE indicates + * the online state as seen from the BPF scheduler. cpu_active() test + * guarantees that, if this function returns %true, %SCX_RQ_ONLINE will + * stay set until the current scheduling operation is complete even if + * we aren't locking @rq. + */ + return likely((rq->scx.flags & SCX_RQ_ONLINE) && cpu_active(cpu_of(rq))); } static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags, -- 2.34.1