[PATCH OLK-5.10 V1] sched/topology: Prevent race condition in sched_domain topology

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IB485D

--------------------------------

A race condition in `build_sched_domains() -> build_sched_domain() ->
sd_init()` can lead to a null pointer dereference when `tl->data` is
accessed. This occurs because `build_sched_domains() -> alloc_state()`
may skip memory allocation for `tl->data` depending on whether
`SDTL_SKIP` is set in `tl->flags`, and `tl->flags` can be modified
concurrently through `sched_cluster_handler()`, which was introduced by
the feature "scheduler: Add runtime knob sysctl_sched_cluster".

The issue arises when `sysctl_sched_cluster` is modified via
`/proc/sys/kernel/sched_cluster`, which changes `tl->flags` through
`sched_cluster_handler() -> set_sched_cluster()`. If the flag changes
between `alloc_state()` and `sd_init()`, `sd_init()` expects `tl->data`
to be non-null even though `alloc_state()` never allocated it.

To resolve this, take `sched_domains_mutex` around the call to
`set_sched_cluster()`. This ensures that changes to `tl->flags` cannot
interleave with the memory allocation done in `build_sched_domains()`.

Fixes: 8ce3e706b314 ("scheduler: Add runtime knob sysctl_sched_cluster")
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
---
 kernel/sched/topology.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 4bf575e4e7fc..3a8673a1a3fc 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1722,7 +1722,13 @@ int sched_cluster_handler(struct ctl_table *table, int write,
 	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 	if (!ret && write) {
 		if (oldval != sysctl_sched_cluster) {
+			/*
+			 * Here may have raced with partition_sched_domains_locked,
+			 * it needs to be protected with sched_domains_mutex.
+			 */
+			mutex_lock(&sched_domains_mutex);
 			set_sched_cluster();
+			mutex_unlock(&sched_domains_mutex);
 			arch_rebuild_cpu_topology();
 		}
 	}
--
2.34.1
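For readers who want to see the locking pattern in isolation, here is a minimal user-space sketch of the idea behind the fix. It is not kernel code: it uses pthreads instead of the kernel mutex API, and the names `set_cluster()`, `build_domains()`, `stress()`, `domains_mutex`, and the numeric value of `SDTL_SKIP` are hypothetical stand-ins chosen for illustration. The builder first decides whether to allocate `data` (the `alloc_state()` step) and later checks whether it needs `data` (the `sd_init()` step). Because the writer takes the same mutex, the flag cannot change between those two steps.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SDTL_SKIP 0x1   /* illustrative value, not taken from the kernel */
#define ROUNDS    20000

/* Simplified stand-in for one topology level. */
struct topo_level {
	int flags;
	void *data;
};

static struct topo_level tl;
static char storage[64];   /* stands in for the per-level allocation */
static pthread_mutex_t domains_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Writer: models the sysctl handler changing tl.flags under the mutex. */
static void set_cluster(int enable)
{
	pthread_mutex_lock(&domains_mutex);
	if (enable)
		tl.flags &= ~SDTL_SKIP;
	else
		tl.flags |= SDTL_SKIP;
	pthread_mutex_unlock(&domains_mutex);
}

/*
 * Builder: models the allocate-then-initialize sequence.
 * Returns 0 on success, -1 if the init step needed data that the
 * allocation step skipped (the condition the patch describes).
 */
static int build_domains(void)
{
	int ret = 0;

	pthread_mutex_lock(&domains_mutex);
	assert(tl.data == NULL);           /* previous build released its state */

	if (!(tl.flags & SDTL_SKIP))       /* allocation step */
		tl.data = storage;

	if (!(tl.flags & SDTL_SKIP) && tl.data == NULL)   /* init step */
		ret = -1;

	tl.data = NULL;                    /* release state for the next build */
	pthread_mutex_unlock(&domains_mutex);
	return ret;
}

/* Thread body: flips the flag back and forth. */
static void *toggler(void *arg)
{
	(void)arg;
	for (int i = 0; i < ROUNDS; i++)
		set_cluster(i & 1);
	return NULL;
}

/*
 * Runs the builder against a concurrent toggler.
 * Returns the number of failed builds, or -1 if the thread could not start.
 */
static int stress(void)
{
	pthread_t t;
	int failures = 0;

	if (pthread_create(&t, NULL, toggler, NULL) != 0)
		return -1;
	for (int i = 0; i < ROUNDS; i++)
		if (build_domains() != 0)
			failures++;
	pthread_join(t, NULL);
	return failures;
}
```

Calling `stress()` from a small `main()` and linking with `-pthread` should report zero failed builds, since both sides serialize on `domains_mutex`. Removing the lock and unlock calls from `set_cluster()` reintroduces the window between the two steps that the commit message describes.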

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IB485A

--------------------------------

A race condition in `build_sched_domains() -> build_sched_domain() ->
sd_init()` can lead to a null pointer dereference when `tl->data` is
accessed. This occurs because `build_sched_domains() -> alloc_state()`
may skip memory allocation for `tl->data` depending on whether
`SDTL_SKIP` is set in `tl->flags`, and `tl->flags` can be modified
concurrently through `sched_cluster_handler()`, which was introduced by
the feature "scheduler: Add runtime knob sysctl_sched_cluster".

The issue arises when `sysctl_sched_cluster` is modified via
`/proc/sys/kernel/sched_cluster`, which changes `tl->flags` through
`sched_cluster_handler() -> set_sched_cluster()`. If the flag changes
between `alloc_state()` and `sd_init()`, `sd_init()` expects `tl->data`
to be non-null even though `alloc_state()` never allocated it.

To resolve this, take `sched_domains_mutex` around the call to
`set_sched_cluster()`. This ensures that changes to `tl->flags` cannot
interleave with the memory allocation done in `build_sched_domains()`.

Fixes: c89577a6f0f3 ("scheduler: Add runtime knob sysctl_sched_cluster")
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
---
 kernel/sched/topology.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index ad250ac53931..15773324af38 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1765,7 +1765,13 @@ int sched_cluster_handler(struct ctl_table *table, int write,
 	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 	if (!ret && write) {
 		if (oldval != sysctl_sched_cluster) {
+			/*
+			 * Here may have raced with partition_sched_domains_locked,
+			 * it needs to be protected with sched_domains_mutex.
+			 */
+			mutex_lock(&sched_domains_mutex);
 			set_sched_cluster();
+			mutex_unlock(&sched_domains_mutex);
 			arch_rebuild_cpu_topology();
 		}
 	}
--
2.34.1

Feedback: The patch(es) you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully.
Pull request link: https://gitee.com/openeuler/kernel/pulls/13413
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/F...

Feedback: The patch(es) you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully.
Pull request link: https://gitee.com/openeuler/kernel/pulls/13414
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/N...
participants (2)
- patchwork bot
- Zicheng Qu