hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IB485A
--------------------------------
A race in `build_sched_domains() -> build_sched_domain() -> sd_init()` can lead to a null pointer dereference when `tl->data` is accessed. `build_sched_domains() -> alloc_state()` may skip the memory allocation for `tl->data` depending on whether `SDTL_SKIP` is set in `tl->flags`, and `tl->flags` can be modified concurrently by `sched_cluster_handler()`, the sysctl handler introduced by "scheduler: Add runtime knob sysctl_sched_cluster".

The issue arises when `sysctl_sched_cluster` is written via `/proc/sys/kernel/sched_cluster`, which changes `tl->flags` through `sched_cluster_handler() -> set_sched_cluster()`. If this happens between `alloc_state()` and `sd_init()`, the state becomes inconsistent: `sd_init()` expects `tl->data` to be non-null, but `alloc_state()` never allocated it.
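
One interleaving consistent with the description above is sketched below. This is illustrative only: the CPU labels and the exact point at which the flag changes are assumptions, not taken from a captured trace.

```
CPU0 (sched domain rebuild)          CPU1 (write to /proc/sys/kernel/sched_cluster)
build_sched_domains()
  alloc_state()
    /* SDTL_SKIP set in tl->flags:
     * tl->data is not allocated */
                                     sched_cluster_handler()
                                       set_sched_cluster()
                                         /* changes tl->flags */
  build_sched_domain()
    sd_init()
      /* level is no longer skipped:
       * tl->data is dereferenced
       * but was never allocated */
```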

To resolve this, hold `sched_domains_mutex` around the call to `set_sched_cluster()`. This ensures that `tl->flags` cannot change between the allocation in `alloc_state()` and its use in `sd_init()` during `build_sched_domains()`.

Fixes: c89577a6f0f3 ("scheduler: Add runtime knob sysctl_sched_cluster")
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
---
 kernel/sched/topology.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index ad250ac53931..15773324af38 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1765,7 +1765,13 @@ int sched_cluster_handler(struct ctl_table *table, int write,
 	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 	if (!ret && write) {
 		if (oldval != sysctl_sched_cluster) {
+			/*
+			 * Here may have raced with partition_sched_domains_locked,
+			 * it needs to be protected with sched_domains_mutex.
+			 */
+			mutex_lock(&sched_domains_mutex);
 			set_sched_cluster();
+			mutex_unlock(&sched_domains_mutex);
 			arch_rebuild_cpu_topology();
 		}
 	}