[PATCH OLK-6.6 0/2] XSched: List Corruption
XSched: List Corruption

Liu Kai (2):
  xsched: protect group member list with xcu_lock to prevent corruption
  xsched/cgroup: move list_del from css_free to css_offline to prevent
    corruption

 kernel/xsched/cgroup.c  | 30 +++++++++++++++---------------
 kernel/xsched/core.c    |  7 +++----
 kernel/xsched/vstream.c |  1 +
 3 files changed, 19 insertions(+), 19 deletions(-)

--
2.34.1
[PATCH OLK-6.6 1/2] xsched: protect group member list with xcu_lock to prevent corruption

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IDB5TR

-----------------------------------------

The xsched_group_xse_detach() call in delete_ctx() can race with
xsched_group_xse_attach() in xcu_move_task(), potentially corrupting the
xg->members linked list through concurrent modification.

Race scenario:

  CPU0                          CPU1
  mutex_lock(xcu_lock)
  dequeue_ctx
  mutex_unlock(xcu_lock)
                                mutex_lock(xcu_lock)
                                dequeue_ctx
  xse_detach                    xse_attach
                                enqueue_ctx
                                mutex_unlock(xcu_lock)

Because delete_ctx() calls xse_detach() only after it has released
xcu_lock, xse_detach() and xse_attach() can manipulate xg->members at the
same time, which corrupts the linked list.

Fix:
1. Move xsched_group_xse_detach() inside the xcu_lock critical section to
   serialize it with xsched_group_xse_attach().
2. Update the nr_ctx counter right after list_del(&ctx->ctx_node), so the
   counter changes together with the context list, under ctx_list_lock.

This serializes all updates to the group membership list and prevents
corruption under concurrent access.

Fixes: 43bbefc53356 ("xsched: Add XCU control group implementation and its backend in xsched CFS")
Signed-off-by: Liu Kai <liukai284@huawei.com>
---
 kernel/xsched/core.c    | 7 +++----
 kernel/xsched/vstream.c | 1 +
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/xsched/core.c b/kernel/xsched/core.c
index 5e6c5eec2dc9..1bf7a93985bb 100644
--- a/kernel/xsched/core.c
+++ b/kernel/xsched/core.c
@@ -174,15 +174,14 @@ int delete_ctx(struct xsched_context *ctx)
 	if (curr_xse == xse)
 		xcu->xrq.curr_xse = NULL;
 	dequeue_ctx(xse, xcu);
-	--xcu->nr_ctx;
-	mutex_unlock(&xcu->xcu_lock);
-
-	xse->class->xse_deinit(xse);

 #ifdef CONFIG_CGROUP_XCU
 	xsched_group_xse_detach(xse);
 #endif

+	mutex_unlock(&xcu->xcu_lock);
+
+	xse->class->xse_deinit(xse);
 	return 0;
 }

diff --git a/kernel/xsched/vstream.c b/kernel/xsched/vstream.c
index ebde50cbb8c6..bf2f8c6b5c6c 100644
--- a/kernel/xsched/vstream.c
+++ b/kernel/xsched/vstream.c
@@ -87,6 +87,7 @@ static void xsched_task_free(struct kref *kref)

 	delete_ctx(ctx);
 	list_del(&ctx->ctx_node);
+	--xcu->nr_ctx;
 	mutex_unlock(&xcu->ctx_list_lock);

 	mutex_lock(&xcu->xcu_lock);
--
2.34.1
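The locking rule this patch enforces can be illustrated outside the kernel. The following is a minimal userspace sketch, not the xsched code: `struct unit`, `struct entity`, `entity_move()` and `entity_delete()` are hypothetical stand-ins for the xcu, the xse, xcu_move_task() and delete_ctx(), and a pthread mutex plays the role of xcu_lock. The point is only that the list removal in the delete path happens before the unlock, so it can never interleave with the insertion in the move path.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }

static void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Hypothetical stand-ins for the xcu and its group member list. */
struct unit {
	pthread_mutex_t lock;	/* plays the role of xcu_lock */
	struct node members;	/* plays the role of xg->members */
};

struct entity { struct node link; int queued; };

static void unit_init(struct unit *u)
{
	pthread_mutex_init(&u->lock, NULL);
	node_init(&u->members);
}

/* Move path: dequeue, attach and enqueue all happen under the lock. */
static void entity_move(struct unit *u, struct entity *e)
{
	pthread_mutex_lock(&u->lock);
	e->queued = 0;				/* "dequeue" */
	node_add(&e->link, &u->members);	/* "attach" */
	e->queued = 1;				/* "enqueue" */
	pthread_mutex_unlock(&u->lock);
}

/* Delete path after the fix: the detach happens before the unlock. */
static void entity_delete(struct unit *u, struct entity *e)
{
	pthread_mutex_lock(&u->lock);
	assert(e->queued);	/* only entities that were moved in can be deleted */
	e->queued = 0;		/* "dequeue" */
	node_del(&e->link);	/* "detach", still inside the critical section */
	pthread_mutex_unlock(&u->lock);
}

/* Count members under the same lock that protects the list. */
static size_t unit_count(struct unit *u)
{
	size_t n = 0;

	pthread_mutex_lock(&u->lock);
	for (struct node *p = u->members.next; p != &u->members; p = p->next)
		n++;
	pthread_mutex_unlock(&u->lock);
	return n;
}
```

In the pre-patch ordering, the equivalent of `node_del()` would sit after `pthread_mutex_unlock()`, leaving a window in which another thread's `node_add()` could rewrite the same `prev`/`next` pointers.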
[PATCH OLK-6.6 2/2] xsched/cgroup: move list_del from css_free to css_offline to prevent corruption

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IDB5TR

-----------------------------------------

css_free() is executed asynchronously via css_free_work_fn() when the CSS
reference count reaches zero. Performing list_del() in this asynchronous
context can corrupt the list if other operations access the same list
concurrently.

The issue arises because:
1. css_free() runs in workqueue context and may be delayed.
2. Other code paths may assume the CSS has already been removed from the
   lists.
3. Concurrent list operations during this window can corrupt list pointers.

Solution:
Move the list_del() operation from css_free() to css_offline(), which:
- Runs synchronously during CSS teardown
- Ensures timely removal from all lists
- Maintains list integrity throughout the destruction process

This guarantees that once css_offline() completes, the CSS is no longer
present in any shared list, preventing use-after-free races.

Fixes: 34a49359681b ("xsched: prevent NULL deref by refcounting css and tracking offline state")
Signed-off-by: Liu Kai <liukai284@huawei.com>
---
 kernel/xsched/cgroup.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/kernel/xsched/cgroup.c b/kernel/xsched/cgroup.c
index 8f3e2d9e9e12..e50556a82cea 100644
--- a/kernel/xsched/cgroup.c
+++ b/kernel/xsched/cgroup.c
@@ -196,7 +196,7 @@ static int xcu_cg_init(struct xsched_group *xcg,
 		return xcu_cfs_cg_init(xcg, parent_xg);
 	default:
 		XSCHED_INFO("xcu_cgroup: init RT group css=0x%lx\n",
-		   (uintptr_t)&xcg->css);
+			(uintptr_t)&xcg->css);
 		break;
 	}

@@ -243,20 +243,6 @@ static void xcu_css_free(struct cgroup_subsys_state *css)
 {
 	struct xsched_group *xcg = xcu_cg_from_css(css);

-	if (!xsched_group_is_root(xcg)) {
-		switch (xcg->sched_class) {
-		case XSCHED_TYPE_CFS:
-			xcu_cfs_cg_deinit(xcg);
-			break;
-		default:
-			XSCHED_INFO("xcu_cgroup: deinit RT group css=0x%lx\n",
-				(uintptr_t)&xcg->css);
-			break;
-		}
-	}
-
-	list_del(&xcg->group_node);
-
 	kmem_cache_free(xsched_group_cache, xcg);
 }

@@ -318,6 +304,20 @@ static void xcu_css_offline(struct cgroup_subsys_state *css)
 	hrtimer_cancel(&xcg->quota_timeout);
 	cancel_work_sync(&xcg->refill_work);
 	cancel_work_sync(&xcg->file_show_work);
+
+	if (!xsched_group_is_root(xcg)) {
+		switch (xcg->sched_class) {
+		case XSCHED_TYPE_CFS:
+			xcu_cfs_cg_deinit(xcg);
+			break;
+		default:
+			XSCHED_INFO("xcu_cgroup: deinit RT group css=0x%lx\n",
+				(uintptr_t)&xcg->css);
+			break;
+		}
+	}
+
+	list_del(&xcg->group_node);
 }

 static void xsched_group_xse_attach(struct xsched_group *xg,
--
2.34.1
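The split between a synchronous "offline" step and a deferred "free" step can also be sketched in plain userspace C. This is a simplified illustration under stated assumptions, not the cgroup core: `struct group`, `group_create()`, `group_offline()`, `group_free()` and `groups_visible()` are hypothetical names, and the delay between offline and free is simulated simply by calling `group_free()` later. The idea shown is that unlinking happens in the offline step, so the deferred free step only releases memory and never touches the shared list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }

static void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);	/* leave the node self-linked once it is off the list */
}

/* Hypothetical group object; `link` plays the role of group_node. */
struct group {
	struct node link;
	int id;
};

/* Shared list of live groups (illustrative global). */
static struct node all_groups = { &all_groups, &all_groups };

static struct group *group_create(int id)
{
	struct group *g = calloc(1, sizeof(*g));

	if (!g)
		return NULL;
	g->id = id;
	node_add(&g->link, &all_groups);
	return g;
}

/* Synchronous teardown step: unlink here, so no later walker can reach g. */
static void group_offline(struct group *g)
{
	node_del(&g->link);
}

/* Deferred step: may run much later and only releases memory. */
static void group_free(struct group *g)
{
	assert(g->link.next == &g->link);	/* must already be offline */
	free(g);
}

/* Number of groups a list walker can currently see. */
static size_t groups_visible(void)
{
	size_t n = 0;

	for (struct node *p = all_groups.next; p != &all_groups; p = p->next)
		n++;
	return n;
}
```

With this ordering, any code that walks `all_groups` between `group_offline()` and the eventual `group_free()` no longer sees the dying group, which is the property the commit message describes for css_offline().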
Feedback:
The patch(es) you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/19702
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/ZF6...