[PATCH OLK-5.10 0/2] xfrm: Fix CVE-2026-43091 RCU policy netns exit
From: Dong Chenchen <dongchenchen2@huawei.com> This patch series fixes CVE-2026-43091 in the OLK-5.10 branch. * Patch 1: Add synchronize_rcu() before freeing policy hash tables on netns exit to wait for concurrent RCU readers. * Patch 2: Move the RCU sync and policy flush from .exit to .pre_exit for O(1) RCU grace periods per batch instead of O(N). Steffen Klassert (1): xfrm: Wait for RCU readers during policy netns exit Usama Arif (1): xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit net/xfrm/xfrm_policy.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) -- 2.43.0
From: Steffen Klassert <steffen.klassert@secunet.com> mainline inclusion from mainline-v7.1-rc1 commit 069daad4f2ae9c5c108131995529d5f02392c446 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/14643 CVE: CVE-2026-43091 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- xfrm_policy_fini() frees the policy_bydst hash tables after flushing the policy work items and deleting all policies, but it does not wait for concurrent RCU readers to leave their read-side critical sections first. The policy_bydst tables are published via rcu_assign_pointer() and are looked up through rcu_dereference_check(), so netns teardown must also wait for an RCU grace period before freeing the table memory. Fix this by adding synchronize_rcu() before freeing the policy hash tables. Fixes: e1e551bc5630 ("xfrm: policy: prepare policy_bydst hash for rcu lookups") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/xfrm/xfrm_policy.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index c7a827d4265c..096cbf2e388c 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -4119,6 +4119,8 @@ static void xfrm_policy_fini(struct net *net) #endif xfrm_policy_flush(net, XFRM_POLICY_TYPE_MAIN, false); + synchronize_rcu(); + WARN_ON(!list_empty(&net->xfrm.policy_all)); for (dir = 0; dir < XFRM_POLICY_MAX; dir++) { -- 2.43.0
From: Usama Arif <usama.arif@linux.dev> mainline inclusion from mainline-v7.1-rc6 commit 3e52417318473782012b236d0325bf7d2266a597 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/14643 CVE: CVE-2026-43091 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- The struct pernet_operations docstring in include/net/net_namespace.h explicitly warns against blocking RCU primitives in .exit handlers: Exit methods using blocking RCU primitives, such as synchronize_rcu(), should be implemented via exit_batch. [...] Please, avoid synchronize_rcu() at all, where it's possible. Note that a combination of pre_exit() and exit() can be used, since a synchronize_rcu() is guaranteed between the calls. xfrm_policy_fini() violates this: it calls synchronize_rcu() before freeing the policy_bydst hash tables (so no RCU reader is mid- traversal at free time), but runs from xfrm_net_ops.exit -- once per namespace -- so a cleanup_net() of N namespaces pays N full RCU grace periods serially. Use the documented pre_exit/exit split. Move the policy flush (and the workqueue drains it depends on) into a new .pre_exit handler; xfrm_policy_fini() then runs in .exit and frees the hash tables after the synchronize_rcu_expedited() that cleanup_net() guarantees between the two phases. Providing O(1) RCU grace periods per batch instead of O(N). Observed on Linux 6.18 with a workload doing unshare(CLONE_NEWNET) at ~13/sec sustained: cleanup_net() and the netns_wq rescuer kthread both stuck in xfrm_policy_fini()'s synchronize_rcu(), >300k struct net accumulated in the cleanup queue, Percpu in /proc/meminfo climbed to 130+ GB on 256-CPU hosts, and memcg OOMs followed. setup_net and __put_net counts were balanced, ruling out a refcount leak. Fixes: 069daad4f2ae ("xfrm: Wait for RCU readers during policy netns exit") Signed-off-by: Usama Arif <usama.arif@linux.dev> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Conflicts: net/xfrm/xfrm_policy.c [OLK-5.10 lacks commit 7a0207094f1b] Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/xfrm/xfrm_policy.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 096cbf2e388c..77125d0272ee 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -4107,19 +4107,20 @@ static int __net_init xfrm_policy_init(struct net *net) return -ENOMEM; } -static void xfrm_policy_fini(struct net *net) +static void __net_exit xfrm_net_pre_exit(struct net *net) { - struct xfrm_pol_inexact_bin *b, *t; - unsigned int sz; - int dir; - flush_work(&net->xfrm.policy_hash_work); #ifdef CONFIG_XFRM_SUB_POLICY xfrm_policy_flush(net, XFRM_POLICY_TYPE_SUB, false); #endif xfrm_policy_flush(net, XFRM_POLICY_TYPE_MAIN, false); +} - synchronize_rcu(); +static void xfrm_policy_fini(struct net *net) +{ + struct xfrm_pol_inexact_bin *b, *t; + unsigned int sz; + int dir; WARN_ON(!list_empty(&net->xfrm.policy_all)); @@ -4189,6 +4190,7 @@ static void __net_exit xfrm_net_exit(struct net *net) static struct pernet_operations __net_initdata xfrm_net_ops = { .init = xfrm_net_init, + .pre_exit = xfrm_net_pre_exit, .exit = xfrm_net_exit, }; -- 2.43.0
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://atomgit.com/openeuler/kernel/merge_requests/23661 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/JFK... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://atomgit.com/openeuler/kernel/merge_requests/23661 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/JFK...
participants (2)
-
patchwork bot -
superdcc97@163.com