mailweb.openeuler.org

Kernel

kernel@openeuler.org

  • 27 participants
  • 23208 discussions
[PATCH OLK-5.10] mm: make sure freeram is smaller than totalram
by Liu Shixin 17 Mar '26

From: Yang Yingliang <yangyingliang(a)huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8748
CVE: NA

--------------------------------------------------

Memory statistics are not updated in real time, so they can lag behind the real values. On a CPU-less NUMA node, MemTotal and MemFree are nearly equal, and this lag can make MemFree larger than MemTotal. MemUsed then becomes negative and is printed as a huge unsigned number:

cat /sys/devices/system/node/node17/meminfo
Node 17 MemTotal:        4194304 kB
Node 17 MemFree:         4195552 kB
Node 17 MemUsed: 18446744073709550368 kB
Node 17 Active:               52 kB
Node 17 Inactive:            320 kB
Node 17 Active(anon):          0 kB
Node 17 Inactive(anon):        0 kB
Node 17 Active(file):         52 kB
Node 17 Inactive(file):      320 kB
Node 17 Unevictable:           0 kB
Node 17 Mlocked:               0 kB
Node 17 Dirty:                 0 kB
Node 17 Writeback:             0 kB
Node 17 FilePages:           372 kB
Node 17 Mapped:              320 kB
Node 17 AnonPages:             0 kB
Node 17 Shmem:                 0 kB
Node 17 KernelStack:           0 kB
Node 17 PageTables:            0 kB
Node 17 NFS_Unstable:          0 kB
Node 17 Bounce:                0 kB
Node 17 WritebackTmp:          0 kB
Node 17 KReclaimable:          0 kB
Node 17 Slab:                  0 kB
Node 17 SReclaimable:          0 kB
Node 17 SUnreclaim:            0 kB
Node 17 AnonHugePages:     79872 kB
Node 17 ShmemHugePages:        0 kB
Node 17 ShmemPmdMapped:        0 kB
Node 17 FileHugePages:         0 kB
Node 17 FilePmdMapped:         0 kB
Node 17 HugePages_Total:       0
Node 17 HugePages_Free:        0
Node 17 HugePages_Surp:        0

To avoid this exception, make MemFree equal to MemTotal when MemFree is larger than MemTotal.

Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
 mm/page_alloc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 72d5303f8000..02db21de93aa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5837,6 +5837,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
 	val->totalram = managed_pages;
 	val->sharedram = node_page_state(pgdat, NR_SHMEM);
 	val->freeram = sum_zone_node_page_state(nid, NR_FREE_PAGES);
+	val->freeram = min(val->freeram, val->totalram);
 #ifdef CONFIG_HIGHMEM
 	for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++) {
 		struct zone *zone = &pgdat->node_zones[zone_type];
--
2.33.0
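The arithmetic behind the fix can be sketched in plain C (a minimal stand-in for the patched logic, not the kernel code; the function names here are illustrative). On a 64-bit machine, `total - free` in unsigned arithmetic wraps to a huge value whenever the stale free count exceeds the total; clamping free to total before subtracting, as the patch's `min()` does, prevents that.

```c
#include <stddef.h>

/* Illustrative stand-in for the patched step in si_meminfo_node():
 * the patch uses the kernel's min() macro for this clamp. */
static unsigned long clamp_free(unsigned long freeram, unsigned long totalram)
{
    return freeram < totalram ? freeram : totalram;
}

/* "used = total - free" is what /proc readers compute; with the clamp
 * applied first, the unsigned subtraction can never wrap. */
static unsigned long node_mem_used(unsigned long totalram, unsigned long freeram)
{
    return totalram - clamp_free(freeram, totalram);
}
```

With the kB values from the report (total 4194304, free 4195552), the unclamped subtraction would wrap, while `node_mem_used()` reports 0 used, matching what the patched meminfo prints.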
[PATCH OLK-5.10] l2tp: fix double dst_release() on sk_dst_cache race
by Li Xiasong 17 Mar '26

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8696
Reference: https://lore.kernel.org/netdev/20251215145537.5085-1-m.lobanov@rosa.ru/

----------------------------------------

A reproducible "rcuref - imbalanced put()" warning is observed under IPv6 L2TP (pppol2tp) traffic with blackhole routes. It indicates an imbalance in dst reference counting for routes cached in sk->sk_dst_cache, pointing to a subtle lifetime/synchronization issue between the helpers that validate and drop cached dst entries.

rcuref - imbalanced put()
WARNING: CPU: 0 PID: 899 at lib/rcuref.c:266 rcuref_put_slowpath+0x1ce/0x240 lib/rcuref.c:266
Modules linked in:
CPSocket connected tcp:127.0.0.1:48148,server=on <-> 127.0.0.1:33750
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:rcuref_put_slowpath+0x1ce/0x240 lib/rcuref.c:266
Call Trace:
 <TASK>
 __rcuref_put include/linux/rcuref.h:97 [inline]
 rcuref_put include/linux/rcuref.h:153 [inline]
 dst_release+0x291/0x310 net/core/dst.c:167
 __sk_dst_check+0x2d4/0x350 net/core/sock.c:604
 __inet6_csk_dst_check net/ipv6/inet6_connection_sock.c:76 [inline]
 inet6_csk_route_socket+0x6ed/0x10c0 net/ipv6/inet6_connection_sock.c:104
 inet6_csk_xmit+0x12f/0x740 net/ipv6/inet6_connection_sock.c:121
 l2tp_xmit_queue net/l2tp/l2tp_core.c:1214 [inline]
 l2tp_xmit_core net/l2tp/l2tp_core.c:1309 [inline]
 l2tp_xmit_skb+0x1404/0x1910 net/l2tp/l2tp_core.c:1325
 pppol2tp_sendmsg+0x3ca/0x550 net/l2tp/l2tp_ppp.c:302
 sock_sendmsg_nosec net/socket.c:729 [inline]
 __sock_sendmsg net/socket.c:744 [inline]
 ____sys_sendmsg+0xab2/0xc70 net/socket.c:2609
 ___sys_sendmsg+0x11d/0x1c0 net/socket.c:2663
 __sys_sendmmsg+0x188/0x450 net/socket.c:2749
 __do_sys_sendmmsg net/socket.c:2778 [inline]
 __se_sys_sendmmsg net/socket.c:2775 [inline]
 __x64_sys_sendmmsg+0x98/0x100 net/socket.c:2775
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x64/0x140 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fe6960ec719
 </TASK>

The race occurs between the lockless UDPv6 transmit path (udpv6_sendmsg() -> sk_dst_check()) and the locked L2TP/pppol2tp transmit path (pppol2tp_sendmsg() -> l2tp_xmit_skb() -> ... -> inet6_csk_xmit() -> __sk_dst_check()) when both handle the same obsolete dst from sk->sk_dst_cache: the UDPv6 side takes an extra reference, then atomically steals and releases the cached dst, while the L2TP side, holding a stale cached pointer, still calls dst_release() on it. Together these updates produce one final dst_release() too many on that dst, triggering "rcuref - imbalanced put()".

The race condition:

Initial: sk->sk_dst_cache = dst, ref(dst) = 1

Thread 1: sk_dst_check()                  Thread 2: __sk_dst_check()
------------------------                  --------------------------
sk_dst_get(sk):
  rcu_read_lock()
  dst = rcu_dereference(sk->sk_dst_cache)
  rcuref_get(dst) succeeds
  rcu_read_unlock()    // ref = 2
                                          dst = __sk_dst_get(sk)
                                          // reads same dst from sk_dst_cache
                                          // ref still = 2 (no extra get)
[both see dst obsolete & check() == NULL]
sk_dst_reset(sk):
  old = xchg(&sk->sk_dst_cache, NULL)
  // old = dst
  dst_release(old)     // drop cached ref
                       // ref: 2 -> 1
                                          RCU_INIT_POINTER(sk->sk_dst_cache, NULL)
                                          // cache already NULL after xchg
dst_release(dst)       // ref: 1 -> 0
                                          dst_release(dst)
                                          // tries to drop its own ref
                                          // after the final put
                                          // rcuref_put_slowpath() ->
                                          // "rcuref - imbalanced put()"

Make L2TP's IPv6 transmit path stop using inet6_csk_xmit() (and thus __sk_dst_check()) and instead open-code the same routing and transmit sequence using the existing sk_dst_check(), with its dst reference counting, and ip6_dst_lookup_flow(). The new code builds a flowi6 from the socket fields in the same way as inet6_csk_route_socket(), performs the dst lookup via sk_dst_check() with proper reference counting, attaches the resulting dst to the skb via skb_dst_set(), and finally invokes ip6_xmit() for transmission.

This makes the UDPv6 and L2TP IPv6 paths share the same dst-cache handling logic for a given socket and removes the possibility that the lockless sk_dst_check() and the locked __sk_dst_check() concurrently drop the same cached dst and trigger the "rcuref - imbalanced put()" warning under concurrent traffic.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: b0270e91014d ("ipv4: add a sock pointer to ip_queue_xmit()")
Signed-off-by: Li Xiasong <lixiasong1(a)huawei.com>
---
 net/l2tp/l2tp_core.c | 60 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 97350350478d..80a20c079a9a 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -995,19 +995,73 @@ static int l2tp_build_l2tpv3_header(struct l2tp_session *session, void *buf)
 	return bufp - optr;
 }
 
+#if IS_ENABLED(CONFIG_IPV6)
+static int l2tp_xmit_ipv6(struct sock *sk, struct sk_buff *skb)
+{
+	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct inet_sock *inet = inet_sk(sk);
+	struct in6_addr *final_p, final;
+	struct dst_entry *dst;
+	struct flowi6 fl6;
+	int err;
+
+	memset(&fl6, 0, sizeof(fl6));
+	fl6.flowi6_proto = sk->sk_protocol;
+	fl6.daddr = sk->sk_v6_daddr;
+	fl6.saddr = np->saddr;
+	fl6.flowlabel = np->flow_label;
+	IP6_ECN_flow_xmit(sk, fl6.flowlabel);
+
+	fl6.flowi6_oif = sk->sk_bound_dev_if;
+	fl6.flowi6_mark = sk->sk_mark;
+	fl6.fl6_sport = inet->inet_sport;
+	fl6.fl6_dport = inet->inet_dport;
+	fl6.flowi6_uid = sk->sk_uid;
+
+	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
+
+	rcu_read_lock();
+	final_p = fl6_update_dst(&fl6, rcu_dereference(np->opt), &final);
+	rcu_read_unlock();
+
+	dst = sk_dst_check(sk, np->dst_cookie);
+	if (!dst) {
+		dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
+		if (IS_ERR(dst)) {
+			sk->sk_err_soft = -PTR_ERR(dst);
+			sk->sk_route_caps = 0;
+			kfree_skb(skb);
+			return PTR_ERR(dst);
+		}
+
+		ip6_dst_store(sk, dst_clone(dst), NULL, NULL);
+	}
+
+	skb_dst_set(skb, dst);
+	fl6.daddr = sk->sk_v6_daddr;
+
+	rcu_read_lock();
+	err = ip6_xmit(sk, skb, &fl6, sk->sk_mark, rcu_dereference(np->opt),
+		       np->tclass, sk->sk_priority);
+	rcu_read_unlock();
+	return err;
+}
+#endif
+
 /* Queue the packet to IP for output: tunnel socket lock must be held */
 static int l2tp_xmit_queue(struct l2tp_tunnel *tunnel, struct sk_buff *skb, struct flowi *fl)
 {
 	int err;
+	struct sock *sk = tunnel->sock;
 
 	skb->ignore_df = 1;
 	skb_dst_drop(skb);
 #if IS_ENABLED(CONFIG_IPV6)
-	if (l2tp_sk_is_v6(tunnel->sock))
-		err = inet6_csk_xmit(tunnel->sock, skb, NULL);
+	if (l2tp_sk_is_v6(sk))
+		err = l2tp_xmit_ipv6(sk, skb);
 	else
 #endif
-		err = ip_queue_xmit(tunnel->sock, skb, fl);
+		err = ip_queue_xmit(sk, skb, fl);
 
 	return err >= 0 ? NET_XMIT_SUCCESS : NET_XMIT_DROP;
 }
--
2.34.1
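The interleaving in the changelog can be replayed deterministically with an ordinary atomic counter standing in for the dst rcuref (the types and names here are illustrative, not the kernel API):

```c
#include <stdatomic.h>
#include <stddef.h>

/* A fake dst with a plain atomic refcount; "imbalanced" records the
 * condition rcuref_put_slowpath() warns about: a put issued after the
 * count already reached zero. */
struct fake_dst {
    atomic_int ref;
    int imbalanced;
};

static void fake_dst_release(struct fake_dst *d)
{
    /* atomic_fetch_sub() returns the previous value; a previous value
     * of <= 0 means someone put after the final put */
    if (atomic_fetch_sub(&d->ref, 1) <= 0)
        d->imbalanced = 1;
}

/* Replay: T1 is the lockless sk_dst_check() path, T2 the locked
 * __sk_dst_check() path; both see the same obsolete cached dst. */
static int replay_race(void)
{
    struct fake_dst dst = { 1, 0 };    /* ref = 1: the cached reference */
    struct fake_dst *cache = &dst;     /* sk->sk_dst_cache */

    atomic_fetch_add(&dst.ref, 1);     /* T1: sk_dst_get(), ref = 2 */
    struct fake_dst *stale = cache;    /* T2: __sk_dst_get(), NO extra ref */

    /* both now find the dst obsolete */
    cache = NULL;                      /* T1: xchg() in sk_dst_reset() */
    (void)cache;
    fake_dst_release(&dst);            /* T1: drop cached ref, 2 -> 1 */
    fake_dst_release(&dst);            /* T1: drop own ref, 1 -> 0 (freed) */
    fake_dst_release(stale);           /* T2: stale put, 0 -> -1: imbalance */

    return dst.imbalanced;
}
```

Running `replay_race()` returns 1: the third release through the stale pointer is exactly the extra put the warning reports. The fix removes T2's unreferenced read entirely, so no stale pointer survives to be released.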
[PATCH OLK-6.6] l2tp: fix double dst_release() on sk_dst_cache race
by Li Xiasong 17 Mar '26

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8696
Reference: https://lore.kernel.org/netdev/20251215145537.5085-1-m.lobanov@rosa.ru/

----------------------------------------

A reproducible "rcuref - imbalanced put()" warning is observed under IPv6 L2TP (pppol2tp) traffic with blackhole routes. It indicates an imbalance in dst reference counting for routes cached in sk->sk_dst_cache, pointing to a subtle lifetime/synchronization issue between the helpers that validate and drop cached dst entries.

rcuref - imbalanced put()
WARNING: CPU: 0 PID: 899 at lib/rcuref.c:266 rcuref_put_slowpath+0x1ce/0x240 lib/rcuref.c:266
Modules linked in:
CPSocket connected tcp:127.0.0.1:48148,server=on <-> 127.0.0.1:33750
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:rcuref_put_slowpath+0x1ce/0x240 lib/rcuref.c:266
Call Trace:
 <TASK>
 __rcuref_put include/linux/rcuref.h:97 [inline]
 rcuref_put include/linux/rcuref.h:153 [inline]
 dst_release+0x291/0x310 net/core/dst.c:167
 __sk_dst_check+0x2d4/0x350 net/core/sock.c:604
 __inet6_csk_dst_check net/ipv6/inet6_connection_sock.c:76 [inline]
 inet6_csk_route_socket+0x6ed/0x10c0 net/ipv6/inet6_connection_sock.c:104
 inet6_csk_xmit+0x12f/0x740 net/ipv6/inet6_connection_sock.c:121
 l2tp_xmit_queue net/l2tp/l2tp_core.c:1214 [inline]
 l2tp_xmit_core net/l2tp/l2tp_core.c:1309 [inline]
 l2tp_xmit_skb+0x1404/0x1910 net/l2tp/l2tp_core.c:1325
 pppol2tp_sendmsg+0x3ca/0x550 net/l2tp/l2tp_ppp.c:302
 sock_sendmsg_nosec net/socket.c:729 [inline]
 __sock_sendmsg net/socket.c:744 [inline]
 ____sys_sendmsg+0xab2/0xc70 net/socket.c:2609
 ___sys_sendmsg+0x11d/0x1c0 net/socket.c:2663
 __sys_sendmmsg+0x188/0x450 net/socket.c:2749
 __do_sys_sendmmsg net/socket.c:2778 [inline]
 __se_sys_sendmmsg net/socket.c:2775 [inline]
 __x64_sys_sendmmsg+0x98/0x100 net/socket.c:2775
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x64/0x140 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fe6960ec719
 </TASK>

The race occurs between the lockless UDPv6 transmit path (udpv6_sendmsg() -> sk_dst_check()) and the locked L2TP/pppol2tp transmit path (pppol2tp_sendmsg() -> l2tp_xmit_skb() -> ... -> inet6_csk_xmit() -> __sk_dst_check()) when both handle the same obsolete dst from sk->sk_dst_cache: the UDPv6 side takes an extra reference, then atomically steals and releases the cached dst, while the L2TP side, holding a stale cached pointer, still calls dst_release() on it. Together these updates produce one final dst_release() too many on that dst, triggering "rcuref - imbalanced put()".

The race condition:

Initial: sk->sk_dst_cache = dst, ref(dst) = 1

Thread 1: sk_dst_check()                  Thread 2: __sk_dst_check()
------------------------                  --------------------------
sk_dst_get(sk):
  rcu_read_lock()
  dst = rcu_dereference(sk->sk_dst_cache)
  rcuref_get(dst) succeeds
  rcu_read_unlock()    // ref = 2
                                          dst = __sk_dst_get(sk)
                                          // reads same dst from sk_dst_cache
                                          // ref still = 2 (no extra get)
[both see dst obsolete & check() == NULL]
sk_dst_reset(sk):
  old = xchg(&sk->sk_dst_cache, NULL)
  // old = dst
  dst_release(old)     // drop cached ref
                       // ref: 2 -> 1
                                          RCU_INIT_POINTER(sk->sk_dst_cache, NULL)
                                          // cache already NULL after xchg
dst_release(dst)       // ref: 1 -> 0
                                          dst_release(dst)
                                          // tries to drop its own ref
                                          // after the final put
                                          // rcuref_put_slowpath() ->
                                          // "rcuref - imbalanced put()"

Make L2TP's IPv6 transmit path stop using inet6_csk_xmit() (and thus __sk_dst_check()) and instead open-code the same routing and transmit sequence using the existing sk_dst_check(), with its dst reference counting, and ip6_dst_lookup_flow(). The new code builds a flowi6 from the socket fields in the same way as inet6_csk_route_socket(), performs the dst lookup via sk_dst_check() with proper reference counting, attaches the resulting dst to the skb via skb_dst_set(), and finally invokes ip6_xmit() for transmission.

This makes the UDPv6 and L2TP IPv6 paths share the same dst-cache handling logic for a given socket and removes the possibility that the lockless sk_dst_check() and the locked __sk_dst_check() concurrently drop the same cached dst and trigger the "rcuref - imbalanced put()" warning under concurrent traffic.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: b0270e91014d ("ipv4: add a sock pointer to ip_queue_xmit()")
Signed-off-by: Li Xiasong <lixiasong1(a)huawei.com>
---
 net/l2tp/l2tp_core.c | 60 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index f85b383811d5..79a5f635b049 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1000,19 +1000,73 @@ static int l2tp_build_l2tpv3_header(struct l2tp_session *session, void *buf)
 	return bufp - optr;
 }
 
+#if IS_ENABLED(CONFIG_IPV6)
+static int l2tp_xmit_ipv6(struct sock *sk, struct sk_buff *skb)
+{
+	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct inet_sock *inet = inet_sk(sk);
+	struct in6_addr *final_p, final;
+	struct dst_entry *dst;
+	struct flowi6 fl6;
+	int err;
+
+	memset(&fl6, 0, sizeof(fl6));
+	fl6.flowi6_proto = sk->sk_protocol;
+	fl6.daddr = sk->sk_v6_daddr;
+	fl6.saddr = np->saddr;
+	fl6.flowlabel = np->flow_label;
+	IP6_ECN_flow_xmit(sk, fl6.flowlabel);
+
+	fl6.flowi6_oif = sk->sk_bound_dev_if;
+	fl6.flowi6_mark = sk->sk_mark;
+	fl6.fl6_sport = inet->inet_sport;
+	fl6.fl6_dport = inet->inet_dport;
+	fl6.flowi6_uid = sk->sk_uid;
+
+	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
+
+	rcu_read_lock();
+	final_p = fl6_update_dst(&fl6, rcu_dereference(np->opt), &final);
+	rcu_read_unlock();
+
+	dst = sk_dst_check(sk, np->dst_cookie);
+	if (!dst) {
+		dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
+		if (IS_ERR(dst)) {
+			WRITE_ONCE(sk->sk_err_soft, -PTR_ERR(dst));
+			sk->sk_route_caps = 0;
+			kfree_skb(skb);
+			return PTR_ERR(dst);
+		}
+
+		ip6_dst_store(sk, dst_clone(dst), NULL, NULL);
+	}
+
+	skb_dst_set(skb, dst);
+	fl6.daddr = sk->sk_v6_daddr;
+
+	rcu_read_lock();
+	err = ip6_xmit(sk, skb, &fl6, sk->sk_mark, rcu_dereference(np->opt),
+		       np->tclass, sk->sk_priority);
+	rcu_read_unlock();
+	return err;
+}
+#endif
+
 /* Queue the packet to IP for output: tunnel socket lock must be held */
 static int l2tp_xmit_queue(struct l2tp_tunnel *tunnel, struct sk_buff *skb, struct flowi *fl)
 {
 	int err;
+	struct sock *sk = tunnel->sock;
 
 	skb->ignore_df = 1;
 	skb_dst_drop(skb);
 #if IS_ENABLED(CONFIG_IPV6)
-	if (l2tp_sk_is_v6(tunnel->sock))
-		err = inet6_csk_xmit(tunnel->sock, skb, NULL);
+	if (l2tp_sk_is_v6(sk))
+		err = l2tp_xmit_ipv6(sk, skb);
 	else
 #endif
-		err = ip_queue_xmit(tunnel->sock, skb, fl);
+		err = ip_queue_xmit(sk, skb, fl);
 
 	return err >= 0 ? NET_XMIT_SUCCESS : NET_XMIT_DROP;
 }
--
2.34.1
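Stripped of the networking details, the fix moves the socket onto the standard "validate the cache with a generation cookie, else look up and store" pattern that sk_dst_check() implements. A generic sketch (illustrative names and simplified single-threaded reference counting, not the kernel API):

```c
#include <stddef.h>

/* A cached route with a generation cookie and a reference count. */
struct route {
    int cookie;              /* generation when the route was stored */
    int refs;
};

struct sock_cache {
    struct route *cached;    /* stands in for sk->sk_dst_cache */
    int gen;                 /* current generation, cf. np->dst_cookie */
};

/* sk_dst_check()-style validation: return the cached route with a
 * reference taken, or clear the cache and return NULL if it is stale. */
static struct route *cache_check(struct sock_cache *s)
{
    struct route *r = s->cached;

    if (r && r->cookie == s->gen) {
        r->refs++;           /* the caller now owns a reference */
        return r;
    }
    s->cached = NULL;        /* drop the stale entry */
    return NULL;
}

/* Slow path, cf. ip6_dst_lookup_flow() + ip6_dst_store(): take "fresh"
 * as the lookup result, store it in the cache, and hand the caller a
 * referenced pointer. */
static struct route *cache_lookup(struct sock_cache *s, struct route *fresh)
{
    struct route *r = cache_check(s);

    if (!r) {
        fresh->cookie = s->gen;
        fresh->refs += 2;    /* one ref for the caller, one for the cache */
        s->cached = fresh;
        r = fresh;
    }
    return r;
}
```

The point of the pattern is that every reader goes through `cache_check()` and therefore always holds its own reference before touching the route; there is no unreferenced "peek" left anywhere, which is exactly what the buggy `__sk_dst_check()` path was doing.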
[PATCH OLK-6.6 0/2] mptcp: fix soft lockup in mptcp_recvmsg()
by Li Xiasong 17 Mar '26

Li Xiasong (1):
  mptcp: fix soft lockup in mptcp_recvmsg()

Paolo Abeni (1):
  mptcp: fix MSG_PEEK stream corruption

 net/mptcp/protocol.c | 44 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 11 deletions(-)
--
2.34.1
[PATCH OLK-5.10 0/3] arm64/mpam: Extending MPAM features and bugfix
by Quanmin Yan 17 Mar '26

Extend the MPAM features and fix bugs in OLK-5.10.

Quanmin Yan (3):
  arm64/mpam: Add quirk to shrink MATA PMG range
  arm64/mpam: Add quirk for MPAM MSMON_MBWU monitor NRDY bit
  arm64/mpam: Allow MBMIN to be set to 0

 arch/arm64/kernel/mpam/mpam_device.c  | 33 +++++++++++++++++++++++----
 arch/arm64/kernel/mpam/mpam_resctrl.c |  8 ++++---
 2 files changed, 34 insertions(+), 7 deletions(-)
--
2.43.0
[PATCH OLK-6.6] bfq: Lock when clearing the q->elevator entry
by Zizhi Wo 16 Mar '26

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/8717

--------------------------------

In the out_free error path of bfq_init_queue(), the last reference on eq->kobj is dropped, which frees eq. Before the caller blk_mq_init_sched() executes "q->elevator = NULL", another process can modify the blkcgroup interface (for example, disable iocost), which triggers wbt_enable_default() and makes it access the already-freed elevator: a use-after-free.

Take the queue lock when clearing the q->elevator entry to fix this, and clean up some redundant nulling.

Fixes: 671fae5e5129 ("blk-wbt: don't enable throttling if default elevator is bfq")
Signed-off-by: Zizhi Wo <wozizhi(a)huawei.com>
---
 block/bfq-iosched.c  | 3 +++
 block/blk-mq-sched.c | 4 ++--
 block/blk-wbt.c      | 2 ++
 block/elevator.c     | 1 -
 4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index a8ebf3962f11..a12f0196dc51 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -7417,6 +7417,9 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	return 0;
 
 out_free:
+	spin_lock_irq(&q->queue_lock);
+	q->elevator = NULL;
+	spin_unlock_irq(&q->queue_lock);
 	kfree(bfqd);
 	kobject_put(&eq->kobj);
 	return -ENOMEM;
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 7b48630b63a7..bb166403d311 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -499,8 +499,6 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 err_free_map_and_rqs:
 	blk_mq_sched_free_rqs(q);
 	blk_mq_sched_tags_teardown(q, flags);
-
-	q->elevator = NULL;
 	return ret;
 }
 
@@ -550,5 +548,7 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e)
 	if (e->type->ops.exit_sched)
 		e->type->ops.exit_sched(e);
 	blk_mq_sched_tags_teardown(q, flags);
+	spin_lock_irq(&q->queue_lock);
 	q->elevator = NULL;
+	spin_unlock_irq(&q->queue_lock);
 }
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 6b81f2c47279..71ca08deb485 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -732,9 +732,11 @@ void wbt_enable_default(struct gendisk *disk)
 	struct rq_qos *rqos;
 	bool enable = IS_ENABLED(CONFIG_BLK_WBT_MQ);
 
+	spin_lock_irq(&q->queue_lock);
 	if (q->elevator &&
 	    test_bit(ELEVATOR_FLAG_DISABLE_WBT, &q->elevator->flags))
 		enable = false;
+	spin_unlock_irq(&q->queue_lock);
 
 	/* Throttling already enabled? */
 	rqos = wbt_rq_qos(q);
diff --git a/block/elevator.c b/block/elevator.c
index ba072d8f660e..e80381765b3c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -704,7 +704,6 @@ void elevator_disable(struct request_queue *q)
 	elv_unregister_queue(q);
 	elevator_exit(q);
 	blk_queue_flag_clear(QUEUE_FLAG_SQ_SCHED, q);
-	q->elevator = NULL;
 	q->nr_requests = q->tag_set->queue_depth;
 
 	blk_add_trace_msg(q, "elv switch: none");
--
2.39.2
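The locking pattern of the patch, detached from the block layer, can be sketched in userspace with a pthread mutex standing in for q->queue_lock (all names here are illustrative):

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct elevator {
    unsigned long flags;     /* bit 0 stands in for ELEVATOR_FLAG_DISABLE_WBT */
};

struct queue {
    pthread_mutex_t lock;    /* stands in for q->queue_lock */
    struct elevator *elevator;
};

/* Reader side, as in wbt_enable_default() after the patch: the pointer
 * is tested and dereferenced entirely under the lock, so it cannot be
 * freed out from under us mid-check. */
static bool queue_wbt_disabled(struct queue *q)
{
    bool disabled = false;

    pthread_mutex_lock(&q->lock);
    if (q->elevator && (q->elevator->flags & 1UL))
        disabled = true;
    pthread_mutex_unlock(&q->lock);
    return disabled;
}

/* Writer side, as in the bfq_init_queue() error path after the patch:
 * detach under the lock, free only after unlocking.  Once this returns,
 * no reader can still observe the old pointer. */
static struct elevator *queue_clear_elevator(struct queue *q)
{
    struct elevator *old;

    pthread_mutex_lock(&q->lock);
    old = q->elevator;
    q->elevator = NULL;
    pthread_mutex_unlock(&q->lock);
    return old;              /* caller may now safely free it */
}
```

The bug existed precisely because the original code freed the elevator before nulling the pointer and read it without the lock; ordering the clear under the same lock the reader takes closes the window.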

Powered by HyperKitty