[PATCH OLK-6.6 v3 00/21] net: LTS-6.6 backport
net lts patches backport

Aakash Kumar S (1):
  xfrm: Duplicate SPI Handling

Bui Quang Minh (1):
  virtio-net: fix received length check in big packets

David Howells (1):
  rxrpc: Fix recv-recv race of completed call

Florian Westphal (1):
  netfilter: nf_reject: don't leak dst refcount for loopback packets

Jakub Kicinski (3):
  tls: handle data disappearing from under the TLS ULP
  tls: make sure to abort the stream if headers are bogus
  page_pool: always add GFP_NOWARN for ATOMIC allocations

Jiayuan Chen (1):
  bpf, ktls: Fix data corruption when using bpf_msg_pop_data() in ktls

Kuniyuki Iwashima (1):
  Bluetooth: hci_core: Fix use-after-free in vhci_flush()

Maher Azzouzi (1):
  net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing

Marc Kleine-Budde (4):
  can: dev: can_restart(): reverse logic to remove need for goto
  can: dev: can_restart(): move debug message and stats after successful restart
  can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
  can: netlink: can_changelink(): allow disabling of automatic restart

Michal Luczaj (3):
  vsock: Fix transport_{g2h,h2g} TOCTOU
  vsock: Fix transport_* TOCTOU
  vsock: Ignore signal/timeout on connect() if already established

Nikolay Aleksandrov (1):
  net: bridge: fix use-after-free due to MST port state bypass

Nishanth Menon (1):
  net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error

Sabrina Dubroca (1):
  xfrm: xfrm_alloc_spi shouldn't use 0 as SPI

William Liu (1):
  net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree

 drivers/net/can/dev/dev.c            | 31 +++++----
 drivers/net/can/dev/netlink.c        | 16 ++++-
 drivers/net/ethernet/ti/netcp_core.c | 10 +--
 drivers/net/virtio_net.c             | 25 ++++----
 drivers/soc/ti/knav_dma.c            | 14 ++--
 include/net/bluetooth/hci_core.h     |  1 +
 include/trace/events/rxrpc.h         |  3 +
 net/bluetooth/hci_core.c             | 34 ++++++++--
 net/bridge/br_forward.c              |  2 +-
 net/bridge/br_input.c                |  4 +-
 net/bridge/br_private.h              |  8 ++-
 net/core/page_pool.c                 |  6 ++
 net/ipv4/netfilter/nf_reject_ipv4.c  |  6 +-
 net/ipv6/netfilter/nf_reject_ipv6.c  |  5 +-
 net/rxrpc/call_accept.c              |  1 +
 net/rxrpc/recvmsg.c                  | 19 +++++-
 net/sched/sch_htb.c                  |  4 +-
 net/sched/sch_mqprio.c               |  2 +-
 net/tls/tls.h                        |  3 +-
 net/tls/tls_strp.c                   | 25 +++++---
 net/tls/tls_sw.c                     | 19 +++++-
 net/vmw_vsock/af_vsock.c             | 95 ++++++++++++++++++++++------
 net/xfrm/xfrm_state.c                | 75 +++++++++++++---------
 23 files changed, 286 insertions(+), 122 deletions(-)

-- 
2.25.1
Feedback: The patch(es) you sent to kernel@openeuler.org could not be converted to a PR.
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/2WC...
Failure reason: applying the patch(es) failed; patch failed at 0002 vsock: Fix transport_{g2h,h2g} TOCTOU
Suggested solution: review the failure reason and confirm that the patch(es) apply cleanly to the latest code on the target branch.
From: Kuniyuki Iwashima <kuniyu@google.com>

stable inclusion
from stable-v6.6.97
commit bc0819a25e04cd68ef3568cfa51b63118fea39a7
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...

--------------------------------

syzbot reported use-after-free in vhci_flush() without repro. [0]

From the splat, a thread close()d a vhci file descriptor while its device was being used by ioctl() on another thread.

Once the last fd refcnt is released, vhci_release() calls hci_unregister_dev(), hci_free_dev(), and kfree() for struct vhci_data, which is set to hci_dev->dev->driver_data.

The problem is that there is no synchronisation after unlinking hdev from hci_dev_list in hci_unregister_dev(). There might be another thread still accessing the hdev which was fetched before the unlink operation.

We can use SRCU for such synchronisation.

Let's run hci_dev_reset() under SRCU and wait for its completion in hci_unregister_dev().

Another option would be to restore hci_dev->destruct(), which was removed in commit 587ae086f6e4 ("Bluetooth: Remove unused hci-destruct cb"). However, this would not be a good solution, as we should not run hci_unregister_dev() while there are in-flight ioctl() requests, which could lead to another data-race KCSAN splat.

Note that other drivers seem to have the same problem, for example, virtbt_remove().
[0]: BUG: KASAN: slab-use-after-free in skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline] BUG: KASAN: slab-use-after-free in skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937 Read of size 8 at addr ffff88807cb8d858 by task syz.1.219/6718 CPU: 1 UID: 0 PID: 6718 Comm: syz.1.219 Not tainted 6.16.0-rc1-syzkaller-00196-g08207f42d3ff #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xd2/0x2b0 mm/kasan/report.c:521 kasan_report+0x118/0x150 mm/kasan/report.c:634 skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline] skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937 skb_queue_purge include/linux/skbuff.h:3368 [inline] vhci_flush+0x44/0x50 drivers/bluetooth/hci_vhci.c:69 hci_dev_do_reset net/bluetooth/hci_core.c:552 [inline] hci_dev_reset+0x420/0x5c0 net/bluetooth/hci_core.c:592 sock_do_ioctl+0xd9/0x300 net/socket.c:1190 sock_ioctl+0x576/0x790 net/socket.c:1311 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fcf5b98e929 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fcf5c7b9038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fcf5bbb6160 RCX: 00007fcf5b98e929 RDX: 0000000000000000 RSI: 00000000400448cb RDI: 0000000000000009 RBP: 00007fcf5ba10b39 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fcf5bbb6160 R15: 00007ffd6353d528 </TASK> 
Allocated by task 6535: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359 kmalloc_noprof include/linux/slab.h:905 [inline] kzalloc_noprof include/linux/slab.h:1039 [inline] vhci_open+0x57/0x360 drivers/bluetooth/hci_vhci.c:635 misc_open+0x2bc/0x330 drivers/char/misc.c:161 chrdev_open+0x4c9/0x5e0 fs/char_dev.c:414 do_dentry_open+0xdf0/0x1970 fs/open.c:964 vfs_open+0x3b/0x340 fs/open.c:1094 do_open fs/namei.c:3887 [inline] path_openat+0x2ee5/0x3830 fs/namei.c:4046 do_filp_open+0x1fa/0x410 fs/namei.c:4073 do_sys_openat2+0x121/0x1c0 fs/open.c:1437 do_sys_open fs/open.c:1452 [inline] __do_sys_openat fs/open.c:1468 [inline] __se_sys_openat fs/open.c:1463 [inline] __x64_sys_openat+0x138/0x170 fs/open.c:1463 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 6535: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x62/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2381 [inline] slab_free mm/slub.c:4643 [inline] kfree+0x18e/0x440 mm/slub.c:4842 vhci_release+0xbc/0xd0 drivers/bluetooth/hci_vhci.c:671 __fput+0x44c/0xa70 fs/file_table.c:465 task_work_run+0x1d1/0x260 kernel/task_work.c:227 exit_task_work include/linux/task_work.h:40 [inline] do_exit+0x6ad/0x22e0 kernel/exit.c:955 do_group_exit+0x21c/0x2d0 kernel/exit.c:1104 __do_sys_exit_group kernel/exit.c:1115 [inline] __se_sys_exit_group kernel/exit.c:1113 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1113 x64_sys_call+0x21ba/0x21c0 
arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff88807cb8d800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 88 bytes inside of freed 1024-byte region [ffff88807cb8d800, ffff88807cb8dc00) Fixes: bf18c7118cf8 ("Bluetooth: vhci: Free driver_data on file release") Reported-by: syzbot+2faa4825e556199361f9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f62d64848fc4c7c30cd6 Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Conflicts: include/net/bluetooth/hci_core.h [commit 5b70b93fc768 has been merged] Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- include/net/bluetooth/hci_core.h | 1 + net/bluetooth/hci_core.c | 34 ++++++++++++++++++++++++++++---- 2 files changed, 31 insertions(+), 4 deletions(-) diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index 3c653208c911..afb4acf8b11b 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -342,6 +342,7 @@ struct adv_monitor { struct hci_dev { struct list_head list; + struct srcu_struct srcu; struct mutex lock; struct ida unset_handle_ida; diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index fd05a43ab74b..f5294ee7094b 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -65,7 +65,7 @@ static DEFINE_IDA(hci_index_ida); /* Get HCI device by index. * Device is held on return. 
*/ -struct hci_dev *hci_dev_get(int index) +static struct hci_dev *__hci_dev_get(int index, int *srcu_index) { struct hci_dev *hdev = NULL, *d; @@ -78,6 +78,8 @@ struct hci_dev *hci_dev_get(int index) list_for_each_entry(d, &hci_dev_list, list) { if (d->id == index) { hdev = hci_dev_hold(d); + if (srcu_index) + *srcu_index = srcu_read_lock(&d->srcu); break; } } @@ -85,6 +87,22 @@ struct hci_dev *hci_dev_get(int index) return hdev; } +struct hci_dev *hci_dev_get(int index) +{ + return __hci_dev_get(index, NULL); +} + +static struct hci_dev *hci_dev_get_srcu(int index, int *srcu_index) +{ + return __hci_dev_get(index, srcu_index); +} + +static void hci_dev_put_srcu(struct hci_dev *hdev, int srcu_index) +{ + srcu_read_unlock(&hdev->srcu, srcu_index); + hci_dev_put(hdev); +} + /* ---- Inquiry support ---- */ bool hci_discovery_active(struct hci_dev *hdev) @@ -590,9 +608,9 @@ static int hci_dev_do_reset(struct hci_dev *hdev) int hci_dev_reset(__u16 dev) { struct hci_dev *hdev; - int err; + int err, srcu_index; - hdev = hci_dev_get(dev); + hdev = hci_dev_get_srcu(dev, &srcu_index); if (!hdev) return -ENODEV; @@ -614,7 +632,7 @@ int hci_dev_reset(__u16 dev) err = hci_dev_do_reset(hdev); done: - hci_dev_put(hdev); + hci_dev_put_srcu(hdev, srcu_index); return err; } @@ -2424,6 +2442,11 @@ struct hci_dev *hci_alloc_dev_priv(int sizeof_priv) if (!hdev) return NULL; + if (init_srcu_struct(&hdev->srcu)) { + kfree(hdev); + return NULL; + } + hdev->pkt_type = (HCI_DM1 | HCI_DH1 | HCI_HV1); hdev->esco_type = (ESCO_HV1); hdev->link_mode = (HCI_LM_ACCEPT); @@ -2670,6 +2693,9 @@ void hci_unregister_dev(struct hci_dev *hdev) list_del(&hdev->list); write_unlock(&hci_dev_list_lock); + synchronize_srcu(&hdev->srcu); + cleanup_srcu_struct(&hdev->srcu); + disable_work_sync(&hdev->rx_work); disable_work_sync(&hdev->cmd_work); disable_work_sync(&hdev->tx_work); -- 2.25.1
From: Michal Luczaj <mhal@rbox.co> stable inclusion from stable-v6.6.99 commit 401239811fa728fcdd53e360a91f157ffd23e1f4 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 209fd720838aaf1420416494c5505096478156b4 ] vsock_find_cid() and vsock_dev_do_ioctl() may race with module unload. transport_{g2h,h2g} may become NULL after the NULL check. Introduce vsock_transport_local_cid() to protect from a potential null-ptr-deref. KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f] RIP: 0010:vsock_find_cid+0x47/0x90 Call Trace: __vsock_bind+0x4b2/0x720 vsock_bind+0x90/0xe0 __sys_bind+0x14d/0x1e0 __x64_sys_bind+0x6e/0xc0 do_syscall_64+0x92/0x1c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f] RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0 Call Trace: __x64_sys_ioctl+0x12d/0x190 do_syscall_64+0x92/0x1c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250703-vsock-transports-toctou-v4-1-98f0eb530747@... 
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/vmw_vsock/af_vsock.c | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index f8f1a49689da..dc62b30c0be5 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -530,9 +530,25 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) } EXPORT_SYMBOL_GPL(vsock_assign_transport); +/* + * Provide safe access to static transport_{h2g,g2h,dgram,local} callbacks. + * Otherwise we may race with module removal. Do not use on `vsk->transport`. + */ +static u32 vsock_registered_transport_cid(const struct vsock_transport **transport) +{ + u32 cid = VMADDR_CID_ANY; + + mutex_lock(&vsock_register_mutex); + if (*transport) + cid = (*transport)->get_local_cid(); + mutex_unlock(&vsock_register_mutex); + + return cid; +} + bool vsock_find_cid(unsigned int cid) { - if (transport_g2h && cid == transport_g2h->get_local_cid()) + if (cid == vsock_registered_transport_cid(&transport_g2h)) return true; if (transport_h2g && cid == VMADDR_CID_HOST) @@ -2391,18 +2407,17 @@ static long vsock_dev_do_ioctl(struct file *filp, unsigned int cmd, void __user *ptr) { u32 __user *p = ptr; - u32 cid = VMADDR_CID_ANY; int retval = 0; + u32 cid; switch (cmd) { case IOCTL_VM_SOCKETS_GET_LOCAL_CID: /* To be compatible with the VMCI behavior, we prioritize the * guest CID instead of well-know host CID (VMADDR_CID_HOST). */ - if (transport_g2h) - cid = transport_g2h->get_local_cid(); - else if (transport_h2g) - cid = transport_h2g->get_local_cid(); + cid = vsock_registered_transport_cid(&transport_g2h); + if (cid == VMADDR_CID_ANY) + cid = vsock_registered_transport_cid(&transport_h2g); if (put_user(cid, p) != 0) retval = -EFAULT; -- 2.25.1
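The check-then-use pattern fixed above can be illustrated outside the kernel. The following is only a minimal userspace sketch, not the kernel code: it assumes POSIX threads, and `struct transport`, `register_mutex`, `registered_cid()`, `set_transport()`, `CID_ANY`, and `demo_transport` are hypothetical stand-ins chosen for illustration. It shows the NULL check and the callback invocation happening under the same mutex that writers take when they register or clear the pointer, so the pointer cannot become NULL between the two steps.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no transport registered". */
#define CID_ANY UINT32_MAX

/* Hypothetical stand-in for a transport with a CID callback. */
struct transport {
	uint32_t (*get_local_cid)(void);
};

/* Writers (register/unregister) and readers take the same mutex. */
static pthread_mutex_t register_mutex = PTHREAD_MUTEX_INITIALIZER;
static const struct transport *transport_g2h;

/* Check and use the slot under the lock, so it cannot change in between. */
static uint32_t registered_cid(const struct transport **slot)
{
	uint32_t cid = CID_ANY;

	pthread_mutex_lock(&register_mutex);
	if (*slot)
		cid = (*slot)->get_local_cid();
	pthread_mutex_unlock(&register_mutex);

	return cid;
}

/* Register (t != NULL) or unregister (t == NULL) a transport. */
static void set_transport(const struct transport **slot,
			  const struct transport *t)
{
	pthread_mutex_lock(&register_mutex);
	*slot = t;
	pthread_mutex_unlock(&register_mutex);
}

/* Illustrative transport that always reports CID 3. */
static uint32_t demo_cid(void)
{
	return 3;
}

static const struct transport demo_transport = {
	.get_local_cid = demo_cid,
};
```

In this sketch a caller never sees a half-unregistered transport: it gets either a valid CID or the `CID_ANY` sentinel, which mirrors how the patched ioctl path falls back from one transport to the other.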
From: Michal Luczaj <mhal@rbox.co> stable inclusion from stable-v6.6.99 commit 9d24bb6780282b0255b9929abe5e8f98007e2c6e category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 687aa0c5581b8d4aa87fd92973e4ee576b550cdf ] Transport assignment may race with module unload. Protect new_transport from becoming a stale pointer. This also takes care of an insecure call in vsock_use_local_transport(); add a lockdep assert. BUG: unable to handle page fault for address: fffffbfff8056000 Oops: Oops: 0000 [#1] SMP KASAN RIP: 0010:vsock_assign_transport+0x366/0x600 Call Trace: vsock_connect+0x59c/0xc40 __sys_connect+0xe8/0x100 __x64_sys_connect+0x6e/0xc0 do_syscall_64+0x92/0x1c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250703-vsock-transports-toctou-v4-2-98f0eb530747@... 
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/vmw_vsock/af_vsock.c | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index dc62b30c0be5..58b7404a0da0 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -406,6 +406,8 @@ EXPORT_SYMBOL_GPL(vsock_enqueue_accept); static bool vsock_use_local_transport(unsigned int remote_cid) { + lockdep_assert_held(&vsock_register_mutex); + if (!transport_local) return false; @@ -463,6 +465,8 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) remote_flags = vsk->remote_addr.svm_flags; + mutex_lock(&vsock_register_mutex); + switch (sk->sk_type) { case SOCK_DGRAM: new_transport = transport_dgram; @@ -478,12 +482,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) new_transport = transport_h2g; break; default: - return -ESOCKTNOSUPPORT; + ret = -ESOCKTNOSUPPORT; + goto err; } if (vsk->transport) { - if (vsk->transport == new_transport) - return 0; + if (vsk->transport == new_transport) { + ret = 0; + goto err; + } /* transport->release() must be called with sock lock acquired. * This path can only be taken during vsock_connect(), where we @@ -507,8 +514,16 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) /* We increase the module refcnt to prevent the transport unloading * while there are open sockets assigned to it. */ - if (!new_transport || !try_module_get(new_transport->module)) - return -ENODEV; + if (!new_transport || !try_module_get(new_transport->module)) { + ret = -ENODEV; + goto err; + } + + /* It's safe to release the mutex after a successful try_module_get(). + * Whichever transport `new_transport` points at, it won't go away until + * the last module_put() below or in vsock_deassign_transport(). 
+ */ + mutex_unlock(&vsock_register_mutex); if (sk->sk_type == SOCK_SEQPACKET) { if (!new_transport->seqpacket_allow || @@ -527,6 +542,9 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) vsk->transport = new_transport; return 0; +err: + mutex_unlock(&vsock_register_mutex); + return ret; } EXPORT_SYMBOL_GPL(vsock_assign_transport); -- 2.25.1
From: William Liu <will@willsroot.io>

stable inclusion
from stable-v6.6.100
commit 7ff2d83ecf2619060f30ecf9fad4f2a700fca344
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...

--------------------------------

[ Upstream commit 0e1d5d9b5c5966e2e42e298670808590db5ed628 ]

htb_lookup_leaf has a BUG_ON that can trigger with the following:

tc qdisc del dev lo root
tc qdisc add dev lo root handle 1: htb default 1
tc class add dev lo parent 1: classid 1:1 htb rate 64bit
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2:1 handle 3: blackhole
ping -I lo -c1 -W0.001 127.0.0.1

The root cause is the following:

1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on
   the selected leaf qdisc
2. netem_dequeue calls enqueue on the child qdisc
3. blackhole_enqueue drops the packet and returns a value that is not
   just NET_XMIT_SUCCESS
4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and
   since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate ->
   htb_deactivate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase
5. As this is the only class in the selected hprio rbtree,
   __rb_change_child in __rb_erase_augmented sets the rb_root pointer to
   NULL
6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL,
   which causes htb_dequeue_tree to call htb_lookup_leaf with the same
   hprio rbtree, and fail the BUG_ON

The function graph for this scenario is shown here:
 0)               |  htb_enqueue() {
 0) + 13.635 us   |    netem_enqueue();
 0)   4.719 us    |    htb_activate_prios();
 0) # 2249.199 us |  }
 0)               |  htb_dequeue() {
 0)   2.355 us    |    htb_lookup_leaf();
 0)               |    netem_dequeue() {
 0) + 11.061 us   |      blackhole_enqueue();
 0)               |      qdisc_tree_reduce_backlog() {
 0)               |        qdisc_lookup_rcu() {
 0)   1.873 us    |          qdisc_match_from_root();
 0)   6.292 us    |        }
 0)   1.894 us    |        htb_search();
 0)               |        htb_qlen_notify() {
 0)   2.655 us    |          htb_deactivate_prios();
 0)   6.933 us    |        }
 0) + 25.227 us   |      }
 0)   1.983 us    |      blackhole_dequeue();
 0) + 86.553 us   |    }
 0) # 2932.761 us |    qdisc_warn_nonwc();
 0)               |    htb_lookup_leaf() {
 0)               |      BUG_ON();
 ------------------------------------------

The full original bug report can be seen here [1].

We can fix this just by returning NULL instead of the BUG_ON, as htb_dequeue_tree returns NULL when htb_lookup_leaf returns NULL.

[1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4...
Fixes: 512bb43eb542 ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.") Signed-off-by: William Liu <will@willsroot.io> Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io> Link: https://patch.msgid.link/20250717022816.221364-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/sched/sch_htb.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 02674a2e6e56..5cb93f8c2b5b 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -821,7 +821,9 @@ static struct htb_class *htb_lookup_leaf(struct htb_prio *hprio, const int prio) u32 *pid; } stk[TC_HTB_MAXDEPTH], *sp = stk; - BUG_ON(!hprio->row.rb_node); + if (unlikely(!hprio->row.rb_node)) + return NULL; + sp->root = hprio->row.rb_node; sp->pptr = &hprio->ptr; sp->pid = &hprio->last_ptr_id; -- 2.25.1
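The fix above replaces a hard assertion with a NULL return that the caller already tolerates. As a rough userspace illustration of that pattern (not the HTB code itself), the sketch below uses hypothetical names throughout: `struct node`, `struct row`, `lookup_leaf()`, and `dequeue_id()` are inventions for this example, and the "tree" is simplified to a chain of left pointers.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for an rbtree node and its root. */
struct node {
	int id;
	struct node *left;
};

struct row {
	struct node *root;
};

/* Return the leftmost node, or NULL if the tree has been emptied. */
static struct node *lookup_leaf(const struct row *row)
{
	struct node *n = row->root;

	/* An empty tree is a legal state here, not a fatal error. */
	if (!n)
		return NULL;

	while (n->left)
		n = n->left;
	return n;
}

/* Caller treats "no leaf" as "nothing to dequeue" and reports -1. */
static int dequeue_id(const struct row *row)
{
	struct node *leaf = lookup_leaf(row);

	if (!leaf)
		return -1;
	return leaf->id;
}
```

The design point is that the emptiness check lives in the lookup helper while the recovery decision lives in the caller, so a tree that was emptied between two calls degrades into an ordinary "nothing found" result.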
From: David Howells <dhowells@redhat.com> stable inclusion from stable-v6.6.100 commit 7692bde890061797f3dece0148d7859e85c55778 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 962fb1f651c2cf2083e0c3ef53ba69e3b96d3fbc ] If a call receives an event (such as incoming data), the call gets placed on the socket's queue and a thread in recvmsg can be awakened to go and process it. Once the thread has picked up the call off of the queue, further events will cause it to be requeued, and once the socket lock is dropped (recvmsg uses call->user_mutex to allow the socket to be used in parallel), a second thread can come in and its recvmsg can pop the call off the socket queue again. In such a case, the first thread will be receiving stuff from the call and the second thread will be blocked on call->user_mutex. The first thread can, at this point, process both the event that it picked call for and the event that the second thread picked the call for and may see the call terminate - in which case the call will be "released", decoupling the call from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control message). The first thread will return okay, but then the second thread will wake up holding the user_mutex and, if it sees that the call has been released by the first thread, it will BUG thusly: kernel BUG at net/rxrpc/recvmsg.c:474! Fix this by just dequeuing the call and ignoring it if it is seen to be already released. We can't tell userspace about it anyway as the user call ID has become stale. 
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- include/trace/events/rxrpc.h | 3 +++ net/rxrpc/call_accept.c | 1 + net/rxrpc/recvmsg.c | 19 +++++++++++++++++-- 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h index e7c7b6389436..743f8f1f42a7 100644 --- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -278,12 +278,15 @@ EM(rxrpc_call_put_userid, "PUT user-id ") \ EM(rxrpc_call_see_accept, "SEE accept ") \ EM(rxrpc_call_see_activate_client, "SEE act-clnt") \ + EM(rxrpc_call_see_already_released, "SEE alrdy-rl") \ EM(rxrpc_call_see_connect_failed, "SEE con-fail") \ EM(rxrpc_call_see_connected, "SEE connect ") \ EM(rxrpc_call_see_conn_abort, "SEE conn-abt") \ + EM(rxrpc_call_see_discard, "SEE discard ") \ EM(rxrpc_call_see_disconnected, "SEE disconn ") \ EM(rxrpc_call_see_distribute_error, "SEE dist-err") \ EM(rxrpc_call_see_input, "SEE input ") \ + EM(rxrpc_call_see_recvmsg, "SEE recvmsg ") \ EM(rxrpc_call_see_release, "SEE release ") \ EM(rxrpc_call_see_userid_exists, "SEE u-exists") \ EM(rxrpc_call_see_waiting_call, "SEE q-conn ") \ diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c index 0f5a1d77b890..413e69c6dc1a 100644 --- a/net/rxrpc/call_accept.c +++ b/net/rxrpc/call_accept.c @@ -218,6 +218,7 @@ void rxrpc_discard_prealloc(struct rxrpc_sock *rx) tail = b->call_backlog_tail; while (CIRC_CNT(head, 
tail, size) > 0) { struct rxrpc_call *call = b->call_backlog[tail]; + rxrpc_see_call(call, rxrpc_call_see_discard); rcu_assign_pointer(call->socket, rx); if (rx->discard_new_call) { _debug("discard %lx", call->user_call_ID); diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c index a482f88c5fc5..e24a44bae9a3 100644 --- a/net/rxrpc/recvmsg.c +++ b/net/rxrpc/recvmsg.c @@ -351,6 +351,16 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, goto try_again; } + rxrpc_see_call(call, rxrpc_call_see_recvmsg); + if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) { + rxrpc_see_call(call, rxrpc_call_see_already_released); + list_del_init(&call->recvmsg_link); + spin_unlock_irq(&rx->recvmsg_lock); + release_sock(&rx->sk); + trace_rxrpc_recvmsg(call->debug_id, rxrpc_recvmsg_unqueue, 0); + rxrpc_put_call(call, rxrpc_call_put_recvmsg); + goto try_again; + } if (!(flags & MSG_PEEK)) list_del_init(&call->recvmsg_link); else @@ -374,8 +384,13 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, release_sock(&rx->sk); - if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) - BUG(); + if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) { + rxrpc_see_call(call, rxrpc_call_see_already_released); + mutex_unlock(&call->user_mutex); + if (!(flags & MSG_PEEK)) + rxrpc_put_call(call, rxrpc_call_put_recvmsg); + goto try_again; + } if (test_bit(RXRPC_CALL_HAS_USERID, &call->flags)) { if (flags & MSG_CMSG_COMPAT) { -- 2.25.1
From: Jiayuan Chen <jiayuan.chen@linux.dev> stable inclusion from stable-v6.6.102 commit 0e853c1464bcf61207f8b5c32d2ac5ee495e859d category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 178f6a5c8cb3b6be1602de0964cd440243f493c9 ] When sending plaintext data, we initially calculated the corresponding ciphertext length. However, if we later reduced the plaintext data length via socket policy, we failed to recalculate the ciphertext length. This results in transmitting buffers containing uninitialized data during ciphertext transmission. This causes uninitialized bytes to be appended after a complete "Application Data" packet, leading to errors on the receiving end when parsing TLS record. Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/20250609020910.397930-2-jiayuan.chen@linux.dev Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/tls/tls_sw.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index b28e87371fa1..d1cf3a12cf67 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -872,6 +872,19 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk, delta = msg->sg.size; psock->eval = sk_psock_msg_verdict(sk, psock, msg); delta -= msg->sg.size; + + if ((s32)delta > 0) { + /* It indicates that we executed bpf_msg_pop_data(), + * causing the plaintext data size to decrease. + * Therefore the encrypted data size also needs to + * correspondingly decrease. 
We only need to subtract + * delta to calculate the new ciphertext length since + * ktls does not support block encryption. + */ + struct sk_msg *enc = &ctx->open_rec->msg_encrypted; + + sk_msg_trim(sk, enc, enc->sg.size - delta); + } } if (msg->cork_bytes && msg->cork_bytes > msg->sg.size && !enospc && !full_record) { -- 2.25.1
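The length adjustment described in this patch is plain arithmetic and can be sketched in userspace. The helper below is an illustrative assumption, not kernel code: `trimmed_enc_len()` and its parameters are hypothetical names. It relies on the property stated in the patch comment, that ktls does not use block encryption, so removing N plaintext bytes shortens the ciphertext by exactly N bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper: given the ciphertext length computed for the
 * original plaintext, and the plaintext size before and after a policy
 * program ran, return the ciphertext length that should be transmitted.
 */
static size_t trimmed_enc_len(size_t enc_len, size_t plain_before,
			      size_t plain_after)
{
	size_t delta;

	/* Plaintext did not shrink: nothing to trim. */
	if (plain_after >= plain_before)
		return enc_len;

	delta = plain_before - plain_after;

	/* Defensive clamp for this sketch; avoids unsigned underflow. */
	if (delta > enc_len)
		return 0;

	/* No block padding assumed, so lengths shrink one-for-one. */
	return enc_len - delta;
}
```

For instance, with a 100-byte plaintext and a 116-byte ciphertext buffer, popping 10 plaintext bytes yields a 106-byte ciphertext length in this sketch. The 16-byte overhead in that example is an illustrative figure only.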
From: Maher Azzouzi <maherazz04@gmail.com> stable inclusion from stable-v6.6.102 commit 39491e859fd494d0b51adc5c7d54c8a7dcf1d198 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit ffd2dc4c6c49ff4f1e5d34e454a6a55608104c17 ] TCA_MQPRIO_TC_ENTRY_INDEX is validated using NLA_POLICY_MAX(NLA_U32, TC_QOPT_MAX_QUEUE), which allows the value TC_QOPT_MAX_QUEUE (16). This leads to a 4-byte out-of-bounds stack write in the fp[] array, which only has room for 16 elements (0–15). Fix this by changing the policy to allow only up to TC_QOPT_MAX_QUEUE - 1. Fixes: f62af20bed2d ("net/sched: mqprio: allow per-TC user input of FP adminStatus") Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Maher Azzouzi <maherazz04@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250802001857.2702497-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/sched/sch_mqprio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index 793009f445c0..a0e3f3bae536 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -152,7 +152,7 @@ static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt, static const struct nla_policy mqprio_tc_entry_policy[TCA_MQPRIO_TC_ENTRY_MAX + 1] = { [TCA_MQPRIO_TC_ENTRY_INDEX] = NLA_POLICY_MAX(NLA_U32, - TC_QOPT_MAX_QUEUE), + TC_QOPT_MAX_QUEUE - 1), [TCA_MQPRIO_TC_ENTRY_FP] = NLA_POLICY_RANGE(NLA_U32, TC_FP_EXPRESS, TC_FP_PREEMPTIBLE), -- 2.25.1
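The off-by-one described above is easy to reproduce in plain C. The sketch below is only an illustration under stated assumptions: `MAX_QUEUE`, `index_ok_old()`, `index_ok_fixed()`, and `set_fp()` are hypothetical names, with `MAX_QUEUE` standing in for TC_QOPT_MAX_QUEUE (16). It models an inclusive maximum check, as the commit message describes for NLA_POLICY_MAX, guarding a 16-element array.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_QUEUE 16 /* hypothetical stand-in for TC_QOPT_MAX_QUEUE */

/* Inclusive bound before the fix: accepts 0..16, one past the array. */
static bool index_ok_old(uint32_t idx)
{
	return idx <= MAX_QUEUE;
}

/* Inclusive bound after the fix: accepts 0..15 only. */
static bool index_ok_fixed(uint32_t idx)
{
	return idx <= MAX_QUEUE - 1;
}

/* Write fp[idx] only when the fixed check accepts the index. */
static bool set_fp(uint32_t fp[MAX_QUEUE], uint32_t idx, uint32_t val)
{
	if (!index_ok_fixed(idx))
		return false;

	fp[idx] = val;
	return true;
}
```

With the old check, index 16 passes validation and a later `fp[16] = val` would write 4 bytes past the end of the array; with the fixed check the same request is rejected before any write happens.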
From: Marc Kleine-Budde <mkl@pengutronix.de> stable inclusion from stable-v6.6.101 commit 71a2dc442e123152cc7ab0d9c1fa4dc155760a01 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 8f3ec204d340af183fb2bb21b8e797ac2ed012b2 ] Reverse the logic in the if statement and eliminate the need for a goto to simplify code readability. Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-3-91b5c1fd92... Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Stable-dep-of: c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- drivers/net/can/dev/dev.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c index 7d9a7c92d4cf..6c1ceb8ce6c4 100644 --- a/drivers/net/can/dev/dev.c +++ b/drivers/net/can/dev/dev.c @@ -142,14 +142,11 @@ static void can_restart(struct net_device *dev) /* send restart message upstream */ skb = alloc_can_err_skb(dev, &cf); - if (!skb) - goto restart; - - cf->can_id |= CAN_ERR_RESTARTED; - - netif_rx(skb); + if (skb) { + cf->can_id |= CAN_ERR_RESTARTED; + netif_rx(skb); + } -restart: netdev_dbg(dev, "restarted\n"); priv->can_stats.restarts++; -- 2.25.1
From: Marc Kleine-Budde <mkl@pengutronix.de> stable inclusion from stable-v6.6.101 commit 359492c2024ad39ccdfaacb7cbd55162db5cd58c category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit f0e0c809c0be05fe865b9ac128ef3ee35c276021 ] Move the debug message "restarted" and the CAN restart stats _after_ the successful restart of the CAN device, because the restart may fail. While at it, update the error message from printing the error number to printing symbolic error names. Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-4-91b5c1fd92... Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> [mkl: mention stats in subject and description, too] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Stable-dep-of: c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- drivers/net/can/dev/dev.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c index 6c1ceb8ce6c4..88f345966c99 100644 --- a/drivers/net/can/dev/dev.c +++ b/drivers/net/can/dev/dev.c @@ -147,15 +147,15 @@ static void can_restart(struct net_device *dev) netif_rx(skb); } - netdev_dbg(dev, "restarted\n"); - priv->can_stats.restarts++; - /* Now restart the device */ netif_carrier_on(dev); err = priv->do_set_mode(dev, CAN_MODE_START); if (err) { - netdev_err(dev, "Error %d during restart", err); + netdev_err(dev, "Restart failed, error %pe\n", ERR_PTR(err)); netif_carrier_off(dev); + } else { + netdev_dbg(dev, "Restarted\n"); + priv->can_stats.restarts++; } } -- 2.25.1
From: Marc Kleine-Budde <mkl@pengutronix.de> stable inclusion from stable-v6.6.101 commit cf81a60a973358dea163f6b14062f17831ceb894 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit c1f3f9797c1f44a762e6f5f72520b2e520537b52 ] Andrei Lalaev reported a NULL pointer deref when a CAN device is restarted from Bus Off and the driver does not implement the struct can_priv::do_set_mode callback. There are two code paths that call struct can_priv::do_set_mode: - directly by a manual restart from user space, via can_changelink() - delayed automatic restart after bus off (deactivated by default) To prevent the NULL pointer dereference, refuse both a manual restart and the configuration of the automatic restart delay in can_changelink() and report the error via extack to user space. As an additional safety measure, let can_restart() return an error if can_priv::do_set_mode is not set instead of dereferencing it unchecked. Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com> Closes: https://lore.kernel.org/all/20250714175520.307467-1-andrey.lalaev@gmail.com Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface") Link: https://patch.msgid.link/20250718-fix-nullptr-deref-do_set_mode-v1-1-0b52009... 
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- drivers/net/can/dev/dev.c | 12 +++++++++--- drivers/net/can/dev/netlink.c | 12 ++++++++++++ 2 files changed, 21 insertions(+), 3 deletions(-) diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c index 88f345966c99..6018444a76b9 100644 --- a/drivers/net/can/dev/dev.c +++ b/drivers/net/can/dev/dev.c @@ -125,13 +125,16 @@ void can_change_state(struct net_device *dev, struct can_frame *cf, EXPORT_SYMBOL_GPL(can_change_state); /* CAN device restart for bus-off recovery */ -static void can_restart(struct net_device *dev) +static int can_restart(struct net_device *dev) { struct can_priv *priv = netdev_priv(dev); struct sk_buff *skb; struct can_frame *cf; int err; + if (!priv->do_set_mode) + return -EOPNOTSUPP; + if (netif_carrier_ok(dev)) netdev_err(dev, "Attempt to restart for bus-off recovery, but carrier is OK?\n"); @@ -153,10 +156,14 @@ static void can_restart(struct net_device *dev) if (err) { netdev_err(dev, "Restart failed, error %pe\n", ERR_PTR(err)); netif_carrier_off(dev); + + return err; } else { netdev_dbg(dev, "Restarted\n"); priv->can_stats.restarts++; } + + return 0; } static void can_restart_work(struct work_struct *work) @@ -181,9 +188,8 @@ int can_restart_now(struct net_device *dev) return -EBUSY; cancel_delayed_work_sync(&priv->restart_work); - can_restart(dev); - return 0; + return can_restart(dev); } /* CAN bus-off diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c index 01aacdcda260..abe8dc051d94 100644 --- a/drivers/net/can/dev/netlink.c +++ b/drivers/net/can/dev/netlink.c @@ -285,6 +285,12 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[], } if (data[IFLA_CAN_RESTART_MS]) { + if (!priv->do_set_mode) { + NL_SET_ERR_MSG(extack, + "Device doesn't support restart from Bus Off"); + return -EOPNOTSUPP; + } + /* Do not allow changing 
restart delay while running */ if (dev->flags & IFF_UP) return -EBUSY; @@ -292,6 +298,12 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[], } if (data[IFLA_CAN_RESTART]) { + if (!priv->do_set_mode) { + NL_SET_ERR_MSG(extack, + "Device doesn't support restart from Bus Off"); + return -EOPNOTSUPP; + } + /* Do not allow a restart while not running */ if (!(dev->flags & IFF_UP)) return -EINVAL; -- 2.25.1
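The guard added to can_restart() follows a common pattern for optional driver callbacks. A minimal user-space sketch of that pattern, assuming a simplified, hypothetical `struct fake_can_priv` (the real struct can_priv has many more members and its callback takes a net_device and a mode enum):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct can_priv. */
struct fake_can_priv {
	int (*do_set_mode)(int mode);	/* optional callback, may be NULL */
	unsigned int restarts;
};

/* Hypothetical driver callback that always succeeds. */
static int fake_set_mode_ok(int mode)
{
	(void)mode;
	return 0;
}

/* Mirrors the control flow of the patched can_restart(): bail out
 * before touching the callback if the driver did not provide one,
 * and only count the restart once the callback has succeeded.
 */
static int fake_can_restart(struct fake_can_priv *priv)
{
	int err;

	if (!priv->do_set_mode)
		return -EOPNOTSUPP;

	err = priv->do_set_mode(1);
	if (err)
		return err;

	priv->restarts++;
	return 0;
}
```

Returning the error instead of void is what lets can_restart_now() propagate the failure to the netlink caller.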
From: Marc Kleine-Budde <mkl@pengutronix.de> mainline inclusion from mainline-v6.18-rc3 commit 8e93ac51e4c6dc399fad59ec21f55f2cfb46d27c category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... ------------------------------------------ Since commit c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode"), the automatic restart delay can only be set for devices that implement the restart handler struct can_priv::do_set_mode, as it makes no sense to configure an automatic restart for devices that don't support it. However, since systemd commit 13ce5d4632e3 ("network/can: properly handle CAN.RestartSec=0") [1], systemd-networkd correctly handles a restart delay of "0" (i.e. the restart is disabled). This means that a disabled restart is always configured in the kernel. On systems with both changes active, CAN interfaces that don't implement a restart handler therefore cannot be brought up by systemd-networkd. Solve this problem by allowing a delay of "0" to be configured, even if the device does not implement a restart handler. [1] https://github.com/systemd/systemd/commit/13ce5d4632e395521e6205c954493c7fc1... Cc: stable@vger.kernel.org Cc: Andrei Lalaev <andrey.lalaev@gmail.com> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Closes: https://lore.kernel.org/all/20251020-certain-arrogant-vole-of-sunshine-14184... Fixes: c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode") Link: https://patch.msgid.link/20251020-netlink-fix-restart-v1-1-3f53c7f8520b@peng... 
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- drivers/net/can/dev/netlink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c index abe8dc051d94..77d165ed0d53 100644 --- a/drivers/net/can/dev/netlink.c +++ b/drivers/net/can/dev/netlink.c @@ -285,7 +285,9 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[], } if (data[IFLA_CAN_RESTART_MS]) { - if (!priv->do_set_mode) { + unsigned int restart_ms = nla_get_u32(data[IFLA_CAN_RESTART_MS]); + + if (restart_ms != 0 && !priv->do_set_mode) { NL_SET_ERR_MSG(extack, "Device doesn't support restart from Bus Off"); return -EOPNOTSUPP; @@ -294,7 +296,7 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[], /* Do not allow changing restart delay while running */ if (dev->flags & IFF_UP) return -EBUSY; - priv->restart_ms = nla_get_u32(data[IFLA_CAN_RESTART_MS]); + priv->restart_ms = restart_ms; } if (data[IFLA_CAN_RESTART]) { -- 2.25.1
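The resulting acceptance rule for IFLA_CAN_RESTART_MS can be summarized as a small decision function. This is only a sketch: `check_restart_ms()` and its boolean parameters are hypothetical and stand in for the priv->do_set_mode and IFF_UP tests in can_changelink().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical condensation of the IFLA_CAN_RESTART_MS checks:
 * - a non-zero delay needs a restart handler,
 * - a delay of 0 (restart disabled) is always acceptable,
 * - the delay must not change while the interface is up.
 * Returns 0 if the value would be stored, a negative errno otherwise.
 */
static int check_restart_ms(unsigned int restart_ms, bool has_do_set_mode,
			    bool iface_up)
{
	if (restart_ms != 0 && !has_do_set_mode)
		return -EOPNOTSUPP;

	if (iface_up)
		return -EBUSY;

	return 0;
}
```

The order of the checks matches the patch: the handler test comes first, so a device without a handler still gets -EOPNOTSUPP for a non-zero delay even while it is up.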
From: Jakub Kicinski <kuba@kernel.org> stable inclusion from stable-v6.6.103 commit f1fe99919f629f980d0b8a7ff16950bffe06a859 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 6db015fc4b5d5f63a64a193f65d98da3a7fc811d ] TLS expects that it owns the receive queue of the TCP socket. This cannot be guaranteed in case the reader of the TCP socket entered before the TLS ULP was installed, or uses some non-standard read API (eg. zerocopy ones). Replace the WARN_ON() and a buggy early exit (which leaves anchor pointing to a freed skb) with real error handling. Wipe the parsing state and tell the reader to retry. We already reload the anchor every time we (re)acquire the socket lock, so the only condition we need to avoid is an out of bounds read (not having enough bytes in the socket for previously parsed record len). If some data was read from under TLS but there's enough in the queue we'll reload and decrypt what is most likely not a valid TLS record. Leading to some undefined behavior from TLS perspective (corrupting a stream? missing an alert? missing an attack?) but no kernel crash should take place. Reported-by: William Liu <will@willsroot.io> Reported-by: Savino Dicanosa <savy@syst3mfailure.io> Link: https://lore.kernel.org/tFjq_kf7sWIG3A7CrCg_egb8CVsT_gsmHAK0_wxDPJXfIzxFAMxq... 
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250807232907.600366-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/tls/tls.h | 2 +- net/tls/tls_strp.c | 11 ++++++++--- net/tls/tls_sw.c | 3 ++- 3 files changed, 11 insertions(+), 5 deletions(-) diff --git a/net/tls/tls.h b/net/tls/tls.h index 02038d0381b7..5dc61c85c076 100644 --- a/net/tls/tls.h +++ b/net/tls/tls.h @@ -192,7 +192,7 @@ void tls_strp_msg_done(struct tls_strparser *strp); int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb); void tls_rx_msg_ready(struct tls_strparser *strp); -void tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh); +bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh); int tls_strp_msg_cow(struct tls_sw_context_rx *ctx); struct sk_buff *tls_strp_msg_detach(struct tls_sw_context_rx *ctx); int tls_strp_msg_hold(struct tls_strparser *strp, struct sk_buff_head *dst); diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c index bea60b0160d1..6ce64a6e4495 100644 --- a/net/tls/tls_strp.c +++ b/net/tls/tls_strp.c @@ -474,7 +474,7 @@ static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len) strp->stm.offset = offset; } -void tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh) +bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh) { struct strp_msg *rxm; struct tls_msg *tlm; @@ -483,8 +483,11 @@ void tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh) DEBUG_NET_WARN_ON_ONCE(!strp->stm.full_len); if (!strp->copy_mode && force_refresh) { - if (WARN_ON(tcp_inq(strp->sk) < strp->stm.full_len)) - return; + if (unlikely(tcp_inq(strp->sk) < strp->stm.full_len)) { + WRITE_ONCE(strp->msg_ready, 0); + memset(&strp->stm, 0, sizeof(strp->stm)); + return false; + } 
tls_strp_load_anchor_with_queue(strp, strp->stm.full_len); } @@ -494,6 +497,8 @@ void tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh) rxm->offset = strp->stm.offset; tlm = tls_msg(strp->anchor); tlm->control = strp->mark; + + return true; } /* Called with lock held on lower socket */ diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index d1cf3a12cf67..8915f2eedd09 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1380,7 +1380,8 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock, return sock_intr_errno(timeo); } - tls_strp_msg_load(&ctx->strp, released); + if (unlikely(!tls_strp_msg_load(&ctx->strp, released))) + return tls_rx_rec_wait(sk, psock, nonblock, false); return 1; } -- 2.25.1
From: Aakash Kumar S <saakashkumar@marvell.com> stable inclusion from stable-v6.6.103 commit 3d8090bb53424432fa788fe9a49e8ceca74f0544 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 94f39804d891cffe4ce17737d295f3b195bc7299 ] The issue originates when Strongswan initiates an XFRM_MSG_ALLOCSPI Netlink message, which triggers the kernel function xfrm_alloc_spi(). This function is expected to ensure uniqueness of the Security Parameter Index (SPI) for inbound Security Associations (SAs). However, it can return success even when the requested SPI is already in use, leading to duplicate SPIs assigned to multiple inbound SAs, differentiated only by their destination addresses. This behavior causes inconsistencies during SPI lookups for inbound packets. Since the lookup may return an arbitrary SA among those with the same SPI, packet processing can fail, resulting in packet drops. According to RFC 4301 section 4.4.2, for inbound processing a unicast SA is uniquely identified by the SPI and optionally protocol. Reproducing the Issue Reliably: To consistently reproduce the problem, restrict the available SPI range in charon.conf: spi_min = 0x10000000 spi_max = 0x10000002 This limits the system to only 2 usable SPI values. Next, create more than 2 Child SAs, each using a unique pair of src/dst addresses. As soon as the 3rd Child SA is initiated, it will be assigned a duplicate SPI, since the SPI pool is already exhausted. With a narrow SPI range, the issue is consistently reproducible. With a broader/default range, it becomes rare and unpredictable. Current implementation: The xfrm_spi_hash() lookup function computes the hash using daddr, proto, and family. So if two SAs have the same SPI but different destination addresses, then they will: a. Hash into different buckets b. 
Be stored in different linked lists (byspi + h) c. Not be seen in the same hlist_for_each_entry_rcu() iteration. As a result, the lookup will result in NULL and kernel allows that Duplicate SPI Proposed Change: xfrm_state_lookup_spi_proto() does a truly global search - across all states, regardless of hash bucket and matches SPI and proto. Signed-off-by: Aakash Kumar S <saakashkumar@marvell.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/xfrm/xfrm_state.c | 72 ++++++++++++++++++++++++++----------------- 1 file changed, 43 insertions(+), 29 deletions(-) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index e9f36cdfd111..b36eedad21f1 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -1450,6 +1450,26 @@ struct xfrm_state *xfrm_state_lookup_byspi(struct net *net, __be32 spi, } EXPORT_SYMBOL(xfrm_state_lookup_byspi); +static struct xfrm_state *xfrm_state_lookup_spi_proto(struct net *net, __be32 spi, u8 proto) +{ + struct xfrm_state *x; + unsigned int i; + + rcu_read_lock(); + for (i = 0; i <= net->xfrm.state_hmask; i++) { + hlist_for_each_entry_rcu(x, &net->xfrm.state_byspi[i], byspi) { + if (x->id.spi == spi && x->id.proto == proto) { + if (!xfrm_state_hold_rcu(x)) + continue; + rcu_read_unlock(); + return x; + } + } + } + rcu_read_unlock(); + return NULL; +} + static void __xfrm_state_insert(struct xfrm_state *x) { struct net *net = xs_net(x); @@ -2250,10 +2270,8 @@ int xfrm_alloc_spi(struct xfrm_state *x, u32 low, u32 high, unsigned int h; struct xfrm_state *x0; int err = -ENOENT; - __be32 minspi = htonl(low); - __be32 maxspi = htonl(high); + u32 range = high - low + 1; __be32 newspi = 0; - u32 mark = x->mark.v & x->mark.m; spin_lock_bh(&x->lock); if (x->km.state == XFRM_STATE_DEAD) { @@ -2267,38 +2285,34 @@ int xfrm_alloc_spi(struct xfrm_state *x, u32 low, u32 
high, err = -ENOENT; - if (minspi == maxspi) { - x0 = xfrm_state_lookup(net, mark, &x->id.daddr, minspi, x->id.proto, x->props.family); - if (x0) { - NL_SET_ERR_MSG(extack, "Requested SPI is already in use"); - xfrm_state_put(x0); + for (h = 0; h < range; h++) { + u32 spi = (low == high) ? low : get_random_u32_inclusive(low, high); + newspi = htonl(spi); + + spin_lock_bh(&net->xfrm.xfrm_state_lock); + x0 = xfrm_state_lookup_spi_proto(net, newspi, x->id.proto); + if (!x0) { + x->id.spi = newspi; + h = xfrm_spi_hash(net, &x->id.daddr, newspi, x->id.proto, x->props.family); + XFRM_STATE_INSERT(byspi, &x->byspi, net->xfrm.state_byspi + h, x->xso.type); + spin_unlock_bh(&net->xfrm.xfrm_state_lock); + err = 0; goto unlock; } - newspi = minspi; - } else { - u32 spi = 0; - for (h = 0; h < high-low+1; h++) { - spi = get_random_u32_inclusive(low, high); - x0 = xfrm_state_lookup(net, mark, &x->id.daddr, htonl(spi), x->id.proto, x->props.family); - if (x0 == NULL) { - newspi = htonl(spi); - break; - } - xfrm_state_put(x0); + xfrm_state_put(x0); + spin_unlock_bh(&net->xfrm.xfrm_state_lock); + + if (signal_pending(current)) { + err = -ERESTARTSYS; + goto unlock; } + + if (low == high) + break; } - if (newspi) { - spin_lock_bh(&net->xfrm.xfrm_state_lock); - x->id.spi = newspi; - h = xfrm_spi_hash(net, &x->id.daddr, x->id.spi, x->id.proto, x->props.family); - XFRM_STATE_INSERT(byspi, &x->byspi, net->xfrm.state_byspi + h, - x->xso.type); - spin_unlock_bh(&net->xfrm.xfrm_state_lock); - err = 0; - } else { + if (err) NL_SET_ERR_MSG(extack, "No SPI available in the requested range"); - } unlock: spin_unlock_bh(&x->lock); -- 2.25.1
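The difference between the two lookup strategies can be shown with a toy hash table. This is a sketch under stated assumptions: `toy_state`, `toy_hash()` and the two lookup helpers are hypothetical and far simpler than the xfrm state tables; only the idea (a per-bucket lookup keyed on daddr versus a walk over all buckets matching SPI and proto) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 8

/* Hypothetical, minimal stand-in for an xfrm state. */
struct toy_state {
	uint32_t daddr;
	uint32_t spi;
	uint8_t proto;
	struct toy_state *next;
};

static struct toy_state *table[NBUCKETS];

/* Toy hash: like xfrm_spi_hash(), the destination address is part
 * of the key, so equal SPIs may land in different buckets.
 */
static unsigned int toy_hash(uint32_t daddr, uint32_t spi, uint8_t proto)
{
	return (daddr ^ spi ^ proto) % NBUCKETS;
}

static void toy_insert(struct toy_state *x)
{
	unsigned int h = toy_hash(x->daddr, x->spi, x->proto);

	x->next = table[h];
	table[h] = x;
}

/* Old-style check: searches one bucket and also compares daddr, so a
 * state with the same SPI but another daddr is never seen.
 */
static bool toy_lookup_bucket(uint32_t daddr, uint32_t spi, uint8_t proto)
{
	struct toy_state *x;

	for (x = table[toy_hash(daddr, spi, proto)]; x; x = x->next)
		if (x->daddr == daddr && x->spi == spi && x->proto == proto)
			return true;
	return false;
}

/* New-style check: walks every bucket and matches SPI and proto only. */
static bool toy_lookup_spi_proto(uint32_t spi, uint8_t proto)
{
	struct toy_state *x;
	unsigned int i;

	for (i = 0; i < NBUCKETS; i++)
		for (x = table[i]; x; x = x->next)
			if (x->spi == spi && x->proto == proto)
				return true;
	return false;
}
```

The global walk costs time proportional to the table size, which is the trade-off the patch accepts on the allocation path in exchange for SPI uniqueness.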
From: Sabrina Dubroca <sd@queasysnail.net> stable inclusion from stable-v6.6.109 commit 0baf92d0b1590b903c1f4ead75e61715e50e8146 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit cd8ae32e4e4652db55bce6b9c79267d8946765a9 ] x->id.spi == 0 means "no SPI assigned", but since commit 94f39804d891 ("xfrm: Duplicate SPI Handling"), we now create states and add them to the byspi list with this value. __xfrm_state_delete doesn't remove those states from the byspi list, since they shouldn't be there, and this shows up as a UAF the next time we go through the byspi list. Reported-by: syzbot+a25ee9d20d31e483ba7b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a25ee9d20d31e483ba7b Fixes: 94f39804d891 ("xfrm: Duplicate SPI Handling") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/xfrm/xfrm_state.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index b36eedad21f1..4efbc0b22378 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2287,6 +2287,8 @@ int xfrm_alloc_spi(struct xfrm_state *x, u32 low, u32 high, for (h = 0; h < range; h++) { u32 spi = (low == high) ? low : get_random_u32_inclusive(low, high); + if (spi == 0) + goto next; newspi = htonl(spi); spin_lock_bh(&net->xfrm.xfrm_state_lock); @@ -2302,6 +2304,7 @@ int xfrm_alloc_spi(struct xfrm_state *x, u32 low, u32 high, xfrm_state_put(x0); spin_unlock_bh(&net->xfrm.xfrm_state_lock); +next: if (signal_pending(current)) { err = -ERESTARTSYS; goto unlock; -- 2.25.1
From: Jakub Kicinski <kuba@kernel.org> stable inclusion from stable-v6.6.108 commit 4cefe5be73886f383639fe0850bb72d5b568a7b9 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 0aeb54ac4cd5cf8f60131b4d9ec0b6dc9c27b20d ] Normally we wait for the socket to buffer up the whole record before we service it. If the socket has a tiny buffer, however, we read out the data sooner, to prevent connection stalls. Make sure that we abort the connection when we find out late that the record is actually invalid. Retrying the parsing is fine in itself but since we copy some more data each time before we parse we can overflow the allocated skb space. Constructing a scenario in which we're under pressure without enough data in the socket to parse the length upfront is quite hard. syzbot figured out a way to do this by serving us the header in small OOB sends, and then filling in the recvbuf with a large normal send. Make sure that tls_rx_msg_size() aborts strp, if we reach an invalid record there's really no way to recover. 
Reported-by: Lee Jones <lee@kernel.org> Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250917002814.1743558-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/tls/tls.h | 1 + net/tls/tls_strp.c | 14 +++++++++----- net/tls/tls_sw.c | 3 +-- 3 files changed, 11 insertions(+), 7 deletions(-) diff --git a/net/tls/tls.h b/net/tls/tls.h index 5dc61c85c076..a3c5c5a59fda 100644 --- a/net/tls/tls.h +++ b/net/tls/tls.h @@ -141,6 +141,7 @@ void update_sk_prot(struct sock *sk, struct tls_context *ctx); int wait_on_pending_writer(struct sock *sk, long *timeo); void tls_err_abort(struct sock *sk, int err); +void tls_strp_abort_strp(struct tls_strparser *strp, int err); int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx); void tls_update_rx_zc_capable(struct tls_context *tls_ctx); diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c index 6ce64a6e4495..ae723cd6af39 100644 --- a/net/tls/tls_strp.c +++ b/net/tls/tls_strp.c @@ -12,7 +12,7 @@ static struct workqueue_struct *tls_strp_wq; -static void tls_strp_abort_strp(struct tls_strparser *strp, int err) +void tls_strp_abort_strp(struct tls_strparser *strp, int err) { if (strp->stopped) return; @@ -210,11 +210,17 @@ static int tls_strp_copyin_frag(struct tls_strparser *strp, struct sk_buff *skb, struct sk_buff *in_skb, unsigned int offset, size_t in_len) { + unsigned int nfrag = skb->len / PAGE_SIZE; size_t len, chunk; skb_frag_t *frag; int sz; - frag = &skb_shinfo(skb)->frags[skb->len / PAGE_SIZE]; + if (unlikely(nfrag >= skb_shinfo(skb)->nr_frags)) { + DEBUG_NET_WARN_ON_ONCE(1); + return -EMSGSIZE; + } + + frag = &skb_shinfo(skb)->frags[nfrag]; len = in_len; /* First make sure we got the header */ @@ -519,10 +525,8 @@ static int 
tls_strp_read_sock(struct tls_strparser *strp) tls_strp_load_anchor_with_queue(strp, inq); if (!strp->stm.full_len) { sz = tls_rx_msg_size(strp, strp->anchor); - if (sz < 0) { - tls_strp_abort_strp(strp, sz); + if (sz < 0) return sz; - } strp->stm.full_len = sz; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 8915f2eedd09..2a6026b6f1fc 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -2438,8 +2438,7 @@ int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb) return data_len + TLS_HEADER_SIZE; read_failure: - tls_err_abort(strp->sk, ret); - + tls_strp_abort_strp(strp, ret); return ret; } -- 2.25.1
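The bounds check added to tls_strp_copyin_frag() can be sketched in isolation. The helper below is hypothetical, and `TOY_PAGE_SIZE` is an assumed 4096-byte page; in the kernel the values come from skb->len, PAGE_SIZE and skb_shinfo(skb)->nr_frags.

```c
#include <assert.h>
#include <errno.h>

#define TOY_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical helper: returns the index of the fragment that the
 * next copied byte belongs to, or -EMSGSIZE if the data already
 * copied fills every allocated fragment.
 */
static int toy_frag_index(unsigned int copied_len, unsigned int nr_frags)
{
	unsigned int nfrag = copied_len / TOY_PAGE_SIZE;

	if (nfrag >= nr_frags)
		return -EMSGSIZE;

	return (int)nfrag;
}
```

Without the check, a repeated parse that copies a little more data each time could compute an index one past the last allocated fragment.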
From: Michal Luczaj <mhal@rbox.co> stable inclusion from stable-v6.6.118 commit 5998da5a8208ae9ad7838ba322bccb2bdcd95e81 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 002541ef650b742a198e4be363881439bb9d86b4 ] During connect(), acting on a signal/timeout by disconnecting an already established socket leads to several issues: 1. connect() invoking vsock_transport_cancel_pkt() -> virtio_transport_purge_skbs() may race with sendmsg() invoking virtio_transport_get_credit(). This results in a permanently elevated `vvs->bytes_unsent`. Which, in turn, confuses the SOCK_LINGER handling. 2. connect() resetting a connected socket's state may race with socket being placed in a sockmap. A disconnected socket remaining in a sockmap breaks sockmap's assumptions. And gives rise to WARNs. 3. connect() transitioning SS_CONNECTED -> SS_UNCONNECTED allows for a transport change/drop after TCP_ESTABLISHED. Which poses a problem for any simultaneous sendmsg() or connect() and may result in a use-after-free/null-ptr-deref. Do not disconnect socket on signal/timeout. Keep the logic for unconnected sockets: they don't linger, can't be placed in a sockmap, are rejected by sendmsg(). [1]: https://lore.kernel.org/netdev/e07fd95c-9a38-4eea-9638-133e38c2ec9b@rbox.co/ [2]: https://lore.kernel.org/netdev/20250317-vsock-trans-signal-race-v4-0-fc8837f... [3]: https://lore.kernel.org/netdev/60f1b7db-3099-4f6a-875e-af9f6ef194f6@rbox.co/ Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20251119-vsock-interrupted-connect-v2-1-70734cf1233... 
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/vmw_vsock/af_vsock.c | 40 +++++++++++++++++++++++++++++++--------- 1 file changed, 31 insertions(+), 9 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 58b7404a0da0..68c35d24809d 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1549,18 +1549,40 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, timeout = schedule_timeout(timeout); lock_sock(sk); - if (signal_pending(current)) { - err = sock_intr_errno(timeout); - sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; - sock->state = SS_UNCONNECTED; - vsock_transport_cancel_pkt(vsk); - vsock_remove_connected(vsk); - goto out_wait; - } else if ((sk->sk_state != TCP_ESTABLISHED) && (timeout == 0)) { - err = -ETIMEDOUT; + /* Connection established. Whatever happens to socket once we + * release it, that's not connect()'s concern. No need to go + * into signal and timeout handling. Call it a day. + * + * Note that allowing to "reset" an already established socket + * here is racy and insecure. + */ + if (sk->sk_state == TCP_ESTABLISHED) + break; + + /* If connection was _not_ established and a signal/timeout came + * to be, we want the socket's state reset. User space may want + * to retry. + * + * sk_state != TCP_ESTABLISHED implies that socket is not on + * vsock_connected_table. We keep the binding and the transport + * assigned. + */ + if (signal_pending(current) || timeout == 0) { + err = timeout == 0 ? -ETIMEDOUT : sock_intr_errno(timeout); + + /* Listener might have already responded with + * VIRTIO_VSOCK_OP_RESPONSE. Its handling expects our + * sk_state == TCP_SYN_SENT, which hereby we break. + * In such case VIRTIO_VSOCK_OP_RST will follow. 
+ */ sk->sk_state = TCP_CLOSE; sock->state = SS_UNCONNECTED; + + /* Try to cancel VIRTIO_VSOCK_OP_REQUEST skb sent out by + * transport->connect(). + */ vsock_transport_cancel_pkt(vsk); + goto out_wait; } -- 2.25.1
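The new ordering of checks in the connect() wait loop can be summarized as a pure decision function. This is a simplified sketch: `connect_wait_step()` and `enum toy_action` are hypothetical, and -EINTR stands in for the value sock_intr_errno() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_action {
	TOY_DONE,		/* connection established, leave the loop */
	TOY_ABORT,		/* reset socket state and return *err */
	TOY_KEEP_WAITING,	/* go around the wait loop again */
};

/* Hypothetical condensation of one iteration after schedule_timeout():
 * an established socket wins over any pending signal or expired
 * timeout, so it is never torn down from connect().
 */
static enum toy_action connect_wait_step(bool established, bool signal_pending,
					 long timeout, int *err)
{
	if (established)
		return TOY_DONE;

	if (signal_pending || timeout == 0) {
		/* Simplification: the kernel uses sock_intr_errno(timeout). */
		*err = timeout == 0 ? -ETIMEDOUT : -EINTR;
		return TOY_ABORT;
	}

	return TOY_KEEP_WAITING;
}
```

Before the patch the signal check came first, which is what allowed an already established socket to be disconnected.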
From: Nikolay Aleksandrov <razor@blackwall.org> stable inclusion from stable-v6.6.117 commit 3b60ce334c1ce8b3fad7e02dcd5ed9f6646477c8 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 8dca36978aa80bab9d4da130c211db75c9e00048 ] syzbot reported[1] a use-after-free when deleting an expired fdb. It is due to a race condition between learning still happening and a port being deleted, after all its fdbs have been flushed. The port's state has been toggled to disabled so no learning should happen at that time, but if we have MST enabled, it will bypass the port's state, that together with VLAN filtering disabled can lead to fdb learning at a time when it shouldn't happen while the port is being deleted. VLAN filtering must be disabled because we flush the port VLANs when it's being deleted which will stop learning. This fix adds a check for the port's vlan group which is initialized to NULL when the port is getting deleted, that avoids the port state bypass. When MST is enabled there would be a minimal new overhead in the fast-path because the port's vlan group pointer is cache-hot. 
[1] https://syzkaller.appspot.com/bug?extid=dd280197f0f7ab3917be Fixes: ec7328b59176 ("net: bridge: mst: Multiple Spanning Tree (MST) mode") Reported-by: syzbot+dd280197f0f7ab3917be@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/69088ffa.050a0220.29fc44.003d.GAE@google.com/ Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251105111919.1499702-2-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> --- net/bridge/br_forward.c | 2 +- net/bridge/br_input.c | 4 ++-- net/bridge/br_private.h | 8 +++++--- 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index e19b583ff2c6..193520321804 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -25,7 +25,7 @@ static inline int should_deliver(const struct net_bridge_port *p, vg = nbp_vlan_group_rcu(p); return ((p->flags & BR_HAIRPIN_MODE) || skb->dev != p->dev) && - (br_mst_is_enabled(p->br) || p->state == BR_STATE_FORWARDING) && + (br_mst_is_enabled(p) || p->state == BR_STATE_FORWARDING) && br_allowed_egress(vg, skb) && nbp_switchdev_allowed_egress(p, skb) && !br_skb_isolated(p, skb); } diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index e09000e38d07..951330c1a813 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -93,7 +93,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br = p->br; - if (br_mst_is_enabled(br)) { + if (br_mst_is_enabled(p)) { state = BR_STATE_FORWARDING; } else { if (p->state == BR_STATE_DISABLED) @@ -411,7 +411,7 @@ static rx_handler_result_t br_handle_frame(struct sk_buff **pskb) return RX_HANDLER_PASS; forward: - if (br_mst_is_enabled(p->br)) + if (br_mst_is_enabled(p)) goto defer_stp_filtering; switch (p->state) { diff --git 
a/net/bridge/br_private.h b/net/bridge/br_private.h index ebb12f37be94..b468c36fd6d3 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -1879,10 +1879,12 @@ static inline bool br_vlan_state_allowed(u8 state, bool learn_allow) /* br_mst.c */ #ifdef CONFIG_BRIDGE_VLAN_FILTERING DECLARE_STATIC_KEY_FALSE(br_mst_used); -static inline bool br_mst_is_enabled(struct net_bridge *br) +static inline bool br_mst_is_enabled(const struct net_bridge_port *p) { + /* check the port's vlan group to avoid racing with port deletion */ return static_branch_unlikely(&br_mst_used) && - br_opt_get(br, BROPT_MST_ENABLED); + br_opt_get(p->br, BROPT_MST_ENABLED) && + rcu_access_pointer(p->vlgrp); } int br_mst_set_state(struct net_bridge_port *p, u16 msti, u8 state, @@ -1897,7 +1899,7 @@ int br_mst_fill_info(struct sk_buff *skb, int br_mst_process(struct net_bridge_port *p, const struct nlattr *mst_attr, struct netlink_ext_ack *extack); #else -static inline bool br_mst_is_enabled(struct net_bridge *br) +static inline bool br_mst_is_enabled(const struct net_bridge_port *p) { return false; } -- 2.25.1
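The effect of the extra vlan group test can be sketched with a toy port structure. Everything here is hypothetical (`toy_port`, `toy_mst_is_enabled()`); the real helper additionally uses a static key and rcu_access_pointer().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for a bridge port. */
struct toy_port {
	bool bridge_mst_enabled;	/* models BROPT_MST_ENABLED */
	void *vlgrp;			/* NULL once the port is being deleted */
};

/* MST may bypass the port state only while the port still has a
 * vlan group, i.e. while it is not in the middle of being deleted.
 */
static bool toy_mst_is_enabled(const struct toy_port *p)
{
	return p->bridge_mst_enabled && p->vlgrp != NULL;
}
```

A port under deletion therefore falls back to the ordinary p->state checks, and its disabled state stops further fdb learning.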
From: Bui Quang Minh <minhquangbui99@gmail.com>

stable inclusion
from stable-v6.6.117
commit 946dec89c41726b94d31147ec528b96af0be1b5a
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...

--------------------------------

commit 0c716703965ffc5ef4311b65cb5d84a703784717 upstream.

Since commit 4959aebba8c0 ("virtio-net: use mtu size as buffer length
for big packets"), when guest gso is off, the allocated size for big
packets is not MAX_SKB_FRAGS * PAGE_SIZE anymore but depends on
negotiated MTU. The number of allocated frags for big packets is stored
in vi->big_packets_num_skbfrags.

Because the host announced buffer length can be malicious (e.g. the host
vhost_net driver's get_rx_bufs is modified to announce incorrect
length), we need a check in virtio_net receive path. Currently, the
check is not adapted to the new change which can lead to NULL page
pointer dereference in the below while loop when receiving length that
is larger than the allocated one.

This commit fixes the received length check corresponding to the new
change.

Fixes: 4959aebba8c0 ("virtio-net: use mtu size as buffer length for big packets")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://patch.msgid.link/20251030144438.7582-1-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
---
 drivers/net/virtio_net.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5a949f9446a8..fd5663075bd7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -591,17 +591,6 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 		goto ok;
 	}
 
-	/*
-	 * Verify that we can indeed put this data into a skb.
-	 * This is here to handle cases when the device erroneously
-	 * tries to receive more than is possible. This is usually
-	 * the case of a broken device.
-	 */
-	if (unlikely(len > MAX_SKB_FRAGS * PAGE_SIZE)) {
-		net_dbg_ratelimited("%s: too much data\n", skb->dev->name);
-		dev_kfree_skb(skb);
-		return NULL;
-	}
 	BUG_ON(offset >= PAGE_SIZE);
 	while (len) {
 		unsigned int frag_size = min((unsigned)PAGE_SIZE - offset, len);
@@ -1344,9 +1333,19 @@ static struct sk_buff *receive_big(struct net_device *dev,
 				   struct virtnet_rq_stats *stats)
 {
 	struct page *page = buf;
-	struct sk_buff *skb =
-		page_to_skb(vi, rq, page, 0, len, PAGE_SIZE, 0);
+	struct sk_buff *skb;
+
+	/* Make sure that len does not exceed the size allocated in
+	 * add_recvbuf_big.
+	 */
+	if (unlikely(len > (vi->big_packets_num_skbfrags + 1) * PAGE_SIZE)) {
+		pr_debug("%s: rx error: len %u exceeds allocated size %lu\n",
+			 dev->name, len,
+			 (vi->big_packets_num_skbfrags + 1) * PAGE_SIZE);
+		goto err;
+	}
 
+	skb = page_to_skb(vi, rq, page, 0, len, PAGE_SIZE, 0);
 	u64_stats_add(&stats->bytes, len - vi->hdr_len);
 	if (unlikely(!skb))
 		goto err;
--
2.25.1
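
Illustration only, not part of the patch: the bound introduced in receive_big() can be sketched in user space as below. SK_PAGE_SIZE and the sk_* helpers are hypothetical names, and the 4096-byte page size is an assumption made for the example; the only thing taken from the diff is the shape of the comparison, len against (big_packets_num_skbfrags + 1) pages.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/* Bytes posted for one big-packet buffer: (num_skbfrags + 1) pages. */
static size_t sk_big_packet_capacity(unsigned int num_skbfrags)
{
	return ((size_t)num_skbfrags + 1) * SK_PAGE_SIZE;
}

/* True when the announced length fits inside what was actually posted. */
static bool sk_big_len_ok(unsigned int len, unsigned int num_skbfrags)
{
	return len <= sk_big_packet_capacity(num_skbfrags);
}
```

For a buffer posted with two frags, a 12288-byte announcement is accepted and anything larger is rejected, even though a fixed MAX_SKB_FRAGS * PAGE_SIZE bound could still have let a larger value through.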
From: Jakub Kicinski <kuba@kernel.org>

stable inclusion
from stable-v6.6.117
commit 7613c06ffa89c1e2266fb532e23ef7dfdf269d73
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...

--------------------------------

[ Upstream commit f3b52167a0cb23b27414452fbc1278da2ee884fc ]

Driver authors often forget to add GFP_NOWARN for page allocation
from the datapath. This is annoying to users as OOMs are a fact
of life, and we pretty much expect network Rx to hit page allocation
failures during OOM. Make page pool add GFP_NOWARN for ATOMIC
allocations by default.

Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250912161703.361272-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
---
 net/core/page_pool.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index cb7238043a33..c7aa2ff9aa88 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -422,6 +422,12 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	struct page *page;
 	int i, nr_pages;
 
+	/* Unconditionally set NOWARN if allocating from NAPI.
+	 * Drivers forget to set it, and OOM reports on packet Rx are useless.
+	 */
+	if ((gfp & GFP_ATOMIC) == GFP_ATOMIC)
+		gfp |= __GFP_NOWARN;
+
 	/* Don't support bulk alloc for high-order pages */
 	if (unlikely(pp_order))
 		return __page_pool_alloc_page_order(pool, gfp);
--
2.25.1
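
Illustration only, not part of the patch: the hunk tests (gfp & GFP_ATOMIC) == GFP_ATOMIC rather than a bare gfp & GFP_ATOMIC. The sketch below shows why that matters when a mask is built from several bits. All SK_* names and bit values are invented for the example; the way SK_GFP_ATOMIC and SK_GFP_KERNEL are composed is only loosely modelled on the kernel's flags and should be checked against include/linux/gfp_types.h.

```c
#include <stdint.h>

typedef uint32_t sk_gfp_t;

/* Invented bit values; these are NOT the kernel's real definitions. */
#define SK_GFP_HIGH		0x01u
#define SK_GFP_KSWAPD_RECLAIM	0x02u
#define SK_GFP_DIRECT_RECLAIM	0x04u
#define SK_GFP_NOWARN		0x08u

/* Composite masks that share the SK_GFP_KSWAPD_RECLAIM bit. */
#define SK_GFP_ATOMIC	(SK_GFP_HIGH | SK_GFP_KSWAPD_RECLAIM)
#define SK_GFP_KERNEL	(SK_GFP_KSWAPD_RECLAIM | SK_GFP_DIRECT_RECLAIM)

/* Add NOWARN only when every bit of the atomic mask is present. */
static sk_gfp_t sk_add_nowarn_if_atomic(sk_gfp_t gfp)
{
	if ((gfp & SK_GFP_ATOMIC) == SK_GFP_ATOMIC)
		gfp |= SK_GFP_NOWARN;
	return gfp;
}
```

In this model SK_GFP_KERNEL overlaps SK_GFP_ATOMIC in one bit, so a plain non-zero test would also match it; the equality test leaves it untouched and only marks genuinely atomic requests.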
From: Nishanth Menon <nm@ti.com>

stable inclusion
from stable-v6.6.118
commit 952637c5b9be64539cd0e13ef88db71a1df46373
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...

--------------------------------

[ Upstream commit 90a88306eb874fe4bbdd860e6c9787f5bbc588b5 ]

Make knav_dma_open_channel consistently return NULL on error instead
of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h
returns NULL when the driver is disabled, but the driver
implementation does not even return NULL or ERR_PTR on failure,
causing inconsistency in the users. This results in a crash in
netcp_free_navigator_resources as followed (trimmed):

Unhandled fault: alignment exception (0x221) at 0xfffffff2
[fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000
Internal error: : 221 [#1] SMP ARM
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE
Hardware name: Keystone
PC is at knav_dma_close_channel+0x30/0x19c
LR is at netcp_free_navigator_resources+0x2c/0x28c

[... TRIM...]

Call trace:
 knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c
 netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c
 netcp_ndo_open from __dev_open+0x114/0x29c
 __dev_open from __dev_change_flags+0x190/0x208
 __dev_change_flags from netif_change_flags+0x1c/0x58
 netif_change_flags from dev_change_flags+0x38/0xa0
 dev_change_flags from ip_auto_config+0x2c4/0x11f0
 ip_auto_config from do_one_initcall+0x58/0x200
 do_one_initcall from kernel_init_freeable+0x1cc/0x238
 kernel_init_freeable from kernel_init+0x1c/0x12c
 kernel_init from ret_from_fork+0x14/0x38
[... TRIM...]

Standardize the error handling by making the function return NULL on
all error conditions. The API is used in just the netcp_core.c so the
impact is limited.

Note, this change, in effect reverts commit 5b6cb43b4d62 ("net:
ethernet: ti: netcp_core: return error while dma channel open issue"),
but provides a less error prone implementation.

Suggested-by: Simon Horman <horms@kernel.org>
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251103162811.3730055-1-nm@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
---
 drivers/net/ethernet/ti/netcp_core.c | 10 +++++-----
 drivers/soc/ti/knav_dma.c            | 14 +++++++-------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c
index d829113c16ee..1c33a9c9ddb5 100644
--- a/drivers/net/ethernet/ti/netcp_core.c
+++ b/drivers/net/ethernet/ti/netcp_core.c
@@ -1338,10 +1338,10 @@ int netcp_txpipe_open(struct netcp_tx_pipe *tx_pipe)
 
 	tx_pipe->dma_channel = knav_dma_open_channel(dev,
 				tx_pipe->dma_chan_name, &config);
-	if (IS_ERR(tx_pipe->dma_channel)) {
+	if (!tx_pipe->dma_channel) {
 		dev_err(dev, "failed opening tx chan(%s)\n",
 			tx_pipe->dma_chan_name);
-		ret = PTR_ERR(tx_pipe->dma_channel);
+		ret = -EINVAL;
 		goto err;
 	}
 
@@ -1359,7 +1359,7 @@ int netcp_txpipe_open(struct netcp_tx_pipe *tx_pipe)
 	return 0;
 
 err:
-	if (!IS_ERR_OR_NULL(tx_pipe->dma_channel))
+	if (tx_pipe->dma_channel)
 		knav_dma_close_channel(tx_pipe->dma_channel);
 	tx_pipe->dma_channel = NULL;
 	return ret;
@@ -1678,10 +1678,10 @@ static int netcp_setup_navigator_resources(struct net_device *ndev)
 
 	netcp->rx_channel = knav_dma_open_channel(netcp->netcp_device->device,
 					netcp->dma_chan_name, &config);
-	if (IS_ERR(netcp->rx_channel)) {
+	if (!netcp->rx_channel) {
 		dev_err(netcp->ndev_dev, "failed opening rx chan(%s\n",
 			netcp->dma_chan_name);
-		ret = PTR_ERR(netcp->rx_channel);
+		ret = -EINVAL;
 		goto fail;
 	}
 
diff --git a/drivers/soc/ti/knav_dma.c b/drivers/soc/ti/knav_dma.c
index 0fbc37cd5123..f599eaab3c56 100644
--- a/drivers/soc/ti/knav_dma.c
+++ b/drivers/soc/ti/knav_dma.c
@@ -402,7 +402,7 @@ static int of_channel_match_helper(struct device_node *np, const char *name,
  * @name:	slave channel name
  * @config:	dma configuration parameters
  *
- * Returns pointer to appropriate DMA channel on success or error.
+ * Return: Pointer to appropriate DMA channel on success or NULL on error.
  */
 void *knav_dma_open_channel(struct device *dev, const char *name,
 					struct knav_dma_cfg *config)
@@ -414,13 +414,13 @@ void *knav_dma_open_channel(struct device *dev, const char *name,
 
 	if (!kdev) {
 		pr_err("keystone-navigator-dma driver not registered\n");
-		return (void *)-EINVAL;
+		return NULL;
 	}
 
 	chan_num = of_channel_match_helper(dev->of_node, name, &instance);
 	if (chan_num < 0) {
 		dev_err(kdev->dev, "No DMA instance with name %s\n", name);
-		return (void *)-EINVAL;
+		return NULL;
 	}
 
 	dev_dbg(kdev->dev, "initializing %s channel %d from DMA %s\n",
@@ -431,7 +431,7 @@ void *knav_dma_open_channel(struct device *dev, const char *name,
 	if (config->direction != DMA_MEM_TO_DEV &&
 	    config->direction != DMA_DEV_TO_MEM) {
 		dev_err(kdev->dev, "bad direction\n");
-		return (void *)-EINVAL;
+		return NULL;
 	}
 
 	/* Look for correct dma instance */
@@ -443,7 +443,7 @@ void *knav_dma_open_channel(struct device *dev, const char *name,
 	}
 	if (!dma) {
 		dev_err(kdev->dev, "No DMA instance with name %s\n", instance);
-		return (void *)-EINVAL;
+		return NULL;
 	}
 
 	/* Look for correct dma channel from dma instance */
@@ -463,14 +463,14 @@ void *knav_dma_open_channel(struct device *dev, const char *name,
 	if (!chan) {
 		dev_err(kdev->dev, "channel %d is not in DMA %s\n",
 				chan_num, instance);
-		return (void *)-EINVAL;
+		return NULL;
 	}
 
 	if (atomic_read(&chan->ref_count) >= 1) {
 		if (!check_config(chan, config)) {
 			dev_err(kdev->dev, "channel %d config miss-match\n",
 				chan_num);
-			return (void *)-EINVAL;
+			return NULL;
 		}
 	}
 
--
2.25.1
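
Illustration only, not part of the patch: the mismatch the commit message describes can be reproduced in user space. An errno value cast to a pointer, as in the removed "return (void *)-EINVAL;" lines, is not NULL, so a caller that only tests for NULL treats the failure as success. Everything named sk_* below is hypothetical; only the two return conventions are taken from the diff.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical channel object standing in for a DMA channel handle. */
struct sk_chan {
	int id;
};

static struct sk_chan sk_chan0 = { .id = 0 };

/* Old convention: failure is a negative errno cast to a pointer. */
static void *sk_open_old(int ok)
{
	return ok ? (void *)&sk_chan0 : (void *)(intptr_t)-EINVAL;
}

/* New convention: failure is NULL. */
static void *sk_open_new(int ok)
{
	return ok ? (void *)&sk_chan0 : NULL;
}

/* Caller written in the post-patch style: a NULL test only. */
static int sk_setup(void *(*open_fn)(int), int ok)
{
	void *chan = open_fn(ok);

	if (!chan)
		return -EINVAL;
	return 0;
}
```

Pairing sk_setup() with sk_open_old() lets a failed open slip through as success, which is the kind of inconsistency between producer and consumer that the patch removes by settling on NULL everywhere.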
From: Florian Westphal <fw@strlen.de>

stable inclusion
from stable-v6.6.103
commit a0a3ace2a57887dac1e7c9a724846040c3e31868
category: bugfix
bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13525

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...

--------------------------------

[ Upstream commit 91a79b792204313153e1bdbbe5acbfc28903b3a5 ]

recent patches to add a WARN() when replacing skb dst entry found an
old bug:

WARNING: include/linux/skbuff.h:1165 skb_dst_check_unset include/linux/skbuff.h:1164 [inline]
WARNING: include/linux/skbuff.h:1165 skb_dst_set include/linux/skbuff.h:1210 [inline]
WARNING: include/linux/skbuff.h:1165 nf_reject_fill_skb_dst+0x2a4/0x330 net/ipv4/netfilter/nf_reject_ipv4.c:234
[..]
Call Trace:
 nf_send_unreach+0x17b/0x6e0 net/ipv4/netfilter/nf_reject_ipv4.c:325
 nft_reject_inet_eval+0x4bc/0x690 net/netfilter/nft_reject_inet.c:27
 expr_call_ops_eval net/netfilter/nf_tables_core.c:237 [inline]
 ..

This is because blamed commit forgot about loopback packets.
Such packets already have a dst_entry attached, even at PRE_ROUTING stage.

Instead of checking hook just check if the skb already has a route
attached to it.

Fixes: f53b9b0bdc59 ("netfilter: introduce support for reject at prerouting stage")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250820123707.10671-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
---
 net/ipv4/netfilter/nf_reject_ipv4.c | 6 ++----
 net/ipv6/netfilter/nf_reject_ipv6.c | 5 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
index 675b5bbed638..2d663fe50f87 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -247,8 +247,7 @@ void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb,
 	if (!oth)
 		return;
 
-	if ((hook == NF_INET_PRE_ROUTING || hook == NF_INET_INGRESS) &&
-	    nf_reject_fill_skb_dst(oldskb) < 0)
+	if (!skb_dst(oldskb) && nf_reject_fill_skb_dst(oldskb) < 0)
 		return;
 
 	if (skb_rtable(oldskb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
@@ -321,8 +320,7 @@ void nf_send_unreach(struct sk_buff *skb_in, int code, int hook)
 	if (iph->frag_off & htons(IP_OFFSET))
 		return;
 
-	if ((hook == NF_INET_PRE_ROUTING || hook == NF_INET_INGRESS) &&
-	    nf_reject_fill_skb_dst(skb_in) < 0)
+	if (!skb_dst(skb_in) && nf_reject_fill_skb_dst(skb_in) < 0)
 		return;
 
 	if (skb_csum_unnecessary(skb_in) ||
diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
index e4776bd2ed89..f3579bccf0a5 100644
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -293,7 +293,7 @@ void nf_send_reset6(struct net *net, struct sock *sk, struct sk_buff *oldskb,
 	fl6.fl6_sport = otcph->dest;
 	fl6.fl6_dport = otcph->source;
 
-	if (hook == NF_INET_PRE_ROUTING || hook == NF_INET_INGRESS) {
+	if (!skb_dst(oldskb)) {
 		nf_ip6_route(net, &dst, flowi6_to_flowi(&fl6), false);
 		if (!dst)
 			return;
@@ -397,8 +397,7 @@ void nf_send_unreach6(struct net *net, struct sk_buff *skb_in,
 	if (hooknum == NF_INET_LOCAL_OUT && skb_in->dev == NULL)
 		skb_in->dev = net->loopback_dev;
 
-	if ((hooknum == NF_INET_PRE_ROUTING || hooknum == NF_INET_INGRESS) &&
-	    nf_reject6_fill_skb_dst(skb_in) < 0)
+	if (!skb_dst(skb_in) && nf_reject6_fill_skb_dst(skb_in) < 0)
 		return;
 
 	icmpv6_send(skb_in, ICMPV6_DEST_UNREACH, code, 0);
--
2.25.1
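
Illustration only, not part of the patch: a small user-space model of the guard change. The sk_* structures and helpers are hypothetical and carry none of the kernel's dst or skb semantics; the refcount is a plain int. The sketch shows the idea stated in the commit message: look up and attach a route only when the packet does not already carry one, so a packet that arrives with a route (the loopback case) keeps it and no extra reference is taken.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a dst entry and an skb. */
struct sk_dst {
	int refcnt;
};

struct sk_skb {
	struct sk_dst *dst;
};

/* Models a fill helper: takes a reference on the route and attaches it. */
static int sk_fill_dst(struct sk_skb *skb, struct sk_dst *route)
{
	route->refcnt++;
	skb->dst = route;
	return 0;
}

/* New-style guard: fill only when no route is attached yet. */
static int sk_prepare_reject(struct sk_skb *skb, struct sk_dst *route)
{
	if (!skb->dst && sk_fill_dst(skb, route) < 0)
		return -1;
	return 0;
}
```

A packet that already holds a route passes through sk_prepare_reject() unchanged, while a packet without one receives the looked-up route and exactly one new reference.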