Backport 5.10.134 LTS patches from upstream. git cherry-pick v5.10.133..v5.10.134~1 -s
Already merged(-12): ab5050fd7430 lockdown: Fix kexec lockdown bypass with ima policy 2ee0cab11f66 io_uring: Use original task for req identity in io_identity_cow() ed6964ff4714 net: make free_netdev() more lenient with unregistering devices 2e11856ec379 net: make sure devices go through netdev_wait_all_refs 47b696dd6544 xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup() 0adf21eec590 watchqueue: make sure to serialize 'wqueue->defunct' properly 6a8184825286 tty: drivers/tty/, stop using tty_schedule_flip() 4d374625cca2 tty: the rest, stop using tty_schedule_flip() c84986d09745 tty: drop tty_schedule_flip() a4bb7ef2d6f6 tty: extract tty_flip_buffer_commit() from tty_flip_buffer_push() 08afa87f58d8 tty: use new tty_insert_flip_string_and_push_buffer() in pty_write() bb1990a3005e watch-queue: remove spurious double semicolon
Context Conflict: 2686f62fa78c docs: net: explain struct net_device lifetime bf2f3d1970c0 net: move rollback_registered_many() b7b9e5cc8b24 x86/amd: Use IBPB for firmware calls
Total patches: = 101 - 12 = 89
Alex Deucher (1): drm/amdgpu/display: add quirk handling for stutter mode
Alexander Aring (1): dlm: fix pending remove if msg allocation fails
Alexey Kardashevskiy (1): KVM: Don't null dereference ops->destroy
Ben Dooks (1): riscv: add as-options for modules with assembly compontents
Biao Huang (1): net: stmmac: fix unbalanced ptp clock issue in suspend/resume flow
Dawid Lukwinski (1): i40e: Fix erroneous adapter reinitialization during recovery process
Demi Marie Obenour (1): xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
Eric Dumazet (1): bpf: Make sure mac_header was set before using it
Fabien Dessenne (1): pinctrl: stm32: fix optional IRQ support to gpios
Greg Kroah-Hartman (1): Revert "m68knommu: only set CONFIG_ISA_DMA_API for ColdFire sub-arch"
Haibo Chen (3): gpio: pca953x: only use single read/write for No AI mode gpio: pca953x: use the correct range when do regmap sync gpio: pca953x: use the correct register address when regcache sync during init
Hristo Venev (1): be2net: Fix buffer overflow in be_get_module_eeprom
Ido Schimmel (1): mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
Jakub Kicinski (5): docs: net: explain struct net_device lifetime net: move net_set_todo inside rollback_registered() net: inline rollback_registered() net: move rollback_registered_many() net: inline rollback_registered_many()
Jeffrey Hugo (4): PCI: hv: Fix multi-MSI to allow more than one MSI vector PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI PCI: hv: Reuse existing IRTE allocation in compose_msi_msg() PCI: hv: Fix interrupt mapping for multi-MSI
Jose Alonso (1): net: usb: ax88179_178a needs FLAG_SEND_ZLP
Junxiao Chang (1): net: stmmac: fix dma queue left shift overflow issue
Juri Lelli (1): sched/deadline: Fix BUG_ON condition for deboosted tasks
Kees Cook (1): x86/alternative: Report missing return thunk details
Kuniyuki Iwashima (37): ip: Fix data-races around sysctl_ip_no_pmtu_disc. ip: Fix data-races around sysctl_ip_fwd_use_pmtu. ip: Fix data-races around sysctl_ip_fwd_update_priority. ip: Fix data-races around sysctl_ip_nonlocal_bind. ip: Fix a data-race around sysctl_ip_autobind_reuse. ip: Fix a data-race around sysctl_fwmark_reflect. tcp/dccp: Fix a data-race around sysctl_tcp_fwmark_accept. tcp: Fix data-races around sysctl_tcp_mtu_probing. tcp: Fix data-races around sysctl_tcp_base_mss. tcp: Fix data-races around sysctl_tcp_min_snd_mss. tcp: Fix a data-race around sysctl_tcp_mtu_probe_floor. tcp: Fix a data-race around sysctl_tcp_probe_threshold. tcp: Fix a data-race around sysctl_tcp_probe_interval. igmp: Fix data-races around sysctl_igmp_llm_reports. igmp: Fix a data-race around sysctl_igmp_max_memberships. igmp: Fix data-races around sysctl_igmp_max_msf. tcp: Fix data-races around keepalive sysctl knobs. tcp: Fix data-races around sysctl_tcp_syncookies. tcp: Fix data-races around sysctl_tcp_reordering. tcp: Fix data-races around some timeout sysctl knobs. tcp: Fix a data-race around sysctl_tcp_notsent_lowat. tcp: Fix a data-race around sysctl_tcp_tw_reuse. tcp: Fix data-races around sysctl_max_syn_backlog. tcp: Fix data-races around sysctl_tcp_fastopen. tcp: Fix data-races around sysctl_tcp_fastopen_blackhole_timeout. ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh. ip: Fix data-races around sysctl_ip_prot_sock. udp: Fix a data-race around sysctl_udp_l3mdev_accept. tcp: Fix data-races around sysctl knobs related to SYN option. tcp: Fix a data-race around sysctl_tcp_early_retrans. tcp: Fix data-races around sysctl_tcp_recovery. tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. tcp: Fix data-races around sysctl_tcp_slow_start_after_idle. tcp: Fix a data-race around sysctl_tcp_retrans_collapse. tcp: Fix a data-race around sysctl_tcp_stdurg. tcp: Fix a data-race around sysctl_tcp_rfc1337. tcp: Fix data-races around sysctl_tcp_max_reordering.
Lennert Buytenhek (1): igc: Reinstate IGC_REMOVED logic and implement it properly
Liang He (1): drm/imx/dcss: Add missing of_node_put() in fail path
Luiz Augusto von Dentz (7): Bluetooth: Add bt_skb_sendmsg helper Bluetooth: Add bt_skb_sendmmsg helper Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg Bluetooth: Fix passing NULL to PTR_ERR Bluetooth: SCO: Fix sco_send_frame returning skb->len Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks
Marc Kleine-Budde (1): spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA transfers
Miaoqian Lin (1): power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe
Pali Rohár (1): serial: mvebu-uart: correctly report configured baudrate value
Pawan Gupta (1): x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS parts
Peter Zijlstra (3): perf/core: Fix data race between perf_event_set_output() and perf_mmap_close() bitfield.h: Fix "type of reg too small for mask" test x86/amd: Use IBPB for firmware calls
Piotr Skajewski (1): ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero
Przemyslaw Patynowski (1): iavf: Fix handling of dummy receive descriptors
Robert Hancock (1): i2c: cadence: Change large transfer count reset logic to be unconditional
Takashi Iwai (1): ALSA: memalloc: Align buffer allocations in page size
Tariq Toukan (1): net/tls: Fix race in TLS device down flow
Wang Cheng (1): mm/mempolicy: fix uninit-value in mpol_rebind_policy()
Wang ShaoBo (1): drm/imx/dcss: fix unused but set variable warnings
William Dean (1): pinctrl: ralink: Check for null return of devm_kcalloc
Documentation/networking/netdevices.rst | 171 +++++++++++++- arch/m68k/Kconfig.bus | 2 +- arch/riscv/Makefile | 1 + arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/mshyperv.h | 7 - arch/x86/include/asm/nospec-branch.h | 2 + arch/x86/kernel/alternative.c | 4 +- arch/x86/kernel/cpu/bugs.c | 14 +- drivers/gpio/gpio-pca953x.c | 22 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 +++ drivers/gpu/drm/imx/dcss/dcss-dev.c | 3 + drivers/gpu/drm/imx/dcss/dcss-plane.c | 2 - drivers/i2c/busses/i2c-cadence.c | 30 +-- .../chelsio/inline_crypto/chtls/chtls_cm.c | 6 +- drivers/net/ethernet/emulex/benet/be_cmds.c | 10 +- drivers/net/ethernet/emulex/benet/be_cmds.h | 2 +- .../net/ethernet/emulex/benet/be_ethtool.c | 31 ++- drivers/net/ethernet/intel/i40e/i40e_main.c | 13 +- drivers/net/ethernet/intel/iavf/iavf_txrx.c | 5 +- drivers/net/ethernet/intel/igc/igc_main.c | 3 + drivers/net/ethernet/intel/igc/igc_regs.h | 5 +- drivers/net/ethernet/intel/ixgbe/ixgbe.h | 1 + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 + .../net/ethernet/intel/ixgbe/ixgbe_sriov.c | 6 + .../ethernet/mellanox/mlxsw/spectrum_router.c | 5 +- .../net/ethernet/stmicro/stmmac/dwmac4_core.c | 3 + .../net/ethernet/stmicro/stmmac/stmmac_main.c | 17 +- .../ethernet/stmicro/stmmac/stmmac_platform.c | 8 +- drivers/net/usb/ax88179_178a.c | 20 +- drivers/pci/controller/pci-hyperv.c | 99 ++++++-- drivers/pinctrl/stm32/pinctrl-stm32.c | 18 +- drivers/power/reset/arm-versatile-reboot.c | 1 + drivers/spi/spi-bcm2835.c | 12 +- .../staging/mt7621-pinctrl/pinctrl-rt2880.c | 2 + drivers/tty/serial/mvebu-uart.c | 25 ++- drivers/xen/gntdev.c | 3 +- fs/dlm/lock.c | 3 +- include/linux/bitfield.h | 19 +- include/net/bluetooth/bluetooth.h | 65 ++++++ include/net/inet_sock.h | 5 +- include/net/ip.h | 6 +- include/net/tcp.h | 18 +- include/net/udp.h | 2 +- kernel/bpf/core.c | 8 +- kernel/events/core.c | 45 ++-- kernel/sched/deadline.c | 5 +- mm/mempolicy.c | 2 +- net/bluetooth/rfcomm/core.c | 50 ++++- net/bluetooth/rfcomm/sock.c | 46 +--- net/bluetooth/sco.c | 30 +-- net/core/dev.c | 212 ++++++++---------- net/core/filter.c | 4 +- net/core/secure_seq.c | 4 +- net/ipv4/af_inet.c | 4 +- net/ipv4/fib_semantics.c | 2 +- net/ipv4/icmp.c | 2 +- net/ipv4/igmp.c | 25 ++- net/ipv4/inet_connection_sock.c | 2 +- net/ipv4/ip_forward.c | 2 +- net/ipv4/ip_sockglue.c | 6 +- net/ipv4/route.c | 2 +- net/ipv4/syncookies.c | 9 +- net/ipv4/sysctl_net_ipv4.c | 6 +- net/ipv4/tcp.c | 10 +- net/ipv4/tcp_fastopen.c | 9 +- net/ipv4/tcp_input.c | 51 +++-- net/ipv4/tcp_ipv4.c | 2 +- net/ipv4/tcp_metrics.c | 3 +- net/ipv4/tcp_minisocks.c | 2 +- net/ipv4/tcp_output.c | 29 +-- net/ipv4/tcp_recovery.c | 6 +- net/ipv4/tcp_timer.c | 20 +- net/ipv6/af_inet6.c | 2 +- net/ipv6/syncookies.c | 3 +- net/sctp/protocol.c | 2 +- net/smc/smc_llc.c | 2 +- net/tls/tls_device.c | 8 +- net/xfrm/xfrm_state.c | 2 +- sound/core/memalloc.c | 1 + virt/kvm/kvm_main.c | 5 +- 80 files changed, 877 insertions(+), 454 deletions(-)
From: Fabien Dessenne fabien.dessenne@foss.st.com
stable inclusion from stable-v5.10.134 commit 31f3bb363a8972891b51747f27665663bce17a5b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit a1d4ef1adf8bbd302067534ead671a94759687ed upstream.
To act as an interrupt controller, a gpio bank relies on the "interrupt-parent" of the pin controller. When this optional "interrupt-parent" misses, do not create any IRQ domain.
This fixes a "NULL pointer in stm32_gpio_domain_alloc()" kernel crash when the interrupt-parent = <exti> property is not declared in the Device Tree.
Fixes: 0eb9f683336d ("pinctrl: Add IRQ support to STM32 gpios") Signed-off-by: Fabien Dessenne fabien.dessenne@foss.st.com Link: https://lore.kernel.org/r/20220627142350.742973-1-fabien.dessenne@foss.st.co... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/pinctrl/stm32/pinctrl-stm32.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/pinctrl/stm32/pinctrl-stm32.c b/drivers/pinctrl/stm32/pinctrl-stm32.c index b017dd400c46..60406f1f8337 100644 --- a/drivers/pinctrl/stm32/pinctrl-stm32.c +++ b/drivers/pinctrl/stm32/pinctrl-stm32.c @@ -1303,15 +1303,17 @@ static int stm32_gpiolib_register_bank(struct stm32_pinctrl *pctl, bank->bank_ioport_nr = bank_ioport_nr; spin_lock_init(&bank->lock);
- /* create irq hierarchical domain */ - bank->fwnode = of_node_to_fwnode(np); + if (pctl->domain) { + /* create irq hierarchical domain */ + bank->fwnode = of_node_to_fwnode(np);
- bank->domain = irq_domain_create_hierarchy(pctl->domain, 0, - STM32_GPIO_IRQ_LINE, bank->fwnode, - &stm32_gpio_domain_ops, bank); + bank->domain = irq_domain_create_hierarchy(pctl->domain, 0, STM32_GPIO_IRQ_LINE, + bank->fwnode, &stm32_gpio_domain_ops, + bank);
- if (!bank->domain) - return -ENODEV; + if (!bank->domain) + return -ENODEV; + }
err = gpiochip_add_data(&bank->gpio_chip, bank); if (err) { @@ -1481,6 +1483,8 @@ int stm32_pctl_probe(struct platform_device *pdev) pctl->domain = stm32_pctrl_get_irq_domain(np); if (IS_ERR(pctl->domain)) return PTR_ERR(pctl->domain); + if (!pctl->domain) + dev_warn(dev, "pinctrl without interrupt support\n");
/* hwspinlock is optional */ hwlock_id = of_hwspin_lock_get_id(pdev->dev.of_node, 0);
From: Ben Dooks ben.dooks@codethink.co.uk
stable inclusion from stable-v5.10.134 commit 15155fa898cb0530c3c44eaf92df18e04bef613f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit c1f6eff304e4dfa4558b6a8c6b2d26a91db6c998 upstream.
When trying to load modules built for RISC-V which include assembly files the kernel loader errors with "unexpected relocation type 'R_RISCV_ALIGN'" due to R_RISCV_ALIGN relocations being generated by the assembler.
The R_RISCV_ALIGN relocations can be removed at the expense of code space by adding -mno-relax to gcc and as. In commit 7a8e7da42250138 ("RISC-V: Fixes to module loading") -mno-relax is added to the build variable KBUILD_CFLAGS_MODULE. See [1] for more info.
The issue is that when kbuild builds a .S file, it invokes gcc with the -mno-relax flag, but this is not being passed through to the assembler. Adding -Wa,-mno-relax to KBUILD_AFLAGS_MODULE ensures that the assembler is invoked correctly. This may have now been fixed in gcc[2] and this addition should not stop newer gcc and as from working.
[1] https://github.com/riscv/riscv-elf-psabi-doc/issues/183 [2] https://github.com/gcc-mirror/gcc/commit/3b0a7d624e64eeb81e4d5e8c62c46d86ef5...
Signed-off-by: Ben Dooks ben.dooks@codethink.co.uk Reviewed-by: Bin Meng bmeng.cn@gmail.com Link: https://lore.kernel.org/r/20220529152200.609809-1-ben.dooks@codethink.co.uk Fixes: ab1ef68e5401 ("RISC-V: Add sections of PLT and GOT for kernel module") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- arch/riscv/Makefile | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile index e8462dd7f7ee..07957f1f3480 100644 --- a/arch/riscv/Makefile +++ b/arch/riscv/Makefile @@ -73,6 +73,7 @@ ifeq ($(CONFIG_PERF_EVENTS),y) endif
KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax) +KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
# GCC versions that support the "-mstrict-align" option default to allowing # unaligned accesses. While unaligned accesses are explicitly allowed in the
From: Ido Schimmel idosch@nvidia.com
stable inclusion from stable-v5.10.134 commit 426336de3557b5b7290399425b5ca142e635a777 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit e5ec6a2513383fe2ecc2ee3b5f51d97acbbcd4d8 upstream.
mlxsw needs to distinguish nexthops with a gateway from connected nexthops in order to write the former to the adjacency table of the device. The check used to rely on the fact that nexthops with a gateway have a 'link' scope whereas connected nexthops have a 'host' scope. This is no longer correct after commit 747c14307214 ("ip: fix dflt addr selection for connected nexthop").
Fix that by instead checking the address family of the gateway IP. This is a more direct way and also consistent with the IPv6 counterpart in mlxsw_sp_rt6_is_gateway().
Cc: stable@vger.kernel.org Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop") Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Amit Cohen amcohen@nvidia.com Reviewed-by: Nicolas Dichtel nicolas.dichtel@6wind.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index 53128382fc2e..5f143ca16c01 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -4003,7 +4003,7 @@ static bool mlxsw_sp_fi_is_gateway(const struct mlxsw_sp *mlxsw_sp, { const struct fib_nh *nh = fib_info_nh(fi, 0);
- return nh->fib_nh_scope == RT_SCOPE_LINK || + return nh->fib_nh_gw_family || mlxsw_sp_nexthop4_ipip_type(mlxsw_sp, nh, NULL); }
From: Demi Marie Obenour demi@invisiblethingslab.com
stable inclusion from stable-v5.10.134 commit 7a99c7c32c85cd5239600533b77a34f884741fcc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 166d3863231667c4f64dee72b77d1102cdfad11f upstream.
The error paths of gntdev_mmap() can call unmap_grant_pages() even though not all of the pages have been successfully mapped. This will trigger the WARN_ON()s in __unmap_grant_pages_done(). The number of warnings can be very large; I have observed thousands of lines of warnings in the systemd journal.
Avoid this problem by only warning on unmapping failure if the handle being unmapped is not INVALID_GRANT_HANDLE. The handle field of any page that was not successfully mapped will be INVALID_GRANT_HANDLE, so this catches all cases where unmapping can legitimately fail.
Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()") Cc: stable@vger.kernel.org Suggested-by: Juergen Gross jgross@suse.com Signed-off-by: Demi Marie Obenour demi@invisiblethingslab.com Reviewed-by: Oleksandr Tyshchenko oleksandr_tyshchenko@epam.com Reviewed-by: Juergen Gross jgross@suse.com Link: https://lore.kernel.org/r/20220710230522.1563-1-demi@invisiblethingslab.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/xen/gntdev.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index f415c056ff8a..54fee4087bf1 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -401,7 +401,8 @@ static void __unmap_grant_pages_done(int result, unsigned int offset = data->unmap_ops - map->unmap_ops;
for (i = 0; i < data->count; i++) { - WARN_ON(map->unmap_ops[offset+i].status); + WARN_ON(map->unmap_ops[offset+i].status && + map->unmap_ops[offset+i].handle != -1); pr_debug("unmap handle=%d st=%d\n", map->unmap_ops[offset+i].handle, map->unmap_ops[offset+i].status);
From: Jakub Kicinski kuba@kernel.org
stable inclusion from stable-v5.10.134 commit 2686f62fa78ce4734e871af80161a602d22be843 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 2b446e650b418f9a9e75f99852e2f2560cabfa17 upstream.
Explain the two basic flows of struct net_device's operation.
Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Conflicts: net/core/rtnetlink.c [zzk: code to be modified was deleted by commit ("net: make free_netdev() more lenient with unregistering devices")] Reviewed-by: Wei Li liwei391@huawei.com --- Documentation/networking/netdevices.rst | 171 +++++++++++++++++++++++- 1 file changed, 165 insertions(+), 6 deletions(-)
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst index 5a85fcc80c76..557b97483437 100644 --- a/Documentation/networking/netdevices.rst +++ b/Documentation/networking/netdevices.rst @@ -10,18 +10,177 @@ Introduction The following is a random collection of documentation regarding network devices.
-struct net_device allocation rules -================================== +struct net_device lifetime rules +================================ Network device structures need to persist even after module is unloaded and must be allocated with alloc_netdev_mqs() and friends. If device has registered successfully, it will be freed on last use -by free_netdev(). This is required to handle the pathologic case cleanly -(example: rmmod mydriver </sys/class/net/myeth/mtu ) +by free_netdev(). This is required to handle the pathological case cleanly +(example: ``rmmod mydriver </sys/class/net/myeth/mtu``)
-alloc_netdev_mqs()/alloc_netdev() reserve extra space for driver +alloc_netdev_mqs() / alloc_netdev() reserve extra space for driver private data which gets freed when the network device is freed. If separately allocated data is attached to the network device -(netdev_priv(dev)) then it is up to the module exit handler to free that. +(netdev_priv()) then it is up to the module exit handler to free that. + +There are two groups of APIs for registering struct net_device. +First group can be used in normal contexts where ``rtnl_lock`` is not already +held: register_netdev(), unregister_netdev(). +Second group can be used when ``rtnl_lock`` is already held: +register_netdevice(), unregister_netdevice(), free_netdevice(). + +Simple drivers +-------------- + +Most drivers (especially device drivers) handle lifetime of struct net_device +in context where ``rtnl_lock`` is not held (e.g. driver probe and remove paths). + +In that case the struct net_device registration is done using +the register_netdev(), and unregister_netdev() functions: + +.. code-block:: c + + int probe() + { + struct my_device_priv *priv; + int err; + + dev = alloc_netdev_mqs(...); + if (!dev) + return -ENOMEM; + priv = netdev_priv(dev); + + /* ... do all device setup before calling register_netdev() ... + */ + + err = register_netdev(dev); + if (err) + goto err_undo; + + /* net_device is visible to the user! */ + + err_undo: + /* ... undo the device setup ... */ + free_netdev(dev); + return err; + } + + void remove() + { + unregister_netdev(dev); + free_netdev(dev); + } + +Note that after calling register_netdev() the device is visible in the system. +Users can open it and start sending / receiving traffic immediately, +or run any other callback, so all initialization must be done prior to +registration. + +unregister_netdev() closes the device and waits for all users to be done +with it. The memory of struct net_device itself may still be referenced +by sysfs but all operations on that device will fail. + +free_netdev() can be called after unregister_netdev() returns on when +register_netdev() failed. + +Device management under RTNL +---------------------------- + +Registering struct net_device while in context which already holds +the ``rtnl_lock`` requires extra care. In those scenarios most drivers +will want to make use of struct net_device's ``needs_free_netdev`` +and ``priv_destructor`` members for freeing of state. + +Example flow of netdev handling under ``rtnl_lock``: + +.. code-block:: c + + static void my_setup(struct net_device *dev) + { + dev->needs_free_netdev = true; + } + + static void my_destructor(struct net_device *dev) + { + some_obj_destroy(priv->obj); + some_uninit(priv); + } + + int create_link() + { + struct my_device_priv *priv; + int err; + + ASSERT_RTNL(); + + dev = alloc_netdev(sizeof(*priv), "net%d", NET_NAME_UNKNOWN, my_setup); + if (!dev) + return -ENOMEM; + priv = netdev_priv(dev); + + /* Implicit constructor */ + err = some_init(priv); + if (err) + goto err_free_dev; + + priv->obj = some_obj_create(); + if (!priv->obj) { + err = -ENOMEM; + goto err_some_uninit; + } + /* End of constructor, set the destructor: */ + dev->priv_destructor = my_destructor; + + err = register_netdevice(dev); + if (err) + /* register_netdevice() calls destructor on failure */ + goto err_free_dev; + + /* If anything fails now unregister_netdevice() (or unregister_netdev()) + * will take care of calling my_destructor and free_netdev(). + */ + + return 0; + + err_some_uninit: + some_uninit(priv); + err_free_dev: + free_netdev(dev); + return err; + } + +If struct net_device.priv_destructor is set it will be called by the core +some time after unregister_netdevice(), it will also be called if +register_netdevice() fails. The callback may be invoked with or without +``rtnl_lock`` held. + +There is no explicit constructor callback, driver "constructs" the private +netdev state after allocating it and before registration. + +Setting struct net_device.needs_free_netdev makes core call free_netdevice() +automatically after unregister_netdevice() when all references to the device +are gone. It only takes effect after a successful call to register_netdevice() +so if register_netdevice() fails driver is responsible for calling +free_netdev(). + +free_netdev() is safe to call on error paths right after unregister_netdevice() +or when register_netdevice() fails. Parts of netdev (de)registration process +happen after ``rtnl_lock`` is released, therefore in those cases free_netdev() +will defer some of the processing until ``rtnl_lock`` is released. + +Devices spawned from struct rtnl_link_ops should never free the +struct net_device directly. + +.ndo_init and .ndo_uninit +~~~~~~~~~~~~~~~~~~~~~~~~~ + +``.ndo_init`` and ``.ndo_uninit`` callbacks are called during net_device +registration and de-registration, under ``rtnl_lock``. Drivers can use +those e.g. when parts of their init process need to run under ``rtnl_lock``. + +``.ndo_init`` runs before device is visible in the system, ``.ndo_uninit`` +runs during de-registering after device is closed but other subsystems +may still have outstanding references to the netdevice.
MTU ===
From: Jakub Kicinski kuba@kernel.org
stable inclusion from stable-v5.10.134 commit b1158677d46bd67e32d6606d1f8c01d8ba9e6d22 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 2014beea7eb165c745706b13659a0f1d0a9a2a61 upstream.
Commit 93ee31f14f6f ("[NET]: Fix free_netdev on register_netdev failure.") moved net_set_todo() outside of rollback_registered() so that rollback_registered() can be used in the failure path of register_netdevice() but without risking a double free.
Since commit cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state."), however, we have a better way of handling that condition, since destructors don't call free_netdev() directly.
After the change in commit c269a24ce057 ("net: make free_netdev() more lenient with unregistering devices") we can now move net_set_todo() back.
Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/core/dev.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c index ee0b405681db..8c9d25e5c6c1 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9602,8 +9602,10 @@ static void rollback_registered_many(struct list_head *head)
synchronize_net();
- list_for_each_entry(dev, head, unreg_list) + list_for_each_entry(dev, head, unreg_list) { dev_put(dev); + net_set_todo(dev); + } }
static void rollback_registered(struct net_device *dev) @@ -10154,7 +10156,6 @@ int register_netdevice(struct net_device *dev) /* Expect explicit free_netdev() on failure */ dev->needs_free_netdev = false; rollback_registered(dev); - net_set_todo(dev); goto out; } /* @@ -10769,8 +10770,6 @@ void unregister_netdevice_queue(struct net_device *dev, struct list_head *head) list_move_tail(&dev->unreg_list, head); } else { rollback_registered(dev); - /* Finish processing unregister after unlock */ - net_set_todo(dev); } } EXPORT_SYMBOL(unregister_netdevice_queue); @@ -10784,12 +10783,8 @@ EXPORT_SYMBOL(unregister_netdevice_queue); */ void unregister_netdevice_many(struct list_head *head) { - struct net_device *dev; - if (!list_empty(head)) { rollback_registered_many(head); - list_for_each_entry(dev, head, unreg_list) - net_set_todo(dev); list_del(head); } }
From: Jakub Kicinski kuba@kernel.org
stable inclusion from stable-v5.10.134 commit 672fac0a4372b7cc053bc7458c536eaea450a572 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 037e56bd965e1bc72c2fa9684ac25b56839a338e upstream.
rollback_registered() is a local helper, it's common for driver code to call unregister_netdevice_queue(dev, NULL) when they want to unregister netdevices under rtnl_lock. Inline rollback_registered() and adjust the only remaining caller.
Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/core/dev.c | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c index 8c9d25e5c6c1..981a93c3036f 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9608,15 +9608,6 @@ static void rollback_registered_many(struct list_head *head) } }
-static void rollback_registered(struct net_device *dev) -{ - LIST_HEAD(single); - - list_add(&dev->unreg_list, &single); - rollback_registered_many(&single); - list_del(&single); -} - static netdev_features_t netdev_sync_upper_features(struct net_device *lower, struct net_device *upper, netdev_features_t features) { @@ -10155,7 +10146,7 @@ int register_netdevice(struct net_device *dev) if (ret) { /* Expect explicit free_netdev() on failure */ dev->needs_free_netdev = false; - rollback_registered(dev); + unregister_netdevice_queue(dev, NULL); goto out; } /* @@ -10769,7 +10760,11 @@ void unregister_netdevice_queue(struct net_device *dev, struct list_head *head) if (head) { list_move_tail(&dev->unreg_list, head); } else { - rollback_registered(dev); + LIST_HEAD(single); + + list_add(&dev->unreg_list, &single); + rollback_registered_many(&single); + list_del(&single); } } EXPORT_SYMBOL(unregister_netdevice_queue);
From: Jakub Kicinski kuba@kernel.org
stable inclusion from stable-v5.10.134 commit bf2f3d1970c03f14c84515738de5c0cb28c8ebce category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit bcfe2f1a3818d9dca945b6aca4ae741cb1f75329 upstream.
Move rollback_registered_many() and add a temporary forward declaration to make merging the code into unregister_netdevice_many() easier to review.
No functional changes.
Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Conflicts: net/core/dev.c [zzk: commit ("net/ns: put workqueue of cleanup_net sleep for a while when notify") adds cond_resched() into function which is being moved] Reviewed-by: Wei Li liwei391@huawei.com --- net/core/dev.c | 190 +++++++++++++++++++++++++------------------------ 1 file changed, 96 insertions(+), 94 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c index 981a93c3036f..f0dac979458a 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9514,100 +9514,6 @@ static void net_set_todo(struct net_device *dev) dev_net(dev)->dev_unreg_count++; }
-static void rollback_registered_many(struct list_head *head) -{ - struct net_device *dev, *tmp; - LIST_HEAD(close_head); - - BUG_ON(dev_boot_phase); - ASSERT_RTNL(); - - list_for_each_entry_safe(dev, tmp, head, unreg_list) { - /* Some devices call without registering - * for initialization unwind. Remove those - * devices and proceed with the remaining. - */ - if (dev->reg_state == NETREG_UNINITIALIZED) { - pr_debug("unregister_netdevice: device %s/%p never was registered\n", - dev->name, dev); - - WARN_ON(1); - list_del(&dev->unreg_list); - continue; - } - dev->dismantle = true; - BUG_ON(dev->reg_state != NETREG_REGISTERED); - } - - /* If device is running, close it first. */ - list_for_each_entry(dev, head, unreg_list) - list_add_tail(&dev->close_list, &close_head); - dev_close_many(&close_head, true); - - list_for_each_entry(dev, head, unreg_list) { - /* And unlink it from device chain. */ - unlist_netdevice(dev); - - dev->reg_state = NETREG_UNREGISTERING; - } - flush_all_backlogs(); - - synchronize_net(); - - list_for_each_entry(dev, head, unreg_list) { - struct sk_buff *skb = NULL; - - /* Shutdown queueing discipline. */ - dev_shutdown(dev); - - dev_xdp_uninstall(dev); - - /* Notify protocols, that we are about to destroy - * this device. They should clean all the things. - */ - call_netdevice_notifiers(NETDEV_UNREGISTER, dev); - - if (!dev->rtnl_link_ops || - dev->rtnl_link_state == RTNL_LINK_INITIALIZED) - skb = rtmsg_ifinfo_build_skb(RTM_DELLINK, dev, ~0U, 0, - GFP_KERNEL, NULL, 0); - - /* - * Flush the unicast and multicast chains - */ - dev_uc_flush(dev); - dev_mc_flush(dev); - - netdev_name_node_alt_flush(dev); - netdev_name_node_free(dev->name_node); - - if (dev->netdev_ops->ndo_uninit) - dev->netdev_ops->ndo_uninit(dev); - - if (skb) - rtmsg_ifinfo_send(skb, dev, GFP_KERNEL); - - /* Notifier chain MUST detach us all upper devices. */ - WARN_ON(netdev_has_any_upper_dev(dev)); - WARN_ON(netdev_has_any_lower_dev(dev)); - - /* Remove entries from kobject tree */ - netdev_unregister_kobject(dev); -#ifdef CONFIG_XPS - /* Remove XPS queueing entries */ - netif_reset_xps_queues_gt(dev, 0); -#endif - cond_resched(); - } - - synchronize_net(); - - list_for_each_entry(dev, head, unreg_list) { - dev_put(dev); - net_set_todo(dev); - } -} - static netdev_features_t netdev_sync_upper_features(struct net_device *lower, struct net_device *upper, netdev_features_t features) { @@ -10740,6 +10646,8 @@ void synchronize_net(void) } EXPORT_SYMBOL(synchronize_net);
+static void rollback_registered_many(struct list_head *head); + /** * unregister_netdevice_queue - remove device from the kernel * @dev: device @@ -10785,6 +10693,100 @@ void unregister_netdevice_many(struct list_head *head) } EXPORT_SYMBOL(unregister_netdevice_many);
+static void rollback_registered_many(struct list_head *head) +{ + struct net_device *dev, *tmp; + LIST_HEAD(close_head); + + BUG_ON(dev_boot_phase); + ASSERT_RTNL(); + + list_for_each_entry_safe(dev, tmp, head, unreg_list) { + /* Some devices call without registering + * for initialization unwind. Remove those + * devices and proceed with the remaining. + */ + if (dev->reg_state == NETREG_UNINITIALIZED) { + pr_debug("unregister_netdevice: device %s/%p never was registered\n", + dev->name, dev); + + WARN_ON(1); + list_del(&dev->unreg_list); + continue; + } + dev->dismantle = true; + BUG_ON(dev->reg_state != NETREG_REGISTERED); + } + + /* If device is running, close it first. */ + list_for_each_entry(dev, head, unreg_list) + list_add_tail(&dev->close_list, &close_head); + dev_close_many(&close_head, true); + + list_for_each_entry(dev, head, unreg_list) { + /* And unlink it from device chain. */ + unlist_netdevice(dev); + + dev->reg_state = NETREG_UNREGISTERING; + } + flush_all_backlogs(); + + synchronize_net(); + + list_for_each_entry(dev, head, unreg_list) { + struct sk_buff *skb = NULL; + + /* Shutdown queueing discipline. */ + dev_shutdown(dev); + + dev_xdp_uninstall(dev); + + /* Notify protocols, that we are about to destroy + * this device. They should clean all the things. + */ + call_netdevice_notifiers(NETDEV_UNREGISTER, dev); + + if (!dev->rtnl_link_ops || + dev->rtnl_link_state == RTNL_LINK_INITIALIZED) + skb = rtmsg_ifinfo_build_skb(RTM_DELLINK, dev, ~0U, 0, + GFP_KERNEL, NULL, 0); + + /* + * Flush the unicast and multicast chains + */ + dev_uc_flush(dev); + dev_mc_flush(dev); + + netdev_name_node_alt_flush(dev); + netdev_name_node_free(dev->name_node); + + if (dev->netdev_ops->ndo_uninit) + dev->netdev_ops->ndo_uninit(dev); + + if (skb) + rtmsg_ifinfo_send(skb, dev, GFP_KERNEL); + + /* Notifier chain MUST detach us all upper devices. */ + WARN_ON(netdev_has_any_upper_dev(dev)); + WARN_ON(netdev_has_any_lower_dev(dev)); + + /* Remove entries from kobject tree */ + netdev_unregister_kobject(dev); +#ifdef CONFIG_XPS + /* Remove XPS queueing entries */ + netif_reset_xps_queues_gt(dev, 0); +#endif + cond_resched(); + } + + synchronize_net(); + + list_for_each_entry(dev, head, unreg_list) { + dev_put(dev); + net_set_todo(dev); + } +} + /** * unregister_netdev - remove device from the kernel * @dev: device
From: Jakub Kicinski kuba@kernel.org
stable inclusion from stable-v5.10.134 commit 4f900c37f13eef18e56f541b525d1e743948e01c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 0cbe1e57a7b93517100b0eb63d8e445cfbeb630c upstream.
Similar to the change for rollback_registered() - rollback_registered_many() was a part of unregister_netdevice_many() minus the net_set_todo(), which is no longer needed.
Functionally this patch moves the list_empty() check back after:
BUG_ON(dev_boot_phase); ASSERT_RTNL();
but I can't find any reason why that would be an issue.
Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/core/dev.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c index f0dac979458a..326f4ce33a3d 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5756,7 +5756,7 @@ static void flush_all_backlogs(void) }
/* we can have in flight packet[s] on the cpus we are not flushing, - * synchronize_net() in rollback_registered_many() will take care of + * synchronize_net() in unregister_netdevice_many() will take care of * them */ for_each_cpu(cpu, &flush_cpus) @@ -10646,8 +10646,6 @@ void synchronize_net(void) } EXPORT_SYMBOL(synchronize_net);
-static void rollback_registered_many(struct list_head *head); - /** * unregister_netdevice_queue - remove device from the kernel * @dev: device @@ -10671,8 +10669,7 @@ void unregister_netdevice_queue(struct net_device *dev, struct list_head *head) LIST_HEAD(single);
list_add(&dev->unreg_list, &single); - rollback_registered_many(&single); - list_del(&single); + unregister_netdevice_many(&single); } } EXPORT_SYMBOL(unregister_netdevice_queue); @@ -10685,15 +10682,6 @@ EXPORT_SYMBOL(unregister_netdevice_queue); * we force a list_del() to make sure stack wont be corrupted later. */ void unregister_netdevice_many(struct list_head *head) -{ - if (!list_empty(head)) { - rollback_registered_many(head); - list_del(head); - } -} -EXPORT_SYMBOL(unregister_netdevice_many); - -static void rollback_registered_many(struct list_head *head) { struct net_device *dev, *tmp; LIST_HEAD(close_head); @@ -10701,6 +10689,9 @@ static void rollback_registered_many(struct list_head *head) BUG_ON(dev_boot_phase); ASSERT_RTNL();
+ if (list_empty(head)) + return; + list_for_each_entry_safe(dev, tmp, head, unreg_list) { /* Some devices call without registering * for initialization unwind. Remove those @@ -10785,7 +10776,10 @@ static void rollback_registered_many(struct list_head *head) dev_put(dev); net_set_todo(dev); } + + list_del(head); } +EXPORT_SYMBOL(unregister_netdevice_many);
/** * unregister_netdev - remove device from the kernel
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
stable inclusion from stable-v5.10.134 commit b07240ce4a09d9771d544e15a3a4ec008cc186ad category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
This reverts commit 87ae522e467e17a13b796e2cb595f9c3943e4d5e which is commit db87db65c1059f3be04506d122f8ec9b2fa3b05e upstream.
It is not needed in 5.10.y and causes problems.
Link: https://lore.kernel.org/r/CAK8P3a0vZrXxNp3YhrxFjFunHgxSZBKD9Y4darSODgeFAukCe... Reported-by: kernel test robot lkp@intel.com Reported-by: Arnd Bergmann arnd@arndb.de Cc: Greg Ungerer gerg@linux-m68k.org Cc: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- arch/m68k/Kconfig.bus | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/m68k/Kconfig.bus b/arch/m68k/Kconfig.bus index d1e93a39cd3b..f1be832e2b74 100644 --- a/arch/m68k/Kconfig.bus +++ b/arch/m68k/Kconfig.bus @@ -63,7 +63,7 @@ source "drivers/zorro/Kconfig"
endif
-if COLDFIRE +if !MMU
config ISA_DMA_API def_bool !M5272
From: Jeffrey Hugo quic_jhugo@quicinc.com
stable inclusion from stable-v5.10.134 commit f1d2f1ce05355742f0fdb721e30ddd03de90be94 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 08e61e861a0e47e5e1a3fb78406afd6b0cea6b6d upstream.
If the allocation of multiple MSI vectors for multi-MSI fails in the core PCI framework, the framework will retry the allocation as a single MSI vector, assuming that meets the min_vecs specified by the requesting driver.
Hyper-V advertises that multi-MSI is supported, but reuses the VECTOR domain to implement that for x86. The VECTOR domain does not support multi-MSI, so the alloc will always fail and fallback to a single MSI allocation.
In short, Hyper-V advertises a capability it does not implement.
Hyper-V can support multi-MSI because it coordinates with the hypervisor to map the MSIs in the IOMMU's interrupt remapper, which is something the VECTOR domain does not have. Therefore the fix is simple - copy what the x86 IOMMU drivers (AMD/Intel-IR) do by removing X86_IRQ_ALLOC_CONTIGUOUS_VECTORS after calling the VECTOR domain's pci_msi_prepare().
5.10 backport - adds the hv_msi_prepare wrapper function
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Reviewed-by: Dexuan Cui decui@microsoft.com Link: https://lore.kernel.org/r/1649856981-14649-1-git-send-email-quic_jhugo@quici... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Carl Vanderlip quic_carlv@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/pci/controller/pci-hyperv.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index a070e69bb49c..84085939769c 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1180,6 +1180,21 @@ static void hv_irq_mask(struct irq_data *data) pci_msi_mask_irq(data); }
+static int hv_msi_prepare(struct irq_domain *domain, struct device *dev, + int nvec, msi_alloc_info_t *info) +{ + int ret = pci_msi_prepare(domain, dev, nvec, info); + + /* + * By using the interrupt remapper in the hypervisor IOMMU, contiguous + * CPU vectors is not needed for multi-MSI + */ + if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI) + info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS; + + return ret; +} + /** * hv_irq_unmask() - "Unmask" the IRQ by setting its current * affinity. @@ -1545,7 +1560,7 @@ static struct irq_chip hv_msi_irq_chip = { };
static struct msi_domain_ops hv_msi_ops = { - .msi_prepare = pci_msi_prepare, + .msi_prepare = hv_msi_prepare, .msi_free = hv_msi_free, };
From: Jeffrey Hugo quic_jhugo@quicinc.com
stable inclusion from stable-v5.10.134 commit 73bf070408a7f07e813ab26ebde1b09fca159cd6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 455880dfe292a2bdd3b4ad6a107299fce610e64b upstream.
In the multi-MSI case, hv_arch_irq_unmask() will only operate on the first MSI of the N allocated. This is because only the first msi_desc is cached and it is shared by all the MSIs of the multi-MSI block. This means that hv_arch_irq_unmask() gets the correct address, but the wrong data (always 0).
This can break MSIs.
Lets assume MSI0 is vector 34 on CPU0, and MSI1 is vector 33 on CPU0.
hv_arch_irq_unmask() is called on MSI0. It uses a hypercall to configure the MSI address and data (0) to vector 34 of CPU0. This is correct. Then hv_arch_irq_unmask is called on MSI1. It uses another hypercall to configure the MSI address and data (0) to vector 33 of CPU0. This is wrong, and results in both MSI0 and MSI1 being routed to vector 33. Linux will observe extra instances of MSI1 and no instances of MSI0 despite the endpoint device behaving correctly.
For the multi-MSI case, we need unique address and data info for each MSI, but the cached msi_desc does not provide that. However, that information can be gotten from the int_desc cached in the chip_data by compose_msi_msg(). Fix the multi-MSI case to use that cached information instead. Since hv_set_msi_entry_from_desc() is no longer applicable, remove it.
5.10 backport - removed unused hv_set_msi_entry_from_desc function from mshyperv.h instead of pci-hyperv.c. msi_entry.address/data.as_uint32 changed to direct reference (as they are u32's, just sans union).
Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Reviewed-by: Michael Kelley mikelley@microsoft.com Link: https://lore.kernel.org/r/1651068453-29588-1-git-send-email-quic_jhugo@quici... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Carl Vanderlip quic_carlv@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- arch/x86/include/asm/mshyperv.h | 7 ------- drivers/pci/controller/pci-hyperv.c | 5 ++++- 2 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 30f76b966857..871a8b06d430 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -247,13 +247,6 @@ bool hv_vcpu_is_preempted(int vcpu); static inline void hv_apic_init(void) {} #endif
-static inline void hv_set_msi_entry_from_desc(union hv_msi_entry *msi_entry, - struct msi_desc *msi_desc) -{ - msi_entry->address = msi_desc->msg.address_lo; - msi_entry->data = msi_desc->msg.data; -} - #else /* CONFIG_HYPERV */ static inline void hyperv_init(void) {} static inline void hyperv_setup_mmu_ops(void) {} diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 84085939769c..a1f27c555aec 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1210,6 +1210,7 @@ static void hv_irq_unmask(struct irq_data *data) struct msi_desc *msi_desc = irq_data_get_msi_desc(data); struct irq_cfg *cfg = irqd_cfg(data); struct hv_retarget_device_interrupt *params; + struct tran_int_desc *int_desc; struct hv_pcibus_device *hbus; struct cpumask *dest; cpumask_var_t tmp; @@ -1224,6 +1225,7 @@ static void hv_irq_unmask(struct irq_data *data) pdev = msi_desc_to_pci_dev(msi_desc); pbus = pdev->bus; hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata); + int_desc = data->chip_data;
spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags);
@@ -1231,7 +1233,8 @@ static void hv_irq_unmask(struct irq_data *data) memset(params, 0, sizeof(*params)); params->partition_id = HV_PARTITION_ID_SELF; params->int_entry.source = 1; /* MSI(-X) */ - hv_set_msi_entry_from_desc(¶ms->int_entry.msi_entry, msi_desc); + params->int_entry.msi_entry.address = int_desc->address & 0xffffffff; + params->int_entry.msi_entry.data = int_desc->data; params->device_id = (hbus->hdev->dev_instance.b[5] << 24) | (hbus->hdev->dev_instance.b[4] << 16) | (hbus->hdev->dev_instance.b[7] << 8) |
From: Jeffrey Hugo quic_jhugo@quicinc.com
stable inclusion from stable-v5.10.134 commit 522bd31d6b4bb783e4454d8f11d012e77c627648 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit b4b77778ecc5bfbd4e77de1b2fd5c1dd3c655f1f upstream.
Currently if compose_msi_msg() is called multiple times, it will free any previous IRTE allocation, and generate a new allocation. While nothing prevents this from occurring, it is extraneous when Linux could just reuse the existing allocation and avoid a bunch of overhead.
However, when future IRTE allocations operate on blocks of MSIs instead of a single line, freeing the allocation will impact all of the lines. This could cause an issue where an allocation of N MSIs occurs, then some of the lines are retargeted, and finally the allocation is freed/reallocated. The freeing of the allocation removes all of the configuration for the entire block, which requires all the lines to be retargeted, which might not happen since some lines might already be unmasked/active.
Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Reviewed-by: Dexuan Cui decui@microsoft.com Tested-by: Dexuan Cui decui@microsoft.com Tested-by: Michael Kelley mikelley@microsoft.com Link: https://lore.kernel.org/r/1652282582-21595-1-git-send-email-quic_jhugo@quici... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Carl Vanderlip quic_carlv@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/pci/controller/pci-hyperv.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index a1f27c555aec..6e2035deb52a 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1409,6 +1409,15 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) u32 size; int ret;
+ /* Reuse the previous allocation */ + if (data->chip_data) { + int_desc = data->chip_data; + msg->address_hi = int_desc->address >> 32; + msg->address_lo = int_desc->address & 0xffffffff; + msg->data = int_desc->data; + return; + } + pdev = msi_desc_to_pci_dev(irq_data_get_msi_desc(data)); dest = irq_data_get_effective_affinity_mask(data); pbus = pdev->bus; @@ -1418,13 +1427,6 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) if (!hpdev) goto return_null_message;
- /* Free any previous message that might have already been composed. */ - if (data->chip_data) { - int_desc = data->chip_data; - data->chip_data = NULL; - hv_int_desc_free(hpdev, int_desc); - } - int_desc = kzalloc(sizeof(*int_desc), GFP_ATOMIC); if (!int_desc) goto drop_reference;
From: Jeffrey Hugo quic_jhugo@quicinc.com
stable inclusion from stable-v5.10.134 commit e744aad0c4421c83cec35d62394e9cd210ccade6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit a2bad844a67b1c7740bda63e87453baf63c3a7f7 upstream.
According to Dexuan, the hypervisor folks beleive that multi-msi allocations are not correct. compose_msi_msg() will allocate multi-msi one by one. However, multi-msi is a block of related MSIs, with alignment requirements. In order for the hypervisor to allocate properly aligned and consecutive entries in the IOMMU Interrupt Remapping Table, there should be a single mapping request that requests all of the multi-msi vectors in one shot.
Dexuan suggests detecting the multi-msi case and composing a single request related to the first MSI. Then for the other MSIs in the same block, use the cached information. This appears to be viable, so do it.
5.10 backport - add hv_msi_get_int_vector helper function. Fixed merge conflict due to delivery_mode name change (APIC_DELIVERY_MODE_FIXED is the value given to dest_Fixed). Removed unused variable in hv_compose_msi_msg. Fixed reference to msi_desc->pci to point to the same is_msix variable. Removed changes to compose_msi_req_v3 since it doesn't exist yet.
Suggested-by: Dexuan Cui decui@microsoft.com Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Reviewed-by: Dexuan Cui decui@microsoft.com Tested-by: Michael Kelley mikelley@microsoft.com Link: https://lore.kernel.org/r/1652282599-21643-1-git-send-email-quic_jhugo@quici... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Carl Vanderlip quic_carlv@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/pci/controller/pci-hyperv.c | 61 +++++++++++++++++++++++++---- 1 file changed, 53 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 6e2035deb52a..4353443b89d8 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1118,6 +1118,10 @@ static void hv_int_desc_free(struct hv_pci_dev *hpdev, u8 buffer[sizeof(struct pci_delete_interrupt)]; } ctxt;
+ if (!int_desc->vector_count) { + kfree(int_desc); + return; + } memset(&ctxt, 0, sizeof(ctxt)); int_pkt = (struct pci_delete_interrupt *)&ctxt.pkt.message; int_pkt->message_type.type = @@ -1180,6 +1184,13 @@ static void hv_irq_mask(struct irq_data *data) pci_msi_mask_irq(data); }
+static unsigned int hv_msi_get_int_vector(struct irq_data *data) +{ + struct irq_cfg *cfg = irqd_cfg(data); + + return cfg->vector; +} + static int hv_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, msi_alloc_info_t *info) { @@ -1335,12 +1346,12 @@ static void hv_pci_compose_compl(void *context, struct pci_response *resp,
static u32 hv_compose_msi_req_v1( struct pci_create_interrupt *int_pkt, struct cpumask *affinity, - u32 slot, u8 vector) + u32 slot, u8 vector, u8 vector_count) { int_pkt->message_type.type = PCI_CREATE_INTERRUPT_MESSAGE; int_pkt->wslot.slot = slot; int_pkt->int_desc.vector = vector; - int_pkt->int_desc.vector_count = 1; + int_pkt->int_desc.vector_count = vector_count; int_pkt->int_desc.delivery_mode = dest_Fixed;
/* @@ -1354,14 +1365,14 @@ static u32 hv_compose_msi_req_v1(
static u32 hv_compose_msi_req_v2( struct pci_create_interrupt2 *int_pkt, struct cpumask *affinity, - u32 slot, u8 vector) + u32 slot, u8 vector, u8 vector_count) { int cpu;
int_pkt->message_type.type = PCI_CREATE_INTERRUPT_MESSAGE2; int_pkt->wslot.slot = slot; int_pkt->int_desc.vector = vector; - int_pkt->int_desc.vector_count = 1; + int_pkt->int_desc.vector_count = vector_count; int_pkt->int_desc.delivery_mode = dest_Fixed;
/* @@ -1389,7 +1400,6 @@ static u32 hv_compose_msi_req_v2( */ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) { - struct irq_cfg *cfg = irqd_cfg(data); struct hv_pcibus_device *hbus; struct vmbus_channel *channel; struct hv_pci_dev *hpdev; @@ -1398,6 +1408,8 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) struct cpumask *dest; struct compose_comp_ctxt comp; struct tran_int_desc *int_desc; + struct msi_desc *msi_desc; + u8 vector, vector_count; struct { struct pci_packet pci_pkt; union { @@ -1418,7 +1430,8 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) return; }
- pdev = msi_desc_to_pci_dev(irq_data_get_msi_desc(data)); + msi_desc = irq_data_get_msi_desc(data); + pdev = msi_desc_to_pci_dev(msi_desc); dest = irq_data_get_effective_affinity_mask(data); pbus = pdev->bus; hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata); @@ -1431,6 +1444,36 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) if (!int_desc) goto drop_reference;
+ if (!msi_desc->msi_attrib.is_msix && msi_desc->nvec_used > 1) { + /* + * If this is not the first MSI of Multi MSI, we already have + * a mapping. Can exit early. + */ + if (msi_desc->irq != data->irq) { + data->chip_data = int_desc; + int_desc->address = msi_desc->msg.address_lo | + (u64)msi_desc->msg.address_hi << 32; + int_desc->data = msi_desc->msg.data + + (data->irq - msi_desc->irq); + msg->address_hi = msi_desc->msg.address_hi; + msg->address_lo = msi_desc->msg.address_lo; + msg->data = int_desc->data; + put_pcichild(hpdev); + return; + } + /* + * The vector we select here is a dummy value. The correct + * value gets sent to the hypervisor in unmask(). This needs + * to be aligned with the count, and also not zero. Multi-msi + * is powers of 2 up to 32, so 32 will always work here. + */ + vector = 32; + vector_count = msi_desc->nvec_used; + } else { + vector = hv_msi_get_int_vector(data); + vector_count = 1; + } + memset(&ctxt, 0, sizeof(ctxt)); init_completion(&comp.comp_pkt.host_event); ctxt.pci_pkt.completion_func = hv_pci_compose_compl; @@ -1441,7 +1484,8 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) size = hv_compose_msi_req_v1(&ctxt.int_pkts.v1, dest, hpdev->desc.win_slot.slot, - cfg->vector); + vector, + vector_count); break;
case PCI_PROTOCOL_VERSION_1_2: @@ -1449,7 +1493,8 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) size = hv_compose_msi_req_v2(&ctxt.int_pkts.v2, dest, hpdev->desc.win_slot.slot, - cfg->vector); + vector, + vector_count); break;
default:
From: Pali Rohár pali@kernel.org
stable inclusion from stable-v5.10.134 commit 3777ea39f05aefeadf8b9f5216bf2f7978d2649e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 4f532c1e25319e42996ec18a1f473fd50c8e575d upstream.
Functions tty_termios_encode_baud_rate() and uart_update_timeout() should be called with the baudrate value which was set to hardware. Linux then report exact values via ioctl(TCGETS2) to userspace.
Change mvebu_uart_baud_rate_set() function to return baudrate value which was set to hardware and propagate this value to above mentioned functions.
With this change userspace would see precise value in termios c_ospeed field.
Fixes: 68a0db1d7da2 ("serial: mvebu-uart: add function to change baudrate") Cc: stable stable@kernel.org Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Pali Rohár pali@kernel.org Link: https://lore.kernel.org/r/20220628100922.10717-1-pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/tty/serial/mvebu-uart.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/drivers/tty/serial/mvebu-uart.c b/drivers/tty/serial/mvebu-uart.c index 34ff2181afd1..e941f57de953 100644 --- a/drivers/tty/serial/mvebu-uart.c +++ b/drivers/tty/serial/mvebu-uart.c @@ -443,13 +443,13 @@ static void mvebu_uart_shutdown(struct uart_port *port) } }
-static int mvebu_uart_baud_rate_set(struct uart_port *port, unsigned int baud) +static unsigned int mvebu_uart_baud_rate_set(struct uart_port *port, unsigned int baud) { unsigned int d_divisor, m_divisor; u32 brdv, osamp;
if (!port->uartclk) - return -EOPNOTSUPP; + return 0;
/* * The baudrate is derived from the UART clock thanks to two divisors: @@ -473,7 +473,7 @@ static int mvebu_uart_baud_rate_set(struct uart_port *port, unsigned int baud) osamp &= ~OSAMP_DIVISORS_MASK; writel(osamp, port->membase + UART_OSAMP);
- return 0; + return DIV_ROUND_CLOSEST(port->uartclk, d_divisor * m_divisor); }
static void mvebu_uart_set_termios(struct uart_port *port, @@ -510,15 +510,11 @@ static void mvebu_uart_set_termios(struct uart_port *port, max_baud = 230400;
baud = uart_get_baud_rate(port, termios, old, min_baud, max_baud); - if (mvebu_uart_baud_rate_set(port, baud)) { - /* No clock available, baudrate cannot be changed */ - if (old) - baud = uart_get_baud_rate(port, old, NULL, - min_baud, max_baud); - } else { - tty_termios_encode_baud_rate(termios, baud, baud); - uart_update_timeout(port, termios->c_cflag, baud); - } + baud = mvebu_uart_baud_rate_set(port, baud); + + /* In case baudrate cannot be changed, report previous old value */ + if (baud == 0 && old) + baud = tty_termios_baud_rate(old);
/* Only the following flag changes are supported */ if (old) { @@ -529,6 +525,11 @@ static void mvebu_uart_set_termios(struct uart_port *port, termios->c_cflag |= CS8; }
+ if (baud != 0) { + tty_termios_encode_baud_rate(termios, baud, baud); + uart_update_timeout(port, termios->c_cflag, baud); + } + spin_unlock_irqrestore(&port->lock, flags); }
From: Miaoqian Lin linmq006@gmail.com
stable inclusion from stable-v5.10.134 commit 493ceca3271316e74639c89ff8ac35883de64256 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 80192eff64eee9b3bc0594a47381937b94b9d65a ]
of_find_matching_node_and_match() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak.
Fixes: 0e545f57b708 ("power: reset: driver for the Versatile syscon reboot") Signed-off-by: Miaoqian Lin linmq006@gmail.com Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/power/reset/arm-versatile-reboot.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/power/reset/arm-versatile-reboot.c b/drivers/power/reset/arm-versatile-reboot.c index 08d0a07b58ef..c7624d7611a7 100644 --- a/drivers/power/reset/arm-versatile-reboot.c +++ b/drivers/power/reset/arm-versatile-reboot.c @@ -146,6 +146,7 @@ static int __init versatile_reboot_probe(void) versatile_reboot_type = (enum versatile_reboot)reboot_id->data;
syscon_regmap = syscon_node_to_regmap(np); + of_node_put(np); if (IS_ERR(syscon_regmap)) return PTR_ERR(syscon_regmap);
From: William Dean williamsukatube@gmail.com
stable inclusion from stable-v5.10.134 commit 5694b162f275fb9a9f89422701b2b963be11e496 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit c3b821e8e406d5650e587b7ac624ac24e9b780a8 ]
Because of the possible failure of the allocation, data->domains might be NULL pointer and will cause the dereference of the NULL pointer later. Therefore, it might be better to check it and directly return -ENOMEM without releasing data manually if fails, because the comment of the devm_kmalloc() says "Memory allocated with this function is automatically freed on driver detach.".
Fixes: a86854d0c599b ("treewide: devm_kzalloc() -> devm_kcalloc()") Reported-by: Hacash Robot hacashRobot@santino.com Signed-off-by: William Dean williamsukatube@gmail.com Link: https://lore.kernel.org/r/20220710154922.2610876-1-williamsukatube@163.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c b/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c index 09b0b8a16e99..2e971cbe2d7a 100644 --- a/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c +++ b/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c @@ -267,6 +267,8 @@ static int rt2880_pinmux_pins(struct rt2880_priv *p) p->func[i]->pin_count, sizeof(int), GFP_KERNEL); + if (!p->func[i]->pins) + return -ENOMEM; for (j = 0; j < p->func[i]->pin_count; j++) p->func[i]->pins[j] = p->func[i]->pin_first + j;
From: Peter Zijlstra peterz@infradead.org
stable inclusion from stable-v5.10.134 commit 43128b3eee337824158f34da6648163d2f2fb937 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 68e3c69803dada336893640110cb87221bb01dcf ]
Yang Jihing reported a race between perf_event_set_output() and perf_mmap_close():
CPU1 CPU2
perf_mmap_close(e2) if (atomic_dec_and_test(&e2->rb->mmap_count)) // 1 - > 0 detach_rest = true
ioctl(e1, IOC_SET_OUTPUT, e2) perf_event_set_output(e1, e2)
... list_for_each_entry_rcu(e, &e2->rb->event_list, rb_entry) ring_buffer_attach(e, NULL); // e1 isn't yet added and // therefore not detached
ring_buffer_attach(e1, e2->rb) list_add_rcu(&e1->rb_entry, &e2->rb->event_list)
After this; e1 is attached to an unmapped rb and a subsequent perf_mmap() will loop forever more:
again: mutex_lock(&e->mmap_mutex); if (event->rb) { ... if (!atomic_inc_not_zero(&e->rb->mmap_count)) { ... mutex_unlock(&e->mmap_mutex); goto again; } }
The loop in perf_mmap_close() holds e2->mmap_mutex, while the attach in perf_event_set_output() holds e1->mmap_mutex. As such there is no serialization to avoid this race.
Change perf_event_set_output() to take both e1->mmap_mutex and e2->mmap_mutex to alleviate that problem. Additionally, have the loop in perf_mmap() detach the rb directly, this avoids having to wait for the concurrent perf_mmap_close() to get around to doing it to make progress.
Fixes: 9bb5d40cd93c ("perf: Fix mmap() accounting hole") Reported-by: Yang Jihong yangjihong1@huawei.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: Yang Jihong yangjihong1@huawei.com Link: https://lkml.kernel.org/r/YsQ3jm2GR38SW7uD@worktop.programming.kicks-ass.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- kernel/events/core.c | 45 ++++++++++++++++++++++++++++++-------------- 1 file changed, 31 insertions(+), 14 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index 68dc8a8e7990..6fcd4177ade6 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6181,10 +6181,10 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
if (!atomic_inc_not_zero(&event->rb->mmap_count)) { /* - * Raced against perf_mmap_close() through - * perf_event_set_output(). Try again, hope for better - * luck. + * Raced against perf_mmap_close(); remove the + * event and try again. */ + ring_buffer_attach(event, NULL); mutex_unlock(&event->mmap_mutex); goto again; } @@ -11541,14 +11541,25 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr, goto out; }
+static void mutex_lock_double(struct mutex *a, struct mutex *b) +{ + if (b < a) + swap(a, b); + + mutex_lock(a); + mutex_lock_nested(b, SINGLE_DEPTH_NESTING); +} + static int perf_event_set_output(struct perf_event *event, struct perf_event *output_event) { struct perf_buffer *rb = NULL; int ret = -EINVAL;
- if (!output_event) + if (!output_event) { + mutex_lock(&event->mmap_mutex); goto set; + }
/* don't allow circular references */ if (event == output_event) @@ -11586,8 +11597,15 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event) event->pmu != output_event->pmu) goto out;
+ /* + * Hold both mmap_mutex to serialize against perf_mmap_close(). Since + * output_event is already on rb->event_list, and the list iteration + * restarts after every removal, it is guaranteed this new event is + * observed *OR* if output_event is already removed, it's guaranteed we + * observe !rb->mmap_count. + */ + mutex_lock_double(&event->mmap_mutex, &output_event->mmap_mutex); set: - mutex_lock(&event->mmap_mutex); /* Can't redirect output if we've got an active mmap() */ if (atomic_read(&event->mmap_count)) goto unlock; @@ -11597,6 +11615,12 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event) rb = ring_buffer_get(output_event); if (!rb) goto unlock; + + /* did we race against perf_mmap_close() */ + if (!atomic_read(&rb->mmap_count)) { + ring_buffer_put(rb); + goto unlock; + } }
ring_buffer_attach(event, rb); @@ -11604,20 +11628,13 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event) ret = 0; unlock: mutex_unlock(&event->mmap_mutex); + if (output_event) + mutex_unlock(&output_event->mmap_mutex);
out: return ret; }
-static void mutex_lock_double(struct mutex *a, struct mutex *b) -{ - if (b < a) - swap(a, b); - - mutex_lock(a); - mutex_lock_nested(b, SINGLE_DEPTH_NESTING); -} - static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id) { bool nmi_safe = false;
From: Alex Deucher alexander.deucher@amd.com
stable inclusion from stable-v5.10.134 commit fb6031203ebbc17fa36aa3a85f007854d118d266 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 3ce51649cdf23ab463494df2bd6d1e9529ebdc6a ]
Stutter mode is a power saving feature on GPUs, however at least one early raven system exhibits stability issues with it. Add a quirk to disable it for that system.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214417 Fixes: 005440066f929b ("drm/amdgpu: enable gfxoff again on raven series (v2)") Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 +++++++++++++++++++ 1 file changed, 33 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index f069d0faba64..55ecc67592eb 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -922,6 +922,37 @@ static void amdgpu_check_debugfs_connector_property_change(struct amdgpu_device } }
+struct amdgpu_stutter_quirk { + u16 chip_vendor; + u16 chip_device; + u16 subsys_vendor; + u16 subsys_device; + u8 revision; +}; + +static const struct amdgpu_stutter_quirk amdgpu_stutter_quirk_list[] = { + /* https://bugzilla.kernel.org/show_bug.cgi?id=214417 */ + { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 }, + { 0, 0, 0, 0, 0 }, +}; + +static bool dm_should_disable_stutter(struct pci_dev *pdev) +{ + const struct amdgpu_stutter_quirk *p = amdgpu_stutter_quirk_list; + + while (p && p->chip_device != 0) { + if (pdev->vendor == p->chip_vendor && + pdev->device == p->chip_device && + pdev->subsystem_vendor == p->subsys_vendor && + pdev->subsystem_device == p->subsys_device && + pdev->revision == p->revision) { + return true; + } + ++p; + } + return false; +} + static int amdgpu_dm_init(struct amdgpu_device *adev) { struct dc_init_data init_data; @@ -1014,6 +1045,8 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
if (adev->asic_type != CHIP_CARRIZO && adev->asic_type != CHIP_STONEY) adev->dm.dc->debug.disable_stutter = amdgpu_pp_feature_mask & PP_STUTTER_MODE ? false : true; + if (dm_should_disable_stutter(adev->pdev)) + adev->dm.dc->debug.disable_stutter = true;
if (amdgpu_dc_debug_mask & DC_DISABLE_STUTTER) adev->dm.dc->debug.disable_stutter = true;
From: Lennert Buytenhek buytenh@wantstofly.org
stable inclusion from stable-v5.10.134 commit 77836dbe35382aaf8108489060c5c89530c77494 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 7c1ddcee5311f3315096217881d2dbe47cc683f9 ]
The initially merged version of the igc driver code (via commit 146740f9abc4, "igc: Add support for PF") contained the following IGC_REMOVED checks in the igc_rd32/wr32() MMIO accessors:
u32 igc_rd32(struct igc_hw *hw, u32 reg) { u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr); u32 value = 0;
if (IGC_REMOVED(hw_addr)) return ~value;
value = readl(&hw_addr[reg]);
/* reads should not return all F's */ if (!(~value) && (!reg || !(~readl(hw_addr)))) hw->hw_addr = NULL;
return value; }
And:
#define wr32(reg, val) \ do { \ u8 __iomem *hw_addr = READ_ONCE((hw)->hw_addr); \ if (!IGC_REMOVED(hw_addr)) \ writel((val), &hw_addr[(reg)]); \ } while (0)
E.g. igb has similar checks in its MMIO accessors, and has a similar macro E1000_REMOVED, which is implemented as follows:
#define E1000_REMOVED(h) unlikely(!(h))
These checks serve to detect and take note of an 0xffffffff MMIO read return from the device, which can be caused by a PCIe link flap or some other kind of PCI bus error, and to avoid performing MMIO reads and writes from that point onwards.
However, the IGC_REMOVED macro was not originally implemented:
#ifndef IGC_REMOVED #define IGC_REMOVED(a) (0) #endif /* IGC_REMOVED */
This led to the IGC_REMOVED logic to be removed entirely in a subsequent commit (commit 3c215fb18e70, "igc: remove IGC_REMOVED function"), with the rationale that such checks matter only for virtualization and that igc does not support virtualization -- but a PCIe device can become detached even without virtualization being in use, and without proper checks, a PCIe bus error affecting an igc adapter will lead to various NULL pointer dereferences, as the first access after the error will set hw->hw_addr to NULL, and subsequent accesses will blindly dereference this now-NULL pointer.
This patch reinstates the IGC_REMOVED checks in igc_rd32/wr32(), and implements IGC_REMOVED the way it is done for igb, by checking for the unlikely() case of hw_addr being NULL. This change prevents the oopses seen when a PCIe link flap occurs on an igc adapter.
Fixes: 146740f9abc4 ("igc: Add support for PF") Signed-off-by: Lennert Buytenhek buytenh@arista.com Tested-by: Naama Meir naamax.meir@linux.intel.com Acked-by: Sasha Neftin sasha.neftin@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/intel/igc/igc_main.c | 3 +++ drivers/net/ethernet/intel/igc/igc_regs.h | 5 ++++- 2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index c3bb040f6750..31ef9a74ef64 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -4916,6 +4916,9 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg) u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr); u32 value = 0;
+ if (IGC_REMOVED(hw_addr)) + return ~value; + value = readl(&hw_addr[reg]);
/* reads should not return all F's */ diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h index b52dd9d737e8..a273e1c33b3f 100644 --- a/drivers/net/ethernet/intel/igc/igc_regs.h +++ b/drivers/net/ethernet/intel/igc/igc_regs.h @@ -252,7 +252,8 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg); #define wr32(reg, val) \ do { \ u8 __iomem *hw_addr = READ_ONCE((hw)->hw_addr); \ - writel((val), &hw_addr[(reg)]); \ + if (!IGC_REMOVED(hw_addr)) \ + writel((val), &hw_addr[(reg)]); \ } while (0)
#define rd32(reg) (igc_rd32(hw, reg)) @@ -264,4 +265,6 @@ do { \
#define array_rd32(reg, offset) (igc_rd32(hw, (reg) + ((offset) << 2)))
+#define IGC_REMOVED(h) unlikely(!(h)) + #endif
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 5e343e3ef464a5dbcc8fd3a99830908e31a4a61d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 0968d2a441bf6afb551fd99e60fa65ed67068963 ]
While reading sysctl_ip_no_pmtu_disc, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/af_inet.c | 2 +- net/ipv4/icmp.c | 2 +- net/ipv6/af_inet6.c | 2 +- net/xfrm/xfrm_state.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 0ce0184bd8c1..3d012716e444 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -338,7 +338,7 @@ static int inet_create(struct net *net, struct socket *sock, int protocol, inet->hdrincl = 1; }
- if (net->ipv4.sysctl_ip_no_pmtu_disc) + if (READ_ONCE(net->ipv4.sysctl_ip_no_pmtu_disc)) inet->pmtudisc = IP_PMTUDISC_DONT; else inet->pmtudisc = IP_PMTUDISC_WANT; diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 0fa0da1d71f5..a1aacf5e75a6 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -887,7 +887,7 @@ static bool icmp_unreach(struct sk_buff *skb) * values please see * Documentation/networking/ip-sysctl.rst */ - switch (net->ipv4.sysctl_ip_no_pmtu_disc) { + switch (READ_ONCE(net->ipv4.sysctl_ip_no_pmtu_disc)) { default: net_dbg_ratelimited("%pI4: fragmentation needed and DF set\n", &iph->daddr); diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 010939b1d0ee..c4ef6aad924b 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -225,7 +225,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol, inet->mc_list = NULL; inet->rcv_tos = 0;
- if (net->ipv4.sysctl_ip_no_pmtu_disc) + if (READ_ONCE(net->ipv4.sysctl_ip_no_pmtu_disc)) inet->pmtudisc = IP_PMTUDISC_DONT; else inet->pmtudisc = IP_PMTUDISC_WANT; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index a6a4838d6254..25bf8df9bf31 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2566,7 +2566,7 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload) int err;
if (family == AF_INET && - xs_net(x)->ipv4.sysctl_ip_no_pmtu_disc) + READ_ONCE(xs_net(x)->ipv4.sysctl_ip_no_pmtu_disc)) x->props.flags |= XFRM_STATE_NOPMTUDISC;
err = -EPROTONOSUPPORT;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit b96ed5ccb09ae71103023ed13acefb194f609794 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 60c158dc7b1f0558f6cadd5b50d0386da0000d50 ]
While reading sysctl_ip_fwd_use_pmtu, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: f87c10a8aa1e ("ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/ip.h | 2 +- net/ipv4/route.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/ip.h b/include/net/ip.h index 6b87c7ef6459..b97886996d79 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -443,7 +443,7 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst, struct net *net = dev_net(dst->dev); unsigned int mtu;
- if (net->ipv4.sysctl_ip_fwd_use_pmtu || + if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) || ip_mtu_locked(dst) || !forwarding) return dst_mtu(dst); diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 9bd3cd2177f4..216f76b548ff 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1442,7 +1442,7 @@ u32 ip_mtu_from_fib_result(struct fib_result *res, __be32 daddr) struct fib_info *fi = res->fi; u32 mtu = 0;
- if (dev_net(dev)->ipv4.sysctl_ip_fwd_use_pmtu || + if (READ_ONCE(dev_net(dev)->ipv4.sysctl_ip_fwd_use_pmtu) || fi->fib_metrics->metrics[RTAX_LOCK - 1] & (1 << RTAX_MTU)) mtu = fi->fib_mtu;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 11038fa781ab916535c53351537b22d6d405667d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 7bf9e18d9a5e99e3c83482973557e9f047b051e7 ]
While reading sysctl_ip_fwd_update_priority, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 432e05d32892 ("net: ipv4: Control SKB reprioritization after forwarding") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 3 ++- net/ipv4/ip_forward.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index 5f143ca16c01..d2887ae508bb 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -8038,13 +8038,14 @@ static int mlxsw_sp_dscp_init(struct mlxsw_sp *mlxsw_sp) static int __mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp) { struct net *net = mlxsw_sp_net(mlxsw_sp); - bool usp = net->ipv4.sysctl_ip_fwd_update_priority; char rgcr_pl[MLXSW_REG_RGCR_LEN]; u64 max_rifs; + bool usp;
if (!MLXSW_CORE_RES_VALID(mlxsw_sp->core, MAX_RIFS)) return -EIO; max_rifs = MLXSW_CORE_RES_GET(mlxsw_sp->core, MAX_RIFS); + usp = READ_ONCE(net->ipv4.sysctl_ip_fwd_update_priority);
mlxsw_reg_rgcr_pack(rgcr_pl, true, true); mlxsw_reg_rgcr_max_router_interfaces_set(rgcr_pl, max_rifs); diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c index 00ec819f949b..29730edda220 100644 --- a/net/ipv4/ip_forward.c +++ b/net/ipv4/ip_forward.c @@ -151,7 +151,7 @@ int ip_forward(struct sk_buff *skb) !skb_sec_path(skb)) ip_rt_send_redirect(skb);
- if (net->ipv4.sysctl_ip_fwd_update_priority) + if (READ_ONCE(net->ipv4.sysctl_ip_fwd_update_priority)) skb->priority = rt_tos2priority(iph->tos);
return NF_HOOK(NFPROTO_IPV4, NF_INET_FORWARD,
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 94269132d0fc9ee357e555c17a8f85fd6660649b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 289d3b21fb0bfc94c4e98f10635bba1824e5f83c ]
While reading sysctl_ip_nonlocal_bind, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/inet_sock.h | 2 +- net/sctp/protocol.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index f29059a262da..d105b943b2f4 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -370,7 +370,7 @@ static inline bool inet_get_convert_csum(struct sock *sk) static inline bool inet_can_nonlocal_bind(struct net *net, struct inet_sock *inet) { - return net->ipv4.sysctl_ip_nonlocal_bind || + return READ_ONCE(net->ipv4.sysctl_ip_nonlocal_bind) || inet->freebind || inet->transparent; }
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index 940f1e257a90..6e4ca837e91d 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -358,7 +358,7 @@ static int sctp_v4_available(union sctp_addr *addr, struct sctp_sock *sp) if (addr->v4.sin_addr.s_addr != htonl(INADDR_ANY) && ret != RTN_LOCAL && !sp->inet.freebind && - !net->ipv4.sysctl_ip_nonlocal_bind) + !READ_ONCE(net->ipv4.sysctl_ip_nonlocal_bind)) return 0;
if (ipv6_only_sock(sctp_opt2sk(sp)))
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 611ba70e5aca252ef43374dda97ed4cf1c47a07c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 0db232765887d9807df8bcb7b6f29b2871539eab ]
While reading sysctl_ip_autobind_reuse, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 4b01a9674231 ("tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted.") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/inet_connection_sock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 7785a4775e58..4d9713324003 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -251,7 +251,7 @@ inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int * goto other_half_scan; }
- if (net->ipv4.sysctl_ip_autobind_reuse && !relax) { + if (READ_ONCE(net->ipv4.sysctl_ip_autobind_reuse) && !relax) { /* We still have a chance to connect to different destinations */ relax = true; goto ports_exhausted;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 0ee76fe01ff3c0b4efaa500aecc90d7c8d3a8860 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 85d0b4dbd74b95cc492b1f4e34497d3f894f5d9a ]
While reading sysctl_fwmark_reflect, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/ip.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/ip.h b/include/net/ip.h index b97886996d79..ef6e8e5b4fce 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -382,7 +382,7 @@ void ipfrag_init(void); void ip_static_sysctl_init(void);
#define IP4_REPLY_MARK(net, mark) \ - ((net)->ipv4.sysctl_fwmark_reflect ? (mark) : 0) + (READ_ONCE((net)->ipv4.sysctl_fwmark_reflect) ? (mark) : 0)
static inline bool ip_is_fragment(const struct iphdr *iph) {
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit d4f65615db7fca3df9f7e79eadf937e6ddb03c54 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 1a0008f9df59451d0a17806c1ee1a19857032fa8 ]
While reading sysctl_tcp_fwmark_accept, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 84f39b08d786 ("net: support marking accepting TCP sockets") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/inet_sock.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index d105b943b2f4..9a286c943871 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -108,7 +108,8 @@ static inline struct inet_request_sock *inet_rsk(const struct request_sock *sk)
static inline u32 inet_request_mark(const struct sock *sk, struct sk_buff *skb) { - if (!sk->sk_mark && sock_net(sk)->ipv4.sysctl_tcp_fwmark_accept) + if (!sk->sk_mark && + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fwmark_accept)) return skb->mark;
return sk->sk_mark;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 77a04845f0d28a3561494a5f3121488470a968a4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit f47d00e077e7d61baf69e46dde3210c886360207 ]
While reading sysctl_tcp_mtu_probing, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 5d424d5a674f ("[TCP]: MTU probing") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_output.c | 2 +- net/ipv4/tcp_timer.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index ec5526bf75e0..497834b94891 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1815,7 +1815,7 @@ void tcp_mtup_init(struct sock *sk) struct inet_connection_sock *icsk = inet_csk(sk); struct net *net = sock_net(sk);
- icsk->icsk_mtup.enabled = net->ipv4.sysctl_tcp_mtu_probing > 1; + icsk->icsk_mtup.enabled = READ_ONCE(net->ipv4.sysctl_tcp_mtu_probing) > 1; icsk->icsk_mtup.search_high = tp->rx_opt.mss_clamp + sizeof(struct tcphdr) + icsk->icsk_af_ops->net_header_len; icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, net->ipv4.sysctl_tcp_base_mss); diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 4ef08079ccfa..3c0d689cafac 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -163,7 +163,7 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk) int mss;
/* Black hole detection */ - if (!net->ipv4.sysctl_tcp_mtu_probing) + if (!READ_ONCE(net->ipv4.sysctl_tcp_mtu_probing)) return;
if (!icsk->icsk_mtup.enabled) {
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 514d2254c7b8aa2d257f5ffc79f0d96be2d6bfda category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 88d78bc097cd8ebc6541e93316c9d9bf651b13e8 ]
While reading sysctl_tcp_base_mss, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 5d424d5a674f ("[TCP]: MTU probing") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_output.c | 2 +- net/ipv4/tcp_timer.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 497834b94891..4e318716cfda 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1818,7 +1818,7 @@ void tcp_mtup_init(struct sock *sk) icsk->icsk_mtup.enabled = READ_ONCE(net->ipv4.sysctl_tcp_mtu_probing) > 1; icsk->icsk_mtup.search_high = tp->rx_opt.mss_clamp + sizeof(struct tcphdr) + icsk->icsk_af_ops->net_header_len; - icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, net->ipv4.sysctl_tcp_base_mss); + icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, READ_ONCE(net->ipv4.sysctl_tcp_base_mss)); icsk->icsk_mtup.probe_size = 0; if (icsk->icsk_mtup.enabled) icsk->icsk_mtup.probe_timestamp = tcp_jiffies32; diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 3c0d689cafac..795716fd3761 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -171,7 +171,7 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk) icsk->icsk_mtup.probe_timestamp = tcp_jiffies32; } else { mss = tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low) >> 1; - mss = min(net->ipv4.sysctl_tcp_base_mss, mss); + mss = min(READ_ONCE(net->ipv4.sysctl_tcp_base_mss), mss); mss = max(mss, net->ipv4.sysctl_tcp_mtu_probe_floor); mss = max(mss, net->ipv4.sysctl_tcp_min_snd_mss); icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 97992e8feff33b3ae154a113ec398546bbacda80 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 78eb166cdefcc3221c8c7c1e2d514e91a2eb5014 ]
While reading sysctl_tcp_min_snd_mss, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 5f3e2bf008c2 ("tcp: add tcp_min_snd_mss sysctl") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_output.c | 3 ++- net/ipv4/tcp_timer.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 4e318716cfda..a6457620f923 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1772,7 +1772,8 @@ static inline int __tcp_mtu_to_mss(struct sock *sk, int pmtu) mss_now -= icsk->icsk_ext_hdr_len;
/* Then reserve room for full set of TCP options and 8 bytes of data */ - mss_now = max(mss_now, sock_net(sk)->ipv4.sysctl_tcp_min_snd_mss); + mss_now = max(mss_now, + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_min_snd_mss)); return mss_now; }
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 795716fd3761..953f36868369 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -173,7 +173,7 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk) mss = tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low) >> 1; mss = min(READ_ONCE(net->ipv4.sysctl_tcp_base_mss), mss); mss = max(mss, net->ipv4.sysctl_tcp_mtu_probe_floor); - mss = max(mss, net->ipv4.sysctl_tcp_min_snd_mss); + mss = max(mss, READ_ONCE(net->ipv4.sysctl_tcp_min_snd_mss)); icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss); } tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit d5bece4df6090395f891110ef52a6f82d16685db category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 8e92d4423615a5257d0d871fc067aa561f597deb ]
While reading sysctl_tcp_mtu_probe_floor, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: c04b79b6cfd7 ("tcp: add new tcp_mtu_probe_floor sysctl") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_timer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 953f36868369..da92c5d70b70 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -172,7 +172,7 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk) } else { mss = tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low) >> 1; mss = min(READ_ONCE(net->ipv4.sysctl_tcp_base_mss), mss); - mss = max(mss, net->ipv4.sysctl_tcp_mtu_probe_floor); + mss = max(mss, READ_ONCE(net->ipv4.sysctl_tcp_mtu_probe_floor)); mss = max(mss, READ_ONCE(net->ipv4.sysctl_tcp_min_snd_mss)); icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss); }
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit d452ce36f2d4c402fa3f5275c9677f80166e7fc6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 92c0aa4175474483d6cf373314343d4e624e882a ]
While reading sysctl_tcp_probe_threshold, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 6b58e0a5f32d ("ipv4: Use binary search to choose tcp PMTU probe_size") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index a6457620f923..fff3f3e7bf4b 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2412,7 +2412,7 @@ static int tcp_mtu_probe(struct sock *sk) * probing process by not resetting search range to its orignal. */ if (probe_size > tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_high) || - interval < net->ipv4.sysctl_tcp_probe_threshold) { + interval < READ_ONCE(net->ipv4.sysctl_tcp_probe_threshold)) { /* Check whether enough time has elaplased for * another round of probing. */
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit c61aede097d350d890fa1edc9521b0072e14a0b8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 2a85388f1d94a9f8b5a529118a2c5eaa0520d85c ]
While reading sysctl_tcp_probe_interval, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 05cbc0db03e8 ("ipv4: Create probe timer for tcp PMTU as per RFC4821") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index fff3f3e7bf4b..9ddd3ff6bd80 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2330,7 +2330,7 @@ static inline void tcp_mtu_check_reprobe(struct sock *sk) u32 interval; s32 delta;
- interval = net->ipv4.sysctl_tcp_probe_interval; + interval = READ_ONCE(net->ipv4.sysctl_tcp_probe_interval); delta = tcp_jiffies32 - icsk->icsk_mtup.probe_timestamp; if (unlikely(delta >= interval * HZ)) { int mss = tcp_current_mss(sk);
From: Biao Huang biao.huang@mediatek.com
stable inclusion from stable-v5.10.134 commit dd7b5ba44b67566ffde286f60a2f684a56c69e0d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit f4c7d8948e866918d61493264dbbd67e45ef2bda ]
Current stmmac driver will prepare/enable ptp_ref clock in stmmac_init_tstamp_counter().
The stmmac_pltfr_noirq_suspend will disable it once in suspend flow.
But in resume flow, stmmac_pltfr_noirq_resume --> stmmac_init_tstamp_counter stmmac_resume --> stmmac_hw_setup --> stmmac_init_ptp --> stmmac_init_tstamp_counter ptp_ref clock reference counter increases twice, which leads to unbalance ptp clock when resume back.
Move ptp_ref clock prepare/enable out of stmmac_init_tstamp_counter to fix it.
Fixes: 0735e639f129d ("net: stmmac: skip only stmmac_ptp_register when resume from suspend") Signed-off-by: Biao Huang biao.huang@mediatek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- .../net/ethernet/stmicro/stmmac/stmmac_main.c | 17 ++++++++--------- .../ethernet/stmicro/stmmac/stmmac_platform.c | 8 +++++++- 2 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index e9aa9a5eba6b..27b7bb64a028 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -738,19 +738,10 @@ int stmmac_init_tstamp_counter(struct stmmac_priv *priv, u32 systime_flags) struct timespec64 now; u32 sec_inc = 0; u64 temp = 0; - int ret;
if (!(priv->dma_cap.time_stamp || priv->dma_cap.atime_stamp)) return -EOPNOTSUPP;
- ret = clk_prepare_enable(priv->plat->clk_ptp_ref); - if (ret < 0) { - netdev_warn(priv->dev, - "failed to enable PTP reference clock: %pe\n", - ERR_PTR(ret)); - return ret; - } - stmmac_config_hw_tstamping(priv, priv->ptpaddr, systime_flags); priv->systime_flags = systime_flags;
@@ -2755,6 +2746,14 @@ static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
stmmac_mmc_setup(priv);
+ if (ptp_register) { + ret = clk_prepare_enable(priv->plat->clk_ptp_ref); + if (ret < 0) + netdev_warn(priv->dev, + "failed to enable PTP reference clock: %pe\n", + ERR_PTR(ret)); + } + ret = stmmac_init_ptp(priv); if (ret == -EOPNOTSUPP) netdev_warn(priv->dev, "PTP not supported by HW\n"); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index b40b962055fa..f70d8d1ce329 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -814,7 +814,13 @@ static int __maybe_unused stmmac_pltfr_noirq_resume(struct device *dev) if (ret) return ret;
- stmmac_init_tstamp_counter(priv, priv->systime_flags); + ret = clk_prepare_enable(priv->plat->clk_ptp_ref); + if (ret < 0) { + netdev_warn(priv->dev, + "failed to enable PTP reference clock: %pe\n", + ERR_PTR(ret)); + return ret; + } }
return 0;
From: Robert Hancock robert.hancock@calian.com
stable inclusion from stable-v5.10.134 commit f3da643d87636b2605190b292f7c5ea7222707fd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 4ca8ca873d454635c20d508261bfc0081af75cf8 ]
Problems were observed on the Xilinx ZynqMP platform with large I2C reads. When a read of 277 bytes was performed, the controller NAKed the transfer after only 252 bytes were transferred and returned an ENXIO error on the transfer.
There is some code in cdns_i2c_master_isr to handle this case by resetting the transfer count in the controller before it reaches 0, to allow larger transfers to work, but it was conditional on the CDNS_I2C_BROKEN_HOLD_BIT quirk being set on the controller, and ZynqMP uses the r1p14 version of the core where this quirk is not being set. The requirement to do this to support larger reads seems like an inherently required workaround due to the core only having an 8-bit transfer size register, so it does not appear that this should be conditional on the broken HOLD bit quirk which is used elsewhere in the driver.
Remove the dependency on the CDNS_I2C_BROKEN_HOLD_BIT for this transfer size reset logic to fix this problem.
Fixes: 63cab195bf49 ("i2c: removed work arounds in i2c driver for Zynq Ultrascale+ MPSoC") Signed-off-by: Robert Hancock robert.hancock@calian.com Reviewed-by: Shubhrajyoti Datta Shubhrajyoti.datta@amd.com Acked-by: Michal Simek michal.simek@amd.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/i2c/busses/i2c-cadence.c | 30 +++++------------------------- 1 file changed, 5 insertions(+), 25 deletions(-)
diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c index 01564bd96c62..0abce487ead7 100644 --- a/drivers/i2c/busses/i2c-cadence.c +++ b/drivers/i2c/busses/i2c-cadence.c @@ -386,9 +386,9 @@ static irqreturn_t cdns_i2c_slave_isr(void *ptr) */ static irqreturn_t cdns_i2c_master_isr(void *ptr) { - unsigned int isr_status, avail_bytes, updatetx; + unsigned int isr_status, avail_bytes; unsigned int bytes_to_send; - bool hold_quirk; + bool updatetx; struct cdns_i2c *id = ptr; /* Signal completion only after everything is updated */ int done_flag = 0; @@ -408,11 +408,7 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr) * Check if transfer size register needs to be updated again for a * large data receive operation. */ - updatetx = 0; - if (id->recv_count > id->curr_recv_count) - updatetx = 1; - - hold_quirk = (id->quirks & CDNS_I2C_BROKEN_HOLD_BIT) && updatetx; + updatetx = id->recv_count > id->curr_recv_count;
/* When receiving, handle data interrupt and completion interrupt */ if (id->p_recv_buf && @@ -443,7 +439,7 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr) break; }
- if (cdns_is_holdquirk(id, hold_quirk)) + if (cdns_is_holdquirk(id, updatetx)) break; }
@@ -454,7 +450,7 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr) * maintain transfer size non-zero while performing a large * receive operation. */ - if (cdns_is_holdquirk(id, hold_quirk)) { + if (cdns_is_holdquirk(id, updatetx)) { /* wait while fifo is full */ while (cdns_i2c_readreg(CDNS_I2C_XFER_SIZE_OFFSET) != (id->curr_recv_count - CDNS_I2C_FIFO_DEPTH)) @@ -476,22 +472,6 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr) CDNS_I2C_XFER_SIZE_OFFSET); id->curr_recv_count = id->recv_count; } - } else if (id->recv_count && !hold_quirk && - !id->curr_recv_count) { - - /* Set the slave address in address register*/ - cdns_i2c_writereg(id->p_msg->addr & CDNS_I2C_ADDR_MASK, - CDNS_I2C_ADDR_OFFSET); - - if (id->recv_count > CDNS_I2C_TRANSFER_SIZE) { - cdns_i2c_writereg(CDNS_I2C_TRANSFER_SIZE, - CDNS_I2C_XFER_SIZE_OFFSET); - id->curr_recv_count = CDNS_I2C_TRANSFER_SIZE; - } else { - cdns_i2c_writereg(id->recv_count, - CDNS_I2C_XFER_SIZE_OFFSET); - id->curr_recv_count = id->recv_count; - } }
/* Clear hold (if not repeated start) and signal completion */
From: Junxiao Chang junxiao.chang@intel.com
stable inclusion from stable-v5.10.134 commit a3ac79f38d354b10925824899cdbd2caadce55ba category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 613b065ca32e90209024ec4a6bb5ca887ee70980 ]
When queue number is > 4, left shift overflows due to 32 bits integer variable. Mask calculation is wrong for MTL_RXQ_DMA_MAP1.
If CONFIG_UBSAN is enabled, kernel dumps below warning: [ 10.363842] ================================================================== [ 10.363882] UBSAN: shift-out-of-bounds in /build/linux-intel-iotg-5.15-8e6Tf4/ linux-intel-iotg-5.15-5.15.0/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c:224:12 [ 10.363929] shift exponent 40 is too large for 32-bit type 'unsigned int' [ 10.363953] CPU: 1 PID: 599 Comm: NetworkManager Not tainted 5.15.0-1003-intel-iotg [ 10.363956] Hardware name: ADLINK Technology Inc. LEC-EL/LEC-EL, BIOS 0.15.11 12/22/2021 [ 10.363958] Call Trace: [ 10.363960] <TASK> [ 10.363963] dump_stack_lvl+0x4a/0x5f [ 10.363971] dump_stack+0x10/0x12 [ 10.363974] ubsan_epilogue+0x9/0x45 [ 10.363976] __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e [ 10.363979] ? wake_up_klogd+0x4a/0x50 [ 10.363983] ? vprintk_emit+0x8f/0x240 [ 10.363986] dwmac4_map_mtl_dma.cold+0x42/0x91 [stmmac] [ 10.364001] stmmac_mtl_configuration+0x1ce/0x7a0 [stmmac] [ 10.364009] ? dwmac410_dma_init_channel+0x70/0x70 [stmmac] [ 10.364020] stmmac_hw_setup.cold+0xf/0xb14 [stmmac] [ 10.364030] ? page_pool_alloc_pages+0x4d/0x70 [ 10.364034] ? stmmac_clear_tx_descriptors+0x6e/0xe0 [stmmac] [ 10.364042] stmmac_open+0x39e/0x920 [stmmac] [ 10.364050] __dev_open+0xf0/0x1a0 [ 10.364054] __dev_change_flags+0x188/0x1f0 [ 10.364057] dev_change_flags+0x26/0x60 [ 10.364059] do_setlink+0x908/0xc40 [ 10.364062] ? do_setlink+0xb10/0xc40 [ 10.364064] ? __nla_validate_parse+0x4c/0x1a0 [ 10.364068] __rtnl_newlink+0x597/0xa10 [ 10.364072] ? __nla_reserve+0x41/0x50 [ 10.364074] ? __kmalloc_node_track_caller+0x1d0/0x4d0 [ 10.364079] ? pskb_expand_head+0x75/0x310 [ 10.364082] ? nla_reserve_64bit+0x21/0x40 [ 10.364086] ? skb_free_head+0x65/0x80 [ 10.364089] ? security_sock_rcv_skb+0x2c/0x50 [ 10.364094] ? __cond_resched+0x19/0x30 [ 10.364097] ? kmem_cache_alloc_trace+0x15a/0x420 [ 10.364100] rtnl_newlink+0x49/0x70
This change fixes MTL_RXQ_DMA_MAP1 mask issue and channel/queue mapping warning.
Fixes: d43042f4da3e ("net: stmmac: mapping mtl rx to dma channel") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216195 Reported-by: Cedric Wassenaar cedric@bytespeed.nl Signed-off-by: Junxiao Chang junxiao.chang@intel.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index 16c538cfaf59..2e71e510e127 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -215,6 +215,9 @@ static void dwmac4_map_mtl_dma(struct mac_device_info *hw, u32 queue, u32 chan) if (queue == 0 || queue == 4) { value &= ~MTL_RXQ_DMA_Q04MDMACH_MASK; value |= MTL_RXQ_DMA_Q04MDMACH(chan); + } else if (queue > 4) { + value &= ~MTL_RXQ_DMA_QXMDMACH_MASK(queue - 4); + value |= MTL_RXQ_DMA_QXMDMACH(chan, queue - 4); } else { value &= ~MTL_RXQ_DMA_QXMDMACH_MASK(queue); value |= MTL_RXQ_DMA_QXMDMACH(chan, queue);
From: Tariq Toukan tariqt@nvidia.com
stable inclusion from stable-v5.10.134 commit e80ff0b9661384d40e97a0a7d5cc8ae2a00c785d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit f08d8c1bb97c48f24a82afaa2fd8c140f8d3da8b ]
Socket destruction flow and tls_device_down function sync against each other using tls_device_lock and the context refcount, to guarantee the device resources are freed via tls_dev_del() by the end of tls_device_down.
In the following unfortunate flow, this won't happen: - refcount is decreased to zero in tls_device_sk_destruct. - tls_device_down starts, skips the context as refcount is zero, going all the way until it flushes the gc work, and returns without freeing the device resources. - only then, tls_device_queue_ctx_destruction is called, queues the gc work and frees the context's device resources.
Solve it by decreasing the refcount in the socket's destruction flow under the tls_device_lock, for perfect synchronization. This does not slow down the common likely destructor flow, in which both the refcount is decreased and the spinlock is acquired, anyway.
Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Reviewed-by: Maxim Mikityanskiy maximmi@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/tls/tls_device.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index f0af43c1fba2..700dcbaf1971 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -97,13 +97,16 @@ static void tls_device_queue_ctx_destruction(struct tls_context *ctx) unsigned long flags;
spin_lock_irqsave(&tls_device_lock, flags); + if (unlikely(!refcount_dec_and_test(&ctx->refcount))) + goto unlock; + list_move_tail(&ctx->list, &tls_device_gc_list);
/* schedule_work inside the spinlock * to make sure tls_device_down waits for that work. */ schedule_work(&tls_device_gc_work); - +unlock: spin_unlock_irqrestore(&tls_device_lock, flags); }
@@ -194,8 +197,7 @@ void tls_device_sk_destruct(struct sock *sk) clean_acked_data_disable(inet_csk(sk)); }
- if (refcount_dec_and_test(&tls_ctx->refcount)) - tls_device_queue_ctx_destruction(tls_ctx); + tls_device_queue_ctx_destruction(tls_ctx); } EXPORT_SYMBOL_GPL(tls_device_sk_destruct);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 473aad9ad57ff760005377e6f45a2ad4210e08ce category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit f6da2267e71106474fbc0943dc24928b9cb79119 ]
While reading sysctl_igmp_llm_reports, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
This test can be packed into a helper, so such changes will be in the follow-up series after net is merged into net-next.
if (ipv4_is_local_multicast(pmc->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
Fixes: df2cf4a78e48 ("IGMP: Inhibit reports for local multicast groups") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/igmp.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 3817988a5a1d..fd9306950a26 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -467,7 +467,8 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ip_mc_list *pmc,
if (pmc->multiaddr == IGMP_ALL_HOSTS) return skb; - if (ipv4_is_local_multicast(pmc->multiaddr) && !net->ipv4.sysctl_igmp_llm_reports) + if (ipv4_is_local_multicast(pmc->multiaddr) && + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return skb;
mtu = READ_ONCE(dev->mtu); @@ -593,7 +594,7 @@ static int igmpv3_send_report(struct in_device *in_dev, struct ip_mc_list *pmc) if (pmc->multiaddr == IGMP_ALL_HOSTS) continue; if (ipv4_is_local_multicast(pmc->multiaddr) && - !net->ipv4.sysctl_igmp_llm_reports) + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) continue; spin_lock_bh(&pmc->lock); if (pmc->sfcount[MCAST_EXCLUDE]) @@ -736,7 +737,8 @@ static int igmp_send_report(struct in_device *in_dev, struct ip_mc_list *pmc, if (type == IGMPV3_HOST_MEMBERSHIP_REPORT) return igmpv3_send_report(in_dev, pmc);
- if (ipv4_is_local_multicast(group) && !net->ipv4.sysctl_igmp_llm_reports) + if (ipv4_is_local_multicast(group) && + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return 0;
if (type == IGMP_HOST_LEAVE_MESSAGE) @@ -920,7 +922,8 @@ static bool igmp_heard_report(struct in_device *in_dev, __be32 group)
if (group == IGMP_ALL_HOSTS) return false; - if (ipv4_is_local_multicast(group) && !net->ipv4.sysctl_igmp_llm_reports) + if (ipv4_is_local_multicast(group) && + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return false;
rcu_read_lock(); @@ -1045,7 +1048,7 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, if (im->multiaddr == IGMP_ALL_HOSTS) continue; if (ipv4_is_local_multicast(im->multiaddr) && - !net->ipv4.sysctl_igmp_llm_reports) + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) continue; spin_lock_bh(&im->lock); if (im->tm_running) @@ -1296,7 +1299,8 @@ static void __igmp_group_dropped(struct ip_mc_list *im, gfp_t gfp) #ifdef CONFIG_IP_MULTICAST if (im->multiaddr == IGMP_ALL_HOSTS) return; - if (ipv4_is_local_multicast(im->multiaddr) && !net->ipv4.sysctl_igmp_llm_reports) + if (ipv4_is_local_multicast(im->multiaddr) && + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return;
reporter = im->reporter; @@ -1338,7 +1342,8 @@ static void igmp_group_added(struct ip_mc_list *im) #ifdef CONFIG_IP_MULTICAST if (im->multiaddr == IGMP_ALL_HOSTS) return; - if (ipv4_is_local_multicast(im->multiaddr) && !net->ipv4.sysctl_igmp_llm_reports) + if (ipv4_is_local_multicast(im->multiaddr) && + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return;
if (in_dev->dead) @@ -1642,7 +1647,7 @@ static void ip_mc_rejoin_groups(struct in_device *in_dev) if (im->multiaddr == IGMP_ALL_HOSTS) continue; if (ipv4_is_local_multicast(im->multiaddr) && - !net->ipv4.sysctl_igmp_llm_reports) + !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) continue;
/* a failover is happening and switches
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 7d26db0053548f66667da9de680fa45fa6819da5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 6305d821e3b9b5379d348528e5b5faf316383bc2 ]
While reading sysctl_igmp_max_memberships, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/igmp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index fd9306950a26..1a70664dcb1a 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -2197,7 +2197,7 @@ static int __ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr, count++; } err = -ENOBUFS; - if (count >= net->ipv4.sysctl_igmp_max_memberships) + if (count >= READ_ONCE(net->ipv4.sysctl_igmp_max_memberships)) goto done; iml = sock_kmalloc(sk, sizeof(*iml), GFP_KERNEL); if (!iml)
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit f85119fb3fd6985f0f68f3454a826f1e267b4b71 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 6ae0f2e553737b8cce49a1372573c81130ffa80e ]
While reading sysctl_igmp_max_msf, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/igmp.c | 2 +- net/ipv4/ip_sockglue.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 1a70664dcb1a..428cc3a4c36f 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -2384,7 +2384,7 @@ int ip_mc_source(int add, int omode, struct sock *sk, struct } /* else, add a new source to the filter */
- if (psl && psl->sl_count >= net->ipv4.sysctl_igmp_max_msf) { + if (psl && psl->sl_count >= READ_ONCE(net->ipv4.sysctl_igmp_max_msf)) { err = -ENOBUFS; goto done; } diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index ec6036713e2c..22507a6a3f71 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -783,7 +783,7 @@ static int ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval, int optlen) /* numsrc >= (4G-140)/128 overflow in 32 bits */ err = -ENOBUFS; if (gsf->gf_numsrc >= 0x1ffffff || - gsf->gf_numsrc > sock_net(sk)->ipv4.sysctl_igmp_max_msf) + gsf->gf_numsrc > READ_ONCE(sock_net(sk)->ipv4.sysctl_igmp_max_msf)) goto out_free_gsf;
err = -EINVAL; @@ -832,7 +832,7 @@ static int compat_ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
/* numsrc >= (4G-140)/128 overflow in 32 bits */ err = -ENOBUFS; - if (n > sock_net(sk)->ipv4.sysctl_igmp_max_msf) + if (n > READ_ONCE(sock_net(sk)->ipv4.sysctl_igmp_max_msf)) goto out_free_gsf; err = set_mcast_msfilter(sk, gf32->gf_interface, n, gf32->gf_fmode, &gf32->gf_group, gf32->gf_slist); @@ -1242,7 +1242,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname, } /* numsrc >= (1G-4) overflow in 32 bits */ if (msf->imsf_numsrc >= 0x3ffffffcU || - msf->imsf_numsrc > net->ipv4.sysctl_igmp_max_msf) { + msf->imsf_numsrc > READ_ONCE(net->ipv4.sysctl_igmp_max_msf)) { kfree(msf); err = -ENOBUFS; break;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit fc489055e7e8abe409224ff18863999cede92fbf category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit f2f316e287e6c2e3a1c5bab8d9b77ee03daa0463 ]
While reading sysctl_tcp_keepalive_(time|probes|intvl), they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/tcp.h | 9 ++++++--- net/smc/smc_llc.c | 2 +- 2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 73959a256fdc..48a0faed8800 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1449,21 +1449,24 @@ static inline int keepalive_intvl_when(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp);
- return tp->keepalive_intvl ? : net->ipv4.sysctl_tcp_keepalive_intvl; + return tp->keepalive_intvl ? : + READ_ONCE(net->ipv4.sysctl_tcp_keepalive_intvl); }
static inline int keepalive_time_when(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp);
- return tp->keepalive_time ? : net->ipv4.sysctl_tcp_keepalive_time; + return tp->keepalive_time ? : + READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time); }
static inline int keepalive_probes(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp);
- return tp->keepalive_probes ? : net->ipv4.sysctl_tcp_keepalive_probes; + return tp->keepalive_probes ? : + READ_ONCE(net->ipv4.sysctl_tcp_keepalive_probes); }
static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp) diff --git a/net/smc/smc_llc.c b/net/smc/smc_llc.c index ee1f0fdba085..0ef15f8fba90 100644 --- a/net/smc/smc_llc.c +++ b/net/smc/smc_llc.c @@ -1787,7 +1787,7 @@ void smc_llc_lgr_init(struct smc_link_group *lgr, struct smc_sock *smc) init_waitqueue_head(&lgr->llc_flow_waiter); init_waitqueue_head(&lgr->llc_msg_waiter); mutex_init(&lgr->llc_conf_mutex); - lgr->llc_testlink_time = net->ipv4.sysctl_tcp_keepalive_time; + lgr->llc_testlink_time = READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time); }
/* called after lgr was removed from lgr_list */
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit dc1a78a2b274bad6b1be1561759ab670640af2d7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit f2e383b5bb6bbc60a0b94b87b3e49a2b1aefd11e ]
While reading sysctl_tcp_syncookies, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/core/filter.c | 4 ++-- net/ipv4/syncookies.c | 3 ++- net/ipv4/tcp_input.c | 20 ++++++++++++-------- net/ipv6/syncookies.c | 3 ++- 4 files changed, 18 insertions(+), 12 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c index 0299834e024c..0644dde98433 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -6577,7 +6577,7 @@ BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN) return -EINVAL;
- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies) + if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies)) return -EINVAL;
if (!th->ack || th->rst || th->syn) @@ -6652,7 +6652,7 @@ BPF_CALL_5(bpf_tcp_gen_syncookie, struct sock *, sk, void *, iph, u32, iph_len, if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN) return -EINVAL;
- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies) + if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies)) return -ENOENT;
if (!th->syn || th->ack || th->fin || th->rst) diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index ce69d24ff5e8..fdf097df4e5c 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -342,7 +342,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) struct flowi4 fl4; u32 tsoff = 0;
- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies || !th->ack || th->rst) + if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) || + !th->ack || th->rst) goto out;
if (tcp_synq_no_recent_overflow(sk)) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 42c0245227c8..a751541d48ef 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6709,11 +6709,14 @@ static bool tcp_syn_flood_action(const struct sock *sk, const char *proto) { struct request_sock_queue *queue = &inet_csk(sk)->icsk_accept_queue; const char *msg = "Dropping request"; - bool want_cookie = false; struct net *net = sock_net(sk); + bool want_cookie = false; + u8 syncookies; + + syncookies = READ_ONCE(net->ipv4.sysctl_tcp_syncookies);
#ifdef CONFIG_SYN_COOKIES - if (net->ipv4.sysctl_tcp_syncookies) { + if (syncookies) { msg = "Sending cookies"; want_cookie = true; __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPREQQFULLDOCOOKIES); @@ -6721,8 +6724,7 @@ static bool tcp_syn_flood_action(const struct sock *sk, const char *proto) #endif __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPREQQFULLDROP);
- if (!queue->synflood_warned && - net->ipv4.sysctl_tcp_syncookies != 2 && + if (!queue->synflood_warned && syncookies != 2 && xchg(&queue->synflood_warned, 1) == 0) net_info_ratelimited("%s: Possible SYN flooding on port %d. %s. Check SNMP counters.\n", proto, sk->sk_num, msg); @@ -6771,7 +6773,7 @@ u16 tcp_get_syncookie_mss(struct request_sock_ops *rsk_ops, struct tcp_sock *tp = tcp_sk(sk); u16 mss;
- if (sock_net(sk)->ipv4.sysctl_tcp_syncookies != 2 && + if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) != 2 && !inet_csk_reqsk_queue_is_full(sk)) return 0;
@@ -6805,13 +6807,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, bool want_cookie = false; struct dst_entry *dst; struct flowi fl; + u8 syncookies; + + syncookies = READ_ONCE(net->ipv4.sysctl_tcp_syncookies);
/* TW buckets are converted to open requests without * limitations, they conserve resources and peer is * evidently real one. */ - if ((net->ipv4.sysctl_tcp_syncookies == 2 || - inet_csk_reqsk_queue_is_full(sk)) && !isn) { + if ((syncookies == 2 || inet_csk_reqsk_queue_is_full(sk)) && !isn) { want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name); if (!want_cookie) goto drop; @@ -6866,7 +6870,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
if (!want_cookie && !isn) { /* Kill the following clause, if you dislike this way. */ - if (!net->ipv4.sysctl_tcp_syncookies && + if (!syncookies && (net->ipv4.sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) < (net->ipv4.sysctl_max_syn_backlog >> 2)) && !tcp_peer_is_proven(req, dst)) { diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index 3c57734e0dc5..21fc9353590e 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -141,7 +141,8 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) __u8 rcv_wscale; u32 tsoff = 0;
- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies || !th->ack || th->rst) + if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) || + !th->ack || th->rst) goto out;
if (tcp_synq_no_recent_overflow(sk))
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 474510e174fb2bc9751e04885e4cc30023bcf9d7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 46778cd16e6a5ad1b2e3a91f6c057c907379418e ]
While reading sysctl_tcp_reordering, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp.c | 2 +- net/ipv4/tcp_input.c | 10 +++++++--- net/ipv4/tcp_metrics.c | 3 ++- 3 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 1270772cb1f3..e2fdb4f9b66a 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -444,7 +444,7 @@ void tcp_init_sock(struct sock *sk) tp->snd_cwnd_clamp = ~0; tp->mss_cache = TCP_MSS_DEFAULT;
- tp->reordering = sock_net(sk)->ipv4.sysctl_tcp_reordering; + tp->reordering = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reordering); tcp_assign_congestion_control(sk);
tp->tsoffset = 0; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index a751541d48ef..b87d2ceab910 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2099,6 +2099,7 @@ void tcp_enter_loss(struct sock *sk) struct tcp_sock *tp = tcp_sk(sk); struct net *net = sock_net(sk); bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery; + u8 reordering;
tcp_timeout_mark_lost(sk);
@@ -2119,10 +2120,12 @@ void tcp_enter_loss(struct sock *sk) /* Timeout in disordered state after receiving substantial DUPACKs * suggests that the degree of reordering is over-estimated. */ + reordering = READ_ONCE(net->ipv4.sysctl_tcp_reordering); if (icsk->icsk_ca_state <= TCP_CA_Disorder && - tp->sacked_out >= net->ipv4.sysctl_tcp_reordering) + tp->sacked_out >= reordering) tp->reordering = min_t(unsigned int, tp->reordering, - net->ipv4.sysctl_tcp_reordering); + reordering); + tcp_set_ca_state(sk, TCP_CA_Loss); tp->high_seq = tp->snd_nxt; tcp_ecn_queue_cwr(tp); @@ -3411,7 +3414,8 @@ static inline bool tcp_may_raise_cwnd(const struct sock *sk, const int flag) * new SACK or ECE mark may first advance cwnd here and later reduce * cwnd in tcp_fastretrans_alert() based on more states. */ - if (tcp_sk(sk)->reordering > sock_net(sk)->ipv4.sysctl_tcp_reordering) + if (tcp_sk(sk)->reordering > + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reordering)) return flag & FLAG_FORWARD_PROGRESS;
return flag & FLAG_DATA_ACKED; diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c index 6b27c481fe18..8d7e32f4abf6 100644 --- a/net/ipv4/tcp_metrics.c +++ b/net/ipv4/tcp_metrics.c @@ -428,7 +428,8 @@ void tcp_update_metrics(struct sock *sk) if (!tcp_metric_locked(tm, TCP_METRIC_REORDERING)) { val = tcp_metric_get(tm, TCP_METRIC_REORDERING); if (val < tp->reordering && - tp->reordering != net->ipv4.sysctl_tcp_reordering) + tp->reordering != + READ_ONCE(net->ipv4.sysctl_tcp_reordering)) tcp_metric_set(tm, TCP_METRIC_REORDERING, tp->reordering); }
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 768e424607203ae0d6fcba708cb41e7962ed9d6a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 39e24435a776e9de5c6dd188836cf2523547804b ]
While reading these sysctl knobs, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers.
- tcp_retries1 - tcp_retries2 - tcp_orphan_retries - tcp_fin_timeout
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/tcp.h | 3 ++- net/ipv4/tcp.c | 2 +- net/ipv4/tcp_output.c | 2 +- net/ipv4/tcp_timer.c | 10 +++++----- 4 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 48a0faed8800..6094ee2ba43a 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1479,7 +1479,8 @@ static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp)
static inline int tcp_fin_time(const struct sock *sk) { - int fin_timeout = tcp_sk(sk)->linger2 ? : sock_net(sk)->ipv4.sysctl_tcp_fin_timeout; + int fin_timeout = tcp_sk(sk)->linger2 ? : + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fin_timeout); const int rto = inet_csk(sk)->icsk_rto;
if (fin_timeout < (rto << 2) - (rto >> 1)) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e2fdb4f9b66a..bbd9574386c2 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3731,7 +3731,7 @@ static int do_tcp_getsockopt(struct sock *sk, int level, case TCP_LINGER2: val = tp->linger2; if (val >= 0) - val = (val ? : net->ipv4.sysctl_tcp_fin_timeout) / HZ; + val = (val ? : READ_ONCE(net->ipv4.sysctl_tcp_fin_timeout)) / HZ; break; case TCP_DEFER_ACCEPT: val = retrans_to_secs(icsk->icsk_accept_queue.rskq_defer_accept, diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 9ddd3ff6bd80..eccce704ed99 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -4144,7 +4144,7 @@ void tcp_send_probe0(struct sock *sk)
icsk->icsk_probes_out++; if (err <= 0) { - if (icsk->icsk_backoff < net->ipv4.sysctl_tcp_retries2) + if (icsk->icsk_backoff < READ_ONCE(net->ipv4.sysctl_tcp_retries2)) icsk->icsk_backoff++; timeout = tcp_probe0_when(sk, TCP_RTO_MAX); } else { diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index da92c5d70b70..e20fd86a2a89 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -143,7 +143,7 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset) */ static int tcp_orphan_retries(struct sock *sk, bool alive) { - int retries = sock_net(sk)->ipv4.sysctl_tcp_orphan_retries; /* May be zero. */ + int retries = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_orphan_retries); /* May be zero. */
/* We know from an ICMP that something is wrong. */ if (sk->sk_err_soft && !alive) @@ -242,14 +242,14 @@ static int tcp_write_timeout(struct sock *sk) retry_until = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries; expired = icsk->icsk_retransmits >= retry_until; } else { - if (retransmits_timed_out(sk, net->ipv4.sysctl_tcp_retries1, 0)) { + if (retransmits_timed_out(sk, READ_ONCE(net->ipv4.sysctl_tcp_retries1), 0)) { /* Black hole detection */ tcp_mtu_probing(icsk, sk);
__dst_negative_advice(sk); }
- retry_until = net->ipv4.sysctl_tcp_retries2; + retry_until = READ_ONCE(net->ipv4.sysctl_tcp_retries2); if (sock_flag(sk, SOCK_DEAD)) { const bool alive = icsk->icsk_rto < TCP_RTO_MAX;
@@ -380,7 +380,7 @@ static void tcp_probe_timer(struct sock *sk) msecs_to_jiffies(icsk->icsk_user_timeout)) goto abort;
- max_probes = sock_net(sk)->ipv4.sysctl_tcp_retries2; + max_probes = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_retries2); if (sock_flag(sk, SOCK_DEAD)) { const bool alive = inet_csk_rto_backoff(icsk, TCP_RTO_MAX) < TCP_RTO_MAX;
@@ -585,7 +585,7 @@ void tcp_retransmit_timer(struct sock *sk) } inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, tcp_clamp_rto_to_user_timeout(sk), TCP_RTO_MAX); - if (retransmits_timed_out(sk, net->ipv4.sysctl_tcp_retries1 + 1, 0)) + if (retransmits_timed_out(sk, READ_ONCE(net->ipv4.sysctl_tcp_retries1) + 1, 0)) __sk_dst_reset(sk);
out:;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit fd6f1284e380c377932186042ff0b5c987fb2b92 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 55be873695ed8912eb77ff46d1d1cadf028bd0f3 ]
While reading sysctl_tcp_notsent_lowat, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/tcp.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 6094ee2ba43a..059e47fe8c37 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1975,7 +1975,7 @@ void __tcp_v4_send_check(struct sk_buff *skb, __be32 saddr, __be32 daddr); static inline u32 tcp_notsent_lowat(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp); - return tp->notsent_lowat ?: net->ipv4.sysctl_tcp_notsent_lowat; + return tp->notsent_lowat ?: READ_ONCE(net->ipv4.sysctl_tcp_notsent_lowat); }
/* @wake is one when sk_stream_write_space() calls us.
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit b6c189aa801a9c8749952c65038bc08d4fde8ce4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit cbfc6495586a3f09f6f07d9fb3c7cafe807e3c55 ]
While reading sysctl_tcp_tw_reuse, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_ipv4.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index c854d17cf99a..16fbadb1c2a8 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -106,10 +106,10 @@ static u32 tcp_v4_init_ts_off(const struct net *net, const struct sk_buff *skb)
int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp) { + int reuse = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_tw_reuse); const struct inet_timewait_sock *tw = inet_twsk(sktw); const struct tcp_timewait_sock *tcptw = tcp_twsk(sktw); struct tcp_sock *tp = tcp_sk(sk); - int reuse = sock_net(sk)->ipv4.sysctl_tcp_tw_reuse;
if (reuse == 2) { /* Still does not detect *everything* that goes through
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit b3ce32e33ab71f5b6b69cba35825f43e5d979f55 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 79539f34743d3e14cc1fa6577d326a82cc64d62f ]
While reading sysctl_max_syn_backlog, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_input.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index b87d2ceab910..1b9c83494ee8 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6873,10 +6873,12 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, goto drop_and_free;
if (!want_cookie && !isn) { + int max_syn_backlog = READ_ONCE(net->ipv4.sysctl_max_syn_backlog); + /* Kill the following clause, if you dislike this way. */ if (!syncookies && - (net->ipv4.sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) < - (net->ipv4.sysctl_max_syn_backlog >> 2)) && + (max_syn_backlog - inet_csk_reqsk_queue_len(sk) < + (max_syn_backlog >> 2)) && !tcp_peer_is_proven(req, dst)) { /* Without syncookies last quarter of * backlog is filled with destinations,
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 22938534c611136f35e2ca545bb668073ca5ef49 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 5a54213318c43f4009ae158347aa6016e3b9b55a ]
While reading sysctl_tcp_fastopen, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 2100c8d2d9db ("net-tcp: Fast Open base") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Acked-by: Yuchung Cheng ycheng@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/af_inet.c | 2 +- net/ipv4/tcp.c | 6 ++++-- net/ipv4/tcp_fastopen.c | 4 ++-- 3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 3d012716e444..4629beccd2d4 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -220,7 +220,7 @@ int inet_listen(struct socket *sock, int backlog) * because the socket was in TCP_LISTEN state previously but * was shutdown() rather than close(). */ - tcp_fastopen = sock_net(sk)->ipv4.sysctl_tcp_fastopen; + tcp_fastopen = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen); if ((tcp_fastopen & TFO_SERVER_WO_SOCKOPT1) && (tcp_fastopen & TFO_SERVER_ENABLE) && !inet_csk(sk)->icsk_accept_queue.fastopenq.max_qlen) { diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index bbd9574386c2..5a55a745a409 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1152,7 +1152,8 @@ static int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, struct sockaddr *uaddr = msg->msg_name; int err, flags;
- if (!(sock_net(sk)->ipv4.sysctl_tcp_fastopen & TFO_CLIENT_ENABLE) || + if (!(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen) & + TFO_CLIENT_ENABLE) || (uaddr && msg->msg_namelen >= sizeof(uaddr->sa_family) && uaddr->sa_family == AF_UNSPEC)) return -EOPNOTSUPP; @@ -3393,7 +3394,8 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname, case TCP_FASTOPEN_CONNECT: if (val > 1 || val < 0) { err = -EINVAL; - } else if (net->ipv4.sysctl_tcp_fastopen & TFO_CLIENT_ENABLE) { + } else if (READ_ONCE(net->ipv4.sysctl_tcp_fastopen) & + TFO_CLIENT_ENABLE) { if (sk->sk_state == TCP_CLOSE) tp->fastopen_connect = val; else diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index 107111984384..ed7aa6ae7b51 100644 --- a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -349,7 +349,7 @@ static bool tcp_fastopen_no_cookie(const struct sock *sk, const struct dst_entry *dst, int flag) { - return (sock_net(sk)->ipv4.sysctl_tcp_fastopen & flag) || + return (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen) & flag) || tcp_sk(sk)->fastopen_no_cookie || (dst && dst_metric(dst, RTAX_FASTOPEN_NO_COOKIE)); } @@ -364,7 +364,7 @@ struct sock *tcp_try_fastopen(struct sock *sk, struct sk_buff *skb, const struct dst_entry *dst) { bool syn_data = TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq + 1; - int tcp_fastopen = sock_net(sk)->ipv4.sysctl_tcp_fastopen; + int tcp_fastopen = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen); struct tcp_fastopen_cookie valid_foc = { .len = -1 }; struct sock *child; int ret = 0;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 0dc2f19d8c2636cebda7976b5ea40c6d69f0d891 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 021266ec640c7a4527e6cd4b7349a512b351de1d ]
While reading sysctl_tcp_fastopen_blackhole_timeout, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: cf1ef3f0719b ("net/tcp_fastopen: Disable active side TFO in certain scenarios") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_fastopen.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index ed7aa6ae7b51..39fb037ce5f3 100644 --- a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -506,7 +506,7 @@ void tcp_fastopen_active_disable(struct sock *sk) { struct net *net = sock_net(sk);
- if (!sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout) + if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout)) return;
/* Paired with READ_ONCE() in tcp_fastopen_active_should_disable() */ @@ -527,7 +527,8 @@ void tcp_fastopen_active_disable(struct sock *sk) */ bool tcp_fastopen_active_should_disable(struct sock *sk) { - unsigned int tfo_bh_timeout = sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout; + unsigned int tfo_bh_timeout = + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout); unsigned long timeout; int tfo_da_times; int multiplier;
From: Przemyslaw Patynowski przemyslawx.patynowski@intel.com
stable inclusion from stable-v5.10.134 commit c6af94324911ef0846af1a5ce5e049ca736db34b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit a9f49e0060301a9bfebeca76739158d0cf91cdf6 ]
Fix memory leak caused by not handling dummy receive descriptor properly. iavf_get_rx_buffer now sets the rx_buffer return value for dummy receive descriptors. Without this patch, when the hardware writes a dummy descriptor, iavf would not free the page allocated for the previous receive buffer. This is an unlikely event but can still happen.
[Jesse: massaged commit message]
Fixes: efa14c398582 ("iavf: allow null RX descriptors") Signed-off-by: Przemyslaw Patynowski przemyslawx.patynowski@intel.com Signed-off-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/intel/iavf/iavf_txrx.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c index ffaf2742a2e0..e427b3a868e1 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c +++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c @@ -1250,11 +1250,10 @@ static struct iavf_rx_buffer *iavf_get_rx_buffer(struct iavf_ring *rx_ring, { struct iavf_rx_buffer *rx_buffer;
- if (!size) - return NULL; - rx_buffer = &rx_ring->rx_bi[rx_ring->next_to_clean]; prefetchw(rx_buffer->page); + if (!size) + return rx_buffer;
/* we are reusing so sync this buffer for CPU use */ dma_sync_single_range_for_cpu(rx_ring->dev,
From: Dawid Lukwinski dawid.lukwinski@intel.com
stable inclusion from stable-v5.10.134 commit 6f949e5615f89558416ee18892232305d57a5f38 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit f838a63369818faadec4ad1736cfbd20ab5da00e ]
Fix an issue when driver incorrectly detects state of recovery process and erroneously reinitializes interrupts, which results in a kernel error and call trace message.
The issue was caused by a combination of two factors: 1. Assuming the EMP reset issued after completing firmware recovery means the whole recovery process is complete. 2. Erroneous reinitialization of interrupt vector after detecting the above mentioned EMP reset.
Fixes (1) by changing how recovery state change is detected and (2) by adjusting the conditional expression to ensure using proper interrupt reinitialization method, depending on the situation.
Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support") Signed-off-by: Dawid Lukwinski dawid.lukwinski@intel.com Signed-off-by: Jan Sokolowski jan.sokolowski@intel.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20220715214542.2968762-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/intel/i40e/i40e_main.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 58453f7958df..11d4e3ba9af4 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -10187,7 +10187,7 @@ static int i40e_reset(struct i40e_pf *pf) **/ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired) { - int old_recovery_mode_bit = test_bit(__I40E_RECOVERY_MODE, pf->state); + const bool is_recovery_mode_reported = i40e_check_recovery_mode(pf); struct i40e_vsi *vsi = pf->vsi[pf->lan_vsi]; struct i40e_hw *hw = &pf->hw; i40e_status ret; @@ -10195,13 +10195,11 @@ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired) int v;
if (test_bit(__I40E_EMP_RESET_INTR_RECEIVED, pf->state) && - i40e_check_recovery_mode(pf)) { + is_recovery_mode_reported) i40e_set_ethtool_ops(pf->vsi[pf->lan_vsi]->netdev); - }
if (test_bit(__I40E_DOWN, pf->state) && - !test_bit(__I40E_RECOVERY_MODE, pf->state) && - !old_recovery_mode_bit) + !test_bit(__I40E_RECOVERY_MODE, pf->state)) goto clear_recovery; dev_dbg(&pf->pdev->dev, "Rebuilding internal switch\n");
@@ -10228,13 +10226,12 @@ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired) * accordingly with regard to resources initialization * and deinitialization */ - if (test_bit(__I40E_RECOVERY_MODE, pf->state) || - old_recovery_mode_bit) { + if (test_bit(__I40E_RECOVERY_MODE, pf->state)) { if (i40e_get_capabilities(pf, i40e_aqc_opc_list_func_capabilities)) goto end_unlock;
- if (test_bit(__I40E_RECOVERY_MODE, pf->state)) { + if (is_recovery_mode_reported) { /* we're staying in recovery mode so we'll reinitialize * misc vector here */
From: Piotr Skajewski piotrx.skajewski@intel.com
stable inclusion from stable-v5.10.134 commit b82de63f8f817b5735480293dda8e92ba8170c52 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 1e53834ce541d4fe271cdcca7703e50be0a44f8a ]
It is possible to disable VFs while the PF driver is processing requests from the VF driver. This can result in a panic.
BUG: unable to handle kernel paging request at 000000000000106c PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: G I --------- - Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020 RIP: 0010:ixgbe_msg_task+0x4c8/0x1690 [ixgbe] Code: 00 00 48 8d 04 40 48 c1 e0 05 89 7c 24 24 89 fd 48 89 44 24 10 83 ff 01 0f 84 b8 04 00 00 4c 8b 64 24 10 4d 03 a5 48 22 00 00 <41> 80 7c 24 4c 00 0f 84 8a 03 00 00 0f b7 c7 83 f8 08 0f 84 8f 0a RSP: 0018:ffffb337869f8df8 EFLAGS: 00010002 RAX: 0000000000001020 RBX: 0000000000000000 RCX: 000000000000002b RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000006 RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000029780 R10: 00006957d8f42832 R11: 0000000000000000 R12: 0000000000001020 R13: ffff8a00e8978ac0 R14: 000000000000002b R15: ffff8a00e8979c80 FS: 0000000000000000(0000) GS:ffff8a07dfd00000(0000) knlGS:00000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000106c CR3: 0000000063e10004 CR4: 00000000007726e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? ttwu_do_wakeup+0x19/0x140 ? try_to_wake_up+0x1cd/0x550 ? ixgbevf_update_xcast_mode+0x71/0xc0 [ixgbevf] ixgbe_msix_other+0x17e/0x310 [ixgbe] __handle_irq_event_percpu+0x40/0x180 handle_irq_event_percpu+0x30/0x80 handle_irq_event+0x36/0x53 handle_edge_irq+0x82/0x190 handle_irq+0x1c/0x30 do_IRQ+0x49/0xd0 common_interrupt+0xf/0xf
This can be eventually be reproduced with the following script:
while : do echo 63 > /sys/class/net/<devname>/device/sriov_numvfs sleep 1 echo 0 > /sys/class/net/<devname>/device/sriov_numvfs sleep 1 done
Add lock when disabling SR-IOV to prevent process VF mailbox communication.
Fixes: d773d1310625 ("ixgbe: Fix memory leak when SR-IOV VFs are direct assigned") Signed-off-by: Piotr Skajewski piotrx.skajewski@intel.com Tested-by: Marek Szlosek marek.szlosek@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20220715214456.2968711-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/intel/ixgbe/ixgbe.h | 1 + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 +++ drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 6 ++++++ 3 files changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h index eaa992e7c591..93828720da57 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h @@ -775,6 +775,7 @@ struct ixgbe_adapter { #ifdef CONFIG_IXGBE_IPSEC struct ixgbe_ipsec *ipsec; #endif /* CONFIG_IXGBE_IPSEC */ + spinlock_t vfs_lock; };
static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 74ee496e4031..b53ee892cf32 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -6398,6 +6398,9 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter, /* n-tuple support exists, always init our spinlock */ spin_lock_init(&adapter->fdir_perfect_lock);
+ /* init spinlock to avoid concurrency of VF resources */ + spin_lock_init(&adapter->vfs_lock); + #ifdef CONFIG_IXGBE_DCB ixgbe_init_dcb(adapter); #endif diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c index d4e63f0644c3..a1e69c734863 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c @@ -205,10 +205,13 @@ void ixgbe_enable_sriov(struct ixgbe_adapter *adapter, unsigned int max_vfs) int ixgbe_disable_sriov(struct ixgbe_adapter *adapter) { unsigned int num_vfs = adapter->num_vfs, vf; + unsigned long flags; int rss;
+ spin_lock_irqsave(&adapter->vfs_lock, flags); /* set num VFs to 0 to prevent access to vfinfo */ adapter->num_vfs = 0; + spin_unlock_irqrestore(&adapter->vfs_lock, flags);
/* put the reference to all of the vf devices */ for (vf = 0; vf < num_vfs; ++vf) { @@ -1355,8 +1358,10 @@ static void ixgbe_rcv_ack_from_vf(struct ixgbe_adapter *adapter, u32 vf) void ixgbe_msg_task(struct ixgbe_adapter *adapter) { struct ixgbe_hw *hw = &adapter->hw; + unsigned long flags; u32 vf;
+ spin_lock_irqsave(&adapter->vfs_lock, flags); for (vf = 0; vf < adapter->num_vfs; vf++) { /* process any reset requests */ if (!ixgbe_check_for_rst(hw, vf)) @@ -1370,6 +1375,7 @@ void ixgbe_msg_task(struct ixgbe_adapter *adapter) if (!ixgbe_check_for_ack(hw, vf)) ixgbe_rcv_ack_from_vf(adapter, vf); } + spin_unlock_irqrestore(&adapter->vfs_lock, flags); }
static inline void ixgbe_ping_vf(struct ixgbe_adapter *adapter, int vf)
From: Haibo Chen haibo.chen@nxp.com
stable inclusion from stable-v5.10.134 commit 928ded3fc1b9f87fb60b9e977182b5087fed0218 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit db8edaa09d7461ec08672a92a2eef63d5882bb79 ]
For the device use NO AI mode(not support auto address increment), only use the single read/write when config the regmap.
We meet issue on PCA9557PW on i.MX8QXP/DXL evk board, this device do not support AI mode, but when do the regmap sync, regmap will sync 3 byte data to register 1, logically this means write first data to register 1, write second data to register 2, write third data to register 3. But this device do not support AI mode, finally, these three data write only into register 1 one by one. the reault is the value of register 1 alway equal to the latest data, here is the third data, no operation happened on register 2 and register 3. This is not what we expect.
Fixes: 49427232764d ("gpio: pca953x: Perform basic regmap conversion") Signed-off-by: Haibo Chen haibo.chen@nxp.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/gpio/gpio-pca953x.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c index bb4ca064447e..bd7828e0f2f5 100644 --- a/drivers/gpio/gpio-pca953x.c +++ b/drivers/gpio/gpio-pca953x.c @@ -350,6 +350,9 @@ static const struct regmap_config pca953x_i2c_regmap = { .reg_bits = 8, .val_bits = 8,
+ .use_single_read = true, + .use_single_write = true, + .readable_reg = pca953x_readable_register, .writeable_reg = pca953x_writeable_register, .volatile_reg = pca953x_volatile_register,
From: Haibo Chen haibo.chen@nxp.com
stable inclusion from stable-v5.10.134 commit a941e6d5ba3bf989b48490abed9e68d135b44c9b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 2abc17a93867dc816f0ed9d32021dda8078e7330 ]
regmap will sync a range of registers, here use the correct range to make sure the sync do not touch other unexpected registers.
Find on pca9557pw on imx8qxp/dxl evk board, this device support 8 pin, so only need one register(8 bits) to cover all the 8 pins's property setting. But when sync the output, we find it actually update two registers, output register and the following register.
Fixes: b76574300504 ("gpio: pca953x: Restore registers after suspend/resume cycle") Fixes: ec82d1eba346 ("gpio: pca953x: Zap ad-hoc reg_output cache") Fixes: 0f25fda840a9 ("gpio: pca953x: Zap ad-hoc reg_direction cache") Signed-off-by: Haibo Chen haibo.chen@nxp.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/gpio/gpio-pca953x.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c index bd7828e0f2f5..b63ac46beceb 100644 --- a/drivers/gpio/gpio-pca953x.c +++ b/drivers/gpio/gpio-pca953x.c @@ -899,12 +899,12 @@ static int device_pca95xx_init(struct pca953x_chip *chip, u32 invert) int ret;
ret = regcache_sync_region(chip->regmap, chip->regs->output, - chip->regs->output + NBANK(chip)); + chip->regs->output + NBANK(chip) - 1); if (ret) goto out;
ret = regcache_sync_region(chip->regmap, chip->regs->direction, - chip->regs->direction + NBANK(chip)); + chip->regs->direction + NBANK(chip) - 1); if (ret) goto out;
@@ -1117,14 +1117,14 @@ static int pca953x_regcache_sync(struct device *dev) * sync these registers first and only then sync the rest. */ regaddr = pca953x_recalc_addr(chip, chip->regs->direction, 0); - ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip)); + ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip) - 1); if (ret) { dev_err(dev, "Failed to sync GPIO dir registers: %d\n", ret); return ret; }
regaddr = pca953x_recalc_addr(chip, chip->regs->output, 0); - ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip)); + ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip) - 1); if (ret) { dev_err(dev, "Failed to sync GPIO out registers: %d\n", ret); return ret; @@ -1134,7 +1134,7 @@ static int pca953x_regcache_sync(struct device *dev) if (chip->driver_data & PCA_PCAL) { regaddr = pca953x_recalc_addr(chip, PCAL953X_IN_LATCH, 0); ret = regcache_sync_region(chip->regmap, regaddr, - regaddr + NBANK(chip)); + regaddr + NBANK(chip) - 1); if (ret) { dev_err(dev, "Failed to sync INT latch registers: %d\n", ret); @@ -1143,7 +1143,7 @@ static int pca953x_regcache_sync(struct device *dev)
regaddr = pca953x_recalc_addr(chip, PCAL953X_INT_MASK, 0); ret = regcache_sync_region(chip->regmap, regaddr, - regaddr + NBANK(chip)); + regaddr + NBANK(chip) - 1); if (ret) { dev_err(dev, "Failed to sync INT mask registers: %d\n", ret);
From: Haibo Chen haibo.chen@nxp.com
stable inclusion from stable-v5.10.134 commit 47523928557e4988072edbaed4ffae2e0d41b71c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit b8c768ccdd8338504fb78370747728d5002b1b5a ]
For regcache_sync_region, we need to use pca953x_recalc_addr() to get the real register address.
Fixes: ec82d1eba346 ("gpio: pca953x: Zap ad-hoc reg_output cache") Fixes: 0f25fda840a9 ("gpio: pca953x: Zap ad-hoc reg_direction cache") Signed-off-by: Haibo Chen haibo.chen@nxp.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/gpio/gpio-pca953x.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c index b63ac46beceb..957be5f69406 100644 --- a/drivers/gpio/gpio-pca953x.c +++ b/drivers/gpio/gpio-pca953x.c @@ -896,15 +896,18 @@ static int pca953x_irq_setup(struct pca953x_chip *chip, static int device_pca95xx_init(struct pca953x_chip *chip, u32 invert) { DECLARE_BITMAP(val, MAX_LINE); + u8 regaddr; int ret;
- ret = regcache_sync_region(chip->regmap, chip->regs->output, - chip->regs->output + NBANK(chip) - 1); + regaddr = pca953x_recalc_addr(chip, chip->regs->output, 0); + ret = regcache_sync_region(chip->regmap, regaddr, + regaddr + NBANK(chip) - 1); if (ret) goto out;
- ret = regcache_sync_region(chip->regmap, chip->regs->direction, - chip->regs->direction + NBANK(chip) - 1); + regaddr = pca953x_recalc_addr(chip, chip->regs->direction, 0); + ret = regcache_sync_region(chip->regmap, regaddr, + regaddr + NBANK(chip) - 1); if (ret) goto out;
From: Hristo Venev hristo@venev.name
stable inclusion from stable-v5.10.134 commit 665cbe91de2f7c97c51ca8fce39aae26477c1948 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit d7241f679a59cfe27f92cb5c6272cb429fb1f7ec ]
be_cmd_read_port_transceiver_data assumes that it is given a buffer that is at least PAGE_DATA_LEN long, or twice that if the module supports SFF 8472. However, this is not always the case.
Fix this by passing the desired offset and length to be_cmd_read_port_transceiver_data so that we only copy the bytes once.
Fixes: e36edd9d26cf ("be2net: add ethtool "-m" option support") Signed-off-by: Hristo Venev hristo@venev.name Link: https://lore.kernel.org/r/20220716085134.6095-1-hristo@venev.name Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/ethernet/emulex/benet/be_cmds.c | 10 +++--- drivers/net/ethernet/emulex/benet/be_cmds.h | 2 +- .../net/ethernet/emulex/benet/be_ethtool.c | 31 ++++++++++++------- 3 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c index 649c5c429bd7..1288b5e3d220 100644 --- a/drivers/net/ethernet/emulex/benet/be_cmds.c +++ b/drivers/net/ethernet/emulex/benet/be_cmds.c @@ -2287,7 +2287,7 @@ int be_cmd_get_beacon_state(struct be_adapter *adapter, u8 port_num, u32 *state)
/* Uses sync mcc */ int be_cmd_read_port_transceiver_data(struct be_adapter *adapter, - u8 page_num, u8 *data) + u8 page_num, u32 off, u32 len, u8 *data) { struct be_dma_mem cmd; struct be_mcc_wrb *wrb; @@ -2321,10 +2321,10 @@ int be_cmd_read_port_transceiver_data(struct be_adapter *adapter, req->port = cpu_to_le32(adapter->hba_port_num); req->page_num = cpu_to_le32(page_num); status = be_mcc_notify_wait(adapter); - if (!status) { + if (!status && len > 0) { struct be_cmd_resp_port_type *resp = cmd.va;
- memcpy(data, resp->page_data, PAGE_DATA_LEN); + memcpy(data, resp->page_data + off, len); } err: mutex_unlock(&adapter->mcc_lock); @@ -2415,7 +2415,7 @@ int be_cmd_query_cable_type(struct be_adapter *adapter) int status;
status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0, - page_data); + 0, PAGE_DATA_LEN, page_data); if (!status) { switch (adapter->phy.interface_type) { case PHY_TYPE_QSFP: @@ -2440,7 +2440,7 @@ int be_cmd_query_sfp_info(struct be_adapter *adapter) int status;
status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0, - page_data); + 0, PAGE_DATA_LEN, page_data); if (!status) { strlcpy(adapter->phy.vendor_name, page_data + SFP_VENDOR_NAME_OFFSET, SFP_VENDOR_NAME_LEN - 1); diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.h b/drivers/net/ethernet/emulex/benet/be_cmds.h index c30d6d6f0f3a..9e17d6a7ab8c 100644 --- a/drivers/net/ethernet/emulex/benet/be_cmds.h +++ b/drivers/net/ethernet/emulex/benet/be_cmds.h @@ -2427,7 +2427,7 @@ int be_cmd_set_beacon_state(struct be_adapter *adapter, u8 port_num, u8 beacon, int be_cmd_get_beacon_state(struct be_adapter *adapter, u8 port_num, u32 *state); int be_cmd_read_port_transceiver_data(struct be_adapter *adapter, - u8 page_num, u8 *data); + u8 page_num, u32 off, u32 len, u8 *data); int be_cmd_query_cable_type(struct be_adapter *adapter); int be_cmd_query_sfp_info(struct be_adapter *adapter); int lancer_cmd_read_object(struct be_adapter *adapter, struct be_dma_mem *cmd, diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c index dfa784339781..bd0df189d871 100644 --- a/drivers/net/ethernet/emulex/benet/be_ethtool.c +++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c @@ -1344,7 +1344,7 @@ static int be_get_module_info(struct net_device *netdev, return -EOPNOTSUPP;
status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0, - page_data); + 0, PAGE_DATA_LEN, page_data); if (!status) { if (!page_data[SFP_PLUS_SFF_8472_COMP]) { modinfo->type = ETH_MODULE_SFF_8079; @@ -1362,25 +1362,32 @@ static int be_get_module_eeprom(struct net_device *netdev, { struct be_adapter *adapter = netdev_priv(netdev); int status; + u32 begin, end;
if (!check_privilege(adapter, MAX_PRIVILEGES)) return -EOPNOTSUPP;
- status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0, - data); - if (status) - goto err; + begin = eeprom->offset; + end = eeprom->offset + eeprom->len; + + if (begin < PAGE_DATA_LEN) { + status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0, begin, + min_t(u32, end, PAGE_DATA_LEN) - begin, + data); + if (status) + goto err; + + data += PAGE_DATA_LEN - begin; + begin = PAGE_DATA_LEN; + }
- if (eeprom->offset + eeprom->len > PAGE_DATA_LEN) { - status = be_cmd_read_port_transceiver_data(adapter, - TR_PAGE_A2, - data + - PAGE_DATA_LEN); + if (end > PAGE_DATA_LEN) { + status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A2, + begin - PAGE_DATA_LEN, + end - begin, data); if (status) goto err; } - if (eeprom->offset) - memcpy(data, data + eeprom->offset, eeprom->len); err: return be_cmd_status(status); }
From: Liang He windhl@126.com
stable inclusion from stable-v5.10.134 commit 36f1d9c607f9457e2d8ff26eb96eb40db74b92fd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 02c87df2480ac855d88ee308ce3fa857d9bd55a8 ]
In dcss_dev_create() and dcss_dev_destroy(), we should call of_node_put() in fail path or before the dcss's destroy as of_graph_get_port_by_id() has increased the refcount.
Fixes: 9021c317b770 ("drm/imx: Add initial support for DCSS on iMX8MQ") Signed-off-by: Liang He windhl@126.com Reviewed-by: Laurentiu Palcu laurentiu.palcu@oss.nxp.com Signed-off-by: Laurentiu Palcu laurentiu.palcu@oss.nxp.com Link: https://patchwork.freedesktop.org/patch/msgid/20220714081337.374761-1-windhl... Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/gpu/drm/imx/dcss/dcss-dev.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/imx/dcss/dcss-dev.c b/drivers/gpu/drm/imx/dcss/dcss-dev.c index c849533ca83e..3f5750cc2673 100644 --- a/drivers/gpu/drm/imx/dcss/dcss-dev.c +++ b/drivers/gpu/drm/imx/dcss/dcss-dev.c @@ -207,6 +207,7 @@ struct dcss_dev *dcss_dev_create(struct device *dev, bool hdmi_output)
ret = dcss_submodules_init(dcss); if (ret) { + of_node_put(dcss->of_port); dev_err(dev, "submodules initialization failed\n"); goto clks_err; } @@ -237,6 +238,8 @@ void dcss_dev_destroy(struct dcss_dev *dcss) dcss_clocks_disable(dcss); }
+ of_node_put(dcss->of_port); + pm_runtime_disable(dcss->dev);
dcss_submodules_stop(dcss);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit e045d672ba06e1d35bacb56374d350de0ac99066 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 87507bcb4f5de16bb419e9509d874f4db6c0ad0f ]
While reading sysctl_fib_multipath_use_neigh, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: a6db4494d218 ("net: ipv4: Consider failed nexthops in multipath routes") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/fib_semantics.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 3e31bed1ab10..f62b1739f63b 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -2232,7 +2232,7 @@ void fib_select_multipath(struct fib_result *res, int hash) }
change_nexthops(fi) { - if (net->ipv4.sysctl_fib_multipath_use_neigh) { + if (READ_ONCE(net->ipv4.sysctl_fib_multipath_use_neigh)) { if (!fib_good_nh(nexthop_nh)) continue; if (!first) {
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 9add240f76af6d141d2eebd3a1558a0e503a993d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 9b55c20f83369dd54541d9ddbe3a018a8377f451 ]
sysctl_ip_prot_sock is accessed concurrently, and there is always a chance of data-race. So, all readers and writers need some basic protection to avoid load/store-tearing.
Fixes: 4548b683b781 ("Introduce a sysctl that modifies the value of PROT_SOCK.") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/ip.h | 2 +- net/ipv4/sysctl_net_ipv4.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/net/ip.h b/include/net/ip.h index ef6e8e5b4fce..602bf2372e23 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -355,7 +355,7 @@ static inline bool sysctl_dev_name_is_allowed(const char *name)
static inline bool inet_port_requires_bind_service(struct net *net, unsigned short port) { - return port < net->ipv4.sysctl_ip_prot_sock; + return port < READ_ONCE(net->ipv4.sysctl_ip_prot_sock); }
#else diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 1cffacd985c1..9f5e4425aecf 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -95,7 +95,7 @@ static int ipv4_local_port_range(struct ctl_table *table, int write, * port limit. */ if ((range[1] < range[0]) || - (range[0] < net->ipv4.sysctl_ip_prot_sock)) + (range[0] < READ_ONCE(net->ipv4.sysctl_ip_prot_sock))) ret = -EINVAL; else set_local_port_range(net, range); @@ -121,7 +121,7 @@ static int ipv4_privileged_ports(struct ctl_table *table, int write, .extra2 = &ip_privileged_port_max, };
- pports = net->ipv4.sysctl_ip_prot_sock; + pports = READ_ONCE(net->ipv4.sysctl_ip_prot_sock);
ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
@@ -133,7 +133,7 @@ static int ipv4_privileged_ports(struct ctl_table *table, int write, if (range[0] < pports) ret = -EINVAL; else - net->ipv4.sysctl_ip_prot_sock = pports; + WRITE_ONCE(net->ipv4.sysctl_ip_prot_sock, pports); }
return ret;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit fcaef69c79ec222e55643e666b80b221e70fa6a8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 3d72bb4188c708bb16758c60822fc4dda7a95174 ]
While reading sysctl_udp_l3mdev_accept, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 63a6fff353d0 ("net: Avoid receiving packets with an l3mdev on unbound UDP sockets") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/udp.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/udp.h b/include/net/udp.h index 4017f257628f..010bc324f860 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -259,7 +259,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if, int dif, int sdif) { #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV) - return inet_bound_dev_eq(!!net->ipv4.sysctl_udp_l3mdev_accept, + return inet_bound_dev_eq(!!READ_ONCE(net->ipv4.sysctl_udp_l3mdev_accept), bound_dev_if, dif, sdif); #else return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit ffc388f6f0d65f2a59798d883c63205919e6d452 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 3666f666e99600518ab20982af04a078bbdad277 ]
While reading these knobs, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers.
- tcp_sack - tcp_window_scaling - tcp_timestamps
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- .../ethernet/chelsio/inline_crypto/chtls/chtls_cm.c | 6 +++--- net/core/secure_seq.c | 4 ++-- net/ipv4/syncookies.c | 6 +++--- net/ipv4/tcp_input.c | 6 +++--- net/ipv4/tcp_output.c | 10 +++++----- 5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c index 51e071c20e39..cd6e016e6210 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c @@ -1235,8 +1235,8 @@ static struct sock *chtls_recv_sock(struct sock *lsk, csk->sndbuf = newsk->sk_sndbuf; csk->smac_idx = ((struct port_info *)netdev_priv(ndev))->smt_idx; RCV_WSCALE(tp) = select_rcv_wscale(tcp_full_space(newsk), - sock_net(newsk)-> - ipv4.sysctl_tcp_window_scaling, + READ_ONCE(sock_net(newsk)-> + ipv4.sysctl_tcp_window_scaling), tp->window_clamp); neigh_release(n); inet_inherit_port(&tcp_hashinfo, lsk, newsk); @@ -1383,7 +1383,7 @@ static void chtls_pass_accept_request(struct sock *sk, #endif } if (req->tcpopt.wsf <= 14 && - sock_net(sk)->ipv4.sysctl_tcp_window_scaling) { + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_window_scaling)) { inet_rsk(oreq)->wscale_ok = 1; inet_rsk(oreq)->snd_wscale = req->tcpopt.wsf; } diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c index 7131cd1fb2ad..189eea1372d5 100644 --- a/net/core/secure_seq.c +++ b/net/core/secure_seq.c @@ -64,7 +64,7 @@ u32 secure_tcpv6_ts_off(const struct net *net, .daddr = *(struct in6_addr *)daddr, };
- if (net->ipv4.sysctl_tcp_timestamps != 1) + if (READ_ONCE(net->ipv4.sysctl_tcp_timestamps) != 1) return 0;
ts_secret_init(); @@ -120,7 +120,7 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral); #ifdef CONFIG_INET u32 secure_tcp_ts_off(const struct net *net, __be32 saddr, __be32 daddr) { - if (net->ipv4.sysctl_tcp_timestamps != 1) + if (READ_ONCE(net->ipv4.sysctl_tcp_timestamps) != 1) return 0;
ts_secret_init(); diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index fdf097df4e5c..1a3356abe21e 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -249,12 +249,12 @@ bool cookie_timestamp_decode(const struct net *net, return true; }
- if (!net->ipv4.sysctl_tcp_timestamps) + if (!READ_ONCE(net->ipv4.sysctl_tcp_timestamps)) return false;
tcp_opt->sack_ok = (options & TS_OPT_SACK) ? TCP_SACK_SEEN : 0;
- if (tcp_opt->sack_ok && !net->ipv4.sysctl_tcp_sack) + if (tcp_opt->sack_ok && !READ_ONCE(net->ipv4.sysctl_tcp_sack)) return false;
if ((options & TS_OPT_WSCALE_MASK) == TS_OPT_WSCALE_MASK) @@ -263,7 +263,7 @@ bool cookie_timestamp_decode(const struct net *net, tcp_opt->wscale_ok = 1; tcp_opt->snd_wscale = options & TS_OPT_WSCALE_MASK;
- return net->ipv4.sysctl_tcp_window_scaling != 0; + return READ_ONCE(net->ipv4.sysctl_tcp_window_scaling) != 0; } EXPORT_SYMBOL(cookie_timestamp_decode);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 1b9c83494ee8..3af88ced5457 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4025,7 +4025,7 @@ void tcp_parse_options(const struct net *net, break; case TCPOPT_WINDOW: if (opsize == TCPOLEN_WINDOW && th->syn && - !estab && net->ipv4.sysctl_tcp_window_scaling) { + !estab && READ_ONCE(net->ipv4.sysctl_tcp_window_scaling)) { __u8 snd_wscale = *(__u8 *)ptr; opt_rx->wscale_ok = 1; if (snd_wscale > TCP_MAX_WSCALE) { @@ -4041,7 +4041,7 @@ void tcp_parse_options(const struct net *net, case TCPOPT_TIMESTAMP: if ((opsize == TCPOLEN_TIMESTAMP) && ((estab && opt_rx->tstamp_ok) || - (!estab && net->ipv4.sysctl_tcp_timestamps))) { + (!estab && READ_ONCE(net->ipv4.sysctl_tcp_timestamps)))) { opt_rx->saw_tstamp = 1; opt_rx->rcv_tsval = get_unaligned_be32(ptr); opt_rx->rcv_tsecr = get_unaligned_be32(ptr + 4); @@ -4049,7 +4049,7 @@ void tcp_parse_options(const struct net *net, break; case TCPOPT_SACK_PERM: if (opsize == TCPOLEN_SACK_PERM && th->syn && - !estab && net->ipv4.sysctl_tcp_sack) { + !estab && READ_ONCE(net->ipv4.sysctl_tcp_sack)) { opt_rx->sack_ok = TCP_SACK_SEEN; tcp_sack_reset(opt_rx); } diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index eccce704ed99..b1cb73763294 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -838,18 +838,18 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, opts->mss = tcp_advertise_mss(sk); remaining -= TCPOLEN_MSS_ALIGNED;
- if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !*md5)) { + if (likely(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps) && !*md5)) { opts->options |= OPTION_TS; opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset; opts->tsecr = tp->rx_opt.ts_recent; remaining -= TCPOLEN_TSTAMP_ALIGNED; } - if (likely(sock_net(sk)->ipv4.sysctl_tcp_window_scaling)) { + if (likely(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_window_scaling))) { opts->ws = tp->rx_opt.rcv_wscale; opts->options |= OPTION_WSCALE; remaining -= TCPOLEN_WSCALE_ALIGNED; } - if (likely(sock_net(sk)->ipv4.sysctl_tcp_sack)) { + if (likely(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_sack))) { opts->options |= OPTION_SACK_ADVERTISE; if (unlikely(!(OPTION_TS & opts->options))) remaining -= TCPOLEN_SACKPERM_ALIGNED; @@ -3700,7 +3700,7 @@ static void tcp_connect_init(struct sock *sk) * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT. */ tp->tcp_header_len = sizeof(struct tcphdr); - if (sock_net(sk)->ipv4.sysctl_tcp_timestamps) + if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps)) tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED;
#ifdef CONFIG_TCP_MD5SIG @@ -3736,7 +3736,7 @@ static void tcp_connect_init(struct sock *sk) tp->advmss - (tp->rx_opt.ts_recent_stamp ? tp->tcp_header_len - sizeof(struct tcphdr) : 0), &tp->rcv_wnd, &tp->window_clamp, - sock_net(sk)->ipv4.sysctl_tcp_window_scaling, + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_window_scaling), &rcv_wscale, rcv_wnd);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 11e8b013d16e5db63f8f76acceb5b86964098aaa category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 52e65865deb6a36718a463030500f16530eaab74 ]
While reading sysctl_tcp_early_retrans, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: eed530b6c676 ("tcp: early retransmit") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index b1cb73763294..8c9d8c04167a 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2787,7 +2787,7 @@ bool tcp_schedule_loss_probe(struct sock *sk, bool advancing_rto) if (rcu_access_pointer(tp->fastopen_rsk)) return false;
- early_retrans = sock_net(sk)->ipv4.sysctl_tcp_early_retrans; + early_retrans = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_early_retrans); /* Schedule a loss probe in 2*RTT for SACK capable connections * not in loss recovery, that are either limited by cwnd or application. */
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit d8781f7cd04091744f474a2bada74772084b9dc9 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit e7d2ef837e14a971a05f60ea08c47f3fed1a36e4 ]
While reading sysctl_tcp_recovery, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 4f41b1c58a32 ("tcp: use RACK to detect losses") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_input.c | 3 ++- net/ipv4/tcp_recovery.c | 6 ++++-- 2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 3af88ced5457..dcf3725681bd 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2055,7 +2055,8 @@ static inline void tcp_init_undo(struct tcp_sock *tp)
static bool tcp_is_rack(const struct sock *sk) { - return sock_net(sk)->ipv4.sysctl_tcp_recovery & TCP_RACK_LOSS_DETECTION; + return READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_recovery) & + TCP_RACK_LOSS_DETECTION; }
/* If we detect SACK reneging, forget all SACK information diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c index 31fc178f42c0..21fc9859d421 100644 --- a/net/ipv4/tcp_recovery.c +++ b/net/ipv4/tcp_recovery.c @@ -19,7 +19,8 @@ static u32 tcp_rack_reo_wnd(const struct sock *sk) return 0;
if (tp->sacked_out >= tp->reordering && - !(sock_net(sk)->ipv4.sysctl_tcp_recovery & TCP_RACK_NO_DUPTHRESH)) + !(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_recovery) & + TCP_RACK_NO_DUPTHRESH)) return 0; }
@@ -190,7 +191,8 @@ void tcp_rack_update_reo_wnd(struct sock *sk, struct rate_sample *rs) { struct tcp_sock *tp = tcp_sk(sk);
- if (sock_net(sk)->ipv4.sysctl_tcp_recovery & TCP_RACK_STATIC_REO_WND || + if ((READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_recovery) & + TCP_RACK_STATIC_REO_WND) || !rs->prior_delivered) return;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit cc133e4f4bc225079198192623945bb872c08143 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 7c6f2a86ca590d5187a073d987e9599985fb1c7c ]
While reading sysctl_tcp_thin_linear_timeouts, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 36e31b0af587 ("net: TCP thin linear timeouts") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_timer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index e20fd86a2a89..888683f2ff3e 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -574,7 +574,7 @@ void tcp_retransmit_timer(struct sock *sk) * linear-timeout retransmissions into a black hole */ if (sk->sk_state == TCP_ESTABLISHED && - (tp->thin_lto || net->ipv4.sysctl_tcp_thin_linear_timeouts) && + (tp->thin_lto || READ_ONCE(net->ipv4.sysctl_tcp_thin_linear_timeouts)) && tcp_stream_is_thin(tp) && icsk->icsk_retransmits <= TCP_THIN_LINEAR_RETRIES) { icsk->icsk_backoff = 0;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 0e3f82a03ec8c3808e87283e12946227415706c9 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 4845b5713ab18a1bb6e31d1fbb4d600240b8b691 ]
While reading sysctl_tcp_slow_start_after_idle, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 35089bb203f4 ("[TCP]: Add tcp_slow_start_after_idle sysctl.") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/tcp.h | 4 ++-- net/ipv4/tcp_output.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 059e47fe8c37..c49bb1d581a4 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1382,8 +1382,8 @@ static inline void tcp_slow_start_after_idle_check(struct sock *sk) struct tcp_sock *tp = tcp_sk(sk); s32 delta;
- if (!sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle || tp->packets_out || - ca_ops->cong_control) + if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle) || + tp->packets_out || ca_ops->cong_control) return; delta = tcp_jiffies32 - tp->lsndtime; if (delta > inet_csk(sk)->icsk_rto) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 8c9d8c04167a..b630fa1a9a44 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1951,7 +1951,7 @@ static void tcp_cwnd_validate(struct sock *sk, bool is_cwnd_limited) if (tp->packets_out > tp->snd_cwnd_used) tp->snd_cwnd_used = tp->packets_out;
- if (sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle && + if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle) && (s32)(tcp_jiffies32 - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto && !ca_ops->cong_control) tcp_cwnd_application_limited(sk);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit daa8b5b8694c05a383fe43ffea679aa2153277cf category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 1a63cb91f0c2fcdeced6d6edee8d1d886583d139 ]
While reading sysctl_tcp_retrans_collapse, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index b630fa1a9a44..ada84d51c99a 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3152,7 +3152,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to, struct sk_buff *skb = to, *tmp; bool first = true;
- if (!sock_net(sk)->ipv4.sysctl_tcp_retrans_collapse) + if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_retrans_collapse)) return; if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN) return;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 03bb3892f3f18fd98717aa6763ab3885416e69d0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 4e08ed41cb1194009fc1a916a59ce3ed4afd77cd ]
While reading sysctl_tcp_stdurg, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_input.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index dcf3725681bd..ab74acb50936 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -5514,7 +5514,7 @@ static void tcp_check_urg(struct sock *sk, const struct tcphdr *th) struct tcp_sock *tp = tcp_sk(sk); u32 ptr = ntohs(th->urg_ptr);
- if (ptr && !sock_net(sk)->ipv4.sysctl_tcp_stdurg) + if (ptr && !READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_stdurg)) ptr--; ptr += ntohl(th->seq);
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 805f1c7ce47030dbb663b7d2a84c9473a27df774 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 0b484c91911e758e53656d570de58c2ed81ec6f2 ]
While reading sysctl_tcp_rfc1337, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_minisocks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 32e28c55f00d..054f51cf8ea3 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -180,7 +180,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, * Oh well... nobody has a sufficient solution to this * protocol bug yet. */ - if (twsk_net(tw)->ipv4.sysctl_tcp_rfc1337 == 0) { + if (!READ_ONCE(twsk_net(tw)->ipv4.sysctl_tcp_rfc1337)) { kill: inet_twsk_deschedule_put(tw); return TCP_TW_SUCCESS;
From: Kuniyuki Iwashima kuniyu@amazon.com
stable inclusion from stable-v5.10.134 commit 064852663308c801861bd54789d81421fa4c2928 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit a11e5b3e7a59fde1a90b0eaeaa82320495cf8cae ]
While reading sysctl_tcp_max_reordering, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: dca145ffaa8d ("tcp: allow for bigger reordering level") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/ipv4/tcp_input.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index ab74acb50936..233fb5d992d4 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -1011,7 +1011,7 @@ static void tcp_check_sack_reordering(struct sock *sk, const u32 low_seq, tp->undo_marker ? tp->undo_retrans : 0); #endif tp->reordering = min_t(u32, (metric + mss - 1) / mss, - sock_net(sk)->ipv4.sysctl_tcp_max_reordering); + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_max_reordering)); }
/* This exciting event is worth to be remembered. 8) */ @@ -1990,7 +1990,7 @@ static void tcp_check_reno_reordering(struct sock *sk, const int addend) return;
tp->reordering = min_t(u32, tp->packets_out + addend, - sock_net(sk)->ipv4.sysctl_tcp_max_reordering); + READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_max_reordering)); tp->reord_seen++; NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRENOREORDER); }
From: Marc Kleine-Budde mkl@pengutronix.de
stable inclusion from stable-v5.10.134 commit 684896e675edd8b669fd3e9f547c5038222d85bc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 4ceaa684459d414992acbefb4e4c31f2dfc50641 upstream.
In case a IRQ based transfer times out the bcm2835_spi_handle_err() function is called. Since commit 1513ceee70f2 ("spi: bcm2835: Drop dma_pending flag") the TX and RX DMA transfers are unconditionally canceled, leading to NULL pointer derefs if ctlr->dma_tx or ctlr->dma_rx are not set.
Fix the NULL pointer deref by checking that ctlr->dma_tx and ctlr->dma_rx are valid pointers before accessing them.
Fixes: 1513ceee70f2 ("spi: bcm2835: Drop dma_pending flag") Cc: Lukas Wunner lukas@wunner.de Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Link: https://lore.kernel.org/r/20220719072234.2782764-1-mkl@pengutronix.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/spi/spi-bcm2835.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c index 33c32e931767..bb9d8386ba3b 100644 --- a/drivers/spi/spi-bcm2835.c +++ b/drivers/spi/spi-bcm2835.c @@ -1174,10 +1174,14 @@ static void bcm2835_spi_handle_err(struct spi_controller *ctlr, struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr);
/* if an error occurred and we have an active dma, then terminate */ - dmaengine_terminate_sync(ctlr->dma_tx); - bs->tx_dma_active = false; - dmaengine_terminate_sync(ctlr->dma_rx); - bs->rx_dma_active = false; + if (ctlr->dma_tx) { + dmaengine_terminate_sync(ctlr->dma_tx); + bs->tx_dma_active = false; + } + if (ctlr->dma_rx) { + dmaengine_terminate_sync(ctlr->dma_rx); + bs->rx_dma_active = false; + } bcm2835_spi_undo_prologue(bs);
/* and reset */
From: Alexey Kardashevskiy aik@ozlabs.ru
stable inclusion from stable-v5.10.134 commit 3616776bc51cd3262bb1be60cc01c72e0a1959cf category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit e8bc2427018826e02add7b0ed0fc625a60390ae5 upstream.
A KVM device cleanup happens in either of two callbacks: 1) destroy() which is called when the VM is being destroyed; 2) release() which is called when a device fd is closed.
Most KVM devices use 1) but Book3s's interrupt controller KVM devices (XICS, XIVE, XIVE-native) use 2) as they need to close and reopen during the machine execution. The error handling in kvm_ioctl_create_device() assumes destroy() is always defined which leads to NULL dereference as discovered by Syzkaller.
This adds a checks for destroy!=NULL and adds a missing release().
This is not changing kvm_destroy_devices() as devices with defined release() should have been removed from the KVM devices list by then.
Suggested-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- virt/kvm/kvm_main.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 93cd86267eba..55dbd307b2cb 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3669,8 +3669,11 @@ static int kvm_ioctl_create_device(struct kvm *kvm, kvm_put_kvm_no_destroy(kvm); mutex_lock(&kvm->lock); list_del(&dev->vm_node); + if (ops->release) + ops->release(dev); mutex_unlock(&kvm->lock); - ops->destroy(dev); + if (ops->destroy) + ops->destroy(dev); return ret; }
From: Wang Cheng wanngchenng@gmail.com
stable inclusion from stable-v5.10.134 commit ddb3f0b68863bd1c5f43177eea476bce316d4993 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 018160ad314d75b1409129b2247b614a9f35894c upstream.
mpol_set_nodemask()(mm/mempolicy.c) does not set up nodemask when pol->mode is MPOL_LOCAL. Check pol->mode before access pol->w.cpuset_mems_allowed in mpol_rebind_policy()(mm/mempolicy.c).
BUG: KMSAN: uninit-value in mpol_rebind_policy mm/mempolicy.c:352 [inline] BUG: KMSAN: uninit-value in mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368 mpol_rebind_policy mm/mempolicy.c:352 [inline] mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368 cpuset_change_task_nodemask kernel/cgroup/cpuset.c:1711 [inline] cpuset_attach+0x787/0x15e0 kernel/cgroup/cpuset.c:2278 cgroup_migrate_execute+0x1023/0x1d20 kernel/cgroup/cgroup.c:2515 cgroup_migrate kernel/cgroup/cgroup.c:2771 [inline] cgroup_attach_task+0x540/0x8b0 kernel/cgroup/cgroup.c:2804 __cgroup1_procs_write+0x5cc/0x7a0 kernel/cgroup/cgroup-v1.c:520 cgroup1_tasks_write+0x94/0xb0 kernel/cgroup/cgroup-v1.c:539 cgroup_file_write+0x4c2/0x9e0 kernel/cgroup/cgroup.c:3852 kernfs_fop_write_iter+0x66a/0x9f0 fs/kernfs/file.c:296 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0x1318/0x2030 fs/read_write.c:590 ksys_write+0x28b/0x510 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae
Uninit was created at: slab_post_alloc_hook mm/slab.h:524 [inline] slab_alloc_node mm/slub.c:3251 [inline] slab_alloc mm/slub.c:3259 [inline] kmem_cache_alloc+0x902/0x11c0 mm/slub.c:3264 mpol_new mm/mempolicy.c:293 [inline] do_set_mempolicy+0x421/0xb70 mm/mempolicy.c:853 kernel_set_mempolicy mm/mempolicy.c:1504 [inline] __do_sys_set_mempolicy mm/mempolicy.c:1510 [inline] __se_sys_set_mempolicy+0x44c/0xb60 mm/mempolicy.c:1507 __x64_sys_set_mempolicy+0xd8/0x110 mm/mempolicy.c:1507 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae
KMSAN: uninit-value in mpol_rebind_task (2) https://syzkaller.appspot.com/bug?id=d6eb90f952c2a5de9ea718a1b873c55cb13b59d...
This patch seems to fix below bug too. KMSAN: uninit-value in mpol_rebind_mm (2) https://syzkaller.appspot.com/bug?id=f2fecd0d7013f54ec4162f60743a2b28df40926...
The uninit-value is pol->w.cpuset_mems_allowed in mpol_rebind_policy(). When syzkaller reproducer runs to the beginning of mpol_new(),
mpol_new() mm/mempolicy.c do_mbind() mm/mempolicy.c kernel_mbind() mm/mempolicy.c
`mode` is 1(MPOL_PREFERRED), nodes_empty(*nodes) is `true` and `flags` is 0. Then
mode = MPOL_LOCAL; ... policy->mode = mode; policy->flags = flags;
will be executed. So in mpol_set_nodemask(),
mpol_set_nodemask() mm/mempolicy.c do_mbind() kernel_mbind()
pol->mode is 4 (MPOL_LOCAL), that `nodemask` in `pol` is not initialized, which will be accessed in mpol_rebind_policy().
Link: https://lkml.kernel.org/r/20220512123428.fq3wofedp6oiotd4@ppc.localdomain Signed-off-by: Wang Cheng wanngchenng@gmail.com Reported-by: syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com Tested-by: syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com Cc: David Rientjes rientjes@google.com Cc: Vlastimil Babka vbabka@suse.cz Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- mm/mempolicy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c index e7fdb3a749b5..e2927e81c738 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -413,7 +413,7 @@ static void mpol_rebind_preferred(struct mempolicy *pol, */ static void mpol_rebind_policy(struct mempolicy *pol, const nodemask_t *newmask) { - if (!pol) + if (!pol || pol->mode == MPOL_LOCAL) return; if (!mpol_store_user_nodemask(pol) && !(pol->flags & MPOL_F_LOCAL) && nodes_equal(pol->w.cpuset_mems_allowed, *newmask))
From: Eric Dumazet edumazet@google.com
stable inclusion from stable-v5.10.134 commit 0c722a32f29abc4231e928a1d15b112c4c084934 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 0326195f523a549e0a9d7fd44c70b26fd7265090 upstream.
Classic BPF has a way to load bytes starting from the mac header.
Some skbs do not have a mac header, and skb_mac_header() in this case is returning a pointer that 65535 bytes after skb->head.
Existing range check in bpf_internal_load_pointer_neg_helper() was properly kicking and no illegal access was happening.
New sanity check in skb_mac_header() is firing, so we need to avoid it.
WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 skb_mac_header include/linux/skbuff.h:2785 [inline] WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74 Modules linked in: CPU: 1 PID: 28990 Comm: syz-executor.0 Not tainted 5.19.0-rc4-syzkaller-00865-g4874fb9484be #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022 RIP: 0010:skb_mac_header include/linux/skbuff.h:2785 [inline] RIP: 0010:bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74 Code: ff ff 45 31 f6 e9 5a ff ff ff e8 aa 27 40 00 e9 3b ff ff ff e8 90 27 40 00 e9 df fe ff ff e8 86 27 40 00 eb 9e e8 2f 2c f3 ff <0f> 0b eb b1 e8 96 27 40 00 e9 79 fe ff ff 90 41 57 41 56 41 55 41 RSP: 0018:ffffc9000309f668 EFLAGS: 00010216 RAX: 0000000000000118 RBX: ffffffffffeff00c RCX: ffffc9000e417000 RDX: 0000000000040000 RSI: ffffffff81873f21 RDI: 0000000000000003 RBP: ffff8880842878c0 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000001 R12: 0000000000000004 R13: ffff88803ac56c00 R14: 000000000000ffff R15: dffffc0000000000 FS: 00007f5c88a16700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdaa9f6c058 CR3: 000000003a82c000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ____bpf_skb_load_helper_32 net/core/filter.c:276 [inline] bpf_skb_load_helper_32+0x191/0x220 net/core/filter.c:264
Fixes: f9aefd6b2aa3 ("net: warn if mac header was not set") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20220707123900.945305-1-edumazet@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- kernel/bpf/core.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 845a4c052433..fd2aa6b9909e 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -66,11 +66,13 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns { u8 *ptr = NULL;
- if (k >= SKF_NET_OFF) + if (k >= SKF_NET_OFF) { ptr = skb_network_header(skb) + k - SKF_NET_OFF; - else if (k >= SKF_LL_OFF) + } else if (k >= SKF_LL_OFF) { + if (unlikely(!skb_mac_header_was_set(skb))) + return NULL; ptr = skb_mac_header(skb) + k - SKF_LL_OFF; - + } if (ptr >= skb->head && ptr + size <= skb_tail_pointer(skb)) return ptr;
From: Juri Lelli juri.lelli@redhat.com
stable inclusion from stable-v5.10.134 commit 26d5eb3c25c3b89bfd62f81f6afea71b350adaf2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit ddfc710395cccc61247348df9eb18ea50321cbed upstream.
Tasks the are being deboosted from SCHED_DEADLINE might enter enqueue_task_dl() one last time and hit an erroneous BUG_ON condition: since they are not boosted anymore, the if (is_dl_boosted()) branch is not taken, but the else if (!dl_prio) is and inside this one we BUG_ON(!is_dl_boosted), which is of course false (BUG_ON triggered) otherwise we had entered the if branch above. Long story short, the current condition doesn't make sense and always leads to triggering of a BUG.
Fix this by only checking enqueue flags, properly: ENQUEUE_REPLENISH has to be present, but additional flags are not a problem.
Fixes: 64be6f1f5f71 ("sched/deadline: Don't replenish from a !SCHED_DEADLINE entity") Signed-off-by: Juri Lelli juri.lelli@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20220714151908.533052-1-juri.lelli@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- kernel/sched/deadline.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index cb487d7d33e1..a6b69fae752e 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1563,7 +1563,10 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags) * the throttle. */ p->dl.dl_throttled = 0; - BUG_ON(!is_dl_boosted(&p->dl) || flags != ENQUEUE_REPLENISH); + if (!(flags & ENQUEUE_REPLENISH)) + printk_deferred_once("sched: DL de-boosted task PID %d: REPLENISH flag missing\n", + task_pid_nr(p)); + return; }
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
stable inclusion from stable-v5.10.134 commit cdcd20aa2cd44b83433aa6bf3f86cb48dc58a20a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit eb23b5ef9131e6d65011de349a4d25ef1b3d4314 upstream.
IBRS mitigation for spectre_v2 forces write to MSR_IA32_SPEC_CTRL at every kernel entry/exit. On Enhanced IBRS parts setting MSR_IA32_SPEC_CTRL[IBRS] only once at boot is sufficient. MSR writes at every kernel entry/exit incur unnecessary performance loss.
When Enhanced IBRS feature is present, print a warning about this unnecessary performance loss.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2a5eaf54583c2bfe0edc4fea64006656256cca17.165781485... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- arch/x86/kernel/cpu/bugs.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index d4cb9ff639aa..a38e637729a6 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -931,6 +931,7 @@ static inline const char *spectre_v2_module_string(void) { return ""; } #define SPECTRE_V2_LFENCE_MSG "WARNING: LFENCE mitigation is not recommended for this CPU, data leaks possible!\n" #define SPECTRE_V2_EIBRS_EBPF_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!\n" #define SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS+LFENCE mitigation and SMT, data leaks possible via Spectre v2 BHB attacks!\n" +#define SPECTRE_V2_IBRS_PERF_MSG "WARNING: IBRS mitigation selected on Enhanced IBRS CPU, this may cause unnecessary performance loss\n"
#ifdef CONFIG_BPF_SYSCALL void unpriv_ebpf_notify(int new_state) @@ -1418,6 +1419,8 @@ static void __init spectre_v2_select_mitigation(void)
case SPECTRE_V2_IBRS: setup_force_cpu_cap(X86_FEATURE_KERNEL_IBRS); + if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) + pr_warn(SPECTRE_V2_IBRS_PERF_MSG); break;
case SPECTRE_V2_LFENCE:
From: Alexander Aring aahringo@redhat.com
stable inclusion from stable-v5.10.134 commit 577b624689aa717704eefa858747cc40391d113f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit ba58995909b5098ca4003af65b0ccd5a8d13dd25 ]
This patch unsets ls_remove_len and ls_remove_name if a message allocation of a remove messages fails. In this case we never send a remove message out but set the per ls ls_remove_len ls_remove_name variable for a pending remove. Unset those variable should indicate possible waiters in wait_pending_remove() that no pending remove is going on at this moment.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- fs/dlm/lock.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c index 2ce96a9ce63c..eaa28d654e9f 100644 --- a/fs/dlm/lock.c +++ b/fs/dlm/lock.c @@ -4067,13 +4067,14 @@ static void send_repeat_remove(struct dlm_ls *ls, char *ms_name, int len) rv = _create_message(ls, sizeof(struct dlm_message) + len, dir_nodeid, DLM_MSG_REMOVE, &ms, &mh); if (rv) - return; + goto out;
memcpy(ms->m_extra, name, len); ms->m_hash = hash;
send_message(mh, ms);
+out: spin_lock(&ls->ls_remove_spin); ls->ls_remove_len = 0; memset(ls->ls_remove_name, 0, DLM_RESNAME_MAXLEN);
From: Wang ShaoBo bobo.shaobowang@huawei.com
stable inclusion from stable-v5.10.134 commit f62ffdb5e2ee31e04c7b9819eb1e8ef63060eda8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 523be44c334bc4e4c014032738dc277b8909d009 ]
Fix unused but set variable warning building with `make W=1`:
drivers/gpu/drm/imx/dcss/dcss-plane.c:270:6: warning: variable ‘pixel_format’ set but not used [-Wunused-but-set-variable] u32 pixel_format; ^~~~~~~~~~~~
Fixes: 9021c317b770 ("drm/imx: Add initial support for DCSS on iMX8MQ") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Reviewed-by: Laurentiu Palcu laurentiu.palcu@oss.nxp.com Signed-off-by: Lucas Stach l.stach@pengutronix.de Link: https://patchwork.freedesktop.org/patch/msgid/20200911014414.4663-1-bobo.sha... Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/gpu/drm/imx/dcss/dcss-plane.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/imx/dcss/dcss-plane.c b/drivers/gpu/drm/imx/dcss/dcss-plane.c index f54087ac44d3..46a188dd02ad 100644 --- a/drivers/gpu/drm/imx/dcss/dcss-plane.c +++ b/drivers/gpu/drm/imx/dcss/dcss-plane.c @@ -268,7 +268,6 @@ static void dcss_plane_atomic_update(struct drm_plane *plane, struct dcss_plane *dcss_plane = to_dcss_plane(plane); struct dcss_dev *dcss = plane->dev->dev_private; struct drm_framebuffer *fb = state->fb; - u32 pixel_format; struct drm_crtc_state *crtc_state; bool modifiers_present; u32 src_w, src_h, dst_w, dst_h; @@ -279,7 +278,6 @@ static void dcss_plane_atomic_update(struct drm_plane *plane, if (!fb || !state->crtc || !state->visible) return;
- pixel_format = state->fb->format->format; crtc_state = state->crtc->state; modifiers_present = !!(fb->flags & DRM_MODE_FB_MODIFIERS);
From: Peter Zijlstra peterz@infradead.org
stable inclusion from stable-v5.10.134 commit d1dc861cd68cc83de52b01bec91e7362f8cca1b1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit bff8c3848e071d387d8b0784dc91fa49cd563774 ]
The test: 'mask > (typeof(_reg))~0ull' only works correctly when both sides are unsigned, consider:
- 0xff000000 vs (int)~0ull - 0x000000ff vs (int)~0ull
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Josh Poimboeuf jpoimboe@redhat.com Link: https://lore.kernel.org/r/20211110101324.950210584@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/linux/bitfield.h | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h index 4e035aca6f7e..6093fa6db260 100644 --- a/include/linux/bitfield.h +++ b/include/linux/bitfield.h @@ -41,6 +41,22 @@
#define __bf_shf(x) (__builtin_ffsll(x) - 1)
+#define __scalar_type_to_unsigned_cases(type) \ + unsigned type: (unsigned type)0, \ + signed type: (unsigned type)0 + +#define __unsigned_scalar_typeof(x) typeof( \ + _Generic((x), \ + char: (unsigned char)0, \ + __scalar_type_to_unsigned_cases(char), \ + __scalar_type_to_unsigned_cases(short), \ + __scalar_type_to_unsigned_cases(int), \ + __scalar_type_to_unsigned_cases(long), \ + __scalar_type_to_unsigned_cases(long long), \ + default: (x))) + +#define __bf_cast_unsigned(type, x) ((__unsigned_scalar_typeof(type))(x)) + #define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx) \ ({ \ BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask), \ @@ -49,7 +65,8 @@ BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ? \ ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \ _pfx "value too large for the field"); \ - BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull, \ + BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) > \ + __bf_cast_unsigned(_reg, ~0ull), \ _pfx "type of reg too small for mask"); \ __BUILD_BUG_ON_NOT_POWER_OF_2((_mask) + \ (1ULL << __bf_shf(_mask))); \
From: Takashi Iwai tiwai@suse.de
stable inclusion from stable-v5.10.134 commit 4faf4bbc2d600a921052ff45b1b5914d583d9046 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 5c1733e33c888a3cb7f576564d8ad543d5ad4a9e upstream.
Currently the standard memory allocator (snd_dma_malloc_pages*()) passes the byte size to allocate as is. Most of the backends allocates real pages, hence the actual allocations are aligned in page size. However, the genalloc doesn't seem assuring the size alignment, hence it may result in the access outside the buffer when the whole memory pages are exposed via mmap.
For avoiding such inconsistencies, this patch makes the allocation size always to be aligned in page size.
Note that, after this change, snd_dma_buffer.bytes field contains the aligned size, not the originally requested size. This value is also used for releasing the pages in return.
Reviewed-by: Lars-Peter Clausen lars@metafoo.de Link: https://lore.kernel.org/r/20201218145625.2045-2-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- sound/core/memalloc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c index 0f335162f87c..966bef5acc75 100644 --- a/sound/core/memalloc.c +++ b/sound/core/memalloc.c @@ -133,6 +133,7 @@ int snd_dma_alloc_pages(int type, struct device *device, size_t size, if (WARN_ON(!dmab)) return -ENXIO;
+ size = PAGE_ALIGN(size); dmab->dev.type = type; dmab->dev.dev = device; dmab->bytes = 0;
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.134 commit c87b2bc9d74ab550fbc7dc6e5b2bd22aa9da40eb category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 38f64f650dc0e44c146ff88d15a7339efa325918 upstream.
bt_skb_sendmsg helps takes care of allocation the skb and copying the the contents of msg over to the skb while checking for possible errors so it should be safe to call it without holding lock_sock.
Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/bluetooth/bluetooth.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index 3fecc4a411a1..83eaa42896bb 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -422,6 +422,34 @@ static inline struct sk_buff *bt_skb_send_alloc(struct sock *sk, return NULL; }
+/* Shall not be called with lock_sock held */ +static inline struct sk_buff *bt_skb_sendmsg(struct sock *sk, + struct msghdr *msg, + size_t len, size_t mtu, + size_t headroom, size_t tailroom) +{ + struct sk_buff *skb; + size_t size = min_t(size_t, len, mtu); + int err; + + skb = bt_skb_send_alloc(sk, size + headroom + tailroom, + msg->msg_flags & MSG_DONTWAIT, &err); + if (!skb) + return ERR_PTR(err); + + skb_reserve(skb, headroom); + skb_tailroom_reserve(skb, mtu, tailroom); + + if (!copy_from_iter_full(skb_put(skb, size), size, &msg->msg_iter)) { + kfree_skb(skb); + return ERR_PTR(-EFAULT); + } + + skb->priority = sk->sk_priority; + + return skb; +} + int bt_to_errno(u16 code);
void hci_sock_set_flag(struct sock *sk, int nr);
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.134 commit 3cce0e771fb5db6494136d83a8e0790865d370be category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 97e4e80299844bb5f6ce5a7540742ffbffae3d97 upstream.
This works similarly to bt_skb_sendmsg but can split the msg into multiple skb fragments which is useful for stream sockets.
Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/bluetooth/bluetooth.h | 38 +++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+)
diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index 83eaa42896bb..3275d5737285 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -450,6 +450,44 @@ static inline struct sk_buff *bt_skb_sendmsg(struct sock *sk, return skb; }
+/* Similar to bt_skb_sendmsg but can split the msg into multiple fragments + * accourding to the MTU. + */ +static inline struct sk_buff *bt_skb_sendmmsg(struct sock *sk, + struct msghdr *msg, + size_t len, size_t mtu, + size_t headroom, size_t tailroom) +{ + struct sk_buff *skb, **frag; + + skb = bt_skb_sendmsg(sk, msg, len, mtu, headroom, tailroom); + if (IS_ERR_OR_NULL(skb)) + return skb; + + len -= skb->len; + if (!len) + return skb; + + /* Add remaining data over MTU as continuation fragments */ + frag = &skb_shinfo(skb)->frag_list; + while (len) { + struct sk_buff *tmp; + + tmp = bt_skb_sendmsg(sk, msg, len, mtu, headroom, tailroom); + if (IS_ERR_OR_NULL(tmp)) { + kfree_skb(skb); + return tmp; + } + + len -= tmp->len; + + *frag = tmp; + frag = &(*frag)->next; + } + + return skb; +} + int bt_to_errno(u16 code);
void hci_sock_set_flag(struct sock *sk, int nr);
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.134 commit 341a029cf39cb8492b156535f6b6cbd369fabd14 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 0771cbb3b97d3c1d68eecd7f00055f599954c34e upstream.
This makes use of bt_skb_sendmsg instead of allocating a different buffer to be used with memcpy_from_msg which cause one extra copy.
Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/bluetooth/sco.c | 34 +++++++++++----------------------- 1 file changed, 11 insertions(+), 23 deletions(-)
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index 807fdc500f6c..8d0b7e7d4ed7 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -280,27 +280,19 @@ static int sco_connect(struct hci_dev *hdev, struct sock *sk) return err; }
-static int sco_send_frame(struct sock *sk, void *buf, int len, - unsigned int msg_flags) +static int sco_send_frame(struct sock *sk, struct sk_buff *skb) { struct sco_conn *conn = sco_pi(sk)->conn; - struct sk_buff *skb; - int err;
/* Check outgoing MTU */ - if (len > conn->mtu) + if (skb->len > conn->mtu) return -EINVAL;
- BT_DBG("sk %p len %d", sk, len); - - skb = bt_skb_send_alloc(sk, len, msg_flags & MSG_DONTWAIT, &err); - if (!skb) - return err; + BT_DBG("sk %p len %d", sk, skb->len);
- memcpy(skb_put(skb, len), buf, len); hci_send_sco(conn->hcon, skb);
- return len; + return skb->len; }
static void sco_recv_frame(struct sco_conn *conn, struct sk_buff *skb) @@ -727,7 +719,7 @@ static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; - void *buf; + struct sk_buff *skb; int err;
BT_DBG("sock %p, sk %p", sock, sk); @@ -739,24 +731,20 @@ static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg, if (msg->msg_flags & MSG_OOB) return -EOPNOTSUPP;
- buf = kmalloc(len, GFP_KERNEL); - if (!buf) - return -ENOMEM; - - if (memcpy_from_msg(buf, msg, len)) { - kfree(buf); - return -EFAULT; - } + skb = bt_skb_sendmsg(sk, msg, len, len, 0, 0); + if (IS_ERR_OR_NULL(skb)) + return PTR_ERR(skb);
lock_sock(sk);
if (sk->sk_state == BT_CONNECTED) - err = sco_send_frame(sk, buf, len, msg->msg_flags); + err = sco_send_frame(sk, skb); else err = -ENOTCONN;
release_sock(sk); - kfree(buf); + if (err) + kfree_skb(skb); return err; }
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.134 commit 5283591c84fae76c3dd7f5a9a2f4db95eae24569 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 81be03e026dc0c16dc1c64e088b2a53b73caa895 upstream.
This makes use of bt_skb_sendmmsg instead using memcpy_from_msg which is not considered safe to be used when lock_sock is held.
Also make rfcomm_dlc_send handle skb with fragments and queue them all atomically.
Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/bluetooth/rfcomm/core.c | 50 +++++++++++++++++++++++++++++++------ net/bluetooth/rfcomm/sock.c | 46 ++++++++-------------------------- 2 files changed, 53 insertions(+), 43 deletions(-)
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c index f2bacb464ccf..7324764384b6 100644 --- a/net/bluetooth/rfcomm/core.c +++ b/net/bluetooth/rfcomm/core.c @@ -549,22 +549,58 @@ struct rfcomm_dlc *rfcomm_dlc_exists(bdaddr_t *src, bdaddr_t *dst, u8 channel) return dlc; }
+static int rfcomm_dlc_send_frag(struct rfcomm_dlc *d, struct sk_buff *frag) +{ + int len = frag->len; + + BT_DBG("dlc %p mtu %d len %d", d, d->mtu, len); + + if (len > d->mtu) + return -EINVAL; + + rfcomm_make_uih(frag, d->addr); + __skb_queue_tail(&d->tx_queue, frag); + + return len; +} + int rfcomm_dlc_send(struct rfcomm_dlc *d, struct sk_buff *skb) { - int len = skb->len; + unsigned long flags; + struct sk_buff *frag, *next; + int len;
if (d->state != BT_CONNECTED) return -ENOTCONN;
- BT_DBG("dlc %p mtu %d len %d", d, d->mtu, len); + frag = skb_shinfo(skb)->frag_list; + skb_shinfo(skb)->frag_list = NULL;
- if (len > d->mtu) - return -EINVAL; + /* Queue all fragments atomically. */ + spin_lock_irqsave(&d->tx_queue.lock, flags);
- rfcomm_make_uih(skb, d->addr); - skb_queue_tail(&d->tx_queue, skb); + len = rfcomm_dlc_send_frag(d, skb); + if (len < 0 || !frag) + goto unlock; + + for (; frag; frag = next) { + int ret; + + next = frag->next; + + ret = rfcomm_dlc_send_frag(d, frag); + if (ret < 0) { + kfree_skb(frag); + goto unlock; + } + + len += ret; + } + +unlock: + spin_unlock_irqrestore(&d->tx_queue.lock, flags);
- if (!test_bit(RFCOMM_TX_THROTTLED, &d->flags)) + if (len > 0 && !test_bit(RFCOMM_TX_THROTTLED, &d->flags)) rfcomm_schedule(); return len; } diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c index ae6f80730561..97f10f05ae19 100644 --- a/net/bluetooth/rfcomm/sock.c +++ b/net/bluetooth/rfcomm/sock.c @@ -575,46 +575,20 @@ static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg, lock_sock(sk);
sent = bt_sock_wait_ready(sk, msg->msg_flags); - if (sent) - goto done; - - while (len) { - size_t size = min_t(size_t, len, d->mtu); - int err; - - skb = sock_alloc_send_skb(sk, size + RFCOMM_SKB_RESERVE, - msg->msg_flags & MSG_DONTWAIT, &err); - if (!skb) { - if (sent == 0) - sent = err; - break; - } - skb_reserve(skb, RFCOMM_SKB_HEAD_RESERVE); - - err = memcpy_from_msg(skb_put(skb, size), msg, size); - if (err) { - kfree_skb(skb); - if (sent == 0) - sent = err; - break; - }
- skb->priority = sk->sk_priority; + release_sock(sk);
- err = rfcomm_dlc_send(d, skb); - if (err < 0) { - kfree_skb(skb); - if (sent == 0) - sent = err; - break; - } + if (sent) + return sent;
- sent += size; - len -= size; - } + skb = bt_skb_sendmmsg(sk, msg, len, d->mtu, RFCOMM_SKB_HEAD_RESERVE, + RFCOMM_SKB_TAIL_RESERVE); + if (IS_ERR_OR_NULL(skb)) + return PTR_ERR(skb);
-done: - release_sock(sk); + sent = rfcomm_dlc_send(d, skb); + if (sent < 0) + kfree_skb(skb);
return sent; }
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.134 commit a8feae8bd22757637921dfbfb74731dac31da461 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 266191aa8d14b84958aaeb5e96ee4e97839e3d87 upstream.
Passing NULL to PTR_ERR will result in 0 (success), also since the likes of bt_skb_sendmsg does never return NULL it is safe to replace the instances of IS_ERR_OR_NULL with IS_ERR when checking its return.
Reported-by: Dan Carpenter dan.carpenter@oracle.com Tested-by: Tedd Ho-Jeong An tedd.an@intel.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/bluetooth/bluetooth.h | 2 +- net/bluetooth/rfcomm/sock.c | 2 +- net/bluetooth/sco.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index 3275d5737285..b85e6d9ba39f 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -474,7 +474,7 @@ static inline struct sk_buff *bt_skb_sendmmsg(struct sock *sk, struct sk_buff *tmp;
tmp = bt_skb_sendmsg(sk, msg, len, mtu, headroom, tailroom); - if (IS_ERR_OR_NULL(tmp)) { + if (IS_ERR(tmp)) { kfree_skb(skb); return tmp; } diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c index 97f10f05ae19..4cf1fa9900ca 100644 --- a/net/bluetooth/rfcomm/sock.c +++ b/net/bluetooth/rfcomm/sock.c @@ -583,7 +583,7 @@ static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg,
skb = bt_skb_sendmmsg(sk, msg, len, d->mtu, RFCOMM_SKB_HEAD_RESERVE, RFCOMM_SKB_TAIL_RESERVE); - if (IS_ERR_OR_NULL(skb)) + if (IS_ERR(skb)) return PTR_ERR(skb);
sent = rfcomm_dlc_send(d, skb); diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index 8d0b7e7d4ed7..17e377d9be30 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -732,7 +732,7 @@ static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg, return -EOPNOTSUPP;
skb = bt_skb_sendmsg(sk, msg, len, len, 0, 0); - if (IS_ERR_OR_NULL(skb)) + if (IS_ERR(skb)) return PTR_ERR(skb);
lock_sock(sk);
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.134 commit f44e65e6f0ee38b124ff4becdce6f8f566f1d1c5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 037ce005af6b8a3e40ee07c6e9266c8997e6a4d6 upstream.
The skb in modified by hci_send_sco which pushes SCO headers thus changing skb->len causing sco_sock_sendmsg to fail.
Fixes: 0771cbb3b97d ("Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg") Tested-by: Tedd Ho-Jeong An tedd.an@intel.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- net/bluetooth/sco.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index 17e377d9be30..081d1ee3ddb8 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -283,16 +283,17 @@ static int sco_connect(struct hci_dev *hdev, struct sock *sk) static int sco_send_frame(struct sock *sk, struct sk_buff *skb) { struct sco_conn *conn = sco_pi(sk)->conn; + int len = skb->len;
/* Check outgoing MTU */ - if (skb->len > conn->mtu) + if (len > conn->mtu) return -EINVAL;
- BT_DBG("sk %p len %d", sk, skb->len); + BT_DBG("sk %p len %d", sk, len);
hci_send_sco(conn->hcon, skb);
- return skb->len; + return len; }
static void sco_recv_frame(struct sco_conn *conn, struct sk_buff *skb) @@ -743,7 +744,8 @@ static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg, err = -ENOTCONN;
release_sock(sk); - if (err) + + if (err < 0) kfree_skb(skb); return err; }
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
stable inclusion from stable-v5.10.134 commit 14fc9233aa73e896d21a4998cdd453349a4baa8b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 29fb608396d6a62c1b85acc421ad7a4399085b9f upstream.
Since bt_skb_sendmmsg can be used with the likes of SOCK_STREAM it shall return the partial chunks it could allocate instead of freeing everything as otherwise it can cause problems like bellow.
Fixes: 81be03e026dc ("Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg") Reported-by: Paul Menzel pmenzel@molgen.mpg.de Link: https://lore.kernel.org/r/d7206e12-1b99-c3be-84f4-df22af427ef5@molgen.mpg.de BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215594 Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Tested-by: Paul Menzel pmenzel@molgen.mpg.de (Nokia N9 (MeeGo/Harmattan) Signed-off-by: Marcel Holtmann marcel@holtmann.org Cc: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- include/net/bluetooth/bluetooth.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index b85e6d9ba39f..355835639ae5 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -475,8 +475,7 @@ static inline struct sk_buff *bt_skb_sendmmsg(struct sock *sk,
tmp = bt_skb_sendmsg(sk, msg, len, mtu, headroom, tailroom); if (IS_ERR(tmp)) { - kfree_skb(skb); - return tmp; + return skb; }
len -= tmp->len;
From: Peter Zijlstra peterz@infradead.org
stable inclusion from stable-v5.10.134 commit b7b9e5cc8b24d6c800b2b7f8bba44c5914bd6c8f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 28a99e95f55c61855983d36a88c05c178d966bb7 upstream.
On AMD IBRS does not prevent Retbleed; as such use IBPB before a firmware call to flush the branch history state.
And because in order to do an EFI call, the kernel maps a whole lot of the kernel page table into the EFI page table, do an IBPB just in case in order to prevent the scenario of poisoning the BTB and causing an EFI call using the unprotected RET there.
[ bp: Massage. ]
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Link: https://lore.kernel.org/r/20220715194550.793957-1-cascardo@canonical.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Conflicts: arch/x86/include/asm/cpufeatures.h Reviewed-by: Wei Li liwei391@huawei.com --- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/nospec-branch.h | 2 ++ arch/x86/kernel/cpu/bugs.c | 11 ++++++++++- 3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 35886c52e333..d3575a3ab97d 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -320,6 +320,7 @@ #define X86_FEATURE_RETPOLINE_LFENCE (11*32+13) /* "" Use LFENCE for Spectre variant 2 */ #define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */ #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */ +#define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */ #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 92771640706c..a406ca726f46 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -322,6 +322,8 @@ do { \ alternative_msr_write(MSR_IA32_SPEC_CTRL, \ spec_ctrl_current() | SPEC_CTRL_IBRS, \ X86_FEATURE_USE_IBRS_FW); \ + alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, \ + X86_FEATURE_USE_IBPB_FW); \ } while (0)
#define firmware_restrict_branch_speculation_end() \ diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index a38e637729a6..15b2091e6b1f 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1501,7 +1501,16 @@ static void __init spectre_v2_select_mitigation(void) * the CPU supports Enhanced IBRS, kernel might un-intentionally not * enable IBRS around firmware calls. */ - if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) { + if (boot_cpu_has_bug(X86_BUG_RETBLEED) && + (boot_cpu_data.x86_vendor == X86_VENDOR_AMD || + boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)) { + + if (retbleed_cmd != RETBLEED_CMD_IBPB) { + setup_force_cpu_cap(X86_FEATURE_USE_IBPB_FW); + pr_info("Enabling Speculation Barrier for firmware calls\n"); + } + + } else if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) { setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW); pr_info("Enabling Restricted Speculation for firmware calls\n"); }
From: Kees Cook keescook@chromium.org
stable inclusion from stable-v5.10.134 commit c0a3a9eb262a5e86a36fd5fc4f9fc38470713f13 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 65cdf0d623bedf0e069bb64ed52e8bb20105e2ba upstream.
Debugging missing return thunks is easier if we can see where they're happening.
Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/lkml/Ys66hwtFcGbYmoiZ@hirez.programming.kicks-ass.ne... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- arch/x86/kernel/alternative.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index fa79a46fa5d2..9d3c0d70a555 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -709,7 +709,9 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) dest = addr + insn.length + insn.immediate.value;
if (__static_call_fixup(addr, op, dest) || - WARN_ON_ONCE(dest != &__x86_return_thunk)) + WARN_ONCE(dest != &__x86_return_thunk, + "missing return thunk: %pS-%pS: %*ph", + addr, dest, 5, addr)) continue;
DPRINTK("return thunk at: %pS (%px) len: %d to: %pS",
From: Jose Alonso joalonsof@gmail.com
stable inclusion from stable-v5.10.134 commit f7c1fc0dec97b882c105a0887bef585471cf1262 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 36a15e1cb134c0395261ba1940762703f778438c upstream.
The extra byte inserted by usbnet.c when (length % dev->maxpacket == 0) is causing problems to device.
This patch sets FLAG_SEND_ZLP to avoid this.
Tested with: 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet
Problems observed:
====================================================================== 1) Using ssh/sshfs. The remote sshd daemon can abort with the message: "message authentication code incorrect" This happens because the tcp message sent is corrupted during the USB "Bulk out". The device calculate the tcp checksum and send a valid tcp message to the remote sshd. Then the encryption detects the error and aborts. 2) NETDEV WATCHDOG: ... (ax88179_178a): transmit queue 0 timed out 3) Stop normal work without any log message. The "Bulk in" continue receiving packets normally. The host sends "Bulk out" and the device responds with -ECONNRESET. (The netusb.c code tx_complete ignore -ECONNRESET) Under normal conditions these errors take days to happen and in intense usage take hours.
A test with ping gives packet loss, showing that something is wrong: ping -4 -s 462 {destination} # 462 = 512 - 42 - 8 Not all packets fail. My guess is that the device tries to find another packet starting at the extra byte and will fail or not depending on the next bytes (old buffer content). ======================================================================
Signed-off-by: Jose Alonso joalonsof@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/net/usb/ax88179_178a.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c index 79a53fe245e5..0ac4f59e3f18 100644 --- a/drivers/net/usb/ax88179_178a.c +++ b/drivers/net/usb/ax88179_178a.c @@ -1796,7 +1796,7 @@ static const struct driver_info ax88179_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1809,7 +1809,7 @@ static const struct driver_info ax88178a_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1822,7 +1822,7 @@ static const struct driver_info cypress_GX3_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1835,7 +1835,7 @@ static const struct driver_info dlink_dub1312_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1848,7 +1848,7 @@ static const struct driver_info sitecom_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1861,7 +1861,7 @@ static const struct driver_info samsung_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1874,7 +1874,7 @@ static const struct driver_info lenovo_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1887,7 +1887,7 @@ static const struct driver_info belkin_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1900,7 +1900,7 @@ static const struct driver_info toshiba_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1913,7 +1913,7 @@ static const struct driver_info mct_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, };