Alexey Dobriyan (1): Input: i8042 - unbreak Pegatron C15B
Arnd Bergmann (1): elfcore: fix building with clang
Aurelien Aptel (1): cifs: report error instead of invalid when revalidating a dentry fails
Benjamin Valentin (1): Input: xpad - sync supported devices with fork on GitHub
Chenxin Jin (1): USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Christoph Schemmel (1): USB: serial: option: Adding support for Cinterion MV31
DENG Qingfang (1): net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Dan Carpenter (1): USB: gadget: legacy: fix an error code in eth_bind()
Dave Hansen (1): x86/apic: Add extra serialization for non-serializing MSRs
David Howells (1): rxrpc: Fix deadlock around release of dst cached on udp tunnel
Felix Fietkau (1): mac80211: fix station rate table updates on assoc
Fengnan Chang (1): mmc: core: Limit retries when analyse of SDIO tuples fails
Gary Bisson (1): usb: dwc3: fix clock issue during resume in OTG mode
Greg Kroah-Hartman (1): Linux 4.19.175
Gustavo A. R. Silva (1): smb3: Fix out-of-bounds bug in SMB2_negotiate()
Heiko Stuebner (1): usb: dwc2: Fix endpoint direction check in ep_from_windex
Hugh Dickins (1): mm: thp: fix MADV_REMOVE deadlock on shmem THP
Jeremy Figgins (1): USB: usblp: don't call usb_set_interface if there's a single alt
Josh Poimboeuf (1): x86/build: Disable CET instrumentation in the kernel
Liangyan (1): ovl: fix dentry leak in ovl_get_redirect
Marc Zyngier (1): genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
Mathias Nyman (1): xhci: fix bounce buffer usage for non-sg list case
Muchun Song (4): mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page mm: hugetlb: fix a race between freeing and dissolving the page mm: hugetlb: fix a race between isolating and freeing page mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Nadav Amit (1): iommu/vt-d: Do not use flush-queue when caching-mode is on
Pho Tran (1): USB: serial: cp210x: add pid/vid for WSDA-200-USB
Roman Gushchin (1): memblock: do not start bottom-up allocations with kernel_end
Russell King (1): ARM: footbridge: fix dc21285 PCI configuration accessors
Sean Christopherson (1): KVM: SVM: Treat SVM as unsupported when running as an SEV guest
Stefan Chulski (1): net: mvpp2: TCAM entry enable should be written after SRAM data
Thorsten Leemhuis (1): nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Vadim Fedorenko (1): net: ip_tunnel: fix mtu calculation
Wang ShaoBo (1): kretprobe: Avoid re-registration of the same kretprobe earlier
Xiao Ni (1): md: Set prev_flush_start and flush_bio in an atomic way
Xie He (1): net: lapb: Copy the skb before sending a packet
Yoshihiro Shimoda (1): usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
Zyta Szpak (1): arm64: dts: ls1046a: fix dcfg address range
 Makefile                                      |  8 +--
 arch/arm/mach-footbridge/dc21285.c            | 12 ++---
 .../arm64/boot/dts/freescale/fsl-ls1046a.dtsi |  2 +-
 arch/x86/Makefile                             |  3 ++
 arch/x86/include/asm/apic.h                   | 10 ----
 arch/x86/include/asm/barrier.h                | 18 +++++++
 arch/x86/kernel/apic/apic.c                   |  4 ++
 arch/x86/kernel/apic/x2apic_cluster.c         |  6 ++-
 arch/x86/kernel/apic/x2apic_phys.c            |  6 ++-
 arch/x86/kvm/svm.c                            |  5 ++
 drivers/input/joystick/xpad.c                 | 17 ++++++-
 drivers/input/serio/i8042-x86ia64io.h         |  2 +
 drivers/iommu/intel-iommu.c                   |  6 +++
 drivers/md/md.c                               |  2 +
 drivers/mmc/core/sdio_cis.c                   |  6 +++
 drivers/net/dsa/mv88e6xxx/chip.c              |  6 ++-
 .../net/ethernet/marvell/mvpp2/mvpp2_prs.c    | 10 ++--
 drivers/nvme/host/pci.c                       |  2 +
 drivers/usb/class/usblp.c                     | 19 ++++---
 drivers/usb/dwc2/gadget.c                     |  8 +--
 drivers/usb/dwc3/core.c                       |  2 +-
 drivers/usb/gadget/legacy/ether.c             |  4 +-
 drivers/usb/host/xhci-ring.c                  | 31 +++++++-----
 drivers/usb/renesas_usbhs/fifo.c              |  1 +
 drivers/usb/serial/cp210x.c                   |  2 +
 drivers/usb/serial/option.c                   |  6 +++
 fs/afs/main.c                                 |  6 +--
 fs/cifs/dir.c                                 | 22 ++++++++-
 fs/cifs/smb2pdu.h                             |  2 +-
 fs/hugetlbfs/inode.c                          |  3 +-
 fs/overlayfs/dir.c                            |  2 +-
 include/linux/elfcore.h                       | 22 +++++++++
 include/linux/hugetlb.h                       |  3 ++
 include/linux/msi.h                           |  6 +++
 kernel/Makefile                               |  1 -
 kernel/elfcore.c                              | 26 ----------
 kernel/irq/msi.c                              | 44 ++++++++---------
 kernel/kprobes.c                              |  4 ++
 mm/huge_memory.c                              | 37 ++++++++------
 mm/hugetlb.c                                  | 48 ++++++++++++++++--
 mm/memblock.c                                 | 49 +++----------------
 net/ipv4/ip_tunnel.c                          | 16 +++---
 net/lapb/lapb_out.c                           |  3 +-
 net/mac80211/driver-ops.c                     |  5 +-
 net/mac80211/rate.c                           |  3 +-
 net/rxrpc/af_rxrpc.c                          |  6 +--
 46 files changed, 307 insertions(+), 199 deletions(-)
 delete mode 100644 kernel/elfcore.c
From: Pho Tran Pho.Tran@silabs.com
commit 3c4f6ecd93442f4376a58b38bb40ee0b8c46e0e6 upstream.
PID/VID information for the WSDA-200-USB from LORD Corporation: vid: 199b, pid: ba30
Signed-off-by: Pho Tran pho.tran@silabs.com [ johan: amend comment with product name ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index 46ec30a2c5168..e101ea47450c3 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -201,6 +201,7 @@ static const struct usb_device_id id_table[] = { { USB_DEVICE(0x1901, 0x0194) }, /* GE Healthcare Remote Alarm Box */ { USB_DEVICE(0x1901, 0x0195) }, /* GE B850/B650/B450 CP2104 DP UART interface */ { USB_DEVICE(0x1901, 0x0196) }, /* GE B850 CP2105 DP UART interface */ + { USB_DEVICE(0x199B, 0xBA30) }, /* LORD WSDA-200-USB */ { USB_DEVICE(0x19CF, 0x3000) }, /* Parrot NMEA GPS Flight Recorder */ { USB_DEVICE(0x1ADB, 0x0001) }, /* Schweitzer Engineering C662 Cable */ { USB_DEVICE(0x1B1C, 0x1C00) }, /* Corsair USB Dongle */
From: Chenxin Jin bg4akv@hotmail.com
commit 43377df70480f82919032eb09832e9646a8a5efb upstream.
Teraoka AD2000 uses the CP210x driver, but the chip VID/PID is customized with 0988/0578. We need the driver to support the new VID/PID.
Signed-off-by: Chenxin Jin bg4akv@hotmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index e101ea47450c3..07a2c72fc3a71 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -61,6 +61,7 @@ static const struct usb_device_id id_table[] = { { USB_DEVICE(0x08e6, 0x5501) }, /* Gemalto Prox-PU/CU contactless smartcard reader */ { USB_DEVICE(0x08FD, 0x000A) }, /* Digianswer A/S , ZigBee/802.15.4 MAC Device */ { USB_DEVICE(0x0908, 0x01FF) }, /* Siemens RUGGEDCOM USB Serial Console */ + { USB_DEVICE(0x0988, 0x0578) }, /* Teraoka AD2000 */ { USB_DEVICE(0x0B00, 0x3070) }, /* Ingenico 3070 */ { USB_DEVICE(0x0BED, 0x1100) }, /* MEI (TM) Cashflow-SC Bill/Voucher Acceptor */ { USB_DEVICE(0x0BED, 0x1101) }, /* MEI series 2000 Combo Acceptor */
From: Christoph Schemmel christoph.schemmel@gmail.com
commit e478d6029dca9d8462f426aee0d32896ef64f10f upstream.
Add support for the Cinterion MV31 device, which enumerates with PID 0x00B3 or 0x00B7.
usb-devices output for 0x00B3

T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  6 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b3 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00B3 USB Mobile Broadband
S:  SerialNumber=b3246eed
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=cdc_wdm
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
usb-devices output for 0x00B7

T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  5 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b7 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00B3 USB Mobile Broadband
S:  SerialNumber=b3246eed
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
Signed-off-by: Christoph Schemmel christoph.schemmel@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index 1024aca252210..cdf25a39fca79 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -425,6 +425,8 @@ static void option_instat_callback(struct urb *urb); #define CINTERION_PRODUCT_AHXX_2RMNET 0x0084 #define CINTERION_PRODUCT_AHXX_AUDIO 0x0085 #define CINTERION_PRODUCT_CLS8 0x00b0 +#define CINTERION_PRODUCT_MV31_MBIM 0x00b3 +#define CINTERION_PRODUCT_MV31_RMNET 0x00b7
/* Olivetti products */ #define OLIVETTI_VENDOR_ID 0x0b3c @@ -1914,6 +1916,10 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE(SIEMENS_VENDOR_ID, CINTERION_PRODUCT_HC25_MDMNET) }, { USB_DEVICE(SIEMENS_VENDOR_ID, CINTERION_PRODUCT_HC28_MDM) }, /* HC28 enumerates with Siemens or Cinterion VID depending on FW revision */ { USB_DEVICE(SIEMENS_VENDOR_ID, CINTERION_PRODUCT_HC28_MDMNET) }, + { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV31_MBIM, 0xff), + .driver_info = RSVD(3)}, + { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV31_RMNET, 0xff), + .driver_info = RSVD(0)}, { USB_DEVICE(OLIVETTI_VENDOR_ID, OLIVETTI_PRODUCT_OLICARD100), .driver_info = RSVD(4) }, { USB_DEVICE(OLIVETTI_VENDOR_ID, OLIVETTI_PRODUCT_OLICARD120),
From: Arnd Bergmann arnd@arndb.de
commit 6e7b64b9dd6d96537d816ea07ec26b7dedd397b9 upstream.
kernel/elfcore.c only contains weak symbols, which triggers a bug with clang in combination with recordmcount:
Cannot find symbol for section 2: .text.
kernel/elfcore.o: failed
Move the empty stubs into linux/elfcore.h as inline functions. As only two architectures use these, just use the architecture specific Kconfig symbols to key off the declaration.
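For illustration only (a standalone userspace sketch using the GCC/Clang weak attribute, not the elfcore code itself), a weak default definition is silently overridden at link time whenever a strong definition exists, which is the behaviour the deleted kernel/elfcore.c relied on:

/* weak_demo.c - build with: gcc -O2 weak_demo.c -o weak_demo */
#include <stdio.h>

/* Default (weak) stub: an architecture that needs extra phdrs could
 * supply a strong definition elsewhere and it would win at link time. */
__attribute__((weak)) int extra_phdrs(void)
{
        return 0;
}

int main(void)
{
        /* No strong override is linked in here, so the weak stub runs. */
        printf("extra_phdrs() = %d\n", extra_phdrs());
        return 0;
}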
Link: https://lkml.kernel.org/r/20201204165742.3815221-2-arnd@kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Cc: Nathan Chancellor natechancellor@gmail.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Barret Rhoden brho@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/elfcore.h | 22 ++++++++++++++++++++++ kernel/Makefile | 1 - kernel/elfcore.c | 26 -------------------------- 3 files changed, 22 insertions(+), 27 deletions(-) delete mode 100644 kernel/elfcore.c
diff --git a/include/linux/elfcore.h b/include/linux/elfcore.h index 4cad0e784b286..b81f9e1d74b0a 100644 --- a/include/linux/elfcore.h +++ b/include/linux/elfcore.h @@ -58,6 +58,7 @@ static inline int elf_core_copy_task_xfpregs(struct task_struct *t, elf_fpxregse } #endif
+#if defined(CONFIG_UM) || defined(CONFIG_IA64) /* * These functions parameterize elf_core_dump in fs/binfmt_elf.c to write out * extra segments containing the gate DSO contents. Dumping its @@ -72,5 +73,26 @@ elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset); extern int elf_core_write_extra_data(struct coredump_params *cprm); extern size_t elf_core_extra_data_size(void); +#else +static inline Elf_Half elf_core_extra_phdrs(void) +{ + return 0; +} + +static inline int elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset) +{ + return 1; +} + +static inline int elf_core_write_extra_data(struct coredump_params *cprm) +{ + return 1; +} + +static inline size_t elf_core_extra_data_size(void) +{ + return 0; +} +#endif
#endif /* _LINUX_ELFCORE_H */ diff --git a/kernel/Makefile b/kernel/Makefile index d0482bd27ba49..ccc3a9c479306 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -92,7 +92,6 @@ obj-$(CONFIG_TASK_DELAY_ACCT) += delayacct.o obj-$(CONFIG_TASKSTATS) += taskstats.o tsacct.o obj-$(CONFIG_TRACEPOINTS) += tracepoint.o obj-$(CONFIG_LATENCYTOP) += latencytop.o -obj-$(CONFIG_ELFCORE) += elfcore.o obj-$(CONFIG_FUNCTION_TRACER) += trace/ obj-$(CONFIG_TRACING) += trace/ obj-$(CONFIG_TRACE_CLOCK) += trace/ diff --git a/kernel/elfcore.c b/kernel/elfcore.c deleted file mode 100644 index 57fb4dcff4349..0000000000000 --- a/kernel/elfcore.c +++ /dev/null @@ -1,26 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <linux/elf.h> -#include <linux/fs.h> -#include <linux/mm.h> -#include <linux/binfmts.h> -#include <linux/elfcore.h> - -Elf_Half __weak elf_core_extra_phdrs(void) -{ - return 0; -} - -int __weak elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset) -{ - return 1; -} - -int __weak elf_core_write_extra_data(struct coredump_params *cprm) -{ - return 1; -} - -size_t __weak elf_core_extra_data_size(void) -{ - return 0; -}
From: Alexey Dobriyan adobriyan@gmail.com
[ Upstream commit a3a9060ecad030e2c7903b2b258383d2c716b56c ]
g++ reports
drivers/input/serio/i8042-x86ia64io.h:225:3: error: ‘.matches’ designator used multiple times in the same initializer list
C99 semantics are that the last duplicated initialiser wins, so the earlier DMI entry gets overwritten.
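A minimal standalone C program (illustrative only, not the i8042 table itself) shows the C99 rule the fix relies on - when a designator is repeated, only the last initializer takes effect:

/* dup_designator.c - build with: gcc -std=c99 -Wextra dup_designator.c */
#include <stdio.h>

struct dmi_entry {
        const char *vendor;
        const char *product;
};

/* .vendor appears twice: per C99, the last one wins and the first
 * string is silently dropped (g++ rejects this outright, gcc can
 * warn with -Woverride-init). */
static const struct dmi_entry entry = {
        .vendor  = "PEGATRON CORPORATION",
        .vendor  = "ByteSpeed LLC",
        .product = "C15B",
};

int main(void)
{
        printf("vendor=%s product=%s\n", entry.vendor, entry.product);
        /* prints: vendor=ByteSpeed LLC product=C15B */
        return 0;
}

This mirrors what happened in i8042-x86ia64io.h: a missing "}, {" merged two DMI entries into one initializer list, so the Pegatron C15B match was overwritten.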
Fixes: a48491c65b51 ("Input: i8042 - add ByteSpeed touchpad to noloop table") Signed-off-by: Alexey Dobriyan adobriyan@gmail.com Acked-by: Po-Hsu Lin po-hsu.lin@canonical.com Link: https://lore.kernel.org/r/20201228072335.GA27766@localhost.localdomain Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/serio/i8042-x86ia64io.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h index b256e3006a6fb..844875df8cad7 100644 --- a/drivers/input/serio/i8042-x86ia64io.h +++ b/drivers/input/serio/i8042-x86ia64io.h @@ -223,6 +223,8 @@ static const struct dmi_system_id __initconst i8042_dmi_noloop_table[] = { DMI_MATCH(DMI_SYS_VENDOR, "PEGATRON CORPORATION"), DMI_MATCH(DMI_PRODUCT_NAME, "C15B"), }, + }, + { .matches = { DMI_MATCH(DMI_SYS_VENDOR, "ByteSpeed LLC"), DMI_MATCH(DMI_PRODUCT_NAME, "ByteSpeed Laptop C15B"),
From: David Howells dhowells@redhat.com
[ Upstream commit 5399d52233c47905bbf97dcbaa2d7a9cc31670ba ]
AF_RXRPC sockets use UDP ports in encap mode. This causes socket and dst from an incoming packet to get stolen and attached to the UDP socket from whence it is leaked when that socket is closed.
When a network namespace is removed, the wait for dst records to be cleaned up happens before the cleanup of the rxrpc and UDP socket, meaning that the wait never finishes.
Fix this by moving the rxrpc (and, by dependence, the afs) private per-network namespace registrations to the device group rather than subsys group. This allows cached rxrpc local endpoints to be cleared and their UDP sockets closed before we try waiting for the dst records.
The symptom is that lines looking like the following:
unregister_netdevice: waiting for lo to become free
get emitted at regular intervals after running something like the referenced syzbot test.
Thanks to Vadim for tracking this down and working out the fix.
Reported-by: syzbot+df400f2f24a1677cd7e0@syzkaller.appspotmail.com Reported-by: Vadim Fedorenko vfedorenko@novek.ru Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook") Signed-off-by: David Howells dhowells@redhat.com Acked-by: Vadim Fedorenko vfedorenko@novek.ru Link: https://lore.kernel.org/r/161196443016.3868642.5577440140646403533.stgit@war... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/main.c | 6 +++--- net/rxrpc/af_rxrpc.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/afs/main.c b/fs/afs/main.c index 107427688eddd..8ecb127be63f9 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -190,7 +190,7 @@ static int __init afs_init(void) goto error_cache; #endif
- ret = register_pernet_subsys(&afs_net_ops); + ret = register_pernet_device(&afs_net_ops); if (ret < 0) goto error_net;
@@ -210,7 +210,7 @@ static int __init afs_init(void) error_proc: afs_fs_exit(); error_fs: - unregister_pernet_subsys(&afs_net_ops); + unregister_pernet_device(&afs_net_ops); error_net: #ifdef CONFIG_AFS_FSCACHE fscache_unregister_netfs(&afs_cache_netfs); @@ -241,7 +241,7 @@ static void __exit afs_exit(void)
proc_remove(afs_proc_symlink); afs_fs_exit(); - unregister_pernet_subsys(&afs_net_ops); + unregister_pernet_device(&afs_net_ops); #ifdef CONFIG_AFS_FSCACHE fscache_unregister_netfs(&afs_cache_netfs); #endif diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c index 57f835d2442ec..fb7e3fffcb5ef 100644 --- a/net/rxrpc/af_rxrpc.c +++ b/net/rxrpc/af_rxrpc.c @@ -1010,7 +1010,7 @@ static int __init af_rxrpc_init(void) goto error_security; }
- ret = register_pernet_subsys(&rxrpc_net_ops); + ret = register_pernet_device(&rxrpc_net_ops); if (ret) goto error_pernet;
@@ -1055,7 +1055,7 @@ static int __init af_rxrpc_init(void) error_sock: proto_unregister(&rxrpc_proto); error_proto: - unregister_pernet_subsys(&rxrpc_net_ops); + unregister_pernet_device(&rxrpc_net_ops); error_pernet: rxrpc_exit_security(); error_security: @@ -1077,7 +1077,7 @@ static void __exit af_rxrpc_exit(void) unregister_key_type(&key_type_rxrpc); sock_unregister(PF_RXRPC); proto_unregister(&rxrpc_proto); - unregister_pernet_subsys(&rxrpc_net_ops); + unregister_pernet_device(&rxrpc_net_ops); ASSERTCMP(atomic_read(&rxrpc_n_tx_skbs), ==, 0); ASSERTCMP(atomic_read(&rxrpc_n_rx_skbs), ==, 0);
From: Zyta Szpak zr@semihalf.com
[ Upstream commit aa880c6f3ee6dbd0d5ab02026a514ff8ea0a3328 ]
The dcfg node was overlapping with the clockgen address space, which resulted in a memory allocation failure for dcfg. According to the regs description, the dcfg size should not be bigger than 4KB.
Signed-off-by: Zyta Szpak zr@semihalf.com Fixes: 8126d88162a5 ("arm64: dts: add QorIQ LS1046A SoC support") Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi index de6af453a6e16..f4eb4d3b6cabf 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -303,7 +303,7 @@
dcfg: dcfg@1ee0000 { compatible = "fsl,ls1046a-dcfg", "syscon"; - reg = <0x0 0x1ee0000 0x0 0x10000>; + reg = <0x0 0x1ee0000 0x0 0x1000>; big-endian; };
From: Xie He xie.he.0141@gmail.com
[ Upstream commit 88c7a9fd9bdd3e453f04018920964c6f848a591a ]
When sending a packet, we will prepend it with an LAPB header. This modifies the shared parts of a cloned skb, so we should copy the skb rather than just clone it, before we prepend the header.
In "Documentation/networking/driver.rst" (the 2nd point), it states that drivers shouldn't modify the shared parts of a cloned skb when transmitting.
The "dev_queue_xmit_nit" function in "net/core/dev.c", which is called when an skb is being sent, clones the skb and sends the clone to AF_PACKET sockets. Because the LAPB drivers first remove a 1-byte pseudo-header before handing over the skb to us, if we don't copy the skb before prepending the LAPB header, the first byte of the packets received on AF_PACKET sockets can be corrupted.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xie He xie.he.0141@gmail.com Acked-by: Martin Schiller ms@dev.tdt.de Link: https://lore.kernel.org/r/20210201055706.415842-1-xie.he.0141@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/lapb/lapb_out.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/lapb/lapb_out.c b/net/lapb/lapb_out.c index eda726e22f645..621c66f001177 100644 --- a/net/lapb/lapb_out.c +++ b/net/lapb/lapb_out.c @@ -87,7 +87,8 @@ void lapb_kick(struct lapb_cb *lapb) skb = skb_dequeue(&lapb->write_queue);
do { - if ((skbn = skb_clone(skb, GFP_ATOMIC)) == NULL) { + skbn = skb_copy(skb, GFP_ATOMIC); + if (!skbn) { skb_queue_head(&lapb->write_queue, skb); break; }
From: Stefan Chulski stefanc@marvell.com
[ Upstream commit 43f4a20a1266d393840ce010f547486d14cc0071 ]
The last TCAM data word contains the TCAM enable bit. It should be written after the SRAM data, before the entry is enabled.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Stefan Chulski stefanc@marvell.com Link: https://lore.kernel.org/r/1612172139-28343-1-git-send-email-stefanc@marvell.... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c index a30eb90ba3d28..dd590086fe6a5 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c @@ -29,16 +29,16 @@ static int mvpp2_prs_hw_write(struct mvpp2 *priv, struct mvpp2_prs_entry *pe) /* Clear entry invalidation bit */ pe->tcam[MVPP2_PRS_TCAM_INV_WORD] &= ~MVPP2_PRS_TCAM_INV_MASK;
- /* Write tcam index - indirect access */ - mvpp2_write(priv, MVPP2_PRS_TCAM_IDX_REG, pe->index); - for (i = 0; i < MVPP2_PRS_TCAM_WORDS; i++) - mvpp2_write(priv, MVPP2_PRS_TCAM_DATA_REG(i), pe->tcam[i]); - /* Write sram index - indirect access */ mvpp2_write(priv, MVPP2_PRS_SRAM_IDX_REG, pe->index); for (i = 0; i < MVPP2_PRS_SRAM_WORDS; i++) mvpp2_write(priv, MVPP2_PRS_SRAM_DATA_REG(i), pe->sram[i]);
+ /* Write tcam index - indirect access */ + mvpp2_write(priv, MVPP2_PRS_TCAM_IDX_REG, pe->index); + for (i = 0; i < MVPP2_PRS_TCAM_WORDS; i++) + mvpp2_write(priv, MVPP2_PRS_TCAM_DATA_REG(i), pe->tcam[i]); + return 0; }
From: Roman Gushchin guro@fb.com
[ Upstream commit 2dcb3964544177c51853a210b6ad400de78ef17d ]
With kaslr the kernel image is placed at a random place, so starting the bottom-up allocation with the kernel_end can result in an allocation failure and a warning like this one:
hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
------------[ cut here ]------------
memblock: bottom-up allocation failed, memory hotremove may be affected
WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
RIP: 0010:memblock_find_in_range_node+0x178/0x25a
Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
FS:  0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
Call Trace:
  memblock_alloc_range_nid+0x8d/0x11e
  cma_declare_contiguous_nid+0x2c4/0x38c
  hugetlb_cma_reserve+0xdc/0x128
  flush_tlb_one_kernel+0xc/0x20
  native_set_fixmap+0x82/0xd0
  flat_get_apic_id+0x5/0x10
  register_lapic_address+0x8e/0x97
  setup_arch+0x8a5/0xc3f
  start_kernel+0x66/0x547
  load_ucode_bsp+0x4c/0xcd
  secondary_startup_64_no_verify+0xb0/0xbb
random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
---[ end trace f151227d0b39be70 ]---
At the same time, the kernel image is protected with memblock_reserve(), so we can just start searching at PAGE_SIZE. In this case the bottom-up allocation has the same chance of success as a top-down allocation, so there is no reason to fall back in the case of a failure. Altogether this simplifies the logic.
Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory") Signed-off-by: Roman Gushchin guro@fb.com Reviewed-by: Mike Rapoport rppt@linux.ibm.com Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: Michal Hocko mhocko@kernel.org Cc: Rik van Riel riel@surriel.com Cc: Wonhyuk Yang vvghjk1234@gmail.com Cc: Thiago Jung Bauermann bauerman@linux.ibm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memblock.c | 49 ++++++------------------------------------------- 1 file changed, 6 insertions(+), 43 deletions(-)
diff --git a/mm/memblock.c b/mm/memblock.c index 5c943854bdfb4..13be610a381f4 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -241,14 +241,6 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end, * * Find @size free area aligned to @align in the specified range and node. * - * When allocation direction is bottom-up, the @start should be greater - * than the end of the kernel image. Otherwise, it will be trimmed. The - * reason is that we want the bottom-up allocation just near the kernel - * image so it is highly likely that the allocated memory and the kernel - * will reside in the same node. - * - * If bottom-up allocation failed, will try to allocate memory top-down. - * * Return: * Found address on success, 0 on failure. */ @@ -257,8 +249,6 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size, phys_addr_t end, int nid, enum memblock_flags flags) { - phys_addr_t kernel_end, ret; - /* pump up @end */ if (end == MEMBLOCK_ALLOC_ACCESSIBLE || end == MEMBLOCK_ALLOC_KASAN) @@ -267,40 +257,13 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size, /* avoid allocating the first page */ start = max_t(phys_addr_t, start, PAGE_SIZE); end = max(start, end); - kernel_end = __pa_symbol(_end); - - /* - * try bottom-up allocation only when bottom-up mode - * is set and @end is above the kernel image. - */ - if (memblock_bottom_up() && end > kernel_end) { - phys_addr_t bottom_up_start; - - /* make sure we will allocate above the kernel */ - bottom_up_start = max(start, kernel_end);
- /* ok, try bottom-up allocation first */ - ret = __memblock_find_range_bottom_up(bottom_up_start, end, - size, align, nid, flags); - if (ret) - return ret; - - /* - * we always limit bottom-up allocation above the kernel, - * but top-down allocation doesn't have the limit, so - * retrying top-down allocation may succeed when bottom-up - * allocation failed. - * - * bottom-up allocation is expected to be fail very rarely, - * so we use WARN_ONCE() here to see the stack trace if - * fail happens. - */ - WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE), - "memblock: bottom-up allocation failed, memory hotremove may be affected\n"); - } - - return __memblock_find_range_top_down(start, end, size, align, nid, - flags); + if (memblock_bottom_up()) + return __memblock_find_range_bottom_up(start, end, size, align, + nid, flags); + else + return __memblock_find_range_top_down(start, end, size, align, + nid, flags); }
/**
From: Dan Carpenter dan.carpenter@oracle.com
commit 3e1f4a2e1184ae6ad7f4caf682ced9554141a0f4 upstream.
This code should return -ENOMEM if the allocation fails but it currently returns success.
Fixes: 9b95236eebdb ("usb: gadget: ether: allocate and init otg descriptor by otg capabilities") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/YBKE9rqVuJEOUWpW@mwanda Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/legacy/ether.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/legacy/ether.c b/drivers/usb/gadget/legacy/ether.c index 30313b233680d..99c7fc0d1d597 100644 --- a/drivers/usb/gadget/legacy/ether.c +++ b/drivers/usb/gadget/legacy/ether.c @@ -403,8 +403,10 @@ static int eth_bind(struct usb_composite_dev *cdev) struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail1; + } usb_otg_descriptor_init(gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Jeremy Figgins kernel@jeremyfiggins.com
commit d8c6edfa3f4ee0d45d7ce5ef18d1245b78774b9d upstream.
Some devices, such as the Winbond Electronics Corp. Virtual Com Port (Vendor=0416, ProdId=5011), lock up when usb_set_interface() or usb_clear_halt() are called. This device has only a single altsetting, so it should not be necessary to call usb_set_interface().
Acked-by: Pete Zaitcev zaitcev@redhat.com Signed-off-by: Jeremy Figgins kernel@jeremyfiggins.com Link: https://lore.kernel.org/r/YAy9kJhM/rG8EQXC@watson Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/usblp.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/class/usblp.c b/drivers/usb/class/usblp.c index 5c00d79ac7a10..c9560d8ba3cb1 100644 --- a/drivers/usb/class/usblp.c +++ b/drivers/usb/class/usblp.c @@ -1327,14 +1327,17 @@ static int usblp_set_protocol(struct usblp *usblp, int protocol) if (protocol < USBLP_FIRST_PROTOCOL || protocol > USBLP_LAST_PROTOCOL) return -EINVAL;
- alts = usblp->protocol[protocol].alt_setting; - if (alts < 0) - return -EINVAL; - r = usb_set_interface(usblp->dev, usblp->ifnum, alts); - if (r < 0) { - printk(KERN_ERR "usblp: can't set desired altsetting %d on interface %d\n", - alts, usblp->ifnum); - return r; + /* Don't unnecessarily set the interface if there's a single alt. */ + if (usblp->intf->num_altsetting > 1) { + alts = usblp->protocol[protocol].alt_setting; + if (alts < 0) + return -EINVAL; + r = usb_set_interface(usblp->dev, usblp->ifnum, alts); + if (r < 0) { + printk(KERN_ERR "usblp: can't set desired altsetting %d on interface %d\n", + alts, usblp->ifnum); + return r; + } }
usblp->bidir = (usblp->protocol[protocol].epread != NULL);
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
commit 9917f0e3cdba7b9f1a23f70e3f70b1a106be54a8 upstream.
The pipe running flag should be cleared in usbhs_pkt_pop(). Otherwise, we cannot use this pipe after dequeue was called while the pipe was running.
Fixes: 8355b2b3082d ("usb: renesas_usbhs: fix the behavior of some usbhs_pkt_handle") Reported-by: Tho Vu tho.vu.wh@renesas.com Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Link: https://lore.kernel.org/r/1612183640-8898-1-git-send-email-yoshihiro.shimoda... Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/renesas_usbhs/fifo.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/renesas_usbhs/fifo.c b/drivers/usb/renesas_usbhs/fifo.c index aeb53ec5cc6ab..6b92a57382db3 100644 --- a/drivers/usb/renesas_usbhs/fifo.c +++ b/drivers/usb/renesas_usbhs/fifo.c @@ -126,6 +126,7 @@ struct usbhs_pkt *usbhs_pkt_pop(struct usbhs_pipe *pipe, struct usbhs_pkt *pkt) }
usbhs_pipe_clear_without_sequence(pipe, 0, 0); + usbhs_pipe_running(pipe, 0);
__usbhsf_pkt_del(pkt); }
From: Heiko Stuebner heiko.stuebner@theobroma-systems.com
commit f670e9f9c8cac716c3506c6bac9e997b27ad441a upstream.
dwc2_hsotg_process_req_status uses ep_from_windex() to retrieve the endpoint for the index provided in the wIndex request param.
In a test-case with a rndis gadget running and sending a malformed packet to it like:

dev.ctrl_transfer(
    0x82,      # bmRequestType
    0x00,      # bRequest
    0x0000,    # wValue
    0x0001,    # wIndex
    0x00       # wLength
)

it is possible to cause a crash:
[  217.533022] dwc2 ff300000.usb: dwc2_hsotg_process_req_status: USB_REQ_GET_STATUS
[  217.559003] Unable to handle kernel read from unreadable memory at virtual address 0000000000000088
...
[  218.313189] Call trace:
[  218.330217]  ep_from_windex+0x3c/0x54
[  218.348565]  usb_gadget_giveback_request+0x10/0x20
[  218.368056]  dwc2_hsotg_complete_request+0x144/0x184
This happens because ep_from_windex wants to compare the endpoint direction even if index_to_ep() didn't return an endpoint due to the direction not matching.
The fix is easy insofar as the actual direction check already happens when calling index_to_ep(), which will return NULL if there is no endpoint for the targeted direction, so the offending check can go away completely.
Fixes: c6f5c050e2a7 ("usb: dwc2: gadget: add bi-directional endpoint support") Cc: stable@vger.kernel.org Reported-by: Gerhard Klostermeier gerhard.klostermeier@syss.de Signed-off-by: Heiko Stuebner heiko.stuebner@theobroma-systems.com Link: https://lore.kernel.org/r/20210127103919.58215-1-heiko@sntech.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/gadget.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c index 8e98b4df9b109..42f66718bc184 100644 --- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -1453,7 +1453,6 @@ static void dwc2_hsotg_complete_oursetup(struct usb_ep *ep, static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg, u32 windex) { - struct dwc2_hsotg_ep *ep; int dir = (windex & USB_DIR_IN) ? 1 : 0; int idx = windex & 0x7F;
@@ -1463,12 +1462,7 @@ static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg, if (idx > hsotg->num_of_eps) return NULL;
- ep = index_to_ep(hsotg, idx, dir); - - if (idx && ep->dir_in != dir) - return NULL; - - return ep; + return index_to_ep(hsotg, idx, dir); }
/**
From: Gary Bisson gary.bisson@boundarydevices.com
commit 0e5a3c8284a30f4c43fd81d7285528ece74563b5 upstream.
Commit fe8abf332b8f ("usb: dwc3: support clocks and resets for DWC3 core") introduced clock support and a new function named dwc3_core_init_for_resume() which enables the clock before calling dwc3_core_init() during resume as clocks get disabled during suspend.
Unfortunately, in this commit the DWC3_GCTL_PRTCAP_OTG case was forgotten, and therefore during resume a platform could call dwc3_core_init() without re-enabling the clocks first, preventing it from resuming properly.
So update the resume path to call dwc3_core_init_for_resume() as it should.
Fixes: fe8abf332b8f ("usb: dwc3: support clocks and resets for DWC3 core") Cc: stable@vger.kernel.org Signed-off-by: Gary Bisson gary.bisson@boundarydevices.com Link: https://lore.kernel.org/r/20210125161934.527820-1-gary.bisson@boundarydevice... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index 86b1cfbe48a08..e890c26b6c826 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1700,7 +1700,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg) if (PMSG_IS_AUTO(msg)) break;
- ret = dwc3_core_init(dwc); + ret = dwc3_core_init_for_resume(dwc); if (ret) return ret;
From: Liangyan liangyan.peng@linux.alibaba.com
commit e04527fefba6e4e66492f122cf8cc6314f3cf3bf upstream.
We need to hold d_parent->d_lock before calling dget_dlock(), or d_lockref may be updated in parallel as in the calltrace below, which leaks a dentry->d_lockref count and risks a crash.
CPU 0                                       CPU 1
ovl_set_redirect                            lookup_fast
  ovl_get_redirect                            __d_lookup
    dget_dlock // no lock protection here       spin_lock(&dentry->d_lock)
    dentry->d_lockref.count++                   dentry->d_lockref.count++
[ 49.799059] PGD 800000061fed7067 P4D 800000061fed7067 PUD 61fec5067 PMD 0
[ 49.799689] Oops: 0002 [#1] SMP PTI
[ 49.800019] CPU: 2 PID: 2332 Comm: node Not tainted 4.19.24-7.20.al7.x86_64 #1
[ 49.800678] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014
[ 49.801380] RIP: 0010:_raw_spin_lock+0xc/0x20
[ 49.803470] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
[ 49.803949] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
[ 49.804600] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
[ 49.805252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
[ 49.805898] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
[ 49.806548] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
[ 49.807200] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
[ 49.807935] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 49.808461] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
[ 49.809113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 49.809758] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 49.810410] Call Trace:
[ 49.810653]  d_delete+0x2c/0xb0
[ 49.810951]  vfs_rmdir+0xfd/0x120
[ 49.811264]  do_rmdir+0x14f/0x1a0
[ 49.811573]  do_syscall_64+0x5b/0x190
[ 49.811917]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 49.812385] RIP: 0033:0x7ffbf505ffd7
[ 49.814404] RSP: 002b:00007ffbedffada8 EFLAGS: 00000297 ORIG_RAX: 0000000000000054
[ 49.815098] RAX: ffffffffffffffda RBX: 00007ffbedffb640 RCX: 00007ffbf505ffd7
[ 49.815744] RDX: 0000000004449700 RSI: 0000000000000000 RDI: 0000000006c8cd50
[ 49.816394] RBP: 00007ffbedffaea0 R08: 0000000000000000 R09: 0000000000017d0b
[ 49.817038] R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000012
[ 49.817687] R13: 00000000072823d8 R14: 00007ffbedffb700 R15: 00000000072823d8
[ 49.818338] Modules linked in: pvpanic cirrusfb button qemu_fw_cfg atkbd libps2 i8042
[ 49.819052] CR2: 0000000000000088
[ 49.819368] ---[ end trace 4e652b8aa299aa2d ]---
[ 49.819796] RIP: 0010:_raw_spin_lock+0xc/0x20
[ 49.821880] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
[ 49.822363] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
[ 49.823008] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
[ 49.823658] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
[ 49.825404] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
[ 49.827147] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
[ 49.828890] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
[ 49.830725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 49.832359] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
[ 49.834085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 49.835792] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Cc: stable@vger.kernel.org Fixes: a6c606551141 ("ovl: redirect on rename-dir") Signed-off-by: Liangyan liangyan.peng@linux.alibaba.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/overlayfs/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index 2228b38abdb1d..74340542824a4 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -968,8 +968,8 @@ static char *ovl_get_redirect(struct dentry *dentry, bool abs_redirect)
buflen -= thislen; memcpy(&buf[buflen], name, thislen); - tmp = dget_dlock(d->d_parent); spin_unlock(&d->d_lock); + tmp = dget_parent(d);
dput(d); d = tmp;
From: Felix Fietkau nbd@nbd.name
commit 18fe0fae61252b5ae6e26553e2676b5fac555951 upstream.
If the driver uses .sta_add, station entries are only uploaded after the sta is in assoc state. Fix early station rate table updates by deferring them until the sta has been uploaded.
Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20210201083324.3134-1-nbd@nbd.name [use rcu_access_pointer() instead since we won't dereference here] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/driver-ops.c | 5 ++++- net/mac80211/rate.c | 3 ++- 2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/driver-ops.c b/net/mac80211/driver-ops.c index f783d1377d9a8..9f0f437a09b95 100644 --- a/net/mac80211/driver-ops.c +++ b/net/mac80211/driver-ops.c @@ -128,8 +128,11 @@ int drv_sta_state(struct ieee80211_local *local, } else if (old_state == IEEE80211_STA_AUTH && new_state == IEEE80211_STA_ASSOC) { ret = drv_sta_add(local, sdata, &sta->sta); - if (ret == 0) + if (ret == 0) { sta->uploaded = true; + if (rcu_access_pointer(sta->sta.rates)) + drv_sta_rate_tbl_update(local, sdata, &sta->sta); + } } else if (old_state == IEEE80211_STA_ASSOC && new_state == IEEE80211_STA_AUTH) { drv_sta_remove(local, sdata, &sta->sta); diff --git a/net/mac80211/rate.c b/net/mac80211/rate.c index 76f303fda3eda..954b932fd7b86 100644 --- a/net/mac80211/rate.c +++ b/net/mac80211/rate.c @@ -941,7 +941,8 @@ int rate_control_set_rates(struct ieee80211_hw *hw, if (old) kfree_rcu(old, rcu_head);
- drv_sta_rate_tbl_update(hw_to_local(hw), sta->sdata, pubsta); + if (sta->uploaded) + drv_sta_rate_tbl_update(hw_to_local(hw), sta->sdata, pubsta);
ieee80211_sta_set_expected_throughput(pubsta, sta_get_expected_throughput(sta));
From: Wang ShaoBo bobo.shaobowang@huawei.com
commit 0188b87899ffc4a1d36a0badbe77d56c92fd91dc upstream.
Our system encountered a re-init error when re-registering the same kretprobe, where the kretprobe_instance in rp->free_instances is illegally accessed after re-init.
A check to avoid re-registration was introduced for kprobes earlier, but register_kretprobe() still lacks it. We must check whether the kprobe has been re-registered before re-initializing the kretprobe, otherwise re-initialization will destroy the data structures of the already registered kretprobe, which can lead to memory leaks, system crashes, and other unexpected behavior.
We use check_kprobe_rereg() to check if the kprobe has been re-registered before running register_kretprobe()'s body, giving a warning message and terminating the registration process.
Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.co...
Cc: stable@vger.kernel.org Fixes: 1f0ab40976460 ("kprobes: Prevent re-registration of the same kprobe") [ The above commit should have been done for kretprobes too ] Acked-by: Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com Acked-by: Ananth N Mavinakayanahalli ananth@linux.ibm.com Acked-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/kprobes.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 182fc95feb72d..a6b658b5b7aee 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1948,6 +1948,10 @@ int register_kretprobe(struct kretprobe *rp) if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset)) return -EINVAL;
+ /* If only rp->kp.addr is specified, check reregistering kprobes */ + if (rp->kp.addr && check_kprobe_rereg(&rp->kp)) + return -EINVAL; + if (kretprobe_blacklist_size) { addr = kprobe_addr(&rp->kp); if (IS_ERR(addr))
From: Marc Zyngier maz@kernel.org
commit 4c457e8cb75eda91906a4f89fc39bde3f9a43922 upstream.
When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI), __msi_domain_alloc_irqs() performs the activation of the interrupt (which in the case of PCI results in the endpoint being programmed) as soon as the interrupt is allocated.
But it appears that this is only done for the first vector, introducing an inconsistent behaviour for PCI Multi-MSI.
Fix it by iterating over the number of vectors allocated to each MSI descriptor. This is easily achieved by introducing a new "for_each_msi_vector" iterator, together with a tiny bit of refactoring.
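As a rough, self-contained illustration of that iterator pattern (userspace C with invented names such as demo_desc and for_each_demo_vector, not the kernel's msi_desc API), the chained for/if/for construct visits every vector of every allocated descriptor:

/* msi_iter_demo.c - simplified userspace analogue of the pattern. */
#include <stdio.h>

struct demo_desc {
        int irq;        /* first vector number, 0 if unallocated */
        int nvec_used;  /* number of consecutive vectors */
};

#define for_each_demo_vector(d, v, descs, n)                      \
        for ((d) = (descs); (d) < (descs) + (n); (d)++)            \
                if ((d)->irq)                                      \
                        for ((v) = (d)->irq;                       \
                             (v) < (d)->irq + (d)->nvec_used; (v)++)

int main(void)
{
        struct demo_desc descs[] = {
                { .irq = 32, .nvec_used = 4 },  /* Multi-MSI: 4 vectors */
                { .irq = 0,  .nvec_used = 0 },  /* not allocated: skipped */
                { .irq = 48, .nvec_used = 1 },
        };
        struct demo_desc *d;
        int v;

        for_each_demo_vector(d, v, descs, 3)
                printf("activate vector %d (desc starting at %d)\n", v, d->irq);
        return 0;
}

Compiled and run, this prints vectors 32-35 for the Multi-MSI descriptor and 48 for the single-vector one, skipping the unallocated descriptor - the same walk the patch performs when activating and deactivating vectors.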
Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early") Reported-by: Shameer Kolothum shameerali.kolothum.thodi@huawei.com Signed-off-by: Marc Zyngier maz@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Shameer Kolothum shameerali.kolothum.thodi@huawei.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210123122759.1781359-1-maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/msi.h | 6 ++++++ kernel/irq/msi.c | 44 ++++++++++++++++++++------------------------ 2 files changed, 26 insertions(+), 24 deletions(-)
diff --git a/include/linux/msi.h b/include/linux/msi.h index be8ec813dbfb2..5dd171849a27e 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -118,6 +118,12 @@ struct msi_desc { list_for_each_entry((desc), dev_to_msi_list((dev)), list) #define for_each_msi_entry_safe(desc, tmp, dev) \ list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list) +#define for_each_msi_vector(desc, __irq, dev) \ + for_each_msi_entry((desc), (dev)) \ + if ((desc)->irq) \ + for (__irq = (desc)->irq; \ + __irq < ((desc)->irq + (desc)->nvec_used); \ + __irq++)
#ifdef CONFIG_PCI_MSI #define first_pci_msi_entry(pdev) first_msi_entry(&(pdev)->dev) diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index dc1186ce3ecdb..604974f2afb19 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -437,22 +437,22 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
can_reserve = msi_check_reservation_mode(domain, info, dev);
- for_each_msi_entry(desc, dev) { - virq = desc->irq; - if (desc->nvec_used == 1) - dev_dbg(dev, "irq %d for MSI\n", virq); - else + /* + * This flag is set by the PCI layer as we need to activate + * the MSI entries before the PCI layer enables MSI in the + * card. Otherwise the card latches a random msi message. + */ + if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) + goto skip_activate; + + for_each_msi_vector(desc, i, dev) { + if (desc->irq == i) { + virq = desc->irq; dev_dbg(dev, "irq [%d-%d] for MSI\n", virq, virq + desc->nvec_used - 1); - /* - * This flag is set by the PCI layer as we need to activate - * the MSI entries before the PCI layer enables MSI in the - * card. Otherwise the card latches a random msi message. - */ - if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) - continue; + }
- irq_data = irq_domain_get_irq_data(domain, desc->irq); + irq_data = irq_domain_get_irq_data(domain, i); if (!can_reserve) { irqd_clr_can_reserve(irq_data); if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK) @@ -463,28 +463,24 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, goto cleanup; }
+skip_activate: /* * If these interrupts use reservation mode, clear the activated bit * so request_irq() will assign the final vector. */ if (can_reserve) { - for_each_msi_entry(desc, dev) { - irq_data = irq_domain_get_irq_data(domain, desc->irq); + for_each_msi_vector(desc, i, dev) { + irq_data = irq_domain_get_irq_data(domain, i); irqd_clr_activated(irq_data); } } return 0;
cleanup: - for_each_msi_entry(desc, dev) { - struct irq_data *irqd; - - if (desc->irq == virq) - break; - - irqd = irq_domain_get_irq_data(domain, desc->irq); - if (irqd_is_activated(irqd)) - irq_domain_deactivate_irq(irqd); + for_each_msi_vector(desc, i, dev) { + irq_data = irq_domain_get_irq_data(domain, i); + if (irqd_is_activated(irq_data)) + irq_domain_deactivate_irq(irq_data); } msi_domain_free_irqs(domain, dev); return ret;
From: Mathias Nyman mathias.nyman@linux.intel.com
commit d4a610635400ccc382792f6be69427078541c678 upstream.
xhci driver may in some special cases need to copy small amounts of payload data to a bounce buffer in order to meet the boundary and alignment restrictions set by the xHCI specification.
In the majority of these cases the data is in a sg list, and the driver incorrectly assumed data is always in urb->sg when using the bounce buffer.
If the data instead is contiguous, and in urb->transfer_buffer, we may still need to bounce-buffer a small part if the data starts very close (less than a packet size) to a 64k boundary.
Check if sg list is used before copying data to/from it.
Fixes: f9c589e142d0 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer") Cc: stable@vger.kernel.org Reported-by: Andreas Hartmann andihartmann@01019freenet.de Tested-by: Andreas Hartmann andihartmann@01019freenet.de Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20210203113702.436762-2-mathias.nyman@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-ring.c | 31 ++++++++++++++++++++----------- 1 file changed, 20 insertions(+), 11 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 537bbcd47c463..22b1de99d02d1 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -670,11 +670,16 @@ static void xhci_unmap_td_bounce_buffer(struct xhci_hcd *xhci, dma_unmap_single(dev, seg->bounce_dma, ring->bounce_buf_len, DMA_FROM_DEVICE); /* for in tranfers we need to copy the data from bounce to sg */ - len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs, seg->bounce_buf, - seg->bounce_len, seg->bounce_offs); - if (len != seg->bounce_len) - xhci_warn(xhci, "WARN Wrong bounce buffer read length: %zu != %d\n", - len, seg->bounce_len); + if (urb->num_sgs) { + len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs, seg->bounce_buf, + seg->bounce_len, seg->bounce_offs); + if (len != seg->bounce_len) + xhci_warn(xhci, "WARN Wrong bounce buffer read length: %zu != %d\n", + len, seg->bounce_len); + } else { + memcpy(urb->transfer_buffer + seg->bounce_offs, seg->bounce_buf, + seg->bounce_len); + } seg->bounce_len = 0; seg->bounce_offs = 0; } @@ -3180,12 +3185,16 @@ static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
/* create a max max_pkt sized bounce buffer pointed to by last trb */ if (usb_urb_dir_out(urb)) { - len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs, - seg->bounce_buf, new_buff_len, enqd_len); - if (len != new_buff_len) - xhci_warn(xhci, - "WARN Wrong bounce buffer write length: %zu != %d\n", - len, new_buff_len); + if (urb->num_sgs) { + len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs, + seg->bounce_buf, new_buff_len, enqd_len); + if (len != new_buff_len) + xhci_warn(xhci, "WARN Wrong bounce buffer write length: %zu != %d\n", + len, new_buff_len); + } else { + memcpy(seg->bounce_buf, urb->transfer_buffer + enqd_len, new_buff_len); + } + seg->bounce_dma = dma_map_single(dev, seg->bounce_buf, max_pkt, DMA_TO_DEVICE); } else {
From: Aurelien Aptel aaptel@suse.com
commit 21b200d091826a83aafc95d847139b2b0582f6d1 upstream.
Assuming
- //HOST/a is mounted on /mnt
- //HOST/b is mounted on /mnt/b
On a slow connection, running 'df' and killing it while it's processing /mnt/b can make cifs_get_inode_info() return -ERESTARTSYS.
This triggers the following chain of events:
=> the dentry revalidation fails
=> the dentry is put and released
=> the superblock associated with the dentry is put
=> /mnt/b is unmounted
This patch makes cifs_d_revalidate() return the error instead of 0 (invalid) when cifs_revalidate_dentry() fails, except for ENOENT (file deleted) and ESTALE (file recreated).
Signed-off-by: Aurelien Aptel aaptel@suse.com Suggested-by: Shyam Prasad N nspmangalore@gmail.com Reviewed-by: Shyam Prasad N nspmangalore@gmail.com CC: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/dir.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c index f6e3c00898254..c7e162c9383dc 100644 --- a/fs/cifs/dir.c +++ b/fs/cifs/dir.c @@ -840,6 +840,7 @@ static int cifs_d_revalidate(struct dentry *direntry, unsigned int flags) { struct inode *inode; + int rc;
if (flags & LOOKUP_RCU) return -ECHILD; @@ -849,8 +850,25 @@ cifs_d_revalidate(struct dentry *direntry, unsigned int flags) if ((flags & LOOKUP_REVAL) && !CIFS_CACHE_READ(CIFS_I(inode))) CIFS_I(inode)->time = 0; /* force reval */
- if (cifs_revalidate_dentry(direntry)) - return 0; + rc = cifs_revalidate_dentry(direntry); + if (rc) { + cifs_dbg(FYI, "cifs_revalidate_dentry failed with rc=%d", rc); + switch (rc) { + case -ENOENT: + case -ESTALE: + /* + * Those errors mean the dentry is invalid + * (file was deleted or recreated) + */ + return 0; + default: + /* + * Otherwise some unexpected error happened + * report it as-is to VFS layer + */ + return rc; + } + } else { /* * If the inode wasn't known to be a dfs entry when
From: "Gustavo A. R. Silva" gustavoars@kernel.org
commit 8d8d1dbefc423d42d626cf5b81aac214870ebaab upstream.
While addressing some warnings generated by -Warray-bounds, I found this bug that was introduced back in 2017:
  CC [M]  fs/cifs/smb2pdu.o
fs/cifs/smb2pdu.c: In function ‘SMB2_negotiate’:
fs/cifs/smb2pdu.c:822:16: warning: array subscript 1 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds]
  822 |  req->Dialects[1] = cpu_to_le16(SMB30_PROT_ID);
      |  ~~~~~~~~~~~~~^~~
fs/cifs/smb2pdu.c:823:16: warning: array subscript 2 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds]
  823 |  req->Dialects[2] = cpu_to_le16(SMB302_PROT_ID);
      |  ~~~~~~~~~~~~~^~~
fs/cifs/smb2pdu.c:824:16: warning: array subscript 3 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds]
  824 |  req->Dialects[3] = cpu_to_le16(SMB311_PROT_ID);
      |  ~~~~~~~~~~~~~^~~
fs/cifs/smb2pdu.c:816:16: warning: array subscript 1 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds]
  816 |  req->Dialects[1] = cpu_to_le16(SMB302_PROT_ID);
      |  ~~~~~~~~~~~~~^~~
At the time, the size of array _Dialects_ was changed from 1 to 3 in struct validate_negotiate_info_req, and then in 2019 it was changed from 3 to 4, but those changes were never made in struct smb2_negotiate_req, which has led to a three-and-a-half-year-old out-of-bounds bug in function SMB2_negotiate() (fs/cifs/smb2pdu.c).
Fix this by increasing the size of array _Dialects_ in struct smb2_negotiate_req to 4.
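As a standalone illustration (simplified types, not the real on-the-wire SMB2 structures), sizing the trailing array for one dialect while writing four is exactly the out-of-bounds access -Warray-bounds flagged:

/* dialects_demo.c - simplified layout, illustrative values only. */
#include <stdio.h>
#include <stdint.h>

struct negotiate_req_old {
        uint16_t dialect_count;
        uint16_t dialects[1];   /* room for a single dialect only */
};

struct negotiate_req_new {
        uint16_t dialect_count;
        uint16_t dialects[4];   /* room for the four negotiated dialects */
};

int main(void)
{
        struct negotiate_req_new req = { .dialect_count = 4 };

        /* With the old 1-element layout, indices 1..3 would write past
         * the end of the array - the accesses the compiler warned about.
         * The values below are illustrative dialect IDs. */
        req.dialects[0] = 0x0210;
        req.dialects[1] = 0x0300;
        req.dialects[2] = 0x0302;
        req.dialects[3] = 0x0311;

        printf("old sizeof=%zu, new sizeof=%zu, last dialect=0x%04x\n",
               sizeof(struct negotiate_req_old),
               sizeof(struct negotiate_req_new), req.dialects[3]);
        return 0;
}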
Fixes: 9764c02fcbad ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)") Fixes: d5c7076b772a ("smb3: add smb3.1.1 to default dialect list") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2pdu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/smb2pdu.h b/fs/cifs/smb2pdu.h index 48ed43e6aee84..8a44d59947b70 100644 --- a/fs/cifs/smb2pdu.h +++ b/fs/cifs/smb2pdu.h @@ -222,7 +222,7 @@ struct smb2_negotiate_req { __le32 NegotiateContextOffset; /* SMB3.1.1 only. MBZ earlier */ __le16 NegotiateContextCount; /* SMB3.1.1 only. MBZ earlier */ __le16 Reserved2; - __le16 Dialects[1]; /* One dialect (vers=) at a time for now */ + __le16 Dialects[4]; /* BB expand this if autonegotiate > 4 dialects */ } __packed;
/* Dialects */
From: Fengnan Chang fengnanchang@gmail.com
commit f92e04f764b86e55e522988e6f4b6082d19a2721 upstream.
When analysing tuples fails, we may loop indefinitely to retry. Let's avoid this by using a 10s timeout and bailing out if it hasn't completed earlier.
Signed-off-by: Fengnan Chang fengnanchang@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210123033230.36442-1-fengnanchang@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/sdio_cis.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/mmc/core/sdio_cis.c b/drivers/mmc/core/sdio_cis.c index 2ca5cd79018b4..dca72444b3122 100644 --- a/drivers/mmc/core/sdio_cis.c +++ b/drivers/mmc/core/sdio_cis.c @@ -24,6 +24,8 @@ #include "sdio_cis.h" #include "sdio_ops.h"
+#define SDIO_READ_CIS_TIMEOUT_MS (10 * 1000) /* 10s */ + static int cistpl_vers_1(struct mmc_card *card, struct sdio_func *func, const unsigned char *buf, unsigned size) { @@ -270,6 +272,8 @@ static int sdio_read_cis(struct mmc_card *card, struct sdio_func *func)
do { unsigned char tpl_code, tpl_link; + unsigned long timeout = jiffies + + msecs_to_jiffies(SDIO_READ_CIS_TIMEOUT_MS);
ret = mmc_io_rw_direct(card, 0, 0, ptr++, 0, &tpl_code); if (ret) @@ -322,6 +326,8 @@ static int sdio_read_cis(struct mmc_card *card, struct sdio_func *func) prev = &this->next;
if (ret == -ENOENT) { + if (time_after(jiffies, timeout)) + break; /* warn about unknown tuples */ pr_warn_ratelimited("%s: queuing unknown" " CIS tuple 0x%02x (%u bytes)\n",
From: Thorsten Leemhuis linux@leemhuis.info
commit 538e4a8c571efdf131834431e0c14808bcfb1004 upstream.
Some Kingston A2000 NVMe SSDs sooner or later get confused and stop working when they use the deepest APST sleep while running Linux. The system then crashes and one has to cold boot it to get the SSD working again.
Kingston seems to have known about this since at least mid-September 2020: https://bbs.archlinux.org/viewtopic.php?pid=1926994#p1926994
Someone working for a German company representing Kingston to the German press confirmed to me Kingston engineering is aware of the issue and investigating; the person stated that to their current knowledge only the deepest APST sleep state causes trouble. Therefore, make Linux avoid it for now by applying the NVME_QUIRK_NO_DEEPEST_PS to this SSD.
I have two such SSDs, but it seems the problem doesn't occur with them. I hence couldn't verify if this patch really fixes the problem, but all the data in front of me suggests it should.
This patch can easily be reverted or improved upon if a better solution surfaces.
FWIW, there are many reports about the issue scattered around the web; most of the users disabled APST completely to make things work, some just made Linux avoid the deepest sleep state:
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c73 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c74 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c78 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c79 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c80 https://askubuntu.com/questions/1222049/nvmekingston-a2000-sometimes-stops-g... https://community.acer.com/en/discussion/604326/m-2-nvme-ssd-aspire-517-51g-...
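For reference, the user-side workarounds mentioned above typically boil down to the nvme_core.default_ps_max_latency_us module parameter; judging by the power-state latencies listed below, something like the following (example values, not a recommendation made by this patch) covers the two approaches:

	nvme_core.default_ps_max_latency_us=0      (disable APST completely)
	nvme_core.default_ps_max_latency_us=5500   (allow ps 0-3, but not the deepest state ps 4)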
For the record, some data from 'nvme id-ctrl /dev/nvme0'
NVME Identify Controller:
vid       : 0x2646
ssvid     : 0x2646
mn        : KINGSTON SA2000M81000G
fr        : S5Z42105
[...]
ps 0 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:4.60W operational enlat:0 exlat:0 rrt:1 rrl:1 rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:3.80W operational enlat:0 exlat:0 rrt:2 rrl:2 rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.0450W non-operational enlat:2000 exlat:2000 rrt:3 rrl:3 rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0040W non-operational enlat:15000 exlat:15000 rrt:4 rrl:4 rwt:4 rwl:4 idle_power:- active_power:-
Cc: stable@vger.kernel.org # 4.14+ Signed-off-by: Thorsten Leemhuis linux@leemhuis.info Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/pci.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index e2342bf0a52fd..d295594da64d7 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2751,6 +2751,8 @@ static const struct pci_device_id nvme_id_table[] = { { PCI_DEVICE(0x1d1d, 0x2601), /* CNEX Granby */ .driver_data = NVME_QUIRK_LIGHTNVM, }, { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) }, + { PCI_DEVICE(0x2646, 0x2263), /* KINGSTON A2000 NVMe SSD */ + .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001) }, { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) }, { 0, }
From: Sean Christopherson seanjc@google.com
commit ccd85d90ce092bdb047a7f6580f3955393833b22 upstream.
Don't let KVM load when running as an SEV guest, regardless of what CPUID says. Memory is encrypted with a key that is not accessible to the host (L0), thus it's impossible for L0 to emulate SVM, e.g. it'll see garbage when reading the VMCB.
Technically, KVM could decrypt all memory that needs to be accessible to the L0 and use shadow paging so that L0 does not need to shadow NPT, but exposing such information to L0 largely defeats the purpose of running as an SEV guest. This can always be revisited if someone comes up with a use case for running VMs inside SEV guests.
Note, VMLOAD, VMRUN, etc... will also #GP on GPAs with C-bit set, i.e. KVM is doomed even if the SEV guest is debuggable and the hypervisor is willing to decrypt the VMCB. This may or may not be fixed on CPUs that have the SVME_ADDR_CHK fix.
Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210202212017.2486595-1-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index d2dc734f5bd0d..f1e86051aa5f3 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -892,6 +892,11 @@ static int has_svm(void) return 0; }
+ if (sev_active()) { + pr_info("KVM is unsupported when running as an SEV guest\n"); + return 0; + } + return 1; }
From: Russell King rmk+kernel@armlinux.org.uk
commit 39d3454c3513840eb123b3913fda6903e45ce671 upstream.
Building with gcc 4.9.2 reveals a latent bug in the PCI accessors for Footbridge platforms, which causes a fatal alignment fault while accessing IO memory. Fix this by making the assembly volatile.
Cc: stable@vger.kernel.org Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mach-footbridge/dc21285.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm/mach-footbridge/dc21285.c b/arch/arm/mach-footbridge/dc21285.c index 16d71bac0061b..2d1f822700a1a 100644 --- a/arch/arm/mach-footbridge/dc21285.c +++ b/arch/arm/mach-footbridge/dc21285.c @@ -69,15 +69,15 @@ dc21285_read_config(struct pci_bus *bus, unsigned int devfn, int where, if (addr) switch (size) { case 1: - asm("ldrb %0, [%1, %2]" + asm volatile("ldrb %0, [%1, %2]" : "=r" (v) : "r" (addr), "r" (where) : "cc"); break; case 2: - asm("ldrh %0, [%1, %2]" + asm volatile("ldrh %0, [%1, %2]" : "=r" (v) : "r" (addr), "r" (where) : "cc"); break; case 4: - asm("ldr %0, [%1, %2]" + asm volatile("ldr %0, [%1, %2]" : "=r" (v) : "r" (addr), "r" (where) : "cc"); break; } @@ -103,17 +103,17 @@ dc21285_write_config(struct pci_bus *bus, unsigned int devfn, int where, if (addr) switch (size) { case 1: - asm("strb %0, [%1, %2]" + asm volatile("strb %0, [%1, %2]" : : "r" (value), "r" (addr), "r" (where) : "cc"); break; case 2: - asm("strh %0, [%1, %2]" + asm volatile("strh %0, [%1, %2]" : : "r" (value), "r" (addr), "r" (where) : "cc"); break; case 4: - asm("str %0, [%1, %2]" + asm volatile("str %0, [%1, %2]" : : "r" (value), "r" (addr), "r" (where) : "cc"); break;
From: Muchun Song songmuchun@bytedance.com
commit 585fc0d2871c9318c949fbf45b1f081edd489e96 upstream.
If a new hugetlb page is allocated during fallocate it will not be marked as active (set_page_huge_active) which will result in a later isolate_huge_page failure when the page migration code would like to move that page. Such a failure would be unexpected and wrong.
Only export set_page_huge_active; leave clear_page_huge_active static, because it has no external users.
Link: https://lkml.kernel.org/r/20210115124942.46403-3-songmuchun@bytedance.com Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate()) Signed-off-by: Muchun Song songmuchun@bytedance.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: David Hildenbrand david@redhat.com Cc: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hugetlbfs/inode.c | 3 ++- include/linux/hugetlb.h | 3 +++ mm/hugetlb.c | 2 +- 3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index a878f009347cd..47a7b96cfe10f 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -787,9 +787,10 @@ static int hugetlbfs_fallocate_chunk(pgoff_t start, pgoff_t end,
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ set_page_huge_active(page); /* * unlock_page because locked by add_to_page_cache() - * page_put due to reference from alloc_huge_page() + * put_page() due to reference from alloc_huge_page() */ unlock_page(page); put_page(page); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 4a5cf4c984964..eefa6d42140d3 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -556,6 +556,9 @@ static inline void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr set_huge_pte_at(mm, addr, ptep, pte); } #endif + +void set_page_huge_active(struct page *page); + #else /* CONFIG_HUGETLB_PAGE */ struct hstate {}; #define alloc_huge_page(v, a, r) NULL diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 74007332423b5..8dff07b28df43 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1219,7 +1219,7 @@ bool page_huge_active(struct page *page) }
/* never called for tail page */ -static void set_page_huge_active(struct page *page) +void set_page_huge_active(struct page *page) { VM_BUG_ON_PAGE(!PageHeadHuge(page), page); SetPagePrivate(&page[1]);
From: Muchun Song songmuchun@bytedance.com
commit 7ffddd499ba6122b1a07828f023d1d67629aa017 upstream.
There is a race condition between __free_huge_page() and dissolve_free_huge_page().
CPU0:                                  CPU1:

// page_count(page) == 1
put_page(page)
  __free_huge_page(page)
                                       dissolve_free_huge_page(page)
                                         spin_lock(&hugetlb_lock)
                                         // PageHuge(page) && !page_count(page)
                                         update_and_free_page(page)
                                         // page is freed to the buddy
                                         spin_unlock(&hugetlb_lock)
    spin_lock(&hugetlb_lock)
    clear_page_huge_active(page)
    enqueue_huge_page(page)
    // It is wrong, the page is already freed
    spin_unlock(&hugetlb_lock)
The race window is between put_page() and dissolve_free_huge_page().
We should make sure that the page is already on the free list when it is dissolved.
As a result of this race, __free_huge_page() would corrupt page(s) already in the buddy allocator.
Link: https://lkml.kernel.org/r/20210115124942.46403-4-songmuchun@bytedance.com Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Muchun Song songmuchun@bytedance.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Reviewed-by: Oscar Salvador osalvador@suse.de Acked-by: Michal Hocko mhocko@suse.com Cc: David Hildenbrand david@redhat.com Cc: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 8dff07b28df43..0fe62c51c197d 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -70,6 +70,21 @@ DEFINE_SPINLOCK(hugetlb_lock); static int num_fault_mutexes; struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
+static inline bool PageHugeFreed(struct page *head) +{ + return page_private(head + 4) == -1UL; +} + +static inline void SetPageHugeFreed(struct page *head) +{ + set_page_private(head + 4, -1UL); +} + +static inline void ClearPageHugeFreed(struct page *head) +{ + set_page_private(head + 4, 0); +} + /* Forward declaration */ static int hugetlb_acct_memory(struct hstate *h, long delta);
@@ -868,6 +883,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page) list_move(&page->lru, &h->hugepage_freelists[nid]); h->free_huge_pages++; h->free_huge_pages_node[nid]++; + SetPageHugeFreed(page); }
static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid) @@ -885,6 +901,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid) return NULL; list_move(&page->lru, &h->hugepage_activelist); set_page_refcounted(page); + ClearPageHugeFreed(page); h->free_huge_pages--; h->free_huge_pages_node[nid]--; return page; @@ -1323,6 +1340,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid) set_hugetlb_cgroup(page, NULL); h->nr_huge_pages++; h->nr_huge_pages_node[nid]++; + ClearPageHugeFreed(page); spin_unlock(&hugetlb_lock); }
@@ -1559,6 +1577,7 @@ int dissolve_free_huge_page(struct page *page) { int rc = -EBUSY;
+retry: /* Not to disrupt normal path by vainly holding hugetlb_lock */ if (!PageHuge(page)) return 0; @@ -1575,6 +1594,26 @@ int dissolve_free_huge_page(struct page *page) int nid = page_to_nid(head); if (h->free_huge_pages - h->resv_huge_pages == 0) goto out; + + /* + * We should make sure that the page is already on the free list + * when it is dissolved. + */ + if (unlikely(!PageHugeFreed(head))) { + spin_unlock(&hugetlb_lock); + cond_resched(); + + /* + * Theoretically, we should return -EBUSY when we + * encounter this race. In fact, we have a chance + * to successfully dissolve the page if we do a + * retry. Because the race window is quite small. + * If we seize this opportunity, it is an optimization + * for increasing the success rate of dissolving page. + */ + goto retry; + } + /* * Move PageHWPoison flag from head page to the raw error page, * which makes any subpages rather than the error page reusable.
From: Muchun Song songmuchun@bytedance.com
commit 0eb2df2b5629794020f75e94655e1994af63f0d4 upstream.
There is a race between isolate_huge_page() and __free_huge_page().
CPU0:                                  CPU1:

if (PageHuge(page))
                                       put_page(page)
                                         __free_huge_page(page)
                                             spin_lock(&hugetlb_lock)
                                             update_and_free_page(page)
                                               set_compound_page_dtor(page,
                                                 NULL_COMPOUND_DTOR)
                                             spin_unlock(&hugetlb_lock)
isolate_huge_page(page)
  // trigger BUG_ON
  VM_BUG_ON_PAGE(!PageHead(page), page)
  spin_lock(&hugetlb_lock)
  page_huge_active(page)
    // trigger BUG_ON
    VM_BUG_ON_PAGE(!PageHuge(page), page)
  spin_unlock(&hugetlb_lock)
When we isolate a HugeTLB page on CPU0 while it is concurrently being freed to the buddy allocator on CPU1, we can trigger a BUG_ON on CPU0, because the page has already been freed to the buddy allocator.
Link: https://lkml.kernel.org/r/20210115124942.46403-5-songmuchun@bytedance.com Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Muchun Song songmuchun@bytedance.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: David Hildenbrand david@redhat.com Cc: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 0fe62c51c197d..0b27b7be46e0e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5171,9 +5171,9 @@ bool isolate_huge_page(struct page *page, struct list_head *list) { bool ret = true;
- VM_BUG_ON_PAGE(!PageHead(page), page); spin_lock(&hugetlb_lock); - if (!page_huge_active(page) || !get_page_unless_zero(page)) { + if (!PageHeadHuge(page) || !page_huge_active(page) || + !get_page_unless_zero(page)) { ret = false; goto unlock; }
From: Muchun Song songmuchun@bytedance.com
commit ecbf4724e6061b4b01be20f6d797d64d462b2bc8 upstream.
page_huge_active() can be called from scan_movable_pages(), which does not hold a reference count on the HugeTLB page. So when we call page_huge_active() from scan_movable_pages(), the HugeTLB page can be freed in parallel, and with CONFIG_DEBUG_VM enabled we then trigger the BUG_ON inside page_huge_active(). Just remove the VM_BUG_ON_PAGE.
Link: https://lkml.kernel.org/r/20210115124942.46403-6-songmuchun@bytedance.com Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()") Signed-off-by: Muchun Song songmuchun@bytedance.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: David Hildenbrand david@redhat.com Cc: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 0b27b7be46e0e..ee1f7075f409c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1231,8 +1231,7 @@ struct hstate *size_to_hstate(unsigned long size) */ bool page_huge_active(struct page *page) { - VM_BUG_ON_PAGE(!PageHuge(page), page); - return PageHead(page) && PagePrivate(&page[1]); + return PageHeadHuge(page) && PagePrivate(&page[1]); }
/* never called for tail page */
From: Hugh Dickins hughd@google.com
commit 1c2f67308af4c102b4e1e6cd6f69819ae59408e0 upstream.
Sergey reported deadlock between kswapd correctly doing its usual lock_page(page) followed by down_read(page->mapping->i_mmap_rwsem), and madvise(MADV_REMOVE) on an madvise(MADV_HUGEPAGE) area doing down_write(page->mapping->i_mmap_rwsem) followed by lock_page(page).
This happened when shmem_fallocate(punch hole)'s unmap_mapping_range() reaches zap_pmd_range()'s call to __split_huge_pmd(). The same deadlock could occur when partially truncating a mapped huge tmpfs file, or using fallocate(FALLOC_FL_PUNCH_HOLE) on it.
__split_huge_pmd()'s page lock was added in 5.8, to make sure that any concurrent use of reuse_swap_page() (holding page lock) could not catch the anon THP's mapcounts and swapcounts while they were being split.
Fortunately, reuse_swap_page() is never applied to a shmem or file THP (not even by khugepaged, which checks PageSwapCache before calling), and anonymous THPs are never created in shmem or file areas: so that __split_huge_pmd()'s page lock can only be necessary for anonymous THPs, on which there is no risk of deadlock with i_mmap_rwsem.
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2101161409470.2022@eggly.anvils Fixes: c444eb564fb1 ("mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()") Signed-off-by: Hugh Dickins hughd@google.com Reported-by: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Reviewed-by: Andrea Arcangeli aarcange@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 37 +++++++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 14 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 47441c7e80179..d5a44e540188e 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2306,7 +2306,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, spinlock_t *ptl; struct mm_struct *mm = vma->vm_mm; unsigned long haddr = address & HPAGE_PMD_MASK; - bool was_locked = false; + bool do_unlock_page = false; pmd_t _pmd;
mmu_notifier_invalidate_range_start(mm, haddr, haddr + HPAGE_PMD_SIZE); @@ -2319,7 +2319,6 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, VM_BUG_ON(freeze && !page); if (page) { VM_WARN_ON_ONCE(!PageLocked(page)); - was_locked = true; if (page != pmd_page(*pmd)) goto out; } @@ -2328,19 +2327,29 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, if (pmd_trans_huge(*pmd)) { if (!page) { page = pmd_page(*pmd); - if (unlikely(!trylock_page(page))) { - get_page(page); - _pmd = *pmd; - spin_unlock(ptl); - lock_page(page); - spin_lock(ptl); - if (unlikely(!pmd_same(*pmd, _pmd))) { - unlock_page(page); + /* + * An anonymous page must be locked, to ensure that a + * concurrent reuse_swap_page() sees stable mapcount; + * but reuse_swap_page() is not used on shmem or file, + * and page lock must not be taken when zap_pmd_range() + * calls __split_huge_pmd() while i_mmap_lock is held. + */ + if (PageAnon(page)) { + if (unlikely(!trylock_page(page))) { + get_page(page); + _pmd = *pmd; + spin_unlock(ptl); + lock_page(page); + spin_lock(ptl); + if (unlikely(!pmd_same(*pmd, _pmd))) { + unlock_page(page); + put_page(page); + page = NULL; + goto repeat; + } put_page(page); - page = NULL; - goto repeat; } - put_page(page); + do_unlock_page = true; } } if (PageMlocked(page)) @@ -2350,7 +2359,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, __split_huge_pmd_locked(vma, pmd, haddr, freeze); out: spin_unlock(ptl); - if (!was_locked && page) + if (do_unlock_page) unlock_page(page); /* * No need to double call mmu_notifier->invalidate_range() callback.
From: Josh Poimboeuf jpoimboe@redhat.com
commit 20bf2b378729c4a0366a53e2018a0b70ace94bcd upstream.
With retpolines disabled, some configurations of GCC, and specifically the GCC versions 9 and 10 in Ubuntu, will add Intel CET instrumentation to the kernel by default. That breaks certain tracing scenarios by adding a superfluous ENDBR64 instruction before the fentry call, for functions which can be called indirectly.
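To illustrate (hand-written sketch, not disassembly from an actual build): without the instrumentation an ftrace-able function starts with

	func:
		call	__fentry__
		...

whereas with -fcf-protection GCC first emits an ENDBR64 landing pad,

	func:
		endbr64
		call	__fentry__
		...

so the fentry call is no longer the function's first instruction.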
CET instrumentation isn't currently necessary in the kernel, as CET is only supported in user space. Disable it unconditionally and move it into the x86 Makefile, as CET/CFI... enablement should be a per-arch decision anyway.
[ bp: Massage and extend commit message. ]
Fixes: 29be86d7f9cb ("kbuild: add -fcf-protection=none when using retpoline flags") Reported-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Nikolay Borisov nborisov@suse.com Tested-by: Nikolay Borisov nborisov@suse.com Cc: stable@vger.kernel.org Cc: Seth Forshee seth.forshee@canonical.com Cc: Masahiro Yamada yamada.masahiro@socionext.com Link: https://lkml.kernel.org/r/20210128215219.6kct3h2eiustncws@treble Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Makefile | 6 ------ arch/x86/Makefile | 3 +++ 2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/Makefile b/Makefile index 8f326d0652a73..5be9bdc77aa59 100644 --- a/Makefile +++ b/Makefile @@ -859,12 +859,6 @@ KBUILD_CFLAGS += $(call cc-option,-Werror=designated-init) # change __FILE__ to the relative path from the srctree KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
-# ensure -fcf-protection is disabled when using retpoline as it is -# incompatible with -mindirect-branch=thunk-extern -ifdef CONFIG_RETPOLINE -KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none) -endif - # use the deterministic mode of AR if available KBUILD_ARFLAGS := $(call ar-option,D)
diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 4833dd7e2cc03..0303a243b634e 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -132,6 +132,9 @@ else KBUILD_CFLAGS += -mno-red-zone KBUILD_CFLAGS += -mcmodel=kernel
+ # Intel CET isn't enabled in the kernel + KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none) + # -funit-at-a-time shrinks the kernel .text considerably # unfortunately it makes reading oopses harder. KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
From: Dave Hansen dave.hansen@linux.intel.com
commit 25a068b8e9a4eb193d755d58efcb3c98928636e0 upstream.
Jan Kiszka reported that the x2apic_wrmsr_fence() function uses a plain MFENCE while the Intel SDM (10.12.3 MSR Access in x2APIC Mode) calls for MFENCE; LFENCE.
Short summary: we have special MSRs that have weaker ordering than all the rest. Add fencing consistent with current SDM recommendations.
This is not known to cause any issues in practice, only in theory.
Longer story below:
The reason the kernel uses a different semantic is that the SDM changed (roughly in late 2017). The SDM changed because folks at Intel were auditing all of the recommended fences in the SDM and realized that the x2apic fences were insufficient.
Why was the plain MFENCE judged insufficient?
WRMSR itself is normally a serializing instruction. No fences are needed because the instruction itself serializes everything.
But, there are explicit exceptions for this serializing behavior written into the WRMSR instruction documentation for two classes of MSRs: IA32_TSC_DEADLINE and the X2APIC MSRs.
Back to x2apic: WRMSR is *not* serializing in this specific case. But why is MFENCE insufficient? MFENCE makes writes visible, but only affects load/store instructions. WRMSR is unfortunately not a load/store instruction and is unaffected by MFENCE. This means that a non-serializing WRMSR could be reordered by the CPU to execute before the writes made visible by the MFENCE have even occurred in the first place.
This means that an x2apic IPI could theoretically be triggered before there is any (visible) data to process.
Does this affect anything in practice? I honestly don't know. It seems quite possible that by the time an interrupt gets to consume the (not yet) MFENCE'd data, it has become visible, mostly by accident.
To be safe, add the SDM-recommended fences for all x2apic WRMSRs.
This also leaves open the question of the _other_ weakly-ordered WRMSR: MSR_IA32_TSC_DEADLINE. While it has the same ordering architecture as the x2APIC MSRs, it seems substantially less likely to be a problem in practice. While writes to the in-memory Local Vector Table (LVT) might theoretically be reordered with respect to a weakly-ordered WRMSR like TSC_DEADLINE, the SDM has this to say:
In x2APIC mode, the WRMSR instruction is used to write to the LVT entry. The processor ensures the ordering of this write and any subsequent WRMSR to the deadline; no fencing is required.
But, that might still leave xAPIC exposed. The safest thing to do for now is to add the extra, recommended LFENCE.
[ bp: Massage commit message, fix typos, drop accidentally added newline to tools/arch/x86/include/asm/barrier.h. ]
Reported-by: Jan Kiszka jan.kiszka@siemens.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200305174708.F77040DD@viggo.jf.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/apic.h | 10 ---------- arch/x86/include/asm/barrier.h | 18 ++++++++++++++++++ arch/x86/kernel/apic/apic.c | 4 ++++ arch/x86/kernel/apic/x2apic_cluster.c | 6 ++++-- arch/x86/kernel/apic/x2apic_phys.c | 6 ++++-- 5 files changed, 30 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index f1286106f3af5..8ac30536c99f6 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -194,16 +194,6 @@ static inline bool apic_needs_pit(void) { return true; } #endif /* !CONFIG_X86_LOCAL_APIC */
#ifdef CONFIG_X86_X2APIC -/* - * Make previous memory operations globally visible before - * sending the IPI through x2apic wrmsr. We need a serializing instruction or - * mfence for this. - */ -static inline void x2apic_wrmsr_fence(void) -{ - asm volatile("mfence" : : : "memory"); -} - static inline void native_apic_msr_write(u32 reg, u32 v) { if (reg == APIC_DFR || reg == APIC_ID || reg == APIC_LDR || diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h index 84f848c2541a6..dd22cffc6b3fc 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -85,4 +85,22 @@ do { \
#include <asm-generic/barrier.h>
+/* + * Make previous memory operations globally visible before + * a WRMSR. + * + * MFENCE makes writes visible, but only affects load/store + * instructions. WRMSR is unfortunately not a load/store + * instruction and is unaffected by MFENCE. The LFENCE ensures + * that the WRMSR is not reordered. + * + * Most WRMSRs are full serializing instructions themselves and + * do not require this barrier. This is only required for the + * IA32_TSC_DEADLINE and X2APIC MSRs. + */ +static inline void weak_wrmsr_fence(void) +{ + asm volatile("mfence; lfence" : : : "memory"); +} + #endif /* _ASM_X86_BARRIER_H */ diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index a97cd6a32cabe..cd216bdc9e904 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -41,6 +41,7 @@ #include <asm/x86_init.h> #include <asm/pgalloc.h> #include <linux/atomic.h> +#include <asm/barrier.h> #include <asm/mpspec.h> #include <asm/i8259.h> #include <asm/proto.h> @@ -470,6 +471,9 @@ static int lapic_next_deadline(unsigned long delta, { u64 tsc;
+ /* This MSR is special and need a special fence: */ + weak_wrmsr_fence(); + tsc = rdtsc(); wrmsrl(MSR_IA32_TSC_DEADLINE, tsc + (((u64) delta) * TSC_DIVISOR)); return 0; diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index 145517934171e..b2c7e3bc55d22 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -31,7 +31,8 @@ static void x2apic_send_IPI(int cpu, int vector) { u32 dest = per_cpu(x86_cpu_to_logical_apicid, cpu);
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence(); __x2apic_send_IPI_dest(dest, vector, APIC_DEST_LOGICAL); }
@@ -43,7 +44,8 @@ __x2apic_send_IPI_mask(const struct cpumask *mask, int vector, int apic_dest) unsigned long flags; u32 dest;
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence(); local_irq_save(flags);
tmpmsk = this_cpu_cpumask_var_ptr(ipi_mask); diff --git a/arch/x86/kernel/apic/x2apic_phys.c b/arch/x86/kernel/apic/x2apic_phys.c index ed56d2850e96a..8e70c2ba21b3d 100644 --- a/arch/x86/kernel/apic/x2apic_phys.c +++ b/arch/x86/kernel/apic/x2apic_phys.c @@ -48,7 +48,8 @@ static void x2apic_send_IPI(int cpu, int vector) { u32 dest = per_cpu(x86_cpu_to_apicid, cpu);
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence(); __x2apic_send_IPI_dest(dest, vector, APIC_DEST_PHYSICAL); }
@@ -59,7 +60,8 @@ __x2apic_send_IPI_mask(const struct cpumask *mask, int vector, int apic_dest) unsigned long this_cpu; unsigned long flags;
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence();
local_irq_save(flags);
From: Benjamin Valentin benpicco@googlemail.com
commit 9bbd77d5bbc9aff8cb74d805c31751f5f0691ba8 upstream.
There is a fork of this driver on GitHub [0] that has been updated with new device IDs.
Merge those into the mainline driver, so the out-of-tree fork is not needed for users of those devices anymore.
[0] https://github.com/paroj/xpad
Signed-off-by: Benjamin Valentin benpicco@googlemail.com Link: https://lore.kernel.org/r/20210121142523.1b6b050f@rechenknecht2k11 Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/joystick/xpad.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c index ee3ff0894d093..ef4e8423843f3 100644 --- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -229,9 +229,17 @@ static const struct xpad_device { { 0x0e6f, 0x0213, "Afterglow Gamepad for Xbox 360", 0, XTYPE_XBOX360 }, { 0x0e6f, 0x021f, "Rock Candy Gamepad for Xbox 360", 0, XTYPE_XBOX360 }, { 0x0e6f, 0x0246, "Rock Candy Gamepad for Xbox One 2015", 0, XTYPE_XBOXONE }, - { 0x0e6f, 0x02ab, "PDP Controller for Xbox One", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a0, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a1, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a2, "PDP Wired Controller for Xbox One - Crimson Red", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x02a4, "PDP Wired Controller for Xbox One - Stealth Series", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x02a6, "PDP Wired Controller for Xbox One - Camo Series", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a7, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a8, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02ab, "PDP Controller for Xbox One", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02ad, "PDP Wired Controller for Xbox One - Stealth Series", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02b3, "Afterglow Prismatic Wired Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02b8, "Afterglow Prismatic Wired Controller", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x0301, "Logic3 Controller", 0, XTYPE_XBOX360 }, { 0x0e6f, 0x0346, "Rock Candy Gamepad for Xbox One 2016", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x0401, "Logic3 Controller", 0, XTYPE_XBOX360 }, @@ -310,6 +318,9 @@ static const struct xpad_device { { 0x1bad, 0xfa01, "MadCatz GamePad", 0, XTYPE_XBOX360 }, { 0x1bad, 0xfd00, "Razer Onza TE", 0, XTYPE_XBOX360 }, { 0x1bad, 0xfd01, "Razer Onza", 0, XTYPE_XBOX360 }, + { 0x20d6, 0x2001, "BDA Xbox Series X Wired Controller", 0, XTYPE_XBOXONE }, + { 0x20d6, 0x281f, "PowerA Wired Controller For Xbox 360", 0, XTYPE_XBOX360 }, + { 0x2e24, 0x0652, "Hyperkin Duke X-Box One pad", 0, XTYPE_XBOXONE }, { 0x24c6, 0x5000, "Razer Atrox Arcade Stick", MAP_TRIGGERS_TO_BUTTONS, XTYPE_XBOX360 }, { 0x24c6, 0x5300, "PowerA MINI PROEX Controller", 0, XTYPE_XBOX360 }, { 0x24c6, 0x5303, "Xbox Airflo wired controller", 0, XTYPE_XBOX360 }, @@ -443,8 +454,12 @@ static const struct usb_device_id xpad_table[] = { XPAD_XBOX360_VENDOR(0x162e), /* Joytech X-Box 360 controllers */ XPAD_XBOX360_VENDOR(0x1689), /* Razer Onza */ XPAD_XBOX360_VENDOR(0x1bad), /* Harminix Rock Band Guitar and Drums */ + XPAD_XBOX360_VENDOR(0x20d6), /* PowerA Controllers */ + XPAD_XBOXONE_VENDOR(0x20d6), /* PowerA Controllers */ XPAD_XBOX360_VENDOR(0x24c6), /* PowerA Controllers */ XPAD_XBOXONE_VENDOR(0x24c6), /* PowerA Controllers */ + XPAD_XBOXONE_VENDOR(0x2e24), /* Hyperkin Duke X-Box One pad */ + XPAD_XBOX360_VENDOR(0x2f24), /* GameSir Controllers */ { } };
From: Nadav Amit namit@vmware.com
commit 29b32839725f8c89a41cb6ee054c85f3116ea8b5 upstream.
When an Intel IOMMU is virtualized, and a physical device is passed-through to the VM, changes of the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide "caching-mode" capability that a virtual IOMMU uses to report that the IOMMU is virtualized and a TLB flush is needed after mapping to allow the hypervisor to propagate virtual IOMMU mappings to the physical IOMMU. To the best of my knowledge no real physical IOMMU reports "caching-mode" as turned on.
Synchronizing the virtual and the physical IOMMU tables is expensive if the hypervisor is unaware which PTEs have changed, as the hypervisor is required to walk all the virtualized tables and look for changes. Consequently, domain flushes are much more expensive than page-specific flushes on virtualized IOMMUs with passthrough devices. The kernel therefore exploited the "caching-mode" indication to avoid domain flushing and use page-specific flushing in virtualized environments. See commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing"). Now, when batched TLB flushing is used (the default), full TLB domain flushes are performed frequently, requiring the hypervisor to perform expensive synchronization between the virtual TLB and the physical one.
Getting batched TLB flushes to use page-specific invalidations again in such circumstances is not easy, since the TLB invalidation scheme assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance benefit from using batched TLB invalidations is likely to be much smaller than the overhead of the virtual-to-physical IOMMU page-tables synchronization.
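In practical terms, the change below is roughly equivalent to always booting with intel_iommu=strict when the (virtual) IOMMU advertises caching mode: it forces intel_iommu_strict at init time for such IOMMUs, so IOTLB flushes are issued synchronously instead of being batched and deferred.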
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Nadav Amit namit@vmware.com Cc: David Woodhouse dwmw2@infradead.org Cc: Lu Baolu baolu.lu@linux.intel.com Cc: Joerg Roedel joro@8bytes.org Cc: Will Deacon will@kernel.org Cc: stable@vger.kernel.org Acked-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Nadav Amit namit@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/intel-iommu.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 93a5cf3076585..f51ae00867866 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3386,6 +3386,12 @@ static int __init init_dmars(void)
if (!ecap_pass_through(iommu->ecap)) hw_pass_through = 0; + + if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) { + pr_info("Disable batched IOTLB flush due to virtualization"); + intel_iommu_strict = 1; + } + #ifdef CONFIG_INTEL_IOMMU_SVM if (pasid_enabled(iommu)) intel_svm_init(iommu);
From: Xiao Ni xni@redhat.com
commit dc5d17a3c39b06aef866afca19245a9cfb533a79 upstream.
One customer reports a crash problem which is caused by a flush request. It triggers a warning before the crash.
	/* new request after previous flush is completed */
	if (ktime_after(req_start, mddev->prev_flush_start)) {
		WARN_ON(mddev->flush_bio);
		mddev->flush_bio = bio;
		bio = NULL;
	}
The WARN_ON is triggered. We use a spin lock to protect prev_flush_start and flush_bio in md_flush_request(), but there is no lock protection in md_submit_flush_data(). It can set flush_bio to NULL first because the compiler may reorder the write instructions.
For example, flush bio1 sets flush_bio to NULL first in md_submit_flush_data(). An interrupt, or vmware causing an extended stall, can happen between updating flush_bio and prev_flush_start. Because flush_bio is NULL, flush bio2 can get the lock and be submitted to the underlying disks. Then flush bio1 updates prev_flush_start after the interrupt or extended stall.
Then flush bio3 enters md_flush_request(). The start time req_start is behind prev_flush_start, and flush_bio is not NULL (flush bio2 hasn't finished), so the WARN_ON can trigger now. It then calls INIT_WORK again. INIT_WORK() will re-initialize the list pointers in the work_struct, which can result in a corrupted work list and the work_struct being queued a second time. With the work list corrupted, invalid work items can be used, causing a crash in process_one_work().
We need to make sure that only one flush bio can be handled at the same time. So add a spin lock in md_submit_flush_data() to protect prev_flush_start and flush_bio in an atomic way.
Reviewed-by: David Jeffery djeffery@redhat.com Signed-off-by: Xiao Ni xni@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Jack Wang jinpu.wang@cloud.ionos.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/md.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c index eb378a63d0129..941dbfa7c06d5 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -474,8 +474,10 @@ static void md_submit_flush_data(struct work_struct *ws) * could wait for this and below md_handle_request could wait for those * bios because of suspend check */ + spin_lock_irq(&mddev->lock); mddev->last_flush = mddev->start_flush; mddev->flush_bio = NULL; + spin_unlock_irq(&mddev->lock); wake_up(&mddev->sb_wait);
if (bio->bi_iter.bi_size == 0) {
From: Vadim Fedorenko vfedorenko@novek.ru
commit 28e104d00281ade30250b24e098bf50887671ea4 upstream.
dev->hard_header_len for a tunnel interface is set only when header_ops are set too, and it already contains the full overhead of any tunnel encapsulation. That's why there is no need to account for this overhead twice in the MTU calculation.
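As a hypothetical worked example (numbers made up for illustration): if the tunnel encapsulation overhead t_hlen is 24 bytes and dev->hard_header_len is set to the same 24 bytes for that encapsulation, the old calculation mtu -= (dev->hard_header_len + t_hlen) subtracted 48 bytes even though only 24 bytes are actually added to each packet; with this fix only t_hlen (24 bytes) is subtracted.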
Fixes: fdafed459998 ("ip_gre: set dev->hard_header_len and dev->needed_headroom properly") Reported-by: Slava Bacherikov mail@slava.cc Signed-off-by: Vadim Fedorenko vfedorenko@novek.ru Link: https://lore.kernel.org/r/1611959267-20536-1-git-send-email-vfedorenko@novek... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_tunnel.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 1cad731039c37..bdd073ea300ae 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -330,7 +330,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev) }
dev->needed_headroom = t_hlen + hlen; - mtu -= (dev->hard_header_len + t_hlen); + mtu -= t_hlen;
if (mtu < IPV4_MIN_MTU) mtu = IPV4_MIN_MTU; @@ -360,7 +360,7 @@ static struct ip_tunnel *ip_tunnel_create(struct net *net, nt = netdev_priv(dev); t_hlen = nt->hlen + sizeof(struct iphdr); dev->min_mtu = ETH_MIN_MTU; - dev->max_mtu = IP_MAX_MTU - dev->hard_header_len - t_hlen; + dev->max_mtu = IP_MAX_MTU - t_hlen; ip_tunnel_add(itn, nt); return nt;
@@ -502,12 +502,11 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb, const struct iphdr *inner_iph) { struct ip_tunnel *tunnel = netdev_priv(dev); - int pkt_size = skb->len - tunnel->hlen - dev->hard_header_len; + int pkt_size = skb->len - tunnel->hlen; int mtu;
if (df) - mtu = dst_mtu(&rt->dst) - dev->hard_header_len - - sizeof(struct iphdr) - tunnel->hlen; + mtu = dst_mtu(&rt->dst) - (sizeof(struct iphdr) + tunnel->hlen); else mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
@@ -935,7 +934,7 @@ int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict) { struct ip_tunnel *tunnel = netdev_priv(dev); int t_hlen = tunnel->hlen + sizeof(struct iphdr); - int max_mtu = IP_MAX_MTU - dev->hard_header_len - t_hlen; + int max_mtu = IP_MAX_MTU - t_hlen;
if (new_mtu < ETH_MIN_MTU) return -EINVAL; @@ -1112,10 +1111,9 @@ int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
mtu = ip_tunnel_bind_dev(dev); if (tb[IFLA_MTU]) { - unsigned int max = IP_MAX_MTU - dev->hard_header_len - nt->hlen; + unsigned int max = IP_MAX_MTU - (nt->hlen + sizeof(struct iphdr));
- mtu = clamp(dev->mtu, (unsigned int)ETH_MIN_MTU, - (unsigned int)(max - sizeof(struct iphdr))); + mtu = clamp(dev->mtu, (unsigned int)ETH_MIN_MTU, max); }
err = dev_set_mtu(dev, mtu);
From: DENG Qingfang dqfext@gmail.com
commit f72f2fb8fb6be095b98af5d740ac50cffd0b0cae upstream.
Having multiple destination ports for a unicast address does not make sense. Make port_db_load_purge override the existing unicast portvec instead of adding a new port bit.
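As an illustrative example (made-up port numbers): if a unicast address was learned behind port 1, the ATU entry's portvec is 0b0010; when the same address is then added as a static entry on port 3, the old code produced 0b1010 (two destination ports for one unicast address), whereas with this change the portvec is overwritten to 0b1000, i.e. port 3 only.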
Fixes: 884729399260 ("net: dsa: mv88e6xxx: handle multiple ports in ATU") Signed-off-by: DENG Qingfang dqfext@gmail.com Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20210130134334.10243-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/mv88e6xxx/chip.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 6fa8aa69b4180..e04b7fa068aff 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1658,7 +1658,11 @@ static int mv88e6xxx_port_db_load_purge(struct mv88e6xxx_chip *chip, int port, if (!entry.portvec) entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UNUSED; } else { - entry.portvec |= BIT(port); + if (state == MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC) + entry.portvec = BIT(port); + else + entry.portvec |= BIT(port); + entry.state = state; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
Merge 39 patches from the 4.19.175 stable branch (39 total), with 0 patches already merged.
Tested-by: Pavel Machek (CIP) pavel@denx.de Tested-by: Shuah Khan skhan@linuxfoundation.org Tested-by: Igor Matheus Andrade Torrente igormtorrente@gmail.com Tested-by: Jason Self jason@bluehome.net Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Guenter Roeck linux@roeck-us.net Tested-by: Ross Schmidt ross.schm.dev@gmail.com Link: https://lore.kernel.org/r/20210208145806.141056364@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile index 5be9bdc77aa59..db37b39fae693 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 4 PATCHLEVEL = 19 -SUBLEVEL = 174 +SUBLEVEL = 175 EXTRAVERSION = NAME = "People's Front"