Al Viro (4): epoll: do not insert into poll queues until all sanity checks are done epoll: replace ->visited/visited_list with generation count epoll: EPOLL_CTL_ADD: close the race in decision to take fast path ep_create_wakeup_source(): dentry name can change under you...
Bartosz Golaszewski (1): gpio: mockup: fix resource leak in error path
Bryan O'Donoghue (1): USB: gadget: f_ncm: Fix NDP16 datagram validation
Chaitanya Kulkarni (1): nvme-core: get/put ctrl and transport module in nvme_dev_open/release()
Chris Packham (2): spi: fsl-espi: Only process interrupts for expected events pinctrl: mvebu: Fix i2c sda definition for 98DX3236
Dinh Nguyen (1): clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk
Felix Fietkau (1): mac80211: do not allow bigger VHT MPDUs than the hardware supports
Greg Kroah-Hartman (1): Linux 4.19.150
Hans de Goede (1): mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models
James Smart (1): nvme-fc: fail new connections to a deleted host or remote port
Jean Delvare (1): drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
Jeffrey Mitchell (1): nfs: Fix security label length not being reset
Jiri Kosina (1): Input: i8042 - add nopnp quirk for Acer Aspire 5 A515
Laurent Dufour (2): mm: replace memmap_context by meminit_context mm: don't rely on system state to detect hot-plug operations
Lucy Yan (1): net: dec: de2104x: Increase receive ring size for Tulip
Marek Szyprowski (1): clk: samsung: exynos4: mark 'chipid' clock as CLK_IGNORE_UNUSED
Martin Cerveny (1): drm/sun4i: mixer: Extend regmap max_register
Nicolas VINCENT (1): i2c: cpm: Fix i2c_ram structure
Olympia Giannou (1): rndis_host: increase sleep time in the query-response loop
Sebastien Boeuf (1): net: virtio_vsock: Enhance connection semantics
Stefano Garzarella (3): vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock vsock/virtio: stop workers during the .remove() vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock()
Steven Rostedt (VMware) (1): ftrace: Move RCU is watching check after recursion check
Taiping Lai (1): gpio: sprd: Clear interrupt when setting the type as edge
Thibaut Sautereau (1): random32: Restore __latent_entropy attribute on net_rand_state
Vincent Huang (1): Input: trackpoint - enable Synaptics trackpoints
Will McVicker (1): netfilter: ctnetlink: add a range check for l3/l4 protonum
Xie He (3): drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices drivers/net/wan/lapbether: Make skb->protocol consistent with the header drivers/net/wan/hdlc: Set skb->protocol before transmitting
Yu Kuai (1): iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()
dillon min (1): gpio: tc35894: fix up tc35894 interrupt configuration
Makefile | 2 +- arch/ia64/mm/init.c | 6 +- drivers/base/node.c | 84 ++++--- drivers/clk/samsung/clk-exynos4.c | 4 +- drivers/clk/socfpga/clk-s10.c | 2 +- drivers/gpio/gpio-mockup.c | 2 + drivers/gpio/gpio-sprd.c | 3 + drivers/gpio/gpio-tc3589x.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 +- drivers/gpu/drm/sun4i/sun8i_mixer.c | 2 +- drivers/i2c/busses/i2c-cpm.c | 3 + drivers/input/mouse/trackpoint.c | 2 + drivers/input/serio/i8042-x86ia64io.h | 7 + drivers/iommu/exynos-iommu.c | 8 +- drivers/mmc/host/sdhci-pci-core.c | 3 +- drivers/net/ethernet/dec/tulip/de2104x.c | 2 +- drivers/net/usb/rndis_host.c | 2 +- drivers/net/wan/hdlc_cisco.c | 1 + drivers/net/wan/hdlc_fr.c | 6 +- drivers/net/wan/hdlc_ppp.c | 1 + drivers/net/wan/lapbether.c | 4 +- drivers/nvme/host/core.c | 15 ++ drivers/nvme/host/fc.c | 6 +- drivers/pinctrl/mvebu/pinctrl-armada-xp.c | 2 +- drivers/spi/spi-fsl-espi.c | 5 +- drivers/usb/gadget/function/f_ncm.c | 30 +-- drivers/vhost/vsock.c | 94 +++---- fs/eventpoll.c | 71 +++--- fs/nfs/dir.c | 3 + include/linux/mm.h | 2 +- include/linux/mmzone.h | 11 +- include/linux/node.h | 11 +- include/linux/virtio_vsock.h | 3 +- kernel/trace/ftrace.c | 6 +- lib/random32.c | 2 +- mm/memory_hotplug.c | 5 +- mm/page_alloc.c | 11 +- net/mac80211/vht.c | 8 +- net/netfilter/nf_conntrack_netlink.c | 2 + net/vmw_vsock/virtio_transport.c | 265 +++++++++++++------- net/vmw_vsock/virtio_transport_common.c | 13 +- 41 files changed, 416 insertions(+), 297 deletions(-)
From: Hans de Goede hdegoede@redhat.com
commit afd7f30886b0b445a4240a99020458a9772f2b89 upstream.
Commit bedf9fc01ff1 ("mmc: sdhci: Workaround broken command queuing on Intel GLK"), disabled command-queuing on Intel GLK based LENOVO models because of it being broken due to what is believed to be a bug in the BIOS.
It seems that the BIOS of some IRBIS models, including the IRBIS NB111 model has the same issue, so disable command queuing there too.
Fixes: bedf9fc01ff1 ("mmc: sdhci: Workaround broken command queuing on Intel GLK") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209397 Reported-and-tested-by: RussianNeuroMancer russianneuromancer@ya.ru Signed-off-by: Hans de Goede hdegoede@redhat.com Acked-by: Adrian Hunter adrian.hunter@intel.com Link: https://lore.kernel.org/r/20200927104821.5676-1-hdegoede@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-pci-core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c index 35168b47afe6..a411300f9d6d 100644 --- a/drivers/mmc/host/sdhci-pci-core.c +++ b/drivers/mmc/host/sdhci-pci-core.c @@ -739,7 +739,8 @@ static int byt_emmc_probe_slot(struct sdhci_pci_slot *slot) static bool glk_broken_cqhci(struct sdhci_pci_slot *slot) { return slot->chip->pdev->device == PCI_DEVICE_ID_INTEL_GLK_EMMC && - dmi_match(DMI_BIOS_VENDOR, "LENOVO"); + (dmi_match(DMI_BIOS_VENDOR, "LENOVO") || + dmi_match(DMI_SYS_VENDOR, "IRBIS")); }
static int glk_emmc_probe_slot(struct sdhci_pci_slot *slot)
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 2b405533c2560d7878199c57d95a39151351df72 upstream.
commit 2b74b0a04d3e ("USB: gadget: f_ncm: add bounds checks to ncm_unwrap_ntb()") adds important bounds checking however it unfortunately also introduces a bug with respect to section 3.3.1 of the NCM specification.
wDatagramIndex[1] : "Byte index, in little endian, of the second datagram described by this NDP16. If zero, then this marks the end of the sequence of datagrams in this NDP16."
wDatagramLength[1]: "Byte length, in little endian, of the second datagram described by this NDP16. If zero, then this marks the end of the sequence of datagrams in this NDP16."
wDatagramIndex[1] and wDatagramLength[1] respectively then may be zero but that does not mean we should throw away the data referenced by wDatagramIndex[0] and wDatagramLength[0] as is currently the case.
Breaking the loop on (index2 == 0 || dg_len2 == 0) should come at the end as was previously the case and checks for index2 and dg_len2 should be removed since zero is valid.
I'm not sure how much testing the above patch received but for me right now after enumeration ping doesn't work. Reverting the commit restores ping, scp, etc.
The extra validation associated with wDatagramIndex[0] and wDatagramLength[0] appears to be valid so, this change removes the incorrect restriction on wDatagramIndex[1] and wDatagramLength[1] restoring data processing between host and device.
Fixes: 2b74b0a04d3e ("USB: gadget: f_ncm: add bounds checks to ncm_unwrap_ntb()") Cc: Ilja Van Sprundel ivansprundel@ioactive.com Cc: Brooke Basile brookebasile@gmail.com Cc: stable stable@kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Link: https://lore.kernel.org/r/20200920170158.1217068-1-bryan.odonoghue@linaro.or... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_ncm.c | 30 ++--------------------------- 1 file changed, 2 insertions(+), 28 deletions(-)
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c index 8d8c81d43069..e2eefdd8bf78 100644 --- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -1192,7 +1192,6 @@ static int ncm_unwrap_ntb(struct gether *port, const struct ndp_parser_opts *opts = ncm->parser_opts; unsigned crc_len = ncm->is_crc ? sizeof(uint32_t) : 0; int dgram_counter; - bool ndp_after_header;
/* dwSignature */ if (get_unaligned_le32(tmp) != opts->nth_sign) { @@ -1219,7 +1218,6 @@ static int ncm_unwrap_ntb(struct gether *port, }
ndp_index = get_ncm(&tmp, opts->ndp_index); - ndp_after_header = false;
/* Run through all the NDP's in the NTB */ do { @@ -1235,8 +1233,6 @@ static int ncm_unwrap_ntb(struct gether *port, ndp_index); goto err; } - if (ndp_index == opts->nth_size) - ndp_after_header = true;
/* * walk through NDP @@ -1315,37 +1311,13 @@ static int ncm_unwrap_ntb(struct gether *port, index2 = get_ncm(&tmp, opts->dgram_item_len); dg_len2 = get_ncm(&tmp, opts->dgram_item_len);
- if (index2 == 0 || dg_len2 == 0) - break; - /* wDatagramIndex[1] */ - if (ndp_after_header) { - if (index2 < opts->nth_size + opts->ndp_size) { - INFO(port->func.config->cdev, - "Bad index: %#X\n", index2); - goto err; - } - } else { - if (index2 < opts->nth_size + opts->dpe_size) { - INFO(port->func.config->cdev, - "Bad index: %#X\n", index2); - goto err; - } - } if (index2 > block_len - opts->dpe_size) { INFO(port->func.config->cdev, "Bad index: %#X\n", index2); goto err; }
- /* wDatagramLength[1] */ - if ((dg_len2 < 14 + crc_len) || - (dg_len2 > frame_max)) { - INFO(port->func.config->cdev, - "Bad dgram length: %#X\n", dg_len); - goto err; - } - /* * Copy the data into a new skb. * This ensures the truesize is correct @@ -1362,6 +1334,8 @@ static int ncm_unwrap_ntb(struct gether *port, ndp_len -= 2 * (opts->dgram_item_len * 2);
dgram_counter++; + if (index2 == 0 || dg_len2 == 0) + break; } while (ndp_len > 2 * (opts->dgram_item_len * 2)); } while (ndp_index);
From: Bartosz Golaszewski bgolaszewski@baylibre.com
commit 1b02d9e770cd7087f34c743f85ccf5ea8372b047 upstream.
If the module init function fails after creating the debugs directory, it's never removed. Add proper cleanup calls to avoid this resource leak.
Fixes: 9202ba2397d1 ("gpio: mockup: implement event injecting over debugfs") Cc: stable@vger.kernel.org Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-mockup.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpio/gpio-mockup.c b/drivers/gpio/gpio-mockup.c index 945bd13e5e79..cab324eb7df2 100644 --- a/drivers/gpio/gpio-mockup.c +++ b/drivers/gpio/gpio-mockup.c @@ -367,6 +367,7 @@ static int __init gpio_mockup_init(void) err = platform_driver_register(&gpio_mockup_driver); if (err) { gpio_mockup_err("error registering platform driver\n"); + debugfs_remove_recursive(gpio_mockup_dbg_dir); return err; }
@@ -386,6 +387,7 @@ static int __init gpio_mockup_init(void) gpio_mockup_err("error registering device"); platform_driver_unregister(&gpio_mockup_driver); gpio_mockup_unregister_pdevs(); + debugfs_remove_recursive(gpio_mockup_dbg_dir); return PTR_ERR(pdev); }
From: dillon min dillon.minfei@gmail.com
commit 214b0e1ad01abf4c1f6d8d28fa096bf167e47cef upstream.
The offset of regmap is incorrect, j * 8 is move to the wrong register.
for example:
asume i = 0, j = 1. we want to set KPY5 as interrupt falling edge mode, regmap[0][1] should be TC3589x_GPIOIBE1 0xcd but, regmap[i] + j * 8 = TC3589x_GPIOIBE0 + 8 ,point to 0xd4, this is TC3589x_GPIOIE2 not TC3589x_GPIOIBE1.
Fixes: d88b25be3584 ("gpio: Add TC35892 GPIO driver") Cc: Cc: stable@vger.kernel.org Signed-off-by: dillon min dillon.minfei@gmail.com Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-tc3589x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-tc3589x.c b/drivers/gpio/gpio-tc3589x.c index 91a8ef8e7f3f..1436098b1614 100644 --- a/drivers/gpio/gpio-tc3589x.c +++ b/drivers/gpio/gpio-tc3589x.c @@ -209,7 +209,7 @@ static void tc3589x_gpio_irq_sync_unlock(struct irq_data *d) continue;
tc3589x_gpio->oldregs[i][j] = new; - tc3589x_reg_write(tc3589x, regmap[i] + j * 8, new); + tc3589x_reg_write(tc3589x, regmap[i] + j, new); } }
From: Dinh Nguyen dinguyen@kernel.org
commit b02cf0c4736c65c6667f396efaae6b5521e82abf upstream.
The fixed divider the emac_ptp_free_clk should be 2, not 4.
Fixes: 07afb8db7340 ("clk: socfpga: stratix10: add clock driver for Stratix10 platform") Cc: stable@vger.kernel.org Signed-off-by: Dinh Nguyen dinguyen@kernel.org Link: https://lore.kernel.org/r/20200831202657.8224-1-dinguyen@kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/socfpga/clk-s10.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/socfpga/clk-s10.c b/drivers/clk/socfpga/clk-s10.c index 5bed36e12951..7327e90735c8 100644 --- a/drivers/clk/socfpga/clk-s10.c +++ b/drivers/clk/socfpga/clk-s10.c @@ -107,7 +107,7 @@ static const struct stratix10_perip_cnt_clock s10_main_perip_cnt_clks[] = { { STRATIX10_EMAC_B_FREE_CLK, "emacb_free_clk", NULL, emacb_free_mux, ARRAY_SIZE(emacb_free_mux), 0, 0, 2, 0xB0, 1}, { STRATIX10_EMAC_PTP_FREE_CLK, "emac_ptp_free_clk", NULL, emac_ptp_free_mux, - ARRAY_SIZE(emac_ptp_free_mux), 0, 0, 4, 0xB0, 2}, + ARRAY_SIZE(emac_ptp_free_mux), 0, 0, 2, 0xB0, 2}, { STRATIX10_GPIO_DB_FREE_CLK, "gpio_db_free_clk", NULL, gpio_db_free_mux, ARRAY_SIZE(gpio_db_free_mux), 0, 0, 0, 0xB0, 3}, { STRATIX10_SDMMC_FREE_CLK, "sdmmc_free_clk", NULL, sdmmc_free_mux,
From: Stefano Garzarella sgarzare@redhat.com
[ Upstream commit 9c7a5582f5d720dc35cfcc42ccaded69f0642e4a ]
Some callbacks used by the upper layers can run while we are in the .remove(). A potential use-after-free can happen, because we free the_virtio_vsock without knowing if the callbacks are over or not.
To solve this issue we move the assignment of the_virtio_vsock at the end of .probe(), when we finished all the initialization, and at the beginning of .remove(), before to release resources. For the same reason, we do the same also for the vdev->priv.
We use RCU to be sure that all callbacks that use the_virtio_vsock ended before freeing it. This is not required for callbacks that use vdev->priv, because after the vdev->config->del_vqs() we are sure that they are ended and will no longer be invoked.
We also take the mutex during the .remove() to avoid that .probe() can run while we are resetting the device.
Signed-off-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/vmw_vsock/virtio_transport.c | 70 +++++++++++++++++++++----------- 1 file changed, 46 insertions(+), 24 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 96ab344f17bb..68186419c445 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -66,19 +66,22 @@ struct virtio_vsock { u32 guest_cid; };
-static struct virtio_vsock *virtio_vsock_get(void) -{ - return the_virtio_vsock; -} - static u32 virtio_transport_get_local_cid(void) { - struct virtio_vsock *vsock = virtio_vsock_get(); + struct virtio_vsock *vsock; + u32 ret;
- if (!vsock) - return VMADDR_CID_ANY; + rcu_read_lock(); + vsock = rcu_dereference(the_virtio_vsock); + if (!vsock) { + ret = VMADDR_CID_ANY; + goto out_rcu; + }
- return vsock->guest_cid; + ret = vsock->guest_cid; +out_rcu: + rcu_read_unlock(); + return ret; }
static void virtio_transport_loopback_work(struct work_struct *work) @@ -198,14 +201,18 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt) struct virtio_vsock *vsock; int len = pkt->len;
- vsock = virtio_vsock_get(); + rcu_read_lock(); + vsock = rcu_dereference(the_virtio_vsock); if (!vsock) { virtio_transport_free_pkt(pkt); - return -ENODEV; + len = -ENODEV; + goto out_rcu; }
- if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) - return virtio_transport_send_pkt_loopback(vsock, pkt); + if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) { + len = virtio_transport_send_pkt_loopback(vsock, pkt); + goto out_rcu; + }
if (pkt->reply) atomic_inc(&vsock->queued_replies); @@ -215,6 +222,9 @@ virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt) spin_unlock_bh(&vsock->send_pkt_list_lock);
queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work); + +out_rcu: + rcu_read_unlock(); return len; }
@@ -223,12 +233,14 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk) { struct virtio_vsock *vsock; struct virtio_vsock_pkt *pkt, *n; - int cnt = 0; + int cnt = 0, ret; LIST_HEAD(freeme);
- vsock = virtio_vsock_get(); + rcu_read_lock(); + vsock = rcu_dereference(the_virtio_vsock); if (!vsock) { - return -ENODEV; + ret = -ENODEV; + goto out_rcu; }
spin_lock_bh(&vsock->send_pkt_list_lock); @@ -256,7 +268,11 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk) queue_work(virtio_vsock_workqueue, &vsock->rx_work); }
- return 0; + ret = 0; + +out_rcu: + rcu_read_unlock(); + return ret; }
static void virtio_vsock_rx_fill(struct virtio_vsock *vsock) @@ -566,7 +582,8 @@ static int virtio_vsock_probe(struct virtio_device *vdev) return ret;
/* Only one virtio-vsock device per guest is supported */ - if (the_virtio_vsock) { + if (rcu_dereference_protected(the_virtio_vsock, + lockdep_is_held(&the_virtio_vsock_mutex))) { ret = -EBUSY; goto out; } @@ -591,8 +608,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) vsock->rx_buf_max_nr = 0; atomic_set(&vsock->queued_replies, 0);
- vdev->priv = vsock; - the_virtio_vsock = vsock; mutex_init(&vsock->tx_lock); mutex_init(&vsock->rx_lock); mutex_init(&vsock->event_lock); @@ -614,6 +629,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev) virtio_vsock_event_fill(vsock); mutex_unlock(&vsock->event_lock);
+ vdev->priv = vsock; + rcu_assign_pointer(the_virtio_vsock, vsock); + mutex_unlock(&the_virtio_vsock_mutex); return 0;
@@ -628,6 +646,12 @@ static void virtio_vsock_remove(struct virtio_device *vdev) struct virtio_vsock *vsock = vdev->priv; struct virtio_vsock_pkt *pkt;
+ mutex_lock(&the_virtio_vsock_mutex); + + vdev->priv = NULL; + rcu_assign_pointer(the_virtio_vsock, NULL); + synchronize_rcu(); + flush_work(&vsock->loopback_work); flush_work(&vsock->rx_work); flush_work(&vsock->tx_work); @@ -667,12 +691,10 @@ static void virtio_vsock_remove(struct virtio_device *vdev) } spin_unlock_bh(&vsock->loopback_list_lock);
- mutex_lock(&the_virtio_vsock_mutex); - the_virtio_vsock = NULL; - mutex_unlock(&the_virtio_vsock_mutex); - vdev->config->del_vqs(vdev);
+ mutex_unlock(&the_virtio_vsock_mutex); + kfree(vsock); }
From: Stefano Garzarella sgarzare@redhat.com
[ Upstream commit 17dd1367389cfe7f150790c83247b68e0c19d106 ]
Before to call vdev->config->reset(vdev) we need to be sure that no one is accessing the device, for this reason, we add new variables in the struct virtio_vsock to stop the workers during the .remove().
This patch also add few comments before vdev->config->reset(vdev) and vdev->config->del_vqs(vdev).
Suggested-by: Stefan Hajnoczi stefanha@redhat.com Suggested-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/vmw_vsock/virtio_transport.c | 51 +++++++++++++++++++++++++++++++- 1 file changed, 50 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 68186419c445..4bc217ef5694 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -39,6 +39,7 @@ struct virtio_vsock { * must be accessed with tx_lock held. */ struct mutex tx_lock; + bool tx_run;
struct work_struct send_pkt_work; spinlock_t send_pkt_list_lock; @@ -54,6 +55,7 @@ struct virtio_vsock { * must be accessed with rx_lock held. */ struct mutex rx_lock; + bool rx_run; int rx_buf_nr; int rx_buf_max_nr;
@@ -61,6 +63,7 @@ struct virtio_vsock { * vqs[VSOCK_VQ_EVENT] must be accessed with event_lock held. */ struct mutex event_lock; + bool event_run; struct virtio_vsock_event event_list[8];
u32 guest_cid; @@ -95,6 +98,10 @@ static void virtio_transport_loopback_work(struct work_struct *work) spin_unlock_bh(&vsock->loopback_list_lock);
mutex_lock(&vsock->rx_lock); + + if (!vsock->rx_run) + goto out; + while (!list_empty(&pkts)) { struct virtio_vsock_pkt *pkt;
@@ -103,6 +110,7 @@ static void virtio_transport_loopback_work(struct work_struct *work)
virtio_transport_recv_pkt(pkt); } +out: mutex_unlock(&vsock->rx_lock); }
@@ -131,6 +139,9 @@ virtio_transport_send_pkt_work(struct work_struct *work)
mutex_lock(&vsock->tx_lock);
+ if (!vsock->tx_run) + goto out; + vq = vsock->vqs[VSOCK_VQ_TX];
for (;;) { @@ -189,6 +200,7 @@ virtio_transport_send_pkt_work(struct work_struct *work) if (added) virtqueue_kick(vq);
+out: mutex_unlock(&vsock->tx_lock);
if (restart_rx) @@ -324,6 +336,10 @@ static void virtio_transport_tx_work(struct work_struct *work)
vq = vsock->vqs[VSOCK_VQ_TX]; mutex_lock(&vsock->tx_lock); + + if (!vsock->tx_run) + goto out; + do { struct virtio_vsock_pkt *pkt; unsigned int len; @@ -334,6 +350,8 @@ static void virtio_transport_tx_work(struct work_struct *work) added = true; } } while (!virtqueue_enable_cb(vq)); + +out: mutex_unlock(&vsock->tx_lock);
if (added) @@ -362,6 +380,9 @@ static void virtio_transport_rx_work(struct work_struct *work)
mutex_lock(&vsock->rx_lock);
+ if (!vsock->rx_run) + goto out; + do { virtqueue_disable_cb(vq); for (;;) { @@ -471,6 +492,9 @@ static void virtio_transport_event_work(struct work_struct *work)
mutex_lock(&vsock->event_lock);
+ if (!vsock->event_run) + goto out; + do { struct virtio_vsock_event *event; unsigned int len; @@ -485,7 +509,7 @@ static void virtio_transport_event_work(struct work_struct *work) } while (!virtqueue_enable_cb(vq));
virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]); - +out: mutex_unlock(&vsock->event_lock); }
@@ -621,12 +645,18 @@ static int virtio_vsock_probe(struct virtio_device *vdev) INIT_WORK(&vsock->send_pkt_work, virtio_transport_send_pkt_work); INIT_WORK(&vsock->loopback_work, virtio_transport_loopback_work);
+ mutex_lock(&vsock->tx_lock); + vsock->tx_run = true; + mutex_unlock(&vsock->tx_lock); + mutex_lock(&vsock->rx_lock); virtio_vsock_rx_fill(vsock); + vsock->rx_run = true; mutex_unlock(&vsock->rx_lock);
mutex_lock(&vsock->event_lock); virtio_vsock_event_fill(vsock); + vsock->event_run = true; mutex_unlock(&vsock->event_lock);
vdev->priv = vsock; @@ -661,6 +691,24 @@ static void virtio_vsock_remove(struct virtio_device *vdev) /* Reset all connected sockets when the device disappear */ vsock_for_each_connected_socket(virtio_vsock_reset_sock);
+ /* Stop all work handlers to make sure no one is accessing the device, + * so we can safely call vdev->config->reset(). + */ + mutex_lock(&vsock->rx_lock); + vsock->rx_run = false; + mutex_unlock(&vsock->rx_lock); + + mutex_lock(&vsock->tx_lock); + vsock->tx_run = false; + mutex_unlock(&vsock->tx_lock); + + mutex_lock(&vsock->event_lock); + vsock->event_run = false; + mutex_unlock(&vsock->event_lock); + + /* Flush all device writes and interrupts, device will not use any + * more buffers. + */ vdev->config->reset(vdev);
mutex_lock(&vsock->rx_lock); @@ -691,6 +739,7 @@ static void virtio_vsock_remove(struct virtio_device *vdev) } spin_unlock_bh(&vsock->loopback_list_lock);
+ /* Delete virtqueues and flush outstanding callbacks if any */ vdev->config->del_vqs(vdev);
mutex_unlock(&the_virtio_vsock_mutex);
From: Stefano Garzarella sgarzare@redhat.com
[ Upstream commit 4c7246dc45e2706770d5233f7ce1597a07e069ba ]
We are going to add 'struct vsock_sock *' parameter to virtio_transport_get_ops().
In some cases, like in the virtio_transport_reset_no_sock(), we don't have any socket assigned to the packet received, so we can't use the virtio_transport_get_ops().
In order to allow virtio_transport_reset_no_sock() to use the '.send_pkt' callback from the 'vhost_transport' or 'virtio_transport', we add the 'struct virtio_transport *' to it and to its caller: virtio_transport_recv_pkt().
We moved the 'vhost_transport' and 'virtio_transport' definition, to pass their address to the virtio_transport_recv_pkt().
Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vhost/vsock.c | 94 +++++++------- include/linux/virtio_vsock.h | 3 +- net/vmw_vsock/virtio_transport.c | 160 ++++++++++++------------ net/vmw_vsock/virtio_transport_common.c | 12 +- 4 files changed, 135 insertions(+), 134 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 7891bd40ebd8..6ee320259e4f 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -383,6 +383,52 @@ static bool vhost_vsock_more_replies(struct vhost_vsock *vsock) return val < vq->num; }
+static struct virtio_transport vhost_transport = { + .transport = { + .get_local_cid = vhost_transport_get_local_cid, + + .init = virtio_transport_do_socket_init, + .destruct = virtio_transport_destruct, + .release = virtio_transport_release, + .connect = virtio_transport_connect, + .shutdown = virtio_transport_shutdown, + .cancel_pkt = vhost_transport_cancel_pkt, + + .dgram_enqueue = virtio_transport_dgram_enqueue, + .dgram_dequeue = virtio_transport_dgram_dequeue, + .dgram_bind = virtio_transport_dgram_bind, + .dgram_allow = virtio_transport_dgram_allow, + + .stream_enqueue = virtio_transport_stream_enqueue, + .stream_dequeue = virtio_transport_stream_dequeue, + .stream_has_data = virtio_transport_stream_has_data, + .stream_has_space = virtio_transport_stream_has_space, + .stream_rcvhiwat = virtio_transport_stream_rcvhiwat, + .stream_is_active = virtio_transport_stream_is_active, + .stream_allow = virtio_transport_stream_allow, + + .notify_poll_in = virtio_transport_notify_poll_in, + .notify_poll_out = virtio_transport_notify_poll_out, + .notify_recv_init = virtio_transport_notify_recv_init, + .notify_recv_pre_block = virtio_transport_notify_recv_pre_block, + .notify_recv_pre_dequeue = virtio_transport_notify_recv_pre_dequeue, + .notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue, + .notify_send_init = virtio_transport_notify_send_init, + .notify_send_pre_block = virtio_transport_notify_send_pre_block, + .notify_send_pre_enqueue = virtio_transport_notify_send_pre_enqueue, + .notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue, + + .set_buffer_size = virtio_transport_set_buffer_size, + .set_min_buffer_size = virtio_transport_set_min_buffer_size, + .set_max_buffer_size = virtio_transport_set_max_buffer_size, + .get_buffer_size = virtio_transport_get_buffer_size, + .get_min_buffer_size = virtio_transport_get_min_buffer_size, + .get_max_buffer_size = virtio_transport_get_max_buffer_size, + }, + + .send_pkt = vhost_transport_send_pkt, +}; + static void vhost_vsock_handle_tx_kick(struct vhost_work *work) { struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue, @@ -439,7 +485,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) if (le64_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid && le64_to_cpu(pkt->hdr.dst_cid) == vhost_transport_get_local_cid()) - virtio_transport_recv_pkt(pkt); + virtio_transport_recv_pkt(&vhost_transport, pkt); else virtio_transport_free_pkt(pkt);
@@ -792,52 +838,6 @@ static struct miscdevice vhost_vsock_misc = { .fops = &vhost_vsock_fops, };
-static struct virtio_transport vhost_transport = { - .transport = { - .get_local_cid = vhost_transport_get_local_cid, - - .init = virtio_transport_do_socket_init, - .destruct = virtio_transport_destruct, - .release = virtio_transport_release, - .connect = virtio_transport_connect, - .shutdown = virtio_transport_shutdown, - .cancel_pkt = vhost_transport_cancel_pkt, - - .dgram_enqueue = virtio_transport_dgram_enqueue, - .dgram_dequeue = virtio_transport_dgram_dequeue, - .dgram_bind = virtio_transport_dgram_bind, - .dgram_allow = virtio_transport_dgram_allow, - - .stream_enqueue = virtio_transport_stream_enqueue, - .stream_dequeue = virtio_transport_stream_dequeue, - .stream_has_data = virtio_transport_stream_has_data, - .stream_has_space = virtio_transport_stream_has_space, - .stream_rcvhiwat = virtio_transport_stream_rcvhiwat, - .stream_is_active = virtio_transport_stream_is_active, - .stream_allow = virtio_transport_stream_allow, - - .notify_poll_in = virtio_transport_notify_poll_in, - .notify_poll_out = virtio_transport_notify_poll_out, - .notify_recv_init = virtio_transport_notify_recv_init, - .notify_recv_pre_block = virtio_transport_notify_recv_pre_block, - .notify_recv_pre_dequeue = virtio_transport_notify_recv_pre_dequeue, - .notify_recv_post_dequeue = virtio_transport_notify_recv_post_dequeue, - .notify_send_init = virtio_transport_notify_send_init, - .notify_send_pre_block = virtio_transport_notify_send_pre_block, - .notify_send_pre_enqueue = virtio_transport_notify_send_pre_enqueue, - .notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue, - - .set_buffer_size = virtio_transport_set_buffer_size, - .set_min_buffer_size = virtio_transport_set_min_buffer_size, - .set_max_buffer_size = virtio_transport_set_max_buffer_size, - .get_buffer_size = virtio_transport_get_buffer_size, - .get_min_buffer_size = virtio_transport_get_min_buffer_size, - .get_max_buffer_size = virtio_transport_get_max_buffer_size, - }, - - .send_pkt = vhost_transport_send_pkt, -}; - static int __init vhost_vsock_init(void) { int ret; diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h index e223e2632edd..8b8d13f01cae 100644 --- a/include/linux/virtio_vsock.h +++ b/include/linux/virtio_vsock.h @@ -149,7 +149,8 @@ virtio_transport_dgram_enqueue(struct vsock_sock *vsk,
void virtio_transport_destruct(struct vsock_sock *vsk);
-void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt); +void virtio_transport_recv_pkt(struct virtio_transport *t, + struct virtio_vsock_pkt *pkt); void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt); void virtio_transport_inc_tx_pkt(struct virtio_vsock_sock *vvs, struct virtio_vsock_pkt *pkt); u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 wanted); diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 4bc217ef5694..cc70d651d13e 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -87,33 +87,6 @@ static u32 virtio_transport_get_local_cid(void) return ret; }
-static void virtio_transport_loopback_work(struct work_struct *work) -{ - struct virtio_vsock *vsock = - container_of(work, struct virtio_vsock, loopback_work); - LIST_HEAD(pkts); - - spin_lock_bh(&vsock->loopback_list_lock); - list_splice_init(&vsock->loopback_list, &pkts); - spin_unlock_bh(&vsock->loopback_list_lock); - - mutex_lock(&vsock->rx_lock); - - if (!vsock->rx_run) - goto out; - - while (!list_empty(&pkts)) { - struct virtio_vsock_pkt *pkt; - - pkt = list_first_entry(&pkts, struct virtio_vsock_pkt, list); - list_del_init(&pkt->list); - - virtio_transport_recv_pkt(pkt); - } -out: - mutex_unlock(&vsock->rx_lock); -} - static int virtio_transport_send_pkt_loopback(struct virtio_vsock *vsock, struct virtio_vsock_pkt *pkt) { @@ -370,59 +343,6 @@ static bool virtio_transport_more_replies(struct virtio_vsock *vsock) return val < virtqueue_get_vring_size(vq); }
-static void virtio_transport_rx_work(struct work_struct *work) -{ - struct virtio_vsock *vsock = - container_of(work, struct virtio_vsock, rx_work); - struct virtqueue *vq; - - vq = vsock->vqs[VSOCK_VQ_RX]; - - mutex_lock(&vsock->rx_lock); - - if (!vsock->rx_run) - goto out; - - do { - virtqueue_disable_cb(vq); - for (;;) { - struct virtio_vsock_pkt *pkt; - unsigned int len; - - if (!virtio_transport_more_replies(vsock)) { - /* Stop rx until the device processes already - * pending replies. Leave rx virtqueue - * callbacks disabled. - */ - goto out; - } - - pkt = virtqueue_get_buf(vq, &len); - if (!pkt) { - break; - } - - vsock->rx_buf_nr--; - - /* Drop short/long packets */ - if (unlikely(len < sizeof(pkt->hdr) || - len > sizeof(pkt->hdr) + pkt->len)) { - virtio_transport_free_pkt(pkt); - continue; - } - - pkt->len = len - sizeof(pkt->hdr); - virtio_transport_deliver_tap_pkt(pkt); - virtio_transport_recv_pkt(pkt); - } - } while (!virtqueue_enable_cb(vq)); - -out: - if (vsock->rx_buf_nr < vsock->rx_buf_max_nr / 2) - virtio_vsock_rx_fill(vsock); - mutex_unlock(&vsock->rx_lock); -} - /* event_lock must be held */ static int virtio_vsock_event_fill_one(struct virtio_vsock *vsock, struct virtio_vsock_event *event) @@ -586,6 +506,86 @@ static struct virtio_transport virtio_transport = { .send_pkt = virtio_transport_send_pkt, };
+static void virtio_transport_loopback_work(struct work_struct *work) +{ + struct virtio_vsock *vsock = + container_of(work, struct virtio_vsock, loopback_work); + LIST_HEAD(pkts); + + spin_lock_bh(&vsock->loopback_list_lock); + list_splice_init(&vsock->loopback_list, &pkts); + spin_unlock_bh(&vsock->loopback_list_lock); + + mutex_lock(&vsock->rx_lock); + + if (!vsock->rx_run) + goto out; + + while (!list_empty(&pkts)) { + struct virtio_vsock_pkt *pkt; + + pkt = list_first_entry(&pkts, struct virtio_vsock_pkt, list); + list_del_init(&pkt->list); + + virtio_transport_recv_pkt(&virtio_transport, pkt); + } +out: + mutex_unlock(&vsock->rx_lock); +} + +static void virtio_transport_rx_work(struct work_struct *work) +{ + struct virtio_vsock *vsock = + container_of(work, struct virtio_vsock, rx_work); + struct virtqueue *vq; + + vq = vsock->vqs[VSOCK_VQ_RX]; + + mutex_lock(&vsock->rx_lock); + + if (!vsock->rx_run) + goto out; + + do { + virtqueue_disable_cb(vq); + for (;;) { + struct virtio_vsock_pkt *pkt; + unsigned int len; + + if (!virtio_transport_more_replies(vsock)) { + /* Stop rx until the device processes already + * pending replies. Leave rx virtqueue + * callbacks disabled. + */ + goto out; + } + + pkt = virtqueue_get_buf(vq, &len); + if (!pkt) { + break; + } + + vsock->rx_buf_nr--; + + /* Drop short/long packets */ + if (unlikely(len < sizeof(pkt->hdr) || + len > sizeof(pkt->hdr) + pkt->len)) { + virtio_transport_free_pkt(pkt); + continue; + } + + pkt->len = len - sizeof(pkt->hdr); + virtio_transport_deliver_tap_pkt(pkt); + virtio_transport_recv_pkt(&virtio_transport, pkt); + } + } while (!virtqueue_enable_cb(vq)); + +out: + if (vsock->rx_buf_nr < vsock->rx_buf_max_nr / 2) + virtio_vsock_rx_fill(vsock); + mutex_unlock(&vsock->rx_lock); +} + static int virtio_vsock_probe(struct virtio_device *vdev) { vq_callback_t *callbacks[] = { diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 52242a148c70..fae2bded5d51 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -669,9 +669,9 @@ static int virtio_transport_reset(struct vsock_sock *vsk, /* Normally packets are associated with a socket. There may be no socket if an * attempt was made to connect to a socket that does not exist. */ -static int virtio_transport_reset_no_sock(struct virtio_vsock_pkt *pkt) +static int virtio_transport_reset_no_sock(const struct virtio_transport *t, + struct virtio_vsock_pkt *pkt) { - const struct virtio_transport *t; struct virtio_vsock_pkt *reply; struct virtio_vsock_pkt_info info = { .op = VIRTIO_VSOCK_OP_RST, @@ -691,7 +691,6 @@ static int virtio_transport_reset_no_sock(struct virtio_vsock_pkt *pkt) if (!reply) return -ENOMEM;
- t = virtio_transport_get_ops(); if (!t) { virtio_transport_free_pkt(reply); return -ENOTCONN; @@ -993,7 +992,8 @@ static bool virtio_transport_space_update(struct sock *sk, /* We are under the virtio-vsock's vsock->rx_lock or vhost-vsock's vq->mutex * lock. */ -void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt) +void virtio_transport_recv_pkt(struct virtio_transport *t, + struct virtio_vsock_pkt *pkt) { struct sockaddr_vm src, dst; struct vsock_sock *vsk; @@ -1015,7 +1015,7 @@ void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt) le32_to_cpu(pkt->hdr.fwd_cnt));
if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) { - (void)virtio_transport_reset_no_sock(pkt); + (void)virtio_transport_reset_no_sock(t, pkt); goto free_pkt; }
@@ -1026,7 +1026,7 @@ void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt) if (!sk) { sk = vsock_find_bound_socket(&dst); if (!sk) { - (void)virtio_transport_reset_no_sock(pkt); + (void)virtio_transport_reset_no_sock(t, pkt); goto free_pkt; } }
From: Sebastien Boeuf sebastien.boeuf@intel.com
[ Upstream commit df12eb6d6cd920ab2f0e0a43cd6e1c23a05cea91 ]
Whenever the vsock backend on the host sends a packet through the RX queue, it expects an answer on the TX queue. Unfortunately, there is one case where the host side will hang waiting for the answer and might effectively never recover if no timeout mechanism was implemented.
This issue happens when the guest side starts binding to the socket, which insert a new bound socket into the list of already bound sockets. At this time, we expect the guest to also start listening, which will trigger the sk_state to move from TCP_CLOSE to TCP_LISTEN. The problem occurs if the host side queued a RX packet and triggered an interrupt right between the end of the binding process and the beginning of the listening process. In this specific case, the function processing the packet virtio_transport_recv_pkt() will find a bound socket, which means it will hit the switch statement checking for the sk_state, but the state won't be changed into TCP_LISTEN yet, which leads the code to pick the default statement. This default statement will only free the buffer, while it should also respond to the host side, by sending a packet on its TX queue.
In order to simply fix this unfortunate chain of events, it is important that in case the default statement is entered, and because at this stage we know the host side is waiting for an answer, we must send back a packet containing the operation VIRTIO_VSOCK_OP_RST.
One could say that a proper timeout mechanism on the host side will be enough to avoid the backend to hang. But the point of this patch is to ensure the normal use case will be provided with proper responsiveness when it comes to establishing the connection.
Signed-off-by: Sebastien Boeuf sebastien.boeuf@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/vmw_vsock/virtio_transport_common.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index fae2bded5d51..5f8a72d34d31 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -1060,6 +1060,7 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, virtio_transport_free_pkt(pkt); break; default: + (void)virtio_transport_reset_no_sock(t, pkt); virtio_transport_free_pkt(pkt); break; }
From: Jiri Kosina jkosina@suse.cz
commit 5fc27b098dafb8e30794a9db0705074c7d766179 upstream.
Touchpad on this laptop is not detected properly during boot, as PNP enumerates (wrongly) AUX port as disabled on this machine.
Fix that by adding this board (with admittedly quite funny DMI identifiers) to nopnp quirk list.
Reported-by: Andrés Barrantes Silman andresbs2000@protonmail.com Signed-off-by: Jiri Kosina jkosina@suse.cz Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2009252337340.3336@cbobk.fhfr.pm Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/serio/i8042-x86ia64io.h | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h index 7c05e09abacf..51bd2ebaa342 100644 --- a/drivers/input/serio/i8042-x86ia64io.h +++ b/drivers/input/serio/i8042-x86ia64io.h @@ -725,6 +725,13 @@ static const struct dmi_system_id __initconst i8042_dmi_nopnp_table[] = { DMI_MATCH(DMI_BOARD_VENDOR, "MICRO-STAR INTERNATIONAL CO., LTD"), }, }, + { + /* Acer Aspire 5 A515 */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "Grumpy_PK"), + DMI_MATCH(DMI_BOARD_VENDOR, "PK"), + }, + }, { } };
From: "Steven Rostedt (VMware)" rostedt@goodmis.org
commit b40341fad6cc2daa195f8090fd3348f18fff640a upstream.
The first thing that the ftrace function callback helper functions should do is to check for recursion. Peter Zijlstra found that when "rcu_is_watching()" had its notrace removed, it caused perf function tracing to crash. This is because the call of rcu_is_watching() is tested before function recursion is checked and and if it is traced, it will cause an infinite recursion loop.
rcu_is_watching() should still stay notrace, but to prevent this should never had crashed in the first place. The recursion prevention must be the first thing done in callback functions.
Link: https://lore.kernel.org/r/20200929112541.GM2628@hirez.programming.kicks-ass....
Cc: stable@vger.kernel.org Cc: Paul McKenney paulmck@kernel.org Fixes: c68c0fa293417 ("ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too") Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Reported-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/ftrace.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 70f7743c1672..992d48774c9e 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -6370,16 +6370,14 @@ static void ftrace_ops_assist_func(unsigned long ip, unsigned long parent_ip, { int bit;
- if ((op->flags & FTRACE_OPS_FL_RCU) && !rcu_is_watching()) - return; - bit = trace_test_and_set_recursion(TRACE_LIST_START, TRACE_LIST_MAX); if (bit < 0) return;
preempt_disable_notrace();
- op->func(ip, parent_ip, op, regs); + if (!(op->flags & FTRACE_OPS_FL_RCU) || rcu_is_watching()) + op->func(ip, parent_ip, op, regs);
preempt_enable_notrace(); trace_clear_recursion(bit);
From: Jean Delvare jdelvare@suse.de
commit a39d0d7bdf8c21ac7645c02e9676b5cb2b804c31 upstream.
A recent attempt to fix a ref count leak in amdgpu_display_crtc_set_config() turned out to be doing too much and "fixed" an intended decrease as if it were a leak. Undo that part to restore the proper balance. This is the very nature of this function to increase or decrease the power reference count depending on the situation.
Consequences of this bug is that the power reference would eventually get down to 0 while the display was still in use, resulting in that display switching off unexpectedly.
Signed-off-by: Jean Delvare jdelvare@suse.de Fixes: e008fa6fb415 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config") Cc: stable@vger.kernel.org Cc: Navid Emamdoost navid.emamdoost@gmail.com Cc: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index 049a1961c3fa..5f85c9586cba 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -290,7 +290,7 @@ int amdgpu_display_crtc_set_config(struct drm_mode_set *set, take the current one */ if (active && !adev->have_disp_power_ref) { adev->have_disp_power_ref = true; - goto out; + return ret; } /* if we have no active crtcs, then drop the power ref we got before */
From: Xie He xie.he.0141@gmail.com
[ Upstream commit 44a049c42681de71c783d75cd6e56b4e339488b0 ]
PVC devices are virtual devices in this driver stacked on top of the actual HDLC device. They are the devices normal users would use. PVC devices have two types: normal PVC devices and Ethernet-emulating PVC devices.
When transmitting data with PVC devices, the ndo_start_xmit function will prepend a header of 4 or 10 bytes. Currently this driver requests this headroom to be reserved for normal PVC devices by setting their hard_header_len to 10. However, this does not work when these devices are used with AF_PACKET/RAW sockets. Also, this driver does not request this headroom for Ethernet-emulating PVC devices (but deals with this problem by reallocating the skb when needed, which is not optimal).
This patch replaces hard_header_len with needed_headroom, and set needed_headroom for Ethernet-emulating PVC devices, too. This makes the driver to request headroom for all PVC devices in all cases.
Cc: Krzysztof Halasa khc@pm.waw.pl Cc: Martin Schiller ms@dev.tdt.de Signed-off-by: Xie He xie.he.0141@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/hdlc_fr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c index 038236a9c60e..67f89917277c 100644 --- a/drivers/net/wan/hdlc_fr.c +++ b/drivers/net/wan/hdlc_fr.c @@ -1044,7 +1044,7 @@ static void pvc_setup(struct net_device *dev) { dev->type = ARPHRD_DLCI; dev->flags = IFF_POINTOPOINT; - dev->hard_header_len = 10; + dev->hard_header_len = 0; dev->addr_len = 2; netif_keep_dst(dev); } @@ -1096,6 +1096,7 @@ static int fr_add_pvc(struct net_device *frad, unsigned int dlci, int type) dev->mtu = HDLC_MAX_MTU; dev->min_mtu = 68; dev->max_mtu = HDLC_MAX_MTU; + dev->needed_headroom = 10; dev->priv_flags |= IFF_NO_QUEUE; dev->ml_priv = pvc;
From: Martin Cerveny m.cerveny@computer.org
[ Upstream commit 74ea06164cda81dc80e97790164ca533fd7e3087 ]
Better guess. Secondary CSC registers are from 0xF0000.
Signed-off-by: Martin Cerveny m.cerveny@computer.org Reviewed-by: Jernej Skrabec jernej.skrabec@siol.net Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://patchwork.freedesktop.org/patch/msgid/20200906162140.5584-3-m.cerven... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/sun4i/sun8i_mixer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/sun4i/sun8i_mixer.c b/drivers/gpu/drm/sun4i/sun8i_mixer.c index 71a798e5d559..649b57e5e4b7 100644 --- a/drivers/gpu/drm/sun4i/sun8i_mixer.c +++ b/drivers/gpu/drm/sun4i/sun8i_mixer.c @@ -364,7 +364,7 @@ static struct regmap_config sun8i_mixer_regmap_config = { .reg_bits = 32, .val_bits = 32, .reg_stride = 4, - .max_register = 0xbfffc, /* guessed */ + .max_register = 0xffffc, /* guessed */ };
static int sun8i_mixer_of_get_id(struct device_node *node)
From: Lucy Yan lucyyan@google.com
[ Upstream commit ee460417d254d941dfea5fb7cff841f589643992 ]
Increase Rx ring size to address issue where hardware is reaching the receive work limit.
Before:
[ 102.223342] de2104x 0000:17:00.0 eth0: rx work limit reached [ 102.245695] de2104x 0000:17:00.0 eth0: rx work limit reached [ 102.251387] de2104x 0000:17:00.0 eth0: rx work limit reached [ 102.267444] de2104x 0000:17:00.0 eth0: rx work limit reached
Signed-off-by: Lucy Yan lucyyan@google.com Reviewed-by: Moritz Fischer mdf@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/dec/tulip/de2104x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c b/drivers/net/ethernet/dec/tulip/de2104x.c index 13430f75496c..b312cd9bce16 100644 --- a/drivers/net/ethernet/dec/tulip/de2104x.c +++ b/drivers/net/ethernet/dec/tulip/de2104x.c @@ -91,7 +91,7 @@ MODULE_PARM_DESC (rx_copybreak, "de2104x Breakpoint at which Rx packets are copi #define DSL CONFIG_DE2104X_DSL #endif
-#define DE_RX_RING_SIZE 64 +#define DE_RX_RING_SIZE 128 #define DE_TX_RING_SIZE 64 #define DE_RING_BYTES \ ((sizeof(struct de_desc) * DE_RX_RING_SIZE) + \
From: Olympia Giannou ogiannou@gmail.com
[ Upstream commit 4202c9fdf03d79dedaa94b2c4cf574f25793d669 ]
Some WinCE devices face connectivity issues via the NDIS interface. They fail to register, resulting in -110 timeout errors and failures during the probe procedure.
In this kind of WinCE devices, the Windows-side ndis driver needs quite more time to be loaded and configured, so that the linux rndis host queries to them fail to be responded correctly on time.
More specifically, when INIT is called on the WinCE side - no other requests can be served by the Client and this results in a failed QUERY afterwards.
The increase of the waiting time on the side of the linux rndis host in the command-response loop leaves the INIT process to complete and respond to a QUERY, which comes afterwards. The WinCE devices with this special "feature" in their ndis driver are satisfied by this fix.
Signed-off-by: Olympia Giannou olympia.giannou@leica-geosystems.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/rndis_host.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/usb/rndis_host.c b/drivers/net/usb/rndis_host.c index b807c91abe1d..a22ae3137a3f 100644 --- a/drivers/net/usb/rndis_host.c +++ b/drivers/net/usb/rndis_host.c @@ -213,7 +213,7 @@ int rndis_command(struct usbnet *dev, struct rndis_msg_hdr *buf, int buflen) dev_dbg(&info->control->dev, "rndis response error, code %d\n", retval); } - msleep(20); + msleep(40); } dev_dbg(&info->control->dev, "rndis response timeout\n"); return -ETIMEDOUT;
From: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com
[ Upstream commit 52a3974feb1a3eec25d8836d37a508b67b0a9cd0 ]
Get and put the reference to the ctrl in the nvme_dev_open() and nvme_dev_release() before and after module get/put for ctrl in char device file operations.
Introduce char_dev relase function, get/put the controller and module which allows us to fix the potential Oops which can be easily reproduced with a passthru ctrl (although the problem also exists with pure user access):
Entering kdb (current=0xffff8887f8290000, pid 3128) on processor 30 Oops: (null) due to oops @ 0xffffffffa01019ad CPU: 30 PID: 3128 Comm: bash Tainted: G W OE 5.8.0-rc4nvme-5.9+ #35 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.4 RIP: 0010:nvme_free_ctrl+0x234/0x285 [nvme_core] Code: 57 10 a0 e8 73 bf 02 e1 ba 3d 11 00 00 48 c7 c6 98 33 10 a0 48 c7 c7 1d 57 10 a0 e8 5b bf 02 e1 8 RSP: 0018:ffffc90001d63de0 EFLAGS: 00010246 RAX: ffffffffa05c0440 RBX: ffff8888119e45a0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8888177e9550 RDI: ffff8888119e43b0 RBP: ffff8887d4768000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: ffffc90001d63c90 R12: ffff8888119e43b0 R13: ffff8888119e5108 R14: dead000000000100 R15: ffff8888119e5108 FS: 00007f1ef27b0740(0000) GS:ffff888817600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa05c0470 CR3: 00000007f6bee000 CR4: 00000000003406e0 Call Trace: device_release+0x27/0x80 kobject_put+0x98/0x170 nvmet_passthru_ctrl_disable+0x4a/0x70 [nvmet] nvmet_passthru_enable_store+0x4c/0x90 [nvmet] configfs_write_file+0xe6/0x150 vfs_write+0xba/0x1e0 ksys_write+0x5f/0xe0 do_syscall_64+0x52/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f1ef1eb2840 Code: Bad RIP value. RSP: 002b:00007fffdbff0eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1ef1eb2840 RDX: 0000000000000002 RSI: 00007f1ef27d2000 RDI: 0000000000000001 RBP: 00007f1ef27d2000 R08: 000000000000000a R09: 00007f1ef27b0740 R10: 0000000000000001 R11: 0000000000000246 R12: 00007f1ef2186400 R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
With this patch fix we take the module ref count in nvme_dev_open() and release that ref count in newly introduced nvme_dev_release().
Signed-off-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index b49291f639dc..3f24d451f2b6 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2608,10 +2608,24 @@ static int nvme_dev_open(struct inode *inode, struct file *file) return -EWOULDBLOCK; }
+ nvme_get_ctrl(ctrl); + if (!try_module_get(ctrl->ops->module)) + return -EINVAL; + file->private_data = ctrl; return 0; }
+static int nvme_dev_release(struct inode *inode, struct file *file) +{ + struct nvme_ctrl *ctrl = + container_of(inode->i_cdev, struct nvme_ctrl, cdev); + + module_put(ctrl->ops->module); + nvme_put_ctrl(ctrl); + return 0; +} + static int nvme_dev_user_cmd(struct nvme_ctrl *ctrl, void __user *argp) { struct nvme_ns *ns; @@ -2672,6 +2686,7 @@ static long nvme_dev_ioctl(struct file *file, unsigned int cmd, static const struct file_operations nvme_dev_fops = { .owner = THIS_MODULE, .open = nvme_dev_open, + .release = nvme_dev_release, .unlocked_ioctl = nvme_dev_ioctl, .compat_ioctl = nvme_dev_ioctl, };
From: Xie He xie.he.0141@gmail.com
[ Upstream commit 83f9a9c8c1edc222846dc1bde6e3479703e8e5a3 ]
This driver is a virtual driver stacked on top of Ethernet interfaces.
When this driver transmits data on the Ethernet device, the skb->protocol setting is inconsistent with the Ethernet header prepended to the skb.
This causes a user listening on the Ethernet interface with an AF_PACKET socket, to see different sll_protocol values for incoming and outgoing frames, because incoming frames would have this value set by parsing the Ethernet header.
This patch changes the skb->protocol value for outgoing Ethernet frames, making it consistent with the Ethernet header prepended. This makes a user listening on the Ethernet device with an AF_PACKET socket, to see the same sll_protocol value for incoming and outgoing frames.
Cc: Martin Schiller ms@dev.tdt.de Signed-off-by: Xie He xie.he.0141@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/lapbether.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wan/lapbether.c b/drivers/net/wan/lapbether.c index 15177a54b17d..e5fc1b95cea6 100644 --- a/drivers/net/wan/lapbether.c +++ b/drivers/net/wan/lapbether.c @@ -201,8 +201,6 @@ static void lapbeth_data_transmit(struct net_device *ndev, struct sk_buff *skb) struct net_device *dev; int size = skb->len;
- skb->protocol = htons(ETH_P_X25); - ptr = skb_push(skb, 2);
*ptr++ = size % 256; @@ -213,6 +211,8 @@ static void lapbeth_data_transmit(struct net_device *ndev, struct sk_buff *skb)
skb->dev = dev = lapbeth->ethdev;
+ skb->protocol = htons(ETH_P_DEC); + skb_reset_network_header(skb);
dev_hard_header(skb, dev, ETH_P_DEC, bcast_addr, NULL, 0);
From: Xie He xie.he.0141@gmail.com
[ Upstream commit 9fb030a70431a2a2a1b292dbf0b2f399cc072c16 ]
This patch sets skb->protocol before transmitting frames on the HDLC device, so that a user listening on the HDLC device with an AF_PACKET socket will see outgoing frames' sll_protocol field correctly set and consistent with that of incoming frames.
1. Control frames in hdlc_cisco and hdlc_ppp
When these drivers send control frames, skb->protocol is not set.
This value should be set to htons(ETH_P_HDLC), because when receiving control frames, their skb->protocol is set to htons(ETH_P_HDLC).
When receiving, hdlc_type_trans in hdlc.h is called, which then calls cisco_type_trans or ppp_type_trans. The skb->protocol of control frames is set to htons(ETH_P_HDLC) so that the control frames can be received by hdlc_rcv in hdlc.c, which calls cisco_rx or ppp_rx to process the control frames.
2. hdlc_fr
When this driver sends control frames, skb->protocol is set to internal values used in this driver.
When this driver sends data frames (from upper stacked PVC devices), skb->protocol is the same as that of the user data packet being sent on the upper PVC device (for normal PVC devices), or is htons(ETH_P_802_3) (for Ethernet-emulating PVC devices).
However, skb->protocol for both control frames and data frames should be set to htons(ETH_P_HDLC), because when receiving, all frames received on the HDLC device will have their skb->protocol set to htons(ETH_P_HDLC).
When receiving, hdlc_type_trans in hdlc.h is called, and because this driver doesn't provide a type_trans function in struct hdlc_proto, all frames will have their skb->protocol set to htons(ETH_P_HDLC). The frames are then received by hdlc_rcv in hdlc.c, which calls fr_rx to process the frames (control frames are consumed and data frames are re-received on upper PVC devices).
Cc: Krzysztof Halasa khc@pm.waw.pl Signed-off-by: Xie He xie.he.0141@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/hdlc_cisco.c | 1 + drivers/net/wan/hdlc_fr.c | 3 +++ drivers/net/wan/hdlc_ppp.c | 1 + 3 files changed, 5 insertions(+)
diff --git a/drivers/net/wan/hdlc_cisco.c b/drivers/net/wan/hdlc_cisco.c index c169a26e5359..2c6e3fa6947a 100644 --- a/drivers/net/wan/hdlc_cisco.c +++ b/drivers/net/wan/hdlc_cisco.c @@ -121,6 +121,7 @@ static void cisco_keepalive_send(struct net_device *dev, u32 type, skb_put(skb, sizeof(struct cisco_packet)); skb->priority = TC_PRIO_CONTROL; skb->dev = dev; + skb->protocol = htons(ETH_P_HDLC); skb_reset_network_header(skb);
dev_queue_xmit(skb); diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c index 67f89917277c..03b5f5cce6f4 100644 --- a/drivers/net/wan/hdlc_fr.c +++ b/drivers/net/wan/hdlc_fr.c @@ -436,6 +436,8 @@ static netdev_tx_t pvc_xmit(struct sk_buff *skb, struct net_device *dev) if (pvc->state.fecn) /* TX Congestion counter */ dev->stats.tx_compressed++; skb->dev = pvc->frad; + skb->protocol = htons(ETH_P_HDLC); + skb_reset_network_header(skb); dev_queue_xmit(skb); return NETDEV_TX_OK; } @@ -558,6 +560,7 @@ static void fr_lmi_send(struct net_device *dev, int fullrep) skb_put(skb, i); skb->priority = TC_PRIO_CONTROL; skb->dev = dev; + skb->protocol = htons(ETH_P_HDLC); skb_reset_network_header(skb);
dev_queue_xmit(skb); diff --git a/drivers/net/wan/hdlc_ppp.c b/drivers/net/wan/hdlc_ppp.c index 85844f26547d..20d9b6585fba 100644 --- a/drivers/net/wan/hdlc_ppp.c +++ b/drivers/net/wan/hdlc_ppp.c @@ -254,6 +254,7 @@ static void ppp_tx_cp(struct net_device *dev, u16 pid, u8 code,
skb->priority = TC_PRIO_CONTROL; skb->dev = dev; + skb->protocol = htons(ETH_P_HDLC); skb_reset_network_header(skb); skb_queue_tail(&tx_queue, skb); }
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 3bd5c7a28a7c3aba07a2d300d43f8e988809e147 ]
Limit maximum VHT MPDU size by local capability.
Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20200917125031.45009-1-nbd@nbd.name Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/vht.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/mac80211/vht.c b/net/mac80211/vht.c index 259325cbcc31..4d154efb80c8 100644 --- a/net/mac80211/vht.c +++ b/net/mac80211/vht.c @@ -170,10 +170,7 @@ ieee80211_vht_cap_ie_to_sta_vht_cap(struct ieee80211_sub_if_data *sdata, /* take some capabilities as-is */ cap_info = le32_to_cpu(vht_cap_ie->vht_cap_info); vht_cap->cap = cap_info; - vht_cap->cap &= IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_3895 | - IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_7991 | - IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454 | - IEEE80211_VHT_CAP_RXLDPC | + vht_cap->cap &= IEEE80211_VHT_CAP_RXLDPC | IEEE80211_VHT_CAP_VHT_TXOP_PS | IEEE80211_VHT_CAP_HTC_VHT | IEEE80211_VHT_CAP_MAX_A_MPDU_LENGTH_EXPONENT_MASK | @@ -182,6 +179,9 @@ ieee80211_vht_cap_ie_to_sta_vht_cap(struct ieee80211_sub_if_data *sdata, IEEE80211_VHT_CAP_RX_ANTENNA_PATTERN | IEEE80211_VHT_CAP_TX_ANTENNA_PATTERN;
+ vht_cap->cap |= min_t(u32, cap_info & IEEE80211_VHT_CAP_MAX_MPDU_MASK, + own_cap.cap & IEEE80211_VHT_CAP_MAX_MPDU_MASK); + /* and some based on our own capabilities */ switch (own_cap.cap & IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_MASK) { case IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_160MHZ:
From: Chris Packham chris.packham@alliedtelesis.co.nz
[ Upstream commit b867eef4cf548cd9541225aadcdcee644669b9e1 ]
The SPIE register contains counts for the TX FIFO so any time the irq handler was invoked we would attempt to process the RX/TX fifos. Use the SPIM value to mask the events so that we only process interrupts that were expected.
This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C").
Signed-off-by: Chris Packham chris.packham@alliedtelesis.co.nz Link: https://lore.kernel.org/r/20200904002812.7300-1-chris.packham@alliedtelesis.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-fsl-espi.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c index 1e8ff6256079..b8dd75b8518b 100644 --- a/drivers/spi/spi-fsl-espi.c +++ b/drivers/spi/spi-fsl-espi.c @@ -559,13 +559,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, u32 events) static irqreturn_t fsl_espi_irq(s32 irq, void *context_data) { struct fsl_espi *espi = context_data; - u32 events; + u32 events, mask;
spin_lock(&espi->lock);
/* Get interrupt events(tx/rx) */ events = fsl_espi_read_reg(espi, ESPI_SPIE); - if (!events) { + mask = fsl_espi_read_reg(espi, ESPI_SPIM); + if (!(events & mask)) { spin_unlock(&espi->lock); return IRQ_NONE; }
From: James Smart james.smart@broadcom.com
[ Upstream commit 9e0e8dac985d4bd07d9e62922b9d189d3ca2fccf ]
The lldd may have made calls to delete a remote port or local port and the delete is in progress when the cli then attempts to create a new controller. Currently, this proceeds without error although it can't be very successful.
Fix this by validating that both the host port and remote port are present when a new controller is to be created.
Signed-off-by: James Smart james.smart@broadcom.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/fc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 73db32f97abf..ed88d5021772 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -3294,12 +3294,14 @@ nvme_fc_create_ctrl(struct device *dev, struct nvmf_ctrl_options *opts) spin_lock_irqsave(&nvme_fc_lock, flags); list_for_each_entry(lport, &nvme_fc_lport_list, port_list) { if (lport->localport.node_name != laddr.nn || - lport->localport.port_name != laddr.pn) + lport->localport.port_name != laddr.pn || + lport->localport.port_state != FC_OBJSTATE_ONLINE) continue;
list_for_each_entry(rport, &lport->endp_list, endp_list) { if (rport->remoteport.node_name != raddr.nn || - rport->remoteport.port_name != raddr.pn) + rport->remoteport.port_name != raddr.pn || + rport->remoteport.port_state != FC_OBJSTATE_ONLINE) continue;
/* if fail to get reference fall through. Will error */
From: Taiping Lai taiping.lai@unisoc.com
[ Upstream commit 5fcface659aab7eac4bd65dd116d98b8f7bb88d5 ]
The raw interrupt status of GPIO maybe set before the interrupt is enabled, which would trigger the interrupt event once enabled it from user side. This is the case for edge interrupts only. Adding a clear operation when setting interrupt type can avoid that.
There're a few considerations for the solution: 1) This issue is for edge interrupt only; The interrupts requested by users are IRQ_TYPE_LEVEL_HIGH as default, so clearing interrupt when request is useless. 2) The interrupt type can be set to edge when request and following up with clearing it though, but the problem is still there once users set the interrupt type to level trggier. 3) We can add a clear operation after each time of setting interrupt enable bit, but it is redundant for level trigger interrupt.
Therefore, the solution is this patch seems the best for now.
Fixes: 9a3821c2bb47 ("gpio: Add GPIO driver for Spreadtrum SC9860 platform") Signed-off-by: Taiping Lai taiping.lai@unisoc.com Signed-off-by: Chunyan Zhang chunyan.zhang@unisoc.com Reviewed-by: Baolin Wang baolin.wang7@gmail.com Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-sprd.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpio/gpio-sprd.c b/drivers/gpio/gpio-sprd.c index 55072d2b367f..4d53347adcaf 100644 --- a/drivers/gpio/gpio-sprd.c +++ b/drivers/gpio/gpio-sprd.c @@ -149,17 +149,20 @@ static int sprd_gpio_irq_set_type(struct irq_data *data, sprd_gpio_update(chip, offset, SPRD_GPIO_IS, 0); sprd_gpio_update(chip, offset, SPRD_GPIO_IBE, 0); sprd_gpio_update(chip, offset, SPRD_GPIO_IEV, 1); + sprd_gpio_update(chip, offset, SPRD_GPIO_IC, 1); irq_set_handler_locked(data, handle_edge_irq); break; case IRQ_TYPE_EDGE_FALLING: sprd_gpio_update(chip, offset, SPRD_GPIO_IS, 0); sprd_gpio_update(chip, offset, SPRD_GPIO_IBE, 0); sprd_gpio_update(chip, offset, SPRD_GPIO_IEV, 0); + sprd_gpio_update(chip, offset, SPRD_GPIO_IC, 1); irq_set_handler_locked(data, handle_edge_irq); break; case IRQ_TYPE_EDGE_BOTH: sprd_gpio_update(chip, offset, SPRD_GPIO_IS, 0); sprd_gpio_update(chip, offset, SPRD_GPIO_IBE, 1); + sprd_gpio_update(chip, offset, SPRD_GPIO_IC, 1); irq_set_handler_locked(data, handle_edge_irq); break; case IRQ_TYPE_LEVEL_HIGH:
From: Chris Packham chris.packham@alliedtelesis.co.nz
[ Upstream commit 63c3212e7a37d68c89a13bdaebce869f4e064e67 ]
Per the datasheet the i2c functions use MPP_Sel=0x1. They are documented as using MPP_Sel=0x4 as well but mixing 0x1 and 0x4 is clearly wrong. On the board tested 0x4 resulted in a non-functioning i2c bus so stick with 0x1 which works.
Fixes: d7ae8f8dee7f ("pinctrl: mvebu: pinctrl driver for 98DX3236 SoC") Signed-off-by: Chris Packham chris.packham@alliedtelesis.co.nz Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20200907211712.9697-2-chris.packham@alliedtelesis.... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/mvebu/pinctrl-armada-xp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pinctrl/mvebu/pinctrl-armada-xp.c b/drivers/pinctrl/mvebu/pinctrl-armada-xp.c index 43231fd065a1..1a9450ef932b 100644 --- a/drivers/pinctrl/mvebu/pinctrl-armada-xp.c +++ b/drivers/pinctrl/mvebu/pinctrl-armada-xp.c @@ -418,7 +418,7 @@ static struct mvebu_mpp_mode mv98dx3236_mpp_modes[] = { MPP_VAR_FUNCTION(0x1, "i2c0", "sck", V_98DX3236_PLUS)), MPP_MODE(15, MPP_VAR_FUNCTION(0x0, "gpio", NULL, V_98DX3236_PLUS), - MPP_VAR_FUNCTION(0x4, "i2c0", "sda", V_98DX3236_PLUS)), + MPP_VAR_FUNCTION(0x1, "i2c0", "sda", V_98DX3236_PLUS)), MPP_MODE(16, MPP_VAR_FUNCTION(0x0, "gpo", NULL, V_98DX3236_PLUS), MPP_VAR_FUNCTION(0x4, "dev", "oe", V_98DX3236_PLUS)),
From: Jeffrey Mitchell jeffrey.mitchell@starlab.io
[ Upstream commit d33030e2ee3508d65db5644551435310df86010e ]
nfs_readdir_page_filler() iterates over entries in a directory, reusing the same security label buffer, but does not reset the buffer's length. This causes decode_attr_security_label() to return -ERANGE if an entry's security label is longer than the previous one's. This error, in nfs4_decode_dirent(), only gets passed up as -EAGAIN, which causes another failed attempt to copy into the buffer. The second error is ignored and the remaining entries do not show up in ls, specifically the getdents64() syscall.
Reproduce by creating multiple files in NFS and giving one of the later files a longer security label. ls will not see that file nor any that are added afterwards, though they will exist on the backend.
In nfs_readdir_page_filler(), reset security label buffer length before every reuse
Signed-off-by: Jeffrey Mitchell jeffrey.mitchell@starlab.io Fixes: b4487b935452 ("nfs: Fix getxattr kernel panic and memory overflow") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/dir.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 4ae726e70d87..733fd9e4f0a1 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -553,6 +553,9 @@ int nfs_readdir_page_filler(nfs_readdir_descriptor_t *desc, struct nfs_entry *en xdr_set_scratch_buffer(&stream, page_address(scratch), PAGE_SIZE);
do { + if (entry->label) + entry->label->len = NFS4_MAXLABELLEN; + status = xdr_decode(desc, entry, &stream); if (status != 0) { if (status == -EAGAIN)
From: Marek Szyprowski m.szyprowski@samsung.com
[ Upstream commit f3bb0f796f5ffe32f0fbdce5b1b12eb85511158f ]
The ChipID IO region has it's own clock, which is being disabled while scanning for unused clocks. It turned out that some CPU hotplug, CPU idle or even SOC firmware code depends on the reads from that area. Fix the mysterious hang caused by entering deep CPU idle state by ignoring the 'chipid' clock during unused clocks scan, as there are no direct clients for it which will keep it enabled.
Fixes: e062b571777f ("clk: exynos4: register clocks using common clock framework") Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Link: https://lore.kernel.org/r/20200922124046.10496-1-m.szyprowski@samsung.com Reviewed-by: Krzysztof Kozlowski krzk@kernel.org Acked-by: Sylwester Nawrocki s.nawrocki@samsung.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/samsung/clk-exynos4.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/samsung/clk-exynos4.c b/drivers/clk/samsung/clk-exynos4.c index 442309b56920..8086756e7f07 100644 --- a/drivers/clk/samsung/clk-exynos4.c +++ b/drivers/clk/samsung/clk-exynos4.c @@ -1072,7 +1072,7 @@ static const struct samsung_gate_clock exynos4210_gate_clks[] __initconst = { GATE(CLK_PCIE, "pcie", "aclk133", GATE_IP_FSYS, 14, 0, 0), GATE(CLK_SMMU_PCIE, "smmu_pcie", "aclk133", GATE_IP_FSYS, 18, 0, 0), GATE(CLK_MODEMIF, "modemif", "aclk100", GATE_IP_PERIL, 28, 0, 0), - GATE(CLK_CHIPID, "chipid", "aclk100", E4210_GATE_IP_PERIR, 0, 0, 0), + GATE(CLK_CHIPID, "chipid", "aclk100", E4210_GATE_IP_PERIR, 0, CLK_IGNORE_UNUSED, 0), GATE(CLK_SYSREG, "sysreg", "aclk100", E4210_GATE_IP_PERIR, 0, CLK_IGNORE_UNUSED, 0), GATE(CLK_HDMI_CEC, "hdmi_cec", "aclk100", E4210_GATE_IP_PERIR, 11, 0, @@ -1113,7 +1113,7 @@ static const struct samsung_gate_clock exynos4x12_gate_clks[] __initconst = { 0), GATE(CLK_TSADC, "tsadc", "aclk133", E4X12_GATE_BUS_FSYS1, 16, 0, 0), GATE(CLK_MIPI_HSI, "mipi_hsi", "aclk133", GATE_IP_FSYS, 10, 0, 0), - GATE(CLK_CHIPID, "chipid", "aclk100", E4X12_GATE_IP_PERIR, 0, 0, 0), + GATE(CLK_CHIPID, "chipid", "aclk100", E4X12_GATE_IP_PERIR, 0, CLK_IGNORE_UNUSED, 0), GATE(CLK_SYSREG, "sysreg", "aclk100", E4X12_GATE_IP_PERIR, 1, CLK_IGNORE_UNUSED, 0), GATE(CLK_HDMI_CEC, "hdmi_cec", "aclk100", E4X12_GATE_IP_PERIR, 11, 0,
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 1a26044954a6d1f4d375d5e62392446af663be7a ]
if of_find_device_by_node() succeed, exynos_iommu_of_xlate() doesn't have a corresponding put_device(). Thus add put_device() to fix the exception handling for this function implementation.
Fixes: aa759fd376fb ("iommu/exynos: Add callback for initializing devices from device tree") Signed-off-by: Yu Kuai yukuai3@huawei.com Acked-by: Marek Szyprowski m.szyprowski@samsung.com Link: https://lore.kernel.org/r/20200918011335.909141-1-yukuai3@huawei.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/exynos-iommu.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c index 1bd0cd7168df..4bf6049dd2c7 100644 --- a/drivers/iommu/exynos-iommu.c +++ b/drivers/iommu/exynos-iommu.c @@ -1302,13 +1302,17 @@ static int exynos_iommu_of_xlate(struct device *dev, return -ENODEV;
data = platform_get_drvdata(sysmmu); - if (!data) + if (!data) { + put_device(&sysmmu->dev); return -ENODEV; + }
if (!owner) { owner = kzalloc(sizeof(*owner), GFP_KERNEL); - if (!owner) + if (!owner) { + put_device(&sysmmu->dev); return -ENOMEM; + }
INIT_LIST_HEAD(&owner->controllers); mutex_init(&owner->rpm_lock);
From: Nicolas VINCENT nicolas.vincent@vossloh.com
[ Upstream commit a2bd970aa62f2f7f80fd0d212b1d4ccea5df4aed ]
the i2c_ram structure is missing the sdmatmp field mentionned in datasheet for MPC8272 at paragraph 36.5. With this field missing, the hardware would write past the allocated memory done through cpm_muram_alloc for the i2c_ram structure and land in memory allocated for the buffers descriptors corrupting the cbd_bufaddr field. Since this field is only set during setup(), the first i2c transaction would work and the following would send data read from an arbitrary memory location.
Fixes: 61045dbe9d8d ("i2c: Add support for I2C bus on Freescale CPM1/CPM2 controllers") Signed-off-by: Nicolas VINCENT nicolas.vincent@vossloh.com Acked-by: Jochen Friedrich jochen@scram.de Acked-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-cpm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c index 8a8ca945561b..7eba874a981d 100644 --- a/drivers/i2c/busses/i2c-cpm.c +++ b/drivers/i2c/busses/i2c-cpm.c @@ -74,6 +74,9 @@ struct i2c_ram { char res1[4]; /* Reserved */ ushort rpbase; /* Relocation pointer */ char res2[2]; /* Reserved */ + /* The following elements are only for CPM2 */ + char res3[4]; /* Reserved */ + uint sdmatmp; /* Internal */ };
#define I2COM_START 0x80
From: Vincent Huang vincent.huang@tw.synaptics.com
[ Upstream commit 996d585b079ad494a30cac10e08585bcd5345125 ]
Add Synaptics IDs in trackpoint_start_protocol() to mark them as valid.
Signed-off-by: Vincent Huang vincent.huang@tw.synaptics.com Fixes: 6c77545af100 ("Input: trackpoint - add new trackpoint variant IDs") Reviewed-by: Harry Cutts hcutts@chromium.org Tested-by: Harry Cutts hcutts@chromium.org Link: https://lore.kernel.org/r/20200924053013.1056953-1-vincent.huang@tw.synaptic... Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/mouse/trackpoint.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/input/mouse/trackpoint.c b/drivers/input/mouse/trackpoint.c index 31c16b68aa31..e46865785409 100644 --- a/drivers/input/mouse/trackpoint.c +++ b/drivers/input/mouse/trackpoint.c @@ -285,6 +285,8 @@ static int trackpoint_start_protocol(struct psmouse *psmouse, case TP_VARIANT_ALPS: case TP_VARIANT_ELAN: case TP_VARIANT_NXP: + case TP_VARIANT_JYT_SYNAPTICS: + case TP_VARIANT_SYNAPTICS: if (variant_id) *variant_id = param[0]; if (firmware_id)
From: Thibaut Sautereau thibaut.sautereau@ssi.gouv.fr
[ Upstream commit 09a6b0bc3be793ca8cba580b7992d73e9f68f15d ]
Commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity") broke compilation and was temporarily fixed by Linus in 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy gcc plugin") by entirely moving net_rand_state out of the things handled by the latent_entropy GCC plugin.
From what I understand when reading the plugin code, using the __latent_entropy attribute on a declaration was the wrong part and simply keeping the __latent_entropy attribute on the variable definition was the correct fix.
Fixes: 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy gcc plugin") Acked-by: Willy Tarreau w@1wt.eu Cc: Emese Revfy re.emese@gmail.com Signed-off-by: Thibaut Sautereau thibaut.sautereau@ssi.gouv.fr Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/random32.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/random32.c b/lib/random32.c index 036de0c93e22..b6f3325e38e4 100644 --- a/lib/random32.c +++ b/lib/random32.c @@ -48,7 +48,7 @@ static inline void prandom_state_selftest(void) } #endif
-DEFINE_PER_CPU(struct rnd_state, net_rand_state); +DEFINE_PER_CPU(struct rnd_state, net_rand_state) __latent_entropy;
/** * prandom_u32_state - seeded pseudo-random number generator.
From: Laurent Dufour ldufour@linux.ibm.com
commit c1d0da83358a2316d9be7f229f26126dbaa07468 upstream.
Patch series "mm: fix memory to node bad links in sysfs", v3.
Sometimes, firmware may expose interleaved memory layout like this:
Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff]
In that case, we can see memory blocks assigned to multiple nodes in sysfs:
$ ls -l /sys/devices/system/memory/memory21 total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 -rw-r--r-- 1 root root 65536 Aug 24 05:27 online -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index drwxr-xr-x 2 root root 0 Aug 24 05:27 power -r--r--r-- 1 root root 65536 Aug 24 05:27 removable -rw-r--r-- 1 root root 65536 Aug 24 05:27 state lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones
The same applies in the node's directory with a memory21 link in both the node1 and node2's directory.
This is wrong but doesn't prevent the system to run. However when later, one of these memory blocks is hot-unplugged and then hot-plugged, the system is detecting an inconsistency in the sysfs layout and a BUG_ON() is raised:
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c
This has been seen on PowerPC LPAR.
The root cause of this issue is that when node's memory is registered, the range used can overlap another node's range, thus the memory block is registered to multiple nodes in sysfs.
There are two issues here:
(a) The sysfs memory and node's layouts are broken due to these multiple links
(b) The link errors in link_mem_sections() should not lead to a system panic.
To address (a) register_mem_sect_under_node should not rely on the system state to detect whether the link operation is triggered by a hot plug operation or not. This is addressed by the patches 1 and 2 of this series.
Issue (b) will be addressed separately.
This patch (of 2):
The memmap_context enum is used to detect whether a memory operation is due to a hot-add operation or happening at boot time.
Make it general to the hotplug operation and rename it as meminit_context.
There is no functional change introduced by this patch
Suggested-by: David Hildenbrand david@redhat.com Signed-off-by: Laurent Dufour ldufour@linux.ibm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Oscar Salvador osalvador@suse.de Acked-by: Michal Hocko mhocko@suse.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: "Rafael J . Wysocki" rafael@kernel.org Cc: Nathan Lynch nathanl@linux.ibm.com Cc: Scott Cheloha cheloha@linux.ibm.com Cc: Tony Luck tony.luck@intel.com Cc: Fenghua Yu fenghua.yu@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/ia64/mm/init.c | 6 +++--- include/linux/mm.h | 2 +- include/linux/mmzone.h | 11 ++++++++--- mm/memory_hotplug.c | 2 +- mm/page_alloc.c | 11 ++++++----- 5 files changed, 19 insertions(+), 13 deletions(-)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c index 79e5cc70f1fd..561e2573bd34 100644 --- a/arch/ia64/mm/init.c +++ b/arch/ia64/mm/init.c @@ -499,7 +499,7 @@ virtual_memmap_init(u64 start, u64 end, void *arg) if (map_start < map_end) memmap_init_zone((unsigned long)(map_end - map_start), args->nid, args->zone, page_to_pfn(map_start), - MEMMAP_EARLY, NULL); + MEMINIT_EARLY, NULL); return 0; }
@@ -508,8 +508,8 @@ memmap_init (unsigned long size, int nid, unsigned long zone, unsigned long start_pfn) { if (!vmem_map) { - memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, - NULL); + memmap_init_zone(size, nid, zone, start_pfn, + MEMINIT_EARLY, NULL); } else { struct page *start; struct memmap_init_callback_data args; diff --git a/include/linux/mm.h b/include/linux/mm.h index 16eff0426e4a..d26d870f68cc 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2209,7 +2209,7 @@ static inline void zero_resv_unavail(void) {}
extern void set_dma_reserve(unsigned long new_dma_reserve); extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long, - enum memmap_context, struct vmem_altmap *); + enum meminit_context, struct vmem_altmap *); extern void setup_per_zone_wmarks(void); extern int __meminit init_per_zone_wmark_min(void); extern void mem_init(void); diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 8c546c5b0f21..1d7c5dd03ed8 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -762,10 +762,15 @@ bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned int alloc_flags); bool zone_watermark_ok_safe(struct zone *z, unsigned int order, unsigned long mark, int classzone_idx); -enum memmap_context { - MEMMAP_EARLY, - MEMMAP_HOTPLUG, +/* + * Memory initialization context, use to differentiate memory added by + * the platform statically or via memory hotplug interface. + */ +enum meminit_context { + MEMINIT_EARLY, + MEMINIT_HOTPLUG, }; + extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn, unsigned long size);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index dfde365b3ccb..040ef987a5a1 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -733,7 +733,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, * are reserved so nobody should be touching them so we should be safe */ memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn, - MEMMAP_HOTPLUG, altmap); + MEMINIT_HOTPLUG, altmap);
set_zone_contiguous(zone); } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 39d530e64098..1cc5dfa418d0 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5589,7 +5589,7 @@ void __ref build_all_zonelists(pg_data_t *pgdat) * done. Non-atomic initialization, single-pass. */ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, - unsigned long start_pfn, enum memmap_context context, + unsigned long start_pfn, enum meminit_context context, struct vmem_altmap *altmap) { unsigned long end_pfn = start_pfn + size; @@ -5616,7 +5616,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, * There can be holes in boot-time mem_map[]s handed to this * function. They do not exist on hotplugged memory. */ - if (context != MEMMAP_EARLY) + if (context != MEMINIT_EARLY) goto not_early;
if (!early_pfn_valid(pfn)) { @@ -5654,7 +5654,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, not_early: page = pfn_to_page(pfn); __init_single_page(page, pfn, zone, nid); - if (context == MEMMAP_HOTPLUG) + if (context == MEMINIT_HOTPLUG) SetPageReserved(page);
/* @@ -5669,7 +5669,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, * check here not to call set_pageblock_migratetype() against * pfn out of zone. * - * Please note that MEMMAP_HOTPLUG path doesn't clear memmap + * Please note that MEMINIT_HOTPLUG path doesn't clear memmap * because this is done early in sparse_add_one_section */ if (!(pfn & (pageblock_nr_pages - 1))) { @@ -5690,7 +5690,8 @@ static void __meminit zone_init_free_lists(struct zone *zone)
#ifndef __HAVE_ARCH_MEMMAP_INIT #define memmap_init(size, nid, zone, start_pfn) \ - memmap_init_zone((size), (nid), (zone), (start_pfn), MEMMAP_EARLY, NULL) + memmap_init_zone((size), (nid), (zone), (start_pfn), \ + MEMINIT_EARLY, NULL) #endif
static int zone_batchsize(struct zone *zone)
From: Laurent Dufour ldufour@linux.ibm.com
commit f85086f95fa36194eb0db5cd5c12e56801b98523 upstream.
In register_mem_sect_under_node() the system_state's value is checked to detect whether the call is made during boot time or during an hot-plug operation. Unfortunately, that check against SYSTEM_BOOTING is wrong because regular memory is registered at SYSTEM_SCHEDULING state. In addition, memory hot-plug operation can be triggered at this system state by the ACPI [1]. So checking against the system state is not enough.
The consequence is that on system with interleaved node's ranges like this:
Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff]
This can be seen on PowerPC LPAR after multiple memory hot-plug and hot-unplug operations are done. At the next reboot the node's memory ranges can be interleaved and since the call to link_mem_sections() is made in topology_init() while the system is in the SYSTEM_SCHEDULING state, the node's id is not checked, and the sections registered to multiple nodes:
$ ls -l /sys/devices/system/memory/memory21/node* total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
In that case, the system is able to boot but if later one of theses memory blocks is hot-unplugged and then hot-plugged, the sysfs inconsistency is detected and this is triggering a BUG_ON():
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c
This patch addresses the root cause by not relying on the system_state value to detect whether the call is due to a hot-plug operation. An extra parameter is added to link_mem_sections() detailing whether the operation is due to a hot-plug operation.
[1] According to Oscar Salvador, using this qemu command line, ACPI memory hotplug operations are raised at SYSTEM_SCHEDULING state:
$QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \ -m size=$MEM,slots=255,maxmem=4294967296k \ -numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \ -object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \ -object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \ -object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \ -object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \ -object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \ -object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \ -object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \
Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()") Signed-off-by: Laurent Dufour ldufour@linux.ibm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Oscar Salvador osalvador@suse.de Acked-by: Michal Hocko mhocko@suse.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: "Rafael J. Wysocki" rafael@kernel.org Cc: Fenghua Yu fenghua.yu@intel.com Cc: Nathan Lynch nathanl@linux.ibm.com Cc: Scott Cheloha cheloha@linux.ibm.com Cc: Tony Luck tony.luck@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/node.c | 84 ++++++++++++++++++++++++++++---------------- include/linux/node.h | 11 +++--- mm/memory_hotplug.c | 3 +- 3 files changed, 63 insertions(+), 35 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c index 44710399f21f..2dc273d69175 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -408,10 +408,32 @@ static int __ref get_nid_for_pfn(unsigned long pfn) return pfn_to_nid(pfn); }
+static int do_register_memory_block_under_node(int nid, + struct memory_block *mem_blk) +{ + int ret; + + /* + * If this memory block spans multiple nodes, we only indicate + * the last processed node. + */ + mem_blk->nid = nid; + + ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj, + &mem_blk->dev.kobj, + kobject_name(&mem_blk->dev.kobj)); + if (ret) + return ret; + + return sysfs_create_link_nowarn(&mem_blk->dev.kobj, + &node_devices[nid]->dev.kobj, + kobject_name(&node_devices[nid]->dev.kobj)); +} + /* register memory section under specified node if it spans that node */ -int register_mem_sect_under_node(struct memory_block *mem_blk, void *arg) +int register_mem_block_under_node_early(struct memory_block *mem_blk, void *arg) { - int ret, nid = *(int *)arg; + int nid = *(int *)arg; unsigned long pfn, sect_start_pfn, sect_end_pfn;
sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr); @@ -431,38 +453,33 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, void *arg) }
/* - * We need to check if page belongs to nid only for the boot - * case, during hotplug we know that all pages in the memory - * block belong to the same node. - */ - if (system_state == SYSTEM_BOOTING) { - page_nid = get_nid_for_pfn(pfn); - if (page_nid < 0) - continue; - if (page_nid != nid) - continue; - } - - /* - * If this memory block spans multiple nodes, we only indicate - * the last processed node. + * We need to check if page belongs to nid only at the boot + * case because node's ranges can be interleaved. */ - mem_blk->nid = nid; - - ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj, - &mem_blk->dev.kobj, - kobject_name(&mem_blk->dev.kobj)); - if (ret) - return ret; + page_nid = get_nid_for_pfn(pfn); + if (page_nid < 0) + continue; + if (page_nid != nid) + continue;
- return sysfs_create_link_nowarn(&mem_blk->dev.kobj, - &node_devices[nid]->dev.kobj, - kobject_name(&node_devices[nid]->dev.kobj)); + return do_register_memory_block_under_node(nid, mem_blk); } /* mem section does not span the specified node */ return 0; }
+/* + * During hotplug we know that all pages in the memory block belong to the same + * node. + */ +static int register_mem_block_under_node_hotplug(struct memory_block *mem_blk, + void *arg) +{ + int nid = *(int *)arg; + + return do_register_memory_block_under_node(nid, mem_blk); +} + /* * Unregister a memory block device under the node it spans. Memory blocks * with multiple nodes cannot be offlined and therefore also never be removed. @@ -478,10 +495,17 @@ void unregister_memory_block_under_nodes(struct memory_block *mem_blk) kobject_name(&node_devices[mem_blk->nid]->dev.kobj)); }
-int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn) +int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn, + enum meminit_context context) { - return walk_memory_range(start_pfn, end_pfn, (void *)&nid, - register_mem_sect_under_node); + walk_memory_blocks_func_t func; + + if (context == MEMINIT_HOTPLUG) + func = register_mem_block_under_node_hotplug; + else + func = register_mem_block_under_node_early; + + return walk_memory_range(start_pfn, end_pfn, (void *)&nid, func); }
#ifdef CONFIG_HUGETLBFS diff --git a/include/linux/node.h b/include/linux/node.h index 708939bae9aa..a79ec4492650 100644 --- a/include/linux/node.h +++ b/include/linux/node.h @@ -32,11 +32,13 @@ extern struct node *node_devices[]; typedef void (*node_registration_func_t)(struct node *);
#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA) -extern int link_mem_sections(int nid, unsigned long start_pfn, - unsigned long end_pfn); +int link_mem_sections(int nid, unsigned long start_pfn, + unsigned long end_pfn, + enum meminit_context context); #else static inline int link_mem_sections(int nid, unsigned long start_pfn, - unsigned long end_pfn) + unsigned long end_pfn, + enum meminit_context context) { return 0; } @@ -61,7 +63,8 @@ static inline int register_one_node(int nid) if (error) return error; /* link memory sections under this node */ - error = link_mem_sections(nid, start_pfn, end_pfn); + error = link_mem_sections(nid, start_pfn, end_pfn, + MEMINIT_EARLY); }
return error; diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 040ef987a5a1..0031c5063aa1 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1097,7 +1097,8 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) }
/* link memory sections under this node.*/ - ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1)); + ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1), + MEMINIT_HOTPLUG); BUG_ON(ret);
/* create new memmap entry */
From: Al Viro viro@zeniv.linux.org.uk
commit f8d4f44df056c5b504b0d49683fb7279218fd207 upstream.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/eventpoll.c | 37 ++++++++++++++++++------------------- 1 file changed, 18 insertions(+), 19 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 76f9079a6802..d7310a967a0a 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1450,6 +1450,22 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event, RCU_INIT_POINTER(epi->ws, NULL); }
+ /* Add the current item to the list of active epoll hook for this file */ + spin_lock(&tfile->f_lock); + list_add_tail_rcu(&epi->fllink, &tfile->f_ep_links); + spin_unlock(&tfile->f_lock); + + /* + * Add the current item to the RB tree. All RB tree operations are + * protected by "mtx", and ep_insert() is called with "mtx" held. + */ + ep_rbtree_insert(ep, epi); + + /* now check if we've created too many backpaths */ + error = -EINVAL; + if (full_check && reverse_path_check()) + goto error_remove_epi; + /* Initialize the poll table using the queue callback */ epq.epi = epi; init_poll_funcptr(&epq.pt, ep_ptable_queue_proc); @@ -1472,22 +1488,6 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event, if (epi->nwait < 0) goto error_unregister;
- /* Add the current item to the list of active epoll hook for this file */ - spin_lock(&tfile->f_lock); - list_add_tail_rcu(&epi->fllink, &tfile->f_ep_links); - spin_unlock(&tfile->f_lock); - - /* - * Add the current item to the RB tree. All RB tree operations are - * protected by "mtx", and ep_insert() is called with "mtx" held. - */ - ep_rbtree_insert(ep, epi); - - /* now check if we've created too many backpaths */ - error = -EINVAL; - if (full_check && reverse_path_check()) - goto error_remove_epi; - /* We have to drop the new item inside our item list to keep track of it */ spin_lock_irq(&ep->wq.lock);
@@ -1516,6 +1516,8 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
return 0;
+error_unregister: + ep_unregister_pollwait(ep, epi); error_remove_epi: spin_lock(&tfile->f_lock); list_del_rcu(&epi->fllink); @@ -1523,9 +1525,6 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
rb_erase_cached(&epi->rbn, &ep->rbr);
-error_unregister: - ep_unregister_pollwait(ep, epi); - /* * We need to do this because an event could have been arrived on some * allocated wait queue. Note that we don't care about the ep->ovflist
From: Al Viro viro@zeniv.linux.org.uk
commit 18306c404abe18a0972587a6266830583c60c928 upstream.
removes the need to clear it, along with the races.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/eventpoll.c | 26 +++++++------------------- 1 file changed, 7 insertions(+), 19 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c index d7310a967a0a..e4f2c1c54dfe 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -222,8 +222,7 @@ struct eventpoll { struct file *file;
/* used to optimize loop detection check */ - int visited; - struct list_head visited_list_link; + u64 gen;
#ifdef CONFIG_NET_RX_BUSY_POLL /* used to track busy poll napi_id */ @@ -273,6 +272,8 @@ static long max_user_watches __read_mostly; */ static DEFINE_MUTEX(epmutex);
+static u64 loop_check_gen = 0; + /* Used to check for epoll file descriptor inclusion loops */ static struct nested_calls poll_loop_ncalls;
@@ -282,9 +283,6 @@ static struct kmem_cache *epi_cache __read_mostly; /* Slab cache used to allocate "struct eppoll_entry" */ static struct kmem_cache *pwq_cache __read_mostly;
-/* Visited nodes during ep_loop_check(), so we can unset them when we finish */ -static LIST_HEAD(visited_list); - /* * List of files with newly added links, where we may need to limit the number * of emanating paths. Protected by the epmutex. @@ -1869,13 +1867,12 @@ static int ep_loop_check_proc(void *priv, void *cookie, int call_nests) struct epitem *epi;
mutex_lock_nested(&ep->mtx, call_nests + 1); - ep->visited = 1; - list_add(&ep->visited_list_link, &visited_list); + ep->gen = loop_check_gen; for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = rb_next(rbp)) { epi = rb_entry(rbp, struct epitem, rbn); if (unlikely(is_file_epoll(epi->ffd.file))) { ep_tovisit = epi->ffd.file->private_data; - if (ep_tovisit->visited) + if (ep_tovisit->gen == loop_check_gen) continue; error = ep_call_nested(&poll_loop_ncalls, EP_MAX_NESTS, ep_loop_check_proc, epi->ffd.file, @@ -1916,18 +1913,8 @@ static int ep_loop_check_proc(void *priv, void *cookie, int call_nests) */ static int ep_loop_check(struct eventpoll *ep, struct file *file) { - int ret; - struct eventpoll *ep_cur, *ep_next; - - ret = ep_call_nested(&poll_loop_ncalls, EP_MAX_NESTS, + return ep_call_nested(&poll_loop_ncalls, EP_MAX_NESTS, ep_loop_check_proc, file, ep, current); - /* clear visited list */ - list_for_each_entry_safe(ep_cur, ep_next, &visited_list, - visited_list_link) { - ep_cur->visited = 0; - list_del(&ep_cur->visited_list_link); - } - return ret; }
static void clear_tfile_check_list(void) @@ -2149,6 +2136,7 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd, error_tgt_fput: if (full_check) { clear_tfile_check_list(); + loop_check_gen++; mutex_unlock(&epmutex); }
From: Al Viro viro@zeniv.linux.org.uk
commit fe0a916c1eae8e17e86c3753d13919177d63ed7e upstream.
Checking for the lack of epitems refering to the epoll we want to insert into is not enough; we might have an insertion of that epoll into another one that has already collected the set of files to recheck for excessive reverse paths, but hasn't gotten to creating/inserting the epitem for it.
However, any such insertion in progress can be detected - it will update the generation count in our epoll when it's done looking through it for files to check. That gets done under ->mtx of our epoll and that allows us to detect that safely.
We are *not* holding epmutex here, so the generation count is not stable. However, since both the update of ep->gen by loop check and (later) insertion into ->f_ep_link are done with ep->mtx held, we are fine - the sequence is grab epmutex bump loop_check_gen ... grab tep->mtx // 1 tep->gen = loop_check_gen ... drop tep->mtx // 2 ... grab tep->mtx // 3 ... insert into ->f_ep_link ... drop tep->mtx // 4 bump loop_check_gen drop epmutex and if the fastpath check in another thread happens for that eventpoll, it can come * before (1) - in that case fastpath is just fine * after (4) - we'll see non-empty ->f_ep_link, slow path taken * between (2) and (3) - loop_check_gen is stable, with ->mtx providing barriers and we end up taking slow path.
Note that ->f_ep_link emptiness check is slightly racy - we are protected against insertions into that list, but removals can happen right under us. Not a problem - in the worst case we'll end up taking a slow path for no good reason.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/eventpoll.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c index e4f2c1c54dfe..59f9d28964ad 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -2076,6 +2076,7 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd, mutex_lock_nested(&ep->mtx, 0); if (op == EPOLL_CTL_ADD) { if (!list_empty(&f.file->f_ep_links) || + ep->gen == loop_check_gen || is_file_epoll(tf.file)) { full_check = 1; mutex_unlock(&ep->mtx);
From: Al Viro viro@zeniv.linux.org.uk
commit 3701cb59d892b88d569427586f01491552f377b1 upstream.
or get freed, for that matter, if it's a long (separately stored) name.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/eventpoll.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 59f9d28964ad..cf332e8f6bdf 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1376,7 +1376,7 @@ static int reverse_path_check(void)
static int ep_create_wakeup_source(struct epitem *epi) { - const char *name; + struct name_snapshot n; struct wakeup_source *ws;
if (!epi->ep->ws) { @@ -1385,8 +1385,9 @@ static int ep_create_wakeup_source(struct epitem *epi) return -ENOMEM; }
- name = epi->ffd.file->f_path.dentry->d_name.name; - ws = wakeup_source_register(name); + take_dentry_name_snapshot(&n, epi->ffd.file->f_path.dentry); + ws = wakeup_source_register(n.name); + release_dentry_name_snapshot(&n);
if (!ws) return -ENOMEM;
From: Will McVicker willmcvicker@google.com
commit 1cc5ef91d2ff94d2bf2de3b3585423e8a1051cb6 upstream.
The indexes to the nf_nat_l[34]protos arrays come from userspace. So check the tuple's family, e.g. l3num, when creating the conntrack in order to prevent an OOB memory access during setup. Here is an example kernel panic on 4.14.180 when userspace passes in an index greater than NFPROTO_NUMPROTO.
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in:... Process poc (pid: 5614, stack limit = 0x00000000a3933121) CPU: 4 PID: 5614 Comm: poc Tainted: G S W O 4.14.180-g051355490483 Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM task: 000000002a3dfffe task.stack: 00000000a3933121 pc : __cfi_check_fail+0x1c/0x24 lr : __cfi_check_fail+0x1c/0x24 ... Call trace: __cfi_check_fail+0x1c/0x24 name_to_dev_t+0x0/0x468 nfnetlink_parse_nat_setup+0x234/0x258 ctnetlink_parse_nat_setup+0x4c/0x228 ctnetlink_new_conntrack+0x590/0xc40 nfnetlink_rcv_msg+0x31c/0x4d4 netlink_rcv_skb+0x100/0x184 nfnetlink_rcv+0xf4/0x180 netlink_unicast+0x360/0x770 netlink_sendmsg+0x5a0/0x6a4 ___sys_sendmsg+0x314/0x46c SyS_sendmsg+0xb4/0x108 el0_svc_naked+0x34/0x38
This crash is not happening since 5.4+, however, ctnetlink still allows for creating entries with unsupported layer 3 protocol number.
Fixes: c1d10adb4a521 ("[NETFILTER]: Add ctnetlink port for nf_conntrack") Signed-off-by: Will McVicker willmcvicker@google.com [pablo@netfilter.org: rebased original patch on top of nf.git] Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_conntrack_netlink.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 31fa94064a62..0b89609a6e9d 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1129,6 +1129,8 @@ ctnetlink_parse_tuple(const struct nlattr * const cda[], if (!tb[CTA_TUPLE_IP]) return -EINVAL;
+ if (l3num != NFPROTO_IPV4 && l3num != NFPROTO_IPV6) + return -EOPNOTSUPP; tuple->src.l3num = l3num;
err = ctnetlink_parse_tuple_ip(tb[CTA_TUPLE_IP], tuple);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
Merge 38 patches from 4.19.150 stable branch (39 total) beside 1 already merged patches: 1c3886dc3023 net/packet: fix overflow in tpacket_rcv
Tested-by: Jon Hunter jonathanh@nvidia.com Tested-by: Shuah Khan skhan@linuxfoundation.org Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20201005142108.650363140@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile index 3ff5cf33ef55..65485185bec2 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 4 PATCHLEVEL = 19 -SUBLEVEL = 149 +SUBLEVEL = 150 EXTRAVERSION = NAME = "People's Front"