Pali Rohár (2): PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO response PCI: aardvark: Fix kernel panic during PIO transfer
Remi Pommarel (1): PCI: aardvark: Don't rely on jiffies while holding spinlock
drivers/pci/controller/pci-aardvark.c | 59 ++++++++++++++++++++------- 1 file changed, 45 insertions(+), 14 deletions(-)
-- 2.25.1
From: Remi Pommarel repk@triplefau.lt
stable inclusion from stable-v4.19.198 commit 6571c80330ac43f2487403f04efcdbbcf796cf7b category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9R4IP CVE: CVE-2021-47229
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
---------------------------
commit 7fbcb5da811be7d47468417c7795405058abb3da upstream.
advk_pcie_wait_pio() can be called while holding a spinlock (from pci_bus_read_config_dword()), then depends on jiffies in order to timeout while polling on PIO state registers. In the case the PIO transaction failed, the timeout will never happen and will also cause the cpu to stall.
This decrements a variable and wait instead of using jiffies.
Signed-off-by: Remi Pommarel repk@triplefau.lt Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Andrew Murray andrew.murray@arm.com Acked-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zeng Heng zengheng4@huawei.com --- drivers/pci/controller/pci-aardvark.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c index 6b4555ff2548..a96c38ac84a1 100644 --- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -166,7 +166,8 @@ (PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \ PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
-#define PIO_TIMEOUT_MS 1 +#define PIO_RETRY_CNT 500 +#define PIO_RETRY_DELAY 2 /* 2 us*/
#define LINK_WAIT_MAX_RETRIES 10 #define LINK_WAIT_USLEEP_MIN 90000 @@ -373,17 +374,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie) static int advk_pcie_wait_pio(struct advk_pcie *pcie) { struct device *dev = &pcie->pdev->dev; - unsigned long timeout; + int i;
- timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS); - - while (time_before(jiffies, timeout)) { + for (i = 0; i < PIO_RETRY_CNT; i++) { u32 start, isr;
start = advk_readl(pcie, PIO_START); isr = advk_readl(pcie, PIO_ISR); if (!start && isr) return 0; + udelay(PIO_RETRY_DELAY); }
dev_err(dev, "config read/write timed out\n");
From: Pali Rohár pali@kernel.org
stable inclusion from stable-v4.19.207 commit 19d29895971c1ba4d1a31577221aa2bb1931586b category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9R4IP CVE: CVE-2021-47229
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
---------------------------
commit 02bcec3ea5591720114f586960490b04b093a09e upstream.
Measurements in different conditions showed that aardvark hardware PIO response can take up to 1.44s. Increase wait timeout from 1ms to 1.5s to ensure that we do not miss responses from hardware. After 1.44s hardware returns errors (e.g. Completer abort).
The previous two patches fixed checking for PIO status, so now we can use it to also catch errors which are reported by hardware after 1.44s.
After applying this patch, kernel can detect and print PIO errors to dmesg:
[ 6.879999] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100004 [ 6.896436] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004 [ 6.913049] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100010 [ 6.929663] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100010 [ 6.953558] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100014 [ 6.970170] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100014 [ 6.994328] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004
Without this patch kernel prints only a generic error to dmesg:
[ 5.246847] advk-pcie d0070000.pcie: config read/write timed out
Link: https://lore.kernel.org/r/20210722144041.12661-3-pali@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Marek Behún kabel@kernel.org Cc: stable@vger.kernel.org # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zeng Heng zengheng4@huawei.com --- drivers/pci/controller/pci-aardvark.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c index a96c38ac84a1..a40f1771d72d 100644 --- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -166,7 +166,7 @@ (PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \ PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
-#define PIO_RETRY_CNT 500 +#define PIO_RETRY_CNT 750000 /* 1.5 s */ #define PIO_RETRY_DELAY 2 /* 2 us*/
#define LINK_WAIT_MAX_RETRIES 10
From: Pali Rohár pali@kernel.org
stable inclusion from stable-v4.19.198 commit b00a9aaa4be20ad6e3311fb78a485eae0899e89a category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9R4IP CVE: CVE-2021-47229
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
---------------------------
commit f18139966d072dab8e4398c95ce955a9742e04f7 upstream.
Trying to start a new PIO transfer by writing value 0 in PIO_START register when previous transfer has not yet completed (which is indicated by value 1 in PIO_START) causes an External Abort on CPU, which results in kernel panic:
SError Interrupt on CPU0, code 0xbf000002 -- SError Kernel panic - not syncing: Asynchronous SError Interrupt
To prevent kernel panic, it is required to reject a new PIO transfer when previous one has not finished yet.
If previous PIO transfer is not finished yet, the kernel may issue a new PIO request only if the previous PIO transfer timed out.
In the past the root cause of this issue was incorrectly identified (as it often happens during link retraining or after link down event) and special hack was implemented in Trusted Firmware to catch all SError events in EL3, to ignore errors with code 0xbf000002 and not forwarding any other errors to kernel and instead throw panic from EL3 Trusted Firmware handler.
Links to discussion and patches about this issue: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dc... https://lore.kernel.org/linux-pci/20190316161243.29517-1-repk@triplefau.lt/ https://lore.kernel.org/linux-pci/971be151d24312cc533989a64bd454b4@www.loen.... https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541
But the real cause was the fact that during link retraining or after link down event the PIO transfer may take longer time, up to the 1.44s until it times out. This increased probability that a new PIO transfer would be issued by kernel while previous one has not finished yet.
After applying this change into the kernel, it is possible to revert the mentioned TF-A hack and SError events do not have to be caught in TF-A EL3.
Link: https://lore.kernel.org/r/20210608203655.31228-1-pali@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Marek Behún kabel@kernel.org Cc: stable@vger.kernel.org # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zeng Heng zengheng4@huawei.com --- drivers/pci/controller/pci-aardvark.c | 49 ++++++++++++++++++++++----- 1 file changed, 40 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c index a40f1771d72d..cc81417026cf 100644 --- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -386,7 +386,7 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie) udelay(PIO_RETRY_DELAY); }
- dev_err(dev, "config read/write timed out\n"); + dev_err(dev, "PIO read/write transfer time out\n"); return -ETIMEDOUT; }
@@ -399,6 +399,35 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, return true; }
+static bool advk_pcie_pio_is_running(struct advk_pcie *pcie) +{ + struct device *dev = &pcie->pdev->dev; + + /* + * Trying to start a new PIO transfer when previous has not completed + * cause External Abort on CPU which results in kernel panic: + * + * SError Interrupt on CPU0, code 0xbf000002 -- SError + * Kernel panic - not syncing: Asynchronous SError Interrupt + * + * Functions advk_pcie_rd_conf() and advk_pcie_wr_conf() are protected + * by raw_spin_lock_irqsave() at pci_lock_config() level to prevent + * concurrent calls at the same time. But because PIO transfer may take + * about 1.5s when link is down or card is disconnected, it means that + * advk_pcie_wait_pio() does not always have to wait for completion. + * + * Some versions of ARM Trusted Firmware handles this External Abort at + * EL3 level and mask it to prevent kernel panic. Relevant TF-A commit: + * https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dc... + */ + if (advk_readl(pcie, PIO_START)) { + dev_err(dev, "Previous PIO read/write transfer is still running\n"); + return true; + } + + return false; +} + static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn, int where, int size, u32 *val) { @@ -411,9 +440,10 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn, return PCIBIOS_DEVICE_NOT_FOUND; }
- /* Start PIO */ - advk_writel(pcie, 0, PIO_START); - advk_writel(pcie, 1, PIO_ISR); + if (advk_pcie_pio_is_running(pcie)) { + *val = 0xffffffff; + return PCIBIOS_SET_FAILED; + }
/* Program the control register */ reg = advk_readl(pcie, PIO_CTRL); @@ -432,7 +462,8 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn, /* Program the data strobe */ advk_writel(pcie, 0xf, PIO_WR_DATA_STRB);
- /* Start the transfer */ + /* Clear PIO DONE ISR and start the transfer */ + advk_writel(pcie, 1, PIO_ISR); advk_writel(pcie, 1, PIO_START);
ret = advk_pcie_wait_pio(pcie); @@ -466,9 +497,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn, if (where % size) return PCIBIOS_SET_FAILED;
- /* Start PIO */ - advk_writel(pcie, 0, PIO_START); - advk_writel(pcie, 1, PIO_ISR); + if (advk_pcie_pio_is_running(pcie)) + return PCIBIOS_SET_FAILED;
/* Program the control register */ reg = advk_readl(pcie, PIO_CTRL); @@ -495,7 +525,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn, /* Program the data strobe */ advk_writel(pcie, data_strobe, PIO_WR_DATA_STRB);
- /* Start the transfer */ + /* Clear PIO DONE ISR and start the transfer */ + advk_writel(pcie, 1, PIO_ISR); advk_writel(pcie, 1, PIO_START);
ret = advk_pcie_wait_pio(pcie);
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/9119 邮件列表地址:https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/7...
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/9119 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/7...