CVE-2024-53237
Aaron Lu (1): x86/sgx: Fix deadlock in SGX NUMA node search
Weili Qian (1): crypto: hisilicon/qm - inject error before stopping queue
From: Aaron Lu aaron.lu@intel.com
mainline inclusion
from mainline-v6.12-rc1
commit 9c936844010466535bd46ea4ce4656ef17653644
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAYQRL
CVE: CVE-2024-49856
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
x86/sgx: Fix deadlock in SGX NUMA node search
When the current node doesn't have an EPC section configured by firmware and all other EPC sections are used up, CPU can get stuck inside the while loop that looks for an available EPC page from remote nodes indefinitely, leading to a soft lockup. Note how nid_of_current will never be equal to nid in that while loop because nid_of_current is not set in sgx_numa_mask.
Also worth mentioning is that it's perfectly fine for the firmware not to set up an EPC section on a node. While setting up an EPC section on each node can enhance performance, it is not a requirement for functionality.
Rework the loop to start and end on *a* node that has SGX memory. This avoids the deadlock looking for the current SGX-lacking node to show up in the loop when it never will.
Fixes: 901ddbb9ecf5 ("x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()")
Reported-by: "Molina Sabido, Gerardo" gerardo.molina.sabido@intel.com
Signed-off-by: Aaron Lu aaron.lu@intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Kai Huang kai.huang@intel.com
Reviewed-by: Jarkko Sakkinen jarkko@kernel.org
Acked-by: Dave Hansen dave.hansen@linux.intel.com
Tested-by: Zhimin Luo zhimin.luo@intel.com
Link: https://lore.kernel.org/all/20240905080855.1699814-2-aaron.lu%40intel.com
Signed-off-by: Zhao Yipeng zhaoyipeng5@huawei.com
(cherry picked from commit 318889e81d06ab0505ca5865a2d2fe2dd67acee5)
Signed-off-by: Guo Mengqi guomengqi3@huawei.com
---
 arch/x86/kernel/cpu/sgx/main.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 44b83ddc0c8f..a5341abb13ac 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -471,24 +471,25 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void)
 {
 	struct sgx_epc_page *page;
 	int nid_of_current = numa_node_id();
-	int nid = nid_of_current;
+	int nid_start, nid;

-	if (node_isset(nid_of_current, sgx_numa_mask)) {
-		page = __sgx_alloc_epc_page_from_node(nid_of_current);
-		if (page)
-			return page;
-	}
-
-	/* Fall back to the non-local NUMA nodes: */
-	while (true) {
-		nid = next_node_in(nid, sgx_numa_mask);
-		if (nid == nid_of_current)
-			break;
+	/*
+	 * Try local node first. If it doesn't have an EPC section,
+	 * fall back to the non-local NUMA nodes.
+	 */
+	if (node_isset(nid_of_current, sgx_numa_mask))
+		nid_start = nid_of_current;
+	else
+		nid_start = next_node_in(nid_of_current, sgx_numa_mask);

+	nid = nid_start;
+	do {
 		page = __sgx_alloc_epc_page_from_node(nid);
 		if (page)
 			return page;
-	}
+
+		nid = next_node_in(nid, sgx_numa_mask);
+	} while (nid != nid_start);

 	return ERR_PTR(-ENOMEM);
 }
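
The difference between the old and the reworked loop can be reproduced outside the kernel. The following is a minimal user-space sketch, not kernel code: `struct epc_model`, `model_next_node_in()`, `alloc_old()`, `alloc_new()` and the `max_steps` budget are all hypothetical names introduced here for illustration, and the bitmask only approximates how `sgx_numa_mask` and `next_node_in()` behave. The old loop is given a step budget so that the lockup shows up as a return value instead of hanging.

```c
#include <assert.h>

#define MAX_NODES 8

enum { ALLOC_ENOMEM = -1, ALLOC_STUCK = -2 };

/* Hypothetical model of the allocator state; not a kernel structure. */
struct epc_model {
	unsigned int mask;         /* bit n set: node n has an EPC section */
	int free_pages[MAX_NODES]; /* free EPC pages per node */
};

static int node_in_mask(const struct epc_model *m, int nid)
{
	return (m->mask >> nid) & 1u;
}

/* Approximates next_node_in(): next set bit after nid, wrapping around. */
static int model_next_node_in(const struct epc_model *m, int nid)
{
	for (int i = 1; i <= MAX_NODES; i++) {
		int cand = (nid + i) % MAX_NODES;

		if (node_in_mask(m, cand))
			return cand;
	}
	return MAX_NODES; /* only reachable with an empty mask */
}

/* Take one page from nid if available; returns 1 on success. */
static int take_page(struct epc_model *m, int nid)
{
	if (m->free_pages[nid] <= 0)
		return 0;
	m->free_pages[nid]--;
	return 1;
}

/* Old loop shape. max_steps is an assumption of this sketch so that the
 * endless search is reported as ALLOC_STUCK rather than spinning. */
static int alloc_old(struct epc_model *m, int cur, int max_steps)
{
	int nid = cur;

	assert(m->mask != 0);
	if (node_in_mask(m, cur) && take_page(m, cur))
		return cur;

	for (int step = 0; step < max_steps; step++) {
		nid = model_next_node_in(m, nid);
		if (nid == cur) /* never true when cur is not in the mask */
			return ALLOC_ENOMEM;
		if (take_page(m, nid))
			return nid;
	}
	return ALLOC_STUCK;
}

/* Reworked loop shape: start and end on a node that is in the mask. */
static int alloc_new(struct epc_model *m, int cur)
{
	int nid_start, nid;

	assert(m->mask != 0);
	nid_start = node_in_mask(m, cur) ? cur : model_next_node_in(m, cur);

	nid = nid_start;
	do {
		if (take_page(m, nid))
			return nid;
		nid = model_next_node_in(m, nid);
	} while (nid != nid_start);

	return ALLOC_ENOMEM;
}
```

With nodes 1 and 3 in the mask, no free pages, and the caller running on node 0, `alloc_old()` exhausts its step budget while `alloc_new()` visits each masked node once and reports out-of-memory.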
From: Weili Qian qianweili@huawei.com
stable inclusion
from stable-v6.1.113
commit 98d3be34c9153eceadb56de50d9f9347e88d86e4
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAYQSI
CVE: CVE-2024-47730
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit b04f06fc0243600665b3b50253869533b7938468 ]
The master ooo cannot be completely closed when the accelerator core reports a memory error. Therefore, the driver needs to inject the qm error to close the master ooo. Currently, the qm error is injected after stopping the queue. Memory may be released immediately after the queue is stopped, so the device can access the released memory. Therefore, inject the error to close the master ooo before stopping the queue, which ensures that the device does not access the released memory.
Fixes: 6c6dd5802c2d ("crypto: hisilicon/qm - add controller reset interface")
Signed-off-by: Weili Qian qianweili@huawei.com
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
Signed-off-by: Sasha Levin sashal@kernel.org

Conflicts:
	drivers/crypto/hisilicon/qm.c
[context conflict]
Signed-off-by: Guo Mengqi guomengqi3@huawei.com
---
 drivers/crypto/hisilicon/qm.c | 51 +++++++++++++++++------------------
 1 file changed, 24 insertions(+), 27 deletions(-)
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index 382e54406510..179ec9710d8c 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -5058,6 +5058,28 @@ static int qm_set_vf_mse(struct hisi_qm *qm, bool set)
 	return -ETIMEDOUT;
 }

+static void qm_dev_ecc_mbit_handle(struct hisi_qm *qm)
+{
+	u32 nfe_enb = 0;
+
+	/* Kunpeng930 hardware automatically close master ooo when NFE occurs */
+	if (qm->ver >= QM_HW_V3)
+		return;
+
+	if (!qm->err_status.is_dev_ecc_mbit &&
+	    qm->err_status.is_qm_ecc_mbit &&
+	    qm->err_ini->close_axi_master_ooo) {
+		qm->err_ini->close_axi_master_ooo(qm);
+	} else if (qm->err_status.is_dev_ecc_mbit &&
+		   !qm->err_status.is_qm_ecc_mbit &&
+		   !qm->err_ini->close_axi_master_ooo) {
+		nfe_enb = readl(qm->io_base + QM_RAS_NFE_ENABLE);
+		writel(nfe_enb & QM_RAS_NFE_MBIT_DISABLE,
+		       qm->io_base + QM_RAS_NFE_ENABLE);
+		writel(QM_ECC_MBIT, qm->io_base + QM_ABNORMAL_INT_SET);
+	}
+}
+
 static int qm_vf_reset_prepare(struct hisi_qm *qm,
 			       enum qm_stop_reason stop_reason)
 {
@@ -5122,6 +5144,8 @@ static int qm_controller_reset_prepare(struct hisi_qm *qm)
 		return ret;
 	}

+	qm_dev_ecc_mbit_handle(qm);
+
 	/* PF obtains the information of VF by querying the register. */
 	qm_cmd_uninit(qm);

@@ -5146,31 +5170,6 @@ static int qm_controller_reset_prepare(struct hisi_qm *qm)
 	return 0;
 }

-static void qm_dev_ecc_mbit_handle(struct hisi_qm *qm)
-{
-	u32 nfe_enb = 0;
-
-	/* Kunpeng930 hardware automatically close master ooo when NFE occurs */
-	if (qm->ver >= QM_HW_V3)
-		return;
-
-	if (!qm->err_status.is_dev_ecc_mbit &&
-	    qm->err_status.is_qm_ecc_mbit &&
-	    qm->err_ini->close_axi_master_ooo) {
-
-		qm->err_ini->close_axi_master_ooo(qm);
-
-	} else if (qm->err_status.is_dev_ecc_mbit &&
-		   !qm->err_status.is_qm_ecc_mbit &&
-		   !qm->err_ini->close_axi_master_ooo) {
-
-		nfe_enb = readl(qm->io_base + QM_RAS_NFE_ENABLE);
-		writel(nfe_enb & QM_RAS_NFE_MBIT_DISABLE,
-		       qm->io_base + QM_RAS_NFE_ENABLE);
-		writel(QM_ECC_MBIT, qm->io_base + QM_ABNORMAL_INT_SET);
-	}
-}
-
 static int qm_soft_reset(struct hisi_qm *qm)
 {
 	struct pci_dev *pdev = qm->pdev;
@@ -5196,8 +5195,6 @@ static int qm_soft_reset(struct hisi_qm *qm)
 		return ret;
 	}

-	qm_dev_ecc_mbit_handle(qm);
-
 	/* OOO register set and check */
 	writel(ACC_MASTER_GLOBAL_CTRL_SHUTDOWN,
 	       qm->io_base + ACC_MASTER_GLOBAL_CTRL);
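
The ordering argument of the hisilicon/qm patch can be illustrated with a very small user-space model. This is only a sketch under simplifying assumptions: `struct qm_model` and every `model_*` function are hypothetical names that do not exist in the driver, stopping the queue is assumed to release its memory immediately, and the device is assumed to touch queue memory at any point while the master ooo is still open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; not the driver's struct hisi_qm. */
struct qm_model {
	bool master_ooo_open; /* device can still issue memory accesses */
	bool queue_mem_valid; /* queue memory has not been released */
	bool stale_access;    /* device touched memory after release */
};

static void model_init(struct qm_model *qm)
{
	qm->master_ooo_open = true;
	qm->queue_mem_valid = true;
	qm->stale_access = false;
}

/* Assumption: an open master may access queue memory at any time. */
static void model_device_activity(struct qm_model *qm)
{
	if (qm->master_ooo_open && !qm->queue_mem_valid)
		qm->stale_access = true;
}

/* Stands in for injecting the qm error, which closes the master ooo. */
static void model_inject_error(struct qm_model *qm)
{
	qm->master_ooo_open = false;
	model_device_activity(qm);
}

/* Stands in for stopping the queue; memory is released right away. */
static void model_stop_queue(struct qm_model *qm)
{
	qm->queue_mem_valid = false;
	model_device_activity(qm);
}

/* Order before the patch: stop the queue, then inject the error.
 * Returns true if the device accessed released memory. */
static bool model_reset_old_order(struct qm_model *qm)
{
	model_init(qm);
	model_stop_queue(qm);
	model_inject_error(qm);
	return qm->stale_access;
}

/* Order after the patch: inject the error, then stop the queue. */
static bool model_reset_new_order(struct qm_model *qm)
{
	model_init(qm);
	model_inject_error(qm);
	model_stop_queue(qm);
	return qm->stale_access;
}
```

In this model only the old order leaves a window in which the master is open and the memory is already released; the real driver has more steps between these two points, which the sketch deliberately leaves out.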
Feedback: The patch(es) you sent to kernel@openeuler.org could not be converted to a PR.
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/W...
Failure reason: applying the patch(es) failed; patch failed at 0001 x86/sgx: Fix deadlock in SGX NUMA node search
Suggested solution: check the failure reason and confirm that the patch(es) apply to the latest code on the intended branch.