[PATCH OLK-6.6 0/4] CVE-2025-39832

Moshe Shemesh (4): net/mlx5: Reload auxiliary drivers on fw_activate net/mlx5: Add device cap for supporting hot reset in sync reset flow net/mlx5: Add support for sync reset using hot reset net/mlx5: Fix lockdep assertion on sync reset unload event drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 184 ++++++++++++++------- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 + include/linux/mlx5/mlx5_ifc.h | 11 +- 5 files changed, 138 insertions(+), 63 deletions(-) -- 2.9.5

反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/18267 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/OO2... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/18267 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/OO2...

From: Moshe Shemesh <moshe@nvidia.com> stable inclusion from stable-v6.6.104 commit 09fd27c8621e33248f6453e05e451167ba904010 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICYBRF CVE: CVE-2025-39832 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 34cc6a54914f478c93e176450fae6313404f9f74 ] The devlink reload fw_activate command performs firmware activation followed by driver reload, while devlink reload driver_reinit triggers only driver reload. However, the driver reload logic differs between the two modes, as on driver_reinit mode mlx5 also reloads auxiliary drivers, while in fw_activate mode the auxiliary drivers are suspended where applicable. Additionally, following the cited commit, if the device has multiple PFs, the behavior during fw_activate may vary between PFs: one PF may suspend auxiliary drivers, while another reloads them. Align devlink dev reload fw_activate behavior with devlink dev reload driver_reinit, to reload all auxiliary drivers. Fixes: 72ed5d5624af ("net/mlx5: Suspend auxiliary devices only in case of PCI device suspend") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Akiva Goldberger <agoldberger@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250825143435.598584-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> --- drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index f66788a..f2d1f7c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -107,7 +107,7 @@ static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netli if (err) return err; - mlx5_unload_one_devl_locked(dev, true); + mlx5_unload_one_devl_locked(dev, false); err = mlx5_health_wait_pci_up(dev); if (err) NL_SET_ERR_MSG_MOD(extack, "FW activate aborted, PCI reads fail after reset"); -- 2.9.5

From: Moshe Shemesh <moshe@nvidia.com> stable inclusion from stable-v6.6.104 commit 6292688e07d066893e85dac7c28db3c78a1c2358 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICYBRF CVE: CVE-2025-39832 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 9947204cdad97d22d171039019a4aad4d6899cdd ] New devices with new FW can support sync reset for firmware activate using hot reset. Add capability for supporting it and add MFRL field to query from FW which type of PCI reset method to use while handling sync reset events. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240911201757.1505453-10-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 902a8bc23a24 ("net/mlx5: Fix lockdep assertion on sync reset unload event") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> --- include/linux/mlx5/mlx5_ifc.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 9106771..4913d36 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1731,7 +1731,8 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 reserved_at_328[0x2]; u8 relaxed_ordering_read[0x1]; u8 log_max_pd[0x5]; - u8 reserved_at_330[0x6]; + u8 reserved_at_330[0x5]; + u8 pcie_reset_using_hotreset_method[0x1]; u8 pci_sync_for_fw_update_with_driver_unload[0x1]; u8 vnic_env_cnt_steering_fail[0x1]; u8 vport_counter_local_loopback[0x1]; @@ -10825,6 +10826,11 @@ struct mlx5_ifc_mcda_reg_bits { }; enum { + MLX5_MFRL_REG_PCI_RESET_METHOD_LINK_TOGGLE = 0, + MLX5_MFRL_REG_PCI_RESET_METHOD_HOT_RESET = 1, +}; + +enum { MLX5_MFRL_REG_RESET_STATE_IDLE = 0, MLX5_MFRL_REG_RESET_STATE_IN_NEGOTIATION = 1, MLX5_MFRL_REG_RESET_STATE_RESET_IN_PROGRESS = 2, @@ -10851,7 +10857,8 @@ struct mlx5_ifc_mfrl_reg_bits { u8 pci_sync_for_fw_update_start[0x1]; u8 pci_sync_for_fw_update_resp[0x2]; u8 rst_type_sel[0x3]; - u8 reserved_at_28[0x4]; + u8 pci_reset_req_method[0x3]; + u8 reserved_at_2b[0x1]; u8 reset_state[0x4]; u8 reset_type[0x8]; u8 reset_level[0x8]; -- 2.9.5

From: Moshe Shemesh <moshe@nvidia.com> stable inclusion from stable-v6.6.104 commit 8da591ae26145e374a9d2148496eed810a5d4f7b category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICYBRF CVE: CVE-2025-39832 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 57502f62678ced0149d415324931bde37b42885a ] On device that supports sync reset for firmware activate using hot reset, the driver queries the required reset method while handling the sync reset request. If the required reset method is hot reset, the driver will use pci_reset_bus() to reset the PCI link instead of the link toggle. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240911201757.1505453-11-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 902a8bc23a24 ("net/mlx5: Fix lockdep assertion on sync reset unload event") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> --- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 82 +++++++++++++++++----- drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 + 2 files changed, 69 insertions(+), 16 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index 6b17346..a1c2dd7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -26,6 +26,7 @@ struct mlx5_fw_reset { struct work_struct reset_now_work; struct work_struct reset_abort_work; unsigned long reset_flags; + u8 reset_method; struct timer_list timer; struct completion done; int ret; @@ -94,7 +95,7 @@ static int mlx5_reg_mfrl_set(struct mlx5_core_dev *dev, u8 reset_level, } static int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, - u8 *reset_type, u8 *reset_state) + u8 *reset_type, u8 *reset_state, u8 *reset_method) { u32 out[MLX5_ST_SZ_DW(mfrl_reg)] = {}; u32 in[MLX5_ST_SZ_DW(mfrl_reg)] = {}; @@ -110,13 +111,26 @@ static int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, *reset_type = MLX5_GET(mfrl_reg, out, reset_type); if (reset_state) *reset_state = MLX5_GET(mfrl_reg, out, reset_state); + if (reset_method) + *reset_method = MLX5_GET(mfrl_reg, out, pci_reset_req_method); return 0; } int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_type) { - return mlx5_reg_mfrl_query(dev, reset_level, reset_type, NULL); + return mlx5_reg_mfrl_query(dev, reset_level, reset_type, NULL, NULL); +} + +static int mlx5_fw_reset_get_reset_method(struct mlx5_core_dev *dev, + u8 *reset_method) +{ + if (!MLX5_CAP_GEN(dev, pcie_reset_using_hotreset_method)) { + *reset_method = MLX5_MFRL_REG_PCI_RESET_METHOD_LINK_TOGGLE; + return 0; + } + + return mlx5_reg_mfrl_query(dev, NULL, NULL, NULL, reset_method); } static int mlx5_fw_reset_get_reset_state_err(struct mlx5_core_dev *dev, @@ -124,7 +138,7 @@ static int mlx5_fw_reset_get_reset_state_err(struct mlx5_core_dev *dev, { u8 reset_state; - if (mlx5_reg_mfrl_query(dev, NULL, NULL, &reset_state)) + if (mlx5_reg_mfrl_query(dev, NULL, NULL, &reset_state, NULL)) goto out; if (!reset_state) @@ -402,7 +416,11 @@ static void mlx5_sync_reset_request_event(struct work_struct *work) struct mlx5_core_dev *dev = fw_reset->dev; int err; - if (test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags) || + err = mlx5_fw_reset_get_reset_method(dev, &fw_reset->reset_method); + if (err) + mlx5_core_warn(dev, "Failed reading MFRL, err %d\n", err); + + if (err || test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags) || !mlx5_is_reset_now_capable(dev)) { err = mlx5_fw_reset_set_reset_sync_nack(dev); mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s", @@ -419,21 +437,15 @@ static void mlx5_sync_reset_request_event(struct work_struct *work) mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n"); } -static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev) +static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev, u16 dev_id) { struct pci_bus *bridge_bus = dev->pdev->bus; struct pci_dev *bridge = bridge_bus->self; unsigned long timeout; struct pci_dev *sdev; - u16 reg16, dev_id; int cap, err; + u16 reg16; - err = pci_read_config_word(dev->pdev, PCI_DEVICE_ID, &dev_id); - if (err) - return pcibios_err_to_errno(err); - err = mlx5_check_dev_ids(dev, dev_id); - if (err) - return err; cap = pci_find_capability(bridge, PCI_CAP_ID_EXP); if (!cap) return -EOPNOTSUPP; @@ -503,6 +515,44 @@ static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev) return err; } +static int mlx5_pci_reset_bus(struct mlx5_core_dev *dev) +{ + if (!MLX5_CAP_GEN(dev, pcie_reset_using_hotreset_method)) + return -EOPNOTSUPP; + + return pci_reset_bus(dev->pdev); +} + +static int mlx5_sync_pci_reset(struct mlx5_core_dev *dev, u8 reset_method) +{ + u16 dev_id; + int err; + + err = pci_read_config_word(dev->pdev, PCI_DEVICE_ID, &dev_id); + if (err) + return pcibios_err_to_errno(err); + err = mlx5_check_dev_ids(dev, dev_id); + if (err) + return err; + + switch (reset_method) { + case MLX5_MFRL_REG_PCI_RESET_METHOD_LINK_TOGGLE: + err = mlx5_pci_link_toggle(dev, dev_id); + if (err) + mlx5_core_warn(dev, "mlx5_pci_link_toggle failed\n"); + break; + case MLX5_MFRL_REG_PCI_RESET_METHOD_HOT_RESET: + err = mlx5_pci_reset_bus(dev); + if (err) + mlx5_core_warn(dev, "mlx5_pci_reset_bus failed\n"); + break; + default: + return -EOPNOTSUPP; + } + + return err; +} + static void mlx5_sync_reset_now_event(struct work_struct *work) { struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset, @@ -521,9 +571,9 @@ static void mlx5_sync_reset_now_event(struct work_struct *work) goto done; } - err = mlx5_pci_link_toggle(dev); + err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); if (err) { - mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, no reset done, err %d\n", err); + mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, no reset done, err %d\n", err); set_bit(MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, &fw_reset->reset_flags); } @@ -585,9 +635,9 @@ static void mlx5_sync_reset_unload_event(struct work_struct *work) mlx5_core_warn(dev, "Sync Reset, got reset action. rst_state = %u\n", rst_state); if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ) { - err = mlx5_pci_link_toggle(dev); + err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); if (err) { - mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, err %d\n", err); + mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, err %d\n", err); fw_reset->ret = err; } } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index e6280ce..3cb9617 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -627,6 +627,9 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx) if (MLX5_CAP_GEN_MAX(dev, pci_sync_for_fw_update_with_driver_unload)) MLX5_SET(cmd_hca_cap, set_hca_cap, pci_sync_for_fw_update_with_driver_unload, 1); + if (MLX5_CAP_GEN_MAX(dev, pcie_reset_using_hotreset_method)) + MLX5_SET(cmd_hca_cap, set_hca_cap, + pcie_reset_using_hotreset_method, 1); if (MLX5_CAP_GEN_MAX(dev, num_vhca_ports)) MLX5_SET(cmd_hca_cap, -- 2.9.5

From: Moshe Shemesh <moshe@nvidia.com> stable inclusion from stable-v6.6.104 commit ddac9d0fe2493dd550cbfc75eeaf31e9b6dac959 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICYBRF CVE: CVE-2025-39832 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit 902a8bc23a24882200f57cadc270e15a2cfaf2bb ] Fix lockdep assertion triggered during sync reset unload event. When the sync reset flow is initiated using the devlink reload fw_activate option, the PF already holds the devlink lock while handling unload event. In this case, delegate sync reset unload event handling back to the devlink callback process to avoid double-locking and resolve the lockdep warning. Kernel log: WARNING: CPU: 9 PID: 1578 at devl_assert_locked+0x31/0x40 [...] Call Trace: <TASK> mlx5_unload_one_devl_locked+0x2c/0xc0 [mlx5_core] mlx5_sync_reset_unload_event+0xaf/0x2f0 [mlx5_core] process_one_work+0x222/0x640 worker_thread+0x199/0x350 kthread+0x10b/0x230 ? __pfx_worker_thread+0x10/0x10 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x8e/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Fixes: 7a9770f1bfea ("net/mlx5: Handle sync reset unload event") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250825143435.598584-7-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> --- drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 108 ++++++++++++--------- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h | 1 + 3 files changed, 63 insertions(+), 48 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index f2d1f7c..8489b50 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -107,7 +107,7 @@ static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netli if (err) return err; - mlx5_unload_one_devl_locked(dev, false); + mlx5_sync_reset_unload_flow(dev, true); err = mlx5_health_wait_pci_up(dev); if (err) NL_SET_ERR_MSG_MOD(extack, "FW activate aborted, PCI reads fail after reset"); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index a1c2dd7..b9986a2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -12,7 +12,8 @@ enum { MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, MLX5_FW_RESET_FLAGS_PENDING_COMP, MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, - MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED + MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, + MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, }; struct mlx5_fw_reset { @@ -217,7 +218,7 @@ int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev) return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL0, 0, 0, false); } -static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unloaded) +static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev) { struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset; struct devlink *devlink = priv_to_devlink(dev); @@ -226,8 +227,7 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unload if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) { complete(&fw_reset->done); } else { - if (!unloaded) - mlx5_unload_one(dev, false); + mlx5_sync_reset_unload_flow(dev, false); if (mlx5_health_wait_pci_up(dev)) mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n"); else @@ -270,7 +270,7 @@ static void mlx5_sync_reset_reload_work(struct work_struct *work) mlx5_sync_reset_clear_reset_requested(dev, false); mlx5_enter_error_state(dev, true); - mlx5_fw_reset_complete_reload(dev, false); + mlx5_fw_reset_complete_reload(dev); } #define MLX5_RESET_POLL_INTERVAL (HZ / 10) @@ -553,6 +553,59 @@ static int mlx5_sync_pci_reset(struct mlx5_core_dev *dev, u8 reset_method) return err; } +void mlx5_sync_reset_unload_flow(struct mlx5_core_dev *dev, bool locked) +{ + struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset; + unsigned long timeout; + bool reset_action; + u8 rst_state; + int err; + + if (locked) + mlx5_unload_one_devl_locked(dev, false); + else + mlx5_unload_one(dev, false); + + if (!test_bit(MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, &fw_reset->reset_flags)) + return; + + mlx5_set_fw_rst_ack(dev); + mlx5_core_warn(dev, "Sync Reset Unload done, device reset expected\n"); + + reset_action = false; + timeout = jiffies + msecs_to_jiffies(mlx5_tout_ms(dev, RESET_UNLOAD)); + do { + rst_state = mlx5_get_fw_rst_state(dev); + if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ || + rst_state == MLX5_FW_RST_STATE_IDLE) { + reset_action = true; + break; + } + msleep(20); + } while (!time_after(jiffies, timeout)); + + if (!reset_action) { + mlx5_core_err(dev, "Got timeout waiting for sync reset action, state = %u\n", + rst_state); + fw_reset->ret = -ETIMEDOUT; + goto done; + } + + mlx5_core_warn(dev, "Sync Reset, got reset action. rst_state = %u\n", + rst_state); + if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ) { + err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); + if (err) { + mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, err %d\n", + err); + fw_reset->ret = err; + } + } + +done: + clear_bit(MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, &fw_reset->reset_flags); +} + static void mlx5_sync_reset_now_event(struct work_struct *work) { struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset, @@ -580,16 +633,13 @@ static void mlx5_sync_reset_now_event(struct work_struct *work) mlx5_enter_error_state(dev, true); done: fw_reset->ret = err; - mlx5_fw_reset_complete_reload(dev, false); + mlx5_fw_reset_complete_reload(dev); } static void mlx5_sync_reset_unload_event(struct work_struct *work) { struct mlx5_fw_reset *fw_reset; struct mlx5_core_dev *dev; - unsigned long timeout; - bool reset_action; - u8 rst_state; int err; fw_reset = container_of(work, struct mlx5_fw_reset, reset_unload_work); @@ -598,6 +648,7 @@ static void mlx5_sync_reset_unload_event(struct work_struct *work) if (mlx5_sync_reset_clear_reset_requested(dev, false)) return; + set_bit(MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, &fw_reset->reset_flags); mlx5_core_warn(dev, "Sync Reset Unload. Function is forced down.\n"); err = mlx5_cmd_fast_teardown_hca(dev); @@ -606,44 +657,7 @@ static void mlx5_sync_reset_unload_event(struct work_struct *work) else mlx5_enter_error_state(dev, true); - if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) - mlx5_unload_one_devl_locked(dev, false); - else - mlx5_unload_one(dev, false); - - mlx5_set_fw_rst_ack(dev); - mlx5_core_warn(dev, "Sync Reset Unload done, device reset expected\n"); - - reset_action = false; - timeout = jiffies + msecs_to_jiffies(mlx5_tout_ms(dev, RESET_UNLOAD)); - do { - rst_state = mlx5_get_fw_rst_state(dev); - if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ || - rst_state == MLX5_FW_RST_STATE_IDLE) { - reset_action = true; - break; - } - msleep(20); - } while (!time_after(jiffies, timeout)); - - if (!reset_action) { - mlx5_core_err(dev, "Got timeout waiting for sync reset action, state = %u\n", - rst_state); - fw_reset->ret = -ETIMEDOUT; - goto done; - } - - mlx5_core_warn(dev, "Sync Reset, got reset action. rst_state = %u\n", rst_state); - if (rst_state == MLX5_FW_RST_STATE_TOGGLE_REQ) { - err = mlx5_sync_pci_reset(dev, fw_reset->reset_method); - if (err) { - mlx5_core_warn(dev, "mlx5_sync_pci_reset failed, err %d\n", err); - fw_reset->ret = err; - } - } - -done: - mlx5_fw_reset_complete_reload(dev, true); + mlx5_fw_reset_complete_reload(dev); } static void mlx5_sync_reset_abort_event(struct work_struct *work) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h index ea527d0..d5b2852 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h @@ -12,6 +12,7 @@ int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel, int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev); int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev); +void mlx5_sync_reset_unload_flow(struct mlx5_core_dev *dev, bool locked); int mlx5_fw_reset_verify_fw_complete(struct mlx5_core_dev *dev, struct netlink_ext_ack *extack); void mlx5_fw_reset_events_start(struct mlx5_core_dev *dev); -- 2.9.5
participants (2)
-
patchwork bot
-
Zhang Changzhong