From: Michael Chan michael.chan@broadcom.com
stable inclusion from stable-5.10.68 commit 9f2972e151dd16d3286c1407bec4e66395f30135 bugzilla: 182671 https://gitee.com/openeuler/kernel/issues/I4EWUH
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit eca4cf12acda306f851f6d2a05b1c9ef62cf0e81 upstream.
The recent patch has introduced a regression by not reading the reset count in the ERROR_RECOVERY async event handler. We may have just gone through a reset and the reset count has just incremented. If we don't update the reset count in the ERROR_RECOVERY event handler, the health check timer will see that the reset count has changed and will initiate an unintended reset.
Restore the unconditional update of the reset count in bnxt_async_event_process() if error recovery watchdog is enabled. Also, update the reset count at the end of the reset sequence to make it even more robust.
Fixes: 1b2b91831983 ("bnxt_en: Fix possible unintended driver initiated error recovery") Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Chen Jun chenjun102@huawei.com Acked-by: Weilong Chen chenweilong@huawei.com
Signed-off-by: Chen Jun chenjun102@huawei.com --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 71656e669755..26179e437bbf 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -2123,12 +2123,11 @@ static int bnxt_async_event_process(struct bnxt *bp, DIV_ROUND_UP(fw_health->polling_dsecs * HZ, bp->current_interval * 10); fw_health->tmr_counter = fw_health->tmr_multiplier; - if (!fw_health->enabled) { + if (!fw_health->enabled) fw_health->last_fw_heartbeat = bnxt_fw_health_readl(bp, BNXT_FW_HEARTBEAT_REG); - fw_health->last_fw_reset_cnt = - bnxt_fw_health_readl(bp, BNXT_FW_RESET_CNT_REG); - } + fw_health->last_fw_reset_cnt = + bnxt_fw_health_readl(bp, BNXT_FW_RESET_CNT_REG); netif_info(bp, drv, bp->dev, "Error recovery info: error recovery[1], master[%d], reset count[%u], health status: 0x%x\n", fw_health->master, fw_health->last_fw_reset_cnt, @@ -11653,6 +11652,11 @@ static void bnxt_fw_reset_task(struct work_struct *work) dev_close(bp->dev); }
+ if ((bp->fw_cap & BNXT_FW_CAP_ERROR_RECOVERY) && + bp->fw_health->enabled) { + bp->fw_health->last_fw_reset_cnt = + bnxt_fw_health_readl(bp, BNXT_FW_RESET_CNT_REG); + } bp->fw_reset_state = 0; /* Make sure fw_reset_state is 0 before clearing the flag */ smp_mb__before_atomic();