From: Sagi Grimberg sagi@grimberg.me
stable inclusion from stable-v5.10.102 commit e192184cf8bce8dd55d619f5611a2eaba996fa05 bugzilla: https://gitee.com/openeuler/kernel/issues/I567K6
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit ff9fc7ebf5c06de1ef72a69f9b1ab40af8b07f9e ]
While nvme_tcp_submit_async_event_work is checking the ctrl and queue state before preparing the AER command and scheduling io_work, in order to fully prevent a race where this check is not reliable the error recovery work must flush async_event_work before continuing to destroy the admin queue after setting the ctrl state to RESETTING such that there is no race .submit_async_event and the error recovery handler itself changing the ctrl state.
Tested-by: Chris Leech cleech@redhat.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yu Liao liaoyu15@huawei.com Reviewed-by: Wei Li liwei391@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com --- drivers/nvme/host/tcp.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 6bfc5b354418..305db4a6c5a6 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2077,6 +2077,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work) struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
nvme_stop_keep_alive(ctrl); + flush_work(&ctrl->async_event_work); nvme_tcp_teardown_io_queues(ctrl, false); /* unquiesce to fail fast pending requests */ nvme_start_queues(ctrl);