From: Sagi Grimberg sagi@grimberg.me
mainline inclusion from mainline-v5.8-rc4 commit ea43d9709f727e728e933a8157a7a7ca1a868281 category: bugfix bugzilla: NA CVE: NA Link: https://gitee.com/openeuler/kernel/issues/I4JFBE?from=project-issue
-------------------------------------------------
Commit 59c7c3caaaf8 intended to only silently ignore non retry-able errors (DNR bit set) such that we can still identify misbehaving controllers, and in the other hand propagate retry-able errors (DNR bit cleared) so we don't wrongly abandon a namespace just because it happens to be temporarily inaccessible.
The goal remains the same as the original commit where this was introduced but unfortunately had the logic backwards.
Fixes: 59c7c3caaaf8 ("nvme: fix possible hang when ns scanning fails during error recovery") Reported-by: Keith Busch kbusch@kernel.org Signed-off-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Keith Busch kbusch@kernel.org Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: chengjike chengjike.cheng@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- drivers/nvme/host/core.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 2b426f63427bc..d808febcb39dd 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1001,10 +1001,16 @@ static int nvme_identify_ns_descs(struct nvme_ctrl *ctrl, unsigned nsid, dev_warn(ctrl->device, "Identify Descriptors failed (%d)\n", status); /* - * Don't treat an error as fatal, as we potentially already - * have a NGUID or EUI-64. + * Don't treat non-retryable errors as fatal, as we potentially + * already have a NGUID or EUI-64. If we failed with DNR set, + * we want to silently ignore the error as we can still + * identify the device, but if the status has DNR set, we want + * to propagate the error back specifically for the disk + * revalidation flow to make sure we don't abandon the + * device just because of a temporal retry-able error (such + * as path of transport errors). */ - if (status > 0 && !(status & NVME_SC_DNR)) + if (status > 0 && (status & NVME_SC_DNR)) status = 0; goto free_data; }