From: Ruozhu Li <liruozhu@huawei.com>
mainline inclusion
from mainline-v5.15
commit 85032874f80ba17bf187de1d14d9603bf3f582b8
category: bugfix
bugzilla: NA
CVE: NA
Link: https://gitee.com/openeuler/kernel/issues/I1WGZE
This patch fixes a BUG_ON hit when rescanning namespaces after a set queue count command timeout:

BUG_ON(hctx_idx >= ctrl->ctrl.queue_count);	// in nvme_rdma_init_hctx
Call trace:
 nvme_rdma_init_hctx+0x58/0x60 [nvme_rdma]
 blk_mq_realloc_hw_ctxs+0x140/0x4c0
 blk_mq_init_allocated_queue+0x130/0x410
 blk_mq_init_queue+0x40/0x88
 nvme_validate_ns+0xb8/0x740
 nvme_scan_work+0x29c/0x460
 process_one_work+0x1f8/0x490
 worker_thread+0x50/0x4b8
 kthread+0x134/0x138
 ret_from_fork+0x10/0x18
-------------------------------------------------
We update ctrl->queue_count and schedule another reconnect when the io queue count is zero. But we will never try to create any io queue in the next reconnection, because ctrl->queue_count is already set to zero. We will end up having an admin-only session in Live state, which is exactly what we try to avoid in the original patch. Update ctrl->queue_count after the queue_count zero check to fix it.
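
To make the ordering issue concrete outside the driver, here is a minimal, self-contained userspace sketch (hypothetical code, not from the kernel tree; the names alloc_io_queues_buggy/alloc_io_queues_fixed are made up) of why committing a new value to queue_count before validating it loses the previous good value:

#include <stdio.h>

struct ctrl {
        int queue_count;                /* 1 admin queue + N I/O queues */
};

/* Old ordering: update shared state first, check afterwards. */
static int alloc_io_queues_buggy(struct ctrl *c, int nr_io_queues)
{
        c->queue_count = nr_io_queues + 1;      /* clobbers the previous count */
        if (c->queue_count < 2)
                return -1;                      /* too late, old value is gone */
        return 0;
}

/* Patched ordering: validate first, update only on success. */
static int alloc_io_queues_fixed(struct ctrl *c, int nr_io_queues)
{
        if (nr_io_queues == 0)
                return -1;                      /* previous queue_count preserved */
        c->queue_count = nr_io_queues + 1;
        return 0;
}

int main(void)
{
        struct ctrl c = { .queue_count = 5 };   /* e.g. 4 I/O queues from the last good connect */

        alloc_io_queues_buggy(&c, 0);           /* simulate set queue count cmd timeout */
        printf("buggy ordering: queue_count=%d\n", c.queue_count);     /* 1, I/O queue count lost */

        c.queue_count = 5;
        alloc_io_queues_fixed(&c, 0);
        printf("fixed ordering: queue_count=%d\n", c.queue_count);     /* still 5 */

        return 0;
}

With the old ordering the controller is left believing it has no I/O queues, so the next reconnect never tries to create any; the patched ordering only touches queue_count once a usable nr_io_queues is known.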
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 drivers/nvme/host/rdma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 91ff9262f6729..b8e0d637ddcfc 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -656,13 +656,13 @@ static int nvme_rdma_alloc_io_queues(struct nvme_rdma_ctrl *ctrl)
 	if (ret)
 		return ret;
 
-	ctrl->ctrl.queue_count = nr_io_queues + 1;
-	if (ctrl->ctrl.queue_count < 2) {
+	if (nr_io_queues == 0) {
 		dev_err(ctrl->ctrl.device,
 			"unable to set any I/O queues\n");
 		return -ENOMEM;
 	}
 
+	ctrl->ctrl.queue_count = nr_io_queues + 1;
 	dev_info(ctrl->ctrl.device,
 		"creating %d I/O queues.\n", nr_io_queues);
 
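
For reference, with both hunks applied the tail of nvme_rdma_alloc_io_queues should read roughly as follows (reconstructed from the diff above, not copied verbatim from the tree):

	if (nr_io_queues == 0) {
		dev_err(ctrl->ctrl.device,
			"unable to set any I/O queues\n");
		return -ENOMEM;
	}

	ctrl->ctrl.queue_count = nr_io_queues + 1;
	dev_info(ctrl->ctrl.device,
		"creating %d I/O queues.\n", nr_io_queues);

The error path now returns before queue_count is touched, so a failed reconnect leaves the count from the last successful connection in place.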