From: Hannes Reinecke hare@suse.de
[ Upstream commit 0e2209629fec427ba75a6351486153a9feddd36b ]
When a link is going down the driver will be calling fnic_cleanup_io(), which will traverse all commands and calling 'done' for each found command. While the traversal is handled under the host_lock, calling 'done' happens after the host_lock is being dropped.
As fnic_queuecommand_lck() is being called with the host_lock held, it might well be that it will pick the command being selected for abortion from the above routine and enqueue it for sending, but then 'done' is being called on that very command from the above routine.
Which of course confuses the hell out of the scsi midlayer.
So fix this by not queueing commands when fnic_cleanup_io is active.
Link: https://lore.kernel.org/r/20200116102053.62755-1-hare@suse.de Signed-off-by: Hannes Reinecke hare@suse.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- drivers/scsi/fnic/fnic_scsi.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c index 8cbd3c9..73ffc16 100644 --- a/drivers/scsi/fnic/fnic_scsi.c +++ b/drivers/scsi/fnic/fnic_scsi.c @@ -446,6 +446,9 @@ static int fnic_queuecommand_lck(struct scsi_cmnd *sc, void (*done)(struct scsi_ if (unlikely(fnic_chk_state_flags_locked(fnic, FNIC_FLAGS_IO_BLOCKED))) return SCSI_MLQUEUE_HOST_BUSY;
+ if (unlikely(fnic_chk_state_flags_locked(fnic, FNIC_FLAGS_FWRESET))) + return SCSI_MLQUEUE_HOST_BUSY; + rport = starget_to_rport(scsi_target(sc->device)); if (!rport) { FNIC_SCSI_DBG(KERN_DEBUG, fnic->lport->host,