Ming Lei (2): scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
drivers/scsi/scsi_priv.h | 2 +- drivers/scsi/scsi_error.c | 9 +++++---- drivers/scsi/scsi_lib.c | 4 +++- 3 files changed, 9 insertions(+), 6 deletions(-)
From: Ming Lei ming.lei@redhat.com
mainline inclusion from mainline-v6.8-rc3 commit 4373534a9850627a2695317944898eb1283a2db0 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I96GXK CVE: CVE-2024-26627
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock every time for deciding if error handler kthread needs to be waken up.
This can be too heavy in case of recovery, such as:
- N hardware queues
- queue depth is M for each hardware queue
- each scsi_host_busy() iterates over (N * M) tag/requests
If recovery is triggered in case that all requests are in-flight, each scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called for the last in-flight request, scsi_host_busy() has been run for (N * M - 1) times, and request has been iterated for (N*M - 1) * (N * M) times.
If both N and M are big enough, hard lockup can be triggered on acquiring host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169).
Fix the issue by calling scsi_host_busy() outside the host lock. We don't need the host lock for getting busy count because host the lock never covers that.
[mkp: Drop unnecessary 'busy' variables pointed out by Bart]
Cc: Ewan Milne emilne@redhat.com Fixes: 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq") Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20240112070000.4161982-1-ming.lei@redhat.com Reviewed-by: Ewan D. Milne emilne@redhat.com Reviewed-by: Sathya Prakash Veerichetty safhya.prakash@broadcom.com Tested-by: Sathya Prakash Veerichetty safhya.prakash@broadcom.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Conflict: Context conflict, dost not affect patch logic.
Signed-off-by: Li Nan linan122@huawei.com --- drivers/scsi/scsi_priv.h | 2 +- drivers/scsi/scsi_error.c | 8 ++++---- drivers/scsi/scsi_lib.c | 2 +- 3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h index 180636d54982..ba39c21196f5 100644 --- a/drivers/scsi/scsi_priv.h +++ b/drivers/scsi/scsi_priv.h @@ -74,7 +74,7 @@ extern void scmd_eh_abort_handler(struct work_struct *work); extern enum blk_eh_timer_return scsi_times_out(struct request *req); extern int scsi_error_handler(void *host); extern int scsi_decide_disposition(struct scsi_cmnd *cmd); -extern void scsi_eh_wakeup(struct Scsi_Host *shost); +extern void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy); extern void scsi_eh_scmd_add(struct scsi_cmnd *); void scsi_eh_ready_devs(struct Scsi_Host *shost, struct list_head *work_q, diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 3d3d139127ee..897cb0d38219 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -63,11 +63,11 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd); static int scsi_try_to_abort_cmd(struct scsi_host_template *, struct scsi_cmnd *);
-void scsi_eh_wakeup(struct Scsi_Host *shost) +void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy) { lockdep_assert_held(shost->host_lock);
- if (scsi_host_busy(shost) == shost->host_failed) { + if (busy == shost->host_failed) { trace_scsi_eh_wakeup(shost); wake_up_process(shost->ehandler); SCSI_LOG_ERROR_RECOVERY(5, shost_printk(KERN_INFO, shost, @@ -90,7 +90,7 @@ void scsi_schedule_eh(struct Scsi_Host *shost) if (scsi_host_set_state(shost, SHOST_RECOVERY) == 0 || scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY) == 0) { shost->host_eh_scheduled++; - scsi_eh_wakeup(shost); + scsi_eh_wakeup(shost, scsi_host_busy(shost)); }
spin_unlock_irqrestore(shost->host_lock, flags); @@ -245,7 +245,7 @@ static void scsi_eh_inc_host_failed(struct rcu_head *head)
spin_lock_irqsave(shost->host_lock, flags); shost->host_failed++; - scsi_eh_wakeup(shost); + scsi_eh_wakeup(shost, scsi_host_busy(shost)); spin_unlock_irqrestore(shost->host_lock, flags); }
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 89b40993bafe..127b20ddd9b9 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -315,7 +315,7 @@ static void scsi_dec_host_busy(struct Scsi_Host *shost, struct scsi_cmnd *cmd) if (unlikely(scsi_host_in_recovery(shost))) { spin_lock_irqsave(shost->host_lock, flags); if (shost->host_failed || shost->host_eh_scheduled) - scsi_eh_wakeup(shost); + scsi_eh_wakeup(shost, scsi_host_busy(shost)); spin_unlock_irqrestore(shost->host_lock, flags); } rcu_read_unlock();
From: Ming Lei ming.lei@redhat.com
mainline inclusion from mainline-v6.8-rc4 commit 4e6c9011990726f4d175e2cdfebe5b0b8cce4839 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I96GXK CVE: CVE-2024-26627
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Commit 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") intended to fix a hard lockup issue triggered by EH. The core idea was to move scsi_host_busy() out of the host lock when processing individual commands for EH. However, a suggested style change inadvertently caused scsi_host_busy() to remain under the host lock. Fix this by calling scsi_host_busy() outside the lock.
Fixes: 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") Cc: Sathya Prakash Veerichetty safhya.prakash@broadcom.com Cc: Bart Van Assche bvanassche@acm.org Cc: Ewan D. Milne emilne@redhat.com Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20240203024521.2006455-1-ming.lei@redhat.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Li Nan linan122@huawei.com --- drivers/scsi/scsi_error.c | 3 ++- drivers/scsi/scsi_lib.c | 4 +++- 2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 897cb0d38219..b8b4008a53bb 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -241,11 +241,12 @@ static void scsi_eh_inc_host_failed(struct rcu_head *head) { struct scsi_cmnd *scmd = container_of(head, typeof(*scmd), rcu); struct Scsi_Host *shost = scmd->device->host; + unsigned int busy = scsi_host_busy(shost); unsigned long flags;
spin_lock_irqsave(shost->host_lock, flags); shost->host_failed++; - scsi_eh_wakeup(shost, scsi_host_busy(shost)); + scsi_eh_wakeup(shost, busy); spin_unlock_irqrestore(shost->host_lock, flags); }
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 127b20ddd9b9..03b7e47cdda3 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -313,9 +313,11 @@ static void scsi_dec_host_busy(struct Scsi_Host *shost, struct scsi_cmnd *cmd) rcu_read_lock(); __clear_bit(SCMD_STATE_INFLIGHT, &cmd->state); if (unlikely(scsi_host_in_recovery(shost))) { + unsigned int busy = scsi_host_busy(shost); + spin_lock_irqsave(shost->host_lock, flags); if (shost->host_failed || shost->host_eh_scheduled) - scsi_eh_wakeup(shost, scsi_host_busy(shost)); + scsi_eh_wakeup(shost, busy); spin_unlock_irqrestore(shost->host_lock, flags); } rcu_read_unlock();
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/5259 邮件列表地址:https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/F...
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/5259 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/F...