This patchset includes 2 minor changes:
- For interrupt coalescing, the count of CQ entries is set to 10, and the
  interrupt coalescing timeout period is set to 10us.
- Add cond_resched() to cq_thread_v3_hw() to allow the watchdog thread to run.
Changes since v1:
- Remove unnecessary comments.
- Update the commit message for patch 1.
Yihang Li (2):
  scsi: hisi_sas: Default enable interrupt coalescing
  scsi: hisi_sas: Add cond_resched() to cq_thread_v3_hw()
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9BBFQ
CVE: NA
----------------------------------------------------------------------
In the current interrupt reporting mode, each CQ entry reports an interrupt. However, when there are a large number of I/O completion interrupts, the following issue may occur:
[ 4682.678657][  C129] irq 134: nobody cared (try booting with the "irqpoll" option)
[ 4682.686137][  C129] CPU: 129 PID: 2567 Comm: irq/134-hisi_sa Kdump: loaded Tainted: G OE 5.10.0 #1
[ 4682.696297][  C129] Hardware name: To be filled by O.E.M. KpxxxEVB 2P2T2N TB/To be filled by O.E.M., BIOS KpxxxEVB 2P2T2N TB 03/23/2023
[ 4682.708455][  C129] Call trace:
[ 4682.711589][  C129]  dump_backtrace+0x0/0x1e4
[ 4682.715934][  C129]  show_stack+0x20/0x2c
[ 4682.719933][  C129]  dump_stack+0xd8/0x140
[ 4682.724017][  C129]  __report_bad_irq+0x54/0x180
[ 4682.728625][  C129]  note_interrupt+0x1ec/0x2f0
[ 4682.733143][  C129]  handle_irq_event+0x118/0x1ac
[ 4682.737834][  C129]  handle_fasteoi_irq+0xc8/0x200
[ 4682.742613][  C129]  __handle_domain_irq+0x84/0xf0
[ 4682.747391][  C129]  gic_handle_irq+0x88/0x2c0
[ 4682.751822][  C129]  el1_irq+0xbc/0x140
[ 4682.755648][  C129]  _find_next_bit.constprop.0+0x20/0x94
[ 4682.761036][  C129]  cpumask_next+0x24/0x30
[ 4682.765208][  C129]  gic_ipi_send_mask+0x48/0x170
[ 4682.769900][  C129]  __ipi_send_mask+0x34/0x110
[ 4682.775720][  C129]  smp_cross_call+0x3c/0xcc
[ 4682.780064][  C129]  arch_send_call_function_single_ipi+0x38/0x44
[ 4682.786146][  C129]  send_call_function_single_ipi+0xd0/0xe0
[ 4682.791794][  C129]  generic_exec_single+0xb4/0x170
[ 4682.796659][  C129]  smp_call_function_single_async+0x2c/0x40
[ 4682.802395][  C129]  blk_mq_complete_request_remote.part.0+0xec/0x100
[ 4682.808822][  C129]  blk_mq_complete_request+0x30/0x70
[ 4682.813950][  C129]  scsi_mq_done+0x48/0xac
[ 4682.818128][  C129]  sas_scsi_task_done+0xb0/0x150 [libsas]
[ 4682.823692][  C129]  slot_complete_v3_hw+0x230/0x710 [hisi_sas_v3_hw]
[ 4682.830120][  C129]  cq_thread_v3_hw+0xbc/0x190 [hisi_sas_v3_hw]
[ 4682.836114][  C129]  irq_thread_fn+0x34/0xa4
[ 4682.840371][  C129]  irq_thread+0xc4/0x130
[ 4682.844455][  C129]  kthread+0x108/0x13c
[ 4682.848365][  C129]  ret_from_fork+0x10/0x18
[ 4682.852621][  C129] handlers:
[ 4682.855577][  C129] [<00000000949e52bf>] cq_interrupt_v3_hw [hisi_sas_v3_hw] threaded [<000000005d8e3b68>] cq_thread_v3_hw [hisi_sas_v3_hw]
[ 4682.868084][  C129] Disabling IRQ #134
When the IRQ management layer handles a hardware interrupt and the primary
handler returns IRQ_WAKE_THREAD, it wakes up the handler thread for this
interrupt action and sets the IRQTF_RUNTHREAD flag; the interrupt handling
thread clears IRQTF_RUNTHREAD once it has run. Later, in note_interrupt(),
irq_count counts hardware interrupts and irqs_unhandled counts interrupts
for which no thread handler felt responsible. When irq_count reaches 100000
and irqs_unhandled exceeds 99000, the irq is disabled.
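For reference, a heavily simplified sketch of that accounting, modeled on
note_interrupt() in kernel/irq/spurious.c (the deferred detection for
threaded handlers is omitted and the thresholds are the ones quoted above,
so this is an illustration rather than the exact kernel code):

/*
 * Heavily simplified sketch of the accounting in note_interrupt()
 * (kernel/irq/spurious.c); deferred detection for threaded handlers
 * is omitted, thresholds as quoted in the text above.
 */
static void note_interrupt_sketch(struct irq_desc *desc, irqreturn_t action_ret)
{
	if (action_ret == IRQ_NONE)		/* nobody felt responsible */
		desc->irqs_unhandled++;

	desc->irq_count++;
	if (likely(desc->irq_count < 100000))
		return;

	desc->irq_count = 0;
	if (unlikely(desc->irqs_unhandled > 99000)) {
		__report_bad_irq(desc, action_ret);	/* "irq 134: nobody cared" */
		desc->istate |= IRQS_SPURIOUS_DISABLED;
		desc->depth++;
		irq_disable(desc);			/* "Disabling IRQ #134" */
	}
	desc->irqs_unhandled = 0;
}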
In the performance test scenario, I/O completion hardware interrupts are
generated continuously and rapidly. As a result, the interrupt handling
thread keeps looping inside irq_thread() and never exits, which delays the
thread's response to new hardware interrupts. Eventually the irq is
disabled.
Therefore, enable interrupt coalescing by default to reduce the number of
hardware interrupts generated, which helps the interrupt handling thread
stop looping in irq_thread().
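The loop in question, heavily simplified from irq_thread() /
irq_wait_for_interrupt() in kernel/irq/manage.c (setup and exit handling
omitted): every hardware interrupt that returns IRQ_WAKE_THREAD sets
IRQTF_RUNTHREAD again, so if a new completion interrupt always arrives
before cq_thread_v3_hw() returns, the thread never schedules out:

	/*
	 * Simplified from irq_thread() / irq_wait_for_interrupt() in
	 * kernel/irq/manage.c (setup and exit handling omitted).
	 */
	while (!irq_wait_for_interrupt(action)) {
		/*
		 * irq_wait_for_interrupt() only goes to sleep when
		 * test_and_clear_bit(IRQTF_RUNTHREAD, &action->thread_flags)
		 * fails. If the hard IRQ handler has already set the flag
		 * again, the loop runs the thread handler once more without
		 * ever scheduling out.
		 */
		handler_fn(desc, action);	/* -> cq_thread_v3_hw() */
		wake_threads_waitq(desc);
	}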
For interrupt coalescing, the count of CQ entries is set to 10, and the interrupt coalescing timeout period is set to 10us.
With interrupt coalescing enabled by default, in the fio 4K read,
iodepth=32 scenario, performance improves by about 4% (1342k -> 1396k).
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Bing Xia <xiabing12@h-partners.com>
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 30a3b748692d..b24438aa45a7 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -630,9 +630,9 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
 	hisi_sas_write32(hisi_hba, TRANS_LOCK_ICT_TIME, 0x4A817C80);
 	hisi_sas_write32(hisi_hba, HGC_SAS_TXFAIL_RETRY_CTRL, 0x108);
 	hisi_sas_write32(hisi_hba, CFG_AGING_TIME, 0x1);
-	hisi_sas_write32(hisi_hba, INT_COAL_EN, 0x1);
-	hisi_sas_write32(hisi_hba, OQ_INT_COAL_TIME, 0x1);
-	hisi_sas_write32(hisi_hba, OQ_INT_COAL_CNT, 0x1);
+	hisi_sas_write32(hisi_hba, INT_COAL_EN, 0x3);
+	hisi_sas_write32(hisi_hba, OQ_INT_COAL_TIME, 0xa);
+	hisi_sas_write32(hisi_hba, OQ_INT_COAL_CNT, 0xa);
 	hisi_sas_write32(hisi_hba, CQ_INT_CONVERGE_EN,
 			 hisi_sas_intr_conv);
 	hisi_sas_write32(hisi_hba, OQ_INT_SRC, 0xffff);
@@ -2771,11 +2771,9 @@ static void config_intr_coal_v3_hw(struct hisi_hba *hisi_hba)
 	if (hisi_hba->intr_coal_ticks == 0 ||
 	    hisi_hba->intr_coal_count == 0) {
-		hisi_sas_write32(hisi_hba, INT_COAL_EN, 0x1);
-		hisi_sas_write32(hisi_hba, OQ_INT_COAL_TIME, 0x1);
-		hisi_sas_write32(hisi_hba, OQ_INT_COAL_CNT, 0x1);
+		hisi_sas_write32(hisi_hba, OQ_INT_COAL_TIME, 0xa);
+		hisi_sas_write32(hisi_hba, OQ_INT_COAL_CNT, 0xa);
 	} else {
-		hisi_sas_write32(hisi_hba, INT_COAL_EN, 0x3);
 		hisi_sas_write32(hisi_hba, OQ_INT_COAL_TIME,
 				 hisi_hba->intr_coal_ticks);
 		hisi_sas_write32(hisi_hba, OQ_INT_COAL_CNT,
On 2024/4/8 (Monday) 10:53, Yihang Li wrote:
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9BBFQ
CVE: NA
----------------------------------------------------------------------
In the scenario where the expander is connected to 12 SAS SSDs, the following call trace may occur:
[  214.409199][  C240] watchdog: BUG: soft lockup - CPU#240 stuck for 22s! [irq/149-hisi_sa:3211]
[  214.545262][  C240] CPU: 240 PID: 3211 Comm: irq/149-hisi_sa Kdump: loaded Tainted: G E 5.10.0 #1
[  214.568533][  C240] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[  214.575224][  C240] pc : fput_many+0x8c/0xdc
[  214.579480][  C240] lr : fput+0x1c/0xf0
[  214.583302][  C240] sp : ffff80002de2b900
[  214.587298][  C240] x29: ffff80002de2b900 x28: ffff1082aa412000
[  214.593291][  C240] x27: ffff3062a0348c08 x26: ffff80003a9f6000
[  214.599284][  C240] x25: ffff1062bbac5c40 x24: 0000000000001000
[  214.605277][  C240] x23: 000000000000000a x22: 0000000000000001
[  214.611270][  C240] x21: 0000000000001000 x20: 0000000000000000
[  214.617262][  C240] x19: ffff3062a41ae580 x18: 0000000000010000
[  214.623255][  C240] x17: 0000000000000001 x16: ffffdb3a6efe5fc0
[  214.629248][  C240] x15: ffffffffffffffff x14: 0000000003ffffff
[  214.635241][  C240] x13: 000000000000ffff x12: 000000000000029c
[  214.641234][  C240] x11: 0000000000000006 x10: ffff80003a9f7fd0
[  214.647226][  C240] x9 : ffffdb3a6f0482fc x8 : 0000000000000001
[  214.653219][  C240] x7 : 0000000000000002 x6 : 0000000000000080
[  214.659212][  C240] x5 : ffff55480ee9b000 x4 : fffffde7f94c6554
[  214.665205][  C240] x3 : 0000000000000002 x2 : 0000000000000020
[  214.671198][  C240] x1 : 0000000000000021 x0 : ffff3062a41ae5b8
[  214.677191][  C240] Call trace:
[  214.680320][  C240]  fput_many+0x8c/0xdc
[  214.684230][  C240]  fput+0x1c/0xf0
[  214.687707][  C240]  aio_complete_rw+0xd8/0x1fc
[  214.692225][  C240]  blkdev_bio_end_io+0x98/0x140
[  214.696917][  C240]  bio_endio+0x160/0x1bc
[  214.701001][  C240]  blk_update_request+0x1c8/0x3bc
[  214.705867][  C240]  scsi_end_request+0x3c/0x1f0
[  214.710471][  C240]  scsi_io_completion+0x7c/0x1a0
[  214.715249][  C240]  scsi_finish_command+0x104/0x140
[  214.720200][  C240]  scsi_softirq_done+0x90/0x180
[  214.724892][  C240]  blk_mq_complete_request+0x5c/0x70
[  214.730016][  C240]  scsi_mq_done+0x48/0xac
[  214.734194][  C240]  sas_scsi_task_done+0xbc/0x16c [libsas]
[  214.739758][  C240]  slot_complete_v3_hw+0x260/0x760 [hisi_sas_v3_hw]
[  214.746185][  C240]  cq_thread_v3_hw+0xbc/0x190 [hisi_sas_v3_hw]
[  214.752179][  C240]  irq_thread_fn+0x34/0xa4
[  214.756435][  C240]  irq_thread+0xc4/0x130
[  214.760520][  C240]  kthread+0x108/0x13c
[  214.764430][  C240]  ret_from_fork+0x10/0x18
This is because, in the hisi_sas driver, the hardware interrupt handler and
its interrupt thread run on the same CPU. In the performance test scenario,
the large number of hardware interrupts and the interrupt handling thread
continuously occupy that CPU, so the CPU cannot run the watchdog thread.
Once the soft lockup threshold is exceeded, the above call trace is
reported.
To fix it, add cond_resched() to cq_thread_v3_hw() to give the watchdog
thread a chance to run.
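The generic pattern being applied (illustrative only, not literal driver
code; my_cq_thread() and its helpers are hypothetical): a threaded IRQ
handler that drains a large batch of completions in one pass needs an
explicit scheduling point so the per-CPU watchdog is not starved:

static irqreturn_t my_cq_thread(int irq, void *p)	/* hypothetical handler */
{
	struct my_cq *cq = p;

	/* drain everything the hardware has posted so far */
	while (cq_has_work(cq))			/* hypothetical helpers */
		complete_one_entry(cq);

	cond_resched();		/* scheduling point: lets the watchdog run */
	return IRQ_HANDLED;
}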
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Bing Xia <xiabing12@h-partners.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index b24438aa45a7..c66cac317a9f 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -2489,6 +2489,7 @@ static irqreturn_t cq_thread_v3_hw(int irq_no, void *p)
 	/* update rd_point */
 	cq->rd_point = rd_point;
 	hisi_sas_write32(hisi_hba, COMPL_Q_0_RD_PTR + (0x14 * queue), rd_point);
+	cond_resched();
 
 	return IRQ_HANDLED;
 }
Feedback:
The patch(es) you have sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/5770
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/F...