From: Ye Bin <yebin10@huawei.com>
hulk inclusion
category: bugfix
bugzilla: 34271
CVE: NA
--------------------------------
This reverts commit fd72360bc94ad304136beb56e8ff2ec089113bb8.
Test steps:
...
rmmod hisi_sas_v3_hw
lsmod
fdisk -l
insmod hisi_sas_v3_hw.ko
lsmod
fdisk -l
....

The following error occurs when running the test steps above:
[ 3660.259153] [ffff00000116f000] pgd=00002027ffffe003, pud=00002027ffffd003, pmd=00002027cdf28003, pte=0000000000000000
[ 3660.269719] Internal error: Oops: 96000007 [#1] PREEMPT SMP
[ 3660.275266] Modules linked in: hisi_sas_v3_hw(+) hisi_sas_main hns_roce_hw_v2(O) hns_roce(O) rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib ib_umad realtek hns3(O) hclge(O) hnae3(O) crc32_ce crct10dif_ce hisi_hpre hisi_zip hisi_qm uacce hisi_trng_v2 rng_core sfc lbc ip_tables x_tables libsas scsi_transport_sas [last unloaded: hisi_sas_main]
[ 3660.308227] Process smartd (pid: 19570, stack limit = 0x000000001103634d)
[ 3660.314985] CPU: 31 PID: 19570 Comm: smartd Kdump: loaded Tainted: G O 4.19.36-g32894fc #1
[ 3660.324504] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 2280-V2 CS V3.B220.02 03/27/2020
[ 3660.334110] pstate: 60400009 (nZCv daif +PAN -UAO)
[ 3660.338882] pc : scsi_device_put+0x18/0x38
[ 3660.342961] lr : scsi_disk_put+0x3c/0x58
[ 3660.346865] sp : ffff0000158a3cb0
[ 3660.350164] x29: ffff0000158a3cb0 x28: ffff8027b8111000
[ 3660.355451] x27: 00000000080a005d x26: 0000000000000000
[ 3660.360738] x25: ffff8027c6310398 x24: ffff8027cd2ec410
[ 3660.366025] x23: ffff000009811000 x22: ffff80276d274750
[ 3660.371312] x21: ffff8027abdd5000 x20: ffff8027b8110800
[ 3660.376599] x19: ffff8027abdd5000 x18: 0000000000000000
[ 3660.381886] x17: 0000000000000000 x16: 0000000000000000
[ 3660.387172] x15: 0000000000000000 x14: 0000000000000000
[ 3660.392459] x13: ffff000009996cd0 x12: ffffffffffffffff
[ 3660.397746] x11: ffff000009996cc8 x10: 0000000000000000
[ 3660.403033] x9 : 0000000000000000 x8 : 0000000040000000
[ 3660.408320] x7 : ffff0000098116c8 x6 : 0000000000000000
[ 3660.413607] x5 : ffff00000820ebbc x4 : ffff7e009eb8fb20
[ 3660.418894] x3 : 0000000080400009 x2 : ffff8027ae3ec600
[ 3660.424180] x1 : 71b6030ca20bb300 x0 : ffff00000116f000
[ 3660.429467] Call trace:
[ 3660.431904] scsi_device_put+0x18/0x38
[ 3660.435636] scsi_disk_put+0x3c/0x58
[ 3660.439195] sd_release+0x50/0xc0
[ 3660.442496] __blkdev_put+0x20c/0x220
[ 3660.446141] blkdev_put+0x4c/0x110
[ 3660.449527] blkdev_close+0x1c/0x28
[ 3660.453000] __fput+0x88/0x1b8
[ 3660.456042] ____fput+0xc/0x18
[ 3660.459085] task_work_run+0x94/0xb0
[ 3660.462646] do_notify_resume+0x17c/0x180
[ 3660.466637] work_pending+0x8/0x10
[ 3660.470022] Code: f9000bf3 aa0003f3 f9400000 f9404c00 (f9400000)
[ 3660.476089] ---[ end trace ca1d0144f9241f71 ]---
void scsi_device_put(struct scsi_device *sdev)
{
	module_put(sdev->host->hostt->module);	---> error code
	put_device(&sdev->sdev_gendev);
}
The exception occurs on the access to "sdev->host->hostt", which points into
the address space of a module that has already been removed.

delete_module first checks the module reference count and then calls the
module exit function. Between passing the reference count check and calling
the module exit function, scsi_device_get can therefore still succeed,
because "scsi: fix failing unload of a LLDD module" made it take the module
reference unconditionally, even while the module is being removed.

Revert that patch. The error it originally addressed has already been fixed
by "scsi: fixup kernel warning during rmmod()".
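The difference between the two ways of taking a module reference can be
illustrated with a small userspace model. This is only a sketch: the
toy_* names are hypothetical, the model is single-threaded, and it is not
the kernel's struct module or its real refcount implementation. It only
mirrors the behaviour described above, where a checked get (like
try_module_get) refuses once unload has begun, while an unconditional get
(like __module_get) still succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not the kernel's struct module. */
enum toy_state { TOY_LIVE, TOY_GOING };

struct toy_module {
	enum toy_state state;
	int refcnt;
};

/* Models a checked get: fails once the module is being unloaded. */
static bool toy_try_get(struct toy_module *m)
{
	if (m->state != TOY_LIVE)
		return false;
	m->refcnt++;
	return true;
}

/* Models an unconditional get: never looks at the module state. */
static void toy_force_get(struct toy_module *m)
{
	m->refcnt++;
}

/* Drops one reference; the caller must hold one. */
static void toy_put(struct toy_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/*
 * Models the reference count check at the start of unload: it only
 * succeeds when there are no users, then marks the module as going.
 * The exit function would run some time after this returns true.
 */
static bool toy_begin_unload(struct toy_module *m)
{
	if (m->refcnt != 0)
		return false;
	m->state = TOY_GOING;
	return true;
}
```

In this model, once toy_begin_unload() has returned true, toy_try_get()
fails, but toy_force_get() still hands out a reference to a module that
is about to go away. That is the window the revert closes.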
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 drivers/scsi/scsi.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 7d472c2..fc1356d 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -544,6 +544,9 @@ int scsi_report_opcode(struct scsi_device *sdev, unsigned char *buffer,
  * Description: Gets a reference to the scsi_device and increments the use count
  * of the underlying LLDD module. You must hold host_lock of the
  * parent Scsi_Host or already have a reference when calling this.
+ *
+ * This will fail if a device is deleted or cancelled, or when the LLD module
+ * is in the process of being unloaded.
  */
 int scsi_device_get(struct scsi_device *sdev)
 {
@@ -551,12 +554,12 @@ int scsi_device_get(struct scsi_device *sdev)
 		goto fail;
 	if (!get_device(&sdev->sdev_gendev))
 		goto fail;
-	/* We can fail try_module_get if we're doing SCSI operations
-	 * from module exit (like cache flush)
-	 */
-	__module_get(sdev->host->hostt->module);
+	if (!try_module_get(sdev->host->hostt->module))
+		goto fail_put_device;
 	return 0;
 
+fail_put_device:
+	put_device(&sdev->sdev_gendev);
 fail:
 	return -ENXIO;
 }