On 2021/11/25 19:54, John Garry wrote:
On 23/11/2021 12:04, chenxiang (M) wrote:
Sorry to say, but I find this hard to read.
sas_task is already released, but sas_free_task() will still be called again.
What are these situations?
For TMF IO that fails with stat = SAS_OPEN_REJECT and resp = SAS_TASK_COMPLETE, it will retry three times. Every time, the sas_task is allocated at the beginning of the loop and freed at the end.
I am looking at hisi_sas_main.c, and for this case on each retry we free the sas_task at the bottom of the loop and then set the pointer to NULL. And after we break out from 3 failed retries we do call sas_free_task() again but task = NULL and sas_free_task() is NULL safe.
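The flow described above can be sketched in user space as follows. This is only a rough model, not the hisi_sas code: model_task, model_free_task() and model_exec_tmf() are hypothetical names, malloc()/free() stand in for the kernel allocator, and every attempt is simply assumed to fail.

```c
#include <assert.h>
#include <stdlib.h>

#define TASK_RETRY 3	/* three retries, as stated in the thread */

struct model_task { int id; };	/* hypothetical stand-in for struct sas_task */

static int real_frees;	/* counts frees that actually released memory */

/* NULL-safe free, modelling the behaviour attributed to sas_free_task(). */
static void model_free_task(struct model_task *task)
{
	if (!task)
		return;
	free(task);
	real_frees++;
}

/* Returns the number of attempts made, or -1 if an allocation fails. */
static int model_exec_tmf(void)
{
	struct model_task *task = NULL;
	int retry;

	for (retry = 0; retry < TASK_RETRY; retry++) {
		task = malloc(sizeof(*task));	/* allocated at the top of the loop */
		if (!task)
			return -1;
		task->id = retry;

		/* ... issue the TMF; assumed here to fail with an open reject ... */

		model_free_task(task);	/* freed at the bottom of the loop */
		task = NULL;		/* pointer cleared after the in-loop free */
	}

	assert(task == NULL);		/* nothing left to free after the loop */
	model_free_task(task);		/* the 4th call: a no-op on NULL */
	return retry;
}
```

In this model the fourth call never touches memory, so only three real frees happen. The concern raised below applies only if the pointer still holds the old address when that last call runs.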
But it will free the sas_task again (outside the loop) at the end of the function.
Are you sure? As explained above, I think that this is safe.
But between the 3rd and the 4th sas_free_task() calls, it is possible that the freed sas_task is allocated for other IO. TMF IOs from different disks are issued asynchronously when sending the ATA reset (see the following logs):

[35133.522944] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
[35133.538102] sas: ata58: end_device-2:0:0: dev error handler
[35133.589002] sas: ata59: end_device-2:0:1: dev error handler
[35133.633502] sas: ata60: end_device-2:0:2: dev error handler
[35133.688151] sas: ata61: end_device-2:0:5: dev error handler
[35135.898451] ata58.00: ATA-10: ST4000NM0035-1V4107, TN03, max UDMA/133
[35135.973229] ata59.00: ATA-10: ST4000NM0035-1V4107, TN03, max UDMA/133
[35135.986567] ata60.00: ATA-10: ST4000NM0035-1V4107, TN03, max UDMA/133
[35136.002164] ata61.00: ATA-10: ST4000NM0035-1V4107, TN03, max UDMA/133
[35151.121408] hisi_sas_v3_hw 0000:30:04.0: erroneous completion iptt=4076 task=00000000a6e52fa3 dev id=1 exp 0x500e004aaaaaaa1f phy0 addr=500e004aaaaaaa00 CQ hdr: 0x101b 0x10fec 0x0 0x0 Error info: 0x8000 0x0 0x0 0x0
[35151.149873] hisi_sas_v3_hw 0000:30:04.0: abort tmf: open reject failed
[35151.268958] hisi_sas_v3_hw 0000:30:04.0: erroneous completion iptt=4077 task=0000000025465ccf dev id=1 exp 0x500e004aaaaaaa1f phy0 addr=500e004aaaaaaa00 CQ hdr: 0x101b 0x10fed 0x0 0x0 Error info: 0x8000 0x0 0x0 0x0
[35151.293686] hisi_sas_v3_hw 0000:30:04.0: abort tmf: open reject failed
[35151.380960] hisi_sas_v3_hw 0000:30:04.0: erroneous completion iptt=4089 task=00000000b7661372 dev id=2 exp 0x500e004aaaaaaa1f phy1 addr=500e004aaaaaaa01 CQ hdr: 0x101b 0x20ff9 0x0 0x0 Error info: 0x8000 0x0 0x0 0x0
[35151.397819] hisi_sas_v3_hw 0000:30:04.0: erroneous completion iptt=4092 task=000000003815f088 dev id=5 exp 0x500e004aaaaaaa1f phy5 addr=500e004aaaaaaa05 CQ hdr: 0x101b 0x50ffc 0x0 0x0 Error info: 0x8000 0x0 0x0 0x0
Thanks, John
But if the freed sas_task is allocated by other IO before the sas_task is freed at the end of the function, that final call actually frees the other IO's sas_task, which will cause a memory issue.
thread 1                thread 2
allocate task0
free task0
                        allocate task0 (as task0 is already freed by thread 1)
free task0
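The interleaving above can be replayed sequentially with a toy one-slot pool. This is a sketch under stated assumptions, not driver code: pool_alloc(), pool_free() and replay_stale_free() are hypothetical names, the pool stands in for whatever allocator hands out sas_task memory, and thread 1 is assumed to keep its old pointer after the first free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sas_task living in a one-slot pool. */
struct model_task {
	bool in_use;
	int owner;	/* which "thread" currently owns the slot */
};

static struct model_task slot;

static struct model_task *pool_alloc(int owner)
{
	if (slot.in_use)
		return NULL;
	slot.in_use = true;
	slot.owner = owner;
	return &slot;
}

/* Returns the owner whose task was released, or 0 if nothing was in use. */
static int pool_free(struct model_task *task)
{
	int owner;

	if (!task || !task->in_use)
		return 0;
	owner = task->owner;
	task->in_use = false;
	return owner;
}

/* Replays the sequence; returns the owner hit by thread 1's second free. */
static int replay_stale_free(void)
{
	struct model_task *t1 = pool_alloc(1);	/* thread 1: allocate task0 */
	struct model_task *t2;

	pool_free(t1);				/* thread 1: free task0, pointer kept */
	t2 = pool_alloc(2);			/* thread 2: allocate task0 */
	assert(t2 == t1);			/* same memory handed out again */
	return pool_free(t1);			/* thread 1: free task0 via stale pointer */
}
```

Here the second free from thread 1 releases the slot that thread 2 now owns. If thread 1 sets its pointer to NULL after the first free, the last call is given NULL instead and the slot stays with thread 2.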