On 22/11/2021 02:24, chenxiang (M) wrote:
On 2021/11/19 23:09, John Garry wrote:
On 19/11/2021 03:34, chenxiang wrote:
Is there any measurable performance difference here? It seems risky and slightly messy, so it would really need to be worth it. More below.
Previously I saw with FlameGraph and perf that memset() accounts for some of the time here. I think maybe we can remove those calls.
I'd still rather know the performance gain.
For slot->cmd_hdr, we can optimize as the patch does. For slot->buf (including the status buffer and command table), maybe we don't need to allocate it separately and can let it be allocated with the request; it will then be initialized to 0 in scsi_init_command() (which zeroes all of it, the SCSI command plus the LLDD private data, in one go, and I think that is more efficient).
From: Xiang Chen <chenxiang66@hisilicon.com>
...
And why not zero dif_prd_table_addr? What if the previous usage of the header had this set?
hdr->sg_len has two parts: the length of the DATA SGL/SGE and the length of the DIF SGL/SGE. The hardware accesses that data only when the corresponding length is non-zero. If the length of the DATA SGL/SGE or the DIF SGL/SGE is non-zero, the corresponding base address field will be overwritten. If both lengths are zero, we don't need to care about the addresses, as the hardware will not access them. So I think we don't need to zero them.
I'd still rather just zero these fields for safety.
Thanks, John
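
For readers following the thread, the trade-off discussed above can be sketched roughly as below. This is only an illustration under stated assumptions: struct demo_cmd_hdr is a simplified, hypothetical header and not the real driver layout, the split of sg_len into a low data half and a high DIF half is assumed, and all demo_* helpers are made-up names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified command header. The real hardware header has
 * more fields and a different layout; this only models the ones discussed.
 */
struct demo_cmd_hdr {
	uint32_t dw0;
	uint32_t dw1;
	uint32_t sg_len;             /* assumed: low 16 bits data SGE count, high 16 bits DIF SGE count */
	uint64_t prd_table_addr;     /* data SGL base address */
	uint64_t dif_prd_table_addr; /* DIF SGL base address */
};

/* Baseline: clear the whole header with one memset(). */
static void demo_hdr_reset_full(struct demo_cmd_hdr *hdr)
{
	memset(hdr, 0, sizeof(*hdr));
}

/*
 * Selective reset: clear only the control words and the lengths.
 * The address fields keep whatever the previous user of the header left.
 */
static void demo_hdr_reset_selective(struct demo_cmd_hdr *hdr)
{
	hdr->dw0 = 0;
	hdr->dw1 = 0;
	hdr->sg_len = 0;
}

/* Defensive variant: selective reset plus explicit zeroing of the addresses. */
static void demo_hdr_reset_defensive(struct demo_cmd_hdr *hdr)
{
	demo_hdr_reset_selective(hdr);
	hdr->prd_table_addr = 0;
	hdr->dif_prd_table_addr = 0;
}

/* Model of the stated hardware rule: the DIF table is read only if its length is non-zero. */
static int demo_hw_reads_dif_table(const struct demo_cmd_hdr *hdr)
{
	return (hdr->sg_len >> 16) != 0;
}
```

With the selective reset, a stale dif_prd_table_addr can remain in the header, but under the assumed hardware rule it is never dereferenced because the DIF length is zero. The defensive variant costs two extra stores and removes the reliance on that rule; whether either variant is measurably faster than the full memset() is exactly the number asked for above.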