From: Zhang Zekun <zhangzekun11@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9TIC6

------------------------------------------

The first four members of struct iova_rcache share a cache line. The
depot-related members are updated every time the depot changes, and
those updates should not affect the "cpu_rcaches" field. Move the
delayed_work up to avoid cache false sharing: it is 192 bytes, which
should be enough to separate the depot members and "cpu_rcaches" into
different cache lines. The improvement is visible in perf top:

Before:
  31.13%  [kernel]  [k] queue_iova
  23.02%  [kernel]  [k] __iova_rcache_get
   7.78%  [kernel]  [k] __arm_lpae_unmap
   6.18%  [kernel]  [k] arm_lpae_map
   3.91%  [kernel]  [k] sch_direct_xmit
   3.19%  [kernel]  [k] __arm_lpae_map
   1.50%  [kernel]  [k] __dev_queue_xmit

After:
  15.88%  [kernel]  [k] __arm_lpae_unmap
  11.33%  [kernel]  [k] arm_lpae_map
   7.98%  [kernel]  [k] sch_direct_xmit
   6.71%  [kernel]  [k] __arm_lpae_map
   5.35%  [kernel]  [k] queue_iova
   3.09%  [kernel]  [k] __dev_queue_xmit
   2.83%  [kernel]  [k] ip_finish_output2
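
The layout effect described above can be checked with a small user-space
sketch. It is only an illustration, not kernel code: the fake_* types are
hypothetical stand-ins, the 192-byte size of the delayed_work stand-in is
taken from this commit message, and the 64-byte cache line and the
line-aligned start of the struct are assumptions (some arm64 systems use
128-byte lines). KABI_RESERVE fields are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; not taken from the patch. */
#define LINE_SIZE 64

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct fake_spinlock { unsigned int raw; };
struct fake_delayed_work { unsigned char pad[192]; }; /* 192 bytes per the commit message */

static_assert(sizeof(struct fake_delayed_work) == 192,
              "stand-in must match the size quoted in the commit message");

/* Member order before the patch: work is last. */
struct rcache_before {
    struct fake_spinlock lock;
    unsigned int depot_size;
    void *depot;
    void *cpu_rcaches;
    void *iovad;
    struct fake_delayed_work work;
};

/* Member order after the patch: work sits between depot and cpu_rcaches. */
struct rcache_after {
    struct fake_spinlock lock;
    unsigned int depot_size;
    void *depot;
    struct fake_delayed_work work;
    void *cpu_rcaches;
    void *iovad;
};

/* Cache line index of an offset, assuming the struct starts on a line boundary. */
static size_t line_of(size_t offset)
{
    return offset / LINE_SIZE;
}

/* Returns 1 if depot and cpu_rcaches fall in the same line in the old layout. */
static int before_shares_line(void)
{
    return line_of(offsetof(struct rcache_before, depot)) ==
           line_of(offsetof(struct rcache_before, cpu_rcaches));
}

/* Returns 1 if depot and cpu_rcaches fall in the same line in the new layout. */
static int after_shares_line(void)
{
    return line_of(offsetof(struct rcache_after, depot)) ==
           line_of(offsetof(struct rcache_after, cpu_rcaches));
}
```

Under these assumptions before_shares_line() reports that the two fields
share a line and after_shares_line() reports that they do not. Because the
stand-in places 192 bytes between them, the same holds for any line size
up to 192 bytes, whatever the alignment of the struct itself.
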

Fixes: 876b598ef137 ("iommu/iova: Make the rcache depot scale better")
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
---
 include/linux/iova.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/iova.h b/include/linux/iova.h
index ccc59e4b6c54..32996e73ce19 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -31,9 +31,9 @@ struct iova_rcache {
 	spinlock_t lock;
 	unsigned int depot_size;
 	struct iova_magazine *depot;
+	struct delayed_work work;
 	struct iova_cpu_rcache __percpu *cpu_rcaches;
 	struct iova_domain *iovad;
-	struct delayed_work work;
 	KABI_RESERVE(1)
 	KABI_RESERVE(2)
 };