Recently we hit a problem when running an FIO test. On our Kunpeng server with 320 cores, about 80% of the CPUs went to 100% usage, and soft lockup warnings appeared in the kernel log, showing CPUs stuck in __alloc_and_insert_iova_range(). Both the call trace and the high CPU usage imply that the iova_rcache performs poorly when allocating IOVAs.
A similar problem was addressed earlier this year; the fix then was simply to enlarge IOVA_MAX_GLOBAL_MAGS to 128, but that value has to grow with the number of CPU cores, and it is hard to pick an accurate value for a given machine. It is therefore better to adopt the upstream solution and replace the iova_rcache->depot array with a list: a full iova_magazine is pushed onto the depot list when the local cpu_rcache overflows, and schedule_delayed_work() frees surplus magazines after a 100ms delay. The delayed work never trims the depot below num_online_cpus() magazines.
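For reference, below is a condensed sketch of how the list-based depot works. It is illustrative only: the field layout, helpers such as iova_depot_pop()/iova_depot_push(), and the locking are simplified assumptions rather than a verbatim copy of the patches. Since every magazine in the depot is full, its size field can double as the list link, and a delayed work item trims the depot back down once it grows past num_online_cpus():

#define IOVA_MAG_SIZE           127
#define IOVA_DEPOT_DELAY        msecs_to_jiffies(100)

struct iova_magazine {
        union {
                unsigned long size;             /* while owned by a cpu_rcache */
                struct iova_magazine *next;     /* while parked in the depot */
        };
        unsigned long pfns[IOVA_MAG_SIZE];
};

struct iova_rcache {
        spinlock_t lock;
        unsigned int depot_size;
        struct iova_magazine *depot;
        struct iova_domain *iovad;
        struct delayed_work work;
        /* per-CPU caches omitted for brevity */
};

static struct iova_magazine *iova_depot_pop(struct iova_rcache *rcache)
{
        struct iova_magazine *mag = rcache->depot;

        rcache->depot = mag->next;
        mag->size = IOVA_MAG_SIZE;      /* depot magazines are always full */
        rcache->depot_size--;
        return mag;
}

static void iova_depot_push(struct iova_rcache *rcache,
                            struct iova_magazine *mag)
{
        mag->next = rcache->depot;
        rcache->depot = mag;
        rcache->depot_size++;
}

/*
 * Runs IOVA_DEPOT_DELAY after the depot grows past num_online_cpus():
 * frees one surplus magazine per invocation and re-arms itself while
 * the depot is still oversized.
 */
static void iova_depot_work_func(struct work_struct *work)
{
        struct iova_rcache *rcache = container_of(work, struct iova_rcache,
                                                  work.work);
        struct iova_magazine *mag = NULL;
        unsigned long flags;

        spin_lock_irqsave(&rcache->lock, flags);
        if (rcache->depot_size > num_online_cpus())
                mag = iova_depot_pop(rcache);
        spin_unlock_irqrestore(&rcache->lock, flags);

        if (mag) {
                iova_magazine_free_pfns(mag, rcache->iovad);
                iova_magazine_free(mag);
                schedule_delayed_work(&rcache->work, IOVA_DEPOT_DELAY);
        }
}

Because the magazines are chained through their own storage, the depot no longer needs a fixed-size array, so there is no per-machine constant like IOVA_MAX_GLOBAL_MAGS left to tune.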
The performance with this patch set looks good. The series relies on delayed work timers to reclaim cached iova_magazines, and timer delays may grow under heavy workload, but this should not cause problems since timers inherently tolerate some slack in their expiry.
We need to merge "iommu/iova: change IOVA_MAG_SIZE to 127 to save memory" first to resolve the following compile error:
error: static assertion failed: "!(sizeof(struct iova_magazine) & (sizeof(struct iova_magazine) - 1))"
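The assertion comes from the depot rework, which wants sizeof(struct iova_magazine) to be a power of two so that a magazine exactly fills its kmalloc slab object. A minimal illustration of the arithmetic, assuming an 8-byte unsigned long (this is a reduced example, not the driver code itself):

#include <linux/build_bug.h>

#define IOVA_MAG_SIZE 127       /* was 128 before the prerequisite patch */

struct iova_magazine {
        unsigned long size;
        unsigned long pfns[IOVA_MAG_SIZE];
};

/*
 * IOVA_MAG_SIZE = 128: sizeof = (1 + 128) * 8 = 1032 bytes, not a power
 *                      of two, so kmalloc pads it out to a 2048-byte slab
 *                      object and the assertion below fires.
 * IOVA_MAG_SIZE = 127: sizeof = (1 + 127) * 8 = 1024 bytes, a power of
 *                      two, so the assertion holds and no slab memory is
 *                      wasted.
 */
static_assert(!(sizeof(struct iova_magazine) & (sizeof(struct iova_magazine) - 1)));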
Feng Tang (1):
  iommu/iova: change IOVA_MAG_SIZE to 127 to save memory

Zhang Zekun (5):
  Revert "iommu/iova: move IOVA_MAX_GLOBAL_MAGS outside of IOMMU_SUPPORT"
  Revert "config: enable set the max iova mag size to 128"
  Revert "iommu/iova: increase the iova_rcache depot max size to 128"
  iommu/iova: Make the rcache depot scale better
  iommu/iova: Manage the depot list size

 arch/arm64/configs/openeuler_defconfig |  1 -
 drivers/iommu/Kconfig                  | 10 ----
 drivers/iommu/iova.c                   | 79 +++++++++++++++++++-------
 include/linux/iova.h                   |  7 ++-
 4 files changed, 63 insertions(+), 34 deletions(-)