The iova rcache depot can become a performance bottleneck. Set the iova rcache global magazine max size to 128 to fix it.
Zhang Zekun (2):
  iommu/iova: increase the iova_rcache depot max size
  config: enable setting the max iova mag size to 128
 arch/arm64/configs/openeuler_defconfig |  1 +
 drivers/iommu/Kconfig                  | 10 ++++++++++
 include/linux/iova.h                   |  2 +-
 3 files changed, 12 insertions(+), 1 deletion(-)
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7ASVH
CVE: NA
---------------------------------------
In an fio test with iodepth=256 and allowed CPUs set to 0-255, we observe a severe performance decrease. The cache hit rate statistics are relatively low. Here are some statistics for the iova_cpu_rcache across all CPUs:
iova alloc order              0       1       2       3       4       5
------------------------------------------------------------------------
average cpu_rcache hit rate   0.9941  0.7408  0.8109  0.8854  0.9082  0.8887
Jobs: 12 (f=12): [R(12)][20.0%][r=1091MiB/s][r=279k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][22.2%][r=1426MiB/s][r=365k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][25.0%][r=1607MiB/s][r=411k IOPS][eta 00m:27s]
Jobs: 12 (f=12): [R(12)][27.8%][r=1501MiB/s][r=384k IOPS][eta 00m:26s]
Jobs: 12 (f=12): [R(12)][30.6%][r=1486MiB/s][r=380k IOPS][eta 00m:25s]
Jobs: 12 (f=12): [R(12)][33.3%][r=1393MiB/s][r=357k IOPS][eta 00m:24s]
Jobs: 12 (f=12): [R(12)][36.1%][r=1550MiB/s][r=397k IOPS][eta 00m:23s]
Jobs: 12 (f=12): [R(12)][38.9%][r=1485MiB/s][r=380k IOPS][eta 00m:22s]
The underlying hisi_sas driver has 16 threaded IRQs that free IOVAs, but these IRQ callbacks only free IOVAs on 16 specific CPUs (cpu{0,16,32,...,240}). For example, the threaded IRQ whose SMP affinity is 0-15 will only free IOVAs on cpu 0. However, the driver allocates IOVAs on all CPUs (cpu{0-255}), so CPUs that have no free IOVA in their local cpu_rcache need to get free IOVAs from iova_rcache->depot. The current max size of iova_rcache->depot is 32, which seems too small for 256 users (16 CPUs put IOVAs into iova_rcache->depot while 240 CPUs try to get IOVAs from it). Setting the iova_rcache->depot max size to 128 fixes the performance issue, and performance returns to normal.
iova alloc order              0       1       2       3       4       5
------------------------------------------------------------------------
average cpu_rcache hit rate   0.9925  0.9736  0.9789  0.9867  0.9889  0.9906
Jobs: 12 (f=12): [R(12)][12.9%][r=7526MiB/s][r=1927k IOPS][eta 04m:30s]
Jobs: 12 (f=12): [R(12)][13.2%][r=7527MiB/s][r=1927k IOPS][eta 04m:29s]
Jobs: 12 (f=12): [R(12)][13.5%][r=7529MiB/s][r=1927k IOPS][eta 04m:28s]
Jobs: 12 (f=12): [R(12)][13.9%][r=7531MiB/s][r=1928k IOPS][eta 04m:27s]
Jobs: 12 (f=12): [R(12)][14.2%][r=7529MiB/s][r=1928k IOPS][eta 04m:26s]
Jobs: 12 (f=12): [R(12)][14.5%][r=7528MiB/s][r=1927k IOPS][eta 04m:25s]
Jobs: 12 (f=12): [R(12)][14.8%][r=7527MiB/s][r=1927k IOPS][eta 04m:24s]
Jobs: 12 (f=12): [R(12)][15.2%][r=7525MiB/s][r=1926k IOPS][eta 04m:23s]
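For context, the contended path is the depot hand-off in drivers/iommu/iova.c: when a CPU's two local magazines are empty on allocation, or both full on free, it takes rcache->lock and exchanges a magazine with the shared depot. The sketch below is modelled on the iova rcache code of this kernel generation and is simplified (per-CPU locking, allocation failures and statistics omitted; the names follow the real code, but this is not a verbatim quote):

/* Allocation slow path: both local magazines are empty, pull from the depot. */
spin_lock(&rcache->lock);
if (rcache->depot_size > 0) {
	/* Swap in a full magazine left behind by a freeing CPU. */
	iova_magazine_free(cpu_rcache->loaded);
	cpu_rcache->loaded = rcache->depot[--rcache->depot_size];
	has_pfn = true;
}
spin_unlock(&rcache->lock);

/* Free slow path: both local magazines are full, push one into the depot. */
spin_lock(&rcache->lock);
if (rcache->depot_size < MAX_GLOBAL_MAGS)
	rcache->depot[rcache->depot_size++] = cpu_rcache->loaded;
else
	mag_to_free = cpu_rcache->loaded;	/* depot full: IOVAs go back to the rbtree */
spin_unlock(&rcache->lock);

With only 32 depot slots buffering between the 16 freeing CPUs and the 240 allocating CPUs, CPUs frequently miss in the depot and fall through to the slow rbtree path, which matches the low hit rates above; enlarging the depot reduces those misses.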
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
---
 drivers/iommu/Kconfig | 10 ++++++++++
 include/linux/iova.h  |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f04a2bde0018..54d4a8cc3876 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -437,5 +437,15 @@ config SMMU_BYPASS_DEV
 	  This feature will be replaced by ACPI IORT RMR node, which will be
 	  upstreamed in mainline.
 
+config IOVA_MAX_GLOBAL_MAGS
+	int "Set the max iova global magazines in iova rcache"
+	range 16 2048
+	default "32"
+	help
+	  The iova rcache global magazine is shared among all CPUs. Its size
+	  can become a bottleneck when many CPUs contend for it. If you are
+	  suffering from slow IOVA allocation with more than 128 CPUs, try
+	  tuning this config to a larger value.
+
 endif # IOMMU_SUPPORT
diff --git a/include/linux/iova.h b/include/linux/iova.h
index dfa51ae49666..3cb469b366d7 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -26,7 +26,7 @@ struct iova_magazine;
 struct iova_cpu_rcache;
 
 #define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
-#define MAX_GLOBAL_MAGS 32	/* magazines per bin */
+#define MAX_GLOBAL_MAGS CONFIG_IOVA_MAX_GLOBAL_MAGS	/* magazines per bin */
 
 struct iova_rcache {
 	spinlock_t lock;
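For reference, MAX_GLOBAL_MAGS sizes the fixed depot array in struct iova_rcache, so the new CONFIG_IOVA_MAX_GLOBAL_MAGS only grows this per-bin pointer array; the magazines themselves are still allocated on demand. A sketch of the structure as it looks in this kernel generation (comments added here; the exact layout may differ slightly):

struct iova_rcache {
	spinlock_t lock;				/* serializes depot access */
	unsigned long depot_size;			/* full magazines currently held */
	struct iova_magazine *depot[MAX_GLOBAL_MAGS];	/* now CONFIG_IOVA_MAX_GLOBAL_MAGS slots */
	struct iova_cpu_rcache __percpu *cpu_rcaches;	/* two magazines per CPU */
};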
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7ASVH
CVE: NA
---------------------------------------
Set the iova rcache global magazine max size (CONFIG_IOVA_MAX_GLOBAL_MAGS) to 128 to support more concurrency in IOVA allocation and fix the problem described in the bugzilla issue above.
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
---
 arch/arm64/configs/openeuler_defconfig | 1 +
 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index eb4ee0522446..213c25c623a2 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -5950,6 +5950,7 @@ CONFIG_ARM_SMMU_V3_PM=y
 # CONFIG_QCOM_IOMMU is not set
 # CONFIG_VIRTIO_IOMMU is not set
 CONFIG_SMMU_BYPASS_DEV=y
+CONFIG_IOVA_MAX_GLOBAL_MAGS=128
 #
 # Remoteproc drivers
On 2023/6/26 16:32, Zhang Zekun wrote:
> hulk inclusion
> category: feature
> bugzilla: https://gitee.com/openeuler/kernel/issues/I7ASVH
> CVE: NA
>
> Set the iova rcache global magazine max size (CONFIG_IOVA_MAX_GLOBAL_MAGS) to 128
> to support more concurrency in IOVA allocation and fix the problem described in
> the bugzilla issue above.
>
> Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
>
>  arch/arm64/configs/openeuler_defconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
> index eb4ee0522446..213c25c623a2 100644
> --- a/arch/arm64/configs/openeuler_defconfig
> +++ b/arch/arm64/configs/openeuler_defconfig
> @@ -5950,6 +5950,7 @@ CONFIG_ARM_SMMU_V3_PM=y
>  # CONFIG_QCOM_IOMMU is not set
>  # CONFIG_VIRTIO_IOMMU is not set
>  CONFIG_SMMU_BYPASS_DEV=y
> +CONFIG_IOVA_MAX_GLOBAL_MAGS=128
No, please don't do this. We need to confirm that there are no performance regressions in other scenarios.
Thanks,
Hanjun
Feedback: The patch(es) you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/1244
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/Q...