hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7ASVH
CVE: NA
---------------------------------------
In an fio test with iodepth=256 and the allowed CPUs set to 0-255, we observe a severe performance decrease. The cache hit rate statistics are relatively low. Here are some statistics about the iova_cpu_rcache of all CPUs:
iova alloc order             0      1      2      3      4      5
----------------------------------------------------------------------
average cpu_rcache hit rate  0.9941 0.7408 0.8109 0.8854 0.9082 0.8887
Jobs: 12 (f=12): [R(12)][20.0%][r=1091MiB/s][r=279k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][22.2%][r=1426MiB/s][r=365k IOPS][eta 00m:28s]
Jobs: 12 (f=12): [R(12)][25.0%][r=1607MiB/s][r=411k IOPS][eta 00m:27s]
Jobs: 12 (f=12): [R(12)][27.8%][r=1501MiB/s][r=384k IOPS][eta 00m:26s]
Jobs: 12 (f=12): [R(12)][30.6%][r=1486MiB/s][r=380k IOPS][eta 00m:25s]
Jobs: 12 (f=12): [R(12)][33.3%][r=1393MiB/s][r=357k IOPS][eta 00m:24s]
Jobs: 12 (f=12): [R(12)][36.1%][r=1550MiB/s][r=397k IOPS][eta 00m:23s]
Jobs: 12 (f=12): [R(12)][38.9%][r=1485MiB/s][r=380k IOPS][eta 00m:22s]
The underlying hisi_sas driver has 16 threaded IRQs to free IOVAs, but their IRQ callback functions only free IOVAs on 16 specific CPUs (cpu{0,16,32,...,240}). For example, the threaded IRQ whose SMP affinity is 0-15 will only free IOVAs on cpu 0. However, the driver allocates IOVAs on all CPUs (cpu{0-255}), so CPUs without free IOVAs in their local cpu_rcache need to get free IOVAs from iova_rcache->depot. The current max size of iova_rcache->depot is 32, which seems too small for 256 users (16 CPUs put IOVAs into iova_rcache->depot while 240 CPUs try to get IOVAs from it). Setting the max size of iova_rcache->depot to 128 fixes the performance issue, and performance returns to normal.
iova alloc order             0      1      2      3      4      5
----------------------------------------------------------------------
average cpu_rcache hit rate  0.9925 0.9736 0.9789 0.9867 0.9889 0.9906
Jobs: 12 (f=12): [R(12)][12.9%][r=7526MiB/s][r=1927k IOPS][eta 04m:30s]
Jobs: 12 (f=12): [R(12)][13.2%][r=7527MiB/s][r=1927k IOPS][eta 04m:29s]
Jobs: 12 (f=12): [R(12)][13.5%][r=7529MiB/s][r=1927k IOPS][eta 04m:28s]
Jobs: 12 (f=12): [R(12)][13.9%][r=7531MiB/s][r=1928k IOPS][eta 04m:27s]
Jobs: 12 (f=12): [R(12)][14.2%][r=7529MiB/s][r=1928k IOPS][eta 04m:26s]
Jobs: 12 (f=12): [R(12)][14.5%][r=7528MiB/s][r=1927k IOPS][eta 04m:25s]
Jobs: 12 (f=12): [R(12)][14.8%][r=7527MiB/s][r=1927k IOPS][eta 04m:24s]
Jobs: 12 (f=12): [R(12)][15.2%][r=7525MiB/s][r=1926k IOPS][eta 04m:23s]
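The improvement follows from how the depot is bounded. For reference, here is a crude, self-contained userspace model of that bound (it is not the kernel code and is not part of this patch); it treats the depot as a bounded buffer between the 16 freeing CPUs and the 240 allocating CPUs, and the bursty free/alloc pattern is only an assumption chosen to illustrate why raising the bound from 32 to 128 helps:

/* Toy model of the bounded iova_rcache depot described above. */
#include <stdio.h>

static int run(int depot_bound)
{
	int stored = 0, hits = 0;
	int round, mag, cpu;

	for (round = 0; round < 1000; round++) {
		/* Burst of frees: the 16 freeing CPUs offer 240 full magazines. */
		for (mag = 0; mag < 240; mag++) {
			if (stored < depot_bound)
				stored++;	/* magazine kept in the depot */
			/* else: IOVAs go back to the rbtree, lost to the cache */
		}
		/* Burst of allocations: 240 CPUs try to refill empty magazines. */
		for (cpu = 0; cpu < 240; cpu++) {
			if (stored > 0) {
				stored--;	/* refill served from the depot */
				hits++;
			}
			/* else: cache miss, fall back to the slow path */
		}
	}
	return hits;
}

int main(void)
{
	printf("depot bound  32: %6d of 240000 refills hit the depot\n", run(32));
	printf("depot bound 128: %6d of 240000 refills hit the depot\n", run(128));
	return 0;
}

In this toy model the larger bound lets four times as many refills be served from the depot instead of falling back to the rbtree slow path, which matches the direction of the measured hit-rate improvement.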
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
---
 drivers/iommu/Kconfig | 10 ++++++++++
 include/linux/iova.h  |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f04a2bde0018..54d4a8cc3876 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -437,5 +437,15 @@ config SMMU_BYPASS_DEV
 	  This feature will be replaced by ACPI IORT RMR node, which will be
 	  upstreamed in mainline.
 
+config IOVA_MAX_GLOBAL_MAGS
+	int "Set the max iova global magazines in iova rcache"
+	range 16 2048
+	default "32"
+	help
+	  The iova rcache global magazine is shared among all CPUs.  Its size
+	  can become a bottleneck when lots of CPUs are contending for it.
+	  If IOVA allocation is slow on systems with more than 128 CPUs,
+	  try tuning this config to a larger value.
+
 endif # IOMMU_SUPPORT

diff --git a/include/linux/iova.h b/include/linux/iova.h
index dfa51ae49666..3cb469b366d7 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -26,7 +26,7 @@ struct iova_magazine;
 struct iova_cpu_rcache;
 
 #define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
-#define MAX_GLOBAL_MAGS 32	/* magazines per bin */
+#define MAX_GLOBAL_MAGS CONFIG_IOVA_MAX_GLOBAL_MAGS	/* magazines per bin */
 
 struct iova_rcache {
 	spinlock_t lock;
Feedback: The patch(es) you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/1223
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/O...
On 2023/6/25 15:38, Zhang Zekun wrote:
> [...]
> -#define MAX_GLOBAL_MAGS 32	/* magazines per bin */
> +#define MAX_GLOBAL_MAGS CONFIG_IOVA_MAX_GLOBAL_MAGS	/* magazines per bin */
It would be better to make this tunable at runtime, but that can be done later.
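For illustration only (not part of this patch), a runtime knob could look roughly like the following module-parameter sketch; the name iova_max_global_mags is hypothetical, and a real change would also need to size (or cap) the depot array for the largest allowed value and read the variable where the code currently compares against MAX_GLOBAL_MAGS:

/*
 * Hypothetical sketch only: expose the depot limit as a writable module
 * parameter (visible under /sys/module/<name>/parameters/).  It does
 * nothing by itself; it only shows the shape of a runtime knob.
 */
#include <linux/module.h>
#include <linux/moduleparam.h>

static unsigned int iova_max_global_mags = 32;
module_param(iova_max_global_mags, uint, 0644);
MODULE_PARM_DESC(iova_max_global_mags,
		 "Max number of full magazines kept in each iova rcache depot");

MODULE_LICENSE("GPL");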
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>