Random performance drops appear in Hackbench cases that test pipe or socket communication among multiple threads on the HiSilicon HIP08 SoC. Two factors lead to this problem: cache sharing caused by a change in the data layout, and the cache readunique prefetch mechanism.
The readunique mechanism, which may be triggered by a store operation, invalidates cachelines on other cores during the data fetch stage. When data is shared between cores, this makes cacheline invalidation happen frequently.
Disabling cache readunique prefetch can tackle this problem. The test cases look like this:

    for i in 20; do
        echo "--------pipe thread num=$i----------"
        for j in $(seq 1 10); do
            ./hackbench -pipe $i thread 1000
        done
    done
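To make the run-to-run variation easier to see, the per-run times can be summarised with a small helper. This is only a sketch and not part of the series: summarize_times is a hypothetical name, and it assumes hackbench reports each run as a line of the form "Time: <seconds>".

```shell
# Hypothetical helper (not part of this series): summarise the
# "Time: <seconds>" lines read from stdin.
summarize_times() {
    awk '/^Time:/ {
             t = $2 + 0; n++; sum += t
             if (n == 1 || t < min) min = t   # track the fastest run
             if (n == 1 || t > max) max = t   # track the slowest run
         }
         END {
             if (n) printf "runs=%d min=%.3f max=%.3f mean=%.3f\n", n, min, max, sum / n
         }'
}
```

Piping the output of the inner loop into summarize_times gives one summary line per thread count. A large gap between min and max is the symptom described above.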
We disable readunique prefetch only at EL2, because disabling it at EL1 may cause a panic: EL1 may lack the required access privilege, which is usually configured in the BIOS.
Introduce CONFIG_HISILICON_ERRATUM_HIP08_RU_PREFETCH, and allow RU prefetch to be disabled with the boot cmdline option 'readunique_prefetch=off'.
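When testing, it can help to confirm that the option actually reached the kernel command line. The following is a minimal sketch, not part of the series: ru_prefetch_requested_off is a hypothetical helper name. It only checks that the parameter was passed; whether the kernel applied the workaround still depends on the CPU being HIP08 and on the EL2 condition described above.

```shell
# Hypothetical helper (not part of this series): succeeds if the given
# kernel command line string contains the exact word
# "readunique_prefetch=off".
ru_prefetch_requested_off() {
    # Pad with spaces so that only a whole, space-separated word matches.
    case " $1 " in
        *" readunique_prefetch=off "*) return 0 ;;
    esac
    return 1
}
```

On a booted system it could be used as:

    ru_prefetch_requested_off "$(cat /proc/cmdline)" && echo "RU prefetch disable requested"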
Kai Shen (1):
  arm64: errata: add option to disable cache readunique prefetch on HIP08

Xie XiuQi (1):
  arm64: errata: enable HISILICON_ERRATUM_HIP08_RU_PREFETCH

 arch/arm64/Kconfig                     | 18 +++++++++
 arch/arm64/configs/openeuler_defconfig |  2 +
 arch/arm64/kernel/cpu_errata.c         | 56 ++++++++++++++++++++++++++
 arch/arm64/tools/cpucaps               |  1 +
 4 files changed, 77 insertions(+)