Supported features: pv-qspinlock, pv-sched, TWED, ncsnp, TLBI broadcast (DVMBM)
Jingyi Wang (1):
  KVM: arm64: Make use of TWED feature

Quan Zhou (5):
  KVM: arm64: Support a new HiSi CPU type
  KVM: arm64: Probe and configure DVMBM capability on HiSi CPUs
  KVM: arm64: Add kvm_vcpu_arch::cpus_ptr and pre_cpus_ptr
  KVM: arm64: Add kvm_arch::dvm_cpumask and dvm_lock
  KVM: arm64: Implement the capability of DVMBM

Yanan Wang (1):
  KVM: arm64: Only probe Hisi ncsnp feature on Hisi CPUs

Zenghui Yu (3):
  KVM: arm64: Probe Hisi CPU TYPE from ACPI/DTB
  KVM: arm64: Add support for probing Hisi ncsnp capability
  KVM: arm64: Fix {fp_asimd,sve}_exit_stat manipulation

Zengruan Ye (11):
  arm64: cpufeature: TWED support detection
  KVM: arm64: Document PV-sched interface
  KVM: arm64: Implement PV_SCHED_FEATURES call
  KVM: arm64: Support pvsched preempted via shared structure
  KVM: arm64: Add interface to support vCPU preempted check
  KVM: arm64: Support the vCPU preemption check
  KVM: arm64: Add SMCCC PV-sched to kick cpu
  KVM: arm64: Implement PV_SCHED_KICK_CPU call
  KVM: arm64: Add interface to support PV qspinlock
  KVM: arm64: Enable PV qspinlock
  KVM: arm64: Add tracepoints for PV qspinlock

chenjiajun (1):
  kvm: debugfs: export remaining aarch64 kvm exit reasons to debugfs

lishusen (2):
  KVM: arm64: Replace pv.ops by static_call for PV qspinlock feature
  KVM: arm64: Add configuration for feature pv-sched

yezengruan (1):
  kvm: arm64: fix some pvsched bugs
 C                                           |   0
 .../admin-guide/kernel-parameters.txt       |   3 +
 Documentation/virt/kvm/arm/pvsched.rst      |  74 ++++
 arch/arm64/Kconfig                          |  30 ++
 arch/arm64/configs/openeuler_defconfig      |   1 +
 arch/arm64/include/asm/Kbuild               |   1 -
 arch/arm64/include/asm/kvm_arm.h            |   8 +
 arch/arm64/include/asm/kvm_emulate.h        |  27 ++
 arch/arm64/include/asm/kvm_host.h           |  63 +++
 arch/arm64/include/asm/paravirt.h           |  51 +++
 arch/arm64/include/asm/pvsched-abi.h        |  16 +
 arch/arm64/include/asm/qspinlock.h          |  48 +++
 arch/arm64/include/asm/qspinlock_paravirt.h |  12 +
 arch/arm64/include/asm/spinlock.h           |  13 +
 arch/arm64/include/asm/virt.h               |   5 +
 arch/arm64/kernel/Makefile                  |   3 +-
 arch/arm64/kernel/cpufeature.c              |  12 +
 arch/arm64/kernel/image-vars.h              |   5 +
 arch/arm64/kernel/paravirt-spinlocks.c      |  21 +
 arch/arm64/kernel/paravirt.c                | 189 +++++++++
 arch/arm64/kernel/trace-paravirt.h          |  66 +++
 arch/arm64/kvm/Kconfig                      |   1 +
 arch/arm64/kvm/Makefile                     |   3 +-
 arch/arm64/kvm/arm.c                        |  93 +++++
 arch/arm64/kvm/handle_exit.c                |  12 +
 arch/arm64/kvm/hisilicon/Kconfig            |   7 +
 arch/arm64/kvm/hisilicon/Makefile           |   2 +
 arch/arm64/kvm/hisilicon/hisi_virt.c        | 389 ++++++++++++++++++
 arch/arm64/kvm/hisilicon/hisi_virt.h        |  61 +++
 arch/arm64/kvm/hyp/include/hyp/switch.h     |   2 +
 arch/arm64/kvm/hyp/pgtable.c                |   2 +-
 arch/arm64/kvm/hypercalls.c                 |  24 ++
 arch/arm64/kvm/mmu.c                        |   1 +
 arch/arm64/kvm/pvsched.c                    |  81 ++++
 arch/arm64/kvm/sys_regs.c                   |  11 +
 arch/arm64/kvm/trace_arm.h                  |  18 +
 arch/arm64/tools/cpucaps                    |   1 +
 include/linux/arm-smccc.h                   |  25 ++
 include/linux/cpuhotplug.h                  |   1 +
 39 files changed, 1378 insertions(+), 4 deletions(-)
 create mode 100644 C
 create mode 100644 Documentation/virt/kvm/arm/pvsched.rst
 create mode 100644 arch/arm64/include/asm/pvsched-abi.h
 create mode 100644 arch/arm64/include/asm/qspinlock.h
 create mode 100644 arch/arm64/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/arm64/kernel/paravirt-spinlocks.c
 create mode 100644 arch/arm64/kernel/trace-paravirt.h
 create mode 100644 arch/arm64/kvm/hisilicon/Kconfig
 create mode 100644 arch/arm64/kvm/hisilicon/Makefile
 create mode 100644 arch/arm64/kvm/hisilicon/hisi_virt.c
 create mode 100644 arch/arm64/kvm/hisilicon/hisi_virt.h
 create mode 100644 arch/arm64/kvm/pvsched.c
From: Zenghui Yu <yuzenghui@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
Parse ACPI/DTB to detect which HiSilicon CPU type the hypervisor is running on.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/hisi_cpu_model.h | 19 ++++
 arch/arm64/include/asm/kvm_host.h       |  1 +
 arch/arm64/kvm/Makefile                 |  1 +
 arch/arm64/kvm/arm.c                    |  6 ++
 arch/arm64/kvm/hisi_cpu_model.c         | 83 +++++++++++++++++++++++++
 5 files changed, 110 insertions(+)
 create mode 100644 arch/arm64/include/asm/hisi_cpu_model.h
 create mode 100644 arch/arm64/kvm/hisi_cpu_model.c
diff --git a/arch/arm64/include/asm/hisi_cpu_model.h b/arch/arm64/include/asm/hisi_cpu_model.h new file mode 100644 index 000000000000..f686a7591e8f --- /dev/null +++ b/arch/arm64/include/asm/hisi_cpu_model.h @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + */ + +#ifndef __HISI_CPU_MODEL_H__ +#define __HISI_CPU_MODEL_H__ + +enum hisi_cpu_type { + HI_1612, + HI_1616, + HI_1620, + UNKNOWN_HI_TYPE +}; + +extern enum hisi_cpu_type hi_cpu_type; + +void probe_hisi_cpu_type(void); +#endif /* __HISI_CPU_MODEL_H__ */ diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index af06ccb7ee34..5154085849fa 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -27,6 +27,7 @@ #include <asm/fpsimd.h> #include <asm/kvm.h> #include <asm/kvm_asm.h> +#include <asm/hisi_cpu_model.h>
#define __KVM_HAVE_ARCH_INTC_INITIALIZED
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index c0c050e53157..86c6ded87eeb 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -15,6 +15,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ guest.o debug.o reset.o sys_regs.o stacktrace.o \ vgic-sys-reg-v3.o fpsimd.o pkvm.o \ arch_timer.o trng.o vmid.o emulate-nested.o nested.o \ + hisi_cpu_model.o \ vgic/vgic.o vgic/vgic-init.o \ vgic/vgic-irqfd.o vgic/vgic-v2.o \ vgic/vgic-v3.o vgic/vgic-v4.o \ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 4866b3f7b4ea..9d57a57283b3 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -56,6 +56,9 @@ DECLARE_KVM_NVHE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
static bool vgic_present, kvm_arm_initialised;
+/* Hisi cpu type enum */ +enum hisi_cpu_type hi_cpu_type = UNKNOWN_HI_TYPE; + static DEFINE_PER_CPU(unsigned char, kvm_hyp_initialized); DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
@@ -2415,6 +2418,9 @@ static __init int kvm_arm_init(void) return err; }
+ /* Probe the Hisi CPU type */ + probe_hisi_cpu_type(); + in_hyp_mode = is_kernel_in_hyp_mode();
if (cpus_have_final_cap(ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE) || diff --git a/arch/arm64/kvm/hisi_cpu_model.c b/arch/arm64/kvm/hisi_cpu_model.c new file mode 100644 index 000000000000..4d5a099bc27a --- /dev/null +++ b/arch/arm64/kvm/hisi_cpu_model.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + */ + +#include <linux/acpi.h> +#include <linux/of.h> +#include <linux/init.h> +#include <linux/kvm_host.h> + +#ifdef CONFIG_ACPI + +/* ACPI Hisi oem table id str */ +const char *oem_str[] = { + "HIP06", /* Hisi 1612 */ + "HIP07", /* Hisi 1616 */ + "HIP08" /* Hisi 1620 */ +}; + +/* + * Get Hisi oem table id. + */ +static void acpi_get_hw_cpu_type(void) +{ + struct acpi_table_header *table; + acpi_status status; + int i, str_size = ARRAY_SIZE(oem_str); + + /* Get oem table id from ACPI table header */ + status = acpi_get_table(ACPI_SIG_DSDT, 0, &table); + if (ACPI_FAILURE(status)) { + pr_err("Failed to get ACPI table: %s\n", + acpi_format_exception(status)); + return; + } + + for (i = 0; i < str_size; ++i) { + if (!strncmp(oem_str[i], table->oem_table_id, 5)) { + hi_cpu_type = i; + return; + } + } +} + +#else +static void acpi_get_hw_cpu_type(void) {} +#endif + +/* of Hisi cpu model str */ +const char *of_model_str[] = { + "Hi1612", + "Hi1616" +}; + +static void of_get_hw_cpu_type(void) +{ + const char *cpu_type; + int ret, i, str_size = ARRAY_SIZE(of_model_str); + + ret = of_property_read_string(of_root, "model", &cpu_type); + if (ret < 0) { + pr_err("Failed to get Hisi cpu model by OF.\n"); + return; + } + + for (i = 0; i < str_size; ++i) { + if (strstr(cpu_type, of_model_str[i])) { + hi_cpu_type = i; + return; + } + } +} + +void probe_hisi_cpu_type(void) +{ + if (!acpi_disabled) + acpi_get_hw_cpu_type(); + else + of_get_hw_cpu_type(); + + if (hi_cpu_type == UNKNOWN_HI_TYPE) + pr_warn("UNKNOWN Hisi cpu type.\n"); +}
From: Zenghui Yu <yuzenghui@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
Kunpeng 920 offers the HHA ncsnp capability, with which the hypervisor doesn't need to perform as much cache maintenance as before (in case the guest has some non-cacheable Stage-1 mappings). Currently we apply this hardware capability when:

- a vCPU switches its MMU+caches on/off
- creating Stage-2 mappings for Data aborts
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/hisi_cpu_model.h |  2 ++
 arch/arm64/kvm/arm.c                    |  2 ++
 arch/arm64/kvm/hisi_cpu_model.c         | 34 +++++++++++++++++++++++++
 arch/arm64/kvm/hyp/pgtable.c            |  2 +-
 4 files changed, 39 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/hisi_cpu_model.h b/arch/arm64/include/asm/hisi_cpu_model.h index f686a7591e8f..e0da0ef61613 100644 --- a/arch/arm64/include/asm/hisi_cpu_model.h +++ b/arch/arm64/include/asm/hisi_cpu_model.h @@ -14,6 +14,8 @@ enum hisi_cpu_type { };
extern enum hisi_cpu_type hi_cpu_type; +extern bool kvm_ncsnp_support;
void probe_hisi_cpu_type(void); +void probe_hisi_ncsnp_support(void); #endif /* __HISI_CPU_MODEL_H__ */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 9d57a57283b3..d6d379aeda2e 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -58,6 +58,7 @@ static bool vgic_present, kvm_arm_initialised;
/* Hisi cpu type enum */ enum hisi_cpu_type hi_cpu_type = UNKNOWN_HI_TYPE; +bool kvm_ncsnp_support;
static DEFINE_PER_CPU(unsigned char, kvm_hyp_initialized); DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use); @@ -2420,6 +2421,7 @@ static __init int kvm_arm_init(void)
/* Probe the Hisi CPU type */ probe_hisi_cpu_type(); + probe_hisi_ncsnp_support();
in_hyp_mode = is_kernel_in_hyp_mode();
diff --git a/arch/arm64/kvm/hisi_cpu_model.c b/arch/arm64/kvm/hisi_cpu_model.c index 4d5a099bc27a..52eecf1ba1cf 100644 --- a/arch/arm64/kvm/hisi_cpu_model.c +++ b/arch/arm64/kvm/hisi_cpu_model.c @@ -81,3 +81,37 @@ void probe_hisi_cpu_type(void) if (hi_cpu_type == UNKNOWN_HI_TYPE) pr_warn("UNKNOWN Hisi cpu type.\n"); } + +#define NCSNP_MMIO_BASE 0x20107E238 + +/* + * We have the fantastic HHA ncsnp capability on Kunpeng 920, + * with which hypervisor doesn't need to perform a lot of cache + * maintenance like before (in case the guest has non-cacheable + * Stage-1 mappings). + */ +void probe_hisi_ncsnp_support(void) +{ + void __iomem *base; + unsigned int high; + + kvm_ncsnp_support = false; + + if (hi_cpu_type != HI_1620) + goto out; + + base = ioremap(NCSNP_MMIO_BASE, 4); + if (!base) { + pr_err("Unable to map MMIO region when probing ncsnp!\n"); + goto out; + } + + high = readl_relaxed(base) >> 28; + iounmap(base); + if (high != 0x1) + kvm_ncsnp_support = true; + +out: + kvm_info("Hisi ncsnp: %s\n", kvm_ncsnp_support ? "enabled" : + "disabled"); +} diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index f155b8c9e98c..1ba101ba9392 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -1342,7 +1342,7 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size) .arg = pgt, };
- if (stage2_has_fwb(pgt)) + if (kvm_ncsnp_support || stage2_has_fwb(pgt)) return 0;
return kvm_pgtable_walk(pgt, addr, size, &walker);
From: Yanan Wang <wangyanan55@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
Add kvm_ncsnp_support to image-vars.h to support the ncsnp feature.
----------------------------------------------------
The "ncsnp" is an implementation specific CPU virtualization feature on Hisi 1620 series CPUs. This feature works just like ARM standard S2FWB to reduce some cache management operations in virtualization.
Given that it's Hisi specific feature, let's restrict the detection only to Hisi CPUs. To realize this: 1) Add a sub-directory `hisilicon/` within arch/arm64/kvm to hold code for Hisi specific virtualization features. 2) Add a new kconfig option `CONFIG_KVM_HISI_VIRT` for users to select the whole Hisi specific virtualization features. 3) Add a generic global KVM variable `kvm_ncsnp_support` which is `false` by default. Only re-initialize it when we have `CONFIG_KVM_HISI_VIRT` enabled.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/configs/openeuler_defconfig  |   1 +
 arch/arm64/include/asm/hisi_cpu_model.h |  21 ----
 arch/arm64/include/asm/kvm_host.h       |   3 +-
 arch/arm64/kernel/image-vars.h          |   5 +
 arch/arm64/kvm/Kconfig                  |   1 +
 arch/arm64/kvm/Makefile                 |   2 +-
 arch/arm64/kvm/arm.c                    |  13 ++-
 arch/arm64/kvm/hisi_cpu_model.c         | 117 ----------------------
 arch/arm64/kvm/hisilicon/Kconfig        |   7 ++
 arch/arm64/kvm/hisilicon/Makefile       |   2 +
 arch/arm64/kvm/hisilicon/hisi_virt.c    | 124 ++++++++++++++++++++++++
 arch/arm64/kvm/hisilicon/hisi_virt.h    |  19 ++++
 12 files changed, 171 insertions(+), 144 deletions(-)
 delete mode 100644 arch/arm64/include/asm/hisi_cpu_model.h
 delete mode 100644 arch/arm64/kvm/hisi_cpu_model.c
 create mode 100644 arch/arm64/kvm/hisilicon/Kconfig
 create mode 100644 arch/arm64/kvm/hisilicon/Makefile
 create mode 100644 arch/arm64/kvm/hisilicon/hisi_virt.c
 create mode 100644 arch/arm64/kvm/hisilicon/hisi_virt.h
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig index cc174174c745..eb65231cc67c 100644 --- a/arch/arm64/configs/openeuler_defconfig +++ b/arch/arm64/configs/openeuler_defconfig @@ -714,6 +714,7 @@ CONFIG_KVM_VFIO=y CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y CONFIG_HAVE_KVM_IRQ_BYPASS=y CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE=y +CONFIG_KVM_HISI_VIRT=y CONFIG_KVM_XFER_TO_GUEST_WORK=y CONFIG_KVM_GENERIC_HARDWARE_ENABLING=y CONFIG_VIRTUALIZATION=y diff --git a/arch/arm64/include/asm/hisi_cpu_model.h b/arch/arm64/include/asm/hisi_cpu_model.h deleted file mode 100644 index e0da0ef61613..000000000000 --- a/arch/arm64/include/asm/hisi_cpu_model.h +++ /dev/null @@ -1,21 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * Copyright(c) 2019 Huawei Technologies Co., Ltd - */ - -#ifndef __HISI_CPU_MODEL_H__ -#define __HISI_CPU_MODEL_H__ - -enum hisi_cpu_type { - HI_1612, - HI_1616, - HI_1620, - UNKNOWN_HI_TYPE -}; - -extern enum hisi_cpu_type hi_cpu_type; -extern bool kvm_ncsnp_support; - -void probe_hisi_cpu_type(void); -void probe_hisi_ncsnp_support(void); -#endif /* __HISI_CPU_MODEL_H__ */ diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 5154085849fa..e66e24ea5ab4 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -27,7 +27,6 @@ #include <asm/fpsimd.h> #include <asm/kvm.h> #include <asm/kvm_asm.h> -#include <asm/hisi_cpu_model.h>
#define __KVM_HAVE_ARCH_INTC_INITIALIZED
@@ -1155,4 +1154,6 @@ static inline void kvm_hyp_reserve(void) { } void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu); bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu);
+extern bool kvm_ncsnp_support; + #endif /* __ARM64_KVM_HOST_H__ */ diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index 35f3c7959513..62725396334f 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -106,6 +106,11 @@ KVM_NVHE_ALIAS(__hyp_rodata_end); /* pKVM static key */ KVM_NVHE_ALIAS(kvm_protected_mode_initialized);
+/* ncsnp feature */ +#ifdef CONFIG_KVM_HISI_VIRT +KVM_NVHE_ALIAS(kvm_ncsnp_support); +#endif + #endif /* CONFIG_KVM */
#ifdef CONFIG_EFI_ZBOOT diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig index 83c1e09be42e..8350d43f56d4 100644 --- a/arch/arm64/kvm/Kconfig +++ b/arch/arm64/kvm/Kconfig @@ -5,6 +5,7 @@
source "virt/lib/Kconfig" source "virt/kvm/Kconfig" +source "arch/arm64/kvm/hisilicon/Kconfig"
menuconfig VIRTUALIZATION bool "Virtualization" diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index 86c6ded87eeb..826a05d072d7 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -15,7 +15,6 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ guest.o debug.o reset.o sys_regs.o stacktrace.o \ vgic-sys-reg-v3.o fpsimd.o pkvm.o \ arch_timer.o trng.o vmid.o emulate-nested.o nested.o \ - hisi_cpu_model.o \ vgic/vgic.o vgic/vgic-init.o \ vgic/vgic-irqfd.o vgic/vgic-v2.o \ vgic/vgic-v3.o vgic/vgic-v4.o \ @@ -24,6 +23,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ vgic/vgic-its.o vgic/vgic-debug.o
kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o +obj-$(CONFIG_KVM_HISI_VIRT) += hisilicon/
always-y := hyp_constants.h hyp-constants.s
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index d6d379aeda2e..3b3244b9f6a2 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -47,6 +47,10 @@
static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT;
+#ifdef CONFIG_KVM_HISI_VIRT +#include "hisilicon/hisi_virt.h" +#endif + DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page); @@ -56,8 +60,7 @@ DECLARE_KVM_NVHE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
static bool vgic_present, kvm_arm_initialised;
-/* Hisi cpu type enum */ -enum hisi_cpu_type hi_cpu_type = UNKNOWN_HI_TYPE; +/* Capability of non-cacheable snooping */ bool kvm_ncsnp_support;
static DEFINE_PER_CPU(unsigned char, kvm_hyp_initialized); @@ -2419,9 +2422,11 @@ static __init int kvm_arm_init(void) return err; }
- /* Probe the Hisi CPU type */ +#ifdef CONFIG_KVM_HISI_VIRT probe_hisi_cpu_type(); - probe_hisi_ncsnp_support(); + kvm_ncsnp_support = hisi_ncsnp_supported(); +#endif + kvm_info("KVM ncsnp %s\n", kvm_ncsnp_support ? "enabled" : "disabled");
in_hyp_mode = is_kernel_in_hyp_mode();
diff --git a/arch/arm64/kvm/hisi_cpu_model.c b/arch/arm64/kvm/hisi_cpu_model.c deleted file mode 100644 index 52eecf1ba1cf..000000000000 --- a/arch/arm64/kvm/hisi_cpu_model.c +++ /dev/null @@ -1,117 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * Copyright(c) 2019 Huawei Technologies Co., Ltd - */ - -#include <linux/acpi.h> -#include <linux/of.h> -#include <linux/init.h> -#include <linux/kvm_host.h> - -#ifdef CONFIG_ACPI - -/* ACPI Hisi oem table id str */ -const char *oem_str[] = { - "HIP06", /* Hisi 1612 */ - "HIP07", /* Hisi 1616 */ - "HIP08" /* Hisi 1620 */ -}; - -/* - * Get Hisi oem table id. - */ -static void acpi_get_hw_cpu_type(void) -{ - struct acpi_table_header *table; - acpi_status status; - int i, str_size = ARRAY_SIZE(oem_str); - - /* Get oem table id from ACPI table header */ - status = acpi_get_table(ACPI_SIG_DSDT, 0, &table); - if (ACPI_FAILURE(status)) { - pr_err("Failed to get ACPI table: %s\n", - acpi_format_exception(status)); - return; - } - - for (i = 0; i < str_size; ++i) { - if (!strncmp(oem_str[i], table->oem_table_id, 5)) { - hi_cpu_type = i; - return; - } - } -} - -#else -static void acpi_get_hw_cpu_type(void) {} -#endif - -/* of Hisi cpu model str */ -const char *of_model_str[] = { - "Hi1612", - "Hi1616" -}; - -static void of_get_hw_cpu_type(void) -{ - const char *cpu_type; - int ret, i, str_size = ARRAY_SIZE(of_model_str); - - ret = of_property_read_string(of_root, "model", &cpu_type); - if (ret < 0) { - pr_err("Failed to get Hisi cpu model by OF.\n"); - return; - } - - for (i = 0; i < str_size; ++i) { - if (strstr(cpu_type, of_model_str[i])) { - hi_cpu_type = i; - return; - } - } -} - -void probe_hisi_cpu_type(void) -{ - if (!acpi_disabled) - acpi_get_hw_cpu_type(); - else - of_get_hw_cpu_type(); - - if (hi_cpu_type == UNKNOWN_HI_TYPE) - pr_warn("UNKNOWN Hisi cpu type.\n"); -} - -#define NCSNP_MMIO_BASE 0x20107E238 - -/* - * We have the fantastic HHA ncsnp capability on Kunpeng 920, - * with which hypervisor doesn't need to perform a lot of cache - * maintenance like before (in case the guest has non-cacheable - * Stage-1 mappings). - */ -void probe_hisi_ncsnp_support(void) -{ - void __iomem *base; - unsigned int high; - - kvm_ncsnp_support = false; - - if (hi_cpu_type != HI_1620) - goto out; - - base = ioremap(NCSNP_MMIO_BASE, 4); - if (!base) { - pr_err("Unable to map MMIO region when probing ncsnp!\n"); - goto out; - } - - high = readl_relaxed(base) >> 28; - iounmap(base); - if (high != 0x1) - kvm_ncsnp_support = true; - -out: - kvm_info("Hisi ncsnp: %s\n", kvm_ncsnp_support ? "enabled" : - "disabled"); -} diff --git a/arch/arm64/kvm/hisilicon/Kconfig b/arch/arm64/kvm/hisilicon/Kconfig new file mode 100644 index 000000000000..6536f897a32e --- /dev/null +++ b/arch/arm64/kvm/hisilicon/Kconfig @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0-only +config KVM_HISI_VIRT + bool "HiSilicon SoC specific virtualization features" + depends on ARCH_HISI + help + Support for HiSilicon SoC specific virtualization features. + On non-HiSilicon platforms, say N here. 
diff --git a/arch/arm64/kvm/hisilicon/Makefile b/arch/arm64/kvm/hisilicon/Makefile new file mode 100644 index 000000000000..849f99d1526d --- /dev/null +++ b/arch/arm64/kvm/hisilicon/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-only +obj-$(CONFIG_KVM_HISI_VIRT) += hisi_virt.o diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.c b/arch/arm64/kvm/hisilicon/hisi_virt.c new file mode 100644 index 000000000000..9587f9508a79 --- /dev/null +++ b/arch/arm64/kvm/hisilicon/hisi_virt.c @@ -0,0 +1,124 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright(c) 2022 Huawei Technologies Co., Ltd + */ + +#include <linux/acpi.h> +#include <linux/of.h> +#include <linux/init.h> +#include <linux/kvm_host.h> +#include "hisi_virt.h" + +static enum hisi_cpu_type cpu_type = UNKNOWN_HI_TYPE; + +static const char * const hisi_cpu_type_str[] = { + "Hisi1612", + "Hisi1616", + "Hisi1620", + "Unknown" +}; + +/* ACPI Hisi oem table id str */ +static const char * const oem_str[] = { + "HIP06", /* Hisi 1612 */ + "HIP07", /* Hisi 1616 */ + "HIP08" /* Hisi 1620 */ +}; + +/* + * Probe Hisi CPU type form ACPI. + */ +static enum hisi_cpu_type acpi_get_hisi_cpu_type(void) +{ + struct acpi_table_header *table; + acpi_status status; + int i, str_size = ARRAY_SIZE(oem_str); + + /* Get oem table id from ACPI table header */ + status = acpi_get_table(ACPI_SIG_DSDT, 0, &table); + if (ACPI_FAILURE(status)) { + pr_warn("Failed to get ACPI table: %s\n", + acpi_format_exception(status)); + return UNKNOWN_HI_TYPE; + } + + for (i = 0; i < str_size; ++i) { + if (!strncmp(oem_str[i], table->oem_table_id, 5)) + return i; + } + + return UNKNOWN_HI_TYPE; +} + +/* of Hisi cpu model str */ +static const char * const of_model_str[] = { + "Hi1612", + "Hi1616" +}; + +/* + * Probe Hisi CPU type from DT. + */ +static enum hisi_cpu_type of_get_hisi_cpu_type(void) +{ + const char *model; + int ret, i, str_size = ARRAY_SIZE(of_model_str); + + /* + * Note: There may not be a "model" node in FDT, which + * is provided by the vendor. In this case, we are not + * able to get CPU type information through this way. + */ + ret = of_property_read_string(of_root, "model", &model); + if (ret < 0) { + pr_warn("Failed to get Hisi cpu model by OF.\n"); + return UNKNOWN_HI_TYPE; + } + + for (i = 0; i < str_size; ++i) { + if (strstr(model, of_model_str[i])) + return i; + } + + return UNKNOWN_HI_TYPE; +} + +void probe_hisi_cpu_type(void) +{ + if (!acpi_disabled) + cpu_type = acpi_get_hisi_cpu_type(); + else + cpu_type = of_get_hisi_cpu_type(); + + kvm_info("detected: Hisi CPU type '%s'\n", hisi_cpu_type_str[cpu_type]); +} + +/* + * We have the fantastic HHA ncsnp capability on Kunpeng 920, + * with which hypervisor doesn't need to perform a lot of cache + * maintenance like before (in case the guest has non-cacheable + * Stage-1 mappings). 
+ */ +#define NCSNP_MMIO_BASE 0x20107E238 +bool hisi_ncsnp_supported(void) +{ + void __iomem *base; + unsigned int high; + bool supported = false; + + if (cpu_type != HI_1620) + return supported; + + base = ioremap(NCSNP_MMIO_BASE, 4); + if (!base) { + pr_warn("Unable to map MMIO region when probing ncsnp!\n"); + return supported; + } + + high = readl_relaxed(base) >> 28; + iounmap(base); + if (high != 0x1) + supported = true; + + return supported; +} diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.h b/arch/arm64/kvm/hisilicon/hisi_virt.h new file mode 100644 index 000000000000..ef8de6a2101e --- /dev/null +++ b/arch/arm64/kvm/hisilicon/hisi_virt.h @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright(c) 2022 Huawei Technologies Co., Ltd + */ + +#ifndef __HISI_VIRT_H__ +#define __HISI_VIRT_H__ + +enum hisi_cpu_type { + HI_1612, + HI_1616, + HI_1620, + UNKNOWN_HI_TYPE +}; + +void probe_hisi_cpu_type(void); +bool hisi_ncsnp_supported(void); + +#endif /* __HISI_VIRT_H__ */
From: Quan Zhou <zhouquan65@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
Add a new entry ("HIP09") in oem_str[] to support detection of the new HiSi CPU type.
Signed-off-by: Quan Zhou <zhouquan65@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/kvm/hisilicon/hisi_virt.c | 4 +++-
 arch/arm64/kvm/hisilicon/hisi_virt.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.c b/arch/arm64/kvm/hisilicon/hisi_virt.c index 9587f9508a79..90c363ed642e 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.c +++ b/arch/arm64/kvm/hisilicon/hisi_virt.c @@ -15,6 +15,7 @@ static const char * const hisi_cpu_type_str[] = { "Hisi1612", "Hisi1616", "Hisi1620", + "HIP09", "Unknown" };
@@ -22,7 +23,8 @@ static const char * const hisi_cpu_type_str[] = { static const char * const oem_str[] = { "HIP06", /* Hisi 1612 */ "HIP07", /* Hisi 1616 */ - "HIP08" /* Hisi 1620 */ + "HIP08", /* Hisi 1620 */ + "HIP09" /* HIP09 */ };
/* diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.h b/arch/arm64/kvm/hisilicon/hisi_virt.h index ef8de6a2101e..ebc462bf2a9d 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.h +++ b/arch/arm64/kvm/hisilicon/hisi_virt.h @@ -10,6 +10,7 @@ enum hisi_cpu_type { HI_1612, HI_1616, HI_1620, + HI_IP09, UNKNOWN_HI_TYPE };
From: Quan Zhou <zhouquan65@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
DVMBM is a virtualization extension introduced with HIP09, which allows TLBIs executed at NS EL1 to be broadcast to a configurable range of physical CPUs (even with HCR_EL2.FB set). This enables a TLBI broadcast optimization.

Introduce the logic to detect and enable this feature. Also add a kernel command-line parameter "kvm-arm.dvmbm_enabled" (default 0) so that users can enable or disable DVMBM as needed; a usage example follows below. The parameter description is added under Documentation/.
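For example, on a host where DVMBM is supported, a user would opt in by appending the following to the kernel command line (the value is parsed with strtobool(), so "y"/"n" also work); this is an illustration only:

  kvm-arm.dvmbm_enabled=1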
Signed-off-by: Quan Zhou <zhouquan65@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 .../admin-guide/kernel-parameters.txt |  3 ++
 arch/arm64/include/asm/kvm_host.h     |  1 +
 arch/arm64/kvm/arm.c                  |  5 ++
 arch/arm64/kvm/hisilicon/hisi_virt.c  | 49 +++++++++++++++++++
 arch/arm64/kvm/hisilicon/hisi_virt.h  |  6 +++
 5 files changed, 64 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index e755f76f76bd..49f4b7e41db4 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2648,6 +2648,9 @@ [KVM,ARM] Allow use of GICv4 for direct injection of LPIs.
+ kvm-arm.dvmbm_enabled= + [KVM,ARM] Allow use of HiSilicon DVMBM capability. + kvm_cma_resv_ratio=n [PPC] Reserves given percentage from system memory area for contiguous memory allocation for KVM hash pagetable diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index e66e24ea5ab4..be1e1a6e3754 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1155,5 +1155,6 @@ void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu); bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu);
extern bool kvm_ncsnp_support; +extern bool kvm_dvmbm_support;
#endif /* __ARM64_KVM_HOST_H__ */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 3b3244b9f6a2..7c08597501bf 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -63,6 +63,9 @@ static bool vgic_present, kvm_arm_initialised; /* Capability of non-cacheable snooping */ bool kvm_ncsnp_support;
+/* Capability of DVMBM */ +bool kvm_dvmbm_support; + static DEFINE_PER_CPU(unsigned char, kvm_hyp_initialized); DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
@@ -2425,8 +2428,10 @@ static __init int kvm_arm_init(void) #ifdef CONFIG_KVM_HISI_VIRT probe_hisi_cpu_type(); kvm_ncsnp_support = hisi_ncsnp_supported(); + kvm_dvmbm_support = hisi_dvmbm_supported(); #endif kvm_info("KVM ncsnp %s\n", kvm_ncsnp_support ? "enabled" : "disabled"); + kvm_info("KVM dvmbm %s\n", kvm_dvmbm_support ? "enabled" : "disabled");
in_hyp_mode = is_kernel_in_hyp_mode();
diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.c b/arch/arm64/kvm/hisilicon/hisi_virt.c index 90c363ed642e..b81488cd663b 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.c +++ b/arch/arm64/kvm/hisilicon/hisi_virt.c @@ -11,6 +11,8 @@
static enum hisi_cpu_type cpu_type = UNKNOWN_HI_TYPE;
+static bool dvmbm_enabled; + static const char * const hisi_cpu_type_str[] = { "Hisi1612", "Hisi1616", @@ -124,3 +126,50 @@ bool hisi_ncsnp_supported(void)
return supported; } + +static int __init early_dvmbm_enable(char *buf) +{ + return strtobool(buf, &dvmbm_enabled); +} +early_param("kvm-arm.dvmbm_enabled", early_dvmbm_enable); + +static void hardware_enable_dvmbm(void *data) +{ + u64 val; + + val = read_sysreg_s(SYS_LSUDVM_CTRL_EL2); + val |= LSUDVM_CTLR_EL2_MASK; + write_sysreg_s(val, SYS_LSUDVM_CTRL_EL2); +} + +static void hardware_disable_dvmbm(void *data) +{ + u64 val; + + val = read_sysreg_s(SYS_LSUDVM_CTRL_EL2); + val &= ~LSUDVM_CTLR_EL2_MASK; + write_sysreg_s(val, SYS_LSUDVM_CTRL_EL2); +} + +bool hisi_dvmbm_supported(void) +{ + if (cpu_type != HI_IP09) + return false; + + /* Determine whether DVMBM is supported by the hardware */ + if (!(read_sysreg(aidr_el1) & AIDR_EL1_DVMBM_MASK)) + return false; + + /* User provided kernel command-line parameter */ + if (!dvmbm_enabled || !is_kernel_in_hyp_mode()) { + on_each_cpu(hardware_disable_dvmbm, NULL, 1); + return false; + } + + /* + * Enable TLBI Broadcast optimization by setting + * LSUDVM_CTRL_EL2's bit[0]. + */ + on_each_cpu(hardware_enable_dvmbm, NULL, 1); + return true; +} diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.h b/arch/arm64/kvm/hisilicon/hisi_virt.h index ebc462bf2a9d..95e5e889dcb1 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.h +++ b/arch/arm64/kvm/hisilicon/hisi_virt.h @@ -14,7 +14,13 @@ enum hisi_cpu_type { UNKNOWN_HI_TYPE };
+/* HIP09 */ +#define AIDR_EL1_DVMBM_MASK GENMASK_ULL(13, 12) +#define SYS_LSUDVM_CTRL_EL2 sys_reg(3, 4, 15, 7, 4) +#define LSUDVM_CTLR_EL2_MASK BIT_ULL(0) + void probe_hisi_cpu_type(void); bool hisi_ncsnp_supported(void); +bool hisi_dvmbm_supported(void);
#endif /* __HISI_VIRT_H__ */
From: Quan Zhou <zhouquan65@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
The current thread struct already provides cpus_ptr, which tells us the range of physical CPUs the thread is allowed to run on. So in kvm_arch_vcpu_{load,put} we also know the pCPU range the vCPU thread is allowed to be scheduled on, and that is the range we want to configure for TLBI broadcast.

Introduce two fields, cpus_ptr and pre_cpus_ptr, in struct kvm_vcpu_arch. @cpus_ptr is always a copy of current->cpus_ptr taken at vcpu_load, and @pre_cpus_ptr records the previous value of @cpus_ptr, taken at vcpu_put.
Signed-off-by: Quan Zhou <zhouquan65@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h    |  6 +++++
 arch/arm64/kvm/arm.c                 | 18 +++++++++++++
 arch/arm64/kvm/hisilicon/hisi_virt.c | 38 ++++++++++++++++++++++
 arch/arm64/kvm/hisilicon/hisi_virt.h |  5 ++++
 4 files changed, 67 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index be1e1a6e3754..9e7b2081e11e 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -591,6 +591,12 @@ struct kvm_vcpu_arch {
/* Per-vcpu CCSIDR override or NULL */ u32 *ccsidr; + +#ifdef CONFIG_KVM_HISI_VIRT + /* Copy of current->cpus_ptr */ + cpumask_t *cpus_ptr; + cpumask_t *pre_cpus_ptr; +#endif };
/* diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 7c08597501bf..6d22d5e3626c 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -402,6 +402,12 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) if (err) return err;
+#ifdef CONFIG_KVM_HISI_VIRT + err = kvm_hisi_dvmbm_vcpu_init(vcpu); + if (err) + return err; +#endif + return kvm_share_hyp(vcpu, vcpu + 1); }
@@ -419,6 +425,10 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) kvm_pmu_vcpu_destroy(vcpu);
kvm_arm_vcpu_destroy(vcpu); + +#ifdef CONFIG_KVM_HISI_VIRT + kvm_hisi_dvmbm_vcpu_destroy(vcpu); +#endif }
void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) @@ -475,6 +485,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus)) vcpu_set_on_unsupported_cpu(vcpu); + +#ifdef CONFIG_KVM_HISI_VIRT + kvm_hisi_dvmbm_load(vcpu); +#endif }
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) @@ -490,6 +504,10 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
vcpu_clear_on_unsupported_cpu(vcpu); vcpu->cpu = -1; + +#ifdef CONFIG_KVM_HISI_VIRT + kvm_hisi_dvmbm_put(vcpu); +#endif }
static void __kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.c b/arch/arm64/kvm/hisilicon/hisi_virt.c index b81488cd663b..2c79e7f28ca5 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.c +++ b/arch/arm64/kvm/hisilicon/hisi_virt.c @@ -173,3 +173,41 @@ bool hisi_dvmbm_supported(void) on_each_cpu(hardware_enable_dvmbm, NULL, 1); return true; } + +int kvm_hisi_dvmbm_vcpu_init(struct kvm_vcpu *vcpu) +{ + if (!kvm_dvmbm_support) + return 0; + + vcpu->arch.cpus_ptr = kzalloc(sizeof(cpumask_t), GFP_ATOMIC); + vcpu->arch.pre_cpus_ptr = kzalloc(sizeof(cpumask_t), GFP_ATOMIC); + if (!vcpu->arch.cpus_ptr || !vcpu->arch.pre_cpus_ptr) + return -ENOMEM; + + return 0; +} + +void kvm_hisi_dvmbm_vcpu_destroy(struct kvm_vcpu *vcpu) +{ + if (!kvm_dvmbm_support) + return; + + kfree(vcpu->arch.cpus_ptr); + kfree(vcpu->arch.pre_cpus_ptr); +} + +void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu) +{ + if (!kvm_dvmbm_support) + return; + + cpumask_copy(vcpu->arch.cpus_ptr, current->cpus_ptr); +} + +void kvm_hisi_dvmbm_put(struct kvm_vcpu *vcpu) +{ + if (!kvm_dvmbm_support) + return; + + cpumask_copy(vcpu->arch.pre_cpus_ptr, vcpu->arch.cpus_ptr); +} diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.h b/arch/arm64/kvm/hisilicon/hisi_virt.h index 95e5e889dcb1..3aac75651733 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.h +++ b/arch/arm64/kvm/hisilicon/hisi_virt.h @@ -23,4 +23,9 @@ void probe_hisi_cpu_type(void); bool hisi_ncsnp_supported(void); bool hisi_dvmbm_supported(void);
+int kvm_hisi_dvmbm_vcpu_init(struct kvm_vcpu *vcpu); +void kvm_hisi_dvmbm_vcpu_destroy(struct kvm_vcpu *vcpu); +void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu); +void kvm_hisi_dvmbm_put(struct kvm_vcpu *vcpu); + #endif /* __HISI_VIRT_H__ */
From: Quan Zhou <zhouquan65@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
Introduce dvm_cpumask and dvm_lock in struct kvm_arch. dvm_cpumask stores the union of all vCPUs' cpus_ptr and will be used as the TLBI broadcast range. dvm_lock ensures exclusive manipulation of dvm_cpumask.

In vcpu_load, we only perform the (relatively expensive) recalculation when this vCPU's cpus_ptr has changed, and we skip the copy when the recomputed union equals the current dvm_cpumask, as sketched below.
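Distilled from the hunk below, the slow path is a lock-protected union of every vCPU's cpus_ptr (a sketch only; the fast-path check and allocation details are omitted):

  /* Sketch of the dvm_cpumask update performed in kvm_hisi_dvmbm_load();
   * dvm_lock serialises concurrent updaters of dvm_cpumask.
   */
  spin_lock(&kvm->arch.dvm_lock);

  cpumask_clear(&mask);
  kvm_for_each_vcpu(i, tmp, kvm)
          cpumask_or(&mask, &mask, tmp->arch.cpus_ptr);

  if (!cpumask_equal(kvm->arch.dvm_cpumask, &mask))
          cpumask_copy(kvm->arch.dvm_cpumask, &mask);

  spin_unlock(&kvm->arch.dvm_lock);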
Signed-off-by: Quan Zhou <zhouquan65@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h    |  5 +++
 arch/arm64/kvm/arm.c                 | 11 ++++++
 arch/arm64/kvm/hisilicon/hisi_virt.c | 53 ++++++++++++++++++++++
 arch/arm64/kvm/hisilicon/hisi_virt.h |  2 ++
 4 files changed, 71 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 9e7b2081e11e..327327fcd444 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -279,6 +279,11 @@ struct kvm_arch { * the associated pKVM instance in the hypervisor. */ struct kvm_protected_vm pkvm; + +#ifdef CONFIG_KVM_HISI_VIRT + spinlock_t dvm_lock; + cpumask_t *dvm_cpumask; /* Union of all vcpu's cpus_ptr */ +#endif };
struct kvm_vcpu_fault_info { diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 6d22d5e3626c..6bec6043d173 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -147,6 +147,12 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { int ret;
+#ifdef CONFIG_KVM_HISI_VIRT + ret = kvm_hisi_init_dvmbm(kvm); + if (ret) + return ret; +#endif + mutex_init(&kvm->arch.config_lock);
#ifdef CONFIG_LOCKDEP @@ -158,6 +164,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) #endif
ret = kvm_share_hyp(kvm, kvm + 1); + if (ret) return ret;
@@ -207,6 +214,10 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) */ void kvm_arch_destroy_vm(struct kvm *kvm) { +#ifdef CONFIG_KVM_HISI_VIRT + kvm_hisi_destroy_dvmbm(kvm); +#endif + bitmap_free(kvm->arch.pmu_filter); free_cpumask_var(kvm->arch.supported_cpus);
diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.c b/arch/arm64/kvm/hisilicon/hisi_virt.c index 2c79e7f28ca5..3dd2a747b780 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.c +++ b/arch/arm64/kvm/hisilicon/hisi_virt.c @@ -198,10 +198,42 @@ void kvm_hisi_dvmbm_vcpu_destroy(struct kvm_vcpu *vcpu)
void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu) { + struct kvm *kvm = vcpu->kvm; + struct kvm_vcpu *tmp; + cpumask_t mask; + unsigned long i; + + /* Don't bother on old hardware */ if (!kvm_dvmbm_support) return;
cpumask_copy(vcpu->arch.cpus_ptr, current->cpus_ptr); + + if (likely(cpumask_equal(vcpu->arch.cpus_ptr, + vcpu->arch.pre_cpus_ptr))) + return; + + /* Re-calculate dvm_cpumask for this VM */ + spin_lock(&kvm->arch.dvm_lock); + + cpumask_clear(&mask); + kvm_for_each_vcpu(i, tmp, kvm) { + /* + * We may get the stale cpus_ptr if another thread + * is concurrently changing its affinity. It'll + * eventually go through vcpu_load() and we rely on + * the last dvm_lock holder to make things correct. + */ + cpumask_or(&mask, &mask, tmp->arch.cpus_ptr); + } + + if (cpumask_equal(kvm->arch.dvm_cpumask, &mask)) + goto out_unlock; + + cpumask_copy(kvm->arch.dvm_cpumask, &mask); + +out_unlock: + spin_unlock(&kvm->arch.dvm_lock); }
void kvm_hisi_dvmbm_put(struct kvm_vcpu *vcpu) @@ -211,3 +243,24 @@ void kvm_hisi_dvmbm_put(struct kvm_vcpu *vcpu)
cpumask_copy(vcpu->arch.pre_cpus_ptr, vcpu->arch.cpus_ptr); } + +int kvm_hisi_init_dvmbm(struct kvm *kvm) +{ + if (!kvm_dvmbm_support) + return 0; + + spin_lock_init(&kvm->arch.dvm_lock); + kvm->arch.dvm_cpumask = kzalloc(sizeof(cpumask_t), GFP_ATOMIC); + if (!kvm->arch.dvm_cpumask) + return -ENOMEM; + + return 0; +} + +void kvm_hisi_destroy_dvmbm(struct kvm *kvm) +{ + if (!kvm_dvmbm_support) + return; + + kfree(kvm->arch.dvm_cpumask); +} diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.h b/arch/arm64/kvm/hisilicon/hisi_virt.h index 3aac75651733..1fd4b3295d78 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.h +++ b/arch/arm64/kvm/hisilicon/hisi_virt.h @@ -27,5 +27,7 @@ int kvm_hisi_dvmbm_vcpu_init(struct kvm_vcpu *vcpu); void kvm_hisi_dvmbm_vcpu_destroy(struct kvm_vcpu *vcpu); void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu); void kvm_hisi_dvmbm_put(struct kvm_vcpu *vcpu); +int kvm_hisi_init_dvmbm(struct kvm *kvm); +void kvm_hisi_destroy_dvmbm(struct kvm *kvm);
#endif /* __HISI_VIRT_H__ */
From: Quan Zhou <zhouquan65@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
Implement the DVMBM capability. On each vcpu_load we re-calculate the VM-wide dvm_cpumask; if it has changed, we kick all other vCPUs out of the guest so that they reload the latest LSUDVMBM value into the register. A new request, KVM_REQ_RELOAD_DVMBM, is added to implement this.

If dvm_cpumask is unchanged by this single vCPU, we still reload the current LSUDVMBM value into the register to ensure its contents are correct, and nothing else needs to be done. A worked example of the register encoding follows below.
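As an illustration of the encoding computed by kvm_update_vm_lsudvmbm() in the hunk below, consider a hypothetical VM whose vCPU threads are all confined to die Aff3 == 3, cluster Aff2 == 2, cores Aff1 == 0..3 (the affinity values are made up, only to show the arithmetic):

  /*
   * One-die case:
   *
   *   val = DVMBM_RANGE_ONE_DIE << DVMBM_RANGE_SHIFT  // 0 << 62
   *       | 3ULL << DVMBM_DIE1_SHIFT                  // die Aff3 at bit 53
   *       | bits (aff2 * 4 + aff1) per pCPU           // bits 8..11
   *       = 0x0060000000000F00
   */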
Signed-off-by: Quan Zhou <zhouquan65@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h    |   2 +
 arch/arm64/kvm/arm.c                 |   5 ++
 arch/arm64/kvm/hisilicon/hisi_virt.c | 125 ++++++++++++++++++++++++++-
 arch/arm64/kvm/hisilicon/hisi_virt.h |  28 ++++++
 4 files changed, 159 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 327327fcd444..e9bf556f206f 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -50,6 +50,7 @@ #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) #define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) #define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7) +#define KVM_REQ_RELOAD_DVMBM KVM_ARCH_REQ(8)
#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ KVM_DIRTY_LOG_INITIALLY_SET) @@ -283,6 +284,7 @@ struct kvm_arch { #ifdef CONFIG_KVM_HISI_VIRT spinlock_t dvm_lock; cpumask_t *dvm_cpumask; /* Union of all vcpu's cpus_ptr */ + u64 lsudvmbm_el2; #endif };
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 6bec6043d173..ccef39bcf6ac 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -851,6 +851,11 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
if (kvm_dirty_ring_check_request(vcpu)) return 0; + +#ifdef CONFIG_KVM_HISI_VIRT + if (kvm_check_request(KVM_REQ_RELOAD_DVMBM, vcpu)) + kvm_hisi_reload_lsudvmbm(vcpu->kvm); +#endif }
return 1; diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.c b/arch/arm64/kvm/hisilicon/hisi_virt.c index 3dd2a747b780..31483b65a680 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.c +++ b/arch/arm64/kvm/hisilicon/hisi_virt.c @@ -196,6 +196,97 @@ void kvm_hisi_dvmbm_vcpu_destroy(struct kvm_vcpu *vcpu) kfree(vcpu->arch.pre_cpus_ptr); }
+static void __kvm_write_lsudvmbm(struct kvm *kvm) +{ + write_sysreg_s(kvm->arch.lsudvmbm_el2, SYS_LSUDVMBM_EL2); +} + +static void kvm_write_lsudvmbm(struct kvm *kvm) +{ + /* Do we really need to hold the dvm_lock?? */ + spin_lock(&kvm->arch.dvm_lock); + __kvm_write_lsudvmbm(kvm); + spin_unlock(&kvm->arch.dvm_lock); +} + +static int kvm_dvmbm_get_dies_info(struct kvm *kvm, u64 *vm_aff3s, int size) +{ + int num = 0, cpu; + + for_each_cpu(cpu, kvm->arch.dvm_cpumask) { + bool found = false; + u64 aff3; + int i; + + if (num >= size) + break; + + aff3 = MPIDR_AFFINITY_LEVEL(cpu_logical_map(cpu), 3); + for (i = 0; i < num; i++) { + if (vm_aff3s[i] == aff3) { + found = true; + break; + } + } + + if (!found) + vm_aff3s[num++] = aff3; + } + + return num; +} + +static void kvm_update_vm_lsudvmbm(struct kvm *kvm) +{ + u64 mpidr, aff3, aff2, aff1; + u64 vm_aff3s[DVMBM_MAX_DIES]; + u64 val; + int cpu, nr_dies; + + nr_dies = kvm_dvmbm_get_dies_info(kvm, vm_aff3s, DVMBM_MAX_DIES); + if (nr_dies > 2) { + val = DVMBM_RANGE_ALL_DIES << DVMBM_RANGE_SHIFT; + goto out_update; + } + + if (nr_dies == 1) { + val = DVMBM_RANGE_ONE_DIE << DVMBM_RANGE_SHIFT | + vm_aff3s[0] << DVMBM_DIE1_SHIFT; + + /* fulfill bits [52:0] */ + for_each_cpu(cpu, kvm->arch.dvm_cpumask) { + mpidr = cpu_logical_map(cpu); + aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2); + aff1 = MPIDR_AFFINITY_LEVEL(mpidr, 1); + + val |= 1ULL << (aff2 * 4 + aff1); + } + + goto out_update; + } + + /* nr_dies == 2 */ + val = DVMBM_RANGE_TWO_DIES << DVMBM_RANGE_SHIFT | + DVMBM_GRAN_CLUSTER << DVMBM_GRAN_SHIFT | + vm_aff3s[0] << DVMBM_DIE1_SHIFT | + vm_aff3s[1] << DVMBM_DIE2_SHIFT; + + /* and fulfill bits [43:0] */ + for_each_cpu(cpu, kvm->arch.dvm_cpumask) { + mpidr = cpu_logical_map(cpu); + aff3 = MPIDR_AFFINITY_LEVEL(mpidr, 3); + aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2); + + if (aff3 == vm_aff3s[0]) + val |= 1ULL << (aff2 + DVMBM_DIE1_CLUSTER_SHIFT); + else + val |= 1ULL << (aff2 + DVMBM_DIE2_CLUSTER_SHIFT); + } + +out_update: + kvm->arch.lsudvmbm_el2 = val; +} + void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu) { struct kvm *kvm = vcpu->kvm; @@ -210,8 +301,10 @@ void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu) cpumask_copy(vcpu->arch.cpus_ptr, current->cpus_ptr);
if (likely(cpumask_equal(vcpu->arch.cpus_ptr, - vcpu->arch.pre_cpus_ptr))) + vcpu->arch.pre_cpus_ptr))) { + kvm_write_lsudvmbm(kvm); return; + }
/* Re-calculate dvm_cpumask for this VM */ spin_lock(&kvm->arch.dvm_lock); @@ -232,7 +325,21 @@ void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu)
cpumask_copy(kvm->arch.dvm_cpumask, &mask);
+ /* + * Perform a heavy invalidation for this VMID. Good place + * to optimize, right? + */ + kvm_flush_remote_tlbs(kvm); + + /* + * Re-calculate LSUDVMBM_EL2 for this VM and kick all vcpus + * out to reload the LSUDVMBM configuration. + */ + kvm_update_vm_lsudvmbm(kvm); + kvm_make_all_cpus_request(kvm, KVM_REQ_RELOAD_DVMBM); + out_unlock: + __kvm_write_lsudvmbm(kvm); spin_unlock(&kvm->arch.dvm_lock); }
@@ -242,6 +349,12 @@ void kvm_hisi_dvmbm_put(struct kvm_vcpu *vcpu) return;
cpumask_copy(vcpu->arch.pre_cpus_ptr, vcpu->arch.cpus_ptr); + + /* + * We're pretty sure that host kernel runs at EL2 (as + * DVMBM is disabled in case of nVHE) and can't be affected + * by the configured SYS_LSUDVMBM_EL2. + */ }
int kvm_hisi_init_dvmbm(struct kvm *kvm) @@ -264,3 +377,13 @@ void kvm_hisi_destroy_dvmbm(struct kvm *kvm)
kfree(kvm->arch.dvm_cpumask); } + +void kvm_hisi_reload_lsudvmbm(struct kvm *kvm) +{ + if (WARN_ON_ONCE(!kvm_dvmbm_support)) + return; + + preempt_disable(); + kvm_write_lsudvmbm(kvm); + preempt_enable(); +} diff --git a/arch/arm64/kvm/hisilicon/hisi_virt.h b/arch/arm64/kvm/hisilicon/hisi_virt.h index 1fd4b3295d78..aefed2777a9e 100644 --- a/arch/arm64/kvm/hisilicon/hisi_virt.h +++ b/arch/arm64/kvm/hisilicon/hisi_virt.h @@ -19,6 +19,33 @@ enum hisi_cpu_type { #define SYS_LSUDVM_CTRL_EL2 sys_reg(3, 4, 15, 7, 4) #define LSUDVM_CTLR_EL2_MASK BIT_ULL(0)
+/* + * MPIDR_EL1 layout on HIP09 + * + * Aff3[7:3] - socket ID [0-15] + * Aff3[2:0] - die ID [1,3] + * Aff2 - cluster ID [0-9] + * Aff1 - core ID [0-3] + * Aff0 - thread ID [0,1] + */ + +#define SYS_LSUDVMBM_EL2 sys_reg(3, 4, 15, 7, 5) +#define DVMBM_RANGE_SHIFT 62 +#define DVMBM_RANGE_ONE_DIE 0ULL +#define DVMBM_RANGE_TWO_DIES 1ULL +#define DVMBM_RANGE_ALL_DIES 3ULL + +#define DVMBM_GRAN_SHIFT 61 +#define DVMBM_GRAN_CLUSTER 0ULL +#define DVMBM_GRAN_DIE 1ULL + +#define DVMBM_DIE1_SHIFT 53 +#define DVMBM_DIE2_SHIFT 45 +#define DVMBM_DIE1_CLUSTER_SHIFT 22 +#define DVMBM_DIE2_CLUSTER_SHIFT 0 + +#define DVMBM_MAX_DIES 32 + void probe_hisi_cpu_type(void); bool hisi_ncsnp_supported(void); bool hisi_dvmbm_supported(void); @@ -29,5 +56,6 @@ void kvm_hisi_dvmbm_load(struct kvm_vcpu *vcpu); void kvm_hisi_dvmbm_put(struct kvm_vcpu *vcpu); int kvm_hisi_init_dvmbm(struct kvm *kvm); void kvm_hisi_destroy_dvmbm(struct kvm *kvm); +void kvm_hisi_reload_lsudvmbm(struct kvm *kvm);
#endif /* __HISI_VIRT_H__ */
From: Zengruan Ye <yezengruan@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
-----------------------------
TWE Delay (TWED) is an optional feature introduced in the Armv8.6 extensions. Detect this feature.
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/Kconfig             | 10 ++++++++++
 arch/arm64/kernel/cpufeature.c | 12 ++++++++++++
 arch/arm64/tools/cpucaps       |  1 +
 3 files changed, 23 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index d1705fdc046a..9db5bd439a29 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2175,6 +2175,16 @@ config ARM64_EPAN if the cpu does not implement the feature. endmenu # "ARMv8.7 architectural features"
+menu "ARMv8.6 architectural features" + +config ARM64_TWED + bool "Enable support for delayed trapping of WFE" + default y + help + Delayed Trapping of WFE (part of the ARMv8.6 Extensions) + +endmenu + config ARM64_SVE bool "ARM Scalable Vector Extension support" default y diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 455e72ce080a..9169d622805d 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -2722,6 +2722,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP) }, +#ifdef CONFIG_ARM64_TWED + { + .desc = "Delayed Trapping of WFE", + .capability = ARM64_HAS_TWED, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_cpuid_feature, + .sys_reg = SYS_ID_AA64MMFR1_EL1, + .field_pos = ID_AA64MMFR1_EL1_TWED_SHIFT, + .sign = FTR_UNSIGNED, + .min_field_value = 1, + }, +#endif {}, };
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index 569ecec76c16..fea9f598f4ef 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -52,6 +52,7 @@ HAS_TIDCP1 HAS_TLB_RANGE HAS_VIRT_HOST_EXTN HAS_WFXT +HAS_TWED HW_DBM KVM_HVHE KVM_PROTECTED_MODE
From: Jingyi Wang <wangjingyi11@huawei.com>

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
-----------------------------
In HCR_EL2, TWEDEn (bit[59]) controls whether TWED is enabled. When the configurable delay is enabled, TWEDEL (bits[63:60]) encodes the minimum delay before a WFE trap (caused by the TWE bit in this register) is taken, as 2^(TWEDEL + 8) cycles.

Two parameters, "twed_enable" and "twedel", are used to configure the register; a worked example of the encoding follows below.
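As a worked example of what vcpu_set_twed() below programs (the value 4 is arbitrary, chosen only for illustration): with twedel=4 the minimum delay before the WFE trap is taken is 2^(4 + 8) = 4096 cycles, and the vCPU's HCR_EL2 ends up configured roughly as:

  vcpu->arch.hcr_el2 |= HCR_TWEDEN;              /* bit[59]: enable TWED       */
  vcpu->arch.hcr_el2 &= ~HCR_TWEDEL_MASK;
  vcpu->arch.hcr_el2 |= 4UL << HCR_TWEDEL_SHIFT; /* TWEDEL = 4 in bits [63:60] */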
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/kvm_arm.h     |  8 ++++++++
 arch/arm64/include/asm/kvm_emulate.h | 27 +++++++++++++++++++++++++++
 arch/arm64/include/asm/kvm_host.h    |  8 ++++++++
 arch/arm64/include/asm/virt.h        |  5 +++++
 arch/arm64/kvm/arm.c                 | 15 +++++++++++++++
 5 files changed, 63 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 1095c6647e96..a661c99d129a 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -14,6 +14,7 @@
/* Hyp Configuration Register (HCR) bits */
+#define HCR_TWEDEN (UL(1) << 59) #define HCR_TID5 (UL(1) << 58) #define HCR_DCT (UL(1) << 57) #define HCR_ATA_SHIFT 56 @@ -74,6 +75,13 @@ #define HCR_VM (UL(1) << 0) #define HCR_RES0 ((UL(1) << 48) | (UL(1) << 39))
+#ifdef CONFIG_ARM64_TWED +#define HCR_TWEDEL_SHIFT 60 +#define HCR_TWEDEL_MAX (UL(0xf)) +#define HCR_TWEDEL_MASK (HCR_TWEDEL_MAX << HCR_TWEDEL_SHIFT) +#define HCR_TWEDEL (UL(1) << HCR_TWEDEL_SHIFT) +#endif + /* * The bits we set in HCR: * TLOR: Trap LORegion register accesses diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 3d6725ff0bf6..8bca453739d0 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -124,6 +124,33 @@ static inline void vcpu_set_wfx_traps(struct kvm_vcpu *vcpu) vcpu->arch.hcr_el2 |= HCR_TWI; }
+#ifdef CONFIG_ARM64_TWED +static inline void vcpu_twed_enable(struct kvm_vcpu *vcpu) +{ + vcpu->arch.hcr_el2 |= HCR_TWEDEN; +} + +static inline void vcpu_twed_disable(struct kvm_vcpu *vcpu) +{ + vcpu->arch.hcr_el2 &= ~HCR_TWEDEN; +} + +static inline void vcpu_set_twed(struct kvm_vcpu *vcpu) +{ + u64 delay = (u64)twedel; + + if (delay > HCR_TWEDEL_MAX) + delay = HCR_TWEDEL_MAX; + + vcpu->arch.hcr_el2 &= ~HCR_TWEDEL_MASK; + vcpu->arch.hcr_el2 |= (delay << HCR_TWEDEL_SHIFT); +} +#else +static inline void vcpu_twed_enable(struct kvm_vcpu *vcpu) {}; +static inline void vcpu_twed_disable(struct kvm_vcpu *vcpu) {}; +static inline void vcpu_set_twed(struct kvm_vcpu *vcpu) {}; +#endif + static inline void vcpu_ptrauth_enable(struct kvm_vcpu *vcpu) { vcpu->arch.hcr_el2 |= (HCR_API | HCR_APK); diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index e9bf556f206f..8a96ff22e64b 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1167,6 +1167,14 @@ static inline void kvm_hyp_reserve(void) { } void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu); bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu);
+#ifdef CONFIG_ARM64_TWED +#define use_twed() (has_twed() && twed_enable) +extern bool twed_enable; +extern unsigned int twedel; +#else +#define use_twed() (false) +#endif + extern bool kvm_ncsnp_support; extern bool kvm_dvmbm_support;
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h index 261d6e9df2e1..198f4f35534e 100644 --- a/arch/arm64/include/asm/virt.h +++ b/arch/arm64/include/asm/virt.h @@ -156,6 +156,11 @@ static inline bool is_hyp_nvhe(void) return is_hyp_mode_available() && !is_kernel_in_hyp_mode(); }
+static __always_inline bool has_twed(void) +{ + return cpus_have_const_cap(ARM64_HAS_TWED); +} + #endif /* __ASSEMBLY__ */
#endif /* ! __ASM__VIRT_H */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index ccef39bcf6ac..48c93eb97b1e 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -74,6 +74,14 @@ bool is_kvm_arm_initialised(void) return kvm_arm_initialised; }
+#ifdef CONFIG_ARM64_TWED +bool twed_enable = false; +module_param(twed_enable, bool, S_IRUGO | S_IWUSR); + +unsigned int twedel = 0; +module_param(twedel, uint, S_IRUGO | S_IWUSR); +#endif + int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu) { return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE; @@ -1027,6 +1035,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) kvm_arm_setup_debug(vcpu); kvm_arch_vcpu_ctxflush_fp(vcpu);
+ if (use_twed()) { + vcpu_twed_enable(vcpu); + vcpu_set_twed(vcpu); + } else { + vcpu_twed_disable(vcpu); + } + /************************************************************** * Enter the guest */
From: Zengruan Ye <yezengruan@huawei.com>

virt inclusion
category: feature
bugzilla: 47624
CVE: NA
--------------------------------
Introduce a paravirtualized scheduling (PV-sched) interface for KVM/arm64.
A hypercall interface is provided for the guest to interrogate the hypervisor's support for this interface and the location of the shared memory structures.
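For illustration, a guest might probe the interface roughly as below. This is only a sketch: the function-ID and return-code macros are those added to include/linux/arm-smccc.h later in this series, and the preceding SMCCC 1.1 ARCH_FEATURES discovery step is elided.

  #include <linux/arm-smccc.h>

  static bool pv_sched_feature_supported(u32 feature)
  {
          struct arm_smccc_res res;

          /* Ask the hypervisor whether the given PV_SCHED feature exists. */
          arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_FEATURES, feature, &res);

          return res.a0 == SMCCC_RET_SUCCESS;
  }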
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Reviewed-by: Zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 Documentation/virt/kvm/arm/pvsched.rst | 58 ++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)
 create mode 100644 Documentation/virt/kvm/arm/pvsched.rst
diff --git a/Documentation/virt/kvm/arm/pvsched.rst b/Documentation/virt/kvm/arm/pvsched.rst new file mode 100644 index 000000000000..8f7112a8a9cd --- /dev/null +++ b/Documentation/virt/kvm/arm/pvsched.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Paravirtualized sched support for arm64 +======================================= + +KVM/arm64 provides some hypervisor service calls to support a paravirtualized +sched. + +Some SMCCC compatible hypercalls are defined: + +* PV_SCHED_FEATURES: 0xC5000090 +* PV_SCHED_IPA_INIT: 0xC5000091 +* PV_SCHED_IPA_RELEASE: 0xC5000092 + +The existence of the PV_SCHED hypercall should be probed using the SMCCC 1.1 +ARCH_FEATURES mechanism before calling it. + +PV_SCHED_FEATURES + ============= ======== ========== + Function ID: (uint32) 0xC5000090 + PV_call_id: (uint32) The function to query for support. + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant + PV-sched feature is supported by the hypervisor. + ============= ======== ========== + +PV_SCHED_IPA_INIT + ============= ======== ========== + Function ID: (uint32) 0xC5000091 + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the IPA of + this vCPU's PV data structure is shared to the + hypervisor. + ============= ======== ========== + +PV_SCHED_IPA_RELEASE + ============= ======== ========== + Function ID: (uint32) 0xC5000092 + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the IPA of + this vCPU's PV data structure is released. + ============= ======== ========== + +PV sched state +-------------- + +The structure pointed to by the PV_SCHED_IPA hypercall is as follows: + ++-----------+-------------+-------------+-----------------------------------+ +| Field | Byte Length | Byte Offset | Description | ++===========+=============+=============+===================================+ +| preempted | 4 | 0 | Indicates that the vCPU that owns | +| | | | this struct is running or not. | +| | | | Non-zero values mean the vCPU has | +| | | | been preempted. Zero means the | +| | | | vCPU is not preempted. | ++-----------+-------------+-------------+-----------------------------------+ + +The preempted field will be updated to 0 by the hypervisor prior to scheduling +a vCPU. When the vCPU is scheduled out, the preempted field will be updated +to 1 by the hypervisor.
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
This provides a mechanism for querying which paravirtualized sched features are available in this hypervisor.
Add some SMCCC compatible hypercalls for PV sched features:
  PV_SCHED_FEATURES:    0xC5000090
  PV_SCHED_IPA_INIT:    0xC5000091
  PV_SCHED_IPA_RELEASE: 0xC5000092
Also add the header file which defines the ABI for the paravirtualized sched features we're about to add.
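For illustration only (guest_pv_sched_call_supported is a hypothetical guest-side helper, not part of this patch), a specific sub-call can be queried by passing its function ID as the first argument of PV_SCHED_FEATURES, per the documented ABI:

  #include <linux/arm-smccc.h>

  /* Sketch: ask whether a particular PV_SCHED sub-call is implemented. */
  static bool guest_pv_sched_call_supported(u32 pv_call_id)
  {
          struct arm_smccc_res res;

          arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_FEATURES,
                               pv_call_id, &res);

          return res.a0 == SMCCC_RET_SUCCESS;
  }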
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/include/asm/pvsched-abi.h | 16 ++++++++++++++++ arch/arm64/kvm/Makefile | 2 +- arch/arm64/kvm/hypercalls.c | 6 ++++++ arch/arm64/kvm/pvsched.c | 23 +++++++++++++++++++++++ include/linux/arm-smccc.h | 19 +++++++++++++++++++ 6 files changed, 67 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/include/asm/pvsched-abi.h create mode 100644 arch/arm64/kvm/pvsched.c
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 8a96ff22e64b..50fcc92257cf 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1051,6 +1051,8 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) return (vcpu_arch->steal.base != INVALID_GPA); }
+long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); + void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr); diff --git a/arch/arm64/include/asm/pvsched-abi.h b/arch/arm64/include/asm/pvsched-abi.h new file mode 100644 index 000000000000..80e50e7a1a31 --- /dev/null +++ b/arch/arm64/include/asm/pvsched-abi.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef __ASM_PVSCHED_ABI_H +#define __ASM_PVSCHED_ABI_H + +struct pvsched_vcpu_state { + __le32 preempted; + /* Structure must be 64 byte aligned, pad to that size */ + u8 padding[60]; +} __packed; + +#endif diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index 826a05d072d7..fe63a91a4c54 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -10,7 +10,7 @@ include $(srctree)/virt/kvm/Makefile.kvm obj-$(CONFIG_KVM) += kvm.o obj-$(CONFIG_KVM) += hyp/
-kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ +kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o pvsched.o \ inject_fault.o va_layout.o handle_exit.o \ guest.o debug.o reset.o sys_regs.o stacktrace.o \ vgic-sys-reg-v3.o fpsimd.o pkvm.o \ diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 7fb4df0456de..190670d8dd3f 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -332,6 +332,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) &smccc_feat->std_hyp_bmap)) val[0] = SMCCC_RET_SUCCESS; break; + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val[0] = SMCCC_RET_SUCCESS; + break; } break; case ARM_SMCCC_HV_PV_TIME_FEATURES: @@ -360,6 +363,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_TRNG_RND32: case ARM_SMCCC_TRNG_RND64: return kvm_trng_call(vcpu); + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val[0] = kvm_hypercall_pvsched_features(vcpu); + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c new file mode 100644 index 000000000000..3d96122fcf9e --- /dev/null +++ b/arch/arm64/kvm/pvsched.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#include <linux/arm-smccc.h> + +#include <kvm/arm_hypercalls.h> + +long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) +{ + u32 feature = smccc_get_arg1(vcpu); + long val = SMCCC_RET_NOT_SUPPORTED; + + switch (feature) { + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val = SMCCC_RET_SUCCESS; + break; + } + + return val; +} diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index 083f85653716..daab363d31ae 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -577,5 +577,24 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, method; \ })
+/* Paravirtualised sched calls */ +#define ARM_SMCCC_HV_PV_SCHED_FEATURES \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x90) + +#define ARM_SMCCC_HV_PV_SCHED_IPA_INIT \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x91) + +#define ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x92) + #endif /*__ASSEMBLY__*/ #endif /*__LINUX_ARM_SMCCC_H*/
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
Implement the service call for configuring a shared structure between a vCPU and the hypervisor, in which the hypervisor can tell the vCPU whether it is running or not.
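For orientation, the guest consumes this structure read-only. The sketch below is illustrative only (it mirrors the kvm_vcpu_is_preempted() helper added later in this series; pv_region is a hypothetical per-CPU area registered via PV_SCHED_IPA_INIT):

  #include <linux/percpu.h>
  #include <asm/pvsched-abi.h>

  static DEFINE_PER_CPU(struct pvsched_vcpu_state, pv_region) __aligned(64);

  /*
   * The hypervisor writes 'preempted' (little-endian) on vcpu load/put;
   * the guest only ever reads it.
   */
  static bool cpu_is_preempted(int cpu)
  {
          struct pvsched_vcpu_state *st = &per_cpu(pv_region, cpu);

          return !!le32_to_cpu(READ_ONCE(st->preempted));
  }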
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- C | 0 arch/arm64/include/asm/kvm_host.h | 16 ++++++++++++++++ arch/arm64/kvm/arm.c | 9 +++++++++ arch/arm64/kvm/hypercalls.c | 11 +++++++++++ arch/arm64/kvm/pvsched.c | 28 ++++++++++++++++++++++++++++ 5 files changed, 64 insertions(+) create mode 100644 C
diff --git a/C b/C new file mode 100644 index 000000000000..e69de29bb2d1 diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 50fcc92257cf..e07cb6ff542c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -596,6 +596,11 @@ struct kvm_vcpu_arch { gpa_t base; } steal;
+ /* Guest PV sched state */ + struct { + gpa_t base; + } pvsched; + /* Per-vcpu CCSIDR override or NULL */ u32 *ccsidr;
@@ -1052,6 +1057,17 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) }
long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); +void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted); + +static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch) +{ + vcpu_arch->pvsched.base = GPA_INVALID; +} + +static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch) +{ + return (vcpu_arch->pvsched.base != GPA_INVALID); +}
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 48c93eb97b1e..ce66b516deb8 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -415,6 +415,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
kvm_arm_pvtime_vcpu_init(&vcpu->arch);
+ kvm_arm_pvsched_vcpu_init(&vcpu->arch); + vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
err = kvm_vgic_vcpu_init(vcpu); @@ -500,11 +502,15 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (vcpu_has_ptrauth(vcpu)) vcpu_ptrauth_disable(vcpu); + kvm_arch_vcpu_load_debug_state_flags(vcpu);
if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus)) vcpu_set_on_unsupported_cpu(vcpu);
+ if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) + kvm_update_pvsched_preempted(vcpu, 0); + #ifdef CONFIG_KVM_HISI_VIRT kvm_hisi_dvmbm_load(vcpu); #endif @@ -524,6 +530,9 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) vcpu_clear_on_unsupported_cpu(vcpu); vcpu->cpu = -1;
+ if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) + kvm_update_pvsched_preempted(vcpu, 1); + #ifdef CONFIG_KVM_HISI_VIRT kvm_hisi_dvmbm_put(vcpu); #endif diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 190670d8dd3f..b72224609c3e 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -366,6 +366,17 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = kvm_hypercall_pvsched_features(vcpu); break; + case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: + gpa = smccc_get_arg1(vcpu); + if (gpa != GPA_INVALID) { + vcpu->arch.pvsched.base = gpa; + val = SMCCC_RET_SUCCESS; + } + break; + case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: + vcpu->arch.pvsched.base = GPA_INVALID; + val = SMCCC_RET_SUCCESS; + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 3d96122fcf9e..b923c4b6c52e 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -5,9 +5,35 @@ */
#include <linux/arm-smccc.h> +#include <linux/kvm_host.h> + +#include <asm/pvsched-abi.h>
#include <kvm/arm_hypercalls.h>
+void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) +{ + struct kvm *kvm = vcpu->kvm; + u64 base = vcpu->arch.pvsched.base; + u64 offset = offsetof(struct pvsched_vcpu_state, preempted); + int idx; + + if (base == GPA_INVALID) + return; + + /* + * This function is called from atomic context, so we need to + * disable page faults. + */ + pagefault_disable(); + + idx = srcu_read_lock(&kvm->srcu); + kvm_put_guest(kvm, base + offset, cpu_to_le32(preempted)); + srcu_read_unlock(&kvm->srcu, idx); + + pagefault_enable(); +} + long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) { u32 feature = smccc_get_arg1(vcpu); @@ -15,6 +41,8 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu)
switch (feature) { case ARM_SMCCC_HV_PV_SCHED_FEATURES: + case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: + case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: val = SMCCC_RET_SUCCESS; break; }
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
This is to fix some lock holder preemption issues. Some other lock implementations do a spin loop before acquiring the lock itself. The kernel currently has an interface, bool vcpu_is_preempted(int cpu), which takes a CPU number as its parameter and returns true if that CPU is preempted. The kernel can then break out of such spin loops based on the return value of vcpu_is_preempted().

Since the kernel already uses this interface, let's support it on arm64.
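A purely illustrative sketch of how such a spin loop can use vcpu_is_preempted(); the lock encoding, spin_until_free and owner_cpu are hypothetical and not taken from this patch:

  #include <linux/atomic.h>
  #include <linux/spinlock.h>

  /* Stop spinning once the lock holder's vCPU is no longer running. */
  static bool spin_until_free(atomic_t *lock, int owner_cpu)
  {
          while (atomic_read(lock)) {
                  if (vcpu_is_preempted(owner_cpu))
                          return false;   /* caller should block instead */
                  cpu_relax();
          }
          return true;
  }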
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/paravirt.h | 8 ++++++++ arch/arm64/include/asm/spinlock.h | 10 ++++++++++ arch/arm64/kernel/Makefile | 2 +- arch/arm64/kernel/paravirt-spinlocks.c | 16 ++++++++++++++++ 4 files changed, 35 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/kernel/paravirt-spinlocks.c
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 9aa193e0e8f2..53282dc429f8 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,6 +20,14 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+__visible bool __native_vcpu_is_preempted(int cpu); +DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); + +static inline bool pv_vcpu_is_preempted(int cpu) +{ + return static_call(pv_vcpu_preempted)(cpu); +} + #else
#define pv_time_init() do {} while (0) diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h index 0525c0b089ed..3dcb1c903ba0 100644 --- a/arch/arm64/include/asm/spinlock.h +++ b/arch/arm64/include/asm/spinlock.h @@ -7,6 +7,7 @@
#include <asm/qspinlock.h> #include <asm/qrwlock.h> +#include <asm/paravirt.h>
/* See include/linux/spinlock.h */ #define smp_mb__after_spinlock() smp_mb() @@ -19,9 +20,18 @@ * https://lore.kernel.org/lkml/20200110100612.GC2827@hirez.programming.kicks-a... */ #define vcpu_is_preempted vcpu_is_preempted +#ifdef CONFIG_PARAVIRT +static inline bool vcpu_is_preempted(int cpu) +{ + return pv_vcpu_is_preempted(cpu); +} + +#else + static inline bool vcpu_is_preempted(int cpu) { return false; } +#endif /* CONFIG_PARAVIRT */
#endif /* __ASM_SPINLOCK_H */ diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 31859e7741a5..235963ed87b1 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -60,7 +60,7 @@ obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o -obj-$(CONFIG_PARAVIRT) += paravirt.o +obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt-spinlocks.o obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/ obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o obj-$(CONFIG_ELF_CORE) += elfcore.o diff --git a/arch/arm64/kernel/paravirt-spinlocks.c b/arch/arm64/kernel/paravirt-spinlocks.c new file mode 100644 index 000000000000..3566897070d9 --- /dev/null +++ b/arch/arm64/kernel/paravirt-spinlocks.c @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#include <linux/static_call.h> +#include <linux/spinlock.h> +#include <asm/paravirt.h> + +__visible bool __native_vcpu_is_preempted(int cpu) +{ + return false; +} + +DEFINE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted);
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
Support the vcpu_is_preempted() functionality under KVM/arm64. This will enhance lock performance on overcommitted hosts (more runnable vCPUs than physical CPUs in the system) as doing busy waits for preempted vCPUs will hurt system performance far worse than early yielding.
unix benchmark result:
  host:  kernel 4.19.87, HiSilicon Kunpeng920, 8 CPUs
  guest: kernel 4.19.87, 16 vCPUs

  test-case                                | after-patch       | before-patch
  -----------------------------------------+-------------------+------------------
  Dhrystone 2 using register variables     |  338955728.5 lps  |  339266319.5 lps
  Double-Precision Whetstone               |     30634.9 MWIPS |     30884.4 MWIPS
  Execl Throughput                         |       6753.2 lps  |       3580.1 lps
  File Copy 1024 bufsize 2000 maxblocks    |     490048.0 KBps |     313282.3 KBps
  File Copy 256 bufsize 500 maxblocks      |     129662.5 KBps |      83550.7 KBps
  File Copy 4096 bufsize 8000 maxblocks    |    1552551.5 KBps |     814327.0 KBps
  Pipe Throughput                          |    8976422.5 lps  |    9048628.4 lps
  Pipe-based Context Switching             |     258641.7 lps  |     252925.9 lps
  Process Creation                         |       5312.2 lps  |       4507.9 lps
  Shell Scripts (1 concurrent)             |       8704.2 lpm  |       6720.9 lpm
  Shell Scripts (8 concurrent)             |       1708.8 lpm  |        607.2 lpm
  System Call Overhead                     |    3714444.7 lps  |    3746386.8 lps
  -----------------------------------------+-------------------+------------------
  System Benchmarks Index Score            |       2270.6      |       1679.2
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 4 +- arch/arm64/include/asm/paravirt.h | 3 + arch/arm64/kernel/paravirt.c | 106 ++++++++++++++++++++++++++++++ arch/arm64/kernel/setup.c | 2 + arch/arm64/kvm/hypercalls.c | 8 +-- arch/arm64/kvm/pvsched.c | 2 +- include/linux/cpuhotplug.h | 1 + 7 files changed, 119 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index e07cb6ff542c..0537099a0bd0 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1061,12 +1061,12 @@ void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted);
static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch) { - vcpu_arch->pvsched.base = GPA_INVALID; + vcpu_arch->pvsched.base = INVALID_GPA; }
static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch) { - return (vcpu_arch->pvsched.base != GPA_INVALID); + return (vcpu_arch->pvsched.base != INVALID_GPA); }
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome); diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 53282dc429f8..4e35851bcecf 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,6 +20,8 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+int __init pv_sched_init(void); + __visible bool __native_vcpu_is_preempted(int cpu); DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted);
@@ -31,6 +33,7 @@ static inline bool pv_vcpu_is_preempted(int cpu) #else
#define pv_time_init() do {} while (0) +#define pv_sched_init() do {} while (0)
#endif // CONFIG_PARAVIRT
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index aa718d6a9274..844622baf6bf 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -22,6 +22,7 @@
#include <asm/paravirt.h> #include <asm/pvclock-abi.h> +#include <asm/pvsched-abi.h> #include <asm/smp_plat.h>
struct static_key paravirt_steal_enabled; @@ -174,3 +175,108 @@ int __init pv_time_init(void)
return 0; } + + +DEFINE_PER_CPU(struct pvsched_vcpu_state, pvsched_vcpu_region) __aligned(64); +EXPORT_PER_CPU_SYMBOL(pvsched_vcpu_region); + +static bool kvm_vcpu_is_preempted(int cpu) +{ + struct pvsched_vcpu_state *reg; + u32 preempted; + + reg = &per_cpu(pvsched_vcpu_region, cpu); + if (!reg) { + pr_warn_once("PV sched enabled but not configured for cpu %d\n", + cpu); + return false; + } + + preempted = le32_to_cpu(READ_ONCE(reg->preempted)); + + return !!preempted; +} + +static int pvsched_vcpu_state_dying_cpu(unsigned int cpu) +{ + struct pvsched_vcpu_state *reg; + struct arm_smccc_res res; + + reg = this_cpu_ptr(&pvsched_vcpu_region); + if (!reg) + return -EFAULT; + + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE, &res); + memset(reg, 0, sizeof(*reg)); + + return 0; +} + +static int init_pvsched_vcpu_state(unsigned int cpu) +{ + struct pvsched_vcpu_state *reg; + struct arm_smccc_res res; + + reg = this_cpu_ptr(&pvsched_vcpu_region); + if (!reg) + return -EFAULT; + + /* Pass the memory address to host via hypercall */ + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_IPA_INIT, + virt_to_phys(reg), &res); + + return 0; +} + +static int kvm_arm_init_pvsched(void) +{ + int ret; + + ret = cpuhp_setup_state(CPUHP_AP_ARM_KVM_PVSCHED_STARTING, + "hypervisor/arm/pvsched:starting", + init_pvsched_vcpu_state, + pvsched_vcpu_state_dying_cpu); + + if (ret < 0) { + pr_warn("PV sched init failed\n"); + return ret; + } + + return 0; +} + +static bool has_kvm_pvsched(void) +{ + struct arm_smccc_res res; + + /* To detect the presence of PV sched support we require SMCCC 1.1+ */ + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) + return false; + + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_HV_PV_SCHED_FEATURES, &res); + + return (res.a0 == SMCCC_RET_SUCCESS); +} + +int __init pv_sched_init(void) +{ + int ret; + + if (is_hyp_mode_available()) + return 0; + + if (!has_kvm_pvsched()) { + pr_warn("PV sched is not available\n"); + return 0; + } + + ret = kvm_arm_init_pvsched(); + if (ret) + return ret; + + static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); + pr_info("using PV sched preempted\n"); + + return 0; +} diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 645034e52496..0255788cb9c4 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -416,6 +416,8 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p) smp_init_cpus(); smp_build_mpidr_hash();
+ pv_sched_init(); + /* Init percpu seeds for random tags after cpus are set up. */ kasan_init_sw_tags();
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index b72224609c3e..a96b88d90cf8 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -368,14 +368,14 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) break; case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: gpa = smccc_get_arg1(vcpu); - if (gpa != GPA_INVALID) { + if (gpa != INVALID_GPA) { vcpu->arch.pvsched.base = gpa; - val = SMCCC_RET_SUCCESS; + val[0] = SMCCC_RET_SUCCESS; } break; case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: - vcpu->arch.pvsched.base = GPA_INVALID; - val = SMCCC_RET_SUCCESS; + vcpu->arch.pvsched.base = INVALID_GPA; + val[0] = SMCCC_RET_SUCCESS; break; default: return kvm_psci_call(vcpu); diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index b923c4b6c52e..06290c831101 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -18,7 +18,7 @@ void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) u64 offset = offsetof(struct pvsched_vcpu_state, preempted); int idx;
- if (base == GPA_INVALID) + if (base == INVALID_GPA) return;
/* diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 624d4a38c358..f94a1b8e34e0 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -190,6 +190,7 @@ enum cpuhp_state { CPUHP_AP_DUMMY_TIMER_STARTING, CPUHP_AP_ARM_XEN_STARTING, CPUHP_AP_ARM_XEN_RUNSTATE_STARTING, + CPUHP_AP_ARM_KVM_PVSCHED_STARTING, CPUHP_AP_ARM_CORESIGHT_STARTING, CPUHP_AP_ARM_CORESIGHT_CTI_STARTING, CPUHP_AP_ARM64_ISNDEP_STARTING,
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: 47624 CVE: NA
--------------------------------
A new hypercall interface function is provided for the guest to kick a vCPU that is sitting in WFI state.
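A guest-side sketch, for illustration only (it mirrors the kvm_kick_cpu() helper added later in this series and assumes SMCCC 1.1+; pv_sched_kick is a hypothetical name): wake a vCPU that went to sleep via WFI while busy-waiting, by passing its CPU id as the first argument.

  #include <linux/arm-smccc.h>

  static void pv_sched_kick(int cpu)
  {
          struct arm_smccc_res res;

          arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res);
  }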
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Signed-off-by: lishusen lishusen2@huawei.com --- Documentation/virt/kvm/arm/pvsched.rst | 16 ++++++++++++++++ include/linux/arm-smccc.h | 6 ++++++ 2 files changed, 22 insertions(+)
diff --git a/Documentation/virt/kvm/arm/pvsched.rst b/Documentation/virt/kvm/arm/pvsched.rst index 8f7112a8a9cd..6ba221e25089 100644 --- a/Documentation/virt/kvm/arm/pvsched.rst +++ b/Documentation/virt/kvm/arm/pvsched.rst @@ -11,6 +11,7 @@ Some SMCCC compatible hypercalls are defined: * PV_SCHED_FEATURES: 0xC5000090 * PV_SCHED_IPA_INIT: 0xC5000091 * PV_SCHED_IPA_RELEASE: 0xC5000092 +* PV_SCHED_KICK_CPU: 0xC5000093
The existence of the PV_SCHED hypercall should be probed using the SMCCC 1.1 ARCH_FEATURES mechanism before calling it. @@ -38,6 +39,13 @@ PV_SCHED_IPA_RELEASE this vCPU's PV data structure is released. ============= ======== ==========
+PV_SCHED_KICK_CPU + ============= ======== ========== + Function ID: (uint32) 0xC5000093 + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the vCPU is + kicked by the hypervisor. + ============= ======== ========== + PV sched state --------------
@@ -56,3 +64,11 @@ The structure pointed to by the PV_SCHED_IPA hypercall is as follows: The preempted field will be updated to 0 by the hypervisor prior to scheduling a vCPU. When the vCPU is scheduled out, the preempted field will be updated to 1 by the hypervisor. + +A vCPU of a paravirtualized guest that is busywaiting in guest kernel mode for +an event to occur (ex: a spinlock to become available) can execute WFI +instruction once it has busy-waited for more than a threshold time-interval. +Execution of WFI instruction would cause the hypervisor to put the vCPU to sleep +until occurrence of an appropriate event. Another vCPU of the same guest can +wakeup the sleeping vCPU by issuing PV_SCHED_KICK_CPU hypercall, specifying CPU +id (reg1) of the vCPU to be woken up. diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index daab363d31ae..02b857f44ae8 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -596,5 +596,11 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, ARM_SMCCC_OWNER_STANDARD_HYP, \ 0x92)
+#define ARM_SMCCC_HV_PV_SCHED_KICK_CPU \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x93) + #endif /*__ASSEMBLY__*/ #endif /*__LINUX_ARM_SMCCC_H*/
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
Implement the service call for waking up a vCPU in WFI state.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kvm/arm.c | 4 +++- arch/arm64/kvm/handle_exit.c | 1 + arch/arm64/kvm/hypercalls.c | 3 +++ arch/arm64/kvm/pvsched.c | 25 +++++++++++++++++++++++++ 5 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 0537099a0bd0..e837b5fe0a97 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -598,6 +598,7 @@ struct kvm_vcpu_arch {
/* Guest PV sched state */ struct { + bool pv_unhalted; gpa_t base; } pvsched;
@@ -1058,6 +1059,7 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch)
long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted); +long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu);
static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch) { diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index ce66b516deb8..82604fc917a0 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -613,7 +613,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF); - return ((irq_lines || kvm_vgic_vcpu_pending_irq(v)) + bool pv_unhalted = v->arch.pvsched.pv_unhalted; + + return ((irq_lines || kvm_vgic_vcpu_pending_irq(v) || pv_unhalted) && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); }
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 617ae6dea5d5..b9a44c3bebb7 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -121,6 +121,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu) } else { trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false); vcpu->stat.wfi_exit_stat++; + vcpu->arch.pvsched.pv_unhalted = false; }
if (esr & ESR_ELx_WFx_ISS_WFxT) { diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index a96b88d90cf8..46def0572461 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -377,6 +377,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) vcpu->arch.pvsched.base = INVALID_GPA; val[0] = SMCCC_RET_SUCCESS; break; + case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: + val[0] = kvm_pvsched_kick_vcpu(vcpu); + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 06290c831101..49d03fd6f4ad 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -34,6 +34,30 @@ void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) pagefault_enable(); }
+long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu) +{ + unsigned int vcpu_idx; + long val = SMCCC_RET_NOT_SUPPORTED; + struct kvm *kvm = vcpu->kvm; + struct kvm_vcpu *target = NULL; + + vcpu_idx = smccc_get_arg1(vcpu); + target = kvm_get_vcpu(kvm, vcpu_idx); + if (!target) + goto out; + + target->arch.pvsched.pv_unhalted = true; + kvm_make_request(KVM_REQ_IRQ_PENDING, target); + kvm_vcpu_kick(target); + if (READ_ONCE(target->ready)) + kvm_vcpu_yield_to(target); + + val = SMCCC_RET_SUCCESS; + +out: + return val; +} + long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) { u32 feature = smccc_get_arg1(vcpu); @@ -43,6 +67,7 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_FEATURES: case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: + case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: val = SMCCC_RET_SUCCESS; break; }
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
Since the kernel already uses this interface, let's support it on arm64.
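For context, only contended acquisitions ever reach the paravirt hooks: the generic qspinlock entry point (asm-generic/qspinlock.h) takes the lock with a single acquire cmpxchg on the fast path and falls back to queued_spin_lock_slowpath(), which this patch routes through the paravirt layer, only when that fails. Roughly:

  static __always_inline void queued_spin_lock(struct qspinlock *lock)
  {
          int val = 0;

          /* Fast path: lock was free, take it with one acquire cmpxchg. */
          if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
                  return;

          /* Contended: fall back to the (possibly paravirt) slow path. */
          queued_spin_lock_slowpath(lock, val);
  }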
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/Kconfig | 13 ++++++ arch/arm64/include/asm/Kbuild | 1 - arch/arm64/include/asm/paravirt.h | 25 +++++++++++ arch/arm64/include/asm/qspinlock.h | 48 +++++++++++++++++++++ arch/arm64/include/asm/qspinlock_paravirt.h | 12 ++++++ arch/arm64/include/asm/spinlock.h | 3 ++ arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/paravirt-spinlocks.c | 5 +++ 8 files changed, 107 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/include/asm/qspinlock.h create mode 100644 arch/arm64/include/asm/qspinlock_paravirt.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 9db5bd439a29..971cf9d54101 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1548,6 +1548,19 @@ config PARAVIRT under a hypervisor, potentially improving performance significantly over full virtualization.
+config PARAVIRT_SPINLOCKS + bool "Paravirtualization layer for spinlocks" + depends on PARAVIRT && SMP + help + Paravirtualized spinlocks allow a pvops backend to replace the + spinlock implementation with something virtualization-friendly + (for example, block the virtual CPU rather than spinning). + + It has a minimal impact on native kernels and gives a nice performance + benefit on paravirtualized KVM kernels. + + If you are unsure how to answer this question, answer Y. + config PARAVIRT_TIME_ACCOUNTING bool "Paravirtual steal time accounting" select PARAVIRT diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild index 5c8ee5a541d2..d16ee8095366 100644 --- a/arch/arm64/include/asm/Kbuild +++ b/arch/arm64/include/asm/Kbuild @@ -2,7 +2,6 @@ generic-y += early_ioremap.h generic-y += mcs_spinlock.h generic-y += qrwlock.h -generic-y += qspinlock.h generic-y += parport.h generic-y += user.h
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 4e35851bcecf..1c9011d31562 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -30,6 +30,31 @@ static inline bool pv_vcpu_is_preempted(int cpu) return static_call(pv_vcpu_preempted)(cpu); }
+#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) +bool pv_is_native_spin_unlock(void); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, native_queued_spin_lock_slowpath); +static inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + return static_call(pv_qspinlock_queued_spin_lock_slowpath)(lock, val); +} + +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, native_queued_spin_unlock); +static inline void pv_queued_spin_unlock(struct qspinlock *lock) +{ + return static_call(pv_qspinlock_queued_spin_unlock)(lock); +} + +static inline void pv_wait(u8 *ptr, u8 val) +{ + return; +} + +static inline void pv_kick(int cpu) +{ + return; +} +#endif /* SMP && PARAVIRT_SPINLOCKS */ + #else
#define pv_time_init() do {} while (0) diff --git a/arch/arm64/include/asm/qspinlock.h b/arch/arm64/include/asm/qspinlock.h new file mode 100644 index 000000000000..ce3816b60f25 --- /dev/null +++ b/arch/arm64/include/asm/qspinlock.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2020 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef _ASM_ARM64_QSPINLOCK_H +#define _ASM_ARM64_QSPINLOCK_H + +#include <linux/jump_label.h> +#include <asm/cpufeature.h> +#include <asm-generic/qspinlock_types.h> +#include <asm/paravirt.h> + +#define _Q_PENDING_LOOPS (1 << 9) + +#ifdef CONFIG_PARAVIRT_SPINLOCKS +extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +extern void __pv_init_lock_hash(void); +extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); + +#define queued_spin_unlock queued_spin_unlock +/** + * queued_spin_unlock - release a queued spinlock + * @lock : Pointer to queued spinlock structure + * + * A smp_store_release() on the least-significant byte. + */ +static inline void native_queued_spin_unlock(struct qspinlock *lock) +{ + smp_store_release(&lock->locked, 0); +} + +static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + pv_queued_spin_lock_slowpath(lock, val); +} + +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + pv_queued_spin_unlock(lock); +} + +#endif + +#include <asm-generic/qspinlock.h> + +#endif /* _ASM_ARM64_QSPINLOCK_H */ diff --git a/arch/arm64/include/asm/qspinlock_paravirt.h b/arch/arm64/include/asm/qspinlock_paravirt.h new file mode 100644 index 000000000000..eba4be28fbb9 --- /dev/null +++ b/arch/arm64/include/asm/qspinlock_paravirt.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef __ASM_QSPINLOCK_PARAVIRT_H +#define __ASM_QSPINLOCK_PARAVIRT_H + +extern void __pv_queued_spin_unlock(struct qspinlock *lock); + +#endif diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h index 3dcb1c903ba0..4040a27ecf86 100644 --- a/arch/arm64/include/asm/spinlock.h +++ b/arch/arm64/include/asm/spinlock.h @@ -9,6 +9,9 @@ #include <asm/qrwlock.h> #include <asm/paravirt.h>
+/* How long a lock should spin before we consider blocking */ +#define SPIN_THRESHOLD (1 << 15) + /* See include/linux/spinlock.h */ #define smp_mb__after_spinlock() smp_mb()
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 235963ed87b1..4db14d6677bd 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt-spinlocks.o +obj-$(CONFIG_PARAVIRT_SPINLOCKS) += paravirt.o paravirt-spinlocks.o obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/ obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o obj-$(CONFIG_ELF_CORE) += elfcore.o diff --git a/arch/arm64/kernel/paravirt-spinlocks.c b/arch/arm64/kernel/paravirt-spinlocks.c index 3566897070d9..2e4abc83e318 100644 --- a/arch/arm64/kernel/paravirt-spinlocks.c +++ b/arch/arm64/kernel/paravirt-spinlocks.c @@ -14,3 +14,8 @@ __visible bool __native_vcpu_is_preempted(int cpu) }
DEFINE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); + +bool pv_is_native_spin_unlock(void) +{ + return false; +}
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
Linux kernel builds were run in a KVM guest on a HiSilicon Kunpeng920 system. VM guests were set up with 32, 48 and 64 vCPUs on the 32 physical CPUs. The kernel build (make -j<n>, where <n> is the number of available vCPUs) was done 3 times in a VM with unpinned vCPUs, with the best time selected. The build times of the original Linux 4.19.87 kernel and the pvqspinlock kernel with various numbers of vCPUs are as follows:
Kernel        32 vCPUs    48 vCPUs    60 vCPUs
-----------   --------    --------    --------
4.19.87       342.336s    602.048s    950.340s
pvqspinlock   341.366s    376.135s    437.037s
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/paravirt.h | 5 +++ arch/arm64/kernel/paravirt.c | 60 +++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+)
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 1c9011d31562..f10552839663 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -31,6 +31,7 @@ static inline bool pv_vcpu_is_preempted(int cpu) }
#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) +void __init pv_qspinlock_init(void); bool pv_is_native_spin_unlock(void); DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, native_queued_spin_lock_slowpath); static inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) @@ -53,6 +54,10 @@ static inline void pv_kick(int cpu) { return; } +#else + +#define pv_qspinlock_init() do {} while (0) + #endif /* SMP && PARAVIRT_SPINLOCKS */
#else diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 844622baf6bf..a2d617b1571c 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -23,6 +23,7 @@ #include <asm/paravirt.h> #include <asm/pvclock-abi.h> #include <asm/pvsched-abi.h> +#include <asm/qspinlock_paravirt.h> #include <asm/smp_plat.h>
struct static_key paravirt_steal_enabled; @@ -259,6 +260,63 @@ static bool has_kvm_pvsched(void) return (res.a0 == SMCCC_RET_SUCCESS); }
+#ifdef CONFIG_PARAVIRT_SPINLOCKS +static bool arm_pvspin = false; + +/* Kick a cpu by its cpuid. Used to wake up a halted vcpu */ +static void kvm_kick_cpu(int cpu) +{ + struct arm_smccc_res res; + + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res); +} + +static void kvm_wait(u8 *ptr, u8 val) +{ + unsigned long flags; + + if (in_nmi()) + return; + + local_irq_save(flags); + + if (READ_ONCE(*ptr) != val) + goto out; + + dsb(sy); + wfi(); + +out: + local_irq_restore(flags); +} + +void __init pv_qspinlock_init(void) +{ + /* Don't use the PV qspinlock code if there is only 1 vCPU. */ + if (num_possible_cpus() == 1) + arm_pvspin = false; + + if (!arm_pvspin) { + pr_info("PV qspinlocks disabled\n"); + return; + } + pr_info("PV qspinlocks enabled\n"); + + __pv_init_lock_hash(); + pv_ops.sched.queued_spin_lock_slowpath = __pv_queued_spin_lock_slowpath; + pv_ops.sched.queued_spin_unlock = __pv_queued_spin_unlock; + pv_ops.sched.wait = kvm_wait; + pv_ops.sched.kick = kvm_kick_cpu; +} + +static __init int arm_parse_pvspin(char *arg) +{ + arm_pvspin = true; + return 0; +} +early_param("arm_pvspin", arm_parse_pvspin); +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ + int __init pv_sched_init(void) { int ret; @@ -278,5 +336,7 @@ int __init pv_sched_init(void) static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); pr_info("using PV sched preempted\n");
+ pv_qspinlock_init(); + return 0; }
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
--------------------------------
Add tracepoints for PV qspinlock
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/kernel/paravirt.c | 6 +++ arch/arm64/kernel/trace-paravirt.h | 66 ++++++++++++++++++++++++++++++ arch/arm64/kvm/pvsched.c | 3 ++ arch/arm64/kvm/trace_arm.h | 18 ++++++++ 4 files changed, 93 insertions(+) create mode 100644 arch/arm64/kernel/trace-paravirt.h
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index a2d617b1571c..106f18654cea 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -26,6 +26,9 @@ #include <asm/qspinlock_paravirt.h> #include <asm/smp_plat.h>
+#define CREATE_TRACE_POINTS +#include "trace-paravirt.h" + struct static_key paravirt_steal_enabled; struct static_key paravirt_steal_rq_enabled;
@@ -269,6 +272,8 @@ static void kvm_kick_cpu(int cpu) struct arm_smccc_res res;
arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res); + + trace_kvm_kick_cpu("kvm kick cpu", smp_processor_id(), cpu); }
static void kvm_wait(u8 *ptr, u8 val) @@ -285,6 +290,7 @@ static void kvm_wait(u8 *ptr, u8 val)
dsb(sy); wfi(); + trace_kvm_wait("kvm wait wfi", smp_processor_id());
out: local_irq_restore(flags); diff --git a/arch/arm64/kernel/trace-paravirt.h b/arch/arm64/kernel/trace-paravirt.h new file mode 100644 index 000000000000..2d76272f39ae --- /dev/null +++ b/arch/arm64/kernel/trace-paravirt.h @@ -0,0 +1,66 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM paravirt + +#if !defined(_TRACE_PARAVIRT_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_PARAVIRT_H + +#include <linux/tracepoint.h> + +TRACE_EVENT(kvm_kick_cpu, + TP_PROTO(const char *name, int cpu, int target), + TP_ARGS(name, cpu, target), + + TP_STRUCT__entry( + __string(name, name) + __field(int, cpu) + __field(int, target) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->cpu = cpu; + __entry->target = target; + ), + + TP_printk("PV qspinlock: %s, cpu %d kick target cpu %d", + __get_str(name), + __entry->cpu, + __entry->target + ) +); + +TRACE_EVENT(kvm_wait, + TP_PROTO(const char *name, int cpu), + TP_ARGS(name, cpu), + + TP_STRUCT__entry( + __string(name, name) + __field(int, cpu) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->cpu = cpu; + ), + + TP_printk("PV qspinlock: %s, cpu %d wait kvm access wfi", + __get_str(name), + __entry->cpu + ) +); + +#endif /* _TRACE_PARAVIRT_H */ + +/* This part must be outside protection */ +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE +#define TRACE_INCLUDE_PATH ../../../arch/arm64/kernel/ +#define TRACE_INCLUDE_FILE trace-paravirt + +#include <trace/define_trace.h> diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 49d03fd6f4ad..bcfd4760b491 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -11,6 +11,8 @@
#include <kvm/arm_hypercalls.h>
+#include "trace.h" + void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) { struct kvm *kvm = vcpu->kvm; @@ -53,6 +55,7 @@ long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu) kvm_vcpu_yield_to(target);
val = SMCCC_RET_SUCCESS; + trace_kvm_pvsched_kick_vcpu(vcpu->vcpu_id, target->vcpu_id);
out: return val; diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h index 8ad53104934d..680c7e39af4a 100644 --- a/arch/arm64/kvm/trace_arm.h +++ b/arch/arm64/kvm/trace_arm.h @@ -390,6 +390,24 @@ TRACE_EVENT(kvm_forward_sysreg_trap, sys_reg_Op2(__entry->sysreg)) );
+TRACE_EVENT(kvm_pvsched_kick_vcpu, + TP_PROTO(int vcpu_id, int target_vcpu_id), + TP_ARGS(vcpu_id, target_vcpu_id), + + TP_STRUCT__entry( + __field(int, vcpu_id) + __field(int, target_vcpu_id) + ), + + TP_fast_assign( + __entry->vcpu_id = vcpu_id; + __entry->target_vcpu_id = target_vcpu_id; + ), + + TP_printk("PV qspinlock: vcpu %d kick target vcpu %d", + __entry->vcpu_id, __entry->target_vcpu_id) +); + #endif /* _TRACE_ARM_ARM64_KVM_H */
#undef TRACE_INCLUDE_PATH
From: yezengruan yezengruan@huawei.com
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
--------------------------------------------
1. Clear kvm_vcpu_arch::pvsched::base on vcpu reset

   The guest memory will otherwise be corrupted by KVM if we reboot into an
   old guest kernel which is **not** aware of the pvsched feature.

2. Fix abnormal pvsched init on the boot CPU

   pv_sched_init() was called too early in boot (from setup_arch()), hence
   pvsched_vcpu_state was not initialized for vCPU 0.
Signed-off-by: Zenghui Yu yuzenghui@huawei.com Signed-off-by: yezengruan yezengruan@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/kernel/paravirt.c | 1 + arch/arm64/kernel/setup.c | 2 -- arch/arm64/kvm/arm.c | 2 ++ 3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 106f18654cea..d1dcebe4772b 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -346,3 +346,4 @@ int __init pv_sched_init(void)
return 0; } +early_initcall(pv_sched_init); diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 0255788cb9c4..645034e52496 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -416,8 +416,6 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p) smp_init_cpus(); smp_build_mpidr_hash();
- pv_sched_init(); - /* Init percpu seeds for random tags after cpus are set up. */ kasan_init_sw_tags();
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 82604fc917a0..33a85383d587 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1402,6 +1402,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
spin_unlock(&vcpu->arch.mp_state_lock);
+ kvm_arm_pvsched_vcpu_init(&vcpu->arch); + return 0; }
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
Support the PV qspinlock feature after the removal of pv.ops, by switching to static_call.
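As background, the static_call pattern adopted here compiles each call site as a direct branch whose target can be retargeted at boot. A minimal sketch of the pattern (all names are hypothetical, chosen only to illustrate it):

  #include <linux/init.h>
  #include <linux/static_call.h>

  static void native_kick(int cpu) { }          /* default target */
  static void kvm_kick(int cpu) { }             /* PV backend     */

  DEFINE_STATIC_CALL(demo_kick, native_kick);

  static inline void demo_do_kick(int cpu)
  {
          static_call(demo_kick)(cpu);          /* patched call site */
  }

  static int __init demo_init(void)
  {
          /* Retarget the call site once the PV backend is detected. */
          static_call_update(demo_kick, kvm_kick);
          return 0;
  }
  early_initcall(demo_init);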
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/paravirt.h | 16 ++++++++++++---- arch/arm64/kernel/paravirt.c | 22 ++++++++++++++++++---- 2 files changed, 30 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index f10552839663..a29eeffa49aa 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -33,26 +33,34 @@ static inline bool pv_vcpu_is_preempted(int cpu) #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) void __init pv_qspinlock_init(void); bool pv_is_native_spin_unlock(void); -DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, native_queued_spin_lock_slowpath); + +void dummy_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, + dummy_queued_spin_lock_slowpath); static inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) { return static_call(pv_qspinlock_queued_spin_lock_slowpath)(lock, val); }
-DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, native_queued_spin_unlock); +void dummy_queued_spin_unlock(struct qspinlock *lock); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, dummy_queued_spin_unlock); static inline void pv_queued_spin_unlock(struct qspinlock *lock) { return static_call(pv_qspinlock_queued_spin_unlock)(lock); }
+void dummy_wait(u8 *ptr, u8 val); +DECLARE_STATIC_CALL(pv_qspinlock_wait, dummy_wait); static inline void pv_wait(u8 *ptr, u8 val) { - return; + return static_call(pv_qspinlock_wait)(ptr, val); }
+void dummy_kick(int cpu); +DECLARE_STATIC_CALL(pv_qspinlock_kick, dummy_kick); static inline void pv_kick(int cpu) { - return; + return static_call(pv_qspinlock_kick)(cpu); } #else
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index d1dcebe4772b..142cee610804 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -296,6 +296,17 @@ static void kvm_wait(u8 *ptr, u8 val) local_irq_restore(flags); }
+DEFINE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, + native_queued_spin_lock_slowpath); +DEFINE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, native_queued_spin_unlock); +DEFINE_STATIC_CALL(pv_qspinlock_wait, kvm_wait); +DEFINE_STATIC_CALL(pv_qspinlock_kick, kvm_kick_cpu); + +EXPORT_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath); +EXPORT_STATIC_CALL(pv_qspinlock_queued_spin_unlock); +EXPORT_STATIC_CALL(pv_qspinlock_wait); +EXPORT_STATIC_CALL(pv_qspinlock_kick); + void __init pv_qspinlock_init(void) { /* Don't use the PV qspinlock code if there is only 1 vCPU. */ @@ -309,10 +320,13 @@ void __init pv_qspinlock_init(void) pr_info("PV qspinlocks enabled\n");
__pv_init_lock_hash(); - pv_ops.sched.queued_spin_lock_slowpath = __pv_queued_spin_lock_slowpath; - pv_ops.sched.queued_spin_unlock = __pv_queued_spin_unlock; - pv_ops.sched.wait = kvm_wait; - pv_ops.sched.kick = kvm_kick_cpu; + + static_call_update(pv_qspinlock_queued_spin_lock_slowpath, + __pv_queued_spin_lock_slowpath); + static_call_update(pv_qspinlock_queued_spin_unlock, + __pv_queued_spin_unlock); + static_call_update(pv_qspinlock_wait, kvm_wait); + static_call_update(pv_qspinlock_kick, kvm_kick_cpu); }
static __init int arm_parse_pvspin(char *arg)
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
Add a configuration option for the pv-sched feature.
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/Kconfig | 7 ++++ arch/arm64/include/asm/kvm_host.h | 4 +++ arch/arm64/include/asm/paravirt.h | 2 ++ arch/arm64/kernel/paravirt.c | 54 ++++++++++++++++--------------- arch/arm64/kvm/arm.c | 13 ++++++++ arch/arm64/kvm/handle_exit.c | 2 ++ arch/arm64/kvm/hypercalls.c | 4 +++ arch/arm64/kvm/pvsched.c | 2 ++ 8 files changed, 62 insertions(+), 26 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 971cf9d54101..145d85fc37fd 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1561,6 +1561,13 @@ config PARAVIRT_SPINLOCKS
If you are unsure how to answer this question, answer Y.
+config PARAVIRT_SCHED + bool "Paravirtualization layer for sched" + depends on PARAVIRT + help + + If you are unsure how to answer this question, answer Y. + config PARAVIRT_TIME_ACCOUNTING bool "Paravirtual steal time accounting" select PARAVIRT diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index e837b5fe0a97..b0cb4f71754c 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -596,11 +596,13 @@ struct kvm_vcpu_arch { gpa_t base; } steal;
+#ifdef CONFIG_PARAVIRT_SCHED /* Guest PV sched state */ struct { bool pv_unhalted; gpa_t base; } pvsched; +#endif
/* Per-vcpu CCSIDR override or NULL */ u32 *ccsidr; @@ -1057,6 +1059,7 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) return (vcpu_arch->steal.base != INVALID_GPA); }
+#ifdef CONFIG_PARAVIRT_SCHED long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted); long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu); @@ -1070,6 +1073,7 @@ static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch) { return (vcpu_arch->pvsched.base != INVALID_GPA); } +#endif
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index a29eeffa49aa..1ce7187eed48 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,7 +20,9 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+#ifdef CONFIG_PARAVIRT_SCHED int __init pv_sched_init(void); +#endif
__visible bool __native_vcpu_is_preempted(int cpu); DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 142cee610804..dd2407d382a3 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -180,7 +180,7 @@ int __init pv_time_init(void) return 0; }
- +#ifdef CONFIG_PARAVIRT_SCHED DEFINE_PER_CPU(struct pvsched_vcpu_state, pvsched_vcpu_region) __aligned(64); EXPORT_PER_CPU_SYMBOL(pvsched_vcpu_region);
@@ -263,6 +263,33 @@ static bool has_kvm_pvsched(void) return (res.a0 == SMCCC_RET_SUCCESS); }
+int __init pv_sched_init(void) +{ + int ret; + + if (is_hyp_mode_available()) + return 0; + + if (!has_kvm_pvsched()) { + pr_warn("PV sched is not available\n"); + return 0; + } + + ret = kvm_arm_init_pvsched(); + if (ret) + return ret; + + static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); + pr_info("using PV sched preempted\n"); + + pv_qspinlock_init(); + + return 0; +} + +early_initcall(pv_sched_init); +#endif /* CONFIG_PARAVIRT_SCHED */ + #ifdef CONFIG_PARAVIRT_SPINLOCKS static bool arm_pvspin = false;
@@ -336,28 +363,3 @@ static __init int arm_parse_pvspin(char *arg) } early_param("arm_pvspin", arm_parse_pvspin); #endif /* CONFIG_PARAVIRT_SPINLOCKS */ - -int __init pv_sched_init(void) -{ - int ret; - - if (is_hyp_mode_available()) - return 0; - - if (!has_kvm_pvsched()) { - pr_warn("PV sched is not available\n"); - return 0; - } - - ret = kvm_arm_init_pvsched(); - if (ret) - return ret; - - static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); - pr_info("using PV sched preempted\n"); - - pv_qspinlock_init(); - - return 0; -} -early_initcall(pv_sched_init); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 33a85383d587..ffbe1fcd8f41 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -415,7 +415,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
kvm_arm_pvtime_vcpu_init(&vcpu->arch);
+#ifdef CONFIG_PARAVIRT_SCHED kvm_arm_pvsched_vcpu_init(&vcpu->arch); +#endif
vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
@@ -508,8 +510,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus)) vcpu_set_on_unsupported_cpu(vcpu);
+#ifdef CONFIG_PARAVIRT_SCHED if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) kvm_update_pvsched_preempted(vcpu, 0); +#endif
#ifdef CONFIG_KVM_HISI_VIRT kvm_hisi_dvmbm_load(vcpu); @@ -530,8 +534,10 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) vcpu_clear_on_unsupported_cpu(vcpu); vcpu->cpu = -1;
+#ifdef CONFIG_PARAVIRT_SCHED if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) kvm_update_pvsched_preempted(vcpu, 1); +#endif
#ifdef CONFIG_KVM_HISI_VIRT kvm_hisi_dvmbm_put(vcpu); @@ -613,10 +619,15 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF); +#ifdef CONFIG_PARAVIRT_SCHED bool pv_unhalted = v->arch.pvsched.pv_unhalted;
return ((irq_lines || kvm_vgic_vcpu_pending_irq(v) || pv_unhalted) && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); +#else + return ((irq_lines || kvm_vgic_vcpu_pending_irq(v)) + && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); +#endif }
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu) @@ -1402,7 +1413,9 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
spin_unlock(&vcpu->arch.mp_state_lock);
+#ifdef CONFIG_PARAVIRT_SCHED kvm_arm_pvsched_vcpu_init(&vcpu->arch); +#endif
return 0; } diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index b9a44c3bebb7..f4671ae23b1f 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -121,7 +121,9 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu) } else { trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false); vcpu->stat.wfi_exit_stat++; +#ifdef CONFIG_PARAVIRT_SCHED vcpu->arch.pvsched.pv_unhalted = false; +#endif }
if (esr & ESR_ELx_WFx_ISS_WFxT) { diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 46def0572461..2e7a1d06e6ff 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -332,9 +332,11 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) &smccc_feat->std_hyp_bmap)) val[0] = SMCCC_RET_SUCCESS; break; +#ifdef CONFIG_PARAVIRT_SCHED case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = SMCCC_RET_SUCCESS; break; +#endif } break; case ARM_SMCCC_HV_PV_TIME_FEATURES: @@ -363,6 +365,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_TRNG_RND32: case ARM_SMCCC_TRNG_RND64: return kvm_trng_call(vcpu); +#ifdef CONFIG_PARAVIRT_SCHED case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = kvm_hypercall_pvsched_features(vcpu); break; @@ -380,6 +383,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: val[0] = kvm_pvsched_kick_vcpu(vcpu); break; +#endif default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index bcfd4760b491..eae6ad7b6c5d 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -4,6 +4,7 @@ * Author: Zengruan Ye yezengruan@huawei.com */
+#ifdef CONFIG_PARAVIRT_SCHED #include <linux/arm-smccc.h> #include <linux/kvm_host.h>
@@ -77,3 +78,4 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu)
return val; } +#endif /* CONFIG_PARAVIRT_SCHED */
From: chenjiajun chenjiajun8@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N CVE: NA
This patch exports the remaining aarch64 exit items to vcpu_stat via debugfs. The items include: fp_asimd_exit_stat, irq_exit_stat, sys64_exit_stat, mabt_exit_stat, fail_entry_exit_stat, internal_error_exit_stat, unknown_ec_exit_stat, cp15_32_exit_stat, cp15_64_exit_stat, cp14_mr_exit_stat, cp14_ls_exit_stat, cp14_64_exit_stat, smc_exit_stat, sve_exit_stat, debug_exit_stat.
Signed-off-by: Biaoxiang Ye yebiaoxiang@huawei.com Signed-off-by: Zengruan Ye yezengruan@huawei.com Signed-off-by: chenjiajun chenjiajun8@huawei.com Reviewed-by: Xiangyou Xie xiexiangyou@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Chen Jun chenjun102@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 15 +++++++++++++++ arch/arm64/kvm/handle_exit.c | 9 +++++++++ arch/arm64/kvm/hyp/include/hyp/switch.h | 1 + arch/arm64/kvm/mmu.c | 1 + arch/arm64/kvm/sys_regs.c | 11 +++++++++++ 5 files changed, 37 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index b0cb4f71754c..55ca5f3aa4e1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -926,6 +926,21 @@ struct kvm_vcpu_stat {
 	u64 mmio_exit_kernel;
 	u64 signal_exits;
 	u64 exits;
+	u64 fp_asimd_exit_stat;
+	u64 irq_exit_stat;
+	u64 sys64_exit_stat;
+	u64 mabt_exit_stat;
+	u64 fail_entry_exit_stat;
+	u64 internal_error_exit_stat;
+	u64 unknown_ec_exit_stat;
+	u64 cp15_32_exit_stat;
+	u64 cp15_64_exit_stat;
+	u64 cp14_mr_exit_stat;
+	u64 cp14_ls_exit_stat;
+	u64 cp14_64_exit_stat;
+	u64 smc_exit_stat;
+	u64 sve_exit_stat;
+	u64 debug_exit_stat;
 };

 unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index f4671ae23b1f..65d780887ab2 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -73,6 +73,7 @@ static int handle_smc(struct kvm_vcpu *vcpu)
 	 */
 	if (kvm_vcpu_hvc_get_imm(vcpu)) {
 		vcpu_set_reg(vcpu, 0, ~0UL);
+		vcpu->stat.smc_exit_stat++;
 		return 1;
 	}

@@ -176,6 +177,8 @@ static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu)
 	run->debug.arch.hsr_high = upper_32_bits(esr);
 	run->flags = KVM_DEBUG_ARCH_HSR_HIGH_VALID;

+	vcpu->stat.debug_exit_stat++;
+
 	switch (ESR_ELx_EC(esr)) {
 	case ESR_ELx_EC_WATCHPT_LOW:
 		run->debug.arch.far = vcpu->arch.fault.far_el2;
@@ -196,6 +199,7 @@ static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu)
 		    esr, esr_get_class_string(esr));

 	kvm_inject_undefined(vcpu);
+	vcpu->stat.unknown_ec_exit_stat++;
 	return 1;
 }

@@ -206,6 +210,7 @@ static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu)
 static int handle_sve(struct kvm_vcpu *vcpu)
 {
 	kvm_inject_undefined(vcpu);
+	vcpu->stat.sve_exit_stat++;
 	return 1;
 }

@@ -338,6 +343,7 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)

 	switch (exception_index) {
 	case ARM_EXCEPTION_IRQ:
+		vcpu->stat.irq_exit_stat++;
 		return 1;
 	case ARM_EXCEPTION_EL1_SERROR:
 		return 1;
@@ -349,6 +355,7 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
 		 * is pre-emptied by kvm_reboot()'s shutdown call.
 		 */
 		run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		vcpu->stat.fail_entry_exit_stat++;
 		return 0;
 	case ARM_EXCEPTION_IL:
 		/*
@@ -356,11 +363,13 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
 		 * have been corrupted somehow. Give up.
 		 */
 		run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		vcpu->stat.fail_entry_exit_stat++;
 		return -EINVAL;
 	default:
 		kvm_pr_unimpl("Unsupported exception type: %d", exception_index);
 		run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+		vcpu->stat.internal_error_exit_stat++;
 		return 0;
 	}
 }
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 9cfe6bd1dbe4..c91e641346d6 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -294,6 +294,7 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 	/* Only handle traps the vCPU can support here: */
 	switch (esr_ec) {
 	case ESR_ELx_EC_FP_ASIMD:
+		vcpu->stat.fp_asimd_exit_stat++;
 		break;
 	case ESR_ELx_EC_SVE:
 		if (!sve_guest)
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 482280fe22d7..121a3d90240d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1416,6 +1416,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	write_fault = kvm_is_write_fault(vcpu);
 	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
 	VM_BUG_ON(write_fault && exec_fault);
+	vcpu->stat.mabt_exit_stat++;

 	if (fault_status == ESR_ELx_FSC_PERM && !write_fault && !exec_fault) {
 		kvm_err("Unexpected L2 read permission error\n");
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 0afd6136e275..8a92e848a1e6 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2774,6 +2774,8 @@ static bool check_sysreg_table(const struct sys_reg_desc *table, unsigned int n,
 int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu)
 {
 	kvm_inject_undefined(vcpu);
+	vcpu->stat.cp14_ls_exit_stat++;
+
 	return 1;
 }

@@ -3050,6 +3052,8 @@ static int kvm_handle_cp_32(struct kvm_vcpu *vcpu,

 int kvm_handle_cp15_64(struct kvm_vcpu *vcpu)
 {
+	vcpu->stat.cp15_64_exit_stat++;
+
 	return kvm_handle_cp_64(vcpu, cp15_64_regs, ARRAY_SIZE(cp15_64_regs));
 }

@@ -3059,6 +3063,8 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu)

 	params = esr_cp1x_32_to_params(kvm_vcpu_get_esr(vcpu));

+	vcpu->stat.cp15_32_exit_stat++;
+
 	/*
 	 * Certain AArch32 ID registers are handled by rerouting to the AArch64
 	 * system register table. Registers in the ID range where CRm=0 are
@@ -3073,6 +3079,8 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu)

 int kvm_handle_cp14_64(struct kvm_vcpu *vcpu)
 {
+	vcpu->stat.cp14_64_exit_stat++;
+
 	return kvm_handle_cp_64(vcpu, cp14_64_regs, ARRAY_SIZE(cp14_64_regs));
 }

@@ -3082,6 +3090,8 @@ int kvm_handle_cp14_32(struct kvm_vcpu *vcpu)

 	params = esr_cp1x_32_to_params(kvm_vcpu_get_esr(vcpu));

+	vcpu->stat.cp14_mr_exit_stat++;
+
 	return kvm_handle_cp_32(vcpu, &params, cp14_regs, ARRAY_SIZE(cp14_regs));
 }

@@ -3178,6 +3188,7 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu)
 	int Rt = kvm_vcpu_sys_get_rt(vcpu);

 	trace_kvm_handle_sys_reg(esr);
+	vcpu->stat.sys64_exit_stat++;

 	if (__check_nv_sr_forward(vcpu))
 		return 1;
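Note that the hunks above only add the counters to struct kvm_vcpu_stat and bump them in the exit handlers. In mainline-style trees, per-vCPU counters become visible through the stats/debugfs interface via the vCPU stats descriptor table in arch/arm64/kvm/guest.c; that wiring is not part of the posted hunks, so the following is only a hypothetical sketch of what it could look like:

/*
 * Hypothetical descriptor entries for the new counters (NOT in the
 * posted diff); mainline uses this table to expose kvm_vcpu_stat
 * fields through the binary stats API and per-VM debugfs.
 */
#include <linux/kvm_host.h>

const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
	KVM_GENERIC_VCPU_STATS(),
	/* ... existing arm64 entries (hvc_exit_stat, exits, ...) ... */
	STATS_DESC_COUNTER(VCPU, fp_asimd_exit_stat),
	STATS_DESC_COUNTER(VCPU, irq_exit_stat),
	STATS_DESC_COUNTER(VCPU, sys64_exit_stat),
	STATS_DESC_COUNTER(VCPU, mabt_exit_stat),
	STATS_DESC_COUNTER(VCPU, fail_entry_exit_stat),
	STATS_DESC_COUNTER(VCPU, internal_error_exit_stat),
	STATS_DESC_COUNTER(VCPU, unknown_ec_exit_stat),
	STATS_DESC_COUNTER(VCPU, cp15_32_exit_stat),
	STATS_DESC_COUNTER(VCPU, cp15_64_exit_stat),
	STATS_DESC_COUNTER(VCPU, cp14_mr_exit_stat),
	STATS_DESC_COUNTER(VCPU, cp14_ls_exit_stat),
	STATS_DESC_COUNTER(VCPU, cp14_64_exit_stat),
	STATS_DESC_COUNTER(VCPU, smc_exit_stat),
	STATS_DESC_COUNTER(VCPU, sve_exit_stat),
	STATS_DESC_COUNTER(VCPU, debug_exit_stat),
};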
From: Zenghui Yu yuzenghui@huawei.com
virt inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TN8N
CVE: NA
----------------------------------------------------
Currently fp_asimd_exit_stat is accumulated for *both* FP/ASIMD and SVE traps, so users cannot distinguish between the two via debugfs.
Fix the manipulation for both exception classes.
Signed-off-by: Zenghui Yu yuzenghui@huawei.com
Reviewed-by: Keqian Zhu zhukeqian1@huawei.com
Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 1 +
 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index c91e641346d6..61907949e00f 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -297,6 +297,7 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 		vcpu->stat.fp_asimd_exit_stat++;
 		break;
 	case ESR_ELx_EC_SVE:
+		vcpu->stat.sve_exit_stat++;
 		if (!sve_guest)
 			return false;
 		break;
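For clarity, after this fix the per-exception-class accounting in kvm_hyp_handle_fpsimd() boils down to the following simplified sketch; the helper name is hypothetical and the real handler also re-enables the traps and restores the FP/SVE state:

/*
 * Simplified view of the accounting only -- not the full handler.
 * account_fpsimd_trap() is a made-up name for illustration.
 */
static bool account_fpsimd_trap(struct kvm_vcpu *vcpu, u8 esr_ec, bool sve_guest)
{
	switch (esr_ec) {
	case ESR_ELx_EC_FP_ASIMD:
		vcpu->stat.fp_asimd_exit_stat++;	/* FP/ASIMD traps only */
		return true;
	case ESR_ELx_EC_SVE:
		vcpu->stat.sve_exit_stat++;		/* SVE traps now counted separately */
		return sve_guest;
	default:
		return false;
	}
}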
Feedback: The patch(es) you have sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/4001
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/F...