Support pv feature
Zenghui Yu (1): KVM: arm64: Fix {fp_asimd,sve}_exit_stat manipulation
Zengruan Ye (10): KVM: arm64: Document PV-sched interface KVM: arm64: Implement PV_SCHED_FEATURES call KVM: arm64: Support pvsched preempted via shared structure KVM: arm64: Add interface to support vCPU preempted check KVM: arm64: Support the vCPU preemption check KVM: arm64: Add SMCCC PV-sched to kick cpu KVM: arm64: Implement PV_SCHED_KICK_CPU call KVM: arm64: Add interface to support PV qspinlock KVM: arm64: Enable PV qspinlock KVM: arm64: Add tracepoints for PV qspinlock
chenjiajun (1): kvm: debugfs: export remaining aarch64 kvm exit reasons to debugfs
lishusen (2): KVM: arm64: Replace pv.ops by static_call for PV qspinlock feature KVM: arm64: Add configuration for feature pv-sched
yezengruan (1): kvm: arm64: fix some pvsched bugs
C | 0 Documentation/virt/kvm/arm/pvsched.rst | 74 ++++++++ arch/arm64/Kconfig | 20 +++ arch/arm64/include/asm/Kbuild | 1 - arch/arm64/include/asm/kvm_host.h | 39 ++++ arch/arm64/include/asm/paravirt.h | 51 ++++++ arch/arm64/include/asm/pvsched-abi.h | 16 ++ arch/arm64/include/asm/qspinlock.h | 48 +++++ arch/arm64/include/asm/qspinlock_paravirt.h | 12 ++ arch/arm64/include/asm/spinlock.h | 13 ++ arch/arm64/kernel/Makefile | 3 +- arch/arm64/kernel/paravirt-spinlocks.c | 21 +++ arch/arm64/kernel/paravirt.c | 189 ++++++++++++++++++++ arch/arm64/kernel/trace-paravirt.h | 66 +++++++ arch/arm64/kvm/Makefile | 2 +- arch/arm64/kvm/arm.c | 34 ++++ arch/arm64/kvm/handle_exit.c | 12 ++ arch/arm64/kvm/hyp/include/hyp/switch.h | 2 + arch/arm64/kvm/hypercalls.c | 24 +++ arch/arm64/kvm/mmu.c | 1 + arch/arm64/kvm/pvsched.c | 81 +++++++++ arch/arm64/kvm/sys_regs.c | 11 ++ arch/arm64/kvm/trace_arm.h | 18 ++ include/linux/arm-smccc.h | 25 +++ include/linux/cpuhotplug.h | 1 + 25 files changed, 761 insertions(+), 3 deletions(-) create mode 100644 C create mode 100644 Documentation/virt/kvm/arm/pvsched.rst create mode 100644 arch/arm64/include/asm/pvsched-abi.h create mode 100644 arch/arm64/include/asm/qspinlock.h create mode 100644 arch/arm64/include/asm/qspinlock_paravirt.h create mode 100644 arch/arm64/kernel/paravirt-spinlocks.c create mode 100644 arch/arm64/kernel/trace-paravirt.h create mode 100644 arch/arm64/kvm/pvsched.c
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
Introduce a paravirtualization interface for KVM/arm64 to PV-sched.
A hypercall interface is provided for the guest to interrogate the hypervisor's support for this interface and the location of the shared memory structures.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- Documentation/virt/kvm/arm/pvsched.rst | 58 ++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) create mode 100644 Documentation/virt/kvm/arm/pvsched.rst
diff --git a/Documentation/virt/kvm/arm/pvsched.rst b/Documentation/virt/kvm/arm/pvsched.rst new file mode 100644 index 000000000000..8f7112a8a9cd --- /dev/null +++ b/Documentation/virt/kvm/arm/pvsched.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Paravirtualized sched support for arm64 +======================================= + +KVM/arm64 provides some hypervisor service calls to support a paravirtualized +sched. + +Some SMCCC compatible hypercalls are defined: + +* PV_SCHED_FEATURES: 0xC5000090 +* PV_SCHED_IPA_INIT: 0xC5000091 +* PV_SCHED_IPA_RELEASE: 0xC5000092 + +The existence of the PV_SCHED hypercall should be probed using the SMCCC 1.1 +ARCH_FEATURES mechanism before calling it. + +PV_SCHED_FEATURES + ============= ======== ========== + Function ID: (uint32) 0xC5000090 + PV_call_id: (uint32) The function to query for support. + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant + PV-sched feature is supported by the hypervisor. + ============= ======== ========== + +PV_SCHED_IPA_INIT + ============= ======== ========== + Function ID: (uint32) 0xC5000091 + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the IPA of + this vCPU's PV data structure is shared to the + hypervisor. + ============= ======== ========== + +PV_SCHED_IPA_RELEASE + ============= ======== ========== + Function ID: (uint32) 0xC5000092 + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the IPA of + this vCPU's PV data structure is released. + ============= ======== ========== + +PV sched state +-------------- + +The structure pointed to by the PV_SCHED_IPA hypercall is as follows: + ++-----------+-------------+-------------+-----------------------------------+ +| Field | Byte Length | Byte Offset | Description | ++===========+=============+=============+===================================+ +| preempted | 4 | 0 | Indicates that the vCPU that owns | +| | | | this struct is running or not. | +| | | | Non-zero values mean the vCPU has | +| | | | been preempted. Zero means the | +| | | | vCPU is not preempted. | ++-----------+-------------+-------------+-----------------------------------+ + +The preempted field will be updated to 0 by the hypervisor prior to scheduling +a vCPU. When the vCPU is scheduled out, the preempted field will be updated +to 1 by the hypervisor.
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
This provides a mechanism for querying which paravirtualized sched features are available in this hypervisor.
Add some SMCCC compatible hypercalls for PV sched features: PV_SCHED_FEATURES: 0xC5000090 PV_SCHED_IPA_INIT: 0xC5000091 PV_SCHED_IPA_RELEASE: 0xC5000092
Also add the header file which defines the ABI for the paravirtualized sched features we're about to add.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/include/asm/pvsched-abi.h | 16 ++++++++++++++++ arch/arm64/kvm/Makefile | 2 +- arch/arm64/kvm/hypercalls.c | 6 ++++++ arch/arm64/kvm/pvsched.c | 23 +++++++++++++++++++++++ include/linux/arm-smccc.h | 19 +++++++++++++++++++ 6 files changed, 67 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/include/asm/pvsched-abi.h create mode 100644 arch/arm64/kvm/pvsched.c
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index af06ccb7ee34..3a91eee0c80a 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1038,6 +1038,8 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) return (vcpu_arch->steal.base != INVALID_GPA); }
+long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); + void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr); diff --git a/arch/arm64/include/asm/pvsched-abi.h b/arch/arm64/include/asm/pvsched-abi.h new file mode 100644 index 000000000000..80e50e7a1a31 --- /dev/null +++ b/arch/arm64/include/asm/pvsched-abi.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef __ASM_PVSCHED_ABI_H +#define __ASM_PVSCHED_ABI_H + +struct pvsched_vcpu_state { + __le32 preempted; + /* Structure must be 64 byte aligned, pad to that size */ + u8 padding[60]; +} __packed; + +#endif diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index c0c050e53157..3e16e6c364c6 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -10,7 +10,7 @@ include $(srctree)/virt/kvm/Makefile.kvm obj-$(CONFIG_KVM) += kvm.o obj-$(CONFIG_KVM) += hyp/
-kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ +kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o pvsched.o \ inject_fault.o va_layout.o handle_exit.o \ guest.o debug.o reset.o sys_regs.o stacktrace.o \ vgic-sys-reg-v3.o fpsimd.o pkvm.o \ diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 7fb4df0456de..190670d8dd3f 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -332,6 +332,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) &smccc_feat->std_hyp_bmap)) val[0] = SMCCC_RET_SUCCESS; break; + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val[0] = SMCCC_RET_SUCCESS; + break; } break; case ARM_SMCCC_HV_PV_TIME_FEATURES: @@ -360,6 +363,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_TRNG_RND32: case ARM_SMCCC_TRNG_RND64: return kvm_trng_call(vcpu); + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val[0] = kvm_hypercall_pvsched_features(vcpu); + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c new file mode 100644 index 000000000000..3d96122fcf9e --- /dev/null +++ b/arch/arm64/kvm/pvsched.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#include <linux/arm-smccc.h> + +#include <kvm/arm_hypercalls.h> + +long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) +{ + u32 feature = smccc_get_arg1(vcpu); + long val = SMCCC_RET_NOT_SUPPORTED; + + switch (feature) { + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val = SMCCC_RET_SUCCESS; + break; + } + + return val; +} diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index 083f85653716..daab363d31ae 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -577,5 +577,24 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, method; \ })
+/* Paravirtualised sched calls */ +#define ARM_SMCCC_HV_PV_SCHED_FEATURES \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x90) + +#define ARM_SMCCC_HV_PV_SCHED_IPA_INIT \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x91) + +#define ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x92) + #endif /*__ASSEMBLY__*/ #endif /*__LINUX_ARM_SMCCC_H*/
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
Implement the service call for configuring a shared structure between a vCPU and the hypervisor in which the hypervisor can tell the vCPU that is running or not.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- C | 0 arch/arm64/include/asm/kvm_host.h | 16 ++++++++++++++++ arch/arm64/kvm/arm.c | 17 +++++++++++++++++ arch/arm64/kvm/hypercalls.c | 11 +++++++++++ arch/arm64/kvm/pvsched.c | 28 ++++++++++++++++++++++++++++ 5 files changed, 72 insertions(+) create mode 100644 C
diff --git a/C b/C new file mode 100644 index 000000000000..e69de29bb2d1 diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3a91eee0c80a..cf50756f11dd 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -589,6 +589,11 @@ struct kvm_vcpu_arch { gpa_t base; } steal;
+ /* Guest PV sched state */ + struct { + gpa_t base; + } pvsched; + /* Per-vcpu CCSIDR override or NULL */ u32 *ccsidr; }; @@ -1039,6 +1044,17 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) }
long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); +void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted); + +static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch) +{ + vcpu_arch->pvsched.base = GPA_INVALID; +} + +static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch) +{ + return (vcpu_arch->pvsched.base != GPA_INVALID); +}
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 4866b3f7b4ea..b9dd72f182e0 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -386,6 +386,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
kvm_arm_pvtime_vcpu_init(&vcpu->arch);
+ kvm_arm_pvsched_vcpu_init(&vcpu->arch); + vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
err = kvm_vgic_vcpu_init(vcpu); @@ -461,10 +463,18 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (vcpu_has_ptrauth(vcpu)) vcpu_ptrauth_disable(vcpu); + kvm_arch_vcpu_load_debug_state_flags(vcpu);
if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus)) vcpu_set_on_unsupported_cpu(vcpu); + + if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) + kvm_update_pvsched_preempted(vcpu, 0); + +#ifdef CONFIG_KVM_HISI_VIRT + kvm_hisi_dvmbm_load(vcpu); +#endif }
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) @@ -480,6 +490,13 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
vcpu_clear_on_unsupported_cpu(vcpu); vcpu->cpu = -1; + + if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) + kvm_update_pvsched_preempted(vcpu, 1); + +#ifdef CONFIG_KVM_HISI_VIRT + kvm_hisi_dvmbm_put(vcpu); +#endif }
static void __kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 190670d8dd3f..b72224609c3e 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -366,6 +366,17 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = kvm_hypercall_pvsched_features(vcpu); break; + case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: + gpa = smccc_get_arg1(vcpu); + if (gpa != GPA_INVALID) { + vcpu->arch.pvsched.base = gpa; + val = SMCCC_RET_SUCCESS; + } + break; + case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: + vcpu->arch.pvsched.base = GPA_INVALID; + val = SMCCC_RET_SUCCESS; + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 3d96122fcf9e..b923c4b6c52e 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -5,9 +5,35 @@ */
#include <linux/arm-smccc.h> +#include <linux/kvm_host.h> + +#include <asm/pvsched-abi.h>
#include <kvm/arm_hypercalls.h>
+void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) +{ + struct kvm *kvm = vcpu->kvm; + u64 base = vcpu->arch.pvsched.base; + u64 offset = offsetof(struct pvsched_vcpu_state, preempted); + int idx; + + if (base == GPA_INVALID) + return; + + /* + * This function is called from atomic context, so we need to + * disable page faults. + */ + pagefault_disable(); + + idx = srcu_read_lock(&kvm->srcu); + kvm_put_guest(kvm, base + offset, cpu_to_le32(preempted)); + srcu_read_unlock(&kvm->srcu, idx); + + pagefault_enable(); +} + long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) { u32 feature = smccc_get_arg1(vcpu); @@ -15,6 +41,8 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu)
switch (feature) { case ARM_SMCCC_HV_PV_SCHED_FEATURES: + case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: + case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: val = SMCCC_RET_SUCCESS; break; }
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
This is to fix some lock holder preemption issues. Some other locks implementation do a spin loop before acquiring the lock itself. Currently kernel has an interface of bool vcpu_is_preempted(int cpu). It takes the CPU as parameter and return true if the CPU is preempted. Then kernel can break the spin loops upon the retval of vcpu_is_preempted.
As kernel has used this interface, So lets support it.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/paravirt.h | 8 ++++++++ arch/arm64/include/asm/spinlock.h | 10 ++++++++++ arch/arm64/kernel/Makefile | 2 +- arch/arm64/kernel/paravirt-spinlocks.c | 16 ++++++++++++++++ 4 files changed, 35 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/kernel/paravirt-spinlocks.c
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 9aa193e0e8f2..53282dc429f8 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,6 +20,14 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+__visible bool __native_vcpu_is_preempted(int cpu); +DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); + +static inline bool pv_vcpu_is_preempted(int cpu) +{ + return static_call(pv_vcpu_preempted)(cpu); +} + #else
#define pv_time_init() do {} while (0) diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h index 0525c0b089ed..3dcb1c903ba0 100644 --- a/arch/arm64/include/asm/spinlock.h +++ b/arch/arm64/include/asm/spinlock.h @@ -7,6 +7,7 @@
#include <asm/qspinlock.h> #include <asm/qrwlock.h> +#include <asm/paravirt.h>
/* See include/linux/spinlock.h */ #define smp_mb__after_spinlock() smp_mb() @@ -19,9 +20,18 @@ * https://lore.kernel.org/lkml/20200110100612.GC2827@hirez.programming.kicks-a... */ #define vcpu_is_preempted vcpu_is_preempted +#ifdef CONFIG_PARAVIRT +static inline bool vcpu_is_preempted(int cpu) +{ + return pv_vcpu_is_preempted(cpu); +} + +#else + static inline bool vcpu_is_preempted(int cpu) { return false; } +#endif /* CONFIG_PARAVIRT */
#endif /* __ASM_SPINLOCK_H */ diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 31859e7741a5..235963ed87b1 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -60,7 +60,7 @@ obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o -obj-$(CONFIG_PARAVIRT) += paravirt.o +obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt-spinlocks.o obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/ obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o obj-$(CONFIG_ELF_CORE) += elfcore.o diff --git a/arch/arm64/kernel/paravirt-spinlocks.c b/arch/arm64/kernel/paravirt-spinlocks.c new file mode 100644 index 000000000000..3566897070d9 --- /dev/null +++ b/arch/arm64/kernel/paravirt-spinlocks.c @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#include <linux/static_call.h> +#include <linux/spinlock.h> +#include <asm/paravirt.h> + +__visible bool __native_vcpu_is_preempted(int cpu) +{ + return false; +} + +DEFINE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted);
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
Support the vcpu_is_preempted() functionality under KVM/arm64. This will enhance lock performance on overcommitted hosts (more runnable vCPUs than physical CPUs in the system) as doing busy waits for preempted vCPUs will hurt system performance far worse than early yielding.
unix benchmark result: host: kernel 4.19.87, HiSilicon Kunpeng920, 8 CPUs guest: kernel 4.19.87, 16 vCPUs
test-case | after-patch | before-patch ----------------------------------------+-------------------+------------------ Dhrystone 2 using register variables | 338955728.5 lps | 339266319.5 lps Double-Precision Whetstone | 30634.9 MWIPS | 30884.4 MWIPS Execl Throughput | 6753.2 lps | 3580.1 lps File Copy 1024 bufsize 2000 maxblocks | 490048.0 KBps | 313282.3 KBps File Copy 256 bufsize 500 maxblocks | 129662.5 KBps | 83550.7 KBps File Copy 4096 bufsize 8000 maxblocks | 1552551.5 KBps | 814327.0 KBps Pipe Throughput | 8976422.5 lps | 9048628.4 lps Pipe-based Context Switching | 258641.7 lps | 252925.9 lps Process Creation | 5312.2 lps | 4507.9 lps Shell Scripts (1 concurrent) | 8704.2 lpm | 6720.9 lpm Shell Scripts (8 concurrent) | 1708.8 lpm | 607.2 lpm System Call Overhead | 3714444.7 lps | 3746386.8 lps ----------------------------------------+-------------------+------------------ System Benchmarks Index Score | 2270.6 | 1679.2
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 4 +- arch/arm64/include/asm/paravirt.h | 3 + arch/arm64/kernel/paravirt.c | 106 ++++++++++++++++++++++++++++++ arch/arm64/kernel/setup.c | 2 + arch/arm64/kvm/hypercalls.c | 8 +-- arch/arm64/kvm/pvsched.c | 2 +- include/linux/cpuhotplug.h | 1 + 7 files changed, 119 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index cf50756f11dd..b87dd86b1842 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1048,12 +1048,12 @@ void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted);
static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch) { - vcpu_arch->pvsched.base = GPA_INVALID; + vcpu_arch->pvsched.base = INVALID_GPA; }
static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch) { - return (vcpu_arch->pvsched.base != GPA_INVALID); + return (vcpu_arch->pvsched.base != INVALID_GPA); }
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome); diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 53282dc429f8..4e35851bcecf 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,6 +20,8 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+int __init pv_sched_init(void); + __visible bool __native_vcpu_is_preempted(int cpu); DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted);
@@ -31,6 +33,7 @@ static inline bool pv_vcpu_is_preempted(int cpu) #else
#define pv_time_init() do {} while (0) +#define pv_sched_init() do {} while (0)
#endif // CONFIG_PARAVIRT
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index aa718d6a9274..844622baf6bf 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -22,6 +22,7 @@
#include <asm/paravirt.h> #include <asm/pvclock-abi.h> +#include <asm/pvsched-abi.h> #include <asm/smp_plat.h>
struct static_key paravirt_steal_enabled; @@ -174,3 +175,108 @@ int __init pv_time_init(void)
return 0; } + + +DEFINE_PER_CPU(struct pvsched_vcpu_state, pvsched_vcpu_region) __aligned(64); +EXPORT_PER_CPU_SYMBOL(pvsched_vcpu_region); + +static bool kvm_vcpu_is_preempted(int cpu) +{ + struct pvsched_vcpu_state *reg; + u32 preempted; + + reg = &per_cpu(pvsched_vcpu_region, cpu); + if (!reg) { + pr_warn_once("PV sched enabled but not configured for cpu %d\n", + cpu); + return false; + } + + preempted = le32_to_cpu(READ_ONCE(reg->preempted)); + + return !!preempted; +} + +static int pvsched_vcpu_state_dying_cpu(unsigned int cpu) +{ + struct pvsched_vcpu_state *reg; + struct arm_smccc_res res; + + reg = this_cpu_ptr(&pvsched_vcpu_region); + if (!reg) + return -EFAULT; + + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE, &res); + memset(reg, 0, sizeof(*reg)); + + return 0; +} + +static int init_pvsched_vcpu_state(unsigned int cpu) +{ + struct pvsched_vcpu_state *reg; + struct arm_smccc_res res; + + reg = this_cpu_ptr(&pvsched_vcpu_region); + if (!reg) + return -EFAULT; + + /* Pass the memory address to host via hypercall */ + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_IPA_INIT, + virt_to_phys(reg), &res); + + return 0; +} + +static int kvm_arm_init_pvsched(void) +{ + int ret; + + ret = cpuhp_setup_state(CPUHP_AP_ARM_KVM_PVSCHED_STARTING, + "hypervisor/arm/pvsched:starting", + init_pvsched_vcpu_state, + pvsched_vcpu_state_dying_cpu); + + if (ret < 0) { + pr_warn("PV sched init failed\n"); + return ret; + } + + return 0; +} + +static bool has_kvm_pvsched(void) +{ + struct arm_smccc_res res; + + /* To detect the presence of PV sched support we require SMCCC 1.1+ */ + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) + return false; + + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_HV_PV_SCHED_FEATURES, &res); + + return (res.a0 == SMCCC_RET_SUCCESS); +} + +int __init pv_sched_init(void) +{ + int ret; + + if (is_hyp_mode_available()) + return 0; + + if (!has_kvm_pvsched()) { + pr_warn("PV sched is not available\n"); + return 0; + } + + ret = kvm_arm_init_pvsched(); + if (ret) + return ret; + + static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); + pr_info("using PV sched preempted\n"); + + return 0; +} diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 645034e52496..0255788cb9c4 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -416,6 +416,8 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p) smp_init_cpus(); smp_build_mpidr_hash();
+ pv_sched_init(); + /* Init percpu seeds for random tags after cpus are set up. */ kasan_init_sw_tags();
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index b72224609c3e..a96b88d90cf8 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -368,14 +368,14 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) break; case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: gpa = smccc_get_arg1(vcpu); - if (gpa != GPA_INVALID) { + if (gpa != INVALID_GPA) { vcpu->arch.pvsched.base = gpa; - val = SMCCC_RET_SUCCESS; + val[0] = SMCCC_RET_SUCCESS; } break; case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: - vcpu->arch.pvsched.base = GPA_INVALID; - val = SMCCC_RET_SUCCESS; + vcpu->arch.pvsched.base = INVALID_GPA; + val[0] = SMCCC_RET_SUCCESS; break; default: return kvm_psci_call(vcpu); diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index b923c4b6c52e..06290c831101 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -18,7 +18,7 @@ void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) u64 offset = offsetof(struct pvsched_vcpu_state, preempted); int idx;
- if (base == GPA_INVALID) + if (base == INVALID_GPA) return;
/* diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 624d4a38c358..f94a1b8e34e0 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -190,6 +190,7 @@ enum cpuhp_state { CPUHP_AP_DUMMY_TIMER_STARTING, CPUHP_AP_ARM_XEN_STARTING, CPUHP_AP_ARM_XEN_RUNSTATE_STARTING, + CPUHP_AP_ARM_KVM_PVSCHED_STARTING, CPUHP_AP_ARM_CORESIGHT_STARTING, CPUHP_AP_ARM_CORESIGHT_CTI_STARTING, CPUHP_AP_ARM64_ISNDEP_STARTING,
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
A new hypercall interface function is provided for the guest to kick WFI state vCPU.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- Documentation/virt/kvm/arm/pvsched.rst | 16 ++++++++++++++++ include/linux/arm-smccc.h | 6 ++++++ 2 files changed, 22 insertions(+)
diff --git a/Documentation/virt/kvm/arm/pvsched.rst b/Documentation/virt/kvm/arm/pvsched.rst index 8f7112a8a9cd..6ba221e25089 100644 --- a/Documentation/virt/kvm/arm/pvsched.rst +++ b/Documentation/virt/kvm/arm/pvsched.rst @@ -11,6 +11,7 @@ Some SMCCC compatible hypercalls are defined: * PV_SCHED_FEATURES: 0xC5000090 * PV_SCHED_IPA_INIT: 0xC5000091 * PV_SCHED_IPA_RELEASE: 0xC5000092 +* PV_SCHED_KICK_CPU: 0xC5000093
The existence of the PV_SCHED hypercall should be probed using the SMCCC 1.1 ARCH_FEATURES mechanism before calling it. @@ -38,6 +39,13 @@ PV_SCHED_IPA_RELEASE this vCPU's PV data structure is released. ============= ======== ==========
+PV_SCHED_KICK_CPU + ============= ======== ========== + Function ID: (uint32) 0xC5000093 + Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the vCPU is + kicked by the hypervisor. + ============= ======== ========== + PV sched state --------------
@@ -56,3 +64,11 @@ The structure pointed to by the PV_SCHED_IPA hypercall is as follows: The preempted field will be updated to 0 by the hypervisor prior to scheduling a vCPU. When the vCPU is scheduled out, the preempted field will be updated to 1 by the hypervisor. + +A vCPU of a paravirtualized guest that is busywaiting in guest kernel mode for +an event to occur (ex: a spinlock to become available) can execute WFI +instruction once it has busy-waited for more than a threshold time-interval. +Execution of WFI instruction would cause the hypervisor to put the vCPU to sleep +until occurrence of an appropriate event. Another vCPU of the same guest can +wakeup the sleeping vCPU by issuing PV_SCHED_KICK_CPU hypercall, specifying CPU +id (reg1) of the vCPU to be woken up. diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index daab363d31ae..02b857f44ae8 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -596,5 +596,11 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, ARM_SMCCC_OWNER_STANDARD_HYP, \ 0x92)
+#define ARM_SMCCC_HV_PV_SCHED_KICK_CPU \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_STANDARD_HYP, \ + 0x93) + #endif /*__ASSEMBLY__*/ #endif /*__LINUX_ARM_SMCCC_H*/
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
Implement the service call for waking up a WFI state vCPU.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kvm/arm.c | 4 +++- arch/arm64/kvm/handle_exit.c | 1 + arch/arm64/kvm/hypercalls.c | 3 +++ arch/arm64/kvm/pvsched.c | 25 +++++++++++++++++++++++++ 5 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index b87dd86b1842..d74310f1e569 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -591,6 +591,7 @@ struct kvm_vcpu_arch {
/* Guest PV sched state */ struct { + bool pv_unhalted; gpa_t base; } pvsched;
@@ -1045,6 +1046,7 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch)
long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted); +long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu);
static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch) { diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index b9dd72f182e0..735cc8393a55 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -574,7 +574,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF); - return ((irq_lines || kvm_vgic_vcpu_pending_irq(v)) + bool pv_unhalted = v->arch.pvsched.pv_unhalted; + + return ((irq_lines || kvm_vgic_vcpu_pending_irq(v) || pv_unhalted) && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); }
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 617ae6dea5d5..b9a44c3bebb7 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -121,6 +121,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu) } else { trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false); vcpu->stat.wfi_exit_stat++; + vcpu->arch.pvsched.pv_unhalted = false; }
if (esr & ESR_ELx_WFx_ISS_WFxT) { diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index a96b88d90cf8..46def0572461 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -377,6 +377,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) vcpu->arch.pvsched.base = INVALID_GPA; val[0] = SMCCC_RET_SUCCESS; break; + case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: + val[0] = kvm_pvsched_kick_vcpu(vcpu); + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 06290c831101..49d03fd6f4ad 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -34,6 +34,30 @@ void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) pagefault_enable(); }
+long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu) +{ + unsigned int vcpu_idx; + long val = SMCCC_RET_NOT_SUPPORTED; + struct kvm *kvm = vcpu->kvm; + struct kvm_vcpu *target = NULL; + + vcpu_idx = smccc_get_arg1(vcpu); + target = kvm_get_vcpu(kvm, vcpu_idx); + if (!target) + goto out; + + target->arch.pvsched.pv_unhalted = true; + kvm_make_request(KVM_REQ_IRQ_PENDING, target); + kvm_vcpu_kick(target); + if (READ_ONCE(target->ready)) + kvm_vcpu_yield_to(target); + + val = SMCCC_RET_SUCCESS; + +out: + return val; +} + long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) { u32 feature = smccc_get_arg1(vcpu); @@ -43,6 +67,7 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_FEATURES: case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: + case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: val = SMCCC_RET_SUCCESS; break; }
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
As kernel has used this interface, so lets support it.
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/Kconfig | 13 ++++++ arch/arm64/include/asm/Kbuild | 1 - arch/arm64/include/asm/paravirt.h | 25 +++++++++++ arch/arm64/include/asm/qspinlock.h | 48 +++++++++++++++++++++ arch/arm64/include/asm/qspinlock_paravirt.h | 12 ++++++ arch/arm64/include/asm/spinlock.h | 3 ++ arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/paravirt-spinlocks.c | 5 +++ 8 files changed, 107 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/include/asm/qspinlock.h create mode 100644 arch/arm64/include/asm/qspinlock_paravirt.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 9568049f36e3..c21b663b34d2 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1548,6 +1548,19 @@ config PARAVIRT under a hypervisor, potentially improving performance significantly over full virtualization.
+config PARAVIRT_SPINLOCKS + bool "Paravirtualization layer for spinlocks" + depends on PARAVIRT && SMP + help + Paravirtualized spinlocks allow a pvops backend to replace the + spinlock implementation with something virtualization-friendly + (for example, block the virtual CPU rather than spinning). + + It has a minimal impact on native kernels and gives a nice performance + benefit on paravirtualized KVM kernels. + + If you are unsure how to answer this question, answer Y. + config PARAVIRT_TIME_ACCOUNTING bool "Paravirtual steal time accounting" select PARAVIRT diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild index 5c8ee5a541d2..d16ee8095366 100644 --- a/arch/arm64/include/asm/Kbuild +++ b/arch/arm64/include/asm/Kbuild @@ -2,7 +2,6 @@ generic-y += early_ioremap.h generic-y += mcs_spinlock.h generic-y += qrwlock.h -generic-y += qspinlock.h generic-y += parport.h generic-y += user.h
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 4e35851bcecf..1c9011d31562 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -30,6 +30,31 @@ static inline bool pv_vcpu_is_preempted(int cpu) return static_call(pv_vcpu_preempted)(cpu); }
+#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) +bool pv_is_native_spin_unlock(void); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, native_queued_spin_lock_slowpath); +static inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + return static_call(pv_qspinlock_queued_spin_lock_slowpath)(lock, val); +} + +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, native_queued_spin_unlock); +static inline void pv_queued_spin_unlock(struct qspinlock *lock) +{ + return static_call(pv_qspinlock_queued_spin_unlock)(lock); +} + +static inline void pv_wait(u8 *ptr, u8 val) +{ + return; +} + +static inline void pv_kick(int cpu) +{ + return; +} +#endif /* SMP && PARAVIRT_SPINLOCKS */ + #else
#define pv_time_init() do {} while (0) diff --git a/arch/arm64/include/asm/qspinlock.h b/arch/arm64/include/asm/qspinlock.h new file mode 100644 index 000000000000..ce3816b60f25 --- /dev/null +++ b/arch/arm64/include/asm/qspinlock.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2020 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef _ASM_ARM64_QSPINLOCK_H +#define _ASM_ARM64_QSPINLOCK_H + +#include <linux/jump_label.h> +#include <asm/cpufeature.h> +#include <asm-generic/qspinlock_types.h> +#include <asm/paravirt.h> + +#define _Q_PENDING_LOOPS (1 << 9) + +#ifdef CONFIG_PARAVIRT_SPINLOCKS +extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +extern void __pv_init_lock_hash(void); +extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); + +#define queued_spin_unlock queued_spin_unlock +/** + * queued_spin_unlock - release a queued spinlock + * @lock : Pointer to queued spinlock structure + * + * A smp_store_release() on the least-significant byte. + */ +static inline void native_queued_spin_unlock(struct qspinlock *lock) +{ + smp_store_release(&lock->locked, 0); +} + +static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + pv_queued_spin_lock_slowpath(lock, val); +} + +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + pv_queued_spin_unlock(lock); +} + +#endif + +#include <asm-generic/qspinlock.h> + +#endif /* _ASM_ARM64_QSPINLOCK_H */ diff --git a/arch/arm64/include/asm/qspinlock_paravirt.h b/arch/arm64/include/asm/qspinlock_paravirt.h new file mode 100644 index 000000000000..eba4be28fbb9 --- /dev/null +++ b/arch/arm64/include/asm/qspinlock_paravirt.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef __ASM_QSPINLOCK_PARAVIRT_H +#define __ASM_QSPINLOCK_PARAVIRT_H + +extern void __pv_queued_spin_unlock(struct qspinlock *lock); + +#endif diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h index 3dcb1c903ba0..4040a27ecf86 100644 --- a/arch/arm64/include/asm/spinlock.h +++ b/arch/arm64/include/asm/spinlock.h @@ -9,6 +9,9 @@ #include <asm/qrwlock.h> #include <asm/paravirt.h>
+/* How long a lock should spin before we consider blocking */ +#define SPIN_THRESHOLD (1 << 15) + /* See include/linux/spinlock.h */ #define smp_mb__after_spinlock() smp_mb()
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 235963ed87b1..4db14d6677bd 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt-spinlocks.o +obj-$(CONFIG_PARAVIRT_SPINLOCKS) += paravirt.o paravirt-spinlocks.o obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/ obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o obj-$(CONFIG_ELF_CORE) += elfcore.o diff --git a/arch/arm64/kernel/paravirt-spinlocks.c b/arch/arm64/kernel/paravirt-spinlocks.c index 3566897070d9..2e4abc83e318 100644 --- a/arch/arm64/kernel/paravirt-spinlocks.c +++ b/arch/arm64/kernel/paravirt-spinlocks.c @@ -14,3 +14,8 @@ __visible bool __native_vcpu_is_preempted(int cpu) }
DEFINE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); + +bool pv_is_native_spin_unlock(void) +{ + return false; +}
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
Linux kernel builds were run in KVM guest on HiSilicon Kunpeng920 system. VM guests were set up with 32, 48 and 64 vCPUs on the 32 physical CPUs. The kernel build (make -j<n>) was done in a VM with unpinned vCPUs 3 times with the best time selected and <n> is the number of vCPUs available. The build times of the original linux 4.19.87, pvqspinlock with various number of vCPUs are as follows:
Kernel 32 vCPUs 48 vCPUs 60 vCPUs ---------- -------- -------- -------- 4.19.87 342.336s 602.048s 950.340s pvqsinlock 341.366s 376.135s 437.037s
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/paravirt.h | 5 +++ arch/arm64/kernel/paravirt.c | 60 +++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+)
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 1c9011d31562..f10552839663 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -31,6 +31,7 @@ static inline bool pv_vcpu_is_preempted(int cpu) }
#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) +void __init pv_qspinlock_init(void); bool pv_is_native_spin_unlock(void); DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, native_queued_spin_lock_slowpath); static inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) @@ -53,6 +54,10 @@ static inline void pv_kick(int cpu) { return; } +#else + +#define pv_qspinlock_init() do {} while (0) + #endif /* SMP && PARAVIRT_SPINLOCKS */
#else diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 844622baf6bf..19e264a553a9 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -23,6 +23,7 @@ #include <asm/paravirt.h> #include <asm/pvclock-abi.h> #include <asm/pvsched-abi.h> +#include <asm/qspinlock_paravirt.h> #include <asm/smp_plat.h>
struct static_key paravirt_steal_enabled; @@ -259,6 +260,63 @@ static bool has_kvm_pvsched(void) return (res.a0 == SMCCC_RET_SUCCESS); }
+#ifdef CONFIG_PARAVIRT_SPINLOCKS +static bool arm_pvspin; + +/* Kick a cpu by its cpuid. Used to wake up a halted vcpu */ +static void kvm_kick_cpu(int cpu) +{ + struct arm_smccc_res res; + + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res); +} + +static void kvm_wait(u8 *ptr, u8 val) +{ + unsigned long flags; + + if (in_nmi()) + return; + + local_irq_save(flags); + + if (READ_ONCE(*ptr) != val) + goto out; + + dsb(sy); + wfi(); + +out: + local_irq_restore(flags); +} + +void __init pv_qspinlock_init(void) +{ + /* Don't use the PV qspinlock code if there is only 1 vCPU. */ + if (num_possible_cpus() == 1) + arm_pvspin = false; + + if (!arm_pvspin) { + pr_info("PV qspinlocks disabled\n"); + return; + } + pr_info("PV qspinlocks enabled\n"); + + __pv_init_lock_hash(); + pv_ops.sched.queued_spin_lock_slowpath = __pv_queued_spin_lock_slowpath; + pv_ops.sched.queued_spin_unlock = __pv_queued_spin_unlock; + pv_ops.sched.wait = kvm_wait; + pv_ops.sched.kick = kvm_kick_cpu; +} + +static __init int arm_parse_pvspin(char *arg) +{ + arm_pvspin = true; + return 0; +} +early_param("arm_pvspin", arm_parse_pvspin); +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ + int __init pv_sched_init(void) { int ret; @@ -278,5 +336,7 @@ int __init pv_sched_init(void) static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); pr_info("using PV sched preempted\n");
+ pv_qspinlock_init(); + return 0; }
From: Zengruan Ye yezengruan@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------
Add tracepoints for PV qspinlock
Signed-off-by: Zengruan Ye yezengruan@huawei.com Reviewed-by: Zhanghailiang zhang.zhanghailiang@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/kernel/paravirt.c | 6 +++ arch/arm64/kernel/trace-paravirt.h | 66 ++++++++++++++++++++++++++++++ arch/arm64/kvm/pvsched.c | 3 ++ arch/arm64/kvm/trace_arm.h | 18 ++++++++ 4 files changed, 93 insertions(+) create mode 100644 arch/arm64/kernel/trace-paravirt.h
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 19e264a553a9..6ad463cadd68 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -26,6 +26,9 @@ #include <asm/qspinlock_paravirt.h> #include <asm/smp_plat.h>
+#define CREATE_TRACE_POINTS +#include "trace-paravirt.h" + struct static_key paravirt_steal_enabled; struct static_key paravirt_steal_rq_enabled;
@@ -269,6 +272,8 @@ static void kvm_kick_cpu(int cpu) struct arm_smccc_res res;
arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res); + + trace_kvm_kick_cpu("kvm kick cpu", smp_processor_id(), cpu); }
static void kvm_wait(u8 *ptr, u8 val) @@ -285,6 +290,7 @@ static void kvm_wait(u8 *ptr, u8 val)
dsb(sy); wfi(); + trace_kvm_wait("kvm wait wfi", smp_processor_id());
out: local_irq_restore(flags); diff --git a/arch/arm64/kernel/trace-paravirt.h b/arch/arm64/kernel/trace-paravirt.h new file mode 100644 index 000000000000..2d76272f39ae --- /dev/null +++ b/arch/arm64/kernel/trace-paravirt.h @@ -0,0 +1,66 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM paravirt + +#if !defined(_TRACE_PARAVIRT_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_PARAVIRT_H + +#include <linux/tracepoint.h> + +TRACE_EVENT(kvm_kick_cpu, + TP_PROTO(const char *name, int cpu, int target), + TP_ARGS(name, cpu, target), + + TP_STRUCT__entry( + __string(name, name) + __field(int, cpu) + __field(int, target) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->cpu = cpu; + __entry->target = target; + ), + + TP_printk("PV qspinlock: %s, cpu %d kick target cpu %d", + __get_str(name), + __entry->cpu, + __entry->target + ) +); + +TRACE_EVENT(kvm_wait, + TP_PROTO(const char *name, int cpu), + TP_ARGS(name, cpu), + + TP_STRUCT__entry( + __string(name, name) + __field(int, cpu) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->cpu = cpu; + ), + + TP_printk("PV qspinlock: %s, cpu %d wait kvm access wfi", + __get_str(name), + __entry->cpu + ) +); + +#endif /* _TRACE_PARAVIRT_H */ + +/* This part must be outside protection */ +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE +#define TRACE_INCLUDE_PATH ../../../arch/arm64/kernel/ +#define TRACE_INCLUDE_FILE trace-paravirt + +#include <trace/define_trace.h> diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 49d03fd6f4ad..bcfd4760b491 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -11,6 +11,8 @@
#include <kvm/arm_hypercalls.h>
+#include "trace.h" + void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) { struct kvm *kvm = vcpu->kvm; @@ -53,6 +55,7 @@ long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu) kvm_vcpu_yield_to(target);
val = SMCCC_RET_SUCCESS; + trace_kvm_pvsched_kick_vcpu(vcpu->vcpu_id, target->vcpu_id);
out: return val; diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h index 8ad53104934d..680c7e39af4a 100644 --- a/arch/arm64/kvm/trace_arm.h +++ b/arch/arm64/kvm/trace_arm.h @@ -390,6 +390,24 @@ TRACE_EVENT(kvm_forward_sysreg_trap, sys_reg_Op2(__entry->sysreg)) );
+TRACE_EVENT(kvm_pvsched_kick_vcpu, + TP_PROTO(int vcpu_id, int target_vcpu_id), + TP_ARGS(vcpu_id, target_vcpu_id), + + TP_STRUCT__entry( + __field(int, vcpu_id) + __field(int, target_vcpu_id) + ), + + TP_fast_assign( + __entry->vcpu_id = vcpu_id; + __entry->target_vcpu_id = target_vcpu_id; + ), + + TP_printk("PV qspinlock: vcpu %d kick target vcpu %d", + __entry->vcpu_id, __entry->target_vcpu_id) +); + #endif /* _TRACE_ARM_ARM64_KVM_H */
#undef TRACE_INCLUDE_PATH
From: yezengruan yezengruan@huawei.com
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
--------------------------------------------
1. Clear kvm_vcpu_arch::pvsched::base on vcpu reset The guest memory will otherwise be corrupted by KVM if we reboot into an old guest kernel which is **not** aware of the pvsched feature.
2. Fix boot cpu pvsched init abnormal The pv_sched_init was called too early in the boot in setup_arch, hence pvsched_vcpu_state was not initializedfo vcpu 0.
Signed-off-by: Zenghui Yu yuzenghui@huawei.com Signed-off-by: yezengruan yezengruan@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/kernel/paravirt.c | 1 + arch/arm64/kernel/setup.c | 2 -- arch/arm64/kvm/arm.c | 2 ++ 3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 6ad463cadd68..c5fc82372e47 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -346,3 +346,4 @@ int __init pv_sched_init(void)
return 0; } +early_initcall(pv_sched_init); diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 0255788cb9c4..645034e52496 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -416,8 +416,6 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p) smp_init_cpus(); smp_build_mpidr_hash();
- pv_sched_init(); - /* Init percpu seeds for random tags after cpus are set up. */ kasan_init_sw_tags();
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 735cc8393a55..19936ab415b9 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1351,6 +1351,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
spin_unlock(&vcpu->arch.mp_state_lock);
+ kvm_arm_pvsched_vcpu_init(&vcpu->arch); + return 0; }
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
Support PV qspinlock feature after pv.ops removed
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/paravirt.h | 16 ++++++++++++---- arch/arm64/kernel/paravirt.c | 22 ++++++++++++++++++---- 2 files changed, 30 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index f10552839663..a29eeffa49aa 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -33,26 +33,34 @@ static inline bool pv_vcpu_is_preempted(int cpu) #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) void __init pv_qspinlock_init(void); bool pv_is_native_spin_unlock(void); -DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, native_queued_spin_lock_slowpath); + +void dummy_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, + dummy_queued_spin_lock_slowpath); static inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) { return static_call(pv_qspinlock_queued_spin_lock_slowpath)(lock, val); }
-DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, native_queued_spin_unlock); +void dummy_queued_spin_unlock(struct qspinlock *lock); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, dummy_queued_spin_unlock); static inline void pv_queued_spin_unlock(struct qspinlock *lock) { return static_call(pv_qspinlock_queued_spin_unlock)(lock); }
+void dummy_wait(u8 *ptr, u8 val); +DECLARE_STATIC_CALL(pv_qspinlock_wait, dummy_wait); static inline void pv_wait(u8 *ptr, u8 val) { - return; + return static_call(pv_qspinlock_wait)(ptr, val); }
+void dummy_kick(int cpu); +DECLARE_STATIC_CALL(pv_qspinlock_kick, dummy_kick); static inline void pv_kick(int cpu) { - return; + return static_call(pv_qspinlock_kick)(cpu); } #else
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index c5fc82372e47..73a8b9886775 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -296,6 +296,17 @@ static void kvm_wait(u8 *ptr, u8 val) local_irq_restore(flags); }
+DEFINE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, + native_queued_spin_lock_slowpath); +DEFINE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, native_queued_spin_unlock); +DEFINE_STATIC_CALL(pv_qspinlock_wait, kvm_wait); +DEFINE_STATIC_CALL(pv_qspinlock_kick, kvm_kick_cpu); + +EXPORT_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath); +EXPORT_STATIC_CALL(pv_qspinlock_queued_spin_unlock); +EXPORT_STATIC_CALL(pv_qspinlock_wait); +EXPORT_STATIC_CALL(pv_qspinlock_kick); + void __init pv_qspinlock_init(void) { /* Don't use the PV qspinlock code if there is only 1 vCPU. */ @@ -309,10 +320,13 @@ void __init pv_qspinlock_init(void) pr_info("PV qspinlocks enabled\n");
__pv_init_lock_hash(); - pv_ops.sched.queued_spin_lock_slowpath = __pv_queued_spin_lock_slowpath; - pv_ops.sched.queued_spin_unlock = __pv_queued_spin_unlock; - pv_ops.sched.wait = kvm_wait; - pv_ops.sched.kick = kvm_kick_cpu; + + static_call_update(pv_qspinlock_queued_spin_lock_slowpath, + __pv_queued_spin_lock_slowpath); + static_call_update(pv_qspinlock_queued_spin_unlock, + __pv_queued_spin_unlock); + static_call_update(pv_qspinlock_wait, kvm_wait); + static_call_update(pv_qspinlock_kick, kvm_kick_cpu); }
static __init int arm_parse_pvspin(char *arg)
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
Add configuration for feature pv-sched
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/Kconfig | 7 ++++ arch/arm64/include/asm/kvm_host.h | 4 +++ arch/arm64/include/asm/paravirt.h | 2 ++ arch/arm64/kernel/paravirt.c | 54 ++++++++++++++++--------------- arch/arm64/kvm/arm.c | 13 ++++++++ arch/arm64/kvm/handle_exit.c | 2 ++ arch/arm64/kvm/hypercalls.c | 4 +++ arch/arm64/kvm/pvsched.c | 2 ++ 8 files changed, 62 insertions(+), 26 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index c21b663b34d2..b518783557ea 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1561,6 +1561,13 @@ config PARAVIRT_SPINLOCKS
If you are unsure how to answer this question, answer Y.
+config PARAVIRT_SCHED + bool "Paravirtualization layer for sched" + depends on PARAVIRT + help + + If you are unsure how to answer this question, answer Y. + config PARAVIRT_TIME_ACCOUNTING bool "Paravirtual steal time accounting" select PARAVIRT diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index d74310f1e569..5a5ed211ea35 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -589,11 +589,13 @@ struct kvm_vcpu_arch { gpa_t base; } steal;
+#ifdef CONFIG_PARAVIRT_SCHED /* Guest PV sched state */ struct { bool pv_unhalted; gpa_t base; } pvsched; +#endif
/* Per-vcpu CCSIDR override or NULL */ u32 *ccsidr; @@ -1044,6 +1046,7 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) return (vcpu_arch->steal.base != INVALID_GPA); }
+#ifdef CONFIG_PARAVIRT_SCHED long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted); long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu); @@ -1057,6 +1060,7 @@ static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch) { return (vcpu_arch->pvsched.base != INVALID_GPA); } +#endif
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index a29eeffa49aa..1ce7187eed48 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,7 +20,9 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+#ifdef CONFIG_PARAVIRT_SCHED int __init pv_sched_init(void); +#endif
__visible bool __native_vcpu_is_preempted(int cpu); DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 73a8b9886775..126dff610e07 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -180,7 +180,7 @@ int __init pv_time_init(void) return 0; }
- +#ifdef CONFIG_PARAVIRT_SCHED DEFINE_PER_CPU(struct pvsched_vcpu_state, pvsched_vcpu_region) __aligned(64); EXPORT_PER_CPU_SYMBOL(pvsched_vcpu_region);
@@ -263,6 +263,33 @@ static bool has_kvm_pvsched(void) return (res.a0 == SMCCC_RET_SUCCESS); }
+int __init pv_sched_init(void) +{ + int ret; + + if (is_hyp_mode_available()) + return 0; + + if (!has_kvm_pvsched()) { + pr_warn("PV sched is not available\n"); + return 0; + } + + ret = kvm_arm_init_pvsched(); + if (ret) + return ret; + + static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); + pr_info("using PV sched preempted\n"); + + pv_qspinlock_init(); + + return 0; +} + +early_initcall(pv_sched_init); +#endif /* CONFIG_PARAVIRT_SCHED */ + #ifdef CONFIG_PARAVIRT_SPINLOCKS static bool arm_pvspin;
@@ -336,28 +363,3 @@ static __init int arm_parse_pvspin(char *arg) } early_param("arm_pvspin", arm_parse_pvspin); #endif /* CONFIG_PARAVIRT_SPINLOCKS */ - -int __init pv_sched_init(void) -{ - int ret; - - if (is_hyp_mode_available()) - return 0; - - if (!has_kvm_pvsched()) { - pr_warn("PV sched is not available\n"); - return 0; - } - - ret = kvm_arm_init_pvsched(); - if (ret) - return ret; - - static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); - pr_info("using PV sched preempted\n"); - - pv_qspinlock_init(); - - return 0; -} -early_initcall(pv_sched_init); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 19936ab415b9..c128a97fcc88 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -386,7 +386,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
kvm_arm_pvtime_vcpu_init(&vcpu->arch);
+#ifdef CONFIG_PARAVIRT_SCHED kvm_arm_pvsched_vcpu_init(&vcpu->arch); +#endif
vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
@@ -469,8 +471,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus)) vcpu_set_on_unsupported_cpu(vcpu);
+#ifdef CONFIG_PARAVIRT_SCHED if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) kvm_update_pvsched_preempted(vcpu, 0); +#endif
#ifdef CONFIG_KVM_HISI_VIRT kvm_hisi_dvmbm_load(vcpu); @@ -491,8 +495,10 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) vcpu_clear_on_unsupported_cpu(vcpu); vcpu->cpu = -1;
+#ifdef CONFIG_PARAVIRT_SCHED if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) kvm_update_pvsched_preempted(vcpu, 1); +#endif
#ifdef CONFIG_KVM_HISI_VIRT kvm_hisi_dvmbm_put(vcpu); @@ -574,10 +580,15 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF); +#ifdef CONFIG_PARAVIRT_SCHED bool pv_unhalted = v->arch.pvsched.pv_unhalted;
return ((irq_lines || kvm_vgic_vcpu_pending_irq(v) || pv_unhalted) && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); +#else + return ((irq_lines || kvm_vgic_vcpu_pending_irq(v)) + && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); +#endif }
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu) @@ -1351,7 +1362,9 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
spin_unlock(&vcpu->arch.mp_state_lock);
+#ifdef CONFIG_PARAVIRT_SCHED kvm_arm_pvsched_vcpu_init(&vcpu->arch); +#endif
return 0; } diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index b9a44c3bebb7..f4671ae23b1f 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -121,7 +121,9 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu) } else { trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false); vcpu->stat.wfi_exit_stat++; +#ifdef CONFIG_PARAVIRT_SCHED vcpu->arch.pvsched.pv_unhalted = false; +#endif }
if (esr & ESR_ELx_WFx_ISS_WFxT) { diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 46def0572461..2e7a1d06e6ff 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -332,9 +332,11 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) &smccc_feat->std_hyp_bmap)) val[0] = SMCCC_RET_SUCCESS; break; +#ifdef CONFIG_PARAVIRT_SCHED case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = SMCCC_RET_SUCCESS; break; +#endif } break; case ARM_SMCCC_HV_PV_TIME_FEATURES: @@ -363,6 +365,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_TRNG_RND32: case ARM_SMCCC_TRNG_RND64: return kvm_trng_call(vcpu); +#ifdef CONFIG_PARAVIRT_SCHED case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = kvm_hypercall_pvsched_features(vcpu); break; @@ -380,6 +383,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: val[0] = kvm_pvsched_kick_vcpu(vcpu); break; +#endif default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index bcfd4760b491..eae6ad7b6c5d 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -4,6 +4,7 @@ * Author: Zengruan Ye yezengruan@huawei.com */
+#ifdef CONFIG_PARAVIRT_SCHED #include <linux/arm-smccc.h> #include <linux/kvm_host.h>
@@ -77,3 +78,4 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu)
return val; } +#endif /* CONFIG_PARAVIRT_SCHED */
From: chenjiajun chenjiajun8@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
This patch export remaining aarch64 exit items to vcpu_stat via debugfs. The items include: fp_asimd_exit_stat, irq_exit_stat, sys64_exit_stat, mabt_exit_stat, fail_entry_exit_stat, internal_error_exit_stat, unknown_ec_exit_stat, cp15_32_exit_stat, cp15_64_exit_stat, cp14_mr_exit_stat, cp14_ls_exit_stat, cp14_64_exit_stat, smc_exit_stat, sve_exit_stat, debug_exit_stat
Signed-off-by: Biaoxiang Ye yebiaoxiang@huawei.com Signed-off-by: Zengruan Ye yezengruan@huawei.com Signed-off-by: chenjiajun chenjiajun8@huawei.com Reviewed-by: Xiangyou Xie xiexiangyou@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com Signed-off-by: Chen Jun chenjun102@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/include/asm/kvm_host.h | 15 +++++++++++++++ arch/arm64/kvm/handle_exit.c | 9 +++++++++ arch/arm64/kvm/hyp/include/hyp/switch.h | 1 + arch/arm64/kvm/mmu.c | 1 + arch/arm64/kvm/sys_regs.c | 11 +++++++++++ 5 files changed, 37 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 5a5ed211ea35..37e0e434c505 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -913,6 +913,21 @@ struct kvm_vcpu_stat { u64 mmio_exit_kernel; u64 signal_exits; u64 exits; + u64 fp_asimd_exit_stat; + u64 irq_exit_stat; + u64 sys64_exit_stat; + u64 mabt_exit_stat; + u64 fail_entry_exit_stat; + u64 internal_error_exit_stat; + u64 unknown_ec_exit_stat; + u64 cp15_32_exit_stat; + u64 cp15_64_exit_stat; + u64 cp14_mr_exit_stat; + u64 cp14_ls_exit_stat; + u64 cp14_64_exit_stat; + u64 smc_exit_stat; + u64 sve_exit_stat; + u64 debug_exit_stat; };
unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index f4671ae23b1f..65d780887ab2 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -73,6 +73,7 @@ static int handle_smc(struct kvm_vcpu *vcpu) */ if (kvm_vcpu_hvc_get_imm(vcpu)) { vcpu_set_reg(vcpu, 0, ~0UL); + vcpu->stat.smc_exit_stat++; return 1; }
@@ -176,6 +177,8 @@ static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu) run->debug.arch.hsr_high = upper_32_bits(esr); run->flags = KVM_DEBUG_ARCH_HSR_HIGH_VALID;
+ vcpu->stat.debug_exit_stat++; + switch (ESR_ELx_EC(esr)) { case ESR_ELx_EC_WATCHPT_LOW: run->debug.arch.far = vcpu->arch.fault.far_el2; @@ -196,6 +199,7 @@ static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu) esr, esr_get_class_string(esr));
kvm_inject_undefined(vcpu); + vcpu->stat.unknown_ec_exit_stat++; return 1; }
@@ -206,6 +210,7 @@ static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu) static int handle_sve(struct kvm_vcpu *vcpu) { kvm_inject_undefined(vcpu); + vcpu->stat.sve_exit_stat++; return 1; }
@@ -338,6 +343,7 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
switch (exception_index) { case ARM_EXCEPTION_IRQ: + vcpu->stat.irq_exit_stat++; return 1; case ARM_EXCEPTION_EL1_SERROR: return 1; @@ -349,6 +355,7 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index) * is pre-emptied by kvm_reboot()'s shutdown call. */ run->exit_reason = KVM_EXIT_FAIL_ENTRY; + vcpu->stat.fail_entry_exit_stat++; return 0; case ARM_EXCEPTION_IL: /* @@ -356,11 +363,13 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index) * have been corrupted somehow. Give up. */ run->exit_reason = KVM_EXIT_FAIL_ENTRY; + vcpu->stat.fail_entry_exit_stat++; return -EINVAL; default: kvm_pr_unimpl("Unsupported exception type: %d", exception_index); run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + vcpu->stat.internal_error_exit_stat++; return 0; } } diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index 9cfe6bd1dbe4..c91e641346d6 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -294,6 +294,7 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) /* Only handle traps the vCPU can support here: */ switch (esr_ec) { case ESR_ELx_EC_FP_ASIMD: + vcpu->stat.fp_asimd_exit_stat++; break; case ESR_ELx_EC_SVE: if (!sve_guest) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 482280fe22d7..121a3d90240d 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1416,6 +1416,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, write_fault = kvm_is_write_fault(vcpu); exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu); VM_BUG_ON(write_fault && exec_fault); + vcpu->stat.mabt_exit_stat++;
if (fault_status == ESR_ELx_FSC_PERM && !write_fault && !exec_fault) { kvm_err("Unexpected L2 read permission error\n"); diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 0afd6136e275..8a92e848a1e6 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2774,6 +2774,8 @@ static bool check_sysreg_table(const struct sys_reg_desc *table, unsigned int n, int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu) { kvm_inject_undefined(vcpu); + vcpu->stat.cp14_ls_exit_stat++; + return 1; }
@@ -3050,6 +3052,8 @@ static int kvm_handle_cp_32(struct kvm_vcpu *vcpu,
int kvm_handle_cp15_64(struct kvm_vcpu *vcpu) { + vcpu->stat.cp15_64_exit_stat++; + return kvm_handle_cp_64(vcpu, cp15_64_regs, ARRAY_SIZE(cp15_64_regs)); }
@@ -3059,6 +3063,8 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu)
params = esr_cp1x_32_to_params(kvm_vcpu_get_esr(vcpu));
+ vcpu->stat.cp15_32_exit_stat++; + /* * Certain AArch32 ID registers are handled by rerouting to the AArch64 * system register table. Registers in the ID range where CRm=0 are @@ -3073,6 +3079,8 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu)
int kvm_handle_cp14_64(struct kvm_vcpu *vcpu) { + vcpu->stat.cp14_64_exit_stat++; + return kvm_handle_cp_64(vcpu, cp14_64_regs, ARRAY_SIZE(cp14_64_regs)); }
@@ -3082,6 +3090,8 @@ int kvm_handle_cp14_32(struct kvm_vcpu *vcpu)
params = esr_cp1x_32_to_params(kvm_vcpu_get_esr(vcpu));
+ vcpu->stat.cp14_mr_exit_stat++; + return kvm_handle_cp_32(vcpu, ¶ms, cp14_regs, ARRAY_SIZE(cp14_regs)); }
@@ -3178,6 +3188,7 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu) int Rt = kvm_vcpu_sys_get_rt(vcpu);
trace_kvm_handle_sys_reg(esr); + vcpu->stat.sys64_exit_stat++;
if (__check_nv_sr_forward(vcpu)) return 1;
From: Zenghui Yu yuzenghui@huawei.com
virt inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU CVE: NA
----------------------------------------------------
Currently fp_asimd_exit_stat is accumulated for *both* FP/ASIMD and SVE traps so that user can not distinguish between these two via debugfs.
Fix the manipulation for both exception classes.
Signed-off-by: Zenghui Yu yuzenghui@huawei.com Reviewed-by: Keqian Zhu zhukeqian1@huawei.com Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com
Signed-off-by: lishusen lishusen2@huawei.com --- arch/arm64/kvm/hyp/include/hyp/switch.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index c91e641346d6..61907949e00f 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -297,6 +297,7 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) vcpu->stat.fp_asimd_exit_stat++; break; case ESR_ELx_EC_SVE: + vcpu->stat.sve_exit_stat++; if (!sve_guest) return false; break;
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/4034 邮件列表地址:https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/4...
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/4034 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/4...