Support PV sched and PV qspinlock
Zengruan Ye (9):
  KVM: arm64: Document PV-sched interface
  KVM: arm64: Implement PV_SCHED_FEATURES call
  KVM: arm64: Support pvsched preempted via shared structure
  KVM: arm64: Add interface to support vCPU preempted check
  KVM: arm64: Support the vCPU preemption check
  KVM: arm64: Add SMCCC PV-sched to kick cpu
  KVM: arm64: Implement PV_SCHED_KICK_CPU call
  KVM: arm64: Enable PV qspinlock
  KVM: arm64: Add tracepoints for PV qspinlock
lishusen (1):
  KVM: arm64: Add configuration for feature pv-sched
 Documentation/virt/kvm/arm/pvsched.rst      |  74 ++++++++
 arch/arm64/Kconfig                          |  20 +++
 arch/arm64/include/asm/Kbuild               |   1 -
 arch/arm64/include/asm/kvm_host.h           |  24 +++
 arch/arm64/include/asm/paravirt.h           |  51 ++++++
 arch/arm64/include/asm/pvsched-abi.h        |  16 ++
 arch/arm64/include/asm/qspinlock.h          |  48 +++++
 arch/arm64/include/asm/qspinlock_paravirt.h |  12 ++
 arch/arm64/include/asm/spinlock.h           |  13 ++
 arch/arm64/kernel/Makefile                  |   3 +-
 arch/arm64/kernel/paravirt-spinlocks.c      |  21 +++
 arch/arm64/kernel/paravirt.c                | 189 ++++++++++++++++++++
 arch/arm64/kernel/trace-paravirt.h          |  66 +++++++
 arch/arm64/kvm/Makefile                     |   2 +-
 arch/arm64/kvm/arm.c                        |  26 +++
 arch/arm64/kvm/handle_exit.c                |   3 +
 arch/arm64/kvm/hypercalls.c                 |  24 +++
 arch/arm64/kvm/pvsched.c                    |  81 +++++++++
 arch/arm64/kvm/trace_arm.h                  |  18 ++
 include/linux/arm-smccc.h                   |  25 +++
 include/linux/cpuhotplug.h                  |   1 +
 21 files changed, 715 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/virt/kvm/arm/pvsched.rst
 create mode 100644 arch/arm64/include/asm/pvsched-abi.h
 create mode 100644 arch/arm64/include/asm/qspinlock.h
 create mode 100644 arch/arm64/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/arm64/kernel/paravirt-spinlocks.c
 create mode 100644 arch/arm64/kernel/trace-paravirt.h
 create mode 100644 arch/arm64/kvm/pvsched.c
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
Introduce PV-sched, a paravirtualization interface for KVM/arm64.
A hypercall interface is provided for the guest to interrogate the hypervisor's support for this interface and the location of the shared memory structures.
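As a sketch of what guest-side discovery could look like (this mirrors the
has_kvm_pvsched() helper added later in this series; it is not part of this
patch, and guest_has_pvsched() is a hypothetical name):

	static bool guest_has_pvsched(void)
	{
		struct arm_smccc_res res;

		/* PV-sched discovery requires an SMCCC 1.1 conduit */
		if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE)
			return false;

		arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
				     ARM_SMCCC_HV_PV_SCHED_FEATURES, &res);

		return res.a0 == SMCCC_RET_SUCCESS;
	}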
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 Documentation/virt/kvm/arm/pvsched.rst | 58 ++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)
 create mode 100644 Documentation/virt/kvm/arm/pvsched.rst
diff --git a/Documentation/virt/kvm/arm/pvsched.rst b/Documentation/virt/kvm/arm/pvsched.rst
new file mode 100644
index 000000000000..8f7112a8a9cd
--- /dev/null
+++ b/Documentation/virt/kvm/arm/pvsched.rst
@@ -0,0 +1,58 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Paravirtualized sched support for arm64
+=======================================
+
+KVM/arm64 provides some hypervisor service calls to support a paravirtualized
+sched.
+
+Some SMCCC compatible hypercalls are defined:
+
+* PV_SCHED_FEATURES: 0xC5000090
+* PV_SCHED_IPA_INIT: 0xC5000091
+* PV_SCHED_IPA_RELEASE: 0xC5000092
+
+The existence of the PV_SCHED hypercall should be probed using the SMCCC 1.1
+ARCH_FEATURES mechanism before calling it.
+
+PV_SCHED_FEATURES
+    ============= ========    ==========
+    Function ID:  (uint32)    0xC5000090
+    PV_call_id:   (uint32)    The function to query for support.
+    Return value: (int64)     NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant
+                              PV-sched feature is supported by the hypervisor.
+    ============= ========    ==========
+
+PV_SCHED_IPA_INIT
+    ============= ========    ==========
+    Function ID:  (uint32)    0xC5000091
+    Return value: (int64)     NOT_SUPPORTED (-1) or SUCCESS (0) if the IPA of
+                              this vCPU's PV data structure is shared to the
+                              hypervisor.
+    ============= ========    ==========
+
+PV_SCHED_IPA_RELEASE
+    ============= ========    ==========
+    Function ID:  (uint32)    0xC5000092
+    Return value: (int64)     NOT_SUPPORTED (-1) or SUCCESS (0) if the IPA of
+                              this vCPU's PV data structure is released.
+    ============= ========    ==========
+
+PV sched state
+--------------
+
+The structure pointed to by the PV_SCHED_IPA_INIT hypercall is as follows:
+
++-----------+-------------+-------------+-----------------------------------+
+| Field     | Byte Length | Byte Offset | Description                       |
++===========+=============+=============+===================================+
+| preempted | 4           | 0           | Indicates whether the vCPU that   |
+|           |             |             | owns this struct is preempted.    |
+|           |             |             | Non-zero values mean the vCPU has |
+|           |             |             | been preempted. Zero means the    |
+|           |             |             | vCPU is not preempted.            |
++-----------+-------------+-------------+-----------------------------------+
+
+The preempted field will be updated to 0 by the hypervisor prior to scheduling
+a vCPU. When the vCPU is scheduled out, the preempted field will be updated
+to 1 by the hypervisor.
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
This provides a mechanism for querying which paravirtualized sched features are available in this hypervisor.
Add some SMCCC compatible hypercalls for PV sched features:

  PV_SCHED_FEATURES:    0xC5000090
  PV_SCHED_IPA_INIT:    0xC5000091
  PV_SCHED_IPA_RELEASE: 0xC5000092
Also add the header file which defines the ABI for the paravirtualized sched features we're about to add.
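For reference, these function IDs follow the standard SMCCC encoding; as a
worked example, 0xC5000090 decomposes as:

	/*
	 * ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64,
	 *		      ARM_SMCCC_OWNER_STANDARD_HYP, 0x90):
	 *
	 *   fast call   bit 31       0x80000000
	 *   SMC64       bit 30       0x40000000
	 *   owner       bits 29:24   5 (standard hyp) -> 0x05000000
	 *   function    bits 15:0    0x0090
	 *                            ----------
	 *                            0xC5000090
	 */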
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h    |  2 ++
 arch/arm64/include/asm/pvsched-abi.h | 16 ++++++++++++++++
 arch/arm64/kvm/Makefile              |  2 +-
 arch/arm64/kvm/hypercalls.c          |  6 ++++++
 arch/arm64/kvm/pvsched.c             | 23 +++++++++++++++++++++++
 include/linux/arm-smccc.h            | 19 +++++++++++++++++++
 6 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/pvsched-abi.h
 create mode 100644 arch/arm64/kvm/pvsched.c
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 714a15e9adc2..f1b54be07620 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1051,6 +1051,8 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) return (vcpu_arch->steal.base != INVALID_GPA); }
+long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu);
+
 void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr); diff --git a/arch/arm64/include/asm/pvsched-abi.h b/arch/arm64/include/asm/pvsched-abi.h new file mode 100644 index 000000000000..80e50e7a1a31 --- /dev/null +++ b/arch/arm64/include/asm/pvsched-abi.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef __ASM_PVSCHED_ABI_H +#define __ASM_PVSCHED_ABI_H + +struct pvsched_vcpu_state { + __le32 preempted; + /* Structure must be 64 byte aligned, pad to that size */ + u8 padding[60]; +} __packed; + +#endif diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index 826a05d072d7..fe63a91a4c54 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -10,7 +10,7 @@ include $(srctree)/virt/kvm/Makefile.kvm obj-$(CONFIG_KVM) += kvm.o obj-$(CONFIG_KVM) += hyp/
-kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ +kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o pvsched.o \ inject_fault.o va_layout.o handle_exit.o \ guest.o debug.o reset.o sys_regs.o stacktrace.o \ vgic-sys-reg-v3.o fpsimd.o pkvm.o \ diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 7fb4df0456de..190670d8dd3f 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -332,6 +332,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) &smccc_feat->std_hyp_bmap)) val[0] = SMCCC_RET_SUCCESS; break; + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val[0] = SMCCC_RET_SUCCESS; + break; } break; case ARM_SMCCC_HV_PV_TIME_FEATURES: @@ -360,6 +363,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_TRNG_RND32: case ARM_SMCCC_TRNG_RND64: return kvm_trng_call(vcpu); + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val[0] = kvm_hypercall_pvsched_features(vcpu); + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c new file mode 100644 index 000000000000..3d96122fcf9e --- /dev/null +++ b/arch/arm64/kvm/pvsched.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#include <linux/arm-smccc.h> + +#include <kvm/arm_hypercalls.h> + +long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) +{ + u32 feature = smccc_get_arg1(vcpu); + long val = SMCCC_RET_NOT_SUPPORTED; + + switch (feature) { + case ARM_SMCCC_HV_PV_SCHED_FEATURES: + val = SMCCC_RET_SUCCESS; + break; + } + + return val; +} diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index 083f85653716..daab363d31ae 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -577,5 +577,24 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, method; \ })
+/* Paravirtualised sched calls */
+#define ARM_SMCCC_HV_PV_SCHED_FEATURES				\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\
+			   ARM_SMCCC_SMC_64,			\
+			   ARM_SMCCC_OWNER_STANDARD_HYP,	\
+			   0x90)
+
+#define ARM_SMCCC_HV_PV_SCHED_IPA_INIT				\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\
+			   ARM_SMCCC_SMC_64,			\
+			   ARM_SMCCC_OWNER_STANDARD_HYP,	\
+			   0x91)
+
+#define ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE			\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\
+			   ARM_SMCCC_SMC_64,			\
+			   ARM_SMCCC_OWNER_STANDARD_HYP,	\
+			   0x92)
+
 #endif /*__ASSEMBLY__*/
 #endif /*__LINUX_ARM_SMCCC_H*/
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
Implement the service call for configuring a shared structure between a vCPU
and the hypervisor, in which the hypervisor can tell the guest whether the
vCPU is running or not.
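A guest would share the per-vCPU structure roughly like this (a sketch only:
pv_state and share_pvsched_state() are hypothetical names; the real guest
plumbing, init_pvsched_vcpu_state(), arrives in a later patch of this series):

	DEFINE_PER_CPU(struct pvsched_vcpu_state, pv_state) __aligned(64);

	static void share_pvsched_state(void)
	{
		struct arm_smccc_res res;

		/* Pass the structure's IPA to the hypervisor */
		arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_IPA_INIT,
				     virt_to_phys(this_cpu_ptr(&pv_state)), &res);
	}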
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h | 16 ++++++++++++++++
 arch/arm64/kvm/arm.c              |  9 +++++++++
 arch/arm64/kvm/hypercalls.c       | 11 +++++++++++
 arch/arm64/kvm/pvsched.c          | 28 ++++++++++++++++++++++++++++
 4 files changed, 64 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index f1b54be07620..1ad1a18caceb 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -596,6 +596,11 @@ struct kvm_vcpu_arch { gpa_t base; } steal;
+	/* Guest PV sched state */
+	struct {
+		gpa_t base;
+	} pvsched;
+
 	/* Per-vcpu CCSIDR override or NULL */
 	u32 *ccsidr;
@@ -1052,6 +1057,17 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) }
 long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu);
+void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted);
+
+static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
+{
+	vcpu_arch->pvsched.base = INVALID_GPA;
+}
+
+static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch)
+{
+	return (vcpu_arch->pvsched.base != INVALID_GPA);
+}
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 6b30c9698d13..9750bfba6f4c 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -408,6 +408,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
kvm_arm_pvtime_vcpu_init(&vcpu->arch);
+	kvm_arm_pvsched_vcpu_init(&vcpu->arch);
+
 	vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
err = kvm_vgic_vcpu_init(vcpu); @@ -489,11 +491,15 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (vcpu_has_ptrauth(vcpu)) vcpu_ptrauth_disable(vcpu); + kvm_arch_vcpu_load_debug_state_flags(vcpu);
if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus)) vcpu_set_on_unsupported_cpu(vcpu);
+	if (kvm_arm_is_pvsched_enabled(&vcpu->arch))
+		kvm_update_pvsched_preempted(vcpu, 0);
+
 	kvm_tlbi_dvmbm_vcpu_load(vcpu);
 }
@@ -511,6 +517,9 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) vcpu_clear_on_unsupported_cpu(vcpu); vcpu->cpu = -1;
+	if (kvm_arm_is_pvsched_enabled(&vcpu->arch))
+		kvm_update_pvsched_preempted(vcpu, 1);
+
 	kvm_tlbi_dvmbm_vcpu_put(vcpu);
 }
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 190670d8dd3f..a96b88d90cf8 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -366,6 +366,17 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = kvm_hypercall_pvsched_features(vcpu); break; + case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: + gpa = smccc_get_arg1(vcpu); + if (gpa != INVALID_GPA) { + vcpu->arch.pvsched.base = gpa; + val[0] = SMCCC_RET_SUCCESS; + } + break; + case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: + vcpu->arch.pvsched.base = INVALID_GPA; + val[0] = SMCCC_RET_SUCCESS; + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 3d96122fcf9e..06290c831101 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -5,9 +5,35 @@ */
 #include <linux/arm-smccc.h>
+#include <linux/kvm_host.h>
+
+#include <asm/pvsched-abi.h>
#include <kvm/arm_hypercalls.h>
+void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) +{ + struct kvm *kvm = vcpu->kvm; + u64 base = vcpu->arch.pvsched.base; + u64 offset = offsetof(struct pvsched_vcpu_state, preempted); + int idx; + + if (base == INVALID_GPA) + return; + + /* + * This function is called from atomic context, so we need to + * disable page faults. + */ + pagefault_disable(); + + idx = srcu_read_lock(&kvm->srcu); + kvm_put_guest(kvm, base + offset, cpu_to_le32(preempted)); + srcu_read_unlock(&kvm->srcu, idx); + + pagefault_enable(); +} + long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) { u32 feature = smccc_get_arg1(vcpu); @@ -15,6 +41,8 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu)
switch (feature) { case ARM_SMCCC_HV_PV_SCHED_FEATURES: + case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: + case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: val = SMCCC_RET_SUCCESS; break; }
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
This is to fix some lock holder preemption issues. Some lock implementations
do a spin loop before acquiring the lock itself. The kernel already has an
interface for this, bool vcpu_is_preempted(int cpu): it takes a CPU id as its
parameter and returns true if that vCPU is preempted, so the kernel can break
out of such spin loops based on its return value.

Since the kernel already uses this interface, let's support it on arm64.
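The pattern this enables looks roughly like the following (a sketch only;
trylock() and owner_cpu are placeholders for whatever the lock implementation
provides):

	while (!trylock(lock)) {
		/* Holder is off-CPU: spinning is wasted work, back off */
		if (vcpu_is_preempted(owner_cpu))
			break;
		cpu_relax();
	}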
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/paravirt.h      |  8 ++++++++
 arch/arm64/include/asm/spinlock.h      | 10 ++++++++++
 arch/arm64/kernel/Makefile             |  2 +-
 arch/arm64/kernel/paravirt-spinlocks.c | 16 ++++++++++++++++
 4 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/paravirt-spinlocks.c
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 9aa193e0e8f2..53282dc429f8 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,6 +20,14 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+__visible bool __native_vcpu_is_preempted(int cpu);
+DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted);
+
+static inline bool pv_vcpu_is_preempted(int cpu)
+{
+	return static_call(pv_vcpu_preempted)(cpu);
+}
+
 #else
#define pv_time_init() do {} while (0) diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h index 0525c0b089ed..3dcb1c903ba0 100644 --- a/arch/arm64/include/asm/spinlock.h +++ b/arch/arm64/include/asm/spinlock.h @@ -7,6 +7,7 @@
#include <asm/qspinlock.h> #include <asm/qrwlock.h> +#include <asm/paravirt.h>
/* See include/linux/spinlock.h */ #define smp_mb__after_spinlock() smp_mb() @@ -19,9 +20,18 @@ * https://lore.kernel.org/lkml/20200110100612.GC2827@hirez.programming.kicks-a... */ #define vcpu_is_preempted vcpu_is_preempted +#ifdef CONFIG_PARAVIRT +static inline bool vcpu_is_preempted(int cpu) +{ + return pv_vcpu_is_preempted(cpu); +} + +#else + static inline bool vcpu_is_preempted(int cpu) { return false; } +#endif /* CONFIG_PARAVIRT */
#endif /* __ASM_SPINLOCK_H */ diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 31859e7741a5..235963ed87b1 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -60,7 +60,7 @@ obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o -obj-$(CONFIG_PARAVIRT) += paravirt.o +obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt-spinlocks.o obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/ obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o obj-$(CONFIG_ELF_CORE) += elfcore.o diff --git a/arch/arm64/kernel/paravirt-spinlocks.c b/arch/arm64/kernel/paravirt-spinlocks.c new file mode 100644 index 000000000000..3566897070d9 --- /dev/null +++ b/arch/arm64/kernel/paravirt-spinlocks.c @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#include <linux/static_call.h> +#include <linux/spinlock.h> +#include <asm/paravirt.h> + +__visible bool __native_vcpu_is_preempted(int cpu) +{ + return false; +} + +DEFINE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted);
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
Support the vcpu_is_preempted() functionality under KVM/arm64. This will enhance lock performance on overcommitted hosts (more runnable vCPUs than physical CPUs in the system) as doing busy waits for preempted vCPUs will hurt system performance far worse than early yielding.
unix benchmark result:
  host:  kernel 4.19.87, HiSilicon Kunpeng920, 8 CPUs
  guest: kernel 4.19.87, 16 vCPUs
 test-case                               | after-patch       | before-patch
-----------------------------------------+-------------------+------------------
 Dhrystone 2 using register variables    | 338955728.5 lps   | 339266319.5 lps
 Double-Precision Whetstone              | 30634.9 MWIPS     | 30884.4 MWIPS
 Execl Throughput                        | 6753.2 lps        | 3580.1 lps
 File Copy 1024 bufsize 2000 maxblocks   | 490048.0 KBps     | 313282.3 KBps
 File Copy 256 bufsize 500 maxblocks     | 129662.5 KBps     | 83550.7 KBps
 File Copy 4096 bufsize 8000 maxblocks   | 1552551.5 KBps    | 814327.0 KBps
 Pipe Throughput                         | 8976422.5 lps     | 9048628.4 lps
 Pipe-based Context Switching            | 258641.7 lps      | 252925.9 lps
 Process Creation                        | 5312.2 lps        | 4507.9 lps
 Shell Scripts (1 concurrent)            | 8704.2 lpm        | 6720.9 lpm
 Shell Scripts (8 concurrent)            | 1708.8 lpm        | 607.2 lpm
 System Call Overhead                    | 3714444.7 lps     | 3746386.8 lps
-----------------------------------------+-------------------+------------------
 System Benchmarks Index Score           | 2270.6            | 1679.2
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/paravirt.h |   3 +
 arch/arm64/kernel/paravirt.c      | 107 ++++++++++++++++++++++++++++++
 arch/arm64/kvm/arm.c              |   2 +
 include/linux/cpuhotplug.h        |   1 +
 4 files changed, 113 insertions(+)
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 53282dc429f8..4e35851bcecf 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,6 +20,8 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+int __init pv_sched_init(void);
+
 __visible bool __native_vcpu_is_preempted(int cpu);
 DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted);
@@ -31,6 +33,7 @@ static inline bool pv_vcpu_is_preempted(int cpu) #else
 #define pv_time_init() do {} while (0)
+#define pv_sched_init() do {} while (0)
#endif // CONFIG_PARAVIRT
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index aa718d6a9274..e0c862f8f656 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -22,6 +22,7 @@
#include <asm/paravirt.h> #include <asm/pvclock-abi.h> +#include <asm/pvsched-abi.h> #include <asm/smp_plat.h>
struct static_key paravirt_steal_enabled; @@ -174,3 +175,109 @@ int __init pv_time_init(void)
return 0; } + + +DEFINE_PER_CPU(struct pvsched_vcpu_state, pvsched_vcpu_region) __aligned(64); +EXPORT_PER_CPU_SYMBOL(pvsched_vcpu_region); + +static bool kvm_vcpu_is_preempted(int cpu) +{ + struct pvsched_vcpu_state *reg; + u32 preempted; + + reg = &per_cpu(pvsched_vcpu_region, cpu); + if (!reg) { + pr_warn_once("PV sched enabled but not configured for cpu %d\n", + cpu); + return false; + } + + preempted = le32_to_cpu(READ_ONCE(reg->preempted)); + + return !!preempted; +} + +static int pvsched_vcpu_state_dying_cpu(unsigned int cpu) +{ + struct pvsched_vcpu_state *reg; + struct arm_smccc_res res; + + reg = this_cpu_ptr(&pvsched_vcpu_region); + if (!reg) + return -EFAULT; + + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE, &res); + memset(reg, 0, sizeof(*reg)); + + return 0; +} + +static int init_pvsched_vcpu_state(unsigned int cpu) +{ + struct pvsched_vcpu_state *reg; + struct arm_smccc_res res; + + reg = this_cpu_ptr(&pvsched_vcpu_region); + if (!reg) + return -EFAULT; + + /* Pass the memory address to host via hypercall */ + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_IPA_INIT, + virt_to_phys(reg), &res); + + return 0; +} + +static int kvm_arm_init_pvsched(void) +{ + int ret; + + ret = cpuhp_setup_state(CPUHP_AP_ARM_KVM_PVSCHED_STARTING, + "hypervisor/arm/pvsched:starting", + init_pvsched_vcpu_state, + pvsched_vcpu_state_dying_cpu); + + if (ret < 0) { + pr_warn("PV sched init failed\n"); + return ret; + } + + return 0; +} + +static bool has_kvm_pvsched(void) +{ + struct arm_smccc_res res; + + /* To detect the presence of PV sched support we require SMCCC 1.1+ */ + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) + return false; + + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_HV_PV_SCHED_FEATURES, &res); + + return (res.a0 == SMCCC_RET_SUCCESS); +} + +int __init pv_sched_init(void) +{ + int ret; + + if (is_hyp_mode_available()) + return 0; + + if (!has_kvm_pvsched()) { + pr_warn("PV sched is not available\n"); + return 0; + } + + ret = kvm_arm_init_pvsched(); + if (ret) + return ret; + + static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); + pr_info("using PV sched preempted\n"); + + return 0; +} +early_initcall(pv_sched_init); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 9750bfba6f4c..c3774a172ff3 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1383,6 +1383,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
spin_unlock(&vcpu->arch.mp_state_lock);
+	kvm_arm_pvsched_vcpu_init(&vcpu->arch);
+
 	return 0;
 }
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 624d4a38c358..f94a1b8e34e0 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -190,6 +190,7 @@ enum cpuhp_state { CPUHP_AP_DUMMY_TIMER_STARTING, CPUHP_AP_ARM_XEN_STARTING, CPUHP_AP_ARM_XEN_RUNSTATE_STARTING, + CPUHP_AP_ARM_KVM_PVSCHED_STARTING, CPUHP_AP_ARM_CORESIGHT_STARTING, CPUHP_AP_ARM_CORESIGHT_CTI_STARTING, CPUHP_AP_ARM64_ISNDEP_STARTING,
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
A new hypercall interface function is provided for the guest to kick a vCPU
that is parked in WFI state.
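From the guest's point of view the kick is a single SMCCC invocation (a
sketch; the guest-side caller, kvm_kick_cpu(), is added later in this series):

	struct arm_smccc_res res;

	/* cpu: id of the vCPU to be woken out of WFI */
	arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res);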
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 Documentation/virt/kvm/arm/pvsched.rst | 16 ++++++++++++++++
 include/linux/arm-smccc.h              |  6 ++++++
 2 files changed, 22 insertions(+)
diff --git a/Documentation/virt/kvm/arm/pvsched.rst b/Documentation/virt/kvm/arm/pvsched.rst
index 8f7112a8a9cd..6ba221e25089 100644
--- a/Documentation/virt/kvm/arm/pvsched.rst
+++ b/Documentation/virt/kvm/arm/pvsched.rst
@@ -11,6 +11,7 @@ Some SMCCC compatible hypercalls are defined:
 * PV_SCHED_FEATURES: 0xC5000090
 * PV_SCHED_IPA_INIT: 0xC5000091
 * PV_SCHED_IPA_RELEASE: 0xC5000092
+* PV_SCHED_KICK_CPU: 0xC5000093
 
 The existence of the PV_SCHED hypercall should be probed using the SMCCC 1.1
 ARCH_FEATURES mechanism before calling it.
@@ -38,6 +39,13 @@ PV_SCHED_IPA_RELEASE
                               this vCPU's PV data structure is released.
     ============= ========    ==========
 
+PV_SCHED_KICK_CPU
+    ============= ========    ==========
+    Function ID:  (uint32)    0xC5000093
+    Return value: (int64)     NOT_SUPPORTED (-1) or SUCCESS (0) if the vCPU is
+                              kicked by the hypervisor.
+    ============= ========    ==========
+
 PV sched state
 --------------
@@ -56,3 +64,11 @@ The structure pointed to by the PV_SCHED_IPA_INIT hypercall is as follows:
 The preempted field will be updated to 0 by the hypervisor prior to scheduling
 a vCPU. When the vCPU is scheduled out, the preempted field will be updated
 to 1 by the hypervisor.
+
+A vCPU of a paravirtualized guest that is busywaiting in guest kernel mode for
+an event to occur (e.g. a spinlock to become available) can execute a WFI
+instruction once it has busy-waited for more than a threshold time interval.
+Execution of the WFI instruction would cause the hypervisor to put the vCPU to
+sleep until the occurrence of an appropriate event. Another vCPU of the same
+guest can wake up the sleeping vCPU by issuing the PV_SCHED_KICK_CPU hypercall,
+specifying the CPU id (reg1) of the vCPU to be woken up.
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index daab363d31ae..02b857f44ae8 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -596,5 +596,11 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
 			   ARM_SMCCC_OWNER_STANDARD_HYP,	\
 			   0x92)
 
+#define ARM_SMCCC_HV_PV_SCHED_KICK_CPU				\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\
+			   ARM_SMCCC_SMC_64,			\
+			   ARM_SMCCC_OWNER_STANDARD_HYP,	\
+			   0x93)
+
 #endif /*__ASSEMBLY__*/
 #endif /*__LINUX_ARM_SMCCC_H*/
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
Implement the service call for waking up a vCPU that is in WFI state.
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/kvm/arm.c              |  4 +++-
 arch/arm64/kvm/handle_exit.c      |  1 +
 arch/arm64/kvm/hypercalls.c       |  3 +++
 arch/arm64/kvm/pvsched.c          | 25 +++++++++++++++++++++++++
 5 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 1ad1a18caceb..bb7549d515db 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -598,6 +598,7 @@ struct kvm_vcpu_arch {
 	/* Guest PV sched state */
 	struct {
+		bool pv_unhalted;
 		gpa_t base;
 	} pvsched;
@@ -1058,6 +1059,7 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch)
 long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu);
 void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted);
+long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu);
static inline void kvm_arm_pvsched_vcpu_init(struct kvm_vcpu_arch *vcpu_arch) { diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index c3774a172ff3..edffe6c486de 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -598,7 +598,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF); - return ((irq_lines || kvm_vgic_vcpu_pending_irq(v)) + bool pv_unhalted = v->arch.pvsched.pv_unhalted; + + return ((irq_lines || kvm_vgic_vcpu_pending_irq(v) || pv_unhalted) && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); }
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 617ae6dea5d5..b9a44c3bebb7 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -121,6 +121,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu) } else { trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false); vcpu->stat.wfi_exit_stat++; + vcpu->arch.pvsched.pv_unhalted = false; }
if (esr & ESR_ELx_WFx_ISS_WFxT) { diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index a96b88d90cf8..46def0572461 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -377,6 +377,9 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) vcpu->arch.pvsched.base = INVALID_GPA; val[0] = SMCCC_RET_SUCCESS; break; + case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: + val[0] = kvm_pvsched_kick_vcpu(vcpu); + break; default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 06290c831101..49d03fd6f4ad 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -34,6 +34,30 @@ void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) pagefault_enable(); }
+long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu) +{ + unsigned int vcpu_idx; + long val = SMCCC_RET_NOT_SUPPORTED; + struct kvm *kvm = vcpu->kvm; + struct kvm_vcpu *target = NULL; + + vcpu_idx = smccc_get_arg1(vcpu); + target = kvm_get_vcpu(kvm, vcpu_idx); + if (!target) + goto out; + + target->arch.pvsched.pv_unhalted = true; + kvm_make_request(KVM_REQ_IRQ_PENDING, target); + kvm_vcpu_kick(target); + if (READ_ONCE(target->ready)) + kvm_vcpu_yield_to(target); + + val = SMCCC_RET_SUCCESS; + +out: + return val; +} + long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) { u32 feature = smccc_get_arg1(vcpu); @@ -43,6 +67,7 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_FEATURES: case ARM_SMCCC_HV_PV_SCHED_IPA_INIT: case ARM_SMCCC_HV_PV_SCHED_IPA_RELEASE: + case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: val = SMCCC_RET_SUCCESS; break; }
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
Linux kernel builds were run in a KVM guest on a HiSilicon Kunpeng920 system.
VM guests were set up with 32, 48 and 60 vCPUs on 32 physical CPUs. Each
kernel build (make -j<n>, where <n> is the number of vCPUs available) was done
in a VM with unpinned vCPUs 3 times, with the best time selected. The build
times of the original Linux 4.19.87 and of pvqspinlock with various numbers of
vCPUs are as follows:
Kernel        32 vCPUs    48 vCPUs    60 vCPUs
-----------   --------    --------    --------
4.19.87       342.336s    602.048s    950.340s
pvqspinlock   341.366s    376.135s    437.037s
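Note that the PV qspinlock path added here is opt-in: arm_pvspin stays false
unless the arm_pvspin early parameter appears on the guest kernel command
line, e.g. (an illustrative command line, only arm_pvspin matters here):

	console=ttyAMA0 root=/dev/vda rw arm_pvspin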
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/Kconfig                          | 13 ++++
 arch/arm64/include/asm/Kbuild               |  1 -
 arch/arm64/include/asm/paravirt.h           | 38 +++++++++++
 arch/arm64/include/asm/qspinlock.h          | 48 +++++++++++++
 arch/arm64/include/asm/qspinlock_paravirt.h | 12 ++++
 arch/arm64/include/asm/spinlock.h           |  3 +
 arch/arm64/kernel/Makefile                  |  1 +
 arch/arm64/kernel/paravirt-spinlocks.c      |  5 ++
 arch/arm64/kernel/paravirt.c                | 74 +++++++++++++++++++++
 9 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/qspinlock.h
 create mode 100644 arch/arm64/include/asm/qspinlock_paravirt.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 9db5bd439a29..971cf9d54101 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1548,6 +1548,19 @@ config PARAVIRT under a hypervisor, potentially improving performance significantly over full virtualization.
+config PARAVIRT_SPINLOCKS + bool "Paravirtualization layer for spinlocks" + depends on PARAVIRT && SMP + help + Paravirtualized spinlocks allow a pvops backend to replace the + spinlock implementation with something virtualization-friendly + (for example, block the virtual CPU rather than spinning). + + It has a minimal impact on native kernels and gives a nice performance + benefit on paravirtualized KVM kernels. + + If you are unsure how to answer this question, answer Y. + config PARAVIRT_TIME_ACCOUNTING bool "Paravirtual steal time accounting" select PARAVIRT diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild index 5c8ee5a541d2..d16ee8095366 100644 --- a/arch/arm64/include/asm/Kbuild +++ b/arch/arm64/include/asm/Kbuild @@ -2,7 +2,6 @@ generic-y += early_ioremap.h generic-y += mcs_spinlock.h generic-y += qrwlock.h -generic-y += qspinlock.h generic-y += parport.h generic-y += user.h
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 4e35851bcecf..a29eeffa49aa 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -30,6 +30,44 @@ static inline bool pv_vcpu_is_preempted(int cpu) return static_call(pv_vcpu_preempted)(cpu); }
+#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) +void __init pv_qspinlock_init(void); +bool pv_is_native_spin_unlock(void); + +void dummy_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, + dummy_queued_spin_lock_slowpath); +static inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + return static_call(pv_qspinlock_queued_spin_lock_slowpath)(lock, val); +} + +void dummy_queued_spin_unlock(struct qspinlock *lock); +DECLARE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, dummy_queued_spin_unlock); +static inline void pv_queued_spin_unlock(struct qspinlock *lock) +{ + return static_call(pv_qspinlock_queued_spin_unlock)(lock); +} + +void dummy_wait(u8 *ptr, u8 val); +DECLARE_STATIC_CALL(pv_qspinlock_wait, dummy_wait); +static inline void pv_wait(u8 *ptr, u8 val) +{ + return static_call(pv_qspinlock_wait)(ptr, val); +} + +void dummy_kick(int cpu); +DECLARE_STATIC_CALL(pv_qspinlock_kick, dummy_kick); +static inline void pv_kick(int cpu) +{ + return static_call(pv_qspinlock_kick)(cpu); +} +#else + +#define pv_qspinlock_init() do {} while (0) + +#endif /* SMP && PARAVIRT_SPINLOCKS */ + #else
#define pv_time_init() do {} while (0) diff --git a/arch/arm64/include/asm/qspinlock.h b/arch/arm64/include/asm/qspinlock.h new file mode 100644 index 000000000000..ce3816b60f25 --- /dev/null +++ b/arch/arm64/include/asm/qspinlock.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2020 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef _ASM_ARM64_QSPINLOCK_H +#define _ASM_ARM64_QSPINLOCK_H + +#include <linux/jump_label.h> +#include <asm/cpufeature.h> +#include <asm-generic/qspinlock_types.h> +#include <asm/paravirt.h> + +#define _Q_PENDING_LOOPS (1 << 9) + +#ifdef CONFIG_PARAVIRT_SPINLOCKS +extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +extern void __pv_init_lock_hash(void); +extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); + +#define queued_spin_unlock queued_spin_unlock +/** + * queued_spin_unlock - release a queued spinlock + * @lock : Pointer to queued spinlock structure + * + * A smp_store_release() on the least-significant byte. + */ +static inline void native_queued_spin_unlock(struct qspinlock *lock) +{ + smp_store_release(&lock->locked, 0); +} + +static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + pv_queued_spin_lock_slowpath(lock, val); +} + +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + pv_queued_spin_unlock(lock); +} + +#endif + +#include <asm-generic/qspinlock.h> + +#endif /* _ASM_ARM64_QSPINLOCK_H */ diff --git a/arch/arm64/include/asm/qspinlock_paravirt.h b/arch/arm64/include/asm/qspinlock_paravirt.h new file mode 100644 index 000000000000..eba4be28fbb9 --- /dev/null +++ b/arch/arm64/include/asm/qspinlock_paravirt.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#ifndef __ASM_QSPINLOCK_PARAVIRT_H +#define __ASM_QSPINLOCK_PARAVIRT_H + +extern void __pv_queued_spin_unlock(struct qspinlock *lock); + +#endif diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h index 3dcb1c903ba0..4040a27ecf86 100644 --- a/arch/arm64/include/asm/spinlock.h +++ b/arch/arm64/include/asm/spinlock.h @@ -9,6 +9,9 @@ #include <asm/qrwlock.h> #include <asm/paravirt.h>
+/* How long a lock should spin before we consider blocking */
+#define SPIN_THRESHOLD	(1 << 15)
+
 /* See include/linux/spinlock.h */
 #define smp_mb__after_spinlock()	smp_mb()
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 235963ed87b1..4db14d6677bd 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt-spinlocks.o +obj-$(CONFIG_PARAVIRT_SPINLOCKS) += paravirt.o paravirt-spinlocks.o obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/ obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o obj-$(CONFIG_ELF_CORE) += elfcore.o diff --git a/arch/arm64/kernel/paravirt-spinlocks.c b/arch/arm64/kernel/paravirt-spinlocks.c index 3566897070d9..2e4abc83e318 100644 --- a/arch/arm64/kernel/paravirt-spinlocks.c +++ b/arch/arm64/kernel/paravirt-spinlocks.c @@ -14,3 +14,8 @@ __visible bool __native_vcpu_is_preempted(int cpu) }
DEFINE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); + +bool pv_is_native_spin_unlock(void) +{ + return false; +} diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index e0c862f8f656..3a115f1478ee 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -23,6 +23,7 @@ #include <asm/paravirt.h> #include <asm/pvclock-abi.h> #include <asm/pvsched-abi.h> +#include <asm/qspinlock_paravirt.h> #include <asm/smp_plat.h>
struct static_key paravirt_steal_enabled; @@ -259,6 +260,77 @@ static bool has_kvm_pvsched(void) return (res.a0 == SMCCC_RET_SUCCESS); }
+#ifdef CONFIG_PARAVIRT_SPINLOCKS +static bool arm_pvspin; + +/* Kick a cpu by its cpuid. Used to wake up a halted vcpu */ +static void kvm_kick_cpu(int cpu) +{ + struct arm_smccc_res res; + + arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res); +} + +static void kvm_wait(u8 *ptr, u8 val) +{ + unsigned long flags; + + if (in_nmi()) + return; + + local_irq_save(flags); + + if (READ_ONCE(*ptr) != val) + goto out; + + dsb(sy); + wfi(); + +out: + local_irq_restore(flags); +} + +DEFINE_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath, + native_queued_spin_lock_slowpath); +DEFINE_STATIC_CALL(pv_qspinlock_queued_spin_unlock, native_queued_spin_unlock); +DEFINE_STATIC_CALL(pv_qspinlock_wait, kvm_wait); +DEFINE_STATIC_CALL(pv_qspinlock_kick, kvm_kick_cpu); + +EXPORT_STATIC_CALL(pv_qspinlock_queued_spin_lock_slowpath); +EXPORT_STATIC_CALL(pv_qspinlock_queued_spin_unlock); +EXPORT_STATIC_CALL(pv_qspinlock_wait); +EXPORT_STATIC_CALL(pv_qspinlock_kick); + +void __init pv_qspinlock_init(void) +{ + /* Don't use the PV qspinlock code if there is only 1 vCPU. */ + if (num_possible_cpus() == 1) + arm_pvspin = false; + + if (!arm_pvspin) { + pr_info("PV qspinlocks disabled\n"); + return; + } + pr_info("PV qspinlocks enabled\n"); + + __pv_init_lock_hash(); + + static_call_update(pv_qspinlock_queued_spin_lock_slowpath, + __pv_queued_spin_lock_slowpath); + static_call_update(pv_qspinlock_queued_spin_unlock, + __pv_queued_spin_unlock); + static_call_update(pv_qspinlock_wait, kvm_wait); + static_call_update(pv_qspinlock_kick, kvm_kick_cpu); +} + +static __init int arm_parse_pvspin(char *arg) +{ + arm_pvspin = true; + return 0; +} +early_param("arm_pvspin", arm_parse_pvspin); +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ + int __init pv_sched_init(void) { int ret; @@ -278,6 +350,8 @@ int __init pv_sched_init(void) static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); pr_info("using PV sched preempted\n");
+ pv_qspinlock_init(); + return 0; } early_initcall(pv_sched_init);
From: Zengruan Ye <yezengruan@huawei.com>
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
Add tracepoints for PV qspinlock: kvm_kick_cpu and kvm_wait on the guest side,
and kvm_pvsched_kick_vcpu on the KVM host side.
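Once booted, the new guest-side events can be enabled through tracefs in the
usual way (paths assume the standard tracefs mount point):

	# echo 1 > /sys/kernel/tracing/events/paravirt/kvm_kick_cpu/enable
	# echo 1 > /sys/kernel/tracing/events/paravirt/kvm_wait/enable
	# cat /sys/kernel/tracing/trace_pipe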
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/kernel/paravirt.c       |  6 +++
 arch/arm64/kernel/trace-paravirt.h | 66 ++++++++++++++++++++++++++++++
 arch/arm64/kvm/pvsched.c           |  3 ++
 arch/arm64/kvm/trace_arm.h         | 18 ++++++++
 4 files changed, 93 insertions(+)
 create mode 100644 arch/arm64/kernel/trace-paravirt.h
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 3a115f1478ee..73a8b9886775 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -26,6 +26,9 @@ #include <asm/qspinlock_paravirt.h> #include <asm/smp_plat.h>
+#define CREATE_TRACE_POINTS +#include "trace-paravirt.h" + struct static_key paravirt_steal_enabled; struct static_key paravirt_steal_rq_enabled;
@@ -269,6 +272,8 @@ static void kvm_kick_cpu(int cpu) struct arm_smccc_res res;
arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_SCHED_KICK_CPU, cpu, &res); + + trace_kvm_kick_cpu("kvm kick cpu", smp_processor_id(), cpu); }
static void kvm_wait(u8 *ptr, u8 val) @@ -285,6 +290,7 @@ static void kvm_wait(u8 *ptr, u8 val)
dsb(sy); wfi(); + trace_kvm_wait("kvm wait wfi", smp_processor_id());
out: local_irq_restore(flags); diff --git a/arch/arm64/kernel/trace-paravirt.h b/arch/arm64/kernel/trace-paravirt.h new file mode 100644 index 000000000000..2d76272f39ae --- /dev/null +++ b/arch/arm64/kernel/trace-paravirt.h @@ -0,0 +1,66 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright(c) 2019 Huawei Technologies Co., Ltd + * Author: Zengruan Ye yezengruan@huawei.com + */ + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM paravirt + +#if !defined(_TRACE_PARAVIRT_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_PARAVIRT_H + +#include <linux/tracepoint.h> + +TRACE_EVENT(kvm_kick_cpu, + TP_PROTO(const char *name, int cpu, int target), + TP_ARGS(name, cpu, target), + + TP_STRUCT__entry( + __string(name, name) + __field(int, cpu) + __field(int, target) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->cpu = cpu; + __entry->target = target; + ), + + TP_printk("PV qspinlock: %s, cpu %d kick target cpu %d", + __get_str(name), + __entry->cpu, + __entry->target + ) +); + +TRACE_EVENT(kvm_wait, + TP_PROTO(const char *name, int cpu), + TP_ARGS(name, cpu), + + TP_STRUCT__entry( + __string(name, name) + __field(int, cpu) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->cpu = cpu; + ), + + TP_printk("PV qspinlock: %s, cpu %d wait kvm access wfi", + __get_str(name), + __entry->cpu + ) +); + +#endif /* _TRACE_PARAVIRT_H */ + +/* This part must be outside protection */ +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE +#define TRACE_INCLUDE_PATH ../../../arch/arm64/kernel/ +#define TRACE_INCLUDE_FILE trace-paravirt + +#include <trace/define_trace.h> diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index 49d03fd6f4ad..bcfd4760b491 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -11,6 +11,8 @@
#include <kvm/arm_hypercalls.h>
+#include "trace.h" + void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted) { struct kvm *kvm = vcpu->kvm; @@ -53,6 +55,7 @@ long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu) kvm_vcpu_yield_to(target);
val = SMCCC_RET_SUCCESS; + trace_kvm_pvsched_kick_vcpu(vcpu->vcpu_id, target->vcpu_id);
out: return val; diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h index 8ad53104934d..680c7e39af4a 100644 --- a/arch/arm64/kvm/trace_arm.h +++ b/arch/arm64/kvm/trace_arm.h @@ -390,6 +390,24 @@ TRACE_EVENT(kvm_forward_sysreg_trap, sys_reg_Op2(__entry->sysreg)) );
+TRACE_EVENT(kvm_pvsched_kick_vcpu, + TP_PROTO(int vcpu_id, int target_vcpu_id), + TP_ARGS(vcpu_id, target_vcpu_id), + + TP_STRUCT__entry( + __field(int, vcpu_id) + __field(int, target_vcpu_id) + ), + + TP_fast_assign( + __entry->vcpu_id = vcpu_id; + __entry->target_vcpu_id = target_vcpu_id; + ), + + TP_printk("PV qspinlock: vcpu %d kick target vcpu %d", + __entry->vcpu_id, __entry->target_vcpu_id) +); + #endif /* _TRACE_ARM_ARM64_KVM_H */
#undef TRACE_INCLUDE_PATH
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8WMFU
CVE: NA
--------------------------------
Add a Kconfig option, CONFIG_PARAVIRT_SCHED, to gate the pv-sched feature on
both the guest and the KVM host side.
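With this option in place, the full feature stack is selected by something
like the following guest config fragment (illustrative; PARAVIRT_SCHED depends
on PARAVIRT, and PARAVIRT_SPINLOCKS additionally needs SMP):

	CONFIG_PARAVIRT=y
	CONFIG_PARAVIRT_SPINLOCKS=y
	CONFIG_PARAVIRT_SCHED=y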
Signed-off-by: lishusen <lishusen2@huawei.com>
---
 arch/arm64/Kconfig                |  7 ++++
 arch/arm64/include/asm/kvm_host.h |  4 +++
 arch/arm64/include/asm/paravirt.h |  2 ++
 arch/arm64/kernel/paravirt.c      | 54 ++++++++++++++++---------------
 arch/arm64/kvm/arm.c              | 21 +++++++++---
 arch/arm64/kvm/handle_exit.c      |  2 ++
 arch/arm64/kvm/hypercalls.c       |  4 +++
 arch/arm64/kvm/pvsched.c          |  2 ++
 8 files changed, 66 insertions(+), 30 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 971cf9d54101..145d85fc37fd 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1561,6 +1561,13 @@ config PARAVIRT_SPINLOCKS
If you are unsure how to answer this question, answer Y.
+config PARAVIRT_SCHED + bool "Paravirtualization layer for sched" + depends on PARAVIRT + help + + If you are unsure how to answer this question, answer Y. + config PARAVIRT_TIME_ACCOUNTING bool "Paravirtual steal time accounting" select PARAVIRT diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index bb7549d515db..93782ef4f25e 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -596,11 +596,13 @@ struct kvm_vcpu_arch { gpa_t base; } steal;
+#ifdef CONFIG_PARAVIRT_SCHED
 	/* Guest PV sched state */
 	struct {
 		bool pv_unhalted;
 		gpa_t base;
 	} pvsched;
+#endif
/* Per-vcpu CCSIDR override or NULL */ u32 *ccsidr; @@ -1057,6 +1059,7 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch) return (vcpu_arch->steal.base != INVALID_GPA); }
+#ifdef CONFIG_PARAVIRT_SCHED long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu); void kvm_update_pvsched_preempted(struct kvm_vcpu *vcpu, u32 preempted); long kvm_pvsched_kick_vcpu(struct kvm_vcpu *vcpu); @@ -1070,6 +1073,7 @@ static inline bool kvm_arm_is_pvsched_enabled(struct kvm_vcpu_arch *vcpu_arch) { return (vcpu_arch->pvsched.base != INVALID_GPA); } +#endif
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index a29eeffa49aa..1ce7187eed48 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -20,7 +20,9 @@ static inline u64 paravirt_steal_clock(int cpu)
int __init pv_time_init(void);
+#ifdef CONFIG_PARAVIRT_SCHED
 int __init pv_sched_init(void);
+#endif
__visible bool __native_vcpu_is_preempted(int cpu); DECLARE_STATIC_CALL(pv_vcpu_preempted, __native_vcpu_is_preempted); diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 73a8b9886775..126dff610e07 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -180,7 +180,7 @@ int __init pv_time_init(void) return 0; }
- +#ifdef CONFIG_PARAVIRT_SCHED DEFINE_PER_CPU(struct pvsched_vcpu_state, pvsched_vcpu_region) __aligned(64); EXPORT_PER_CPU_SYMBOL(pvsched_vcpu_region);
@@ -263,6 +263,33 @@ static bool has_kvm_pvsched(void) return (res.a0 == SMCCC_RET_SUCCESS); }
+int __init pv_sched_init(void) +{ + int ret; + + if (is_hyp_mode_available()) + return 0; + + if (!has_kvm_pvsched()) { + pr_warn("PV sched is not available\n"); + return 0; + } + + ret = kvm_arm_init_pvsched(); + if (ret) + return ret; + + static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); + pr_info("using PV sched preempted\n"); + + pv_qspinlock_init(); + + return 0; +} + +early_initcall(pv_sched_init); +#endif /* CONFIG_PARAVIRT_SCHED */ + #ifdef CONFIG_PARAVIRT_SPINLOCKS static bool arm_pvspin;
@@ -336,28 +363,3 @@ static __init int arm_parse_pvspin(char *arg) } early_param("arm_pvspin", arm_parse_pvspin); #endif /* CONFIG_PARAVIRT_SPINLOCKS */ - -int __init pv_sched_init(void) -{ - int ret; - - if (is_hyp_mode_available()) - return 0; - - if (!has_kvm_pvsched()) { - pr_warn("PV sched is not available\n"); - return 0; - } - - ret = kvm_arm_init_pvsched(); - if (ret) - return ret; - - static_call_update(pv_vcpu_preempted, kvm_vcpu_is_preempted); - pr_info("using PV sched preempted\n"); - - pv_qspinlock_init(); - - return 0; -} -early_initcall(pv_sched_init); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index edffe6c486de..d399e4fae356 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -408,7 +408,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
kvm_arm_pvtime_vcpu_init(&vcpu->arch);
+#ifdef CONFIG_PARAVIRT_SCHED
 	kvm_arm_pvsched_vcpu_init(&vcpu->arch);
+#endif
vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
@@ -497,10 +499,12 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus)) vcpu_set_on_unsupported_cpu(vcpu);
+ kvm_tlbi_dvmbm_vcpu_load(vcpu); + +#ifdef CONFIG_PARAVIRT_SCHED if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) kvm_update_pvsched_preempted(vcpu, 0); - - kvm_tlbi_dvmbm_vcpu_load(vcpu); +#endif }
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) @@ -517,10 +521,12 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) vcpu_clear_on_unsupported_cpu(vcpu); vcpu->cpu = -1;
+ kvm_tlbi_dvmbm_vcpu_put(vcpu); + +#ifdef CONFIG_PARAVIRT_SCHED if (kvm_arm_is_pvsched_enabled(&vcpu->arch)) kvm_update_pvsched_preempted(vcpu, 1); - - kvm_tlbi_dvmbm_vcpu_put(vcpu); +#endif }
static void __kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu) @@ -598,10 +604,15 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) { bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF); +#ifdef CONFIG_PARAVIRT_SCHED bool pv_unhalted = v->arch.pvsched.pv_unhalted;
return ((irq_lines || kvm_vgic_vcpu_pending_irq(v) || pv_unhalted) && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); +#else + return ((irq_lines || kvm_vgic_vcpu_pending_irq(v)) + && !kvm_arm_vcpu_stopped(v) && !v->arch.pause); +#endif }
bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu) @@ -1385,7 +1396,9 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
spin_unlock(&vcpu->arch.mp_state_lock);
+#ifdef CONFIG_PARAVIRT_SCHED
 	kvm_arm_pvsched_vcpu_init(&vcpu->arch);
+#endif
return 0; } diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index b9a44c3bebb7..f4671ae23b1f 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -121,7 +121,9 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu) } else { trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false); vcpu->stat.wfi_exit_stat++; +#ifdef CONFIG_PARAVIRT_SCHED vcpu->arch.pvsched.pv_unhalted = false; +#endif }
if (esr & ESR_ELx_WFx_ISS_WFxT) { diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 46def0572461..2e7a1d06e6ff 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -332,9 +332,11 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) &smccc_feat->std_hyp_bmap)) val[0] = SMCCC_RET_SUCCESS; break; +#ifdef CONFIG_PARAVIRT_SCHED case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = SMCCC_RET_SUCCESS; break; +#endif } break; case ARM_SMCCC_HV_PV_TIME_FEATURES: @@ -363,6 +365,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_TRNG_RND32: case ARM_SMCCC_TRNG_RND64: return kvm_trng_call(vcpu); +#ifdef CONFIG_PARAVIRT_SCHED case ARM_SMCCC_HV_PV_SCHED_FEATURES: val[0] = kvm_hypercall_pvsched_features(vcpu); break; @@ -380,6 +383,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) case ARM_SMCCC_HV_PV_SCHED_KICK_CPU: val[0] = kvm_pvsched_kick_vcpu(vcpu); break; +#endif default: return kvm_psci_call(vcpu); } diff --git a/arch/arm64/kvm/pvsched.c b/arch/arm64/kvm/pvsched.c index bcfd4760b491..eae6ad7b6c5d 100644 --- a/arch/arm64/kvm/pvsched.c +++ b/arch/arm64/kvm/pvsched.c @@ -4,6 +4,7 @@ * Author: Zengruan Ye yezengruan@huawei.com */
+#ifdef CONFIG_PARAVIRT_SCHED
 #include <linux/arm-smccc.h>
 #include <linux/kvm_host.h>
@@ -77,3 +78,4 @@ long kvm_hypercall_pvsched_features(struct kvm_vcpu *vcpu)
 	return val;
 }
+#endif /* CONFIG_PARAVIRT_SCHED */
Feedback: the patch(es) you sent to kernel@openeuler.org could not be converted to a PR!
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/L...
Failed reason: applying the patch(es) failed; Patch failed at 0003 KVM: arm64: Support pvsched preempted via shared structure
Suggested solution: please check the failure reason and confirm whether the patch(es) can be applied on top of the latest code of the expected branch