From: Wanpeng Li <wanpengli@tencent.com>
mainline inclusion
from mainline-v5.8-rc5
commit d73eb57b80b98ae147e4e6a7d9877c2ba175f972
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=35
CVE: NA
--------------------------------
Inspired by commit 9cac38dd5d (KVM/s390: Set preempted flag during vcpu
wakeup and interrupt delivery), we want to boost not just lock holders
but also vCPUs that are delivering interrupts. Most
smp_call_function_many calls are synchronous, so the IPI target vCPUs
are also good yield candidates. This patch introduces vcpu->ready to
boost vCPUs during wakeup and interrupt delivery time; unlike s390 we do
not reuse vcpu->preempted so that voluntarily preempted vCPUs are taken
into account by kvm_vcpu_on_spin, but vmx_vcpu_pi_put is not affected
(VT-d PI handles voluntary preemption separately, in pi_pre_block).
Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
ebizzy -M

            vanilla     boosting    improved
1VM          21443       23520         9%
2VM           2800        8000       180%
3VM           1800        3100        72%
Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs, one running ebizzy -M, the other running 'stress --cpu 2':
w/ boosting + w/o pv sched yield(vanilla)

            vanilla     boosting    improved
              1570         4000       155%

w/ boosting + w/ pv sched yield(vanilla)

            vanilla     boosting    improved
              1844         5157       179%
w/o boosting, perf top in VM:
 72.33%  [kernel]       [k] smp_call_function_many
  4.22%  [kernel]       [k] call_function_i
  3.71%  [kernel]       [k] async_page_fault
w/ boosting, perf top in VM:
 38.43%  [kernel]       [k] smp_call_function_many
  6.31%  [kernel]       [k] async_page_fault
  6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
  4.88%  [kernel]       [k] call_function_interrupt
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Chaochao Xing <xingchaochao@huawei.com>
Reviewed-by: Zengruan Ye <yezengruan@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
---
 arch/s390/kvm/interrupt.c |  2 +-
 include/linux/kvm_host.h  |  1 +
 virt/kvm/kvm_main.c       | 12 ++++++++----
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 05ea466b9e40..c567a20ecb78 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -1144,7 +1144,7 @@ void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu)
 	 * The vcpu gave up the cpu voluntarily, mark it as a good
 	 * yield-candidate.
 	 */
-	vcpu->preempted = true;
+	vcpu->ready = true;
 	swake_up_one(&vcpu->wq);
 	vcpu->stat.halt_wakeup++;
 }
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1e1c7a6241d1..b9443fedf24e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -274,6 +274,7 @@ struct kvm_vcpu {
 	} spin_loop;
 #endif
 	bool preempted;
+	bool ready;
 	struct kvm_vcpu_arch arch;
 	struct dentry *debugfs_dentry;
 };
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 47b6d182fd8a..c1e63fd9b1ec 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -334,6 +334,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	kvm_vcpu_set_in_spin_loop(vcpu, false);
 	kvm_vcpu_set_dy_eligible(vcpu, false);
 	vcpu->preempted = false;
+	vcpu->ready = false;
 
 	r = kvm_arch_vcpu_init(vcpu);
 	if (r < 0)
@@ -2281,6 +2282,7 @@ bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 	wqp = kvm_arch_vcpu_wq(vcpu);
 	if (swq_has_sleeper(wqp)) {
 		swake_up_one(wqp);
+		WRITE_ONCE(vcpu->ready, true);
 		++vcpu->stat.halt_wakeup;
 		return true;
 	}
@@ -2417,7 +2419,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 				continue;
 			} else if (pass && i > last_boosted_vcpu)
 				break;
-			if (!READ_ONCE(vcpu->preempted))
+			if (!READ_ONCE(vcpu->ready))
 				continue;
 			if (vcpu == me)
 				continue;
@@ -4172,8 +4174,8 @@ static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
 {
 	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
 
-	if (vcpu->preempted)
-		vcpu->preempted = false;
+	vcpu->preempted = false;
+	WRITE_ONCE(vcpu->ready, false);
 
 	kvm_arch_sched_in(vcpu, cpu);
 
@@ -4185,8 +4187,10 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 {
 	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
 
-	if (current->state == TASK_RUNNING)
+	if (current->state == TASK_RUNNING) {
 		vcpu->preempted = true;
+		WRITE_ONCE(vcpu->ready, true);
+	}
 	kvm_arch_vcpu_put(vcpu);
 }