[Virt] Re: [PATCH hulk-4.19-next] KVM: Boost vCPUs that are delivering interrupts

4 Aug 2020

Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
...
-----Original Message-----
From: yezengruan
Sent: Tuesday, August 4, 2020 3:40 PM
To: Xiexiuqi <xiexiuqi@huawei.com>; Guohanjun (Hanjun Guo)
<guohanjun@huawei.com>
Cc: Wanghaibin (D) <wanghaibin.wang@huawei.com>; Fanhenglong
<fanhenglong@huawei.com>; yezengruan <yezengruan@huawei.com>;
Zhanghailiang <zhang.zhanghailiang@huawei.com>; kernel.openeuler
<kernel.openeuler@huawei.com>; Chenzhendong (alex)
<alex.chen@huawei.com>; virt@openeuler.org; Xiexiangyou
<xiexiangyou@huawei.com>; yuzenghui <yuzenghui@huawei.com>
Subject: [PATCH hulk-4.19-next] KVM: Boost vCPUs that are delivering
interrupts
From: Wanpeng Li <wanpengli@tencent.com>
mainline inclusion
from mainline-v5.8-rc5
commit d73eb57b80b98ae147e4e6a7d9877c2ba175f972
category: feature
bugzilla: NA
DTS: NA
CVE: NA
--------------------------------
Inspired by commit 9cac38dd5d (KVM/s390: Set preempted flag during vcpu
wakeup and interrupt delivery), we want to boost not just lock holders
but also vCPUs that are delivering interrupts. Most smp_call_function_many
calls are synchronous, so the IPI target vCPUs are also good yield candidates.
This patch introduces vcpu->ready to boost vCPUs during wakeup and
interrupt delivery time. Unlike s390, we do not reuse vcpu->preempted: that
way voluntarily preempted vCPUs are taken into account by
kvm_vcpu_on_spin, while vmx_vcpu_pi_put is not affected (VT-d PI handles
voluntary preemption separately, in pi_pre_block).
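
A minimal user-space sketch of the flag transitions described above, under
stated assumptions: struct model_vcpu and every model_* name are hypothetical
and only mirror kvm_sched_out, kvm_vcpu_wake_up, kvm_sched_in and the
candidate check in kvm_vcpu_on_spin from the diff below. Locking,
READ_ONCE/WRITE_ONCE and the scheduler itself are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two flags in struct kvm_vcpu. */
struct model_vcpu {
	bool preempted;	/* scheduled out while still runnable */
	bool ready;	/* runnable but not running: a yield candidate */
};

/* Mirrors kvm_sched_out(): only a still-runnable task sets the flags. */
static void model_sched_out(struct model_vcpu *v, bool task_running)
{
	if (task_running) {
		v->preempted = true;
		v->ready = true;
	}
}

/* Mirrors kvm_vcpu_wake_up(): a sleeping vCPU is woken for an interrupt. */
static void model_wake_up(struct model_vcpu *v)
{
	v->ready = true;
}

/* Mirrors kvm_sched_in(): the vCPU runs again, so both flags are cleared. */
static void model_sched_in(struct model_vcpu *v)
{
	v->preempted = false;
	v->ready = false;
}

/* The per-vCPU test that kvm_vcpu_on_spin() applies after this patch. */
static bool model_is_yield_candidate(const struct model_vcpu *v)
{
	return v->ready;
}

/*
 * Walks one halted vCPU through an IPI wakeup and returns whether it was
 * a yield candidate between the wakeup and actually running again.
 */
bool model_ipi_wakeup_scenario(void)
{
	struct model_vcpu v = { .preempted = false, .ready = false };
	bool candidate;

	model_sched_out(&v, false);	/* voluntary halt, not TASK_RUNNING */
	assert(!v.preempted && !v.ready);

	model_wake_up(&v);		/* IPI arrives, vCPU not yet running */
	candidate = model_is_yield_candidate(&v);
	assert(!v.preempted);		/* a preempted-only check would skip it */

	model_sched_in(&v);
	assert(!v.preempted && !v.ready);

	return candidate;
}
```

In this model the woken vCPU is a candidate even though preempted stays
false, which is the case the old vcpu->preempted check missed.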
Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB
RAM:
ebizzy -M
             vanilla     boosting    improved
1VM            21443        23520          9%
2VM             2800         8000        180%
3VM             1800         3100         72%
Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs,
one running ebizzy -M, the other running 'stress --cpu 2':
w/ boosting + w/o pv sched yield(vanilla)
             vanilla     boosting    improved
                1570         4000        155%
w/ boosting + w/ pv sched yield(vanilla)
             vanilla     boosting    improved
                1844         5157        179%
w/o boosting, perf top in VM:
72.33%  [kernel]       [k] smp_call_function_many
  4.22%  [kernel]       [k] call_function_i
  3.71%  [kernel]       [k] async_page_fault
w/ boosting, perf top in VM:
38.43%  [kernel]       [k] smp_call_function_many
  6.31%  [kernel]       [k] async_page_fault
  6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
  4.88%  [kernel]       [k] call_function_interrupt
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/s390/kvm/interrupt.c |  2 +-
 include/linux/kvm_host.h  |  1 +
 virt/kvm/kvm_main.c       | 12 ++++++++----
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 9dde4d7d8704..26f8bf4a22a7 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -1240,7 +1240,7 @@ void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu)
 		 * The vcpu gave up the cpu voluntarily, mark it as a good
 		 * yield-candidate.
 		 */
-		vcpu->preempted = true;
+		vcpu->ready = true;
 		swake_up_one(&vcpu->wq);
 		vcpu->stat.halt_wakeup++;
 	}
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c5da875f19e3..5c5b5867024c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -318,6 +318,7 @@ struct kvm_vcpu {
 	} spin_loop;
 #endif
 	bool preempted;
+	bool ready;
 	struct kvm_vcpu_arch arch;
 	struct dentry *debugfs_dentry;
 };
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b4ab59dd6846..887f3b0c2b60 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -314,6 +314,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	kvm_vcpu_set_in_spin_loop(vcpu, false);
 	kvm_vcpu_set_dy_eligible(vcpu, false);
 	vcpu->preempted = false;
+	vcpu->ready = false;

 	r = kvm_arch_vcpu_init(vcpu);
 	if (r < 0)
@@ -2387,6 +2388,7 @@ bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 	wqp = kvm_arch_vcpu_wq(vcpu);
 	if (swq_has_sleeper(wqp)) {
 		swake_up_one(wqp);
+		WRITE_ONCE(vcpu->ready, true);
 		++vcpu->stat.halt_wakeup;
 		return true;
 	}
@@ -2500,7 +2502,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 				continue;
 			} else if (pass && i > last_boosted_vcpu)
 				break;
-			if (!READ_ONCE(vcpu->preempted))
+			if (!READ_ONCE(vcpu->ready))
 				continue;
 			if (vcpu == me)
 				continue;
@@ -4203,8 +4205,8 @@ static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
 {
 	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);

-	if (vcpu->preempted)
-		vcpu->preempted = false;
+	vcpu->preempted = false;
+	WRITE_ONCE(vcpu->ready, false);

 	kvm_arch_sched_in(vcpu, cpu);

@@ -4216,8 +4218,10 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 {
 	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);

-	if (current->state == TASK_RUNNING)
+	if (current->state == TASK_RUNNING) {
 		vcpu->preempted = true;
+		WRITE_ONCE(vcpu->ready, true);
+	}
 	kvm_arch_vcpu_put(vcpu);
 }

--
2.19.1