[PATCH RFC 0/1] Make vCPUs that are HLT state candidates for load balancing


From: Masanori Misono <m.misono760@gmail.com>
To: David Woodhouse <dwmw@amazon.co.uk>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Rohit Jain <rohit.k.jain@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Masanori Misono <m.misono760@gmail.com>
Subject: [PATCH RFC 0/1] Make vCPUs that are HLT state candidates for load balancing
Date: Wed, 26 May 2021 22:37:26 +0900
Message-ID: <20210526133727.42339-1-m.misono760@gmail.com>

Hi,

I observed performance degradation when running some parallel programs on a
VM that has (1) KVM_FEATURE_PV_UNHALT, (2) KVM_FEATURE_STEAL_TIME, and (3) a
multi-core topology (multiple cores per socket). The benchmark results are
shown at the bottom. An example of libvirt XML for creating such a VM is:

```
[...]
  <vcpu placement='static'>8</vcpu>
  <cpu mode='host-model'>
    <topology sockets='1' cores='8' threads='1'/>
  </cpu>
  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,l3-cache=on,+kvm-pv-unhalt,+kvm-steal-time'/>
  </qemu:commandline>
[...]
```

I investigated the cause and found that the problem occurs as follows:

- vCPU1 schedules thread A, and vCPU2 schedules thread B. vCPU1 and vCPU2
  share LLC.
- Thread A tries to acquire a lock but fails, and goes to sleep (via
  futex).
- vCPU1 becomes idle because there are no runnable threads and does HLT,
  which leads to HLT VMEXIT (if idle=halt, and KVM doesn't disable HLT
  VMEXIT using KVM_CAP_X86_DISABLE_EXITS).
- KVM sets vCPU1's st->preempted to 1 in kvm_steal_time_set_preempted().
- Thread C wakes on vCPU2. vCPU2 tries to do load balancing in
  select_idle_core(). Although vCPU1 is idle, vCPU1 is not a candidate for
  load balancing because vcpu_is_preempted(vCPU1) is true, hence
  available_idle_cpu(vCPU1) is false.
- As a result, both thread B and thread C stay in the vCPU2's runqueue, and
  vCPU1 is not utilized.

The patch changes kvm_arch_vcpu_put() so that it does not set st->preempted
to 1 when a vCPU exits because of HLT. As a result, vcpu_is_preempted(vCPU)
returns false, and the vCPU becomes a candidate for CFS load balancing.

The following are partial benchmark results of NPB-OMP
(https://www.nas.nasa.gov/publications/npb.html), which contains several
parallel computing programs. My machine has two nodes, and each CPU has 24
cores (Intel Xeon Platinum 8160, hyper-threading disabled). I created a VM
with 48 vCPUs, and each vCPU is pinned to the corresponding pCPU. I also
created virtual NUMA nodes so that the guest environment is as close as
possible to the host. Values in the table are execution times (seconds;
lower is better).

| environment \ benchmark name  | lu.C   | mg.C  | bt.C  | cg.C  |
|-------------------------------+--------+-------+-------+-------|
| host (Linux v5.13-rc3)        | 50.67  | 14.67 | 54.77 | 20.08 |
| VM (sockets=48, cores=1)      | 51.37  | 14.88 | 55.99 | 20.05 |
| VM (sockets=2, cores=24)      | 170.12 | 23.86 | 75.95 | 40.15 |
|   w/ this patch               | 48.92  | 14.95 | 55.23 | 20.09 |

vcpu_is_preempted() is also used in PV spinlock implementations to mitigate
lock holder preemption problems, among other things. A vCPU holding a lock
does not do HLT, so I think this patch doesn't affect them. However, the
pCPU may be running a host thread that has higher priority than the vCPU
thread, and in that case, vcpu_is_preempted() should ideally return 0. I
guess implementing that would be a bit complicated, so I wonder if this
patch's approach is acceptable.

Thanks,

Masanori Misono (1):
  KVM: x86: Don't set preempted when vCPU does HLT VMEXIT

 arch/x86/kvm/x86.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

base-commit: c4681547bcce777daf576925a966ffa824edd09d
-- 
2.31.1
