From: Christian Borntraeger <borntraeger@de.ibm.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, KVM <kvm@vger.kernel.org>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Jens Freimann <jfrei@linux.vnet.ibm.com>,
	David Hildenbrand <dahi@linux.vnet.ibm.com>,
	Wanpeng Li <kernellwp@gmail.com>,
	David Matlack <dmatlack@google.com>
Subject: Re: [PATCH 1/1] KVM: halt_polling: provide a way to qualify wakeups during poll
Date: Wed, 4 May 2016 09:50:57 +0200
Message-ID: <5729A9E1.3050706@de.ibm.com>
In-Reply-To: <20160503150902.GF30059@potion>

On 05/03/2016 05:09 PM, Radim Krčmář wrote:
> 2016-05-03 14:37+0200, Christian Borntraeger:
>> Some wakeups should not be considered a successful poll. For example on
>> s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
>> would be considered runnable - letting all vCPUs poll all the time for
>> transactional like workload, even if one vCPU would be enough.
>> This can result in huge CPU usage for large guests.
>> This patch lets architectures provide a way to qualify wakeups, i.e.
>> whether they should be considered good or bad wakeups in regard to polls.
>>
>> For s390 the implementation will fence off halt polling for anything but
>> known good, single vCPU events. The s390 implementation for floating
>> interrupts does a wakeup for one vCPU, but the interrupt will be delivered
>> by whatever CPU checks first for a pending interrupt. We prefer the
>> woken up CPU by marking the poll of this CPU as "good" poll.
>> This code will also mark several other wakeup reasons like IPI or
>> expired timers as "good". This will of course also mark some events as
>> not successful. As KVM on z always runs as a 2nd level hypervisor,
>> we prefer to not poll, unless we are really sure, though.
>>
>> This patch successfully limits the CPU usage for cases like uperf 1byte
>> transactional ping pong workload or wakeup heavy workload like OLTP
>> while still providing a proper speedup.
>>
>> This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
>> wakeups that are considered not good for polling.
>>
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> Cc: David Matlack <dmatlack@google.com>
>> Cc: Wanpeng Li <kernellwp@gmail.com>
>> ---
> 
> Thanks for all explanations,
> 
> Acked-by: Radim Krčmář <rkrcmar@redhat.com>
> 


The feedback about the logic triggered some more experiments on my side.
I tried some different workloads/heuristics, and it seems that even more
aggressive shrinking (basically resetting to 0 as soon as an invalid poll
comes along) reduces the CPU usage even further.

Patch on top:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ffe0545..c168662 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2036,12 +2036,13 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 out:
        block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
-       if (halt_poll_ns) {
+       if (!vcpu_valid_wakeup(vcpu))
+                shrink_halt_poll_ns(vcpu);
+       else if (halt_poll_ns) {
                if (block_ns <= vcpu->halt_poll_ns)
                        ;
                /* we had a long block, shrink polling */
-               else if (!vcpu_valid_wakeup(vcpu) ||
-                       (vcpu->halt_poll_ns && block_ns > halt_poll_ns))
+               else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
                        shrink_halt_poll_ns(vcpu);
                /* we had a short halt and our poll time is too small */
                else if (vcpu->halt_poll_ns < halt_poll_ns &&


The uperf 1byte:1byte workload still seems to have all the benefits.
I have asked the performance folks to test several other workloads to see
whether we lose some of the benefits.
So I will defer this patch until I have a full picture of which heuristic
is best. Hopefully I will have some answers next week.

(So the new diff looks like this:)
@@ -2034,7 +2036,9 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 out:
        block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
-       if (halt_poll_ns) {
+       if (!vcpu_valid_wakeup(vcpu))
+                shrink_halt_poll_ns(vcpu);
+       else if (halt_poll_ns) {
                if (block_ns <= vcpu->halt_poll_ns)
                        ;
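
To make the difference between the two heuristics easier to see, here is a
small userspace model of the tuning step. This is only a sketch and not
kernel code: struct poll_params, struct vcpu_model and the model_* helpers
are hypothetical names, the grow condition is filled in from my reading of
kvm_main.c (the hunk above cuts it off), and the clamp in model_grow() is an
assumption of this model. model_tune_old() follows the removed lines,
model_tune_new() the added ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the halt_poll_ns* module parameters. */
struct poll_params {
	uint64_t max_ns;     /* global halt_poll_ns limit; 0 disables tuning */
	uint64_t grow_start; /* first non-zero window when growing from 0 */
	unsigned int grow;   /* multiplier used when growing */
	unsigned int shrink; /* divisor used when shrinking; 0 resets to 0 */
};

/* Hypothetical model of the per-vCPU polling window. */
struct vcpu_model {
	uint64_t halt_poll_ns;
};

static void model_grow(struct vcpu_model *v, const struct poll_params *p)
{
	uint64_t val = v->halt_poll_ns;

	if (val == 0 && p->grow)
		val = p->grow_start;
	else
		val *= p->grow;
	if (val > p->max_ns)	/* assumption: never exceed the global limit */
		val = p->max_ns;
	v->halt_poll_ns = val;
}

static void model_shrink(struct vcpu_model *v, const struct poll_params *p)
{
	if (p->shrink == 0)
		v->halt_poll_ns = 0;
	else
		v->halt_poll_ns /= p->shrink;
}

/* Heuristic before the patch on top: an invalid wakeup only shrinks the
 * window if the block was longer than the current window. */
static void model_tune_old(struct vcpu_model *v, const struct poll_params *p,
			   uint64_t block_ns, bool valid_wakeup)
{
	if (!p->max_ns)
		return;
	if (block_ns <= v->halt_poll_ns)
		return;
	if (!valid_wakeup || (v->halt_poll_ns && block_ns > p->max_ns))
		model_shrink(v, p);
	else if (v->halt_poll_ns < p->max_ns && block_ns < p->max_ns)
		model_grow(v, p);
}

/* Heuristic with the patch on top: any invalid wakeup shrinks the window,
 * regardless of how long the vCPU blocked. */
static void model_tune_new(struct vcpu_model *v, const struct poll_params *p,
			   uint64_t block_ns, bool valid_wakeup)
{
	if (!valid_wakeup) {
		model_shrink(v, p);
		return;
	}
	if (!p->max_ns)
		return;
	if (block_ns <= v->halt_poll_ns)
		return;
	if (v->halt_poll_ns && block_ns > p->max_ns)
		model_shrink(v, p);
	else if (v->halt_poll_ns < p->max_ns && block_ns < p->max_ns)
		model_grow(v, p);
}
```

With a shrink divisor of 0, a vCPU that has a 20000 ns window and gets an
invalid wakeup after 5000 ns keeps its window under model_tune_old(), but
drops to 0 under model_tune_new(). That is the "resetting to 0" behaviour
described above.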
