From: Avi Kivity <avi@redhat.com>
To: Rik van Riel <riel@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
    Peter Zijlstra <a.p.zijlstra@chello.nl>,
    Mike Galbraith <efault@gmx.de>,
    Chris Wright <chrisw@sous-sol.org>,
    "Nakajima, Jun" <jun.nakajima@intel.com>
Subject: Re: [PATCH -v8a 0/7] directed yield for Pause Loop Exiting
Date: Mon, 07 Feb 2011 11:08:08 +0200
Message-ID: <4D4FB678.7030701@redhat.com>
In-Reply-To: <20110201094433.72829892@annuminas.surriel.com>
On 02/01/2011 04:44 PM, Rik van Riel wrote:
> When running SMP virtual machines, it is possible for one VCPU to be
> spinning on a spinlock, while the VCPU that holds the spinlock is not
> currently running, because the host scheduler preempted it to run
> something else.
>
> Both Intel and AMD CPUs have a feature that detects when a virtual
> CPU is spinning on a lock and will trap to the host.
>
> The current KVM code sleeps for a bit whenever that happens, which
> results in e.g. a 64 VCPU Windows guest taking forever and a bit to
> boot up. This is because the VCPU holding the lock is actually
> running and not sleeping, so the pause is counter-productive.
>
> In other workloads a pause can also be counter-productive, with
> spinlock detection resulting in one guest giving up its CPU time
> to the others. Instead of spinning, it ends up simply not running
> much at all.
>
> This patch series aims to fix that, by having a VCPU that spins
> give the remainder of its timeslice to another VCPU in the same
> guest before yielding the CPU - one that is runnable but got
> preempted, hopefully the lock holder.
>
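The selection step described above can be modelled outside the kernel.
What follows is only a toy sketch in Python, not the code from the
series: the Vcpu fields, pick_yield_target() and the round-robin
starting point are all assumptions made for illustration. It shows the
idea of skipping the spinner itself, skipping VCPUs that are blocked or
already on a CPU, and choosing the first one that is runnable but
preempted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vcpu:
    """Hypothetical per-VCPU state; not the kernel's struct kvm_vcpu."""
    runnable: bool  # wants CPU time (not halted or blocked)
    on_cpu: bool    # currently executing on a physical CPU


def pick_yield_target(vcpus: List[Vcpu], spinner: int,
                      last_boosted: int) -> Optional[int]:
    """Return the index of a VCPU worth yielding to, or None.

    Candidates are scanned round-robin, starting just after the VCPU
    that was chosen last time, so the same VCPU is not always favoured.
    """
    n = len(vcpus)
    for step in range(1, n + 1):
        idx = (last_boosted + step) % n
        cand = vcpus[idx]
        if idx == spinner:
            continue  # never yield to ourselves
        if not cand.runnable:
            continue  # blocked VCPUs cannot be holding us up
        if cand.on_cpu:
            continue  # already running, nothing to donate
        return idx    # runnable but preempted: possible lock holder
    return None


# VCPU 0 spins; VCPU 1 and 3 are preempted, VCPU 2 is halted.
guest = [Vcpu(True, True), Vcpu(True, False),
         Vcpu(False, False), Vcpu(True, False)]
target = pick_yield_target(guest, spinner=0, last_boosted=1)
```

With last_boosted=1 the scan starts at VCPU 2, skips it because it is
halted, and settles on VCPU 3. Whether that VCPU really holds the lock
is unknown to the host, hence "hopefully" in the description above.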
> v8:
> - some more changes and cleanups suggested by Peter
> v7:
> - move the vcpu to pid mapping to inside the vcpu->mutex
> - rename ->yield to ->skip
> - merge patch 5 into patch 4
> v6:
> - implement yield_task_fair in a way that works with task groups,
>   this allows me to actually get a performance improvement!
> - fix another race Avi pointed out, the code should be good now
> v5:
> - fix the race condition Avi pointed out, by tracking vcpu->pid
> - also allows us to yield to vcpu tasks that got preempted while in qemu
>   userspace
> v4:
> - change to newer version of Mike Galbraith's yield_to implementation
> - chainsaw out some code from Mike that looked like a great idea, but
>   turned out to give weird interactions in practice
> v3:
> - more cleanups
> - change to Mike Galbraith's yield_to implementation
> - yield to spinning VCPUs, this seems to work better in some
>   situations and has little downside potential
> v2:
> - make lots of cleanups and improvements suggested
> - do not implement timeslice scheduling or fairness stuff
>   yet, since it is not entirely clear how to do that right
>   (suggestions welcome)
>
>
> Benchmark results:
>
> Two 4-CPU KVM guests are pinned to the same 4 physical CPUs.
>
> One guest runs the AMQP performance test, the other guest runs
> 0, 2 or 4 infinite loops, for CPU overcommit factors of 1 (no
> overcommit), 1.5 and 2.
>
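As a quick check of the setup arithmetic, the overcommit factor can be
taken as busy VCPUs divided by physical CPUs. This is a sketch under
the assumption that the AMQP guest keeps all 4 of its VCPUs busy; the
function and constant names are made up for the example.

```python
def overcommit_factor(busy_vcpus: int, physical_cpus: int) -> float:
    """Ratio of VCPUs that want to run to physical CPUs available."""
    return busy_vcpus / physical_cpus


PHYSICAL_CPUS = 4      # both guests are pinned to the same 4 CPUs
AMQP_GUEST_VCPUS = 4   # assumption: all 4 are busy during the test

# The second guest runs 0, 2 or 4 infinite loops.
factors = [overcommit_factor(AMQP_GUEST_VCPUS + loops, PHYSICAL_CPUS)
           for loops in (0, 2, 4)]
```

Under that assumption the three configurations come out at 1.0, 1.5
and 2.0, which lines up with the "no overcommit", "1.5x overcommit"
and "2x overcommit" columns in the results.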
> The AMQP perftest is run 30 times, with message payloads of 8 and 16 bytes.
>
> size8      no overcommit   1.5x overcommit   2x overcommit
>
> no PLE        223801           135137           104951
> PLE           224135           141105           118744
>
> size16     no overcommit   1.5x overcommit   2x overcommit
>
> no PLE        222424           126175           105299
> PLE           222534           138082           132945
>
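To make the tables easier to compare, the relative change of the PLE
row over the no PLE row can be computed per column. The numbers below
are copied from the tables above; the dictionary layout and the helper
name are only illustrative, and the unit of the scores is not stated
in the mail, so only ratios are computed.

```python
COLUMNS = ("no overcommit", "1.5x overcommit", "2x overcommit")

# Scores as reported in the benchmark tables.
RESULTS = {
    "size8": {
        "no PLE": (223801, 135137, 104951),
        "PLE": (224135, 141105, 118744),
    },
    "size16": {
        "no PLE": (222424, 126175, 105299),
        "PLE": (222534, 138082, 132945),
    },
}


def relative_gain_percent(baseline: int, patched: int) -> float:
    """Percentage change of patched relative to baseline."""
    return 100.0 * (patched - baseline) / baseline


gains = {
    size: {
        col: relative_gain_percent(base, ple)
        for col, base, ple in zip(COLUMNS, rows["no PLE"], rows["PLE"])
    }
    for size, rows in RESULTS.items()
}

for size, per_column in gains.items():
    for col, gain in per_column.items():
        print(f"{size:7s} {col:16s} {gain:+6.2f}%")
```

The difference is well under 1% without overcommit and grows with the
overcommit factor, reaching roughly 13% (size8) and 26% (size16) at 2x.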
> Note: this is with the KVM guests NOT running inside cgroups. There
> seems to be a CPU load balancing issue with cgroup fair group scheduling,
> which often results in one guest getting only 80% CPU time and the other
> guest 320%. That will have to be fixed to get meaningful results with
> cgroups.
>
> CPU time division between the AMQP guest and the infinite loop guest
> was not exactly fair, but the guests got close to the same amount
> of CPU time in each test run.
>
> There is a substantial amount of randomness in CPU time division between
> guests, but the performance improvement is consistent between multiple
> runs.
>
I've merged tip's sched/core, which includes yield_to(), and applied the
final three patches. Thanks.
--
error compiling committee.c: too many arguments to function