[Qemu-devel] Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: vatsa@linux.vnet.ibm.com
Cc: Mike@gnu.org, kvm@vger.kernel.org, Galbraith <efault@gmx.de>,
	qemu-devel@nongnu.org, Chris Wright <chrisw@sous-sol.org>,
	Anthony Liguori <aliguori@linux.vnet.ibm.com>,
	Avi Kivity <avi@redhat.com>
Subject: [Qemu-devel] Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)
Date: Wed, 01 Dec 2010 17:25:18 +0100	[thread overview]
Message-ID: <1291220718.32004.1696.camel@laptop> (raw)
In-Reply-To: <20101201161221.GA8073@linux.vnet.ibm.com>

On Wed, 2010-12-01 at 21:42 +0530, Srivatsa Vaddagiri wrote:

> Not if yield() remembers what timeslice was given up and adds that back when
> thread is finally ready to run. Figure below illustrates this idea:
> 
> 
>       A0/4    C0/4 D0/4 A0/4  C0/4 D0/4 A0/4  C0/4 D0/4 A0/4 
> p0   |----|-L|----|----|----|L|----|----|----|L|----|----|----|--------------|
>             \                \               \                  \
> 	   B0/2[2]	    B0/0[6]         B0/0[10]            B0/14[0]
> 
>  
> where,
> 	p0	-> physical cpu0
> 	L	-> denotes period of lock contention
> 	A0/4    -> means vcpu A0 (of guest A) ran for 4 ms
> 	B0/2[6] -> means vcpu B0 (of guest B) ran for 2 ms (and has given up
> 		   6ms worth of its timeslice so far). In reality, we should
> 	     	   not see too much of "given up" timeslice for a vcpu.
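
One possible reading of the quoted figure, as a minimal sketch: each turn the vcpu's budget is its normal slice plus whatever it gave up earlier, and anything it does not use is carried forward. The 4 ms slice is taken from the A0/C0/D0 runs in the figure. The function name `run_with_credit` and the `None` convention are illustrative assumptions, not part of any proposed patch.

```python
SLICE_MS = 4  # assumed per-turn timeslice, matching the 4 ms runs in the figure


def run_with_credit(wanted_per_turn, slice_ms=SLICE_MS):
    """Return a list of (ran_ms, given_up_ms) pairs, one per scheduling turn.

    wanted_per_turn[i] is how long the vcpu runs in turn i before it yields
    on lock contention; None means "run until the whole budget is used".
    """
    credit = 0  # timeslice given up so far and not yet paid back
    history = []
    for wanted in wanted_per_turn:
        budget = slice_ms + credit            # normal slice plus remembered credit
        ran = budget if wanted is None else min(wanted, budget)
        credit = budget - ran                 # unused time is carried forward
        history.append((ran, credit))
    return history


# B0 yields after 2 ms, then immediately twice, then finally gets the lock.
print(run_with_credit([2, 0, 0, None]))
```

Under these assumptions the sequence reproduces the B0 annotations in the figure: B0/2[2], B0/0[6], B0/0[10], B0/14[0].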

/me fails to parse

> > >Regarding directed yield, do we have any reliable mechanism to find target of
> > >directed yield in this (unmodified/non-paravirtualized guest) case? IOW how do
> > >we determine the vcpu thread to which cycles need to be yielded upon contention?
> > 
> > My idea was to yield to a random starved vcpu of the same guest.
> > There are several cases to consider:
> > 
> > - we hit the right vcpu; lock is released, party.
> > - we hit some vcpu that is doing unrelated work.  yielding thread
> > doesn't make progress, but we're not wasting cpu time.
> > - we hit another waiter for the same lock.  it will also PLE exit
> > and trigger a directed yield.  this increases the cost of directed
> > yield by a factor of count_of_runnable_but_not_running_vcpus, which
> > could be large, but not disastrously so (i.e. don't run a 64-vcpu
> > guest on a uniprocessor host with this)
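
The target selection described in the quoted list could be sketched roughly as follows. This is a toy model, not KVM code: the `Vcpu` class and `pick_yield_target` are hypothetical names, and "starved" is taken to mean "runnable but not currently running", following the wording in the quote.

```python
import random
from dataclasses import dataclass


@dataclass
class Vcpu:
    guest: str      # which guest this vcpu belongs to
    index: int
    runnable: bool  # wants cpu time
    running: bool   # currently on a physical cpu


def pick_yield_target(yielder, vcpus, rng=random):
    """Pick a random runnable-but-not-running vcpu of the yielder's guest.

    Returns None when there is no candidate. The caller cannot tell whether
    the chosen vcpu is the lock holder, an unrelated vcpu, or another waiter.
    """
    candidates = [
        v for v in vcpus
        if v.guest == yielder.guest
        and v is not yielder
        and v.runnable
        and not v.running
    ]
    return rng.choice(candidates) if candidates else None


vcpus = [
    Vcpu("A", 0, runnable=True, running=True),   # the spinning vcpu
    Vcpu("A", 1, runnable=True, running=False),  # preempted: a candidate
    Vcpu("A", 2, runnable=False, running=False), # idle: not a candidate
    Vcpu("B", 0, runnable=True, running=False),  # other guest: never chosen
]
target = pick_yield_target(vcpus[0], vcpus, random.Random(0))
print(target)
```

The size of the candidate list is the count_of_runnable_but_not_running_vcpus factor mentioned in the third case.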
> > 
> > >>  So if you were to test something similar running with a 20% vcpu
> > >>  cap, I'm sure you'd run into similar issues.  It may show with fewer
> > >>  vcpus (I've only tested 64).
> > >>
> > >>  >Are you assuming the existence of a directed yield and the
> > >>  >specific concern is what happens when a directed yield happens
> > >>  >after a PLE and the target of the yield has been capped?
> > >>
> > >>  Yes.  My concern is that we will see the same kind of problems
> > >>  directed yield was designed to fix, but without allowing directed
> > >>  yield to fix them.  Directed yield was designed to fix lock holder
> > >>  preemption under contention,
> > >
> > >For modified guests, something like [2] seems to be the best approach to fix
> > >lock-holder preemption (LHP) problem, which does not require any sort of
> > >directed yield support. Essentially upon contention, a vcpu registers its lock
> > >of interest and goes to sleep (via hypercall) waiting for lock-owner to wake it
> > >up (again via another hypercall).
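
As a rough illustration of the scheme in the quoted paragraph, here is a single-threaded toy model. `ToyHypervisor.wait_on` and `ToyHypervisor.kick` stand in for the two hypercalls and are hypothetical names, not an existing KVM or Xen interface. Handing the lock directly to the woken waiter is a simplification made for the sketch.

```python
from collections import defaultdict, deque


class ToyHypervisor:
    def __init__(self):
        self.waiters = defaultdict(deque)  # lock id -> sleeping vcpus, FIFO
        self.sleeping = set()

    def wait_on(self, vcpu, lock_id):
        """Stand-in for the 'sleep' hypercall: register the lock and block."""
        self.waiters[lock_id].append(vcpu)
        self.sleeping.add(vcpu)

    def kick(self, lock_id):
        """Stand-in for the 'wake' hypercall issued by the lock owner."""
        if not self.waiters[lock_id]:
            return None
        vcpu = self.waiters[lock_id].popleft()
        self.sleeping.discard(vcpu)
        return vcpu


class ToyLock:
    def __init__(self, lock_id, hv):
        self.lock_id = lock_id
        self.hv = hv
        self.owner = None

    def acquire(self, vcpu):
        """Take the lock if free; otherwise sleep instead of spinning."""
        if self.owner is None:
            self.owner = vcpu
            return True
        self.hv.wait_on(vcpu, self.lock_id)
        return False

    def release(self, vcpu):
        """Release the lock and wake one waiter, which becomes the owner."""
        if self.owner != vcpu:
            raise RuntimeError("release by non-owner")
        self.owner = self.hv.kick(self.lock_id)
        return self.owner


hv = ToyHypervisor()
lock = ToyLock("runqueue", hv)
print(lock.acquire("A0"), lock.acquire("A1"), lock.release("A0"))
```

In this model a waiter consumes no cpu time while the holder is preempted, and it loses only as much time as the lock is actually held.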
> > 
> > Right.
> 
> We don't have these hypercalls for KVM atm, which I am working on now.
> 
> > >For unmodified guests, IMHO a plain yield (or slightly enhanced yield [1])
> > >should fix the LHP problem.
> > 
> > A plain yield (ignoring no-opiness on Linux) will penalize the
> > running guest wrt other guests.  We need to maintain fairness.
> 
> Agreed on the need to maintain fairness.

Directed yield and fairness don't mix well either. You can end up
feeding the other tasks more time than you'll ever get back.

> > >Fyi, Xen folks also seem to be avoiding a directed yield for some of the same
> > >reasons [3].
> > 
> > I think that fails for unmodified guests, where you don't know when
> > the lock is released and so you don't have a wake_up notification.
> > You lost a large timeslice and you can't gain it back, whereas with
> > pv the wakeup means you only lose as much time as the lock was held.
> > 
> > >Given this line of thinking, hard-limiting guests (either in user-space or
> > >kernel-space, latter being what I prefer) should not have adverse interactions
> > >with LHP-related solutions.
> > 
> > If you hard-limit a vcpu that holds a lock, any waiting vcpus are
> > also halted.
> 
> This can also happen in the normal case, when lock holders are preempted. So
> it is not a new problem that hard limits are introducing!

No, but hard limits make it _much_ worse.

> >  With directed yield you can let the lock holder make
> > some progress at the expense of another vcpu.  A regular yield()
> > will simply stall the waiter.
> 
> Agreed. Do you see any problems with the slightly enhanced version of yield
> described above (rather than directed yield)? It has some advantage over 
> directed yield in that it preserves not only fairness between VMs but also 
> fairness between VCPUs of a VM. Also it avoids the need for a guessing game 
> mentioned above and bad interactions with hard-limits.
> 
> CCing other scheduler experts for their opinion of proposed yield() extensions.

sys_yield() usage for anything other than two FIFO threads of the same
priority goes to /dev/null.

The Xen paravirt spinlock solution is relatively sane, use that.
Unmodified guests suck anyway, there's really nothing much sane you can
do there as you don't know who owns what lock.
