From: Keir Fraser <keir.xen@gmail.com>
To: Andres Lagar-Cavilla <andreslc@gridcentric.ca>,
	Tim Deegan <tim@xen.org>, Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: Wait Queues
Date: Thu, 08 Nov 2012 07:42:35 +0000
Message-ID: <CCC112EB.443D1%keir.xen@gmail.com>
In-Reply-To: <25D45C62-FB0E-4345-A5A1-9A37F219477B@gridcentric.ca>

On 08/11/2012 03:22, "Andres Lagar-Cavilla" <andreslc@gridcentric.ca> wrote:

>> I'd like to propose an approach that ensures that, as long as some properties
>> are met, arbitrary wait-queue sleep is allowed. Here are the properties:
>> 1. Third parties servicing a wait queue sleep are indeed third parties. In
>> other words, dom0 does not do paging.
>> 2. Vcpus of a wait-queue-servicing domain may never go to sleep on a wait
>> queue during a foreign map.
>> 3. A guest vcpu may go to sleep on a wait queue holding any kind of lock, as
>> long as it does not hold the p2m lock.
> 
> N.B.: I understand (now) that this may cause any other vcpu contending on a
> lock held by the wait-queue sleeper to spin without yielding to the
> scheduler, pinning its physical cpu.
> 
> What I am struggling with is coming up with a solution that doesn't turn
> hypervisor mm hacking into a locking minefield.
> 
> Linux fixes this with many kinds of sleeping synchronization primitives. A
> task can, for example, hold the mmap semaphore and sleep on a wait queue. Is
> this the only way out of this mess? Not if wait queues force the vcpu to wake
> up on the same phys cpu it was using at the time of sleeping...
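
For reference, the Linux pattern being described looks roughly like this
(a sketch only; pager_wq and page_ready are made-up names, mmap_sem as in
Linux of this era):

    #include <linux/mm_types.h>
    #include <linux/rwsem.h>
    #include <linux/wait.h>

    static DECLARE_WAIT_QUEUE_HEAD(pager_wq);

    /* Sleeping while holding a sleeping lock is legal in Linux:
     * contenders on mmap_sem block and yield the cpu rather than spin. */
    static void wait_for_page(struct mm_struct *mm, bool *page_ready)
    {
        down_read(&mm->mmap_sem);          /* sleeping lock, not a spinlock */
        wait_event(pager_wq, *page_ready); /* sleep, mmap_sem still held    */
        /* ... touch the now-present page under mmap_sem ... */
        up_read(&mm->mmap_sem);
    }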

Well, the forcing of a vcpu to wake up on the same phys cpu it slept on is
going to be fixed. But it's not clear to me how that current restriction makes
the problem harder. What if you were running on a single-phys-cpu system?

As you have realised, the fact that all locks in Xen are spinlocks makes the
potential for deadlock very obvious: another vcpu gets scheduled and takes out
the phys cpu by spinning on a lock whose holder is descheduled.
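
Concretely, as a sketch (all names hypothetical; wait_event() stands in for
Xen's wait-queue sleep, and the same-pcpu wake restriction is as today):

    static DEFINE_SPINLOCK(lock);       /* any hypervisor spinlock */
    static struct waitqueue_head wq;    /* assumed initialised elsewhere */

    void vcpu_A(void)                   /* runs, then sleeps, on pcpu 0 */
    {
        spin_lock(&lock);
        wait_event(&wq, condition_met()); /* descheduled, lock still held */
        spin_unlock(&lock);
    }

    void vcpu_B(void)                   /* scheduled next on pcpu 0 */
    {
        spin_lock(&lock);   /* spins for ever: the holder is descheduled
                             * and can only wake on this very pcpu, which
                             * we are now burning */
        spin_unlock(&lock);
    }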

Linux-style sleeping mutexes might help. We could add those. They don't help
as readily as in the Linux case, however! In some ways they push the deadlock
up one level of abstraction, to the virtual cpu (vcpu). Consider a single-vcpu
dom0 running a pager -- even if you are careful that the pager itself does
not acquire any locks that one of its clients may hold-while-sleeping, if
*anything* running in dom0 can acquire such a lock, you have an obvious
deadlock, as that will take out the dom0 vcpu and leave it blocked forever
waiting for a lock that is held while its holder waits for service from the
dom0 vcpu....
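
As a sketch (all names hypothetical; 'struct mutex' assumes the Linux-style
sleeping primitive were added to Xen):

    static struct mutex shared;   /* any lock dom0 code can also take */

    void guest_vcpu(void)
    {
        mutex_lock(&shared);
        wait_event(&pager_wq, page_is_in()); /* service must come from
                                              * the dom0 pager */
        mutex_unlock(&shared);
    }

    void dom0_vcpu(void)          /* dom0's only vcpu */
    {
        mutex_lock(&shared);      /* blocks: the guest holds it, so the
                                   * pager never runs, the page never
                                   * arrives, the guest never wakes, and
                                   * 'shared' is never released */
        mutex_unlock(&shared);
    }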

I don't think there is an easy solution here!

 -- Keir
