xen-devel.lists.xenproject.org archive mirror
From: Tim Deegan <tim@xen.org>
To: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Cc: xen-devel@lists.xensource.com, keir@xen.org
Subject: Re: Domain relinquish resources racing with p2m access
Date: Thu, 2 Feb 2012 13:34:59 +0000
Message-ID: <20120202133459.GP48883@ocelot.phlegethon.org>
In-Reply-To: <0642c1aa7bb490b322c1a5c7d12ebb54.squirrel@webmail.lagarcavilla.org>

At 12:49 -0800 on 01 Feb (1328100564), Andres Lagar-Cavilla wrote:
> So we've run into this interesting (race?) condition while doing
> stress-testing. We pummel the domain with paging, sharing and mmap
> operations from dom0, and concurrently we launch a domain destruction.
> Often we get in the logs something along these lines
> 
> (XEN) mm.c:958:d0 Error getting mfn 859b1a (pfn ffffffffffffffff) from L1
> entry 8000000859b1a625 for l1e_owner=0, pg_owner=1
> 
> We're using the synchronized p2m patches just posted, so my analysis is as
> follows:
> 
> - the domain destroy domctl kicks in. It calls relinquish resources. This
> disowns and puts most domain pages, resulting in invalid (0xff...ff) m2p
> entries
> 
> - In parallel, a do_mmu_update is making progress, it has no issues
> performing a p2m lookup because the p2m has not been torn down yet; we
> haven't gotten to the RCU callback. Eventually, the mapping fails in
> page_get_owner in get_page_from_l1e.
> 
> The map is failed, as expected, but what makes me uneasy is the fact that
> there is a still active p2m lurking around, with seemingly valid
> translations to valid mfn's, while all the domain pages are gone.

Yes.  That's OK as long as we know that any user of that page will
fail, but I'm not sure that we do.   

At one point we talked about get_gfn() taking a refcount on the
underlying MFN, which would fix this more cleanly.  ISTR the problem was
how to make sure the refcount was moved when the gfn->mfn mapping
changed. 
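
A minimal user-space sketch of that idea, not Xen code: every name here
(toy_page, toy_domain, toy_get_gfn, toy_put_gfn, toy_relinquish) is
hypothetical, the counters are plain integers rather than atomics, and the
hard part mentioned above (moving the reference when the gfn->mfn mapping
changes) is deliberately left out.  It only shows that a lookup which pins
the MFN keeps the page alive across teardown, and that a stale translation
left in the p2m fails cleanly once the page is gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_GFNS     4
#define TOY_INVALID_MFN ((unsigned long)-1)

/* Hypothetical stand-in for a page frame: it is free once refcount is 0. */
struct toy_page {
    unsigned int refcount;
};

/* Hypothetical stand-in for a domain and its p2m. */
struct toy_domain {
    unsigned long p2m[TOY_NR_GFNS];  /* gfn -> mfn, or TOY_INVALID_MFN */
    struct toy_page *frames;         /* indexed by mfn */
};

/* Drop one reference; returns true if that was the last one. */
static bool toy_put_page(struct toy_page *pg)
{
    assert(pg->refcount > 0);
    return --pg->refcount == 0;
}

/*
 * Translate gfn and pin the result.  Fails if the gfn is unmapped or if
 * the page behind a (possibly stale) translation has already been freed.
 */
static bool toy_get_gfn(struct toy_domain *d, unsigned long gfn,
                        unsigned long *mfn)
{
    if (gfn >= TOY_NR_GFNS)
        return false;

    unsigned long m = d->p2m[gfn];
    if (m == TOY_INVALID_MFN || d->frames[m].refcount == 0)
        return false;

    d->frames[m].refcount++;
    *mfn = m;
    return true;
}

/* Release the reference taken by a successful toy_get_gfn(). */
static void toy_put_gfn(struct toy_domain *d, unsigned long mfn)
{
    toy_put_page(&d->frames[mfn]);
}

/*
 * Teardown, to be called once: drop the domain's own reference on every
 * mapped page.  The p2m entries are left in place on purpose, mirroring
 * the report where the p2m outlives the pages.
 */
static void toy_relinquish(struct toy_domain *d)
{
    for (unsigned long gfn = 0; gfn < TOY_NR_GFNS; gfn++) {
        unsigned long m = d->p2m[gfn];
        if (m != TOY_INVALID_MFN)
            toy_put_page(&d->frames[m]);
    }
}
```

In this toy, each page starts with a refcount of 1 for the owning domain.
A page pinned by toy_get_gfn() before toy_relinquish() runs is only freed
by the matching toy_put_gfn(); a lookup made after teardown sees a zero
refcount and fails even though the translation is still present.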

Can you stick a WARN() in mm.c to get the actual path that leads to the
failure?

Tim.
