From: Ian Campbell <Ian.Campbell@citrix.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Keir Fraser <Keir.Fraser@eu.citrix.com>
Subject: Re: dom0 pvops crash
Date: Wed, 27 Jan 2010 19:18:50 +0000
Message-ID: <1264619930.2392.248.camel@localhost.localdomain>
In-Reply-To: <4B608AE2.1020501@goop.org>

On Wed, 2010-01-27 at 18:50 +0000, Jeremy Fitzhardinge wrote:
> On 01/27/2010 09:26 AM, Ian Campbell wrote:
> > On Mon, 2010-01-25 at 20:02 +0000, Jeremy Fitzhardinge wrote:
> >
> >> IanC, Pasi, myself and others explored a number of other ways to try
> >> and fix it in the Xen pvops code, but they all turned out to be very
> >> expensive, just not work (they just pushed the race around), or
> >> require new pvops just for this case.
> >>
> > Just to brainstorm a bit more:
> >
> > There's no way a kunmap_atomic pvop would be acceptable? it would at
> > least make the API symmetrical.
> >
>
> We could propose it, but I think we have bigger things to spend our
> capital on. And I'm not sure it would help:
>
> In theory xen_kmap_atomic could take the pte lock and unmap_atomic could
> release it. But kmap_atomic doesn't have enough info to be able to take
> the lock, and unmap wouldn't either unless we passed it some odd
> parameters. And even if we did take the lock, the calling kernel code
> will also attempt to take the lock if it actually wants to make a pte
> change, so we'd have to change the logic there.

OK, so that idea is out.

> > What about a hypercall which would set a PTE with the writable bit set
> > atomically depending on the pinned status of the referenced page? (I
> > haven't even vaguely thought this idea through).
> >
>
> It doesn't really help because the core issue is the race which changes
> the page state half way through. If we create a writable mapping, a pin
> on another CPU is going to fail.

I think it could be constructed such that the pin and the new hypercall
collude and do the right thing, somehow... Anyway it doesn't matter, I
think the idea below is much more likely to yield a useful solution.

> We could fix it by locking the pte
> while it is mapped, but then we wouldn't need a new hypercall.
>
> > Is there some way we can disable HIGHPTE at runtime even if
> > CONFIG_HIGHPTE=y? Looks like that might be relatively self-contained in
> > pte_alloc_one(). All the actual uses of high PTEs go through
> > kmap_atomic which explicitly tests for PageHighmem so by ensuring PTEs
> > are never high at allocation time we would skip all those paths.
> > Something like the untested patch below, but not so skanky, obviously.
> >
>
> That's a thought. It could be generally useful too; highpte should only
> be used in extreme circumstances (to prevent ptes from filling most of
> lowmem), not on every system with highmem. IOW use a generic flag
> rather than make it explicitly Xen-related, then we can set that flag.

I think this is the most plausible idea. Need to think about what
criteria would be used to set the flag on native, simply raw RAM size?
i.e. you wouldn't use HIGHPTE on a 4G system, even if CONFIG_HIGHPTE is
enabled, but where would the cut-off be?

Rather than a flag I guess I'd make a pte_gfp variable which could be
modified to suit.
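
To make the pte_gfp idea a bit more concrete, here is a minimal userspace
model of it, not kernel code. The MODEL_* flag values, the
pte_gfp_disallow_highmem() hook and the model_* helpers are all made up
for illustration; the only things taken from the discussion are the
pte_gfp name and the notion that pte_alloc_one() would consult it rather
than hard-coding the highmem bit.

```c
/* Userspace model only: flag values and helper names are hypothetical,
 * they are not the kernel's real definitions. */
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_model_t;

#define MODEL_GFP_KERNEL  0x01u
#define MODEL_GFP_ZERO    0x02u
#define MODEL_GFP_HIGHMEM 0x04u

/* Default models a CONFIG_HIGHPTE=y build: PTE pages may come from highmem. */
static gfp_model_t pte_gfp =
	MODEL_GFP_KERNEL | MODEL_GFP_ZERO | MODEL_GFP_HIGHMEM;

/* Hypothetical hook a platform could call once during early setup
 * to keep PTE pages out of highmem at runtime. */
static void pte_gfp_disallow_highmem(void)
{
	pte_gfp &= ~MODEL_GFP_HIGHMEM;
}

/* Stand-in for the page allocator: reports whether a page requested
 * with these flags would be allowed to land in highmem. */
static bool model_alloc_may_be_high(gfp_model_t gfp)
{
	assert(gfp & MODEL_GFP_KERNEL);
	return (gfp & MODEL_GFP_HIGHMEM) != 0;
}

/* Stand-in for pte_alloc_one(): uses the variable, not a fixed mask. */
static bool model_pte_alloc_one_may_be_high(void)
{
	return model_alloc_may_be_high(pte_gfp);
}
```

In this model a Xen setup path would call the hook once at boot, while
native would leave the default alone or clear the bit based on whatever
RAM-size cut-off ends up being chosen.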

> Or we could just put a big fat config dependency in.

I'd imagine that seemingly random "depends !XEN" would be unpopular
upstream.

> > This last would be nice since it also removes the
> > crippling-for-virtualisation overhead, so it would potentially benefit
> > KVM and VMI as well...
> >
>
> VMI is a non-issue, and I don't think HIGHPTE is extraordinarily
> expensive on kvm.

It would be expensive for shadow mode (three traps to update a PTE) but
I guess for EPT/NPT it is around as cheap as on native.

> >> Given that HIGHPTE is generally a bad idea and should be deprecated
> >> (any machine big enough to need it should definitely be running a
> >> 64-bit kernel), I've left it on the backburner hoping for some
> >> inspiration to strike. So far it has not.
> >>
> > Unfortunately distros seem to be using it for their native kernels and
> > since pvops means they won't have a separate xen kernel I think we need
> > to figure something out.
> >
>
> We could lobby for them to turn it off.

As a separate action to the above that seems like it might be worthwhile.

> I wonder if they have a real
> user demand for it these days. It could only be important for users
> with lots of physical memory and a 32-bit only CPU, which can't be
> common now.

I guess it is hard for them to judge and so they are relatively
conservative about switching a long standing option off?

> (There should be no problem with using a 64-bit kernel,
> even if userspace is all 32-bit.).

A lot of distros have been a bit slow on the uptake with that
configuration.

Ian.