From: George Dunlap <george.dunlap@eu.citrix.com>
To: Dave McCracken <dcm@mccr.org>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
Xen Developers <xen-devel@lists.xen.org>
Subject: Re: Issue with PV superpage handling
Date: Mon, 25 Jun 2012 16:08:27 +0100
Message-ID: <4FE87EEB.7060507@eu.citrix.com>
In-Reply-To: <201206250938.21418.dcm@mccr.org>
On 25/06/12 15:38, Dave McCracken wrote:
> Awhile back I added the domain config flag "superpages" to support Linux
> hugepages in PV domains. When the flag is set, the PV domain is populated
> entirely with superpages. If not enough superpage-sized chunks can be found,
> the domain creation fails.
>
> At some time after my patch was accepted, the code I added to domain restore
> was removed because I broke page allocation batching. I put it on my TODO
> list to reimplement it, then it got lost, for which I apologize.
>
> Now I have gotten back to reimplementing PV superpage support in restore, I
> find that recently other code was added to restore that, while triggered by
> the superpage flag, only allocates superpages opportunistically and falls back
> to small pages if it fails. This breaks the original semantics of the flag
> and could cause any OS that depends on the semantics to fail catastrophically.
>
> I have a patch that implements the original semantics of the superpage flag
> while preserving the batch allocation behavior. I can remove the competing
> code and submit mine, but I have a question. What value is there in
> implementing opportunistic allocation of superpages for a PV (or an HVM)
> domain in restore? It clearly can't be based on the superpages flag.
> Opportunistic superpage allocation is already the default behavior for HVM
> domain creation. Should it also be a default on HVM restore? What about for
> PV domains? Is there any real benefit?
Well, the value of having superpages for HVM guests is pretty obvious.
When using hardware assisted paging (HAP), the number of memory
reads on a TLB miss is roughly guest_levels * p2m_levels -- so on a 64-bit
guest, the one extra level of p2m could cause up to 4 extra memory reads
for every TLB miss. The reason to do it opportunistically instead of
all-or-nothing is that there's no reason not to -- every little helps. :-)
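To make the arithmetic concrete, here is a rough sketch of the count. The function names and constants are made up for illustration and are not Xen code. The first function is the simple product used above; the second is the commonly cited count for a full two-dimensional nested walk, which also charges the guest pagetable reads themselves and the translation of the final guest-physical address. Both assume no paging-structure caches, so they are upper bounds.

```python
def reads_simple(guest_levels: int, p2m_levels: int) -> int:
    """Approximation from this mail: each guest level costs one p2m walk."""
    return guest_levels * p2m_levels


def reads_2d_walk(guest_levels: int, p2m_levels: int) -> int:
    """Full two-dimensional walk: each guest table pointer and the final
    guest-physical address need a p2m walk, plus one read per guest entry."""
    return (guest_levels + 1) * (p2m_levels + 1) - 1


GUEST_LEVELS = 4  # 64-bit guest pagetables
P2M_4K = 4        # p2m backed by 4KiB pages (assumed 4-level p2m)
P2M_2M = 3        # p2m backed by 2MiB superpages: one level fewer

extra_simple = reads_simple(GUEST_LEVELS, P2M_4K) - reads_simple(GUEST_LEVELS, P2M_2M)
extra_2d = reads_2d_walk(GUEST_LEVELS, P2M_4K) - reads_2d_walk(GUEST_LEVELS, P2M_2M)
print(extra_simple, extra_2d)
```

Either way of counting gives the same conclusion: every region that ends up backed by small pages pays for an extra p2m level on each miss, which is why partial superpage coverage is still worth having.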
My question is, what is the value of enforcing all-or-nothing for PV
guests? Is it the case that PV guests have to be entirely in either one
mode or the other?
I'm not particularly fussed about having a way to disable the
opportunistic superpage allocation for HVM guests; I'd be happy just
turning it on all the time. I only really used the flag because I saw it
was being passed but wasn't being used; I didn't realize it was meant to
have the "use superpages or abort" semantics. My only non-negotiable is
that we have *a way* to get opportunistic superpages for HVM guests.
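For reference, the two semantics under discussion can be sketched as below. This is a toy model, not the libxc restore code: `populate`, its parameters, and the `strict` switch are hypothetical names, and it assumes there are always enough small pages for the fallback.

```python
SUPERPAGE_4K_PAGES = 512  # one 2MiB superpage covers 512 4KiB pages


def populate(nr_pages: int, free_superpages: int, strict: bool):
    """Return (superpages_used, small_pages_used) for a domain of nr_pages.

    strict=True  models "use superpages or abort".
    strict=False models opportunistic allocation with small-page fallback.
    """
    if strict:
        needed = -(-nr_pages // SUPERPAGE_4K_PAGES)  # ceiling division
        if free_superpages < needed:
            raise MemoryError("not enough superpage-sized chunks")
        return needed, 0
    # Opportunistic: take as many whole superpages as are available,
    # then cover whatever is left with 4KiB pages.
    used = min(nr_pages // SUPERPAGE_4K_PAGES, free_superpages)
    return used, nr_pages - used * SUPERPAGE_4K_PAGES


print(populate(2048, 3, strict=False))
```

With three free superpages and a 2048-page domain, the opportunistic call succeeds with a mix of page sizes, while the strict call would raise instead of building the domain.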
-George