From: Avi Kivity <avi@redhat.com>
To: Paul Mackerras <paulus@samba.org>
Cc: Alexander Graf <agraf@suse.de>,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: PPC: Book3S HV: Make the guest MMU hash table size configurable
Date: Mon, 30 Apr 2012 11:31:42 +0300
Message-ID: <4F9E4DEE.60104@redhat.com>
In-Reply-To: <20120430044014.GA5428@drongo>

On 04/30/2012 07:40 AM, Paul Mackerras wrote:
> On Sun, Apr 29, 2012 at 04:37:33PM +0300, Avi Kivity wrote:
>
> > How difficult is it to have the kernel resize the HPT on demand?
>
> Quite difficult, unfortunately.  The guest kernel knows the size of
> the HPT, and the paravirt interface for updating it relies on the
> guest knowing it, since it is used in the hash function (the computed
> hash is taken modulo the HPT size).
>
> And even if it were possible to notify the guest that the size was
> changing, since it is a hash table, changing the size requires
> traversing the table to move hash entries to their new locations.
> When reducing the size one only has to traverse the part that is going
> away, but even that will be at least half of the table since the size
> is always a power of 2.
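
To illustrate the point above, here is a minimal sketch of a power-of-2 hash table index. It is not the real HPT layout (the hardware groups entries into PTEGs, and the names below are hypothetical); it only shows why a resize forces entries to move: the slot is the hash taken modulo the table size, so changing the size changes the slot for many entries.

```python
def bucket_index(hash_value: int, num_buckets: int) -> int:
    """Map a hash to a bucket; num_buckets must be a power of two."""
    assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
    # For a power-of-two size, masking equals hash_value % num_buckets.
    return hash_value & (num_buckets - 1)


def buckets_to_rehash_on_shrink(old_buckets: int, new_buckets: int) -> range:
    """Buckets that disappear when shrinking; their entries must be moved."""
    assert new_buckets < old_buckets
    return range(new_buckets, old_buckets)


def moved_on_grow(hashes, old_buckets: int, new_buckets: int) -> list:
    """Hashes whose bucket changes when the table grows."""
    return [h for h in hashes
            if bucket_index(h, old_buckets) != bucket_index(h, new_buckets)]


if __name__ == "__main__":
    # Halving a 16-bucket table removes buckets 8..15: half the table.
    print(len(buckets_to_rehash_on_shrink(16, 8)))
    # Doubling from 8 to 16 buckets relocates every hash with bit 3 set.
    print(len(moved_on_grow(range(32), 8, 16)))
```

Since the smallest possible shrink is a halving, the range of buckets to traverse is never less than half of the old table, which matches the observation above.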

I'm no x86 fan but I'm glad we have nothing like that over there.

>
> >  Guest
> > size is meaningless in the presence of memory hotplug, and having
> > unprivileged userspace pin down large amounts of kernel memory is
> > undesirable.
>
> I agree.  The HPT is certainly not ideal.  However, it's what we have
> to deal with on POWER hardware.
>
> One idea I had is to reserve some contiguous physical memory at boot
> time, say a couple of percent of system memory, and use that as a pool
> to allocate HPTs from.  That would limit the impact on the rest of the
> system and also make it more likely that we can find the necessary
> amount of physically contiguous memory.

Doesn't that limit the number of guests that can run?
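
A rough sketch of that limit, under assumed numbers: the 64 GiB host, the 2% reservation, and the 16 MiB HPT size are illustrative values, not figures from the patch, and the function name is hypothetical.

```python
MIB = 1 << 20
GIB = 1 << 30


def max_guests_from_pool(system_bytes: int, reserve_percent: int,
                         hpt_bytes: int) -> int:
    """Number of fixed-size HPTs that fit in a boot-time reserved pool."""
    pool_bytes = system_bytes * reserve_percent // 100
    # Each guest needs one whole HPT, so round down.
    return pool_bytes // hpt_bytes


if __name__ == "__main__":
    # Assumed example: 64 GiB host, 2% reserved, 16 MiB HPT per guest.
    print(max_guests_from_pool(64 * GIB, 2, 16 * MIB))
```

With these assumed values the pool holds 81 HPTs, so the reservation size directly caps the number of concurrently running guests.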

> > On x86 we grow and shrink the mmu resources in response to guest demand
> > and host memory pressure.  We can do this because the data structures
> > are not authoritative (don't know if that's the case for ppc) and
> > because they can be grown incrementally (pretty sure that isn't the case
> > on ppc).  Still, if we can do this at KVM_SET_USER_MEMORY_REGION time
> > instead of a separate ioctl, I think it's better.
>
> It's not practical to grow the HPT after the guest has started
> booting.  It is possible to have two HPTs: one that the guest sees,
> which can be in pageable memory, and another shadow HPT that the
> hardware uses, which has to be in physically contiguous memory.  In
> this model the size of the shadow HPT can be changed at will, at the
> expense of having to reestablish the entries in it, though that can be
> done on demand.  I have avoided that approach until now because it
> uses more memory and is slower than just having a single HPT.

This is similar to x86 in the pre-NPT/EPT days; it's indeed slow.  I
guess we'll be stuck with the pv hash until you get nested lookups (at
least a nested hash lookup is just 3 accesses instead of 24).

How are limits managed?  Won't a user creating a thousand guests with a
16MB hash each bring a server to its knees?
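
For scale, a back-of-the-envelope sketch of that scenario (the helper name is hypothetical, and it assumes each HPT is fully pinned and nothing is shared between guests):

```python
MIB = 1 << 20
GIB = 1 << 30


def total_pinned_hpt_bytes(num_guests: int, hpt_bytes_per_guest: int) -> int:
    """Total memory pinned if every guest gets its own HPT."""
    return num_guests * hpt_bytes_per_guest


if __name__ == "__main__":
    total = total_pinned_hpt_bytes(1000, 16 * MIB)
    print(f"{total / GIB:.3f} GiB pinned")
```

That works out to 15.625 GiB of unswappable, physically contiguous allocations for a thousand guests.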

-- 
error compiling committee.c: too many arguments to function