From: Paul Mackerras <paulus@samba.org>
To: Avi Kivity <avi@redhat.com>
Cc: Alexander Graf <agraf@suse.de>,
kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: PPC: Book3S HV: Make the guest MMU hash table size configurable
Date: Mon, 30 Apr 2012 14:40:14 +1000
Message-ID: <20120430044014.GA5428@drongo>
In-Reply-To: <4F9D441D.9050802@redhat.com>
On Sun, Apr 29, 2012 at 04:37:33PM +0300, Avi Kivity wrote:
> How difficult is it to have the kernel resize the HPT on demand?
Quite difficult, unfortunately. The guest kernel knows the size of
the HPT, and the paravirt interface for updating it relies on the
guest knowing it, since it is used in the hash function (the computed
hash is taken modulo the HPT size).
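As a rough illustration of why the table size is baked into the guest's view, here is a toy sketch in Python. The function toy_hash is a made-up stand-in, not the real POWER hash function, and hpt_slot is a hypothetical helper; the only property taken from the text above is that the slot is the hash modulo the (power-of-2) table size.

```python
def toy_hash(vpn: int) -> int:
    """Hypothetical hash of a virtual page number (NOT the real POWER hash)."""
    return (vpn ^ (vpn >> 7)) & 0xFFFFFFFF


def hpt_slot(vpn: int, hpt_size: int) -> int:
    """Slot index: the computed hash taken modulo the table size."""
    # The table size is assumed to be a power of 2.
    assert hpt_size > 0 and hpt_size & (hpt_size - 1) == 0
    return toy_hash(vpn) % hpt_size


# The same page lands in a different slot once the size changes,
# so whoever computes the slot has to know the current size.
slot_small = hpt_slot(1024, 1 << 10)  # 8
slot_large = hpt_slot(1024, 1 << 11)  # 1032
```

If the guest computes slots with one size while the table actually has another, it will look for (or place) entries in the wrong slots.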
And even if it were possible to notify the guest that the size was
changing, since it is a hash table, changing the size requires
traversing the table to move hash entries to their new locations.
When reducing the size one only has to traverse the part that is going
away, but even that will be at least half of the table since the size
is always a power of 2.
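The shrinking case can be sketched with a toy model in Python. This assumes a simplified table where each slot is an unbounded list (a real HPT has fixed-size groups, so a real shrink could also run out of room), and toy_hash, build and shrink_by_half are hypothetical names, not kernel functions.

```python
def toy_hash(vpn: int) -> int:
    """Hypothetical hash of a virtual page number (NOT the real POWER hash)."""
    return (vpn ^ (vpn >> 7)) & 0xFFFFFFFF


def build(vpns, size: int):
    """Build a toy table: slot = hash modulo size, each slot is a list."""
    table = [[] for _ in range(size)]
    for vpn in vpns:
        table[toy_hash(vpn) % size].append(vpn)
    return table


def shrink_by_half(table) -> int:
    """Halve the table in place; return how many entries had to move."""
    new_size = len(table) // 2
    moved = 0
    # Only the half that is going away needs to be traversed.
    # An entry in slot idx >= new_size now belongs in slot idx - new_size.
    for idx in range(new_size, len(table)):
        for vpn in table[idx]:
            table[idx - new_size].append(vpn)
            moved += 1
    del table[new_size:]
    return moved


vpns = list(range(0, 5000, 7))
table = build(vpns, 64)
moved = shrink_by_half(table)
```

Entries in the surviving half keep their slot because the hash modulo the new size is unchanged for them; growing the table has no such shortcut, since any slot may hold entries that belong in the new half.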
> Guest size is meaningless in the presence of memory hotplug, and
> having unprivileged userspace pin down large amounts of kernel
> memory is undesirable.
I agree. The HPT is certainly not ideal. However, it's what we have
to deal with on POWER hardware.
One idea I had is to reserve some contiguous physical memory at boot
time, say a couple of percent of system memory, and use that as a pool
to allocate HPTs from. That would limit the impact on the rest of the
system and also make it more likely that we can find the necessary
amount of physically contiguous memory.
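A minimal sketch of such a pool, in Python, under stated assumptions: HptPool is a hypothetical class, the first-fit free list is just one possible policy, and the requirement that each HPT be a power of 2 in size and aligned to its own size is an assumption made for the sketch rather than something specified above.

```python
class HptPool:
    """Hypothetical pool of reserved contiguous memory, kept as free ranges."""

    def __init__(self, base: int, size: int):
        self.free = [(base, size)]  # sorted list of (start, length)

    def alloc(self, size: int):
        """Return the start of a size-aligned block, or None if nothing fits."""
        assert size > 0 and size & (size - 1) == 0  # power of 2 (assumed)
        for i, (start, length) in enumerate(self.free):
            end = start + length
            aligned = (start + size - 1) & ~(size - 1)  # round up to alignment
            if aligned + size <= end:
                pieces = []
                if aligned > start:
                    pieces.append((start, aligned - start))
                if aligned + size < end:
                    pieces.append((aligned + size, end - aligned - size))
                self.free[i:i + 1] = pieces  # replace range with leftovers
                return aligned
        return None

    def release(self, start: int, size: int) -> None:
        """Return a block to the pool and merge adjacent free ranges."""
        self.free.append((start, size))
        self.free.sort()
        merged = []
        for s, length in self.free:
            if merged and merged[-1][0] + merged[-1][1] == s:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((s, length))
        self.free = merged


MB = 1 << 20
pool = HptPool(0, 64 * MB)
first = pool.alloc(16 * MB)
second = pool.alloc(32 * MB)
third = pool.alloc(32 * MB)  # None: the pool bounds what guests can pin
```

The point of the sketch is only that allocations are confined to the reserved region, so a failed request does not put pressure on the rest of the system.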
> On x86 we grow and shrink the mmu resources in response to guest demand
> and host memory pressure. We can do this because the data structures
> are not authoritative (don't know if that's the case for ppc) and
> because they can be grown incrementally (pretty sure that isn't the case
> on ppc). Still, if we can do this at KVM_SET_USER_MEMORY_REGION time
> instead of a separate ioctl, I think it's better.
It's not practical to grow the HPT after the guest has started
booting. It is possible to have two HPTs: one that the guest sees,
which can be in pageable memory, and another shadow HPT that the
hardware uses, which has to be in physically contiguous memory. In
this model the size of the shadow HPT can be changed at will, at the
expense of having to reestablish the entries in it, though that can be
done on demand. I have avoided that approach until now because it
uses more memory and is slower than just having a single HPT.
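The two-table model can be sketched as a toy in Python. Everything here is a simplification for illustration: ShadowHpt is a hypothetical class, the guest-visible HPT is modelled as a plain dict, the shadow table holds one entry per slot, and toy_hash is not the real POWER hash function.

```python
def toy_hash(vpn: int) -> int:
    """Hypothetical hash of a virtual page number (NOT the real POWER hash)."""
    return (vpn ^ (vpn >> 7)) & 0xFFFFFFFF


class ShadowHpt:
    """Toy shadow table refilled on demand from the guest-visible table."""

    def __init__(self, guest_hpt: dict, size: int):
        self.guest_hpt = guest_hpt  # guest's view: vpn -> real page number
        self.refills = 0            # how many entries were reestablished
        self.resize(size)

    def resize(self, size: int) -> None:
        """Change the shadow size; all shadow entries are simply dropped."""
        self.size = size
        self.slots = [None] * size

    def lookup(self, vpn: int):
        idx = toy_hash(vpn) % self.size
        entry = self.slots[idx]
        if entry is not None and entry[0] == vpn:
            return entry[1]         # hit in the shadow table
        if vpn not in self.guest_hpt:
            return None             # no translation in the guest's table
        rpn = self.guest_hpt[vpn]   # miss: reestablish the entry on demand
        self.slots[idx] = (vpn, rpn)
        self.refills += 1
        return rpn


guest = {1: 100, 2: 200, 3: 300}
shadow = ShadowHpt(guest, 8)
shadow.lookup(1)   # miss, refilled from the guest table
shadow.lookup(1)   # hit
shadow.resize(16)  # guest table untouched; shadow starts empty again
shadow.lookup(1)   # miss again, refilled on demand
```

The sketch also shows where the costs come from: the translations are stored twice, and every resize is paid for later in extra misses.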
Paul.