* HV KVM fails on 970 due to HTAB allocation
From: Alexander Graf @ 2012-01-12 18:16 UTC (permalink / raw)
To: kvm-ppc; +Cc: paulus@samba.org, DValeev, kvm@vger.kernel.org
Hi Paul,
While trying to run HV KVM for something useful on 970, we stumbled over
the following code path:
    /* Allocate guest's hashed page table */
    hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|__GFP_NOWARN,
                           HPT_ORDER - PAGE_SHIFT);
    if (!hpt) {
        pr_err("kvm_alloc_hpt: Couldn't alloc HPT\n");
        return -ENOMEM;
    }
    kvm->arch.hpt_virt = hpt;
We're most of the time running into the !hpt case, because we simply
don't have 16MB of contiguous memory lying around.
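To make the size of that request concrete, here is a small sketch of the order arithmetic. It assumes HPT_ORDER is 24 (inferred from the 16MB figure, since 2^24 bytes is 16MB), and the helper names are hypothetical, not taken from the kernel source.

```c
/* Assumption: a 16MB HPT corresponds to HPT_ORDER = 24 (2^24 bytes). */
#define HPT_ORDER 24

/* Page-allocator order requested for the HPT, given the host page shift
 * (12 for 4kB pages, 16 for 64kB pages). Mirrors HPT_ORDER - PAGE_SHIFT. */
static unsigned int hpt_page_order(unsigned int page_shift)
{
    return HPT_ORDER - page_shift;
}

/* Number of physically contiguous pages that order stands for. */
static unsigned long hpt_contig_pages(unsigned int page_shift)
{
    return 1UL << hpt_page_order(page_shift);
}
```

With 4kB pages this is an order-12 request (4096 contiguous pages); with 64kB pages it is an order-8 request (256 contiguous pages). Either way the allocator has to find one unfragmented 16MB run.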
I was trying to check if we could maybe allocate a huge_tlb page from
within kernel space, since that usually matches the 16MB pretty well.
However that seems to be very tricky. Maybe something similar to the RMA
thing would be a good idea?
Alex
* Re: HV KVM fails on 970 due to HTAB allocation
From: Paul Mackerras @ 2012-01-13 5:11 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvm-ppc, DValeev, kvm@vger.kernel.org, David Gibson
On Thu, Jan 12, 2012 at 07:16:51PM +0100, Alexander Graf wrote:
> While trying to run HV KVM for something useful on 970, we stumbled
> over the following code path:
>
>     /* Allocate guest's hashed page table */
>     hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|__GFP_NOWARN,
>                            HPT_ORDER - PAGE_SHIFT);
>     if (!hpt) {
>         pr_err("kvm_alloc_hpt: Couldn't alloc HPT\n");
>         return -ENOMEM;
>     }
>     kvm->arch.hpt_virt = hpt;
>
> We're most of the time running into the !hpt case, because we simply
> don't have 16MB of contiguous memory lying around.
>
> I was trying to check if we could maybe allocate a huge_tlb page
> from within kernel space, since that usually matches the 16MB pretty
> well. However that seems to be very tricky. Maybe something similar
> to the RMA thing would be a good idea?
In discussing this with David Gibson in the past, one idea we have had
is to have userspace allocate the HPT using hugetlbfs and supply it to
KVM via an ioctl. If userspace doesn't call that ioctl then we try to
do a high-order allocation, as at present, when they do the first
VCPU_RUN ioctl.
The other thing the code could do is to fall back to lower-order
allocations. The HPT doesn't have to be 16MB in size; any power of 2
that is at least 256kB will do (there is an upper limit, but it is
enormous). Smaller sizes will potentially reduce performance, of
course (and the size of the VRMA on POWER7, but on 970 we have to use
an RMO region, which isn't affected by the HPT size).
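A minimal userspace sketch of the lower-order fallback described above, not the actual KVM code. The allocator is a hypothetical stand-in for __get_free_pages() that pretends the host is fragmented above a chosen size; all names here are assumptions made for illustration. Orders are log2 of the size in bytes, so 24 is 16MB and 18 is 256kB.

```c
#include <stddef.h>
#include <stdlib.h>

#define HPT_MAX_ORDER 24   /* 16MB, the size requested today */
#define HPT_MIN_ORDER 18   /* 256kB, the minimum size mentioned above */

/* Hypothetical knob: the largest contiguous block (log2 bytes) the mock
 * allocator can still hand out, standing in for host fragmentation. */
static unsigned int mock_largest_free_order = HPT_MAX_ORDER;

/* Hypothetical stand-in for __get_free_pages(); returns zeroed memory. */
static void *mock_alloc_contig(unsigned int order)
{
    if (order > mock_largest_free_order)
        return NULL;
    return calloc(1, (size_t)1 << order);
}

/* Try 16MB first, then halve the size until an allocation succeeds or
 * the 256kB minimum has been tried. On success, *order_out is log2 of
 * the size actually obtained. */
static void *alloc_hpt_with_fallback(unsigned int *order_out)
{
    unsigned int order;

    for (order = HPT_MAX_ORDER; order >= HPT_MIN_ORDER; order--) {
        void *hpt = mock_alloc_contig(order);

        if (hpt) {
            *order_out = order;
            return hpt;
        }
    }
    return NULL;
}
```

The caller has to record the order it actually got, because the hash mask and the number of PTE groups depend on the table size.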
Paul.
* Re: HV KVM fails on 970 due to HTAB allocation
From: Alexander Graf @ 2012-01-13 12:36 UTC (permalink / raw)
To: Paul Mackerras; +Cc: kvm-ppc, DValeev, kvm@vger.kernel.org, David Gibson
On 13.01.2012, at 06:11, Paul Mackerras wrote:
> On Thu, Jan 12, 2012 at 07:16:51PM +0100, Alexander Graf wrote:
>
>> While trying to run HV KVM for something useful on 970, we stumbled
>> over the following code path:
>>
>>     /* Allocate guest's hashed page table */
>>     hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|__GFP_NOWARN,
>>                            HPT_ORDER - PAGE_SHIFT);
>>     if (!hpt) {
>>         pr_err("kvm_alloc_hpt: Couldn't alloc HPT\n");
>>         return -ENOMEM;
>>     }
>>     kvm->arch.hpt_virt = hpt;
>>
>> We're most of the time running into the !hpt case, because we simply
>> don't have 16MB of contiguous memory lying around.
>>
>> I was trying to check if we could maybe allocate a huge_tlb page
>> from within kernel space, since that usually matches the 16MB pretty
>> well. However that seems to be very tricky. Maybe something similar
>> to the RMA thing would be a good idea?
>
> In discussing this with David Gibson in the past, one idea we have had
> is to have userspace allocate the HPT using hugetlbfs and supply it to
> KVM via an ioctl. If userspace doesn't call that ioctl then we try to
> do a high-order allocation, as at present, when they do the first
> VCPU_RUN ioctl.
At which point user space has complete control over the HPT, which translates guest EAs to host RAs? I don't think that's a good idea :).
> The other thing the code could do is to fall back to lower-order
> allocations. The HPT doesn't have to be 16MB in size; any power of 2
> that is at least 256kB will do (there is an upper limit, but it is
> enormous). Smaller sizes will potentially reduce performance, of
> course (and the size of the VRMA on POWER7, but on 970 we have to use
> an RMO region, which isn't affected by the HPT size).
Which means that a guest could potentially run slower due to random circumstances on the host. In other words, benchmarking right after bootup will be fast, while benchmarking after the system has been up for 2 weeks will be slow. This really should be the last resort.
Maybe we should do something similar to the RMA allocator, where we define at bootup how many VMs we want to preallocate memory for? I really don't like that either, but I can't think of a better approach atm.
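A rough userspace sketch of what such a boot-time pool could look like, purely to illustrate the idea. The struct, function names and the cap of 8 tables are all hypothetical, and malloc() stands in for whatever early contiguous allocation the kernel would really use.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HPT_ORDER    24   /* 16MB per table */
#define HPT_POOL_MAX 8    /* hypothetical cap on preallocated tables */

struct hpt_pool {
    void  *slot[HPT_POOL_MAX];    /* preallocated 16MB tables */
    int    in_use[HPT_POOL_MAX];  /* 1 while handed out to a VM */
    size_t nr;                    /* number of tables actually reserved */
};

/* "Boot time": reserve up to nr tables while memory is still
 * unfragmented. Returns how many tables were reserved. */
static size_t hpt_pool_init(struct hpt_pool *p, size_t nr)
{
    size_t i;

    if (nr > HPT_POOL_MAX)
        nr = HPT_POOL_MAX;
    p->nr = 0;
    for (i = 0; i < nr; i++) {
        void *m = malloc((size_t)1 << HPT_ORDER);

        if (!m)
            break;
        p->slot[i] = m;
        p->in_use[i] = 0;
        p->nr++;
    }
    return p->nr;
}

/* VM creation: hand out a free table, zeroed so that no entries from a
 * previous guest survive. Returns NULL when the pool is exhausted. */
static void *hpt_pool_get(struct hpt_pool *p)
{
    size_t i;

    for (i = 0; i < p->nr; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            memset(p->slot[i], 0, (size_t)1 << HPT_ORDER);
            return p->slot[i];
        }
    }
    return NULL;
}

/* VM teardown: return a table to the pool. Returns 0 on success, or -1
 * if the pointer is not a table currently handed out by this pool. */
static int hpt_pool_put(struct hpt_pool *p, void *hpt)
{
    size_t i;

    for (i = 0; i < p->nr; i++) {
        if (p->slot[i] == hpt && p->in_use[i]) {
            p->in_use[i] = 0;
            return 0;
        }
    }
    return -1;
}
```

The trade-off is the one already mentioned: the memory is pinned whether or not any VM runs, and the number of VMs with a full-size HPT is fixed at boot.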
Alex