linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
@ 2016-09-12 11:13 Anshuman Khandual
  2016-09-12 11:33 ` Balbir Singh
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-09-12 11:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: paulus, aneesh.kumar, mpe

When the HPT size is explicitly passed on from the userspace, currently
the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
from reserved CMA area and if that is not possible, the allocation just
fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
back to smaller HPT size in allocation ioctl"), it does not even try to
allocate the same order pages from the page allocator before failing for
good. Same order allocation should be attempted from the page allocator
as a fallback option when the CMA allocation attempt fails.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
- This change saves guests from failing to start after migration

 arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 05f09ae..0a30eb4 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
 			--order;
 	}
 
+	/*
+	 * Fallback in case the userspace has provided a size via ioctl.
+	 * Try allocating the same order pages from the page allocator.
+	 */
+	if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
+		hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
+			__GFP_NOWARN, order - PAGE_SHIFT);
+
 	if (!hpt)
 		return -ENOMEM;
 
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-12 11:13 [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl Anshuman Khandual
@ 2016-09-12 11:33 ` Balbir Singh
  2016-09-13  4:07   ` Anshuman Khandual
  2016-09-12 12:05 ` Aneesh Kumar K.V
  2016-09-13 23:57 ` Michael Ellerman
  2 siblings, 1 reply; 13+ messages in thread
From: Balbir Singh @ 2016-09-12 11:33 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Aneesh Kumar KV

On Mon, Sep 12, 2016 at 9:13 PM, Anshuman Khandual
<khandual@linux.vnet.ibm.com> wrote:
> When the HPT size is explicitly passed on from the userspace, currently
> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
> from reserved CMA area and if that is not possible, the allocation just
> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
> back to smaller HPT size in allocation ioctl"), it does not even try to
> allocate the same order pages from the page allocator before failing for
> good. Same order allocation should be attempted from the page allocator
> as a fallback option when the CMA allocation attempt fails.
>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
> - This change saves guests from failing to start after migration
>
>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 05f09ae..0a30eb4 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>                         --order;
>         }
>
> +       /*
> +        * Fallback in case the userspace has provided a size via ioctl.
> +        * Try allocating the same order pages from the page allocator.
> +        */
> +       if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
> +               hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
> +                       __GFP_NOWARN, order - PAGE_SHIFT);
> +

How often does this succeed? Please provide data. I presume this for
the case where guest pages are pinned?

Balbir Singh.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-12 11:13 [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl Anshuman Khandual
  2016-09-12 11:33 ` Balbir Singh
@ 2016-09-12 12:05 ` Aneesh Kumar K.V
  2016-09-13  6:26   ` Anshuman Khandual
  2016-09-13 23:57 ` Michael Ellerman
  2 siblings, 1 reply; 13+ messages in thread
From: Aneesh Kumar K.V @ 2016-09-12 12:05 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> When the HPT size is explicitly passed on from the userspace, currently
> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
> from reserved CMA area and if that is not possible, the allocation just
> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
> back to smaller HPT size in allocation ioctl"), it does not even try to
> allocate the same order pages from the page allocator before failing for
> good. Same order allocation should be attempted from the page allocator
> as a fallback option when the CMA allocation attempt fails.

IMO we should fix the reason for these CMA allocation failure. We are just
doing work around here.

>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
> - This change saves guests from failing to start after migration
>
>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 05f09ae..0a30eb4 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>  			--order;
>  	}
>
> +	/*
> +	 * Fallback in case the userspace has provided a size via ioctl.
> +	 * Try allocating the same order pages from the page allocator.
> +	 */
> +	if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
> +		hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
> +			__GFP_NOWARN, order - PAGE_SHIFT);
> +
>  	if (!hpt)
>  		return -ENOMEM;

A better way to do that would be (not even compile tested) ?

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 65b2b00d93d7..3f8995f27339 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -68,16 +68,18 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
 		memset((void *)hpt, 0, (1ul << order));
 		kvm->arch.hpt_cma_alloc = 1;
 	}
-
-	/* Lastly try successively smaller sizes from the page allocator */
-	/* Only do this if userspace didn't specify a size via ioctl */
-	while (!hpt && order > PPC_MIN_HPT_ORDER && !htab_orderp) {
+	/*
+	 * Try successively smaller sizes from the page allocator.
+	 * If a size was specified via an ioctl, we just try that
+	 * specific size
+	 */
+-	while (!hpt && order > PPC_MIN_HPT_ORDER) {
 		hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
 				       __GFP_NOWARN, order - PAGE_SHIFT);
-		if (!hpt)
-			--order;
+		if (htab_orderp)
+			break;
+		--order;
 	}
-
 	if (!hpt)
 		return -ENOMEM;
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-12 11:33 ` Balbir Singh
@ 2016-09-13  4:07   ` Anshuman Khandual
  2016-09-13  4:34     ` Balbir Singh
  0 siblings, 1 reply; 13+ messages in thread
From: Anshuman Khandual @ 2016-09-13  4:07 UTC (permalink / raw)
  To: Balbir Singh
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Aneesh Kumar KV

On 09/12/2016 05:03 PM, Balbir Singh wrote:
> On Mon, Sep 12, 2016 at 9:13 PM, Anshuman Khandual
> <khandual@linux.vnet.ibm.com> wrote:
>> > When the HPT size is explicitly passed on from the userspace, currently
>> > the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>> > from reserved CMA area and if that is not possible, the allocation just
>> > fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>> > back to smaller HPT size in allocation ioctl"), it does not even try to
>> > allocate the same order pages from the page allocator before failing for
>> > good. Same order allocation should be attempted from the page allocator
>> > as a fallback option when the CMA allocation attempt fails.
>> >
>> > Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> > ---
>> > - This change saves guests from failing to start after migration
>> >
>> >  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>> >  1 file changed, 8 insertions(+)
>> >
>> > diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> > index 05f09ae..0a30eb4 100644
>> > --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> > +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> > @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>> >                         --order;
>> >         }
>> >
>> > +       /*
>> > +        * Fallback in case the userspace has provided a size via ioctl.
>> > +        * Try allocating the same order pages from the page allocator.
>> > +        */
>> > +       if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
>> > +               hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>> > +                       __GFP_NOWARN, order - PAGE_SHIFT);
>> > +
> How often does this succeed? Please provide data. I presume this for

During continuous guest VM migration test from source host to destination host
this patch was able to prevent guest creation failure after migration on the
destination host which was failing after 2-3 days. We have not seen the failure
till now even after 3-4 days.

> the case where guest pages are pinned?

Hmm, need to check that in the test setup. There was nothing running inside the
guests though. IIUC, HPT size of the guest is computed based on the max memory
the guest is ever going to have irrespective of the RAM usage before migration.
How does pinning effect the HPT size ?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-13  4:07   ` Anshuman Khandual
@ 2016-09-13  4:34     ` Balbir Singh
  2016-09-13  5:49       ` Anshuman Khandual
  0 siblings, 1 reply; 13+ messages in thread
From: Balbir Singh @ 2016-09-13  4:34 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Aneesh Kumar KV



On 13/09/16 14:07, Anshuman Khandual wrote:
> On 09/12/2016 05:03 PM, Balbir Singh wrote:
>> On Mon, Sep 12, 2016 at 9:13 PM, Anshuman Khandual
>> <khandual@linux.vnet.ibm.com> wrote:
>>>> When the HPT size is explicitly passed on from the userspace, currently
>>>> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>>>> from reserved CMA area and if that is not possible, the allocation just
>>>> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>>>> back to smaller HPT size in allocation ioctl"), it does not even try to
>>>> allocate the same order pages from the page allocator before failing for
>>>> good. Same order allocation should be attempted from the page allocator
>>>> as a fallback option when the CMA allocation attempt fails.
>>>>
>>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>>> ---
>>>> - This change saves guests from failing to start after migration
>>>>
>>>>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>>>>  1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>> index 05f09ae..0a30eb4 100644
>>>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>> @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>>>>                         --order;
>>>>         }
>>>>
>>>> +       /*
>>>> +        * Fallback in case the userspace has provided a size via ioctl.
>>>> +        * Try allocating the same order pages from the page allocator.
>>>> +        */
>>>> +       if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
>>>> +               hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>>>> +                       __GFP_NOWARN, order - PAGE_SHIFT);
>>>> +
>> How often does this succeed? Please provide data. I presume this for
> 
> During continuous guest VM migration test from source host to destination host
> this patch was able to prevent guest creation failure after migration on the
> destination host which was failing after 2-3 days. We have not seen the failure
> till now even after 3-4 days.
> 

OK.. the CMA failures need analysis. Are we just ignoring a CMA bug? IOW, why
would CMA allocation fail -- CMA size is too small to accommodate the required
number of allocations? 

>> the case where guest pages are pinned?
> 
> Hmm, need to check that in the test setup. There was nothing running inside the
> guests though. IIUC, HPT size of the guest is computed based on the max memory
> the guest is ever going to have irrespective of the RAM usage before migration.
> How does pinning effect the HPT size ?
> 

If the pinned pages (from anywhere) belong to CMA, then CMA allocations would start failing


Balbir Singh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-13  4:34     ` Balbir Singh
@ 2016-09-13  5:49       ` Anshuman Khandual
  2016-09-13  9:26         ` Balbir Singh
  0 siblings, 1 reply; 13+ messages in thread
From: Anshuman Khandual @ 2016-09-13  5:49 UTC (permalink / raw)
  To: Balbir Singh
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Aneesh Kumar KV

On 09/13/2016 10:04 AM, Balbir Singh wrote:
> 
> 
> On 13/09/16 14:07, Anshuman Khandual wrote:
>> On 09/12/2016 05:03 PM, Balbir Singh wrote:
>>> On Mon, Sep 12, 2016 at 9:13 PM, Anshuman Khandual
>>> <khandual@linux.vnet.ibm.com> wrote:
>>>>> When the HPT size is explicitly passed on from the userspace, currently
>>>>> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>>>>> from reserved CMA area and if that is not possible, the allocation just
>>>>> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>>>>> back to smaller HPT size in allocation ioctl"), it does not even try to
>>>>> allocate the same order pages from the page allocator before failing for
>>>>> good. Same order allocation should be attempted from the page allocator
>>>>> as a fallback option when the CMA allocation attempt fails.
>>>>>
>>>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>>>> ---
>>>>> - This change saves guests from failing to start after migration
>>>>>
>>>>>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>>>>>  1 file changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>> index 05f09ae..0a30eb4 100644
>>>>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>> @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>>>>>                         --order;
>>>>>         }
>>>>>
>>>>> +       /*
>>>>> +        * Fallback in case the userspace has provided a size via ioctl.
>>>>> +        * Try allocating the same order pages from the page allocator.
>>>>> +        */
>>>>> +       if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
>>>>> +               hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>>>>> +                       __GFP_NOWARN, order - PAGE_SHIFT);
>>>>> +
>>> How often does this succeed? Please provide data. I presume this for
>>
>> During continuous guest VM migration test from source host to destination host
>> this patch was able to prevent guest creation failure after migration on the
>> destination host which was failing after 2-3 days. We have not seen the failure
>> till now even after 3-4 days.
>>
> 
> OK.. the CMA failures need analysis. Are we just ignoring a CMA bug? IOW, why

Sure, it does need analysis. But there will be situations where CMA
allocation request can fail, thats why we will need fallback option.
That the same reason why we have fall back options of attempting from
page allocator (in decreasing order every time) when the size is not
specified as part of the ioctl. Why the case should be any different
when the size is specified in the ioctl().

> would CMA allocation fail -- CMA size is too small to accommodate the required
> number of allocations? 

The same size seems to be good enough for first couple of days and
then it fails. Probably some __GFP_MOVABLE allocation got pinned
later on.

> 
>>> the case where guest pages are pinned?
>>
>> Hmm, need to check that in the test setup. There was nothing running inside the
>> guests though. IIUC, HPT size of the guest is computed based on the max memory
>> the guest is ever going to have irrespective of the RAM usage before migration.
>> How does pinning effect the HPT size ?
>>
> 
> If the pinned pages (from anywhere) belong to CMA, then CMA allocations would start failing

Right and with the current design of CMA we can do nothing about it,
unless we make sure the pages allocated to satisfy guest real memory
do not come from CMA area at all.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-12 12:05 ` Aneesh Kumar K.V
@ 2016-09-13  6:26   ` Anshuman Khandual
  0 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-09-13  6:26 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev

On 09/12/2016 05:35 PM, Aneesh Kumar K.V wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
> 
>> When the HPT size is explicitly passed on from the userspace, currently
>> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>> from reserved CMA area and if that is not possible, the allocation just
>> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>> back to smaller HPT size in allocation ioctl"), it does not even try to
>> allocate the same order pages from the page allocator before failing for
>> good. Same order allocation should be attempted from the page allocator
>> as a fallback option when the CMA allocation attempt fails.
> 
> IMO we should fix the reason for these CMA allocation failure. We are just

IMHO, irrespective of the ability of CMA to satisfy allocation requests,
a fall back option from page allocator should always be there when the
guest is failing to start due to unavailability of memory. In my previous
response in this thread, also pointed out how there need to be a parity
between what we do for cases when ioctl calls specify size or not from
having a fallback option.

> doing work around here.
> 
>>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>> - This change saves guests from failing to start after migration
>>
>>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> index 05f09ae..0a30eb4 100644
>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>>  			--order;
>>  	}
>>
>> +	/*
>> +	 * Fallback in case the userspace has provided a size via ioctl.
>> +	 * Try allocating the same order pages from the page allocator.
>> +	 */
>> +	if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
>> +		hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>> +			__GFP_NOWARN, order - PAGE_SHIFT);
>> +
>>  	if (!hpt)
>>  		return -ENOMEM;
> 
> A better way to do that would be (not even compile tested) ?
> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 65b2b00d93d7..3f8995f27339 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -68,16 +68,18 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>  		memset((void *)hpt, 0, (1ul << order));
>  		kvm->arch.hpt_cma_alloc = 1;
>  	}
> -
> -	/* Lastly try successively smaller sizes from the page allocator */
> -	/* Only do this if userspace didn't specify a size via ioctl */
> -	while (!hpt && order > PPC_MIN_HPT_ORDER && !htab_orderp) {
> +	/*
> +	 * Try successively smaller sizes from the page allocator.
> +	 * If a size was specified via an ioctl, we just try that
> +	 * specific size
> +	 */
> +-	while (!hpt && order > PPC_MIN_HPT_ORDER) {
>  		hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>  				       __GFP_NOWARN, order - PAGE_SHIFT);
> -		if (!hpt)
> -			--order;
> +		if (htab_orderp)
> +			break;
> +		--order;
>  	}
> -
>  	if (!hpt)
>  		return -ENOMEM;
> 

Initially thought about this way but then decided not to change this
existing code block instead just one more. But anything is fine, I
can just change this next time around.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-13  5:49       ` Anshuman Khandual
@ 2016-09-13  9:26         ` Balbir Singh
  0 siblings, 0 replies; 13+ messages in thread
From: Balbir Singh @ 2016-09-13  9:26 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Aneesh Kumar KV

On Tue, Sep 13, 2016 at 3:49 PM, Anshuman Khandual
<khandual@linux.vnet.ibm.com> wrote:
> On 09/13/2016 10:04 AM, Balbir Singh wrote:
>>
>>
>> On 13/09/16 14:07, Anshuman Khandual wrote:
>>> On 09/12/2016 05:03 PM, Balbir Singh wrote:
>>>> On Mon, Sep 12, 2016 at 9:13 PM, Anshuman Khandual
>>>> <khandual@linux.vnet.ibm.com> wrote:
>>>>>> When the HPT size is explicitly passed on from the userspace, currently
>>>>>> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>>>>>> from reserved CMA area and if that is not possible, the allocation just
>>>>>> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>>>>>> back to smaller HPT size in allocation ioctl"), it does not even try to
>>>>>> allocate the same order pages from the page allocator before failing for
>>>>>> good. Same order allocation should be attempted from the page allocator
>>>>>> as a fallback option when the CMA allocation attempt fails.
>>>>>>
>>>>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>>>>> ---
>>>>>> - This change saves guests from failing to start after migration
>>>>>>
>>>>>>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>>>>>>  1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>>> index 05f09ae..0a30eb4 100644
>>>>>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>>> @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>>>>>>                         --order;
>>>>>>         }
>>>>>>
>>>>>> +       /*
>>>>>> +        * Fallback in case the userspace has provided a size via ioctl.
>>>>>> +        * Try allocating the same order pages from the page allocator.
>>>>>> +        */
>>>>>> +       if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
>>>>>> +               hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>>>>>> +                       __GFP_NOWARN, order - PAGE_SHIFT);
>>>>>> +
>>>> How often does this succeed? Please provide data. I presume this for
>>>
>>> During continuous guest VM migration test from source host to destination host
>>> this patch was able to prevent guest creation failure after migration on the
>>> destination host which was failing after 2-3 days. We have not seen the failure
>>> till now even after 3-4 days.
>>>
>>
>> OK.. the CMA failures need analysis. Are we just ignoring a CMA bug? IOW, why
>
> Sure, it does need analysis. But there will be situations where CMA
> allocation request can fail, thats why we will need fallback option.

Please elaborate those situations. This patch needs more explanation
as to why we should fallback -- what are those short comings of CMA
allocation. Can anyone using CMA face them and have to design a fallback?

> That the same reason why we have fall back options of attempting from
> page allocator (in decreasing order every time) when the size is not
> specified as part of the ioctl. Why the case should be any different
> when the size is specified in the ioctl().
>
>> would CMA allocation fail -- CMA size is too small to accommodate the required
>> number of allocations?
>
> The same size seems to be good enough for first couple of days and
> then it fails. Probably some __GFP_MOVABLE allocation got pinned
> later on.
>

Please analyze and let us know

>>
>>>> the case where guest pages are pinned?
>>>
>>> Hmm, need to check that in the test setup. There was nothing running inside the
>>> guests though. IIUC, HPT size of the guest is computed based on the max memory
>>> the guest is ever going to have irrespective of the RAM usage before migration.
>>> How does pinning effect the HPT size ?
>>>
>>
>> If the pinned pages (from anywhere) belong to CMA, then CMA allocations would start failing
>
> Right and with the current design of CMA we can do nothing about it,
> unless we make sure the pages allocated to satisfy guest real memory
> do not come from CMA area at all.
>

I have patches to move non-THP pages out of CMA

Balbir

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-12 11:13 [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl Anshuman Khandual
  2016-09-12 11:33 ` Balbir Singh
  2016-09-12 12:05 ` Aneesh Kumar K.V
@ 2016-09-13 23:57 ` Michael Ellerman
  2016-09-14  0:12   ` Paul Mackerras
                     ` (2 more replies)
  2 siblings, 3 replies; 13+ messages in thread
From: Michael Ellerman @ 2016-09-13 23:57 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: paulus, aneesh.kumar

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> When the HPT size is explicitly passed on from the userspace, currently
> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
> from reserved CMA area and if that is not possible, the allocation just
> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
> back to smaller HPT size in allocation ioctl"), it does not even try to
> allocate the same order pages from the page allocator before failing for
> good. Same order allocation should be attempted from the page allocator
> as a fallback option when the CMA allocation attempt fails.

It looks like if CMA is not configured we will just fail instantly.

So this does look like something we should fix.

But I think it is just a bug in commit 572abd563bef ("KVM: PPC: Book3S
HV: Don't fall back to smaller HPT size in allocation ioctl"), which did:

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 1f9c0a17f445..10722b1e38b5 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -70,7 +70,8 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
        }
 
        /* Lastly try successively smaller sizes from the page allocator */
-       while (!hpt && order > PPC_MIN_HPT_ORDER) {
+       /* Only do this if userspace didn't specify a size via ioctl */
+       while (!hpt && order > PPC_MIN_HPT_ORDER && !htab_orderp) {
                hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
                                       __GFP_NOWARN, order - PAGE_SHIFT);
                if (!hpt)


Instead of guarding the loop entry with !htab_orderp, it should have
allowed the loop to enter, but prevented it from iterating if the
allocation fails and htab_orderp != 0.

cheers

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-13 23:57 ` Michael Ellerman
@ 2016-09-14  0:12   ` Paul Mackerras
  2016-09-14  3:55     ` Anshuman Khandual
  2016-09-14  3:52   ` Anshuman Khandual
  2016-09-14  5:28   ` Anshuman Khandual
  2 siblings, 1 reply; 13+ messages in thread
From: Paul Mackerras @ 2016-09-14  0:12 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Anshuman Khandual, linuxppc-dev, aneesh.kumar

On Wed, Sep 14, 2016 at 09:57:48AM +1000, Michael Ellerman wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
> 
> > When the HPT size is explicitly passed on from the userspace, currently
> > the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
> > from reserved CMA area and if that is not possible, the allocation just
> > fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
> > back to smaller HPT size in allocation ioctl"), it does not even try to
> > allocate the same order pages from the page allocator before failing for
> > good. Same order allocation should be attempted from the page allocator
> > as a fallback option when the CMA allocation attempt fails.
> 
> It looks like if CMA is not configured we will just fail instantly.
> 
> So this does look like something we should fix.
> 
> But I think it is just a bug in commit 572abd563bef ("KVM: PPC: Book3S
> HV: Don't fall back to smaller HPT size in allocation ioctl"), which did:
> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 1f9c0a17f445..10722b1e38b5 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -70,7 +70,8 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>         }
>  
>         /* Lastly try successively smaller sizes from the page allocator */
> -       while (!hpt && order > PPC_MIN_HPT_ORDER) {
> +       /* Only do this if userspace didn't specify a size via ioctl */
> +       while (!hpt && order > PPC_MIN_HPT_ORDER && !htab_orderp) {
>                 hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>                                        __GFP_NOWARN, order - PAGE_SHIFT);
>                 if (!hpt)
> 
> 
> Instead of guarding the loop entry with !htab_orderp, it should have
> allowed the loop to enter, but prevented it from iterating if the
> allocation fails and htab_orderp != 0.

You're right.  I'll fix it.

Paul.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-13 23:57 ` Michael Ellerman
  2016-09-14  0:12   ` Paul Mackerras
@ 2016-09-14  3:52   ` Anshuman Khandual
  2016-09-14  5:28   ` Anshuman Khandual
  2 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-09-14  3:52 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: aneesh.kumar

On 09/14/2016 05:27 AM, Michael Ellerman wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
> 
>> When the HPT size is explicitly passed on from the userspace, currently
>> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>> from reserved CMA area and if that is not possible, the allocation just
>> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>> back to smaller HPT size in allocation ioctl"), it does not even try to
>> allocate the same order pages from the page allocator before failing for
>> good. Same order allocation should be attempted from the page allocator
>> as a fallback option when the CMA allocation attempt fails.
> 
> It looks like if CMA is not configured we will just fail instantly.

Right and also we have this fallback registered any way. I wonder
why we are still debating about the need of a fallback mechanism
when we already have got one.

> 
> So this does look like something we should fix.
> 
> But I think it is just a bug in commit 572abd563bef ("KVM: PPC: Book3S
> HV: Don't fall back to smaller HPT size in allocation ioctl"), which did:

Hmm, I think its something the commit missed to accommodate for.
But maybe yes, its a bug in the commit.

> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 1f9c0a17f445..10722b1e38b5 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -70,7 +70,8 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>         }
> 
>         /* Lastly try successively smaller sizes from the page allocator */
> -       while (!hpt && order > PPC_MIN_HPT_ORDER) {
> +       /* Only do this if userspace didn't specify a size via ioctl */
> +       while (!hpt && order > PPC_MIN_HPT_ORDER && !htab_orderp) {
>                 hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>                                        __GFP_NOWARN, order - PAGE_SHIFT);
>                 if (!hpt)
> 
> 
> Instead of guarding the loop entry with !htab_orderp, it should have
> allowed the loop to enter, but prevented it from iterating if the
> allocation fails and htab_orderp != 0.

Right and thats what Aneesh's proposed patch (in the other thread) does.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-14  0:12   ` Paul Mackerras
@ 2016-09-14  3:55     ` Anshuman Khandual
  0 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-09-14  3:55 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman; +Cc: linuxppc-dev, aneesh.kumar

On 09/14/2016 05:42 AM, Paul Mackerras wrote:
> On Wed, Sep 14, 2016 at 09:57:48AM +1000, Michael Ellerman wrote:
>> > Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
>> > 
>>> > > When the HPT size is explicitly passed on from the userspace, currently
>>> > > the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>>> > > from reserved CMA area and if that is not possible, the allocation just
>>> > > fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>>> > > back to smaller HPT size in allocation ioctl"), it does not even try to
>>> > > allocate the same order pages from the page allocator before failing for
>>> > > good. Same order allocation should be attempted from the page allocator
>>> > > as a fallback option when the CMA allocation attempt fails.
>> > 
>> > It looks like if CMA is not configured we will just fail instantly.
>> > 
>> > So this does look like something we should fix.
>> > 
>> > But I think it is just a bug in commit 572abd563bef ("KVM: PPC: Book3S
>> > HV: Don't fall back to smaller HPT size in allocation ioctl"), which did:
>> > 
>> > diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> > index 1f9c0a17f445..10722b1e38b5 100644
>> > --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> > +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> > @@ -70,7 +70,8 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>> >         }
>> >  
>> >         /* Lastly try successively smaller sizes from the page allocator */
>> > -       while (!hpt && order > PPC_MIN_HPT_ORDER) {
>> > +       /* Only do this if userspace didn't specify a size via ioctl */
>> > +       while (!hpt && order > PPC_MIN_HPT_ORDER && !htab_orderp) {
>> >                 hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>> >                                        __GFP_NOWARN, order - PAGE_SHIFT);
>> >                 if (!hpt)
>> > 
>> > 
>> > Instead of guarding the loop entry with !htab_orderp, it should have
>> > allowed the loop to enter, but prevented it from iterating if the
>> > allocation fails and htab_orderp != 0.
> You're right.  I'll fix it.

Thanks Paul, so I will not be sending follow up patch on this.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl
  2016-09-13 23:57 ` Michael Ellerman
  2016-09-14  0:12   ` Paul Mackerras
  2016-09-14  3:52   ` Anshuman Khandual
@ 2016-09-14  5:28   ` Anshuman Khandual
  2 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-09-14  5:28 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: aneesh.kumar

On 09/14/2016 05:27 AM, Michael Ellerman wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

Not sure whether this mail ever went. Sending it again.

> 
>> > When the HPT size is explicitly passed on from the userspace, currently
>> > the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>> > from reserved CMA area and if that is not possible, the allocation just
>> > fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>> > back to smaller HPT size in allocation ioctl"), it does not even try to
>> > allocate the same order pages from the page allocator before failing for
>> > good. Same order allocation should be attempted from the page allocator
>> > as a fallback option when the CMA allocation attempt fails.
> It looks like if CMA is not configured we will just fail instantly.

Right and also we have this fallback registered any way. I wonder
why we are still debating about the need of a fallback mechanism
when we already have got one.

> 
> So this does look like something we should fix.
> 
> But I think it is just a bug in commit 572abd563bef ("KVM: PPC: Book3S
> HV: Don't fall back to smaller HPT size in allocation ioctl"), which did:

Hmm, I think its something the commit missed to accommodate for.
But maybe yes, its a bug in the commit.

> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 1f9c0a17f445..10722b1e38b5 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -70,7 +70,8 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>         }
> 
>         /* Lastly try successively smaller sizes from the page allocator */
> -       while (!hpt && order > PPC_MIN_HPT_ORDER) {
> +       /* Only do this if userspace didn't specify a size via ioctl */
> +       while (!hpt && order > PPC_MIN_HPT_ORDER && !htab_orderp) {
>                 hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>                                        __GFP_NOWARN, order - PAGE_SHIFT);
>                 if (!hpt)
> 
> 
> Instead of guarding the loop entry with !htab_orderp, it should have
> allowed the loop to enter, but prevented it from iterating if the
> allocation fails and htab_orderp != 0.

Right and thats what Aneesh's proposed patch (in the other thread) does.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-09-14  5:28 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-09-12 11:13 [RFC] KVM: PPC: Book3S HV: Fall back to same size HPT in allocation ioctl Anshuman Khandual
2016-09-12 11:33 ` Balbir Singh
2016-09-13  4:07   ` Anshuman Khandual
2016-09-13  4:34     ` Balbir Singh
2016-09-13  5:49       ` Anshuman Khandual
2016-09-13  9:26         ` Balbir Singh
2016-09-12 12:05 ` Aneesh Kumar K.V
2016-09-13  6:26   ` Anshuman Khandual
2016-09-13 23:57 ` Michael Ellerman
2016-09-14  0:12   ` Paul Mackerras
2016-09-14  3:55     ` Anshuman Khandual
2016-09-14  3:52   ` Anshuman Khandual
2016-09-14  5:28   ` Anshuman Khandual

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).