From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alexander Graf <agraf@suse.de>
Date: Tue, 06 May 2014 14:25:33 +0000
Subject: Re: [PATCH] KVM: PPC: BOOK3S: HV: Don't try to allocate from kernel
 page allocator for hash page table.
Message-Id: <5368F0DD.9090107@suse.de>
List-Id: <kvm-ppc.vger.kernel.org>
References: <1399224322-22028-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
 <53677558.50900@suse.de> <87r4489ttk.fsf@linux.vnet.ibm.com>
 <20FFDF8F-1A3D-4719-B492-1E4B70F9D1B4@suse.de>
 <1399334797.20388.71.camel@pasglop> <536889C6.1050603@suse.de>
 <1399360775.20388.112.camel@pasglop> <53688D89.1070201@suse.de>
 <87wqdzq98f.fsf@linux.vnet.ibm.com>
In-Reply-To: <87wqdzq98f.fsf@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "paulus@samba.org" <paulus@samba.org>, "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>, "kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>, "kvm@vger.kernel.org" <kvm@vger.kernel.org>

On 05/06/2014 04:20 PM, Aneesh Kumar K.V wrote:
> Alexander Graf <agraf@suse.de> writes:
>
>> On 06.05.14 09:19, Benjamin Herrenschmidt wrote:
>>> On Tue, 2014-05-06 at 09:05 +0200, Alexander Graf wrote:
>>>> On 06.05.14 02:06, Benjamin Herrenschmidt wrote:
>>>>> On Mon, 2014-05-05 at 17:16 +0200, Alexander Graf wrote:
>>>>>> Isn't this a greater problem? We should start swapping before we hit
>>>>>> the point where non movable kernel allocation fails, no?
>>>>> Possibly but the fact remains, this can be avoided by making sure that
>>>>> if we create a CMA reserve for KVM, then it uses it rather than using
>>>>> the rest of main memory for hash tables.
>>>> So why were we preferring non-CMA memory before? Considering that Aneesh
>>>> introduced that logic in fa61a4e3 I suppose this was just a mistake?
>>> I assume so.
> ....
> ...
>
>>> Whatever remains is split between CMA and the normal page allocator.
>>>
>>> Without Aneesh latest patch, when creating guests, KVM starts allocating
>>> it's hash tables from the latter instead of CMA (we never allocate from
>>> hugetlb pool afaik, only guest pages do that, not hash tables).
>>>
>>> So we exhaust the page allocator and get linux into OOM conditions
>>> while there's plenty of space in CMA. But the kernel cannot use CMA for
>>> it's own allocations, only to back user pages, which we don't care about
>>> because our guest pages are covered by our hugetlb reserve :-)
>> Yes. Write that in the patch description and I'm happy ;).
>>
> How about the below:
>
> Current KVM code first try to allocate hash page table from the normal
> page allocator before falling back to the CMA reserve region. One of the
> side effects of that is, we could exhaust the page allocator and get
> linux into OOM conditions while we still have plenty of space in CMA.
>
> Fix this by trying the CMA reserve region first and then falling back
> to normal page allocator if we fail to get enough memory from CMA
> reserve area.

Fix the grammar (I've spotted a good number of mistakes), then this 
should do. Please also improve the headline.
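
For illustration only, the ordering described in the proposed text (try the
CMA reserve first, fall back to the normal page allocator) could be sketched
in plain userspace C roughly as below. This is not the kernel code from the
patch: struct pool, pool_take() and alloc_hpt() are hypothetical names, and
each pool is modelled as nothing more than a count of free pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a pool is just a counter of free pages. */
struct pool {
    size_t free_pages;
};

/* Where the (simulated) hash page table ended up coming from. */
enum hpt_source { HPT_NONE, HPT_FROM_CMA, HPT_FROM_PAGE_ALLOCATOR };

/* Take npages from the pool if it has that many free; report success. */
static bool pool_take(struct pool *p, size_t npages)
{
    if (p->free_pages < npages)
        return false;
    p->free_pages -= npages;
    return true;
}

/*
 * Try the CMA reserve first and only fall back to the normal page
 * allocator when the reserve cannot satisfy the request. The old
 * behaviour discussed in this thread was the reverse order.
 */
static enum hpt_source alloc_hpt(struct pool *cma, struct pool *normal,
                                 size_t npages)
{
    assert(cma != NULL && normal != NULL);

    if (pool_take(cma, npages))
        return HPT_FROM_CMA;
    if (pool_take(normal, npages))
        return HPT_FROM_PAGE_ALLOCATOR;
    return HPT_NONE;
}
```

With this order the normal pool is only touched once the reserve is
exhausted, which is the property the patch description is after.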


Alex

