From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexander Graf
Date: Tue, 06 May 2014 14:25:33 +0000
Subject: Re: [PATCH] KVM: PPC: BOOK3S: HV: Don't try to allocate from kernel page allocator for hash page tab
Message-Id: <5368F0DD.9090107@suse.de>
List-Id:
References: <1399224322-22028-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
 <53677558.50900@suse.de>
 <87r4489ttk.fsf@linux.vnet.ibm.com>
 <20FFDF8F-1A3D-4719-B492-1E4B70F9D1B4@suse.de>
 <1399334797.20388.71.camel@pasglop>
 <536889C6.1050603@suse.de>
 <1399360775.20388.112.camel@pasglop>
 <53688D89.1070201@suse.de>
 <87wqdzq98f.fsf@linux.vnet.ibm.com>
In-Reply-To: <87wqdzq98f.fsf@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: "Aneesh Kumar K.V"
Cc: "paulus@samba.org", "linuxppc-dev@lists.ozlabs.org",
 "kvm-ppc@vger.kernel.org", "kvm@vger.kernel.org"

On 05/06/2014 04:20 PM, Aneesh Kumar K.V wrote:
> Alexander Graf writes:
>
>> On 06.05.14 09:19, Benjamin Herrenschmidt wrote:
>>> On Tue, 2014-05-06 at 09:05 +0200, Alexander Graf wrote:
>>>> On 06.05.14 02:06, Benjamin Herrenschmidt wrote:
>>>>> On Mon, 2014-05-05 at 17:16 +0200, Alexander Graf wrote:
>>>>>> Isn't this a greater problem? We should start swapping before we hit
>>>>>> the point where non movable kernel allocation fails, no?
>>>>> Possibly but the fact remains, this can be avoided by making sure that
>>>>> if we create a CMA reserve for KVM, then it uses it rather than using
>>>>> the rest of main memory for hash tables.
>>>> So why were we preferring non-CMA memory before? Considering that Aneesh
>>>> introduced that logic in fa61a4e3 I suppose this was just a mistake?
>>> I assume so.
> ....
> ...
>
>>> Whatever remains is split between CMA and the normal page allocator.
>>>
>>> Without Aneesh latest patch, when creating guests, KVM starts allocating
>>> it's hash tables from the latter instead of CMA (we never allocate from
>>> hugetlb pool afaik, only guest pages do that, not hash tables).
>>>
>>> So we exhaust the page allocator and get linux into OOM conditions
>>> while there's plenty of space in CMA. But the kernel cannot use CMA for
>>> it's own allocations, only to back user pages, which we don't care about
>>> because our guest pages are covered by our hugetlb reserve :-)
>> Yes. Write that in the patch description and I'm happy ;).
>>
> How about the below:
>
> Current KVM code first try to allocate hash page table from the normal
> page allocator before falling back to the CMA reserve region. One of the
> side effects of that is, we could exhaust the page allocator and get
> linux into OOM conditions while we still have plenty of space in CMA.
>
> Fix this by trying the CMA reserve region first and then falling back
> to normal page allocator if we fail to get enough memory from CMA
> reserve area.

Fix the grammar (I've spotted a good number of mistakes), then this
should do. Please also improve the headline.

Alex