From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S263065AbVHEQnA (ORCPT ); Fri, 5 Aug 2005 12:43:00 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S263021AbVHEQm6 (ORCPT ); Fri, 5 Aug 2005 12:42:58 -0400
Received: from e34.co.us.ibm.com ([32.97.110.132]:26850 "EHLO e34.co.us.ibm.com")
	by vger.kernel.org with ESMTP id S263085AbVHEQmY (ORCPT );
	Fri, 5 Aug 2005 12:42:24 -0400
Subject: Re: [RFC] Demand faulting for large pages
From: Adam Litke
To: Andi Kleen
Cc: linux-kernel@vger.kernel.org, christoph@lameter.com, dwg@au1.ibm.com
In-Reply-To: <20050805155307.GV8266@wotan.suse.de>
References: <1123255298.3121.46.camel@localhost.localdomain>
	<20050805155307.GV8266@wotan.suse.de>
Content-Type: text/plain
Organization: IBM
Message-Id: <1123259847.3121.91.camel@localhost.localdomain>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Fri, 05 Aug 2005 11:37:27 -0500
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2005-08-05 at 10:53, Andi Kleen wrote:
> On Fri, Aug 05, 2005 at 10:21:38AM -0500, Adam Litke wrote:
> > Below is a patch to implement demand faulting for huge pages. The main
> > motivation for changing from prefaulting to demand faulting is so that
> > huge page allocations can follow the NUMA API. Currently, huge pages
> > are allocated round-robin from all NUMA nodes.
>
> I think matching DEFAULT is better than having a different default for
> huge pages than for small pages.

I am not exactly sure what the above means. Is 'DEFAULT' a system
default NUMA allocation policy?

> In general more programs are happy with local memory than remote memory.

I totally agree.

> Also it makes it consistent.
>
> >
> > The default behavior in SLES9 for i386 is to use demand faulting with
> > NUMA policy-aware allocations. To my knowledge, this continues to work
>
> Not sure what you're trying to say here. All allocations are NUMA policy aware.

Sorry, I really wasn't clear. That statement referred to huge pages
specifically. I was trying to point out that NUMA policy-aware huge
page allocation combined with demand faulting in SLES9/i386 has been a
success.

> > well in practice. Thanks to consolidated hugetlb code, switching the
> > behavior requires changing only one fault handler. The bulk of the
> > patch just moves the logic from hugetlb_prefault() to
> > hugetlb_pte_fault().
>
> Are you sure you fixed get_user_pages to handle this properly? It doesn't
> like it.

Unless I am missing something, the call to follow_hugetlb_page() in
get_user_pages() is just an optimization. Removing it means
follow_page() will be called individually for each PAGE_SIZE page in the
huge page. We can probably do better, but I didn't want to cloud this
patch with that logic.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center