Re: [RFC] Demand faulting for large pages

From: Adam Litke <agl@us.ibm.com>
To: Andi Kleen <ak@suse.de>
Cc: linux-kernel@vger.kernel.org, christoph@lameter.com, dwg@au1.ibm.com
Subject: Re: [RFC] Demand faulting for large pages
Date: Fri, 05 Aug 2005 12:00:00 -0500
Message-ID: <1123261200.3121.104.camel@localhost.localdomain>
In-Reply-To: <20050805164702.GY8266@wotan.suse.de>

On Fri, 2005-08-05 at 11:47, Andi Kleen wrote:
> On Fri, Aug 05, 2005 at 11:37:27AM -0500, Adam Litke wrote:
> > On Fri, 2005-08-05 at 10:53, Andi Kleen wrote:
> > > On Fri, Aug 05, 2005 at 10:21:38AM -0500, Adam Litke wrote:
> > > > Below is a patch to implement demand faulting for huge pages.  The main
> > > > motivation for changing from prefaulting to demand faulting is so that
> > > > huge page allocations can follow the NUMA API.  Currently, huge pages
> > > > are allocated round-robin from all NUMA nodes.   
> > > 
> > > I think matching DEFAULT is better than having a different default for
> > > huge pages than for small pages.
> > 
> > I am not exactly sure what the above means.  Is 'DEFAULT' a system
> > default numa allocation policy?
> 
> It's one of the four NUMA policies: DEFAULT, PREFERRED, INTERLEAVE, BIND.
> 
> It just means allocate on the local node if possible, otherwise fall back.
> 
> You said you wanted INTERLEAVE by default, which I think is a bad idea.
> It should be optional only, as in all other allocations.

I tried to say that allocations are _currently_ INTERLEAVE (aka
round-robin) but that I want it to be configurable.  So I think we are
in agreement here.
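
To make the difference between the four policies concrete, here is a
rough user-space sketch of how each one might pick a node for a huge
page.  It is a toy model, not kernel code: every toy_* name, the
four-node layout and the per-node free counts are assumptions made for
illustration, and the fallback order (scan upward from the starting
node) is a simplification.

```c
/* Toy model only: these names mirror the four policies discussed above
 * but are hypothetical and are not the kernel's definitions. */
enum toy_policy { TOY_DEFAULT, TOY_PREFERRED, TOY_INTERLEAVE, TOY_BIND };

#define TOY_NR_NODES 4	/* assumed node count */

struct toy_state {
	int free_pages[TOY_NR_NODES];	/* free huge pages per node */
	int next_il;			/* next node for INTERLEAVE */
};

/* Take one page from node n; return 1 on success, 0 if n is empty. */
static int toy_take(struct toy_state *s, int n)
{
	if (s->free_pages[n] <= 0)
		return 0;
	s->free_pages[n]--;
	return 1;
}

/* Try 'start' first, then the remaining nodes in order; -1 if all empty. */
static int toy_fallback(struct toy_state *s, int start)
{
	for (int i = 0; i < TOY_NR_NODES; i++) {
		int n = (start + i) % TOY_NR_NODES;

		if (toy_take(s, n))
			return n;
	}
	return -1;
}

/* Return the node a page was taken from, or -1 on failure. */
static int toy_alloc_node(struct toy_state *s, enum toy_policy pol,
			  int local_node, int policy_node)
{
	int n;

	switch (pol) {
	case TOY_DEFAULT:	/* local node if possible, else fall back */
		return toy_fallback(s, local_node);
	case TOY_PREFERRED:	/* a chosen node if possible, else fall back */
		return toy_fallback(s, policy_node);
	case TOY_INTERLEAVE:	/* round-robin across nodes */
		n = toy_fallback(s, s->next_il);
		if (n >= 0)
			s->next_il = (n + 1) % TOY_NR_NODES;
		return n;
	case TOY_BIND:		/* only the chosen node, no fallback */
		return toy_take(s, policy_node) ? policy_node : -1;
	}
	return -1;
}
```

With demand faulting, a decision of this kind could be made at fault
time using the faulting task's policy, rather than once at mmap time
with a fixed round-robin.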

> > > > patch just moves the logic from hugetlb_prefault() to
> > > > hugetlb_pte_fault().
> > > 
> > > Are you sure you fixed get_user_pages to handle this properly? It doesn't
> > > like it.
> > 
> > Unless I am missing something, the call to follow_hugetlb_page() in
> > get_user_pages() is just an optimization.  Removing it means
> > follow_page() will be called individually for each PAGE_SIZE page in the
> > huge page.  We can probably do better but I didn't want to cloud this
> > patch with that logic.
> 
> The problem is that get_user_pages needs to handle the case of a large
> page not yet being faulted in properly. The SLES9 implementation did
> some changes for this.
> 
> You don't change it at all, so I suspect it doesn't work yet.

What about:
--- reference/mm/memory.c
+++ current/mm/memory.c
@@ -933,11 +933,6 @@ int get_user_pages(struct task_struct *t
 				|| !(flags & vma->vm_flags))
 			return i ? : -EFAULT;
 
-		if (is_vm_hugetlb_page(vma)) {
-			i = follow_hugetlb_page(mm, vma, pages, vmas,
-						&start, &len, i);
-			continue;
-		}
 		spin_lock(&mm->page_table_lock);
 		do {
 			struct page *page;

> It's a common case - think of people doing raw IO on huge page shared memory.

My Direct IO test seemed to work fine, but I'll give this a closer look
to make sure follow_huge_{addr|pmd} never return a page for an unfaulted
hugetlb page.  Thanks for your close scrutiny and comments. 
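
The behaviour under discussion can be sketched in user space roughly
as follows: a per-PAGE_SIZE walk in which a failed lookup triggers a
fault on the containing huge page and the lookup is then retried.
This is only a model under stated assumptions.  The toy_* names, the
4 KiB / 2 MiB page sizes and the four-huge-page address space are all
hypothetical, and nothing here is the actual get_user_pages() or
follow_page() code.

```c
#define TOY_PAGE_SHIFT	12	/* 4 KiB base pages (assumed) */
#define TOY_HPAGE_SHIFT	21	/* 2 MiB huge pages (assumed) */
#define TOY_PAGE_SIZE	(1UL << TOY_PAGE_SHIFT)
#define TOY_HPAGE_SIZE	(1UL << TOY_HPAGE_SHIFT)
#define TOY_NR_HPAGES	4	/* size of the toy address space */

struct toy_mm {
	int present[TOY_NR_HPAGES];	/* 1 once a huge page is faulted in */
	unsigned long faults;		/* huge page faults taken */
	unsigned long lookups;		/* per-PAGE_SIZE lookups performed */
};

/* Stand-in for a follow_page()-style lookup: returns a small-page
 * frame number, or -1 if the containing huge page is not present. */
static long toy_follow_page(struct toy_mm *mm, unsigned long addr)
{
	unsigned long h = addr >> TOY_HPAGE_SHIFT;

	mm->lookups++;
	if (h >= TOY_NR_HPAGES || !mm->present[h])
		return -1;
	return (long)(addr >> TOY_PAGE_SHIFT);
}

/* Stand-in for the demand fault: instantiate the whole huge page. */
static int toy_fault(struct toy_mm *mm, unsigned long addr)
{
	unsigned long h = addr >> TOY_HPAGE_SHIFT;

	if (h >= TOY_NR_HPAGES)
		return -1;
	mm->present[h] = 1;
	mm->faults++;
	return 0;
}

/* Walk 'len' small pages from 'start', faulting on demand.  Returns
 * the number of pages recorded, or -1 if the first page cannot be
 * faulted in. */
static long toy_get_user_pages(struct toy_mm *mm, unsigned long start,
			       long len, long *pfns)
{
	long i = 0;

	while (i < len) {
		long pfn = toy_follow_page(mm, start);

		if (pfn < 0) {
			if (toy_fault(mm, start) < 0)
				return i ? i : -1;
			continue;	/* retry the lookup after the fault */
		}
		pfns[i++] = pfn;
		start += TOY_PAGE_SIZE;
	}
	return i;
}
```

In this model, walking one whole unfaulted huge page costs a single
fault plus one lookup per small page, which is the per-PAGE_SIZE
overhead that the follow_hugetlb_page() shortcut avoided.  The case
Andi raises corresponds to the lookup having to report "not present"
for an unfaulted huge page instead of returning a page.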

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center
