All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <haveblue@us.ibm.com>
To: Adam Litke <agl@us.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] hugetlb: Fix pool resizing corner case
Date: Wed, 03 Oct 2007 11:59:08 -0700	[thread overview]
Message-ID: <1191437948.4939.105.camel@localhost> (raw)
In-Reply-To: <1191436392.19775.43.camel@localhost.localdomain>

On Wed, 2007-10-03 at 13:33 -0500, Adam Litke wrote:
> On Wed, 2007-10-03 at 10:40 -0700, Dave Hansen wrote:
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 84c795e..7af3908 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -224,14 +224,14 @@ static void try_to_free_low(unsigned long count)
> > >  	for (i = 0; i < MAX_NUMNODES; ++i) {
> > >  		struct page *page, *next;
> > >  		list_for_each_entry_safe(page, next, &hugepage_freelists[i], lru) {
> > > +			if (count >= nr_huge_pages)
> > > +				return;
> > >  			if (PageHighMem(page))
> > >  				continue;
> > >  			list_del(&page->lru);
> > >  			update_and_free_page(page);
> > >  			free_huge_pages--;
> > >  			free_huge_pages_node[page_to_nid(page)]--;
> > > -			if (count >= nr_huge_pages)
> > > -				return;
> > >  		}
> > >  	}
> > >  }
> > 
> > That's an excellent problem description.  I'm just a bit hazy on how the
> > patch fixes it. :)
> > 
> > What is the actual error in this loop?  The fact that we can go trying
> > to free pages when the count is actually OK?
> 
> The above hunk serves only to change the behavior of try_to_free_low()
> so that rather than always freeing _at_least_ one huge page, it can
> return without having freed any pages. 

OK, that makes sense.  Can you include that in the patch description?

> > BTW, try_to_free_low(count) kinda sucks for a function name.  Is that
> > count the number of pages we're trying to end up with, or the total
> > number of low pages that we're trying to free?
> 
> I agree the name sucks, but this is a bugfix patch.

Oh, yeah, none of what I was suggesting belongs in this patch.  But,
they'd make good followups.  I was just trying to get you to look at
what caused that bug to be introduced in the first place.  

> > We could rewrite the original max() line this way:
> > 
> > 	if (resv_huge_pages > nr_of_pages_to_end_up_with)
> > 		nr_of_pages_to_end_up_with = resv_huge_pages;
> > 	try_to_make_the_total_nr_of_hpages(nr_of_pages_to_end_up_with);
> > 
> > Which makes it more clear that you're setting the number of total pages
> > to the number of reserved pages, which is obviously screwy.
> > 
> > OK, so this is actually saying: "count can never go below
> > resv_huge_pages+nr_huge_pages"?
> 
> Not quite.  Count can never go below the number of reserved pages plus
> pages allocated to MAP_PRIVATE mappings.  That number is computed by:
> (resv + (total - free)).

So, (total - free) equals the number of MAP_PRIVATE pages?  Does that
imply that all reserved pages are shared and that all shared pages are
reserved?

This would be clearer even with

static inline int nr_map_private_hpages(void)
{
	return total - free;
}

Especially, if we could get the "shared" name somewhere into the
reserved variable.  

It makes a whole ton of sense to see:

	int total_in_use = nr_shared() + nr_private();
	if (nr_requested_pages < total_in_use)
		uh_oh_cant_do_that();

That'd be a wonderful cleanup.

> > Could we change try_to_free_low() to free a distinct number of pages?
> > 
> > 	if (count > free_huge_pages)
> > 		count = free_huge_pages;
> > 	try_to_free_nr_huge_pages(count);
> > 
> > I feel a bit sketchy about the "resv_huge_pages + nr_huge_pages -
> > free_huge_pages" logic.  Could you elaborate a bit there on what the
> > rules are?
> 
> The key is that we don't want to shrink the pool below the number of
> pages we are committed to keeping around.  Before this patch, we only
> accounted for the pages we plan to hand out (reserved huge pages) but
> not the ones we've already handed out (total - free).  Does that make
> sense?

Yes.  I'm just starting to get the idea that pages are committed in
different ways.  So, it makes good sense to have a nice function like
total_committed_hpages(), which explains all of this, and is called to
compare agaist the evil "count" variable. :)

-- Dave


WARNING: multiple messages have this Message-ID (diff)
From: Dave Hansen <haveblue@us.ibm.com>
To: Adam Litke <agl@us.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] hugetlb: Fix pool resizing corner case
Date: Wed, 03 Oct 2007 11:59:08 -0700	[thread overview]
Message-ID: <1191437948.4939.105.camel@localhost> (raw)
In-Reply-To: <1191436392.19775.43.camel@localhost.localdomain>

On Wed, 2007-10-03 at 13:33 -0500, Adam Litke wrote:
> On Wed, 2007-10-03 at 10:40 -0700, Dave Hansen wrote:
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 84c795e..7af3908 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -224,14 +224,14 @@ static void try_to_free_low(unsigned long count)
> > >  	for (i = 0; i < MAX_NUMNODES; ++i) {
> > >  		struct page *page, *next;
> > >  		list_for_each_entry_safe(page, next, &hugepage_freelists[i], lru) {
> > > +			if (count >= nr_huge_pages)
> > > +				return;
> > >  			if (PageHighMem(page))
> > >  				continue;
> > >  			list_del(&page->lru);
> > >  			update_and_free_page(page);
> > >  			free_huge_pages--;
> > >  			free_huge_pages_node[page_to_nid(page)]--;
> > > -			if (count >= nr_huge_pages)
> > > -				return;
> > >  		}
> > >  	}
> > >  }
> > 
> > That's an excellent problem description.  I'm just a bit hazy on how the
> > patch fixes it. :)
> > 
> > What is the actual error in this loop?  The fact that we can go trying
> > to free pages when the count is actually OK?
> 
> The above hunk serves only to change the behavior of try_to_free_low()
> so that rather than always freeing _at_least_ one huge page, it can
> return without having freed any pages. 

OK, that makes sense.  Can you include that in the patch description?

> > BTW, try_to_free_low(count) kinda sucks for a function name.  Is that
> > count the number of pages we're trying to end up with, or the total
> > number of low pages that we're trying to free?
> 
> I agree the name sucks, but this is a bugfix patch.

Oh, yeah, none of what I was suggesting belongs in this patch.  But,
they'd make good followups.  I was just trying to get you to look at
what caused that bug to be introduced in the first place.  

> > We could rewrite the original max() line this way:
> > 
> > 	if (resv_huge_pages > nr_of_pages_to_end_up_with)
> > 		nr_of_pages_to_end_up_with = resv_huge_pages;
> > 	try_to_make_the_total_nr_of_hpages(nr_of_pages_to_end_up_with);
> > 
> > Which makes it more clear that you're setting the number of total pages
> > to the number of reserved pages, which is obviously screwy.
> > 
> > OK, so this is actually saying: "count can never go below
> > resv_huge_pages+nr_huge_pages"?
> 
> Not quite.  Count can never go below the number of reserved pages plus
> pages allocated to MAP_PRIVATE mappings.  That number is computed by:
> (resv + (total - free)).

So, (total - free) equals the number of MAP_PRIVATE pages?  Does that
imply that all reserved pages are shared and that all shared pages are
reserved?

This would be clearer even with

static inline int nr_map_private_hpages(void)
{
	return total - free;
}

Especially, if we could get the "shared" name somewhere into the
reserved variable.  

It makes a whole ton of sense to see:

	int total_in_use = nr_shared() + nr_private();
	if (nr_requested_pages < total_in_use)
		uh_oh_cant_do_that();

That'd be a wonderful cleanup.

> > Could we change try_to_free_low() to free a distinct number of pages?
> > 
> > 	if (count > free_huge_pages)
> > 		count = free_huge_pages;
> > 	try_to_free_nr_huge_pages(count);
> > 
> > I feel a bit sketchy about the "resv_huge_pages + nr_huge_pages -
> > free_huge_pages" logic.  Could you elaborate a bit there on what the
> > rules are?
> 
> The key is that we don't want to shrink the pool below the number of
> pages we are committed to keeping around.  Before this patch, we only
> accounted for the pages we plan to hand out (reserved huge pages) but
> not the ones we've already handed out (total - free).  Does that make
> sense?

Yes.  I'm just starting to get the idea that pages are committed in
different ways.  So, it makes good sense to have a nice function like
total_committed_hpages(), which explains all of this, and is called to
compare agaist the evil "count" variable. :)

-- Dave

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-10-03 18:59 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-03 15:47 [PATCH] hugetlb: Fix pool resizing corner case Adam Litke
2007-10-03 15:47 ` Adam Litke
2007-10-03 17:40 ` Dave Hansen
2007-10-03 17:40   ` Dave Hansen
2007-10-03 18:33   ` Adam Litke
2007-10-03 18:33     ` Adam Litke
2007-10-03 18:59     ` Dave Hansen [this message]
2007-10-03 18:59       ` Dave Hansen
2007-10-03 19:08       ` Ken Chen
2007-10-03 19:08         ` Ken Chen
2007-10-03 19:03     ` Ken Chen
2007-10-03 19:03       ` Ken Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1191437948.4939.105.camel@localhost \
    --to=haveblue@us.ibm.com \
    --cc=agl@us.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.