Re: [PATCH 1/5] Free huge pages round robin to balance across nodes

From: Mel Gorman <mel@csn.ul.ie>
To: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Nishanth Aravamudan <nacc@us.ibm.com>,
	Adam Litke <agl@us.ibm.com>, Andy Whitcroft <apw@canonical.com>,
	eric.whitney@hp.com
Subject: Re: [PATCH 1/5] Free huge pages round robin to balance across nodes
Date: Wed, 17 Jun 2009 14:18:33 +0100
Message-ID: <20090617131833.GG28529@csn.ul.ie>
In-Reply-To: <20090616135236.25248.93692.sendpatchset@lts-notebook>

On Tue, Jun 16, 2009 at 09:52:36AM -0400, Lee Schermerhorn wrote:
> [PATCH 1/5] Free huge pages round robin to balance across nodes
> 
> Against:  17may09 mmotm
> 
> Currently, although increasing nr_hugepages will [attempt to]
> distribute the new huge pages across all nodes in the system,
> reducing nr_hugepages will free or surplus all free pages
> from nodes in node id order.  This patch frees huge pages
> from nodes in round robin fashion in an attempt to keep
> [persistent] hugepage allocations balanced across the nodes.
> 
> New function free_pool_huge_page() is modeled on and
> performs roughly the inverse of alloc_fresh_huge_page().
> Replaces dequeue_huge_page() which now has no callers
> and can be removed.
> 
> Helper function hstate_next_to_free_node() uses new hstate
> member next_to_free_nid to distribute "frees" across all
> nodes with huge pages.
> 
> I placed this patch first in the series because I think it
> [or something similar] should be applied independent of the
> rest of the series.  
> 

Agreed. Reading through it, I can't see any problems with the patch, and it
does make the freeing symmetric with the allocation.
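
To make the difference concrete, here is a minimal userspace sketch of the
two freeing policies. It is not kernel code: NR_NODES, the plain array of
per-node free counts, and the function names free_in_id_order() and
free_round_robin() are assumptions made for illustration, and the modulo
arithmetic stands in for walking node_online_map.

```c
#include <assert.h>

#define NR_NODES 4	/* hypothetical node count for this sketch */

/* Old behaviour: always take from the lowest-numbered node with free pages. */
static int free_in_id_order(int free_pages[NR_NODES])
{
	int nid;

	for (nid = 0; nid < NR_NODES; nid++) {
		if (free_pages[nid] > 0) {
			free_pages[nid]--;
			return 1;
		}
	}
	return 0;
}

/*
 * Round-robin behaviour: start at *next_nid, free one page from the
 * first node that has any, and leave the cursor just past the last
 * node visited. Returns 1 if a page was freed, 0 if every node is empty.
 */
static int free_round_robin(int free_pages[NR_NODES], int *next_nid)
{
	int start_nid = *next_nid;
	int nid = start_nid;
	int ret = 0;

	assert(start_nid >= 0 && start_nid < NR_NODES);
	do {
		if (free_pages[nid] > 0) {
			free_pages[nid]--;
			ret = 1;
		}
		nid = (nid + 1) % NR_NODES;
		*next_nid = nid;
	} while (!ret && nid != start_nid);

	return ret;
}
```

Starting with four free pages on each of four nodes and freeing eight, the
id-order variant empties nodes 0 and 1, while the round-robin variant
leaves two pages on every node.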

For clarity though, would it be worth renaming hugetlb_next_nid to
next_to_alloc_nid so that there is a clear relationship between the
round-robin allocation and freeing of pages among online nodes?
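
Roughly what I have in mind, as a userspace sketch rather than a patch.
The struct hstate_sketch type, the MAX_NODES limit, the unsigned int
bitmask standing in for node_online_map, and the helper names
next_online_node() and hstate_next_to_alloc_node() are all assumptions
for illustration only. The point is just that the two cursors read as a
pair once they are named alike.

```c
#include <assert.h>

#define MAX_NODES 8	/* hypothetical limit for this sketch */

/* Stand-in for struct hstate, with the allocation cursor renamed. */
struct hstate_sketch {
	int next_to_alloc_nid;	/* proposed new name for hugetlb_next_nid */
	int next_to_free_nid;	/* added by this patch */
};

/*
 * Return the next node after nid whose bit is set in online_mask,
 * wrapping around to the first online node. online_mask must have
 * at least one of the low MAX_NODES bits set.
 */
static int next_online_node(int nid, unsigned int online_mask)
{
	int i;

	assert(online_mask & ((1u << MAX_NODES) - 1));
	for (i = 1; i <= MAX_NODES; i++) {
		int candidate = (nid + i) % MAX_NODES;

		if (online_mask & (1u << candidate))
			return candidate;
	}
	return nid;	/* not reached while online_mask is non-zero */
}

/* Advance the allocation cursor (hypothetical helper name). */
static int hstate_next_to_alloc_node(struct hstate_sketch *h,
				     unsigned int online_mask)
{
	h->next_to_alloc_nid = next_online_node(h->next_to_alloc_nid,
						online_mask);
	return h->next_to_alloc_nid;
}

/* Advance the free cursor, mirroring the helper in the patch. */
static int hstate_next_to_free_node(struct hstate_sketch *h,
				    unsigned int online_mask)
{
	h->next_to_free_nid = next_online_node(h->next_to_free_nid,
					       online_mask);
	return h->next_to_free_nid;
}
```

The two cursors advance independently, so shrinking the pool does not
disturb where the next allocation will land.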

> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
> 
>  include/linux/hugetlb.h |    1 
>  mm/hugetlb.c            |   68 +++++++++++++++++++++++++++++++++---------------
>  2 files changed, 48 insertions(+), 21 deletions(-)
> 
> Index: linux-2.6.30-rc8-mmotm-090603-1633/include/linux/hugetlb.h
> ===================================================================
> --- linux-2.6.30-rc8-mmotm-090603-1633.orig/include/linux/hugetlb.h	2009-06-04 12:59:29.000000000 -0400
> +++ linux-2.6.30-rc8-mmotm-090603-1633/include/linux/hugetlb.h	2009-06-04 12:59:31.000000000 -0400
> @@ -184,6 +184,7 @@ unsigned long hugetlb_get_unmapped_area(
>  /* Defines one hugetlb page size */
>  struct hstate {
>  	int hugetlb_next_nid;
> +	int next_to_free_nid;
>  	unsigned int order;
>  	unsigned long mask;
>  	unsigned long max_huge_pages;
> Index: linux-2.6.30-rc8-mmotm-090603-1633/mm/hugetlb.c
> ===================================================================
> --- linux-2.6.30-rc8-mmotm-090603-1633.orig/mm/hugetlb.c	2009-06-04 12:59:29.000000000 -0400
> +++ linux-2.6.30-rc8-mmotm-090603-1633/mm/hugetlb.c	2009-06-04 12:59:31.000000000 -0400
> @@ -455,24 +455,6 @@ static void enqueue_huge_page(struct hst
>  	h->free_huge_pages_node[nid]++;
>  }
>  
> -static struct page *dequeue_huge_page(struct hstate *h)
> -{
> -	int nid;
> -	struct page *page = NULL;
> -
> -	for (nid = 0; nid < MAX_NUMNODES; ++nid) {
> -		if (!list_empty(&h->hugepage_freelists[nid])) {
> -			page = list_entry(h->hugepage_freelists[nid].next,
> -					  struct page, lru);
> -			list_del(&page->lru);
> -			h->free_huge_pages--;
> -			h->free_huge_pages_node[nid]--;
> -			break;
> -		}
> -	}
> -	return page;
> -}
> -
>  static struct page *dequeue_huge_page_vma(struct hstate *h,
>  				struct vm_area_struct *vma,
>  				unsigned long address, int avoid_reserve)
> @@ -683,6 +665,52 @@ static int alloc_fresh_huge_page(struct 
>  	return ret;
>  }
>  
> +/*
> + * helper for free_pool_huge_page() - find next node
> + * from which to free a huge page
> + */
> +static int hstate_next_to_free_node(struct hstate *h)
> +{
> +	int next_nid;
> +	next_nid = next_node(h->next_to_free_nid, node_online_map);
> +	if (next_nid == MAX_NUMNODES)
> +		next_nid = first_node(node_online_map);
> +	h->next_to_free_nid = next_nid;
> +	return next_nid;
> +}
> +
> +/*
> + * Free huge page from pool from next node to free.
> + * Attempt to keep persistent huge pages more or less
> + * balanced over allowed nodes.
> + * Called with hugetlb_lock locked.
> + */
> +static int free_pool_huge_page(struct hstate *h)
> +{
> +	int start_nid;
> +	int nid;
> +	int ret = 0;
> +
> +	start_nid = h->next_to_free_nid;
> +	nid = h->next_to_free_nid;
> +
> +	do {
> +		if (!list_empty(&h->hugepage_freelists[nid])) {
> +			struct page *page =
> +				list_entry(h->hugepage_freelists[nid].next,
> +					  struct page, lru);
> +			list_del(&page->lru);
> +			h->free_huge_pages--;
> +			h->free_huge_pages_node[nid]--;
> +			update_and_free_page(h, page);
> +			ret = 1;
> +		}
> +		nid = hstate_next_to_free_node(h);
> +	} while (!ret && nid != start_nid);
> +
> +	return ret;
> +}
> +
>  static struct page *alloc_buddy_huge_page(struct hstate *h,
>  			struct vm_area_struct *vma, unsigned long address)
>  {
> @@ -1226,10 +1254,8 @@ static unsigned long set_max_huge_pages(
>  	min_count = max(count, min_count);
>  	try_to_free_low(h, min_count);
>  	while (min_count < persistent_huge_pages(h)) {
> -		struct page *page = dequeue_huge_page(h);
> -		if (!page)
> +		if (!free_pool_huge_page(h))
>  			break;
> -		update_and_free_page(h, page);
>  	}
>  	while (count < persistent_huge_pages(h)) {
>  		if (!adjust_pool_surplus(h, 1))
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
