From: Nishanth Aravamudan <nacc@us.ibm.com>
To: clameter@sgi.com
Cc: lee.schermerhorn@hp.com, wli@holomorphy.com, melgor@ie.ibm.com,
akpm@linux-foundation.org, linux-mm@kvack.org, agl@us.ibm.com
Subject: [RFC][PATCH 5/5] hugetlb: interleave dequeueing of huge pages
Date: Mon, 6 Aug 2007 09:48:42 -0700
Message-ID: <20070806164842.GQ15714@us.ibm.com>
In-Reply-To: <20070806164410.GO15714@us.ibm.com>
Currently, when shrinking the hugetlb pool, we free all of the pages on
node 0, then all of the pages on node 1, and so on. With this patch, we
instead interleave over the valid nodes, as constrained by the enclosing
cpuset (or over the populated nodes if !CPUSETS). If a particular node
should be cleared first, the sysfs allocator can be used for
finer-grained control. Interleaving also helps keep the pool balanced
as it is resized at run-time.
Before:
Trying to clear the hugetlb pool
Done. 0 free
Trying to resize the pool to 100
Node 3 HugePages_Free: 0
Node 2 HugePages_Free: 0
Node 1 HugePages_Free: 50
Node 0 HugePages_Free: 50
Done. Initially 100 free
Trying to resize the pool to 200
Node 3 HugePages_Free: 0
Node 2 HugePages_Free: 0
Node 1 HugePages_Free: 100
Node 0 HugePages_Free: 100
Done. 200 free
Trying to resize the pool back to 100
Node 3 HugePages_Free: 0
Node 2 HugePages_Free: 0
Node 1 HugePages_Free: 100
Node 0 HugePages_Free: 0
Done. 100 free
After:
Trying to clear the hugetlb pool
Done. 0 free
Trying to resize the pool to 100
Node 3 HugePages_Free: 0
Node 2 HugePages_Free: 0
Node 1 HugePages_Free: 50
Node 0 HugePages_Free: 50
Done. Initially 100 free
Trying to resize the pool to 200
Node 3 HugePages_Free: 0
Node 2 HugePages_Free: 0
Node 1 HugePages_Free: 100
Node 0 HugePages_Free: 100
Done. 200 free
Trying to resize the pool back to 100
Node 3 HugePages_Free: 0
Node 2 HugePages_Free: 0
Node 1 HugePages_Free: 50
Node 0 HugePages_Free: 50
Done. 100 free
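The difference between the two runs above can be modeled in userspace. The sketch below is not kernel code: struct pool, MAX_NODES, dequeue_lowest_first(), dequeue_interleaved() and shrink_pool() are hypothetical names invented for illustration, and a plain round-robin cursor stands in for what interleave_nodes() does with the mempolicy in the patch. It only assumes that the old path drains the lowest-numbered non-empty node first and that the new path rotates over nodes, skipping empty ones.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4	/* hypothetical node count for this model */

/* Hypothetical stand-in for the per-node free lists: one counter per node. */
struct pool {
	unsigned long free[MAX_NODES];
	unsigned long total;
	int next_nid;	/* models the interleave cursor */
};

/* Old behaviour: always take from the lowest-numbered node with free pages. */
static int dequeue_lowest_first(struct pool *p)
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		if (p->free[nid]) {
			p->free[nid]--;
			p->total--;
			return nid;
		}
	}
	return -1;	/* pool is empty */
}

/* New behaviour: visit nodes round-robin from the cursor, skipping empty ones. */
static int dequeue_interleaved(struct pool *p)
{
	for (int tried = 0; tried < MAX_NODES; tried++) {
		int nid = p->next_nid;

		p->next_nid = (p->next_nid + 1) % MAX_NODES;
		if (p->free[nid]) {
			p->free[nid]--;
			p->total--;
			return nid;
		}
	}
	return -1;	/* every node was empty */
}

/* Shrink the pool to at most `count` pages with the given dequeue strategy. */
static void shrink_pool(struct pool *p, unsigned long count,
			int (*dequeue)(struct pool *))
{
	assert(p != NULL && dequeue != NULL);

	while (count < p->total) {
		if (dequeue(p) < 0)
			break;
	}
}
```

Starting from 100 free pages on each of nodes 0 and 1 and shrinking to 100, the lowest-first strategy leaves node 0 empty and node 1 untouched, while the interleaved strategy leaves 50 pages on each node, which matches the Before and After output above.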
Tested on: 2-node IA64, 4-node ppc64 (2 memoryless nodes), 4-node ppc64
(no memoryless nodes), 4-node x86_64, !NUMA x86, 1-node x86 (NUMA-Q)
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index af07a0b..f6d1811 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -78,7 +78,27 @@ static struct page *dequeue_huge_page_node(int nid)
return page;
}
-static struct page *dequeue_huge_page(struct vm_area_struct *vma,
+static struct page *dequeue_huge_page(struct mempolicy *policy)
+{
+ struct page *page;
+ int nid;
+ int start_nid = interleave_nodes(policy);
+
+ nid = start_nid;
+
+ do {
+ if (!list_empty(&hugepage_freelists[nid])) {
+ page = dequeue_huge_page_node(nid);
+ if (page)
+ return page;
+ }
+ nid = interleave_nodes(policy);
+ } while (nid != start_nid);
+
+ return NULL;
+}
+
+static struct page *dequeue_huge_page_vma(struct vm_area_struct *vma,
unsigned long address)
{
int nid;
@@ -155,7 +175,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
else if (free_huge_pages <= resv_huge_pages)
goto fail;
- page = dequeue_huge_page(vma, addr);
+ page = dequeue_huge_page_vma(vma, addr);
if (!page)
goto fail;
@@ -295,20 +315,23 @@ static unsigned long set_max_huge_pages(unsigned long count)
if (!alloc_fresh_huge_page(pol))
break;
}
- mpol_free(pol);
- if (count >= nr_huge_pages)
+ if (count >= nr_huge_pages) {
+ mpol_free(pol);
return nr_huge_pages;
+ }
spin_lock(&hugetlb_lock);
count = max(count, resv_huge_pages);
try_to_free_low(count);
+ set_first_interleave_node(cpuset_current_mems_allowed);
while (count < nr_huge_pages) {
- struct page *page = dequeue_huge_page(NULL, 0);
+ struct page *page = dequeue_huge_page(pol);
if (!page)
break;
update_and_free_page(page_to_nid(page), page);
}
spin_unlock(&hugetlb_lock);
+ mpol_free(pol);
return nr_huge_pages;
}
--
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center