From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, Mel Gorman <mel@csn.ul.ie>,
Nishanth Aravamudan <nacc@us.ibm.com>,
Adam Litke <agl@us.ibm.com>, Andy Whitcroft <apw@canonical.com>,
eric.whitney@hp.com
Subject: [PATCH 1/5] Free huge pages round robin to balance across nodes
Date: Tue, 16 Jun 2009 09:52:36 -0400
Message-ID: <20090616135236.25248.93692.sendpatchset@lts-notebook>
In-Reply-To: <20090616135228.25248.22018.sendpatchset@lts-notebook>
Against: 17may09 mmotm
Currently, although increasing nr_hugepages will [attempt to]
distribute the new huge pages across all nodes in the system,
reducing nr_hugepages will free or surplus all free huge pages
from nodes in node id order. This patch frees huge pages
from nodes in round-robin fashion in an attempt to keep
[persistent] hugepage allocations balanced across the nodes.
New function free_pool_huge_page() is modeled on, and
performs roughly the inverse of, alloc_fresh_huge_page().
It replaces dequeue_huge_page(), which now has no callers
and can be removed.
Helper function hstate_next_to_free_node() uses the new hstate
member next_to_free_nid to distribute "frees" across all
nodes with huge pages.
I placed this patch first in the series because I think it
[or something similar] should be applied independently of the
rest of the series.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
 include/linux/hugetlb.h |    1 
 mm/hugetlb.c            |   68 +++++++++++++++++++++++++++++++++---------------
 2 files changed, 48 insertions(+), 21 deletions(-)
Index: linux-2.6.30-rc8-mmotm-090603-1633/include/linux/hugetlb.h
===================================================================
--- linux-2.6.30-rc8-mmotm-090603-1633.orig/include/linux/hugetlb.h 2009-06-04 12:59:29.000000000 -0400
+++ linux-2.6.30-rc8-mmotm-090603-1633/include/linux/hugetlb.h 2009-06-04 12:59:31.000000000 -0400
@@ -184,6 +184,7 @@ unsigned long hugetlb_get_unmapped_area(
/* Defines one hugetlb page size */
struct hstate {
int hugetlb_next_nid;
+ int next_to_free_nid;
unsigned int order;
unsigned long mask;
unsigned long max_huge_pages;
Index: linux-2.6.30-rc8-mmotm-090603-1633/mm/hugetlb.c
===================================================================
--- linux-2.6.30-rc8-mmotm-090603-1633.orig/mm/hugetlb.c 2009-06-04 12:59:29.000000000 -0400
+++ linux-2.6.30-rc8-mmotm-090603-1633/mm/hugetlb.c 2009-06-04 12:59:31.000000000 -0400
@@ -455,24 +455,6 @@ static void enqueue_huge_page(struct hst
h->free_huge_pages_node[nid]++;
}
-static struct page *dequeue_huge_page(struct hstate *h)
-{
- int nid;
- struct page *page = NULL;
-
- for (nid = 0; nid < MAX_NUMNODES; ++nid) {
- if (!list_empty(&h->hugepage_freelists[nid])) {
- page = list_entry(h->hugepage_freelists[nid].next,
- struct page, lru);
- list_del(&page->lru);
- h->free_huge_pages--;
- h->free_huge_pages_node[nid]--;
- break;
- }
- }
- return page;
-}
-
static struct page *dequeue_huge_page_vma(struct hstate *h,
struct vm_area_struct *vma,
unsigned long address, int avoid_reserve)
@@ -683,6 +665,52 @@ static int alloc_fresh_huge_page(struct
return ret;
}
+/*
+ * helper for free_pool_huge_page() - find next node
+ * from which to free a huge page
+ */
+static int hstate_next_to_free_node(struct hstate *h)
+{
+ int next_nid;
+ next_nid = next_node(h->next_to_free_nid, node_online_map);
+ if (next_nid == MAX_NUMNODES)
+ next_nid = first_node(node_online_map);
+ h->next_to_free_nid = next_nid;
+ return next_nid;
+}
+
+/*
+ * Free huge page from pool from next node to free.
+ * Attempt to keep persistent huge pages more or less
+ * balanced over allowed nodes.
+ * Called with hugetlb_lock locked.
+ */
+static int free_pool_huge_page(struct hstate *h)
+{
+ int start_nid;
+ int nid;
+ int ret = 0;
+
+ start_nid = h->next_to_free_nid;
+ nid = h->next_to_free_nid;
+
+ do {
+ if (!list_empty(&h->hugepage_freelists[nid])) {
+ struct page *page =
+ list_entry(h->hugepage_freelists[nid].next,
+ struct page, lru);
+ list_del(&page->lru);
+ h->free_huge_pages--;
+ h->free_huge_pages_node[nid]--;
+ update_and_free_page(h, page);
+ ret = 1;
+ }
+ nid = hstate_next_to_free_node(h);
+ } while (!ret && nid != start_nid);
+
+ return ret;
+}
+
static struct page *alloc_buddy_huge_page(struct hstate *h,
struct vm_area_struct *vma, unsigned long address)
{
@@ -1226,10 +1254,8 @@ static unsigned long set_max_huge_pages(
min_count = max(count, min_count);
try_to_free_low(h, min_count);
while (min_count < persistent_huge_pages(h)) {
- struct page *page = dequeue_huge_page(h);
- if (!page)
+ if (!free_pool_huge_page(h))
break;
- update_and_free_page(h, page);
}
while (count < persistent_huge_pages(h)) {
if (!adjust_pool_surplus(h, 1))
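For reference [not part of the patch], the shrink path changed above is
driven from nr_hugepages; on a NUMA machine the per-node effect can be
observed via the sysfs node meminfo files (as root):

```shell
# grow the pool; new pages are already distributed across nodes
echo 16 > /proc/sys/vm/nr_hugepages

# shrink it; with this patch the frees rotate across nodes as well
echo 8 > /proc/sys/vm/nr_hugepages

# inspect the per-node free huge page counts
grep HugePages_Free /sys/devices/system/node/node*/meminfo
```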
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>