* [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages
2008-02-25 22:01 [PATCH 0/3] hugetlb: Dynamic pool resize improvements Adam Litke
@ 2008-02-25 22:01 ` Adam Litke
2008-02-25 22:31 ` Dave Hansen
0 siblings, 1 reply; 7+ messages in thread
From: Adam Litke @ 2008-02-25 22:01 UTC (permalink / raw)
To: linux-mm; +Cc: mel, apw, nacc, agl
To reduce the number of times we acquire and release hugetlb_lock when
freeing excess huge pages, loop through the page list twice: once to siphon
pages into the hugetlb pool, and again to free the excess pages back to the
buddy allocator. This removes the lock/unlock around free_huge_page().
Note that we still visit each page only once since a page is always removed
from the list when it is visited.
Thanks Mel Gorman for this improvement.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
---
mm/hugetlb.c | 17 ++++++++++++-----
1 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8296431..306d762 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -349,11 +349,19 @@ retry:
resv_huge_pages += delta;
ret = 0;
free:
+ /* Free the needed pages to the hugetlb pool */
list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+ if ((--needed) < 0)
+ break;
list_del(&page->lru);
- if ((--needed) >= 0)
- enqueue_huge_page(page);
- else {
+ enqueue_huge_page(page);
+ }
+
+ /* Free unnecessary surplus pages to the buddy allocator */
+ if (!list_empty(&surplus_list)) {
+ spin_unlock(&hugetlb_lock);
+ list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+ list_del(&page->lru);
/*
* The page has a reference count of zero already, so
* call free_huge_page directly instead of using
@@ -361,10 +369,9 @@ free:
* unlocked which is safe because free_huge_page takes
* hugetlb_lock before deciding how to free the page.
*/
- spin_unlock(&hugetlb_lock);
free_huge_page(page);
- spin_lock(&hugetlb_lock);
}
+ spin_lock(&hugetlb_lock);
}
return ret;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages
2008-02-25 22:01 ` [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages Adam Litke
@ 2008-02-25 22:31 ` Dave Hansen
2008-02-25 22:51 ` Adam Litke
0 siblings, 1 reply; 7+ messages in thread
From: Dave Hansen @ 2008-02-25 22:31 UTC (permalink / raw)
To: Adam Litke; +Cc: linux-mm, mel, apw, nacc, agl
On Mon, 2008-02-25 at 14:01 -0800, Adam Litke wrote:
> + /* Free unnecessary surplus pages to the buddy allocator */
> + if (!list_empty(&surplus_list)) {
> + spin_unlock(&hugetlb_lock);
> + list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
> + list_del(&page->lru);
What is the surplus_list protected by?
-- Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages
2008-02-25 22:31 ` Dave Hansen
@ 2008-02-25 22:51 ` Adam Litke
0 siblings, 0 replies; 7+ messages in thread
From: Adam Litke @ 2008-02-25 22:51 UTC (permalink / raw)
To: Dave Hansen; +Cc: linux-mm, mel, apw, nacc, agl
On Mon, 2008-02-25 at 14:31 -0800, Dave Hansen wrote:
> On Mon, 2008-02-25 at 14:01 -0800, Adam Litke wrote:
> > + /* Free unnecessary surplus pages to the buddy allocator */
> > + if (!list_empty(&surplus_list)) {
> > + spin_unlock(&hugetlb_lock);
> > + list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
> > + list_del(&page->lru);
>
> What is the surplus_list protected by?
It's a local variable on the stack.
--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 0/3] hugetlb: Dynamic pool resize improvements
@ 2008-03-03 18:06 Adam Litke
2008-03-03 18:06 ` [PATCH 1/3] hugetlb: Correct page count for surplus huge pages Adam Litke
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Adam Litke @ 2008-03-03 18:06 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Dave Hansen, Andy Whitcroft, Mel Gorman, Adam Litke
Andrew, I think these have now been properly vetted. Would you consider them
for -mm please? The first two are bug fixes and should also be considered for
merging into 2.6.25. The third patch can happily wait until the next merge
window.
This series of patches contains fixes for a few issues found with the new
dynamically resizing hugetlb pool while stress testing. The first patch
corrects the page count for surplus huge pages when they are first allocated.
This avoids a BUG when CONFIG_DEBUG_VM is enabled. The second patch closes a
difficult to trigger race when setting up a reservation involving surplus pages
which could lead to reservations not being honored. The third patch is a minor
performance optimization in gather_surplus_huge_pages().
These patches were generated against 2.6.24
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] hugetlb: Correct page count for surplus huge pages
2008-03-03 18:06 [PATCH 0/3] hugetlb: Dynamic pool resize improvements Adam Litke
@ 2008-03-03 18:06 ` Adam Litke
2008-03-03 18:06 ` [PATCH 2/3] hugetlb: Close a difficult to trigger reservation race Adam Litke
2008-03-03 18:06 ` [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages Adam Litke
2 siblings, 0 replies; 7+ messages in thread
From: Adam Litke @ 2008-03-03 18:06 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Dave Hansen, Andy Whitcroft, Mel Gorman, Adam Litke
Free pages in the hugetlb pool are free and as such have a reference
count of zero. Regular allocations into the pool from the buddy are
"freed" into the pool which results in their page_count dropping to zero.
However, surplus pages can be directly utilized by the caller without first
being freed to the pool. Therefore, a call to put_page_testzero() is in
order so that such a page will be handed to the caller with a correct
count.
This has not effected end users because the bad page count is reset before
the page is handed off. However, under CONFIG_DEBUG_VM this triggers a BUG
when the page count is validated.
Thanks go to Mel for first spotting this issue and providing an initial
fix.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
---
mm/hugetlb.c | 13 ++++++++++---
1 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index db861d8..819d6d9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -267,6 +267,12 @@ static struct page *alloc_buddy_huge_page(struct vm_area_struct *vma,
spin_lock(&hugetlb_lock);
if (page) {
+ /*
+ * This page is now managed by the hugetlb allocator and has
+ * no users -- drop the buddy allocator's reference.
+ */
+ int page_count = put_page_testzero(page);
+ BUG_ON(page_count != 0);
nid = page_to_nid(page);
set_compound_page_dtor(page, free_huge_page);
/*
@@ -345,13 +351,14 @@ free:
enqueue_huge_page(page);
else {
/*
- * Decrement the refcount and free the page using its
- * destructor. This must be done with hugetlb_lock
+ * The page has a reference count of zero already, so
+ * call free_huge_page directly instead of using
+ * put_page. This must be done with hugetlb_lock
* unlocked which is safe because free_huge_page takes
* hugetlb_lock before deciding how to free the page.
*/
spin_unlock(&hugetlb_lock);
- put_page(page);
+ free_huge_page(page);
spin_lock(&hugetlb_lock);
}
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/3] hugetlb: Close a difficult to trigger reservation race
2008-03-03 18:06 [PATCH 0/3] hugetlb: Dynamic pool resize improvements Adam Litke
2008-03-03 18:06 ` [PATCH 1/3] hugetlb: Correct page count for surplus huge pages Adam Litke
@ 2008-03-03 18:06 ` Adam Litke
2008-03-03 18:06 ` [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages Adam Litke
2 siblings, 0 replies; 7+ messages in thread
From: Adam Litke @ 2008-03-03 18:06 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Dave Hansen, Andy Whitcroft, Mel Gorman, Adam Litke
A hugetlb reservation may be inadequately backed in the event of racing
allocations and frees when utilizing surplus huge pages. Consider the
following series of events in processes A and B:
A) Allocates some surplus pages to satisfy a reservation
B) Frees some huge pages
A) A notices the extra free pages and drops hugetlb_lock to free some of
its surplus pages back to the buddy allocator.
B) Allocates some huge pages
A) Reacquires hugetlb_lock and returns from gather_surplus_huge_pages()
Avoid this by commiting the reservation after pages have been allocated but
before dropping the lock to free excess pages. For parity, release the
reservation in return_unused_surplus_pages().
This patch also corrects the cpuset_mems_nr() error path in
hugetlb_acct_memory(). If the cpuset check fails, uncommit the
reservation, but also be sure to return any surplus huge pages that may
have been allocated to back the failed reservation.
Thanks to Andy Whitcroft for discovering this.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
---
mm/hugetlb.c | 17 +++++++++++++----
1 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 819d6d9..f6ce740 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -301,8 +301,10 @@ static int gather_surplus_pages(int delta)
int needed, allocated;
needed = (resv_huge_pages + delta) - free_huge_pages;
- if (needed <= 0)
+ if (needed <= 0) {
+ resv_huge_pages += delta;
return 0;
+ }
allocated = 0;
INIT_LIST_HEAD(&surplus_list);
@@ -340,9 +342,12 @@ retry:
* The surplus_list now contains _at_least_ the number of extra pages
* needed to accomodate the reservation. Add the appropriate number
* of pages to the hugetlb pool and free the extras back to the buddy
- * allocator.
+ * allocator. Commit the entire reservation here to prevent another
+ * process from stealing the pages as they are added to the pool but
+ * before they are reserved.
*/
needed += allocated;
+ resv_huge_pages += delta;
ret = 0;
free:
list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
@@ -377,6 +382,9 @@ static void return_unused_surplus_pages(unsigned long unused_resv_pages)
struct page *page;
unsigned long nr_pages;
+ /* Uncommit the reservation */
+ resv_huge_pages -= unused_resv_pages;
+
nr_pages = min(unused_resv_pages, surplus_huge_pages);
while (nr_pages) {
@@ -1198,12 +1206,13 @@ static int hugetlb_acct_memory(long delta)
if (gather_surplus_pages(delta) < 0)
goto out;
- if (delta > cpuset_mems_nr(free_huge_pages_node))
+ if (delta > cpuset_mems_nr(free_huge_pages_node)) {
+ return_unused_surplus_pages(delta);
goto out;
+ }
}
ret = 0;
- resv_huge_pages += delta;
if (delta < 0)
return_unused_surplus_pages((unsigned long) -delta);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages
2008-03-03 18:06 [PATCH 0/3] hugetlb: Dynamic pool resize improvements Adam Litke
2008-03-03 18:06 ` [PATCH 1/3] hugetlb: Correct page count for surplus huge pages Adam Litke
2008-03-03 18:06 ` [PATCH 2/3] hugetlb: Close a difficult to trigger reservation race Adam Litke
@ 2008-03-03 18:06 ` Adam Litke
2 siblings, 0 replies; 7+ messages in thread
From: Adam Litke @ 2008-03-03 18:06 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Dave Hansen, Andy Whitcroft, Mel Gorman, Adam Litke
To reduce hugetlb_lock acquisitions and releases when freeing excess
surplus pages, scan the page list in two parts. First, transfer the needed
pages to the hugetlb pool. Then drop the lock and free the remaining
pages back to the buddy allocator.
In the common case there are zero excess pages and no lock operations are
required.
Thanks Mel Gorman for this improvement.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
---
mm/hugetlb.c | 17 ++++++++++++-----
1 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f6ce740..b0d48bf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -350,11 +350,19 @@ retry:
resv_huge_pages += delta;
ret = 0;
free:
+ /* Free the needed pages to the hugetlb pool */
list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+ if ((--needed) < 0)
+ break;
list_del(&page->lru);
- if ((--needed) >= 0)
- enqueue_huge_page(page);
- else {
+ enqueue_huge_page(page);
+ }
+
+ /* Free unnecessary surplus pages to the buddy allocator */
+ if (!list_empty(&surplus_list)) {
+ spin_unlock(&hugetlb_lock);
+ list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+ list_del(&page->lru);
/*
* The page has a reference count of zero already, so
* call free_huge_page directly instead of using
@@ -362,10 +370,9 @@ free:
* unlocked which is safe because free_huge_page takes
* hugetlb_lock before deciding how to free the page.
*/
- spin_unlock(&hugetlb_lock);
free_huge_page(page);
- spin_lock(&hugetlb_lock);
}
+ spin_lock(&hugetlb_lock);
}
return ret;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-03-03 18:06 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-03-03 18:06 [PATCH 0/3] hugetlb: Dynamic pool resize improvements Adam Litke
2008-03-03 18:06 ` [PATCH 1/3] hugetlb: Correct page count for surplus huge pages Adam Litke
2008-03-03 18:06 ` [PATCH 2/3] hugetlb: Close a difficult to trigger reservation race Adam Litke
2008-03-03 18:06 ` [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages Adam Litke
-- strict thread matches above, loose matches on Subject: below --
2008-02-25 22:01 [PATCH 0/3] hugetlb: Dynamic pool resize improvements Adam Litke
2008-02-25 22:01 ` [PATCH 3/3] hugetlb: Decrease hugetlb_lock cycling in gather_surplus_huge_pages Adam Litke
2008-02-25 22:31 ` Dave Hansen
2008-02-25 22:51 ` Adam Litke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).