[PATCH 4/5] [hugetlb] Try to grow pool on alloc_huge_page failure

From: Adam Litke <agl@us.ibm.com>
To: linux-mm@kvack.org
Cc: Mel Gorman <mel@skynet.ie>, Andy Whitcroft <apw@shadowen.org>,
	William Lee Irwin III <wli@holomorphy.com>,
	Christoph Lameter <clameter@sgi.com>,
	Ken Chen <kenchen@google.com>, Adam Litke <agl@us.ibm.com>
Subject: [PATCH 4/5] [hugetlb] Try to grow pool on alloc_huge_page failure
Date: Fri, 13 Jul 2007 08:17:06 -0700
Message-ID: <20070713151706.17750.89107.stgit@kernel>
In-Reply-To: <20070713151621.17750.58171.stgit@kernel>

Because we overcommit hugepages for MAP_PRIVATE mappings, it is possible that
the hugetlb pool will be exhausted (or fully reserved) when a hugepage is
needed to satisfy a page fault.  Before killing the process in this situation,
try to allocate a hugepage directly from the buddy allocator.  Only do this if
the process would remain within its locked_vm memory limits.
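
To make the limit check concrete, here is a minimal user-space sketch of the
arithmetic, not the kernel code itself.  It assumes 4 KiB base pages and 2 MiB
huge pages (so 512 base pages per hugepage).  The sketch_ names and constants
are hypothetical, and the limit is passed in as a byte count standing in for
the RLIMIT_MEMLOCK soft limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed for illustration: 4 KiB base pages and 2 MiB huge pages. */
#define SKETCH_PAGE_SHIFT           12
#define SKETCH_BASE_PAGES_PER_HPAGE 512UL

/*
 * User-space model of the locked_vm check.  locked_vm is counted in base
 * pages; memlock_limit_bytes is in bytes, like an rlimit value.
 * Returns true if locking hpage_delta more huge pages stays within the limit.
 */
static bool sketch_within_locked_vm_limits(unsigned long locked_vm,
					   unsigned long memlock_limit_bytes,
					   unsigned long hpage_delta)
{
	unsigned long locked_pages, limit_pages;

	/* Keep the multiplication below far from overflow in this sketch. */
	assert(hpage_delta <= 1024);

	/* Convert the requested huge pages to base pages and add them. */
	locked_pages = locked_vm + hpage_delta * SKETCH_BASE_PAGES_PER_HPAGE;

	/* Convert the byte limit to base pages. */
	limit_pages = memlock_limit_bytes >> SKETCH_PAGE_SHIFT;

	return locked_pages <= limit_pages;
}
```

With a 4 MiB limit (1024 base pages), a process that already has 512 base
pages locked can take one more hugepage, while one with 513 locked cannot.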

Hugepages allocated directly from the buddy allocator (surplus pages)
should be freed back to the buddy allocator to prevent unbounded growth of
the hugetlb pool.  Introduce a per-node surplus pages counter.  When
free_huge_page frees a page on a node whose counter is non-zero, it returns
the page to the buddy allocator and decrements the counter; otherwise it
puts the page back on the hugepage free list.
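
The surplus accounting can be modelled in user space roughly as follows.  This
is a sketch under assumptions, not the patch's code: the sketch_ names and the
four-node array are hypothetical, and it assumes update_and_free_page() removes
the page from the pool totals, which this patch does not show.

```c
#include <assert.h>

#define SKETCH_MAX_NODES 4	/* hypothetical node count for the model */

struct sketch_pool {
	unsigned int nr_pages[SKETCH_MAX_NODES];	/* pages the pool holds */
	unsigned int free_pages[SKETCH_MAX_NODES];	/* pages on the free list */
	unsigned int surplus_pages[SKETCH_MAX_NODES];	/* pages beyond pool size */
};

/* Models a surplus allocation: the page comes straight from the buddy allocator. */
static void sketch_alloc_surplus(struct sketch_pool *p, int nid)
{
	assert(nid >= 0 && nid < SKETCH_MAX_NODES);
	p->nr_pages[nid]++;
	p->surplus_pages[nid]++;
}

/*
 * Models the free path.  Returns 1 if the page went back to the buddy
 * allocator, 0 if it was put on the hugepage free list.
 */
static int sketch_free(struct sketch_pool *p, int nid)
{
	assert(nid >= 0 && nid < SKETCH_MAX_NODES);
	if (p->surplus_pages[nid]) {
		/* Surplus outstanding on this node: shrink the pool again. */
		p->nr_pages[nid]--;
		p->surplus_pages[nid]--;
		return 1;
	}
	/* No surplus: the page stays in the pool. */
	p->free_pages[nid]++;
	return 0;
}
```

Because the counter is kept per node rather than per page, the page handed
back to the buddy allocator need not be the one that was allocated as surplus;
the first page freed on a node with outstanding surplus is the one released.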

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Adam Litke <agl@us.ibm.com>
---

 mm/hugetlb.c |   82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a754c20..f03db67 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -27,6 +27,7 @@ unsigned long max_huge_pages;
 static struct list_head hugepage_freelists[MAX_NUMNODES];
 static unsigned int nr_huge_pages_node[MAX_NUMNODES];
 static unsigned int free_huge_pages_node[MAX_NUMNODES];
+static unsigned int surplus_huge_pages_node[MAX_NUMNODES];
 /*
  * Protects updates to hugepage_freelists, nr_huge_pages, and free_huge_pages
  */
@@ -105,16 +106,22 @@ static void update_and_free_page(struct page *page)
 
 static void free_huge_page(struct page *page)
 {
-	BUG_ON(page_count(page));
+	int nid = page_to_nid(page);
 
+	BUG_ON(page_count(page));
 	INIT_LIST_HEAD(&page->lru);
 
 	spin_lock(&hugetlb_lock);
-	enqueue_huge_page(page);
+	if (surplus_huge_pages_node[nid]) {
+		update_and_free_page(page);
+		surplus_huge_pages_node[nid]--;
+	} else {
+		enqueue_huge_page(page);
+	}
 	spin_unlock(&hugetlb_lock);
 }
 
-static int alloc_fresh_huge_page(void)
+static struct page *__alloc_fresh_huge_page(void)
 {
 	static int nid = 0;
 	struct page *page;
@@ -129,16 +136,72 @@ static int alloc_fresh_huge_page(void)
 		nr_huge_pages++;
 		nr_huge_pages_node[page_to_nid(page)]++;
 		spin_unlock(&hugetlb_lock);
+	}
+	return page;
+}
+
+static int alloc_fresh_huge_page(void)
+{
+	struct page *page;
+
+	page = __alloc_fresh_huge_page();
+	if (page) {
 		put_page(page); /* free it into the hugepage allocator */
 		return 1;
 	}
 	return 0;
 }
 
+/*
+ * Returns 1 if a process remains within lock limits after locking
+ * hpage_delta huge pages. It is expected that mmap_sem is held
+ * when calling this function, otherwise the locked_vm counter may
+ * change unexpectedly
+ */
+static int within_locked_vm_limits(long hpage_delta)
+{
+	unsigned long locked_pages, locked_pages_limit;
+
+	/* Check locked page limits */
+	locked_pages = current->mm->locked_vm;
+	locked_pages += hpage_delta * BASE_PAGES_PER_HPAGE;
+	locked_pages_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
+	locked_pages_limit >>= PAGE_SHIFT;
+
+	/* Return 0 if we would exceed locked_vm limits */
+	if (locked_pages > locked_pages_limit)
+		return 0;
+
+	/* Nice, we're within limits */
+	return 1;
+}
+
+static struct page *alloc_buddy_huge_page(struct vm_area_struct *vma,
+						unsigned long address)
+{
+	struct page *page = NULL;
+
+	/* Check we remain within limits if 1 huge page is allocated */
+	if (!within_locked_vm_limits(1))
+		return NULL;
+
+	page = __alloc_fresh_huge_page();
+	if (page) {
+		INIT_LIST_HEAD(&page->lru);
+
+		/* We now have a surplus huge page, keep track of it */
+		spin_lock(&hugetlb_lock);
+		surplus_huge_pages_node[page_to_nid(page)]++;
+		spin_unlock(&hugetlb_lock);
+	}
+
+	return page;
+}
+
 static struct page *alloc_huge_page(struct vm_area_struct *vma,
 				    unsigned long addr)
 {
-	struct page *page;
+	struct page *page = NULL;
 
 	spin_lock(&hugetlb_lock);
 	if (vma->vm_flags & VM_MAYSHARE)
@@ -158,7 +221,16 @@ fail:
 	if (vma->vm_flags & VM_MAYSHARE)
 		resv_huge_pages++;
 	spin_unlock(&hugetlb_lock);
-	return NULL;
+
+	/*
+	 * Private mappings do not use reserved huge pages so the allocation
+	 * may have failed due to an undersized hugetlb pool.  Try to grab a
+	 * surplus huge page from the buddy allocator.
+	 */
+	if (!(vma->vm_flags & VM_MAYSHARE))
+		page = alloc_buddy_huge_page(vma, addr);
+
+	return page;
 }
 
 static int __init hugetlb_init(void)
