From: Dave Hansen <dave@linux.vnet.ibm.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Timur Tabi <timur@freescale.com>,
	Andi Kleen <andi@firstfloor.org>, Mel Gorman <mel@csn.ul.ie>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Nazarewicz <mina86@mina86.com>,
	David Rientjes <rientjes@google.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>
Subject: [PATCH 2/3] make new alloc_pages_exact()
Date: Thu, 14 Apr 2011 13:01:40 -0700
Message-ID: <20110414200140.CDE09A20@kernel>
In-Reply-To: <20110414200139.ABD98551@kernel>


What I really wanted in the end was a highmem-capable
alloc_pages_exact(), so here it is.  This function can be used to
allocate unmapped (like highmem) non-power-of-two-sized areas of
memory.  This is in contrast to get_free_pages_exact() which can only
allocate from lowmem.

My plan is to use this in the virtio_balloon driver to allocate large,
oddly-sized contiguous areas.

The new __alloc_pages_exact() takes a size in number of pages
and returns a 'struct page', which means it can address
highmem.  The (new) argument order mirrors alloc_pages() itself.
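
As a rough userspace illustration of the rounding involved (this is not
kernel code): the allocator grabs a power-of-two block that covers
nr_pages and then hands back the unused tail.  The helper names
order_for_pages() and trimmed_pages() are hypothetical, and the loop
only stands in for get_order(nr_pages * PAGE_SIZE).

```c
/* Hypothetical helper: smallest order such that (1 << order) >= nr_pages.
 * Stands in for get_order(nr_pages * PAGE_SIZE) in the patch. */
static unsigned int order_for_pages(unsigned long nr_pages)
{
	unsigned int order = 0;

	while ((1UL << order) < nr_pages)
		order++;
	return order;
}

/* Hypothetical helper: number of pages at the tail of the power-of-two
 * block that would be freed again after split_page(). */
static unsigned long trimmed_pages(unsigned long nr_pages)
{
	return (1UL << order_for_pages(nr_pages)) - nr_pages;
}
```

For example, a 5-page request would round up to order 3 (8 pages), and
the last 3 pages would be given back.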

It's a bit unfortunate that this introduces __free_pages_exact()
alongside free_pages_exact().  But that mess already exists with
__free_pages() vs. free_pages().  So, at worst, this mirrors the
mess that we already have.

I'm also a bit worried that I've now put in something named
alloc_pages_exact(), but that behaves differently than it did before
this set.  I got all of the in-tree cases, but I'm a bit worried about
stragglers elsewhere.  So, I'm calling this __alloc_pages_exact() for
the moment.  We can take out the __ some day if it bothers people.

Note that __get_free_pages() has a !__GFP_HIGHMEM check.  Now that
we are using __alloc_pages_exact() instead of __get_free_pages() for
get_free_pages_exact(), we had to add a new check in
get_free_pages_exact().
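
A minimal userspace sketch of the two steps the virtual-address wrapper
performs before it allocates: rounding the byte count up to whole pages,
and refusing masks that ask for highmem.  The SKETCH_* constants and
sketch_* function names are assumptions made for illustration; the real
values and the VM_BUG_ON() check live in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real constants come from kernel headers. */
#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_GFP_HIGHMEM 0x02u

/* Round a byte count up to whole pages, like PAGE_ALIGN(size) / PAGE_SIZE. */
static unsigned long sketch_bytes_to_pages(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* A wrapper that returns a virtual address cannot accept a highmem request,
 * because a highmem page may have no kernel mapping to return. */
static bool sketch_mask_ok_for_virt(unsigned int gfp_mask)
{
	return (gfp_mask & SKETCH_GFP_HIGHMEM) == 0;
}
```

In the patch itself the second step is a VM_BUG_ON() rather than a
boolean return, so a bad mask is treated as a caller bug.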

This has been compile and boot tested, and I checked that

	echo 2 > /sys/kernel/profiling

still works, since it uses get_free_pages_exact().

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
---

 linux-2.6.git-dave/include/linux/gfp.h |    4 +
 linux-2.6.git-dave/mm/page_alloc.c     |   85 ++++++++++++++++++++++++---------
 2 files changed, 68 insertions(+), 21 deletions(-)

diff -puN include/linux/gfp.h~make_new_alloc_pages_exact include/linux/gfp.h
--- linux-2.6.git/include/linux/gfp.h~make_new_alloc_pages_exact	2011-04-14 09:14:35.533886248 -0700
+++ linux-2.6.git-dave/include/linux/gfp.h	2011-04-14 09:14:35.573886242 -0700
@@ -352,6 +352,10 @@ extern struct page *alloc_pages_vma(gfp_
 extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
+/* 'struct page' version */
+struct page *__alloc_pages_exact(gfp_t gfp_mask, unsigned long nr_pages);
+void __free_pages_exact(struct page *page, unsigned long nr_pages);
+/* virtual address version */
 void *get_free_pages_exact(gfp_t gfp_mask, size_t size);
 void free_pages_exact(void *virt, size_t size);
 
diff -puN mm/page_alloc.c~make_new_alloc_pages_exact mm/page_alloc.c
--- linux-2.6.git/mm/page_alloc.c~make_new_alloc_pages_exact	2011-04-14 09:14:35.545886248 -0700
+++ linux-2.6.git-dave/mm/page_alloc.c	2011-04-14 09:24:48.793504357 -0700
@@ -2318,9 +2318,10 @@ void free_pages(unsigned long addr, unsi
 EXPORT_SYMBOL(free_pages);
 
 /**
- * get_free_pages_exact - allocate an exact number physically-contiguous pages.
- * @size: the number of bytes to allocate
+ * __alloc_pages_exact - allocate an exact number physically-contiguous pages.
+ * @nr_pages: the number of pages to allocate
  * @gfp_mask: GFP flags for the allocation
+ * returns: struct page for allocated memory
  *
  * This function is similar to alloc_pages(), except that it allocates the
  * minimum number of pages to satisfy the request.  alloc_pages() can only
@@ -2330,29 +2331,76 @@ EXPORT_SYMBOL(free_pages);
  *
  * Memory allocated by this function must be released by free_pages_exact().
  */
-void *get_free_pages_exact(gfp_t gfp_mask, size_t size)
+struct page *__alloc_pages_exact(gfp_t gfp_mask, unsigned long nr_pages)
 {
-	unsigned int order = get_order(size);
-	unsigned long addr;
+	unsigned int order = get_order(nr_pages * PAGE_SIZE);
+	struct page *page;
 
-	addr = __get_free_pages(gfp_mask, order);
-	if (addr) {
-		unsigned long alloc_end = addr + (PAGE_SIZE << order);
-		unsigned long used = addr + PAGE_ALIGN(size);
+	page = alloc_pages(gfp_mask, order);
+	if (page) {
+		struct page *alloc_end = page + (1 << order);
+		struct page *used = page + nr_pages;
 
-		split_page(virt_to_page((void *)addr), order);
+		split_page(page, order);
 		while (used < alloc_end) {
-			free_page(used);
-			used += PAGE_SIZE;
+			__free_page(used);
+			used++;
 		}
 	}
 
-	return (void *)addr;
+	return page;
+}
+EXPORT_SYMBOL(__alloc_pages_exact);
+
+/**
+ * __free_pages_exact - release memory allocated via __alloc_pages_exact()
+ * @page: the value returned by __alloc_pages_exact().
+ * @nr_pages: size in pages, same value as passed to __alloc_pages_exact().
+ *
+ * Release the memory allocated by a previous call to __alloc_pages_exact().
+ */
+void __free_pages_exact(struct page *page, unsigned long nr_pages)
+{
+	struct page *end = page + nr_pages;
+
+	while (page < end) {
+		__free_page(page);
+		page++;
+	}
+}
+EXPORT_SYMBOL(__free_pages_exact);
+
+/**
+ * get_free_pages_exact - allocate an exact number physically-contiguous pages.
+ * @gfp_mask: GFP flags for the allocation
+ * @size: the number of bytes to allocate
+ * returns: virtual address of allocated memory
+ *
+ * This function is similar to __get_free_pages(), except that it allocates the
+ * minimum number of pages to satisfy the request.  get_free_pages() can only
+ * allocate memory in power-of-two pages.
+ *
+ * This function is also limited by MAX_ORDER.
+ *
+ * Memory allocated by this function must be released by free_pages_exact().
+ */
+void *get_free_pages_exact(gfp_t gfp_mask, size_t size)
+{
+	struct page *page;
+	unsigned long nr_pages = PAGE_ALIGN(size) / PAGE_SIZE;
+
+	/* If we are using page_address(), we can not allow highmem */
+	VM_BUG_ON((gfp_mask & __GFP_HIGHMEM) != 0);
+
+	page = __alloc_pages_exact(gfp_mask, nr_pages);
+	if (page)
+		return (void *) page_address(page);
+	return NULL;
 }
 EXPORT_SYMBOL(get_free_pages_exact);
 
 /**
- * free_pages_exact - release memory allocated via get_free_pages_exact()
+ * __free_pages_exact - release memory allocated via get_free_pages_exact()
  * @virt: the value returned by get_free_pages_exact.
  * @size: size of allocation, same value as passed to get_free_pages_exact().
  *
@@ -2360,13 +2408,8 @@ EXPORT_SYMBOL(get_free_pages_exact);
  */
 void free_pages_exact(void *virt, size_t size)
 {
-	unsigned long addr = (unsigned long)virt;
-	unsigned long end = addr + PAGE_ALIGN(size);
-
-	while (addr < end) {
-		free_page(addr);
-		addr += PAGE_SIZE;
-	}
+	unsigned long nr_pages = PAGE_ALIGN(size) / PAGE_SIZE;
+	__free_pages_exact(virt_to_page(virt), nr_pages);
 }
 EXPORT_SYMBOL(free_pages_exact);
 
diff -puN mm/swap.c~make_new_alloc_pages_exact mm/swap.c
diff -puN tools/virtio/linux/virtio.h~make_new_alloc_pages_exact tools/virtio/linux/virtio.h
diff -puN include/linux/types.h~make_new_alloc_pages_exact include/linux/types.h
diff -puN Documentation/kbuild/kbuild.txt~make_new_alloc_pages_exact Documentation/kbuild/kbuild.txt
diff -puN Documentation/sparse.txt~make_new_alloc_pages_exact Documentation/sparse.txt
_
