From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm <linux-mm@kvack.org>
Subject: PATCH Migration: find correct vma in new_vma_page()
Date: Thu, 01 Nov 2007 17:48:44 -0400
Message-ID: <1193953725.5300.108.camel@localhost>
PATCH Migration: find correct vma in new_vma_page()
Against: 2.6.23-mm1
We hit the BUG_ON() in mm/rmap.c:vma_address() when trying to migrate,
via mbind(MPOL_MF_MOVE), a non-anon region that spans multiple vmas.
For anon regions, we just fail to migrate any pages beyond the first
vma in the range.
This occurs because do_mbind() collects a list of pages to migrate
by calling check_range(). check_range() walks the task's mm, spanning
vmas as necessary, to collect the migratable pages into a list. Then,
do_mbind() calls migrate_pages() passing the list of pages, a function
to allocate new pages based on vma policy [new_vma_page()], and a
pointer to the first vma of the range.
For each page in the list, new_vma_page() calls page_address_in_vma()
passing the page and the vma [first in range] to obtain the address
to pass to alloc_page_vma(). The page address is needed to get
interleaving policy correct. If the pages in the list come from
multiple vmas, eventually, new_vma_page() will pass a page
to page_address_in_vma() with the incorrect vma. For !PageAnon
pages, this will result in a bug check in rmap.c:vma_address(). For
anon pages, vma_address() will just return -EFAULT and fail the
migration.
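To illustrate why the wrong vma trips the range check, here is a minimal
userspace sketch of the arithmetic in vma_address(). It is not kernel code:
struct vma_model, model_vma_address(), MODEL_PAGE_SHIFT and MODEL_EFAULT are
hypothetical stand-ins, the page is reduced to its file offset in pages, and
4 KiB pages are assumed.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define MODEL_EFAULT ((unsigned long)-EFAULT)    /* error value in address form */

/* Hypothetical, simplified stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned long vm_start;      /* first address mapped by this vma */
	unsigned long vm_end;        /* first address past the mapping */
	unsigned long vm_pgoff;      /* offset of vm_start, in pages */
	struct vma_model *vm_next;   /* next vma in address order */
};

/*
 * Where would a page at offset @pgoff sit inside @vma?
 * Returns MODEL_EFAULT when the offset falls outside the vma's range,
 * which is what happens when the caller hands in the wrong vma.
 */
static unsigned long
model_vma_address(unsigned long pgoff, const struct vma_model *vma)
{
	unsigned long address;

	/* unsigned wrap-around is fine here; the range check catches it */
	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
	if (address < vma->vm_start || address >= vma->vm_end)
		return MODEL_EFAULT;
	return address;
}
```

With two adjacent four-page vmas, a page at offset 5 resolves only in the
second one. Asking the first vma about it yields MODEL_EFAULT, the case that
used to hit the BUG_ON() for !PageAnon pages.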
This patch modifies new_vma_page() to check the return value from
page_address_in_vma(). If the return value is -EFAULT, new_vma_page()
searches forward via vm_next for the vma that maps the page--i.e.,
that does not return -EFAULT. This assumes that the pages in the list
handed to migrate_pages() are in address order. This is currently
the case. The patch documents this assumption in a new comment block
for new_vma_page().
If new_vma_page() cannot locate the vma mapping the page in a forward
search in the mm, it will pass a NULL vma to alloc_page_vma(). This
will result in the allocation using the task policy, if any, else
system default policy. This situation is unlikely, but the patch
documents this behavior with a comment.
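The forward search and the NULL fallback described above can be modeled in
userspace roughly as follows. This is only a sketch under simplifying
assumptions: struct sk_vma, sk_address_in_vma() and sk_find_vma() are
hypothetical names, the page is represented by its offset alone, and the
anon_vma checks done by the real page_address_in_vma() are left out.

```c
#include <errno.h>
#include <stddef.h>

#define SK_PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define SK_EFAULT ((unsigned long)-EFAULT)

/* Hypothetical, simplified vma: address range, page offset, next link. */
struct sk_vma {
	unsigned long start, end, pgoff;
	struct sk_vma *next;
};

/* Address of the page at @pgoff within @vma, or SK_EFAULT if outside. */
static unsigned long
sk_address_in_vma(unsigned long pgoff, const struct sk_vma *vma)
{
	unsigned long address;

	address = vma->start + ((pgoff - vma->pgoff) << SK_PAGE_SHIFT);
	if (address < vma->start || address >= vma->end)
		return SK_EFAULT;
	return address;
}

/*
 * Walk forward from @vma until some vma maps the page.
 * Returns that vma and stores the address in @address, or returns NULL
 * when the end of the list is reached; the caller would then fall back
 * to task or system default policy.
 */
static const struct sk_vma *
sk_find_vma(unsigned long pgoff, const struct sk_vma *vma,
	    unsigned long *address)
{
	while (vma) {
		*address = sk_address_in_vma(pgoff, vma);
		if (*address != SK_EFAULT)
			break;
		vma = vma->next;
	}
	return vma;
}
```

Each call starts again from the vma it is given, which mirrors the restart
from the first vma of the range on every new_vma_page() call.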
Note, this patch results in restarting from the first vma in a
multi-vma range each time new_vma_page() is called. If this is not
acceptable, we can make the vma argument a pointer, both in new_vma_page()
and its caller unmap_and_move(), so that the loop in migrate_pages()
always passes down the last vma in which a page was found. This will
require changes to all new_page_t functions passed to migrate_pages().
Is this necessary?
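For discussion, the alternative could look something like the userspace
sketch below, where the caller owns a cursor that is advanced to the last
vma in which a page was found. This is not part of the patch: struct
cur_vma, cur_address_in_vma() and cur_lookup() are hypothetical names, and
the real change would have to go through the new_page_t signature.

```c
#include <errno.h>
#include <stddef.h>

#define CUR_PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define CUR_EFAULT ((unsigned long)-EFAULT)

/* Hypothetical, simplified vma: address range, page offset, next link. */
struct cur_vma {
	unsigned long start, end, pgoff;
	struct cur_vma *next;
};

/* Address of the page at @pgoff within @vma, or CUR_EFAULT if outside. */
static unsigned long
cur_address_in_vma(unsigned long pgoff, const struct cur_vma *vma)
{
	unsigned long address;

	address = vma->start + ((pgoff - vma->pgoff) << CUR_PAGE_SHIFT);
	if (address < vma->start || address >= vma->end)
		return CUR_EFAULT;
	return address;
}

/*
 * Search forward from *@cursor.  On success, advance *@cursor to the vma
 * that maps the page so the next lookup starts there.  On failure, leave
 * the cursor alone and return CUR_EFAULT.
 */
static unsigned long
cur_lookup(unsigned long pgoff, const struct cur_vma **cursor)
{
	const struct cur_vma *vma;
	unsigned long address = CUR_EFAULT;

	for (vma = *cursor; vma; vma = vma->next) {
		address = cur_address_in_vma(pgoff, vma);
		if (address != CUR_EFAULT) {
			*cursor = vma;
			break;
		}
	}
	return address;
}
```

The cursor only ever moves forward, so this variant leans even harder on the
page list being in address order: a page belonging to a vma behind the
cursor would no longer be found.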
For this patch to work, we can't bug check in vma_address() for pages
outside the argument vma. This patch removes the BUG_ON(). All other
callers [besides new_vma_page()] already check the return status.
Tested on x86_64, 4 node NUMA platform.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/mempolicy.c | 21 +++++++++++++++++++--
mm/rmap.c | 7 ++++---
2 files changed, 23 insertions(+), 5 deletions(-)
Index: Linux/mm/mempolicy.c
===================================================================
--- Linux.orig/mm/mempolicy.c 2007-11-01 17:34:10.000000000 -0400
+++ Linux/mm/mempolicy.c 2007-11-01 17:36:23.000000000 -0400
@@ -722,12 +722,29 @@ out:
}
+/*
+ * Allocate a new page for page migration based on vma policy.
+ * Start assuming that page is mapped by vma pointed to by @private.
+ * Search forward from there, if not. N.B., this assumes that the
+ * list of pages handed to migrate_pages()--which is how we get here--
+ * is in virtual address order.
+ */
static struct page *new_vma_page(struct page *page, unsigned long private, int **x)
{
struct vm_area_struct *vma = (struct vm_area_struct *)private;
+ unsigned long uninitialized_var(address);
- return alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
- page_address_in_vma(page, vma));
+ while (vma) {
+ address = page_address_in_vma(page, vma);
+ if (address != -EFAULT)
+ break;
+ vma = vma->vm_next;
+ }
+
+ /*
+ * if !vma, alloc_page_vma() will use task or system default policy
+ */
+ return alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
}
#else
Index: Linux/mm/rmap.c
===================================================================
--- Linux.orig/mm/rmap.c 2007-11-01 17:34:10.000000000 -0400
+++ Linux/mm/rmap.c 2007-11-01 17:34:43.000000000 -0400
@@ -184,7 +184,9 @@ static void page_unlock_anon_vma(struct
}
/*
- * At what user virtual address is page expected in vma?
+ * At what user virtual address is page expected in @vma?
+ * Returns virtual address or -EFAULT if page's index/offset is not
+ * within the range mapped by @vma.
*/
static inline unsigned long
vma_address(struct page *page, struct vm_area_struct *vma)
@@ -194,8 +196,7 @@ vma_address(struct page *page, struct vm
address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
- /* page should be within any vma from prio_tree_next */
- BUG_ON(!PageAnon(page));
+ /* page should be within @vma mapping range */
return -EFAULT;
}
return address;