From: Ingo Molnar <mingo@kernel.org>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Paul Turner <pjt@google.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>,
Hugh Dickins <hughd@google.com>,
Lee Schermerhorn <lee.schermerhorn@hp.com>
Subject: [PATCH 13/19] mm/mpol: Check for misplaced page
Date: Fri, 16 Nov 2012 17:25:15 +0100 [thread overview]
Message-ID: <1353083121-4560-14-git-send-email-mingo@kernel.org> (raw)
In-Reply-To: <1353083121-4560-1-git-send-email-mingo@kernel.org>
From: Lee Schermerhorn <lee.schermerhorn@hp.com>
This patch provides a new function to test whether a page resides
on a node that is appropriate for the mempolicy for the vma and
address where the page is supposed to be mapped. This involves
looking up the node where the page belongs. So, the function
returns that node so that it may be used to allocated the page
without consulting the policy again.
A subsequent patch will call this function from the fault path.
Because of this, I don't want to go ahead and allocate the page, e.g.,
via alloc_page_vma() only to have to free it if it has the correct
policy. So, I just mimic the alloc_page_vma() node computation
logic--sort of.
Note: we could use this function to implement a MPOL_MF_STRICT
behavior when migrating pages to match mbind() mempolicy--e.g.,
to ensure that pages in an interleaved range are reinterleaved
rather than left where they are when they reside on any page in
the interleave nodemask.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
[ Added MPOL_F_LAZY to trigger migrate-on-fault;
simplified code now that we don't have to bother
with special crap for interleaved ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-z3mgep4tgrc08o07vl1ahb2m@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
include/linux/mempolicy.h | 8 +++++
include/uapi/linux/mempolicy.h | 1 +
mm/mempolicy.c | 76 ++++++++++++++++++++++++++++++++++++++++++
3 files changed, 85 insertions(+)
diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index e5ccb9d..c511e25 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -198,6 +198,8 @@ static inline int vma_migratable(struct vm_area_struct *vma)
return 1;
}
+extern int mpol_misplaced(struct page *, struct vm_area_struct *, unsigned long);
+
#else
struct mempolicy {};
@@ -323,5 +325,11 @@ static inline int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol,
return 0;
}
+static inline int mpol_misplaced(struct page *page, struct vm_area_struct *vma,
+ unsigned long address)
+{
+ return -1; /* no node preference */
+}
+
#endif /* CONFIG_NUMA */
#endif
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index d23dca8..472de8a 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -61,6 +61,7 @@ enum mpol_rebind_step {
#define MPOL_F_SHARED (1 << 0) /* identify shared policies */
#define MPOL_F_LOCAL (1 << 1) /* preferred local allocation */
#define MPOL_F_REBINDING (1 << 2) /* identify policies in rebinding */
+#define MPOL_F_MOF (1 << 3) /* this policy wants migrate on fault */
#endif /* _UAPI_LINUX_MEMPOLICY_H */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index c7c7c86..1b2890c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2179,6 +2179,82 @@ static void sp_free(struct sp_node *n)
kmem_cache_free(sn_cache, n);
}
+/**
+ * mpol_misplaced - check whether current page node is valid in policy
+ *
+ * @page - page to be checked
+ * @vma - vm area where page mapped
+ * @addr - virtual address where page mapped
+ *
+ * Lookup current policy node id for vma,addr and "compare to" page's
+ * node id.
+ *
+ * Returns:
+ * -1 - not misplaced, page is in the right node
+ * node - node id where the page should be
+ *
+ * Policy determination "mimics" alloc_page_vma().
+ * Called from fault path where we know the vma and faulting address.
+ */
+int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long addr)
+{
+ struct mempolicy *pol;
+ struct zone *zone;
+ int curnid = page_to_nid(page);
+ unsigned long pgoff;
+ int polnid = -1;
+ int ret = -1;
+
+ BUG_ON(!vma);
+
+ pol = get_vma_policy(current, vma, addr);
+ if (!(pol->flags & MPOL_F_MOF))
+ goto out;
+
+ switch (pol->mode) {
+ case MPOL_INTERLEAVE:
+ BUG_ON(addr >= vma->vm_end);
+ BUG_ON(addr < vma->vm_start);
+
+ pgoff = vma->vm_pgoff;
+ pgoff += (addr - vma->vm_start) >> PAGE_SHIFT;
+ polnid = offset_il_node(pol, vma, pgoff);
+ break;
+
+ case MPOL_PREFERRED:
+ if (pol->flags & MPOL_F_LOCAL)
+ polnid = numa_node_id();
+ else
+ polnid = pol->v.preferred_node;
+ break;
+
+ case MPOL_BIND:
+ /*
+ * allows binding to multiple nodes.
+ * use current page if in policy nodemask,
+ * else select nearest allowed node, if any.
+ * If no allowed nodes, use current [!misplaced].
+ */
+ if (node_isset(curnid, pol->v.nodes))
+ goto out;
+ (void)first_zones_zonelist(
+ node_zonelist(numa_node_id(), GFP_HIGHUSER),
+ gfp_zone(GFP_HIGHUSER),
+ &pol->v.nodes, &zone);
+ polnid = zone->node;
+ break;
+
+ default:
+ BUG();
+ }
+ if (curnid != polnid)
+ ret = polnid;
+out:
+ mpol_cond_put(pol);
+
+ return ret;
+}
+
static void sp_delete(struct shared_policy *sp, struct sp_node *n)
{
pr_debug("deleting %lx-l%lx\n", n->start, n->end);
--
1.7.11.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-11-16 16:26 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-16 16:25 [PATCH 00/19] latest numa/base patches Ingo Molnar
2012-11-16 16:25 ` [PATCH 01/19] mm/generic: Only flush the local TLB in ptep_set_access_flags() Ingo Molnar
2012-11-16 16:25 ` [PATCH 02/19] x86/mm: Only do a local tlb flush " Ingo Molnar
2012-11-16 16:25 ` [PATCH 03/19] sched, numa, mm: Make find_busiest_queue() a method Ingo Molnar
2012-11-16 16:25 ` [PATCH 04/19] sched, numa, mm: Describe the NUMA scheduling problem formally Ingo Molnar
2012-11-25 6:07 ` abhishek agarwal
2012-11-25 6:09 ` abhishek agarwal
2012-11-16 16:25 ` [PATCH 05/19] sched, numa, mm, s390/thp: Implement pmd_pgprot() for s390 Ingo Molnar
2012-11-16 16:25 ` [PATCH 06/19] mm/thp: Preserve pgprot across huge page split Ingo Molnar
2012-11-16 16:25 ` [PATCH 07/19] x86/mm: Introduce pte_accessible() Ingo Molnar
2012-11-16 16:25 ` [PATCH 08/19] mm: Only flush the TLB when clearing an accessible pte Ingo Molnar
2012-11-16 16:25 ` [PATCH 09/19] sched, numa, mm, MIPS/thp: Add pmd_pgprot() implementation Ingo Molnar
2012-11-16 16:25 ` [PATCH 10/19] mm/pgprot: Move the pgprot_modify() fallback definition to mm.h Ingo Molnar
2012-11-16 16:25 ` [PATCH 11/19] mm/mpol: Make MPOL_LOCAL a real policy Ingo Molnar
2012-11-16 16:25 ` [PATCH 12/19] mm/mpol: Add MPOL_MF_NOOP Ingo Molnar
2012-11-16 16:25 ` Ingo Molnar [this message]
2012-11-16 16:25 ` [PATCH 14/19] mm/mpol: Create special PROT_NONE infrastructure Ingo Molnar
2012-11-16 16:25 ` [PATCH 15/19] mm/mpol: Add MPOL_MF_LAZY Ingo Molnar
2012-11-16 16:25 ` [PATCH 16/19] numa, mm: Support NUMA hinting page faults from gup/gup_fast Ingo Molnar
2012-11-16 16:25 ` [PATCH 17/19] mm/migrate: Introduce migrate_misplaced_page() Ingo Molnar
2012-11-19 2:25 ` [PATCH 17/19, v2] " Ingo Molnar
2012-11-19 16:02 ` Rik van Riel
2012-11-16 16:25 ` [PATCH 18/19] mm/mpol: Use special PROT_NONE to migrate pages Ingo Molnar
2012-11-16 16:25 ` [PATCH 19/19] x86/mm: Completely drop the TLB flush from ptep_set_access_flags() Ingo Molnar
2012-11-17 8:35 ` [PATCH 00/19] latest numa/base patches Alex Shi
2012-11-17 8:40 ` Alex Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1353083121-4560-14-git-send-email-mingo@kernel.org \
--to=mingo@kernel.org \
--cc=Lee.Schermerhorn@hp.com \
--cc=a.p.zijlstra@chello.nl \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=pjt@google.com \
--cc=riel@redhat.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).