From: Andi Kleen <andi@firstfloor.org>
To: akpm@linux-foundation.org
Cc: aarcange@redhat.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Andi Kleen <ak@linux.intel.com>
Subject: [PATCH 5/8] Use correct numa policy node for transparent hugepages
Date: Wed,  2 Mar 2011 16:45:25 -0800
Message-ID: <1299113128-11349-6-git-send-email-andi@firstfloor.org>
In-Reply-To: <1299113128-11349-1-git-send-email-andi@firstfloor.org>

From: Andi Kleen <ak@linux.intel.com>

Pass down the correct node for a transparent hugepage allocation.
Most callers continue to use the current node. The khugepaged
daemon, however, now uses the node of the first page to be
collapsed instead. This ensures that khugepaged does not break
memory locality for an existing process which uses local policy.

The choice of node is currently somewhat primitive: it just
uses the node of the first page in the pmd range. An alternative
would be to look at multiple pages and use the most common
node. I used the simplest variant for now, which should work
well enough when all pages are on the same node.

Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/huge_memory.c |   24 +++++++++++++++++-------
 mm/mempolicy.c   |    3 ++-
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1802db8..8a7f94c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -650,10 +650,10 @@ static inline gfp_t alloc_hugepage_gfpmask(int defrag)
 
 static inline struct page *alloc_hugepage_vma(int defrag,
 					      struct vm_area_struct *vma,
-					      unsigned long haddr)
+					      unsigned long haddr, int nd)
 {
 	return alloc_pages_vma(alloc_hugepage_gfpmask(defrag),
-			       HPAGE_PMD_ORDER, vma, haddr, numa_node_id());
+			       HPAGE_PMD_ORDER, vma, haddr, nd);
 }
 
 #ifndef CONFIG_NUMA
@@ -678,7 +678,7 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		if (unlikely(khugepaged_enter(vma)))
 			return VM_FAULT_OOM;
 		page = alloc_hugepage_vma(transparent_hugepage_defrag(vma),
-					  vma, haddr);
+					  vma, haddr, numa_node_id());
 		if (unlikely(!page))
 			goto out;
 		if (unlikely(mem_cgroup_newpage_charge(page, mm, GFP_KERNEL))) {
@@ -902,7 +902,7 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (transparent_hugepage_enabled(vma) &&
 	    !transparent_hugepage_debug_cow())
 		new_page = alloc_hugepage_vma(transparent_hugepage_defrag(vma),
-					      vma, haddr);
+					      vma, haddr, numa_node_id());
 	else
 		new_page = NULL;
 
@@ -1745,7 +1745,8 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page,
 static void collapse_huge_page(struct mm_struct *mm,
 			       unsigned long address,
 			       struct page **hpage,
-			       struct vm_area_struct *vma)
+			       struct vm_area_struct *vma,
+			       int node)
 {
 	pgd_t *pgd;
 	pud_t *pud;
@@ -1773,7 +1774,8 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 * mmap_sem in read mode is good idea also to allow greater
 	 * scalability.
 	 */
-	new_page = alloc_hugepage_vma(khugepaged_defrag(), vma, address);
+	new_page = alloc_hugepage_vma(khugepaged_defrag(), vma, address,
+				      node);
 	if (unlikely(!new_page)) {
 		up_read(&mm->mmap_sem);
 		*hpage = ERR_PTR(-ENOMEM);
@@ -1919,6 +1921,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	struct page *page;
 	unsigned long _address;
 	spinlock_t *ptl;
+	int node = -1;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
@@ -1949,6 +1952,13 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		page = vm_normal_page(vma, _address, pteval);
 		if (unlikely(!page))
 			goto out_unmap;
+		/*
+		 * Choose the node of the first page. This could
+		 * be more sophisticated and look at more pages,
+		 * but isn't for now.
+		 */
+		if (node == -1)
+			node = page_to_nid(page);
 		VM_BUG_ON(PageCompound(page));
 		if (!PageLRU(page) || PageLocked(page) || !PageAnon(page))
 			goto out_unmap;
@@ -1965,7 +1975,7 @@ out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret)
 		/* collapse_huge_page will return with the mmap_sem released */
-		collapse_huge_page(mm, address, hpage, vma);
+		collapse_huge_page(mm, address, hpage, vma, node);
 out:
 	return ret;
 }
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 25a5a91..151c20c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1891,7 +1891,8 @@ struct page *alloc_pages_current(gfp_t gfp, unsigned order)
 		page = alloc_page_interleave(gfp, order, interleave_nodes(pol));
 	else
 		page = __alloc_pages_nodemask(gfp, order,
-			policy_zonelist(gfp, pol), policy_nodemask(gfp, pol));
+			policy_zonelist(gfp, pol, numa_node_id()),
+			policy_nodemask(gfp, pol));
 	put_mems_allowed();
 	return page;
 }
-- 
1.7.4
