From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: akpm@linux-foundation.org, paulus@samba.org,
benh@kernel.crashing.org, kirill.shutemov@linux.intel.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Subject: [RFC PATCH 3/3] mm/thp: Add new function to clear pmd on collapse
Date: Thu, 30 Apr 2015 13:55:41 +0530 [thread overview]
Message-ID: <1430382341-8316-4-git-send-email-aneesh.kumar@linux.vnet.ibm.com> (raw)
In-Reply-To: <1430382341-8316-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
Some arch may need an explicit IPI when clearing pmd
on collapse. Add new function which arch can override.
After this pmdp_clear_flush is used only for THP case
to invalidate a huge page pte.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/pgtable-ppc64.h | 4 ++
arch/powerpc/mm/pgtable_64.c | 77 ++++++++++++++++----------------
include/asm-generic/pgtable.h | 9 ++++
mm/huge_memory.c | 2 +-
4 files changed, 53 insertions(+), 39 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index ff275443a040..655dde8e9683 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -562,6 +562,10 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
extern void pmdp_splitting_flush_notify(struct vm_area_struct *vma,
unsigned long address, pmd_t *pmdp);
+#define __HAVE_ARCH_PMDP_COLLAPSE_FLUSH
+extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
+ unsigned long address, pmd_t *pmdp);
+
#define __HAVE_ARCH_PGTABLE_DEPOSIT
extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
pgtable_t pgtable);
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 89b356250be3..fa49e2ff042b 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -558,45 +558,9 @@ unsigned long pmd_hugepage_update(struct mm_struct *mm, unsigned long addr,
pmd_t pmdp_clear_flush(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp)
{
- pmd_t pmd;
-
VM_BUG_ON(address & ~HPAGE_PMD_MASK);
- if (pmd_trans_huge(*pmdp)) {
- pmd = pmdp_get_and_clear(vma->vm_mm, address, pmdp);
- } else {
- /*
- * khugepaged calls this for normal pmd
- */
- pmd = *pmdp;
- pmd_clear(pmdp);
- /*
- * Wait for all pending hash_page to finish. This is needed
- * in case of subpage collapse. When we collapse normal pages
- * to hugepage, we first clear the pmd, then invalidate all
- * the PTE entries. The assumption here is that any low level
- * page fault will see a none pmd and take the slow path that
- * will wait on mmap_sem. But we could very well be in a
- * hash_page with local ptep pointer value. Such a hash page
- * can result in adding new HPTE entries for normal subpages.
- * That means we could be modifying the page content as we
- * copy them to a huge page. So wait for parallel hash_page
- * to finish before invalidating HPTE entries. We can do this
- * by sending an IPI to all the cpus and executing a dummy
- * function there.
- */
- kick_all_cpus_sync();
- /*
- * Now invalidate the hpte entries in the range
- * covered by pmd. This make sure we take a
- * fault and will find the pmd as none, which will
- * result in a major fault which takes mmap_sem and
- * hence wait for collapse to complete. Without this
- * the __collapse_huge_page_copy can result in copying
- * the old content.
- */
- flush_tlb_pmd_range(vma->vm_mm, &pmd, address);
- }
- return pmd;
+ VM_BUG_ON(!pmd_trans_huge(*pmdp));
+ return pmdp_get_and_clear(vma->vm_mm, address, pmdp);
}
int pmdp_test_and_clear_young(struct vm_area_struct *vma,
@@ -641,6 +605,43 @@ void pmdp_splitting_flush_notify(struct vm_area_struct *vma,
kick_all_cpus_sync();
}
+pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address,
+ pmd_t *pmdp)
+{
+ pmd_t pmd;
+
+ VM_BUG_ON(address & ~HPAGE_PMD_MASK);
+ pmd = *pmdp;
+ pmd_clear(pmdp);
+ /*
+ * Wait for all pending hash_page to finish. This is needed
+ * in case of subpage collapse. When we collapse normal pages
+ * to hugepage, we first clear the pmd, then invalidate all
+ * the PTE entries. The assumption here is that any low level
+ * page fault will see a none pmd and take the slow path that
+ * will wait on mmap_sem. But we could very well be in a
+ * hash_page with local ptep pointer value. Such a hash page
+ * can result in adding new HPTE entries for normal subpages.
+ * That means we could be modifying the page content as we
+ * copy them to a huge page. So wait for parallel hash_page
+ * to finish before invalidating HPTE entries. We can do this
+ * by sending an IPI to all the cpus and executing a dummy
+ * function there.
+ */
+ kick_all_cpus_sync();
+ /*
+ * Now invalidate the hpte entries in the range
+ * covered by pmd. This make sure we take a
+ * fault and will find the pmd as none, which will
+ * result in a major fault which takes mmap_sem and
+ * hence wait for collapse to complete. Without this
+ * the __collapse_huge_page_copy can result in copying
+ * the old content.
+ */
+ flush_tlb_pmd_range(vma->vm_mm, &pmd, address);
+ return pmd;
+}
+
/*
* We want to put the pgtable in pmd and use pgtable for tracking
* the base page size hptes
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index d091a666f5b1..2e1e4653ae7c 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -189,6 +189,15 @@ extern void pmdp_splitting_flush_notify(struct vm_area_struct *vma,
unsigned long address, pmd_t *pmdp);
#endif
+#ifndef __HAVE_ARCH_PMDP_COLLAPSE_FLUSH
+static inline pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
+ unsigned long address,
+ pmd_t *pmdp)
+{
+ return pmdp_clear_flush(vma, address, pmdp);
+}
+#endif
+
#ifndef __HAVE_ARCH_PGTABLE_DEPOSIT
extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
pgtable_t pgtable);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 81e9578bf43a..30c1b46fcf6d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2187,7 +2187,7 @@ static void collapse_huge_page(struct mm_struct *mm,
* huge and small TLB entries for the same virtual address
* to avoid the risk of CPU bugs in that area.
*/
- _pmd = pmdp_clear_flush(vma, address, pmd);
+ _pmd = pmdp_collapse_flush(vma, address, pmd);
spin_unlock(pmd_ptl);
mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
--
2.1.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-04-30 8:25 UTC|newest]
Thread overview: 100+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-23 21:03 [PATCHv5 00/28] THP refcounting redesign Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 01/28] mm, proc: adjust PSS calculation Kirill A. Shutemov
2015-04-29 15:49 ` Jerome Marchand
2015-05-14 14:12 ` Vlastimil Babka
2015-05-15 10:56 ` Kirill A. Shutemov
2015-05-15 11:33 ` Vlastimil Babka
2015-05-15 11:43 ` Kirill A. Shutemov
2015-05-15 12:37 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 02/28] rmap: add argument to charge compound page Kirill A. Shutemov
2015-04-29 15:53 ` Jerome Marchand
2015-04-30 11:52 ` Kirill A. Shutemov
2015-05-14 16:07 ` Vlastimil Babka
2015-05-15 11:14 ` Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 03/28] memcg: adjust to support new THP refcounting Kirill A. Shutemov
2015-05-15 7:44 ` Vlastimil Babka
2015-05-15 11:18 ` Kirill A. Shutemov
2015-05-15 14:57 ` Dave Hansen
2015-05-16 23:17 ` Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault Kirill A. Shutemov
2015-04-29 15:54 ` Jerome Marchand
2015-05-15 9:15 ` Vlastimil Babka
2015-05-15 11:21 ` Kirill A. Shutemov
2015-05-15 11:35 ` Vlastimil Babka
2015-05-15 13:29 ` Kirill A. Shutemov
2015-05-19 13:00 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 05/28] mm: adjust FOLL_SPLIT for new refcounting Kirill A. Shutemov
2015-05-15 11:05 ` Vlastimil Babka
2015-05-15 11:36 ` Kirill A. Shutemov
2015-05-15 12:01 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 06/28] mm: handle PTE-mapped tail pages in gerneric fast gup implementaiton Kirill A. Shutemov
2015-04-29 15:56 ` Jerome Marchand
2015-05-15 12:46 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 07/28] thp, mlock: do not allow huge pages in mlocked area Kirill A. Shutemov
2015-04-29 15:58 ` Jerome Marchand
2015-05-15 12:56 ` Vlastimil Babka
2015-05-15 13:41 ` Kirill A. Shutemov
2015-05-19 14:37 ` Vlastimil Babka
2015-05-20 12:10 ` Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 08/28] khugepaged: ignore pmd tables with THP mapped with ptes Kirill A. Shutemov
2015-04-29 15:59 ` Jerome Marchand
2015-05-15 12:59 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 09/28] thp: rename split_huge_page_pmd() to split_huge_pmd() Kirill A. Shutemov
2015-04-29 16:00 ` Jerome Marchand
2015-05-15 13:08 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 10/28] mm, vmstats: new THP splitting event Kirill A. Shutemov
2015-04-29 16:02 ` Jerome Marchand
2015-05-15 13:10 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 11/28] mm: temporally mark THP broken Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 12/28] thp: drop all split_huge_page()-related code Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 13/28] mm: drop tail page refcounting Kirill A. Shutemov
2015-05-18 9:48 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 14/28] futex, thp: remove special case for THP in get_futex_key Kirill A. Shutemov
2015-05-18 11:49 ` Vlastimil Babka
2015-05-18 12:13 ` Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 15/28] ksm: prepare to new THP semantics Kirill A. Shutemov
2015-05-18 12:41 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 16/28] mm, thp: remove compound_lock Kirill A. Shutemov
2015-04-29 16:11 ` Jerome Marchand
2015-04-30 11:58 ` Kirill A. Shutemov
2015-05-18 12:57 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 17/28] mm, thp: remove infrastructure for handling splitting PMDs Kirill A. Shutemov
2015-04-29 16:14 ` Jerome Marchand
2015-04-30 12:03 ` Kirill A. Shutemov
2015-05-18 13:40 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 18/28] x86, " Kirill A. Shutemov
2015-04-29 9:13 ` Aneesh Kumar K.V
2015-04-23 21:03 ` [PATCHv5 19/28] mm: store mapcount for compound page separately Kirill A. Shutemov
2015-05-18 14:32 ` Vlastimil Babka
2015-05-19 3:55 ` Kirill A. Shutemov
2015-05-19 9:01 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 20/28] mm: differentiate page_mapped() from page_mapcount() for compound pages Kirill A. Shutemov
2015-04-29 16:20 ` Jerome Marchand
2015-04-30 12:06 ` Kirill A. Shutemov
2015-05-18 15:35 ` Vlastimil Babka
2015-05-19 4:00 ` Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 21/28] mm, numa: skip PTE-mapped THP on numa fault Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 22/28] thp: implement split_huge_pmd() Kirill A. Shutemov
2015-05-19 8:25 ` Vlastimil Babka
2015-05-20 14:38 ` Kirill A. Shutemov
2015-04-23 21:03 ` [PATCHv5 23/28] thp: add option to setup migration entiries during PMD split Kirill A. Shutemov
2015-05-19 13:55 ` Vlastimil Babka
2015-04-23 21:03 ` [PATCHv5 24/28] thp, mm: split_huge_page(): caller need to lock page Kirill A. Shutemov
2015-05-19 13:55 ` Vlastimil Babka
2015-04-23 21:04 ` [PATCHv5 25/28] thp: reintroduce split_huge_page() Kirill A. Shutemov
2015-05-19 12:43 ` Vlastimil Babka
2015-04-23 21:04 ` [PATCHv5 26/28] thp: introduce deferred_split_huge_page() Kirill A. Shutemov
2015-05-19 13:54 ` Vlastimil Babka
2015-04-23 21:04 ` [PATCHv5 27/28] mm: re-enable THP Kirill A. Shutemov
2015-04-23 21:04 ` [PATCHv5 28/28] thp: update documentation Kirill A. Shutemov
2015-04-27 23:03 ` [PATCHv5 00/28] THP refcounting redesign Andrew Morton
2015-04-27 23:33 ` Kirill A. Shutemov
2015-04-30 8:25 ` [RFC PATCH 0/3] Remove _PAGE_SPLITTING from ppc64 Aneesh Kumar K.V
2015-04-30 8:25 ` [RFC PATCH 1/3] mm/thp: Use pmdp_splitting_flush_notify to clear pmd on splitting Aneesh Kumar K.V
2015-04-30 13:30 ` Kirill A. Shutemov
2015-04-30 15:59 ` Aneesh Kumar K.V
2015-04-30 16:47 ` Aneesh Kumar K.V
2015-04-30 8:25 ` [RFC PATCH 2/3] powerpc/thp: Remove _PAGE_SPLITTING and related code Aneesh Kumar K.V
2015-04-30 8:25 ` Aneesh Kumar K.V [this message]
2015-05-15 8:55 ` [PATCHv5 00/28] THP refcounting redesign Vlastimil Babka
2015-05-15 13:31 ` Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1430382341-8316-4-git-send-email-aneesh.kumar@linux.vnet.ibm.com \
--to=aneesh.kumar@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=benh@kernel.crashing.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).