* [PATCH] mm/numa/thp: fix assumptions of migrate_misplaced_transhuge_page()
@ 2016-05-03 12:33 jglisse
2016-05-05 9:53 ` Mel Gorman
0 siblings, 1 reply; 2+ messages in thread
From: jglisse @ 2016-05-03 12:33 UTC (permalink / raw)
To: linux-mm
Cc: linux-kernel, Jérôme Glisse, Mel Gorman, Hugh Dickins,
Andrea Arcangeli
From: JA(C)rA'me Glisse <jglisse@redhat.com>
Fix assumptions in migrate_misplaced_transhuge_page() which is only
call by do_huge_pmd_numa_page() itself only call by __handle_mm_fault()
for pmd with PROT_NONE. This means that if the pmd stays the same
then there can be no concurrent get_user_pages / get_user_pages_fast
(GUP/GUP_fast). More over because migrate_misplaced_transhuge_page()
abort if page is mapped more than once then there can be no GUP from
a different process. Finaly, holding the pmd lock assure us that no
other part of the kernel can take an extra reference on the page.
In the end this means that the failure code path should never be
taken unless something is horribly wrong, so convert it to BUG_ON().
Signed-off-by: JA(C)rA'me Glisse <jglisse@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
mm/migrate.c | 38 +++++++++++++++++++++++---------------
1 file changed, 23 insertions(+), 15 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index f9dfb18..07148be 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1760,7 +1760,13 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
int page_lru = page_is_file_cache(page);
unsigned long mmun_start = address & HPAGE_PMD_MASK;
unsigned long mmun_end = mmun_start + HPAGE_PMD_SIZE;
- pmd_t orig_entry;
+
+ /*
+ * What we do here is only valid if pmd_protnone(entry) is true and thp
+ * page is map in only once, which numamigrate_isolate_page() checks.
+ */
+ if (!pmd_protnone(entry))
+ goto out_unlock;
/*
* Rate-limit the amount of data that is being migrated to a node.
@@ -1803,7 +1809,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
ptl = pmd_lock(mm, pmd);
if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
-fail_putback:
spin_unlock(ptl);
mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
@@ -1825,17 +1830,21 @@ fail_putback:
goto out_unlock;
}
- orig_entry = *pmd;
+ /*
+ * We are holding the lock so no one can set a new pmd and original pmd
+ * is PROT_NONE thus no one can get_user_pages or get_user_pages_fast
+ * (GUP or GUP_fast) from this point on we can not fail.
+ */
entry = mk_pmd(new_page, vma->vm_page_prot);
entry = pmd_mkhuge(entry);
entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
/*
* Clear the old entry under pagetable lock and establish the new PTE.
- * Any parallel GUP will either observe the old page blocking on the
- * page lock, block on the page table lock or observe the new page.
- * The SetPageUptodate on the new page and page_add_new_anon_rmap
- * guarantee the copy is visible before the pagetable update.
+ * Any parallel GUP can only observe the new page as old page only had
+ * one mapping with PROT_NONE (GUP/GUP_fast fails if pmd_protnone() is
+ * true). However a concurrent GUP might see the new page as soon as
+ * we set the pmd to the new entry.
*/
flush_cache_range(vma, mmun_start, mmun_end);
page_add_anon_rmap(new_page, vma, mmun_start, true);
@@ -1843,14 +1852,13 @@ fail_putback:
set_pmd_at(mm, mmun_start, pmd, entry);
update_mmu_cache_pmd(vma, address, &entry);
- if (page_count(page) != 2) {
- set_pmd_at(mm, mmun_start, pmd, orig_entry);
- flush_pmd_tlb_range(vma, mmun_start, mmun_end);
- mmu_notifier_invalidate_range(mm, mmun_start, mmun_end);
- update_mmu_cache_pmd(vma, address, &entry);
- page_remove_rmap(new_page, true);
- goto fail_putback;
- }
+ /* As said above no one can get reference on the old page nor through
+ * get_user_pages or get_user_pages_fast (GUP/GUP_fast) or through
+ * any other means. To get reference on huge page you need to hold
+ * pmd_lock and we are already holding that lock here and the page
+ * is only mapped once.
+ */
+ BUG_ON(page_count(page) != 2);
mlock_migrate_page(new_page, page);
page_remove_rmap(page, true);
--
2.1.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] mm/numa/thp: fix assumptions of migrate_misplaced_transhuge_page()
2016-05-03 12:33 [PATCH] mm/numa/thp: fix assumptions of migrate_misplaced_transhuge_page() jglisse
@ 2016-05-05 9:53 ` Mel Gorman
0 siblings, 0 replies; 2+ messages in thread
From: Mel Gorman @ 2016-05-05 9:53 UTC (permalink / raw)
To: jglisse; +Cc: linux-mm, linux-kernel, Hugh Dickins, Andrea Arcangeli
On Tue, May 03, 2016 at 02:33:51PM +0200, jglisse@redhat.com wrote:
> From: Jerome Glisse <jglisse@redhat.com>
>
> Fix assumptions in migrate_misplaced_transhuge_page() which is only
> call by do_huge_pmd_numa_page() itself only call by __handle_mm_fault()
> for pmd with PROT_NONE. This means that if the pmd stays the same
> then there can be no concurrent get_user_pages / get_user_pages_fast
> (GUP/GUP_fast). More over because migrate_misplaced_transhuge_page()
> abort if page is mapped more than once then there can be no GUP from
> a different process. Finaly, holding the pmd lock assure us that no
> other part of the kernel can take an extra reference on the page.
>
> In the end this means that the failure code path should never be
> taken unless something is horribly wrong, so convert it to BUG_ON().
>
> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2016-05-05 9:54 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-03 12:33 [PATCH] mm/numa/thp: fix assumptions of migrate_misplaced_transhuge_page() jglisse
2016-05-05 9:53 ` Mel Gorman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).