* Re: [PATCH 3/4] thp: fix split vs. unmap race [not found] <052401d116e0$c3ac0110$4b040330$@alibaba-inc.com> @ 2015-11-04 9:20 ` Hillf Danton 2015-11-05 9:26 ` Kirill A. Shutemov 0 siblings, 1 reply; 3+ messages in thread From: Hillf Danton @ 2015-11-04 9:20 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Andrew Morton, Andrea Arcangeli, Hugh Dickins, Naoya Horiguchi, Sasha Levin, linux-kernel, linux-mm, 'Minchan Kim' > @@ -1135,20 +1135,12 @@ void do_page_add_anon_rmap(struct page *page, > bool compound = flags & RMAP_COMPOUND; > bool first; > > - if (PageTransCompound(page)) { > + if (compound) { > + atomic_t *mapcount; > VM_BUG_ON_PAGE(!PageLocked(page), page); > - if (compound) { > - atomic_t *mapcount; > - > - VM_BUG_ON_PAGE(!PageTransHuge(page), page); > - mapcount = compound_mapcount_ptr(page); > - first = atomic_inc_and_test(mapcount); > - } else { > - /* Anon THP always mapped first with PMD */ > - first = 0; > - VM_BUG_ON_PAGE(!page_mapcount(page), page); > - atomic_inc(&page->_mapcount); > - } > + VM_BUG_ON_PAGE(!PageTransHuge(page), page); > + mapcount = compound_mapcount_ptr(page); > + first = atomic_inc_and_test(mapcount); > } else { > VM_BUG_ON_PAGE(compound, page); Then this debug info is no longer needed. > first = atomic_inc_and_test(&page->_mapcount); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH 3/4] thp: fix split vs. unmap race 2015-11-04 9:20 ` [PATCH 3/4] thp: fix split vs. unmap race Hillf Danton @ 2015-11-05 9:26 ` Kirill A. Shutemov 0 siblings, 0 replies; 3+ messages in thread From: Kirill A. Shutemov @ 2015-11-05 9:26 UTC (permalink / raw) To: Hillf Danton Cc: Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli, Hugh Dickins, Naoya Horiguchi, Sasha Levin, linux-kernel, linux-mm, 'Minchan Kim' On Wed, Nov 04, 2015 at 05:20:15PM +0800, Hillf Danton wrote: > > @@ -1135,20 +1135,12 @@ void do_page_add_anon_rmap(struct page *page, > > bool compound = flags & RMAP_COMPOUND; > > bool first; > > > > - if (PageTransCompound(page)) { > > + if (compound) { > > + atomic_t *mapcount; > > VM_BUG_ON_PAGE(!PageLocked(page), page); > > - if (compound) { > > - atomic_t *mapcount; > > - > > - VM_BUG_ON_PAGE(!PageTransHuge(page), page); > > - mapcount = compound_mapcount_ptr(page); > > - first = atomic_inc_and_test(mapcount); > > - } else { > > - /* Anon THP always mapped first with PMD */ > > - first = 0; > > - VM_BUG_ON_PAGE(!page_mapcount(page), page); > > - atomic_inc(&page->_mapcount); > > - } > > + VM_BUG_ON_PAGE(!PageTransHuge(page), page); > > + mapcount = compound_mapcount_ptr(page); > > + first = atomic_inc_and_test(mapcount); > > } else { > > VM_BUG_ON_PAGE(compound, page); > > Then this debug info is no longer needed. > > first = atomic_inc_and_test(&page->_mapcount); Right. diff --git a/mm/rmap.c b/mm/rmap.c index 0837487d3737..a9550b1f74cd 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1186,7 +1186,6 @@ void do_page_add_anon_rmap(struct page *page, mapcount = compound_mapcount_ptr(page); first = atomic_inc_and_test(mapcount); } else { - VM_BUG_ON_PAGE(compound, page); first = atomic_inc_and_test(&page->_mapcount); } -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH 0/4] Bugfixes for THP refcounting @ 2015-11-03 15:26 Kirill A. Shutemov 2015-11-03 15:26 ` [PATCH 3/4] thp: fix split vs. unmap race Kirill A. Shutemov 0 siblings, 1 reply; 3+ messages in thread From: Kirill A. Shutemov @ 2015-11-03 15:26 UTC (permalink / raw) To: Andrew Morton Cc: Andrea Arcangeli, Hugh Dickins, Naoya Horiguchi, Sasha Levin, Minchan Kim, linux-kernel, linux-mm, Kirill A. Shutemov Hi, There's few bugfixes for THP refcounting patchset. It should address most reported bugs. I need to track down one more bug: rss-counter mismatch on exit. Kirill A. Shutemov (4): mm: do not crash on PageDoubleMap() for non-head pages mm: duplicate rmap reference for hugetlb pages as compound thp: fix split vs. unmap race mm: prepare page_referenced() and page_idle to new THP refcounting include/linux/huge_mm.h | 4 -- include/linux/mm.h | 19 +++++++ include/linux/page-flags.h | 3 +- mm/huge_memory.c | 76 ++++++------------------- mm/migrate.c | 2 +- mm/page_idle.c | 64 ++++++++++++++++++--- mm/rmap.c | 137 ++++++++++++++++++++++++++++----------------- 7 files changed, 180 insertions(+), 125 deletions(-) -- 2.6.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 3/4] thp: fix split vs. unmap race 2015-11-03 15:26 [PATCH 0/4] Bugfixes for THP refcounting Kirill A. Shutemov @ 2015-11-03 15:26 ` Kirill A. Shutemov 0 siblings, 0 replies; 3+ messages in thread From: Kirill A. Shutemov @ 2015-11-03 15:26 UTC (permalink / raw) To: Andrew Morton Cc: Andrea Arcangeli, Hugh Dickins, Naoya Horiguchi, Sasha Levin, Minchan Kim, linux-kernel, linux-mm, Kirill A. Shutemov To stabilize compound page during split we use migration entries. The code to implement this is buggy: I wrongly assumed that kernel would wait migration to finish, before zapping ptes. But turn out that's not true. As result if zap_pte_range() races with split_huge_page(), we can end up with page which is not mapped anymore but has _count and _mapcount elevated. The page is on LRU too. So it's still reachable by vmscan and by pfn scanners. It's likely that page->mapping in this case would point to freed anon_vma. BOOM! The patch modify freeze/unfreeze_page() code to match normal migration entries logic: on setup we remove page from rmap and drop pin, on removing we get pin back and put page on rmap. This way even if migration entry will be removed under us we don't corrupt page's state. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Minchan Kim <minchan@kernel.org> Reported-by: Sasha Levin <sasha.levin@oracle.com> --- mm/huge_memory.c | 22 ++++++++++++++++++---- mm/rmap.c | 19 +++++-------------- 2 files changed, 23 insertions(+), 18 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 5009f68786d0..3700981f8035 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2934,6 +2934,13 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, smp_wmb(); /* make pte visible before pmd */ pmd_populate(mm, pmd, pgtable); + + if (freeze) { + for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) { + page_remove_rmap(page + i, false); + put_page(page + i); + } + } } void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, @@ -3079,6 +3086,8 @@ static void freeze_page_vma(struct vm_area_struct *vma, struct page *page, if (pte_soft_dirty(entry)) swp_pte = pte_swp_mksoft_dirty(swp_pte); set_pte_at(vma->vm_mm, address, pte + i, swp_pte); + page_remove_rmap(page, false); + put_page(page); } pte_unmap_unlock(pte, ptl); } @@ -3117,8 +3126,6 @@ static void unfreeze_page_vma(struct vm_area_struct *vma, struct page *page, return; pte = pte_offset_map_lock(vma->vm_mm, pmd, address, &ptl); for (i = 0; i < HPAGE_PMD_NR; i++, address += PAGE_SIZE, page++) { - if (!page_mapped(page)) - continue; if (!is_swap_pte(pte[i])) continue; @@ -3128,6 +3135,9 @@ static void unfreeze_page_vma(struct vm_area_struct *vma, struct page *page, if (migration_entry_to_page(swp_entry) != page) continue; + get_page(page); + page_add_anon_rmap(page, vma, address, false); + entry = pte_mkold(mk_pte(page, vma->vm_page_prot)); entry = pte_mkdirty(entry); if (is_write_migration_entry(swp_entry)) @@ -3195,8 +3205,6 @@ static int __split_huge_page_tail(struct page *head, int tail, */ atomic_add(mapcount + 1, &page_tail->_count); - /* after clearing PageTail the gup refcount can be released */ - smp_mb__after_atomic(); page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; page_tail->flags |= (head->flags & @@ -3209,6 +3217,12 @@ static int __split_huge_page_tail(struct page *head, int tail, (1L << PG_unevictable))); page_tail->flags |= (1L << PG_dirty); + /* + * After clearing PageTail the gup refcount can be released. + * Page flags also must be visible before we make the page non-compound. + */ + smp_wmb(); + clear_compound_head(page_tail); if (page_is_young(head)) diff --git a/mm/rmap.c b/mm/rmap.c index 288622f5f34d..ad9af8b3a381 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1135,20 +1135,12 @@ void do_page_add_anon_rmap(struct page *page, bool compound = flags & RMAP_COMPOUND; bool first; - if (PageTransCompound(page)) { + if (compound) { + atomic_t *mapcount; VM_BUG_ON_PAGE(!PageLocked(page), page); - if (compound) { - atomic_t *mapcount; - - VM_BUG_ON_PAGE(!PageTransHuge(page), page); - mapcount = compound_mapcount_ptr(page); - first = atomic_inc_and_test(mapcount); - } else { - /* Anon THP always mapped first with PMD */ - first = 0; - VM_BUG_ON_PAGE(!page_mapcount(page), page); - atomic_inc(&page->_mapcount); - } + VM_BUG_ON_PAGE(!PageTransHuge(page), page); + mapcount = compound_mapcount_ptr(page); + first = atomic_inc_and_test(mapcount); } else { VM_BUG_ON_PAGE(compound, page); first = atomic_inc_and_test(&page->_mapcount); @@ -1163,7 +1155,6 @@ void do_page_add_anon_rmap(struct page *page, * disabled. */ if (compound) { - VM_BUG_ON_PAGE(!PageTransHuge(page), page); __inc_zone_page_state(page, NR_ANON_TRANSPARENT_HUGEPAGES); } -- 2.6.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2015-11-05 9:26 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <052401d116e0$c3ac0110$4b040330$@alibaba-inc.com>
2015-11-04 9:20 ` [PATCH 3/4] thp: fix split vs. unmap race Hillf Danton
2015-11-05 9:26 ` Kirill A. Shutemov
2015-11-03 15:26 [PATCH 0/4] Bugfixes for THP refcounting Kirill A. Shutemov
2015-11-03 15:26 ` [PATCH 3/4] thp: fix split vs. unmap race Kirill A. Shutemov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).