From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx147.postini.com [74.125.245.147])
	by kanga.kvack.org (Postfix) with SMTP id CB2CE6B005D
	for ; Wed, 7 Nov 2012 10:00:14 -0500 (EST)
From: "Kirill A. Shutemov"
Subject: [PATCH v5 07/11] thp: implement splitting pmd for huge zero page
Date: Wed, 7 Nov 2012 17:00:59 +0200
Message-Id: <1352300463-12627-8-git-send-email-kirill.shutemov@linux.intel.com>
In-Reply-To: <1352300463-12627-1-git-send-email-kirill.shutemov@linux.intel.com>
References: <1352300463-12627-1-git-send-email-kirill.shutemov@linux.intel.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton, Andrea Arcangeli, linux-mm@kvack.org
Cc: Andi Kleen, "H. Peter Anvin", linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov", "Kirill A. Shutemov"

From: "Kirill A. Shutemov"

We can't split the huge zero page itself (and it's a bug if we try), but
we can split the pmd which points to it.

On splitting the pmd we create a table with all ptes set to the normal
zero page.

Signed-off-by: Kirill A. Shutemov
---
 mm/huge_memory.c |   47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 90e651c..f36bc7d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1611,6 +1611,7 @@ int split_huge_page(struct page *page)
 	struct anon_vma *anon_vma;
 	int ret = 1;
 
+	BUG_ON(is_huge_zero_pfn(page_to_pfn(page)));
 	BUG_ON(!PageAnon(page));
 	anon_vma = page_lock_anon_vma(page);
 	if (!anon_vma)
@@ -2509,23 +2510,63 @@ static int khugepaged(void *none)
 	return 0;
 }
 
+static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
+		unsigned long haddr, pmd_t *pmd)
+{
+	pgtable_t pgtable;
+	pmd_t _pmd;
+	int i;
+
+	pmdp_clear_flush(vma, haddr, pmd);
+	/* leave pmd empty until pte is filled */
+
+	pgtable = get_pmd_huge_pte(vma->vm_mm);
+	pmd_populate(vma->vm_mm, &_pmd, pgtable);
+
+	for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
+		pte_t *pte, entry;
+		entry = pfn_pte(my_zero_pfn(haddr), vma->vm_page_prot);
+		entry = pte_mkspecial(entry);
+		pte = pte_offset_map(&_pmd, haddr);
+		VM_BUG_ON(!pte_none(*pte));
+		set_pte_at(vma->vm_mm, haddr, pte, entry);
+		pte_unmap(pte);
+	}
+	smp_wmb(); /* make pte visible before pmd */
+	pmd_populate(vma->vm_mm, pmd, pgtable);
+}
+
 void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
 		pmd_t *pmd)
 {
 	struct page *page;
+	struct mm_struct *mm = vma->vm_mm;
 	unsigned long haddr = address & HPAGE_PMD_MASK;
+	unsigned long mmun_start;	/* For mmu_notifiers */
+	unsigned long mmun_end;		/* For mmu_notifiers */
 
 	BUG_ON(vma->vm_start > haddr || vma->vm_end < haddr + HPAGE_PMD_SIZE);
 
-	spin_lock(&vma->vm_mm->page_table_lock);
+	mmun_start = haddr;
+	mmun_end = haddr + HPAGE_PMD_SIZE;
+	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	spin_lock(&mm->page_table_lock);
 	if (unlikely(!pmd_trans_huge(*pmd))) {
-		spin_unlock(&vma->vm_mm->page_table_lock);
+		spin_unlock(&mm->page_table_lock);
+		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+		return;
+	}
+	if (is_huge_zero_pmd(*pmd)) {
+		__split_huge_zero_page_pmd(vma, haddr, pmd);
+		spin_unlock(&mm->page_table_lock);
+		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
 		return;
 	}
 	page = pmd_page(*pmd);
 	VM_BUG_ON(!page_count(page));
 	get_page(page);
-	spin_unlock(&vma->vm_mm->page_table_lock);
+	spin_unlock(&mm->page_table_lock);
+	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
 
 	split_huge_page(page);
-- 
1.7.7.6
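
For readers who want to experiment with the core idea outside the kernel,
here is a minimal userspace sketch of what __split_huge_zero_page_pmd()
does conceptually: drop the huge entry, fill a freshly allocated table with
identical zero-page entries, and only publish the table after a write
barrier. Every name in it (struct fake_pmd, struct fake_pte, ZERO_PFN,
split_huge_zero_pmd) is a hypothetical stand-in, not a kernel API; it
builds with a plain C compiler.

/*
 * Userspace model only -- all names are hypothetical stand-ins, not
 * kernel APIs. It mimics the publish order used by the patch: fill the
 * pte table, issue a barrier, then install the table pointer.
 */
#include <stdio.h>
#include <stdlib.h>

#define HPAGE_PMD_NR 512	/* 2M huge page / 4K base pages */

struct fake_pte { unsigned long pfn; };

struct fake_pmd {
	int is_huge_zero;	/* models a pmd mapping the huge zero page */
	struct fake_pte *table;	/* models a pmd pointing to a pte table */
};

static const unsigned long ZERO_PFN = 0x1234;	/* stand-in for my_zero_pfn() */

static void split_huge_zero_pmd(struct fake_pmd *pmd)
{
	struct fake_pte *tbl;
	int i;

	tbl = calloc(HPAGE_PMD_NR, sizeof(*tbl));
	if (!tbl)
		abort();

	pmd->is_huge_zero = 0;		/* models pmdp_clear_flush(): huge entry gone */

	for (i = 0; i < HPAGE_PMD_NR; i++)
		tbl[i].pfn = ZERO_PFN;	/* every pte -> the normal zero page */

	__sync_synchronize();		/* models smp_wmb(): ptes visible before pmd */
	pmd->table = tbl;		/* models pmd_populate(): table goes live */
}

int main(void)
{
	struct fake_pmd pmd = { 1, NULL };

	split_huge_zero_pmd(&pmd);
	printf("first pte pfn: %#lx, last pte pfn: %#lx\n",
	       pmd.table[0].pfn, pmd.table[HPAGE_PMD_NR - 1].pfn);
	free(pmd.table);
	return 0;
}

The detail mirrored from the patch is the ordering: the pte table is fully
written and a barrier issued before the pmd is made to point at it, so a
lockless walker that observes the new pmd also observes initialised ptes.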