From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB5C7BE65
	for ; Fri, 4 Jul 2025 00:59:15 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1751590755; cv=none;
	b=eEgvC65lzDZRo1IF0qjMPoiKyTalmU5VQ2aiPz11A6mOFcgxdm8Z6FfJQ13+ZyGTKGe6sqLPnbv5PFZ+JFgy8oZA9UJ+yV7JtGcaC50pW5QpEZamaQJy969d4xjihJLYqOYdMRwGj6DKXlpPHcPYw/p6fPo6OEb5QYzyr/Fo7/Q=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1751590755; c=relaxed/simple;
	bh=VzMyw5b4GzBw85MRB8XQS2IuYu9a7XliHW9N7lhCaw4=;
	h=Date:To:From:Subject:Message-Id;
	b=GjC1IDDg0+sMbSp/JBkiOCngCbgmrN08BvzGiPaOnobYWrljXWLFuWm/xFyviP8KcZOrk/VqRAeq2f+hJsi4LTMDuEnDlz2T+iZnNGT4TvGa5DhYX0ggGWZTZ52oPvU8vaU5ilYjqLDfh2rm9V3c0x98EuLLCtalgyWCwnEnSok=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b=G3Ql0P2H;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b="G3Ql0P2H"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7320AC4CEE3;
	Fri, 4 Jul 2025 00:59:15 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org;
	s=korg; t=1751590755;
	bh=VzMyw5b4GzBw85MRB8XQS2IuYu9a7XliHW9N7lhCaw4=;
	h=Date:To:From:Subject:From;
	b=G3Ql0P2HBW6qIL2Ul9ZilisdoaIJtIClyLpm7EIxq/BmZBtnGTKFFzu/g+RTSMJWC
	 xYPWeAwlZG6qn64Bw6rfHgg/zw/Dby1p3TRt94BfGlR0rAfEecXywd2he/2sybvoM3
	 PtTI+Muv88l7I1XrwG4epgZR+JbQpZCSX/P7/6iM=
Date: Thu, 03 Jul 2025 17:59:14 -0700
To: mm-commits@vger.kernel.org,zokeefe@google.com,ziy@nvidia.com,willy@infradead.org,will@kernel.org,wangkefeng.wang@huawei.com,vishal.moola@gmail.com,usamaarif642@gmail.com,tiwai@suse.de,thomas.hellstrom@linux.intel.com,surenb@google.com,sunnanyong@huawei.com,ryan.roberts@arm.com,rostedt@goodmis.org,rientjes@google.com,rdunlap@infradead.org,raquini@redhat.com,peterx@redhat.com,mhocko@suse.com,mhiramat@kernel.org,mathieu.desnoyers@efficios.com,lorenzo.stoakes@oracle.com,liam.howlett@oracle.com,kirill.shutemov@linux.intel.com,jack@suse.cz,hannes@cmpxchg.org,dev.jain@arm.com,david@redhat.com,corbet@lwn.net,cl@gentwo.org,catalin.marinas@arm.com,baolin.wang@linux.alibaba.com,baohua@kernel.org,bagasdotme@gmail.com,anshuman.khandual@arm.com,aarcange@redhat.com,npache@redhat.com,akpm@linux-foundation.org
From: Andrew Morton
Subject: + khugepaged-generalize-__collapse_huge_page_-for-mthp-support.patch added to mm-new branch
Message-Id: <20250704005915.7320AC4CEE3@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:


The patch titled
     Subject: khugepaged: generalize __collapse_huge_page_* for mTHP support
has been added to the -mm mm-new branch.  Its filename is
     khugepaged-generalize-__collapse_huge_page_-for-mthp-support.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/khugepaged-generalize-__collapse_huge_page_-for-mthp-support.patch

This patch will later appear in the mm-new branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews.  Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Nico Pache
Subject: khugepaged: generalize __collapse_huge_page_* for mTHP support
Date: Tue, 1 Jul 2025 23:57:32 -0600

Generalize the order of the __collapse_huge_page_* functions to support
future mTHP collapse.

mTHP collapse can suffer from inconsistent behavior and memory waste
"creep".  Disable swapin and shared support for mTHP collapse.

No functional changes in this patch: every existing caller passes
HPAGE_PMD_ORDER.

Link: https://lkml.kernel.org/r/20250702055742.102808-6-npache@redhat.com
Signed-off-by: Dev Jain
Signed-off-by: Nico Pache
Reviewed-by: Baolin Wang
Co-developed-by: Dev Jain
Cc: Andrea Arcangeli
Cc: Anshuman Khandual
Cc: Bagas Sanjaya
Cc: Barry Song
Cc: Catalin Marinas
Cc: Christoph Lameter (Ampere)
Cc: David Hildenbrand
Cc: David Rientjes
Cc: Jan Kara
Cc: Johannes Weiner
Cc: Jonathan Corbet
Cc: Kefeng Wang
Cc: Kirill A. Shutemov
Cc: Liam Howlett
Cc: Lorenzo Stoakes
Cc: "Masami Hiramatsu (Google)"
Cc: Mathieu Desnoyers
Cc: Matthew Wilcox (Oracle)
Cc: Michal Hocko
Cc: Nanyong Sun
Cc: Peter Xu
Cc: Rafael Aquini
Cc: Randy Dunlap
Cc: Reported-by:Takashi Iwai
Cc: Ryan Roberts
Cc: Steven Rostedt
Cc: Suren Baghdasaryan
Cc: Thomas Hellstrom
Cc: Usama Arif
Cc: Vishal Moola (Oracle)
Cc: Will Deacon
Cc: Zach O'Keefe
Cc: Zi Yan
Signed-off-by: Andrew Morton
---

 mm/khugepaged.c |   48 ++++++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 18 deletions(-)

--- a/mm/khugepaged.c~khugepaged-generalize-__collapse_huge_page_-for-mthp-support
+++ a/mm/khugepaged.c
@@ -552,15 +552,17 @@ static int __collapse_huge_page_isolate(
 					unsigned long address,
 					pte_t *pte,
 					struct collapse_control *cc,
-					struct list_head *compound_pagelist)
+					struct list_head *compound_pagelist,
+					u8 order)
 {
 	struct page *page = NULL;
 	struct folio *folio = NULL;
 	pte_t *_pte;
 	int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0;
 	bool writable = false;
+	int scaled_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order);

-	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
+	for (_pte = pte; _pte < pte + (1 << order);
 	     _pte++, address += PAGE_SIZE) {
 		pte_t pteval = ptep_get(_pte);
 		if (pte_none(pteval) || (pte_present(pteval) &&
@@ -568,7 +570,7 @@ static int __collapse_huge_page_isolate(
 			++none_or_zero;
 			if (!userfaultfd_armed(vma) &&
 			    (!cc->is_khugepaged ||
-			     none_or_zero <= khugepaged_max_ptes_none)) {
+			     none_or_zero <= scaled_none)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -596,8 +598,8 @@ static int __collapse_huge_page_isolate(
 		/* See hpage_collapse_scan_pmd(). */
 		if (folio_maybe_mapped_shared(folio)) {
 			++shared;
-			if (cc->is_khugepaged &&
-			    shared > khugepaged_max_ptes_shared) {
+			if (order != HPAGE_PMD_ORDER || (cc->is_khugepaged &&
+			    shared > khugepaged_max_ptes_shared)) {
 				result = SCAN_EXCEED_SHARED_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 				goto out;
@@ -698,13 +700,14 @@ static void __collapse_huge_page_copy_su
 						struct vm_area_struct *vma,
 						unsigned long address,
 						spinlock_t *ptl,
-						struct list_head *compound_pagelist)
+						struct list_head *compound_pagelist,
+						u8 order)
 {
 	struct folio *src, *tmp;
 	pte_t *_pte;
 	pte_t pteval;

-	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
+	for (_pte = pte; _pte < pte + (1 << order);
 	     _pte++, address += PAGE_SIZE) {
 		pteval = ptep_get(_pte);
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
@@ -751,7 +754,8 @@ static void __collapse_huge_page_copy_fa
 					     pmd_t *pmd,
 					     pmd_t orig_pmd,
 					     struct vm_area_struct *vma,
-					     struct list_head *compound_pagelist)
+					     struct list_head *compound_pagelist,
+					     u8 order)
 {
 	spinlock_t *pmd_ptl;

@@ -768,7 +772,7 @@ static void __collapse_huge_page_copy_fa
 	 * Release both raw and compound pages isolated
 	 * in __collapse_huge_page_isolate.
 	 */
-	release_pte_pages(pte, pte + HPAGE_PMD_NR, compound_pagelist);
+	release_pte_pages(pte, pte + (1 << order), compound_pagelist);
 }

 /*
@@ -789,7 +793,7 @@ static void __collapse_huge_page_copy_fa
 static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
 		pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma,
 		unsigned long address, spinlock_t *ptl,
-		struct list_head *compound_pagelist)
+		struct list_head *compound_pagelist, u8 order)
 {
 	unsigned int i;
 	int result = SCAN_SUCCEED;
@@ -797,7 +801,7 @@ static int __collapse_huge_page_copy(pte
 	/*
 	 * Copying pages' contents is subject to memory poison at any iteration.
 	 */
-	for (i = 0; i < HPAGE_PMD_NR; i++) {
+	for (i = 0; i < (1 << order); i++) {
 		pte_t pteval = ptep_get(pte + i);
 		struct page *page = folio_page(folio, i);
 		unsigned long src_addr = address + i * PAGE_SIZE;
@@ -816,10 +820,10 @@ static int __collapse_huge_page_copy(pte

 	if (likely(result == SCAN_SUCCEED))
 		__collapse_huge_page_copy_succeeded(pte, vma, address, ptl,
-						    compound_pagelist);
+						    compound_pagelist, order);
 	else
 		__collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma,
-						 compound_pagelist);
+						 compound_pagelist, order);

 	return result;
 }
@@ -986,11 +990,11 @@ static int check_pmd_still_valid(struct
 static int __collapse_huge_page_swapin(struct mm_struct *mm,
 				       struct vm_area_struct *vma,
 				       unsigned long haddr, pmd_t *pmd,
-				       int referenced)
+				       int referenced, u8 order)
 {
 	int swapped_in = 0;
 	vm_fault_t ret = 0;
-	unsigned long address, end = haddr + (HPAGE_PMD_NR * PAGE_SIZE);
+	unsigned long address, end = haddr + (PAGE_SIZE << order);
 	int result;
 	pte_t *pte = NULL;
 	spinlock_t *ptl;
@@ -1021,6 +1025,14 @@ static int __collapse_huge_page_swapin(s
 		if (!is_swap_pte(vmf.orig_pte))
 			continue;

+		/* Don't swapin for mTHP collapse */
+		if (order != HPAGE_PMD_ORDER) {
+			pte_unmap(pte);
+			mmap_read_unlock(mm);
+			result = SCAN_EXCEED_SWAP_PTE;
+			goto out;
+		}
+
 		vmf.pte = pte;
 		vmf.ptl = ptl;
 		ret = do_swap_page(&vmf);
@@ -1141,7 +1153,7 @@ static int collapse_huge_page(struct mm_
 		 * that case.  Continuing to collapse causes inconsistency.
 		 */
 		result = __collapse_huge_page_swapin(mm, vma, address, pmd,
-						     referenced);
+						     referenced, HPAGE_PMD_ORDER);
 		if (result != SCAN_SUCCEED)
 			goto out_nolock;
 	}
@@ -1189,7 +1201,7 @@ static int collapse_huge_page(struct mm_
 	pte = pte_offset_map_lock(mm, &_pmd, address, &pte_ptl);
 	if (pte) {
 		result = __collapse_huge_page_isolate(vma, address, pte, cc,
-						      &compound_pagelist);
+						      &compound_pagelist, HPAGE_PMD_ORDER);
 		spin_unlock(pte_ptl);
 	} else {
 		result = SCAN_PMD_NULL;
@@ -1219,7 +1231,7 @@ static int collapse_huge_page(struct mm_

 	result = __collapse_huge_page_copy(pte, folio, pmd, _pmd,
 					   vma, address, pte_ptl,
-					   &compound_pagelist);
+					   &compound_pagelist, HPAGE_PMD_ORDER);
 	pte_unmap(pte);
 	if (unlikely(result != SCAN_SUCCEED))
 		goto out_up_write;
_

Patches currently in -mm which might be from npache@redhat.com are

khugepaged-rename-hpage_collapse_-to-khugepaged_.patch
introduce-khugepaged_collapse_single_pmd-to-unify-khugepaged-and-madvise_collapse.patch
khugepaged-generalize-hugepage_vma_revalidate-for-mthp-support.patch
khugepaged-generalize-__collapse_huge_page_-for-mthp-support.patch
khugepaged-introduce-khugepaged_scan_bitmap-for-mthp-support.patch
khugepaged-add-mthp-support.patch
khugepaged-skip-collapsing-mthp-to-smaller-orders.patch
khugepaged-avoid-unnecessary-mthp-collapse-attempts.patch
khugepaged-allow-madvise_collapse-to-check-all-anonymous-mthp-orders.patch
khugepaged-improve-tracepoints-for-mthp-orders.patch
khugepaged-add-per-order-mthp-khugepaged-stats.patch
documentation-mm-update-the-admin-guide-for-mthp-collapse.patch
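
For reference, the order scaling used in __collapse_huge_page_isolate()
above can be tried out in userspace.  This is only a rough sketch and not
kernel code: it assumes a PMD order of 9 (512 PTEs per PMD, as with 4 KiB
pages on x86-64), and the EXAMPLE_/example_ names are hypothetical helpers
made up for illustration.

```c
#include <stdio.h>

/*
 * Assumption for illustration only: PMD order 9, i.e. 512 PTEs per PMD.
 * The real value of HPAGE_PMD_ORDER depends on architecture and page size.
 */
#define EXAMPLE_HPAGE_PMD_ORDER 9

/* Number of PTEs walked for a collapse of the given order: 1 << order. */
unsigned int example_nr_ptes(unsigned int order)
{
	return 1u << order;
}

/*
 * Scale a PMD-sized max_ptes_none limit down to a smaller order, mirroring
 * "khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order)" from the patch.
 * The caller must pass order <= EXAMPLE_HPAGE_PMD_ORDER.
 */
unsigned int example_scaled_none(unsigned int max_ptes_none,
				 unsigned int order)
{
	return max_ptes_none >> (EXAMPLE_HPAGE_PMD_ORDER - order);
}

/* Print the PTE count and scaled limit for orders 2 through the PMD order. */
void example_print_table(unsigned int max_ptes_none)
{
	unsigned int order;

	for (order = 2; order <= EXAMPLE_HPAGE_PMD_ORDER; order++)
		printf("order %u: %u PTEs, up to %u none/zero\n", order,
		       example_nr_ptes(order),
		       example_scaled_none(max_ptes_none, order));
}
```

With a max_ptes_none of 511, the PMD order keeps the full limit of 511,
while order 4 (16 PTEs) is scaled to 15, so the permitted fraction of
none/zero PTEs stays roughly the same across orders.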