Date: Tue, 18 Feb 2025 16:04:38 -0800
To:
mm-commits@vger.kernel.org,yuzhao@google.com,usamaarif642@gmail.com,roman.gushchin@linux.dev,peterz@infradead.org,osalvador@suse.de,muchun.song@linux.dev,mpe@ellerman.id.au,maddy@linux.ibm.com,luto@kernel.org,joao.m.martins@oracle.com,hca@linux.ibm.com,gor@linux.ibm.com,dave.hansen@linux.intel.com,dan.carpenter@linaro.org,agordeev@linux.ibm.com,fvdl@google.com,akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-hugetlb-add-pre-hvo-framework.patch added to mm-unstable branch
Message-Id: <20250219000439.50345C4CEE2@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:


The patch titled
     Subject: mm/hugetlb: add pre-HVO framework
has been added to the -mm mm-unstable branch. Its filename is
     mm-hugetlb-add-pre-hvo-framework.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-hugetlb-add-pre-hvo-framework.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Frank van der Linden
Subject: mm/hugetlb: add pre-HVO framework
Date: Tue, 18 Feb 2025 18:16:45 +0000

Define flags for pre-HVOed bootmem hugetlb pages, and act on them.

The most important flag is the HVO flag, signalling that a bootmem
allocated gigantic page has already been HVO-ed. If this flag is seen by
the hugetlb bootmem gather code, the page is marked as HVO optimized. The
HVO code will then not try to optimize it again. Instead, it will just
map the tail page mirror pages read-only, completing the HVO steps.

No functional change, as nothing sets the flags yet.

Link: https://lkml.kernel.org/r/20250218181656.207178-18-fvdl@google.com
Signed-off-by: Frank van der Linden
Cc: Alexander Gordeev
Cc: Andy Lutomirski
Cc: Dan Carpenter
Cc: Dave Hansen
Cc: Heiko Carstens
Cc: Joao Martins
Cc: Madhavan Srinivasan
Cc: Michael Ellerman
Cc: Muchun Song
Cc: Oscar Salvador
Cc: Peter Zijlstra
Cc: Roman Gushchin (Cruise)
Cc: Usama Arif
Cc: Vasily Gorbik
Cc: Yu Zhao
Signed-off-by: Andrew Morton
---

 arch/powerpc/mm/hugetlbpage.c |    1 
 include/linux/hugetlb.h       |    4 ++
 mm/hugetlb.c                  |   24 ++++++++++++++-
 mm/hugetlb_vmemmap.c          |   50 ++++++++++++++++++++++++++++++--
 mm/hugetlb_vmemmap.h          |    7 ++++
 5 files changed, 83 insertions(+), 3 deletions(-)

--- a/arch/powerpc/mm/hugetlbpage.c~mm-hugetlb-add-pre-hvo-framework
+++ a/arch/powerpc/mm/hugetlbpage.c
@@ -113,6 +113,7 @@ static int __init pseries_alloc_bootmem_
 	gpage_freearray[nr_gpages] = 0;
 	list_add(&m->list, &huge_boot_pages[0]);
 	m->hstate = hstate;
+	m->flags = 0;
 	return 1;
 }
 
--- a/include/linux/hugetlb.h~mm-hugetlb-add-pre-hvo-framework
+++ a/include/linux/hugetlb.h
@@ -681,8 +681,12 @@ struct hstate {
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
+	unsigned long flags;
 };
 
+#define HUGE_BOOTMEM_HVO		0x0001
+#define HUGE_BOOTMEM_ZONES_VALID	0x0002
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
--- a/mm/hugetlb.c~mm-hugetlb-add-pre-hvo-framework
+++ a/mm/hugetlb.c
@@ -3215,6 +3215,7 @@ found:
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
+	m->flags = 0;
 	return 1;
 }
 
@@ -3282,7 +3283,7 @@ static void __init prep_and_add_bootmem_
 	struct folio *folio, *tmp_f;
 
 	/* Send list for bulk vmemmap optimization processing */
-	hugetlb_vmemmap_optimize_folios(h, folio_list);
+	hugetlb_vmemmap_optimize_bootmem_folios(h, folio_list);
 
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
@@ -3311,6 +3312,13 @@ static bool __init hugetlb_bootmem_page_
 	unsigned long start_pfn;
 	bool valid;
 
+	if (m->flags & HUGE_BOOTMEM_ZONES_VALID) {
+		/*
+		 * Already validated, skip check.
+		 */
+		return true;
+	}
+
 	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
 	valid = !pfn_range_intersects_zones(nid, start_pfn,
@@ -3343,6 +3351,11 @@ static void __init hugetlb_bootmem_free_
 	}
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+	return (m->flags & HUGE_BOOTMEM_HVO);
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3383,6 +3396,15 @@ static void __init gather_bootmem_preall
 		hugetlb_folio_init_vmemmap(folio, h,
 					   HUGETLB_VMEMMAP_RESERVE_PAGES);
 		init_new_hugetlb_folio(h, folio);
+
+		if (hugetlb_bootmem_page_prehvo(m))
+			/*
+			 * If pre-HVO was done, just set the
+			 * flag, the HVO code will then skip
+			 * this folio.
+			 */
+			folio_set_hugetlb_vmemmap_optimized(folio);
+
 		list_add(&folio->lru, &folio_list);
 
 		/*
--- a/mm/hugetlb_vmemmap.c~mm-hugetlb-add-pre-hvo-framework
+++ a/mm/hugetlb_vmemmap.c
@@ -649,14 +649,39 @@ static int hugetlb_vmemmap_split_folio(c
 	return vmemmap_remap_split(vmemmap_start, vmemmap_end, vmemmap_reuse);
 }
 
-void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
+					      struct list_head *folio_list,
+					      bool boot)
 {
 	struct folio *folio;
+	int nr_to_optimize;
 	LIST_HEAD(vmemmap_pages);
 	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;
 
+	nr_to_optimize = 0;
 	list_for_each_entry(folio, folio_list, lru) {
-		int ret = hugetlb_vmemmap_split_folio(h, folio);
+		int ret;
+		unsigned long spfn, epfn;
+
+		if (boot && folio_test_hugetlb_vmemmap_optimized(folio)) {
+			/*
+			 * Already optimized by pre-HVO, just map the
+			 * mirrored tail page structs RO.
+			 */
+			spfn = (unsigned long)&folio->page;
+			epfn = spfn + pages_per_huge_page(h);
+			vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio),
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			register_page_bootmem_memmap(pfn_to_section_nr(spfn),
+					&folio->page,
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			static_branch_inc(&hugetlb_optimize_vmemmap_key);
+			continue;
+		}
+
+		nr_to_optimize++;
+
+		ret = hugetlb_vmemmap_split_folio(h, folio);
 
 		/*
 		 * Spliting the PMD requires allocating a page, thus lets fail
@@ -668,6 +693,16 @@ void hugetlb_vmemmap_optimize_folios(str
 			break;
 	}
 
+	if (!nr_to_optimize)
+		/*
+		 * All pre-HVO folios, nothing left to do. It's ok if
+		 * there is a mix of pre-HVO and not yet HVO-ed folios
+		 * here, as __hugetlb_vmemmap_optimize_folio() will
+		 * skip any folios that already have the optimized flag
+		 * set, see vmemmap_should_optimize_folio().
+		 */
+		goto out;
+
 	flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
@@ -693,10 +728,21 @@ void hugetlb_vmemmap_optimize_folios(str
 		}
 	}
 
+out:
 	flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, false);
+}
+
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
+}
+
 static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
--- a/mm/hugetlb_vmemmap.h~mm-hugetlb-add-pre-hvo-framework
+++ a/mm/hugetlb_vmemmap.h
@@ -24,6 +24,8 @@ long hugetlb_vmemmap_restore_folios(cons
 					struct list_head *non_hvo_folios);
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
@@ -64,6 +66,11 @@ static inline void hugetlb_vmemmap_optim
 {
 }
 
+static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
+						struct list_head *folio_list)
+{
+}
+
 static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
 	return 0;
_

Patches currently in -mm which might be from fvdl@google.com are

mm-cma-export-total-and-free-number-of-pages-for-cma-areas.patch
mm-cma-support-multiple-contiguous-ranges-if-requested.patch
mm-cma-introduce-cma_intersects-function.patch
mm-hugetlb-use-cma_declare_contiguous_multi.patch
mm-hugetlb-remove-redundant-__clearpagereserved.patch
mm-hugetlb-use-online-nodes-for-bootmem-allocation.patch
mm-hugetlb-convert-cmdline-parameters-from-setup-to-early.patch
x86-mm-make-register_page_bootmem_memmap-handle-pte-mappings.patch
mm-bootmem_info-export-register_page_bootmem_memmap.patch
mm-sparse-allow-for-alternate-vmemmap-section-init-at-boot.patch
mm-hugetlb-set-migratetype-for-bootmem-folios.patch
mm-define-__init_reserved_page_zone-function.patch
mm-hugetlb-check-bootmem-pages-for-zone-intersections.patch
mm-sparse-add-vmemmap__hvo-functions.patch
mm-hugetlb-deal-with-multiple-calls-to-hugetlb_bootmem_alloc.patch
mm-hugetlb-move-huge_boot_pages-list-init-to-hugetlb_bootmem_alloc.patch
mm-hugetlb-add-pre-hvo-framework.patch
mm-hugetlb_vmemmap-fix-hugetlb_vmemmap_restore_folios-definition.patch
mm-hugetlb-do-pre-hvo-for-bootmem-allocated-pages.patch
x86-setup-call-hugetlb_bootmem_alloc-early.patch
x86-mm-set-arch_want_sparsemem_vmemmap_preinit.patch
mm-cma-simplify-zone-intersection-check.patch
mm-cma-introduce-a-cma-validate-function.patch
mm-cma-introduce-interface-for-early-reservations.patch
mm-hugetlb-add-hugetlb_cma_only-cmdline-option.patch
mm-hugetlb-enable-bootmem-allocation-from-cma-areas.patch
mm-hugetlb-move-hugetlb-cma-code-in-to-its-own-file.patch
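
Illustration only, not part of the patch: the flag flow described in the
changelog can be modelled in userspace roughly as below. This is a minimal
sketch under stated assumptions. The two HUGE_BOOTMEM_* values are copied
from the diff; every model_* name and both struct layouts are hypothetical
stand-ins, and the real remapping work (vmemmap_wrprotect_hvo(), PMD split,
TLB flushes) is reduced to setting booleans.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined by the patch in include/linux/hugetlb.h. */
#define HUGE_BOOTMEM_HVO		0x0001
#define HUGE_BOOTMEM_ZONES_VALID	0x0002

/* Hypothetical stand-in for struct huge_bootmem_page (flags only). */
struct model_bootmem_page {
	unsigned long flags;
};

/* Hypothetical stand-in for a hugetlb folio. */
struct model_folio {
	bool vmemmap_optimized;	/* models the "vmemmap optimized" folio flag */
	bool tail_mirror_ro;	/* models tail page structs mapped read-only */
};

/* Gather step: a pre-HVOed bootmem page only gets its folio flag set. */
static void model_gather(const struct model_bootmem_page *m,
			 struct model_folio *folio)
{
	folio->vmemmap_optimized = (m->flags & HUGE_BOOTMEM_HVO) != 0;
	folio->tail_mirror_ro = false;
}

/*
 * Optimize step, shaped like the first loop plus the early exit in the
 * patch. Returns the number of folios that were not handled by the
 * pre-HVO shortcut in the first pass.
 */
static int model_optimize_folios(struct model_folio *folios, size_t n,
				 bool boot)
{
	int nr_to_optimize = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (boot && folios[i].vmemmap_optimized) {
			/* Pre-HVO case: just finish by mapping read-only. */
			folios[i].tail_mirror_ro = true;
			continue;
		}
		nr_to_optimize++;
	}

	/* All folios were pre-HVOed, so the second pass is skipped. */
	if (!nr_to_optimize)
		return 0;

	for (i = 0; i < n; i++) {
		/* Already optimized folios are skipped, as the patch notes. */
		if (folios[i].vmemmap_optimized)
			continue;
		folios[i].vmemmap_optimized = true;
		folios[i].tail_mirror_ro = true;
	}
	return nr_to_optimize;
}
```

Calling model_gather() on one page with HUGE_BOOTMEM_HVO set and one with
flags == 0, then model_optimize_folios() with boot == true, sends the first
folio down the shortcut and the second through the regular pass.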