From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 96A60CD37B2
	for ; Fri, 15 Sep 2023 23:28:40 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230318AbjIOX2O (ORCPT ); Fri, 15 Sep 2023 19:28:14 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49954 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S236780AbjIOX1n (ORCPT ); Fri, 15 Sep 2023 19:27:43 -0400
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 87FB41FF5
	for ; Fri, 15 Sep 2023 16:27:37 -0700 (PDT)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 12DAFC433C7;
	Fri, 15 Sep 2023 23:27:37 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org;
	s=korg; t=1694820457;
	bh=reg9KlXo7z+PzrGahH7tl9Utegw0q0uApwztGfXHTGw=;
	h=Date:To:From:Subject:From;
	b=g8Oa4UOrk6P9PFODBfKFogq7n4Bx4zWHIXRBSzdWsjnRUdZpU4Np2J+qjumSHeFo3
	 8SmXca9Iph6GtsQnr6uoXmr7PmVYeJeXbbRsLq2u49UdKRqc8PPOi9iOeEIh+SC2SZ
	 M8wS9pNSgVihE3brezoH6DLxQm5M5qp+1i2kXlOg=
Date: Fri, 15 Sep 2023 16:27:36 -0700
To: mm-commits@vger.kernel.org, willy@infradead.org,
	songmuchun@bytedance.com, sidhartha.kumar@oracle.com,
	rientjes@google.com, osalvador@suse.de, naoya.horiguchi@linux.dev,
	mhocko@suse.com, linmiaohe@huawei.com, jthoughton@google.com,
	joao.m.martins@oracle.com, duanxiongchun@bytedance.com,
	david@redhat.com, anshuman.khandual@arm.com, mike.kravetz@oracle.com,
	akpm@linux-foundation.org
From: Andrew Morton
Subject: + hugetlb-batch-freeing-of-vmemmap-pages.patch added to mm-unstable branch
Message-Id: <20230915232737.12DAFC433C7@smtp.kernel.org>
Precedence: bulk
Reply-To: linux-kernel@vger.kernel.org
List-ID:
X-Mailing-List: mm-commits@vger.kernel.org

The patch titled
     Subject: hugetlb: batch freeing of vmemmap pages
has been added to the -mm mm-unstable branch.  Its filename is
     hugetlb-batch-freeing-of-vmemmap-pages.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/hugetlb-batch-freeing-of-vmemmap-pages.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Mike Kravetz
Subject: hugetlb: batch freeing of vmemmap pages
Date: Fri, 15 Sep 2023 15:15:42 -0700

Now that batching of hugetlb vmemmap optimization processing is possible,
batch the freeing of vmemmap pages.  When freeing vmemmap pages for a
hugetlb page, we add them to a list that is freed after the entire batch
has been processed.

This enhances the ability to return contiguous ranges of memory to the low
level allocators.

Link: https://lkml.kernel.org/r/20230915221548.552084-10-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz
Cc: Anshuman Khandual
Cc: David Hildenbrand
Cc: David Rientjes
Cc: James Houghton
Cc: Joao Martins
Cc: "Matthew Wilcox (Oracle)"
Cc: Miaohe Lin
Cc: Michal Hocko
Cc: Muchun Song
Cc: Naoya Horiguchi
Cc: Oscar Salvador
Cc: Sidhartha Kumar
Cc: Xiongchun Duan
Signed-off-by: Andrew Morton
---

 mm/hugetlb_vmemmap.c |   85 ++++++++++++++++++++++++++++-------------
 1 file changed, 59 insertions(+), 26 deletions(-)

--- a/mm/hugetlb_vmemmap.c~hugetlb-batch-freeing-of-vmemmap-pages
+++ a/mm/hugetlb_vmemmap.c
@@ -222,6 +222,9 @@ static void free_vmemmap_page_list(struc
 {
 	struct page *page, *next;
 
+	if (list_empty(list))
+		return;
+
 	list_for_each_entry_safe(page, next, list, lru)
 		free_vmemmap_page(page);
 }
@@ -251,7 +254,7 @@ static void vmemmap_remap_pte(pte_t *pte
 	}
 
 	entry = mk_pte(walk->reuse_page, pgprot);
-	list_add_tail(&page->lru, walk->vmemmap_pages);
+	list_add(&page->lru, walk->vmemmap_pages);
 	set_pte_at(&init_mm, addr, pte, entry);
 }
 
@@ -306,18 +309,20 @@ static void vmemmap_restore_pte(pte_t *p
  * @end:	end address of the vmemmap virtual address range that we want to
  *		remap.
  * @reuse:	reuse address.
+ * @vmemmap_pages: list to deposit vmemmap pages to be freed. It is callers
+ *		responsibility to free pages.
  *
  * Return: %0 on success, negative error code otherwise.
  */
 static int vmemmap_remap_free(unsigned long start, unsigned long end,
-			      unsigned long reuse)
+			      unsigned long reuse,
+			      struct list_head *vmemmap_pages)
 {
 	int ret;
-	LIST_HEAD(vmemmap_pages);
 	struct vmemmap_remap_walk walk = {
 		.remap_pte	= vmemmap_remap_pte,
 		.reuse_addr	= reuse,
-		.vmemmap_pages	= &vmemmap_pages,
+		.vmemmap_pages	= vmemmap_pages,
 	};
 	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
@@ -334,7 +339,7 @@ static int vmemmap_remap_free(unsigned l
 	if (walk.reuse_page) {
 		copy_page(page_to_virt(walk.reuse_page),
 			  (void *)walk.reuse_addr);
-		list_add(&walk.reuse_page->lru, &vmemmap_pages);
+		list_add(&walk.reuse_page->lru, vmemmap_pages);
 	}
 
 	/*
@@ -365,15 +370,13 @@ static int vmemmap_remap_free(unsigned l
 		walk = (struct vmemmap_remap_walk) {
 			.remap_pte	= vmemmap_restore_pte,
 			.reuse_addr	= reuse,
-			.vmemmap_pages	= &vmemmap_pages,
+			.vmemmap_pages	= vmemmap_pages,
 		};
 
 		vmemmap_remap_range(reuse, end, &walk);
 	}
 	mmap_read_unlock(&init_mm);
 
-	free_vmemmap_page_list(&vmemmap_pages);
-
 	return ret;
 }
 
@@ -389,7 +392,7 @@ static int alloc_vmemmap_page_list(unsig
 		page = alloc_pages_node(nid, gfp_mask, 0);
 		if (!page)
 			goto out;
-		list_add_tail(&page->lru, list);
+		list_add(&page->lru, list);
 	}
 
 	return 0;
@@ -576,24 +579,17 @@ static bool vmemmap_should_optimize(cons
 	return true;
 }
 
-/**
- * hugetlb_vmemmap_optimize - optimize @head page's vmemmap pages.
- * @h:		struct hstate.
- * @head:	the head page whose vmemmap pages will be optimized.
- *
- * This function only tries to optimize @head's vmemmap pages and does not
- * guarantee that the optimization will succeed after it returns. The caller
- * can use HPageVmemmapOptimized(@head) to detect if @head's vmemmap pages
- * have been optimized.
- */
-void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
+static int __hugetlb_vmemmap_optimize(const struct hstate *h,
+					struct page *head,
+					struct list_head *vmemmap_pages)
 {
+	int ret = 0;
 	unsigned long vmemmap_start = (unsigned long)head, vmemmap_end;
 	unsigned long vmemmap_reuse;
 
 	VM_WARN_ON_ONCE(!PageHuge(head));
 	if (!vmemmap_should_optimize(h, head))
-		return;
+		return ret;
 
 	static_branch_inc(&hugetlb_optimize_vmemmap_key);
 
@@ -603,21 +599,58 @@ void hugetlb_vmemmap_optimize(const stru
 
 	/*
 	 * Remap the vmemmap virtual address range [@vmemmap_start, @vmemmap_end)
-	 * to the page which @vmemmap_reuse is mapped to, then free the pages
-	 * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
+	 * to the page which @vmemmap_reuse is mapped to. Add pages previously
+	 * mapping the range to vmemmap_pages list so that they can be freed by
+	 * the caller.
 	 */
-	if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse))
+	ret = vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, vmemmap_pages);
+	if (ret)
 		static_branch_dec(&hugetlb_optimize_vmemmap_key);
 	else
 		SetHPageVmemmapOptimized(head);
+
+	return ret;
+}
+
+/**
+ * hugetlb_vmemmap_optimize - optimize @head page's vmemmap pages.
+ * @h:		struct hstate.
+ * @head:	the head page whose vmemmap pages will be optimized.
+ *
+ * This function only tries to optimize @head's vmemmap pages and does not
+ * guarantee that the optimization will succeed after it returns. The caller
+ * can use HPageVmemmapOptimized(@head) to detect if @head's vmemmap pages
+ * have been optimized.
+ */
+void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
+{
+	LIST_HEAD(vmemmap_pages);
+
+	__hugetlb_vmemmap_optimize(h, head, &vmemmap_pages);
+	free_vmemmap_page_list(&vmemmap_pages);
 }
 
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 {
 	struct folio *folio;
+	LIST_HEAD(vmemmap_pages);
 
-	list_for_each_entry(folio, folio_list, lru)
-		hugetlb_vmemmap_optimize(h, &folio->page);
+	list_for_each_entry(folio, folio_list, lru) {
+		int ret = __hugetlb_vmemmap_optimize(h, &folio->page,
+								&vmemmap_pages);
+
+		/*
+		 * Pages may have been accumulated, thus free what we have
+		 * and try again.
+		 */
+		if (ret == -ENOMEM) {
+			free_vmemmap_page_list(&vmemmap_pages);
+			INIT_LIST_HEAD(&vmemmap_pages);
+			__hugetlb_vmemmap_optimize(h, &folio->page, &vmemmap_pages);
+		}
+	}
+
+	free_vmemmap_page_list(&vmemmap_pages);
 }
 
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
_

Patches currently in -mm which might be from mike.kravetz@oracle.com are

hugetlb-set-hugetlb-page-flag-before-optimizing-vmemmap.patch
hugetlb-optimize-update_and_free_pages_bulk-to-avoid-lock-cycles.patch
hugetlb-restructure-pool-allocations.patch
hugetlb-perform-vmemmap-optimization-on-a-list-of-pages.patch
hugetlb-perform-vmemmap-restoration-on-a-list-of-pages.patch
hugetlb-batch-freeing-of-vmemmap-pages.patch
hugetlb-batch-tlb-flushes-when-restoring-vmemmap.patch
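
For readers who want to see the control flow of the patch in isolation, here is a rough userspace sketch of the same idea: each item deposits its pages on a caller-owned list, the list is freed once after the whole batch, and on -ENOMEM the pending pages are freed early and the item is retried once. This is not kernel code. All names in it (struct fake_page, struct page_batch, batch_add, batch_free, optimize_one, optimize_many, the budget parameter) are hypothetical and exist only for illustration, and the "budget" check is an assumed stand-in for memory pressure caused by pages that are still held on the list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vmemmap page; not the kernel's struct page. */
struct fake_page {
	struct fake_page *next;
};

/* Hypothetical deposit list, playing the role of the caller's list_head. */
struct page_batch {
	struct fake_page *head;
	size_t count;
};

size_t pages_freed;	/* total pages released so far */
size_t free_calls;	/* number of non-empty flushes */

/* Deposit one page at the head of the batch, similar in spirit to list_add(). */
static void batch_add(struct page_batch *b, struct fake_page *p)
{
	assert(p != NULL);
	p->next = b->head;
	b->head = p;
	b->count++;
}

/* Release everything accumulated so far; an empty batch is a no-op. */
static void batch_free(struct page_batch *b)
{
	struct fake_page *p, *next;

	if (!b->head)
		return;

	for (p = b->head; p; p = next) {
		next = p->next;
		free(p);
		pages_freed++;
	}
	b->head = NULL;
	b->count = 0;
	free_calls++;
}

/*
 * Hypothetical per-item step: deposit npages pages into the batch.
 * Assumption for this sketch: it fails with -ENOMEM when the pages still
 * pending in the batch plus the new ones would exceed budget.
 */
static int optimize_one(struct page_batch *b, size_t npages, size_t budget)
{
	size_t i;

	if (b->count + npages > budget)
		return -ENOMEM;

	for (i = 0; i < npages; i++) {
		struct fake_page *p = malloc(sizeof(*p));

		if (!p)
			return -ENOMEM;
		batch_add(b, p);
	}
	return 0;
}

/* Process all items, freeing once at the end and early only on -ENOMEM. */
int optimize_many(size_t nitems, size_t npages, size_t budget)
{
	struct page_batch b = { NULL, 0 };
	int ret = 0;
	size_t i;

	for (i = 0; i < nitems; i++) {
		ret = optimize_one(&b, npages, budget);
		if (ret == -ENOMEM) {
			/* Free what has accumulated, then try this item again. */
			batch_free(&b);
			ret = optimize_one(&b, npages, budget);
		}
	}

	batch_free(&b);
	return ret;
}
```

In this sketch, optimize_many(4, 7, 100) never hits the budget, so all 28 pages are released in a single flush at the end. With a budget of 14 the third item fails, the 14 pending pages are freed, the item is retried, and the remaining 14 pages go out in the final flush, for two flushes in total. The figure of seven pages per item is purely illustrative.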