From: Andrew Morton <akpm@linux-foundation.org>
To: mm-commits@vger.kernel.org, willy@infradead.org,
usama.arif@bytedance.com, songmuchun@bytedance.com,
senozhatsky@chromium.org, rientjes@google.com, osalvador@suse.de,
naoya.horiguchi@linux.dev, mhocko@suse.com, linmiaohe@huawei.com,
konradybcio@kernel.org, jthoughton@google.com,
joao.m.martins@oracle.com, duanxiongchun@bytedance.com,
david@redhat.com, anshuman.khandual@arm.com, 21cnbao@gmail.com,
mike.kravetz@oracle.com, akpm@linux-foundation.org
Subject: + hugetlb-batch-freeing-of-vmemmap-pages.patch added to mm-unstable branch
Date: Thu, 19 Oct 2023 09:46:45 -0700 [thread overview]
Message-ID: <20231019164645.E6F50C433C8@smtp.kernel.org> (raw)
The patch titled
Subject: hugetlb: batch freeing of vmemmap pages
has been added to the -mm mm-unstable branch. Its filename is
hugetlb-batch-freeing-of-vmemmap-pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/hugetlb-batch-freeing-of-vmemmap-pages.patch
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: hugetlb: batch freeing of vmemmap pages
Date: Wed, 18 Oct 2023 19:31:07 -0700
Now that batching of hugetlb vmemmap optimization processing is possible,
batch the freeing of vmemmap pages. When freeing vmemmap pages for a
hugetlb page, we add them to a list that is freed after the entire batch
has been processed.
This enhances the ability to return contiguous ranges of memory to the low
level allocators.
Link: https://lkml.kernel.org/r/20231019023113.345257-6-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <21cnbao@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Dybcio <konradybcio@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Usama Arif <usama.arif@bytedance.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb_vmemmap.c | 82 ++++++++++++++++++++++++++++-------------
1 file changed, 56 insertions(+), 26 deletions(-)
--- a/mm/hugetlb_vmemmap.c~hugetlb-batch-freeing-of-vmemmap-pages
+++ a/mm/hugetlb_vmemmap.c
@@ -251,7 +251,7 @@ static void vmemmap_remap_pte(pte_t *pte
}
entry = mk_pte(walk->reuse_page, pgprot);
- list_add_tail(&page->lru, walk->vmemmap_pages);
+ list_add(&page->lru, walk->vmemmap_pages);
set_pte_at(&init_mm, addr, pte, entry);
}
@@ -306,18 +306,20 @@ static void vmemmap_restore_pte(pte_t *p
* @end: end address of the vmemmap virtual address range that we want to
* remap.
* @reuse: reuse address.
+ * @vmemmap_pages: list to deposit vmemmap pages to be freed. It is callers
+ * responsibility to free pages.
*
* Return: %0 on success, negative error code otherwise.
*/
static int vmemmap_remap_free(unsigned long start, unsigned long end,
- unsigned long reuse)
+ unsigned long reuse,
+ struct list_head *vmemmap_pages)
{
int ret;
- LIST_HEAD(vmemmap_pages);
struct vmemmap_remap_walk walk = {
.remap_pte = vmemmap_remap_pte,
.reuse_addr = reuse,
- .vmemmap_pages = &vmemmap_pages,
+ .vmemmap_pages = vmemmap_pages,
};
int nid = page_to_nid((struct page *)reuse);
gfp_t gfp_mask = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
@@ -334,7 +336,7 @@ static int vmemmap_remap_free(unsigned l
if (walk.reuse_page) {
copy_page(page_to_virt(walk.reuse_page),
(void *)walk.reuse_addr);
- list_add(&walk.reuse_page->lru, &vmemmap_pages);
+ list_add(&walk.reuse_page->lru, vmemmap_pages);
}
/*
@@ -365,15 +367,13 @@ static int vmemmap_remap_free(unsigned l
walk = (struct vmemmap_remap_walk) {
.remap_pte = vmemmap_restore_pte,
.reuse_addr = reuse,
- .vmemmap_pages = &vmemmap_pages,
+ .vmemmap_pages = vmemmap_pages,
};
vmemmap_remap_range(reuse, end, &walk);
}
mmap_read_unlock(&init_mm);
- free_vmemmap_page_list(&vmemmap_pages);
-
return ret;
}
@@ -389,7 +389,7 @@ static int alloc_vmemmap_page_list(unsig
page = alloc_pages_node(nid, gfp_mask, 0);
if (!page)
goto out;
- list_add_tail(&page->lru, list);
+ list_add(&page->lru, list);
}
return 0;
@@ -577,24 +577,17 @@ static bool vmemmap_should_optimize(cons
return true;
}
-/**
- * hugetlb_vmemmap_optimize - optimize @head page's vmemmap pages.
- * @h: struct hstate.
- * @head: the head page whose vmemmap pages will be optimized.
- *
- * This function only tries to optimize @head's vmemmap pages and does not
- * guarantee that the optimization will succeed after it returns. The caller
- * can use HPageVmemmapOptimized(@head) to detect if @head's vmemmap pages
- * have been optimized.
- */
-void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
+static int __hugetlb_vmemmap_optimize(const struct hstate *h,
+ struct page *head,
+ struct list_head *vmemmap_pages)
{
+ int ret = 0;
unsigned long vmemmap_start = (unsigned long)head, vmemmap_end;
unsigned long vmemmap_reuse;
VM_WARN_ON_ONCE(!PageHuge(head));
if (!vmemmap_should_optimize(h, head))
- return;
+ return ret;
static_branch_inc(&hugetlb_optimize_vmemmap_key);
@@ -604,21 +597,58 @@ void hugetlb_vmemmap_optimize(const stru
/*
* Remap the vmemmap virtual address range [@vmemmap_start, @vmemmap_end)
- * to the page which @vmemmap_reuse is mapped to, then free the pages
- * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
+ * to the page which @vmemmap_reuse is mapped to. Add pages previously
+ * mapping the range to vmemmap_pages list so that they can be freed by
+ * the caller.
*/
- if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse))
+ ret = vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, vmemmap_pages);
+ if (ret)
static_branch_dec(&hugetlb_optimize_vmemmap_key);
else
SetHPageVmemmapOptimized(head);
+
+ return ret;
+}
+
+/**
+ * hugetlb_vmemmap_optimize - optimize @head page's vmemmap pages.
+ * @h: struct hstate.
+ * @head: the head page whose vmemmap pages will be optimized.
+ *
+ * This function only tries to optimize @head's vmemmap pages and does not
+ * guarantee that the optimization will succeed after it returns. The caller
+ * can use HPageVmemmapOptimized(@head) to detect if @head's vmemmap pages
+ * have been optimized.
+ */
+void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
+{
+ LIST_HEAD(vmemmap_pages);
+
+ __hugetlb_vmemmap_optimize(h, head, &vmemmap_pages);
+ free_vmemmap_page_list(&vmemmap_pages);
}
void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
{
struct folio *folio;
+ LIST_HEAD(vmemmap_pages);
+
+ list_for_each_entry(folio, folio_list, lru) {
+ int ret = __hugetlb_vmemmap_optimize(h, &folio->page,
+ &vmemmap_pages);
+
+ /*
+ * Pages to be freed may have been accumulated. If we
+ * encounter an ENOMEM, free what we have and try again.
+ */
+ if (ret == -ENOMEM && !list_empty(&vmemmap_pages)) {
+ free_vmemmap_page_list(&vmemmap_pages);
+ INIT_LIST_HEAD(&vmemmap_pages);
+ __hugetlb_vmemmap_optimize(h, &folio->page, &vmemmap_pages);
+ }
+ }
- list_for_each_entry(folio, folio_list, lru)
- hugetlb_vmemmap_optimize(h, &folio->page);
+ free_vmemmap_page_list(&vmemmap_pages);
}
static struct ctl_table hugetlb_vmemmap_sysctls[] = {
_
Patches currently in -mm which might be from mike.kravetz@oracle.com are
hugetlb-optimize-update_and_free_pages_bulk-to-avoid-lock-cycles.patch
hugetlb-restructure-pool-allocations.patch
hugetlb-perform-vmemmap-optimization-on-a-list-of-pages.patch
hugetlb-perform-vmemmap-restoration-on-a-list-of-pages.patch
hugetlb-batch-freeing-of-vmemmap-pages.patch
hugetlb-batch-tlb-flushes-when-restoring-vmemmap.patch
next reply other threads:[~2023-10-19 16:46 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-19 16:46 Andrew Morton [this message]
-- strict thread matches above, loose matches on Subject: below --
2023-10-06 19:18 + hugetlb-batch-freeing-of-vmemmap-pages.patch added to mm-unstable branch Andrew Morton
2023-09-26 21:39 Andrew Morton
2023-09-15 23:27 Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231019164645.E6F50C433C8@smtp.kernel.org \
--to=akpm@linux-foundation.org \
--cc=21cnbao@gmail.com \
--cc=anshuman.khandual@arm.com \
--cc=david@redhat.com \
--cc=duanxiongchun@bytedance.com \
--cc=joao.m.martins@oracle.com \
--cc=jthoughton@google.com \
--cc=konradybcio@kernel.org \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=mm-commits@vger.kernel.org \
--cc=naoya.horiguchi@linux.dev \
--cc=osalvador@suse.de \
--cc=rientjes@google.com \
--cc=senozhatsky@chromium.org \
--cc=songmuchun@bytedance.com \
--cc=usama.arif@bytedance.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.