From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
To: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
linux-mm@kvack.org, Will Deacon <will@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH 2/4] mm: Account pagetable memory when allocated
Date: Thu, 13 Nov 2025 11:39:51 -0800 [thread overview]
Message-ID: <aRY0BwgjVM2xrBnD@fedora> (raw)
In-Reply-To: <20251113140448.1814860-3-willy@infradead.org>
On Thu, Nov 13, 2025 at 02:04:44PM +0000, Matthew Wilcox (Oracle) wrote:
> Move the accounting from the constructor to the allocation site.
> Some of the architecture code is a little complex to reason about,
I think the patch is correct. After taking a look at all the paths, I
noticed 2 cases that I'm uncertain about:
1) Commit 454b0289c6b5 ("sparc32: mm: Don't try to free page-table pages
if ctor() fails") implies that sparc32 uses memblock pages, so we never
actually free them with pagetable_free(). I'm not sure how the
allocation path looks. Would this break the accounting? +Cc-ing Will.
2) In !CONFIG_MMU_GATHER_TABLE_FREE, tlb_remove_table() calls
pagetable_dtor() -> tlb_remove_page(). I'm wondering if this may run into
accounting issues?
Commit f21bb37afbba0 ("mm: pgtable: make generic tlb_remove_table() use
struct ptdesc") says it's only an arm thing that needs to be fixed
anyway. +Cc-ing Qi.
> but I think this is all correct (and slightly more efficient due
> to having 'order' as an argument instead of having to retrieve it
> from struct page again).
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> include/linux/mm.h | 11 -----------
> mm/memory.c | 16 +++++++++++++---
> 2 files changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index e168ee23091e..17f783c04c87 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3082,26 +3082,15 @@ static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; }
> static inline void ptlock_free(struct ptdesc *ptdesc) {}
> #endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */
>
> -static inline unsigned long ptdesc_nr_pages(const struct ptdesc *ptdesc)
> -{
> - return compound_nr(ptdesc_page(ptdesc));
> -}
> -
> static inline void __pagetable_ctor(struct ptdesc *ptdesc)
> {
> - pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
> -
> __SetPageTable(ptdesc_page(ptdesc));
> - mod_node_page_state(pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
> }
>
> static inline void pagetable_dtor(struct ptdesc *ptdesc)
> {
> - pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
> -
> ptlock_free(ptdesc);
> __ClearPageTable(ptdesc_page(ptdesc));
> - mod_node_page_state(pgdat, NR_PAGETABLE, -ptdesc_nr_pages(ptdesc));
> }
>
> static inline void pagetable_dtor_free(struct ptdesc *ptdesc)
> diff --git a/mm/memory.c b/mm/memory.c
> index 781cd7f607f7..35886fde189c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -7351,7 +7351,13 @@ long copy_folio_from_user(struct folio *dst_folio,
> struct ptdesc *pagetable_alloc_noprof(gfp_t gfp, unsigned int order)
> {
> struct page *page = alloc_frozen_pages_noprof(gfp | __GFP_COMP, order);
> + pg_data_t *pgdat;
>
> + if (!page)
> + return NULL;
> +
> + pgdat = NODE_DATA(page_to_nid(page));
> + mod_node_page_state(pgdat, NR_PAGETABLE, 1 << order);
> return page_ptdesc(page);
> }
>
> @@ -7364,12 +7370,16 @@ struct ptdesc *pagetable_alloc_noprof(gfp_t gfp, unsigned int order)
> */
> void pagetable_free(struct ptdesc *pt)
> {
> + pg_data_t *pgdat = NODE_DATA(memdesc_nid(pt->pt_flags));
> struct page *page = ptdesc_page(pt);
> + unsigned int order = compound_order(page);
>
> - if (ptdesc_test_kernel(pt))
> + if (ptdesc_test_kernel(pt)) {
> pagetable_free_kernel(pt);
> - else
> - free_frozen_pages(page, compound_order(page));
> + return;
> + }
> + mod_node_page_state(pgdat, NR_PAGETABLE, -(1L << order));
> + free_frozen_pages(page, order);
> }
>
> #if defined(CONFIG_SPLIT_PTE_PTLOCKS) && ALLOC_SPLIT_PTLOCKS
> --
> 2.47.2
>
next prev parent reply other threads:[~2025-11-13 19:39 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-13 14:04 [PATCH 0/4] Convert pgtable to use frozen pages Matthew Wilcox (Oracle)
2025-11-13 14:04 ` [PATCH 1/4] mm: Use frozen pages for page tables Matthew Wilcox (Oracle)
2025-11-13 18:24 ` Vishal Moola (Oracle)
2025-11-13 19:14 ` Vishal Moola (Oracle)
2025-11-14 13:45 ` Matthew Wilcox
2025-11-14 14:31 ` Will Deacon
2025-11-17 14:38 ` kernel test robot
2025-11-18 0:44 ` Vishal Moola (Oracle)
2025-11-19 15:46 ` Chih-En Lin
2025-11-20 13:55 ` David Hildenbrand (Red Hat)
2025-11-13 14:04 ` [PATCH 2/4] mm: Account pagetable memory when allocated Matthew Wilcox (Oracle)
2025-11-13 19:39 ` Vishal Moola (Oracle) [this message]
2025-11-13 14:04 ` [PATCH 3/4] mm: Mark " Matthew Wilcox (Oracle)
2025-11-18 17:00 ` David Hildenbrand (Red Hat)
2025-11-13 14:04 ` [PATCH 4/4] pgtable: Remove uses of page->lru Matthew Wilcox (Oracle)
2025-11-20 13:56 ` David Hildenbrand (Red Hat)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aRY0BwgjVM2xrBnD@fedora \
--to=vishal.moola@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=david@kernel.org \
--cc=linux-mm@kvack.org \
--cc=will@kernel.org \
--cc=willy@infradead.org \
--cc=zhengqi.arch@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.