Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
@ 2026-05-21  3:27 Alistair Popple
  2026-05-21 22:31 ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Alistair Popple @ 2026-05-21  3:27 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, linux-mm, catalin.marinas, will, david, akpm,
	Alistair Popple

Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in
__create_pgd_mapping()") page-table allocation on ARM64 always
calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type
to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL.
However the matching pagetable_dtor() calls were never added.

With DEBUG_VM enabled on kernel versions prior to v6.17 without
2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page
type") this leads to the following warning when freeing these pages due
to page->page_type sharing page->_mapcount:

  BUG: Bad page state in process ... pfn:284fbb
  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb
  flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
  page_type: f2(table)
  page dumped because: nonzero mapcount
  Call trace:
   bad_page+0x13c/0x160
   __free_frozen_pages+0x6cc/0x860
   ___free_pages+0xf4/0x180
   free_pages+0x54/0x80
   free_hotplug_page_range.part.0+0x58/0x90
   free_empty_tables+0x438/0x500
   __remove_pgd_mapping.constprop.0+0x60/0xa8
   arch_remove_memory+0x48/0x80
   try_remove_memory+0x158/0x1d8
   offline_and_remove_memory+0x138/0x180

It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS
is defined and incorrect NR_PAGETABLE stats. Fix this by calling
pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the
page to undo the effects of calling pagetable_*_ctor().

Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
 arch/arm64/mm/mmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8e1d80a7033e..0c24fe650e95 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
 
 static void free_hotplug_pgtable_page(struct page *page)
 {
+	pagetable_dtor(page_ptdesc(page));
 	free_hotplug_page_range(page, PAGE_SIZE, NULL);
 }
 
-- 
2.54.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-21  3:27 [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables Alistair Popple
@ 2026-05-21 22:31 ` Andrew Morton
  2026-05-21 23:50   ` Alistair Popple
  2026-05-22  7:15   ` Catalin Marinas
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2026-05-21 22:31 UTC (permalink / raw)
  To: Alistair Popple
  Cc: linux-arm-kernel, linux-kernel, linux-mm, catalin.marinas, will,
	david

On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote:

> Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in
> __create_pgd_mapping()") page-table allocation on ARM64 always
> calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type
> to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL.
> However the matching pagetable_dtor() calls were never added.
> 
> With DEBUG_VM enabled on kernel versions prior to v6.17 without
> 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page
> type") this leads to the following warning when freeing these pages due
> to page->page_type sharing page->_mapcount:
> 
>   BUG: Bad page state in process ... pfn:284fbb
>   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb
>   flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
>   page_type: f2(table)
>   page dumped because: nonzero mapcount
>   Call trace:
>    bad_page+0x13c/0x160
>    __free_frozen_pages+0x6cc/0x860
>    ___free_pages+0xf4/0x180
>    free_pages+0x54/0x80
>    free_hotplug_page_range.part.0+0x58/0x90
>    free_empty_tables+0x438/0x500
>    __remove_pgd_mapping.constprop.0+0x60/0xa8
>    arch_remove_memory+0x48/0x80
>    try_remove_memory+0x158/0x1d8
>    offline_and_remove_memory+0x138/0x180
> 
> It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS
> is defined and incorrect NR_PAGETABLE stats. Fix this by calling
> pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the
> page to undo the effects of calling pagetable_*_ctor().
> 
> Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()")

6.16+, so I assume we want cc:stable here.

>  arch/arm64/mm/mmu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 8e1d80a7033e..0c24fe650e95 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
>  
>  static void free_hotplug_pgtable_page(struct page *page)
>  {
> +	pagetable_dtor(page_ptdesc(page));
>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
>  }

I'd of course prefer that arm maintainers handle this.  But
5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to
fix it.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-21 22:31 ` Andrew Morton
@ 2026-05-21 23:50   ` Alistair Popple
  2026-05-22  7:15   ` Catalin Marinas
  1 sibling, 0 replies; 13+ messages in thread
From: Alistair Popple @ 2026-05-21 23:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-arm-kernel, linux-kernel, linux-mm, catalin.marinas, will,
	david

On 2026-05-22 at 08:31 +1000, Andrew Morton <akpm@linux-foundation.org> wrote...
> On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote:
> 
> > Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in
> > __create_pgd_mapping()") page-table allocation on ARM64 always
> > calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type
> > to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL.
> > However the matching pagetable_dtor() calls were never added.
> > 
> > With DEBUG_VM enabled on kernel versions prior to v6.17 without
> > 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page
> > type") this leads to the following warning when freeing these pages due
> > to page->page_type sharing page->_mapcount:
> > 
> >   BUG: Bad page state in process ... pfn:284fbb
> >   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb
> >   flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
> >   page_type: f2(table)
> >   page dumped because: nonzero mapcount
> >   Call trace:
> >    bad_page+0x13c/0x160
> >    __free_frozen_pages+0x6cc/0x860
> >    ___free_pages+0xf4/0x180
> >    free_pages+0x54/0x80
> >    free_hotplug_page_range.part.0+0x58/0x90
> >    free_empty_tables+0x438/0x500
> >    __remove_pgd_mapping.constprop.0+0x60/0xa8
> >    arch_remove_memory+0x48/0x80
> >    try_remove_memory+0x158/0x1d8
> >    offline_and_remove_memory+0x138/0x180
> > 
> > It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS
> > is defined and incorrect NR_PAGETABLE stats. Fix this by calling
> > pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the
> > page to undo the effects of calling pagetable_*_ctor().
> > 
> > Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()")
> 
> 6.16+, so I assume we want cc:stable here.

Yes indeed. Sorry I forgot to do that but I can see you added it so thanks for
that.

 - Alistair

> >  arch/arm64/mm/mmu.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index 8e1d80a7033e..0c24fe650e95 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
> >  
> >  static void free_hotplug_pgtable_page(struct page *page)
> >  {
> > +	pagetable_dtor(page_ptdesc(page));
> >  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> >  }
> 
> I'd of course prefer that arm maintainers handle this.  But
> 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to
> fix it.
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-21 22:31 ` Andrew Morton
  2026-05-21 23:50   ` Alistair Popple
@ 2026-05-22  7:15   ` Catalin Marinas
  2026-05-22  7:32     ` Catalin Marinas
  2026-05-22  9:36     ` Vishal Moola
  1 sibling, 2 replies; 13+ messages in thread
From: Catalin Marinas @ 2026-05-22  7:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will,
	david

On Thu, May 21, 2026 at 03:31:30PM -0700, Andrew Morton wrote:
> On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote:
> > Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in
> > __create_pgd_mapping()") page-table allocation on ARM64 always
> > calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type
> > to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL.
> > However the matching pagetable_dtor() calls were never added.
> > 
> > With DEBUG_VM enabled on kernel versions prior to v6.17 without
> > 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page
> > type") this leads to the following warning when freeing these pages due
> > to page->page_type sharing page->_mapcount:
> > 
> >   BUG: Bad page state in process ... pfn:284fbb
> >   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb
> >   flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
> >   page_type: f2(table)
> >   page dumped because: nonzero mapcount
> >   Call trace:
> >    bad_page+0x13c/0x160
> >    __free_frozen_pages+0x6cc/0x860
> >    ___free_pages+0xf4/0x180
> >    free_pages+0x54/0x80
> >    free_hotplug_page_range.part.0+0x58/0x90
> >    free_empty_tables+0x438/0x500
> >    __remove_pgd_mapping.constprop.0+0x60/0xa8
> >    arch_remove_memory+0x48/0x80
> >    try_remove_memory+0x158/0x1d8
> >    offline_and_remove_memory+0x138/0x180
> > 
> > It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS
> > is defined and incorrect NR_PAGETABLE stats. Fix this by calling
> > pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the
> > page to undo the effects of calling pagetable_*_ctor().
> > 
> > Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()")
> 
> 6.16+, so I assume we want cc:stable here.
> 
> >  arch/arm64/mm/mmu.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index 8e1d80a7033e..0c24fe650e95 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
> >  
> >  static void free_hotplug_pgtable_page(struct page *page)
> >  {
> > +	pagetable_dtor(page_ptdesc(page));
> >  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> >  }
> 
> I'd of course prefer that arm maintainers handle this.  But
> 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to
> fix it.

That's fine but Sashiko has some points:

https://sashiko.dev/#/patchset/20260521032730.2104017-1-apopple@nvidia.com

The __remove_pgd_mapping() path is fine but we also have the
vmemmap_free() path where the constructor was never called.

We could pass around a bool dtor argument but I wonder whether we could
just check it's a pgtable page:

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 4c8959153ac4..9d42cbddce27 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
 
 static void free_hotplug_pgtable_page(struct page *page)
 {
+	if (folio_test_pgtable(page_folio(page)))
+		pagetable_dtor(page_ptdesc(page));
+
 	free_hotplug_page_range(page, PAGE_SIZE, NULL);
 }
 

-- 
Catalin


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-22  7:15   ` Catalin Marinas
@ 2026-05-22  7:32     ` Catalin Marinas
  2026-05-22  9:36     ` Vishal Moola
  1 sibling, 0 replies; 13+ messages in thread
From: Catalin Marinas @ 2026-05-22  7:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will,
	david

On Fri, May 22, 2026 at 08:15:09AM +0100, Catalin Marinas wrote:
> On Thu, May 21, 2026 at 03:31:30PM -0700, Andrew Morton wrote:
> > On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote:
> > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > index 8e1d80a7033e..0c24fe650e95 100644
> > > --- a/arch/arm64/mm/mmu.c
> > > +++ b/arch/arm64/mm/mmu.c
> > > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
> > >  
> > >  static void free_hotplug_pgtable_page(struct page *page)
> > >  {
> > > +	pagetable_dtor(page_ptdesc(page));
> > >  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> > >  }
> > 
> > I'd of course prefer that arm maintainers handle this.  But
> > 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to
> > fix it.
> 
> That's fine but Sashiko has some points:
> 
> https://sashiko.dev/#/patchset/20260521032730.2104017-1-apopple@nvidia.com

The other Sashiko find looks like a false positive. vmemmap_*_populate()
do not allocate the page table from altmap, only the page pointed at by
the vmemmap pte.

-- 
Catalin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-22  7:15   ` Catalin Marinas
  2026-05-22  7:32     ` Catalin Marinas
@ 2026-05-22  9:36     ` Vishal Moola
  2026-05-26 11:54       ` Kevin Brodsky
  1 sibling, 1 reply; 13+ messages in thread
From: Vishal Moola @ 2026-05-22  9:36 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel,
	linux-mm, will, david

On Fri, May 22, 2026 at 08:15:09AM +0100, Catalin Marinas wrote:
> On Thu, May 21, 2026 at 03:31:30PM -0700, Andrew Morton wrote:
> > On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote:
> > > Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in
> > > __create_pgd_mapping()") page-table allocation on ARM64 always
> > > calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type
> > > to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL.
> > > However the matching pagetable_dtor() calls were never added.
> > > 
> > > With DEBUG_VM enabled on kernel versions prior to v6.17 without
> > > 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page
> > > type") this leads to the following warning when freeing these pages due
> > > to page->page_type sharing page->_mapcount:
> > > 
> > >   BUG: Bad page state in process ... pfn:284fbb
> > >   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb
> > >   flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
> > >   page_type: f2(table)
> > >   page dumped because: nonzero mapcount
> > >   Call trace:
> > >    bad_page+0x13c/0x160
> > >    __free_frozen_pages+0x6cc/0x860
> > >    ___free_pages+0xf4/0x180
> > >    free_pages+0x54/0x80
> > >    free_hotplug_page_range.part.0+0x58/0x90
> > >    free_empty_tables+0x438/0x500
> > >    __remove_pgd_mapping.constprop.0+0x60/0xa8
> > >    arch_remove_memory+0x48/0x80
> > >    try_remove_memory+0x158/0x1d8
> > >    offline_and_remove_memory+0x138/0x180
> > > 
> > > It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS
> > > is defined and incorrect NR_PAGETABLE stats. Fix this by calling
> > > pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the
> > > page to undo the effects of calling pagetable_*_ctor().
> > > 
> > > Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()")
> > 
> > 6.16+, so I assume we want cc:stable here.
> > 
> > >  arch/arm64/mm/mmu.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > index 8e1d80a7033e..0c24fe650e95 100644
> > > --- a/arch/arm64/mm/mmu.c
> > > +++ b/arch/arm64/mm/mmu.c
> > > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
> > >  
> > >  static void free_hotplug_pgtable_page(struct page *page)
> > >  {
> > > +	pagetable_dtor(page_ptdesc(page));
> > >  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> > >  }
> > 
> > I'd of course prefer that arm maintainers handle this.  But
> > 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to
> > fix it.
> 
> That's fine but Sashiko has some points:
> 
> https://sashiko.dev/#/patchset/20260521032730.2104017-1-apopple@nvidia.com
> 
> The __remove_pgd_mapping() path is fine but we also have the
> vmemmap_free() path where the constructor was never called.
> 
> We could pass around a bool dtor argument but I wonder whether we could
> just check it's a pgtable page:

Free_empty_tables() looks like the only way we'd ever get to
free_hotplug_pgtable_page(). I'm a little curious why we can't
consolidate unmap_hotplug_range() and free_empty_tables().
I.e. just fold unmap_hotplug_range() into the latter.

> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 4c8959153ac4..9d42cbddce27 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
>  
>  static void free_hotplug_pgtable_page(struct page *page)
>  {
> +	if (folio_test_pgtable(page_folio(page)))

This should work.

> +		pagetable_dtor(page_ptdesc(page));
> +
>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);

In the case we presumably have a page table page (ptdesc) at this
point, we should really be freeing it with pagetable_free() as well.

Its not a big deal that we don't right now, but losing track of the
matching allocation/free sites will become a headache when separately
allocating from struct page.

>  }
>  
> 
> -- 
> Catalin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-22  9:36     ` Vishal Moola
@ 2026-05-26 11:54       ` Kevin Brodsky
  2026-05-26 12:31         ` David Hildenbrand (Arm)
  2026-05-26 15:07         ` Will Deacon
  0 siblings, 2 replies; 13+ messages in thread
From: Kevin Brodsky @ 2026-05-26 11:54 UTC (permalink / raw)
  To: Vishal Moola, Catalin Marinas
  Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel,
	linux-mm, will, david

On 22/05/2026 11:36, Vishal Moola wrote:
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 4c8959153ac4..9d42cbddce27 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
>>  
>>  static void free_hotplug_pgtable_page(struct page *page)
>>  {
>> +	if (folio_test_pgtable(page_folio(page)))
> This should work.
>
>> +		pagetable_dtor(page_ptdesc(page));
>> +
>>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> In the case we presumably have a page table page (ptdesc) at this
> point, we should really be freeing it with pagetable_free() as well.

Agreed, I think this is the right thing to do, something like:

if (folio_test_pgtable(page_folio(page)))
pagetable_dtor_free(page_ptdesc(page)); else
free_hotplug_page_range(page, PAGE_SIZE, NULL);


Strangely enough x86 calls pagetable_free() in both cases.

My series protecting page tables with pkeys has a patch [1] to get
vmemmap to allocate page tables with pagetable_alloc(). The diff above
will require pagetable_*_ctor() to be called as well, but I think that's
the right thing to do anyway. That could be posted as a separate series,
but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc().

- Kevin

[1] https://lore.kernel.org/all/20260526-kpkeys-v8-14-eaaacdacc67c@arm.com/

> Its not a big deal that we don't right now, but losing track of the
> matching allocation/free sites will become a headache when separately
> allocating from struct page.
>
>>  }


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-26 11:54       ` Kevin Brodsky
@ 2026-05-26 12:31         ` David Hildenbrand (Arm)
  2026-05-27  7:34           ` Kevin Brodsky
  2026-05-26 15:07         ` Will Deacon
  1 sibling, 1 reply; 13+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-26 12:31 UTC (permalink / raw)
  To: Kevin Brodsky, Vishal Moola, Catalin Marinas
  Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel,
	linux-mm, will

On 5/26/26 13:54, Kevin Brodsky wrote:
> On 22/05/2026 11:36, Vishal Moola wrote:
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 4c8959153ac4..9d42cbddce27 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
>>>  
>>>  static void free_hotplug_pgtable_page(struct page *page)
>>>  {
>>> +	if (folio_test_pgtable(page_folio(page)))
>> This should work.
>>
>>> +		pagetable_dtor(page_ptdesc(page));
>>> +
>>>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
>> In the case we presumably have a page table page (ptdesc) at this
>> point, we should really be freeing it with pagetable_free() as well.
> 
> Agreed, I think this is the right thing to do, something like:
> 
> if (folio_test_pgtable(page_folio(page)))
> pagetable_dtor_free(page_ptdesc(page)); else
> free_hotplug_page_range(page, PAGE_SIZE, NULL);

That code pattern is wrong.

folio_test_pgtable() shouldn't exist.

In the future, something is either a pgtable or a folio, not both.

So check the type against the page, not the folio.

-- 
Cheers,

David


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-26 11:54       ` Kevin Brodsky
  2026-05-26 12:31         ` David Hildenbrand (Arm)
@ 2026-05-26 15:07         ` Will Deacon
  2026-05-27  7:35           ` Kevin Brodsky
  1 sibling, 1 reply; 13+ messages in thread
From: Will Deacon @ 2026-05-26 15:07 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: Vishal Moola, Catalin Marinas, Andrew Morton, Alistair Popple,
	linux-arm-kernel, linux-kernel, linux-mm, david

On Tue, May 26, 2026 at 01:54:00PM +0200, Kevin Brodsky wrote:
> On 22/05/2026 11:36, Vishal Moola wrote:
> >> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >> index 4c8959153ac4..9d42cbddce27 100644
> >> --- a/arch/arm64/mm/mmu.c
> >> +++ b/arch/arm64/mm/mmu.c
> >> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
> >>  
> >>  static void free_hotplug_pgtable_page(struct page *page)
> >>  {
> >> +	if (folio_test_pgtable(page_folio(page)))
> > This should work.
> >
> >> +		pagetable_dtor(page_ptdesc(page));
> >> +
> >>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> > In the case we presumably have a page table page (ptdesc) at this
> > point, we should really be freeing it with pagetable_free() as well.
> 
> Agreed, I think this is the right thing to do, something like:
> 
> if (folio_test_pgtable(page_folio(page)))
> pagetable_dtor_free(page_ptdesc(page)); else
> free_hotplug_page_range(page, PAGE_SIZE, NULL);
> 
> 
> Strangely enough x86 calls pagetable_free() in both cases.
> 
> My series protecting page tables with pkeys has a patch [1] to get
> vmemmap to allocate page tables with pagetable_alloc(). The diff above
> will require pagetable_*_ctor() to be called as well, but I think that's
> the right thing to do anyway. That could be posted as a separate series,
> but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc().

I agree that calling the ctor()/dtor() functions consistently is the
cleanest approach and that will need something like your patch to call
the constructor from vmemmap_alloc_block_zero(). Trying to elide these
calls for the page-table pages used to map the altmap just feels odd to
me, as there isn't anything particularly special about them afaik.

Will


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-26 12:31         ` David Hildenbrand (Arm)
@ 2026-05-27  7:34           ` Kevin Brodsky
  2026-05-27  9:22             ` Vishal Moola
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Brodsky @ 2026-05-27  7:34 UTC (permalink / raw)
  To: David Hildenbrand (Arm), Vishal Moola, Catalin Marinas
  Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel,
	linux-mm, will

On 26/05/2026 14:31, David Hildenbrand (Arm) wrote:
> On 5/26/26 13:54, Kevin Brodsky wrote:
>> On 22/05/2026 11:36, Vishal Moola wrote:
>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>>> index 4c8959153ac4..9d42cbddce27 100644
>>>> --- a/arch/arm64/mm/mmu.c
>>>> +++ b/arch/arm64/mm/mmu.c
>>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
>>>>  
>>>>  static void free_hotplug_pgtable_page(struct page *page)
>>>>  {
>>>> +	if (folio_test_pgtable(page_folio(page)))
>>> This should work.
>>>
>>>> +		pagetable_dtor(page_ptdesc(page));
>>>> +
>>>>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
>>> In the case we presumably have a page table page (ptdesc) at this
>>> point, we should really be freeing it with pagetable_free() as well.
>> Agreed, I think this is the right thing to do, something like:
>>
>> if (folio_test_pgtable(page_folio(page)))
>> pagetable_dtor_free(page_ptdesc(page)); else
>> free_hotplug_page_range(page, PAGE_SIZE, NULL);
> That code pattern is wrong.
>
> folio_test_pgtable() shouldn't exist.
>
> In the future, something is either a pgtable or a folio, not both.
>
> So check the type against the page, not the folio.

In other words use PageTable(page) instead? Interestingly I can see a
few calls to folio_test_pgtable() across the kernel but none to
PageTable(), maybe just an antipattern then? The ctor/dtor also use
__folio_{set,clear}_pgtable().

- Kevin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-26 15:07         ` Will Deacon
@ 2026-05-27  7:35           ` Kevin Brodsky
  2026-05-27  9:30             ` Vishal Moola
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Brodsky @ 2026-05-27  7:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: Vishal Moola, Catalin Marinas, Andrew Morton, Alistair Popple,
	linux-arm-kernel, linux-kernel, linux-mm, david

On 26/05/2026 17:07, Will Deacon wrote:
> On Tue, May 26, 2026 at 01:54:00PM +0200, Kevin Brodsky wrote:
>> On 22/05/2026 11:36, Vishal Moola wrote:
>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>>> index 4c8959153ac4..9d42cbddce27 100644
>>>> --- a/arch/arm64/mm/mmu.c
>>>> +++ b/arch/arm64/mm/mmu.c
>>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
>>>>  
>>>>  static void free_hotplug_pgtable_page(struct page *page)
>>>>  {
>>>> +	if (folio_test_pgtable(page_folio(page)))
>>> This should work.
>>>
>>>> +		pagetable_dtor(page_ptdesc(page));
>>>> +
>>>>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
>>> In the case we presumably have a page table page (ptdesc) at this
>>> point, we should really be freeing it with pagetable_free() as well.
>> Agreed, I think this is the right thing to do, something like:
>>
>> if (folio_test_pgtable(page_folio(page)))
>> pagetable_dtor_free(page_ptdesc(page)); else
>> free_hotplug_page_range(page, PAGE_SIZE, NULL);
>>
>>
>> Strangely enough x86 calls pagetable_free() in both cases.
>>
>> My series protecting page tables with pkeys has a patch [1] to get
>> vmemmap to allocate page tables with pagetable_alloc(). The diff above
>> will require pagetable_*_ctor() to be called as well, but I think that's
>> the right thing to do anyway. That could be posted as a separate series,
>> but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc().
> I agree that calling the ctor()/dtor() functions consistently is the
> cleanest approach and that will need something like your patch to call
> the constructor from vmemmap_alloc_block_zero(). Trying to elide these
> calls for the page-table pages used to map the altmap just feels odd to
> me, as there isn't anything particularly special about them afaik.

I don't think they're really special either, most likely they just got
missed/ignored for the purpose of ctor/dtor like many other kernel page
tables (until recently).

I'll prepare a series refactoring that code then - that will also
require changing most arch implementations of vmemmap_free() to call
pagetable_dtor_free().

In the meantime we should probably use the logic above to avoid the BUG
that Alistair reported.

- Kevin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-27  7:34           ` Kevin Brodsky
@ 2026-05-27  9:22             ` Vishal Moola
  0 siblings, 0 replies; 13+ messages in thread
From: Vishal Moola @ 2026-05-27  9:22 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: David Hildenbrand (Arm), Catalin Marinas, Andrew Morton,
	Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will

On Wed, May 27, 2026 at 09:34:19AM +0200, Kevin Brodsky wrote:
> On 26/05/2026 14:31, David Hildenbrand (Arm) wrote:
> > On 5/26/26 13:54, Kevin Brodsky wrote:
> >> On 22/05/2026 11:36, Vishal Moola wrote:
> >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >>>> index 4c8959153ac4..9d42cbddce27 100644
> >>>> --- a/arch/arm64/mm/mmu.c
> >>>> +++ b/arch/arm64/mm/mmu.c
> >>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
> >>>>  
> >>>>  static void free_hotplug_pgtable_page(struct page *page)
> >>>>  {
> >>>> +	if (folio_test_pgtable(page_folio(page)))
> >>> This should work.
> >>>
> >>>> +		pagetable_dtor(page_ptdesc(page));
> >>>> +
> >>>>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> >>> In the case we presumably have a page table page (ptdesc) at this
> >>> point, we should really be freeing it with pagetable_free() as well.
> >> Agreed, I think this is the right thing to do, something like:
> >>
> >> if (folio_test_pgtable(page_folio(page)))
> >> pagetable_dtor_free(page_ptdesc(page)); else
> >> free_hotplug_page_range(page, PAGE_SIZE, NULL);
> > That code pattern is wrong.
> >
> > folio_test_pgtable() shouldn't exist.
> >
> > In the future, something is either a pgtable or a folio, not both.
> >
> > So check the type against the page, not the folio.
> 
> In other words use PageTable(page) instead? Interestingly I can see a
> few calls to folio_test_pgtable() across the kernel but none to
> PageTable(), maybe just an antipattern then? The ctor/dtor also use
> __folio_{set,clear}_pgtable().

If we know for sure we have the head page (which we probably do here), we
should use PageTable(page).

I included the folio APIs as a defensive way to ensure large order page
table users don't break. We only set the type on the head page, and using
a folio guarantees we're accessing that head page. This can go away as soon
as someone looks at the architectures that allocate large order ptdescs.

Also, most (if not all) 'pgtables' don't use ptdescs yet, but they
probably should... Anyway, in this particular case, it looks like pgtable
is just a name symbolizing 'page table' and not our type 'pgtable_t' anyway
so I'm just rambling.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
  2026-05-27  7:35           ` Kevin Brodsky
@ 2026-05-27  9:30             ` Vishal Moola
  0 siblings, 0 replies; 13+ messages in thread
From: Vishal Moola @ 2026-05-27  9:30 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: Will Deacon, Catalin Marinas, Andrew Morton, Alistair Popple,
	linux-arm-kernel, linux-kernel, linux-mm, david

On Wed, May 27, 2026 at 09:35:50AM +0200, Kevin Brodsky wrote:
> On 26/05/2026 17:07, Will Deacon wrote:
> > On Tue, May 26, 2026 at 01:54:00PM +0200, Kevin Brodsky wrote:
> >> On 22/05/2026 11:36, Vishal Moola wrote:
> >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >>>> index 4c8959153ac4..9d42cbddce27 100644
> >>>> --- a/arch/arm64/mm/mmu.c
> >>>> +++ b/arch/arm64/mm/mmu.c
> >>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size,
> >>>>  
> >>>>  static void free_hotplug_pgtable_page(struct page *page)
> >>>>  {
> >>>> +	if (folio_test_pgtable(page_folio(page)))
> >>> This should work.
> >>>
> >>>> +		pagetable_dtor(page_ptdesc(page));
> >>>> +
> >>>>  	free_hotplug_page_range(page, PAGE_SIZE, NULL);
> >>> In the case we presumably have a page table page (ptdesc) at this
> >>> point, we should really be freeing it with pagetable_free() as well.
> >> Agreed, I think this is the right thing to do, something like:
> >>
> >> if (folio_test_pgtable(page_folio(page)))
> >> pagetable_dtor_free(page_ptdesc(page)); else
> >> free_hotplug_page_range(page, PAGE_SIZE, NULL);
> >>
> >>
> >> Strangely enough x86 calls pagetable_free() in both cases.
> >>
> >> My series protecting page tables with pkeys has a patch [1] to get
> >> vmemmap to allocate page tables with pagetable_alloc(). The diff above
> >> will require pagetable_*_ctor() to be called as well, but I think that's
> >> the right thing to do anyway. That could be posted as a separate series,
> >> but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc().
> > I agree that calling the ctor()/dtor() functions consistently is the
> > cleanest approach and that will need something like your patch to call
> > the constructor from vmemmap_alloc_block_zero(). Trying to elide these
> > calls for the page-table pages used to map the altmap just feels odd to
> > me, as there isn't anything particularly special about them afaik.
> 
> I don't think they're really special either, most likely they just got
> missed/ignored for the purpose of ctor/dtor like many other kernel page
> tables (until recently).
>
> I'll prepare a series refactoring that code then - that will also
> require changing most arch implementations of vmemmap_free() to call
> pagetable_dtor_free().

Take a look at Matthew's series[1]. I think thats the ideal approach for
page table accounting. He hasn't had time to iterate on it though. I
doubt he'd mind if someone picked it up.

> In the meantime we should probably use the logic above to avoid the BUG
> that Alistair reported.

Agreed.

[1] https://lore.kernel.org/linux-mm/20251113140448.1814860-4-willy@infradead.org/


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-05-27  9:30 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-21  3:27 [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables Alistair Popple
2026-05-21 22:31 ` Andrew Morton
2026-05-21 23:50   ` Alistair Popple
2026-05-22  7:15   ` Catalin Marinas
2026-05-22  7:32     ` Catalin Marinas
2026-05-22  9:36     ` Vishal Moola
2026-05-26 11:54       ` Kevin Brodsky
2026-05-26 12:31         ` David Hildenbrand (Arm)
2026-05-27  7:34           ` Kevin Brodsky
2026-05-27  9:22             ` Vishal Moola
2026-05-26 15:07         ` Will Deacon
2026-05-27  7:35           ` Kevin Brodsky
2026-05-27  9:30             ` Vishal Moola

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox