* [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables
@ 2026-05-21 3:27 Alistair Popple
2026-05-21 22:31 ` Andrew Morton
0 siblings, 1 reply; 13+ messages in thread
From: Alistair Popple @ 2026-05-21 3:27 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-kernel, linux-mm, catalin.marinas, will, david, akpm,
Alistair Popple
Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in
__create_pgd_mapping()") page-table allocation on ARM64 always
calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type
to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL.
However the matching pagetable_dtor() calls were never added.
With DEBUG_VM enabled on kernel versions prior to v6.17 without
2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page
type") this leads to the following warning when freeing these pages due
to page->page_type sharing page->_mapcount:
BUG: Bad page state in process ... pfn:284fbb
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb
flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
page_type: f2(table)
page dumped because: nonzero mapcount
Call trace:
bad_page+0x13c/0x160
__free_frozen_pages+0x6cc/0x860
___free_pages+0xf4/0x180
free_pages+0x54/0x80
free_hotplug_page_range.part.0+0x58/0x90
free_empty_tables+0x438/0x500
__remove_pgd_mapping.constprop.0+0x60/0xa8
arch_remove_memory+0x48/0x80
try_remove_memory+0x158/0x1d8
offline_and_remove_memory+0x138/0x180
It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS
is defined and incorrect NR_PAGETABLE stats. Fix this by calling
pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the
page to undo the effects of calling pagetable_*_ctor().
Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
arch/arm64/mm/mmu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8e1d80a7033e..0c24fe650e95 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size,
static void free_hotplug_pgtable_page(struct page *page)
{
+ pagetable_dtor(page_ptdesc(page));
free_hotplug_page_range(page, PAGE_SIZE, NULL);
}
--
2.54.0
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-21 3:27 [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables Alistair Popple @ 2026-05-21 22:31 ` Andrew Morton 2026-05-21 23:50 ` Alistair Popple 2026-05-22 7:15 ` Catalin Marinas 0 siblings, 2 replies; 13+ messages in thread From: Andrew Morton @ 2026-05-21 22:31 UTC (permalink / raw) To: Alistair Popple Cc: linux-arm-kernel, linux-kernel, linux-mm, catalin.marinas, will, david On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote: > Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in > __create_pgd_mapping()") page-table allocation on ARM64 always > calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type > to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL. > However the matching pagetable_dtor() calls were never added. > > With DEBUG_VM enabled on kernel versions prior to v6.17 without > 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page > type") this leads to the following warning when freeing these pages due > to page->page_type sharing page->_mapcount: > > BUG: Bad page state in process ... pfn:284fbb > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb > flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff) > page_type: f2(table) > page dumped because: nonzero mapcount > Call trace: > bad_page+0x13c/0x160 > __free_frozen_pages+0x6cc/0x860 > ___free_pages+0xf4/0x180 > free_pages+0x54/0x80 > free_hotplug_page_range.part.0+0x58/0x90 > free_empty_tables+0x438/0x500 > __remove_pgd_mapping.constprop.0+0x60/0xa8 > arch_remove_memory+0x48/0x80 > try_remove_memory+0x158/0x1d8 > offline_and_remove_memory+0x138/0x180 > > It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS > is defined and incorrect NR_PAGETABLE stats. Fix this by calling > pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the > page to undo the effects of calling pagetable_*_ctor(). > > Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()") 6.16+, so I assume we want cc:stable here. > arch/arm64/mm/mmu.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 8e1d80a7033e..0c24fe650e95 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size, > > static void free_hotplug_pgtable_page(struct page *page) > { > + pagetable_dtor(page_ptdesc(page)); > free_hotplug_page_range(page, PAGE_SIZE, NULL); > } I'd of course prefer that arm maintainers handle this. But 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to fix it. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-21 22:31 ` Andrew Morton @ 2026-05-21 23:50 ` Alistair Popple 2026-05-22 7:15 ` Catalin Marinas 1 sibling, 0 replies; 13+ messages in thread From: Alistair Popple @ 2026-05-21 23:50 UTC (permalink / raw) To: Andrew Morton Cc: linux-arm-kernel, linux-kernel, linux-mm, catalin.marinas, will, david On 2026-05-22 at 08:31 +1000, Andrew Morton <akpm@linux-foundation.org> wrote... > On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote: > > > Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in > > __create_pgd_mapping()") page-table allocation on ARM64 always > > calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type > > to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL. > > However the matching pagetable_dtor() calls were never added. > > > > With DEBUG_VM enabled on kernel versions prior to v6.17 without > > 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page > > type") this leads to the following warning when freeing these pages due > > to page->page_type sharing page->_mapcount: > > > > BUG: Bad page state in process ... pfn:284fbb > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb > > flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff) > > page_type: f2(table) > > page dumped because: nonzero mapcount > > Call trace: > > bad_page+0x13c/0x160 > > __free_frozen_pages+0x6cc/0x860 > > ___free_pages+0xf4/0x180 > > free_pages+0x54/0x80 > > free_hotplug_page_range.part.0+0x58/0x90 > > free_empty_tables+0x438/0x500 > > __remove_pgd_mapping.constprop.0+0x60/0xa8 > > arch_remove_memory+0x48/0x80 > > try_remove_memory+0x158/0x1d8 > > offline_and_remove_memory+0x138/0x180 > > > > It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS > > is defined and incorrect NR_PAGETABLE stats. Fix this by calling > > pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the > > page to undo the effects of calling pagetable_*_ctor(). > > > > Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()") > > 6.16+, so I assume we want cc:stable here. Yes indeed. Sorry I forgot to do that but I can see you added it so thanks for that. - Alistair > > arch/arm64/mm/mmu.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > > index 8e1d80a7033e..0c24fe650e95 100644 > > --- a/arch/arm64/mm/mmu.c > > +++ b/arch/arm64/mm/mmu.c > > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size, > > > > static void free_hotplug_pgtable_page(struct page *page) > > { > > + pagetable_dtor(page_ptdesc(page)); > > free_hotplug_page_range(page, PAGE_SIZE, NULL); > > } > > I'd of course prefer that arm maintainers handle this. But > 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to > fix it. > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-21 22:31 ` Andrew Morton 2026-05-21 23:50 ` Alistair Popple @ 2026-05-22 7:15 ` Catalin Marinas 2026-05-22 7:32 ` Catalin Marinas 2026-05-22 9:36 ` Vishal Moola 1 sibling, 2 replies; 13+ messages in thread From: Catalin Marinas @ 2026-05-22 7:15 UTC (permalink / raw) To: Andrew Morton Cc: Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will, david On Thu, May 21, 2026 at 03:31:30PM -0700, Andrew Morton wrote: > On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote: > > Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in > > __create_pgd_mapping()") page-table allocation on ARM64 always > > calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type > > to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL. > > However the matching pagetable_dtor() calls were never added. > > > > With DEBUG_VM enabled on kernel versions prior to v6.17 without > > 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page > > type") this leads to the following warning when freeing these pages due > > to page->page_type sharing page->_mapcount: > > > > BUG: Bad page state in process ... pfn:284fbb > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb > > flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff) > > page_type: f2(table) > > page dumped because: nonzero mapcount > > Call trace: > > bad_page+0x13c/0x160 > > __free_frozen_pages+0x6cc/0x860 > > ___free_pages+0xf4/0x180 > > free_pages+0x54/0x80 > > free_hotplug_page_range.part.0+0x58/0x90 > > free_empty_tables+0x438/0x500 > > __remove_pgd_mapping.constprop.0+0x60/0xa8 > > arch_remove_memory+0x48/0x80 > > try_remove_memory+0x158/0x1d8 > > offline_and_remove_memory+0x138/0x180 > > > > It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS > > is defined and incorrect NR_PAGETABLE stats. Fix this by calling > > pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the > > page to undo the effects of calling pagetable_*_ctor(). > > > > Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()") > > 6.16+, so I assume we want cc:stable here. > > > arch/arm64/mm/mmu.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > > index 8e1d80a7033e..0c24fe650e95 100644 > > --- a/arch/arm64/mm/mmu.c > > +++ b/arch/arm64/mm/mmu.c > > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size, > > > > static void free_hotplug_pgtable_page(struct page *page) > > { > > + pagetable_dtor(page_ptdesc(page)); > > free_hotplug_page_range(page, PAGE_SIZE, NULL); > > } > > I'd of course prefer that arm maintainers handle this. But > 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to > fix it. That's fine but Sashiko has some points: https://sashiko.dev/#/patchset/20260521032730.2104017-1-apopple@nvidia.com The __remove_pgd_mapping() path is fine but we also have the vmemmap_free() path where the constructor was never called. We could pass around a bool dtor argument but I wonder whether we could just check it's a pgtable page: diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 4c8959153ac4..9d42cbddce27 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, static void free_hotplug_pgtable_page(struct page *page) { + if (folio_test_pgtable(page_folio(page))) + pagetable_dtor(page_ptdesc(page)); + free_hotplug_page_range(page, PAGE_SIZE, NULL); } -- Catalin ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-22 7:15 ` Catalin Marinas @ 2026-05-22 7:32 ` Catalin Marinas 2026-05-22 9:36 ` Vishal Moola 1 sibling, 0 replies; 13+ messages in thread From: Catalin Marinas @ 2026-05-22 7:32 UTC (permalink / raw) To: Andrew Morton Cc: Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will, david On Fri, May 22, 2026 at 08:15:09AM +0100, Catalin Marinas wrote: > On Thu, May 21, 2026 at 03:31:30PM -0700, Andrew Morton wrote: > > On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote: > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > > > index 8e1d80a7033e..0c24fe650e95 100644 > > > --- a/arch/arm64/mm/mmu.c > > > +++ b/arch/arm64/mm/mmu.c > > > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size, > > > > > > static void free_hotplug_pgtable_page(struct page *page) > > > { > > > + pagetable_dtor(page_ptdesc(page)); > > > free_hotplug_page_range(page, PAGE_SIZE, NULL); > > > } > > > > I'd of course prefer that arm maintainers handle this. But > > 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to > > fix it. > > That's fine but Sashiko has some points: > > https://sashiko.dev/#/patchset/20260521032730.2104017-1-apopple@nvidia.com The other Sashiko find looks like a false positive. vmemmap_*_populate() do not allocate the page table from altmap, only the page pointed at by the vmemmap pte. -- Catalin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-22 7:15 ` Catalin Marinas 2026-05-22 7:32 ` Catalin Marinas @ 2026-05-22 9:36 ` Vishal Moola 2026-05-26 11:54 ` Kevin Brodsky 1 sibling, 1 reply; 13+ messages in thread From: Vishal Moola @ 2026-05-22 9:36 UTC (permalink / raw) To: Catalin Marinas Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will, david On Fri, May 22, 2026 at 08:15:09AM +0100, Catalin Marinas wrote: > On Thu, May 21, 2026 at 03:31:30PM -0700, Andrew Morton wrote: > > On Thu, 21 May 2026 13:27:30 +1000 Alistair Popple <apopple@nvidia.com> wrote: > > > Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in > > > __create_pgd_mapping()") page-table allocation on ARM64 always > > > calls pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type > > > to PGTY_table, increments NR_PAGETABLE and possible allocates a PTL. > > > However the matching pagetable_dtor() calls were never added. > > > > > > With DEBUG_VM enabled on kernel versions prior to v6.17 without > > > 2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page > > > type") this leads to the following warning when freeing these pages due > > > to page->page_type sharing page->_mapcount: > > > > > > BUG: Bad page state in process ... pfn:284fbb > > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb > > > flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff) > > > page_type: f2(table) > > > page dumped because: nonzero mapcount > > > Call trace: > > > bad_page+0x13c/0x160 > > > __free_frozen_pages+0x6cc/0x860 > > > ___free_pages+0xf4/0x180 > > > free_pages+0x54/0x80 > > > free_hotplug_page_range.part.0+0x58/0x90 > > > free_empty_tables+0x438/0x500 > > > __remove_pgd_mapping.constprop.0+0x60/0xa8 > > > arch_remove_memory+0x48/0x80 > > > try_remove_memory+0x158/0x1d8 > > > offline_and_remove_memory+0x138/0x180 > > > > > > It can also lead to leaking the ptl allocation if ALLOC_SPLIT_PTLOCKS > > > is defined and incorrect NR_PAGETABLE stats. Fix this by calling > > > pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the > > > page to undo the effects of calling pagetable_*_ctor(). > > > > > > Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()") > > > > 6.16+, so I assume we want cc:stable here. > > > > > arch/arm64/mm/mmu.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > > > index 8e1d80a7033e..0c24fe650e95 100644 > > > --- a/arch/arm64/mm/mmu.c > > > +++ b/arch/arm64/mm/mmu.c > > > @@ -1422,6 +1422,7 @@ static void free_hotplug_page_range(struct page *page, size_t size, > > > > > > static void free_hotplug_pgtable_page(struct page *page) > > > { > > > + pagetable_dtor(page_ptdesc(page)); > > > free_hotplug_page_range(page, PAGE_SIZE, NULL); > > > } > > > > I'd of course prefer that arm maintainers handle this. But > > 5e8eb9aeeda3 came via myself so convention kinda-dictates that I get to > > fix it. > > That's fine but Sashiko has some points: > > https://sashiko.dev/#/patchset/20260521032730.2104017-1-apopple@nvidia.com > > The __remove_pgd_mapping() path is fine but we also have the > vmemmap_free() path where the constructor was never called. > > We could pass around a bool dtor argument but I wonder whether we could > just check it's a pgtable page: Free_empty_tables() looks like the only way we'd ever get to free_hotplug_pgtable_page(). I'm a little curious why we can't consolidate unmap_hotplug_range() and free_empty_tables(). I.e. just fold unmap_hotplug_range() into the latter. > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 4c8959153ac4..9d42cbddce27 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, > > static void free_hotplug_pgtable_page(struct page *page) > { > + if (folio_test_pgtable(page_folio(page))) This should work. > + pagetable_dtor(page_ptdesc(page)); > + > free_hotplug_page_range(page, PAGE_SIZE, NULL); In the case we presumably have a page table page (ptdesc) at this point, we should really be freeing it with pagetable_free() as well. Its not a big deal that we don't right now, but losing track of the matching allocation/free sites will become a headache when separately allocating from struct page. > } > > > -- > Catalin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-22 9:36 ` Vishal Moola @ 2026-05-26 11:54 ` Kevin Brodsky 2026-05-26 12:31 ` David Hildenbrand (Arm) 2026-05-26 15:07 ` Will Deacon 0 siblings, 2 replies; 13+ messages in thread From: Kevin Brodsky @ 2026-05-26 11:54 UTC (permalink / raw) To: Vishal Moola, Catalin Marinas Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will, david On 22/05/2026 11:36, Vishal Moola wrote: >> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c >> index 4c8959153ac4..9d42cbddce27 100644 >> --- a/arch/arm64/mm/mmu.c >> +++ b/arch/arm64/mm/mmu.c >> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, >> >> static void free_hotplug_pgtable_page(struct page *page) >> { >> + if (folio_test_pgtable(page_folio(page))) > This should work. > >> + pagetable_dtor(page_ptdesc(page)); >> + >> free_hotplug_page_range(page, PAGE_SIZE, NULL); > In the case we presumably have a page table page (ptdesc) at this > point, we should really be freeing it with pagetable_free() as well. Agreed, I think this is the right thing to do, something like: if (folio_test_pgtable(page_folio(page))) pagetable_dtor_free(page_ptdesc(page)); else free_hotplug_page_range(page, PAGE_SIZE, NULL); Strangely enough x86 calls pagetable_free() in both cases. My series protecting page tables with pkeys has a patch [1] to get vmemmap to allocate page tables with pagetable_alloc(). The diff above will require pagetable_*_ctor() to be called as well, but I think that's the right thing to do anyway. That could be posted as a separate series, but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc(). - Kevin [1] https://lore.kernel.org/all/20260526-kpkeys-v8-14-eaaacdacc67c@arm.com/ > Its not a big deal that we don't right now, but losing track of the > matching allocation/free sites will become a headache when separately > allocating from struct page. > >> } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-26 11:54 ` Kevin Brodsky @ 2026-05-26 12:31 ` David Hildenbrand (Arm) 2026-05-27 7:34 ` Kevin Brodsky 2026-05-26 15:07 ` Will Deacon 1 sibling, 1 reply; 13+ messages in thread From: David Hildenbrand (Arm) @ 2026-05-26 12:31 UTC (permalink / raw) To: Kevin Brodsky, Vishal Moola, Catalin Marinas Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will On 5/26/26 13:54, Kevin Brodsky wrote: > On 22/05/2026 11:36, Vishal Moola wrote: >>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c >>> index 4c8959153ac4..9d42cbddce27 100644 >>> --- a/arch/arm64/mm/mmu.c >>> +++ b/arch/arm64/mm/mmu.c >>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, >>> >>> static void free_hotplug_pgtable_page(struct page *page) >>> { >>> + if (folio_test_pgtable(page_folio(page))) >> This should work. >> >>> + pagetable_dtor(page_ptdesc(page)); >>> + >>> free_hotplug_page_range(page, PAGE_SIZE, NULL); >> In the case we presumably have a page table page (ptdesc) at this >> point, we should really be freeing it with pagetable_free() as well. > > Agreed, I think this is the right thing to do, something like: > > if (folio_test_pgtable(page_folio(page))) > pagetable_dtor_free(page_ptdesc(page)); else > free_hotplug_page_range(page, PAGE_SIZE, NULL); That code pattern is wrong. folio_test_pgtable() shouldn't exist. In the future, something is either a pgtable or a folio, not both. So check the type against the page, not the folio. -- Cheers, David ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-26 12:31 ` David Hildenbrand (Arm) @ 2026-05-27 7:34 ` Kevin Brodsky 2026-05-27 9:22 ` Vishal Moola 0 siblings, 1 reply; 13+ messages in thread From: Kevin Brodsky @ 2026-05-27 7:34 UTC (permalink / raw) To: David Hildenbrand (Arm), Vishal Moola, Catalin Marinas Cc: Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will On 26/05/2026 14:31, David Hildenbrand (Arm) wrote: > On 5/26/26 13:54, Kevin Brodsky wrote: >> On 22/05/2026 11:36, Vishal Moola wrote: >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c >>>> index 4c8959153ac4..9d42cbddce27 100644 >>>> --- a/arch/arm64/mm/mmu.c >>>> +++ b/arch/arm64/mm/mmu.c >>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, >>>> >>>> static void free_hotplug_pgtable_page(struct page *page) >>>> { >>>> + if (folio_test_pgtable(page_folio(page))) >>> This should work. >>> >>>> + pagetable_dtor(page_ptdesc(page)); >>>> + >>>> free_hotplug_page_range(page, PAGE_SIZE, NULL); >>> In the case we presumably have a page table page (ptdesc) at this >>> point, we should really be freeing it with pagetable_free() as well. >> Agreed, I think this is the right thing to do, something like: >> >> if (folio_test_pgtable(page_folio(page))) >> pagetable_dtor_free(page_ptdesc(page)); else >> free_hotplug_page_range(page, PAGE_SIZE, NULL); > That code pattern is wrong. > > folio_test_pgtable() shouldn't exist. > > In the future, something is either a pgtable or a folio, not both. > > So check the type against the page, not the folio. In other words use PageTable(page) instead? Interestingly I can see a few calls to folio_test_pgtable() across the kernel but none to PageTable(), maybe just an antipattern then? The ctor/dtor also use __folio_{set,clear}_pgtable(). - Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-27 7:34 ` Kevin Brodsky @ 2026-05-27 9:22 ` Vishal Moola 0 siblings, 0 replies; 13+ messages in thread From: Vishal Moola @ 2026-05-27 9:22 UTC (permalink / raw) To: Kevin Brodsky Cc: David Hildenbrand (Arm), Catalin Marinas, Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, will On Wed, May 27, 2026 at 09:34:19AM +0200, Kevin Brodsky wrote: > On 26/05/2026 14:31, David Hildenbrand (Arm) wrote: > > On 5/26/26 13:54, Kevin Brodsky wrote: > >> On 22/05/2026 11:36, Vishal Moola wrote: > >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > >>>> index 4c8959153ac4..9d42cbddce27 100644 > >>>> --- a/arch/arm64/mm/mmu.c > >>>> +++ b/arch/arm64/mm/mmu.c > >>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, > >>>> > >>>> static void free_hotplug_pgtable_page(struct page *page) > >>>> { > >>>> + if (folio_test_pgtable(page_folio(page))) > >>> This should work. > >>> > >>>> + pagetable_dtor(page_ptdesc(page)); > >>>> + > >>>> free_hotplug_page_range(page, PAGE_SIZE, NULL); > >>> In the case we presumably have a page table page (ptdesc) at this > >>> point, we should really be freeing it with pagetable_free() as well. > >> Agreed, I think this is the right thing to do, something like: > >> > >> if (folio_test_pgtable(page_folio(page))) > >> pagetable_dtor_free(page_ptdesc(page)); else > >> free_hotplug_page_range(page, PAGE_SIZE, NULL); > > That code pattern is wrong. > > > > folio_test_pgtable() shouldn't exist. > > > > In the future, something is either a pgtable or a folio, not both. > > > > So check the type against the page, not the folio. > > In other words use PageTable(page) instead? Interestingly I can see a > few calls to folio_test_pgtable() across the kernel but none to > PageTable(), maybe just an antipattern then? The ctor/dtor also use > __folio_{set,clear}_pgtable(). If we know for sure we have the head page (which we probably do here), we should use PageTable(page). I included the folio APIs as a defensive way to ensure large order page table users don't break. We only set the type on the head page, and using a folio guarantees we're accessing that head page. This can go away as soon as someone looks at the architectures that allocate large order ptdescs. Also, most (if not all) 'pgtables' don't use ptdescs yet, but they probably should... Anyway, in this particular case, it looks like pgtable is just a name symbolizing 'page table' and not our type 'pgtable_t' anyway so I'm just rambling. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-26 11:54 ` Kevin Brodsky 2026-05-26 12:31 ` David Hildenbrand (Arm) @ 2026-05-26 15:07 ` Will Deacon 2026-05-27 7:35 ` Kevin Brodsky 1 sibling, 1 reply; 13+ messages in thread From: Will Deacon @ 2026-05-26 15:07 UTC (permalink / raw) To: Kevin Brodsky Cc: Vishal Moola, Catalin Marinas, Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, david On Tue, May 26, 2026 at 01:54:00PM +0200, Kevin Brodsky wrote: > On 22/05/2026 11:36, Vishal Moola wrote: > >> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > >> index 4c8959153ac4..9d42cbddce27 100644 > >> --- a/arch/arm64/mm/mmu.c > >> +++ b/arch/arm64/mm/mmu.c > >> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, > >> > >> static void free_hotplug_pgtable_page(struct page *page) > >> { > >> + if (folio_test_pgtable(page_folio(page))) > > This should work. > > > >> + pagetable_dtor(page_ptdesc(page)); > >> + > >> free_hotplug_page_range(page, PAGE_SIZE, NULL); > > In the case we presumably have a page table page (ptdesc) at this > > point, we should really be freeing it with pagetable_free() as well. > > Agreed, I think this is the right thing to do, something like: > > if (folio_test_pgtable(page_folio(page))) > pagetable_dtor_free(page_ptdesc(page)); else > free_hotplug_page_range(page, PAGE_SIZE, NULL); > > > Strangely enough x86 calls pagetable_free() in both cases. > > My series protecting page tables with pkeys has a patch [1] to get > vmemmap to allocate page tables with pagetable_alloc(). The diff above > will require pagetable_*_ctor() to be called as well, but I think that's > the right thing to do anyway. That could be posted as a separate series, > but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc(). I agree that calling the ctor()/dtor() functions consistently is the cleanest approach and that will need something like your patch to call the constructor from vmemmap_alloc_block_zero(). Trying to elide these calls for the page-table pages used to map the altmap just feels odd to me, as there isn't anything particularly special about them afaik. Will ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-26 15:07 ` Will Deacon @ 2026-05-27 7:35 ` Kevin Brodsky 2026-05-27 9:30 ` Vishal Moola 0 siblings, 1 reply; 13+ messages in thread From: Kevin Brodsky @ 2026-05-27 7:35 UTC (permalink / raw) To: Will Deacon Cc: Vishal Moola, Catalin Marinas, Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, david On 26/05/2026 17:07, Will Deacon wrote: > On Tue, May 26, 2026 at 01:54:00PM +0200, Kevin Brodsky wrote: >> On 22/05/2026 11:36, Vishal Moola wrote: >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c >>>> index 4c8959153ac4..9d42cbddce27 100644 >>>> --- a/arch/arm64/mm/mmu.c >>>> +++ b/arch/arm64/mm/mmu.c >>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, >>>> >>>> static void free_hotplug_pgtable_page(struct page *page) >>>> { >>>> + if (folio_test_pgtable(page_folio(page))) >>> This should work. >>> >>>> + pagetable_dtor(page_ptdesc(page)); >>>> + >>>> free_hotplug_page_range(page, PAGE_SIZE, NULL); >>> In the case we presumably have a page table page (ptdesc) at this >>> point, we should really be freeing it with pagetable_free() as well. >> Agreed, I think this is the right thing to do, something like: >> >> if (folio_test_pgtable(page_folio(page))) >> pagetable_dtor_free(page_ptdesc(page)); else >> free_hotplug_page_range(page, PAGE_SIZE, NULL); >> >> >> Strangely enough x86 calls pagetable_free() in both cases. >> >> My series protecting page tables with pkeys has a patch [1] to get >> vmemmap to allocate page tables with pagetable_alloc(). The diff above >> will require pagetable_*_ctor() to be called as well, but I think that's >> the right thing to do anyway. That could be posted as a separate series, >> but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc(). > I agree that calling the ctor()/dtor() functions consistently is the > cleanest approach and that will need something like your patch to call > the constructor from vmemmap_alloc_block_zero(). Trying to elide these > calls for the page-table pages used to map the altmap just feels odd to > me, as there isn't anything particularly special about them afaik. I don't think they're really special either, most likely they just got missed/ignored for the purpose of ctor/dtor like many other kernel page tables (until recently). I'll prepare a series refactoring that code then - that will also require changing most arch implementations of vmemmap_free() to call pagetable_dtor_free(). In the meantime we should probably use the logic above to avoid the BUG that Alistair reported. - Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables 2026-05-27 7:35 ` Kevin Brodsky @ 2026-05-27 9:30 ` Vishal Moola 0 siblings, 0 replies; 13+ messages in thread From: Vishal Moola @ 2026-05-27 9:30 UTC (permalink / raw) To: Kevin Brodsky Cc: Will Deacon, Catalin Marinas, Andrew Morton, Alistair Popple, linux-arm-kernel, linux-kernel, linux-mm, david On Wed, May 27, 2026 at 09:35:50AM +0200, Kevin Brodsky wrote: > On 26/05/2026 17:07, Will Deacon wrote: > > On Tue, May 26, 2026 at 01:54:00PM +0200, Kevin Brodsky wrote: > >> On 22/05/2026 11:36, Vishal Moola wrote: > >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > >>>> index 4c8959153ac4..9d42cbddce27 100644 > >>>> --- a/arch/arm64/mm/mmu.c > >>>> +++ b/arch/arm64/mm/mmu.c > >>>> @@ -1441,6 +1441,9 @@ static void free_hotplug_page_range(struct page *page, size_t size, > >>>> > >>>> static void free_hotplug_pgtable_page(struct page *page) > >>>> { > >>>> + if (folio_test_pgtable(page_folio(page))) > >>> This should work. > >>> > >>>> + pagetable_dtor(page_ptdesc(page)); > >>>> + > >>>> free_hotplug_page_range(page, PAGE_SIZE, NULL); > >>> In the case we presumably have a page table page (ptdesc) at this > >>> point, we should really be freeing it with pagetable_free() as well. > >> Agreed, I think this is the right thing to do, something like: > >> > >> if (folio_test_pgtable(page_folio(page))) > >> pagetable_dtor_free(page_ptdesc(page)); else > >> free_hotplug_page_range(page, PAGE_SIZE, NULL); > >> > >> > >> Strangely enough x86 calls pagetable_free() in both cases. > >> > >> My series protecting page tables with pkeys has a patch [1] to get > >> vmemmap to allocate page tables with pagetable_alloc(). The diff above > >> will require pagetable_*_ctor() to be called as well, but I think that's > >> the right thing to do anyway. That could be posted as a separate series, > >> but I'm hesitant due to the lack of NUMA awareness in pagetable_alloc(). > > I agree that calling the ctor()/dtor() functions consistently is the > > cleanest approach and that will need something like your patch to call > > the constructor from vmemmap_alloc_block_zero(). Trying to elide these > > calls for the page-table pages used to map the altmap just feels odd to > > me, as there isn't anything particularly special about them afaik. > > I don't think they're really special either, most likely they just got > missed/ignored for the purpose of ctor/dtor like many other kernel page > tables (until recently). > > I'll prepare a series refactoring that code then - that will also > require changing most arch implementations of vmemmap_free() to call > pagetable_dtor_free(). Take a look at Matthew's series[1]. I think thats the ideal approach for page table accounting. He hasn't had time to iterate on it though. I doubt he'd mind if someone picked it up. > In the meantime we should probably use the logic above to avoid the BUG > that Alistair reported. Agreed. [1] https://lore.kernel.org/linux-mm/20251113140448.1814860-4-willy@infradead.org/ ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2026-05-27 9:30 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-21 3:27 [PATCH] arm64: mm: call pagetable dtor when freeing hot-removed page tables Alistair Popple 2026-05-21 22:31 ` Andrew Morton 2026-05-21 23:50 ` Alistair Popple 2026-05-22 7:15 ` Catalin Marinas 2026-05-22 7:32 ` Catalin Marinas 2026-05-22 9:36 ` Vishal Moola 2026-05-26 11:54 ` Kevin Brodsky 2026-05-26 12:31 ` David Hildenbrand (Arm) 2026-05-27 7:34 ` Kevin Brodsky 2026-05-27 9:22 ` Vishal Moola 2026-05-26 15:07 ` Will Deacon 2026-05-27 7:35 ` Kevin Brodsky 2026-05-27 9:30 ` Vishal Moola
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox