Linux-mm Archive on lore.kernel.org
* [PATCH] x86/mm: fix vmemmap leak on memory hot-remove
From: Juhyung Park @ 2026-05-19 15:10 UTC
  To: linux-mm
  Cc: Juhyung Park, stable, Lu Baolu, Jason Gunthorpe,
	David Hildenbrand, Mike Rapoport (Microsoft), Oscar Salvador,
	Andrew Morton, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dan Williams,
	Dave Jiang, Vishal Verma, linux-cxl, nvdimm

free_pagetable() is called via free_hugepage_table() with
get_order(PMD_SIZE) = 9 to free the 2 MB vmemmap PMD leaves that back
struct page arrays on x86_64. After commit bf9e4e30f353 ("x86/mm: use
pagetable_free()"), it goes through pagetable_free() instead of
__free_pages(). pagetable_free() takes no order argument: it
ultimately calls __free_pages(page, compound_order()), so the explicit
order passed to free_pagetable() is dropped and the order is inferred
from the page's compound metadata instead.

The vmemmap PMD chunks are allocated by vmemmap_alloc_block() using
alloc_pages_node() without __GFP_COMP, so PG_head is not set and
compound_order() returns 0. As a result, only the first of the 512
pages in each PMD chunk is returned to the buddy allocator on
hot-remove; the remaining 511 pages stay allocated and become
unreachable. In general, roughly 16 MB is leaked per GB of hot-removed
memory on each hot-add/hot-remove cycle.

The leak affects every memory hot-remove path on x86_64 when
memmap_on_memory=N (the default), including dax_kmem, virtio-mem,
balloon drivers, ACPI memory hotplug, and direct sysfs offline+remove.
memmap_on_memory=Y avoids it because free_hugepage_table() then takes
the altmap branch and does not call free_pagetable().

Reproduced with CXL memory toggled through DAX in a loop:

  daxctl reconfigure-device --mode=system-ram dax0.0 --force
  daxctl reconfigure-device --mode=devdax    dax0.0 --force

Fixes: bf9e4e30f353 ("x86/mm: use pagetable_free()")
Cc: stable@vger.kernel.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <djbw@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: linux-cxl@vger.kernel.org
Cc: nvdimm@lists.linux.dev
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
---
 arch/x86/mm/init_64.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index df2261fa4f98..a2301bddb647 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1024,7 +1024,12 @@ static void __meminit free_pagetable(struct page *page, int order)
 		free_reserved_pages(page, nr_pages);
 #endif
 	} else {
-		pagetable_free(page_ptdesc(page));
+		/*
+		 * Use __free_pages() to honor @order: vmemmap PMD leaves
+		 * freed here are not compound pages, so pagetable_free()
+		 * would leak 511 of 512 pages per 2 MB chunk.
+		 */
+		__free_pages(page, order);
 	}
 }
 
-- 
2.54.0


