Re: [PATCH] x86/mm: fix vmemmap leak on memory hot-remove

Linux-mm Archive on lore.kernel.org

From: Mike Rapoport <rppt@kernel.org>
To: Juhyung Park <qkrwngud825@gmail.com>,
	Vishal Moola <vishal.moola@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	linux-mm@kvack.org, stable@vger.kernel.org,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	David Hildenbrand <david@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>, Dan Williams <djbw@kernel.org>,
	Dave Jiang <dave.jiang@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH] x86/mm: fix vmemmap leak on memory hot-remove
Date: Wed, 20 May 2026 07:49:27 +0300
Message-ID: <ag09V3UyN2aWA7Wb@kernel.org>
In-Reply-To: <CAD14+f3sohXj9SKEkRXGK_Mpbp73R5az-tsiHnHkj0poBHwpvw@mail.gmail.com>

(adding Vishal)

On Wed, May 20, 2026 at 01:59:49AM +0900, Juhyung Park wrote:
> On Wed, May 20, 2026 at 1:41 AM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> > On 5/19/26 09:27, Juhyung Park wrote:
> > > Hi Dave,
> > >
> > > On Wed, May 20, 2026 at 1:02 AM Dave Hansen <dave.hansen@intel.com> wrote:
> > >>
> > >> On 5/19/26 08:10, Juhyung Park wrote:
> > >>>  #endif
> > >>>       } else {
> > >>> -             pagetable_free(page_ptdesc(page));
> > >>> +             /*
> > >>> +              * Use __free_pages() to honor @order: vmemmap PMD leaves
> > >>> +              * freed here are not compound pages, so pagetable_free()
> > >>> +              * would leak 511 of 512 pages per 2 MB chunk.
> > >>> +              */
> > >>> +             __free_pages(page, order);
> > >>>       }
> > >>>  }
> > >>
> > >> I find myself really wondering how much of this came from a human and
> > >> how much from the LLM. Could you share that with us?
> > >
> > > Not my first kernel contribution, just so you know. (first in mm tho)
> > >
> > > I asked Claude to write both the commit body and comment and it was
> > > too verbose. I manually trimmed it down.
> > > Sorry if it still sounds too LLM-ish.
> >
> > Yeah, it still sounded really LLM-ish to me. Still rather chatty.
> >
> > > This was tested on a VM with virtualized CXL device and toggling it
> > > back and forth was visibly causing leaks. kmemleak was unable to catch
> > > this (rightfully so), so I skeptically asked Claude to see if it can
> > > figure it out while pwd was the kernel source the VM was running.
> > > "Access the VM at "ssh -p2223 root@192.168.0.185". There's a memory
> > > leak whenever CXL memory switches modes via: daxctl reconfigure-device
> > > --mode=system-ram dax0.0 --force, daxctl reconfigure-device
> > > --mode=devdax dax0.0 --force. Figure out why. If you need to reboot
> > > the VM, do not do it yourself and ask me."
> > >
> > > It did in 6 minutes and it basically told me to revert bf9e4e30f353. I
> > > was very skeptical and reviewed manually (with my limited knowledge of
> > > mm) why this would be a correct fix.
> >
> > Neato.
> >
> > >> We're trying to get _away_ from using the 'struct page' APIs on page
> > >> tables. This goes backwards. Worst case, do:
> > >>
> > >>         /* vmemmap PMD leaves are not compound pages */
> > >>         for (i = 0; i < 1<<order; i++)
> > >>                 pagetable_free(page_ptdesc(&page[i]));
> > >>
> > >> Right?
> > >
> > > Shouldn't I worry about the loop overhead? With order == 9, that's 512
> > > iterations per 2 MB chunk, which adds up to O(N) work over the
> > > entire memory size.
> >
> > Is it optimal? No.
> >
> > Will anybody ever notice? Also no.
> >
> > Will anybody ever care? No sir.
> 
> Just spun a test with that loop. It doesn't fix the leak.
> 
> I hate to be the guy that copy-pastas LLM but this is outside my
> knowledge of mm. Claude suggests:
> "Each pagetable_free() on the tails is a no-op: When
> alloc_pages_node(node, gfp, order=9) returns without __GFP_COMP, the
> buddy allocator only sets _refcount = 1 on the head page. The other
> 511 pages (page[1] … page[511]) have _refcount = 0. There's no
> compound metadata, so they aren't "tails" in the folio sense either —
> they're just contiguous pages whose refcounts the allocator never
> touched."
> 
> Any ideas?
> 
> Thanks.
> 
> >
> > Can you measure the difference? I'd wager a beer: No again.
> >
> > Even if someone manages to notice, then you have a clear path to fix it
> > *right*: fix the ptdesc data structure to represent high-order allocations.
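
For readers following the refcount argument quoted above, here is a rough
user-space C model of it. It is only a sketch of the explanation as quoted,
which nobody in this thread has confirmed against the real allocator. Every
name in it (model_page, model_alloc, model_free_one, model_free_order,
model_leak_after) is hypothetical and not a kernel API. It assumes that a
non-compound order-9 allocation sets a reference only on the head page, and
that an order-0 free of a page whose count is already zero releases nothing.

```c
/* User-space model only: none of these names exist in the kernel. */
#include <stddef.h>

#define MODEL_ORDER    9                     /* 2 MB chunk of 4 KB pages */
#define MODEL_NR_PAGES (1u << MODEL_ORDER)   /* 512 pages */

struct model_page {
	int refcount;  /* stands in for the page reference count */
	int freed;     /* set once the page has been handed back */
};

/* Non-compound high-order allocation: only the head gets a reference. */
static void model_alloc(struct model_page *pages, unsigned int order)
{
	for (size_t i = 0; i < ((size_t)1 << order); i++) {
		pages[i].refcount = (i == 0) ? 1 : 0;
		pages[i].freed = 0;
	}
}

/*
 * Order-0 free of a single page. ASSUMPTION: a page whose count is
 * already zero is left alone, as the quoted explanation claims.
 */
static void model_free_one(struct model_page *page)
{
	if (page->refcount <= 0)
		return;
	if (--page->refcount == 0)
		page->freed = 1;
}

/* Order-aware free: drop the head reference, then release the whole block. */
static void model_free_order(struct model_page *pages, unsigned int order)
{
	if (pages[0].refcount <= 0)
		return;
	if (--pages[0].refcount != 0)
		return;
	for (size_t i = 0; i < ((size_t)1 << order); i++)
		pages[i].freed = 1;
}

/*
 * Allocate one chunk, free it with the chosen strategy, and return how
 * many pages were never released.
 *   per_page_loop != 0: call the order-0 free on each of the 512 pages
 *   per_page_loop == 0: call the order-aware free once on the head
 */
static unsigned int model_leak_after(int per_page_loop)
{
	struct model_page pages[MODEL_NR_PAGES];
	unsigned int leaked = 0;

	model_alloc(pages, MODEL_ORDER);

	if (per_page_loop) {
		for (size_t i = 0; i < MODEL_NR_PAGES; i++)
			model_free_one(&pages[i]);
	} else {
		model_free_order(pages, MODEL_ORDER);
	}

	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		if (!pages[i].freed)
			leaked++;

	return leaked;
}
```

Under those assumptions, model_leak_after(1) leaves 511 pages unreleased
and model_leak_after(0) leaves none, which matches both the "511 of 512"
figure in the patch comment and the report that the per-page loop did not
fix the leak. Whether the real allocator behaves this way on the tail
pages is exactly the open question.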

-- 
Sincerely yours,
Mike.

