public inbox for linux-mm@kvack.org
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-cxl@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH 09/14] mm/sparse: remove CONFIG_MEMORY_HOTPLUG-specific usemap allocation handling
Date: Fri, 20 Mar 2026 19:49:46 +0100
Message-ID: <306f0fd2-cba1-4f48-af05-248be2bc9506@kernel.org>
In-Reply-To: <f23aa64a-d78f-4c57-a08b-5c589889b7fc@lucifer.local>

On 3/17/26 20:48, Lorenzo Stoakes (Oracle) wrote:
> On Tue, Mar 17, 2026 at 05:56:47PM +0100, David Hildenbrand (Arm) wrote:
>> In 2008, we added through commit 48c906823f39 ("memory hotplug: allocate
>> usemap on the section with pgdat") quite some complexity to try
>> allocating memory for the "usemap" (storing pageblock information
>> per memory section) for a memory section close to the memory of the
>> "pgdat" of the node.
>>
>> The goal was to make memory hotunplug of boot memory more likely to
>> succeed. That commit also added some checks for circular dependencies
>> between two memory sections, whereby two memory sections would contain
>> each others usemap, turning bot memory sections un-removable.
> 
> Typo: bot -> both. Presumably you are not talking about memory a bot of some
> kind allocated :P
> 
>>
>> However, in 2010, commit a4322e1bad91 ("sparsemem: Put usemap for one node
>> together") started allocating the usemap for multiple memory
>> sections on the same node in one chunk, effectively grouping all usemap
>> allocations of the same node in a single memblock allocation.
>>
>> We don't really give guarantees about memory hotunplug of boot memory, and
>> with the change in 2010, it is pretty much impossible in practice to get
>> any circular dependencies.
> 
> Pretty much impossible? :) We can probably go so far as to say impossible, no?

Yes.

> 
>>
>> commit 48c906823f39 ("memory hotplug: allocate usemap on the section with
>> pgdat") also added the comment:
>>
>> 	"Similarly, a pgdat can prevent a section being removed. If
>> 	 section A contains a pgdat and section B
>> 	 contains the usemap, both sections become inter-dependent."
>>
>> Given that we don't free the pgdat anymore, that comment (and handling)
>> does not apply.
> 
> Isn't pgdat synonymous with a node and that's the data structure that describes
> a node right? Confusingly typedef'd from pglist_data to pg_data_t but then
> referred to as pgdat because all that makes so much sense :)

Yeah, in general we refer to the NODE_DATA as pgdat (grep for it and
you'll be surprised).

> 
> But I'm confused, does a section containing a pgdat mean a section having the
> pgdat data structure literally allocated in it?

Yes. The pgdat ("struct pglist_data") is placed in some memory section.

> 
> A usemap is... something that tracks pageblock metadata I think right?

Yes. Essentially a large array of bytes, where each byte describes one
pageblock (migratetype etc.).
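
To make the idea concrete, here is a minimal userspace sketch of a per-section usemap, following the one-byte-per-pageblock description above. All names (toy_usemap, toy_set_migratetype(), ...) and the sizes are hypothetical and chosen for illustration only; this is not the kernel's struct mem_section_usage or its actual storage layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes, for illustration only; real values are arch/config dependent. */
#define PAGES_PER_SECTION	32768UL	/* e.g. a 128 MiB section with 4 KiB pages */
#define PAGES_PER_PAGEBLOCK	512UL	/* e.g. a 2 MiB pageblock with 4 KiB pages */
#define PAGEBLOCKS_PER_SECTION	(PAGES_PER_SECTION / PAGES_PER_PAGEBLOCK)

/* Toy migratetypes; a simplified subset, not the kernel's enum. */
enum toy_migratetype { TOY_UNMOVABLE, TOY_MOVABLE, TOY_RECLAIMABLE };

/* Toy usemap: one metadata byte per pageblock of a single memory section. */
struct toy_usemap {
	uint8_t pageblock[PAGEBLOCKS_PER_SECTION];
};

/* Map a page frame number to the index of its pageblock within the section. */
static inline unsigned long toy_pfn_to_block(unsigned long pfn)
{
	return (pfn % PAGES_PER_SECTION) / PAGES_PER_PAGEBLOCK;
}

/* Record the migratetype for the pageblock that contains @pfn. */
static inline void toy_set_migratetype(struct toy_usemap *u, unsigned long pfn,
				       enum toy_migratetype mt)
{
	u->pageblock[toy_pfn_to_block(pfn)] = (uint8_t)mt;
}

/* Look up the migratetype of the pageblock that contains @pfn. */
static inline enum toy_migratetype
toy_get_migratetype(const struct toy_usemap *u, unsigned long pfn)
{
	return (enum toy_migratetype)u->pageblock[toy_pfn_to_block(pfn)];
}
```

All pages of one pageblock share a single entry, so setting the type via any pfn in the block changes what every other pfn in that block reports.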

> 
> Anyway I'm also confused by 'given we don't free the pgdat any more', but the
> comment says a 'pgdat can prevent a section being removed' rather than anything
> about it being removed?

Well, if a pgdat resides in some memory section, then, given that it is
unmovable, it turns the whole memory section unremovable -> hotunplug fails.

Assuming you could free the pgdat when the node goes offline, you
would turn that memory section removable.

And I think that commit somehow assumed that the last memory section
could be removed if all it contains is the corresponding pgdat (which
was never the case).

> 
> I guess it means the OTHER section could be prevented from being removed even
> after it's gone.. somehow?
> 
> Anyway! I think maybe this could be clearer, somehow :)

I'm afraid the whole purpose of the original patch was sketchy, which is
also why I fail to even explain the original motivation clearly.

Now it's fortunately no longer required. :)

> 
>>
>> So let's simply remove this complexity.
>>
>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
> 
> I think what you've done in the patch is right though, we're not doing any of
> these dances after a4322e1bad91 and pgdats sitting around mean we don't really
> care about where the usemap goes anyway I don't think so...
> 
> I usemap and I find myself in a place where I give you a:
> 
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> 

Thanks ;)

[...]

>> -
>>  #ifdef CONFIG_SPARSEMEM_VMEMMAP
>>  unsigned long __init section_map_size(void)
>>  {
>> @@ -486,7 +390,6 @@ void __init sparse_init_early_section(int nid, struct page *map,
>>  				      unsigned long pnum, unsigned long flags)
>>  {
>>  	BUG_ON(!sparse_usagebuf || sparse_usagebuf >= sparse_usagebuf_end);
>> -	check_usemap_section_nr(nid, sparse_usagebuf);
>>  	sparse_init_one_section(__nr_to_section(pnum), pnum, map,
>>  			sparse_usagebuf, SECTION_IS_EARLY | flags);
>>  	sparse_usagebuf = (void *)sparse_usagebuf + mem_section_usage_size();
>> @@ -497,8 +400,7 @@ static int __init sparse_usage_init(int nid, unsigned long map_count)
>>  	unsigned long size;
>>
>>  	size = mem_section_usage_size() * map_count;
>> -	sparse_usagebuf = sparse_early_usemaps_alloc_pgdat_section(
>> -				NODE_DATA(nid), size);
>> +	sparse_usagebuf = memblock_alloc_node(size, SMP_CACHE_BYTES, nid);
> 
> I guess nid here is the same node as the pgdat?

Yes! Before, we used NODE_DATA(nid)->node_id, which is really just ... nid :)
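
For readers following along, the resulting scheme can be sketched in userspace roughly as below: one buffer per node sized for all of its sections, with each early section taking the next fixed-size chunk. This is only a model under stated assumptions: malloc() stands in for memblock_alloc_node(), TOY_USAGE_SIZE stands in for mem_section_usage_size(), and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for mem_section_usage_size(); the real size is config dependent. */
#define TOY_USAGE_SIZE 64UL

/* Toy counterpart of the sparse_usagebuf / sparse_usagebuf_end pair. */
struct toy_usage_buf {
	char *base;	/* start of the per-node allocation */
	char *cur;	/* next unused chunk */
	char *end;	/* one past the last byte of the buffer */
};

/* One allocation covering all @map_count sections of a node. */
static int toy_usage_init(struct toy_usage_buf *b, unsigned long map_count)
{
	size_t size = TOY_USAGE_SIZE * map_count;

	b->base = b->cur = b->end = NULL;
	if (!map_count)
		return -1;
	b->base = malloc(size);	/* assumption: replaces the node-local memblock allocation */
	if (!b->base)
		return -1;
	b->cur = b->base;
	b->end = b->base + size;
	return 0;
}

/* Hand out the next chunk; returns NULL where the kernel code would BUG_ON(). */
static void *toy_usage_next(struct toy_usage_buf *b)
{
	void *chunk;

	if (!b->cur || b->cur >= b->end)
		return NULL;
	chunk = b->cur;
	b->cur += TOY_USAGE_SIZE;
	return chunk;
}

/* Release the buffer; only needed in this userspace model. */
static void toy_usage_fini(struct toy_usage_buf *b)
{
	free(b->base);
	b->base = b->cur = b->end = NULL;
}
```

Because every chunk comes out of the same per-node buffer, there is no per-section placement decision left to make, which is why the pgdat-relative allocation helper and the section-number check can go.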

-- 
Cheers,

David

