* [PATCHv4 0/3] mm/memory_hotplug: fixup crash during uevent handling
@ 2025-07-29  6:46 Hannes Reinecke
  2025-07-29  6:46 ` [PATCH 1/3] drivers/base/memory: add node id parameter to add_memory_block() Hannes Reinecke
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Hannes Reinecke @ 2025-07-29  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Oscar Salvador, linux-mm, Hannes Reinecke

Hi all,

we have some udev rules that read the sysfs attribute 'valid_zones' during
a memory 'add' event, causing a crash in zone_for_pfn_range(). Debugging found
that mem->nid was set to NUMA_NO_NODE, which crashed in NODE_DATA(nid).
Further analysis revealed that we're running into a race with udev event
processing: add_memory_resource() makes these function calls:

1) __try_online_node()
2) arch_add_memory()
3) create_memory_block_devices()
  -> calls device_register() -> memory 'add' event
4) node_set_online()/__register_one_node()
  -> calls device_register() -> node 'add' event
5) register_memory_blocks_under_node()
  -> sets mem->nid
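
The ordering problem in steps 3) and 5) can be sketched in plain userspace C.
This is only a toy model under stated assumptions, not kernel code: the
struct, the listener variable and every toy_* function are hypothetical
stand-ins, and the 'add' event is modelled as a synchronous callback rather
than a real uevent.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)	/* same value the kernel uses */

/* Toy stand-in for struct memory_block; only the field discussed here. */
struct toy_memory_block {
	int nid;
};

/* Hypothetical listener state: the nid visible when the 'add' event fired. */
static int toy_nid_seen_by_listener;

/* Models the memory 'add' event reaching a udev rule that reads sysfs. */
static void toy_emit_add_uevent(const struct toy_memory_block *mem)
{
	toy_nid_seen_by_listener = mem->nid;
}

/*
 * Models steps 3) and 5) in their original order and returns the nid
 * that the listener observed at event time.
 */
static int toy_add_memory_old_order(struct toy_memory_block *mem, int nid)
{
	assert(nid >= 0);		/* caller passes the real node id */

	mem->nid = NUMA_NO_NODE;	/* step 3: block created without a nid */
	toy_emit_add_uevent(mem);	/* step 3: 'add' event goes out here */
	mem->nid = nid;			/* step 5: nid is assigned only now */

	return toy_nid_seen_by_listener;
}
```

In this model the listener always observes NUMA_NO_NODE, even though the
block ends up with the right nid once the function returns.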

Which, to the uninitiated, is ... weird ...

Why do we try to online the node in 1), but only register
the node in 4) _after_ we have created the memory blocks in 3) ?
And why do we set the 'nid' value in 5), when the uevent
(which might need to see the correct 'nid' value) is sent out
in 3) ?
There must be a reason, I'm sure ...

So here's a small patchset to fix up the uevent ordering.
The first patch adds a 'nid' parameter to add_memory_block()
(to avoid mem->nid being initialized with NUMA_NO_NODE), the
second patch reshuffles the code in add_memory_resource()
to fully initialize the node prior to calling
create_memory_block_devices() so that the node is valid at
that time and uevent processing will see correct values in sysfs,
and the third patch moves memory_block_add_nid() into the caller.
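
The intended ordering after the series can be sketched the same way. Again
this is a userspace toy under stated assumptions, not the patch itself: the
sketch_* names, the fixed-size node table and the "listener can read" check
are all hypothetical and only illustrate that the node is set up and the nid
is known before the 'add' event fires.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE		(-1)	/* same value the kernel uses */
#define SKETCH_MAX_NODES	4	/* arbitrary size for this toy model */

struct sketch_node {
	bool registered;
};

struct sketch_block {
	int nid;
};

static struct sketch_node sketch_nodes[SKETCH_MAX_NODES];

/*
 * What a listener reading 'valid_zones' needs in this model: a valid nid
 * that refers to an already registered node.
 */
static bool sketch_listener_can_read(const struct sketch_block *mem)
{
	if (mem->nid < 0 || mem->nid >= SKETCH_MAX_NODES)
		return false;
	return sketch_nodes[mem->nid].registered;
}

/*
 * Models the reshuffled sequence: node first, then the block with its
 * nid already set, and only then the 'add' event. Returns whether the
 * listener could safely read the block at event time.
 */
static bool sketch_add_memory_new_order(struct sketch_block *mem, int nid)
{
	assert(nid >= 0 && nid < SKETCH_MAX_NODES);

	sketch_nodes[nid].registered = true;	/* node fully set up first */
	mem->nid = nid;				/* nid passed at creation */

	return sketch_listener_can_read(mem);	/* 'add' event fires here */
}
```

A block still carrying NUMA_NO_NODE fails the check, while a block created
through the reordered path passes it.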

As usual, comments and reviews are welcome.

Changes to the original submission:
- Add patch to rename memory_block_add_nid()
- Add reviews

Changes to v2:
- Move changes to nid setting into the last patch
- Add reviews from David Hildenbrand

Changes to v3:
- Add reviews
- Rebase to mm-unstable

Hannes Reinecke (3):
  drivers/base/memory: add node id parameter to add_memory_block()
  mm/memory_hotplug: activate node before adding new memory blocks
  drivers/base: move memory_block_add_nid() into the caller

 drivers/base/memory.c  | 53 ++++++++++++++++++------------------------
 drivers/base/node.c    | 10 ++++----
 include/linux/memory.h |  5 ++--
 mm/memory_hotplug.c    | 32 +++++++++++++------------
 4 files changed, 46 insertions(+), 54 deletions(-)

-- 
2.43.0





Thread overview: 8+ messages
2025-07-29  6:46 [PATCHv4 0/3] mm/memory_hotplug: fixup crash during uevent handling Hannes Reinecke
2025-07-29  6:46 ` [PATCH 1/3] drivers/base/memory: add node id parameter to add_memory_block() Hannes Reinecke
2025-07-29  6:46 ` [PATCH 2/3] mm/memory_hotplug: activate node before adding new memory blocks Hannes Reinecke
2025-07-29  6:46 ` [PATCH 3/3] drivers/base: move memory_block_add_nid() into the caller Hannes Reinecke
2025-07-29 20:38 ` [PATCHv4 0/3] mm/memory_hotplug: fixup crash during uevent handling Andrew Morton
2025-07-30  5:49   ` Hannes Reinecke
2025-07-30  6:30     ` Andrew Morton
2025-07-30  9:39     ` David Hildenbrand
