linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 00/10]
@ 2025-06-05 14:22 Oscar Salvador
  2025-06-05 14:22 ` [PATCH v5 01/10] mm,slub: Do not special case N_NORMAL nodes for slab_nodes Oscar Salvador
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

 v4 -> v5:
   - Split out conversion for different consumers (per David)
   - Renamed node-notifier actions (per David)
   - Added new Documentation for new node-notifier and updated
     the memory-notifier one to reflect the changes
   - Make sure we do not trigger anything when !CONFIG_NUMA (per David)

 v3 -> v4:
   - Fix typos pointed out by Alok Tiwari
   - Further cleanups suggested by Vlastimil
   - Add RBs-by from Vlastimil

 v2 -> v3:
   - Add Suggested-by (David)
   - Replace last N_NORMAL_MEMORY mention in slub (David)
   - Replace the notifier for autoweight-mempolicy
   - Fix build on !CONFIG_MEMORY_HOTPLUG
 
 v1 -> v2:
   - Remove status_change_nid_normal and the code that
     deals with it (David & Vlastimil)
   - Remove slab_mem_offline_callback (David & Vlastimil)
   - Change the order of canceling the notifiers
     in {online,offline}_pages (Vlastimil)
   - Fix up a couple of whitespaces (Jonathan Cameron)
   - Add RBs-by

Memory notifier is a tool that allows consumers to get notified whenever
memory gets onlined or offlined in the system.
Currently, there are 10 consumers of that, but 5 out of those 10 consumers
are only interested in getting notifications when a numa node changes its
memory state, that is, when it goes from memoryless to having memory or
vice versa.

As a result, they get notified for every {online,offline}_pages operation,
even though the numa node might not have changed its state.
This is suboptimal, and we want to decouple numa node state changes from
memory state changes.

While we are doing this, remove status_change_nid_normal, as the only
current user (slub) does not really need it.
This allows us to further simplify and clean up the code.

The first patch stops slub from special-casing N_NORMAL nodes, which removes
the last user of status_change_nid_normal.
The second patch removes status_change_nid_normal itself and updates the
documentation.
The third patch implements a numa node notifier, so that consumers get
notified only about the node state changes they are interested in.
Patches 4 to 9 convert those consumers to the new notifier.

The last patch renames the 'status_change_nid' field within memory_notify
to 'nid', as that is all the memory notifier needs, and updates the only
user of it (page_ext).

Consumers that are only interested in numa node state changes are:

 - memory-tier
 - slub
 - cpuset
 - hmat
 - cxl
 - autoweight-mempolicy

Oscar Salvador (10):
  mm,slub: Do not special case N_NORMAL nodes for slab_nodes
  mm,memory_hotplug: Remove status_change_nid_normal and update
    documentation
  mm,memory_hotplug: Implement numa node notifier
  mm,slub: Use node-notifier instead of memory-notifier
  mm,memory-tiers: Use node-notifier instead of memory-notifier
  drivers,cxl: Use node-notifier instead of memory-notifier
  drivers,hmat: Use node-notifier instead of memory-notifier
  kernel,cpuset: Use node-notifier instead of memory-notifier
  mm,mempolicy: Use node-notifier instead of memory-notifier
  mm,memory_hotplug: Rename status_change_nid parameter in memory_notify

 Documentation/core-api/memory-hotplug.rst     |  78 ++++++--
 .../zh_CN/core-api/memory-hotplug.rst         |   3 -
 drivers/acpi/numa/hmat.c                      |   8 +-
 drivers/base/node.c                           |  21 +++
 drivers/cxl/core/region.c                     |  16 +-
 drivers/cxl/cxl.h                             |   4 +-
 include/linux/memory.h                        |   3 +-
 include/linux/node.h                          |  42 +++++
 kernel/cgroup/cpuset.c                        |   2 +-
 mm/memory-tiers.c                             |  14 +-
 mm/memory_hotplug.c                           | 167 ++++++++----------
 mm/mempolicy.c                                |  10 +-
 mm/page_ext.c                                 |  12 +-
 mm/slub.c                                     |  45 +----
 14 files changed, 240 insertions(+), 185 deletions(-)

-- 
2.49.0
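
To illustrate the motivation in the cover letter, here is a minimal userspace sketch (not kernel code) of the difference between the two notification schemes. Every name in it (`sim_state`, `sim_online_block`, `sim_offline_block`) is hypothetical and exists only for this example. It counts how many events a memory-notifier consumer would see versus a consumer that is only told about memoryless <-> has-memory transitions.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_NODES 4

/* Hypothetical bookkeeping: online memory blocks per node, plus counters. */
struct sim_state {
	unsigned int online_blocks[SIM_MAX_NODES];
	unsigned int mem_events;	/* seen by a memory-notifier consumer */
	unsigned int node_events;	/* seen by a node-notifier consumer */
};

/* Online one block on @nid; returns true if it was the node's first memory. */
static bool sim_online_block(struct sim_state *s, int nid)
{
	bool first = (s->online_blocks[nid] == 0);

	s->online_blocks[nid]++;
	s->mem_events++;		/* fires on every online operation */
	if (first)
		s->node_events++;	/* fires only on memoryless -> has memory */
	return first;
}

/* Offline one block on @nid; returns true if it was the node's last memory. */
static bool sim_offline_block(struct sim_state *s, int nid)
{
	bool last;

	if (s->online_blocks[nid] == 0)
		return false;		/* nothing to offline */

	s->online_blocks[nid]--;
	last = (s->online_blocks[nid] == 0);
	s->mem_events++;		/* fires on every offline operation */
	if (last)
		s->node_events++;	/* fires only on has memory -> memoryless */
	return last;
}
```

Onlining three blocks on one node and offlining them again produces six memory events but only two node events in this model, which is the overhead the series aims to avoid for node-only consumers.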



* [PATCH v5 01/10] mm,slub: Do not special case N_NORMAL nodes for slab_nodes
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-05 14:22 ` [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation Oscar Salvador
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

Currently, slab_mem_going_online_callback() checks whether the node has
N_NORMAL memory in order to be set in slab_nodes.
While it is true that dropping that check would mean ending up with
movable nodes in slab_nodes, the memory waste that comes with that is
negligible.

So stop checking for status_change_nid_normal and just use status_change_nid
instead, which works for both types of memory.

Also, once we allocate the kmem_cache_node cache for the node in
slab_mem_going_online_callback(), we never deallocate it in
slab_mem_offline_callback() when the node goes memoryless, so we can just
get rid of slab_mem_offline_callback().

The side effects are that we will stop clearing the node from slab_nodes,
and also that kmem caches created after node hotremove will now allocate
their kmem_cache_node for the node(s) that were hotremoved, but these
should be negligible.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
---
 mm/slub.c | 34 +++-------------------------------
 1 file changed, 3 insertions(+), 31 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index be8b09e09d30..f92b43d36adc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -447,7 +447,7 @@ static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 
 /*
  * Tracks for which NUMA nodes we have kmem_cache_nodes allocated.
- * Corresponds to node_state[N_NORMAL_MEMORY], but can temporarily
+ * Corresponds to node_state[N_MEMORY], but can temporarily
  * differ during memory hotplug/hotremove operations.
  * Protected by slab_mutex.
  */
@@ -6160,36 +6160,12 @@ static int slab_mem_going_offline_callback(void *arg)
 	return 0;
 }
 
-static void slab_mem_offline_callback(void *arg)
-{
-	struct memory_notify *marg = arg;
-	int offline_node;
-
-	offline_node = marg->status_change_nid_normal;
-
-	/*
-	 * If the node still has available memory. we need kmem_cache_node
-	 * for it yet.
-	 */
-	if (offline_node < 0)
-		return;
-
-	mutex_lock(&slab_mutex);
-	node_clear(offline_node, slab_nodes);
-	/*
-	 * We no longer free kmem_cache_node structures here, as it would be
-	 * racy with all get_node() users, and infeasible to protect them with
-	 * slab_mutex.
-	 */
-	mutex_unlock(&slab_mutex);
-}
-
 static int slab_mem_going_online_callback(void *arg)
 {
 	struct kmem_cache_node *n;
 	struct kmem_cache *s;
 	struct memory_notify *marg = arg;
-	int nid = marg->status_change_nid_normal;
+	int nid = marg->status_change_nid;
 	int ret = 0;
 
 	/*
@@ -6247,10 +6223,6 @@ static int slab_memory_callback(struct notifier_block *self,
 	case MEM_GOING_OFFLINE:
 		ret = slab_mem_going_offline_callback(arg);
 		break;
-	case MEM_OFFLINE:
-	case MEM_CANCEL_ONLINE:
-		slab_mem_offline_callback(arg);
-		break;
 	case MEM_ONLINE:
 	case MEM_CANCEL_OFFLINE:
 		break;
@@ -6321,7 +6293,7 @@ void __init kmem_cache_init(void)
 	 * Initialize the nodemask for which we will allocate per node
 	 * structures. Here we don't need taking slab_mutex yet.
 	 */
-	for_each_node_state(node, N_NORMAL_MEMORY)
+	for_each_node_state(node, N_MEMORY)
 		node_set(node, slab_nodes);
 
 	create_boot_cache(kmem_cache_node, "kmem_cache_node",
-- 
2.49.0
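
The side effect described in the commit message above can be modelled in a few lines of userspace C. This is only a sketch: `sim_slab_nodes` stands in for the real slab_nodes nodemask, and the three helper names are hypothetical. It shows that once a node's bit is set, it stays set after the node goes memoryless, so caches created later still cover that node.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the slab_nodes nodemask: one bit per node. */
static unsigned long sim_slab_nodes;

/* Node gains its first memory: remember that per-node structures exist. */
static void sim_node_going_online(int nid)
{
	sim_slab_nodes |= 1UL << nid;
}

/* Node lost its last memory: with this patch, nothing is cleared. */
static void sim_node_offlined(int nid)
{
	(void)nid;
}

/* Would a newly created cache allocate per-node state for @nid? */
static bool sim_new_cache_covers(int nid)
{
	return (sim_slab_nodes & (1UL << nid)) != 0;
}
```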



* [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
  2025-06-05 14:22 ` [PATCH v5 01/10] mm,slub: Do not special case N_NORMAL nodes for slab_nodes Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-05 14:34   ` Vlastimil Babka
  2025-06-05 14:54   ` David Hildenbrand
  2025-06-05 14:22 ` [PATCH v5 03/10] mm,memory_hotplug: Implement numa node notifier Oscar Salvador
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

Now that the last user of status_change_nid_normal is gone, we can remove it.
Update documentation accordingly.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 Documentation/core-api/memory-hotplug.rst            |  3 ---
 .../translations/zh_CN/core-api/memory-hotplug.rst   |  3 ---
 include/linux/memory.h                               |  1 -
 mm/memory_hotplug.c                                  | 12 ------------
 4 files changed, 19 deletions(-)

diff --git a/Documentation/core-api/memory-hotplug.rst b/Documentation/core-api/memory-hotplug.rst
index 682259ee633a..d1b8eb9add8a 100644
--- a/Documentation/core-api/memory-hotplug.rst
+++ b/Documentation/core-api/memory-hotplug.rst
@@ -56,14 +56,11 @@ The third argument (arg) passes a pointer of struct memory_notify::
 	struct memory_notify {
 		unsigned long start_pfn;
 		unsigned long nr_pages;
-		int status_change_nid_normal;
 		int status_change_nid;
 	}
 
 - start_pfn is start_pfn of online/offline memory.
 - nr_pages is # of pages of online/offline memory.
-- status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
-  is (will be) set/clear, if this is -1, then nodemask status is not changed.
 - status_change_nid is set node id when N_MEMORY of nodemask is (will be)
   set/clear. It means a new(memoryless) node gets new memory by online and a
   node loses all memory. If this is -1, then nodemask status is not changed.
diff --git a/Documentation/translations/zh_CN/core-api/memory-hotplug.rst b/Documentation/translations/zh_CN/core-api/memory-hotplug.rst
index 9b2841fb9a5f..c2a4122ae221 100644
--- a/Documentation/translations/zh_CN/core-api/memory-hotplug.rst
+++ b/Documentation/translations/zh_CN/core-api/memory-hotplug.rst
@@ -62,7 +62,6 @@ memory_notify结构体的指针::
 	struct memory_notify {
 		unsigned long start_pfn;
 		unsigned long nr_pages;
-		int status_change_nid_normal;
 		int status_change_nid;
 	}
 
@@ -70,8 +69,6 @@ memory_notify结构体的指针::
 
 - nr_pages是在线/离线内存的页数。
 
-- status_change_nid_normal是当nodemask的N_NORMAL_MEMORY被设置/清除时设置节
-  点id,如果是-1,则nodemask状态不改变。
 
 - status_change_nid是当nodemask的N_MEMORY被(将)设置/清除时设置的节点id。这
   意味着一个新的(没上线的)节点通过联机获得新的内存,而一个节点失去了所有的内
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 5ec4e6d209b9..a9ccd6579422 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -109,7 +109,6 @@ struct memory_notify {
 	unsigned long altmap_nr_pages;
 	unsigned long start_pfn;
 	unsigned long nr_pages;
-	int status_change_nid_normal;
 	int status_change_nid;
 };
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b1caedbade5b..94ae0ca37021 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -706,19 +706,13 @@ static void node_states_check_changes_online(unsigned long nr_pages,
 	int nid = zone_to_nid(zone);
 
 	arg->status_change_nid = NUMA_NO_NODE;
-	arg->status_change_nid_normal = NUMA_NO_NODE;
 
 	if (!node_state(nid, N_MEMORY))
 		arg->status_change_nid = nid;
-	if (zone_idx(zone) <= ZONE_NORMAL && !node_state(nid, N_NORMAL_MEMORY))
-		arg->status_change_nid_normal = nid;
 }
 
 static void node_states_set_node(int node, struct memory_notify *arg)
 {
-	if (arg->status_change_nid_normal >= 0)
-		node_set_state(node, N_NORMAL_MEMORY);
-
 	if (arg->status_change_nid >= 0)
 		node_set_state(node, N_MEMORY);
 }
@@ -1895,7 +1889,6 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
 	enum zone_type zt;
 
 	arg->status_change_nid = NUMA_NO_NODE;
-	arg->status_change_nid_normal = NUMA_NO_NODE;
 
 	/*
 	 * Check whether node_states[N_NORMAL_MEMORY] will be changed.
@@ -1907,8 +1900,6 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
 	 */
 	for (zt = 0; zt <= ZONE_NORMAL; zt++)
 		present_pages += pgdat->node_zones[zt].present_pages;
-	if (zone_idx(zone) <= ZONE_NORMAL && nr_pages >= present_pages)
-		arg->status_change_nid_normal = zone_to_nid(zone);
 
 	/*
 	 * We have accounted the pages from [0..ZONE_NORMAL); ZONE_HIGHMEM
@@ -1927,9 +1918,6 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
 
 static void node_states_clear_node(int node, struct memory_notify *arg)
 {
-	if (arg->status_change_nid_normal >= 0)
-		node_clear_state(node, N_NORMAL_MEMORY);
-
 	if (arg->status_change_nid >= 0)
 		node_clear_state(node, N_MEMORY);
 }
-- 
2.49.0
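
With status_change_nid_normal gone, the only question left in the offline path is whether the node becomes memoryless. Below is a minimal userspace sketch of that check, assuming a simplified three-zone layout. `sim_zone`, `sim_node` and `sim_offline_empties_node` are hypothetical names, and the real code walks pgdat->node_zones instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified zone list; the kernel's enum zone_type has more. */
enum sim_zone { SIM_ZONE_DMA, SIM_ZONE_NORMAL, SIM_ZONE_MOVABLE, SIM_NR_ZONES };

struct sim_node {
	unsigned long present_pages[SIM_NR_ZONES];
};

/*
 * Sum the present pages of every zone up to and including the movable zone.
 * If the range being offlined is at least that large, the node ends up
 * without memory and N_MEMORY would be cleared for it.
 */
static bool sim_offline_empties_node(const struct sim_node *node,
				     unsigned long nr_pages)
{
	unsigned long present = 0;
	size_t zt;

	for (zt = 0; zt < SIM_NR_ZONES; zt++)
		present += node->present_pages[zt];

	return nr_pages >= present;
}
```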



* [PATCH v5 03/10] mm,memory_hotplug: Implement numa node notifier
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
  2025-06-05 14:22 ` [PATCH v5 01/10] mm,slub: Do not special case N_NORMAL nodes for slab_nodes Oscar Salvador
  2025-06-05 14:22 ` [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-06  7:50   ` Oscar Salvador
  2025-06-05 14:22 ` [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier Oscar Salvador
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

At least six consumers of hotplug_memory_notifier are really only
interested in whether a numa node changed its state, e.g. going from
having memory to not having memory and vice versa.

Implement a specific notifier for numa nodes when their state gets changed,
which will later be used by those consumers that are only interested
in numa node state changes.

Add documentation as well.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 Documentation/core-api/memory-hotplug.rst |  66 +++++++++
 drivers/base/node.c                       |  21 +++
 include/linux/node.h                      |  42 ++++++
 mm/memory_hotplug.c                       | 155 ++++++++++------------
 4 files changed, 202 insertions(+), 82 deletions(-)

diff --git a/Documentation/core-api/memory-hotplug.rst b/Documentation/core-api/memory-hotplug.rst
index d1b8eb9add8a..b19c3be7437d 100644
--- a/Documentation/core-api/memory-hotplug.rst
+++ b/Documentation/core-api/memory-hotplug.rst
@@ -9,6 +9,9 @@ Memory hotplug event notifier
 
 Hotplugging events are sent to a notification queue.
 
+Memory notifier
+----------------
+
 There are six types of notification defined in ``include/linux/memory.h``:
 
 MEM_GOING_ONLINE
@@ -80,6 +83,69 @@ further processing of the notification queue.
 
 NOTIFY_STOP stops further processing of the notification queue.
 
+Numa node notifier
+------------------
+
+There are six types of notification defined in ``include/linux/node.h``:
+
+NODE_ADDING_FIRST_MEMORY
+ Generated before memory becomes available to this node for the first time.
+
+NODE_CANCEL_ADDING_FIRST_MEMORY
+ Generated if NODE_ADDING_FIRST_MEMORY fails.
+
+NODE_ADDED_FIRST_MEMORY
+ Generated when memory has become available for this node for the first time.
+
+NODE_REMOVING_LAST_MEMORY
+ Generated when the last memory available to this node is about to be offlined.
+
+NODE_CANCEL_REMOVING_LAST_MEMORY
+ Generated when NODE_REMOVING_LAST_MEMORY fails.
+
+NODE_REMOVED_LAST_MEMORY
+ Generated when the last memory available to this node has been offlined.
+
+A callback routine can be registered by calling::
+
+  hotplug_node_notifier(callback_func, priority)
+
+Callback functions with higher values of priority are called before callback
+functions with lower values.
+
+A callback function must have the following prototype::
+
+  int callback_func(
+
+    struct notifier_block *self, unsigned long action, void *arg);
+
+The first argument of the callback function (self) is a pointer to the block
+of the notifier chain that points to the callback function itself.
+The second argument (action) is one of the event types described above.
+The third argument (arg) passes a pointer of struct node_notify::
+
+        struct node_notify {
+                int nid;
+        }
+
+- nid is the node we are adding memory to or removing memory from.
+
+  If nid >= 0, callback should create/discard structures for the
+  node if necessary.
+
+The callback routine shall return one of the values
+NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
+defined in ``include/linux/notifier.h``
+
+NOTIFY_DONE and NOTIFY_OK have no effect on the further processing.
+
+NOTIFY_BAD is used as response to the NODE_ADDING_FIRST_MEMORY,
+NODE_REMOVING_LAST_MEMORY, NODE_ADDED_FIRST_MEMORY or
+NODE_REMOVED_LAST_MEMORY action to cancel hotplugging.
+It stops further processing of the notification queue.
+
+NOTIFY_STOP stops further processing of the notification queue.
+
 Locking Internals
 =================
 
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 25ab9ec14eb8..c5b0859d846d 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -111,6 +111,27 @@ static const struct attribute_group *node_access_node_groups[] = {
 	NULL,
 };
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+static BLOCKING_NOTIFIER_HEAD(node_chain);
+
+int register_node_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&node_chain, nb);
+}
+EXPORT_SYMBOL(register_node_notifier);
+
+void unregister_node_notifier(struct notifier_block *nb)
+{
+	blocking_notifier_chain_unregister(&node_chain, nb);
+}
+EXPORT_SYMBOL(unregister_node_notifier);
+
+int node_notify(unsigned long val, void *v)
+{
+	return blocking_notifier_call_chain(&node_chain, val, v);
+}
+#endif
+
 static void node_remove_accesses(struct node *node)
 {
 	struct node_access_nodes *c, *cnext;
diff --git a/include/linux/node.h b/include/linux/node.h
index 2b7517892230..8c783269011d 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -123,6 +123,48 @@ static inline void register_memory_blocks_under_node(int nid, unsigned long star
 #endif
 
 extern void unregister_node(struct node *node);
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+struct node_notify {
+	int nid;
+};
+
+#define NODE_ADDING_FIRST_MEMORY                (1<<0)
+#define NODE_ADDED_FIRST_MEMORY                 (1<<1)
+#define NODE_CANCEL_ADDING_FIRST_MEMORY         (1<<2)
+#define NODE_REMOVING_LAST_MEMORY               (1<<3)
+#define NODE_REMOVED_LAST_MEMORY                (1<<4)
+#define NODE_CANCEL_REMOVING_LAST_MEMORY        (1<<5)
+
+#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_NUMA)
+extern int register_node_notifier(struct notifier_block *nb);
+extern void unregister_node_notifier(struct notifier_block *nb);
+extern int node_notify(unsigned long val, void *v);
+
+#define hotplug_node_notifier(fn, pri) ({		\
+	static __meminitdata struct notifier_block fn##_node_nb =\
+		{ .notifier_call = fn, .priority = pri };\
+	register_node_notifier(&fn##_node_nb);			\
+})
+#else
+static inline int register_node_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+static inline void unregister_node_notifier(struct notifier_block *nb)
+{
+}
+static inline int node_notify(unsigned long val, void *v)
+{
+	return 0;
+}
+static inline int hotplug_node_notifier(notifier_fn_t fn, int pri)
+{
+	return 0;
+}
+#endif
+#endif
+
 #ifdef CONFIG_NUMA
 extern void node_dev_init(void);
 /* Core of the node registration - only memory hotplug should use this */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 94ae0ca37021..0550f3061fc4 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -35,6 +35,7 @@
 #include <linux/compaction.h>
 #include <linux/rmap.h>
 #include <linux/module.h>
+#include <linux/node.h>
 
 #include <asm/tlbflush.h>
 
@@ -699,24 +700,6 @@ static void online_pages_range(unsigned long start_pfn, unsigned long nr_pages)
 	online_mem_sections(start_pfn, end_pfn);
 }
 
-/* check which state of node_states will be changed when online memory */
-static void node_states_check_changes_online(unsigned long nr_pages,
-	struct zone *zone, struct memory_notify *arg)
-{
-	int nid = zone_to_nid(zone);
-
-	arg->status_change_nid = NUMA_NO_NODE;
-
-	if (!node_state(nid, N_MEMORY))
-		arg->status_change_nid = nid;
-}
-
-static void node_states_set_node(int node, struct memory_notify *arg)
-{
-	if (arg->status_change_nid >= 0)
-		node_set_state(node, N_MEMORY);
-}
-
 static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages)
 {
@@ -1171,7 +1154,9 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 	int need_zonelists_rebuild = 0;
 	const int nid = zone_to_nid(zone);
 	int ret;
-	struct memory_notify arg;
+	struct memory_notify mem_arg;
+	struct node_notify node_arg;
+	bool cancel_mem_notifier_on_err = false, cancel_node_notifier_on_err = false;
 
 	/*
 	 * {on,off}lining is constrained to full memory sections (or more
@@ -1188,11 +1173,22 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 	/* associate pfn range with the zone */
 	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
 
-	arg.start_pfn = pfn;
-	arg.nr_pages = nr_pages;
-	node_states_check_changes_online(nr_pages, zone, &arg);
+	node_arg.nid = NUMA_NO_NODE;
+	if (!node_state(nid, N_MEMORY)) {
+		/* Adding memory to the node for the first time */
+		cancel_node_notifier_on_err = true;
+		node_arg.nid = nid;
+		ret = node_notify(NODE_ADDING_FIRST_MEMORY, &node_arg);
+		ret = notifier_to_errno(ret);
+		if (ret)
+			goto failed_addition;
+	}
 
-	ret = memory_notify(MEM_GOING_ONLINE, &arg);
+	mem_arg.start_pfn = pfn;
+	mem_arg.nr_pages = nr_pages;
+	mem_arg.status_change_nid = node_arg.nid;
+	cancel_mem_notifier_on_err = true;
+	ret = memory_notify(MEM_GOING_ONLINE, &mem_arg);
 	ret = notifier_to_errno(ret);
 	if (ret)
 		goto failed_addition;
@@ -1218,7 +1214,8 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 	online_pages_range(pfn, nr_pages);
 	adjust_present_page_count(pfn_to_page(pfn), group, nr_pages);
 
-	node_states_set_node(nid, &arg);
+	if (node_arg.nid >= 0)
+		node_set_state(nid, N_MEMORY);
 	if (need_zonelists_rebuild)
 		build_all_zonelists(NULL);
 
@@ -1239,16 +1236,23 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 	kswapd_run(nid);
 	kcompactd_run(nid);
 
+	if (node_arg.nid >= 0)
+		/* First memory added successfully. Notify consumers. */
+		node_notify(NODE_ADDED_FIRST_MEMORY, &node_arg);
+
 	writeback_set_ratelimit();
 
-	memory_notify(MEM_ONLINE, &arg);
+	memory_notify(MEM_ONLINE, &mem_arg);
 	return 0;
 
 failed_addition:
 	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
 		 (unsigned long long) pfn << PAGE_SHIFT,
 		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
-	memory_notify(MEM_CANCEL_ONLINE, &arg);
+	if (cancel_mem_notifier_on_err)
+		memory_notify(MEM_CANCEL_ONLINE, &mem_arg);
+	if (cancel_node_notifier_on_err)
+		node_notify(NODE_CANCEL_ADDING_FIRST_MEMORY, &node_arg);
 	remove_pfn_range_from_zone(zone, pfn, nr_pages);
 	return ret;
 }
@@ -1880,48 +1884,6 @@ static int __init cmdline_parse_movable_node(char *p)
 }
 early_param("movable_node", cmdline_parse_movable_node);
 
-/* check which state of node_states will be changed when offline memory */
-static void node_states_check_changes_offline(unsigned long nr_pages,
-		struct zone *zone, struct memory_notify *arg)
-{
-	struct pglist_data *pgdat = zone->zone_pgdat;
-	unsigned long present_pages = 0;
-	enum zone_type zt;
-
-	arg->status_change_nid = NUMA_NO_NODE;
-
-	/*
-	 * Check whether node_states[N_NORMAL_MEMORY] will be changed.
-	 * If the memory to be offline is within the range
-	 * [0..ZONE_NORMAL], and it is the last present memory there,
-	 * the zones in that range will become empty after the offlining,
-	 * thus we can determine that we need to clear the node from
-	 * node_states[N_NORMAL_MEMORY].
-	 */
-	for (zt = 0; zt <= ZONE_NORMAL; zt++)
-		present_pages += pgdat->node_zones[zt].present_pages;
-
-	/*
-	 * We have accounted the pages from [0..ZONE_NORMAL); ZONE_HIGHMEM
-	 * does not apply as we don't support 32bit.
-	 * Here we count the possible pages from ZONE_MOVABLE.
-	 * If after having accounted all the pages, we see that the nr_pages
-	 * to be offlined is over or equal to the accounted pages,
-	 * we know that the node will become empty, and so, we can clear
-	 * it for N_MEMORY as well.
-	 */
-	present_pages += pgdat->node_zones[ZONE_MOVABLE].present_pages;
-
-	if (nr_pages >= present_pages)
-		arg->status_change_nid = zone_to_nid(zone);
-}
-
-static void node_states_clear_node(int node, struct memory_notify *arg)
-{
-	if (arg->status_change_nid >= 0)
-		node_clear_state(node, N_MEMORY);
-}
-
 static int count_system_ram_pages_cb(unsigned long start_pfn,
 				     unsigned long nr_pages, void *data)
 {
@@ -1937,13 +1899,17 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
 int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 			struct zone *zone, struct memory_group *group)
 {
-	const unsigned long end_pfn = start_pfn + nr_pages;
-	unsigned long pfn, managed_pages, system_ram_pages = 0;
-	const int node = zone_to_nid(zone);
-	unsigned long flags;
-	struct memory_notify arg;
-	char *reason;
 	int ret;
+	char *reason;
+	enum zone_type zt;
+	unsigned long flags;
+	struct memory_notify mem_arg;
+	struct node_notify node_arg;
+	const int node = zone_to_nid(zone);
+	struct pglist_data *pgdat = zone->zone_pgdat;
+	const unsigned long end_pfn = start_pfn + nr_pages;
+	unsigned long pfn, managed_pages, system_ram_pages = 0, present_pages = 0;
+	bool cancel_mem_notifier_on_err = false, cancel_node_notifier_on_err = false;
 
 	/*
 	 * {on,off}lining is constrained to full memory sections (or more
@@ -2000,11 +1966,30 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 		goto failed_removal_pcplists_disabled;
 	}
 
-	arg.start_pfn = start_pfn;
-	arg.nr_pages = nr_pages;
-	node_states_check_changes_offline(nr_pages, zone, &arg);
+	/*
+	 * Here we count the possible pages within the range [0..ZONE_MOVABLE].
+	 * If after having accounted all the pages, we see that the nr_pages to
+	 * be offlined is greater or equal to the accounted pages, we know that the
+	 * node will become empty, and so, we will clear N_MEMORY for it.
+	 */
+	node_arg.nid = NUMA_NO_NODE;
+	for (zt = 0; zt <= ZONE_MOVABLE; zt++)
+		present_pages += pgdat->node_zones[zt].present_pages;
+
+	if (nr_pages >= present_pages) {
+		node_arg.nid = node;
+		cancel_node_notifier_on_err = true;
+		ret = node_notify(NODE_REMOVING_LAST_MEMORY, &node_arg);
+		ret = notifier_to_errno(ret);
+		if (ret)
+			goto failed_removal_isolated;
+	}
 
-	ret = memory_notify(MEM_GOING_OFFLINE, &arg);
+	mem_arg.start_pfn = start_pfn;
+	mem_arg.nr_pages = nr_pages;
+	mem_arg.status_change_nid = node_arg.nid;
+	cancel_mem_notifier_on_err = true;
+	ret = memory_notify(MEM_GOING_OFFLINE, &mem_arg);
 	ret = notifier_to_errno(ret);
 	if (ret) {
 		reason = "notifier failure";
@@ -2084,27 +2069,33 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	 * Make sure to mark the node as memory-less before rebuilding the zone
 	 * list. Otherwise this node would still appear in the fallback lists.
 	 */
-	node_states_clear_node(node, &arg);
+	if (node_arg.nid >= 0)
+		node_clear_state(node, N_MEMORY);
 	if (!populated_zone(zone)) {
 		zone_pcp_reset(zone);
 		build_all_zonelists(NULL);
 	}
 
-	if (arg.status_change_nid >= 0) {
+	if (node_arg.nid >= 0) {
 		kcompactd_stop(node);
 		kswapd_stop(node);
+		/* Node went memoryless. Notify consumers */
+		node_notify(NODE_REMOVED_LAST_MEMORY, &node_arg);
 	}
 
 	writeback_set_ratelimit();
 
-	memory_notify(MEM_OFFLINE, &arg);
+	memory_notify(MEM_OFFLINE, &mem_arg);
 	remove_pfn_range_from_zone(zone, start_pfn, nr_pages);
 	return 0;
 
 failed_removal_isolated:
 	/* pushback to free area */
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
-	memory_notify(MEM_CANCEL_OFFLINE, &arg);
+	if (cancel_mem_notifier_on_err)
+		memory_notify(MEM_CANCEL_OFFLINE, &mem_arg);
+	if (cancel_node_notifier_on_err)
+		node_notify(NODE_CANCEL_REMOVING_LAST_MEMORY, &node_arg);
 failed_removal_pcplists_disabled:
 	lru_cache_enable();
 	zone_pcp_enable(zone);
-- 
2.49.0
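
The ordering of the two notifiers in online_pages(), and which cancel events fire on failure, can be sketched in userspace as follows. This is a model under stated assumptions, not the kernel implementation: `sim_ctx`, `sim_notify`, `sim_online_pages` and the `SIM_*` events are hypothetical stand-ins, and a single `veto_on` field plays the role of a consumer that rejects one event.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical event ids standing in for the NODE_* and MEM_* actions. */
enum sim_event {
	SIM_NODE_ADDING, SIM_NODE_ADDED, SIM_NODE_CANCEL,
	SIM_MEM_GOING_ONLINE, SIM_MEM_ONLINE, SIM_MEM_CANCEL,
	SIM_NR_EVENTS
};

struct sim_ctx {
	bool node_has_memory;
	int veto_on;			/* event a consumer rejects, or -1 */
	unsigned int seen[SIM_NR_EVENTS];
};

/* Record the event; fail it if the simulated consumer vetoes it. */
static int sim_notify(struct sim_ctx *c, enum sim_event ev)
{
	c->seen[ev]++;
	return (c->veto_on == (int)ev) ? -ENOMEM : 0;
}

static int sim_online_pages(struct sim_ctx *c)
{
	bool first = !c->node_has_memory;
	bool cancel_mem = false, cancel_node = false;
	int ret;

	if (first) {
		/* The node notifier runs first, only on a state change. */
		cancel_node = true;
		ret = sim_notify(c, SIM_NODE_ADDING);
		if (ret)
			goto failed;
	}

	/* The memory notifier runs for every online operation. */
	cancel_mem = true;
	ret = sim_notify(c, SIM_MEM_GOING_ONLINE);
	if (ret)
		goto failed;

	if (first) {
		c->node_has_memory = true;
		sim_notify(c, SIM_NODE_ADDED);
	}
	sim_notify(c, SIM_MEM_ONLINE);
	return 0;

failed:
	/* Only cancel the chains that were actually started. */
	if (cancel_mem)
		sim_notify(c, SIM_MEM_CANCEL);
	if (cancel_node)
		sim_notify(c, SIM_NODE_CANCEL);
	return ret;
}
```

The two boolean flags are what keep a node-notifier veto from sending a memory cancel event to consumers that never saw the matching "going online" event.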



* [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (2 preceding siblings ...)
  2025-06-05 14:22 ` [PATCH v5 03/10] mm,memory_hotplug: Implement numa node notifier Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-06  1:50   ` kernel test robot
  2025-06-06 11:56   ` David Hildenbrand
  2025-06-05 14:22 ` [PATCH v5 05/10] mm,memory-tiers: " Oscar Salvador
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

slub only cares about numa nodes changing their memory state,
so stop using the memory notifier and use the new numa node notifier
instead.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index f92b43d36adc..b8b5b81bfd1a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6164,8 +6164,8 @@ static int slab_mem_going_online_callback(void *arg)
 {
 	struct kmem_cache_node *n;
 	struct kmem_cache *s;
-	struct memory_notify *marg = arg;
-	int nid = marg->status_change_nid;
+	struct node_notify *narg = arg;
+	int nid = narg->nid;
 	int ret = 0;
 
 	/*
@@ -6217,15 +6217,12 @@ static int slab_memory_callback(struct notifier_block *self,
 	int ret = 0;
 
 	switch (action) {
-	case MEM_GOING_ONLINE:
+	case NODE_ADDING_FIRST_MEMORY:
 		ret = slab_mem_going_online_callback(arg);
 		break;
-	case MEM_GOING_OFFLINE:
+	case NODE_REMOVING_LAST_MEMORY:
 		ret = slab_mem_going_offline_callback(arg);
 		break;
-	case MEM_ONLINE:
-	case MEM_CANCEL_OFFLINE:
-		break;
 	}
 	if (ret)
 		ret = notifier_from_errno(ret);
@@ -6300,7 +6297,7 @@ void __init kmem_cache_init(void)
 			sizeof(struct kmem_cache_node),
 			SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0);
 
-	hotplug_memory_notifier(slab_memory_callback, SLAB_CALLBACK_PRI);
+	hotplug_node_notifier(slab_memory_callback, SLAB_CALLBACK_PRI);
 
 	/* Able to allocate the per node structures */
 	slab_state = PARTIAL;
-- 
2.49.0
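
As a rough userspace illustration of what a converted consumer looks like, the sketch below registers a slub-like callback on a mock notifier chain and dispatches node events to it. The action values are copied from the include/linux/node.h hunk in patch 03. Everything prefixed with `sim_` is a hypothetical stand-in (the kernel uses struct notifier_block and a blocking notifier chain), and the mock ignores callback return values.

```c
#include <assert.h>
#include <stddef.h>

/* Action values copied from the include/linux/node.h hunk in patch 03. */
#define NODE_ADDING_FIRST_MEMORY	(1 << 0)
#define NODE_ADDED_FIRST_MEMORY		(1 << 1)
#define NODE_REMOVING_LAST_MEMORY	(1 << 3)

struct node_notify {
	int nid;
};

/* Hypothetical stand-in for struct notifier_block. */
struct sim_nb {
	int (*call)(struct sim_nb *self, unsigned long action, void *arg);
	int priority;
	struct sim_nb *next;
};

struct sim_chain {
	struct sim_nb *head;
};

/* Keep the list sorted so that higher priorities run first. */
static void sim_register(struct sim_chain *chain, struct sim_nb *nb)
{
	struct sim_nb **pos = &chain->head;

	while (*pos && (*pos)->priority >= nb->priority)
		pos = &(*pos)->next;
	nb->next = *pos;
	*pos = nb;
}

/* Call every registered block in list order; return values are ignored. */
static void sim_call_chain(struct sim_chain *chain, unsigned long action,
			   void *arg)
{
	struct sim_nb *nb;

	for (nb = chain->head; nb; nb = nb->next)
		nb->call(nb, action, arg);
}

/* Records what the slub-like consumer reacted to, for inspection. */
static int sim_prepared_nid = -1;
static int sim_leaving_nid = -1;

static int sim_slab_like_callback(struct sim_nb *self, unsigned long action,
				  void *arg)
{
	struct node_notify *narg = arg;

	(void)self;
	switch (action) {
	case NODE_ADDING_FIRST_MEMORY:
		sim_prepared_nid = narg->nid;	/* set up per-node state */
		break;
	case NODE_REMOVING_LAST_MEMORY:
		sim_leaving_nid = narg->nid;	/* node is about to go memoryless */
		break;
	}
	return 0;
}
```

All other actions fall through the switch untouched, which mirrors how the converted slab_memory_callback() only handles the two events it needs.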



* [PATCH v5 05/10] mm,memory-tiers: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (3 preceding siblings ...)
  2025-06-05 14:22 ` [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-06 11:50   ` David Hildenbrand
  2025-06-05 14:22 ` [PATCH v5 06/10] drivers,cxl: " Oscar Salvador
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

memory-tier is only concerned when a numa node changes its memory state,
because it then needs to re-create the demotion list.
So stop using the memory notifier and use the new numa node notifier
instead.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/memory-tiers.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
index fc14fe53e9b7..67f06e6264a1 100644
--- a/mm/memory-tiers.c
+++ b/mm/memory-tiers.c
@@ -872,25 +872,25 @@ static int __meminit memtier_hotplug_callback(struct notifier_block *self,
 					      unsigned long action, void *_arg)
 {
 	struct memory_tier *memtier;
-	struct memory_notify *arg = _arg;
+	struct node_notify *narg = _arg;
 
 	/*
 	 * Only update the node migration order when a node is
 	 * changing status, like online->offline.
 	 */
-	if (arg->status_change_nid < 0)
+	if (narg->nid < 0)
 		return notifier_from_errno(0);
 
 	switch (action) {
-	case MEM_OFFLINE:
+	case NODE_REMOVED_LAST_MEMORY:
 		mutex_lock(&memory_tier_lock);
-		if (clear_node_memory_tier(arg->status_change_nid))
+		if (clear_node_memory_tier(narg->nid))
 			establish_demotion_targets();
 		mutex_unlock(&memory_tier_lock);
 		break;
-	case MEM_ONLINE:
+	case NODE_ADDED_FIRST_MEMORY:
 		mutex_lock(&memory_tier_lock);
-		memtier = set_node_memory_tier(arg->status_change_nid);
+		memtier = set_node_memory_tier(narg->nid);
 		if (!IS_ERR(memtier))
 			establish_demotion_targets();
 		mutex_unlock(&memory_tier_lock);
@@ -929,7 +929,7 @@ static int __init memory_tier_init(void)
 	nodes_and(default_dram_nodes, node_states[N_MEMORY],
 		  node_states[N_CPU]);
 
-	hotplug_memory_notifier(memtier_hotplug_callback, MEMTIER_HOTPLUG_PRI);
+	hotplug_node_notifier(memtier_hotplug_callback, MEMTIER_HOTPLUG_PRI);
 	return 0;
 }
 subsys_initcall(memory_tier_init);
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 06/10] drivers,cxl: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (4 preceding siblings ...)
  2025-06-05 14:22 ` [PATCH v5 05/10] mm,memory-tiers: " Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-06 11:51   ` David Hildenbrand
  2025-06-05 14:22 ` [PATCH v5 07/10] drivers,hmat: " Oscar Salvador
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

cxl is only concerned when a numa node changes its memory state,
specifically when a numa node with memory comes into play for the first
time, because it needs to get its performance attributes to build a proper
demotion chain.
So stop using the memory notifier and use the new numa node notifier
instead.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 drivers/cxl/core/region.c | 16 ++++++++--------
 drivers/cxl/cxl.h         |  4 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c3f4dc244df7..a8477a3e175c 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2432,12 +2432,12 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
 					  unsigned long action, void *arg)
 {
 	struct cxl_region *cxlr = container_of(nb, struct cxl_region,
-					       memory_notifier);
-	struct memory_notify *mnb = arg;
-	int nid = mnb->status_change_nid;
+					       node_notifier);
+	struct node_notify *mnb = arg;
+	int nid = mnb->nid;
 	int region_nid;
 
-	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
+	if (nid == NUMA_NO_NODE || action != NODE_ADDED_FIRST_MEMORY)
 		return NOTIFY_DONE;
 
 	/*
@@ -3484,7 +3484,7 @@ static void shutdown_notifiers(void *_cxlr)
 {
 	struct cxl_region *cxlr = _cxlr;
 
-	unregister_memory_notifier(&cxlr->memory_notifier);
+	unregister_node_notifier(&cxlr->node_notifier);
 	unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
 }
 
@@ -3523,9 +3523,9 @@ static int cxl_region_probe(struct device *dev)
 	if (rc)
 		return rc;
 
-	cxlr->memory_notifier.notifier_call = cxl_region_perf_attrs_callback;
-	cxlr->memory_notifier.priority = CXL_CALLBACK_PRI;
-	register_memory_notifier(&cxlr->memory_notifier);
+	cxlr->node_notifier.notifier_call = cxl_region_perf_attrs_callback;
+	cxlr->node_notifier.priority = CXL_CALLBACK_PRI;
+	register_node_notifier(&cxlr->node_notifier);
 
 	cxlr->adist_notifier.notifier_call = cxl_region_calculate_adistance;
 	cxlr->adist_notifier.priority = 100;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index a9ab46eb0610..48ac02dee881 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -513,7 +513,7 @@ enum cxl_partition_mode {
  * @flags: Region state flags
  * @params: active + config params for the region
  * @coord: QoS access coordinates for the region
- * @memory_notifier: notifier for setting the access coordinates to node
+ * @node_notifier: notifier for setting the access coordinates to node
  * @adist_notifier: notifier for calculating the abstract distance of node
  */
 struct cxl_region {
@@ -526,7 +526,7 @@ struct cxl_region {
 	unsigned long flags;
 	struct cxl_region_params params;
 	struct access_coordinate coord[ACCESS_COORDINATE_MAX];
-	struct notifier_block memory_notifier;
+	struct notifier_block node_notifier;
 	struct notifier_block adist_notifier;
 };
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 07/10] drivers,hmat: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (5 preceding siblings ...)
  2025-06-05 14:22 ` [PATCH v5 06/10] drivers,cxl: " Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-06 11:51   ` David Hildenbrand
  2025-06-05 14:22 ` [PATCH v5 08/10] kernel,cpuset: " Oscar Salvador
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

hmat driver is only concerned when a numa node changes its memory state,
specifically when a numa node with memory comes into play for the first
time, because it will register the memory_targets belonging to that numa
node.
So stop using the memory notifier and use the new numa node notifier
instead.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 drivers/acpi/numa/hmat.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 9d9052258e92..fe626e969fdc 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -962,10 +962,10 @@ static int hmat_callback(struct notifier_block *self,
 			 unsigned long action, void *arg)
 {
 	struct memory_target *target;
-	struct memory_notify *mnb = arg;
-	int pxm, nid = mnb->status_change_nid;
+	struct node_notify *nb = arg;
+	int pxm, nid = nb->nid;
 
-	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
+	if (nid == NUMA_NO_NODE || action != NODE_ADDED_FIRST_MEMORY)
 		return NOTIFY_OK;
 
 	pxm = node_to_pxm(nid);
@@ -1118,7 +1118,7 @@ static __init int hmat_init(void)
 	hmat_register_targets();
 
 	/* Keep the table and structures if the notifier may use them */
-	if (hotplug_memory_notifier(hmat_callback, HMAT_CALLBACK_PRI))
+	if (hotplug_node_notifier(hmat_callback, HMAT_CALLBACK_PRI))
 		goto out_put;
 
 	if (!hmat_set_default_dram_perf())
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 08/10] kernel,cpuset: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (6 preceding siblings ...)
  2025-06-05 14:22 ` [PATCH v5 07/10] drivers,hmat: " Oscar Salvador
@ 2025-06-05 14:22 ` Oscar Salvador
  2025-06-06 11:52   ` David Hildenbrand
  2025-06-05 14:23 ` [PATCH v5 09/10] mm,mempolicy: " Oscar Salvador
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

cpuset is only concerned when a numa node changes its memory state,
as it needs to know the current numa nodes with memory to keep
an updated mems_allowed mask.
So stop using the memory notifier and use the new numa node notifier
instead.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 kernel/cgroup/cpuset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 83639a12883d..66c84024f217 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -4013,7 +4013,7 @@ void __init cpuset_init_smp(void)
 	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);
 	top_cpuset.effective_mems = node_states[N_MEMORY];
 
-	hotplug_memory_notifier(cpuset_track_online_nodes, CPUSET_CALLBACK_PRI);
+	hotplug_node_notifier(cpuset_track_online_nodes, CPUSET_CALLBACK_PRI);
 
 	cpuset_migrate_mm_wq = alloc_ordered_workqueue("cpuset_migrate_mm", 0);
 	BUG_ON(!cpuset_migrate_mm_wq);
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 09/10] mm,mempolicy: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (7 preceding siblings ...)
  2025-06-05 14:22 ` [PATCH v5 08/10] kernel,cpuset: " Oscar Salvador
@ 2025-06-05 14:23 ` Oscar Salvador
  2025-06-09  6:47   ` Rakie Kim
  2025-06-05 14:23 ` [PATCH v5 10/10] mm,memory_hotplug: Rename status_change_nid parameter in memory_notify Oscar Salvador
  2025-06-06 11:30 ` [PATCH v5 00/10] Lorenzo Stoakes
  10 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

mempolicy is only concerned when a numa node changes its memory state,
because it needs to take this node into account for the auto-weighted
memory policy system.
So stop using the memory notifier and use the new numa node notifier
instead.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/mempolicy.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 72fd72e156b1..1b87628f3cfc 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3793,20 +3793,20 @@ static int wi_node_notifier(struct notifier_block *nb,
 			       unsigned long action, void *data)
 {
 	int err;
-	struct memory_notify *arg = data;
-	int nid = arg->status_change_nid;
+	struct node_notify *arg = data;
+	int nid = arg->nid;
 
 	if (nid < 0)
 		return NOTIFY_OK;
 
 	switch (action) {
-	case MEM_ONLINE:
+	case NODE_ADDED_FIRST_MEMORY:
 		err = sysfs_wi_node_add(nid);
 		if (err)
 			pr_err("failed to add sysfs for node%d during hotplug: %d\n",
 			       nid, err);
 		break;
-	case MEM_OFFLINE:
+	case NODE_REMOVED_LAST_MEMORY:
 		sysfs_wi_node_delete(nid);
 		break;
 	}
@@ -3845,7 +3845,7 @@ static int __init add_weighted_interleave_group(struct kobject *mempolicy_kobj)
 		}
 	}
 
-	hotplug_memory_notifier(wi_node_notifier, DEFAULT_CALLBACK_PRI);
+	hotplug_node_notifier(wi_node_notifier, DEFAULT_CALLBACK_PRI);
 	return 0;
 
 err_cleanup_kobj:
-- 
2.49.0
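
For readers skimming the series, the callback shape that the converted
consumers share can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not code from the patch: the
mock_* names and MOCK_NOTIFY_OK are hypothetical stand-ins, the bitmask
replaces the sysfs bookkeeping done by mempolicy, and the callback drops
the notifier_block argument for brevity. The action values mirror the
ones defined in patch 03/10.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct introduced in patch 03/10. */
struct node_notify {
	int nid;
};

/* Action values copied from patch 03/10 (only the two used here). */
#define NODE_ADDED_FIRST_MEMORY		(1 << 1)
#define NODE_REMOVED_LAST_MEMORY	(1 << 4)

/* Hypothetical stand-in for the kernel's NOTIFY_OK. */
#define MOCK_NOTIFY_OK			1

/* Hypothetical state: bit n set means node n is currently tracked. */
static unsigned long mock_tracked_nodes;

/*
 * Simplified consumer callback: it only reacts when a node gains its
 * first memory or loses its last memory, and ignores everything else.
 */
static int mock_wi_node_notifier(unsigned long action, void *data)
{
	struct node_notify *arg = data;
	int nid;

	assert(arg != NULL);
	nid = arg->nid;

	/* No node state change, or a node id the bitmask cannot hold. */
	if (nid < 0 || nid >= (int)(8 * sizeof(mock_tracked_nodes)))
		return MOCK_NOTIFY_OK;

	switch (action) {
	case NODE_ADDED_FIRST_MEMORY:
		mock_tracked_nodes |= 1UL << nid;
		break;
	case NODE_REMOVED_LAST_MEMORY:
		mock_tracked_nodes &= ~(1UL << nid);
		break;
	}

	return MOCK_NOTIFY_OK;
}
```

The point of the sketch is that the callback no longer has to filter
per-range online/offline events: every action it sees already
corresponds to a node state transition.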


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 10/10] mm,memory_hotplug: Rename status_change_nid parameter in memory_notify
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (8 preceding siblings ...)
  2025-06-05 14:23 ` [PATCH v5 09/10] mm,mempolicy: " Oscar Salvador
@ 2025-06-05 14:23 ` Oscar Salvador
  2025-06-06 11:48   ` David Hildenbrand
  2025-06-06 11:30 ` [PATCH v5 00/10] Lorenzo Stoakes
  10 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 14:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Oscar Salvador

The 'status_change_nid' field was used to track changes in the memory
state of a numa node, but that functionality has been decoupled from
memory_notify and moved to node_notify.
Current consumers of memory_notify are only interested in which node the
memory we are adding belongs to, so rename current 'status_change_nid'
to 'nid'.

Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 Documentation/core-api/memory-hotplug.rst |  9 ++-------
 include/linux/memory.h                    |  2 +-
 mm/memory_hotplug.c                       |  4 ++--
 mm/page_ext.c                             | 12 +-----------
 4 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/Documentation/core-api/memory-hotplug.rst b/Documentation/core-api/memory-hotplug.rst
index b19c3be7437d..97efb7b651ac 100644
--- a/Documentation/core-api/memory-hotplug.rst
+++ b/Documentation/core-api/memory-hotplug.rst
@@ -59,17 +59,12 @@ The third argument (arg) passes a pointer of struct memory_notify::
 	struct memory_notify {
 		unsigned long start_pfn;
 		unsigned long nr_pages;
-		int status_change_nid;
+		int nid;
 	}
 
 - start_pfn is start_pfn of online/offline memory.
 - nr_pages is # of pages of online/offline memory.
-- status_change_nid is set node id when N_MEMORY of nodemask is (will be)
-  set/clear. It means a new(memoryless) node gets new memory by online and a
-  node loses all memory. If this is -1, then nodemask status is not changed.
-
-  If status_changed_nid* >= 0, callback should create/discard structures for the
-  node if necessary.
+- nid is set to the node id, where the memory we are adding or removing belongs to.
 
 The callback routine shall return one of the values
 NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
diff --git a/include/linux/memory.h b/include/linux/memory.h
index a9ccd6579422..918c65ecf299 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -109,7 +109,7 @@ struct memory_notify {
 	unsigned long altmap_nr_pages;
 	unsigned long start_pfn;
 	unsigned long nr_pages;
-	int status_change_nid;
+	int nid;
 };
 
 struct notifier_block;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0550f3061fc4..bccbc02ed122 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1186,7 +1186,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 
 	mem_arg.start_pfn = pfn;
 	mem_arg.nr_pages = nr_pages;
-	mem_arg.status_change_nid = node_arg.nid;
+	mem_arg.nid = node_arg.nid;
 	cancel_mem_notifier_on_err = true;
 	ret = memory_notify(MEM_GOING_ONLINE, &mem_arg);
 	ret = notifier_to_errno(ret);
@@ -1987,7 +1987,7 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 
 	mem_arg.start_pfn = start_pfn;
 	mem_arg.nr_pages = nr_pages;
-	mem_arg.status_change_nid = node_arg.nid;
+	mem_arg.nid = node_arg.nid;
 	cancel_mem_notifier_on_err = true;
 	ret = memory_notify(MEM_GOING_OFFLINE, &mem_arg);
 	ret = notifier_to_errno(ret);
diff --git a/mm/page_ext.c b/mm/page_ext.c
index c351fdfe9e9a..477e6f24b7ab 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -378,16 +378,6 @@ static int __meminit online_page_ext(unsigned long start_pfn,
 	start = SECTION_ALIGN_DOWN(start_pfn);
 	end = SECTION_ALIGN_UP(start_pfn + nr_pages);
 
-	if (nid == NUMA_NO_NODE) {
-		/*
-		 * In this case, "nid" already exists and contains valid memory.
-		 * "start_pfn" passed to us is a pfn which is an arg for
-		 * online__pages(), and start_pfn should exist.
-		 */
-		nid = pfn_to_nid(start_pfn);
-		VM_BUG_ON(!node_online(nid));
-	}
-
 	for (pfn = start; !fail && pfn < end; pfn += PAGES_PER_SECTION)
 		fail = init_section_page_ext(pfn, nid);
 	if (!fail)
@@ -436,7 +426,7 @@ static int __meminit page_ext_callback(struct notifier_block *self,
 	switch (action) {
 	case MEM_GOING_ONLINE:
 		ret = online_page_ext(mn->start_pfn,
-				   mn->nr_pages, mn->status_change_nid);
+				   mn->nr_pages, mn->nid);
 		break;
 	case MEM_OFFLINE:
 		offline_page_ext(mn->start_pfn,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation
  2025-06-05 14:22 ` [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation Oscar Salvador
@ 2025-06-05 14:34   ` Vlastimil Babka
  2025-06-05 14:54   ` David Hildenbrand
  1 sibling, 0 replies; 31+ messages in thread
From: Vlastimil Babka @ 2025-06-05 14:34 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: David Hildenbrand, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

On 6/5/25 16:22, Oscar Salvador wrote:
> Now that the last user of status_change_nid_normal is gone, we can remove it.
> Update documentation accordingly.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation
  2025-06-05 14:22 ` [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation Oscar Salvador
  2025-06-05 14:34   ` Vlastimil Babka
@ 2025-06-05 14:54   ` David Hildenbrand
  2025-06-05 15:49     ` Oscar Salvador
  1 sibling, 1 reply; 31+ messages in thread
From: David Hildenbrand @ 2025-06-05 14:54 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

On 05.06.25 16:22, Oscar Salvador wrote:
> Now that the last user of status_change_nid_normal is gone, we can remove it.
> Update documentation accordingly.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
>   Documentation/core-api/memory-hotplug.rst            |  3 ---
>   .../translations/zh_CN/core-api/memory-hotplug.rst   |  3 ---

I'm running into similar issues with CN-only doc, which I will happily 
let bitrot, because I will not learn a new language just so I can update 
documentation.

... I raised in the past that having CN doc in the tree is absurdly stupid.

In your case, likely removing the doc works.

>   include/linux/memory.h                               |  1 -
>   mm/memory_hotplug.c                                  | 12 ------------
>   4 files changed, 19 deletions(-)
> 

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation
  2025-06-05 14:54   ` David Hildenbrand
@ 2025-06-05 15:49     ` Oscar Salvador
  0 siblings, 0 replies; 31+ messages in thread
From: Oscar Salvador @ 2025-06-05 15:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel

On Thu, Jun 05, 2025 at 04:54:00PM +0200, David Hildenbrand wrote:
> On 05.06.25 16:22, Oscar Salvador wrote:
> > Now that the last user of status_change_nid_normal is gone, we can remove it.
> > Update documentation accordingly.
> > 
> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> > ---
> >   Documentation/core-api/memory-hotplug.rst            |  3 ---
> >   .../translations/zh_CN/core-api/memory-hotplug.rst   |  3 ---
> 
> I'm running into similar issues with CN-only doc, which I will happily let
> bitrot, because I will not learn a new language just so I can update
> documentation.
> 
> ... I raised in the past that having CN doc in the tree is absurdly stupid.
> 
> In your case, likely removing the doc works.

Yeah, git send-email wasn't happy about this one and screamed something
about encoding.
I was this close to completely disregarding the CN docs, but since it was only
removing stuff, I went "meh, ok". :-)
I'm not entirely sure how up to date those are though, not only for
memory-hotplug but for other parts of the kernel.
 

-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 ` [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier Oscar Salvador
@ 2025-06-06  1:50   ` kernel test robot
  2025-06-06  7:51     ` Oscar Salvador
  2025-06-06 11:56   ` David Hildenbrand
  1 sibling, 1 reply; 31+ messages in thread
From: kernel test robot @ 2025-06-06  1:50 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: oe-kbuild-all, Linux Memory Management List, David Hildenbrand,
	Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-kernel, Oscar Salvador

Hi Oscar,

kernel test robot noticed the following build errors:

[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on driver-core/driver-core-next driver-core/driver-core-linus rafael-pm/linux-next rafael-pm/bleeding-edge tj-cgroup/for-next linus/master v6.15 next-20250605]
[cannot apply to akpm-mm/mm-everything vbabka-slab/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Oscar-Salvador/mm-memory_hotplug-Remove-status_change_nid_normal-and-update-documentation/20250605-232305
base:   driver-core/driver-core-testing
patch link:    https://lore.kernel.org/r/20250605142305.244465-5-osalvador%40suse.de
patch subject: [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier
config: riscv-randconfig-001-20250606 (https://download.01.org/0day-ci/archive/20250606/202506060918.HDCPogq9-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250606/202506060918.HDCPogq9-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506060918.HDCPogq9-lkp@intel.com/

All errors (new ones prefixed by >>):

   mm/slub.c: In function 'slab_mem_going_online_callback':
>> mm/slub.c:6168:23: error: invalid use of undefined type 'struct node_notify'
    6168 |         int nid = narg->nid;
         |                       ^~
   mm/slub.c: In function 'slab_memory_callback':
   mm/slub.c:6220:14: error: 'NODE_ADDING_FIRST_MEMORY' undeclared (first use in this function)
    6220 |         case NODE_ADDING_FIRST_MEMORY:
         |              ^~~~~~~~~~~~~~~~~~~~~~~~
   mm/slub.c:6220:14: note: each undeclared identifier is reported only once for each function it appears in
   mm/slub.c:6223:14: error: 'NODE_REMOVING_LAST_MEMORY' undeclared (first use in this function)
    6223 |         case NODE_REMOVING_LAST_MEMORY:
         |              ^~~~~~~~~~~~~~~~~~~~~~~~~
   mm/slub.c: In function 'kmem_cache_init':
   mm/slub.c:6300:9: error: implicit declaration of function 'hotplug_node_notifier'; did you mean 'hotplug_memory_notifier'? [-Werror=implicit-function-declaration]
    6300 |         hotplug_node_notifier(slab_memory_callback, SLAB_CALLBACK_PRI);
         |         ^~~~~~~~~~~~~~~~~~~~~
         |         hotplug_memory_notifier
   cc1: some warnings being treated as errors


vim +6168 mm/slub.c

  6162	
  6163	static int slab_mem_going_online_callback(void *arg)
  6164	{
  6165		struct kmem_cache_node *n;
  6166		struct kmem_cache *s;
  6167		struct node_notify *narg = arg;
> 6168		int nid = narg->nid;
  6169		int ret = 0;
  6170	
  6171		/*
  6172		 * If the node's memory is already available, then kmem_cache_node is
  6173		 * already created. Nothing to do.
  6174		 */
  6175		if (nid < 0)
  6176			return 0;
  6177	
  6178		/*
  6179		 * We are bringing a node online. No memory is available yet. We must
  6180		 * allocate a kmem_cache_node structure in order to bring the node
  6181		 * online.
  6182		 */
  6183		mutex_lock(&slab_mutex);
  6184		list_for_each_entry(s, &slab_caches, list) {
  6185			/*
  6186			 * The structure may already exist if the node was previously
  6187			 * onlined and offlined.
  6188			 */
  6189			if (get_node(s, nid))
  6190				continue;
  6191			/*
  6192			 * XXX: kmem_cache_alloc_node will fallback to other nodes
  6193			 *      since memory is not yet available from the node that
  6194			 *      is brought up.
  6195			 */
  6196			n = kmem_cache_alloc(kmem_cache_node, GFP_KERNEL);
  6197			if (!n) {
  6198				ret = -ENOMEM;
  6199				goto out;
  6200			}
  6201			init_kmem_cache_node(n);
  6202			s->node[nid] = n;
  6203		}
  6204		/*
  6205		 * Any cache created after this point will also have kmem_cache_node
  6206		 * initialized for the new node.
  6207		 */
  6208		node_set(nid, slab_nodes);
  6209	out:
  6210		mutex_unlock(&slab_mutex);
  6211		return ret;
  6212	}
  6213	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 03/10] mm,memory_hotplug: Implement numa node notifier
  2025-06-05 14:22 ` [PATCH v5 03/10] mm,memory_hotplug: Implement numa node notifier Oscar Salvador
@ 2025-06-06  7:50   ` Oscar Salvador
  0 siblings, 0 replies; 31+ messages in thread
From: Oscar Salvador @ 2025-06-06  7:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel

On Thu, Jun 05, 2025 at 04:22:54PM +0200, Oscar Salvador wrote:
> There are at least six consumers of hotplug_memory_notifier that what they
> really are interested in is whether any numa node changed its state, e.g: going
> from having memory to not having memory and vice versa.
> 
> Implement a specific notifier for numa nodes when their state gets changed,
> which will later be used by those consumers that are only interested
> in numa node state changes.
> 
> Add documentation as well.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
...
> diff --git a/include/linux/node.h b/include/linux/node.h
> index 2b7517892230..8c783269011d 100644
> --- a/include/linux/node.h
> +++ b/include/linux/node.h
> @@ -123,6 +123,48 @@ static inline void register_memory_blocks_under_node(int nid, unsigned long star
>  #endif
>  
>  extern void unregister_node(struct node *node);
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +struct node_notify {
> +	int nid;
> +};
> +
> +#define NODE_ADDING_FIRST_MEMORY                (1<<0)
> +#define NODE_ADDED_FIRST_MEMORY                 (1<<1)
> +#define NODE_CANCEL_ADDING_FIRST_MEMORY         (1<<2)
> +#define NODE_REMOVING_LAST_MEMORY               (1<<3)
> +#define NODE_REMOVED_LAST_MEMORY                (1<<4)
> +#define NODE_CANCEL_REMOVING_LAST_MEMORY        (1<<5)
> +
> +#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_NUMA)
> +extern int register_node_notifier(struct notifier_block *nb);
> +extern void unregister_node_notifier(struct notifier_block *nb);
> +extern int node_notify(unsigned long val, void *v);
> +
> +#define hotplug_node_notifier(fn, pri) ({		\
> +	static __meminitdata struct notifier_block fn##_node_nb =\
> +		{ .notifier_call = fn, .priority = pri };\
> +	register_node_notifier(&fn##_node_nb);			\
> +})
> +#else
> +static inline int register_node_notifier(struct notifier_block *nb)
> +{
> +	return 0;
> +}
> +static inline void unregister_node_notifier(struct notifier_block *nb)
> +{
> +}
> +static inline int node_notify(unsigned long val, void *v)
> +{
> +	return 0;
> +}
> +static inline int hotplug_node_notifier(notifier_fn_t fn, int pri)
> +{
> +	return 0;
> +}
> +#endif
> +#endif


I got carried away, sorry.
We need this fixup on top:

 diff --git a/include/linux/node.h b/include/linux/node.h
 index 8c783269011d..d7aa2636d948 100644
 --- a/include/linux/node.h
 +++ b/include/linux/node.h
 @@ -124,7 +124,6 @@ static inline void register_memory_blocks_under_node(int nid, unsigned long star
 
  extern void unregister_node(struct node *node);
 
 -#ifdef CONFIG_MEMORY_HOTPLUG
  struct node_notify {
  	int nid;
  };
 @@ -163,7 +162,6 @@ static inline int hotplug_node_notifier(notifier_fn_t fn, int pri)
  	return 0;
  }
  #endif
 -#endif
 
  #ifdef CONFIG_NUMA
  extern void node_dev_init(void);
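
As a rough illustration of the layout the fixup ends up with (a userspace
sketch, not the actual header): the struct and the action values are
visible unconditionally, and only the registration helper is stubbed out
when the feature is disabled, so callers such as slub compile in either
configuration. MOCK_NODE_NOTIFIER_ENABLED is a hypothetical macro
standing in for CONFIG_MEMORY_HOTPLUG && CONFIG_NUMA, struct
notifier_block is a simplified stand-in for the kernel type, and
example_callback is an invented caller.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	int priority;
};

/* Visible unconditionally, so every caller can dereference it. */
struct node_notify {
	int nid;
};

/* Action values as defined in patch 03/10, also unconditional. */
#define NODE_ADDING_FIRST_MEMORY		(1 << 0)
#define NODE_ADDED_FIRST_MEMORY			(1 << 1)
#define NODE_CANCEL_ADDING_FIRST_MEMORY		(1 << 2)
#define NODE_REMOVING_LAST_MEMORY		(1 << 3)
#define NODE_REMOVED_LAST_MEMORY		(1 << 4)
#define NODE_CANCEL_REMOVING_LAST_MEMORY	(1 << 5)

/*
 * MOCK_NODE_NOTIFIER_ENABLED is hypothetical; it plays the role of
 * CONFIG_MEMORY_HOTPLUG && CONFIG_NUMA. Only the registration helper
 * depends on it.
 */
#ifdef MOCK_NODE_NOTIFIER_ENABLED
int register_node_notifier(struct notifier_block *nb);
#else
static inline int register_node_notifier(struct notifier_block *nb)
{
	(void)nb;
	return 0;	/* nothing to register; report success */
}
#endif

/* Invented caller: returns the node id on first memory, -1 otherwise. */
static int example_callback(struct notifier_block *nb,
			    unsigned long action, void *data)
{
	struct node_notify *narg = data;

	(void)nb;
	assert(narg != NULL);

	if (action == NODE_ADDING_FIRST_MEMORY)
		return narg->nid;
	return -1;
}
```

With the struct and constants hidden behind the outer #ifdef, a caller
like example_callback would fail to compile on configurations without
memory hotplug, which matches the errors the test robot reported.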
 

-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier
  2025-06-06  1:50   ` kernel test robot
@ 2025-06-06  7:51     ` Oscar Salvador
  0 siblings, 0 replies; 31+ messages in thread
From: Oscar Salvador @ 2025-06-06  7:51 UTC (permalink / raw)
  To: kernel test robot
  Cc: Andrew Morton, oe-kbuild-all, Linux Memory Management List,
	David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-kernel

On Fri, Jun 06, 2025 at 09:50:39AM +0800, kernel test robot wrote:
> Hi Oscar,
> 
> kernel test robot noticed the following build errors:

Fixed, see:

https://lore.kernel.org/linux-mm/aEKdvc8IWgSXSF8Q@localhost.localdomain/T/#mb3acf8bc17463f621f2b85d688817acd65cef042

Thanks

-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 00/10]
  2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
                   ` (9 preceding siblings ...)
  2025-06-05 14:23 ` [PATCH v5 10/10] mm,memory_hotplug: Rename status_change_nid parameter in memory_notify Oscar Salvador
@ 2025-06-06 11:30 ` Lorenzo Stoakes
  2025-06-06 11:46   ` David Hildenbrand
  2025-06-06 12:31   ` Oscar Salvador
  10 siblings, 2 replies; 31+ messages in thread
From: Lorenzo Stoakes @ 2025-06-06 11:30 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Andrew Morton, David Hildenbrand, Vlastimil Babka,
	Jonathan Cameron, Harry Yoo, Rakie Kim, Hyeonggon Yoo, linux-mm,
	linux-kernel

Hi Oscar,

I don't have time to dig into what's broken here, but this series is breaking
the mm-new build.

NODE_REMOVED_LAST_MEMORY for instance doesn't seem to be defined, but there's a
bunch more errors.

Are you expecting stuff to land from other trees that isn't merged in Andrew's
tree yet? Maybe from slab tree?

We probably need to be careful with series that have dependencies like that
during the merge window, maybe RFC or wait until after merge window in that
case, or maybe add a note saying 'please don't merge to mm-new until after the
merge window' or something.

Thanks, Lorenzo

mm/slub.c: In function ‘slab_mem_going_online_callback’:
mm/slub.c:6168:23: error: invalid use of undefined type ‘struct node_notify’
 6168 |         int nid = narg->nid;
      |                       ^~
mm/mempolicy.c: In function ‘wi_node_notifier’:
mm/mempolicy.c:3792:22: error: invalid use of undefined type ‘struct node_notify’
 3792 |         int nid = arg->nid;
      |                      ^~
mm/slub.c: In function ‘slab_memory_callback’:
mm/slub.c:6220:14: error: ‘NODE_ADDING_FIRST_MEMORY’ undeclared (first use in this function)
 6220 |         case NODE_ADDING_FIRST_MEMORY:
      |              ^~~~~~~~~~~~~~~~~~~~~~~~
mm/slub.c:6220:14: note: each undeclared identifier is reported only once for each function it appears in
kernel/cgroup/cpuset.c: In function ‘cpuset_init_smp’:
kernel/cgroup/cpuset.c:4054:9: error: implicit declaration of function ‘hotplug_node_notifier’; did you mean ‘hotplug_memory_notifier’? [-Wimplicit-function-declaration]
 4054 |         hotplug_node_notifier(cpuset_track_online_nodes, CPUSET_CALLBACK_PRI);
      |         ^~~~~~~~~~~~~~~~~~~~~
      |         hotplug_memory_notifier
mm/mempolicy.c:3798:14: error: ‘NODE_ADDED_FIRST_MEMORY’ undeclared (first use in this function)
 3798 |         case NODE_ADDED_FIRST_MEMORY:
      |              ^~~~~~~~~~~~~~~~~~~~~~~
mm/mempolicy.c:3798:14: note: each undeclared identifier is reported only once for each function it appears in
mm/slub.c:6223:14: error: ‘NODE_REMOVING_LAST_MEMORY’ undeclared (first use in this function)
 6223 |         case NODE_REMOVING_LAST_MEMORY:
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~
mm/slub.c: In function ‘kmem_cache_init’:
mm/slub.c:6300:9: error: implicit declaration of function ‘hotplug_node_notifier’; did you mean ‘hotplug_memory_notifier’? [-Wimplicit-function-declaration]
 6300 |         hotplug_node_notifier(slab_memory_callback, SLAB_CALLBACK_PRI);
      |         ^~~~~~~~~~~~~~~~~~~~~
      |         hotplug_memory_notifier
make[4]: *** [scripts/Makefile.build:203: kernel/cgroup/cpuset.o] Error 1
make[3]: *** [scripts/Makefile.build:461: kernel/cgroup] Error 2
make[2]: *** [scripts/Makefile.build:461: kernel] Error 2
make[2]: *** Waiting for unfinished jobs....
mm/mempolicy.c:3804:14: error: ‘NODE_REMOVED_LAST_MEMORY’ undeclared (first use in this function)
 3804 |         case NODE_REMOVED_LAST_MEMORY:
      |              ^~~~~~~~~~~~~~~~~~~~~~~~
mm/mempolicy.c: In function ‘add_weighted_interleave_group’:
mm/mempolicy.c:3843:9: error: implicit declaration of function ‘hotplug_node_notifier’; did you mean ‘hotplug_memory_notifier’? [-Wimplicit-function-declaration]
 3843 |         hotplug_node_notifier(wi_node_notifier, DEFAULT_CALLBACK_PRI);
      |         ^~~~~~~~~~~~~~~~~~~~~
      |         hotplug_memory_notifier





On Thu, Jun 05, 2025 at 04:22:51PM +0200, Oscar Salvador wrote:
>  v4 -> v5:
>    - Split out conversion for different consumers (per David)
>    - Renamed node-notifier actions (per David)
>    - Added new Documentation for new node-notifier and updated
>      the memory-notifier one to reflect the changes
>    - Make sure we do not trigger anything when !CONFIG_NUMA (per David)
>
>  v3 -> v4:
>    - Fix typos pointed out by Alok Tiwari
>    - Further cleanups suggested by Vlastimil
>    - Add RBs-by from Vlastimil
>
>  v2 -> v3:
>    - Add Suggested-by (David)
>    - Replace last N_NORMAL_MEMORY mention in slub (David)
>    - Replace the notifier for autoweitght-mempolicy
>    - Fix build on !CONFIG_MEMORY_HOTPLUG
>
>  v1 -> v2:
>    - Remove status_change_nid_normal and the code that
>      deals with it (David & Vlastimil)
>    - Remove slab_mem_offline_callback (David & Vlastimil)
>    - Change the order of canceling the notifiers
>      in {online,offline}_pages (Vlastimil)
>    - Fix up a couple of whitespaces (Jonathan Cameron)
>    - Add RBs-by
>
> Memory notifier is a tool that allow consumers to get notified whenever
> memory gets onlined or offlined in the system.
> Currently, there are 10 consumers of that, but 5 out of those 10 consumers
> are only interested in getting notifications when a numa node changes its
> memory state.
> That means going from memoryless to memory-aware of vice versa.
>
> Which means that for every {online,offline}_pages operation they get
> notified even though the numa node might not have changed its state.
> This is suboptimal, and we want to decouple numa node state changes from
> memory state changes.
>
> While we are doing this, remove status_change_nid_normal, as the only
> current user (slub) does not really need it.
> This allows us to further simplify and clean up the code.
>
> The first patch gets rid of status_change_nid_normal in slub.
> The second patch implements a numa node notifier that does just that, and have
> those consumers register in there, so they get notified only when they are
> interested.
>
> The third patch replaces 'status_change_nid{_normal}' fields within
> memory_notify with a 'nid', as that is only what we need for memory
> notifer and update the only user of it (page_ext).
>
> Consumers that are only interested in numa node states change are:
>
>  - memory-tier
>  - slub
>  - cpuset
>  - hmat
>  - cxl
>  - autoweight-mempolicy
>
> Oscar Salvador (10):
>   mm,slub: Do not special case N_NORMAL nodes for slab_nodes
>   mm,memory_hotplug: Remove status_change_nid_normal and update
>     documentation
>   mm,memory_hotplug: Implement numa node notifier
>   mm,slub: Use node-notifier instead of memory-notifier
>   mm,memory-tiers: Use node-notifier instead of memory-notifier
>   drivers,cxl: Use node-notifier instead of memory-notifier
>   drivers,hmat: Use node-notifier instead of memory-notifier
>   kernel,cpuset: Use node-notifier instead of memory-notifier
>   mm,mempolicy: Use node-notifier instead of memory-notifier
>   mm,memory_hotplug: Rename status_change_nid parameter in memory_notify
>
>  Documentation/core-api/memory-hotplug.rst     |  78 ++++++--
>  .../zh_CN/core-api/memory-hotplug.rst         |   3 -
>  drivers/acpi/numa/hmat.c                      |   8 +-
>  drivers/base/node.c                           |  21 +++
>  drivers/cxl/core/region.c                     |  16 +-
>  drivers/cxl/cxl.h                             |   4 +-
>  include/linux/memory.h                        |   3 +-
>  include/linux/node.h                          |  42 +++++
>  kernel/cgroup/cpuset.c                        |   2 +-
>  mm/memory-tiers.c                             |  14 +-
>  mm/memory_hotplug.c                           | 167 ++++++++----------
>  mm/mempolicy.c                                |  10 +-
>  mm/page_ext.c                                 |  12 +-
>  mm/slub.c                                     |  45 +----
>  14 files changed, 240 insertions(+), 185 deletions(-)
>
> --
> 2.49.0
>
>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 00/10]
  2025-06-06 11:30 ` [PATCH v5 00/10] Lorenzo Stoakes
@ 2025-06-06 11:46   ` David Hildenbrand
  2025-06-06 12:31   ` Oscar Salvador
  1 sibling, 0 replies; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 11:46 UTC (permalink / raw)
  To: Lorenzo Stoakes, Oscar Salvador
  Cc: Andrew Morton, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel

On 06.06.25 13:30, Lorenzo Stoakes wrote:
> Hi Oscar,
> 
> I don't have time to dig into what's broken here, but this series is breaking
> the mm-new build.
> 
> NODE_REMOVED_LAST_MEMORY for instance doesn't seem to be defined, but there's a
> bunch more errors.
> 
> Are you expecting stuff to land from other trees that isn't merged in Andrew's
> tree yet? Maybe from slab tree?

(David replying)

No, this is standalone, and probably just an error in the patches.

IIUC, the build bots reported this last night as a reply to patch #4.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 10/10] mm,memory_hotplug: Rename status_change_nid parameter in memory_notify
  2025-06-05 14:23 ` [PATCH v5 10/10] mm,memory_hotplug: Rename status_change_nid parameter in memory_notify Oscar Salvador
@ 2025-06-06 11:48   ` David Hildenbrand
  0 siblings, 0 replies; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 11:48 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

>   struct notifier_block;
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0550f3061fc4..bccbc02ed122 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1186,7 +1186,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>   
>   	mem_arg.start_pfn = pfn;
>   	mem_arg.nr_pages = nr_pages;
> -	mem_arg.status_change_nid = node_arg.nid;
> +	mem_arg.nid = node_arg.nid;
>   	cancel_mem_notifier_on_err = true;
>   	ret = memory_notify(MEM_GOING_ONLINE, &mem_arg);
>   	ret = notifier_to_errno(ret);
> @@ -1987,7 +1987,7 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>   
>   	mem_arg.start_pfn = start_pfn;
>   	mem_arg.nr_pages = nr_pages;
> -	mem_arg.status_change_nid = node_arg.nid;
> +	mem_arg.nid = node_arg.nid;
>   	cancel_mem_notifier_on_err = true;
>   	ret = memory_notify(MEM_GOING_OFFLINE, &mem_arg);
>   	ret = notifier_to_errno(ret);

Okay, now I realize we should just remove the nid completely, because

> diff --git a/mm/page_ext.c b/mm/page_ext.c
> index c351fdfe9e9a..477e6f24b7ab 100644
> --- a/mm/page_ext.c
> +++ b/mm/page_ext.c
> @@ -378,16 +378,6 @@ static int __meminit online_page_ext(unsigned long start_pfn,
>   	start = SECTION_ALIGN_DOWN(start_pfn);
>   	end = SECTION_ALIGN_UP(start_pfn + nr_pages);
>   
> -	if (nid == NUMA_NO_NODE) {
> -		/*
> -		 * In this case, "nid" already exists and contains valid memory.
> -		 * "start_pfn" passed to us is a pfn which is an arg for
> -		 * online__pages(), and start_pfn should exist.
> -		 */
> -		nid = pfn_to_nid(start_pfn);
> -		VM_BUG_ON(!node_online(nid));
> -	}
> -
>   	for (pfn = start; !fail && pfn < end; pfn += PAGES_PER_SECTION)
>   		fail = init_section_page_ext(pfn, nid);
>   	if (!fail)
> @@ -436,7 +426,7 @@ static int __meminit page_ext_callback(struct notifier_block *self,
>   	switch (action) {
>   	case MEM_GOING_ONLINE:
>   		ret = online_page_ext(mn->start_pfn,
> -				   mn->nr_pages, mn->status_change_nid);
> +				   mn->nr_pages, mn->nid);

Nowadays we call move_pfn_range_to_zone() before MEM_GOING_ONLINE.

So we can simply do the

nid = pfn_to_nid(start_pfn);

unconditionally above.
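
A minimal userspace sketch of that suggestion, assuming the nid argument is
dropped from online_page_ext() altogether. The section size, the
pfn_to_nid() mapping and the init_section_page_ext() body are mocks invented
for illustration; only the function names and the loop shape come from the
quoted hunk, and the error handling is simplified.

```c
/* Userspace mock-up, not kernel code. */
#define PAGES_PER_SECTION	32768UL	/* placeholder section size */
#define SECTION_ALIGN_DOWN(pfn)	((pfn) & ~(PAGES_PER_SECTION - 1))
#define SECTION_ALIGN_UP(pfn)	\
	(((pfn) + PAGES_PER_SECTION - 1) & ~(PAGES_PER_SECTION - 1))

/* Mock mapping: pfns below 1 << 20 live on node 0, the rest on node 1. */
static int pfn_to_nid(unsigned long pfn)
{
	return pfn < (1UL << 20) ? 0 : 1;
}

static int sections_initialised;	/* bookkeeping for the demo only */
static int last_nid = -1;

/* Mock: record the node each section was initialised for. */
static int init_section_page_ext(unsigned long pfn, int nid)
{
	(void)pfn;
	last_nid = nid;
	sections_initialised++;
	return 0;
}

/* No nid parameter: the node is always derived from start_pfn. */
static int online_page_ext(unsigned long start_pfn, unsigned long nr_pages)
{
	unsigned long start = SECTION_ALIGN_DOWN(start_pfn);
	unsigned long end = SECTION_ALIGN_UP(start_pfn + nr_pages);
	int nid = pfn_to_nid(start_pfn);	/* unconditional lookup */
	unsigned long pfn;
	int fail = 0;

	for (pfn = start; !fail && pfn < end; pfn += PAGES_PER_SECTION)
		fail = init_section_page_ext(pfn, nid);

	return fail ? -1 : 0;	/* simplified; no rollback in this sketch */
}
```

With this shape, page_ext would no longer need any nid from memory_notify.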

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 05/10] mm,memory-tiers: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 ` [PATCH v5 05/10] mm,memory-tiers: " Oscar Salvador
@ 2025-06-06 11:50   ` David Hildenbrand
  0 siblings, 0 replies; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 11:50 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

On 05.06.25 16:22, Oscar Salvador wrote:
> memory-tier is only concerned when a numa node changes its memory state,
> because it then needs to re-create the demotion list.
> So stop using the memory notifier and use the new numa node notifer
> instead.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   mm/memory-tiers.c | 14 +++++++-------
>   1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
> index fc14fe53e9b7..67f06e6264a1 100644
> --- a/mm/memory-tiers.c
> +++ b/mm/memory-tiers.c
> @@ -872,25 +872,25 @@ static int __meminit memtier_hotplug_callback(struct notifier_block *self,
>   					      unsigned long action, void *_arg)
>   {
>   	struct memory_tier *memtier;
> -	struct memory_notify *arg = _arg;
> +	struct node_notify *narg = _arg;
>   
>   	/*
>   	 * Only update the node migration order when a node is
>   	 * changing status, like online->offline.
>   	 */
> -	if (arg->status_change_nid < 0)
> +	if (narg->nid < 0)
>   		return notifier_from_errno(0);

Ehm, why are we ever calling a node notifier with nid < 0 ?

We shouldn't do that.

Can we be adding the first / removing the last memory from something that ... is not a 
valid node? :)

Maybe it's already done that way; in that case, just drop this check here.
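
A minimal userspace sketch of the invariant being asked for, under the
assumption that the producer side stays silent when no node changes state.
demo_node_notify() and demo_callback() are hypothetical names, the action
values are placeholders, and the real notifier chain is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_OK	0x0001
#define NUMA_NO_NODE	(-1)

/* Placeholder values; only the action names come from the series. */
#define NODE_ADDED_FIRST_MEMORY		1
#define NODE_REMOVED_LAST_MEMORY	2

struct notifier_block;

struct node_notify {
	int nid;
};

typedef int (*notifier_fn_t)(struct notifier_block *nb,
			     unsigned long action, void *data);

static int rebuilds;	/* counts how often the consumer reacted */

/* Producer side: never call a consumer for an invalid node. */
static int demo_node_notify(notifier_fn_t fn, unsigned long action, int nid)
{
	struct node_notify arg = { .nid = nid };

	if (nid == NUMA_NO_NODE)
		return NOTIFY_OK;	/* no node changed state, nothing to say */

	return fn(NULL, action, &arg);
}

/* Consumer side: no "nid < 0" early return is needed any more. */
static int demo_callback(struct notifier_block *self,
			 unsigned long action, void *_arg)
{
	struct node_notify *narg = _arg;

	(void)self;
	assert(narg->nid >= 0);	/* guaranteed by the producer above */

	switch (action) {
	case NODE_ADDED_FIRST_MEMORY:
	case NODE_REMOVED_LAST_MEMORY:
		rebuilds++;	/* stand-in for the real work */
		break;
	}
	return NOTIFY_OK;
}
```

If the filtering already happens on the producer side, the check in the
consumer is dead code and can simply go.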

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 06/10] drivers,cxl: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 ` [PATCH v5 06/10] drivers,cxl: " Oscar Salvador
@ 2025-06-06 11:51   ` David Hildenbrand
  0 siblings, 0 replies; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 11:51 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

On 05.06.25 16:22, Oscar Salvador wrote:
> memory-tier is only concerned when a numa node changes its memory state,
> specifically when a numa node with memory comes into play for the first
> time, because it needs to get its performance attributes to build a proper
> demotion chain.
> So stop using the memory notifier and use the new numa node notifer
> instead.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   drivers/cxl/core/region.c | 16 ++++++++--------
>   drivers/cxl/cxl.h         |  4 ++--
>   2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c3f4dc244df7..a8477a3e175c 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2432,12 +2432,12 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
>   					  unsigned long action, void *arg)
>   {
>   	struct cxl_region *cxlr = container_of(nb, struct cxl_region,
> -					       memory_notifier);
> -	struct memory_notify *mnb = arg;
> -	int nid = mnb->status_change_nid;
> +					       node_notifier);
> +	struct node_notify *mnb = arg;
> +	int nid = mnb->nid;
>   	int region_nid;
>   
> -	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
> +	if (nid == NUMA_NO_NODE || action != NODE_ADDED_FIRST_MEMORY)


Ditto, one would expect "nid == NUMA_NO_NODE" to never even happen here.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 07/10] drivers,hmat: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 ` [PATCH v5 07/10] drivers,hmat: " Oscar Salvador
@ 2025-06-06 11:51   ` David Hildenbrand
  2025-06-07 22:59     ` Andrew Morton
  0 siblings, 1 reply; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 11:51 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

On 05.06.25 16:22, Oscar Salvador wrote:
> hmat driver is only concerned when a numa node changes its memory state,
> specifically when a numa node with memory comes into play for the first
> time, because it will register the memory_targets belonging to that numa
> node.
> So stop using the memory notifier and use the new numa node notifer
> instead.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   drivers/acpi/numa/hmat.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
> index 9d9052258e92..fe626e969fdc 100644
> --- a/drivers/acpi/numa/hmat.c
> +++ b/drivers/acpi/numa/hmat.c
> @@ -962,10 +962,10 @@ static int hmat_callback(struct notifier_block *self,
>   			 unsigned long action, void *arg)
>   {
>   	struct memory_target *target;
> -	struct memory_notify *mnb = arg;
> -	int pxm, nid = mnb->status_change_nid;
> +	struct node_notify *nb = arg;
> +	int pxm, nid = nb->nid;
>   
> -	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
> +	if (nid == NUMA_NO_NODE || action != NODE_ADDED_FIRST_MEMORY)

Same comment :)

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 08/10] kernel,cpuset: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 ` [PATCH v5 08/10] kernel,cpuset: " Oscar Salvador
@ 2025-06-06 11:52   ` David Hildenbrand
  0 siblings, 0 replies; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 11:52 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

On 05.06.25 16:22, Oscar Salvador wrote:
> cpuset is only concerned when a numa node changes its memory state,
> as it needs to know the current numa nodes with memory to keep
> an updated mems_allowed mask.
> So stop using the memory notifier and use the new numa node notifer
> instead.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   kernel/cgroup/cpuset.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 83639a12883d..66c84024f217 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -4013,7 +4013,7 @@ void __init cpuset_init_smp(void)
>   	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);
>   	top_cpuset.effective_mems = node_states[N_MEMORY];
>   
> -	hotplug_memory_notifier(cpuset_track_online_nodes, CPUSET_CALLBACK_PRI);
> +	hotplug_node_notifier(cpuset_track_online_nodes, CPUSET_CALLBACK_PRI);

Interestingly, cpuset_track_online_nodes() is only concerned about 
"after node_states[N_MEMORY] change", so maybe we could filter out more 
events .. maybe.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier
  2025-06-05 14:22 ` [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier Oscar Salvador
  2025-06-06  1:50   ` kernel test robot
@ 2025-06-06 11:56   ` David Hildenbrand
  2025-06-06 12:28     ` Oscar Salvador
  1 sibling, 1 reply; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 11:56 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: Vlastimil Babka, Jonathan Cameron, Harry Yoo, Rakie Kim,
	Hyeonggon Yoo, linux-mm, linux-kernel

On 05.06.25 16:22, Oscar Salvador wrote:
> slub is only concerned when a numa node changes its memory state,
> so stop using the memory notifier and use the new numa node notifer
> instead.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   mm/slub.c | 13 +++++--------
>   1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index f92b43d36adc..b8b5b81bfd1a 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -6164,8 +6164,8 @@ static int slab_mem_going_online_callback(void *arg)
>   {
>   	struct kmem_cache_node *n;
>   	struct kmem_cache *s;
> -	struct memory_notify *marg = arg;
> -	int nid = marg->status_change_nid;
> +	struct node_notify *narg = arg;
> +	int nid = narg->nid;
>   	int ret = 0;
>   
>   	/*
> @@ -6217,15 +6217,12 @@ static int slab_memory_callback(struct notifier_block *self,
>   	int ret = 0;
>   
>   	switch (action) {
> -	case MEM_GOING_ONLINE:
> +	case NODE_ADDING_FIRST_MEMORY:
>   		ret = slab_mem_going_online_callback(arg);

In slab_mem_going_online_callback we will cast arg to "struct 
memory_notify", no?

Probably needs to get fixed.

... and probably best to pass marg directly.

>   		break;
> -	case MEM_GOING_OFFLINE:
> +	case NODE_REMOVING_LAST_MEMORY:
>   		ret = slab_mem_going_offline_callback(arg);

slab_mem_going_offline_callback() doesn't even look at arg, so likely we 
can drop that parameter?


-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier
  2025-06-06 11:56   ` David Hildenbrand
@ 2025-06-06 12:28     ` Oscar Salvador
  2025-06-06 12:35       ` David Hildenbrand
  0 siblings, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-06 12:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel

On Fri, Jun 06, 2025 at 01:56:15PM +0200, David Hildenbrand wrote:
> > @@ -6217,15 +6217,12 @@ static int slab_memory_callback(struct notifier_block *self,
> >   	int ret = 0;
> >   	switch (action) {
> > -	case MEM_GOING_ONLINE:
> > +	case NODE_ADDING_FIRST_MEMORY:
> >   		ret = slab_mem_going_online_callback(arg);
> 
> In slab_mem_going_online_callback we will cast arg to "struct
> memory_notify", no?

Uhm... not sure if I understood this correctly but slab_mem_going_online_callback looks
like this:

 static int slab_mem_going_online_callback(void *arg)
 {
         struct kmem_cache_node *n;
         struct kmem_cache *s;
         struct node_notify *narg = arg;
         int nid = narg->nid;
         int ret = 0;



> Probably needs to get fixed.
> 
> ... and probably best to pass marg directly.

You mean to cast it directly in slab_memory_callback and pass 'narg'
to slab_mem_going_online_callback?


> >   		break;
> > -	case MEM_GOING_OFFLINE:
> > +	case NODE_REMOVING_LAST_MEMORY:
> >   		ret = slab_mem_going_offline_callback(arg);
> 
> slab_mem_going_offline_callback() doesn't even look at arg, so likely we can
> drop that parameter?

Sure.

Thanks for the feedback!

 

-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 00/10]
  2025-06-06 11:30 ` [PATCH v5 00/10] Lorenzo Stoakes
  2025-06-06 11:46   ` David Hildenbrand
@ 2025-06-06 12:31   ` Oscar Salvador
  2025-06-06 12:45     ` Lorenzo Stoakes
  1 sibling, 1 reply; 31+ messages in thread
From: Oscar Salvador @ 2025-06-06 12:31 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Vlastimil Babka,
	Jonathan Cameron, Harry Yoo, Rakie Kim, Hyeonggon Yoo, linux-mm,
	linux-kernel

On Fri, Jun 06, 2025 at 12:30:42PM +0100, Lorenzo Stoakes wrote:
> Hi Oscar,
> 
> I don't have time to dig into what's broken here, but this series is breaking
> the mm-new build.
> 
> NODE_REMOVED_LAST_MEMORY for instance doesn't seem to be defined, but there's a
> bunch more errors.

Heh, I apologize, I assumed every config had MEMORY_HOTPLUG enabled.
(I'll walk on my knees all day long to make up for that!)

Fixup was posted this morning in 

https://lore.kernel.org/linux-mm/aEKdvc8IWgSXSF8Q@localhost.localdomain/T/#u

But we can drop the patchset for now as I'll have to respin a new
version including David's feedback.
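
For reference, the header layout that the fix-up hunk against
include/linux/node.h moves towards can be sketched in userspace as follows.
This is only an illustration under stated assumptions: HAVE_MEMORY_HOTPLUG
stands in for CONFIG_MEMORY_HOTPLUG, the action values are placeholders,
registered_fn and demo_node_callback() are hypothetical, and no real
notifier chain is involved.

```c
/* Userspace mock-up of the header layout, not the kernel header itself. */
struct notifier_block;

typedef int (*notifier_fn_t)(struct notifier_block *nb,
			     unsigned long action, void *data);

/* Visible in every configuration: consumers dereference it either way. */
struct node_notify {
	int nid;
};

/* Placeholder values; only the names appear in the build log. */
#define NODE_ADDING_FIRST_MEMORY	(1 << 0)
#define NODE_REMOVING_LAST_MEMORY	(1 << 1)
#define NODE_ADDED_FIRST_MEMORY		(1 << 2)
#define NODE_REMOVED_LAST_MEMORY	(1 << 3)

#ifdef HAVE_MEMORY_HOTPLUG
static notifier_fn_t registered_fn;	/* hypothetical one-slot "chain" */

static inline int hotplug_node_notifier(notifier_fn_t fn, int pri)
{
	(void)pri;
	registered_fn = fn;
	return 0;
}
#else
/* Stub: registration is a no-op, but callers still compile. */
static inline int hotplug_node_notifier(notifier_fn_t fn, int pri)
{
	(void)fn;
	(void)pri;
	return 0;
}
#endif

static int last_nid = -1;

/* A consumer that has to build with and without hotplug support. */
static int demo_node_callback(struct notifier_block *nb,
			      unsigned long action, void *data)
{
	struct node_notify *narg = data;

	(void)nb;
	if (action == NODE_ADDED_FIRST_MEMORY)
		last_nid = narg->nid;
	return 0;
}
```

Compiling this with and without -DHAVE_MEMORY_HOTPLUG exercises both
branches; in neither case does the consumer see an undefined type or an
undeclared action.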


-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier
  2025-06-06 12:28     ` Oscar Salvador
@ 2025-06-06 12:35       ` David Hildenbrand
  0 siblings, 0 replies; 31+ messages in thread
From: David Hildenbrand @ 2025-06-06 12:35 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Andrew Morton, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel

On 06.06.25 14:28, Oscar Salvador wrote:
> On Fri, Jun 06, 2025 at 01:56:15PM +0200, David Hildenbrand wrote:
>>> @@ -6217,15 +6217,12 @@ static int slab_memory_callback(struct notifier_block *self,
>>>    	int ret = 0;
>>>    	switch (action) {
>>> -	case MEM_GOING_ONLINE:
>>> +	case NODE_ADDING_FIRST_MEMORY:
>>>    		ret = slab_mem_going_online_callback(arg);
>>
>> In slab_mem_going_online_callback we will cast arg to "struct
>> memory_notify", no?
> 
> Uhm... not sure if I understood this correctly but slab_mem_going_online_callback looks
> like this:
> 
>   static int slab_mem_going_online_callback(void *arg)
>   {
>           struct kmem_cache_node *n;
>           struct kmem_cache *s;
>           struct node_notify *narg = arg;
>           int nid = narg->nid;
>           int ret = 0;
> 

I'm stupid and missed that hunk, sorry.

> 
> 
>> Probably needs to get fixed.
>>
>> ... and probably best to pass marg directly.
> 
> You mean to cast it directly in slab_memory_callback and pass 'narg'
> to slab_mem_going_online_callback?

Yes :)
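
A minimal userspace sketch of the two points agreed on above: do the cast
once in slab_memory_callback() and hand the typed 'narg' to the online
helper, and drop the unused parameter from the offline helper. The helper
bodies, the action values and the two bookkeeping variables are stand-ins
invented for illustration, not the slub code.

```c
struct notifier_block;	/* opaque in this sketch */

struct node_notify {
	int nid;
};

/* Placeholder values; only the action names come from the patch. */
enum {
	NODE_ADDING_FIRST_MEMORY = 1,
	NODE_REMOVING_LAST_MEMORY = 2,
};

static int onlined_nid = -1;	/* records what the stand-in helpers saw */
static int offline_calls;

/* Takes the typed argument, so no cast is needed inside the helper. */
static int slab_mem_going_online_callback(struct node_notify *narg)
{
	onlined_nid = narg->nid;
	return 0;
}

/* Never looked at its argument, so it no longer takes one. */
static int slab_mem_going_offline_callback(void)
{
	offline_calls++;
	return 0;
}

static int slab_memory_callback(struct notifier_block *self,
				unsigned long action, void *arg)
{
	struct node_notify *narg = arg;	/* the single cast, in the dispatcher */
	int ret = 0;

	(void)self;
	switch (action) {
	case NODE_ADDING_FIRST_MEMORY:
		ret = slab_mem_going_online_callback(narg);
		break;
	case NODE_REMOVING_LAST_MEMORY:
		ret = slab_mem_going_offline_callback();
		break;
	}

	/* The sketch returns the raw errno-style value for simplicity. */
	return ret;
}
```

Keeping the void pointer confined to the dispatcher means a mismatch between
the notifier type and the helper shows up at compile time.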



-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 00/10]
  2025-06-06 12:31   ` Oscar Salvador
@ 2025-06-06 12:45     ` Lorenzo Stoakes
  0 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Stoakes @ 2025-06-06 12:45 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Andrew Morton, David Hildenbrand, Vlastimil Babka,
	Jonathan Cameron, Harry Yoo, Rakie Kim, Hyeonggon Yoo, linux-mm,
	linux-kernel

Andrew - as per below could you drop this for now to fix the build?

On Fri, Jun 06, 2025 at 02:31:36PM +0200, Oscar Salvador wrote:
> On Fri, Jun 06, 2025 at 12:30:42PM +0100, Lorenzo Stoakes wrote:
> > Hi Oscar,
> >
> > I don't have time to dig into what's broken here, but this series is breaking
> > the mm-new build.
> >
> > NODE_REMOVED_LAST_MEMORY for instance doesn't seem to be defined, but there's a
> > bunch more errors.
>
> Heh, I apologye, I assumed every config has MEMORY_HOTPLUG enabled.
> (I'll walk on my knees all day long to make up for that!)
>
> Fixup was posted this morning in
>
> https://lore.kernel.org/linux-mm/aEKdvc8IWgSXSF8Q@localhost.localdomain/T/#u
>
> But we can drop the patchset for now as I'll have to respin a new
> version including David's feedback.

Thanks!

Yeah sorry, I tried grepping for a define and it wasn't there, but I guess I
wasn't grepping at mm-new (I rushed it, was in the middle of looking at a syzbot
report :P).

Also a small bugbear, there's no subject in the cover letter :P I mean I
appreciate the beauty of silence as much as the next fellow but the cover letter
subject line is probably not the best place for it ;)

I speak as somebody who's made literally every possible post-send-to-list error
known to humanity, of course :>)

>
>
> --
> Oscar Salvador
> SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 07/10] drivers,hmat: Use node-notifier instead of memory-notifier
  2025-06-06 11:51   ` David Hildenbrand
@ 2025-06-07 22:59     ` Andrew Morton
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Morton @ 2025-06-07 22:59 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel

On Fri, 6 Jun 2025 13:51:35 +0200 David Hildenbrand <david@redhat.com> wrote:

> > -	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
> > +	if (nid == NUMA_NO_NODE || action != NODE_ADDED_FIRST_MEMORY)
> 
> Same comment :)

Thanks.  It appears that quite a few updates are coming so I'll remove
the v5 series from mm.git.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 09/10] mm,mempolicy: Use node-notifier instead of memory-notifier
  2025-06-05 14:23 ` [PATCH v5 09/10] mm,mempolicy: " Oscar Salvador
@ 2025-06-09  6:47   ` Rakie Kim
  0 siblings, 0 replies; 31+ messages in thread
From: Rakie Kim @ 2025-06-09  6:47 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: David Hildenbrand, Vlastimil Babka, Jonathan Cameron, Harry Yoo,
	Rakie Kim, Hyeonggon Yoo, linux-mm, linux-kernel, Andrew Morton,
	kernel_team

On Thu,  5 Jun 2025 16:23:00 +0200 Oscar Salvador <osalvador@suse.de> wrote:
> mempolicy is only concerned when a numa node changes its memory state,
> because it needs to take this node into account for the auto-weighted
> memory policy system.
> So stop using the memory notifier and use the new numa node notifer
> instead.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/mempolicy.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 72fd72e156b1..1b87628f3cfc 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -3793,20 +3793,20 @@ static int wi_node_notifier(struct notifier_block *nb,
>  			       unsigned long action, void *data)
>  {
>  	int err;
> -	struct memory_notify *arg = data;
> -	int nid = arg->status_change_nid;
> +	struct node_notify *arg = data;
> +	int nid = arg->nid;
>  
>  	if (nid < 0)
>  		return NOTIFY_OK;
>  
>  	switch (action) {
> -	case MEM_ONLINE:
> +	case NODE_ADDED_FIRST_MEMORY:
>  		err = sysfs_wi_node_add(nid);
>  		if (err)
>  			pr_err("failed to add sysfs for node%d during hotplug: %d\n",
>  			       nid, err);
>  		break;
> -	case MEM_OFFLINE:
> +	case NODE_REMOVED_LAST_MEMORY:
>  		sysfs_wi_node_delete(nid);
>  		break;
>  	}
> @@ -3845,7 +3845,7 @@ static int __init add_weighted_interleave_group(struct kobject *mempolicy_kobj)
>  		}
>  	}
>  
> -	hotplug_memory_notifier(wi_node_notifier, DEFAULT_CALLBACK_PRI);
> +	hotplug_node_notifier(wi_node_notifier, DEFAULT_CALLBACK_PRI);
>  	return 0;
>  
>  err_cleanup_kobj:
> -- 
> 2.49.0
> 

Reviewed-by: Rakie Kim <rakie.kim@sk.com>

Rakie


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-06-09  6:48 UTC | newest]

Thread overview: 31+ messages
-- links below jump to the message on this page --
2025-06-05 14:22 [PATCH v5 00/10] Oscar Salvador
2025-06-05 14:22 ` [PATCH v5 01/10] mm,slub: Do not special case N_NORMAL nodes for slab_nodes Oscar Salvador
2025-06-05 14:22 ` [PATCH v5 02/10] mm,memory_hotplug: Remove status_change_nid_normal and update documentation Oscar Salvador
2025-06-05 14:34   ` Vlastimil Babka
2025-06-05 14:54   ` David Hildenbrand
2025-06-05 15:49     ` Oscar Salvador
2025-06-05 14:22 ` [PATCH v5 03/10] mm,memory_hotplug: Implement numa node notifier Oscar Salvador
2025-06-06  7:50   ` Oscar Salvador
2025-06-05 14:22 ` [PATCH v5 04/10] mm,slub: Use node-notifier instead of memory-notifier Oscar Salvador
2025-06-06  1:50   ` kernel test robot
2025-06-06  7:51     ` Oscar Salvador
2025-06-06 11:56   ` David Hildenbrand
2025-06-06 12:28     ` Oscar Salvador
2025-06-06 12:35       ` David Hildenbrand
2025-06-05 14:22 ` [PATCH v5 05/10] mm,memory-tiers: " Oscar Salvador
2025-06-06 11:50   ` David Hildenbrand
2025-06-05 14:22 ` [PATCH v5 06/10] drivers,cxl: " Oscar Salvador
2025-06-06 11:51   ` David Hildenbrand
2025-06-05 14:22 ` [PATCH v5 07/10] drivers,hmat: " Oscar Salvador
2025-06-06 11:51   ` David Hildenbrand
2025-06-07 22:59     ` Andrew Morton
2025-06-05 14:22 ` [PATCH v5 08/10] kernel,cpuset: " Oscar Salvador
2025-06-06 11:52   ` David Hildenbrand
2025-06-05 14:23 ` [PATCH v5 09/10] mm,mempolicy: " Oscar Salvador
2025-06-09  6:47   ` Rakie Kim
2025-06-05 14:23 ` [PATCH v5 10/10] mm,memory_hotplug: Rename status_change_nid parameter in memory_notify Oscar Salvador
2025-06-06 11:48   ` David Hildenbrand
2025-06-06 11:30 ` [PATCH v5 00/10] Lorenzo Stoakes
2025-06-06 11:46   ` David Hildenbrand
2025-06-06 12:31   ` Oscar Salvador
2025-06-06 12:45     ` Lorenzo Stoakes
