* [PATCH v4 1/9] mm/memory: add memory_block_aligned_range() helper
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
2026-06-09 9:50 ` David Hildenbrand (Arm)
2026-06-05 21:19 ` [PATCH v4 2/9] mm/memory_hotplug: pass online_type to online_memory_block() via arg Gregory Price
` (7 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
Memory hotplug operations require ranges aligned to memory block
boundaries. This is a generic operation for hotplug.
Add memory_block_aligned_range() as a common helper in <linux/memory.h>
that aligns the start address up and end address down to memory block
boundaries.
Update dax/kmem to use this helper.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/dax/kmem.c | 4 +---
include/linux/memory.h | 22 ++++++++++++++++++++++
2 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index a18e2b968e4d..592171ec10f4 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -33,9 +33,7 @@ static int dax_kmem_range(struct dev_dax *dev_dax, int i, struct range *r)
struct dev_dax_range *dax_range = &dev_dax->ranges[i];
struct range *range = &dax_range->range;
- /* memory-block align the hotplug range */
- r->start = ALIGN(range->start, memory_block_size_bytes());
- r->end = ALIGN_DOWN(range->end + 1, memory_block_size_bytes()) - 1;
+ *r = memory_block_aligned_range(range);
if (r->start >= r->end) {
r->start = range->start;
r->end = range->end;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 463dc02f6cff..9f5ef0309f77 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -20,6 +20,7 @@
#include <linux/compiler.h>
#include <linux/mutex.h>
#include <linux/memory_hotplug.h>
+#include <linux/range.h>
#define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS)
@@ -100,6 +101,27 @@ int arch_get_memory_phys_device(unsigned long start_pfn);
unsigned long memory_block_size_bytes(void);
int set_memory_block_size_order(unsigned int order);
+/**
+ * memory_block_aligned_range - align a physical address range to memory blocks
+ * @range: the input range to align
+ *
+ * Aligns the start address up and the end address down to memory block
+ * boundaries. This is required for memory hotplug operations which must
+ * operate on memory-block aligned ranges.
+ *
+ * Returns the aligned range. Callers should check that the returned
+ * range is valid (aligned.start < aligned.end) before using it.
+ */
+static inline struct range memory_block_aligned_range(const struct range *range)
+{
+ struct range aligned;
+
+ aligned.start = ALIGN(range->start, memory_block_size_bytes());
+ aligned.end = ALIGN_DOWN(range->end + 1, memory_block_size_bytes()) - 1;
+
+ return aligned;
+}
+
struct memory_notify {
unsigned long start_pfn;
unsigned long nr_pages;
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [PATCH v4 1/9] mm/memory: add memory_block_aligned_range() helper
2026-06-05 21:19 ` [PATCH v4 1/9] mm/memory: add memory_block_aligned_range() helper Gregory Price
@ 2026-06-09 9:50 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-09 9:50 UTC (permalink / raw)
To: Gregory Price, linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, ljs, liam, vbabka, rppt, surenb,
mhocko, osalvador, shuah, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
On 6/5/26 23:19, Gregory Price wrote:
> Memory hotplug operations require ranges aligned to memory block
> boundaries. This is a generic operation for hotplug.
>
> Add memory_block_aligned_range() as a common helper in <linux/memory.h>
> that aligns the start address up and end address down to memory block
> boundaries.
>
> Update dax/kmem to use this helper.
>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4 2/9] mm/memory_hotplug: pass online_type to online_memory_block() via arg
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
2026-06-05 21:19 ` [PATCH v4 1/9] mm/memory: add memory_block_aligned_range() helper Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
2026-06-05 21:19 ` [PATCH v4 3/9] mm/memory_hotplug: export mhp_get_default_online_type Gregory Price
` (6 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
Modify online_memory_block() to accept the online type through its arg
parameter rather than calling mhp_get_default_online_type() internally.
This prepares for allowing callers to specify explicit online types.
Update the caller in add_memory_resource() to pass the default online
type via a local variable.
No functional change.
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Gregory Price <gourry@gourry.net>
---
mm/memory_hotplug.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7ac19fab2263..6833208cc17c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1337,7 +1337,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
static int online_memory_block(struct memory_block *mem, void *arg)
{
- mem->online_type = mhp_get_default_online_type();
+ enum mmop *online_type = arg;
+
+ mem->online_type = *online_type;
return device_online(&mem->dev);
}
@@ -1494,6 +1496,7 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
{
struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
+ enum mmop online_type = mhp_get_default_online_type();
enum memblock_flags memblock_flags = MEMBLOCK_NONE;
struct memory_group *group = NULL;
u64 start, size;
@@ -1582,7 +1585,8 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
/* online pages if requested */
if (mhp_get_default_online_type() != MMOP_OFFLINE)
- walk_memory_blocks(start, size, NULL, online_memory_block);
+ walk_memory_blocks(start, size, &online_type,
+ online_memory_block);
return ret;
error:
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* [PATCH v4 3/9] mm/memory_hotplug: export mhp_get_default_online_type
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
2026-06-05 21:19 ` [PATCH v4 1/9] mm/memory: add memory_block_aligned_range() helper Gregory Price
2026-06-05 21:19 ` [PATCH v4 2/9] mm/memory_hotplug: pass online_type to online_memory_block() via arg Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
2026-06-09 9:52 ` David Hildenbrand (Arm)
2026-06-05 21:19 ` [PATCH v4 4/9] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg Gregory Price
` (5 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
Drivers which may pass hotplug policy down to DAX need MMOP_ symbols
and the mhp_get_default_online_type function for hotplug use cases.
Some drivers (cxl) co-mingle their hotplug and devdax use-cases into
the same driver code, and chose the dax_kmem path as the default driver
path - making it difficult to require hotplug as a predicate to building
the overall driver (it may break other non-hotplug use-cases).
Export mhp_get_default_online_type function to allow these drivers to
build when hotplug is disabled and still use the DAX use case.
In the built-out case we simply return MMOP_OFFLINE as it's
non-destructive. The internal function can never return -1 either,
so we choose this to allow for defining the function with 'enum mmop'.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
include/linux/memory_hotplug.h | 2 ++
mm/memory_hotplug.c | 1 +
2 files changed, 3 insertions(+)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7c9d66729c60..f059025f8f8b 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -316,6 +316,8 @@ extern struct zone *zone_for_pfn_range(enum mmop online_type,
extern int arch_create_linear_mapping(int nid, u64 start, u64 size,
struct mhp_params *params);
void arch_remove_linear_mapping(u64 start, u64 size);
+#else
+static inline enum mmop mhp_get_default_online_type(void) { return MMOP_OFFLINE; }
#endif /* CONFIG_MEMORY_HOTPLUG */
#endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6833208cc17c..494257054095 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -239,6 +239,7 @@ enum mmop mhp_get_default_online_type(void)
return mhp_default_online_type;
}
+EXPORT_SYMBOL_GPL(mhp_get_default_online_type);
void mhp_set_default_online_type(enum mmop online_type)
{
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [PATCH v4 3/9] mm/memory_hotplug: export mhp_get_default_online_type
2026-06-05 21:19 ` [PATCH v4 3/9] mm/memory_hotplug: export mhp_get_default_online_type Gregory Price
@ 2026-06-09 9:52 ` David Hildenbrand (Arm)
2026-06-09 15:11 ` Gregory Price
0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-09 9:52 UTC (permalink / raw)
To: Gregory Price, linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, ljs, liam, vbabka, rppt, surenb,
mhocko, osalvador, shuah, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
On 6/5/26 23:19, Gregory Price wrote:
> Drivers which may pass hotplug policy down to DAX need MMOP_ symbols
> and the mhp_get_default_online_type function for hotplug use cases.
>
> Some drivers (cxl) co-mingle their hotplug and devdax use-cases into
> the same driver code, and chose the dax_kmem path as the default driver
> path - making it difficult to require hotplug as a predicate to building
> the overall driver (it may break other non-hotplug use-cases).
>
> Export mhp_get_default_online_type function to allow these drivers to
> build when hotplug is disabled and still use the DAX use case.
>
> In the built-out case we simply return MMOP_OFFLINE as it's
Ah, you mean without CONFIG_MEMORY_HOTPLUG
> non-destructive. The internal function can never return -1 either,
> so we choose this to allow for defining the function with 'enum mmop'.
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 3/9] mm/memory_hotplug: export mhp_get_default_online_type
2026-06-09 9:52 ` David Hildenbrand (Arm)
@ 2026-06-09 15:11 ` Gregory Price
0 siblings, 0 replies; 22+ messages in thread
From: Gregory Price @ 2026-06-09 15:11 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: linux-mm, nvdimm, linux-kernel, kernel-team, linux-cxl,
linux-kselftest, djbw, vishal.l.verma, dave.jiang, akpm, ljs,
liam, vbabka, rppt, surenb, mhocko, osalvador, shuah,
alison.schofield, Smita.KoralahalliChannabasappa, ira.weiny,
apopple
On Tue, Jun 09, 2026 at 11:52:21AM +0200, David Hildenbrand (Arm) wrote:
> On 6/5/26 23:19, Gregory Price wrote:
> > Drivers which may pass hotplug policy down to DAX need MMOP_ symbols
> > and the mhp_get_default_online_type function for hotplug use cases.
> >
> > Some drivers (cxl) co-mingle their hotplug and devdax use-cases into
> > the same driver code, and chose the dax_kmem path as the default driver
> > path - making it difficult to require hotplug as a predicate to building
> > the overall driver (it may break other non-hotplug use-cases).
> >
> > Export mhp_get_default_online_type function to allow these drivers to
> > build when hotplug is disabled and still use the DAX use case.
> >
> > In the built-out case we simply return MMOP_OFFLINE as it's
>
> Ah, you mean without CONFIG_MEMORY_HOTPLUG
>
Yeah i'll update the commit message, thanks!
> > non-destructive. The internal function can never return -1 either,
> > so we choose this to allow for defining the function with 'enum mmop'.
>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>
> --
> Cheers,
>
> David
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4 4/9] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
` (2 preceding siblings ...)
2026-06-05 21:19 ` [PATCH v4 3/9] mm/memory_hotplug: export mhp_get_default_online_type Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
2026-06-09 9:55 ` David Hildenbrand (Arm)
2026-06-05 21:19 ` [PATCH v4 5/9] mm/memory_hotplug: add multi-range hotunplug Gregory Price
` (4 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
Existing callers of add_memory_driver_managed cannot select the
preferred online type (ZONE_NORMAL vs ZONE_MOVABLE), requiring it to
hot-add memory as offline blocks, and then follow up by onlining each
memory block individually.
Most drivers prefer the system default, but the CXL driver wants to
plumb a preferred policy through the dax kmem driver.
Refactor APIs to add a new interface which allows the dax kmem module
to select a preferred policy.
Overriding the configured auto-online policy is only safe for known
in-tree modules, where we know the override reflects a different,
user-requested policy. We do not want arbitrary out-of-tree drivers
silently overriding the system-wide onlining policy, so restrict the
new interface to the kmem module using EXPORT_SYMBOL_FOR_MODULES()
rather than a plain EXPORT_SYMBOL_GPL(). Other in-tree modules (e.g.
cxl_core) can be added to the allowed list as the need arises.
Refactor add_memory_driver_managed, extract __add_memory_driver_managed
- Add proper kernel-doc for add_memory_driver_managed while refactoring
- New helper accepts an explicit online_type.
- New helper validates online_type is between OFFLINE and ONLINE_MOVABLE
Refactor: add_memory_resource, extract __add_memory_resource
- new helper accepts an explicit online_type
Original APIs now explicitly pass the system-default to new helpers.
No functional change for existing users.
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Gregory Price <gourry@gourry.net>
---
include/linux/memory_hotplug.h | 3 ++
mm/memory_hotplug.c | 61 +++++++++++++++++++++++++++++-----
2 files changed, 56 insertions(+), 8 deletions(-)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index f059025f8f8b..d3edeb80aadb 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -294,6 +294,9 @@ extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
extern int add_memory_resource(int nid, struct resource *resource,
mhp_t mhp_flags);
+int __add_memory_driver_managed(int nid, u64 start, u64 size,
+ const char *resource_name, mhp_t mhp_flags,
+ enum mmop online_type);
extern int add_memory_driver_managed(int nid, u64 start, u64 size,
const char *resource_name,
mhp_t mhp_flags);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 494257054095..7d145217adc6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1494,10 +1494,10 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
*
* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG
*/
-int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
+static int __add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags,
+ enum mmop online_type)
{
struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
- enum mmop online_type = mhp_get_default_online_type();
enum memblock_flags memblock_flags = MEMBLOCK_NONE;
struct memory_group *group = NULL;
u64 start, size;
@@ -1585,7 +1585,7 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
merge_system_ram_resource(res);
/* online pages if requested */
- if (mhp_get_default_online_type() != MMOP_OFFLINE)
+ if (online_type != MMOP_OFFLINE)
walk_memory_blocks(start, size, &online_type,
online_memory_block);
@@ -1603,7 +1603,13 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
return ret;
}
-/* requires device_hotplug_lock, see add_memory_resource() */
+int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
+{
+ return __add_memory_resource(nid, res, mhp_flags,
+ mhp_get_default_online_type());
+}
+
+/* requires device_hotplug_lock, see __add_memory_resource() */
int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags)
{
struct resource *res;
@@ -1631,7 +1637,15 @@ int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags)
}
EXPORT_SYMBOL_GPL(add_memory);
-/*
+/**
+ * __add_memory_driver_managed - add driver-managed memory with explicit online_type
+ * @nid: NUMA node ID where the memory will be added
+ * @start: Start physical address of the memory range
+ * @size: Size of the memory range in bytes
+ * @resource_name: Resource name in format "System RAM ($DRIVER)"
+ * @mhp_flags: Memory hotplug flags
+ * @online_type: Auto-Online behavior (offline, online, kernel, movable)
+ *
* Add special, driver-managed memory to the system as system RAM. Such
* memory is not exposed via the raw firmware-provided memmap as system
* RAM, instead, it is detected and added by a driver - during cold boot,
@@ -1639,6 +1653,7 @@ EXPORT_SYMBOL_GPL(add_memory);
*
* Reasons why this memory should not be used for the initial memmap of a
* kexec kernel or for placing kexec images:
+ *
* - The booting kernel is in charge of determining how this memory will be
* used (e.g., use persistent memory as system RAM)
* - Coordination with a hypervisor is required before this memory
@@ -1651,9 +1666,12 @@ EXPORT_SYMBOL_GPL(add_memory);
*
* The resource_name (visible via /proc/iomem) has to have the format
* "System RAM ($DRIVER)".
+ *
+ * Return: 0 on success, negative error code on failure.
*/
-int add_memory_driver_managed(int nid, u64 start, u64 size,
- const char *resource_name, mhp_t mhp_flags)
+int __add_memory_driver_managed(int nid, u64 start, u64 size,
+ const char *resource_name, mhp_t mhp_flags,
+ enum mmop online_type)
{
struct resource *res;
int rc;
@@ -1663,6 +1681,9 @@ int add_memory_driver_managed(int nid, u64 start, u64 size,
resource_name[strlen(resource_name) - 1] != ')')
return -EINVAL;
+ if (online_type < MMOP_OFFLINE || online_type > MMOP_ONLINE_MOVABLE)
+ return -EINVAL;
+
lock_device_hotplug();
res = register_memory_resource(start, size, resource_name);
@@ -1671,7 +1692,7 @@ int add_memory_driver_managed(int nid, u64 start, u64 size,
goto out_unlock;
}
- rc = add_memory_resource(nid, res, mhp_flags);
+ rc = __add_memory_resource(nid, res, mhp_flags, online_type);
if (rc < 0)
release_memory_resource(res);
@@ -1679,6 +1700,30 @@ int add_memory_driver_managed(int nid, u64 start, u64 size,
unlock_device_hotplug();
return rc;
}
+EXPORT_SYMBOL_FOR_MODULES(__add_memory_driver_managed, "kmem");
+
+/**
+ * add_memory_driver_managed - add driver-managed memory
+ * @nid: NUMA node ID where the memory will be added
+ * @start: Start physical address of the memory range
+ * @size: Size of the memory range in bytes
+ * @resource_name: Resource name in format "System RAM ($DRIVER)"
+ * @mhp_flags: Memory hotplug flags
+ *
+ * Add driver-managed memory with the system default online type set by
+ * build config or kernel boot parameter.
+ *
+ * See __add_memory_driver_managed for more details.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int add_memory_driver_managed(int nid, u64 start, u64 size,
+ const char *resource_name, mhp_t mhp_flags)
+{
+ return __add_memory_driver_managed(nid, start, size, resource_name,
+ mhp_flags,
+ mhp_get_default_online_type());
+}
EXPORT_SYMBOL_GPL(add_memory_driver_managed);
/*
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [PATCH v4 4/9] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg
2026-06-05 21:19 ` [PATCH v4 4/9] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg Gregory Price
@ 2026-06-09 9:55 ` David Hildenbrand (Arm)
2026-06-09 15:12 ` Gregory Price
0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-09 9:55 UTC (permalink / raw)
To: Gregory Price, linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, ljs, liam, vbabka, rppt, surenb,
mhocko, osalvador, shuah, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
On 6/5/26 23:19, Gregory Price wrote:
> Existing callers of add_memory_driver_managed cannot select the
> preferred online type (ZONE_NORMAL vs ZONE_MOVABLE), requiring it to
> hot-add memory as offline blocks, and then follow up by onlining each
> memory block individually.
>
> Most drivers prefer the system default, but the CXL driver wants to
> plumb a preferred policy through the dax kmem driver.
>
> Refactor APIs to add a new interface which allows the dax kmem module
> to select a preferred policy.
>
> Overriding the configured auto-online policy is only safe for known
> in-tree modules, where we know the override reflects a different,
> user-requested policy. We do not want arbitrary out-of-tree drivers
> silently overriding the system-wide onlining policy, so restrict the
> new interface to the kmem module using EXPORT_SYMBOL_FOR_MODULES()
> rather than a plain EXPORT_SYMBOL_GPL(). Other in-tree modules (e.g.
> cxl_core) can be added to the allowed list as the need arises.
>
> Refactor add_memory_driver_managed, extract __add_memory_driver_managed
> - Add proper kernel-doc for add_memory_driver_managed while refactoring
> - New helper accepts an explicit online_type.
> - New helper validates online_type is between OFFLINE and ONLINE_MOVABLE
>
> Refactor: add_memory_resource, extract __add_memory_resource
> - new helper accepts an explicit online_type
>
> Original APIs now explicitly pass the system-default to new helpers.
>
> No functional change for existing users.
>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> include/linux/memory_hotplug.h | 3 ++
> mm/memory_hotplug.c | 61 +++++++++++++++++++++++++++++-----
> 2 files changed, 56 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index f059025f8f8b..d3edeb80aadb 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -294,6 +294,9 @@ extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
> extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
> extern int add_memory_resource(int nid, struct resource *resource,
> mhp_t mhp_flags);
> +int __add_memory_driver_managed(int nid, u64 start, u64 size,
> + const char *resource_name, mhp_t mhp_flags,
> + enum mmop online_type);
We prefer two-tab indent on second parameter line while touching code / adding
new code.
Same applies to the other instances below.
Apart from that (still) LGTM.
--
Cheers,
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 4/9] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg
2026-06-09 9:55 ` David Hildenbrand (Arm)
@ 2026-06-09 15:12 ` Gregory Price
0 siblings, 0 replies; 22+ messages in thread
From: Gregory Price @ 2026-06-09 15:12 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: linux-mm, nvdimm, linux-kernel, kernel-team, linux-cxl,
linux-kselftest, djbw, vishal.l.verma, dave.jiang, akpm, ljs,
liam, vbabka, rppt, surenb, mhocko, osalvador, shuah,
alison.schofield, Smita.KoralahalliChannabasappa, ira.weiny,
apopple
On Tue, Jun 09, 2026 at 11:55:24AM +0200, David Hildenbrand (Arm) wrote:
> On 6/5/26 23:19, Gregory Price wrote:
> >
> > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > index f059025f8f8b..d3edeb80aadb 100644
> > --- a/include/linux/memory_hotplug.h
> > +++ b/include/linux/memory_hotplug.h
> > @@ -294,6 +294,9 @@ extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
> > extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
> > extern int add_memory_resource(int nid, struct resource *resource,
> > mhp_t mhp_flags);
> > +int __add_memory_driver_managed(int nid, u64 start, u64 size,
> > + const char *resource_name, mhp_t mhp_flags,
> > + enum mmop online_type);
>
> We prefer two-tab indent on second parameter line while touching code / adding
> new code.
>
> Same applies to the other instances below.
>
Will fix on next spin, thanks!
>
> Apart from that (still) LGTM.
>
> --
> Cheers,
>
> David
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4 5/9] mm/memory_hotplug: add multi-range hotunplug
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
` (3 preceding siblings ...)
2026-06-05 21:19 ` [PATCH v4 4/9] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
2026-06-09 10:06 ` David Hildenbrand (Arm)
2026-06-05 21:19 ` [PATCH v4 6/9] dax: plumb hotplug online_type through dax Gregory Price
` (3 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
offline_and_remove_memory() handles a single contiguous range.
Callers that manage a device composed of several ranges (e.g. dax/kmem)
currently have to call it in a loop, which gives up atomicity.
This creates a race condition where another daemon can online a block
that was just offlined while other blocks are being offlined, causing
the eventual (original) unplug operation to fail.
Add offline_and_remove_memory_ranges(), which takes an array of ranges
and processes them as one operation under a single lock_device_hotplug():
- Phase 1 offlines every block of every range, remembering each block's
previous online type.
- Phase 2 removes the ranges only once all of them are offline.
- If any offline fails, the offlining done so far is reverted and
nothing is removed.
This gives callers all-or-nothing semantics for the offline step, so a
failed or interrupted unplug leaves every range online as before rather
than in an inconsistent partially-removed state.
Suggested-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Gregory Price <gourry@gourry.net>
---
include/linux/memory_hotplug.h | 7 +++
mm/memory_hotplug.c | 95 ++++++++++++++++++++++++++++++++++
2 files changed, 102 insertions(+)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index d3edeb80aadb..7f1da7c428dc 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -267,6 +267,7 @@ extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
extern int remove_memory(u64 start, u64 size);
extern void __remove_memory(u64 start, u64 size);
extern int offline_and_remove_memory(u64 start, u64 size);
+int offline_and_remove_memory_ranges(const struct range *ranges, int nr_ranges);
#else
static inline void try_offline_node(int nid) {}
@@ -283,6 +284,12 @@ static inline int remove_memory(u64 start, u64 size)
}
static inline void __remove_memory(u64 start, u64 size) {}
+
+static inline int offline_and_remove_memory_ranges(const struct range *ranges,
+ int nr_ranges)
+{
+ return -EBUSY;
+}
#endif /* CONFIG_MEMORY_HOTREMOVE */
#ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7d145217adc6..e486d35c22b2 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2483,4 +2483,99 @@ int offline_and_remove_memory(u64 start, u64 size)
return rc;
}
EXPORT_SYMBOL_GPL(offline_and_remove_memory);
+
+/**
+ * offline_and_remove_memory_ranges - offline and remove multiple memory ranges
+ * @ranges: array of physical address ranges to offline and remove
+ * @nr_ranges: number of entries in @ranges
+ *
+ * Offline and remove several memory ranges as one operation, serialized
+ * against other hotplug operations by a single lock_device_hotplug().
+ *
+ * Unlike calling offline_and_remove_memory() in a loop, this offlines *all*
+ * ranges before removing any of them. If offlining any range fails, the
+ * offlining of the ranges processed so far is reverted and nothing is
+ * removed, leaving every range online as it was before the call. This gives
+ * callers all-or-nothing semantics for the offline step, so a failed unplug
+ * does not leave a device split between online and removed ranges.
+ *
+ * Each range must be memory-block aligned in start and size.
+ *
+ * Return: 0 on success, negative errno otherwise. On failure no range has
+ * been removed.
+ */
+int offline_and_remove_memory_ranges(const struct range *ranges, int nr_ranges)
+{
+ unsigned long mb_total = 0;
+ uint8_t *online_types, *tmp;
+ int i, rc = 0;
+
+ if (!ranges || nr_ranges <= 0)
+ return -EINVAL;
+
+ for (i = 0; i < nr_ranges; i++) {
+ u64 start = ranges[i].start;
+ u64 size = range_len(&ranges[i]);
+
+ if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
+ !IS_ALIGNED(size, memory_block_size_bytes()) || !size)
+ return -EINVAL;
+ mb_total += size / memory_block_size_bytes();
+ }
+
+ /*
+ * Remember the old online type of every memory block across all ranges,
+ * so we can revert if offlining a later block fails. All entries start
+ * as MMOP_OFFLINE so blocks we never touched are skipped on rollback.
+ */
+ online_types = kmalloc_array(mb_total, sizeof(*online_types),
+ GFP_KERNEL);
+ if (!online_types)
+ return -ENOMEM;
+ memset(online_types, MMOP_OFFLINE, mb_total);
+
+ lock_device_hotplug();
+
+ /* Phase 1: offline every block in every range. */
+ tmp = online_types;
+ for (i = 0; i < nr_ranges; i++) {
+ rc = walk_memory_blocks(ranges[i].start, range_len(&ranges[i]),
+ &tmp, try_offline_memory_block);
+ if (rc)
+ break;
+ }
+
+ /*
+ * Phase 2: only once everything is offline, remove it. This cannot
+ * fail as the memory can no longer be onlined in the meantime.
+ */
+ if (!rc) {
+ for (i = 0; i < nr_ranges; i++) {
+ rc = try_remove_memory(ranges[i].start,
+ range_len(&ranges[i]));
+ if (rc) {
+ pr_err("%s: Failed to remove memory: %d",
+ __func__, rc);
+ break;
+ }
+ }
+ }
+
+ /*
+ * Roll back the offlining if anything failed. Blocks we never offlined
+ * are marked MMOP_OFFLINE and skipped by try_reonline_memory_block().
+ */
+ if (rc) {
+ tmp = online_types;
+ for (i = 0; i < nr_ranges; i++)
+ walk_memory_blocks(ranges[i].start,
+ range_len(&ranges[i]), &tmp,
+ try_reonline_memory_block);
+ }
+ unlock_device_hotplug();
+
+ kfree(online_types);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(offline_and_remove_memory_ranges);
#endif /* CONFIG_MEMORY_HOTREMOVE */
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [PATCH v4 5/9] mm/memory_hotplug: add multi-range hotunplug
2026-06-05 21:19 ` [PATCH v4 5/9] mm/memory_hotplug: add multi-range hotunplug Gregory Price
@ 2026-06-09 10:06 ` David Hildenbrand (Arm)
2026-06-09 15:15 ` Gregory Price
0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-09 10:06 UTC (permalink / raw)
To: Gregory Price, linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, ljs, liam, vbabka, rppt, surenb,
mhocko, osalvador, shuah, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
> EXPORT_SYMBOL_GPL(offline_and_remove_memory);
> +
> +/**
> + * offline_and_remove_memory_ranges - offline and remove multiple memory ranges
> + * @ranges: array of physical address ranges to offline and remove
> + * @nr_ranges: number of entries in @ranges
> + *
> + * Offline and remove several memory ranges as one operation, serialized
> + * against other hotplug operations by a single lock_device_hotplug().
> + *
> + * Unlike calling offline_and_remove_memory() in a loop, this offlines *all*
> + * ranges before removing any of them. If offlining any range fails, the
> + * offlining of the ranges processed so far is reverted and nothing is
> + * removed, leaving every range online as it was before the call. This gives
> + * callers all-or-nothing semantics for the offline step, so a failed unplug
> + * does not leave a device split between online and removed ranges.
> + *
> + * Each range must be memory-block aligned in start and size.
> + *
> + * Return: 0 on success, negative errno otherwise. On failure no range has
> + * been removed.
> + */
> +int offline_and_remove_memory_ranges(const struct range *ranges, int nr_ranges)
> +{
Is there a way to just generalize the logic in offline_and_remove_memory() to
multiple ranges, making offline_and_remove_memory() then a simple wrapper around
the new offline_and_remove_memory_ranges(), providing only a single range?
--
Cheers,
David
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v4 5/9] mm/memory_hotplug: add multi-range hotunplug
2026-06-09 10:06 ` David Hildenbrand (Arm)
@ 2026-06-09 15:15 ` Gregory Price
0 siblings, 0 replies; 22+ messages in thread
From: Gregory Price @ 2026-06-09 15:15 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: linux-mm, nvdimm, linux-kernel, kernel-team, linux-cxl,
linux-kselftest, djbw, vishal.l.verma, dave.jiang, akpm, ljs,
liam, vbabka, rppt, surenb, mhocko, osalvador, shuah,
alison.schofield, Smita.KoralahalliChannabasappa, ira.weiny,
apopple
On Tue, Jun 09, 2026 at 12:06:38PM +0200, David Hildenbrand (Arm) wrote:
>
> > EXPORT_SYMBOL_GPL(offline_and_remove_memory);
> > +
> > +/**
> > + * offline_and_remove_memory_ranges - offline and remove multiple memory ranges
> > + * @ranges: array of physical address ranges to offline and remove
> > + * @nr_ranges: number of entries in @ranges
> > + *
> > + * Offline and remove several memory ranges as one operation, serialized
> > + * against other hotplug operations by a single lock_device_hotplug().
> > + *
> > + * Unlike calling offline_and_remove_memory() in a loop, this offlines *all*
> > + * ranges before removing any of them. If offlining any range fails, the
> > + * offlining of the ranges processed so far is reverted and nothing is
> > + * removed, leaving every range online as it was before the call. This gives
> > + * callers all-or-nothing semantics for the offline step, so a failed unplug
> > + * does not leave a device split between online and removed ranges.
> > + *
> > + * Each range must be memory-block aligned in start and size.
> > + *
> > + * Return: 0 on success, negative errno otherwise. On failure no range has
> > + * been removed.
> > + */
> > +int offline_and_remove_memory_ranges(const struct range *ranges, int nr_ranges)
> > +{
>
> Is there a way to just generalize the logic in offline_and_remove_memory() to
> multiple ranges, making offline_and_remove_memory() then a simple wrapper around
> the new offline_and_remove_memory_ranges(), providing only a single range?
>
Yeah that's reasonable, I'll look at what can be done.
~Gregory
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4 6/9] dax: plumb hotplug online_type through dax
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
` (4 preceding siblings ...)
2026-06-05 21:19 ` [PATCH v4 5/9] mm/memory_hotplug: add multi-range hotunplug Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
[not found] ` <20260605213143.A18F01F00893@smtp.kernel.org>
2026-06-05 21:19 ` [PATCH v4 7/9] dax/kmem: extract hotplug/hotremove helper functions Gregory Price
` (2 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
There is no way for drivers leveraging dax_kmem to plumb through a
preferred auto-online policy - the system default policy is forced.
Add 'enum mmop' field to DAX device creation path to allow drivers
to specify an auto-online policy when using the kmem driver.
Current callers initialize online_type to mhp_get_default_online_type()
to retain backward compatibility and to make explicit to the drivers
what is actually happening underneath.
No functional changes to existing callers.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/dax/bus.c | 3 +++
drivers/dax/bus.h | 2 ++
drivers/dax/cxl.c | 1 +
drivers/dax/dax-private.h | 3 +++
drivers/dax/hmem/hmem.c | 1 +
drivers/dax/kmem.c | 5 +++--
drivers/dax/pmem.c | 1 +
7 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 492573b47f66..6611fe399f59 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright(c) 2017-2018 Intel Corporation. All rights reserved. */
#include <linux/memremap.h>
+#include <linux/memory_hotplug.h>
#include <linux/device.h>
#include <linux/mutex.h>
#include <linux/list.h>
@@ -394,6 +395,7 @@ static ssize_t create_store(struct device *dev, struct device_attribute *attr,
.size = 0,
.id = -1,
.memmap_on_memory = false,
+ .online_type = mhp_get_default_online_type(),
};
struct dev_dax *dev_dax = __devm_create_dev_dax(&data);
@@ -1527,6 +1529,7 @@ static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
ida_init(&dev_dax->ida);
dev_dax->memmap_on_memory = data->memmap_on_memory;
+ dev_dax->online_type = data->online_type;
inode = dax_inode(dax_dev);
dev->devt = inode->i_rdev;
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index 5909171a4428..c53a9427f8e4 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -3,6 +3,7 @@
#ifndef __DAX_BUS_H__
#define __DAX_BUS_H__
#include <linux/device.h>
+#include <linux/memory_hotplug.h>
#include <linux/platform_device.h>
#include <linux/range.h>
#include <linux/workqueue.h>
@@ -26,6 +27,7 @@ struct dev_dax_data {
resource_size_t size;
int id;
bool memmap_on_memory;
+ enum mmop online_type;
};
struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data);
diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
index 3ab39b77843d..0eaef700bb2a 100644
--- a/drivers/dax/cxl.c
+++ b/drivers/dax/cxl.c
@@ -27,6 +27,7 @@ static int cxl_dax_region_probe(struct device *dev)
.id = -1,
.size = range_len(&cxlr_dax->hpa_range),
.memmap_on_memory = true,
+ .online_type = mhp_get_default_online_type(),
};
return PTR_ERR_OR_ZERO(devm_create_dev_dax(&data));
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 81e4af49e39c..0787325bc8dd 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -8,6 +8,7 @@
#include <linux/device.h>
#include <linux/cdev.h>
#include <linux/idr.h>
+#include <linux/memory_hotplug.h>
/* private routines between core files */
struct dax_device;
@@ -79,6 +80,7 @@ struct dev_dax_range {
* @dev: device core
* @pgmap: pgmap for memmap setup / lifetime (driver owned)
* @memmap_on_memory: allow kmem to put the memmap in the memory
+ * @online_type: MMOP_* online type for memory hotplug
* @nr_range: size of @ranges
* @ranges: range tuples of memory used
*/
@@ -95,6 +97,7 @@ struct dev_dax {
struct device dev;
struct dev_pagemap *pgmap;
bool memmap_on_memory;
+ enum mmop online_type;
int nr_range;
struct dev_dax_range *ranges;
};
diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index af21f66bf872..0ef6e9ae660d 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -37,6 +37,7 @@ static int dax_hmem_probe(struct platform_device *pdev)
.id = -1,
.size = region_idle ? 0 : range_len(&mri->range),
.memmap_on_memory = false,
+ .online_type = mhp_get_default_online_type(),
};
return PTR_ERR_OR_ZERO(devm_create_dev_dax(&data));
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 592171ec10f4..41ccb618a146 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -172,8 +172,9 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
* Ensure that future kexec'd kernels will not treat
* this as RAM automatically.
*/
- rc = add_memory_driver_managed(data->mgid, range.start,
- range_len(&range), kmem_name, mhp_flags);
+ rc = __add_memory_driver_managed(data->mgid, range.start,
+ range_len(&range), kmem_name, mhp_flags,
+ dev_dax->online_type);
if (rc) {
dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n",
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index bee93066a849..a5f987814da5 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -63,6 +63,7 @@ static struct dev_dax *__dax_pmem_probe(struct device *dev)
.pgmap = &pgmap,
.size = range_len(&range),
.memmap_on_memory = false,
+ .online_type = mhp_get_default_online_type(),
};
return devm_create_dev_dax(&data);
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* [PATCH v4 7/9] dax/kmem: extract hotplug/hotremove helper functions
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
` (5 preceding siblings ...)
2026-06-05 21:19 ` [PATCH v4 6/9] dax: plumb hotplug online_type through dax Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
2026-06-05 21:19 ` [PATCH v4 8/9] dax/kmem: add sysfs interface for atomic hotplug Gregory Price
2026-06-05 21:19 ` [PATCH v4 9/9] selftests/dax: add dax/kmem hotplug sysfs regression test Gregory Price
8 siblings, 0 replies; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
Refactor kmem _probe() _remove() by extracting init, cleanup, hotplug,
and hot-remove logic into separate helper functions:
- dax_kmem_init_resources: inits IO_RESOURCE w/ request_mem_region
- dax_kmem_cleanup_resources: cleans up initialized IO_RESOURCE
- dax_kmem_do_hotplug: handles memory region reservation and adding
- dax_kmem_do_hotremove: handles memory removal and resource cleanup
This is a pure refactoring with no functional change. The helpers will
enable future extensions to support more granular control over memory
hotplug operations.
We need to split hotplug/remove and init/cleanup in order to have the
resources available for hot-add. Otherwise, when probe occurs, the dax
devices are never added to sysfs because the resources are never
registered.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/dax/kmem.c | 315 +++++++++++++++++++++++++++++++--------------
1 file changed, 215 insertions(+), 100 deletions(-)
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 41ccb618a146..5bf36ab73f86 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -63,14 +63,195 @@ static void kmem_put_memory_types(void)
mt_put_memory_types(&kmem_memory_types);
}
+/**
+ * dax_kmem_do_hotplug - hotplug memory for dax kmem device
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Hotplugs all ranges in the dev_dax region as system memory.
+ *
+ * Returns the number of successfully mapped ranges, or negative error.
+ */
+static int dax_kmem_do_hotplug(struct dev_dax *dev_dax,
+ struct dax_kmem_data *data,
+ int online_type)
+{
+ struct device *dev = &dev_dax->dev;
+ int i, rc, onlined = 0;
+ mhp_t mhp_flags;
+
+ for (i = 0; i < dev_dax->nr_range; i++) {
+ struct range range;
+
+ rc = dax_kmem_range(dev_dax, i, &range);
+ if (rc)
+ continue;
+
+ mhp_flags = MHP_NID_IS_MGID;
+ if (dev_dax->memmap_on_memory)
+ mhp_flags |= MHP_MEMMAP_ON_MEMORY;
+
+ /*
+ * Ensure that future kexec'd kernels will not treat
+ * this as RAM automatically.
+ */
+ rc = __add_memory_driver_managed(data->mgid, range.start,
+ range_len(&range), kmem_name, mhp_flags,
+ online_type);
+
+ if (rc) {
+ dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n",
+ i, range.start, range.end);
+ /*
+ * Release the reservation for the range that failed to
+ * add so a later hotremove does not try to remove memory
+ * that was never added.
+ */
+ if (data->res[i]) {
+ remove_resource(data->res[i]);
+ kfree(data->res[i]);
+ data->res[i] = NULL;
+ }
+ if (onlined)
+ continue;
+ return rc;
+ }
+ onlined++;
+ }
+
+ return onlined;
+}
+
+/**
+ * dax_kmem_init_resources - create memory regions for dax kmem
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Initializes all the resources for the DAX
+ *
+ * Returns the number of successfully mapped ranges, or negative error.
+ */
+static int dax_kmem_init_resources(struct dev_dax *dev_dax,
+ struct dax_kmem_data *data)
+{
+ struct device *dev = &dev_dax->dev;
+ int i, rc, mapped = 0;
+
+ for (i = 0; i < dev_dax->nr_range; i++) {
+ struct resource *res;
+ struct range range;
+
+ rc = dax_kmem_range(dev_dax, i, &range);
+ if (rc)
+ continue;
+
+ /* Skip ranges already added */
+ if (data->res[i])
+ continue;
+
+ /* Region is permanently reserved if hotremove fails. */
+ res = request_mem_region(range.start, range_len(&range),
+ data->res_name);
+ if (!res) {
+ dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve region\n",
+ i, range.start, range.end);
+ /*
+ * Once some memory has been onlined we can't
+ * assume that it can be un-onlined safely.
+ */
+ if (mapped)
+ continue;
+ return -EBUSY;
+ }
+ data->res[i] = res;
+ /*
+ * Set flags appropriate for System RAM. Leave ..._BUSY clear
+ * so that add_memory() can add a child resource. Do not
+ * inherit flags from the parent since it may set new flags
+ * unknown to us that will break add_memory() below.
+ */
+ res->flags = IORESOURCE_SYSTEM_RAM;
+ mapped++;
+ }
+ return mapped;
+}
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/**
+ * dax_kmem_do_hotremove - hot-remove memory for dax kmem device
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Removes all ranges in the dev_dax region.
+ *
+ * Returns the number of successfully removed ranges.
+ */
+static int dax_kmem_do_hotremove(struct dev_dax *dev_dax,
+ struct dax_kmem_data *data)
+{
+ struct device *dev = &dev_dax->dev;
+ int i, success = 0;
+
+ for (i = 0; i < dev_dax->nr_range; i++) {
+ struct range range;
+ int rc;
+
+ rc = dax_kmem_range(dev_dax, i, &range);
+ if (rc)
+ continue;
+
+ /* range was never added during probe, count as removed */
+ if (!data->res[i]) {
+ success++;
+ continue;
+ }
+
+ rc = remove_memory(range.start, range_len(&range));
+ if (rc == 0) {
+ /* Release the resource for the successfully removed range */
+ remove_resource(data->res[i]);
+ kfree(data->res[i]);
+ data->res[i] = NULL;
+ success++;
+ continue;
+ }
+ any_hotremove_failed = true;
+ dev_err(dev, "mapping%d: %#llx-%#llx hotremove failed\n",
+ i, range.start, range.end);
+ }
+
+ return success;
+}
+#endif /* CONFIG_MEMORY_HOTREMOVE */
+
+/**
+ * dax_kmem_cleanup_resources - remove the dax memory resources
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Removes all resources in the dev_dax region.
+ */
+static void dax_kmem_cleanup_resources(struct dev_dax *dev_dax,
+ struct dax_kmem_data *data)
+{
+ int i;
+
+ for (i = 0; i < dev_dax->nr_range; i++) {
+ if (!data->res[i])
+ continue;
+ remove_resource(data->res[i]);
+ kfree(data->res[i]);
+ data->res[i] = NULL;
+ }
+}
+
static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
{
struct device *dev = &dev_dax->dev;
unsigned long total_len = 0, orig_len = 0;
struct dax_kmem_data *data;
struct memory_dev_type *mtype;
- int i, rc, mapped = 0;
- mhp_t mhp_flags;
+ int i, rc;
int numa_node;
int adist = MEMTIER_DEFAULT_DAX_ADISTANCE;
@@ -132,68 +313,25 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
goto err_reg_mgid;
data->mgid = rc;
- for (i = 0; i < dev_dax->nr_range; i++) {
- struct resource *res;
- struct range range;
-
- rc = dax_kmem_range(dev_dax, i, &range);
- if (rc)
- continue;
-
- /* Region is permanently reserved if hotremove fails. */
- res = request_mem_region(range.start, range_len(&range), data->res_name);
- if (!res) {
- dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve region\n",
- i, range.start, range.end);
- /*
- * Once some memory has been onlined we can't
- * assume that it can be un-onlined safely.
- */
- if (mapped)
- continue;
- rc = -EBUSY;
- goto err_request_mem;
- }
- data->res[i] = res;
-
- /*
- * Set flags appropriate for System RAM. Leave ..._BUSY clear
- * so that add_memory() can add a child resource. Do not
- * inherit flags from the parent since it may set new flags
- * unknown to us that will break add_memory() below.
- */
- res->flags = IORESOURCE_SYSTEM_RAM;
-
- mhp_flags = MHP_NID_IS_MGID;
- if (dev_dax->memmap_on_memory)
- mhp_flags |= MHP_MEMMAP_ON_MEMORY;
-
- /*
- * Ensure that future kexec'd kernels will not treat
- * this as RAM automatically.
- */
- rc = __add_memory_driver_managed(data->mgid, range.start,
- range_len(&range), kmem_name, mhp_flags,
- dev_dax->online_type);
+ dev_set_drvdata(dev, data);
- if (rc) {
- dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n",
- i, range.start, range.end);
- remove_resource(res);
- kfree(res);
- data->res[i] = NULL;
- if (mapped)
- continue;
- goto err_request_mem;
- }
- mapped++;
- }
+ rc = dax_kmem_init_resources(dev_dax, data);
+ if (rc < 0)
+ goto err_resources;
- dev_set_drvdata(dev, data);
+ /*
+ * Hotplug using the configured online type for this device.
+ */
+ rc = dax_kmem_do_hotplug(dev_dax, data, dev_dax->online_type);
+ if (rc < 0)
+ goto err_hotplug;
return 0;
-err_request_mem:
+err_hotplug:
+ dax_kmem_cleanup_resources(dev_dax, data);
+err_resources:
+ dev_set_drvdata(dev, NULL);
memory_group_unregister(data->mgid);
err_reg_mgid:
kfree(data->res_name);
@@ -207,7 +345,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
#ifdef CONFIG_MEMORY_HOTREMOVE
static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
{
- int i, success = 0;
+ int success;
int node = dev_dax->target_node;
struct device *dev = &dev_dax->dev;
struct dax_kmem_data *data = dev_get_drvdata(dev);
@@ -218,48 +356,25 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
* there is no way to hotremove this memory until reboot because device
* unbind will succeed even if we return failure.
*/
- for (i = 0; i < dev_dax->nr_range; i++) {
- struct range range;
- int rc;
-
- rc = dax_kmem_range(dev_dax, i, &range);
- if (rc)
- continue;
-
- /* range was never added during probe */
- if (!data->res[i]) {
- success++;
- continue;
- }
-
- rc = remove_memory(range.start, range_len(&range));
- if (rc == 0) {
- remove_resource(data->res[i]);
- kfree(data->res[i]);
- data->res[i] = NULL;
- success++;
- continue;
- }
- any_hotremove_failed = true;
- dev_err(dev,
- "mapping%d: %#llx-%#llx cannot be hotremoved until the next reboot\n",
- i, range.start, range.end);
+ success = dax_kmem_do_hotremove(dev_dax, data);
+ if (success < dev_dax->nr_range) {
+ dev_err(dev, "Hotplug regions stuck online until reboot\n");
+ return;
}
- if (success >= dev_dax->nr_range) {
- memory_group_unregister(data->mgid);
- kfree(data->res_name);
- kfree(data);
- dev_set_drvdata(dev, NULL);
- /*
- * Clear the memtype association on successful unplug.
- * If not, we have memory blocks left which can be
- * offlined/onlined later. We need to keep memory_dev_type
- * for that. This implies this reference will be around
- * till next reboot.
- */
- clear_node_memory_type(node, NULL);
- }
+ dax_kmem_cleanup_resources(dev_dax, data);
+ memory_group_unregister(data->mgid);
+ kfree(data->res_name);
+ kfree(data);
+ dev_set_drvdata(dev, NULL);
+ /*
+ * Clear the memtype association on successful unplug.
+ * If not, we have memory blocks left which can be
+ * offlined/onlined later. We need to keep memory_dev_type
+ * for that. This implies this reference will be around
+ * till next reboot.
+ */
+ clear_node_memory_type(node, NULL);
}
#else
static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* [PATCH v4 8/9] dax/kmem: add sysfs interface for atomic hotplug
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
` (6 preceding siblings ...)
2026-06-05 21:19 ` [PATCH v4 7/9] dax/kmem: extract hotplug/hotremove helper functions Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
2026-06-09 10:26 ` David Hildenbrand (Arm)
2026-06-05 21:19 ` [PATCH v4 9/9] selftests/dax: add dax/kmem hotplug sysfs regression test Gregory Price
8 siblings, 1 reply; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple,
Hannes Reinecke
The dax kmem driver currently onlines memory automatically during
probe using the system's default online policy but provides no way
to control or query the entire region state at runtime.
Additionally, there is no atomic mechanism to offline and remove
the entire set of memory blocks together. Instead, this is presently
done in two steps: (offline all, remove all). This creates a race
condition where external entities can operate directly on the blocks
and cause hot-unplug to fail.
Add a new 'hotplug' sysfs attribute that allows userspace to control
and query the entire memory region state. The writable states mirror
the per-block /sys/devices/system/memory/memoryX/state ABI:
- "unplugged": memory blocks are not present
- "online": memory is online, zone chosen by the kernel
- "online_kernel": memory is online in ZONE_NORMAL
- "online_movable": memory is online in ZONE_MOVABLE
The "unplugged" state is new and only applies to kmem/hotplug.
Valid transitions:
- unplugged -> online[_kernel|_movable]
- online | online_kernel | online_movable -> unplugged
- offline -> unplugged
A device can only be onlined from "unplugged", so it must be returned
there before being onlined into a different state.
For backwards compatibility the memory blocks are always created at
probe: existing tools expect them to be present once the kmem driver
binds. When the configured policy (mhp_get_default_online_type())
selects an online state the blocks are onlined into that policy's zone;
when the policy is offline the blocks are created but left offline and
the device reports the state "offline".
"offline" is therefore a reportable state but is not writable: it only
arises from the legacy auto_online_blocks=offline policy. Onlining such
a device through this attribute requires unplugging it first.
The "offline" state may be deprecated later if the memory block ABI
changes and userland migrates to using the region-wide hotplug.
Unplug is atomic across the whole device: dax_kmem_do_hotremove()
collects every added range and offlines/removes them in one operation
via offline_and_remove_memory_ranges(). Either all ranges are removed
and the device becomes "unplugged", or offlining is rolled back and the
device is left fully online, so the reported 'hotplug' state always
matches reality.
Unbind Note:
We used to call remove_memory() during unbind, which would fire a
BUG() if any of the memory blocks were online at that time. We lift
this into a WARN in the cleanup routine and don't attempt hotremove
if ->state is not DAX_KMEM_UNPLUGGED or MMOP_OFFLINE. Memory that is
merely offline (the legacy "offline" state) is removed on unbind as
before; only online memory is left pinned.
The resources are still leaked but this prevents deadlock on unbind
if a memory region happens to be impossible to hotremove.
Inconsistency Note:
Since memory blocks can still be modified individually, the hotplug
attribute can become out of sync with the state of the system if
userland software mixes and matches the use of memory_block ABI and
kmem/hotplug ABI. It's suggests to use one or the other.
Suggested-by: Hannes Reinecke <hare@suse.de>
Suggested-by: David Hildenbrand <david@kernel.org>
Signed-off-by: Gregory Price <gourry@gourry.net>
---
Documentation/ABI/testing/sysfs-bus-dax | 25 +++
drivers/dax/kmem.c | 254 ++++++++++++++++++++----
2 files changed, 238 insertions(+), 41 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-dax b/Documentation/ABI/testing/sysfs-bus-dax
index b34266bfae49..931eb4e20358 100644
--- a/Documentation/ABI/testing/sysfs-bus-dax
+++ b/Documentation/ABI/testing/sysfs-bus-dax
@@ -151,3 +151,28 @@ Description:
memmap_on_memory parameter for memory_hotplug. This is
typically set on the kernel command line -
memory_hotplug.memmap_on_memory set to 'true' or 'force'."
+
+What: /sys/bus/dax/devices/daxX.Y/hotplug
+Date: January, 2026
+KernelVersion: v6.21
+Contact: nvdimm@lists.linux.dev
+Description:
+ (RW) Controls the hotplug state of the memory region.
+ Applies to all memory blocks associated with the device.
+ Only applies to dax_kmem devices.
+
+ Reading returns the current state; the writable states mirror
+ the per-block /sys/devices/system/memory/memoryX/state ABI:
+ "unplugged": memory blocks are not present
+ "online": memory is online, zone chosen by the kernel
+ "online_kernel": memory is online in ZONE_NORMAL
+ "online_movable": memory is online in ZONE_MOVABLE
+
+ "offline" (memory blocks are present but offline) may also be
+ reported - this happens when the device is bound while the
+ auto_online_blocks policy is offline. It cannot be written and
+ is deprecated; it may be removed in the future.
+
+ A device can only be onlined from the "unplugged" state, so a
+ device must be returned to "unplugged" before it can be onlined
+ into a different state.
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 5bf36ab73f86..46ee06d9f56b 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -42,9 +42,15 @@ static int dax_kmem_range(struct dev_dax *dev_dax, int i, struct range *r)
return 0;
}
+#define DAX_KMEM_UNPLUGGED (-1)
+
struct dax_kmem_data {
const char *res_name;
int mgid;
+ int numa_node;
+ struct dev_dax *dev_dax;
+ int state;
+ struct mutex lock; /* protects hotplug state transitions */
struct resource *res[];
};
@@ -63,23 +69,41 @@ static void kmem_put_memory_types(void)
mt_put_memory_types(&kmem_memory_types);
}
+/* True for the online states a kmem dax device can hold. */
+static bool dax_kmem_state_is_online(int state)
+{
+ return state == MMOP_ONLINE ||
+ state == MMOP_ONLINE_KERNEL ||
+ state == MMOP_ONLINE_MOVABLE;
+}
+
/**
- * dax_kmem_do_hotplug - hotplug memory for dax kmem device
+ * dax_kmem_do_hotplug - add the dev_dax memory ranges as system memory
* @dev_dax: the dev_dax instance
* @data: the dax_kmem_data structure with resource tracking
+ * @online_type: MMOP_OFFLINE to add the blocks offline, otherwise the online
+ * state (MMOP_ONLINE, MMOP_ONLINE_KERNEL, MMOP_ONLINE_MOVABLE)
+ * to bring them online in.
*
- * Hotplugs all ranges in the dev_dax region as system memory.
+ * Adds all ranges in the dev_dax region as system memory, onlining them in
+ * the requested zone unless @online_type is MMOP_OFFLINE.
*
- * Returns the number of successfully mapped ranges, or negative error.
+ * Returns the number of successfully added ranges, or negative error.
*/
static int dax_kmem_do_hotplug(struct dev_dax *dev_dax,
struct dax_kmem_data *data,
int online_type)
{
struct device *dev = &dev_dax->dev;
- int i, rc, onlined = 0;
+ int i, rc, added = 0;
mhp_t mhp_flags;
+ if (dax_kmem_state_is_online(data->state))
+ return -EINVAL;
+
+ if (online_type < MMOP_OFFLINE || online_type > MMOP_ONLINE_MOVABLE)
+ return -EINVAL;
+
for (i = 0; i < dev_dax->nr_range; i++) {
struct range range;
@@ -112,14 +136,14 @@ static int dax_kmem_do_hotplug(struct dev_dax *dev_dax,
kfree(data->res[i]);
data->res[i] = NULL;
}
- if (onlined)
+ if (added)
continue;
return rc;
}
- onlined++;
+ added++;
}
- return onlined;
+ return added;
}
/**
@@ -182,45 +206,65 @@ static int dax_kmem_init_resources(struct dev_dax *dev_dax,
* @dev_dax: the dev_dax instance
* @data: the dax_kmem_data structure with resource tracking
*
- * Removes all ranges in the dev_dax region.
+ * Offlines and removes every currently-added range in the dev_dax region
+ * atomically: either all ranges are offlined and removed, or none are and
+ * the device is left fully online (see offline_and_remove_memory_ranges()).
*
- * Returns the number of successfully removed ranges.
+ * Returns 0 on success, or a negative errno if the device could not be
+ * fully unplugged (in which case nothing was removed).
*/
static int dax_kmem_do_hotremove(struct dev_dax *dev_dax,
struct dax_kmem_data *data)
{
struct device *dev = &dev_dax->dev;
- int i, success = 0;
+ struct range *ranges;
+ int i, nr_ranges = 0, rc;
+ ranges = kmalloc_array(dev_dax->nr_range, sizeof(*ranges), GFP_KERNEL);
+ if (!ranges)
+ return -ENOMEM;
+
+ /* Collect the ranges that were actually added during probe. */
for (i = 0; i < dev_dax->nr_range; i++) {
struct range range;
- int rc;
- rc = dax_kmem_range(dev_dax, i, &range);
- if (rc)
+ if (!data->res[i])
continue;
-
- /* range was never added during probe, count as removed */
- if (!data->res[i]) {
- success++;
+ if (dax_kmem_range(dev_dax, i, &range))
continue;
- }
+ ranges[nr_ranges++] = range;
+ }
- rc = remove_memory(range.start, range_len(&range));
- if (rc == 0) {
- /* Release the resource for the successfully removed range */
- remove_resource(data->res[i]);
- kfree(data->res[i]);
- data->res[i] = NULL;
- success++;
- continue;
- }
+ /* Nothing added means nothing to remove. */
+ if (!nr_ranges) {
+ kfree(ranges);
+ return 0;
+ }
+
+ rc = offline_and_remove_memory_ranges(ranges, nr_ranges);
+ kfree(ranges);
+ if (rc) {
any_hotremove_failed = true;
- dev_err(dev, "mapping%d: %#llx-%#llx hotremove failed\n",
- i, range.start, range.end);
+ dev_err(dev, "hotremove failed, device left online: %d\n", rc);
+ return rc;
}
- return success;
+ /* All ranges removed; release the reserved resources. */
+ for (i = 0; i < dev_dax->nr_range; i++) {
+ if (!data->res[i])
+ continue;
+ remove_resource(data->res[i]);
+ kfree(data->res[i]);
+ data->res[i] = NULL;
+ }
+
+ return 0;
+}
+#else
+static int dax_kmem_do_hotremove(struct dev_dax *dev_dax,
+ struct dax_kmem_data *data)
+{
+ return -EBUSY;
}
#endif /* CONFIG_MEMORY_HOTREMOVE */
@@ -236,6 +280,20 @@ static void dax_kmem_cleanup_resources(struct dev_dax *dev_dax,
{
int i;
+ /*
+ * If the device unbind occurs before memory is hotremoved, we can never
+ * remove the memory (requires reboot). Attempting an offline operation
+ * here may cause deadlock and a failure to finish the unbind.
+ *
+ * This WARN used to be a BUG called by remove_memory().
+ *
+ * Note: This leaks the resources.
+ */
+ if (WARN(((data->state != DAX_KMEM_UNPLUGGED) &&
+ (data->state != MMOP_OFFLINE)),
+ "Hotplug memory regions stuck online until reboot"))
+ return;
+
for (i = 0; i < dev_dax->nr_range; i++) {
if (!data->res[i])
continue;
@@ -245,6 +303,107 @@ static void dax_kmem_cleanup_resources(struct dev_dax *dev_dax,
}
}
+static int dax_kmem_parse_state(const char *buf)
+{
+ if (sysfs_streq(buf, "unplugged"))
+ return DAX_KMEM_UNPLUGGED;
+ if (sysfs_streq(buf, "online"))
+ return MMOP_ONLINE;
+ if (sysfs_streq(buf, "online_kernel"))
+ return MMOP_ONLINE_KERNEL;
+ if (sysfs_streq(buf, "online_movable"))
+ return MMOP_ONLINE_MOVABLE;
+ return -EINVAL;
+}
+
+static ssize_t hotplug_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct dax_kmem_data *data = dev_get_drvdata(dev);
+ const char *state_str;
+
+ if (!data)
+ return -ENXIO;
+
+ switch (data->state) {
+ case DAX_KMEM_UNPLUGGED:
+ state_str = "unplugged";
+ break;
+ case MMOP_OFFLINE:
+ state_str = "offline";
+ break;
+ case MMOP_ONLINE:
+ state_str = "online";
+ break;
+ case MMOP_ONLINE_KERNEL:
+ state_str = "online_kernel";
+ break;
+ case MMOP_ONLINE_MOVABLE:
+ state_str = "online_movable";
+ break;
+ default:
+ state_str = "unknown";
+ break;
+ }
+
+ return sysfs_emit(buf, "%s\n", state_str);
+}
+
+static ssize_t hotplug_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct dev_dax *dev_dax = to_dev_dax(dev);
+ struct dax_kmem_data *data = dev_get_drvdata(dev);
+ int online_type;
+ int rc;
+
+ if (!data)
+ return -ENXIO;
+
+ online_type = dax_kmem_parse_state(buf);
+ if (online_type < DAX_KMEM_UNPLUGGED)
+ return online_type;
+
+ guard(mutex)(&data->lock);
+
+ /* Already in requested state */
+ if (data->state == online_type)
+ return len;
+
+ if (online_type == DAX_KMEM_UNPLUGGED) {
+ rc = dax_kmem_do_hotremove(dev_dax, data);
+ if (rc)
+ return rc;
+ data->state = DAX_KMEM_UNPLUGGED;
+ return len;
+ }
+
+ /*
+ * Onlining is only allowed from the unplugged state. An already-online
+ * device (or one left in the legacy offline state) must be unplugged
+ * first.
+ */
+ if (data->state != DAX_KMEM_UNPLUGGED)
+ return -EBUSY;
+
+ /*
+ * A previous unplug releases the per-range resources, so re-acquire
+ * them here (mirroring probe). This is a no-op for ranges that are
+ * still reserved (e.g. transitioning from the offline state).
+ */
+ rc = dax_kmem_init_resources(dev_dax, data);
+ if (rc < 0)
+ return rc;
+
+ rc = dax_kmem_do_hotplug(dev_dax, data, online_type);
+ if (rc < 0)
+ return rc;
+
+ data->state = online_type;
+ return len;
+}
+static DEVICE_ATTR_RW(hotplug);
+
static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
{
struct device *dev = &dev_dax->dev;
@@ -312,6 +471,10 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
if (rc < 0)
goto err_reg_mgid;
data->mgid = rc;
+ data->numa_node = numa_node;
+ data->dev_dax = dev_dax;
+ data->state = DAX_KMEM_UNPLUGGED;
+ mutex_init(&data->lock);
dev_set_drvdata(dev, data);
@@ -320,11 +483,19 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
goto err_resources;
/*
- * Hotplug using the configured online type for this device.
+ * Always create the memory blocks for backwards compatibility: existing
+ * tools expect them to be present after the kmem driver binds. Under
+ * the offline policy they are added but left offline (state
+ * MMOP_OFFLINE); otherwise they are onlined per the configured policy.
*/
rc = dax_kmem_do_hotplug(dev_dax, data, dev_dax->online_type);
if (rc < 0)
goto err_hotplug;
+ data->state = dev_dax->online_type;
+
+ rc = device_create_file(dev, &dev_attr_hotplug);
+ if (rc)
+ dev_warn(dev, "failed to create hotplug sysfs entry\n");
return 0;
@@ -345,23 +516,20 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
#ifdef CONFIG_MEMORY_HOTREMOVE
static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
{
- int success;
int node = dev_dax->target_node;
struct device *dev = &dev_dax->dev;
struct dax_kmem_data *data = dev_get_drvdata(dev);
+ device_remove_file(dev, &dev_attr_hotplug);
/*
- * We have one shot for removing memory, if some memory blocks were not
- * offline prior to calling this function remove_memory() will fail, and
- * there is no way to hotremove this memory until reboot because device
- * unbind will succeed even if we return failure.
+ * Blocks added under the legacy offline policy are present but offline;
+ * remove them on unbind as the driver always has. If removal fails,
+ * leak the resources rather than freeing state that still backs present
+ * memory. Online memory is left alone (dax_kmem_cleanup_resources()
+ * warns and leaks it) since offlining it here could deadlock the unbind.
*/
- success = dax_kmem_do_hotremove(dev_dax, data);
- if (success < dev_dax->nr_range) {
- dev_err(dev, "Hotplug regions stuck online until reboot\n");
+ if (data->state == MMOP_OFFLINE && dax_kmem_do_hotremove(dev_dax, data))
return;
- }
-
dax_kmem_cleanup_resources(dev_dax, data);
memory_group_unregister(data->mgid);
kfree(data->res_name);
@@ -379,6 +547,10 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
#else
static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
{
+ struct device *dev = &dev_dax->dev;
+
+ device_remove_file(dev, &dev_attr_hotplug);
+
/*
* Without hotremove purposely leak the request_mem_region() for the
* device-dax range and return '0' to ->remove() attempts. The removal
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [PATCH v4 8/9] dax/kmem: add sysfs interface for atomic hotplug
2026-06-05 21:19 ` [PATCH v4 8/9] dax/kmem: add sysfs interface for atomic hotplug Gregory Price
@ 2026-06-09 10:26 ` David Hildenbrand (Arm)
2026-06-09 15:35 ` Gregory Price
0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-09 10:26 UTC (permalink / raw)
To: Gregory Price, linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, ljs, liam, vbabka, rppt, surenb,
mhocko, osalvador, shuah, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple,
Hannes Reinecke
On 6/5/26 23:19, Gregory Price wrote:
> The dax kmem driver currently onlines memory automatically during
> probe using the system's default online policy but provides no way
> to control or query the entire region state at runtime.
>
> Additionally, there is no atomic mechanism to offline and remove
> the entire set of memory blocks together. Instead, this is presently
> done in two steps: (offline all, remove all). This creates a race
> condition where external entities can operate directly on the blocks
> and cause hot-unplug to fail.
>
> Add a new 'hotplug' sysfs attribute that allows userspace to control
> and query the entire memory region state. The writable states mirror
> the per-block /sys/devices/system/memory/memoryX/state ABI:
> - "unplugged": memory blocks are not present
> - "online": memory is online, zone chosen by the kernel
> - "online_kernel": memory is online in ZONE_NORMAL
> - "online_movable": memory is online in ZONE_MOVABLE
>
> The "unplugged" state is new and only applies to kmem/hotplug.
>
> Valid transitions:
> - unplugged -> online[_kernel|_movable]
> - online | online_kernel | online_movable -> unplugged
> - offline -> unplugged
>
> A device can only be onlined from "unplugged", so it must be returned
> there before being onlined into a different state.
>
> For backwards compatibility the memory blocks are always created at
> probe: existing tools expect them to be present once the kmem driver
> binds. When the configured policy (mhp_get_default_online_type())
> selects an online state the blocks are onlined into that policy's zone;
> when the policy is offline the blocks are created but left offline and
> the device reports the state "offline".
>
> "offline" is therefore a reportable state but is not writable: it only
> arises from the legacy auto_online_blocks=offline policy. Onlining such
> a device through this attribute requires unplugging it first.
>
> The "offline" state may be deprecated later if the memory block ABI
> changes and userland migrates to using the region-wide hotplug.
>
> Unplug is atomic across the whole device: dax_kmem_do_hotremove()
> collects every added range and offlines/removes them in one operation
> via offline_and_remove_memory_ranges(). Either all ranges are removed
> and the device becomes "unplugged", or offlining is rolled back and the
> device is left fully online, so the reported 'hotplug' state always
> matches reality.
>
> Unbind Note:
> We used to call remove_memory() during unbind, which would fire a
> BUG() if any of the memory blocks were online at that time. We lift
> this into a WARN in the cleanup routine and don't attempt hotremove
> if ->state is not DAX_KMEM_UNPLUGGED or MMOP_OFFLINE. Memory that is
> merely offline (the legacy "offline" state) is removed on unbind as
> before; only online memory is left pinned.
>
> The resources are still leaked but this prevents deadlock on unbind
> if a memory region happens to be impossible to hotremove.
>
> Inconsistency Note:
>
> Since memory blocks can still be modified individually, the hotplug
> attribute can become out of sync with the state of the system if
> userland software mixes and matches the use of memory_block ABI and
> kmem/hotplug ABI. It's suggests to use one or the other.
>
> Suggested-by: Hannes Reinecke <hare@suse.de>
> Suggested-by: David Hildenbrand <david@kernel.org>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
[...]
>
> +static int dax_kmem_parse_state(const char *buf)
> +{
> + if (sysfs_streq(buf, "unplugged"))
> + return DAX_KMEM_UNPLUGGED;
> + if (sysfs_streq(buf, "online"))
> + return MMOP_ONLINE;
> + if (sysfs_streq(buf, "online_kernel"))
> + return MMOP_ONLINE_KERNEL;
> + if (sysfs_streq(buf, "online_movable"))
> + return MMOP_ONLINE_MOVABLE;
> + return -EINVAL;
Should we try making use of mhp_online_type_from_str()/online_type_to_str()
[possibly a nicer exported function for the latter] to avoid duplicating this ...
> +}
> +
> +static ssize_t hotplug_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct dax_kmem_data *data = dev_get_drvdata(dev);
> + const char *state_str;
> +
> + if (!data)
> + return -ENXIO;
> +
> + switch (data->state) {
> + case DAX_KMEM_UNPLUGGED:
> + state_str = "unplugged";
> + break;
> + case MMOP_OFFLINE:
> + state_str = "offline";
> + break;
> + case MMOP_ONLINE:
> + state_str = "online";
> + break;
> + case MMOP_ONLINE_KERNEL:
> + state_str = "online_kernel";
> + break;
> + case MMOP_ONLINE_MOVABLE:
> + state_str = "online_movable";
> + break;
...
and this?
[sorry if we already discussed this]
--
Cheers,
David
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v4 8/9] dax/kmem: add sysfs interface for atomic hotplug
2026-06-09 10:26 ` David Hildenbrand (Arm)
@ 2026-06-09 15:35 ` Gregory Price
0 siblings, 0 replies; 22+ messages in thread
From: Gregory Price @ 2026-06-09 15:35 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: linux-mm, nvdimm, linux-kernel, kernel-team, linux-cxl,
linux-kselftest, djbw, vishal.l.verma, dave.jiang, akpm, ljs,
liam, vbabka, rppt, surenb, mhocko, osalvador, shuah,
alison.schofield, Smita.KoralahalliChannabasappa, ira.weiny,
apopple, Hannes Reinecke
On Tue, Jun 09, 2026 at 12:26:07PM +0200, David Hildenbrand (Arm) wrote:
> On 6/5/26 23:19, Gregory Price wrote:
>
> >
> > +static int dax_kmem_parse_state(const char *buf)
> > +{
> > + if (sysfs_streq(buf, "unplugged"))
> > + return DAX_KMEM_UNPLUGGED;
> > + if (sysfs_streq(buf, "online"))
> > + return MMOP_ONLINE;
> > + if (sysfs_streq(buf, "online_kernel"))
> > + return MMOP_ONLINE_KERNEL;
> > + if (sysfs_streq(buf, "online_movable"))
> > + return MMOP_ONLINE_MOVABLE;
> > + return -EINVAL;
>
> Should we try making use of mhp_online_type_from_str()/online_type_to_str()
> [possibly a nicer exported function for the latter] to avoid duplicating this ...
>
In patch 6 response i point out adding MMOP_UNPLUGGED
If we add MMOP_UNPLUGGED as a state that is only use by callers of
memory hotplug to represent the current state - but not as a valid input
to memory_hotplug.c - then we can simply this as you request.
Although we'll need to add a couple lines to memoryN/state parsing code
to disallow MMOP_UNPLUGGED as a valid input (otherwise you could
permanently unplug memory without the ability to get it back... unless
you want that? Seems useless to me.)
~Gregory
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4 9/9] selftests/dax: add dax/kmem hotplug sysfs regression test
2026-06-05 21:19 [PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs Gregory Price
` (7 preceding siblings ...)
2026-06-05 21:19 ` [PATCH v4 8/9] dax/kmem: add sysfs interface for atomic hotplug Gregory Price
@ 2026-06-05 21:19 ` Gregory Price
8 siblings, 0 replies; 22+ messages in thread
From: Gregory Price @ 2026-06-05 21:19 UTC (permalink / raw)
To: linux-mm, nvdimm
Cc: linux-kernel, kernel-team, linux-cxl, linux-kselftest, djbw,
vishal.l.verma, dave.jiang, akpm, david, ljs, liam, vbabka, rppt,
surenb, mhocko, osalvador, shuah, gourry, alison.schofield,
Smita.KoralahalliChannabasappa, ira.weiny, apopple
Add a kselftest for the dax/kmem whole-device "hotplug" sysfs attribute
(/sys/bus/dax/devices/daxX.Y/hotplug), which transitions a kmem-backed
dax device between "unplugged", "online" and "online_movable".
Provisioning a devdax device and binding it to kmem needs daxctl/ndctl
(or the tools/testing/nvdimm emulation) and is out of scope for an
in-tree selftest, so the test discovers an already kmem-bound dax device
and SKIPs (KSFT_SKIP) when none is present or when the memory cannot be
freed to reach a known baseline.
When a device is available it validates the interface contract:
- online / online_movable actually add memory (MemTotal grows),
- online is idempotent,
- switching between online types without an intervening unplug is
rejected,
- unplug removes the memory and the reported state matches reality,
- invalid input is rejected.
In particular it covers the online -> unplug -> online_movable -> unplug
cycle: a re-online must re-reserve the per-range resources so that a
subsequent unplug actually offlines and removes the memory instead of
silently reporting success while the memory stays online.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/dax/Makefile | 6 +
tools/testing/selftests/dax/config | 4 +
.../testing/selftests/dax/dax-kmem-hotplug.sh | 145 ++++++++++++++++++
tools/testing/selftests/dax/settings | 1 +
5 files changed, 157 insertions(+)
create mode 100644 tools/testing/selftests/dax/Makefile
create mode 100644 tools/testing/selftests/dax/config
create mode 100755 tools/testing/selftests/dax/dax-kmem-hotplug.sh
create mode 100644 tools/testing/selftests/dax/settings
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 6e59b8f63e41..8c2b4f97619c 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -14,6 +14,7 @@ TARGETS += core
TARGETS += cpufreq
TARGETS += cpu-hotplug
TARGETS += damon
+TARGETS += dax
TARGETS += devices/error_logs
TARGETS += devices/probe
TARGETS += dmabuf-heaps
diff --git a/tools/testing/selftests/dax/Makefile b/tools/testing/selftests/dax/Makefile
new file mode 100644
index 000000000000..25a4f3d73a5b
--- /dev/null
+++ b/tools/testing/selftests/dax/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+all:
+
+TEST_PROGS := dax-kmem-hotplug.sh
+
+include ../lib.mk
diff --git a/tools/testing/selftests/dax/config b/tools/testing/selftests/dax/config
new file mode 100644
index 000000000000..4c9aaeb6ceb4
--- /dev/null
+++ b/tools/testing/selftests/dax/config
@@ -0,0 +1,4 @@
+CONFIG_DEV_DAX=m
+CONFIG_DEV_DAX_KMEM=m
+CONFIG_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTREMOVE=y
diff --git a/tools/testing/selftests/dax/dax-kmem-hotplug.sh b/tools/testing/selftests/dax/dax-kmem-hotplug.sh
new file mode 100755
index 000000000000..705a34cc3c6d
--- /dev/null
+++ b/tools/testing/selftests/dax/dax-kmem-hotplug.sh
@@ -0,0 +1,145 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Exercise the dax/kmem whole-device "hotplug" sysfs attribute:
+# /sys/bus/dax/devices/daxX.Y/hotplug -> unplugged | online | online_movable
+#
+# The test needs a dax device already bound to the kmem driver (so the
+# 'hotplug' attribute exists). Provisioning a devdax device and binding it to
+# kmem requires daxctl/ndctl (or the tools/testing/nvdimm emulation) and is out
+# of scope here; if no suitable device is found the test SKIPs.
+#
+# To actually run it, provision a kmem-backed dax device first. For example,
+# carve a chunk of RAM into an emulated pmem region via the kernel command line
+# (the region must be at least one memory block, e.g. 128MiB on x86):
+#
+# memmap=2G!4G
+#
+# then, in the booted system:
+#
+# ndctl create-namespace -m devdax -e namespace0.0 -f
+# daxctl reconfigure-device -N -m system-ram dax0.0 # binds the kmem driver
+# ./dax-kmem-hotplug.sh
+
+DIR="$(dirname "$(readlink -f "$0")")"
+. "$DIR"/../kselftest/ktap_helpers.sh
+
+DAX_BASE=/sys/bus/dax/devices
+
+memtotal_kb() { awk '/^MemTotal:/ {print $2}' /proc/meminfo; }
+get_state() { cat "$HP" 2>/dev/null; }
+# set_state STATE -- write a state to the hotplug attribute; returns the
+# write's exit status (0 = accepted by the kernel)
+set_state() { echo "$1" > "$HP" 2>/dev/null; }
+
+find_kmem_dax() {
+ local d drv
+ for d in "$DAX_BASE"/dax*; do
+ [ -e "$d/hotplug" ] || continue
+ drv=$(readlink "$d/driver" 2>/dev/null)
+ [ "$(basename "${drv:-}")" = kmem ] || continue
+ basename "$d"
+ return 0
+ done
+ return 1
+}
+
+ktap_print_header
+
+if [ "$UID" != 0 ]; then
+ ktap_skip_all "must be run as root"
+ exit "$KSFT_SKIP"
+fi
+
+DAX=$(find_kmem_dax)
+if [ -z "$DAX" ]; then
+ ktap_skip_all "no kmem-bound dax device with a hotplug attribute"
+ exit "$KSFT_SKIP"
+fi
+HP=$DAX_BASE/$DAX/hotplug
+ORIG=$(get_state)
+
+# A failure to reach the baseline is environmental (memory in use), not an
+# interface failure, so skip rather than fail.
+set_state unplugged; rc=$?
+if [ "$rc" != 0 ] || [ "$(get_state)" != unplugged ]; then
+ ktap_skip_all "$DAX: cannot reach 'unplugged' baseline (memory in use?)"
+ [ -n "$ORIG" ] && set_state "$ORIG"
+ exit "$KSFT_SKIP"
+fi
+mt_unplugged=$(memtotal_kb)
+
+ktap_print_msg "using $DAX (initial state was: $ORIG)"
+ktap_set_plan 8
+
+set_state online; rc=$?
+mt_online=$(memtotal_kb)
+if [ "$rc" = 0 ] && [ "$(get_state)" = online ] && [ "$mt_online" -gt "$mt_unplugged" ]; then
+ ktap_test_pass "online: state=online, MemTotal $mt_unplugged -> $mt_online kB"
+else
+ ktap_test_fail "online: rc=$rc state=$(get_state) MemTotal $mt_unplugged -> $mt_online"
+fi
+
+set_state online; rc=$?
+if [ "$rc" = 0 ] && [ "$(get_state)" = online ]; then
+ ktap_test_pass "online idempotent"
+else
+ ktap_test_fail "online idempotent: rc=$rc state=$(get_state)"
+fi
+
+set_state online_movable; rc=$?
+if [ "$rc" != 0 ] && [ "$(get_state)" = online ]; then
+ ktap_test_pass "reject online_movable without intervening unplug"
+else
+ ktap_test_fail "online->online_movable not rejected: rc=$rc state=$(get_state)"
+fi
+
+set_state unplugged; rc=$?
+mt=$(memtotal_kb)
+if [ "$rc" = 0 ] && [ "$(get_state)" = unplugged ] && [ "$mt" -lt "$mt_online" ]; then
+ ktap_test_pass "unplug from online: MemTotal $mt_online -> $mt kB"
+else
+ ktap_test_fail "unplug from online: rc=$rc state=$(get_state) MemTotal $mt_online -> $mt"
+fi
+
+set_state online_movable; rc=$?
+mt_mov=$(memtotal_kb)
+if [ "$rc" = 0 ] && [ "$(get_state)" = online_movable ] && [ "$mt_mov" -gt "$mt_unplugged" ]; then
+ ktap_test_pass "online_movable after unplug: MemTotal $mt_unplugged -> $mt_mov kB"
+else
+ ktap_test_fail "online_movable after unplug: rc=$rc state=$(get_state) MemTotal=$mt_mov"
+fi
+
+# The online -> unplug -> online_movable -> unplug cycle once regressed: a
+# re-online failed to re-reserve the per-range resources, so this final unplug
+# reported success while leaving the memory online. Assert it is really freed.
+set_state unplugged; rc=$?
+mt=$(memtotal_kb)
+if [ "$rc" != 0 ]; then
+ ktap_test_skip "unplug from movable not accepted (memory in use?) rc=$rc"
+elif [ "$(get_state)" = unplugged ] && [ "$mt" -lt "$mt_mov" ]; then
+ ktap_test_pass "unplug from online_movable removed memory: $mt_mov -> $mt kB"
+else
+ ktap_test_fail "unplug success but memory remained: $(get_state) $mt_mov -> $mt kB"
+fi
+
+set_state online_kernel; rc=$?
+mt=$(memtotal_kb)
+if [ "$rc" = 0 ] && [ "$(get_state)" = online_kernel ] && [ "$mt" -gt "$mt_unplugged" ]; then
+ ktap_test_pass "online_kernel: MemTotal $mt_unplugged -> $mt kB"
+else
+ ktap_test_fail "online_kernel: rc=$rc state=$(get_state) MemTotal=$mt"
+fi
+set_state unplugged
+
+before=$(get_state)
+set_state bogus_state; rc=$?
+if [ "$rc" != 0 ] && [ "$(get_state)" = "$before" ]; then
+ ktap_test_pass "reject invalid state string"
+else
+ ktap_test_fail "invalid state not rejected: rc=$rc state=$(get_state)"
+fi
+
+[ -n "$ORIG" ] && set_state "$ORIG"
+
+ktap_finished
diff --git a/tools/testing/selftests/dax/settings b/tools/testing/selftests/dax/settings
new file mode 100644
index 000000000000..ba4d85f74cd6
--- /dev/null
+++ b/tools/testing/selftests/dax/settings
@@ -0,0 +1 @@
+timeout=90
--
2.54.0
^ permalink raw reply related [flat|nested] 22+ messages in thread