* [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller
@ 2026-01-12 16:35 ` Gregory Price
2026-01-12 16:35 ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
` (7 more replies)
0 siblings, 8 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 16:35 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
The CXL driver currently hands policy management over to the DAX
subsystem for sysram regions. This makes building policy around
entire regions clunky and at times difficult - for example, requiring
multiple actions to reliably offline and hot-unplug memory.
This series introduces a memory controller abstraction for CXL regions
and adds a "sysram" controller that directly hotplugs memory without
needing to route through DAX. This simplifies the sysram use case
considerably.
This also prepares for future use cases which may require different
memory controller logic (such as private numa nodes).
We organize the controllers into core/memctrl/*_region.c files.
The series is organized as follows:
Patch 1 introduces the cxl_memctrl_mode enum and region->memctrl field,
allowing regions to be switched between different memory controllers.
The supported modes are NONE, AUTO, and DAX initially. Auto-created
regions default to AUTO, while manually created regions default to NONE
(requiring explicit controller selection).
Patch 2 adds the sysram_region memory controller, which provides direct
memory hotplug without DAX intermediation. New sysfs controls are
exposed under region/memctrl/:
- hotplug: trigger memory hotplug
- hotunplug: offline and hotunplug memory
- state: online/online_normal/offline
Patch 3 refactors existing pmem memctrl logic out of region.c into the
new memctrl/pmem_region.c, simplifying controller selection in region
probe.
Patch 4 adds CONFIG_CXL_REGION_CTRL_AUTO_* options, allowing users to
configure auto-regions to default to SYSRAM instead of DAX for existing
simple system configurations (i.e. local memory expansion only).
Patch 5 adds CONFIG_CXL_REGION_SYSRAM_DEFAULT_* options to control the
default state of sysram blocks (OFFLINE, ONLINE/ZONE_MOVABLE, or
ONLINE_NORMAL/ZONE_NORMAL). This provides an alternative to the global
MHP auto-online setting which may cause issues with other devices.
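The "online" state defaults to ZONE_MOVABLE so that memory stays
hot-unpluggable by default. This is the inverse of the memory block
sysfs interface, where "online" uses the default zone and ZONE_MOVABLE
must be requested explicitly with "online_movable".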
Patch 6 adds a memory_notify callback that prevents memory blocks from
being onlined into ZONE_NORMAL when the controller state is set to
ZONE_MOVABLE. This protects against administrators accidentally
breaking hot-unpluggability by writing "offline" then "online" to the
memory block sysfs.
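The policy described for patch 6 can be modelled in a few lines of
userspace C. This is only a sketch of the rule as stated above, not the
patch itself (the patch 6 diff is not reproduced here); the enum and
function names are hypothetical, and the treatment of the offline state
is an assumption.

```c
#include <stdbool.h>

/* Hypothetical model types; these are not kernel definitions. */
enum zone_model { ZONE_MODEL_NORMAL, ZONE_MODEL_MOVABLE };
enum ctrl_state { CTRL_OFFLINE, CTRL_ONLINE_MOVABLE, CTRL_ONLINE_NORMAL };

/*
 * Decide whether a request to online a memory block may proceed.
 * Only the combination named in the cover letter is vetoed: the
 * controller state is movable-only and the target zone is
 * ZONE_NORMAL. Every other combination is allowed in this model
 * (an assumption for states the text does not discuss).
 */
static bool allow_online(enum ctrl_state state, enum zone_model target)
{
	if (state == CTRL_ONLINE_MOVABLE && target == ZONE_MODEL_NORMAL)
		return false;
	return true;
}
```

In the kernel such a check would presumably run from the memory
notifier while the block is going online, so the veto takes effect
before any pages land in ZONE_NORMAL.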
Gregory Price (6):
drivers/cxl: add cxl_memctrl_mode and region->memctrl
cxl: add sysram_region memory controller
cxl/core/region: move pmem memctrl logic into memctrl/pmem_region
cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options
cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only
drivers/cxl/Kconfig | 72 ++++
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/core.h | 5 +
drivers/cxl/core/memctrl/Makefile | 6 +
drivers/cxl/core/memctrl/dax_region.c | 79 ++++
drivers/cxl/core/memctrl/memctrl.c | 48 +++
drivers/cxl/core/memctrl/pmem_region.c | 191 +++++++++
drivers/cxl/core/memctrl/sysram_region.c | 520 +++++++++++++++++++++++
drivers/cxl/core/region.c | 358 ++++------------
drivers/cxl/cxl.h | 18 +
10 files changed, 1013 insertions(+), 285 deletions(-)
create mode 100644 drivers/cxl/core/memctrl/Makefile
create mode 100644 drivers/cxl/core/memctrl/dax_region.c
create mode 100644 drivers/cxl/core/memctrl/memctrl.c
create mode 100644 drivers/cxl/core/memctrl/pmem_region.c
create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
--
2.52.0
* [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
@ 2026-01-12 16:35 ` Gregory Price
2026-01-12 20:59 ` dan.j.williams
` (2 more replies)
2026-01-12 16:35 ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
` (6 subsequent siblings)
7 siblings, 3 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 16:35 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
The CXL driver presently hands policy management over to the DAX
subsystem for sysram regions, which makes building policy around the
entire region clunky and at times difficult (e.g. multiple actions to
offline and hot-unplug memory reliably).
To support multiple backend controllers for memory regions (for example
dax vs direct hotplug), implement a memctrl field in cxl_region that
allows switching uncommitted regions between different "memory
controllers".
CXL_MEMCTRL_NONE: No selected controller, probe will fail.
CXL_MEMCTRL_AUTO: If memory is already online as SysRAM, no controller;
otherwise register a dax_region
CXL_MEMCTRL_DAX : register a dax_region
Auto regions will either be static sysram (BIOS-onlined) with no
region controller associated, or, if the SP bit was set, a DAX device
will be created for them.
Rather than default all regions to the auto-controller, only default
auto-regions to the auto controller.
Non-auto regions will default to CXL_MEMCTRL_NONE, which will cause
a failure to probe unless a controller is selected.
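The selection rules above can be summarised with a small userspace
model. It is a sketch only: struct region_model and its fields are
hypothetical stand-ins for struct cxl_region, and the "register a
dax_region" step is reduced to setting a flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors enum cxl_memctrl_mode as introduced by this patch. */
enum memctrl_mode { MEMCTRL_NONE, MEMCTRL_AUTO, MEMCTRL_DAX };

/* Hypothetical stand-in for struct cxl_region. */
struct region_model {
	enum memctrl_mode memctrl;
	bool bios_onlined;   /* range is already online as System RAM */
	bool has_dax_region; /* set where the kernel would add a dax_region */
};

/* Auto-assembled regions default to AUTO, manually created ones to NONE. */
static enum memctrl_mode default_memctrl(bool auto_region)
{
	return auto_region ? MEMCTRL_AUTO : MEMCTRL_NONE;
}

/* Model of the probe-time dispatch; returns 0 or a negative errno. */
static int enable_memctrl(struct region_model *r)
{
	switch (r->memctrl) {
	case MEMCTRL_AUTO:
		if (r->bios_onlined)
			return 0; /* static sysram: no controller attached */
		r->has_dax_region = true;
		return 0;
	case MEMCTRL_DAX:
		r->has_dax_region = true;
		return 0;
	default:
		return -EINVAL; /* MEMCTRL_NONE: probe fails */
	}
}
```

In this model a manually created region fails with -EINVAL until its
mode is changed, which corresponds to writing "dax" to the region's
ctrl attribute before commit.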
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/core.h | 2 +
drivers/cxl/core/memctrl/Makefile | 4 +
drivers/cxl/core/memctrl/dax_region.c | 79 +++++++++++++++
drivers/cxl/core/memctrl/memctrl.c | 42 ++++++++
drivers/cxl/core/region.c | 136 ++++++++++----------------
drivers/cxl/cxl.h | 14 +++
7 files changed, 192 insertions(+), 86 deletions(-)
create mode 100644 drivers/cxl/core/memctrl/Makefile
create mode 100644 drivers/cxl/core/memctrl/dax_region.c
create mode 100644 drivers/cxl/core/memctrl/memctrl.c
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 5ad8fef210b5..79de20e3f8aa 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -17,6 +17,7 @@ cxl_core-y += cdat.o
cxl_core-y += ras.o
cxl_core-$(CONFIG_TRACING) += trace.o
cxl_core-$(CONFIG_CXL_REGION) += region.o
+include $(src)/memctrl/Makefile
cxl_core-$(CONFIG_CXL_MCE) += mce.o
cxl_core-$(CONFIG_CXL_FEATURES) += features.o
cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 1fb66132b777..1156a4bd0080 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -42,6 +42,8 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port);
struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa);
+int cxl_enable_memctrl(struct cxl_region *cxlr);
+int devm_cxl_add_dax_region(struct cxl_region *cxlr);
#else
static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
new file mode 100644
index 000000000000..8165aad5a52a
--- /dev/null
+++ b/drivers/cxl/core/memctrl/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+
+cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
+cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
diff --git a/drivers/cxl/core/memctrl/dax_region.c b/drivers/cxl/core/memctrl/dax_region.c
new file mode 100644
index 000000000000..90d7fdb97013
--- /dev/null
+++ b/drivers/cxl/core/memctrl/dax_region.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <cxlmem.h>
+#include <cxl.h>
+#include "../core.h"
+
+static struct lock_class_key cxl_dax_region_key;
+
+static struct cxl_dax_region *cxl_dax_region_alloc(struct cxl_region *cxlr)
+{
+ struct cxl_region_params *p = &cxlr->params;
+ struct cxl_dax_region *cxlr_dax;
+ struct device *dev;
+
+ guard(rwsem_read)(&cxl_rwsem.region);
+ if (p->state != CXL_CONFIG_COMMIT)
+ return ERR_PTR(-ENXIO);
+
+ cxlr_dax = kzalloc(sizeof(*cxlr_dax), GFP_KERNEL);
+ if (!cxlr_dax)
+ return ERR_PTR(-ENOMEM);
+
+ cxlr_dax->hpa_range.start = p->res->start;
+ cxlr_dax->hpa_range.end = p->res->end;
+
+ dev = &cxlr_dax->dev;
+ cxlr_dax->cxlr = cxlr;
+ device_initialize(dev);
+ lockdep_set_class(&dev->mutex, &cxl_dax_region_key);
+ device_set_pm_not_required(dev);
+ dev->parent = &cxlr->dev;
+ dev->bus = &cxl_bus_type;
+ dev->type = &cxl_dax_region_type;
+
+ return cxlr_dax;
+}
+
+static void cxlr_dax_unregister(void *_cxlr_dax)
+{
+ struct cxl_dax_region *cxlr_dax = _cxlr_dax;
+
+ device_unregister(&cxlr_dax->dev);
+}
+
+/*
+ * The dax controller is the default controller and simply hands the
+ * control pattern over to the dax driver. It does so with a dax_region
+ * built by dax/cxl.c
+ */
+int devm_cxl_add_dax_region(struct cxl_region *cxlr)
+{
+ struct cxl_dax_region *cxlr_dax;
+ struct device *dev;
+ int rc;
+
+ cxlr_dax = cxl_dax_region_alloc(cxlr);
+ if (IS_ERR(cxlr_dax))
+ return PTR_ERR(cxlr_dax);
+
+ dev = &cxlr_dax->dev;
+ rc = dev_set_name(dev, "dax_region%d", cxlr->id);
+ if (rc)
+ goto err;
+
+ rc = device_add(dev);
+ if (rc)
+ goto err;
+
+ dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
+ dev_name(dev));
+
+ return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
+ cxlr_dax);
+err:
+ put_device(dev);
+ return rc;
+}
diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
new file mode 100644
index 000000000000..24e0e14b39c7
--- /dev/null
+++ b/drivers/cxl/core/memctrl/memctrl.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
+/* Copyright(c) 2026 Meta Inc. All rights reserved. */
+#include <linux/device.h>
+#include <linux/ioport.h>
+#include <cxlmem.h>
+#include <cxl.h>
+#include "../core.h"
+
+static int is_system_ram(struct resource *res, void *arg)
+{
+ struct cxl_region *cxlr = arg;
+ struct cxl_region_params *p = &cxlr->params;
+
+ dev_dbg(&cxlr->dev, "%pr has System RAM: %pr\n", p->res, res);
+ return 1;
+}
+
+int cxl_enable_memctrl(struct cxl_region *cxlr)
+{
+ struct cxl_region_params *p = &cxlr->params;
+
+ switch (cxlr->memctrl) {
+ case CXL_MEMCTRL_AUTO:
+ /*
+ * The region can not be managed by CXL if any portion of
+ * it is already online as 'System RAM'
+ */
+ if (walk_iomem_res_desc(IORES_DESC_NONE,
+ IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+ p->res->start, p->res->end, cxlr,
+ is_system_ram) > 0)
+ return 0;
+ return devm_cxl_add_dax_region(cxlr);
+ case CXL_MEMCTRL_DAX:
+ return devm_cxl_add_dax_region(cxlr);
+ default:
+ return -EINVAL;
+ }
+}
+
+
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ae899f68551f..02d7d9ae0252 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -626,6 +626,50 @@ static ssize_t mode_show(struct device *dev, struct device_attribute *attr,
}
static DEVICE_ATTR_RO(mode);
+static ssize_t ctrl_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct cxl_region *cxlr = to_cxl_region(dev);
+ const char *desc;
+
+ switch (cxlr->memctrl) {
+ case CXL_MEMCTRL_AUTO:
+ desc = "auto";
+ break;
+ case CXL_MEMCTRL_DAX:
+ desc = "dax";
+ break;
+ default:
+ desc = "";
+ break;
+ }
+
+ return sysfs_emit(buf, "%s\n", desc);
+}
+
+static ssize_t ctrl_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct cxl_region *cxlr = to_cxl_region(dev);
+ struct cxl_region_params *p = &cxlr->params;
+ int rc;
+
+ ACQUIRE(rwsem_write_kill, rwsem)(&cxl_rwsem.region);
+ if ((rc = ACQUIRE_ERR(rwsem_write_kill, &rwsem)))
+ return rc;
+
+ if (p->state >= CXL_CONFIG_COMMIT)
+ return -EBUSY;
+
+ if (sysfs_streq(buf, "dax"))
+ cxlr->memctrl = CXL_MEMCTRL_DAX;
+ else
+ return -EINVAL;
+
+ return len;
+}
+static DEVICE_ATTR_RW(ctrl);
+
static int alloc_hpa(struct cxl_region *cxlr, resource_size_t size)
{
struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
@@ -772,6 +816,7 @@ static struct attribute *cxl_region_attrs[] = {
&dev_attr_size.attr,
&dev_attr_mode.attr,
&dev_attr_extended_linear_cache_size.attr,
+ &dev_attr_ctrl.attr,
NULL,
};
@@ -2598,6 +2643,7 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
return cxlr;
cxlr->mode = mode;
cxlr->type = type;
+ cxlr->memctrl = CXL_MEMCTRL_NONE;
dev = &cxlr->dev;
rc = dev_set_name(dev, "region%d", id);
@@ -3307,37 +3353,6 @@ struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
}
EXPORT_SYMBOL_NS_GPL(to_cxl_dax_region, "CXL");
-static struct lock_class_key cxl_dax_region_key;
-
-static struct cxl_dax_region *cxl_dax_region_alloc(struct cxl_region *cxlr)
-{
- struct cxl_region_params *p = &cxlr->params;
- struct cxl_dax_region *cxlr_dax;
- struct device *dev;
-
- guard(rwsem_read)(&cxl_rwsem.region);
- if (p->state != CXL_CONFIG_COMMIT)
- return ERR_PTR(-ENXIO);
-
- cxlr_dax = kzalloc(sizeof(*cxlr_dax), GFP_KERNEL);
- if (!cxlr_dax)
- return ERR_PTR(-ENOMEM);
-
- cxlr_dax->hpa_range.start = p->res->start;
- cxlr_dax->hpa_range.end = p->res->end;
-
- dev = &cxlr_dax->dev;
- cxlr_dax->cxlr = cxlr;
- device_initialize(dev);
- lockdep_set_class(&dev->mutex, &cxl_dax_region_key);
- device_set_pm_not_required(dev);
- dev->parent = &cxlr->dev;
- dev->bus = &cxl_bus_type;
- dev->type = &cxl_dax_region_type;
-
- return cxlr_dax;
-}
-
static void cxlr_pmem_unregister(void *_cxlr_pmem)
{
struct cxl_pmem_region *cxlr_pmem = _cxlr_pmem;
@@ -3424,42 +3439,6 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
return rc;
}
-static void cxlr_dax_unregister(void *_cxlr_dax)
-{
- struct cxl_dax_region *cxlr_dax = _cxlr_dax;
-
- device_unregister(&cxlr_dax->dev);
-}
-
-static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
-{
- struct cxl_dax_region *cxlr_dax;
- struct device *dev;
- int rc;
-
- cxlr_dax = cxl_dax_region_alloc(cxlr);
- if (IS_ERR(cxlr_dax))
- return PTR_ERR(cxlr_dax);
-
- dev = &cxlr_dax->dev;
- rc = dev_set_name(dev, "dax_region%d", cxlr->id);
- if (rc)
- goto err;
-
- rc = device_add(dev);
- if (rc)
- goto err;
-
- dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
- dev_name(dev));
-
- return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
- cxlr_dax);
-err:
- put_device(dev);
- return rc;
-}
-
static int match_decoder_by_range(struct device *dev, const void *data)
{
const struct range *r1, *r2 = data;
@@ -3579,6 +3558,9 @@ static int __construct_region(struct cxl_region *cxlr,
set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
+ /* Auto-regions will either be static sysram (onlined by BIOS) or DAX */
+ cxlr->memctrl = CXL_MEMCTRL_AUTO;
+
res = kmalloc(sizeof(*res), GFP_KERNEL);
if (!res)
return -ENOMEM;
@@ -3755,15 +3737,6 @@ u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
}
EXPORT_SYMBOL_NS_GPL(cxl_port_get_spa_cache_alias, "CXL");
-static int is_system_ram(struct resource *res, void *arg)
-{
- struct cxl_region *cxlr = arg;
- struct cxl_region_params *p = &cxlr->params;
-
- dev_dbg(&cxlr->dev, "%pr has System RAM: %pr\n", p->res, res);
- return 1;
-}
-
static void shutdown_notifiers(void *_cxlr)
{
struct cxl_region *cxlr = _cxlr;
@@ -3965,16 +3938,7 @@ static int cxl_region_probe(struct device *dev)
dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
cxlr->id);
- /*
- * The region can not be manged by CXL if any portion of
- * it is already online as 'System RAM'
- */
- if (walk_iomem_res_desc(IORES_DESC_NONE,
- IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
- p->res->start, p->res->end, cxlr,
- is_system_ram) > 0)
- return 0;
- return devm_cxl_add_dax_region(cxlr);
+ return cxl_enable_memctrl(cxlr);
default:
dev_dbg(&cxlr->dev, "unsupported region mode: %d\n",
cxlr->mode);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ba17fa86d249..b8fabaa77262 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -502,6 +502,19 @@ enum cxl_partition_mode {
CXL_PARTMODE_PMEM,
};
+
+/*
+ * Memory Controller modes:
+ * None - No controller selected
+ * Auto - either BIOS-configured as SysRAM, or default to DAX
+ * DAX - creates a dax_region controller for the cxl_region
+ */
+enum cxl_memctrl_mode {
+ CXL_MEMCTRL_NONE,
+ CXL_MEMCTRL_AUTO,
+ CXL_MEMCTRL_DAX,
+};
+
/*
* Indicate whether this region has been assembled by autodetection or
* userspace assembly. Prevent endpoint decoders outside of automatic
@@ -543,6 +556,7 @@ struct cxl_region {
struct device dev;
int id;
enum cxl_partition_mode mode;
+ enum cxl_memctrl_mode memctrl;
enum cxl_decoder_type type;
struct cxl_nvdimm_bridge *cxl_nvb;
struct cxl_pmem_region *cxlr_pmem;
--
2.52.0
* [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
2026-01-12 16:35 ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
@ 2026-01-12 16:35 ` Gregory Price
2026-01-12 20:00 ` David Hildenbrand (Red Hat)
` (2 more replies)
2026-01-12 16:35 ` [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region Gregory Price
` (5 subsequent siblings)
7 siblings, 3 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 16:35 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams,
David Hildenbrand
Add a sysram memctrl that directly hotplugs memory without needing to
route through DAX. This simplifies the sysram use case considerably.
The sysram memctrl adds new sysfs controls when registered:
region/memctrl/[hotplug, hotunplug, state]
hotplug: controller attempts to hotplug the memory region
hotunplug: controller attempts to offline and hotunplug the memory region
state: [online,online_normal,offline]
online : controller onlines blocks in ZONE_MOVABLE
online_normal: controller onlines blocks in ZONE_NORMAL
offline : controller attempts to offline the memory blocks
Hotplug note - by default the controller will hotplug the blocks, but
leave them offline (unless MHP auto-online in Kconfig is enabled).
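One detail of the hotplug path that the text above does not spell out:
cxl_sysram_range() in the diff below trims the region to memory-block
boundaries before hotplugging. The arithmetic can be checked with a
userspace sketch; struct range_model and the helper names are
hypothetical, and the block size is passed in rather than read from
memory_block_size_bytes().

```c
#include <errno.h>
#include <stdint.h>

/* Inclusive [start, end] range, in the style of the kernel's struct range. */
struct range_model {
	uint64_t start;
	uint64_t end;
};

/* Power-of-two rounding, equivalent in effect to ALIGN()/ALIGN_DOWN(). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

static uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/*
 * Trim [res_start, res_end] to whole memory blocks. If nothing is
 * left, report the untrimmed range and return -ENOSPC, as the patch
 * does.
 */
static int sysram_range(uint64_t res_start, uint64_t res_end,
			uint64_t block, struct range_model *r)
{
	r->start = align_up(res_start, block);
	r->end = align_down(res_end + 1, block) - 1;
	if (r->start >= r->end) {
		r->start = res_start;
		r->end = res_end;
		return -ENOSPC;
	}
	return 0;
}
```

For example, with 128 MiB blocks (0x8000000), a 512 MiB region at
0x104000000-0x123ffffff is trimmed to 0x108000000-0x11fffffff, so only
384 MiB is hotplugged. A 64 MiB region at 0x104000000 contains no whole
block and is rejected with -ENOSPC.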
Setting state to "online_normal" may prevent future hot-unplug of sysram
regions, and unbinding a memory region with memory online in ZONE_NORMAL
may result in the device being removed but the memory remaining online.
This can result in future management functions failing (such as adding a
new region). This is why "online_normal" is explicit, and the default
online zone is ZONE_MOVABLE.
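The mapping from the state strings to zones can be sketched in
userspace as follows. The enum names are hypothetical (the patch itself
uses MMOP_ONLINE_MOVABLE and MMOP_ONLINE), and streq_nl() only
approximates sysfs_streq() by tolerating a single trailing newline in
the input.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical action names for the sketch. */
enum sysram_action { ACT_OFFLINE, ACT_ONLINE_MOVABLE, ACT_ONLINE_NORMAL };

/* Compare buf with want, ignoring one trailing newline in buf. */
static int streq_nl(const char *buf, const char *want)
{
	size_t n = strlen(want);

	if (strncmp(buf, want, n) != 0)
		return 0;
	return buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0');
}

/* Translate a write to the state attribute into an action. */
static int parse_state(const char *buf, enum sysram_action *out)
{
	if (streq_nl(buf, "online"))
		*out = ACT_ONLINE_MOVABLE;  /* default: keeps hot-unplug viable */
	else if (streq_nl(buf, "online_normal"))
		*out = ACT_ONLINE_NORMAL;   /* explicit opt-in to ZONE_NORMAL */
	else if (streq_nl(buf, "offline"))
		*out = ACT_OFFLINE;
	else
		return -EINVAL;
	return 0;
}
```

Note that "online_movable", which the memory block sysfs accepts, is
not a valid value here and is rejected with -EINVAL.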
Cc: David Hildenbrand <david@kernel.org>
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/cxl/core/core.h | 2 +
drivers/cxl/core/memctrl/Makefile | 1 +
drivers/cxl/core/memctrl/memctrl.c | 2 +
drivers/cxl/core/memctrl/sysram_region.c | 358 +++++++++++++++++++++++
drivers/cxl/core/region.c | 5 +
drivers/cxl/cxl.h | 6 +-
6 files changed, 372 insertions(+), 2 deletions(-)
create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 1156a4bd0080..18cb84950500 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -31,6 +31,8 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
struct cxl_endpoint_decoder *cxled, int pos,
enum cxl_detach_mode mode);
+int devm_cxl_add_sysram_region(struct cxl_region *cxlr);
+
#define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
#define CXL_REGION_TYPE(x) (&cxl_region_type)
#define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
index 8165aad5a52a..1c52c7d75570 100644
--- a/drivers/cxl/core/memctrl/Makefile
+++ b/drivers/cxl/core/memctrl/Makefile
@@ -2,3 +2,4 @@
cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
+cxl_core-$(CONFIG_CXL_REGION) += memctrl/sysram_region.o
diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
index 24e0e14b39c7..40ffb59353bb 100644
--- a/drivers/cxl/core/memctrl/memctrl.c
+++ b/drivers/cxl/core/memctrl/memctrl.c
@@ -34,6 +34,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
return devm_cxl_add_dax_region(cxlr);
case CXL_MEMCTRL_DAX:
return devm_cxl_add_dax_region(cxlr);
+ case CXL_MEMCTRL_SYSRAM:
+ return devm_cxl_add_sysram_region(cxlr);
default:
return -EINVAL;
}
diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
new file mode 100644
index 000000000000..a7570c8a54e1
--- /dev/null
+++ b/drivers/cxl/core/memctrl/sysram_region.c
@@ -0,0 +1,358 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2026 Meta Inc. All rights reserved. */
+#include <linux/memremap.h>
+#include <linux/memory.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/mm.h>
+#include <linux/memory-tiers.h>
+#include <linux/memory_hotplug.h>
+#include <linux/string_helpers.h>
+#include <linux/sched/signal.h>
+#include <cxlmem.h>
+#include <cxl.h>
+#include "../core.h"
+
+/* If HMAT was unavailable, assign a default distance. */
+#define MEMTIER_DEFAULT_CXL_ADISTANCE (MEMTIER_ADISTANCE_DRAM * 5)
+
+static const char *sysram_name = "System RAM (CXL)";
+
+struct cxl_sysram_data {
+ const char *res_name;
+ int mgid;
+ struct resource *res;
+};
+
+static DEFINE_MUTEX(cxl_memory_type_lock);
+static LIST_HEAD(cxl_memory_types);
+
+static struct cxl_region *to_cxl_region(struct device *dev)
+{
+ if (dev->type != &cxl_region_type)
+ return NULL;
+ return container_of(dev, struct cxl_region, dev);
+}
+
+static struct memory_dev_type *cxl_find_alloc_memory_type(int adist)
+{
+ guard(mutex)(&cxl_memory_type_lock);
+ return mt_find_alloc_memory_type(adist, &cxl_memory_types);
+}
+
+static void __maybe_unused cxl_put_memory_types(void)
+{
+ guard(mutex)(&cxl_memory_type_lock);
+ mt_put_memory_types(&cxl_memory_types);
+}
+
+static int cxl_sysram_range(struct cxl_region *cxlr, struct range *r)
+{
+ struct cxl_region_params *p = &cxlr->params;
+
+ if (!p->res)
+ return -ENODEV;
+
+ /* memory-block align the hotplug range */
+ r->start = ALIGN(p->res->start, memory_block_size_bytes());
+ r->end = ALIGN_DOWN(p->res->end + 1, memory_block_size_bytes()) - 1;
+ if (r->start >= r->end) {
+ r->start = p->res->start;
+ r->end = p->res->end;
+ return -ENOSPC;
+ }
+ return 0;
+}
+
+static ssize_t hotunplug_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct cxl_region *cxlr = to_cxl_region(dev);
+ struct range range;
+ int rc;
+
+ if (!cxlr)
+ return -ENODEV;
+
+ rc = cxl_sysram_range(cxlr, &range);
+ if (rc)
+ return rc;
+
+ rc = offline_and_remove_memory(range.start, range_len(&range));
+
+ if (rc)
+ return rc;
+
+ return len;
+}
+static DEVICE_ATTR_WO(hotunplug);
+
+struct online_memory_cb_arg {
+ int online_type;
+ int rc;
+};
+
+static int online_memory_block_cb(struct memory_block *mem, void *arg)
+{
+ struct online_memory_cb_arg *cb_arg = arg;
+
+ if (signal_pending(current))
+ return -EINTR;
+
+ cond_resched();
+
+ if (mem->state == MEM_ONLINE)
+ return 0;
+
+ mem->online_type = cb_arg->online_type;
+ cb_arg->rc = device_online(&mem->dev);
+
+ return cb_arg->rc;
+}
+
+static int offline_memory_block_cb(struct memory_block *mem, void *arg)
+{
+ int *rc = arg;
+
+ if (signal_pending(current))
+ return -EINTR;
+
+ cond_resched();
+
+ if (mem->state == MEM_OFFLINE)
+ return 0;
+
+ *rc = device_offline(&mem->dev);
+
+ return *rc;
+}
+
+static ssize_t state_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct cxl_region *cxlr = to_cxl_region(dev);
+ struct online_memory_cb_arg cb_arg;
+ struct range range;
+ int rc;
+
+ if (!cxlr)
+ return -ENODEV;
+
+ rc = cxl_sysram_range(cxlr, &range);
+ if (rc)
+ return rc;
+
+ rc = lock_device_hotplug_sysfs();
+ if (rc)
+ return rc;
+
+ if (sysfs_streq(buf, "online")) {
+ cb_arg.online_type = MMOP_ONLINE_MOVABLE;
+ cb_arg.rc = 0;
+ rc = walk_memory_blocks(range.start, range_len(&range),
+ &cb_arg, online_memory_block_cb);
+ if (!rc)
+ rc = cb_arg.rc;
+ } else if (sysfs_streq(buf, "online_normal")) {
+ cb_arg.online_type = MMOP_ONLINE;
+ cb_arg.rc = 0;
+ rc = walk_memory_blocks(range.start, range_len(&range),
+ &cb_arg, online_memory_block_cb);
+ if (!rc)
+ rc = cb_arg.rc;
+ } else if (sysfs_streq(buf, "offline")) {
+ int offline_rc = 0;
+
+ rc = walk_memory_blocks(range.start, range_len(&range),
+ &offline_rc, offline_memory_block_cb);
+ if (!rc)
+ rc = offline_rc;
+ } else {
+ rc = -EINVAL;
+ }
+
+ unlock_device_hotplug();
+
+ if (rc)
+ return rc;
+
+ return len;
+}
+static DEVICE_ATTR_WO(state);
+
+static ssize_t hotplug_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct cxl_region *cxlr = to_cxl_region(dev);
+ struct cxl_sysram_data *data;
+ struct range range;
+ int rc;
+
+ if (!cxlr)
+ return -ENODEV;
+
+ data = dev_get_drvdata(dev);
+ if (!data)
+ return -ENODEV;
+
+ rc = cxl_sysram_range(cxlr, &range);
+ if (rc)
+ return rc;
+
+ rc = add_memory_driver_managed(data->mgid, range.start,
+ range_len(&range), sysram_name,
+ MHP_NID_IS_MGID);
+ if (rc)
+ return rc;
+
+ return len;
+}
+static DEVICE_ATTR_WO(hotplug);
+
+static struct attribute *cxl_sysram_region_attrs[] = {
+ &dev_attr_hotunplug.attr,
+ &dev_attr_state.attr,
+ &dev_attr_hotplug.attr,
+ NULL,
+};
+
+static const struct attribute_group cxl_sysram_region_group = {
+ .name = "memctrl",
+ .attrs = cxl_sysram_region_attrs,
+};
+
+static void cxl_sysram_unregister(void *_data)
+{
+ struct cxl_sysram_data *data = _data;
+ struct range range = {
+ .start = data->res->start,
+ .end = data->res->end
+ };
+
+ /* We have one shot for removal, otherwise it's stuck til reboot */
+ if (!offline_and_remove_memory(range.start, range_len(&range))) {
+ remove_resource(data->res);
+ kfree(data->res);
+ memory_group_unregister(data->mgid);
+ kfree(data->res_name);
+ kfree(data);
+ return;
+ }
+ pr_err("CXL: %#llx-%#llx cannot be hotremoved until next reboot\n",
+ range.start, range.end);
+}
+
+int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
+{
+ struct cxl_region_params *p = &cxlr->params;
+ struct device *dev = &cxlr->dev;
+ struct cxl_sysram_data *data;
+ struct memory_dev_type *mtype;
+ unsigned long total_len = 0;
+ struct resource *res;
+ struct range range;
+ mhp_t mhp_flags;
+ int numa_node;
+ int adist = MEMTIER_DEFAULT_CXL_ADISTANCE;
+ int rc;
+
+ numa_node = phys_to_target_node(p->res->start);
+ if (numa_node < 0) {
+ dev_warn(dev, "rejecting CXL region with invalid node: %d\n",
+ numa_node);
+ return -EINVAL;
+ }
+
+ rc = cxl_sysram_range(cxlr, &range);
+ if (rc) {
+ dev_info(dev, "range %#llx-%#llx too small after alignment\n",
+ range.start, range.end);
+ return rc;
+ }
+ total_len = range_len(&range);
+
+ if (!total_len) {
+ dev_warn(dev, "rejecting CXL region without any memory after alignment\n");
+ return -EINVAL;
+ }
+
+ mt_calc_adistance(numa_node, &adist);
+ mtype = cxl_find_alloc_memory_type(adist);
+ if (IS_ERR(mtype))
+ return PTR_ERR(mtype);
+
+ init_node_memory_type(numa_node, mtype);
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data) {
+ rc = -ENOMEM;
+ goto err_data;
+ }
+
+ data->res_name = kstrdup(dev_name(dev), GFP_KERNEL);
+ if (!data->res_name) {
+ rc = -ENOMEM;
+ goto err_res_name;
+ }
+
+ rc = memory_group_register_static(numa_node, PFN_UP(total_len));
+ if (rc < 0)
+ goto err_reg_mgid;
+ data->mgid = rc;
+
+ /* Region is permanently reserved if hotremove fails when unbinding. */
+ res = request_mem_region(range.start, range_len(&range),
+ data->res_name);
+ if (!res) {
+ dev_warn(dev, "range %#llx-%#llx could not reserve region\n",
+ range.start, range.end);
+ rc = -EBUSY;
+ goto err_request_mem;
+ }
+ data->res = res;
+
+ /*
+ * Setup flags for System RAM. Leave _BUSY clear so add_memory() can add
+ * a child resource. Do not inherit flags from parent since it may set
+ * flags unknown to us that will break add_memory() below.
+ */
+ res->flags = IORESOURCE_SYSTEM_RAM;
+ mhp_flags = MHP_NID_IS_MGID;
+ rc = add_memory_driver_managed(data->mgid, range.start,
+ range_len(&range), sysram_name, mhp_flags);
+ if (rc) {
+ dev_warn(dev, "range %#llx-%#llx memory add failed\n",
+ range.start, range.end);
+ goto err_add_memory;
+ }
+ dev_dbg(dev, "%s: added %llu bytes as System RAM\n", dev_name(dev),
+ (unsigned long long)total_len);
+
+ dev_set_drvdata(dev, data);
+ rc = devm_device_add_group(dev, &cxl_sysram_region_group);
+ if (rc)
+ goto err_add_group;
+
+ return devm_add_action_or_reset(dev, cxl_sysram_unregister, data);
+
+err_add_group:
+ dev_set_drvdata(dev, NULL);
+ /* if this fails, memory cannot be removed from the system until reboot */
+ remove_memory(range.start, range_len(&range));
+err_add_memory:
+ remove_resource(res);
+ kfree(res);
+err_request_mem:
+ memory_group_unregister(data->mgid);
+err_reg_mgid:
+ kfree(data->res_name);
+err_res_name:
+ kfree(data);
+err_data:
+ clear_node_memory_type(numa_node, mtype);
+ return rc;
+}
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 02d7d9ae0252..eeab091f043a 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -639,6 +639,9 @@ static ssize_t ctrl_show(struct device *dev, struct device_attribute *attr,
case CXL_MEMCTRL_DAX:
desc = "dax";
break;
+ case CXL_MEMCTRL_SYSRAM:
+ desc = "sysram";
+ break;
default:
desc = "";
break;
@@ -663,6 +666,8 @@ static ssize_t ctrl_store(struct device *dev, struct device_attribute *attr,
if (sysfs_streq(buf, "dax"))
cxlr->memctrl = CXL_MEMCTRL_DAX;
+ else if (sysfs_streq(buf, "sysram"))
+ cxlr->memctrl = CXL_MEMCTRL_SYSRAM;
else
return -EINVAL;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b8fabaa77262..bb4f877b4e8f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -506,13 +506,15 @@ enum cxl_partition_mode {
/*
* Memory Controller modes:
* None - No controller selected
- * Auto - either BIOS-configured as SysRAM, or default to DAX
- * DAX - creates a dax_region controller for the cxl_region
+ * Auto - either BIOS-configured as SysRAM, or default to DAX
+ * DAX - creates a dax_region controller for the cxl_region
+ * SYSRAM - hotplugs the region directly as System RAM
*/
enum cxl_memctrl_mode {
CXL_MEMCTRL_NONE,
CXL_MEMCTRL_AUTO,
CXL_MEMCTRL_DAX,
+ CXL_MEMCTRL_SYSRAM,
};
/*
--
2.52.0
* [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
2026-01-12 16:35 ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
2026-01-12 16:35 ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
@ 2026-01-12 16:35 ` Gregory Price
2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 16:35 ` [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options Gregory Price
` (4 subsequent siblings)
7 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 16:35 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
Move the pmem_region logic from region.c into memctrl/pmem_region.c.
Restrict the valid controllers for pmem to the pmem controller.
Simplify the controller selection logic in region probe.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/cxl/core/core.h | 1 +
drivers/cxl/core/memctrl/Makefile | 1 +
drivers/cxl/core/memctrl/memctrl.c | 2 +
drivers/cxl/core/memctrl/pmem_region.c | 191 +++++++++++++++++++++
drivers/cxl/core/region.c | 221 +++----------------------
drivers/cxl/cxl.h | 2 +
6 files changed, 217 insertions(+), 201 deletions(-)
create mode 100644 drivers/cxl/core/memctrl/pmem_region.c
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 18cb84950500..59175890a6ac 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -46,6 +46,7 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa);
int cxl_enable_memctrl(struct cxl_region *cxlr);
int devm_cxl_add_dax_region(struct cxl_region *cxlr);
+int devm_cxl_add_pmem_region(struct cxl_region *cxlr);
#else
static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
index 1c52c7d75570..efffc8ba2c0b 100644
--- a/drivers/cxl/core/memctrl/Makefile
+++ b/drivers/cxl/core/memctrl/Makefile
@@ -3,3 +3,4 @@
cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
cxl_core-$(CONFIG_CXL_REGION) += memctrl/sysram_region.o
+cxl_core-$(CONFIG_CXL_REGION) += memctrl/pmem_region.o
diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
index 40ffb59353bb..1b661465bdeb 100644
--- a/drivers/cxl/core/memctrl/memctrl.c
+++ b/drivers/cxl/core/memctrl/memctrl.c
@@ -36,6 +36,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
return devm_cxl_add_dax_region(cxlr);
case CXL_MEMCTRL_SYSRAM:
return devm_cxl_add_sysram_region(cxlr);
+ case CXL_MEMCTRL_PMEM:
+ return devm_cxl_add_pmem_region(cxlr);
default:
return -EINVAL;
}
diff --git a/drivers/cxl/core/memctrl/pmem_region.c b/drivers/cxl/core/memctrl/pmem_region.c
new file mode 100644
index 000000000000..57668dd82d71
--- /dev/null
+++ b/drivers/cxl/core/memctrl/pmem_region.c
@@ -0,0 +1,191 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <cxlmem.h>
+#include <cxl.h>
+#include "../core.h"
+
+static void cxl_pmem_region_release(struct device *dev)
+{
+ struct cxl_pmem_region *cxlr_pmem = to_cxl_pmem_region(dev);
+ int i;
+
+ for (i = 0; i < cxlr_pmem->nr_mappings; i++) {
+ struct cxl_memdev *cxlmd = cxlr_pmem->mapping[i].cxlmd;
+
+ put_device(&cxlmd->dev);
+ }
+
+ kfree(cxlr_pmem);
+}
+
+static const struct attribute_group *cxl_pmem_region_attribute_groups[] = {
+ &cxl_base_attribute_group,
+ NULL,
+};
+
+const struct device_type cxl_pmem_region_type = {
+ .name = "cxl_pmem_region",
+ .release = cxl_pmem_region_release,
+ .groups = cxl_pmem_region_attribute_groups,
+};
+bool is_cxl_pmem_region(struct device *dev)
+{
+ return dev->type == &cxl_pmem_region_type;
+}
+EXPORT_SYMBOL_NS_GPL(is_cxl_pmem_region, "CXL");
+
+struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
+{
+ if (dev_WARN_ONCE(dev, !is_cxl_pmem_region(dev),
+ "not a cxl_pmem_region device\n"))
+ return NULL;
+ return container_of(dev, struct cxl_pmem_region, dev);
+}
+EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, "CXL");
+static struct lock_class_key cxl_pmem_region_key;
+
+static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
+{
+ struct cxl_region_params *p = &cxlr->params;
+ struct cxl_nvdimm_bridge *cxl_nvb;
+ struct device *dev;
+ int i;
+
+ guard(rwsem_read)(&cxl_rwsem.region);
+ if (p->state != CXL_CONFIG_COMMIT)
+ return -ENXIO;
+
+ struct cxl_pmem_region *cxlr_pmem __free(kfree) =
+ kzalloc(struct_size(cxlr_pmem, mapping, p->nr_targets), GFP_KERNEL);
+ if (!cxlr_pmem)
+ return -ENOMEM;
+
+ cxlr_pmem->hpa_range.start = p->res->start;
+ cxlr_pmem->hpa_range.end = p->res->end;
+
+ /* Snapshot the region configuration underneath the cxl_rwsem.region */
+ cxlr_pmem->nr_mappings = p->nr_targets;
+ for (i = 0; i < p->nr_targets; i++) {
+ struct cxl_endpoint_decoder *cxled = p->targets[i];
+ struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+ struct cxl_pmem_region_mapping *m = &cxlr_pmem->mapping[i];
+
+ /*
+ * Regions never span CXL root devices, so by definition the
+ * bridge for one device is the same for all.
+ */
+ if (i == 0) {
+ cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
+ if (!cxl_nvb)
+ return -ENODEV;
+ cxlr->cxl_nvb = cxl_nvb;
+ }
+ m->cxlmd = cxlmd;
+ get_device(&cxlmd->dev);
+ m->start = cxled->dpa_res->start;
+ m->size = resource_size(cxled->dpa_res);
+ m->position = i;
+ }
+
+ dev = &cxlr_pmem->dev;
+ device_initialize(dev);
+ lockdep_set_class(&dev->mutex, &cxl_pmem_region_key);
+ device_set_pm_not_required(dev);
+ dev->parent = &cxlr->dev;
+ dev->bus = &cxl_bus_type;
+ dev->type = &cxl_pmem_region_type;
+ cxlr_pmem->cxlr = cxlr;
+ cxlr->cxlr_pmem = no_free_ptr(cxlr_pmem);
+
+ return 0;
+}
+
+static void cxlr_pmem_unregister(void *_cxlr_pmem)
+{
+ struct cxl_pmem_region *cxlr_pmem = _cxlr_pmem;
+ struct cxl_region *cxlr = cxlr_pmem->cxlr;
+ struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
+
+ /*
+ * Either the bridge is in ->remove() context under the device_lock(),
+ * or cxlr_release_nvdimm() is cancelling the bridge's release action
+ * for @cxlr_pmem and doing it itself (while manually holding the bridge
+ * lock).
+ */
+ device_lock_assert(&cxl_nvb->dev);
+ cxlr->cxlr_pmem = NULL;
+ cxlr_pmem->cxlr = NULL;
+ device_unregister(&cxlr_pmem->dev);
+}
+
+static void cxlr_release_nvdimm(void *_cxlr)
+{
+ struct cxl_region *cxlr = _cxlr;
+ struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
+
+ scoped_guard(device, &cxl_nvb->dev) {
+ if (cxlr->cxlr_pmem)
+ devm_release_action(&cxl_nvb->dev, cxlr_pmem_unregister,
+ cxlr->cxlr_pmem);
+ }
+ cxlr->cxl_nvb = NULL;
+ put_device(&cxl_nvb->dev);
+}
+
+/**
+ * devm_cxl_add_pmem_region() - add a cxl_region-to-nd_region bridge
+ * @cxlr: parent CXL region for this pmem region bridge device
+ *
+ * Return: 0 on success negative error code on failure.
+ */
+int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
+{
+ struct cxl_pmem_region *cxlr_pmem;
+ struct cxl_nvdimm_bridge *cxl_nvb;
+ struct device *dev;
+ int rc;
+
+ rc = cxl_pmem_region_alloc(cxlr);
+ if (rc)
+ return rc;
+ cxlr_pmem = cxlr->cxlr_pmem;
+ cxl_nvb = cxlr->cxl_nvb;
+
+ dev = &cxlr_pmem->dev;
+ rc = dev_set_name(dev, "pmem_region%d", cxlr->id);
+ if (rc)
+ goto err;
+
+ rc = device_add(dev);
+ if (rc)
+ goto err;
+
+ dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
+ dev_name(dev));
+
+ scoped_guard(device, &cxl_nvb->dev) {
+ if (cxl_nvb->dev.driver)
+ rc = devm_add_action_or_reset(&cxl_nvb->dev,
+ cxlr_pmem_unregister,
+ cxlr_pmem);
+ else
+ rc = -ENXIO;
+ }
+
+ if (rc)
+ goto err_bridge;
+
+ /* @cxlr carries a reference on @cxl_nvb until cxlr_release_nvdimm */
+ return devm_add_action_or_reset(&cxlr->dev, cxlr_release_nvdimm, cxlr);
+
+err:
+ put_device(dev);
+err_bridge:
+ put_device(&cxl_nvb->dev);
+ cxlr->cxl_nvb = NULL;
+ return rc;
+}
+
+
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index eeab091f043a..85c20a09246d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -642,6 +642,9 @@ static ssize_t ctrl_show(struct device *dev, struct device_attribute *attr,
case CXL_MEMCTRL_SYSRAM:
desc = "sysram";
break;
+ case CXL_MEMCTRL_PMEM:
+ desc = "pmem";
+ break;
default:
desc = "";
break;
@@ -661,6 +664,10 @@ static ssize_t ctrl_store(struct device *dev, struct device_attribute *attr,
if ((rc = ACQUIRE_ERR(rwsem_write_kill, &rwsem)))
return rc;
+ /* PMEM only has one controller - the pmem controller */
+ if (cxlr->mode == CXL_PARTMODE_PMEM)
+ return -EBUSY;
+
if (p->state >= CXL_CONFIG_COMMIT)
return -EBUSY;
@@ -2648,7 +2655,11 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
return cxlr;
cxlr->mode = mode;
cxlr->type = type;
- cxlr->memctrl = CXL_MEMCTRL_NONE;
+
+ if (mode == CXL_PARTMODE_PMEM)
+ cxlr->memctrl = CXL_MEMCTRL_PMEM;
+ else
+ cxlr->memctrl = CXL_MEMCTRL_NONE;
dev = &cxlr->dev;
rc = dev_set_name(dev, "region%d", id);
@@ -2797,46 +2808,6 @@ static ssize_t delete_region_store(struct device *dev,
}
DEVICE_ATTR_WO(delete_region);
-static void cxl_pmem_region_release(struct device *dev)
-{
- struct cxl_pmem_region *cxlr_pmem = to_cxl_pmem_region(dev);
- int i;
-
- for (i = 0; i < cxlr_pmem->nr_mappings; i++) {
- struct cxl_memdev *cxlmd = cxlr_pmem->mapping[i].cxlmd;
-
- put_device(&cxlmd->dev);
- }
-
- kfree(cxlr_pmem);
-}
-
-static const struct attribute_group *cxl_pmem_region_attribute_groups[] = {
- &cxl_base_attribute_group,
- NULL,
-};
-
-const struct device_type cxl_pmem_region_type = {
- .name = "cxl_pmem_region",
- .release = cxl_pmem_region_release,
- .groups = cxl_pmem_region_attribute_groups,
-};
-
-bool is_cxl_pmem_region(struct device *dev)
-{
- return dev->type == &cxl_pmem_region_type;
-}
-EXPORT_SYMBOL_NS_GPL(is_cxl_pmem_region, "CXL");
-
-struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
-{
- if (dev_WARN_ONCE(dev, !is_cxl_pmem_region(dev),
- "not a cxl_pmem_region device\n"))
- return NULL;
- return container_of(dev, struct cxl_pmem_region, dev);
-}
-EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, "CXL");
-
struct cxl_poison_context {
struct cxl_port *port;
int part;
@@ -3268,64 +3239,6 @@ static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset,
return -ENXIO;
}
-static struct lock_class_key cxl_pmem_region_key;
-
-static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
-{
- struct cxl_region_params *p = &cxlr->params;
- struct cxl_nvdimm_bridge *cxl_nvb;
- struct device *dev;
- int i;
-
- guard(rwsem_read)(&cxl_rwsem.region);
- if (p->state != CXL_CONFIG_COMMIT)
- return -ENXIO;
-
- struct cxl_pmem_region *cxlr_pmem __free(kfree) =
- kzalloc(struct_size(cxlr_pmem, mapping, p->nr_targets), GFP_KERNEL);
- if (!cxlr_pmem)
- return -ENOMEM;
-
- cxlr_pmem->hpa_range.start = p->res->start;
- cxlr_pmem->hpa_range.end = p->res->end;
-
- /* Snapshot the region configuration underneath the cxl_rwsem.region */
- cxlr_pmem->nr_mappings = p->nr_targets;
- for (i = 0; i < p->nr_targets; i++) {
- struct cxl_endpoint_decoder *cxled = p->targets[i];
- struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
- struct cxl_pmem_region_mapping *m = &cxlr_pmem->mapping[i];
-
- /*
- * Regions never span CXL root devices, so by definition the
- * bridge for one device is the same for all.
- */
- if (i == 0) {
- cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
- if (!cxl_nvb)
- return -ENODEV;
- cxlr->cxl_nvb = cxl_nvb;
- }
- m->cxlmd = cxlmd;
- get_device(&cxlmd->dev);
- m->start = cxled->dpa_res->start;
- m->size = resource_size(cxled->dpa_res);
- m->position = i;
- }
-
- dev = &cxlr_pmem->dev;
- device_initialize(dev);
- lockdep_set_class(&dev->mutex, &cxl_pmem_region_key);
- device_set_pm_not_required(dev);
- dev->parent = &cxlr->dev;
- dev->bus = &cxl_bus_type;
- dev->type = &cxl_pmem_region_type;
- cxlr_pmem->cxlr = cxlr;
- cxlr->cxlr_pmem = no_free_ptr(cxlr_pmem);
-
- return 0;
-}
-
static void cxl_dax_region_release(struct device *dev)
{
struct cxl_dax_region *cxlr_dax = to_cxl_dax_region(dev);
@@ -3358,92 +3271,6 @@ struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
}
EXPORT_SYMBOL_NS_GPL(to_cxl_dax_region, "CXL");
-static void cxlr_pmem_unregister(void *_cxlr_pmem)
-{
- struct cxl_pmem_region *cxlr_pmem = _cxlr_pmem;
- struct cxl_region *cxlr = cxlr_pmem->cxlr;
- struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
-
- /*
- * Either the bridge is in ->remove() context under the device_lock(),
- * or cxlr_release_nvdimm() is cancelling the bridge's release action
- * for @cxlr_pmem and doing it itself (while manually holding the bridge
- * lock).
- */
- device_lock_assert(&cxl_nvb->dev);
- cxlr->cxlr_pmem = NULL;
- cxlr_pmem->cxlr = NULL;
- device_unregister(&cxlr_pmem->dev);
-}
-
-static void cxlr_release_nvdimm(void *_cxlr)
-{
- struct cxl_region *cxlr = _cxlr;
- struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
-
- scoped_guard(device, &cxl_nvb->dev) {
- if (cxlr->cxlr_pmem)
- devm_release_action(&cxl_nvb->dev, cxlr_pmem_unregister,
- cxlr->cxlr_pmem);
- }
- cxlr->cxl_nvb = NULL;
- put_device(&cxl_nvb->dev);
-}
-
-/**
- * devm_cxl_add_pmem_region() - add a cxl_region-to-nd_region bridge
- * @cxlr: parent CXL region for this pmem region bridge device
- *
- * Return: 0 on success negative error code on failure.
- */
-static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
-{
- struct cxl_pmem_region *cxlr_pmem;
- struct cxl_nvdimm_bridge *cxl_nvb;
- struct device *dev;
- int rc;
-
- rc = cxl_pmem_region_alloc(cxlr);
- if (rc)
- return rc;
- cxlr_pmem = cxlr->cxlr_pmem;
- cxl_nvb = cxlr->cxl_nvb;
-
- dev = &cxlr_pmem->dev;
- rc = dev_set_name(dev, "pmem_region%d", cxlr->id);
- if (rc)
- goto err;
-
- rc = device_add(dev);
- if (rc)
- goto err;
-
- dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
- dev_name(dev));
-
- scoped_guard(device, &cxl_nvb->dev) {
- if (cxl_nvb->dev.driver)
- rc = devm_add_action_or_reset(&cxl_nvb->dev,
- cxlr_pmem_unregister,
- cxlr_pmem);
- else
- rc = -ENXIO;
- }
-
- if (rc)
- goto err_bridge;
-
- /* @cxlr carries a reference on @cxl_nvb until cxlr_release_nvdimm */
- return devm_add_action_or_reset(&cxlr->dev, cxlr_release_nvdimm, cxlr);
-
-err:
- put_device(dev);
-err_bridge:
- put_device(&cxl_nvb->dev);
- cxlr->cxl_nvb = NULL;
- return rc;
-}
-
static int match_decoder_by_range(struct device *dev, const void *data)
{
const struct range *r1, *r2 = data;
@@ -3929,26 +3756,18 @@ static int cxl_region_probe(struct device *dev)
return rc;
}
- switch (cxlr->mode) {
- case CXL_PARTMODE_PMEM:
- rc = devm_cxl_region_edac_register(cxlr);
- if (rc)
- dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
- cxlr->id);
-
- return devm_cxl_add_pmem_region(cxlr);
- case CXL_PARTMODE_RAM:
- rc = devm_cxl_region_edac_register(cxlr);
- if (rc)
- dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
- cxlr->id);
-
- return cxl_enable_memctrl(cxlr);
- default:
+ if (cxlr->mode > CXL_PARTMODE_PMEM) {
dev_dbg(&cxlr->dev, "unsupported region mode: %d\n",
cxlr->mode);
return -ENXIO;
}
+
+ rc = devm_cxl_region_edac_register(cxlr);
+ if (rc)
+ dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
+ cxlr->id);
+
+ return cxl_enable_memctrl(cxlr);
}
static struct cxl_driver cxl_region_driver = {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index bb4f877b4e8f..c69d27a2e97d 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -509,12 +509,14 @@ enum cxl_partition_mode {
* Auto - either BIOS-configured as SysRAM, or default to DAX
* DAX - creates a dax_region controller for the cxl_region
* SYSRAM - hotplugs the region directly as System RAM
+ * PMEM - persistent memory controller (nvdimm)
*/
enum cxl_memctrl_mode {
CXL_MEMCTRL_NONE,
CXL_MEMCTRL_AUTO,
CXL_MEMCTRL_DAX,
CXL_MEMCTRL_SYSRAM,
+ CXL_MEMCTRL_PMEM,
};
/*
--
2.52.0
* [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
` (2 preceding siblings ...)
2026-01-12 16:35 ` [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region Gregory Price
@ 2026-01-12 16:35 ` Gregory Price
2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 16:35 ` [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options Gregory Price
` (3 subsequent siblings)
7 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 16:35 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
Provide a build-time switch that lets the 'auto' control mode default to
the sysram controller instead of DAX. DAX remains the recommended default
in case multiple devices are added to the system, but the switch gives
simpler systems that are already configured with auto-regions a path to
the sysram controller.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
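Note: the resulting decision order for 'auto' can be sketched in userspace as
below. This is an illustrative model only: resolve_auto() and the auto_result
enum are hypothetical names, and the two booleans stand in for the
walk_iomem_res_desc() System RAM check and for
IS_ENABLED(CONFIG_CXL_REGION_CTRL_AUTO_SYSRAM) respectively.

```c
#include <assert.h>	/* for the assert()-based checks in the note below */
#include <stdbool.h>

/* Hypothetical outcomes of resolving the 'auto' control mode. */
enum auto_result {
	AUTO_ALREADY_SYSRAM,	/* BIOS already mapped the range; nothing to add */
	AUTO_USE_SYSRAM,	/* hotplug directly via the sysram controller */
	AUTO_USE_DAX,		/* create a dax_region controller */
};

/*
 * Models the CXL_MEMCTRL_AUTO case of cxl_enable_memctrl() after this patch:
 * the BIOS check wins, then the build-time choice, then DAX.
 */
static enum auto_result resolve_auto(bool bios_mapped_as_sysram,
				     bool cfg_auto_sysram)
{
	if (bios_mapped_as_sysram)
		return AUTO_ALREADY_SYSRAM;
	if (cfg_auto_sysram)
		return AUTO_USE_SYSRAM;
	return AUTO_USE_DAX;
}
```

The build option only affects regions that the BIOS has not already exposed
as System RAM, e.g. assert(resolve_auto(true, true) == AUTO_ALREADY_SYSRAM).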
drivers/cxl/Kconfig | 32 ++++++++++++++++++++++++++++++
drivers/cxl/core/memctrl/memctrl.c | 2 ++
drivers/cxl/cxl.h | 2 +-
3 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 48b7314afdb8..5aed1524f8f1 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -211,6 +211,38 @@ config CXL_REGION
If unsure say 'y'
+choice
+ prompt "CXL Region Auto Control Mode"
+ depends on CXL_REGION
+ default CXL_REGION_CTRL_AUTO_DAX
+ help
+ Select the default controller for CXL regions when ctrl mode is
+ set to 'auto'. This determines how CXL memory regions are exposed
+ to the system when no explicit control mode is specified.
+
+config CXL_REGION_CTRL_AUTO_DAX
+ bool "DAX"
+ help
+ When a CXL region's control mode is 'auto', create a DAX region
+ controller. This allows fine-grained control over the memory region
+ through the DAX subsystem, and the region can later be converted to
+ System RAM via daxctl.
+
+ This is the default and recommended option for most use cases.
+
+config CXL_REGION_CTRL_AUTO_SYSRAM
+ bool "System RAM"
+ help
+ When a CXL region's control mode is 'auto', hotplug the region
+ directly as System RAM. This makes the CXL memory immediately
+ available to the kernel's memory allocator without requiring
+ additional userspace configuration.
+
+ Select this if you want CXL memory to be automatically available
+ as regular system memory.
+
+endchoice
+
config CXL_REGION_INVALIDATION_TEST
bool "CXL: Region Cache Management Bypass (TEST)"
depends on CXL_REGION
diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
index 1b661465bdeb..cb6c37f4c0ee 100644
--- a/drivers/cxl/core/memctrl/memctrl.c
+++ b/drivers/cxl/core/memctrl/memctrl.c
@@ -31,6 +31,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
p->res->start, p->res->end, cxlr,
is_system_ram) > 0)
return 0;
+ if (IS_ENABLED(CONFIG_CXL_REGION_CTRL_AUTO_SYSRAM))
+ return devm_cxl_add_sysram_region(cxlr);
return devm_cxl_add_dax_region(cxlr);
case CXL_MEMCTRL_DAX:
return devm_cxl_add_dax_region(cxlr);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index c69d27a2e97d..1dae6fe4f70c 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -506,7 +506,7 @@ enum cxl_partition_mode {
/*
* Memory Controller modes:
* None - No controller selected
- * Auto - either BIOS-configured as SysRAM, or default to DAX
+ * Auto - Auto-select based on BIOS, boot, and build configs.
* DAX - creates a dax_region controller for the cxl_region
* SYSRAM - hotplugs the region directly as System RAM
* PMEM - persistent memory controller (nvdimm)
--
2.52.0
* [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
` (3 preceding siblings ...)
2026-01-12 16:35 ` [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options Gregory Price
@ 2026-01-12 16:35 ` Gregory Price
2026-01-12 21:11 ` Cheatham, Benjamin
2026-01-12 16:35 ` [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only Gregory Price
` (2 subsequent siblings)
7 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 16:35 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
Add build options that select the default state of memory blocks created
by the sysram controller:
DEFAULT_OFFLINE: Blocks will be offline after being created.
DEFAULT_ONLINE: Blocks will be onlined in ZONE_MOVABLE.
DEFAULT_ONLINE_NORMAL: Blocks will be onlined in ZONE_NORMAL.
This prevents users from having to use the MHP auto-online build config,
which may cause misbehavior with other devices that hotplug memory.
Signed-off-by: Gregory Price <gourry@gourry.net>
---
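Note: the mapping from the build option to the online type used by
cxl_sysram_auto_online() can be sketched as follows. The enum and function
names here are hypothetical userspace stand-ins; the kernel code uses
IS_ENABLED() on the three Kconfig symbols and the MMOP_* constants instead.

```c
#include <assert.h>	/* for the assert()-based checks in the note below */
#include <stdbool.h>

/* Hypothetical stand-ins for the three Kconfig choices. */
enum sysram_default {
	DEFAULT_OFFLINE,
	DEFAULT_ONLINE,
	DEFAULT_ONLINE_NORMAL,
};

/* Hypothetical stand-ins for the MMOP_* online types. */
enum online_type {
	TYPE_OFFLINE,
	TYPE_ONLINE_KERNEL,
	TYPE_ONLINE_MOVABLE,
};

/*
 * Returns false when the blocks should be left offline. Otherwise stores the
 * zone preference in *type and returns true. As in the patch, anything that
 * is not explicitly "normal" falls back to movable.
 */
static bool auto_online_type(enum sysram_default cfg, enum online_type *type)
{
	if (cfg == DEFAULT_OFFLINE)
		return false;

	if (cfg == DEFAULT_ONLINE)
		*type = TYPE_ONLINE_MOVABLE;
	else if (cfg == DEFAULT_ONLINE_NORMAL)
		*type = TYPE_ONLINE_KERNEL;
	else
		*type = TYPE_ONLINE_MOVABLE;
	return true;
}
```

With DEFAULT_OFFLINE the helper returns false and leaves *type untouched,
which corresponds to the early "return 0" in the patch.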
drivers/cxl/Kconfig | 40 ++++++++++
drivers/cxl/core/memctrl/sysram_region.c | 94 ++++++++++++++++++------
2 files changed, 110 insertions(+), 24 deletions(-)
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 5aed1524f8f1..3e087c9d5ea7 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -243,6 +243,46 @@ config CXL_REGION_CTRL_AUTO_SYSRAM
endchoice
+choice
+ prompt "CXL SYSRAM Auto Online Mode"
+ depends on CXL_REGION
+ default CXL_REGION_SYSRAM_DEFAULT_OFFLINE
+ help
+ Select whether CXL memory hotplugged as System RAM should be
+ automatically onlined and in which zone. This applies when the
+ region controller is set to SYSRAM (either explicitly or via
+ the auto control mode).
+
+config CXL_REGION_SYSRAM_DEFAULT_OFFLINE
+ bool "Offline"
+ help
+ Leave the memory offline after hotplug. The memory must be
+ manually onlined via sysfs or other mechanisms before it can
+ be used by the system.
+
+ This is the default and most conservative option.
+
+config CXL_REGION_SYSRAM_DEFAULT_ONLINE
+ bool "Online (Movable)"
+ help
+ Automatically online the memory as ZONE_MOVABLE after hotplug.
+ ZONE_MOVABLE memory can be used for user pages and is eligible
+ for memory hotremove, but cannot be used for kernel allocations.
+
+ Select this for memory that may need to be hotremoved later.
+
+config CXL_REGION_SYSRAM_DEFAULT_ONLINE_NORMAL
+ bool "Online (Normal)"
+ help
+ Automatically online the memory as ZONE_NORMAL after hotplug.
+ ZONE_NORMAL memory can be used for all allocations including
+ kernel allocations, but may not be hotremovable.
+
+ Select this for maximum memory utilization when hotremove is
+ not required.
+
+endchoice
+
config CXL_REGION_INVALIDATION_TEST
bool "CXL: Region Cache Management Bypass (TEST)"
depends on CXL_REGION
diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
index a7570c8a54e1..2e2d9b59a725 100644
--- a/drivers/cxl/core/memctrl/sysram_region.c
+++ b/drivers/cxl/core/memctrl/sysram_region.c
@@ -129,12 +129,69 @@ static int offline_memory_block_cb(struct memory_block *mem, void *arg)
return *rc;
}
+static int cxl_sysram_online_memory(struct range *range, int online_type)
+{
+ struct online_memory_cb_arg cb_arg = {
+ .online_type = online_type,
+ .rc = 0,
+ };
+ int rc;
+
+ rc = walk_memory_blocks(range->start, range_len(range),
+ &cb_arg, online_memory_block_cb);
+ if (!rc)
+ rc = cb_arg.rc;
+
+ return rc;
+}
+
+static int cxl_sysram_offline_memory(struct range *range)
+{
+ int offline_rc = 0;
+ int rc;
+
+ rc = walk_memory_blocks(range->start, range_len(range),
+ &offline_rc, offline_memory_block_cb);
+ if (!rc)
+ rc = offline_rc;
+
+ return rc;
+}
+
+static int cxl_sysram_auto_online(struct device *dev, struct range *range)
+{
+ int online_type;
+ int rc;
+
+ if (IS_ENABLED(CONFIG_CXL_REGION_SYSRAM_DEFAULT_OFFLINE))
+ return 0;
+
+ if (IS_ENABLED(CONFIG_CXL_REGION_SYSRAM_DEFAULT_ONLINE))
+ online_type = MMOP_ONLINE_MOVABLE;
+ else if (IS_ENABLED(CONFIG_CXL_REGION_SYSRAM_DEFAULT_ONLINE_NORMAL))
+ online_type = MMOP_ONLINE_KERNEL;
+ else
+ online_type = MMOP_ONLINE_MOVABLE;
+
+ rc = lock_device_hotplug_sysfs();
+ if (rc)
+ return rc;
+
+ rc = cxl_sysram_online_memory(range, online_type);
+
+ unlock_device_hotplug();
+
+ if (rc)
+ dev_warn(dev, "auto-online failed: %d\n", rc);
+
+ return rc;
+}
+
static ssize_t state_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t len)
{
struct cxl_region *cxlr = to_cxl_region(dev);
- struct online_memory_cb_arg cb_arg;
struct range range;
int rc;
@@ -149,30 +206,14 @@ static ssize_t state_store(struct device *dev,
if (rc)
return rc;
- if (sysfs_streq(buf, "online")) {
- cb_arg.online_type = MMOP_ONLINE_MOVABLE;
- cb_arg.rc = 0;
- rc = walk_memory_blocks(range.start, range_len(&range),
- &cb_arg, online_memory_block_cb);
- if (!rc)
- rc = cb_arg.rc;
- } else if (sysfs_streq(buf, "online_normal")) {
- cb_arg.online_type = MMOP_ONLINE;
- cb_arg.rc = 0;
- rc = walk_memory_blocks(range.start, range_len(&range),
- &cb_arg, online_memory_block_cb);
- if (!rc)
- rc = cb_arg.rc;
- } else if (sysfs_streq(buf, "offline")) {
- int offline_rc = 0;
-
- rc = walk_memory_blocks(range.start, range_len(&range),
- &offline_rc, offline_memory_block_cb);
- if (!rc)
- rc = offline_rc;
- } else {
+ if (sysfs_streq(buf, "online"))
+ rc = cxl_sysram_online_memory(&range, MMOP_ONLINE_MOVABLE);
+ else if (sysfs_streq(buf, "online_normal"))
+ rc = cxl_sysram_online_memory(&range, MMOP_ONLINE);
+ else if (sysfs_streq(buf, "offline"))
+ rc = cxl_sysram_offline_memory(&range);
+ else
rc = -EINVAL;
- }
unlock_device_hotplug();
@@ -332,6 +373,10 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
dev_dbg(dev, "%s: added %llu bytes as System RAM\n", dev_name(dev),
(unsigned long long)total_len);
+ rc = cxl_sysram_auto_online(dev, &range);
+ if (rc)
+ goto err_auto_online;
+
dev_set_drvdata(dev, data);
rc = devm_device_add_group(dev, &cxl_sysram_region_group);
if (rc)
@@ -341,6 +386,7 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
err_add_group:
dev_set_drvdata(dev, NULL);
+err_auto_online:
/* if this fails, memory cannot be removed from the system until reboot */
remove_memory(range.start, range_len(&range));
err_add_memory:
--
2.52.0
* [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
` (4 preceding siblings ...)
2026-01-12 16:35 ` [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options Gregory Price
@ 2026-01-12 16:35 ` Gregory Price
2026-01-12 21:11 ` Cheatham, Benjamin
2026-01-13 9:37 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Neeraj Kumar
2026-01-15 18:43 ` Alejandro Lucero Palau
7 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 16:35 UTC (permalink / raw)
To: linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams,
David Hildenbrand, Hannes Reinecke
If state is set to online (which defaults to ZONE_MOVABLE), the user
intends for this memory to refuse non-movable allocations, to remain
hot-unpluggable, or both. However, any admin can write `offline` and
then `online` to the memory block controller and bring that memory
online in ZONE_NORMAL.
Register a memory notifier callback that disallows onlining the block
into ZONE_NORMAL if the default state of the controller is ZONE_MOVABLE.
If an actor attempts to online the block into ZONE_NORMAL, the attempt
fails. If it attempts to online into either NORMAL or MOVABLE, only
MOVABLE is allowed and the attempt succeeds.
Suggested-by: David Hildenbrand <david@kernel.org>
Suggested-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/linux-mm/39533aa8-ca78-41a8-b005-9202ce53e3ae@kernel.org/
Signed-off-by: Gregory Price <gourry@gourry.net>
---
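Note: the decision made by the notifier callback can be modelled in userspace
roughly as below. This is a sketch under stated assumptions: all names
(range_model, zone_check, the notify_result and online_pref enums) are
hypothetical, the zone of the block is passed in as a boolean instead of being
read from the first page, and the results correspond to NOTIFY_DONE, NOTIFY_OK
and NOTIFY_BAD.

```c
#include <assert.h>	/* for the assert()-based checks in the note below */
#include <stdbool.h>
#include <stdint.h>

enum notify_result { RESULT_DONE, RESULT_OK, RESULT_BAD };
enum online_pref { PREF_NONE, PREF_KERNEL, PREF_MOVABLE };

struct range_model {
	uint64_t start;
	uint64_t end;	/* inclusive, like struct range */
};

/* True if [start, start + size) intersects the inclusive region range. */
static bool block_overlaps(const struct range_model *r,
			   uint64_t start, uint64_t size)
{
	return start + size > r->start && start <= r->end;
}

/*
 * Models the checks in the memory notifier: only going-online events for
 * blocks inside the region are inspected, and only when the last recorded
 * preference was ZONE_MOVABLE.
 */
static enum notify_result zone_check(const struct range_model *region,
				     enum online_pref pref, bool going_online,
				     uint64_t blk_start, uint64_t blk_size,
				     bool zone_is_movable)
{
	if (!going_online)
		return RESULT_DONE;
	if (!block_overlaps(region, blk_start, blk_size))
		return RESULT_DONE;
	if (pref != PREF_MOVABLE)
		return RESULT_DONE;

	return zone_is_movable ? RESULT_OK : RESULT_BAD;
}
```

Only the combination "block inside the region, preference is movable, zone is
not movable" is rejected; every other event is either ignored or allowed.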
drivers/cxl/core/memctrl/sysram_region.c | 138 +++++++++++++++++++++--
1 file changed, 127 insertions(+), 11 deletions(-)
diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
index 2e2d9b59a725..71e39d725dc5 100644
--- a/drivers/cxl/core/memctrl/sysram_region.c
+++ b/drivers/cxl/core/memctrl/sysram_region.c
@@ -2,6 +2,7 @@
/* Copyright(c) 2026 Meta Inc. All rights reserved. */
#include <linux/memremap.h>
#include <linux/memory.h>
+#include <linux/mmzone.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/slab.h>
@@ -23,6 +24,14 @@ struct cxl_sysram_data {
const char *res_name;
int mgid;
struct resource *res;
+ struct range range;
+ struct notifier_block memory_notifier;
+ /*
+ * Last online type requested by user via state sysfs or auto-online.
+ * Used to enforce zone consistency when memory blocks are onlined.
+ * MMOP_OFFLINE means no online preference has been set yet.
+ */
+ int last_online_type;
};
static DEFINE_MUTEX(cxl_memory_type_lock);
@@ -158,7 +167,58 @@ static int cxl_sysram_offline_memory(struct range *range)
return rc;
}
-static int cxl_sysram_auto_online(struct device *dev, struct range *range)
+/*
+ * Memory notifier callback to enforce zone consistency.
+ *
+ * When the user (or auto-online) requests memory to be onlined into
+ * ZONE_MOVABLE, reject any subsequent attempts to online memory blocks
+ * from this region into a different zone (e.g., ZONE_NORMAL). This prevents
+ * accidental zone mixing which could lead to memory fragmentation and
+ * offlining failures.
+ */
+static int cxl_sysram_memory_notify_cb(struct notifier_block *nb,
+ unsigned long action, void *arg)
+{
+ struct cxl_sysram_data *data = container_of(nb, struct cxl_sysram_data,
+ memory_notifier);
+ struct memory_notify *mhp = arg;
+ unsigned long start_phys = PFN_PHYS(mhp->start_pfn);
+ unsigned long size = PFN_PHYS(mhp->nr_pages);
+ struct page *page;
+
+ if (action != MEM_GOING_ONLINE)
+ return NOTIFY_DONE;
+
+ /* Check if this memory block overlaps with our region */
+ if (start_phys + size <= data->range.start ||
+ start_phys > data->range.end)
+ return NOTIFY_DONE;
+
+ /*
+ * If no online preference has been set (MMOP_OFFLINE), allow any zone.
+ * Also allow if the preference wasn't for ZONE_MOVABLE.
+ */
+ if (data->last_online_type != MMOP_ONLINE_MOVABLE)
+ return NOTIFY_DONE;
+
+ /*
+ * The zone has already been assigned to the pages at this point
+ * via move_pfn_range_to_zone() before MEM_GOING_ONLINE is sent.
+ * Check if it's ZONE_MOVABLE as expected.
+ */
+ page = pfn_to_page(mhp->start_pfn);
+
+ if (!is_zone_movable_page(page)) {
+ pr_warn("CXL sysram: rejecting online to non-movable zone for range %#lx-%#lx (expected ZONE_MOVABLE)\n",
+ start_phys, start_phys + size - 1);
+ return NOTIFY_BAD;
+ }
+
+ return NOTIFY_OK;
+}
+
+static int cxl_sysram_auto_online(struct device *dev, struct range *range,
+ struct cxl_sysram_data *data)
{
int online_type;
int rc;
@@ -173,6 +233,9 @@ static int cxl_sysram_auto_online(struct device *dev, struct range *range)
else
online_type = MMOP_ONLINE_MOVABLE;
+ /* Record the auto-online type for zone enforcement */
+ data->last_online_type = online_type;
+
rc = lock_device_hotplug_sysfs();
if (rc)
return rc;
@@ -187,17 +250,43 @@ static int cxl_sysram_auto_online(struct device *dev, struct range *range)
return rc;
}
+static ssize_t state_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct cxl_sysram_data *data;
+
+ data = dev_get_drvdata(dev);
+ if (!data)
+ return -ENODEV;
+
+ switch (data->last_online_type) {
+ case MMOP_ONLINE_MOVABLE:
+ return sysfs_emit(buf, "online\n");
+ case MMOP_ONLINE_KERNEL:
+ return sysfs_emit(buf, "online_normal\n");
+ case MMOP_OFFLINE:
+ default:
+ return sysfs_emit(buf, "offline\n");
+ }
+}
+
static ssize_t state_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t len)
{
struct cxl_region *cxlr = to_cxl_region(dev);
+ struct cxl_sysram_data *data;
struct range range;
+ int online_type = MMOP_OFFLINE;
int rc;
if (!cxlr)
return -ENODEV;
+ data = dev_get_drvdata(dev);
+ if (!data)
+ return -ENODEV;
+
rc = cxl_sysram_range(cxlr, &range);
if (rc)
return rc;
@@ -206,23 +295,30 @@ static ssize_t state_store(struct device *dev,
if (rc)
return rc;
- if (sysfs_streq(buf, "online"))
- rc = cxl_sysram_online_memory(&range, MMOP_ONLINE_MOVABLE);
- else if (sysfs_streq(buf, "online_normal"))
- rc = cxl_sysram_online_memory(&range, MMOP_ONLINE);
- else if (sysfs_streq(buf, "offline"))
+ if (sysfs_streq(buf, "online")) {
+ online_type = MMOP_ONLINE_MOVABLE;
+ rc = cxl_sysram_online_memory(&range, online_type);
+ } else if (sysfs_streq(buf, "online_normal")) {
+ online_type = MMOP_ONLINE;
+ rc = cxl_sysram_online_memory(&range, online_type);
+ } else if (sysfs_streq(buf, "offline")) {
rc = cxl_sysram_offline_memory(&range);
- else
+ } else {
rc = -EINVAL;
+ }
unlock_device_hotplug();
if (rc)
return rc;
+ /* Record the online type for zone enforcement on success */
+ if (online_type != MMOP_OFFLINE)
+ data->last_online_type = online_type;
+
return len;
}
-static DEVICE_ATTR_WO(state);
+static DEVICE_ATTR_RW(state);
static ssize_t hotplug_store(struct device *dev,
struct device_attribute *attr,
@@ -274,6 +370,11 @@ static void cxl_sysram_unregister(void *_data)
.end = data->res->end
};
+ unregister_memory_notifier(&data->memory_notifier);
+
+ range.start = data->res->start;
+ range.end = data->res->end;
+
/* We have one shot for removal, otherwise it's stuck til reboot */
if (!offline_and_remove_memory(range.start, range_len(&range))) {
remove_resource(data->res);
@@ -334,6 +435,10 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
goto err_data;
}
+ /* Initialize range and online type tracking */
+ data->range = range;
+ data->last_online_type = MMOP_OFFLINE;
+
data->res_name = kstrdup(dev_name(dev), GFP_KERNEL);
if (!data->res_name) {
rc = -ENOMEM;
@@ -373,11 +478,20 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
dev_dbg(dev, "%s: added %llu bytes as System RAM\n", dev_name(dev),
(unsigned long long)total_len);
- rc = cxl_sysram_auto_online(dev, &range);
+ /* Set drvdata early so auto_online can access it */
+ dev_set_drvdata(dev, data);
+
+ /* Register memory notifier for zone enforcement */
+ data->memory_notifier.notifier_call = cxl_sysram_memory_notify_cb;
+ data->memory_notifier.priority = CXL_CALLBACK_PRI;
+ rc = register_memory_notifier(&data->memory_notifier);
+ if (rc)
+ goto err_notifier;
+
+ rc = cxl_sysram_auto_online(dev, &range, data);
if (rc)
goto err_auto_online;
- dev_set_drvdata(dev, data);
rc = devm_device_add_group(dev, &cxl_sysram_region_group);
if (rc)
goto err_add_group;
@@ -385,9 +499,11 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
return devm_add_action_or_reset(dev, cxl_sysram_unregister, data);
err_add_group:
- dev_set_drvdata(dev, NULL);
err_auto_online:
/* if this fails, memory cannot be removed from the system until reboot */
+ unregister_memory_notifier(&data->memory_notifier);
+err_notifier:
+ dev_set_drvdata(dev, NULL);
remove_memory(range.start, range_len(&range));
err_add_memory:
remove_resource(res);
--
2.52.0
* Re: [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 16:35 ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
@ 2026-01-12 20:00 ` David Hildenbrand (Red Hat)
2026-01-12 22:43 ` Gregory Price
2026-01-12 21:10 ` dan.j.williams
2026-01-12 21:10 ` Cheatham, Benjamin
2 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand (Red Hat) @ 2026-01-12 20:00 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On 1/12/26 17:35, Gregory Price wrote:
> Add a sysram memctrl that directly hotplugs memory without needing to
> route through DAX. This simplifies the sysram usecase considerably.
>
> The sysram memctl adds new sysfs controls when registered:
> region/memctrl/[hotplug, hotunplug, state]
>
> hotplug: controller attempts to hotplug the memory region
Why disconnect the hotplug from the online state?
echo online_movable > hotplug ?
Then we can just have something like add_and_online_memory() in the core.
> hotunplug: controller attempts to offline and hotunplug the memory region
> state: [online,online_normal,offline]
> online : controller onlines blocks in ZONE_MOVABLE
I don't like this inconsistency with the rest of the common hotplug
toggles.
We should use exactly the same values with exactly the same semantics.
Yes, user-space tooling should be taught to pass in online_movable :)
> online_normal: controller onlines blocks in ZONE_NORMAL
> offline : controller attempts to offline the memory blocks
Why is that required? Ideally we'd start with hotplug vs. hotunplug and
leave manual onlining/offlining out of this interface for now.
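To illustrate the point about reusing the common toggle names, here is a
minimal userspace sketch of a parser that accepts only the strings the
memory block "state" files already use. The sk_* names and the enum are
hypothetical stand-ins; the values are assumed to mirror the kernel's
MMOP_* online types, and the comparison only approximates sysfs_streq().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the kernel's MMOP_* online types (values are an assumption). */
enum sk_mmop {
	SK_MMOP_OFFLINE = 0,
	SK_MMOP_ONLINE,
	SK_MMOP_ONLINE_KERNEL,
	SK_MMOP_ONLINE_MOVABLE,
};

/* The same strings the common hotplug toggles accept. */
static const char *const sk_online_type_str[] = {
	[SK_MMOP_OFFLINE]        = "offline",
	[SK_MMOP_ONLINE]         = "online",
	[SK_MMOP_ONLINE_KERNEL]  = "online_kernel",
	[SK_MMOP_ONLINE_MOVABLE] = "online_movable",
};

/* Rough sysfs_streq() equivalent: one trailing newline in buf is ignored. */
static int sk_streq(const char *buf, const char *s)
{
	size_t n = strlen(s);

	if (strncmp(buf, s, n) != 0)
		return 0;
	return buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0');
}

/* Return the matching online type, or -1 if the string is not recognized. */
static int sk_online_type_from_str(const char *buf)
{
	size_t nr = sizeof(sk_online_type_str) / sizeof(sk_online_type_str[0]);

	assert(buf != NULL);
	for (size_t i = 0; i < nr; i++)
		if (sk_streq(buf, sk_online_type_str[i]))
			return (int)i;
	return -1;
}
```

With a table like this, "echo online_movable > hotplug" and the existing
per-block toggles would share one vocabulary, and a spelling such as
"online_normal" would simply be rejected.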
>
> Hotplug note - by default the controller will hotplug the blocks, but
> leave them offline (unless MHP auto-online in Kconfig is enabled).
>
> Setting state to "online_normal" may prevent future hot-unplug of sysram
> regions, and unbinding a memory region with memory online in ZONE_NORMAL
> may result in the device being removed but the memory remaining online.
>
> This can result in future management functions failing (such as adding a
> new region). This is why "online_normal" is explicit, and the default
> online zone is ZONE_MOVABLE.
>
> Cc: David Hildenbrand <david@kernel.org>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/core.h | 2 +
> drivers/cxl/core/memctrl/Makefile | 1 +
> drivers/cxl/core/memctrl/memctrl.c | 2 +
> drivers/cxl/core/memctrl/sysram_region.c | 358 +++++++++++++++++++++++
> drivers/cxl/core/region.c | 5 +
> drivers/cxl/cxl.h | 6 +-
> 6 files changed, 372 insertions(+), 2 deletions(-)
> create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
>
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 1156a4bd0080..18cb84950500 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -31,6 +31,8 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
> struct cxl_endpoint_decoder *cxled, int pos,
> enum cxl_detach_mode mode);
>
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr);
> +
> #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
> #define CXL_REGION_TYPE(x) (&cxl_region_type)
> #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
> diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
> index 8165aad5a52a..1c52c7d75570 100644
> --- a/drivers/cxl/core/memctrl/Makefile
> +++ b/drivers/cxl/core/memctrl/Makefile
> @@ -2,3 +2,4 @@
>
> cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
> cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/sysram_region.o
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> index 24e0e14b39c7..40ffb59353bb 100644
> --- a/drivers/cxl/core/memctrl/memctrl.c
> +++ b/drivers/cxl/core/memctrl/memctrl.c
> @@ -34,6 +34,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
> return devm_cxl_add_dax_region(cxlr);
> case CXL_MEMCTRL_DAX:
> return devm_cxl_add_dax_region(cxlr);
> + case CXL_MEMCTRL_SYSRAM:
> + return devm_cxl_add_sysram_region(cxlr);
> default:
> return -EINVAL;
> }
> diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
> new file mode 100644
> index 000000000000..a7570c8a54e1
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/sysram_region.c
> @@ -0,0 +1,358 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2026 Meta Inc. All rights reserved. */
> +#include <linux/memremap.h>
> +#include <linux/memory.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <linux/mm.h>
> +#include <linux/memory-tiers.h>
> +#include <linux/memory_hotplug.h>
> +#include <linux/string_helpers.h>
> +#include <linux/sched/signal.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +/* If HMAT was unavailable, assign a default distance. */
> +#define MEMTIER_DEFAULT_CXL_ADISTANCE (MEMTIER_ADISTANCE_DRAM * 5)
> +
> +static const char *sysram_name = "System RAM (CXL)";
> +
> +struct cxl_sysram_data {
> + const char *res_name;
> + int mgid;
> + struct resource *res;
> +};
> +
> +static DEFINE_MUTEX(cxl_memory_type_lock);
> +static LIST_HEAD(cxl_memory_types);
> +
> +static struct cxl_region *to_cxl_region(struct device *dev)
> +{
> + if (dev->type != &cxl_region_type)
> + return NULL;
> + return container_of(dev, struct cxl_region, dev);
> +}
> +
> +static struct memory_dev_type *cxl_find_alloc_memory_type(int adist)
> +{
> + guard(mutex)(&cxl_memory_type_lock);
> + return mt_find_alloc_memory_type(adist, &cxl_memory_types);
> +}
> +
> +static void __maybe_unused cxl_put_memory_types(void)
> +{
> + guard(mutex)(&cxl_memory_type_lock);
> + mt_put_memory_types(&cxl_memory_types);
> +}
> +
> +static int cxl_sysram_range(struct cxl_region *cxlr, struct range *r)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> +
> + if (!p->res)
> + return -ENODEV;
> +
> + /* memory-block align the hotplug range */
> + r->start = ALIGN(p->res->start, memory_block_size_bytes());
> + r->end = ALIGN_DOWN(p->res->end + 1, memory_block_size_bytes()) - 1;
> + if (r->start >= r->end) {
> + r->start = p->res->start;
> + r->end = p->res->end;
> + return -ENOSPC;
> + }
> + return 0;
> +}
> +
> +static ssize_t hotunplug_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + struct range range;
> + int rc;
> +
> + if (!cxlr)
> + return -ENODEV;
> +
> + rc = cxl_sysram_range(cxlr, &range);
> + if (rc)
> + return rc;
> +
> + rc = offline_and_remove_memory(range.start, range_len(&range));
> +
> + if (rc)
> + return rc;
> +
> + return len;
> +}
> +static DEVICE_ATTR_WO(hotunplug);
> +
> +struct online_memory_cb_arg {
> + int online_type;
> + int rc;
> +};
> +
> +static int online_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> + struct online_memory_cb_arg *cb_arg = arg;
> +
> + if (signal_pending(current))
> + return -EINTR;
> +
> + cond_resched();
> +
> + if (mem->state == MEM_ONLINE)
> + return 0;
> +
> + mem->online_type = cb_arg->online_type;
> + cb_arg->rc = device_online(&mem->dev);
> +
> + return cb_arg->rc;
> +}
> +
> +static int offline_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> + int *rc = arg;
> +
> + if (signal_pending(current))
> + return -EINTR;
> +
> + cond_resched();
> +
> + if (mem->state == MEM_OFFLINE)
> + return 0;
> +
> + *rc = device_offline(&mem->dev);
> +
> + return *rc;
> +}
> +
> +static ssize_t state_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + struct online_memory_cb_arg cb_arg;
> + struct range range;
> + int rc;
> +
> + if (!cxlr)
> + return -ENODEV;
> +
> + rc = cxl_sysram_range(cxlr, &range);
> + if (rc)
> + return rc;
> +
> + rc = lock_device_hotplug_sysfs();
> + if (rc)
> + return rc;
> +
> + if (sysfs_streq(buf, "online")) {
> + cb_arg.online_type = MMOP_ONLINE_MOVABLE;
> + cb_arg.rc = 0;
> + rc = walk_memory_blocks(range.start, range_len(&range),
> + &cb_arg, online_memory_block_cb);
> + if (!rc)
> + rc = cb_arg.rc;
> + } else if (sysfs_streq(buf, "online_normal")) {
> + cb_arg.online_type = MMOP_ONLINE;
> + cb_arg.rc = 0;
> + rc = walk_memory_blocks(range.start, range_len(&range),
> + &cb_arg, online_memory_block_cb);
> + if (!rc)
> + rc = cb_arg.rc;
> + } else if (sysfs_streq(buf, "offline")) {
> + int offline_rc = 0;
> +
> + rc = walk_memory_blocks(range.start, range_len(&range),
> + &offline_rc, offline_memory_block_cb);
> + if (!rc)
> + rc = offline_rc;
Let's expose this functionality through some common-code helpers. I
really don't want more code doing this non-obvious device_offline() etc
dance.
walk_memory_blocks() should become a core-mm helper. Maybe we can also
clean up drivers/acpi/acpi_memhotplug.c in that regard.
Hopefully we can then also reuse these helpers in ppc code (see
dlpar_add_lmb() and dlpar_remove_lmb() that do something similar, but
grab the device hotplug lock themselves as they want to perform some
additional operations).
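As a rough illustration of what such a shared helper could look like,
here is a userspace sketch with mock memory blocks. Everything named
sk_* is hypothetical and only models the shape of the logic: a walker
that stops at the first non-zero callback return (as walk_memory_blocks()
does), and one entry point that covers both onlining and offlining so
callers do not repeat the callback dance.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sk_mem_state { SK_MEM_OFFLINE, SK_MEM_ONLINE };

/* Mock memory block; "busy" is a test knob that makes state changes fail. */
struct sk_block {
	enum sk_mem_state state;
	int online_type;
	int busy;
};

typedef int (*sk_walk_fn)(struct sk_block *blk, void *arg);

/* Visit each block in order and stop at the first non-zero return. */
static int sk_walk_blocks(struct sk_block *blks, size_t nr, void *arg,
			  sk_walk_fn fn)
{
	assert(fn != NULL);
	for (size_t i = 0; i < nr; i++) {
		int rc = fn(&blks[i], arg);

		if (rc)
			return rc;
	}
	return 0;
}

static int sk_online_cb(struct sk_block *blk, void *arg)
{
	int online_type = *(int *)arg;

	if (blk->state == SK_MEM_ONLINE)
		return 0;
	if (blk->busy)
		return -EBUSY;
	blk->online_type = online_type;
	blk->state = SK_MEM_ONLINE;
	return 0;
}

static int sk_offline_cb(struct sk_block *blk, void *arg)
{
	(void)arg;
	if (blk->state == SK_MEM_OFFLINE)
		return 0;
	if (blk->busy)
		return -EBUSY;
	blk->online_type = 0;
	blk->state = SK_MEM_OFFLINE;
	return 0;
}

/* Single entry point: online_type 0 means offline, anything else onlines. */
static int sk_set_blocks_state(struct sk_block *blks, size_t nr,
			       int online_type)
{
	if (online_type == 0)
		return sk_walk_blocks(blks, nr, NULL, sk_offline_cb);
	return sk_walk_blocks(blks, nr, &online_type, sk_online_cb);
}
```

Because the walker already propagates the callback's return value, a
helper of this shape would not need a separate rc field in the callback
argument. Locking (the device hotplug lock) is deliberately left out of
the sketch.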
--
Cheers
David
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-12 16:35 ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
@ 2026-01-12 20:59 ` dan.j.williams
2026-01-12 22:25 ` Gregory Price
2026-01-13 18:00 ` Dave Jiang
2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-14 17:18 ` Jonathan Cameron
2 siblings, 2 replies; 40+ messages in thread
From: dan.j.williams @ 2026-01-12 20:59 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
Gregory Price wrote:
> The CXL driver presently hands policy management over to DAX subsystem
> for sysram regions, which makes building policy around the entire region
> clunky and at time difficult (e.g. multiple actions to offline and
> hot-unplug memory reliably).
>
> To support multiple backend controllers for memory regions (for example
> dax vs direct hotplug), implement a memctrl field in cxl_region allows
> switching uncomitted regions between different "memory controllers".
>
> CXL_CONTROL_NONE: No selected controller, probe will fail.
> CXL_CONTROL_AUTO: If memory is already online as SysRAM, no controller
> otherwise register a dax_region
> CXL_CONTROL_DAX : register a dax_region
>
> Auto regions will either be static sysram (BIOS-onlined) and has no
> region controller associated with it - or if the SP bit was set a
> DAX device will be created.
>
> Rather than default all regions to the auto-controller, only default
> auto-regions to the auto controller.
>
> Non-auto regions will be defaulted to CXL_CONTROL_NONE, which will cause
> a failure to probe unless a controller is selected.
>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/core.h | 2 +
> drivers/cxl/core/memctrl/Makefile | 4 +
> drivers/cxl/core/memctrl/dax_region.c | 79 +++++++++++++++
> drivers/cxl/core/memctrl/memctrl.c | 42 ++++++++
> drivers/cxl/core/region.c | 136 ++++++++++----------------
> drivers/cxl/cxl.h | 14 +++
> 7 files changed, 192 insertions(+), 86 deletions(-)
> create mode 100644 drivers/cxl/core/memctrl/Makefile
> create mode 100644 drivers/cxl/core/memctrl/dax_region.c
> create mode 100644 drivers/cxl/core/memctrl/memctrl.c
>
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 5ad8fef210b5..79de20e3f8aa 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -17,6 +17,7 @@ cxl_core-y += cdat.o
> cxl_core-y += ras.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> +include $(src)/memctrl/Makefile
Not sure this merits its own directory, but if it does just do the
canonical:
obj-y += memctrl/
...to add an object-sub-directory.
> cxl_core-$(CONFIG_CXL_MCE) += mce.o
> cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 1fb66132b777..1156a4bd0080 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -42,6 +42,8 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port);
> struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
> u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> u64 dpa);
> +int cxl_enable_memctrl(struct cxl_region *cxlr);
This is a "probe" operation not an "enable" in terms of runtime ABI and
presentation that starts decorating the region. In that respect it also
is not a "control" as much as an "operation model / driver". So no need
for a "control" concept, i.e.:
s/CXL_CONTROL_{NONE,AUTO,DAX}/CXL_DRIVER_{NONE,AUTO,DAX}/
s/enum cxl_memctrl_mode/enum cxl_region_driver/
...otherwise there is nothing in this proposal that makes me want to
abandon the traditional meaning of a "driver" probing a "resource" in a
certain way to make it usable with the rest of the kernel.
The rest of this looks fine. With that fixup, if we are going to have a
set of different region driver modes, then the directory can be:
drivers/cxl/core/region/
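For illustration, the rename could look roughly like the following
userspace sketch. The enum follows the suggested CXL_DRIVER_* naming;
the sk_* helpers, the name table, and the choice to report "none" for an
unselected driver are assumptions, not part of the posted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Renamed per the suggested CXL_DRIVER_{NONE,AUTO,DAX} scheme. */
enum cxl_region_driver {
	CXL_DRIVER_NONE,
	CXL_DRIVER_AUTO,
	CXL_DRIVER_DAX,
};

/* Hypothetical name table shared by the show and store paths. */
static const char *const sk_driver_names[] = {
	[CXL_DRIVER_NONE] = "none",
	[CXL_DRIVER_AUTO] = "auto",
	[CXL_DRIVER_DAX]  = "dax",
};

static const char *sk_driver_name(enum cxl_region_driver drv)
{
	size_t nr = sizeof(sk_driver_names) / sizeof(sk_driver_names[0]);

	if ((size_t)drv >= nr)
		return "none";
	return sk_driver_names[drv];
}

/* Only "dax" is user-selectable here, as in the posted ctrl_store(). */
static int sk_driver_parse(const char *buf, enum cxl_region_driver *out)
{
	assert(out != NULL);
	if (strcmp(buf, "dax") == 0 || strcmp(buf, "dax\n") == 0) {
		*out = CXL_DRIVER_DAX;
		return 0;
	}
	return -EINVAL;
}
```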
* Re: [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 16:35 ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
2026-01-12 20:00 ` David Hildenbrand (Red Hat)
@ 2026-01-12 21:10 ` dan.j.williams
2026-01-12 22:47 ` Gregory Price
2026-01-12 21:10 ` Cheatham, Benjamin
2 siblings, 1 reply; 40+ messages in thread
From: dan.j.williams @ 2026-01-12 21:10 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams,
David Hildenbrand
Gregory Price wrote:
> Add a sysram memctrl that directly hotplugs memory without needing to
> route through DAX. This simplifies the sysram usecase considerably.
>
> The sysram memctl adds new sysfs controls when registered:
> region/memctrl/[hotplug, hotunplug, state]
>
> hotplug: controller attempts to hotplug the memory region
> hotunplug: controller attempts to offline and hotunplug the memory region
> state: [online,online_normal,offline]
> online : controller onlines blocks in ZONE_MOVABLE
> online_normal: controller onlines blocks in ZONE_NORMAL
> offline : controller attempts to offline the memory blocks
>
> Hotplug note - by default the controller will hotplug the blocks, but
> leave them offline (unless MHP auto-online in Kconfig is enabled).
>
> Setting state to "online_normal" may prevent future hot-unplug of sysram
> regions, and unbinding a memory region with memory online in ZONE_NORMAL
> may result in the device being removed but the memory remaining online.
>
> This can result in future management functions failing (such as adding a
> new region). This is why "online_normal" is explicit, and the default
> online zone is ZONE_MOVABLE.
David's early feedback aligns with my own with respect to not creating
new "online_*" ABI terms, but I want to go a step further.
Part of the proposal here solves a fundamental problem with the way
dax_kmem operates: it depends on fine-grained, multi-step online
control via memblock sysfs.
If we are going to introduce a new omnibus way to online entire regions
at a time then that goodness should first come to dax_kmem and then
potentially be refactored into a library that CXL can use to skip the
device_dax indirection.
I.e. the end result would be a "hotplug" mechanism that first fixes a
long-standing dax_kmem problem, and then goes further to drop the
indirection through device_dax and provide a "hotplug" mechanism
directly at the cxl_region level.
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> +{
[..]
> +err_add_group:
> + dev_set_drvdata(dev, NULL);
> + /* if this fails, memory cannot be removed from the system until reboot */
> + remove_memory(range.start, range_len(&range));
> +err_add_memory:
> + remove_resource(res);
> + kfree(res);
> +err_request_mem:
> + memory_group_unregister(data->mgid);
> +err_reg_mgid:
> + kfree(data->res_name);
> +err_res_name:
> + kfree(data);
> +err_data:
> + clear_node_memory_type(numa_node, mtype);
> + return rc;
...btw, this feels like too many new gotos in the age of
scope-based-cleanup. It also feels like a bunch of duplicated code that
CXL and a fixed-up dax_kmem can share.
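As a sketch of the scope-based pattern being referred to, the following
userspace example uses the compiler's cleanup attribute (a GCC/Clang
extension, which is what the kernel's __free() helpers build on). The
SK_FREE and sk_no_free_ptr() macros and the sk_* names are hypothetical
stand-ins for the kernel helpers, and only two of the allocations from
the goto ladder are modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int sk_frees;	/* counts automatic frees, for demonstration only */

/* Cleanup handler: receives a pointer to the annotated variable. */
static void sk_free_p(void *p)
{
	void *ptr = *(void **)p;

	if (ptr) {
		free(ptr);
		sk_frees++;
	}
}

#define SK_FREE __attribute__((cleanup(sk_free_p)))

/* Hand ownership to the caller: NULL the variable so cleanup skips it. */
#define sk_no_free_ptr(p) \
	({ __typeof__(p) _tmp = (p); (p) = NULL; _tmp; })

struct sk_data {
	char *res_name;
};

/* Every early return frees what was allocated so far, without gotos. */
static struct sk_data *sk_setup(const char *name, int fail_late)
{
	assert(name != NULL);

	struct sk_data *data SK_FREE = calloc(1, sizeof(*data));
	if (!data)
		return NULL;

	char *res_name SK_FREE = malloc(strlen(name) + 1);
	if (!res_name)
		return NULL;
	strcpy(res_name, name);

	if (fail_late)
		return NULL;	/* res_name, then data, are freed here */

	data->res_name = sk_no_free_ptr(res_name);
	return sk_no_free_ptr(data);
}
```

Steps that are not plain allocations (memory groups, resources, the
hotplugged range itself) would need their own cleanup handlers or devm
actions; the sketch does not attempt to model those.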
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-12 16:35 ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
2026-01-12 20:59 ` dan.j.williams
@ 2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 22:34 ` Gregory Price
2026-01-14 17:18 ` Jonathan Cameron
2 siblings, 1 reply; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-12 21:10 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On 1/12/2026 10:35 AM, Gregory Price wrote:
> The CXL driver presently hands policy management over to DAX subsystem
> for sysram regions, which makes building policy around the entire region
> clunky and at time difficult (e.g. multiple actions to offline and
> hot-unplug memory reliably).
>
> To support multiple backend controllers for memory regions (for example
> dax vs direct hotplug), implement a memctrl field in cxl_region allows
> switching uncomitted regions between different "memory controllers".
>
> CXL_CONTROL_NONE: No selected controller, probe will fail.
> CXL_CONTROL_AUTO: If memory is already online as SysRAM, no controller
> otherwise register a dax_region
I think you can streamline this to get rid of CXL_CONTROL_AUTO. If BIOS set up
the memory you can just set the mode to CXL_CONTROL_NONE, otherwise use CXL_CONTROL_DAX.
This patch would be a bit less complicated at the very least, and I don't think it
would require much reworking of later patches.
> CXL_CONTROL_DAX : register a dax_region
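The suggestion above to drop CXL_CONTROL_AUTO could be sketched as a
decision made once, when the auto-region is constructed. This is a
userspace mock under stated assumptions: the sk_* names are
hypothetical, ranges are inclusive, and the list of busy System RAM
ranges stands in for what walk_iomem_res_desc() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sk_memctrl { SK_MEMCTRL_NONE, SK_MEMCTRL_DAX };

struct sk_range {
	unsigned long long start;
	unsigned long long end;	/* inclusive */
};

static bool sk_ranges_overlap(const struct sk_range *a,
			      const struct sk_range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/*
 * If any part of the region is already online as System RAM (BIOS set it
 * up), select no controller; otherwise fall back to DAX.
 */
static enum sk_memctrl sk_pick_memctrl(const struct sk_range *region,
				       const struct sk_range *sysram,
				       size_t nr)
{
	assert(region != NULL);
	for (size_t i = 0; i < nr; i++)
		if (sk_ranges_overlap(region, &sysram[i]))
			return SK_MEMCTRL_NONE;
	return SK_MEMCTRL_DAX;
}
```

One caveat with this shape: in the posted cxl_enable_memctrl(),
CXL_MEMCTRL_NONE falls through to the default case and returns -EINVAL,
so the sketch assumes NONE would be treated as a successful no-op for
BIOS-onlined regions.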
>
> Auto regions will either be static sysram (BIOS-onlined) and has no
> region controller associated with it - or if the SP bit was set a
> DAX device will be created.
>
> Rather than default all regions to the auto-controller, only default
> auto-regions to the auto controller.
>
> Non-auto regions will be defaulted to CXL_CONTROL_NONE, which will cause
> a failure to probe unless a controller is selected.
>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/core.h | 2 +
> drivers/cxl/core/memctrl/Makefile | 4 +
> drivers/cxl/core/memctrl/dax_region.c | 79 +++++++++++++++
> drivers/cxl/core/memctrl/memctrl.c | 42 ++++++++
> drivers/cxl/core/region.c | 136 ++++++++++----------------
> drivers/cxl/cxl.h | 14 +++
> 7 files changed, 192 insertions(+), 86 deletions(-)
> create mode 100644 drivers/cxl/core/memctrl/Makefile
> create mode 100644 drivers/cxl/core/memctrl/dax_region.c
> create mode 100644 drivers/cxl/core/memctrl/memctrl.c
>
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 5ad8fef210b5..79de20e3f8aa 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -17,6 +17,7 @@ cxl_core-y += cdat.o
> cxl_core-y += ras.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> +include $(src)/memctrl/Makefile
> cxl_core-$(CONFIG_CXL_MCE) += mce.o
> cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 1fb66132b777..1156a4bd0080 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -42,6 +42,8 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port);
> struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
> u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> u64 dpa);
> +int cxl_enable_memctrl(struct cxl_region *cxlr);
> +int devm_cxl_add_dax_region(struct cxl_region *cxlr);
>
> #else
> static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
> diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
> new file mode 100644
> index 000000000000..8165aad5a52a
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/Makefile
> @@ -0,0 +1,4 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
> diff --git a/drivers/cxl/core/memctrl/dax_region.c b/drivers/cxl/core/memctrl/dax_region.c
> new file mode 100644
> index 000000000000..90d7fdb97013
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/dax_region.c
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +static struct lock_class_key cxl_dax_region_key;
> +
> +static struct cxl_dax_region *cxl_dax_region_alloc(struct cxl_region *cxlr)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> + struct cxl_dax_region *cxlr_dax;
> + struct device *dev;
> +
> + guard(rwsem_read)(&cxl_rwsem.region);
> + if (p->state != CXL_CONFIG_COMMIT)
> + return ERR_PTR(-ENXIO);
> +
> + cxlr_dax = kzalloc(sizeof(*cxlr_dax), GFP_KERNEL);
> + if (!cxlr_dax)
> + return ERR_PTR(-ENOMEM);
> +
> + cxlr_dax->hpa_range.start = p->res->start;
> + cxlr_dax->hpa_range.end = p->res->end;
> +
> + dev = &cxlr_dax->dev;
> + cxlr_dax->cxlr = cxlr;
> + device_initialize(dev);
> + lockdep_set_class(&dev->mutex, &cxl_dax_region_key);
> + device_set_pm_not_required(dev);
> + dev->parent = &cxlr->dev;
> + dev->bus = &cxl_bus_type;
> + dev->type = &cxl_dax_region_type;
> +
> + return cxlr_dax;
> +}
> +
> +static void cxlr_dax_unregister(void *_cxlr_dax)
> +{
> + struct cxl_dax_region *cxlr_dax = _cxlr_dax;
> +
> + device_unregister(&cxlr_dax->dev);
> +}
> +
> +/*
> + * The dax controller is the default controller and simply hands the
> + * control pattern over to the dax driver. It does with a dax_region
> + * built by dax/cxl.c
> + */
> +int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> +{
> + struct cxl_dax_region *cxlr_dax;
> + struct device *dev;
> + int rc;
> +
> + cxlr_dax = cxl_dax_region_alloc(cxlr);
> + if (IS_ERR(cxlr_dax))
> + return PTR_ERR(cxlr_dax);
> +
> + dev = &cxlr_dax->dev;
> + rc = dev_set_name(dev, "dax_region%d", cxlr->id);
> + if (rc)
> + goto err;
> +
> + rc = device_add(dev);
> + if (rc)
> + goto err;
> +
> + dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
> + dev_name(dev));
> +
> + return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
> + cxlr_dax);
> +err:
> + put_device(dev);
> + return rc;
> +}
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> new file mode 100644
> index 000000000000..24e0e14b39c7
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/memctrl.c
> @@ -0,0 +1,42 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
> +/* Copyright(c) 2026 Meta Inc. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/ioport.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +static int is_system_ram(struct resource *res, void *arg)
> +{
> + struct cxl_region *cxlr = arg;
> + struct cxl_region_params *p = &cxlr->params;
> +
> + dev_dbg(&cxlr->dev, "%pr has System RAM: %pr\n", p->res, res);
> + return 1;
> +}
> +
> +int cxl_enable_memctrl(struct cxl_region *cxlr)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> +
> + switch (cxlr->memctrl) {
> + case CXL_MEMCTRL_AUTO:
> + /*
> + * The region can not be manged by CXL if any portion of
s/manged/managed. May as well fix it since it's being moved.
> + * it is already online as 'System RAM'
> + */
> + if (walk_iomem_res_desc(IORES_DESC_NONE,
> + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> + p->res->start, p->res->end, cxlr,
> + is_system_ram) > 0)
> + return 0;
> + return devm_cxl_add_dax_region(cxlr);
If you take my suggestion at the top about removing MEMCTRL_AUTO, this case becomes
the MEMCTRL_DAX case.
> + case CXL_MEMCTRL_DAX:
> + return devm_cxl_add_dax_region(cxlr);
> + default:
> + return -EINVAL;
> + }
> +}
> +
> +
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index ae899f68551f..02d7d9ae0252 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -626,6 +626,50 @@ static ssize_t mode_show(struct device *dev, struct device_attribute *attr,
> }
> static DEVICE_ATTR_RO(mode);
>
> +static ssize_t ctrl_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + const char *desc;
> +
> + switch (cxlr->memctrl) {
> + case CXL_MEMCTRL_AUTO:
> + desc = "auto";
> + break;
> + case CXL_MEMCTRL_DAX:
> + desc = "dax";
> + break;
> + default:
> + desc = "";
Nit: I would prefer this say "none" instead of being blank to match the code.
> + break;
> + }
> +
> + return sysfs_emit(buf, "%s\n", desc);
> +}
> +
> +static ssize_t ctrl_store(struct device *dev, struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + struct cxl_region_params *p = &cxlr->params;
> + int rc;
> +
> + ACQUIRE(rwsem_write_kill, rwsem)(&cxl_rwsem.region);
> + if ((rc = ACQUIRE_ERR(rwsem_write_kill, &rwsem)))
> + return rc;
> +
> + if (p->state >= CXL_CONFIG_COMMIT)
> + return -EBUSY;
> +
> + if (sysfs_streq(buf, "dax"))
> + cxlr->memctrl = CXL_MEMCTRL_DAX;
> + else
> + return -EINVAL;
> +
> + return len;
> +}
> +static DEVICE_ATTR_RW(ctrl);
> +
> static int alloc_hpa(struct cxl_region *cxlr, resource_size_t size)
> {
> struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> @@ -772,6 +816,7 @@ static struct attribute *cxl_region_attrs[] = {
> &dev_attr_size.attr,
> &dev_attr_mode.attr,
> &dev_attr_extended_linear_cache_size.attr,
> + &dev_attr_ctrl.attr,
> NULL,
> };
>
> @@ -2598,6 +2643,7 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
> return cxlr;
> cxlr->mode = mode;
> cxlr->type = type;
> + cxlr->memctrl = CXL_MEMCTRL_NONE;
>
> dev = &cxlr->dev;
> rc = dev_set_name(dev, "region%d", id);
> @@ -3307,37 +3353,6 @@ struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
> }
> EXPORT_SYMBOL_NS_GPL(to_cxl_dax_region, "CXL");
>
> -static struct lock_class_key cxl_dax_region_key;
> -
> -static struct cxl_dax_region *cxl_dax_region_alloc(struct cxl_region *cxlr)
> -{
> - struct cxl_region_params *p = &cxlr->params;
> - struct cxl_dax_region *cxlr_dax;
> - struct device *dev;
> -
> - guard(rwsem_read)(&cxl_rwsem.region);
> - if (p->state != CXL_CONFIG_COMMIT)
> - return ERR_PTR(-ENXIO);
> -
> - cxlr_dax = kzalloc(sizeof(*cxlr_dax), GFP_KERNEL);
> - if (!cxlr_dax)
> - return ERR_PTR(-ENOMEM);
> -
> - cxlr_dax->hpa_range.start = p->res->start;
> - cxlr_dax->hpa_range.end = p->res->end;
> -
> - dev = &cxlr_dax->dev;
> - cxlr_dax->cxlr = cxlr;
> - device_initialize(dev);
> - lockdep_set_class(&dev->mutex, &cxl_dax_region_key);
> - device_set_pm_not_required(dev);
> - dev->parent = &cxlr->dev;
> - dev->bus = &cxl_bus_type;
> - dev->type = &cxl_dax_region_type;
> -
> - return cxlr_dax;
> -}
> -
> static void cxlr_pmem_unregister(void *_cxlr_pmem)
> {
> struct cxl_pmem_region *cxlr_pmem = _cxlr_pmem;
> @@ -3424,42 +3439,6 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
> return rc;
> }
>
> -static void cxlr_dax_unregister(void *_cxlr_dax)
> -{
> - struct cxl_dax_region *cxlr_dax = _cxlr_dax;
> -
> - device_unregister(&cxlr_dax->dev);
> -}
> -
> -static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> -{
> - struct cxl_dax_region *cxlr_dax;
> - struct device *dev;
> - int rc;
> -
> - cxlr_dax = cxl_dax_region_alloc(cxlr);
> - if (IS_ERR(cxlr_dax))
> - return PTR_ERR(cxlr_dax);
> -
> - dev = &cxlr_dax->dev;
> - rc = dev_set_name(dev, "dax_region%d", cxlr->id);
> - if (rc)
> - goto err;
> -
> - rc = device_add(dev);
> - if (rc)
> - goto err;
> -
> - dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
> - dev_name(dev));
> -
> - return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
> - cxlr_dax);
> -err:
> - put_device(dev);
> - return rc;
> -}
> -
> static int match_decoder_by_range(struct device *dev, const void *data)
> {
> const struct range *r1, *r2 = data;
> @@ -3579,6 +3558,9 @@ static int __construct_region(struct cxl_region *cxlr,
>
> set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>
> + /* Auto-regions will either be static sysram (onlined by BIOS) or DAX */
> + cxlr->memctrl = CXL_MEMCTRL_AUTO;
And this would be set to CXL_MEMCTRL_DAX instead.
> +
> res = kmalloc(sizeof(*res), GFP_KERNEL);
> if (!res)
> return -ENOMEM;
> @@ -3755,15 +3737,6 @@ u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_port_get_spa_cache_alias, "CXL");
>
> -static int is_system_ram(struct resource *res, void *arg)
> -{
> - struct cxl_region *cxlr = arg;
> - struct cxl_region_params *p = &cxlr->params;
> -
> - dev_dbg(&cxlr->dev, "%pr has System RAM: %pr\n", p->res, res);
> - return 1;
> -}
> -
> static void shutdown_notifiers(void *_cxlr)
> {
> struct cxl_region *cxlr = _cxlr;
> @@ -3965,16 +3938,7 @@ static int cxl_region_probe(struct device *dev)
> dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
> cxlr->id);
>
> - /*
> - * The region can not be manged by CXL if any portion of
> - * it is already online as 'System RAM'
> - */
> - if (walk_iomem_res_desc(IORES_DESC_NONE,
> - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> - p->res->start, p->res->end, cxlr,
> - is_system_ram) > 0)
> - return 0;
> - return devm_cxl_add_dax_region(cxlr);
> + return cxl_enable_memctrl(cxlr);
> default:
> dev_dbg(&cxlr->dev, "unsupported region mode: %d\n",
> cxlr->mode);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index ba17fa86d249..b8fabaa77262 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -502,6 +502,19 @@ enum cxl_partition_mode {
> CXL_PARTMODE_PMEM,
> };
>
> +
> +/*
> + * Memory Controller modes:
> + * None - No controller selected
> + * Auto - either BIOS-configured as SysRAM, or default to DAX
> + * DAX - creates a dax_region controller for the cxl_region
> + */
> +enum cxl_memctrl_mode {
> + CXL_MEMCTRL_NONE,
> + CXL_MEMCTRL_AUTO,
> + CXL_MEMCTRL_DAX,
> +};
> +
> /*
> * Indicate whether this region has been assembled by autodetection or
> * userspace assembly. Prevent endpoint decoders outside of automatic
> @@ -543,6 +556,7 @@ struct cxl_region {
> struct device dev;
> int id;
> enum cxl_partition_mode mode;
> + enum cxl_memctrl_mode memctrl;
> enum cxl_decoder_type type;
> struct cxl_nvdimm_bridge *cxl_nvb;
> struct cxl_pmem_region *cxlr_pmem;
* Re: [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 16:35 ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
2026-01-12 20:00 ` David Hildenbrand (Red Hat)
2026-01-12 21:10 ` dan.j.williams
@ 2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 22:55 ` Gregory Price
2 siblings, 1 reply; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-12 21:10 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams,
David Hildenbrand
On 1/12/2026 10:35 AM, Gregory Price wrote:
> Add a sysram memctrl that directly hotplugs memory without needing to
> route through DAX. This simplifies the sysram usecase considerably.
>
> The sysram memctl adds new sysfs controls when registered:
> region/memctrl/[hotplug, hotunplug, state]
>
> hotplug: controller attempts to hotplug the memory region
> hotunplug: controller attempts to offline and hotunplug the memory region
Nit: Would it be better to use hotadd/hotremove here instead of hotplug/hotunplug? The terms
are basically synonymous, but I think hotadd and hotremove are more descriptive.
> state: [online,online_normal,offline]
> online : controller onlines blocks in ZONE_MOVABLE
> online_normal: controller onlines blocks in ZONE_NORMAL
The naming for online states could be improved imo. I understand and agree with the motivation
behind the names, but I could see the use of the word "normal" being confusing to less savvy users.
You could change it to include the zone for both (online_movable/online_normal), but I think it may
be easier to mark which one has drawbacks, i.e. change "online_normal" to something like "online_nonremovable".
That way, anyone who doesn't want to go find the documentation for these can understand the user-visible
impact.
In any case, all of these attributes need ABI documentation as well.
> offline : controller attempts to offline the memory blocks
>
> Hotplug note - by default the controller will hotplug the blocks, but
> leave them offline (unless MHP auto-online in Kconfig is enabled).
>
> Setting state to "online_normal" may prevent future hot-unplug of sysram
> regions, and unbinding a memory region with memory online in ZONE_NORMAL
> may result in the device being removed but the memory remaining online.
>
> This can result in future management functions failing (such as adding a
> new region). This is why "online_normal" is explicit, and the default
> online zone is ZONE_MOVABLE.
>
> Cc: David Hildenbrand <david@kernel.org>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/core.h | 2 +
> drivers/cxl/core/memctrl/Makefile | 1 +
> drivers/cxl/core/memctrl/memctrl.c | 2 +
> drivers/cxl/core/memctrl/sysram_region.c | 358 +++++++++++++++++++++++
> drivers/cxl/core/region.c | 5 +
> drivers/cxl/cxl.h | 6 +-
> 6 files changed, 372 insertions(+), 2 deletions(-)
> create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
>
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 1156a4bd0080..18cb84950500 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -31,6 +31,8 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
> struct cxl_endpoint_decoder *cxled, int pos,
> enum cxl_detach_mode mode);
>
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr);
> +
> #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
> #define CXL_REGION_TYPE(x) (&cxl_region_type)
> #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
> diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
> index 8165aad5a52a..1c52c7d75570 100644
> --- a/drivers/cxl/core/memctrl/Makefile
> +++ b/drivers/cxl/core/memctrl/Makefile
> @@ -2,3 +2,4 @@
>
> cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
> cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/sysram_region.o
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> index 24e0e14b39c7..40ffb59353bb 100644
> --- a/drivers/cxl/core/memctrl/memctrl.c
> +++ b/drivers/cxl/core/memctrl/memctrl.c
> @@ -34,6 +34,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
> return devm_cxl_add_dax_region(cxlr);
> case CXL_MEMCTRL_DAX:
> return devm_cxl_add_dax_region(cxlr);
> + case CXL_MEMCTRL_SYSRAM:
> + return devm_cxl_add_sysram_region(cxlr);
> default:
> return -EINVAL;
> }
> diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
> new file mode 100644
> index 000000000000..a7570c8a54e1
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/sysram_region.c
> @@ -0,0 +1,358 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2026 Meta Inc. All rights reserved. */
> +#include <linux/memremap.h>
> +#include <linux/memory.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <linux/mm.h>
> +#include <linux/memory-tiers.h>
> +#include <linux/memory_hotplug.h>
> +#include <linux/string_helpers.h>
> +#include <linux/sched/signal.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +/* If HMAT was unavailable, assign a default distance. */
> +#define MEMTIER_DEFAULT_CXL_ADISTANCE (MEMTIER_ADISTANCE_DRAM * 5)
> +
> +static const char *sysram_name = "System RAM (CXL)";
> +
> +struct cxl_sysram_data {
> + const char *res_name;
> + int mgid;
> + struct resource *res;
> +};
> +
> +static DEFINE_MUTEX(cxl_memory_type_lock);
> +static LIST_HEAD(cxl_memory_types);
> +
> +static struct cxl_region *to_cxl_region(struct device *dev)
> +{
> + if (dev->type != &cxl_region_type)
> + return NULL;
> + return container_of(dev, struct cxl_region, dev);
> +}
What's the reasoning behind redefining this in this file? It's still defined in cxl/core/region.c,
so I would probably just drop the static there and include it through core.h.
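To illustrate the suggestion, here is a minimal userspace sketch of the single-definition pattern: one non-static type-checked helper that a shared header would declare for every user. All sketch_* names are hypothetical stand-ins for the kernel's struct device, cxl_region_type and container_of(); this is not the actual region.c code.

```c
#include <stddef.h>

/* Userspace model of the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct device_type and struct device. */
struct sketch_device_type {
	const char *name;
};

struct sketch_device {
	const struct sketch_device_type *type;
};

/* Hypothetical stand-in for struct cxl_region, embedding its device. */
struct sketch_region {
	int id;
	struct sketch_device dev;
};

/* Defined once; a shared header (core.h in the patch) would declare it. */
const struct sketch_device_type sketch_region_type = { .name = "cxl_region" };

/*
 * Non-static, so every file that includes the shared header uses the same
 * helper. Returns NULL when the device is not a region.
 */
struct sketch_region *sketch_to_region(struct sketch_device *dev)
{
	if (dev->type != &sketch_region_type)
		return NULL;
	return sketch_container_of(dev, struct sketch_region, dev);
}
```

With a single definition, any later change to the type check only has to be made in one place.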
> +
> +static struct memory_dev_type *cxl_find_alloc_memory_type(int adist)
> +{
> + guard(mutex)(&cxl_memory_type_lock);
> + return mt_find_alloc_memory_type(adist, &cxl_memory_types);
> +}
> +
> +static void __maybe_unused cxl_put_memory_types(void)
> +{
> + guard(mutex)(&cxl_memory_type_lock);
> + mt_put_memory_types(&cxl_memory_types);
> +}
> +
> +static int cxl_sysram_range(struct cxl_region *cxlr, struct range *r)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> +
> + if (!p->res)
> + return -ENODEV;
> +
> + /* memory-block align the hotplug range */
> + r->start = ALIGN(p->res->start, memory_block_size_bytes());
> + r->end = ALIGN_DOWN(p->res->end + 1, memory_block_size_bytes()) - 1;
> + if (r->start >= r->end) {
> + r->start = p->res->start;
> + r->end = p->res->end;
> + return -ENOSPC;
> + }
> + return 0;
> +}
> +
> +static ssize_t hotunplug_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + struct range range;
> + int rc;
> +
> + if (!cxlr)
> + return -ENODEV;
> +
> + rc = cxl_sysram_range(cxlr, &range);
> + if (rc)
> + return rc;
> +
> + rc = offline_and_remove_memory(range.start, range_len(&range));
> +
> + if (rc)
Extra blank line above.
> + return rc;
> +
> + return len;
> +}
> +static DEVICE_ATTR_WO(hotunplug);
> +
> +struct online_memory_cb_arg {
> + int online_type;
> + int rc;
> +};
> +
> +static int online_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> + struct online_memory_cb_arg *cb_arg = arg;
> +
> + if (signal_pending(current))
> + return -EINTR;
> +
> + cond_resched();
> +
> + if (mem->state == MEM_ONLINE)
> + return 0;
> +
> + mem->online_type = cb_arg->online_type;
> + cb_arg->rc = device_online(&mem->dev);
> +
> + return cb_arg->rc;
> +}
> +
> +static int offline_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> + int *rc = arg;
> +
> + if (signal_pending(current))
> + return -EINTR;
> +
> + cond_resched();
> +
> + if (mem->state == MEM_OFFLINE)
> + return 0;
> +
> + *rc = device_offline(&mem->dev);
> +
> + return *rc;
> +}
> +
> +static ssize_t state_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + struct online_memory_cb_arg cb_arg;
> + struct range range;
> + int rc;
> +
> + if (!cxlr)
> + return -ENODEV;
> +
> + rc = cxl_sysram_range(cxlr, &range);
> + if (rc)
> + return rc;
> +
> + rc = lock_device_hotplug_sysfs();
> + if (rc)
> + return rc;
> +
> + if (sysfs_streq(buf, "online")) {
> + cb_arg.online_type = MMOP_ONLINE_MOVABLE;
> + cb_arg.rc = 0;
> + rc = walk_memory_blocks(range.start, range_len(&range),
> + &cb_arg, online_memory_block_cb);
> + if (!rc)
> + rc = cb_arg.rc;
> + } else if (sysfs_streq(buf, "online_normal")) {
> + cb_arg.online_type = MMOP_ONLINE;
> + cb_arg.rc = 0;
> + rc = walk_memory_blocks(range.start, range_len(&range),
> + &cb_arg, online_memory_block_cb);
> + if (!rc)
> + rc = cb_arg.rc;
> + } else if (sysfs_streq(buf, "offline")) {
> + int offline_rc = 0;
> +
> + rc = walk_memory_blocks(range.start, range_len(&range),
> + &offline_rc, offline_memory_block_cb);
> + if (!rc)
> + rc = offline_rc;
> + } else {
> + rc = -EINVAL;
> + }
Nit: You can just set rc = -EINVAL before the if statement instead of doing this else clause.
> +
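A rough userspace sketch of what I mean, assuming the string parsing is split out into its own helper. The sketch_* names and the enum are hypothetical stand-ins for sysfs_streq() and the MMOP_* online types, not code from the patch. In state_store() itself the assignment would have to come after the lock_device_hotplug_sysfs() check, since rc is reused there.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's MMOP_* online types. */
enum sketch_op {
	SKETCH_OP_OFFLINE,
	SKETCH_OP_ONLINE_NORMAL,
	SKETCH_OP_ONLINE_MOVABLE,
};

/* Minimal model of sysfs_streq(): equal, ignoring one trailing newline. */
static int sketch_streq(const char *buf, const char *word)
{
	size_t n = strlen(word);

	if (strncmp(buf, word, n) != 0)
		return 0;
	return buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0');
}

/* Map the sysfs input to an operation; unknown input yields -EINVAL. */
static int sketch_parse_state(const char *buf, enum sketch_op *op)
{
	int rc = -EINVAL;	/* default covers any unrecognized input */

	if (sketch_streq(buf, "online")) {
		*op = SKETCH_OP_ONLINE_MOVABLE;
		rc = 0;
	} else if (sketch_streq(buf, "online_normal")) {
		*op = SKETCH_OP_ONLINE_NORMAL;
		rc = 0;
	} else if (sketch_streq(buf, "offline")) {
		*op = SKETCH_OP_OFFLINE;
		rc = 0;
	}

	return rc;
}
```

Parsing first would also let the two online branches share a single walk_memory_blocks() call, but that is a separate cleanup.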
> + unlock_device_hotplug();
> +
> + if (rc)
> + return rc;
> +
> + return len;
> +}
> +static DEVICE_ATTR_WO(state);
> +
> +static ssize_t hotplug_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + struct cxl_sysram_data *data;
> + struct range range;
> + int rc;
> +
> + if (!cxlr)
> + return -ENODEV;
> +
> + data = dev_get_drvdata(dev);
> + if (!data)
> + return -ENODEV;
> +
> + rc = cxl_sysram_range(cxlr, &range);
> + if (rc)
> + return rc;
> +
> + rc = add_memory_driver_managed(data->mgid, range.start,
> + range_len(&range), sysram_name,
> + MHP_NID_IS_MGID);
> + if (rc)
> + return rc;
> +
> + return len;
> +}
> +static DEVICE_ATTR_WO(hotplug);
> +
> +static struct attribute *cxl_sysram_region_attrs[] = {
> + &dev_attr_hotunplug.attr,
> + &dev_attr_state.attr,
> + &dev_attr_hotplug.attr,
> + NULL,
> +};
> +
> +static const struct attribute_group cxl_sysram_region_group = {
> + .name = "memctl",
> + .attrs = cxl_sysram_region_attrs,
> +};
> +
> +static void cxl_sysram_unregister(void *_data)
> +{
> + struct cxl_sysram_data *data = _data;
> + struct range range = {
> + .start = data->res->start,
> + .end = data->res->end
> + };
> +
> + /* We have one shot for removal, otherwise it's stuck til reboot */
> + if (!offline_and_remove_memory(range.start, range_len(&range))) {
> + remove_resource(data->res);
> + kfree(data->res);
> + memory_group_unregister(data->mgid);
> + kfree(data->res_name);
> + kfree(data);
> + return;
> + }
> + pr_err("CXL: %#llx-%#llx cannot be hotremoved until next reboot\n",
> + range.start, range.end);
> +}
> +
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> + struct device *dev = &cxlr->dev;
> + struct cxl_sysram_data *data;
> + struct memory_dev_type *mtype;
> + unsigned long total_len = 0;
> + struct resource *res;
> + struct range range;
> + mhp_t mhp_flags;
> + int numa_node;
> + int adist = MEMTIER_DEFAULT_CXL_ADISTANCE;
> + int rc;
> +
> + numa_node = phys_to_target_node(p->res->start);
> + if (numa_node < 0) {
> + dev_warn(dev, "rejecting CXL region with invalid node: %d\n",
> + numa_node);
> + return -EINVAL;
> + }
> +
> + rc = cxl_sysram_range(cxlr, &range);
> + if (rc) {
> + dev_info(dev, "range %#llx-%#llx too small after alignment\n",
> + range.start, range.end);
This should probably be a warning instead. You do it for the next check which is essentially the same
case, so may as well do it here.
> + return rc;
> + }
> + total_len = range_len(&range);
> +
> + if (!total_len) {
> + dev_warn(dev, "rejecting CXL region without any memory after alignment\n");
> + return -EINVAL;
> + }
I don't think this check is needed. cxl_sysram_range() already errors out when r->start >= r->end, so an
empty aligned range never gets this far; the cxl_sysram_range() check above returns before this one runs.
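To make the arithmetic concrete, here is a small userspace model of the alignment done in cxl_sysram_range(). The sketch_* names are hypothetical, and the block size is passed in as a parameter instead of coming from memory_block_size_bytes(). Whenever it returns 0, the aligned range spans at least one whole block, so its length cannot be zero.

```c
#include <stdint.h>
#include <errno.h>

/* Userspace stand-ins for ALIGN()/ALIGN_DOWN(); a must be a power of two. */
#define SKETCH_ALIGN_DOWN(x, a)	((x) & ~((uint64_t)(a) - 1))
#define SKETCH_ALIGN_UP(x, a)	SKETCH_ALIGN_DOWN((x) + (uint64_t)(a) - 1, (a))

struct sketch_range {
	uint64_t start;
	uint64_t end;	/* inclusive, as in the kernel's struct range */
};

static uint64_t sketch_range_len(const struct sketch_range *r)
{
	return r->end - r->start + 1;
}

/*
 * Mirrors the arithmetic in cxl_sysram_range(): trim [res_start, res_end]
 * inward to block_size boundaries, and return -ENOSPC (restoring the
 * unaligned bounds) if no whole block is left.
 */
static int sketch_sysram_range(uint64_t res_start, uint64_t res_end,
			       uint64_t block_size, struct sketch_range *r)
{
	r->start = SKETCH_ALIGN_UP(res_start, block_size);
	r->end = SKETCH_ALIGN_DOWN(res_end + 1, block_size) - 1;
	if (r->start >= r->end) {
		r->start = res_start;
		r->end = res_end;
		return -ENOSPC;
	}
	return 0;
}
```

For example, with 128MiB blocks a 256MiB resource starting 1MiB past a block boundary trims down to exactly one block, while a 64MiB resource at the same offset returns -ENOSPC.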
> +
> + mt_calc_adistance(numa_node, &adist);
> + mtype = cxl_find_alloc_memory_type(adist);
> + if (IS_ERR(mtype))
> + return PTR_ERR(mtype);
> +
> + init_node_memory_type(numa_node, mtype);
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> + if (!data) {
> + rc = -ENOMEM;
> + goto err_data;
> + }
> +
> + data->res_name = kstrdup(dev_name(dev), GFP_KERNEL);
> + if (!data->res_name) {
> + rc = -ENOMEM;
> + goto err_res_name;
> + }
> +
> + rc = memory_group_register_static(numa_node, PFN_UP(total_len));
> + if (rc < 0)
> + goto err_reg_mgid;
> + data->mgid = rc;
> +
> + /* Region is permanently reserved if hotremove fails when unbinding. */
> + res = request_mem_region(range.start, range_len(&range),
> + data->res_name);
> + if (!res) {
> + dev_warn(dev, "range %#llx-%#llx could not reserve region\n",
> + range.start, range.end);
> + rc = -EBUSY;
> + goto err_request_mem;
> + }
> + data->res = res;
> +
> + /*
> + * Setup flags for System RAM. Leave _BUSY clear so add_memory() can add
> + * a child resource. Do not inherit flags from parent since it may set
> + * flags unknown to us that will the break add_memory() below.
> + */
> + res->flags = IORESOURCE_SYSTEM_RAM;
> + mhp_flags = MHP_NID_IS_MGID;
> + rc = add_memory_driver_managed(data->mgid, range.start,
> + range_len(&range), sysram_name, mhp_flags);
Looks like mhp_flags is only used once, I'd get rid of it and just use MHP_NID_IS_MGID instead.
> + if (rc) {
> + dev_warn(dev, "range %#llx-%#llx memory add failed\n",
> + range.start, range.end);
> + goto err_add_memory;
> + }
> + dev_dbg(dev, "%s: added %llu bytes as System RAM\n", dev_name(dev),
> + (unsigned long long)total_len);
> +
> + dev_set_drvdata(dev, data);
> + rc = devm_device_add_group(dev, &cxl_sysram_region_group);
> + if (rc)
> + goto err_add_group;
> +
> + return devm_add_action_or_reset(dev, cxl_sysram_unregister, data);
> +
> +err_add_group:
> + dev_set_drvdata(dev, NULL);
> + /* if this fails, memory cannot be removed from the system until reboot */
> + remove_memory(range.start, range_len(&range));
> +err_add_memory:
> + remove_resource(res);
> + kfree(res);
> +err_request_mem:
> + memory_group_unregister(data->mgid);
> +err_reg_mgid:
> + kfree(data->res_name);
> +err_res_name:
> + kfree(data);
> +err_data:
> + clear_node_memory_type(numa_node, mtype);
> + return rc;
> +}
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 02d7d9ae0252..eeab091f043a 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -639,6 +639,9 @@ static ssize_t ctrl_show(struct device *dev, struct device_attribute *attr,
> case CXL_MEMCTRL_DAX:
> desc = "dax";
> break;
> + case CXL_MEMCTRL_SYSRAM:
> + desc = "sysram";
> + break;
> default:
> desc = "";
> break;
> @@ -663,6 +666,8 @@ static ssize_t ctrl_store(struct device *dev, struct device_attribute *attr,
>
> if (sysfs_streq(buf, "dax"))
> cxlr->memctrl = CXL_MEMCTRL_DAX;
> + else if (sysfs_streq(buf, "sysram"))
> + cxlr->memctrl = CXL_MEMCTRL_SYSRAM;
> else
> return -EINVAL;
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b8fabaa77262..bb4f877b4e8f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -506,13 +506,15 @@ enum cxl_partition_mode {
> /*
> * Memory Controller modes:
> * None - No controller selected
> - * Auto - either BIOS-configured as SysRAM, or default to DAX
> - * DAX - creates a dax_region controller for the cxl_region
> + * Auto - either BIOS-configured as SysRAM, or default to DAX
> + * DAX - creates a dax_region controller for the cxl_region
> + * SYSRAM - hotplugs the region directly as System RAM
> */
> enum cxl_memctrl_mode {
> CXL_MEMCTRL_NONE,
> CXL_MEMCTRL_AUTO,
> CXL_MEMCTRL_DAX,
> + CXL_MEMCTRL_SYSRAM,
> };
>
> /*
* Re: [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region
2026-01-12 16:35 ` [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region Gregory Price
@ 2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 22:58 ` Gregory Price
0 siblings, 1 reply; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-12 21:10 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On 1/12/2026 10:35 AM, Gregory Price wrote:
> Move the pmem_region logic from region.c into memctrl/pmem_region.c.
> Restrict the valid controllers for pmem to the pmem controller.
> Simplify the controller selection logic in region probe.
>
> Cc:
May want to forward this to whoever this Cc tag was meant for :).
One nit below, otherwise this looks good to me:
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/core.h | 1 +
> drivers/cxl/core/memctrl/Makefile | 1 +
> drivers/cxl/core/memctrl/memctrl.c | 2 +
> drivers/cxl/core/memctrl/pmem_region.c | 191 +++++++++++++++++++++
> drivers/cxl/core/region.c | 221 +++----------------------
> drivers/cxl/cxl.h | 2 +
> 6 files changed, 217 insertions(+), 201 deletions(-)
> create mode 100644 drivers/cxl/core/memctrl/pmem_region.c
>
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 18cb84950500..59175890a6ac 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -46,6 +46,7 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> u64 dpa);
> int cxl_enable_memctrl(struct cxl_region *cxlr);
> int devm_cxl_add_dax_region(struct cxl_region *cxlr);
> +int devm_cxl_add_pmem_region(struct cxl_region *cxlr);
>
> #else
> static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
> diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
> index 1c52c7d75570..efffc8ba2c0b 100644
> --- a/drivers/cxl/core/memctrl/Makefile
> +++ b/drivers/cxl/core/memctrl/Makefile
> @@ -3,3 +3,4 @@
> cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
> cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
> cxl_core-$(CONFIG_CXL_REGION) += memctrl/sysram_region.o
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/pmem_region.o
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> index 40ffb59353bb..1b661465bdeb 100644
> --- a/drivers/cxl/core/memctrl/memctrl.c
> +++ b/drivers/cxl/core/memctrl/memctrl.c
> @@ -36,6 +36,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
> return devm_cxl_add_dax_region(cxlr);
> case CXL_MEMCTRL_SYSRAM:
> return devm_cxl_add_sysram_region(cxlr);
> + case CXL_MEMCTRL_PMEM:
> + return devm_cxl_add_pmem_region(cxlr);
> default:
> return -EINVAL;
> }
> diff --git a/drivers/cxl/core/memctrl/pmem_region.c b/drivers/cxl/core/memctrl/pmem_region.c
> new file mode 100644
> index 000000000000..57668dd82d71
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/pmem_region.c
> @@ -0,0 +1,191 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +static void cxl_pmem_region_release(struct device *dev)
> +{
> + struct cxl_pmem_region *cxlr_pmem = to_cxl_pmem_region(dev);
> + int i;
> +
> + for (i = 0; i < cxlr_pmem->nr_mappings; i++) {
> + struct cxl_memdev *cxlmd = cxlr_pmem->mapping[i].cxlmd;
> +
> + put_device(&cxlmd->dev);
> + }
> +
> + kfree(cxlr_pmem);
> +}
> +
> +static const struct attribute_group *cxl_pmem_region_attribute_groups[] = {
> + &cxl_base_attribute_group,
> + NULL,
> +};
> +
> +const struct device_type cxl_pmem_region_type = {
> + .name = "cxl_pmem_region",
> + .release = cxl_pmem_region_release,
> + .groups = cxl_pmem_region_attribute_groups,
> +};
> +bool is_cxl_pmem_region(struct device *dev)
> +{
> + return dev->type == &cxl_pmem_region_type;
> +}
> +EXPORT_SYMBOL_NS_GPL(is_cxl_pmem_region, "CXL");
> +
> +struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
> +{
> + if (dev_WARN_ONCE(dev, !is_cxl_pmem_region(dev),
> + "not a cxl_pmem_region device\n"))
> + return NULL;
> + return container_of(dev, struct cxl_pmem_region, dev);
> +}
> +EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, "CXL");
Missing blank line above.
> +static struct lock_class_key cxl_pmem_region_key;
> +
> +static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> + struct cxl_nvdimm_bridge *cxl_nvb;
> + struct device *dev;
> + int i;
> +
> + guard(rwsem_read)(&cxl_rwsem.region);
> + if (p->state != CXL_CONFIG_COMMIT)
> + return -ENXIO;
> +
> + struct cxl_pmem_region *cxlr_pmem __free(kfree) =
> + kzalloc(struct_size(cxlr_pmem, mapping, p->nr_targets), GFP_KERNEL);
> + if (!cxlr_pmem)
> + return -ENOMEM;
> +
> + cxlr_pmem->hpa_range.start = p->res->start;
> + cxlr_pmem->hpa_range.end = p->res->end;
> +
> + /* Snapshot the region configuration underneath the cxl_rwsem.region */
> + cxlr_pmem->nr_mappings = p->nr_targets;
> + for (i = 0; i < p->nr_targets; i++) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct cxl_pmem_region_mapping *m = &cxlr_pmem->mapping[i];
> +
> + /*
> + * Regions never span CXL root devices, so by definition the
> + * bridge for one device is the same for all.
> + */
> + if (i == 0) {
> + cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
> + if (!cxl_nvb)
> + return -ENODEV;
> + cxlr->cxl_nvb = cxl_nvb;
> + }
> + m->cxlmd = cxlmd;
> + get_device(&cxlmd->dev);
> + m->start = cxled->dpa_res->start;
> + m->size = resource_size(cxled->dpa_res);
> + m->position = i;
> + }
> +
> + dev = &cxlr_pmem->dev;
> + device_initialize(dev);
> + lockdep_set_class(&dev->mutex, &cxl_pmem_region_key);
> + device_set_pm_not_required(dev);
> + dev->parent = &cxlr->dev;
> + dev->bus = &cxl_bus_type;
> + dev->type = &cxl_pmem_region_type;
> + cxlr_pmem->cxlr = cxlr;
> + cxlr->cxlr_pmem = no_free_ptr(cxlr_pmem);
> +
> + return 0;
> +}
> +
> +static void cxlr_pmem_unregister(void *_cxlr_pmem)
> +{
> + struct cxl_pmem_region *cxlr_pmem = _cxlr_pmem;
> + struct cxl_region *cxlr = cxlr_pmem->cxlr;
> + struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
> +
> + /*
> + * Either the bridge is in ->remove() context under the device_lock(),
> + * or cxlr_release_nvdimm() is cancelling the bridge's release action
> + * for @cxlr_pmem and doing it itself (while manually holding the bridge
> + * lock).
> + */
> + device_lock_assert(&cxl_nvb->dev);
> + cxlr->cxlr_pmem = NULL;
> + cxlr_pmem->cxlr = NULL;
> + device_unregister(&cxlr_pmem->dev);
> +}
> +
> +static void cxlr_release_nvdimm(void *_cxlr)
> +{
> + struct cxl_region *cxlr = _cxlr;
> + struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
> +
> + scoped_guard(device, &cxl_nvb->dev) {
> + if (cxlr->cxlr_pmem)
> + devm_release_action(&cxl_nvb->dev, cxlr_pmem_unregister,
> + cxlr->cxlr_pmem);
> + }
> + cxlr->cxl_nvb = NULL;
> + put_device(&cxl_nvb->dev);
> +}
> +
> +/**
> + * devm_cxl_add_pmem_region() - add a cxl_region-to-nd_region bridge
> + * @cxlr: parent CXL region for this pmem region bridge device
> + *
> + * Return: 0 on success negative error code on failure.
> + */
> +int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
> +{
> + struct cxl_pmem_region *cxlr_pmem;
> + struct cxl_nvdimm_bridge *cxl_nvb;
> + struct device *dev;
> + int rc;
> +
> + rc = cxl_pmem_region_alloc(cxlr);
> + if (rc)
> + return rc;
> + cxlr_pmem = cxlr->cxlr_pmem;
> + cxl_nvb = cxlr->cxl_nvb;
> +
> + dev = &cxlr_pmem->dev;
> + rc = dev_set_name(dev, "pmem_region%d", cxlr->id);
> + if (rc)
> + goto err;
> +
> + rc = device_add(dev);
> + if (rc)
> + goto err;
> +
> + dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
> + dev_name(dev));
> +
> + scoped_guard(device, &cxl_nvb->dev) {
> + if (cxl_nvb->dev.driver)
> + rc = devm_add_action_or_reset(&cxl_nvb->dev,
> + cxlr_pmem_unregister,
> + cxlr_pmem);
> + else
> + rc = -ENXIO;
> + }
> +
> + if (rc)
> + goto err_bridge;
> +
> + /* @cxlr carries a reference on @cxl_nvb until cxlr_release_nvdimm */
> + return devm_add_action_or_reset(&cxlr->dev, cxlr_release_nvdimm, cxlr);
> +
> +err:
> + put_device(dev);
> +err_bridge:
> + put_device(&cxl_nvb->dev);
> + cxlr->cxl_nvb = NULL;
> + return rc;
> +}
> +
> +
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index eeab091f043a..85c20a09246d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -642,6 +642,9 @@ static ssize_t ctrl_show(struct device *dev, struct device_attribute *attr,
> case CXL_MEMCTRL_SYSRAM:
> desc = "sysram";
> break;
> + case CXL_MEMCTRL_PMEM:
> + desc = "pmem";
> + break;
> default:
> desc = "";
> break;
> @@ -661,6 +664,10 @@ static ssize_t ctrl_store(struct device *dev, struct device_attribute *attr,
> if ((rc = ACQUIRE_ERR(rwsem_write_kill, &rwsem)))
> return rc;
>
> + /* PMEM only has one controller - the pmem controller */
> + if (cxlr->mode == CXL_PARTMODE_PMEM)
> + return -EBUSY;
> +
> if (p->state >= CXL_CONFIG_COMMIT)
> return -EBUSY;
>
> @@ -2648,7 +2655,11 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
> return cxlr;
> cxlr->mode = mode;
> cxlr->type = type;
> - cxlr->memctrl = CXL_MEMCTRL_NONE;
> +
> + if (mode == CXL_PARTMODE_PMEM)
> + cxlr->memctrl = CXL_MEMCTRL_PMEM;
> + else
> + cxlr->memctrl = CXL_MEMCTRL_NONE;
>
> dev = &cxlr->dev;
> rc = dev_set_name(dev, "region%d", id);
> @@ -2797,46 +2808,6 @@ static ssize_t delete_region_store(struct device *dev,
> }
> DEVICE_ATTR_WO(delete_region);
>
> -static void cxl_pmem_region_release(struct device *dev)
> -{
> - struct cxl_pmem_region *cxlr_pmem = to_cxl_pmem_region(dev);
> - int i;
> -
> - for (i = 0; i < cxlr_pmem->nr_mappings; i++) {
> - struct cxl_memdev *cxlmd = cxlr_pmem->mapping[i].cxlmd;
> -
> - put_device(&cxlmd->dev);
> - }
> -
> - kfree(cxlr_pmem);
> -}
> -
> -static const struct attribute_group *cxl_pmem_region_attribute_groups[] = {
> - &cxl_base_attribute_group,
> - NULL,
> -};
> -
> -const struct device_type cxl_pmem_region_type = {
> - .name = "cxl_pmem_region",
> - .release = cxl_pmem_region_release,
> - .groups = cxl_pmem_region_attribute_groups,
> -};
> -
> -bool is_cxl_pmem_region(struct device *dev)
> -{
> - return dev->type == &cxl_pmem_region_type;
> -}
> -EXPORT_SYMBOL_NS_GPL(is_cxl_pmem_region, "CXL");
> -
> -struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
> -{
> - if (dev_WARN_ONCE(dev, !is_cxl_pmem_region(dev),
> - "not a cxl_pmem_region device\n"))
> - return NULL;
> - return container_of(dev, struct cxl_pmem_region, dev);
> -}
> -EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, "CXL");
> -
> struct cxl_poison_context {
> struct cxl_port *port;
> int part;
> @@ -3268,64 +3239,6 @@ static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset,
> return -ENXIO;
> }
>
> -static struct lock_class_key cxl_pmem_region_key;
> -
> -static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
> -{
> - struct cxl_region_params *p = &cxlr->params;
> - struct cxl_nvdimm_bridge *cxl_nvb;
> - struct device *dev;
> - int i;
> -
> - guard(rwsem_read)(&cxl_rwsem.region);
> - if (p->state != CXL_CONFIG_COMMIT)
> - return -ENXIO;
> -
> - struct cxl_pmem_region *cxlr_pmem __free(kfree) =
> - kzalloc(struct_size(cxlr_pmem, mapping, p->nr_targets), GFP_KERNEL);
> - if (!cxlr_pmem)
> - return -ENOMEM;
> -
> - cxlr_pmem->hpa_range.start = p->res->start;
> - cxlr_pmem->hpa_range.end = p->res->end;
> -
> - /* Snapshot the region configuration underneath the cxl_rwsem.region */
> - cxlr_pmem->nr_mappings = p->nr_targets;
> - for (i = 0; i < p->nr_targets; i++) {
> - struct cxl_endpoint_decoder *cxled = p->targets[i];
> - struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct cxl_pmem_region_mapping *m = &cxlr_pmem->mapping[i];
> -
> - /*
> - * Regions never span CXL root devices, so by definition the
> - * bridge for one device is the same for all.
> - */
> - if (i == 0) {
> - cxl_nvb = cxl_find_nvdimm_bridge(cxlmd->endpoint);
> - if (!cxl_nvb)
> - return -ENODEV;
> - cxlr->cxl_nvb = cxl_nvb;
> - }
> - m->cxlmd = cxlmd;
> - get_device(&cxlmd->dev);
> - m->start = cxled->dpa_res->start;
> - m->size = resource_size(cxled->dpa_res);
> - m->position = i;
> - }
> -
> - dev = &cxlr_pmem->dev;
> - device_initialize(dev);
> - lockdep_set_class(&dev->mutex, &cxl_pmem_region_key);
> - device_set_pm_not_required(dev);
> - dev->parent = &cxlr->dev;
> - dev->bus = &cxl_bus_type;
> - dev->type = &cxl_pmem_region_type;
> - cxlr_pmem->cxlr = cxlr;
> - cxlr->cxlr_pmem = no_free_ptr(cxlr_pmem);
> -
> - return 0;
> -}
> -
> static void cxl_dax_region_release(struct device *dev)
> {
> struct cxl_dax_region *cxlr_dax = to_cxl_dax_region(dev);
> @@ -3358,92 +3271,6 @@ struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
> }
> EXPORT_SYMBOL_NS_GPL(to_cxl_dax_region, "CXL");
>
> -static void cxlr_pmem_unregister(void *_cxlr_pmem)
> -{
> - struct cxl_pmem_region *cxlr_pmem = _cxlr_pmem;
> - struct cxl_region *cxlr = cxlr_pmem->cxlr;
> - struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
> -
> - /*
> - * Either the bridge is in ->remove() context under the device_lock(),
> - * or cxlr_release_nvdimm() is cancelling the bridge's release action
> - * for @cxlr_pmem and doing it itself (while manually holding the bridge
> - * lock).
> - */
> - device_lock_assert(&cxl_nvb->dev);
> - cxlr->cxlr_pmem = NULL;
> - cxlr_pmem->cxlr = NULL;
> - device_unregister(&cxlr_pmem->dev);
> -}
> -
> -static void cxlr_release_nvdimm(void *_cxlr)
> -{
> - struct cxl_region *cxlr = _cxlr;
> - struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
> -
> - scoped_guard(device, &cxl_nvb->dev) {
> - if (cxlr->cxlr_pmem)
> - devm_release_action(&cxl_nvb->dev, cxlr_pmem_unregister,
> - cxlr->cxlr_pmem);
> - }
> - cxlr->cxl_nvb = NULL;
> - put_device(&cxl_nvb->dev);
> -}
> -
> -/**
> - * devm_cxl_add_pmem_region() - add a cxl_region-to-nd_region bridge
> - * @cxlr: parent CXL region for this pmem region bridge device
> - *
> - * Return: 0 on success negative error code on failure.
> - */
> -static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
> -{
> - struct cxl_pmem_region *cxlr_pmem;
> - struct cxl_nvdimm_bridge *cxl_nvb;
> - struct device *dev;
> - int rc;
> -
> - rc = cxl_pmem_region_alloc(cxlr);
> - if (rc)
> - return rc;
> - cxlr_pmem = cxlr->cxlr_pmem;
> - cxl_nvb = cxlr->cxl_nvb;
> -
> - dev = &cxlr_pmem->dev;
> - rc = dev_set_name(dev, "pmem_region%d", cxlr->id);
> - if (rc)
> - goto err;
> -
> - rc = device_add(dev);
> - if (rc)
> - goto err;
> -
> - dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
> - dev_name(dev));
> -
> - scoped_guard(device, &cxl_nvb->dev) {
> - if (cxl_nvb->dev.driver)
> - rc = devm_add_action_or_reset(&cxl_nvb->dev,
> - cxlr_pmem_unregister,
> - cxlr_pmem);
> - else
> - rc = -ENXIO;
> - }
> -
> - if (rc)
> - goto err_bridge;
> -
> - /* @cxlr carries a reference on @cxl_nvb until cxlr_release_nvdimm */
> - return devm_add_action_or_reset(&cxlr->dev, cxlr_release_nvdimm, cxlr);
> -
> -err:
> - put_device(dev);
> -err_bridge:
> - put_device(&cxl_nvb->dev);
> - cxlr->cxl_nvb = NULL;
> - return rc;
> -}
> -
> static int match_decoder_by_range(struct device *dev, const void *data)
> {
> const struct range *r1, *r2 = data;
> @@ -3929,26 +3756,18 @@ static int cxl_region_probe(struct device *dev)
> return rc;
> }
>
> - switch (cxlr->mode) {
> - case CXL_PARTMODE_PMEM:
> - rc = devm_cxl_region_edac_register(cxlr);
> - if (rc)
> - dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
> - cxlr->id);
> -
> - return devm_cxl_add_pmem_region(cxlr);
> - case CXL_PARTMODE_RAM:
> - rc = devm_cxl_region_edac_register(cxlr);
> - if (rc)
> - dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
> - cxlr->id);
> -
> - return cxl_enable_memctrl(cxlr);
> - default:
> + if (cxlr->mode > CXL_PARTMODE_PMEM) {
> dev_dbg(&cxlr->dev, "unsupported region mode: %d\n",
> cxlr->mode);
> return -ENXIO;
> }
> +
> + rc = devm_cxl_region_edac_register(cxlr);
> + if (rc)
> + dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
> + cxlr->id);
> +
> + return cxl_enable_memctrl(cxlr);
> }
>
> static struct cxl_driver cxl_region_driver = {
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index bb4f877b4e8f..c69d27a2e97d 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -509,12 +509,14 @@ enum cxl_partition_mode {
> * Auto - either BIOS-configured as SysRAM, or default to DAX
> * DAX - creates a dax_region controller for the cxl_region
> * SYSRAM - hotplugs the region directly as System RAM
> + * PMEM - persistent memory controller (nvdimm)
> */
> enum cxl_memctrl_mode {
> CXL_MEMCTRL_NONE,
> CXL_MEMCTRL_AUTO,
> CXL_MEMCTRL_DAX,
> CXL_MEMCTRL_SYSRAM,
> + CXL_MEMCTRL_PMEM,
> };
>
> /*
* Re: [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
2026-01-12 16:35 ` [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options Gregory Price
@ 2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 23:05 ` Gregory Price
0 siblings, 1 reply; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-12 21:10 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On 1/12/2026 10:35 AM, Gregory Price wrote:
> To give users the option to have the auto-behavior of memory to default
> to SYSRAM, provide a switch. The default is still recommended to be DAX
> in case of multiple devices being added to the system, but this provides
> simpler systems a path to use the sysram controller for systems already
> configured with auto-regions.
>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/Kconfig | 32 ++++++++++++++++++++++++++++++
> drivers/cxl/core/memctrl/memctrl.c | 2 ++
> drivers/cxl/cxl.h | 2 +-
> 3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 48b7314afdb8..5aed1524f8f1 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -211,6 +211,38 @@ config CXL_REGION
>
> If unsure say 'y'
>
> +choice
> + prompt "CXL Region Auto Control Mode"
> + depends on CXL_REGION
> + default CXL_REGION_CTRL_AUTO_DAX
> + help
> + Select the default controller for CXL regions when ctrl mode is
> + set to 'auto'. This determines how CXL memory regions are exposed
> + to the system when no explicit control mode is specified.
> +
> +config CXL_REGION_CTRL_AUTO_DAX
This should probably be renamed to CXL_REGION_CTRL_DAX since only DAX is mentioned.
> + bool "DAX"
> + help
> + When a CXL region's control mode is 'auto', create a DAX region
> + controller. This allows fine-grained control over the memory region
> + through the DAX subsystem, and the region can later be converted to
> + System RAM via daxctl.
> +
> + This is the default and recommended option for most use cases.
If you remove the 'auto' mode earlier on, then you can just drop the first sentence here.
I'd also add a note about when creating a DAX region can fail (i.e. BIOS has already
set up and onlined the memory).
> +
> +config CXL_REGION_CTRL_AUTO_SYSRAM
> + bool "System RAM"
> + help
> + When a CXL region's control mode is 'auto', hotplug the region
> + directly as System RAM. This makes the CXL memory immediately
> + available to the kernel's memory allocator without requiring
> + additional userspace configuration.
> +
> + Select this if you want CXL memory to be automatically available
> + as regular system memory.
> +
> +endchoice
> +
> config CXL_REGION_INVALIDATION_TEST
> bool "CXL: Region Cache Management Bypass (TEST)"
> depends on CXL_REGION
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> index 1b661465bdeb..cb6c37f4c0ee 100644
> --- a/drivers/cxl/core/memctrl/memctrl.c
> +++ b/drivers/cxl/core/memctrl/memctrl.c
> @@ -31,6 +31,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
> p->res->start, p->res->end, cxlr,
> is_system_ram) > 0)
> return 0;
> + if (IS_ENABLED(CONFIG_CXL_REGION_CTRL_AUTO_SYSRAM))
> + return devm_cxl_add_sysram_region(cxlr);
> return devm_cxl_add_dax_region(cxlr);
> case CXL_MEMCTRL_DAX:
> return devm_cxl_add_dax_region(cxlr);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index c69d27a2e97d..1dae6fe4f70c 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -506,7 +506,7 @@ enum cxl_partition_mode {
> /*
> * Memory Controller modes:
> * None - No controller selected
> - * Auto - either BIOS-configured as SysRAM, or default to DAX
> + * Auto - Auto-select based on BIOS, boot, and build configs.
> * DAX - creates a dax_region controller for the cxl_region
> * SYSRAM - hotplugs the region directly as System RAM
> * PMEM - persistent memory controller (nvdimm)
* Re: [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options
2026-01-12 16:35 ` [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options Gregory Price
@ 2026-01-12 21:11 ` Cheatham, Benjamin
2026-01-12 23:07 ` Gregory Price
0 siblings, 1 reply; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-12 21:11 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On 1/12/2026 10:35 AM, Gregory Price wrote:
> DEFAULT_OFFLINE: Blocks will be offline after being created.
> DEFAULT_ONLINE: Blocks will be onlined in ZONE_MOVABLE
> DEFAULT_ONLINE_NORMAL: Blocks will be onlined in ZONE_NORMAL.
>
> This prevents users from having to use the MHP auto-online build config,
> which may cause misbehaviors with other devices hotplugging memory.
Isn't the MHP auto-online build config still used in some flows? A quick note on
when that option will still be used would be nice.
>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/Kconfig | 40 ++++++++++
> drivers/cxl/core/memctrl/sysram_region.c | 94 ++++++++++++++++++------
> 2 files changed, 110 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 5aed1524f8f1..3e087c9d5ea7 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -243,6 +243,46 @@ config CXL_REGION_CTRL_AUTO_SYSRAM
>
> endchoice
>
> +choice
> + prompt "CXL SYSRAM Auto Online Mode"
> + depends on CXL_REGION
> + default CXL_REGION_SYSRAM_DEFAULT_OFFLINE
> + help
> + Select whether CXL memory hotplugged as System RAM should be
> + automatically onlined and in which zone. This applies when the
> + region controller is set to SYSRAM (either explicitly or via
> + the auto control mode).
> +
> +config CXL_REGION_SYSRAM_DEFAULT_OFFLINE
> + bool "Offline"
> + help
> + Leave the memory offline after hotplug. The memory must be
> + manually onlined via sysfs or other mechanisms before it can
> + be used by the system.
> +
> + This is the default and most conservative option.
> +
> +config CXL_REGION_SYSRAM_DEFAULT_ONLINE
> + bool "Online (Movable)"
> + help
> + Automatically online the memory as ZONE_MOVABLE after hotplug.
> + ZONE_MOVABLE memory can be used for user pages and is eligible
> + for memory hotremove, but cannot be used for kernel allocations.
> +
> + Select this for memory that may need to be hotremoved later.
> +
> +config CXL_REGION_SYSRAM_DEFAULT_ONLINE_NORMAL
> + bool "Online (Normal)"
> + help
> + Automatically online the memory as ZONE_NORMAL after hotplug.
> + ZONE_NORMAL memory can be used for all allocations including
> + kernel allocations, but may not be hotremovable.
> +
> + Select this for maximum memory utilization when hotremove is
> + not required.
> +
> +endchoice
> +
> config CXL_REGION_INVALIDATION_TEST
> bool "CXL: Region Cache Management Bypass (TEST)"
> depends on CXL_REGION
> diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
> index a7570c8a54e1..2e2d9b59a725 100644
> --- a/drivers/cxl/core/memctrl/sysram_region.c
> +++ b/drivers/cxl/core/memctrl/sysram_region.c
> @@ -129,12 +129,69 @@ static int offline_memory_block_cb(struct memory_block *mem, void *arg)
> return *rc;
> }
>
> +static int cxl_sysram_online_memory(struct range *range, int online_type)
> +{
> + struct online_memory_cb_arg cb_arg = {
> + .online_type = online_type,
> + .rc = 0,
> + };
> + int rc;
> +
> + rc = walk_memory_blocks(range->start, range_len(range),
> + &cb_arg, online_memory_block_cb);
> + if (!rc)
> + rc = cb_arg.rc;
> +
> + return rc;
> +}
> +
> +static int cxl_sysram_offline_memory(struct range *range)
> +{
> + int offline_rc = 0;
> + int rc;
> +
> + rc = walk_memory_blocks(range->start, range_len(range),
> + &offline_rc, offline_memory_block_cb);
> + if (!rc)
> + rc = offline_rc;
> +
> + return rc;
> +}
I think these two helpers can be moved into patch 2/6, where the 'store' attribute was defined. I don't
see anything that requires them to be in this patch, and it would help reduce churn.
> +
> +static int cxl_sysram_auto_online(struct device *dev, struct range *range)
> +{
> + int online_type;
> + int rc;
> +
> + if (IS_ENABLED(CONFIG_CXL_REGION_SYSRAM_DEFAULT_OFFLINE))
> + return 0;
> +
> + if (IS_ENABLED(CONFIG_CXL_REGION_SYSRAM_DEFAULT_ONLINE))
> + online_type = MMOP_ONLINE_MOVABLE;
> + else if (IS_ENABLED(CONFIG_CXL_REGION_SYSRAM_DEFAULT_ONLINE_NORMAL))
> + online_type = MMOP_ONLINE_KERNEL;
> + else
> + online_type = MMOP_ONLINE_MOVABLE;
> +
> + rc = lock_device_hotplug_sysfs();
> + if (rc)
> + return rc;
> +
> + rc = cxl_sysram_online_memory(range, online_type);
> +
> + unlock_device_hotplug();
> +
> + if (rc)
> + dev_warn(dev, "auto-online failed: %d\n", rc);
> +
> + return rc;
> +}
> +
> static ssize_t state_store(struct device *dev,
> struct device_attribute *attr,
> const char *buf, size_t len)
> {
> struct cxl_region *cxlr = to_cxl_region(dev);
> - struct online_memory_cb_arg cb_arg;
> struct range range;
> int rc;
>
> @@ -149,30 +206,14 @@ static ssize_t state_store(struct device *dev,
> if (rc)
> return rc;
>
> - if (sysfs_streq(buf, "online")) {
> - cb_arg.online_type = MMOP_ONLINE_MOVABLE;
> - cb_arg.rc = 0;
> - rc = walk_memory_blocks(range.start, range_len(&range),
> - &cb_arg, online_memory_block_cb);
> - if (!rc)
> - rc = cb_arg.rc;
> - } else if (sysfs_streq(buf, "online_normal")) {
> - cb_arg.online_type = MMOP_ONLINE;
> - cb_arg.rc = 0;
> - rc = walk_memory_blocks(range.start, range_len(&range),
> - &cb_arg, online_memory_block_cb);
> - if (!rc)
> - rc = cb_arg.rc;
> - } else if (sysfs_streq(buf, "offline")) {
> - int offline_rc = 0;
> -
> - rc = walk_memory_blocks(range.start, range_len(&range),
> - &offline_rc, offline_memory_block_cb);
> - if (!rc)
> - rc = offline_rc;
> - } else {
> + if (sysfs_streq(buf, "online"))
> + rc = cxl_sysram_online_memory(&range, MMOP_ONLINE_MOVABLE);
> + else if (sysfs_streq(buf, "online_normal"))
> + rc = cxl_sysram_online_memory(&range, MMOP_ONLINE);
> + else if (sysfs_streq(buf, "offline"))
> + rc = cxl_sysram_offline_memory(&range);
> + else
> rc = -EINVAL;
> - }
>
> unlock_device_hotplug();
>
> @@ -332,6 +373,10 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> dev_dbg(dev, "%s: added %llu bytes as System RAM\n", dev_name(dev),
> (unsigned long long)total_len);
>
> + rc = cxl_sysram_auto_online(dev, &range);
> + if (rc)
> + goto err_auto_online;
> +
> dev_set_drvdata(dev, data);
> rc = devm_device_add_group(dev, &cxl_sysram_region_group);
> if (rc)
> @@ -341,6 +386,7 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
>
> err_add_group:
> dev_set_drvdata(dev, NULL);
> +err_auto_online:
> /* if this fails, memory cannot be removed from the system until reboot */
> remove_memory(range.start, range_len(&range));
> err_add_memory:
* Re: [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only
2026-01-12 16:35 ` [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only Gregory Price
@ 2026-01-12 21:11 ` Cheatham, Benjamin
2026-01-12 23:14 ` Gregory Price
0 siblings, 1 reply; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-12 21:11 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams,
David Hildenbrand, Hannes Reinecke
On 1/12/2026 10:35 AM, Gregory Price wrote:
> If state is set to online (default to ZONE_MOVABLE), the user intends
> for this memory to either refuse non-movable allocations, and/or intends
> to preserve the hot-unpluggability of this memory. However, any admin
> can write `offline` and `online` to the memory block controller and
> bring that memory online in ZONE_NORMAL.
Is it the expectation that the user will never want to change the zone from
MOVABLE to NORMAL? I can't think of a reason someone would want to off the top
of my head, but I also can't think of a reason to restrict it either.
>
> Register a memory_notify callback that disallows onlining the block into
> ZONE_NORMAL if the default state of the controller is ZONE_MOVABLE.
>
> If an actor attempts to online the block into ZONE_NORMAL, it will fail,
> but if it attempts to online into either NORMAL or MOVABLE, only MOVABLE
> will be allowed and it will succeed.
I'm not sure you need this paragraph. I think it's a logical conclusion of the above
that if someone attempts to online the memory as NORMAL or MOVABLE it'll only be onlined
as MOVABLE.
>
> Suggested-by: David Hildenbrand <david@kernel.org>
> Suggested-by: Hannes Reinecke <hare@suse.de>
> Link: https://lore.kernel.org/linux-mm/39533aa8-ca78-41a8-b005-9202ce53e3ae@kernel.org/
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/memctrl/sysram_region.c | 138 +++++++++++++++++++++--
> 1 file changed, 127 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
> index 2e2d9b59a725..71e39d725dc5 100644
> --- a/drivers/cxl/core/memctrl/sysram_region.c
> +++ b/drivers/cxl/core/memctrl/sysram_region.c
> @@ -2,6 +2,7 @@
> /* Copyright(c) 2026 Meta Inc. All rights reserved. */
> #include <linux/memremap.h>
> #include <linux/memory.h>
> +#include <linux/mmzone.h>
> #include <linux/module.h>
> #include <linux/device.h>
> #include <linux/slab.h>
> @@ -23,6 +24,14 @@ struct cxl_sysram_data {
> const char *res_name;
> int mgid;
> struct resource *res;
> + struct range range;
> + struct notifier_block memory_notifier;
> + /*
> + * Last online type requested by user via state sysfs or auto-online.
> + * Used to enforce zone consistency when memory blocks are onlined.
> + * MMOP_OFFLINE means no online preference has been set yet.
> + */
> + int last_online_type;
> };
>
> static DEFINE_MUTEX(cxl_memory_type_lock);
> @@ -158,7 +167,58 @@ static int cxl_sysram_offline_memory(struct range *range)
> return rc;
> }
>
> -static int cxl_sysram_auto_online(struct device *dev, struct range *range)
> +/*
> + * Memory notifier callback to enforce zone consistency.
> + *
> + * When the user (or auto-online) requests memory to be onlined into
> + * ZONE_MOVABLE, reject any subsequent attempts to online memory blocks
> + * from this region into a different zone (e.g., ZONE_NORMAL). This prevents
> + * accidental zone mixing which could lead to memory fragmentation and
> + * offlining failures.
> + */
> +static int cxl_sysram_memory_notify_cb(struct notifier_block *nb,
> + unsigned long action, void *arg)
> +{
> + struct cxl_sysram_data *data = container_of(nb, struct cxl_sysram_data,
> + memory_notifier);
> + struct memory_notify *mhp = arg;
> + unsigned long start_phys = PFN_PHYS(mhp->start_pfn);
> + unsigned long size = PFN_PHYS(mhp->nr_pages);
> + struct page *page;
> +
> + if (action != MEM_GOING_ONLINE)
> + return NOTIFY_DONE;
> +
> + /* Check if this memory block overlaps with our region */
> + if (start_phys + size <= data->range.start ||
> + start_phys > data->range.end)
> + return NOTIFY_DONE;
> +
> + /*
> + * If no online preference has been set (MMOP_OFFLINE), allow any zone.
> + * Also allow if the preference wasn't for ZONE_MOVABLE.
> + */
> + if (data->last_online_type != MMOP_ONLINE_MOVABLE)
> + return NOTIFY_DONE;
> +
> + /*
> + * The zone has already been assigned to the pages at this point
> + * via move_pfn_range_to_zone() before MEM_GOING_ONLINE is sent.
> + * Check if it's ZONE_MOVABLE as expected.
> + */
> + page = pfn_to_page(mhp->start_pfn);
> +
> + if (!is_zone_movable_page(page)) {
> + pr_warn("CXL sysram: rejecting online to non-movable zone for range %#lx-%#lx (expected ZONE_MOVABLE)\n",
> + start_phys, start_phys + size - 1);
> + return NOTIFY_BAD;
> + }
> +
> + return NOTIFY_OK;
> +}
> +
> +static int cxl_sysram_auto_online(struct device *dev, struct range *range,
> + struct cxl_sysram_data *data)
> {
> int online_type;
> int rc;
> @@ -173,6 +233,9 @@ static int cxl_sysram_auto_online(struct device *dev, struct range *range)
> else
> online_type = MMOP_ONLINE_MOVABLE;
>
> + /* Record the auto-online type for zone enforcement */
> + data->last_online_type = online_type;
> +
> rc = lock_device_hotplug_sysfs();
> if (rc)
> return rc;
> @@ -187,17 +250,43 @@ static int cxl_sysram_auto_online(struct device *dev, struct range *range)
> return rc;
> }
>
> +static ssize_t state_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct cxl_sysram_data *data;
> +
> + data = dev_get_drvdata(dev);
> + if (!data)
> + return -ENODEV;
> +
> + switch (data->last_online_type) {
> + case MMOP_ONLINE_MOVABLE:
> + return sysfs_emit(buf, "online\n");
> + case MMOP_ONLINE_KERNEL:
> + return sysfs_emit(buf, "online_normal\n");
> + case MMOP_OFFLINE:
> + default:
You're missing the MMOP_ONLINE case. In that case the memory would be reported as "offline", which
I doubt is the intention.
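Something along these lines is what I'd expect, sketched as standalone C
rather than a patch. The SK_* enum is a hypothetical stand-in for the MMOP_*
values, and mapping the plain "online" type to "online_normal" is my
assumption, based on state_store() recording MMOP_ONLINE for that string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's MMOP_* online types. */
enum sketch_online_type { SK_OFFLINE, SK_ONLINE, SK_ONLINE_KERNEL, SK_ONLINE_MOVABLE };

/* String that a state_show()-like handler would emit for @type. */
static const char *sysram_state_name(enum sketch_online_type type)
{
	switch (type) {
	case SK_ONLINE_MOVABLE:
		return "online";
	case SK_ONLINE:		/* recorded by the store path for "online_normal" */
	case SK_ONLINE_KERNEL:	/* recorded by the auto-online path */
		return "online_normal";
	case SK_OFFLINE:
	default:
		return "offline";
	}
}

/* True if a stored type reads back as the string the user wrote. */
static int sysram_state_round_trips(enum sketch_online_type stored,
				    const char *written)
{
	assert(written != NULL);
	return strcmp(sysram_state_name(stored), written) == 0;
}
```

The point is just that every type the store or auto-online paths can record
has an explicit case, so nothing falls through to "offline" by accident.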
Thanks,
Ben
> + return sysfs_emit(buf, "offline\n");
> + }
> +}
> +
> static ssize_t state_store(struct device *dev,
> struct device_attribute *attr,
> const char *buf, size_t len)
> {
> struct cxl_region *cxlr = to_cxl_region(dev);
> + struct cxl_sysram_data *data;
> struct range range;
> + int online_type = MMOP_OFFLINE;
> int rc;
>
> if (!cxlr)
> return -ENODEV;
>
> + data = dev_get_drvdata(dev);
> + if (!data)
> + return -ENODEV;
> +
> rc = cxl_sysram_range(cxlr, &range);
> if (rc)
> return rc;
> @@ -206,23 +295,30 @@ static ssize_t state_store(struct device *dev,
> if (rc)
> return rc;
>
> - if (sysfs_streq(buf, "online"))
> - rc = cxl_sysram_online_memory(&range, MMOP_ONLINE_MOVABLE);
> - else if (sysfs_streq(buf, "online_normal"))
> - rc = cxl_sysram_online_memory(&range, MMOP_ONLINE);
> - else if (sysfs_streq(buf, "offline"))
> + if (sysfs_streq(buf, "online")) {
> + online_type = MMOP_ONLINE_MOVABLE;
> + rc = cxl_sysram_online_memory(&range, online_type);
> + } else if (sysfs_streq(buf, "online_normal")) {
> + online_type = MMOP_ONLINE;
> + rc = cxl_sysram_online_memory(&range, online_type);
> + } else if (sysfs_streq(buf, "offline")) {
> rc = cxl_sysram_offline_memory(&range);
> - else
> + } else {
> rc = -EINVAL;
> + }
>
> unlock_device_hotplug();
>
> if (rc)
> return rc;
>
> + /* Record the online type for zone enforcement on success */
> + if (online_type != MMOP_OFFLINE)
> + data->last_online_type = online_type;
> +
> return len;
> }
> -static DEVICE_ATTR_WO(state);
> +static DEVICE_ATTR_RW(state);
>
> static ssize_t hotplug_store(struct device *dev,
> struct device_attribute *attr,
> @@ -274,6 +370,11 @@ static void cxl_sysram_unregister(void *_data)
> .end = data->res->end
> };
>
> + unregister_memory_notifier(&data->memory_notifier);
> +
> + range.start = data->res->start;
> + range.end = data->res->end;
> +
> /* We have one shot for removal, otherwise it's stuck til reboot */
> if (!offline_and_remove_memory(range.start, range_len(&range))) {
> remove_resource(data->res);
> @@ -334,6 +435,10 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> goto err_data;
> }
>
> + /* Initialize range and online type tracking */
> + data->range = range;
> + data->last_online_type = MMOP_OFFLINE;
> +
> data->res_name = kstrdup(dev_name(dev), GFP_KERNEL);
> if (!data->res_name) {
> rc = -ENOMEM;
> @@ -373,11 +478,20 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> dev_dbg(dev, "%s: added %llu bytes as System RAM\n", dev_name(dev),
> (unsigned long long)total_len);
>
> - rc = cxl_sysram_auto_online(dev, &range);
> + /* Set drvdata early so auto_online can access it */
> + dev_set_drvdata(dev, data);
> +
> + /* Register memory notifier for zone enforcement */
> + data->memory_notifier.notifier_call = cxl_sysram_memory_notify_cb;
> + data->memory_notifier.priority = CXL_CALLBACK_PRI;
> + rc = register_memory_notifier(&data->memory_notifier);
> + if (rc)
> + goto err_notifier;
> +
> + rc = cxl_sysram_auto_online(dev, &range, data);
> if (rc)
> goto err_auto_online;
>
> - dev_set_drvdata(dev, data);
> rc = devm_device_add_group(dev, &cxl_sysram_region_group);
> if (rc)
> goto err_add_group;
> @@ -385,9 +499,11 @@ int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> return devm_add_action_or_reset(dev, cxl_sysram_unregister, data);
>
> err_add_group:
> - dev_set_drvdata(dev, NULL);
> err_auto_online:
> /* if this fails, memory cannot be removed from the system until reboot */
> + unregister_memory_notifier(&data->memory_notifier);
> +err_notifier:
> + dev_set_drvdata(dev, NULL);
> remove_memory(range.start, range_len(&range));
> err_add_memory:
> remove_resource(res);
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-12 20:59 ` dan.j.williams
@ 2026-01-12 22:25 ` Gregory Price
2026-01-13 18:00 ` Dave Jiang
1 sibling, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 22:25 UTC (permalink / raw)
To: dan.j.williams
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny
On Mon, Jan 12, 2026 at 12:59:44PM -0800, dan.j.williams@intel.com wrote:
> Gregory Price wrote:
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -42,6 +42,8 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port);
> > struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
> > u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> > u64 dpa);
> > +int cxl_enable_memctrl(struct cxl_region *cxlr);
>
> This is a "probe" operation not an "enable" in terms of runtime ABI and
> presentation that starts decorating the region. In that respect it also
> is not a "control" as much as an "operation model / driver". So no need
> for a "control" concept, i.e.:
>
> s/CXL_CONTROL_{NONE,AUTO,DAX}/CXL_DRIVER_{NONE,AUTO,DAX}/
> s/enum cxl_memctrl_mode/enum cxl_region_driver/
>
Ack.
> ...otherwise there is nothing in this proposal that makes me want to
> abandon the traditional meaning of a "driver" probing a "resource" in a
> certain way to make it usable with the rest of the kernel.
>
> Rest of this looks fine. With that fixup if we are going to have a set
> of different region driver modes then the directory can be:
>
> drivers/cxl/core/region/
Mostly I imagine we'll end up with at least 5-6 of these, and the
directory makes it clear to any newcomers: "This is what you must do if
you want a new region".
But if folks would rather have it directly in core/ then `git mv...`
Either way, ack.
~Gregory
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-12 21:10 ` Cheatham, Benjamin
@ 2026-01-12 22:34 ` Gregory Price
0 siblings, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 22:34 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams
On Mon, Jan 12, 2026 at 03:10:29PM -0600, Cheatham, Benjamin wrote:
> On 1/12/2026 10:35 AM, Gregory Price wrote:
> > The CXL driver presently hands policy management over to the DAX subsystem
> > for sysram regions, which makes building policy around the entire region
> > clunky and at times difficult (e.g. multiple actions to offline and
> > hot-unplug memory reliably).
> >
> > To support multiple backend controllers for memory regions (for example
> > dax vs direct hotplug), implement a memctrl field in cxl_region that allows
> > switching uncommitted regions between different "memory controllers".
> >
> > CXL_CONTROL_NONE: No selected controller, probe will fail.
> > CXL_CONTROL_AUTO: If memory is already online as SysRAM, no controller
> > otherwise register a dax_region
>
> I think you can streamline this to get rid of CXL_CONTROL_AUTO. If BIOS set up
> the memory you can just set the mode to CXL_CONTROL_NONE, otherwise use CXL_CONTROL_DAX.
> This patch would be a bit less complicated at the very least, and I don't think it
> would require much reworking of later patches.
>
I suppose if it's an auto-region (CXL_REGION_F_AUTO) we can default to
DAX if the SP bit is set (memory is soft-reserved).
I was hoping to quietly convert our systems from the dax driver to the
sysram driver without too much hubbub, but this would require our old
systems to implement a udev or chef thing to online the memory after
boot. Otherwise I can't ditch the MHP_ auto-online flags to
auto-online.
But yeah I won't fight this too much, just need to think about it. I
can always ditch it and bring auto back later if there's something that
breaks unexpectedly.
> > + switch (cxlr->memctrl) {
> > + case CXL_MEMCTRL_AUTO:
> > + /*
> > + * The region can not be manged by CXL if any portion of
>
> s/manged/managed. May as well fix it since it's being moved.
ack.
> > + * it is already online as 'System RAM'
> > + */
> > + if (walk_iomem_res_desc(IORES_DESC_NONE,
> > + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> > + p->res->start, p->res->end, cxlr,
> > + is_system_ram) > 0)
> > + return 0;
> > + return devm_cxl_add_dax_region(cxlr);
>
> If you take my suggestion at the top about removing MEMCTRL_AUTO, this case become
> the MEMCTRL_DAX case.
>
ack.
> > + default:
> > + desc = "";
>
> Nit: I would prefer this say "none" instead of being blank to match the code.
reasonable
> > @@ -3579,6 +3558,9 @@ static int __construct_region(struct cxl_region *cxlr,
> >
> > set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
> >
> > + /* Auto-regions will either be static sysram (onlined by BIOS) or DAX */
> > + cxlr->memctrl = CXL_MEMCTRL_AUTO;
>
> > And this would be set to CXL_MEMCTRL_DAX instead.
>
Mmmm, hm. I guess if the region is already online, we'll revert this to
none at probe time, so yeah that should work.
There may be some kind of management we want to add to auto-regions
onlined by BIOS (e.g. no SP bit), but we can add AUTO in if we ever
discover that use-case and argue for it then.
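Roughly, the selection without AUTO would look like the sketch below. It is
standalone C for discussion only: the SK_DRIVER_* enum and both function names
are made up here, and "already System RAM" is passed in as a flag instead of
being derived from walk_iomem_res_desc(). I'm also assuming only the DAX
request falls back to none; what an explicit sysram request should do on
already-online memory is left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical region driver modes, with AUTO dropped. */
enum sketch_region_driver { SK_DRIVER_NONE, SK_DRIVER_DAX, SK_DRIVER_SYSRAM };

/* Default assigned at region creation time. */
static enum sketch_region_driver default_region_driver(bool is_auto_region)
{
	/* Auto-assembled regions default to DAX, manual ones need a choice. */
	return is_auto_region ? SK_DRIVER_DAX : SK_DRIVER_NONE;
}

/* Driver actually attached at probe time. */
static enum sketch_region_driver
resolve_region_driver(enum sketch_region_driver requested,
		      bool already_system_ram)
{
	assert(requested <= SK_DRIVER_SYSRAM);

	/* Memory that BIOS already onlined cannot be handed to DAX. */
	if (requested == SK_DRIVER_DAX && already_system_ram)
		return SK_DRIVER_NONE;

	return requested;
}
```

So an auto-region starts as DAX and quietly ends up with no driver if the
range turns out to be online already, which matches what AUTO did before.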
Thanks,
~Gregory
* Re: [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 20:00 ` David Hildenbrand (Red Hat)
@ 2026-01-12 22:43 ` Gregory Price
0 siblings, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 22:43 UTC (permalink / raw)
To: David Hildenbrand (Red Hat)
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams
On Mon, Jan 12, 2026 at 09:00:54PM +0100, David Hildenbrand (Red Hat) wrote:
> On 1/12/26 17:35, Gregory Price wrote:
> > Add a sysram memctrl that directly hotplugs memory without needing to
> > route through DAX. This simplifies the sysram usecase considerably.
> >
> > The sysram memctrl adds new sysfs controls when registered:
> > region/memctrl/[hotplug, hotunplug, state]
> >
> > hotplug: controller attempts to hotplug the memory region
>
> Why disconnect the hotplug from the online state?
>
> echo online_movable > hotplug ?
>
> Then we can just have something like add_and_online_memory() in the core.
>
Mostly I cobbled this together over the weekend to have it for
discussion at the community DAX meeting.
I think just having
[offline,online,online_movable] > hotplug
is probably the better option. There's not much use in a memory_region
control that lets you offline the memory but not remove the blocks.
I mean, I know of *a* use for that, and it's not something we want to
support :]
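As a strawman, parsing the value written to hotplug could look like the
standalone sketch below. Nothing here is existing code: parse_hotplug_arg(),
word_matches() and the SK_* enum are placeholders, and accepting
"online_kernel" alongside the three values above is an assumption made to
line up with the memory block state strings.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's MMOP_* online types. */
enum sketch_online_type { SK_OFFLINE, SK_ONLINE, SK_ONLINE_KERNEL, SK_ONLINE_MOVABLE };

/* Compare @buf with @word, tolerating one trailing newline in @buf. */
static int word_matches(const char *buf, const char *word)
{
	size_t n = strlen(word);

	if (strncmp(buf, word, n) != 0)
		return 0;
	return buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0');
}

/* Returns the online type to apply after adding memory, or -1 if invalid. */
static int parse_hotplug_arg(const char *buf)
{
	assert(buf != NULL);

	if (word_matches(buf, "offline"))
		return SK_OFFLINE;	/* add the blocks, leave them offline */
	if (word_matches(buf, "online"))
		return SK_ONLINE;
	if (word_matches(buf, "online_kernel"))
		return SK_ONLINE_KERNEL;
	if (word_matches(buf, "online_movable"))
		return SK_ONLINE_MOVABLE;
	return -1;
}
```

The old "online_normal" spelling is rejected on purpose, so the interface
only ever accepts the strings userspace already knows from memory blocks.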
> > hotunplug: controller attempts to offline and hotunplug the memory region
> > state: [online,online_normal,offline]
> > online : controller onlines blocks in ZONE_MOVABLE
>
> I don't like this inconsistency regarding the remainder of common hotplug
> toggles.
>
> We should use exactly the same values with exactly the same semantics. Yes,
> user-space tooling should be taught to pass in online_movable :)
>
> > online_normal: controller onlines blocks in ZONE_NORMAL
> > offline : controller attempts to offline the memory blocks
>
> Why is that required? ideally we'd start with hotplug vs. hotunplug and
> leave manual onlining/offlining out of this interface for now.
>
That is fair, although I would like a build option to default the online
mode to ZONE_MOVABLE for auto-configured sysram regions w/ the SP bit
set, otherwise that will be forever locked to using the DAX model.
> > + } else if (sysfs_streq(buf, "offline")) {
> > + int offline_rc = 0;
> > +
> > + rc = walk_memory_blocks(range.start, range_len(&range),
> > + &offline_rc, offline_memory_block_cb);
> > + if (!rc)
> > + rc = offline_rc;
>
> Let's expose this functionality through some common-code helpers. I really
> don't want more code doing this non-obvious device_offline() etc dance.
>
> walk_memory_blocks() should become a core-mm helper. Maybe we can also
> cleanup drivers/acpi/acpi_memhotplug.c in that regard.
>
> Hopefully we can then also reuse these helpers in ppc code (see
> dlpar_add_lmb() and dlpar_remove_lmb() that do something similar, but grab
> the device hotplug lock themselves as they want to perform some additional
> operations).
>
I'll take a look.
Thanks!
~Gregory
* Re: [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 21:10 ` dan.j.williams
@ 2026-01-12 22:47 ` Gregory Price
0 siblings, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 22:47 UTC (permalink / raw)
To: dan.j.williams
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
David Hildenbrand
On Mon, Jan 12, 2026 at 01:10:15PM -0800, dan.j.williams@intel.com wrote:
> Gregory Price wrote:
> >
> > This can result in future management functions failing (such as adding a
> > new region). This is why "online_normal" is explicit, and the default
> > online zone is ZONE_MOVABLE.
>
> David's early feedback aligns with my own with respect to not creating
> new "online_*" ABI terms, but I want to go a step further.
>
> Part of the proposal here solves a fundamental problem with the way
> dax_kmem operates in terms of fixing the complication of dax_kmem
> depending on fine grained / multi-step online control via memblock
> sysfs.
>
> If we are going to introduce a new omnibus way to online entire regions
> at a time then that goodness should first come to dax_kmem and then
> potentially be refactored into a library that CXL can use to skip the
> device_dax indirection.
>
I think that probably just looks like sinking some of this into
memory_hotplug.c as bulk-commands and then exposing a similar
dax0.0/hotplug function that shows up if you're bound to dax_kmem.
That should be trivial to sink, can do.
> I.e. the end result would be this "hotplug" mechanism that fixes a long
> standing dax_kmem problem and then go further to drop the indirection
> through device_dax and have a "hotplug" mechanism directly at the
> cxl_region level.
>
The only catch may be auto-online behavior in dax, we may not want to
encode that there and instead improve the hotplug interface to entice
users to use it directly instead.
> > +int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> > +{
> [..]
> > +err_add_group:
> > + dev_set_drvdata(dev, NULL);
> > + /* if this fails, memory cannot be removed from the system until reboot */
> > + remove_memory(range.start, range_len(&range));
> > +err_add_memory:
> > + remove_resource(res);
> > + kfree(res);
> > +err_request_mem:
> > + memory_group_unregister(data->mgid);
> > +err_reg_mgid:
> > + kfree(data->res_name);
> > +err_res_name:
> > + kfree(data);
> > +err_data:
> > + clear_node_memory_type(numa_node, mtype);
> > + return rc;
>
> ...btw, this feels like too many new gotos in the age of
> scope-based-cleanup. It also feels like a bunch of duplicated code that
> CXL and fixed up dax_kmem can share.
Yeah i cribbed a bunch of this from dax and hotplug, i expect this to
get significantly cleaner in a version or two.
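
As a rough illustration of the scope-based-cleanup point, here is a
user-space sketch built on the compiler's cleanup attribute, which is the
mechanism the kernel's __free() helpers are layered on. Everything named here
(tracked_alloc(), __auto_free, take_ptr(), setup(), struct sysram_data) is
made up for the example and is not the series' code; it assumes GCC or Clang.

```c
#include <errno.h>
#include <stdlib.h>

static int live_allocs;	/* outstanding allocations, for demonstration only */

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Runs when an __auto_free variable goes out of scope. */
static void auto_free(void *pp)
{
	tracked_free(*(void **)pp);
}
#define __auto_free __attribute__((cleanup(auto_free)))

/* Hand ownership out of an auto-freed variable (in the spirit of no_free_ptr()). */
#define take_ptr(p) ({ void *__t = (p); (p) = NULL; __t; })

struct sysram_data {
	char *res_name;
};

/* Two allocations, no goto labels: early returns free whatever was set up. */
static int setup(int fail_late, struct sysram_data **out)
{
	struct sysram_data *data __auto_free = tracked_alloc(sizeof(*data));
	if (!data)
		return -ENOMEM;

	char *name __auto_free = tracked_alloc(16);
	if (!name)
		return -ENOMEM;

	if (fail_late)
		return -EINVAL;	/* both allocations are released here */

	data->res_name = take_ptr(name);
	*out = take_ptr(data);
	return 0;
}
```

On the success path the caller owns both allocations and frees them itself;
on every error path nothing leaks, without a chain of err_* labels.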
~Gregory
* Re: [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 21:10 ` Cheatham, Benjamin
@ 2026-01-12 22:55 ` Gregory Price
2026-01-13 22:34 ` Cheatham, Benjamin
0 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 22:55 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, David Hildenbrand
On Mon, Jan 12, 2026 at 03:10:41PM -0600, Cheatham, Benjamin wrote:
> On 1/12/2026 10:35 AM, Gregory Price wrote:
> > Add a sysram memctrl that directly hotplugs memory without needing to
> > route through DAX. This simplifies the sysram usecase considerably.
> >
> > The sysram memctl adds new sysfs controls when registered:
> > region/memctrl/[hotplug, hotunplug, state]
> >
> > hotplug: controller attempts to hotplug the memory region
> > hotunplug: controller attempts to offline and hotunplug the memory region
>
> Nit: Would it be better to use hotadd/hotremove here instead of hotplug/hotunplug? The terms
> are basically synonymous, but I think hotadd and hotremove are more descriptive.
I will defer to David on this. I think keeping the terminology
consistent is better, but also hotplug is overloaded between physical
and logical. It ultimately means the same thing to be honest.
> > state: [online,online_normal,offline]
> > online : controller onlines blocks in ZONE_MOVABLE
> > online_normal: controller onlines blocks in ZONE_NORMAL
>
> The naming for online states could be improved imo. I understand and agree with the motivation
> behind the names, but I could see the use of the word "normal" being confusing to less savvy users.
> You could change it to include the zone for both (online_movable/online_normal), but I think it may
> be easier to mark which one has drawbacks, i.e. change "online_normal" to something like "online_nonremovable".
> That way, anyone who doesn't want to go find the documentation for these can understand the user-visible
> impact.
>
> In any case, all of these attributes need ABI documentation as well.
>
This is what i was getting at originally, I will consider the other
feedback and spin a v2 with this simplified a bit.
I'm leaning towards agreeing with Dan and David that probably we just
keep online/online_movable since it's consistent with base/memory.c, but
we can continue to have this argument.
I don't think we can reasonably get away from users of this interface
understanding the implications of ZONEs, since whatever they choose to
do dictates what zone the memory gets added to.
> > +static DEFINE_MUTEX(cxl_memory_type_lock);
> > +static LIST_HEAD(cxl_memory_types);
> > +
> > +static struct cxl_region *to_cxl_region(struct device *dev)
> > +{
> > + if (dev->type != &cxl_region_type)
> > + return NULL;
> > + return container_of(dev, struct cxl_region, dev);
> > +}
>
> What's the reasoning behind redefining this in this file? It's still defined in cxl/core/region.c,
> so I would probably just drop the static there and include it through core.h.
>
Just cruft from rapidly moving stuff around. Will fixup.
> > + rc = cxl_sysram_range(cxlr, &range);
> > + if (rc) {
> > + dev_info(dev, "range %#llx-%#llx too small after alignment\n",
> > + range.start, range.end);
>
> This should probably be a warning instead. You do it for the next check which is essentially the same
> case, so may as well do it here.
ack.
> > + if (!total_len) {
> > + dev_warn(dev, "rejecting CXL region without any memory after alignment\n");
> > + return -EINVAL;
> > + }
>
> I don't think this check is needed. cxl_sysram_range() checks if the range->start == range->end (i.e. size == 0)
> and errors out. That should cause the above check to error out before this.
ack
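
To make the "too small after alignment" case concrete, here is a small
user-space sketch of trimming a region to whole memory blocks. It is an
assumption about what cxl_sysram_range() does, modelled on the description in
this thread; sysram_range() and its block_size parameter are hypothetical
names, struct range mirrors the kernel's inclusive-end convention, and
block_size must be a power of two.

```c
#include <errno.h>
#include <stdint.h>

/* Inclusive end, like the kernel's struct range. */
struct range {
	uint64_t start;
	uint64_t end;
};

static uint64_t range_len(const struct range *r)
{
	return r->end - r->start + 1;
}

/*
 * Shrink 'region' to the largest sub-range made of whole blocks.
 * Returns -EINVAL if no complete block fits (assumed error code).
 * Does not handle region->end == UINT64_MAX.
 */
static int sysram_range(const struct range *region, uint64_t block_size,
			struct range *out)
{
	uint64_t start = (region->start + block_size - 1) & ~(block_size - 1);
	uint64_t end_excl = (region->end + 1) & ~(block_size - 1);

	if (start >= end_excl)
		return -EINVAL;

	out->start = start;
	out->end = end_excl - 1;
	return 0;
}
```

With a single check of this kind, a later "total length is zero" test becomes
redundant, which is the observation made in the review above.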
> > + /*
> > + * Setup flags for System RAM. Leave _BUSY clear so add_memory() can add
> > + * a child resource. Do not inherit flags from parent since it may set
> > + * flags unknown to us that will break add_memory() below.
> > + */
> > + res->flags = IORESOURCE_SYSTEM_RAM;
> > + mhp_flags = MHP_NID_IS_MGID;
> > + rc = add_memory_driver_managed(data->mgid, range.start,
> > + range_len(&range), sysram_name, mhp_flags);
>
> Looks like mhp_flags is only used once, I'd get rid of it and just use MHP_NID_IS_MGID instead.
>
ack - yeah this was cribbed from dax.c
Thank you!
~Gregory
* Re: [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region
2026-01-12 21:10 ` Cheatham, Benjamin
@ 2026-01-12 22:58 ` Gregory Price
2026-01-13 9:12 ` Neeraj Kumar
0 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 22:58 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, s.neeraj
On Mon, Jan 12, 2026 at 03:10:47PM -0600, Cheatham, Benjamin wrote:
> On 1/12/2026 10:35 AM, Gregory Price wrote:
> > Move the pmem_region logic from region.c into memctrl/pmem_region.c.
> > Restrict the valid controllers for pmem to the pmem controller.
> > Simplify the controller selection logic in region probe.
> >
> > Cc:
>
> May want to forward this to whoever this Cc tag was meant for :).
>
> One nit below, otherwise this looks good to me:
> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
>
doh, meant to send this to Neeraj at samsung because they've been poking
at pmem stuff. Thank you.
~Gregory
Neeraj:
Link: https://lore.kernel.org/linux-cxl/20260112163514.2551809-4-gourry@gourry.net/
* Re: [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
2026-01-12 21:10 ` Cheatham, Benjamin
@ 2026-01-12 23:05 ` Gregory Price
2026-01-13 4:31 ` dan.j.williams
0 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 23:05 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams
On Mon, Jan 12, 2026 at 03:10:55PM -0600, Cheatham, Benjamin wrote:
> On 1/12/2026 10:35 AM, Gregory Price wrote:
> > +choice
> > + prompt "CXL Region Auto Control Mode"
> > + depends on CXL_REGION
> > + default CXL_REGION_CTRL_AUTO_DAX
> > + help
> > + Select the default controller for CXL regions when ctrl mode is
> > + set to 'auto'. This determines how CXL memory regions are exposed
> > + to the system when no explicit control mode is specified.
> > +
> > +config CXL_REGION_CTRL_AUTO_DAX
>
> This should probably be renamed to CXL_REGION_CTRL_DAX since only DAX is mentioned.
>
> > + bool "DAX"
> > + help
> > + When a CXL region's control mode is 'auto', create a DAX region
> > + controller. This allows fine-grained control over the memory region
> > + through the DAX subsystem, and the region can later be converted to
> > + System RAM via daxctl.
> > +
> > + This is the default and recommended option for most use cases.
>
> If you remove the 'auto' mode earlier on, then you can just drop the first sentence here.
> I'd also add a note about when a DAX region can be failed to be created (i.e. BIOS already
> set up and onlined the memory).
>
I think I'm just going to drop this entirely, probably this was just too
ambitious trying to create an easy transition from dax to sysram for
auto regions.
The reality is BIOS-configured decoders "is NOT the way" (TM). If BIOS
configures it - it's DAX, otherwise the user gets a choice (or they can
tear it down and rebuild).
~Gregory
* Re: [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options
2026-01-12 21:11 ` Cheatham, Benjamin
@ 2026-01-12 23:07 ` Gregory Price
0 siblings, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-12 23:07 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams
On Mon, Jan 12, 2026 at 03:11:00PM -0600, Cheatham, Benjamin wrote:
> On 1/12/2026 10:35 AM, Gregory Price wrote:
> > DEFAULT_OFFLINE: Blocks will be offline after being created.
> > DEFAULT_ONLINE: Blocks will be onlined in ZONE_MOVABLE
> > DEFAULT_ONLINE_NORMAL: Blocks will be onlined in ZONE_NORMAL.
> >
> > This prevents users from having to use the MHP auto-online build config,
> > which may cause misbehaviors with other devices hotplugging memory.
>
> Isn't the MHP auto-online build config still used in some flows? A quick note on
> when that option will still be used would be nice.
It's definitely still in use, and in fact we use it to manage many
systems with BIOS configured decoders.
That option supersedes this option, which... is probably problematic,
and David might want to chime in on whether improving the hotplug+online
pattern to include the intended zone should dictate its removal.
~Gregory
> > +static int cxl_sysram_offline_memory(struct range *range)
> > +{
> > + int offline_rc = 0;
> > + int rc;
> > +
> > + rc = walk_memory_blocks(range->start, range_len(range),
> > + &offline_rc, offline_memory_block_cb);
> > + if (!rc)
> > + rc = offline_rc;
> > +
> > + return rc;
> > +}
>
> I think these two helpers can get moved into patch 2/6 when the 'store' attribute was defined. I don't
> see anything that requires they're in this patch and it would help reduce churn.
>
Yeah this'll get reworked with the interface rework.
Thanks again,
Gregory
* Re: [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only
2026-01-12 21:11 ` Cheatham, Benjamin
@ 2026-01-12 23:14 ` Gregory Price
2026-01-13 22:35 ` Cheatham, Benjamin
0 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-12 23:14 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, David Hildenbrand, Hannes Reinecke
On Mon, Jan 12, 2026 at 03:11:05PM -0600, Cheatham, Benjamin wrote:
> On 1/12/2026 10:35 AM, Gregory Price wrote:
> > If state is set to online (default to ZONE_MOVABLE), the user intends
> > for this memory to either refuse non-movable allocations, and/or intends
> > to preserve the hot-unpluggability of this memory. However, any admin
> > can write `offline` and `online` to the memory block controller and
> > bring that memory online in ZONE_NORMAL.
>
> Is it the expectation that the user will never want to change the zone from
> MOVABLE to NORMAL? I can't think of a reason someone would want to off the top
> of my head, but I also can't think of a reason to restrict it either.
>
It's more to restrict this pattern:

  echo online_movable > region0/hotplug
    -> creates: node1/memory123/
  echo offline > node1/memory123/state
  echo online > node1/memory123/state

The result of this would be valid_zones=[normal movable], which would
break hot-unplug.
> > If an actor attempts to online the block into ZONE_NORMAL, it will fail,
> > but if it attempts to online into either NORMAL or MOVABLE, only MOVABLE
> > will be allowed and it will succeed.
>
> I'm not sure you need this paragraph. I think it's a logical conclusion of the above
> that if someone attempts to online the memory as NORMAL or MOVABLE it'll only be onlined
> as MOVABLE.
In the above situation the following occurs:

  echo online > region0/hotplug
  echo offline > node1/memory123/state
  echo online > node1/memory123/state
  cat node1/memory123/valid_zones
    normal movable
  echo offline > node1/memory123/state
  echo 1 > node1/memory123/online
  cat node1/memory123/valid_zones
    normal

  echo online_movable > region0/hotplug
  echo offline > node1/memory123/state
  echo online > node1/memory123/state
  cat node1/memory123/valid_zones
    movable
  echo offline > node1/memory123/state
  echo 1 > node1/memory123/online
    fail with EXXXX (i forget what code)

It's a little confusing.
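
One way to read the rule stated in the patch description (an explicit
ZONE_NORMAL request fails, an "either zone" request resolves to MOVABLE) is
the decision function sketched below. It models the commit message only, not
the sysfs transcript above, and is not the actual notifier: the enum names,
resolve_online() and the -EINVAL return are all placeholders chosen for the
example.

```c
#include <errno.h>

/* What the block-level online request asked for (hypothetical names). */
enum online_req {
	REQ_ONLINE,		/* "either zone is fine" */
	REQ_ONLINE_KERNEL,	/* explicitly ZONE_NORMAL */
	REQ_ONLINE_MOVABLE,	/* explicitly ZONE_MOVABLE */
};

enum sk_zone {
	SK_ZONE_NORMAL,
	SK_ZONE_MOVABLE,
};

/*
 * Decide which zone a block lands in, or reject the request.
 * movable_only models a region whose controller state is ZONE_MOVABLE.
 * The error code is a placeholder; the thread does not name the real one.
 */
static int resolve_online(int movable_only, enum online_req req,
			  enum sk_zone *zone)
{
	if (movable_only) {
		if (req == REQ_ONLINE_KERNEL)
			return -EINVAL;
		*zone = SK_ZONE_MOVABLE;
		return 0;
	}

	/* Unrestricted region: simplified default of NORMAL unless asked otherwise. */
	*zone = (req == REQ_ONLINE_MOVABLE) ? SK_ZONE_MOVABLE : SK_ZONE_NORMAL;
	return 0;
}
```

Note that the unrestricted branch is a simplification; real default zone
selection depends on more state than is modelled here.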
> > + switch (data->last_online_type) {
> > + case MMOP_ONLINE_MOVABLE:
> > + return sysfs_emit(buf, "online\n");
> > + case MMOP_ONLINE_KERNEL:
> > + return sysfs_emit(buf, "online_normal\n");
> > + case MMOP_OFFLINE:
> > + default:
>
> You're missing the MMOP_ONLINE case. In that case the memory would be reported as "offline", which
> I doubt is the intention.
>
Blah, i originally had all of them and just reduced to
MMOP_ONLINE_MOVABLE and MMOP_ONLINE (i don't see a good use for
MMOP_ONLINE_KERNEL), but i'll fix this up.
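
For illustration, a table-driven mapping makes it harder to drop a case like
the one spotted above. The sketch below is user-space only: the local enum
mirrors the kernel's MMOP_* names as a stand-in, the strings follow the
memory block sysfs naming that the thread leans towards reusing, and
state_str() / parse_state() are hypothetical helpers. Unlike sysfs_streq(),
the plain strcmp() here does not tolerate a trailing newline.

```c
#include <string.h>

/* Local stand-in mirroring the kernel's MMOP_* online types. */
enum mmop {
	MMOP_OFFLINE = 0,
	MMOP_ONLINE,
	MMOP_ONLINE_KERNEL,
	MMOP_ONLINE_MOVABLE,
	MMOP_COUNT,	/* not a kernel value; array bound for this sketch */
};

/* One table drives both directions, so every type has exactly one name. */
static const char *const state_names[MMOP_COUNT] = {
	[MMOP_OFFLINE]		= "offline",
	[MMOP_ONLINE]		= "online",
	[MMOP_ONLINE_KERNEL]	= "online_kernel",
	[MMOP_ONLINE_MOVABLE]	= "online_movable",
};

/* Name for an online type; invalid values fall back to "offline". */
static const char *state_str(int online_type)
{
	if (online_type < 0 || online_type >= MMOP_COUNT)
		return state_names[MMOP_OFFLINE];
	return state_names[online_type];
}

/* Online type for a name, or -1 if the string is not recognised. */
static int parse_state(const char *buf)
{
	for (int i = 0; i < MMOP_COUNT; i++)
		if (strcmp(buf, state_names[i]) == 0)
			return i;
	return -1;
}
```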
Thanks!
Gregory
* Re: [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
2026-01-12 23:05 ` Gregory Price
@ 2026-01-13 4:31 ` dan.j.williams
2026-01-13 13:55 ` Gregory Price
0 siblings, 1 reply; 40+ messages in thread
From: dan.j.williams @ 2026-01-13 4:31 UTC (permalink / raw)
To: Gregory Price, Cheatham, Benjamin
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams
Gregory Price wrote:
[..]
> > If you remove the 'auto' mode earlier on, then you can just drop the first sentence here.
> > I'd also add a note about when a DAX region can be failed to be created (i.e. BIOS already
> > set up and onlined the memory).
> >
>
> I think I'm just going to drop this entirely, probably this was just too
> ambitious trying to create an easy transition from dax to sysram for
> auto regions.
>
> The reality is BIOS-configured decoders "is NOT the way" (TM). If BIOS
> configures it - it's DAX, otherwise the user gets a choice (or they can
> tear it down and rebuild).
Is the plan here to "whither struct memory_block"? I can see value in
starting the deprecation process given the problems Hannes points out
and BIOS alignment causes massive numbers of those things to show up.
If yes, then even if it is DAX the distro might still want the option to
only allow for region-scoped "hotplug" rather than memory_block-scoped
"online".
* Re: [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region
2026-01-12 22:58 ` Gregory Price
@ 2026-01-13 9:12 ` Neeraj Kumar
0 siblings, 0 replies; 40+ messages in thread
From: Neeraj Kumar @ 2026-01-13 9:12 UTC (permalink / raw)
To: Gregory Price
Cc: Cheatham, Benjamin, linux-cxl, linux-kernel, kernel-team, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, gost.dev, neeraj.kernel, cpgs
On 12/01/26 05:58PM, Gregory Price wrote:
>On Mon, Jan 12, 2026 at 03:10:47PM -0600, Cheatham, Benjamin wrote:
>> On 1/12/2026 10:35 AM, Gregory Price wrote:
>> > Move the pmem_region logic from region.c into memctrl/pmem_region.c.
>> > Restrict the valid controllers for pmem to the pmem controller.
>> > Simplify the controller selection logic in region probe.
>> >
>> > Cc:
>>
>> May want to forward this to whoever this Cc tag was meant for :).
>>
>> One nit below, otherwise this looks good to me:
>> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
>>
>
>doh, meant to send this to Neeraj at samsung because they've been poking
>at pmem stuff. Thank you.
>
>~Gregory
>
>Neeraj:
>Link: https://lore.kernel.org/linux-cxl/20260112163514.2551809-4-gourry@gourry.net/
Hi Gregory,
Sure I will use this new infra for LSA 2.1 Support.
Regards,
Neeraj
* Re: [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
` (5 preceding siblings ...)
2026-01-12 16:35 ` [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only Gregory Price
@ 2026-01-13 9:37 ` Neeraj Kumar
2026-01-13 13:33 ` Gregory Price
2026-01-15 18:43 ` Alejandro Lucero Palau
7 siblings, 1 reply; 40+ messages in thread
From: Neeraj Kumar @ 2026-01-13 9:37 UTC (permalink / raw)
To: Gregory Price
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, gost.dev, neeraj.kernel, cpgs
On 12/01/26 11:35AM, Gregory Price wrote:
>The CXL driver currently hands policy management over to the DAX
>subsystem for sysram regions. This makes building policy around
>entire regions clunky and at times difficult - for example, requiring
>multiple actions to reliably offline and hot-unplug memory.
>
>This series introduces a memory controller abstraction for CXL regions
>and adds a "sysram" controller that directly hotplugs memory without
>needing to route through DAX. This simplifies the sysram use case
>considerably.
>
>This also prepares for future use cases which may require different
>memory controller logic (such as private numa nodes).
>
>We organize the controllers into core/memctrl/*_region.c files.
>
>The series is organized as follows:
>
>Patch 1 introduces the cxl_memctrl_mode enum and region->memctrl field,
>allowing regions to be switched between different memory controllers.
>The supported modes are NONE, AUTO, and DAX initially. Auto-created
>regions default to AUTO, while manually created regions default to NONE
>(requiring explicit controller selection).
>
>Patch 2 adds the sysram_region memory controller, which provides direct
>memory hotplug without DAX intermediation. New sysfs controls are
>exposed under region/memctrl/:
> - hotplug: trigger memory hotplug
> - hotunplug: offline and hotunplug memory
> - state: online/online_normal/offline
>
>Patch 3 refactors existing pmem memctrl logic out of region.c into the
>new memctrl/pmem_region.c, simplifying controller selection in region
>probe.
>
>Patch 4 adds CONFIG_CXL_REGION_CTRL_AUTO_* options, allowing users to
>configure auto-regions to default to SYSRAM instead of DAX for existing
>simple system configurations (i.e. local memory expansion only).
>
>Patch 5 adds CONFIG_CXL_REGION_SYSRAM_DEFAULT_* options to control the
>default state of sysram blocks (OFFLINE, ONLINE/ZONE_MOVABLE, or
>ONLINE_NORMAL/ZONE_NORMAL). This provides an alternative to the global
>MHP auto-online setting which may cause issues with other devices.
>
>Online defaults to ZONE_MOVABLE to defend hot-unplug by default.
>This is the opposite of memory blocks "online" and "online_movable".
>
>Patch 6 adds a memory_notify callback that prevents memory blocks from
>being onlined into ZONE_NORMAL when the controller state is set to
>ZONE_MOVABLE. This protects against administrators accidentally
>breaking hot-unpluggability by writing "offline" then "online" to the
>memory block sysfs.
>
>Gregory Price (6):
> drivers/cxl: add cxl_memctrl_mode and region->memctrl
> cxl: add sysram_region memory controller
> cxl/core/region: move pmem memctrl logic into memctrl/pmem_region
> cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
> cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options
> cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only
>
> drivers/cxl/Kconfig | 72 ++++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/core.h | 5 +
> drivers/cxl/core/memctrl/Makefile | 6 +
> drivers/cxl/core/memctrl/dax_region.c | 79 ++++
> drivers/cxl/core/memctrl/memctrl.c | 48 +++
> drivers/cxl/core/memctrl/pmem_region.c | 191 +++++++++
> drivers/cxl/core/memctrl/sysram_region.c | 520 +++++++++++++++++++++++
> drivers/cxl/core/region.c | 358 ++++------------
> drivers/cxl/cxl.h | 18 +
> 10 files changed, 1013 insertions(+), 285 deletions(-)
> create mode 100644 drivers/cxl/core/memctrl/Makefile
> create mode 100644 drivers/cxl/core/memctrl/dax_region.c
> create mode 100644 drivers/cxl/core/memctrl/memctrl.c
> create mode 100644 drivers/cxl/core/memctrl/pmem_region.c
> create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
>
>--
>2.52.0
>
Hi Gregory,
I am facing a compilation issue with this series using CONFIG_CXL_BUS=m
{{{
AR drivers/built-in.a
AR built-in.a
AR vmlinux.a
LD vmlinux.o
MODPOST Module.symvers
ERROR: modpost: "device_offline" [drivers/cxl/core/cxl_core.ko] undefined!
ERROR: modpost: "lock_device_hotplug_sysfs" [drivers/cxl/core/cxl_core.ko] undefined!
ERROR: modpost: "unlock_device_hotplug" [drivers/cxl/core/cxl_core.ko] undefined!
ERROR: modpost: "device_online" [drivers/cxl/core/cxl_core.ko] undefined!
ERROR: modpost: "walk_memory_blocks" [drivers/cxl/core/cxl_core.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:147: Module.symvers] Error 1
make[1]: *** [/mnt/ssd1/neeraj/dcd/cxl_env/cxl-linux-mainline/Makefile:2004: modpost] Error 2
make: *** [Makefile:248: __sub-make] Error 2
}}}
The above routines are not exported with EXPORT_SYMBOL_GPL(), which is why
the build breaks with "CONFIG_CXL_BUS=m".
After adding the following EXPORT_SYMBOL_GPL() lines below their respective
routines, the issue is fixed.
{{{
EXPORT_SYMBOL_GPL(unlock_device_hotplug);
EXPORT_SYMBOL_GPL(lock_device_hotplug_sysfs);
EXPORT_SYMBOL_GPL(device_offline);
EXPORT_SYMBOL_GPL(device_online);
EXPORT_SYMBOL_GPL(walk_memory_blocks);
}}}
Can you please have a look?
Regards,
Neeraj
* Re: [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller
2026-01-13 9:37 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Neeraj Kumar
@ 2026-01-13 13:33 ` Gregory Price
0 siblings, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-13 13:33 UTC (permalink / raw)
To: Neeraj Kumar
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, gost.dev, neeraj.kernel, cpgs
On Tue, Jan 13, 2026 at 03:07:49PM +0530, Neeraj Kumar wrote:
> Hi Gregory,
>
> I am facing compilation issue with this series using CONFIG_CXL_BUS=m
> {{{
> AR drivers/built-in.a
> AR built-in.a
> AR vmlinux.a
> LD vmlinux.o
> MODPOST Module.symvers
> ERROR: modpost: "device_offline" [drivers/cxl/core/cxl_core.ko] undefined!
> ERROR: modpost: "lock_device_hotplug_sysfs" [drivers/cxl/core/cxl_core.ko] undefined!
> ERROR: modpost: "unlock_device_hotplug" [drivers/cxl/core/cxl_core.ko] undefined!
> ERROR: modpost: "device_online" [drivers/cxl/core/cxl_core.ko] undefined!
> ERROR: modpost: "walk_memory_blocks" [drivers/cxl/core/cxl_core.ko] undefined!
> make[2]: *** [scripts/Makefile.modpost:147: Module.symvers] Error 1
> make[1]: *** [/mnt/ssd1/neeraj/dcd/cxl_env/cxl-linux-mainline/Makefile:2004: modpost] Error 2
> make: *** [Makefile:248: __sub-make] Error 2
> }}}
>
> Above routines are not EXPORT_SYMBOL_GPL(), thats why with
> "CONFIG_CXL_BUS=m" its breaking.
>
> After adding following EXPORT_SYMBOL_GPL() below their respective
> routines. This issue is fixed.
> {{{
> EXPORT_SYMBOL_GPL(unlock_device_hotplug);
> EXPORT_SYMBOL_GPL(lock_device_hotplug_sysfs);
> EXPORT_SYMBOL_GPL(device_offline);
> EXPORT_SYMBOL_GPL(device_online);
> EXPORT_SYMBOL_GPL(walk_memory_blocks);
> }}}
>
> Can you please have a look?
>
Much appreciated, will fixup!
~Gregory
* Re: [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
2026-01-13 4:31 ` dan.j.williams
@ 2026-01-13 13:55 ` Gregory Price
0 siblings, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-13 13:55 UTC (permalink / raw)
To: dan.j.williams
Cc: Cheatham, Benjamin, linux-cxl, linux-kernel, kernel-team, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny
On Mon, Jan 12, 2026 at 08:31:56PM -0800, dan.j.williams@intel.com wrote:
> Gregory Price wrote:
> [..]
> > > If you remove the 'auto' mode earlier on, then you can just drop the first sentence here.
> > > I'd also add a note about when a DAX region can be failed to be created (i.e. BIOS already
> > > set up and onlined the memory).
> > >
> >
> > I think I'm just going to drop this entirely, probably this was just too
> > ambitious trying to create an easy transition from dax to sysram for
> > auto regions.
> >
> > The reality is BIOS-configured decoders "is NOT the way" (TM). If BIOS
> > configures it - it's DAX, otherwise the user gets a choice (or they can
> > tear it down and rebuild).
>
> Is the plan here to "whither struct memory_block"? I can see value in
> starting the deprecation process given the problems Hannes points out
> and BIOS alignment causes massive numbers of those things to show up.
>
> If yes, then even if it is DAX the distro might still want the option to
> only allows for region-scoped "hotplug" rather than memory_block-scoped
> "online".
This was not an intent, but maybe? I'm not sure what the larger
implications of this are - except that maybe poisoned regions of memory
might take out larger chunks of hotplug memory.
Other things may depend on memory block size in unexpected ways.
I think maybe let's tuck that away until after we get region-scoped
hotplug. Maybe it would look like this:

Step 1: Region-scoped hotplug that uses all the blocks
Step 2: memory hotplug callbacks that disallow any hotplugger (except
        emergency hotplug?) from acting on individual blocks
        We already want this for online_movable, just defer a bit and
        make the guarantee stronger.
Step 3: deprecate memory_block for something else?
        make memory_block variably sized with some base alignment?

~Gregory
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-12 20:59 ` dan.j.williams
2026-01-12 22:25 ` Gregory Price
@ 2026-01-13 18:00 ` Dave Jiang
2026-01-13 20:07 ` Gregory Price
2026-01-14 16:36 ` dan.j.williams
1 sibling, 2 replies; 40+ messages in thread
From: Dave Jiang @ 2026-01-13 18:00 UTC (permalink / raw)
To: dan.j.williams, Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron,
alison.schofield, vishal.l.verma, ira.weiny,
Fabio M. De Francesco
On 1/12/26 1:59 PM, dan.j.williams@intel.com wrote:
> Gregory Price wrote:
>> The CXL driver presently hands policy management over to DAX subsystem
>> for sysram regions, which makes building policy around the entire region
>> clunky and at times difficult (e.g. multiple actions to offline and
>> hot-unplug memory reliably).
>>
>> To support multiple backend controllers for memory regions (for example
>> dax vs direct hotplug), implement a memctrl field in cxl_region that allows
>> switching uncommitted regions between different "memory controllers".
>>
>> CXL_CONTROL_NONE: No selected controller, probe will fail.
>> CXL_CONTROL_AUTO: If memory is already online as SysRAM, no controller
>> otherwise register a dax_region
>> CXL_CONTROL_DAX : register a dax_region
>>
>> Auto regions will either be static sysram (BIOS-onlined) and have no
>> region controller associated with them - or, if the SP bit was set, a
>> DAX device will be created.
>>
>> Rather than default all regions to the auto-controller, only default
>> auto-regions to the auto controller.
>>
>> Non-auto regions will be defaulted to CXL_CONTROL_NONE, which will cause
>> a failure to probe unless a controller is selected.
>>
>> Signed-off-by: Gregory Price <gourry@gourry.net>
>> ---
>> drivers/cxl/core/Makefile | 1 +
>> drivers/cxl/core/core.h | 2 +
>> drivers/cxl/core/memctrl/Makefile | 4 +
>> drivers/cxl/core/memctrl/dax_region.c | 79 +++++++++++++++
>> drivers/cxl/core/memctrl/memctrl.c | 42 ++++++++
>> drivers/cxl/core/region.c | 136 ++++++++++----------------
>> drivers/cxl/cxl.h | 14 +++
>> 7 files changed, 192 insertions(+), 86 deletions(-)
>> create mode 100644 drivers/cxl/core/memctrl/Makefile
>> create mode 100644 drivers/cxl/core/memctrl/dax_region.c
>> create mode 100644 drivers/cxl/core/memctrl/memctrl.c
>>
>> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
>> index 5ad8fef210b5..79de20e3f8aa 100644
>> --- a/drivers/cxl/core/Makefile
>> +++ b/drivers/cxl/core/Makefile
>> @@ -17,6 +17,7 @@ cxl_core-y += cdat.o
>> cxl_core-y += ras.o
>> cxl_core-$(CONFIG_TRACING) += trace.o
>> cxl_core-$(CONFIG_CXL_REGION) += region.o
>> +include $(src)/memctrl/Makefile
>
> Not sure this merits its own directory, but if it does just do the
> canonical:
>
> obj-y += memctrl/
>
> ...to add an object-sub-directory.
>
>> cxl_core-$(CONFIG_CXL_MCE) += mce.o
>> cxl_core-$(CONFIG_CXL_FEATURES) += features.o
>> cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
>> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
>> index 1fb66132b777..1156a4bd0080 100644
>> --- a/drivers/cxl/core/core.h
>> +++ b/drivers/cxl/core/core.h
>> @@ -42,6 +42,8 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port);
>> struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
>> u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
>> u64 dpa);
>> +int cxl_enable_memctrl(struct cxl_region *cxlr);
>
> This is a "probe" operation not an "enable" in terms of runtime ABI and
> presentation that starts decorating the region. In that respect it also
> is not a "control" as much as an "operation model / driver". So no need
> for a "control" concept, i.e.:
>
> s/CXL_CONTROL_{NONE,AUTO,DAX}/CXL_DRIVER_{NONE,AUTO,DAX}/
> s/enum cxl_memctrl_mode/enum cxl_region_driver/
>
> ...otherwise there is nothing in this proposal that makes me want to
> abandon the traditional meaning of a "driver" probing a "resource" in a
> certain way to make it usable with the rest of the kernel.
>
> Rest of this looks fine. With that fixup if we are going to have a set
> of different region driver modes then the directory can be:
>
> drivers/cxl/core/region/
Do we still have reasons to keep the region drivers in core? I know
Fabio has been looking at moving the region drivers to drivers/cxl/ so
the LMH cxl_test stuff doesn't need to do all the weird stuff to make
it work. Maybe we just do the refactor now and move the region drivers
outside of core. How about drivers/cxl/region/?
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-13 18:00 ` Dave Jiang
@ 2026-01-13 20:07 ` Gregory Price
2026-01-14 16:36 ` dan.j.williams
1 sibling, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-13 20:07 UTC (permalink / raw)
To: Dave Jiang
Cc: dan.j.williams, linux-cxl, linux-kernel, kernel-team, dave,
jonathan.cameron, alison.schofield, vishal.l.verma, ira.weiny,
Fabio M. De Francesco
On Tue, Jan 13, 2026 at 11:00:42AM -0700, Dave Jiang wrote:
>
> >
> > This is a "probe" operation not an "enable" in terms of runtime ABI and
> > presentation that starts decorating the region. In that respect it also
> > is not a "control" as much as an "operation model / driver". So no need
> > for a "control" concept, i.e.:
> >
> > s/CXL_CONTROL_{NONE,AUTO,DAX}/CXL_DRIVER_{NONE,AUTO,DAX}/
> > s/enum cxl_memctrl_mode/enum cxl_region_driver/
> >
> > ...otherwise there is nothing in this proposal that makes me want to
> > abandon the traditional meaning of a "driver" probing a "resource" in a
> > certain way to make it usable with the rest of the kernel.
> >
> > Rest of this looks fine. With that fixup if we are going to have a set
> > of different region driver modes then the directory can be:
> >
> > drivers/cxl/core/region/
>
> Do we still have reasons to keep the region drivers in core? I know
> Fabio has been looking at moving the region drivers to drivers/cxl/ so
> the LMH cxl_test stuff doesn't need to do all the weird stuff to make
> it work. Maybe we just do the refactor now and move the region drivers
> outside of core. How about drivers/cxl/region/?
>
I pulled them out into drivers/cxl/core for now, but this is a trivial
move. I think it may still use some core.h functions, so it would be
more work to pull them all the way out into cxl.
I will submit a v2 shortly with just the refactor work and the new
sysfs entry (but no sysram driver; I am going to defer that to a
separate line on top of this).
~Gregory
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 2/6] cxl: add sysram_region memory controller
2026-01-12 22:55 ` Gregory Price
@ 2026-01-13 22:34 ` Cheatham, Benjamin
0 siblings, 0 replies; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-13 22:34 UTC (permalink / raw)
To: Gregory Price
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, David Hildenbrand
On 1/12/2026 4:55 PM, Gregory Price wrote:
> On Mon, Jan 12, 2026 at 03:10:41PM -0600, Cheatham, Benjamin wrote:
>> On 1/12/2026 10:35 AM, Gregory Price wrote:
>>> Add a sysram memctrl that directly hotplugs memory without needing to
>>> route through DAX. This simplifies the sysram usecase considerably.
>>>
>>> The sysram memctl adds new sysfs controls when registered:
>>> region/memctrl/[hotplug, hotunplug, state]
>>>
>>> hotplug: controller attempts to hotplug the memory region
>>> hotunplug: controller attempts to offline and hotunplug the memory region
>>
>> Nit: Would it be better to use hotadd/hotremove here instead of hotplug/hotunplug? The terms
>> are basically synonymous, but I think hotadd and hotremove are more descriptive.
>
> I will defer to David on this. I think keeping the terminology
> consistent is better, but also hotplug is overloaded between physical
> and logical. It ultimately means the same thing to be honest.
I agree, I'm fine with either here.
>
>>> state: [online,online_normal,offline]
>>> online : controller onlines blocks in ZONE_MOVABLE
>>> online_normal: controller onlines blocks in ZONE_NORMAL
>>
>> The naming for online states could be improved imo. I understand and agree with the motivation
>> behind the names, but I could see the use of the word "normal" being confusing to less savvy users.
>> You could change it to include the zone for both (online_movable/online_normal), but I think it may
>> be easier to mark which one has drawbacks, i.e. change "online_normal" to something like "online_nonremovable".
>> That way, anyone who doesn't want to go find the documentation for these can understand the user-visible
>> impact.
>>
>> In any case, all of these attributes need ABI documentation as well.
>>
>
> This is what i was getting at originally, I will consider the other
> feedback and spin a v2 with this simplified a bit.
>
> I'm leaning towards agreeing with Dan and David that probably we just
> keep online/online_movable since it's consistent with base/memory.c, but
> we can continue to have this argument.
>
> I don't think we can reasonably get away from users of this interface
> understanding the implications of ZONEs, since whatever they choose to
> do dictates what zone the memory gets added to.
That sounds reasonable. I was going under the assumption that someone may come
along who doesn't know much about zones, which probably isn't very likely. So
if we want to ditch that assumption it's fine by me.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only
2026-01-12 23:14 ` Gregory Price
@ 2026-01-13 22:35 ` Cheatham, Benjamin
0 siblings, 0 replies; 40+ messages in thread
From: Cheatham, Benjamin @ 2026-01-13 22:35 UTC (permalink / raw)
To: Gregory Price
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, David Hildenbrand, Hannes Reinecke
On 1/12/2026 5:14 PM, Gregory Price wrote:
> On Mon, Jan 12, 2026 at 03:11:05PM -0600, Cheatham, Benjamin wrote:
>> On 1/12/2026 10:35 AM, Gregory Price wrote:
>>> If state is set to online (default to ZONE_MOVABLE), the user intends
>>> for this memory to either refuse non-movable allocations, and/or intends
>>> to preserve the hot-unpluggability of this memory. However, any admin
>>> can write `offline` and `online` to the memory block controller and
>>> bring that memory online in ZONE_NORMAL.
>>
>> Is it the expectation that the user will never want to change the zone from
>> MOVABLE to NORMAL? I can't think of a reason someone would want to off the top
>> of my head, but I also can't think of a reason to restrict it either.
>>
>
> It's more to restrict this pattern
>
> echo online_movable > region0/hotplug
> -> creates: node1/memory123/
>
> echo offline > node1/memory123/state
> echo online > node1/memory123/state
>
> The result of this would be valid_zones=[normal movable], which would
> break hot-unplug.
Ahh ok I think I get it now. I wasn't thinking about bypassing the memctrl/ interface
and using the memory block sysfs directly. Thanks for the explanation!
Thanks,
Ben
>
>>> If an actor attempts to online the block into ZONE_NORMAL, it will fail,
>>> but if it attempts to online into either NORMAL or MOVABLE, only MOVABLE
>>> will be allowed and it will succeed.
>>
>> I'm not sure you need this paragraph. I think it's a logical conclusion of the above
>> that if someone attempts to online the memory as NORMAL or MOVABLE it'll only be onlined
>> as MOVABLE.
>
> in the above situation the following occurs:
>
> echo online > region0/hotplug
> echo offline > node1/memory123/state
> echo online > node1/memory123/state
> cat node1/memory123/valid_zones
> normal movable
> echo offline > node1/memory123/state
> echo 1 > node1/memory123/online
> cat node1/memory123/valid_zones
> normal
>
>
> echo online_movable > region0/hotplug
> echo offline > node1/memory123/state
> echo online > node1/memory123/state
> cat node1/memory123/valid_zones
> movable
> echo offline > node1/memory123/state
> echo 1 > node1/memory123/online
> fail with EXXXX (i forget what code)
>
> It's a little confusing.
>
>>> + switch (data->last_online_type) {
>>> + case MMOP_ONLINE_MOVABLE:
>>> + return sysfs_emit(buf, "online\n");
>>> + case MMOP_ONLINE_KERNEL:
>>> + return sysfs_emit(buf, "online_normal\n");
>>> + case MMOP_OFFLINE:
>>> + default:
>>
>> You're missing the MMOP_ONLINE case. In that case the memory would be reported as "offline", which
>> I doubt is the intention.
>>
>
> Blah, i originally had all of them and just reduced to
> MMOP_ONLINE_MOVABLE and MMOP_ONLINE (i don't see a good use for
> MMOP_ONLINE_KERNEL), but i'll fix this up.
>
> Thanks!
> Gregory
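The two points discussed in this sub-thread (a state report that covers every online type, and a veto on onlining a movable-only region into ZONE_NORMAL) can be modelled in plain userspace C. This is only a sketch under stated assumptions: the `MMOP_*` names come from the quoted patch, but `enum zone_kind`, `sysram_state_name()`, `sysram_may_online()` and `sysram_example()` are hypothetical, and the state strings assume the memory-block vocabulary (`online`, `online_kernel`, `online_movable`) that Gregory says he is leaning towards, not the strings in the posted patch. In the kernel the veto would presumably live in the memory notifier callback that Patch 6 adds; that wiring is omitted here.

```c
#include <assert.h>
#include <string.h>

/* Names follow the MMOP_* constants quoted in the thread; the numeric
 * values are local to this sketch. */
enum mmop {
	MMOP_OFFLINE,
	MMOP_ONLINE,
	MMOP_ONLINE_KERNEL,
	MMOP_ONLINE_MOVABLE,
};

/* Hypothetical: the zone a block-level online request would land in. */
enum zone_kind {
	ZONE_KIND_NORMAL,
	ZONE_KIND_MOVABLE,
};

/* Report the last requested online type. Every enumerator has its own
 * case and there is no default label, so a compiler with -Wswitch can
 * flag a newly added type that was forgotten here. */
const char *sysram_state_name(enum mmop type)
{
	switch (type) {
	case MMOP_OFFLINE:
		return "offline";
	case MMOP_ONLINE:
		return "online";
	case MMOP_ONLINE_KERNEL:
		return "online_kernel";
	case MMOP_ONLINE_MOVABLE:
		return "online_movable";
	}
	return "unknown";
}

/* Return 1 if a block may be onlined into 'target', 0 to veto.
 * Only the movable-only configuration restricts anything. */
int sysram_may_online(enum mmop configured, enum zone_kind target)
{
	if (configured == MMOP_ONLINE_MOVABLE && target == ZONE_KIND_NORMAL)
		return 0;
	return 1;
}

/* Usage: the offline/online bypass described above is refused only
 * when the region was configured as movable-only. */
void sysram_example(void)
{
	assert(strcmp(sysram_state_name(MMOP_ONLINE), "online") == 0);
	assert(!sysram_may_online(MMOP_ONLINE_MOVABLE, ZONE_KIND_NORMAL));
	assert(sysram_may_online(MMOP_ONLINE_MOVABLE, ZONE_KIND_MOVABLE));
	assert(sysram_may_online(MMOP_ONLINE, ZONE_KIND_NORMAL));
}
```

Keeping the veto as a pure function of (configured type, target zone) makes the confusing cases from the transcript easy to enumerate in a test.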
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-13 18:00 ` Dave Jiang
2026-01-13 20:07 ` Gregory Price
@ 2026-01-14 16:36 ` dan.j.williams
1 sibling, 0 replies; 40+ messages in thread
From: dan.j.williams @ 2026-01-14 16:36 UTC (permalink / raw)
To: Dave Jiang, dan.j.williams, Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron,
alison.schofield, vishal.l.verma, ira.weiny,
Fabio M. De Francesco
Dave Jiang wrote:
[..]
> > Rest of this looks fine. With that fixup if we are going to have a set
> > of different region driver modes then the directory can be:
> >
> > drivers/cxl/core/region/
>
> Do we still have reasons to keep the region drivers in core? I know
> Fabio has been looking at moving the region drivers to drivers/cxl/ so
> the LMH cxl_test stuff doesn't need to do all the weird stuff
> to make it work. Maybe we just do the refactor now and move the region
> drivers outside of core. How about drivers/cxl/region/?
Not opposed to making it a module, just need to make sure that all
potential region drivers are loaded prior to the first
cxl_add_to_region() call. Otherwise, this breaks the expectation that
auto-assembly of regions present at boot can be flushed by
wait_for_device_probe(). Specifically, userspace module loading requests
are not flushed by this helper, only direct symbol dependencies.
It would be lovely if we had a unit test that specifically checked for
regressions like this, because the race can be difficult to see.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-12 16:35 ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
2026-01-12 20:59 ` dan.j.williams
2026-01-12 21:10 ` Cheatham, Benjamin
@ 2026-01-14 17:18 ` Jonathan Cameron
2026-01-14 18:25 ` Gregory Price
2 siblings, 1 reply; 40+ messages in thread
From: Jonathan Cameron @ 2026-01-14 17:18 UTC (permalink / raw)
To: Gregory Price
Cc: linux-cxl, linux-kernel, kernel-team, dave, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On Mon, 12 Jan 2026 11:35:09 -0500
Gregory Price <gourry@gourry.net> wrote:
> The CXL driver presently hands policy management over to the DAX
> subsystem for sysram regions, which makes building policy around the
> entire region clunky and at times difficult (e.g. multiple actions to
> offline and hot-unplug memory reliably).
>
> To support multiple backend controllers for memory regions (for example
> dax vs direct hotplug), implement a memctrl field in cxl_region that
> allows switching uncommitted regions between different "memory
> controllers".
>
> CXL_CONTROL_NONE: No selected controller, probe will fail.
> CXL_CONTROL_AUTO: If memory is already online as SysRAM, no controller;
> otherwise register a dax_region
> CXL_CONTROL_DAX : register a dax_region
>
> Auto regions will either be static sysram (BIOS-onlined) with no
> region controller associated - or, if the SP bit was set, a DAX
> device will be created.
>
> Rather than default all regions to the auto-controller, only default
> auto-regions to the auto controller.
>
> Non-auto regions will be defaulted to CXL_CONTROL_NONE, which will cause
> a failure to probe unless a controller is selected.
>
> Signed-off-by: Gregory Price <gourry@gourry.net>
Trivial comments whilst I try to get my head around the series.
...
> diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
> new file mode 100644
> index 000000000000..8165aad5a52a
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/Makefile
> @@ -0,0 +1,4 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
> diff --git a/drivers/cxl/core/memctrl/dax_region.c b/drivers/cxl/core/memctrl/dax_region.c
> new file mode 100644
> index 000000000000..90d7fdb97013
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/dax_region.c
Can we break the code move out as a precursor with a bit of explanation of
why? I want to be able to spot functional changes more easily.
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +static struct lock_class_key cxl_dax_region_key;
> +
> +/*
> + * The dax controller is the default controller and simply hands the
> + * control pattern over to the dax driver. It does with a dax_region
> + * built by dax/cxl.c
> + */
> +int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> +{
> + struct cxl_dax_region *cxlr_dax;
> + struct device *dev;
> + int rc;
> +
> + cxlr_dax = cxl_dax_region_alloc(cxlr);
It would be nice to sprinkle some __free magic on this one, though obviously
unrelated to what you are focused on here!
> + if (IS_ERR(cxlr_dax))
> + return PTR_ERR(cxlr_dax);
> +
> + dev = &cxlr_dax->dev;
> + rc = dev_set_name(dev, "dax_region%d", cxlr->id);
> + if (rc)
> + goto err;
> +
> + rc = device_add(dev);
> + if (rc)
> + goto err;
> +
> + dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
> + dev_name(dev));
> +
> + return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
> + cxlr_dax);
> +err:
> + put_device(dev);
> + return rc;
> +}
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> new file mode 100644
> index 000000000000..24e0e14b39c7
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/memctrl.c
...
> +int cxl_enable_memctrl(struct cxl_region *cxlr)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> +
> + switch (cxlr->memctrl) {
> + case CXL_MEMCTRL_AUTO:
> + /*
> + * The region can not be managed by CXL if any portion of
> + * it is already online as 'System RAM'
> + */
> + if (walk_iomem_res_desc(IORES_DESC_NONE,
> + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> + p->res->start, p->res->end, cxlr,
> + is_system_ram) > 0)
> + return 0;
> + return devm_cxl_add_dax_region(cxlr);
> + case CXL_MEMCTRL_DAX:
> + return devm_cxl_add_dax_region(cxlr);
> + default:
> + return -EINVAL;
> + }
> +}
> +
I can't resist pointing out the trivial: one blank line is sufficient.
> +
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index ae899f68551f..02d7d9ae0252 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -626,6 +626,50 @@ static ssize_t mode_show(struct device *dev, struct device_attribute *attr,
> }
> static DEVICE_ATTR_RO(mode);
>
> +static ssize_t ctrl_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct cxl_region *cxlr = to_cxl_region(dev);
> + const char *desc;
> +
> + switch (cxlr->memctrl) {
> + case CXL_MEMCTRL_AUTO:
> + desc = "auto";
If there is any expectation of a non-trivial number of these, look them up in an
array rather than a switch statement. It'll scale better.
> + break;
> + case CXL_MEMCTRL_DAX:
> + desc = "dax";
> + break;
> + default:
I'd call this out as an explicit NONE case rather than default, simply because
then the compiler will point out if you missed adding code here when a new type
is added.
> + desc = "";
> + break;
> + }
> +
> + return sysfs_emit(buf, "%s\n", desc);
> +}
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index ba17fa86d249..b8fabaa77262 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -502,6 +502,19 @@ enum cxl_partition_mode {
> CXL_PARTMODE_PMEM,
> };
>
> +
> +/*
> + * Memory Controller modes:
Give it kernel-doc.
> + * None - No controller selected
> + * Auto - either BIOS-configured as SysRAM, or default to DAX
> + * DAX - creates a dax_region controller for the cxl_region
> + */
> +enum cxl_memctrl_mode {
> + CXL_MEMCTRL_NONE,
> + CXL_MEMCTRL_AUTO,
> + CXL_MEMCTRL_DAX,
> +};
> +
> /*
> * Indicate whether this region has been assembled by autodetection or
> * userspace assembly. Prevent endpoint decoders outside of automatic
> @@ -543,6 +556,7 @@ struct cxl_region {
> struct device dev;
> int id;
> enum cxl_partition_mode mode;
> + enum cxl_memctrl_mode memctrl;
Needs kernel-doc.
> enum cxl_decoder_type type;
> struct cxl_nvdimm_bridge *cxl_nvb;
> struct cxl_pmem_region *cxlr_pmem;
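As an illustration of the array-lookup suggestion above, here is a minimal userspace sketch. The enum mirrors the one quoted from the patch; `CXL_MEMCTRL_COUNT`, `memctrl_name()` and `memctrl_parse()` are hypothetical names introduced only for this example, and the empty string for NONE simply copies what the quoted `ctrl_show()` emits. It is not taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the enum quoted from the patch. CXL_MEMCTRL_COUNT is a
 * hypothetical sentinel added only for this sketch. */
enum cxl_memctrl_mode {
	CXL_MEMCTRL_NONE,
	CXL_MEMCTRL_AUTO,
	CXL_MEMCTRL_DAX,
	CXL_MEMCTRL_COUNT,
};

/* One table entry per mode, indexed by the enum value. */
static const char * const memctrl_names[] = {
	[CXL_MEMCTRL_NONE] = "",
	[CXL_MEMCTRL_AUTO] = "auto",
	[CXL_MEMCTRL_DAX]  = "dax",
};

/* The table is deliberately unsized: adding a mode at the end of the
 * enum without adding its name here fails the build. */
static_assert(sizeof(memctrl_names) / sizeof(memctrl_names[0]) ==
	      CXL_MEMCTRL_COUNT, "memctrl_names is missing an entry");

/* Mode to string, as a show handler would need it. */
const char *memctrl_name(enum cxl_memctrl_mode mode)
{
	if ((unsigned int)mode >= CXL_MEMCTRL_COUNT)
		return "";
	return memctrl_names[mode];
}

/* String to mode, as a store handler would need it. NONE is skipped
 * because its name is empty. Returns -1 for an unknown name. */
int memctrl_parse(const char *s)
{
	for (int i = CXL_MEMCTRL_AUTO; i < CXL_MEMCTRL_COUNT; i++)
		if (strcmp(s, memctrl_names[i]) == 0)
			return i;
	return -1;
}
```

With a table, the show and store paths share one source of truth, and the build-time size check plays the same role as the compiler warning Jonathan wants from an explicit NONE case in a switch.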
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-14 17:18 ` Jonathan Cameron
@ 2026-01-14 18:25 ` Gregory Price
2026-01-14 18:36 ` Jonathan Cameron
0 siblings, 1 reply; 40+ messages in thread
From: Gregory Price @ 2026-01-14 18:25 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-cxl, linux-kernel, kernel-team, dave, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On Wed, Jan 14, 2026 at 05:18:08PM +0000, Jonathan Cameron wrote:
> On Mon, 12 Jan 2026 11:35:09 -0500
> Gregory Price <gourry@gourry.net> wrote:
>
> > The CXL driver presently hands policy management over to the DAX
> > subsystem for sysram regions, which makes building policy around the
> > entire region clunky and at times difficult (e.g. multiple actions to
> > offline and hot-unplug memory reliably).
> >
> > To support multiple backend controllers for memory regions (for example
> > dax vs direct hotplug), implement a memctrl field in cxl_region that
> > allows switching uncommitted regions between different "memory
> > controllers".
> >
> > CXL_CONTROL_NONE: No selected controller, probe will fail.
> > CXL_CONTROL_AUTO: If memory is already online as SysRAM, no controller;
> > otherwise register a dax_region
> > CXL_CONTROL_DAX : register a dax_region
> >
> > Auto regions will either be static sysram (BIOS-onlined) with no
> > region controller associated - or, if the SP bit was set, a DAX
> > device will be created.
> >
> > Rather than default all regions to the auto-controller, only default
> > auto-regions to the auto controller.
> >
> > Non-auto regions will be defaulted to CXL_CONTROL_NONE, which will cause
> > a failure to probe unless a controller is selected.
> >
> > Signed-off-by: Gregory Price <gourry@gourry.net>
> Trivial comments whilst I try to get my head around the series.
>
> ...
>
Recommend pivoting to v2, which simplifies and addresses some of your
notes here already:
https://lore.kernel.org/linux-cxl/aWfe-r7uEV-ajfhX@gourry-fedora-PF4VCD3F/T/#m791301178876f9b1ab55ed4091c674d4f4ceb07c
Still need to add some docs.
~Gregory
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl
2026-01-14 18:25 ` Gregory Price
@ 2026-01-14 18:36 ` Jonathan Cameron
0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2026-01-14 18:36 UTC (permalink / raw)
To: Gregory Price
Cc: linux-cxl, linux-kernel, kernel-team, dave, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
On Wed, 14 Jan 2026 13:25:53 -0500
Gregory Price <gourry@gourry.net> wrote:
> On Wed, Jan 14, 2026 at 05:18:08PM +0000, Jonathan Cameron wrote:
> > On Mon, 12 Jan 2026 11:35:09 -0500
> > Gregory Price <gourry@gourry.net> wrote:
> >
> > > The CXL driver presently hands policy management over to the DAX
> > > subsystem for sysram regions, which makes building policy around the
> > > entire region clunky and at times difficult (e.g. multiple actions to
> > > offline and hot-unplug memory reliably).
> > >
> > > To support multiple backend controllers for memory regions (for example
> > > dax vs direct hotplug), implement a memctrl field in cxl_region that
> > > allows switching uncommitted regions between different "memory
> > > controllers".
> > >
> > > CXL_CONTROL_NONE: No selected controller, probe will fail.
> > > CXL_CONTROL_AUTO: If memory is already online as SysRAM, no controller;
> > > otherwise register a dax_region
> > > CXL_CONTROL_DAX : register a dax_region
> > >
> > > Auto regions will either be static sysram (BIOS-onlined) with no
> > > region controller associated - or, if the SP bit was set, a DAX
> > > device will be created.
> > >
> > > Rather than default all regions to the auto-controller, only default
> > > auto-regions to the auto controller.
> > >
> > > Non-auto regions will be defaulted to CXL_CONTROL_NONE, which will cause
> > > a failure to probe unless a controller is selected.
> > >
> > > Signed-off-by: Gregory Price <gourry@gourry.net>
> > Trivial comments whilst I try to get my head around the series.
> >
> > ...
> >
>
> Recommend pivoting to v2, which simplifies and addresses some of your
> notes here already:
>
> https://lore.kernel.org/linux-cxl/aWfe-r7uEV-ajfhX@gourry-fedora-PF4VCD3F/T/#m791301178876f9b1ab55ed4091c674d4f4ceb07c
>
> Still need to add some docs.
That came cover letter free so I didn't figure out it was this
until a few patches in!
J
>
> ~Gregory
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
` (6 preceding siblings ...)
2026-01-13 9:37 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Neeraj Kumar
@ 2026-01-15 18:43 ` Alejandro Lucero Palau
2026-01-15 18:56 ` Gregory Price
7 siblings, 1 reply; 40+ messages in thread
From: Alejandro Lucero Palau @ 2026-01-15 18:43 UTC (permalink / raw)
To: Gregory Price, linux-cxl
Cc: linux-kernel, kernel-team, dave, jonathan.cameron, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
Hi Gregory,
I was concerned with how this could affect Type2 but I think there is no
issue at all, but I prefer to ask specifically about it.
Type2 can obtain an auto region if BIOS enabled/configured the HDMs, or
it can create one on purpose if not. The Type2 patchset does not allow
creating a dax region during region probing, and the same Type2 check
will preclude the call that enables the sysram controller.
However, I can see the region will have default sysfs files for setting
the controller. I think even with such a change there is no way for
enabling the controller from the type2 region probing, so I guess it is
safe, but I would prefer to not allow a Type2 region setting a
controller at all.
I like the approach for solving the problem pointed out, and I think
something similar or a controller extension for type2 could be needed in
the future, but maybe adding more flexibility for theoretical per type2
driver memory-handling uniqueness.
Thank you!
On 1/12/26 16:35, Gregory Price wrote:
> The CXL driver currently hands policy management over to the DAX
> subsystem for sysram regions. This makes building policy around
> entire regions clunky and at times difficult - for example, requiring
> multiple actions to reliably offline and hot-unplug memory.
>
> This series introduces a memory controller abstraction for CXL regions
> and adds a "sysram" controller that directly hotplugs memory without
> needing to route through DAX. This simplifies the sysram use case
> considerably.
>
> This also prepares for future use cases which may require different
> memory controller logic (such as private numa nodes).
>
> We organize the controllers into core/memctrl/*_region.c files.
>
> The series is organized as follows:
>
> Patch 1 introduces the cxl_memctrl_mode enum and region->memctrl field,
> allowing regions to be switched between different memory controllers.
> The supported modes are NONE, AUTO, and DAX initially. Auto-created
> regions default to AUTO, while manually created regions default to NONE
> (requiring explicit controller selection).
>
> Patch 2 adds the sysram_region memory controller, which provides direct
> memory hotplug without DAX intermediation. New sysfs controls are
> exposed under region/memctrl/:
> - hotplug: trigger memory hotplug
> - hotunplug: offline and hotunplug memory
> - state: online/online_normal/offline
>
> Patch 3 refactors existing pmem memctrl logic out of region.c into the
> new memctrl/pmem_region.c, simplifying controller selection in region
> probe.
>
> Patch 4 adds CONFIG_CXL_REGION_CTRL_AUTO_* options, allowing users to
> configure auto-regions to default to SYSRAM instead of DAX for existing
> simple system configurations (i.e. local memory expansion only).
>
> Patch 5 adds CONFIG_CXL_REGION_SYSRAM_DEFAULT_* options to control the
> default state of sysram blocks (OFFLINE, ONLINE/ZONE_MOVABLE, or
> ONLINE_NORMAL/ZONE_NORMAL). This provides an alternative to the global
> MHP auto-online setting which may cause issues with other devices.
>
> Online defaults to ZONE_MOVABLE to defend hot-unplug by default.
> This is the opposite of memory blocks "online" and "online_movable".
>
> Patch 6 adds a memory_notify callback that prevents memory blocks from
> being onlined into ZONE_NORMAL when the controller state is set to
> ZONE_MOVABLE. This protects against administrators accidentally
> breaking hot-unpluggability by writing "offline" then "online" to the
> memory block sysfs.
>
> Gregory Price (6):
> drivers/cxl: add cxl_memctrl_mode and region->memctrl
> cxl: add sysram_region memory controller
> cxl/core/region: move pmem memctrl logic into memctrl/pmem_region
> cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options
> cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options
> cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only
>
> drivers/cxl/Kconfig | 72 ++++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/core.h | 5 +
> drivers/cxl/core/memctrl/Makefile | 6 +
> drivers/cxl/core/memctrl/dax_region.c | 79 ++++
> drivers/cxl/core/memctrl/memctrl.c | 48 +++
> drivers/cxl/core/memctrl/pmem_region.c | 191 +++++++++
> drivers/cxl/core/memctrl/sysram_region.c | 520 +++++++++++++++++++++++
> drivers/cxl/core/region.c | 358 ++++------------
> drivers/cxl/cxl.h | 18 +
> 10 files changed, 1013 insertions(+), 285 deletions(-)
> create mode 100644 drivers/cxl/core/memctrl/Makefile
> create mode 100644 drivers/cxl/core/memctrl/dax_region.c
> create mode 100644 drivers/cxl/core/memctrl/memctrl.c
> create mode 100644 drivers/cxl/core/memctrl/pmem_region.c
> create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller
2026-01-15 18:43 ` Alejandro Lucero Palau
@ 2026-01-15 18:56 ` Gregory Price
0 siblings, 0 replies; 40+ messages in thread
From: Gregory Price @ 2026-01-15 18:56 UTC (permalink / raw)
To: Alejandro Lucero Palau
Cc: linux-cxl, linux-kernel, kernel-team, dave, jonathan.cameron,
dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams
On Thu, Jan 15, 2026 at 06:43:08PM +0000, Alejandro Lucero Palau wrote:
> Hi Gregory,
>
>
> I was concerned with how this could affect Type2 but I think there is no
> issue at all, but I prefer to ask specifically about it.
>
>
> Type2 can obtain an auto region if BIOS enabled/configured the HDMs, or it
> can create one on purpose if not. The Type2 patchset does not allow creating
> a dax region during region probing, and the same Type2 check will preclude
> the call that enables the sysram controller.
>
>
> However, I can see the region will have default sysfs files for setting the
> controller. I think even with such a change there is no way for enabling the
> controller from the type2 region probing, so I guess it is safe, but I would
> prefer to not allow a Type2 region setting a controller at all.
>
>
> I like the approach for solving the problem pointed out, and I think
> something similar or a controller extension for type2 could be needed in the
> future, but maybe adding more flexibility for theoretical per type2 driver
> memory-handling uniqueness.
>
>
(pre-note: we changed the verbiage from controller to driver)
I think Type2 devices (and some special memory devices) are exactly
the use case that drives formalizing region-drivers.
Some Type2 devices might just register a normal memory region.
Some Type2 devices might want a dax device.
Some Type2 devices might want a "private memory" region (private node).
Some Type2 devices might have a completely different usage pattern.
We might want a control that limits which drivers a given device
can use (limiting switch-ability), and let devices inform the core
region driver of that in some way.
The sysfs toggle is a pressure-valve for devices that might be configurable
in multiple ways. Maybe it won't actually be needed, but at least for
"Regular Memory" devices I can see us having dax and sysram drivers at a
minimum.
The base sysram driver may even be the basis for a dcd_sysram driver.
After all it's just a sysram driver w/ add/remove extent functions :]
~Gregory
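One way to picture the "limit which drivers a given device can use" idea is a per-region bitmask consulted before the driver selection is changed. The sketch below is a userspace model under stated assumptions: the `CXL_DRIVER_{NONE,AUTO,DAX}` names follow Dan's proposed rename earlier in the thread, while `CXL_DRIVER_SYSRAM`, `CXL_DRIVER_COUNT`, `CXL_DRIVER_BIT()`, `struct region_model` and `region_set_driver()` are hypothetical, as are the chosen error codes. Nothing here is claimed to match what v2 of the series does.

```c
#include <assert.h>
#include <errno.h>

/* NONE/AUTO/DAX follow the rename proposed in the thread; SYSRAM and
 * COUNT are hypothetical additions for this sketch. */
enum cxl_region_driver {
	CXL_DRIVER_NONE,
	CXL_DRIVER_AUTO,
	CXL_DRIVER_DAX,
	CXL_DRIVER_SYSRAM,
	CXL_DRIVER_COUNT,
};

#define CXL_DRIVER_BIT(d) (1u << (d))

static_assert(CXL_DRIVER_COUNT <= 32, "driver mask must fit in 32 bits");

/* Hypothetical stand-in for the relevant part of a region. */
struct region_model {
	enum cxl_region_driver driver;	/* currently selected driver */
	unsigned int allowed;		/* mask supplied by the device */
};

/* Select a region driver, honouring the device-provided mask.
 * Returns 0 on success, -EINVAL for an out-of-range value and
 * -EPERM when the device did not allow this driver. */
int region_set_driver(struct region_model *r, enum cxl_region_driver d)
{
	if ((unsigned int)d >= CXL_DRIVER_COUNT)
		return -EINVAL;
	if (!(r->allowed & CXL_DRIVER_BIT(d)))
		return -EPERM;
	r->driver = d;
	return 0;
}
```

Under this model a Type2 device that wants no switchable driver at all would pass a mask containing only `CXL_DRIVER_BIT(CXL_DRIVER_NONE)`, which addresses Alejandro's preference, while a regular memory device could allow both dax and sysram.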
^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread (newest message: 2026-01-15 18:56 UTC)
Thread overview: 40+ messages
[not found] <CGME20260113093758epcas5p10cc9749a657b8e4d32db75b8b973b67d@epcas5p1.samsung.com>
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
2026-01-12 16:35 ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
2026-01-12 20:59 ` dan.j.williams
2026-01-12 22:25 ` Gregory Price
2026-01-13 18:00 ` Dave Jiang
2026-01-13 20:07 ` Gregory Price
2026-01-14 16:36 ` dan.j.williams
2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 22:34 ` Gregory Price
2026-01-14 17:18 ` Jonathan Cameron
2026-01-14 18:25 ` Gregory Price
2026-01-14 18:36 ` Jonathan Cameron
2026-01-12 16:35 ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
2026-01-12 20:00 ` David Hildenbrand (Red Hat)
2026-01-12 22:43 ` Gregory Price
2026-01-12 21:10 ` dan.j.williams
2026-01-12 22:47 ` Gregory Price
2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 22:55 ` Gregory Price
2026-01-13 22:34 ` Cheatham, Benjamin
2026-01-12 16:35 ` [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region Gregory Price
2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 22:58 ` Gregory Price
2026-01-13 9:12 ` Neeraj Kumar
2026-01-12 16:35 ` [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options Gregory Price
2026-01-12 21:10 ` Cheatham, Benjamin
2026-01-12 23:05 ` Gregory Price
2026-01-13 4:31 ` dan.j.williams
2026-01-13 13:55 ` Gregory Price
2026-01-12 16:35 ` [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options Gregory Price
2026-01-12 21:11 ` Cheatham, Benjamin
2026-01-12 23:07 ` Gregory Price
2026-01-12 16:35 ` [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only Gregory Price
2026-01-12 21:11 ` Cheatham, Benjamin
2026-01-12 23:14 ` Gregory Price
2026-01-13 22:35 ` Cheatham, Benjamin
2026-01-13 9:37 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Neeraj Kumar
2026-01-13 13:33 ` Gregory Price
2026-01-15 18:43 ` Alejandro Lucero Palau
2026-01-15 18:56 ` Gregory Price