[RFC PATCH 0/3] Add Support for Multiple DC Regions

Linux CXL
 help / color / mirror / Atom feed

* [RFC PATCH 0/3] Add Support for Multiple DC Regions
@ 2025-12-03 20:29 anisa.su887
  2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Anisa Su

From: Anisa Su <anisa.su@samsung.com>

This patchset introduces support for multiple DC regions. It is rebased on top
of the latest branch published to Ira's repository:
https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23.
We hope it will be useful in the meantime for others and restart some
discussion around how to move DCD forward.

The corresponding NDCTL support can be found on this branch:
https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support.
I will reply to this thread with a reference to the thread for the
NDCTL patches once published.

Testing:
This patchset was tested on a QEMU VM with the following topology:

PCIE Root (pcie.0)
│
├─ CXL Fixed Memory Window cxl-fmw.0
├─ CXL Root Complex cxl.0
│  └─ Root Port root_port1
│     └─ CXL Type-3 Device cxl-dcd0
│
├─ CXL Fixed Memory Window cxl-fmw.1
├─ CXL Root Complex cxl.1
│  └─ Root Port root_port2
│     └─ CXL Type-3 Device cxl-dcd1
└─

"-object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/t3_cxl1.raw,size=8G \
-object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/t3_lsa1.raw,size=1M \
-object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/t3_cxl2.raw,size=8G \
-object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/t3_lsa2.raw,size=1M \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.0,hdm_for_passthrough=true \
-device pxb-cxl,bus_nr=48,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
-device cxl-rp,port=0,bus=cxl.0,id=root_port1,chassis=0,slot=1 \
-device cxl-rp,port=1,bus=cxl.1,id=root_port2,chassis=1,slot=1 \
-device cxl-type3,bus=root_port1,volatile-dc-memdev=cxl-mem1,id=cxl-dcd0,lsa=cxl-lsa1,num-dc-regions=8,sn=99 \
-device cxl-type3,bus=root_port2,volatile-dc-memdev=cxl-mem2,id=cxl-dcd1,lsa=cxl-lsa2,num-dc-regions=8,sn=100 \
-machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=8G,cxl-fmw.1.targets.0=cxl.1,cxl-fmw.1.size=8G"

2 CFMWs and 2 root complexes are emulated because QEMU creates
4 decoders/topology level. With 1 root complex, there are only 4 upstream
decoders. Therefore in order to create 4+ regions, we need a total of
8 upstream decoders. This does mean that we are only able to create
4 regions on each device, although up to 8 are supported.

Using `cxl list`, we can see mem0 and mem1 have dynamic_ram_* capablities:
root@deb-101020-bm01:~# cxl list
[
  {
    "memdevs":[
      {
        "memdev":"mem0",
        "dynamic_ram_0_size":1073741824,
        "dynamic_ram_1_size":1073741824,
        "dynamic_ram_2_size":1073741824,
        "dynamic_ram_3_size":1073741824,
        "dynamic_ram_4_size":1073741824,
        "dynamic_ram_5_size":1073741824,
        "dynamic_ram_6_size":1073741824,
        "dynamic_ram_7_size":1073741824,
        "serial":100,
        "host":"0000:31:00.0",
        "firmware_version":"BWFW VERSION 00"
      },
      {
        "memdev":"mem1",
        "dynamic_ram_0_size":1073741824,
        "dynamic_ram_1_size":1073741824,
        "dynamic_ram_2_size":1073741824,
        "dynamic_ram_3_size":1073741824,
        "dynamic_ram_4_size":1073741824,
        "dynamic_ram_5_size":1073741824,
        "dynamic_ram_6_size":1073741824,
        "dynamic_ram_7_size":1073741824,
        "serial":99,
        "host":"0000:0d:00.0",
        "firmware_version":"BWFW VERSION 00"
      }
    ]
  }
]

To create the 8 regions:
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_0
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_1
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_2
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_3

cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_4
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_5
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_6
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_7


We can verify the 8 regions:
root@deb-101020-bm01:~# cxl list
[
  {
    "memdevs":[
...
  },
  {
    "regions":[
      {
        "region":"region0",
        "resource":79993765888,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region6",
        "resource":81067507712,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region7",
        "resource":82141249536,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region8",
        "resource":83214991360,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region1",
        "resource":88315265024,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region2",
        "resource":89389006848,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region3",
        "resource":90462748672,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region4",
        "resource":91536490496,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      }
    ]
  }
]

Extents of various sizes (128MB, 256MB, 512MB, and 1GB) are added from mem1,
which correspond to regions 0-3, then DAX devices are created from them.
The extent DPAs are as follows, which allows each one to map to a distinct
region:
 - [0-128] --> region0
 - [1024-1280] --> region1
 - [2048-2560] --> region2
 - [3072-4096] --> region3

The correct sizes can be verified when creating the DAX device.
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region0
[
  {
    "chardev":"dax0.1",
    "size":134217728,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region1
[
  {
    "chardev":"dax1.1",
    "size":268435456,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region2
[
  {
    "chardev":"dax2.1",
    "size":536870912,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region3
[
  {
    "chardev":"dax3.1",
    "size":1073741824,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device

Then the DAX devices are reconfigured to system-ram mode and verified with lsmem.
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax0.1 -m system-ram
[
  {
    "chardev":"dax0.1",
    "size":134217728,
    "target_node":1,
    "align":2097152,
    "mode":"system-ram",
    "online_memblocks":1,
    "total_memblocks":1,
    "movable":true
  }
]
reconfigured 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax1.1 -m system-ram
...
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax2.1 -m system-ram
...
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax3.1 -m system-ram
...


root@deb-101020-bm01:~/libcxlmi# lsmem
RANGE                                  SIZE  STATE REMOVABLE   BLOCK
0x0000000000000000-0x000000007fffffff    2G online       yes    0-15
0x0000000100000000-0x000000027fffffff    6G online       yes   32-79
0x00000012a0000000-0x00000012a7ffffff  128M online       yes     596
0x00000012e0000000-0x00000012efffffff  256M online       yes 604-605
0x0000001320000000-0x000000133fffffff  512M online       yes 612-615
0x0000001360000000-0x000000139fffffff    1G online       yes 620-627

Memory block size:                128M
Total online memory:              9.9G
Total offline memory:               0B

-------------------------------------------------------------------------------
Note: I did try hacking QEMU to create 8 decoders at each level to avoid having
2 separate host bridges/DCDs by modifying include/hw/cxl/cxl_component.h like so:

#define CXL_HDM_DECODER_COUNT 8
HDM_DECODER_INIT(0);
HDM_DECODER_INIT(1);
HDM_DECODER_INIT(2);
HDM_DECODER_INIT(3);
HDM_DECODER_INIT(4);
HDM_DECODER_INIT(5);
HDM_DECODER_INIT(6);
HDM_DECODER_INIT(7);

However, when attempting to create the 5th cxl region,
I ran into a timeout error when committing the decoders.
Did not spend much time pursuing this further, most likely
need to change more things on the QEMU side.
But the 8 decoders do show up correctly under sysfs.

Fan Ni (3):
  core/region: fix return logic for store_targetN
  dax/cxl: add existing dc extents when probing dax region
  dcd: Add support for multiple DC regions

 drivers/cxl/core/cdat.c   |   2 +-
 drivers/cxl/core/core.h   |   9 +-
 drivers/cxl/core/extent.c |   2 +-
 drivers/cxl/core/hdm.c    |  18 +++-
 drivers/cxl/core/mbox.c   |  39 +++++----
 drivers/cxl/core/memdev.c | 179 +++++++++++++++++++++++++-------------
 drivers/cxl/core/port.c   |  45 ++++++++--
 drivers/cxl/core/region.c |  65 ++++++++------
 drivers/cxl/cxl.h         |  23 ++++-
 drivers/cxl/cxlmem.h      |   5 +-
 drivers/dax/cxl.c         |  28 ++----
 11 files changed, 281 insertions(+), 134 deletions(-)

-- 
2.51.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 1/3] core/region: fix return logic for store_targetN
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
@ 2025-12-03 20:29 ` anisa.su887
  2025-12-04 17:04   ` Ira Weiny
  2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

From: Fan Ni <fan.ni@samsung.com>

Currently, store_targetN attempts to attach_target even if an error
(cxlr->mode is incorrect, DCD is unsupported). Add a goto out statement
to skip any attempt to attach the target on error.

Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
Tested-by: Anisa Su <anisa.su@samsung.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
---
 drivers/cxl/core/region.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index a93979dd345d..adab1f338ee9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2258,7 +2258,8 @@ static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos,
 		if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A &&
 		    !cxl_dcd_supported(cxled_to_mds(cxled))) {
 			dev_dbg(dev, "DCD unsupported\n");
-			return -EINVAL;
+			rc = -EINVAL;
+			goto out;
 		}
 		rc = attach_target(cxlr, cxled, pos, TASK_INTERRUPTIBLE);
 out:
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/3] core/region: fix return logic for store_targetN
  2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
@ 2025-12-04 17:04   ` Ira Weiny
  0 siblings, 0 replies; 19+ messages in thread
From: Ira Weiny @ 2025-12-04 17:04 UTC (permalink / raw)
  To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

anisa.su887@ wrote:
> From: Fan Ni <fan.ni@samsung.com>
> 
> Currently, store_targetN attempts to attach_target even if an error

I think you mean the endpoint decoder device reference is not released,
correct?

> (cxlr->mode is incorrect, DCD is unsupported). Add a goto out statement
> to skip any attempt to attach the target on error.
> 

With an updated commit message:

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
  2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
@ 2025-12-03 20:29 ` anisa.su887
  2025-12-03 21:03   ` Anisa Su
  2025-12-04 17:29   ` Ira Weiny
  2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

From: Fan Ni <fan.ni@samsung.com>

Add existing dc extents on the device before probing dax region will
cause the creation of the dax device fail as resource cannot present
when driver is bound to the device as shown in really_probe().

We delay the processing of existing dc extents to cxl region driver
probe.

Question: the guard() in cxlr_notify_extent() will cause lock issue,
removed it. Not sure whether it will cause issue or not although no
issue is observed during test.

Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
Tested-by: Anisa Su <anisa.su@samsung.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
---
 drivers/cxl/core/extent.c |  2 +-
 drivers/cxl/core/region.c |  8 ++------
 drivers/cxl/cxl.h         |  5 +++++
 drivers/dax/cxl.c         | 24 +++++++-----------------
 4 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
index 3e7295d3e5e2..3b0e4d72d4ac 100644
--- a/drivers/cxl/core/extent.c
+++ b/drivers/cxl/core/extent.c
@@ -285,7 +285,7 @@ static int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event,
 	dev_dbg(dev, "Trying notify: type %d HPA %pra\n", event,
 		&region_extent->hpa_range);
 
-	guard(device)(dev);
+	// guard(device)(dev);
 
 	/*
 	 * The lack of a driver indicates a notification has failed.  No user
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index adab1f338ee9..da3ea3cf8585 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3232,7 +3232,7 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
 	return rc;
 }
 
-static int cxlr_add_existing_extents(struct cxl_region *cxlr)
+int cxlr_add_existing_extents(struct cxl_region *cxlr)
 {
 	struct cxl_region_params *p = &cxlr->params;
 	int i, latched_rc = 0;
@@ -3251,6 +3251,7 @@ static int cxlr_add_existing_extents(struct cxl_region *cxlr)
 
 	return latched_rc;
 }
+EXPORT_SYMBOL_NS_GPL(cxlr_add_existing_extents, "CXL");
 
 static void cxlr_dax_unregister(void *_cxlr_dax)
 {
@@ -3287,11 +3288,6 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 	dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
 		dev_name(dev));
 
-	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A)
-		if (cxlr_add_existing_extents(cxlr))
-			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
-				rc);
-
 	return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
 					cxlr_dax);
 err:
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index d22fe5e50647..3e400dd4f08b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -926,6 +926,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
 int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
 struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
 u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
+int cxlr_add_existing_extents(struct cxl_region *cxlr);
 #else
 static inline bool is_cxl_pmem_region(struct device *dev)
 {
@@ -948,6 +949,10 @@ static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
 {
 	return 0;
 }
+int cxlr_add_existing_extents(struct cxl_region *cxlr)
+{
+	return 0;
+}
 #endif
 
 void cxl_endpoint_parse_cdat(struct cxl_port *port);
diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
index 011bd1dc7691..15fc2de63185 100644
--- a/drivers/dax/cxl.c
+++ b/drivers/dax/cxl.c
@@ -18,21 +18,6 @@ static int __cxl_dax_add_resource(struct dax_region *dax_region,
 	return dax_region_add_resource(dax_region, dev, start, length);
 }
 
-static int cxl_dax_add_resource(struct device *dev, void *data)
-{
-	struct dax_region *dax_region = data;
-	struct region_extent *region_extent;
-
-	region_extent = to_region_extent(dev);
-	if (!region_extent)
-		return 0;
-
-	dev_dbg(dax_region->dev, "Adding resource HPA %pra\n",
-		&region_extent->hpa_range);
-
-	return __cxl_dax_add_resource(dax_region, region_extent);
-}
-
 static int cxl_dax_region_notify(struct device *dev,
 				 struct cxl_notify_data *notify_data)
 {
@@ -66,6 +51,7 @@ static int cxl_dax_region_probe(struct device *dev)
 	struct dev_dax_data data;
 	resource_size_t dev_size;
 	unsigned long flags;
+	int rc;
 
 	if (nid == NUMA_NO_NODE)
 		nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start);
@@ -80,8 +66,12 @@ static int cxl_dax_region_probe(struct device *dev)
 		return -ENOMEM;
 
 	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) {
-		device_for_each_child(&cxlr_dax->dev, dax_region,
-				      cxl_dax_add_resource);
+		rc = cxlr_add_existing_extents(cxlr);
+		/* If adding existing extents fails, continue with only an error
+		 * message ?? */
+		if (rc)
+			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
+				rc);
 		/* Add empty seed dax device */
 		dev_size = 0;
 	} else {
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region
  2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
@ 2025-12-03 21:03   ` Anisa Su
  2025-12-04 17:29   ` Ira Weiny
  1 sibling, 0 replies; 19+ messages in thread
From: Anisa Su @ 2025-12-03 21:03 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ira.weiny, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Wed, Dec 03, 2025 at 08:29:12PM +0000, anisa.su887@gmail.com wrote:
> From: Fan Ni <fan.ni@samsung.com>
> 
> Add existing dc extents on the device before probing dax region will
> cause the creation of the dax device fail as resource cannot present
> when driver is bound to the device as shown in really_probe().
> 
> We delay the processing of existing dc extents to cxl region driver
> probe.
> 
> Question: the guard() in cxlr_notify_extent() will cause lock issue,
> removed it. Not sure whether it will cause issue or not although no
> issue is observed during test.
> 
Hi Fan, I added the guard() back in when I was testing  and did not run into
any issues when.

Do you recall where you ran into the locking issue?

- Anisa
> Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
> Tested-by: Anisa Su <anisa.su@samsung.com>
> Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
> ---
>  drivers/cxl/core/extent.c |  2 +-
>  drivers/cxl/core/region.c |  8 ++------
>  drivers/cxl/cxl.h         |  5 +++++
>  drivers/dax/cxl.c         | 24 +++++++-----------------
>  4 files changed, 15 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> index 3e7295d3e5e2..3b0e4d72d4ac 100644
> --- a/drivers/cxl/core/extent.c
> +++ b/drivers/cxl/core/extent.c
> @@ -285,7 +285,7 @@ static int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event,
>  	dev_dbg(dev, "Trying notify: type %d HPA %pra\n", event,
>  		&region_extent->hpa_range);
>  
> -	guard(device)(dev);
> +	// guard(device)(dev);
>  
>  	/*
>  	 * The lack of a driver indicates a notification has failed.  No user
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index adab1f338ee9..da3ea3cf8585 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3232,7 +3232,7 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
>  	return rc;
>  }
>  
> -static int cxlr_add_existing_extents(struct cxl_region *cxlr)
> +int cxlr_add_existing_extents(struct cxl_region *cxlr)
>  {
>  	struct cxl_region_params *p = &cxlr->params;
>  	int i, latched_rc = 0;
> @@ -3251,6 +3251,7 @@ static int cxlr_add_existing_extents(struct cxl_region *cxlr)
>  
>  	return latched_rc;
>  }
> +EXPORT_SYMBOL_NS_GPL(cxlr_add_existing_extents, "CXL");
>  
>  static void cxlr_dax_unregister(void *_cxlr_dax)
>  {
> @@ -3287,11 +3288,6 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
>  	dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
>  		dev_name(dev));
>  
> -	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A)
> -		if (cxlr_add_existing_extents(cxlr))
> -			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
> -				rc);
> -
>  	return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
>  					cxlr_dax);
>  err:
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index d22fe5e50647..3e400dd4f08b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -926,6 +926,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>  int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
>  struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
>  u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
> +int cxlr_add_existing_extents(struct cxl_region *cxlr);
>  #else
>  static inline bool is_cxl_pmem_region(struct device *dev)
>  {
> @@ -948,6 +949,10 @@ static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
>  {
>  	return 0;
>  }
> +int cxlr_add_existing_extents(struct cxl_region *cxlr)
> +{
> +	return 0;
> +}
>  #endif
>  
>  void cxl_endpoint_parse_cdat(struct cxl_port *port);
> diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
> index 011bd1dc7691..15fc2de63185 100644
> --- a/drivers/dax/cxl.c
> +++ b/drivers/dax/cxl.c
> @@ -18,21 +18,6 @@ static int __cxl_dax_add_resource(struct dax_region *dax_region,
>  	return dax_region_add_resource(dax_region, dev, start, length);
>  }
>  
> -static int cxl_dax_add_resource(struct device *dev, void *data)
> -{
> -	struct dax_region *dax_region = data;
> -	struct region_extent *region_extent;
> -
> -	region_extent = to_region_extent(dev);
> -	if (!region_extent)
> -		return 0;
> -
> -	dev_dbg(dax_region->dev, "Adding resource HPA %pra\n",
> -		&region_extent->hpa_range);
> -
> -	return __cxl_dax_add_resource(dax_region, region_extent);
> -}
> -
>  static int cxl_dax_region_notify(struct device *dev,
>  				 struct cxl_notify_data *notify_data)
>  {
> @@ -66,6 +51,7 @@ static int cxl_dax_region_probe(struct device *dev)
>  	struct dev_dax_data data;
>  	resource_size_t dev_size;
>  	unsigned long flags;
> +	int rc;
>  
>  	if (nid == NUMA_NO_NODE)
>  		nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start);
> @@ -80,8 +66,12 @@ static int cxl_dax_region_probe(struct device *dev)
>  		return -ENOMEM;
>  
>  	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) {
> -		device_for_each_child(&cxlr_dax->dev, dax_region,
> -				      cxl_dax_add_resource);
> +		rc = cxlr_add_existing_extents(cxlr);
> +		/* If adding existing extents fails, continue with only an error
> +		 * message ?? */
> +		if (rc)
> +			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
> +				rc);
>  		/* Add empty seed dax device */
>  		dev_size = 0;
>  	} else {
> -- 
> 2.51.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region
  2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
  2025-12-03 21:03   ` Anisa Su
@ 2025-12-04 17:29   ` Ira Weiny
  1 sibling, 0 replies; 19+ messages in thread
From: Ira Weiny @ 2025-12-04 17:29 UTC (permalink / raw)
  To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

anisa.su887@ wrote:
> From: Fan Ni <fan.ni@samsung.com>
> 
> Add existing dc extents on the device before probing dax region will
> cause the creation of the dax device fail as resource cannot present
> when driver is bound to the device as shown in really_probe().

It's been a while since I've looked at this but the above explanation is
not clear to me.

There can't be dax devices on a region before region devices.  So how is a
dax device driver preventing the creation of a resource while the region
is being probed?

> 
> We delay the processing of existing dc extents to cxl region driver

NIT: Don't use 'we'.  Just say:

"Delay the processing..."

> probe.
> 
> Question: the guard() in cxlr_notify_extent() will cause lock issue,
> removed it. Not sure whether it will cause issue or not although no
> issue is observed during test.
> 
> Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
> Tested-by: Anisa Su <anisa.su@samsung.com>
> Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
> ---
>  drivers/cxl/core/extent.c |  2 +-
>  drivers/cxl/core/region.c |  8 ++------
>  drivers/cxl/cxl.h         |  5 +++++
>  drivers/dax/cxl.c         | 24 +++++++-----------------
>  4 files changed, 15 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> index 3e7295d3e5e2..3b0e4d72d4ac 100644
> --- a/drivers/cxl/core/extent.c
> +++ b/drivers/cxl/core/extent.c
> @@ -285,7 +285,7 @@ static int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event,
>  	dev_dbg(dev, "Trying notify: type %d HPA %pra\n", event,
>  		&region_extent->hpa_range);
>  
> -	guard(device)(dev);
> +	// guard(device)(dev);

This must remain to check for the driver notify callback.

I'm totally willing to admit there might be issues with this code but I'm
not clear what problem this patch is fixing.  Perhaps some more details?

Ira

[snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 3/3] dcd: Add support for multiple DC regions
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
  2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
  2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
@ 2025-12-03 20:29 ` anisa.su887
  2025-12-04 17:44   ` Ira Weiny
  2025-12-03 21:19 ` [RFC PATCH 0/3] Add Support for Multiple DC Regions Anisa Su
  2025-12-04 17:28 ` Ira Weiny
  4 siblings, 1 reply; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

From: Fan Ni <fan.ni@samsung.com>

With the change, we add following support:
1. Allow creating multiple DC regions (up to 8);
2. Allow DC extents to belong to regions other than region 0;
3. Modify sysfs entries to enable the above capabilities;
4. Shareable attribute is added to dc region (partition);

This series is tested with proper NDCTL fix, see:
https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support

Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
Tested-by: Anisa Su <anisa.su@samsung.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
---
 drivers/cxl/core/cdat.c   |   2 +-
 drivers/cxl/core/core.h   |   9 +-
 drivers/cxl/core/hdm.c    |  18 +++-
 drivers/cxl/core/mbox.c   |  39 +++++----
 drivers/cxl/core/memdev.c | 179 +++++++++++++++++++++++++-------------
 drivers/cxl/core/port.c   |  45 ++++++++--
 drivers/cxl/core/region.c |  54 ++++++++----
 drivers/cxl/cxl.h         |  18 +++-
 drivers/cxl/cxlmem.h      |   5 +-
 drivers/dax/cxl.c         |   4 +-
 10 files changed, 264 insertions(+), 109 deletions(-)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 67c6917a9add..4b05af576a4f 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -278,7 +278,7 @@ static void cxl_memdev_set_qos_class(struct cxl_dev_state *cxlds,
 			};
 
 			if (range_contains(&range, &dent->dpa_range)) {
-				if (mode == CXL_PARTMODE_DYNAMIC_RAM_A &&
+				if (is_cxl_dc_partition_mode(mode) &&
 				    dent->handle != handle)
 					dev_warn(dev,
 						"Dynamic RAM perf mismatch; %pra (%u) vs %pra (%u)\n",
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 70942c40221b..061dcf3320cd 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -34,7 +34,14 @@ int cxl_region_invalidate_memregion(struct cxl_region *cxlr);
 #ifdef CONFIG_CXL_REGION
 extern struct device_attribute dev_attr_create_pmem_region;
 extern struct device_attribute dev_attr_create_ram_region;
-extern struct device_attribute dev_attr_create_dynamic_ram_a_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_0_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_1_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_2_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_3_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_4_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_5_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_6_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_7_region;
 extern struct device_attribute dev_attr_delete_region;
 extern struct device_attribute dev_attr_region;
 extern const struct device_type cxl_pmem_region_type;
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 6b976da4a70a..faa4656f9542 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -463,8 +463,22 @@ static const char *cxl_mode_name(enum cxl_partition_mode mode)
 		return "ram";
 	case CXL_PARTMODE_PMEM:
 		return "pmem";
-	case CXL_PARTMODE_DYNAMIC_RAM_A:
-		return "dynamic_ram_a";
+	case CXL_PARTMODE_DYNAMIC_RAM_0:
+		return "dynamic_ram_0";
+	case CXL_PARTMODE_DYNAMIC_RAM_1:
+		return "dynamic_ram_1";
+	case CXL_PARTMODE_DYNAMIC_RAM_2:
+		return "dynamic_ram_2";
+	case CXL_PARTMODE_DYNAMIC_RAM_3:
+		return "dynamic_ram_3";
+	case CXL_PARTMODE_DYNAMIC_RAM_4:
+		return "dynamic_ram_4";
+	case CXL_PARTMODE_DYNAMIC_RAM_5:
+		return "dynamic_ram_5";
+	case CXL_PARTMODE_DYNAMIC_RAM_6:
+		return "dynamic_ram_6";
+	case CXL_PARTMODE_DYNAMIC_RAM_7:
+		return "dynamic_ram_7";
 	default:
 		return "";
 	};
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index a6de98eb1310..291a96757ac8 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -963,7 +963,7 @@ static int cxl_validate_extent(struct cxl_memdev_state *mds,
 	for (int i = 0; i < cxlds->nr_partitions; i++) {
 		struct cxl_dpa_partition *part = &cxlds->part[i];
 
-		if (part->mode != CXL_PARTMODE_DYNAMIC_RAM_A)
+		if (!is_cxl_dc_partition_mode(part->mode))
 			continue;
 
 		struct range partition_range = (struct range) {
@@ -1710,6 +1710,7 @@ static int cxl_get_dc_config(struct cxl_mailbox *mbox, u8 start_partition,
  *                         device.
  * @mbox: Mailbox to query
  * @dc_info: The dynamic partition information to return
+ * @num_part: The number of dynamic partitions returned
  *
  * Read Dynamic Capacity information from the device and return the partition
  * information.
@@ -1718,7 +1719,7 @@ static int cxl_get_dc_config(struct cxl_mailbox *mbox, u8 start_partition,
  *         on error only dynamic_bytes is left unchanged.
  */
 int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
-			struct cxl_dc_partition_info *dc_info)
+			struct cxl_dc_partition_info *dc_info, int *num_part)
 {
 	struct cxl_dc_partition_info partitions[CXL_MAX_DC_PARTITIONS];
 	size_t dc_resp_size = mbox->payload_size;
@@ -1763,12 +1764,15 @@ int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
 
 	} while (num_partitions < dc_resp->avail_partition_count);
 
-	/* Return 1st partition */
-	dc_info->start = partitions[0].start;
-	dc_info->size = partitions[0].size;
-	dc_info->handle = partitions[0].handle;
-	dev_dbg(dev, "Returning partition 0 %zu size %zu\n",
-		dc_info->start, dc_info->size);
+
+	*num_part = dc_resp->avail_partition_count;
+	for (int i = 0; i < dc_resp->avail_partition_count; i++) {
+		dc_info[i].start = partitions[i].start;
+		dc_info[i].size = partitions[i].size;
+		dc_info[i].handle = partitions[i].handle;
+		dev_dbg(dev, "Returning partition %d %zu size %zu\n",
+			i, dc_info[i].start, dc_info[i].size);
+	}
 
 	return 0;
 }
@@ -1955,12 +1959,12 @@ EXPORT_SYMBOL_NS_GPL(cxl_get_dirty_count, "CXL");
 
 void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
 {
-	struct cxl_dc_partition_info dc_info = { 0 };
+	struct cxl_dc_partition_info dc_info[CXL_MAX_DC_PARTITIONS];
 	struct device *dev = mds->cxlds.dev;
 	size_t skip;
-	int rc;
+	int rc, num_part;
 
-	rc = cxl_dev_dc_identify(&mds->cxlds.cxl_mbox, &dc_info);
+	rc = cxl_dev_dc_identify(&mds->cxlds.cxl_mbox, dc_info, &num_part);
 	if (rc) {
 		dev_warn(dev,
 			 "Failed to read Dynamic Capacity config: %d\n", rc);
@@ -1969,7 +1973,7 @@ void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
 	}
 
 	/* Skips between pmem and the dynamic partition are not supported */
-	skip = dc_info.start - info->size;
+	skip = dc_info[0].start - info->size;
 	if (skip) {
 		dev_warn(dev,
 			 "Dynamic Capacity skip from pmem not supported: %zu\n",
@@ -1978,10 +1982,13 @@ void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
 		return;
 	}
 
-	info->size += dc_info.size;
-	dev_dbg(dev, "Adding dynamic ram partition A; %zu size %zu\n",
-		dc_info.start, dc_info.size);
-	add_part(info, dc_info.start, dc_info.size, CXL_PARTMODE_DYNAMIC_RAM_A);
+	for (int i = 0; i < num_part; i++) {
+		info->size += dc_info[i].size;
+		dev_dbg(dev, "Adding dynamic ram partition %d; %zu size %zu\n",
+			i, dc_info[i].start, dc_info[i].size);
+		add_part(info, dc_info[i].start, dc_info[i].size, CXL_PARTITION_DC_MODE(0) + i);
+	}
+	mds->cxlds.nr_dc_partitions = num_part;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_configure_dcd, "CXL");
 
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index c53b06522d6c..720780901f5a 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020 Intel Corporation. */
 
 #include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/string_choices.h>
 #include <linux/firmware.h>
 #include <linux/device.h>
 #include <linux/slab.h>
@@ -102,18 +103,115 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_pmem_size =
 	__ATTR(size, 0444, pmem_size_show, NULL);
 
-static ssize_t dynamic_ram_a_size_show(struct device *dev, struct device_attribute *attr,
-			      char *buf)
+static ssize_t dynamic_ram_N_size_show(struct cxl_memdev *cxlmd, char *buf, int pos)
 {
-	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-	unsigned long long len = cxl_part_size(cxlds, CXL_PARTMODE_DYNAMIC_RAM_A);
+	unsigned long long len = cxl_part_size(cxlds, CXL_PARTITION_DC_MODE(0) + pos);
 
 	return sysfs_emit(buf, "%#llx\n", len);
 }
 
-static struct device_attribute dev_attr_dynamic_ram_a_size =
-	__ATTR(size, 0444, dynamic_ram_a_size_show, NULL);
+static ssize_t dynamic_ram_N_shareable_show(struct cxl_memdev *cxlmd, char *buf, int pos)
+{
+	enum cxl_partition_mode mode = CXL_PARTITION_DC_MODE(0) + pos;
+	bool val = cxlmd->cxlds->part[mode].perf.shareable;
+
+	return sysfs_emit(buf, "%s\n", str_true_false(val));
+}
+
+static struct cxl_dpa_perf *part_perf(struct cxl_dev_state *cxlds,
+				      enum cxl_partition_mode mode)
+{
+	for (int i = 0; i < cxlds->nr_partitions; i++)
+		if (cxlds->part[i].mode == mode)
+			return &cxlds->part[i].perf;
+	return NULL;
+}
+
+static ssize_t dynamic_ram_N_qos_class_show(struct cxl_memdev *cxlmd,
+					    char *buf, int pos)
+{
+	enum cxl_partition_mode mode = CXL_PARTITION_DC_MODE(0) + pos;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	return sysfs_emit(buf, "%d\n", part_perf(cxlds, mode)->qos_class);
+}
+
+#define CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(n)					\
+static ssize_t dynamic_ram_##n##_size_show(struct device *dev,			\
+					   struct device_attribute *attr,	\
+					   char *buf)				\
+{										\
+	return dynamic_ram_N_size_show(to_cxl_memdev(dev), buf, (n));		\
+}										\
+struct device_attribute dynamic_ram_##n##_size = {				\
+	.attr = { .name = "size",  .mode = 0444 },				\
+	.show = dynamic_ram_##n##_size_show,					\
+};										\
+static ssize_t dynamic_ram_##n##_shareable_show(struct device *dev,		\
+						struct device_attribute *attr,	\
+						char *buf)			\
+{										\
+	return dynamic_ram_N_shareable_show(to_cxl_memdev(dev), buf, (n));	\
+}										\
+struct device_attribute dynamic_ram_##n##_shareable = {				\
+	.attr = { .name = "shareable",  .mode = 0444 },				\
+	.show = dynamic_ram_##n##_shareable_show,				\
+};										\
+static ssize_t dynamic_ram_##n##_qos_class_show(struct device *dev,		\
+						struct device_attribute *attr,	\
+						char *buf)			\
+{										\
+	return dynamic_ram_N_qos_class_show(to_cxl_memdev(dev), buf, (n));	\
+}										\
+struct device_attribute dynamic_ram_##n##_qos_class = {				\
+	.attr = { .name = "qos_class",  .mode = 0444 },				\
+	.show = dynamic_ram_##n##_qos_class_show,				\
+};										\
+static struct attribute *cxl_memdev_dynamic_ram_##n##_attributes[] = {		\
+	&dynamic_ram_##n##_size.attr,						\
+	&dynamic_ram_##n##_shareable.attr,					\
+	&dynamic_ram_##n##_qos_class.attr,					\
+	NULL,									\
+};										\
+static umode_t cxl_memdev_dynamic_ram_##n##_attr_visible(struct kobject *kobj,	\
+							 struct attribute *a,	\
+							 int pos)		\
+{										\
+	struct device *dev = kobj_to_dev(kobj);					\
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);				\
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);	\
+										\
+	if (!mds)								\
+		return 0;							\
+										\
+	return a->mode;								\
+}										\
+static umode_t cxl_memdev_dynamic_ram_##n##_group_visible(struct kobject *kobj)	\
+{										\
+	struct device *dev = kobj_to_dev(kobj);					\
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);				\
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);	\
+										\
+	if (!mds || n >= mds->cxlds.nr_dc_partitions)					\
+		return 0;							\
+										\
+	return true;								\
+}										\
+DEFINE_SYSFS_GROUP_VISIBLE(cxl_memdev_dynamic_ram_##n);                        \
+static struct attribute_group cxl_memdev_dynamic_ram_##n##_attribute_group = { \
+       .name = "dynamic_ram_"#n,                                               \
+       .attrs = cxl_memdev_dynamic_ram_##n##_attributes,                       \
+       .is_visible = SYSFS_GROUP_VISIBLE(cxl_memdev_dynamic_ram_##n),          \
+}
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(0);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(1);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(2);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(3);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(4);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(5);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(6);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(7);
 
 static ssize_t serial_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
@@ -399,15 +497,6 @@ static struct attribute *cxl_memdev_attributes[] = {
 	NULL,
 };
 
-static struct cxl_dpa_perf *part_perf(struct cxl_dev_state *cxlds,
-				      enum cxl_partition_mode mode)
-{
-	for (int i = 0; i < cxlds->nr_partitions; i++)
-		if (cxlds->part[i].mode == mode)
-			return &cxlds->part[i].perf;
-	return NULL;
-}
-
 static ssize_t pmem_qos_class_show(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
@@ -426,25 +515,6 @@ static struct attribute *cxl_memdev_pmem_attributes[] = {
 	NULL,
 };
 
-static ssize_t dynamic_ram_a_qos_class_show(struct device *dev,
-				   struct device_attribute *attr, char *buf)
-{
-	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-
-	return sysfs_emit(buf, "%d\n",
-			  part_perf(cxlds, CXL_PARTMODE_DYNAMIC_RAM_A)->qos_class);
-}
-
-static struct device_attribute dev_attr_dynamic_ram_a_qos_class =
-	__ATTR(qos_class, 0444, dynamic_ram_a_qos_class_show, NULL);
-
-static struct attribute *cxl_memdev_dynamic_ram_a_attributes[] = {
-	&dev_attr_dynamic_ram_a_size.attr,
-	&dev_attr_dynamic_ram_a_qos_class.attr,
-	NULL,
-};
-
 static ssize_t ram_qos_class_show(struct device *dev,
 				  struct device_attribute *attr, char *buf)
 {
@@ -521,29 +591,6 @@ static struct attribute_group cxl_memdev_pmem_attribute_group = {
 	.is_visible = cxl_pmem_visible,
 };
 
-static umode_t cxl_dynamic_ram_a_visible(struct kobject *kobj, struct attribute *a, int n)
-{
-	struct device *dev = kobj_to_dev(kobj);
-	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
-	struct cxl_dpa_perf *perf = part_perf(cxlmd->cxlds, CXL_PARTMODE_DYNAMIC_RAM_A);
-
-	if (a == &dev_attr_dynamic_ram_a_qos_class.attr &&
-	    (!perf || perf->qos_class == CXL_QOS_CLASS_INVALID))
-		return 0;
-
-	if (a == &dev_attr_dynamic_ram_a_size.attr &&
-	    (!cxl_part_size(cxlmd->cxlds, CXL_PARTMODE_DYNAMIC_RAM_A)))
-		return 0;
-
-	return a->mode;
-}
-
-static struct attribute_group cxl_memdev_dynamic_ram_a_attribute_group = {
-	.name = "dynamic_ram_a",
-	.attrs = cxl_memdev_dynamic_ram_a_attributes,
-	.is_visible = cxl_dynamic_ram_a_visible,
-};
-
 static umode_t cxl_memdev_security_visible(struct kobject *kobj,
 					   struct attribute *a, int n)
 {
@@ -572,7 +619,14 @@ static const struct attribute_group *cxl_memdev_attribute_groups[] = {
 	&cxl_memdev_attribute_group,
 	&cxl_memdev_ram_attribute_group,
 	&cxl_memdev_pmem_attribute_group,
-	&cxl_memdev_dynamic_ram_a_attribute_group,
+	&cxl_memdev_dynamic_ram_0_attribute_group,
+	&cxl_memdev_dynamic_ram_1_attribute_group,
+	&cxl_memdev_dynamic_ram_2_attribute_group,
+	&cxl_memdev_dynamic_ram_3_attribute_group,
+	&cxl_memdev_dynamic_ram_4_attribute_group,
+	&cxl_memdev_dynamic_ram_5_attribute_group,
+	&cxl_memdev_dynamic_ram_6_attribute_group,
+	&cxl_memdev_dynamic_ram_7_attribute_group,
 	&cxl_memdev_security_attribute_group,
 	NULL,
 };
@@ -581,7 +635,14 @@ void cxl_memdev_update_perf(struct cxl_memdev *cxlmd)
 {
 	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_ram_attribute_group);
 	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_pmem_attribute_group);
-	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_a_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_0_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_1_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_2_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_3_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_4_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_5_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_6_attribute_group);
+	sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_7_attribute_group);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_memdev_update_perf, "CXL");
 
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 3f94dbf63ba9..68b88159e525 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -119,7 +119,14 @@ static DEVICE_ATTR_RO(name)
 
 CXL_DECODER_FLAG_ATTR(cap_pmem, CXL_DECODER_F_PMEM);
 CXL_DECODER_FLAG_ATTR(cap_ram, CXL_DECODER_F_RAM);
-CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_a, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_0, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_1, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_2, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_3, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_4, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_5, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_6, CXL_DECODER_F_RAM);
+CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_7, CXL_DECODER_F_RAM);
 CXL_DECODER_FLAG_ATTR(cap_type2, CXL_DECODER_F_TYPE2);
 CXL_DECODER_FLAG_ATTR(cap_type3, CXL_DECODER_F_TYPE3);
 CXL_DECODER_FLAG_ATTR(locked, CXL_DECODER_F_LOCK);
@@ -214,8 +221,22 @@ static ssize_t mode_store(struct device *dev, struct device_attribute *attr,
 		mode = CXL_PARTMODE_PMEM;
 	else if (sysfs_streq(buf, "ram"))
 		mode = CXL_PARTMODE_RAM;
-	else if (sysfs_streq(buf, "dynamic_ram_a"))
-		mode = CXL_PARTMODE_DYNAMIC_RAM_A;
+	else if (sysfs_streq(buf, "dynamic_ram_0"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_0;
+	else if (sysfs_streq(buf, "dynamic_ram_1"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_1;
+	else if (sysfs_streq(buf, "dynamic_ram_2"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_2;
+	else if (sysfs_streq(buf, "dynamic_ram_3"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_3;
+	else if (sysfs_streq(buf, "dynamic_ram_4"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_4;
+	else if (sysfs_streq(buf, "dynamic_ram_5"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_5;
+	else if (sysfs_streq(buf, "dynamic_ram_6"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_6;
+	else if (sysfs_streq(buf, "dynamic_ram_7"))
+		mode = CXL_PARTMODE_DYNAMIC_RAM_7;
 	else
 		return -EINVAL;
 
@@ -321,14 +342,28 @@ static struct attribute_group cxl_decoder_base_attribute_group = {
 static struct attribute *cxl_decoder_root_attrs[] = {
 	&dev_attr_cap_pmem.attr,
 	&dev_attr_cap_ram.attr,
-	&dev_attr_cap_dynamic_ram_a.attr,
+	&dev_attr_cap_dynamic_ram_0.attr,
+	&dev_attr_cap_dynamic_ram_1.attr,
+	&dev_attr_cap_dynamic_ram_2.attr,
+	&dev_attr_cap_dynamic_ram_3.attr,
+	&dev_attr_cap_dynamic_ram_4.attr,
+	&dev_attr_cap_dynamic_ram_5.attr,
+	&dev_attr_cap_dynamic_ram_6.attr,
+	&dev_attr_cap_dynamic_ram_7.attr,
 	&dev_attr_cap_type2.attr,
 	&dev_attr_cap_type3.attr,
 	&dev_attr_target_list.attr,
 	&dev_attr_qos_class.attr,
 	SET_CXL_REGION_ATTR(create_pmem_region)
 	SET_CXL_REGION_ATTR(create_ram_region)
-	SET_CXL_REGION_ATTR(create_dynamic_ram_a_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_0_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_1_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_2_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_3_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_4_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_5_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_6_region)
+	SET_CXL_REGION_ATTR(create_dynamic_ram_7_region)
 	SET_CXL_REGION_ATTR(delete_region)
 	NULL,
 };
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index da3ea3cf8585..1a53c74b814c 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -499,7 +499,7 @@ static ssize_t interleave_ways_store(struct device *dev,
 	if (rc)
 		return rc;
 
-	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A && val != 1) {
+	if (is_cxl_dc_partition_mode(cxlr->mode) && val != 1) {
 		dev_err(dev, "Interleaving and DCD not supported\n");
 		return -EINVAL;
 	}
@@ -2255,7 +2255,7 @@ static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos,
 		}
 
 		cxled = to_cxl_endpoint_decoder(dev);
-		if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A &&
+		if (is_cxl_dc_partition_mode(cxlr->mode) &&
 		    !cxl_dcd_supported(cxled_to_mds(cxled))) {
 			dev_dbg(dev, "DCD unsupported\n");
 			rc = -EINVAL;
@@ -2606,7 +2606,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
 	switch (mode) {
 	case CXL_PARTMODE_RAM:
 	case CXL_PARTMODE_PMEM:
-	case CXL_PARTMODE_DYNAMIC_RAM_A:
+	case CXL_PARTMODE_DYNAMIC_RAM_0...CXL_PARTMODE_DYNAMIC_RAM_7:
 		break;
 	default:
 		dev_err(&cxlrd->cxlsd.cxld.dev, "unsupported mode %d\n", mode);
@@ -2659,20 +2659,36 @@ static ssize_t create_ram_region_store(struct device *dev,
 }
 DEVICE_ATTR_RW(create_ram_region);
 
-static ssize_t create_dynamic_ram_a_region_show(struct device *dev,
-						struct device_attribute *attr,
-						char *buf)
-{
-	return __create_region_show(to_cxl_root_decoder(dev), buf);
-}
-
-static ssize_t create_dynamic_ram_a_region_store(struct device *dev,
-						 struct device_attribute *attr,
-						 const char *buf, size_t len)
-{
-	return create_region_store(dev, buf, len, CXL_PARTMODE_DYNAMIC_RAM_A);
-}
-DEVICE_ATTR_RW(create_dynamic_ram_a_region);
+#define CREATE_DYNAMIC_RAM_N_REGION(n)						\
+static ssize_t create_dynamic_ram_##n##_region_show(struct device *dev,		\
+						struct device_attribute *attr,	\
+						char *buf)			\
+{										\
+	return __create_region_show(to_cxl_root_decoder(dev), buf);		\
+}										\
+static ssize_t create_dynamic_ram_##n##_region_store(struct device *dev,		\
+						 struct device_attribute *attr, \
+						 const char *buf, size_t len)	\
+{										\
+	enum cxl_partition_mode mode = CXL_PARTITION_DC_MODE(0) + (n);		\
+	return create_region_store(dev, buf, len, mode);			\
+}
+CREATE_DYNAMIC_RAM_N_REGION(0);
+CREATE_DYNAMIC_RAM_N_REGION(1);
+CREATE_DYNAMIC_RAM_N_REGION(2);
+CREATE_DYNAMIC_RAM_N_REGION(3);
+CREATE_DYNAMIC_RAM_N_REGION(4);
+CREATE_DYNAMIC_RAM_N_REGION(5);
+CREATE_DYNAMIC_RAM_N_REGION(6);
+CREATE_DYNAMIC_RAM_N_REGION(7);
+DEVICE_ATTR_RW(create_dynamic_ram_0_region);
+DEVICE_ATTR_RW(create_dynamic_ram_1_region);
+DEVICE_ATTR_RW(create_dynamic_ram_2_region);
+DEVICE_ATTR_RW(create_dynamic_ram_3_region);
+DEVICE_ATTR_RW(create_dynamic_ram_4_region);
+DEVICE_ATTR_RW(create_dynamic_ram_5_region);
+DEVICE_ATTR_RW(create_dynamic_ram_6_region);
+DEVICE_ATTR_RW(create_dynamic_ram_7_region);
 
 static ssize_t region_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
@@ -3266,7 +3282,7 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 	struct device *dev;
 	int rc;
 
-	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A &&
+	if (is_cxl_dc_partition_mode(cxlr->mode) &&
 	    cxlr->params.interleave_ways != 1) {
 		dev_err(&cxlr->dev, "Interleaving DC not supported\n");
 		return -EINVAL;
@@ -3667,7 +3683,7 @@ static int cxl_region_probe(struct device *dev)
 
 		return devm_cxl_add_pmem_region(cxlr);
 	case CXL_PARTMODE_RAM:
-	case CXL_PARTMODE_DYNAMIC_RAM_A:
+	case CXL_PARTMODE_DYNAMIC_RAM_0...CXL_PARTMODE_DYNAMIC_RAM_7:
 		rc = devm_cxl_region_edac_register(cxlr);
 		if (rc)
 			dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 3e400dd4f08b..80fb8d09172c 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -503,12 +503,26 @@ struct cxl_region_params {
 	resource_size_t cache_size;
 };
 
+#define CXL_PARTITION_DC_MODE(n) CXL_PARTMODE_DYNAMIC_RAM_##n
 /* Modes should be in the implied DPA order */
 enum cxl_partition_mode {
 	CXL_PARTMODE_RAM,
 	CXL_PARTMODE_PMEM,
-	CXL_PARTMODE_DYNAMIC_RAM_A,
-};
+	CXL_PARTITION_DC_MODE(0),
+	CXL_PARTITION_DC_MODE(1),
+	CXL_PARTITION_DC_MODE(2),
+	CXL_PARTITION_DC_MODE(3),
+	CXL_PARTITION_DC_MODE(4),
+	CXL_PARTITION_DC_MODE(5),
+	CXL_PARTITION_DC_MODE(6),
+	CXL_PARTITION_DC_MODE(7),
+	CXL_PARTITION_MODE_MAX,
+};
+
+static inline bool is_cxl_dc_partition_mode(enum cxl_partition_mode mode)
+{
+	return mode >= CXL_PARTITION_DC_MODE(0) && mode < CXL_PARTITION_MODE_MAX;
+}
 
 /*
  * Indicate whether this region has been assembled by autodetection or
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 2bad68f13e21..e28cd6827c7d 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -106,7 +106,7 @@ int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 			 resource_size_t base, resource_size_t len,
 			 resource_size_t skipped);
 
-#define CXL_NR_PARTITIONS_MAX 3
+#define CXL_NR_PARTITIONS_MAX 10
 
 struct cxl_dpa_info {
 	u64 size;
@@ -456,6 +456,7 @@ struct cxl_dev_state {
 	struct resource dpa_res;
 	struct cxl_dpa_partition part[CXL_NR_PARTITIONS_MAX];
 	unsigned int nr_partitions;
+	unsigned int nr_dc_partitions;
 	u64 serial;
 	enum cxl_devtype type;
 	struct cxl_mailbox cxl_mbox;
@@ -954,7 +955,7 @@ struct cxl_dc_partition_info {
 };
 
 int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
-			struct cxl_dc_partition_info *dc_info);
+			struct cxl_dc_partition_info *dc_info, int *num_part);
 int cxl_await_media_ready(struct cxl_dev_state *cxlds);
 int cxl_enumerate_cmds(struct cxl_memdev_state *mds);
 int cxl_mem_dpa_fetch(struct cxl_memdev_state *mds, struct cxl_dpa_info *info);
diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
index 15fc2de63185..fa6ada01b681 100644
--- a/drivers/dax/cxl.c
+++ b/drivers/dax/cxl.c
@@ -57,7 +57,7 @@ static int cxl_dax_region_probe(struct device *dev)
 		nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start);
 
 	flags = IORESOURCE_DAX_KMEM;
-	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A)
+	if (is_cxl_dc_partition_mode(cxlr->mode))
 		flags |= IORESOURCE_DAX_SPARSE_CAP;
 
 	dax_region = alloc_dax_region(dev, cxlr->id, &cxlr_dax->hpa_range, nid,
@@ -65,7 +65,7 @@ static int cxl_dax_region_probe(struct device *dev)
 	if (!dax_region)
 		return -ENOMEM;
 
-	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) {
+	if (is_cxl_dc_partition_mode(cxlr->mode)) {
 		rc = cxlr_add_existing_extents(cxlr);
 		/* If adding existing extents fails, continue with only an error
 		 * message ?? */
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 3/3] dcd: Add support for multiple DC regions
  2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887
@ 2025-12-04 17:44   ` Ira Weiny
  0 siblings, 0 replies; 19+ messages in thread
From: Ira Weiny @ 2025-12-04 17:44 UTC (permalink / raw)
  To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

anisa.su887@ wrote:
> From: Fan Ni <fan.ni@samsung.com>
> 
> With the change, we add following support:
> 1. Allow creating multiple DC regions (up to 8);
> 2. Allow DC extents to belong to regions other than region 0;
> 3. Modify sysfs entries to enable the above capabilities;
> 4. Shareable attribute is added to dc region (partition);
> 
> This series is tested with proper NDCTL fix, see:
> https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support
> 

[snip]

> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 6b976da4a70a..faa4656f9542 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -463,8 +463,22 @@ static const char *cxl_mode_name(enum cxl_partition_mode mode)
>  		return "ram";
>  	case CXL_PARTMODE_PMEM:
>  		return "pmem";
> -	case CXL_PARTMODE_DYNAMIC_RAM_A:
> -		return "dynamic_ram_a";
> +	case CXL_PARTMODE_DYNAMIC_RAM_0:
> +		return "dynamic_ram_0";

If my v9 were to land then this lands this would break users who developed
against 'ram_a'.

Either we need to change ram_a to ram_0 in the base series or use
ram_[b,c,...] etc.

[same comment throughout]

> +	case CXL_PARTMODE_DYNAMIC_RAM_1:
> +		return "dynamic_ram_1";
> +	case CXL_PARTMODE_DYNAMIC_RAM_2:
> +		return "dynamic_ram_2";
> +	case CXL_PARTMODE_DYNAMIC_RAM_3:
> +		return "dynamic_ram_3";
> +	case CXL_PARTMODE_DYNAMIC_RAM_4:
> +		return "dynamic_ram_4";
> +	case CXL_PARTMODE_DYNAMIC_RAM_5:
> +		return "dynamic_ram_5";
> +	case CXL_PARTMODE_DYNAMIC_RAM_6:
> +		return "dynamic_ram_6";
> +	case CXL_PARTMODE_DYNAMIC_RAM_7:
> +		return "dynamic_ram_7";
>  	default:
>  		return "";
>  	};

[snip]

> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2bad68f13e21..e28cd6827c7d 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -106,7 +106,7 @@ int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>  			 resource_size_t base, resource_size_t len,
>  			 resource_size_t skipped);
>  
> -#define CXL_NR_PARTITIONS_MAX 3
> +#define CXL_NR_PARTITIONS_MAX 10
>  
>  struct cxl_dpa_info {
>  	u64 size;
> @@ -456,6 +456,7 @@ struct cxl_dev_state {
>  	struct resource dpa_res;
>  	struct cxl_dpa_partition part[CXL_NR_PARTITIONS_MAX];
>  	unsigned int nr_partitions;
> +	unsigned int nr_dc_partitions;

I think nr_partitions needs to include the dc count.  And when it does I
don't think we need a separate dc count.

After looking at this patch I'm thinking that the changes made after
dropping partition support from v8 to v9 the merging of the
volitile/persistant/dc partitions into a single range might have made this
support easier?

Is that why yall did not try to use v8?  I've really not looked into the
detail of if this is really all that is needed to support more partitions.
If this is all it takes I think what we really need is a use case.

Basically keep this patch (with the name change I mention) until such time
as v9 lands with a use case for single partitions.  Then when multiple
partitions come we can land this change.

patch 1/3 is a bug fix and needs to be in v9.  2/3 I don't quite
understand yet but is a bug fix as well.  So if it is an issue it will
need to go in with v9.

Thanks for the work!
Ira

[snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
                   ` (2 preceding siblings ...)
  2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887
@ 2025-12-03 21:19 ` Anisa Su
  2025-12-04 17:28 ` Ira Weiny
  4 siblings, 0 replies; 19+ messages in thread
From: Anisa Su @ 2025-12-03 21:19 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ira.weiny, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Wed, Dec 03, 2025 at 08:29:10PM +0000, anisa.su887@gmail.com wrote:
> From: Anisa Su <anisa.su@samsung.com>
> 
> This patchset introduces support for multiple DC regions. It is rebased on top
> of the latest branch published to Ira's repository:
> https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23.
> We hope it will be useful in the meantime for others and restart some
> discussion around how to move DCD forward.
> 
> The corresponding NDCTL support can be found on this branch:
> https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support.
> I will reply to this thread with a reference to the thread for the
> NDCTL patches once published.
> 

NDCTL thread: https://lore.kernel.org/linux-cxl/20251203211642.1104918-1-anisa.su887@gmail.com/T/#u

> Testing:
> This patchset was tested on a QEMU VM with the following topology:
> 
> PCIE Root (pcie.0)
> │
> ├─ CXL Fixed Memory Window cxl-fmw.0
> ├─ CXL Root Complex cxl.0
> │  └─ Root Port root_port1
> │     └─ CXL Type-3 Device cxl-dcd0
> │
> ├─ CXL Fixed Memory Window cxl-fmw.1
> ├─ CXL Root Complex cxl.1
> │  └─ Root Port root_port2
> │     └─ CXL Type-3 Device cxl-dcd1
> └─
> 
> "-object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/t3_cxl1.raw,size=8G \
> -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/t3_lsa1.raw,size=1M \
> -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/t3_cxl2.raw,size=8G \
> -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/t3_lsa2.raw,size=1M \
> -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.0,hdm_for_passthrough=true \
> -device pxb-cxl,bus_nr=48,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
> -device cxl-rp,port=0,bus=cxl.0,id=root_port1,chassis=0,slot=1 \
> -device cxl-rp,port=1,bus=cxl.1,id=root_port2,chassis=1,slot=1 \
> -device cxl-type3,bus=root_port1,volatile-dc-memdev=cxl-mem1,id=cxl-dcd0,lsa=cxl-lsa1,num-dc-regions=8,sn=99 \
> -device cxl-type3,bus=root_port2,volatile-dc-memdev=cxl-mem2,id=cxl-dcd1,lsa=cxl-lsa2,num-dc-regions=8,sn=100 \
> -machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=8G,cxl-fmw.1.targets.0=cxl.1,cxl-fmw.1.size=8G"
> 
> 2 CFMWs and 2 root complexes are emulated because QEMU creates
> 4 decoders/topology level. With 1 root complex, there are only 4 upstream
> decoders. Therefore in order to create 4+ regions, we need a total of
> 8 upstream decoders. This does mean that we are only able to create
> 4 regions on each device, although up to 8 are supported.
> 
> Using `cxl list`, we can see mem0 and mem1 have dynamic_ram_* capablities:
> root@deb-101020-bm01:~# cxl list
> [
>   {
>     "memdevs":[
>       {
>         "memdev":"mem0",
>         "dynamic_ram_0_size":1073741824,
>         "dynamic_ram_1_size":1073741824,
>         "dynamic_ram_2_size":1073741824,
>         "dynamic_ram_3_size":1073741824,
>         "dynamic_ram_4_size":1073741824,
>         "dynamic_ram_5_size":1073741824,
>         "dynamic_ram_6_size":1073741824,
>         "dynamic_ram_7_size":1073741824,
>         "serial":100,
>         "host":"0000:31:00.0",
>         "firmware_version":"BWFW VERSION 00"
>       },
>       {
>         "memdev":"mem1",
>         "dynamic_ram_0_size":1073741824,
>         "dynamic_ram_1_size":1073741824,
>         "dynamic_ram_2_size":1073741824,
>         "dynamic_ram_3_size":1073741824,
>         "dynamic_ram_4_size":1073741824,
>         "dynamic_ram_5_size":1073741824,
>         "dynamic_ram_6_size":1073741824,
>         "dynamic_ram_7_size":1073741824,
>         "serial":99,
>         "host":"0000:0d:00.0",
>         "firmware_version":"BWFW VERSION 00"
>       }
>     ]
>   }
> ]
> 
> To create the 8 regions:
> cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_0
> cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_1
> cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_2
> cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_3
> 
> cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_4
> cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_5
> cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_6
> cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_7
> 
> 
> We can verify the 8 regions:
> root@deb-101020-bm01:~# cxl list
> [
>   {
>     "memdevs":[
> ...
>   },
>   {
>     "regions":[
>       {
>         "region":"region0",
>         "resource":79993765888,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       },
>       {
>         "region":"region6",
>         "resource":81067507712,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       },
>       {
>         "region":"region7",
>         "resource":82141249536,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       },
>       {
>         "region":"region8",
>         "resource":83214991360,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       },
>       {
>         "region":"region1",
>         "resource":88315265024,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       },
>       {
>         "region":"region2",
>         "resource":89389006848,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       },
>       {
>         "region":"region3",
>         "resource":90462748672,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       },
>       {
>         "region":"region4",
>         "resource":91536490496,
>         "size":1073741824,
>         "interleave_ways":1,
>         "interleave_granularity":256,
>         "decode_state":"commit"
>       }
>     ]
>   }
> ]
> 
> Extents of various sizes (128MB, 256MB, 512MB, and 1GB) are added from mem1,
> which correspond to regions 0-3, then DAX devices are created from them.
> The extent DPAs are as follows, which allows each one to map to a distinct
> region:
>  - [0-128] --> region0
>  - [1024-1280] --> region1
>  - [2048-2560] --> region2
>  - [3072-4096] --> region3
> 
> The correct sizes can be verified when creating the DAX device.
> root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region0
> [
>   {
>     "chardev":"dax0.1",
>     "size":134217728,
>     "target_node":1,
>     "align":2097152,
>     "mode":"devdax"
>   }
> ]
> created 1 device
> root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region1
> [
>   {
>     "chardev":"dax1.1",
>     "size":268435456,
>     "target_node":1,
>     "align":2097152,
>     "mode":"devdax"
>   }
> ]
> created 1 device
> root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region2
> [
>   {
>     "chardev":"dax2.1",
>     "size":536870912,
>     "target_node":1,
>     "align":2097152,
>     "mode":"devdax"
>   }
> ]
> created 1 device
> root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region3
> [
>   {
>     "chardev":"dax3.1",
>     "size":1073741824,
>     "target_node":1,
>     "align":2097152,
>     "mode":"devdax"
>   }
> ]
> created 1 device
> 
> Then the DAX devices are reconfigured to system-ram mode and verified with lsmem.
> root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax0.1 -m system-ram
> [
>   {
>     "chardev":"dax0.1",
>     "size":134217728,
>     "target_node":1,
>     "align":2097152,
>     "mode":"system-ram",
>     "online_memblocks":1,
>     "total_memblocks":1,
>     "movable":true
>   }
> ]
> reconfigured 1 device
> root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax1.1 -m system-ram
> ...
> root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax2.1 -m system-ram
> ...
> root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax3.1 -m system-ram
> ...
> 
> 
> root@deb-101020-bm01:~/libcxlmi# lsmem
> RANGE                                  SIZE  STATE REMOVABLE   BLOCK
> 0x0000000000000000-0x000000007fffffff    2G online       yes    0-15
> 0x0000000100000000-0x000000027fffffff    6G online       yes   32-79
> 0x00000012a0000000-0x00000012a7ffffff  128M online       yes     596
> 0x00000012e0000000-0x00000012efffffff  256M online       yes 604-605
> 0x0000001320000000-0x000000133fffffff  512M online       yes 612-615
> 0x0000001360000000-0x000000139fffffff    1G online       yes 620-627
> 
> Memory block size:                128M
> Total online memory:              9.9G
> Total offline memory:               0B
> 
> -------------------------------------------------------------------------------
> Note: I did try hacking QEMU to create 8 decoders at each level to avoid having
> 2 separate host bridges/DCDs by modifying include/hw/cxl/cxl_component.h like so:
> 
> #define CXL_HDM_DECODER_COUNT 8
> HDM_DECODER_INIT(0);
> HDM_DECODER_INIT(1);
> HDM_DECODER_INIT(2);
> HDM_DECODER_INIT(3);
> HDM_DECODER_INIT(4);
> HDM_DECODER_INIT(5);
> HDM_DECODER_INIT(6);
> HDM_DECODER_INIT(7);
> 
> However, when attempting to create the 5th cxl region,
> I ran into a timeout error when committing the decoders.
> Did not spend much time pursuing this further, most likely
> need to change more things on the QEMU side.
> But the 8 decoders do show up correctly under sysfs.
> 
> Fan Ni (3):
>   core/region: fix return logic for store_targetN
>   dax/cxl: add existing dc extents when probing dax region
>   dcd: Add support for multiple DC regions
> 
>  drivers/cxl/core/cdat.c   |   2 +-
>  drivers/cxl/core/core.h   |   9 +-
>  drivers/cxl/core/extent.c |   2 +-
>  drivers/cxl/core/hdm.c    |  18 +++-
>  drivers/cxl/core/mbox.c   |  39 +++++----
>  drivers/cxl/core/memdev.c | 179 +++++++++++++++++++++++++-------------
>  drivers/cxl/core/port.c   |  45 ++++++++--
>  drivers/cxl/core/region.c |  65 ++++++++------
>  drivers/cxl/cxl.h         |  23 ++++-
>  drivers/cxl/cxlmem.h      |   5 +-
>  drivers/dax/cxl.c         |  28 ++----
>  11 files changed, 281 insertions(+), 134 deletions(-)
> 
> -- 
> 2.51.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
                   ` (3 preceding siblings ...)
  2025-12-03 21:19 ` [RFC PATCH 0/3] Add Support for Multiple DC Regions Anisa Su
@ 2025-12-04 17:28 ` Ira Weiny
  2025-12-11 21:05   ` Anisa Su
  4 siblings, 1 reply; 19+ messages in thread
From: Ira Weiny @ 2025-12-04 17:28 UTC (permalink / raw)
  To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Anisa Su

anisa.su887@ wrote:
> From: Anisa Su <anisa.su@samsung.com>
> 
> This patchset introduces support for multiple DC regions. It is rebased on top
> of the latest branch published to Ira's repository:
> https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23.
> We hope it will be useful in the meantime for others and restart some
> discussion around how to move DCD forward.

FWIW it seems patch 1/3 and this patch are both bug fixes to the DCD
series I last posted.  If so they should be tacked onto that series.

So, you are more that welcome to take over DCD development.

However, I had multiple DC partitions (Regions) supported in previous
versions of that series and the community decided that there was no use
case for such a device.  Based on this submission it seems that me ripping
out the multiple partitions was incorrect.

> 
> The corresponding NDCTL support can be found on this branch:
> https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support.
> I will reply to this thread with a reference to the thread for the
> NDCTL patches once published.
> 
> Testing:
> This patchset was tested on a QEMU VM with the following topology:

Unfortunately none of the details presented in this cover letter really
show why the kernel needs this additional complexity.

Can you go into more details on the use cases of multiple partitions?

Also, did you consider to use previous versions of my series?  Perhaps v8?

	https://lore.kernel.org/all/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com/#r

Thanks,
Ira

[snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-04 17:28 ` Ira Weiny
@ 2025-12-11 21:05   ` Anisa Su
  2025-12-12 22:07     ` Ira Weiny
  2025-12-13  3:36     ` dan.j.williams
  0 siblings, 2 replies; 19+ messages in thread
From: Anisa Su @ 2025-12-11 21:05 UTC (permalink / raw)
  To: Ira Weiny
  Cc: anisa.su887, dan.j.williams, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Thu, Dec 04, 2025 at 11:28:40AM -0600, Ira Weiny wrote:
> anisa.su887@ wrote:
> > From: Anisa Su <anisa.su@samsung.com>
> > 
> > This patchset introduces support for multiple DC regions. It is rebased on top
> > of the latest branch published to Ira's repository:
> > https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23.
> > We hope it will be useful in the meantime for others and restart some
> > discussion around how to move DCD forward.
> 
> FWIW it seems patch 1/3 and this patch are both bug fixes to the DCD
> series I last posted.  If so they should be tacked onto that series.
> 
> So, you are more that welcome to take over DCD development.
> 
> However, I had multiple DC partitions (Regions) supported in previous
> versions of that series and the community decided that there was no use
> case for such a device.  Based on this submission it seems that me ripping
> out the multiple partitions was incorrect.
>
> > 
> > The corresponding NDCTL support can be found on this branch:
> > https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support.
> > I will reply to this thread with a reference to the thread for the
> > NDCTL patches once published.
> > 
> > Testing:
> > This patchset was tested on a QEMU VM with the following topology:
> 
> Unfortunately none of the details presented in this cover letter really
> show why the kernel needs this additional complexity.
> 
> Can you go into more details on the use cases of multiple partitions?
> 
From what I understand, the motivation for DCD as a whole has always been a
blocker for the entire series. However, this year we've seen multiple vendors
demo memory pooling/sharing at SC,25[1], as well as the development of a 
controller that supports "memory pooling and sharing across 
multiple hosts" from Montage[2].

The flexibility and control provided by multiple partitions is
an important capability of DCD for enabling composable memory 
infrastructures. IMO, adding multi-partition support back in from v8 or
picking up this patchset would strengthen the series.

Let me know if this proposal sounds fair? Otherwise I can
separate out the 2 patches that are bug fixes.

Also, apologies if these points have already been discussed. I've not been
following this series for very long, so forgive the ignorance as I try to
catch up. If you can think of any materials/documentation outside of the
mailing list or open collab sync notes that would help fill in the gaps,
please let me know :)

Thanks for the feedback,
Anisa

[1] https://computeexpresslink.org/event/supercomputing-2025/
[2] https://www.montage-tech.com/MXC

> Also, did you consider to use previous versions of my series?  Perhaps v8?
> 
> 	https://lore.kernel.org/all/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com/#r
> 
> Thanks,
> Ira
> 
> [snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-11 21:05   ` Anisa Su
@ 2025-12-12 22:07     ` Ira Weiny
  2026-01-12 22:23       ` Anisa Su
  2025-12-13  3:36     ` dan.j.williams
  1 sibling, 1 reply; 19+ messages in thread
From: Ira Weiny @ 2025-12-12 22:07 UTC (permalink / raw)
  To: Anisa Su, Ira Weiny
  Cc: anisa.su887, dan.j.williams, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

Anisa Su wrote:
> On Thu, Dec 04, 2025 at 11:28:40AM -0600, Ira Weiny wrote:
> > anisa.su887@ wrote:
> > > From: Anisa Su <anisa.su@samsung.com>
> > > 
> > > This patchset introduces support for multiple DC regions. It is rebased on top
> > > of the latest branch published to Ira's repository:
> > > https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23.
> > > We hope it will be useful in the meantime for others and restart some
> > > discussion around how to move DCD forward.
> > 
> > FWIW it seems patch 1/3 and this patch are both bug fixes to the DCD
> > series I last posted.  If so they should be tacked onto that series.
> > 
> > So, you are more that welcome to take over DCD development.
> > 
> > However, I had multiple DC partitions (Regions) supported in previous
> > versions of that series and the community decided that there was no use
> > case for such a device.  Based on this submission it seems that me ripping
> > out the multiple partitions was incorrect.
> >
> > > 
> > > The corresponding NDCTL support can be found on this branch:
> > > https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support.
> > > I will reply to this thread with a reference to the thread for the
> > > NDCTL patches once published.
> > > 
> > > Testing:
> > > This patchset was tested on a QEMU VM with the following topology:
> > 
> > Unfortunately none of the details presented in this cover letter really
> > show why the kernel needs this additional complexity.
> > 
> > Can you go into more details on the use cases of multiple partitions?
> > 
> From what I understand, the motivation for DCD as a whole has always been a
> blocker for the entire series. However, this year we've seen multiple vendors
> demo memory pooling/sharing at SC,25[1], as well as the development of a 
> controller that supports "memory pooling and sharing across 
> multiple hosts" from Montage[2].

That is great!  Do we know if they used the patches which have been
submitted?  Do we know if the user interfaces were sufficient?  How will
this memory be presented with the new DAX changes being proposed?

> 
> The flexibility and control provided by multiple partitions is
> an important capability of DCD for enabling composable memory 
> infrastructures. IMO, adding multi-partition support back in from v8 or
> picking up this patchset would strengthen the series.
> 
> Let me know if this proposal sounds fair? Otherwise I can
> separate out the 2 patches that are bug fixes.

After RC1 could you rebase the series and fold the bug fixes in?

Before we get to multiple DCD partitions the interface for DAX devices
needs to be settled.  In the last community call we were discussing a
special famfs dax type I believe.  Has any work been done on that?

For multi-partitions we need some review on the partition (region) names
because you made a change which would be incompatible with the base
series.  But it would be good to get single partitions landed and then
multiple partitions as you have added.

> Also, apologies if these points have already been discussed. I've not been
> following this series for very long, so forgive the ignorance as I try to
> catch up. If you can think of any materials/documentation outside of the
> mailing list or open collab sync notes that would help fill in the gaps,
> please let me know :)

NP this has been a while.  I've been looking for someone to take the
series who is more familiar with the use cases.

I look forward to you posting a new series with the support you feel you
need.

Ira

> 
> Thanks for the feedback,
> Anisa
> 
> [1] https://computeexpresslink.org/event/supercomputing-2025/
> [2] https://www.montage-tech.com/MXC
> 
> > Also, did you consider to use previous versions of my series?  Perhaps v8?
> > 
> > 	https://lore.kernel.org/all/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com/#r
> > 
> > Thanks,
> > Ira
> > 
> > [snip]



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-12 22:07     ` Ira Weiny
@ 2026-01-12 22:23       ` Anisa Su
  2026-01-15 10:28         ` Alireza Sanaee
  0 siblings, 1 reply; 19+ messages in thread
From: Anisa Su @ 2026-01-12 22:23 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Anisa Su, dan.j.williams, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Fri, Dec 12, 2025 at 04:07:32PM -0600, Ira Weiny wrote:
> Anisa Su wrote:
> > On Thu, Dec 04, 2025 at 11:28:40AM -0600, Ira Weiny wrote:
> > > anisa.su887@ wrote:
> > > > From: Anisa Su <anisa.su@samsung.com>
> > > > 
[snip] 
> > > Unfortunately none of the details presented in this cover letter really
> > > show why the kernel needs this additional complexity.
> > > 
> > > Can you go into more details on the use cases of multiple partitions?
> > > 
> > From what I understand, the motivation for DCD as a whole has always been a
> > blocker for the entire series. However, this year we've seen multiple vendors
> > demo memory pooling/sharing at SC,25[1], as well as the development of a 
> > controller that supports "memory pooling and sharing across 
> > multiple hosts" from Montage[2].
> 
> That is great!  Do we know if they used the patches which have been
> submitted?  Do we know if the user interfaces were sufficient?

Sorry for the delay! While I don't know the details of the software
stack used in those demos, I think the root of the question "Do we know
if the user interfaces were sufficient" goes back to the missing use
case for DCD.

So then, can I ask: how can I demonstrate a reasonable use case?
Ex: bringing up Kubernetes pods using this patchset on real hw?
Or something else?
^ This is also a question for the community, so everyone please chime in :)
>  
> How will this memory be presented with the new DAX changes being proposed?
>
From the call today, there seemed to be general agreement that the changes
proposed by Gregory's patches are a promising direction for DCD, bc it
allows hotplug/unplug capabilities without needing to route it through
the DAX subsystem.
I haven't looked at those patches yet, but from what was
discussed today, plan to move forward based on that. Are those the changes
you were referring to? Or the special famfs dax type you mentioned
below?
> > 
> > The flexibility and control provided by multiple partitions is
> > an important capability of DCD for enabling composable memory 
> > infrastructures. IMO, adding multi-partition support back in from v8 or
> > picking up this patchset would strengthen the series.
> > 
> > Let me know if this proposal sounds fair? Otherwise I can
> > separate out the 2 patches that are bug fixes.
> 
> After RC1 could you rebase the series and fold the bug fixes in?
> 
Yep, working on rebasing the series now and will send RC2 with the bug
fixes.
> Before we get to multiple DCD partitions the interface for DAX devices
> needs to be settled.  In the last community call we were discussing a
> special famfs dax type I believe.  Has any work been done on that?
>
That makes sense to me; I missed that call, so I'm not familiar with the
famfs dax type, but as I mentioned above, it sounds like Gregory's
patch set is a good solution to this, so I'll explore how to integrate
with that first.

> For multi-partitions we need some review on the partition (region) names
> because you made a change which would be incompatible with the base
> series.  But it would be good to get single partitions landed and then
> multiple partitions as you have added.
> 
Sounds good. For RC2, I'll keep it simple and just rebase + bug fixes.
> > Also, apologies if these points have already been discussed. I've not been
> > following this series for very long, so forgive the ignorance as I try to
> > catch up. If you can think of any materials/documentation outside of the
> > mailing list or open collab sync notes that would help fill in the gaps,
> > please let me know :)
> 
> NP this has been a while.  I've been looking for someone to take the
> series who is more familiar with the use cases.
> 
> I look forward to you posting a new series with the support you feel you
> need.
> 
> Ira
> 
Thanks Ira, I will definitely have to keep bothering you, though I'll
try to keep it to a minimum.

Thanks,
Anisa

[snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-01-12 22:23       ` Anisa Su
@ 2026-01-15 10:28         ` Alireza Sanaee
  2026-02-11  1:44           ` Anisa Su
  0 siblings, 1 reply; 19+ messages in thread
From: Alireza Sanaee @ 2026-01-15 10:28 UTC (permalink / raw)
  To: Anisa Su
  Cc: Ira Weiny, dan.j.williams, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Mon, 12 Jan 2026 14:23:55 -0800
Anisa Su <anisa.su887@gmail.com> wrote:

Hi Anisa,

> On Fri, Dec 12, 2025 at 04:07:32PM -0600, Ira Weiny wrote:
> > Anisa Su wrote:  
> > > On Thu, Dec 04, 2025 at 11:28:40AM -0600, Ira Weiny wrote:  
> > > > anisa.su887@ wrote:  
> > > > > From: Anisa Su <anisa.su@samsung.com>
> > > > >   
> [snip] 
> > > > Unfortunately none of the details presented in this cover letter really
> > > > show why the kernel needs this additional complexity.
> > > > 
> > > > Can you go into more details on the use cases of multiple partitions?
> > > >   
> > > From what I understand, the motivation for DCD as a whole has always been a
> > > blocker for the entire series. However, this year we've seen multiple vendors
> > > demo memory pooling/sharing at SC,25[1], as well as the development of a 
> > > controller that supports "memory pooling and sharing across 
> > > multiple hosts" from Montage[2].  
> > 
> > That is great!  Do we know if they used the patches which have been
> > submitted?  Do we know if the user interfaces were sufficient?  
> 
> Sorry for the delay! While I don't know the details of the software
> stack used in those demos, I think the root of the question "Do we know
> if the user interfaces were sufficient" goes back to the missing use
> case for DCD.
> 
> So then, can I ask: how can I demonstrate a reasonable use case?
> Ex: bringing up Kubernetes pods using this patchset on real hw?
> Or something else?
> ^ This is also a question for the community, so everyone please chime in :)
> >  
> > How will this memory be presented with the new DAX changes being proposed?
> >  
> From the call today, there seemed to be general agreement that the changes
> proposed by Gregory's patches are a promising direction for DCD, bc it
> allows hotplug/unplug capabilities without needing to route it through
> the DAX subsystem.
> I haven't looked at those patches yet, but from what was
> discussed today, plan to move forward based on that. Are those the changes
> you were referring to? Or the special famfs dax type you mentioned
> below?
> > > 
> > > The flexibility and control provided by multiple partitions is
> > > an important capability of DCD for enabling composable memory 
> > > infrastructures. IMO, adding multi-partition support back in from v8 or
> > > picking up this patchset would strengthen the series.
> > > 
> > > Let me know if this proposal sounds fair? Otherwise I can
> > > separate out the 2 patches that are bug fixes.  
> > 
> > After RC1 could you rebase the series and fold the bug fixes in?
> >   
> Yep, working on rebasing the series now and will send RC2 with the bug
> fixes.
> > Before we get to multiple DCD partitions the interface for DAX devices
> > needs to be settled.  In the last community call we were discussing a
> > special famfs dax type I believe.  Has any work been done on that?
> >  
> That makes sense to me; I missed that call, so I'm not familiar with the
> famfs dax type, but as I mentioned above, it sounds like Gregory's
> patch set is a good solution to this, so I'll explore how to integrate
> with that first.
> 
> > For multi-partitions we need some review on the partition (region) names
> > because you made a change which would be incompatible with the base
> > series.  But it would be good to get single partitions landed and then
> > multiple partitions as you have added.
> >   
> Sounds good. For RC2, I'll keep it simple and just rebase + bug fixes.
Thanks Anisa. I was also about to look into the rebasing as well.
> > > Also, apologies if these points have already been discussed. I've not been
> > > following this series for very long, so forgive the ignorance as I try to
> > > catch up. If you can think of any materials/documentation outside of the
> > > mailing list or open collab sync notes that would help fill in the gaps,
> > > please let me know :)  
> > 
> > NP this has been a while.  I've been looking for someone to take the
> > series who is more familiar with the use cases.
> > 
> > I look forward to you posting a new series with the support you feel you
> > need.
> > 
> > Ira
> >   
> Thanks Ira, I will definitely have to keep bothering you, though I'll
> try to keep it to a minimum.
> 
> Thanks,
> Anisa
> 
> [snip]


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-01-15 10:28         ` Alireza Sanaee
@ 2026-02-11  1:44           ` Anisa Su
  2026-02-11  9:34             ` Alireza Sanaee
  0 siblings, 1 reply; 19+ messages in thread
From: Anisa Su @ 2026-02-11  1:44 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: Anisa Su, Ira Weiny, dan.j.williams, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Thu, Jan 15, 2026 at 10:28:19AM +0000, Alireza Sanaee wrote:
> On Mon, 12 Jan 2026 14:23:55 -0800
> Anisa Su <anisa.su887@gmail.com> wrote:
> 
> Hi Anisa,
> 
[snip] 
> > Sounds good. For RC2, I'll keep it simple and just rebase + bug fixes.
> Thanks Anisa. I was also about to look into the rebasing as well.
Hey sorry for the delay! I've rebased on cxl-next: https://github.com/anisa-su993/anisa-linux-kernel/tree/dcd-v10-2026-02-09

I tested basic add/remove and region/dax device creation with QEMU, but didn't
spend too much time testing as the bug fixes were quite small.

The bug fixes + who they were suggested by are tracked at the end of the commit
message :)

Thanks,
Anisa

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-02-11  1:44           ` Anisa Su
@ 2026-02-11  9:34             ` Alireza Sanaee
  0 siblings, 0 replies; 19+ messages in thread
From: Alireza Sanaee @ 2026-02-11  9:34 UTC (permalink / raw)
  To: Anisa Su
  Cc: Ira Weiny, dan.j.williams, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Tue, 10 Feb 2026 17:44:06 -0800
Anisa Su <anisa.su887@gmail.com> wrote:

> On Thu, Jan 15, 2026 at 10:28:19AM +0000, Alireza Sanaee wrote:
> > On Mon, 12 Jan 2026 14:23:55 -0800
> > Anisa Su <anisa.su887@gmail.com> wrote:
> > 
> > Hi Anisa,
> >   
> [snip] 
> > > Sounds good. For RC2, I'll keep it simple and just rebase + bug fixes.  
> > Thanks Anisa. I was also about to look into the rebasing as well.  
> Hey sorry for the delay! I've rebased on cxl-next: https://github.com/anisa-su993/anisa-linux-kernel/tree/dcd-v10-2026-02-09
Oh nice! Thanks Anisa.
> 
> I tested basic add/remove and region/dax device creation with QEMU, but didn't
> spend too much time testing as the bug fixes were quite small.
> 
> The bug fixes + who they were suggested by are tracked at the end of the commit
> message :)
> 
> Thanks,
> Anisa


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-11 21:05   ` Anisa Su
  2025-12-12 22:07     ` Ira Weiny
@ 2025-12-13  3:36     ` dan.j.williams
  2026-01-12 22:50       ` Anisa Su
  1 sibling, 1 reply; 19+ messages in thread
From: dan.j.williams @ 2025-12-13  3:36 UTC (permalink / raw)
  To: Anisa Su, Ira Weiny
  Cc: anisa.su887, dan.j.williams, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

Anisa Su wrote:
[..]
> > Unfortunately none of the details presented in this cover letter really
> > show why the kernel needs this additional complexity.
> > 
> > Can you go into more details on the use cases of multiple partitions?
> > 
> From what I understand, the motivation for DCD as a whole has always been a
> blocker for the entire series. However, this year we've seen multiple vendors
> demo memory pooling/sharing at SC,25[1], as well as the development of a 
> controller that supports "memory pooling and sharing across 
> multiple hosts" from Montage[2].
> 
> The flexibility and control provided by multiple partitions is
> an important capability of DCD for enabling composable memory 
> infrastructures. IMO, adding multi-partition support back in from v8 or
> picking up this patchset would strengthen the series.

Can you explain the use case for multiple partitions per-device?
Describe it in terms of what Linux loses if it never entertains this
aspect of the specification. It significantly complicates the ABI for a
benefit to Linux that I am unable to articulate.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-13  3:36     ` dan.j.williams
@ 2026-01-12 22:50       ` Anisa Su
  2026-01-13  0:08         ` Gregory Price
  0 siblings, 1 reply; 19+ messages in thread
From: Anisa Su @ 2026-01-12 22:50 UTC (permalink / raw)
  To: dan.j.williams
  Cc: Anisa Su, Ira Weiny, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Sat, Dec 13, 2025 at 12:36:24PM +0900, dan.j.williams@intel.com wrote:
> Anisa Su wrote:
> [..]
> > > Unfortunately none of the details presented in this cover letter really
> > > show why the kernel needs this additional complexity.
> > > 
> > > Can you go into more details on the use cases of multiple partitions?
> > > 
> > From what I understand, the motivation for DCD as a whole has always been a
> > blocker for the entire series. However, this year we've seen multiple vendors
> > demo memory pooling/sharing at SC,25[1], as well as the development of a 
> > controller that supports "memory pooling and sharing across 
> > multiple hosts" from Montage[2].
> > 
> > The flexibility and control provided by multiple partitions is
> > an important capability of DCD for enabling composable memory 
> > infrastructures. IMO, adding multi-partition support back in from v8 or
> > picking up this patchset would strengthen the series.
> 
> Can you explain the use case for multiple partitions per-device?
> Describe it in terms of what Linux loses if it never entertains this
> aspect of the specification. It significantly complicates the ABI for a
> benefit to Linux that I am unable to articulate.

Let me backtrack a bit here :( I was too hasty trying to push for
multiple partitions.

However, can I ask for some clarification on what a sufficient use case is
(with just a single partition)?

Then I'll (try my best) to demonstrate how this series can accomplish it
so I can help land the single partition stuff first. Does that sound fair
to you? Or am I misunderstanding the core problem?

Thanks,
Anisa

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-01-12 22:50       ` Anisa Su
@ 2026-01-13  0:08         ` Gregory Price
  0 siblings, 0 replies; 19+ messages in thread
From: Gregory Price @ 2026-01-13  0:08 UTC (permalink / raw)
  To: Anisa Su
  Cc: dan.j.williams, Ira Weiny, dave, linux-cxl, nifan.cxl,
	dongjoo.seo1

On Mon, Jan 12, 2026 at 02:50:01PM -0800, Anisa Su wrote:
> On Sat, Dec 13, 2025 at 12:36:24PM +0900, dan.j.williams@intel.com wrote:
> > Anisa Su wrote:
> > [..]
> > > > Unfortunately none of the details presented in this cover letter really
> > > > show why the kernel needs this additional complexity.
> > > > 
> > > > Can you go into more details on the use cases of multiple partitions?
> > > > 
> > > From what I understand, the motivation for DCD as a whole has always been a
> > > blocker for the entire series. However, this year we've seen multiple vendors
> > > demo memory pooling/sharing at SC,25[1], as well as the development of a 
> > > controller that supports "memory pooling and sharing across 
> > > multiple hosts" from Montage[2].
> > > 
> > > The flexibility and control provided by multiple partitions is
> > > an important capability of DCD for enabling composable memory 
> > > infrastructures. IMO, adding multi-partition support back in from v8 or
> > > picking up this patchset would strengthen the series.
> > 
> > Can you explain the use case for multiple partitions per-device?
> > Describe it in terms of what Linux loses if it never entertains this
> > aspect of the specification. It significantly complicates the ABI for a
> > benefit to Linux that I am unable to articulate.
> 
> Let me backtrack a bit here :( I was too hasty trying to push for
> multiple partitions.
> 
> However, can I ask for some clarification on what a sufficient use case is
> (with just a single partition)?
> 
> Then I'll (try my best) to demonstrate how this series can accomplish it
> so I can help land the single partition stuff first. Does that sound fair
> to you? Or am I misunderstanding the core problem?
> 
> Thanks,
> Anisa

regions != partitions

the discussion from today was regarding FAMFS wanting to balance over
subscription and max bandwidth - which requires a per-device region
and an interleaved regions, they cannot be managed with one region.

I can't think of a reason to need multiple *physical partitions* on
the device, when you can chop up a single partition with many regions.

Either way you need dedicated decoders for each region - regardless of
what partition it's on, so the extra partition doesn't get you much.

~Gregory

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2026-02-11  9:34 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
2025-12-04 17:04   ` Ira Weiny
2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
2025-12-03 21:03   ` Anisa Su
2025-12-04 17:29   ` Ira Weiny
2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887
2025-12-04 17:44   ` Ira Weiny
2025-12-03 21:19 ` [RFC PATCH 0/3] Add Support for Multiple DC Regions Anisa Su
2025-12-04 17:28 ` Ira Weiny
2025-12-11 21:05   ` Anisa Su
2025-12-12 22:07     ` Ira Weiny
2026-01-12 22:23       ` Anisa Su
2026-01-15 10:28         ` Alireza Sanaee
2026-02-11  1:44           ` Anisa Su
2026-02-11  9:34             ` Alireza Sanaee
2025-12-13  3:36     ` dan.j.williams
2026-01-12 22:50       ` Anisa Su
2026-01-13  0:08         ` Gregory Price

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox