* [RFC PATCH 0/3] Add Support for Multiple DC Regions
@ 2025-12-03 20:29 anisa.su887
2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
` (4 more replies)
0 siblings, 5 replies; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
To: dan.j.williams, ira.weiny, dave, linux-cxl
Cc: nifan.cxl, dongjoo.seo1, Anisa Su
From: Anisa Su <anisa.su@samsung.com>
This patchset introduces support for multiple DC regions. It is rebased on top
of the latest branch published to Ira's repository:
https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23.
We hope it will be useful to others in the meantime and will restart some
discussion around how to move DCD forward.
The corresponding NDCTL support can be found on this branch:
https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support.
I will reply to this thread with a link to the NDCTL patch thread once it
is published.
Testing:
This patchset was tested on a QEMU VM with the following topology:
PCIE Root (pcie.0)
│
├─ CXL Fixed Memory Window cxl-fmw.0
├─ CXL Root Complex cxl.0
│ └─ Root Port root_port1
│ └─ CXL Type-3 Device cxl-dcd0
│
├─ CXL Fixed Memory Window cxl-fmw.1
├─ CXL Root Complex cxl.1
│ └─ Root Port root_port2
│ └─ CXL Type-3 Device cxl-dcd1
"-object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/t3_cxl1.raw,size=8G \
-object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/t3_lsa1.raw,size=1M \
-object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/t3_cxl2.raw,size=8G \
-object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/t3_lsa2.raw,size=1M \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.0,hdm_for_passthrough=true \
-device pxb-cxl,bus_nr=48,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
-device cxl-rp,port=0,bus=cxl.0,id=root_port1,chassis=0,slot=1 \
-device cxl-rp,port=1,bus=cxl.1,id=root_port2,chassis=1,slot=1 \
-device cxl-type3,bus=root_port1,volatile-dc-memdev=cxl-mem1,id=cxl-dcd0,lsa=cxl-lsa1,num-dc-regions=8,sn=99 \
-device cxl-type3,bus=root_port2,volatile-dc-memdev=cxl-mem2,id=cxl-dcd1,lsa=cxl-lsa2,num-dc-regions=8,sn=100 \
-machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=8G,cxl-fmw.1.targets.0=cxl.1,cxl-fmw.1.size=8G"
2 CFMWs and 2 root complexes are emulated because QEMU creates
4 decoders per topology level. With 1 root complex, there are only 4 upstream
decoders, so creating more than 4 regions requires a second root complex for
a total of 8 upstream decoders. This does mean that we are only able to create
4 regions on each device, although each device supports up to 8.
Using `cxl list`, we can see mem0 and mem1 have dynamic_ram_* capabilities:
root@deb-101020-bm01:~# cxl list
[
{
"memdevs":[
{
"memdev":"mem0",
"dynamic_ram_0_size":1073741824,
"dynamic_ram_1_size":1073741824,
"dynamic_ram_2_size":1073741824,
"dynamic_ram_3_size":1073741824,
"dynamic_ram_4_size":1073741824,
"dynamic_ram_5_size":1073741824,
"dynamic_ram_6_size":1073741824,
"dynamic_ram_7_size":1073741824,
"serial":100,
"host":"0000:31:00.0",
"firmware_version":"BWFW VERSION 00"
},
{
"memdev":"mem1",
"dynamic_ram_0_size":1073741824,
"dynamic_ram_1_size":1073741824,
"dynamic_ram_2_size":1073741824,
"dynamic_ram_3_size":1073741824,
"dynamic_ram_4_size":1073741824,
"dynamic_ram_5_size":1073741824,
"dynamic_ram_6_size":1073741824,
"dynamic_ram_7_size":1073741824,
"serial":99,
"host":"0000:0d:00.0",
"firmware_version":"BWFW VERSION 00"
}
]
}
]
To create the 8 regions:
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_0
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_1
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_2
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_3
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_4
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_5
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_6
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_7
We can verify the 8 regions:
root@deb-101020-bm01:~# cxl list
[
{
"memdevs":[
...
},
{
"regions":[
{
"region":"region0",
"resource":79993765888,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
},
{
"region":"region6",
"resource":81067507712,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
},
{
"region":"region7",
"resource":82141249536,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
},
{
"region":"region8",
"resource":83214991360,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
},
{
"region":"region1",
"resource":88315265024,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
},
{
"region":"region2",
"resource":89389006848,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
},
{
"region":"region3",
"resource":90462748672,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
},
{
"region":"region4",
"resource":91536490496,
"size":1073741824,
"interleave_ways":1,
"interleave_granularity":256,
"decode_state":"commit"
}
]
}
]
Extents of various sizes (128MB, 256MB, 512MB, and 1GB) are added from mem1,
corresponding to regions 0-3, and DAX devices are then created from them.
The extent DPA ranges (in MB) are chosen so that each extent maps to a
distinct region:
- [0-128] --> region0
- [1024-1280] --> region1
- [2048-2560] --> region2
- [3072-4096] --> region3
The correct sizes can be verified when creating the DAX device.
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region0
[
{
"chardev":"dax0.1",
"size":134217728,
"target_node":1,
"align":2097152,
"mode":"devdax"
}
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region1
[
{
"chardev":"dax1.1",
"size":268435456,
"target_node":1,
"align":2097152,
"mode":"devdax"
}
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region2
[
{
"chardev":"dax2.1",
"size":536870912,
"target_node":1,
"align":2097152,
"mode":"devdax"
}
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region3
[
{
"chardev":"dax3.1",
"size":1073741824,
"target_node":1,
"align":2097152,
"mode":"devdax"
}
]
created 1 device
Then the DAX devices are reconfigured to system-ram mode and verified with lsmem.
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax0.1 -m system-ram
[
{
"chardev":"dax0.1",
"size":134217728,
"target_node":1,
"align":2097152,
"mode":"system-ram",
"online_memblocks":1,
"total_memblocks":1,
"movable":true
}
]
reconfigured 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax1.1 -m system-ram
...
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax2.1 -m system-ram
...
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax3.1 -m system-ram
...
root@deb-101020-bm01:~/libcxlmi# lsmem
RANGE SIZE STATE REMOVABLE BLOCK
0x0000000000000000-0x000000007fffffff 2G online yes 0-15
0x0000000100000000-0x000000027fffffff 6G online yes 32-79
0x00000012a0000000-0x00000012a7ffffff 128M online yes 596
0x00000012e0000000-0x00000012efffffff 256M online yes 604-605
0x0000001320000000-0x000000133fffffff 512M online yes 612-615
0x0000001360000000-0x000000139fffffff 1G online yes 620-627
Memory block size: 128M
Total online memory: 9.9G
Total offline memory: 0B
-------------------------------------------------------------------------------
Note: I did try hacking QEMU to create 8 decoders at each level, to avoid
needing 2 separate host bridges/DCDs, by modifying
include/hw/cxl/cxl_component.h like so:
#define CXL_HDM_DECODER_COUNT 8
HDM_DECODER_INIT(0);
HDM_DECODER_INIT(1);
HDM_DECODER_INIT(2);
HDM_DECODER_INIT(3);
HDM_DECODER_INIT(4);
HDM_DECODER_INIT(5);
HDM_DECODER_INIT(6);
HDM_DECODER_INIT(7);
However, when attempting to create the 5th CXL region,
I ran into a timeout error when committing the decoders.
I did not spend much time pursuing this further; most likely
more changes are needed on the QEMU side.
The 8 decoders do show up correctly under sysfs, though.
Fan Ni (3):
core/region: fix return logic for store_targetN
dax/cxl: add existing dc extents when probing dax region
dcd: Add support for multiple DC regions
drivers/cxl/core/cdat.c | 2 +-
drivers/cxl/core/core.h | 9 +-
drivers/cxl/core/extent.c | 2 +-
drivers/cxl/core/hdm.c | 18 +++-
drivers/cxl/core/mbox.c | 39 +++++----
drivers/cxl/core/memdev.c | 179 +++++++++++++++++++++++++-------------
drivers/cxl/core/port.c | 45 ++++++++--
drivers/cxl/core/region.c | 65 ++++++++------
drivers/cxl/cxl.h | 23 ++++-
drivers/cxl/cxlmem.h | 5 +-
drivers/dax/cxl.c | 28 ++----
11 files changed, 281 insertions(+), 134 deletions(-)
--
2.51.0
^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 1/3] core/region: fix return logic for store_targetN
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
@ 2025-12-03 20:29 ` anisa.su887
  2025-12-04 17:04   ` Ira Weiny
  0 siblings, 1 reply; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

From: Fan Ni <fan.ni@samsung.com>

Currently, store_targetN attempts to attach_target even if an error
(cxlr->mode is incorrect, DCD is unsupported). Add a goto out statement
to skip any attempt to attach the target on error.

Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
Tested-by: Anisa Su <anisa.su@samsung.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
---
 drivers/cxl/core/region.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index a93979dd345d..adab1f338ee9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2258,7 +2258,8 @@ static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos,
 	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A &&
 	    !cxl_dcd_supported(cxled_to_mds(cxled))) {
 		dev_dbg(dev, "DCD unsupported\n");
-		return -EINVAL;
+		rc = -EINVAL;
+		goto out;
 	}
 	rc = attach_target(cxlr, cxled, pos, TASK_INTERRUPTIBLE);
 out:
--
2.51.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread
* Re: [RFC PATCH 1/3] core/region: fix return logic for store_targetN
  2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
@ 2025-12-04 17:04   ` Ira Weiny
  0 siblings, 0 replies; 19+ messages in thread
From: Ira Weiny @ 2025-12-04 17:04 UTC (permalink / raw)
  To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

anisa.su887@ wrote:
> From: Fan Ni <fan.ni@samsung.com>
>
> Currently, store_targetN attempts to attach_target even if an error

I think you mean the endpoint decoder device reference is not released,
correct?

> (cxlr->mode is incorrect, DCD is unsupported). Add a goto out statement
> to skip any attempt to attach the target on error.
>

With an updated commit message:

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread
* [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
  2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
@ 2025-12-03 20:29 ` anisa.su887
  2025-12-03 21:03   ` Anisa Su
  2025-12-04 17:29   ` Ira Weiny
  2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887
  ` (2 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

From: Fan Ni <fan.ni@samsung.com>

Add existing dc extents on the device before probing dax region will
cause the creation of the dax device fail as resource cannot present
when driver is bound to the device as shown in really_probe().

We delay the processing of existing dc extents to cxl region driver
probe.

Question: the guard() in cxlr_notify_extent() will cause lock issue,
removed it. Not sure whether it will cause issue or not although no
issue is observed during test.

Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
Tested-by: Anisa Su <anisa.su@samsung.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
---
 drivers/cxl/core/extent.c |  2 +-
 drivers/cxl/core/region.c |  8 ++------
 drivers/cxl/cxl.h         |  5 +++++
 drivers/dax/cxl.c         | 24 +++++++-----------------
 4 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
index 3e7295d3e5e2..3b0e4d72d4ac 100644
--- a/drivers/cxl/core/extent.c
+++ b/drivers/cxl/core/extent.c
@@ -285,7 +285,7 @@ static int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event,
 	dev_dbg(dev, "Trying notify: type %d HPA %pra\n", event,
 		&region_extent->hpa_range);
 
-	guard(device)(dev);
+	// guard(device)(dev);
 
 	/*
 	 * The lack of a driver indicates a notification has failed.  No user
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index adab1f338ee9..da3ea3cf8585 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3232,7 +3232,7 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
 	return rc;
 }
 
-static int cxlr_add_existing_extents(struct cxl_region *cxlr)
+int cxlr_add_existing_extents(struct cxl_region *cxlr)
 {
 	struct cxl_region_params *p = &cxlr->params;
 	int i, latched_rc = 0;
@@ -3251,6 +3251,7 @@ static int cxlr_add_existing_extents(struct cxl_region *cxlr)
 
 	return latched_rc;
 }
+EXPORT_SYMBOL_NS_GPL(cxlr_add_existing_extents, "CXL");
 
 static void cxlr_dax_unregister(void *_cxlr_dax)
 {
@@ -3287,11 +3288,6 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 	dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
 		dev_name(dev));
 
-	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A)
-		if (cxlr_add_existing_extents(cxlr))
-			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
-				rc);
-
 	return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
 					cxlr_dax);
 err:
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index d22fe5e50647..3e400dd4f08b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -926,6 +926,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
 int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
 struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
 u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
+int cxlr_add_existing_extents(struct cxl_region *cxlr);
 #else
 static inline bool is_cxl_pmem_region(struct device *dev)
 {
@@ -948,6 +949,10 @@ static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
 {
 	return 0;
 }
+int cxlr_add_existing_extents(struct cxl_region *cxlr)
+{
+	return 0;
+}
 #endif
 
 void cxl_endpoint_parse_cdat(struct cxl_port *port);
diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
index 011bd1dc7691..15fc2de63185 100644
--- a/drivers/dax/cxl.c
+++ b/drivers/dax/cxl.c
@@ -18,21 +18,6 @@ static int __cxl_dax_add_resource(struct dax_region *dax_region,
 	return dax_region_add_resource(dax_region, dev, start, length);
 }
 
-static int cxl_dax_add_resource(struct device *dev, void *data)
-{
-	struct dax_region *dax_region = data;
-	struct region_extent *region_extent;
-
-	region_extent = to_region_extent(dev);
-	if (!region_extent)
-		return 0;
-
-	dev_dbg(dax_region->dev, "Adding resource HPA %pra\n",
-		&region_extent->hpa_range);
-
-	return __cxl_dax_add_resource(dax_region, region_extent);
-}
-
 static int cxl_dax_region_notify(struct device *dev,
 				 struct cxl_notify_data *notify_data)
 {
@@ -66,6 +51,7 @@ static int cxl_dax_region_probe(struct device *dev)
 	struct dev_dax_data data;
 	resource_size_t dev_size;
 	unsigned long flags;
+	int rc;
 
 	if (nid == NUMA_NO_NODE)
 		nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start);
@@ -80,8 +66,12 @@ static int cxl_dax_region_probe(struct device *dev)
 		return -ENOMEM;
 
 	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) {
-		device_for_each_child(&cxlr_dax->dev, dax_region,
-				      cxl_dax_add_resource);
+		rc = cxlr_add_existing_extents(cxlr);
+		/* If adding existing extents fails, continue with only an error
+		 * message ?? */
+		if (rc)
+			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
+				rc);
 		/* Add empty seed dax device */
 		dev_size = 0;
 	} else {
--
2.51.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread
* Re: [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region
  2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
@ 2025-12-03 21:03   ` Anisa Su
  2025-12-04 17:29   ` Ira Weiny
  1 sibling, 0 replies; 19+ messages in thread
From: Anisa Su @ 2025-12-03 21:03 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ira.weiny, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Wed, Dec 03, 2025 at 08:29:12PM +0000, anisa.su887@gmail.com wrote:
> From: Fan Ni <fan.ni@samsung.com>
>
> Add existing dc extents on the device before probing dax region will
> cause the creation of the dax device fail as resource cannot present
> when driver is bound to the device as shown in really_probe().
>
> We delay the processing of existing dc extents to cxl region driver
> probe.
>
> Question: the guard() in cxlr_notify_extent() will cause lock issue,
> removed it. Not sure whether it will cause issue or not although no
> issue is observed during test.
>

Hi Fan,

I added the guard() back in when I was testing and did not run into any
issues. Do you recall where you ran into the locking issue?

- Anisa

> Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
> Tested-by: Anisa Su <anisa.su@samsung.com>
> Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
> ---
>  drivers/cxl/core/extent.c |  2 +-
>  drivers/cxl/core/region.c |  8 ++------
>  drivers/cxl/cxl.h         |  5 +++++
>  drivers/dax/cxl.c         | 24 +++++++-----------------
>  4 files changed, 15 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> index 3e7295d3e5e2..3b0e4d72d4ac 100644
> --- a/drivers/cxl/core/extent.c
> +++ b/drivers/cxl/core/extent.c
> @@ -285,7 +285,7 @@ static int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event,
>  	dev_dbg(dev, "Trying notify: type %d HPA %pra\n", event,
>  		&region_extent->hpa_range);
>
> -	guard(device)(dev);
> +	// guard(device)(dev);
>
>  	/*
>  	 * The lack of a driver indicates a notification has failed.  No user
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index adab1f338ee9..da3ea3cf8585 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3232,7 +3232,7 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr)
>  	return rc;
>  }
>
> -static int cxlr_add_existing_extents(struct cxl_region *cxlr)
> +int cxlr_add_existing_extents(struct cxl_region *cxlr)
>  {
>  	struct cxl_region_params *p = &cxlr->params;
>  	int i, latched_rc = 0;
> @@ -3251,6 +3251,7 @@ static int cxlr_add_existing_extents(struct cxl_region *cxlr)
>
>  	return latched_rc;
>  }
> +EXPORT_SYMBOL_NS_GPL(cxlr_add_existing_extents, "CXL");
>
>  static void cxlr_dax_unregister(void *_cxlr_dax)
>  {
> @@ -3287,11 +3288,6 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
>  	dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
>  		dev_name(dev));
>
> -	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A)
> -		if (cxlr_add_existing_extents(cxlr))
> -			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
> -				rc);
> -
>  	return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
>  					cxlr_dax);
>  err:
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index d22fe5e50647..3e400dd4f08b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -926,6 +926,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>  int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
>  struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
>  u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
> +int cxlr_add_existing_extents(struct cxl_region *cxlr);
>  #else
>  static inline bool is_cxl_pmem_region(struct device *dev)
>  {
> @@ -948,6 +949,10 @@ static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
>  {
>  	return 0;
>  }
> +int cxlr_add_existing_extents(struct cxl_region *cxlr)
> +{
> +	return 0;
> +}
>  #endif
>
>  void cxl_endpoint_parse_cdat(struct cxl_port *port);
> diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
> index 011bd1dc7691..15fc2de63185 100644
> --- a/drivers/dax/cxl.c
> +++ b/drivers/dax/cxl.c
> @@ -18,21 +18,6 @@ static int __cxl_dax_add_resource(struct dax_region *dax_region,
>  	return dax_region_add_resource(dax_region, dev, start, length);
>  }
>
> -static int cxl_dax_add_resource(struct device *dev, void *data)
> -{
> -	struct dax_region *dax_region = data;
> -	struct region_extent *region_extent;
> -
> -	region_extent = to_region_extent(dev);
> -	if (!region_extent)
> -		return 0;
> -
> -	dev_dbg(dax_region->dev, "Adding resource HPA %pra\n",
> -		&region_extent->hpa_range);
> -
> -	return __cxl_dax_add_resource(dax_region, region_extent);
> -}
> -
>  static int cxl_dax_region_notify(struct device *dev,
>  				 struct cxl_notify_data *notify_data)
>  {
> @@ -66,6 +51,7 @@ static int cxl_dax_region_probe(struct device *dev)
>  	struct dev_dax_data data;
>  	resource_size_t dev_size;
>  	unsigned long flags;
> +	int rc;
>
>  	if (nid == NUMA_NO_NODE)
>  		nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start);
> @@ -80,8 +66,12 @@ static int cxl_dax_region_probe(struct device *dev)
>  		return -ENOMEM;
>
>  	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) {
> -		device_for_each_child(&cxlr_dax->dev, dax_region,
> -				      cxl_dax_add_resource);
> +		rc = cxlr_add_existing_extents(cxlr);
> +		/* If adding existing extents fails, continue with only an error
> +		 * message ?? */
> +		if (rc)
> +			dev_err(&cxlr->dev, "Existing extent processing failed %d\n",
> +				rc);
>  		/* Add empty seed dax device */
>  		dev_size = 0;
>  	} else {
> --
> 2.51.0
>

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region
  2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
  2025-12-03 21:03   ` Anisa Su
@ 2025-12-04 17:29   ` Ira Weiny
  1 sibling, 0 replies; 19+ messages in thread
From: Ira Weiny @ 2025-12-04 17:29 UTC (permalink / raw)
  To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

anisa.su887@ wrote:
> From: Fan Ni <fan.ni@samsung.com>
>
> Add existing dc extents on the device before probing dax region will
> cause the creation of the dax device fail as resource cannot present
> when driver is bound to the device as shown in really_probe().

It's been a while since I've looked at this but the above explanation is
not clear to me.

There can't be dax devices on a region before region devices. So how is
a dax device driver preventing the creation of a resource while the
region is being probed?

>
> We delay the processing of existing dc extents to cxl region driver

NIT: Don't use 'we'. Just say: "Delay the processing..."

> probe.
>
> Question: the guard() in cxlr_notify_extent() will cause lock issue,
> removed it. Not sure whether it will cause issue or not although no
> issue is observed during test.
>
> Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
> Tested-by: Anisa Su <anisa.su@samsung.com>
> Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
> ---
>  drivers/cxl/core/extent.c |  2 +-
>  drivers/cxl/core/region.c |  8 ++------
>  drivers/cxl/cxl.h         |  5 +++++
>  drivers/dax/cxl.c         | 24 +++++++-----------------
>  4 files changed, 15 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> index 3e7295d3e5e2..3b0e4d72d4ac 100644
> --- a/drivers/cxl/core/extent.c
> +++ b/drivers/cxl/core/extent.c
> @@ -285,7 +285,7 @@ static int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event,
>  	dev_dbg(dev, "Trying notify: type %d HPA %pra\n", event,
>  		&region_extent->hpa_range);
>
> -	guard(device)(dev);
> +	// guard(device)(dev);

This must remain to check for the driver notify callback.

I'm totally willing to admit there might be issues with this code but
I'm not clear what problem this patch is fixing. Perhaps some more
details?

Ira

[snip]

^ permalink raw reply	[flat|nested] 19+ messages in thread
* [RFC PATCH 3/3] dcd: Add support for multiple DC regions
  2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
  2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
  2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
@ 2025-12-03 20:29 ` anisa.su887
  2025-12-04 17:44   ` Ira Weiny
  2025-12-03 21:19 ` [RFC PATCH 0/3] Add Support for Multiple DC Regions Anisa Su
  2025-12-04 17:28 ` Ira Weiny
  4 siblings, 1 reply; 19+ messages in thread
From: anisa.su887 @ 2025-12-03 20:29 UTC (permalink / raw)
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su

From: Fan Ni <fan.ni@samsung.com>

With the change, we add following support:
1. Allow creating multiple DC regions (up to 8);
2. Allow DC extents to belong to regions other than region 0;
3. Modify sysfs entries to enable the above capabilities;
4. Shareable attribute is added to dc region (partition);

This series is tested with proper NDCTL fix, see:
https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support

Signed-off-by: Fan Ni <nifan.cxl@gmail.com>
Tested-by: Anisa Su <anisa.su@samsung.com>
Tested-by: Dongjoo Seo <dongjoo.seo1@samsung.com>
---
 drivers/cxl/core/cdat.c   |   2 +-
 drivers/cxl/core/core.h   |   9 +-
 drivers/cxl/core/hdm.c    |  18 +++-
 drivers/cxl/core/mbox.c   |  39 +++++----
 drivers/cxl/core/memdev.c | 179 +++++++++++++++++++++++++-------------
 drivers/cxl/core/port.c   |  45 ++++++++--
 drivers/cxl/core/region.c |  54 ++++++++----
 drivers/cxl/cxl.h         |  18 +++-
 drivers/cxl/cxlmem.h      |   5 +-
 drivers/dax/cxl.c         |   4 +-
 10 files changed, 264 insertions(+), 109 deletions(-)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 67c6917a9add..4b05af576a4f 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -278,7 +278,7 @@ static void cxl_memdev_set_qos_class(struct cxl_dev_state *cxlds,
 		};
 
 		if (range_contains(&range, &dent->dpa_range)) {
-			if (mode == CXL_PARTMODE_DYNAMIC_RAM_A &&
+			if (is_cxl_dc_partition_mode(mode) &&
 			    dent->handle != handle)
 				dev_warn(dev,
 					 "Dynamic RAM perf mismatch; %pra (%u) vs %pra (%u)\n",
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 70942c40221b..061dcf3320cd 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -34,7 +34,14 @@ int cxl_region_invalidate_memregion(struct cxl_region *cxlr);
 #ifdef CONFIG_CXL_REGION
 extern struct device_attribute dev_attr_create_pmem_region;
 extern struct device_attribute dev_attr_create_ram_region;
-extern struct device_attribute dev_attr_create_dynamic_ram_a_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_0_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_1_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_2_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_3_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_4_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_5_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_6_region;
+extern struct device_attribute dev_attr_create_dynamic_ram_7_region;
 extern struct device_attribute dev_attr_delete_region;
 extern struct device_attribute dev_attr_region;
 extern const struct device_type cxl_pmem_region_type;
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 6b976da4a70a..faa4656f9542 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -463,8 +463,22 @@ static const char *cxl_mode_name(enum cxl_partition_mode mode)
 		return "ram";
 	case CXL_PARTMODE_PMEM:
 		return "pmem";
-	case CXL_PARTMODE_DYNAMIC_RAM_A:
-		return "dynamic_ram_a";
+	case CXL_PARTMODE_DYNAMIC_RAM_0:
+		return "dynamic_ram_0";
+	case CXL_PARTMODE_DYNAMIC_RAM_1:
+		return "dynamic_ram_1";
+	case CXL_PARTMODE_DYNAMIC_RAM_2:
+		return "dynamic_ram_2";
+	case CXL_PARTMODE_DYNAMIC_RAM_3:
+		return "dynamic_ram_3";
+	case CXL_PARTMODE_DYNAMIC_RAM_4:
+		return "dynamic_ram_4";
+	case CXL_PARTMODE_DYNAMIC_RAM_5:
+		return "dynamic_ram_5";
+	case CXL_PARTMODE_DYNAMIC_RAM_6:
+		return "dynamic_ram_6";
+	case CXL_PARTMODE_DYNAMIC_RAM_7:
+		return "dynamic_ram_7";
 	default:
 		return "";
 	};
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index a6de98eb1310..291a96757ac8 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -963,7 +963,7 @@ static int cxl_validate_extent(struct cxl_memdev_state *mds,
 	for (int i = 0; i < cxlds->nr_partitions; i++) {
 		struct cxl_dpa_partition *part = &cxlds->part[i];
 
-		if (part->mode != CXL_PARTMODE_DYNAMIC_RAM_A)
+		if (!is_cxl_dc_partition_mode(part->mode))
 			continue;
 
 		struct range partition_range = (struct range) {
@@ -1710,6 +1710,7 @@ static int cxl_get_dc_config(struct cxl_mailbox *mbox, u8 start_partition,
  * device.
  * @mbox: Mailbox to query
  * @dc_info: The dynamic partition information to return
+ * @num_part: The number of dynamic partitions returned
  *
  * Read Dynamic Capacity information from the device and return the partition
  * information.
@@ -1718,7 +1719,7 @@ static int cxl_get_dc_config(struct cxl_mailbox *mbox, u8 start_partition,
  * on error only dynamic_bytes is left unchanged.
  */
 int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
-			struct cxl_dc_partition_info *dc_info)
+			struct cxl_dc_partition_info *dc_info, int *num_part)
 {
 	struct cxl_dc_partition_info partitions[CXL_MAX_DC_PARTITIONS];
 	size_t dc_resp_size = mbox->payload_size;
@@ -1763,12 +1764,15 @@ int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
 
 	} while (num_partitions < dc_resp->avail_partition_count);
 
-	/* Return 1st partition */
-	dc_info->start = partitions[0].start;
-	dc_info->size = partitions[0].size;
-	dc_info->handle = partitions[0].handle;
-	dev_dbg(dev, "Returning partition 0 %zu size %zu\n",
-		dc_info->start, dc_info->size);
+
+	*num_part = dc_resp->avail_partition_count;
+	for (int i = 0; i < dc_resp->avail_partition_count; i++) {
+		dc_info[i].start = partitions[i].start;
+		dc_info[i].size = partitions[i].size;
+		dc_info[i].handle = partitions[i].handle;
+		dev_dbg(dev, "Returning partition %d %zu size %zu\n",
+			i, dc_info[i].start, dc_info[i].size);
+	}
 
 	return 0;
 }
@@ -1955,12 +1959,12 @@ EXPORT_SYMBOL_NS_GPL(cxl_get_dirty_count, "CXL");
 
 void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
 {
-	struct cxl_dc_partition_info dc_info = { 0 };
+	struct cxl_dc_partition_info dc_info[CXL_MAX_DC_PARTITIONS];
 	struct device *dev = mds->cxlds.dev;
 	size_t skip;
-	int rc;
+	int rc, num_part;
 
-	rc = cxl_dev_dc_identify(&mds->cxlds.cxl_mbox, &dc_info);
+	rc = cxl_dev_dc_identify(&mds->cxlds.cxl_mbox, dc_info, &num_part);
 	if (rc) {
 		dev_warn(dev,
 			 "Failed to read Dynamic Capacity config: %d\n", rc);
@@ -1969,7 +1973,7 @@ void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
 	}
 
 	/* Skips between pmem and the dynamic partition are not supported */
-	skip = dc_info.start - info->size;
+	skip = dc_info[0].start - info->size;
 	if (skip) {
 		dev_warn(dev,
 			 "Dynamic Capacity skip from pmem not supported: %zu\n",
@@ -1978,10 +1982,13 @@ void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
 		return;
 	}
 
-	info->size += dc_info.size;
-	dev_dbg(dev, "Adding dynamic ram partition A; %zu size %zu\n",
-		dc_info.start, dc_info.size);
-	add_part(info, dc_info.start, dc_info.size, CXL_PARTMODE_DYNAMIC_RAM_A);
+	for (int i = 0; i < num_part; i++) {
+		info->size += dc_info[i].size;
+		dev_dbg(dev, "Adding dynamic ram partition %d; %zu size %zu\n",
+			i, dc_info[i].start, dc_info[i].size);
+		add_part(info, dc_info[i].start, dc_info[i].size, CXL_PARTITION_DC_MODE(0) + i);
+	}
+	mds->cxlds.nr_dc_partitions = num_part;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_configure_dcd, "CXL");
 
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index c53b06522d6c..720780901f5a 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020 Intel Corporation. */
 
 #include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/string_choices.h>
 #include <linux/firmware.h>
 #include <linux/device.h>
 #include <linux/slab.h>
@@ -102,18 +103,115 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_pmem_size =
 	__ATTR(size, 0444, pmem_size_show, NULL);
 
-static ssize_t dynamic_ram_a_size_show(struct device *dev, struct device_attribute *attr,
-				       char *buf)
+static ssize_t dynamic_ram_N_size_show(struct cxl_memdev *cxlmd, char *buf, int pos)
 {
-	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-	unsigned long long len = cxl_part_size(cxlds, CXL_PARTMODE_DYNAMIC_RAM_A);
+	unsigned long long len = cxl_part_size(cxlds, CXL_PARTITION_DC_MODE(0) + pos);
 
 	return sysfs_emit(buf, "%#llx\n", len);
 }
 
-static struct device_attribute dev_attr_dynamic_ram_a_size =
-	__ATTR(size, 0444, dynamic_ram_a_size_show, NULL);
+static ssize_t dynamic_ram_N_shareable_show(struct cxl_memdev *cxlmd, char *buf, int pos)
+{
+	enum cxl_partition_mode mode = CXL_PARTITION_DC_MODE(0) + pos;
+	bool val = cxlmd->cxlds->part[mode].perf.shareable;
+
+	return sysfs_emit(buf, "%s\n", str_true_false(val));
+}
+
+static struct cxl_dpa_perf *part_perf(struct cxl_dev_state *cxlds,
+				      enum cxl_partition_mode mode)
+{
+	for (int i = 0; i < cxlds->nr_partitions; i++)
+		if (cxlds->part[i].mode == mode)
+			return &cxlds->part[i].perf;
+	return NULL;
+}
+
+static ssize_t dynamic_ram_N_qos_class_show(struct cxl_memdev *cxlmd,
+					    char *buf, int pos)
+{
+	enum cxl_partition_mode mode = CXL_PARTITION_DC_MODE(0) + pos;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	return sysfs_emit(buf, "%d\n", part_perf(cxlds, mode)->qos_class);
+}
+
+#define CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(n)					\
+static ssize_t dynamic_ram_##n##_size_show(struct device *dev,			\
+					   struct device_attribute *attr,	\
+					   char *buf)				\
+{										\
+	return dynamic_ram_N_size_show(to_cxl_memdev(dev), buf, (n));		\
+}										\
+struct device_attribute dynamic_ram_##n##_size = {				\
+	.attr = { .name = "size", .mode = 0444 },				\
+	.show = dynamic_ram_##n##_size_show,					\
+};										\
+static ssize_t dynamic_ram_##n##_shareable_show(struct device *dev,		\
+						struct device_attribute *attr,	\
+						char *buf)			\
+{										\
+	return dynamic_ram_N_shareable_show(to_cxl_memdev(dev), buf, (n));	\
+}										\
+struct device_attribute dynamic_ram_##n##_shareable = {				\
+	.attr = { .name = "shareable", .mode = 0444 },				\
+	.show = dynamic_ram_##n##_shareable_show,				\
+};										\
+static ssize_t dynamic_ram_##n##_qos_class_show(struct device *dev,		\
+						struct device_attribute *attr,	\
+						char *buf)			\
+{										\
+	return dynamic_ram_N_qos_class_show(to_cxl_memdev(dev), buf, (n));	\
+}										\
+struct device_attribute dynamic_ram_##n##_qos_class = {				\
+	.attr = { .name = "qos_class", .mode = 0444 },				\
+	.show = dynamic_ram_##n##_qos_class_show,				\
+};										\
+static struct attribute *cxl_memdev_dynamic_ram_##n##_attributes[] = {		\
+	&dynamic_ram_##n##_size.attr,						\
+	&dynamic_ram_##n##_shareable.attr,					\
+	&dynamic_ram_##n##_qos_class.attr,					\
+	NULL,									\
+};										\
+static umode_t cxl_memdev_dynamic_ram_##n##_attr_visible(struct kobject *kobj,	\
+							 struct attribute *a,	\
+							 int pos)		\
+{										\
+	struct device *dev = kobj_to_dev(kobj);					\
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);				\
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);	\
+										\
+	if (!mds)								\
+		return 0;							\
+										\
+	return a->mode;								\
+}										\
+static umode_t cxl_memdev_dynamic_ram_##n##_group_visible(struct kobject *kobj)	\
+{										\
+	struct device *dev = kobj_to_dev(kobj);					\
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);				\
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);	\
+										\
+	if (!mds || n >= mds->cxlds.nr_dc_partitions)				\
+		return 0;							\
+										\
+	return true;								\
+}										\
+DEFINE_SYSFS_GROUP_VISIBLE(cxl_memdev_dynamic_ram_##n);				\
+static struct attribute_group cxl_memdev_dynamic_ram_##n##_attribute_group = {	\
+	.name = "dynamic_ram_"#n,						\
+	.attrs = cxl_memdev_dynamic_ram_##n##_attributes,			\
+	.is_visible = SYSFS_GROUP_VISIBLE(cxl_memdev_dynamic_ram_##n),		\
+}
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(0);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(1);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(2);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(3);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(4);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(5);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(6);
+CXL_MEMDEV_DYNAMIC_RAM_ATTR_GROUP(7);
 
 static ssize_t serial_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
@@ -399,15 +497,6 @@ static struct attribute *cxl_memdev_attributes[] = {
 	NULL,
 };
 
-static struct cxl_dpa_perf *part_perf(struct cxl_dev_state *cxlds,
-				      enum cxl_partition_mode mode)
-{
-	for (int i = 0; i < cxlds->nr_partitions; i++)
-		if (cxlds->part[i].mode == mode)
-			return &cxlds->part[i].perf;
-	return NULL;
-}
-
 static ssize_t pmem_qos_class_show(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
@@ -426,25 +515,6 @@ static struct attribute *cxl_memdev_pmem_attributes[] = {
 	NULL,
 };
 
-static ssize_t dynamic_ram_a_qos_class_show(struct device *dev,
-					    struct device_attribute *attr, char *buf)
-{
-	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-
-	return sysfs_emit(buf, "%d\n",
-
part_perf(cxlds, CXL_PARTMODE_DYNAMIC_RAM_A)->qos_class); -} - -static struct device_attribute dev_attr_dynamic_ram_a_qos_class = - __ATTR(qos_class, 0444, dynamic_ram_a_qos_class_show, NULL); - -static struct attribute *cxl_memdev_dynamic_ram_a_attributes[] = { - &dev_attr_dynamic_ram_a_size.attr, - &dev_attr_dynamic_ram_a_qos_class.attr, - NULL, -}; - static ssize_t ram_qos_class_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -521,29 +591,6 @@ static struct attribute_group cxl_memdev_pmem_attribute_group = { .is_visible = cxl_pmem_visible, }; -static umode_t cxl_dynamic_ram_a_visible(struct kobject *kobj, struct attribute *a, int n) -{ - struct device *dev = kobj_to_dev(kobj); - struct cxl_memdev *cxlmd = to_cxl_memdev(dev); - struct cxl_dpa_perf *perf = part_perf(cxlmd->cxlds, CXL_PARTMODE_DYNAMIC_RAM_A); - - if (a == &dev_attr_dynamic_ram_a_qos_class.attr && - (!perf || perf->qos_class == CXL_QOS_CLASS_INVALID)) - return 0; - - if (a == &dev_attr_dynamic_ram_a_size.attr && - (!cxl_part_size(cxlmd->cxlds, CXL_PARTMODE_DYNAMIC_RAM_A))) - return 0; - - return a->mode; -} - -static struct attribute_group cxl_memdev_dynamic_ram_a_attribute_group = { - .name = "dynamic_ram_a", - .attrs = cxl_memdev_dynamic_ram_a_attributes, - .is_visible = cxl_dynamic_ram_a_visible, -}; - static umode_t cxl_memdev_security_visible(struct kobject *kobj, struct attribute *a, int n) { @@ -572,7 +619,14 @@ static const struct attribute_group *cxl_memdev_attribute_groups[] = { &cxl_memdev_attribute_group, &cxl_memdev_ram_attribute_group, &cxl_memdev_pmem_attribute_group, - &cxl_memdev_dynamic_ram_a_attribute_group, + &cxl_memdev_dynamic_ram_0_attribute_group, + &cxl_memdev_dynamic_ram_1_attribute_group, + &cxl_memdev_dynamic_ram_2_attribute_group, + &cxl_memdev_dynamic_ram_3_attribute_group, + &cxl_memdev_dynamic_ram_4_attribute_group, + &cxl_memdev_dynamic_ram_5_attribute_group, + &cxl_memdev_dynamic_ram_6_attribute_group, + 
&cxl_memdev_dynamic_ram_7_attribute_group, &cxl_memdev_security_attribute_group, NULL, }; @@ -581,7 +635,14 @@ void cxl_memdev_update_perf(struct cxl_memdev *cxlmd) { sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_ram_attribute_group); sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_pmem_attribute_group); - sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_a_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_0_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_1_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_2_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_3_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_4_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_5_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_6_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_7_attribute_group); } EXPORT_SYMBOL_NS_GPL(cxl_memdev_update_perf, "CXL"); diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 3f94dbf63ba9..68b88159e525 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -119,7 +119,14 @@ static DEVICE_ATTR_RO(name) CXL_DECODER_FLAG_ATTR(cap_pmem, CXL_DECODER_F_PMEM); CXL_DECODER_FLAG_ATTR(cap_ram, CXL_DECODER_F_RAM); -CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_a, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_0, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_1, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_2, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_3, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_4, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_5, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_6, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_7, CXL_DECODER_F_RAM); 
CXL_DECODER_FLAG_ATTR(cap_type2, CXL_DECODER_F_TYPE2); CXL_DECODER_FLAG_ATTR(cap_type3, CXL_DECODER_F_TYPE3); CXL_DECODER_FLAG_ATTR(locked, CXL_DECODER_F_LOCK); @@ -214,8 +221,22 @@ static ssize_t mode_store(struct device *dev, struct device_attribute *attr, mode = CXL_PARTMODE_PMEM; else if (sysfs_streq(buf, "ram")) mode = CXL_PARTMODE_RAM; - else if (sysfs_streq(buf, "dynamic_ram_a")) - mode = CXL_PARTMODE_DYNAMIC_RAM_A; + else if (sysfs_streq(buf, "dynamic_ram_0")) + mode = CXL_PARTMODE_DYNAMIC_RAM_0; + else if (sysfs_streq(buf, "dynamic_ram_1")) + mode = CXL_PARTMODE_DYNAMIC_RAM_1; + else if (sysfs_streq(buf, "dynamic_ram_2")) + mode = CXL_PARTMODE_DYNAMIC_RAM_2; + else if (sysfs_streq(buf, "dynamic_ram_3")) + mode = CXL_PARTMODE_DYNAMIC_RAM_3; + else if (sysfs_streq(buf, "dynamic_ram_4")) + mode = CXL_PARTMODE_DYNAMIC_RAM_4; + else if (sysfs_streq(buf, "dynamic_ram_5")) + mode = CXL_PARTMODE_DYNAMIC_RAM_5; + else if (sysfs_streq(buf, "dynamic_ram_6")) + mode = CXL_PARTMODE_DYNAMIC_RAM_6; + else if (sysfs_streq(buf, "dynamic_ram_7")) + mode = CXL_PARTMODE_DYNAMIC_RAM_7; else return -EINVAL; @@ -321,14 +342,28 @@ static struct attribute_group cxl_decoder_base_attribute_group = { static struct attribute *cxl_decoder_root_attrs[] = { &dev_attr_cap_pmem.attr, &dev_attr_cap_ram.attr, - &dev_attr_cap_dynamic_ram_a.attr, + &dev_attr_cap_dynamic_ram_0.attr, + &dev_attr_cap_dynamic_ram_1.attr, + &dev_attr_cap_dynamic_ram_2.attr, + &dev_attr_cap_dynamic_ram_3.attr, + &dev_attr_cap_dynamic_ram_4.attr, + &dev_attr_cap_dynamic_ram_5.attr, + &dev_attr_cap_dynamic_ram_6.attr, + &dev_attr_cap_dynamic_ram_7.attr, &dev_attr_cap_type2.attr, &dev_attr_cap_type3.attr, &dev_attr_target_list.attr, &dev_attr_qos_class.attr, SET_CXL_REGION_ATTR(create_pmem_region) SET_CXL_REGION_ATTR(create_ram_region) - SET_CXL_REGION_ATTR(create_dynamic_ram_a_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_0_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_1_region) + 
SET_CXL_REGION_ATTR(create_dynamic_ram_2_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_3_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_4_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_5_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_6_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_7_region) SET_CXL_REGION_ATTR(delete_region) NULL, }; diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index da3ea3cf8585..1a53c74b814c 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -499,7 +499,7 @@ static ssize_t interleave_ways_store(struct device *dev, if (rc) return rc; - if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A && val != 1) { + if (is_cxl_dc_partition_mode(cxlr->mode) && val != 1) { dev_err(dev, "Interleaving and DCD not supported\n"); return -EINVAL; } @@ -2255,7 +2255,7 @@ static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos, } cxled = to_cxl_endpoint_decoder(dev); - if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A && + if (is_cxl_dc_partition_mode(cxlr->mode) && !cxl_dcd_supported(cxled_to_mds(cxled))) { dev_dbg(dev, "DCD unsupported\n"); rc = -EINVAL; @@ -2606,7 +2606,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd, switch (mode) { case CXL_PARTMODE_RAM: case CXL_PARTMODE_PMEM: - case CXL_PARTMODE_DYNAMIC_RAM_A: + case CXL_PARTMODE_DYNAMIC_RAM_0...CXL_PARTMODE_DYNAMIC_RAM_7: break; default: dev_err(&cxlrd->cxlsd.cxld.dev, "unsupported mode %d\n", mode); @@ -2659,20 +2659,36 @@ static ssize_t create_ram_region_store(struct device *dev, } DEVICE_ATTR_RW(create_ram_region); -static ssize_t create_dynamic_ram_a_region_show(struct device *dev, - struct device_attribute *attr, - char *buf) -{ - return __create_region_show(to_cxl_root_decoder(dev), buf); -} - -static ssize_t create_dynamic_ram_a_region_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t len) -{ - return create_region_store(dev, buf, len, CXL_PARTMODE_DYNAMIC_RAM_A); -} 
-DEVICE_ATTR_RW(create_dynamic_ram_a_region); +#define CREATE_DYNAMIC_RAM_N_REGION(n) \ +static ssize_t create_dynamic_ram_##n##_region_show(struct device *dev, \ + struct device_attribute *attr, \ + char *buf) \ +{ \ + return __create_region_show(to_cxl_root_decoder(dev), buf); \ +} \ +static ssize_t create_dynamic_ram_##n##_region_store(struct device *dev, \ + struct device_attribute *attr, \ + const char *buf, size_t len) \ +{ \ + enum cxl_partition_mode mode = CXL_PARTITION_DC_MODE(0) + (n); \ + return create_region_store(dev, buf, len, mode); \ +} +CREATE_DYNAMIC_RAM_N_REGION(0); +CREATE_DYNAMIC_RAM_N_REGION(1); +CREATE_DYNAMIC_RAM_N_REGION(2); +CREATE_DYNAMIC_RAM_N_REGION(3); +CREATE_DYNAMIC_RAM_N_REGION(4); +CREATE_DYNAMIC_RAM_N_REGION(5); +CREATE_DYNAMIC_RAM_N_REGION(6); +CREATE_DYNAMIC_RAM_N_REGION(7); +DEVICE_ATTR_RW(create_dynamic_ram_0_region); +DEVICE_ATTR_RW(create_dynamic_ram_1_region); +DEVICE_ATTR_RW(create_dynamic_ram_2_region); +DEVICE_ATTR_RW(create_dynamic_ram_3_region); +DEVICE_ATTR_RW(create_dynamic_ram_4_region); +DEVICE_ATTR_RW(create_dynamic_ram_5_region); +DEVICE_ATTR_RW(create_dynamic_ram_6_region); +DEVICE_ATTR_RW(create_dynamic_ram_7_region); static ssize_t region_show(struct device *dev, struct device_attribute *attr, char *buf) @@ -3266,7 +3282,7 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr) struct device *dev; int rc; - if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A && + if (is_cxl_dc_partition_mode(cxlr->mode) && cxlr->params.interleave_ways != 1) { dev_err(&cxlr->dev, "Interleaving DC not supported\n"); return -EINVAL; @@ -3667,7 +3683,7 @@ static int cxl_region_probe(struct device *dev) return devm_cxl_add_pmem_region(cxlr); case CXL_PARTMODE_RAM: - case CXL_PARTMODE_DYNAMIC_RAM_A: + case CXL_PARTMODE_DYNAMIC_RAM_0...CXL_PARTMODE_DYNAMIC_RAM_7: rc = devm_cxl_region_edac_register(cxlr); if (rc) dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n", diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h 
index 3e400dd4f08b..80fb8d09172c 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -503,12 +503,26 @@ struct cxl_region_params { resource_size_t cache_size; }; +#define CXL_PARTITION_DC_MODE(n) CXL_PARTMODE_DYNAMIC_RAM_##n /* Modes should be in the implied DPA order */ enum cxl_partition_mode { CXL_PARTMODE_RAM, CXL_PARTMODE_PMEM, - CXL_PARTMODE_DYNAMIC_RAM_A, -}; + CXL_PARTITION_DC_MODE(0), + CXL_PARTITION_DC_MODE(1), + CXL_PARTITION_DC_MODE(2), + CXL_PARTITION_DC_MODE(3), + CXL_PARTITION_DC_MODE(4), + CXL_PARTITION_DC_MODE(5), + CXL_PARTITION_DC_MODE(6), + CXL_PARTITION_DC_MODE(7), + CXL_PARTITION_MODE_MAX, +}; + +static inline bool is_cxl_dc_partition_mode(enum cxl_partition_mode mode) +{ + return mode >= CXL_PARTITION_DC_MODE(0) && mode < CXL_PARTITION_MODE_MAX; +} /* * Indicate whether this region has been assembled by autodetection or diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 2bad68f13e21..e28cd6827c7d 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -106,7 +106,7 @@ int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, resource_size_t base, resource_size_t len, resource_size_t skipped); -#define CXL_NR_PARTITIONS_MAX 3 +#define CXL_NR_PARTITIONS_MAX 10 struct cxl_dpa_info { u64 size; @@ -456,6 +456,7 @@ struct cxl_dev_state { struct resource dpa_res; struct cxl_dpa_partition part[CXL_NR_PARTITIONS_MAX]; unsigned int nr_partitions; + unsigned int nr_dc_partitions; u64 serial; enum cxl_devtype type; struct cxl_mailbox cxl_mbox; @@ -954,7 +955,7 @@ struct cxl_dc_partition_info { }; int cxl_dev_dc_identify(struct cxl_mailbox *mbox, - struct cxl_dc_partition_info *dc_info); + struct cxl_dc_partition_info *dc_info, int *num_part); int cxl_await_media_ready(struct cxl_dev_state *cxlds); int cxl_enumerate_cmds(struct cxl_memdev_state *mds); int cxl_mem_dpa_fetch(struct cxl_memdev_state *mds, struct cxl_dpa_info *info); diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c index 15fc2de63185..fa6ada01b681 100644 --- 
a/drivers/dax/cxl.c +++ b/drivers/dax/cxl.c @@ -57,7 +57,7 @@ static int cxl_dax_region_probe(struct device *dev) nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start); flags = IORESOURCE_DAX_KMEM; - if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) + if (is_cxl_dc_partition_mode(cxlr->mode)) flags |= IORESOURCE_DAX_SPARSE_CAP; dax_region = alloc_dax_region(dev, cxlr->id, &cxlr_dax->hpa_range, nid, @@ -65,7 +65,7 @@ if (!dax_region) return -ENOMEM; - if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) { + if (is_cxl_dc_partition_mode(cxlr->mode)) { rc = cxlr_add_existing_extents(cxlr); /* If adding existing extents fails, continue with only an error * message ?? */ -- 2.51.0
* Re: [RFC PATCH 3/3] dcd: Add support for multiple DC regions 2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887 @ 2025-12-04 17:44 ` Ira Weiny 0 siblings, 0 replies; 19+ messages in thread From: Ira Weiny @ 2025-12-04 17:44 UTC (permalink / raw) To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl Cc: nifan.cxl, dongjoo.seo1, Fan Ni, Anisa Su anisa.su887@ wrote: > From: Fan Ni <fan.ni@samsung.com> > > With this change, we add the following support: > 1. Allow creating multiple DC regions (up to 8); > 2. Allow DC extents to belong to regions other than region 0; > 3. Modify sysfs entries to enable the above capabilities; > 4. Shareable attribute is added to dc region (partition); > > This series is tested with proper NDCTL fix, see: > https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support > [snip] > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c > index 6b976da4a70a..faa4656f9542 100644 > --- a/drivers/cxl/core/hdm.c > +++ b/drivers/cxl/core/hdm.c > @@ -463,8 +463,22 @@ static const char *cxl_mode_name(enum cxl_partition_mode mode) > return "ram"; > case CXL_PARTMODE_PMEM: > return "pmem"; > - case CXL_PARTMODE_DYNAMIC_RAM_A: > - return "dynamic_ram_a"; > + case CXL_PARTMODE_DYNAMIC_RAM_0: > + return "dynamic_ram_0"; If my v9 were to land and then this series lands, it would break users who developed against 'ram_a'. Either we need to change ram_a to ram_0 in the base series, or use ram_[b,c,...] etc.
[same comment throughout] > + case CXL_PARTMODE_DYNAMIC_RAM_1: > + return "dynamic_ram_1"; > + case CXL_PARTMODE_DYNAMIC_RAM_2: > + return "dynamic_ram_2"; > + case CXL_PARTMODE_DYNAMIC_RAM_3: > + return "dynamic_ram_3"; > + case CXL_PARTMODE_DYNAMIC_RAM_4: > + return "dynamic_ram_4"; > + case CXL_PARTMODE_DYNAMIC_RAM_5: > + return "dynamic_ram_5"; > + case CXL_PARTMODE_DYNAMIC_RAM_6: > + return "dynamic_ram_6"; > + case CXL_PARTMODE_DYNAMIC_RAM_7: > + return "dynamic_ram_7"; > default: > return ""; > }; [snip] > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > index 2bad68f13e21..e28cd6827c7d 100644 > --- a/drivers/cxl/cxlmem.h > +++ b/drivers/cxl/cxlmem.h > @@ -106,7 +106,7 @@ int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, > resource_size_t base, resource_size_t len, > resource_size_t skipped); > > -#define CXL_NR_PARTITIONS_MAX 3 > +#define CXL_NR_PARTITIONS_MAX 10 > > struct cxl_dpa_info { > u64 size; > @@ -456,6 +456,7 @@ struct cxl_dev_state { > struct resource dpa_res; > struct cxl_dpa_partition part[CXL_NR_PARTITIONS_MAX]; > unsigned int nr_partitions; > + unsigned int nr_dc_partitions; I think nr_partitions needs to include the dc count. And when it does, I don't think we need a separate dc count. After looking at this patch, I'm thinking that among the changes made after dropping partition support from v8 to v9, the merging of the volatile/persistent/dc partitions into a single range might have made this support easier? Is that why y'all did not try to use v8? I've really not looked into the details of whether this is really all that is needed to support more partitions. If this is all it takes, I think what we really need is a use case. Basically, keep this patch (with the name change I mention) until such time as v9 lands with a use case for single partitions. Then when multiple partitions come we can land this change. Patch 1/3 is a bug fix and needs to be in v9. 2/3 I don't quite understand yet, but it is a bug fix as well.
So if it is an issue it will need to go in with v9. Thanks for the work! Ira [snip]
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions 2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887 ` (2 preceding siblings ...) 2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887 @ 2025-12-03 21:19 ` Anisa Su 2025-12-04 17:28 ` Ira Weiny 4 siblings, 0 replies; 19+ messages in thread From: Anisa Su @ 2025-12-03 21:19 UTC (permalink / raw) To: linux-cxl Cc: dan.j.williams, ira.weiny, dave, linux-cxl, nifan.cxl, dongjoo.seo1 On Wed, Dec 03, 2025 at 08:29:10PM +0000, anisa.su887@gmail.com wrote: > From: Anisa Su <anisa.su@samsung.com> > > This patchset introduces support for multiple DC regions. It is rebased on top > of the latest branch published to Ira's repository: > https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23. > We hope it will be useful in the meantime for others and restart some > discussion around how to move DCD forward. > > The corresponding NDCTL support can be found on this branch: > https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support. > I will reply to this thread with a reference to the thread for the > NDCTL patches once published. 
> NDCTL thread: https://lore.kernel.org/linux-cxl/20251203211642.1104918-1-anisa.su887@gmail.com/T/#u > Testing: > This patchset was tested on a QEMU VM with the following topology: > > PCIE Root (pcie.0) > │ > ├─ CXL Fixed Memory Window cxl-fmw.0 > ├─ CXL Root Complex cxl.0 > │ └─ Root Port root_port1 > │ └─ CXL Type-3 Device cxl-dcd0 > │ > ├─ CXL Fixed Memory Window cxl-fmw.1 > ├─ CXL Root Complex cxl.1 > │ └─ Root Port root_port2 > │ └─ CXL Type-3 Device cxl-dcd1 > └─ > > "-object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/t3_cxl1.raw,size=8G \ > -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/t3_lsa1.raw,size=1M \ > -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/t3_cxl2.raw,size=8G \ > -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/t3_lsa2.raw,size=1M \ > -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.0,hdm_for_passthrough=true \ > -device pxb-cxl,bus_nr=48,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \ > -device cxl-rp,port=0,bus=cxl.0,id=root_port1,chassis=0,slot=1 \ > -device cxl-rp,port=1,bus=cxl.1,id=root_port2,chassis=1,slot=1 \ > -device cxl-type3,bus=root_port1,volatile-dc-memdev=cxl-mem1,id=cxl-dcd0,lsa=cxl-lsa1,num-dc-regions=8,sn=99 \ > -device cxl-type3,bus=root_port2,volatile-dc-memdev=cxl-mem2,id=cxl-dcd1,lsa=cxl-lsa2,num-dc-regions=8,sn=100 \ > -machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=8G,cxl-fmw.1.targets.0=cxl.1,cxl-fmw.1.size=8G" > > 2 CFMWs and 2 root complexes are emulated because QEMU creates > 4 decoders/topology level. With 1 root complex, there are only 4 upstream > decoders. Therefore in order to create 4+ regions, we need a total of > 8 upstream decoders. This does mean that we are only able to create > 4 regions on each device, although up to 8 are supported. 
> > Using `cxl list`, we can see mem0 and mem1 have dynamic_ram_* capabilities: > root@deb-101020-bm01:~# cxl list > [ > { > "memdevs":[ > { > "memdev":"mem0", > "dynamic_ram_0_size":1073741824, > "dynamic_ram_1_size":1073741824, > "dynamic_ram_2_size":1073741824, > "dynamic_ram_3_size":1073741824, > "dynamic_ram_4_size":1073741824, > "dynamic_ram_5_size":1073741824, > "dynamic_ram_6_size":1073741824, > "dynamic_ram_7_size":1073741824, > "serial":100, > "host":"0000:31:00.0", > "firmware_version":"BWFW VERSION 00" > }, > { > "memdev":"mem1", > "dynamic_ram_0_size":1073741824, > "dynamic_ram_1_size":1073741824, > "dynamic_ram_2_size":1073741824, > "dynamic_ram_3_size":1073741824, > "dynamic_ram_4_size":1073741824, > "dynamic_ram_5_size":1073741824, > "dynamic_ram_6_size":1073741824, > "dynamic_ram_7_size":1073741824, > "serial":99, > "host":"0000:0d:00.0", > "firmware_version":"BWFW VERSION 00" > } > ] > } > ] > > To create the 8 regions: > cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_0 > cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_1 > cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_2 > cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_3 > > cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_4 > cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_5 > cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_6 > cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_7 > > > We can verify the 8 regions: > root@deb-101020-bm01:~# cxl list > [ > { > "memdevs":[ > ...
> }, > { > "regions":[ > { > "region":"region0", > "resource":79993765888, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > }, > { > "region":"region6", > "resource":81067507712, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > }, > { > "region":"region7", > "resource":82141249536, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > }, > { > "region":"region8", > "resource":83214991360, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > }, > { > "region":"region1", > "resource":88315265024, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > }, > { > "region":"region2", > "resource":89389006848, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > }, > { > "region":"region3", > "resource":90462748672, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > }, > { > "region":"region4", > "resource":91536490496, > "size":1073741824, > "interleave_ways":1, > "interleave_granularity":256, > "decode_state":"commit" > } > ] > } > ] > > Extents of various sizes (128MB, 256MB, 512MB, and 1GB) are added from mem1, > which correspond to regions 0-3, then DAX devices are created from them. > The extent DPAs are as follows, which allows each one to map to a distinct > region: > - [0-128] --> region0 > - [1024-1280] --> region1 > - [2048-2560] --> region2 > - [3072-4096] --> region3 > > The correct sizes can be verified when creating the DAX device. 
> root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region0 > [ > { > "chardev":"dax0.1", > "size":134217728, > "target_node":1, > "align":2097152, > "mode":"devdax" > } > ] > created 1 device > root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region1 > [ > { > "chardev":"dax1.1", > "size":268435456, > "target_node":1, > "align":2097152, > "mode":"devdax" > } > ] > created 1 device > root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region2 > [ > { > "chardev":"dax2.1", > "size":536870912, > "target_node":1, > "align":2097152, > "mode":"devdax" > } > ] > created 1 device > root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region3 > [ > { > "chardev":"dax3.1", > "size":1073741824, > "target_node":1, > "align":2097152, > "mode":"devdax" > } > ] > created 1 device > > Then the DAX devices are reconfigured to system-ram mode and verified with lsmem. > root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax0.1 -m system-ram > [ > { > "chardev":"dax0.1", > "size":134217728, > "target_node":1, > "align":2097152, > "mode":"system-ram", > "online_memblocks":1, > "total_memblocks":1, > "movable":true > } > ] > reconfigured 1 device > root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax1.1 -m system-ram > ... > root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax2.1 -m system-ram > ... > root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax3.1 -m system-ram > ... 
> > > root@deb-101020-bm01:~/libcxlmi# lsmem > RANGE SIZE STATE REMOVABLE BLOCK > 0x0000000000000000-0x000000007fffffff 2G online yes 0-15 > 0x0000000100000000-0x000000027fffffff 6G online yes 32-79 > 0x00000012a0000000-0x00000012a7ffffff 128M online yes 596 > 0x00000012e0000000-0x00000012efffffff 256M online yes 604-605 > 0x0000001320000000-0x000000133fffffff 512M online yes 612-615 > 0x0000001360000000-0x000000139fffffff 1G online yes 620-627 > > Memory block size: 128M > Total online memory: 9.9G > Total offline memory: 0B > > ------------------------------------------------------------------------------- > Note: I did try hacking QEMU to create 8 decoders at each level to avoid having > 2 separate host bridges/DCDs by modifying include/hw/cxl/cxl_component.h like so: > > #define CXL_HDM_DECODER_COUNT 8 > HDM_DECODER_INIT(0); > HDM_DECODER_INIT(1); > HDM_DECODER_INIT(2); > HDM_DECODER_INIT(3); > HDM_DECODER_INIT(4); > HDM_DECODER_INIT(5); > HDM_DECODER_INIT(6); > HDM_DECODER_INIT(7); > > However, when attempting to create the 5th cxl region, > I ran into a timeout error when committing the decoders. > Did not spend much time pursuing this further, most likely > need to change more things on the QEMU side. > But the 8 decoders do show up correctly under sysfs. 
> > Fan Ni (3): > core/region: fix return logic for store_targetN > dax/cxl: add existing dc extents when probing dax region > dcd: Add support for multiple DC regions > > drivers/cxl/core/cdat.c | 2 +- > drivers/cxl/core/core.h | 9 +- > drivers/cxl/core/extent.c | 2 +- > drivers/cxl/core/hdm.c | 18 +++- > drivers/cxl/core/mbox.c | 39 +++++---- > drivers/cxl/core/memdev.c | 179 +++++++++++++++++++++++++------------- > drivers/cxl/core/port.c | 45 ++++++++-- > drivers/cxl/core/region.c | 65 ++++++++------ > drivers/cxl/cxl.h | 23 ++++- > drivers/cxl/cxlmem.h | 5 +- > drivers/dax/cxl.c | 28 ++---- > 11 files changed, 281 insertions(+), 134 deletions(-) > > -- > 2.51.0 >
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions 2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887 ` (3 preceding siblings ...) 2025-12-03 21:19 ` [RFC PATCH 0/3] Add Support for Multiple DC Regions Anisa Su @ 2025-12-04 17:28 ` Ira Weiny 2025-12-11 21:05 ` Anisa Su 4 siblings, 1 reply; 19+ messages in thread From: Ira Weiny @ 2025-12-04 17:28 UTC (permalink / raw) To: anisa.su887, dan.j.williams, ira.weiny, dave, linux-cxl Cc: nifan.cxl, dongjoo.seo1, Anisa Su anisa.su887@ wrote: > From: Anisa Su <anisa.su@samsung.com> > > This patchset introduces support for multiple DC regions. It is rebased on top > of the latest branch published to Ira's repository: > https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23. > We hope it will be useful in the meantime for others and restart some > discussion around how to move DCD forward. FWIW it seems patch 1/3 and this patch are both bug fixes to the DCD series I last posted. If so they should be tacked onto that series. So, you are more than welcome to take over DCD development. However, I had multiple DC partitions (Regions) supported in previous versions of that series and the community decided that there was no use case for such a device. Based on this submission it seems that my ripping out the multiple partitions was incorrect. > > The corresponding NDCTL support can be found on this branch: > https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support. > I will reply to this thread with a reference to the thread for the > NDCTL patches once published. > > Testing: > This patchset was tested on a QEMU VM with the following topology: Unfortunately none of the details presented in this cover letter really show why the kernel needs this additional complexity. Can you go into more details on the use cases of multiple partitions? Also, did you consider using previous versions of my series? Perhaps v8?
https://lore.kernel.org/all/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com/#r Thanks, Ira [snip] ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-04 17:28 ` Ira Weiny
@ 2025-12-11 21:05 ` Anisa Su
  2025-12-12 22:07   ` Ira Weiny
  2025-12-13  3:36   ` dan.j.williams
  0 siblings, 2 replies; 19+ messages in thread
From: Anisa Su @ 2025-12-11 21:05 UTC (permalink / raw)
To: Ira Weiny
Cc: anisa.su887, dan.j.williams, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Thu, Dec 04, 2025 at 11:28:40AM -0600, Ira Weiny wrote:
> anisa.su887@ wrote:
> > From: Anisa Su <anisa.su@samsung.com>
>
> [snip]
>
> FWIW it seems patch 1/3 and this patch are both bug fixes to the DCD
> series I last posted. If so, they should be tacked onto that series.
>
> So, you are more than welcome to take over DCD development.
>
> However, I had multiple DC partitions (Regions) supported in previous
> versions of that series, and the community decided that there was no use
> case for such a device. Based on this submission, it seems that ripping
> out the multiple partitions was incorrect.
>
> Unfortunately, none of the details presented in this cover letter really
> show why the kernel needs this additional complexity.
>
> Can you go into more detail on the use cases of multiple partitions?
>
From what I understand, the lack of a clear use case for DCD as a whole
has always been a blocker for the entire series. However, this year we've
seen multiple vendors demo memory pooling/sharing at SC'25 [1], as well
as the development of a controller from Montage that supports "memory
pooling and sharing across multiple hosts" [2].

The flexibility and control provided by multiple partitions is
an important capability of DCD for enabling composable memory
infrastructures. IMO, adding multi-partition support back in from v8 or
picking up this patchset would strengthen the series.

Let me know if this proposal sounds fair. Otherwise, I can
separate out the two patches that are bug fixes.

Also, apologies if these points have already been discussed. I've not been
following this series for very long, so forgive the ignorance as I try to
catch up. If you can think of any materials/documentation outside of the
mailing list or open collab sync notes that would help fill in the gaps,
please let me know :)

Thanks for the feedback,
Anisa

[1] https://computeexpresslink.org/event/supercomputing-2025/
[2] https://www.montage-tech.com/MXC

> Also, did you consider using previous versions of my series? Perhaps v8?
>
> https://lore.kernel.org/all/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com/#r
>
> Thanks,
> Ira
>
> [snip]
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-11 21:05 ` Anisa Su
@ 2025-12-12 22:07 ` Ira Weiny
  2026-01-12 22:23   ` Anisa Su
  2025-12-13  3:36 ` dan.j.williams
  1 sibling, 1 reply; 19+ messages in thread
From: Ira Weiny @ 2025-12-12 22:07 UTC (permalink / raw)
To: Anisa Su, Ira Weiny
Cc: anisa.su887, dan.j.williams, dave, linux-cxl, nifan.cxl, dongjoo.seo1

Anisa Su wrote:
> On Thu, Dec 04, 2025 at 11:28:40AM -0600, Ira Weiny wrote:
>
> [snip]
>
> > Can you go into more detail on the use cases of multiple partitions?
> >
> From what I understand, the lack of a clear use case for DCD as a whole
> has always been a blocker for the entire series. However, this year we've
> seen multiple vendors demo memory pooling/sharing at SC'25 [1], as well
> as the development of a controller from Montage that supports "memory
> pooling and sharing across multiple hosts" [2].

That is great! Do we know if they used the patches which have been
submitted? Do we know if the user interfaces were sufficient?

How will this memory be presented with the new DAX changes being proposed?

> The flexibility and control provided by multiple partitions is
> an important capability of DCD for enabling composable memory
> infrastructures. IMO, adding multi-partition support back in from v8 or
> picking up this patchset would strengthen the series.
>
> Let me know if this proposal sounds fair. Otherwise, I can
> separate out the two patches that are bug fixes.

After RC1 could you rebase the series and fold the bug fixes in?

Before we get to multiple DCD partitions, the interface for DAX devices
needs to be settled. In the last community call we were discussing a
special famfs dax type, I believe. Has any work been done on that?

For multi-partitions we need some review of the partition (region) names,
because you made a change which would be incompatible with the base
series. But it would be good to get single partitions landed first and
then multiple partitions as you have added.

> Also, apologies if these points have already been discussed. I've not been
> following this series for very long, so forgive the ignorance as I try to
> catch up. If you can think of any materials/documentation outside of the
> mailing list or open collab sync notes that would help fill in the gaps,
> please let me know :)

NP, this has been a while. I've been looking for someone to take over the
series who is more familiar with the use cases.

I look forward to you posting a new series with the support you feel you
need.

Ira

[snip]
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-12 22:07 ` Ira Weiny
@ 2026-01-12 22:23 ` Anisa Su
  2026-01-15 10:28   ` Alireza Sanaee
  0 siblings, 1 reply; 19+ messages in thread
From: Anisa Su @ 2026-01-12 22:23 UTC (permalink / raw)
To: Ira Weiny
Cc: Anisa Su, dan.j.williams, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Fri, Dec 12, 2025 at 04:07:32PM -0600, Ira Weiny wrote:
> Anisa Su wrote:
> > On Thu, Dec 04, 2025 at 11:28:40AM -0600, Ira Weiny wrote:
>
> [snip]
>
> That is great! Do we know if they used the patches which have been
> submitted? Do we know if the user interfaces were sufficient?

Sorry for the delay! While I don't know the details of the software
stack used in those demos, I think the root of the question "Do we know
if the user interfaces were sufficient" goes back to the missing use
case for DCD.

So then, can I ask: how can I demonstrate a reasonable use case?
Ex: bringing up Kubernetes pods using this patchset on real hw?
Or something else?
^ This is also a question for the community, so everyone please chime in :)

> How will this memory be presented with the new DAX changes being proposed?
>
From the call today, there seemed to be general agreement that the changes
proposed by Gregory's patches are a promising direction for DCD, because
they allow hotplug/unplug capabilities without needing to route everything
through the DAX subsystem.

I haven't looked at those patches yet, but from what was discussed today,
I plan to move forward based on them. Are those the changes you were
referring to? Or the special famfs dax type you mentioned below?

> After RC1 could you rebase the series and fold the bug fixes in?
>
Yep, working on rebasing the series now and will send RC2 with the bug
fixes.

> Before we get to multiple DCD partitions, the interface for DAX devices
> needs to be settled. In the last community call we were discussing a
> special famfs dax type, I believe. Has any work been done on that?
>
That makes sense to me; I missed that call, so I'm not familiar with the
famfs dax type, but as I mentioned above, it sounds like Gregory's patch
set is a good solution to this, so I'll explore how to integrate with
that first.

> For multi-partitions we need some review of the partition (region) names,
> because you made a change which would be incompatible with the base
> series. But it would be good to get single partitions landed first and
> then multiple partitions as you have added.
>
Sounds good. For RC2, I'll keep it simple and just rebase + bug fixes.

> NP, this has been a while. I've been looking for someone to take over the
> series who is more familiar with the use cases.
>
> I look forward to you posting a new series with the support you feel you
> need.
>
> Ira
>
Thanks Ira, I will definitely have to keep bothering you, though I'll
try to keep it to a minimum.

Thanks,
Anisa

[snip]
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-01-12 22:23 ` Anisa Su
@ 2026-01-15 10:28 ` Alireza Sanaee
  2026-02-11  1:44   ` Anisa Su
  0 siblings, 1 reply; 19+ messages in thread
From: Alireza Sanaee @ 2026-01-15 10:28 UTC (permalink / raw)
To: Anisa Su
Cc: Ira Weiny, dan.j.williams, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Mon, 12 Jan 2026 14:23:55 -0800
Anisa Su <anisa.su887@gmail.com> wrote:

Hi Anisa,

[snip]

> Sounds good. For RC2, I'll keep it simple and just rebase + bug fixes.

Thanks Anisa. I was also about to look into the rebasing as well.

[snip]
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-01-15 10:28 ` Alireza Sanaee
@ 2026-02-11  1:44 ` Anisa Su
  2026-02-11  9:34   ` Alireza Sanaee
  0 siblings, 1 reply; 19+ messages in thread
From: Anisa Su @ 2026-02-11 1:44 UTC (permalink / raw)
To: Alireza Sanaee
Cc: Anisa Su, Ira Weiny, dan.j.williams, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Thu, Jan 15, 2026 at 10:28:19AM +0000, Alireza Sanaee wrote:
> On Mon, 12 Jan 2026 14:23:55 -0800
> Anisa Su <anisa.su887@gmail.com> wrote:
>
> Hi Anisa,
>
> [snip]
> > Sounds good. For RC2, I'll keep it simple and just rebase + bug fixes.
> Thanks Anisa. I was also about to look into the rebasing as well.

Hey, sorry for the delay! I've rebased on cxl-next:
https://github.com/anisa-su993/anisa-linux-kernel/tree/dcd-v10-2026-02-09

I tested basic add/remove and region/dax device creation with QEMU, but
didn't spend too much time testing, as the bug fixes were quite small.

The bug fixes + who suggested them are tracked at the end of the commit
message :)

Thanks,
Anisa
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-02-11  1:44 ` Anisa Su
@ 2026-02-11  9:34 ` Alireza Sanaee
  0 siblings, 0 replies; 19+ messages in thread
From: Alireza Sanaee @ 2026-02-11 9:34 UTC (permalink / raw)
To: Anisa Su
Cc: Ira Weiny, dan.j.williams, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Tue, 10 Feb 2026 17:44:06 -0800
Anisa Su <anisa.su887@gmail.com> wrote:

> Hey, sorry for the delay! I've rebased on cxl-next:
> https://github.com/anisa-su993/anisa-linux-kernel/tree/dcd-v10-2026-02-09

Oh nice! Thanks Anisa.

> I tested basic add/remove and region/dax device creation with QEMU, but
> didn't spend too much time testing, as the bug fixes were quite small.
>
> The bug fixes + who suggested them are tracked at the end of the commit
> message :)
>
> Thanks,
> Anisa
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-11 21:05 ` Anisa Su
  2025-12-12 22:07   ` Ira Weiny
@ 2025-12-13  3:36 ` dan.j.williams
  2026-01-12 22:50   ` Anisa Su
  1 sibling, 1 reply; 19+ messages in thread
From: dan.j.williams @ 2025-12-13 3:36 UTC (permalink / raw)
To: Anisa Su, Ira Weiny
Cc: anisa.su887, dan.j.williams, dave, linux-cxl, nifan.cxl, dongjoo.seo1

Anisa Su wrote:
[..]
> > Unfortunately, none of the details presented in this cover letter really
> > show why the kernel needs this additional complexity.
> >
> > Can you go into more detail on the use cases of multiple partitions?
> >
> From what I understand, the lack of a clear use case for DCD as a whole
> has always been a blocker for the entire series. However, this year we've
> seen multiple vendors demo memory pooling/sharing at SC'25 [1], as well
> as the development of a controller from Montage that supports "memory
> pooling and sharing across multiple hosts" [2].
>
> The flexibility and control provided by multiple partitions is
> an important capability of DCD for enabling composable memory
> infrastructures. IMO, adding multi-partition support back in from v8 or
> picking up this patchset would strengthen the series.

Can you explain the use case for multiple partitions per-device?
Describe it in terms of what Linux loses if it never entertains this
aspect of the specification. It significantly complicates the ABI for a
benefit to Linux that I am unable to articulate.
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2025-12-13  3:36 ` dan.j.williams
@ 2026-01-12 22:50 ` Anisa Su
  2026-01-13  0:08   ` Gregory Price
  0 siblings, 1 reply; 19+ messages in thread
From: Anisa Su @ 2026-01-12 22:50 UTC (permalink / raw)
To: dan.j.williams
Cc: Anisa Su, Ira Weiny, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Sat, Dec 13, 2025 at 12:36:24PM +0900, dan.j.williams@intel.com wrote:
> Anisa Su wrote:
> [..]
> Can you explain the use case for multiple partitions per-device?
> Describe it in terms of what Linux loses if it never entertains this
> aspect of the specification. It significantly complicates the ABI for a
> benefit to Linux that I am unable to articulate.

Let me backtrack a bit here :( I was too hasty trying to push for
multiple partitions.

However, can I ask for some clarification on what a sufficient use case
is (with just a single partition)?

Then I'll try my best to demonstrate how this series can accomplish it
so I can help land the single partition stuff first. Does that sound fair
to you? Or am I misunderstanding the core problem?

Thanks,
Anisa
* Re: [RFC PATCH 0/3] Add Support for Multiple DC Regions
  2026-01-12 22:50 ` Anisa Su
@ 2026-01-13  0:08 ` Gregory Price
  0 siblings, 0 replies; 19+ messages in thread
From: Gregory Price @ 2026-01-13 0:08 UTC (permalink / raw)
To: Anisa Su
Cc: dan.j.williams, Ira Weiny, dave, linux-cxl, nifan.cxl, dongjoo.seo1

On Mon, Jan 12, 2026 at 02:50:01PM -0800, Anisa Su wrote:
> On Sat, Dec 13, 2025 at 12:36:24PM +0900, dan.j.williams@intel.com wrote:
> > Can you explain the use case for multiple partitions per-device?
> > Describe it in terms of what Linux loses if it never entertains this
> > aspect of the specification. It significantly complicates the ABI for a
> > benefit to Linux that I am unable to articulate.
>
> [snip]
>
> However, can I ask for some clarification on what a sufficient use case
> is (with just a single partition)?
>
> Then I'll try my best to demonstrate how this series can accomplish it
> so I can help land the single partition stuff first. Does that sound fair
> to you? Or am I misunderstanding the core problem?

regions != partitions

The discussion from today was regarding FAMFS wanting to balance
oversubscription and max bandwidth - which requires per-device regions
and an interleaved region; they cannot be managed with one region.

I can't think of a reason to need multiple *physical partitions* on the
device, when you can chop up a single partition with many regions.
Either way you need dedicated decoders for each region - regardless of
what partition it's on, so the extra partition doesn't get you much.

~Gregory
end of thread, other threads:[~2026-02-11  9:34 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-03 20:29 [RFC PATCH 0/3] Add Support for Multiple DC Regions anisa.su887
2025-12-03 20:29 ` [RFC PATCH 1/3] core/region: fix return logic for store_targetN anisa.su887
2025-12-04 17:04   ` Ira Weiny
2025-12-03 20:29 ` [RFC PATCH 2/3] dax/cxl: add existing dc extents when probing dax region anisa.su887
2025-12-03 21:03   ` Anisa Su
2025-12-04 17:29   ` Ira Weiny
2025-12-03 20:29 ` [RFC PATCH 3/3] dcd: Add support for multiple DC regions anisa.su887
2025-12-04 17:44   ` Ira Weiny
2025-12-03 21:19 ` [RFC PATCH 0/3] Add Support for Multiple DC Regions Anisa Su
2025-12-04 17:28 ` Ira Weiny
2025-12-11 21:05   ` Anisa Su
2025-12-12 22:07     ` Ira Weiny
2026-01-12 22:23       ` Anisa Su
2026-01-15 10:28         ` Alireza Sanaee
2026-02-11  1:44           ` Anisa Su
2026-02-11  9:34             ` Alireza Sanaee
2025-12-13  3:36     ` dan.j.williams
2026-01-12 22:50       ` Anisa Su
2026-01-13  0:08         ` Gregory Price