Linux CXL
* [RFC PATCH 0/3] Add Support for Multiple DC Regions
From: anisa.su887 @ 2025-12-03 20:29 UTC
  To: dan.j.williams, ira.weiny, dave, linux-cxl
  Cc: nifan.cxl, dongjoo.seo1, Anisa Su

From: Anisa Su <anisa.su@samsung.com>

This patchset introduces support for multiple DC regions. It is rebased on top
of the latest branch published to Ira's repository:
https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-09-23.
We hope it will be useful to others in the meantime and will restart
discussion around how to move DCD forward.

The corresponding NDCTL support can be found on this branch:
https://github.com/anisa-su993/anisa-ndctl/tree/multiple-dc-region-support.
I will reply to this thread with a reference to the thread for the
NDCTL patches once published.

Testing:
This patchset was tested on a QEMU VM with the following topology:

PCIE Root (pcie.0)
│
├─ CXL Fixed Memory Window cxl-fmw.0
├─ CXL Root Complex cxl.0
│  └─ Root Port root_port1
│     └─ CXL Type-3 Device cxl-dcd0
│
├─ CXL Fixed Memory Window cxl-fmw.1
├─ CXL Root Complex cxl.1
│  └─ Root Port root_port2
│     └─ CXL Type-3 Device cxl-dcd1
└─

"-object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/t3_cxl1.raw,size=8G \
-object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/t3_lsa1.raw,size=1M \
-object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/t3_cxl2.raw,size=8G \
-object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/t3_lsa2.raw,size=1M \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.0,hdm_for_passthrough=true \
-device pxb-cxl,bus_nr=48,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
-device cxl-rp,port=0,bus=cxl.0,id=root_port1,chassis=0,slot=1 \
-device cxl-rp,port=1,bus=cxl.1,id=root_port2,chassis=1,slot=1 \
-device cxl-type3,bus=root_port1,volatile-dc-memdev=cxl-mem1,id=cxl-dcd0,lsa=cxl-lsa1,num-dc-regions=8,sn=99 \
-device cxl-type3,bus=root_port2,volatile-dc-memdev=cxl-mem2,id=cxl-dcd1,lsa=cxl-lsa2,num-dc-regions=8,sn=100 \
-machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=8G,cxl-fmw.1.targets.0=cxl.1,cxl-fmw.1.size=8G"

Two CFMWs and two root complexes are emulated because QEMU creates
4 decoders per topology level. With one root complex there are only 4 upstream
decoders, so creating more than 4 regions requires a second root complex, for
a total of 8 upstream decoders. This does mean that we can only create
4 regions on each device, although each device supports up to 8.
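
The decoder arithmetic above can be sketched as follows. This is only an
illustration, assuming each region consumes one decoder per topology level and
each root complex contributes the 4 decoders QEMU emulates; the function names
are hypothetical and not part of the patchset.

```python
import math

DECODERS_PER_LEVEL = 4  # QEMU's per-level decoder count, as stated above


def host_bridges_needed(num_regions, decoders_per_level=DECODERS_PER_LEVEL):
    """Root complexes required if each region consumes one decoder per level."""
    return math.ceil(num_regions / decoders_per_level)


def max_regions(num_host_bridges, decoders_per_level=DECODERS_PER_LEVEL):
    """Upper bound on regions for a given number of root complexes."""
    return num_host_bridges * decoders_per_level


print(host_bridges_needed(8))  # root complexes needed for 8 regions
print(max_regions(2))          # regions possible with 2 root complexes
```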

Using `cxl list`, we can see mem0 and mem1 have dynamic_ram_* capabilities:
root@deb-101020-bm01:~# cxl list
[
  {
    "memdevs":[
      {
        "memdev":"mem0",
        "dynamic_ram_0_size":1073741824,
        "dynamic_ram_1_size":1073741824,
        "dynamic_ram_2_size":1073741824,
        "dynamic_ram_3_size":1073741824,
        "dynamic_ram_4_size":1073741824,
        "dynamic_ram_5_size":1073741824,
        "dynamic_ram_6_size":1073741824,
        "dynamic_ram_7_size":1073741824,
        "serial":100,
        "host":"0000:31:00.0",
        "firmware_version":"BWFW VERSION 00"
      },
      {
        "memdev":"mem1",
        "dynamic_ram_0_size":1073741824,
        "dynamic_ram_1_size":1073741824,
        "dynamic_ram_2_size":1073741824,
        "dynamic_ram_3_size":1073741824,
        "dynamic_ram_4_size":1073741824,
        "dynamic_ram_5_size":1073741824,
        "dynamic_ram_6_size":1073741824,
        "dynamic_ram_7_size":1073741824,
        "serial":99,
        "host":"0000:0d:00.0",
        "firmware_version":"BWFW VERSION 00"
      }
    ]
  }
]
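
For scripting, the dynamic_ram_N_size fields can be pulled out of the JSON
with a few lines of Python. This is a rough sketch, assuming the output keeps
the shape shown above (a list of groups, each with a "memdevs" array); the
sample is abbreviated to two partitions per memdev and the helper name is
hypothetical.

```python
import json
import re

# Abbreviated copy of the `cxl list` output above (two partitions per memdev
# kept for brevity).
SAMPLE = '''
[{"memdevs": [
  {"memdev": "mem0", "dynamic_ram_0_size": 1073741824,
   "dynamic_ram_1_size": 1073741824, "serial": 100},
  {"memdev": "mem1", "dynamic_ram_0_size": 1073741824,
   "dynamic_ram_1_size": 1073741824, "serial": 99}
]}]
'''

DC_KEY = re.compile(r"^dynamic_ram_(\d+)_size$")


def dc_partitions(cxl_list_json):
    """Map memdev name -> {partition index: size in bytes}."""
    result = {}
    for group in json.loads(cxl_list_json):
        for dev in group.get("memdevs", []):
            parts = {}
            for key, value in dev.items():
                match = DC_KEY.match(key)
                if match:
                    parts[int(match.group(1))] = value
            result[dev["memdev"]] = parts
    return result


for name, parts in dc_partitions(SAMPLE).items():
    total_gib = sum(parts.values()) / 2**30
    print(f"{name}: {len(parts)} DC partitions, {total_gib:.0f} GiB total")
```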

To create the 8 regions:
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_0
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_1
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_2
cxl create-region -m -d decoder0.0 -w 1 -s 1G mem1 -t dynamic_ram_3

cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_4
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_5
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_6
cxl create-region -m -d decoder0.1 -w 1 -s 1G mem0 -t dynamic_ram_7
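
The eight invocations above follow a regular pattern, so they can also be
generated. The sketch below only builds the command strings (it does not run
them) and copies the flags verbatim from the commands above; the helper name
is hypothetical.

```python
def create_region_commands(decoder, memdev, partitions, size="1G"):
    """Build one `cxl create-region` command string per DC partition index."""
    return [
        f"cxl create-region -m -d {decoder} -w 1 -s {size} {memdev} "
        f"-t dynamic_ram_{p}"
        for p in partitions
    ]


# Partitions 0-3 on mem1 via decoder0.0, partitions 4-7 on mem0 via decoder0.1,
# mirroring the split used above.
commands = (
    create_region_commands("decoder0.0", "mem1", range(0, 4))
    + create_region_commands("decoder0.1", "mem0", range(4, 8))
)

for cmd in commands:
    print(cmd)
```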


We can verify the 8 regions:
root@deb-101020-bm01:~# cxl list
[
  {
    "memdevs":[
...
  },
  {
    "regions":[
      {
        "region":"region0",
        "resource":79993765888,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region6",
        "resource":81067507712,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region7",
        "resource":82141249536,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region8",
        "resource":83214991360,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region1",
        "resource":88315265024,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region2",
        "resource":89389006848,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region3",
        "resource":90462748672,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      },
      {
        "region":"region4",
        "resource":91536490496,
        "size":1073741824,
        "interleave_ways":1,
        "interleave_granularity":256,
        "decode_state":"commit"
      }
    ]
  }
]
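
As a sanity check on the listing above, the region resources can be grouped
into contiguous runs. The sketch below uses the resource and size values
copied from the output; the expectation that the regions fall into two runs of
four (one per root decoder) is my reading of the setup, not something the
tooling reports.

```python
GIB = 2**30

# (region name, resource, size) copied from the `cxl list` output above.
REGIONS = [
    ("region0", 79993765888, 1073741824),
    ("region6", 81067507712, 1073741824),
    ("region7", 82141249536, 1073741824),
    ("region8", 83214991360, 1073741824),
    ("region1", 88315265024, 1073741824),
    ("region2", 89389006848, 1073741824),
    ("region3", 90462748672, 1073741824),
    ("region4", 91536490496, 1073741824),
]


def contiguous_runs(regions):
    """Group regions into runs where each starts where the previous one ends."""
    runs = []
    for name, start, size in sorted(regions, key=lambda r: r[1]):
        if runs and runs[-1]["end"] == start:
            runs[-1]["names"].append(name)
            runs[-1]["end"] = start + size
        else:
            runs.append({"start": start, "end": start + size, "names": [name]})
    return runs


for run in contiguous_runs(REGIONS):
    span_gib = (run["end"] - run["start"]) // GIB
    print(f"{run['start']:#x}: {span_gib} GiB -> {', '.join(run['names'])}")
```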

Extents of various sizes (128MB, 256MB, 512MB, and 1GB) are added on mem1,
which backs regions 0-3, and DAX devices are then created from them.
The extent DPA ranges (in MB) are chosen so that each extent maps to a
distinct region:
 - [0-128] --> region0
 - [1024-1280] --> region1
 - [2048-2560] --> region2
 - [3072-4096] --> region3
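
The mapping in the list above can be reproduced with integer division. This is
a sketch under two assumptions that are not stated explicitly in this mail: the
ranges are in MB, and the device exposes equal 1 GiB DC regions laid out
back to back from DPA 0 (consistent with the 8 x 1G partitions reported by
`cxl list`). The helper name is hypothetical.

```python
MIB = 2**20
DC_REGION_SIZE = 1024 * MIB  # assumption: equal 1 GiB DC regions from DPA 0


def dc_region_for_extent(start_mib, end_mib, region_size=DC_REGION_SIZE):
    """Return the DC region index containing the extent [start, end) in MiB."""
    start, end = start_mib * MIB, end_mib * MIB
    first = start // region_size
    last = (end - 1) // region_size
    if first != last:
        raise ValueError("extent crosses a DC region boundary")
    return first


# Extent DPA ranges from the list above.
EXTENTS = [(0, 128), (1024, 1280), (2048, 2560), (3072, 4096)]

for start, end in EXTENTS:
    idx = dc_region_for_extent(start, end)
    print(f"[{start}-{end}] MiB ({end - start} MiB) -> DC region {idx}")
```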

The correct sizes can be verified when creating the DAX devices:
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region0
[
  {
    "chardev":"dax0.1",
    "size":134217728,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region1
[
  {
    "chardev":"dax1.1",
    "size":268435456,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region2
[
  {
    "chardev":"dax2.1",
    "size":536870912,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl create-device -r region3
[
  {
    "chardev":"dax3.1",
    "size":1073741824,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
created 1 device
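
The byte sizes reported by daxctl can be compared against the extent sizes
with a small check. The values below are copied from the output above; the
dictionary and function names are only for illustration.

```python
MIB = 2**20
ALIGN = 2097152  # "align" reported by daxctl above (2 MiB)

# Extent size in MiB expected for each DAX device, per the extent list above.
EXPECTED_MIB = {"dax0.1": 128, "dax1.1": 256, "dax2.1": 512, "dax3.1": 1024}

# "size" values copied from the daxctl create-device output above.
REPORTED = {
    "dax0.1": 134217728,
    "dax1.1": 268435456,
    "dax2.1": 536870912,
    "dax3.1": 1073741824,
}


def size_mismatches(expected_mib, reported_bytes):
    """Return device names whose reported size differs from the expectation."""
    return [
        dev for dev, mib in expected_mib.items()
        if reported_bytes.get(dev) != mib * MIB
    ]


print("mismatches:", size_mismatches(EXPECTED_MIB, REPORTED))
print("all aligned:", all(size % ALIGN == 0 for size in REPORTED.values()))
```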

Then the DAX devices are reconfigured to system-ram mode and verified with lsmem.
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax0.1 -m system-ram
[
  {
    "chardev":"dax0.1",
    "size":134217728,
    "target_node":1,
    "align":2097152,
    "mode":"system-ram",
    "online_memblocks":1,
    "total_memblocks":1,
    "movable":true
  }
]
reconfigured 1 device
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax1.1 -m system-ram
...
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax2.1 -m system-ram
...
root@deb-101020-bm01:~/libcxlmi# daxctl reconfigure-device dax3.1 -m system-ram
...


root@deb-101020-bm01:~/libcxlmi# lsmem
RANGE                                  SIZE  STATE REMOVABLE   BLOCK
0x0000000000000000-0x000000007fffffff    2G online       yes    0-15
0x0000000100000000-0x000000027fffffff    6G online       yes   32-79
0x00000012a0000000-0x00000012a7ffffff  128M online       yes     596
0x00000012e0000000-0x00000012efffffff  256M online       yes 604-605
0x0000001320000000-0x000000133fffffff  512M online       yes 612-615
0x0000001360000000-0x000000139fffffff    1G online       yes 620-627

Memory block size:                128M
Total online memory:              9.9G
Total offline memory:               0B
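
The SIZE and BLOCK columns for the four hot-added ranges follow directly from
the address ranges and the 128M memory block size. A short sketch of that
arithmetic, using the values from the lsmem output above (the function name is
hypothetical):

```python
BLOCK_SIZE = 128 * 2**20  # "Memory block size: 128M" from the lsmem output

# (start, end) of the four hot-added ranges in the lsmem output above.
RANGES = [
    (0x00000012A0000000, 0x00000012A7FFFFFF),
    (0x00000012E0000000, 0x00000012EFFFFFFF),
    (0x0000001320000000, 0x000000133FFFFFFF),
    (0x0000001360000000, 0x000000139FFFFFFF),
]


def describe(start, end, block_size=BLOCK_SIZE):
    """Return (size in bytes, first block index, last block index)."""
    size = end - start + 1
    return size, start // block_size, end // block_size


for start, end in RANGES:
    size, first, last = describe(start, end)
    print(f"{start:#x}: {size // 2**20} MiB, blocks {first}-{last}")
```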

-------------------------------------------------------------------------------
Note: I did try hacking QEMU to create 8 decoders at each level to avoid having
2 separate host bridges/DCDs by modifying include/hw/cxl/cxl_component.h like so:

#define CXL_HDM_DECODER_COUNT 8
HDM_DECODER_INIT(0);
HDM_DECODER_INIT(1);
HDM_DECODER_INIT(2);
HDM_DECODER_INIT(3);
HDM_DECODER_INIT(4);
HDM_DECODER_INIT(5);
HDM_DECODER_INIT(6);
HDM_DECODER_INIT(7);

However, when attempting to create the 5th CXL region,
I ran into a timeout error when committing the decoders.
I did not spend much time pursuing this further; more changes
are most likely needed on the QEMU side.
The 8 decoders do show up correctly under sysfs, though.

Fan Ni (3):
  core/region: fix return logic for store_targetN
  dax/cxl: add existing dc extents when probing dax region
  dcd: Add support for multiple DC regions

 drivers/cxl/core/cdat.c   |   2 +-
 drivers/cxl/core/core.h   |   9 +-
 drivers/cxl/core/extent.c |   2 +-
 drivers/cxl/core/hdm.c    |  18 +++-
 drivers/cxl/core/mbox.c   |  39 +++++----
 drivers/cxl/core/memdev.c | 179 +++++++++++++++++++++++++-------------
 drivers/cxl/core/port.c   |  45 ++++++++--
 drivers/cxl/core/region.c |  65 ++++++++------
 drivers/cxl/cxl.h         |  23 ++++-
 drivers/cxl/cxlmem.h      |   5 +-
 drivers/dax/cxl.c         |  28 ++----
 11 files changed, 281 insertions(+), 134 deletions(-)

-- 
2.51.0

