public inbox for linux-cxl@vger.kernel.org
* [PATCH 0/6] cxl: Support Back-Invalidate
@ 2026-03-15 20:27 Davidlohr Bueso
  2026-03-15 20:27 ` [PATCH 1/6] cxl: Add Back-Invalidate register definitions and structures Davidlohr Bueso
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Davidlohr Bueso @ 2026-03-15 20:27 UTC (permalink / raw)
  To: dave.jiang, dan.j.williams
  Cc: jonathan.cameron, alison.schofield, ira.weiny, gourry,
	dongjoo.seo1, anisa.su, linux-cxl, Davidlohr Bueso

Hello,

Changes from rfc (https://lore.kernel.org/all/20250812010228.2589787-1-dave@stgolabs.net/):
  o Dropped rfc status and split the series into smaller patches.
  o Renamed from toggle to ctrl + cleanup cxl_bi_ctrl_dport() by using a
    switch statement + better variable declarations. (Jonathan)
  o Introduce proper definitions for HDM supported coherence models. (Jonathan)
  o Fixed a regs remapping (unbind+bind) issue.
  o Added rollback logic if enabling BI on a component fails during the
    hierarchy walk.
  o Reworked BI regmaps.
  o BI-ID deallocation via delete_endpoint() instead of in cxl_detach_ep().

The following is the initial plumbing for enabling HDM-DB in Linux. This model
allows type2 and type3 devices to expose their local memory to the host CPU
in a coherent manner. In alignment with what was discussed at the 2024 LPC type2
support session, this series takes the type3 memory expander approach, which
is more direct. Further, afaik there is no type2+BI hardware out there.

A flagship use case of type3 + BI is coherent shared memory, and there is
currently a big gap in this regard (ie: GFAM). Another one is P2P via PCIe UIO,
which is also lacking today. The Media Operation (4402h) command for ranged
sanitize/zero also triggers snoops, but support for that command is missing as
well in the driver (albeit this might be the easiest entry point). As such this
series focuses on BI enablement in terms of discovery and configuration.

This adds two interfaces around cxlds to allow the setup and deallocation of
BI-IDs. The idea is for type3 memdevs and future type2 devices to make use of
cxlds->bi when committing HDM decoders, such that different device coherence
models can be differentiated as:

	type2 hdm-d:  cxlds->type == CXL_DEVTYPE_DEVMEM && cxlds->bi == false
	type2 hdm-db: cxlds->type == CXL_DEVTYPE_DEVMEM && cxlds->bi == true
	type3 hdm-h:  cxlds->type == CXL_DEVTYPE_CLASSMEM && cxlds->bi == false
	type3 hdm-db: cxlds->type == CXL_DEVTYPE_CLASSMEM && cxlds->bi == true
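
For illustration only, the table above can be written as a small userspace
sketch. The CXL_DEVTYPE_* names come from the table; the trimmed-down struct
and the cxl_coherence_name() helper are hypothetical and are not part of this
series.

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel's device type enum. */
enum cxl_devtype {
	CXL_DEVTYPE_DEVMEM,	/* type2 */
	CXL_DEVTYPE_CLASSMEM,	/* type3 */
};

/* Hypothetical, trimmed-down view of the cxlds fields used above. */
struct cxlds_sketch {
	enum cxl_devtype type;
	bool bi;
};

/* Hypothetical helper: name the coherence model per the table above. */
static const char *cxl_coherence_name(const struct cxlds_sketch *cxlds)
{
	if (cxlds->bi)
		return "hdm-db";	/* device coherent with Back-Invalidate */

	/* Without BI: type2 is hdm-d, type3 is host-only (hdm-h). */
	return cxlds->type == CXL_DEVTYPE_DEVMEM ? "hdm-d" : "hdm-h";
}
```

Only the ->bi flag separates hdm-db from the legacy model of each device type,
which is why the flag has to be settled before the HDM decoder is committed.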

Because ->bi becoming true does not depend on decoders being auto-committed
during HDM decoder enumeration (port driver), that case is treated as
unsupported for now: initializing an HDM decoder that already has its BI bit
set will error out.

o Patch 1 adds BI Decoder and BI Route Table register definitions, capability IDs,
  and structures.
  
o Patch 2 probes BI capabilities during register discovery and maps BI Decoder
  registers on dports and endpoints, and BI Route Table registers on upstream
  switch ports.

o Patch 3 implements the BI-ID allocation and deallocation API. Both do the
  required hierarchy iteration (bottom-up) ensuring it is safe to enable/disable
  BI for every component.
  
o Patch 4 wires BI setup into cxl_mem_probe and teardown.

o Patch 5 introduces cxl_dpa_set_coherence() to switch between host-only and
  device coherent modes, and programs the BI control bit at commit time.

o Patch 6 adds sysfs interfaces for creating BI-enabled regions for memory
  devices.
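
As a rough picture of the bottom-up walk with rollback mentioned for patch 3
and in the changelog, here is a minimal userspace sketch. Everything in it
(struct bi_component, bi_ctrl(), bi_setup(), the bi_ok flag) is hypothetical
and much simpler than the real code, which also has to program BI-IDs and
wait for commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical component on the path from the endpoint up to the root. */
struct bi_component {
	struct bi_component *parent;	/* NULL at the top of the hierarchy */
	bool bi_ok;			/* safe to enable BI on this component */
	bool bi_enabled;
};

/* Hypothetical per-component control; refuses to enable when not safe. */
static int bi_ctrl(struct bi_component *c, bool enable)
{
	if (enable && !c->bi_ok)
		return -22;	/* -EINVAL */
	c->bi_enabled = enable;
	return 0;
}

/* Walk bottom-up enabling BI; on failure, undo what was already enabled. */
static int bi_setup(struct bi_component *ep)
{
	struct bi_component *c, *undo;
	int rc;

	for (c = ep; c; c = c->parent) {
		rc = bi_ctrl(c, true);
		if (rc) {
			/* Roll back every component below the failing one. */
			for (undo = ep; undo != c; undo = undo->parent)
				bi_ctrl(undo, false);
			return rc;
		}
	}
	return 0;
}
```

The point of the rollback is that a failure partway up the hierarchy leaves
no component with BI half-enabled.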

Testing has been done with the qemu HDM-DB type3 counterpart (now merged).

1. HDM Decoder with BI through ad-hoc region creation.
------------------------------------------------------
# cxl list -D
[
  {
    "decoder":"decoder0.0",
    "resource":4563402752,
    "size":10737418240,
    "interleave_ways":1,
    "max_available_extent":10737418240,
    "pmem_capable":true,
    "volatile_capable":true,
    "accelmem_capable":true,
    "nr_targets":1
  }
]

Program the endpoint decoder
# echo ram > /sys/bus/cxl/devices/decoder2.0/mode
# echo 1 > /sys/bus/cxl/devices/decoder2.0/bi
# echo 0x40000000 > /sys/bus/cxl/devices/decoder2.0/dpa_size

Create a region in the root decoder
# echo region0 > /sys/bus/cxl/devices/decoder0.0/create_ram_bi_region
[   22.730899] cxl_core:cxl_region_can_probe:4214: cxl_region region0: config state: 0
[   22.732655] cxl_core:cxl_bus_probe:2334: cxl_region region0: probe: -6
[   22.734200] cxl_core:devm_cxl_add_region:2636: cxl_acpi ACPI0017:00: decoder0.0: created region0


Configure the region with the same IG and IW as the root and endpoint decoders
# echo 256 > /sys/bus/cxl/devices/region0/interleave_granularity
# echo 1 > /sys/bus/cxl/devices/region0/interleave_ways
# echo 0x40000000 > /sys/bus/cxl/devices/region0/size

Link the endpoint decoder as a target in the region
# echo decoder2.0 > /sys/bus/cxl/devices/region0/target0
[   99.819398] cxl_core:cxl_port_attach_region:1244: cxl region0: mem0:endpoint2 decoder2.0 add: mem0:decoder2.0 @ 0 n1
[   99.821802] cxl_core:cxl_port_attach_region:1244: cxl region0: pci0000:0c:port1 decoder1.0 add: mem0:decoder2.0 @ 01
[   99.823965] cxl_core:cxl_port_setup_targets:1562: cxl region0: pci0000:0c:port1 iw: 1 ig: 256
[   99.825153] cxl_core:cxl_port_setup_targets:1587: cxl region0: pci0000:0c:port1 target[0] = 0000:0c:01.0 for mem0:d0
[   99.826715] cxl_core:cxl_calc_interleave_pos:1956: cxl_mem mem0: decoder:decoder2.0 parent:0000:0e:00.0 port:endpoi0
[   99.828491] cxl_core:cxl_region_attach:2157: cxl decoder2.0: Test cxl_calc_interleave_pos(): success test_pos:0 cxl0

Commit the changes
# echo 1 > /sys/bus/cxl/devices/region0/commit
[  134.556073] cxl region0: Bypassing cpu_cache_invalidate_memregion() for testing!
non interleaved decoder 120000000 40000000 0
non interleaved decoder 120000000 40000000 0

# cxl list -D (tooling still needs to be updated to show decoder target type)
[
  {
    "root decoders":[
      {
        "decoder":"decoder0.0",
        "resource":4563402752,
        "size":10737418240,
        "interleave_ways":1,
        "max_available_extent":9395240960,
        "pmem_capable":true,
        "volatile_capable":true,
        "accelmem_capable":true,
        "nr_targets":1
      }
    ]
  },
  {
    "port decoders":[
      {
        "decoder":"decoder1.0",
        "resource":4831838208,
        "size":1073741824,
        "interleave_ways":1,
        "region":"region0",
        "nr_targets":1
      }
    ]
  },
  {
    "endpoint decoders":[
      {
        "decoder":"decoder2.0",
        "resource":4831838208,
        "size":1073741824,
        "interleave_ways":1,
        "region":"region0",
        "dpa_resource":0,
        "dpa_size":1073741824,
        "mode":"ram"
      }
    ]
  }
]
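
The max_available_extent value in the listing can be cross-checked by hand.
The sketch below assumes the attribute reports the largest contiguous free
range in the root decoder window; struct extent and max_free_extent() are
hypothetical names used only for this check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range: start address plus length in bytes. */
struct extent {
	uint64_t start;
	uint64_t size;
};

/*
 * Largest free gap inside @win, given @n allocated regions that are
 * sorted by start address and do not overlap.
 */
static uint64_t max_free_extent(struct extent win,
				const struct extent *used, size_t n)
{
	uint64_t cursor = win.start;
	uint64_t end = win.start + win.size;
	uint64_t best = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Gap between the previous allocation and this one. */
		if (used[i].start - cursor > best)
			best = used[i].start - cursor;
		cursor = used[i].start + used[i].size;
	}

	/* Tail gap between the last allocation and the window end. */
	if (end - cursor > best)
		best = end - cursor;

	return best;
}
```

With the window at 4563402752 (size 10737418240) and region0 at 4831838208
(size 1073741824), the free gaps are 256M below the region and 9395240960
bytes above it, which matches the reported value.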


2. On a type3 HDM-DB volatile device, create one BI and one regular region.
--------------------------------------------------------------------------
# echo ram > /sys/bus/cxl/devices/decoder2.0/mode
# echo 1 > /sys/bus/cxl/devices/decoder2.0/bi
# echo 0x20000000 > /sys/bus/cxl/devices/decoder2.0/dpa_size
# echo region0 > /sys/bus/cxl/devices/decoder0.0/create_ram_bi_region
# echo 256 > /sys/bus/cxl/devices/region0/interleave_granularity
# echo 1 > /sys/bus/cxl/devices/region0/interleave_ways
# echo 0x20000000 > /sys/bus/cxl/devices/region0/size
# echo decoder2.0 > /sys/bus/cxl/devices/region0/target0
# echo 1 > /sys/bus/cxl/devices/region0/commit

# echo ram > /sys/bus/cxl/devices/decoder2.1/mode
# echo 0 > /sys/bus/cxl/devices/decoder2.1/bi (this is already the default)
# echo 0x20000000 > /sys/bus/cxl/devices/decoder2.1/dpa_size
# echo region1 > /sys/bus/cxl/devices/decoder0.0/create_ram_region
# echo 256 > /sys/bus/cxl/devices/region1/interleave_granularity
# echo 1 > /sys/bus/cxl/devices/region1/interleave_ways
# echo 0x20000000 > /sys/bus/cxl/devices/region1/size
# echo decoder2.1 > /sys/bus/cxl/devices/region1/target0
# echo 1 > /sys/bus/cxl/devices/region1/commit

# cxl list -D
[
  {
    "root decoders":[
      {
        "decoder":"decoder0.0",
        "resource":4563402752,
        "size":10737418240,
        "interleave_ways":1,
        "max_available_extent":9395240960,
        "pmem_capable":true,
        "volatile_capable":true,
        "accelmem_capable":true,
        "nr_targets":1
      }
    ]
  },
  {
    "port decoders":[
      {
        "decoder":"decoder1.0",
        "resource":4831838208,
        "size":536870912,
        "interleave_ways":1,
        "region":"region0",
        "nr_targets":1
      },
      {
        "decoder":"decoder1.1",
        "resource":5368709120,
        "size":536870912,
        "interleave_ways":1,
        "region":"region1",
        "nr_targets":1
      }
    ]
  },
  {
    "endpoint decoders":[
      {
        "decoder":"decoder2.0",
        "resource":4831838208,
        "size":536870912,
        "interleave_ways":1,
        "region":"region0",
        "dpa_resource":0,
        "dpa_size":536870912,
        "mode":"ram"
      },
      {
        "decoder":"decoder2.1",
        "resource":5368709120,
        "size":536870912,
        "interleave_ways":1,
        "region":"region1",
        "dpa_resource":536870912,
        "dpa_size":536870912,
        "mode":"ram"
      }
    ]
  }
]

3. Unbind + Bind
----------------
[    0.799944] cxl_core:cxl_bi_ctrl_endpoint:1123: cxl_pci 0000:0d:00.0: device capable of issuing BI requests

# echo mem0 > /sys/bus/cxl/drivers/cxl_mem/unbind
[  108.682475] cxl_core:cxl_bi_ctrl_endpoint:1123: cxl_pci 0000:0d:00.0: device incapable of issuing BI requests
[  108.685569] cxl_core:cxl_detach_ep:1600: cxl_mem mem0: disconnect mem0 from port1

# echo mem0 > /sys/bus/cxl/drivers/cxl_mem/bind
[  152.828550] cxl_core:devm_cxl_enumerate_ports:1906: cxl_mem mem0: scan: iter: mem0 dport_dev: 0000:0c:00.0 parent: c
[  152.833431] cxl_core:devm_cxl_enumerate_ports:1912: cxl_mem mem0: found already registered port port1:pci0000:0c
[  152.836573] cxl_core:cxl_port_alloc:802: cxl_mem mem0: host-bridge: pci0000:0c
[  152.838898] cxl_core:cxl_cdat_get_length:486: cxl_port endpoint2: CDAT length 160
[  152.846065] cxl_core:cxl_port_perf_data_calculate:207: cxl_port endpoint2: Failed to retrieve ep perf coordinates.
[  152.848011] cxl_core:cxl_endpoint_parse_cdat:423: cxl_port endpoint2: Failed to do perf coord calculations.
[  152.870738] cxl_core:init_hdm_decoder:1139: cxl_port endpoint2: decoder2.0: range: 0x0-0xffffffffffffffff iw: 1 ig:6
[  152.876618] cxl_core:add_hdm_decoder:39: cxl_mem mem0: decoder2.0 added to endpoint2
[  152.880951] cxl_core:init_hdm_decoder:1139: cxl_port endpoint2: decoder2.1: range: 0x0-0xffffffffffffffff iw: 1 ig:6
[  152.883983] cxl_core:add_hdm_decoder:39: cxl_mem mem0: decoder2.1 added to endpoint2
[  152.885263] cxl_core:init_hdm_decoder:1139: cxl_port endpoint2: decoder2.2: range: 0x0-0xffffffffffffffff iw: 1 ig:6
[  152.887181] cxl_core:add_hdm_decoder:39: cxl_mem mem0: decoder2.2 added to endpoint2
[  152.888514] cxl_core:init_hdm_decoder:1139: cxl_port endpoint2: decoder2.3: range: 0x0-0xffffffffffffffff iw: 1 ig:6
[  152.890362] cxl_core:add_hdm_decoder:39: cxl_mem mem0: decoder2.3 added to endpoint2
[  152.891493] cxl_core:cxl_bus_probe:2334: cxl_port endpoint2: probe: 0
[  152.892420] cxl_core:devm_cxl_add_port:1010: cxl_mem mem0: endpoint2 added to port1
[  152.893360] cxl_core:cxl_bi_ctrl_endpoint:1123: cxl_pci 0000:0d:00.0: device capable of issuing BI requests
[  152.894395] cxl_core:cxl_bus_probe:2334: cxl_mem mem0: probe: 0

Unbind both devices:
# echo mem0 > /sys/bus/cxl/drivers/cxl_mem/unbind
[  344.593463] cxl_core:cxl_bi_ctrl_endpoint:1123: cxl_pci 0000:0d:00.0: device incapable of issuing BI requests
# echo mem1 > /sys/bus/cxl/drivers/cxl_mem/unbind
[  349.861528] cxl_core:cxl_bi_ctrl_endpoint:1123: cxl_pci 0000:0e:00.0: device incapable of issuing BI requests

4. Discovery of 2 BI capable devices behind a switch (w/ flit mode)
-------------------------------------------------------------------
[    0.758525] cxl_core:cxl_probe_component_regs:102: cxl_pci 0000:11:00.0: found BI Decoder capability (0xab4)
[    0.767234] cxl_core:cxl_probe_component_regs:102: cxl_pci 0000:0f:00.0: found BI Decoder capability (0xab4)
[    0.842162] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0c:00.0: found BI Decoder capability (0xab4)
[    0.863272] cxl_port:cxl_uport_init_bi:94: cxl_port port2: BI RT registers not found
[    0.907717] cxl_core:cxl_probe_component_regs:96: cxl_port port2: found BI RT capability (0xaa8)
[    1.113556] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:03.0: found BI Decoder capability (0xab4)
[    1.128571] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:01.0: found BI Decoder capability (0xab4)
[    1.281506] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:02.0: found BI Decoder capability (0xab4)
[    1.290050] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:00.0: found BI Decoder capability (0xab4)
[    1.612582] cxl_core:__cxl_bi_commit:1024: pcieport 0000:0e:00.0: BI-ID commit wait took 203541us
[    1.616836] cxl_core:cxl_bi_ctrl_endpoint:1123: cxl_pci 0000:0f:00.0: device capable of issuing BI requests
[    1.820621] cxl_core:__cxl_bi_commit:1024: pcieport 0000:0e:02.0: BI-ID commit wait took 203526us
[    1.823760] cxl_core:cxl_bi_ctrl_endpoint:1123: cxl_pci 0000:11:00.0: device capable of issuing BI requests

5. Mixed Configurations (BI-capable type3 but DSP 68b)
------------------------------------------------------
[    0.842026] cxl_core:cxl_probe_component_regs:102: cxl_pci 0000:11:00.0: found BI Decoder capability (0xab4)
[    0.842092] cxl_core:cxl_probe_component_regs:102: cxl_pci 0000:0f:00.0: found BI Decoder capability (0xab4)
[    0.878436] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0c:00.0: found BI Decoder capability (0xab4)
[    1.297020] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:03.0: found BI Decoder capability (0xab4)
[    1.315171] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:02.0: found BI Decoder capability (0xab4)
[    1.475849] cxl_mem:cxl_mem_probe:160: cxl_mem mem3: BI setup failed rc=-22
[    1.476831] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:01.0: found BI Decoder capability (0xab4)
[    1.477785] cxl_core:cxl_probe_component_regs:102: pcieport 0000:0e:00.0: found BI Decoder capability (0xab4)
[    1.671067] cxl_mem:cxl_mem_probe:160: cxl_mem mem0: BI setup failed rc=-22

# echo 1 > /sys/bus/cxl/devices/decoder5.0/bi 
-bash: echo: write error: Invalid argument

# echo region0 > /sys/bus/cxl/devices/decoder0.0/create_ram_bi_region
[  287.198551] cxl_core:cxl_region_can_probe:4227: cxl_region region0: config state: 0
[  287.200005] cxl_core:cxl_bus_probe:2334: cxl_region region0: probe: -6
[  287.201012] cxl_core:devm_cxl_add_region:2649: cxl_acpi ACPI0017:00: decoder0.0: created region0

# echo decoder5.0 > /sys/bus/cxl/devices/region0/target0
[  384.895032] cxl_core:cxl_region_attach:2062: cxl region0: mem0:decoder5.0 BI not enabled on device
[  384.896300] cxl_port endpoint5: failed to attach decoder5.0 to region0: -6
-bash: echo: write error: No such device or address
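
The two failures above line up with two separate gates. A minimal sketch of
that behaviour, inferred only from the logs in this section; both helper
names are hypothetical and the real checks live in the decoder sysfs store
and in cxl_region_attach().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical gate for the decoder 'bi' attribute: asking for BI on a
 * device where BI setup failed is rejected ("Invalid argument").
 */
static int decoder_set_bi(bool dev_bi_enabled, bool want_bi)
{
	if (want_bi && !dev_bi_enabled)
		return -EINVAL;
	return 0;
}

/*
 * Hypothetical gate for attaching to a BI region: the device itself must
 * have BI enabled ("No such device or address").
 */
static int bi_region_attach(bool dev_bi_enabled)
{
	return dev_bi_enabled ? 0 : -ENXIO;
}
```

So even though an empty BI region can still be created on the root decoder,
nothing behind the 68b DSP can be attached to it.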


6. Detect mismatch between decoder and region types
---------------------------------------------------
# echo ram > /sys/bus/cxl/devices/decoder2.0/mode
# echo 1 > /sys/bus/cxl/devices/decoder2.0/bi
# echo 0x40000000 > /sys/bus/cxl/devices/decoder2.0/dpa_size
# echo region0 > /sys/bus/cxl/devices/decoder0.0/create_ram_region <- no bi
# echo 256 > /sys/bus/cxl/devices/region0/interleave_granularity
# echo 1 > /sys/bus/cxl/devices/region0/interleave_ways
# echo 0x40000000 > /sys/bus/cxl/devices/region0/size
# echo decoder2.0 > /sys/bus/cxl/devices/region0/target0
# echo 1 > /sys/bus/cxl/devices/region0/commit
[   24.738308] cxl_core:cxl_region_attach:2048: cxl region0: mem0:decoder2.0 type mismatch: 2 vs 3
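
For reading the "2 vs 3" in the message above: the sketch below assumes the
numbers follow mainline's enum cxl_decoder_type (2 for device coherent memory,
3 for host-only memory), printed as decoder type vs region type. The enum and
helper names here are hypothetical stand-ins, and the -ENXIO return value is
an assumption since the log does not show it.

```c
#include <errno.h>

/* Hypothetical stand-ins; values assumed to match enum cxl_decoder_type. */
enum target_type_sketch {
	SKETCH_DEVMEM = 2,	/* device coherent, what bi=1 selects */
	SKETCH_HOSTONLYMEM = 3,	/* host-only, what create_ram_region gives */
};

/* Hypothetical attach-time check: decoder and region types must agree. */
static int check_target_type(enum target_type_sketch decoder,
			     enum target_type_sketch region)
{
	if (decoder != region)
		return -ENXIO;	/* assumed error code */
	return 0;
}
```

Under that reading, the decoder was set up as device coherent (bi=1) while
the region came from create_ram_region and is therefore host-only.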

Applies against the 'next' branch of cxl.git + the type2 support preparation
series (https://lore.kernel.org/linux-cxl/20260306164741.3796372-1-alejandro.lucero-palau@amd.com)

Thanks!

Davidlohr Bueso (6):
  cxl: Add Back-Invalidate register definitions and structures
  cxl: Add BI register probing and port initialization
  cxl/pci: Add Back-Invalidate topology enable/disable
  cxl: Wire BI setup and dealloc into device lifecycle
  cxl/hdm: Add BI coherency support for endpoint decoders
  cxl: Add HDM-DB region creation and sysfs interface

 Documentation/ABI/testing/sysfs-bus-cxl |  40 ++-
 drivers/cxl/acpi.c                      |   2 +
 drivers/cxl/core/core.h                 |   4 +
 drivers/cxl/core/hdm.c                  |  60 ++++-
 drivers/cxl/core/pci.c                  | 339 ++++++++++++++++++++++++
 drivers/cxl/core/port.c                 |  63 ++++-
 drivers/cxl/core/region.c               |  65 ++++-
 drivers/cxl/core/regs.c                 |  13 +
 drivers/cxl/cxl.h                       |  41 +++
 drivers/cxl/cxlmem.h                    |   2 +
 drivers/cxl/mem.c                       |  20 ++
 drivers/cxl/port.c                      |  88 ++++++
 include/cxl/cxl.h                       |   5 +
 13 files changed, 726 insertions(+), 16 deletions(-)

-- 
2.39.5



Thread overview: 21+ messages
2026-03-15 20:27 [PATCH 0/6] cxl: Support Back-Invalidate Davidlohr Bueso
2026-03-15 20:27 ` [PATCH 1/6] cxl: Add Back-Invalidate register definitions and structures Davidlohr Bueso
2026-03-19 16:59   ` Jonathan Cameron
2026-03-20 14:57   ` Jonathan Cameron
2026-03-23 22:11   ` Dave Jiang
2026-03-15 20:27 ` [PATCH 2/6] cxl: Add BI register probing and port initialization Davidlohr Bueso
2026-03-20 15:46   ` Jonathan Cameron
2026-03-20 16:19   ` Cheatham, Benjamin
2026-03-23 23:10   ` Dave Jiang
2026-03-15 20:27 ` [PATCH 3/6] cxl/pci: Add Back-Invalidate topology enable/disable Davidlohr Bueso
2026-03-20 16:20   ` Cheatham, Benjamin
2026-03-20 20:52     ` Alison Schofield
2026-03-20 16:27   ` [PATCH 3/6] cxl/pci: Add Back-Invalidate topology enable/disabl Jonathan Cameron
2026-03-24  0:21   ` [PATCH 3/6] cxl/pci: Add Back-Invalidate topology enable/disable Dave Jiang
2026-03-15 20:27 ` [PATCH 4/6] cxl: Wire BI setup and dealloc into device lifecycle Davidlohr Bueso
2026-03-20 16:20   ` Cheatham, Benjamin
2026-03-20 16:29   ` Jonathan Cameron
2026-03-15 20:27 ` [PATCH 5/6] cxl/hdm: Add BI coherency support for endpoint decoders Davidlohr Bueso
2026-03-20 16:20   ` Cheatham, Benjamin
2026-03-15 20:27 ` [PATCH 6/6] cxl: Add HDM-DB region creation and sysfs interface Davidlohr Bueso
2026-03-20 16:39   ` Jonathan Cameron
