* [PATCH -qemu 0/4] hw/cxl: Support Back-Invalidate
@ 2025-08-06 5:57 Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 1/4] hw/pcie: Support enabling flit mode Davidlohr Bueso
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-06 5:57 UTC
To: jonathan.cameron
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Davidlohr Bueso
Hello,
This series adds basic support for Back-Invalidate (BI) discovery and
configuration on CXL components by exposing the BI routing table and BI
decoder registers. Instead of going the type2[0] route, it adds HDM-DB
support to type3 devices, which is a more direct way of supporting BI in qemu.
Changes from rfc (https://lore.kernel.org/qemu-devel/20250729165441.1898150-1-dave@stgolabs.net/)
o Added 256b-flit parameter, per Jonathan.
o Added window restrictions changes.
o Dropped rfc tag.
Patch 1 introduces the flit mode parameter.
Patch 2 is lifted from Ira's series with some small (but non-trivial) changes.
Patch 3 updates the cfmw restrictions option.
Patch 4 adds BI decoder/rt register support.
Testing-wise, this has passed the relevant kernel-side BI register I/O flows
for BI-ID setup and deallocation.
The next step for this would be to add UIO support to qemu.
Applies against branch 'origin/cxl-2025-07-03' from Jonathan's repository.
Thanks!
[0] https://lore.kernel.org/linux-cxl/20230517-rfc-type2-dev-v1-0-6eb2e470981b@intel.com/
Davidlohr Bueso (3):
hw/pcie: Support enabling flit mode
hw/cxl: Allow BI by default in Window restrictions
hw/cxl: Support Type3 HDM-DB
Ira Weiny (1):
hw/cxl: Refactor component register initialization
docs/system/devices/cxl.rst | 26 +++
hw/core/qdev-properties-system.c | 11 ++
hw/cxl/cxl-component-utils.c | 206 ++++++++++++++++------
hw/cxl/cxl-host.c | 2 +-
hw/mem/cxl_type3.c | 13 +-
hw/pci-bridge/cxl_downstream.c | 1 +
hw/pci-bridge/cxl_root_port.c | 1 +
hw/pci-bridge/cxl_upstream.c | 3 +-
hw/pci-bridge/gen_pcie_root_port.c | 1 +
hw/pci/pcie.c | 13 +-
include/hw/cxl/cxl_component.h | 87 +++++++--
include/hw/cxl/cxl_device.h | 4 +
include/hw/pci-bridge/cxl_upstream_port.h | 1 +
include/hw/pci/pcie.h | 2 +-
include/hw/pci/pcie_port.h | 1 +
include/hw/qdev-properties-system.h | 3 +
qapi/common.json | 14 ++
qapi/machine.json | 3 +-
qemu-options.hx | 4 +-
19 files changed, 317 insertions(+), 79 deletions(-)
--
2.39.5
* [PATCH 1/4] hw/pcie: Support enabling flit mode
2025-08-06 5:57 [PATCH -qemu 0/4] hw/cxl: Support Back-Invalidate Davidlohr Bueso
@ 2025-08-06 5:57 ` Davidlohr Bueso
2025-08-08 15:42 ` Jonathan Cameron
2025-08-08 16:02 ` Jonathan Cameron
2025-08-06 5:57 ` [PATCH 2/4] hw/cxl: Refactor component register initialization Davidlohr Bueso
` (2 subsequent siblings)
3 siblings, 2 replies; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-06 5:57 UTC
To: jonathan.cameron
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Davidlohr Bueso, Jonathan Cameron
As with link speed and width training, add an ad-hoc property for
setting the flit mode and allow CXL components to make use of it.
For the CXL root port and dsp cases, always report flit mode, but
the actual value after 'training' will depend on the downstream
device configuration.
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
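For readers following along, here is a minimal standalone sketch of what the
new flitmode argument ends up doing to the PCIe capability: when the property
is on, the Flit Mode Status bit is set in Link Status 2. This is not the QEMU
code. The sketch_* names are hypothetical, and the register offset (0x32) and
bit value (0x0400) are assumptions mirrored from Linux's pci_regs.h rather
than taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, mirroring Linux's pci_regs.h; not taken from this patch. */
#define SKETCH_PCI_EXP_LNKSTA2       0x32    /* Link Status 2 offset */
#define SKETCH_PCI_EXP_LNKSTA2_FLIT  0x0400  /* Flit Mode Status bit */
#define SKETCH_PCI_EXP_CAP_SIZE      0x3c    /* hypothetical buffer size */

/* Read a little-endian 16-bit config word. */
static uint16_t sketch_get_word(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Write a little-endian 16-bit config word. */
static void sketch_set_word(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Set the flit status bit only when the property is enabled. */
static void sketch_fill_flit(uint8_t *exp_cap, size_t len, bool flitmode)
{
    assert(len >= SKETCH_PCI_EXP_LNKSTA2 + 2);
    if (flitmode) {
        uint16_t sta2 = sketch_get_word(exp_cap + SKETCH_PCI_EXP_LNKSTA2);

        sketch_set_word(exp_cap + SKETCH_PCI_EXP_LNKSTA2,
                        sta2 | SKETCH_PCI_EXP_LNKSTA2_FLIT);
    }
}

/* Report whether the capability currently advertises flit mode. */
static bool sketch_link_in_flit_mode(const uint8_t *exp_cap)
{
    return (sketch_get_word(exp_cap + SKETCH_PCI_EXP_LNKSTA2) &
            SKETCH_PCI_EXP_LNKSTA2_FLIT) != 0;
}
```

With the defaults in this patch, a root port or dsp (default on) would set the
bit, while a type3 device or usp (default off) would leave it clear unless
256b-flit=on is given.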
hw/core/qdev-properties-system.c | 11 +++++++++++
hw/mem/cxl_type3.c | 4 +++-
hw/pci-bridge/cxl_downstream.c | 1 +
hw/pci-bridge/cxl_root_port.c | 1 +
hw/pci-bridge/cxl_upstream.c | 3 ++-
hw/pci-bridge/gen_pcie_root_port.c | 1 +
hw/pci/pcie.c | 13 +++++++++----
include/hw/cxl/cxl_device.h | 1 +
include/hw/pci-bridge/cxl_upstream_port.h | 1 +
include/hw/pci/pcie.h | 2 +-
include/hw/pci/pcie_port.h | 1 +
include/hw/qdev-properties-system.h | 3 +++
qapi/common.json | 14 ++++++++++++++
13 files changed, 49 insertions(+), 7 deletions(-)
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 24e145d87001..94a1b754ecdc 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -1172,6 +1172,17 @@ const PropertyInfo qdev_prop_pcie_link_width = {
.set_default_value = qdev_propinfo_set_default_value_enum,
};
+/* --- Flit mode --- */
+
+const PropertyInfo qdev_prop_pcie_link_flit = {
+ .type = "PCIELinkFlit",
+ .description = "off/on",
+ .enum_table = &PCIELinkFlit_lookup,
+ .get = qdev_propinfo_get_enum,
+ .set = qdev_propinfo_set_enum,
+ .set_default_value = qdev_propinfo_set_default_value_enum,
+};
+
/* --- UUID --- */
static void get_uuid(Object *obj, Visitor *v, const char *name, void *opaque,
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index c4658e0955d5..324cf62e8141 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1501,7 +1501,8 @@ void ct3d_reset(DeviceState *dev)
uint32_t *reg_state = ct3d->cxl_cstate.crb.cache_mem_registers;
uint32_t *write_msk = ct3d->cxl_cstate.crb.cache_mem_regs_write_mask;
- pcie_cap_fill_link_ep_usp(PCI_DEVICE(dev), ct3d->width, ct3d->speed);
+ pcie_cap_fill_link_ep_usp(PCI_DEVICE(dev), ct3d->width, ct3d->speed,
+ ct3d->flitmode);
cxl_component_register_init_common(reg_state, write_msk, CXL2_TYPE3_DEVICE);
cxl_device_register_init_t3(ct3d, CXL_T3_MSIX_MBOX);
@@ -1540,6 +1541,7 @@ static const Property ct3_props[] = {
speed, PCIE_LINK_SPEED_32),
DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLType3Dev,
width, PCIE_LINK_WIDTH_16),
+ DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", CXLType3Dev, flitmode, 0),
DEFINE_PROP_UINT16("chmu-port", CXLType3Dev, cxl_dstate.chmu[0].port, 0),
};
diff --git a/hw/pci-bridge/cxl_downstream.c b/hw/pci-bridge/cxl_downstream.c
index 6aa8586f0161..82e6618b111c 100644
--- a/hw/pci-bridge/cxl_downstream.c
+++ b/hw/pci-bridge/cxl_downstream.c
@@ -257,6 +257,7 @@ static const Property cxl_dsp_props[] = {
speed, PCIE_LINK_SPEED_64),
DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
width, PCIE_LINK_WIDTH_16),
+ DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 1),
};
static void cxl_dsp_class_init(ObjectClass *oc, const void *data)
diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
index f035987b6f1f..721dc3981fd2 100644
--- a/hw/pci-bridge/cxl_root_port.c
+++ b/hw/pci-bridge/cxl_root_port.c
@@ -235,6 +235,7 @@ static const Property gen_rp_props[] = {
speed, PCIE_LINK_SPEED_64),
DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
width, PCIE_LINK_WIDTH_32),
+ DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 1),
};
static void cxl_rp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
diff --git a/hw/pci-bridge/cxl_upstream.c b/hw/pci-bridge/cxl_upstream.c
index c2150afff39b..5e6d559a3215 100644
--- a/hw/pci-bridge/cxl_upstream.c
+++ b/hw/pci-bridge/cxl_upstream.c
@@ -147,7 +147,7 @@ static void cxl_usp_reset(DeviceState *qdev)
pci_bridge_reset(qdev);
pcie_cap_deverr_reset(d);
- pcie_cap_fill_link_ep_usp(d, usp->width, usp->speed);
+ pcie_cap_fill_link_ep_usp(d, usp->width, usp->speed, usp->flitmode);
latch_registers(usp);
}
@@ -433,6 +433,7 @@ static const Property cxl_upstream_props[] = {
speed, PCIE_LINK_SPEED_32),
DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLUpstreamPort,
width, PCIE_LINK_WIDTH_16),
+ DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", CXLUpstreamPort, flitmode, 0),
};
static void cxl_upstream_class_init(ObjectClass *oc, const void *data)
diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
index d9078e783bf0..c00be8147c2a 100644
--- a/hw/pci-bridge/gen_pcie_root_port.c
+++ b/hw/pci-bridge/gen_pcie_root_port.c
@@ -145,6 +145,7 @@ static const Property gen_rp_props[] = {
speed, PCIE_LINK_SPEED_16),
DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
width, PCIE_LINK_WIDTH_32),
+ DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 0),
};
static void gen_rp_dev_class_init(ObjectClass *klass, const void *data)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index eaeb68894e6e..55e0f7110ae5 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -113,7 +113,7 @@ pcie_cap_v1_fill(PCIDevice *dev, uint8_t port, uint8_t type, uint8_t version)
/* Includes setting the target speed default */
static void pcie_cap_fill_lnk(uint8_t *exp_cap, PCIExpLinkWidth width,
- PCIExpLinkSpeed speed)
+ PCIExpLinkSpeed speed, PCIELinkFlit flitmode)
{
/* Clear and fill LNKCAP from what was configured above */
pci_long_test_and_clear_mask(exp_cap + PCI_EXP_LNKCAP,
@@ -158,10 +158,15 @@ static void pcie_cap_fill_lnk(uint8_t *exp_cap, PCIExpLinkWidth width,
PCI_EXP_LNKCAP2_SLS_64_0GB);
}
}
+
+ if (flitmode) {
+ pci_long_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA2,
+ PCI_EXP_LNKSTA2_FLIT);
+ }
}
void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
- PCIExpLinkSpeed speed)
+ PCIExpLinkSpeed speed, PCIELinkFlit flitmode)
{
uint8_t *exp_cap = dev->config + dev->exp.exp_cap;
@@ -175,7 +180,7 @@ void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
QEMU_PCI_EXP_LNKSTA_NLW(width) |
QEMU_PCI_EXP_LNKSTA_CLS(speed));
- pcie_cap_fill_lnk(exp_cap, width, speed);
+ pcie_cap_fill_lnk(exp_cap, width, speed, flitmode);
}
static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
@@ -212,7 +217,7 @@ static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
/* the PCI_EXP_LNKSTA_DLLLA will be set in the hotplug function */
}
- pcie_cap_fill_lnk(exp_cap, s->width, s->speed);
+ pcie_cap_fill_lnk(exp_cap, s->width, s->speed, s->flitmode);
}
int pcie_cap_init(PCIDevice *dev, uint8_t offset,
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 068c20d61ebc..4c9d2247cf02 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -796,6 +796,7 @@ struct CXLType3Dev {
/* PCIe link characteristics */
PCIExpLinkSpeed speed;
PCIExpLinkWidth width;
+ PCIELinkFlit flitmode;
/* DOE */
DOECap doe_cdat;
diff --git a/include/hw/pci-bridge/cxl_upstream_port.h b/include/hw/pci-bridge/cxl_upstream_port.h
index db1dfb6afd98..584e43c37291 100644
--- a/include/hw/pci-bridge/cxl_upstream_port.h
+++ b/include/hw/pci-bridge/cxl_upstream_port.h
@@ -20,6 +20,7 @@ typedef struct CXLUpstreamPort {
PCIExpLinkSpeed speed;
PCIExpLinkWidth width;
+ PCIELinkFlit flitmode;
DOECap doe_cdat;
uint64_t sn;
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index ff6ce08e135a..82fcbc9f8823 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -142,7 +142,7 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset);
void pcie_dev_ser_num_init(PCIDevice *dev, uint16_t offset, uint64_t ser_num);
void pcie_ats_init(PCIDevice *dev, uint16_t offset, bool aligned);
void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
- PCIExpLinkSpeed speed);
+ PCIExpLinkSpeed speed, PCIELinkFlit flitmode);
void pcie_cap_slot_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
Error **errp);
diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
index 7cd7af8cfa4b..2f96fc685729 100644
--- a/include/hw/pci/pcie_port.h
+++ b/include/hw/pci/pcie_port.h
@@ -58,6 +58,7 @@ struct PCIESlot {
PCIExpLinkSpeed speed;
PCIExpLinkWidth width;
+ PCIELinkFlit flitmode;
/* Disable ACS (really for a pcie_root_port) */
bool disable_acs;
diff --git a/include/hw/qdev-properties-system.h b/include/hw/qdev-properties-system.h
index b921392c5256..dd5dc4515ea7 100644
--- a/include/hw/qdev-properties-system.h
+++ b/include/hw/qdev-properties-system.h
@@ -28,6 +28,7 @@ extern const PropertyInfo qdev_prop_audiodev;
extern const PropertyInfo qdev_prop_off_auto_pcibar;
extern const PropertyInfo qdev_prop_pcie_link_speed;
extern const PropertyInfo qdev_prop_pcie_link_width;
+extern const PropertyInfo qdev_prop_pcie_link_flit;
extern const PropertyInfo qdev_prop_cpus390entitlement;
extern const PropertyInfo qdev_prop_iothread_vq_mapping_list;
extern const PropertyInfo qdev_prop_endian_mode;
@@ -80,6 +81,8 @@ extern const PropertyInfo qdev_prop_vmapple_virtio_blk_variant;
#define DEFINE_PROP_PCIE_LINK_WIDTH(_n, _s, _f, _d) \
DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_pcie_link_width, \
PCIExpLinkWidth)
+#define DEFINE_PROP_PCIE_LINK_FLIT(_n, _s, _f, _d) \
+ DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_pcie_link_flit, PCIELinkFlit)
#define DEFINE_PROP_UUID(_name, _state, _field) \
DEFINE_PROP(_name, _state, _field, qdev_prop_uuid, QemuUUID, \
diff --git a/qapi/common.json b/qapi/common.json
index 0e3a0bbbfb0b..da047fbf874f 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -140,6 +140,20 @@
{ 'enum': 'PCIELinkWidth',
'data': [ '1', '2', '4', '8', '12', '16', '32' ] }
+##
+# @PCIELinkFlit:
+#
+# An enumeration of PCIe link FLIT mode
+#
+# @off: the link is not operating in FLIT mode
+#
+# @on: each FLIT is a fixed 256 bytes in size
+#
+# Since: 10.0
+##
+{ 'enum': 'PCIELinkFlit',
+ 'data': [ 'off', 'on'] }
+
##
# @HostMemPolicy:
#
--
2.39.5
* [PATCH 2/4] hw/cxl: Refactor component register initialization
2025-08-06 5:57 [PATCH -qemu 0/4] hw/cxl: Support Back-Invalidate Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 1/4] hw/pcie: Support enabling flit mode Davidlohr Bueso
@ 2025-08-06 5:57 ` Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 4/4] hw/cxl: Support Type3 HDM-DB Davidlohr Bueso
3 siblings, 0 replies; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-06 5:57 UTC
To: jonathan.cameron
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Davidlohr Bueso
From: Ira Weiny <ira.weiny@intel.com>
CXL 3.2 8.2.4 Table 8-22 defines which capabilities are mandatory, not
permitted, or optional for each type of device.
cxl_component_register_init_common() uses a rather odd 'fall through'
mechanism to define each component register set. This assumes that any
device or capability being added builds on the previous device's
capabilities. This is not true, as there are mutually exclusive
capabilities defined. For example, downstream ports cannot have snoop
but can have Back Invalidate capable decoders.
Refactor this code to make it easier to add individual capabilities as
defined by a device type. Any capability which is not specified by the
type is left NULL'ed out which complies with the packed nature of the
register array.
Update all spec references to 3.2.
No functional changes should be seen with this patch.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
[rebased, no RAS for HBs, r3.2 references]
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
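To illustrate the indexing scheme, here is a rough standalone sketch of how
the array size is derived from the highest populated capability index, with
unused slots left as NULL (all-zero) headers. The indexes, IDs and versions
are copied from the hunks below. The sketch_* names and the array bound are
hypothetical, and the PTR field is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Indexes follow the CXLx_CAPABILITY_HEADER() calls in this patch. */
enum sketch_cap_idx {
    SKETCH_RAS_IDX = 1,
    SKETCH_LINK_IDX = 2,
    SKETCH_HDM_IDX = 3,
    SKETCH_EXTSEC_IDX = 4,
    SKETCH_SNOOP_IDX = 5,
    SKETCH_MAX_IDX = 16, /* hypothetical bound, only for this sketch */
};

struct sketch_caps {
    uint32_t hdr[SKETCH_MAX_IDX]; /* hdr[0] is the CXL Capability Header */
    int array_size;               /* highest populated index so far */
};

/* Header layout per the FIELD() definitions: ID[15:0], VERSION[19:16]. */
static void sketch_init_cap(struct sketch_caps *c, int idx, uint16_t id,
                            uint8_t version)
{
    assert(idx > 0 && idx < SKETCH_MAX_IDX);
    c->hdr[idx] = (uint32_t)id | ((uint32_t)(version & 0xf) << 16);
    if (idx > c->array_size) {
        c->array_size = idx;
    }
}

/* Downstream port: RAS and Link only. */
static void sketch_init_dsp(struct sketch_caps *c)
{
    memset(c, 0, sizeof(*c));
    sketch_init_cap(c, SKETCH_RAS_IDX, 2, 2);
    sketch_init_cap(c, SKETCH_LINK_IDX, 4, 2);
}

/* Host bridge (RC): no RAS, so index 1 stays a NULL (all-zero) header. */
static void sketch_init_rc(struct sketch_caps *c)
{
    memset(c, 0, sizeof(*c));
    sketch_init_cap(c, SKETCH_EXTSEC_IDX, 6, 1);
    sketch_init_cap(c, SKETCH_SNOOP_IDX, 8, 1);
    sketch_init_cap(c, SKETCH_HDM_IDX, 5, 1);
    sketch_init_cap(c, SKETCH_LINK_IDX, 4, 2);
}
```

The point of tracking the maximum rather than counting is that a later,
higher-indexed capability can be added for one device type without forcing
every lower slot to be populated.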
hw/cxl/cxl-component-utils.c | 75 +++++++++++-----------------------
include/hw/cxl/cxl_component.h | 33 ++++++++++-----
2 files changed, 46 insertions(+), 62 deletions(-)
diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
index ce42fa4a2946..a43d227336ca 100644
--- a/hw/cxl/cxl-component-utils.c
+++ b/hw/cxl/cxl-component-utils.c
@@ -289,32 +289,6 @@ void cxl_component_register_init_common(uint32_t *reg_state,
{
int caps = 0;
- /*
- * In CXL 2.0 the capabilities required for each CXL component are such
- * that, with the ordering chosen here, a single number can be used to
- * define which capabilities should be provided.
- */
- switch (type) {
- case CXL2_DOWNSTREAM_PORT:
- case CXL2_DEVICE:
- /* RAS, Link */
- caps = 2;
- break;
- case CXL2_UPSTREAM_PORT:
- case CXL2_TYPE3_DEVICE:
- case CXL2_LOGICAL_DEVICE:
- /* + HDM */
- caps = 3;
- break;
- case CXL2_ROOT_PORT:
- case CXL2_RC:
- /* + Extended Security, + Snoop */
- caps = 5;
- break;
- default:
- abort();
- }
-
memset(reg_state, 0, CXL2_COMPONENT_CM_REGION_SIZE);
/* CXL Capability Header Register */
@@ -322,11 +296,12 @@ void cxl_component_register_init_common(uint32_t *reg_state,
ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION,
CXL_CAPABILITY_VERSION);
ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
- ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
#define init_cap_reg(reg, id, version) \
do { \
- int which = R_CXL_##reg##_CAPABILITY_HEADER; \
+ int which = CXL_##reg##_CAP_HDR_IDX; \
+ if (CXL_##reg##_CAP_HDR_IDX > caps) \
+ caps = CXL_##reg##_CAP_HDR_IDX; \
reg_state[which] = FIELD_DP32(reg_state[which], \
CXL_##reg##_CAPABILITY_HEADER, ID, id); \
reg_state[which] = \
@@ -337,37 +312,35 @@ void cxl_component_register_init_common(uint32_t *reg_state,
CXL_##reg##_REGISTERS_OFFSET); \
} while (0)
+ /* CXL r3.2 8.2.4 Table 8-22 */
switch (type) {
- case CXL2_DEVICE:
- case CXL2_TYPE3_DEVICE:
- case CXL2_LOGICAL_DEVICE:
case CXL2_ROOT_PORT:
+ case CXL2_RC:
+ /* + Extended Security, + Snoop */
+ init_cap_reg(EXTSEC, 6, 1);
+ init_cap_reg(SNOOP, 8, 1);
+ /* fallthrough */
case CXL2_UPSTREAM_PORT:
+ case CXL2_TYPE3_DEVICE:
+ case CXL2_LOGICAL_DEVICE:
+ /* + HDM */
+ init_cap_reg(HDM, 5, 1);
+ hdm_init_common(reg_state, write_msk, type);
+ /* fallthrough */
case CXL2_DOWNSTREAM_PORT:
- init_cap_reg(RAS, 2, CXL_RAS_CAPABILITY_VERSION);
- ras_init_common(reg_state, write_msk);
+ case CXL2_DEVICE:
+ /* RAS, Link */
+ if (type != CXL2_RC) {
+ init_cap_reg(RAS, 2, 2);
+ ras_init_common(reg_state, write_msk);
+ }
+ init_cap_reg(LINK, 4, 2);
break;
default:
- break;
- }
-
- init_cap_reg(LINK, 4, CXL_LINK_CAPABILITY_VERSION);
-
- if (caps < 3) {
- return;
- }
-
- if (type != CXL2_ROOT_PORT) {
- init_cap_reg(HDM, 5, CXL_HDM_CAPABILITY_VERSION);
- hdm_init_common(reg_state, write_msk, type);
- }
- if (caps < 5) {
- return;
+ abort();
}
- init_cap_reg(EXTSEC, 6, CXL_EXTSEC_CAP_VERSION);
- init_cap_reg(SNOOP, 8, CXL_SNOOP_CAP_VERSION);
-
+ ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
#undef init_cap_reg
}
diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
index b721333cb7aa..cd92cb02532a 100644
--- a/include/hw/cxl/cxl_component.h
+++ b/include/hw/cxl/cxl_component.h
@@ -32,10 +32,20 @@ enum reg_type {
};
/*
- * Capability registers are defined at the top of the CXL.cache/mem region and
- * are packed. For our purposes we will always define the caps in the same
- * order.
- * CXL r3.1 Table 8-22: CXL_CAPABILITY_ID Assignment for details.
+ * CXL r3.2 - 8.2.4 Table 8-22 and 8-23
+ *
+ * Capability registers are defined at the top of the CXL.cache/mem region.
+ * They are defined to be packed and at variable offsets. However, NULL
+ * capabilities can be added to the packed array. To facilitate easier access
+ * within the QEMU code, define these at specified offsets. Then NULL out any
+ * capabilities for devices which don't (or can't) have a particular capability
+ * (see cxl_component_register_init_common). NULL capabilities are to be
+ * ignored by software.
+ *
+ * 'offsets' are based on index's which can then be used to report the array
+ * size in CXL Capability Header Register (index/offset 0).
+ *
+ * See CXL r3.2 Table 8-25 for an example of allowing a 'NULL' header.
*/
/* CXL r3.1 Section 8.2.4.1: CXL Capability Header Register */
@@ -46,16 +56,17 @@ REG32(CXL_CAPABILITY_HEADER, 0)
FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
-#define CXLx_CAPABILITY_HEADER(type, offset) \
- REG32(CXL_##type##_CAPABILITY_HEADER, offset) \
+#define CXLx_CAPABILITY_HEADER(type, idx) \
+ enum { CXL_##type##_CAP_HDR_IDX = idx }; \
+ REG32(CXL_##type##_CAPABILITY_HEADER, (idx * 0x4)) \
FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16) \
FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
-CXLx_CAPABILITY_HEADER(RAS, 0x4)
-CXLx_CAPABILITY_HEADER(LINK, 0x8)
-CXLx_CAPABILITY_HEADER(HDM, 0xc)
-CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
-CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
+CXLx_CAPABILITY_HEADER(RAS, 1)
+CXLx_CAPABILITY_HEADER(LINK, 2)
+CXLx_CAPABILITY_HEADER(HDM, 3)
+CXLx_CAPABILITY_HEADER(EXTSEC, 4)
+CXLx_CAPABILITY_HEADER(SNOOP, 5)
/*
* Capability structures contain the actual registers that the CXL component
--
2.39.5
* [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions
2025-08-06 5:57 [PATCH -qemu 0/4] hw/cxl: Support Back-Invalidate Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 1/4] hw/pcie: Support enabling flit mode Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 2/4] hw/cxl: Refactor component register initialization Davidlohr Bueso
@ 2025-08-06 5:57 ` Davidlohr Bueso
2025-08-07 0:06 ` Davidlohr Bueso
2025-08-08 15:47 ` Jonathan Cameron
2025-08-06 5:57 ` [PATCH 4/4] hw/cxl: Support Type3 HDM-DB Davidlohr Bueso
3 siblings, 2 replies; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-06 5:57 UTC
To: jonathan.cameron
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Davidlohr Bueso
Update the CFMW restrictions to also permit Back-Invalidate
flows by default, which is aligned with the no-restrictions
policy.
While at it, document the 'restrictions=' option.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
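As a quick sanity check of the new default, the following standalone sketch
shows that 0x2f is just the old 0xf with BIT(5) added, and that Fixed Device
Config (BIT(4)) stays clear. Bits 2 to 5 are named as in the machine.json
hunk below; bits 0 and 1 are not shown in that hunk and are left unnamed here.
All sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT(n) (1u << (n))

/* Bits 2-5 as listed in qapi/machine.json by this patch. */
#define SKETCH_CFMW_VOLATILE      SKETCH_BIT(2)
#define SKETCH_CFMW_PERSISTENT    SKETCH_BIT(3)
#define SKETCH_CFMW_FIXED_CONFIG  SKETCH_BIT(4)
#define SKETCH_CFMW_BI            SKETCH_BIT(5)

#define SKETCH_CFMW_OLD_DEFAULT   0x0fu
#define SKETCH_CFMW_NEW_DEFAULT   0x2fu

/* The new default only adds the BI bit on top of the old one. */
static_assert(SKETCH_CFMW_NEW_DEFAULT ==
              (SKETCH_CFMW_OLD_DEFAULT | SKETCH_CFMW_BI),
              "new default is old default plus BI");

/* Mirror of the has_restrictions check: fall back to the default mask. */
static uint16_t sketch_cfmw_restrictions(bool has_restrictions,
                                         uint16_t restrictions)
{
    return has_restrictions ? restrictions : SKETCH_CFMW_NEW_DEFAULT;
}

/* True when the window permits Back-Invalidate flows. */
static bool sketch_cfmw_allows_bi(uint16_t restrictions)
{
    return (restrictions & SKETCH_CFMW_BI) != 0;
}
```

Note that a user who explicitly passes the old value (restrictions=0xf) would
still get a window without BI.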
hw/cxl/cxl-host.c | 2 +-
qapi/machine.json | 3 ++-
qemu-options.hx | 4 +++-
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c
index def2cf75be61..0d17ea3e4c26 100644
--- a/hw/cxl/cxl-host.c
+++ b/hw/cxl/cxl-host.c
@@ -64,7 +64,7 @@ static void cxl_fixed_memory_window_config(CXLFixedMemoryWindowOptions *object,
if (object->has_restrictions) {
fw->restrictions = object->restrictions;
} else {
- fw->restrictions = 0xf; /* No restrictions */
+ fw->restrictions = 0x2f; /* No restrictions */
}
fw->targets = g_malloc0_n(fw->num_targets, sizeof(*fw->targets));
diff --git a/qapi/machine.json b/qapi/machine.json
index ac258578e4ab..ea8ba71305b0 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -589,7 +589,8 @@
# BIT(2) - Volatile
# BIT(3) - Persistent
# BIT(4) - Fixed Device Config
-# Default is 0xF
+# BIT(5) - BI
+# Default is 0x2F
#
# @targets: Target root bridge IDs from -device ...,id=<ID> for each
# root bridge.
diff --git a/qemu-options.hx b/qemu-options.hx
index 1f862b19a676..ef6072bd8b59 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -142,7 +142,7 @@ SRST
-machine memory-backend=pc.ram
-m 512M
- ``cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]``
+ ``cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity,restrictions=restrictions]``
Define a CXL Fixed Memory Window (CFMW).
Described in the CXL 2.0 ECN: CEDT CFMWS & QTG _DSM.
@@ -168,6 +168,8 @@ SRST
interleave. Default 256 (bytes). Only 256, 512, 1k, 2k,
4k, 8k and 16k granularities supported.
+ ``restrictions=restrictions`` bitmask of restrictions of the CFMW.
+
Example:
::
--
2.39.5
* [PATCH 4/4] hw/cxl: Support Type3 HDM-DB
2025-08-06 5:57 [PATCH -qemu 0/4] hw/cxl: Support Back-Invalidate Davidlohr Bueso
` (2 preceding siblings ...)
2025-08-06 5:57 ` [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions Davidlohr Bueso
@ 2025-08-06 5:57 ` Davidlohr Bueso
3 siblings, 0 replies; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-06 5:57 UTC
To: jonathan.cameron
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Davidlohr Bueso
Add basic plumbing for memory expander devices that support Back
Invalidation. This introduces an 'hdm-db=on|off' parameter and
exposes the relevant BI RT/Decoder component cachemem registers.
Devices with hdm-db=on must also enable flit mode (256b-flit=on).
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
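The commit emulation in this patch boils down to a small timed state machine,
sketched standalone below: writing COMMIT clears COMMITTED and records a
timestamp, and a later status read reports COMMITTED once 100 ms have passed.
This is only an illustration. The sketch_* names and the bit positions are
hypothetical (the real fields come from the register definitions in
cxl_component.h), and the clock is passed in explicitly instead of using
QEMU_CLOCK_VIRTUAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_COMMIT_DELAY_MS 100        /* same arbitrary delay as the patch */
#define SKETCH_CTRL_COMMIT     (1u << 0)  /* hypothetical bit position */
#define SKETCH_STS_COMMITTED   (1u << 0)  /* hypothetical bit position */

struct sketch_bi_regs {
    uint32_t ctrl;
    uint32_t status;
    uint64_t last_commit_ms; /* 0 means no commit has been requested */
};

/* Control write: a set COMMIT bit clears COMMITTED and starts the timer. */
static void sketch_bi_write_ctrl(struct sketch_bi_regs *r, uint32_t value,
                                 uint64_t now_ms)
{
    /* 0 is reserved for "no commit pending" in this sketch. */
    assert(now_ms > 0);
    if (value & SKETCH_CTRL_COMMIT) {
        r->status &= ~SKETCH_STS_COMMITTED;
        r->last_commit_ms = now_ms;
    }
    r->ctrl = value;
}

/* Status read: COMMITTED reflects whether the delay has elapsed. */
static uint32_t sketch_bi_read_status(struct sketch_bi_regs *r,
                                      uint64_t now_ms)
{
    if (r->last_commit_ms) {
        bool done = now_ms >= r->last_commit_ms + SKETCH_COMMIT_DELAY_MS;

        if (done) {
            r->status |= SKETCH_STS_COMMITTED;
        } else {
            r->status &= ~SKETCH_STS_COMMITTED;
        }
    }
    return r->status;
}
```

A guest polling the status register therefore sees the commit complete well
within the 2 second timeout advertised by the scale/base fields.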
docs/system/devices/cxl.rst | 26 +++++++
hw/cxl/cxl-component-utils.c | 135 +++++++++++++++++++++++++++++++--
hw/mem/cxl_type3.c | 9 ++-
include/hw/cxl/cxl_component.h | 54 ++++++++++++-
include/hw/cxl/cxl_device.h | 3 +
5 files changed, 218 insertions(+), 9 deletions(-)
diff --git a/docs/system/devices/cxl.rst b/docs/system/devices/cxl.rst
index bf7908429af8..9a0c5a5cdac3 100644
--- a/docs/system/devices/cxl.rst
+++ b/docs/system/devices/cxl.rst
@@ -384,6 +384,32 @@ An example of 4 devices below a switch suitable for 1, 2 or 4 way interleave::
-device cxl-type3,bus=swport3,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3,sn=0x4 \
-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
+The same 4 Type3 devices under a switch, but two of them use HDM-DB for coherence::
+
+ qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
+ ...
+ -object memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M \
+ -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M \
+ -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M \
+ -object memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M \
+ -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M \
+ -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M \
+ -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M \
+ -object memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M \
+ -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
+ -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
+ -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
+ -device cxl-upstream,bus=root_port0,id=us0,256b-flit=on \
+ -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
+ -device cxl-type3,bus=swport0,persistent-memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0,sn=0x1,256b-flit=on,hdm-db=on \
+ -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
+ -device cxl-type3,bus=swport1,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1,sn=0x2,256b-flit=on,hdm-db=on \
+ -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
+ -device cxl-type3,bus=swport2,persistent-memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2,sn=0x3 \
+ -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
+ -device cxl-type3,bus=swport3,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3,sn=0x4 \
+ -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
+
A simple arm/virt example featuring a single direct connected CXL Type 3
Volatile Memory device::
diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
index a43d227336ca..dfdbf23a427c 100644
--- a/hw/cxl/cxl-component-utils.c
+++ b/hw/cxl/cxl-component-utils.c
@@ -71,10 +71,40 @@ static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
case 4:
if (cregs->special_ops && cregs->special_ops->read) {
return cregs->special_ops->read(cxl_cstate, offset, 4);
- } else {
- QEMU_BUILD_BUG_ON(sizeof(*cregs->cache_mem_registers) != 4);
- return cregs->cache_mem_registers[offset / 4];
}
+
+ QEMU_BUILD_BUG_ON(sizeof(*cregs->cache_mem_registers) != 4);
+
+ if (offset == A_CXL_BI_RT_STATUS ||
+ offset == A_CXL_BI_DECODER_STATUS) {
+ int type;
+ uint64_t started;
+
+ type = (offset == A_CXL_BI_RT_STATUS) ?
+ CXL_BISTATE_RT : CXL_BISTATE_DECODER;
+ started = cxl_cstate->bi_state[type].last_commit;
+
+ if (started) {
+ uint32_t val, *cache_mem = cregs->cache_mem_registers;
+ uint64_t now;
+ int set;
+
+ val = cregs->cache_mem_registers[offset / 4];
+ now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+ /* arbitrary 100 ms to do the commit */
+ set = !!(now >= started + 100);
+
+ if (offset == A_CXL_BI_RT_STATUS) {
+ val = FIELD_DP32(val, CXL_BI_RT_STATUS, COMMITTED, set);
+ } else {
+ val = FIELD_DP32(val, CXL_BI_DECODER_STATUS, COMMITTED,
+ set);
+ }
+ stl_le_p((uint8_t *)cache_mem + offset, val);
+ }
+ }
+
+ return cregs->cache_mem_registers[offset / 4];
case 8:
qemu_log_mask(LOG_UNIMP,
"CXL 8 byte cache mem registers not implemented\n");
@@ -123,6 +153,47 @@ static void dumb_hdm_handler(CXLComponentState *cxl_cstate, hwaddr offset,
}
}
+static void dumb_bi_handler(CXLComponentState *cxl_cstate, hwaddr offset,
+ uint32_t value)
+{
+ ComponentRegisters *cregs = &cxl_cstate->crb;
+ uint32_t sts, *cache_mem = cregs->cache_mem_registers;
+ bool to_commit = false;
+ int type;
+
+ switch (offset) {
+ case A_CXL_BI_RT_CTRL:
+ to_commit = FIELD_EX32(value, CXL_BI_RT_CTRL, COMMIT);
+ if (to_commit) {
+ sts = cxl_cache_mem_read_reg(cxl_cstate,
+ A_CXL_BI_RT_STATUS, 4);
+ sts = FIELD_DP32(sts, CXL_BI_RT_STATUS, COMMITTED, 0);
+ stl_le_p((uint8_t *)cache_mem + A_CXL_BI_RT_STATUS, sts);
+ type = CXL_BISTATE_RT;
+ }
+ break;
+ case A_CXL_BI_DECODER_CTRL:
+ to_commit = FIELD_EX32(value, CXL_BI_DECODER_CTRL, COMMIT);
+ if (to_commit) {
+ sts = cxl_cache_mem_read_reg(cxl_cstate,
+ A_CXL_BI_DECODER_STATUS, 4);
+ sts = FIELD_DP32(sts, CXL_BI_DECODER_STATUS, COMMITTED, 0);
+ stl_le_p((uint8_t *)cache_mem + A_CXL_BI_DECODER_STATUS, sts);
+ type = CXL_BISTATE_DECODER;
+ }
+ break;
+ default:
+ break;
+ }
+
+ if (to_commit) {
+ cxl_cstate->bi_state[type].last_commit =
+ qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+ }
+
+ stl_le_p((uint8_t *)cache_mem + offset, value);
+}
+
static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
unsigned size)
{
@@ -146,6 +217,9 @@ static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
if (offset >= A_CXL_HDM_DECODER_CAPABILITY &&
offset <= A_CXL_HDM_DECODER3_TARGET_LIST_HI) {
dumb_hdm_handler(cxl_cstate, offset, value);
+ } else if (offset == A_CXL_BI_RT_CTRL ||
+ offset == A_CXL_BI_DECODER_CTRL) {
+ dumb_bi_handler(cxl_cstate, offset, value);
} else {
cregs->cache_mem_registers[offset / 4] = value;
}
@@ -248,7 +322,7 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_4K, 1);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
POISON_ON_ERR_CAP, 0);
- if (type == CXL2_TYPE3_DEVICE) {
+ if (type == CXL2_TYPE3_DEVICE || type == CXL3_TYPE3_DEVICE) {
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, 3_6_12_WAY, 1);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, 16_WAY, 1);
} else {
@@ -260,7 +334,8 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
UIO_DECODER_COUNT, 0);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, MEMDATA_NXM_CAP, 0);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
- SUPPORTED_COHERENCY_MODEL, 0); /* Unknown */
+ SUPPORTED_COHERENCY_MODEL,
+ type == CXL3_TYPE3_DEVICE ? 3:0); /* host+dev or Unknown */
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL,
HDM_DECODER_ENABLE, 0);
write_msk[R_CXL_HDM_DECODER_GLOBAL_CONTROL] = 0x3;
@@ -271,7 +346,7 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
write_msk[R_CXL_HDM_DECODER0_SIZE_HI + i * hdm_inc] = 0xffffffff;
write_msk[R_CXL_HDM_DECODER0_CTRL + i * hdm_inc] = 0x13ff;
if (type == CXL2_DEVICE ||
- type == CXL2_TYPE3_DEVICE ||
+ type == CXL2_TYPE3_DEVICE || type == CXL3_TYPE3_DEVICE ||
type == CXL2_LOGICAL_DEVICE) {
write_msk[R_CXL_HDM_DECODER0_TARGET_LIST_LO + i * hdm_inc] =
0xf0000000;
@@ -283,6 +358,37 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
}
}
+static void bi_rt_init_common(uint32_t *reg_state, uint32_t *write_msk)
+{
+ /* switch usp must commit the new BI-ID, timeout of 2secs */
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_CAPABILITY, EXPLICIT_COMMIT, 1);
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_CTRL, COMMIT, 0);
+ write_msk[R_CXL_BI_RT_CTRL] = 0xffffffff;
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, ERR_NOT_COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMIT_TMO_SCALE, 0x6);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMIT_TMO_BASE, 0x2);
+}
+
+static void bi_decoder_init_common(uint32_t *reg_state, uint32_t *write_msk)
+{
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CAPABILITY, HDM_D, 0);
+ /* switch dsp must commit the new BI-ID, timeout of 2secs */
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CAPABILITY, EXPLICIT_COMMIT, 1);
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, BI_FW, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, BI_ENABLE, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, COMMIT, 0);
+ write_msk[R_CXL_BI_DECODER_CTRL] = 0xffffffff;
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, ERR_NOT_COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMIT_TMO_SCALE, 0x6);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMIT_TMO_BASE, 0x2);
+}
+
void cxl_component_register_init_common(uint32_t *reg_state,
uint32_t *write_msk,
enum reg_type type)
@@ -323,6 +429,7 @@ void cxl_component_register_init_common(uint32_t *reg_state,
case CXL2_UPSTREAM_PORT:
case CXL2_TYPE3_DEVICE:
case CXL2_LOGICAL_DEVICE:
+ case CXL3_TYPE3_DEVICE:
/* + HDM */
init_cap_reg(HDM, 5, 1);
hdm_init_common(reg_state, write_msk, type);
@@ -340,6 +447,22 @@ void cxl_component_register_init_common(uint32_t *reg_state,
abort();
}
+ /* back invalidate */
+ switch (type) {
+ case CXL2_UPSTREAM_PORT:
+ init_cap_reg(BI_RT, 11, CXL_BI_RT_CAP_VERSION);
+ bi_rt_init_common(reg_state, write_msk);
+ break;
+ case CXL2_ROOT_PORT:
+ case CXL2_DOWNSTREAM_PORT:
+ case CXL3_TYPE3_DEVICE:
+ init_cap_reg(BI_DECODER, 12, CXL_BI_DECODER_CAP_VERSION);
+ bi_decoder_init_common(reg_state, write_msk);
+ break;
+ default:
+ break;
+ }
+
ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
#undef init_cap_reg
}
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 324cf62e8141..ce56126984f3 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -968,6 +968,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
return false;
}
+ if (!ct3d->flitmode && ct3d->hdmdb) {
+ error_setg(errp, "hdm-db requires operating in 256b flit");
+ return false;
+ }
+
if (ct3d->hostvmem) {
MemoryRegion *vmr;
char *v_name;
@@ -1503,7 +1508,8 @@ void ct3d_reset(DeviceState *dev)
pcie_cap_fill_link_ep_usp(PCI_DEVICE(dev), ct3d->width, ct3d->speed,
ct3d->flitmode);
- cxl_component_register_init_common(reg_state, write_msk, CXL2_TYPE3_DEVICE);
+ cxl_component_register_init_common(reg_state, write_msk, ct3d->hdmdb ?
+ CXL3_TYPE3_DEVICE : CXL2_TYPE3_DEVICE);
cxl_device_register_init_t3(ct3d, CXL_T3_MSIX_MBOX);
/*
@@ -1543,6 +1549,7 @@ static const Property ct3_props[] = {
width, PCIE_LINK_WIDTH_16),
DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", CXLType3Dev, flitmode, 0),
DEFINE_PROP_UINT16("chmu-port", CXLType3Dev, cxl_dstate.chmu[0].port, 0),
+ DEFINE_PROP_BOOL("hdm-db", CXLType3Dev, hdmdb, 0),
};
static uint64_t get_lsa_size(CXLType3Dev *ct3d)
diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
index cd92cb02532a..4872ec5ead39 100644
--- a/include/hw/cxl/cxl_component.h
+++ b/include/hw/cxl/cxl_component.h
@@ -29,6 +29,7 @@ enum reg_type {
CXL2_UPSTREAM_PORT,
CXL2_DOWNSTREAM_PORT,
CXL3_SWITCH_MAILBOX_CCI,
+ CXL3_TYPE3_DEVICE, /* hdm-db */
};
/*
@@ -67,6 +68,8 @@ CXLx_CAPABILITY_HEADER(LINK, 2)
CXLx_CAPABILITY_HEADER(HDM, 3)
CXLx_CAPABILITY_HEADER(EXTSEC, 4)
CXLx_CAPABILITY_HEADER(SNOOP, 5)
+CXLx_CAPABILITY_HEADER(BI_RT, 6)
+CXLx_CAPABILITY_HEADER(BI_DECODER, 7)
/*
* Capability structures contain the actual registers that the CXL component
@@ -211,10 +214,56 @@ HDM_DECODER_INIT(3);
(CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
#define CXL_SNOOP_REGISTERS_SIZE 0x8
-QEMU_BUILD_BUG_MSG((CXL_SNOOP_REGISTERS_OFFSET +
- CXL_SNOOP_REGISTERS_SIZE) >= 0x1000,
+#define CXL_BI_RT_CAP_VERSION 1
+#define CXL_BI_RT_REGISTERS_OFFSET \
+ (CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE)
+#define CXL_BI_RT_REGISTERS_SIZE 0xC
+
+REG32(CXL_BI_RT_CAPABILITY, CXL_BI_RT_REGISTERS_OFFSET)
+ FIELD(CXL_BI_RT_CAPABILITY, EXPLICIT_COMMIT, 0, 1)
+REG32(CXL_BI_RT_CTRL, CXL_BI_RT_REGISTERS_OFFSET + 0x4)
+ FIELD(CXL_BI_RT_CTRL, COMMIT, 0, 1)
+REG32(CXL_BI_RT_STATUS, CXL_BI_RT_REGISTERS_OFFSET + 0x8)
+ FIELD(CXL_BI_RT_STATUS, COMMITTED, 0, 1)
+ FIELD(CXL_BI_RT_STATUS, ERR_NOT_COMMITTED, 1, 1)
+ FIELD(CXL_BI_RT_STATUS, COMMIT_TMO_SCALE, 8, 4)
+ FIELD(CXL_BI_RT_STATUS, COMMIT_TMO_BASE, 12, 4)
+
+/* CXL r3.2 8.2.4.27 - CXL BI Decoder Capability Structure */
+#define CXL_BI_DECODER_CAP_VERSION 1
+#define CXL_BI_DECODER_REGISTERS_OFFSET \
+ (CXL_BI_RT_REGISTERS_OFFSET + CXL_BI_RT_REGISTERS_SIZE)
+#define CXL_BI_DECODER_REGISTERS_SIZE 0xC
+
+REG32(CXL_BI_DECODER_CAPABILITY, CXL_BI_DECODER_REGISTERS_OFFSET)
+ FIELD(CXL_BI_DECODER_CAPABILITY, HDM_D, 0, 1)
+ FIELD(CXL_BI_DECODER_CAPABILITY, EXPLICIT_COMMIT, 1, 1)
+REG32(CXL_BI_DECODER_CTRL, CXL_BI_DECODER_REGISTERS_OFFSET + 0x4)
+ FIELD(CXL_BI_DECODER_CTRL, BI_FW, 0, 1)
+ FIELD(CXL_BI_DECODER_CTRL, BI_ENABLE, 1, 1)
+ FIELD(CXL_BI_DECODER_CTRL, COMMIT, 2, 1)
+REG32(CXL_BI_DECODER_STATUS, CXL_BI_DECODER_REGISTERS_OFFSET + 0x8)
+ FIELD(CXL_BI_DECODER_STATUS, COMMITTED, 0, 1)
+ FIELD(CXL_BI_DECODER_STATUS, ERR_NOT_COMMITTED, 1, 1)
+ FIELD(CXL_BI_DECODER_STATUS, COMMIT_TMO_SCALE, 8, 4)
+ FIELD(CXL_BI_DECODER_STATUS, COMMIT_TMO_BASE, 12, 4)
+
+QEMU_BUILD_BUG_MSG((CXL_BI_DECODER_REGISTERS_OFFSET +
+ CXL_BI_DECODER_REGISTERS_SIZE) >= 0x1000,
"No space for registers");
+/* to track BI explicit commit handling */
+enum {
+ CXL_BISTATE_RT = 0, /* switch usp */
+ CXL_BISTATE_DECODER, /* switch dsp */
+ CXL_BISTATE_MAX
+};
+
+typedef struct bi_state {
+ /* last 0->1 transition */
+ uint64_t last_commit;
+} BIState;
+
typedef struct component_registers {
/*
* Main memory region to be registered with QEMU core.
@@ -260,6 +309,7 @@ typedef struct cxl_component {
CDATObject cdat;
CXLCompObject compliance;
+ BIState bi_state[CXL_BISTATE_MAX]; /* for usp+dsp switches */
} CXLComponentState;
void cxl_component_register_block_init(Object *obj,
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 4c9d2247cf02..d0462671609f 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -841,6 +841,9 @@ struct CXLType3Dev {
CXLMemSparingReadAttrs rank_sparing_attrs;
CXLMemSparingWriteAttrs rank_sparing_wr_attrs;
+ /* BI flows */
+ bool hdmdb;
+
struct dynamic_capacity {
HostMemoryBackend *host_dc;
AddressSpace host_dc_as;
--
2.39.5
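
Editorial note on the commit timeout defaults in the patch above: both STATUS registers are initialised with COMMIT_TMO_SCALE = 0x6 and COMMIT_TMO_BASE = 0x2, and the code comments describe the result as a 2 second timeout. The sketch below shows how those two fields could combine into that value. It assumes the scale field is a power-of-ten multiplier in microseconds (0 = 1us, 6 = 1s), which is consistent with the "2secs" comment but is not spelled out in the patch. The helper names are hypothetical and not part of QEMU.

```c
#include <stdint.h>

/* Field positions taken from the FIELD() definitions in the patch. */
#define BI_TMO_SCALE_SHIFT 8
#define BI_TMO_BASE_SHIFT  12
#define BI_TMO_FIELD_MASK  0xfu

/*
 * Hypothetical helper: decode the commit timeout in microseconds.
 * Assumption: scale n means 10^n microseconds (0 = 1us ... 7 = 10s).
 * Scale values above 7 are treated as invalid and yield 0.
 */
static uint64_t bi_commit_timeout_us(uint32_t status)
{
    uint32_t scale = (status >> BI_TMO_SCALE_SHIFT) & BI_TMO_FIELD_MASK;
    uint32_t base = (status >> BI_TMO_BASE_SHIFT) & BI_TMO_FIELD_MASK;
    uint64_t unit_us = 1;
    uint32_t i;

    if (scale > 7) {
        return 0;
    }
    for (i = 0; i < scale; i++) {
        unit_us *= 10;
    }
    return (uint64_t)base * unit_us;
}

/* Status value carrying the defaults from the patch: scale 0x6, base 0x2. */
static uint32_t bi_default_status(void)
{
    return (0x6u << BI_TMO_SCALE_SHIFT) | (0x2u << BI_TMO_BASE_SHIFT);
}
```

Under that assumption, bi_commit_timeout_us(bi_default_status()) evaluates to 2000000 microseconds, matching the comment in the patch.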
* Re: [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions
2025-08-06 5:57 ` [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions Davidlohr Bueso
@ 2025-08-07 0:06 ` Davidlohr Bueso
2025-08-08 15:47 ` Jonathan Cameron
1 sibling, 0 replies; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-07 0:06 UTC (permalink / raw)
To: jonathan.cameron; +Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel
On Tue, 05 Aug 2025, Davidlohr Bueso wrote:
>diff --git a/qemu-options.hx b/qemu-options.hx
>index 1f862b19a676..ef6072bd8b59 100644
>--- a/qemu-options.hx
>+++ b/qemu-options.hx
>@@ -142,7 +142,7 @@ SRST
> -machine memory-backend=pc.ram
> -m 512M
>
>- ``cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]``
>+ ``cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity,restrictions=restrictions]``
> Define a CXL Fixed Memory Window (CFMW).
>
> Described in the CXL 2.0 ECN: CEDT CFMWS & QTG _DSM.
>@@ -168,6 +168,8 @@ SRST
> interleave. Default 256 (bytes). Only 256, 512, 1k, 2k,
> 4k, 8k and 16k granularities supported.
>
>+ ``restrictions=restrictions`` bitmask of restrictions of the CFMW.
Hmm, so there is a doc build error I missed:
qemu-options.hx:212:Block quote ends without a blank line; unexpected unindent.
------
diff --git a/qemu-options.hx b/qemu-options.hx
index ef6072bd8b59..da642642eafc 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -168,7 +168,7 @@ SRST
interleave. Default 256 (bytes). Only 256, 512, 1k, 2k,
4k, 8k and 16k granularities supported.
- ``restrictions=restrictions`` bitmask of restrictions of the CFMW.
+ ``restrictions=restrictions`` bitmask of the restrictions of the CFMW.
Example:
>+
> Example:
>
> ::
>--
>2.39.5
>
* Re: [PATCH 1/4] hw/pcie: Support enabling flit mode
2025-08-06 5:57 ` [PATCH 1/4] hw/pcie: Support enabling flit mode Davidlohr Bueso
@ 2025-08-08 15:42 ` Jonathan Cameron
2025-08-08 17:45 ` Davidlohr Bueso
2025-08-08 18:18 ` Markus Armbruster
2025-08-08 16:02 ` Jonathan Cameron
1 sibling, 2 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-08-08 15:42 UTC (permalink / raw)
To: Davidlohr Bueso
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Michael S. Tsirkin, Marcel Apfelbaum, Markus Armbruster,
Michael Roth
On Tue, 5 Aug 2025 22:57:05 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:
> As with the link speed and width training, have ad-hoc property for
> setting the flit mode and allow CXL components to make use of it.
>
> For the CXL root port and dsp cases, always report flit mode but
> the actual value after 'training' will depend on the downstream
> device configuration.
>
> Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Hi Davidlohr,
This looks a bit like an interface that evolved, but in the end
you seem to have something that is a simple boolean property.
As such you can avoid a fair bit of complexity.
Look for disable-acs for an example.
I don't know if it is desirable to make it an explicit type or not,
but my gut says boolean is fine here.
+CC A few potentially relevant people to answer that question more
definitively.
> ---
> hw/core/qdev-properties-system.c | 11 +++++++++++
> hw/mem/cxl_type3.c | 4 +++-
> hw/pci-bridge/cxl_downstream.c | 1 +
> hw/pci-bridge/cxl_root_port.c | 1 +
> hw/pci-bridge/cxl_upstream.c | 3 ++-
> hw/pci-bridge/gen_pcie_root_port.c | 1 +
> hw/pci/pcie.c | 13 +++++++++----
> include/hw/cxl/cxl_device.h | 1 +
> include/hw/pci-bridge/cxl_upstream_port.h | 1 +
> include/hw/pci/pcie.h | 2 +-
> include/hw/pci/pcie_port.h | 1 +
> include/hw/qdev-properties-system.h | 3 +++
> qapi/common.json | 14 ++++++++++++++
> 13 files changed, 49 insertions(+), 7 deletions(-)
>
> diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
> index 24e145d87001..94a1b754ecdc 100644
> --- a/hw/core/qdev-properties-system.c
> +++ b/hw/core/qdev-properties-system.c
> @@ -1172,6 +1172,17 @@ const PropertyInfo qdev_prop_pcie_link_width = {
> .set_default_value = qdev_propinfo_set_default_value_enum,
> };
>
> +/* --- Flit mode --- */
> +
> +const PropertyInfo qdev_prop_pcie_link_flit = {
> + .type = "PCIELinkFlit",
> + .description = "off/on",
> + .enum_table = &PCIELinkFlit_lookup,
> + .get = qdev_propinfo_get_enum,
> + .set = qdev_propinfo_set_enum,
Just adding extra indent to these two doesn't seem particularly useful.
Feels like a qdev_prop_bool would work fine here given it's on / off.
> + .set_default_value = qdev_propinfo_set_default_value_enum,
> +};
> +
> /* --- UUID --- */
>
> static void get_uuid(Object *obj, Visitor *v, const char *name, void *opaque,
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index c4658e0955d5..324cf62e8141 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1501,7 +1501,8 @@ void ct3d_reset(DeviceState *dev)
> uint32_t *reg_state = ct3d->cxl_cstate.crb.cache_mem_registers;
> uint32_t *write_msk = ct3d->cxl_cstate.crb.cache_mem_regs_write_mask;
>
> - pcie_cap_fill_link_ep_usp(PCI_DEVICE(dev), ct3d->width, ct3d->speed);
> + pcie_cap_fill_link_ep_usp(PCI_DEVICE(dev), ct3d->width, ct3d->speed,
> + ct3d->flitmode);
> cxl_component_register_init_common(reg_state, write_msk, CXL2_TYPE3_DEVICE);
> cxl_device_register_init_t3(ct3d, CXL_T3_MSIX_MBOX);
>
> @@ -1540,6 +1541,7 @@ static const Property ct3_props[] = {
> speed, PCIE_LINK_SPEED_32),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLType3Dev,
> width, PCIE_LINK_WIDTH_16),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", CXLType3Dev, flitmode, 0),
DEFINE_PROP_BOOL("256b-flit", CXLType3Dev, flitmode, false),
> DEFINE_PROP_UINT16("chmu-port", CXLType3Dev, cxl_dstate.chmu[0].port, 0),
> };
>
> diff --git a/hw/pci-bridge/cxl_downstream.c b/hw/pci-bridge/cxl_downstream.c
> index 6aa8586f0161..82e6618b111c 100644
> --- a/hw/pci-bridge/cxl_downstream.c
> +++ b/hw/pci-bridge/cxl_downstream.c
> @@ -257,6 +257,7 @@ static const Property cxl_dsp_props[] = {
> speed, PCIE_LINK_SPEED_64),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_16),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 1),
> };
>
> static void cxl_dsp_class_init(ObjectClass *oc, const void *data)
> diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
> index f035987b6f1f..721dc3981fd2 100644
> --- a/hw/pci-bridge/cxl_root_port.c
> +++ b/hw/pci-bridge/cxl_root_port.c
> @@ -235,6 +235,7 @@ static const Property gen_rp_props[] = {
> speed, PCIE_LINK_SPEED_64),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_32),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 1),
> };
>
> static void cxl_rp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
> diff --git a/hw/pci-bridge/cxl_upstream.c b/hw/pci-bridge/cxl_upstream.c
> index c2150afff39b..5e6d559a3215 100644
> --- a/hw/pci-bridge/cxl_upstream.c
> +++ b/hw/pci-bridge/cxl_upstream.c
> @@ -147,7 +147,7 @@ static void cxl_usp_reset(DeviceState *qdev)
>
> pci_bridge_reset(qdev);
> pcie_cap_deverr_reset(d);
> - pcie_cap_fill_link_ep_usp(d, usp->width, usp->speed);
> + pcie_cap_fill_link_ep_usp(d, usp->width, usp->speed, usp->flitmode);
> latch_registers(usp);
> }
>
> @@ -433,6 +433,7 @@ static const Property cxl_upstream_props[] = {
> speed, PCIE_LINK_SPEED_32),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLUpstreamPort,
> width, PCIE_LINK_WIDTH_16),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", CXLUpstreamPort, flitmode, 0),
> };
>
> static void cxl_upstream_class_init(ObjectClass *oc, const void *data)
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index d9078e783bf0..c00be8147c2a 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -145,6 +145,7 @@ static const Property gen_rp_props[] = {
> speed, PCIE_LINK_SPEED_16),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_32),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 0),
> };
>
> static void gen_rp_dev_class_init(ObjectClass *klass, const void *data)
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index eaeb68894e6e..55e0f7110ae5 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -113,7 +113,7 @@ pcie_cap_v1_fill(PCIDevice *dev, uint8_t port, uint8_t type, uint8_t version)
>
> /* Includes setting the target speed default */
> static void pcie_cap_fill_lnk(uint8_t *exp_cap, PCIExpLinkWidth width,
> - PCIExpLinkSpeed speed)
> + PCIExpLinkSpeed speed, PCIELinkFlit flitmode)
> {
> /* Clear and fill LNKCAP from what was configured above */
> pci_long_test_and_clear_mask(exp_cap + PCI_EXP_LNKCAP,
> @@ -158,10 +158,15 @@ static void pcie_cap_fill_lnk(uint8_t *exp_cap, PCIExpLinkWidth width,
> PCI_EXP_LNKCAP2_SLS_64_0GB);
> }
> }
> +
> + if (flitmode) {
> + pci_long_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA2,
> + PCI_EXP_LNKSTA2_FLIT);
> + }
> }
>
> void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
> - PCIExpLinkSpeed speed)
> + PCIExpLinkSpeed speed, PCIELinkFlit flitmode)
> {
> uint8_t *exp_cap = dev->config + dev->exp.exp_cap;
>
> @@ -175,7 +180,7 @@ void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
> QEMU_PCI_EXP_LNKSTA_NLW(width) |
> QEMU_PCI_EXP_LNKSTA_CLS(speed));
>
> - pcie_cap_fill_lnk(exp_cap, width, speed);
> + pcie_cap_fill_lnk(exp_cap, width, speed, flitmode);
> }
>
> static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
> @@ -212,7 +217,7 @@ static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
> /* the PCI_EXP_LNKSTA_DLLLA will be set in the hotplug function */
> }
>
> - pcie_cap_fill_lnk(exp_cap, s->width, s->speed);
> + pcie_cap_fill_lnk(exp_cap, s->width, s->speed, s->flitmode);
> }
>
> int pcie_cap_init(PCIDevice *dev, uint8_t offset,
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 068c20d61ebc..4c9d2247cf02 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -796,6 +796,7 @@ struct CXLType3Dev {
> /* PCIe link characteristics */
> PCIExpLinkSpeed speed;
> PCIExpLinkWidth width;
> + PCIELinkFlit flitmode;
>
> /* DOE */
> DOECap doe_cdat;
> diff --git a/include/hw/pci-bridge/cxl_upstream_port.h b/include/hw/pci-bridge/cxl_upstream_port.h
> index db1dfb6afd98..584e43c37291 100644
> --- a/include/hw/pci-bridge/cxl_upstream_port.h
> +++ b/include/hw/pci-bridge/cxl_upstream_port.h
> @@ -20,6 +20,7 @@ typedef struct CXLUpstreamPort {
>
> PCIExpLinkSpeed speed;
> PCIExpLinkWidth width;
> + PCIELinkFlit flitmode;
>
> DOECap doe_cdat;
> uint64_t sn;
> diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> index ff6ce08e135a..82fcbc9f8823 100644
> --- a/include/hw/pci/pcie.h
> +++ b/include/hw/pci/pcie.h
> @@ -142,7 +142,7 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset);
> void pcie_dev_ser_num_init(PCIDevice *dev, uint16_t offset, uint64_t ser_num);
> void pcie_ats_init(PCIDevice *dev, uint16_t offset, bool aligned);
> void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
> - PCIExpLinkSpeed speed);
> + PCIExpLinkSpeed speed, PCIELinkFlit flitmode);
>
> void pcie_cap_slot_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
> Error **errp);
> diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
> index 7cd7af8cfa4b..2f96fc685729 100644
> --- a/include/hw/pci/pcie_port.h
> +++ b/include/hw/pci/pcie_port.h
> @@ -58,6 +58,7 @@ struct PCIESlot {
>
> PCIExpLinkSpeed speed;
> PCIExpLinkWidth width;
> + PCIELinkFlit flitmode;
bool probably fine.
>
> /* Disable ACS (really for a pcie_root_port) */
> bool disable_acs;
> diff --git a/qapi/common.json b/qapi/common.json
> index 0e3a0bbbfb0b..da047fbf874f 100644
> --- a/qapi/common.json
> +++ b/qapi/common.json
> @@ -140,6 +140,20 @@
> { 'enum': 'PCIELinkWidth',
> 'data': [ '1', '2', '4', '8', '12', '16', '32' ] }
>
Hmm. Not sure why these are here rather than pci.json.
> +##
> +# @PCIELinkFlit:
> +#
> +# An enumeration of PCIe link FLIT mode
Bit odd having an enumeration for 'on' vs 'off'
> +#
> +# @off: the link is not operating in FLIT mode
> +#
> +# @on: each FLIT is a fixed 256 bytes in size
> +#
> +# Since: 10.0
That was a while back.
> +##
> +{ 'enum': 'PCIELinkFlit',
> + 'data': [ 'off', 'on'] }
> +
> ##
> # @HostMemPolicy:
> #
* Re: [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions
2025-08-06 5:57 ` [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions Davidlohr Bueso
2025-08-07 0:06 ` Davidlohr Bueso
@ 2025-08-08 15:47 ` Jonathan Cameron
1 sibling, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-08-08 15:47 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel
On Tue, 5 Aug 2025 22:57:07 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:
> Update the CFMW restrictions to also permit Back-Invalidate
> flows by default, which is aligned with the no-restrictions
> policy.
>
> While at it, document the 'restrictions=' option.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
I sat on the original restrictions patch without trying to
upstream it, on the basis that it's a horrible interface.
Time to clean my mess up, I guess; then I'll fix this up on top.
We probably want it to enable everything by default (other
than the fixed device config one) and then provide boolean
properties to turn things off. For now I can't see a reason
to have the fixed device config as a possibility.
Jonathan
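
Editorial note on the default discussed above: the quoted patch changes the default restrictions value from 0xf to 0x2f, which sets every documented bit except BIT(4), Fixed Device Config. The sketch below spells that out. Bits 2 to 5 follow the list in the quoted qapi/machine.json hunk; the names for bits 0 and 1 do not appear in this thread and are assumptions based on the CFMWS window restrictions field. All macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 0 and 1: names assumed, not shown in the quoted hunk. */
#define CFMW_R_DEVICE_COHERENT    (1u << 0)
#define CFMW_R_HOST_ONLY_COHERENT (1u << 1)
/* Bits 2 to 5: as listed in the quoted qapi/machine.json hunk. */
#define CFMW_R_VOLATILE           (1u << 2)
#define CFMW_R_PERSISTENT         (1u << 3)
#define CFMW_R_FIXED_DEV_CONFIG   (1u << 4)
#define CFMW_R_BI                 (1u << 5)

/* Everything enabled except the fixed device configuration bit. */
static uint32_t cfmw_default_restrictions(void)
{
    return CFMW_R_DEVICE_COHERENT | CFMW_R_HOST_ONLY_COHERENT |
           CFMW_R_VOLATILE | CFMW_R_PERSISTENT | CFMW_R_BI;
}

/* True if the window permits Back-Invalidate flows. */
static bool cfmw_allows_bi(uint32_t restrictions)
{
    return (restrictions & CFMW_R_BI) != 0;
}
```

With these definitions the default evaluates to 0x2f, and the previous default of 0xf does not permit BI. Per-feature boolean properties, as suggested above, would replace hand-built masks like this one.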
> ---
> hw/cxl/cxl-host.c | 2 +-
> qapi/machine.json | 3 ++-
> qemu-options.hx | 4 +++-
> 3 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c
> index def2cf75be61..0d17ea3e4c26 100644
> --- a/hw/cxl/cxl-host.c
> +++ b/hw/cxl/cxl-host.c
> @@ -64,7 +64,7 @@ static void cxl_fixed_memory_window_config(CXLFixedMemoryWindowOptions *object,
> if (object->has_restrictions) {
> fw->restrictions = object->restrictions;
> } else {
> - fw->restrictions = 0xf; /* No restrictions */
> + fw->restrictions = 0x2f; /* No restrictions */
> }
>
> fw->targets = g_malloc0_n(fw->num_targets, sizeof(*fw->targets));
> diff --git a/qapi/machine.json b/qapi/machine.json
> index ac258578e4ab..ea8ba71305b0 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -589,7 +589,8 @@
> # BIT(2) - Volatile
> # BIT(3) - Persistent
> # BIT(4) - Fixed Device Config
> -# Default is 0xF
> +# BIT(5) - BI
> +# Default is 0x2F
> #
> # @targets: Target root bridge IDs from -device ...,id=<ID> for each
> # root bridge.
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 1f862b19a676..ef6072bd8b59 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -142,7 +142,7 @@ SRST
> -machine memory-backend=pc.ram
> -m 512M
>
> - ``cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]``
> + ``cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity,restrictions=restrictions]``
> Define a CXL Fixed Memory Window (CFMW).
>
> Described in the CXL 2.0 ECN: CEDT CFMWS & QTG _DSM.
> @@ -168,6 +168,8 @@ SRST
> interleave. Default 256 (bytes). Only 256, 512, 1k, 2k,
> 4k, 8k and 16k granularities supported.
>
> + ``restrictions=restrictions`` bitmask of restrictions of the CFMW.
> +
> Example:
>
> ::
* Re: [PATCH 1/4] hw/pcie: Support enabling flit mode
2025-08-06 5:57 ` [PATCH 1/4] hw/pcie: Support enabling flit mode Davidlohr Bueso
2025-08-08 15:42 ` Jonathan Cameron
@ 2025-08-08 16:02 ` Jonathan Cameron
1 sibling, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-08-08 16:02 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel
On Tue, 5 Aug 2025 22:57:05 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:
> As with the link speed and width training, have ad-hoc property for
> setting the flit mode and allow CXL components to make use of it.
>
> For the CXL root port and dsp cases, always report flit mode but
> the actual value after 'training' will depend on the downstream
> device configuration.
Hi Davidlohr,
I'm not immediately spotting how this bit works. I was expecting
to see some code in pcie_sync_bridge_lnk() that updates the Link
Status registers to indicate whether the link 'trained' in flit
mode or not.
What am I missing?
J
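
Editorial note on the question above: the commit message says root ports and DSPs always report flit mode while the value after 'training' depends on the downstream device. One way to model that outcome is sketched below, assuming the link ends up in flit mode only when both ends support it. The struct and function are hypothetical stand-ins for illustration; this is not what pcie_sync_bridge_lnk() currently does and not necessarily how a later revision would implement it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one end of a link. */
typedef struct LinkEnd {
    bool flit_capable; /* e.g. the value of a "256b-flit" property */
} LinkEnd;

/*
 * Assumption: the link trains in flit mode only if both the
 * downstream-facing port and the attached device support it.
 */
static bool link_trains_in_flit_mode(const LinkEnd *port, const LinkEnd *dev)
{
    return port->flit_capable && dev->flit_capable;
}
```

In this model a root port with flit mode enabled and a type 3 device left at its default of off would train in non-flit mode, so the port's Link Status would need to reflect the device's setting rather than its own.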
>
> Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
> hw/core/qdev-properties-system.c | 11 +++++++++++
> hw/mem/cxl_type3.c | 4 +++-
> hw/pci-bridge/cxl_downstream.c | 1 +
> hw/pci-bridge/cxl_root_port.c | 1 +
> hw/pci-bridge/cxl_upstream.c | 3 ++-
> hw/pci-bridge/gen_pcie_root_port.c | 1 +
> hw/pci/pcie.c | 13 +++++++++----
> include/hw/cxl/cxl_device.h | 1 +
> include/hw/pci-bridge/cxl_upstream_port.h | 1 +
> include/hw/pci/pcie.h | 2 +-
> include/hw/pci/pcie_port.h | 1 +
> include/hw/qdev-properties-system.h | 3 +++
> qapi/common.json | 14 ++++++++++++++
> 13 files changed, 49 insertions(+), 7 deletions(-)
>
> diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
> index 24e145d87001..94a1b754ecdc 100644
> --- a/hw/core/qdev-properties-system.c
> +++ b/hw/core/qdev-properties-system.c
> @@ -1172,6 +1172,17 @@ const PropertyInfo qdev_prop_pcie_link_width = {
> .set_default_value = qdev_propinfo_set_default_value_enum,
> };
>
> +/* --- Flit mode --- */
> +
> +const PropertyInfo qdev_prop_pcie_link_flit = {
> + .type = "PCIELinkFlit",
> + .description = "off/on",
> + .enum_table = &PCIELinkFlit_lookup,
> + .get = qdev_propinfo_get_enum,
> + .set = qdev_propinfo_set_enum,
> + .set_default_value = qdev_propinfo_set_default_value_enum,
> +};
> +
> /* --- UUID --- */
>
> static void get_uuid(Object *obj, Visitor *v, const char *name, void *opaque,
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index c4658e0955d5..324cf62e8141 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1501,7 +1501,8 @@ void ct3d_reset(DeviceState *dev)
> uint32_t *reg_state = ct3d->cxl_cstate.crb.cache_mem_registers;
> uint32_t *write_msk = ct3d->cxl_cstate.crb.cache_mem_regs_write_mask;
>
> - pcie_cap_fill_link_ep_usp(PCI_DEVICE(dev), ct3d->width, ct3d->speed);
> + pcie_cap_fill_link_ep_usp(PCI_DEVICE(dev), ct3d->width, ct3d->speed,
> + ct3d->flitmode);
> cxl_component_register_init_common(reg_state, write_msk, CXL2_TYPE3_DEVICE);
> cxl_device_register_init_t3(ct3d, CXL_T3_MSIX_MBOX);
>
> @@ -1540,6 +1541,7 @@ static const Property ct3_props[] = {
> speed, PCIE_LINK_SPEED_32),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLType3Dev,
> width, PCIE_LINK_WIDTH_16),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", CXLType3Dev, flitmode, 0),
> DEFINE_PROP_UINT16("chmu-port", CXLType3Dev, cxl_dstate.chmu[0].port, 0),
> };
>
> diff --git a/hw/pci-bridge/cxl_downstream.c b/hw/pci-bridge/cxl_downstream.c
> index 6aa8586f0161..82e6618b111c 100644
> --- a/hw/pci-bridge/cxl_downstream.c
> +++ b/hw/pci-bridge/cxl_downstream.c
> @@ -257,6 +257,7 @@ static const Property cxl_dsp_props[] = {
> speed, PCIE_LINK_SPEED_64),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_16),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 1),
> };
>
> static void cxl_dsp_class_init(ObjectClass *oc, const void *data)
> diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
> index f035987b6f1f..721dc3981fd2 100644
> --- a/hw/pci-bridge/cxl_root_port.c
> +++ b/hw/pci-bridge/cxl_root_port.c
> @@ -235,6 +235,7 @@ static const Property gen_rp_props[] = {
> speed, PCIE_LINK_SPEED_64),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_32),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 1),
> };
>
> static void cxl_rp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
> diff --git a/hw/pci-bridge/cxl_upstream.c b/hw/pci-bridge/cxl_upstream.c
> index c2150afff39b..5e6d559a3215 100644
> --- a/hw/pci-bridge/cxl_upstream.c
> +++ b/hw/pci-bridge/cxl_upstream.c
> @@ -147,7 +147,7 @@ static void cxl_usp_reset(DeviceState *qdev)
>
> pci_bridge_reset(qdev);
> pcie_cap_deverr_reset(d);
> - pcie_cap_fill_link_ep_usp(d, usp->width, usp->speed);
> + pcie_cap_fill_link_ep_usp(d, usp->width, usp->speed, usp->flitmode);
> latch_registers(usp);
> }
>
> @@ -433,6 +433,7 @@ static const Property cxl_upstream_props[] = {
> speed, PCIE_LINK_SPEED_32),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", CXLUpstreamPort,
> width, PCIE_LINK_WIDTH_16),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", CXLUpstreamPort, flitmode, 0),
> };
>
> static void cxl_upstream_class_init(ObjectClass *oc, const void *data)
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index d9078e783bf0..c00be8147c2a 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -145,6 +145,7 @@ static const Property gen_rp_props[] = {
> speed, PCIE_LINK_SPEED_16),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_32),
> + DEFINE_PROP_PCIE_LINK_FLIT("256b-flit", PCIESlot, flitmode, 0),
> };
>
> static void gen_rp_dev_class_init(ObjectClass *klass, const void *data)
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index eaeb68894e6e..55e0f7110ae5 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -113,7 +113,7 @@ pcie_cap_v1_fill(PCIDevice *dev, uint8_t port, uint8_t type, uint8_t version)
>
> /* Includes setting the target speed default */
> static void pcie_cap_fill_lnk(uint8_t *exp_cap, PCIExpLinkWidth width,
> - PCIExpLinkSpeed speed)
> + PCIExpLinkSpeed speed, PCIELinkFlit flitmode)
> {
> /* Clear and fill LNKCAP from what was configured above */
> pci_long_test_and_clear_mask(exp_cap + PCI_EXP_LNKCAP,
> @@ -158,10 +158,15 @@ static void pcie_cap_fill_lnk(uint8_t *exp_cap, PCIExpLinkWidth width,
> PCI_EXP_LNKCAP2_SLS_64_0GB);
> }
> }
> +
> + if (flitmode) {
> + pci_long_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA2,
> + PCI_EXP_LNKSTA2_FLIT);
> + }
> }
>
> void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
> - PCIExpLinkSpeed speed)
> + PCIExpLinkSpeed speed, PCIELinkFlit flitmode)
> {
> uint8_t *exp_cap = dev->config + dev->exp.exp_cap;
>
> @@ -175,7 +180,7 @@ void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
> QEMU_PCI_EXP_LNKSTA_NLW(width) |
> QEMU_PCI_EXP_LNKSTA_CLS(speed));
>
> - pcie_cap_fill_lnk(exp_cap, width, speed);
> + pcie_cap_fill_lnk(exp_cap, width, speed, flitmode);
> }
>
> static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
> @@ -212,7 +217,7 @@ static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
> /* the PCI_EXP_LNKSTA_DLLLA will be set in the hotplug function */
> }
>
> - pcie_cap_fill_lnk(exp_cap, s->width, s->speed);
> + pcie_cap_fill_lnk(exp_cap, s->width, s->speed, s->flitmode);
> }
>
> int pcie_cap_init(PCIDevice *dev, uint8_t offset,
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 068c20d61ebc..4c9d2247cf02 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -796,6 +796,7 @@ struct CXLType3Dev {
> /* PCIe link characteristics */
> PCIExpLinkSpeed speed;
> PCIExpLinkWidth width;
> + PCIELinkFlit flitmode;
>
> /* DOE */
> DOECap doe_cdat;
> diff --git a/include/hw/pci-bridge/cxl_upstream_port.h b/include/hw/pci-bridge/cxl_upstream_port.h
> index db1dfb6afd98..584e43c37291 100644
> --- a/include/hw/pci-bridge/cxl_upstream_port.h
> +++ b/include/hw/pci-bridge/cxl_upstream_port.h
> @@ -20,6 +20,7 @@ typedef struct CXLUpstreamPort {
>
> PCIExpLinkSpeed speed;
> PCIExpLinkWidth width;
> + PCIELinkFlit flitmode;
>
> DOECap doe_cdat;
> uint64_t sn;
> diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> index ff6ce08e135a..82fcbc9f8823 100644
> --- a/include/hw/pci/pcie.h
> +++ b/include/hw/pci/pcie.h
> @@ -142,7 +142,7 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset);
> void pcie_dev_ser_num_init(PCIDevice *dev, uint16_t offset, uint64_t ser_num);
> void pcie_ats_init(PCIDevice *dev, uint16_t offset, bool aligned);
> void pcie_cap_fill_link_ep_usp(PCIDevice *dev, PCIExpLinkWidth width,
> - PCIExpLinkSpeed speed);
> + PCIExpLinkSpeed speed, PCIELinkFlit flitmode);
>
> void pcie_cap_slot_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
> Error **errp);
> diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
> index 7cd7af8cfa4b..2f96fc685729 100644
> --- a/include/hw/pci/pcie_port.h
> +++ b/include/hw/pci/pcie_port.h
> @@ -58,6 +58,7 @@ struct PCIESlot {
>
> PCIExpLinkSpeed speed;
> PCIExpLinkWidth width;
> + PCIELinkFlit flitmode;
>
> /* Disable ACS (really for a pcie_root_port) */
> bool disable_acs;
> diff --git a/include/hw/qdev-properties-system.h b/include/hw/qdev-properties-system.h
> index b921392c5256..dd5dc4515ea7 100644
> --- a/include/hw/qdev-properties-system.h
> +++ b/include/hw/qdev-properties-system.h
> @@ -28,6 +28,7 @@ extern const PropertyInfo qdev_prop_audiodev;
> extern const PropertyInfo qdev_prop_off_auto_pcibar;
> extern const PropertyInfo qdev_prop_pcie_link_speed;
> extern const PropertyInfo qdev_prop_pcie_link_width;
> +extern const PropertyInfo qdev_prop_pcie_link_flit;
> extern const PropertyInfo qdev_prop_cpus390entitlement;
> extern const PropertyInfo qdev_prop_iothread_vq_mapping_list;
> extern const PropertyInfo qdev_prop_endian_mode;
> @@ -80,6 +81,8 @@ extern const PropertyInfo qdev_prop_vmapple_virtio_blk_variant;
> #define DEFINE_PROP_PCIE_LINK_WIDTH(_n, _s, _f, _d) \
> DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_pcie_link_width, \
> PCIExpLinkWidth)
> +#define DEFINE_PROP_PCIE_LINK_FLIT(_n, _s, _f, _d) \
> + DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_pcie_link_flit, PCIELinkFlit)
>
> #define DEFINE_PROP_UUID(_name, _state, _field) \
> DEFINE_PROP(_name, _state, _field, qdev_prop_uuid, QemuUUID, \
> diff --git a/qapi/common.json b/qapi/common.json
> index 0e3a0bbbfb0b..da047fbf874f 100644
> --- a/qapi/common.json
> +++ b/qapi/common.json
> @@ -140,6 +140,20 @@
> { 'enum': 'PCIELinkWidth',
> 'data': [ '1', '2', '4', '8', '12', '16', '32' ] }
>
> +##
> +# @PCIELinkFlit:
> +#
> +# An enumeration of PCIe link FLIT mode
> +#
> +# @off: the link is not operating in FLIT mode
> +#
> +# @on: each FLIT is a fixed 256 bytes in size
> +#
> +# Since: 10.0
> +##
> +{ 'enum': 'PCIELinkFlit',
> + 'data': [ 'off', 'on'] }
> +
> ##
> # @HostMemPolicy:
> #
* Re: [PATCH 1/4] hw/pcie: Support enabling flit mode
2025-08-08 15:42 ` Jonathan Cameron
@ 2025-08-08 17:45 ` Davidlohr Bueso
2025-08-08 18:18 ` Markus Armbruster
1 sibling, 0 replies; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-08 17:45 UTC (permalink / raw)
To: Jonathan Cameron
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Michael S. Tsirkin, Marcel Apfelbaum, Markus Armbruster,
Michael Roth
On Fri, 08 Aug 2025, Jonathan Cameron wrote:
>This looks a bit like an interface that evolved, but in the end
>you seem to have something that is a simple boolean property.
>As such you can avoid a fair bit of complexity.
Yeah, I started out having this as a bool property. But aligning it
with the other link training properties felt right, albeit a bit of
overkill. At one point I considered making automatic training (which
you point out is missing here) an option, so having off/on/auto,
but that also seems like overkill.
>Look for disable-acs for an example.
Ok, so with a counter example, I think you are right.
>I don't know if it is desirable to make it an explicit type or not,
>but my gut says boolean is fine here.
>
>+CC A few potentially relevant people to answer that question more
>definitively.
Unless someone shouts, I will go with just the bool prop.
Thanks,
Davidlohr
* Re: [PATCH 1/4] hw/pcie: Support enabling flit mode
2025-08-08 15:42 ` Jonathan Cameron
2025-08-08 17:45 ` Davidlohr Bueso
@ 2025-08-08 18:18 ` Markus Armbruster
1 sibling, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2025-08-08 18:18 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Davidlohr Bueso, ira.weiny, alucerop, a.manzanares, linux-cxl,
qemu-devel, Michael S. Tsirkin, Marcel Apfelbaum, Michael Roth
Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:
> On Tue, 5 Aug 2025 22:57:05 -0700
> Davidlohr Bueso <dave@stgolabs.net> wrote:
>
>> As with the link speed and width training, have ad-hoc property for
>> setting the flit mode and allow CXL components to make use of it.
>>
>> For the CXL root port and dsp cases, always report flit mode but
>> the actual value after 'training' will depend on the downstream
>> device configuration.
>>
>> Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> Hi Davidlohr,
>
> This looks a bit like an interface that evolved, but in the end
> you seem to have something that is a simple boolean property.
> As such you can avoid a fair bit of complexity.
> Look for disable-acs for an example.
>
>
> I don't know if it is desirable to make it an explicit type or not,
> but my gut says boolean is fine here.
>
> +CC A few potentially relevant people to answer that question more
> definitively.
[...]
>> diff --git a/qapi/common.json b/qapi/common.json
>> index 0e3a0bbbfb0b..da047fbf874f 100644
>> --- a/qapi/common.json
>> +++ b/qapi/common.json
>> @@ -140,6 +140,20 @@
>> { 'enum': 'PCIELinkWidth',
>> 'data': [ '1', '2', '4', '8', '12', '16', '32' ] }
>>
>
> Hmm. Not sure why these are here rather than pci.json.
Pretty sure there was a good reason back then. Less sure there is a
good reason now :)
>> +##
>> +# @PCIELinkFlit:
>> +#
>> +# An enumeration of PCIe link FLIT mode
>
> Bit odd having an enumeration for 'on' vs 'off'
Indeed. Please stick to bool.
>> +#
>> +# @off: the link is not operating in FLIT mode
>> +#
>> +# @on: each FLIT is a fixed 256 bytes in size
>> +#
>> +# Since: 10.0
>
> That was a while back.
>
>> +##
>> +{ 'enum': 'PCIELinkFlit',
>> + 'data': [ 'off', 'on'] }
>> +
>> ##
>> # @HostMemPolicy:
>> #
* [PATCH 4/4] hw/cxl: Support type3 HDM-DB
2025-08-11 3:34 [PATCH v2 -qemu 0/4] hw/cxl: Support Back-Invalidate Davidlohr Bueso
@ 2025-08-11 3:34 ` Davidlohr Bueso
2025-08-11 16:33 ` Jonathan Cameron
0 siblings, 1 reply; 13+ messages in thread
From: Davidlohr Bueso @ 2025-08-11 3:34 UTC (permalink / raw)
To: jonathan.cameron
Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel,
Davidlohr Bueso
Add basic plumbing for memory expander devices that support Back
Invalidation. This introduces a 'hdm-db=on|off' parameter and
exposes the relevant BI RT/Decoder component cachemem registers.
Some noteworthy properties:
- Devices require enabling Flit mode.
- Explicit BI-ID commit is required.
- HDM decoders support both host and device coherency models.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
docs/system/devices/cxl.rst | 23 ++++++
hw/cxl/cxl-component-utils.c | 135 +++++++++++++++++++++++++++++++--
hw/mem/cxl_type3.c | 14 +++-
include/hw/cxl/cxl_component.h | 54 ++++++++++++-
include/hw/cxl/cxl_device.h | 3 +
5 files changed, 217 insertions(+), 12 deletions(-)
diff --git a/docs/system/devices/cxl.rst b/docs/system/devices/cxl.rst
index bf7908429af8..4815de0f2dc4 100644
--- a/docs/system/devices/cxl.rst
+++ b/docs/system/devices/cxl.rst
@@ -384,6 +384,29 @@ An example of 4 devices below a switch suitable for 1, 2 or 4 way interleave::
-device cxl-type3,bus=swport3,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3,sn=0x4 \
-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
+An example of 4 type3 devices with volatile memory below a switch. Two of the devices
+use HDM-DB for coherence::
+
+ qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
+ ...
+ -object memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M \
+ -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M \
+ -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M \
+ -object memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M \
+ -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
+ -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
+ -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
+ -device cxl-upstream,bus=root_port0,id=us0,256b-flit=on \
+ -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
+ -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-mem0,sn=0x1,256b-flit=on,hdm-db=on \
+ -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
+ -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-mem1,sn=0x2,256b-flit=on,hdm-db=on \
+ -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
+ -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-mem2,sn=0x3 \
+ -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
+ -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-mem3,sn=0x4 \
+ -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
+
A simple arm/virt example featuring a single direct connected CXL Type 3
Volatile Memory device::
diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
index a43d227336ca..dfdbf23a427c 100644
--- a/hw/cxl/cxl-component-utils.c
+++ b/hw/cxl/cxl-component-utils.c
@@ -71,10 +71,40 @@ static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
case 4:
if (cregs->special_ops && cregs->special_ops->read) {
return cregs->special_ops->read(cxl_cstate, offset, 4);
- } else {
- QEMU_BUILD_BUG_ON(sizeof(*cregs->cache_mem_registers) != 4);
- return cregs->cache_mem_registers[offset / 4];
}
+
+ QEMU_BUILD_BUG_ON(sizeof(*cregs->cache_mem_registers) != 4);
+
+ if (offset == A_CXL_BI_RT_STATUS ||
+ offset == A_CXL_BI_DECODER_STATUS) {
+ int type;
+ uint64_t started;
+
+ type = (offset == A_CXL_BI_RT_STATUS) ?
+ CXL_BISTATE_RT : CXL_BISTATE_DECODER;
+ started = cxl_cstate->bi_state[type].last_commit;
+
+ if (started) {
+ uint32_t val, *cache_mem = cregs->cache_mem_registers;
+ uint64_t now;
+ int set;
+
+ val = cregs->cache_mem_registers[offset / 4];
+ now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+ /* arbitrary 100 ms to do the commit */
+ set = !!(now >= started + 100);
+
+ if (offset == A_CXL_BI_RT_STATUS) {
+ val = FIELD_DP32(val, CXL_BI_RT_STATUS, COMMITTED, set);
+ } else {
+ val = FIELD_DP32(val, CXL_BI_DECODER_STATUS, COMMITTED,
+ set);
+ }
+ stl_le_p((uint8_t *)cache_mem + offset, val);
+ }
+ }
+
+ return cregs->cache_mem_registers[offset / 4];
case 8:
qemu_log_mask(LOG_UNIMP,
"CXL 8 byte cache mem registers not implemented\n");
@@ -123,6 +153,47 @@ static void dumb_hdm_handler(CXLComponentState *cxl_cstate, hwaddr offset,
}
}
+static void dumb_bi_handler(CXLComponentState *cxl_cstate, hwaddr offset,
+ uint32_t value)
+{
+ ComponentRegisters *cregs = &cxl_cstate->crb;
+ uint32_t sts, *cache_mem = cregs->cache_mem_registers;
+ bool to_commit = false;
+ int type;
+
+ switch (offset) {
+ case A_CXL_BI_RT_CTRL:
+ to_commit = FIELD_EX32(value, CXL_BI_RT_CTRL, COMMIT);
+ if (to_commit) {
+ sts = cxl_cache_mem_read_reg(cxl_cstate,
+ A_CXL_BI_RT_STATUS, 4);
+ sts = FIELD_DP32(sts, CXL_BI_RT_STATUS, COMMITTED, 0);
+ stl_le_p((uint8_t *)cache_mem + A_CXL_BI_RT_STATUS, sts);
+ type = CXL_BISTATE_RT;
+ }
+ break;
+ case A_CXL_BI_DECODER_CTRL:
+ to_commit = FIELD_EX32(value, CXL_BI_DECODER_CTRL, COMMIT);
+ if (to_commit) {
+ sts = cxl_cache_mem_read_reg(cxl_cstate,
+ A_CXL_BI_DECODER_STATUS, 4);
+ sts = FIELD_DP32(sts, CXL_BI_DECODER_STATUS, COMMITTED, 0);
+ stl_le_p((uint8_t *)cache_mem + A_CXL_BI_DECODER_STATUS, sts);
+ type = CXL_BISTATE_DECODER;
+ }
+ break;
+ default:
+ break;
+ }
+
+ if (to_commit) {
+ cxl_cstate->bi_state[type].last_commit =
+ qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+ }
+
+ stl_le_p((uint8_t *)cache_mem + offset, value);
+}
+
static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
unsigned size)
{
@@ -146,6 +217,9 @@ static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
if (offset >= A_CXL_HDM_DECODER_CAPABILITY &&
offset <= A_CXL_HDM_DECODER3_TARGET_LIST_HI) {
dumb_hdm_handler(cxl_cstate, offset, value);
+ } else if (offset == A_CXL_BI_RT_CTRL ||
+ offset == A_CXL_BI_DECODER_CTRL) {
+ dumb_bi_handler(cxl_cstate, offset, value);
} else {
cregs->cache_mem_registers[offset / 4] = value;
}
@@ -248,7 +322,7 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_4K, 1);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
POISON_ON_ERR_CAP, 0);
- if (type == CXL2_TYPE3_DEVICE) {
+ if (type == CXL2_TYPE3_DEVICE || type == CXL3_TYPE3_DEVICE) {
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, 3_6_12_WAY, 1);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, 16_WAY, 1);
} else {
@@ -260,7 +334,8 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
UIO_DECODER_COUNT, 0);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, MEMDATA_NXM_CAP, 0);
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
- SUPPORTED_COHERENCY_MODEL, 0); /* Unknown */
+ SUPPORTED_COHERENCY_MODEL,
+ type == CXL3_TYPE3_DEVICE ? 3:0); /* host+dev or Unknown */
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL,
HDM_DECODER_ENABLE, 0);
write_msk[R_CXL_HDM_DECODER_GLOBAL_CONTROL] = 0x3;
@@ -271,7 +346,7 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
write_msk[R_CXL_HDM_DECODER0_SIZE_HI + i * hdm_inc] = 0xffffffff;
write_msk[R_CXL_HDM_DECODER0_CTRL + i * hdm_inc] = 0x13ff;
if (type == CXL2_DEVICE ||
- type == CXL2_TYPE3_DEVICE ||
+ type == CXL2_TYPE3_DEVICE || type == CXL3_TYPE3_DEVICE ||
type == CXL2_LOGICAL_DEVICE) {
write_msk[R_CXL_HDM_DECODER0_TARGET_LIST_LO + i * hdm_inc] =
0xf0000000;
@@ -283,6 +358,37 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
}
}
+static void bi_rt_init_common(uint32_t *reg_state, uint32_t *write_msk)
+{
+ /* switch usp must commit the new BI-ID, timeout of 2secs */
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_CAPABILITY, EXPLICIT_COMMIT, 1);
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_CTRL, COMMIT, 0);
+ write_msk[R_CXL_BI_RT_CTRL] = 0xffffffff;
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, ERR_NOT_COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMIT_TMO_SCALE, 0x6);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMIT_TMO_BASE, 0x2);
+}
+
+static void bi_decoder_init_common(uint32_t *reg_state, uint32_t *write_msk)
+{
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CAPABILITY, HDM_D, 0);
+ /* switch dsp must commit the new BI-ID, timeout of 2secs */
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CAPABILITY, EXPLICIT_COMMIT, 1);
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, BI_FW, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, BI_ENABLE, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, COMMIT, 0);
+ write_msk[R_CXL_BI_DECODER_CTRL] = 0xffffffff;
+
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, ERR_NOT_COMMITTED, 0);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMIT_TMO_SCALE, 0x6);
+ ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMIT_TMO_BASE, 0x2);
+}
+
void cxl_component_register_init_common(uint32_t *reg_state,
uint32_t *write_msk,
enum reg_type type)
@@ -323,6 +429,7 @@ void cxl_component_register_init_common(uint32_t *reg_state,
case CXL2_UPSTREAM_PORT:
case CXL2_TYPE3_DEVICE:
case CXL2_LOGICAL_DEVICE:
+ case CXL3_TYPE3_DEVICE:
/* + HDM */
init_cap_reg(HDM, 5, 1);
hdm_init_common(reg_state, write_msk, type);
@@ -340,6 +447,22 @@ void cxl_component_register_init_common(uint32_t *reg_state,
abort();
}
+ /* back invalidate */
+ switch (type) {
+ case CXL2_UPSTREAM_PORT:
+ init_cap_reg(BI_RT, 11, CXL_BI_RT_CAP_VERSION);
+ bi_rt_init_common(reg_state, write_msk);
+ break;
+ case CXL2_ROOT_PORT:
+ case CXL2_DOWNSTREAM_PORT:
+ case CXL3_TYPE3_DEVICE:
+ init_cap_reg(BI_DECODER, 12, CXL_BI_DECODER_CAP_VERSION);
+ bi_decoder_init_common(reg_state, write_msk);
+ break;
+ default:
+ break;
+ }
+
ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
#undef init_cap_reg
}
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index ecd3a7703b35..1e55d13c1e93 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -447,6 +447,7 @@ static void build_dvsecs(CXLType3Dev *ct3d)
CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
CXLDVSECRegisterLocator *regloc_dvsec;
uint8_t *dvsec;
+ uint16_t type = ct3d->hdmdb ? CXL3_TYPE3_DEVICE : CXL2_TYPE3_DEVICE;
uint32_t range1_size_hi = 0, range1_size_lo = 0,
range1_base_hi = 0, range1_base_lo = 0,
range2_size_hi = 0, range2_size_lo = 0,
@@ -491,7 +492,7 @@ static void build_dvsecs(CXLType3Dev *ct3d)
.range2_base_hi = range2_base_hi,
.range2_base_lo = range2_base_lo,
};
- cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
+ cxl_component_create_dvsec(cxl_cstate, type,
PCIE_CXL_DEVICE_DVSEC_LENGTH,
PCIE_CXL_DEVICE_DVSEC,
PCIE_CXL31_DEVICE_DVSEC_REVID, dvsec);
@@ -521,14 +522,14 @@ static void build_dvsecs(CXLType3Dev *ct3d)
},
};
- cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
+ cxl_component_create_dvsec(cxl_cstate, type,
REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
REG_LOC_DVSEC_REVID, (uint8_t *)regloc_dvsec);
dvsec = (uint8_t *)&(CXLDVSECDeviceGPF){
.phase2_duration = 0x603, /* 3 seconds */
.phase2_power = 0x33, /* 0x33 miliwatts */
};
- cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
+ cxl_component_create_dvsec(cxl_cstate, type,
GPF_DEVICE_DVSEC_LENGTH, GPF_DEVICE_DVSEC,
GPF_DEVICE_DVSEC_REVID, dvsec);
@@ -539,7 +540,7 @@ static void build_dvsecs(CXLType3Dev *ct3d)
.status = ct3d->flitmode ? 0x6 : 0x26, /* same */
.rcvd_mod_ts_data_phase1 = 0xef, /* WTF? */
};
- cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
+ cxl_component_create_dvsec(cxl_cstate, type,
PCIE_CXL3_FLEXBUS_PORT_DVSEC_LENGTH,
PCIE_FLEXBUS_PORT_DVSEC,
PCIE_CXL3_FLEXBUS_PORT_DVSEC_REVID, dvsec);
@@ -969,6 +970,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
return false;
}
+ if (!ct3d->flitmode && ct3d->hdmdb) {
+ error_setg(errp, "hdm-db requires operating in 256b flit");
+ return false;
+ }
+
if (ct3d->hostvmem) {
MemoryRegion *vmr;
char *v_name;
diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
index cd92cb02532a..acec76152ad0 100644
--- a/include/hw/cxl/cxl_component.h
+++ b/include/hw/cxl/cxl_component.h
@@ -29,6 +29,7 @@ enum reg_type {
CXL2_UPSTREAM_PORT,
CXL2_DOWNSTREAM_PORT,
CXL3_SWITCH_MAILBOX_CCI,
+ CXL3_TYPE3_DEVICE, /* hdm-db */
};
/*
@@ -67,6 +68,8 @@ CXLx_CAPABILITY_HEADER(LINK, 2)
CXLx_CAPABILITY_HEADER(HDM, 3)
CXLx_CAPABILITY_HEADER(EXTSEC, 4)
CXLx_CAPABILITY_HEADER(SNOOP, 5)
+CXLx_CAPABILITY_HEADER(BI_RT, 6)
+CXLx_CAPABILITY_HEADER(BI_DECODER, 7)
/*
* Capability structures contain the actual registers that the CXL component
@@ -211,10 +214,56 @@ HDM_DECODER_INIT(3);
(CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
#define CXL_SNOOP_REGISTERS_SIZE 0x8
-QEMU_BUILD_BUG_MSG((CXL_SNOOP_REGISTERS_OFFSET +
- CXL_SNOOP_REGISTERS_SIZE) >= 0x1000,
+#define CXL_BI_RT_CAP_VERSION 1
+#define CXL_BI_RT_REGISTERS_OFFSET \
+ (CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE)
+#define CXL_BI_RT_REGISTERS_SIZE 0xC
+
+REG32(CXL_BI_RT_CAPABILITY, CXL_BI_RT_REGISTERS_OFFSET)
+ FIELD(CXL_BI_RT_CAPABILITY, EXPLICIT_COMMIT, 0, 1)
+REG32(CXL_BI_RT_CTRL, CXL_BI_RT_REGISTERS_OFFSET + 0x4)
+ FIELD(CXL_BI_RT_CTRL, COMMIT, 0, 1)
+REG32(CXL_BI_RT_STATUS, CXL_BI_RT_REGISTERS_OFFSET + 0x8)
+ FIELD(CXL_BI_RT_STATUS, COMMITTED, 0, 1)
+ FIELD(CXL_BI_RT_STATUS, ERR_NOT_COMMITTED, 1, 1)
+ FIELD(CXL_BI_RT_STATUS, COMMIT_TMO_SCALE, 8, 4)
+ FIELD(CXL_BI_RT_STATUS, COMMIT_TMO_BASE, 12, 4)
+
+/* CXL r3.2 8.2.4.27 - CXL BI Decoder Capability Structure */
+#define CXL_BI_DECODER_CAP_VERSION 1
+#define CXL_BI_DECODER_REGISTERS_OFFSET \
+ (CXL_BI_RT_REGISTERS_OFFSET + CXL_BI_RT_REGISTERS_SIZE)
+#define CXL_BI_DECODER_REGISTERS_SIZE 0xC
+
+REG32(CXL_BI_DECODER_CAPABILITY, CXL_BI_DECODER_REGISTERS_OFFSET)
+ FIELD(CXL_BI_DECODER_CAPABILITY, HDM_D, 0, 1)
+ FIELD(CXL_BI_DECODER_CAPABILITY, EXPLICIT_COMMIT, 1, 1)
+REG32(CXL_BI_DECODER_CTRL, CXL_BI_DECODER_REGISTERS_OFFSET + 0x4)
+ FIELD(CXL_BI_DECODER_CTRL, BI_FW, 0, 1)
+ FIELD(CXL_BI_DECODER_CTRL, BI_ENABLE, 1, 1)
+ FIELD(CXL_BI_DECODER_CTRL, COMMIT, 2, 1)
+REG32(CXL_BI_DECODER_STATUS, CXL_BI_DECODER_REGISTERS_OFFSET + 0x8)
+ FIELD(CXL_BI_DECODER_STATUS, COMMITTED, 0, 1)
+ FIELD(CXL_BI_DECODER_STATUS, ERR_NOT_COMMITTED, 1, 1)
+ FIELD(CXL_BI_DECODER_STATUS, COMMIT_TMO_SCALE, 8, 4)
+ FIELD(CXL_BI_DECODER_STATUS, COMMIT_TMO_BASE, 12, 4)
+
+QEMU_BUILD_BUG_MSG((CXL_BI_DECODER_REGISTERS_OFFSET +
+ CXL_BI_DECODER_REGISTERS_SIZE) >= 0x1000,
"No space for registers");
+/* to track BI explicit commit handling */
+enum {
+ CXL_BISTATE_RT = 0, /* switch usp */
+ CXL_BISTATE_DECODER, /* switch dsp */
+ CXL_BISTATE_MAX
+};
+
+typedef struct bi_state {
+ /* last 0->1 transition */
+ uint64_t last_commit;
+} BIState;
+
typedef struct component_registers {
/*
* Main memory region to be registered with QEMU core.
@@ -260,6 +309,7 @@ typedef struct cxl_component {
CDATObject cdat;
CXLCompObject compliance;
+ BIState bi_state[CXL_BISTATE_MAX];
} CXLComponentState;
void cxl_component_register_block_init(Object *obj,
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 0abfd678b875..75603b8180b5 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -841,6 +841,9 @@ struct CXLType3Dev {
CXLMemSparingReadAttrs rank_sparing_attrs;
CXLMemSparingWriteAttrs rank_sparing_wr_attrs;
+ /* BI flows */
+ bool hdmdb;
+
struct dynamic_capacity {
HostMemoryBackend *host_dc;
AddressSpace host_dc_as;
--
2.39.5
* Re: [PATCH 4/4] hw/cxl: Support type3 HDM-DB
2025-08-11 3:34 ` [PATCH 4/4] hw/cxl: Support type3 HDM-DB Davidlohr Bueso
@ 2025-08-11 16:33 ` Jonathan Cameron
0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-08-11 16:33 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: ira.weiny, alucerop, a.manzanares, linux-cxl, qemu-devel
On Sun, 10 Aug 2025 20:34:05 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:
> Add basic plumbing for memory expander devices that support Back
> Invalidation. This introduces a 'hdm-db=on|off' parameter and
> exposes the relevant BI RT/Decoder component cachemem registers.
>
> Some noteworthy properties:
> - Devices require enabling Flit mode.
> - Explicit BI-ID commit is required.
> - HDM decoders support both host and device coherency models.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Hi Davidlohr,
Ran out of time last week, so didn't get as far on this as I should have done.
Anyhow, comments follow.
> ---
> docs/system/devices/cxl.rst | 23 ++++++
> hw/cxl/cxl-component-utils.c | 135 +++++++++++++++++++++++++++++++--
> hw/mem/cxl_type3.c | 14 +++-
> include/hw/cxl/cxl_component.h | 54 ++++++++++++-
> include/hw/cxl/cxl_device.h | 3 +
> 5 files changed, 217 insertions(+), 12 deletions(-)
>
> diff --git a/docs/system/devices/cxl.rst b/docs/system/devices/cxl.rst
> index bf7908429af8..4815de0f2dc4 100644
> --- a/docs/system/devices/cxl.rst
> +++ b/docs/system/devices/cxl.rst
> @@ -384,6 +384,29 @@ An example of 4 devices below a switch suitable for 1, 2 or 4 way interleave::
> -device cxl-type3,bus=swport3,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3,sn=0x4 \
> -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
>
> +An example of 4 type3 devices with volatile memory below a switch. Two of the devices
> +use HDM-DB for coherence::
> +
> + qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
> + ...
> + -object memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M \
> + -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M \
> + -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M \
> + -object memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M \
> + -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> + -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
> + -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
> + -device cxl-upstream,bus=root_port0,id=us0,256b-flit=on \
> + -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
> + -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-mem0,sn=0x1,256b-flit=on,hdm-db=on \
> + -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
> + -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-mem1,sn=0x2,256b-flit=on,hdm-db=on \
> + -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
> + -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-mem2,sn=0x3 \
> + -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
> + -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-mem3,sn=0x4 \
> + -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
> +
> A simple arm/virt example featuring a single direct connected CXL Type 3
> Volatile Memory device::
>
> diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> index a43d227336ca..dfdbf23a427c 100644
> --- a/hw/cxl/cxl-component-utils.c
> +++ b/hw/cxl/cxl-component-utils.c
> @@ -71,10 +71,40 @@ static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> case 4:
> if (cregs->special_ops && cregs->special_ops->read) {
> return cregs->special_ops->read(cxl_cstate, offset, 4);
I'm not 100% sure we ever used the special_ops->read. Might be able to just rip that
out.
> - } else {
> - QEMU_BUILD_BUG_ON(sizeof(*cregs->cache_mem_registers) != 4);
> - return cregs->cache_mem_registers[offset / 4];
> }
> +
> + QEMU_BUILD_BUG_ON(sizeof(*cregs->cache_mem_registers) != 4);
> +
> + if (offset == A_CXL_BI_RT_STATUS ||
> + offset == A_CXL_BI_DECODER_STATUS) {
I suppose this does exist for all types, so special_ops->read doesn't
make sense and it indeed belongs in here.
> + int type;
> + uint64_t started;
> +
> + type = (offset == A_CXL_BI_RT_STATUS) ?
> + CXL_BISTATE_RT : CXL_BISTATE_DECODER;
> + started = cxl_cstate->bi_state[type].last_commit;
> +
> + if (started) {
> + uint32_t val, *cache_mem = cregs->cache_mem_registers;
I'd split
uint32_t *cache_mem = cregs->cache_mem_registers;
uint32_t val = cache_mem[offset / 4];
> + uint64_t now;
> + int set;
> +
> + val = cregs->cache_mem_registers[offset / 4];
You just added a local variable cache_mem.
> + now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> + /* arbitrary 100 ms to do the commit */
> + set = !!(now >= started + 100);
> +
> + if (offset == A_CXL_BI_RT_STATUS) {
> + val = FIELD_DP32(val, CXL_BI_RT_STATUS, COMMITTED, set);
> + } else {
> + val = FIELD_DP32(val, CXL_BI_DECODER_STATUS, COMMITTED,
> + set);
> + }
> + stl_le_p((uint8_t *)cache_mem + offset, val);
> + }
> + }
> +
> + return cregs->cache_mem_registers[offset / 4];
> case 8:
> qemu_log_mask(LOG_UNIMP,
> "CXL 8 byte cache mem registers not implemented\n");
> @@ -123,6 +153,47 @@ static void dumb_hdm_handler(CXLComponentState *cxl_cstate, hwaddr offset,
> }
> }
>
> +static void dumb_bi_handler(CXLComponentState *cxl_cstate, hwaddr offset,
Can probably drop the dumb. For the HDM decoder one it was
meant to remind me to come back and add validity checks etc
(which I haven't done yet!). As it stands, that handler will accept
parameters that make no sense and hence fail in rather hard to debug ways.
I don't think the same applies to this.
> + uint32_t value)
> +{
> + ComponentRegisters *cregs = &cxl_cstate->crb;
> + uint32_t sts, *cache_mem = cregs->cache_mem_registers;
> + bool to_commit = false;
> + int type;
> +
> + switch (offset) {
> + case A_CXL_BI_RT_CTRL:
> + to_commit = FIELD_EX32(value, CXL_BI_RT_CTRL, COMMIT);
> + if (to_commit) {
> + sts = cxl_cache_mem_read_reg(cxl_cstate,
> + R_CXL_BI_RT_STATUS, 4);
> + sts = FIELD_DP32(sts, CXL_BI_RT_STATUS, COMMITTED, 0);
> + stl_le_p((uint8_t *)cache_mem + R_CXL_BI_RT_STATUS, sts);
> + type = CXL_BISTATE_RT;
> + }
> + break;
> + case A_CXL_BI_DECODER_CTRL:
> + to_commit = FIELD_EX32(value, CXL_BI_DECODER_CTRL, COMMIT);
> + if (to_commit) {
> + sts = cxl_cache_mem_read_reg(cxl_cstate,
> + R_CXL_BI_DECODER_STATUS, 4);
> + sts = FIELD_DP32(sts, CXL_BI_DECODER_STATUS, COMMITTED, 0);
> + stl_le_p((uint8_t *)cache_mem + R_CXL_BI_DECODER_STATUS, sts);
> + type = CXL_BISTATE_DECODER;
> + }
> + break;
> + default:
> + break;
> + }
> +
> + if (to_commit) {
> + cxl_cstate->bi_state[type].last_commit =
> + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
> + }
> +
> + stl_le_p((uint8_t *)cache_mem + offset, value);
> +}
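For anyone following along, the explicit commit handling in the two hunks
above boils down to a small timestamp-based model: a write with COMMIT set
clears COMMITTED and records the time, and a later status read reports
COMMITTED once the delay has elapsed. A minimal standalone sketch, assuming
hypothetical names (BICommitModel, bi_request_commit, bi_read_committed)
that are not in the patch, and with the clock passed in rather than read
from QEMU_CLOCK_VIRTUAL:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same arbitrary 100 ms delay as in the patch. */
#define SKETCH_COMMIT_DELAY_MS 100

/* Hypothetical model of one BI commit state (route table or decoder). */
typedef struct {
    uint64_t last_commit_ms; /* time of the last COMMIT write, 0 = none */
    bool committed;          /* value the COMMITTED status bit would read */
} BICommitModel;

/* Control register write with COMMIT set: restart the commit. */
static void bi_request_commit(BICommitModel *m, uint64_t now_ms)
{
    m->committed = false;
    m->last_commit_ms = now_ms;
}

/* Status register read: COMMITTED is derived lazily from the timestamp. */
static bool bi_read_committed(BICommitModel *m, uint64_t now_ms)
{
    /* As in the patch, a timestamp of 0 means "no commit requested". */
    if (m->last_commit_ms) {
        m->committed = now_ms >= m->last_commit_ms + SKETCH_COMMIT_DELAY_MS;
    }
    return m->committed;
}
```

One side effect of using 0 as the "never committed" marker is that a commit
requested at virtual time 0 would not be tracked; that is unlikely to matter
in practice, but it is worth keeping in mind.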
> @@ -248,7 +322,7 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
As with the later functions, I think we need to pass in a separate flag rather
than using the type (for long-term maintenance reasons).
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_4K, 1);
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> POISON_ON_ERR_CAP, 0);
> - if (type == CXL2_TYPE3_DEVICE) {
> + if (type == CXL2_TYPE3_DEVICE || type == CXL3_TYPE3_DEVICE) {
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, 3_6_12_WAY, 1);
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, 16_WAY, 1);
> } else {
> @@ -260,7 +334,8 @@ static void hdm_init_common(uint32_t *reg_state, uint32_t *write_msk,
> UIO_DECODER_COUNT, 0);
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, MEMDATA_NXM_CAP, 0);
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> - SUPPORTED_COHERENCY_MODEL, 0); /* Unknown */
> + SUPPORTED_COHERENCY_MODEL,
> + type == CXL3_TYPE3_DEVICE ? 3:0); /* host+dev or Unknown */
Spaces around the :
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL,
> HDM_DECODER_ENABLE, 0);
> write_msk[R_CXL_HDM_DECODER_GLOBAL_CONTROL] = 0x3;
>
> +static void bi_rt_init_common(uint32_t *reg_state, uint32_t *write_msk)
> +{
> + /* switch usp must commit the new BI-ID, timeout of 2secs */
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_CAPABILITY, EXPLICIT_COMMIT, 1);
> +
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_CTRL, COMMIT, 0);
> + write_msk[R_CXL_BI_RT_CTRL] = 0xffffffff;
0x1 (See below)
> +
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMITTED, 0);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, ERR_NOT_COMMITTED, 0);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMIT_TMO_SCALE, 0x6);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_RT_STATUS, COMMIT_TMO_BASE, 0x2);
> +}
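As a sanity check on the "timeout of 2secs" comment: assuming the scale
field is a power-of-ten multiplier in microseconds (so 0x6 means 1 s) and
the base field is a plain multiplier, which is my reading of the encoding
and is consistent with the comment in the patch, the values decode as
below. The helper name is made up for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical decode of the commit timeout fields, assuming
 * timeout = base * 10^scale microseconds.
 */
static uint64_t bi_commit_timeout_us(unsigned scale, unsigned base)
{
    uint64_t unit_us = 1;

    for (unsigned i = 0; i < scale; i++) {
        unit_us *= 10;
    }
    return unit_us * base;
}
```

With COMMIT_TMO_SCALE = 0x6 and COMMIT_TMO_BASE = 0x2 this gives 2000000 us,
i.e. the 2 seconds mentioned in the comment.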
> +
> +static void bi_decoder_init_common(uint32_t *reg_state, uint32_t *write_msk)
> +{
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CAPABILITY, HDM_D, 0);
> + /* switch dsp must commit the new BI-ID, timeout of 2secs */
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CAPABILITY, EXPLICIT_COMMIT, 1);
Reserved for EP and root ports. I think we need to pass type into this function
so we can set this to 0 for those types and 1 for others.
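Roughly what I have in mind, as a standalone sketch only. The enum and
function names here are hypothetical stand-ins, not the real enum reg_type,
and the mapping just follows the comment above (explicit commit only for
switch downstream ports):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the components that get a BI decoder. */
enum sketch_bi_decoder_owner {
    SKETCH_ROOT_PORT,
    SKETCH_DOWNSTREAM_PORT,
    SKETCH_TYPE3_DEVICE,
};

/* Value for the EXPLICIT_COMMIT capability bit, chosen per component type. */
static bool bi_decoder_explicit_commit(enum sketch_bi_decoder_owner type)
{
    switch (type) {
    case SKETCH_DOWNSTREAM_PORT:
        return true;
    case SKETCH_ROOT_PORT:
    case SKETCH_TYPE3_DEVICE:
    default:
        /* Treated as reserved (0) for endpoints and root ports. */
        return false;
    }
}
```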
> +
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, BI_FW, 0);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, BI_ENABLE, 0);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_CTRL, COMMIT, 0);
> + write_msk[R_CXL_BI_DECODER_CTRL] = 0xffffffff;
IIRC should only have the non reserved bits in the write_msk. So 0x7
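To spell out where 0x1 and 0x7 come from, here is a standalone sketch that
builds the masks from the field positions declared in cxl_component.h in
this patch. field_mask() and masked_write() are hypothetical helpers for
illustration, and the masked write is only my approximation of how a write
mask is applied:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask of `len` bits starting at bit `shift`. */
static uint32_t field_mask(unsigned shift, unsigned len)
{
    assert(len >= 1 && shift + len <= 32);
    uint32_t ones = (len == 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
    return ones << shift;
}

/* BI RT Control: only COMMIT (bit 0) is defined. */
static uint32_t bi_rt_ctrl_write_mask(void)
{
    return field_mask(0, 1);
}

/* BI Decoder Control: BI_FW (bit 0), BI_ENABLE (bit 1), COMMIT (bit 2). */
static uint32_t bi_decoder_ctrl_write_mask(void)
{
    return field_mask(0, 1) | field_mask(1, 1) | field_mask(2, 1);
}

/* Apply a guest write so that bits outside the mask keep their old value. */
static uint32_t masked_write(uint32_t old, uint32_t value, uint32_t mask)
{
    return (old & ~mask) | (value & mask);
}
```

With a mask of 0xffffffff a guest could set the reserved bits; with 0x7 a
write of all ones only changes the three defined bits.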
> +
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMITTED, 0);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, ERR_NOT_COMMITTED, 0);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMIT_TMO_SCALE, 0x6);
> + ARRAY_FIELD_DP32(reg_state, CXL_BI_DECODER_STATUS, COMMIT_TMO_BASE, 0x2);
> +}
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index ecd3a7703b35..1e55d13c1e93 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -447,6 +447,7 @@ static void build_dvsecs(CXLType3Dev *ct3d)
> CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
> CXLDVSECRegisterLocator *regloc_dvsec;
> uint8_t *dvsec;
> + uint16_t type = ct3d->hdmdb ? CXL3_TYPE3_DEVICE : CXL2_TYPE3_DEVICE;
Using a type for this feels like something that won't scale as we add
more features. Perhaps stick to CXL2_TYPE3_DEVICE (perhaps renamed)
and a separate boolean.
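Something along these lines, as a rough standalone sketch rather than a
concrete proposal. All names are hypothetical; the mapping of component type
to BI capability and the coherency model value of 3 are taken from this
patch, with the hdm-db choice carried by a boolean instead of a new enum
entry:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the component types in enum reg_type. */
enum sketch_type {
    SKETCH_ROOT_PORT,
    SKETCH_UPSTREAM_PORT,
    SKETCH_DOWNSTREAM_PORT,
    SKETCH_TYPE3_DEVICE,
};

/* Which BI capability structure (if any) a component exposes. */
enum sketch_bi_cap { SKETCH_BI_NONE, SKETCH_BI_RT, SKETCH_BI_DECODER };

static enum sketch_bi_cap sketch_bi_cap(enum sketch_type type, bool hdm_db)
{
    switch (type) {
    case SKETCH_UPSTREAM_PORT:
        return SKETCH_BI_RT;
    case SKETCH_ROOT_PORT:
    case SKETCH_DOWNSTREAM_PORT:
        return SKETCH_BI_DECODER;
    case SKETCH_TYPE3_DEVICE:
        /* The feature flag, not the type, decides for endpoints. */
        return hdm_db ? SKETCH_BI_DECODER : SKETCH_BI_NONE;
    }
    return SKETCH_BI_NONE;
}

/* SUPPORTED_COHERENCY_MODEL: 3 (host + dev) with hdm-db, else 0 (unknown). */
static uint32_t sketch_coherency_model(enum sketch_type type, bool hdm_db)
{
    return (type == SKETCH_TYPE3_DEVICE && hdm_db) ? 3 : 0;
}
```

The point is that further CXL 3.x features can then each get their own flag
without multiplying the entries in the type enum.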
> uint32_t range1_size_hi = 0, range1_size_lo = 0,
> range1_base_hi = 0, range1_base_lo = 0,
> range2_size_hi = 0, range2_size_lo = 0,
> @@ -491,7 +492,7 @@ static void build_dvsecs(CXLType3Dev *ct3d)
> .range2_base_hi = range2_base_hi,
> .range2_base_lo = range2_base_lo,
> };
> - cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
> + cxl_component_create_dvsec(cxl_cstate, type,
> PCIE_CXL_DEVICE_DVSEC_LENGTH,
> PCIE_CXL_DEVICE_DVSEC,
> PCIE_CXL31_DEVICE_DVSEC_REVID, dvsec);
> @@ -521,14 +522,14 @@ static void build_dvsecs(CXLType3Dev *ct3d)
> },
> };
>
> - cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
> + cxl_component_create_dvsec(cxl_cstate, type,
> REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
> REG_LOC_DVSEC_REVID, (uint8_t *)regloc_dvsec);
> dvsec = (uint8_t *)&(CXLDVSECDeviceGPF){
> .phase2_duration = 0x603, /* 3 seconds */
> .phase2_power = 0x33, /* 0x33 miliwatts */
> };
> - cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
> + cxl_component_create_dvsec(cxl_cstate, type,
> GPF_DEVICE_DVSEC_LENGTH, GPF_DEVICE_DVSEC,
> GPF_DEVICE_DVSEC_REVID, dvsec);
>
> @@ -539,7 +540,7 @@ static void build_dvsecs(CXLType3Dev *ct3d)
> .status = ct3d->flitmode ? 0x6 : 0x26, /* same */
> .rcvd_mod_ts_data_phase1 = 0xef, /* WTF? */
> };
> - cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
> + cxl_component_create_dvsec(cxl_cstate, type,
> PCIE_CXL3_FLEXBUS_PORT_DVSEC_LENGTH,
> PCIE_FLEXBUS_PORT_DVSEC,
> PCIE_CXL3_FLEXBUS_PORT_DVSEC_REVID, dvsec);
> @@ -969,6 +970,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> return false;
> }
>
> + if (!ct3d->flitmode && ct3d->hdmdb) {
> + error_setg(errp, "hdm-db requires operating in 256b flit");
> + return false;
> + }
> +
> if (ct3d->hostvmem) {
> MemoryRegion *vmr;
> char *v_name;
> diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
> index cd92cb02532a..acec76152ad0 100644
> --- a/include/hw/cxl/cxl_component.h
> +++ b/include/hw/cxl/cxl_component.h
> @@ -29,6 +29,7 @@ enum reg_type {
> CXL2_UPSTREAM_PORT,
> CXL2_DOWNSTREAM_PORT,
> CXL3_SWITCH_MAILBOX_CCI,
> + CXL3_TYPE3_DEVICE, /* hdm-db */
I'm wondering about this. Whilst it's true that CXL3 allows
an hdm-db, it also enables a bunch of other things that I don't think
we want to gate on this. Could we just pass a separate parameter
for it?
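
A minimal sketch of what a separate parameter could look like, keeping the
component type and the optional capability apart. Every name below is
hypothetical (none of them are existing QEMU identifiers), and the mapping
of register blocks to port types only follows the comments in the enum
further down (route table on a switch USP, decoder elsewhere):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the component types; not QEMU's enum reg_type. */
enum sketch_reg_type {
    SKETCH_TYPE3_DEVICE,
    SKETCH_UPSTREAM_PORT,
    SKETCH_DOWNSTREAM_PORT,
};

/* Optional capabilities travel separately from the component type. */
struct sketch_caps {
    bool bi; /* hdm-db / Back-Invalidate support requested */
};

/* Hypothetical bitmask of BI register blocks to initialise. */
#define SKETCH_BLOCK_BI_DECODER (1u << 0)
#define SKETCH_BLOCK_BI_RT      (1u << 1)

/*
 * Pick the BI register blocks from the capability flag, and use the
 * component type only to choose between route table and decoder.
 */
static uint32_t sketch_bi_blocks(enum sketch_reg_type type,
                                 const struct sketch_caps *caps)
{
    if (!caps->bi) {
        return 0;
    }
    if (type == SKETCH_UPSTREAM_PORT) {
        return SKETCH_BLOCK_BI_RT;
    }
    return SKETCH_BLOCK_BI_DECODER;
}
```

With this shape a further optional feature becomes one more field in the
capability struct rather than another entry in the type enum.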
> };
> +/* to track BI explicit commit handling */
> +enum {
> + CXL_BISTATE_RT = 0, /* switch usp */
Could you spell out "route table" somewhere? I couldn't remember what
RT stood for.
> + CXL_BISTATE_DECODER, /* switch dsp */
Also endpoints, root ports etc.
> + CXL_BISTATE_MAX
> +};
> +
> +typedef struct bi_state {
> + /* last 0->1 transition */
> + uint64_t last_commit;
> +} BIState;
> +
> typedef struct component_registers {
> /*
> * Main memory region to be registered with QEMU core.
> @@ -260,6 +309,7 @@ typedef struct cxl_component {
>
> CDATObject cdat;
> CXLCompObject compliance;
> + BIState bi_state[CXL_BISTATE_MAX];
> } CXLComponentState;
>
> void cxl_component_register_block_init(Object *obj,
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 0abfd678b875..75603b8180b5 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -841,6 +841,9 @@ struct CXLType3Dev {
> CXLMemSparingReadAttrs rank_sparing_attrs;
> CXLMemSparingWriteAttrs rank_sparing_wr_attrs;
>
> + /* BI flows */
> + bool hdmdb;
> +
> struct dynamic_capacity {
> HostMemoryBackend *host_dc;
> AddressSpace host_dc_as;
end of thread, other threads:[~2025-08-11 16:33 UTC | newest]
Thread overview: 13+ messages
2025-08-06 5:57 [PATCH -qemu 0/4] hw/cxl: Support Back-Invalidate Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 1/4] hw/pcie: Support enabling flit mode Davidlohr Bueso
2025-08-08 15:42 ` Jonathan Cameron
2025-08-08 17:45 ` Davidlohr Bueso
2025-08-08 18:18 ` Markus Armbruster
2025-08-08 16:02 ` Jonathan Cameron
2025-08-06 5:57 ` [PATCH 2/4] hw/cxl: Refactor component register initialization Davidlohr Bueso
2025-08-06 5:57 ` [PATCH 3/4] hw/cxl: Allow BI by default in Window restrictions Davidlohr Bueso
2025-08-07 0:06 ` Davidlohr Bueso
2025-08-08 15:47 ` Jonathan Cameron
2025-08-06 5:57 ` [PATCH 4/4] hw/cxl: Support Type3 HDM-DB Davidlohr Bueso
-- strict thread matches above, loose matches on Subject: below --
2025-08-11 3:34 [PATCH v2 -qemu 0/4] hw/cxl: Support Back-Invalidate Davidlohr Bueso
2025-08-11 3:34 ` [PATCH 4/4] hw/cxl: Support type3 HDM-DB Davidlohr Bueso
2025-08-11 16:33 ` Jonathan Cameron