* [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe
@ 2025-08-14 22:21 Dave Jiang
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
` (11 more replies)
0 siblings, 12 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter, Gregory Price,
Jonathan Cameron, Li Ming
v8:
- Several changes based on Dan's and Robert's comments. The main change moves the port MMIO
register probing to after the first dport shows up. As a result, decoder allocation
happens after the register probe.
- See specific commits for more detailed changes.
v7:
- Remove -EEXIST to simplify error flow. (Ming)
- Set dport to NULL during declare. (Jonathan)
v6:
- Return -EEXIST when a dport already exists. (Jonathan)
- Fix checking wrong port for NULL. (Ming)
- Check host_bridge and call devm_cxl_add_dport_by_uport() directly vs add_port_attach_ep(). (Ming)
- Set dport to NULL during declaration. (Jonathan)
v5:
- Return dport instead of errno with dport pointer as output param. (Jonathan)
- Consolidate common code in cxl_test. (Jonathan)
- Rename cxl_port_get_total_dports() to cxl_port_update_total_dports(). (Jonathan)
v4:
- Push dport allocation to when they are discovered. (Robert)
- Drop linux id for dport with above changes.
v3:
- Main changes revolve around improving naming of hostbridge uport and dport (Gregory)
- See specific patches for detailed change log
This series attempts to delay the allocation of dports until the endpoint devices
(memdevs) are being probed. At this point, the CXL link is established and all the
devices along the CXL link path up to the Root Port (RP) should be active.
Hopefully this helps a bit with Robert's issue raised in the "Inactive
downstream port handling" series [1]. Testing would be appreciated. Thank you!
[1]: https://lore.kernel.org/linux-cxl/67c8a0cc23ec_24b64294f6@dwillia2-xfh.jf.intel.com.notmuch/
Dave Jiang (11):
cxl: Add helper to detect top of CXL device topology
cxl: Add helper to reap dport
cxl: Add a cached copy of target_map to cxl_decoder
cxl: Move port register setup to first dport appear
cxl: Defer dport allocation for switch ports
cxl/test: Add cxl_test support for cxl_port_get_possible_dports()
cxl/test: Add mock version of devm_cxl_add_dport_by_dev()
cxl/test: Add support to cxl_test for decoder enumeration mock
functions
cxl/test: Setup target_map for cxl_test decoder initialization
cxl: Change sslbis handler to only handle single dport
tools/testing/cxl: Add decoder save/restore support
drivers/cxl/acpi.c | 7 +-
drivers/cxl/core/cdat.c | 25 +-
drivers/cxl/core/core.h | 2 +
drivers/cxl/core/hdm.c | 51 ++--
drivers/cxl/core/pci.c | 82 ++++++
drivers/cxl/core/port.c | 358 ++++++++++++++++++------
drivers/cxl/core/region.c | 4 +-
drivers/cxl/cxl.h | 44 ++-
drivers/cxl/port.c | 29 +-
tools/testing/cxl/Kbuild | 5 +-
tools/testing/cxl/cxl_core_exports.c | 42 +++
tools/testing/cxl/exports.h | 21 ++
tools/testing/cxl/test/cxl.c | 399 ++++++++++++++++++++++++++-
tools/testing/cxl/test/mock.c | 70 ++++-
tools/testing/cxl/test/mock.h | 3 +
15 files changed, 963 insertions(+), 179 deletions(-)
create mode 100644 tools/testing/cxl/exports.h
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
--
2.50.1
* [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-15 12:50 ` Jonathan Cameron
2025-08-20 13:51 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 02/11] cxl: Add helper to reap dport Dave Jiang
` (10 subsequent siblings)
11 siblings, 2 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter, Jonathan Cameron, Li Ming
Add a helper to replace the open-coded detection of the CXL device hierarchy
root, i.e. the host bridge. The helper will be used for delayed downstream
port (dport) creation.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v8:
- Rename to is_cxl_host_bridge() (Dan)
- Rename duplicate tags from Jonathan
---
drivers/cxl/core/port.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 29197376b18e..855623cebd7d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -33,6 +33,15 @@
static DEFINE_IDA(cxl_port_ida);
static DEFINE_XARRAY(cxl_root_buses);
+/*
+ * The terminal device in PCI is NULL and @platform_bus
+ * for platform devices (for cxl_test)
+ */
+static bool is_cxl_host_bridge(struct device *dev)
+{
+ return (!dev || dev == &platform_bus);
+}
+
int cxl_num_decoders_committed(struct cxl_port *port)
{
lockdep_assert_held(&cxl_rwsem.region);
@@ -1541,7 +1550,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
resource_size_t component_reg_phys;
int rc;
- if (!dparent) {
+ if (is_cxl_host_bridge(dparent)) {
/*
* The iteration reached the topology root without finding the
* CXL-root 'cxl_port' on a previous iteration, fail for now to
@@ -1629,11 +1638,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
struct device *uport_dev;
struct cxl_dport *dport;
- /*
- * The terminal "grandparent" in PCI is NULL and @platform_bus
- * for platform devices
- */
- if (!dport_dev || dport_dev == &platform_bus)
+ if (is_cxl_host_bridge(dport_dev))
return 0;
uport_dev = dport_dev->parent;
--
2.50.1
* [PATCH v8 02/11] cxl: Add helper to reap dport
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-20 14:10 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder Dave Jiang
` (9 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter, Li Ming
Factor the code in reap_dports() out into a helper function that
reaps a single dport. This will be used later in the cleanup path when
allocating a dport.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/cxl/core/port.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 855623cebd7d..fd316e9bd59d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1441,6 +1441,15 @@ static void delete_switch_port(struct cxl_port *port)
devm_release_action(port->dev.parent, unregister_port, port);
}
+static void reap_dport(struct cxl_dport *dport)
+{
+ struct cxl_port *port = dport->port;
+
+ devm_release_action(&port->dev, cxl_dport_unlink, dport);
+ devm_release_action(&port->dev, cxl_dport_remove, dport);
+ devm_kfree(&port->dev, dport);
+}
+
static void reap_dports(struct cxl_port *port)
{
struct cxl_dport *dport;
@@ -1448,11 +1457,8 @@ static void reap_dports(struct cxl_port *port)
device_lock_assert(&port->dev);
- xa_for_each(&port->dports, index, dport) {
- devm_release_action(&port->dev, cxl_dport_unlink, dport);
- devm_release_action(&port->dev, cxl_dport_remove, dport);
- devm_kfree(&port->dev, dport);
- }
+ xa_for_each(&port->dports, index, dport)
+ reap_dport(dport);
}
struct detach_ctx {
--
2.50.1
* [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
2025-08-14 22:21 ` [PATCH v8 02/11] cxl: Add helper to reap dport Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-15 12:52 ` Jonathan Cameron
2025-08-20 14:17 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 04/11] cxl: Move port register setup to first dport appear Dave Jiang
` (8 subsequent siblings)
11 siblings, 2 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter
Add a cached copy of the hardware port-id list that is available at init,
before all @dport objects have been instantiated. This change is in
preparation for delayed dport instantiation.
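As a rough user-space illustration of the idea (not the kernel code), the
sketch below unpacks a 64-bit hardware target list into a per-decoder cached
array that can be consulted later, once dports exist. The names
struct demo_decoder, DEMO_MAX_INTERLEAVE and demo_cache_targets() are made up
for this example; only the one-byte-per-target layout is taken from what
init_hdm_decoder() reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_INTERLEAVE 16 /* hypothetical cap for this sketch */

/* Hypothetical stand-in for a decoder that caches hardware port ids. */
struct demo_decoder {
	int interleave_ways;
	uint32_t target_map[DEMO_MAX_INTERLEAVE];
};

/*
 * Cache the port ids packed one byte per target in a 64-bit target list.
 * No dport objects are needed at this point; matching the ids against
 * real dports can happen later, as they are discovered.
 */
static int demo_cache_targets(struct demo_decoder *d, uint64_t target_list)
{
	assert(d != NULL);

	/* A 64-bit list holds at most 8 one-byte ids. */
	if (d->interleave_ways < 0 || d->interleave_ways > 8)
		return -1;

	for (int i = 0; i < d->interleave_ways; i++)
		d->target_map[i] = (uint32_t)((target_list >> (8 * i)) & 0xff);

	return 0;
}
```

The shift-and-mask form is used here instead of a union so the sketch does
not depend on host byte order.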
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/cxl/acpi.c | 7 +++----
drivers/cxl/core/hdm.c | 20 ++++++++------------
drivers/cxl/core/port.c | 22 +++++++---------------
drivers/cxl/cxl.h | 8 ++++++--
tools/testing/cxl/test/cxl.c | 8 ++++----
5 files changed, 28 insertions(+), 37 deletions(-)
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 712624cba2b6..bb0871d92620 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -398,7 +398,6 @@ DEFINE_FREE(del_cxl_resource, struct resource *, if (_T) del_cxl_resource(_T))
static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
struct cxl_cfmws_context *ctx)
{
- int target_map[CXL_DECODER_MAX_INTERLEAVE];
struct cxl_port *root_port = ctx->root_port;
struct cxl_cxims_context cxims_ctx;
struct device *dev = ctx->dev;
@@ -416,8 +415,6 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
rc = eig_to_granularity(cfmws->granularity, &ig);
if (rc)
return rc;
- for (i = 0; i < ways; i++)
- target_map[i] = cfmws->interleave_targets[i];
struct resource *res __free(del_cxl_resource) = alloc_cxl_resource(
cfmws->base_hpa, cfmws->window_size, ctx->id++);
@@ -443,6 +440,8 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
.end = cfmws->base_hpa + cfmws->window_size - 1,
};
cxld->interleave_ways = ways;
+ for (i = 0; i < ways; i++)
+ cxld->target_map[i] = cfmws->interleave_targets[i];
/*
* Minimize the x1 granularity to advertise support for any
* valid region granularity
@@ -475,7 +474,7 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR)
cxlrd->hpa_to_spa = cxl_xor_hpa_to_spa;
- rc = cxl_decoder_add(cxld, target_map);
+ rc = cxl_decoder_add(cxld);
if (rc)
return rc;
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index e9e1d555cec6..cee68bbc7ff6 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -21,12 +21,11 @@ struct cxl_rwsem cxl_rwsem = {
.dpa = __RWSEM_INITIALIZER(cxl_rwsem.dpa),
};
-static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
- int *target_map)
+static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
{
int rc;
- rc = cxl_decoder_add_locked(cxld, target_map);
+ rc = cxl_decoder_add_locked(cxld);
if (rc) {
put_device(&cxld->dev);
dev_err(&port->dev, "Failed to add decoder\n");
@@ -54,7 +53,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
{
struct cxl_switch_decoder *cxlsd;
struct cxl_dport *dport = NULL;
- int single_port_map[1];
unsigned long index;
struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
@@ -73,9 +71,9 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
xa_for_each(&port->dports, index, dport)
break;
- single_port_map[0] = dport->port_id;
+ cxlsd->cxld.target_map[0] = dport->port_id;
- return add_hdm_decoder(port, &cxlsd->cxld, single_port_map);
+ return add_hdm_decoder(port, &cxlsd->cxld);
}
EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
@@ -984,7 +982,7 @@ static int cxl_setup_hdm_decoder_from_dvsec(
}
static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
- int *target_map, void __iomem *hdm, int which,
+ void __iomem *hdm, int which,
u64 *dpa_base, struct cxl_endpoint_dvsec_info *info)
{
struct cxl_endpoint_decoder *cxled = NULL;
@@ -1103,7 +1101,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
hi = readl(hdm + CXL_HDM_DECODER0_TL_HIGH(which));
target_list.value = (hi << 32) + lo;
for (i = 0; i < cxld->interleave_ways; i++)
- target_map[i] = target_list.target_id[i];
+ cxld->target_map[i] = target_list.target_id[i];
return 0;
}
@@ -1179,7 +1177,6 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
cxl_settle_decoders(cxlhdm);
for (i = 0; i < cxlhdm->decoder_count; i++) {
- int target_map[CXL_DECODER_MAX_INTERLEAVE] = { 0 };
int rc, target_count = cxlhdm->target_count;
struct cxl_decoder *cxld;
@@ -1207,8 +1204,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
cxld = &cxlsd->cxld;
}
- rc = init_hdm_decoder(port, cxld, target_map, hdm, i,
- &dpa_base, info);
+ rc = init_hdm_decoder(port, cxld, hdm, i, &dpa_base, info);
if (rc) {
dev_warn(&port->dev,
"Failed to initialize decoder%d.%d\n",
@@ -1216,7 +1212,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
put_device(&cxld->dev);
return rc;
}
- rc = add_hdm_decoder(port, cxld, target_map);
+ rc = add_hdm_decoder(port, cxld);
if (rc) {
dev_warn(&port->dev,
"Failed to add decoder%d.%d\n", port->id, i);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index fd316e9bd59d..48e76673aaf3 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1715,13 +1715,11 @@ struct cxl_port *cxl_mem_find_port(struct cxl_memdev *cxlmd,
EXPORT_SYMBOL_NS_GPL(cxl_mem_find_port, "CXL");
static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
- struct cxl_port *port, int *target_map)
+ struct cxl_port *port)
{
+ struct cxl_decoder *cxld = &cxlsd->cxld;
int i;
- if (!target_map)
- return 0;
-
device_lock_assert(&port->dev);
if (xa_empty(&port->dports))
@@ -1729,7 +1727,7 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
guard(rwsem_write)(&cxl_rwsem.region);
for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
- struct cxl_dport *dport = find_dport(port, target_map[i]);
+ struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
if (!dport)
return -ENXIO;
@@ -1921,9 +1919,6 @@ EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, "CXL");
/**
* cxl_decoder_add_locked - Add a decoder with targets
* @cxld: The cxl decoder allocated by cxl_<type>_decoder_alloc()
- * @target_map: A list of downstream ports that this decoder can direct memory
- * traffic to. These numbers should correspond with the port number
- * in the PCIe Link Capabilities structure.
*
* Certain types of decoders may not have any targets. The main example of this
* is an endpoint device. A more awkward example is a hostbridge whose root
@@ -1937,7 +1932,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, "CXL");
* Return: Negative error code if the decoder wasn't properly configured; else
* returns 0.
*/
-int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
+int cxl_decoder_add_locked(struct cxl_decoder *cxld)
{
struct cxl_port *port;
struct device *dev;
@@ -1958,7 +1953,7 @@ int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
if (!is_endpoint_decoder(dev)) {
struct cxl_switch_decoder *cxlsd = to_cxl_switch_decoder(dev);
- rc = decoder_populate_targets(cxlsd, port, target_map);
+ rc = decoder_populate_targets(cxlsd, port);
if (rc && (cxld->flags & CXL_DECODER_F_ENABLE)) {
dev_err(&port->dev,
"Failed to populate active decoder targets\n");
@@ -1977,9 +1972,6 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, "CXL");
/**
* cxl_decoder_add - Add a decoder with targets
* @cxld: The cxl decoder allocated by cxl_<type>_decoder_alloc()
- * @target_map: A list of downstream ports that this decoder can direct memory
- * traffic to. These numbers should correspond with the port number
- * in the PCIe Link Capabilities structure.
*
* This is the unlocked variant of cxl_decoder_add_locked().
* See cxl_decoder_add_locked().
@@ -1987,7 +1979,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, "CXL");
* Context: Process context. Takes and releases the device lock of the port that
* owns the @cxld.
*/
-int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
+int cxl_decoder_add(struct cxl_decoder *cxld)
{
struct cxl_port *port;
@@ -2000,7 +1992,7 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
port = to_cxl_port(cxld->dev.parent);
guard(device)(&port->dev);
- return cxl_decoder_add_locked(cxld, target_map);
+ return cxl_decoder_add_locked(cxld);
}
EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, "CXL");
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 847e37be42c4..4b858f3d44c6 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -357,6 +357,9 @@ enum cxl_decoder_type {
* @target_type: accelerator vs expander (type2 vs type3) selector
* @region: currently assigned region for this decoder
* @flags: memory type capabilities and locking
+ * @target_map: cached copy of hardware port-id list, available at init
+ * before all @dport objects have been instantiated. While
+ * dport id is 8bit, CFMWS interleave targets are 32bits.
* @commit: device/decoder-type specific callback to commit settings to hw
* @reset: device/decoder-type specific callback to reset hw settings
*/
@@ -369,6 +372,7 @@ struct cxl_decoder {
enum cxl_decoder_type target_type;
struct cxl_region *region;
unsigned long flags;
+ u32 target_map[CXL_DECODER_MAX_INTERLEAVE];
int (*commit)(struct cxl_decoder *cxld);
void (*reset)(struct cxl_decoder *cxld);
};
@@ -781,9 +785,9 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets);
struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets);
-int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
+int cxl_decoder_add(struct cxl_decoder *cxld);
struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port);
-int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
+int cxl_decoder_add_locked(struct cxl_decoder *cxld);
int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
static inline int cxl_root_decoder_autoremove(struct device *host,
struct cxl_root_decoder *cxlrd)
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 6a25cca5636f..8faf4143d04e 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -651,7 +651,7 @@ static int mock_cxl_add_passthrough_decoder(struct cxl_port *port)
struct target_map_ctx {
- int *target_map;
+ u32 *target_map;
int index;
int target_count;
};
@@ -863,9 +863,7 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
target_count = NR_CXL_SWITCH_PORTS;
for (i = 0; i < NR_CXL_PORT_DECODERS; i++) {
- int target_map[CXL_DECODER_MAX_INTERLEAVE] = { 0 };
struct target_map_ctx ctx = {
- .target_map = target_map,
.target_count = target_count,
};
struct cxl_decoder *cxld;
@@ -894,6 +892,8 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
cxld = &cxled->cxld;
}
+ ctx.target_map = cxld->target_map;
+
mock_init_hdm_decoder(cxld);
if (target_count) {
@@ -905,7 +905,7 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
}
}
- rc = cxl_decoder_add_locked(cxld, target_map);
+ rc = cxl_decoder_add_locked(cxld);
if (rc) {
put_device(&cxld->dev);
dev_err(&port->dev, "Failed to add decoder\n");
--
2.50.1
* [PATCH v8 04/11] cxl: Move port register setup to first dport appear
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (2 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-15 12:57 ` Jonathan Cameron
2025-08-22 10:37 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 05/11] cxl: Defer dport allocation for switch ports Dave Jiang
` (7 subsequent siblings)
11 siblings, 2 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter
Move the port register setup to when the first dport appears
via the memdev probe path. At this point, the CXL link should be
established and the register access is expected to succeed. This change
addresses an error message observed when PCIe hotplug is enabled on
an Intel platform. The error message "cxl portN: Couldn't locate the
CXL.cache and CXL.mem capability array header" is observed for the
host bridge during cxl_acpi driver probe. If the cxl_acpi module
probe runs before the CXL link between the endpoint device and the
RP is established, then the platform may not have exposed the DVSEC ID 3
and/or DVSEC ID 7 blocks, which triggers the error message. This
behavior is defined by the spec and is not a hardware quirk.
This change also needs the dport enumeration to be moved to the memdev
probe path in order to address the issue. It is just one part of
the code refactoring and is not a self-contained fix.
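For readers who want the shape of the change without the kernel context, here
is a minimal user-space model of "stash the address at port creation, map the
registers when the first dport shows up". All demo_* names and
DEMO_RESOURCE_NONE are invented for this sketch, and demo_setup_regs() is a
stub that always succeeds; it does not reflect what the real register setup
does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RESOURCE_NONE UINT64_MAX /* hypothetical "no address" marker */

struct demo_port {
	uint64_t component_reg_phys; /* stashed at port creation */
	int nr_dports;
	int regs_mapped;             /* set once the deferred setup has run */
};

/* Stub for the real register setup; always succeeds in this sketch. */
static int demo_setup_regs(struct demo_port *port, uint64_t phys)
{
	(void)phys;
	port->regs_mapped = 1;
	return 0;
}

/* Port creation only records the address; no register is touched yet. */
static void demo_port_add(struct demo_port *port, uint64_t phys)
{
	assert(port != NULL);
	port->component_reg_phys = phys;
	port->nr_dports = 0;
	port->regs_mapped = 0;
}

/* Adding the first dport triggers the one-time register setup. */
static int demo_add_dport(struct demo_port *port)
{
	assert(port != NULL);
	port->nr_dports++;

	if (port->nr_dports == 1 &&
	    port->component_reg_phys != DEMO_RESOURCE_NONE) {
		int rc = demo_setup_regs(port, port->component_reg_phys);

		if (rc)
			return rc;
		/* Consume the stashed address so setup never runs twice. */
		port->component_reg_phys = DEMO_RESOURCE_NONE;
	}

	return 0;
}
```

Clearing the stashed address after a successful setup is what keeps later
dport additions from repeating the work.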
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/cxl/core/port.c | 16 +++++++++++++---
drivers/cxl/cxl.h | 2 ++
2 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 48e76673aaf3..25209952f469 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -867,9 +867,7 @@ static int cxl_port_add(struct cxl_port *port,
if (rc)
return rc;
- rc = cxl_port_setup_regs(port, component_reg_phys);
- if (rc)
- return rc;
+ port->component_reg_phys = component_reg_phys;
} else {
rc = dev_set_name(dev, "root%d", port->id);
if (rc)
@@ -1200,6 +1198,18 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
cxl_debugfs_create_dport_dir(dport);
+ /*
+ * Set up the port registers when the first dport shows up. Having
+ * a dport also means that there is at least 1 active link.
+ */
+ if (port->nr_dports == 1 &&
+ port->component_reg_phys != CXL_RESOURCE_NONE) {
+ rc = cxl_port_setup_regs(port, port->component_reg_phys);
+ if (rc)
+ return ERR_PTR(rc);
+ port->component_reg_phys = CXL_RESOURCE_NONE;
+ }
+
return dport;
}
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 4b858f3d44c6..87a905db5ffb 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -599,6 +599,7 @@ struct cxl_dax_region {
* @cdat: Cached CDAT data
* @cdat_available: Should a CDAT attribute be available in sysfs
* @pci_latency: Upstream latency in picoseconds
+ * @component_reg_phys: Physical address of component register
*/
struct cxl_port {
struct device dev;
@@ -622,6 +623,7 @@ struct cxl_port {
} cdat;
bool cdat_available;
long pci_latency;
+ resource_size_t component_reg_phys;
};
/**
--
2.50.1
* [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (3 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 04/11] cxl: Move port register setup to first dport appear Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-20 12:41 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 06/11] cxl/test: Add cxl_test support for cxl_port_get_possible_dports() Dave Jiang
` (6 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter
The current implementation enumerates the dports during the cxl_port
driver probe. Without an endpoint connected, the dport may not be
active during port probe. This scheme may prevent a valid hardware
dport id from being retrieved and MMIO registers from being read when an
endpoint is hot-plugged. Move the dport allocation and setup behind the
memdev probe so the endpoint is guaranteed to be connected.
In the original enumeration behavior, there are 3 phases (or 2 if no CXL
switches) for port creation. cxl_acpi creates a Root Port (RP) from the
ACPI0017.N device. Through that it enumerates downstream ports composed
of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
uses add_host_bridge_uport() to create the ports that enumerate the PCI
RPs as the dports of these ports. Every time a port is created, the port
driver is attached, cxl_switch_port_probe() is called and
devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
the dports.
The second phase applies if there are any CXL switches. When the PCI endpoint
device driver (cxl_pci) probes, it adds a memdev and triggers
cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
and attempts to discover and create all the ports that represent CXL switches.
During this phase, a port is created per switch and the attached dports
are also enumerated and probed.
The last phase is creating the endpoint port, which happens for all endpoint
devices.
In this commit, the port creation and its dport probing in cxl_acpi are not
changed. That will be handled later. The behavior change is only for CXL
switch ports. Only the dport that is part of the path from an endpoint
device to the RP will be probed. This happens naturally as the code
walks up the device hierarchy and identifies the upstream device and
the downstream device.
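To make the "only dports on the endpoint's path" point concrete, here is a
small user-space sketch of such an upward walk. It is an illustration under
assumptions, not the kernel implementation: struct demo_dev, struct demo_hop
and demo_walk_up() are invented names, and the model steps two levels at a
time (downstream device, then upstream device) in the way the enumeration
loop is described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device node with only a parent link. */
struct demo_dev {
	const char *name;
	struct demo_dev *parent;
};

/* One (upstream, downstream) device pair discovered on the walk. */
struct demo_hop {
	const struct demo_dev *uport_dev;
	const struct demo_dev *dport_dev;
};

/*
 * Walk from an endpoint towards the root, recording the downstream device
 * directly above the current position and the upstream device above that.
 * Stops when either one is missing. Siblings that are not on the path are
 * never visited. Returns the number of hops stored (at most @max).
 */
static int demo_walk_up(const struct demo_dev *endpoint,
			struct demo_hop *hops, int max)
{
	const struct demo_dev *iter = endpoint;
	int n = 0;

	assert(endpoint != NULL && hops != NULL);

	while (iter && n < max) {
		const struct demo_dev *dport_dev = iter->parent;
		const struct demo_dev *uport_dev;

		if (!dport_dev)
			break;
		uport_dev = dport_dev->parent;
		if (!uport_dev)
			break;

		hops[n].dport_dev = dport_dev;
		hops[n].uport_dev = uport_dev;
		n++;

		/* Continue from the upstream device (the "grandparent"). */
		iter = uport_dev;
	}

	return n;
}
```

With a chain of host bridge, root port, switch upstream port, switch
downstream port and endpoint, the walk yields two hops: the switch pair
first, then the root port under the host bridge.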
In the new sequence, instead of creating all possible dports at initial
port creation, dport instantiation is deferred until a memdev beneath that
dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
the creation and extension of ports with new dports as memory devices
arrive. As part of this rework, the switch decoder target list is amended
at runtime as dports show up.
While the decoders are allocated during the port driver probe, they must
also be updated afterwards. Previously this was all done once all the
dports were set up; now, every time a dport is set up for an endpoint, the
switch target list needs to be updated with the new dport. A
guard(rwsem_write) is used when updating decoder targets. This is similar to
when decoder_populate_targets() is called and the decoder programming
must be protected.
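A minimal user-space sketch of the runtime target update follows. It is not
the code from this patch: the demo_* names and DEMO_MAX_WAYS are invented,
and the locking is left out (the commit message describes the real update as
running under a write guard on an rwsem). It only shows how a cached port-id
list lets a late-arriving dport be slotted into the right interleave
position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_WAYS 8 /* hypothetical cap for this sketch */

struct demo_dport {
	uint32_t port_id;
};

struct demo_switch_decoder {
	int interleave_ways;
	uint32_t target_map[DEMO_MAX_WAYS];       /* cached hardware port ids */
	struct demo_dport *target[DEMO_MAX_WAYS]; /* NULL until dport arrives */
};

/*
 * Fill in every still-empty slot whose cached port id matches the newly
 * arrived dport. Slots that are already populated are left alone.
 * Returns the number of slots updated.
 */
static int demo_update_targets(struct demo_switch_decoder *sd,
			       struct demo_dport *dport)
{
	int updated = 0;

	assert(sd != NULL && dport != NULL);

	for (int i = 0; i < sd->interleave_ways && i < DEMO_MAX_WAYS; i++) {
		if (sd->target[i])
			continue;
		if (sd->target_map[i] == dport->port_id) {
			sd->target[i] = dport;
			updated++;
		}
	}

	return updated;
}
```

Because slots can stay NULL until their dport arrives, readers of the target
array need a NULL check, which is what the match_cxlrd_hb() hunk in this
patch adds.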
Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v8:
- grammar and spelling fixups (Dan)
- Clarify commit log story. (Dan)
- Move register mapping and decoder enumeration to when first dport shows up (Dan)
- Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
- cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
- Remove failure path for possible dports == 0. (Dan, Robert)
- update_switch_decoder() -> update_decoder_targets(). (Dan)
- Remove lock asserts where not needed. (Dan)
- Add support for passthrough decoder init. (Dan)
- Return -ENXIO when no driver attached. (Dan)
- Move guard() from devm_cxl_add_dport_by_uport(). (Dan, Robert)
- Add devm_cxl_create_or_extend_port() helper. (Dan)
- Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
- Remove 'new_dport' local var. (Robert)
- Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
- Move port check logic to add_port_attach_ep(). (Robert)
---
drivers/cxl/core/cdat.c | 2 +-
drivers/cxl/core/core.h | 2 +
drivers/cxl/core/hdm.c | 6 -
drivers/cxl/core/pci.c | 81 +++++++++++
drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
drivers/cxl/core/region.c | 4 +-
drivers/cxl/cxl.h | 3 +
drivers/cxl/port.c | 29 +---
8 files changed, 331 insertions(+), 83 deletions(-)
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index c0af645425f4..b156b81a9b20 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
guard(rwsem_read)(&cxl_rwsem.region);
for (int i = 0; i < cxlsd->nr_targets; i++) {
- if (host_bridge == cxlsd->target[i]->dport_dev)
+ if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
return 1;
}
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 2669f251d677..2ac71eb459e6 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
int cxl_ras_init(void);
void cxl_ras_exit(void);
int cxl_gpf_port_setup(struct cxl_dport *dport);
+struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev);
#ifdef CONFIG_CXL_FEATURES
struct cxl_feat_entry *
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index cee68bbc7ff6..5263e9eba7d0 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
{
struct cxl_switch_decoder *cxlsd;
- struct cxl_dport *dport = NULL;
- unsigned long index;
struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
/*
@@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
device_lock_assert(&port->dev);
- xa_for_each(&port->dports, index, dport)
- break;
- cxlsd->cxld.target_map[0] = dport->port_id;
-
return add_hdm_decoder(port, &cxlsd->cxld);
}
EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index b50551601c2e..b9d770f1aa7b 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
module_param(media_ready_timeout, ushort, 0644);
MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
+/**
+ * devm_cxl_add_dport_by_dev - allocate a dport by dport device
+ * @port: cxl_port that hosts the dport
+ * @dport_dev: 'struct device' of the dport
+ *
+ * Returns the allocated dport on success or ERR_PTR() of -errno on error
+ */
+struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev)
+{
+ struct cxl_register_map map;
+ struct pci_dev *pdev;
+ u32 lnkcap, port_num;
+ int type;
+ int rc;
+
+ if (!dev_is_pci(dport_dev))
+ return ERR_PTR(-EINVAL);
+
+ device_lock_assert(&port->dev);
+
+ pdev = to_pci_dev(dport_dev);
+ type = pci_pcie_type(pdev);
+ if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
+ return ERR_PTR(-EINVAL);
+
+ if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
+ &lnkcap))
+ return ERR_PTR(-ENXIO);
+
+ rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+ if (rc)
+ dev_dbg(&port->dev, "failed to find component registers\n");
+
+ port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
+ return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
+}
+
struct cxl_walk_context {
struct pci_bus *bus;
struct cxl_port *port;
@@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
return 0;
}
+
+static int count_dports(struct pci_dev *pdev, void *data)
+{
+ struct cxl_walk_context *ctx = data;
+ int type = pci_pcie_type(pdev);
+
+ if (pdev->bus != ctx->bus)
+ return 0;
+ if (!pci_is_pcie(pdev))
+ return 0;
+ if (type != ctx->type)
+ return 0;
+
+ ctx->count++;
+ return 0;
+}
+
+int cxl_port_get_possible_dports(struct cxl_port *port)
+{
+ struct pci_bus *bus = cxl_port_to_pci_bus(port);
+ struct cxl_walk_context ctx;
+ int type;
+
+ if (!bus) {
+ dev_err(&port->dev, "No PCI bus found for port %s\n",
+ dev_name(&port->dev));
+ return -ENXIO;
+ }
+
+ if (pci_is_root_bus(bus))
+ type = PCI_EXP_TYPE_ROOT_PORT;
+ else
+ type = PCI_EXP_TYPE_DOWNSTREAM;
+
+ ctx = (struct cxl_walk_context) {
+ .bus = bus,
+ .type = type,
+ };
+ pci_walk_bus(bus, count_dports, &ctx);
+
+ return ctx.count;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 25209952f469..877f888ee8f5 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
return port;
}
-static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
- struct device *dport_dev,
- struct cxl_dport **dport)
-{
- struct cxl_find_port_ctx ctx = {
- .dport_dev = dport_dev,
- .parent_port = parent_port,
- .dport = dport,
- };
- struct cxl_port *port;
-
- port = __find_cxl_port(&ctx);
- return port;
-}
-
/*
* All users of grandparent() are using it to walk PCIe-like switch port
* hierarchy. A PCIe switch is comprised of a bridge device representing the
@@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
return map.resource;
}
+static int match_port_by_uport(struct device *dev, const void *data)
+{
+ const struct device *uport_dev = data;
+ struct cxl_port *port;
+
+ if (!is_cxl_port(dev))
+ return 0;
+
+ port = to_cxl_port(dev);
+ return uport_dev == port->uport_dev;
+}
+
+/*
+ * Function takes a device reference on the port device. Caller should do a
+ * put_device() when done.
+ */
+static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
+{
+ struct device *dev;
+
+ dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
+ if (dev)
+ return to_cxl_port(dev);
+ return NULL;
+}
+
+static int update_decoder_targets(struct device *dev, void *data)
+{
+ struct cxl_dport *dport = data;
+ struct cxl_switch_decoder *cxlsd;
+ struct cxl_decoder *cxld;
+ int i;
+
+ if (!is_switch_decoder(dev))
+ return 0;
+
+ cxlsd = to_cxl_switch_decoder(dev);
+ cxld = &cxlsd->cxld;
+ guard(rwsem_write)(&cxl_rwsem.region);
+
+ /* Short cut for passthrough decoder */
+ if (cxlsd->nr_targets == 1) {
+ cxlsd->target[0] = dport;
+ return 0;
+ }
+
+ for (i = 0; i < cxld->interleave_ways; i++) {
+ if (cxld->target_map[i] == dport->port_id) {
+ cxlsd->target[i] = dport;
+ dev_dbg(dev, "dport%d found in target list, index %d\n",
+ dport->port_id, i);
+ return 0;
+ }
+ }
+
+ return 0;
+}
+
+static int cxl_decoders_dport_update(struct cxl_dport *dport)
+{
+ return device_for_each_child(&dport->port->dev, dport,
+ update_decoder_targets);
+}
+
+static int cxl_switch_port_setup(struct cxl_port *port)
+{
+ struct cxl_hdm *cxlhdm;
+
+ cxlhdm = devm_cxl_setup_hdm(port, NULL);
+ if (!IS_ERR(cxlhdm))
+ return devm_cxl_enumerate_decoders(cxlhdm, NULL);
+
+ if (PTR_ERR(cxlhdm) != -ENODEV) {
+ dev_err(&port->dev, "Failed to map HDM decoder capability\n");
+ return PTR_ERR(cxlhdm);
+ }
+
+ if (port->possible_dports == 1) {
+ dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
+ return devm_cxl_add_passthrough_decoder(port);
+ }
+
+ dev_err(&port->dev, "HDM decoder capability not found\n");
+ return -ENXIO;
+}
+
+DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
+static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
+ struct device *dport_dev)
+{
+ struct cxl_dport *dport;
+ int rc;
+
+ guard(device)(&port->dev);
+
+ if (!port->dev.driver)
+ return ERR_PTR(-ENXIO);
+
+ dport = cxl_find_dport_by_dev(port, dport_dev);
+ if (dport)
+ return dport;
+
+ struct cxl_dport *new_dport __free(put_cxl_dport) =
+ devm_cxl_add_dport_by_dev(port, dport_dev);
+ if (IS_ERR(new_dport))
+ return new_dport;
+
+ cxl_switch_parse_cdat(port);
+
+ /*
+ * First instance of dport appearing, need to setup the port, including
+ * allocating decoders.
+ */
+ if (port->nr_dports == 1) {
+ rc = cxl_switch_port_setup(port);
+ if (rc)
+ return ERR_PTR(rc);
+ return no_free_ptr(new_dport);
+ }
+
+ rc = cxl_decoders_dport_update(new_dport);
+ if (rc)
+ return ERR_PTR(rc);
+
+ return no_free_ptr(new_dport);
+}
+
+static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
+ struct device *dport_dev)
+{
+ struct cxl_port *port __free(put_cxl_port) =
+ find_cxl_port_by_uport(uport_dev);
+
+ if (!port)
+ return ERR_PTR(-ENODEV);
+
+ return cxl_port_get_or_add_dport(port, dport_dev);
+}
+
+static struct cxl_dport *
+devm_cxl_create_or_extend_port(struct device *ep_dev,
+ struct cxl_port *parent_port,
+ struct cxl_dport *parent_dport,
+ struct device *uport_dev,
+ struct device *dport_dev)
+{
+ resource_size_t component_reg_phys;
+
+ guard(device)(&parent_port->dev);
+
+ if (!parent_port->dev.driver) {
+ dev_warn(ep_dev,
+ "port %s:%s disabled, failed to enumerate CXL.mem\n",
+ dev_name(&parent_port->dev), dev_name(uport_dev));
+ return ERR_PTR(-ENXIO);
+ }
+
+ struct cxl_port *port __free(put_cxl_port) =
+ find_cxl_port_by_uport(uport_dev);
+
+ if (!port) {
+ component_reg_phys = find_component_registers(uport_dev);
+ port = devm_cxl_add_port(&parent_port->dev, uport_dev,
+ component_reg_phys, parent_dport);
+ if (IS_ERR(port))
+ return (struct cxl_dport *)port;
+
+ /*
+ * Retry the lookup to make sure the port is found. The lookup
+ * takes a port device reference.
+ */
+ port = find_cxl_port_by_uport(uport_dev);
+ if (!port)
+ return ERR_PTR(-ENODEV);
+
+ dev_dbg(ep_dev, "created port %s:%s\n",
+ dev_name(&port->dev), dev_name(port->uport_dev));
+ }
+
+ return cxl_port_get_or_add_dport(port, dport_dev);
+}
+
static int add_port_attach_ep(struct cxl_memdev *cxlmd,
struct device *uport_dev,
struct device *dport_dev)
{
struct device *dparent = grandparent(dport_dev);
struct cxl_dport *dport, *parent_dport;
- resource_size_t component_reg_phys;
int rc;
if (is_cxl_host_bridge(dparent)) {
+ struct cxl_port *port __free(put_cxl_port) =
+ find_cxl_port_by_uport(uport_dev);
/*
* The iteration reached the topology root without finding the
* CXL-root 'cxl_port' on a previous iteration, fail for now to
* be re-probed after platform driver attaches.
*/
- dev_dbg(&cxlmd->dev, "%s is a root dport\n",
- dev_name(dport_dev));
- return -ENXIO;
+ if (!port) {
+ dev_dbg(&cxlmd->dev, "%s is a root dport\n",
+ dev_name(dport_dev));
+ return -ENXIO;
+ }
+
+ /*
+ * Although the port was found, there may not be a dport associated
+ * with it yet. Try to associate the dport with the port. On success,
+ * the iteration will restart with the dport now attached.
+ */
+ dport = devm_cxl_add_dport_by_uport(uport_dev,
+ dport_dev);
+ if (IS_ERR(dport))
+ return PTR_ERR(dport);
+
+ return 0;
}
struct cxl_port *parent_port __free(put_cxl_port) =
@@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
return -EAGAIN;
}
- /*
- * Definition with __free() here to keep the sequence of
- * dereferencing the device of the port before the parent_port releasing.
- */
- struct cxl_port *port __free(put_cxl_port) = NULL;
- scoped_guard(device, &parent_port->dev) {
- if (!parent_port->dev.driver) {
- dev_warn(&cxlmd->dev,
- "port %s:%s disabled, failed to enumerate CXL.mem\n",
- dev_name(&parent_port->dev), dev_name(uport_dev));
- return -ENXIO;
- }
+ dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
+ parent_dport, uport_dev,
+ dport_dev);
+ if (IS_ERR(dport))
+ return PTR_ERR(dport);
- port = find_cxl_port_at(parent_port, dport_dev, &dport);
- if (!port) {
- component_reg_phys = find_component_registers(uport_dev);
- port = devm_cxl_add_port(&parent_port->dev, uport_dev,
- component_reg_phys, parent_dport);
- if (IS_ERR(port))
- return PTR_ERR(port);
-
- /* retry find to pick up the new dport information */
- port = find_cxl_port_at(parent_port, dport_dev, &dport);
- if (!port)
- return -ENXIO;
- }
- }
-
- dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
- dev_name(&port->dev), dev_name(port->uport_dev));
rc = cxl_add_ep(dport, &cxlmd->dev);
if (rc == -EBUSY) {
/*
@@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
{
struct device *dev = &cxlmd->dev;
struct device *iter;
+ int ports_need_create = 0;
int rc;
/*
@@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
struct device *uport_dev;
struct cxl_dport *dport;
+ ports_need_create++;
+
if (is_cxl_host_bridge(dport_dev))
return 0;
@@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
cxl_gpf_port_setup(dport);
+ ports_need_create--;
/* Any more ports to add between this one and the root? */
if (!dev_is_cxl_root_child(&port->dev))
continue;
+ /*
+ * The 'ports_need_create' variable tracks a port being
+ * created as it goes through this iterative loop. It's
+ * incremented when it first enters the loop and decremented
+ * when the port is found. If at the root of the hierarchy
+ * and the variable is not 0, then it's missing a port
+ * creation somewhere in the hierarchy and should restart.
+ * For example in a setup where there's a PCI root port, a
+ * switch, and an endpoint, it is possible to get to the
+ * PCI root port and its creation, and the switch port is
+ * still missing because the root port didn't exist. This
+ * triggers a restart of the loop to create the switch port
+ * now with a present root port.
+ */
+ if (ports_need_create)
+ goto retry;
+
return 0;
}
@@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
if (rc == -EAGAIN)
continue;
/* failed to add ep or port */
- if (rc)
+ if (rc < 0)
return rc;
+
+ ports_need_create = 0;
/* port added, new descendants possible, start over */
goto retry;
}
@@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
device_lock_assert(&port->dev);
if (xa_empty(&port->dports))
- return -EINVAL;
+ return 0;
guard(rwsem_write)(&cxl_rwsem.region);
for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
- if (!dport)
- return -ENXIO;
+ if (!dport) {
+ /* dport may be activated later */
+ continue;
+ }
cxlsd->target[i] = dport;
}
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 71cc42d05248..bba62867df90 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1510,8 +1510,10 @@ static int cxl_port_setup_targets(struct cxl_port *port,
cxl_rr->nr_targets_set);
return -ENXIO;
}
- } else
+ } else {
cxlsd->target[cxl_rr->nr_targets_set] = ep->dport;
+ cxlsd->cxld.target_map[cxl_rr->nr_targets_set] = ep->dport->port_id;
+ }
inc = 1;
out_target_set:
cxl_rr->nr_targets_set += inc;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 87a905db5ffb..df10a01376c6 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -591,6 +591,7 @@ struct cxl_dax_region {
* @parent_dport: dport that points to this port in the parent
* @decoder_ida: allocator for decoder ids
* @reg_map: component and ras register mapping parameters
+ * @possible_dports: Total possible dports reported by hardware
* @nr_dports: number of entries in @dports
* @hdm_end: track last allocated HDM decoder instance for allocation ordering
* @commit_end: cursor to track highest committed decoder for commit ordering
@@ -612,6 +613,7 @@ struct cxl_port {
struct cxl_dport *parent_dport;
struct ida decoder_ida;
struct cxl_register_map reg_map;
+ int possible_dports;
int nr_dports;
int hdm_end;
int commit_end;
@@ -911,6 +913,7 @@ void cxl_coordinates_combine(struct access_coordinate *out,
struct access_coordinate *c2);
bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
+int cxl_port_get_possible_dports(struct cxl_port *port);
/*
* Unit test builds overrides this to __weak, find the 'strong' version
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index cf32dc50b7a6..941a7d7157bd 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -59,34 +59,17 @@ static int discover_region(struct device *dev, void *unused)
static int cxl_switch_port_probe(struct cxl_port *port)
{
- struct cxl_hdm *cxlhdm;
- int rc;
+ int dports;
/* Cache the data early to ensure is_visible() works */
read_cdat_data(port);
- rc = devm_cxl_port_enumerate_dports(port);
- if (rc < 0)
- return rc;
+ dports = cxl_port_get_possible_dports(port);
+ if (dports < 0)
+ return dports;
+ port->possible_dports = dports;
- cxl_switch_parse_cdat(port);
-
- cxlhdm = devm_cxl_setup_hdm(port, NULL);
- if (!IS_ERR(cxlhdm))
- return devm_cxl_enumerate_decoders(cxlhdm, NULL);
-
- if (PTR_ERR(cxlhdm) != -ENODEV) {
- dev_err(&port->dev, "Failed to map HDM decoder capability\n");
- return PTR_ERR(cxlhdm);
- }
-
- if (rc == 1) {
- dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
- return devm_cxl_add_passthrough_decoder(port);
- }
-
- dev_err(&port->dev, "HDM decoder capability not found\n");
- return -ENXIO;
+ return 0;
}
static int cxl_endpoint_port_probe(struct cxl_port *port)
--
2.50.1
* [PATCH v8 06/11] cxl/test: Add cxl_test support for cxl_port_get_possible_dports()
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (4 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 05/11] cxl: Defer dport allocation for switch ports Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 07/11] cxl/test: Add mock version of devm_cxl_add_dport_by_dev() Dave Jiang
` (5 subsequent siblings)
11 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC (permalink / raw)
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter, Li Ming
In the delayed dport allocation scheme, the total number of possible
dports needs to be discovered during port probe. Add the mock function
that does this for cxl_test.
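As a rough illustration of what "discovering the possible dports" amounts to
in the mock case, the sketch below walks a flat array of devices and counts
those whose parent is the port's upstream device, without allocating anything.
The names `mock_dev` and `count_possible_dports` are hypothetical stand-ins
for illustration only; they are not the cxl_test types or functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a platform device; not the kernel's type. */
struct mock_dev {
	const struct mock_dev *parent;
};

/*
 * Count the entries in @array whose parent is @uport. Only a count is
 * produced; no dport object is created at this point.
 */
static int count_possible_dports(const struct mock_dev *uport,
				 const struct mock_dev *const *array,
				 size_t array_size)
{
	int dports = 0;

	for (size_t i = 0; i < array_size; i++) {
		if (array[i]->parent != uport)
			continue;	/* belongs to a different port */
		dports++;
	}

	return dports;
}
```

The count can then be cached on the port at probe time, with the actual dport
objects created later as endpoints show up.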
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
tools/testing/cxl/Kbuild | 1 +
tools/testing/cxl/test/cxl.c | 48 +++++++++++++++++++++++++++++++++--
tools/testing/cxl/test/mock.c | 15 +++++++++++
tools/testing/cxl/test/mock.h | 1 +
4 files changed, 63 insertions(+), 2 deletions(-)
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index d07f14cb7aa4..e070cda6ca41 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -5,6 +5,7 @@ ldflags-y += --wrap=acpi_evaluate_integer
ldflags-y += --wrap=acpi_pci_find_root
ldflags-y += --wrap=nvdimm_bus_register
ldflags-y += --wrap=devm_cxl_port_enumerate_dports
+ldflags-y += --wrap=cxl_port_get_possible_dports
ldflags-y += --wrap=devm_cxl_setup_hdm
ldflags-y += --wrap=devm_cxl_add_passthrough_decoder
ldflags-y += --wrap=devm_cxl_enumerate_decoders
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 8faf4143d04e..ecb12a29f3ac 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -921,10 +921,12 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
return 0;
}
-static int mock_cxl_port_enumerate_dports(struct cxl_port *port)
+static int get_port_array(struct cxl_port *port,
+ struct platform_device ***port_array,
+ int *port_array_size)
{
struct platform_device **array;
- int i, array_size;
+ int array_size;
if (port->depth == 1) {
if (is_multi_bridge(port->uport_dev)) {
@@ -958,6 +960,47 @@ static int mock_cxl_port_enumerate_dports(struct cxl_port *port)
return -ENXIO;
}
+ *port_array = array;
+ *port_array_size = array_size;
+
+ return 0;
+}
+
+static int mock_cxl_port_get_possible_dports(struct cxl_port *port)
+{
+ struct platform_device **array;
+ int rc, i, array_size;
+ int dports = 0;
+
+ rc = get_port_array(port, &array, &array_size);
+ if (rc)
+ return rc;
+
+ for (i = 0; i < array_size; i++) {
+ struct platform_device *pdev = array[i];
+
+ if (pdev->dev.parent != port->uport_dev) {
+ dev_dbg(&port->dev, "%s: mismatch parent %s\n",
+ dev_name(port->uport_dev),
+ dev_name(pdev->dev.parent));
+ continue;
+ }
+
+ dports++;
+ }
+
+ return dports;
+}
+
+static int mock_cxl_port_enumerate_dports(struct cxl_port *port)
+{
+ struct platform_device **array;
+ int rc, i, array_size;
+
+ rc = get_port_array(port, &array, &array_size);
+ if (rc)
+ return rc;
+
for (i = 0; i < array_size; i++) {
struct platform_device *pdev = array[i];
struct cxl_dport *dport;
@@ -1036,6 +1079,7 @@ static struct cxl_mock_ops cxl_mock_ops = {
.acpi_evaluate_integer = mock_acpi_evaluate_integer,
.acpi_pci_find_root = mock_acpi_pci_find_root,
.devm_cxl_port_enumerate_dports = mock_cxl_port_enumerate_dports,
+ .cxl_port_get_possible_dports = mock_cxl_port_get_possible_dports,
.devm_cxl_setup_hdm = mock_cxl_setup_hdm,
.devm_cxl_add_passthrough_decoder = mock_cxl_add_passthrough_decoder,
.devm_cxl_enumerate_decoders = mock_cxl_enumerate_decoders,
diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
index 1989ae020df3..eefdc1f009c7 100644
--- a/tools/testing/cxl/test/mock.c
+++ b/tools/testing/cxl/test/mock.c
@@ -196,6 +196,21 @@ int __wrap_devm_cxl_port_enumerate_dports(struct cxl_port *port)
}
EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_port_enumerate_dports, "CXL");
+int __wrap_cxl_port_get_possible_dports(struct cxl_port *port)
+{
+ int dports, index;
+ struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+ if (ops && ops->is_mock_port(port->uport_dev))
+ dports = ops->cxl_port_get_possible_dports(port);
+ else
+ dports = cxl_port_get_possible_dports(port);
+ put_cxl_mock_ops(index);
+
+ return dports;
+}
+EXPORT_SYMBOL_NS_GPL(__wrap_cxl_port_get_possible_dports, "CXL");
+
int __wrap_cxl_await_media_ready(struct cxl_dev_state *cxlds)
{
int rc, index;
diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
index d1b0271d2822..413abb2dcb14 100644
--- a/tools/testing/cxl/test/mock.h
+++ b/tools/testing/cxl/test/mock.h
@@ -20,6 +20,7 @@ struct cxl_mock_ops {
bool (*is_mock_port)(struct device *dev);
bool (*is_mock_dev)(struct device *dev);
int (*devm_cxl_port_enumerate_dports)(struct cxl_port *port);
+ int (*cxl_port_get_possible_dports)(struct cxl_port *port);
struct cxl_hdm *(*devm_cxl_setup_hdm)(
struct cxl_port *port, struct cxl_endpoint_dvsec_info *info);
int (*devm_cxl_add_passthrough_decoder)(struct cxl_port *port);
--
2.50.1
* [PATCH v8 07/11] cxl/test: Add mock version of devm_cxl_add_dport_by_dev()
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (5 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 06/11] cxl/test: Add cxl_test support for cxl_port_get_possible_dports() Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 08/11] cxl/test: Add support to cxl_test for decoder enumeration mock functions Dave Jiang
` (4 subsequent siblings)
11 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC (permalink / raw)
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter, Li Ming
Outside of cxl_test, devm_cxl_add_dport_by_dev() works through the PCI
hierarchy. Under cxl_test, it needs to work through the platform device
hierarchy instead. Add the mock function for
devm_cxl_add_dport_by_dev().
When cxl_core calls a cxl_core exported function and that function is
mocked by cxl_test, the call chain causes a circular dependency issue. Dan
provided a workaround to avoid this issue. Apply that method to the late
dport allocation changes in order to enable cxl_test.
In cxl_core the function is defined with "__" prepended to its name. For
non-test builds of the kernel, a macro maps the original function name to
the "__" version. For cxl_test, a typedef and a function pointer allow the
function to be mocked.
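The function pointer part of this scheme can be sketched in plain C as
follows. This is a minimal userspace illustration, not the cxl_core code: all
names (`real_add_dport`, `add_dport_ptr`, `redirect_add_dport`, and the
"port_id >= 100 means mock" rule) are hypothetical, and the "__"/"_" prefixes
used by the patch are replaced with ordinary names to stay clear of reserved
identifiers outside the kernel.

```c
/* Hypothetical "core" implementation (the patch names this __foo()). */
static int real_add_dport(int port_id)
{
	return port_id;		/* pretend this is the real, PCI-backed path */
}

/* Pointer the core calls through; defaults to the real implementation. */
typedef int (*add_dport_fn)(int port_id);
static add_dport_fn add_dport_ptr = real_add_dport;

/* Public name used by callers inside the core (the patch's foo()). */
static int add_dport(int port_id)
{
	return add_dport_ptr(port_id);
}

/* Hypothetical test-module hook: mock path or fall back to the real one. */
static int redirect_add_dport(int port_id)
{
	if (port_id >= 100)		/* stand-in for "is this a mock port?" */
		return -port_id;	/* mock path */
	/* Call the real function directly so we never loop back via add_dport(). */
	return real_add_dport(port_id);
}

/* Swap the pointer when the test module registers or unregisters. */
static void register_mock(void)
{
	add_dport_ptr = redirect_add_dport;
}

static void unregister_mock(void)
{
	add_dport_ptr = real_add_dport;
}
```

Because the core only ever calls through the pointer, it does not need to link
against a symbol provided by the test module, which is the circular dependency
being avoided.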
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/cxl/core/pci.c | 7 ++++---
drivers/cxl/cxl.h | 20 ++++++++++++++++++
tools/testing/cxl/Kbuild | 1 +
tools/testing/cxl/cxl_core_exports.c | 12 +++++++++++
tools/testing/cxl/exports.h | 10 +++++++++
tools/testing/cxl/test/cxl.c | 31 ++++++++++++++++++++++++++++
tools/testing/cxl/test/mock.c | 23 +++++++++++++++++++++
tools/testing/cxl/test/mock.h | 2 ++
8 files changed, 103 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/cxl/exports.h
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index b9d770f1aa7b..71ed7b5991e6 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -25,14 +25,14 @@ module_param(media_ready_timeout, ushort, 0644);
MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
/**
- * devm_cxl_add_dport_by_dev - allocate a dport by dport device
+ * __devm_cxl_add_dport_by_dev - allocate a dport by dport device
* @port: cxl_port that hosts the dport
* @dport_dev: 'struct device' of the dport
*
* Returns the allocated dport on success or ERR_PTR() of -errno on error
*/
-struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
- struct device *dport_dev)
+struct cxl_dport *__devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev)
{
struct cxl_register_map map;
struct pci_dev *pdev;
@@ -61,6 +61,7 @@ struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
}
+EXPORT_SYMBOL_NS_GPL(__devm_cxl_add_dport_by_dev, "CXL");
struct cxl_walk_context {
struct pci_bus *bus;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index df10a01376c6..796c27e98afb 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -914,6 +914,10 @@ void cxl_coordinates_combine(struct access_coordinate *out,
bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
int cxl_port_get_possible_dports(struct cxl_port *port);
+struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev);
+struct cxl_dport *__devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev);
/*
* Unit test builds overrides this to __weak, find the 'strong' version
@@ -924,4 +928,20 @@ int cxl_port_get_possible_dports(struct cxl_port *port);
#endif
u16 cxl_gpf_get_dvsec(struct device *dev);
+
+/*
+ * Declarations for functions that are mocked by cxl_test and called by
+ * cxl_core. The respective functions are defined as __foo() and called by
+ * cxl_core as foo(). The macros below ensure that those functions
+ * exist as foo(). See tools/testing/cxl/cxl_core_exports.c and
+ * tools/testing/cxl/exports.h for setting up the mock functions. The dance
+ * is done to avoid a circular dependency where cxl_core calls a function that
+ * ends up being a mock function and goes to cxl_test where it calls a
+ * cxl_core function.
+ */
+#ifndef CXL_TEST_ENABLE
+#define DECLARE_TESTABLE(x) __##x
+#define devm_cxl_add_dport_by_dev DECLARE_TESTABLE(devm_cxl_add_dport_by_dev)
+#endif
+
#endif /* __CXL_H__ */
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index e070cda6ca41..d083cf8cca3b 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -22,6 +22,7 @@ CXL_SRC := $(DRIVERS)/cxl
CXL_CORE_SRC := $(DRIVERS)/cxl/core
ccflags-y := -I$(srctree)/drivers/cxl/
ccflags-y += -D__mock=__weak
+ccflags-y += -DCXL_TEST_ENABLE=1
ccflags-y += -DTRACE_INCLUDE_PATH=$(CXL_CORE_SRC) -I$(srctree)/drivers/cxl/core/
obj-m += cxl_acpi.o
diff --git a/tools/testing/cxl/cxl_core_exports.c b/tools/testing/cxl/cxl_core_exports.c
index f088792a8925..0d18abc1f5a3 100644
--- a/tools/testing/cxl/cxl_core_exports.c
+++ b/tools/testing/cxl/cxl_core_exports.c
@@ -2,6 +2,18 @@
/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
#include "cxl.h"
+#include "exports.h"
/* Exporting of cxl_core symbols that are only used by cxl_test */
EXPORT_SYMBOL_NS_GPL(cxl_num_decoders_committed, "CXL");
+
+cxl_add_dport_by_dev_fn _devm_cxl_add_dport_by_dev =
+ __devm_cxl_add_dport_by_dev;
+EXPORT_SYMBOL_NS_GPL(_devm_cxl_add_dport_by_dev, "CXL");
+
+struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev)
+{
+ return _devm_cxl_add_dport_by_dev(port, dport_dev);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_add_dport_by_dev, "CXL");
diff --git a/tools/testing/cxl/exports.h b/tools/testing/cxl/exports.h
new file mode 100644
index 000000000000..9261ce6f1197
--- /dev/null
+++ b/tools/testing/cxl/exports.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2025 Intel Corporation */
+#ifndef __MOCK_CXL_EXPORTS_H_
+#define __MOCK_CXL_EXPORTS_H_
+
+typedef struct cxl_dport *(*cxl_add_dport_by_dev_fn)(struct cxl_port *port,
+ struct device *dport_dev);
+extern cxl_add_dport_by_dev_fn _devm_cxl_add_dport_by_dev;
+
+#endif
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index ecb12a29f3ac..0ef2a8ec1fab 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -1022,6 +1022,36 @@ static int mock_cxl_port_enumerate_dports(struct cxl_port *port)
return 0;
}
+static struct cxl_dport *mock_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev)
+{
+ struct platform_device **array;
+ int rc, i, array_size;
+
+ rc = get_port_array(port, &array, &array_size);
+ if (rc)
+ return ERR_PTR(rc);
+
+ for (i = 0; i < array_size; i++) {
+ struct platform_device *pdev = array[i];
+
+ if (pdev->dev.parent != port->uport_dev) {
+ dev_dbg(&port->dev, "%s: mismatch parent %s\n",
+ dev_name(port->uport_dev),
+ dev_name(pdev->dev.parent));
+ continue;
+ }
+
+ if (&pdev->dev != dport_dev)
+ continue;
+
+ return devm_cxl_add_dport(port, &pdev->dev, pdev->id,
+ CXL_RESOURCE_NONE);
+ }
+
+ return ERR_PTR(-ENODEV);
+}
+
/*
* Faking the cxl_dpa_perf for the memdev when appropriate.
*/
@@ -1084,6 +1114,7 @@ static struct cxl_mock_ops cxl_mock_ops = {
.devm_cxl_add_passthrough_decoder = mock_cxl_add_passthrough_decoder,
.devm_cxl_enumerate_decoders = mock_cxl_enumerate_decoders,
.cxl_endpoint_parse_cdat = mock_cxl_endpoint_parse_cdat,
+ .devm_cxl_add_dport_by_dev = mock_cxl_add_dport_by_dev,
.list = LIST_HEAD_INIT(cxl_mock_ops.list),
};
diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
index eefdc1f009c7..fd73a06f6ccb 100644
--- a/tools/testing/cxl/test/mock.c
+++ b/tools/testing/cxl/test/mock.c
@@ -10,12 +10,18 @@
#include <cxlmem.h>
#include <cxlpci.h>
#include "mock.h"
+#include "../exports.h"
static LIST_HEAD(mock);
+static struct cxl_dport *
+redirect_devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev);
+
void register_cxl_mock_ops(struct cxl_mock_ops *ops)
{
list_add_rcu(&ops->list, &mock);
+ _devm_cxl_add_dport_by_dev = redirect_devm_cxl_add_dport_by_dev;
}
EXPORT_SYMBOL_GPL(register_cxl_mock_ops);
@@ -23,6 +29,7 @@ DEFINE_STATIC_SRCU(cxl_mock_srcu);
void unregister_cxl_mock_ops(struct cxl_mock_ops *ops)
{
+ _devm_cxl_add_dport_by_dev = __devm_cxl_add_dport_by_dev;
list_del_rcu(&ops->list);
synchronize_srcu(&cxl_mock_srcu);
}
@@ -326,6 +333,22 @@ void __wrap_cxl_dport_init_ras_reporting(struct cxl_dport *dport, struct device
}
EXPORT_SYMBOL_NS_GPL(__wrap_cxl_dport_init_ras_reporting, "CXL");
+struct cxl_dport *redirect_devm_cxl_add_dport_by_dev(struct cxl_port *port,
+ struct device *dport_dev)
+{
+ int index;
+ struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+ struct cxl_dport *dport;
+
+ if (ops && ops->is_mock_port(port->uport_dev))
+ dport = ops->devm_cxl_add_dport_by_dev(port, dport_dev);
+ else
+ dport = devm_cxl_add_dport_by_dev(port, dport_dev);
+ put_cxl_mock_ops(index);
+
+ return dport;
+}
+
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("cxl_test: emulation module");
MODULE_IMPORT_NS("ACPI");
diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
index 413abb2dcb14..7c301d305ae3 100644
--- a/tools/testing/cxl/test/mock.h
+++ b/tools/testing/cxl/test/mock.h
@@ -27,6 +27,8 @@ struct cxl_mock_ops {
int (*devm_cxl_enumerate_decoders)(
struct cxl_hdm *hdm, struct cxl_endpoint_dvsec_info *info);
void (*cxl_endpoint_parse_cdat)(struct cxl_port *port);
+ struct cxl_dport *(*devm_cxl_add_dport_by_dev)(
+ struct cxl_port *port, struct device *dport_dev);
};
void register_cxl_mock_ops(struct cxl_mock_ops *ops);
--
2.50.1
* [PATCH v8 08/11] cxl/test: Add support to cxl_test for decoder enumeration mock functions
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (6 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 07/11] cxl/test: Add mock version of devm_cxl_add_dport_by_dev() Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization Dave Jiang
` (3 subsequent siblings)
11 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC (permalink / raw)
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter, Jonathan Cameron
When cxl_core calls a cxl_core exported function and that function is
mocked by cxl_test, the call chain causes a circular dependency issue. Dan
provided a workaround to avoid this issue. Apply that method to the late
host bridge uport mapping update changes in order to enable
cxl_test.
The following functions are being modified:
devm_cxl_add_passthrough_decoder()
devm_cxl_setup_hdm()
devm_cxl_enumerate_decoders()
In cxl_core these functions are defined with "__" prepended to their names.
For non-test builds of the kernel, macros map the original function names
to the "__" versions. For cxl_test, typedefs and function pointers allow
those functions to be mocked.
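The name-mapping half of the scheme, for non-test builds, can be illustrated
with the preprocessor alone. The sketch below is an assumption-laden userspace
example: `TEST_ENABLE`, `setup_hdm`, `impl_setup_hdm`, and `probe` are
hypothetical names, and an `impl_` prefix stands in for the "__" prefix used
in cxl_core.

```c
/* Define TEST_ENABLE at build time to simulate a test build (hypothetical). */
#ifndef TEST_ENABLE
#define DECLARE_TESTABLE(x) impl_##x
#define setup_hdm DECLARE_TESTABLE(setup_hdm)
#endif

/* The implementation is always defined under the prefixed name. */
static int impl_setup_hdm(int port_id)
{
	return port_id + 1;
}

#ifdef TEST_ENABLE
/* Test build: a separate wrapper owns the plain name and may redirect. */
static int setup_hdm(int port_id)
{
	return impl_setup_hdm(port_id);
}
#endif

static int probe(int port_id)
{
	/* Non-test build: the preprocessor rewrites this to impl_setup_hdm(). */
	return setup_hdm(port_id);
}
```

In the non-test build no wrapper function exists at all, so callers pay no
indirection cost; only the test build introduces the extra hop.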
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
Jonathan acked this in v3; nothing has changed since.
Needed now because decoder setup has moved to the core.
---
drivers/cxl/core/hdm.c | 27 +++++++++++++----------
drivers/cxl/cxl.h | 9 ++++++++
tools/testing/cxl/Kbuild | 3 ---
tools/testing/cxl/cxl_core_exports.c | 30 ++++++++++++++++++++++++++
tools/testing/cxl/exports.h | 11 ++++++++++
tools/testing/cxl/test/mock.c | 32 ++++++++++++++++++----------
6 files changed, 87 insertions(+), 25 deletions(-)
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 5263e9eba7d0..ef30f93e1e10 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -42,14 +42,19 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
return 0;
}
-/*
+/**
+ * __devm_cxl_add_passthrough_decoder - Add a passthrough decoder
+ * @port: The cxl_port context
+ *
+ * Return 0 on success or errno on failure.
+ *
* Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
* single ported host-bridges need not publish a decoder capability when a
* passthrough decode can be assumed, i.e. all transactions that the uport sees
* are claimed and passed to the single dport. Disable the range until the first
* CXL region is enumerated / activated.
*/
-int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
+int __devm_cxl_add_passthrough_decoder(struct cxl_port *port)
{
struct cxl_switch_decoder *cxlsd;
struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
@@ -69,7 +74,7 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
return add_hdm_decoder(port, &cxlsd->cxld);
}
-EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
+EXPORT_SYMBOL_NS_GPL(__devm_cxl_add_passthrough_decoder, "CXL");
static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
{
@@ -135,12 +140,12 @@ static bool should_emulate_decoders(struct cxl_endpoint_dvsec_info *info)
}
/**
- * devm_cxl_setup_hdm - map HDM decoder component registers
+ * __devm_cxl_setup_hdm - map HDM decoder component registers
* @port: cxl_port to map
* @info: cached DVSEC range register info
*/
-struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
- struct cxl_endpoint_dvsec_info *info)
+struct cxl_hdm *__devm_cxl_setup_hdm(struct cxl_port *port,
+ struct cxl_endpoint_dvsec_info *info)
{
struct cxl_register_map *reg_map = &port->reg_map;
struct device *dev = &port->dev;
@@ -195,7 +200,7 @@ struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
return cxlhdm;
}
-EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_hdm, "CXL");
+EXPORT_SYMBOL_NS_GPL(__devm_cxl_setup_hdm, "CXL");
static void __cxl_dpa_debug(struct seq_file *file, struct resource *r, int depth)
{
@@ -1156,12 +1161,12 @@ static void cxl_settle_decoders(struct cxl_hdm *cxlhdm)
}
/**
- * devm_cxl_enumerate_decoders - add decoder objects per HDM register set
+ * __devm_cxl_enumerate_decoders - add decoder objects per HDM register set
* @cxlhdm: Structure to populate with HDM capabilities
* @info: cached DVSEC range register info
*/
-int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
- struct cxl_endpoint_dvsec_info *info)
+int __devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
+ struct cxl_endpoint_dvsec_info *info)
{
void __iomem *hdm = cxlhdm->regs.hdm_decoder;
struct cxl_port *port = cxlhdm->port;
@@ -1216,4 +1221,4 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
return 0;
}
-EXPORT_SYMBOL_NS_GPL(devm_cxl_enumerate_decoders, "CXL");
+EXPORT_SYMBOL_NS_GPL(__devm_cxl_enumerate_decoders, "CXL");
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 796c27e98afb..ba8811388dc8 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -817,9 +817,15 @@ struct cxl_endpoint_dvsec_info {
struct cxl_hdm;
struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
struct cxl_endpoint_dvsec_info *info);
+struct cxl_hdm *__devm_cxl_setup_hdm(struct cxl_port *port,
+ struct cxl_endpoint_dvsec_info *info);
int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
struct cxl_endpoint_dvsec_info *info);
+int __devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
+ struct cxl_endpoint_dvsec_info *info);
int devm_cxl_add_passthrough_decoder(struct cxl_port *port);
+int __devm_cxl_add_passthrough_decoder(struct cxl_port *port);
+
struct cxl_dev_state;
int cxl_dvsec_rr_decode(struct cxl_dev_state *cxlds,
struct cxl_endpoint_dvsec_info *info);
@@ -942,6 +948,9 @@ u16 cxl_gpf_get_dvsec(struct device *dev);
#ifndef CXL_TEST_ENABLE
#define DECLARE_TESTABLE(x) __##x
#define devm_cxl_add_dport_by_dev DECLARE_TESTABLE(devm_cxl_add_dport_by_dev)
+#define devm_cxl_enumerate_decoders DECLARE_TESTABLE(devm_cxl_enumerate_decoders)
+#define devm_cxl_setup_hdm DECLARE_TESTABLE(devm_cxl_setup_hdm)
+#define devm_cxl_add_passthrough_decoder DECLARE_TESTABLE(devm_cxl_add_passthrough_decoder)
#endif
#endif /* __CXL_H__ */
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index d083cf8cca3b..cc2d2c25b5f9 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -6,9 +6,6 @@ ldflags-y += --wrap=acpi_pci_find_root
ldflags-y += --wrap=nvdimm_bus_register
ldflags-y += --wrap=devm_cxl_port_enumerate_dports
ldflags-y += --wrap=cxl_port_get_possible_dports
-ldflags-y += --wrap=devm_cxl_setup_hdm
-ldflags-y += --wrap=devm_cxl_add_passthrough_decoder
-ldflags-y += --wrap=devm_cxl_enumerate_decoders
ldflags-y += --wrap=cxl_await_media_ready
ldflags-y += --wrap=cxl_hdm_decode_init
ldflags-y += --wrap=cxl_dvsec_rr_decode
diff --git a/tools/testing/cxl/cxl_core_exports.c b/tools/testing/cxl/cxl_core_exports.c
index 0d18abc1f5a3..45fbe8804ddf 100644
--- a/tools/testing/cxl/cxl_core_exports.c
+++ b/tools/testing/cxl/cxl_core_exports.c
@@ -17,3 +17,33 @@ struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
return _devm_cxl_add_dport_by_dev(port, dport_dev);
}
EXPORT_SYMBOL_NS_GPL(devm_cxl_add_dport_by_dev, "CXL");
+
+cxl_add_pt_decoder_fn _devm_cxl_add_passthrough_decoder =
+ __devm_cxl_add_passthrough_decoder;
+EXPORT_SYMBOL_NS_GPL(_devm_cxl_add_passthrough_decoder, "CXL");
+
+int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
+{
+ return _devm_cxl_add_passthrough_decoder(port);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
+
+cxl_setup_hdm_fn _devm_cxl_setup_hdm = __devm_cxl_setup_hdm;
+EXPORT_SYMBOL_NS_GPL(_devm_cxl_setup_hdm, "CXL");
+
+struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
+ struct cxl_endpoint_dvsec_info *info)
+{
+ return _devm_cxl_setup_hdm(port, info);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_hdm, "CXL");
+
+cxl_enum_decoders_fn _devm_cxl_enumerate_decoders = __devm_cxl_enumerate_decoders;
+EXPORT_SYMBOL_NS_GPL(_devm_cxl_enumerate_decoders, "CXL");
+
+int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
+ struct cxl_endpoint_dvsec_info *info)
+{
+ return _devm_cxl_enumerate_decoders(cxlhdm, info);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_enumerate_decoders, "CXL");
diff --git a/tools/testing/cxl/exports.h b/tools/testing/cxl/exports.h
index 9261ce6f1197..7f06207f9f8f 100644
--- a/tools/testing/cxl/exports.h
+++ b/tools/testing/cxl/exports.h
@@ -7,4 +7,15 @@ typedef struct cxl_dport *(*cxl_add_dport_by_dev_fn)(struct cxl_port *port,
struct device *dport_dev);
extern cxl_add_dport_by_dev_fn _devm_cxl_add_dport_by_dev;
+typedef struct cxl_hdm *(*cxl_setup_hdm_fn)(struct cxl_port *port,
+ struct cxl_endpoint_dvsec_info *info);
+extern cxl_setup_hdm_fn _devm_cxl_setup_hdm;
+
+typedef int (*cxl_enum_decoders_fn)(struct cxl_hdm *cxlhdm,
+ struct cxl_endpoint_dvsec_info *info);
+extern cxl_enum_decoders_fn _devm_cxl_enumerate_decoders;
+
+typedef int (*cxl_add_pt_decoder_fn)(struct cxl_port *port);
+extern cxl_add_pt_decoder_fn _devm_cxl_add_passthrough_decoder;
+
#endif
diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
index fd73a06f6ccb..8b1ab7f6cb5a 100644
--- a/tools/testing/cxl/test/mock.c
+++ b/tools/testing/cxl/test/mock.c
@@ -17,11 +17,21 @@ static LIST_HEAD(mock);
static struct cxl_dport *
redirect_devm_cxl_add_dport_by_dev(struct cxl_port *port,
struct device *dport_dev);
+static struct cxl_hdm *
+redirect_devm_cxl_setup_hdm(struct cxl_port *port,
+ struct cxl_endpoint_dvsec_info *info);
+static int
+redirect_devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
+ struct cxl_endpoint_dvsec_info *info);
+static int redirect_devm_cxl_add_passthrough_decoder(struct cxl_port *port);
void register_cxl_mock_ops(struct cxl_mock_ops *ops)
{
list_add_rcu(&ops->list, &mock);
_devm_cxl_add_dport_by_dev = redirect_devm_cxl_add_dport_by_dev;
+ _devm_cxl_add_passthrough_decoder = redirect_devm_cxl_add_passthrough_decoder;
+ _devm_cxl_enumerate_decoders = redirect_devm_cxl_enumerate_decoders;
+ _devm_cxl_setup_hdm = redirect_devm_cxl_setup_hdm;
}
EXPORT_SYMBOL_GPL(register_cxl_mock_ops);
@@ -29,6 +39,9 @@ DEFINE_STATIC_SRCU(cxl_mock_srcu);
void unregister_cxl_mock_ops(struct cxl_mock_ops *ops)
{
+ _devm_cxl_setup_hdm = __devm_cxl_setup_hdm;
+ _devm_cxl_enumerate_decoders = __devm_cxl_enumerate_decoders;
+ _devm_cxl_add_passthrough_decoder = __devm_cxl_add_passthrough_decoder;
_devm_cxl_add_dport_by_dev = __devm_cxl_add_dport_by_dev;
list_del_rcu(&ops->list);
synchronize_srcu(&cxl_mock_srcu);
@@ -138,8 +151,8 @@ __wrap_nvdimm_bus_register(struct device *dev,
}
EXPORT_SYMBOL_GPL(__wrap_nvdimm_bus_register);
-struct cxl_hdm *__wrap_devm_cxl_setup_hdm(struct cxl_port *port,
- struct cxl_endpoint_dvsec_info *info)
+struct cxl_hdm *redirect_devm_cxl_setup_hdm(struct cxl_port *port,
+ struct cxl_endpoint_dvsec_info *info)
{
int index;
@@ -149,14 +162,13 @@ struct cxl_hdm *__wrap_devm_cxl_setup_hdm(struct cxl_port *port,
if (ops && ops->is_mock_port(port->uport_dev))
cxlhdm = ops->devm_cxl_setup_hdm(port, info);
else
- cxlhdm = devm_cxl_setup_hdm(port, info);
+ cxlhdm = __devm_cxl_setup_hdm(port, info);
put_cxl_mock_ops(index);
return cxlhdm;
}
-EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_setup_hdm, "CXL");
-int __wrap_devm_cxl_add_passthrough_decoder(struct cxl_port *port)
+int redirect_devm_cxl_add_passthrough_decoder(struct cxl_port *port)
{
int rc, index;
struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
@@ -164,15 +176,14 @@ int __wrap_devm_cxl_add_passthrough_decoder(struct cxl_port *port)
if (ops && ops->is_mock_port(port->uport_dev))
rc = ops->devm_cxl_add_passthrough_decoder(port);
else
- rc = devm_cxl_add_passthrough_decoder(port);
+ rc = __devm_cxl_add_passthrough_decoder(port);
put_cxl_mock_ops(index);
return rc;
}
-EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_add_passthrough_decoder, "CXL");
-int __wrap_devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
- struct cxl_endpoint_dvsec_info *info)
+int redirect_devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
+ struct cxl_endpoint_dvsec_info *info)
{
int rc, index;
struct cxl_port *port = cxlhdm->port;
@@ -181,12 +192,11 @@ int __wrap_devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
if (ops && ops->is_mock_port(port->uport_dev))
rc = ops->devm_cxl_enumerate_decoders(cxlhdm, info);
else
- rc = devm_cxl_enumerate_decoders(cxlhdm, info);
+ rc = __devm_cxl_enumerate_decoders(cxlhdm, info);
put_cxl_mock_ops(index);
return rc;
}
-EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_enumerate_decoders, "CXL");
int __wrap_devm_cxl_port_enumerate_dports(struct cxl_port *port)
{
--
2.50.1
* [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (7 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 08/11] cxl/test: Add support to cxl_test for decoder enumeration mock functions Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-15 13:04 ` Jonathan Cameron
2025-08-14 22:21 ` [PATCH v8 10/11] cxl: Change sslbis handler to only handle single dport Dave Jiang
` (2 subsequent siblings)
11 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC (permalink / raw)
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter
cxl_test uses mock functions for decoder enumeration. Add initialization
of the cxld->target_map[] for cxl_test based decoders in the mock
functions.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
tools/testing/cxl/test/cxl.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 0ef2a8ec1fab..2d50193d10fe 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -818,15 +818,21 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
*/
if (WARN_ON(!dev))
continue;
+
cxlsd = to_cxl_switch_decoder(dev);
if (i == 0) {
/* put cxl_mem.4 second in the decode order */
- if (pdev->id == 4)
+ if (pdev->id == 4) {
cxlsd->target[1] = dport;
- else
+ cxld->target_map[1] = dport->port_id;
+ } else {
cxlsd->target[0] = dport;
- } else
+ cxld->target_map[0] = dport->port_id;
+ }
+ } else {
cxlsd->target[0] = dport;
+ cxld->target_map[0] = dport->port_id;
+ }
cxld = &cxlsd->cxld;
cxld->target_type = CXL_DECODER_HOSTONLYMEM;
cxld->flags = CXL_DECODER_F_ENABLE;
--
2.50.1
* [PATCH v8 10/11] cxl: Change sslbis handler to only handle single dport
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (8 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support Dave Jiang
2025-08-19 9:39 ` [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Robert Richter
11 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC (permalink / raw)
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter, Gregory Price,
Jonathan Cameron, Li Ming
While it is harmless to run cxl_switch_parse_cdat() multiple times, doing
so is inefficient in the current scheme, where the memdev probe path
updates one dport at a time. Change the input parameter to the specific
dport being updated so that the SSLBIS information is picked up for just
that dport.
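A minimal userspace sketch of the per-dport matching idea follows. It is an
illustration only, not the kernel handler: struct sslbis_entry,
sslbis_apply_one() and the flat value field are hypothetical, and the
wildcard value 0xffff is assumed to mirror ACPI_CDAT_SSLBIS_ANY_PORT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: wildcard id that matches any downstream port. */
#define SSLBIS_ANY_PORT 0xffff

/* Hypothetical, simplified view of one SSLBIS entry. */
struct sslbis_entry {
	uint16_t dsp_id;	/* downstream port id the entry applies to */
	uint64_t val;		/* already-normalized latency or bandwidth */
};

/*
 * Scan the entries for the single port being updated. Stop at the first
 * entry that names this port or the wildcard, instead of walking every
 * dport of the switch for every entry.
 *
 * Returns 1 if a value was stored in *coord, 0 if no entry matched.
 */
static int sslbis_apply_one(const struct sslbis_entry *entries, size_t n,
			    uint16_t port_id, uint64_t *coord)
{
	assert(coord != NULL);

	for (size_t i = 0; i < n; i++) {
		if (entries[i].dsp_id == SSLBIS_ANY_PORT ||
		    entries[i].dsp_id == port_id) {
			*coord = entries[i].val;
			return 1;
		}
	}
	return 0;
}
```

The caller passes only the port that just appeared, so the work per call is
bounded by the number of table entries rather than entries times dports.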
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v8:
- Will consider Robert's suggestion to collect once in a follow-on
implementation.
---
drivers/cxl/core/cdat.c | 23 ++++++++++-------------
drivers/cxl/core/port.c | 2 +-
drivers/cxl/cxl.h | 2 +-
3 files changed, 12 insertions(+), 15 deletions(-)
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index b156b81a9b20..84c50e7e8d0a 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -440,8 +440,8 @@ static int cdat_sslbis_handler(union acpi_subtable_headers *header, void *arg,
} *tbl = (struct acpi_cdat_sslbis_table *)header;
int size = sizeof(header->cdat) + sizeof(tbl->sslbis_header);
struct acpi_cdat_sslbis *sslbis;
- struct cxl_port *port = arg;
- struct device *dev = &port->dev;
+ struct cxl_dport *dport = arg;
+ struct device *dev = &dport->port->dev;
int remain, entries, i;
u16 len;
@@ -467,8 +467,6 @@ static int cdat_sslbis_handler(union acpi_subtable_headers *header, void *arg,
u16 y = le16_to_cpu((__force __le16)tbl->entries[i].porty_id);
__le64 le_base;
__le16 le_val;
- struct cxl_dport *dport;
- unsigned long index;
u16 dsp_id;
u64 val;
@@ -499,28 +497,27 @@ static int cdat_sslbis_handler(union acpi_subtable_headers *header, void *arg,
val = cdat_normalize(le16_to_cpu(le_val), le64_to_cpu(le_base),
sslbis->data_type);
- xa_for_each(&port->dports, index, dport) {
- if (dsp_id == ACPI_CDAT_SSLBIS_ANY_PORT ||
- dsp_id == dport->port_id) {
- cxl_access_coordinate_set(dport->coord,
- sslbis->data_type,
- val);
- }
+ if (dsp_id == ACPI_CDAT_SSLBIS_ANY_PORT ||
+ dsp_id == dport->port_id) {
+ cxl_access_coordinate_set(dport->coord,
+ sslbis->data_type, val);
+ return 0;
}
}
return 0;
}
-void cxl_switch_parse_cdat(struct cxl_port *port)
+void cxl_switch_parse_cdat(struct cxl_dport *dport)
{
+ struct cxl_port *port = dport->port;
int rc;
if (!port->cdat.table)
return;
rc = cdat_table_parse(ACPI_CDAT_TYPE_SSLBIS, cdat_sslbis_handler,
- port, port->cdat.table, port->cdat.length);
+ dport, port->cdat.table, port->cdat.length);
rc = cdat_table_parse_output(rc);
if (rc)
dev_dbg(&port->dev, "Failed to parse SSLBIS: %d\n", rc);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 877f888ee8f5..cf309d72fa3d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1649,7 +1649,7 @@ static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
if (IS_ERR(new_dport))
return new_dport;
- cxl_switch_parse_cdat(port);
+ cxl_switch_parse_cdat(new_dport);
/*
* First instance of dport appearing, need to setup the port, including
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ba8811388dc8..637f7b94d7e9 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -904,7 +904,7 @@ static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
#endif
void cxl_endpoint_parse_cdat(struct cxl_port *port);
-void cxl_switch_parse_cdat(struct cxl_port *port);
+void cxl_switch_parse_cdat(struct cxl_dport *dport);
int cxl_endpoint_get_perf_coordinates(struct cxl_port *port,
struct access_coordinate *coord);
--
2.50.1
* [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (9 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 10/11] cxl: Change sslbis handler to only handle single dport Dave Jiang
@ 2025-08-14 22:21 ` Dave Jiang
2025-08-15 13:15 ` Jonathan Cameron
2025-08-19 9:39 ` [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Robert Richter
11 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-14 22:21 UTC (permalink / raw)
To: linux-cxl
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, rrichter
Record decoder values at init and mock_decoder_commit() time, and
restore them at the next invocation of mock_init_hdm_decoder(). Add 2
attributes to the cxl_test "cxl_acpi" device to optionally flush the
cache of topology decoder values, or disable updating the decoder at
mock_decoder_reset() time.
This enables replaying a saved decoder configuration when re-triggering
a topology scan by re-binding the cxl_acpi driver to "cxl_acpi.0" (the
cxl_test emulation of an ACPI0017 instance).
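The save/restore flow can be pictured with the following userspace sketch.
It is a simplified model under stated assumptions, not the cxl_test
implementation: a fixed array stands in for the xarray, struct decoder_state
holds only a few illustrative fields, and every name (registry_key(),
registry_find(), registry_save(), decoder_init(), REGISTRY_SLOTS) is
hypothetical. The key assumes the stable device pointer is at least 8-byte
aligned, so that ids below 128 do not collide with pointer bits after the
shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REGISTRY_SLOTS 16	/* hypothetical capacity for this sketch */

/* Hypothetical subset of the decoder settings worth replaying. */
struct decoder_state {
	uint64_t start, end;
	int ways;
	int enabled;
};

struct registry_entry {
	unsigned long key;
	int used;
	struct decoder_state saved;
};

static struct registry_entry registry[REGISTRY_SLOTS];

/* Combine a pointer that is stable across re-enumeration with the id. */
static unsigned long registry_key(const void *stable_dev, unsigned int id)
{
	assert(id < 128);
	return ((unsigned long)(uintptr_t)stable_dev << 4) | id;
}

static struct registry_entry *registry_find(unsigned long key)
{
	for (size_t i = 0; i < REGISTRY_SLOTS; i++)
		if (registry[i].used && registry[i].key == key)
			return &registry[i];
	return NULL;
}

/* Record (or update) the saved state; returns -1 if the table is full. */
static int registry_save(unsigned long key, const struct decoder_state *st)
{
	struct registry_entry *e = registry_find(key);

	for (size_t i = 0; !e && i < REGISTRY_SLOTS; i++)
		if (!registry[i].used)
			e = &registry[i];
	if (!e)
		return -1;

	e->key = key;
	e->used = 1;
	e->saved = *st;
	return 0;
}

/* At init: replay a saved configuration if present, else use defaults. */
static void decoder_init(unsigned long key, struct decoder_state *st)
{
	struct registry_entry *e = registry_find(key);

	if (e) {
		*st = e->saved;
		return;
	}
	*st = (struct decoder_state){ 0 };
	registry_save(key, st);
}
```

In this model, a commit would call registry_save() with the enabled state,
and a later decoder_init() with the same key returns that state instead of
the defaults, which is the replay behavior shown in the transcript below.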
# modprobe cxl_test
# cxl list -RB -b cxl_test -u
{
"bus":"root3",
"provider":"cxl_test",
"regions:root3":[
{
"region":"region5",
"resource":"0xf010000000",
"size":"512.00 MiB (536.87 MB)",
"type":"ram",
"interleave_ways":2,
"interleave_granularity":4096,
"decode_state":"commit"
}
]
}
# echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_registry_reset_disable
# echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind
# cxl list -RB -b cxl_test -u
# echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind
# cxl list -RB -b cxl_test -u
{
"bus":"root3",
"provider":"cxl_test",
"regions:root3":[
{
"region":"region5",
"resource":"0xf010000000",
"size":"512.00 MiB (536.87 MB)",
"type":"ram",
"interleave_ways":2,
"interleave_granularity":4096,
"decode_state":"commit"
}
]
}
[dj: Added support for delayed dport initialization ]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
- Will have a cxl-replay test script in CXL CLI package
- This allowed me to test assembly and replay for manually created CXL regions
---
tools/testing/cxl/test/cxl.c | 300 ++++++++++++++++++++++++++++++++++-
1 file changed, 298 insertions(+), 2 deletions(-)
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 2d50193d10fe..e0da2a48d2a8 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -47,6 +47,9 @@ struct platform_device *cxl_mem_single[NR_MEM_SINGLE];
static struct platform_device *cxl_rch[NR_CXL_RCH];
static struct platform_device *cxl_rcd[NR_CXL_RCH];
+static DEFINE_XARRAY(decoder_registry);
+static bool decoder_registry_reset_disable;
+
static inline bool is_multi_bridge(struct device *dev)
{
int i;
@@ -671,6 +674,164 @@ static int map_targets(struct device *dev, void *data)
return 0;
}
+static unsigned long cxld_registry_index(struct cxl_decoder *cxld)
+{
+ struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+
+ /*
+ * Upper nibble of a kernel pointer is 0xf, chop that to make
+ * space for a cxl_decoder id which should be less than 128
+ * given decoder count is a 4-bit field.
+ *
+ * While @port is reallocated each enumeration, @port->uport_dev
+ * is stable.
+ */
+ dev_WARN_ONCE(&port->dev, cxld->id >= 128,
+ "decoder id:%d out of range\n", cxld->id);
+ return (((unsigned long) port->uport_dev) << 4) | cxld->id;
+}
+
+struct cxl_test_decoder {
+ union {
+ struct cxl_switch_decoder cxlsd;
+ struct cxl_endpoint_decoder cxled;
+ };
+ union {
+ struct cxl_dport *targets[CXL_DECODER_MAX_INTERLEAVE];
+ struct range dpa_range;
+ };
+};
+
+static struct cxl_test_decoder *cxld_registry_find(struct cxl_decoder *cxld)
+{
+ return xa_load(&decoder_registry, cxld_registry_index(cxld));
+}
+
+#define dbg_cxld(port, msg, cxld) \
+ do { \
+ struct cxl_decoder *___d = (cxld); \
+ dev_dbg((port)->uport_dev, \
+ "decoder%d: %s range: %#llx-%#llx iw: %d ig: %d flags: %#lx\n", \
+ ___d->id, msg, ___d->hpa_range.start, \
+ ___d->hpa_range.end + 1, ___d->interleave_ways, \
+ ___d->interleave_granularity, ___d->flags); \
+ } while (0)
+
+static int mock_decoder_commit(struct cxl_decoder *cxld);
+static void mock_decoder_reset(struct cxl_decoder *cxld);
+
+static void cxld_copy(struct cxl_decoder *a, struct cxl_decoder *b)
+{
+ a->id = b->id;
+ a->hpa_range = b->hpa_range;
+ a->interleave_ways = b->interleave_ways;
+ a->interleave_granularity = b->interleave_granularity;
+ a->target_type = b->target_type;
+ a->flags = b->flags;
+ a->commit = mock_decoder_commit;
+ a->reset = mock_decoder_reset;
+}
+
+static void cxld_registry_restore(struct cxl_decoder *cxld, struct cxl_test_decoder *td)
+{
+ struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+
+ if (is_switch_decoder(&cxld->dev)) {
+ struct cxl_switch_decoder *cxlsd = to_cxl_switch_decoder(&cxld->dev);
+
+ dbg_cxld(port, "restore", &td->cxlsd.cxld);
+ cxld_copy(cxld, &td->cxlsd.cxld);
+ WARN_ON(cxlsd->nr_targets != td->cxlsd.nr_targets);
+
+ /* convert saved dport devs to dports */
+ for (int i = 0; i < cxlsd->nr_targets; i++) {
+ struct cxl_dport *dport;
+ struct device *dev;
+
+ if (!td->cxlsd.target[i])
+ continue;
+ /* Recall this dport ptr is overloaded with a 'struct device' at save */
+ dev = (struct device *)td->cxlsd.target[i];
+ dport = cxl_find_dport_by_dev(port, dev);
+ if (!dport) {
+ cxld->target_map[i] = td->cxlsd.cxld.target_map[i];
+ continue;
+ }
+ cxlsd->target[i] = dport;
+ cxld->target_map[i] = dport->port_id;
+ }
+ } else {
+ struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev);
+
+ dbg_cxld(port, "restore", &td->cxled.cxld);
+ cxld_copy(cxld, &td->cxled.cxld);
+ cxled->state = td->cxled.state;
+ cxled->skip = td->cxled.skip;
+ if (range_len(&td->dpa_range))
+ devm_cxl_dpa_reserve(cxled, td->dpa_range.start,
+ range_len(&td->dpa_range),
+ td->cxled.skip);
+ if (cxld->flags & CXL_DECODER_F_ENABLE)
+ port->commit_end = cxld->id;
+ }
+}
+
+static void __cxld_registry_save(struct cxl_test_decoder *td,
+ struct cxl_decoder *cxld)
+{
+ if (is_switch_decoder(&cxld->dev)) {
+ struct cxl_switch_decoder *cxlsd = to_cxl_switch_decoder(&cxld->dev);
+
+ cxld_copy(&td->cxlsd.cxld, cxld);
+ td->cxlsd.nr_targets = cxlsd->nr_targets;
+
+ /* save dport devs as a stable placeholder for dports */
+ for (int i = 0; i < cxlsd->nr_targets; i++) {
+ struct cxl_dport *dport;
+
+ if (!cxlsd->target[i])
+ continue;
+
+ dport = cxlsd->target[i];
+ /* Overloading target[] with a 'struct device' */
+ td->cxlsd.target[i] = (struct cxl_dport *)dport->dport_dev;
+ td->cxlsd.cxld.target_map[i] = dport->port_id;
+ }
+ } else {
+ struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev);
+
+ cxld_copy(&td->cxled.cxld, cxld);
+ td->cxled.state = cxled->state;
+ td->cxled.skip = cxled->skip;
+ if (cxled->dpa_res) {
+ td->dpa_range.start = cxled->dpa_res->start;
+ td->dpa_range.end = cxled->dpa_res->end;
+ } else {
+ td->dpa_range.start = 0;
+ td->dpa_range.end = -1;
+ }
+ }
+}
+
+static void cxld_registry_save(struct cxl_test_decoder *td, struct cxl_decoder *cxld)
+{
+ struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+
+ dbg_cxld(port, "save", cxld);
+ __cxld_registry_save(td, cxld);
+}
+
+static void cxld_registry_update(struct cxl_decoder *cxld)
+{
+ struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+ struct cxl_test_decoder *td = cxld_registry_find(cxld);
+
+ dev_WARN_ONCE(port->uport_dev, !td, "%s failed\n", __func__);
+
+ dbg_cxld(port, "update", cxld);
+ __cxld_registry_save(td, cxld);
+}
+
static int mock_decoder_commit(struct cxl_decoder *cxld)
{
struct cxl_port *port = to_cxl_port(cxld->dev.parent);
@@ -690,6 +851,13 @@ static int mock_decoder_commit(struct cxl_decoder *cxld)
port->commit_end++;
cxld->flags |= CXL_DECODER_F_ENABLE;
+ if (is_endpoint_decoder(&cxld->dev)) {
+ struct cxl_endpoint_decoder *cxled =
+ to_cxl_endpoint_decoder(&cxld->dev);
+
+ cxled->state = CXL_DECODER_STATE_AUTO;
+ }
+ cxld_registry_update(cxld);
return 0;
}
@@ -700,7 +868,7 @@ static void mock_decoder_reset(struct cxl_decoder *cxld)
int id = cxld->id;
if ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)
- return;
+ goto registry_update_out;
dev_dbg(&port->dev, "%s reset\n", dev_name(&cxld->dev));
if (port->commit_end == id)
@@ -709,7 +877,51 @@ static void mock_decoder_reset(struct cxl_decoder *cxld)
dev_dbg(&port->dev,
"%s: out of order reset, expected decoder%d.%d\n",
dev_name(&cxld->dev), port->id, port->commit_end);
+
+registry_update_out:
cxld->flags &= ~CXL_DECODER_F_ENABLE;
+
+ if (is_endpoint_decoder(&cxld->dev)) {
+ struct cxl_endpoint_decoder *cxled =
+ to_cxl_endpoint_decoder(&cxld->dev);
+
+ cxled->state = CXL_DECODER_STATE_MANUAL;
+ }
+ if (decoder_registry_reset_disable)
+ dev_dbg(port->uport_dev, "decoder%d: skip registry update\n",
+ cxld->id);
+ else
+ cxld_registry_update(cxld);
+
+ return;
+}
+
+static void cxld_registry_invalidate(void)
+{
+ unsigned long index;
+ void *entry;
+
+ xa_for_each(&decoder_registry, index, entry) {
+ xa_erase(&decoder_registry, index);
+ kfree(entry);
+ }
+}
+
+static struct cxl_test_decoder *cxld_registry_new(struct cxl_decoder *cxld)
+{
+ struct cxl_test_decoder *td __free(kfree) = kzalloc(sizeof(*td), GFP_KERNEL);
+
+ if (!td)
+ return NULL;
+
+ if (xa_insert(&decoder_registry, cxld_registry_index(cxld), td,
+ GFP_KERNEL)) {
+ WARN_ON(1);
+ return NULL;
+ }
+
+ cxld_registry_save(td, cxld);
+ return no_free_ptr(td);
}
static void default_mock_decoder(struct cxl_decoder *cxld)
@@ -724,6 +936,9 @@ static void default_mock_decoder(struct cxl_decoder *cxld)
cxld->target_type = CXL_DECODER_HOSTONLYMEM;
cxld->commit = mock_decoder_commit;
cxld->reset = mock_decoder_reset;
+
+ if (!cxld_registry_new(cxld))
+ dev_dbg(&cxld->dev, "failed to add to registry\n");
}
static int first_decoder(struct device *dev, const void *data)
@@ -745,6 +960,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
struct cxl_endpoint_decoder *cxled;
struct cxl_switch_decoder *cxlsd;
struct cxl_port *port, *iter;
+ struct cxl_test_decoder *td;
const int size = SZ_512M;
struct cxl_memdev *cxlmd;
struct cxl_dport *dport;
@@ -774,6 +990,12 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
port = cxled_to_port(cxled);
}
+ td = cxld_registry_find(cxld);
+ if (td) {
+ cxld_registry_restore(cxld, td);
+ return;
+ }
+
/*
* The first decoder on the first 2 devices on the first switch
* attached to host-bridge0 mock a fake / static RAM region. All
@@ -802,6 +1024,8 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
devm_cxl_dpa_reserve(cxled, 0, size / cxld->interleave_ways, 0);
cxld->commit = mock_decoder_commit;
cxld->reset = mock_decoder_reset;
+ if (!cxld_registry_new(cxld))
+ dev_dbg(&cxld->dev, "failed to add to registry\n");
/*
* Now that endpoint decoder is set up, walk up the hierarchy
@@ -850,6 +1074,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
.start = base,
.end = base + size - 1,
};
+ cxld_registry_update(cxld);
put_device(dev);
}
}
@@ -902,7 +1127,7 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
mock_init_hdm_decoder(cxld);
- if (target_count) {
+ if (target_count && !decoder_registry_reset_disable) {
rc = device_for_each_child(port->uport_dev, &ctx,
map_targets);
if (rc) {
@@ -1407,6 +1632,73 @@ static int cxl_mem_init(void)
return rc;
}
+static ssize_t decoder_registry_invalidate_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ unsigned long index;
+ bool empty = true;
+ void *entry;
+
+ xa_for_each(&decoder_registry, index, entry) {
+ empty = false;
+ break;
+ }
+
+ return sysfs_emit(buf, "%d\n", !empty);
+}
+
+static ssize_t decoder_registry_invalidate_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ bool invalidate;
+ int rc;
+
+ rc = kstrtobool(buf, &invalidate);
+ if (rc)
+ return rc;
+
+ guard(device)(dev);
+
+ if (dev->driver)
+ return -EBUSY;
+
+ cxld_registry_invalidate();
+ return count;
+}
+
+static DEVICE_ATTR_RW(decoder_registry_invalidate);
+
+static ssize_t
+decoder_registry_reset_disable_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sysfs_emit(buf, "%d\n", decoder_registry_reset_disable);
+}
+
+static ssize_t
+decoder_registry_reset_disable_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int rc;
+
+ rc = kstrtobool(buf, &decoder_registry_reset_disable);
+ if (rc)
+ return rc;
+ return count;
+}
+
+static DEVICE_ATTR_RW(decoder_registry_reset_disable);
+
+static struct attribute *cxl_acpi_attrs[] = {
+ &dev_attr_decoder_registry_invalidate.attr,
+ &dev_attr_decoder_registry_reset_disable.attr,
+ NULL
+};
+ATTRIBUTE_GROUPS(cxl_acpi);
+
static __init int cxl_test_init(void)
{
int rc, i;
@@ -1537,6 +1829,7 @@ static __init int cxl_test_init(void)
mock_companion(&acpi0017_mock, &cxl_acpi->dev);
acpi0017_mock.dev.bus = &platform_bus_type;
+ cxl_acpi->dev.groups = cxl_acpi_groups;
rc = platform_device_add(cxl_acpi);
if (rc)
@@ -1606,6 +1899,9 @@ static __exit void cxl_test_exit(void)
depopulate_all_mock_resources();
gen_pool_destroy(cxl_mock_pool);
unregister_cxl_mock_ops(&cxl_mock_ops);
+
+ cxld_registry_invalidate();
+ xa_destroy(&decoder_registry);
}
module_param(interleave_arithmetic, int, 0444);
--
2.50.1
* Re: [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
@ 2025-08-15 12:50 ` Jonathan Cameron
2025-08-20 13:51 ` Robert Richter
1 sibling, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2025-08-15 12:50 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, rrichter, Li Ming
On Thu, 14 Aug 2025 15:21:41 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> Add a helper to replace the open code detection of CXL device hierarchy
> root, or the host bridge. The helper will be used for delayed downstream
> port (dport) creation.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Li Ming <ming.li@zohomail.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> v8:
> - Rename to is_cxl_host_bridge() (Dan)
> - Rename duplicate tags from Jonathan
> ---
> drivers/cxl/core/port.c | 17 +++++++++++------
> 1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 29197376b18e..855623cebd7d 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -33,6 +33,15 @@
> static DEFINE_IDA(cxl_port_ida);
> static DEFINE_XARRAY(cxl_root_buses);
>
> +/*
> + * The terminal device in PCI is NULL and @platform_bus
> + * for platform devices (for cxl_test)
Silly but it tickled my in built line length detector...
* The terminal device in PCI is NULL and @platform_bus for platform devices
* (for cxl_test)
Obviously makes no practical difference to anything!
> + */
> +static bool is_cxl_host_bridge(struct device *dev)
> +{
> + return (!dev || dev == &platform_bus);
> +}
> +
> int cxl_num_decoders_committed(struct cxl_port *port)
> {
> lockdep_assert_held(&cxl_rwsem.region);
> @@ -1541,7 +1550,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> resource_size_t component_reg_phys;
> int rc;
>
> - if (!dparent) {
> + if (is_cxl_host_bridge(dparent)) {
> /*
> * The iteration reached the topology root without finding the
> * CXL-root 'cxl_port' on a previous iteration, fail for now to
> @@ -1629,11 +1638,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> struct device *uport_dev;
> struct cxl_dport *dport;
>
> - /*
> - * The terminal "grandparent" in PCI is NULL and @platform_bus
> - * for platform devices
> - */
> - if (!dport_dev || dport_dev == &platform_bus)
> + if (is_cxl_host_bridge(dport_dev))
> return 0;
>
> uport_dev = dport_dev->parent;
* Re: [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder
2025-08-14 22:21 ` [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder Dave Jiang
@ 2025-08-15 12:52 ` Jonathan Cameron
2025-08-20 14:17 ` Robert Richter
1 sibling, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2025-08-15 12:52 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, rrichter
On Thu, 14 Aug 2025 15:21:43 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> Add a cached copy of the hardware port-id list that is available at init
> before all @dport objects have been instantiated. Change is in preparation
> of delayed dport instantiation.
>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
LGTM, though I haven't yet gotten to reading where it's used in later patches...
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
* Re: [PATCH v8 04/11] cxl: Move port register setup to first dport appear
2025-08-14 22:21 ` [PATCH v8 04/11] cxl: Move port register setup to first dport appear Dave Jiang
@ 2025-08-15 12:57 ` Jonathan Cameron
2025-08-21 11:57 ` Robert Richter
2025-08-22 10:37 ` Robert Richter
1 sibling, 1 reply; 40+ messages in thread
From: Jonathan Cameron @ 2025-08-15 12:57 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, rrichter
On Thu, 14 Aug 2025 15:21:44 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> This patch moves the port register setup to when the first dport appears
> via the memdev probe path. At this point, the CXL link should be
> established and the register access is expected to succeed. This change
> addresses an error message observed when PCIe hotplug is enabled on
> an Intel platform. The error message "cxl portN: Couldn't locate the
> CXL.cache and CXL.mem capability array header" is observed for the
> hostbridge during cxl_acpi driver probe. If the cxl_acpi module
> probe is running before the CXL link between the endpoint device and the
> RP is established, then the platform may not have exposed DVSEC ID 3
> and/or DVSEC ID 7 blocks which will trigger the error message. This
> behavior is defined by the spec and not a hardware quirk.
>
> This change also needs the dport enumeration to be moved to the memdev
> probe path in order to address the issue. This change is just part of
> the code refactoring and is not a wholly contained fix itself.
>
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
I'm a little nervous about what happens when we hot unplug EPs
on these systems, and about any leftover address mappings for the ports
to which they were connected. But from previous discussions I think the
argument was that they are benign if they do happen.
Anyhow, this looks fine to me.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
* Re: [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization
2025-08-14 22:21 ` [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization Dave Jiang
@ 2025-08-15 13:04 ` Jonathan Cameron
0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2025-08-15 13:04 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, rrichter
On Thu, 14 Aug 2025 15:21:49 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> cxl_test uses mock functions for decoder enumeration. Add initialization
> of the cxld->target_map[] for cxl_test based decoders in the mock
> functions.
>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Looks fine.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> ---
> tools/testing/cxl/test/cxl.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
> index 0ef2a8ec1fab..2d50193d10fe 100644
> --- a/tools/testing/cxl/test/cxl.c
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -818,15 +818,21 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
> */
> if (WARN_ON(!dev))
> continue;
> +
> cxlsd = to_cxl_switch_decoder(dev);
> if (i == 0) {
> /* put cxl_mem.4 second in the decode order */
> - if (pdev->id == 4)
> + if (pdev->id == 4) {
> cxlsd->target[1] = dport;
> - else
> + cxld->target_map[1] = dport->port_id;
> + } else {
> cxlsd->target[0] = dport;
> - } else
> + cxld->target_map[0] = dport->port_id;
> + }
> + } else {
> cxlsd->target[0] = dport;
> + cxld->target_map[0] = dport->port_id;
> + }
> cxld = &cxlsd->cxld;
> cxld->target_type = CXL_DECODER_HOSTONLYMEM;
> cxld->flags = CXL_DECODER_F_ENABLE;
* Re: [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support
2025-08-14 22:21 ` [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support Dave Jiang
@ 2025-08-15 13:15 ` Jonathan Cameron
0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2025-08-15 13:15 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, rrichter
On Thu, 14 Aug 2025 15:21:51 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> Record decoder values at init and mock_decoder_commit() time, and
> restore them at the next invocation of mock_init_hdm_decoder(). Add 2
> attributes to the cxl_test "cxl_acpi" device to optionally flush the
> cache of topology decoder values, or disable updating the decoder at
> mock_decoder_reset() time.
>
> This enables replaying a saved decoder configuration when re-triggering
> a topology scan by re-binding the cxl_acpi driver to "cxl_acpi.0" (the
> cxl_test emulation of an ACPI0017 instance).
>
> # modprobe cxl_test
> # cxl list -RB -b cxl_test -u
> {
> "bus":"root3",
> "provider":"cxl_test",
> "regions:root3":[
> {
> "region":"region5",
> "resource":"0xf010000000",
> "size":"512.00 MiB (536.87 MB)",
> "type":"ram",
> "interleave_ways":2,
> "interleave_granularity":4096,
> "decode_state":"commit"
> }
> ]
> }
> # echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_registry_reset_disable
> # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind
> # cxl list -RB -b cxl_test -u
> # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind
> # cxl list -RB -b cxl_test -u
> {
> "bus":"root3",
> "provider":"cxl_test",
> "regions:root3":[
> {
> "region":"region5",
> "resource":"0xf010000000",
> "size":"512.00 MiB (536.87 MB)",
> "type":"ram",
> "interleave_ways":2,
> "interleave_granularity":4096,
> "decode_state":"commit"
> }
> ]
> }
>
> [dj: Added support for delayed dport initialization ]
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Authorship wrong, or a Co-dev missing? Was expecting to see a From: Dan...
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> - Will have a cxl-replay test script in CXL CLI package
> - This allowed me to test assembly and replay for manually created CXL regions
> ---
> tools/testing/cxl/test/cxl.c | 300 ++++++++++++++++++++++++++++++++++-
> 1 file changed, 298 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
> index 2d50193d10fe..e0da2a48d2a8 100644
> --- a/tools/testing/cxl/test/cxl.c
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -47,6 +47,9 @@ struct platform_device *cxl_mem_single[NR_MEM_SINGLE];
> static struct platform_device *cxl_rch[NR_CXL_RCH];
> static struct platform_device *cxl_rcd[NR_CXL_RCH];
>
> +static DEFINE_XARRAY(decoder_registry);
> +static bool decoder_registry_reset_disable;
> +
> static inline bool is_multi_bridge(struct device *dev)
> {
> int i;
> @@ -671,6 +674,164 @@ static int map_targets(struct device *dev, void *data)
> return 0;
> }
>
> +static unsigned long cxld_registry_index(struct cxl_decoder *cxld)
> +{
> + struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> +
> + /*
> + * Upper nibble of a kernel pointer is 0xff, chop that to make
Very unlikely a nibble is 0xff, more likely 0xf unless you x86 folk have
8 bit nibbles :)
> + * space for a cxl_decoder id which should be less than 128
> + * given decoder count is a 4-bit field.
If it's a 4 bit field, how can it be greater than 15?
Maybe I need more coffee.
> + *
> + * While @port is reallocated each enumeration, @port->uport_dev
> + * is stable.
> + */
> + dev_WARN_ONCE(&port->dev, cxld->id >= 128,
> + "decoder id:%d out of range\n", cxld->id);
> + return (((unsigned long) port->uport_dev) << 4) | cxld->id;
> +}
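To spell out the arithmetic behind the two questions above, here is a minimal userspace sketch. registry_index() and id_space() are hypothetical stand-ins, not the patch's functions, and plain 64-bit integers stand in for pointers. A left shift by 4 clears only the low 4 bits, so an id of 16 or more overlaps bits that came from the pointer unless the pointer's own low bits are already zero. If uport_dev happens to be at least 8-byte aligned (an assumption on my part, nothing in the patch says so), the shifted value has 7 zero low bits, which would be consistent with the 128 bound.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the index computation quoted above: shift a
 * pointer-sized value left by 4 and OR a decoder id into the low bits.
 */
static uint64_t registry_index(uint64_t uport, unsigned int id)
{
	return (uport << 4) | id;
}

/*
 * Number of ids that can be OR-ed in without touching pointer bits,
 * given how many low bits of the unshifted value are known to be zero.
 */
static unsigned int id_space(unsigned int zero_low_bits)
{
	return 1u << (4 + zero_low_bits);
}
```

With no alignment assumption id_space(0) is 16, and registry_index(0x1000, 16) collides with registry_index(0x1001, 0). With 8-byte alignment id_space(3) is 128.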
* Re: [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
` (10 preceding siblings ...)
2025-08-14 22:21 ` [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support Dave Jiang
@ 2025-08-19 9:39 ` Robert Richter
2025-08-19 15:41 ` Dave Jiang
11 siblings, 1 reply; 40+ messages in thread
From: Robert Richter @ 2025-08-19 9:39 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams, Gregory Price, Li Ming
Dave,
On 14.08.25 15:21:40, Dave Jiang wrote:
> v8:
> - A bit of changes from Dan and Robert's comments. Main change is moving the port MMIO
> register probing to after the first dport shows up. This resulted with decoder allocation
> happens after the register probe.
> - See specific commits for more detailed changes.
thank you for your rework.
> Dave Jiang (11):
> cxl: Add helper to detect top of CXL device topology
> cxl: Add helper to reap dport
> cxl: Add a cached copy of target_map to cxl_decoder
> cxl: Move port register setup to first dport appear
> cxl: Defer dport allocation for switch ports
> cxl/test: Add cxl_test support for cxl_port_get_possible_dports()
> cxl/test: Add mock version of devm_cxl_add_dport_by_dev()
> cxl/test: Add support to cxl_test for decoder enumeration mock
> functions
> cxl/test: Setup target_map for cxl_test decoder initialization
> cxl: Change sslbis handler to only handle single dport
> tools/testing/cxl: Add decoder save/restore support
I have tested the whole series and it also solves the non-unique port
id errors for offline dports we see like:
cxl_port port2: unable to add dport247-0000:00:01.3 non-unique port id (0000:00:01.1)
For the whole series you can add:
Tested-by: Robert Richter <rrichter@amd.com>
Thanks,
-Robert
>
> drivers/cxl/acpi.c | 7 +-
> drivers/cxl/core/cdat.c | 25 +-
> drivers/cxl/core/core.h | 2 +
> drivers/cxl/core/hdm.c | 51 ++--
> drivers/cxl/core/pci.c | 82 ++++++
> drivers/cxl/core/port.c | 358 ++++++++++++++++++------
> drivers/cxl/core/region.c | 4 +-
> drivers/cxl/cxl.h | 44 ++-
> drivers/cxl/port.c | 29 +-
> tools/testing/cxl/Kbuild | 5 +-
> tools/testing/cxl/cxl_core_exports.c | 42 +++
> tools/testing/cxl/exports.h | 21 ++
> tools/testing/cxl/test/cxl.c | 399 ++++++++++++++++++++++++++-
> tools/testing/cxl/test/mock.c | 70 ++++-
> tools/testing/cxl/test/mock.h | 3 +
> 15 files changed, 963 insertions(+), 179 deletions(-)
> create mode 100644 tools/testing/cxl/exports.h
>
>
> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
> --
> 2.50.1
>
* Re: [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe
2025-08-19 9:39 ` [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Robert Richter
@ 2025-08-19 15:41 ` Dave Jiang
0 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-19 15:41 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams, Gregory Price, Li Ming
On 8/19/25 2:39 AM, Robert Richter wrote:
> Dave,
>
> On 14.08.25 15:21:40, Dave Jiang wrote:
>> v8:
>> - A bit of changes from Dan and Robert's comments. Main change is moving the port MMIO
>> register probing to after the first dport shows up. This resulted with decoder allocation
>> happens after the register probe.
>> - See specific commits for more detailed changes.
>
> thank you for your rework.
>
>> Dave Jiang (11):
>> cxl: Add helper to detect top of CXL device topology
>> cxl: Add helper to reap dport
>> cxl: Add a cached copy of target_map to cxl_decoder
>> cxl: Move port register setup to first dport appear
>> cxl: Defer dport allocation for switch ports
>> cxl/test: Add cxl_test support for cxl_port_get_possible_dports()
>> cxl/test: Add mock version of devm_cxl_add_dport_by_dev()
>> cxl/test: Add support to cxl_test for decoder enumeration mock
>> functions
>> cxl/test: Setup target_map for cxl_test decoder initialization
>> cxl: Change sslbis handler to only handle single dport
>> tools/testing/cxl: Add decoder save/restore support
>
> I have tested the whole series and it also solves the non-unique port
> id errors for offline dports we see like:
>
> cxl_port port2: unable to add dport247-0000:00:01.3 non-unique port id (0000:00:01.1)
>
> For the whole series you can add:
>
> Tested-by: Robert Richter <rrichter@amd.com>
Thank you for testing Robert!
>
> Thanks,
>
> -Robert
>
>>
>> drivers/cxl/acpi.c | 7 +-
>> drivers/cxl/core/cdat.c | 25 +-
>> drivers/cxl/core/core.h | 2 +
>> drivers/cxl/core/hdm.c | 51 ++--
>> drivers/cxl/core/pci.c | 82 ++++++
>> drivers/cxl/core/port.c | 358 ++++++++++++++++++------
>> drivers/cxl/core/region.c | 4 +-
>> drivers/cxl/cxl.h | 44 ++-
>> drivers/cxl/port.c | 29 +-
>> tools/testing/cxl/Kbuild | 5 +-
>> tools/testing/cxl/cxl_core_exports.c | 42 +++
>> tools/testing/cxl/exports.h | 21 ++
>> tools/testing/cxl/test/cxl.c | 399 ++++++++++++++++++++++++++-
>> tools/testing/cxl/test/mock.c | 70 ++++-
>> tools/testing/cxl/test/mock.h | 3 +
>> 15 files changed, 963 insertions(+), 179 deletions(-)
>> create mode 100644 tools/testing/cxl/exports.h
>>
>>
>> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
>> --
>> 2.50.1
>>
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-14 22:21 ` [PATCH v8 05/11] cxl: Defer dport allocation for switch ports Dave Jiang
@ 2025-08-20 12:41 ` Robert Richter
2025-08-20 15:20 ` Dave Jiang
` (2 more replies)
0 siblings, 3 replies; 40+ messages in thread
From: Robert Richter @ 2025-08-20 12:41 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
Hi Dave,
see my comments below.
On 14.08.25 15:21:45, Dave Jiang wrote:
> The current implementation enumerates the dports during the cxl_port
> driver probe. Without an endpoint connected, the dport may not be
> active during port probe. This scheme may prevent a valid hardware
> dport id from being retrieved and MMIO registers from being read when
> an endpoint is hot-plugged. Move the dport allocation and setup to the
> memdev probe path so the endpoint is guaranteed to be connected.
>
> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
> ACPI0017.N device. Through that it enumerates downstream ports composed
> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
> uses add_host_bridge_uport() to create the ports that enumerate the PCI
> RPs as the dports of these ports. Every time a port is created, the port
> driver is attached, cxl_switch_port_probe() is called and
> devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
> the dports.
>
> The second phase applies if there are any CXL switches. When the PCI
> endpoint device driver (cxl_pci) probes, it adds a mem device, which
> triggers cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
> and attempts to discover and create all the ports that represent CXL
> switches. During this phase, a port is created per switch and the
> attached dports are also enumerated and probed.
>
> The last phase is creating endpoint port which happens for all endpoint
> devices.
>
> In this commit, port creation and dport probing in cxl_acpi are not
> changed. That will be handled later. The behavior change is only for CXL
> switch ports. Only the dport that is part of the path for an endpoint
> device to the RP will be probed. This happens naturally by the code
> walking up the device hierarchy and identifying the upstream device and
> the downstream device.
>
> The new sequence is instead of creating all possible dports at initial
> port creation, defer port instantiation until a memdev beneath that
> dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
> the creation and extension of ports with new dports as memory devices
> arrive. As part of this rework, switch decoder target list is amended
> at runtime as dports show up.
>
> While the decoders are allocated during the port driver probe, their
> targets must now be updated incrementally. Previously this was all done
> once every dport was set up; now, every time a dport is set up for an
> endpoint, the switch target list needs to be updated with the new
> dport. A guard(rwsem_write) is used to update decoder targets. This is
> similar to when decoder_populate_target() is called and the decoder
> programming must be protected.
>
> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> v8:
> - grammar and spelling fixups (Dan)
> - Clarify commit log story. (Dan)
> - Move register mapping and decoder enumeration to when first dport shows up (Dan)
> - Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
> - cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
> - Remove failure path for possible dports == 0. (Dan, Robert)
> - update_switch_decoder() -> update_decoder_targets(). (Dan)
> - Remove lock asserts where not needed. (Dan)
> - Add support for passthrough decoder init. (Dan)
> - Return -ENXIO when no driver attached. (Dan)
> - Move guard() from devm-cxl_add_dport_by_uport. (Dan, Robert)
> - Add devm_cxl_create_or_extend_port() helper. (Dan)
> - Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
> - Remove 'new_dport' local var. (Robert)
> - Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
> - Move port check logic to add_port_attach_ep(). (Robert)
> ---
> drivers/cxl/core/cdat.c | 2 +-
> drivers/cxl/core/core.h | 2 +
> drivers/cxl/core/hdm.c | 6 -
> drivers/cxl/core/pci.c | 81 +++++++++++
> drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
> drivers/cxl/core/region.c | 4 +-
> drivers/cxl/cxl.h | 3 +
> drivers/cxl/port.c | 29 +---
> 8 files changed, 331 insertions(+), 83 deletions(-)
>
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index c0af645425f4..b156b81a9b20 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
>
> guard(rwsem_read)(&cxl_rwsem.region);
> for (int i = 0; i < cxlsd->nr_targets; i++) {
> - if (host_bridge == cxlsd->target[i]->dport_dev)
> + if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
> return 1;
> }
>
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 2669f251d677..2ac71eb459e6 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
> int cxl_ras_init(void);
> void cxl_ras_exit(void);
> int cxl_gpf_port_setup(struct cxl_dport *dport);
> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
> + struct device *dport_dev);
>
> #ifdef CONFIG_CXL_FEATURES
> struct cxl_feat_entry *
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index cee68bbc7ff6..5263e9eba7d0 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
> {
> struct cxl_switch_decoder *cxlsd;
> - struct cxl_dport *dport = NULL;
> - unsigned long index;
> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
>
> /*
> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>
> device_lock_assert(&port->dev);
>
> - xa_for_each(&port->dports, index, dport)
> - break;
> - cxlsd->cxld.target_map[0] = dport->port_id;
> -
The change of initialization of cxlsd->cxld.target_map[] could have
been a separate patch to reduce size of this patch.
> return add_hdm_decoder(port, &cxlsd->cxld);
> }
> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index b50551601c2e..b9d770f1aa7b 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
> module_param(media_ready_timeout, ushort, 0644);
> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
>
> +/**
> + * devm_cxl_add_dport_by_dev - allocate a dport by dport device
> + * @port: cxl_port that hosts the dport
> + * @dport_dev: 'struct device' of the dport
> + *
> + * Returns the allocate dport on success or ERR_PTR() of -errno on error
> + */
> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
This function only determines the port_num. How about implementing
just that in a function cxl_pci_get_port_num() and calling
devm_cxl_add_dport() directly?
That would nicely fit into core/pci.c.
> + struct device *dport_dev)
> +{
> + struct cxl_register_map map;
> + struct pci_dev *pdev;
> + u32 lnkcap, port_num;
> + int type;
> + int rc;
> +
> + if (!dev_is_pci(dport_dev))
> + return ERR_PTR(-EINVAL);
> +
> + device_lock_assert(&port->dev);
> +
> + pdev = to_pci_dev(dport_dev);
> + type = pci_pcie_type(pdev);
> + if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
> + return ERR_PTR(-EINVAL);
> +
> + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> + &lnkcap))
> + return ERR_PTR(-ENXIO);
> +
> + rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> + if (rc)
> + dev_dbg(&port->dev, "failed to find component registers\n");
> +
> + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
So, just return port_num instead.
> + return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
> +}
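For illustration only, the extraction itself is tiny. Below is a userspace sketch, assuming the Port Number field occupies bits 31:24 of the Link Capabilities register (the PCI_EXP_LNKCAP_PN mask). lnkcap_port_num() is a made-up name standing in for the proposed cxl_pci_get_port_num(), which would additionally have to do the config read and the port type checks shown in the patch.

```c
#include <stdint.h>

/*
 * Port Number field of the PCIe Link Capabilities register, bits 31:24.
 * Mirrors the kernel's PCI_EXP_LNKCAP_PN mask; the names here are local
 * to this sketch.
 */
#define LNKCAP_PN_MASK  0xff000000u
#define LNKCAP_PN_SHIFT 24

/* Hypothetical helper: extract the port number from a raw LNKCAP value. */
static uint32_t lnkcap_port_num(uint32_t lnkcap)
{
	return (lnkcap & LNKCAP_PN_MASK) >> LNKCAP_PN_SHIFT;
}
```

A raw value of 0xf7000000 yields port number 247. The caller would then pass that number, together with the component register address, to devm_cxl_add_dport() itself.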
> +
> struct cxl_walk_context {
> struct pci_bus *bus;
> struct cxl_port *port;
> @@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
>
> return 0;
> }
> +
> +static int count_dports(struct pci_dev *pdev, void *data)
> +{
> + struct cxl_walk_context *ctx = data;
> + int type = pci_pcie_type(pdev);
> +
> + if (pdev->bus != ctx->bus)
> + return 0;
> + if (!pci_is_pcie(pdev))
> + return 0;
> + if (type != ctx->type)
> + return 0;
> +
> + ctx->count++;
> + return 0;
> +}
> +
> +int cxl_port_get_possible_dports(struct cxl_port *port)
> +{
> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
> + struct cxl_walk_context ctx;
> + int type;
> +
> + if (!bus) {
> + dev_err(&port->dev, "No PCI bus found for port %s\n",
> + dev_name(&port->dev));
> + return -ENXIO;
> + }
> +
> + if (pci_is_root_bus(bus))
> + type = PCI_EXP_TYPE_ROOT_PORT;
> + else
> + type = PCI_EXP_TYPE_DOWNSTREAM;
> +
> + ctx = (struct cxl_walk_context) {
> + .bus = bus,
> + .type = type,
> + };
> + pci_walk_bus(bus, count_dports, &ctx);
Don't walk the whole bus, just check children of port->uport_dev.
> +
> + return ctx.count;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
See below for my comment on possible_dports.
Since we only check for count > 1, the implementation could be
simplified and renamed to e.g. cxl_port_has_multiple_dports(), which
could easily be used to call devm_cxl_add_passthrough_decoder().
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 25209952f469..877f888ee8f5 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
> return port;
> }
>
> -static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
> - struct device *dport_dev,
> - struct cxl_dport **dport)
> -{
> - struct cxl_find_port_ctx ctx = {
> - .dport_dev = dport_dev,
> - .parent_port = parent_port,
> - .dport = dport,
> - };
> - struct cxl_port *port;
> -
> - port = __find_cxl_port(&ctx);
> - return port;
> -}
> -
> /*
> * All users of grandparent() are using it to walk PCIe-like switch port
> * hierarchy. A PCIe switch is comprised of a bridge device representing the
> @@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
> return map.resource;
> }
>
> +static int match_port_by_uport(struct device *dev, const void *data)
> +{
> + const struct device *uport_dev = data;
> + struct cxl_port *port;
> +
> + if (!is_cxl_port(dev))
> + return 0;
> +
> + port = to_cxl_port(dev);
> + return uport_dev == port->uport_dev;
> +}
> +
> +/*
> + * Function takes a device reference on the port device. Caller should do a
> + * put_device() when done.
> + */
> +static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
> +{
> + struct device *dev;
> +
> + dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
> + if (dev)
> + return to_cxl_port(dev);
> + return NULL;
> +}
> +
> +static int update_decoder_targets(struct device *dev, void *data)
> +{
> + struct cxl_dport *dport = data;
> + struct cxl_switch_decoder *cxlsd;
> + struct cxl_decoder *cxld;
> + int i;
> +
> + if (!is_switch_decoder(dev))
> + return 0;
> +
> + cxlsd = to_cxl_switch_decoder(dev);
> + cxld = &cxlsd->cxld;
> + guard(rwsem_write)(&cxl_rwsem.region);
> +
> + /* Short cut for passthrough decoder */
> + if (cxlsd->nr_targets == 1) {
I think we should still check port_id. That is, remove the shortcut.
If nr_targets == 1, then interleave_ways should be one too, so you
gain nothing. Plus, you also see the dev_dbg().
> + cxlsd->target[0] = dport;
> + return 0;
> + }
> +
> + for (i = 0; i < cxld->interleave_ways; i++) {
> + if (cxld->target_map[i] == dport->port_id) {
> + cxlsd->target[i] = dport;
> + dev_dbg(dev, "dport%d found in target list, index %d\n",
> + dport->port_id, i);
> + return 0;
Only one target exists, right? Stop the iteration by returning a
non-zero here (caller needs to be adjusted then).
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int cxl_decoders_dport_update(struct cxl_dport *dport)
> +{
> + return device_for_each_child(&dport->port->dev, dport,
> + update_decoder_targets);
Might need changes if update_decoder_targets returns 1 to stop the
iterator.
> +}
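A rough userspace sketch of the early-stop pattern suggested above, not kernel code. All struct and function names are hypothetical. for_each_decoder() mimics the device_for_each_child() convention of stopping at the first non-zero callback return and passing that value back. It assumes a dport shows up in at most one decoder's target list, which is exactly the open question in my comment, so treat it as an illustration of the caller adjustment only.

```c
#include <stddef.h>

#define MAX_TARGETS 4

/* Hypothetical, simplified decoder: port ids stand in for dport pointers. */
struct decoder {
	int interleave_ways;
	int target_map[MAX_TARGETS];	/* cached hardware port ids */
	int target[MAX_TARGETS];	/* resolved port id, -1 if unresolved */
};

/* Callback: return 1 once port_id has been placed, 0 to keep iterating. */
static int update_targets(struct decoder *d, int port_id)
{
	for (int i = 0; i < d->interleave_ways; i++) {
		if (d->target_map[i] == port_id) {
			d->target[i] = port_id;
			return 1;
		}
	}
	return 0;
}

/* Stops at the first non-zero callback return and hands it back. */
static int for_each_decoder(struct decoder *decs, size_t n, int port_id,
			    int *visited)
{
	for (size_t i = 0; i < n; i++) {
		(*visited)++;
		int rc = update_targets(&decs[i], port_id);
		if (rc)
			return rc;
	}
	return 0;
}

/* Adjusted caller: a positive return means "found", not an error. */
static int dport_update(struct decoder *decs, size_t n, int port_id,
			int *visited)
{
	int rc = for_each_decoder(decs, n, port_id, visited);

	return rc < 0 ? rc : 0;
}
```

The point is the last function: once the callback uses 1 to stop the walk, the caller has to fold positive values back to 0 so that "found" is not reported as a failure.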
> +
> +static int cxl_switch_port_setup(struct cxl_port *port)
> +{
Could you factor out that function in a separate patch?
The function only sets up decoders. Name it
cxl_switch_port_setup_decoders()?
> + struct cxl_hdm *cxlhdm;
> +
> + cxlhdm = devm_cxl_setup_hdm(port, NULL);
> + if (!IS_ERR(cxlhdm))
> + return devm_cxl_enumerate_decoders(cxlhdm, NULL);
> +
> + if (PTR_ERR(cxlhdm) != -ENODEV) {
> + dev_err(&port->dev, "Failed to map HDM decoder capability\n");
> + return PTR_ERR(cxlhdm);
> + }
> +
> + if (port->possible_dports == 1) {
> + dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
> + return devm_cxl_add_passthrough_decoder(port);
Imo, the possible_dports handling should be removed as it only
introduces dead code. mock_cxl_setup_hdm() always returns a valid
cxlhdm (unless for -ENOMEM) and the mock case never reaches this code
here.
So how about moving (the "real") devm_cxl_add_passthrough_decoder()
and cxl_port_get_possible_dports() to devm_cxl_enumerate_decoders()?
devm_cxl_add_passthrough_decoder() would be static then and
cxl_port_get_possible_dports() will be a core.h function only. Then,
mock_cxl_add_passthrough_decoder() could be removed too.
I really would like to have a clean core module interface that allows
an easy implementation of cxl_test and avoid too much impact to the
driver code.
> + }
> +
> + dev_err(&port->dev, "HDM decoder capability not found\n");
> + return -ENXIO;
> +}
> +
> +DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
> +static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
> + struct device *dport_dev)
> +{
> + struct cxl_dport *dport;
> + int rc;
> +
> + guard(device)(&port->dev);
> +
> + if (!port->dev.driver)
> + return ERR_PTR(-ENXIO);
> +
> + dport = cxl_find_dport_by_dev(port, dport_dev);
> + if (dport)
> + return dport;
What is the case if there is already a dport bound to the port? Since
there is a 1:1 mapping downstream, there is only one allocation and I
would expect that dport never exists and an -EBUSY should be returned
otherwise.
> +
> + struct cxl_dport *new_dport __free(put_cxl_dport) =
> + devm_cxl_add_dport_by_dev(port, dport_dev);
See my comment on devm_cxl_add_dport_by_dev() above.
> + if (IS_ERR(new_dport))
> + return new_dport;
> +
> + cxl_switch_parse_cdat(port);
> +
> + /*
> + * First instance of dport appearing, need to setup the port, including
> + * allocating decoders.
> + */
> + if (port->nr_dports == 1) {
> + rc = cxl_switch_port_setup(port);
Can't this be done at port creation? I don't see a reason for doing
this late at this point.
> + if (rc)
> + return ERR_PTR(rc);
> + return no_free_ptr(new_dport);
> + }
> +
> + rc = cxl_decoders_dport_update(new_dport);
> + if (rc)
> + return ERR_PTR(rc);
Maybe unfold cxl_decoders_dport_update() here?
> +
> + return no_free_ptr(new_dport);
> +}
> +
> +static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
> + struct device *dport_dev)
> +{
> + struct cxl_port *port __free(put_cxl_port) =
> + find_cxl_port_by_uport(uport_dev);
> +
> + if (!port)
> + return ERR_PTR(-ENODEV);
> +
> + return cxl_port_get_or_add_dport(port, dport_dev);
> +}
That function can be removed, see below.
> +
> +static struct cxl_dport *
> +devm_cxl_create_or_extend_port(struct device *ep_dev,
> + struct cxl_port *parent_port,
> + struct cxl_dport *parent_dport,
> + struct device *uport_dev,
> + struct device *dport_dev)
> +{
> + resource_size_t component_reg_phys;
> +
> + guard(device)(&parent_port->dev);
> +
> + if (!parent_port->dev.driver) {
> + dev_warn(ep_dev,
> + "port %s:%s disabled, failed to enumerate CXL.mem\n",
> + dev_name(&parent_port->dev), dev_name(uport_dev));
> + return ERR_PTR(-ENXIO);
> + }
> +
> + struct cxl_port *port __free(put_cxl_port) =
> + find_cxl_port_by_uport(uport_dev);
> +
> + if (!port) {
> + component_reg_phys = find_component_registers(uport_dev);
> + port = devm_cxl_add_port(&parent_port->dev, uport_dev,
> + component_reg_phys, parent_dport);
> + if (IS_ERR(port))
> + return (struct cxl_dport *)port;
> +
> + /*
> + * retry to make sure a port is found. a port device
> + * reference is taken.
> + */
> + port = find_cxl_port_by_uport(uport_dev);
> + if (!port)
> + return ERR_PTR(-ENODEV);
> +
> + dev_dbg(ep_dev, "created port %s:%s\n",
> + dev_name(&port->dev), dev_name(port->uport_dev));
> + }
> +
> + return cxl_port_get_or_add_dport(port, dport_dev);
> +}
> +
> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> struct device *uport_dev,
> struct device *dport_dev)
> {
> struct device *dparent = grandparent(dport_dev);
> struct cxl_dport *dport, *parent_dport;
> - resource_size_t component_reg_phys;
> int rc;
>
> if (is_cxl_host_bridge(dparent)) {
> + struct cxl_port *port __free(put_cxl_port) =
> + find_cxl_port_by_uport(uport_dev);
> /*
> * The iteration reached the topology root without finding the
> * CXL-root 'cxl_port' on a previous iteration, fail for now to
> * be re-probed after platform driver attaches.
> */
> - dev_dbg(&cxlmd->dev, "%s is a root dport\n",
> - dev_name(dport_dev));
> - return -ENXIO;
> + if (!port) {
> + dev_dbg(&cxlmd->dev, "%s is a root dport\n",
> + dev_name(dport_dev));
> + return -ENXIO;
> + }
> +
> + /*
> + * While the port is found, there may not be a dport associated
> + * yet. Try to associate the dport to the port. On return success,
> + * the iteration will restart with the dport now attached.
> + */
> + dport = devm_cxl_add_dport_by_uport(uport_dev,
> + dport_dev);
port is known here, use cxl_port_get_or_add_dport(port, dport_dev)
instead. Remove devm_cxl_add_dport_by_uport().
> + if (IS_ERR(dport))
> + return PTR_ERR(dport);
> +
> + return 0;
> }
>
> struct cxl_port *parent_port __free(put_cxl_port) =
> @@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> return -EAGAIN;
> }
>
> - /*
> - * Definition with __free() here to keep the sequence of
> - * dereferencing the device of the port before the parent_port releasing.
> - */
> - struct cxl_port *port __free(put_cxl_port) = NULL;
> - scoped_guard(device, &parent_port->dev) {
> - if (!parent_port->dev.driver) {
> - dev_warn(&cxlmd->dev,
> - "port %s:%s disabled, failed to enumerate CXL.mem\n",
> - dev_name(&parent_port->dev), dev_name(uport_dev));
> - return -ENXIO;
> - }
> + dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
> + parent_dport, uport_dev,
> + dport_dev);
You expand add_port_attach_ep() here. This function was originally
called only if there was no *port* at all. Now the port may already
exist but is not found, because the dport_dev is not yet registered,
so add_port_attach_ep() is called even if the port exists. I think we
should move that dport_dev registration a level higher to
devm_cxl_enumerate_ports(). That might need a cleanup of the iterator
and the removal of add_port_attach_ep().
> + if (IS_ERR(dport))
> + return PTR_ERR(dport);
>
> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
> - if (!port) {
> - component_reg_phys = find_component_registers(uport_dev);
> - port = devm_cxl_add_port(&parent_port->dev, uport_dev,
> - component_reg_phys, parent_dport);
> - if (IS_ERR(port))
> - return PTR_ERR(port);
> -
> - /* retry find to pick up the new dport information */
> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
> - if (!port)
> - return -ENXIO;
> - }
> - }
> -
> - dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
> - dev_name(&port->dev), dev_name(port->uport_dev));
> rc = cxl_add_ep(dport, &cxlmd->dev);
> if (rc == -EBUSY) {
> /*
> @@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> {
> struct device *dev = &cxlmd->dev;
> struct device *iter;
> + int ports_need_create = 0;
> int rc;
>
> /*
> @@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> struct device *uport_dev;
> struct cxl_dport *dport;
>
> + ports_need_create++;
> +
> if (is_cxl_host_bridge(dport_dev))
> return 0;
>
> @@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>
> cxl_gpf_port_setup(dport);
>
> + ports_need_create--;
> /* Any more ports to add between this one and the root? */
> if (!dev_is_cxl_root_child(&port->dev))
> continue;
>
> + /*
> + * The 'ports_need_create' variable tracks a port being
> + * created as it goes through this iterative loop. It's
> + * incremented when it first enters the loop and decremented
> + * when the port is found. If at the root of the hierarchy
> + * and the variable is not 0, then it's missing a port
> + * creation somewhere in the hierarchy and should restart.
> + * For example in a setup where there's a PCI root port, a
> + * switch, and an endpoint, it is possible to get to the
> + * PCI root port and its creation, and the switch port is
> + * still missing because the root port didn't exist. This
> + * triggers a restart of the loop to create the switch port
> + * now with a present root port.
> + */
> + if (ports_need_create)
Uh, that becomes hard. Isn't the iterator much simpler:
* Start the iter = endpoint.
* Find first existing parent port up to the root.
* If that is the direct parent of the endpoint, attach it to the
parent (add dport etc.). Exit loop without errors.
* Else, create port and attach it to the found parent port (including
dport handling).
* Fail on errors or retry otherwise.
So devm_cxl_enumerate_ports() should be reworked, also addressing my
other comments regarding add_port_attach_ep() and
devm_cxl_create_or_extend_port().
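For illustration, the loop described above could look roughly like the
following userspace toy model. This is only a sketch under assumptions:
struct node, first_existing_port(), child_toward() and enumerate_ports()
are hypothetical names, not kernel code, and each node simply stands in
for a device that would get its own port object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one node per device on the path from endpoint to the root. */
struct node {
	struct node *parent;	/* next device toward the root, NULL at the top */
	bool has_port;		/* a port object already exists for this device */
	bool attached;		/* endpoint only: attached to its parent port */
};

/* Walk up from @n and return the nearest ancestor that already has a port. */
static struct node *first_existing_port(struct node *n)
{
	for (n = n->parent; n; n = n->parent)
		if (n->has_port)
			return n;
	return NULL;
}

/* Return the direct child of @found on the path down toward @ep. */
static struct node *child_toward(struct node *ep, struct node *found)
{
	struct node *n = ep;

	while (n->parent != found) {
		assert(n->parent);	/* @found must be an ancestor of @ep */
		n = n->parent;
	}
	return n;
}

/*
 * Returns the number of ports created on success, or -1 if no ancestor
 * has a port yet (in this model: the root has not been set up).
 */
static int enumerate_ports(struct node *ep)
{
	int created = 0;

	for (;;) {
		struct node *found = first_existing_port(ep);

		if (!found)
			return -1;

		/* Direct parent has a port: attach the endpoint and stop. */
		if (found == ep->parent) {
			ep->attached = true;
			return created;
		}

		/* Create the missing port right below the found one, retry. */
		child_toward(ep, found)->has_port = true;
		created++;
	}
}
```

Each pass either attaches the endpoint or creates exactly one port one
level closer to it, so no separate counter is needed to decide whether
to restart.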
> + goto retry;
> +
> return 0;
> }
>
> @@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> if (rc == -EAGAIN)
> continue;
> /* failed to add ep or port */
> - if (rc)
> + if (rc < 0)
> return rc;
> +
> + ports_need_create = 0;
> /* port added, new descendants possible, start over */
> goto retry;
> }
> @@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
> device_lock_assert(&port->dev);
>
> if (xa_empty(&port->dports))
> - return -EINVAL;
> + return 0;
>
> guard(rwsem_write)(&cxl_rwsem.region);
> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
> struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
>
> - if (!dport)
> - return -ENXIO;
> + if (!dport) {
> + /* dport may be activated later */
> + continue;
> + }
> cxlsd->target[i] = dport;
> }
Should that be dropped entirely as the target setup is done somewhere
else?
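For reference, the behavior the hunk above switches to (leave unresolved
targets empty instead of failing) can be modeled in a few lines. This is
a minimal userspace sketch, assuming a cached port-id list per decoder
as introduced earlier in the series; toy_decoder, toy_dport,
toy_find_dport() and toy_populate_targets() are hypothetical names, not
the driver's API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TARGETS 4

struct toy_dport {
	int port_id;
};

struct toy_decoder {
	int interleave_ways;
	int target_map[MAX_TARGETS];		/* cached hardware port ids */
	struct toy_dport *target[MAX_TARGETS];	/* NULL until the dport shows up */
};

/* Linear lookup of a dport by port id among the @n dports known so far. */
static struct toy_dport *toy_find_dport(struct toy_dport *dports, int n,
					int port_id)
{
	for (int i = 0; i < n; i++)
		if (dports[i].port_id == port_id)
			return &dports[i];
	return NULL;
}

/*
 * Resolve as many targets as possible and return how many were resolved.
 * A missing dport is not an error here; it may be activated later.
 */
static int toy_populate_targets(struct toy_decoder *d,
				struct toy_dport *dports, int n)
{
	int resolved = 0;

	for (int i = 0; i < d->interleave_ways; i++) {
		struct toy_dport *dport =
			toy_find_dport(dports, n, d->target_map[i]);

		if (!dport)
			continue;
		d->target[i] = dport;
		resolved++;
	}
	return resolved;
}
```

Calling it again after another dport appears fills in the remaining
slots, which is the same effect as doing the target setup at a later
point.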
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 71cc42d05248..bba62867df90 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1510,8 +1510,10 @@ static int cxl_port_setup_targets(struct cxl_port *port,
> cxl_rr->nr_targets_set);
> return -ENXIO;
> }
> - } else
> + } else {
> cxlsd->target[cxl_rr->nr_targets_set] = ep->dport;
> + cxlsd->cxld.target_map[cxl_rr->nr_targets_set] = ep->dport->port_id;
> + }
> inc = 1;
> out_target_set:
> cxl_rr->nr_targets_set += inc;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 87a905db5ffb..df10a01376c6 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -591,6 +591,7 @@ struct cxl_dax_region {
> * @parent_dport: dport that points to this port in the parent
> * @decoder_ida: allocator for decoder ids
> * @reg_map: component and ras register mapping parameters
> + * @possible_dports: Total possible dports reported by hardware
> * @nr_dports: number of entries in @dports
> * @hdm_end: track last allocated HDM decoder instance for allocation ordering
> * @commit_end: cursor to track highest committed decoder for commit ordering
> @@ -612,6 +613,7 @@ struct cxl_port {
> struct cxl_dport *parent_dport;
> struct ida decoder_ida;
> struct cxl_register_map reg_map;
> + int possible_dports;
> int nr_dports;
> int hdm_end;
> int commit_end;
> @@ -911,6 +913,7 @@ void cxl_coordinates_combine(struct access_coordinate *out,
> struct access_coordinate *c2);
>
> bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
> +int cxl_port_get_possible_dports(struct cxl_port *port);
>
> /*
> * Unit test builds overrides this to __weak, find the 'strong' version
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index cf32dc50b7a6..941a7d7157bd 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -59,34 +59,17 @@ static int discover_region(struct device *dev, void *unused)
>
> static int cxl_switch_port_probe(struct cxl_port *port)
> {
> - struct cxl_hdm *cxlhdm;
> - int rc;
> + int dports;
>
> /* Cache the data early to ensure is_visible() works */
> read_cdat_data(port);
>
> - rc = devm_cxl_port_enumerate_dports(port);
> - if (rc < 0)
> - return rc;
> + dports = cxl_port_get_possible_dports(port);
> + if (dports < 0)
> + return dports;
> + port->possible_dports = dports;
As said, I think the whole possible_dports part can be removed.
Thanks,
-Robert
>
> - cxl_switch_parse_cdat(port);
> -
> - cxlhdm = devm_cxl_setup_hdm(port, NULL);
> - if (!IS_ERR(cxlhdm))
> - return devm_cxl_enumerate_decoders(cxlhdm, NULL);
> -
> - if (PTR_ERR(cxlhdm) != -ENODEV) {
> - dev_err(&port->dev, "Failed to map HDM decoder capability\n");
> - return PTR_ERR(cxlhdm);
> - }
> -
> - if (rc == 1) {
> - dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
> - return devm_cxl_add_passthrough_decoder(port);
> - }
> -
> - dev_err(&port->dev, "HDM decoder capability not found\n");
> - return -ENXIO;
> + return 0;
> }
>
> static int cxl_endpoint_port_probe(struct cxl_port *port)
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
2025-08-15 12:50 ` Jonathan Cameron
@ 2025-08-20 13:51 ` Robert Richter
1 sibling, 0 replies; 40+ messages in thread
From: Robert Richter @ 2025-08-20 13:51 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams, Li Ming
On 14.08.25 15:21:41, Dave Jiang wrote:
> Add a helper to replace the open code detection of CXL device hierarchy
> root, or the host bridge. The helper will be used for delayed downstream
> port (dport) creation.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Li Ming <ming.li@zohomail.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 02/11] cxl: Add helper to reap dport
2025-08-14 22:21 ` [PATCH v8 02/11] cxl: Add helper to reap dport Dave Jiang
@ 2025-08-20 14:10 ` Robert Richter
2025-08-20 20:54 ` Dave Jiang
0 siblings, 1 reply; 40+ messages in thread
From: Robert Richter @ 2025-08-20 14:10 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams, Li Ming
On 14.08.25 15:21:42, Dave Jiang wrote:
> Refactor the code in reap_dports() out to provide a helper function that
> reaps a single dport. This will be used later in the cleanup path for
> allocating a dport.
>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Li Ming <ming.li@zohomail.com>
> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/cxl/core/port.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 855623cebd7d..fd316e9bd59d 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1441,6 +1441,15 @@ static void delete_switch_port(struct cxl_port *port)
> devm_release_action(port->dev.parent, unregister_port, port);
> }
>
> +static void reap_dport(struct cxl_dport *dport)
Could you name that cxl_del_dport() as a counterpart to
devm_cxl_add_dport()? The cleanup helper should be named
cxl_del_dport too; put_cxl_dport is incorrect here (no refcount is
decremented, kfree is used).
> +{
> + struct cxl_port *port = dport->port;
> +
> + devm_release_action(&port->dev, cxl_dport_unlink, dport);
> + devm_release_action(&port->dev, cxl_dport_remove, dport);
> + devm_kfree(&port->dev, dport);
> +}
> +
> static void reap_dports(struct cxl_port *port)
cxl_del_dports()?
> {
> struct cxl_dport *dport;
> @@ -1448,11 +1457,8 @@ static void reap_dports(struct cxl_port *port)
>
> device_lock_assert(&port->dev);
>
> - xa_for_each(&port->dports, index, dport) {
> - devm_release_action(&port->dev, cxl_dport_unlink, dport);
> - devm_release_action(&port->dev, cxl_dport_remove, dport);
> - devm_kfree(&port->dev, dport);
> - }
> + xa_for_each(&port->dports, index, dport)
> + reap_dport(dport);
> }
>
> struct detach_ctx {
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder
2025-08-14 22:21 ` [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder Dave Jiang
2025-08-15 12:52 ` Jonathan Cameron
@ 2025-08-20 14:17 ` Robert Richter
1 sibling, 0 replies; 40+ messages in thread
From: Robert Richter @ 2025-08-20 14:17 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 14.08.25 15:21:43, Dave Jiang wrote:
> Add a cached copy of the hardware port-id list that is available at init
> before all @dport objects have been instantiated. Change is in preparation
> of delayed dport instantiation.
>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Nice rework, thanks. A nitpick below.
-Robert
> @@ -984,7 +982,7 @@ static int cxl_setup_hdm_decoder_from_dvsec(
> }
>
> static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> - int *target_map, void __iomem *hdm, int which,
> + void __iomem *hdm, int which,
> u64 *dpa_base, struct cxl_endpoint_dvsec_info *info)
Split line to fill the 80 char limit.
> {
> struct cxl_endpoint_decoder *cxled = NULL;
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-20 12:41 ` Robert Richter
@ 2025-08-20 15:20 ` Dave Jiang
2025-08-22 9:59 ` Robert Richter
2025-08-27 21:15 ` Dave Jiang
2025-08-27 21:37 ` Dave Jiang
2 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-20 15:20 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 8/20/25 5:41 AM, Robert Richter wrote:
> Hi Dave,
>
> see my comments below.
>
> On 14.08.25 15:21:45, Dave Jiang wrote:
<--snip-->
>> + if (IS_ERR(new_dport))
>> + return new_dport;
>> +
>> + cxl_switch_parse_cdat(port);
>> +
>> + /*
>> + * First instance of dport appearing, need to setup the port, including
>> + * allocating decoders.
>> + */
>> + if (port->nr_dports == 1) {
>> + rc = cxl_switch_port_setup(port);
>
> Can't this be done with port creation? I don't see a reason doing this
> late at this point.
The main reason we are doing this is to delay the port register probing
until we know the CXL link is established. Otherwise, when cxl_acpi
probes and calls add_host_bridge_uport(), devm_cxl_add_port() can
trigger errors if the platform BIOS enables PCI hotplug support on Intel
platforms. The error message "cxl portN: Couldn't locate the CXL.cache
and CXL.mem capability array header" is observed. Essentially, we can be
trying to map registers while DVSEC ID 3 and/or 7 has not appeared yet.
And because that got pushed out, so did the decoder enumeration.
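To make the ordering concrete, here is a stripped-down userspace model of
"stash the register address at port add, probe on first dport". It is a
sketch only: toy_port, toy_port_add(), toy_setup_regs() and
toy_add_dport() are hypothetical stand-ins, and TOY_RESOURCE_NONE merely
plays the role of CXL_RESOURCE_NONE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RESOURCE_NONE UINT64_MAX	/* stand-in for CXL_RESOURCE_NONE */

struct toy_port {
	uint64_t component_reg_phys;	/* stashed at add, consumed on first dport */
	int nr_dports;
	bool regs_mapped;
	int setup_calls;		/* counts register probe attempts */
};

/* Port creation only records the address; no register access yet. */
static void toy_port_add(struct toy_port *port, uint64_t reg_phys)
{
	port->component_reg_phys = reg_phys;
	port->nr_dports = 0;
	port->regs_mapped = false;
	port->setup_calls = 0;
}

/* Stand-in for the register probe; in this model it always succeeds. */
static int toy_setup_regs(struct toy_port *port)
{
	port->setup_calls++;
	port->regs_mapped = true;
	return 0;
}

/* The first dport implies at least one active link, so probe regs then. */
static int toy_add_dport(struct toy_port *port)
{
	port->nr_dports++;

	if (port->nr_dports == 1 &&
	    port->component_reg_phys != TOY_RESOURCE_NONE) {
		int rc = toy_setup_regs(port);

		if (rc)
			return rc;
		/* Mark the address as consumed so the probe runs only once. */
		port->component_reg_phys = TOY_RESOURCE_NONE;
	}
	return 0;
}
```

In this model nothing touches the registers between toy_port_add() and
the first toy_add_dport(), which is the window where the capability
array header lookup was failing.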
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 02/11] cxl: Add helper to reap dport
2025-08-20 14:10 ` Robert Richter
@ 2025-08-20 20:54 ` Dave Jiang
0 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-20 20:54 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams, Li Ming
On 8/20/25 7:10 AM, Robert Richter wrote:
> On 14.08.25 15:21:42, Dave Jiang wrote:
>> Refactor the code in reap_dports() out to provide a helper function that
>> reaps a single dport. This will be used later in the cleanup path for
>> allocating a dport.
>>
>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>> Reviewed-by: Li Ming <ming.li@zohomail.com>
>> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>> drivers/cxl/core/port.c | 16 +++++++++++-----
>> 1 file changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 855623cebd7d..fd316e9bd59d 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1441,6 +1441,15 @@ static void delete_switch_port(struct cxl_port *port)
>> devm_release_action(port->dev.parent, unregister_port, port);
>> }
>>
>> +static void reap_dport(struct cxl_dport *dport)
>
> Could you name that cxl_del_dport() as a counterpart to
> devm_cxl_add_dport? The cleanup helper should be named to
> cxl_del_dport to, put_cxl_dport is incorrect here (it is not the
> refcount decremented, kfree is used).
Will rename to del_dport(). No need to add cxl_ since it's a local function.
DJ
>
>> +{
>> + struct cxl_port *port = dport->port;
>> +
>> + devm_release_action(&port->dev, cxl_dport_unlink, dport);
>> + devm_release_action(&port->dev, cxl_dport_remove, dport);
>> + devm_kfree(&port->dev, dport);
>> +}
>> +
>> static void reap_dports(struct cxl_port *port)
>
> cxl_del_dports()?
>
>> {
>> struct cxl_dport *dport;
>> @@ -1448,11 +1457,8 @@ static void reap_dports(struct cxl_port *port)
>>
>> device_lock_assert(&port->dev);
>>
>> - xa_for_each(&port->dports, index, dport) {
>> - devm_release_action(&port->dev, cxl_dport_unlink, dport);
>> - devm_release_action(&port->dev, cxl_dport_remove, dport);
>> - devm_kfree(&port->dev, dport);
>> - }
>> + xa_for_each(&port->dports, index, dport)
>> + reap_dport(dport);
>> }
>>
>> struct detach_ctx {
>> --
>> 2.50.1
>>
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 04/11] cxl: Move port register setup to first dport appear
2025-08-15 12:57 ` Jonathan Cameron
@ 2025-08-21 11:57 ` Robert Richter
0 siblings, 0 replies; 40+ messages in thread
From: Robert Richter @ 2025-08-21 11:57 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Dave Jiang, linux-cxl, dave, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams
On 15.08.25 13:57:35, Jonathan Cameron wrote:
> I'm a little nervous about what happens when we hot unplug EPs
> on these systems and any left over address mappings for port to which
> they are connected. But from previous discussions I think the argument
> was that they were benign if they do happen.
Yeah, unplug is tricky, we probably don't get the same state as
before. Let's see how tolerant the driver is.
-Robert
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-20 15:20 ` Dave Jiang
@ 2025-08-22 9:59 ` Robert Richter
2025-08-22 15:52 ` Dave Jiang
0 siblings, 1 reply; 40+ messages in thread
From: Robert Richter @ 2025-08-22 9:59 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 20.08.25 08:20:04, Dave Jiang wrote:
> On 8/20/25 5:41 AM, Robert Richter wrote:
> > Hi Dave,
> >
> > see my comments below.
> >
> > On 14.08.25 15:21:45, Dave Jiang wrote:
>
> <--snip-->
>
> >> + if (IS_ERR(new_dport))
> >> + return new_dport;
> >> +
> >> + cxl_switch_parse_cdat(port);
> >> +
> >> + /*
> >> + * First instance of dport appearing, need to setup the port, including
> >> + * allocating decoders.
> >> + */
> >> + if (port->nr_dports == 1) {
> >> + rc = cxl_switch_port_setup(port);
> >
> > Can't this be done with port creation? I don't see a reason doing this
> > late at this point.
>
> The main reason we are doing this is to move the port register
> probing until we know the CXL link is established. Otherwise when
> cxl_acpi does probe and calls add_host_bridge_uport(), that
> devm_cxl_add_port() can trigger errors if the platform BIOS enables
> PCI hotplug support on Intel platforms. The error messages "cxl
> portN: Couldn't locate the CXL.cache and CXL.mem capability array
> header" is observed. Essentially we can be trying to map registers
> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
> that got pushed out, so did the decoder enumeration.
The code suggests the Component Registers of the CXL Host Bridge are
not yet ready. Is this delayed until after the first Root Port is
connected to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
(CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
the PCIe config space, which is not enumerated before a CXL endpoint
becomes active. I haven't found a spec reference here. Please explain.
Thanks,
-Robert
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 04/11] cxl: Move port register setup to first dport appear
2025-08-14 22:21 ` [PATCH v8 04/11] cxl: Move port register setup to first dport appear Dave Jiang
2025-08-15 12:57 ` Jonathan Cameron
@ 2025-08-22 10:37 ` Robert Richter
1 sibling, 0 replies; 40+ messages in thread
From: Robert Richter @ 2025-08-22 10:37 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 14.08.25 15:21:44, Dave Jiang wrote:
> This patch moves the port register setup to when the first dport appears
> via the memdev probe path. At this point, the CXL link should be
> established and the register access is expected to succeed. This change
> addresses an error message observed when PCIe hotplug is enabled on
> an Intel platform. The error messages "cxl portN: Couldn't locate the
> CXL.cache and CXL.mem capability array header" is observed for the
> hostbridge during cxl_acpi driver probe. If the cxl_acpi module
> probe is running before the CXL link between the endpoint device and the
> RP is established, then the platform may not have exposed DVSEC ID 3
> and/or DVSEC ID 7 blocks which will trigger the error message. This
> behavior is defined by the spec and not a hardware quirk.
>
> This change also needs the dport enumeration to be moved to the memdev
> probe path in order to address the issue. This change is just part of
> the code refactoring and is not a wholly contained fix itself.
>
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/cxl/core/port.c | 16 +++++++++++++---
> drivers/cxl/cxl.h | 2 ++
> 2 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 48e76673aaf3..25209952f469 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -867,9 +867,7 @@ static int cxl_port_add(struct cxl_port *port,
> if (rc)
> return rc;
>
> - rc = cxl_port_setup_regs(port, component_reg_phys);
> - if (rc)
> - return rc;
> + port->component_reg_phys = component_reg_phys;
> } else {
> rc = dev_set_name(dev, "root%d", port->id);
> if (rc)
> @@ -1200,6 +1198,18 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
>
> cxl_debugfs_create_dport_dir(dport);
>
> + /*
> + * Setup port register if this is the first dport showed up. Having
> + * a dport also means that there is at least 1 active link.
> + */
> + if (port->nr_dports == 1 &&
> + port->component_reg_phys != CXL_RESOURCE_NONE) {
> + rc = cxl_port_setup_regs(port, port->component_reg_phys);
All that delays decoder enablement and visibility in sysfs. I think we
need a different approach to handle late CHBCR availability. Let's see
your response to my other mail.
-Robert
> + if (rc)
> + return ERR_PTR(rc);
> + port->component_reg_phys = CXL_RESOURCE_NONE;
> + }
> +
> return dport;
> }
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 4b858f3d44c6..87a905db5ffb 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -599,6 +599,7 @@ struct cxl_dax_region {
> * @cdat: Cached CDAT data
> * @cdat_available: Should a CDAT attribute be available in sysfs
> * @pci_latency: Upstream latency in picoseconds
> + * @component_reg_phys: Physical address of component register
> */
> struct cxl_port {
> struct device dev;
> @@ -622,6 +623,7 @@ struct cxl_port {
> } cdat;
> bool cdat_available;
> long pci_latency;
> + resource_size_t component_reg_phys;
> };
>
> /**
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-22 9:59 ` Robert Richter
@ 2025-08-22 15:52 ` Dave Jiang
2025-08-26 7:51 ` Robert Richter
0 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-22 15:52 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 8/22/25 2:59 AM, Robert Richter wrote:
> On 20.08.25 08:20:04, Dave Jiang wrote:
>> On 8/20/25 5:41 AM, Robert Richter wrote:
>>> Hi Dave,
>>>
>>> see my comments below.
>>>
>>> On 14.08.25 15:21:45, Dave Jiang wrote:
>>
>> <--snip-->
>>
>>>> + if (IS_ERR(new_dport))
>>>> + return new_dport;
>>>> +
>>>> + cxl_switch_parse_cdat(port);
>>>> +
>>>> + /*
>>>> + * First instance of dport appearing, need to setup the port, including
>>>> + * allocating decoders.
>>>> + */
>>>> + if (port->nr_dports == 1) {
>>>> + rc = cxl_switch_port_setup(port);
>>>
>>> Can't this be done with port creation? I don't see a reason doing this
>>> late at this point.
>>
>
>> The main reason we are doing this is to move the port register
>> probing until we know the CXL link is established. Otherwise when
>> cxl_acpi does probe and calls add_host_bridge_uport(), that
>> devm_cxl_add_port() can trigger errors if the platform BIOS enables
>> PCI hotplug support on Intel platforms. The error messages "cxl
>> portN: Couldn't locate the CXL.cache and CXL.mem capability array
>> header" is observed. Essentially we can be trying to map registers
>> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
>> that got pushed out, so did the decoder enumeration.
>
> The code suggests the Component Registers of the CXL Host Bridge are
> not yet ready. Is this delayed after the first Root Port is connected
> to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
> (CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
> the pcie config space, which is enumerated not before a CXL endpoint
> becomes active. I haven't found a spec refs here. Please explain.
So the behavior is observed when PCIe hotplug support is turned on in
BIOS for the Intel platform. A CXL device is plugged into an RP without
CXL switches. The thinking is that the CXL link is not fully established
at the time when cxl_acpi_probe() is running and the ports are being
added. And the only way to be 100% sure the link is established is when
we are enumerating the memdev, just like the dports. Not sure what spec
ref you are looking for. Table 8-2 indicates that those 2 DVSECs are
mandatory for CXL root ports. Lack of presence means either the RP isn't
CXL or the CXL link isn't established yet. I would assume this would
also be true if a CXL memdev is hot-plugged into a slot post boot.
>
> Thanks,
>
> -Robert
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-22 15:52 ` Dave Jiang
@ 2025-08-26 7:51 ` Robert Richter
2025-08-27 17:05 ` Dave Jiang
0 siblings, 1 reply; 40+ messages in thread
From: Robert Richter @ 2025-08-26 7:51 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 22.08.25 08:52:39, Dave Jiang wrote:
>
>
> On 8/22/25 2:59 AM, Robert Richter wrote:
> > On 20.08.25 08:20:04, Dave Jiang wrote:
> >> On 8/20/25 5:41 AM, Robert Richter wrote:
> >>> Hi Dave,
> >>>
> >>> see my comments below.
> >>>
> >>> On 14.08.25 15:21:45, Dave Jiang wrote:
> >>
> >> <--snip-->
> >>
> >>>> + if (IS_ERR(new_dport))
> >>>> + return new_dport;
> >>>> +
> >>>> + cxl_switch_parse_cdat(port);
> >>>> +
> >>>> + /*
> >>>> + * First instance of dport appearing, need to setup the port, including
> >>>> + * allocating decoders.
> >>>> + */
> >>>> + if (port->nr_dports == 1) {
> >>>> + rc = cxl_switch_port_setup(port);
> >>>
> >>> Can't this be done with port creation? I don't see a reason doing this
> >>> late at this point.
> >>
> >
> >> The main reason we are doing this is to move the port register
> >> probing until we know the CXL link is established. Otherwise when
> >> cxl_acpi does probe and calls add_host_bridge_uport(), that
> >> devm_cxl_add_port() can trigger errors if the platform BIOS enables
> >> PCI hotplug support on Intel platforms. The error messages "cxl
> >> portN: Couldn't locate the CXL.cache and CXL.mem capability array
> >> header" is observed. Essentially we can be trying to map registers
> >> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
> >> that got pushed out, so did the decoder enumeration.
> >
> > The code suggests the Component Registers of the CXL Host Bridge are
> > not yet ready. Is this delayed after the first Root Port is connected
> > to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
> > (CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
> > the pcie config space, which is enumerated not before a CXL endpoint
> > becomes active. I haven't found a spec refs here. Please explain.
>
> So the behavior is observed when PCIe hotplug support is turned on
> in BIOS for the Intel platform. A CXL device is plugged in to a RP
> without CXL switches. The thinking is that the CXL link is not fully
> established at the time when cxl_acpi_probe() is running and the
> ports are being added. And the only way to 100% be sure the link is
> established is when we are enumerating the memdev just like the
> dports. Not sure what spec ref are you looking for. Table 8-2
> indicates that those 2 DVSECs are mandatory for CXL root ports. Lack
> of presence means either the RP isn't CXL or the CXL link isn't
> established yet. I would assume this would also be true if a CXL
> memdev is hot-plugged into a slot post boot.
But add_host_bridge_uport() only creates ports for the host bridge
(ACPI0016) devices and enumerates their component registers (CHBCR).
The root ports are already added late, as those are part of the
PCI hierarchy. The root ports are discovered in
devm_cxl_enumerate_ports(), not earlier than the memdev is probed.
devm_cxl_add_memdev() is called once the endpoint is probed and the
CXL link is up.
That is, cxl_port_get_or_add_dport() in add_port_attach_ep() should
only add the dport. Then a retry in the enumeration loop will be
triggered and the cxl_port for the root port is added. No explicit
call of cxl_switch_port_setup() should be needed; it can be done
during cxl_port_probe(). All of this happens late, after the endpoint
was found.
-Robert
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-26 7:51 ` Robert Richter
@ 2025-08-27 17:05 ` Dave Jiang
2025-08-29 15:02 ` Robert Richter
0 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-27 17:05 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 8/26/25 12:51 AM, Robert Richter wrote:
> On 22.08.25 08:52:39, Dave Jiang wrote:
>>
>>
>> On 8/22/25 2:59 AM, Robert Richter wrote:
>>> On 20.08.25 08:20:04, Dave Jiang wrote:
>>>> On 8/20/25 5:41 AM, Robert Richter wrote:
>>>>> Hi Dave,
>>>>>
>>>>> see my comments below.
>>>>>
>>>>> On 14.08.25 15:21:45, Dave Jiang wrote:
>>>>
>>>> <--snip-->
>>>>
>>>>>> + if (IS_ERR(new_dport))
>>>>>> + return new_dport;
>>>>>> +
>>>>>> + cxl_switch_parse_cdat(port);
>>>>>> +
>>>>>> + /*
>>>>>> + * First instance of dport appearing, need to setup the port, including
>>>>>> + * allocating decoders.
>>>>>> + */
>>>>>> + if (port->nr_dports == 1) {
>>>>>> + rc = cxl_switch_port_setup(port);
>>>>>
>>>>> Can't this be done with port creation? I don't see a reason doing this
>>>>> late at this point.
>>>>
>>>
>>>> The main reason we are doing this is to move the port register
>>>> probing until we know the CXL link is established. Otherwise when
>>>> cxl_acpi does probe and calls add_host_bridge_uport(), that
>>>> devm_cxl_add_port() can trigger errors if the platform BIOS enables
>>>> PCI hotplug support on Intel platforms. The error messages "cxl
>>>> portN: Couldn't locate the CXL.cache and CXL.mem capability array
>>>> header" is observed. Essentially we can be trying to map registers
>>>> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
>>>> that got pushed out, so did the decoder enumeration.
>>>
>>> The code suggests the Component Registers of the CXL Host Bridge are
>>> not yet ready. Is this delayed after the first Root Port is connected
>>> to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
>>> (CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
>>> the pcie config space, which is enumerated not before a CXL endpoint
>>> becomes active. I haven't found any spec refs here. Please explain.
>>
>
>> So the behavior is observed when PCIe hotplug support is turned on
>> in BIOS for the Intel platform. A CXL device is plugged in to a RP
>> without CXL switches. The thinking is that the CXL link is not fully
>> established at the time when cxl_acpi_probe() is running and the
>> ports are being added. And the only way to 100% be sure the link is
>> established is when we are enumerating the memdev just like the
>> dports. Not sure what spec ref you are looking for. Table 8-2
>> indicates that those 2 DVSECs are mandatory for CXL root ports. Lack
>> of presence means either the RP isn't CXL or the CXL link isn't
>> established yet. I would assume this would also be true if a CXL
>> memdev is hot-plugged into a slot post boot.
>
> But add_host_bridge_uport() only creates ports for the host bridge
> (ACPI0016) devices and enumerates their component registers (CHBCR).
And I think that's where the issue is. The component registers reached via CHBCR aren't there. With this change removed, this is the signature I get:
[ 37.423882] cxl_acpi:cxl_get_chbs:589: acpi ACPI0016:03: UID found: 35
[ 37.424180] cxl_acpi:add_host_bridge_uport:726: acpi ACPI0016:03: CHBCR found for UID 35: 0x00000000aabf0000
[ 37.424186] cxl_core:cxl_port_alloc:741: pci0000:3a: host-bridge: pci0000:3a
[ 37.424210] cxl_core:cxl_map_regblock:426: cxl port2: Mapped CXL Memory Device resource 0x00000000aabf0000
[ 37.424213] cxl_core:cxl_probe_component_regs:55: cxl port2: Couldn't locate the CXL.cache and CXL.mem capability array header.
DJ
> The root ports are being added already late as those are part of the
> pci hierarchy. The root ports are discovered in
> devm_cxl_enumerate_ports() not earlier than the mem_dev is probed.
> devm_cxl_add_memdev() is called once the endpoint is probed and the
> CXL link is up.
>
> That is, function cxl_port_get_or_add_dport() in add_port_attach_ep
> should only add the dport. Then, a retry in the enumeration loop will
> be triggered and the cxl_port for the root port is added. No explicit
> call of cxl_switch_port_setup() should be needed, it can be done
> during cxl_port_probe(). All done late after the endpoint was found.
>
> -Robert
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-20 12:41 ` Robert Richter
2025-08-20 15:20 ` Dave Jiang
@ 2025-08-27 21:15 ` Dave Jiang
2025-09-01 17:29 ` Robert Richter
2025-08-27 21:37 ` Dave Jiang
2 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-27 21:15 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 8/20/25 5:41 AM, Robert Richter wrote:
> Hi Dave,
>
> see my comments below.
>
> On 14.08.25 15:21:45, Dave Jiang wrote:
>> The current implementation enumerates the dports during the cxl_port
>> driver probe. Without an endpoint connected, the dport may not be
>> active during port probe. This scheme may prevent a valid hardware
>> dport id to be retrieved and MMIO registers to be read when an endpoint
>> is hot-plugged. Move the dport allocation and setup to behind memdev
>> probe so the endpoint is guaranteed to be connected.
>>
>> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
>> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
>> ACPI0017.N device. Through that it enumerates downstream ports composed
>> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
>> uses add_host_bridge_uport() to create the ports that enumerate the PCI
>> RPs as the dports of these ports. Every time a port is created, the port
>> driver is attached, cxl_switch_port_probe() is called and
>> devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
>> the dports.
>>
>> The second phase is if there are any CXL switches. When the pci endpoint
>> device driver (cxl_pci) calls probe, it will add a mem device and triggers
>> the cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
>> and attempts to discover and create all the ports that represent CXL switches.
>> During this phase, a port is created per switch and the attached dports
>> are also enumerated and probed.
>>
>> The last phase is creating endpoint port which happens for all endpoint
>> devices.
>>
>> In this commit, the port create and its dport probing in cxl_acpi is not
>> changed. That will be handled later. The behavior change is only for CXL
>> switch ports. Only the dport that is part of the path for an endpoint
>> device to the RP will be probed. This happens naturally by the code
>> walking up the device hierarchy and identifying the upstream device and
>> the downstream device.
>>
>> The new sequence is instead of creating all possible dports at initial
>> port creation, defer port instantiation until a memdev beneath that
>> dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
>> the creation and extension of ports with new dports as memory devices
>> arrive. As part of this rework, switch decoder target list is amended
>> at runtime as dports show up.
>>
>> While the decoders are allocated during the port driver probe,
>> The decoders must also be updated since previously it's all done when all
>> the dports are setup and now every time a dport is setup per endpoint, the
>> switch target listing need to be updated with new dport. A
>> guard(rwsem_write) is used to update decoder targets. This is similar to
>> when decoder_populate_target() is called and the decoder programming
>> must be protected.
>>
>> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>> v8:
>> - grammar and spelling fixups (Dan)
>> - Clarify commit log story. (Dan)
>> - Move register mapping and decoder enumeration to when first dport shows up (Dan)
>> - Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
>> - cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
>> - Remove failure path for possible dports == 0. (Dan, Robert)
>> - update_switch_decoder() -> update_decoder_targets(). (Dan)
>> - Remove lock asserts where not needed. (Dan)
>> - Add support for passthrough decoder init. (Dan)
>> - Return -ENXIO when no driver attached. (Dan)
>> - Move guard() from devm-cxl_add_dport_by_uport. (Dan, Robert)
>> - Add devm_cxl_create_or_extend_port() helper. (Dan)
>> - Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
>> - Remove 'new_dport' local var. (Robert)
>> - Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
>> - Move port check logic to add_port_attach_ep(). (Robert)
>> ---
>> drivers/cxl/core/cdat.c | 2 +-
>> drivers/cxl/core/core.h | 2 +
>> drivers/cxl/core/hdm.c | 6 -
>> drivers/cxl/core/pci.c | 81 +++++++++++
>> drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
>> drivers/cxl/core/region.c | 4 +-
>> drivers/cxl/cxl.h | 3 +
>> drivers/cxl/port.c | 29 +---
>> 8 files changed, 331 insertions(+), 83 deletions(-)
>>
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> index c0af645425f4..b156b81a9b20 100644
>> --- a/drivers/cxl/core/cdat.c
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
>>
>> guard(rwsem_read)(&cxl_rwsem.region);
>> for (int i = 0; i < cxlsd->nr_targets; i++) {
>> - if (host_bridge == cxlsd->target[i]->dport_dev)
>> + if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
>> return 1;
>> }
>>
>> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
>> index 2669f251d677..2ac71eb459e6 100644
>> --- a/drivers/cxl/core/core.h
>> +++ b/drivers/cxl/core/core.h
>> @@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
>> int cxl_ras_init(void);
>> void cxl_ras_exit(void);
>> int cxl_gpf_port_setup(struct cxl_dport *dport);
>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>> + struct device *dport_dev);
>>
>> #ifdef CONFIG_CXL_FEATURES
>> struct cxl_feat_entry *
>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>> index cee68bbc7ff6..5263e9eba7d0 100644
>> --- a/drivers/cxl/core/hdm.c
>> +++ b/drivers/cxl/core/hdm.c
>> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
>> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>> {
>> struct cxl_switch_decoder *cxlsd;
>> - struct cxl_dport *dport = NULL;
>> - unsigned long index;
>> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
>>
>> /*
>> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>
>> device_lock_assert(&port->dev);
>>
>> - xa_for_each(&port->dports, index, dport)
>> - break;
>> - cxlsd->cxld.target_map[0] = dport->port_id;
>> -
>
> The change of initialization of cxlsd->cxld.target_map[] could have
> been a separate patch to reduce size of this patch.
>
>> return add_hdm_decoder(port, &cxlsd->cxld);
>> }
>> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index b50551601c2e..b9d770f1aa7b 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
>> module_param(media_ready_timeout, ushort, 0644);
>> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
>>
>> +/**
>> + * devm_cxl_add_dport_by_dev - allocate a dport by dport device
>> + * @port: cxl_port that hosts the dport
>> + * @dport_dev: 'struct device' of the dport
>> + *
>> + * Returns the allocate dport on success or ERR_PTR() of -errno on error
>> + */
>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>
> This function only determines the port_num. How about only implement
> this in a function cxl_pci_get_port_num() and call devm_cxl_add_dport
> directly?
I can split out the code that gets the port_num locally, but we can't call devm_cxl_add_dport() directly from core/port.c: it needs map.resource, and cxl_find_regblock(), which retrieves it, requires a PCI dev.
>
> That would nicely fit into core/pci.c.
>
>> + struct device *dport_dev)
>> +{
>> + struct cxl_register_map map;
>> + struct pci_dev *pdev;
>> + u32 lnkcap, port_num;
>> + int type;
>> + int rc;
>> +
>> + if (!dev_is_pci(dport_dev))
>> + return ERR_PTR(-EINVAL);
>> +
>> + device_lock_assert(&port->dev);
>> +
>> + pdev = to_pci_dev(dport_dev);
>> + type = pci_pcie_type(pdev);
>> + if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
>> + return ERR_PTR(-EINVAL);
>> +
>> + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
>> + &lnkcap))
>> + return ERR_PTR(-ENXIO);
>> +
>> + rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
>> + if (rc)
>> + dev_dbg(&port->dev, "failed to find component registers\n");
>> +
>> + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
>
> So, just return port_num instead.
>
>> + return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
>> +}
>> +
>> struct cxl_walk_context {
>> struct pci_bus *bus;
>> struct cxl_port *port;
>> @@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
>>
>> return 0;
>> }
>> +
>> +static int count_dports(struct pci_dev *pdev, void *data)
>> +{
>> + struct cxl_walk_context *ctx = data;
>> + int type = pci_pcie_type(pdev);
>> +
>> + if (pdev->bus != ctx->bus)
>> + return 0;
>> + if (!pci_is_pcie(pdev))
>> + return 0;
>> + if (type != ctx->type)
>> + return 0;
>> +
>> + ctx->count++;
>> + return 0;
>> +}
>> +
>> +int cxl_port_get_possible_dports(struct cxl_port *port)
>> +{
>> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
>> + struct cxl_walk_context ctx;
>> + int type;
>> +
>> + if (!bus) {
>> + dev_err(&port->dev, "No PCI bus found for port %s\n",
>> + dev_name(&port->dev));
>> + return -ENXIO;
>> + }
>> +
>> + if (pci_is_root_bus(bus))
>> + type = PCI_EXP_TYPE_ROOT_PORT;
>> + else
>> + type = PCI_EXP_TYPE_DOWNSTREAM;
>> +
>> + ctx = (struct cxl_walk_context) {
>> + .bus = bus,
>> + .type = type,
>> + };
>> + pci_walk_bus(bus, count_dports, &ctx);
>
> Don't walk the whole bus, just check children of port->uport_dev.
cxl_port_to_pci_bus() returns the pdev->subordinate bus of port->uport_dev. So I think that's equivalent to checking the children of port->uport_dev rather than walking the whole PCI bus, no?
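To picture the point about the walk, here is a rough userspace toy model, not kernel code. The toy_* types and functions are made up for illustration; the only behavior borrowed from the kernel is that pci_walk_bus() visits devices on and below the given bus, and that count_dports() ignores any device whose bus is not ctx->bus.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct pci_bus / struct pci_dev; not the kernel types. */
struct toy_bus;

struct toy_dev {
	struct toy_bus *bus;         /* bus the device sits on */
	struct toy_bus *subordinate; /* secondary bus behind a bridge, or NULL */
	int is_dport;                /* 1 if the device has the wanted port type */
	struct toy_dev *next;        /* next device on the same bus */
};

struct toy_bus {
	struct toy_dev *devices;     /* singly linked list of devices on this bus */
};

struct toy_walk_ctx {
	struct toy_bus *bus;         /* bus whose direct children are counted */
	int count;                   /* matching devices directly on 'bus' */
	int visited;                 /* number of callback invocations */
};

/* Mirrors the filter in count_dports(): only direct children are counted. */
static void toy_count_dports(struct toy_dev *dev, struct toy_walk_ctx *ctx)
{
	ctx->visited++;
	if (dev->bus != ctx->bus)
		return;
	if (!dev->is_dport)
		return;
	ctx->count++;
}

/* Depth-first walk of a bus and every bus below it. */
static void toy_walk_bus(struct toy_bus *bus, struct toy_walk_ctx *ctx)
{
	assert(bus && ctx);
	for (struct toy_dev *dev = bus->devices; dev; dev = dev->next) {
		toy_count_dports(dev, ctx);
		if (dev->subordinate)
			toy_walk_bus(dev->subordinate, ctx);
	}
}
```

In this model the resulting count only ever reflects the direct children of the starting bus, while the traversal itself still visits every device below it. Whether that extra traversal matters in practice is the open question in this exchange.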
>
>> +
>> + return ctx.count;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
>
> See below for my comment on possible_dports.
>
> Since we only check for count > 1 the implementation could be
> simplified and renamed to e.g. cxl_port_has_multiple_dports which
> could easily be used to call devm_cxl_add_passthrough_decoder().
This would be possible if the function could return a bool. However, it can encounter errors, and an error must not be treated the same as a false (0) return value and result in a passthrough decoder being created. So I think we should stay with the current function name.
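A minimal sketch of that concern, as a userspace toy rather than driver code. toy_pick_decoder() and the enum are hypothetical names; the int argument stands in for the return value of cxl_port_get_possible_dports(), where a negative value is an errno.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of the "no HDM decoder capability" fallback path. */
enum toy_decoder_choice {
	TOY_DECODER_PASSTHROUGH,  /* single dport: a passthrough decoder is fine */
	TOY_DECODER_HDM_REQUIRED, /* otherwise a real HDM decoder is needed */
};

/*
 * Returns 0 and fills *choice on success. A negative errno in
 * 'possible_dports' is passed straight through and *choice is left
 * untouched, so an error can never be mistaken for "exactly one dport".
 */
static int toy_pick_decoder(int possible_dports,
			    enum toy_decoder_choice *choice)
{
	assert(choice);

	if (possible_dports < 0)
		return possible_dports;

	*choice = (possible_dports == 1) ? TOY_DECODER_PASSTHROUGH
					 : TOY_DECODER_HDM_REQUIRED;
	return 0;
}
```

With a bool-returning helper, a failed lookup and "not multiple dports" would both read as false, and the caller could fall back to a passthrough decoder on an error path. Keeping the int return leaves the three cases (error, one dport, anything else) distinguishable.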
>
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 25209952f469..877f888ee8f5 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
>> return port;
>> }
>>
>> -static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
>> - struct device *dport_dev,
>> - struct cxl_dport **dport)
>> -{
>> - struct cxl_find_port_ctx ctx = {
>> - .dport_dev = dport_dev,
>> - .parent_port = parent_port,
>> - .dport = dport,
>> - };
>> - struct cxl_port *port;
>> -
>> - port = __find_cxl_port(&ctx);
>> - return port;
>> -}
>> -
>> /*
>> * All users of grandparent() are using it to walk PCIe-like switch port
>> * hierarchy. A PCIe switch is comprised of a bridge device representing the
>> @@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
>> return map.resource;
>> }
>>
>> +static int match_port_by_uport(struct device *dev, const void *data)
>> +{
>> + const struct device *uport_dev = data;
>> + struct cxl_port *port;
>> +
>> + if (!is_cxl_port(dev))
>> + return 0;
>> +
>> + port = to_cxl_port(dev);
>> + return uport_dev == port->uport_dev;
>> +}
>> +
>> +/*
>> + * Function takes a device reference on the port device. Caller should do a
>> + * put_device() when done.
>> + */
>> +static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
>> +{
>> + struct device *dev;
>> +
>> + dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
>> + if (dev)
>> + return to_cxl_port(dev);
>> + return NULL;
>> +}
>> +
>> +static int update_decoder_targets(struct device *dev, void *data)
>> +{
>> + struct cxl_dport *dport = data;
>> + struct cxl_switch_decoder *cxlsd;
>> + struct cxl_decoder *cxld;
>> + int i;
>> +
>> + if (!is_switch_decoder(dev))
>> + return 0;
>> +
>> + cxlsd = to_cxl_switch_decoder(dev);
>> + cxld = &cxlsd->cxld;
>> + guard(rwsem_write)(&cxl_rwsem.region);
>> +
>> + /* Short cut for passthrough decoder */
>> + if (cxlsd->nr_targets == 1) {
>
> I think we should still check port_id. That is, remove the shortcut.
> If nr_targets == 1, then interleave_ways should be one too, so you
> gain nothing. Plus, you also see the dev_dbg().
ok
>
>> + cxlsd->target[0] = dport;
>> + return 0;
>> + }
>> +
>> + for (i = 0; i < cxld->interleave_ways; i++) {
>> + if (cxld->target_map[i] == dport->port_id) {
>> + cxlsd->target[i] = dport;
>> + dev_dbg(dev, "dport%d found in target list, index %d\n",
>> + dport->port_id, i);
>> + return 0;
>
> Only one target exists, right? Stop the iteration by returning a
> non-zero here (caller needs to be adjusted then).
ok
>
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_decoders_dport_update(struct cxl_dport *dport)
>> +{
>> + return device_for_each_child(&dport->port->dev, dport,
>> + update_decoder_targets);
>
> Might need changes if update_decoder_targets returns 1 to stop the
> iterator.
ok
>
>> +}
>> +
>> +static int cxl_switch_port_setup(struct cxl_port *port)
>> +{
>
> Could you factor out that function in a separate patch?
>
> The function only sets up decoders. Name it
> cxl_switch_port_setup_decoders()?
>
>> + struct cxl_hdm *cxlhdm;
>> +
>> + cxlhdm = devm_cxl_setup_hdm(port, NULL);
>> + if (!IS_ERR(cxlhdm))
>> + return devm_cxl_enumerate_decoders(cxlhdm, NULL);
>> +
>> + if (PTR_ERR(cxlhdm) != -ENODEV) {
>> + dev_err(&port->dev, "Failed to map HDM decoder capability\n");
>> + return PTR_ERR(cxlhdm);
>> + }
>> +
>> + if (port->possible_dports == 1) {
>> + dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
>> + return devm_cxl_add_passthrough_decoder(port);
>
> Imo, the possible_dports handling should be removed as it only
> introduces dead code. mock_cxl_setup_hdm() always returns a valid
> cxlhdm (unless for -ENOMEM) and the mock case never reaches this code
> here.
>
> So how about moving (the "real") devm_cxl_add_passthrough_decoder()
> and cxl_port_get_possible_dports() to devm_cxl_enumerate_decoders()?
> devm_cxl_add_passthrough_decoder() would be static then and
> cxl_port_get_possible_dports() will be a core.h function only. Then,
> mock_cxl_add_passthrough_decoder() could be removed too.
>
> I really would like to have a clean core module interface that allows
> an easy implementation of cxl_test and avoid too much impact to the
> driver code.
So after looking at this a bit, it looks like we need a bigger refactor than just devm_cxl_enumerate_decoders(). I have an attempt in the next rev that you can take a look at. It reduces the mock functions from 3-4 down to 2.
>
>> + }
>> +
>> + dev_err(&port->dev, "HDM decoder capability not found\n");
>> + return -ENXIO;
>> +}
>> +
>> +DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
>> +static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
>> + struct device *dport_dev)
>> +{
>> + struct cxl_dport *dport;
>> + int rc;
>> +
>> + guard(device)(&port->dev);
>> +
>> + if (!port->dev.driver)
>> + return ERR_PTR(-ENXIO);
>> +
>> + dport = cxl_find_dport_by_dev(port, dport_dev);
>> + if (dport)
>> + return dport;
>
> What is the case if there is already a dport bound to the port? Since
> there is a 1:1 mapping downstream, there is only one allocation and I
> would expect that dport never exists and an -EBUSY should be returned
> otherwise.
ok
>
>> +
>> + struct cxl_dport *new_dport __free(put_cxl_dport) =
>> + devm_cxl_add_dport_by_dev(port, dport_dev);
>
> See my comment on devm_cxl_add_dport_by_dev() above.
>
>> + if (IS_ERR(new_dport))
>> + return new_dport;
>> +
>> + cxl_switch_parse_cdat(port);
>> +
>> + /*
>> + * First instance of dport appearing, need to setup the port, including
>> + * allocating decoders.
>> + */
>> + if (port->nr_dports == 1) {
>> + rc = cxl_switch_port_setup(port);
>
> Can't this be done with port creation? I don't see a reason doing this
> late at this point.
>
>> + if (rc)
>> + return ERR_PTR(rc);
>> + return no_free_ptr(new_dport);
>> + }
>> +
>> + rc = cxl_decoders_dport_update(new_dport);
>> + if (rc)
>> + return ERR_PTR(rc);
>
> Maybe unfold cxl_decoders_dport_update() here?
ok
>
>> +
>> + return no_free_ptr(new_dport);
>> +}
>> +
>> +static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
>> + struct device *dport_dev)
>> +{
>> + struct cxl_port *port __free(put_cxl_port) =
>> + find_cxl_port_by_uport(uport_dev);
>> +
>> + if (!port)
>> + return ERR_PTR(-ENODEV);
>> +
>> + return cxl_port_get_or_add_dport(port, dport_dev);
>> +}
>
> That function can be removed, see below.
ok
>
>> +
>> +static struct cxl_dport *
>> +devm_cxl_create_or_extend_port(struct device *ep_dev,
>> + struct cxl_port *parent_port,
>> + struct cxl_dport *parent_dport,
>> + struct device *uport_dev,
>> + struct device *dport_dev)
>> +{
>> + resource_size_t component_reg_phys;
>> +
>> + guard(device)(&parent_port->dev);
>> +
>> + if (!parent_port->dev.driver) {
>> + dev_warn(ep_dev,
>> + "port %s:%s disabled, failed to enumerate CXL.mem\n",
>> + dev_name(&parent_port->dev), dev_name(uport_dev));
>> + return ERR_PTR(-ENXIO);
>> + }
>> +
>> + struct cxl_port *port __free(put_cxl_port) =
>> + find_cxl_port_by_uport(uport_dev);
>> +
>> + if (!port) {
>> + component_reg_phys = find_component_registers(uport_dev);
>> + port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>> + component_reg_phys, parent_dport);
>> + if (IS_ERR(port))
>> + return (struct cxl_dport *)port;
>> +
>> + /*
>> + * retry to make sure a port is found. a port device
>> + * reference is taken.
>> + */
>> + port = find_cxl_port_by_uport(uport_dev);
>> + if (!port)
>> + return ERR_PTR(-ENODEV);
>> +
>> + dev_dbg(ep_dev, "created port %s:%s\n",
>> + dev_name(&port->dev), dev_name(port->uport_dev));
>> + }
>> +
>> + return cxl_port_get_or_add_dport(port, dport_dev);
>> +}
>> +
>> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>> struct device *uport_dev,
>> struct device *dport_dev)
>> {
>> struct device *dparent = grandparent(dport_dev);
>> struct cxl_dport *dport, *parent_dport;
>> - resource_size_t component_reg_phys;
>> int rc;
>>
>> if (is_cxl_host_bridge(dparent)) {
>> + struct cxl_port *port __free(put_cxl_port) =
>> + find_cxl_port_by_uport(uport_dev);
>> /*
>> * The iteration reached the topology root without finding the
>> * CXL-root 'cxl_port' on a previous iteration, fail for now to
>> * be re-probed after platform driver attaches.
>> */
>> - dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>> - dev_name(dport_dev));
>> - return -ENXIO;
>> + if (!port) {
>> + dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>> + dev_name(dport_dev));
>> + return -ENXIO;
>> + }
>> +
>> + /*
>> + * While the port is found, there may not be a dport associated
>> + * yet. Try to associate the dport to the port. On return success,
>> + * the iteration will restart with the dport now attached.
>> + */
>> + dport = devm_cxl_add_dport_by_uport(uport_dev,
>> + dport_dev);
>
> port is known here, use cxl_port_get_or_add_dport(port, dport_dev)
> instead. Remove devm_cxl_add_dport_by_uport().
ok
>
>> + if (IS_ERR(dport))
>> + return PTR_ERR(dport);
>> +
>> + return 0;
>> }
>>
>> struct cxl_port *parent_port __free(put_cxl_port) =
>> @@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>> return -EAGAIN;
>> }
>>
>> - /*
>> - * Definition with __free() here to keep the sequence of
>> - * dereferencing the device of the port before the parent_port releasing.
>> - */
>> - struct cxl_port *port __free(put_cxl_port) = NULL;
>> - scoped_guard(device, &parent_port->dev) {
>> - if (!parent_port->dev.driver) {
>> - dev_warn(&cxlmd->dev,
>> - "port %s:%s disabled, failed to enumerate CXL.mem\n",
>> - dev_name(&parent_port->dev), dev_name(uport_dev));
>> - return -ENXIO;
>> - }
>> + dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
>> + parent_dport, uport_dev,
>> + dport_dev);
>
> You expand add_port_attach_ep() here. This function was originally
> called if there is no *port* at all. Now, as the dport_dev is not yet
> registered, the port may already exist, but it is not found since the
> dport_dev is not yet registered and add_port_attach_ep() is called now
> even if the port exists. I think we should move that dport_dev
> registration a level higher to devm_cxl_enumerate_ports(). That might
> need a cleanup of the iterator and the removal of
> add_port_attach_ep().
Yes. The new rev will move the dport registration a level up. There is no need to remove add_port_attach_ep(). devm_cxl_create_or_extend_port() will become devm_cxl_create_port().
>
>> + if (IS_ERR(dport))
>> + return PTR_ERR(dport);
>>
>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>> - if (!port) {
>> - component_reg_phys = find_component_registers(uport_dev);
>> - port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>> - component_reg_phys, parent_dport);
>> - if (IS_ERR(port))
>> - return PTR_ERR(port);
>> -
>> - /* retry find to pick up the new dport information */
>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>> - if (!port)
>> - return -ENXIO;
>> - }
>> - }
>> -
>> - dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
>> - dev_name(&port->dev), dev_name(port->uport_dev));
>> rc = cxl_add_ep(dport, &cxlmd->dev);
>> if (rc == -EBUSY) {
>> /*
>> @@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>> {
>> struct device *dev = &cxlmd->dev;
>> struct device *iter;
>> + int ports_need_create = 0;
>> int rc;
>>
>> /*
>> @@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>> struct device *uport_dev;
>> struct cxl_dport *dport;
>>
>> + ports_need_create++;
>> +
>> if (is_cxl_host_bridge(dport_dev))
>> return 0;
>>
>> @@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>
>> cxl_gpf_port_setup(dport);
>>
>> + ports_need_create--;
>> /* Any more ports to add between this one and the root? */
>> if (!dev_is_cxl_root_child(&port->dev))
>> continue;
>>
>> + /*
>> + * The 'ports_need_create' variable tracks a port being
>> + * created as it goes through this iterative loop. It's
>> + * incremented when it first enters the loop and decremented
>> + * when the port is found. If at the root of the hierarchy
>> + * and the variable is not 0, then it's missing a port
>> + * creation somewhere in the hierarchy and should restart.
>> + * For example in a setup where there's a PCI root port, a
>> + * switch, and an endpoint, it is possible to get to the
>> + * PCI root port and its creation, and the switch port is
>> + * still missing because the root port didn't exist. This
>> + * triggers a restart of the loop to create the switch port
>> + * now with a present root port.
>> + */
>> + if (ports_need_create)
>
> Uh, that becomes hard. Isn't the iterator much simpler:
>
> * Start the iter = endpoint.
>
> * Find first existing parent port up to the root.
>
> * If that is the direct parent of the endpoint, attach it to the
> parent (add dport etc.). Exit loop without errors.
>
> * Else, create port and attach it to the found parent port (including
> dport handling).
>
> * Fail on errors or retry otherwise.
>
> So, devm_cxl_enumerate_ports() should be reworked better, also address
> my other comments regarding add_port_attach_ep() and
> devm_cxl_create_or_extend_port().
So I reworked this whole path a bit. It may not be exactly what you are envisioning here, but it is a lot cleaner. You can take a look at the next rev.
>
>> + goto retry;
>> +
>> return 0;
>> }
>>
>> @@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>> if (rc == -EAGAIN)
>> continue;
>> /* failed to add ep or port */
>> - if (rc)
>> + if (rc < 0)
>> return rc;
>> +
>> + ports_need_create = 0;
>> /* port added, new descendants possible, start over */
>> goto retry;
>> }
>> @@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
>> device_lock_assert(&port->dev);
>>
>> if (xa_empty(&port->dports))
>> - return -EINVAL;
>> + return 0;
>>
>> guard(rwsem_write)(&cxl_rwsem.region);
>> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
>> struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
>>
>> - if (!dport)
>> - return -ENXIO;
>> + if (!dport) {
>> + /* dport may be activated later */
>> + continue;
>> + }
>> cxlsd->target[i] = dport;
>> }
>
> Should that be dropped entirely as the target setup is done somewhere
> else?
>
No. This is still needed for root ports.
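For reference, the split between the initial target population and the later per-dport update can be pictured with the userspace toy below. It is only a sketch under assumptions: the toy_* names and the fixed-size arrays are invented, and no locking is modeled. The first pass resolves whatever dports are already present and skips the rest; the second fills a slot when a dport shows up later and returns 1 so that a caller could stop iterating.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TARGETS 4

/* Toy stand-in for struct cxl_dport. */
struct toy_dport {
	int port_id;
};

/* Toy stand-in for the target bookkeeping of a switch decoder. */
struct toy_switch_decoder {
	int interleave_ways;
	int target_map[TOY_MAX_TARGETS];           /* port ids per slot */
	struct toy_dport *target[TOY_MAX_TARGETS]; /* NULL until resolved */
};

/*
 * Initial pass: resolve the slots whose dport is already known and
 * leave the others NULL. Returns the number of slots resolved.
 */
static int toy_populate_targets(struct toy_switch_decoder *d,
				struct toy_dport **known, int nr_known)
{
	int resolved = 0;

	assert(d->interleave_ways <= TOY_MAX_TARGETS);
	for (int i = 0; i < d->interleave_ways; i++) {
		for (int j = 0; j < nr_known; j++) {
			if (known[j]->port_id == d->target_map[i]) {
				d->target[i] = known[j];
				resolved++;
				break;
			}
		}
	}
	return resolved;
}

/*
 * Late pass for a dport that appeared after the initial pass.
 * Returns 1 if a slot was filled, 0 if the dport is not in the map.
 */
static int toy_update_target(struct toy_switch_decoder *d,
			     struct toy_dport *dport)
{
	for (int i = 0; i < d->interleave_ways; i++) {
		if (d->target_map[i] == dport->port_id) {
			d->target[i] = dport;
			return 1;
		}
	}
	return 0;
}
```

Because slots can stay NULL between the two passes, any reader of the target array has to tolerate unresolved entries, which is what the added NULL check in match_cxlrd_hb() accounts for.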
DJ
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 71cc42d05248..bba62867df90 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -1510,8 +1510,10 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>> cxl_rr->nr_targets_set);
>> return -ENXIO;
>> }
>> - } else
>> + } else {
>> cxlsd->target[cxl_rr->nr_targets_set] = ep->dport;
>> + cxlsd->cxld.target_map[cxl_rr->nr_targets_set] = ep->dport->port_id;
>> + }
>> inc = 1;
>> out_target_set:
>> cxl_rr->nr_targets_set += inc;
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 87a905db5ffb..df10a01376c6 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -591,6 +591,7 @@ struct cxl_dax_region {
>> * @parent_dport: dport that points to this port in the parent
>> * @decoder_ida: allocator for decoder ids
>> * @reg_map: component and ras register mapping parameters
>> + * @possible_dports: Total possible dports reported by hardware
>> * @nr_dports: number of entries in @dports
>> * @hdm_end: track last allocated HDM decoder instance for allocation ordering
>> * @commit_end: cursor to track highest committed decoder for commit ordering
>> @@ -612,6 +613,7 @@ struct cxl_port {
>> struct cxl_dport *parent_dport;
>> struct ida decoder_ida;
>> struct cxl_register_map reg_map;
>> + int possible_dports;
>> int nr_dports;
>> int hdm_end;
>> int commit_end;
>> @@ -911,6 +913,7 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>> struct access_coordinate *c2);
>>
>> bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>> +int cxl_port_get_possible_dports(struct cxl_port *port);
>>
>> /*
>> * Unit test builds overrides this to __weak, find the 'strong' version
>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>> index cf32dc50b7a6..941a7d7157bd 100644
>> --- a/drivers/cxl/port.c
>> +++ b/drivers/cxl/port.c
>> @@ -59,34 +59,17 @@ static int discover_region(struct device *dev, void *unused)
>>
>> static int cxl_switch_port_probe(struct cxl_port *port)
>> {
>> - struct cxl_hdm *cxlhdm;
>> - int rc;
>> + int dports;
>>
>> /* Cache the data early to ensure is_visible() works */
>> read_cdat_data(port);
>>
>> - rc = devm_cxl_port_enumerate_dports(port);
>> - if (rc < 0)
>> - return rc;
>> + dports = cxl_port_get_possible_dports(port);
>> + if (dports < 0)
>> + return dports;
>> + port->possible_dports = dports;
>
> As said, I think the whole possible_dports part can be removed.
>
> Thanks,
>
> -Robert
>
>>
>> - cxl_switch_parse_cdat(port);
>> -
>> - cxlhdm = devm_cxl_setup_hdm(port, NULL);
>> - if (!IS_ERR(cxlhdm))
>> - return devm_cxl_enumerate_decoders(cxlhdm, NULL);
>> -
>> - if (PTR_ERR(cxlhdm) != -ENODEV) {
>> - dev_err(&port->dev, "Failed to map HDM decoder capability\n");
>> - return PTR_ERR(cxlhdm);
>> - }
>> -
>> - if (rc == 1) {
>> - dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
>> - return devm_cxl_add_passthrough_decoder(port);
>> - }
>> -
>> - dev_err(&port->dev, "HDM decoder capability not found\n");
>> - return -ENXIO;
>> + return 0;
>> }
>>
>> static int cxl_endpoint_port_probe(struct cxl_port *port)
>> --
>> 2.50.1
>>
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-20 12:41 ` Robert Richter
2025-08-20 15:20 ` Dave Jiang
2025-08-27 21:15 ` Dave Jiang
@ 2025-08-27 21:37 ` Dave Jiang
2 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-08-27 21:37 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 8/20/25 5:41 AM, Robert Richter wrote:
> Hi Dave,
>
> see my comments below.
>
> On 14.08.25 15:21:45, Dave Jiang wrote:
<--snip-->
>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>> index cee68bbc7ff6..5263e9eba7d0 100644
>> --- a/drivers/cxl/core/hdm.c
>> +++ b/drivers/cxl/core/hdm.c
>> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
>> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>> {
>> struct cxl_switch_decoder *cxlsd;
>> - struct cxl_dport *dport = NULL;
>> - unsigned long index;
>> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
>>
>> /*
>> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>
>> device_lock_assert(&port->dev);
>>
>> - xa_for_each(&port->dports, index, dport)
>> - break;
>> - cxlsd->cxld.target_map[0] = dport->port_id;
>> -
>
> The change of initialization of cxlsd->cxld.target_map[] could have
> been a separate patch to reduce size of this patch.
It doesn't look like it can be split out, because the dport init changes need to happen first. The target_map changes can only come after the dport changes, but they also need to be part of the same patch so that a git bisect does not land on a broken intermediate state.
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-27 17:05 ` Dave Jiang
@ 2025-08-29 15:02 ` Robert Richter
2025-08-29 17:23 ` Dave Jiang
0 siblings, 1 reply; 40+ messages in thread
From: Robert Richter @ 2025-08-29 15:02 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 27.08.25 10:05:05, Dave Jiang wrote:
>
>
> On 8/26/25 12:51 AM, Robert Richter wrote:
> > On 22.08.25 08:52:39, Dave Jiang wrote:
> >>
> >>
> >> On 8/22/25 2:59 AM, Robert Richter wrote:
> >>> On 20.08.25 08:20:04, Dave Jiang wrote:
> >>>> On 8/20/25 5:41 AM, Robert Richter wrote:
> >>>>> Hi Dave,
> >>>>>
> >>>>> see my comments below.
> >>>>>
> >>>>> On 14.08.25 15:21:45, Dave Jiang wrote:
> >>>>
> >>>> <--snip-->
> >>>>
> >>>>>> + if (IS_ERR(new_dport))
> >>>>>> + return new_dport;
> >>>>>> +
> >>>>>> + cxl_switch_parse_cdat(port);
> >>>>>> +
> >>>>>> + /*
> >>>>>> + * First instance of dport appearing, need to setup the port, including
> >>>>>> + * allocating decoders.
> >>>>>> + */
> >>>>>> + if (port->nr_dports == 1) {
> >>>>>> + rc = cxl_switch_port_setup(port);
> >>>>>
> >>>>> Can't this be done with port creation? I don't see a reason doing this
> >>>>> late at this point.
> >>>>
> >>>
> >>>> The main reason we are doing this is to move the port register
> >>>> probing until we know the CXL link is established. Otherwise when
> >>>> cxl_acpi does probe and calls add_host_bridge_uport(), that
> >>>> devm_cxl_add_port() can trigger errors if the platform BIOS enables
> >>>> PCI hotplug support on Intel platforms. The error messages "cxl
> >>>> portN: Couldn't locate the CXL.cache and CXL.mem capability array
> >>>> header" is observed. Essentially we can be trying to map registers
> >>>> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
> >>>> that got pushed out, so did the decoder enumeration.
> >>>
> >>> The code suggests the Component Registers of the CXL Host Bridge are
> >>> not yet ready. Is this delayed after the first Root Port is connected
> >>> to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
> >>> (CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
> >>> the pcie config space, which is enumerated not before a CXL endpoint
> >>> becomes active. I haven't found a spec refs here. Please explain.
> >>
> >
> >> So the behavior is observed when PCIe hotplug support is turned on
> >> in BIOS for the Intel platform. A CXL device is plugged in to a RP
> >> without CXL switches. The thinking is that the CXL link is not fully
> >> established at the time when cxl_acpi_probe() is running and the
> >> ports are being added. And the only way to 100% be sure the link is
> >> established is when we are enumerating the memdev just like the
> >> dports. Not sure what spec ref are you looking for. Table 8-2
> >> indicates that those 2 DVSECs are mandatory for CXL root ports. Lack
> >> of presence means either the RP isn't CXL or the CXL link isn't
> >> established yet. I would assume this would also be true if a CXL
> >> memdev is hot-plugged into a slot post boot.
> >
> > But add_host_bridge_uport() only creates ports for the host bridge
> > (ACPI0016) devices and enumerates their component registers (CHBCR).
>
> And I think that's where the issue is. The component registers via CHBCR isn't there. When I removed this change, this is the signature I get:
>
> [ 37.423882] cxl_acpi:cxl_get_chbs:589: acpi ACPI0016:03: UID found: 35
> [ 37.424180] cxl_acpi:add_host_bridge_uport:726: acpi ACPI0016:03: CHBCR found for UID 35: 0x00000000aabf0000
> [ 37.424186] cxl_core:cxl_port_alloc:741: pci0000:3a: host-bridge: pci0000:3a
> [ 37.424210] cxl_core:cxl_map_regblock:426: cxl port2: Mapped CXL Memory Device resource 0x00000000aabf0000
> [ 37.424213] cxl_core:cxl_probe_component_regs:55: cxl port2: Couldn't locate the CXL.cache and CXL.mem capability array header.
Hmm, hot-added Host Bridges (and I would count this case among those)
should use the ACPI _CBR method. Otherwise, the host bridge should
be enumerated during boot.
Is it just a delay, or does the CHBCR not come up until the
root port link is up?
Can -EAGAIN be used to reload the driver later if CHBCR init fails?
IMO, the Component Registers cannot be initialized later as that would
delay the enablement of the root decoders too. At least only bridges
that fail to init the CHBCR should be delayed.
The issue and the changes for this are not obvious, please make a
separate patch for that separate change.
Thanks,
-Robert
>
> DJ
>
> > The root ports are being added already late as those are part of the
> > pci hierarchy. The root ports are discovered in
> > devm_cxl_enumerate_ports() not earlier than the mem_dev is probed.
> > devm_cxl_add_memdev() is called once the endpoint is probed and the
> > CXL link is up.
> >
> > That is, function cxl_port_get_or_add_dport() in add_port_attach_ep
> > should only add the dport. Then, a retry in the enumeration loop will
> > be triggered and the cxl_port for the root port is added. No explicit
> > call of cxl_switch_port_setup() should be needed, it can be done
> > during cxl_port_probe(). All done late after the endpoint was found.
> >
> > -Robert
>
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-29 15:02 ` Robert Richter
@ 2025-08-29 17:23 ` Dave Jiang
2025-09-01 14:48 ` Robert Richter
0 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-08-29 17:23 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 8/29/25 8:02 AM, Robert Richter wrote:
> On 27.08.25 10:05:05, Dave Jiang wrote:
>>
>>
>> On 8/26/25 12:51 AM, Robert Richter wrote:
>>> On 22.08.25 08:52:39, Dave Jiang wrote:
>>>>
>>>>
>>>> On 8/22/25 2:59 AM, Robert Richter wrote:
>>>>> On 20.08.25 08:20:04, Dave Jiang wrote:
>>>>>> On 8/20/25 5:41 AM, Robert Richter wrote:
>>>>>>> Hi Dave,
>>>>>>>
>>>>>>> see my comments below.
>>>>>>>
>>>>>>> On 14.08.25 15:21:45, Dave Jiang wrote:
>>>>>>
>>>>>> <--snip-->
>>>>>>
>>>>>>>> + if (IS_ERR(new_dport))
>>>>>>>> + return new_dport;
>>>>>>>> +
>>>>>>>> + cxl_switch_parse_cdat(port);
>>>>>>>> +
>>>>>>>> + /*
>>>>>>>> + * First instance of dport appearing, need to setup the port, including
>>>>>>>> + * allocating decoders.
>>>>>>>> + */
>>>>>>>> + if (port->nr_dports == 1) {
>>>>>>>> + rc = cxl_switch_port_setup(port);
>>>>>>>
>>>>>>> Can't this be done with port creation? I don't see a reason doing this
>>>>>>> late at this point.
>>>>>>
>>>>>
>>>>>> The main reason we are doing this is to move the port register
>>>>>> probing until we know the CXL link is established. Otherwise when
>>>>>> cxl_acpi does probe and calls add_host_bridge_uport(), that
>>>>>> devm_cxl_add_port() can trigger errors if the platform BIOS enables
>>>>>> PCI hotplug support on Intel platforms. The error messages "cxl
>>>>>> portN: Couldn't locate the CXL.cache and CXL.mem capability array
>>>>>> header" is observed. Essentially we can be trying to map registers
>>>>>> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
>>>>>> that got pushed out, so did the decoder enumeration.
>>>>>
>>>>> The code suggests the Component Registers of the CXL Host Bridge are
>>>>> not yet ready. Is this delayed after the first Root Port is connected
>>>>> to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
>>>>> (CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
>>>>> the pcie config space, which is enumerated not before a CXL endpoint
>>>>> becomes active. I haven't found a spec refs here. Please explain.
>>>>
>>>
>>>> So the behavior is observed when PCIe hotplug support is turned on
>>>> in BIOS for the Intel platform. A CXL device is plugged in to a RP
>>>> without CXL switches. The thinking is that the CXL link is not fully
>>>> established at the time when cxl_acpi_probe() is running and the
>>>> ports are being added. And the only way to 100% be sure the link is
>>>> established is when we are enumerating the memdev just like the
>>>> dports. Not sure what spec ref are you looking for. Table 8-2
>>>> indicates that those 2 DVSECs are mandatory for CXL root ports. Lack
>>>> of presence means either the RP isn't CXL or the CXL link isn't
>>>> established yet. I would assume this would also be true if a CXL
>>>> memdev is hot-plugged into a slot post boot.
>>>
>>> But add_host_bridge_uport() only creates ports for the host bridge
>>> (ACPI0016) devices and enumerates their component registers (CHBCR).
>>
>> And I think that's where the issue is. The component registers via CHBCR isn't there. When I removed this change, this is the signature I get:
>>
>> [ 37.423882] cxl_acpi:cxl_get_chbs:589: acpi ACPI0016:03: UID found: 35
>> [ 37.424180] cxl_acpi:add_host_bridge_uport:726: acpi ACPI0016:03: CHBCR found for UID 35: 0x00000000aabf0000
>> [ 37.424186] cxl_core:cxl_port_alloc:741: pci0000:3a: host-bridge: pci0000:3a
>> [ 37.424210] cxl_core:cxl_map_regblock:426: cxl port2: Mapped CXL Memory Device resource 0x00000000aabf0000
>> [ 37.424213] cxl_core:cxl_probe_component_regs:55: cxl port2: Couldn't locate the CXL.cache and CXL.mem capability array header.
>
> Hmm, hot-added Host Bridges (and I would count this case to those)
> should use the ACPI _CBR method. That is, else the host bridge should
> be enumerated during boot.
The host bridges are there; it appears that only the component registers behind the CHBCR are missing.
>
> Is it just a delay, or does the CHBCR come up not earlier than the
> root port link is up?
Let me get that clarification from the BIOS people.
>
> Can -EAGAIN be used to reload the driver later if CHBCR init fails?
> IMO, the Component Registers cannot be initialized later as that would
> delay the enablement of the root decoders too. At least only bridges
> that fail to init the CHBCR should be delayed.
I don't follow why this is an issue. Auto region assembly doesn't start until the port hierarchy is established via the first endpoint. So by the time the region code pokes at the decoder registers during region assembly, the component registers behind the CHBCR should have been probed. It seems reasonable to set up the component registers when we find the first dport, since that indicates everything should be there. Are you observing an issue on a platform?
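A rough user-space model of that ordering, for illustration only. It is not kernel code: struct model_port, model_port_setup() and model_add_dport() are made-up names, and the link_up flag simply assumes that a probed endpoint implies an established link.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a port; not the kernel's struct cxl_port. */
struct model_port {
	int nr_dports;    /* dports attached so far */
	bool link_up;     /* models "an endpoint below this port was probed" */
	bool regs_mapped; /* models component register and decoder setup done */
	int setup_calls;  /* how often setup was attempted */
};

/* Models register probing: it only succeeds once the link is up. */
static int model_port_setup(struct model_port *port)
{
	port->setup_calls++;
	if (!port->link_up)
		return -1; /* models the "Couldn't locate ..." failure */
	port->regs_mapped = true;
	return 0;
}

/* Models adding a dport on behalf of an endpoint that has been probed. */
int model_add_dport(struct model_port *port)
{
	port->link_up = true; /* assumption of this sketch, see above */
	port->nr_dports++;

	/* Only the first dport triggers the one-time port setup. */
	if (port->nr_dports == 1)
		return model_port_setup(port);
	return 0;
}

void model_first_dport_example(void)
{
	struct model_port port = { 0 };

	/* Probing before any endpoint shows up fails in this model. */
	assert(model_port_setup(&port) != 0);
	assert(!port.regs_mapped);

	/* The first dport maps the registers, the second does not redo it. */
	assert(model_add_dport(&port) == 0);
	assert(port.regs_mapped);
	assert(model_add_dport(&port) == 0);
	assert(port.setup_calls == 2);
}
```

Calling model_first_dport_example() from a test main() exercises both the early failure and the deferred success. The model does not cover the case where several hotplug ports come up at different times.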
>
> The issue and the changes for this are not obvious, please make a
> separate patch for that separate change.
It was in this patch [1].
[1]: https://lore.kernel.org/linux-cxl/20250814222151.3520500-5-dave.jiang@intel.com/
I think the name of the function being discussed confused things. It has since been renamed to indicate that it sets up decoders, rather than just "setup".
DJ
>
> Thanks,
>
> -Robert
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-29 17:23 ` Dave Jiang
@ 2025-09-01 14:48 ` Robert Richter
2025-09-02 15:58 ` Dave Jiang
0 siblings, 1 reply; 40+ messages in thread
From: Robert Richter @ 2025-09-01 14:48 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 29.08.25 10:23:38, Dave Jiang wrote:
>
>
> On 8/29/25 8:02 AM, Robert Richter wrote:
> > On 27.08.25 10:05:05, Dave Jiang wrote:
> >>
> >>
> >> On 8/26/25 12:51 AM, Robert Richter wrote:
> >>> On 22.08.25 08:52:39, Dave Jiang wrote:
> >>>>
> >>>>
> >>>> On 8/22/25 2:59 AM, Robert Richter wrote:
> >>>>> On 20.08.25 08:20:04, Dave Jiang wrote:
> >>>>>> On 8/20/25 5:41 AM, Robert Richter wrote:
> >>>>>>> Hi Dave,
> >>>>>>>
> >>>>>>> see my comments below.
> >>>>>>>
> >>>>>>> On 14.08.25 15:21:45, Dave Jiang wrote:
> >>>>>>
> >>>>>> <--snip-->
> >>>>>>
> >>>>>>>> + if (IS_ERR(new_dport))
> >>>>>>>> + return new_dport;
> >>>>>>>> +
> >>>>>>>> + cxl_switch_parse_cdat(port);
> >>>>>>>> +
> >>>>>>>> + /*
> >>>>>>>> + * First instance of dport appearing, need to setup the port, including
> >>>>>>>> + * allocating decoders.
> >>>>>>>> + */
> >>>>>>>> + if (port->nr_dports == 1) {
> >>>>>>>> + rc = cxl_switch_port_setup(port);
I want to come back to my previous comment that port setup should be
done as part of the port enumeration in devm_cxl_enumerate_ports().
No need to make this a special case here. The link should be up
already once the mem dev is visible.
> >>>>>>>
> >>>>>>> Can't this be done with port creation? I don't see a reason doing this
> >>>>>>> late at this point.
> >>>>>>
> >>>>>
> >>>>>> The main reason we are doing this is to move the port register
> >>>>>> probing until we know the CXL link is established. Otherwise when
> >>>>>> cxl_acpi does probe and calls add_host_bridge_uport(), that
> >>>>>> devm_cxl_add_port() can trigger errors if the platform BIOS enables
> >>>>>> PCI hotplug support on Intel platforms. The error messages "cxl
> >>>>>> portN: Couldn't locate the CXL.cache and CXL.mem capability array
> >>>>>> header" is observed. Essentially we can be trying to map registers
> >>>>>> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
> >>>>>> that got pushed out, so did the decoder enumeration.
> >>>>>
> >>>>> The code suggests the Component Registers of the CXL Host Bridge are
> >>>>> not yet ready. Is this delayed after the first Root Port is connected
> >>>>> to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
> >>>>> (CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
> >>>>> the pcie config space, which is enumerated not before a CXL endpoint
> >>>>> becomes active. I haven't found a spec refs here. Please explain.
> >>>>
> >>>
> >>>> So the behavior is observed when PCIe hotplug support is turned on
> >>>> in BIOS for the Intel platform. A CXL device is plugged in to a RP
> >>>> without CXL switches. The thinking is that the CXL link is not fully
> >>>> established at the time when cxl_acpi_probe() is running and the
> >>>> ports are being added. And the only way to 100% be sure the link is
> >>>> established is when we are enumerating the memdev just like the
> >>>> dports. Not sure what spec ref are you looking for. Table 8-2
> >>>> indicates that those 2 DVSECs are mandatory for CXL root ports. Lack
> >>>> of presence means either the RP isn't CXL or the CXL link isn't
> >>>> established yet. I would assume this would also be true if a CXL
> >>>> memdev is hot-plugged into a slot post boot.
> >>>
> >>> But add_host_bridge_uport() only creates ports for the host bridge
> >>> (ACPI0016) devices and enumerates their component registers (CHBCR).
> >>
> >> And I think that's where the issue is. The component registers via CHBCR isn't there. When I removed this change, this is the signature I get:
> >>
> >> [ 37.423882] cxl_acpi:cxl_get_chbs:589: acpi ACPI0016:03: UID found: 35
> >> [ 37.424180] cxl_acpi:add_host_bridge_uport:726: acpi ACPI0016:03: CHBCR found for UID 35: 0x00000000aabf0000
> >> [ 37.424186] cxl_core:cxl_port_alloc:741: pci0000:3a: host-bridge: pci0000:3a
> >> [ 37.424210] cxl_core:cxl_map_regblock:426: cxl port2: Mapped CXL Memory Device resource 0x00000000aabf0000
> >> [ 37.424213] cxl_core:cxl_probe_component_regs:55: cxl port2: Couldn't locate the CXL.cache and CXL.mem capability array header.
> >
> > Hmm, hot-added Host Bridges (and I would count this case to those)
> > should use the ACPI _CBR method. That is, else the host bridge should
> > be enumerated during boot.
>
> host bridges are there, just not the component registers in the CHBCR it appears.
>
> >
> > Is it just a delay, or does the CHBCR come up not earlier than the
> > root port link is up?
>
> Let me get that clarification from the BIOS people.
>
> >
> > Can -EAGAIN be used to reload the driver later if CHBCR init fails?
> > IMO, the Component Registers cannot be initialized later as that would
> > delay the enablement of the root decoders too. At least only bridges
> > that fail to init the CHBCR should be delayed.
>
> I don't follow why this is an issue. Auto region assembly doesn't
> start until the port hierarchy is established via the first
> endpoint. So by the time the region code pokes at the decoder
> registers during region assembly, the component register for CHBCR
> should have been probed. It seems reasonable to setup the component
> registers when we find the first dport and thus indicate that
> everything should be there. Are you observing an issue on a
> platform?
It would be good to see the bridges in the system regardless of the
link status of the root ports, which I think is possible. Same with
the CHBCR and the HB decoders. Only defer it if it is not yet available,
and let's mark that as a quirk or workaround. Btw, this is not a delayed
dport enablement any longer.
I am also a bit worried about the conditions to run the setup. What if
there are multiple hotplug ports, why should the CHBCR be ready with
the first one already? Shouldn't all connected ports come up first?
> >
> > The issue and the changes for this are not obvious, please make a
> > separate patch for that separate change.
>
> It was in this [1] patch.
>
> https://lore.kernel.org/linux-cxl/20250814222151.3520500-5-dave.jiang@intel.com/
>
> I think the name of the function being discussed confused things. It is now renamed to setup decoders instead of just setup.
Yes, but there is this additional change that calls
cxl_switch_port_setup() in this patch.
-Robert
>
> DJ
>
> >
> > Thanks,
> >
> > -Robert
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-08-27 21:15 ` Dave Jiang
@ 2025-09-01 17:29 ` Robert Richter
2025-09-02 15:40 ` Dave Jiang
2025-09-03 18:21 ` Dave Jiang
0 siblings, 2 replies; 40+ messages in thread
From: Robert Richter @ 2025-09-01 17:29 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 27.08.25 14:15:21, Dave Jiang wrote:
>
>
> On 8/20/25 5:41 AM, Robert Richter wrote:
> > Hi Dave,
> >
> > see my comments below.
> >
> > On 14.08.25 15:21:45, Dave Jiang wrote:
> >> The current implementation enumerates the dports during the cxl_port
> >> driver probe. Without an endpoint connected, the dport may not be
> >> active during port probe. This scheme may prevent a valid hardware
> >> dport id from being retrieved and MMIO registers from being read when an endpoint
> >> is hot-plugged. Move the dport allocation and setup to behind memdev
> >> probe so the endpoint is guaranteed to be connected.
> >>
> >> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
> >> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
> >> ACPI0017.N device. Through that it enumerates downstream ports composed
> >> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
> >> uses add_host_bridge_uport() to create the ports that enumerate the PCI
> >> RPs as the dports of these ports. Every time a port is created, the port
> >> driver is attached, cxl_switch_port_probe() is called and
> >> devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
> >> the dports.
> >>
> >> The second phase is if there are any CXL switches. When the pci endpoint
> >> device driver (cxl_pci) calls probe, it will add a mem device and trigger
> >> cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
> >> and attempts to discover and create all the ports that represent CXL switches.
> >> During this phase, a port is created per switch and the attached dports
> >> are also enumerated and probed.
> >>
> >> The last phase is creating endpoint port which happens for all endpoint
> >> devices.
> >>
> >> In this commit, the port create and its dport probing in cxl_acpi is not
> >> changed. That will be handled later. The behavior change is only for CXL
> >> switch ports. Only the dport that is part of the path for an endpoint
> >> device to the RP will be probed. This happens naturally by the code
> >> walking up the device hierarchy and identifying the upstream device and
> >> the downstream device.
> >>
> >> The new sequence is instead of creating all possible dports at initial
> >> port creation, defer port instantiation until a memdev beneath that
> >> dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
> >> the creation and extension of ports with new dports as memory devices
> >> arrive. As part of this rework, switch decoder target list is amended
> >> at runtime as dports show up.
> >>
> >> While the decoders are allocated during the port driver probe,
> >> the decoders must also be updated. Previously this was all done once all
> >> the dports were set up; now, every time a dport is set up for an endpoint, the
> >> switch target list needs to be updated with the new dport. A
> >> guard(rwsem_write) is used to update decoder targets. This is similar to
> >> when decoder_populate_target() is called and the decoder programming
> >> must be protected.
> >>
> >> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
> >> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> >> ---
> >> v8:
> >> - grammar and spelling fixups (Dan)
> >> - Clarify commit log story. (Dan)
> >> - Move register mapping and decoder enumeration to when first dport shows up (Dan)
> >> - Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
> >> - cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
> >> - Remove failure path for possible dports == 0. (Dan, Robert)
> >> - update_switch_decoder() -> update_decoder_targets(). (Dan)
> >> - Remove lock asserts where not needed. (Dan)
> >> - Add support for passthrough decoder init. (Dan)
> >> - Return -ENXIO when no driver attached. (Dan)
> >> - Move guard() from devm-cxl_add_dport_by_uport. (Dan, Robert)
> >> - Add devm_cxl_create_or_extend_port() helper. (Dan)
> >> - Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
> >> - Remove 'new_dport' local var. (Robert)
> >> - Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
> >> - Move port check logic to add_port_attach_ep(). (Robert)
> >> ---
> >> drivers/cxl/core/cdat.c | 2 +-
> >> drivers/cxl/core/core.h | 2 +
> >> drivers/cxl/core/hdm.c | 6 -
> >> drivers/cxl/core/pci.c | 81 +++++++++++
> >> drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
> >> drivers/cxl/core/region.c | 4 +-
> >> drivers/cxl/cxl.h | 3 +
> >> drivers/cxl/port.c | 29 +---
> >> 8 files changed, 331 insertions(+), 83 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> >> index c0af645425f4..b156b81a9b20 100644
> >> --- a/drivers/cxl/core/cdat.c
> >> +++ b/drivers/cxl/core/cdat.c
> >> @@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
> >>
> >> guard(rwsem_read)(&cxl_rwsem.region);
> >> for (int i = 0; i < cxlsd->nr_targets; i++) {
> >> - if (host_bridge == cxlsd->target[i]->dport_dev)
> >> + if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
> >> return 1;
> >> }
> >>
> >> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> >> index 2669f251d677..2ac71eb459e6 100644
> >> --- a/drivers/cxl/core/core.h
> >> +++ b/drivers/cxl/core/core.h
> >> @@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
> >> int cxl_ras_init(void);
> >> void cxl_ras_exit(void);
> >> int cxl_gpf_port_setup(struct cxl_dport *dport);
> >> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
> >> + struct device *dport_dev);
> >>
> >> #ifdef CONFIG_CXL_FEATURES
> >> struct cxl_feat_entry *
> >> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> >> index cee68bbc7ff6..5263e9eba7d0 100644
> >> --- a/drivers/cxl/core/hdm.c
> >> +++ b/drivers/cxl/core/hdm.c
> >> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
> >> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
> >> {
> >> struct cxl_switch_decoder *cxlsd;
> >> - struct cxl_dport *dport = NULL;
> >> - unsigned long index;
> >> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
> >>
> >> /*
> >> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
> >>
> >> device_lock_assert(&port->dev);
> >>
> >> - xa_for_each(&port->dports, index, dport)
> >> - break;
> >> - cxlsd->cxld.target_map[0] = dport->port_id;
> >> -
> >
> > The change of initialization of cxlsd->cxld.target_map[] could have
> > been a separate patch to reduce size of this patch.
> >
> >> return add_hdm_decoder(port, &cxlsd->cxld);
> >> }
> >> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
> >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> >> index b50551601c2e..b9d770f1aa7b 100644
> >> --- a/drivers/cxl/core/pci.c
> >> +++ b/drivers/cxl/core/pci.c
> >> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
> >> module_param(media_ready_timeout, ushort, 0644);
> >> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
> >>
> >> +/**
> >> + * devm_cxl_add_dport_by_dev - allocate a dport by dport device
> >> + * @port: cxl_port that hosts the dport
> >> + * @dport_dev: 'struct device' of the dport
> >> + *
> >> + * Returns the allocated dport on success or ERR_PTR() of -errno on error
> >> + */
> >> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
> >
> > This function only determines the port_num. How about only implement
> > this in a function cxl_pci_get_port_num() and call devm_cxl_add_dport
> > directly?
>
> I can split out the code to get the port_num locally, but we can't
> call devm_cxl_add_dport() directly in core/port.c because we need
> the map.resource and in order to retrieve that cxl_find_regblock()
> requires a pci dev.
I mean the following:
In the mock case, there is always a decoder. That is,
devm_cxl_add_passthrough_decoder() will only be used for pci devs.
Create cxl_port_has_multiple_dports() which contains:
	if (!dev_is_pci(...))
		return false;
	/* pci_walk_bus() and inspect dports: */
	...
In cxl_switch_port_setup():
	rc = devm_cxl_enumerate_decoders(...);
	if (!rc)
		return 0;
	if (cxl_port_has_multiple_dports(...))
		rc = devm_cxl_add_passthrough_decoder(...);
You don't need function devm_cxl_add_dport_by_dev() any longer, just
use devm_cxl_add_dport() instead.
> >
> > That would nicely fit into core/pci.c.
> >
> >> + struct device *dport_dev)
> >> +{
> >> + struct cxl_register_map map;
> >> + struct pci_dev *pdev;
> >> + u32 lnkcap, port_num;
> >> + int type;
> >> + int rc;
> >> +
> >> + if (!dev_is_pci(dport_dev))
> >> + return ERR_PTR(-EINVAL);
> >> +
> >> + device_lock_assert(&port->dev);
> >> +
> >> + pdev = to_pci_dev(dport_dev);
> >> + type = pci_pcie_type(pdev);
> >> + if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
> >> + return ERR_PTR(-EINVAL);
> >> +
> >> + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> >> + &lnkcap))
> >> + return ERR_PTR(-ENXIO);
> >> +
> >> + rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> >> + if (rc)
> >> + dev_dbg(&port->dev, "failed to find component registers\n");
> >> +
> >> + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> >
> > So, just return port_num instead.
> >
> >> + return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
> >> +}
> >> +
> >> struct cxl_walk_context {
> >> struct pci_bus *bus;
> >> struct cxl_port *port;
> >> @@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
> >>
> >> return 0;
> >> }
> >> +
> >> +static int count_dports(struct pci_dev *pdev, void *data)
> >> +{
> >> + struct cxl_walk_context *ctx = data;
> >> + int type = pci_pcie_type(pdev);
> >> +
> >> + if (pdev->bus != ctx->bus)
> >> + return 0;
> >> + if (!pci_is_pcie(pdev))
> >> + return 0;
> >> + if (type != ctx->type)
> >> + return 0;
> >> +
> >> + ctx->count++;
> >> + return 0;
> >> +}
> >> +
> >> +int cxl_port_get_possible_dports(struct cxl_port *port)
> >> +{
> >> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
> >> + struct cxl_walk_context ctx;
> >> + int type;
> >> +
> >> + if (!bus) {
> >> + dev_err(&port->dev, "No PCI bus found for port %s\n",
> >> + dev_name(&port->dev));
> >> + return -ENXIO;
> >> + }
> >> +
> >> + if (pci_is_root_bus(bus))
> >> + type = PCI_EXP_TYPE_ROOT_PORT;
> >> + else
> >> + type = PCI_EXP_TYPE_DOWNSTREAM;
> >> +
> >> + ctx = (struct cxl_walk_context) {
> >> + .bus = bus,
> >> + .type = type,
> >> + };
> >> + pci_walk_bus(bus, count_dports, &ctx);
> >
> > Don't walk the whole bus, just check children of port->uport_dev.
>
> cxl_port_to_pci_bus() gets the pdev->subordinate of the
> port->uport_dev. So I think that's equivalent of checking the
> children of port->uport_dev and not actually walking the whole pci
> bus no?
pci_walk_bus() also descends into subordinate buses. So the result is
equivalent, but count_dports() is also called for devices that are not
direct children, and it is not obvious that only direct children are
counted. Use device_for_each_child() instead?
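To make the difference concrete, here is a small user-space sketch, assuming a toy device tree. struct model_dev and the model_*() helpers are hypothetical and only mimic the shape of the two approaches: a full walk that filters by parent (comparable to the pdev->bus check in count_dports()) versus iterating direct children only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy device; not a kernel structure. */
struct model_dev {
	struct model_dev *parent;
	struct model_dev *child;   /* first child */
	struct model_dev *sibling; /* next sibling */
	int is_dport;
};

/* Link @dev under @parent at the head of the child list. */
void model_add_child(struct model_dev *parent, struct model_dev *dev)
{
	dev->parent = parent;
	dev->sibling = parent->child;
	parent->child = dev;
}

/* Visit the whole subtree, but count only dports directly below @top. */
static void walk_subtree(const struct model_dev *dev,
			 const struct model_dev *top,
			 int *visited, int *count)
{
	for (const struct model_dev *c = dev->child; c; c = c->sibling) {
		(*visited)++;
		/* Filter: ignore anything that is not a direct child. */
		if (c->parent == top && c->is_dport)
			(*count)++;
		walk_subtree(c, top, visited, count);
	}
}

int model_count_by_full_walk(const struct model_dev *top, int *visited)
{
	int count = 0;

	*visited = 0;
	walk_subtree(top, top, visited, &count);
	return count;
}

/* Visit direct children only; no filter on the parent is needed. */
int model_count_direct_children(const struct model_dev *top, int *visited)
{
	int count = 0;

	*visited = 0;
	for (const struct model_dev *c = top->child; c; c = c->sibling) {
		(*visited)++;
		if (c->is_dport)
			count++;
	}
	return count;
}

void model_walk_example(void)
{
	struct model_dev top = { 0 }, a = { .is_dport = 1 };
	struct model_dev b = { .is_dport = 1 }, sw = { 0 };
	struct model_dev deep = { .is_dport = 1 };
	int visited_all, visited_direct;

	model_add_child(&top, &a);
	model_add_child(&top, &b);
	model_add_child(&a, &sw);    /* a switch below dport a */
	model_add_child(&sw, &deep); /* a dport one level further down */

	/* Same count, but the full walk touches devices it then ignores. */
	assert(model_count_by_full_walk(&top, &visited_all) == 2);
	assert(model_count_direct_children(&top, &visited_direct) == 2);
	assert(visited_all == 4);
	assert(visited_direct == 2);
}
```

Both variants return the same count here. The direct-children variant makes the intent visible in the loop itself, which is the readability point raised above.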
> >
> >> +
> >> + return ctx.count;
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
> >
> > See below for my comment on possible_dports.
> >
> > Since we only check for count > 1 the implementation could be
> > simplified and renamed to e.g. cxl_port_has_multiple_dports which
> > could easily be used to call devm_cxl_add_passthrough_decoder().
>
> This would be possible if the function can return a bool. However,
> it is possible to encounter errors. And errors should not be
> equivalent to a false (0) return value and resulting in a
> passthrough decoder creation. Thus I think we should stay with the
> current function name.
Ok, but see also above.
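The point about errors versus a plain "false" can be sketched in user space as follows. This is only an illustration under assumptions: enum model_decoder, model_has_multiple_dports() and model_pick_decoder() are hypothetical names, a negative input stands in for "the dport count could not be determined", and the fallback follows the removed cxl_switch_port_probe() code quoted earlier in the thread, which takes the passthrough path only when exactly one dport was found. The zero-dport case is not modelled separately.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of the decoder selection in this sketch. */
enum model_decoder { MODEL_NONE, MODEL_PASSTHROUGH, MODEL_HDM };

/*
 * Hypothetical helper with three outcomes: 1 for more than one dport,
 * 0 otherwise, negative errno if the count is unknown (modelled here
 * by a negative @possible_dports).
 */
int model_has_multiple_dports(int possible_dports)
{
	if (possible_dports < 0)
		return -ENXIO;
	return possible_dports > 1;
}

/*
 * Prefer the HDM decoder capability when present. Without it, fall back
 * to a passthrough decoder only for the single-dport case. An error from
 * the helper is propagated and never treated as "false".
 */
int model_pick_decoder(int hdm_present, int possible_dports,
		       enum model_decoder *out)
{
	int rc;

	*out = MODEL_NONE;
	if (hdm_present) {
		*out = MODEL_HDM;
		return 0;
	}

	rc = model_has_multiple_dports(possible_dports);
	if (rc < 0)
		return rc;     /* error: no passthrough decoder */
	if (rc)
		return -ENXIO; /* models "HDM decoder capability not found" */

	*out = MODEL_PASSTHROUGH;
	return 0;
}

void model_decoder_example(void)
{
	enum model_decoder dec;

	assert(model_pick_decoder(1, 4, &dec) == 0 && dec == MODEL_HDM);
	assert(model_pick_decoder(0, 1, &dec) == 0 &&
	       dec == MODEL_PASSTHROUGH);
	assert(model_pick_decoder(0, 4, &dec) < 0 && dec == MODEL_NONE);
	/* A failed lookup must not end up as a passthrough decoder. */
	assert(model_pick_decoder(0, -1, &dec) < 0 && dec == MODEL_NONE);
}
```

With a bool return type, the last case would collapse into "not multiple" and select the passthrough decoder, which is the outcome the int return type avoids.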
> >
> >> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> >> index 25209952f469..877f888ee8f5 100644
> >> --- a/drivers/cxl/core/port.c
> >> +++ b/drivers/cxl/core/port.c
> >> @@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
> >> return port;
> >> }
> >>
> >> -static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
> >> - struct device *dport_dev,
> >> - struct cxl_dport **dport)
> >> -{
> >> - struct cxl_find_port_ctx ctx = {
> >> - .dport_dev = dport_dev,
> >> - .parent_port = parent_port,
> >> - .dport = dport,
> >> - };
> >> - struct cxl_port *port;
> >> -
> >> - port = __find_cxl_port(&ctx);
> >> - return port;
> >> -}
> >> -
> >> /*
> >> * All users of grandparent() are using it to walk PCIe-like switch port
> >> * hierarchy. A PCIe switch is comprised of a bridge device representing the
> >> @@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
> >> return map.resource;
> >> }
> >>
> >> +static int match_port_by_uport(struct device *dev, const void *data)
> >> +{
> >> + const struct device *uport_dev = data;
> >> + struct cxl_port *port;
> >> +
> >> + if (!is_cxl_port(dev))
> >> + return 0;
> >> +
> >> + port = to_cxl_port(dev);
> >> + return uport_dev == port->uport_dev;
> >> +}
> >> +
> >> +/*
> >> + * Function takes a device reference on the port device. Caller should do a
> >> + * put_device() when done.
> >> + */
> >> +static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
> >> +{
> >> + struct device *dev;
> >> +
> >> + dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
> >> + if (dev)
> >> + return to_cxl_port(dev);
> >> + return NULL;
> >> +}
> >> +
> >> +static int update_decoder_targets(struct device *dev, void *data)
> >> +{
> >> + struct cxl_dport *dport = data;
> >> + struct cxl_switch_decoder *cxlsd;
> >> + struct cxl_decoder *cxld;
> >> + int i;
> >> +
> >> + if (!is_switch_decoder(dev))
> >> + return 0;
> >> +
> >> + cxlsd = to_cxl_switch_decoder(dev);
> >> + cxld = &cxlsd->cxld;
> >> + guard(rwsem_write)(&cxl_rwsem.region);
> >> +
> >> + /* Short cut for passthrough decoder */
> >> + if (cxlsd->nr_targets == 1) {
> >
> > I think we should still check port_id. That is, remove the shortcut.
> > If nr_targets == 1, then interleave_ways should be one too, so you
> > gain nothing. Plus, you also see the dev_dbg().
>
> ok
>
> >
> >> + cxlsd->target[0] = dport;
> >> + return 0;
> >> + }
> >> +
> >> + for (i = 0; i < cxld->interleave_ways; i++) {
> >> + if (cxld->target_map[i] == dport->port_id) {
> >> + cxlsd->target[i] = dport;
> >> + dev_dbg(dev, "dport%d found in target list, index %d\n",
> >> + dport->port_id, i);
> >> + return 0;
> >
> > Only one target exists, right? Stop the iteration by returning a
> > non-zero here (caller needs to be adjusted then).
>
> ok
>
> >
> >> + }
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int cxl_decoders_dport_update(struct cxl_dport *dport)
> >> +{
> >> + return device_for_each_child(&dport->port->dev, dport,
> >> + update_decoder_targets);
> >
> > Might need changes if update_decoder_targets returns 1 to stop the
> > iterator.
>
> ok
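
To make the "return non-zero to stop" idea concrete, here is a minimal
userspace sketch. It assumes an iterator that stops at the first non-zero
callback return and hands that value back, and it assumes a dport appears
in only one decoder's target list, which this thread does not establish.
struct toy_decoder, update_targets() and decoders_dport_update() are
hypothetical names:

```c
#include <stddef.h>

#define TOY_MAX_WAYS 4

/* Toy switch decoder: target_map holds port ids, target the resolved ones. */
struct toy_decoder {
	int interleave_ways;
	int target_map[TOY_MAX_WAYS];
	int target[TOY_MAX_WAYS];  /* -1 while the dport has not shown up */
};

/* Returns 1 once the dport was placed (stop iterating), 0 to continue. */
static int update_targets(struct toy_decoder *d, int port_id)
{
	for (int i = 0; i < d->interleave_ways; i++) {
		if (d->target_map[i] == port_id) {
			d->target[i] = port_id;
			return 1;
		}
	}
	return 0;
}

/*
 * Iterate until a callback returns non-zero. A positive value only means
 * "found, stopped early", so the caller must map it to success (0) and
 * pass through negative error codes unchanged.
 */
static int decoders_dport_update(struct toy_decoder *decs, size_t n,
				 int port_id)
{
	int rc = 0;

	for (size_t i = 0; i < n && !rc; i++)
		rc = update_targets(&decs[i], port_id);

	return rc < 0 ? rc : 0;
}
```

The adjustment in the caller is the last line: without it, the "found"
value of 1 would be treated as an error by an "if (rc)" check.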
>
> >
> >> +}
> >> +
> >> +static int cxl_switch_port_setup(struct cxl_port *port)
> >> +{
> >
> > Could you factor out that function in a separate patch?
> >
> > The function only sets up decoders. Name it
> > cxl_switch_port_setup_decoders()?
> >
> >> + struct cxl_hdm *cxlhdm;
> >> +
> >> + cxlhdm = devm_cxl_setup_hdm(port, NULL);
> >> + if (!IS_ERR(cxlhdm))
> >> + return devm_cxl_enumerate_decoders(cxlhdm, NULL);
> >> +
> >> + if (PTR_ERR(cxlhdm) != -ENODEV) {
> >> + dev_err(&port->dev, "Failed to map HDM decoder capability\n");
> >> + return PTR_ERR(cxlhdm);
> >> + }
> >> +
> >> + if (port->possible_dports == 1) {
> >> + dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
> >> + return devm_cxl_add_passthrough_decoder(port);
> >
> > Imo, the possible_dports handling should be removed as it only
> > introduces dead code. mock_cxl_setup_hdm() always returns a valid
> > cxlhdm (unless for -ENOMEM) and the mock case never reaches this code
> > here.
> >
> > So how about moving (the "real") devm_cxl_add_passthrough_decoder()
> > and cxl_port_get_possible_dports() to devm_cxl_enumerate_decoders()?
> > devm_cxl_add_passthrough_decoder() would be static then and
> > cxl_port_get_possible_dports() will be a core.h function only. Then,
> > mock_cxl_add_passthrough_decoder() could be removed too.
> >
> > I really would like to have a clean core module interface that allows
> > an easy implementation of cxl_test and avoid too much impact to the
> > driver code.
>
> So after looking at this a bit, it looks like we need a bigger refactor than just devm_cxl_enumerate_decoders(). I have an attempt in the next rev you can take a look. It reduces from 3-4 mock functions down to 2.
>
> >
> >> + }
> >> +
> >> + dev_err(&port->dev, "HDM decoder capability not found\n");
> >> + return -ENXIO;
> >> +}
> >> +
> >> +DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
> >> +static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
> >> + struct device *dport_dev)
> >> +{
> >> + struct cxl_dport *dport;
> >> + int rc;
> >> +
> >> + guard(device)(&port->dev);
> >> +
> >> + if (!port->dev.driver)
> >> + return ERR_PTR(-ENXIO);
> >> +
> >> + dport = cxl_find_dport_by_dev(port, dport_dev);
> >> + if (dport)
> >> + return dport;
> >
> > What is the case if there is already a dport bound to the port? Since
> > there is a 1:1 mapping downstream, there is only one allocation and I
> > would expect that dport never exists and an -EBUSY should be returned
> > otherwise.
>
> ok
>
> >
> >> +
> >> + struct cxl_dport *new_dport __free(put_cxl_dport) =
> >> + devm_cxl_add_dport_by_dev(port, dport_dev);
> >
> > See my comment on devm_cxl_add_dport_by_dev() above.
> >
> >> + if (IS_ERR(new_dport))
> >> + return new_dport;
> >> +
> >> + cxl_switch_parse_cdat(port);
> >> +
> >> + /*
> >> + * First instance of dport appearing, need to setup the port, including
> >> + * allocating decoders.
> >> + */
> >> + if (port->nr_dports == 1) {
> >> + rc = cxl_switch_port_setup(port);
> >
> > Can't this be done with port creation? I don't see a reason doing this
> > late at this point.
> >
> >> + if (rc)
> >> + return ERR_PTR(rc);
> >> + return no_free_ptr(new_dport);
> >> + }
> >> +
> >> + rc = cxl_decoders_dport_update(new_dport);
> >> + if (rc)
> >> + return ERR_PTR(rc);
> >
> > Maybe unfold cxl_decoders_dport_update() here?
>
> ok
>
> >
> >> +
> >> + return no_free_ptr(new_dport);
> >> +}
> >> +
> >> +static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
> >> + struct device *dport_dev)
> >> +{
> >> + struct cxl_port *port __free(put_cxl_port) =
> >> + find_cxl_port_by_uport(uport_dev);
> >> +
> >> + if (!port)
> >> + return ERR_PTR(-ENODEV);
> >> +
> >> + return cxl_port_get_or_add_dport(port, dport_dev);
> >> +}
> >
> > That function can be removed, see below.
>
> ok
>
> >
> >> +
> >> +static struct cxl_dport *
> >> +devm_cxl_create_or_extend_port(struct device *ep_dev,
> >> + struct cxl_port *parent_port,
> >> + struct cxl_dport *parent_dport,
> >> + struct device *uport_dev,
> >> + struct device *dport_dev)
> >> +{
> >> + resource_size_t component_reg_phys;
> >> +
> >> + guard(device)(&parent_port->dev);
> >> +
> >> + if (!parent_port->dev.driver) {
> >> + dev_warn(ep_dev,
> >> + "port %s:%s disabled, failed to enumerate CXL.mem\n",
> >> + dev_name(&parent_port->dev), dev_name(uport_dev));
> >> + return ERR_PTR(-ENXIO);
> >> + }
> >> +
> >> + struct cxl_port *port __free(put_cxl_port) =
> >> + find_cxl_port_by_uport(uport_dev);
> >> +
> >> + if (!port) {
> >> + component_reg_phys = find_component_registers(uport_dev);
> >> + port = devm_cxl_add_port(&parent_port->dev, uport_dev,
> >> + component_reg_phys, parent_dport);
> >> + if (IS_ERR(port))
> >> + return (struct cxl_dport *)port;
> >> +
> >> + /*
> >> + * retry to make sure a port is found. a port device
> >> + * reference is taken.
> >> + */
> >> + port = find_cxl_port_by_uport(uport_dev);
> >> + if (!port)
> >> + return ERR_PTR(-ENODEV);
> >> +
> >> + dev_dbg(ep_dev, "created port %s:%s\n",
> >> + dev_name(&port->dev), dev_name(port->uport_dev));
> >> + }
> >> +
> >> + return cxl_port_get_or_add_dport(port, dport_dev);
> >> +}
> >> +
> >> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> >> struct device *uport_dev,
> >> struct device *dport_dev)
> >> {
> >> struct device *dparent = grandparent(dport_dev);
> >> struct cxl_dport *dport, *parent_dport;
> >> - resource_size_t component_reg_phys;
> >> int rc;
> >>
> >> if (is_cxl_host_bridge(dparent)) {
> >> + struct cxl_port *port __free(put_cxl_port) =
> >> + find_cxl_port_by_uport(uport_dev);
> >> /*
> >> * The iteration reached the topology root without finding the
> >> * CXL-root 'cxl_port' on a previous iteration, fail for now to
> >> * be re-probed after platform driver attaches.
> >> */
> >> - dev_dbg(&cxlmd->dev, "%s is a root dport\n",
> >> - dev_name(dport_dev));
> >> - return -ENXIO;
> >> + if (!port) {
> >> + dev_dbg(&cxlmd->dev, "%s is a root dport\n",
> >> + dev_name(dport_dev));
> >> + return -ENXIO;
> >> + }
> >> +
> >> + /*
> >> + * While the port is found, there may not be a dport associated
> >> + * yet. Try to associate the dport to the port. On return success,
> >> + * the iteration will restart with the dport now attached.
> >> + */
> >> + dport = devm_cxl_add_dport_by_uport(uport_dev,
> >> + dport_dev);
> >
> > port is known here, use cxl_port_get_or_add_dport(port, dport_dev)
> > instead. Remove devm_cxl_add_dport_by_uport().
>
> ok
>
> >
> >> + if (IS_ERR(dport))
> >> + return PTR_ERR(dport);
> >> +
> >> + return 0;
> >> }
> >>
> >> struct cxl_port *parent_port __free(put_cxl_port) =
> >> @@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> >> return -EAGAIN;
> >> }
> >>
> >> - /*
> >> - * Definition with __free() here to keep the sequence of
> >> - * dereferencing the device of the port before the parent_port releasing.
> >> - */
> >> - struct cxl_port *port __free(put_cxl_port) = NULL;
> >> - scoped_guard(device, &parent_port->dev) {
> >> - if (!parent_port->dev.driver) {
> >> - dev_warn(&cxlmd->dev,
> >> - "port %s:%s disabled, failed to enumerate CXL.mem\n",
> >> - dev_name(&parent_port->dev), dev_name(uport_dev));
> >> - return -ENXIO;
> >> - }
> >> + dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
> >> + parent_dport, uport_dev,
> >> + dport_dev);
> >
> > You expand add_port_attach_ep() here. This function was originally
> > called if there is no *port* at all. Now, as the dport_dev is not yet
> > registered, the port may already exist, but it is not found since the
> > dport_dev is not yet registered and add_port_attach_ep() is called now
> > even if the port exists. I think we should move that dport_dev
> > registration a level higher to devm_cxl_enumerate_ports(). That might
> > need a cleanup of the iterator and the removal of
> > add_port_attach_ep().
>
> Yes. The new rev will move the dport registration a level up. No need to remove add_port_attach_ep(). devm_cxl_create_or_extend_port() will be devm_cxl_create_port().
>
> >
> >> + if (IS_ERR(dport))
> >> + return PTR_ERR(dport);
> >>
> >> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
> >> - if (!port) {
> >> - component_reg_phys = find_component_registers(uport_dev);
> >> - port = devm_cxl_add_port(&parent_port->dev, uport_dev,
> >> - component_reg_phys, parent_dport);
> >> - if (IS_ERR(port))
> >> - return PTR_ERR(port);
> >> -
> >> - /* retry find to pick up the new dport information */
> >> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
> >> - if (!port)
> >> - return -ENXIO;
> >> - }
> >> - }
> >> -
> >> - dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
> >> - dev_name(&port->dev), dev_name(port->uport_dev));
> >> rc = cxl_add_ep(dport, &cxlmd->dev);
> >> if (rc == -EBUSY) {
> >> /*
> >> @@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >> {
> >> struct device *dev = &cxlmd->dev;
> >> struct device *iter;
> >> + int ports_need_create = 0;
> >> int rc;
> >>
> >> /*
> >> @@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >> struct device *uport_dev;
> >> struct cxl_dport *dport;
> >>
> >> + ports_need_create++;
> >> +
> >> if (is_cxl_host_bridge(dport_dev))
> >> return 0;
> >>
> >> @@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >>
> >> cxl_gpf_port_setup(dport);
> >>
> >> + ports_need_create--;
> >> /* Any more ports to add between this one and the root? */
> >> if (!dev_is_cxl_root_child(&port->dev))
> >> continue;
> >>
> >> + /*
> >> + * The 'ports_need_create' variable tracks a port being
> >> + * created as it goes through this iterative loop. It's
> >> + * incremented when it first enters the loop and decremented
> >> + * when the port is found. If at the root of the hierarchy
> >> + * and the variable is not 0, then it's missing a port
> >> + * creation somewhere in the hierarchy and should restart.
> >> + * For example in a setup where there's a PCI root port, a
> >> + * switch, and an endpoint, it is possible to get to the
> >> + * PCI root port and its creation, and the switch port is
> >> + * still missing because the root port didn't exist. This
> >> + * triggers a restart of the loop to create the switch port
> >> + * now with a present root port.
> >> + */
> >> + if (ports_need_create)
> >
> > Uh, that becomes hard. Isn't the iterator much simpler:
> >
> > * Start the iter = endpoint.
> >
> > * Find first existing parent port up to the root.
> >
> > * If that is the direct parent of the endpoint, attach it to the
> > parent (add dport etc.). Exit loop without errors.
> >
> > * Else, create port and attach it to the found parent port (including
> > dport handling).
> >
> > * Fail on errors or retry otherwise.
> >
> > So, devm_cxl_enumerate_ports() should be reworked better, also address
> > my other comments regarding add_port_attach_ep() and
> > devm_cxl_create_or_extend_port().
>
> So I reworked this whole path a bit. Maybe not exactly what you are envisioning here but it is a lot cleaner. You can take a look at the next rev.
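
The iterator outlined in the list above can be modeled with a small
userspace sketch. It assumes a single path from the root to the endpoint,
represented as an array where index 0 is the level closest to the root and
index depth - 1 is the endpoint's direct parent. enumerate_ports() and the
has_port array are invented for illustration and say nothing about how the
next revision is actually structured:

```c
/*
 * has_port[i] is non-zero if a port already exists at level i.
 * Returns the number of ports created, or -1 if no ancestor port exists
 * at all (the real code would fail and wait to be re-probed).
 */
static int enumerate_ports(int *has_port, int depth)
{
	int created = 0;

	for (;;) {
		/* Walk up from the endpoint to the first existing port. */
		int i = depth - 1;

		while (i >= 0 && !has_port[i])
			i--;

		if (i < 0)
			return -1;  /* nothing to attach to yet */

		/* Direct parent exists: attach the endpoint and stop. */
		if (i == depth - 1)
			return created;

		/* Create the missing port below the one found, then retry. */
		has_port[i + 1] = 1;
		created++;
	}
}
```

Each retry creates exactly one port directly below an existing one, so the
loop ends after at most depth - 1 iterations and needs no separate counter
to detect a missing level.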
>
> >
> >> + goto retry;
> >> +
> >> return 0;
> >> }
> >>
> >> @@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >> if (rc == -EAGAIN)
> >> continue;
> >> /* failed to add ep or port */
> >> - if (rc)
> >> + if (rc < 0)
> >> return rc;
> >> +
> >> + ports_need_create = 0;
> >> /* port added, new descendants possible, start over */
> >> goto retry;
> >> }
> >> @@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
> >> device_lock_assert(&port->dev);
> >>
> >> if (xa_empty(&port->dports))
> >> - return -EINVAL;
> >> + return 0;
> >>
> >> guard(rwsem_write)(&cxl_rwsem.region);
> >> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
> >> struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
> >>
> >> - if (!dport)
> >> - return -ENXIO;
> >> + if (!dport) {
> >> + /* dport may be activated later */
> >> + continue;
> >> + }
> >> cxlsd->target[i] = dport;
> >> }
> >
> > Should that be dropped entirely as the target setup is done somewhere
> > else?
> >
> No. This is still needed for root ports.
Root ports are dport_devs and don't have a target list, host bridges
have. I did not follow the entire code flow, but shouldn't
cxl_decoders_dport_update() handle that?
-Robert
>
> DJ
> >>
> >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> >> index 71cc42d05248..bba62867df90 100644
> >> --- a/drivers/cxl/core/region.c
> >> +++ b/drivers/cxl/core/region.c
> >> @@ -1510,8 +1510,10 @@ static int cxl_port_setup_targets(struct cxl_port *port,
> >> cxl_rr->nr_targets_set);
> >> return -ENXIO;
> >> }
> >> - } else
> >> + } else {
> >> cxlsd->target[cxl_rr->nr_targets_set] = ep->dport;
> >> + cxlsd->cxld.target_map[cxl_rr->nr_targets_set] = ep->dport->port_id;
> >> + }
> >> inc = 1;
> >> out_target_set:
> >> cxl_rr->nr_targets_set += inc;
> >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> >> index 87a905db5ffb..df10a01376c6 100644
> >> --- a/drivers/cxl/cxl.h
> >> +++ b/drivers/cxl/cxl.h
> >> @@ -591,6 +591,7 @@ struct cxl_dax_region {
> >> * @parent_dport: dport that points to this port in the parent
> >> * @decoder_ida: allocator for decoder ids
> >> * @reg_map: component and ras register mapping parameters
> >> + * @possible_dports: Total possible dports reported by hardware
> >> * @nr_dports: number of entries in @dports
> >> * @hdm_end: track last allocated HDM decoder instance for allocation ordering
> >> * @commit_end: cursor to track highest committed decoder for commit ordering
> >> @@ -612,6 +613,7 @@ struct cxl_port {
> >> struct cxl_dport *parent_dport;
> >> struct ida decoder_ida;
> >> struct cxl_register_map reg_map;
> >> + int possible_dports;
> >> int nr_dports;
> >> int hdm_end;
> >> int commit_end;
> >> @@ -911,6 +913,7 @@ void cxl_coordinates_combine(struct access_coordinate *out,
> >> struct access_coordinate *c2);
> >>
> >> bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
> >> +int cxl_port_get_possible_dports(struct cxl_port *port);
> >>
> >> /*
> >> * Unit test builds overrides this to __weak, find the 'strong' version
> >> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> >> index cf32dc50b7a6..941a7d7157bd 100644
> >> --- a/drivers/cxl/port.c
> >> +++ b/drivers/cxl/port.c
> >> @@ -59,34 +59,17 @@ static int discover_region(struct device *dev, void *unused)
> >>
> >> static int cxl_switch_port_probe(struct cxl_port *port)
> >> {
> >> - struct cxl_hdm *cxlhdm;
> >> - int rc;
> >> + int dports;
> >>
> >> /* Cache the data early to ensure is_visible() works */
> >> read_cdat_data(port);
> >>
> >> - rc = devm_cxl_port_enumerate_dports(port);
> >> - if (rc < 0)
> >> - return rc;
> >> + dports = cxl_port_get_possible_dports(port);
> >> + if (dports < 0)
> >> + return dports;
> >> + port->possible_dports = dports;
> >
> > As said, I think the whole possible_dports part can be removed.
> >
> > Thanks,
> >
> > -Robert
> >
> >>
> >> - cxl_switch_parse_cdat(port);
> >> -
> >> - cxlhdm = devm_cxl_setup_hdm(port, NULL);
> >> - if (!IS_ERR(cxlhdm))
> >> - return devm_cxl_enumerate_decoders(cxlhdm, NULL);
> >> -
> >> - if (PTR_ERR(cxlhdm) != -ENODEV) {
> >> - dev_err(&port->dev, "Failed to map HDM decoder capability\n");
> >> - return PTR_ERR(cxlhdm);
> >> - }
> >> -
> >> - if (rc == 1) {
> >> - dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
> >> - return devm_cxl_add_passthrough_decoder(port);
> >> - }
> >> -
> >> - dev_err(&port->dev, "HDM decoder capability not found\n");
> >> - return -ENXIO;
> >> + return 0;
> >> }
> >>
> >> static int cxl_endpoint_port_probe(struct cxl_port *port)
> >> --
> >> 2.50.1
> >>
>
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-09-01 17:29 ` Robert Richter
@ 2025-09-02 15:40 ` Dave Jiang
2025-09-03 18:21 ` Dave Jiang
1 sibling, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-09-02 15:40 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 9/1/25 10:29 AM, Robert Richter wrote:
> On 27.08.25 14:15:21, Dave Jiang wrote:
>>
>>
>> On 8/20/25 5:41 AM, Robert Richter wrote:
>>> Hi Dave,
>>>
>>> see my comments below.
>>>
>>> On 14.08.25 15:21:45, Dave Jiang wrote:
>>>> The current implementation enumerates the dports during the cxl_port
>>>> driver probe. Without an endpoint connected, the dport may not be
>>>> active during port probe. This scheme may prevent a valid hardware
>>>> dport id from being retrieved and MMIO registers from being read when an endpoint
>>>> is hot-plugged. Move the dport allocation and setup to after memdev
>>>> probe so the endpoint is guaranteed to be connected.
>>>>
>>>> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
>>>> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
>>>> ACPI0017.N device. Through that it enumerates downstream ports composed
>>>> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
>>>> uses add_host_bridge_uport() to create the ports that enumerate the PCI
>>>> RPs as the dports of these ports. Every time a port is created, the port
>>>> driver is attached, cxl_switch_port_probe() is called and
>>>> devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
>>>> the dports.
>>>>
>>>> The second phase applies if there are any CXL switches. When the PCI endpoint
>>>> device driver (cxl_pci) probes, it adds a mem device and triggers
>>>> cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
>>>> and attempts to discover and create all the ports that represent CXL switches.
>>>> During this phase, a port is created per switch and the attached dports
>>>> are also enumerated and probed.
>>>>
>>>> The last phase is creating endpoint port which happens for all endpoint
>>>> devices.
>>>>
>>>> In this commit, the port creation and its dport probing in cxl_acpi is not
>>>> changed. That will be handled later. The behavior change is only for CXL
>>>> switch ports. Only the dport that is part of the path for an endpoint
>>>> device to the RP will be probed. This happens naturally by the code
>>>> walking up the device hierarchy and identifying the upstream device and
>>>> the downstream device.
>>>>
>>>> The new sequence is instead of creating all possible dports at initial
>>>> port creation, defer port instantiation until a memdev beneath that
>>>> dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
>>>> the creation and extension of ports with new dports as memory devices
>>>> arrive. As part of this rework, switch decoder target list is amended
>>>> at runtime as dports show up.
>>>>
>>>> While the decoders are allocated during the port driver probe,
>>>> the decoders must also be updated: previously everything was done once all
>>>> the dports were set up, and now, every time a dport is set up per endpoint, the
>>>> switch target list needs to be updated with the new dport. A
>>>> guard(rwsem_write) is used to update decoder targets. This is similar to
>>>> when decoder_populate_targets() is called and the decoder programming
>>>> must be protected.
>>>>
>>>> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
>>>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>> ---
>>>> v8:
>>>> - grammar and spelling fixups (Dan)
>>>> - Clarify commit log story. (Dan)
>>>> - Move register mapping and decoder enumeration to when first dport shows up (Dan)
>>>> - Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
>>>> - cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
>>>> - Remove failure path for possible dports == 0. (Dan, Robert)
>>>> - update_switch_decoder() -> update_decoder_targets(). (Dan)
>>>> - Remove lock asserts where not needed. (Dan)
>>>> - Add support for passthrough decoder init. (Dan)
>>>> - Return -ENXIO when no driver attached. (Dan)
>>>> - Move guard() from devm_cxl_add_dport_by_uport. (Dan, Robert)
>>>> - Add devm_cxl_create_or_extend_port() helper. (Dan)
>>>> - Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
>>>> - Remove 'new_dport' local var. (Robert)
>>>> - Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
>>>> - Move port check logic to add_port_attach_ep(). (Robert)
>>>> ---
>>>> drivers/cxl/core/cdat.c | 2 +-
>>>> drivers/cxl/core/core.h | 2 +
>>>> drivers/cxl/core/hdm.c | 6 -
>>>> drivers/cxl/core/pci.c | 81 +++++++++++
>>>> drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
>>>> drivers/cxl/core/region.c | 4 +-
>>>> drivers/cxl/cxl.h | 3 +
>>>> drivers/cxl/port.c | 29 +---
>>>> 8 files changed, 331 insertions(+), 83 deletions(-)
>>>>
>>>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>>>> index c0af645425f4..b156b81a9b20 100644
>>>> --- a/drivers/cxl/core/cdat.c
>>>> +++ b/drivers/cxl/core/cdat.c
>>>> @@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
>>>>
>>>> guard(rwsem_read)(&cxl_rwsem.region);
>>>> for (int i = 0; i < cxlsd->nr_targets; i++) {
>>>> - if (host_bridge == cxlsd->target[i]->dport_dev)
>>>> + if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
>>>> return 1;
>>>> }
>>>>
>>>> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
>>>> index 2669f251d677..2ac71eb459e6 100644
>>>> --- a/drivers/cxl/core/core.h
>>>> +++ b/drivers/cxl/core/core.h
>>>> @@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
>>>> int cxl_ras_init(void);
>>>> void cxl_ras_exit(void);
>>>> int cxl_gpf_port_setup(struct cxl_dport *dport);
>>>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>>>> + struct device *dport_dev);
>>>>
>>>> #ifdef CONFIG_CXL_FEATURES
>>>> struct cxl_feat_entry *
>>>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>>>> index cee68bbc7ff6..5263e9eba7d0 100644
>>>> --- a/drivers/cxl/core/hdm.c
>>>> +++ b/drivers/cxl/core/hdm.c
>>>> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
>>>> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>>> {
>>>> struct cxl_switch_decoder *cxlsd;
>>>> - struct cxl_dport *dport = NULL;
>>>> - unsigned long index;
>>>> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
>>>>
>>>> /*
>>>> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>>>
>>>> device_lock_assert(&port->dev);
>>>>
>>>> - xa_for_each(&port->dports, index, dport)
>>>> - break;
>>>> - cxlsd->cxld.target_map[0] = dport->port_id;
>>>> -
>>>
>>> The change of initialization of cxlsd->cxld.target_map[] could have
>>> been a separate patch to reduce size of this patch.
>>>
>>>> return add_hdm_decoder(port, &cxlsd->cxld);
>>>> }
>>>> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
>>>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>>>> index b50551601c2e..b9d770f1aa7b 100644
>>>> --- a/drivers/cxl/core/pci.c
>>>> +++ b/drivers/cxl/core/pci.c
>>>> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
>>>> module_param(media_ready_timeout, ushort, 0644);
>>>> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
>>>>
>>>> +/**
>>>> + * devm_cxl_add_dport_by_dev - allocate a dport by dport device
>>>> + * @port: cxl_port that hosts the dport
>>>> + * @dport_dev: 'struct device' of the dport
>>>> + *
>>>> + * Returns the allocated dport on success or ERR_PTR() of -errno on error
>>>> + */
>>>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>>>
>>> This function only determines the port_num. How about only implement
>>> this in a function cxl_pci_get_port_num() and call devm_cxl_add_dport
>>> directly?
>>
>
>> I can split out the code to get the port_num locally, but we can't
>> call devm_cxl_add_dport() directly in core/port.c because we need
>> the map.resource and in order to retrieve that cxl_find_regblock()
>> requires a pci dev.
>
> I mean the following:
>
> In the mock case, there is always a decoder. That is,
> devm_cxl_add_passthrough_decoder() will only be used for pci devs.
>
> Create cxl_port_has_multiple_dports() which contains:
>
> if (!dev_is_pci(...))
> return false;
> /* pci_walk_bus() and inspect dports: */
> ...
>
> In cxl_switch_port_setup():
>
> rc = devm_cxl_enumerate_decoders(...)
> if (!rc)
> return 0;
> if (cxl_port_has_multiple_dports(...))
> rc = devm_cxl_add_passthrough_decoder(...);
ok, I can update this part with the function rename. v9 is mostly there with passthrough decoder eliminated in cxl_test.
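
For reference, the decision that cxl_switch_port_setup() makes in this
patch can be modeled in userspace as below. This is only a sketch of the
control flow: choose_decoder_setup() and the TOY_* values are hypothetical,
hdm_rc stands in for the result of the HDM setup, and the passthrough
branch follows the patch's "port->possible_dports == 1" check:

```c
#include <errno.h>

/* Hypothetical outcomes; negative return values are errno codes. */
#define TOY_DECODERS_ENUMERATE   0
#define TOY_DECODERS_PASSTHROUGH 1

static int choose_decoder_setup(int hdm_rc, int possible_dports)
{
	/* HDM decoder capability mapped: enumerate the real decoders. */
	if (hdm_rc == 0)
		return TOY_DECODERS_ENUMERATE;

	/* Any failure other than "capability absent" is fatal. */
	if (hdm_rc != -ENODEV)
		return hdm_rc;

	/* No HDM capability, single dport: fall back to passthrough. */
	if (possible_dports == 1)
		return TOY_DECODERS_PASSTHROUGH;

	/* No HDM capability with several dports cannot be decoded. */
	return -ENXIO;
}
```

Keeping the dport count as an int rather than a bool preserves the
distinction between "one dport" and "could not determine", which is the
concern raised earlier about error returns.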
>
> You don't need function devm_cxl_add_dport_by_dev() any longer, just
> use devm_cxl_add_dport() instead.
I'm not seeing how the decoder enumeration is related to devm_cxl_add_dport_by_dev(). devm_cxl_add_dport() needs a port_id and a component_reg_phys. Retrieving those requires more help from core/pci.c and can't be done in core/port.c.
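
As a small illustration of the port_id part, the sketch below extracts the
Port Number field from a Link Capabilities value in plain userspace C. The
field occupies bits 31:24, which is what PCI_EXP_LNKCAP_PN (0xff000000)
selects in the patch via FIELD_GET(). The TOY_* macros and
lnkcap_port_num() are made-up names, and the config space read itself is
not modeled:

```c
#include <stdint.h>

/* Port Number field of the PCIe Link Capabilities register, bits 31:24. */
#define TOY_LNKCAP_PN_MASK  0xff000000u
#define TOY_LNKCAP_PN_SHIFT 24

/* Equivalent of FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap) for this one field. */
static uint32_t lnkcap_port_num(uint32_t lnkcap)
{
	return (lnkcap & TOY_LNKCAP_PN_MASK) >> TOY_LNKCAP_PN_SHIFT;
}
```

The component register lookup is the part that still needs a pci_dev, so
only the port number half could be split out this simply.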
>
>>>
>>> That would nicely fit into core/pci.c.
>>>
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_register_map map;
>>>> + struct pci_dev *pdev;
>>>> + u32 lnkcap, port_num;
>>>> + int type;
>>>> + int rc;
>>>> +
>>>> + if (!dev_is_pci(dport_dev))
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + device_lock_assert(&port->dev);
>>>> +
>>>> + pdev = to_pci_dev(dport_dev);
>>>> + type = pci_pcie_type(pdev);
>>>> + if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
>>>> + &lnkcap))
>>>> + return ERR_PTR(-ENXIO);
>>>> +
>>>> + rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
>>>> + if (rc)
>>>> + dev_dbg(&port->dev, "failed to find component registers\n");
>>>> +
>>>> + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
>>>
>>> So, just return port_num instead.
>>>
>>>> + return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
>>>> +}
>>>> +
>>>> struct cxl_walk_context {
>>>> struct pci_bus *bus;
>>>> struct cxl_port *port;
>>>> @@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
>>>>
>>>> return 0;
>>>> }
>>>> +
>>>> +static int count_dports(struct pci_dev *pdev, void *data)
>>>> +{
>>>> + struct cxl_walk_context *ctx = data;
>>>> + int type = pci_pcie_type(pdev);
>>>> +
>>>> + if (pdev->bus != ctx->bus)
>>>> + return 0;
>>>> + if (!pci_is_pcie(pdev))
>>>> + return 0;
>>>> + if (type != ctx->type)
>>>> + return 0;
>>>> +
>>>> + ctx->count++;
>>>> + return 0;
>>>> +}
>>>> +
>>>> +int cxl_port_get_possible_dports(struct cxl_port *port)
>>>> +{
>>>> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
>>>> + struct cxl_walk_context ctx;
>>>> + int type;
>>>> +
>>>> + if (!bus) {
>>>> + dev_err(&port->dev, "No PCI bus found for port %s\n",
>>>> + dev_name(&port->dev));
>>>> + return -ENXIO;
>>>> + }
>>>> +
>>>> + if (pci_is_root_bus(bus))
>>>> + type = PCI_EXP_TYPE_ROOT_PORT;
>>>> + else
>>>> + type = PCI_EXP_TYPE_DOWNSTREAM;
>>>> +
>>>> + ctx = (struct cxl_walk_context) {
>>>> + .bus = bus,
>>>> + .type = type,
>>>> + };
>>>> + pci_walk_bus(bus, count_dports, &ctx);
>>>
>>> Don't walk the whole bus, just check children of port->uport_dev.
>>
>
>> cxl_port_to_pci_bus() gets the pdev->subordinate of the
>> port->uport_dev. So I think that's equivalent of checking the
>> children of port->uport_dev and not actually walking the whole pci
>> bus no?
>
> pci_walk_bus() also calls subordinates. So it is equivalent, but
> count_dports is called for other devices too that are not children.
> And it is not obvious that only direct children are counted. Use
> device_for_each_child()?
>
ok, I can look into switching to that.
>>>
>>>> +
>>>> + return ctx.count;
>>>> +}
>>>> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
>>>
>>> See below for my comment on possible_dports.
>>>
>>> Since we only check for count > 1 the implementation could be
>>> simplified and renamed to e.g. cxl_port_has_multiple_dports which
>>> could easily be used to call devm_cxl_add_passthrough_decoder().
>>
>
>> This would be possible if the function can return a bool. However,
>> it is possible to encounter errors. And errors should not be
>> equivalent to a false (0) return value and resulting in a
>> passthrough decoder creation. Thus I think we should stay with the
>> current function name.
>
> Ok, but see also above.
>
>>>
>>>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>>>> index 25209952f469..877f888ee8f5 100644
>>>> --- a/drivers/cxl/core/port.c
>>>> +++ b/drivers/cxl/core/port.c
>>>> @@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
>>>> return port;
>>>> }
>>>>
>>>> -static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
>>>> - struct device *dport_dev,
>>>> - struct cxl_dport **dport)
>>>> -{
>>>> - struct cxl_find_port_ctx ctx = {
>>>> - .dport_dev = dport_dev,
>>>> - .parent_port = parent_port,
>>>> - .dport = dport,
>>>> - };
>>>> - struct cxl_port *port;
>>>> -
>>>> - port = __find_cxl_port(&ctx);
>>>> - return port;
>>>> -}
>>>> -
>>>> /*
>>>> * All users of grandparent() are using it to walk PCIe-like switch port
>>>> * hierarchy. A PCIe switch is comprised of a bridge device representing the
>>>> @@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
>>>> return map.resource;
>>>> }
>>>>
>>>> +static int match_port_by_uport(struct device *dev, const void *data)
>>>> +{
>>>> + const struct device *uport_dev = data;
>>>> + struct cxl_port *port;
>>>> +
>>>> + if (!is_cxl_port(dev))
>>>> + return 0;
>>>> +
>>>> + port = to_cxl_port(dev);
>>>> + return uport_dev == port->uport_dev;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Function takes a device reference on the port device. Caller should do a
>>>> + * put_device() when done.
>>>> + */
>>>> +static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
>>>> +{
>>>> + struct device *dev;
>>>> +
>>>> + dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
>>>> + if (dev)
>>>> + return to_cxl_port(dev);
>>>> + return NULL;
>>>> +}
>>>> +
>>>> +static int update_decoder_targets(struct device *dev, void *data)
>>>> +{
>>>> + struct cxl_dport *dport = data;
>>>> + struct cxl_switch_decoder *cxlsd;
>>>> + struct cxl_decoder *cxld;
>>>> + int i;
>>>> +
>>>> + if (!is_switch_decoder(dev))
>>>> + return 0;
>>>> +
>>>> + cxlsd = to_cxl_switch_decoder(dev);
>>>> + cxld = &cxlsd->cxld;
>>>> + guard(rwsem_write)(&cxl_rwsem.region);
>>>> +
>>>> + /* Short cut for passthrough decoder */
>>>> + if (cxlsd->nr_targets == 1) {
>>>
>>> I think we should still check port_id. That is, remove the shortcut.
>>> If nr_targets == 1, then interleave_ways should be one too, so you
>>> gain nothing. Plus, you also see the dev_dbg().
>>
>> ok
>>
>>>
>>>> + cxlsd->target[0] = dport;
>>>> + return 0;
>>>> + }
>>>> +
>>>> + for (i = 0; i < cxld->interleave_ways; i++) {
>>>> + if (cxld->target_map[i] == dport->port_id) {
>>>> + cxlsd->target[i] = dport;
>>>> + dev_dbg(dev, "dport%d found in target list, index %d\n",
>>>> + dport->port_id, i);
>>>> + return 0;
>>>
>>> Only one target exists, right? Stop the iteration by returning a
>>> non-zero here (caller needs to be adjusted then).
>>
>> ok
>>
>>>
>>>> + }
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int cxl_decoders_dport_update(struct cxl_dport *dport)
>>>> +{
>>>> + return device_for_each_child(&dport->port->dev, dport,
>>>> + update_decoder_targets);
>>>
>>> Might need changes if update_decoder_targets returns 1 to stop the
>>> iterator.
>>
>> ok
>>
>>>
>>>> +}
>>>> +
>>>> +static int cxl_switch_port_setup(struct cxl_port *port)
>>>> +{
>>>
>>> Could you factor out that function in a separate patch?
>>>
>>> The function only sets up decoders. Name it
>>> cxl_switch_port_setup_decoders()?
>>>
>>>> + struct cxl_hdm *cxlhdm;
>>>> +
>>>> + cxlhdm = devm_cxl_setup_hdm(port, NULL);
>>>> + if (!IS_ERR(cxlhdm))
>>>> + return devm_cxl_enumerate_decoders(cxlhdm, NULL);
>>>> +
>>>> + if (PTR_ERR(cxlhdm) != -ENODEV) {
>>>> + dev_err(&port->dev, "Failed to map HDM decoder capability\n");
>>>> + return PTR_ERR(cxlhdm);
>>>> + }
>>>> +
>>>> + if (port->possible_dports == 1) {
>>>> + dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
>>>> + return devm_cxl_add_passthrough_decoder(port);
>>>
>>> Imo, the possible_dports handling should be removed as it only
>>> introduces dead code. mock_cxl_setup_hdm() always returns a valid
>>> cxlhdm (unless for -ENOMEM) and the mock case never reaches this code
>>> here.
>>>
>>> So how about moving (the "real") devm_cxl_add_passthrough_decoder()
>>> and cxl_port_get_possible_dports() to devm_cxl_enumerate_decoders()?
>>> devm_cxl_add_passthrough_decoder() would be static then and
>>> cxl_port_get_possible_dports() will be a core.h function only. Then,
>>> mock_cxl_add_passthrough_decoder() could be removed too.
>>>
>>> I really would like to have a clean core module interface that allows
>>> an easy implementation of cxl_test and avoid too much impact to the
>>> driver code.
>>
>> So after looking at this a bit, it looks like we need a bigger refactor than just devm_cxl_enumerate_decoders(). I have an attempt in the next rev that you can take a look at. It reduces the mock functions from 3-4 down to 2.
>>
>>>
>>>> + }
>>>> +
>>>> + dev_err(&port->dev, "HDM decoder capability not found\n");
>>>> + return -ENXIO;
>>>> +}
>>>> +
>>>> +DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
>>>> +static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_dport *dport;
>>>> + int rc;
>>>> +
>>>> + guard(device)(&port->dev);
>>>> +
>>>> + if (!port->dev.driver)
>>>> + return ERR_PTR(-ENXIO);
>>>> +
>>>> + dport = cxl_find_dport_by_dev(port, dport_dev);
>>>> + if (dport)
>>>> + return dport;
>>>
>>> What is the case if there is already a dport bound to the port? Since
>>> there is a 1:1 mapping downstream, there is only one allocation and I
>>> would expect that dport never exists and an -EBUSY should be returned
>>> otherwise.
>>
>> ok
>>
>>>
>>>> +
>>>> + struct cxl_dport *new_dport __free(put_cxl_dport) =
>>>> + devm_cxl_add_dport_by_dev(port, dport_dev);
>>>
>>> See my comment on devm_cxl_add_dport_by_dev() above.
>>>
>>>> + if (IS_ERR(new_dport))
>>>> + return new_dport;
>>>> +
>>>> + cxl_switch_parse_cdat(port);
>>>> +
>>>> + /*
>>>> + * First instance of dport appearing, need to setup the port, including
>>>> + * allocating decoders.
>>>> + */
>>>> + if (port->nr_dports == 1) {
>>>> + rc = cxl_switch_port_setup(port);
>>>
>>> Can't this be done with port creation? I don't see a reason doing this
>>> late at this point.
>>>
>>>> + if (rc)
>>>> + return ERR_PTR(rc);
>>>> + return no_free_ptr(new_dport);
>>>> + }
>>>> +
>>>> + rc = cxl_decoders_dport_update(new_dport);
>>>> + if (rc)
>>>> + return ERR_PTR(rc);
>>>
>>> Maybe unfold cxl_decoders_dport_update() here?
>>
>> ok
>>
>>>
>>>> +
>>>> + return no_free_ptr(new_dport);
>>>> +}
>>>> +
>>>> +static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> +
>>>> + if (!port)
>>>> + return ERR_PTR(-ENODEV);
>>>> +
>>>> + return cxl_port_get_or_add_dport(port, dport_dev);
>>>> +}
>>>
>>> That function can be removed, see below.
>>
>> ok
>>
>>>
>>>> +
>>>> +static struct cxl_dport *
>>>> +devm_cxl_create_or_extend_port(struct device *ep_dev,
>>>> + struct cxl_port *parent_port,
>>>> + struct cxl_dport *parent_dport,
>>>> + struct device *uport_dev,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + resource_size_t component_reg_phys;
>>>> +
>>>> + guard(device)(&parent_port->dev);
>>>> +
>>>> + if (!parent_port->dev.driver) {
>>>> + dev_warn(ep_dev,
>>>> + "port %s:%s disabled, failed to enumerate CXL.mem\n",
>>>> + dev_name(&parent_port->dev), dev_name(uport_dev));
>>>> + return ERR_PTR(-ENXIO);
>>>> + }
>>>> +
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> +
>>>> + if (!port) {
>>>> + component_reg_phys = find_component_registers(uport_dev);
>>>> + port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>>>> + component_reg_phys, parent_dport);
>>>> + if (IS_ERR(port))
>>>> + return (struct cxl_dport *)port;
>>>> +
>>>> + /*
>>>> + * retry to make sure a port is found. a port device
>>>> + * reference is taken.
>>>> + */
>>>> + port = find_cxl_port_by_uport(uport_dev);
>>>> + if (!port)
>>>> + return ERR_PTR(-ENODEV);
>>>> +
>>>> + dev_dbg(ep_dev, "created port %s:%s\n",
>>>> + dev_name(&port->dev), dev_name(port->uport_dev));
>>>> + }
>>>> +
>>>> + return cxl_port_get_or_add_dport(port, dport_dev);
>>>> +}
>>>> +
>>>> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>>> struct device *uport_dev,
>>>> struct device *dport_dev)
>>>> {
>>>> struct device *dparent = grandparent(dport_dev);
>>>> struct cxl_dport *dport, *parent_dport;
>>>> - resource_size_t component_reg_phys;
>>>> int rc;
>>>>
>>>> if (is_cxl_host_bridge(dparent)) {
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> /*
>>>> * The iteration reached the topology root without finding the
>>>> * CXL-root 'cxl_port' on a previous iteration, fail for now to
>>>> * be re-probed after platform driver attaches.
>>>> */
>>>> - dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>>> - dev_name(dport_dev));
>>>> - return -ENXIO;
>>>> + if (!port) {
>>>> + dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>>> + dev_name(dport_dev));
>>>> + return -ENXIO;
>>>> + }
>>>> +
>>>> + /*
>>>> + * While the port is found, there may not be a dport associated
>>>> + * yet. Try to associate the dport to the port. On return success,
>>>> + * the iteration will restart with the dport now attached.
>>>> + */
>>>> + dport = devm_cxl_add_dport_by_uport(uport_dev,
>>>> + dport_dev);
>>>
>>> port is known here, use cxl_port_get_or_add_dport(port, dport_dev)
>>> instead. Remove devm_cxl_add_dport_by_uport().
>>
>> ok
>>
>>>
>>>> + if (IS_ERR(dport))
>>>> + return PTR_ERR(dport);
>>>> +
>>>> + return 0;
>>>> }
>>>>
>>>> struct cxl_port *parent_port __free(put_cxl_port) =
>>>> @@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>>> return -EAGAIN;
>>>> }
>>>>
>>>> - /*
>>>> - * Definition with __free() here to keep the sequence of
>>>> - * dereferencing the device of the port before the parent_port releasing.
>>>> - */
>>>> - struct cxl_port *port __free(put_cxl_port) = NULL;
>>>> - scoped_guard(device, &parent_port->dev) {
>>>> - if (!parent_port->dev.driver) {
>>>> - dev_warn(&cxlmd->dev,
>>>> - "port %s:%s disabled, failed to enumerate CXL.mem\n",
>>>> - dev_name(&parent_port->dev), dev_name(uport_dev));
>>>> - return -ENXIO;
>>>> - }
>>>> + dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
>>>> + parent_dport, uport_dev,
>>>> + dport_dev);
>>>
>>> You expand add_port_attach_ep() here. This function was originally
>>> called if there is no *port* at all. Now, as the dport_dev is not yet
>>> registered, the port may already exist, but it is not found since the
>>> dport_dev is not yet registered and add_port_attach_ep() is called now
>>> even if the port exists. I think we should move that dport_dev
>>> registration a level higher to devm_cxl_enumerate_ports(). That might
>>> need a cleanup of the iterator and the removal of
>>> add_port_attach_ep().
>>
>> Yes. The new rev will move the dport registration a level up. No need to remove add_port_attach_ep(). devm_cxl_create_or_extend_port() will be devm_cxl_create_port().
>>
>>>
>>>> + if (IS_ERR(dport))
>>>> + return PTR_ERR(dport);
>>>>
>>>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>>>> - if (!port) {
>>>> - component_reg_phys = find_component_registers(uport_dev);
>>>> - port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>>>> - component_reg_phys, parent_dport);
>>>> - if (IS_ERR(port))
>>>> - return PTR_ERR(port);
>>>> -
>>>> - /* retry find to pick up the new dport information */
>>>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>>>> - if (!port)
>>>> - return -ENXIO;
>>>> - }
>>>> - }
>>>> -
>>>> - dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
>>>> - dev_name(&port->dev), dev_name(port->uport_dev));
>>>> rc = cxl_add_ep(dport, &cxlmd->dev);
>>>> if (rc == -EBUSY) {
>>>> /*
>>>> @@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> {
>>>> struct device *dev = &cxlmd->dev;
>>>> struct device *iter;
>>>> + int ports_need_create = 0;
>>>> int rc;
>>>>
>>>> /*
>>>> @@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> struct device *uport_dev;
>>>> struct cxl_dport *dport;
>>>>
>>>> + ports_need_create++;
>>>> +
>>>> if (is_cxl_host_bridge(dport_dev))
>>>> return 0;
>>>>
>>>> @@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>>
>>>> cxl_gpf_port_setup(dport);
>>>>
>>>> + ports_need_create--;
>>>> /* Any more ports to add between this one and the root? */
>>>> if (!dev_is_cxl_root_child(&port->dev))
>>>> continue;
>>>>
>>>> + /*
>>>> + * The 'ports_need_create' variable tracks a port being
>>>> + * created as it goes through this iterative loop. It's
>>>> + * incremented when it first enters the loop and decremented
>>>> + * when the port is found. If at the root of the hierarchy
>>>> + * and the variable is not 0, then it's missing a port
>>>> + * creation somewhere in the hierarchy and should restart.
>>>> + * For example in a setup where there's a PCI root port, a
>>>> + * switch, and an endpoint, it is possible to get to the
>>>> + * PCI root port and its creation, and the switch port is
>>>> + * still missing because the root port didn't exist. This
>>>> + * triggers a restart of the loop to create the switch port
>>>> + * now with a present root port.
>>>> + */
>>>> + if (ports_need_create)
>>>
>>> Uh, that becomes hard. Isn't the iterator much simpler:
>>>
>>> * Start the iter = endpoint.
>>>
>>> * Find first existing parent port up to the root.
>>>
>>> * If that is the direct parent of the endpoint, attach it to the
>>> parent (add dport etc.). Exit loop without errors.
>>>
>>> * Else, create port and attach it to the found parent port (including
>>> dport handling).
>>>
>>> * Fail on errors or retry otherwise.
>>>
>>> So, devm_cxl_enumerate_ports() should be reworked better, also address
>>> my other comments regarding add_port_attach_ep() and
>>> devm_cxl_create_or_extend_port().
>>
>> So I reworked this whole path a bit. Maybe not exactly what you are envisioning here but it is a lot cleaner. You can take a look at the next rev.
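
A rough user-space sketch of the iterator outlined in the bullet list above, under simplifying assumptions: each device has a single parent pointer and a flag saying whether a port already exists for it. struct dev and enumerate_ports() are hypothetical names, and locking, dport handling and endpoint attachment are left out. This is not the reworked kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a device in the hierarchy. */
struct dev {
	struct dev *parent;
	int has_port;	/* nonzero once a port exists for this device */
};

/*
 * Walk up from the endpoint to the first ancestor that already has a
 * port. If that is the direct parent, there is nothing left to create.
 * Otherwise create the port directly below the found ancestor and
 * retry. Returns the number of ports created, or -ENXIO if the top of
 * the hierarchy is reached without finding any port.
 */
static int enumerate_ports(struct dev *endpoint)
{
	int created = 0;

	assert(endpoint != NULL);
	for (;;) {
		struct dev *missing = NULL;
		struct dev *iter;

		for (iter = endpoint->parent; iter && !iter->has_port;
		     iter = iter->parent)
			missing = iter;

		if (!iter)
			return -ENXIO;	/* no root yet, caller retries later */
		if (!missing)
			return created;	/* direct parent already has a port */

		missing->has_port = 1;	/* "create" the port below iter */
		created++;
	}
}
```

Each pass creates exactly one port, the one closest to the root that is still missing, so the loop ends after at most as many passes as there are levels between the endpoint and the first existing port.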
>>
>>>
>>>> + goto retry;
>>>> +
>>>> return 0;
>>>> }
>>>>
>>>> @@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> if (rc == -EAGAIN)
>>>> continue;
>>>> /* failed to add ep or port */
>>>> - if (rc)
>>>> + if (rc < 0)
>>>> return rc;
>>>> +
>>>> + ports_need_create = 0;
>>>> /* port added, new descendants possible, start over */
>>>> goto retry;
>>>> }
>>>> @@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
>>>> device_lock_assert(&port->dev);
>>>>
>>>> if (xa_empty(&port->dports))
>>>> - return -EINVAL;
>>>> + return 0;
>>>>
>>>> guard(rwsem_write)(&cxl_rwsem.region);
>>>> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
>>>> struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
>>>>
>>>> - if (!dport)
>>>> - return -ENXIO;
>>>> + if (!dport) {
>>>> + /* dport may be activated later */
>>>> + continue;
>>>> + }
>>>> cxlsd->target[i] = dport;
>>>> }
>>>
>>> Should that be dropped entirely as the target setup is done somewhere
>>> else?
>>>
>> No. This is still needed for root ports.
>
> Root ports are dport_devs and don't have a target list, host bridges
> have. I did not follow the entire code flow, but shouldn't
> cxl_decoders_dport_update() handle that?
Sorry, I see where I misled you. I should have said cxl_root(s), not the root ports under the host bridge. __cxl_parse_cfmws() calls cxl_decoder_add() for the root decoder, and that is where decoder_populate_targets() needs to populate the targets. cxl_decoders_dport_update() does not reach the root decoder.
DJ
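
A simplified user-space sketch of the two-step target handling discussed in this message, assuming made-up structures (struct dport, struct switch_decoder) that only mirror the fields named in the patch. The initial population tolerates dports that are not there yet, and a later update fills the matching slot when a dport shows up. Returning 1 on a match follows the suggestion earlier in the thread to stop the iteration once the target is found; the real caller would need to be adjusted for that.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TARGETS 4

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct dport {
	int port_id;
};

struct switch_decoder {
	int interleave_ways;
	int target_map[MAX_TARGETS];		/* expected port id per position */
	struct dport *target[MAX_TARGETS];	/* NULL until the dport is seen */
};

/*
 * Initial population: resolve the dports that are already known and
 * leave the other slots NULL instead of failing. Returns the number
 * of slots that were resolved.
 */
static int populate_targets(struct switch_decoder *d,
			    struct dport *known, int nr_known)
{
	int resolved = 0;

	assert(d != NULL);
	for (int i = 0; i < d->interleave_ways; i++) {
		d->target[i] = NULL;
		for (int j = 0; j < nr_known; j++) {
			if (known[j].port_id == d->target_map[i]) {
				d->target[i] = &known[j];
				resolved++;
				break;
			}
		}
	}
	return resolved;
}

/*
 * Late update: a dport has just appeared. Returns 1 if a slot was
 * filled, 0 if this decoder does not reference the dport.
 */
static int update_target(struct switch_decoder *d, struct dport *dport)
{
	assert(d != NULL && dport != NULL);
	for (int i = 0; i < d->interleave_ways; i++) {
		if (d->target_map[i] == dport->port_id) {
			d->target[i] = dport;
			return 1;
		}
	}
	return 0;
}
```

Anything that reads target[] in this scheme has to tolerate NULL entries, which is what the added NULL check in match_cxlrd_hb() in the patch accounts for.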
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-09-01 14:48 ` Robert Richter
@ 2025-09-02 15:58 ` Dave Jiang
0 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-09-02 15:58 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 9/1/25 7:48 AM, Robert Richter wrote:
> On 29.08.25 10:23:38, Dave Jiang wrote:
>>
>>
>> On 8/29/25 8:02 AM, Robert Richter wrote:
>>> On 27.08.25 10:05:05, Dave Jiang wrote:
>>>>
>>>>
>>>> On 8/26/25 12:51 AM, Robert Richter wrote:
>>>>> On 22.08.25 08:52:39, Dave Jiang wrote:
>>>>>>
>>>>>>
>>>>>> On 8/22/25 2:59 AM, Robert Richter wrote:
>>>>>>> On 20.08.25 08:20:04, Dave Jiang wrote:
>>>>>>>> On 8/20/25 5:41 AM, Robert Richter wrote:
>>>>>>>>> Hi Dave,
>>>>>>>>>
>>>>>>>>> see my comments below.
>>>>>>>>>
>>>>>>>>> On 14.08.25 15:21:45, Dave Jiang wrote:
>>>>>>>>
>>>>>>>> <--snip-->
>>>>>>>>
>>>>>>>>>> + if (IS_ERR(new_dport))
>>>>>>>>>> + return new_dport;
>>>>>>>>>> +
>>>>>>>>>> + cxl_switch_parse_cdat(port);
>>>>>>>>>> +
>>>>>>>>>> + /*
>>>>>>>>>> + * First instance of dport appearing, need to setup the port, including
>>>>>>>>>> + * allocating decoders.
>>>>>>>>>> + */
>>>>>>>>>> + if (port->nr_dports == 1) {
>>>>>>>>>> + rc = cxl_switch_port_setup(port);
>
> I want to come back to my previous comment that port setup should be
> done as part of the port enumeration in devm_cxl_enumerate_ports().
> No need to make this a special case here. The link should be up
> already once the mem dev is visible.
Only if the port enumeration is done via devm_cxl_enumerate_ports() during endpoint bring-up. But we also enumerate ports during cxl_acpi probe, and that is the issue: cxl_switch_port_probe() also runs at that time, when cxl_acpi_probe() adds ports, and the link may not be up yet. Anyhow, this is what it looks like in v9:
	if (ida_is_empty(&port->decoder_ida)) {
		rc = devm_cxl_switch_port_decoders_setup(port);
		if (rc)
			return ERR_PTR(rc);
		dev_dbg(&port->dev, "first dport%d:%s added with decoders\n",
			new_dport->port_id, dev_name(dport_dev));
		return no_free_ptr(new_dport);
	}
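
The pattern in the snippet above, reduced to a user-space sketch under stated assumptions: struct port, setup_decoders() and add_dport() are hypothetical, link_up stands in for "the component registers are reachable", and nr_decoders == 0 plays the role of ida_is_empty(&port->decoder_ida). Decoder setup runs once, when the first dport is added, and a failed setup undoes the dport addition, roughly as the __free() cleanup would.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a switch or host bridge port. */
struct port {
	int nr_dports;
	int nr_decoders;	/* 0 until decoder setup has run */
	int link_up;		/* stand-in for "registers are reachable" */
};

/* One-time decoder setup; fails while the link is not up. */
static int setup_decoders(struct port *p)
{
	if (!p->link_up)
		return -ENXIO;
	p->nr_decoders = 1;
	return 0;
}

/*
 * Add a dport. The first successful add also sets up the decoders;
 * later adds skip that step. On setup failure the dport is dropped
 * again so the port is left unchanged.
 */
static int add_dport(struct port *p)
{
	int rc;

	assert(p != NULL);
	p->nr_dports++;
	if (p->nr_decoders == 0) {
		rc = setup_decoders(p);
		if (rc) {
			p->nr_dports--;
			return rc;
		}
	}
	return 0;
}
```

Keying the one-time step on "no decoders allocated yet" rather than on the dport count means a failed first attempt is simply retried by the next dport add.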
>
>>>>>>>>>
>>>>>>>>> Can't this be done with port creation? I don't see a reason doing this
>>>>>>>>> late at this point.
>>>>>>>>
>>>>>>>
>>>>>>>> The main reason we are doing this is to move the port register
>>>>>>>> probing until we know the CXL link is established. Otherwise when
>>>>>>>> cxl_acpi does probe and calls add_host_bridge_uport(), that
>>>>>>>> devm_cxl_add_port() can trigger errors if the platform BIOS enables
>>>>>>>> PCI hotplug support on Intel platforms. The error messages "cxl
>>>>>>>> portN: Couldn't locate the CXL.cache and CXL.mem capability array
>>>>>>>> header" is observed. Essentially we can be trying to map registers
>>>>>>>> while DVSEC ID 3 and/or 7 has not appeared yet. And in turn because
>>>>>>>> that got pushed out, so did the decoder enumeration.
>>>>>>>
>>>>>>> The code suggests the Component Registers of the CXL Host Bridge are
>>>>>>> not yet ready. Is this delayed after the first Root Port is connected
>>>>>>> to a CXL Endpoint/Switch? PCIe DVSEC ID 3 and 7
>>>>>>> (CXL_DVSEC_PORT_EXTENSIONS, CXL_DVSEC_PCIE_FLEXBUS_PORT) are part of
>>>>>>> the pcie config space, which is enumerated not before a CXL endpoint
>>>>>>> becomes active. I haven't found a spec refs here. Please explain.
>>>>>>
>>>>>
>>>>>> So the behavior is observed when PCIe hotplug support is turned on
>>>>>> in BIOS for the Intel platform. A CXL device is plugged in to a RP
>>>>>> without CXL switches. The thinking is that the CXL link is not fully
>>>>>> established at the time when cxl_acpi_probe() is running and the
>>>>>> ports are being added. And the only way to 100% be sure the link is
>>>>>> established is when we are enumerating the memdev just like the
>>>>>> dports. Not sure what spec ref are you looking for. Table 8-2
>>>>>> indicates that those 2 DVSECs are mandatory for CXL root ports. Lack
>>>>>> of presence means either the RP isn't CXL or the CXL link isn't
>>>>>> established yet. I would assume this would also be true if a CXL
>>>>>> memdev is hot-plugged into a slot post boot.
>>>>>
>>>>> But add_host_bridge_uport() only creates ports for the host bridge
>>>>> (ACPI0016) devices and enumerates their component registers (CHBCR).
>>>>
>>>> And I think that's where the issue is. The component registers via CHBCR isn't there. When I removed this change, this is the signature I get:
>>>>
>>>> [ 37.423882] cxl_acpi:cxl_get_chbs:589: acpi ACPI0016:03: UID found: 35
>>>> [ 37.424180] cxl_acpi:add_host_bridge_uport:726: acpi ACPI0016:03: CHBCR found for UID 35: 0x00000000aabf0000
>>>> [ 37.424186] cxl_core:cxl_port_alloc:741: pci0000:3a: host-bridge: pci0000:3a
>>>> [ 37.424210] cxl_core:cxl_map_regblock:426: cxl port2: Mapped CXL Memory Device resource 0x00000000aabf0000
>>>> [ 37.424213] cxl_core:cxl_probe_component_regs:55: cxl port2: Couldn't locate the CXL.cache and CXL.mem capability array header.
>>>
>>> Hmm, hot-added Host Bridges (and I would count this case to those)
>>> should use the ACPI _CBR method. That is, else the host bridge should
>>> be enumerated during boot.
>>
>> host bridges are there, just not the component registers in the CHBCR it appears.
>>
>>>
>>> Is it just a delay, or does the CHBCR come up not earlier than the
>>> root port link is up?
>>
>> Let me get that clarification from the BIOS people.
>>
>>>
>>> Can -EAGAIN be used to reload the driver later if CHBCR init fails?
>>> IMO, the Component Registers cannot be initialized later as that would
>>> delay the enablement of the root decoders too. At least only bridges
>>> that fail to init the CHBCR should be delayed.
>>
>
>> I don't follow why this is an issue. Auto region assembly doesn't
>> start until the port hierarchy is established via the first
>> endpoint. So by the time the region code pokes at the decoder
>> registers during region assembly, the component register for CHBCR
>> should have been probed. It seems reasonable to setup the component
>> registers when we find the first dport and thus indicate that
>> everything should be there. Are you observing an issue on a
>> platform?
>
> It would be good to see the bridges in the system regardless of the
> link status of the root ports, which I think is possible. Same with
> the chbcr and the hb decoders. Only defer it if not yet available and,
> let's mark it as a quirk or workaround. Btw, this is not a delayed
> dport enablement any longer.
ok. I can take out the delayed register probing patch while we discuss this. No reason to hold up the rest of the series.
>
> I am also a bit worried about the conditions to run the setup. What if
> there are multiple hotplug ports, why should the CHBCR be ready with
> the first one already? Shouldn't all connected ports come up first?
I see what you are saying. Let me get the exact behavior from the BIOS guys first.
>
>>>
>>> The issue and the changes for this are not obvious, please make a
>>> separate patch for that separate change.
>>
>> It was in this [1] patch.
>>
>> https://lore.kernel.org/linux-cxl/20250814222151.3520500-5-dave.jiang@intel.com/
>>
>> I think the name of the function being discussed confused things. It is now renamed to setup decoders instead of just setup.
>
> Yes, but there is this additional change that calls
> cxl_switch_port_setup() in this patch.
>
> -Robert
>
>>
>> DJ
>>
>>>
>>> Thanks,
>>>
>>> -Robert
* Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
2025-09-01 17:29 ` Robert Richter
2025-09-02 15:40 ` Dave Jiang
@ 2025-09-03 18:21 ` Dave Jiang
1 sibling, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2025-09-03 18:21 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, dave, jonathan.cameron, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
On 9/1/25 10:29 AM, Robert Richter wrote:
> On 27.08.25 14:15:21, Dave Jiang wrote:
>>
>>
>> On 8/20/25 5:41 AM, Robert Richter wrote:
>>> Hi Dave,
>>>
>>> see my comments below.
>>>
>>> On 14.08.25 15:21:45, Dave Jiang wrote:
>>>> The current implementation enumerates the dports during the cxl_port
>>>> driver probe. Without an endpoint connected, the dport may not be
>>>> active during port probe. This scheme may prevent a valid hardware
>>>> dport id from being retrieved and MMIO registers from being read when
>>>> an endpoint is hot-plugged. Defer the dport allocation and setup until
>>>> memdev probe so the endpoint is guaranteed to be connected.
>>>>
>>>> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
>>>> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
>>>> ACPI0017.N device. Through that it enumerates downstream ports composed
>>>> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
>>>> uses add_host_bridge_uport() to create the ports that enumerate the PCI
>>>> RPs as the dports of these ports. Every time a port is created, the port
>>>> driver is attached, cxl_switch_port_probe() is called and
>>>> devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
>>>> the dports.
>>>>
>>>> The second phase is if there are any CXL switches. When the pci endpoint
>>>> device driver (cxl_pci) calls probe, it adds a mem device and triggers
>>>> cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
>>>> and attempts to discover and create all the ports that represent CXL switches.
>>>> During this phase, a port is created per switch and the attached dports
>>>> are also enumerated and probed.
>>>>
>>>> The last phase is creating endpoint port which happens for all endpoint
>>>> devices.
>>>>
>>>> In this commit, the port create and its dport probing in cxl_acpi is not
>>>> changed. That will be handled later. The behavior change is only for CXL
>>>> switch ports. Only the dport that is part of the path for an endpoint
>>>> device to the RP will be probed. This happens naturally by the code
>>>> walking up the device hierarchy and identifying the upstream device and
>>>> the downstream device.
>>>>
>>>> In the new sequence, instead of creating all possible dports at initial
>>>> port creation, instantiation is deferred until a memdev beneath that
>>>> dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
>>>> the creation and extension of ports with new dports as memory devices
>>>> arrive. As part of this rework, switch decoder target list is amended
>>>> at runtime as dports show up.
>>>>
>>>> While the decoders are allocated during the port driver probe, the
>>>> decoders must also be updated. Previously this was all done once all
>>>> the dports were set up; now, every time a dport is set up for an
>>>> endpoint, the switch target list needs to be updated with the new
>>>> dport. A guard(rwsem_write) is used to update decoder targets. This is
>>>> similar to when decoder_populate_targets() is called and the decoder
>>>> programming must be protected.
>>>>
>>>> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
>>>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>> ---
>>>> v8:
>>>> - grammar and spelling fixups (Dan)
>>>> - Clarify commit log story. (Dan)
>>>> - Move register mapping and decoder enumeration to when first dport shows up (Dan)
>>>> - Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
>>>> - cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
>>>> - Remove failure path for possible dports == 0. (Dan, Robert)
>>>> - update_switch_decoder() -> update_decoder_targets(). (Dan)
>>>> - Remove lock asserts where not needed. (Dan)
>>>> - Add support for passthrough decoder init. (Dan)
>>>> - Return -ENXIO when no driver attached. (Dan)
>>>> - Move guard() from devm_cxl_add_dport_by_uport. (Dan, Robert)
>>>> - Add devm_cxl_create_or_extend_port() helper. (Dan)
>>>> - Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
>>>> - Remove 'new_dport' local var. (Robert)
>>>> - Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
>>>> - Move port check logic to add_port_attach_ep(). (Robert)
>>>> ---
>>>> drivers/cxl/core/cdat.c | 2 +-
>>>> drivers/cxl/core/core.h | 2 +
>>>> drivers/cxl/core/hdm.c | 6 -
>>>> drivers/cxl/core/pci.c | 81 +++++++++++
>>>> drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
>>>> drivers/cxl/core/region.c | 4 +-
>>>> drivers/cxl/cxl.h | 3 +
>>>> drivers/cxl/port.c | 29 +---
>>>> 8 files changed, 331 insertions(+), 83 deletions(-)
>>>>
>>>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>>>> index c0af645425f4..b156b81a9b20 100644
>>>> --- a/drivers/cxl/core/cdat.c
>>>> +++ b/drivers/cxl/core/cdat.c
>>>> @@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
>>>>
>>>> guard(rwsem_read)(&cxl_rwsem.region);
>>>> for (int i = 0; i < cxlsd->nr_targets; i++) {
>>>> - if (host_bridge == cxlsd->target[i]->dport_dev)
>>>> + if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
>>>> return 1;
>>>> }
>>>>
>>>> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
>>>> index 2669f251d677..2ac71eb459e6 100644
>>>> --- a/drivers/cxl/core/core.h
>>>> +++ b/drivers/cxl/core/core.h
>>>> @@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
>>>> int cxl_ras_init(void);
>>>> void cxl_ras_exit(void);
>>>> int cxl_gpf_port_setup(struct cxl_dport *dport);
>>>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>>>> + struct device *dport_dev);
>>>>
>>>> #ifdef CONFIG_CXL_FEATURES
>>>> struct cxl_feat_entry *
>>>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>>>> index cee68bbc7ff6..5263e9eba7d0 100644
>>>> --- a/drivers/cxl/core/hdm.c
>>>> +++ b/drivers/cxl/core/hdm.c
>>>> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
>>>> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>>> {
>>>> struct cxl_switch_decoder *cxlsd;
>>>> - struct cxl_dport *dport = NULL;
>>>> - unsigned long index;
>>>> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
>>>>
>>>> /*
>>>> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>>>
>>>> device_lock_assert(&port->dev);
>>>>
>>>> - xa_for_each(&port->dports, index, dport)
>>>> - break;
>>>> - cxlsd->cxld.target_map[0] = dport->port_id;
>>>> -
>>>
>>> The change of initialization of cxlsd->cxld.target_map[] could have
>>> been a separate patch to reduce size of this patch.
>>>
>>>> return add_hdm_decoder(port, &cxlsd->cxld);
>>>> }
>>>> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
>>>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>>>> index b50551601c2e..b9d770f1aa7b 100644
>>>> --- a/drivers/cxl/core/pci.c
>>>> +++ b/drivers/cxl/core/pci.c
>>>> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
>>>> module_param(media_ready_timeout, ushort, 0644);
>>>> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
>>>>
>>>> +/**
>>>> + * devm_cxl_add_dport_by_dev - allocate a dport by dport device
>>>> + * @port: cxl_port that hosts the dport
>>>> + * @dport_dev: 'struct device' of the dport
>>>> + *
>>>> + * Returns the allocate dport on success or ERR_PTR() of -errno on error
>>>> + */
>>>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>>>
>>> This function only determines the port_num. How about only implement
>>> this in a function cxl_pci_get_port_num() and call devm_cxl_add_dport
>>> directly?
>>
>
>> I can split out the code to get the port_num locally, but we can't
>> call devm_cxl_add_dport() directly in core/port.c because we need
>> the map.resource and in order to retrieve that cxl_find_regblock()
>> requires a pci dev.
>
> I mean the following:
>
> In the mock case, there is always a decoder. That is,
> devm_cxl_add_passthrough_decoder() will only be used for pci devs.
>
> Create cxl_port_has_multiple_dports() which contains:
>
> if (!dev_is_pci(...))
> return false;
> /* pci_walk_bus() and inspect dports: */
> ...
>
> In cxl_switch_port_setup():
>
> rc = devm_cxl_enumerate_decoders(...)
> if (!rc)
> return 0;
> if (cxl_port_has_multiple_dports(...))
I think you probably mean cxl_port_has_single_dport()? Otherwise it would be:
	if (!cxl_port_has_multiple_dports(...)) {
		rc = devm_cxl_add_passthrough_decoder(...);
		...
	}
and any error would then cause a passthrough decoder to be created.
DJ
> rc = devm_cxl_add_passthrough_decoder(...);
>
> You don't need function devm_cxl_add_dport_by_dev() any longer, just
> use devm_cxl_add_dport() instead.
>
>>>
>>> That would nicely fit into core/pci.c.
>>>
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_register_map map;
>>>> + struct pci_dev *pdev;
>>>> + u32 lnkcap, port_num;
>>>> + int type;
>>>> + int rc;
>>>> +
>>>> + if (!dev_is_pci(dport_dev))
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + device_lock_assert(&port->dev);
>>>> +
>>>> + pdev = to_pci_dev(dport_dev);
>>>> + type = pci_pcie_type(pdev);
>>>> + if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
>>>> + &lnkcap))
>>>> + return ERR_PTR(-ENXIO);
>>>> +
>>>> + rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
>>>> + if (rc)
>>>> + dev_dbg(&port->dev, "failed to find component registers\n");
>>>> +
>>>> + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
>>>
>>> So, just return port_num instead.
>>>
>>>> + return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
>>>> +}
>>>> +
>>>> struct cxl_walk_context {
>>>> struct pci_bus *bus;
>>>> struct cxl_port *port;
>>>> @@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
>>>>
>>>> return 0;
>>>> }
>>>> +
>>>> +static int count_dports(struct pci_dev *pdev, void *data)
>>>> +{
>>>> + struct cxl_walk_context *ctx = data;
>>>> + int type = pci_pcie_type(pdev);
>>>> +
>>>> + if (pdev->bus != ctx->bus)
>>>> + return 0;
>>>> + if (!pci_is_pcie(pdev))
>>>> + return 0;
>>>> + if (type != ctx->type)
>>>> + return 0;
>>>> +
>>>> + ctx->count++;
>>>> + return 0;
>>>> +}
>>>> +
>>>> +int cxl_port_get_possible_dports(struct cxl_port *port)
>>>> +{
>>>> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
>>>> + struct cxl_walk_context ctx;
>>>> + int type;
>>>> +
>>>> + if (!bus) {
>>>> + dev_err(&port->dev, "No PCI bus found for port %s\n",
>>>> + dev_name(&port->dev));
>>>> + return -ENXIO;
>>>> + }
>>>> +
>>>> + if (pci_is_root_bus(bus))
>>>> + type = PCI_EXP_TYPE_ROOT_PORT;
>>>> + else
>>>> + type = PCI_EXP_TYPE_DOWNSTREAM;
>>>> +
>>>> + ctx = (struct cxl_walk_context) {
>>>> + .bus = bus,
>>>> + .type = type,
>>>> + };
>>>> + pci_walk_bus(bus, count_dports, &ctx);
>>>
>>> Don't walk the whole bus, just check children of port->uport_dev.
>>
>
>> cxl_port_to_pci_bus() gets the pdev->subordinate of the
>> port->uport_dev. So I think that's equivalent to checking the
>> children of port->uport_dev and not actually walking the whole PCI
>> bus, no?
>
> pci_walk_bus() also descends into subordinate buses. So it is
> equivalent, but count_dports() is also called for other devices
> that are not children. And it is not obvious that only direct
> children are counted. Use device_for_each_child()?
>
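The difference between the two approaches discussed here can be illustrated with a small userspace sketch. Everything in it (struct node, count_all(), count_direct()) is a hypothetical stand-in and not a kernel structure or API. It only shows that a recursive walk visits every descendant and needs a filter to count direct children, while a child-only iteration does not:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device in the hierarchy; not a kernel type. */
struct node {
	int is_dport;		/* 1 if this node would count as a dport */
	struct node *child;	/* first child */
	struct node *sibling;	/* next sibling */
};

/* Recursive walk: the callback logic runs for every descendant. */
static int count_all(const struct node *n)
{
	int count = 0;

	assert(n != NULL);
	for (const struct node *c = n->child; c; c = c->sibling)
		count += c->is_dport + count_all(c);
	return count;
}

/* Direct children only: no per-device "is this my bus?" filter needed. */
static int count_direct(const struct node *n)
{
	int count = 0;

	assert(n != NULL);
	for (const struct node *c = n->child; c; c = c->sibling)
		count += c->is_dport;
	return count;
}
```

In the quoted patch, count_dports() gets the same result as the child-only variant by comparing pdev->bus against ctx->bus, but the callback is still invoked for devices further down.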
>>>
>>>> +
>>>> + return ctx.count;
>>>> +}
>>>> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
>>>
>>> See below for my comment on possible_dports.
>>>
>>> Since we only check for count > 1, the implementation could be
>>> simplified and renamed to e.g. cxl_port_has_multiple_dports(), which
>>> could easily be used to call devm_cxl_add_passthrough_decoder().
>>
>
>> This would be possible if the function could return a bool. However,
>> it is possible to encounter errors, and an error should not be
>> treated as a false (0) return value that results in a passthrough
>> decoder being created. Thus I think we should stay with the
>> current function name.
>
> Ok, but see also above.
>
>>>
>>>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>>>> index 25209952f469..877f888ee8f5 100644
>>>> --- a/drivers/cxl/core/port.c
>>>> +++ b/drivers/cxl/core/port.c
>>>> @@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
>>>> return port;
>>>> }
>>>>
>>>> -static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
>>>> - struct device *dport_dev,
>>>> - struct cxl_dport **dport)
>>>> -{
>>>> - struct cxl_find_port_ctx ctx = {
>>>> - .dport_dev = dport_dev,
>>>> - .parent_port = parent_port,
>>>> - .dport = dport,
>>>> - };
>>>> - struct cxl_port *port;
>>>> -
>>>> - port = __find_cxl_port(&ctx);
>>>> - return port;
>>>> -}
>>>> -
>>>> /*
>>>> * All users of grandparent() are using it to walk PCIe-like switch port
>>>> * hierarchy. A PCIe switch is comprised of a bridge device representing the
>>>> @@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
>>>> return map.resource;
>>>> }
>>>>
>>>> +static int match_port_by_uport(struct device *dev, const void *data)
>>>> +{
>>>> + const struct device *uport_dev = data;
>>>> + struct cxl_port *port;
>>>> +
>>>> + if (!is_cxl_port(dev))
>>>> + return 0;
>>>> +
>>>> + port = to_cxl_port(dev);
>>>> + return uport_dev == port->uport_dev;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Function takes a device reference on the port device. Caller should do a
>>>> + * put_device() when done.
>>>> + */
>>>> +static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
>>>> +{
>>>> + struct device *dev;
>>>> +
>>>> + dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
>>>> + if (dev)
>>>> + return to_cxl_port(dev);
>>>> + return NULL;
>>>> +}
>>>> +
>>>> +static int update_decoder_targets(struct device *dev, void *data)
>>>> +{
>>>> + struct cxl_dport *dport = data;
>>>> + struct cxl_switch_decoder *cxlsd;
>>>> + struct cxl_decoder *cxld;
>>>> + int i;
>>>> +
>>>> + if (!is_switch_decoder(dev))
>>>> + return 0;
>>>> +
>>>> + cxlsd = to_cxl_switch_decoder(dev);
>>>> + cxld = &cxlsd->cxld;
>>>> + guard(rwsem_write)(&cxl_rwsem.region);
>>>> +
>>>> + /* Short cut for passthrough decoder */
>>>> + if (cxlsd->nr_targets == 1) {
>>>
>>> I think we should still check port_id. That is, remove the shortcut.
>>> If nr_targets == 1, then interleave_ways should be one too, so you
>>> gain nothing. Plus, you also see the dev_dbg().
>>
>> ok
>>
>>>
>>>> + cxlsd->target[0] = dport;
>>>> + return 0;
>>>> + }
>>>> +
>>>> + for (i = 0; i < cxld->interleave_ways; i++) {
>>>> + if (cxld->target_map[i] == dport->port_id) {
>>>> + cxlsd->target[i] = dport;
>>>> + dev_dbg(dev, "dport%d found in target list, index %d\n",
>>>> + dport->port_id, i);
>>>> + return 0;
>>>
>>> Only one target exists, right? Stop the iteration by returning a
>>> non-zero here (caller needs to be adjusted then).
>>
>> ok
>>
>>>
>>>> + }
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int cxl_decoders_dport_update(struct cxl_dport *dport)
>>>> +{
>>>> + return device_for_each_child(&dport->port->dev, dport,
>>>> + update_decoder_targets);
>>>
>>> Might need changes if update_decoder_targets returns 1 to stop the
>>> iterator.
>>
>> ok
>>
>>>
>>>> +}
>>>> +
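The early-stop suggestion above relies on the iterator contract that a non-zero callback return ends the iteration and is handed back to the caller, which is how device_for_each_child() behaves. Below is a userspace sketch of that contract and of the caller adjustment it implies. struct fake_decoder, update_targets(), for_each_decoder() and decoders_dport_update() are hypothetical names. The sketch also assumes, as the review comment does, that a dport appears in at most one decoder's target map. If several decoders can reference the same dport, stopping early would skip the later ones:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAYS 4

/* Hypothetical decoder; ids stand in for dport pointers. */
struct fake_decoder {
	int interleave_ways;
	int target_map[MAX_WAYS];	/* cached port ids */
	int target[MAX_WAYS];		/* resolved ids, -1 until the dport shows up */
};

/* Callback: returns 1 once the dport is placed, 0 to keep iterating. */
static int update_targets(struct fake_decoder *d, int port_id)
{
	assert(d->interleave_ways <= MAX_WAYS);
	for (int i = 0; i < d->interleave_ways; i++) {
		if (d->target_map[i] == port_id) {
			d->target[i] = port_id;
			return 1;
		}
	}
	return 0;
}

/* Iterator that stops on the first non-zero return and passes it up. */
static int for_each_decoder(struct fake_decoder *decs, size_t n, int port_id)
{
	for (size_t i = 0; i < n; i++) {
		int rc = update_targets(&decs[i], port_id);

		if (rc)
			return rc;
	}
	return 0;
}

/* Caller: a positive return means "found"; only negatives are errors. */
static int decoders_dport_update(struct fake_decoder *decs, size_t n,
				 int port_id)
{
	int rc = for_each_decoder(decs, n, port_id);

	return rc < 0 ? rc : 0;
}
```

Without the final mapping of positive values to 0, a caller that checks `if (rc)` would treat a successful match as a failure.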
>>>> +static int cxl_switch_port_setup(struct cxl_port *port)
>>>> +{
>>>
>>> Could you factor out that function in a separate patch?
>>>
>>> The function only sets up decoders. Name it
>>> cxl_switch_port_setup_decoders()?
>>>
>>>> + struct cxl_hdm *cxlhdm;
>>>> +
>>>> + cxlhdm = devm_cxl_setup_hdm(port, NULL);
>>>> + if (!IS_ERR(cxlhdm))
>>>> + return devm_cxl_enumerate_decoders(cxlhdm, NULL);
>>>> +
>>>> + if (PTR_ERR(cxlhdm) != -ENODEV) {
>>>> + dev_err(&port->dev, "Failed to map HDM decoder capability\n");
>>>> + return PTR_ERR(cxlhdm);
>>>> + }
>>>> +
>>>> + if (port->possible_dports == 1) {
>>>> + dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
>>>> + return devm_cxl_add_passthrough_decoder(port);
>>>
>>> Imo, the possible_dports handling should be removed as it only
>>> introduces dead code. mock_cxl_setup_hdm() always returns a valid
>>> cxlhdm (except on -ENOMEM), so the mock case never reaches this
>>> code.
>>>
>>> So how about moving (the "real") devm_cxl_add_passthrough_decoder()
>>> and cxl_port_get_possible_dports() to devm_cxl_enumerate_decoders()?
>>> devm_cxl_add_passthrough_decoder() would be static then and
>>> cxl_port_get_possible_dports() will be a core.h function only. Then,
>>> mock_cxl_add_passthrough_decoder() could be removed too.
>>>
>>> I really would like to have a clean core module interface that allows
>>> an easy implementation of cxl_test and avoid too much impact to the
>>> driver code.
>>
>> So after looking at this a bit, it looks like we need a bigger
>> refactor than just devm_cxl_enumerate_decoders(). I have an attempt
>> in the next rev that you can take a look at. It reduces the mock
>> functions from 3-4 down to 2.
>>
>>>
>>>> + }
>>>> +
>>>> + dev_err(&port->dev, "HDM decoder capability not found\n");
>>>> + return -ENXIO;
>>>> +}
>>>> +
>>>> +DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
>>>> +static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_dport *dport;
>>>> + int rc;
>>>> +
>>>> + guard(device)(&port->dev);
>>>> +
>>>> + if (!port->dev.driver)
>>>> + return ERR_PTR(-ENXIO);
>>>> +
>>>> + dport = cxl_find_dport_by_dev(port, dport_dev);
>>>> + if (dport)
>>>> + return dport;
>>>
>>> What is the case if there is already a dport bound to the port? Since
>>> there is a 1:1 mapping downstream, there is only one allocation and I
>>> would expect that dport never exists and an -EBUSY should be returned
>>> otherwise.
>>
>> ok
>>
>>>
>>>> +
>>>> + struct cxl_dport *new_dport __free(put_cxl_dport) =
>>>> + devm_cxl_add_dport_by_dev(port, dport_dev);
>>>
>>> See my comment on devm_cxl_add_dport_by_dev() above.
>>>
>>>> + if (IS_ERR(new_dport))
>>>> + return new_dport;
>>>> +
>>>> + cxl_switch_parse_cdat(port);
>>>> +
>>>> + /*
>>>> + * First instance of dport appearing, need to setup the port, including
>>>> + * allocating decoders.
>>>> + */
>>>> + if (port->nr_dports == 1) {
>>>> + rc = cxl_switch_port_setup(port);
>>>
>>> Can't this be done with port creation? I don't see a reason doing this
>>> late at this point.
>>>
>>>> + if (rc)
>>>> + return ERR_PTR(rc);
>>>> + return no_free_ptr(new_dport);
>>>> + }
>>>> +
>>>> + rc = cxl_decoders_dport_update(new_dport);
>>>> + if (rc)
>>>> + return ERR_PTR(rc);
>>>
>>> Maybe unfold cxl_decoders_dport_update() here?
>>
>> ok
>>
>>>
>>>> +
>>>> + return no_free_ptr(new_dport);
>>>> +}
>>>> +
>>>> +static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> +
>>>> + if (!port)
>>>> + return ERR_PTR(-ENODEV);
>>>> +
>>>> + return cxl_port_get_or_add_dport(port, dport_dev);
>>>> +}
>>>
>>> That function can be removed, see below.
>>
>> ok
>>
>>>
>>>> +
>>>> +static struct cxl_dport *
>>>> +devm_cxl_create_or_extend_port(struct device *ep_dev,
>>>> + struct cxl_port *parent_port,
>>>> + struct cxl_dport *parent_dport,
>>>> + struct device *uport_dev,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + resource_size_t component_reg_phys;
>>>> +
>>>> + guard(device)(&parent_port->dev);
>>>> +
>>>> + if (!parent_port->dev.driver) {
>>>> + dev_warn(ep_dev,
>>>> + "port %s:%s disabled, failed to enumerate CXL.mem\n",
>>>> + dev_name(&parent_port->dev), dev_name(uport_dev));
>>>> + return ERR_PTR(-ENXIO);
>>>> + }
>>>> +
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> +
>>>> + if (!port) {
>>>> + component_reg_phys = find_component_registers(uport_dev);
>>>> + port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>>>> + component_reg_phys, parent_dport);
>>>> + if (IS_ERR(port))
>>>> + return (struct cxl_dport *)port;
>>>> +
>>>> + /*
>>>> + * retry to make sure a port is found. a port device
>>>> + * reference is taken.
>>>> + */
>>>> + port = find_cxl_port_by_uport(uport_dev);
>>>> + if (!port)
>>>> + return ERR_PTR(-ENODEV);
>>>> +
>>>> + dev_dbg(ep_dev, "created port %s:%s\n",
>>>> + dev_name(&port->dev), dev_name(port->uport_dev));
>>>> + }
>>>> +
>>>> + return cxl_port_get_or_add_dport(port, dport_dev);
>>>> +}
>>>> +
>>>> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>>> struct device *uport_dev,
>>>> struct device *dport_dev)
>>>> {
>>>> struct device *dparent = grandparent(dport_dev);
>>>> struct cxl_dport *dport, *parent_dport;
>>>> - resource_size_t component_reg_phys;
>>>> int rc;
>>>>
>>>> if (is_cxl_host_bridge(dparent)) {
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> /*
>>>> * The iteration reached the topology root without finding the
>>>> * CXL-root 'cxl_port' on a previous iteration, fail for now to
>>>> * be re-probed after platform driver attaches.
>>>> */
>>>> - dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>>> - dev_name(dport_dev));
>>>> - return -ENXIO;
>>>> + if (!port) {
>>>> + dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>>> + dev_name(dport_dev));
>>>> + return -ENXIO;
>>>> + }
>>>> +
>>>> + /*
>>>> + * While the port is found, there may not be a dport associated
>>>> + * yet. Try to associate the dport to the port. On return success,
>>>> + * the iteration will restart with the dport now attached.
>>>> + */
>>>> + dport = devm_cxl_add_dport_by_uport(uport_dev,
>>>> + dport_dev);
>>>
>>> port is known here, use cxl_port_get_or_add_dport(port, dport_dev)
>>> instead. Remove devm_cxl_add_dport_by_uport().
>>
>> ok
>>
>>>
>>>> + if (IS_ERR(dport))
>>>> + return PTR_ERR(dport);
>>>> +
>>>> + return 0;
>>>> }
>>>>
>>>> struct cxl_port *parent_port __free(put_cxl_port) =
>>>> @@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>>> return -EAGAIN;
>>>> }
>>>>
>>>> - /*
>>>> - * Definition with __free() here to keep the sequence of
>>>> - * dereferencing the device of the port before the parent_port releasing.
>>>> - */
>>>> - struct cxl_port *port __free(put_cxl_port) = NULL;
>>>> - scoped_guard(device, &parent_port->dev) {
>>>> - if (!parent_port->dev.driver) {
>>>> - dev_warn(&cxlmd->dev,
>>>> - "port %s:%s disabled, failed to enumerate CXL.mem\n",
>>>> - dev_name(&parent_port->dev), dev_name(uport_dev));
>>>> - return -ENXIO;
>>>> - }
>>>> + dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
>>>> + parent_dport, uport_dev,
>>>> + dport_dev);
>>>
>>> You expand add_port_attach_ep() here. This function was originally
>>> called if there is no *port* at all. Now, as the dport_dev is not yet
>>> registered, the port may already exist, but it is not found since the
>>> dport_dev is not yet registered and add_port_attach_ep() is called now
>>> even if the port exists. I think we should move that dport_dev
>>> registration a level higher to devm_cxl_enumerate_ports(). That might
>>> need a cleanup of the iterator and the removal of
>>> add_port_attach_ep().
>>
>> Yes. The new rev will move the dport registration a level up. No
>> need to remove add_port_attach_ep(). devm_cxl_create_or_extend_port()
>> will become devm_cxl_create_port().
>>
>>>
>>>> + if (IS_ERR(dport))
>>>> + return PTR_ERR(dport);
>>>>
>>>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>>>> - if (!port) {
>>>> - component_reg_phys = find_component_registers(uport_dev);
>>>> - port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>>>> - component_reg_phys, parent_dport);
>>>> - if (IS_ERR(port))
>>>> - return PTR_ERR(port);
>>>> -
>>>> - /* retry find to pick up the new dport information */
>>>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>>>> - if (!port)
>>>> - return -ENXIO;
>>>> - }
>>>> - }
>>>> -
>>>> - dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
>>>> - dev_name(&port->dev), dev_name(port->uport_dev));
>>>> rc = cxl_add_ep(dport, &cxlmd->dev);
>>>> if (rc == -EBUSY) {
>>>> /*
>>>> @@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> {
>>>> struct device *dev = &cxlmd->dev;
>>>> struct device *iter;
>>>> + int ports_need_create = 0;
>>>> int rc;
>>>>
>>>> /*
>>>> @@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> struct device *uport_dev;
>>>> struct cxl_dport *dport;
>>>>
>>>> + ports_need_create++;
>>>> +
>>>> if (is_cxl_host_bridge(dport_dev))
>>>> return 0;
>>>>
>>>> @@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>>
>>>> cxl_gpf_port_setup(dport);
>>>>
>>>> + ports_need_create--;
>>>> /* Any more ports to add between this one and the root? */
>>>> if (!dev_is_cxl_root_child(&port->dev))
>>>> continue;
>>>>
>>>> + /*
>>>> + * The 'ports_need_create' variable tracks a port being
>>>> + * created as it goes through this iterative loop. It's
>>>> + * incremented when it first enters the loop and decremented
>>>> + * when the port is found. If at the root of the hierarchy
>>>> + * and the variable is not 0, then it's missing a port
>>>> + * creation somewhere in the hierarchy and should restart.
>>>> + * For example in a setup where there's a PCI root port, a
>>>> + * switch, and an endpoint, it is possible to get to the
>>>> + * PCI root port and its creation, and the switch port is
>>>> + * still missing because the root port didn't exist. This
>>>> + * triggers a restart of the loop to create the switch port
>>>> + * now with a present root port.
>>>> + */
>>>> + if (ports_need_create)
>>>
>>> Uh, that becomes hard. Isn't the iterator much simpler:
>>>
>>> * Start the iter = endpoint.
>>>
>>> * Find first existing parent port up to the root.
>>>
>>> * If that is the direct parent of the endpoint, attach it to the
>>> parent (add dport etc.). Exit loop without errors.
>>>
>>> * Else, create port and attach it to the found parent port (including
>>> dport handling).
>>>
>>> * Fail on errors or retry otherwise.
>>>
>>> So, devm_cxl_enumerate_ports() should be reworked better, also address
>>> my other comments regarding add_port_attach_ep() and
>>> devm_cxl_create_or_extend_port().
>>
>> So I reworked this whole path a bit. Maybe not exactly what you are envisioning here but it is a lot cleaner. You can take a look at the next rev.
>>
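The iterator outlined in the list above can be sketched in userspace as follows. This is only an illustration of the described steps, not the code from the next revision: struct dev, has_port and enumerate_ports() are hypothetical, locking and dport handling are left out, and "creating a port" is reduced to setting a flag:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device on the path from the endpoint to the root. */
struct dev {
	struct dev *parent;	/* NULL above the root */
	int has_port;		/* 1 once a port exists for this device */
};

/*
 * Returns the number of ports created, or -1 if no ancestor has a port
 * yet (for example, the root has not been set up).
 */
static int enumerate_ports(struct dev *endpoint)
{
	int created = 0;

	assert(endpoint != NULL);
	for (;;) {
		struct dev *below = endpoint;
		struct dev *iter = endpoint->parent;

		/* Find the first existing parent port up to the root. */
		while (iter && !iter->has_port) {
			below = iter;
			iter = iter->parent;
		}
		if (!iter)
			return -1;
		/* Direct parent already has a port: attach and stop. */
		if (below == endpoint)
			return created;

		/* Create the missing port just below the one found, retry. */
		below->has_port = 1;
		created++;
	}
}
```

Each pass either terminates or creates exactly one port, so the loop makes progress without a separate counter for outstanding ports.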
>>>
>>>> + goto retry;
>>>> +
>>>> return 0;
>>>> }
>>>>
>>>> @@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> if (rc == -EAGAIN)
>>>> continue;
>>>> /* failed to add ep or port */
>>>> - if (rc)
>>>> + if (rc < 0)
>>>> return rc;
>>>> +
>>>> + ports_need_create = 0;
>>>> /* port added, new descendants possible, start over */
>>>> goto retry;
>>>> }
>>>> @@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
>>>> device_lock_assert(&port->dev);
>>>>
>>>> if (xa_empty(&port->dports))
>>>> - return -EINVAL;
>>>> + return 0;
>>>>
>>>> guard(rwsem_write)(&cxl_rwsem.region);
>>>> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
>>>> struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
>>>>
>>>> - if (!dport)
>>>> - return -ENXIO;
>>>> + if (!dport) {
>>>> + /* dport may be activated later */
>>>> + continue;
>>>> + }
>>>> cxlsd->target[i] = dport;
>>>> }
>>>
>>> Should that be dropped entirely as the target setup is done somewhere
>>> else?
>>>
>> No. This is still needed for root ports.
>
> Root ports are dport_devs and don't have a target list; host bridges
> do. I did not follow the entire code flow, but shouldn't
> cxl_decoders_dport_update() handle that?
>
> -Robert
>
>>
>> DJ
>>>>
>>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>>> index 71cc42d05248..bba62867df90 100644
>>>> --- a/drivers/cxl/core/region.c
>>>> +++ b/drivers/cxl/core/region.c
>>>> @@ -1510,8 +1510,10 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>>>> cxl_rr->nr_targets_set);
>>>> return -ENXIO;
>>>> }
>>>> - } else
>>>> + } else {
>>>> cxlsd->target[cxl_rr->nr_targets_set] = ep->dport;
>>>> + cxlsd->cxld.target_map[cxl_rr->nr_targets_set] = ep->dport->port_id;
>>>> + }
>>>> inc = 1;
>>>> out_target_set:
>>>> cxl_rr->nr_targets_set += inc;
>>>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>>>> index 87a905db5ffb..df10a01376c6 100644
>>>> --- a/drivers/cxl/cxl.h
>>>> +++ b/drivers/cxl/cxl.h
>>>> @@ -591,6 +591,7 @@ struct cxl_dax_region {
>>>> * @parent_dport: dport that points to this port in the parent
>>>> * @decoder_ida: allocator for decoder ids
>>>> * @reg_map: component and ras register mapping parameters
>>>> + * @possible_dports: Total possible dports reported by hardware
>>>> * @nr_dports: number of entries in @dports
>>>> * @hdm_end: track last allocated HDM decoder instance for allocation ordering
>>>> * @commit_end: cursor to track highest committed decoder for commit ordering
>>>> @@ -612,6 +613,7 @@ struct cxl_port {
>>>> struct cxl_dport *parent_dport;
>>>> struct ida decoder_ida;
>>>> struct cxl_register_map reg_map;
>>>> + int possible_dports;
>>>> int nr_dports;
>>>> int hdm_end;
>>>> int commit_end;
>>>> @@ -911,6 +913,7 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>>>> struct access_coordinate *c2);
>>>>
>>>> bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>>>> +int cxl_port_get_possible_dports(struct cxl_port *port);
>>>>
>>>> /*
>>>> * Unit test builds overrides this to __weak, find the 'strong' version
>>>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>>>> index cf32dc50b7a6..941a7d7157bd 100644
>>>> --- a/drivers/cxl/port.c
>>>> +++ b/drivers/cxl/port.c
>>>> @@ -59,34 +59,17 @@ static int discover_region(struct device *dev, void *unused)
>>>>
>>>> static int cxl_switch_port_probe(struct cxl_port *port)
>>>> {
>>>> - struct cxl_hdm *cxlhdm;
>>>> - int rc;
>>>> + int dports;
>>>>
>>>> /* Cache the data early to ensure is_visible() works */
>>>> read_cdat_data(port);
>>>>
>>>> - rc = devm_cxl_port_enumerate_dports(port);
>>>> - if (rc < 0)
>>>> - return rc;
>>>> + dports = cxl_port_get_possible_dports(port);
>>>> + if (dports < 0)
>>>> + return dports;
>>>> + port->possible_dports = dports;
>>>
>>> As said, I think the whole possible_dports part can be removed.
>>>
>>> Thanks,
>>>
>>> -Robert
>>>
>>>>
>>>> - cxl_switch_parse_cdat(port);
>>>> -
>>>> - cxlhdm = devm_cxl_setup_hdm(port, NULL);
>>>> - if (!IS_ERR(cxlhdm))
>>>> - return devm_cxl_enumerate_decoders(cxlhdm, NULL);
>>>> -
>>>> - if (PTR_ERR(cxlhdm) != -ENODEV) {
>>>> - dev_err(&port->dev, "Failed to map HDM decoder capability\n");
>>>> - return PTR_ERR(cxlhdm);
>>>> - }
>>>> -
>>>> - if (rc == 1) {
>>>> - dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
>>>> - return devm_cxl_add_passthrough_decoder(port);
>>>> - }
>>>> -
>>>> - dev_err(&port->dev, "HDM decoder capability not found\n");
>>>> - return -ENXIO;
>>>> + return 0;
>>>> }
>>>>
>>>> static int cxl_endpoint_port_probe(struct cxl_port *port)
>>>> --
>>>> 2.50.1
>>>>
>>
end of thread, other threads:[~2025-09-03 18:21 UTC | newest]
Thread overview: 40+ messages
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
2025-08-15 12:50 ` Jonathan Cameron
2025-08-20 13:51 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 02/11] cxl: Add helper to reap dport Dave Jiang
2025-08-20 14:10 ` Robert Richter
2025-08-20 20:54 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder Dave Jiang
2025-08-15 12:52 ` Jonathan Cameron
2025-08-20 14:17 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 04/11] cxl: Move port register setup to first dport appear Dave Jiang
2025-08-15 12:57 ` Jonathan Cameron
2025-08-21 11:57 ` Robert Richter
2025-08-22 10:37 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 05/11] cxl: Defer dport allocation for switch ports Dave Jiang
2025-08-20 12:41 ` Robert Richter
2025-08-20 15:20 ` Dave Jiang
2025-08-22 9:59 ` Robert Richter
2025-08-22 15:52 ` Dave Jiang
2025-08-26 7:51 ` Robert Richter
2025-08-27 17:05 ` Dave Jiang
2025-08-29 15:02 ` Robert Richter
2025-08-29 17:23 ` Dave Jiang
2025-09-01 14:48 ` Robert Richter
2025-09-02 15:58 ` Dave Jiang
2025-08-27 21:15 ` Dave Jiang
2025-09-01 17:29 ` Robert Richter
2025-09-02 15:40 ` Dave Jiang
2025-09-03 18:21 ` Dave Jiang
2025-08-27 21:37 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 06/11] cxl/test: Add cxl_test support for cxl_port_get_possible_dports() Dave Jiang
2025-08-14 22:21 ` [PATCH v8 07/11] cxl/test: Add mock version of devm_cxl_add_dport_by_dev() Dave Jiang
2025-08-14 22:21 ` [PATCH v8 08/11] cxl/test: Add support to cxl_test for decoder enumeration mock functions Dave Jiang
2025-08-14 22:21 ` [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization Dave Jiang
2025-08-15 13:04 ` Jonathan Cameron
2025-08-14 22:21 ` [PATCH v8 10/11] cxl: Change sslbis handler to only handle single dport Dave Jiang
2025-08-14 22:21 ` [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support Dave Jiang
2025-08-15 13:15 ` Jonathan Cameron
2025-08-19 9:39 ` [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Robert Richter
2025-08-19 15:41 ` Dave Jiang