From: Robert Richter <rrichter@amd.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: linux-cxl@vger.kernel.org, dave@stgolabs.net,
jonathan.cameron@huawei.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com,
dan.j.williams@intel.com
Subject: Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
Date: Mon, 1 Sep 2025 19:29:41 +0200 [thread overview]
Message-ID: <aLXYBcBJNV-9Lt30@rric.localdomain> (raw)
In-Reply-To: <b47901d8-1578-41ed-9a89-62f39cd34d5b@intel.com>
On 27.08.25 14:15:21, Dave Jiang wrote:
>
>
> On 8/20/25 5:41 AM, Robert Richter wrote:
> > Hi Dave,
> >
> > see my comments below.
> >
> > On 14.08.25 15:21:45, Dave Jiang wrote:
> >> The current implementation enumerates the dports during the cxl_port
> >> driver probe. Without an endpoint connected, the dport may not be
> >> active during port probe. This scheme may prevent a valid hardware
> >> dport id to be retrieved and MMIO registers to be read when an endpoint
> >> is hot-plugged. Move the dport allocation and setup to behind memdev
> >> probe so the endpoint is guaranteed to be connected.
> >>
> >> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
> >> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
> >> ACPI0017.N device. Through that it enumerates downstream ports composed
> >> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
> >> uses add_host_bridge_uport() to create the ports that enumerate the PCI
> >> RPs as the dports of these ports. Every time a port is created, the port
> >> driver is attached, cxl_switch_porbe_probe() is called and
> >> devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
> >> the dports.
> >>
> >> The second phase is if there are any CXL switches. When the pci endpoint
> >> device driver (cxl_pci) calls probe, it will add a mem device and triggers
> >> the cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
> >> and attempts to discovery and create all the ports represent CXL switches.
> >> During this phase, a port is created per switch and the attached dports
> >> are also enumerated and probed.
> >>
> >> The last phase is creating endpoint port which happens for all endpoint
> >> devices.
> >>
> >> In this commit, the port create and its dport probing in cxl_acpi is not
> >> changed. That will be handled later. The behavior change is only for CXL
> >> switch ports. Only the dport that is part of the path for an endpoint
> >> device to the RP will be probed. This happens naturally by the code
> >> walking up the device hierarchy and identifying the upstream device and
> >> the downstream device.
> >>
> >> The new sequence is instead of creating all possible dports at initial
> >> port creation, defer port instantiation until a memdev beneath that
> >> dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
> >> the creation and extension of ports with new dports as memory devices
> >> arrive. As part of this rework, switch decoder target list is amended
> >> at runtime as dports show up.
> >>
> >> While the decoders are allocated during the port driver probe,
> >> The decoders must also be updated since previously it's all done when all
> >> the dports are setup and now every time a dport is setup per endpoint, the
> >> switch target listing need to be updated with new dport. A
> >> guard(rwsem_write) is used to update decoder targets. This is similar to
> >> when decoder_populate_target() is called and the decoder programming
> >> must be protected.
> >>
> >> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
> >> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> >> ---
> >> v8:
> >> - grammar and spelling fixups (Dan)
> >> - Clarify commit log story. (Dan)
> >> - Move register mapping and decoder enumeration to when first dport shows up (Dan)
> >> - Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
> >> - cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
> >> - Remove failure path for possible dports == 0. (Dan, Robert)
> >> - update_switch_decoder() -> update_decoder_targets(). (Dan)
> >> - Remove lock asserts where not needed. (Dan)
> >> - Add support for passthrough decoder init. (Dan)
> >> - Return -ENXIO when no driver attached. (Dan)
> >> - Move guard() from devm-cxl_add_dport_by_uport. (Dan, Robert)
> >> - Add devm_cxl_create_or_extend_port() helper. (Dan)
> >> - Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
> >> - Remove 'new_dport' local var. (Robert)
> >> - Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
> >> - Move port check logic to add_port_attach_ep(). (Robert)
> >> ---
> >> drivers/cxl/core/cdat.c | 2 +-
> >> drivers/cxl/core/core.h | 2 +
> >> drivers/cxl/core/hdm.c | 6 -
> >> drivers/cxl/core/pci.c | 81 +++++++++++
> >> drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
> >> drivers/cxl/core/region.c | 4 +-
> >> drivers/cxl/cxl.h | 3 +
> >> drivers/cxl/port.c | 29 +---
> >> 8 files changed, 331 insertions(+), 83 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> >> index c0af645425f4..b156b81a9b20 100644
> >> --- a/drivers/cxl/core/cdat.c
> >> +++ b/drivers/cxl/core/cdat.c
> >> @@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
> >>
> >> guard(rwsem_read)(&cxl_rwsem.region);
> >> for (int i = 0; i < cxlsd->nr_targets; i++) {
> >> - if (host_bridge == cxlsd->target[i]->dport_dev)
> >> + if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
> >> return 1;
> >> }
> >>
> >> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> >> index 2669f251d677..2ac71eb459e6 100644
> >> --- a/drivers/cxl/core/core.h
> >> +++ b/drivers/cxl/core/core.h
> >> @@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
> >> int cxl_ras_init(void);
> >> void cxl_ras_exit(void);
> >> int cxl_gpf_port_setup(struct cxl_dport *dport);
> >> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
> >> + struct device *dport_dev);
> >>
> >> #ifdef CONFIG_CXL_FEATURES
> >> struct cxl_feat_entry *
> >> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> >> index cee68bbc7ff6..5263e9eba7d0 100644
> >> --- a/drivers/cxl/core/hdm.c
> >> +++ b/drivers/cxl/core/hdm.c
> >> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
> >> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
> >> {
> >> struct cxl_switch_decoder *cxlsd;
> >> - struct cxl_dport *dport = NULL;
> >> - unsigned long index;
> >> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
> >>
> >> /*
> >> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
> >>
> >> device_lock_assert(&port->dev);
> >>
> >> - xa_for_each(&port->dports, index, dport)
> >> - break;
> >> - cxlsd->cxld.target_map[0] = dport->port_id;
> >> -
> >
> > The change of initialization of cxlsd->cxld.target_map[] could have
> > been a separate patch to reduce size of this patch.
> >
> >> return add_hdm_decoder(port, &cxlsd->cxld);
> >> }
> >> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
> >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> >> index b50551601c2e..b9d770f1aa7b 100644
> >> --- a/drivers/cxl/core/pci.c
> >> +++ b/drivers/cxl/core/pci.c
> >> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
> >> module_param(media_ready_timeout, ushort, 0644);
> >> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
> >>
> >> +/**
> >> + * devm_cxl_add_dport_by_dev - allocate a dport by dport device
> >> + * @port: cxl_port that hosts the dport
> >> + * @dport_dev: 'struct device' of the dport
> >> + *
> >> + * Returns the allocate dport on success or ERR_PTR() of -errno on error
> >> + */
> >> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
> >
> > This function only determines the port_num. How about only implement
> > this in a function cxl_pci_get_port_num() and call devm_cxl_add_dport
> > directly?
>
> I can split out the code to get the port_num locally, but we can't
> call devm_cxl_add_dport() directly in core/port.c because we need
> the map.resource and in order to retrieve that cxl_find_regblock()
> requires a pci dev.
I mean the following:
In the mock case, there is always a decoder. That is,
devm_cxl_add_passthrough_decoder() will only be used for pci devs.
Create cxl_port_has_multiple_dports() which contains:
if (!dev_is_pci(...))
return false;
/* pci_walk_bus() and inspect dports: */
...
In cxl_switch_port_setup():
rc = devm_cxl_enumerate_decoders(...)
if (!rc)
return 0;
if (cxl_port_has_multiple_dports(...))
rc = devm_cxl_add_passthrough_decoder(...);
You don't need function devm_cxl_add_dport_by_dev() any longer, just
use devm_cxl_add_dport() instead.
> >
> > That would nicely fit into core/pci.c.
> >
> >> + struct device *dport_dev)
> >> +{
> >> + struct cxl_register_map map;
> >> + struct pci_dev *pdev;
> >> + u32 lnkcap, port_num;
> >> + int type;
> >> + int rc;
> >> +
> >> + if (!dev_is_pci(dport_dev))
> >> + return ERR_PTR(-EINVAL);
> >> +
> >> + device_lock_assert(&port->dev);
> >> +
> >> + pdev = to_pci_dev(dport_dev);
> >> + type = pci_pcie_type(pdev);
> >> + if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
> >> + return ERR_PTR(-EINVAL);
> >> +
> >> + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> >> + &lnkcap))
> >> + return ERR_PTR(-ENXIO);
> >> +
> >> + rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> >> + if (rc)
> >> + dev_dbg(&port->dev, "failed to find component registers\n");
> >> +
> >> + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> >
> > So, just return port_num instead.
> >
> >> + return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
> >> +}
> >> +
> >> struct cxl_walk_context {
> >> struct pci_bus *bus;
> >> struct cxl_port *port;
> >> @@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
> >>
> >> return 0;
> >> }
> >> +
> >> +static int count_dports(struct pci_dev *pdev, void *data)
> >> +{
> >> + struct cxl_walk_context *ctx = data;
> >> + int type = pci_pcie_type(pdev);
> >> +
> >> + if (pdev->bus != ctx->bus)
> >> + return 0;
> >> + if (!pci_is_pcie(pdev))
> >> + return 0;
> >> + if (type != ctx->type)
> >> + return 0;
> >> +
> >> + ctx->count++;
> >> + return 0;
> >> +}
> >> +
> >> +int cxl_port_get_possible_dports(struct cxl_port *port)
> >> +{
> >> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
> >> + struct cxl_walk_context ctx;
> >> + int type;
> >> +
> >> + if (!bus) {
> >> + dev_err(&port->dev, "No PCI bus found for port %s\n",
> >> + dev_name(&port->dev));
> >> + return -ENXIO;
> >> + }
> >> +
> >> + if (pci_is_root_bus(bus))
> >> + type = PCI_EXP_TYPE_ROOT_PORT;
> >> + else
> >> + type = PCI_EXP_TYPE_DOWNSTREAM;
> >> +
> >> + ctx = (struct cxl_walk_context) {
> >> + .bus = bus,
> >> + .type = type,
> >> + };
> >> + pci_walk_bus(bus, count_dports, &ctx);
> >
> > Don't walk the whole bus, just check children of port->uport_dev.
>
> cxl_port_to_pci_bus() gets the pdev->subordinate of the
> port->uport_dev. So I think that's equivalent of checking the
> children of port->uport_dev and not actually walking the whole pci
> bus no?
pci_walk_bus() also calls subordinates. So it is equivalent, but
count_dports is called for other devices too that are not children.
And it is not obvious that only direct children are counted. Use
device_for_each_child()?
> >
> >> +
> >> + return ctx.count;
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
> >
> > See below for my comment on possible_dports.
> >
> > Since we only check for count > 1 the implemntation could be
> > simplified and renamed to e.g. cxl_port_has_multiple_dports which
> > could easily be used to call devm_cxl_add_passthrough_decoder().
>
> This would be possible if the function can return a bool. However,
> it is possible to encounter errors. And errors should not be
> equivalent to a false (0) return value and resulting in a
> passthrough decoder creation. Thus I think we should stay with the
> current function name.
Ok, but see also above.
> >
> >> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> >> index 25209952f469..877f888ee8f5 100644
> >> --- a/drivers/cxl/core/port.c
> >> +++ b/drivers/cxl/core/port.c
> >> @@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
> >> return port;
> >> }
> >>
> >> -static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
> >> - struct device *dport_dev,
> >> - struct cxl_dport **dport)
> >> -{
> >> - struct cxl_find_port_ctx ctx = {
> >> - .dport_dev = dport_dev,
> >> - .parent_port = parent_port,
> >> - .dport = dport,
> >> - };
> >> - struct cxl_port *port;
> >> -
> >> - port = __find_cxl_port(&ctx);
> >> - return port;
> >> -}
> >> -
> >> /*
> >> * All users of grandparent() are using it to walk PCIe-like switch port
> >> * hierarchy. A PCIe switch is comprised of a bridge device representing the
> >> @@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
> >> return map.resource;
> >> }
> >>
> >> +static int match_port_by_uport(struct device *dev, const void *data)
> >> +{
> >> + const struct device *uport_dev = data;
> >> + struct cxl_port *port;
> >> +
> >> + if (!is_cxl_port(dev))
> >> + return 0;
> >> +
> >> + port = to_cxl_port(dev);
> >> + return uport_dev == port->uport_dev;
> >> +}
> >> +
> >> +/*
> >> + * Function takes a device reference on the port device. Caller should do a
> >> + * put_device() when done.
> >> + */
> >> +static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
> >> +{
> >> + struct device *dev;
> >> +
> >> + dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
> >> + if (dev)
> >> + return to_cxl_port(dev);
> >> + return NULL;
> >> +}
> >> +
> >> +static int update_decoder_targets(struct device *dev, void *data)
> >> +{
> >> + struct cxl_dport *dport = data;
> >> + struct cxl_switch_decoder *cxlsd;
> >> + struct cxl_decoder *cxld;
> >> + int i;
> >> +
> >> + if (!is_switch_decoder(dev))
> >> + return 0;
> >> +
> >> + cxlsd = to_cxl_switch_decoder(dev);
> >> + cxld = &cxlsd->cxld;
> >> + guard(rwsem_write)(&cxl_rwsem.region);
> >> +
> >> + /* Short cut for passthrough decoder */
> >> + if (cxlsd->nr_targets == 1) {
> >
> > I think we should still check port_id. That is, remove the shortcut.
> > If nr_targets == 1, then interleave_ways should be one too, so you
> > gain nothing. Plus, you also see the dev_dbg().
>
> ok
>
> >
> >> + cxlsd->target[0] = dport;
> >> + return 0;
> >> + }
> >> +
> >> + for (i = 0; i < cxld->interleave_ways; i++) {
> >> + if (cxld->target_map[i] == dport->port_id) {
> >> + cxlsd->target[i] = dport;
> >> + dev_dbg(dev, "dport%d found in target list, index %d\n",
> >> + dport->port_id, i);
> >> + return 0;
> >
> > Only one target exists, right? Stop the iteration by returning a
> > non-zero here (caller needs to be adjusted then).
>
> ok
>
> >
> >> + }
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int cxl_decoders_dport_update(struct cxl_dport *dport)
> >> +{
> >> + return device_for_each_child(&dport->port->dev, dport,
> >> + update_decoder_targets);
> >
> > Might need changes if update_decoder_targets returns 1 to stop the
> > iterator.
>
> ok
>
> >
> >> +}
> >> +
> >> +static int cxl_switch_port_setup(struct cxl_port *port)
> >> +{
> >
> > Could you factor out that function in a separate patch?
> >
> > The function only sets up decoders. Name it
> > cxl_switch_port_setup_decoders()?
> >
> >> + struct cxl_hdm *cxlhdm;
> >> +
> >> + cxlhdm = devm_cxl_setup_hdm(port, NULL);
> >> + if (!IS_ERR(cxlhdm))
> >> + return devm_cxl_enumerate_decoders(cxlhdm, NULL);
> >> +
> >> + if (PTR_ERR(cxlhdm) != -ENODEV) {
> >> + dev_err(&port->dev, "Failed to map HDM decoder capability\n");
> >> + return PTR_ERR(cxlhdm);
> >> + }
> >> +
> >> + if (port->possible_dports == 1) {
> >> + dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
> >> + return devm_cxl_add_passthrough_decoder(port);
> >
> > Imo, the possible_dports handling should be removed as it only
> > introduces dead code. mock_cxl_setup_hdm() always returns a valid
> > cxlhdm (unless for -ENOMEM) and the mock case never reaches this code
> > here.
> >
> > So how about moving (the "real") devm_cxl_add_passthrough_decoder()
> > and cxl_port_get_possible_dports() to devm_cxl_enumerate_decoders()?
> > devm_cxl_add_passthrough_decoder() would be static then and
> > cxl_port_get_possible_dports() will be a core.h function only. Then,
> > mock_cxl_add_passthrough_decoder() could be removed too.
> >
> > I really would like to have a clean core module interface that allows
> > an easy implementation of cxl_test and avoid too much impact to the
> > driver code.
>
> So after looking at this a bit, it looks like we need a bigger refactor than just devm_cxl_enumerate_decoders(). I have an attempt in the next rev you can take a look. It reduces from 3-4 mock functions down to 2.
>
> >
> >> + }
> >> +
> >> + dev_err(&port->dev, "HDM decoder capability not found\n");
> >> + return -ENXIO;
> >> +}
> >> +
> >> +DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
> >> +static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
> >> + struct device *dport_dev)
> >> +{
> >> + struct cxl_dport *dport;
> >> + int rc;
> >> +
> >> + guard(device)(&port->dev);
> >> +
> >> + if (!port->dev.driver)
> >> + return ERR_PTR(-ENXIO);
> >> +
> >> + dport = cxl_find_dport_by_dev(port, dport_dev);
> >> + if (dport)
> >> + return dport;
> >
> > What is the case if there is already a dport bound to the port? Since
> > there is a 1:1 mapping downstream, there is only one allocation and I
> > would expect that dport never exists and an -EBUSY should be returned
> > otherwise.
>
> ok
>
> >
> >> +
> >> + struct cxl_dport *new_dport __free(put_cxl_dport) =
> >> + devm_cxl_add_dport_by_dev(port, dport_dev);
> >
> > See my comment on devm_cxl_add_dport_by_dev() above.
> >
> >> + if (IS_ERR(new_dport))
> >> + return new_dport;
> >> +
> >> + cxl_switch_parse_cdat(port);
> >> +
> >> + /*
> >> + * First instance of dport appearing, need to setup the port, including
> >> + * allocating decoders.
> >> + */
> >> + if (port->nr_dports == 1) {
> >> + rc = cxl_switch_port_setup(port);
> >
> > Can't this be done with port creation? I don't see a reason doing this
> > late at this point.
> >
> >> + if (rc)
> >> + return ERR_PTR(rc);
> >> + return no_free_ptr(new_dport);
> >> + }
> >> +
> >> + rc = cxl_decoders_dport_update(new_dport);
> >> + if (rc)
> >> + return ERR_PTR(rc);
> >
> > Maybe unfold cxl_decoders_dport_update() here?
>
> ok
>
> >
> >> +
> >> + return no_free_ptr(new_dport);
> >> +}
> >> +
> >> +static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
> >> + struct device *dport_dev)
> >> +{
> >> + struct cxl_port *port __free(put_cxl_port) =
> >> + find_cxl_port_by_uport(uport_dev);
> >> +
> >> + if (!port)
> >> + return ERR_PTR(-ENODEV);
> >> +
> >> + return cxl_port_get_or_add_dport(port, dport_dev);
> >> +}
> >
> > That function can be removed, see below.
>
> ok
>
> >
> >> +
> >> +static struct cxl_dport *
> >> +devm_cxl_create_or_extend_port(struct device *ep_dev,
> >> + struct cxl_port *parent_port,
> >> + struct cxl_dport *parent_dport,
> >> + struct device *uport_dev,
> >> + struct device *dport_dev)
> >> +{
> >> + resource_size_t component_reg_phys;
> >> +
> >> + guard(device)(&parent_port->dev);
> >> +
> >> + if (!parent_port->dev.driver) {
> >> + dev_warn(ep_dev,
> >> + "port %s:%s disabled, failed to enumerate CXL.mem\n",
> >> + dev_name(&parent_port->dev), dev_name(uport_dev));
> >> + return ERR_PTR(-ENXIO);
> >> + }
> >> +
> >> + struct cxl_port *port __free(put_cxl_port) =
> >> + find_cxl_port_by_uport(uport_dev);
> >> +
> >> + if (!port) {
> >> + component_reg_phys = find_component_registers(uport_dev);
> >> + port = devm_cxl_add_port(&parent_port->dev, uport_dev,
> >> + component_reg_phys, parent_dport);
> >> + if (IS_ERR(port))
> >> + return (struct cxl_dport *)port;
> >> +
> >> + /*
> >> + * retry to make sure a port is found. a port device
> >> + * reference is taken.
> >> + */
> >> + port = find_cxl_port_by_uport(uport_dev);
> >> + if (!port)
> >> + return ERR_PTR(-ENODEV);
> >> +
> >> + dev_dbg(ep_dev, "created port %s:%s\n",
> >> + dev_name(&port->dev), dev_name(port->uport_dev));
> >> + }
> >> +
> >> + return cxl_port_get_or_add_dport(port, dport_dev);
> >> +}
> >> +
> >> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> >> struct device *uport_dev,
> >> struct device *dport_dev)
> >> {
> >> struct device *dparent = grandparent(dport_dev);
> >> struct cxl_dport *dport, *parent_dport;
> >> - resource_size_t component_reg_phys;
> >> int rc;
> >>
> >> if (is_cxl_host_bridge(dparent)) {
> >> + struct cxl_port *port __free(put_cxl_port) =
> >> + find_cxl_port_by_uport(uport_dev);
> >> /*
> >> * The iteration reached the topology root without finding the
> >> * CXL-root 'cxl_port' on a previous iteration, fail for now to
> >> * be re-probed after platform driver attaches.
> >> */
> >> - dev_dbg(&cxlmd->dev, "%s is a root dport\n",
> >> - dev_name(dport_dev));
> >> - return -ENXIO;
> >> + if (!port) {
> >> + dev_dbg(&cxlmd->dev, "%s is a root dport\n",
> >> + dev_name(dport_dev));
> >> + return -ENXIO;
> >> + }
> >> +
> >> + /*
> >> + * While the port is found, there may not be a dport associated
> >> + * yet. Try to associate the dport to the port. On return success,
> >> + * the iteration will restart with the dport now attached.
> >> + */
> >> + dport = devm_cxl_add_dport_by_uport(uport_dev,
> >> + dport_dev);
> >
> > port is known here, use cxl_port_get_or_add_dport(port, dport_dev)
> > instead. Remove devm_cxl_add_dport_by_uport().
>
> ok
>
> >
> >> + if (IS_ERR(dport))
> >> + return PTR_ERR(dport);
> >> +
> >> + return 0;
> >> }
> >>
> >> struct cxl_port *parent_port __free(put_cxl_port) =
> >> @@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> >> return -EAGAIN;
> >> }
> >>
> >> - /*
> >> - * Definition with __free() here to keep the sequence of
> >> - * dereferencing the device of the port before the parent_port releasing.
> >> - */
> >> - struct cxl_port *port __free(put_cxl_port) = NULL;
> >> - scoped_guard(device, &parent_port->dev) {
> >> - if (!parent_port->dev.driver) {
> >> - dev_warn(&cxlmd->dev,
> >> - "port %s:%s disabled, failed to enumerate CXL.mem\n",
> >> - dev_name(&parent_port->dev), dev_name(uport_dev));
> >> - return -ENXIO;
> >> - }
> >> + dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
> >> + parent_dport, uport_dev,
> >> + dport_dev);
> >
> > You expand add_port_attach_ep() here. This function was originally
> > called if there is no *port* at all. Now, as the dport_dev is not yet
> > registered, the port may already exist, but it is not found since the
> > dport_dev is not yet registered and add_port_attach_ep() is called now
> > even if the port exists. I think we should move that dport_dev
> > registration a level higher to devm_cxl_enumerate_ports(). That might
> > need a cleanup of the iterator and the removal of
> > add_port_attach_ep().
>
> Yes. The new rev will move the dport registration a level up. No need to remove add_port_attach_ep(). devm_cxl_create_or_extend_port() will be devm_cxl_create_port().
>
> >
> >> + if (IS_ERR(dport))
> >> + return PTR_ERR(dport);
> >>
> >> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
> >> - if (!port) {
> >> - component_reg_phys = find_component_registers(uport_dev);
> >> - port = devm_cxl_add_port(&parent_port->dev, uport_dev,
> >> - component_reg_phys, parent_dport);
> >> - if (IS_ERR(port))
> >> - return PTR_ERR(port);
> >> -
> >> - /* retry find to pick up the new dport information */
> >> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
> >> - if (!port)
> >> - return -ENXIO;
> >> - }
> >> - }
> >> -
> >> - dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
> >> - dev_name(&port->dev), dev_name(port->uport_dev));
> >> rc = cxl_add_ep(dport, &cxlmd->dev);
> >> if (rc == -EBUSY) {
> >> /*
> >> @@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >> {
> >> struct device *dev = &cxlmd->dev;
> >> struct device *iter;
> >> + int ports_need_create = 0;
> >> int rc;
> >>
> >> /*
> >> @@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >> struct device *uport_dev;
> >> struct cxl_dport *dport;
> >>
> >> + ports_need_create++;
> >> +
> >> if (is_cxl_host_bridge(dport_dev))
> >> return 0;
> >>
> >> @@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >>
> >> cxl_gpf_port_setup(dport);
> >>
> >> + ports_need_create--;
> >> /* Any more ports to add between this one and the root? */
> >> if (!dev_is_cxl_root_child(&port->dev))
> >> continue;
> >>
> >> + /*
> >> + * The 'ports_need_create' variable tracks a port being
> >> + * created as it goes through this iterative loop. It's
> >> + * incremented when it first enters the loop and decremented
> >> + * when the port is found. If at the root of the hierarchy
> >> + * and the variable is not 0, then it's missing a port
> >> + * creation somewhere in the hierarchy and should restart.
> >> + * For example in a setup where there's a PCI root port, a
> >> + * switch, and an endpoint, it is possible to get to the
> >> + * PCI root port and its creation, and the switch port is
> >> + * still missing because the root port didn't exist. This
> >> + * triggers a restart of the loop to create the switch port
> >> + * now with a present root port.
> >> + */
> >> + if (ports_need_create)
> >
> > Uh, that becomes hard. Isn't the iterator much simpler:
> >
> > * Start the iter = endpoint.
> >
> > * Find first existing parent port up to the root.
> >
> > * If that is the direct parent of the endpoint, attach it to the
> > parent (add dport etc.). Exit loop without errors.
> >
> > * Else, create port and attach it to the found parent port (including
> > dport handling).
> >
> > * Fail on errors or retry otherwise.
> >
> > So, devm_cxl_enumerate_ports() should be reworked better, also address
> > my other comments regarding add_port_attach_ep() and
> > devm_cxl_create_or_extend_port().
>
> So I reworked this whole path a bit. Maybe not exactly what you are envisioning here but it is a lot cleaner. You can take a look at the next rev.
>
> >
> >> + goto retry;
> >> +
> >> return 0;
> >> }
> >>
> >> @@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >> if (rc == -EAGAIN)
> >> continue;
> >> /* failed to add ep or port */
> >> - if (rc)
> >> + if (rc < 0)
> >> return rc;
> >> +
> >> + ports_need_create = 0;
> >> /* port added, new descendants possible, start over */
> >> goto retry;
> >> }
> >> @@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
> >> device_lock_assert(&port->dev);
> >>
> >> if (xa_empty(&port->dports))
> >> - return -EINVAL;
> >> + return 0;
> >>
> >> guard(rwsem_write)(&cxl_rwsem.region);
> >> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
> >> struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
> >>
> >> - if (!dport)
> >> - return -ENXIO;
> >> + if (!dport) {
> >> + /* dport may be activated later */
> >> + continue;
> >> + }
> >> cxlsd->target[i] = dport;
> >> }
> >
> > Should that be dropped entirely as the target setup is done somewhere
> > else?
> >
> No. This is still needed for root ports.
Root ports are dport_devs and don't have a target list, host bridges
have. I did not follow the entire code flow, but shouldn't
cxl_decoders_dport_update() handle that?
-Robert
>
> DJ
> >>
> >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> >> index 71cc42d05248..bba62867df90 100644
> >> --- a/drivers/cxl/core/region.c
> >> +++ b/drivers/cxl/core/region.c
> >> @@ -1510,8 +1510,10 @@ static int cxl_port_setup_targets(struct cxl_port *port,
> >> cxl_rr->nr_targets_set);
> >> return -ENXIO;
> >> }
> >> - } else
> >> + } else {
> >> cxlsd->target[cxl_rr->nr_targets_set] = ep->dport;
> >> + cxlsd->cxld.target_map[cxl_rr->nr_targets_set] = ep->dport->port_id;
> >> + }
> >> inc = 1;
> >> out_target_set:
> >> cxl_rr->nr_targets_set += inc;
> >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> >> index 87a905db5ffb..df10a01376c6 100644
> >> --- a/drivers/cxl/cxl.h
> >> +++ b/drivers/cxl/cxl.h
> >> @@ -591,6 +591,7 @@ struct cxl_dax_region {
> >> * @parent_dport: dport that points to this port in the parent
> >> * @decoder_ida: allocator for decoder ids
> >> * @reg_map: component and ras register mapping parameters
> >> + * @possible_dports: Total possible dports reported by hardware
> >> * @nr_dports: number of entries in @dports
> >> * @hdm_end: track last allocated HDM decoder instance for allocation ordering
> >> * @commit_end: cursor to track highest committed decoder for commit ordering
> >> @@ -612,6 +613,7 @@ struct cxl_port {
> >> struct cxl_dport *parent_dport;
> >> struct ida decoder_ida;
> >> struct cxl_register_map reg_map;
> >> + int possible_dports;
> >> int nr_dports;
> >> int hdm_end;
> >> int commit_end;
> >> @@ -911,6 +913,7 @@ void cxl_coordinates_combine(struct access_coordinate *out,
> >> struct access_coordinate *c2);
> >>
> >> bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
> >> +int cxl_port_get_possible_dports(struct cxl_port *port);
> >>
> >> /*
> >> * Unit test builds overrides this to __weak, find the 'strong' version
> >> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> >> index cf32dc50b7a6..941a7d7157bd 100644
> >> --- a/drivers/cxl/port.c
> >> +++ b/drivers/cxl/port.c
> >> @@ -59,34 +59,17 @@ static int discover_region(struct device *dev, void *unused)
> >>
> >> static int cxl_switch_port_probe(struct cxl_port *port)
> >> {
> >> - struct cxl_hdm *cxlhdm;
> >> - int rc;
> >> + int dports;
> >>
> >> /* Cache the data early to ensure is_visible() works */
> >> read_cdat_data(port);
> >>
> >> - rc = devm_cxl_port_enumerate_dports(port);
> >> - if (rc < 0)
> >> - return rc;
> >> + dports = cxl_port_get_possible_dports(port);
> >> + if (dports < 0)
> >> + return dports;
> >> + port->possible_dports = dports;
> >
> > As said, I think the whole possible_dports part can be removed.
> >
> > Thanks,
> >
> > -Robert
> >
> >>
> >> - cxl_switch_parse_cdat(port);
> >> -
> >> - cxlhdm = devm_cxl_setup_hdm(port, NULL);
> >> - if (!IS_ERR(cxlhdm))
> >> - return devm_cxl_enumerate_decoders(cxlhdm, NULL);
> >> -
> >> - if (PTR_ERR(cxlhdm) != -ENODEV) {
> >> - dev_err(&port->dev, "Failed to map HDM decoder capability\n");
> >> - return PTR_ERR(cxlhdm);
> >> - }
> >> -
> >> - if (rc == 1) {
> >> - dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
> >> - return devm_cxl_add_passthrough_decoder(port);
> >> - }
> >> -
> >> - dev_err(&port->dev, "HDM decoder capability not found\n");
> >> - return -ENXIO;
> >> + return 0;
> >> }
> >>
> >> static int cxl_endpoint_port_probe(struct cxl_port *port)
> >> --
> >> 2.50.1
> >>
>
next prev parent reply other threads:[~2025-09-01 17:29 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
2025-08-15 12:50 ` Jonathan Cameron
2025-08-20 13:51 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 02/11] cxl: Add helper to reap dport Dave Jiang
2025-08-20 14:10 ` Robert Richter
2025-08-20 20:54 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder Dave Jiang
2025-08-15 12:52 ` Jonathan Cameron
2025-08-20 14:17 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 04/11] cxl: Move port register setup to first dport appear Dave Jiang
2025-08-15 12:57 ` Jonathan Cameron
2025-08-21 11:57 ` Robert Richter
2025-08-22 10:37 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 05/11] cxl: Defer dport allocation for switch ports Dave Jiang
2025-08-20 12:41 ` Robert Richter
2025-08-20 15:20 ` Dave Jiang
2025-08-22 9:59 ` Robert Richter
2025-08-22 15:52 ` Dave Jiang
2025-08-26 7:51 ` Robert Richter
2025-08-27 17:05 ` Dave Jiang
2025-08-29 15:02 ` Robert Richter
2025-08-29 17:23 ` Dave Jiang
2025-09-01 14:48 ` Robert Richter
2025-09-02 15:58 ` Dave Jiang
2025-08-27 21:15 ` Dave Jiang
2025-09-01 17:29 ` Robert Richter [this message]
2025-09-02 15:40 ` Dave Jiang
2025-09-03 18:21 ` Dave Jiang
2025-08-27 21:37 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 06/11] cxl/test: Add cxl_test support for cxl_port_get_possible_dports() Dave Jiang
2025-08-14 22:21 ` [PATCH v8 07/11] cxl/test: Add mock version of devm_cxl_add_dport_by_dev() Dave Jiang
2025-08-14 22:21 ` [PATCH v8 08/11] cxl/test: Add support to cxl_test for decoder enumeration mock functions Dave Jiang
2025-08-14 22:21 ` [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization Dave Jiang
2025-08-15 13:04 ` Jonathan Cameron
2025-08-14 22:21 ` [PATCH v8 10/11] cxl: Change sslbis handler to only handle single dport Dave Jiang
2025-08-14 22:21 ` [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support Dave Jiang
2025-08-15 13:15 ` Jonathan Cameron
2025-08-19 9:39 ` [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Robert Richter
2025-08-19 15:41 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aLXYBcBJNV-9Lt30@rric.localdomain \
--to=rrichter@amd.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=jonathan.cameron@huawei.com \
--cc=linux-cxl@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox