From: Dave Jiang <dave.jiang@intel.com>
To: Robert Richter <rrichter@amd.com>
Cc: linux-cxl@vger.kernel.org, dave@stgolabs.net,
jonathan.cameron@huawei.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com,
dan.j.williams@intel.com
Subject: Re: [PATCH v8 05/11] cxl: Defer dport allocation for switch ports
Date: Tue, 2 Sep 2025 08:40:30 -0700 [thread overview]
Message-ID: <12b3c97a-9463-4c95-8757-e306d4aea14d@intel.com> (raw)
In-Reply-To: <aLXYBcBJNV-9Lt30@rric.localdomain>
On 9/1/25 10:29 AM, Robert Richter wrote:
> On 27.08.25 14:15:21, Dave Jiang wrote:
>>
>>
>> On 8/20/25 5:41 AM, Robert Richter wrote:
>>> Hi Dave,
>>>
>>> see my comments below.
>>>
>>> On 14.08.25 15:21:45, Dave Jiang wrote:
>>>> The current implementation enumerates the dports during the cxl_port
>>>> driver probe. Without an endpoint connected, the dport may not be
>>>> active during port probe. This scheme may prevent a valid hardware
>>>> dport id to be retrieved and MMIO registers to be read when an endpoint
>>>> is hot-plugged. Move the dport allocation and setup to behind memdev
>>>> probe so the endpoint is guaranteed to be connected.
>>>>
>>>> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
>>>> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
>>>> ACPI0017.N device. Through that it enumerates downstream ports composed
>>>> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
>>>> uses add_host_bridge_uport() to create the ports that enumerate the PCI
>>>> RPs as the dports of these ports. Every time a port is created, the port
>>>> driver is attached, cxl_switch_porbe_probe() is called and
>>>> devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
>>>> the dports.
>>>>
>>>> The second phase is if there are any CXL switches. When the pci endpoint
>>>> device driver (cxl_pci) calls probe, it will add a mem device and triggers
>>>> the cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
>>>> and attempts to discovery and create all the ports represent CXL switches.
>>>> During this phase, a port is created per switch and the attached dports
>>>> are also enumerated and probed.
>>>>
>>>> The last phase is creating endpoint port which happens for all endpoint
>>>> devices.
>>>>
>>>> In this commit, the port create and its dport probing in cxl_acpi is not
>>>> changed. That will be handled later. The behavior change is only for CXL
>>>> switch ports. Only the dport that is part of the path for an endpoint
>>>> device to the RP will be probed. This happens naturally by the code
>>>> walking up the device hierarchy and identifying the upstream device and
>>>> the downstream device.
>>>>
>>>> The new sequence is instead of creating all possible dports at initial
>>>> port creation, defer port instantiation until a memdev beneath that
>>>> dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
>>>> the creation and extension of ports with new dports as memory devices
>>>> arrive. As part of this rework, switch decoder target list is amended
>>>> at runtime as dports show up.
>>>>
>>>> While the decoders are allocated during the port driver probe,
>>>> The decoders must also be updated since previously it's all done when all
>>>> the dports are setup and now every time a dport is setup per endpoint, the
>>>> switch target listing need to be updated with new dport. A
>>>> guard(rwsem_write) is used to update decoder targets. This is similar to
>>>> when decoder_populate_target() is called and the decoder programming
>>>> must be protected.
>>>>
>>>> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
>>>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>> ---
>>>> v8:
>>>> - grammar and spelling fixups (Dan)
>>>> - Clarify commit log story. (Dan)
>>>> - Move register mapping and decoder enumeration to when first dport shows up (Dan)
>>>> - Fix kdoc indentation issue with devm_cxl_add_dport_by_dev()
>>>> - cxl_port_update_total_dports() -> cxl_probe_possible_dports(). (Dan)
>>>> - Remove failure path for possible dports == 0. (Dan, Robert)
>>>> - update_switch_decoder() -> update_decoder_targets(). (Dan)
>>>> - Remove lock asserts where not needed. (Dan)
>>>> - Add support for passthrough decoder init. (Dan)
>>>> - Return -ENXIO when no driver attached. (Dan)
>>>> - Move guard() from devm-cxl_add_dport_by_uport. (Dan, Robert)
>>>> - Add devm_cxl_create_or_extend_port() helper. (Dan)
>>>> - Remove shortcut for the port iteration path. Find better way to deal. (Dan, Robert)
>>>> - Remove 'new_dport' local var. (Robert)
>>>> - Use find_cxl_port_by_uport() instead of find_cxl_port(). (Robert)
>>>> - Move port check logic to add_port_attach_ep(). (Robert)
>>>> ---
>>>> drivers/cxl/core/cdat.c | 2 +-
>>>> drivers/cxl/core/core.h | 2 +
>>>> drivers/cxl/core/hdm.c | 6 -
>>>> drivers/cxl/core/pci.c | 81 +++++++++++
>>>> drivers/cxl/core/port.c | 287 +++++++++++++++++++++++++++++++-------
>>>> drivers/cxl/core/region.c | 4 +-
>>>> drivers/cxl/cxl.h | 3 +
>>>> drivers/cxl/port.c | 29 +---
>>>> 8 files changed, 331 insertions(+), 83 deletions(-)
>>>>
>>>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>>>> index c0af645425f4..b156b81a9b20 100644
>>>> --- a/drivers/cxl/core/cdat.c
>>>> +++ b/drivers/cxl/core/cdat.c
>>>> @@ -338,7 +338,7 @@ static int match_cxlrd_hb(struct device *dev, void *data)
>>>>
>>>> guard(rwsem_read)(&cxl_rwsem.region);
>>>> for (int i = 0; i < cxlsd->nr_targets; i++) {
>>>> - if (host_bridge == cxlsd->target[i]->dport_dev)
>>>> + if (cxlsd->target[i] && host_bridge == cxlsd->target[i]->dport_dev)
>>>> return 1;
>>>> }
>>>>
>>>> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
>>>> index 2669f251d677..2ac71eb459e6 100644
>>>> --- a/drivers/cxl/core/core.h
>>>> +++ b/drivers/cxl/core/core.h
>>>> @@ -146,6 +146,8 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
>>>> int cxl_ras_init(void);
>>>> void cxl_ras_exit(void);
>>>> int cxl_gpf_port_setup(struct cxl_dport *dport);
>>>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>>>> + struct device *dport_dev);
>>>>
>>>> #ifdef CONFIG_CXL_FEATURES
>>>> struct cxl_feat_entry *
>>>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>>>> index cee68bbc7ff6..5263e9eba7d0 100644
>>>> --- a/drivers/cxl/core/hdm.c
>>>> +++ b/drivers/cxl/core/hdm.c
>>>> @@ -52,8 +52,6 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
>>>> int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>>> {
>>>> struct cxl_switch_decoder *cxlsd;
>>>> - struct cxl_dport *dport = NULL;
>>>> - unsigned long index;
>>>> struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
>>>>
>>>> /*
>>>> @@ -69,10 +67,6 @@ int devm_cxl_add_passthrough_decoder(struct cxl_port *port)
>>>>
>>>> device_lock_assert(&port->dev);
>>>>
>>>> - xa_for_each(&port->dports, index, dport)
>>>> - break;
>>>> - cxlsd->cxld.target_map[0] = dport->port_id;
>>>> -
>>>
>>> The change of initialization of cxlsd->cxld.target_map[] could have
>>> been a separate patch to reduce size of this patch.
>>>
>>>> return add_hdm_decoder(port, &cxlsd->cxld);
>>>> }
>>>> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, "CXL");
>>>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>>>> index b50551601c2e..b9d770f1aa7b 100644
>>>> --- a/drivers/cxl/core/pci.c
>>>> +++ b/drivers/cxl/core/pci.c
>>>> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
>>>> module_param(media_ready_timeout, ushort, 0644);
>>>> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
>>>>
>>>> +/**
>>>> + * devm_cxl_add_dport_by_dev - allocate a dport by dport device
>>>> + * @port: cxl_port that hosts the dport
>>>> + * @dport_dev: 'struct device' of the dport
>>>> + *
>>>> + * Returns the allocate dport on success or ERR_PTR() of -errno on error
>>>> + */
>>>> +struct cxl_dport *devm_cxl_add_dport_by_dev(struct cxl_port *port,
>>>
>>> This function only determines the port_num. How about only implement
>>> this in a function cxl_pci_get_port_num() and call devm_cxl_add_dport
>>> directly?
>>
>
>> I can split out the code to get the port_num locally, but we can't
>> call devm_cxl_add_dport() directly in core/port.c because we need
>> the map.resource and in order to retrieve that cxl_find_regblock()
>> requires a pci dev.
>
> I mean the following:
>
> In the mock case, there is always a decoder. That is,
> devm_cxl_add_passthrough_decoder() will only be used for pci devs.
>
> Create cxl_port_has_multiple_dports() which contains:
>
> if (!dev_is_pci(...))
> return false;
> /* pci_walk_bus() and inspect dports: */
> ...
>
> In cxl_switch_port_setup():
>
> rc = devm_cxl_enumerate_decoders(...)
> if (!rc)
> return 0;
> if (cxl_port_has_multiple_dports(...))
> rc = devm_cxl_add_passthrough_decoder(...);
ok, I can update this part with the function rename. v9 is mostly there with passthrough decoder eliminated in cxl_test.
>
> You don't need function devm_cxl_add_dport_by_dev() any longer, just
> use devm_cxl_add_dport() instead.
I'm not seeing how the decoder enumeration is related to devm_cxl_add_dport_by_dev(). devm_cxl_add_dport() needs a port_id and a component_reg_phys. Retrieving those require more help from core/pci.c and can't be done in core/port.c.
>
>>>
>>> That would nicely fit into core/pci.c.
>>>
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_register_map map;
>>>> + struct pci_dev *pdev;
>>>> + u32 lnkcap, port_num;
>>>> + int type;
>>>> + int rc;
>>>> +
>>>> + if (!dev_is_pci(dport_dev))
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + device_lock_assert(&port->dev);
>>>> +
>>>> + pdev = to_pci_dev(dport_dev);
>>>> + type = pci_pcie_type(pdev);
>>>> + if (type != PCI_EXP_TYPE_DOWNSTREAM && type != PCI_EXP_TYPE_ROOT_PORT)
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
>>>> + &lnkcap))
>>>> + return ERR_PTR(-ENXIO);
>>>> +
>>>> + rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
>>>> + if (rc)
>>>> + dev_dbg(&port->dev, "failed to find component registers\n");
>>>> +
>>>> + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
>>>
>>> So, just return port_num instead.
>>>
>>>> + return devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
>>>> +}
>>>> +
>>>> struct cxl_walk_context {
>>>> struct pci_bus *bus;
>>>> struct cxl_port *port;
>>>> @@ -1169,3 +1207,46 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
>>>>
>>>> return 0;
>>>> }
>>>> +
>>>> +static int count_dports(struct pci_dev *pdev, void *data)
>>>> +{
>>>> + struct cxl_walk_context *ctx = data;
>>>> + int type = pci_pcie_type(pdev);
>>>> +
>>>> + if (pdev->bus != ctx->bus)
>>>> + return 0;
>>>> + if (!pci_is_pcie(pdev))
>>>> + return 0;
>>>> + if (type != ctx->type)
>>>> + return 0;
>>>> +
>>>> + ctx->count++;
>>>> + return 0;
>>>> +}
>>>> +
>>>> +int cxl_port_get_possible_dports(struct cxl_port *port)
>>>> +{
>>>> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
>>>> + struct cxl_walk_context ctx;
>>>> + int type;
>>>> +
>>>> + if (!bus) {
>>>> + dev_err(&port->dev, "No PCI bus found for port %s\n",
>>>> + dev_name(&port->dev));
>>>> + return -ENXIO;
>>>> + }
>>>> +
>>>> + if (pci_is_root_bus(bus))
>>>> + type = PCI_EXP_TYPE_ROOT_PORT;
>>>> + else
>>>> + type = PCI_EXP_TYPE_DOWNSTREAM;
>>>> +
>>>> + ctx = (struct cxl_walk_context) {
>>>> + .bus = bus,
>>>> + .type = type,
>>>> + };
>>>> + pci_walk_bus(bus, count_dports, &ctx);
>>>
>>> Don't walk the whole bus, just check children of port->uport_dev.
>>
>
>> cxl_port_to_pci_bus() gets the pdev->subordinate of the
>> port->uport_dev. So I think that's equivalent of checking the
>> children of port->uport_dev and not actually walking the whole pci
>> bus no?
>
> pci_walk_bus() also calls subordinates. So it is equivalent, but
> count_dports is called for other devices too that are not children.
> And it is not obvious that only direct children are counted. Use
> device_for_each_child()?
>
ok I can see about switch to that.
>>>
>>>> +
>>>> + return ctx.count;
>>>> +}
>>>> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_possible_dports, "CXL");
>>>
>>> See below for my comment on possible_dports.
>>>
>>> Since we only check for count > 1 the implemntation could be
>>> simplified and renamed to e.g. cxl_port_has_multiple_dports which
>>> could easily be used to call devm_cxl_add_passthrough_decoder().
>>
>
>> This would be possible if the function can return a bool. However,
>> it is possible to encounter errors. And errors should not be
>> equivalent to a false (0) return value and resulting in a
>> passthrough decoder creation. Thus I think we should stay with the
>> current function name.
>
> Ok, but see also above.
>
>>>
>>>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>>>> index 25209952f469..877f888ee8f5 100644
>>>> --- a/drivers/cxl/core/port.c
>>>> +++ b/drivers/cxl/core/port.c
>>>> @@ -1367,21 +1367,6 @@ static struct cxl_port *find_cxl_port(struct device *dport_dev,
>>>> return port;
>>>> }
>>>>
>>>> -static struct cxl_port *find_cxl_port_at(struct cxl_port *parent_port,
>>>> - struct device *dport_dev,
>>>> - struct cxl_dport **dport)
>>>> -{
>>>> - struct cxl_find_port_ctx ctx = {
>>>> - .dport_dev = dport_dev,
>>>> - .parent_port = parent_port,
>>>> - .dport = dport,
>>>> - };
>>>> - struct cxl_port *port;
>>>> -
>>>> - port = __find_cxl_port(&ctx);
>>>> - return port;
>>>> -}
>>>> -
>>>> /*
>>>> * All users of grandparent() are using it to walk PCIe-like switch port
>>>> * hierarchy. A PCIe switch is comprised of a bridge device representing the
>>>> @@ -1557,24 +1542,221 @@ static resource_size_t find_component_registers(struct device *dev)
>>>> return map.resource;
>>>> }
>>>>
>>>> +static int match_port_by_uport(struct device *dev, const void *data)
>>>> +{
>>>> + const struct device *uport_dev = data;
>>>> + struct cxl_port *port;
>>>> +
>>>> + if (!is_cxl_port(dev))
>>>> + return 0;
>>>> +
>>>> + port = to_cxl_port(dev);
>>>> + return uport_dev == port->uport_dev;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Function takes a device reference on the port device. Caller should do a
>>>> + * put_device() when done.
>>>> + */
>>>> +static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
>>>> +{
>>>> + struct device *dev;
>>>> +
>>>> + dev = bus_find_device(&cxl_bus_type, NULL, uport_dev, match_port_by_uport);
>>>> + if (dev)
>>>> + return to_cxl_port(dev);
>>>> + return NULL;
>>>> +}
>>>> +
>>>> +static int update_decoder_targets(struct device *dev, void *data)
>>>> +{
>>>> + struct cxl_dport *dport = data;
>>>> + struct cxl_switch_decoder *cxlsd;
>>>> + struct cxl_decoder *cxld;
>>>> + int i;
>>>> +
>>>> + if (!is_switch_decoder(dev))
>>>> + return 0;
>>>> +
>>>> + cxlsd = to_cxl_switch_decoder(dev);
>>>> + cxld = &cxlsd->cxld;
>>>> + guard(rwsem_write)(&cxl_rwsem.region);
>>>> +
>>>> + /* Short cut for passthrough decoder */
>>>> + if (cxlsd->nr_targets == 1) {
>>>
>>> I think we should still check port_id. That is, remove the shortcut.
>>> If nr_targets == 1, then interleave_ways should be one too, so you
>>> gain nothing. Plus, you also see the dev_dbg().
>>
>> ok
>>
>>>
>>>> + cxlsd->target[0] = dport;
>>>> + return 0;
>>>> + }
>>>> +
>>>> + for (i = 0; i < cxld->interleave_ways; i++) {
>>>> + if (cxld->target_map[i] == dport->port_id) {
>>>> + cxlsd->target[i] = dport;
>>>> + dev_dbg(dev, "dport%d found in target list, index %d\n",
>>>> + dport->port_id, i);
>>>> + return 0;
>>>
>>> Only one target exists, right? Stop the iteration by returning a
>>> non-zero here (caller needs to be adjusted then).
>>
>> ok
>>
>>>
>>>> + }
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int cxl_decoders_dport_update(struct cxl_dport *dport)
>>>> +{
>>>> + return device_for_each_child(&dport->port->dev, dport,
>>>> + update_decoder_targets);
>>>
>>> Might need changes if update_decoder_targets returns 1 to stop the
>>> iterator.
>>
>> ok
>>
>>>
>>>> +}
>>>> +
>>>> +static int cxl_switch_port_setup(struct cxl_port *port)
>>>> +{
>>>
>>> Could you factor out that function in a separate patch?
>>>
>>> The function only sets up decoders. Name it
>>> cxl_switch_port_setup_decoders()?
>>>
>>>> + struct cxl_hdm *cxlhdm;
>>>> +
>>>> + cxlhdm = devm_cxl_setup_hdm(port, NULL);
>>>> + if (!IS_ERR(cxlhdm))
>>>> + return devm_cxl_enumerate_decoders(cxlhdm, NULL);
>>>> +
>>>> + if (PTR_ERR(cxlhdm) != -ENODEV) {
>>>> + dev_err(&port->dev, "Failed to map HDM decoder capability\n");
>>>> + return PTR_ERR(cxlhdm);
>>>> + }
>>>> +
>>>> + if (port->possible_dports == 1) {
>>>> + dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
>>>> + return devm_cxl_add_passthrough_decoder(port);
>>>
>>> Imo, the possible_dports handling should be removed as it only
>>> introduces dead code. mock_cxl_setup_hdm() always returns a valid
>>> cxlhdm (unless for -ENOMEM) and the mock case never reaches this code
>>> here.
>>>
>>> So how about moving (the "real") devm_cxl_add_passthrough_decoder()
>>> and cxl_port_get_possible_dports() to devm_cxl_enumerate_decoders()?
>>> devm_cxl_add_passthrough_decoder() would be static then and
>>> cxl_port_get_possible_dports() will be a core.h function only. Then,
>>> mock_cxl_add_passthrough_decoder() could be removed too.
>>>
>>> I really would like to have a clean core module interface that allows
>>> an easy implementation of cxl_test and avoid too much impact to the
>>> driver code.
>>
>> So after looking at this a bit, it looks like we need a bigger refactor than just devm_cxl_enumerate_decoders(). I have an attempt in the next rev you can take a look. It reduces from 3-4 mock functions down to 2.
>>
>>>
>>>> + }
>>>> +
>>>> + dev_err(&port->dev, "HDM decoder capability not found\n");
>>>> + return -ENXIO;
>>>> +}
>>>> +
>>>> +DEFINE_FREE(put_cxl_dport, struct cxl_dport *, if (!IS_ERR_OR_NULL(_T)) reap_dport(_T))
>>>> +static struct cxl_dport *cxl_port_get_or_add_dport(struct cxl_port *port,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_dport *dport;
>>>> + int rc;
>>>> +
>>>> + guard(device)(&port->dev);
>>>> +
>>>> + if (!port->dev.driver)
>>>> + return ERR_PTR(-ENXIO);
>>>> +
>>>> + dport = cxl_find_dport_by_dev(port, dport_dev);
>>>> + if (dport)
>>>> + return dport;
>>>
>>> What is the case if there is already a dport bound to the port? Since
>>> there is a 1:1 mapping downstream, there is only one allocation and I
>>> would expect that dport never exists and an -EBUSY should be returned
>>> otherwise.
>>
>> ok
>>
>>>
>>>> +
>>>> + struct cxl_dport *new_dport __free(put_cxl_dport) =
>>>> + devm_cxl_add_dport_by_dev(port, dport_dev);
>>>
>>> See my comment on devm_cxl_add_dport_by_dev() above.
>>>
>>>> + if (IS_ERR(new_dport))
>>>> + return new_dport;
>>>> +
>>>> + cxl_switch_parse_cdat(port);
>>>> +
>>>> + /*
>>>> + * First instance of dport appearing, need to setup the port, including
>>>> + * allocating decoders.
>>>> + */
>>>> + if (port->nr_dports == 1) {
>>>> + rc = cxl_switch_port_setup(port);
>>>
>>> Can't this be done with port creation? I don't see a reason doing this
>>> late at this point.
>>>
>>>> + if (rc)
>>>> + return ERR_PTR(rc);
>>>> + return no_free_ptr(new_dport);
>>>> + }
>>>> +
>>>> + rc = cxl_decoders_dport_update(new_dport);
>>>> + if (rc)
>>>> + return ERR_PTR(rc);
>>>
>>> Maybe unfold cxl_decoders_dport_update() here?
>>
>> ok
>>
>>>
>>>> +
>>>> + return no_free_ptr(new_dport);
>>>> +}
>>>> +
>>>> +static struct cxl_dport *devm_cxl_add_dport_by_uport(struct device *uport_dev,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> +
>>>> + if (!port)
>>>> + return ERR_PTR(-ENODEV);
>>>> +
>>>> + return cxl_port_get_or_add_dport(port, dport_dev);
>>>> +}
>>>
>>> That function can be removed, see below.
>>
>> ok
>>
>>>
>>>> +
>>>> +static struct cxl_dport *
>>>> +devm_cxl_create_or_extend_port(struct device *ep_dev,
>>>> + struct cxl_port *parent_port,
>>>> + struct cxl_dport *parent_dport,
>>>> + struct device *uport_dev,
>>>> + struct device *dport_dev)
>>>> +{
>>>> + resource_size_t component_reg_phys;
>>>> +
>>>> + guard(device)(&parent_port->dev);
>>>> +
>>>> + if (!parent_port->dev.driver) {
>>>> + dev_warn(ep_dev,
>>>> + "port %s:%s disabled, failed to enumerate CXL.mem\n",
>>>> + dev_name(&parent_port->dev), dev_name(uport_dev));
>>>> + return ERR_PTR(-ENXIO);
>>>> + }
>>>> +
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> +
>>>> + if (!port) {
>>>> + component_reg_phys = find_component_registers(uport_dev);
>>>> + port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>>>> + component_reg_phys, parent_dport);
>>>> + if (IS_ERR(port))
>>>> + return (struct cxl_dport *)port;
>>>> +
>>>> + /*
>>>> + * retry to make sure a port is found. a port device
>>>> + * reference is taken.
>>>> + */
>>>> + port = find_cxl_port_by_uport(uport_dev);
>>>> + if (!port)
>>>> + return ERR_PTR(-ENODEV);
>>>> +
>>>> + dev_dbg(ep_dev, "created port %s:%s\n",
>>>> + dev_name(&port->dev), dev_name(port->uport_dev));
>>>> + }
>>>> +
>>>> + return cxl_port_get_or_add_dport(port, dport_dev);
>>>> +}
>>>> +
>>>> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>>> struct device *uport_dev,
>>>> struct device *dport_dev)
>>>> {
>>>> struct device *dparent = grandparent(dport_dev);
>>>> struct cxl_dport *dport, *parent_dport;
>>>> - resource_size_t component_reg_phys;
>>>> int rc;
>>>>
>>>> if (is_cxl_host_bridge(dparent)) {
>>>> + struct cxl_port *port __free(put_cxl_port) =
>>>> + find_cxl_port_by_uport(uport_dev);
>>>> /*
>>>> * The iteration reached the topology root without finding the
>>>> * CXL-root 'cxl_port' on a previous iteration, fail for now to
>>>> * be re-probed after platform driver attaches.
>>>> */
>>>> - dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>>> - dev_name(dport_dev));
>>>> - return -ENXIO;
>>>> + if (!port) {
>>>> + dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>>> + dev_name(dport_dev));
>>>> + return -ENXIO;
>>>> + }
>>>> +
>>>> + /*
>>>> + * While the port is found, there may not be a dport associated
>>>> + * yet. Try to associate the dport to the port. On return success,
>>>> + * the iteration will restart with the dport now attached.
>>>> + */
>>>> + dport = devm_cxl_add_dport_by_uport(uport_dev,
>>>> + dport_dev);
>>>
>>> port is known here, use cxl_port_get_or_add_dport(port, dport_dev)
>>> instead. Remove devm_cxl_add_dport_by_uport().
>>
>> ok
>>
>>>
>>>> + if (IS_ERR(dport))
>>>> + return PTR_ERR(dport);
>>>> +
>>>> + return 0;
>>>> }
>>>>
>>>> struct cxl_port *parent_port __free(put_cxl_port) =
>>>> @@ -1584,36 +1766,12 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>>> return -EAGAIN;
>>>> }
>>>>
>>>> - /*
>>>> - * Definition with __free() here to keep the sequence of
>>>> - * dereferencing the device of the port before the parent_port releasing.
>>>> - */
>>>> - struct cxl_port *port __free(put_cxl_port) = NULL;
>>>> - scoped_guard(device, &parent_port->dev) {
>>>> - if (!parent_port->dev.driver) {
>>>> - dev_warn(&cxlmd->dev,
>>>> - "port %s:%s disabled, failed to enumerate CXL.mem\n",
>>>> - dev_name(&parent_port->dev), dev_name(uport_dev));
>>>> - return -ENXIO;
>>>> - }
>>>> + dport = devm_cxl_create_or_extend_port(&cxlmd->dev, parent_port,
>>>> + parent_dport, uport_dev,
>>>> + dport_dev);
>>>
>>> You expand add_port_attach_ep() here. This function was originally
>>> called if there is no *port* at all. Now, as the dport_dev is not yet
>>> registered, the port may already exist, but it is not found since the
>>> dport_dev is not yet registered and add_port_attach_ep() is called now
>>> even if the port exists. I think we should move that dport_dev
>>> registration a level higher to devm_cxl_enumerate_ports(). That might
>>> need a cleanup of the iterator and the removal of
>>> add_port_attach_ep().
>>
>> Yes. The new rev will move the dport registration a level up. No need to remove add_port_attach_ep(). devm_cxl_create_or_extend_port() will be devm_cxl_create_port().
>>
>>>
>>>> + if (IS_ERR(dport))
>>>> + return PTR_ERR(dport);
>>>>
>>>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>>>> - if (!port) {
>>>> - component_reg_phys = find_component_registers(uport_dev);
>>>> - port = devm_cxl_add_port(&parent_port->dev, uport_dev,
>>>> - component_reg_phys, parent_dport);
>>>> - if (IS_ERR(port))
>>>> - return PTR_ERR(port);
>>>> -
>>>> - /* retry find to pick up the new dport information */
>>>> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
>>>> - if (!port)
>>>> - return -ENXIO;
>>>> - }
>>>> - }
>>>> -
>>>> - dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
>>>> - dev_name(&port->dev), dev_name(port->uport_dev));
>>>> rc = cxl_add_ep(dport, &cxlmd->dev);
>>>> if (rc == -EBUSY) {
>>>> /*
>>>> @@ -1630,6 +1788,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> {
>>>> struct device *dev = &cxlmd->dev;
>>>> struct device *iter;
>>>> + int ports_need_create = 0;
>>>> int rc;
>>>>
>>>> /*
>>>> @@ -1654,6 +1813,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> struct device *uport_dev;
>>>> struct cxl_dport *dport;
>>>>
>>>> + ports_need_create++;
>>>> +
>>>> if (is_cxl_host_bridge(dport_dev))
>>>> return 0;
>>>>
>>>> @@ -1688,10 +1849,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>>
>>>> cxl_gpf_port_setup(dport);
>>>>
>>>> + ports_need_create--;
>>>> /* Any more ports to add between this one and the root? */
>>>> if (!dev_is_cxl_root_child(&port->dev))
>>>> continue;
>>>>
>>>> + /*
>>>> + * The 'ports_need_create' variable tracks a port being
>>>> + * created as it goes through this iterative loop. It's
>>>> + * incremented when it first enters the loop and decremented
>>>> + * when the port is found. If at the root of the hierarchy
>>>> + * and the variable is not 0, then it's missing a port
>>>> + * creation somewhere in the hierarchy and should restart.
>>>> + * For example in a setup where there's a PCI root port, a
>>>> + * switch, and an endpoint, it is possible to get to the
>>>> + * PCI root port and its creation, and the switch port is
>>>> + * still missing because the root port didn't exist. This
>>>> + * triggers a restart of the loop to create the switch port
>>>> + * now with a present root port.
>>>> + */
>>>> + if (ports_need_create)
>>>
>>> Uh, that becomes hard. Isn't the iterator much simpler:
>>>
>>> * Start the iter = endpoint.
>>>
>>> * Find first existing parent port up to the root.
>>>
>>> * If that is the direct parent of the endpoint, attach it to the
>>> parent (add dport etc.). Exit loop without errors.
>>>
>>> * Else, create port and attach it to the found parent port (including
>>> dport handling).
>>>
>>> * Fail on errors or retry otherwise.
>>>
>>> So, devm_cxl_enumerate_ports() should be reworked better, also address
>>> my other comments regarding add_port_attach_ep() and
>>> devm_cxl_create_or_extend_port().
>>
>> So I reworked this whole path a bit. Maybe not exactly what you are envisioning here but it is a lot cleaner. You can take a look at the next rev.
>>
>>>
>>>> + goto retry;
>>>> +
>>>> return 0;
>>>> }
>>>>
>>>> @@ -1700,8 +1879,10 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>>>> if (rc == -EAGAIN)
>>>> continue;
>>>> /* failed to add ep or port */
>>>> - if (rc)
>>>> + if (rc < 0)
>>>> return rc;
>>>> +
>>>> + ports_need_create = 0;
>>>> /* port added, new descendants possible, start over */
>>>> goto retry;
>>>> }
>>>> @@ -1733,14 +1914,16 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
>>>> device_lock_assert(&port->dev);
>>>>
>>>> if (xa_empty(&port->dports))
>>>> - return -EINVAL;
>>>> + return 0;
>>>>
>>>> guard(rwsem_write)(&cxl_rwsem.region);
>>>> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
>>>> struct cxl_dport *dport = find_dport(port, cxld->target_map[i]);
>>>>
>>>> - if (!dport)
>>>> - return -ENXIO;
>>>> + if (!dport) {
>>>> + /* dport may be activated later */
>>>> + continue;
>>>> + }
>>>> cxlsd->target[i] = dport;
>>>> }
>>>
>>> Should that be dropped entirely as the target setup is done somewhere
>>> else?
>>>
>> No. This is still needed for root ports.
>
> Root ports are dport_devs and don't have a target list, host bridges
> have. I did not follow the entire code flow, but shouldn't
> cxl_decoders_dport_update() handle that?
Sorry I see where I mislead you. I should have said cxl_root(s) and not the root ports under the host bridge. __cxl_parse_cfmws() calls cxl_decoder_add() for the root decoder and that's where decoder_populate_targets() needs to populate the targets and cxl_decoders_dport_update() does not hit the root decoder.
DJ
next prev parent reply other threads:[~2025-09-02 15:40 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-14 22:21 [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
2025-08-14 22:21 ` [PATCH v8 01/11] cxl: Add helper to detect top of CXL device topology Dave Jiang
2025-08-15 12:50 ` Jonathan Cameron
2025-08-20 13:51 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 02/11] cxl: Add helper to reap dport Dave Jiang
2025-08-20 14:10 ` Robert Richter
2025-08-20 20:54 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 03/11] cxl: Add a cached copy of target_map to cxl_decoder Dave Jiang
2025-08-15 12:52 ` Jonathan Cameron
2025-08-20 14:17 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 04/11] cxl: Move port register setup to first dport appear Dave Jiang
2025-08-15 12:57 ` Jonathan Cameron
2025-08-21 11:57 ` Robert Richter
2025-08-22 10:37 ` Robert Richter
2025-08-14 22:21 ` [PATCH v8 05/11] cxl: Defer dport allocation for switch ports Dave Jiang
2025-08-20 12:41 ` Robert Richter
2025-08-20 15:20 ` Dave Jiang
2025-08-22 9:59 ` Robert Richter
2025-08-22 15:52 ` Dave Jiang
2025-08-26 7:51 ` Robert Richter
2025-08-27 17:05 ` Dave Jiang
2025-08-29 15:02 ` Robert Richter
2025-08-29 17:23 ` Dave Jiang
2025-09-01 14:48 ` Robert Richter
2025-09-02 15:58 ` Dave Jiang
2025-08-27 21:15 ` Dave Jiang
2025-09-01 17:29 ` Robert Richter
2025-09-02 15:40 ` Dave Jiang [this message]
2025-09-03 18:21 ` Dave Jiang
2025-08-27 21:37 ` Dave Jiang
2025-08-14 22:21 ` [PATCH v8 06/11] cxl/test: Add cxl_test support for cxl_port_get_possible_dports() Dave Jiang
2025-08-14 22:21 ` [PATCH v8 07/11] cxl/test: Add mock version of devm_cxl_add_dport_by_dev() Dave Jiang
2025-08-14 22:21 ` [PATCH v8 08/11] cxl/test: Add support to cxl_test for decoder enumeration mock functions Dave Jiang
2025-08-14 22:21 ` [PATCH v8 09/11] cxl/test: Setup target_map for cxl_test decoder initialization Dave Jiang
2025-08-15 13:04 ` Jonathan Cameron
2025-08-14 22:21 ` [PATCH v8 10/11] cxl: Change sslbis handler to only handle single dport Dave Jiang
2025-08-14 22:21 ` [PATCH v8 11/11] tools/testing/cxl: Add decoder save/restore support Dave Jiang
2025-08-15 13:15 ` Jonathan Cameron
2025-08-19 9:39 ` [PATCH v8 00/11] cxl: Delay HB port and switch dport probing until endpoint dev probe Robert Richter
2025-08-19 15:41 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=12b3c97a-9463-4c95-8757-e306d4aea14d@intel.com \
--to=dave.jiang@intel.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=jonathan.cameron@huawei.com \
--cc=linux-cxl@vger.kernel.org \
--cc=rrichter@amd.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.