From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <dave@stgolabs.net>,
<alison.schofield@intel.com>, <vishal.l.verma@intel.com>,
<ira.weiny@intel.com>, <dan.j.williams@intel.com>,
<rrichter@amd.com>
Subject: Re: [PATCH v4 4/9] cxl: Defer dport allocation for switch ports
Date: Tue, 1 Jul 2025 12:10:22 +0100 [thread overview]
Message-ID: <20250701121022.000035d1@huawei.com> (raw)
In-Reply-To: <20250624213916.1665889-5-dave.jiang@intel.com>
On Tue, 24 Jun 2025 14:39:11 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> The current implementation enumerates the dports during the cxl_port
> driver probe. Without an endpoint connected, the dport may not be
> active during port probe. This scheme may prevent a valid hardware
> dport id to be retrieved and MMIO registers to be read when an endpoint
> is hot-plugged. Move the dport allocation and setup to behind memdev
> probe so the endpoint is guaranteed to be connected.
>
> In the original enumeration behavior, there are 3 phases (or 2 if no CXL
> switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
> ACPI0017.N device. Through that it enumerate downstream ports composed
> of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
> use add_host_bridge_uport() to create the ports that enumerates the PCI
> RPs as the dports of these ports. Every time a port is created, the port
> driver is attached and drv->probe() is called and
> devm_cxl_port_enumerate_dports() is envoked to enumerate and probe
> the dports.
>
> The second phase is if there are any CXL switches. When the pci endpoint
> device driver (cxl_pci) calls probe, it will add a mem device and triggers
> the cxl_mem->probe(). cxl_mem->probe() calls devm_cxl_enumerate_ports()
> and attempts to discovery and create all the ports represent CXL switches.
> During this phase, a port is created per switch and the attached dports
> are also enumerated and probed.
>
> The last phase is creating endpoint port which happens for all endpoint
> devices.
>
> In this commit, the port create and its dport probing in cxl_acpi is not
> changed. That will be handled in a different patch later on. The behavior
> change is only for CXL switch ports. Only the dport that is part of the
> path for an endpoint device to the RP will be probed. This happens
> naturally by the code walking up the device hierarchy and identifying the
> upstream device and the downstream device.
>
> There are two points where the interception of dport creation happens
> during the devm_cxl_enumerate_ports() path. The first location is right
> before the function calls add_port_attch_ep() where it does the dport
attach_ep()
> allocation for the RP. Once the dport is allocated, the iteration path
> is reset to the beginning to try again. The second location happens
> in add_port_attach_ep() after the location where either the port is
> discovered or allocated new if it does not exist.
>
> Locking of port device during __cxl_port_add_dport() protects modifications
> against the port and its dports while multiple endpoints can be probing at
> the same time and the same port is being modified concurrently.
>
> While the decoders are allocated during the port driver probe,
> The decoders must also be updated since previously it's all done when all
> the dports are setup and now every time a dport is setup per endpoint, the
> switch target listing need to be updated with new dport. A
> guard(rwsem_write) is used to update decoder targets. This is similar to
> when decoder_populate_target() is called and the decoder programming
> must be protected.
>
> Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> v4:
> - Allocate dport when they are found via endpoint probe. (Robert)
A few comments inline.
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index b50551601c2e..6e9b5ab69de0 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -24,6 +24,44 @@ static unsigned short media_ready_timeout = 60;
> module_param(media_ready_timeout, ushort, 0644);
> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
> struct cxl_walk_context {
> struct pci_bus *bus;
> struct cxl_port *port;
> @@ -1169,3 +1207,53 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
>
> return 0;
> }
> +
> +static int match_dport(struct pci_dev *pdev, void *data)
> +{
> + struct cxl_walk_context *ctx = data;
> + int type = pci_pcie_type(pdev);
> +
> + if (pdev->bus != ctx->bus)
> + return 0;
> + if (!pci_is_pcie(pdev))
> + return 0;
> + if (type != ctx->type)
I'm lost. How would bus have matched, but type not?
Add a comment.
> + return 0;
> +
> + ctx->count++;
> + return 0;
> +}
> +
> +int cxl_port_get_total_dports(struct cxl_port *port)
See comment below on renaming this given it doesn't return
the number.
> +{
> + struct pci_bus *bus = cxl_port_to_pci_bus(port);
> + struct cxl_walk_context ctx;
> + int type;
> +
> + if (!bus) {
> + dev_err(&port->dev, "No PCI bus found for port %s\n",
> + dev_name(&port->dev));
> + return -ENXIO;
> + }
> +
> + if (pci_is_root_bus(bus))
> + type = PCI_EXP_TYPE_ROOT_PORT;
> + else
> + type = PCI_EXP_TYPE_DOWNSTREAM;
> +
> + ctx = (struct cxl_walk_context) {
> + .bus = bus,
> + .type = type,
> + };
> + pci_walk_bus(bus, match_dport, &ctx);
> +
> + port->total_dports = ctx.count;
> + if (port->total_dports == 0) {
> + dev_warn(&port->dev, "No dports found for port %s on bus %s\n",
> + dev_name(&port->dev), bus->name);
> + return -ENXIO;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_total_dports, "CXL");
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 9691da831224..3872638c439b 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1551,6 +1551,134 @@ static resource_size_t find_component_registers(struct device *dev)
> return map.resource;
> }
> +
> +static int devm_cxl_add_dport_by_uport(struct device *uport_dev,
> + struct device *dport_dev,
> + struct cxl_dport **dport)
> +{
> + *dport = NULL;
> + struct cxl_port *port __free(put_cxl_port) =
> + find_cxl_port_by_uport(uport_dev);
> + if (!port)
> + return -ENODEV;
> +
> + guard(device)(&port->dev);
> + return devm_cxl_port_add_dport(port, dport_dev, dport);
> +}
> +
> +static int devm_cxl_add_dport_for_rp(struct device *uport_dev,
> + struct device *dport_dev,
> + struct cxl_dport **dport)
That this sets *dport = NULL lead me to get confused at the caller.
Perhaps instead these could return a dport *, NULL or ERR_PTR()
instead?
> +{
> + struct device *dparent = grandparent(dport_dev);
> +
> + *dport = NULL;
> + /* Only go down this path if we are at the root port */
> + if (!is_cxl_hierarchy_head(dparent))
> + return 0;
> +
> + return devm_cxl_add_dport_by_uport(uport_dev, dport_dev, dport);
> +}
> +
> static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> struct device *uport_dev,
> struct device *dport_dev)
> @@ -1584,6 +1712,8 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> */
> struct cxl_port *port __free(put_cxl_port) = NULL;
> scoped_guard(device, &parent_port->dev) {
> + struct cxl_dport *new_dport;
> +
> if (!parent_port->dev.driver) {
> dev_warn(&cxlmd->dev,
> "port %s:%s disabled, failed to enumerate CXL.mem\n",
> @@ -1592,6 +1722,8 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> }
>
> port = find_cxl_port_at(parent_port, dport_dev, &dport);
> + if (!port)
> + port = find_cxl_port_by_uport(uport_dev);
> if (!port) {
> component_reg_phys = find_component_registers(uport_dev);
> port = devm_cxl_add_port(&parent_port->dev, uport_dev,
> @@ -1599,11 +1731,17 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> if (IS_ERR(port))
> return PTR_ERR(port);
>
> - /* retry find to pick up the new dport information */
> - port = find_cxl_port_at(parent_port, dport_dev, &dport);
> - if (!port)
> - return -ENXIO;
> + get_device(&port->dev);
Not immediately obvious to me why a reference needs to be grabbed here. PErhaps
a comment?
> }
> +
> + guard(device)(&port->dev);
> + /* Existing dport is checked with no action done if exists */
> + rc = devm_cxl_port_add_dport(port, dport_dev, &new_dport);
Similar comment to below. Would prefer
new_dport = devm_cxl_port_add_dport(port, dprot_dev);
if (IS_ERR(new_dport))
return PTR_ERR(new_dport);
Similar approach already used for devm_cxl_add_port()
> + if (rc)
> + return rc;
> +
> + if (new_dport)
> + dport = new_dport;
> }
>
> dev_dbg(&cxlmd->dev, "add to new port %s:%s\n",
> @@ -1620,11 +1758,13 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> return rc;
> }
>
> +#define CXL_ITER_LEVEL_SWITCH 1
> +
> int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> {
> struct device *dev = &cxlmd->dev;
> struct device *iter;
> - int rc;
> + int rc, i;
>
> /*
> * Skip intermediate port enumeration in the RCH case, there
> @@ -1643,7 +1783,7 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> * attempt fails.
> */
> retry:
> - for (iter = dev; iter; iter = grandparent(iter)) {
> + for (i = 0, iter = dev; iter; i++, iter = grandparent(iter)) {
> struct device *dport_dev = grandparent(iter);
> struct device *uport_dev;
> struct cxl_dport *dport;
> @@ -1686,9 +1826,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> if (!dev_is_cxl_root_child(&port->dev))
> continue;
>
> + /*
> + * This is a corner case where the rootport is setup but
> + * the switch dport is not. It needs to go back to the
> + * beginning to setup the switch port.
> + */
> + if (i >= CXL_ITER_LEVEL_SWITCH) {
> + struct cxl_port *pport __free(put_cxl_port) =
> + cxl_mem_find_port(cxlmd, &dport);
> + if (!port)
> + goto retry;
> + }
> +
> return 0;
> }
>
> + rc = devm_cxl_add_dport_for_rp(uport_dev, dport_dev, &dport);
As above, I'd prefer to see
dport = devm_cxl_add_dport_for_rp(uport_dev, dport_dev);
if (IS_ERR(dport))
return PTR_ERR(dport);
if (dport)
goto retry;
That way it's explicit that dport is going to be updated in all cases within
devm_cxl_add_dport_for_rp()
> + if (rc)
> + return rc;
> + /* Added a dport, restart enumeration */
> + if (dport)
> + goto retry;
> +
> rc = add_port_attach_ep(cxlmd, uport_dev, dport_dev);
> /* port missing, try to add parent */
> if (rc == -EAGAIN)
> @@ -1727,16 +1886,19 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
> return 0;
>
> device_lock_assert(&port->dev);
> + memcpy(cxlsd->target_map, target_map, sizeof(cxlsd->target_map));
>
> if (xa_empty(&port->dports))
> - return -EINVAL;
> + return 0;
>
> guard(rwsem_write)(&cxl_region_rwsem);
> for (i = 0; i < cxlsd->cxld.interleave_ways; i++) {
> struct cxl_dport *dport = find_dport(port, target_map[i]);
>
> - if (!dport)
> - return -ENXIO;
> + if (!dport) {
> + /* dport may be activated later */
> + continue;
> + }
> cxlsd->target[i] = dport;
> }
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index fe4b593331da..354e134793c3 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -65,12 +65,10 @@ static int cxl_switch_port_probe(struct cxl_port *port)
> /* Cache the data early to ensure is_visible() works */
> read_cdat_data(port);
>
> - rc = devm_cxl_port_enumerate_dports(port);
> - if (rc < 0)
> + rc = cxl_port_get_total_dports(port);
A function called cxl_port_get_total_dports() to my thinking should return
the number of them not fill in a value embedded in port.
cxl_port_update_total_dports() perhaps?
> + if (rc)
> return rc;
>
> - cxl_switch_parse_cdat(port);
> -
> cxlhdm = devm_cxl_setup_hdm(port, NULL);
> if (!IS_ERR(cxlhdm))
> return devm_cxl_enumerate_decoders(cxlhdm, NULL);
> @@ -80,7 +78,7 @@ static int cxl_switch_port_probe(struct cxl_port *port)
> return PTR_ERR(cxlhdm);
> }
>
> - if (rc == 1) {
> + if (port->total_dports == 1) {
> dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
> return devm_cxl_add_passthrough_decoder(port);
> }
next prev parent reply other threads:[~2025-07-01 11:10 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-24 21:39 [PATCH v4 0/9] cxl: Delay HB port and switch dport probing until endpoint dev probe Dave Jiang
2025-06-24 21:39 ` [PATCH v4 1/9] cxl/region: Add decoder check to check_commit_order() Dave Jiang
2025-06-24 21:39 ` [PATCH v4 2/9] cxl: Add helper to detect top of CXL device topology Dave Jiang
2025-06-24 21:39 ` [PATCH v4 3/9] cxl: Add helper to reap dport Dave Jiang
2025-07-01 10:31 ` Jonathan Cameron
2025-06-24 21:39 ` [PATCH v4 4/9] cxl: Defer dport allocation for switch ports Dave Jiang
2025-07-01 11:10 ` Jonathan Cameron [this message]
2025-07-01 20:18 ` Dave Jiang
2025-07-02 10:18 ` Jonathan Cameron
2025-06-24 21:39 ` [PATCH v4 5/9] cxl/test: Add cxl_test support for new dport allocation scheme Dave Jiang
2025-07-01 11:12 ` Jonathan Cameron
2025-06-24 21:39 ` [PATCH v4 6/9] cxl/test: Add mock version of devm_cxl_add_dport_by_dev() Dave Jiang
2025-07-01 11:17 ` Jonathan Cameron
2025-06-24 21:39 ` [PATCH v4 7/9] cxl: Change sslbis handler to only handle single dport Dave Jiang
2025-06-24 21:39 ` [PATCH v4 8/9] cxl: Create an xarray to tie a host bridge to the cxl_root Dave Jiang
2025-07-01 11:20 ` Jonathan Cameron
2025-06-24 21:39 ` [PATCH v4 9/9] cxl: Move enumeration of hostbridge ports to the memdev probe path Dave Jiang
2025-07-01 11:32 ` Jonathan Cameron
2025-07-03 23:22 ` Dave Jiang
2025-07-04 9:39 ` Jonathan Cameron
2025-07-07 18:35 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250701121022.000035d1@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=rrichter@amd.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.