From: Chen Pei <cp0613@linux.alibaba.com>
To: icheng@nvidia.com
Cc: alison.schofield@intel.com, cp0613@linux.alibaba.com,
dave.jiang@intel.com, dave@stgolabs.net, djbw@kernel.org,
guoren@kernel.org, ira.weiny@intel.com, jic23@kernel.org,
linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
vishal.l.verma@intel.com
Subject: Re: [PATCH] cxl/acpi: Defer probe when ACPI0016 PCI root bridge is not ready
Date: Fri, 15 May 2026 21:46:49 +0800 [thread overview]
Message-ID: <20260515134652.15486-1-cp0613@linux.alibaba.com> (raw)
In-Reply-To: <agV2e28XjW7yOUwH@MWDK4CY14F>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 6567 bytes --]
On Thu, 14 May 2026 15:31:11 +0800, Richard Cheng wrote:
> > On some platforms (e.g., RISC-V and ARM64) that use the generic
> > pci_acpi_scan_root() implementation, cxl_acpi_probe may run before
> > acpi_pci_root driver has bound to ACPI0016 (CXL host bridge) devices.
> > In this case, acpi_pci_find_root() returns NULL, causing
> > to_cxl_host_bridge() to skip the device silently. This results in
> > incomplete CXL port enumeration on first boot.
> >
> > Fix this by detecting the case where an ACPI0016 device exists but its
> > PCI root bridge is not yet ready, and returning -EPROBE_DEFER to trigger
> > a deferred probe retry.
> >
> > Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
> > ---
> > drivers/cxl/acpi.c | 26 ++++++++++++++++++++++++--
> > 1 file changed, 24 insertions(+), 2 deletions(-)
> >
>
> Hi Chen Pei,
>
> Thanks for the patch.
> I have a few questions and suggestions regarding to your changes.
>
> First of all I would like in which scenario did you encounter the bug?
> Any specific CONFIG options and the devices ? what's the error log ?
>
> It would be nice if you can attach it for us.
Hi Richard,
Thanks for the review.
I'm currently working on bringing up CXL support on the RISC-V QEMU
virt platform with ACPI (EDK2 UEFI firmware). This is still in the
early debugging/enabling stage.
During testing, I found that cxl_acpi (ACPI0017) probes before
acpi_pci_root has bound to the ACPI0016 (CXL host bridge) device.
RISC-V uses the generic pci_acpi_scan_root() implementation, where
the probe ordering of acpi_pci_root relative to cxl_acpi is not
guaranteed.
On x86, acpi_pci_root uses subsys_initcall and binds very early,
so this race does not manifest there.
This is a silent failure (no explicit error log), which makes it
particularly hard to diagnose. When cxl_acpi probes before
acpi_pci_root has bound the ACPI0016 device:
1. acpi_pci_find_root() returns NULL in to_cxl_host_bridge()
2. to_cxl_host_bridge() returns NULL
2. Both add_host_bridge_dport() and add_host_bridge_uport() return 0
(not an error), silently skipping the host bridge
3. cxl_acpi_probe() returns success, but the CXL port topology is
incomplete — no dports or uports are registered
The observable result after boot:
# memdev is visible but decoder hierarchy is missing
$ cxl list -M
[
{
"memdev":"mem0",
...
}
]
# No decoders or ports registered for the host bridge
$ cxl list -BDP
[]
# Workaround: manually unbind/bind triggers re-probe after
# acpi_pci_root is ready, and CXL topology enumerates correctly
$ echo ACPI0017:00 > /sys/bus/platform/drivers/cxl_acpi/unbind
$ echo ACPI0017:00 > /sys/bus/platform/drivers/cxl_acpi/bind
# After re-probe, full topology is available and CXL memory
# can be enabled/onlined successfully
$ cxl enable-memdev mem0
$ cxl create-region -m -t ram -d decoder0.0 -w 1 mem0 -s 4G
$ daxctl online-memory dax0.0
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 127537628817..9952d0cff903 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -631,8 +631,21 @@ static int add_host_bridge_dport(struct device *match, void *arg)
> > struct acpi_pci_root *pci_root;
> > struct cxl_port *root_port = arg;
> > struct device *host = root_port->dev.parent;
> > - struct acpi_device *hb = to_cxl_host_bridge(host, match);
> > + struct acpi_device *adev = to_acpi_device(match);
> > + struct acpi_device *hb;
> >
> > + /*
> > + * If this is an ACPI0016 device but acpi_pci_find_root() hasn't
> > + * found the PCI root yet (driver not probed), defer the probe
> > + * to allow acpi_pci_root to bind first.
> > + */
> > + if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0 &&
> > + !acpi_pci_find_root(adev->handle)) {
> > + dev_dbg(host, "deferring probe, ACPI0016 PCI root not ready\n");
> > + return -EPROBE_DEFER;
> > + }
>
> What about strncpy() here since we already know we're comparing against "ACPI0016" ?
> At the same time, why not just use "acpi_dev_hid_match()" ? it's widely used across
> numerous files.
Good point, thanks. I will switch to the acpi_dev_hid_match() in v2.
> > +
> > + hb = to_cxl_host_bridge(host, match);
> > if (!hb)
> > return 0;
> >
> > @@ -688,7 +701,8 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > {
> > struct cxl_port *root_port = arg;
> > struct device *host = root_port->dev.parent;
> > - struct acpi_device *hb = to_cxl_host_bridge(host, match);
> > + struct acpi_device *adev = to_acpi_device(match);
> > + struct acpi_device *hb;
> > struct acpi_pci_root *pci_root;
> > struct cxl_dport *dport;
> > struct cxl_port *port;
> > @@ -697,6 +711,14 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > resource_size_t component_reg_phys;
> > int rc;
> >
> > + /* Same deferral check as in add_host_bridge_dport() */
> > + if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0 &&
> > + !acpi_pci_find_root(adev->handle)) {
> > + dev_dbg(host, "deferring probe, ACPI0016 PCI root not ready\n");
> > + return -EPROBE_DEFER;
> > + }
> > +
> > + hb = to_cxl_host_bridge(host, match);
> > if (!hb)
> > return 0;
> >
> > --
> > 2.50.1
> >
> >
>
> These 2 checks are basically the same, can we put it in a static inline helper or
> a macro if possible? something like the following might be better
>
> ```
> static int cxl_acpi_defer_host_bridge(struct device *host,
> struct acpi_device *adev)
> {
> if (acpi_dev_hid_match(adev, "ACPI0016") &&
> !acpi_pci_find_root(adev->handle)) {
> dev_dbg(host, "deferring probe, ACPI0016 PCI root not ready\n");
> return -EPROBE_DEFER;
> }
> return 0;
> }
> ```
> and use it in your code like
>
> ```
> int rc = cxl_acpi_defer_host_bridge(host, adev);
> if (rc)
> return rc;
> ```
Agreed, will extract it into a helper in v2.
> Last but not least, have you run the kselftests of CXL ? some mock bridges
> are platform devices, not ACPI devices, you are using "to_acpi_device(match)", this
> is not a safe runtime check when "match" is a platform_device, the code will read the memory
> layout wrongly.
Good catch, I haven't run the CXL kselftests. You're right that the
mock bridges are platform devices and unconditionally calling
to_acpi_device(match) would be unsafe there.
I'll fix this in v2 by adding a dev_is_platform() guard in the
helper, so it only applies to real ACPI devices. I'll also run
the CXL kselftests to validate before sending v2.
Thanks,
Pei
next prev parent reply other threads:[~2026-05-15 13:47 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-14 2:32 [PATCH] cxl/acpi: Defer probe when ACPI0016 PCI root bridge is not ready Chen Pei
2026-05-14 7:31 ` Richard Cheng
2026-05-15 13:46 ` Chen Pei [this message]
2026-05-14 17:10 ` Dave Jiang
2026-05-15 13:53 ` Chen Pei
2026-05-14 17:19 ` Alison Schofield
2026-05-15 13:56 ` Chen Pei
2026-05-15 15:38 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260515134652.15486-1-cp0613@linux.alibaba.com \
--to=cp0613@linux.alibaba.com \
--cc=alison.schofield@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=djbw@kernel.org \
--cc=guoren@kernel.org \
--cc=icheng@nvidia.com \
--cc=ira.weiny@intel.com \
--cc=jic23@kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox