From: sashiko-bot@kernel.org
To: "Caleb James DeLisle" <cjd@cjdns.fr>
Cc: linux-pci@vger.kernel.org, devicetree@vger.kernel.org,
robh@kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH v8 3/3] PCI: mediatek: Add support for EcoNet EN7528 SoC
Date: Wed, 20 May 2026 19:58:05 +0000 [thread overview]
Message-ID: <20260520195806.4726A1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260520183827.908243-4-cjd@cjdns.fr>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [High] Link retraining is performed after device enumeration and driver binding, causing a severe race condition.
- [High] Iterating the PCI bus devices list without holding the pci_bus_sem read lock.
- [Medium] Early return in mtk_pcie_retrain() aborts retraining of all remaining root ports if a single port fails.
Pre-existing issues:
- [High] Freeing port in error paths causes a Use-After-Free because the active IRQ domain's host_data still points to it.
--
commit c1c3bb9bcba6bb8a070037578912a2f02c381591
Author: Caleb James DeLisle <cjd@cjdns.fr>
Subject: PCI: mediatek: Add support for EcoNet EN7528 SoC
This commit adds support for the PCIe controller present on the EcoNet
EN7528 (and EN751221) SoCs. These SoCs feature a mix of Gen1 and Gen2 ports,
and require a manual link retraining after startup for Gen2 ports to function.
> diff --git a/drivers/pci/controller/pcie-mediatek.c b/drivers/pci/controller/pcie-mediatek.c
> index c503fbd774d09..4e4d4b1559f71 100644
> --- a/drivers/pci/controller/pcie-mediatek.c
> +++ b/drivers/pci/controller/pcie-mediatek.c
[ ... ]
> @@ -755,6 +766,132 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
> return 0;
> }
>
> +static int mtk_pcie_startup_port_en7528(struct mtk_pcie_port *port)
> +{
[ ... ]
> + /* 100ms timeout value should be enough for Gen1/2 training */
> + err = regmap_read_poll_timeout(pcie->cfg, EN7528_LINKUP_REG, val,
> + !!(val & link_mask), 20,
> + PCI_PM_D3COLD_WAIT * USEC_PER_MSEC);
> + if (err) {
> + dev_err(pcie->dev, "EN7528: port%d link timeout\n", port->slot);
> + return -ETIMEDOUT;
> + }
[Severity: High]
This is a pre-existing issue, but returning an error here can cause a
use-after-free crash.
When a port fails to start up (for example, if the slot is empty and link
training times out), the caller frees the port structure by calling
mtk_pcie_port_free(). However, mtk_pcie_port_free() does not call
irq_domain_remove().
Since mtk_pcie_init_irq_domain() sets the newly allocated port structure
as the IRQ domain's host_data, the domain remains registered with a
dangling pointer.
Could we make sure the IRQ domain is properly cleaned up in error paths?
[ ... ]
> +static int mtk_pcie_retrain(struct device *dev, struct pci_host_bridge *host)
> +{
> + struct pci_dev *rp;
> + int ret = -ENOENT;
> + u16 lnksta = 0;
> + u32 speed;
> +
> + /* Should already have been warned about during startup_port */
> + if (!IS_BUILTIN(CONFIG_PCIE_MEDIATEK))
> + return 0;
> +
> + for_each_pci_bridge(rp, host->bus) {
[Severity: High]
Does iterating the PCI bus devices list here require holding the pci_bus_sem
read lock?
Since mtk_pcie_retrain() is called after pci_host_probe() completes, the PCI
bus is live and devices are exposed to the system. Concurrent operations like
sysfs hotplug events could modify the host->bus->devices list during traversal.
Should we hold down_read(&pci_bus_sem) or use pci_walk_bus() to avoid a race
condition?
> + if (pci_pcie_type(rp) != PCI_EXP_TYPE_ROOT_PORT)
> + continue;
> +
> +#if IS_BUILTIN(CONFIG_PCIE_MEDIATEK)
> + ret = pcie_retrain_link(rp, true);
> +#endif
> +
> + if (ret)
> + return dev_err_probe(&rp->dev, ret,
> + "failed to retrain port\n");
[Severity: Medium]
Will this early return abort the retraining process for all remaining root
ports?
If pcie_retrain_link() fails on the first port, we immediately break out of
the loop.
Should this log the error and use continue instead, so that a failure on one
independent slot does not prevent the others from retraining successfully?
> +
> + pcie_capability_read_word(rp, PCI_EXP_LNKSTA, &lnksta);
> + speed = lnksta & PCI_EXP_LNKSTA_CLS;
> +
> + pci_info(rp, "link retrained, speed %s\n",
> + pci_speed_string(pcie_link_speed[speed]));
> +
> + }
> +
> + return 0;
> +}
[ ... ]
> @@ -1159,6 +1296,13 @@ static int mtk_pcie_probe(struct platform_device *pdev)
> if (err)
> goto put_resources;
>
> + /*
> + * Ignore error because pci_host_probe() was already called, and in any
> + * case it is possible that the port will still work as Gen1.
> + */
> + if (pcie->soc->quirks & MTK_PCIE_RETRAIN)
> + mtk_pcie_retrain(dev, host);
> +
[Severity: High]
Can retraining the link at this stage cause issues with actively bound endpoint
drivers?
By the time pci_host_probe() returns, devices have already been discovered,
exposed to the system, and their drivers bound. Retraining the PCIe link forces
it into recovery, which could stall transactions or temporarily drop the link.
If an endpoint driver (like NVMe or Wi-Fi) is actively performing MMIO or DMA,
could this cause unhandled exceptions, AER errors, or initialization failures?
Would it be safer to retrain before devices are added to the system (e.g.,
before pci_bus_add_devices())?
> return 0;
>
> put_resources:
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260520183827.908243-1-cjd@cjdns.fr?part=3
next prev parent reply other threads:[~2026-05-20 19:58 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-20 18:38 [PATCH v8 0/3] Add EcoNet EN7528 (and EN751221) PCIe support Caleb James DeLisle
2026-05-20 18:38 ` [PATCH v8 1/3] PCI: mediatek: Use actual physical address instead of virt_to_phys() Caleb James DeLisle
2026-05-20 18:55 ` Bjorn Helgaas
2026-05-20 19:17 ` Caleb James DeLisle
2026-05-20 19:59 ` Bjorn Helgaas
2026-05-21 5:14 ` Manivannan Sadhasivam
2026-05-22 22:43 ` Bjorn Helgaas
2026-05-23 14:43 ` Manivannan Sadhasivam
2026-05-20 18:38 ` [PATCH v8 2/3] dt-bindings: PCI: mediatek: Add support for EcoNet EN7528 Caleb James DeLisle
2026-05-20 18:38 ` [PATCH v8 3/3] PCI: mediatek: Add support for EcoNet EN7528 SoC Caleb James DeLisle
2026-05-20 19:58 ` sashiko-bot [this message]
2026-05-21 7:33 ` Manivannan Sadhasivam
2026-05-21 8:52 ` Caleb James DeLisle
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260520195806.4726A1F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=cjd@cjdns.fr \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=robh@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox