* [PATCH AUTOSEL 6.17] selftests: pci_endpoint: Skip IRQ test if IRQ is out of range.
[not found] <20251025160905.3857885-1-sashal@kernel.org>
@ 2025-10-25 15:54 ` Sasha Levin
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17-6.12] PCI: imx6: Enable the Vaux supply if available Sasha Levin
` (9 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:54 UTC (permalink / raw)
To: patches, stable
Cc: Christian Bruel, Manivannan Sadhasivam, Sasha Levin, kwilczynski,
linux-pci
From: Christian Bruel <christian.bruel@foss.st.com>
[ Upstream commit 106fc08b30a2ece49a251b053165a83d41d50fd0 ]
The pci_endpoint_test tests the entire MSI/MSI-X range, which generates
false errors on platforms that do not support the whole range.
Skip the test in such cases and report accordingly.
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
[mani: reworded description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250804170916.3212221-4-christian.bruel@foss.st.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
**Why This Fix Matters**
- Prevents false test failures: The tests iterate full MSI/MSI-X ranges
regardless of what was actually allocated on the device. When a vector
index is out of range, the kernel returns -EINVAL, which the old test
treated as a failure.
- Skips unsupported cases correctly: The new checks treat out-of-range
vectors as “not applicable” and mark the test as skipped, which
accurately reflects platform capability instead of reporting an error.
**What Changed (Selftests Only)**
- In `tools/testing/selftests/pci_endpoint/pci_endpoint_test.c:122` and
`:140`, after triggering an IRQ:
- MSI loop: `pci_ep_ioctl(PCITEST_MSI, i);` then `if (ret == -EINVAL)
SKIP(return, "MSI%d is disabled", i);` (`tools/testing/selftests/pci
_endpoint/pci_endpoint_test.c:123`–`:126`)
- MSI-X loop: `pci_ep_ioctl(PCITEST_MSIX, i);` then `if (ret ==
-EINVAL) SKIP(return, "MSI-X%d is disabled", i);` (`tools/testing/se
lftests/pci_endpoint/pci_endpoint_test.c:141`–`:144`)
- Uses existing kselftest skip mechanism (`SKIP(...)`) which is well-
established in the harness
(`tools/testing/selftests/kselftest_harness.h:110`–`:134`).
**Why -EINVAL Means “Out of Range” Here**
- The endpoint test driver queries the Linux IRQ number for a given
vector via `pci_irq_vector(pdev, msi_num - 1)`, and immediately
returns that error when negative
(`drivers/misc/pci_endpoint_test.c:441`–`:443`).
- `pci_irq_vector()` returns -EINVAL precisely when the vector index is
out of range/not allocated for the device
(`drivers/pci/msi/api.c:311`–`:324`), which happens when the device
supports fewer MSI/MSI-X vectors than the upper bound tested (MSI up
to 32, MSI-X up to 2048).
**Scope and Risk**
- Small, contained change; affects only kselftests (no runtime kernel
code).
- No API or architectural changes; just improves test correctness by
skipping unsupported cases.
- Mirrors existing skip behavior already used in the same test suite
(e.g., BAR test skips when disabled,
`tools/testing/selftests/pci_endpoint/pci_endpoint_test.c:67`–`:70`).
- No security impact.
**Stable Backport Criteria**
- Fixes a real issue that affects users of stable kernels running
selftests (false negatives on platforms with limited MSI/MSI-X
vectors).
- Minimal risk and fully confined to `tools/testing/selftests`.
- Does not introduce new features; aligns with stable policy for test
fixes.
Given the above, this is a good candidate for stable backport.
tools/testing/selftests/pci_endpoint/pci_endpoint_test.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/pci_endpoint/pci_endpoint_test.c b/tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
index da0db0e7c9693..cd9075444c32a 100644
--- a/tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
+++ b/tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
@@ -121,6 +121,8 @@ TEST_F(pci_ep_basic, MSI_TEST)
for (i = 1; i <= 32; i++) {
pci_ep_ioctl(PCITEST_MSI, i);
+ if (ret == -EINVAL)
+ SKIP(return, "MSI%d is disabled", i);
EXPECT_FALSE(ret) TH_LOG("Test failed for MSI%d", i);
}
}
@@ -137,6 +139,8 @@ TEST_F(pci_ep_basic, MSIX_TEST)
for (i = 1; i <= 2048; i++) {
pci_ep_ioctl(PCITEST_MSIX, i);
+ if (ret == -EINVAL)
+ SKIP(return, "MSI-X%d is disabled", i);
EXPECT_FALSE(ret) TH_LOG("Test failed for MSI-X%d", i);
}
}
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-6.12] PCI: imx6: Enable the Vaux supply if available
[not found] <20251025160905.3857885-1-sashal@kernel.org>
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17] selftests: pci_endpoint: Skip IRQ test if IRQ is out of range Sasha Levin
@ 2025-10-25 15:54 ` Sasha Levin
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17-6.12] PCI: Set up bridge resources earlier Sasha Levin
` (8 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:54 UTC (permalink / raw)
To: patches, stable
Cc: Richard Zhu, Manivannan Sadhasivam, Frank Li, Sasha Levin,
l.stach, shawnguo, linux-pci, linux-arm-kernel, imx
From: Richard Zhu <hongxing.zhu@nxp.com>
[ Upstream commit c221cbf8dc547eb8489152ac62ef103fede99545 ]
When the 3.3Vaux supply is present, fetch it at the probe time and keep it
enabled for the entire PCIe controller lifecycle so that the link can enter
L2 state and the devices can signal wakeup using either Beacon or WAKE#
mechanisms.
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[mani: reworded the subject, description and error message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250820022328.2143374-1-hongxing.zhu@nxp.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- The change enables an optional 3.3V auxiliary PCIe supply early in
probe and keeps it on for the controller’s lifetime via devm, which
directly addresses link low‑power (L2) entry and wake signaling
reliability. The new call
`devm_regulator_get_enable_optional(&pdev->dev, "vpcie3v3aux")` is
added in `drivers/pci/controller/dwc/pci-imx6.c:1744`. Errors other
than “not present” are surfaced using `dev_err_probe()`
(`drivers/pci/controller/dwc/pci-imx6.c:1745`), ensuring a clear,
fail‑fast behavior if hardware provides the supply but it cannot be
enabled.
- The helper used is a standard devres API that both acquires and
enables the regulator for the device lifetime, and automatically
disables it on device teardown. See the declaration in
`include/linux/regulator/consumer.h:166` and implementation in
`drivers/regulator/devres.c:110`. This matches the commit’s intent to
“keep it enabled for the entire PCIe controller lifecycle.”
- This is a contained, minimal change within the i.MX DesignWare PCIe
host driver probe path. It does not alter broader PCIe core behavior,
call flows, or add architectural changes. It only:
- Enables `vpcie3v3aux` if present (`drivers/pci/controller/dwc/pci-
imx6.c:1744`).
- Leaves existing supply handling intact for `vpcie` and `vph`
(`drivers/pci/controller/dwc/pci-imx6.c:1748` and
`drivers/pci/controller/dwc/pci-imx6.c:1755`).
- Keeps `vpcie` enable/disable at host init/exit unchanged
(`drivers/pci/controller/dwc/pci-imx6.c:1205`,
`drivers/pci/controller/dwc/pci-imx6.c:1280`,
`drivers/pci/controller/dwc/pci-imx6.c:1297`).
- The functional impact is to enable proper L2 and wake signaling
(Beacon or WAKE#) on boards that wire up 3.3Vaux. The driver already
carries context that AUX power matters; for example, i.MX95 has an
erratum requiring AUX power detect handling to exit L23 Ready
(`drivers/pci/controller/dwc/pci-imx6.c:245` comment explains AUX
power implications). Turning on AUX power when available is therefore
a correctness fix, not a feature.
- Risk/regression assessment:
- If the supply is not defined, nothing changes (uses “optional” API
and ignores `-ENODEV`).
- If the supply is defined but cannot be enabled, probe now fails
loudly; this surfaces real hardware/regulator issues instead of
running with broken low‑power/wake behavior.
- The pattern matches existing PCIe controller drivers that enable
optional PCIe supplies at probe with the same helper (e.g.,
`drivers/pci/controller/pcie-rcar-host.c:954`), indicating
established practice across subsystems.
- Binding-wise, the i.MX PCIe common binding allows additional
properties (`additionalProperties: true` in
`Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-
common.yaml:246`), so using `vpcie3v3aux-supply` is non‑disruptive
for DT validation. DT updates are optional and can follow
separately.
- Stable criteria fit:
- Fixes a real user-visible issue (L2 entry and wake signaling fail
without AUX).
- Small and self-contained change in a single driver.
- No architectural refactor or feature addition beyond enabling an
optional, already-described hardware supply.
- Uses existing, widely deployed APIs with minimal regression risk.
Given the clear bugfix nature, minimal scope, and alignment with
established patterns, this is a good candidate for stable backport.
drivers/pci/controller/dwc/pci-imx6.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index 80e48746bbaf6..db51e382a7cf3 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -1745,6 +1745,10 @@ static int imx_pcie_probe(struct platform_device *pdev)
pci->max_link_speed = 1;
of_property_read_u32(node, "fsl,max-link-speed", &pci->max_link_speed);
+ ret = devm_regulator_get_enable_optional(&pdev->dev, "vpcie3v3aux");
+ if (ret < 0 && ret != -ENODEV)
+ return dev_err_probe(dev, ret, "failed to enable Vaux supply\n");
+
imx_pcie->vpcie = devm_regulator_get_optional(&pdev->dev, "vpcie");
if (IS_ERR(imx_pcie->vpcie)) {
if (PTR_ERR(imx_pcie->vpcie) != -ENODEV)
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-6.12] PCI: Set up bridge resources earlier
[not found] <20251025160905.3857885-1-sashal@kernel.org>
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17] selftests: pci_endpoint: Skip IRQ test if IRQ is out of range Sasha Levin
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17-6.12] PCI: imx6: Enable the Vaux supply if available Sasha Levin
@ 2025-10-25 15:55 ` Sasha Levin
2025-10-27 12:39 ` Ilpo Järvinen
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17-6.1] s390/pci: Use pci_uevent_ers() in PCI recovery Sasha Levin
` (7 subsequent siblings)
10 siblings, 1 reply; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:55 UTC (permalink / raw)
To: patches, stable; +Cc: Ilpo Järvinen, Bjorn Helgaas, Sasha Levin, linux-pci
From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[ Upstream commit a43ac325c7cbbfe72bdf9178059b3ee9f5a2c7dd ]
Bridge windows are read twice from PCI Config Space, the first time from
pci_read_bridge_windows(), which does not set up the device's resources.
This causes problems down the road as child resources of the bridge cannot
check whether they reside within the bridge window or not.
Set up the bridge windows already in pci_read_bridge_windows().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250924134228.1663-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- `pci_alloc_child_bus()` copies each subordinate bus window to
`child->resource[i] = &bridge->resource[PCI_BRIDGE_RESOURCES+i]`
before any child is scanned (`drivers/pci/probe.c:1245-1248`). Without
this patch, those `bridge->resource[...]` entries are still zeroed;
the first call to `pci_read_bridge_windows()` only logged with a
stack-local `struct resource`.
- Child drivers often probe immediately (device_add → bus_probe_device)
while the bus scan is still in progress. During their
`pci_enable_device()` they hit `pci_claim_resource()`
(`drivers/pci/setup-res.c:154-169`), which calls
`pci_find_parent_resource()` to make sure the BAR sits inside an
upstream bridge window (`drivers/pci/pci.c:737-767`). Because
`pcibios_fixup_bus()` (the point where `pci_read_bridge_bases()` re-
reads the window into the real resource) runs only after the entire
bus has been scanned (`drivers/pci/probe.c:3091-3106`), the parent
window is still zero and the containment test fails. Result:
`pci_enable_device()` reports “can't claim; no compatible bridge
window” and the device never comes up behind that bridge.
- The patch fixes that race by writing the values directly into the
bridge’s real resources the first time we read config space
(`drivers/pci/probe.c:540-588`). When the subordinate bus is created,
the copied pointers already describe the real aperture, so drivers can
claim their BARs successfully even if they probe before the later
fixup.
- Behavioural risk is negligible: we still populate the same resource
structures with the same data, only earlier; the later
`pci_read_bridge_bases()` call simply refreshes them with `log=false`.
No new dependencies or behavioural changes outside this bug fix path,
making it safe for stable.
Natural next step: consider tagging with a `Fixes` reference upstream to
ease stable selection.
drivers/pci/probe.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index a56dfa1c9b6ff..0b8c82c610baa 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -524,10 +524,14 @@ static void pci_read_bridge_windows(struct pci_dev *bridge)
}
if (io) {
bridge->io_window = 1;
- pci_read_bridge_io(bridge, &res, true);
+ pci_read_bridge_io(bridge,
+ pci_resource_n(bridge, PCI_BRIDGE_IO_WINDOW),
+ true);
}
- pci_read_bridge_mmio(bridge, &res, true);
+ pci_read_bridge_mmio(bridge,
+ pci_resource_n(bridge, PCI_BRIDGE_MEM_WINDOW),
+ true);
/*
* DECchip 21050 pass 2 errata: the bridge may miss an address
@@ -565,7 +569,10 @@ static void pci_read_bridge_windows(struct pci_dev *bridge)
bridge->pref_64_window = 1;
}
- pci_read_bridge_mmio_pref(bridge, &res, true);
+ pci_read_bridge_mmio_pref(bridge,
+ pci_resource_n(bridge,
+ PCI_BRIDGE_PREF_MEM_WINDOW),
+ true);
}
void pci_read_bridge_bases(struct pci_bus *child)
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-6.1] s390/pci: Use pci_uevent_ers() in PCI recovery
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (2 preceding siblings ...)
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17-6.12] PCI: Set up bridge resources earlier Sasha Levin
@ 2025-10-25 15:55 ` Sasha Levin
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17-5.4] PCI: Disable MSI on RDC PCI to PCIe bridges Sasha Levin
` (6 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:55 UTC (permalink / raw)
To: patches, stable
Cc: Niklas Schnelle, Bjorn Helgaas, Lukas Wunner, Sasha Levin,
gerald.schaefer, linux-s390, linux-pci
From: Niklas Schnelle <schnelle@linux.ibm.com>
[ Upstream commit dab32f2576a39d5f54f3dbbbc718d92fa5109ce9 ]
Issue uevents on s390 during PCI recovery using pci_uevent_ers() as done by
EEH and AER PCIe recovery routines.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/20250807-add_err_uevents-v5-2-adf85b0620b0@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
Explanation
- What it changes
- arch/s390/pci/pci_event.c: Adds uevent notifications to the s390 PCI
error recovery path, mirroring existing AER/EEH behavior:
- After driver’s error_detected() returns, emit a recovery-begin
uevent: the patch inserts pci_uevent_ers(pdev, ers_res) in
zpci_event_notify_error_detected() (arch/s390/pci/pci_event.c:85).
- On recovery failure, emit FAILED_RECOVERY: the patch calls
pci_uevent_ers(pdev, PCI_ERS_RESULT_DISCONNECT) in
zpci_event_attempt_error_recovery()
(arch/s390/pci/pci_event.c:178).
- On recovery success, emit SUCCESSFUL_RECOVERY: the patch calls
pci_uevent_ers(pdev, PCI_ERS_RESULT_RECOVERED) after an optional
resume() in zpci_event_attempt_error_recovery()
(arch/s390/pci/pci_event.c:178).
- drivers/pci/pci-driver.c: Makes pci_uevent_ers() available when
building for s390 by expanding the ifdef to include CONFIG_S390
(drivers/pci/pci-driver.c:1591).
- include/linux/pci.h: Similarly expands the prototype guard to
include CONFIG_S390 so arch/s390 code can call it
(include/linux/pci.h:2768).
- Why it matters (user-visible impact)
- Brings s390 PCI recovery uevents to parity with AER and PowerPC EEH:
- pci_uevent_ers() already emits ERROR_EVENT=BEGIN_RECOVERY /
SUCCESSFUL_RECOVERY / FAILED_RECOVERY and DEVICE_ONLINE=0/1 to
userspace (drivers/pci/pci-driver.c:1591).
- AER and EEH already use these notifications; s390 previously did
not. This omission prevents userspace from reacting consistently
to PCI recovery events on s390 systems.
- The change enables standard userspace tooling (udev rules,
monitoring scripts) to receive the same recovery lifecycle events on
s390 that they already get elsewhere, which can help automate
remediation or logging. It’s a clear correctness/observability
improvement, not a feature addition that changes kernel behavior.
- Scope and risk assessment
- Small, contained change:
- Adds three calls to pci_uevent_ers() in the s390 recovery path; no
core recovery logic changed.
- Only adjusts preprocessor guards to build pci_uevent_ers() for
s390. No behavior change on non-s390.
- Consistent with established patterns:
- AER calls pci_uevent_ers() after error_detected() and on
resume/failure; this patch mirrors that sequencing for s390.
- Low regression risk:
- Only additional KOBJ_CHANGE uevents are emitted during rare error
recovery flows.
- Calls occur under the same locking pattern used in AER (s390 uses
pci_dev_lock/pci_dev_unlock, which wraps device_lock, consistent
with AER’s device_lock usage), so no new locking hazards.
- No ABI change; only adds uevents that other architectures already
emit.
- Stable backport considerations
- Dependencies: pci_uevent_ers() exists and is implemented in pci-
driver.c (drivers/pci/pci-driver.c:1591) with a prototype in
include/linux/pci.h (include/linux/pci.h:2768). Older stable series
where pci_uevent_ers() lived in different guards may need the guard
expansions this patch includes. For supported long-term series
(4.19+, 5.4+, 5.10+, 5.15+, 6.1+), pci_uevent_ers() is already
present; just ensure to add CONFIG_S390 to both the definition and
the prototype guards as in this patch.
- No major architectural changes; change is limited to s390 PCI
recovery and one generic helper being compiled for s390.
- While the commit message does not include Fixes:/Cc: stable tags,
this is a correctness/behavior-parity fix affecting real userspace
observability and is minimal risk, making it suitable for stable.
- Conclusion
- This patch fixes a real behavioral gap on s390 by emitting standard
PCI recovery uevents that already exist on other platforms. It is
small, self-contained, and low risk, with clear user benefit. It
should be backported to stable.
arch/s390/pci/pci_event.c | 3 +++
drivers/pci/pci-driver.c | 2 +-
include/linux/pci.h | 2 +-
3 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index d930416d4c903..b95376041501f 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -88,6 +88,7 @@ static pci_ers_result_t zpci_event_notify_error_detected(struct pci_dev *pdev,
pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT;
ers_res = driver->err_handler->error_detected(pdev, pdev->error_state);
+ pci_uevent_ers(pdev, ers_res);
if (ers_result_indicates_abort(ers_res))
pr_info("%s: Automatic recovery failed after initial reporting\n", pci_name(pdev));
else if (ers_res == PCI_ERS_RESULT_NEED_RESET)
@@ -244,6 +245,7 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
ers_res = PCI_ERS_RESULT_RECOVERED;
if (ers_res != PCI_ERS_RESULT_RECOVERED) {
+ pci_uevent_ers(pdev, PCI_ERS_RESULT_DISCONNECT);
pr_err("%s: Automatic recovery failed; operator intervention is required\n",
pci_name(pdev));
status_str = "failed (driver can't recover)";
@@ -253,6 +255,7 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
pr_info("%s: The device is ready to resume operations\n", pci_name(pdev));
if (driver->err_handler->resume)
driver->err_handler->resume(pdev);
+ pci_uevent_ers(pdev, PCI_ERS_RESULT_RECOVERED);
out_unlock:
pci_dev_unlock(pdev);
zpci_report_status(zdev, "recovery", status_str);
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 6405acdb5d0f3..302d61783f6c0 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1582,7 +1582,7 @@ static int pci_uevent(const struct device *dev, struct kobj_uevent_env *env)
return 0;
}
-#if defined(CONFIG_PCIEAER) || defined(CONFIG_EEH)
+#if defined(CONFIG_PCIEAER) || defined(CONFIG_EEH) || defined(CONFIG_S390)
/**
* pci_uevent_ers - emit a uevent during recovery path of PCI device
* @pdev: PCI device undergoing error recovery
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 59876de13860d..7735acf6f3490 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2764,7 +2764,7 @@ static inline bool pci_is_thunderbolt_attached(struct pci_dev *pdev)
return false;
}
-#if defined(CONFIG_PCIEPORTBUS) || defined(CONFIG_EEH)
+#if defined(CONFIG_PCIEPORTBUS) || defined(CONFIG_EEH) || defined(CONFIG_S390)
void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type);
#endif
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-5.4] PCI: Disable MSI on RDC PCI to PCIe bridges
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (3 preceding siblings ...)
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17-6.1] s390/pci: Use pci_uevent_ers() in PCI recovery Sasha Levin
@ 2025-10-25 15:56 ` Sasha Levin
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17-6.1] PCI/PM: Skip resuming to D0 if device is disconnected Sasha Levin
` (5 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:56 UTC (permalink / raw)
To: patches, stable
Cc: Marcos Del Sol Vives, Bjorn Helgaas, Sasha Levin, linux-pci
From: Marcos Del Sol Vives <marcos@orca.pet>
[ Upstream commit ebc7086b39e5e4f3d3ca82caaea20538c9b62d42 ]
RDC PCI to PCIe bridges, present on Vortex86DX3 and Vortex86EX2 SoCs, do
not support MSIs. If enabled, interrupts generated by PCIe devices never
reach the processor.
I have contacted the manufacturer (DM&P) and they confirmed that PCI MSIs
need to be disabled for them.
Signed-off-by: Marcos Del Sol Vives <marcos@orca.pet>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250705233209.721507-1-marcos@orca.pet
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
Reasons this is a good stable backport candidate
- Fixes a real hardware bug: The commit message states RDC PCI→PCIe
bridges “do not support MSIs” and that when enabled, “interrupts
generated by PCIe devices never reach the processor.” That is a
functional breakage for devices behind these bridges, not a feature
change.
- Minimal, targeted change: The patch adds a single quirk line to
disable MSI only behind the affected bridge:
- Adds: DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_RDC, 0x1031,
quirk_disable_msi);
in drivers/pci/quirks.c next to other vendor-specific MSI-disable
quirks such as AMD/VIA/ATI (drivers/pci/quirks.c:2717-2719).
- The quirk uses existing helper quirk_disable_msi()
(drivers/pci/quirks.c:2710), which only sets a flag on the
subordinate bus if the device is a bridge:
- Sets bus flag: dev->subordinate->bus_flags |=
PCI_BUS_FLAGS_NO_MSI;
- The function is compiled under CONFIG_PCI_MSI
(drivers/pci/quirks.c:2686), so it is inert if MSI is not enabled.
- Correct mechanism and stage:
- Bus-level gating is the standard way to suppress MSI/MSI-X behind a
broken bridge. The MSI core checks this flag up the bus hierarchy
and refuses to enable MSI/MSI-X when set (drivers/pci/msi/msi.c:62).
- The flag used is the canonical one (include/linux/pci.h:259), and
the vendor macro is already present (include/linux/pci_ids.h:2412).
- Uses DECLARE_PCI_FIXUP_FINAL(), consistent with other similar quirks
(drivers/pci/quirks.c:2717-2719), so it runs late enough to have a
subordinate bus to mark and before drivers enable MSI.
- Scope-limited and precedent:
- Only affects devices with vendor RDC and device ID 0x1031, and only
if they are bridges (quirk_disable_msi() early-exits otherwise;
drivers/pci/quirks.c:2712).
- Mirrors long-standing patterns for known-broken bridges (e.g., AMD
8131, VIA, ATI entries at drivers/pci/quirks.c:2717-2719).
- Risk assessment:
- Primary effect is that devices behind the affected bridge will use
INTx instead of MSI/MSI-X. That may reduce performance but restores
correctness (avoids lost interrupts). If a device strictly requires
MSI/MSI-X, it could not have worked reliably on this hardware
anyway, so the quirk does not introduce a new regression in
practice.
- No architectural changes, no API changes, no cross-subsystem impact;
single-file quirk addition under an existing guard.
- Stable policy fit:
- Hardware workarounds that fix real user-visible bugs are routinely
backported.
- The change is small, well-scoped, and follows existing patterns.
- While there is no explicit “Cc: stable” in the provided message,
this class of PCI quirks is commonly accepted for stable.
Conclusion
- Backport Status: YES. This one-line quirk reliably prevents MSI/MSI-X
enablement behind RDC bridges known to drop MSI interrupts, restoring
device functionality with minimal risk.
drivers/pci/quirks.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d97335a401930..6eb3d20386e95 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2717,6 +2717,7 @@ static void quirk_disable_msi(struct pci_dev *dev)
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_disable_msi);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0xa238, quirk_disable_msi);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x5a3f, quirk_disable_msi);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_RDC, 0x1031, quirk_disable_msi);
/*
* The APC bridge device in AMD 780 family northbridges has some random
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-6.1] PCI/PM: Skip resuming to D0 if device is disconnected
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (4 preceding siblings ...)
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17-5.4] PCI: Disable MSI on RDC PCI to PCIe bridges Sasha Levin
@ 2025-10-25 15:56 ` Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17-5.4] PCI/P2PDMA: Fix incorrect pointer usage in devm_kfree() call Sasha Levin
` (4 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:56 UTC (permalink / raw)
To: patches, stable
Cc: Mario Limonciello, Lukas Wunner, Bjorn Helgaas,
Ilpo Järvinen, Rafael J. Wysocki, Sasha Levin, linux-pci
From: Mario Limonciello <mario.limonciello@amd.com>
[ Upstream commit 299fad4133677b845ce962f78c9cf75bded63f61 ]
When a device is surprise-removed (e.g., due to a dock unplug), the PCI
core unconfigures all downstream devices and sets their error state to
pci_channel_io_perm_failure. This marks them as disconnected via
pci_dev_is_disconnected().
During device removal, the runtime PM framework may attempt to resume the
device to D0 via pm_runtime_get_sync(), which calls into pci_power_up().
Since the device is already disconnected, this resume attempt is
unnecessary and results in a predictable errors like this, typically when
undocking from a TBT3 or USB4 dock with PCIe tunneling:
pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
Avoid powering up disconnected devices by checking their status early in
pci_power_up() and returning -EIO.
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
[bhelgaas: add typical message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://patch.msgid.link/20250909031916.4143121-1-superm1@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- The new guard in `drivers/pci/pci.c:1321-1324` checks
`pci_dev_is_disconnected()` before touching PCI PM registers, so
surprise-removed devices short-circuit with `-EIO` while keeping
`current_state = PCI_D3cold`. This prevents the guaranteed `"Unable to
change power state..."` error emitted when `pci_read_config_word()`
hits a vanished device (see `drivers/pci/pci.c:1326-1331`), which
currently spams logs whenever users undock TBT3/USB4 systems.
- Callers already expect a negative return in this scenario—the pre-
change path hit the same `-EIO` branch after the failing config
read—so observable behaviour stays the same aside from eliminating the
noisy and misleading error message. `pci_set_full_power_state()` and
runtime PM resume paths therefore retain their semantics but avoid
futile config accesses.
- The fix is narrowly scoped to PCI PM, introduces no architectural
churn, and relies only on long-standing helpers present in supported
stables (confirmed `pci_dev_is_disconnected()` in tags like `p-6.6`).
It neither alters power-state transitions for healthy devices nor
affects platforms lacking PM caps because the new check comes after
the existing `!dev->pm_cap` fallback.
- Avoiding config transactions on removed hardware reduces the chance of
host controller complaints and matches other PCI core code that tests
`pci_dev_is_disconnected()` before issuing requests, making this a
low-risk, high-signal bug fix well suited for stable backporting.
Natural next step: 1) Queue the patch for the targeted stable series
after double-checking those trees already expose
`pci_dev_is_disconnected()` in `include/linux/pci.h`.
drivers/pci/pci.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0f4d98036cdd..036511f5b2625 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1374,6 +1374,11 @@ int pci_power_up(struct pci_dev *dev)
return -EIO;
}
+ if (pci_dev_is_disconnected(dev)) {
+ dev->current_state = PCI_D3cold;
+ return -EIO;
+ }
+
pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
if (PCI_POSSIBLE_ERROR(pmcsr)) {
pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-5.4] PCI/P2PDMA: Fix incorrect pointer usage in devm_kfree() call
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (5 preceding siblings ...)
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17-6.1] PCI/PM: Skip resuming to D0 if device is disconnected Sasha Levin
@ 2025-10-25 15:58 ` Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17] misc: pci_endpoint_test: Skip IRQ tests if irq is out of range Sasha Levin
` (3 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:58 UTC (permalink / raw)
To: patches, stable
Cc: Sungho Kim, Bjorn Helgaas, Logan Gunthorpe, Sasha Levin,
linux-pci
From: Sungho Kim <sungho.kim@furiosa.ai>
[ Upstream commit 6238784e502b6a9fbeb3a6b77284b29baa4135cc ]
The error handling path in pci_p2pdma_add_resource() contains a bug in its
`pgmap_free` label.
Memory is allocated for the `p2p_pgmap` struct, and the pointer is stored
in `p2p_pgmap`. However, the error path calls devm_kfree() with `pgmap`,
which is a pointer to a member field within the `p2p_pgmap` struct, not the
base pointer of the allocation.
Correct the bug by passing the correct base pointer, `p2p_pgmap`, to
devm_kfree().
Signed-off-by: Sungho Kim <sungho.kim@furiosa.ai>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patch.msgid.link/20250820105714.2939896-1-sungho.kim@furiosa.ai
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Backport Analysis Complete
**ANSWER: YES**
This commit should be backported to stable kernel trees.
---
## Comprehensive Analysis
### Bug Description
The bug is located in the error cleanup path of
`pci_p2pdma_add_resource()` at **drivers/pci/p2pdma.c:363** (line 343 in
the diff context).
**The Problem:**
```c
// Line 321: Allocate base structure
p2p_pgmap = devm_kzalloc(&pdev->dev, sizeof(*p2p_pgmap), GFP_KERNEL);
// Line 325: Get pointer to member field (offset +16 bytes)
pgmap = &p2p_pgmap->pgmap;
// Line 363: BUGGY - tries to free from offset pointer
devm_kfree(&pdev->dev, pgmap); // ❌ WRONG!
// Should be:
devm_kfree(&pdev->dev, p2p_pgmap); // ✅ CORRECT
```
**Structure Layout** (lines 30-34):
```c
struct pci_p2pdma_pagemap {
struct pci_dev *provider; // offset 0, 8 bytes
u64 bus_offset; // offset 8, 8 bytes
struct dev_pagemap pgmap; // offset 16 <-- pgmap points here
};
```
The bug passes a pointer to a member field (`pgmap` at offset +16)
instead of the base allocation pointer (`p2p_pgmap`) to `devm_kfree()`.
### Historical Context
- **Introduced**: Commit **a6e6fe6549f60** (August 2019) by Logan
Gunthorpe
- **Present since**: Kernel v5.10 (approximately 5-6 years)
- **Fixed by**: This commit (cb662cfd4a020)
- **Primary affected subsystem**: PCI P2PDMA used by NVMe driver for
Controller Memory Buffer (CMB)
### Impact Assessment
**When Triggered:**
The bug manifests only in error paths when:
1. `devm_memremap_pages()` fails (line 336-340) - memory mapping failure
2. `gen_pool_add_owner()` fails (line 348-353) - pool allocation failure
These failures occur during:
- Low memory conditions
- Invalid PCI BAR configurations
- Hardware initialization failures
- NVMe device probe with CMB support
**Runtime Behavior:**
1. `devm_kfree()` attempts to free the wrong pointer
2. devres subsystem cannot find matching allocation (exact pointer
comparison)
3. `devm_kfree()` triggers **WARN_ON()** and generates stack trace in
kernel logs
4. **Memory leak**: ~184 bytes remain allocated and orphaned
5. NVMe driver disables CMB feature and continues without P2PDMA
**Severity: MEDIUM**
- Memory leak (bounded but accumulates with repeated errors)
- System instability concern (kernel warnings)
- Functional degradation (NVMe CMB unavailable)
- No immediate security vulnerability
- Affects critical storage infrastructure
**User Impact:**
- Enterprise/datacenter systems using NVMe with P2PDMA
- Systems experiencing memory pressure during device initialization
- Accumulating memory leaks over time with repeated device probe/remove
cycles
### Why This Should Be Backported
**1. Fixes Important Bug** ✅
- Clear memory management error affecting real users
- Causes memory leaks and kernel warnings
- Degrades NVMe CMB functionality in production systems
**2. Small and Contained** ✅
- **One line changed**: `pgmap` → `p2p_pgmap`
- No algorithmic changes
- No API modifications
- Isolated to single function
**3. Minimal Regression Risk** ✅
- Fix is obviously correct (pointer arithmetic fix)
- No complex logic changes
- Easy to verify correctness
- No dependencies on other changes
**4. Long-Standing Issue** ✅
- Bug present for ~5-6 years (since v5.10)
- Affects all stable kernels from 5.10 onwards
- Wide user base affected
**5. Follows Stable Tree Rules** ✅
- Important bugfix (memory leak + warnings)
- No new features
- No architectural changes
- Clear and obvious fix
- Minimal side effects beyond the fix
**6. Subsystem Importance** ✅
- Affects PCI P2PDMA critical for NVMe storage
- Used in enterprise/datacenter environments
- Storage performance and reliability impact
**7. Professional Review** ✅
- Reviewed-by: Logan Gunthorpe (original P2PDMA author)
- Signed-off-by: Bjorn Helgaas (PCI maintainer)
- Code review validates correctness
### Technical Correctness
The fix is straightforward and correct:
- `p2p_pgmap` is the base pointer from `devm_kzalloc()`
- This is what must be passed to `devm_kfree()`
- `pgmap` is derived via `&p2p_pgmap->pgmap` and cannot be used for
freeing
- The fixed code matches standard devres patterns throughout the kernel
### Backport Recommendation
**Strong YES** - This commit meets all criteria for stable backporting:
- Fixes a real bug affecting users
- Simple, contained, low-risk change
- No architectural implications
- Applicable to long-lived stable kernels
- Improves system reliability
**Target Stable Trees:** All active stable kernels containing the buggy
code (v5.10+)
drivers/pci/p2pdma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index da5657a020074..1cb5e423eed4f 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -360,7 +360,7 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
pages_free:
devm_memunmap_pages(&pdev->dev, pgmap);
pgmap_free:
- devm_kfree(&pdev->dev, pgmap);
+ devm_kfree(&pdev->dev, p2p_pgmap);
return error;
}
EXPORT_SYMBOL_GPL(pci_p2pdma_add_resource);
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17] misc: pci_endpoint_test: Skip IRQ tests if irq is out of range
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (6 preceding siblings ...)
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17-5.4] PCI/P2PDMA: Fix incorrect pointer usage in devm_kfree() call Sasha Levin
@ 2025-10-25 15:58 ` Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17-5.15] thunderbolt: Use is_pciehp instead of is_hotplug_bridge Sasha Levin
` (2 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:58 UTC (permalink / raw)
To: patches, stable
Cc: Christian Bruel, Manivannan Sadhasivam, Sasha Levin, kwilczynski,
gregkh, linux-pci
From: Christian Bruel <christian.bruel@foss.st.com>
[ Upstream commit cc8e391067164f45f89b6132a5aaa18c33a0e32b ]
The pci_endpoint_test tests the 32-bit MSI range. However, the device might
not have all vectors configured. For example, if msi_interrupts is 8 in the
ep function space or if the MSI Multiple Message Capable value is
configured as 4 (maximum 16 vectors).
In this case, do not attempt to run the test to avoid timeouts and directly
return the error value.
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250804170916.3212221-2-christian.bruel@foss.st.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
Explanation:
- What it fixes: The endpoint host test currently tries every MSI/MSI-X
vector in the nominal 32/2048 range even when the device only enabled
fewer vectors. For disabled/out-of-range vectors, no interrupt ever
arrives from the endpoint side, so the host side waits 1 second and
fails with a timeout (-ETIMEDOUT) per attempt. This is both slow and
semantically wrong: the error is “invalid vector” rather than
“timeout”.
- Core change: The patch adds an early range check using
`pci_irq_vector()` before attempting to trigger an interrupt. If the
requested vector is not valid for the device, it returns immediately
with the error from `pci_irq_vector()` (typically -EINVAL), skipping
the test and avoiding a 1s timeout.
- Early check added: drivers/misc/pci_endpoint_test.c:441-444
- `irq = pci_irq_vector(pdev, msi_num - 1);`
- `if (irq < 0) return irq;`
- Trigger and wait unchanged otherwise:
drivers/misc/pci_endpoint_test.c:445-456
- Post-wait verification now uses the pre-fetched `irq` value:
drivers/misc/pci_endpoint_test.c:457-460
- Previous behavior (pre-patch): The test wrote the registers and waited
up to 1s for completion, then only after success called
`pci_irq_vector()` to compare the vector number. If the vector was
actually invalid, the wait timed out first and the function returned
-ETIMEDOUT, masking the real reason and wasting time.
- Alignment with selftests: Kselftests iterate through the full
MSI/MSI-X ranges and expect -EINVAL for disabled vectors in order to
SKIP them rather than fail:
- MSI test expects -EINVAL to SKIP:
tools/testing/selftests/pci_endpoint/pci_endpoint_test.c:122-127
- MSI-X test expects -EINVAL to SKIP:
tools/testing/selftests/pci_endpoint/pci_endpoint_test.c:140-145
This change makes the driver return -EINVAL for out-of-range vectors,
matching selftests and preventing spurious failures/timeouts.
- Scope and risk:
- Small, contained change in a single helper:
drivers/misc/pci_endpoint_test.c:434-461.
- No architectural changes; only the order of operations and error
path are adjusted.
- `pci_irq_vector()` has well-defined semantics and returns -EINVAL
for out-of-range vectors (drivers/pci/msi/api.c:309-320).
- Functional behavior for valid vectors is unchanged; the post-wait
check still verifies the exact IRQ delivered (`irq ==
test->last_irq`).
- The only user-visible change is the error code for invalid vectors
(-EINVAL instead of -ETIMEDOUT), which aligns with the intended API
usage and the selftests.
- Backport criteria:
- Fixes a real user-visible issue (spurious timeouts and failing tests
when not all vectors are configured).
- Minimal and low risk; confined to the pci_endpoint_test driver.
- No new features; no impact on core PCI or IRQ subsystems.
- Improves test reliability and reduces needless delays.
Given these points, this is a good candidate for stable backport.
drivers/misc/pci_endpoint_test.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
index f935175d8bf55..506a2847e5d22 100644
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -436,7 +436,11 @@ static int pci_endpoint_test_msi_irq(struct pci_endpoint_test *test,
{
struct pci_dev *pdev = test->pdev;
u32 val;
- int ret;
+ int irq;
+
+ irq = pci_irq_vector(pdev, msi_num - 1);
+ if (irq < 0)
+ return irq;
pci_endpoint_test_writel(test, PCI_ENDPOINT_TEST_IRQ_TYPE,
msix ? PCITEST_IRQ_TYPE_MSIX :
@@ -450,11 +454,7 @@ static int pci_endpoint_test_msi_irq(struct pci_endpoint_test *test,
if (!val)
return -ETIMEDOUT;
- ret = pci_irq_vector(pdev, msi_num - 1);
- if (ret < 0)
- return ret;
-
- if (ret != test->last_irq)
+ if (irq != test->last_irq)
return -EIO;
return 0;
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-5.15] thunderbolt: Use is_pciehp instead of is_hotplug_bridge
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (7 preceding siblings ...)
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17] misc: pci_endpoint_test: Skip IRQ tests if irq is out of range Sasha Levin
@ 2025-10-25 15:58 ` Sasha Levin
2025-10-25 15:59 ` [PATCH AUTOSEL 6.17-5.10] PCI: cadence: Check for the existence of cdns_pcie::ops before using it Sasha Levin
2025-10-25 16:01 ` [PATCH AUTOSEL 6.17-6.6] PCI: dwc: Verify the single eDMA IRQ in dw_pcie_edma_irq_verify() Sasha Levin
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:58 UTC (permalink / raw)
To: patches, stable
Cc: Lukas Wunner, Bjorn Helgaas, Mika Westerberg, Sasha Levin,
andreas.noever, westeri, YehezkelShB, linux-pci, linux-usb
From: Lukas Wunner <lukas@wunner.de>
[ Upstream commit 5d03847175e81e86d4865456c15638faaf7c0634 ]
The thunderbolt driver sets up device link dependencies from hotplug ports
to the Host Router (aka Native Host Interface, NHI). When resuming from
system sleep, this allows the Host Router to re-establish tunnels to
attached Thunderbolt devices before the hotplug ports resume.
To identify the hotplug ports, the driver utilizes the is_hotplug_bridge
flag which also encompasses ACPI slots handled by the ACPI hotplug driver.
Thunderbolt hotplug ports are always Hot-Plug Capable PCIe ports, so it is
more apt to identify them with the is_pciehp flag.
Similarly, hotplug ports on older Thunderbolt controllers have broken MSI
support and are quirked to use legacy INTx interrupts instead. The quirk
identifies them with is_hotplug_bridge, even though all affected ports are
also matched by is_pciehp. So use is_pciehp here as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it changes
- Switches hotplug-port detection from the broad
`pdev->is_hotplug_bridge` flag to the precise `pdev->is_pciehp`
(PCIe Hot-Plug Capable) in two places:
- Quirk for TB controllers’ hotplug MSI: drivers/pci/quirks.c:3834
- Device-link creation for resume ordering:
drivers/thunderbolt/tb.c:3256
- Why this is correct
- Semantics: `is_hotplug_bridge` is a superset flag that also covers
ACPI-driven slots and conventional PCI hotplug, not just PCIe HPC.
Documentation explicitly distinguishes them (include/linux/pci.h:330
and include/linux/pci.h:334).
- TB downstream hotplug ports are always PCIe Hot-Plug Capable, so
`is_pciehp` matches the intended set without inadvertently pulling
in ACPI-only slots.
- ACPI code can set `is_hotplug_bridge` on bridges that are not PCIe
HPC (drivers/pci/hotplug/acpiphp_glue.c:411 and
drivers/pci/hotplug/acpiphp_glue.c:424), which is precisely what
this change avoids.
- Code specifics and impact
- Quirk: drivers/pci/quirks.c:3834
- Old: `if (pdev->is_hotplug_bridge && …) pdev->no_msi = 1;`
- New: `if (pdev->is_pciehp && …) pdev->no_msi = 1;`
- Effect: Still covers all affected TB hotplug ports (Light
Ridge/Eagle Ridge/Light Peak/early Cactus Ridge/Port Ridge),
because those ports have PCIe HPC. Behavior is unchanged for
intended devices, but avoids misfiring if some non-PCIe-HP bridge
was flagged via the generic hotplug flag.
- Resume ordering via device links: drivers/thunderbolt/tb.c:3256
- Old: Filter downstream ports with `… || !pdev->is_hotplug_bridge)
continue;`
- New: `… || !pdev->is_pciehp) continue;`
- Effect: Device links are created only for PCIe HPC downstream
ports beneath the TB controller’s upstream port, which are the
ports that participate in TB PCIe tunneling. This avoids creating
links for ACPI-only hotplug bridges that do not belong in the TB
tunnel re-establishment ordering.
- Correctness and consistency with PCI core
- The PCI core caches the HPC bit early and sets both flags together
for PCIe HPC ports (drivers/pci/probe.c:1628 and
drivers/pci/probe.c:1630), so `is_pciehp` is reliable and avoids
late config reads.
- Other subsystems have already moved to `is_pciehp` for the same
reason (e.g., PCIe portdrv created services only when `is_pciehp`),
demonstrating the broader effort to disambiguate these flags.
- Risk assessment
- Small, localized change; no architectural shifts.
- Maintainers’ acks: Thunderbolt and PCI maintainers are on board
(Acked-by Bjorn Helgaas, Signed-off-by Mika Westerberg), indicating
consensus on intent and safety.
- For TB quirk, the condition remains true for all intended TB ports;
for TB device-links, the selection is narrowed to the correct
subset, reducing potential for incorrect PM dependency ordering. No
functional expansion; only reduced false positives.
- User-facing effect
- More reliable resume sequencing for tunneled TB devices (device
links now scoped to true PCIe HP ports).
- Avoids unnecessary links for ACPI-only slots that are unrelated to
TB tunnels.
- No change in MSI disabling behavior for affected TB generations;
still prevents broken MSI usage there.
- Backport considerations
- Dependency: Target stable trees must already have `struct
pci_dev::is_pciehp` and the core logic that sets it
(drivers/pci/probe.c:1628 and drivers/pci/probe.c:1630). If a stable
branch lacks this field, it would require backporting that PCI core
addition first.
- Scope: Touches only `drivers/pci/quirks.c` and
`drivers/thunderbolt/tb.c`; isolated and low-risk with no API
changes.
Given the minimal change, precise semantics, alignment with ongoing PCI
core cleanups, and low regression risk confined to TB/PCI hotplug paths,
this is a good candidate for stable backport (where `is_pciehp` is
present).
drivers/pci/quirks.c | 2 +-
drivers/thunderbolt/tb.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6eb3d20386e95..214ed060ca1b3 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3830,7 +3830,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0xcf80, quirk_no_pm_reset);
*/
static void quirk_thunderbolt_hotplug_msi(struct pci_dev *pdev)
{
- if (pdev->is_hotplug_bridge &&
+ if (pdev->is_pciehp &&
(pdev->device != PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C ||
pdev->revision <= 1))
pdev->no_msi = 1;
diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
index c14ab1fbeeafd..83a33fc1486ab 100644
--- a/drivers/thunderbolt/tb.c
+++ b/drivers/thunderbolt/tb.c
@@ -3336,7 +3336,7 @@ static bool tb_apple_add_links(struct tb_nhi *nhi)
if (!pci_is_pcie(pdev))
continue;
if (pci_pcie_type(pdev) != PCI_EXP_TYPE_DOWNSTREAM ||
- !pdev->is_hotplug_bridge)
+ !pdev->is_pciehp)
continue;
link = device_link_add(&pdev->dev, &nhi->pdev->dev,
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-5.10] PCI: cadence: Check for the existence of cdns_pcie::ops before using it
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (8 preceding siblings ...)
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17-5.15] thunderbolt: Use is_pciehp instead of is_hotplug_bridge Sasha Levin
@ 2025-10-25 15:59 ` Sasha Levin
2025-10-25 16:01 ` [PATCH AUTOSEL 6.17-6.6] PCI: dwc: Verify the single eDMA IRQ in dw_pcie_edma_irq_verify() Sasha Levin
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 15:59 UTC (permalink / raw)
To: patches, stable
Cc: Chen Wang, Manivannan Sadhasivam, Sasha Levin, s-vadapalli,
bhelgaas, alexander.deucher, kishon, 18255117159,
alexandre.f.demers, bwawrzyn, linux-pci
From: Chen Wang <unicorn_wang@outlook.com>
[ Upstream commit 49a6c160ad4812476f8ae1a8f4ed6d15adfa6c09 ]
cdns_pcie::ops might not be populated by all the Cadence glue drivers. This
is going to be true for the upcoming Sophgo platform which doesn't set the
ops.
Hence, add a check to prevent NULL pointer dereference.
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
[mani: reworded subject and description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/35182ee1d972dfcd093a964e11205efcebbdc044.1757643388.git.unicorn_wang@outlook.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
**Why Backport**
- The shared Cadence core dereferences `pcie->ops` unconditionally in
several hot paths (`drivers/pci/controller/cadence/pcie-cadence-
host.c:534`, `pcie-cadence.c:109` and `:140`, `pcie-cadence.h:497`,
`505`, `511`). If a glue driver legitimately leaves `ops` unset, the
host setup oopses during probe.
- The new in-tree Sophgo driver (`drivers/pci/controller/cadence/pcie-
sg2042.c:35-69`) deliberately does not populate `pcie->ops`; without
this fix `cdns_pcie_host_setup()` trips the NULL dereference in
`cdns_pcie_host_init_address_translation()` immediately, so the
controller cannot even enumerate.
- The patch simply wraps each dereference with `pcie->ops &&
pcie->ops->...`, meaning existing platforms that register callbacks
keep identical behaviour, while platforms that do not provide optional
hooks now fall back to the previously implied defaults — avoiding the
fatal crash.
**Risk**
- Change is entirely in guard logic, no register programming altered
when `ops` is present. For platforms that rely on
`cpu_addr_fixup`/link callbacks, the functions still run because the
pointer remains non-NULL.
- For platforms without callbacks, the driver already relied on the
default behaviour implied by the inline helpers; the patch just
matches that expectation. Regression risk is therefore minimal.
**Next Steps**
- 1) Smoke/boot-test on at least one Cadence RC platform (e.g. TI J721E)
plus the Sophgo SG2042 host once both patches are staged, to confirm
link bring-up stays healthy.
drivers/pci/controller/cadence/pcie-cadence-host.c | 2 +-
drivers/pci/controller/cadence/pcie-cadence.c | 4 ++--
drivers/pci/controller/cadence/pcie-cadence.h | 6 +++---
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c b/drivers/pci/controller/cadence/pcie-cadence-host.c
index 59a4631de79fe..fffd63d6665e8 100644
--- a/drivers/pci/controller/cadence/pcie-cadence-host.c
+++ b/drivers/pci/controller/cadence/pcie-cadence-host.c
@@ -531,7 +531,7 @@ static int cdns_pcie_host_init_address_translation(struct cdns_pcie_rc *rc)
cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_PCI_ADDR1(0), addr1);
cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(0), desc1);
- if (pcie->ops->cpu_addr_fixup)
+ if (pcie->ops && pcie->ops->cpu_addr_fixup)
cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr);
addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(12) |
diff --git a/drivers/pci/controller/cadence/pcie-cadence.c b/drivers/pci/controller/cadence/pcie-cadence.c
index 70a19573440ee..61806bbd8aa32 100644
--- a/drivers/pci/controller/cadence/pcie-cadence.c
+++ b/drivers/pci/controller/cadence/pcie-cadence.c
@@ -92,7 +92,7 @@ void cdns_pcie_set_outbound_region(struct cdns_pcie *pcie, u8 busnr, u8 fn,
cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(r), desc1);
/* Set the CPU address */
- if (pcie->ops->cpu_addr_fixup)
+ if (pcie->ops && pcie->ops->cpu_addr_fixup)
cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr);
addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(nbits) |
@@ -123,7 +123,7 @@ void cdns_pcie_set_outbound_region_for_normal_msg(struct cdns_pcie *pcie,
}
/* Set the CPU address */
- if (pcie->ops->cpu_addr_fixup)
+ if (pcie->ops && pcie->ops->cpu_addr_fixup)
cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr);
addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(17) |
diff --git a/drivers/pci/controller/cadence/pcie-cadence.h b/drivers/pci/controller/cadence/pcie-cadence.h
index 1d81c4bf6c6db..2f07ba661bda7 100644
--- a/drivers/pci/controller/cadence/pcie-cadence.h
+++ b/drivers/pci/controller/cadence/pcie-cadence.h
@@ -468,7 +468,7 @@ static inline u32 cdns_pcie_ep_fn_readl(struct cdns_pcie *pcie, u8 fn, u32 reg)
static inline int cdns_pcie_start_link(struct cdns_pcie *pcie)
{
- if (pcie->ops->start_link)
+ if (pcie->ops && pcie->ops->start_link)
return pcie->ops->start_link(pcie);
return 0;
@@ -476,13 +476,13 @@ static inline int cdns_pcie_start_link(struct cdns_pcie *pcie)
static inline void cdns_pcie_stop_link(struct cdns_pcie *pcie)
{
- if (pcie->ops->stop_link)
+ if (pcie->ops && pcie->ops->stop_link)
pcie->ops->stop_link(pcie);
}
static inline bool cdns_pcie_link_up(struct cdns_pcie *pcie)
{
- if (pcie->ops->link_up)
+ if (pcie->ops && pcie->ops->link_up)
return pcie->ops->link_up(pcie);
return true;
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH AUTOSEL 6.17-6.6] PCI: dwc: Verify the single eDMA IRQ in dw_pcie_edma_irq_verify()
[not found] <20251025160905.3857885-1-sashal@kernel.org>
` (9 preceding siblings ...)
2025-10-25 15:59 ` [PATCH AUTOSEL 6.17-5.10] PCI: cadence: Check for the existence of cdns_pcie::ops before using it Sasha Levin
@ 2025-10-25 16:01 ` Sasha Levin
10 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-10-25 16:01 UTC (permalink / raw)
To: patches, stable
Cc: Niklas Cassel, Manivannan Sadhasivam, Bjorn Helgaas, Sasha Levin,
jingoohan1, linux-pci
From: Niklas Cassel <cassel@kernel.org>
[ Upstream commit 09fefb24ed5e15f3b112f6c04b21a90ea23eaf8b ]
dw_pcie_edma_irq_verify() is supposed to verify the eDMA IRQs in devicetree
by fetching them using either 'dma' or 'dmaX' IRQ names. Former is used
when the platform uses a single IRQ for all eDMA channels and latter is
used when the platform uses separate IRQ per channel. But currently,
dw_pcie_edma_irq_verify() bails out early if edma::nr_irqs is 1, i.e., when
a single IRQ is used. This gives an impression that the driver could work
with any single IRQ in devicetree, not necessarily with name 'dma'.
But dw_pcie_edma_irq_vector(), which actually requests the IRQ, does
require the single IRQ to be named as 'dma'. So this creates inconsistency
between dw_pcie_edma_irq_verify() and dw_pcie_edma_irq_vector().
Thus, to fix this inconsistency, make sure dw_pcie_edma_irq_verify() also
verifies the single IRQ name by removing the bail out code.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
[mani: reworded subject and description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: fix typos]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250908165914.547002-3-cassel@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive analysis of this commit, here is my
determination:
## **Backport Status: YES**
## Extensive Analysis and Justification
### **Nature of the Change**
This is a **correctness and consistency bugfix** that addresses an
inconsistency between two related functions in the DWC PCIe eDMA driver:
1. **`dw_pcie_edma_irq_verify()`** - Validates eDMA IRQ configuration in
devicetree
2. **`dw_pcie_edma_irq_vector()`** - Actually retrieves the eDMA IRQs
### **The Bug Explained**
**Before the patch (lines 1028-1030 in the diff):**
```c
if (pci->edma.nr_irqs == 1)
return 0; // Early bailout - doesn't verify IRQ name!
else if (pci->edma.nr_irqs > 1)
return pci->edma.nr_irqs != ch_cnt ? -EINVAL : 0;
```
When `nr_irqs == 1`, the function returned success immediately **without
verifying** that a "dma" named IRQ actually exists in the devicetree.
This created a false sense of validation.
**However**, `dw_pcie_edma_irq_vector()` (lines 880-896 in pcie-
designware.c) **requires** the single IRQ to be named "dma":
```c
ret = platform_get_irq_byname_optional(pdev, "dma");
if (ret > 0)
return ret;
```
**The inconsistency:** Verification passed for ANY single IRQ, but
actual IRQ retrieval required it to be named "dma". This could lead to:
- Silent eDMA failures when devicetree is misconfigured
- Confusing behavior where verification passes but eDMA doesn't work
- Glue drivers manually setting `nr_irqs = 1` to bypass validation (as
qcom-ep did in commit ff8d92038cf92)
**After the patch:**
```c
if (pci->edma.nr_irqs > 1)
return pci->edma.nr_irqs != ch_cnt ? -EINVAL : 0;
ret = platform_get_irq_byname_optional(pdev, "dma");
if (ret > 0) {
pci->edma.nr_irqs = 1;
return 0;
}
```
Now the function properly verifies that the "dma" IRQ exists, matching
what `dw_pcie_edma_irq_vector()` expects. This makes both functions
consistent.
### **Why This Should Be Backported**
**1. Fixes User-Visible Bug:**
- Improves error detection for misconfigured device trees
- Provides clear error messages ("Invalid eDMA IRQs found") instead of
silent failures
- Helps developers catch DT configuration errors during development
**2. Part of a Coordinated Fix Series:**
This is patch 3/X in a series. Patch 4 (commit eea30c7601224) removes
redundant `edma.nr_irqs = 1` initialization from qcom-ep driver, with
the commit message stating:
> "dw_pcie_edma_irq_verify() already parses device tree for either 'dma'
(if there is a single IRQ for all DMA channels) or 'dmaX' (if there is
one IRQ per DMA channel), and initializes dma.nr_irqs accordingly."
This statement is only true **after our commit** is applied. The series
works together as a unit.
**3. Minimal and Contained:**
- Only removes 2 lines of code (`if (pci->edma.nr_irqs == 1) return 0;`)
- Changes a single static function
- No API changes, no ABI changes
- Affects only DWC PCIe eDMA subsystem
**4. Zero Regression Risk:**
I verified through code analysis that `dw_pcie_edma_detect()` (lines
1052-1056) has backward compatibility protection:
```c
ret = dw_pcie_edma_irq_verify(pci);
if (ret) {
dev_err(pci->dev, "Invalid eDMA IRQs found\n");
return 0; // Errors converted to success for backward compat
}
```
Even if verification fails, the probe doesn't fail - it just logs an
error and continues. This means:
- **Correctly configured platforms**: Work as before ✓
- **Platforms without eDMA IRQs**: Work as before (backward compat) ✓
- **Misconfigured platforms**: Now get helpful error messages instead of
silent failures ✓
**5. Follows Device Tree Binding Specification:**
The DT binding documentation
(`Documentation/devicetree/bindings/pci/snps,dw-pcie-ep.yaml` line 140)
specifies:
```yaml
pattern: '^dma([0-9]|1[0-5])?$'
```
This means eDMA IRQs must be named "dma" (single) or "dma0"-"dma15"
(per-channel). The patch enforces this specification.
**6. Improves Code Quality:**
- Makes verification consistent with actual usage
- Eliminates need for glue drivers to manually set `nr_irqs`
- Reduces code duplication and potential for errors
### **Historical Context**
- **v6.2**: eDMA support added (commit 939fbcd568fd2, Jan 2023) with the
buggy early return
- **v6.6**: qcom-ep worked around the bug by manually setting `nr_irqs =
1` (commit ff8d92038cf92)
- **Current**: This patch fixes the root cause, allowing removal of
workarounds
### **Impact Assessment**
Searched 46 device tree files using "dma" interrupt names. All properly
use either:
- `interrupt-names = "dma";` (single shared IRQ) - e.g., Qualcomm,
Renesas, i.MX platforms
- `interrupt-names = "dma0", "dma1", ...;` (per-channel IRQs) - e.g.,
Baikal-T1
No platforms found with incorrect naming that would be negatively
affected.
### **Stable Tree Criteria Compliance**
✓ **Fixes an obvious bug** - Inconsistency between verify and usage
functions
✓ **Important bugfix** - Affects eDMA functionality on multiple
platforms
✓ **Small and contained** - 2-line change in single static function
✓ **No architectural changes** - Pure bugfix with no new features
✓ **Minimal regression risk** - Protected by backward compatibility code
✓ **Easy to understand** - Clear, well-documented change
✓ **Build-tested** - Part of mainline kernel, no compilation issues
### **Recommendation**
This commit should definitely be backported to stable trees where eDMA
support exists (v6.2+). It improves correctness, consistency, and error
reporting without any risk of regression.
drivers/pci/controller/dwc/pcie-designware.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 89aad5a08928c..c7a2cf5e886f3 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -1045,9 +1045,7 @@ static int dw_pcie_edma_irq_verify(struct dw_pcie *pci)
char name[15];
int ret;
- if (pci->edma.nr_irqs == 1)
- return 0;
- else if (pci->edma.nr_irqs > 1)
+ if (pci->edma.nr_irqs > 1)
return pci->edma.nr_irqs != ch_cnt ? -EINVAL : 0;
ret = platform_get_irq_byname_optional(pdev, "dma");
--
2.51.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH AUTOSEL 6.17-6.12] PCI: Set up bridge resources earlier
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17-6.12] PCI: Set up bridge resources earlier Sasha Levin
@ 2025-10-27 12:39 ` Ilpo Järvinen
2025-11-04 13:51 ` Sasha Levin
0 siblings, 1 reply; 13+ messages in thread
From: Ilpo Järvinen @ 2025-10-27 12:39 UTC (permalink / raw)
To: Sasha Levin; +Cc: patches, stable, Bjorn Helgaas, linux-pci
[-- Attachment #1: Type: text/plain, Size: 4170 bytes --]
On Sat, 25 Oct 2025, Sasha Levin wrote:
> From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>
> [ Upstream commit a43ac325c7cbbfe72bdf9178059b3ee9f5a2c7dd ]
>
> Bridge windows are read twice from PCI Config Space, the first time from
> pci_read_bridge_windows(), which does not set up the device's resources.
> This causes problems down the road as child resources of the bridge cannot
> check whether they reside within the bridge window or not.
>
> Set up the bridge windows already in pci_read_bridge_windows().
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Link: https://patch.msgid.link/20250924134228.1663-2-ilpo.jarvinen@linux.intel.com
> Signed-off-by: Sasha Levin <sashal@kernel.org>
This change was reverted by the commit 469276c06aff ("PCI: Revert early
bridge resource set up").
--
i.
> ---
>
> LLM Generated explanations, may be completely bogus:
>
> YES
> - `pci_alloc_child_bus()` copies each subordinate bus window to
> `child->resource[i] = &bridge->resource[PCI_BRIDGE_RESOURCES+i]`
> before any child is scanned (`drivers/pci/probe.c:1245-1248`). Without
> this patch, those `bridge->resource[...]` entries are still zeroed;
> the first call to `pci_read_bridge_windows()` only logged with a
> stack-local `struct resource`.
> - Child drivers often probe immediately (device_add → bus_probe_device)
> while the bus scan is still in progress. During their
> `pci_enable_device()` they hit `pci_claim_resource()`
> (`drivers/pci/setup-res.c:154-169`), which calls
> `pci_find_parent_resource()` to make sure the BAR sits inside an
> upstream bridge window (`drivers/pci/pci.c:737-767`). Because
> `pcibios_fixup_bus()` (the point where `pci_read_bridge_bases()` re-
> reads the window into the real resource) runs only after the entire
> bus has been scanned (`drivers/pci/probe.c:3091-3106`), the parent
> window is still zero and the containment test fails. Result:
> `pci_enable_device()` reports “can't claim; no compatible bridge
> window” and the device never comes up behind that bridge.
> - The patch fixes that race by writing the values directly into the
> bridge’s real resources the first time we read config space
> (`drivers/pci/probe.c:540-588`). When the subordinate bus is created,
> the copied pointers already describe the real aperture, so drivers can
> claim their BARs successfully even if they probe before the later
> fixup.
> - Behavioural risk is negligible: we still populate the same resource
> structures with the same data, only earlier; the later
> `pci_read_bridge_bases()` call simply refreshes them with `log=false`.
> No new dependencies or behavioural changes outside this bug fix path,
> making it safe for stable.
>
> Natural next step: consider tagging with a `Fixes` reference upstream to
> ease stable selection.
>
> drivers/pci/probe.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index a56dfa1c9b6ff..0b8c82c610baa 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -524,10 +524,14 @@ static void pci_read_bridge_windows(struct pci_dev *bridge)
> }
> if (io) {
> bridge->io_window = 1;
> - pci_read_bridge_io(bridge, &res, true);
> + pci_read_bridge_io(bridge,
> + pci_resource_n(bridge, PCI_BRIDGE_IO_WINDOW),
> + true);
> }
>
> - pci_read_bridge_mmio(bridge, &res, true);
> + pci_read_bridge_mmio(bridge,
> + pci_resource_n(bridge, PCI_BRIDGE_MEM_WINDOW),
> + true);
>
> /*
> * DECchip 21050 pass 2 errata: the bridge may miss an address
> @@ -565,7 +569,10 @@ static void pci_read_bridge_windows(struct pci_dev *bridge)
> bridge->pref_64_window = 1;
> }
>
> - pci_read_bridge_mmio_pref(bridge, &res, true);
> + pci_read_bridge_mmio_pref(bridge,
> + pci_resource_n(bridge,
> + PCI_BRIDGE_PREF_MEM_WINDOW),
> + true);
> }
>
> void pci_read_bridge_bases(struct pci_bus *child)
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH AUTOSEL 6.17-6.12] PCI: Set up bridge resources earlier
2025-10-27 12:39 ` Ilpo Järvinen
@ 2025-11-04 13:51 ` Sasha Levin
0 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2025-11-04 13:51 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: patches, stable, Bjorn Helgaas, linux-pci
On Mon, Oct 27, 2025 at 02:39:27PM +0200, Ilpo Järvinen wrote:
>On Sat, 25 Oct 2025, Sasha Levin wrote:
>
>> From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>
>> [ Upstream commit a43ac325c7cbbfe72bdf9178059b3ee9f5a2c7dd ]
>>
>> Bridge windows are read twice from PCI Config Space, the first time from
>> pci_read_bridge_windows(), which does not set up the device's resources.
>> This causes problems down the road as child resources of the bridge cannot
>> check whether they reside within the bridge window or not.
>>
>> Set up the bridge windows already in pci_read_bridge_windows().
>>
>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> Link: https://patch.msgid.link/20250924134228.1663-2-ilpo.jarvinen@linux.intel.com
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
>This change was reverted by the commit 469276c06aff ("PCI: Revert early
>bridge resource set up").
Dropped, thanks!
--
Thanks,
Sasha
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-11-04 13:51 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20251025160905.3857885-1-sashal@kernel.org>
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17] selftests: pci_endpoint: Skip IRQ test if IRQ is out of range Sasha Levin
2025-10-25 15:54 ` [PATCH AUTOSEL 6.17-6.12] PCI: imx6: Enable the Vaux supply if available Sasha Levin
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17-6.12] PCI: Set up bridge resources earlier Sasha Levin
2025-10-27 12:39 ` Ilpo Järvinen
2025-11-04 13:51 ` Sasha Levin
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17-6.1] s390/pci: Use pci_uevent_ers() in PCI recovery Sasha Levin
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17-5.4] PCI: Disable MSI on RDC PCI to PCIe bridges Sasha Levin
2025-10-25 15:56 ` [PATCH AUTOSEL 6.17-6.1] PCI/PM: Skip resuming to D0 if device is disconnected Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17-5.4] PCI/P2PDMA: Fix incorrect pointer usage in devm_kfree() call Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17] misc: pci_endpoint_test: Skip IRQ tests if irq is out of range Sasha Levin
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17-5.15] thunderbolt: Use is_pciehp instead of is_hotplug_bridge Sasha Levin
2025-10-25 15:59 ` [PATCH AUTOSEL 6.17-5.10] PCI: cadence: Check for the existence of cdns_pcie::ops before using it Sasha Levin
2025-10-25 16:01 ` [PATCH AUTOSEL 6.17-6.6] PCI: dwc: Verify the single eDMA IRQ in dw_pcie_edma_irq_verify() Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).