* [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
@ 2025-12-02 14:22 ` Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 1/4] PCI: Enable ACS only after configuring IOMMU for " Manivannan Sadhasivam
` (5 more replies)
0 siblings, 6 replies; 19+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-02 14:22 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Marek Szyprowski, Robin Murphy,
Jason Gunthorpe, Manivannan Sadhasivam, Manivannan Sadhasivam
Hi,
This series fixes the long-standing issue with ACS on OF platforms. There are
two fixes in this series, each addressing an independent issue, but both
are needed to properly enable ACS on OF platforms.
Issue(s) background
===================
Back in 2021, Xingang Wang first noted a failure in attaching the HiSilicon SEC
device to QEMU ARM64 pci-root-port device [1]. He then tracked down the issue to
ACS not being enabled for the QEMU Root Port device and he proposed a patch to
fix it [2].
Once the patch got applied, people reported PCIe issues with linux-next on the
ARM Juno Development boards, where they saw failures in enumerating the endpoint
devices [3][4]. The patch was soon dropped, but the actual issue with the
ARM Juno boards was left unresolved.
Fast forward to 2024, Pavan resubmitted the same fix [5] for his own use case,
hoping that someone in the community would fix the issue with ARM Juno boards.
But the patch was rightly rejected, as a patch that was known to cause issues
should not be merged into the kernel. Once again, no one investigated the Juno
issue and it remained unresolved.
Now it ended up on my plate, and I managed to track down the issue with the help
of Naresh, who has access to the Juno boards in LKFT. The Juno issue was with the
PCIe switch from Microsemi/IDT, which triggers ACS Source Validation error on
Completions received for the Configuration Read Request from a device connected
to the downstream port that has not yet captured the PCIe bus number. As per the
PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and Device Numbers
supplied with all Type 0 Configuration Write Requests completed by the Function
and supply these numbers in the Bus and Device Number fields of the Requester ID
for all Requests". So during the first Configuration Read Request issued by the
switch downstream port during enumeration (for reading Vendor ID), Bus and
Device numbers will be unknown to the device. So it responds to the Read Request
with Completion having Bus and Device number as 0. The switch interprets the
Completion as an ACS Source Validation error and drops the completion, leading
to the failure in detecting the endpoint device. However, PCIe spec r6.0, sec
6.12.1.1, states that "Completions are never affected by ACS Source Validation",
so this behavior violates the spec.
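To make the failure mode concrete, here is a minimal sketch in plain C. It is not kernel code or switch firmware: the structs and function names are hypothetical, and Source Validation is reduced to a check of the bus number in the TLP's ID against the downstream port's secondary/subordinate bus range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a TLP arriving at a downstream port. */
enum tlp_kind { TLP_REQUEST, TLP_COMPLETION };

struct tlp {
	enum tlp_kind kind;
	uint8_t id_bus;		/* bus number in the Requester/Completer ID */
};

/* Hypothetical model of a switch downstream port. */
struct downstream_port {
	uint8_t secondary_bus;
	uint8_t subordinate_bus;
	bool acs_sv_enabled;
};

static bool bus_in_range(const struct downstream_port *p, uint8_t bus)
{
	return bus >= p->secondary_bus && bus <= p->subordinate_bus;
}

/* Spec-conforming model: only Requests are subject to Source Validation. */
bool sv_accepts_conforming(const struct downstream_port *p,
			   const struct tlp *t)
{
	if (!p->acs_sv_enabled || t->kind == TLP_COMPLETION)
		return true;
	return bus_in_range(p, t->id_bus);
}

/* Model of the erratum as described above: Completions are validated too. */
bool sv_accepts_broken(const struct downstream_port *p, const struct tlp *t)
{
	if (!p->acs_sv_enabled)
		return true;
	return bus_in_range(p, t->id_bus);
}
```

With a downstream port whose secondary and subordinate bus are both 2, a Completion carrying bus number 0 passes the conforming check but is rejected by the broken one, which matches the endpoint not being detected.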
Solution
========
In September, I submitted a series [6] to fix both issues. For the IDT issue,
I reused the existing quirk in the PCI core which does a dummy config write
before issuing the first config read to the device. And for the ACS enablement
issue, I just resubmitted the original patch from Xingang which called
pci_request_acs() from devm_of_pci_bridge_init().
But during the review of the series, several comments were received that
required the series to be reworked completely. Hence, in this version, I've
incorporated the comments as follows:
1. For the ACS enablement issue, I've moved the pci_enable_acs() call from
pci_acs_init() to pci_dma_configure().
2. For the IDT issue, I've cached the ACS capabilities (RO) in 'pci_dev',
collected the broken capability for the IDT switches in the quirk and used it to
disable the capability in the cache. This also allowed me to get rid of the
earlier workaround for the switch.
[1] https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
[2] https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
[3] https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
[4] https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
[5] https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
[6] https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
Changes in v2:
* Reworked the patches completely as mentioned above.
* Rebased on top of v6.18-rc7
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
Manivannan Sadhasivam (4):
PCI: Enable ACS only after configuring IOMMU for OF platforms
PCI: Cache ACS capabilities
PCI: Disable ACS SV capability for the broken IDT switches
PCI: Extend the pci_disable_acs_sv quirk for one more IDT switch
drivers/pci/pci-driver.c | 8 +++++++
drivers/pci/pci.c | 33 ++++++++++++--------------
drivers/pci/pci.h | 2 +-
drivers/pci/probe.c | 12 ----------
drivers/pci/quirks.c | 62 ++++++++++++------------------------------------
include/linux/pci.h | 2 ++
6 files changed, 41 insertions(+), 78 deletions(-)
---
base-commit: ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d
change-id: 20251201-pci_acs-b15aa3947289
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
* [PATCH v2 1/4] PCI: Enable ACS only after configuring IOMMU for OF platforms
2025-12-02 14:22 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Manivannan Sadhasivam
@ 2025-12-02 14:22 ` Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 2/4] PCI: Cache ACS capabilities Manivannan Sadhasivam
` (4 subsequent siblings)
5 siblings, 0 replies; 19+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-02 14:22 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Marek Szyprowski, Robin Murphy,
Jason Gunthorpe, Manivannan Sadhasivam, Manivannan Sadhasivam
To enable ACS without the cmdline params, platform drivers are expected to
call the pci_request_acs() API, which sets a static flag, 'pci_acs_enable',
in drivers/pci/pci.c. This flag is then used to enable ACS in the
pci_enable_acs() helper, which gets called during pci_acs_init(), as per
this call stack:
-> pci_device_add()
-> pci_init_capabilities()
-> pci_acs_init()
/* check for pci_acs_enable */
-> pci_enable_acs()
For the OF platforms, pci_request_acs() is called during
of_iommu_configure() during device_add(), as per this call stack:
-> device_add()
-> iommu_bus_notifier()
-> iommu_probe_device()
-> pci_dma_configure()
-> of_dma_configure()
-> of_iommu_configure()
/* set pci_acs_enable */
-> pci_request_acs()
As seen from both call stacks, pci_enable_acs() is called well before
pci_request_acs() is invoked on OF platforms. This means that
pci_enable_acs() will not enable ACS for the first device that gets
enumerated, which is usually the Root Port device. But since the static
flag 'pci_acs_enable' is set *afterwards*, ACS will be enabled for the
ACS capable devices enumerated later.
To fix this issue, do not call pci_enable_acs() from pci_acs_init(), but
only from pci_dma_configure() after calling of_dma_configure(). This makes
sure that pci_enable_acs() only gets called after the IOMMU framework has
called pci_request_acs(). The ACS enablement flow now looks like:
-> pci_device_add()
-> pci_init_capabilities()
/* Just store the ACS cap */
-> pci_acs_init()
-> device_add()
...
-> pci_dma_configure()
-> of_dma_configure()
-> pci_request_acs()
-> pci_enable_acs()
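The ordering problem and the reordered flow can be modelled in a few lines of standalone C. This is only a sketch under simplifying assumptions: 'acs_requested' stands in for the static 'pci_acs_enable' flag, every model_* name is hypothetical, and each device is reduced to a single boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct model_dev {
	bool acs_enabled;
};

static bool acs_requested;	/* plays the role of 'pci_acs_enable' */

static void model_request_acs(void)
{
	acs_requested = true;
}

static void model_enable_acs(struct model_dev *dev)
{
	if (acs_requested)
		dev->acs_enabled = true;
}

/* Old flow: enable at capability init, request later at DMA configure. */
void enumerate_old(struct model_dev *devs, size_t n)
{
	acs_requested = false;
	for (size_t i = 0; i < n; i++) {
		model_enable_acs(&devs[i]);	/* pci_acs_init() step */
		model_request_acs();		/* of_iommu_configure() step */
	}
}

/* New flow: request first, then enable, both at DMA configure time. */
void enumerate_new(struct model_dev *devs, size_t n)
{
	acs_requested = false;
	for (size_t i = 0; i < n; i++) {
		model_request_acs();		/* of_iommu_configure() step */
		model_enable_acs(&devs[i]);	/* pci_dma_configure() step */
	}
}
```

In this model, enumerate_old() leaves devs[0] (the Root Port in the usual case) without ACS while later devices get it; enumerate_new() enables it for all of them.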
For the ACPI platforms, pci_request_acs() is called during ACPI
initialization time itself, independent of the IOMMU framework.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/pci-driver.c | 8 ++++++++
drivers/pci/pci.c | 8 --------
drivers/pci/pci.h | 1 +
3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 302d61783f6c..a4ee93497a06 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1648,6 +1648,14 @@ static int pci_dma_configure(struct device *dev)
ret = acpi_dma_configure(dev, acpi_get_dma_attr(adev));
}
+ /*
+ * Attempt to enable ACS regardless of capability because some Root
+ * Ports (e.g. those quirked with *_intel_pch_acs_*) do not have
+ * the standard ACS capability but still support ACS via those
+ * quirks.
+ */
+ pci_enable_acs(to_pci_dev(dev));
+
pci_put_host_bridge_device(bridge);
/* @drv may not be valid when we're called from the IOMMU layer */
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..9f594fc6dade 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3677,14 +3677,6 @@ bool pci_acs_path_enabled(struct pci_dev *start,
void pci_acs_init(struct pci_dev *dev)
{
dev->acs_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
-
- /*
- * Attempt to enable ACS regardless of capability because some Root
- * Ports (e.g. those quirked with *_intel_pch_acs_*) do not have
- * the standard ACS capability but still support ACS via those
- * quirks.
- */
- pci_enable_acs(dev);
}
void pci_rebar_init(struct pci_dev *pdev)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 36f8c0985430..972b28fc5455 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -917,6 +917,7 @@ static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
}
void pci_acs_init(struct pci_dev *dev);
+void pci_enable_acs(struct pci_dev *dev);
#ifdef CONFIG_PCI_QUIRKS
int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
int pci_dev_specific_enable_acs(struct pci_dev *dev);
--
2.48.1
* [PATCH v2 2/4] PCI: Cache ACS capabilities
2025-12-02 14:22 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 1/4] PCI: Enable ACS only after configuring IOMMU for " Manivannan Sadhasivam
@ 2025-12-02 14:22 ` Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches Manivannan Sadhasivam
` (3 subsequent siblings)
5 siblings, 0 replies; 19+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-02 14:22 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Marek Szyprowski, Robin Murphy,
Jason Gunthorpe, Manivannan Sadhasivam, Manivannan Sadhasivam
ACS capabilities are RO values set by the hardware. Cache them to avoid
re-reading them every time they are needed, and to allow quirks to override
individual capabilities.
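As a rough illustration of the caching idea (not the patch itself), the sketch below reads a simulated read-only capability register once at init time and answers later queries from the cached copy. The struct, the model_* helpers and the 'cfg_reads' counter are all hypothetical and exist only to make the number of config accesses visible.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model; field names are illustrative only. */
struct model_dev {
	uint16_t hw_acs_cap;	/* value the RO register would return */
	uint16_t cached_cap;	/* filled once at init time */
	unsigned int cfg_reads;	/* counts simulated config space reads */
};

/* Simulated config read of the ACS Capability register. */
static uint16_t model_read_acs_cap(struct model_dev *dev)
{
	dev->cfg_reads++;
	return dev->hw_acs_cap;
}

/* Init step: the only place that touches the simulated register. */
void model_acs_init(struct model_dev *dev)
{
	dev->cached_cap = model_read_acs_cap(dev);
}

/* Later users consult the cache instead of issuing another read. */
bool model_acs_supports(const struct model_dev *dev, uint16_t flags)
{
	return (dev->cached_cap & flags) == flags;
}
```

However many times model_acs_supports() is called, 'cfg_reads' stays at 1 after init, and the cached value becomes a single place where a quirk could adjust what the rest of the code sees.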
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/pci.c | 26 +++++++++++++++-----------
include/linux/pci.h | 1 +
2 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9f594fc6dade..4eb5b487c982 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -892,7 +892,6 @@ static const char *disable_acs_redir_param;
static const char *config_acs_param;
struct pci_acs {
- u16 cap;
u16 ctrl;
u16 fw_ctrl;
};
@@ -995,27 +994,27 @@ static void __pci_config_acs(struct pci_dev *dev, struct pci_acs *caps,
static void pci_std_enable_acs(struct pci_dev *dev, struct pci_acs *caps)
{
/* Source Validation */
- caps->ctrl |= (caps->cap & PCI_ACS_SV);
+ caps->ctrl |= (dev->acs_capabilities & PCI_ACS_SV);
/* P2P Request Redirect */
- caps->ctrl |= (caps->cap & PCI_ACS_RR);
+ caps->ctrl |= (dev->acs_capabilities & PCI_ACS_RR);
/* P2P Completion Redirect */
- caps->ctrl |= (caps->cap & PCI_ACS_CR);
+ caps->ctrl |= (dev->acs_capabilities & PCI_ACS_CR);
/* Upstream Forwarding */
- caps->ctrl |= (caps->cap & PCI_ACS_UF);
+ caps->ctrl |= (dev->acs_capabilities & PCI_ACS_UF);
/* Enable Translation Blocking for external devices and noats */
if (pci_ats_disabled() || dev->external_facing || dev->untrusted)
- caps->ctrl |= (caps->cap & PCI_ACS_TB);
+ caps->ctrl |= (dev->acs_capabilities & PCI_ACS_TB);
}
/**
* pci_enable_acs - enable ACS if hardware support it
* @dev: the PCI device
*/
-static void pci_enable_acs(struct pci_dev *dev)
+void pci_enable_acs(struct pci_dev *dev)
{
struct pci_acs caps;
bool enable_acs = false;
@@ -1031,7 +1030,6 @@ static void pci_enable_acs(struct pci_dev *dev)
if (!pos)
return;
- pci_read_config_word(dev, pos + PCI_ACS_CAP, &caps.cap);
pci_read_config_word(dev, pos + PCI_ACS_CTRL, &caps.ctrl);
caps.fw_ctrl = caps.ctrl;
@@ -3543,7 +3541,7 @@ void pci_configure_ari(struct pci_dev *dev)
static bool pci_acs_flags_enabled(struct pci_dev *pdev, u16 acs_flags)
{
int pos;
- u16 cap, ctrl;
+ u16 ctrl;
pos = pdev->acs_cap;
if (!pos)
@@ -3554,8 +3552,7 @@ static bool pci_acs_flags_enabled(struct pci_dev *pdev, u16 acs_flags)
* or only required if controllable. Features missing from the
* capability field can therefore be assumed as hard-wired enabled.
*/
- pci_read_config_word(pdev, pos + PCI_ACS_CAP, &cap);
- acs_flags &= (cap | PCI_ACS_EC);
+ acs_flags &= (pdev->acs_capabilities | PCI_ACS_EC);
pci_read_config_word(pdev, pos + PCI_ACS_CTRL, &ctrl);
return (ctrl & acs_flags) == acs_flags;
@@ -3676,7 +3673,14 @@ bool pci_acs_path_enabled(struct pci_dev *start,
*/
void pci_acs_init(struct pci_dev *dev)
{
+ int pos;
+
dev->acs_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+ pos = dev->acs_cap;
+ if (!pos)
+ return;
+
+ pci_read_config_word(dev, pos + PCI_ACS_CAP, &dev->acs_capabilities);
}
void pci_rebar_init(struct pci_dev *pdev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index bf97d49c23cf..c6ee1dfdb0fb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -543,6 +543,7 @@ struct pci_dev {
struct npem *npem; /* Native PCIe Enclosure Management */
#endif
u16 acs_cap; /* ACS Capability offset */
+ u16 acs_capabilities; /* ACS Capabilities */
u8 supported_speeds; /* Supported Link Speeds Vector */
phys_addr_t rom; /* Physical address if not from BAR */
size_t romlen; /* Length if not from BAR */
--
2.48.1
* [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches
2025-12-02 14:22 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 1/4] PCI: Enable ACS only after configuring IOMMU for " Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 2/4] PCI: Cache ACS capabilities Manivannan Sadhasivam
@ 2025-12-02 14:22 ` Manivannan Sadhasivam
2025-12-02 19:15 ` Jason Gunthorpe
2025-12-02 14:22 ` [PATCH v2 4/4] PCI: Extend the pci_disable_acs_sv quirk for one more IDT switch Manivannan Sadhasivam
` (2 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-02 14:22 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Marek Szyprowski, Robin Murphy,
Jason Gunthorpe, Manivannan Sadhasivam, Manivannan Sadhasivam
Some IDT switches behave erratically when ACS Source Validation is enabled.
For example, they incorrectly flag an ACS Source Validation error on
completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
says that completions are never affected by ACS Source Validation.
IDT suggests working around this issue by issuing a config write before the
first config read, so that the device caches the bus and device number. But
this would still be fragile, since the device could lose the IDs after a
reset and any further access may trigger an ACS SV violation.
Hence, to properly fix the issue, the respective capability needs to be
disabled. Since the ACS Capabilities are RO values, and are cached in the
'pci_dev::acs_capabilities' field, add a new field for broken caps, set it
in quirks and use it to remove the broken capabilities in pci_acs_init().
This will allow pci_enable_acs() helper to disable the relevant ACS ctrls.
It should be noted that the quirk should be of the fixup_header level, so
that it gets called before pci_acs_init().
With this, the previous workaround can now be safely removed.
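The interaction between the quirk, the cached capabilities and the enable step can be sketched as follows. This is a simplified standalone model, not the kernel implementation: the ACS_* bit values are assumed to follow the ACS Capability register layout (compare PCI_ACS_* in pci_regs.h), and the model_* functions are hypothetical names.

```c
#include <stdint.h>

/* Assumed bit layout, mirroring the ACS Capability/Control registers. */
#define ACS_SV	0x0001	/* Source Validation */
#define ACS_RR	0x0004	/* P2P Request Redirect */
#define ACS_CR	0x0008	/* P2P Completion Redirect */
#define ACS_UF	0x0010	/* Upstream Forwarding */

/* Hypothetical device model holding only the two fields of interest. */
struct model_dev {
	uint16_t acs_capabilities;	/* cached, filtered capabilities */
	uint16_t acs_broken_cap;	/* capabilities a quirk marked broken */
};

/* Quirk step: must run before init, hence the header-level fixup. */
void model_quirk_disable_sv(struct model_dev *dev)
{
	dev->acs_broken_cap |= ACS_SV;
}

/* Init step: cache the hardware value minus anything marked broken. */
void model_acs_init(struct model_dev *dev, uint16_t hw_cap)
{
	dev->acs_capabilities = (uint16_t)(hw_cap & ~dev->acs_broken_cap);
}

/* Enable step: request only what the filtered cache advertises. */
uint16_t model_std_enable(const struct model_dev *dev, uint16_t ctrl)
{
	ctrl |= dev->acs_capabilities & (ACS_SV | ACS_RR | ACS_CR | ACS_UF);
	return ctrl;
}
```

If the quirk runs first, a switch advertising SV, RR, CR and UF ends up with only RR, CR and UF requested. If the quirk ran after init, the mask would be applied too late, which is why the ordering noted above matters.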
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/pci.c | 1 +
drivers/pci/pci.h | 1 -
drivers/pci/probe.c | 12 -----------
drivers/pci/quirks.c | 61 ++++++++++++----------------------------------------
include/linux/pci.h | 1 +
5 files changed, 16 insertions(+), 60 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 4eb5b487c982..6ed35affea06 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3681,6 +3681,7 @@ void pci_acs_init(struct pci_dev *dev)
return;
pci_read_config_word(dev, pos + PCI_ACS_CAP, &dev->acs_capabilities);
+ dev->acs_capabilities &= ~dev->acs_broken_cap;
}
void pci_rebar_init(struct pci_dev *pdev)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 972b28fc5455..56ba7d60d658 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -430,7 +430,6 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int rrs_timeout);
bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int rrs_timeout);
-int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int rrs_timeout);
int pci_setup_device(struct pci_dev *dev);
void __pci_size_stdbars(struct pci_dev *dev, int count,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 9cd032dff31e..6f8142cf9487 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2517,18 +2517,6 @@ bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
int timeout)
{
-#ifdef CONFIG_PCI_QUIRKS
- struct pci_dev *bridge = bus->self;
-
- /*
- * Certain IDT switches have an issue where they improperly trigger
- * ACS Source Validation errors on completions for config reads.
- */
- if (bridge && bridge->vendor == PCI_VENDOR_ID_IDT &&
- bridge->device == 0x80b5)
- return pci_idt_bus_quirk(bus, devfn, l, timeout);
-#endif
-
return pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
}
EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b9c252aa6fe0..a5956726a49f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5778,59 +5778,26 @@ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
PCI_BASE_CLASS_DISPLAY, 16, quirk_nvidia_hda);
/*
- * Some IDT switches incorrectly flag an ACS Source Validation error on
- * completions for config read requests even though PCIe r4.0, sec
- * 6.12.1.1, says that completions are never affected by ACS Source
- * Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36:
+ * Some IDT switches behave erratically when ACS Source Validation is enabled.
+ * For example, they incorrectly flag an ACS Source Validation error on
+ * completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
+ * says that completions are never affected by ACS Source Validation.
*
- * Item #36 - Downstream port applies ACS Source Validation to Completions
- * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that
- * completions are never affected by ACS Source Validation. However,
- * completions received by a downstream port of the PCIe switch from a
- * device that has not yet captured a PCIe bus number are incorrectly
- * dropped by ACS Source Validation by the switch downstream port.
+ * Even though IDT suggests working around this issue by issuing a config write
+ * before the first config read, so that the switch caches the bus and device
+ * number, it would still be fragile since the device could lose the IDs after
+ * the reset.
*
- * The workaround suggested by IDT is to issue a config write to the
- * downstream device before issuing the first config read. This allows the
- * downstream device to capture its bus and device numbers (see PCIe r4.0,
- * sec 2.2.9), thus avoiding the ACS error on the completion.
- *
- * However, we don't know when the device is ready to accept the config
- * write, so we do config reads until we receive a non-Config Request Retry
- * Status, then do the config write.
- *
- * To avoid hitting the erratum when doing the config reads, we disable ACS
- * SV around this process.
+ * Hence, a reliable fix would be to assume that these switches don't support
+ * ACS SV.
*/
-int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l, int timeout)
+static void pci_disable_acs_sv(struct pci_dev *dev)
{
- int pos;
- u16 ctrl = 0;
- bool found;
- struct pci_dev *bridge = bus->self;
-
- pos = bridge->acs_cap;
-
- /* Disable ACS SV before initial config reads */
- if (pos) {
- pci_read_config_word(bridge, pos + PCI_ACS_CTRL, &ctrl);
- if (ctrl & PCI_ACS_SV)
- pci_write_config_word(bridge, pos + PCI_ACS_CTRL,
- ctrl & ~PCI_ACS_SV);
- }
+ pci_info(dev, "Disabling broken ACS SV\n");
- found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
-
- /* Write Vendor ID (read-only) so the endpoint latches its bus/dev */
- if (found)
- pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
-
- /* Re-enable ACS_SV if it was previously enabled */
- if (ctrl & PCI_ACS_SV)
- pci_write_config_word(bridge, pos + PCI_ACS_CTRL, ctrl);
-
- return found;
+ dev->acs_broken_cap |= PCI_ACS_SV;
}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IDT, 0x80b5, pci_disable_acs_sv);
/*
* Microsemi Switchtec NTB uses devfn proxy IDs to move TLPs between
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c6ee1dfdb0fb..246c0ca34308 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -544,6 +544,7 @@ struct pci_dev {
#endif
u16 acs_cap; /* ACS Capability offset */
u16 acs_capabilities; /* ACS Capabilities */
+ u16 acs_broken_cap; /* Broken ACS Capabilities */
u8 supported_speeds; /* Supported Link Speeds Vector */
phys_addr_t rom; /* Physical address if not from BAR */
size_t romlen; /* Length if not from BAR */
--
2.48.1
* [PATCH v2 4/4] PCI: Extend the pci_disable_acs_sv quirk for one more IDT switch
2025-12-02 14:22 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Manivannan Sadhasivam
` (2 preceding siblings ...)
2025-12-02 14:22 ` [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches Manivannan Sadhasivam
@ 2025-12-02 14:22 ` Manivannan Sadhasivam
2025-12-03 8:46 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Naresh Kamboju
2025-12-03 12:04 ` Marek Szyprowski
5 siblings, 0 replies; 19+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-02 14:22 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Marek Szyprowski, Robin Murphy,
Jason Gunthorpe, Manivannan Sadhasivam, Manivannan Sadhasivam
The IDT switch with Device ID 0x8090 used in the ARM Juno R2 development
board incorrectly raises an ACS Source Validation error on Completions for
Config Read Requests, even though PCIe r6.0, sec 6.12.1.1, says that
Completions are never affected by ACS Source Validation.
This is already handled by the pci_disable_acs_sv() quirk for another IDT
switch, 0x80b5. Hence, extend the quirk to cover this device too.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/quirks.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a5956726a49f..314aacf5a309 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5798,6 +5798,7 @@ static void pci_disable_acs_sv(struct pci_dev *dev)
dev->acs_broken_cap |= PCI_ACS_SV;
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IDT, 0x80b5, pci_disable_acs_sv);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IDT, 0x8090, pci_disable_acs_sv);
/*
* Microsemi Switchtec NTB uses devfn proxy IDs to move TLPs between
--
2.48.1
* Re: [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches
2025-12-02 14:22 ` [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches Manivannan Sadhasivam
@ 2025-12-02 19:15 ` Jason Gunthorpe
2025-12-09 11:20 ` Manivannan Sadhasivam
0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2025-12-02 19:15 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bjorn Helgaas, linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Marek Szyprowski, Robin Murphy,
Manivannan Sadhasivam
On Tue, Dec 02, 2025 at 07:52:50PM +0530, Manivannan Sadhasivam wrote:
> @@ -544,6 +544,7 @@ struct pci_dev {
> #endif
> u16 acs_cap; /* ACS Capability offset */
> u16 acs_capabilities; /* ACS Capabilities */
> + u16 acs_broken_cap; /* Broken ACS Capabilities */
Why do we need this? Have the quirk function accept the
acs_capabilities from the register and return the value to program
into struct pci_dev?
Jason
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-02 14:22 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Manivannan Sadhasivam
` (3 preceding siblings ...)
2025-12-02 14:22 ` [PATCH v2 4/4] PCI: Extend the pci_disable_acs_sv quirk for one more IDT switch Manivannan Sadhasivam
@ 2025-12-03 8:46 ` Naresh Kamboju
2025-12-03 12:04 ` Marek Szyprowski
5 siblings, 0 replies; 19+ messages in thread
From: Naresh Kamboju @ 2025-12-03 8:46 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bjorn Helgaas, linux-pci, linux-kernel, iommu, Pavankumar Kondeti,
Xingang Wang, Marek Szyprowski, Robin Murphy, Jason Gunthorpe,
Manivannan Sadhasivam
On Tue, 2 Dec 2025 at 19:53, Manivannan Sadhasivam
<manivannan.sadhasivam@oss.qualcomm.com> wrote:
>
> Hi,
>
> This series fixes the long standing issue with ACS in OF platforms. There are
> two fixes in this series, both fixing independent issues on their own, but both
> are needed to properly enable ACS on OF platforms.
>
> Issue(s) background
> ===================
>
> Back in 2021, Xingang Wang first noted a failure in attaching the HiSilicon SEC
> device to QEMU ARM64 pci-root-port device [1]. He then tracked down the issue to
> ACS not being enabled for the QEMU Root Port device and he proposed a patch to
> fix it [2].
>
> Once the patch got applied, people reported PCIe issues with linux-next on the
> ARM Juno Development boards, where they saw failure in enumerating the endpoint
> devices [3][4]. So soon, the patch got dropped, but the actual issue with the
> ARM Juno boards was left behind.
>
> Fast forward to 2024, Pavan resubmitted the same fix [5] for his own usecase,
> hoping that someone in the community would fix the issue with ARM Juno boards.
> But the patch was rightly rejected, as a patch that was known to cause issues
> should not be merged to the kernel. But again, no one investigated the Juno
> issue and it was left behind again.
>
> Now it ended up in my plate and I managed to track down the issue with the help
> of Naresh who got access to the Juno boards in LKFT. The Juno issue was with the
> PCIe switch from Microsemi/IDT, which triggers ACS Source Validation error on
> Completions received for the Configuration Read Request from a device connected
> to the downstream port that has not yet captured the PCIe bus number. As per the
> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and Device Numbers
> supplied with all Type 0 Configuration Write Requests completed by the Function
> and supply these numbers in the Bus and Device Number fields of the Requester ID
> for all Requests". So during the first Configuration Read Request issued by the
> switch downstream port during enumeration (for reading Vendor ID), Bus and
> Device numbers will be unknown to the device. So it responds to the Read Request
> with Completion having Bus and Device number as 0. The switch interprets the
> Completion as an ACS Source Validation error and drops the completion, leading
> to the failure in detecting the endpoint device. Though the PCIe spec r6.0, sec
> 6.12.1.1, states that "Completions are never affected by ACS Source Validation".
> This behavior is in violation of the spec.
>
> Solution
> ========
>
> In September, I submitted a series [6] to fix both issues. For the IDT issue,
> I reused the existing quirk in the PCI core which does a dummy config write
> before issuing the first config read to the device. And for the ACS enablement
> issue, I just resubmitted the original patch from Xingang which called
> pci_request_acs() from devm_of_pci_bridge_init().
>
> But during the review of the series, several comments were received and they
> required the series to be reworked completely. Hence, in this version, I've
> incorported the comments as below:
>
> 1. For the ACS enablement issue, I've moved the pci_enable_acs() call from
> pci_acs_init() to pci_dma_configure().
>
> 2. For the IDT issue, I've cached the ACS capabilities (RO) in 'pci_dev',
> collected the broken capability for the IDT switches in the quirk and used it to
> disable the capability in the cache. This also allowed me to get rid of the
> earlier workaround for the switch.
>
> [1] https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
> [2] https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
> [3] https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
> [4] https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
> [5] https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
> [6] https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
>
> Changes in v2:
>
> * Reworked the patches completely as mentioned above.
> * Rebased on top of v6.18-rc7
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> ---
> Manivannan Sadhasivam (4):
> PCI: Enable ACS only after configuring IOMMU for OF platforms
> PCI: Cache ACS capabilities
> PCI: Disable ACS SV capability for the broken IDT switches
> PCI: Extend the pci_disable_acs_sv quirk for one more IDT switch
>
> drivers/pci/pci-driver.c | 8 +++++++
> drivers/pci/pci.c | 33 ++++++++++++--------------
> drivers/pci/pci.h | 2 +-
> drivers/pci/probe.c | 12 ----------
> drivers/pci/quirks.c | 62 ++++++++++++------------------------------------
> include/linux/pci.h | 2 ++
> 6 files changed, 41 insertions(+), 78 deletions(-)
> ---
> base-commit: ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d
> change-id: 20251201-pci_acs-b15aa3947289
>
> Best regards,
> --
> Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-02 14:22 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Manivannan Sadhasivam
` (4 preceding siblings ...)
2025-12-03 8:46 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Naresh Kamboju
@ 2025-12-03 12:04 ` Marek Szyprowski
2025-12-04 13:13 ` Marek Szyprowski
5 siblings, 1 reply; 19+ messages in thread
From: Marek Szyprowski @ 2025-12-03 12:04 UTC (permalink / raw)
To: Manivannan Sadhasivam, Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Robin Murphy, Jason Gunthorpe,
Manivannan Sadhasivam
On 02.12.2025 15:22, Manivannan Sadhasivam wrote:
> This series fixes the long standing issue with ACS in OF platforms. There are
> two fixes in this series, both fixing independent issues on their own, but both
> are needed to properly enable ACS on OF platforms.
>
> Issue(s) background
> ===================
>
> Back in 2021, Xingang Wang first noted a failure in attaching the HiSilicon SEC
> device to QEMU ARM64 pci-root-port device [1]. He then tracked down the issue to
> ACS not being enabled for the QEMU Root Port device and he proposed a patch to
> fix it [2].
>
> Once the patch got applied, people reported PCIe issues with linux-next on the
> ARM Juno Development boards, where they saw failure in enumerating the endpoint
> devices [3][4]. So soon, the patch got dropped, but the actual issue with the
> ARM Juno boards was left behind.
>
> Fast forward to 2024, Pavan resubmitted the same fix [5] for his own usecase,
> hoping that someone in the community would fix the issue with ARM Juno boards.
> But the patch was rightly rejected, as a patch that was known to cause issues
> should not be merged to the kernel. But again, no one investigated the Juno
> issue and it was left behind again.
>
> Now it ended up in my plate and I managed to track down the issue with the help
> of Naresh who got access to the Juno boards in LKFT. The Juno issue was with the
> PCIe switch from Microsemi/IDT, which triggers ACS Source Validation error on
> Completions received for the Configuration Read Request from a device connected
> to the downstream port that has not yet captured the PCIe bus number. As per the
> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and Device Numbers
> supplied with all Type 0 Configuration Write Requests completed by the Function
> and supply these numbers in the Bus and Device Number fields of the Requester ID
> for all Requests". So during the first Configuration Read Request issued by the
> switch downstream port during enumeration (for reading Vendor ID), Bus and
> Device numbers will be unknown to the device. So it responds to the Read Request
> with Completion having Bus and Device number as 0. The switch interprets the
> Completion as an ACS Source Validation error and drops the completion, leading
> to the failure in detecting the endpoint device. Though the PCIe spec r6.0, sec
> 6.12.1.1, states that "Completions are never affected by ACS Source Validation".
> This behavior is in violation of the spec.
>
> Solution
> ========
>
> In September, I submitted a series [6] to fix both issues. For the IDT issue,
> I reused the existing quirk in the PCI core which does a dummy config write
> before issuing the first config read to the device. And for the ACS enablement
> issue, I just resubmitted the original patch from Xingang which called
> pci_request_acs() from devm_of_pci_bridge_init().
>
> But during the review of the series, several comments were received and they
> required the series to be reworked completely. Hence, in this version, I've
> incorporated the comments as below:
>
> 1. For the ACS enablement issue, I've moved the pci_enable_acs() call from
> pci_acs_init() to pci_dma_configure().
>
> 2. For the IDT issue, I've cached the ACS capabilities (RO) in 'pci_dev',
> collected the broken capability for the IDT switches in the quirk and used it to
> disable the capability in the cache. This also allowed me to get rid of the
> earlier workaround for the switch.
>
> [1] https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
> [2] https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
> [3] https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
> [4] https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
> [5] https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
> [6] https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
>
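The failure mode and the second fix quoted above can be pictured with a small, self-contained C sketch. This is not code from the series: the `ex_port` structure and the `ex_*` helpers are hypothetical stand-ins, and only the bit values are meant to mirror the `PCI_ACS_*` definitions in include/uapi/linux/pci_regs.h. It assumes the broken switch applies the Source Validation range check to a Completion carrying bus number 0, and that the quirk clears the SV bit from a cached capability word so that the enable path never requests it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ACS capability bits, mirroring the values in include/uapi/linux/pci_regs.h. */
#define EX_ACS_SV 0x0001 /* Source Validation */
#define EX_ACS_RR 0x0004 /* P2P Request Redirect */
#define EX_ACS_CR 0x0008 /* P2P Completion Redirect */
#define EX_ACS_UF 0x0010 /* Upstream Forwarding */

/* Hypothetical stand-in for the few fields of 'struct pci_dev' used here. */
struct ex_port {
	uint16_t acs_caps;   /* cached copy of the read-only ACS Capability bits */
	uint8_t secondary;   /* first bus number below this downstream port */
	uint8_t subordinate; /* last bus number below this downstream port */
};

/* Source Validation check: the Requester ID bus must lie in the port's range. */
static bool ex_sv_in_range(const struct ex_port *p, uint8_t requester_bus)
{
	return requester_bus >= p->secondary && requester_bus <= p->subordinate;
}

/* Quirk step: drop the broken capability from the cache, not from hardware. */
static void ex_quirk_disable_sv(struct ex_port *p)
{
	p->acs_caps &= (uint16_t)~EX_ACS_SV;
}

/* Enable step: only bits still present in the cache reach the control value. */
static uint16_t ex_acs_ctrl(const struct ex_port *p, uint16_t wanted)
{
	return wanted & p->acs_caps;
}
```

With a downstream port whose bus range is 3..3, `ex_sv_in_range()` rejects bus 0, which is the value an endpoint reports before it has captured its bus number. Once `ex_quirk_disable_sv()` has run, `ex_acs_ctrl()` no longer returns the SV bit, whatever the caller asks for.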
Thanks for this patchset! I've tested it on my ARM Juno R1 and it looks
like it almost works fine. This patchset even fixed some issues with PCI
device probing, as I again see the SATA and GBit Ethernet devices, which
had been missing since Linux v6.14 (it looks like I also missed this in my
tests).
# lspci
00:00.0 PCI bridge: PLDA PCI Express Core Reference Design (rev 01)
01:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)
02:01.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)
02:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)
02:03.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)
02:0c.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)
02:10.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)
02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)
03:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057 PCI-E Gigabit Ethernet Controller
However, there is also a regression. After applying this patchset, system
suspend/resume stopped working. This is probably related to this message:
pcieport 0000:02:1f.0: Unable to change power state from D0 to D3hot, device inaccessible
which appears after calling 'rtcwake -s10 -mmem'. This might not be
related to this patchset, so I probably need to apply it on older kernel
releases and check.
> Changes in v2:
>
> * Reworked the patches completely as mentioned above.
> * Rebased on top of v6.18-rc7
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> Manivannan Sadhasivam (4):
> PCI: Enable ACS only after configuring IOMMU for OF platforms
> PCI: Cache ACS capabilities
> PCI: Disable ACS SV capability for the broken IDT switches
> PCI: Extend the pci_disable_acs_sv quirk for one more IDT switch
>
> drivers/pci/pci-driver.c | 8 +++++++
> drivers/pci/pci.c | 33 ++++++++++++--------------
> drivers/pci/pci.h | 2 +-
> drivers/pci/probe.c | 12 ----------
> drivers/pci/quirks.c | 62 ++++++++++++------------------------------------
> include/linux/pci.h | 2 ++
> 6 files changed, 41 insertions(+), 78 deletions(-)
> ---
> base-commit: ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d
> change-id: 20251201-pci_acs-b15aa3947289
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-03 12:04 ` Marek Szyprowski
@ 2025-12-04 13:13 ` Marek Szyprowski
2025-12-09 7:31 ` Marek Szyprowski
0 siblings, 1 reply; 19+ messages in thread
From: Marek Szyprowski @ 2025-12-04 13:13 UTC (permalink / raw)
To: Manivannan Sadhasivam, Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Robin Murphy, Jason Gunthorpe,
Manivannan Sadhasivam
On 03.12.2025 13:04, Marek Szyprowski wrote:
> On 02.12.2025 15:22, Manivannan Sadhasivam wrote:
>> This series fixes the long standing issue with ACS in OF platforms.
>> There are
>> two fixes in this series, both fixing independent issues on their
>> own, but both
>> are needed to properly enable ACS on OF platforms.
>>
>> Issue(s) background
>> ===================
>>
>> Back in 2021, Xingang Wang first noted a failure in attaching the
>> HiSilicon SEC
>> device to QEMU ARM64 pci-root-port device [1]. He then tracked down
>> the issue to
>> ACS not being enabled for the QEMU Root Port device and he proposed a
>> patch to
>> fix it [2].
>>
>> Once the patch got applied, people reported PCIe issues with
>> linux-next on the
>> ARM Juno Development boards, where they saw failure in enumerating
>> the endpoint
>> devices [3][4]. So soon, the patch got dropped, but the actual issue
>> with the
>> ARM Juno boards was left behind.
>>
>> Fast forward to 2024, Pavan resubmitted the same fix [5] for his own
>> usecase,
>> hoping that someone in the community would fix the issue with ARM
>> Juno boards.
>> But the patch was rightly rejected, as a patch that was known to
>> cause issues
>> should not be merged to the kernel. But again, no one investigated
>> the Juno
>> issue and it was left behind again.
>>
>> Now it ended up in my plate and I managed to track down the issue
>> with the help
>> of Naresh who got access to the Juno boards in LKFT. The Juno issue
>> was with the
>> PCIe switch from Microsemi/IDT, which triggers ACS Source Validation
>> error on
>> Completions received for the Configuration Read Request from a device
>> connected
>> to the downstream port that has not yet captured the PCIe bus number.
>> As per the
>> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and
>> Device Numbers
>> supplied with all Type 0 Configuration Write Requests completed by
>> the Function
>> and supply these numbers in the Bus and Device Number fields of the
>> Requester ID
>> for all Requests". So during the first Configuration Read Request
>> issued by the
>> switch downstream port during enumeration (for reading Vendor ID),
>> Bus and
>> Device numbers will be unknown to the device. So it responds to the
>> Read Request
>> with Completion having Bus and Device number as 0. The switch
>> interprets the
>> Completion as an ACS Source Validation error and drops the
>> completion, leading
>> to the failure in detecting the endpoint device. Though the PCIe spec
>> r6.0, sec
>> 6.12.1.1, states that "Completions are never affected by ACS Source
>> Validation".
>> This behavior is in violation of the spec.
>>
>> Solution
>> ========
>>
>> In September, I submitted a series [6] to fix both issues. For the
>> IDT issue,
>> I reused the existing quirk in the PCI core which does a dummy config
>> write
>> before issuing the first config read to the device. And for the ACS
>> enablement
>> issue, I just resubmitted the original patch from Xingang which called
>> pci_request_acs() from devm_of_pci_bridge_init().
>>
>> But during the review of the series, several comments were received
>> and they
>> required the series to be reworked completely. Hence, in this
>> version, I've
>> incorporated the comments as below:
>>
>> 1. For the ACS enablement issue, I've moved the pci_enable_acs() call
>> from
>> pci_acs_init() to pci_dma_configure().
>>
>> 2. For the IDT issue, I've cached the ACS capabilities (RO) in
>> 'pci_dev',
>> collected the broken capability for the IDT switches in the quirk and
>> used it to
>> disable the capability in the cache. This also allowed me to get rid
>> of the
>> earlier workaround for the switch.
>>
>> [1]
>> https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
>> [2]
>> https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
>> [3]
>> https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
>> [4]
>> https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
>> [5]
>> https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
>> [6]
>> https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
>>
> Thanks for this patchset! I've tested it on my ARM Juno R1 and it
> looks that it almost works fine. This patchset even fixed some issues
> with PCI devices probe, as I again see SATA and GBit ethernet devices,
> which were missing since Linux v6.14 (it looks that I've also missed
> this in my tests).
>
> # lspci
> 00:00.0 PCI bridge: PLDA PCI Express Core Reference Design (rev 01)
> 01:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> 8090 (rev 02)
> 02:01.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> 8090 (rev 02)
> 02:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> 8090 (rev 02)
> 02:03.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> 8090 (rev 02)
> 02:0c.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> 8090 (rev 02)
> 02:10.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> 8090 (rev 02)
> 02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> 8090 (rev 02)
> 03:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial
> ATA Raid II Controller (rev 01)
> 08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057
> PCI-E Gigabit Ethernet Controller
>
> However there is also a regression. After applying this patchset
> system suspend/resume stopped working. This is probably related to
> this message:
>
> pcieport 0000:02:1f.0: Unable to change power state from D0 to D3hot,
> device inaccessible
>
> which appears after calling 'rtcwake -s10 -mmem'. This might not be
> related to this patchset, so I probably need to apply it on older
> kernel releases and check.
One more piece of information: I've applied this patchset on top of v6.16
and it works perfectly on ARM Juno R1. SATA and GBit Ethernet are
visible again and system suspend/resume works too, so the suspend/resume
issue on top of v6.18 does not seem to be directly related to the $subject
patchset. I will try to bisect this issue when I have some spare time.
Feel free to add:
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-04 13:13 ` Marek Szyprowski
@ 2025-12-09 7:31 ` Marek Szyprowski
2025-12-09 8:28 ` Marek Szyprowski
0 siblings, 1 reply; 19+ messages in thread
From: Marek Szyprowski @ 2025-12-09 7:31 UTC (permalink / raw)
To: Manivannan Sadhasivam, Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Robin Murphy, Jason Gunthorpe,
Manivannan Sadhasivam
On 04.12.2025 14:13, Marek Szyprowski wrote:
> On 03.12.2025 13:04, Marek Szyprowski wrote:
>> On 02.12.2025 15:22, Manivannan Sadhasivam wrote:
>>> This series fixes the long standing issue with ACS in OF platforms.
>>> There are
>>> two fixes in this series, both fixing independent issues on their
>>> own, but both
>>> are needed to properly enable ACS on OF platforms.
>>>
>>> Issue(s) background
>>> ===================
>>>
>>> Back in 2021, Xingang Wang first noted a failure in attaching the
>>> HiSilicon SEC
>>> device to QEMU ARM64 pci-root-port device [1]. He then tracked down
>>> the issue to
>>> ACS not being enabled for the QEMU Root Port device and he proposed
>>> a patch to
>>> fix it [2].
>>>
>>> Once the patch got applied, people reported PCIe issues with
>>> linux-next on the
>>> ARM Juno Development boards, where they saw failure in enumerating
>>> the endpoint
>>> devices [3][4]. So soon, the patch got dropped, but the actual issue
>>> with the
>>> ARM Juno boards was left behind.
>>>
>>> Fast forward to 2024, Pavan resubmitted the same fix [5] for his own
>>> usecase,
>>> hoping that someone in the community would fix the issue with ARM
>>> Juno boards.
>>> But the patch was rightly rejected, as a patch that was known to
>>> cause issues
>>> should not be merged to the kernel. But again, no one investigated
>>> the Juno
>>> issue and it was left behind again.
>>>
>>> Now it ended up in my plate and I managed to track down the issue
>>> with the help
>>> of Naresh who got access to the Juno boards in LKFT. The Juno issue
>>> was with the
>>> PCIe switch from Microsemi/IDT, which triggers ACS Source Validation
>>> error on
>>> Completions received for the Configuration Read Request from a
>>> device connected
>>> to the downstream port that has not yet captured the PCIe bus
>>> number. As per the
>>> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and
>>> Device Numbers
>>> supplied with all Type 0 Configuration Write Requests completed by
>>> the Function
>>> and supply these numbers in the Bus and Device Number fields of the
>>> Requester ID
>>> for all Requests". So during the first Configuration Read Request
>>> issued by the
>>> switch downstream port during enumeration (for reading Vendor ID),
>>> Bus and
>>> Device numbers will be unknown to the device. So it responds to the
>>> Read Request
>>> with Completion having Bus and Device number as 0. The switch
>>> interprets the
>>> Completion as an ACS Source Validation error and drops the
>>> completion, leading
>>> to the failure in detecting the endpoint device. Though the PCIe
>>> spec r6.0, sec
>>> 6.12.1.1, states that "Completions are never affected by ACS Source
>>> Validation".
>>> This behavior is in violation of the spec.
>>>
>>> Solution
>>> ========
>>>
>>> In September, I submitted a series [6] to fix both issues. For the
>>> IDT issue,
>>> I reused the existing quirk in the PCI core which does a dummy
>>> config write
>>> before issuing the first config read to the device. And for the ACS
>>> enablement
>>> issue, I just resubmitted the original patch from Xingang which called
>>> pci_request_acs() from devm_of_pci_bridge_init().
>>>
>>> But during the review of the series, several comments were received
>>> and they
>>> required the series to be reworked completely. Hence, in this
>>> version, I've
>>> incorporated the comments as below:
>>>
>>> 1. For the ACS enablement issue, I've moved the pci_enable_acs()
>>> call from
>>> pci_acs_init() to pci_dma_configure().
>>>
>>> 2. For the IDT issue, I've cached the ACS capabilities (RO) in
>>> 'pci_dev',
>>> collected the broken capability for the IDT switches in the quirk
>>> and used it to
>>> disable the capability in the cache. This also allowed me to get rid
>>> of the
>>> earlier workaround for the switch.
>>>
>>> [1]
>>> https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
>>> [2]
>>> https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
>>> [3]
>>> https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
>>> [4]
>>> https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
>>> [5]
>>> https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
>>> [6]
>>> https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
>>>
>> Thanks for this patchset! I've tested it on my ARM Juno R1 and it
>> looks that it almost works fine. This patchset even fixed some issues
>> with PCI devices probe, as I again see SATA and GBit ethernet
>> devices, which were missing since Linux v6.14 (it looks that
>> I've also missed this in my tests).
>>
>> # lspci
>> 00:00.0 PCI bridge: PLDA PCI Express Core Reference Design (rev 01)
>> 01:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>> 8090 (rev 02)
>> 02:01.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>> 8090 (rev 02)
>> 02:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>> 8090 (rev 02)
>> 02:03.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>> 8090 (rev 02)
>> 02:0c.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>> 8090 (rev 02)
>> 02:10.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>> 8090 (rev 02)
>> 02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>> 8090 (rev 02)
>> 03:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial
>> ATA Raid II Controller (rev 01)
>> 08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057
>> PCI-E Gigabit Ethernet Controller
>>
>> However there is also a regression. After applying this patchset
>> system suspend/resume stopped working. This is probably related to
>> this message:
>>
>> pcieport 0000:02:1f.0: Unable to change power state from D0 to D3hot,
>> device inaccessible
>>
>> which appears after calling 'rtcwake -s10 -mmem'. This might not be
>> related to this patchset, so I probably need to apply it on older
>> kernel releases and check.
>
>
> Just one more information - I've applied this patchset on top of v6.16
> and it works perfectly on ARM Juno R1. SATA and GBit ethernet are
> visible again and system suspend/resume works too, so the issue with
> the latter on top of v6.18 seems not to be directly related to
> $subject patchset. I will try to bisect this issue when I have some
> spare time.
>
> Feel free to add:
>
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
I spent some time analyzing this regression on Juno R1 and found that:
1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
("iommu: Get DT/ACPI parsing into the proper probe path"), merged in
v6.15-rc1.
2. With the $subject patchset applied to enable SATA and GBit Ethernet
again, system suspend/resume stopped working after commit f3ac2ff14834
("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
platforms"), merged in v6.18-rc1.
If I got it right, according to the latter commit message, some quirks
have to be added to fix the suspend/resume issue. Unfortunately, I have
no idea whether this issue is specific to the Juno R1 or to the given PCI
devices.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-09 7:31 ` Marek Szyprowski
@ 2025-12-09 8:28 ` Marek Szyprowski
2025-12-09 11:15 ` Manivannan Sadhasivam
0 siblings, 1 reply; 19+ messages in thread
From: Marek Szyprowski @ 2025-12-09 8:28 UTC (permalink / raw)
To: Manivannan Sadhasivam, Bjorn Helgaas
Cc: linux-pci, linux-kernel, iommu, Naresh Kamboju,
Pavankumar Kondeti, Xingang Wang, Robin Murphy, Jason Gunthorpe,
Manivannan Sadhasivam
On 09.12.2025 08:31, Marek Szyprowski wrote:
> On 04.12.2025 14:13, Marek Szyprowski wrote:
>> On 03.12.2025 13:04, Marek Szyprowski wrote:
>>> On 02.12.2025 15:22, Manivannan Sadhasivam wrote:
>>>> This series fixes the long standing issue with ACS in OF platforms.
>>>> There are
>>>> two fixes in this series, both fixing independent issues on their
>>>> own, but both
>>>> are needed to properly enable ACS on OF platforms.
>>>>
>>>> Issue(s) background
>>>> ===================
>>>>
>>>> Back in 2021, Xingang Wang first noted a failure in attaching the
>>>> HiSilicon SEC
>>>> device to QEMU ARM64 pci-root-port device [1]. He then tracked down
>>>> the issue to
>>>> ACS not being enabled for the QEMU Root Port device and he proposed
>>>> a patch to
>>>> fix it [2].
>>>>
>>>> Once the patch got applied, people reported PCIe issues with
>>>> linux-next on the
>>>> ARM Juno Development boards, where they saw failure in enumerating
>>>> the endpoint
>>>> devices [3][4]. So soon, the patch got dropped, but the actual
>>>> issue with the
>>>> ARM Juno boards was left behind.
>>>>
>>>> Fast forward to 2024, Pavan resubmitted the same fix [5] for his
>>>> own usecase,
>>>> hoping that someone in the community would fix the issue with ARM
>>>> Juno boards.
>>>> But the patch was rightly rejected, as a patch that was known to
>>>> cause issues
>>>> should not be merged to the kernel. But again, no one investigated
>>>> the Juno
>>>> issue and it was left behind again.
>>>>
>>>> Now it ended up in my plate and I managed to track down the issue
>>>> with the help
>>>> of Naresh who got access to the Juno boards in LKFT. The Juno issue
>>>> was with the
>>>> PCIe switch from Microsemi/IDT, which triggers ACS Source
>>>> Validation error on
>>>> Completions received for the Configuration Read Request from a
>>>> device connected
>>>> to the downstream port that has not yet captured the PCIe bus
>>>> number. As per the
>>>> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and
>>>> Device Numbers
>>>> supplied with all Type 0 Configuration Write Requests completed by
>>>> the Function
>>>> and supply these numbers in the Bus and Device Number fields of the
>>>> Requester ID
>>>> for all Requests". So during the first Configuration Read Request
>>>> issued by the
>>>> switch downstream port during enumeration (for reading Vendor ID),
>>>> Bus and
>>>> Device numbers will be unknown to the device. So it responds to the
>>>> Read Request
>>>> with Completion having Bus and Device number as 0. The switch
>>>> interprets the
>>>> Completion as an ACS Source Validation error and drops the
>>>> completion, leading
>>>> to the failure in detecting the endpoint device. Though the PCIe
>>>> spec r6.0, sec
>>>> 6.12.1.1, states that "Completions are never affected by ACS Source
>>>> Validation".
>>>> This behavior is in violation of the spec.
>>>>
>>>> Solution
>>>> ========
>>>>
>>>> In September, I submitted a series [6] to fix both issues. For the
>>>> IDT issue,
>>>> I reused the existing quirk in the PCI core which does a dummy
>>>> config write
>>>> before issuing the first config read to the device. And for the ACS
>>>> enablement
>>>> issue, I just resubmitted the original patch from Xingang which called
>>>> pci_request_acs() from devm_of_pci_bridge_init().
>>>>
>>>> But during the review of the series, several comments were received
>>>> and they
>>>> required the series to be reworked completely. Hence, in this
>>>> version, I've
>>>> incorporated the comments as below:
>>>>
>>>> 1. For the ACS enablement issue, I've moved the pci_enable_acs()
>>>> call from
>>>> pci_acs_init() to pci_dma_configure().
>>>>
>>>> 2. For the IDT issue, I've cached the ACS capabilities (RO) in
>>>> 'pci_dev',
>>>> collected the broken capability for the IDT switches in the quirk
>>>> and used it to
>>>> disable the capability in the cache. This also allowed me to get
>>>> rid of the
>>>> earlier workaround for the switch.
>>>>
>>>> [1]
>>>> https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
>>>> [2]
>>>> https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
>>>> [3]
>>>> https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
>>>> [4]
>>>> https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
>>>> [5]
>>>> https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
>>>> [6]
>>>> https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
>>>>
>>> Thanks for this patchset! I've tested it on my ARM Juno R1 and it
>>> looks that it almost works fine. This patchset even fixed some
>>> issues with PCI devices probe, as I again see SATA and GBit ethernet
>>> devices, which were missing since Linux v6.14 (it looks that
>>> I've also missed this in my tests).
>>>
>>> # lspci
>>> 00:00.0 PCI bridge: PLDA PCI Express Core Reference Design (rev 01)
>>> 01:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>> 8090 (rev 02)
>>> 02:01.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>> 8090 (rev 02)
>>> 02:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>> 8090 (rev 02)
>>> 02:03.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>> 8090 (rev 02)
>>> 02:0c.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>> 8090 (rev 02)
>>> 02:10.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>> 8090 (rev 02)
>>> 02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>> 8090 (rev 02)
>>> 03:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial
>>> ATA Raid II Controller (rev 01)
>>> 08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057
>>> PCI-E Gigabit Ethernet Controller
>>>
>>> However there is also a regression. After applying this patchset
>>> system suspend/resume stopped working. This is probably related to
>>> this message:
>>>
>>> pcieport 0000:02:1f.0: Unable to change power state from D0 to
>>> D3hot, device inaccessible
>>>
>>> which appears after calling 'rtcwake -s10 -mmem'. This might not be
>>> related to this patchset, so I probably need to apply it on older
>>> kernel releases and check.
>>
>>
>> Just one more information - I've applied this patchset on top of
>> v6.16 and it works perfectly on ARM Juno R1. SATA and GBit ethernet
>> are visible again and system suspend/resume works too, so the issue
>> with the latter on top of v6.18 seems not to be directly related to
>> $subject patchset. I will try to bisect this issue when I have some
>> spare time.
>>
>> Feel free to add:
>>
>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>
>
> I spent some time analyzing this regression on Juno R1 and found that:
>
> 1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
> ("iommu: Get DT/ACPI parsing into the proper probe path") merged to
> v6.15-rc1.
>
> 2. With $subject patch applied to enable SATA & GBit ethernet again,
> system suspend/resume stopped working after commit f3ac2ff14834
> ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
> platforms") merged to v6.18-rc1.
>
> If I got it right, according to the latter commit message, some quirks
> have to be added to fix the suspend/resume issue. Unfortunately I have
> no idea if this is the Juno R1 or the given PCI devices specific issue.
One more note: commit df5192d9bb0e ("PCI/ASPM: Enable only L0s and
L1 for devicetree platforms") doesn't fix the suspend/resume issue
either (with the $subject patchset applied on top of it).
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-09 8:28 ` Marek Szyprowski
@ 2025-12-09 11:15 ` Manivannan Sadhasivam
2025-12-09 12:00 ` Marek Szyprowski
0 siblings, 1 reply; 19+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-09 11:15 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Robin Murphy, Jason Gunthorpe
On Tue, Dec 09, 2025 at 09:28:38AM +0100, Marek Szyprowski wrote:
> On 09.12.2025 08:31, Marek Szyprowski wrote:
> > On 04.12.2025 14:13, Marek Szyprowski wrote:
> >> On 03.12.2025 13:04, Marek Szyprowski wrote:
> >>> On 02.12.2025 15:22, Manivannan Sadhasivam wrote:
> >>>> This series fixes the long standing issue with ACS in OF platforms.
> >>>> There are
> >>>> two fixes in this series, both fixing independent issues on their
> >>>> own, but both
> >>>> are needed to properly enable ACS on OF platforms.
> >>>>
> >>>> Issue(s) background
> >>>> ===================
> >>>>
> >>>> Back in 2021, Xingang Wang first noted a failure in attaching the
> >>>> HiSilicon SEC
> >>>> device to QEMU ARM64 pci-root-port device [1]. He then tracked down
> >>>> the issue to
> >>>> ACS not being enabled for the QEMU Root Port device and he proposed
> >>>> a patch to
> >>>> fix it [2].
> >>>>
> >>>> Once the patch got applied, people reported PCIe issues with
> >>>> linux-next on the
> >>>> ARM Juno Development boards, where they saw failure in enumerating
> >>>> the endpoint
> >>>> devices [3][4]. So soon, the patch got dropped, but the actual
> >>>> issue with the
> >>>> ARM Juno boards was left behind.
> >>>>
> >>>> Fast forward to 2024, Pavan resubmitted the same fix [5] for his
> >>>> own usecase,
> >>>> hoping that someone in the community would fix the issue with ARM
> >>>> Juno boards.
> >>>> But the patch was rightly rejected, as a patch that was known to
> >>>> cause issues
> >>>> should not be merged to the kernel. But again, no one investigated
> >>>> the Juno
> >>>> issue and it was left behind again.
> >>>>
> >>>> Now it ended up in my plate and I managed to track down the issue
> >>>> with the help
> >>>> of Naresh who got access to the Juno boards in LKFT. The Juno issue
> >>>> was with the
> >>>> PCIe switch from Microsemi/IDT, which triggers ACS Source
> >>>> Validation error on
> >>>> Completions received for the Configuration Read Request from a
> >>>> device connected
> >>>> to the downstream port that has not yet captured the PCIe bus
> >>>> number. As per the
> >>>> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and
> >>>> Device Numbers
> >>>> supplied with all Type 0 Configuration Write Requests completed by
> >>>> the Function
> >>>> and supply these numbers in the Bus and Device Number fields of the
> >>>> Requester ID
> >>>> for all Requests". So during the first Configuration Read Request
> >>>> issued by the
> >>>> switch downstream port during enumeration (for reading Vendor ID),
> >>>> Bus and
> >>>> Device numbers will be unknown to the device. So it responds to the
> >>>> Read Request
> >>>> with Completion having Bus and Device number as 0. The switch
> >>>> interprets the
> >>>> Completion as an ACS Source Validation error and drops the
> >>>> completion, leading
> >>>> to the failure in detecting the endpoint device. Though the PCIe
> >>>> spec r6.0, sec
> >>>> 6.12.1.1, states that "Completions are never affected by ACS Source
> >>>> Validation".
> >>>> This behavior is in violation of the spec.
> >>>>
> >>>> Solution
> >>>> ========
> >>>>
> >>>> In September, I submitted a series [6] to fix both issues. For the
> >>>> IDT issue,
> >>>> I reused the existing quirk in the PCI core which does a dummy
> >>>> config write
> >>>> before issuing the first config read to the device. And for the ACS
> >>>> enablement
> >>>> issue, I just resubmitted the original patch from Xingang which called
> >>>> pci_request_acs() from devm_of_pci_bridge_init().
> >>>>
> >>>> But during the review of the series, several comments were received
> >>>> and they
> >>>> required the series to be reworked completely. Hence, in this
> >>>> version, I've
> >>>> incorporated the comments as below:
> >>>>
> >>>> 1. For the ACS enablement issue, I've moved the pci_enable_acs()
> >>>> call from
> >>>> pci_acs_init() to pci_dma_configure().
> >>>>
> >>>> 2. For the IDT issue, I've cached the ACS capabilities (RO) in
> >>>> 'pci_dev',
> >>>> collected the broken capability for the IDT switches in the quirk
> >>>> and used it to
> >>>> disable the capability in the cache. This also allowed me to get
> >>>> rid of the
> >>>> earlier workaround for the switch.
> >>>>
> >>>> [1]
> >>>> https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
> >>>> [2]
> >>>> https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
> >>>> [3]
> >>>> https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
> >>>> [4]
> >>>> https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
> >>>> [5]
> >>>> https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
> >>>> [6]
> >>>> https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
> >>>>
> >>> Thanks for this patchset! I've tested it on my ARM Juno R1 and it
> >>> looks that it almost works fine. This patchset even fixed some
> >>> issues with PCI devices probe, as I again see SATA and GBit ethernet
> >>> devices, which were missing since Linux v6.14 (it looks that
> >>> I've also missed this in my tests).
> >>>
> >>> # lspci
> >>> 00:00.0 PCI bridge: PLDA PCI Express Core Reference Design (rev 01)
> >>> 01:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>> 8090 (rev 02)
> >>> 02:01.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>> 8090 (rev 02)
> >>> 02:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>> 8090 (rev 02)
> >>> 02:03.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>> 8090 (rev 02)
> >>> 02:0c.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>> 8090 (rev 02)
> >>> 02:10.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>> 8090 (rev 02)
> >>> 02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>> 8090 (rev 02)
> >>> 03:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial
> >>> ATA Raid II Controller (rev 01)
> >>> 08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057
> >>> PCI-E Gigabit Ethernet Controller
> >>>
> >>> However there is also a regression. After applying this patchset
> >>> system suspend/resume stopped working. This is probably related to
> >>> this message:
> >>>
> >>> pcieport 0000:02:1f.0: Unable to change power state from D0 to
> >>> D3hot, device inaccessible
> >>>
> >>> which appears after calling 'rtcwake -s10 -mmem'. This might not be
> >>> related to this patchset, so I probably need to apply it on older
> >>> kernel releases and check.
> >>
> >>
> >> Just one more information - I've applied this patchset on top of
> >> v6.16 and it works perfectly on ARM Juno R1. SATA and GBit ethernet
> >> are visible again and system suspend/resume works too, so the issue
> >> with the latter on top of v6.18 seems not to be directly related to
> >> $subject patchset. I will try to bisect this issue when I have some
> >> spare time.
> >>
> >> Feel free to add:
> >>
> >> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >
> >
> > I spent some time analyzing this regression on Juno R1 and found that:
> >
> > 1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
> > ("iommu: Get DT/ACPI parsing into the proper probe path") merged to
> > v6.15-rc1.
> >
> > 2. With $subject patch applied to enable SATA & GBit ethernet again,
> > system suspend/resume stopped working after commit f3ac2ff14834
> > ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
> > platforms") merged to v6.18-rc1.
> >
Yes, this was expected: if you don't disable ACS, it will cause issues in
detecting the devices.
> > If I got it right, according to the latter commit message, some quirks
> > have to be added to fix the suspend/resume issue. Unfortunately I have
> > no idea if this is the Juno R1 or the given PCI devices specific issue.
>
>
> And one more note, commit df5192d9bb0e ("PCI/ASPM: Enable only L0s and
> L1 for devicetree platforms") doesn't fix the suspend/resume issue
> either (with $subject patchset applied on top of it).
>
Interesting. Can you do:

echo performance > /sys/module/pcie_aspm/parameters/policy

and then suspend?
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches
From: Manivannan Sadhasivam @ 2025-12-09 11:20 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Marek Szyprowski, Robin Murphy
On Tue, Dec 02, 2025 at 03:15:33PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 02, 2025 at 07:52:50PM +0530, Manivannan Sadhasivam wrote:
> > @@ -544,6 +544,7 @@ struct pci_dev {
> > #endif
> > u16 acs_cap; /* ACS Capability offset */
> > u16 acs_capabilities; /* ACS Capabilities */
> > + u16 acs_broken_cap; /* Broken ACS Capabilities */
>
> Why do we need this? Have the quirk function accept the
> acs_capabilities from the register and return the value to program
> into struct pci_dev ?
>
We don't have any quirk levels between pci_acs_init() and pci_enable_acs() that
will allow us to modify pci_dev::acs_capabilities in the quirk function. Hence,
I came up with one more member to pass the broken caps.
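
To make the idea concrete, here is a minimal userspace sketch, not the actual
patch. The struct is only a stand-in for the pci_dev members quoted above, the
helper names mock_quirk_idt() and mock_effective_acs_caps() are hypothetical,
and the split (the quirk only records the broken bits, the enable path masks
them later) is an assumption about how the extra member would be consumed. The
PCI_ACS_* values are the ones from include/uapi/linux/pci_regs.h.

```c
#include <stdint.h>

/* ACS capability bits, same values as include/uapi/linux/pci_regs.h */
#define PCI_ACS_SV	0x0001	/* Source Validation */
#define PCI_ACS_RR	0x0004	/* P2P Request Redirect */
#define PCI_ACS_CR	0x0008	/* P2P Completion Redirect */
#define PCI_ACS_UF	0x0010	/* Upstream Forwarding */

/* Stand-in for the pci_dev members quoted above (not the real struct) */
struct mock_pci_dev {
	uint16_t acs_capabilities;	/* cached copy of the RO capability bits */
	uint16_t acs_broken_cap;	/* bits flagged as broken by a quirk */
};

/* Hypothetical quirk: records the broken bit, does not touch the cache */
static void mock_quirk_idt(struct mock_pci_dev *dev)
{
	dev->acs_broken_cap |= PCI_ACS_SV;
}

/* Hypothetical enable-time helper: cached caps minus the broken bits */
static uint16_t mock_effective_acs_caps(const struct mock_pci_dev *dev)
{
	return (uint16_t)(dev->acs_capabilities & ~dev->acs_broken_cap);
}
```

Calling mock_quirk_idt() on a device whose cached capabilities include
PCI_ACS_SV leaves the cache untouched, and mock_effective_acs_caps() then
returns the same mask with only the SV bit cleared.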
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
From: Marek Szyprowski @ 2025-12-09 12:00 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Robin Murphy, Jason Gunthorpe
On 09.12.2025 12:15, Manivannan Sadhasivam wrote:
> On Tue, Dec 09, 2025 at 09:28:38AM +0100, Marek Szyprowski wrote:
>> On 09.12.2025 08:31, Marek Szyprowski wrote:
>>> I spent some time analyzing this regression on Juno R1 and found that:
>>>
>>> 1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
>>> ("iommu: Get DT/ACPI parsing into the proper probe path") merged to
>>> v6.15-rc1.
>>>
>>> 2. With $subject patch applied to enable SATA & GBit ethernet again,
>>> system suspend/resume stopped working after commit f3ac2ff14834
>>> ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
>>> platforms") merged to v6.18-rc1.
>>>
> Yes, this was expected as if you don't disable ACS, it will cause issues in
> detecting the devices.
>
>>> If I got it right, according to the latter commit message, some quirks
>>> have to be added to fix the suspend/resume issue. Unfortunately I have
>>> no idea if this is the Juno R1 or the given PCI devices specific issue.
>>
>> And one more note, commit df5192d9bb0e ("PCI/ASPM: Enable only L0s and
>> L1 for devicetree platforms") doesn't fix the suspend/resume issue
>> either (with $subject patchset applied on top of it).
>>
> Interesting. Can you do:
>
> echo performance > /sys/module/pcie_aspm/parameters/policy
>
> and then suspend?
After the above command, system suspend/resume works again.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
From: Manivannan Sadhasivam @ 2025-12-09 15:04 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Robin Murphy, Jason Gunthorpe
On Tue, Dec 09, 2025 at 01:00:55PM +0100, Marek Szyprowski wrote:
> On 09.12.2025 12:15, Manivannan Sadhasivam wrote:
> > On Tue, Dec 09, 2025 at 09:28:38AM +0100, Marek Szyprowski wrote:
> >> On 09.12.2025 08:31, Marek Szyprowski wrote:
> >>> I spent some time analyzing this regression on Juno R1 and found that:
> >>>
> >>> 1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
> >>> ("iommu: Get DT/ACPI parsing into the proper probe path") merged to
> >>> v6.15-rc1.
> >>>
> >>> 2. With $subject patch applied to enable SATA & GBit ethernet again,
> >>> system suspend/resume stopped working after commit f3ac2ff14834
> >>> ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
> >>> platforms") merged to v6.18-rc1.
> >>>
> > Yes, this was expected as if you don't disable ACS, it will cause issues in
> > detecting the devices.
> >
> >>> If I got it right, according to the latter commit message, some quirks
> >>> have to be added to fix the suspend/resume issue. Unfortunately I have
> >>> no idea if this is the Juno R1 or the given PCI devices specific issue.
> >>
> >> And one more note, commit df5192d9bb0e ("PCI/ASPM: Enable only L0s and
> >> L1 for devicetree platforms") doesn't fix the suspend/resume issue
> >> either (with $subject patchset applied on top of it).
> >>
> > Interesting. Can you do:
> >
> > echo performance > /sys/module/pcie_aspm/parameters/policy
> >
> > and then suspend?
>
> After the above command, system suspend/resume works again.
>
OK, so ASPM L0s/L1 seems to be the issue. But I'm not quite sure why it causes
an issue during suspend/resume. If the device/controller doesn't play well with
ASPM L0s/L1, it should at least cause the issue before entering suspend.
I'm clueless here atm...
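
One thing that may help narrow this down is confirming which ASPM states are
actually enabled on the affected link before and after switching the policy,
for example from the LnkCtl line of 'lspci -vv' or a raw config space read.
Below is a small sketch of decoding the ASPM Control field of the PCIe Link
Control register. The bit values match PCI_EXP_LNKCTL_ASPM_L0S,
PCI_EXP_LNKCTL_ASPM_L1 and PCI_EXP_LNKCTL_ASPMC in
include/uapi/linux/pci_regs.h; the LNKCTL_* names and the helper
aspm_state_name() are made up for illustration.

```c
#include <stdint.h>

/* ASPM Control field of the PCIe Link Control register (bits 1:0) */
#define LNKCTL_ASPM_L0S	0x0001	/* L0s entry enabled */
#define LNKCTL_ASPM_L1	0x0002	/* L1 entry enabled */
#define LNKCTL_ASPMC	0x0003	/* mask covering both bits */

/* Hypothetical helper: name the ASPM states enabled in a Link Control value */
static const char *aspm_state_name(uint16_t lnkctl)
{
	switch (lnkctl & LNKCTL_ASPMC) {
	case LNKCTL_ASPM_L0S:
		return "L0s";
	case LNKCTL_ASPM_L1:
		return "L1";
	case LNKCTL_ASPMC:
		return "L0s L1";
	default:
		return "disabled";
	}
}
```

With the 'performance' policy the field would be expected to decode as
"disabled" on every port, so any port still reporting L0s or L1 after the
policy change would be worth a closer look.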
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
From: Marek Szyprowski @ 2025-12-10 17:26 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Robin Murphy, Jason Gunthorpe
On 09.12.2025 16:04, Manivannan Sadhasivam wrote:
> On Tue, Dec 09, 2025 at 01:00:55PM +0100, Marek Szyprowski wrote:
>> On 09.12.2025 12:15, Manivannan Sadhasivam wrote:
>>> On Tue, Dec 09, 2025 at 09:28:38AM +0100, Marek Szyprowski wrote:
>>>> On 09.12.2025 08:31, Marek Szyprowski wrote:
>>>>> I spent some time analyzing this regression on Juno R1 and found that:
>>>>>
>>>>> 1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
>>>>> ("iommu: Get DT/ACPI parsing into the proper probe path") merged to
>>>>> v6.15-rc1.
>>>>>
>>>>> 2. With $subject patch applied to enable SATA & GBit ethernet again,
>>>>> system suspend/resume stopped working after commit f3ac2ff14834
>>>>> ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
>>>>> platforms") merged to v6.18-rc1.
>>>>>
>>> Yes, this was expected as if you don't disable ACS, it will cause issues in
>>> detecting the devices.
>>>
>>>>> If I got it right, according to the latter commit message, some quirks
>>>>> have to be added to fix the suspend/resume issue. Unfortunately I have
>>>>> no idea if this is the Juno R1 or the given PCI devices specific issue.
>>>> And one more note, commit df5192d9bb0e ("PCI/ASPM: Enable only L0s and
>>>> L1 for devicetree platforms") doesn't fix the suspend/resume issue
>>>> either (with $subject patchset applied on top of it).
>>>>
>>> Interesting. Can you do:
>>>
>>> echo performance > /sys/module/pcie_aspm/parameters/policy
>>>
>>> and then suspend?
>> After the above command, system suspend/resume works again.
>>
> Ok, so ASPM L0s/L1 seems to be the issue. But I'm not quite sure why it causes
> issue during suspend/resume. If the device/controller doesn't play well with
> ASPM L0s/L1, it should at least cause the issue before entering suspend.
>
> I'm clueless here atm...

Something definitely breaks during suspend. After adding
'no_console_suspend' to the kernel command line, I see the following messages:

# time rtcwake -s10 -mmem
rtcwake: wakeup from "mem" using /dev/rtc0 at Wed Dec 10 17:04:12 2025
PM: suspend entry (deep)
Filesystems sync: 0.001 seconds
Freezing user space processes
Freezing user space processes completed (elapsed 0.005 seconds)
OOM killer disabled.
Freezing remaining freezable tasks
Freezing remaining freezable tasks completed (elapsed 0.003 seconds)
psmouse serio1: Failed to disable mouse on 1c070000.kmi
psmouse serio0: Failed to disable mouse on 1c060000.kmi
pcieport 0000:02:1f.0: Unable to change power state from D0 to D3hot,
device inaccessible
Disabling non-boot CPUs ...
psci: CPU5 killed (polled 0 ms)
psci: CPU4 killed (polled 0 ms)
psci: CPU3 killed (polled 0 ms)
psci: CPU2 killed (polled 0 ms)
psci: CPU1 killed (polled 4 ms)

and the system never wakes up.

I assume that the 'pcieport 0000:02:1f.0: Unable to change power state
from D0 to D3hot, device inaccessible' message is crucial here. It
doesn't appear when I change the pcie_aspm policy to performance (as you
suggested in the previous mail).

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-10 17:26 ` Marek Szyprowski
@ 2025-12-12 4:02 ` Manivannan Sadhasivam
2025-12-12 7:25 ` Marek Szyprowski
0 siblings, 1 reply; 19+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-12 4:02 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Robin Murphy, Jason Gunthorpe
On Wed, Dec 10, 2025 at 06:26:27PM +0100, Marek Szyprowski wrote:
> On 09.12.2025 16:04, Manivannan Sadhasivam wrote:
> > On Tue, Dec 09, 2025 at 01:00:55PM +0100, Marek Szyprowski wrote:
> >> On 09.12.2025 12:15, Manivannan Sadhasivam wrote:
> >>> On Tue, Dec 09, 2025 at 09:28:38AM +0100, Marek Szyprowski wrote:
> >>>> On 09.12.2025 08:31, Marek Szyprowski wrote:
> >>>>> On 04.12.2025 14:13, Marek Szyprowski wrote:
> >>>>>> On 03.12.2025 13:04, Marek Szyprowski wrote:
> >>>>>>> On 02.12.2025 15:22, Manivannan Sadhasivam wrote:
> >>>>>>>> This series fixes the long standing issue with ACS in OF platforms.
> >>>>>>>> There are
> >>>>>>>> two fixes in this series, both fixing independent issues on their
> >>>>>>>> own, but both
> >>>>>>>> are needed to properly enable ACS on OF platforms.
> >>>>>>>>
> >>>>>>>> Issue(s) background
> >>>>>>>> ===================
> >>>>>>>>
> >>>>>>>> Back in 2021, Xingang Wang first noted a failure in attaching the
> >>>>>>>> HiSilicon SEC
> >>>>>>>> device to QEMU ARM64 pci-root-port device [1]. He then tracked down
> >>>>>>>> the issue to
> >>>>>>>> ACS not being enabled for the QEMU Root Port device and he proposed
> >>>>>>>> a patch to
> >>>>>>>> fix it [2].
> >>>>>>>>
> >>>>>>>> Once the patch got applied, people reported PCIe issues with
> >>>>>>>> linux-next on the
> >>>>>>>> ARM Juno Development boards, where they saw failure in enumerating
> >>>>>>>> the endpoint
> >>>>>>>> devices [3][4]. So soon, the patch got dropped, but the actual
> >>>>>>>> issue with the
> >>>>>>>> ARM Juno boards was left behind.
> >>>>>>>>
> >>>>>>>> Fast forward to 2024, Pavan resubmitted the same fix [5] for his
> >>>>>>>> own usecase,
> >>>>>>>> hoping that someone in the community would fix the issue with ARM
> >>>>>>>> Juno boards.
> >>>>>>>> But the patch was rightly rejected, as a patch that was known to
> >>>>>>>> cause issues
> >>>>>>>> should not be merged to the kernel. But again, no one investigated
> >>>>>>>> the Juno
> >>>>>>>> issue and it was left behind again.
> >>>>>>>>
> >>>>>>>> Now it ended up in my plate and I managed to track down the issue
> >>>>>>>> with the help
> >>>>>>>> of Naresh who got access to the Juno boards in LKFT. The Juno issue
> >>>>>>>> was with the
> >>>>>>>> PCIe switch from Microsemi/IDT, which triggers ACS Source
> >>>>>>>> Validation error on
> >>>>>>>> Completions received for the Configuration Read Request from a
> >>>>>>>> device connected
> >>>>>>>> to the downstream port that has not yet captured the PCIe bus
> >>>>>>>> number. As per the
> >>>>>>>> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and
> >>>>>>>> Device Numbers
> >>>>>>>> supplied with all Type 0 Configuration Write Requests completed by
> >>>>>>>> the Function
> >>>>>>>> and supply these numbers in the Bus and Device Number fields of the
> >>>>>>>> Requester ID
> >>>>>>>> for all Requests". So during the first Configuration Read Request
> >>>>>>>> issued by the
> >>>>>>>> switch downstream port during enumeration (for reading Vendor ID),
> >>>>>>>> Bus and
> >>>>>>>> Device numbers will be unknown to the device. So it responds to the
> >>>>>>>> Read Request
> >>>>>>>> with Completion having Bus and Device number as 0. The switch
> >>>>>>>> interprets the
> >>>>>>>> Completion as an ACS Source Validation error and drops the
> >>>>>>>> completion, leading
> >>>>>>>> to the failure in detecting the endpoint device. Though the PCIe
> >>>>>>>> spec r6.0, sec
> >>>>>>>> 6.12.1.1, states that "Completions are never affected by ACS Source
> >>>>>>>> Validation".
> >>>>>>>> This behavior is in violation of the spec.
> >>>>>>>>
> >>>>>>>> Solution
> >>>>>>>> ========
> >>>>>>>>
> >>>>>>>> In September, I submitted a series [6] to fix both issues. For the
> >>>>>>>> IDT issue,
> >>>>>>>> I reused the existing quirk in the PCI core which does a dummy
> >>>>>>>> config write
> >>>>>>>> before issuing the first config read to the device. And for the ACS
> >>>>>>>> enablement
> >>>>>>>> issue, I just resubmitted the original patch from Xingang which called
> >>>>>>>> pci_request_acs() from devm_of_pci_bridge_init().
> >>>>>>>>
> >>>>>>>> But during the review of the series, several comments were received
> >>>>>>>> and they
> >>>>>>>> required the series to be reworked completely. Hence, in this
> >>>>>>>> version, I've
> >>>>>>>> incorporated the comments as below:
> >>>>>>>>
> >>>>>>>> 1. For the ACS enablement issue, I've moved the pci_enable_acs()
> >>>>>>>> call from
> >>>>>>>> pci_acs_init() to pci_dma_configure().
> >>>>>>>>
> >>>>>>>> 2. For the IDT issue, I've cached the ACS capabilities (RO) in
> >>>>>>>> 'pci_dev',
> >>>>>>>> collected the broken capability for the IDT switches in the quirk
> >>>>>>>> and used it to
> >>>>>>>> disable the capability in the cache. This also allowed me to get
> >>>>>>>> rid of the
> >>>>>>>> earlier workaround for the switch.
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>> https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
> >>>>>>>> [2]
> >>>>>>>> https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
> >>>>>>>> [3]
> >>>>>>>> https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
> >>>>>>>> [4]
> >>>>>>>> https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
> >>>>>>>> [5]
> >>>>>>>> https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
> >>>>>>>> [6]
> >>>>>>>> https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
> >>>>>>>>
> >>>>>>> Thanks for this patchset! I've tested it on my ARM Juno R1 and it
> >>>>>>> looks that it almost works fine. This patchset even fixed some
> >>>>>>> issues with PCI devices probe, as I again see SATA and GBit ethernet
> >>>>>>> devices, which were missing since Linux v6.14 (it looks that
> >>>>>>> I've also missed this in my tests).
> >>>>>>>
> >>>>>>> # lspci
> >>>>>>> 00:00.0 PCI bridge: PLDA PCI Express Core Reference Design (rev 01)
> >>>>>>> 01:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>>>>>> 8090 (rev 02)
> >>>>>>> 02:01.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>>>>>> 8090 (rev 02)
> >>>>>>> 02:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>>>>>> 8090 (rev 02)
> >>>>>>> 02:03.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>>>>>> 8090 (rev 02)
> >>>>>>> 02:0c.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>>>>>> 8090 (rev 02)
> >>>>>>> 02:10.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>>>>>> 8090 (rev 02)
> >>>>>>> 02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
> >>>>>>> 8090 (rev 02)
> >>>>>>> 03:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial
> >>>>>>> ATA Raid II Controller (rev 01)
> >>>>>>> 08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057
> >>>>>>> PCI-E Gigabit Ethernet Controller
> >>>>>>>
> >>>>>>> However there is also a regression. After applying this patchset
> >>>>>>> system suspend/resume stopped working. This is probably related to
> >>>>>>> this message:
> >>>>>>>
> >>>>>>> pcieport 0000:02:1f.0: Unable to change power state from D0 to
> >>>>>>> D3hot, device inaccessible
> >>>>>>>
> >>>>>>> which appears after calling 'rtcwake -s10 -mmem'. This might not be
> >>>>>>> related to this patchset, so I probably need to apply it on older
> >>>>>>> kernel releases and check.
> >>>>>> Just one more information - I've applied this patchset on top of
> >>>>>> v6.16 and it works perfectly on ARM Juno R1. SATA and GBit ethernet
> >>>>>> are visible again and system suspend/resume works too, so the issue
> >>>>>> with the latter on top of v6.18 seems not to be directly related to
> >>>>>> $subject patchset. I will try to bisect this issue when I have some
> >>>>>> spare time.
> >>>>>>
> >>>>>> Feel free to add:
> >>>>>>
> >>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>>> I spent some time analyzing this regression on Juno R1 and found that:
> >>>>>
> >>>>> 1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
> >>>>> ("iommu: Get DT/ACPI parsing into the proper probe path") merged to
> >>>>> v6.15-rc1.
> >>>>>
> >>>>> 2. With $subject patch applied to enable SATA & GBit ethernet again,
> >>>>> system suspend/resume stopped working after commit f3ac2ff14834
> >>>>> ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
> >>>>> platforms") merged to v6.18-rc1.
> >>>>>
> >>> Yes, this was expected as if you don't disable ACS, it will cause issues in
> >>> detecting the devices.
> >>>
> >>>>> If I got it right, according to the latter commit message, some quirks
> >>>>> have to be added to fix the suspend/resume issue. Unfortunately I have
> >>>>> no idea if this is the Juno R1 or the given PCI devices specific issue.
> >>>> And one more note, commit df5192d9bb0e ("PCI/ASPM: Enable only L0s and
> >>>> L1 for devicetree platforms") doesn't fix the suspend/resume issue
> >>>> either (with $subject patchset applied on top of it).
> >>>>
> >>> Interesting. Can you do:
> >>>
> >>> echo performance > /sys/module/pcie_aspm/parameters/policy
> >>>
> >>> and then suspend?
> >> After the above command, system suspend/resume works again.
> >>
> > Ok, so ASPM L0s/L1 seems to be the issue. But I'm not quite sure why it causes
> > issue during suspend/resume. If the device/controller doesn't play well with
> > ASPM L0s/L1, it should at least cause the issue before entering suspend.
> >
> > I'm clueless here atm...
>
> Definitely something gets broken during suspend, after adding
> 'no_console_suspend' to kernel command line I see the following messages:
>
> # time rtcwake -s10 -mmem
> rtcwake: wakeup from "mem" using /dev/rtc0 at Wed Dec 10 17:04:12 2025
> PM: suspend entry (deep)
> Filesystems sync: 0.001 seconds
> Freezing user space processes
> Freezing user space processes completed (elapsed 0.005 seconds)
> OOM killer disabled.
> Freezing remaining freezable tasks
> Freezing remaining freezable tasks completed (elapsed 0.003 seconds)
> psmouse serio1: Failed to disable mouse on 1c070000.kmi
> psmouse serio0: Failed to disable mouse on 1c060000.kmi
> pcieport 0000:02:1f.0: Unable to change power state from D0 to D3hot,
> device inaccessible

The device just got blown off the bus at this point. But it is unclear to me why
it happens when we enable ASPM L0s/L1. I don't think the firmware has
gotten the chance to turn off the power to the devices.

So maybe some actions that we do in the PCI core during system suspend are
affecting the device state. But can you try to access the device by doing:

lspci -vvv -s 0000:02:1f.0

before initiating system suspend? Just to check whether the issue happens during
suspend or way before that.

- Mani
--
மணிவண்ணன் சதாசிவம்
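
The check requested above can also be scripted. The following is a minimal
userspace sketch in C, not code from this series: it reads the first two bytes
(the Vendor ID) of a device's sysfs config file, for example
/sys/bus/pci/devices/0000:02:1f.0/config, and treats an all-ones value as a
sign that the device is no longer reachable. The function name
sketch_pci_config_accessible() is hypothetical, and the all-ones convention for
failed config reads is an assumption that holds on many, but not necessarily
all, host bridges.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel or pciutils API).
 *
 * Reads the 16-bit Vendor ID at offset 0 of a PCI config space file.
 * Returns  1 if the Vendor ID looks valid,
 *          0 if it reads back as 0xffff (device likely off the bus),
 *         -1 if the file cannot be opened or is too short.
 */
static int sketch_pci_config_accessible(const char *config_path)
{
	unsigned char buf[2];
	uint16_t vendor;
	size_t n;
	FILE *f;

	assert(config_path != NULL);

	f = fopen(config_path, "rb");
	if (!f)
		return -1;

	n = fread(buf, 1, sizeof(buf), f);
	fclose(f);
	if (n != sizeof(buf))
		return -1;

	/* PCI config space is little-endian */
	vendor = (uint16_t)(buf[0] | (buf[1] << 8));

	/* Assumption: a failed config read returns all ones */
	return vendor != 0xffff;
}
```

Calling it once after boot and again right before 'rtcwake' would show roughly
whether the port is already unreachable before suspend starts.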
* Re: [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms
2025-12-12 4:02 ` Manivannan Sadhasivam
@ 2025-12-12 7:25 ` Marek Szyprowski
0 siblings, 0 replies; 19+ messages in thread
From: Marek Szyprowski @ 2025-12-12 7:25 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Robin Murphy, Jason Gunthorpe
On 12.12.2025 05:02, Manivannan Sadhasivam wrote:
> On Wed, Dec 10, 2025 at 06:26:27PM +0100, Marek Szyprowski wrote:
>> On 09.12.2025 16:04, Manivannan Sadhasivam wrote:
>>> On Tue, Dec 09, 2025 at 01:00:55PM +0100, Marek Szyprowski wrote:
>>>> On 09.12.2025 12:15, Manivannan Sadhasivam wrote:
>>>>> On Tue, Dec 09, 2025 at 09:28:38AM +0100, Marek Szyprowski wrote:
>>>>>> On 09.12.2025 08:31, Marek Szyprowski wrote:
>>>>>>> On 04.12.2025 14:13, Marek Szyprowski wrote:
>>>>>>>> On 03.12.2025 13:04, Marek Szyprowski wrote:
>>>>>>>>> On 02.12.2025 15:22, Manivannan Sadhasivam wrote:
>>>>>>>>>> This series fixes the long standing issue with ACS in OF platforms.
>>>>>>>>>> There are
>>>>>>>>>> two fixes in this series, both fixing independent issues on their
>>>>>>>>>> own, but both
>>>>>>>>>> are needed to properly enable ACS on OF platforms.
>>>>>>>>>>
>>>>>>>>>> Issue(s) background
>>>>>>>>>> ===================
>>>>>>>>>>
>>>>>>>>>> Back in 2021, Xingang Wang first noted a failure in attaching the
>>>>>>>>>> HiSilicon SEC
>>>>>>>>>> device to QEMU ARM64 pci-root-port device [1]. He then tracked down
>>>>>>>>>> the issue to
>>>>>>>>>> ACS not being enabled for the QEMU Root Port device and he proposed
>>>>>>>>>> a patch to
>>>>>>>>>> fix it [2].
>>>>>>>>>>
>>>>>>>>>> Once the patch got applied, people reported PCIe issues with
>>>>>>>>>> linux-next on the
>>>>>>>>>> ARM Juno Development boards, where they saw failure in enumerating
>>>>>>>>>> the endpoint
>>>>>>>>>> devices [3][4]. So soon, the patch got dropped, but the actual
>>>>>>>>>> issue with the
>>>>>>>>>> ARM Juno boards was left behind.
>>>>>>>>>>
>>>>>>>>>> Fast forward to 2024, Pavan resubmitted the same fix [5] for his
>>>>>>>>>> own usecase,
>>>>>>>>>> hoping that someone in the community would fix the issue with ARM
>>>>>>>>>> Juno boards.
>>>>>>>>>> But the patch was rightly rejected, as a patch that was known to
>>>>>>>>>> cause issues
>>>>>>>>>> should not be merged to the kernel. But again, no one investigated
>>>>>>>>>> the Juno
>>>>>>>>>> issue and it was left behind again.
>>>>>>>>>>
>>>>>>>>>> Now it ended up in my plate and I managed to track down the issue
>>>>>>>>>> with the help
>>>>>>>>>> of Naresh who got access to the Juno boards in LKFT. The Juno issue
>>>>>>>>>> was with the
>>>>>>>>>> PCIe switch from Microsemi/IDT, which triggers ACS Source
>>>>>>>>>> Validation error on
>>>>>>>>>> Completions received for the Configuration Read Request from a
>>>>>>>>>> device connected
>>>>>>>>>> to the downstream port that has not yet captured the PCIe bus
>>>>>>>>>> number. As per the
>>>>>>>>>> PCIe spec r6.0 sec 2.2.6.2, "Functions must capture the Bus and
>>>>>>>>>> Device Numbers
>>>>>>>>>> supplied with all Type 0 Configuration Write Requests completed by
>>>>>>>>>> the Function
>>>>>>>>>> and supply these numbers in the Bus and Device Number fields of the
>>>>>>>>>> Requester ID
>>>>>>>>>> for all Requests". So during the first Configuration Read Request
>>>>>>>>>> issued by the
>>>>>>>>>> switch downstream port during enumeration (for reading Vendor ID),
>>>>>>>>>> Bus and
>>>>>>>>>> Device numbers will be unknown to the device. So it responds to the
>>>>>>>>>> Read Request
>>>>>>>>>> with Completion having Bus and Device number as 0. The switch
>>>>>>>>>> interprets the
>>>>>>>>>> Completion as an ACS Source Validation error and drops the
>>>>>>>>>> completion, leading
>>>>>>>>>> to the failure in detecting the endpoint device. Though the PCIe
>>>>>>>>>> spec r6.0, sec
>>>>>>>>>> 6.12.1.1, states that "Completions are never affected by ACS Source
>>>>>>>>>> Validation".
>>>>>>>>>> This behavior is in violation of the spec.
>>>>>>>>>>
>>>>>>>>>> Solution
>>>>>>>>>> ========
>>>>>>>>>>
>>>>>>>>>> In September, I submitted a series [6] to fix both issues. For the
>>>>>>>>>> IDT issue,
>>>>>>>>>> I reused the existing quirk in the PCI core which does a dummy
>>>>>>>>>> config write
>>>>>>>>>> before issuing the first config read to the device. And for the ACS
>>>>>>>>>> enablement
>>>>>>>>>> issue, I just resubmitted the original patch from Xingang which called
>>>>>>>>>> pci_request_acs() from devm_of_pci_bridge_init().
>>>>>>>>>>
>>>>>>>>>> But during the review of the series, several comments were received
>>>>>>>>>> and they
>>>>>>>>>> required the series to be reworked completely. Hence, in this
>>>>>>>>>> version, I've
>>>>>>>>>> incorporated the comments as below:
>>>>>>>>>>
>>>>>>>>>> 1. For the ACS enablement issue, I've moved the pci_enable_acs()
>>>>>>>>>> call from
>>>>>>>>>> pci_acs_init() to pci_dma_configure().
>>>>>>>>>>
>>>>>>>>>> 2. For the IDT issue, I've cached the ACS capabilities (RO) in
>>>>>>>>>> 'pci_dev',
>>>>>>>>>> collected the broken capability for the IDT switches in the quirk
>>>>>>>>>> and used it to
>>>>>>>>>> disable the capability in the cache. This also allowed me to get
>>>>>>>>>> rid of the
>>>>>>>>>> earlier workaround for the switch.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://lore.kernel.org/all/038397a6-57e2-b6fc-6e1c-7c03b7be9d96@huawei.com
>>>>>>>>>> [2]
>>>>>>>>>> https://lore.kernel.org/all/1621566204-37456-1-git-send-email-wangxingang5@huawei.com
>>>>>>>>>> [3]
>>>>>>>>>> https://lore.kernel.org/all/01314d70-41e6-70f9-e496-84091948701a@samsung.com
>>>>>>>>>> [4]
>>>>>>>>>> https://lore.kernel.org/all/CADYN=9JWU3CMLzMEcD5MSQGnaLyDRSKc5SofBFHUax6YuTRaJA@mail.gmail.com
>>>>>>>>>> [5]
>>>>>>>>>> https://lore.kernel.org/linux-pci/20241107-pci_acs_fix-v1-1-185a2462a571@quicinc.com
>>>>>>>>>> [6]
>>>>>>>>>> https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com
>>>>>>>>>>
>>>>>>>>> Thanks for this patchset! I've tested it on my ARM Juno R1 and it
>>>>>>>>> looks that it almost works fine. This patchset even fixed some
>>>>>>>>> issues with PCI devices probe, as I again see SATA and GBit ethernet
>>>>>>>>> devices, which were missing since Linux v6.14 (it looks that
>>>>>>>>> I've also missed this in my tests).
>>>>>>>>>
>>>>>>>>> # lspci
>>>>>>>>> 00:00.0 PCI bridge: PLDA PCI Express Core Reference Design (rev 01)
>>>>>>>>> 01:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>>>>>>>> 8090 (rev 02)
>>>>>>>>> 02:01.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>>>>>>>> 8090 (rev 02)
>>>>>>>>> 02:02.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>>>>>>>> 8090 (rev 02)
>>>>>>>>> 02:03.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>>>>>>>> 8090 (rev 02)
>>>>>>>>> 02:0c.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>>>>>>>> 8090 (rev 02)
>>>>>>>>> 02:10.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>>>>>>>> 8090 (rev 02)
>>>>>>>>> 02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device
>>>>>>>>> 8090 (rev 02)
>>>>>>>>> 03:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial
>>>>>>>>> ATA Raid II Controller (rev 01)
>>>>>>>>> 08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057
>>>>>>>>> PCI-E Gigabit Ethernet Controller
>>>>>>>>>
>>>>>>>>> However there is also a regression. After applying this patchset
>>>>>>>>> system suspend/resume stopped working. This is probably related to
>>>>>>>>> this message:
>>>>>>>>>
>>>>>>>>> pcieport 0000:02:1f.0: Unable to change power state from D0 to
>>>>>>>>> D3hot, device inaccessible
>>>>>>>>>
>>>>>>>>> which appears after calling 'rtcwake -s10 -mmem'. This might not be
>>>>>>>>> related to this patchset, so I probably need to apply it on older
>>>>>>>>> kernel releases and check.
>>>>>>>> Just one more information - I've applied this patchset on top of
>>>>>>>> v6.16 and it works perfectly on ARM Juno R1. SATA and GBit ethernet
>>>>>>>> are visible again and system suspend/resume works too, so the issue
>>>>>>>> with the latter on top of v6.18 seems not to be directly related to
>>>>>>>> $subject patchset. I will try to bisect this issue when I have some
>>>>>>>> spare time.
>>>>>>>>
>>>>>>>> Feel free to add:
>>>>>>>>
>>>>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>>> I spent some time analyzing this regression on Juno R1 and found that:
>>>>>>>
>>>>>>> 1. SATA and GBit Ethernet stopped working after commit bcb81ac6ae3c
>>>>>>> ("iommu: Get DT/ACPI parsing into the proper probe path") merged to
>>>>>>> v6.15-rc1.
>>>>>>>
>>>>>>> 2. With $subject patch applied to enable SATA & GBit ethernet again,
>>>>>>> system suspend/resume stopped working after commit f3ac2ff14834
>>>>>>> ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
>>>>>>> platforms") merged to v6.18-rc1.
>>>>>>>
>>>>> Yes, this was expected as if you don't disable ACS, it will cause issues in
>>>>> detecting the devices.
>>>>>
>>>>>>> If I got it right, according to the latter commit message, some quirks
>>>>>>> have to be added to fix the suspend/resume issue. Unfortunately I have
>>>>>>> no idea if this is the Juno R1 or the given PCI devices specific issue.
>>>>>> And one more note, commit df5192d9bb0e ("PCI/ASPM: Enable only L0s and
>>>>>> L1 for devicetree platforms") doesn't fix the suspend/resume issue
>>>>>> either (with $subject patchset applied on top of it).
>>>>>>
>>>>> Interesting. Can you do:
>>>>>
>>>>> echo performance > /sys/module/pcie_aspm/parameters/policy
>>>>>
>>>>> and then suspend?
>>>> After the above command, system suspend/resume works again.
>>>>
>>> Ok, so ASPM L0s/L1 seems to be the issue. But I'm not quite sure why it causes
>>> issue during suspend/resume. If the device/controller doesn't play well with
>>> ASPM L0s/L1, it should at least cause the issue before entering suspend.
>>>
>>> I'm clueless here atm...
>> Definitely something gets broken during suspend, after adding
>> 'no_console_suspend' to kernel command line I see the following messages:
>>
>> # time rtcwake -s10 -mmem
>> rtcwake: wakeup from "mem" using /dev/rtc0 at Wed Dec 10 17:04:12 2025
>> PM: suspend entry (deep)
>> Filesystems sync: 0.001 seconds
>> Freezing user space processes
>> Freezing user space processes completed (elapsed 0.005 seconds)
>> OOM killer disabled.
>> Freezing remaining freezable tasks
>> Freezing remaining freezable tasks completed (elapsed 0.003 seconds)
>> psmouse serio1: Failed to disable mouse on 1c070000.kmi
>> psmouse serio0: Failed to disable mouse on 1c060000.kmi
>> pcieport 0000:02:1f.0: Unable to change power state from D0 to D3hot,
>> device inaccessible
> The device just got blown off the bus at this point. But it is unclear to me why
> it happens though if we enable ASPM L0s/L1. I don't think the firmware has
> gotten the chance to turn off the power to devices.
>
> So maybe some actions that we do in the PCI core during system suspend is
> affecting the device state. But can you try to access the device by doing:
>
> lspci -vvv -s 0000:02:1f.0
>
> before initiating system suspend. Just to make sure if the issue happens during
> suspend or way before that.

Same as before:

root@target:~# lspci -vvv -s 0000:02:1f.0
02:1f.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090
(rev 02) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin ? routed to IRQ 50
Bus: primary=02, secondary=08, subordinate=08, sec-latency=0
I/O behind bridge: 00002000-00002fff
Memory behind bridge: 50100000-501fffff
Prefetchable memory behind bridge:
00000000fff00000-00000000000fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Express (v2) Downstream Port (Slot-), MSI 00
DevCap: MaxPayload 2048 bytes, PhantFunc 0
ExtTag+ RBE+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr-
TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1,
Exit Latency L0s <4us, L1 <4us
ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp-
LnkCtl: ASPM L0s L1 Enabled; Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt+
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+
DLActive+ BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-,
LTR-, OBFF Not Supported ARIFwd+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
LTR-, OBFF Disabled ARIFwd-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-,
LinkEqualizationRequest-
Capabilities: [c0] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fffbb040 Data: 00e7
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+
ChkEn-
Capabilities: [200 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed+ WRR32- WRR64- WRR128- TWRR128-
WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [320 v1] Access Control Services
ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+
UpstreamFwd+ EgressCtrl+ DirectTrans+
ACSCtl: SrcValid- TransBlk- ReqRedir+ CmpltRedir+
UpstreamFwd+ EgressCtrl- DirectTrans-
Capabilities: [330 v1] #12
Kernel driver in use: pcieport
root@target:~# time rtcwake -s10 -mmem
rtcwake: wakeup from "mem" using /dev/rtc0 at Fri Dec 12 07:23:50 2025
[ 110.529810] PM: suspend entry (deep)
[ 110.532688] Filesystems sync: 0.001 seconds
[ 110.549590] Freezing user space processes
[ 110.557833] Freezing user space processes completed (elapsed 0.008
seconds)
[ 110.558282] OOM killer disabled.
[ 110.558296] Freezing remaining freezable tasks
[ 110.561602] Freezing remaining freezable tasks completed (elapsed
0.003 seconds)
[ 110.736524] psmouse serio1: Failed to disable mouse on 1c070000.kmi
[ 111.071329] psmouse serio0: Failed to disable mouse on 1c060000.kmi
[ 111.700685] pcieport 0000:02:1f.0: Unable to change power state from
D0 to D3hot, device inaccessible
[ 111.737951] Disabling non-boot CPUs ...
[ 111.757973] psci: CPU5 killed (polled 0 ms)
[ 111.775215] psci: CPU4 killed (polled 0 ms)
[ 111.789725] psci: CPU3 killed (polled 4 ms)
[ 111.800778] psci: CPU2 killed (polled 0 ms)
[ 111.816363] psci: CPU1 killed (polled 0 ms)
(machine never wakes up)
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
* Re: [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches
2025-12-09 11:20 ` Manivannan Sadhasivam
@ 2025-12-17 15:19 ` Jason Gunthorpe
0 siblings, 0 replies; 19+ messages in thread
From: Jason Gunthorpe @ 2025-12-17 15:19 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Manivannan Sadhasivam, Bjorn Helgaas, linux-pci, linux-kernel,
iommu, Naresh Kamboju, Pavankumar Kondeti, Xingang Wang,
Marek Szyprowski, Robin Murphy
On Tue, Dec 09, 2025 at 08:20:39PM +0900, Manivannan Sadhasivam wrote:
> On Tue, Dec 02, 2025 at 03:15:33PM -0400, Jason Gunthorpe wrote:
> > On Tue, Dec 02, 2025 at 07:52:50PM +0530, Manivannan Sadhasivam wrote:
> > > @@ -544,6 +544,7 @@ struct pci_dev {
> > > #endif
> > > u16 acs_cap; /* ACS Capability offset */
> > > u16 acs_capabilities; /* ACS Capabilities */
> > > + u16 acs_broken_cap; /* Broken ACS Capabilities */
> >
> > Why do we need this? Have the quirk function accept the
> > acs_capabilities from the register and return the value to program
> > into struct pci_dev?
> >
>
> We don't have any quirk levels between pci_acs_init() and pci_acs_enable() that
> will allow us to modify pci_dev::acs_capabilities in the quirk function. Hence,
> I came up with one more member to pass the broken caps.
Call the quirk function directly from the ACS path? We have things
like that already for ACS?
Jason
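[Editor's illustration, not part of Jason's message or of the posted patches.]
The suggestion above, having the quirk take the raw ACS capabilities and
return the value to store, so that no extra acs_broken_cap member is needed,
can be sketched in plain userspace C. The PCI_ACS_* bit values match
include/uapi/linux/pci_regs.h; the function name acs_fixup_caps(), the
acs_quirk table, and the vendor/device IDs in it are hypothetical placeholders
and do not reflect the real kernel code or the IDs used in the series.

```c
#include <stddef.h>
#include <stdint.h>

/* ACS capability bits, as in include/uapi/linux/pci_regs.h */
#define PCI_ACS_SV 0x0001 /* Source Validation */
#define PCI_ACS_TB 0x0002 /* Translation Blocking */
#define PCI_ACS_RR 0x0004 /* P2P Request Redirect */
#define PCI_ACS_CR 0x0008 /* P2P Completion Redirect */
#define PCI_ACS_UF 0x0010 /* Upstream Forwarding */
#define PCI_ACS_EC 0x0020 /* P2P Egress Control */
#define PCI_ACS_DT 0x0040 /* Direct Translated P2P */

/* Hypothetical quirk entry: which advertised ACS bits must not be used. */
struct acs_quirk {
	uint16_t vendor;
	uint16_t device;
	uint16_t broken_caps;
};

/* Placeholder IDs for illustration only; not taken from the patches. */
static const struct acs_quirk acs_quirks[] = {
	{ 0x1234, 0x5678, PCI_ACS_SV },
};

/*
 * Hypothetical helper, meant to be called directly from the ACS init path:
 * takes the capabilities read from the ACS Capability register and returns
 * the value that would be cached in struct pci_dev::acs_capabilities.
 */
uint16_t acs_fixup_caps(uint16_t vendor, uint16_t device, uint16_t caps)
{
	size_t i;

	for (i = 0; i < sizeof(acs_quirks) / sizeof(acs_quirks[0]); i++) {
		if (acs_quirks[i].vendor == vendor &&
		    acs_quirks[i].device == device)
			caps &= (uint16_t)~acs_quirks[i].broken_caps;
	}

	return caps;
}
```

With this shape, a matching device that advertises all seven ACS bits (0x007f)
would have only the Source Validation bit cleared before the value is cached,
and non-matching devices pass through unchanged.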
Thread overview: 19+ messages
[not found] <CGME20251202142307eucas1p12a15e5656bb53f48f445c3056d4e3166@eucas1p1.samsung.com>
2025-12-02 14:22 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 1/4] PCI: Enable ACS only after configuring IOMMU for " Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 2/4] PCI: Cache ACS capabilities Manivannan Sadhasivam
2025-12-02 14:22 ` [PATCH v2 3/4] PCI: Disable ACS SV capability for the broken IDT switches Manivannan Sadhasivam
2025-12-02 19:15 ` Jason Gunthorpe
2025-12-09 11:20 ` Manivannan Sadhasivam
2025-12-17 15:19 ` Jason Gunthorpe
2025-12-02 14:22 ` [PATCH v2 4/4] PCI: Extend the pci_disable_acs_sv quirk for one more IDT switch Manivannan Sadhasivam
2025-12-03 8:46 ` [PATCH v2 0/4] PCI: Fix ACS enablement for Root Ports in OF platforms Naresh Kamboju
2025-12-03 12:04 ` Marek Szyprowski
2025-12-04 13:13 ` Marek Szyprowski
2025-12-09 7:31 ` Marek Szyprowski
2025-12-09 8:28 ` Marek Szyprowski
2025-12-09 11:15 ` Manivannan Sadhasivam
2025-12-09 12:00 ` Marek Szyprowski
2025-12-09 15:04 ` Manivannan Sadhasivam
2025-12-10 17:26 ` Marek Szyprowski
2025-12-12 4:02 ` Manivannan Sadhasivam
2025-12-12 7:25 ` Marek Szyprowski