* [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
@ 2025-09-22 16:16 Manivannan Sadhasivam via B4 Relay
2025-09-22 16:16 ` [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for " Manivannan Sadhasivam via B4 Relay
` (3 more replies)
0 siblings, 4 replies; 42+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-09-22 16:16 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring
Cc: linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Manivannan Sadhasivam, Bjorn Helgaas
Hi,
This series is one of the 'let's bite the bullet' kind, where we have decided to
enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
reason why devicetree platforms were chosen because, it will be of minimal
impact compared to the ACPI platforms. So seemed ideal to test the waters.
Problem Statement
=================
Historically, PCI subsystem relied on the BIOS to enable ASPM and Clock PM
states for PCI devices before the kernel boot if the default states are
selected using:
* Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
* cmdline: "pcie_aspm=off", or
* FADT: ACPI_FADT_NO_ASPM
This was done to avoid enabling ASPM for the buggy devices that are known to
create issues with ASPM (even though they advertise the ASPM capability). But
BIOS is not at all a thing on most of the non-x86 platforms. For instance, the
majority of the Embedded and Compute ARM based platforms using devicetree have
something called bootloader, which is not anyway near the standard BIOS used in
x86 based platforms. And these bootloaders wouldn't touch PCIe at all, unless
they boot using PCIe storage, even then there would be no guarantee that the
ASPM states will get enabled. Another example is the Intel's VMD domain that is
not at all configured by the BIOS. But, this series is not enabling ASPM/Clock
PM for VMD domain. I hope it will be done similarly in the future patches.
Solution
========
So to avoid relying on BIOS, it was agreed [2] that the PCI subsystem has to
enable ASPM and Clock PM states based on the device capability. If any devices
misbehave, then they should be quirked accordingly.
First patch of this series introduces two helper functions to enable all ASPM
and Clock PM states if of_have_populated_dt() is true. Second patch drops the
custom ASPM enablement code from the pcie-qcom driver as it is no longer needed.
Testing
=======
This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
supported ASPM states are getting enabled for both the NVMe and WLAN devices by
default.
[1] https://lore.kernel.org/linux-pci/a47sg5ahflhvzyzqnfxvpk3dw4clkhqlhznjxzwqpf4nyjx5dk@bcghz5o6zolk
[2] https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
Changes in v2:
- Used of_have_populated_dt() instead of CONFIG_OF to identify devicetree
platforms
- Renamed the override helpers and changed the override print
- Moved setting the default state back to the original place and only kept the
override in helpers
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
Manivannan Sadhasivam (2):
PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
PCI: qcom: Remove the custom ASPM enablement code
drivers/pci/controller/dwc/pcie-qcom.c | 32 --------------------------
drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++--
2 files changed, 40 insertions(+), 34 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250916-pci-dt-aspm-8b3a7e8d2cf1
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-09-22 16:16 [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Manivannan Sadhasivam via B4 Relay
@ 2025-09-22 16:16 ` Manivannan Sadhasivam via B4 Relay
2025-10-14 16:30 ` FUKAUMI Naoki
2025-09-22 16:16 ` [PATCH v2 2/2] PCI: qcom: Remove the custom ASPM enablement code Manivannan Sadhasivam via B4 Relay
` (2 subsequent siblings)
3 siblings, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-09-22 16:16 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring
Cc: linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Manivannan Sadhasivam, Bjorn Helgaas
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
So far, the PCI subsystem has honored the ASPM and Clock PM states set by
the BIOS (through LNKCTL) during device initialization, if it relies on the
default state selected using:
* Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
* cmdline: "pcie_aspm=off", or
* FADT: ACPI_FADT_NO_ASPM
This was done conservatively to avoid issues with the buggy devices that
advertise ASPM capabilities, but behave erratically if the ASPM states are
enabled. So the PCI subsystem ended up trusting the BIOS to enable only the
ASPM states that were known to work for the devices.
But this turned out to be a problem for devicetree platforms, especially
the ARM based devicetree platforms powering Embedded and *some* Compute
devices as they tend to run without any standard BIOS. So the ASPM states
on these platforms were left disabled during boot and the PCI subsystem
never bothered to enable them, unless the user has forcefully enabled the
ASPM states through Kconfig, cmdline, and sysfs or the device drivers
themselves, enabling the ASPM states through pci_enable_link_state() APIs.
This caused runtime power issues on those platforms. So a couple of
approaches were tried to mitigate this BIOS dependency without user
intervention by enabling the ASPM states in the PCI controller drivers
after device enumeration, and overriding the ASPM/Clock PM states
by the PCI controller drivers through an API before enumeration.
But it has been concluded that none of these mitigations should really be
required and the PCI subsystem should enable the ASPM states advertised by
the devices without relying on BIOS or the PCI controller drivers. If any
device is found to be misbehaving after enabling ASPM states that they
advertised, then those devices should be quirked to disable the problematic
ASPM/Clock PM states.
In an effort to do so, start by overriding the ASPM and Clock PM states set
by the BIOS for devicetree platforms first. Separate helper functions are
introduced to override the BIOS set states by enabling all of them if
of_have_populated_dt() returns true. To aid debugging, print the overridden
ASPM and Clock PM states as well.
In the future, these helpers could be extended to allow other platforms
like VMD, newer ACPI systems with a cutoff year etc... to follow the path.
Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://patch.msgid.link/20250916-pci-dt-aspm-v1-1-778fe907c9ad@oss.qualcomm.com
---
drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 40 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 919a05b9764791c3cc469c9ada62ba5b2c405118..cda31150aec1b67b6a48b60569222ea3d1c3d41f 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -15,6 +15,7 @@
#include <linux/math.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
+#include <linux/of.h>
#include <linux/pci.h>
#include <linux/pci_regs.h>
#include <linux/errno.h>
@@ -235,13 +236,15 @@ struct pcie_link_state {
u32 aspm_support:7; /* Supported ASPM state */
u32 aspm_enabled:7; /* Enabled ASPM state */
u32 aspm_capable:7; /* Capable ASPM state with latency */
- u32 aspm_default:7; /* Default ASPM state by BIOS */
+ u32 aspm_default:7; /* Default ASPM state by BIOS or
+ override */
u32 aspm_disable:7; /* Disabled ASPM state */
/* Clock PM state */
u32 clkpm_capable:1; /* Clock PM capable? */
u32 clkpm_enabled:1; /* Current Clock PM state */
- u32 clkpm_default:1; /* Default Clock PM state by BIOS */
+ u32 clkpm_default:1; /* Default Clock PM state by BIOS or
+ override */
u32 clkpm_disable:1; /* Clock PM disabled */
};
@@ -373,6 +376,18 @@ static void pcie_set_clkpm(struct pcie_link_state *link, int enable)
pcie_set_clkpm_nocheck(link, enable);
}
+static void pcie_clkpm_override_default_link_state(struct pcie_link_state *link,
+ int enabled)
+{
+ struct pci_dev *pdev = link->downstream;
+
+ /* Override the BIOS disabled Clock PM state for devicetree platforms */
+ if (of_have_populated_dt() && !enabled) {
+ link->clkpm_default = 1;
+ pci_info(pdev, "Clock PM state overridden: ClockPM+\n");
+ }
+}
+
static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
{
int capable = 1, enabled = 1;
@@ -395,6 +410,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
}
link->clkpm_enabled = enabled;
link->clkpm_default = enabled;
+ pcie_clkpm_override_default_link_state(link, enabled);
link->clkpm_capable = capable;
link->clkpm_disable = blacklist ? 1 : 0;
}
@@ -788,6 +804,26 @@ static void aspm_l1ss_init(struct pcie_link_state *link)
aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
}
+static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
+{
+ struct pci_dev *pdev = link->downstream;
+ u32 override;
+
+ /* Override the BIOS disabled ASPM states for devicetree platforms */
+ if (of_have_populated_dt()) {
+ link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
+ override = link->aspm_default & ~link->aspm_enabled;
+ if (override)
+ pci_info(pdev, "ASPM states overridden: %s%s%s%s%s%s\n",
+ (override & PCIE_LINK_STATE_L0S) ? "L0s+, " : "",
+ (override & PCIE_LINK_STATE_L1) ? "L1+, " : "",
+ (override & PCIE_LINK_STATE_L1_1) ? "L1.1+, " : "",
+ (override & PCIE_LINK_STATE_L1_2) ? "L1.2+, " : "",
+ (override & PCIE_LINK_STATE_L1_1_PCIPM) ? "L1.1 PCI-PM+, " : "",
+ (override & PCIE_LINK_STATE_L1_2_PCIPM) ? "L1.2 PCI-PM+" : "");
+ }
+}
+
static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
{
struct pci_dev *child = link->downstream, *parent = link->pdev;
@@ -868,6 +904,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
/* Save default state */
link->aspm_default = link->aspm_enabled;
+ pcie_aspm_override_default_link_state(link);
+
/* Setup initial capable state. Will be updated later */
link->aspm_capable = link->aspm_support;
--
2.48.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 2/2] PCI: qcom: Remove the custom ASPM enablement code
2025-09-22 16:16 [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Manivannan Sadhasivam via B4 Relay
2025-09-22 16:16 ` [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for " Manivannan Sadhasivam via B4 Relay
@ 2025-09-22 16:16 ` Manivannan Sadhasivam via B4 Relay
2025-09-23 23:14 ` [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Bjorn Helgaas
2025-11-08 16:18 ` Dmitry Baryshkov
3 siblings, 0 replies; 42+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-09-22 16:16 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring
Cc: linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Since the PCI subsystem has started enabling all ASPM states for all
devicetree based platforms, the ASPM enablement code from this driver can
now be dropped.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://patch.msgid.link/20250916-pci-dt-aspm-v1-2-778fe907c9ad@oss.qualcomm.com
---
drivers/pci/controller/dwc/pcie-qcom.c | 32 --------------------------------
1 file changed, 32 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 294babe1816e4d0c2b2343fe22d89af72afcd6cd..a1c4a9c31f9241e9ca679533323e33c0b972e678 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -247,7 +247,6 @@ struct qcom_pcie_ops {
int (*get_resources)(struct qcom_pcie *pcie);
int (*init)(struct qcom_pcie *pcie);
int (*post_init)(struct qcom_pcie *pcie);
- void (*host_post_init)(struct qcom_pcie *pcie);
void (*deinit)(struct qcom_pcie *pcie);
void (*ltssm_enable)(struct qcom_pcie *pcie);
int (*config_sid)(struct qcom_pcie *pcie);
@@ -1040,25 +1039,6 @@ static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
return 0;
}
-static int qcom_pcie_enable_aspm(struct pci_dev *pdev, void *userdata)
-{
- /*
- * Downstream devices need to be in D0 state before enabling PCI PM
- * substates.
- */
- pci_set_power_state_locked(pdev, PCI_D0);
- pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
-
- return 0;
-}
-
-static void qcom_pcie_host_post_init_2_7_0(struct qcom_pcie *pcie)
-{
- struct dw_pcie_rp *pp = &pcie->pci->pp;
-
- pci_walk_bus(pp->bridge->bus, qcom_pcie_enable_aspm, NULL);
-}
-
static void qcom_pcie_deinit_2_7_0(struct qcom_pcie *pcie)
{
struct qcom_pcie_resources_2_7_0 *res = &pcie->res.v2_7_0;
@@ -1358,19 +1338,9 @@ static void qcom_pcie_host_deinit(struct dw_pcie_rp *pp)
pcie->cfg->ops->deinit(pcie);
}
-static void qcom_pcie_host_post_init(struct dw_pcie_rp *pp)
-{
- struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
- struct qcom_pcie *pcie = to_qcom_pcie(pci);
-
- if (pcie->cfg->ops->host_post_init)
- pcie->cfg->ops->host_post_init(pcie);
-}
-
static const struct dw_pcie_host_ops qcom_pcie_dw_ops = {
.init = qcom_pcie_host_init,
.deinit = qcom_pcie_host_deinit,
- .post_init = qcom_pcie_host_post_init,
};
/* Qcom IP rev.: 2.1.0 Synopsys IP rev.: 4.01a */
@@ -1432,7 +1402,6 @@ static const struct qcom_pcie_ops ops_1_9_0 = {
.get_resources = qcom_pcie_get_resources_2_7_0,
.init = qcom_pcie_init_2_7_0,
.post_init = qcom_pcie_post_init_2_7_0,
- .host_post_init = qcom_pcie_host_post_init_2_7_0,
.deinit = qcom_pcie_deinit_2_7_0,
.ltssm_enable = qcom_pcie_2_3_2_ltssm_enable,
.config_sid = qcom_pcie_config_sid_1_9_0,
@@ -1443,7 +1412,6 @@ static const struct qcom_pcie_ops ops_1_21_0 = {
.get_resources = qcom_pcie_get_resources_2_7_0,
.init = qcom_pcie_init_2_7_0,
.post_init = qcom_pcie_post_init_2_7_0,
- .host_post_init = qcom_pcie_host_post_init_2_7_0,
.deinit = qcom_pcie_deinit_2_7_0,
.ltssm_enable = qcom_pcie_2_3_2_ltssm_enable,
};
--
2.48.1
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-09-22 16:16 [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Manivannan Sadhasivam via B4 Relay
2025-09-22 16:16 ` [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for " Manivannan Sadhasivam via B4 Relay
2025-09-22 16:16 ` [PATCH v2 2/2] PCI: qcom: Remove the custom ASPM enablement code Manivannan Sadhasivam via B4 Relay
@ 2025-09-23 23:14 ` Bjorn Helgaas
2025-11-08 16:18 ` Dmitry Baryshkov
3 siblings, 0 replies; 42+ messages in thread
From: Bjorn Helgaas @ 2025-09-23 23:14 UTC (permalink / raw)
To: manivannan.sadhasivam
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao
On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> Hi,
>
> This series is one of the 'let's bite the bullet' kind, where we have decided to
> enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
> reason why devicetree platforms were chosen because, it will be of minimal
> impact compared to the ACPI platforms. So seemed ideal to test the waters.
>
> Problem Statement
> =================
>
> Historically, PCI subsystem relied on the BIOS to enable ASPM and Clock PM
> states for PCI devices before the kernel boot if the default states are
> selected using:
>
> * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
> * cmdline: "pcie_aspm=off", or
> * FADT: ACPI_FADT_NO_ASPM
>
> This was done to avoid enabling ASPM for the buggy devices that are known to
> create issues with ASPM (even though they advertise the ASPM capability). But
> BIOS is not at all a thing on most of the non-x86 platforms. For instance, the
> majority of the Embedded and Compute ARM based platforms using devicetree have
> something called bootloader, which is not anyway near the standard BIOS used in
> x86 based platforms. And these bootloaders wouldn't touch PCIe at all, unless
> they boot using PCIe storage, even then there would be no guarantee that the
> ASPM states will get enabled. Another example is the Intel's VMD domain that is
> not at all configured by the BIOS. But, this series is not enabling ASPM/Clock
> PM for VMD domain. I hope it will be done similarly in the future patches.
>
> Solution
> ========
>
> So to avoid relying on BIOS, it was agreed [2] that the PCI subsystem has to
> enable ASPM and Clock PM states based on the device capability. If any devices
> misbehave, then they should be quirked accordingly.
>
> First patch of this series introduces two helper functions to enable all ASPM
> and Clock PM states if of_have_populated_dt() is true. Second patch drops the
> custom ASPM enablement code from the pcie-qcom driver as it is no longer needed.
>
> Testing
> =======
>
> This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
> supported ASPM states are getting enabled for both the NVMe and WLAN devices by
> default.
>
> [1] https://lore.kernel.org/linux-pci/a47sg5ahflhvzyzqnfxvpk3dw4clkhqlhznjxzwqpf4nyjx5dk@bcghz5o6zolk
> [2] https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
>
> Changes in v2:
>
> - Used of_have_populated_dt() instead of CONFIG_OF to identify devicetree
> platforms
> - Renamed the override helpers and changed the override print
> - Moved setting the default state back to the original place and only kept the
> override in helpers
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> Manivannan Sadhasivam (2):
> PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
> PCI: qcom: Remove the custom ASPM enablement code
>
> drivers/pci/controller/dwc/pcie-qcom.c | 32 --------------------------
> drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++--
> 2 files changed, 40 insertions(+), 34 deletions(-)
I tentatively put this on pci/aspm and included it in pci/next.
I think it's too late in the cycle to include this for v6.18, so I'll
probably defer it until v6.19, but maybe we can start getting a little
more testing.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-09-22 16:16 ` [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for " Manivannan Sadhasivam via B4 Relay
@ 2025-10-14 16:30 ` FUKAUMI Naoki
2025-10-14 18:49 ` Bjorn Helgaas
0 siblings, 1 reply; 42+ messages in thread
From: FUKAUMI Naoki @ 2025-10-14 16:30 UTC (permalink / raw)
To: manivannan.sadhasivam, Bjorn Helgaas, Manivannan Sadhasivam,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring
Cc: linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Bjorn Helgaas, Dragan Simic, linux-rockchip
Hi Manivannan Sadhasivam,
I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
Rockchip RK3588(S) SoC.
When running Linux v6.18-rc1 or linux-next since 20250924, the kernel
either freezes or fails to probe M.2 Wi-Fi modules. This happens with
several different modules I've tested, including the Realtek RTL8852BE,
MediaTek MT7921E, and Intel AX210.
I've found that reverting the following commit (i.e., the patch I'm
replying to) resolves the problem:
commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
I'm not sure what the best long-term solution is, but would it be
possible to revert this patch for the time being to fix the regression?
Best regards,
--
FUKAUMI Naoki
Radxa Computer (Shenzhen) Co., Ltd.
On 9/23/25 01:16, Manivannan Sadhasivam via B4 Relay wrote:
> From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>
> So far, the PCI subsystem has honored the ASPM and Clock PM states set by
> the BIOS (through LNKCTL) during device initialization, if it relies on the
> default state selected using:
>
> * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
> * cmdline: "pcie_aspm=off", or
> * FADT: ACPI_FADT_NO_ASPM
>
> This was done conservatively to avoid issues with the buggy devices that
> advertise ASPM capabilities, but behave erratically if the ASPM states are
> enabled. So the PCI subsystem ended up trusting the BIOS to enable only the
> ASPM states that were known to work for the devices.
>
> But this turned out to be a problem for devicetree platforms, especially
> the ARM based devicetree platforms powering Embedded and *some* Compute
> devices as they tend to run without any standard BIOS. So the ASPM states
> on these platforms were left disabled during boot and the PCI subsystem
> never bothered to enable them, unless the user has forcefully enabled the
> ASPM states through Kconfig, cmdline, and sysfs or the device drivers
> themselves, enabling the ASPM states through pci_enable_link_state() APIs.
>
> This caused runtime power issues on those platforms. So a couple of
> approaches were tried to mitigate this BIOS dependency without user
> intervention by enabling the ASPM states in the PCI controller drivers
> after device enumeration, and overriding the ASPM/Clock PM states
> by the PCI controller drivers through an API before enumeration.
>
> But it has been concluded that none of these mitigations should really be
> required and the PCI subsystem should enable the ASPM states advertised by
> the devices without relying on BIOS or the PCI controller drivers. If any
> device is found to be misbehaving after enabling ASPM states that they
> advertised, then those devices should be quirked to disable the problematic
> ASPM/Clock PM states.
>
> In an effort to do so, start by overriding the ASPM and Clock PM states set
> by the BIOS for devicetree platforms first. Separate helper functions are
> introduced to override the BIOS set states by enabling all of them if
> of_have_populated_dt() returns true. To aid debugging, print the overridden
> ASPM and Clock PM states as well.
>
> In the future, these helpers could be extended to allow other platforms
> like VMD, newer ACPI systems with a cutoff year etc... to follow the path.
>
> Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> Link: https://patch.msgid.link/20250916-pci-dt-aspm-v1-1-778fe907c9ad@oss.qualcomm.com
> ---
> drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 919a05b9764791c3cc469c9ada62ba5b2c405118..cda31150aec1b67b6a48b60569222ea3d1c3d41f 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -15,6 +15,7 @@
> #include <linux/math.h>
> #include <linux/module.h>
> #include <linux/moduleparam.h>
> +#include <linux/of.h>
> #include <linux/pci.h>
> #include <linux/pci_regs.h>
> #include <linux/errno.h>
> @@ -235,13 +236,15 @@ struct pcie_link_state {
> u32 aspm_support:7; /* Supported ASPM state */
> u32 aspm_enabled:7; /* Enabled ASPM state */
> u32 aspm_capable:7; /* Capable ASPM state with latency */
> - u32 aspm_default:7; /* Default ASPM state by BIOS */
> + u32 aspm_default:7; /* Default ASPM state by BIOS or
> + override */
> u32 aspm_disable:7; /* Disabled ASPM state */
>
> /* Clock PM state */
> u32 clkpm_capable:1; /* Clock PM capable? */
> u32 clkpm_enabled:1; /* Current Clock PM state */
> - u32 clkpm_default:1; /* Default Clock PM state by BIOS */
> + u32 clkpm_default:1; /* Default Clock PM state by BIOS or
> + override */
> u32 clkpm_disable:1; /* Clock PM disabled */
> };
>
> @@ -373,6 +376,18 @@ static void pcie_set_clkpm(struct pcie_link_state *link, int enable)
> pcie_set_clkpm_nocheck(link, enable);
> }
>
> +static void pcie_clkpm_override_default_link_state(struct pcie_link_state *link,
> + int enabled)
> +{
> + struct pci_dev *pdev = link->downstream;
> +
> + /* Override the BIOS disabled Clock PM state for devicetree platforms */
> + if (of_have_populated_dt() && !enabled) {
> + link->clkpm_default = 1;
> + pci_info(pdev, "Clock PM state overridden: ClockPM+\n");
> + }
> +}
> +
> static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
> {
> int capable = 1, enabled = 1;
> @@ -395,6 +410,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
> }
> link->clkpm_enabled = enabled;
> link->clkpm_default = enabled;
> + pcie_clkpm_override_default_link_state(link, enabled);
> link->clkpm_capable = capable;
> link->clkpm_disable = blacklist ? 1 : 0;
> }
> @@ -788,6 +804,26 @@ static void aspm_l1ss_init(struct pcie_link_state *link)
> aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> }
>
> +static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
> +{
> + struct pci_dev *pdev = link->downstream;
> + u32 override;
> +
> + /* Override the BIOS disabled ASPM states for devicetree platforms */
> + if (of_have_populated_dt()) {
> + link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
> + override = link->aspm_default & ~link->aspm_enabled;
> + if (override)
> + pci_info(pdev, "ASPM states overridden: %s%s%s%s%s%s\n",
> + (override & PCIE_LINK_STATE_L0S) ? "L0s+, " : "",
> + (override & PCIE_LINK_STATE_L1) ? "L1+, " : "",
> + (override & PCIE_LINK_STATE_L1_1) ? "L1.1+, " : "",
> + (override & PCIE_LINK_STATE_L1_2) ? "L1.2+, " : "",
> + (override & PCIE_LINK_STATE_L1_1_PCIPM) ? "L1.1 PCI-PM+, " : "",
> + (override & PCIE_LINK_STATE_L1_2_PCIPM) ? "L1.2 PCI-PM+" : "");
> + }
> +}
> +
> static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> {
> struct pci_dev *child = link->downstream, *parent = link->pdev;
> @@ -868,6 +904,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> /* Save default state */
> link->aspm_default = link->aspm_enabled;
>
> + pcie_aspm_override_default_link_state(link);
> +
> /* Setup initial capable state. Will be updated later */
> link->aspm_capable = link->aspm_support;
>
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-14 16:30 ` FUKAUMI Naoki
@ 2025-10-14 18:49 ` Bjorn Helgaas
2025-10-14 23:33 ` Dragan Simic
` (3 more replies)
0 siblings, 4 replies; 42+ messages in thread
From: Bjorn Helgaas @ 2025-10-14 18:49 UTC (permalink / raw)
To: FUKAUMI Naoki
Cc: manivannan.sadhasivam, Bjorn Helgaas, Manivannan Sadhasivam,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions
[+cc regressions]
On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> Hi Manivannan Sadhasivam,
>
> I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> Rockchip RK3588(S) SoC.
>
> When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> different modules I've tested, including the Realtek RTL8852BE, MediaTek
> MT7921E, and Intel AX210.
>
> I've found that reverting the following commit (i.e., the patch I'm replying
> to) resolves the problem:
> commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
Thanks for the report, and sorry for the regression.
Since this affects several devices from different manufacturers and (I
assume) different drivers, it seems likely that there's some issue
with the Rockchip end, since ASPM probably works on these devices in
other systems. So we should figure out if there's something wrong
with the way we configure ASPM, which we could potentially fix, or if
there's a hardware issue and we need some king of quirk to prevent
usage of ASPM on the affected platforms.
Can you collect a complete dmesg log when booting with
ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
and the output of "sudo lspci -vv"?
When the kernel freezes, can you give us any information about where,
e.g., a log or screenshot?
Do you know if any platforms other than Radxa ROCK 5A/5B have this
problem?
#regzbot introduced: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
#regzbot dup-of: https://github.com/chzigotzky/kernels/issues/17
> On 9/23/25 01:16, Manivannan Sadhasivam via B4 Relay wrote:
> > From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> >
> > So far, the PCI subsystem has honored the ASPM and Clock PM states set by
> > the BIOS (through LNKCTL) during device initialization, if it relies on the
> > default state selected using:
> >
> > * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
> > * cmdline: "pcie_aspm=off", or
> > * FADT: ACPI_FADT_NO_ASPM
> >
> > This was done conservatively to avoid issues with the buggy devices that
> > advertise ASPM capabilities, but behave erratically if the ASPM states are
> > enabled. So the PCI subsystem ended up trusting the BIOS to enable only the
> > ASPM states that were known to work for the devices.
> >
> > But this turned out to be a problem for devicetree platforms, especially
> > the ARM based devicetree platforms powering Embedded and *some* Compute
> > devices as they tend to run without any standard BIOS. So the ASPM states
> > on these platforms were left disabled during boot and the PCI subsystem
> > never bothered to enable them, unless the user has forcefully enabled the
> > ASPM states through Kconfig, cmdline, and sysfs or the device drivers
> > themselves, enabling the ASPM states through pci_enable_link_state() APIs.
> >
> > This caused runtime power issues on those platforms. So a couple of
> > approaches were tried to mitigate this BIOS dependency without user
> > intervention by enabling the ASPM states in the PCI controller drivers
> > after device enumeration, and overriding the ASPM/Clock PM states
> > by the PCI controller drivers through an API before enumeration.
> >
> > But it has been concluded that none of these mitigations should really be
> > required and the PCI subsystem should enable the ASPM states advertised by
> > the devices without relying on BIOS or the PCI controller drivers. If any
> > device is found to be misbehaving after enabling ASPM states that they
> > advertised, then those devices should be quirked to disable the problematic
> > ASPM/Clock PM states.
> >
> > In an effort to do so, start by overriding the ASPM and Clock PM states set
> > by the BIOS for devicetree platforms first. Separate helper functions are
> > introduced to override the BIOS set states by enabling all of them if
> > of_have_populated_dt() returns true. To aid debugging, print the overridden
> > ASPM and Clock PM states as well.
> >
> > In the future, these helpers could be extended to allow other platforms
> > like VMD, newer ACPI systems with a cutoff year etc... to follow the path.
> >
> > Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
> > Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> > Link: https://patch.msgid.link/20250916-pci-dt-aspm-v1-1-778fe907c9ad@oss.qualcomm.com
> > ---
> > drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 40 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 919a05b9764791c3cc469c9ada62ba5b2c405118..cda31150aec1b67b6a48b60569222ea3d1c3d41f 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -15,6 +15,7 @@
> > #include <linux/math.h>
> > #include <linux/module.h>
> > #include <linux/moduleparam.h>
> > +#include <linux/of.h>
> > #include <linux/pci.h>
> > #include <linux/pci_regs.h>
> > #include <linux/errno.h>
> > @@ -235,13 +236,15 @@ struct pcie_link_state {
> > u32 aspm_support:7; /* Supported ASPM state */
> > u32 aspm_enabled:7; /* Enabled ASPM state */
> > u32 aspm_capable:7; /* Capable ASPM state with latency */
> > - u32 aspm_default:7; /* Default ASPM state by BIOS */
> > + u32 aspm_default:7; /* Default ASPM state by BIOS or
> > + override */
> > u32 aspm_disable:7; /* Disabled ASPM state */
> > /* Clock PM state */
> > u32 clkpm_capable:1; /* Clock PM capable? */
> > u32 clkpm_enabled:1; /* Current Clock PM state */
> > - u32 clkpm_default:1; /* Default Clock PM state by BIOS */
> > + u32 clkpm_default:1; /* Default Clock PM state by BIOS or
> > + override */
> > u32 clkpm_disable:1; /* Clock PM disabled */
> > };
> > @@ -373,6 +376,18 @@ static void pcie_set_clkpm(struct pcie_link_state *link, int enable)
> > pcie_set_clkpm_nocheck(link, enable);
> > }
> > +static void pcie_clkpm_override_default_link_state(struct pcie_link_state *link,
> > + int enabled)
> > +{
> > + struct pci_dev *pdev = link->downstream;
> > +
> > + /* Override the BIOS disabled Clock PM state for devicetree platforms */
> > + if (of_have_populated_dt() && !enabled) {
> > + link->clkpm_default = 1;
> > + pci_info(pdev, "Clock PM state overridden: ClockPM+\n");
> > + }
> > +}
> > +
> > static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
> > {
> > int capable = 1, enabled = 1;
> > @@ -395,6 +410,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
> > }
> > link->clkpm_enabled = enabled;
> > link->clkpm_default = enabled;
> > + pcie_clkpm_override_default_link_state(link, enabled);
> > link->clkpm_capable = capable;
> > link->clkpm_disable = blacklist ? 1 : 0;
> > }
> > @@ -788,6 +804,26 @@ static void aspm_l1ss_init(struct pcie_link_state *link)
> > aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> > }
> > +static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
> > +{
> > + struct pci_dev *pdev = link->downstream;
> > + u32 override;
> > +
> > + /* Override the BIOS disabled ASPM states for devicetree platforms */
> > + if (of_have_populated_dt()) {
> > + link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
> > + override = link->aspm_default & ~link->aspm_enabled;
> > + if (override)
> > + pci_info(pdev, "ASPM states overridden: %s%s%s%s%s%s\n",
> > + (override & PCIE_LINK_STATE_L0S) ? "L0s+, " : "",
> > + (override & PCIE_LINK_STATE_L1) ? "L1+, " : "",
> > + (override & PCIE_LINK_STATE_L1_1) ? "L1.1+, " : "",
> > + (override & PCIE_LINK_STATE_L1_2) ? "L1.2+, " : "",
> > + (override & PCIE_LINK_STATE_L1_1_PCIPM) ? "L1.1 PCI-PM+, " : "",
> > + (override & PCIE_LINK_STATE_L1_2_PCIPM) ? "L1.2 PCI-PM+" : "");
> > + }
> > +}
> > +
> > static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > {
> > struct pci_dev *child = link->downstream, *parent = link->pdev;
> > @@ -868,6 +904,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > /* Save default state */
> > link->aspm_default = link->aspm_enabled;
> > + pcie_aspm_override_default_link_state(link);
> > +
> > /* Setup initial capable state. Will be updated later */
> > link->aspm_capable = link->aspm_support;
> >
>
>
> _______________________________________________
> Linux-rockchip mailing list
> Linux-rockchip@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-14 18:49 ` Bjorn Helgaas
@ 2025-10-14 23:33 ` Dragan Simic
2025-10-15 6:22 ` Manivannan Sadhasivam
2025-10-15 6:26 ` Manivannan Sadhasivam
` (2 subsequent siblings)
3 siblings, 1 reply; 42+ messages in thread
From: Dragan Simic @ 2025-10-14 23:33 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: FUKAUMI Naoki, manivannan.sadhasivam, Bjorn Helgaas,
Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, linux-rockchip, regressions
Hello all,
On Tuesday, October 14, 2025 20:49 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> > Rockchip RK3588(S) SoC.
> >
> > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> > different modules I've tested, including the Realtek RTL8852BE, MediaTek
> > MT7921E, and Intel AX210.
> >
> > I've found that reverting the following commit (i.e., the patch I'm replying
> > to) resolves the problem:
> > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>
> Thanks for the report, and sorry for the regression.
>
> Since this affects several devices from different manufacturers and (I
> assume) different drivers, it seems likely that there's some issue
> with the Rockchip end, since ASPM probably works on these devices in
> other systems. So we should figure out if there's something wrong
> with the way we configure ASPM, which we could potentially fix, or if
> there's a hardware issue and we need some king of quirk to prevent
> usage of ASPM on the affected platforms.
>
> Can you collect a complete dmesg log when booting with
>
> ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
>
> and the output of "sudo lspci -vv"?
>
> When the kernel freezes, can you give us any information about where,
> e.g., a log or screenshot?
>
> Do you know if any platforms other than Radxa ROCK 5A/5B have this
> problem?
After thinking quite a bit about it, I think we should revert this
patch and replace it with another patch that allows per-SoC, or
maybe even per-board, opting into the forced enablement of PCIe
ASPM. Let me explain, please.
When a new feature is introduced, it's expected that it may fail
on some hardware or with some specific setups, so quirking off such
instances, as time passes, is perfectly fine. Such a new feature
didn't work before it was implemented, so it's acceptable that it
fails in some instances after the introduction, and that it gets
quirked off as time passes and more testing is performed.
However, when some widespread feature, such as PCIe, has already
been in production for quite a while, introducing high-risk changes
to it in a blanket fashion, while intending to have the incompatible
or not-yet-ready platforms quirked off over time, simply isn't the
way to go. Breaking stuff intentionally to find out what actually
doesn't work is rarely a good option.
Thus, I'd suggest that this patch is replaced with nother patches,
which would introduce an additional ASPM opt-in switch to the PCI
binding, allowing SoCs or boards to have it enabled _after_ proper
testing is performed. The PCIe driver may emit a warning that ASPM
is to be enabled at some point in the future, to "bug" people about
the need to perform the testing, etc. With all that in place, we
could expect that in a year or two PCIe ASPM could eventually be
enabled everywhere. Getting everything tested is a massive endeavor,
but that's the only way not to break stuff.
Biting the bullet and hoping that it all goes well, I'd say, isn't
the right approach here.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-14 23:33 ` Dragan Simic
@ 2025-10-15 6:22 ` Manivannan Sadhasivam
2025-10-15 11:23 ` Diederik de Haas
2025-10-23 18:57 ` Dragan Simic
0 siblings, 2 replies; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-15 6:22 UTC (permalink / raw)
To: Dragan Simic
Cc: Bjorn Helgaas, FUKAUMI Naoki, manivannan.sadhasivam,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
linux-rockchip, regressions
On Wed, Oct 15, 2025 at 01:33:35AM +0200, Dragan Simic wrote:
> Hello all,
>
> On Tuesday, October 14, 2025 20:49 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> > > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> > > Rockchip RK3588(S) SoC.
> > >
> > > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> > > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> > > different modules I've tested, including the Realtek RTL8852BE, MediaTek
> > > MT7921E, and Intel AX210.
> > >
> > > I've found that reverting the following commit (i.e., the patch I'm replying
> > > to) resolves the problem:
> > > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
> >
> > Thanks for the report, and sorry for the regression.
> >
> > Since this affects several devices from different manufacturers and (I
> > assume) different drivers, it seems likely that there's some issue
> > with the Rockchip end, since ASPM probably works on these devices in
> > other systems. So we should figure out if there's something wrong
> > with the way we configure ASPM, which we could potentially fix, or if
> > there's a hardware issue and we need some king of quirk to prevent
> > usage of ASPM on the affected platforms.
> >
> > Can you collect a complete dmesg log when booting with
> >
> > ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
> >
> > and the output of "sudo lspci -vv"?
> >
> > When the kernel freezes, can you give us any information about where,
> > e.g., a log or screenshot?
> >
> > Do you know if any platforms other than Radxa ROCK 5A/5B have this
> > problem?
>
> After thinking quite a bit about it, I think we should revert this
> patch and replace it with another patch that allows per-SoC, or
> maybe even per-board, opting into the forced enablement of PCIe
> ASPM. Let me explain, please.
>
ASPM is a PCIe device specific feature, nothing related to SoC/board. Even if
you limit it to certain platforms, there is no guarantee that it will be safe as
the users can connect a buggy device to the slot and it could lead to the same
issue.
> When a new feature is introduced, it's expected that it may fail
> on some hardware or with some specific setups, so quirking off such
> instances, as time passes, is perfectly fine. Such a new feature
> didn't work before it was implemented, so it's acceptable that it
> fails in some instances after the introduction, and that it gets
> quirked off as time passes and more testing is performed.
>
ASPM is not a new feature. It was introduced more than a decade before. But we
somehow procastinated the enablement for so long until we realized that if we
don't do it now, we wouldn't be able to do it anytime in the future.
> However, when some widespread feature, such as PCIe, has already
> been in production for quite a while, introducing high-risk changes
> to it in a blanket fashion, while intending to have the incompatible
> or not-yet-ready platforms quirked off over time, simply isn't the
> way to go. Breaking stuff intentionally to find out what actually
> doesn't work is rarely a good option.
>
The issue is due to devices exposing ASPM capability, but behaving erratically
when enabled. Until, we enable ASPM on these devices, we cannot know whether
they are working or not. To avoid mass chaos, we decided to enable it only for
devicetree platforms as a start.
> Thus, I'd suggest that this patch is replaced with nother patches,
> which would introduce an additional ASPM opt-in switch to the PCI
> binding, allowing SoCs or boards to have it enabled _after_ proper
> testing is performed. The PCIe driver may emit a warning that ASPM
> is to be enabled at some point in the future, to "bug" people about
> the need to perform the testing, etc.
Even if we emit a "YOUR DEVICE MAY BREAK" warning, nobody would care as long as
the device works for them. We didn't decide to enable this feature overnight to
trouble users. The fact that ASPM saves runtime power, which will benefit users
and ofc the environment as a whole, should not be kept disabled.
But does that mean, we wanted to have breakages, NO. We expected breakages as
not all devices will play nicely with ASPM, but there is only one way to find
out. And we do want to disable ASPM only for those devices.
> With all that in place, we
> could expect that in a year or two PCIe ASPM could eventually be
> enabled everywhere. Getting everything tested is a massive endeavor,
> but that's the only way not to break stuff.
>
> Biting the bullet and hoping that it all goes well, I'd say, isn't
> the right approach here.
>
Your two year phased approach would never work as that's what we have hoped for
more than a decade.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-14 18:49 ` Bjorn Helgaas
2025-10-14 23:33 ` Dragan Simic
@ 2025-10-15 6:26 ` Manivannan Sadhasivam
2025-10-15 7:13 ` FUKAUMI Naoki
2025-10-15 12:26 ` Diederik de Haas
2025-10-30 22:14 ` Bjorn Helgaas
3 siblings, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-15 6:26 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: FUKAUMI Naoki, manivannan.sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions
On Tue, Oct 14, 2025 at 01:49:05PM -0500, Bjorn Helgaas wrote:
> [+cc regressions]
>
> On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> > Hi Manivannan Sadhasivam,
> >
> > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> > Rockchip RK3588(S) SoC.
> >
> > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> > different modules I've tested, including the Realtek RTL8852BE, MediaTek
> > MT7921E, and Intel AX210.
> >
> > I've found that reverting the following commit (i.e., the patch I'm replying
> > to) resolves the problem:
> > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>
> Thanks for the report, and sorry for the regression.
>
> Since this affects several devices from different manufacturers and (I
> assume) different drivers, it seems likely that there's some issue
> with the Rockchip end, since ASPM probably works on these devices in
> other systems. So we should figure out if there's something wrong
> with the way we configure ASPM, which we could potentially fix, or if
> there's a hardware issue and we need some king of quirk to prevent
> usage of ASPM on the affected platforms.
>
I believe it is the latter. The Root Port is having trouble with ASPM.
FUKAUMI Naoki, could you please share the 'sudo lspci -vv' output so that we
know what kind of Root Port we are dealing with? You can revert the offending
patch and share the output.
Then I could supply a patch that you can test out.
- Mani
> Can you collect a complete dmesg log when booting with
>
> ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
>
> and the output of "sudo lspci -vv"?
>
> When the kernel freezes, can you give us any information about where,
> e.g., a log or screenshot?
>
> Do you know if any platforms other than Radxa ROCK 5A/5B have this
> problem?
>
> #regzbot introduced: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
> #regzbot dup-of: https://github.com/chzigotzky/kernels/issues/17
>
> > On 9/23/25 01:16, Manivannan Sadhasivam via B4 Relay wrote:
> > > From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> > >
> > > So far, the PCI subsystem has honored the ASPM and Clock PM states set by
> > > the BIOS (through LNKCTL) during device initialization, if it relies on the
> > > default state selected using:
> > >
> > > * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
> > > * cmdline: "pcie_aspm=off", or
> > > * FADT: ACPI_FADT_NO_ASPM
> > >
> > > This was done conservatively to avoid issues with the buggy devices that
> > > advertise ASPM capabilities, but behave erratically if the ASPM states are
> > > enabled. So the PCI subsystem ended up trusting the BIOS to enable only the
> > > ASPM states that were known to work for the devices.
> > >
> > > But this turned out to be a problem for devicetree platforms, especially
> > > the ARM based devicetree platforms powering Embedded and *some* Compute
> > > devices as they tend to run without any standard BIOS. So the ASPM states
> > > on these platforms were left disabled during boot and the PCI subsystem
> > > never bothered to enable them, unless the user has forcefully enabled the
> > > ASPM states through Kconfig, cmdline, and sysfs or the device drivers
> > > themselves, enabling the ASPM states through pci_enable_link_state() APIs.
> > >
> > > This caused runtime power issues on those platforms. So a couple of
> > > approaches were tried to mitigate this BIOS dependency without user
> > > intervention by enabling the ASPM states in the PCI controller drivers
> > > after device enumeration, and overriding the ASPM/Clock PM states
> > > by the PCI controller drivers through an API before enumeration.
> > >
> > > But it has been concluded that none of these mitigations should really be
> > > required and the PCI subsystem should enable the ASPM states advertised by
> > > the devices without relying on BIOS or the PCI controller drivers. If any
> > > device is found to be misbehaving after enabling ASPM states that they
> > > advertised, then those devices should be quirked to disable the problematic
> > > ASPM/Clock PM states.
> > >
> > > In an effort to do so, start by overriding the ASPM and Clock PM states set
> > > by the BIOS for devicetree platforms first. Separate helper functions are
> > > introduced to override the BIOS set states by enabling all of them if
> > > of_have_populated_dt() returns true. To aid debugging, print the overridden
> > > ASPM and Clock PM states as well.
> > >
> > > In the future, these helpers could be extended to allow other platforms
> > > like VMD, newer ACPI systems with a cutoff year etc... to follow the path.
> > >
> > > Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
> > > Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> > > Link: https://patch.msgid.link/20250916-pci-dt-aspm-v1-1-778fe907c9ad@oss.qualcomm.com
> > > ---
> > > drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++++++++++--
> > > 1 file changed, 40 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index 919a05b9764791c3cc469c9ada62ba5b2c405118..cda31150aec1b67b6a48b60569222ea3d1c3d41f 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -15,6 +15,7 @@
> > > #include <linux/math.h>
> > > #include <linux/module.h>
> > > #include <linux/moduleparam.h>
> > > +#include <linux/of.h>
> > > #include <linux/pci.h>
> > > #include <linux/pci_regs.h>
> > > #include <linux/errno.h>
> > > @@ -235,13 +236,15 @@ struct pcie_link_state {
> > > u32 aspm_support:7; /* Supported ASPM state */
> > > u32 aspm_enabled:7; /* Enabled ASPM state */
> > > u32 aspm_capable:7; /* Capable ASPM state with latency */
> > > - u32 aspm_default:7; /* Default ASPM state by BIOS */
> > > + u32 aspm_default:7; /* Default ASPM state by BIOS or
> > > + override */
> > > u32 aspm_disable:7; /* Disabled ASPM state */
> > > /* Clock PM state */
> > > u32 clkpm_capable:1; /* Clock PM capable? */
> > > u32 clkpm_enabled:1; /* Current Clock PM state */
> > > - u32 clkpm_default:1; /* Default Clock PM state by BIOS */
> > > + u32 clkpm_default:1; /* Default Clock PM state by BIOS or
> > > + override */
> > > u32 clkpm_disable:1; /* Clock PM disabled */
> > > };
> > > @@ -373,6 +376,18 @@ static void pcie_set_clkpm(struct pcie_link_state *link, int enable)
> > > pcie_set_clkpm_nocheck(link, enable);
> > > }
> > > +static void pcie_clkpm_override_default_link_state(struct pcie_link_state *link,
> > > + int enabled)
> > > +{
> > > + struct pci_dev *pdev = link->downstream;
> > > +
> > > + /* Override the BIOS disabled Clock PM state for devicetree platforms */
> > > + if (of_have_populated_dt() && !enabled) {
> > > + link->clkpm_default = 1;
> > > + pci_info(pdev, "Clock PM state overridden: ClockPM+\n");
> > > + }
> > > +}
> > > +
> > > static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
> > > {
> > > int capable = 1, enabled = 1;
> > > @@ -395,6 +410,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
> > > }
> > > link->clkpm_enabled = enabled;
> > > link->clkpm_default = enabled;
> > > + pcie_clkpm_override_default_link_state(link, enabled);
> > > link->clkpm_capable = capable;
> > > link->clkpm_disable = blacklist ? 1 : 0;
> > > }
> > > @@ -788,6 +804,26 @@ static void aspm_l1ss_init(struct pcie_link_state *link)
> > > aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> > > }
> > > +static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
> > > +{
> > > + struct pci_dev *pdev = link->downstream;
> > > + u32 override;
> > > +
> > > + /* Override the BIOS disabled ASPM states for devicetree platforms */
> > > + if (of_have_populated_dt()) {
> > > + link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
> > > + override = link->aspm_default & ~link->aspm_enabled;
> > > + if (override)
> > > + pci_info(pdev, "ASPM states overridden: %s%s%s%s%s%s\n",
> > > + (override & PCIE_LINK_STATE_L0S) ? "L0s+, " : "",
> > > + (override & PCIE_LINK_STATE_L1) ? "L1+, " : "",
> > > + (override & PCIE_LINK_STATE_L1_1) ? "L1.1+, " : "",
> > > + (override & PCIE_LINK_STATE_L1_2) ? "L1.2+, " : "",
> > > + (override & PCIE_LINK_STATE_L1_1_PCIPM) ? "L1.1 PCI-PM+, " : "",
> > > + (override & PCIE_LINK_STATE_L1_2_PCIPM) ? "L1.2 PCI-PM+" : "");
> > > + }
> > > +}
> > > +
> > > static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > > {
> > > struct pci_dev *child = link->downstream, *parent = link->pdev;
> > > @@ -868,6 +904,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > > /* Save default state */
> > > link->aspm_default = link->aspm_enabled;
> > > + pcie_aspm_override_default_link_state(link);
> > > +
> > > /* Setup initial capable state. Will be updated later */
> > > link->aspm_capable = link->aspm_support;
> > >
> >
> >
> > _______________________________________________
> > Linux-rockchip mailing list
> > Linux-rockchip@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-rockchip
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 6:26 ` Manivannan Sadhasivam
@ 2025-10-15 7:13 ` FUKAUMI Naoki
2025-10-15 7:50 ` Manivannan Sadhasivam
0 siblings, 1 reply; 42+ messages in thread
From: FUKAUMI Naoki @ 2025-10-15 7:13 UTC (permalink / raw)
To: Manivannan Sadhasivam, Bjorn Helgaas
Cc: manivannan.sadhasivam, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Dragan Simic, linux-rockchip,
regressions
Hi,
On 10/15/25 15:26, Manivannan Sadhasivam wrote:
> On Tue, Oct 14, 2025 at 01:49:05PM -0500, Bjorn Helgaas wrote:
>> [+cc regressions]
>>
>> On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
>>> Hi Manivannan Sadhasivam,
>>>
>>> I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
>>> Rockchip RK3588(S) SoC.
>>>
>>> When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
>>> freezes or fails to probe M.2 Wi-Fi modules. This happens with several
>>> different modules I've tested, including the Realtek RTL8852BE, MediaTek
>>> MT7921E, and Intel AX210.
>>>
>>> I've found that reverting the following commit (i.e., the patch I'm replying
>>> to) resolves the problem:
>>> commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>>
>> Thanks for the report, and sorry for the regression.
>>
>> Since this affects several devices from different manufacturers and (I
>> assume) different drivers, it seems likely that there's some issue
>> with the Rockchip end, since ASPM probably works on these devices in
>> other systems. So we should figure out if there's something wrong
>> with the way we configure ASPM, which we could potentially fix, or if
>> there's a hardware issue and we need some king of quirk to prevent
>> usage of ASPM on the affected platforms.
>>
>
> I believe it is the latter. The Root Port is having trouble with ASPM.
>
> FUKAUMI Naoki, could you please share the 'sudo lspci -vv' output so that we
> know what kind of Root Port we are dealing with? You can revert the offending
> patch and share the output.
Here is dmesg/lspci output on ROCK 5A(RK3588S):
https://gist.github.com/RadxaNaoki/1355a0b4278b6e51a61d89df7a535a5d
----
I've (likely) noticed another PCIe issue on the ROCK 5B (RK3588).
Reverting commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961 on top of
commit 331b2acbe6123dbbcb5c34698692959abbf5748b definitely gets the M.2
Wi-Fi working. However, reverting commit
f3ac2ff14834a0aa056ee3ae0e4b8c641c579961 on v6.18-rc1/next-20251014 does
not work. (It does work on the ROCK 5A.)
Commit 331b2acbe6123dbbcb5c34698692959abbf5748b landed between
next-20250923 and next-20250924. I'm currently in the process of bisecting.
Best regards,
--
FUKAUMI Naoki
Radxa Computer (Shenzhen) Co., Ltd.
> Then I could supply a patch that you can test out.
>
> - Mani
>
>> Can you collect a complete dmesg log when booting with
>>
>> ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
>>
>> and the output of "sudo lspci -vv"?
>>
>> When the kernel freezes, can you give us any information about where,
>> e.g., a log or screenshot?
>>
>> Do you know if any platforms other than Radxa ROCK 5A/5B have this
>> problem?
>>
>> #regzbot introduced: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
>> #regzbot dup-of: https://github.com/chzigotzky/kernels/issues/17
>>
>>> On 9/23/25 01:16, Manivannan Sadhasivam via B4 Relay wrote:
>>>> From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>>>>
>>>> So far, the PCI subsystem has honored the ASPM and Clock PM states set by
>>>> the BIOS (through LNKCTL) during device initialization, if it relies on the
>>>> default state selected using:
>>>>
>>>> * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
>>>> * cmdline: "pcie_aspm=off", or
>>>> * FADT: ACPI_FADT_NO_ASPM
>>>>
>>>> This was done conservatively to avoid issues with the buggy devices that
>>>> advertise ASPM capabilities, but behave erratically if the ASPM states are
>>>> enabled. So the PCI subsystem ended up trusting the BIOS to enable only the
>>>> ASPM states that were known to work for the devices.
>>>>
>>>> But this turned out to be a problem for devicetree platforms, especially
>>>> the ARM based devicetree platforms powering Embedded and *some* Compute
>>>> devices as they tend to run without any standard BIOS. So the ASPM states
>>>> on these platforms were left disabled during boot and the PCI subsystem
>>>> never bothered to enable them, unless the user has forcefully enabled the
>>>> ASPM states through Kconfig, cmdline, and sysfs or the device drivers
>>>> themselves, enabling the ASPM states through pci_enable_link_state() APIs.
>>>>
>>>> This caused runtime power issues on those platforms. So a couple of
>>>> approaches were tried to mitigate this BIOS dependency without user
>>>> intervention by enabling the ASPM states in the PCI controller drivers
>>>> after device enumeration, and overriding the ASPM/Clock PM states
>>>> by the PCI controller drivers through an API before enumeration.
>>>>
>>>> But it has been concluded that none of these mitigations should really be
>>>> required and the PCI subsystem should enable the ASPM states advertised by
>>>> the devices without relying on BIOS or the PCI controller drivers. If any
>>>> device is found to be misbehaving after enabling ASPM states that they
>>>> advertised, then those devices should be quirked to disable the problematic
>>>> ASPM/Clock PM states.
>>>>
>>>> In an effort to do so, start by overriding the ASPM and Clock PM states set
>>>> by the BIOS for devicetree platforms first. Separate helper functions are
>>>> introduced to override the BIOS set states by enabling all of them if
>>>> of_have_populated_dt() returns true. To aid debugging, print the overridden
>>>> ASPM and Clock PM states as well.
>>>>
>>>> In the future, these helpers could be extended to allow other platforms
>>>> like VMD, newer ACPI systems with a cutoff year etc... to follow the path.
>>>>
>>>> Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
>>>> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
>>>> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>>>> Link: https://patch.msgid.link/20250916-pci-dt-aspm-v1-1-778fe907c9ad@oss.qualcomm.com
>>>> ---
>>>> drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>>>> 1 file changed, 40 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>>> index 919a05b9764791c3cc469c9ada62ba5b2c405118..cda31150aec1b67b6a48b60569222ea3d1c3d41f 100644
>>>> --- a/drivers/pci/pcie/aspm.c
>>>> +++ b/drivers/pci/pcie/aspm.c
>>>> @@ -15,6 +15,7 @@
>>>> #include <linux/math.h>
>>>> #include <linux/module.h>
>>>> #include <linux/moduleparam.h>
>>>> +#include <linux/of.h>
>>>> #include <linux/pci.h>
>>>> #include <linux/pci_regs.h>
>>>> #include <linux/errno.h>
>>>> @@ -235,13 +236,15 @@ struct pcie_link_state {
>>>> u32 aspm_support:7; /* Supported ASPM state */
>>>> u32 aspm_enabled:7; /* Enabled ASPM state */
>>>> u32 aspm_capable:7; /* Capable ASPM state with latency */
>>>> - u32 aspm_default:7; /* Default ASPM state by BIOS */
>>>> + u32 aspm_default:7; /* Default ASPM state by BIOS or
>>>> + override */
>>>> u32 aspm_disable:7; /* Disabled ASPM state */
>>>> /* Clock PM state */
>>>> u32 clkpm_capable:1; /* Clock PM capable? */
>>>> u32 clkpm_enabled:1; /* Current Clock PM state */
>>>> - u32 clkpm_default:1; /* Default Clock PM state by BIOS */
>>>> + u32 clkpm_default:1; /* Default Clock PM state by BIOS or
>>>> + override */
>>>> u32 clkpm_disable:1; /* Clock PM disabled */
>>>> };
>>>> @@ -373,6 +376,18 @@ static void pcie_set_clkpm(struct pcie_link_state *link, int enable)
>>>> pcie_set_clkpm_nocheck(link, enable);
>>>> }
>>>> +static void pcie_clkpm_override_default_link_state(struct pcie_link_state *link,
>>>> + int enabled)
>>>> +{
>>>> + struct pci_dev *pdev = link->downstream;
>>>> +
>>>> + /* Override the BIOS disabled Clock PM state for devicetree platforms */
>>>> + if (of_have_populated_dt() && !enabled) {
>>>> + link->clkpm_default = 1;
>>>> + pci_info(pdev, "Clock PM state overridden: ClockPM+\n");
>>>> + }
>>>> +}
>>>> +
>>>> static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
>>>> {
>>>> int capable = 1, enabled = 1;
>>>> @@ -395,6 +410,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
>>>> }
>>>> link->clkpm_enabled = enabled;
>>>> link->clkpm_default = enabled;
>>>> + pcie_clkpm_override_default_link_state(link, enabled);
>>>> link->clkpm_capable = capable;
>>>> link->clkpm_disable = blacklist ? 1 : 0;
>>>> }
>>>> @@ -788,6 +804,26 @@ static void aspm_l1ss_init(struct pcie_link_state *link)
>>>> aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
>>>> }
>>>> +static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
>>>> +{
>>>> + struct pci_dev *pdev = link->downstream;
>>>> + u32 override;
>>>> +
>>>> + /* Override the BIOS disabled ASPM states for devicetree platforms */
>>>> + if (of_have_populated_dt()) {
>>>> + link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
>>>> + override = link->aspm_default & ~link->aspm_enabled;
>>>> + if (override)
>>>> + pci_info(pdev, "ASPM states overridden: %s%s%s%s%s%s\n",
>>>> + (override & PCIE_LINK_STATE_L0S) ? "L0s+, " : "",
>>>> + (override & PCIE_LINK_STATE_L1) ? "L1+, " : "",
>>>> + (override & PCIE_LINK_STATE_L1_1) ? "L1.1+, " : "",
>>>> + (override & PCIE_LINK_STATE_L1_2) ? "L1.2+, " : "",
>>>> + (override & PCIE_LINK_STATE_L1_1_PCIPM) ? "L1.1 PCI-PM+, " : "",
>>>> + (override & PCIE_LINK_STATE_L1_2_PCIPM) ? "L1.2 PCI-PM+" : "");
>>>> + }
>>>> +}
>>>> +
>>>> static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>>>> {
>>>> struct pci_dev *child = link->downstream, *parent = link->pdev;
>>>> @@ -868,6 +904,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>>>> /* Save default state */
>>>> link->aspm_default = link->aspm_enabled;
>>>> + pcie_aspm_override_default_link_state(link);
>>>> +
>>>> /* Setup initial capable state. Will be updated later */
>>>> link->aspm_capable = link->aspm_support;
>>>>
>>>
>>>
>>> _______________________________________________
>>> Linux-rockchip mailing list
>>> Linux-rockchip@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-rockchip
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 7:13 ` FUKAUMI Naoki
@ 2025-10-15 7:50 ` Manivannan Sadhasivam
2025-10-15 9:11 ` Shawn Lin
0 siblings, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-15 7:50 UTC (permalink / raw)
To: FUKAUMI Naoki
Cc: Bjorn Helgaas, manivannan.sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions
On Wed, Oct 15, 2025 at 04:13:41PM +0900, FUKAUMI Naoki wrote:
> Hi,
>
> On 10/15/25 15:26, Manivannan Sadhasivam wrote:
> > On Tue, Oct 14, 2025 at 01:49:05PM -0500, Bjorn Helgaas wrote:
> > > [+cc regressions]
> > >
> > > On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> > > > Hi Manivannan Sadhasivam,
> > > >
> > > > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> > > > Rockchip RK3588(S) SoC.
> > > >
> > > > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> > > > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> > > > different modules I've tested, including the Realtek RTL8852BE, MediaTek
> > > > MT7921E, and Intel AX210.
> > > >
> > > > I've found that reverting the following commit (i.e., the patch I'm replying
> > > > to) resolves the problem:
> > > > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
> > >
> > > Thanks for the report, and sorry for the regression.
> > >
> > > Since this affects several devices from different manufacturers and (I
> > > assume) different drivers, it seems likely that there's some issue
> > > with the Rockchip end, since ASPM probably works on these devices in
> > > other systems. So we should figure out if there's something wrong
> > > with the way we configure ASPM, which we could potentially fix, or if
> > > there's a hardware issue and we need some king of quirk to prevent
> > > usage of ASPM on the affected platforms.
> > >
> >
> > I believe it is the latter. The Root Port is having trouble with ASPM.
> >
> > FUKAUMI Naoki, could you please share the 'sudo lspci -vv' output so that we
> > know what kind of Root Port we are dealing with? You can revert the offending
> > patch and share the output.
>
> Here is dmesg/lspci output on ROCK 5A(RK3588S):
> https://gist.github.com/RadxaNaoki/1355a0b4278b6e51a61d89df7a535a5d
>
Thanks! Could you please try the below diff with f3ac2ff14834 applied?
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 214ed060ca1b..0069d06c282d 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2525,6 +2525,15 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
*/
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
+
+static void quirk_disable_aspm_all(struct pci_dev *dev)
+{
+ pci_info(dev, "Disabling ASPM\n");
+ pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_all);
+
/*
* Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
* Link bit cleared after starting the link retrain process to allow this
From your previous comment, I believe the Root Port is having the issues with
ASPM as you seem to have tried connecting different devices to the slot. So I
disabled ASPM for the Root Port with the above diff.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 7:50 ` Manivannan Sadhasivam
@ 2025-10-15 9:11 ` Shawn Lin
2025-10-15 9:43 ` Manivannan Sadhasivam
2025-10-15 9:46 ` Niklas Cassel
0 siblings, 2 replies; 42+ messages in thread
From: Shawn Lin @ 2025-10-15 9:11 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: shawn.lin, Bjorn Helgaas, manivannan.sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
Hi Mani
在 2025/10/15 星期三 15:50, Manivannan Sadhasivam 写道:
> On Wed, Oct 15, 2025 at 04:13:41PM +0900, FUKAUMI Naoki wrote:
>> Hi,
>>
>> On 10/15/25 15:26, Manivannan Sadhasivam wrote:
>>> On Tue, Oct 14, 2025 at 01:49:05PM -0500, Bjorn Helgaas wrote:
>>>> [+cc regressions]
>>>>
>>>> On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
>>>>> Hi Manivannan Sadhasivam,
>>>>>
>>>>> I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
>>>>> Rockchip RK3588(S) SoC.
>>>>>
>>>>> When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
>>>>> freezes or fails to probe M.2 Wi-Fi modules. This happens with several
>>>>> different modules I've tested, including the Realtek RTL8852BE, MediaTek
>>>>> MT7921E, and Intel AX210.
>>>>>
>>>>> I've found that reverting the following commit (i.e., the patch I'm replying
>>>>> to) resolves the problem:
>>>>> commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>>>>
>>>> Thanks for the report, and sorry for the regression.
>>>>
>>>> Since this affects several devices from different manufacturers and (I
>>>> assume) different drivers, it seems likely that there's some issue
>>>> with the Rockchip end, since ASPM probably works on these devices in
>>>> other systems. So we should figure out if there's something wrong
>>>> with the way we configure ASPM, which we could potentially fix, or if
>>>> there's a hardware issue and we need some king of quirk to prevent
>>>> usage of ASPM on the affected platforms.
>>>>
>>>
>>> I believe it is the latter. The Root Port is having trouble with ASPM.
>>>
>>> FUKAUMI Naoki, could you please share the 'sudo lspci -vv' output so that we
>>> know what kind of Root Port we are dealing with? You can revert the offending
>>> patch and share the output.
>>
>> Here is dmesg/lspci output on ROCK 5A(RK3588S):
>> https://gist.github.com/RadxaNaoki/1355a0b4278b6e51a61d89df7a535a5d
>>
>
> Thanks! Could you please try the below diff with f3ac2ff14834 applied?
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 214ed060ca1b..0069d06c282d 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2525,6 +2525,15 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> */
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
>
> +
> +static void quirk_disable_aspm_all(struct pci_dev *dev)
> +{
> + pci_info(dev, "Disabling ASPM\n");
> + pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
> +}
> +
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_all);
That's not true from my POV. Rockchip platform supports all ASPM policy
after mass production verification. I also verified current upstream
code this morning with RK3588-EVB and can check L0s/L1/L1ss work fine.
The log and lspci output could be found here:
https://pastebin.com/qizeYED7
Moreover, I disscussed this issue with FUKAUMI today off-list and his
board seems to work when only disable L1ss by patching:
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -813,7 +813,7 @@ static void
pcie_aspm_override_default_link_state(struct pcie_link_state *link)
/* For devicetree platforms, enable all ASPM states by default */
if (of_have_populated_dt()) {
- link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
+ link->aspm_default = PCIE_LINK_STATE_L0S |
PCIE_LINK_STATE_L1;
override = link->aspm_default & ~link->aspm_enabled;
if (override)
pci_info(pdev, "ASPM: DT platform,
So, is there a proper way to just disable this feature for spec boards
instead of this Soc?
> +
> /*
> * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
> * Link bit cleared after starting the link retrain process to allow this
>
>
> From your previous comment, I believe the Root Port is having the issues with
> ASPM as you seem to have tried connecting different devices to the slot. So I
> disabled ASPM for the Root Port with the above diff.
>
> - Mani
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 9:11 ` Shawn Lin
@ 2025-10-15 9:43 ` Manivannan Sadhasivam
2025-10-15 9:46 ` Niklas Cassel
1 sibling, 0 replies; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-15 9:43 UTC (permalink / raw)
To: Shawn Lin
Cc: Bjorn Helgaas, manivannan.sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Wed, Oct 15, 2025 at 05:11:39PM +0800, Shawn Lin wrote:
> Hi Mani
>
> 在 2025/10/15 星期三 15:50, Manivannan Sadhasivam 写道:
> > On Wed, Oct 15, 2025 at 04:13:41PM +0900, FUKAUMI Naoki wrote:
> > > Hi,
> > >
> > > On 10/15/25 15:26, Manivannan Sadhasivam wrote:
> > > > On Tue, Oct 14, 2025 at 01:49:05PM -0500, Bjorn Helgaas wrote:
> > > > > [+cc regressions]
> > > > >
> > > > > On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> > > > > > Hi Manivannan Sadhasivam,
> > > > > >
> > > > > > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> > > > > > Rockchip RK3588(S) SoC.
> > > > > >
> > > > > > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> > > > > > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> > > > > > different modules I've tested, including the Realtek RTL8852BE, MediaTek
> > > > > > MT7921E, and Intel AX210.
> > > > > >
> > > > > > I've found that reverting the following commit (i.e., the patch I'm replying
> > > > > > to) resolves the problem:
> > > > > > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
> > > > >
> > > > > Thanks for the report, and sorry for the regression.
> > > > >
> > > > > Since this affects several devices from different manufacturers and (I
> > > > > assume) different drivers, it seems likely that there's some issue
> > > > > with the Rockchip end, since ASPM probably works on these devices in
> > > > > other systems. So we should figure out if there's something wrong
> > > > > with the way we configure ASPM, which we could potentially fix, or if
> > > > > there's a hardware issue and we need some king of quirk to prevent
> > > > > usage of ASPM on the affected platforms.
> > > > >
> > > >
> > > > I believe it is the latter. The Root Port is having trouble with ASPM.
> > > >
> > > > FUKAUMI Naoki, could you please share the 'sudo lspci -vv' output so that we
> > > > know what kind of Root Port we are dealing with? You can revert the offending
> > > > patch and share the output.
> > >
> > > Here is dmesg/lspci output on ROCK 5A(RK3588S):
> > > https://gist.github.com/RadxaNaoki/1355a0b4278b6e51a61d89df7a535a5d
> > >
> >
> > Thanks! Could you please try the below diff with f3ac2ff14834 applied?
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 214ed060ca1b..0069d06c282d 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -2525,6 +2525,15 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> > */
> > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
> >
> > +
> > +static void quirk_disable_aspm_all(struct pci_dev *dev)
> > +{
> > + pci_info(dev, "Disabling ASPM\n");
> > + pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
> > +}
> > +
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_all);
>
> That's not true from my POV. Rockchip platform supports all ASPM policy
> after mass production verification. I also verified current upstream
> code this morning with RK3588-EVB and can check L0s/L1/L1ss work fine.
>
> The log and lspci output could be found here:
> https://pastebin.com/qizeYED7
>
> Moreover, I disscussed this issue with FUKAUMI today off-list and his
> board seems to work when only disable L1ss by patching:
>
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -813,7 +813,7 @@ static void pcie_aspm_override_default_link_state(struct
> pcie_link_state *link)
>
> /* For devicetree platforms, enable all ASPM states by default */
> if (of_have_populated_dt()) {
> - link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
> + link->aspm_default = PCIE_LINK_STATE_L0S |
> PCIE_LINK_STATE_L1;
> override = link->aspm_default & ~link->aspm_enabled;
> if (override)
> pci_info(pdev, "ASPM: DT platform,
>
>
Thanks a lot for debugging the issue. Now it is clear that the board routing is
on play and ASPM works fine on Rockchip Root Ports.
> So, is there a proper way to just disable this feature for spec boards
> instead of this Soc?
>
Below should work:
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 214ed060ca1b..9864b2c91399 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -29,6 +29,7 @@
#include <linux/ktime.h>
#include <linux/mm.h>
#include <linux/nvme.h>
+#include <linux/of.h>
#include <linux/platform_data/x86/apple.h>
#include <linux/pm_runtime.h>
#include <linux/sizes.h>
@@ -2525,6 +2526,19 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
*/
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
+
+static void quirk_disable_aspm_radxa(struct pci_dev *dev)
+{
+ if (of_machine_is_compatible("radxa,rock-5a") ||
+ (of_machine_is_compatible("radxa,rock-5b"))) {
+ pci_info(dev, "Disabling ASPM L1ss\n");
+ pci_disable_link_state(dev, PCIE_LINK_STATE_L1_1 |
+ PCIE_LINK_STATE_L1_2);
+ }
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_radxa);
+
/*
* Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
* Link bit cleared after starting the link retrain process to allow this
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 9:11 ` Shawn Lin
2025-10-15 9:43 ` Manivannan Sadhasivam
@ 2025-10-15 9:46 ` Niklas Cassel
2025-10-15 10:33 ` Manivannan Sadhasivam
1 sibling, 1 reply; 42+ messages in thread
From: Niklas Cassel @ 2025-10-15 9:46 UTC (permalink / raw)
To: Shawn Lin
Cc: Manivannan Sadhasivam, Bjorn Helgaas, manivannan.sadhasivam,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
Hello Shawn,
On Wed, Oct 15, 2025 at 05:11:39PM +0800, Shawn Lin wrote:
> >
> > Thanks! Could you please try the below diff with f3ac2ff14834 applied?
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 214ed060ca1b..0069d06c282d 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -2525,6 +2525,15 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> > */
> > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
> >
> > +
> > +static void quirk_disable_aspm_all(struct pci_dev *dev)
> > +{
> > + pci_info(dev, "Disabling ASPM\n");
> > + pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
> > +}
> > +
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_all);
>
> That's not true from my POV. Rockchip platform supports all ASPM policy
> after mass production verification. I also verified current upstream
> code this morning with RK3588-EVB and can check L0s/L1/L1ss work fine.
>
> The log and lspci output could be found here:
> https://pastebin.com/qizeYED7
>
> Moreover, I disscussed this issue with FUKAUMI today off-list and his
> board seems to work when only disable L1ss by patching:
>
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -813,7 +813,7 @@ static void pcie_aspm_override_default_link_state(struct
> pcie_link_state *link)
>
> /* For devicetree platforms, enable all ASPM states by default */
> if (of_have_populated_dt()) {
> - link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
> + link->aspm_default = PCIE_LINK_STATE_L0S |
> PCIE_LINK_STATE_L1;
> override = link->aspm_default & ~link->aspm_enabled;
> if (override)
> pci_info(pdev, "ASPM: DT platform,
>
>
> So, is there a proper way to just disable this feature for spec boards
> instead of this Soc?
This fix seems do the trick, without needing to patch common code (aspm.c):
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
index 3e2752c7dd09..f5e1aaa97719 100644
--- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
+++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
@@ -200,6 +200,19 @@ static bool rockchip_pcie_link_up(struct dw_pcie *pci)
return FIELD_GET(PCIE_LINKUP_MASK, val) == PCIE_LINKUP;
}
+static void rockchip_pcie_disable_l1sub(struct dw_pcie *pci)
+{
+ u32 cap, l1subcap;
+
+ cap = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_L1SS);
+ if (cap) {
+ l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
+ l1subcap &= ~(PCI_L1SS_CAP_ASPM_L1_1 | PCI_L1SS_CAP_ASPM_L1_2 | PCI_L1SS_CAP_L1_PM_SS);
+ dw_pcie_writel_dbi(pci, cap + PCI_L1SS_CAP, l1subcap);
+ l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
+ }
+}
+
static void rockchip_pcie_enable_l0s(struct dw_pcie *pci)
{
u32 cap, lnkcap;
@@ -264,6 +277,7 @@ static int rockchip_pcie_host_init(struct dw_pcie_rp *pp)
irq_set_chained_handler_and_data(irq, rockchip_pcie_intx_handler,
rockchip);
+ rockchip_pcie_disable_l1sub(pci);
rockchip_pcie_enable_l0s(pci);
return 0;
@@ -301,6 +315,7 @@ static void rockchip_pcie_ep_init(struct dw_pcie_ep *ep)
struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
enum pci_barno bar;
+ rockchip_pcie_disable_l1sub(pci);
rockchip_pcie_enable_l0s(pci);
rockchip_pcie_ep_hide_broken_ats_cap_rk3588(ep);
In reality, I think that pcie-dw-rockchip.c should check 'supports-clkreq',
and only if it doesn't support clkreq, it should disable L1 substates, similar
to how the Tegra driver does things:
https://github.com/torvalds/linux/blob/v6.18-rc1/drivers/pci/controller/dwc/pcie-tegra194.c#L934-L938
https://github.com/torvalds/linux/blob/v6.18-rc1/drivers/pci/controller/dwc/pcie-tegra194.c#L1164-L1165
In fact, that is also how the downstream rockchip drives does things:
https://github.com/rockchip-linux/kernel/blob/develop-6.6/drivers/pci/controller/dwc/pcie-dw-rockchip.c#L200-L233
https://github.com/rockchip-linux/kernel/blob/develop-6.6/drivers/pci/controller/dwc/pcie-dw-rockchip.c#L725
So I guess we either:
1) Add code to pcie-dw-rockchip.c to unconditionally disable L1 substates, or
2) We add code to:
- If have 'supports-clkreq' property, set PCIE_CLIENT_POWER_CON.app_clk_req_n=1
- If don't have 'supports-clkreq' property, disable L1 substates.
I think we need to do either 1) or 2), because a user can build the kernel with
CONFIG_PCIEASPM_POWER_SUPERSAVE=y
and that would break things even on older kernels, that don't have Mani's recent
commit.
Mani, perhaps common code (aspm.c) should enable L1 substates only if
'supports-clkreq' DT property exists?
Kind regards,
Niklas
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 9:46 ` Niklas Cassel
@ 2025-10-15 10:33 ` Manivannan Sadhasivam
2025-10-15 12:17 ` Niklas Cassel
0 siblings, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-15 10:33 UTC (permalink / raw)
To: Niklas Cassel
Cc: Shawn Lin, Bjorn Helgaas, manivannan.sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Wed, Oct 15, 2025 at 11:46:02AM +0200, Niklas Cassel wrote:
> Hello Shawn,
>
> On Wed, Oct 15, 2025 at 05:11:39PM +0800, Shawn Lin wrote:
> > >
> > > Thanks! Could you please try the below diff with f3ac2ff14834 applied?
> > >
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index 214ed060ca1b..0069d06c282d 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -2525,6 +2525,15 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> > > */
> > > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
> > >
> > > +
> > > +static void quirk_disable_aspm_all(struct pci_dev *dev)
> > > +{
> > > + pci_info(dev, "Disabling ASPM\n");
> > > + pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
> > > +}
> > > +
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_all);
> >
> > That's not true from my POV. Rockchip platform supports all ASPM policy
> > after mass production verification. I also verified current upstream
> > code this morning with RK3588-EVB and can check L0s/L1/L1ss work fine.
> >
> > The log and lspci output could be found here:
> > https://pastebin.com/qizeYED7
> >
> > Moreover, I disscussed this issue with FUKAUMI today off-list and his
> > board seems to work when only disable L1ss by patching:
> >
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -813,7 +813,7 @@ static void pcie_aspm_override_default_link_state(struct
> > pcie_link_state *link)
> >
> > /* For devicetree platforms, enable all ASPM states by default */
> > if (of_have_populated_dt()) {
> > - link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
> > + link->aspm_default = PCIE_LINK_STATE_L0S |
> > PCIE_LINK_STATE_L1;
> > override = link->aspm_default & ~link->aspm_enabled;
> > if (override)
> > pci_info(pdev, "ASPM: DT platform,
> >
> >
> > So, is there a proper way to just disable this feature for spec boards
> > instead of this Soc?
>
> This fix seems do the trick, without needing to patch common code (aspm.c):
>
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 3e2752c7dd09..f5e1aaa97719 100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -200,6 +200,19 @@ static bool rockchip_pcie_link_up(struct dw_pcie *pci)
> return FIELD_GET(PCIE_LINKUP_MASK, val) == PCIE_LINKUP;
> }
>
> +static void rockchip_pcie_disable_l1sub(struct dw_pcie *pci)
> +{
> + u32 cap, l1subcap;
> +
> + cap = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_L1SS);
> + if (cap) {
> + l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
> + l1subcap &= ~(PCI_L1SS_CAP_ASPM_L1_1 | PCI_L1SS_CAP_ASPM_L1_2 | PCI_L1SS_CAP_L1_PM_SS);
> + dw_pcie_writel_dbi(pci, cap + PCI_L1SS_CAP, l1subcap);
> + l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
> + }
> +}
> +
> static void rockchip_pcie_enable_l0s(struct dw_pcie *pci)
> {
> u32 cap, lnkcap;
> @@ -264,6 +277,7 @@ static int rockchip_pcie_host_init(struct dw_pcie_rp *pp)
> irq_set_chained_handler_and_data(irq, rockchip_pcie_intx_handler,
> rockchip);
>
> + rockchip_pcie_disable_l1sub(pci);
> rockchip_pcie_enable_l0s(pci);
>
> return 0;
> @@ -301,6 +315,7 @@ static void rockchip_pcie_ep_init(struct dw_pcie_ep *ep)
> struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> enum pci_barno bar;
>
> + rockchip_pcie_disable_l1sub(pci);
> rockchip_pcie_enable_l0s(pci);
> rockchip_pcie_ep_hide_broken_ats_cap_rk3588(ep);
>
But this patch removes the L1SS CAP for all boards, isn't it?
>
>
>
> In reality, I think that pcie-dw-rockchip.c should check 'supports-clkreq',
> and only if it doesn't support clkreq, it should disable L1 substates, similar
> to how the Tegra driver does things:
> https://github.com/torvalds/linux/blob/v6.18-rc1/drivers/pci/controller/dwc/pcie-tegra194.c#L934-L938
> https://github.com/torvalds/linux/blob/v6.18-rc1/drivers/pci/controller/dwc/pcie-tegra194.c#L1164-L1165
>
> In fact, that is also how the downstream rockchip drives does things:
> https://github.com/rockchip-linux/kernel/blob/develop-6.6/drivers/pci/controller/dwc/pcie-dw-rockchip.c#L200-L233
> https://github.com/rockchip-linux/kernel/blob/develop-6.6/drivers/pci/controller/dwc/pcie-dw-rockchip.c#L725
>
> So I guess we either:
> 1) Add code to pcie-dw-rockchip.c to unconditionally disable L1 substates, or
> 2) We add code to:
> - If have 'supports-clkreq' property, set PCIE_CLIENT_POWER_CON.app_clk_req_n=1
> - If don't have 'supports-clkreq' property, disable L1 substates.
>
> I think we need to do either 1) or 2), because a user can build the kernel with
> CONFIG_PCIEASPM_POWER_SUPERSAVE=y
> and that would break things even on older kernels, that don't have Mani's recent
> commit.
>
>
>
> Mani, perhaps common code (aspm.c) should enable L1 substates only if
> 'supports-clkreq' DT property exists?
>
Unfortunately, not all DTs define this property even though the platforms
support CLKREQ#. Right now, only Nvidia defines this property in the binding,
but not in upstream DTS. But I would expect the platforms to support CLKREQ# if
the Root Port supports L1SS CAP.
So removing the L1SS CAP for Root Port in the controller driver makes sense to
me.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 6:22 ` Manivannan Sadhasivam
@ 2025-10-15 11:23 ` Diederik de Haas
2025-10-23 18:57 ` Dragan Simic
1 sibling, 0 replies; 42+ messages in thread
From: Diederik de Haas @ 2025-10-15 11:23 UTC (permalink / raw)
To: Manivannan Sadhasivam, Dragan Simic
Cc: Bjorn Helgaas, FUKAUMI Naoki, manivannan.sadhasivam,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
linux-rockchip, regressions
On Wed Oct 15, 2025 at 8:22 AM CEST, Manivannan Sadhasivam wrote:
> On Wed, Oct 15, 2025 at 01:33:35AM +0200, Dragan Simic wrote:
>> On Tuesday, October 14, 2025 20:49 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
>> > > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
>> > > Rockchip RK3588(S) SoC.
>> > >
>> > > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
>> > > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
>> > > different modules I've tested, including the Realtek RTL8852BE, MediaTek
>> > > MT7921E, and Intel AX210.
>> > >
>> > > I've found that reverting the following commit (i.e., the patch I'm replying
>> > > to) resolves the problem:
>> > > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>> >
>> > <snip>
>> >
>> > Do you know if any platforms other than Radxa ROCK 5A/5B have this
>> > problem?
>> >
>> After thinking quite a bit about it, I think we should revert this
>> patch and replace it with another patch that allows per-SoC, or
>> maybe even per-board, opting into the forced enablement of PCIe
>> ASPM. Let me explain, please.
>
> ASPM is a PCIe device specific feature, nothing related to SoC/board. Even if
> you limit it to certain platforms, there is no guarantee that it will be safe as
> the users can connect a buggy device to the slot and it could lead to the same
> issue.
>
>> When a new feature is introduced, it's expected that it may fail
>> on some hardware or with some specific setups, so quirking off such
>> instances, as time passes, is perfectly fine. Such a new feature
>> didn't work before it was implemented, so it's acceptable that it
>> fails in some instances after the introduction, and that it gets
>> quirked off as time passes and more testing is performed.
>
> ASPM is not a new feature. It was introduced more than a decade before. But we
> somehow procastinated the enablement for so long until we realized that if we
> don't do it now, we wouldn't be able to do it anytime in the future.
Do you mean literally *now* or more like "we need to do it sometime"?
>> However, when some widespread feature, such as PCIe, has already
>> been in production for quite a while, introducing high-risk changes
>> to it in a blanket fashion, while intending to have the incompatible
>> or not-yet-ready platforms quirked off over time, simply isn't the
>> way to go. Breaking stuff intentionally to find out what actually
>> doesn't work is rarely a good option.
>
> The issue is due to devices exposing ASPM capability, but behaving erratically
> when enabled. Until, we enable ASPM on these devices, we cannot know whether
> they are working or not. To avoid mass chaos, we decided to enable it only for
> devicetree platforms as a start.
>
>> Thus, I'd suggest that this patch is replaced with nother patches,
>> which would introduce an additional ASPM opt-in switch to the PCI
>> binding, allowing SoCs or boards to have it enabled _after_ proper
>> testing is performed. The PCIe driver may emit a warning that ASPM
>> is to be enabled at some point in the future, to "bug" people about
>> the need to perform the testing, etc.
>
> Even if we emit a "YOUR DEVICE MAY BREAK" warning, nobody would care as long as
> the device works for them. We didn't decide to enable this feature overnight to
> trouble users. The fact that ASPM saves runtime power, which will benefit users
> and ofc the environment as a whole, should not be kept disabled.
>
> But does that mean, we wanted to have breakages, NO. We expected breakages as
> not all devices will play nicely with ASPM, but there is only one way to find
> out. And we do want to disable ASPM only for those devices.
I understand this logic. And I'm very much in favor of changes that
reduce power usage.
I suspect that 6.18 will become a LTS kernel, so introducing a change
which may break many devices, sounds less then ideal for such a kernel.
Kernel 6.19 OTOH sounds perfect for that. Then there's plenty of time to
encounter and fix issues which may/will come up before there is another
LTS kernel, namely ~ a year.
My 0.02.
Cheers,
Diederik
PS: will send my bug/debug report separately
>> With all that in place, we could expect that in a year or two PCIe ASPM
>> could eventually be enabled everywhere. Getting everything tested is a
>> massive endeavor, but that's the only way not to break stuff.
>>
>> Biting the bullet and hoping that it all goes well, I'd say, isn't
>> the right approach here.
>
> Your two year phased approach would never work as that's what we have hoped for
> more than a decade.
>
> - Mani
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 10:33 ` Manivannan Sadhasivam
@ 2025-10-15 12:17 ` Niklas Cassel
2025-10-15 13:00 ` Shawn Lin
0 siblings, 1 reply; 42+ messages in thread
From: Niklas Cassel @ 2025-10-15 12:17 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Shawn Lin, Bjorn Helgaas, manivannan.sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Wed, Oct 15, 2025 at 04:03:53PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Oct 15, 2025 at 11:46:02AM +0200, Niklas Cassel wrote:
> > Hello Shawn,
> >
> > On Wed, Oct 15, 2025 at 05:11:39PM +0800, Shawn Lin wrote:
> > > >
> > > > Thanks! Could you please try the below diff with f3ac2ff14834 applied?
> > > >
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index 214ed060ca1b..0069d06c282d 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -2525,6 +2525,15 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> > > > */
> > > > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
> > > >
> > > > +
> > > > +static void quirk_disable_aspm_all(struct pci_dev *dev)
> > > > +{
> > > > + pci_info(dev, "Disabling ASPM\n");
> > > > + pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
> > > > +}
> > > > +
> > > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_all);
> > >
> > > That's not true from my POV. Rockchip platform supports all ASPM policy
> > > after mass production verification. I also verified current upstream
> > > code this morning with RK3588-EVB and can check L0s/L1/L1ss work fine.
> > >
> > > The log and lspci output could be found here:
> > > https://pastebin.com/qizeYED7
> > >
> > > Moreover, I disscussed this issue with FUKAUMI today off-list and his
> > > board seems to work when only disable L1ss by patching:
> > >
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -813,7 +813,7 @@ static void pcie_aspm_override_default_link_state(struct
> > > pcie_link_state *link)
> > >
> > > /* For devicetree platforms, enable all ASPM states by default */
> > > if (of_have_populated_dt()) {
> > > - link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
> > > + link->aspm_default = PCIE_LINK_STATE_L0S |
> > > PCIE_LINK_STATE_L1;
> > > override = link->aspm_default & ~link->aspm_enabled;
> > > if (override)
> > > pci_info(pdev, "ASPM: DT platform,
> > >
> > >
> > > So, is there a proper way to just disable this feature for spec boards
> > > instead of this Soc?
> >
> > This fix seems do the trick, without needing to patch common code (aspm.c):
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > index 3e2752c7dd09..f5e1aaa97719 100644
> > --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > @@ -200,6 +200,19 @@ static bool rockchip_pcie_link_up(struct dw_pcie *pci)
> > return FIELD_GET(PCIE_LINKUP_MASK, val) == PCIE_LINKUP;
> > }
> >
> > +static void rockchip_pcie_disable_l1sub(struct dw_pcie *pci)
> > +{
> > + u32 cap, l1subcap;
> > +
> > + cap = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_L1SS);
> > + if (cap) {
> > + l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
> > + l1subcap &= ~(PCI_L1SS_CAP_ASPM_L1_1 | PCI_L1SS_CAP_ASPM_L1_2 | PCI_L1SS_CAP_L1_PM_SS);
> > + dw_pcie_writel_dbi(pci, cap + PCI_L1SS_CAP, l1subcap);
> > + l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
> > + }
> > +}
> > +
> > static void rockchip_pcie_enable_l0s(struct dw_pcie *pci)
> > {
> > u32 cap, lnkcap;
> > @@ -264,6 +277,7 @@ static int rockchip_pcie_host_init(struct dw_pcie_rp *pp)
> > irq_set_chained_handler_and_data(irq, rockchip_pcie_intx_handler,
> > rockchip);
> >
> > + rockchip_pcie_disable_l1sub(pci);
> > rockchip_pcie_enable_l0s(pci);
> >
> > return 0;
> > @@ -301,6 +315,7 @@ static void rockchip_pcie_ep_init(struct dw_pcie_ep *ep)
> > struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > enum pci_barno bar;
> >
> > + rockchip_pcie_disable_l1sub(pci);
> > rockchip_pcie_enable_l0s(pci);
> > rockchip_pcie_ep_hide_broken_ats_cap_rk3588(ep);
> >
>
> But this patch removes the L1SS CAP for all boards, isn't it?
Yes, all boards supported by pcie-dw-rockchip.c, which matches what their
downstream driver does.
(Their downstream driver disables L1 substates for all boards that have
not defined 'supports-clkreq', and a grep through their downstream tree,
for all their all their different branches, shows that not a since rockchip
DTS has this property specified.)
So, let me submit a real patch with the above.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-14 18:49 ` Bjorn Helgaas
2025-10-14 23:33 ` Dragan Simic
2025-10-15 6:26 ` Manivannan Sadhasivam
@ 2025-10-15 12:26 ` Diederik de Haas
2025-10-15 22:50 ` Bjorn Helgaas
2025-10-30 22:14 ` Bjorn Helgaas
3 siblings, 1 reply; 42+ messages in thread
From: Diederik de Haas @ 2025-10-15 12:26 UTC (permalink / raw)
To: Bjorn Helgaas, FUKAUMI Naoki
Cc: manivannan.sadhasivam, Bjorn Helgaas, Manivannan Sadhasivam,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions
On Tue Oct 14, 2025 at 8:49 PM CEST, Bjorn Helgaas wrote:
> On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
>> I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
>> Rockchip RK3588(S) SoC.
>>
>> When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
>> freezes or fails to probe M.2 Wi-Fi modules. This happens with several
>> different modules I've tested, including the Realtek RTL8852BE, MediaTek
>> MT7921E, and Intel AX210.
>>
>> I've found that reverting the following commit (i.e., the patch I'm replying
>> to) resolves the problem:
>> commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>
> Thanks for the report, and sorry for the regression.
>
> Since this affects several devices from different manufacturers and (I
> assume) different drivers, it seems likely that there's some issue
> with the Rockchip end, since ASPM probably works on these devices in
> other systems. So we should figure out if there's something wrong
> with the way we configure ASPM, which we could potentially fix, or if
> there's a hardware issue and we need some king of quirk to prevent
> usage of ASPM on the affected platforms.
>
> Can you collect a complete dmesg log when booting with
>
> ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
>
> and the output of "sudo lspci -vv"?
I have a Rock 5B as well, but I don't have a Wi-Fi module, but I do have
a NVMe drive connected. That boots fine with 6.17, but I end up in a
rescue shell with 6.18-rc1. I haven't verified that it's caused by the
same commit, but it does sound plausible.
On this device, the NVMe isn't strictly needed (I used it to compile my
kernels on), so I added 'noauto' to the NVMe line in /etc/fstab ... and
that made it boot successfully into 6.18-rc1. Then running the 'mount'
command wrt that NVMe drive failed with this message:
EXT4-fs (nvme0n1p1): unable to read superblock
The log of my attempts can be found here:
https://paste.sr.ht/~diederik/f435eb258dca60676f7ac5154c00ddfdc24ac0b7
> When the kernel freezes, can you give us any information about where,
> e.g., a log or screenshot?
For me, there is no kernel freeze. I ended up in a rescue shell as it
couldn't mount the NVMe drive. As described above, when not letting it
auto-mount that drive, the boot completed normally.
> Do you know if any platforms other than Radxa ROCK 5A/5B have this
> problem?
I do have a NanoPi R5S (rk3568) which also has a NVMe drive attached,
but that one is mounted on /var ...
> #regzbot introduced: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
> #regzbot dup-of: https://github.com/chzigotzky/kernels/issues/17
>
>> On 9/23/25 01:16, Manivannan Sadhasivam via B4 Relay wrote:
>> > From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>> >
>> > So far, the PCI subsystem has honored the ASPM and Clock PM states set by
>> > the BIOS (through LNKCTL) during device initialization, if it relies on the
>> > default state selected using:
>> >
>> > * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
>> > * cmdline: "pcie_aspm=off", or
>> > * FADT: ACPI_FADT_NO_ASPM
>> >
>> > This was done conservatively to avoid issues with the buggy devices that
>> > advertise ASPM capabilities, but behave erratically if the ASPM states are
>> > enabled. So the PCI subsystem ended up trusting the BIOS to enable only the
>> > ASPM states that were known to work for the devices.
>> >
>> > But this turned out to be a problem for devicetree platforms, especially
>> > the ARM based devicetree platforms powering Embedded and *some* Compute
>> > devices as they tend to run without any standard BIOS. So the ASPM states
I noticed in Naoki's log that it had lines mentioning 'BIOS', but I do
not. Naoki's log is wrt Rock 5A, while mine is wrt Rock 5B. I don't
suspect that's very relevant though.
Cheers,
Diederik
>> > on these platforms were left disabled during boot and the PCI subsystem
>> > never bothered to enable them, unless the user has forcefully enabled the
>> > ASPM states through Kconfig, cmdline, and sysfs or the device drivers
>> > themselves, enabling the ASPM states through pci_enable_link_state() APIs.
>> >
>> > This caused runtime power issues on those platforms. So a couple of
>> > approaches were tried to mitigate this BIOS dependency without user
>> > intervention by enabling the ASPM states in the PCI controller drivers
>> > after device enumeration, and overriding the ASPM/Clock PM states
>> > by the PCI controller drivers through an API before enumeration.
>> >
>> > But it has been concluded that none of these mitigations should really be
>> > required and the PCI subsystem should enable the ASPM states advertised by
>> > the devices without relying on BIOS or the PCI controller drivers. If any
>> > device is found to be misbehaving after enabling ASPM states that they
>> > advertised, then those devices should be quirked to disable the problematic
>> > ASPM/Clock PM states.
>> >
>> > In an effort to do so, start by overriding the ASPM and Clock PM states set
>> > by the BIOS for devicetree platforms first. Separate helper functions are
>> > introduced to override the BIOS set states by enabling all of them if
>> > of_have_populated_dt() returns true. To aid debugging, print the overridden
>> > ASPM and Clock PM states as well.
>> >
>> > In the future, these helpers could be extended to allow other platforms
>> > like VMD, newer ACPI systems with a cutoff year etc... to follow the path.
>> >
>> > Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
>> > Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
>> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>> > Link: https://patch.msgid.link/20250916-pci-dt-aspm-v1-1-778fe907c9ad@oss.qualcomm.com
>> > ---
>> > drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>> > 1 file changed, 40 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> > index 919a05b9764791c3cc469c9ada62ba5b2c405118..cda31150aec1b67b6a48b60569222ea3d1c3d41f 100644
>> > --- a/drivers/pci/pcie/aspm.c
>> > +++ b/drivers/pci/pcie/aspm.c
>> > @@ -15,6 +15,7 @@
>> > #include <linux/math.h>
>> > #include <linux/module.h>
>> > #include <linux/moduleparam.h>
>> > +#include <linux/of.h>
>> > #include <linux/pci.h>
>> > #include <linux/pci_regs.h>
>> > #include <linux/errno.h>
>> > @@ -235,13 +236,15 @@ struct pcie_link_state {
>> > u32 aspm_support:7; /* Supported ASPM state */
>> > u32 aspm_enabled:7; /* Enabled ASPM state */
>> > u32 aspm_capable:7; /* Capable ASPM state with latency */
>> > - u32 aspm_default:7; /* Default ASPM state by BIOS */
>> > + u32 aspm_default:7; /* Default ASPM state by BIOS or
>> > + override */
>> > u32 aspm_disable:7; /* Disabled ASPM state */
>> > /* Clock PM state */
>> > u32 clkpm_capable:1; /* Clock PM capable? */
>> > u32 clkpm_enabled:1; /* Current Clock PM state */
>> > - u32 clkpm_default:1; /* Default Clock PM state by BIOS */
>> > + u32 clkpm_default:1; /* Default Clock PM state by BIOS or
>> > + override */
>> > u32 clkpm_disable:1; /* Clock PM disabled */
>> > };
>> > @@ -373,6 +376,18 @@ static void pcie_set_clkpm(struct pcie_link_state *link, int enable)
>> > pcie_set_clkpm_nocheck(link, enable);
>> > }
>> > +static void pcie_clkpm_override_default_link_state(struct pcie_link_state *link,
>> > + int enabled)
>> > +{
>> > + struct pci_dev *pdev = link->downstream;
>> > +
>> > + /* Override the BIOS disabled Clock PM state for devicetree platforms */
>> > + if (of_have_populated_dt() && !enabled) {
>> > + link->clkpm_default = 1;
>> > + pci_info(pdev, "Clock PM state overridden: ClockPM+\n");
>> > + }
>> > +}
>> > +
>> > static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
>> > {
>> > int capable = 1, enabled = 1;
>> > @@ -395,6 +410,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
>> > }
>> > link->clkpm_enabled = enabled;
>> > link->clkpm_default = enabled;
>> > + pcie_clkpm_override_default_link_state(link, enabled);
>> > link->clkpm_capable = capable;
>> > link->clkpm_disable = blacklist ? 1 : 0;
>> > }
>> > @@ -788,6 +804,26 @@ static void aspm_l1ss_init(struct pcie_link_state *link)
>> > aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
>> > }
>> > +static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
>> > +{
>> > + struct pci_dev *pdev = link->downstream;
>> > + u32 override;
>> > +
>> > + /* Override the BIOS disabled ASPM states for devicetree platforms */
>> > + if (of_have_populated_dt()) {
>> > + link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
>> > + override = link->aspm_default & ~link->aspm_enabled;
>> > + if (override)
>> > + pci_info(pdev, "ASPM states overridden: %s%s%s%s%s%s\n",
>> > + (override & PCIE_LINK_STATE_L0S) ? "L0s+, " : "",
>> > + (override & PCIE_LINK_STATE_L1) ? "L1+, " : "",
>> > + (override & PCIE_LINK_STATE_L1_1) ? "L1.1+, " : "",
>> > + (override & PCIE_LINK_STATE_L1_2) ? "L1.2+, " : "",
>> > + (override & PCIE_LINK_STATE_L1_1_PCIPM) ? "L1.1 PCI-PM+, " : "",
>> > + (override & PCIE_LINK_STATE_L1_2_PCIPM) ? "L1.2 PCI-PM+" : "");
>> > + }
>> > +}
>> > +
>> > static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>> > {
>> > struct pci_dev *child = link->downstream, *parent = link->pdev;
>> > @@ -868,6 +904,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>> > /* Save default state */
>> > link->aspm_default = link->aspm_enabled;
>> > + pcie_aspm_override_default_link_state(link);
>> > +
>> > /* Setup initial capable state. Will be updated later */
>> > link->aspm_capable = link->aspm_support;
>> >
>>
>>
>> _______________________________________________
>> Linux-rockchip mailing list
>> Linux-rockchip@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-rockchip
>
> _______________________________________________
> Linux-rockchip mailing list
> Linux-rockchip@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 12:17 ` Niklas Cassel
@ 2025-10-15 13:00 ` Shawn Lin
2025-10-15 15:23 ` Niklas Cassel
2025-10-15 23:30 ` Bjorn Helgaas
0 siblings, 2 replies; 42+ messages in thread
From: Shawn Lin @ 2025-10-15 13:00 UTC (permalink / raw)
To: Niklas Cassel, Manivannan Sadhasivam
Cc: shawn.lin, Bjorn Helgaas, manivannan.sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
在 2025/10/15 星期三 20:17, Niklas Cassel 写道:
> On Wed, Oct 15, 2025 at 04:03:53PM +0530, Manivannan Sadhasivam wrote:
>> On Wed, Oct 15, 2025 at 11:46:02AM +0200, Niklas Cassel wrote:
>>> Hello Shawn,
>>>
>>> On Wed, Oct 15, 2025 at 05:11:39PM +0800, Shawn Lin wrote:
>>>>>
>>>>> Thanks! Could you please try the below diff with f3ac2ff14834 applied?
>>>>>
>>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>>> index 214ed060ca1b..0069d06c282d 100644
>>>>> --- a/drivers/pci/quirks.c
>>>>> +++ b/drivers/pci/quirks.c
>>>>> @@ -2525,6 +2525,15 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
>>>>> */
>>>>> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
>>>>>
>>>>> +
>>>>> +static void quirk_disable_aspm_all(struct pci_dev *dev)
>>>>> +{
>>>>> + pci_info(dev, "Disabling ASPM\n");
>>>>> + pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
>>>>> +}
>>>>> +
>>>>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ROCKCHIP, 0x3588, quirk_disable_aspm_all);
>>>>
>>>> That's not true from my POV. Rockchip platform supports all ASPM policy
>>>> after mass production verification. I also verified current upstream
>>>> code this morning with RK3588-EVB and can check L0s/L1/L1ss work fine.
>>>>
>>>> The log and lspci output could be found here:
>>>> https://pastebin.com/qizeYED7
>>>>
>>>> Moreover, I disscussed this issue with FUKAUMI today off-list and his
>>>> board seems to work when only disable L1ss by patching:
>>>>
>>>> --- a/drivers/pci/pcie/aspm.c
>>>> +++ b/drivers/pci/pcie/aspm.c
>>>> @@ -813,7 +813,7 @@ static void pcie_aspm_override_default_link_state(struct
>>>> pcie_link_state *link)
>>>>
>>>> /* For devicetree platforms, enable all ASPM states by default */
>>>> if (of_have_populated_dt()) {
>>>> - link->aspm_default = PCIE_LINK_STATE_ASPM_ALL;
>>>> + link->aspm_default = PCIE_LINK_STATE_L0S |
>>>> PCIE_LINK_STATE_L1;
>>>> override = link->aspm_default & ~link->aspm_enabled;
>>>> if (override)
>>>> pci_info(pdev, "ASPM: DT platform,
>>>>
>>>>
>>>> So, is there a proper way to just disable this feature for spec boards
>>>> instead of this Soc?
>>>
>>> This fix seems do the trick, without needing to patch common code (aspm.c):
>>>
>>> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
>>> index 3e2752c7dd09..f5e1aaa97719 100644
>>> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
>>> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
>>> @@ -200,6 +200,19 @@ static bool rockchip_pcie_link_up(struct dw_pcie *pci)
>>> return FIELD_GET(PCIE_LINKUP_MASK, val) == PCIE_LINKUP;
>>> }
>>>
>>> +static void rockchip_pcie_disable_l1sub(struct dw_pcie *pci)
>>> +{
>>> + u32 cap, l1subcap;
>>> +
>>> + cap = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_L1SS);
>>> + if (cap) {
>>> + l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
>>> + l1subcap &= ~(PCI_L1SS_CAP_ASPM_L1_1 | PCI_L1SS_CAP_ASPM_L1_2 | PCI_L1SS_CAP_L1_PM_SS);
>>> + dw_pcie_writel_dbi(pci, cap + PCI_L1SS_CAP, l1subcap);
>>> + l1subcap = dw_pcie_readl_dbi(pci, cap + PCI_L1SS_CAP);
>>> + }
>>> +}
>>> +
>>> static void rockchip_pcie_enable_l0s(struct dw_pcie *pci)
>>> {
>>> u32 cap, lnkcap;
>>> @@ -264,6 +277,7 @@ static int rockchip_pcie_host_init(struct dw_pcie_rp *pp)
>>> irq_set_chained_handler_and_data(irq, rockchip_pcie_intx_handler,
>>> rockchip);
>>>
>>> + rockchip_pcie_disable_l1sub(pci);
For now, this is a acceptable option if default ASPM policy enable L1ss
w/o checking if the HW could supports it... But how about adding
supports-clkreq stuff to upstream host driver directly? That would help
folks enable L1ss if the HW is ready and they just need adding property
to the DT.
>>> rockchip_pcie_enable_l0s(pci);
>>>
>>> return 0;
>>> @@ -301,6 +315,7 @@ static void rockchip_pcie_ep_init(struct dw_pcie_ep *ep)
>>> struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>>> enum pci_barno bar;
>>>
>>> + rockchip_pcie_disable_l1sub(pci);
>>> rockchip_pcie_enable_l0s(pci);
>>> rockchip_pcie_ep_hide_broken_ats_cap_rk3588(ep);
>>>
>>
>> But this patch removes the L1SS CAP for all boards, isn't it?
>
> Yes, all boards supported by pcie-dw-rockchip.c, which matches what their
> downstream driver does.
>
> (Their downstream driver disables L1 substates for all boards that have
> not defined 'supports-clkreq', and a grep through their downstream tree,
> for all their all their different branches, shows that not a since rockchip
> DTS has this property specified.)
The L1ss support is quite strict and need several steps to check, so we
didn't add supports-clkreq for them unless the HW is ready to go...
For adding supports of L1ss,
[1] the HW should support CLKREQ#, expecially for PCIe3.0 case on
Rockchip SoCs , since both CLKREQ# of RC and EP should connect to the
100MHz crystal generator's enable pin, as L1.2 need to disable refclk as
well. If the enable pin is high active, the HW even need a invertor....
[2] define proper clkreq iomux to pinctrl of pcie node
[3] make sure the devices work fine with L1ss.(It's hard to check the
slot case with random devices in the wild )
[4] add supports-clkreq to the DT and enable
CONFIG_PCIEASPM_POWER_SUPERSAVE
>
> So, let me submit a real patch with the above.
>
>
> Kind regards,
> Niklas
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 13:00 ` Shawn Lin
@ 2025-10-15 15:23 ` Niklas Cassel
2025-10-15 23:30 ` Bjorn Helgaas
1 sibling, 0 replies; 42+ messages in thread
From: Niklas Cassel @ 2025-10-15 15:23 UTC (permalink / raw)
To: Shawn Lin
Cc: Manivannan Sadhasivam, Bjorn Helgaas, manivannan.sadhasivam,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
> For now, this is a acceptable option if default ASPM policy enable L1ss
> w/o checking if the HW could supports it... But how about adding
> supports-clkreq stuff to upstream host driver directly? That would help
> folks enable L1ss if the HW is ready and they just need adding property
> to the DT.
I like your idea, if you have time, please send a patch.
However, adding (working) support for L1 substates (via 'supports-clkreq')
is new code, and should thus be queued for next release instead of v6.18.
For now, pcie-dw-rockchip.c is broken for a lot of PCIe devices, so the
fix should be minimal and target v6.18, i.e. something like:
https://lore.kernel.org/linux-pci/20251015123142.392274-2-cassel@kernel.org/
Support for L1 substates via 'supports-clkreq' can be added on top of that
patch (while targeting v6.19).
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 12:26 ` Diederik de Haas
@ 2025-10-15 22:50 ` Bjorn Helgaas
2025-10-16 17:38 ` Diederik de Haas
0 siblings, 1 reply; 42+ messages in thread
From: Bjorn Helgaas @ 2025-10-15 22:50 UTC (permalink / raw)
To: Diederik de Haas
Cc: FUKAUMI Naoki, manivannan.sadhasivam, Bjorn Helgaas,
Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Dragan Simic, linux-rockchip,
regressions
On Wed, Oct 15, 2025 at 02:26:30PM +0200, Diederik de Haas wrote:
> On Tue Oct 14, 2025 at 8:49 PM CEST, Bjorn Helgaas wrote:
> > On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> >> I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> >> Rockchip RK3588(S) SoC.
> >>
> >> When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> >> freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> >> different modules I've tested, including the Realtek RTL8852BE, MediaTek
> >> MT7921E, and Intel AX210.
> >>
> >> I've found that reverting the following commit (i.e., the patch I'm replying
> >> to) resolves the problem:
> >> commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
> >
> > Thanks for the report, and sorry for the regression.
> >
> > Since this affects several devices from different manufacturers and (I
> > assume) different drivers, it seems likely that there's some issue
> > with the Rockchip end, since ASPM probably works on these devices in
> > other systems. So we should figure out if there's something wrong
> > with the way we configure ASPM, which we could potentially fix, or if
> > there's a hardware issue and we need some king of quirk to prevent
> > usage of ASPM on the affected platforms.
> >
> > Can you collect a complete dmesg log when booting with
> >
> > ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
> >
> > and the output of "sudo lspci -vv"?
>
> I have a Rock 5B as well, but I don't have a Wi-Fi module, but I do have
> a NVMe drive connected. That boots fine with 6.17, but I end up in a
> rescue shell with 6.18-rc1. I haven't verified that it's caused by the
> same commit, but it does sound plausible.
FWIW, my expectation is that booting with "pcie_aspm=off" should
effectively avoid the ASPM enabling and behave similarly to reverting
f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for
devicetree platforms"). My hope was that we could boot that way and
incrementally enable ASPM via sysfs a device at a time for testing.
If hardware implements ASPM correctly, enabling it should have no
functional impact at all, so we might be tripping over some kind of
hardware bug or maybe a generic Linux issue (ASPM has to be enabled in
a very specific order, and it's conceivable we messed that up).
> On this device, the NVMe isn't strictly needed (I used it to compile my
> kernels on), so I added 'noauto' to the NVMe line in /etc/fstab ... and
> that made it boot successfully into 6.18-rc1. Then running the 'mount'
> command wrt that NVMe drive failed with this message:
>
> EXT4-fs (nvme0n1p1): unable to read superblock
>
> The log of my attempts can be found here:
> https://paste.sr.ht/~diederik/f435eb258dca60676f7ac5154c00ddfdc24ac0b7
>
> > When the kernel freezes, can you give us any information about where,
> > e.g., a log or screenshot?
>
> For me, there is no kernel freeze. I ended up in a rescue shell as it
> couldn't mount the NVMe drive. As described above, when not letting it
> auto-mount that drive, the boot completed normally.
Thanks for the log, it's very useful. This is pieced together from
the serial console log and the "dmesg --level" output, but I think
it's all the same boot:
[ 2.872094] rockchip-dw-pcie a40000000.pcie: PCI host bridge to bus 0000:00
[ 2.885904] pci 0000:00:00.0: [1d87:3588] type 01 class 0x060400 PCIe Root Port
[ 2.888237] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
[ 3.143823] pci 0000:01:00.0: [144d:a80a] type 00 class 0x010802 PCIe Endpoint
[ 3.144646] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit]
[ 3.162748] pci 0000:01:00.0: BAR 0 [mem 0xf0200000-0xf0203fff 64bit]: assigned
[ 3.298198] nvme nvme0: pci function 0000:01:00.0
[ 3.298901] nvme 0000:01:00.0: enabling device (0000 -> 0002)
[ 3.316695] nvme nvme0: D3 entry latency set to 10 seconds
...
[ 18.921811] rockchip-pm-domain fd8d8000.power-management:power-controller: sync_state() pending due to fdad0000.npu
[ 18.922737] rockchip-pm-domain fd8d8000.power-management:power-controller: sync_state() pending due to fdb50000.video-codec
...
[ 39.971050] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS read failed (134)
[ 39.971945] nvme nvme0: Does your device have a faulty power saving mode enabled?
[ 39.972609] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
[ 42.357637] nvme0n1: I/O Cmd(0x2) @ LBA 0, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 42.358644] I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
[ 42.391612] nvme 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 42.443644] nvme nvme0: Disabling device after reset failure: -19
[ 42.459544] Buffer I/O error on dev nvme0n1, logical block 0, async page read
[ 42.607749] EXT4-fs (nvme0n1p1): unable to read superblock
The earlydump info shows the 00:00.0 Root Port had I/O+ Mem+
BusMaster+ (0x0107) and the 01:00.0 NVMe initially had I/O- Mem-
BusMaster- (0x0000). We were able to enumerate the NVMe device and
assign its BAR, and the nvme driver turned on Mem+ (0x002).
nvme_timeout
csts = readl(dev->bar + NVME_REG_CSTS)
if (nvme_should_reset(csts))
nvme_warn_reset(csts)
result = pci_read_config_word(PCI_STATUS)
"controller is down; will reset: CSTS=0xffffffff, ... failed (134)"
nvme_dev_disable
But I think the NVMe device was powered down to D3cold somewhere
before 39.971050. I don't know if the power-controller messages at
18.921811 have any connection, and I don't know why ASPM would be
related.
In any event, the NVME_REG_CSTS mem read returned ~0, probably because
the device didn't respond and the RC fabricated ~0. The PCI_STATUS
config read failed with 134 (PCIBIOS_DEVICE_NOT_FOUND). The config
read should be this path, which probably failed because the link was
down, which would happen if NVMe is in D3cold:
pci_read_config_word
if (pci_dev_is_disconnected())
return PCIBIOS_DEVICE_NOT_FOUND
pci_bus_read_config_word
ret = bus->ops->read
dw_pcie_rd_other_conf
pci_generic_config_read
addr = bus->ops->map_bus
dw_pcie_other_conf_map_bus
if (!dw_pcie_link_up())
return pci->ops->link_up
rockchip_pcie_link_up # .link_up
return NULL # link was down
if (!addr) # .map_bus() failed b/c link down
return PCIBIOS_DEVICE_NOT_FOUND
Your lspci shows no response, i.e., config reads to the device
returned ~0:
0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd SSD 980 PRO
!!! Unknown header type 7f
Interrupt: pin ? routed to IRQ 94
The Root Port shows a Completion Timeout error, which might be a
consequence of NVMe being powered off:
0000:00:00.0 PCI bridge: Rockchip Electronics Co., Ltd RK3588 (rev 01) (prog-if 00 [Normal decode])
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO+ CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
Bottom line, I don't think I can get any further with this particular
issue until we confirm that f3ac2ff14834 ("PCI/ASPM: Enable all
ClockPM and ASPM states for devicetree platforms") is the cause.
Bjorn
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 13:00 ` Shawn Lin
2025-10-15 15:23 ` Niklas Cassel
@ 2025-10-15 23:30 ` Bjorn Helgaas
2025-10-16 6:46 ` Hongxing Zhu
2025-10-17 3:36 ` Manivannan Sadhasivam
1 sibling, 2 replies; 42+ messages in thread
From: Bjorn Helgaas @ 2025-10-15 23:30 UTC (permalink / raw)
To: Shawn Lin
Cc: Niklas Cassel, Manivannan Sadhasivam, manivannan.sadhasivam,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
> ...
> For now, this is a acceptable option if default ASPM policy enable
> L1ss w/o checking if the HW could supports it... But how about
> adding supports-clkreq stuff to upstream host driver directly? That
> would help folks enable L1ss if the HW is ready and they just need
> adding property to the DT.
> ...
> The L1ss support is quite strict and need several steps to check, so we
> didn't add supports-clkreq for them unless the HW is ready to go...
>
> For adding supports of L1ss,
> [1] the HW should support CLKREQ#, expecially for PCIe3.0 case on Rockchip
> SoCs , since both CLKREQ# of RC and EP should connect to the
> 100MHz crystal generator's enable pin, as L1.2 need to disable refclk as
> well. If the enable pin is high active, the HW even need a invertor....
>
> [2] define proper clkreq iomux to pinctrl of pcie node
> [3] make sure the devices work fine with L1ss.(It's hard to check the slot
> case with random devices in the wild )
> [4] add supports-clkreq to the DT and enable
> CONFIG_PCIEASPM_POWER_SUPERSAVE
I don't understand the details of the supports-clkreq issue.
If we need to add supports-clkreq to devicetree, I want to understand
why we need it there when we don't seem to need it for ACPI systems.
Generally the OS relies on what the hardware advertises, e.g., in Link
Capabilities and the L1 PM Substates Capability, and what is available
from firmware, e.g., the ACPI _DSM for Latency Tolerance Reporting.
On the ACPI side, I don't think we get any specific information about
CLKREQ#. Can somebody explain why we do need it on the devicetree
side?
Bjorn
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 23:30 ` Bjorn Helgaas
@ 2025-10-16 6:46 ` Hongxing Zhu
2025-10-17 3:36 ` Manivannan Sadhasivam
1 sibling, 0 replies; 42+ messages in thread
From: Hongxing Zhu @ 2025-10-16 6:46 UTC (permalink / raw)
To: Bjorn Helgaas, Shawn Lin
Cc: Niklas Cassel, Manivannan Sadhasivam,
manivannan.sadhasivam@oss.qualcomm.com, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczy��ski,
Rob Herring, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
David E. Box, Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit,
Chia-Lin Kao, Dragan Simic, linux-rockchip@lists.infradead.org,
regressions@lists.linux.dev, FUKAUMI Naoki
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb2312", Size: 3374 bytes --]
> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: 2025Äê10ÔÂ16ÈÕ 7:31
> To: Shawn Lin <shawn.lin@rock-chips.com>
> Cc: Niklas Cassel <cassel@kernel.org>; Manivannan Sadhasivam
> <mani@kernel.org>; manivannan.sadhasivam@oss.qualcomm.com; Bjorn
> Helgaas <bhelgaas@google.com>; Lorenzo Pieralisi <lpieralisi@kernel.org>;
> Krzysztof Wilczy¨½ski <kwilczynski@kernel.org>; Rob Herring
> <robh@kernel.org>; linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-arm-msm@vger.kernel.org; David E. Box <david.e.box@linux.intel.com>;
> Kai-Heng Feng <kai.heng.feng@canonical.com>; Rafael J. Wysocki
> <rafael@kernel.org>; Heiner Kallweit <hkallweit1@gmail.com>; Chia-Lin Kao
> <acelan.kao@canonical.com>; Dragan Simic <dsimic@manjaro.org>;
> linux-rockchip@lists.infradead.org; regressions@lists.linux.dev; FUKAUMI Naoki
> <naoki@radxa.com>
> Subject: Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states
> set by BIOS for devicetree platforms
>
> On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
> > ...
>
> > For now, this is a acceptable option if default ASPM policy enable
> > L1ss w/o checking if the HW could supports it... But how about adding
> > supports-clkreq stuff to upstream host driver directly? That would
> > help folks enable L1ss if the HW is ready and they just need adding
> > property to the DT.
> > ...
>
> > The L1ss support is quite strict and need several steps to check, so
> > we didn't add supports-clkreq for them unless the HW is ready to go...
> >
> > For adding supports of L1ss,
> > [1] the HW should support CLKREQ#, expecially for PCIe3.0 case on
> > Rockchip SoCs , since both CLKREQ# of RC and EP should connect to the
> > 100MHz crystal generator's enable pin, as L1.2 need to disable refclk
> > as well. If the enable pin is high active, the HW even need a invertor....
> >
> > [2] define proper clkreq iomux to pinctrl of pcie node [3] make sure
> > the devices work fine with L1ss.(It's hard to check the slot case with
> > random devices in the wild ) [4] add supports-clkreq to the DT and
> > enable CONFIG_PCIEASPM_POWER_SUPERSAVE
>
> I don't understand the details of the supports-clkreq issue.
>
> If we need to add supports-clkreq to devicetree, I want to understand why we
> need it there when we don't seem to need it for ACPI systems.
>
> Generally the OS relies on what the hardware advertises, e.g., in Link
> Capabilities and the L1 PM Substates Capability, and what is available from
> firmware, e.g., the ACPI _DSM for Latency Tolerance Reporting.
Hi Bjorn:
Yes, it is. The L1 PM Substates support can be broadcasted by the according
Capabilities of PCIe controller.
But, this feature is still relied on the CLKREQ# signal connection on the
board/device hardware designs too.
Maybe the "supports-clkreq" property is used to guarantee that the hardware
designs of CLKREQ# on board/device meet the requirements of L1 PM Substates.
https://lore.kernel.org/imx/20251015030428.2980427-1-hongxing.zhu@nxp.com/
This is the jam I encountered, especially on the second slot of i.MX95
19x19 EVK board.
Best Regards
Richard Zhu
>
> On the ACPI side, I don't think we get any specific information about CLKREQ#.
> Can somebody explain why we do need it on the devicetree side?
>
> Bjorn
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 22:50 ` Bjorn Helgaas
@ 2025-10-16 17:38 ` Diederik de Haas
0 siblings, 0 replies; 42+ messages in thread
From: Diederik de Haas @ 2025-10-16 17:38 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: FUKAUMI Naoki, manivannan.sadhasivam, Bjorn Helgaas,
Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Dragan Simic, linux-rockchip,
regressions, Ulf Hansson
On Thu Oct 16, 2025 at 12:50 AM CEST, Bjorn Helgaas wrote:
> On Wed, Oct 15, 2025 at 02:26:30PM +0200, Diederik de Haas wrote:
>> On Tue Oct 14, 2025 at 8:49 PM CEST, Bjorn Helgaas wrote:
>> > On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
>> >> I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
>> >> Rockchip RK3588(S) SoC.
>> >>
>> >> When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
>> >> freezes or fails to probe M.2 Wi-Fi modules. This happens with several
>> >> different modules I've tested, including the Realtek RTL8852BE, MediaTek
>> >> MT7921E, and Intel AX210.
>> >>
>> >> I've found that reverting the following commit (i.e., the patch I'm replying
>> >> to) resolves the problem:
>> >> commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>> >
>> > Can you collect a complete dmesg log when booting with
>> >
>> > ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
>> >
>> > and the output of "sudo lspci -vv"?
>>
>> I have a Rock 5B as well, but I don't have a Wi-Fi module, but I do have
>> a NVMe drive connected. That boots fine with 6.17, but I end up in a
>> rescue shell with 6.18-rc1. I haven't verified that it's caused by the
>> same commit, but it does sound plausible.
>
> FWIW, my expectation is that booting with "pcie_aspm=off" should
> effectively avoid the ASPM enabling and behave similarly to reverting
> f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for
> devicetree platforms"). My hope was that we could boot that way and
> incrementally enable ASPM via sysfs a device at a time for testing.
>
> [Moved last lines up here]
> Bottom line, I don't think I can get any further with this particular
> issue until we confirm that f3ac2ff14834 ("PCI/ASPM: Enable all
> ClockPM and ASPM states for devicetree platforms") is the cause.
I built a 6.18-rc1 kernel with that commit reverted and when booted up,
I could mount my NVMe drive. Next I removed the 'noauto' from /etc/fstab
and rebooted and that succeeded as well.
So I think we can conclude that commit f3ac2ff14834 is the cause.
>> On this device, the NVMe isn't strictly needed (I used it to compile my
>> kernels on), so I added 'noauto' to the NVMe line in /etc/fstab ... and
>> that made it boot successfully into 6.18-rc1. Then running the 'mount'
>> command wrt that NVMe drive failed with this message:
>>
>> EXT4-fs (nvme0n1p1): unable to read superblock
>>
>> The log of my attempts can be found here:
>> https://paste.sr.ht/~diederik/f435eb258dca60676f7ac5154c00ddfdc24ac0b7
>
> Thanks for the log, it's very useful. This is pieced together from
> the serial console log and the "dmesg --level" output, but I think
> it's all the same boot:
Correct.
> ...
> [ 18.921811] rockchip-pm-domain fd8d8000.power-management:power-controller: sync_state() pending due to fdad0000.npu
> [ 18.922737] rockchip-pm-domain fd8d8000.power-management:power-controller: sync_state() pending due to fdb50000.video-codec
> ...
>
> The earlydump info shows the 00:00.0 Root Port had I/O+ Mem+
> BusMaster+ (0x0107) and the 01:00.0 NVMe initially had I/O- Mem-
> BusMaster- (0x0000). We were able to enumerate the NVMe device and
> assign its BAR, and the nvme driver turned on Mem+ (0x002).
>
> nvme_timeout
> csts = readl(dev->bar + NVME_REG_CSTS)
> if (nvme_should_reset(csts))
> nvme_warn_reset(csts)
> result = pci_read_config_word(PCI_STATUS)
> "controller is down; will reset: CSTS=0xffffffff, ... failed (134)"
> nvme_dev_disable
>
> But I think the NVMe device was powered down to D3cold somewhere
> before 39.971050. I don't know if the power-controller messages at
> 18.921811 have any connection, and I don't know why ASPM would be
> related.
I highly doubt they're connected. These threads are relevant:
https://lore.kernel.org/all/20250701114733.636510-1-ulf.hansson@linaro.org/
https://lore.kernel.org/all/20250909111130.132976-1-ulf.hansson@linaro.org/
https://lore.kernel.org/all/20251007094312.590819-1-ulf.hansson@linaro.org/
TL;DR: Those warnings will (likely) be downgraded to 'info'.
Cheers,
Diederik
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 23:30 ` Bjorn Helgaas
2025-10-16 6:46 ` Hongxing Zhu
@ 2025-10-17 3:36 ` Manivannan Sadhasivam
2025-10-17 9:47 ` Shawn Lin
1 sibling, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-17 3:36 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Shawn Lin, Niklas Cassel, Manivannan Sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Wed, Oct 15, 2025 at 06:30:54PM -0500, Bjorn Helgaas wrote:
> On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
> > ...
>
> > For now, this is a acceptable option if default ASPM policy enable
> > L1ss w/o checking if the HW could supports it... But how about
> > adding supports-clkreq stuff to upstream host driver directly? That
> > would help folks enable L1ss if the HW is ready and they just need
> > adding property to the DT.
> > ...
>
> > The L1ss support is quite strict and need several steps to check, so we
> > didn't add supports-clkreq for them unless the HW is ready to go...
> >
> > For adding supports of L1ss,
> > [1] the HW should support CLKREQ#, expecially for PCIe3.0 case on Rockchip
> > SoCs , since both CLKREQ# of RC and EP should connect to the
> > 100MHz crystal generator's enable pin, as L1.2 need to disable refclk as
> > well. If the enable pin is high active, the HW even need a invertor....
> >
> > [2] define proper clkreq iomux to pinctrl of pcie node
> > [3] make sure the devices work fine with L1ss.(It's hard to check the slot
> > case with random devices in the wild )
> > [4] add supports-clkreq to the DT and enable
> > CONFIG_PCIEASPM_POWER_SUPERSAVE
>
> I don't understand the details of the supports-clkreq issue.
>
> If we need to add supports-clkreq to devicetree, I want to understand
> why we need it there when we don't seem to need it for ACPI systems.
>
> Generally the OS relies on what the hardware advertises, e.g., in Link
> Capabilities and the L1 PM Substates Capability, and what is available
> from firmware, e.g., the ACPI _DSM for Latency Tolerance Reporting.
>
> On the ACPI side, I don't think we get any specific information about
> CLKREQ#. Can somebody explain why we do need it on the devicetree
> side?
>
I think there is a disconnect between enabling L1ss CAP and CLKREQ#
availability.. When L1ss CAP is enabled for the Root Port in the hardware, there
is no guarantee that CLKREQ# is also available. If CLKREQ# is not available,
then if L1ss is enabled by the OS, it is not possible to exit the L1ss states
(assuming that L1ss is entered due to CLKREQ# in deassert (default) state).
Yes, there seems to be no standard way to know CLKREQ# presence in ACPI, but
in devicetree, we have this 'supports-clkreq' property to tell the OS that
CLKREQ# is available in the platform. But unfortunately, this property is not
widely used by the devicetrees out there. So we cannot use it in generic
pci/aspm.c driver.
We can certainly rely on the BIOS to enable L1ss as the fw developers would
have the knowledge of the CLKREQ# availability. But BIOS is not a thing on
mobile and embedded platforms where L1ss would come handy.
What I would suggest is, the host controller drivers (mostly for devicetree
platforms) should enable L1ss CAP for the Root Port only if they know that
CLKREQ# is available. They can either rely on the 'supports-clkreq' property or
some platform specific knowledge (for instance, on all Qcom platforms, we
certainly know that CLKREQ# is available, but we don't set the DT property).
Then in the generic pci/aspm.c driver, we can just enable L1ss for all devices
if the CAP is set, which we do currently.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-17 3:36 ` Manivannan Sadhasivam
@ 2025-10-17 9:47 ` Shawn Lin
2025-10-17 10:04 ` Manivannan Sadhasivam
0 siblings, 1 reply; 42+ messages in thread
From: Shawn Lin @ 2025-10-17 9:47 UTC (permalink / raw)
To: Manivannan Sadhasivam, Bjorn Helgaas
Cc: shawn.lin, Niklas Cassel, Manivannan Sadhasivam, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
Hi Mani and Bjorn
在 2025/10/17 星期五 11:36, Manivannan Sadhasivam 写道:
> On Wed, Oct 15, 2025 at 06:30:54PM -0500, Bjorn Helgaas wrote:
>> On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
>>> ...
>>
>>> For now, this is a acceptable option if default ASPM policy enable
>>> L1ss w/o checking if the HW could supports it... But how about
>>> adding supports-clkreq stuff to upstream host driver directly? That
>>> would help folks enable L1ss if the HW is ready and they just need
>>> adding property to the DT.
>>> ...
>>
>>> The L1ss support is quite strict and need several steps to check, so we
>>> didn't add supports-clkreq for them unless the HW is ready to go...
>>>
>>> For adding supports of L1ss,
>>> [1] the HW should support CLKREQ#, expecially for PCIe3.0 case on Rockchip
>>> SoCs , since both CLKREQ# of RC and EP should connect to the
>>> 100MHz crystal generator's enable pin, as L1.2 need to disable refclk as
>>> well. If the enable pin is high active, the HW even need a invertor....
>>>
>>> [2] define proper clkreq iomux to pinctrl of pcie node
>>> [3] make sure the devices work fine with L1ss.(It's hard to check the slot
>>> case with random devices in the wild )
>>> [4] add supports-clkreq to the DT and enable
>>> CONFIG_PCIEASPM_POWER_SUPERSAVE
>>
>> I don't understand the details of the supports-clkreq issue.
>>
>> If we need to add supports-clkreq to devicetree, I want to understand
>> why we need it there when we don't seem to need it for ACPI systems.
>>
>> Generally the OS relies on what the hardware advertises, e.g., in Link
>> Capabilities and the L1 PM Substates Capability, and what is available
>> from firmware, e.g., the ACPI _DSM for Latency Tolerance Reporting.
>>
>> On the ACPI side, I don't think we get any specific information about
>> CLKREQ#. Can somebody explain why we do need it on the devicetree
>> side?
>>
>
> I think there is a disconnect between enabling L1ss CAP and CLKREQ#
> availability.. When L1ss CAP is enabled for the Root Port in the hardware, there
> is no guarantee that CLKREQ# is also available. If CLKREQ# is not available,
> then if L1ss is enabled by the OS, it is not possible to exit the L1ss states
> (assuming that L1ss is entered due to CLKREQ# in deassert (default) state).
>
> Yes, there seems to be no standard way to know CLKREQ# presence in ACPI, but
> in devicetree, we have this 'supports-clkreq' property to tell the OS that
> CLKREQ# is available in the platform. But unfortunately, this property is not
> widely used by the devicetrees out there. So we cannot use it in generic
> pci/aspm.c driver.
>
> We can certainly rely on the BIOS to enable L1ss as the fw developers would
> have the knowledge of the CLKREQ# availability. But BIOS is not a thing on
> mobile and embedded platforms where L1ss would come handy.
>
> What I would suggest is, the host controller drivers (mostly for devicetree
> platforms) should enable L1ss CAP for the Root Port only if they know that
> CLKREQ# is available. They can either rely on the 'supports-clkreq' property or
> some platform specific knowledge (for instance, on all Qcom platforms, we
> certainly know that CLKREQ# is available, but we don't set the DT property).
While we're on the topic of ASPM, may I ask a silly question?
I saw the ASPM would only be configured once the function driver calling
pci_enable_device. So if the modular driver hasn't been insmoded, the
link will be in L0 even though there is no transcation on-going. What is
the intention behind it?
>
> Then in the generic pci/aspm.c driver, we can just enable L1ss for all devices
> if the CAP is set, which we do currently.
>
> - Mani
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-17 9:47 ` Shawn Lin
@ 2025-10-17 10:04 ` Manivannan Sadhasivam
2025-10-17 12:19 ` Shawn Lin
0 siblings, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-17 10:04 UTC (permalink / raw)
To: Shawn Lin
Cc: Manivannan Sadhasivam, Bjorn Helgaas, Niklas Cassel,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Fri, Oct 17, 2025 at 05:47:44PM +0800, Shawn Lin wrote:
> Hi Mani and Bjorn
>
> 在 2025/10/17 星期五 11:36, Manivannan Sadhasivam 写道:
> > On Wed, Oct 15, 2025 at 06:30:54PM -0500, Bjorn Helgaas wrote:
> > > On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
> > > > ...
> > >
> > > > For now, this is a acceptable option if default ASPM policy enable
> > > > L1ss w/o checking if the HW could supports it... But how about
> > > > adding supports-clkreq stuff to upstream host driver directly? That
> > > > would help folks enable L1ss if the HW is ready and they just need
> > > > adding property to the DT.
> > > > ...
> > >
> > > > The L1ss support is quite strict and need several steps to check, so we
> > > > didn't add supports-clkreq for them unless the HW is ready to go...
> > > >
> > > > For adding supports of L1ss,
> > > > [1] the HW should support CLKREQ#, expecially for PCIe3.0 case on Rockchip
> > > > SoCs , since both CLKREQ# of RC and EP should connect to the
> > > > 100MHz crystal generator's enable pin, as L1.2 need to disable refclk as
> > > > well. If the enable pin is high active, the HW even need a invertor....
> > > >
> > > > [2] define proper clkreq iomux to pinctrl of pcie node
> > > > [3] make sure the devices work fine with L1ss.(It's hard to check the slot
> > > > case with random devices in the wild )
> > > > [4] add supports-clkreq to the DT and enable
> > > > CONFIG_PCIEASPM_POWER_SUPERSAVE
> > >
> > > I don't understand the details of the supports-clkreq issue.
> > >
> > > If we need to add supports-clkreq to devicetree, I want to understand
> > > why we need it there when we don't seem to need it for ACPI systems.
> > >
> > > Generally the OS relies on what the hardware advertises, e.g., in Link
> > > Capabilities and the L1 PM Substates Capability, and what is available
> > > from firmware, e.g., the ACPI _DSM for Latency Tolerance Reporting.
> > >
> > > On the ACPI side, I don't think we get any specific information about
> > > CLKREQ#. Can somebody explain why we do need it on the devicetree
> > > side?
> > >
> >
> > I think there is a disconnect between enabling L1ss CAP and CLKREQ#
> > availability.. When L1ss CAP is enabled for the Root Port in the hardware, there
> > is no guarantee that CLKREQ# is also available. If CLKREQ# is not available,
> > then if L1ss is enabled by the OS, it is not possible to exit the L1ss states
> > (assuming that L1ss is entered due to CLKREQ# in deassert (default) state).
> >
> > Yes, there seems to be no standard way to know CLKREQ# presence in ACPI, but
> > in devicetree, we have this 'supports-clkreq' property to tell the OS that
> > CLKREQ# is available in the platform. But unfortunately, this property is not
> > widely used by the devicetrees out there. So we cannot use it in generic
> > pci/aspm.c driver.
> >
> > We can certainly rely on the BIOS to enable L1ss as the fw developers would
> > have the knowledge of the CLKREQ# availability. But BIOS is not a thing on
> > mobile and embedded platforms where L1ss would come handy.
> >
> > What I would suggest is, the host controller drivers (mostly for devicetree
> > platforms) should enable L1ss CAP for the Root Port only if they know that
> > CLKREQ# is available. They can either rely on the 'supports-clkreq' property or
> > some platform specific knowledge (for instance, on all Qcom platforms, we
> > certainly know that CLKREQ# is available, but we don't set the DT property).
>
> While we're on the topic of ASPM, may I ask a silly question?
> I saw the ASPM would only be configured once the function driver calling
> pci_enable_device. So if the modular driver hasn't been insmoded, the
> link will be in L0 even though there is no transcation on-going. What is
> the intention behind it?
>
I don't see where ASPM is configured during pci_enable_device(). It is currently
configured for all devices during pci_scan_slot().
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-17 10:04 ` Manivannan Sadhasivam
@ 2025-10-17 12:19 ` Shawn Lin
2025-10-17 12:54 ` Manivannan Sadhasivam
0 siblings, 1 reply; 42+ messages in thread
From: Shawn Lin @ 2025-10-17 12:19 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: shawn.lin, Manivannan Sadhasivam, Bjorn Helgaas, Niklas Cassel,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
在 2025/10/17 星期五 18:04, Manivannan Sadhasivam 写道:
> On Fri, Oct 17, 2025 at 05:47:44PM +0800, Shawn Lin wrote:
>> Hi Mani and Bjorn
>>
>> 在 2025/10/17 星期五 11:36, Manivannan Sadhasivam 写道:
>>> On Wed, Oct 15, 2025 at 06:30:54PM -0500, Bjorn Helgaas wrote:
>>>> On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
>>>>> ...
>>>>
>>>>> For now, this is a acceptable option if default ASPM policy enable
>>>>> L1ss w/o checking if the HW could supports it... But how about
>>>>> adding supports-clkreq stuff to upstream host driver directly? That
>>>>> would help folks enable L1ss if the HW is ready and they just need
>>>>> adding property to the DT.
>>>>> ...
>>>>
>>>>> The L1ss support is quite strict and need several steps to check, so we
>>>>> didn't add supports-clkreq for them unless the HW is ready to go...
>>>>>
>>>>> For adding supports of L1ss,
>>>>> [1] the HW should support CLKREQ#, expecially for PCIe3.0 case on Rockchip
>>>>> SoCs , since both CLKREQ# of RC and EP should connect to the
>>>>> 100MHz crystal generator's enable pin, as L1.2 need to disable refclk as
>>>>> well. If the enable pin is high active, the HW even need a invertor....
>>>>>
>>>>> [2] define proper clkreq iomux to pinctrl of pcie node
>>>>> [3] make sure the devices work fine with L1ss.(It's hard to check the slot
>>>>> case with random devices in the wild )
>>>>> [4] add supports-clkreq to the DT and enable
>>>>> CONFIG_PCIEASPM_POWER_SUPERSAVE
>>>>
>>>> I don't understand the details of the supports-clkreq issue.
>>>>
>>>> If we need to add supports-clkreq to devicetree, I want to understand
>>>> why we need it there when we don't seem to need it for ACPI systems.
>>>>
>>>> Generally the OS relies on what the hardware advertises, e.g., in Link
>>>> Capabilities and the L1 PM Substates Capability, and what is available
>>>> from firmware, e.g., the ACPI _DSM for Latency Tolerance Reporting.
>>>>
>>>> On the ACPI side, I don't think we get any specific information about
>>>> CLKREQ#. Can somebody explain why we do need it on the devicetree
>>>> side?
>>>>
>>>
>>> I think there is a disconnect between enabling L1ss CAP and CLKREQ#
>>> availability.. When L1ss CAP is enabled for the Root Port in the hardware, there
>>> is no guarantee that CLKREQ# is also available. If CLKREQ# is not available,
>>> then if L1ss is enabled by the OS, it is not possible to exit the L1ss states
>>> (assuming that L1ss is entered due to CLKREQ# in deassert (default) state).
>>>
>>> Yes, there seems to be no standard way to know CLKREQ# presence in ACPI, but
>>> in devicetree, we have this 'supports-clkreq' property to tell the OS that
>>> CLKREQ# is available in the platform. But unfortunately, this property is not
>>> widely used by the devicetrees out there. So we cannot use it in generic
>>> pci/aspm.c driver.
>>>
>>> We can certainly rely on the BIOS to enable L1ss as the fw developers would
>>> have the knowledge of the CLKREQ# availability. But BIOS is not a thing on
>>> mobile and embedded platforms where L1ss would come handy.
>>>
>>> What I would suggest is, the host controller drivers (mostly for devicetree
>>> platforms) should enable L1ss CAP for the Root Port only if they know that
>>> CLKREQ# is available. They can either rely on the 'supports-clkreq' property or
>>> some platform specific knowledge (for instance, on all Qcom platforms, we
>>> certainly know that CLKREQ# is available, but we don't set the DT property).
>>
>> While we're on the topic of ASPM, may I ask a silly question?
>> I saw the ASPM would only be configured once the function driver calling
>> pci_enable_device. So if the modular driver hasn't been insmoded, the
>> link will be in L0 even though there is no transcation on-going. What is
>> the intention behind it?
>>
>
> I don't see where ASPM is configured during pci_enable_device(). It is currently
> configured for all devices during pci_scan_slot().
This is the dump_stack() where I observed. If I compile NVMe driver as a
module and never insmod it, the link is always in L0, namely ASPM
Disabled.
[ 0.747508] pci 0000:01:00.0: ASPM: DT platform, enabling L0s-up
L0s-dw L1 ASPM-L1.1 ASPM-L1.2 PCI-PM-L1.1 PCI-PM-L1.2
[ 0.748509] pci 0000:01:00.0: ASPM: DT platform, enabling ClockPM
[ 0.749145] pci 0000:01:00.0: BAR 0 [mem 0xf0200000-0xf0203fff
64bit]: assigned
[ 0.750430] nvme nvme0: pci function 0000:01:00.0
[ 0.750850] CPU: 7 UID: 0 PID: 120 Comm: kworker/u32:7 Not tainted
6.18.0-rc1-00019-gb00f931c2b43-dirty #19 PREEMPT
[ 0.750854] Hardware name: Rockchip RK3588 EVB1 V10 Board (DT)
[ 0.750856] Workqueue: async async_run_entry_fn
[ 0.750863] Call trace:
[ 0.750864] show_stack+0x18/0x24 (C)
[ 0.750869] dump_stack_lvl+0x40/0x84
[ 0.750873] dump_stack+0x18/0x24
[ 0.750876] pcie_config_aspm_link+0x40/0x2dc
[ 0.750882] pcie_aspm_powersave_config_link+0x68/0x104
[ 0.750885] do_pci_enable_device+0x80/0x170
[ 0.750890] pci_enable_device_flags+0x1b8/0x220
[ 0.750894] pci_enable_device_mem+0x14/0x20
[ 0.750898] nvme_pci_enable+0x3c/0x66c
[ 0.750904] nvme_probe+0x5f8/0x7d4
[ 0.750908] pci_device_probe+0x1dc/0x2ac
[ 0.750912] really_probe+0x138/0x2d8
[ 0.750915] __driver_probe_device+0xa0/0x128
[ 0.750918] driver_probe_device+0x3c/0x1f8
[ 0.750921] __device_attach_driver+0x118/0x140
[ 0.750924] bus_for_each_drv+0xf4/0x14c
[ 0.750927] __device_attach_async_helper+0x78/0xd0
[ 0.750930] async_run_entry_fn+0x24/0xdc
[ 0.750933] process_scheduled_works+0x194/0x2c4
[ 0.750937] worker_thread+0x28c/0x394
[ 0.750939] kthread+0x1bc/0x200
[ 0.750943] ret_from_fork+0x10/0x20
[ 0.750947] pcie_config_aspm_link 989
[ 0.753436] ehci-platform fc800000.usb: USB 2.0 started, EHCI 1.00
[ 0.753620] pcie_config_aspm_link 997 state 0x7c
[ 0.754204] hub 4-0:1.0: USB hub found
[ 0.754279] pcie_config_aspm_link 1004 state 0x7c
[ 0.754281] pcie_config_aspm_link 1008
[ 0.754741] hub 4-0:1.0: 1 port detected
[ 0.755153] nvme 0000:01:00.0: enabling device (0000 -> 0002)
[ 0.769741] hub 1-0:1.0: USB hub found
[ 0.770084] hub 1-0:1.0: 1 port detected
[ 0.801662] hub 3-0:1.0: USB hub found
[ 0.802007] hub 3-0:1.0: 1 port detected
[ 0.806393] nvme nvme0: D3 entry latency set to 10 seconds
[ 0.810188] nvme nvme0: 8/0/0 default/read/poll queues
>
> - Mani
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-17 12:19 ` Shawn Lin
@ 2025-10-17 12:54 ` Manivannan Sadhasivam
2025-10-17 13:45 ` Bjorn Helgaas
0 siblings, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-17 12:54 UTC (permalink / raw)
To: Shawn Lin
Cc: Manivannan Sadhasivam, Bjorn Helgaas, Niklas Cassel,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Fri, Oct 17, 2025 at 08:19:11PM +0800, Shawn Lin wrote:
> 在 2025/10/17 星期五 18:04, Manivannan Sadhasivam 写道:
> > On Fri, Oct 17, 2025 at 05:47:44PM +0800, Shawn Lin wrote:
> > > Hi Mani and Bjorn
> > >
> > > 在 2025/10/17 星期五 11:36, Manivannan Sadhasivam 写道:
> > > > On Wed, Oct 15, 2025 at 06:30:54PM -0500, Bjorn Helgaas wrote:
> > > > > On Wed, Oct 15, 2025 at 09:00:41PM +0800, Shawn Lin wrote:
> > > > > > ...
> > > > >
> > > > > > For now, this is a acceptable option if default ASPM policy enable
> > > > > > L1ss w/o checking if the HW could supports it... But how about
> > > > > > adding supports-clkreq stuff to upstream host driver directly? That
> > > > > > would help folks enable L1ss if the HW is ready and they just need
> > > > > > adding property to the DT.
> > > > > > ...
> > > > >
> > > > > > The L1ss support is quite strict and need several steps to check, so we
> > > > > > didn't add supports-clkreq for them unless the HW is ready to go...
> > > > > >
> > > > > > For adding supports of L1ss,
> > > > > > [1] the HW should support CLKREQ#, expecially for PCIe3.0 case on Rockchip
> > > > > > SoCs , since both CLKREQ# of RC and EP should connect to the
> > > > > > 100MHz crystal generator's enable pin, as L1.2 need to disable refclk as
> > > > > > well. If the enable pin is high active, the HW even need a invertor....
> > > > > >
> > > > > > [2] define proper clkreq iomux to pinctrl of pcie node
> > > > > > [3] make sure the devices work fine with L1ss.(It's hard to check the slot
> > > > > > case with random devices in the wild )
> > > > > > [4] add supports-clkreq to the DT and enable
> > > > > > CONFIG_PCIEASPM_POWER_SUPERSAVE
> > > > >
> > > > > I don't understand the details of the supports-clkreq issue.
> > > > >
> > > > > If we need to add supports-clkreq to devicetree, I want to understand
> > > > > why we need it there when we don't seem to need it for ACPI systems.
> > > > >
> > > > > Generally the OS relies on what the hardware advertises, e.g., in Link
> > > > > Capabilities and the L1 PM Substates Capability, and what is available
> > > > > from firmware, e.g., the ACPI _DSM for Latency Tolerance Reporting.
> > > > >
> > > > > On the ACPI side, I don't think we get any specific information about
> > > > > CLKREQ#. Can somebody explain why we do need it on the devicetree
> > > > > side?
> > > > >
> > > >
> > > > I think there is a disconnect between enabling L1ss CAP and CLKREQ#
> > > > availability.. When L1ss CAP is enabled for the Root Port in the hardware, there
> > > > is no guarantee that CLKREQ# is also available. If CLKREQ# is not available,
> > > > then if L1ss is enabled by the OS, it is not possible to exit the L1ss states
> > > > (assuming that L1ss is entered due to CLKREQ# in deassert (default) state).
> > > >
> > > > Yes, there seems to be no standard way to know CLKREQ# presence in ACPI, but
> > > > in devicetree, we have this 'supports-clkreq' property to tell the OS that
> > > > CLKREQ# is available in the platform. But unfortunately, this property is not
> > > > widely used by the devicetrees out there. So we cannot use it in generic
> > > > pci/aspm.c driver.
> > > >
> > > > We can certainly rely on the BIOS to enable L1ss as the fw developers would
> > > > have the knowledge of the CLKREQ# availability. But BIOS is not a thing on
> > > > mobile and embedded platforms where L1ss would come handy.
> > > >
> > > > What I would suggest is, the host controller drivers (mostly for devicetree
> > > > platforms) should enable L1ss CAP for the Root Port only if they know that
> > > > CLKREQ# is available. They can either rely on the 'supports-clkreq' property or
> > > > some platform specific knowledge (for instance, on all Qcom platforms, we
> > > > certainly know that CLKREQ# is available, but we don't set the DT property).
> > >
> > > While we're on the topic of ASPM, may I ask a silly question?
> > > I saw the ASPM would only be configured once the function driver calling
> > > pci_enable_device. So if the modular driver hasn't been insmoded, the
> > > link will be in L0 even though there is no transcation on-going. What is
> > > the intention behind it?
> > >
> >
> > I don't see where ASPM is configured during pci_enable_device(). It is currently
> > configured for all devices during pci_scan_slot().
>
> This is the dump_stack() where I observed. If I compile NVMe driver as a
> module and never insmod it, the link is always in L0, namely ASPM
> Disabled.
>
I guess this comment answers your question:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pcie/aspm.c?h=v6.18-rc1#n1179
But with the recent ASPM change, the ASPM settings for DT platforms will be
applied before pci_enable_device(). Also, the comment is somewhat outdated as we
generally do not want PCI client drivers to enable/disable ASPM nowadays. They
can however do it under specific circumstances.
- Mani
> [ 0.747508] pci 0000:01:00.0: ASPM: DT platform, enabling L0s-up L0s-dw
> L1 ASPM-L1.1 ASPM-L1.2 PCI-PM-L1.1 PCI-PM-L1.2
> [ 0.748509] pci 0000:01:00.0: ASPM: DT platform, enabling ClockPM
> [ 0.749145] pci 0000:01:00.0: BAR 0 [mem 0xf0200000-0xf0203fff 64bit]:
> assigned
> [ 0.750430] nvme nvme0: pci function 0000:01:00.0
> [ 0.750850] CPU: 7 UID: 0 PID: 120 Comm: kworker/u32:7 Not tainted
> 6.18.0-rc1-00019-gb00f931c2b43-dirty #19 PREEMPT
> [ 0.750854] Hardware name: Rockchip RK3588 EVB1 V10 Board (DT)
> [ 0.750856] Workqueue: async async_run_entry_fn
> [ 0.750863] Call trace:
> [ 0.750864] show_stack+0x18/0x24 (C)
> [ 0.750869] dump_stack_lvl+0x40/0x84
> [ 0.750873] dump_stack+0x18/0x24
> [ 0.750876] pcie_config_aspm_link+0x40/0x2dc
> [ 0.750882] pcie_aspm_powersave_config_link+0x68/0x104
> [ 0.750885] do_pci_enable_device+0x80/0x170
> [ 0.750890] pci_enable_device_flags+0x1b8/0x220
> [ 0.750894] pci_enable_device_mem+0x14/0x20
> [ 0.750898] nvme_pci_enable+0x3c/0x66c
> [ 0.750904] nvme_probe+0x5f8/0x7d4
> [ 0.750908] pci_device_probe+0x1dc/0x2ac
> [ 0.750912] really_probe+0x138/0x2d8
> [ 0.750915] __driver_probe_device+0xa0/0x128
> [ 0.750918] driver_probe_device+0x3c/0x1f8
> [ 0.750921] __device_attach_driver+0x118/0x140
> [ 0.750924] bus_for_each_drv+0xf4/0x14c
> [ 0.750927] __device_attach_async_helper+0x78/0xd0
> [ 0.750930] async_run_entry_fn+0x24/0xdc
> [ 0.750933] process_scheduled_works+0x194/0x2c4
> [ 0.750937] worker_thread+0x28c/0x394
> [ 0.750939] kthread+0x1bc/0x200
> [ 0.750943] ret_from_fork+0x10/0x20
> [ 0.750947] pcie_config_aspm_link 989
> [ 0.753436] ehci-platform fc800000.usb: USB 2.0 started, EHCI 1.00
> [ 0.753620] pcie_config_aspm_link 997 state 0x7c
> [ 0.754204] hub 4-0:1.0: USB hub found
> [ 0.754279] pcie_config_aspm_link 1004 state 0x7c
> [ 0.754281] pcie_config_aspm_link 1008
> [ 0.754741] hub 4-0:1.0: 1 port detected
> [ 0.755153] nvme 0000:01:00.0: enabling device (0000 -> 0002)
> [ 0.769741] hub 1-0:1.0: USB hub found
> [ 0.770084] hub 1-0:1.0: 1 port detected
> [ 0.801662] hub 3-0:1.0: USB hub found
> [ 0.802007] hub 3-0:1.0: 1 port detected
> [ 0.806393] nvme nvme0: D3 entry latency set to 10 seconds
> [ 0.810188] nvme nvme0: 8/0/0 default/read/poll queues
>
>
> >
> > - Mani
> >
>
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-17 12:54 ` Manivannan Sadhasivam
@ 2025-10-17 13:45 ` Bjorn Helgaas
2025-10-31 6:21 ` Manivannan Sadhasivam
0 siblings, 1 reply; 42+ messages in thread
From: Bjorn Helgaas @ 2025-10-17 13:45 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Shawn Lin, Manivannan Sadhasivam, Niklas Cassel, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Fri, Oct 17, 2025 at 06:24:26PM +0530, Manivannan Sadhasivam wrote:
> On Fri, Oct 17, 2025 at 08:19:11PM +0800, Shawn Lin wrote:
> > 在 2025/10/17 星期五 18:04, Manivannan Sadhasivam 写道:
> > > On Fri, Oct 17, 2025 at 05:47:44PM +0800, Shawn Lin wrote:
> ...
> > > > While we're on the topic of ASPM, may I ask a silly question?
> > > > I saw the ASPM would only be configured once the function
> > > > driver calling pci_enable_device. So if the modular driver
> > > > hasn't been insmoded, the link will be in L0 even though there
> > > > is no transcation on-going. What is the intention behind it?
> > >
> > > I don't see where ASPM is configured during pci_enable_device().
> > > It is currently configured for all devices during
> > > pci_scan_slot().
> >
> > This is the dump_stack() where I observed. If I compile NVMe
> > driver as a module and never insmod it, the link is always in L0,
> > namely ASPM Disabled.
>
> I guess this comment answers your question:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pcie/aspm.c?h=v6.18-rc1#n1179
The comment is:
* At this stage drivers haven't had an opportunity to change the
* link policy setting. Enabling ASPM on broken hardware can cripple
* it even before the driver has had a chance to disable ASPM, so
* default to a safe level right now. If we're enabling ASPM beyond
* the BIOS's expectation, we'll do so once pci_enable_device() is
* called.
I don't think relying on a driver to disable ASPM to avoid broken
hardware is the right answer. If the driver is never loaded, we waste
power. And if the user enables ASPM via sysfs, apparently the device
may be crippled.
I think it would be better to have an enumeration-time quirk to keep
us from enabling ASPM. We might trip over some of this broken
hardware, but I don't think there are very many drivers that fiddle
with ASPM, so we should be able to be proactive about it.
> But with the recent ASPM change, the ASPM settings for DT platforms
> will be applied before pci_enable_device(). Also, the comment is
> somewhat outdated as we generally do not want PCI client drivers to
> enable/disable ASPM nowadays. They can however do it under specific
> circumstances.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-15 6:22 ` Manivannan Sadhasivam
2025-10-15 11:23 ` Diederik de Haas
@ 2025-10-23 18:57 ` Dragan Simic
1 sibling, 0 replies; 42+ messages in thread
From: Dragan Simic @ 2025-10-23 18:57 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bjorn Helgaas, FUKAUMI Naoki, manivannan.sadhasivam,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
linux-rockchip, regressions, Shawn Lin, Diederik de Haas,
Heiko Stuebner
Hello Manivannan,
On Wednesday, October 15, 2025 08:22 CEST, Manivannan Sadhasivam <mani@kernel.org> wrote:
> On Wed, Oct 15, 2025 at 01:33:35AM +0200, Dragan Simic wrote:
> > On Tuesday, October 14, 2025 20:49 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> > > > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> > > > Rockchip RK3588(S) SoC.
> > > >
> > > > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> > > > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> > > > different modules I've tested, including the Realtek RTL8852BE, MediaTek
> > > > MT7921E, and Intel AX210.
> > > >
> > > > I've found that reverting the following commit (i.e., the patch I'm replying
> > > > to) resolves the problem:
> > > > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
> > >
> > > Thanks for the report, and sorry for the regression.
> > >
> > > Since this affects several devices from different manufacturers and (I
> > > assume) different drivers, it seems likely that there's some issue
> > > with the Rockchip end, since ASPM probably works on these devices in
> > > other systems. So we should figure out if there's something wrong
> > > with the way we configure ASPM, which we could potentially fix, or if
> > > there's a hardware issue and we need some king of quirk to prevent
> > > usage of ASPM on the affected platforms.
> > >
> > > Can you collect a complete dmesg log when booting with
> > >
> > > ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
> > >
> > > and the output of "sudo lspci -vv"?
> > >
> > > When the kernel freezes, can you give us any information about where,
> > > e.g., a log or screenshot?
> > >
> > > Do you know if any platforms other than Radxa ROCK 5A/5B have this
> > > problem?
> >
> > After thinking quite a bit about it, I think we should revert this
> > patch and replace it with another patch that allows per-SoC, or
> > maybe even per-board, opting into the forced enablement of PCIe
> > ASPM. Let me explain, please.
>
> ASPM is a PCIe device specific feature, nothing related to SoC/board. Even if
> you limit it to certain platforms, there is no guarantee that it will be safe as
> the users can connect a buggy device to the slot and it could lead to the same
> issue.
I'm hoping that it's clear by now that the theory and practice
actually diverge in this case, requiring certain level of support
for different SoCs and boards, which makes ASPM more than just
a device feature that's simply expected to work by default.
Dealing with buggy PCIe devices is, of course, another part of
the entire endeavor.
> > When a new feature is introduced, it's expected that it may fail
> > on some hardware or with some specific setups, so quirking off such
> > instances, as time passes, is perfectly fine. Such a new feature
> > didn't work before it was implemented, so it's acceptable that it
> > fails in some instances after the introduction, and that it gets
> > quirked off as time passes and more testing is performed.
>
> ASPM is not a new feature. It was introduced more than a decade before. But we
> somehow procastinated the enablement for so long until we realized that if we
> don't do it now, we wouldn't be able to do it anytime in the future.
To clarify, I referred to PCIe as the already established feature,
which evidently became broken on some SoCs and boards with the
initial blanket enabling of ASPM.
> > However, when some widespread feature, such as PCIe, has already
> > been in production for quite a while, introducing high-risk changes
> > to it in a blanket fashion, while intending to have the incompatible
> > or not-yet-ready platforms quirked off over time, simply isn't the
> > way to go. Breaking stuff intentionally to find out what actually
> > doesn't work is rarely a good option.
>
> The issue is due to devices exposing ASPM capability, but behaving erratically
> when enabled. Until, we enable ASPM on these devices, we cannot know whether
> they are working or not. To avoid mass chaos, we decided to enable it only for
> devicetree platforms as a start.
As mentioned above, dealing with buggy PCIe devices is something
to be covered separately, and wasn't the subject of my complaint.
I referred to the SoC and board specifics.
> > Thus, I'd suggest that this patch is replaced with nother patches,
> > which would introduce an additional ASPM opt-in switch to the PCI
> > binding, allowing SoCs or boards to have it enabled _after_ proper
> > testing is performed. The PCIe driver may emit a warning that ASPM
> > is to be enabled at some point in the future, to "bug" people about
> > the need to perform the testing, etc.
>
> Even if we emit a "YOUR DEVICE MAY BREAK" warning, nobody would care as long as
> the device works for them. We didn't decide to enable this feature overnight to
> trouble users. The fact that ASPM saves runtime power, which will benefit users
> and ofc the environment as a whole, should not be kept disabled.
>
> But does that mean, we wanted to have breakages, NO. We expected breakages as
> not all devices will play nicely with ASPM, but there is only one way to find
> out. And we do want to disable ASPM only for those devices.
I see, and it has turned out rather well in the end. I totally
understand the frustration that may come out of people not caring
for something until it actually breaks, but as you can see, the
Rockchip community, including people from Rockchip itself, actually
cares quite a lot and is willing to help in different ways.
> > With all that in place, we
> > could expect that in a year or two PCIe ASPM could eventually be
> > enabled everywhere. Getting everything tested is a massive endeavor,
> > but that's the only way not to break stuff.
> >
> > Biting the bullet and hoping that it all goes well, I'd say, isn't
> > the right approach here.
>
> Your two year phased approach would never work as that's what
> we have hoped for more than a decade.
Totally understood, but again, perhaps you haven't knocked at the
right doors. The Rockchip community cares, for example. :)
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-14 18:49 ` Bjorn Helgaas
` (2 preceding siblings ...)
2025-10-15 12:26 ` Diederik de Haas
@ 2025-10-30 22:14 ` Bjorn Helgaas
2025-10-30 22:16 ` Bjorn Helgaas
3 siblings, 1 reply; 42+ messages in thread
From: Bjorn Helgaas @ 2025-10-30 22:14 UTC (permalink / raw)
To: FUKAUMI Naoki
Cc: manivannan.sadhasivam, Bjorn Helgaas, Manivannan Sadhasivam,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions
On Tue, Oct 14, 2025 at 01:49:05PM -0500, Bjorn Helgaas wrote:
> On Wed, Oct 15, 2025 at 01:30:16AM +0900, FUKAUMI Naoki wrote:
> > Hi Manivannan Sadhasivam,
> >
> > I've noticed an issue on Radxa ROCK 5A/5B boards, which are based on the
> > Rockchip RK3588(S) SoC.
> >
> > When running Linux v6.18-rc1 or linux-next since 20250924, the kernel either
> > freezes or fails to probe M.2 Wi-Fi modules. This happens with several
> > different modules I've tested, including the Realtek RTL8852BE, MediaTek
> > MT7921E, and Intel AX210.
> >
> > I've found that reverting the following commit (i.e., the patch I'm replying
> > to) resolves the problem:
> > commit f3ac2ff14834a0aa056ee3ae0e4b8c641c579961
>
> Thanks for the report, and sorry for the regression.
>
> Since this affects several devices from different manufacturers and (I
> assume) different drivers, it seems likely that there's some issue
> with the Rockchip end, since ASPM probably works on these devices in
> other systems. So we should figure out if there's something wrong
> with the way we configure ASPM, which we could potentially fix, or if
> there's a hardware issue and we need some king of quirk to prevent
> usage of ASPM on the affected platforms.
>
> Can you collect a complete dmesg log when booting with
>
> ignore_loglevel pci=earlydump dyndbg="file drivers/pci/* +p"
>
> and the output of "sudo lspci -vv"?
>
> When the kernel freezes, can you give us any information about where,
> e.g., a log or screenshot?
>
> Do you know if any platforms other than Radxa ROCK 5A/5B have this
> problem?
>
> #regzbot introduced: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
> #regzbot dup-of: https://github.com/chzigotzky/kernels/issues/17
#regzbot report: https://lore.kernel.org/r/22594781424C5C98+22cb5d61-19b1-4353-9818-3bb2b311da0b@radxa.com
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-30 22:14 ` Bjorn Helgaas
@ 2025-10-30 22:16 ` Bjorn Helgaas
0 siblings, 0 replies; 42+ messages in thread
From: Bjorn Helgaas @ 2025-10-30 22:16 UTC (permalink / raw)
To: FUKAUMI Naoki
Cc: manivannan.sadhasivam, Bjorn Helgaas, Manivannan Sadhasivam,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions
On Thu, Oct 30, 2025 at 05:14:05PM -0500, Bjorn Helgaas wrote:
> On Tue, Oct 14, 2025 at 01:49:05PM -0500, Bjorn Helgaas wrote:
> > #regzbot introduced: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
> > #regzbot dup-of: https://github.com/chzigotzky/kernels/issues/17
>
> #regzbot report: https://lore.kernel.org/r/22594781424C5C98+22cb5d61-19b1-4353-9818-3bb2b311da0b@radxa.com
Sorry for the noise; I'm trying to figure out how to use regzbot and
making lots of mistakes.
#regzbot fix: df5192d9bb0e
df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms")
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
2025-10-17 13:45 ` Bjorn Helgaas
@ 2025-10-31 6:21 ` Manivannan Sadhasivam
0 siblings, 0 replies; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-31 6:21 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Shawn Lin, Manivannan Sadhasivam, Niklas Cassel, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao,
Dragan Simic, linux-rockchip, regressions, FUKAUMI Naoki
On Fri, Oct 17, 2025 at 08:45:54AM -0500, Bjorn Helgaas wrote:
> On Fri, Oct 17, 2025 at 06:24:26PM +0530, Manivannan Sadhasivam wrote:
> > On Fri, Oct 17, 2025 at 08:19:11PM +0800, Shawn Lin wrote:
> > > 在 2025/10/17 星期五 18:04, Manivannan Sadhasivam 写道:
> > > > On Fri, Oct 17, 2025 at 05:47:44PM +0800, Shawn Lin wrote:
> > ...
>
> > > > > While we're on the topic of ASPM, may I ask a silly question?
> > > > > I saw the ASPM would only be configured once the function
> > > > > driver calling pci_enable_device. So if the modular driver
> > > > > hasn't been insmoded, the link will be in L0 even though there
> > > > > is no transcation on-going. What is the intention behind it?
> > > >
> > > > I don't see where ASPM is configured during pci_enable_device().
> > > > It is currently configured for all devices during
> > > > pci_scan_slot().
> > >
> > > This is the dump_stack() where I observed. If I compile NVMe
> > > driver as a module and never insmod it, the link is always in L0,
> > > namely ASPM Disabled.
> >
> > I guess this comment answers your question:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pcie/aspm.c?h=v6.18-rc1#n1179
>
> The comment is:
>
> * At this stage drivers haven't had an opportunity to change the
> * link policy setting. Enabling ASPM on broken hardware can cripple
> * it even before the driver has had a chance to disable ASPM, so
> * default to a safe level right now. If we're enabling ASPM beyond
> * the BIOS's expectation, we'll do so once pci_enable_device() is
> * called.
>
> I don't think relying on a driver to disable ASPM to avoid broken
> hardware is the right answer. If the driver is never loaded, we waste
> power. And if the user enables ASPM via sysfs, apparently the device
> may be crippled.
>
> I think it would be better to have an enumeration-time quirk to keep
> us from enabling ASPM. We might trip over some of this broken
> hardware, but I don't think there are very many drivers that fiddle
> with ASPM, so we should be able to be proactive about it.
>
There are quite a bit of drivers fiddling with ASPM states:
git grep -l PCI_EXP_LNKCTL_ASPMC drivers/ | wc -l
16
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-09-22 16:16 [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Manivannan Sadhasivam via B4 Relay
` (2 preceding siblings ...)
2025-09-23 23:14 ` [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Bjorn Helgaas
@ 2025-11-08 16:18 ` Dmitry Baryshkov
2025-11-11 6:51 ` Val Packett
3 siblings, 1 reply; 42+ messages in thread
From: Dmitry Baryshkov @ 2025-11-08 16:18 UTC (permalink / raw)
To: manivannan.sadhasivam, Rob Clark, Vignesh Raman, Valentine Burley
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Bjorn Helgaas
On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> Hi,
>
> This series is one of the 'let's bite the bullet' kind, where we have decided to
> enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
> reason why devicetree platforms were chosen because, it will be of minimal
> impact compared to the ACPI platforms. So seemed ideal to test the waters.
>
> This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
> supported ASPM states are getting enabled for both the NVMe and WLAN devices by
> default.
>
> [1] https://lore.kernel.org/linux-pci/a47sg5ahflhvzyzqnfxvpk3dw4clkhqlhznjxzwqpf4nyjx5dk@bcghz5o6zolk
> [2] https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
>
> Changes in v2:
>
> - Used of_have_populated_dt() instead of CONFIG_OF to identify devicetree
> platforms
> - Renamed the override helpers and changed the override print
> - Moved setting the default state back to the original place and only kept the
> override in helpers
The series breaks the DRM CI on DB820C board (apq8096, PCIe network
card, NFS root). The board resets randomly after some time ([1]).
Note:
- Reverting just the second patch is not enough ([2])
- Reverting the second patch and picking up df5192d9bb0e ("PCI/ASPM:
Enable only L0s and L1 for devicetree platforms") is also nout enough
([3])
- Only revert of both patches results in a working pipeline ([4])
[1] https://gitlab.freedesktop.org/drm/msm/-/jobs/87321332
[2] https://gitlab.freedesktop.org/drm/msm/-/jobs/87476851
[3] https://gitlab.freedesktop.org/drm/msm/-/jobs/87482677
[4] https://gitlab.freedesktop.org/drm/msm/-/jobs/87481381
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> Manivannan Sadhasivam (2):
> PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
> PCI: qcom: Remove the custom ASPM enablement code
>
> drivers/pci/controller/dwc/pcie-qcom.c | 32 --------------------------
> drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++++++++--
> 2 files changed, 40 insertions(+), 34 deletions(-)
> ---
> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
> change-id: 20250916-pci-dt-aspm-8b3a7e8d2cf1
>
> Best regards,
> --
> Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-11-08 16:18 ` Dmitry Baryshkov
@ 2025-11-11 6:51 ` Val Packett
2025-11-11 7:19 ` Manivannan Sadhasivam
2025-11-11 23:33 ` Bjorn Helgaas
0 siblings, 2 replies; 42+ messages in thread
From: Val Packett @ 2025-11-11 6:51 UTC (permalink / raw)
To: Dmitry Baryshkov, manivannan.sadhasivam, Rob Clark, Vignesh Raman,
Valentine Burley
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Bjorn Helgaas
On 11/8/25 1:18 PM, Dmitry Baryshkov wrote:
> On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
>> Hi,
>>
>> This series is one of the 'let's bite the bullet' kind, where we have decided to
>> enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
>> reason why devicetree platforms were chosen because, it will be of minimal
>> impact compared to the ACPI platforms. So seemed ideal to test the waters.
>>
>> This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
>> supported ASPM states are getting enabled for both the NVMe and WLAN devices by
>> default.
>> [..]
> The series breaks the DRM CI on DB820C board (apq8096, PCIe network
> card, NFS root). The board resets randomly after some time ([1]).
Is that reset.. due to the watchdog resetting a hard-frozen system?
Me and a bunch of other people in the #aarch64-laptops irc/matrix room
have been experiencing these random hard freezes with ASPM enabled for
the NVMe SSD, on Hamoa (and Purwa too I think) devices.
Totally unpredictable, could be after 4 minutes or 4 days of uptime.
Panic-indicator LED not blinking, no reaction to magic SysRq, display
image frozen, just a complete hang until the watchdog does the reset.
I have confirmed with a modified (to accept args) enable-aspm.sh
script[1] that disabling ASPM *only* for the SSD, while keeping it *on*
for the WiFi adapter, is enough to keep the system stable (got to about
a month of uptime in that state).
If you have reproduced the same issue on an entirely different SoC, it's
probably a general driver issue.
Please, please help us debug this using your internal secret debug
equipment :)
[1]: https://gist.github.com/valpackett/8a6207b44364de6b32652f4041fe680f
Thanks,
~val
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-11-11 6:51 ` Val Packett
@ 2025-11-11 7:19 ` Manivannan Sadhasivam
2025-11-11 7:40 ` Val Packett
2025-11-11 23:33 ` Bjorn Helgaas
1 sibling, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-11-11 7:19 UTC (permalink / raw)
To: Val Packett
Cc: Dmitry Baryshkov, manivannan.sadhasivam, Rob Clark, Vignesh Raman,
Valentine Burley, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Bjorn Helgaas
On Tue, Nov 11, 2025 at 03:51:03AM -0300, Val Packett wrote:
>
> On 11/8/25 1:18 PM, Dmitry Baryshkov wrote:
> > On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > > Hi,
> > >
> > > This series is one of the 'let's bite the bullet' kind, where we have decided to
> > > enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
> > > reason why devicetree platforms were chosen because, it will be of minimal
> > > impact compared to the ACPI platforms. So seemed ideal to test the waters.
> > >
> > > This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
> > > supported ASPM states are getting enabled for both the NVMe and WLAN devices by
> > > default.
> > > [..]
> > The series breaks the DRM CI on DB820C board (apq8096, PCIe network
> > card, NFS root). The board resets randomly after some time ([1]).
>
> Is that reset.. due to the watchdog resetting a hard-frozen system?
>
> Me and a bunch of other people in the #aarch64-laptops irc/matrix room have
> been experiencing these random hard freezes with ASPM enabled for the NVMe
> SSD, on Hamoa (and Purwa too I think) devices.
>
Interesting! ASPM is tested and found to be working on Hamoa and other Qcom
chipsets also, except Makena based chipsets that doesn't support L0s due to
incorrect PHY settings. APQ8096 might be an exception since it is a really old
target and I'm digging up internally regarding the ASPM support.
> Totally unpredictable, could be after 4 minutes or 4 days of uptime.
> Panic-indicator LED not blinking, no reaction to magic SysRq, display image
> frozen, just a complete hang until the watchdog does the reset.
>
I have KIOXIA SSD on my T14s. I do see some random hang, but I thought those
predate the ASPM enablement as I saw them earlier as well. But even before this
series, we had ASPM enabled for SSDs on Qcom targets (or devices that gets
enumerated during initial bus scan), so it might be that the SSD doesn't support
ASPM well enough.
But I'm clueless on why it results in a hang. What I know on ARM platforms is
that we get SError aborts and other crazy bus/NOC issues if the device doesn't
respond to the PCIe read request. So the hang could be due to one of those
issues.
> I have confirmed with a modified (to accept args) enable-aspm.sh script[1]
> that disabling ASPM *only* for the SSD, while keeping it *on* for the WiFi
> adapter, is enough to keep the system stable (got to about a month of uptime
> in that state).
>
So this confirms that the controller supports it, and the device (SSD) might be
of fault here.
> If you have reproduced the same issue on an entirely different SoC, it's
> probably a general driver issue.
>
> Please, please help us debug this using your internal secret debug equipment
> :)
>
Starting from v6.18-rc3, we only enable L0s and L1 by default on all devicetree
platforms. Are you seeing the hangs post -rc3 also? If so, could you please
share the SSD model by doing 'lspci -nn'?
Apologies for the inconvenience!
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-11-11 7:19 ` Manivannan Sadhasivam
@ 2025-11-11 7:40 ` Val Packett
2025-11-11 10:06 ` Manivannan Sadhasivam
0 siblings, 1 reply; 42+ messages in thread
From: Val Packett @ 2025-11-11 7:40 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Dmitry Baryshkov, manivannan.sadhasivam, Rob Clark, Vignesh Raman,
Valentine Burley, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Bjorn Helgaas
On 11/11/25 4:19 AM, Manivannan Sadhasivam wrote:
> On Tue, Nov 11, 2025 at 03:51:03AM -0300, Val Packett wrote:
>> On 11/8/25 1:18 PM, Dmitry Baryshkov wrote:
>>> On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
>>>> Hi,
>>>>
>>>> This series is one of the 'let's bite the bullet' kind, where we have decided to
>>>> enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
>>>> reason why devicetree platforms were chosen because, it will be of minimal
>>>> impact compared to the ACPI platforms. So seemed ideal to test the waters.
>>>>
>>>> This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
>>>> supported ASPM states are getting enabled for both the NVMe and WLAN devices by
>>>> default.
>>>> [..]
>>> The series breaks the DRM CI on DB820C board (apq8096, PCIe network
>>> card, NFS root). The board resets randomly after some time ([1]).
>> Is that reset.. due to the watchdog resetting a hard-frozen system?
>>
>> Me and a bunch of other people in the #aarch64-laptops irc/matrix room have
>> been experiencing these random hard freezes with ASPM enabled for the NVMe
>> SSD, on Hamoa (and Purwa too I think) devices.
>>
> Interesting! ASPM is tested and found to be working on Hamoa and other Qcom
> chipsets also, except Makena based chipsets that doesn't support L0s due to
> incorrect PHY settings. APQ8096 might be an exception since it is a really old
> target and I'm digging up internally regarding the ASPM support.
>
>> Totally unpredictable, could be after 4 minutes or 4 days of uptime.
>> Panic-indicator LED not blinking, no reaction to magic SysRq, display image
>> frozen, just a complete hang until the watchdog does the reset.
>>
> I have KIOXIA SSD on my T14s. I do see some random hang, but I thought those
> predate the ASPM enablement as I saw them earlier as well. But even before this
> series, we had ASPM enabled for SSDs on Qcom targets (or devices that gets
> enumerated during initial bus scan), so it might be that the SSD doesn't support
> ASPM well enough.
I certainly remember that ASPM *was* enabled by default when I first got
this laptop, via the custom way that predates this series.
Actually that custom enablement code getting removed was how I
discovered it was ASPM related!
I pulled linux-next once and suddenly the system became stable!.. and
then I noticed +2W of battery drain..
> But I'm clueless on why it results in a hang. What I know on ARM platforms is
> that we get SError aborts and other crazy bus/NOC issues if the device doesn't
> respond to the PCIe read request. So the hang could be due to one of those
> issues.
Could the kernel be making requests before the device fully resumed from
a sleep state?
>> I have confirmed with a modified (to accept args) enable-aspm.sh script[1]
>> that disabling ASPM *only* for the SSD, while keeping it *on* for the WiFi
>> adapter, is enough to keep the system stable (got to about a month of uptime
>> in that state).
>>
> So this confirms that the controller supports it, and the device (SSD) might be
> of fault here.
>
>> If you have reproduced the same issue on an entirely different SoC, it's
>> probably a general driver issue.
>>
>> Please, please help us debug this using your internal secret debug equipment
>> :)
>>
> Starting from v6.18-rc3, we only enable L0s and L1 by default on all devicetree
> platforms. Are you seeing the hangs post -rc3 also? If so, could you please
> share the SSD model by doing 'lspci -nn'?
Yes, still seeing them on 6.18.0-rc4-next-20251107. At least with
pcie_aspm=force (have been using that recently, so likely all my testing
"post -rc3" was with force on.. but others have been testing without it)
I'm currently using the stock drive: Sandisk Corp PC SN740 NVMe SSD
(DRAM-less) [15b7:5015] (rev 01)
Though for a couple months I've used a 3rd party one, an SK Hynix BC901
[1c5c:1d59]
And other users have different other models and still have the same issue.
// Every time something PCIe related is posted to the mailing lists I've
been wondering if it could solve this :D
"Program correct T_POWER_ON value for L1.2 exit timing" didn't help.
Testing "Remove DPC Extended Capability" now..
~val
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-11-11 7:40 ` Val Packett
@ 2025-11-11 10:06 ` Manivannan Sadhasivam
2025-11-11 17:29 ` Val Packett
0 siblings, 1 reply; 42+ messages in thread
From: Manivannan Sadhasivam @ 2025-11-11 10:06 UTC (permalink / raw)
To: Val Packett
Cc: Dmitry Baryshkov, manivannan.sadhasivam, Rob Clark, Vignesh Raman,
Valentine Burley, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Bjorn Helgaas
On Tue, Nov 11, 2025 at 04:40:01AM -0300, Val Packett wrote:
>
> On 11/11/25 4:19 AM, Manivannan Sadhasivam wrote:
> > On Tue, Nov 11, 2025 at 03:51:03AM -0300, Val Packett wrote:
> > > On 11/8/25 1:18 PM, Dmitry Baryshkov wrote:
> > > > On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > > > > Hi,
> > > > >
> > > > > This series is one of the 'let's bite the bullet' kind, where we have decided to
> > > > > enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
> > > > > reason why devicetree platforms were chosen because, it will be of minimal
> > > > > impact compared to the ACPI platforms. So seemed ideal to test the waters.
> > > > >
> > > > > This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
> > > > > supported ASPM states are getting enabled for both the NVMe and WLAN devices by
> > > > > default.
> > > > > [..]
> > > > The series breaks the DRM CI on DB820C board (apq8096, PCIe network
> > > > card, NFS root). The board resets randomly after some time ([1]).
> > > Is that reset.. due to the watchdog resetting a hard-frozen system?
> > >
> > > Me and a bunch of other people in the #aarch64-laptops irc/matrix room have
> > > been experiencing these random hard freezes with ASPM enabled for the NVMe
> > > SSD, on Hamoa (and Purwa too I think) devices.
> > >
> > Interesting! ASPM is tested and found to be working on Hamoa and other Qcom
> > chipsets also, except Makena based chipsets that doesn't support L0s due to
> > incorrect PHY settings. APQ8096 might be an exception since it is a really old
> > target and I'm digging up internally regarding the ASPM support.
> >
> > > Totally unpredictable, could be after 4 minutes or 4 days of uptime.
> > > Panic-indicator LED not blinking, no reaction to magic SysRq, display image
> > > frozen, just a complete hang until the watchdog does the reset.
> > >
> > I have KIOXIA SSD on my T14s. I do see some random hang, but I thought those
> > predate the ASPM enablement as I saw them earlier as well. But even before this
> > series, we had ASPM enabled for SSDs on Qcom targets (or devices that gets
> > enumerated during initial bus scan), so it might be that the SSD doesn't support
> > ASPM well enough.
>
> I certainly remember that ASPM *was* enabled by default when I first got
> this laptop, via the custom way that predates this series.
>
> Actually that custom enablement code getting removed was how I discovered it
> was ASPM related!
>
> I pulled linux-next once and suddenly the system became stable!.. and then I
> noticed +2W of battery drain..
>
Because, we only enable L0s and L1 by default and not L1ss.
> > But I'm clueless on why it results in a hang. What I know on ARM platforms is
> > that we get SError aborts and other crazy bus/NOC issues if the device doesn't
> > respond to the PCIe read request. So the hang could be due to one of those
> > issues.
>
> Could the kernel be making requests before the device fully resumed from a
> sleep state?
>
Kernel has no visibility on the PCIe link ASPM states as it happens autonomously
in hardware once enabled. So once kernel issues a PCIe read TLP, the link is
supposed to transition L0 and the device should respond. But if the link doesn't
come up for any reason, it will result in a completion timeout and weird things
happen on the host.
> > > I have confirmed with a modified (to accept args) enable-aspm.sh script[1]
> > > that disabling ASPM *only* for the SSD, while keeping it *on* for the WiFi
> > > adapter, is enough to keep the system stable (got to about a month of uptime
> > > in that state).
> > >
> > So this confirms that the controller supports it, and the device (SSD) might be
> > of fault here.
> >
> > > If you have reproduced the same issue on an entirely different SoC, it's
> > > probably a general driver issue.
> > >
> > > Please, please help us debug this using your internal secret debug equipment
> > > :)
> > >
> > Starting from v6.18-rc3, we only enable L0s and L1 by default on all devicetree
> > platforms. Are you seeing the hangs post -rc3 also? If so, could you please
> > share the SSD model by doing 'lspci -nn'?
>
> Yes, still seeing them on 6.18.0-rc4-next-20251107. At least with
> pcie_aspm=force (have been using that recently, so likely all my testing
> "post -rc3" was with force on.. but others have been testing without it)
>
pcie_aspm=force will forcefully enable all the ASPM states. So it will result in
the same crash if L1ss is not supported properly by the endpoint.
> I'm currently using the stock drive: Sandisk Corp PC SN740 NVMe SSD
> (DRAM-less) [15b7:5015] (rev 01)
>
I'm suspecting the L1ss issue with this SSD since you said above that
next/master works fine until you pass 'pcie_aspm=force'. Could you try the below
diff with that cmdline option?
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 44e780718953..ba48f8184b68 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2525,6 +2525,16 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
*/
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
+static void quirk_disable_aspm_l1ss(struct pci_dev *dev)
+{
+ pci_info(dev, "Disabling ASPM L1ss\n");
+ pci_disable_link_state(dev, PCIE_LINK_STATE_L1_1 |
+ PCIE_LINK_STATE_L1_2 |
+ PCIE_LINK_STATE_L1_1_PCIPM |
+ PCIE_LINK_STATE_L1_2_PCIPM);
+}
+DECLARE_PCI_FIXUP_FINAL(0x15b7, 0x5015, quirk_disable_aspm_l1ss);
+
/*
* Remove ASPM L0s and L1 support from cached copy of Link Capabilities so
* aspm.c won't try to enable them.
> Though for a couple months I've used a 3rd party one, an SK Hynix BC901
> [1c5c:1d59]
>
> And other users have different other models and still have the same issue.
>
> // Every time something PCIe related is posted to the mailing lists I've
> been wondering if it could solve this :D
> "Program correct T_POWER_ON value for L1.2 exit timing" didn't help. Testing
> "Remove DPC Extended Capability" now..
>
You could've reported this issue to linux-pci list.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-11-11 10:06 ` Manivannan Sadhasivam
@ 2025-11-11 17:29 ` Val Packett
2025-11-13 4:30 ` Val Packett
0 siblings, 1 reply; 42+ messages in thread
From: Val Packett @ 2025-11-11 17:29 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Dmitry Baryshkov, manivannan.sadhasivam, Rob Clark, Vignesh Raman,
Valentine Burley, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Bjorn Helgaas
On 11/11/25 7:06 AM, Manivannan Sadhasivam wrote:
> On Tue, Nov 11, 2025 at 04:40:01AM -0300, Val Packett wrote:
>> On 11/11/25 4:19 AM, Manivannan Sadhasivam wrote:
>>> [..]
>>>> Totally unpredictable, could be after 4 minutes or 4 days of uptime.
>>>> Panic-indicator LED not blinking, no reaction to magic SysRq, display image
>>>> frozen, just a complete hang until the watchdog does the reset.
>>> I have KIOXIA SSD on my T14s. I do see some random hang, but I thought those
>>> predate the ASPM enablement as I saw them earlier as well. But even before this
>>> series, we had ASPM enabled for SSDs on Qcom targets (or devices that gets
>>> enumerated during initial bus scan), so it might be that the SSD doesn't support
>>> ASPM well enough.
>> I certainly remember that ASPM *was* enabled by default when I first got
>> this laptop, via the custom way that predates this series.
>>
>> Actually that custom enablement code getting removed was how I discovered it
>> was ASPM related!
>>
>> I pulled linux-next once and suddenly the system became stable!.. and then I
>> noticed +2W of battery drain..
> Because, we only enable L0s and L1 by default and not L1ss.
Back in that short time period between the old code getting removed and
this series landing, the default behavior was no ASPM at all, I'm pretty
sure.
Again, with the SK hynix SSD I used back then, I *definitely* saw the
issue with this series in and no args applied.
> [..]
>> I'm currently using the stock drive: Sandisk Corp PC SN740 NVMe SSD
>> (DRAM-less) [15b7:5015] (rev 01)
> I'm suspecting the L1ss issue with this SSD since you said above that
> next/master works fine until you pass 'pcie_aspm=force'. Could you try the below
> diff with that cmdline option?
I did *not* say that it works fine with no arg!
I said that I've only tested this stock WD SSD with 'force' so far, and
don't have any data on *this* SSD without 'force' yet.
Now testing with this drive and no arg:
LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
ExtSynch+ ClockPM- AutWidDis- BWInt- AutBWInt-
FltModeDis-
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=156672ns
Let's see how it goes.
But it sounds very odd that all the SSDs would be to blame and not the
controller.. Other platforms don't seem to be having this issue. Don't
Intel and AMD enable L1ss by default?
~val
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-11-11 6:51 ` Val Packett
2025-11-11 7:19 ` Manivannan Sadhasivam
@ 2025-11-11 23:33 ` Bjorn Helgaas
1 sibling, 0 replies; 42+ messages in thread
From: Bjorn Helgaas @ 2025-11-11 23:33 UTC (permalink / raw)
To: Val Packett
Cc: Dmitry Baryshkov, manivannan.sadhasivam, Rob Clark, Vignesh Raman,
Valentine Burley, Bjorn Helgaas, Manivannan Sadhasivam,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
linux-pci, linux-kernel, linux-arm-msm, David E. Box,
Kai-Heng Feng, Rafael J. Wysocki, Heiner Kallweit, Chia-Lin Kao
On Tue, Nov 11, 2025 at 03:51:03AM -0300, Val Packett wrote:
> On 11/8/25 1:18 PM, Dmitry Baryshkov wrote:
> > On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > > Hi,
> > >
> > > This series is one of the 'let's bite the bullet' kind, where we have decided to
> > > enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
> > > reason why devicetree platforms were chosen because, it will be of minimal
> > > impact compared to the ACPI platforms. So seemed ideal to test the waters.
> > >
> > > This series is tested on Lenovo Thinkpad T14s based on Snapdragon X1 SoC. All
> > > supported ASPM states are getting enabled for both the NVMe and WLAN devices by
> > > default.
> > > [..]
> > The series breaks the DRM CI on DB820C board (apq8096, PCIe network
> > card, NFS root). The board resets randomly after some time ([1]).
>
> Is that reset.. due to the watchdog resetting a hard-frozen system?
>
> Me and a bunch of other people in the #aarch64-laptops irc/matrix room have
> been experiencing these random hard freezes with ASPM enabled for the NVMe
> SSD, on Hamoa (and Purwa too I think) devices.
I don't know what controllers are in Hamoa and Purwa or what the IDs
of the root ports and endpoints are. Can you collect the Vendor and
Device IDs (from dmesg log or "lspci -n")? If we figure out that some
are broken, we might be able to add quirks to avoid any broken ASPM
states.
> I have confirmed with a modified (to accept args) enable-aspm.sh script[1]
> that disabling ASPM *only* for the SSD, while keeping it *on* for the WiFi
> adapter, is enough to keep the system stable (got to about a month of uptime
> in that state).
>
> If you have reproduced the same issue on an entirely different SoC, it's
> probably a general driver issue.
>
> Please, please help us debug this using your internal secret debug equipment
> :)
>
>
> [1]: https://gist.github.com/valpackett/8a6207b44364de6b32652f4041fe680f
Can you use "echo 1 > /sys/bus/pci/devices/.../link/l0s_aspm" and
similar (see Documentation/ABI/testing/sysfs-bus-pci) to do this
tuning instead of poking with setpci? If so, it might be easier.
There are ordering requirements that aspm.c tries to observe via the
sysfs interface.
enable-aspm.sh might observe them also (I didn't look that carefully),
but if aspm.c gets them wrong, they're wrong for everybody, so we'd
like to know about that.
Bjorn
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
2025-11-11 17:29 ` Val Packett
@ 2025-11-13 4:30 ` Val Packett
0 siblings, 0 replies; 42+ messages in thread
From: Val Packett @ 2025-11-13 4:30 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Dmitry Baryshkov, manivannan.sadhasivam, Rob Clark, Vignesh Raman,
Valentine Burley, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, linux-pci, linux-kernel,
linux-arm-msm, David E. Box, Kai-Heng Feng, Rafael J. Wysocki,
Heiner Kallweit, Chia-Lin Kao, Bjorn Helgaas
On 11/11/25 2:29 PM, Val Packett wrote:
> [..]
>
> Now testing with this drive and no arg:
>
> LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable-
> CommClk+
> ExtSynch+ ClockPM- AutWidDis- BWInt- AutBWInt-
> FltModeDis-
>
> L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2-
> ASPM_L1.1-
> T_CommonMode=0us LTR1.2_Threshold=156672ns
>
> Let's see how it goes.
Update: close to 2 days in, went AFK to eat and came back to a gdm login
prompt once again. (This is the stock WD drive and no force.)
This does not seem to be related to L1ss.
~val
^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2025-11-13 4:30 UTC | newest]
Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-22 16:16 [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Manivannan Sadhasivam via B4 Relay
2025-09-22 16:16 ` [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for " Manivannan Sadhasivam via B4 Relay
2025-10-14 16:30 ` FUKAUMI Naoki
2025-10-14 18:49 ` Bjorn Helgaas
2025-10-14 23:33 ` Dragan Simic
2025-10-15 6:22 ` Manivannan Sadhasivam
2025-10-15 11:23 ` Diederik de Haas
2025-10-23 18:57 ` Dragan Simic
2025-10-15 6:26 ` Manivannan Sadhasivam
2025-10-15 7:13 ` FUKAUMI Naoki
2025-10-15 7:50 ` Manivannan Sadhasivam
2025-10-15 9:11 ` Shawn Lin
2025-10-15 9:43 ` Manivannan Sadhasivam
2025-10-15 9:46 ` Niklas Cassel
2025-10-15 10:33 ` Manivannan Sadhasivam
2025-10-15 12:17 ` Niklas Cassel
2025-10-15 13:00 ` Shawn Lin
2025-10-15 15:23 ` Niklas Cassel
2025-10-15 23:30 ` Bjorn Helgaas
2025-10-16 6:46 ` Hongxing Zhu
2025-10-17 3:36 ` Manivannan Sadhasivam
2025-10-17 9:47 ` Shawn Lin
2025-10-17 10:04 ` Manivannan Sadhasivam
2025-10-17 12:19 ` Shawn Lin
2025-10-17 12:54 ` Manivannan Sadhasivam
2025-10-17 13:45 ` Bjorn Helgaas
2025-10-31 6:21 ` Manivannan Sadhasivam
2025-10-15 12:26 ` Diederik de Haas
2025-10-15 22:50 ` Bjorn Helgaas
2025-10-16 17:38 ` Diederik de Haas
2025-10-30 22:14 ` Bjorn Helgaas
2025-10-30 22:16 ` Bjorn Helgaas
2025-09-22 16:16 ` [PATCH v2 2/2] PCI: qcom: Remove the custom ASPM enablement code Manivannan Sadhasivam via B4 Relay
2025-09-23 23:14 ` [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms Bjorn Helgaas
2025-11-08 16:18 ` Dmitry Baryshkov
2025-11-11 6:51 ` Val Packett
2025-11-11 7:19 ` Manivannan Sadhasivam
2025-11-11 7:40 ` Val Packett
2025-11-11 10:06 ` Manivannan Sadhasivam
2025-11-11 17:29 ` Val Packett
2025-11-13 4:30 ` Val Packett
2025-11-11 23:33 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).