* [PATCH] PCI/ASPM: fix unexpected behavior when re-enabling L1
@ 2023-08-26 11:10 Heiner Kallweit
2023-10-02 15:14 ` Bjorn Helgaas
0 siblings, 1 reply; 5+ messages in thread
From: Heiner Kallweit @ 2023-08-26 11:10 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: linux-pci@vger.kernel.org, Ajay Agarwal
After the referenced commit we may see L1 sub-states being active
unexpectedly. Following scenario as an example:
r8169 disables L1 because of known hardware issues on a number of
systems. Implicitly L1.1 and L1.2 are disabled too.
On my system L1 and L1.1 work fine, but L1.2 causes missed
rx packets. Therefore I write 1 to aspm_l1_1.
This removes ASPM_STATE_L1 from the disabled modes and therefore
unexpectedly enables also L1.2. So return to the old behavior.
A comment in the commit message of the referenced change correctly points
out that this behavior is inconsistent with aspm_attr_store_common().
So change aspm_attr_store_common() accordingly.
Fixes: fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
drivers/pci/pcie/aspm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 3dafba0b5..6d3788257 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem)
if (state & PCIE_LINK_STATE_L0S)
link->aspm_disable |= ASPM_STATE_L0S;
if (state & PCIE_LINK_STATE_L1)
- link->aspm_disable |= ASPM_STATE_L1;
+ link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS;
if (state & PCIE_LINK_STATE_L1_1)
link->aspm_disable |= ASPM_STATE_L1_1;
if (state & PCIE_LINK_STATE_L1_2)
@@ -1251,6 +1251,8 @@ static ssize_t aspm_attr_store_common(struct device *dev,
link->aspm_disable &= ~ASPM_STATE_L1;
} else {
link->aspm_disable |= state;
+ if (state & ASPM_STATE_L1)
+ link->aspm_disable |= ASPM_STATE_L1SS;
}
pcie_config_aspm_link(link, policy_to_aspm_state(link));
--
2.42.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] PCI/ASPM: fix unexpected behavior when re-enabling L1 2023-08-26 11:10 [PATCH] PCI/ASPM: fix unexpected behavior when re-enabling L1 Heiner Kallweit @ 2023-10-02 15:14 ` Bjorn Helgaas 2023-10-03 3:33 ` Kuppuswamy Sathyanarayanan 2023-10-10 20:29 ` Heiner Kallweit 0 siblings, 2 replies; 5+ messages in thread From: Bjorn Helgaas @ 2023-10-02 15:14 UTC (permalink / raw) To: Heiner Kallweit Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, Ajay Agarwal, Kuppuswamy Sathyanarayanan, Lukas Wunner [+cc Sathy, Lukas] On Sat, Aug 26, 2023 at 01:10:35PM +0200, Heiner Kallweit wrote: > After the referenced commit we may see L1 sub-states being active > unexpectedly. Following scenario as an example: > r8169 disables L1 because of known hardware issues on a number of > systems. Implicitly L1.1 and L1.2 are disabled too. > On my system L1 and L1.1 work fine, but L1.2 causes missed > rx packets. Therefore I write 1 to aspm_l1_1. > This removes ASPM_STATE_L1 from the disabled modes and therefore > unexpectedly enables also L1.2. So return to the old behavior. > > A comment in the commit message of the referenced change correctly points > out that this behavior is inconsistent with aspm_attr_store_common(). > So change aspm_attr_store_common() accordingly. I think we should split this into a pure revert of fb097dcd5a28 with the description of the unintended consequence, followed by another patch to change aspm_attr_store_common(). I guess the existing aspm_attr_store_common() behavior allows a similar unexpected behavior? For example, in this sequence: - Write 0 to "l1_aspm" to disable L1 - Write 1 to "l1_1_aspm" to enable L1.1 does L1.2 get implicitly enabled as well even though that's clearly not what the user intended? There's also the separate question of how the sysfs file and the pci_disable_link_state() API should interact. Drivers use that API when they know about a defect in their device, but the user can override that via syfs. In [1], we have a similar situation with D3cold support, where we're thinking that we should not allow a user to use sysfs to override that driver knowledge. Bjorn [1] https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de > Fixes: fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1") > Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> > --- > drivers/pci/pcie/aspm.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index 3dafba0b5..6d3788257 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem) > if (state & PCIE_LINK_STATE_L0S) > link->aspm_disable |= ASPM_STATE_L0S; > if (state & PCIE_LINK_STATE_L1) > - link->aspm_disable |= ASPM_STATE_L1; > + link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS; > if (state & PCIE_LINK_STATE_L1_1) > link->aspm_disable |= ASPM_STATE_L1_1; > if (state & PCIE_LINK_STATE_L1_2) > @@ -1251,6 +1251,8 @@ static ssize_t aspm_attr_store_common(struct device *dev, > link->aspm_disable &= ~ASPM_STATE_L1; > } else { > link->aspm_disable |= state; > + if (state & ASPM_STATE_L1) > + link->aspm_disable |= ASPM_STATE_L1SS; > } > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > -- > 2.42.0 > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] PCI/ASPM: fix unexpected behavior when re-enabling L1 2023-10-02 15:14 ` Bjorn Helgaas @ 2023-10-03 3:33 ` Kuppuswamy Sathyanarayanan 2023-10-10 20:33 ` Heiner Kallweit 2023-10-10 20:29 ` Heiner Kallweit 1 sibling, 1 reply; 5+ messages in thread From: Kuppuswamy Sathyanarayanan @ 2023-10-03 3:33 UTC (permalink / raw) To: Bjorn Helgaas, Heiner Kallweit Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, Ajay Agarwal, Lukas Wunner On 10/2/2023 8:14 AM, Bjorn Helgaas wrote: > [+cc Sathy, Lukas] > > On Sat, Aug 26, 2023 at 01:10:35PM +0200, Heiner Kallweit wrote: >> After the referenced commit we may see L1 sub-states being active >> unexpectedly. Following scenario as an example: >> r8169 disables L1 because of known hardware issues on a number of >> systems. Implicitly L1.1 and L1.2 are disabled too. >> On my system L1 and L1.1 work fine, but L1.2 causes missed >> rx packets. Therefore I write 1 to aspm_l1_1. >> This removes ASPM_STATE_L1 from the disabled modes and therefore >> unexpectedly enables also L1.2. So return to the old behavior. IIUC, the above-mentioned SysFS issue will be fixed by your change to aspm_attr_store_common(), right? Can you elaborate why you need the following change? >> @@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem) >> if (state & PCIE_LINK_STATE_L1) >> - link->aspm_disable |= ASPM_STATE_L1; >> + link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS; >> >> A comment in the commit message of the referenced change correctly points >> out that this behavior is inconsistent with aspm_attr_store_common(). >> So change aspm_attr_store_common() accordingly. > > I think we should split this into a pure revert of fb097dcd5a28 with > the description of the unintended consequence, followed by another > patch to change aspm_attr_store_common(). > I agree, the revert and new change should be split into two patches. > I guess the existing aspm_attr_store_common() behavior allows a > similar unexpected behavior? For example, in this sequence: > > - Write 0 to "l1_aspm" to disable L1 > - Write 1 to "l1_1_aspm" to enable L1.1 > > does L1.2 get implicitly enabled as well even though that's clearly > not what the user intended? > > There's also the separate question of how the sysfs file and the > pci_disable_link_state() API should interact. Drivers use that API > when they know about a defect in their device, but the user can > override that via syfs. > > In [1], we have a similar situation with D3cold support, where we're > thinking that we should not allow a user to use sysfs to override that > driver knowledge. > > Bjorn > > [1] https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de > >> Fixes: fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1") >> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> >> --- >> drivers/pci/pcie/aspm.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c >> index 3dafba0b5..6d3788257 100644 >> --- a/drivers/pci/pcie/aspm.c >> +++ b/drivers/pci/pcie/aspm.c >> @@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem) >> if (state & PCIE_LINK_STATE_L0S) >> link->aspm_disable |= ASPM_STATE_L0S; >> if (state & PCIE_LINK_STATE_L1) >> - link->aspm_disable |= ASPM_STATE_L1; >> + link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS; >> if (state & PCIE_LINK_STATE_L1_1) >> link->aspm_disable |= ASPM_STATE_L1_1; >> if (state & PCIE_LINK_STATE_L1_2) >> @@ -1251,6 +1251,8 @@ static ssize_t aspm_attr_store_common(struct device *dev, >> link->aspm_disable &= ~ASPM_STATE_L1; >> } else { >> link->aspm_disable |= state; >> + if (state & ASPM_STATE_L1) >> + link->aspm_disable |= ASPM_STATE_L1SS; >> } >> >> pcie_config_aspm_link(link, policy_to_aspm_state(link)); >> -- >> 2.42.0 >> -- Sathyanarayanan Kuppuswamy Linux Kernel Developer ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] PCI/ASPM: fix unexpected behavior when re-enabling L1 2023-10-03 3:33 ` Kuppuswamy Sathyanarayanan @ 2023-10-10 20:33 ` Heiner Kallweit 0 siblings, 0 replies; 5+ messages in thread From: Heiner Kallweit @ 2023-10-10 20:33 UTC (permalink / raw) To: Kuppuswamy Sathyanarayanan, Bjorn Helgaas Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, Ajay Agarwal, Lukas Wunner On 03.10.2023 05:33, Kuppuswamy Sathyanarayanan wrote: > > > On 10/2/2023 8:14 AM, Bjorn Helgaas wrote: >> [+cc Sathy, Lukas] >> >> On Sat, Aug 26, 2023 at 01:10:35PM +0200, Heiner Kallweit wrote: >>> After the referenced commit we may see L1 sub-states being active >>> unexpectedly. Following scenario as an example: >>> r8169 disables L1 because of known hardware issues on a number of >>> systems. Implicitly L1.1 and L1.2 are disabled too. >>> On my system L1 and L1.1 work fine, but L1.2 causes missed >>> rx packets. Therefore I write 1 to aspm_l1_1. >>> This removes ASPM_STATE_L1 from the disabled modes and therefore >>> unexpectedly enables also L1.2. So return to the old behavior. > > IIUC, the above-mentioned SysFS issue will be fixed by your change to > aspm_attr_store_common(), right? > In my use case the driver uses pci_disable_link_state() to disable L1 and implicitly also L1.1 and L1.2. The following change is needed so that L1.2 remains disabled if L1 is removed from the list of disabled states later. That's the revert we're talking about. > Can you elaborate why you need the following change? > >>> @@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem) >>> if (state & PCIE_LINK_STATE_L1) >>> - link->aspm_disable |= ASPM_STATE_L1; >>> + link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS; > >>> >>> A comment in the commit message of the referenced change correctly points >>> out that this behavior is inconsistent with aspm_attr_store_common(). >>> So change aspm_attr_store_common() accordingly. >> >> I think we should split this into a pure revert of fb097dcd5a28 with >> the description of the unintended consequence, followed by another >> patch to change aspm_attr_store_common(). >> > > I agree, the revert and new change should be split into two patches. > >> I guess the existing aspm_attr_store_common() behavior allows a >> similar unexpected behavior? For example, in this sequence: >> >> - Write 0 to "l1_aspm" to disable L1 >> - Write 1 to "l1_1_aspm" to enable L1.1 >> >> does L1.2 get implicitly enabled as well even though that's clearly >> not what the user intended? >> >> There's also the separate question of how the sysfs file and the >> pci_disable_link_state() API should interact. Drivers use that API >> when they know about a defect in their device, but the user can >> override that via syfs. >> >> In [1], we have a similar situation with D3cold support, where we're >> thinking that we should not allow a user to use sysfs to override that >> driver knowledge. >> >> Bjorn >> >> [1] https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de >> >>> Fixes: fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1") >>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> >>> --- >>> drivers/pci/pcie/aspm.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c >>> index 3dafba0b5..6d3788257 100644 >>> --- a/drivers/pci/pcie/aspm.c >>> +++ b/drivers/pci/pcie/aspm.c >>> @@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem) >>> if (state & PCIE_LINK_STATE_L0S) >>> link->aspm_disable |= ASPM_STATE_L0S; >>> if (state & PCIE_LINK_STATE_L1) >>> - link->aspm_disable |= ASPM_STATE_L1; >>> + link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS; >>> if (state & PCIE_LINK_STATE_L1_1) >>> link->aspm_disable |= ASPM_STATE_L1_1; >>> if (state & PCIE_LINK_STATE_L1_2) >>> @@ -1251,6 +1251,8 @@ static ssize_t aspm_attr_store_common(struct device *dev, >>> link->aspm_disable &= ~ASPM_STATE_L1; >>> } else { >>> link->aspm_disable |= state; >>> + if (state & ASPM_STATE_L1) >>> + link->aspm_disable |= ASPM_STATE_L1SS; >>> } >>> >>> pcie_config_aspm_link(link, policy_to_aspm_state(link)); >>> -- >>> 2.42.0 >>> > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] PCI/ASPM: fix unexpected behavior when re-enabling L1 2023-10-02 15:14 ` Bjorn Helgaas 2023-10-03 3:33 ` Kuppuswamy Sathyanarayanan @ 2023-10-10 20:29 ` Heiner Kallweit 1 sibling, 0 replies; 5+ messages in thread From: Heiner Kallweit @ 2023-10-10 20:29 UTC (permalink / raw) To: Bjorn Helgaas Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, Ajay Agarwal, Kuppuswamy Sathyanarayanan, Lukas Wunner On 02.10.2023 17:14, Bjorn Helgaas wrote: > [+cc Sathy, Lukas] > > On Sat, Aug 26, 2023 at 01:10:35PM +0200, Heiner Kallweit wrote: >> After the referenced commit we may see L1 sub-states being active >> unexpectedly. Following scenario as an example: >> r8169 disables L1 because of known hardware issues on a number of >> systems. Implicitly L1.1 and L1.2 are disabled too. >> On my system L1 and L1.1 work fine, but L1.2 causes missed >> rx packets. Therefore I write 1 to aspm_l1_1. >> This removes ASPM_STATE_L1 from the disabled modes and therefore >> unexpectedly enables also L1.2. So return to the old behavior. >> >> A comment in the commit message of the referenced change correctly points >> out that this behavior is inconsistent with aspm_attr_store_common(). >> So change aspm_attr_store_common() accordingly. > > I think we should split this into a pure revert of fb097dcd5a28 with > the description of the unintended consequence, followed by another > patch to change aspm_attr_store_common(). > OK > I guess the existing aspm_attr_store_common() behavior allows a > similar unexpected behavior? For example, in this sequence: > > - Write 0 to "l1_aspm" to disable L1 > - Write 1 to "l1_1_aspm" to enable L1.1 > > does L1.2 get implicitly enabled as well even though that's clearly > not what the user intended? > Right, it's the same here. Therefore the second change in the patch. > There's also the separate question of how the sysfs file and the > pci_disable_link_state() API should interact. Drivers use that API > when they know about a defect in their device, but the user can > override that via syfs. > > In [1], we have a similar situation with D3cold support, where we're > thinking that we should not allow a user to use sysfs to override that > driver knowledge. > In my r8169 use case ASPM works fine on one system, but causes issues on another one with same NIC chip version. So it may be a wild mix of NIC hw erratum, BIOS bug, mainboard chipset incompatibilities, etc. Therefore I disable L1 per default, and allow users to re-enable L1/L1.1/L1.2 if their system isn't affected. > Bjorn > > [1] https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de > >> Fixes: fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1") >> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> >> --- >> drivers/pci/pcie/aspm.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c >> index 3dafba0b5..6d3788257 100644 >> --- a/drivers/pci/pcie/aspm.c >> +++ b/drivers/pci/pcie/aspm.c >> @@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem) >> if (state & PCIE_LINK_STATE_L0S) >> link->aspm_disable |= ASPM_STATE_L0S; >> if (state & PCIE_LINK_STATE_L1) >> - link->aspm_disable |= ASPM_STATE_L1; >> + link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS; >> if (state & PCIE_LINK_STATE_L1_1) >> link->aspm_disable |= ASPM_STATE_L1_1; >> if (state & PCIE_LINK_STATE_L1_2) >> @@ -1251,6 +1251,8 @@ static ssize_t aspm_attr_store_common(struct device *dev, >> link->aspm_disable &= ~ASPM_STATE_L1; >> } else { >> link->aspm_disable |= state; >> + if (state & ASPM_STATE_L1) >> + link->aspm_disable |= ASPM_STATE_L1SS; >> } >> >> pcie_config_aspm_link(link, policy_to_aspm_state(link)); >> -- >> 2.42.0 >> ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-10-10 20:34 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-08-26 11:10 [PATCH] PCI/ASPM: fix unexpected behavior when re-enabling L1 Heiner Kallweit 2023-10-02 15:14 ` Bjorn Helgaas 2023-10-03 3:33 ` Kuppuswamy Sathyanarayanan 2023-10-10 20:33 ` Heiner Kallweit 2023-10-10 20:29 ` Heiner Kallweit
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).