* [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
@ 2025-12-01 3:52 Maciej W. Rozycki
2025-12-01 9:45 ` Ilpo Järvinen
` (3 more replies)
0 siblings, 4 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2025-12-01 3:52 UTC (permalink / raw)
To: Bjorn Helgaas, Matthew W Carlis, ALOK TIWARI
Cc: ashishk, bamstadt, msaggi, sconnor, Lukas Wunner,
Ilpo Järvinen, Jiwei, guojinhui.liam, ahuang12, sunjw10,
linux-pci, linux-kernel
Discard Vendor:Device ID matching in the PCIe failed link retraining
quirk and ignore the link status for the removal of the 2.5GT/s speed
clamp, whether applied by the quirk itself or the firmware earlier on.
Revert to the original target link speed if this final link retraining
has failed.
This is so that link training noise in hot-plug scenarios does not make
a link remain clamped to the 2.5GT/s speed where an event race has led
the quirk to apply the speed clamp for one device, only to leave it in
place for a subsequent device to be plugged in.
Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: <stable@vger.kernel.org> # v6.5+
---
drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
1 file changed, 18 insertions(+), 32 deletions(-)
linux-pcie-failed-link-retrain-unclamp-always.diff
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -79,11 +79,10 @@ static bool pcie_lbms_seen(struct pci_de
* Restrict the speed to 2.5GT/s then with the Target Link Speed field,
* request a retrain and check the result.
*
- * If this turns out successful and we know by the Vendor:Device ID it is
- * safe to do so, then lift the restriction, letting the devices negotiate
- * a higher speed. Also check for a similar 2.5GT/s speed restriction the
- * firmware may have already arranged and lift it with ports that already
- * report their data link being up.
+ * If this turns out successful, or where a 2.5GT/s speed restriction has
+ * been previously arranged by the firmware and the port reports its link
+ * already being up, lift the restriction, in a hope it is safe to do so,
+ * letting the devices negotiate a higher speed.
*
* Otherwise revert the speed to the original setting and request a retrain
* again to remove any residual state, ignoring the result as it's supposed
@@ -94,52 +93,39 @@ static bool pcie_lbms_seen(struct pci_de
*/
int pcie_failed_link_retrain(struct pci_dev *dev)
{
- static const struct pci_device_id ids[] = {
- { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
- {}
- };
- u16 lnksta, lnkctl2;
+ u16 lnksta, lnkctl2, oldlnkctl2;
int ret = -ENOTTY;
+ u32 lnkcap;
if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
!pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
return ret;
pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
- u16 oldlnkctl2;
-
pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
- pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
- true);
- return ret;
- }
-
- pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ if (ret)
+ goto err;
}
pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
-
- if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
- (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
- pci_match_id(ids, dev)) {
- u32 lnkcap;
-
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
+ (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
- pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- return ret;
- }
+ if (ret)
+ goto err;
}
return ret;
+err:
+ pci_info(dev, "retraining failed\n");
+ pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2), true);
+ return ret;
}
static ktime_t fixup_debug_start(struct pci_dev *dev,
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-01 3:52 [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Maciej W. Rozycki
@ 2025-12-01 9:45 ` Ilpo Järvinen
2025-12-01 13:55 ` Maciej W. Rozycki
2025-12-02 13:49 ` [External] : " ALOK TIWARI
` (2 subsequent siblings)
3 siblings, 1 reply; 22+ messages in thread
From: Ilpo Järvinen @ 2025-12-01 9:45 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Bjorn Helgaas, Matthew W Carlis, ALOK TIWARI, ashishk, bamstadt,
msaggi, sconnor, Lukas Wunner, Jiwei, guojinhui.liam, ahuang12,
sunjw10, linux-pci, LKML
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> Discard Vendor:Device ID matching in the PCIe failed link retraining
> quirk and ignore the link status for the removal of the 2.5GT/s speed
> clamp, whether applied by the quirk itself or the firmware earlier on.
> Revert to the original target link speed if this final link retraining
> has failed.
>
> This is so that link training noise in hot-plug scenarios does not make
> a link remain clamped to the 2.5GT/s speed where an event race has led
> the quirk to apply the speed clamp for one device, only to leave it in
> place for a subsequent device to be plugged in.
>
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Cc: <stable@vger.kernel.org> # v6.5+
> ---
> drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
> 1 file changed, 18 insertions(+), 32 deletions(-)
>
> linux-pcie-failed-link-retrain-unclamp-always.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
> @@ -79,11 +79,10 @@ static bool pcie_lbms_seen(struct pci_de
> * Restrict the speed to 2.5GT/s then with the Target Link Speed field,
> * request a retrain and check the result.
> *
> - * If this turns out successful and we know by the Vendor:Device ID it is
> - * safe to do so, then lift the restriction, letting the devices negotiate
> - * a higher speed. Also check for a similar 2.5GT/s speed restriction the
> - * firmware may have already arranged and lift it with ports that already
> - * report their data link being up.
> + * If this turns out successful, or where a 2.5GT/s speed restriction has
> + * been previously arranged by the firmware and the port reports its link
> + * already being up, lift the restriction, in a hope it is safe to do so,
> + * letting the devices negotiate a higher speed.
> *
> * Otherwise revert the speed to the original setting and request a retrain
> * again to remove any residual state, ignoring the result as it's supposed
> @@ -94,52 +93,39 @@ static bool pcie_lbms_seen(struct pci_de
> */
> int pcie_failed_link_retrain(struct pci_dev *dev)
> {
> - static const struct pci_device_id ids[] = {
> - { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
> - {}
> - };
> - u16 lnksta, lnkctl2;
> + u16 lnksta, lnkctl2, oldlnkctl2;
> int ret = -ENOTTY;
> + u32 lnkcap;
>
> if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
> !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> return ret;
>
> pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
> if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
> - u16 oldlnkctl2;
> -
> pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
>
> - pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
> ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
> - if (ret) {
> - pci_info(dev, "retraining failed\n");
> - pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
> - true);
> - return ret;
> - }
> -
> - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> + if (ret)
> + goto err;
> }
>
> pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
> -
> - if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
> - (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> - pci_match_id(ids, dev)) {
> - u32 lnkcap;
> -
> + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
I'm trying to recall, if there was some particular reason why
->supported_speeds couldn't be used in this function. It would avoid the
need to read LinkCap at all.
> pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
> - pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
> - if (ret) {
> - pci_info(dev, "retraining failed\n");
> - return ret;
> - }
> + if (ret)
> + goto err;
> }
>
> return ret;
return 0;
> +err:
> + pci_info(dev, "retraining failed\n");
> + pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2), true);
> + return ret;
> }
>
> static ktime_t fixup_debug_start(struct pci_dev *dev,
>
--
i.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-01 9:45 ` Ilpo Järvinen
@ 2025-12-01 13:55 ` Maciej W. Rozycki
2025-12-01 16:48 ` Ilpo Järvinen
2025-12-08 19:24 ` Maciej W. Rozycki
0 siblings, 2 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2025-12-01 13:55 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Bjorn Helgaas, Matthew W Carlis, ALOK TIWARI, ashishk, msaggi,
sconnor, Lukas Wunner, Jiwei, guojinhui.liam, ahuang12, sunjw10,
linux-pci, LKML
On Mon, 1 Dec 2025, Ilpo Järvinen wrote:
> > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
>
> I'm trying to recall, if there was some particular reason why
> ->supported_speeds couldn't be used in this function. It would avoid the
> need to read LinkCap at all.
Thanks for the hint. There's probably none and it's just me missing some
of the zillion bits and pieces. I'll wait a couple of days for any other
people to chime in and respin with this update included if everyone is
otherwise happy to proceed with this update.
> > + if (ret)
> > + goto err;
> > }
> >
> > return ret;
>
> return 0;
It can still return -ENOTTY if neither of the two latter conditionals
matched, meaning the quirk was not applicable after all. ISTR you had
issues with the structure of this code before; I am not sure if it can
be made any better in a reasonable way. It is not a failure per se, so
the newly-added common error path does not apply. This is the case for:
"Return an error if retraining was not needed[...]" from the introductory
comment.
Shall I add a comment above the return statement referring to this?
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-01 13:55 ` Maciej W. Rozycki
@ 2025-12-01 16:48 ` Ilpo Järvinen
2025-12-08 19:24 ` Maciej W. Rozycki
1 sibling, 0 replies; 22+ messages in thread
From: Ilpo Järvinen @ 2025-12-01 16:48 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Bjorn Helgaas, Matthew W Carlis, ALOK TIWARI, ashishk, msaggi,
sconnor, Lukas Wunner, Jiwei, guojinhui.liam, ahuang12, sunjw10,
linux-pci, LKML
[-- Attachment #1: Type: text/plain, Size: 1732 bytes --]
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> On Mon, 1 Dec 2025, Ilpo Järvinen wrote:
>
> > > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
> >
> > I'm trying to recall, if there was some particular reason why
> > ->supported_speeds couldn't be used in this function. It would avoid the
> > need to read LinkCap at all.
>
> Thanks for the hint. There's probably none and it's just me missing some
> of the zillion bits and pieces. I'll wait a couple of days for any other
> people to chime in and respin with this update included if everyone is
> otherwise happy to proceed with this update.
>
> > > + if (ret)
> > > + goto err;
> > > }
> > >
> > > return ret;
> >
> > return 0;
>
> It can still return -ENOTTY if neither of the two latter conditionals
> matched, meaning the quirk was not applicable after all. ISTR you had
> issues with the structure of this code before; I am not sure if it can
> be made any better in a reasonable way. It is not a failure per se, so
> the newly-added common error path does not apply. This is the case for:
> "Return an error if retraining was not needed[...]" from the introductory
> comment.
>
> Shall I add a comment above the return statement referring to this?
I think it's fine as is, I just didn't review with enough context to
notice what it was initialized to (the usual thing when adding a
rollback path is to forget to change the normal path to return 0, thus
"auto commenting" it without checking enough, I'm sorry about that).
--
i.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [External] : [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-01 3:52 [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Maciej W. Rozycki
2025-12-01 9:45 ` Ilpo Järvinen
@ 2025-12-02 13:49 ` ALOK TIWARI
2025-12-02 16:07 ` Ilpo Järvinen
2025-12-04 1:40 ` [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Matthew W Carlis
2025-12-04 18:30 ` [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Matthew W Carlis
3 siblings, 1 reply; 22+ messages in thread
From: ALOK TIWARI @ 2025-12-02 13:49 UTC (permalink / raw)
To: Maciej W. Rozycki, Bjorn Helgaas, Matthew W Carlis
Cc: ashishk, bamstadt, msaggi, sconnor, Lukas Wunner,
Ilpo Järvinen, Jiwei, guojinhui.liam, ahuang12, sunjw10,
linux-pci, linux-kernel
On 12/1/2025 9:22 AM, Maciej W. Rozycki wrote:
> Discard Vendor:Device ID matching in the PCIe failed link retraining
> quirk and ignore the link status for the removal of the 2.5GT/s speed
> clamp, whether applied by the quirk itself or the firmware earlier on.
> Revert to the original target link speed if this final link retraining
> has failed.
>
> This is so that link training noise in hot-plug scenarios does not make
> a link remain clamped to the 2.5GT/s speed where an event race has led
> the quirk to apply the speed clamp for one device, only to leave it in
> place for a subsequent device to be plugged in.
>
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Signed-off-by: Maciej W. Rozycki<macro@orcam.me.uk>
> Cc:<stable@vger.kernel.org> # v6.5+
> ---
> drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
> 1 file changed, 18 insertions(+), 32 deletions(-)
>
> linux-pcie-failed-link-retrain-unclamp-always.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
Thanks a lot for your patch.
The patch works, and the issue has been resolved in our testing.
However, this patch does not cleanly apply to the 6.12 LTS kernel.
To apply the fix cleanly, a series of patches is required.
PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2389d8dc38fee PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
15b8968dcb90f PCI/bwctrl: Fix NULL pointer deref on unbind and bind
e3f30d563a388 PCI: Make pci_destroy_dev() concurrent safe
667f053b05f00 PCI/bwctrl: Fix NULL pointer dereference on bus number
exhaustion
e93d9fcfd7dc6 PCI: Refactor pcie_update_link_speed()
9989e0ca7462c PCI: Fix link speed calculation on retrain failure
b85af48de3ece PCI: Adjust the position of reading the Link Control 2
register
026e4bffb0af9 PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type
d278b098282d1 thermal: Add PCIe cooling driver
de9a6c8d5dbfe PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed
665745f274870 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW
controller
3491f50966686 PCI: Abstract LBMS seen check into pcie_lbms_seen()
Could you please provide a version of this patch that can be
cleanly cherry-picked for the 6.12 LTS (6.12.y) branch?
Alternatively, is it okay to back-port the above patch series to 6.12.y?
Tested-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Thanks,
Alok
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [External] : [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-02 13:49 ` [External] : " ALOK TIWARI
@ 2025-12-02 16:07 ` Ilpo Järvinen
2025-12-03 19:01 ` Maciej W. Rozycki
0 siblings, 1 reply; 22+ messages in thread
From: Ilpo Järvinen @ 2025-12-02 16:07 UTC (permalink / raw)
To: ALOK TIWARI
Cc: Maciej W. Rozycki, Bjorn Helgaas, Matthew W Carlis, ashishk,
bamstadt, msaggi, sconnor, Lukas Wunner, Jiwei, guojinhui.liam,
ahuang12, sunjw10, linux-pci, LKML
On Tue, 2 Dec 2025, ALOK TIWARI wrote:
> On 12/1/2025 9:22 AM, Maciej W. Rozycki wrote:
> > Discard Vendor:Device ID matching in the PCIe failed link retraining
> > quirk and ignore the link status for the removal of the 2.5GT/s speed
> > clamp, whether applied by the quirk itself or the firmware earlier on.
> > Revert to the original target link speed if this final link retraining
> > has failed.
> >
> > This is so that link training noise in hot-plug scenarios does not make
> > a link remain clamped to the 2.5GT/s speed where an event race has led
> > the quirk to apply the speed clamp for one device, only to leave it in
> > place for a subsequent device to be plugged in.
> >
> > Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> > Signed-off-by: Maciej W. Rozycki<macro@orcam.me.uk>
> > Cc:<stable@vger.kernel.org> # v6.5+
> > ---
> > drivers/pci/quirks.c | 50
> > ++++++++++++++++++--------------------------------
> > 1 file changed, 18 insertions(+), 32 deletions(-)
> >
> > linux-pcie-failed-link-retrain-unclamp-always.diff
> > Index: linux-macro/drivers/pci/quirks.c
> > ===================================================================
> > --- linux-macro.orig/drivers/pci/quirks.c
> > +++ linux-macro/drivers/pci/quirks.c
>
> Thanks a lot for your patch.
> The patch works, and the issue has been resolved in our testing.
>
> However, this patch does not cleanly apply to the 6.12 LTS kernel.
> To apply the fix cleanly, a series of patches is required.
>
> PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
> 2389d8dc38fee PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
> 15b8968dcb90f PCI/bwctrl: Fix NULL pointer deref on unbind and bind
> e3f30d563a388 PCI: Make pci_destroy_dev() concurrent safe
> 667f053b05f00 PCI/bwctrl: Fix NULL pointer dereference on bus number
> exhaustion
> e93d9fcfd7dc6 PCI: Refactor pcie_update_link_speed()
> 9989e0ca7462c PCI: Fix link speed calculation on retrain failure
> b85af48de3ece PCI: Adjust the position of reading the Link Control 2 register
> 026e4bffb0af9 PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type
> d278b098282d1 thermal: Add PCIe cooling driver
> de9a6c8d5dbfe PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed
> 665745f274870 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
> 3491f50966686 PCI: Abstract LBMS seen check into pcie_lbms_seen()
>
> Could you please provide a version of this patch that can be
> cleanly cherry-picked for the 6.12 LTS (6.12.y) branch?
>
> Alternatively, is it okay to back-port the above patch series to 6.12.y?
>
>
> Tested-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Thanks for testing.
As the change has a Fixes tag, stable maintainers will eventually get to
it (after the change has progressed into Linus' tree).
As per usual, if the patch won't apply to some old kernel version cleanly,
stable maintainers will decide themselves whether they end up taking some
extra changes or ask for a backport from the submitter of the original
change.
--
i.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [External] : [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-02 16:07 ` Ilpo Järvinen
@ 2025-12-03 19:01 ` Maciej W. Rozycki
2025-12-08 19:25 ` Maciej W. Rozycki
0 siblings, 1 reply; 22+ messages in thread
From: Maciej W. Rozycki @ 2025-12-03 19:01 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: ALOK TIWARI, Bjorn Helgaas, Matthew W Carlis, ashishk, msaggi,
sconnor, Lukas Wunner, Jiwei, guojinhui.liam, ahuang12, sunjw10,
linux-pci, LKML
On Tue, 2 Dec 2025, Ilpo Järvinen wrote:
> > > linux-pcie-failed-link-retrain-unclamp-always.diff
> > > Index: linux-macro/drivers/pci/quirks.c
> > > ===================================================================
> > > --- linux-macro.orig/drivers/pci/quirks.c
> > > +++ linux-macro/drivers/pci/quirks.c
> >
> > Thanks a lot for your patch.
> > The patch works, and the issue has been resolved in our testing.
Great, thanks for running the verification!
> > However, this patch does not cleanly apply to the 6.12 LTS kernel.
> > To apply the fix cleanly, a series of patches is required.
> >
> > PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
> > 2389d8dc38fee PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
> > 15b8968dcb90f PCI/bwctrl: Fix NULL pointer deref on unbind and bind
> > e3f30d563a388 PCI: Make pci_destroy_dev() concurrent safe
> > 667f053b05f00 PCI/bwctrl: Fix NULL pointer dereference on bus number
> > exhaustion
> > e93d9fcfd7dc6 PCI: Refactor pcie_update_link_speed()
> > 9989e0ca7462c PCI: Fix link speed calculation on retrain failure
> > b85af48de3ece PCI: Adjust the position of reading the Link Control 2 register
> > 026e4bffb0af9 PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type
> > d278b098282d1 thermal: Add PCIe cooling driver
> > de9a6c8d5dbfe PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed
> > 665745f274870 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
> > 3491f50966686 PCI: Abstract LBMS seen check into pcie_lbms_seen()
Thank you for identifying the dependencies. Clearly the majority of the
changes is not functionally related to this fix at all and any need for
them is purely mechanical.
> As the change has a Fixes tag, stable maintainers will eventually get to
> it (after the change has progressed into Linus' tree).
>
> As per usual, if the patch won't apply to some old kernel version cleanly,
> stable maintainers will decide themselves whether they end up taking some
> extra changes or ask for a backport from the submitter of the original
> change.
Dependencies can be listed if need be along with a backport request.
However in this case there have been lots of syntactic updates to this
function, which do not necessarily qualify for backporting, as that is
supposed to include critical fixes only. Certainly changes to prevent
systems from breaking do qualify, but helper macros and functions, or code
restructuring do not.
Since I'm going to wait for further feedback and respin anyway, I will
check if there are critical dependencies required that are missing on the
stable branches and list any along with the backport request. Then any
remaining syntactic sugar can, as you say, be handled on a case by case
basis in the backport process.
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing
2025-12-01 3:52 [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Maciej W. Rozycki
2025-12-01 9:45 ` Ilpo Järvinen
2025-12-02 13:49 ` [External] : " ALOK TIWARI
@ 2025-12-04 1:40 ` Matthew W Carlis
2025-12-04 23:43 ` Maciej W. Rozycki
2025-12-04 18:30 ` [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Matthew W Carlis
3 siblings, 1 reply; 22+ messages in thread
From: Matthew W Carlis @ 2025-12-04 1:40 UTC (permalink / raw)
To: macro
Cc: ahuang12, alok.a.tiwari, ashishk, bamstadt, bhelgaas,
guojinhui.liam, ilpo.jarvinen, jiwei.sun.bj, linux-kernel,
linux-pci, lukas, mattc, msaggi, sconnor, sunjw10
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> Discard Vendor:Device ID matching in the PCIe failed link retraining
> quirk and ignore the link status for the removal of the 2.5GT/s speed
> clamp, whether applied by the quirk itself or the firmware earlier on.
> Revert to the original target link speed if this final link retraining
> has failed.
I think we should either remove the quirk or only execute the quirk when the
downstream port is the specific ASMedia 0x2824. Hardware companies that
develop PCIe devices rely on the linux kernel for a significant amount of
their testing & the action taken by this quirk is going to introduce
noise into those tests by initiating unexpected speed changes etc.
As long as we have this quirk messing with link speeds we'll just
continue to see issue reports over time in my opinion.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-01 3:52 [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Maciej W. Rozycki
` (2 preceding siblings ...)
2025-12-04 1:40 ` [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Matthew W Carlis
@ 2025-12-04 18:30 ` Matthew W Carlis
2025-12-08 19:25 ` Maciej W. Rozycki
3 siblings, 1 reply; 22+ messages in thread
From: Matthew W Carlis @ 2025-12-04 18:30 UTC (permalink / raw)
To: macro
Cc: ahuang12, alok.a.tiwari, ashishk, bamstadt, bhelgaas,
guojinhui.liam, ilpo.jarvinen, jiwei.sun.bj, linux-kernel,
linux-pci, lukas, mattc, msaggi, sconnor, sunjw10
I'm sorry my last response kind of messed up the threading in this chain...
I don't understand why we still think its a good idea to have this action
in the kernel for any device type since it seems to only help Maciej W. Rozycki's
specific situation which is very likely to be the only one of its kind. In
addition the Delock adapter isn't what I would consider a particularly
"solid" device.
For what its worth I have a particular Gen5 device in my lab here that gets stuck
in an infinite link up-down loop when you force the speed to Gen2 when also combined
with a specific downstream port... I'm sure there is another device somewhere else
in the world that has the same issue when you force it to Gen1.
The kernel should assume a PCIe link will automatically train to the best
speed that the devices can achieve & if the link fails to train then it should
be up to the user space to decide what to do with it in my opinion.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing
2025-12-04 1:40 ` [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Matthew W Carlis
@ 2025-12-04 23:43 ` Maciej W. Rozycki
0 siblings, 0 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2025-12-04 23:43 UTC (permalink / raw)
To: Matthew W Carlis
Cc: ahuang12, alok.a.tiwari, ashishk, Bjorn Helgaas, guojinhui.liam,
Ilpo Järvinen, jiwei.sun.bj, linux-kernel, linux-pci,
Lukas Wunner, msaggi, sconnor, sunjw10
On Wed, 3 Dec 2025, Matthew W Carlis wrote:
> > Discard Vendor:Device ID matching in the PCIe failed link retraining
> > quirk and ignore the link status for the removal of the 2.5GT/s speed
> > clamp, whether applied by the quirk itself or the firmware earlier on.
> > Revert to the original target link speed if this final link retraining
> > has failed.
>
> I think we should either remove the quirk or only execute the quirk when the
> downstream port is the specific ASMedia 0x2824. Hardware companies that
> develop PCIe devices rely on the linux kernel for a significant amount of
> their testing & the action taken by this quirk is going to introduce
> noise into those tests by initiating unexpected speed changes etc.
Conversely, ISTM this could be good motivation for hardware designers to
reduce hot-plug noise. After all LBMS is only supposed to be ever set for
links in the active state and not while training, so perhaps debouncing is
needed for the transient state?
> As long as we have this quirk messing with link speeds we'll just
> continue to see issue reports over time in my opinion.
Well, the issues happened because I made an unfortunate design decision
with the original implementation which did not clean up after itself, just
because I have no use for hot-plug scenarios and didn't envisage it could
be causing issues.
The most recent update brings the device back to its original state
whether retraining succeeded or not, so it should now be transparent to
your noisy hot-plug scenarios, while helping stubborn devices at the same
time. You might have not noticed this code existed if it had been in its
currently proposed shape right from the beginning.
It's only those who do nothing that make no mistakes.
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-01 13:55 ` Maciej W. Rozycki
2025-12-01 16:48 ` Ilpo Järvinen
@ 2025-12-08 19:24 ` Maciej W. Rozycki
1 sibling, 0 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2025-12-08 19:24 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Bjorn Helgaas, Matthew W Carlis, ALOK TIWARI, ashishk, msaggi,
sconnor, Lukas Wunner, Jiwei, guojinhui.liam, ahuang12, sunjw10,
linux-pci, LKML
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> > > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
> >
> > I'm trying to recall, if there was some particular reason why
> > ->supported_speeds couldn't be used in this function. It would avoid the
> > need to read LinkCap at all.
>
> Thanks for the hint. There's probably none and it's just me missing some
> of the zillion bits and pieces. I'll wait a couple of days for any other
> people to chime in and respin with this update included if everyone is
> otherwise happy to proceed with this update.
I take it no further feedback will be gathered, so I've sent v2 now, but
I've figured out backporting v1 as it is will result in less intrusion to
the trunk commit, so I have only made a change to use `->supported_speeds'
a follow-up patch in a series. Please let me know if this works for you.
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [External] : [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-03 19:01 ` Maciej W. Rozycki
@ 2025-12-08 19:25 ` Maciej W. Rozycki
0 siblings, 0 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2025-12-08 19:25 UTC (permalink / raw)
To: Ilpo Järvinen, ALOK TIWARI
Cc: Bjorn Helgaas, Matthew W Carlis, ashishk, msaggi, sconnor,
Lukas Wunner, Jiwei, guojinhui.liam, ahuang12, sunjw10, linux-pci,
LKML
On Wed, 3 Dec 2025, Maciej W. Rozycki wrote:
> Since I'm going to wait for further feedback and respin anyway, I will
> check if there are critical dependencies required that are missing on the
> stable branches and list any along with the backport request. Then any
> remaining syntactic sugar can, as you say, be handled on a case by case
> basis in the backport process.
I've gone through the relevant active stable branches and there's no fix
required to be additionally backported to either 6.6 or 6.12. There are
only mechanical updates needed, which I've included in the change below,
and which works for me with 6.12. Also nothing extra is needed for 6.17
as the code is the same as in the trunk.
Maciej
---
drivers/pci/quirks.c | 53 ++++++++++++++++++++-------------------------------
1 file changed, 21 insertions(+), 32 deletions(-)
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -68,11 +68,10 @@
* Restrict the speed to 2.5GT/s then with the Target Link Speed field,
* request a retrain and check the result.
*
- * If this turns out successful and we know by the Vendor:Device ID it is
- * safe to do so, then lift the restriction, letting the devices negotiate
- * a higher speed. Also check for a similar 2.5GT/s speed restriction the
- * firmware may have already arranged and lift it with ports that already
- * report their data link being up.
+ * If this turns out successful, or where a 2.5GT/s speed restriction has
+ * been previously arranged by the firmware and the port reports its link
+ * already being up, lift the restriction, in a hope it is safe to do so,
+ * letting the devices negotiate a higher speed.
*
* Otherwise revert the speed to the original setting and request a retrain
* again to remove any residual state, ignoring the result as it's supposed
@@ -83,23 +82,19 @@
*/
int pcie_failed_link_retrain(struct pci_dev *dev)
{
- static const struct pci_device_id ids[] = {
- { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
- {}
- };
- u16 lnksta, lnkctl2;
+ u16 lnksta, lnkctl2, oldlnkctl2;
int ret = -ENOTTY;
+ u32 lnkcap;
if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
!pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
return ret;
- pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
+ lnkctl2 = oldlnkctl2;
if ((lnksta & (PCI_EXP_LNKSTA_LBMS | PCI_EXP_LNKSTA_DLLLA)) ==
PCI_EXP_LNKSTA_LBMS) {
- u16 oldlnkctl2 = lnkctl2;
-
pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
@@ -107,36 +102,30 @@ int pcie_failed_link_retrain(struct pci_
pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
ret = pcie_retrain_link(dev, false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- pcie_capability_write_word(dev, PCI_EXP_LNKCTL2,
- oldlnkctl2);
- pcie_retrain_link(dev, true);
- return ret;
- }
-
- pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ if (ret)
+ goto err;
}
- if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
- (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
- pci_match_id(ids, dev)) {
- u32 lnkcap;
-
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
+ (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
- pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+
lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
ret = pcie_retrain_link(dev, false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- return ret;
- }
+ if (ret)
+ goto err;
}
return ret;
+err:
+ pci_info(dev, "retraining failed\n");
+ pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, oldlnkctl2);
+ pcie_retrain_link(dev, true);
+ return ret;
}
static ktime_t fixup_debug_start(struct pci_dev *dev,
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2025-12-04 18:30 ` [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Matthew W Carlis
@ 2025-12-08 19:25 ` Maciej W. Rozycki
0 siblings, 0 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2025-12-08 19:25 UTC (permalink / raw)
To: Matthew W Carlis
Cc: ahuang12, alok.a.tiwari, ashishk, Bjorn Helgaas, guojinhui.liam,
Ilpo Järvinen, jiwei.sun.bj, linux-kernel, linux-pci,
Lukas Wunner, msaggi, sconnor, sunjw10
On Thu, 4 Dec 2025, Matthew W Carlis wrote:
> For what its worth I have a particular Gen5 device in my lab here that gets stuck
> in an infinite link up-down loop when you force the speed to Gen2 when also combined
> with a specific downstream port... I'm sure there is another device somewhere else
> in the world that has the same issue when you force it to Gen1.
Well, it's an erratum vs another erratum case. Then should such a device
at its maximum link speed trigger the workaround somehow, with this fix in
place any temporary clamp will be lifted anyway and the link retrained, so
it will recover from the link up-down loop. So no functional regression.
> The kernel should assume a PCIe link will automatically train to the best
> speed that the devices can achieve & if the link fails to train then it should
> be up to the user space to decide what to do with it in my opinion.
It might be tough where you have your rootfs device down the problematic
link.
Thank you for your input.
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-19 21:26 [PATCH v2 0/3] " Bjorn Helgaas
@ 2026-02-19 22:09 ` Matthew W Carlis
2026-02-19 22:53 ` Bjorn Helgaas
0 siblings, 1 reply; 22+ messages in thread
From: Matthew W Carlis @ 2026-02-19 22:09 UTC (permalink / raw)
To: helgaas
Cc: ahuang12, alok.a.tiwari, ashishk, bhelgaas, guojinhui.liam,
ilpo.jarvinen, jiwei.sun.bj, linux-kernel, linux-pci, lukas,
macro, mattc, msaggi, sconnor, sunjw10
On Thu, 19 Feb 2026, Bjorn Helgaas wrote:
> Applied to pci/enumeration for v7.1, thanks. This will be rebased
> after v7.0-rc1.
We're heading down a path here where we keep working around the work-around &
now have decided that the kernel will be meddling with the link on potentially
all of the PCIe devices in the world. The trade off that we're making here to
accommodate a device specific interaction doesn't seem like the right direction.
Can we reconsider my patch that restricts the link retrain mechanism
to the specific device that created the work-around?
https://lore.kernel.org/all/20250702052430.13716-1-mattc@purestorage.com/
How do we decide which device to fix and which devices to break?
- Matt
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-19 22:09 ` [PATCH] " Matthew W Carlis
@ 2026-02-19 22:53 ` Bjorn Helgaas
2026-02-20 12:03 ` Maciej W. Rozycki
0 siblings, 1 reply; 22+ messages in thread
From: Bjorn Helgaas @ 2026-02-19 22:53 UTC (permalink / raw)
To: Matthew W Carlis
Cc: ahuang12, alok.a.tiwari, ashishk, bhelgaas, guojinhui.liam,
ilpo.jarvinen, jiwei.sun.bj, linux-kernel, linux-pci, lukas,
macro, msaggi, sconnor, sunjw10
On Thu, Feb 19, 2026 at 03:09:59PM -0700, Matthew W Carlis wrote:
> On Thu, 19 Feb 2026, Bjorn Helgaas wrote:
>
> > Applied to pci/enumeration for v7.1, thanks. This will be rebased
> > after v7.0-rc1.
>
> We're heading down a path here where we keep working around the
> work-around & now have decided that the kernel will be meddling with
> the link on potentially all of the PCIe devices in the world. The
> trade off that we're making here to accommodate a device specific
> interaction doesn't seem like the right direction.
>
> Can we reconsider my patch that restricts the link retrain mechanism
> to the specific device that created the work-around?
> https://lore.kernel.org/all/20250702052430.13716-1-mattc@purestorage.com/
I think we already at least potentially meddle with the link on every
device, and it definitely makes me nervous. I would like it much
better if it's possible to limit it to devices with known defects.
I'll defer these for now and we can see if a consensus emerges.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-19 22:53 ` Bjorn Helgaas
@ 2026-02-20 12:03 ` Maciej W. Rozycki
2026-02-23 17:36 ` Bjorn Helgaas
0 siblings, 1 reply; 22+ messages in thread
From: Maciej W. Rozycki @ 2026-02-20 12:03 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Matthew W Carlis, ahuang12, alok.a.tiwari, ashishk, Bjorn Helgaas,
guojinhui.liam, Ilpo Järvinen, jiwei.sun.bj, linux-kernel,
linux-pci, Lukas Wunner, msaggi, sconnor, sunjw10
On Thu, 19 Feb 2026, Bjorn Helgaas wrote:
> > Can we reconsider my patch that restricts the link retrain mechanism
> > to the specific device that created the work-around?
> > https://lore.kernel.org/all/20250702052430.13716-1-mattc@purestorage.com/
>
> I think we already at least potentially meddle with the link on every
> device, and it definitely makes me nervous. I would like it much
> better if it's possible to limit it to devices with known defects.
>
> I'll defer these for now and we can see if a consensus emerges.
As I say it's logically impossible to figure out whether or not to apply
such a workaround where the culprit is the downstream device, because
until you've succeeded establishing a link you have no way to figure out
what the downstream device actually is.
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-20 12:03 ` Maciej W. Rozycki
@ 2026-02-23 17:36 ` Bjorn Helgaas
2026-02-23 22:49 ` Matthew W Carlis
2026-02-23 23:14 ` Maciej W. Rozycki
0 siblings, 2 replies; 22+ messages in thread
From: Bjorn Helgaas @ 2026-02-23 17:36 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Matthew W Carlis, ahuang12, alok.a.tiwari, ashishk, Bjorn Helgaas,
guojinhui.liam, Ilpo Järvinen, jiwei.sun.bj, linux-kernel,
linux-pci, Lukas Wunner, msaggi, sconnor, sunjw10
On Fri, Feb 20, 2026 at 12:03:17PM +0000, Maciej W. Rozycki wrote:
> On Thu, 19 Feb 2026, Bjorn Helgaas wrote:
>
> > > Can we reconsider my patch that restricts the link retrain mechanism
> > > to the specific device that created the work-around?
> > > https://lore.kernel.org/all/20250702052430.13716-1-mattc@purestorage.com/
> >
> > I think we already at least potentially meddle with the link on every
> > device, and it definitely makes me nervous. I would like it much
> > better if it's possible to limit it to devices with known defects.
> >
> > I'll defer these for now and we can see if a consensus emerges.
>
> As I say it's logically impossible to figure out whether or not to
> apply such a workaround where the culprit is the downstream device,
> because until you've succeeded establishing a link you have no way
> to figure out what the downstream device actually is.
IIUC Matthew [1] and Alok [2] have reported issues that only happen
when we run pcie_failed_link_retrain(). The issues seem to be with
NVMe devices, but I don't see a root cause or a solution (other than
skipping pcie_failed_link_retrain()).
[1] https://lore.kernel.org/all/20250702052430.13716-2-mattc@purestorage.com/
[2] https://lore.kernel.org/all/c296df33-f9c0-42f7-8add-6966d89d00c4@oracle.com/
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-23 17:36 ` Bjorn Helgaas
@ 2026-02-23 22:49 ` Matthew W Carlis
2026-02-23 23:14 ` Maciej W. Rozycki
1 sibling, 0 replies; 22+ messages in thread
From: Matthew W Carlis @ 2026-02-23 22:49 UTC (permalink / raw)
To: helgaas
Cc: ahuang12, alok.a.tiwari, ashishk, bhelgaas, guojinhui.liam,
ilpo.jarvinen, jiwei.sun.bj, linux-kernel, linux-pci, lukas,
macro, mattc, msaggi, sconnor, sunjw10
I wonder if the compromise here isn't adding a new kind of
DECLARE_PCI_FIXUP_<LINKFAIL?> & putting this quirk behind it?
On Thu, 19 Feb 2026 16:53:22 -0600, Bjorn Helgaas wrote:
> I would like it much better if it's possible to limit it to
> devices with known defects.
On Fri, 20 Feb 2026 12:03:17 +0000, Maciej W. Rozycki wrote:
> As I say it's logically impossible to figure out whether or not to apply
> such a workaround where the culprit is the downstream device
I don't think we're looking at an impossible decision to make here in terms
of whether to apply the quirk. It makes the most sense in my mind to restrict
the quirk to that Asmedia device which was upstream of the pericom switch in
the initial bug report iirc.
1) There was never a root cause so we can't say that this is in fact an issue
with the pericom switch... It could be the ASmedia switch causing the problem..
2) Even if it is a bug in the pericom, applying to only the ASmedia switch limits the
blast radius of the retrain action. It becomes unlikely that we ever see any
reports of issues from large scale users (hyper scalers, server vendors etc).
On Mon, 23 Feb 2026 11:36:03 -0600, Bjorn Helgaas wrote:
> IIUC Matthew [1] and Alok [2] have reported issues that only happen
> when we run pcie_failed_link_retrain(). The issues seem to be with
> NVMe devices, but I don't see a root cause or a solution (other than
> skipping pcie_failed_link_retrain()).
I don't think the issue is really specific to NVMe devices realistically what
is happening is that NVMe devices happen to be hot-plug'ed way more often than
any other kind of PCIe device. Vendors who design devices/systems for hot-plug
will also do extensive hot-plug testing & due to limitations of the kernel's
heuristics, invoke the quirk when it is not desirable to do so.
-Matt
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-23 17:36 ` Bjorn Helgaas
2026-02-23 22:49 ` Matthew W Carlis
@ 2026-02-23 23:14 ` Maciej W. Rozycki
2026-02-25 1:41 ` Matthew W Carlis
1 sibling, 1 reply; 22+ messages in thread
From: Maciej W. Rozycki @ 2026-02-23 23:14 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Matthew W Carlis, ahuang12, alok.a.tiwari, ashishk, Bjorn Helgaas,
guojinhui.liam, Ilpo Järvinen, jiwei.sun.bj, linux-kernel,
linux-pci, Lukas Wunner, msaggi, sconnor, sunjw10
On Mon, 23 Feb 2026, Bjorn Helgaas wrote:
> > > > Can we reconsider my patch that restricts the link retrain mechanism
> > > > to the specific device that created the work-around?
> > > > https://lore.kernel.org/all/20250702052430.13716-1-mattc@purestorage.com/
> > >
> > > I think we already at least potentially meddle with the link on every
> > > device, and it definitely makes me nervous. I would like it much
> > > better if it's possible to limit it to devices with known defects.
> > >
> > > I'll defer these for now and we can see if a consensus emerges.
> >
> > As I say it's logically impossible to figure out whether or not to
> > apply such a workaround where the culprit is the downstream device,
> > because until you've succeeded establishing a link you have no way
> > to figure out what the downstream device actually is.
>
> IIUC Matthew [1] and Alok [2] have reported issues that only happen
> when we run pcie_failed_link_retrain(). The issues seem to be with
> NVMe devices, but I don't see a root cause or a solution (other than
> skipping pcie_failed_link_retrain()).
I argue that by applying this change the issues with NVMe hot-plug will
be sorted while keeping the configuration working that
pcie_failed_link_retrain() is needed for. Win-win.
I note that active links are unaffected, so to say it's meddling with the
link on every device is I think a bit of an overstatement, and reports of
issues are from a few people only, of which Matthew insists on dropping
the pcie_failed_link_retrain() stuff while Alok is happy with the solution
proposed.
Shall we try this change and drop the whole pcie_failed_link_retrain()
stuff after all if it turns out to cause issues yet again?
What outcome would you envisage had I taken the approach from this update
right away with the original change? My only fault was I have no use(*)
for PCIe hot-plug and did not predict the impact there. But no one can
predict everything and I think dropping a solution rather than fixing it
just because it wasn't perfect right from the beginning would be unfair.
As to the root cause, I suppose it's hw people who see the problem in the
lab could say more.
I can only guess LBMS is set in the course of device removal, perhaps due
to contacts bouncing. It's not clear to me if this is compliant even, as
the spec is explicit that LBMS is only allowed to be set if the port has
not transitioned through the DL_Down status, which the loss of electrical
connection (circuit open) would seem (?) to imply.
Then again at x4 width which NVMe tends to imply the lanes may disconnect
at different times each, so the loss of connection on one lane would not
imply the loss of link, so the setting of LBMS could be legitimate before
the device has gone for good.
At insertion the link has to get running stable at 2.5GT/s first before
switching to a higher speed, so even more so the LBMS is not supposed to
be set.
As I say it's only a guess and I may be missing something. I can't claim
any familiarity with the PCIe physical layer, though I can take some time
and study it.
In any case the unconditional unclamping and retraining actions proposed
are expected to clean up the state, whether compliant or not.
(*) Except for ExpressCard in my laptop, but that's another story.
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-23 23:14 ` Maciej W. Rozycki
@ 2026-02-25 1:41 ` Matthew W Carlis
2026-02-26 22:02 ` Maciej W. Rozycki
0 siblings, 1 reply; 22+ messages in thread
From: Matthew W Carlis @ 2026-02-25 1:41 UTC (permalink / raw)
To: macro
Cc: ahuang12, alok.a.tiwari, ashishk, bhelgaas, guojinhui.liam,
helgaas, ilpo.jarvinen, jiwei.sun.bj, linux-kernel, linux-pci,
lukas, mattc, msaggi, sconnor, sunjw10
On Mon, 23 Feb 2026 23:14:37 +0000, Maciej W. Rozycki wrote:
> I argue that by applying this change the issues with NVMe hot-plug will
> be sorted while keeping the configuration working that
> pcie_failed_link_retrain() is needed for. Win-win.
I don't think that what you are saying is true there is invariably going to be
some other consequence of this change.. Its hard to believe there can be any
changes to the pci drivers that won't break something.
> I note that active links are unaffected, so to say it's meddling with the
> link on every device is I think a bit of an overstatement, and reports of
> issues are from a few people only...
There is no discrimination about which device it can be invoked on..
I'm looking at a fleet of millions of hot-plug'able devices.... I don't really
know if it matters how many people report an issue, I think what probably
matters is making the right change. Initially was there any other reports
of the quirk helping with other devices besides the delock 41433?
> What outcome would you envisage had I taken the approach from this update
> right away with the original change? My only fault was I have no use(*)
> for PCIe hot-plug and did not predict the impact there.
What I'm seeing now is an overall confusion about whether a link failed to train
to gen 1 or was recovered by the quirk or recovered on its own etc... In my systems
I would prefer to NEVER invoke the quirk under any circumstances because I expect
my devices to work. With the quirk it becomes more unclear about what the cause
of a link issue might have been or whether it was even a real link issue in the
first place or some weird timing..
-Matt
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-25 1:41 ` Matthew W Carlis
@ 2026-02-26 22:02 ` Maciej W. Rozycki
2026-03-26 19:52 ` ALOK TIWARI
0 siblings, 1 reply; 22+ messages in thread
From: Maciej W. Rozycki @ 2026-02-26 22:02 UTC (permalink / raw)
To: Matthew W Carlis
Cc: ahuang12, alok.a.tiwari, ashishk, Bjorn Helgaas, guojinhui.liam,
Bjorn Helgaas, Ilpo Järvinen, jiwei.sun.bj, linux-kernel,
linux-pci, Lukas Wunner, msaggi, sconnor, sunjw10
On Tue, 24 Feb 2026, Matthew W Carlis wrote:
> > I argue that by applying this change the issues with NVMe hot-plug will
> > be sorted while keeping the configuration working that
> > pcie_failed_link_retrain() is needed for. Win-win.
>
> I don't think that what you are saying is true there is invariably going to be
> some other consequence of this change.. Its hard to believe there can be any
> changes to the pci drivers that won't break something.
You're being sarcastic, aren't you?
While I sympathise with your feeling, may I pretty please ask you to at
the very least give my fix a try in your test environment?
> > I note that active links are unaffected, so to say it's meddling with the
> > link on every device is I think a bit of an overstatement, and reports of
> > issues are from a few people only...
>
> There is no discrimination about which device it can be invoked on..
> I'm looking at a fleet of millions of hot-plug'able devices.... I don't really
> know if it matters how many people report an issue, I think what probably
> matters is making the right change. Initially was there any other reports
> of the quirk helping with other devices besides the delock 41433?
No reports that I know of. Please bear in mind that the failure mode is
such that you need enough knowledge of PCIe internals and the spec to
actually realise there is periodic link training activity taking place.
In the absence of the quirk for the average user there's just no
communication, as with a dead downstream device (and the upstream device
is sound as anything else plugged in, including but not limited to NVMe
storage, works just fine). In the presence of the quirk the downstream
device just works and I expect hardly anyone can be bothered to report
seeing "broken device, retraining non-functional downstream link at
2.5GT/s" in the log. It's only cases like yours that bring attention to
the message.
> > What outcome would you envisage had I taken the approach from this update
> > right away with the original change? My only fault was I have no use(*)
> > for PCIe hot-plug and did not predict the impact there.
>
> What I'm seeing now is an overall confusion about whether a link failed to train
> to gen 1 or was recovered by the quirk or recovered on its own etc... In my systems
> I would prefer to NEVER invoke the quirk under any circumstances because I expect
> my devices to work. With the quirk it becomes more unclear about what the cause
> of a link issue might have been or whether it was even a real link issue in the
> first place or some weird timing..
I can see your point.
However from your description I infer this is about a test environment, a
development lab so to speak. And you are a highly skilled professional
who has access to measurement, test, and hardware debug equipment, and are
therefore able to figure out stuff. Conversely, the vast majority of
Linux deployments is in the field, where no sophisticated equipment is
available and the operator, if any, may have basic technical skills only.
I have been taught that in the field it is more desirable for equipment
to operate according to expectations rather than to strictly follow the
relevant specifications and consequently fail operating. And the quirk I
have come up with just follows this principle, letting unqualified people
use their equipment (this is similar to Postel's law if you know what I
mean).
I realise that in the lab you want strict compliance as this will verify
interoperation of the devices you design.
So I think we have conflicting objectives here and I can only offer a
sysfs setting that will switch between the modes according to the specific
user's needs, as the intent is not something the kernel can figure out by
itself.
Please mind however that throughout this week and the next I'm away on
holiday (a proper one, as in alpine skiing), so my availability to respond
or work on stuff is limited. I'll appreciate if you give my fix a try
meanwhile.
Maciej
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2026-02-26 22:02 ` Maciej W. Rozycki
@ 2026-03-26 19:52 ` ALOK TIWARI
0 siblings, 0 replies; 22+ messages in thread
From: ALOK TIWARI @ 2026-03-26 19:52 UTC (permalink / raw)
To: Maciej W. Rozycki, Matthew W Carlis
Cc: ahuang12, ashishk, Bjorn Helgaas, guojinhui.liam, Bjorn Helgaas,
Ilpo Järvinen, jiwei.sun.bj, linux-kernel, linux-pci,
Lukas Wunner, msaggi, sconnor, sunjw10
On 2/27/2026 3:32 AM, Maciej W. Rozycki wrote:
> On Tue, 24 Feb 2026, Matthew W Carlis wrote:
>
>>> I argue that by applying this change the issues with NVMe hot-plug will
>>> be sorted while keeping the configuration working that
>>> pcie_failed_link_retrain() is needed for. Win-win.
>>
>> I don't think that what you are saying is true there is invariably going to be
>> some other consequence of this change.. Its hard to believe there can be any
>> changes to the pci drivers that won't break something.
>
> You're being sarcastic, aren't you?
>
> While I sympathise with your feeling, may I pretty please ask you to at
> the very least give my fix a try in your test environment?
>
>>> I note that active links are unaffected, so to say it's meddling with the
>>> link on every device is I think a bit of an overstatement, and reports of
>>> issues are from a few people only...
>>
>> There is no discrimination about which device it can be invoked on..
>> I'm looking at a fleet of millions of hot-plug'able devices.... I don't really
>> know if it matters how many people report an issue, I think what probably
>> matters is making the right change. Initially was there any other reports
>> of the quirk helping with other devices besides the delock 41433?
>
> No reports that I know of. Please bear in mind that the failure mode is
> such that you need enough knowledge of PCIe internals and the spec to
> actually realise there is periodic link training activity taking place.
>
> In the absence of the quirk for the average user there's just no
> communication, as with a dead downstream device (and the upstream device
> is sound as anything else plugged in, including but not limited to NVMe
> storage, works just fine). In the presence of the quirk the downstream
> device just works and I expect hardly anyone can be bothered to report
> seeing "broken device, retraining non-functional downstream link at
> 2.5GT/s" in the log. It's only cases like yours that bring attention to
> the message.
>
>>> What outcome would you envisage had I taken the approach from this update
>>> right away with the original change? My only fault was I have no use(*)
>>> for PCIe hot-plug and did not predict the impact there.
>>
>> What I'm seeing now is an overall confusion about whether a link failed to train
>> to gen 1 or was recovered by the quirk or recovered on its own etc... In my systems
>> I would prefer to NEVER invoke the quirk under any circumstances because I expect
>> my devices to work. With the quirk it becomes more unclear about what the cause
>> of a link issue might have been or whether it was even a real link issue in the
>> first place or some weird timing..
>
> I can see your point.
>
> However from your description I infer this is about a test environment, a
> development lab so to speak. And you are a highly skilled professional
> who has access to measurement, test, and hardware debug equipment, and are
> therefore able to figure out stuff. Conversely, the vast majority of
> Linux deployments is in the field, where no sophisticated equipment is
> available and the operator, if any, may have basic technical skills only.
>
> I have been taught that in the field it is more desirable for equipment
> to operate according to expectations rather than to strictly follow the
> relevant specifications and consequently fail operating. And the quirk I
> have come up with just follows this principle, letting unqualified people
> use their equipment (this is similar to Postel's law if you know what I
> mean).
>
> I realise that in the lab you want strict compliance as this will verify
> interoperation of the devices you design.
>
> So I think we have conflicting objectives here and I can only offer a
> sysfs setting that will switch between the modes according to the specific
> user's needs, as the intent is not something the kernel can figure out by
> itself.
>
> Please mind however that throughout this week and the next I'm away on
> holiday (a proper one, as in alpine skiing), so my availability to respond
> or work on stuff is limited. I'll appreciate if you give my fix a try
> meanwhile.
>
> Maciej
From my perspective, the current patch looks like a positive step and
seems to address the NVMe hot-plug concerns without regressing the
original use case.
That said, I’d really appreciate input from other maintainers on whether
this approach strikes the right balance between field robustness,
Given the concerns around hot-plug behavior and diagnosability, do you
think reverting (or partially reverting)
the behavioral impact of the original commit: "PCI: Work around PCIe
link training failures"
what is best way to conclude this issue?
Thanks,
Alok
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2026-03-26 19:52 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-01 3:52 [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Maciej W. Rozycki
2025-12-01 9:45 ` Ilpo Järvinen
2025-12-01 13:55 ` Maciej W. Rozycki
2025-12-01 16:48 ` Ilpo Järvinen
2025-12-08 19:24 ` Maciej W. Rozycki
2025-12-02 13:49 ` [External] : " ALOK TIWARI
2025-12-02 16:07 ` Ilpo Järvinen
2025-12-03 19:01 ` Maciej W. Rozycki
2025-12-08 19:25 ` Maciej W. Rozycki
2025-12-04 1:40 ` [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Matthew W Carlis
2025-12-04 23:43 ` Maciej W. Rozycki
2025-12-04 18:30 ` [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Matthew W Carlis
2025-12-08 19:25 ` Maciej W. Rozycki
-- strict thread matches above, loose matches on Subject: below --
2026-02-19 21:26 [PATCH v2 0/3] " Bjorn Helgaas
2026-02-19 22:09 ` [PATCH] " Matthew W Carlis
2026-02-19 22:53 ` Bjorn Helgaas
2026-02-20 12:03 ` Maciej W. Rozycki
2026-02-23 17:36 ` Bjorn Helgaas
2026-02-23 22:49 ` Matthew W Carlis
2026-02-23 23:14 ` Maciej W. Rozycki
2026-02-25 1:41 ` Matthew W Carlis
2026-02-26 22:02 ` Maciej W. Rozycki
2026-03-26 19:52 ` ALOK TIWARI
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox