public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing
@ 2025-01-10 13:44 Jiwei Sun
  2025-01-11 16:00 ` Maciej W. Rozycki
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jiwei Sun @ 2025-01-10 13:44 UTC (permalink / raw)
  To: macro, ilpo.jarvinen, bhelgaas
  Cc: linux-pci, linux-kernel, guojinhui.liam, helgaas, lukas, ahuang12,
	sunjw10, jiwei.sun.bj

From: Jiwei Sun <sunjw10@lenovo.com>

When we do the quick hot-add/hot-remove test (within 1 second) with a PCIE
Gen 5 NVMe disk, there is a possibility that the PCIe bridge will decrease
to 2.5GT/s from 32GT/s

pcieport 10002:00:04.0: pciehp: Slot(75): Link Down
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
...
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: broken device, retraining non-functional downstream link at 2.5GT/s
pcieport 10002:00:04.0: pciehp: Slot(75): No link
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): Link Up
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pcieport 10002:00:04.0: pciehp: Slot(75): No device found
pcieport 10002:00:04.0: pciehp: Slot(75): Card present
pci 10002:02:00.0: [144d:a826] type 00 class 0x010802 PCIe Endpoint
pci 10002:02:00.0: BAR 0 [mem 0x00000000-0x00007fff 64bit]
pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x00007fff 64bit]
pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x001fffff 64bit]: contains BAR 0 for 64 VFs
pci 10002:02:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 10002:00:04.0 (capable of 126.028 Gb/s with 32.0 GT/s PCIe x4 link)

If a NVMe disk is hot removed, the pciehp interrupt will be triggered, and
the kernel thread pciehp_ist will be woken up, the
pcie_failed_link_retrain() will be called as the following call trace.

   irq/87-pciehp-2524    [121] ..... 152046.006765: pcie_failed_link_retrain <-pcie_wait_for_link
   irq/87-pciehp-2524    [121] ..... 152046.006782: <stack trace>
 => [FTRACE TRAMPOLINE]
 => pcie_failed_link_retrain
 => pcie_wait_for_link
 => pciehp_check_link_status
 => pciehp_enable_slot
 => pciehp_handle_presence_or_link_change
 => pciehp_ist
 => irq_thread_fn
 => irq_thread
 => kthread
 => ret_from_fork
 => ret_from_fork_asm

Accorind to investigation, the issue is caused by the following scenerios,

NVMe disk	pciehp hardirq
hot-remove 	top-half		pciehp irq kernel thread
======================================================================
pciehp hardirq
will be triggered
	    	cpu handle pciehp
		hardirq
		pciehp irq kthread will
		be woken up
					pciehp_ist
					...
					  pcie_failed_link_retrain
					    read PCI_EXP_LNKCTL2 register
					    read PCI_EXP_LNKSTA register
If NVMe disk
hot-add before
calling pcie_retrain_link()
					    set target speed to 2_5GT
					      pcie_bwctrl_change_speed
	  				        pcie_retrain_link
						: the retrain work will be
						  successful, because
						  pci_match_id() will be
						  0 in
						  pcie_failed_link_retrain()
						  the target link speed
						  field of the Link Control
						  2 Register will keep 0x1.

In order to fix the issue, don't do the retraining work except ASMedia
ASM2824.

Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Reported-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
---
 drivers/pci/quirks.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 605628c810a5..ff04ebd9ae16 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -104,6 +104,9 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 	u16 lnksta, lnkctl2;
 	int ret = -ENOTTY;
 
+	if (!pci_match_id(ids, dev))
+		return 0;
+
 	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
 	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
 		return ret;
@@ -129,8 +132,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 	}
 
 	if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
-	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
-	    pci_match_id(ids, dev)) {
+	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT) {
 		u32 lnkcap;
 
 		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread
* [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
@ 2025-12-01  3:52 Maciej W. Rozycki
  2025-12-04  1:40 ` [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Matthew W Carlis
  0 siblings, 1 reply; 13+ messages in thread
From: Maciej W. Rozycki @ 2025-12-01  3:52 UTC (permalink / raw)
  To: Bjorn Helgaas, Matthew W Carlis, ALOK TIWARI
  Cc: ashishk, bamstadt, msaggi, sconnor, Lukas Wunner,
	Ilpo Järvinen, Jiwei, guojinhui.liam, ahuang12, sunjw10,
	linux-pci, linux-kernel

Discard Vendor:Device ID matching in the PCIe failed link retraining 
quirk and ignore the link status for the removal of the 2.5GT/s speed 
clamp, whether applied by the quirk itself or the firmware earlier on.  
Revert to the original target link speed if this final link retraining 
has failed.

This is so that link training noise in hot-plug scenarios does not make 
a link remain clamped to the 2.5GT/s speed where an event race has led 
the quirk to apply the speed clamp for one device, only to leave it in 
place for a subsequent device to be plugged in.

Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: <stable@vger.kernel.org> # v6.5+
---
 drivers/pci/quirks.c |   50 ++++++++++++++++++--------------------------------
 1 file changed, 18 insertions(+), 32 deletions(-)

linux-pcie-failed-link-retrain-unclamp-always.diff
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -79,11 +79,10 @@ static bool pcie_lbms_seen(struct pci_de
  * Restrict the speed to 2.5GT/s then with the Target Link Speed field,
  * request a retrain and check the result.
  *
- * If this turns out successful and we know by the Vendor:Device ID it is
- * safe to do so, then lift the restriction, letting the devices negotiate
- * a higher speed.  Also check for a similar 2.5GT/s speed restriction the
- * firmware may have already arranged and lift it with ports that already
- * report their data link being up.
+ * If this turns out successful, or where a 2.5GT/s speed restriction has
+ * been previously arranged by the firmware and the port reports its link
+ * already being up, lift the restriction, in a hope it is safe to do so,
+ * letting the devices negotiate a higher speed.
  *
  * Otherwise revert the speed to the original setting and request a retrain
  * again to remove any residual state, ignoring the result as it's supposed
@@ -94,52 +93,39 @@ static bool pcie_lbms_seen(struct pci_de
  */
 int pcie_failed_link_retrain(struct pci_dev *dev)
 {
-	static const struct pci_device_id ids[] = {
-		{ PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
-		{}
-	};
-	u16 lnksta, lnkctl2;
+	u16 lnksta, lnkctl2, oldlnkctl2;
 	int ret = -ENOTTY;
+	u32 lnkcap;
 
 	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
 	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
 		return ret;
 
 	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
 	if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
-		u16 oldlnkctl2;
-
 		pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
 
-		pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
 		ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
-		if (ret) {
-			pci_info(dev, "retraining failed\n");
-			pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
-					      true);
-			return ret;
-		}
-
-		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+		if (ret)
+			goto err;
 	}
 
 	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
-
-	if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
-	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
-	    pci_match_id(ids, dev)) {
-		u32 lnkcap;
-
+	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+	if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
+	    (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
 		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
-		pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
 		ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
-		if (ret) {
-			pci_info(dev, "retraining failed\n");
-			return ret;
-		}
+		if (ret)
+			goto err;
 	}
 
 	return ret;
+err:
+	pci_info(dev, "retraining failed\n");
+	pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2), true);
+	return ret;
 }
 
 static ktime_t fixup_debug_start(struct pci_dev *dev,

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-12-04 23:44 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-10 13:44 [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Jiwei Sun
2025-01-11 16:00 ` Maciej W. Rozycki
2025-01-13 12:44   ` Jiwei
2025-01-13 15:08 ` Ilpo Järvinen
2025-01-14 15:04   ` Jiwei
2025-01-14 18:25     ` Ilpo Järvinen
2025-01-15 10:18       ` Lukas Wunner
2025-11-25 19:23         ` [External] : " ALOK TIWARI
2025-12-01  3:54           ` Maciej W. Rozycki
2025-01-15 11:39       ` Jiwei
2025-09-09 12:33 ` [External] : " ALOK TIWARI
  -- strict thread matches above, loose matches on Subject: below --
2025-12-01  3:52 [PATCH] PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining Maciej W. Rozycki
2025-12-04  1:40 ` [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing Matthew W Carlis
2025-12-04 23:43   ` Maciej W. Rozycki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox