linux-pci.vger.kernel.org archive mirror
* [PATCH 0/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
@ 2025-07-16 19:02 Matthew W Carlis
  2025-07-16 19:02 ` [PATCH 1/1] " Matthew W Carlis
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew W Carlis @ 2025-07-16 19:02 UTC
  To: linux-pci
  Cc: bhelgaas, ashishk, macro, bamstadt, msaggi, sconnor,
	Matthew W Carlis

Add a kernel config to allow the removal of the pcie_failed_link_retrain()
quirk. CONFIG_PCI_NOSPEED_QUIRK=y will allow users to prevent the kernel
from forcing a PCIe link to 2.5GT/s. This may be beneficial for systems
or devices under debug as well as hardware configurations that have
demonstrated a high degree of device compatibility & are expected to
endure large numbers of hot-plug/error events & always arrive at maximum
speed.

Matthew W Carlis (1):
  PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain

 drivers/pci/Kconfig  | 9 +++++++++
 drivers/pci/pci.h    | 8 +++++++-
 drivers/pci/quirks.c | 3 +++
 3 files changed, 19 insertions(+), 1 deletion(-)

-- 
2.46.0


* [PATCH 1/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
  2025-07-16 19:02 [PATCH 0/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain Matthew W Carlis
@ 2025-07-16 19:02 ` Matthew W Carlis
  2025-07-16 19:34   ` Bjorn Helgaas
  2025-07-17 17:30   ` [PATCH 1/1] " kernel test robot
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew W Carlis @ 2025-07-16 19:02 UTC
  To: linux-pci
  Cc: bhelgaas, ashishk, macro, bamstadt, msaggi, sconnor,
	Matthew W Carlis

It is desirable to be able to remove pcie_failed_link_retrain for some
systems which are known to have PCIe devices with good LTSSM behavior
or a high degree of compatibility and which may be required to endure
large numbers of hot-plug events or DPC triggers & always arrive at the
maximum link speed. There appears to be a degree of variability in
DSP/RP behavior in terms of setting the LBMS bit, which makes it
difficult to tune pcie_failed_link_retrain accurately enough that it
never forces a device to Gen1 when that device would be able to arrive
at its maximum speed on its own.

Signed-off-by: Matthew W Carlis <mattc@purestorage.com>
---
 drivers/pci/Kconfig  | 9 +++++++++
 drivers/pci/pci.h    | 8 +++++++-
 drivers/pci/quirks.c | 3 +++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 9c0e4aaf4e8c..8f01808231f7 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -68,6 +68,15 @@ config PCI_QUIRKS
 	  Disable this only if your target machine is unaffected by PCI
 	  quirks.
 
+config PCI_NOSPEED_QUIRK
+	default n
+	bool "Remove forced Gen1 link speed quirk" if EXPERT
+	help
+	  This disables a workaround that will guide the PCIe link to
+	  2.5GT/s speed if it thinks the link has failed to train. Enable
+	  this if you think this workaround is forcing the link to 2.5GT/s
+	  when it should not.
+
 config PCI_DEBUG
 	bool "PCI Debugging"
 	depends on DEBUG_KERNEL
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 12215ee72afb..51fddc6419f3 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -785,7 +785,6 @@ void pci_acs_init(struct pci_dev *dev);
 int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
 int pci_dev_specific_enable_acs(struct pci_dev *dev);
 int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
-int pcie_failed_link_retrain(struct pci_dev *dev);
 #else
 static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
 					       u16 acs_flags)
@@ -800,11 +799,18 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
 {
 	return -ENOTTY;
 }
+#endif
+
+#ifdef CONFIG_PCI_QUIRKS
+#ifndef CONFIG_PCI_NOSPEED_QUIRK
+int pcie_failed_link_retrain(struct pci_dev *dev);
+#else
 static inline int pcie_failed_link_retrain(struct pci_dev *dev)
 {
 	return -ENOTTY;
 }
 #endif
+#endif
 
 /* PCI error reporting and recovery */
 pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 39bb0c025119..d2d06f9ec983 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -36,6 +36,8 @@
 #include <linux/switchtec.h>
 #include "pci.h"
 
+#ifndef CONFIG_PCI_NOSPEED_QUIRK
+
 static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta)
 {
 	if (test_bit(PCI_LINK_LBMS_SEEN, &dev->priv_flags))
@@ -140,6 +142,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 
 	return ret;
 }
+#endif
 
 static ktime_t fixup_debug_start(struct pci_dev *dev,
 				 void (*fn)(struct pci_dev *dev))
-- 
2.46.0


* Re: [PATCH 1/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
  2025-07-16 19:02 ` [PATCH 1/1] " Matthew W Carlis
@ 2025-07-16 19:34   ` Bjorn Helgaas
  2025-07-17 18:38     ` [PATCH 0/1] " Matthew W Carlis
  2025-07-17 17:30   ` [PATCH 1/1] " kernel test robot
  1 sibling, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2025-07-16 19:34 UTC
  To: Matthew W Carlis
  Cc: linux-pci, bhelgaas, ashishk, macro, bamstadt, msaggi, sconnor

On Wed, Jul 16, 2025 at 01:02:06PM -0600, Matthew W Carlis wrote:
> It is desirable to be able to remove pcie_failed_link_retrain for some
> systems which are known to have PCIe devices with good LTSSM behavior
> or a high degree of compatibility and which may be required to endure
> large numbers of hot-plug events or DPC triggers & always arrive at the
> maximum link speed. There appears to be a degree of variability in
> DSP/RP behavior in terms of setting the LBMS bit, which makes it
> difficult to tune pcie_failed_link_retrain accurately enough that it
> never forces a device to Gen1 when that device would be able to arrive
> at its maximum speed on its own.
> 
> Signed-off-by: Matthew W Carlis <mattc@purestorage.com>
> ---
>  drivers/pci/Kconfig  | 9 +++++++++
>  drivers/pci/pci.h    | 8 +++++++-
>  drivers/pci/quirks.c | 3 +++
>  3 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 9c0e4aaf4e8c..8f01808231f7 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -68,6 +68,15 @@ config PCI_QUIRKS
>  	  Disable this only if your target machine is unaffected by PCI
>  	  quirks.
>  
> +config PCI_NOSPEED_QUIRK
> +	default n
> +	bool "Remove forced Gen1 link speed quirk" if EXPERT
> +	help
> +	  This disables a workaround that will guide the PCIe link to
> +	  2.5GT/s speed if it thinks the link has failed to train. Enable
> +	  this if you think this workaround is forcing the link to 2.5GT/s
> +	  when it should not.

This seems awfully specific to me, really too specific to carry in the
upstream tree.  pcie_failed_link_retrain() is itself ridiculously
specific, and I'm not sure we should even keep carrying that.

Maybe we should just accept that broken hardware exists and add quirks
to limit link speed or tell the user to buy a working device.

>  config PCI_DEBUG
>  	bool "PCI Debugging"
>  	depends on DEBUG_KERNEL
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 12215ee72afb..51fddc6419f3 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -785,7 +785,6 @@ void pci_acs_init(struct pci_dev *dev);
>  int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
>  int pci_dev_specific_enable_acs(struct pci_dev *dev);
>  int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
> -int pcie_failed_link_retrain(struct pci_dev *dev);
>  #else
>  static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
>  					       u16 acs_flags)
> @@ -800,11 +799,18 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
>  {
>  	return -ENOTTY;
>  }
> +#endif
> +
> +#ifdef CONFIG_PCI_QUIRKS
> +#ifndef CONFIG_PCI_NOSPEED_QUIRK
> +int pcie_failed_link_retrain(struct pci_dev *dev);
> +#else
>  static inline int pcie_failed_link_retrain(struct pci_dev *dev)
>  {
>  	return -ENOTTY;
>  }
>  #endif
> +#endif
>  
>  /* PCI error reporting and recovery */
>  pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 39bb0c025119..d2d06f9ec983 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -36,6 +36,8 @@
>  #include <linux/switchtec.h>
>  #include "pci.h"
>  
> +#ifndef CONFIG_PCI_NOSPEED_QUIRK
> +
>  static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta)
>  {
>  	if (test_bit(PCI_LINK_LBMS_SEEN, &dev->priv_flags))
> @@ -140,6 +142,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
>  
>  	return ret;
>  }
> +#endif
>  
>  static ktime_t fixup_debug_start(struct pci_dev *dev,
>  				 void (*fn)(struct pci_dev *dev))
> -- 
> 2.46.0
> 

* Re: [PATCH 1/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
  2025-07-16 19:02 ` [PATCH 1/1] " Matthew W Carlis
  2025-07-16 19:34   ` Bjorn Helgaas
@ 2025-07-17 17:30   ` kernel test robot
  1 sibling, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-07-17 17:30 UTC
  To: Matthew W Carlis, linux-pci
  Cc: oe-kbuild-all, bhelgaas, ashishk, macro, bamstadt, msaggi,
	sconnor, Matthew W Carlis

Hi Matthew,

kernel test robot noticed the following build errors:

[auto build test ERROR on pci/next]
[also build test ERROR on pci/for-linus linus/master v6.16-rc6 next-20250717]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Matthew-W-Carlis/PCI-Add-CONFIG_PCI_NOSPEED_QUIRK-to-remove-pcie_failed_link_retrain/20250717-030438
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20250716190206.15269-2-mattc%40purestorage.com
patch subject: [PATCH 1/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
config: sparc-randconfig-002-20250717 (https://download.01.org/0day-ci/archive/20250718/202507180125.48CXWNxm-lkp@intel.com/config)
compiler: sparc-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250718/202507180125.48CXWNxm-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507180125.48CXWNxm-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/pci/probe.c: In function 'pci_device_add':
>> drivers/pci/probe.c:2688:2: error: implicit declaration of function 'pcie_failed_link_retrain'; did you mean 'pci_enable_link_state'? [-Werror=implicit-function-declaration]
     pcie_failed_link_retrain(dev);
     ^~~~~~~~~~~~~~~~~~~~~~~~
     pci_enable_link_state
   cc1: some warnings being treated as errors
--
   drivers/pci/pci.c: In function 'pci_dev_wait':
>> drivers/pci/pci.c:1325:9: error: implicit declaration of function 'pcie_failed_link_retrain'; did you mean 'pci_enable_link_state'? [-Werror=implicit-function-declaration]
        if (pcie_failed_link_retrain(bridge) == 0) {
            ^~~~~~~~~~~~~~~~~~~~~~~~
            pci_enable_link_state
   cc1: some warnings being treated as errors
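
Both errors are consistent with a configuration in which CONFIG_PCI_QUIRKS
is disabled: with the patch applied, pci.h provides the declaration or the
stub of pcie_failed_link_retrain() only inside #ifdef CONFIG_PCI_QUIRKS, so
the callers in probe.c and pci.c see neither. The following is a minimal
standalone sketch of one possible guard, not a tested kernel change. It
assumes the stub should be used whenever the quirk is not built, and it
compiles outside the kernel only because neither CONFIG_ symbol is defined
there.

```c
#include <errno.h>
#include <stddef.h>

struct pci_dev;	/* opaque here; the kernel defines the real structure */

/*
 * Declare the real function only when quirks.c is built and the
 * proposed CONFIG_PCI_NOSPEED_QUIRK option is off.  Every other
 * configuration, including CONFIG_PCI_QUIRKS=n, gets the stub.
 */
#if defined(CONFIG_PCI_QUIRKS) && !defined(CONFIG_PCI_NOSPEED_QUIRK)
int pcie_failed_link_retrain(struct pci_dev *dev);
#else
static inline int pcie_failed_link_retrain(struct pci_dev *dev)
{
	(void)dev;	/* unused in the stub */
	return -ENOTTY;
}
#endif
```

With a single combined condition there is no configuration left in which
the function is neither declared nor stubbed.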


vim +2688 drivers/pci/probe.c

44aa0c657e3e45 Marc Zyngier      2015-07-28  2670  
96bde06a2df1b3 Sam Ravnborg      2007-03-26  2671  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
^1da177e4c3f41 Linus Torvalds    2005-04-16  2672  {
4f535093cf8f6d Yinghai Lu        2013-01-21  2673  	int ret;
4f535093cf8f6d Yinghai Lu        2013-01-21  2674  
6cd33649fa83d9 Bjorn Helgaas     2014-08-27  2675  	pci_configure_device(dev);
6cd33649fa83d9 Bjorn Helgaas     2014-08-27  2676  
cdb9b9f730eac4 Paul Mackerras    2005-09-06  2677  	device_initialize(&dev->dev);
cdb9b9f730eac4 Paul Mackerras    2005-09-06  2678  	dev->dev.release = pci_release_dev;
^1da177e4c3f41 Linus Torvalds    2005-04-16  2679  
7629d19a4df922 Yinghai Lu        2013-01-21  2680  	set_dev_node(&dev->dev, pcibus_to_node(bus));
cdb9b9f730eac4 Paul Mackerras    2005-09-06  2681  	dev->dev.dma_mask = &dev->dma_mask;
4d57cdfacaa1c2 FUJITA Tomonori   2008-02-04  2682  	dev->dev.dma_parms = &dev->dma_parms;
cdb9b9f730eac4 Paul Mackerras    2005-09-06  2683  	dev->dev.coherent_dma_mask = 0xffffffffull;
^1da177e4c3f41 Linus Torvalds    2005-04-16  2684  
b0da3498c587c2 Christoph Hellwig 2018-10-09  2685  	dma_set_max_seg_size(&dev->dev, 65536);
a6f44cf9f5cc60 Christoph Hellwig 2018-10-09  2686  	dma_set_seg_boundary(&dev->dev, 0xffffffff);
4d57cdfacaa1c2 FUJITA Tomonori   2008-02-04  2687  
a89c82249c3763 Maciej W. Rozycki 2023-06-11 @2688  	pcie_failed_link_retrain(dev);
a89c82249c3763 Maciej W. Rozycki 2023-06-11  2689  
^1da177e4c3f41 Linus Torvalds    2005-04-16  2690  	/* Fix up broken headers */
^1da177e4c3f41 Linus Torvalds    2005-04-16  2691  	pci_fixup_device(pci_fixup_header, dev);
^1da177e4c3f41 Linus Torvalds    2005-04-16  2692  
2069ecfbe14ebd Yinghai Lu        2012-02-15  2693  	pci_reassigndev_resource_alignment(dev);
2069ecfbe14ebd Yinghai Lu        2012-02-15  2694  
4b77b0a2ba27d6 Rafael J. Wysocki 2009-09-09  2695  	dev->state_saved = false;
4b77b0a2ba27d6 Rafael J. Wysocki 2009-09-09  2696  
201de56eb22f1f Zhao, Yu          2008-10-13  2697  	pci_init_capabilities(dev);
eb9d0fe40e313c Rafael J. Wysocki 2008-07-07  2698  
^1da177e4c3f41 Linus Torvalds    2005-04-16  2699  	/*
^1da177e4c3f41 Linus Torvalds    2005-04-16  2700  	 * Add the device to our list of discovered devices
^1da177e4c3f41 Linus Torvalds    2005-04-16  2701  	 * and the bus list for fixup functions, etc.
^1da177e4c3f41 Linus Torvalds    2005-04-16  2702  	 */
d71374dafbba7e Zhang Yanmin      2006-06-02  2703  	down_write(&pci_bus_sem);
^1da177e4c3f41 Linus Torvalds    2005-04-16  2704  	list_add_tail(&dev->bus_list, &bus->devices);
d71374dafbba7e Zhang Yanmin      2006-06-02  2705  	up_write(&pci_bus_sem);
4f535093cf8f6d Yinghai Lu        2013-01-21  2706  
06dc660e6eb881 Oliver O'Halloran 2021-09-14  2707  	ret = pcibios_device_add(dev);
4f535093cf8f6d Yinghai Lu        2013-01-21  2708  	WARN_ON(ret < 0);
4f535093cf8f6d Yinghai Lu        2013-01-21  2709  
3e466e2d3a04c7 Bjorn Helgaas     2017-11-30  2710  	/* Set up MSI IRQ domain */
44aa0c657e3e45 Marc Zyngier      2015-07-28  2711  	pci_set_msi_domain(dev);
44aa0c657e3e45 Marc Zyngier      2015-07-28  2712  
4f535093cf8f6d Yinghai Lu        2013-01-21  2713  	/* Notifier could use PCI capabilities */
4f535093cf8f6d Yinghai Lu        2013-01-21  2714  	ret = device_add(&dev->dev);
4f535093cf8f6d Yinghai Lu        2013-01-21  2715  	WARN_ON(ret < 0);
4e893545ef8712 Mariusz Tkaczyk   2024-09-04  2716  
4e893545ef8712 Mariusz Tkaczyk   2024-09-04  2717  	pci_npem_create(dev);
2311ab1820fed2 Alistair Francis  2025-03-06  2718  
2311ab1820fed2 Alistair Francis  2025-03-06  2719  	pci_doe_sysfs_init(dev);
cdb9b9f730eac4 Paul Mackerras    2005-09-06  2720  }
cdb9b9f730eac4 Paul Mackerras    2005-09-06  2721  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH 0/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
  2025-07-16 19:34   ` Bjorn Helgaas
@ 2025-07-17 18:38     ` Matthew W Carlis
  2025-07-18 14:16       ` Maciej W. Rozycki
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew W Carlis @ 2025-07-17 18:38 UTC
  To: helgaas
  Cc: ashishk, bamstadt, bhelgaas, linux-pci, macro, mattc, msaggi,
	sconnor

On Wed, 16 Jul 2025, Bjorn Helgaas wrote:
> Maybe we should just accept that broken hardware exists and add quirks
> to limit link speed or tell the user to buy a working device.

I was actually wondering if we should define a new kind of
DECLARE_PCI_FIXUP_<LINKUP> which allows vendors/users to implement an
appropriate link recovery for their circumstances. In one of our storage
appliance product-lines we actually have a kind of quirk of our own which is
implemented to work-around some older Gen3 PCIe switches that had some official
erratum. Honestly we would like to not have to carry these patches, but there
wasn't an obvious way to upstream them. We could probably re-work them to fit
into a kind of new fixup.
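
As a rough illustration of that idea, the sketch below models a link-up
fixup table in standalone C. Everything in it is hypothetical: the kernel
has no DECLARE_PCI_FIXUP_LINKUP, and struct model_dev, struct linkup_fixup,
run_linkup_fixups(), MODEL_ANY_ID and the 0x1234/0x5678 IDs are invented for
the example. Only the vendor/device matching style is borrowed from the
existing DECLARE_PCI_FIXUP_* tables.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_ANY_ID 0xffffu	/* hypothetical wildcard, in the spirit of PCI_ANY_ID */

/* Hypothetical stand-in for struct pci_dev */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	int recovery_runs;	/* how many recovery hooks ran for this device */
};

/* Hypothetical table entry: which device, and what to do on link-up */
struct linkup_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct model_dev *dev);
};

/* Placeholder for a vendor-specific link recovery sequence */
static void example_switch_recovery(struct model_dev *dev)
{
	dev->recovery_runs++;
}

static const struct linkup_fixup linkup_fixups[] = {
	{ 0x1234, 0x5678, example_switch_recovery },	/* made-up IDs */
};

/* Run every hook whose IDs match the device; return how many ran */
static int run_linkup_fixups(struct model_dev *dev)
{
	int ran = 0;

	for (size_t i = 0; i < sizeof(linkup_fixups) / sizeof(linkup_fixups[0]); i++) {
		const struct linkup_fixup *f = &linkup_fixups[i];

		if ((f->vendor == MODEL_ANY_ID || f->vendor == dev->vendor) &&
		    (f->device == MODEL_ANY_ID || f->device == dev->device)) {
			f->hook(dev);
			ran++;
		}
	}
	return ran;
}
```

In this model, a device that matches no table entry is left alone, so the
recovery logic stays scoped to the hardware that needs it.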

My belief is that we try to not bend the generic handling in the kernel around
specific device issues because once implemented we are essentially enabling
devices in the future to have such bugs/interactions.

In the case of this ASMedia/Pericom switch combination, I'm told by others
internally that changing the link presets or other settings may possibly
resolve the LTSSM looping issue, but ASMedia/Pericom would probably need to
look into it.

Cheers!
- Matt

* Re: [PATCH 0/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
  2025-07-17 18:38     ` [PATCH 0/1] " Matthew W Carlis
@ 2025-07-18 14:16       ` Maciej W. Rozycki
  2025-07-23 19:18         ` Matthew W Carlis
  0 siblings, 1 reply; 7+ messages in thread
From: Maciej W. Rozycki @ 2025-07-18 14:16 UTC
  To: Matthew W Carlis, Bjorn Helgaas
  Cc: ashishk, bamstadt, Bjorn Helgaas, linux-pci, msaggi, sconnor

On Thu, 17 Jul 2025, Matthew W Carlis wrote:

> > Maybe we should just accept that broken hardware exists and add quirks
> > to limit link speed or tell the user to buy a working device.

 Bjorn, unfortunately sometimes you have to live with what you've got, in 
particular there's (I believe still) no good choice available to replace 
the HiFive Unmatched board and the PCIe splitter adapter chosen was the 
only one I could chase that is fully mechanically compatible with *ATX 
case slot space (i.e. you can actually properly mount it there next to the 
mainboard and no connector will clash with another part of the system).

 Matthew, please correct me if I'm wrong, but from discussion so far here 
and previously I infer the problematic part is not the essential part of 
the quirk, that is retraining at 2.5GT/s.  It is leaving the speed clamp 
behind that is.

 I pondered over this issue and came to a conclusion: how about we just
drop the vendor/device ID qualification for unclamping where the quirk has
triggered, that is, remove the clamp unconditionally?  It obviously won't
affect my devices, and the qualification might perhaps have been overly
cautious in the first place.

 Then, for the case where the link had already been clamped by the firmware,
stricter matching can be done before unclamping, just as I suggested in my
previous message, but by definition it shouldn't affect hotplug scenarios.
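
The proposed policy can be sketched as a small decision function. This is
only a model of the two cases described above, assuming the link has already
retrained successfully at 2.5GT/s; enum clamp_origin and should_lift_clamp()
are hypothetical names and do not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical: who limited the link to 2.5GT/s */
enum clamp_origin {
	CLAMP_NONE,		/* link was never limited */
	CLAMP_BY_QUIRK,		/* the quirk itself forced 2.5GT/s */
	CLAMP_BY_FIRMWARE,	/* firmware left the link limited */
};

/*
 * Decide whether to remove the 2.5GT/s limit.  id_matches stands for
 * the stricter vendor/device ID matching mentioned above.
 */
static bool should_lift_clamp(enum clamp_origin origin, bool id_matches)
{
	switch (origin) {
	case CLAMP_BY_QUIRK:
		return true;		/* unconditional, no ID check */
	case CLAMP_BY_FIRMWARE:
		return id_matches;	/* only for known devices */
	default:
		return false;		/* nothing to lift */
	}
}
```

Under this model a hot-plugged device that the quirk slowed down always gets
its speed limit removed again, while a firmware-imposed limit is kept unless
the device is known.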

 If we agree on this way to move forward, then I'll make suitable changes, 
but today is my last day before heading for a holiday next week, so I'll 
only come back with some code the week of Jul 28th.  Does it sound like a 
plan to you?

> I was actually wondering if we should define a new kind of
> DECLARE_PCI_FIXUP_<LINKUP> which allows vendors/users to implement an
> appropriate link recovery for their circumstances. In one of our storage
> appliance product-lines we actually have a kind of quirk of our own which is
> implemented to work-around some older Gen3 PCIe switches that had some official
> erratum. Honestly we would like to not have to carry these patches, but there
> wasn't an obvious way to upstream them. We could probably re-work them to fit
> into a kind of new fixup.
> 
> My belief is that we try to not bend the generic handling in the kernel around
> specific device issues because once implemented we are essentially enabling
> devices in the future to have such bugs/interactions.

 No opinion about it right away, but I'll give it a thought once I'm back.

> In the case of this ASMedia/Pericom switch combination, I'm told by others
> internally that changing the link presets or other settings may possibly
> resolve the LTSSM looping issue, but ASMedia/Pericom would probably need to
> look into it.

 Thank you for looking into it.  Sadly both parties declined to comment 
when I contacted them back in 2021.

 They only seem to care about direct customers, which would be SiFive for 
ASMedia and Delock (or whoever the OEM was; the same device used to be 
retailed by StarTech for example) for Diodes/Pericom.  There was no answer 
from ASMedia at all and I got an initial response from Diodes, but then I 
was told that if this was about a PCIe switch, then they were the wrong 
part of the company and could not help themselves or direct me to the 
right part.

 Then SiFive told me they'd love to help, but were too small a customer 
for ASMedia (the first batch of HiFive Unmatched was a couple hundred 
pieces), and Delock told me I was the first one to ever report any issue 
and therefore it was my problem and not theirs.

 So I did whatever I was able to myself.  I'm glad I had the skills in the 
first place; most people could only give up.

  Maciej

* Re: [PATCH 0/1] PCI: Add CONFIG_PCI_NOSPEED_QUIRK to remove pcie_failed_link_retrain
  2025-07-18 14:16       ` Maciej W. Rozycki
@ 2025-07-23 19:18         ` Matthew W Carlis
  0 siblings, 0 replies; 7+ messages in thread
From: Matthew W Carlis @ 2025-07-23 19:18 UTC
  To: macro
  Cc: ashishk, bamstadt, bhelgaas, helgaas, linux-pci, mattc, msaggi,
	sconnor

On Fri, 18 Jul 2025, Maciej W. Rozycki wrote:
>  Bjorn, unfortunately sometimes you have to live with what you've got, in 
> particular there's (I believe still) no good choice available to replace 
> the HiFive Unmatched board and the PCIe splitter adapter chosen was the 
> only one I could chase that is fully mechanically compatible with *ATX 
> case slot space (i.e. you can actually properly mount it there next to the 
> mainboard and no connector will clash with another part of the system).
>
>  Matthew, please correct me if I'm wrong, but from discussion so far here 
> and previously I infer the problematic part is not the essential part of 
> the quirk, that is retraining at 2.5GT/s.  It is leaving the speed clamp 
> behind that is.

I'm just not sure what the benefit of the quirk is generally. It seems like
there are several problems with it in "well behaved" systems. I think people
who build & sell servers would go out and qualify a list of devices which
they will tell their customers "have been seen to work" & would therefore be
unlikely to see your specific issue. Another problem in my mind with the
quirk is that you're left trying to figure out whether the quirk was invoked
before you can look at the device interaction itself, & so I think it makes
things a bit harder to debug.

In a way we're basically enabling future bad hardware by allowing the quirk to
run broadly on PCIe devices... Once it's been around for a while, will anyone
ever be able to confidently say it's not necessary? In addition, it sounds like
both Ilpo and I have observed LBMS to behave differently on different devices.
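
To make the dependence on LBMS concrete, here is a simplified model of the
kind of Link Status check under discussion. It is an assumption, not a copy
of the quirk: the actual condition in drivers/pci/quirks.c is only partly
visible in this thread (it goes through pcie_lbms_seen()), and
link_looks_failed() is a hypothetical name. The two bit values are the ones
defined in include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Status register bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_EXP_LNKSTA_DLLLA	0x2000	/* Data Link Layer Link Active */
#define PCI_EXP_LNKSTA_LBMS	0x4000	/* Link Bandwidth Management Status */

/*
 * Simplified model: treat the link as "failed to train" when LBMS is
 * set while the data link layer is not active.
 */
static bool link_looks_failed(uint16_t lnksta)
{
	return (lnksta & (PCI_EXP_LNKSTA_DLLLA | PCI_EXP_LNKSTA_LBMS)) ==
	       PCI_EXP_LNKSTA_LBMS;
}
```

In this model any port that reports LBMS while the link is momentarily down,
for example around a hot-plug event, would satisfy the check even if the
link could have trained at full speed on its own.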
