public inbox for linux-pci@vger.kernel.org
From: ALOK TIWARI <alok.a.tiwari@oracle.com>
To: "Lukas Wunner" <lukas@wunner.de>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
	bhelgaas@google.com
Cc: Jiwei <jiwei.sun.bj@qq.com>,
	macro@orcam.me.uk, linux-pci@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	guojinhui.liam@bytedance.com, helgaas@kernel.org,
	ahuang12@lenovo.com, sunjw10@lenovo.com
Subject: Re: [External] : Re: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing
Date: Wed, 26 Nov 2025 00:53:18 +0530
Message-ID: <d4e5b6d8-c69f-4fbc-8da6-bc2c2fb2a550@oracle.com>
In-Reply-To: <Z4eLh24IkDrAm6cm@wunner.de>

Hi,

On 1/15/2025 3:48 PM, Lukas Wunner wrote:
> On Tue, Jan 14, 2025 at 08:25:04PM +0200, Ilpo Järvinen wrote:
>> On Tue, 14 Jan 2025, Jiwei wrote:
>>> [  539.362400] ==== pcie_bwnotif_irq 269(stop running),link_status:0x7841
>>> [  539.395720] ==== pcie_bwnotif_irq 247(start running),link_status:0x1041
>>
>> DLLLA=0
>>
>> But LBMS did not get reset.
>>
>> So is this perhaps because hotplug cannot keep up with the rapid
>> remove/add going on, and thus will not always call the remove_board()
>> even if the device went away?
>>
>> Lukas, do you know if there's a good way to resolve this within hotplug
>> side?
> 
> I believe the pciehp code is fine and suspect this is an issue
> in the quirk.  We've been dealing with rapid add/remove in pciehp
> for years without issues.
> 
> I don't understand the quirk sufficiently to make a guess
> what's going wrong, but I'm wondering if there could be
> a race accessing the lbms_count?
> 
> Maybe if lbms_count is replaced by a flag in pci_dev->priv_flags
> as we've discussed, with proper memory barriers where necessary,
> this problem will solve itself?
> 
> Thanks,
> 
> Lukas
> 

We are testing hot-add/hot-remove behavior and have observed the same
issue as mentioned above, where the PCIe bridge link speed drops from
32 GT/s to 2.5 GT/s.
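
For reference, the link_status values in the log quoted above can be
decoded with a small user-space sketch like the one below. The field
masks follow the PCIe Link Status register layout (the same values as
the PCI_EXP_LNKSTA_* macros in linux/pci_regs.h). The struct and
function names are made up for illustration and are not kernel code.

```c
#include <stdint.h>
#include <stdio.h>

/* Link Status register fields (same values as PCI_EXP_LNKSTA_*). */
#define LNKSTA_CLS   0x000f  /* Current Link Speed */
#define LNKSTA_NLW   0x03f0  /* Negotiated Link Width */
#define LNKSTA_LT    0x0800  /* Link Training in progress */
#define LNKSTA_DLLLA 0x2000  /* Data Link Layer Link Active */
#define LNKSTA_LBMS  0x4000  /* Link Bandwidth Management Status */

/* Hypothetical helper type, for illustration only. */
struct lnksta_fields {
	unsigned speed_code;  /* 1 = 2.5 GT/s ... 5 = 32 GT/s */
	unsigned width;       /* number of lanes */
	int lt, dllla, lbms;
};

struct lnksta_fields decode_lnksta(uint16_t v)
{
	struct lnksta_fields f;

	f.speed_code = v & LNKSTA_CLS;
	f.width = (v & LNKSTA_NLW) >> 4;
	f.lt = !!(v & LNKSTA_LT);
	f.dllla = !!(v & LNKSTA_DLLLA);
	f.lbms = !!(v & LNKSTA_LBMS);
	return f;
}

const char *speed_name(unsigned code)
{
	static const char *const names[] = {
		"unknown", "2.5 GT/s", "5 GT/s", "8 GT/s",
		"16 GT/s", "32 GT/s", "64 GT/s",
	};

	return code < sizeof(names) / sizeof(names[0]) ? names[code] : "unknown";
}

void print_lnksta(uint16_t v)
{
	struct lnksta_fields f = decode_lnksta(v);

	printf("0x%04x: %s x%u LT=%d DLLLA=%d LBMS=%d\n",
	       (unsigned)v, speed_name(f.speed_code), f.width,
	       f.lt, f.dllla, f.lbms);
}
```

With this decoding, 0x7841 reads as 2.5 GT/s, x4, with LT, DLLLA and
LBMS all set, and 0x1041 reads as 2.5 GT/s, x4, with all three clear.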

My understanding is that pcie_failed_link_retrain() should apply only to
devices matched by PCI_VDEVICE(ASMEDIA, 0x2824), but the current
implementation appears to affect all devices that take longer to
establish a link. We are unsure whether this is intentional, but it
effectively allows such devices to continue operating at a reduced speed.
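
To illustrate what gating on a device ID would mean, here is a minimal
user-space sketch. It is not the kernel implementation: the struct,
table and function names are hypothetical. The only values taken from
real hardware IDs are 0x1b21 (the ASMedia PCI vendor ID) and the 0x2824
device ID mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ID pair, standing in for a struct pci_device_id entry. */
struct dev_id {
	uint16_t vendor;
	uint16_t device;
};

/* Devices the (hypothetical) gate would let through. */
static const struct dev_id quirk_ids[] = {
	{ 0x1b21, 0x2824 },  /* ASMedia ASM2824 */
};

/* Return true only for devices listed in quirk_ids. */
bool quirk_applies(uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; i < sizeof(quirk_ids) / sizeof(quirk_ids[0]); i++) {
		if (quirk_ids[i].vendor == vendor &&
		    quirk_ids[i].device == device)
			return true;
	}
	return false;
}
```

Under such a gate, a slow-training device from any other vendor would
never reach the speed-limiting path.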

If we extend PCIE_LINK_RETRAIN_TIMEOUT_MS to 3000 ms, these slower
devices are able to complete link training, and the problem is no longer
observed in our testing.
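
The effect of the timeout can be pictured with a toy model that polls a
simulated clock. This is only a sketch: the function name, the 10 ms
poll step and the idea of a fixed training time are assumptions made
for illustration, not kernel code and not measurements.

```c
#include <stdbool.h>

#define POLL_STEP_MS 10  /* assumed polling interval */

/*
 * Toy model: the link is considered up once train_time_ms have elapsed.
 * Returns true if that happens before timeout_ms expires.
 */
bool link_up_within(unsigned train_time_ms, unsigned timeout_ms)
{
	for (unsigned elapsed = 0; elapsed <= timeout_ms;
	     elapsed += POLL_STEP_MS) {
		if (elapsed >= train_time_ms)
			return true;
	}
	return false;
}
```

For a hypothetical device that needs 2000 ms to train,
link_up_within(2000, 1000) returns false and
link_up_within(2000, 3000) returns true.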

Would it be acceptable to increase PCIE_LINK_RETRAIN_TIMEOUT_MS from
1000 ms to 3000 ms in this case?


Thanks,
Alok

