From: Keith Busch <kbusch@kernel.org>
To: Lukas Wunner <lukas@wunner.de>
Cc: Keith Busch <kbusch@meta.com>, linux-pci@vger.kernel.org
Subject: Re: [PATCH] pciehp: sync interrupts for bus resets
Date: Wed, 3 Sep 2025 10:19:28 -0600
Message-ID: <aLhqkO8Uaohghm97@kbusch-mbp>
In-Reply-To: <aLf6jkqAYkM3GBvt@wunner.de>

On Wed, Sep 03, 2025 at 10:21:34AM +0200, Lukas Wunner wrote:
> On Tue, Sep 02, 2025 at 11:59:11AM -0600, Keith Busch wrote:
> > Hm, I think you're right. We are definitely seeing pciehp requeue itself
> > with the link/presence events that we want to be ignored, so we're
> > getting re-enumeration when we didn't expect it. I thought the
> > back-to-back resets that we're causing vfio to initiate was the problem,
> > but maybe not. I think the switch and/or end device we're using have
> > some unusual link timings that defeats the pciehp ignore logic.
>
> pci_bridge_secondary_bus_reset() calls pci_bridge_wait_for_secondary_bus()
> to await Link Up. So unless the link flaps afterwards, this should be
> fine.
>
> Another possibility is that the pciehp_device_replaced() check triggers,
> e.g. because the Endpoint's Device Serial Number or other data in Config
> Space changed after the second reset.

That can happen because we're using switches that insert a fake
"placeholder" device when a link is down.

> Maybe you can instrument the code with a few printk()'s to see what's
> going on.

But it looks like we're more frequently seeing the link not active.
Here are the existing messages printed:

[ 7904.749658] vfio-pci 0000:05:00.0: disabling bus mastering
[ 7904.756595] vfio-pci 0000:05:00.0: reset via bus
[ 7904.759975] pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation
[ 7905.908987] vfio-pci 0000:05:00.0: ready 0ms after bus reset
[ 7905.909003] pcieport 0000:02:02.0: pciehp: Slot(314): Link Down/Up ignored
[ 7906.847973] vfio-pci 0000:05:00.0: resetting
[ 7906.856312] vfio-pci 0000:05:00.0: reset via bus
[ 7906.862967] pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation
[ 7909.915925] pcieport 0000:02:02.0: Data Link Layer Link Active not set in 100 msec
[ 7909.915953] pcieport 0000:02:02.0: pciehp: Slot(314): Link Down/Up ignored
[ 7909.915977] pcieport 0000:02:02.0: pciehp: Slot(314): Link Down
[ 7909.915978] pcieport 0000:02:02.0: pciehp: Slot(314): Card not present
[ 7909.918934] pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation
[ 7911.923899] vfio-pci 0000:05:00.0: disconnected; not waiting
[ 7911.923905] vfio-pci 0000:05:00.0: bus failed with -25
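
The gaps between these messages can be read straight off the dmesg timestamps. A minimal sketch, assuming a handful of the lines above are pasted in verbatim; the helper name `deltas` is hypothetical and not part of any kernel tooling:

```python
import re

# A subset of the second reset's messages, copied verbatim from the log above.
LOG = """\
[ 7906.856312] vfio-pci 0000:05:00.0: reset via bus
[ 7906.862967] pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation
[ 7909.915925] pcieport 0000:02:02.0: Data Link Layer Link Active not set in 100 msec
[ 7909.915977] pcieport 0000:02:02.0: pciehp: Slot(314): Link Down
[ 7911.923905] vfio-pci 0000:05:00.0: bus failed with -25
"""

# Matches "[ seconds.micros] message" as printed by dmesg.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def deltas(log):
    """Return (seconds since the previous line, message) for each dmesg line."""
    out = []
    prev = None
    for line in log.splitlines():
        m = STAMP.match(line)
        if not m:
            continue  # skip anything that is not a timestamped line
        t = float(m.group(1))
        out.append((0.0 if prev is None else round(t - prev, 6), m.group(2)))
        prev = t
    return out


if __name__ == "__main__":
    for dt, msg in deltas(LOG):
        print(f"+{dt:9.6f}s  {msg}")
```

On the lines above, this puts roughly 3.05 seconds between the "waiting 100 ms for downstream link" message and "Data Link Layer Link Active not set in 100 msec".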