From: "Michael S. Tsirkin" <mst@redhat.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: linux-kernel@vger.kernel.org, Keith Busch <kbusch@kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>,
Parav Pandit <parav@nvidia.com>,
virtualization@lists.linux.dev, stefanha@redhat.com,
alok.a.tiwari@oracle.com, linux-pci@vger.kernel.org
Subject: Re: [PATCH RFC v5 1/5] pci: report surprise removal event
Date: Thu, 17 Jul 2025 19:31:57 -0400 [thread overview]
Message-ID: <20250717193122-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <aHlZE18kPuHuDtTT@wunner.de>
On Thu, Jul 17, 2025 at 10:12:03PM +0200, Lukas Wunner wrote:
> On Thu, Jul 17, 2025 at 11:11:44AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Jul 14, 2025 at 08:11:04AM +0200, Lukas Wunner wrote:
> > > On Wed, Jul 09, 2025 at 04:55:26PM -0400, Michael S. Tsirkin wrote:
> > > > At the moment, in case of a surprise removal, the regular remove
> > > > callback is invoked, exclusively. This works well, because mostly, the
> > > > cleanup would be the same.
> > > >
> > > > However, there's a race: imagine device removal was initiated by a user
> > > > action, such as driver unbind, and it in turn initiated some cleanup and
> > > > is now waiting for an interrupt from the device. If the device is now
> > > > surprise-removed, that never arrives and the remove callback hangs
> > > > forever.
> > >
> > > For PCI devices in a hotplug slot, user space can initiate "safe removal"
> > > by writing "0" to the hotplug slot's "power" file in sysfs.
> > >
> > > If the PCI device is yanked from the slot while safe removal is ongoing,
> > > there is likewise no way for the driver to know that the device is
> > > suddenly gone. That's because pciehp_unconfigure_device() only calls
> > > pci_dev_set_disconnected() in the surprise removal case, not for
> > > safe removal.
> > >
> > > The solution proposed here is thus not a complete one: It may work
> > > if user space initiated *driver* removal, but not if it initiated *safe*
> > > removal of the entire device. For virtio, that may be sufficient.
> >
> > So just as an idea, something like this can work I guess? I'm yet to
> > test this - wrote this on the go -
>
> Don't bother, it won't work:
>
> pciehp_handle_presence_or_link_change() is called from pciehp_ist(),
> the IRQ thread. During safe removal the IRQ thread is busy in
> pciehp_unconfigure_device() and waiting for the driver to unbind
> from devices being safe-removed.
Confused. I thought safe removal happens in the userspace thread
that wrote into sysfs?
> An IRQ thread is always single-threaded. There's no second instance
> of the IRQ thread being run when another interrupt is signaled.
> Rather, the IRQ thread is re-run when it has finished.
>
> In *theory* what would be possible is to plumb this into pciehp_isr().
> That's the hardirq handler. This one will indeed be run when an
> interrupt comes in while the IRQ thread is running. Normally the
> hardirq handler would just collect the events for later consumption
> by the IRQ thread. The hardirq handler could *theoretically* mark
> devices gone while they're being safe-removed.
>
> I'm saying "theoretically" because in reality I don't think this is
> a viable approach either: pciehp_ist() contains code to *ignore*
> link or presence changes if they were caused by a Secondary Bus Reset
> or Downstream Port Containment. In that case we do *not* want to mark
> devices disconnected because they're only *temporarily* inaccessible.
> This requires waiting for the SBR or DPC to conclude, which can take
> several seconds. We can't wait in the hardirq handler.
>
> So this cannot be solved with the current architecture of pciehp,
> at least not easily or in an elegant way. Sorry!
>
> Thanks,
>
> Lukas
next prev parent reply other threads:[~2025-07-17 23:32 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-09 20:55 [PATCH RFC v5 0/5] pci,virtio: report surprise removal event Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 1/5] pci: " Michael S. Tsirkin
2025-07-09 23:38 ` Bjorn Helgaas
2025-07-09 23:55 ` Keith Busch
2025-07-14 6:17 ` Michael S. Tsirkin
2025-07-14 6:26 ` Michael S. Tsirkin
2025-07-14 21:13 ` Bjorn Helgaas
2025-07-15 6:28 ` Michael S. Tsirkin
2025-07-16 22:29 ` Bjorn Helgaas
2025-07-17 15:15 ` Michael S. Tsirkin
2025-07-14 6:11 ` Lukas Wunner
2025-07-14 6:18 ` Michael S. Tsirkin
2025-07-14 6:54 ` Michael S. Tsirkin
2025-07-17 15:11 ` Michael S. Tsirkin
2025-07-17 20:12 ` Lukas Wunner
2025-07-17 23:31 ` Michael S. Tsirkin [this message]
2025-07-18 4:35 ` Lukas Wunner
2025-07-18 8:40 ` Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 2/5] virtio: fix comments, readability Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 3/5] virtio: pack config changed flags Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 4/5] virtio: allow transports to suppress config change Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 5/5] virtio: support device disconnect Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250717193122-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=alok.a.tiwari@oracle.com \
--cc=bhelgaas@google.com \
--cc=kbusch@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=parav@nvidia.com \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.