From: "Michael S. Tsirkin" <mst@redhat.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: linux-kernel@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
linux-pci@vger.kernel.org, Parav Pandit <parav@nvidia.com>,
virtualization@lists.linux.dev, stefanha@redhat.com,
alok.a.tiwari@oracle.com
Subject: Re: [PATCH RFC] pci: report surprise removal events
Date: Sun, 29 Jun 2025 13:28:08 -0400 [thread overview]
Message-ID: <20250629132113-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <aGFBW7wet9V4WENC@wunner.de>
On Sun, Jun 29, 2025 at 03:36:27PM +0200, Lukas Wunner wrote:
> On Sat, Jun 28, 2025 at 02:58:49PM -0400, Michael S. Tsirkin wrote:
> > At the moment, in case of a surprise removal, the regular
> > remove callback is invoked, exclusively.
> > This works well, because mostly, the cleanup would be the same.
> >
> > However, there's a race: imagine device removal was initiated by a user
> > action, such as driver unbind, and it in turn initiated some cleanup and
> > is now waiting for an interrupt from the device. If the device is now
> > surprise-removed, that never arrives and the remove callback hangs
> > forever.
> >
> > Drivers can artificially add timeouts to handle that, but it can be
> > flaky.
> >
> > Instead, let's add a way for the driver to be notified about the
> > disconnect. It can then do any necessary cleanup, knowing that the
> > device is inactive.
> [...]
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -549,6 +549,15 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
> > pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
> > pci_doe_disconnected(dev);
> >
> > + /* Notify driver of surprise removal */
> > + device_lock(&dev->dev);
> > +
> > + if (dev->driver && dev->driver->err_handler &&
> > + dev->driver->err_handler->disconnect)
> > + dev->driver->err_handler->disconnect(dev);
> > +
> > + device_unlock(&dev->dev);
> > +
> > return 0;
> > }
thanks for the feedback. Would appreciate a couple more hints:
> No, that's not good:
>
> 1/ The device_lock() will reintroduce the issues solved by 74ff8864cc84.
I see. What other way is there to prevent dev->driver from going away,
though? I guess I can add a new spinlock and take it both here and when
dev->driver changes? Acceptable?
> 2/ pci_dev_set_disconnected() needs to be fast so that devices are marked
> unplugged as quickly as possible. We want to minimize the time window
> where MMIO and Config Space reads already return "all ones" and writes
> go to nirvana, but pci_dev_is_disconnected() still returns false.
> Hence invoking some driver callback which may take arbitrarily long or
> even sleeps is not an option.
Well, there's no plan to do that there - just to wake up some wq so
things can be completed. I can add code comments.
> The driver is already notified of removal through invocation of the
> ->remove() callback. The use case you're describing is arguably
> a corner case. I do think that a timeout is a better approach
> than the one proposed here. How long does it take for the interrupt
> to arrive?
It's a virtual device - kind of unpredictable.
> If it's not just a few msec, consider polling the device
> and breaking out of the pool loop as soon as pci_dev_is_disconnected()
> returns true (or the MMIO read returns PCI_POSSIBLE_ERROR()).
Yes but with no callback, we don't know when to do it.
The config reads in pci_dev_is_disconnected are also expensive
on VMs...
> If/when respinning, please explain the use case in more detail,
> i.e. which driver, which device, pointers to code...
>
> Thanks!
>
> Lukas
It's virtio-blk.
next prev parent reply other threads:[~2025-06-29 17:28 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-28 18:58 [PATCH RFC] pci: report surprise removal events Michael S. Tsirkin
2025-06-29 13:36 ` Lukas Wunner
2025-06-29 17:28 ` Michael S. Tsirkin [this message]
2025-06-29 23:39 ` Keith Busch
2025-06-30 4:07 ` Parav Pandit
2025-06-30 13:44 ` Keith Busch
2025-06-30 13:52 ` Parav Pandit
2025-06-30 16:57 ` Keith Busch
2025-06-30 17:25 ` Michael S. Tsirkin
2025-06-30 7:17 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250629132113-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=alok.a.tiwari@oracle.com \
--cc=bhelgaas@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=parav@nvidia.com \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).