linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: linux-kernel@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, Parav Pandit <parav@nvidia.com>,
	virtualization@lists.linux.dev, stefanha@redhat.com,
	alok.a.tiwari@oracle.com
Subject: Re: [PATCH RFC] pci: report surprise removal events
Date: Sun, 29 Jun 2025 13:28:08 -0400	[thread overview]
Message-ID: <20250629132113-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <aGFBW7wet9V4WENC@wunner.de>

On Sun, Jun 29, 2025 at 03:36:27PM +0200, Lukas Wunner wrote:
> On Sat, Jun 28, 2025 at 02:58:49PM -0400, Michael S. Tsirkin wrote:
> > At the moment, in case of a surprise removal, the regular
> > remove callback is invoked, exclusively.
> > This works well, because mostly, the cleanup would be the same.
> > 
> > However, there's a race: imagine device removal was initiated by a user
> > action, such as driver unbind, and it in turn initiated some cleanup and
> > is now waiting for an interrupt from the device. If the device is now
> > surprise-removed, that never arrives and the remove callback hangs
> > forever.
> > 
> > Drivers can artificially add timeouts to handle that, but it can be
> > flaky.
> > 
> > Instead, let's add a way for the driver to be notified about the
> > disconnect. It can then do any necessary cleanup, knowing that the
> > device is inactive.
> [...]
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -549,6 +549,15 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
> >  	pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
> >  	pci_doe_disconnected(dev);
> >  
> > +	/* Notify driver of surprise removal */
> > +	device_lock(&dev->dev);
> > +
> > +	if (dev->driver && dev->driver->err_handler &&
> > +	    dev->driver->err_handler->disconnect)
> > +		dev->driver->err_handler->disconnect(dev);
> > +
> > +	device_unlock(&dev->dev);
> > +
> >  	return 0;
> >  }

thanks for the feedback. Would appreciate a couple more hints:

> No, that's not good:
> 
> 1/ The device_lock() will reintroduce the issues solved by 74ff8864cc84.

I see. What other way is there to prevent dev->driver from going away,
though? I guess I can add a new spinlock and take it both here and when
dev->driver changes? Acceptable?

> 2/ pci_dev_set_disconnected() needs to be fast so that devices are marked
>    unplugged as quickly as possible.  We want to minimize the time window
>    where MMIO and Config Space reads already return "all ones" and writes
>    go to nirvana, but pci_dev_is_disconnected() still returns false.
>    Hence invoking some driver callback which may take arbitrarily long or
>    even sleeps is not an option.

Well, there's no plan to do that there - just to wake up some wq so
things can be completed. I can add code comments.

> The driver is already notified of removal through invocation of the
> ->remove() callback.  The use case you're describing is arguably
> a corner case.  I do think that a timeout is a better approach
> than the one proposed here.  How long does it take for the interrupt
> to arrive?

It's a virtual device - kind of unpredictable.

>  If it's not just a few msec, consider polling the device
> and breaking out of the pool loop as soon as pci_dev_is_disconnected()
> returns true (or the MMIO read returns PCI_POSSIBLE_ERROR()).

Yes but with no callback, we don't know when to do it.
The config reads in pci_dev_is_disconnected are also expensive
on VMs...

> If/when respinning, please explain the use case in more detail,
> i.e. which driver, which device, pointers to code...
> 
> Thanks!
> 
> Lukas


It's virtio-blk. 


  reply	other threads:[~2025-06-29 17:28 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-28 18:58 [PATCH RFC] pci: report surprise removal events Michael S. Tsirkin
2025-06-29 13:36 ` Lukas Wunner
2025-06-29 17:28   ` Michael S. Tsirkin [this message]
2025-06-29 23:39     ` Keith Busch
2025-06-30  4:07       ` Parav Pandit
2025-06-30 13:44         ` Keith Busch
2025-06-30 13:52           ` Parav Pandit
2025-06-30 16:57             ` Keith Busch
2025-06-30 17:25               ` Michael S. Tsirkin
2025-06-30  7:17       ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250629132113-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alok.a.tiwari@oracle.com \
    --cc=bhelgaas@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=parav@nvidia.com \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).