linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: linux-kernel@vger.kernel.org, Keith Busch <kbusch@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Parav Pandit <parav@nvidia.com>,
	virtualization@lists.linux.dev, stefanha@redhat.com,
	alok.a.tiwari@oracle.com, linux-pci@vger.kernel.org
Subject: Re: [PATCH RFC v5 1/5] pci: report surprise removal event
Date: Mon, 14 Jul 2025 08:11:04 +0200	[thread overview]
Message-ID: <aHSfeNhpocI4nmQk@wunner.de> (raw)
In-Reply-To: <fba3d235e38c1c6fcef2a30ed083ad9e25b20fa3.1752094439.git.mst@redhat.com>

On Wed, Jul 09, 2025 at 04:55:26PM -0400, Michael S. Tsirkin wrote:
> At the moment, in case of a surprise removal, the regular remove
> callback is invoked, exclusively.  This works well, because mostly, the
> cleanup would be the same.
> 
> However, there's a race: imagine device removal was initiated by a user
> action, such as driver unbind, and it in turn initiated some cleanup and
> is now waiting for an interrupt from the device. If the device is now
> surprise-removed, that never arrives and the remove callback hangs
> forever.

For PCI devices in a hotplug slot, user space can initiate "safe removal"
by writing "0" to the hotplug slot's "power" file in sysfs.

If the PCI device is yanked from the slot while safe removal is ongoing,
there is likewise no way for the driver to know that the device is
suddenly gone.  That's because pciehp_unconfigure_device() only calls
pci_dev_set_disconnected() in the surprise removal case, not for
safe removal.

The solution proposed here is thus not a complete one:  It may work
if user space initiated *driver* removal, but not if it initiated *safe*
removal of the entire device.  For virtio, that may be sufficient.

> +++ b/drivers/pci/pci.h
> @@ -553,6 +553,12 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
>  	pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
>  	pci_doe_disconnected(dev);
>  
> +	if (READ_ONCE(dev->disconnect_work_enable)) {
> +		/* Make sure work is up to date. */
> +		smp_rmb();
> +		schedule_work(&dev->disconnect_work);
> +	}
> +
>  	return 0;
>  }

Going through all the callers of pci_dev_set_disconnected(),
I suppose the (only) one you're interested in is
pciehp_unconfigure_device().

The other callers are related to runtime resume, resume from
system sleep and ACPI slots.

Instead of amending pci_dev_set_disconnected(), I'd prefer
an approach where pciehp_unconfigure_device() first marks
all devices disconnected, then wakes up some global waitqueue, e.g.:

-	if (!presence)
+	if (!presence) {
		pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
+		wake_up_all(&pci_disconnected_wq);
+	}

The benefit is that there's no delay when marking devices disconnected.
(Granted, the delay is small for smp_rmb() + schedule_work().)
And just having a global waitqueue is simpler and may be useful
for other use cases.

So instead of adding timeouts when waiting for interrupts, drivers would
be woken via the waitqueue.

But again, it's not a complete solution as it doesn't cover the
"surprise removal during safe removal" case.

I also agree with Bjorn's and Keith's comments that the driver should
use timeouts for robustness, but still wanted to provide additional
(hopefully constructive) thoughts.

Thanks!

Lukas

  parent reply	other threads:[~2025-07-14  6:11 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1752094439.git.mst@redhat.com>
2025-07-09 20:55 ` [PATCH RFC v5 1/5] pci: report surprise removal event Michael S. Tsirkin
2025-07-09 23:38   ` Bjorn Helgaas
2025-07-09 23:55     ` Keith Busch
2025-07-14  6:17       ` Michael S. Tsirkin
2025-07-14  6:26     ` Michael S. Tsirkin
2025-07-14 21:13       ` Bjorn Helgaas
2025-07-15  6:28         ` Michael S. Tsirkin
2025-07-16 22:29           ` Bjorn Helgaas
2025-07-17 15:15             ` Michael S. Tsirkin
2025-07-14  6:11   ` Lukas Wunner [this message]
2025-07-14  6:18     ` Michael S. Tsirkin
2025-07-14  6:54     ` Michael S. Tsirkin
2025-07-17 15:11     ` Michael S. Tsirkin
2025-07-17 20:12       ` Lukas Wunner
2025-07-17 23:31         ` Michael S. Tsirkin
2025-07-18  4:35           ` Lukas Wunner
2025-07-18  8:40             ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aHSfeNhpocI4nmQk@wunner.de \
    --to=lukas@wunner.de \
    --cc=alok.a.tiwari@oracle.com \
    --cc=bhelgaas@google.com \
    --cc=kbusch@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=parav@nvidia.com \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).