linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC v5 0/5] pci,virtio: report surprise removal event
@ 2025-07-09 20:55 Michael S. Tsirkin
  2025-07-09 20:55 ` [PATCH RFC v5 1/5] pci: " Michael S. Tsirkin
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2025-07-09 20:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lukas Wunner, Keith Busch, Bjorn Helgaas, Parav Pandit,
	virtualization, stefanha, alok.a.tiwari


Lukas, Keith, Bjorn, others, would very much appreciate your comments
on whether the pci core changes are acceptable.


Parav, I expect this to be integrated into your work on fixing surprise
removal. As such, I am not queueing these patches yet - please include with your
other patches fixing these issues.

==========

This is an attempt to fix the following race in virtio:

when device removal is initiated by a user
action, such as driver unbind, it in turn initiates driver cleanup and
is then waiting for an interrupt from the device. If the device is now
surprise-removed, that interrupt never arrives and the remove callback hangs
forever.

For example, this was reported for virtio-blk:

        1. the graceful removal is ongoing in the remove() callback, where disk
           deletion del_gendisk() is ongoing, which waits for the requests +to
           complete,

        2. Now few requests are yet to complete, and surprise removal started.

        At this point, virtio block driver will not get notified by the driver
        core layer, because it is likely serializing remove() happening by
        +user/driver unload and PCI hotplug driver-initiated device removal.  So
        vblk driver doesn't know that device is removed, block layer is waiting
        for requests completions to arrive which it never gets.  So
        del_gendisk() gets stuck.

We could add timeouts to handle that, but given virtio blk devices are
implemented in userspace, this makes the device flaky.

Instead, this adds pci core infrastructure, and virtio core
infrastructure, for drivers to be notified of device disconnect.

Actual cleanup in virtio-blk is still TBD.
Compile-tested only.

==========

Notes on the design:

Care was taken to avoid re-introducing the bug fixed by

commit 74ff8864cc84 ("PCI: hotplug: Allow marking devices as disconnected during bind/unbind")

To avoid taking locks on removal path, and to avoid invoking callback
with unpredictable latency, the event is reported through a WQ.

Adding APIs to enable/disable the reporting on probe/remove, helps make
sure the driver won't go away in the middle of the handling, all
without taking any locks.

The benefit is that the resulting API is harder than a callback to
misuse, adding unpredictable latencies to unplug. The WQ can simply do
the cleanup directly, taking any locks it needs for that.  The cost is
several extra bytes per device, which seems modest.


Previous discussion:
https://lore.kernel.org/all/11cfcb55b5302999b0e58b94018f92a379196698.1751136072.git.mst@redhat.com

Changes from v4:
	fix !CONFIG_PCI, reported by Stephen
Changes from v3:
	added documentation to address comments by Parav
	support in virtio core
Changes from v2:
        v2 was corrupted, fat fingers :(
Changes from v1:
         switched to a WQ, with APIs to enable/disable
         added motivation

==========


Michael S. Tsirkin (5):
  pci: report surprise removal event
  virtio: fix comments, readability
  virtio: pack config changed flags
  virtio: allow transports to suppress config change
  virtio: support device disconnect

 drivers/pci/pci.h                  |  6 ++++
 drivers/virtio/virtio.c            | 23 +++++++++++++--
 drivers/virtio/virtio_pci_common.c | 45 ++++++++++++++++++++++++++++++
 drivers/virtio/virtio_pci_common.h |  3 ++
 drivers/virtio/virtio_pci_legacy.c |  2 ++
 drivers/virtio/virtio_pci_modern.c |  2 ++
 include/linux/pci.h                | 45 ++++++++++++++++++++++++++++++
 include/linux/virtio.h             | 11 ++++++--
 include/linux/virtio_config.h      | 32 +++++++++++++++++++++
 9 files changed, 163 insertions(+), 6 deletions(-)

-- 
MST


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-07-18  8:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-09 20:55 [PATCH RFC v5 0/5] pci,virtio: report surprise removal event Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 1/5] pci: " Michael S. Tsirkin
2025-07-09 23:38   ` Bjorn Helgaas
2025-07-09 23:55     ` Keith Busch
2025-07-14  6:17       ` Michael S. Tsirkin
2025-07-14  6:26     ` Michael S. Tsirkin
2025-07-14 21:13       ` Bjorn Helgaas
2025-07-15  6:28         ` Michael S. Tsirkin
2025-07-16 22:29           ` Bjorn Helgaas
2025-07-17 15:15             ` Michael S. Tsirkin
2025-07-14  6:11   ` Lukas Wunner
2025-07-14  6:18     ` Michael S. Tsirkin
2025-07-14  6:54     ` Michael S. Tsirkin
2025-07-17 15:11     ` Michael S. Tsirkin
2025-07-17 20:12       ` Lukas Wunner
2025-07-17 23:31         ` Michael S. Tsirkin
2025-07-18  4:35           ` Lukas Wunner
2025-07-18  8:40             ` Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 2/5] virtio: fix comments, readability Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 3/5] virtio: pack config changed flags Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 4/5] virtio: allow transports to suppress config change Michael S. Tsirkin
2025-07-09 20:55 ` [PATCH RFC v5 5/5] virtio: support device disconnect Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).