From: "Michael S. Tsirkin" <mst@redhat.com>
To: Parav Pandit <parav@nvidia.com>
Cc: virtualization@lists.linux.dev, jasowang@redhat.com,
stefanha@redhat.com, pbonzini@redhat.com,
xuanzhuo@linux.alibaba.com, stable@vger.kernel.org,
mgurtovoy@nvidia.com, lirongqing@baidu.com
Subject: Re: [PATCH] Revert "virtio_pci: Support surprise removal of virtio pci device"
Date: Fri, 22 Aug 2025 06:21:52 -0400 [thread overview]
Message-ID: <20250822060839-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20250822091706.21170-1-parav@nvidia.com>
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
> This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
>
> Virtio drivers and PCI devices have never fully supported true
> surprise (aka hot unplug) removal. Drivers historically continued
> processing and waiting for pending I/O and even continued synchronous
> device reset during surprise removal. Devices have also continued
> completing I/Os, doing DMA and allowing device reset after surprise
> removal to support such drivers.
>
> Supporting it correctly would require a new device capability
If a device is removed, it is removed. Windows drivers supported
this since forever and it's just a Linux bug that it does not
handle all the cases. This is not something you can handle
with a capability.
> and
> driver negotiation in the virtio specification to safely stop
> I/O and free queue memory. Failure to do so either breaks all the
> existing drivers with call trace listed in the commit or crashes the
> host on continuing the DMA.
If the device is gone, then DMA does not continue.
IIUC what is going on for you, is that you have developed a surprise
removal emulation that pretends to remove the device but
actually the device is doing DMA. So of course things break then.
> Hence, until such specification and devices
> are invented, restore the previous behavior of treating surprise
> removal as graceful removal to avoid regressions and maintain system
> stability same as before the
> commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
>
> As explained above, previous analysis of solving this only in driver
> was incomplete and non-reliable at [1] and at [2]; Hence reverting commit
> 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device")
> is still the best stand to restore failures of virtio net and
> block devices.
>
> [1] https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C638DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t
I can only repeat what I said then, this is not how we do kernel
development.
> [2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nvidia.com/
What was missing here, is handling corner cases. So let us please
try to handle them.
Here is how I would try to do it:
- add a new driver callback
- start a periodic timer task in virtio core on remove
- in the timer, probe that the device is still present.
if not, invoke a driver callback
- cancel the task on device reset
If you do not have the time, let me know and I will try to look into it.
> Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device")
> Cc: stable@vger.kernel.org
> Reported-by: lirongqing@baidu.com
> Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@baidu.com/
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> drivers/virtio/virtio_pci_common.c | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index d6d79af44569..dba5eb2eaff9 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
> struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
> struct device *dev = get_device(&vp_dev->vdev.dev);
>
> - /*
> - * Device is marked broken on surprise removal so that virtio upper
> - * layers can abort any ongoing operation.
> - */
> - if (!pci_device_is_present(pci_dev))
> - virtio_break_device(&vp_dev->vdev);
> -
> pci_disable_sriov(pci_dev);
>
> unregister_virtio_device(&vp_dev->vdev);
> --
> 2.26.2
next prev parent reply other threads:[~2025-08-22 10:21 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-22 9:17 [PATCH] Revert "virtio_pci: Support surprise removal of virtio pci device" Parav Pandit
2025-08-22 10:21 ` Michael S. Tsirkin [this message]
2025-08-22 12:22 ` Parav Pandit
2025-08-22 13:03 ` Michael S. Tsirkin
2025-08-22 13:49 ` Parav Pandit
2025-08-22 13:59 ` Michael S. Tsirkin
2025-08-24 2:36 ` Parav Pandit
2025-08-24 14:33 ` Michael S. Tsirkin
2025-08-26 18:52 ` Parav Pandit
2025-08-27 10:19 ` Michael S. Tsirkin
2025-08-27 11:33 ` Cornelia Huck
2025-08-28 6:24 ` Parav Pandit
2025-08-28 12:16 ` Cornelia Huck
2025-08-28 12:19 ` Michael S. Tsirkin
2025-08-28 12:22 ` Cornelia Huck
2025-08-28 12:33 ` Parav Pandit
2025-08-28 13:00 ` Michael S. Tsirkin
2025-08-28 13:37 ` Parav Pandit
-- strict thread matches above, loose matches on Subject: below --
2025-08-22 10:27 Li,Rongqing
2025-08-22 12:24 ` Parav Pandit
2025-08-22 13:04 ` Michael S. Tsirkin
2025-08-22 13:53 ` Parav Pandit
2025-08-22 14:02 ` Michael S. Tsirkin
2025-08-24 2:36 ` Parav Pandit
2025-08-24 14:29 ` Michael S. Tsirkin
2025-08-26 18:52 ` Parav Pandit
2025-08-27 10:21 ` Michael S. Tsirkin
2025-08-27 10:49 ` Michael S. Tsirkin
2025-08-28 6:23 ` Parav Pandit
2025-08-28 6:34 ` Michael S. Tsirkin
2025-08-28 6:59 ` Parav Pandit
2025-08-28 9:23 ` Michael S. Tsirkin
2025-08-28 10:41 ` Parav Pandit
2025-04-08 14:59 Parav Pandit
2025-04-08 20:15 ` Michael S. Tsirkin
2025-04-09 13:50 ` Parav Pandit
2025-04-09 16:02 ` Michael S. Tsirkin
2025-04-16 3:01 ` Parav Pandit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250822060839-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=jasowang@redhat.com \
--cc=lirongqing@baidu.com \
--cc=mgurtovoy@nvidia.com \
--cc=parav@nvidia.com \
--cc=pbonzini@redhat.com \
--cc=stable@vger.kernel.org \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).