All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Parav Pandit <parav@nvidia.com>
Cc: "NBU-Contact-Li Rongqing (EXTERNAL)" <lirongqing@baidu.com>,
	"virtualization@lists.linux.dev" <virtualization@lists.linux.dev>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"xuanzhuo@linux.alibaba.com" <xuanzhuo@linux.alibaba.com>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	Max Gurtovoy <mgurtovoy@nvidia.com>
Subject: Re: [PATCH] Revert "virtio_pci: Support surprise removal of virtio pci device"
Date: Thu, 28 Aug 2025 05:23:17 -0400	[thread overview]
Message-ID: <20250828051435-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CY8PR12MB71956D1AB42BFA9BF19AE134DC3BA@CY8PR12MB7195.namprd12.prod.outlook.com>

On Thu, Aug 28, 2025 at 06:59:26AM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: 28 August 2025 12:04 PM
> > 
> > On Thu, Aug 28, 2025 at 06:23:02AM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: 27 August 2025 04:19 PM
> > > >
> > > > On Wed, Aug 27, 2025 at 06:21:28AM -0400, Michael S. Tsirkin wrote:
> > > > > On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote:
> > > > > > > > > If it does not, and a user pull out the working device,
> > > > > > > > > how does your patch help?
> > > > > > > > >
> > > > > > > > A driver must tell that it will not follow broken ancient
> > > > > > > > behaviour and at that
> > > > > > > point device would stop its ancient backward compatibility mode.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I don't know what is "ancient backward compatibility mode".
> > > > > > >
> > > > > > Let me explain.
> > > > > > Sadly, CSPs virtio pci device implementation is done such a way
> > > > > > that, it
> > > > works with ancient Linux kernel which does not have commit
> > > > 43bb40c5b9265.
> > > > >
> > > > >
> > > > > OK we are getting new information here.
> > > > >
> > > > > So let me summarize. There's a virtual system that pretends, to
> > > > > the guest, that device was removed by surprise removal, but
> > > > > actually device is there and is still doing DMA.
> > > > > Is that a fair summary?
> > > >
> > > Yes.
> > >
> > > > If that is the case, the thing to do would be to try and detect the
> > > > fake removal and then work with device as usual - device not doing
> > > > DMA after removal is pretty fundamental, after all.
> > > >
> > > The issue is: one can build the device to stop the DMA.
> > > There is no predictable combination for the driver and device that can work
> > for the user.
> > > For example,
> > > Device that stops the dma will not work before the commit 43bb40c5b9265.
> > > Device that continues the dma will not work with whatever new
> > implementation done in future kernels.
> > >
> > > Hence the capability negotiation would be needed so that device can stop the
> > DMA, config interrupts etc.
> > 
> > So this is a broken implementation at the pci level. We really can't fix removal
> > for this device at all, except by fixing the device. 
> The device to be told how to behave with/without commit 43bb40c5b9265.
> Not sure what you mean by 'fix the device'.
> 
> Users are running stable kernel that has commit 43bb40c5b9265 and its broken setup for them.
> 
> > Whatever works, works by
> > chance.  Feature negotiation in spec is not the way to fix that, but some work
> > arounds in the driver to skip the device are acceptable, mostly to not bother
> > with it.
> >
> Why not?
> It sounds like we need feature bit like VERSION_1 or ORDER_PLATFORM.


Because the device is out of spec (PCI spec which virtio references).

Besides the bug is not in the device, it's in the pci emulation.


> To _fix_ a stable kernel, if you have a suggestion, please suggest.
> 
> > Pls document exactly how this pci looks. Does it have an id we can use to detect
> > it?
> >
> CSPs have different device and vendor id for vnet, blk vfs.
> Is that what you mean by id?

vendor id is one way, yes. maybe a revision check, too.

> > > > For example, how about reading device control+status?
> > > >
> > > Most platforms read 0xffff on non-existing device, but not sure if this the
> > standard or well defined.
> > 
> > IIRC it's in the pci spec as a note.
> > 
> Checking.
> 
> > > > If we get all ones device has been removed If we get 0 in bus
> > > > master: device has been removed but re-inserted Anything else is a
> > > > fake removal
> > > >
> > > Bus master check may pass, right returning all 1s, even if the device is
> > removed, isn't it?
> > 
> > 
> > So we check all ones 1st, only check bus master if not all ones?
> >
> Pci subsystem typically checks the vendor and device ids, and if its not all 1s, its safe enough check.
> 
> How about a fix something like this:
> 
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -746,12 +746,16 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
>  {
>         struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
>         struct device *dev = get_device(&vp_dev->vdev.dev);
> +       u32 v;
> 
>         /*
>          * Device is marked broken on surprise removal so that virtio upper
>          * layers can abort any ongoing operation.
> +        * Make sure that device is truly removed by directly interacting
> +        * with the device (and not just depend on the slot registers).
>          */
> -       if (!pci_device_is_present(pci_dev))
> +       if (!pci_device_is_present(pci_dev) &&
> +           !pci_bus_read_dev_vendor_id(pci_dev->bus, pci_dev->devfn, &v, 0))
>                 virtio_break_device(&vp_dev->vdev);
> 
> So if the device is still there, it let it go through its usual cleanup flow.
> And post this fix, a proper implementation with callback etc that you described can be implemented.


I don't have a big problem with this, but I don't understand the
scenario now again. report_error_detected relies on dev->error_state and
bus read. error_state is set on AER reporting an error. This is
not what you described.

Does the patch actually solve the problem for you?

Also can we limit this to a specific vendor id, or something like that?


I also still like the idea of reading dev control and status, since
it always bothered me that there's a theoretical chance that device
is re-inserted and bus read will succeed. Or maybe I'm imagining it.


-- 
MST


  reply	other threads:[~2025-08-28  9:23 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-22 10:27 [PATCH] Revert "virtio_pci: Support surprise removal of virtio pci device" Li,Rongqing
2025-08-22 12:24 ` Parav Pandit
2025-08-22 13:04   ` Michael S. Tsirkin
2025-08-22 13:53     ` Parav Pandit
2025-08-22 14:02       ` Michael S. Tsirkin
2025-08-24  2:36         ` Parav Pandit
2025-08-24 14:29           ` Michael S. Tsirkin
2025-08-26 18:52             ` Parav Pandit
2025-08-27 10:21               ` Michael S. Tsirkin
2025-08-27 10:49                 ` Michael S. Tsirkin
2025-08-28  6:23                   ` Parav Pandit
2025-08-28  6:34                     ` Michael S. Tsirkin
2025-08-28  6:59                       ` Parav Pandit
2025-08-28  9:23                         ` Michael S. Tsirkin [this message]
2025-08-28 10:41                           ` Parav Pandit
  -- strict thread matches above, loose matches on Subject: below --
2025-08-22  9:17 Parav Pandit
2025-08-22 10:21 ` Michael S. Tsirkin
2025-08-22 12:22   ` Parav Pandit
2025-08-22 13:03     ` Michael S. Tsirkin
2025-08-22 13:49       ` Parav Pandit
2025-08-22 13:59         ` Michael S. Tsirkin
2025-08-24  2:36           ` Parav Pandit
2025-08-24 14:33             ` Michael S. Tsirkin
2025-08-26 18:52               ` Parav Pandit
2025-08-27 10:19                 ` Michael S. Tsirkin
2025-08-27 11:33                   ` Cornelia Huck
2025-08-28  6:24                     ` Parav Pandit
2025-08-28 12:16                       ` Cornelia Huck
2025-08-28 12:19                         ` Michael S. Tsirkin
2025-08-28 12:22                           ` Cornelia Huck
2025-08-28 12:33                             ` Parav Pandit
2025-08-28 13:00                               ` Michael S. Tsirkin
2025-08-28 13:37                                 ` Parav Pandit
2025-04-08 14:59 Parav Pandit
2025-04-08 20:15 ` Michael S. Tsirkin
2025-04-09 13:50   ` Parav Pandit
2025-04-09 16:02     ` Michael S. Tsirkin
2025-04-16  3:01       ` Parav Pandit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250828051435-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=lirongqing@baidu.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=parav@nvidia.com \
    --cc=pbonzini@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.