From: Leon Romanovsky <leon@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Qinyun Tan <qinyuntan@linux.alibaba.com>,
Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org,
Xunlei Pang <xlpang@linux.alibaba.com>,
Guixin Liu <kanie@linux.alibaba.com>,
oliver.yang@linux.alibaba.com,
Guanghui Feng <guanghuifeng@linux.alibaba.com>,
Bjorn Helgaas <bhelgaas@google.com>,
linux-pci@vger.kernel.org
Subject: Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
Date: Tue, 27 Jan 2026 16:31:43 +0200 [thread overview]
Message-ID: <20260127143143.GW13967@unreal> (raw)
In-Reply-To: <20260127084807.GA342@lst.de>
On Tue, Jan 27, 2026 at 09:48:07AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
> > The NVMe PCI driver exports the sriov_configure callback via
> > pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
> > VFs through sysfs. However, when the PF driver is unbound, the driver
> > does not disable SR-IOV, leaving VFs orphaned in the system.
>
> That sounds dangerous.
It is not. In a real SR-IOV device, VFs are created by the hardware and
are independent of their PF. There are several use cases where an
operator unbinds the PF and reuses it to improve overall device
utilization.
We have already discussed this in the context of Rust.
https://lore.kernel.org/all/20251122185701.GZ18335@unreal/
>
> > According to Documentation/PCI/pci-iov-howto.rst, PCI drivers that
> > support SR-IOV should call pci_disable_sriov() in their remove callback
> > to properly clean up VFs before the driver is unloaded.
I could not find that claim in Documentation/PCI/pci-iov-howto.rst.
Can you point to the specific sentence that supports it?
>
> Bjorn and other PCI folks: is there any reason to not do this in
> the PCI code and leave a landmine for the drivers?
It will break a lot of real users.
>
> > Fix this by disabling SR-IOV in nvme_remove(). If VFs are not assigned
> > to a guest, disable SR-IOV. If VFs are still assigned, emit a warning
> > since forcibly disabling would disrupt the guest.
>
> Well, I think we have to distrupt it, at least for hot unplug. This
> sounds like we need some better handling in the core code as well.
As mentioned earlier, there are valid users of this functionality relying on
legitimate devices that operate correctly regardless of whether the PF is
bound.
>
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index 58f3097888a7..4f2dc13de48b 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -3666,6 +3666,15 @@ static void nvme_remove(struct pci_dev *pdev)
> > nvme_stop_ctrl(&dev->ctrl);
> > nvme_remove_namespaces(&dev->ctrl);
> > nvme_dev_disable(dev, true);
> > +
> > + if (pci_num_vf(pdev)) {
> > + if (pci_vfs_assigned(pdev))
> > + dev_warn(&pdev->dev,
> > + "WARNING: Removing PF while VFs are assigned - VFs will not be deallocated!\n");
> > + else
> > + pci_disable_sriov(pdev);
> > + }
> > +
> > nvme_free_host_mem(dev);
> > nvme_dev_remove_admin(dev);
> > nvme_dbbuf_dma_free(dev);
> > --
> > 2.43.5
> ---end quoted text---
>
next prev parent reply other threads:[~2026-01-27 14:31 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-27 7:33 [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind Qinyun Tan
2026-01-27 8:48 ` Christoph Hellwig
2026-01-27 14:31 ` Leon Romanovsky [this message]
2026-01-27 16:06 ` Keith Busch
2026-01-27 18:00 ` Leon Romanovsky
2026-01-27 23:09 ` Bjorn Helgaas
2026-01-27 23:43 ` Jakub Kicinski
2026-01-28 8:44 ` Leon Romanovsky
2026-01-30 4:53 ` qinyuntan
2026-02-06 22:28 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260127143143.GW13967@unreal \
--to=leon@kernel.org \
--cc=axboe@kernel.dk \
--cc=bhelgaas@google.com \
--cc=guanghuifeng@linux.alibaba.com \
--cc=hch@lst.de \
--cc=kanie@linux.alibaba.com \
--cc=kbusch@kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-pci@vger.kernel.org \
--cc=oliver.yang@linux.alibaba.com \
--cc=qinyuntan@linux.alibaba.com \
--cc=sagi@grimberg.me \
--cc=xlpang@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox