public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
@ 2026-01-27  7:33 Qinyun Tan
  2026-01-27  8:48 ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Qinyun Tan @ 2026-01-27  7:33 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: linux-nvme, Xunlei Pang, Guixin Liu, oliver.yang, Guanghui Feng,
	Qinyun Tan

The NVMe PCI driver exports the sriov_configure callback via
pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
VFs through sysfs. However, when the PF driver is unbound, the driver
does not disable SR-IOV, leaving VFs orphaned in the system.

This causes the PCI core to emit a warning in pci_iov_remove():
  "driver left SR-IOV enabled after remove"

According to Documentation/PCI/pci-iov-howto.rst, PCI drivers that
support SR-IOV should call pci_disable_sriov() in their remove callback
to properly clean up VFs before the driver is unloaded.

Fix this by disabling SR-IOV in nvme_remove(). If VFs are not assigned
to a guest, disable SR-IOV. If VFs are still assigned, emit a warning
since forcibly disabling would disrupt the guest.

Signed-off-by: Qinyun Tan <qinyuntan@linux.alibaba.com>
Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>
Reviewed-by: Guixin Liu <kanie@linux.alibaba.com>
---
 drivers/nvme/host/pci.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 58f3097888a7..4f2dc13de48b 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3666,6 +3666,15 @@ static void nvme_remove(struct pci_dev *pdev)
 	nvme_stop_ctrl(&dev->ctrl);
 	nvme_remove_namespaces(&dev->ctrl);
 	nvme_dev_disable(dev, true);
+
+	if (pci_num_vf(pdev)) {
+		if (pci_vfs_assigned(pdev))
+			dev_warn(&pdev->dev,
+				 "WARNING: Removing PF while VFs are assigned - VFs will not be deallocated!\n");
+		else
+			pci_disable_sriov(pdev);
+	}
+
 	nvme_free_host_mem(dev);
 	nvme_dev_remove_admin(dev);
 	nvme_dbbuf_dma_free(dev);
-- 
2.43.5



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27  7:33 [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind Qinyun Tan
@ 2026-01-27  8:48 ` Christoph Hellwig
  2026-01-27 14:31   ` Leon Romanovsky
  2026-01-30  4:53   ` qinyuntan
  0 siblings, 2 replies; 10+ messages in thread
From: Christoph Hellwig @ 2026-01-27  8:48 UTC (permalink / raw)
  To: Qinyun Tan
  Cc: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	linux-nvme, Xunlei Pang, Guixin Liu, oliver.yang, Guanghui Feng,
	Bjorn Helgaas, linux-pci

On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
> The NVMe PCI driver exports the sriov_configure callback via
> pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
> VFs through sysfs. However, when the PF driver is unbound, the driver
> does not disable SR-IOV, leaving VFs orphaned in the system.

That sounds dangerous.

> According to Documentation/PCI/pci-iov-howto.rst, PCI drivers that
> support SR-IOV should call pci_disable_sriov() in their remove callback
> to properly clean up VFs before the driver is unloaded.

Bjorn and other PCI folks: is there any reason to not do this in
the PCI code and leave a landmine for the drivers?

> Fix this by disabling SR-IOV in nvme_remove(). If VFs are not assigned
> to a guest, disable SR-IOV. If VFs are still assigned, emit a warning
> since forcibly disabling would disrupt the guest.

Well, I think we have to distrupt it, at least for hot unplug.  This
sounds like we need some better handling in the core code as well.

> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 58f3097888a7..4f2dc13de48b 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3666,6 +3666,15 @@ static void nvme_remove(struct pci_dev *pdev)
>  	nvme_stop_ctrl(&dev->ctrl);
>  	nvme_remove_namespaces(&dev->ctrl);
>  	nvme_dev_disable(dev, true);
> +
> +	if (pci_num_vf(pdev)) {
> +		if (pci_vfs_assigned(pdev))
> +			dev_warn(&pdev->dev,
> +				 "WARNING: Removing PF while VFs are assigned - VFs will not be deallocated!\n");
> +		else
> +			pci_disable_sriov(pdev);
> +	}
> +
>  	nvme_free_host_mem(dev);
>  	nvme_dev_remove_admin(dev);
>  	nvme_dbbuf_dma_free(dev);
> -- 
> 2.43.5
---end quoted text---


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27  8:48 ` Christoph Hellwig
@ 2026-01-27 14:31   ` Leon Romanovsky
  2026-01-27 16:06     ` Keith Busch
  2026-01-30  4:53   ` qinyuntan
  1 sibling, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2026-01-27 14:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Qinyun Tan, Keith Busch, Jens Axboe, Sagi Grimberg, linux-nvme,
	Xunlei Pang, Guixin Liu, oliver.yang, Guanghui Feng,
	Bjorn Helgaas, linux-pci

On Tue, Jan 27, 2026 at 09:48:07AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
> > The NVMe PCI driver exports the sriov_configure callback via
> > pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
> > VFs through sysfs. However, when the PF driver is unbound, the driver
> > does not disable SR-IOV, leaving VFs orphaned in the system.
> 
> That sounds dangerous.

It is not. In a real SR-IOV device, VFs are created by the hardware and
are independent of their PF. There are several use cases where an
operator unbinds the PF and reuses it to improve overall device
utilization.

We have already discussed this in the context of Rust.
https://lore.kernel.org/all/20251122185701.GZ18335@unreal/

> 
> > According to Documentation/PCI/pci-iov-howto.rst, PCI drivers that
> > support SR-IOV should call pci_disable_sriov() in their remove callback
> > to properly clean up VFs before the driver is unloaded.

I could not find that claim in Documentation/PCI/pci-iov-howto.rst.  
Can you point to the specific sentence that supports it?

> 
> Bjorn and other PCI folks: is there any reason to not do this in
> the PCI code and leave a landmine for the drivers?

It will break a lot of real users.

> 
> > Fix this by disabling SR-IOV in nvme_remove(). If VFs are not assigned
> > to a guest, disable SR-IOV. If VFs are still assigned, emit a warning
> > since forcibly disabling would disrupt the guest.
> 
> Well, I think we have to distrupt it, at least for hot unplug.  This
> sounds like we need some better handling in the core code as well.

As mentioned earlier, there are valid users of this functionality relying on
legitimate devices that operate correctly regardless of whether the PF is
bound.

> 
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index 58f3097888a7..4f2dc13de48b 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -3666,6 +3666,15 @@ static void nvme_remove(struct pci_dev *pdev)
> >  	nvme_stop_ctrl(&dev->ctrl);
> >  	nvme_remove_namespaces(&dev->ctrl);
> >  	nvme_dev_disable(dev, true);
> > +
> > +	if (pci_num_vf(pdev)) {
> > +		if (pci_vfs_assigned(pdev))
> > +			dev_warn(&pdev->dev,
> > +				 "WARNING: Removing PF while VFs are assigned - VFs will not be deallocated!\n");
> > +		else
> > +			pci_disable_sriov(pdev);
> > +	}
> > +
> >  	nvme_free_host_mem(dev);
> >  	nvme_dev_remove_admin(dev);
> >  	nvme_dbbuf_dma_free(dev);
> > -- 
> > 2.43.5
> ---end quoted text---
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27 14:31   ` Leon Romanovsky
@ 2026-01-27 16:06     ` Keith Busch
  2026-01-27 18:00       ` Leon Romanovsky
  0 siblings, 1 reply; 10+ messages in thread
From: Keith Busch @ 2026-01-27 16:06 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Christoph Hellwig, Qinyun Tan, Jens Axboe, Sagi Grimberg,
	linux-nvme, Xunlei Pang, Guixin Liu, oliver.yang, Guanghui Feng,
	Bjorn Helgaas, linux-pci

On Tue, Jan 27, 2026 at 04:31:43PM +0200, Leon Romanovsky wrote:
> On Tue, Jan 27, 2026 at 09:48:07AM +0100, Christoph Hellwig wrote:
> > On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
> > > The NVMe PCI driver exports the sriov_configure callback via
> > > pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
> > > VFs through sysfs. However, when the PF driver is unbound, the driver
> > > does not disable SR-IOV, leaving VFs orphaned in the system.
> > 
> > That sounds dangerous.
> 
> It is not. In a real SR-IOV device, VFs are created by the hardware and
> are independent of their PF. There are several use cases where an
> operator unbinds the PF and reuses it to improve overall device
> utilization.

If this is expected, should the warn message "driver left SR-IOV enabled
after remove" be downgraded to 'info' level?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27 16:06     ` Keith Busch
@ 2026-01-27 18:00       ` Leon Romanovsky
  2026-01-27 23:09         ` Bjorn Helgaas
  0 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2026-01-27 18:00 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Qinyun Tan, Jens Axboe, Sagi Grimberg,
	linux-nvme, Xunlei Pang, Guixin Liu, oliver.yang, Guanghui Feng,
	Bjorn Helgaas, linux-pci



On Tue, Jan 27, 2026, at 18:06, Keith Busch wrote:
> On Tue, Jan 27, 2026 at 04:31:43PM +0200, Leon Romanovsky wrote:
>> On Tue, Jan 27, 2026 at 09:48:07AM +0100, Christoph Hellwig wrote:
>> > On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
>> > > The NVMe PCI driver exports the sriov_configure callback via
>> > > pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
>> > > VFs through sysfs. However, when the PF driver is unbound, the driver
>> > > does not disable SR-IOV, leaving VFs orphaned in the system.
>> > 
>> > That sounds dangerous.
>> 
>> It is not. In a real SR-IOV device, VFs are created by the hardware and
>> are independent of their PF. There are several use cases where an
>> operator unbinds the PF and reuses it to improve overall device
>> utilization.
>
> If this is expected, should the warn message "driver left SR-IOV enabled
> after remove" be downgraded to 'info' level?

It is not important, no one complained about it. People who unbind PF, simply ignore this warning.

BTW, the use case which I presented is for SR-IOV handled by drivers. Maybe VFs created by NVMe are different here and they must be destroyed.

Thanks 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27 18:00       ` Leon Romanovsky
@ 2026-01-27 23:09         ` Bjorn Helgaas
  2026-01-27 23:43           ` Jakub Kicinski
  2026-01-28  8:44           ` Leon Romanovsky
  0 siblings, 2 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2026-01-27 23:09 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Keith Busch, Christoph Hellwig, Qinyun Tan, Jens Axboe,
	Sagi Grimberg, linux-nvme, Xunlei Pang, Guixin Liu, oliver.yang,
	Guanghui Feng, Bjorn Helgaas, linux-pci, Jakub Kicinski

[+cc Jakub, author of 38972375ef7b ("PCI/IOV: Reset total_VFs limit
after detaching PF driver"), which added the message]

On Tue, Jan 27, 2026 at 08:00:42PM +0200, Leon Romanovsky wrote:
> On Tue, Jan 27, 2026, at 18:06, Keith Busch wrote:
> > On Tue, Jan 27, 2026 at 04:31:43PM +0200, Leon Romanovsky wrote:
> >> On Tue, Jan 27, 2026 at 09:48:07AM +0100, Christoph Hellwig wrote:
> >> > On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
> >> > > The NVMe PCI driver exports the sriov_configure callback via
> >> > > pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
> >> > > VFs through sysfs. However, when the PF driver is unbound, the driver
> >> > > does not disable SR-IOV, leaving VFs orphaned in the system.
> >> > 
> >> > That sounds dangerous.
> >> 
> >> It is not. In a real SR-IOV device, VFs are created by the hardware and
> >> are independent of their PF. There are several use cases where an
> >> operator unbinds the PF and reuses it to improve overall device
> >> utilization.
> >
> > If this is expected, should the warn message "driver left SR-IOV
> > enabled after remove" be downgraded to 'info' level?
> 
> It is not important, no one complained about it. People who unbind
> PF, simply ignore this warning.
> 
> BTW, the use case which I presented is for SR-IOV handled by
> drivers. Maybe VFs created by NVMe are different here and they must
> be destroyed.

If it's to be expected, I do think 'info' would be more appropriate,
if nothing else as an indication to code readers that nothing is
wrong.  Or maybe even no message at all.  Or maybe the fact that we
reset the driver_max_VFs value is of more interest.

Bjorn


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27 23:09         ` Bjorn Helgaas
@ 2026-01-27 23:43           ` Jakub Kicinski
  2026-01-28  8:44           ` Leon Romanovsky
  1 sibling, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2026-01-27 23:43 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Leon Romanovsky, Keith Busch, Christoph Hellwig, Qinyun Tan,
	Jens Axboe, Sagi Grimberg, linux-nvme, Xunlei Pang, Guixin Liu,
	oliver.yang, Guanghui Feng, Bjorn Helgaas, linux-pci

On Tue, 27 Jan 2026 17:09:12 -0600 Bjorn Helgaas wrote:
> > > If this is expected, should the warn message "driver left SR-IOV
> > > enabled after remove" be downgraded to 'info' level?  
> > 
> > It is not important, no one complained about it. People who unbind
> > PF, simply ignore this warning.
> > 
> > BTW, the use case which I presented is for SR-IOV handled by
> > drivers. Maybe VFs created by NVMe are different here and they must
> > be destroyed.  
> 
> If it's to be expected, I do think 'info' would be more appropriate,
> if nothing else as an indication to code readers that nothing is
> wrong.  Or maybe even no message at all.  Or maybe the fact that we
> reset the driver_max_VFs value is of more interest.

FWIW I probably just added that message because most drivers end up
printing something along those lines. Tho, I strongly suspect it's
a mindless copy/paste in majority of the cases. No preference here.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27 23:09         ` Bjorn Helgaas
  2026-01-27 23:43           ` Jakub Kicinski
@ 2026-01-28  8:44           ` Leon Romanovsky
  1 sibling, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2026-01-28  8:44 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Keith Busch, Christoph Hellwig, Qinyun Tan, Jens Axboe,
	Sagi Grimberg, linux-nvme, Xunlei Pang, Guixin Liu, oliver.yang,
	Guanghui Feng, Bjorn Helgaas, linux-pci, Jakub Kicinski

On Tue, Jan 27, 2026 at 05:09:12PM -0600, Bjorn Helgaas wrote:
> [+cc Jakub, author of 38972375ef7b ("PCI/IOV: Reset total_VFs limit
> after detaching PF driver"), which added the message]
> 
> On Tue, Jan 27, 2026 at 08:00:42PM +0200, Leon Romanovsky wrote:
> > On Tue, Jan 27, 2026, at 18:06, Keith Busch wrote:
> > > On Tue, Jan 27, 2026 at 04:31:43PM +0200, Leon Romanovsky wrote:
> > >> On Tue, Jan 27, 2026 at 09:48:07AM +0100, Christoph Hellwig wrote:
> > >> > On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
> > >> > > The NVMe PCI driver exports the sriov_configure callback via
> > >> > > pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
> > >> > > VFs through sysfs. However, when the PF driver is unbound, the driver
> > >> > > does not disable SR-IOV, leaving VFs orphaned in the system.
> > >> > 
> > >> > That sounds dangerous.
> > >> 
> > >> It is not. In a real SR-IOV device, VFs are created by the hardware and
> > >> are independent of their PF. There are several use cases where an
> > >> operator unbinds the PF and reuses it to improve overall device
> > >> utilization.
> > >
> > > If this is expected, should the warn message "driver left SR-IOV
> > > enabled after remove" be downgraded to 'info' level?
> > 
> > It is not important, no one complained about it. People who unbind
> > PF, simply ignore this warning.
> > 
> > BTW, the use case which I presented is for SR-IOV handled by
> > drivers. Maybe VFs created by NVMe are different here and they must
> > be destroyed.
> 
> If it's to be expected, I do think 'info' would be more appropriate,
> if nothing else as an indication to code readers that nothing is
> wrong.  Or maybe even no message at all.

The issue is that not all devices are created equal. Some devices from major
vendor rely on messages sent from the VF to the PF, and expect the PF driver
to perform certain configuration steps in response. These devices must disable
SR-IOV when the PF is unbound.

From one side, for users of such hardware, this message is important.
From another, for many other devices, this message is not important.

Like Jakub, I don't have any preference here.

Thanks


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-27  8:48 ` Christoph Hellwig
  2026-01-27 14:31   ` Leon Romanovsky
@ 2026-01-30  4:53   ` qinyuntan
  2026-02-06 22:28     ` Bjorn Helgaas
  1 sibling, 1 reply; 10+ messages in thread
From: qinyuntan @ 2026-01-30  4:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Jens Axboe, Sagi Grimberg, linux-nvme, Xunlei Pang,
	Guixin Liu, oliver.yang, Guanghui Feng, Bjorn Helgaas, linux-pci

Hi All,

Thank you all for the insightful discussion!

I agree with Leon's point that not all devices are created equal when it
comes to SR-IOV handling during driver unbind.

Looking at existing driver implementations, I found two different 
approaches:

1) mlx5 - unconditionally disables SR-IOV in remove:

    drivers/net/ethernet/mellanox/mlx5/core/main.c:
    static void remove_one(struct pci_dev *pdev)
    {
        ...
        mlx5_sriov_disable(pdev, false);
        ...
    }

    drivers/net/ethernet/mellanox/mlx5/core/sriov.c:
    void mlx5_sriov_disable(struct pci_dev *pdev, bool num_vf_change)
    {
        struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
        struct devlink *devlink = priv_to_devlink(dev);
        int num_vfs = pci_num_vf(dev->pdev);

        pci_disable_sriov(pdev);  /* Always disable, no 
pci_vfs_assigned() check */
        devl_lock(devlink);
        mlx5_device_disable_sriov(dev, num_vfs, true, num_vf_change);
        devl_unlock(devlink);
    }

2) ixgbe - checks pci_vfs_assigned() and skips disable if VFs are in use:

    drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:
    static void ixgbe_remove(struct pci_dev *pdev)
    {
        ...
    #ifdef CONFIG_PCI_IOV
        ixgbe_disable_sriov(adapter);
    #endif
        ...
    }

    drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c:
    #ifdef CONFIG_PCI_IOV
        if (pci_vfs_assigned(adapter->pdev)) {
            e_dev_warn("Unloading driver while VFs are assigned - VFs 
will not be deallocated\n");
            return -EPERM;
        }
        pci_disable_sriov(adapter->pdev);
    #endif

Regarding the warning level discussion: I would prefer keeping it as
dev_warn() rather than downgrading to dev_info(). As Leon mentioned,
some devices do require SR-IOV to be disabled when the PF is unbound,
and for those cases, this warning is important for operators to notice
and take action. A warning level helps ensure it doesn't get lost in
normal system logs.

Please let me know how you'd like to proceed.

Thanks,
Qinyun

On 1/27/26 4:48 PM, Christoph Hellwig wrote:
> On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
>> The NVMe PCI driver exports the sriov_configure callback via
>> pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
>> VFs through sysfs. However, when the PF driver is unbound, the driver
>> does not disable SR-IOV, leaving VFs orphaned in the system.
> 
> That sounds dangerous.
> 
>> According to Documentation/PCI/pci-iov-howto.rst, PCI drivers that
>> support SR-IOV should call pci_disable_sriov() in their remove callback
>> to properly clean up VFs before the driver is unloaded.
> 
> Bjorn and other PCI folks: is there any reason to not do this in
> the PCI code and leave a landmine for the drivers?
> 
>> Fix this by disabling SR-IOV in nvme_remove(). If VFs are not assigned
>> to a guest, disable SR-IOV. If VFs are still assigned, emit a warning
>> since forcibly disabling would disrupt the guest.
> 
> Well, I think we have to distrupt it, at least for hot unplug.  This
> sounds like we need some better handling in the core code as well.
> 
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 58f3097888a7..4f2dc13de48b 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -3666,6 +3666,15 @@ static void nvme_remove(struct pci_dev *pdev)
>>   	nvme_stop_ctrl(&dev->ctrl);
>>   	nvme_remove_namespaces(&dev->ctrl);
>>   	nvme_dev_disable(dev, true);
>> +
>> +	if (pci_num_vf(pdev)) {
>> +		if (pci_vfs_assigned(pdev))
>> +			dev_warn(&pdev->dev,
>> +				 "WARNING: Removing PF while VFs are assigned - VFs will not be deallocated!\n");
>> +		else
>> +			pci_disable_sriov(pdev);
>> +	}
>> +
>>   	nvme_free_host_mem(dev);
>>   	nvme_dev_remove_admin(dev);
>>   	nvme_dbbuf_dma_free(dev);
>> -- 
>> 2.43.5
> ---end quoted text---



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
  2026-01-30  4:53   ` qinyuntan
@ 2026-02-06 22:28     ` Bjorn Helgaas
  0 siblings, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2026-02-06 22:28 UTC (permalink / raw)
  To: qinyuntan
  Cc: Christoph Hellwig, Keith Busch, Jens Axboe, Sagi Grimberg,
	linux-nvme, Xunlei Pang, Guixin Liu, oliver.yang, Guanghui Feng,
	Bjorn Helgaas, linux-pci

On Fri, Jan 30, 2026 at 12:53:25PM +0800, qinyuntan wrote:
> Hi All,
> 
> Thank you all for the insightful discussion!
> 
> I agree with Leon's point that not all devices are created equal when it
> comes to SR-IOV handling during driver unbind.
> 
> Looking at existing driver implementations, I found two different
> approaches:
> 
> 1) mlx5 - unconditionally disables SR-IOV in remove:
> 
>    drivers/net/ethernet/mellanox/mlx5/core/main.c:
>    static void remove_one(struct pci_dev *pdev)
>    {
>        ...
>        mlx5_sriov_disable(pdev, false);
>        ...
>    }
> 
>    drivers/net/ethernet/mellanox/mlx5/core/sriov.c:
>    void mlx5_sriov_disable(struct pci_dev *pdev, bool num_vf_change)
>    {
>        struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
>        struct devlink *devlink = priv_to_devlink(dev);
>        int num_vfs = pci_num_vf(dev->pdev);
> 
>        pci_disable_sriov(pdev);  /* Always disable, no pci_vfs_assigned()
> check */
>        devl_lock(devlink);
>        mlx5_device_disable_sriov(dev, num_vfs, true, num_vf_change);
>        devl_unlock(devlink);
>    }
> 
> 2) ixgbe - checks pci_vfs_assigned() and skips disable if VFs are in use:
> 
>    drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:
>    static void ixgbe_remove(struct pci_dev *pdev)
>    {
>        ...
>    #ifdef CONFIG_PCI_IOV
>        ixgbe_disable_sriov(adapter);
>    #endif
>        ...
>    }
> 
>    drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c:
>    #ifdef CONFIG_PCI_IOV
>        if (pci_vfs_assigned(adapter->pdev)) {
>            e_dev_warn("Unloading driver while VFs are assigned - VFs will
> not be deallocated\n");
>            return -EPERM;
>        }
>        pci_disable_sriov(adapter->pdev);
>    #endif
> 
> Regarding the warning level discussion: I would prefer keeping it as
> dev_warn() rather than downgrading to dev_info(). As Leon mentioned,
> some devices do require SR-IOV to be disabled when the PF is unbound,
> and for those cases, this warning is important for operators to notice
> and take action. A warning level helps ensure it doesn't get lost in
> normal system logs.
> 
> Please let me know how you'd like to proceed.

"driver left SR-IOV enabled after remove\n" is KERN_WARNING today, and
I'm OK with leaving it that way.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-02-06 22:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-27  7:33 [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind Qinyun Tan
2026-01-27  8:48 ` Christoph Hellwig
2026-01-27 14:31   ` Leon Romanovsky
2026-01-27 16:06     ` Keith Busch
2026-01-27 18:00       ` Leon Romanovsky
2026-01-27 23:09         ` Bjorn Helgaas
2026-01-27 23:43           ` Jakub Kicinski
2026-01-28  8:44           ` Leon Romanovsky
2026-01-30  4:53   ` qinyuntan
2026-02-06 22:28     ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox