From: qinyuntan <qinyuntan@linux.alibaba.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Sagi Grimberg <sagi@grimberg.me>,
	linux-nvme@lists.infradead.org,
	Xunlei Pang <xlpang@linux.alibaba.com>,
	Guixin Liu <kanie@linux.alibaba.com>,
	oliver.yang@linux.alibaba.com,
	Guanghui Feng <guanghuifeng@linux.alibaba.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH V1] nvme-pci: disable SR-IOV VFs on driver unbind
Date: Fri, 30 Jan 2026 12:53:25 +0800
Message-ID: <88437e1a-2df0-41e6-a58f-dcc68d4458bc@linux.alibaba.com>
In-Reply-To: <20260127084807.GA342@lst.de>

Hi All,

Thank you all for the insightful discussion!

I agree with Leon's point that not all devices are created equal when it
comes to SR-IOV handling during driver unbind.

Looking at existing driver implementations, I found two different 
approaches:

1) mlx5 - unconditionally disables SR-IOV in remove:

    drivers/net/ethernet/mellanox/mlx5/core/main.c:
    static void remove_one(struct pci_dev *pdev)
    {
        ...
        mlx5_sriov_disable(pdev, false);
        ...
    }

    drivers/net/ethernet/mellanox/mlx5/core/sriov.c:
    void mlx5_sriov_disable(struct pci_dev *pdev, bool num_vf_change)
    {
        struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
        struct devlink *devlink = priv_to_devlink(dev);
        int num_vfs = pci_num_vf(dev->pdev);

        /* Always disable, no pci_vfs_assigned() check */
        pci_disable_sriov(pdev);
        devl_lock(devlink);
        mlx5_device_disable_sriov(dev, num_vfs, true, num_vf_change);
        devl_unlock(devlink);
    }

2) ixgbe - checks pci_vfs_assigned() and skips disable if VFs are in use:

    drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:
    static void ixgbe_remove(struct pci_dev *pdev)
    {
        ...
    #ifdef CONFIG_PCI_IOV
        ixgbe_disable_sriov(adapter);
    #endif
        ...
    }

    drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c:
    #ifdef CONFIG_PCI_IOV
        if (pci_vfs_assigned(adapter->pdev)) {
            e_dev_warn("Unloading driver while VFs are assigned - VFs will not be deallocated\n");
            return -EPERM;
        }
        pci_disable_sriov(adapter->pdev);
    #endif
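
To make the behavioral difference between the two approaches concrete,
here is a minimal userspace sketch. It is not kernel code: struct mock_pf
and mock_disable_sriov() are hypothetical stand-ins for struct pci_dev,
pci_num_vf(), pci_vfs_assigned() and pci_disable_sriov(), and the
function names are made up for illustration only.

```c
/*
 * Userspace mock of the two remove-time policies discussed above.
 * Everything here is hypothetical and only models the decision logic;
 * it does not call any real PCI core API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_pf {
    int num_vfs;       /* VFs currently enabled (models pci_num_vf()) */
    int vfs_assigned;  /* VFs assigned to guests (models pci_vfs_assigned()) */
    bool warned;       /* set when the remove path only emitted a warning */
};

/* Models pci_disable_sriov(): all VFs go away, assigned or not. */
void mock_disable_sriov(struct mock_pf *pf)
{
    pf->num_vfs = 0;
    pf->vfs_assigned = 0;  /* any guest using a VF loses it */
}

/* Approach 1 (mlx5-like): disable SR-IOV unconditionally. */
void remove_unconditional(struct mock_pf *pf)
{
    assert(pf != NULL);
    mock_disable_sriov(pf);
}

/* Approach 2 (ixgbe-like, and the proposed nvme change): skip the
 * disable and warn if any VF is still assigned to a guest. */
void remove_checked(struct mock_pf *pf)
{
    assert(pf != NULL);
    if (pf->num_vfs == 0)
        return;                /* nothing to clean up */
    if (pf->vfs_assigned > 0) {
        pf->warned = true;     /* VFs stay behind without a PF driver */
        return;
    }
    mock_disable_sriov(pf);
}
```

With 4 VFs enabled and 2 of them assigned, remove_unconditional() leaves
no VFs, while remove_checked() leaves all 4 in place and only records the
warning. That second outcome is the case the warning level matters for.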

Regarding the warning level discussion: I would prefer keeping it as
dev_warn() rather than downgrading to dev_info(). As Leon mentioned,
some devices do require SR-IOV to be disabled when the PF is unbound,
and for those cases, this warning is important for operators to notice
and take action. A warning level helps ensure it doesn't get lost in
normal system logs.

Please let me know how you'd like to proceed.

Thanks,
Qinyun

On 1/27/26 4:48 PM, Christoph Hellwig wrote:
> On Tue, Jan 27, 2026 at 03:33:44PM +0800, Qinyun Tan wrote:
>> The NVMe PCI driver exports the sriov_configure callback via
>> pci_sriov_configure_simple(), which allows userspace to enable SR-IOV
>> VFs through sysfs. However, when the PF driver is unbound, the driver
>> does not disable SR-IOV, leaving VFs orphaned in the system.
> 
> That sounds dangerous.
> 
>> According to Documentation/PCI/pci-iov-howto.rst, PCI drivers that
>> support SR-IOV should call pci_disable_sriov() in their remove callback
>> to properly clean up VFs before the driver is unloaded.
> 
> Bjorn and other PCI folks: is there any reason to not do this in
> the PCI code and leave a landmine for the drivers?
> 
>> Fix this by disabling SR-IOV in nvme_remove(). If VFs are not assigned
>> to a guest, disable SR-IOV. If VFs are still assigned, emit a warning
>> since forcibly disabling would disrupt the guest.
> 
> Well, I think we have to disrupt it, at least for hot unplug.  This
> sounds like we need some better handling in the core code as well.
> 
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 58f3097888a7..4f2dc13de48b 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -3666,6 +3666,15 @@ static void nvme_remove(struct pci_dev *pdev)
>>   	nvme_stop_ctrl(&dev->ctrl);
>>   	nvme_remove_namespaces(&dev->ctrl);
>>   	nvme_dev_disable(dev, true);
>> +
>> +	if (pci_num_vf(pdev)) {
>> +		if (pci_vfs_assigned(pdev))
>> +			dev_warn(&pdev->dev,
>> +				 "WARNING: Removing PF while VFs are assigned - VFs will not be deallocated!\n");
>> +		else
>> +			pci_disable_sriov(pdev);
>> +	}
>> +
>>   	nvme_free_host_mem(dev);
>>   	nvme_dev_remove_admin(dev);
>>   	nvme_dbbuf_dma_free(dev);
>> -- 
>> 2.43.5
> ---end quoted text---

