From: Don Dutile <ddutile@redhat.com>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org, yuvalmin@broadcom.com,
bhutchings@solarflare.com, gregory.v.rose@intel.com,
yinghai@kernel.org, davem@davemloft.net
Subject: Re: [PATCH 4/4] PCI,sriov: provide method to reduce the number of total VFs supported
Date: Mon, 12 Nov 2012 11:33:39 -0500
Message-ID: <50A124E3.3050901@redhat.com>
In-Reply-To: <CAErSpo7gj7p1=o1GJtaa0_ALO-6M3+AfdMmR1mQc9f6T3T_z3Q@mail.gmail.com>
On 11/10/2012 04:33 PM, Bjorn Helgaas wrote:
> On Mon, Nov 5, 2012 at 1:20 PM, Donald Dutile <ddutile@redhat.com> wrote:
>> Some implementations of SRIOV provide a capability structure
>> value of TotalVFs that is greater than what the software can support.
>> Provide a method to reduce the capability structure reported value
>> to the value the driver can support.
>> This ensures sysfs reports the current capability of the system,
>> hardware and software.
>> Example for its use: igb & ixgbe -- report 8 & 64 as TotalVFs,
>> but drivers only support 7 & 63 maximum.
>>
>> Signed-off-by: Donald Dutile <ddutile@redhat.com>
>
> I don't really understand the purpose of pci_sriov_set_totalvfs(). I
> think a driver should enforce its limit at the point where it enables
> the VFs. I think the driver should do that to be defensive regardless
> of whether we add pci_sriov_set_totalvfs().
>
I received feedback from the driver folks that putting this check
into the core reduces the dependency on drivers doing the right check
at the right time. It's similar to the argument that the core, and not
the driver(s), ought to call pci_enable_sriov()/pci_disable_sriov().
> So is this just to make the driver's limit visible to user-space? How
Yes.
> is it better than having the user specify the number he'd like, and
> having the driver reduce that if necessary? The user will be able to
> read sriov_numvfs to learn how many the driver enabled, right?
>
Most drivers don't enable-what-can-be-enabled; either the request succeeds
for the number of VFs specified, or the driver tears down all the VFs
configured up to the point of failure and returns failure, i.e., all or
nothing. I would tend to agree with this logic, since SRIOV resources
(BARs, MSI, etc.) are architected as n-resources/VF * num_vfs_enabled.
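
To make the all-or-nothing behavior concrete, here is a minimal userspace
sketch, not code from any actual driver. The setup/teardown hooks and the
toy_backend structure are hypothetical names invented for illustration; the
only point is the rollback on first failure.

```c
#include <stdbool.h>

/* Hypothetical per-VF hooks; a real driver's setup is far more involved. */
typedef bool (*vf_setup_fn)(unsigned int vf_index, void *ctx);
typedef void (*vf_teardown_fn)(unsigned int vf_index, void *ctx);

/*
 * Try to bring up num_vfs VFs. On the first failure, tear down every VF
 * configured so far (in reverse order) and report failure.
 * Returns num_vfs on success, -1 on failure with nothing left enabled.
 */
static int enable_vfs_all_or_nothing(unsigned int num_vfs,
                                     vf_setup_fn setup,
                                     vf_teardown_fn teardown,
                                     void *ctx)
{
    unsigned int i;

    for (i = 0; i < num_vfs; i++) {
        if (!setup(i, ctx)) {
            while (i > 0)
                teardown(--i, ctx);
            return -1;
        }
    }
    return (int)num_vfs;
}

/* Toy backend (an assumption for the example): only `limit` VFs fit. */
struct toy_backend {
    unsigned int limit;
    unsigned int active;
};

static bool toy_setup(unsigned int vf_index, void *ctx)
{
    struct toy_backend *b = ctx;

    if (vf_index >= b->limit)
        return false;
    b->active++;
    return true;
}

static void toy_teardown(unsigned int vf_index, void *ctx)
{
    struct toy_backend *b = ctx;

    (void)vf_index;
    b->active--;
}
```

With a backend limited to 7 VFs, a request for 8 fails and leaves 0 VFs
active, while a request for 7 succeeds in full.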
The primary purpose of the patch set is to drive SRIOV/VF enablement
from userspace. To simplify userspace, it's best to present to it
what *can* be enabled. Right now, that means reading totalvfs and checking
numvfs. Compared with having userspace do 'trial & error' ('try this number,
nope, ok, try this number'...), having totalvfs report the number of VFs
that can actually be enabled seems more predictable.
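
As a rough sketch of what that looks like from the userspace side: read
sriov_totalvfs and sriov_numvfs first, and only then write the request. The
attribute names are the ones discussed in this thread; the helper names and
the idea of passing the device's sysfs directory as a string are assumptions
made for the example, and error handling is reduced to 0 / -1.

```c
#include <stdio.h>

/* Build "<dev_dir>/<attr>" into buf; returns 0 on success, -1 if too long. */
static int attr_path(char *buf, size_t len, const char *dev_dir,
                     const char *attr)
{
    int n = snprintf(buf, len, "%s/%s", dev_dir, attr);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read an unsigned decimal value from <dev_dir>/<attr>. */
static int read_attr_uint(const char *dev_dir, const char *attr,
                          unsigned int *val)
{
    char path[512];
    FILE *f;
    int ok;

    if (attr_path(path, sizeof(path), dev_dir, attr))
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%u", val) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Write an unsigned decimal value to <dev_dir>/<attr>. */
static int write_attr_uint(const char *dev_dir, const char *attr,
                           unsigned int val)
{
    char path[512];
    FILE *f;
    int ok;

    if (attr_path(path, sizeof(path), dev_dir, attr))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = (fprintf(f, "%u\n", val) > 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Request `wanted` VFs without trial and error: refuse up front if the
 * request exceeds sriov_totalvfs or if VFs are already enabled.
 */
static int request_vfs(const char *dev_dir, unsigned int wanted)
{
    unsigned int total, cur;

    if (read_attr_uint(dev_dir, "sriov_totalvfs", &total) ||
        read_attr_uint(dev_dir, "sriov_numvfs", &cur))
        return -1;
    if (cur != 0 || wanted == 0 || wanted > total)
        return -1;
    return write_attr_uint(dev_dir, "sriov_numvfs", wanted);
}
```

This only works as intended if sriov_totalvfs already reflects the driver's
limit; otherwise the write can still fail and userspace is back to guessing.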
> If we allow sriov_totalvfs to contain a different number than the
> SR-IOV capability has (as seen via "lspci"), then we have to explain
> to users why they might be different.
>
The difference already has to be explained, since this state already
exists today, and on two of the first SRIOV devices on the market.
Now, if we want to handle this info with a quirk vs. providing a driver
interface, we can change that part of the design. The interface lets the
limit (policy) change with the driver, vs. having to do both a driver
change & a quirk change. But we know the difference exists from day-1, so
it should be handled as cleanly as possible wrt userspace tools from day-1.
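
For reference, the driver-side policy amounts to the following, shown as a
standalone userspace model of the set/get logic in the quoted patch rather
than kernel code. The struct sriov_model and the model_* names are
hypothetical; the field meanings follow the patch (total, drvttl, and whether
VFs are currently enabled).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the relevant fields of struct pci_sriov. */
struct sriov_model {
    uint16_t total;      /* TotalVFs from the SR-IOV capability */
    uint16_t drvttl;     /* driver-imposed limit; 0 means "not set" */
    bool vfs_enabled;    /* models the VF Enable bit being set */
};

/* Lower the reported total; refuse if out of range or VFs are enabled. */
static int model_set_totalvfs(struct sriov_model *s, uint16_t numvfs)
{
    if (!s || numvfs > s->total)
        return -EINVAL;
    if (s->vfs_enabled)
        return -EBUSY;
    s->drvttl = numvfs;
    return 0;
}

/* Report the driver's limit if one was set, else the capability value. */
static int model_get_totalvfs(const struct sriov_model *s)
{
    if (!s)
        return -EINVAL;
    return s->drvttl ? s->drvttl : s->total;
}
```

Using the igb numbers from the patch description: with total = 8, a call to
set the limit to 7 makes the getter report 7, which is the value sysfs would
then present to userspace.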
> I'm playing devil's advocate a bit here because I really don't know
> that much about SR-IOV or what the administrative interfaces look
> like.
>
>> drivers/pci/iov.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>> drivers/pci/pci-sysfs.c | 4 ++--
>> drivers/pci/pci.h | 1 +
>> include/linux/pci.h | 10 ++++++++++
>> 4 files changed, 61 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index aeccc91..3b4a905 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -735,3 +735,51 @@ int pci_num_vf(struct pci_dev *dev)
>> return dev->sriov->nr_virtfn;
>> }
>> EXPORT_SYMBOL_GPL(pci_num_vf);
>> +
>> +/**
>> + * pci_sriov_set_totalvfs -- reduce the TotalVFs available
>> + * @dev: the PCI PF device
>> + * @numvfs: number that should be used for TotalVFs supported
>> + *
>> + * Should be called from PF driver's probe routine with
>> + * device's mutex held.
>> + *
>> + * Returns 0 if PF is an SRIOV-capable device and
>> + * value of numvfs valid. If not a PF with VFs, return -EINVAL;
>> + * if VFs already enabled, return -EBUSY.
>> + */
>> +int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
>> +{
>> + if (!dev || !dev->is_physfn || (numvfs > dev->sriov->total))
>> + return -EINVAL;
>> +
>> + /* Shouldn't change if VFs already enabled */
>> + if (dev->sriov->ctrl & PCI_SRIOV_CTRL_VFE)
>> + return -EBUSY;
>> + else
>> + dev->sriov->drvttl = numvfs;
>> +
>> + return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(pci_sriov_set_totalvfs);
>> +
>> +/**
>> + * pci_sriov_get_totalvfs -- get total VFs supported on this device
>> + * @dev: the PCI PF device
>> + *
>> + * For a PCIe device with SRIOV support, return the PCIe
>> + * SRIOV capability value of TotalVFs or the value of drvttl
>> + * if the driver reduced it. Otherwise, -EINVAL.
>> + */
>> +int pci_sriov_get_totalvfs(struct pci_dev *dev)
>> +{
>> + if (!dev || !dev->is_physfn)
>> + return -EINVAL;
>> +
>> + if (dev->sriov->drvttl)
>> + return dev->sriov->drvttl;
>> + else
>> + return dev->sriov->total;
>> +}
>> +EXPORT_SYMBOL_GPL(pci_sriov_get_totalvfs);
>> +
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index cbcdd8d..e9c967f 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -413,7 +413,7 @@ static ssize_t sriov_totalvfs_show(struct device *dev,
>> u16 total;
>>
>> pdev = to_pci_dev(dev);
>> - total = pdev->sriov->total;
>> + total = pci_sriov_get_totalvfs(pdev);
>> return sprintf(buf, "%u\n", total);
>> }
>>
>> @@ -459,7 +459,7 @@ static ssize_t sriov_numvfs_store(struct device *dev,
>> }
>>
>> /* if enabling vf's ... */
>> - total = pdev->sriov->total;
>> + total = pci_sriov_get_totalvfs(pdev);
>> /* Requested VFs to enable < totalvfs and none enabled already */
>> if ((num_vfs > 0) && (num_vfs <= total)) {
>> if (pdev->sriov->nr_virtfn == 0) {
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index 6f6cd14..553bbba 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -240,6 +240,7 @@ struct pci_sriov {
>> u16 stride; /* following VF stride */
>> u32 pgsz; /* page size for BAR alignment */
>> u8 link; /* Function Dependency Link */
>> + u16 drvttl; /* max num VFs driver supports */
>> struct pci_dev *dev; /* lowest numbered PF */
>> struct pci_dev *self; /* this PF */
>> struct mutex lock; /* lock for VF bus */
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 7ef8fba..1ad8249 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -1611,6 +1611,8 @@ extern int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
>> extern void pci_disable_sriov(struct pci_dev *dev);
>> extern irqreturn_t pci_sriov_migration(struct pci_dev *dev);
>> extern int pci_num_vf(struct pci_dev *dev);
>> +extern int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
>> +extern int pci_sriov_get_totalvfs(struct pci_dev *dev);
>> #else
>> static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
>> {
>> @@ -1627,6 +1629,14 @@ static inline int pci_num_vf(struct pci_dev *dev)
>> {
>> return 0;
>> }
>> +static inline int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
>> +{
>> + return 0;
>> +}
>> +static inline int pci_sriov_get_totalvfs(struct pci_dev *dev)
>> +{
>> + return 0;
>> +}
>> #endif
>>
>> #if defined(CONFIG_HOTPLUG_PCI) || defined(CONFIG_HOTPLUG_PCI_MODULE)
>> --
>> 1.7.10.2.552.gaa3bb87
>>