From: Ben Shelton <benjamin.h.shelton@intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
bhelgaas@google.com, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] PCI: IOV: read SRIOV_NUM_VF after enabling ARI
Date: Fri, 16 Oct 2015 11:56:28 -0500
Message-ID: <20151016165627.GA52728@bhshelto-vm>
In-Reply-To: <20151015213603.GB13636@localhost>
Hi Bjorn,
> What problem does this patch solve, Ben? I assume you have devices
> that do change TotalVFs when ARI is enabled, and you do want the new
> value?
>
> Or is the problem something like the following:
>
> - ...
> - Linux PCI core sees TotalVFs = X (saved as iov->total_VFs)
> - Linux sets ARI Capable Hierarchy
> - Device changes TotalVFs to X + Y (but PCI core doesn't notice)
> - Driver reads TotalVFs and sees X + Y
> - Driver attempts pci_enable_sriov(dev, X + Y), which fails because
> sriov_enable() sees "X + Y > iov->total_VFs"
Here's a short snippet from the databook for the PCI Express controller we're
using:
"Supports two sets of VF Stride, First VF Offset, InitialVFs, and TotalVFs
registers per PF—one each for ARI and non-ARI hierarchies. Selection is
performed by host software through the ARI Capable Hierarchy bit of the Control
register in the PF0 SR-IOV capability structure."
The values in InitialVFs and TotalVFs are HWinit for each set of registers.
So the issue this is intended to fix is the following:
- Linux PCI core sees TotalVFs = X (saved as iov->total_VFs).
- Linux sets ARI Capable Hierarchy.
- Device switches to exposing the second set of registers, where
InitialVFs = TotalVFs = Y (where Y > X).
- User enables one or more VFs on the device, e.g. by writing a value to
sriov_numvfs in sysfs.
- Driver calls pci_enable_sriov() for the device, which then calls
sriov_enable(). sriov_enable() reads InitialVFs (= Y) and then checks whether
it is greater than iov->total_VFs (= X). Since Y > X, the comparison is true,
so sriov_enable() fails and returns -EIO.
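
To make the sequence above concrete, here is a minimal userspace sketch of it.
This is not kernel code: struct model_pf, model_regs(), the two model_init_*()
helpers and model_sriov_enable() are all hypothetical names invented for
illustration, and the only check modelled is the InitialVFs vs. cached TotalVFs
comparison described above. The second init helper shows one possible
ordering (cache TotalVFs only after ARI Capable Hierarchy is set) under which
the comparison no longer trips.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one InitialVFs/TotalVFs pair per hierarchy type. */
struct vf_regs {
	uint16_t initial_vfs;
	uint16_t total_vfs;
};

struct model_pf {
	struct vf_regs non_ari;      /* visible while ARI Capable Hierarchy = 0 */
	struct vf_regs ari;          /* visible while ARI Capable Hierarchy = 1 */
	bool ari_capable_hierarchy;  /* stands in for the SR-IOV Control bit */
	uint16_t cached_total_vfs;   /* plays the role of iov->total_VFs */
};

/* Return the register set the device currently exposes. */
const struct vf_regs *model_regs(const struct model_pf *pf)
{
	return pf->ari_capable_hierarchy ? &pf->ari : &pf->non_ari;
}

/* Order described in the list above: cache TotalVFs (= X), then set ARI. */
void model_init_cache_before_ari(struct model_pf *pf)
{
	pf->ari_capable_hierarchy = false;
	pf->cached_total_vfs = model_regs(pf)->total_vfs;
	pf->ari_capable_hierarchy = true;
}

/* Alternative order (an assumption about the fix): set ARI, then cache. */
void model_init_cache_after_ari(struct model_pf *pf)
{
	pf->ari_capable_hierarchy = true;
	pf->cached_total_vfs = model_regs(pf)->total_vfs;
}

/*
 * Models only the comparison discussed above; every other check a real
 * enable path would perform is deliberately omitted, so nr_virtfn is unused.
 */
int model_sriov_enable(const struct model_pf *pf, uint16_t nr_virtfn)
{
	uint16_t initial = model_regs(pf)->initial_vfs;

	(void)nr_virtfn;
	if (initial > pf->cached_total_vfs)
		return -EIO;
	return 0;
}
```

With X = 4 in the non-ARI set and Y = 8 in the ARI set (made-up values),
model_init_cache_before_ari() followed by model_sriov_enable() returns -EIO,
while model_init_cache_after_ari() followed by the same call returns 0.
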
>
> I'm a little dubious about drivers reading the SRIOV capability
> directly, so maybe this is a symptom of deeper problems.
I agree that the driver should not be reading the capability directly, but from
what I understand, the device itself is meant to act on this bit. From the PCI
SR-IOV spec revision 1.1:
"ARI Capable Hierarchy is a hint to the Device that ARI has been enabled in the
Root Port or Switch Downstream Port immediately above the Device."
Ben
>
> Bjorn