linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Shelton <benjamin.h.shelton@intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
	bhelgaas@google.com, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] PCI: IOV: read SRIOV_NUM_VF after enabling ARI
Date: Fri, 16 Oct 2015 11:56:28 -0500	[thread overview]
Message-ID: <20151016165627.GA52728@bhshelto-vm> (raw)
In-Reply-To: <20151015213603.GB13636@localhost>

Hi Bjorn,

> What problem does this patch solve, Ben?  I assume you have devices
> that do change TotalVFs when ARI is enabled, and you do want the new
> value?
> 
> Or is the problem something like the following:
> 
>   - ...
>   - Linux PCI core sees TotalVFs = X (saved as iov->total_VFs)
>   - Linux sets ARI Capable Hierarchy
>   - Device changes TotalVFs to X + Y (but PCI core doesn't notice)
>   - Driver reads TotalVFs and sees X + Y
>   - Driver attempts pci_enable_sriov(dev, X + Y), which fails because
>     sriov_enable() sees "X + Y > iov->total_VFs"

Here's a short snippet from the databook for the PCI Express controller we're
using:

"Supports two sets of VF Stride, First VF Offset, InitialVFs, and TotalVFs
registers per PF—one each for ARI and non-ARI hierarchies. Selection is
performed by host software through the ARI Capable Hierarchy bit of the Control
register in the PF0 SR-IOV capability structure."

The values in InitialVFs and TotalVFs are HWinit for each set of registers.

So the issue this is intended to fix is the following:

- Linux PCI core sees TotalVFs = X (saved as iov->total_VFs).
- Linux sets ARI Capable Hierarchy.
- Device switches to exposing the second set of registers, where
  InitialVFs = TotalVFs = Y (where Y > X).
- User enables one or more VFs on the device, e.g. by writing a value to
  sriov_numvfs in the sysfs.
- Driver calls pci_enable_sriov() for the device, which then calls
  sriov_enable().  sriov_enable() reads InitialVFs (= Y) and then checks if it's
  greater than iov->total_VFs (= X).  Since Y > X, the comparison is true, so
  sriov_enable() fails out and returns -EIO.

> 
> I'm a little dubious about drivers reading the SRIOV capability
> directly, so maybe this is a symptom of deeper problems.

I agree that the driver should not be reading the capability directly, but from
what I understand, it's intended for the device itself to do this.  From the PCI
SR-IOV spec revision 1.1:

"ARI Capable Hierarchy is a hint to the Device that ARI has been enabled in the
Root Port or Switch Downstream Port immediately above the Device."

Ben

> 
> Bjorn

  parent reply	other threads:[~2015-10-16 16:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-08 15:20 [PATCH v2] PCI: IOV: read SRIOV_NUM_VF after enabling ARI Ben Shelton
2015-10-15 17:58 ` Bjorn Helgaas
2015-10-15 20:00   ` Alexander Duyck
2015-10-15 21:36     ` Bjorn Helgaas
2015-10-15 22:14       ` Alexander Duyck
2015-10-16 16:56       ` Ben Shelton [this message]
2015-10-16 18:07         ` Bjorn Helgaas
2015-10-15 19:31 ` Bjorn Helgaas
2015-10-21 20:52   ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151016165627.GA52728@bhshelto-vm \
    --to=benjamin.h.shelton@intel.com \
    --cc=alexander.duyck@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).