From: Bjorn Helgaas <bhelgaas@google.com>
To: Wei Yang <weiyang@linux.vnet.ibm.com>
Cc: benh@au1.ibm.com, linux-pci@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, gwshan@linux.vnet.ibm.com,
Donald Dutile <ddutile@redhat.com>,
Myron Stowe <myron.stowe@redhat.com>
Subject: Re: [PATCH V9 03/18] PCI: Add weak pcibios_iov_resource_size() interface
Date: Tue, 18 Nov 2014 18:12:43 -0700 [thread overview]
Message-ID: <20141119011243.GA23467@google.com> (raw)
In-Reply-To: <1414942894-17034-4-git-send-email-weiyang@linux.vnet.ibm.com>
[+cc Don, Myron]
Hi Wei,
On Sun, Nov 02, 2014 at 11:41:19PM +0800, Wei Yang wrote:
> When retrieving VF IOV BAR in virtfn_add(), it will divide the total PF's IOV
> BAR size with the totalVF number. This is true for most cases, while may not
> be correct on some specific platform.
>
> For example on PowerNV platform, in order to fix PF's IOV BAR into a hardware
> alignment, the PF's IOV BAR size would be expended. This means the original
> method couldn't work.
>
> This patch introduces a weak pcibios_iov_resource_size() interface, which
> gives platform a chance to implement specific method to calculate the VF BAR
> resource size.
>
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> ---
> drivers/pci/iov.c | 27 +++++++++++++++++++++++++--
> include/linux/pci.h | 5 +++++
> 2 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 4d1685d..6866830 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -61,6 +61,30 @@ static void virtfn_remove_bus(struct pci_bus *physbus, struct pci_bus *virtbus)
> pci_remove_bus(virtbus);
> }
>
> +resource_size_t __weak pcibios_iov_resource_size(struct pci_dev *dev, int resno)
> +{
> + return 0;
> +}
> +
> +resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> +{
> + resource_size_t size;
> + struct pci_sriov *iov;
> +
> + if (!dev->is_physfn)
> + return 0;
> +
> + size = pcibios_iov_resource_size(dev, resno);
> + if (size != 0)
> + return size;
> +
> + iov = dev->sriov;
> + size = resource_size(dev->resource + resno);
> + do_div(size, iov->total_VFs);
> +
> + return size;
> +}
> +
> static int virtfn_add(struct pci_dev *dev, int id, int reset)
> {
> int i;
> @@ -96,8 +120,7 @@ static int virtfn_add(struct pci_dev *dev, int id, int reset)
> continue;
> virtfn->resource[i].name = pci_name(virtfn);
> virtfn->resource[i].flags = res->flags;
> - size = resource_size(res);
> - do_div(size, iov->total_VFs);
> + size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
> virtfn->resource[i].start = res->start + size * id;
Can you help me understand this?
We have previously called sriov_init() on the PF. There, we sized the VF
BARs, which are in the PF's SR-IOV Capability (SR-IOV spec sec 3.3.14).
The size we discover is the amount of space required by a single VF, so
sriov_init() adjusts PF->resource[PCI_IOV_RESOURCES + i] by multiplying
that size by PCI_SRIOV_TOTAL_VF, so this PF resource is now big enough to
hold the VF BAR[i] areas for all the possible VFs.
Now we're in virtfn_add(), setting up a new VF. The usual BARs at config
space 0x10, etc., are read-only zeroes for a VF, so we don't size them the
usual way. Instead, we carve out a slice of the
PF->resource[PCI_IOV_RESOURCES + i] area.
I thought the starting address of the VF BARn memory aperture was
prescribed by the spec in sec 2.1.1.1 and shown in figure 2-1:
BARx VFy starting address = VF BARx + (y - 1) * (VF BARx aperture size)
That's basically what the existing code does. We don't save the VF BARx
aperture size, so we recompute it by undoing the multiplication we did in
sriov_init().
But you're computing the starting address using a different VF BARx
aperture size. How does that work? I assumed this calculation was built
into the PCI device, but obviously I'm missing something.
To make it concrete, here's a made-up example:
PF SR-IOV Capability
TotalVFs = 4
NumVFs = 4
System Page Size = 4KB
VF BAR0 = [mem 0x00000000-0x00000fff] (4KB at address 0)
PF pci_dev->resource[7] = [mem 0x00000000-0x00003fff] (16KB)
VF1 pci_dev->resource[0] = [mem 0x00000000-0x00000fff]
VF2 pci_dev->resource[0] = [mem 0x00001000-0x00001fff]
VF3 pci_dev->resource[0] = [mem 0x00002000-0x00002fff]
VF4 pci_dev->resource[0] = [mem 0x00003000-0x00003fff]
You're changing this so we might use 16KB as the VF BAR0 aperture size
instead of 4KB, which would result in the following:
VF1 pci_dev->resource[0] = [mem 0x00000000-0x00003fff]
VF2 pci_dev->resource[0] = [mem 0x00004000-0x00007fff]
VF3 pci_dev->resource[0] = [mem 0x00008000-0x0000bfff]
VF4 pci_dev->resource[0] = [mem 0x0000c000-0x0000ffff]
But you didn't change sriov_init(), so the PF resource[7] is only 16KB, not
the 64KB the VFs need. And I assume the VF address decoder is wired to
claim the 4KB regions at 0x0000, 0x1000, 0x2000, 0x3000, not the ones at
0x0000, 0x4000, 0x8000, 0xc000.
Bjorn
> virtfn->resource[i].end = virtfn->resource[i].start + size - 1;
> rc = request_resource(res, &virtfn->resource[i]);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index bbf8058..2f5b454 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1162,6 +1162,8 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus,
> resource_size_t pcibios_iov_resource_alignment(struct pci_dev *dev,
> int resno,
> resource_size_t align);
> +resource_size_t pcibios_iov_resource_size(struct pci_dev *dev,
> + int resno);
>
> #define PCI_VGA_STATE_CHANGE_BRIDGE (1 << 0)
> #define PCI_VGA_STATE_CHANGE_DECODES (1 << 1)
> @@ -1666,6 +1668,7 @@ int pci_num_vf(struct pci_dev *dev);
> int pci_vfs_assigned(struct pci_dev *dev);
> int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
> int pci_sriov_get_totalvfs(struct pci_dev *dev);
> +resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno);
> #else
> static inline int pci_iov_virtfn_bus(struct pci_dev *dev, int id)
> {
> @@ -1685,6 +1688,8 @@ static inline int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
> { return 0; }
> static inline int pci_sriov_get_totalvfs(struct pci_dev *dev)
> { return 0; }
> +static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> +{ return 0; }
> #endif
>
> #if defined(CONFIG_HOTPLUG_PCI) || defined(CONFIG_HOTPLUG_PCI_MODULE)
> --
> 1.7.9.5
>
next prev parent reply other threads:[~2014-11-19 1:12 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-02 15:41 [PATCH V9 00/18] Enable SRIOV on PowerNV Wei Yang
2014-11-02 15:41 ` [PATCH V9 01/18] PCI/IOV: Export interface for retrieve VF's BDF Wei Yang
2014-11-19 23:35 ` Bjorn Helgaas
2014-11-02 15:41 ` [PATCH V9 02/18] PCI: Add weak pcibios_iov_resource_alignment() interface Wei Yang
2014-11-02 15:41 ` [PATCH V9 03/18] PCI: Add weak pcibios_iov_resource_size() interface Wei Yang
2014-11-19 1:12 ` Bjorn Helgaas [this message]
2014-11-19 2:15 ` Benjamin Herrenschmidt
2014-11-19 3:21 ` Wei Yang
2014-11-19 4:26 ` Bjorn Helgaas
2014-11-19 9:27 ` Wei Yang
2014-11-19 17:23 ` Bjorn Helgaas
2014-11-19 20:51 ` Benjamin Herrenschmidt
2014-11-20 5:40 ` Wei Yang
2014-11-20 5:39 ` Wei Yang
2014-11-02 15:41 ` [PATCH V9 04/18] PCI: Take additional PF's IOV BAR alignment in sizing and assigning Wei Yang
2014-11-02 15:41 ` [PATCH V9 05/18] powerpc/pci: Add PCI resource alignment documentation Wei Yang
2014-11-02 15:41 ` [PATCH V9 06/18] powerpc/pci: Don't unset pci resources for VFs Wei Yang
2014-11-02 15:41 ` [PATCH V9 07/18] powerpc/pci: Define pcibios_disable_device() on powerpc Wei Yang
2014-11-02 15:41 ` [PATCH V9 08/18] powrepc/pci: Refactor pci_dn Wei Yang
2014-11-19 23:30 ` Bjorn Helgaas
2014-11-20 1:02 ` Gavin Shan
2014-11-20 7:25 ` Wei Yang
2014-11-20 7:20 ` Wei Yang
2014-11-20 19:05 ` Bjorn Helgaas
2014-11-21 0:04 ` Gavin Shan
2014-11-25 9:28 ` Wei Yang
2014-11-21 1:46 ` Wei Yang
2014-11-02 15:41 ` [PATCH V9 09/18] powerpc/pci: remove pci_dn->pcidev field Wei Yang
2014-11-02 15:41 ` [PATCH V9 10/18] powerpc/powernv: Use pci_dn in PCI config accessor Wei Yang
2014-11-02 15:41 ` [PATCH V9 11/18] powerpc/powernv: Allocate pe->iommu_table dynamically Wei Yang
2014-11-02 15:41 ` [PATCH V9 12/18] powerpc/powernv: Expand VF resources according to the number of total_pe Wei Yang
2014-11-02 15:41 ` [PATCH V9 13/18] powerpc/powernv: Implement pcibios_iov_resource_alignment() on powernv Wei Yang
2014-11-02 15:41 ` [PATCH V9 14/18] powerpc/powernv: Implement pcibios_iov_resource_size() " Wei Yang
2014-11-02 15:41 ` [PATCH V9 15/18] powerpc/powernv: Shift VF resource with an offset Wei Yang
2014-11-02 15:41 ` [PATCH V9 16/18] powerpc/powernv: Allocate VF PE Wei Yang
2014-11-02 15:41 ` [PATCH V9 17/18] powerpc/powernv: Expanding IOV BAR, with m64_per_iov supported Wei Yang
2014-11-02 15:41 ` [PATCH V9 18/18] powerpc/powernv: Group VF PE when IOV BAR is big on PHB3 Wei Yang
2014-11-18 23:11 ` [PATCH V9 00/18] Enable SRIOV on PowerNV Gavin Shan
2014-11-18 23:40 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141119011243.GA23467@google.com \
--to=bhelgaas@google.com \
--cc=benh@au1.ibm.com \
--cc=ddutile@redhat.com \
--cc=gwshan@linux.vnet.ibm.com \
--cc=linux-pci@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=myron.stowe@redhat.com \
--cc=weiyang@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).