linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <bhelgaas@google.com>
To: Wei Yang <weiyang@linux.vnet.ibm.com>
Cc: benh@au1.ibm.com, linux-pci@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, gwshan@linux.vnet.ibm.com,
	Donald Dutile <ddutile@redhat.com>,
	Myron Stowe <myron.stowe@redhat.com>
Subject: Re: [PATCH V9 03/18] PCI: Add weak pcibios_iov_resource_size() interface
Date: Tue, 18 Nov 2014 18:12:43 -0700	[thread overview]
Message-ID: <20141119011243.GA23467@google.com> (raw)
In-Reply-To: <1414942894-17034-4-git-send-email-weiyang@linux.vnet.ibm.com>

[+cc Don, Myron]

Hi Wei,

On Sun, Nov 02, 2014 at 11:41:19PM +0800, Wei Yang wrote:
> When retrieving VF IOV BAR in virtfn_add(), it will divide the total PF's IOV
> BAR size with the totalVF number. This is true for most cases, while may not
> be correct on some specific platform.
> 
> For example on PowerNV platform, in order to fix PF's IOV BAR into a hardware
> alignment, the PF's IOV BAR size would be expended. This means the original
> method couldn't work.
> 
> This patch introduces a weak pcibios_iov_resource_size() interface, which
> gives platform a chance to implement specific method to calculate the VF BAR
> resource size.
> 
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> ---
>  drivers/pci/iov.c   |   27 +++++++++++++++++++++++++--
>  include/linux/pci.h |    5 +++++
>  2 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 4d1685d..6866830 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -61,6 +61,30 @@ static void virtfn_remove_bus(struct pci_bus *physbus, struct pci_bus *virtbus)
>  		pci_remove_bus(virtbus);
>  }
>  
> +resource_size_t __weak pcibios_iov_resource_size(struct pci_dev *dev, int resno)
> +{
> +	return 0;
> +}
> +
> +resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> +{
> +	resource_size_t size;
> +	struct pci_sriov *iov;
> +
> +	if (!dev->is_physfn)
> +		return 0;
> +
> +	size = pcibios_iov_resource_size(dev, resno);
> +	if (size != 0)
> +		return size;
> +
> +	iov = dev->sriov;
> +	size = resource_size(dev->resource + resno);
> +	do_div(size, iov->total_VFs);
> +
> +	return size;
> +}
> +
>  static int virtfn_add(struct pci_dev *dev, int id, int reset)
>  {
>  	int i;
> @@ -96,8 +120,7 @@ static int virtfn_add(struct pci_dev *dev, int id, int reset)
>  			continue;
>  		virtfn->resource[i].name = pci_name(virtfn);
>  		virtfn->resource[i].flags = res->flags;
> -		size = resource_size(res);
> -		do_div(size, iov->total_VFs);
> +		size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
>  		virtfn->resource[i].start = res->start + size * id;

Can you help me understand this?

We have previously called sriov_init() on the PF.  There, we sized the VF
BARs, which are in the PF's SR-IOV Capability (SR-IOV spec sec 3.3.14).
The size we discover is the amount of space required by a single VF, so
sriov_init() adjusts PF->resource[PCI_IOV_RESOURCES + i] by multiplying
that size by PCI_SRIOV_TOTAL_VF, so this PF resource is now big enough to 
hold the VF BAR[i] areas for all the possible VFs.

Now we're in virtfn_add(), setting up a new VF.  The usual BARs at config
space 0x10, etc., are read-only zeroes for a VF, so we don't size them the
usual way.  Instead, we carve out a slice of the
PF->resource[PCI_IOV_RESOURCES + i] area.

I thought the starting address of the VF BARn memory aperture was
prescribed by the spec in sec 2.1.1.1 and shown in figure 2-1:

  BARx VFy starting address = VF BARx + (y - 1) * (VF BARx aperture size)

That's basically what the existing code does.  We don't save the VF BARx
aperture size, so we recompute it by undoing the multiplication we did in
sriov_init().

But you're computing the starting address using a different VF BARx
aperture size.  How does that work?  I assumed this calculation was built
into the PCI device, but obviously I'm missing something.

To make it concrete, here's a made-up example:

  PF SR-IOV Capability
    TotalVFs = 4
    NumVFs = 4
    System Page Size = 4KB
    VF BAR0 = [mem 0x00000000-0x00000fff] (4KB at address 0)

  PF  pci_dev->resource[7] = [mem 0x00000000-0x00003fff] (16KB)
  VF1 pci_dev->resource[0] = [mem 0x00000000-0x00000fff]
  VF2 pci_dev->resource[0] = [mem 0x00001000-0x00001fff]
  VF3 pci_dev->resource[0] = [mem 0x00002000-0x00002fff]
  VF4 pci_dev->resource[0] = [mem 0x00003000-0x00003fff]

You're changing this so we might use 16KB as the VF BAR0 aperture size
instead of 4KB, which would result in the following:

  VF1 pci_dev->resource[0] = [mem 0x00000000-0x00003fff]
  VF2 pci_dev->resource[0] = [mem 0x00004000-0x00007fff]
  VF3 pci_dev->resource[0] = [mem 0x00008000-0x0000bfff]
  VF4 pci_dev->resource[0] = [mem 0x0000c000-0x0000ffff]

But you didn't change sriov_init(), so the PF resource[7] is only 16KB, not
the 64KB the VFs need.  And I assume the VF address decoder is wired to
claim the 4KB regions at 0x0000, 0x1000, 0x2000, 0x3000, not the ones at
0x0000, 0x4000, 0x8000, 0xc000.

Bjorn

>  		virtfn->resource[i].end = virtfn->resource[i].start + size - 1;
>  		rc = request_resource(res, &virtfn->resource[i]);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index bbf8058..2f5b454 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1162,6 +1162,8 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus,
>  resource_size_t pcibios_iov_resource_alignment(struct pci_dev *dev,
>  						 int resno,
>  						 resource_size_t align);
> +resource_size_t pcibios_iov_resource_size(struct pci_dev *dev,
> +		                            int resno);
>  
>  #define PCI_VGA_STATE_CHANGE_BRIDGE (1 << 0)
>  #define PCI_VGA_STATE_CHANGE_DECODES (1 << 1)
> @@ -1666,6 +1668,7 @@ int pci_num_vf(struct pci_dev *dev);
>  int pci_vfs_assigned(struct pci_dev *dev);
>  int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
>  int pci_sriov_get_totalvfs(struct pci_dev *dev);
> +resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno);
>  #else
>  static inline int pci_iov_virtfn_bus(struct pci_dev *dev, int id)
>  {
> @@ -1685,6 +1688,8 @@ static inline int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
>  { return 0; }
>  static inline int pci_sriov_get_totalvfs(struct pci_dev *dev)
>  { return 0; }
> +static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> +{ return 0; }
>  #endif
>  
>  #if defined(CONFIG_HOTPLUG_PCI) || defined(CONFIG_HOTPLUG_PCI_MODULE)
> -- 
> 1.7.9.5
> 

  reply	other threads:[~2014-11-19  1:12 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-02 15:41 [PATCH V9 00/18] Enable SRIOV on PowerNV Wei Yang
2014-11-02 15:41 ` [PATCH V9 01/18] PCI/IOV: Export interface for retrieve VF's BDF Wei Yang
2014-11-19 23:35   ` Bjorn Helgaas
2014-11-02 15:41 ` [PATCH V9 02/18] PCI: Add weak pcibios_iov_resource_alignment() interface Wei Yang
2014-11-02 15:41 ` [PATCH V9 03/18] PCI: Add weak pcibios_iov_resource_size() interface Wei Yang
2014-11-19  1:12   ` Bjorn Helgaas [this message]
2014-11-19  2:15     ` Benjamin Herrenschmidt
2014-11-19  3:21       ` Wei Yang
2014-11-19  4:26         ` Bjorn Helgaas
2014-11-19  9:27           ` Wei Yang
2014-11-19 17:23             ` Bjorn Helgaas
2014-11-19 20:51               ` Benjamin Herrenschmidt
2014-11-20  5:40                 ` Wei Yang
2014-11-20  5:39               ` Wei Yang
2014-11-02 15:41 ` [PATCH V9 04/18] PCI: Take additional PF's IOV BAR alignment in sizing and assigning Wei Yang
2014-11-02 15:41 ` [PATCH V9 05/18] powerpc/pci: Add PCI resource alignment documentation Wei Yang
2014-11-02 15:41 ` [PATCH V9 06/18] powerpc/pci: Don't unset pci resources for VFs Wei Yang
2014-11-02 15:41 ` [PATCH V9 07/18] powerpc/pci: Define pcibios_disable_device() on powerpc Wei Yang
2014-11-02 15:41 ` [PATCH V9 08/18] powrepc/pci: Refactor pci_dn Wei Yang
2014-11-19 23:30   ` Bjorn Helgaas
2014-11-20  1:02     ` Gavin Shan
2014-11-20  7:25       ` Wei Yang
2014-11-20  7:20     ` Wei Yang
2014-11-20 19:05       ` Bjorn Helgaas
2014-11-21  0:04         ` Gavin Shan
2014-11-25  9:28           ` Wei Yang
2014-11-21  1:46         ` Wei Yang
2014-11-02 15:41 ` [PATCH V9 09/18] powerpc/pci: remove pci_dn->pcidev field Wei Yang
2014-11-02 15:41 ` [PATCH V9 10/18] powerpc/powernv: Use pci_dn in PCI config accessor Wei Yang
2014-11-02 15:41 ` [PATCH V9 11/18] powerpc/powernv: Allocate pe->iommu_table dynamically Wei Yang
2014-11-02 15:41 ` [PATCH V9 12/18] powerpc/powernv: Expand VF resources according to the number of total_pe Wei Yang
2014-11-02 15:41 ` [PATCH V9 13/18] powerpc/powernv: Implement pcibios_iov_resource_alignment() on powernv Wei Yang
2014-11-02 15:41 ` [PATCH V9 14/18] powerpc/powernv: Implement pcibios_iov_resource_size() " Wei Yang
2014-11-02 15:41 ` [PATCH V9 15/18] powerpc/powernv: Shift VF resource with an offset Wei Yang
2014-11-02 15:41 ` [PATCH V9 16/18] powerpc/powernv: Allocate VF PE Wei Yang
2014-11-02 15:41 ` [PATCH V9 17/18] powerpc/powernv: Expanding IOV BAR, with m64_per_iov supported Wei Yang
2014-11-02 15:41 ` [PATCH V9 18/18] powerpc/powernv: Group VF PE when IOV BAR is big on PHB3 Wei Yang
2014-11-18 23:11 ` [PATCH V9 00/18] Enable SRIOV on PowerNV Gavin Shan
2014-11-18 23:40   ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141119011243.GA23467@google.com \
    --to=bhelgaas@google.com \
    --cc=benh@au1.ibm.com \
    --cc=ddutile@redhat.com \
    --cc=gwshan@linux.vnet.ibm.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=myron.stowe@redhat.com \
    --cc=weiyang@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).