From: Wei Yang <weiyang@linux.vnet.ibm.com>
To: Wei Yang <weiyang@linux.vnet.ibm.com>
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org, benh@au1.ibm.com,
linuxppc-dev@lists.ozlabs.org, gwshan@linux.vnet.ibm.com
Subject: Re: [PATCH V10 00/17] Enable SRIOV on Power8
Date: Mon, 22 Dec 2014 14:05:22 +0800
Message-ID: <20141222060522.GA12285@richard>
In-Reply-To: <1419227677-12312-1-git-send-email-weiyang@linux.vnet.ibm.com>
Bjorn,
This patch set was tested on 3.19-rc1 with the offset/stride update patch
applied. I saw your comment on the MEM64 issue; if that is reverted, this
patch set will not work. Since I think we can work in parallel, I am sending
it now to get more comments and to check whether I understood your previous
comments correctly.
I will work with Yinghai to find a fix for bug 85491. I hope the Linux kernel
can handle both cases.
Merry Christmas in advance ~
On Mon, Dec 22, 2014 at 01:54:20PM +0800, Wei Yang wrote:
>This patchset enables the SRIOV on POWER8.
>
>The general idea is to put each VF into its own PE and allocate the required
>resources such as MMIO/DMA/MSI. The major difficulty comes from the MMIO
>allocation and adjustment for the PF's IOV BAR.
>
>On P8, we use M64BT to cover a PF's IOV BAR, which lets each individual VF
>sit in its own PE. This gives more flexibility, while at the same time it
>brings some restrictions on the PF's IOV BAR size and alignment.
>
>To achieve this effect, we need to do some hacks on the PCI device's resources.
>1. Expand the IOV BAR properly.
> Done by pnv_pci_ioda_fixup_iov_resources().
>2. Shift the IOV BAR properly.
> Done by pnv_pci_vf_resource_shift().
>3. IOV BAR alignment is calculated by arch dependent function instead of an
> individual VF BAR size.
> Done by pnv_pcibios_sriov_resource_alignment().
>4. Take the IOV BAR alignment into consideration in the sizing and assigning.
> This is achieved by commit: "PCI: Take additional IOV BAR alignment in
> sizing and assigning"
>
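
To make the four steps above more concrete, here is a minimal sketch in C of
the arithmetic they imply. It is not the kernel code: struct iov_bar and all
four function names are hypothetical, and it assumes that the M64 window is
split into total_pe equal segments, one per PE, with each segment the size of
one VF BAR. Under that assumption the IOV BAR is expanded to total_pe
segments, aligned to the whole expanded size, and shifted so that VF0 lands
in the segment of the first PE allocated to the VFs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a PF's IOV BAR; not a kernel type. */
struct iov_bar {
	uint64_t start;       /* address where VF0's BAR begins */
	uint64_t size;        /* total space reserved for the IOV BAR */
	uint64_t vf_bar_size; /* size of one VF's BAR, a power of two */
};

/* Step 1: reserve one VF-BAR-sized segment per PE (assumed layout). */
static void iov_bar_expand(struct iov_bar *bar, unsigned int total_pe)
{
	/* A BAR size is always a non-zero power of two. */
	assert(bar->vf_bar_size && !(bar->vf_bar_size & (bar->vf_bar_size - 1)));
	bar->size = bar->vf_bar_size * total_pe;
}

/* Step 3: align to the whole expanded window, not to a single VF BAR. */
static uint64_t iov_bar_alignment(const struct iov_bar *bar,
				  unsigned int total_pe)
{
	return bar->vf_bar_size * total_pe;
}

/* Step 2: move the start so that VF0 falls into segment number pe_offset. */
static void iov_bar_shift(struct iov_bar *bar, unsigned int pe_offset)
{
	bar->start += bar->vf_bar_size * pe_offset;
}

/* Segment index (PE number, under the assumed layout) that a VF lands in. */
static unsigned int vf_segment(const struct iov_bar *bar, uint64_t window_base,
			       unsigned int vf_idx)
{
	uint64_t vf_addr = bar->start + bar->vf_bar_size * vf_idx;

	return (unsigned int)((vf_addr - window_base) / bar->vf_bar_size);
}
```

With a 64 KiB VF BAR and 256 PEs, for example, the expanded IOV BAR and its
alignment are both 16 MiB, and shifting by 4 segments places VF0 in segment 4
and VF3 in segment 7.
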
>Test Environment:
> The SRIOV device tested is Emulex Lancer(10df:e220) and
> Mellanox ConnectX-3(15b3:1003) on POWER8.
>
>Example of passing a VF through to a guest via vfio:
> 1. unbind the original driver and bind to vfio-pci driver
> echo 0000:06:0d.0 > /sys/bus/pci/devices/0000:06:0d.0/driver/unbind
> echo 1102 0002 > /sys/bus/pci/drivers/vfio-pci/new_id
> Note: this should be done for each device in the same iommu_group
> 2. Start qemu and pass device through vfio
> /home/ywywyang/git/qemu-impreza/ppc64-softmmu/qemu-system-ppc64 \
> -M pseries -m 2048 -enable-kvm -nographic \
> -drive file=/home/ywywyang/kvm/fc19.img \
> -monitor telnet:localhost:5435,server,nowait -boot cd \
> -device "spapr-pci-vfio-host-bridge,id=CXGB3,iommu=26,index=6"
>
>Verify that the response really comes from the VF:
> 1. ping from a machine in the same subnet(the broadcast domain)
> 2. run arp -n on this machine
> 9.115.251.20 ether 00:00:c9:df:ed:bf C eth0
> 3. ifconfig in the guest
> # ifconfig eth1
> eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
> inet 9.115.251.20 netmask 255.255.255.0 broadcast 9.115.251.255
> inet6 fe80::200:c9ff:fedf:edbf prefixlen 64 scopeid 0x20<link>
> ether 00:00:c9:df:ed:bf txqueuelen 1000 (Ethernet)
> RX packets 175 bytes 13278 (12.9 KiB)
> RX errors 0 dropped 0 overruns 0 frame 0
> TX packets 58 bytes 9276 (9.0 KiB)
> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
> 4. They have the same MAC address
>
> Note: make sure you shut down the other network interfaces in the guest.
>
>---
>v10:
> * remove weak function pcibios_iov_resource_size()
> the VF BAR size is stored in pci_sriov structure and retrieved from
> pci_iov_resource_size()
> * Use "Reserve additional" instead of "Expand" to be more accurate in the
> change log
> * add log message to show the PF's IOV BAR final size
> * add pcibios_sriov_enable/disable() weak function in sriov_enable/disable()
> for arch setup before enabling VFs. For example, the arch could fix up the
> BDF for VFs, since a change of NumVFs would affect the BDF of VFs.
> * Add some explanation of PE on Power arch in the documentation
>v9:
> * make the change log consistent in the terminology
> PF's IOV BAR -> the SRIOV BAR in PF
> VF's BAR -> the normal BAR in VF's view
> * rename all newly introduced function from _sriov_ to _iov_
> * rename the document to Documentation/powerpc/pci_iov_resource_on_powernv.txt
> * add the vendor id and device id of the tested devices
> * change return value from EINVAL to ENOSYS for pci_iov_virtfn_bus() and
> pci_iov_virtfn_devfn() when it is called on PF or SRIOV is not configured
> * rebase on 3.18-rc2 and tested
>v8:
> * use weak function pcibios_sriov_resource_size() instead of some flag to
> retrieve the IOV BAR size.
> * add a document Documentation/powerpc/pci_resource.txt to explain the
> design.
> * make pci_iov_virtfn_bus()/pci_iov_virtfn_devfn() not inline.
> * extract a function res_to_dev_res(), so that it is more general to get
> additional size and alignment
> * fix one contention which was introduced in "powrepc/pci: Refactor pci_dn".
> the root cause is that pci_get_slot() takes pci_bus_sem, which leads to a
> deadlock.
>v7:
> * add IORESOURCE_ARCH flag for IOV BAR on powernv platform.
> * when IOV BAR has IORESOURCE_ARCH flag, the size is retrieved from
> hardware directly. If not, calculate as usual.
> * reorder the patch set, group them by subsystem:
> PCI, powerpc, powernv
> * rebase it on 3.16-rc6
>v6:
> * remove pcibios_enable_sriov()/pcibios_disable_sriov() weak function
> similar function is moved to
> pnv_pci_enable_device_hook()/pnv_pci_disable_device_hook(). When PF is
> enabled, platform will try best to allocate resources for VFs.
> * remove pcibios_sriov_resource_size weak function
> * VF BAR size is retrieved from hardware directly in virtfn_add()
>v5:
> * merge those SRIOV related platform functions in machdep_calls
> wrap them in one CONFIG_PCI_IOV macro
> * define IODA_INVALID_M64 to replace (-1)
> use this value to represent the m64_wins is not used
> * rename pnv_pci_release_dev_dma() to pnv_pci_ioda2_release_dma_pe()
> this function is a counterpart to pnv_pci_ioda2_setup_dma_pe()
> * change dev_info() to dev_dbg() in pnv_pci_ioda_fixup_iov_resources()
> to reduce the amount of kernel log output
> * release M64 window in pnv_pci_ioda2_release_dma_pe()
>v4:
> * code format fix, eg. not exceed 80 chars
> * in commit "ppc/pnv: Add function to deconfig a PE"
> check the bus has a bridge before print the name
> remove a PE from its own PELTV
> * change the function name for sriov resource size/alignment
> * rebase on 3.16-rc3
> * VFs will not rely on device node
> Per Grant Likely's comments, the kernel should be able to handle the
> lack of a device_node gracefully. Gavin restructured the pci_dn, so
> that a VF has a pci_dn even when the VF's device_node is not provided
> by firmware.
> * clean all the patch title to make them comply with one style
> * fix return value for pci_iov_virtfn_bus/pci_iov_virtfn_devfn
>v3:
> * change the return type of virtfn_bus/virtfn_devfn to int
> change the name of these two functions to pci_iov_virtfn_bus/pci_iov_virtfn_devfn
> * remove the second parameter of pcibios_sriov_disable()
> * use data instead of pe in "ppc/pnv: allocate pe->iommu_table dynamically"
> * rename __pci_sriov_resource_size to pcibios_sriov_resource_size
> * rename __pci_sriov_resource_alignment to pcibios_sriov_resource_alignment
>v2:
> * change the return value of virtfn_bus/virtfn_devfn to 0
> * move some TCE related macro definitions to
> arch/powerpc/platforms/powernv/pci.h
> * fix the __pci_sriov_resource_alignment on powernv platform
> During the sizing stage, the IOV BAR is truncated to 0, which will
> affect the order of allocation. Fix this to make sure BARs are
> allocated in order of their alignment.
>v1:
> * improve the change log for
> "PCI: Add weak __pci_sriov_resource_size() interface"
> "PCI: Add weak __pci_sriov_resource_alignment() interface"
> "PCI: take additional IOV BAR alignment in sizing and assigning"
> * wrap VF PE code in CONFIG_PCI_IOV
> * did regression test on P7.
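
The v10 note that a change of NumVFs affects the BDF of the VFs follows from
how SR-IOV derives a VF's routing ID from the PF's routing ID, First VF Offset
and VF Stride. A minimal sketch of that calculation in C is below. The
struct vf_bdf type and the vf_routing_id() name are hypothetical and used only
for illustration; vf_idx is zero-based. Since the device may report a
different offset and stride after NumVFs is changed, the resulting bus and
devfn can change too.

```c
#include <stdint.h>

/* Hypothetical holder for a VF's bus number and devfn; not a kernel type. */
struct vf_bdf {
	uint8_t bus;
	uint8_t devfn;
};

/*
 * Routing ID of VF number vf_idx (zero-based):
 *   PF devfn + First VF Offset + VF Stride * vf_idx
 * The low 8 bits are the VF's devfn; any carry beyond 8 bits is added
 * to the PF's bus number.
 */
static struct vf_bdf vf_routing_id(uint8_t pf_bus, uint8_t pf_devfn,
				   uint16_t first_vf_offset,
				   uint16_t vf_stride, unsigned int vf_idx)
{
	unsigned int rid = pf_devfn + first_vf_offset + vf_stride * vf_idx;
	struct vf_bdf bdf = {
		.bus = (uint8_t)(pf_bus + (rid >> 8)),
		.devfn = (uint8_t)(rid & 0xff),
	};

	return bdf;
}
```

For a PF at bus 6, devfn 0, with an offset of 0x80 and a stride of 2, VF0 is
at bus 6, devfn 0x80, while VF64 spills over to bus 7, devfn 0.
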
>Gavin Shan (1):
> powrepc/pci: Refactor pci_dn
>
>Wei Yang (16):
> PCI/IOV: Export interface for retrieve VF's BDF
> PCI/IOV: add VF enable/disable hook
> PCI: Add weak pcibios_iov_resource_alignment() interface
> PCI: Store VF BAR size in pci_sriov
> PCI: Take additional PF's IOV BAR alignment in sizing and assigning
> powerpc/pci: Add PCI resource alignment documentation
> powerpc/pci: Don't unset pci resources for VFs
> powerpc/pci: remove pci_dn->pcidev field
> powerpc/powernv: Use pci_dn in PCI config accessor
> powerpc/powernv: Allocate pe->iommu_table dynamically
> powerpc/powernv: Reserve additional space for IOV BAR according to
> the number of total_pe
> powerpc/powernv: Implement pcibios_iov_resource_alignment() on
> powernv
> powerpc/powernv: Shift VF resource with an offset
> powerpc/powernv: Allocate VF PE
> powerpc/powernv: Reserve additional space for IOV BAR, with
> m64_per_iov supported
> powerpc/powernv: Group VF PE when IOV BAR is big on PHB3
>
> .../powerpc/pci_iov_resource_on_powernv.txt | 215 ++++++
> arch/powerpc/include/asm/device.h | 3 +
> arch/powerpc/include/asm/iommu.h | 3 +
> arch/powerpc/include/asm/machdep.h | 7 +
> arch/powerpc/include/asm/pci-bridge.h | 24 +-
> arch/powerpc/kernel/pci-common.c | 23 +
> arch/powerpc/kernel/pci_dn.c | 251 ++++++-
> arch/powerpc/platforms/powernv/eeh-powernv.c | 14 +-
> arch/powerpc/platforms/powernv/pci-ioda.c | 739 +++++++++++++++++++-
> arch/powerpc/platforms/powernv/pci.c | 87 +--
> arch/powerpc/platforms/powernv/pci.h | 13 +-
> drivers/pci/iov.c | 80 ++-
> drivers/pci/pci.h | 2 +
> drivers/pci/setup-bus.c | 85 ++-
> include/linux/pci.h | 17 +
> 15 files changed, 1449 insertions(+), 114 deletions(-)
> create mode 100644 Documentation/powerpc/pci_iov_resource_on_powernv.txt
>
>--
>1.7.9.5
--
Richard Yang
Help you, Help me