From: Wei Yang <weiyang@linux.vnet.ibm.com>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	Wei Yang <weiyang@linux.vnet.ibm.com>,
	benh@au1.ibm.com, linux-pci@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH V9 08/18] powrepc/pci: Refactor pci_dn
Date: Thu, 20 Nov 2014 15:25:08 +0800	[thread overview]
Message-ID: <20141120072508.GD8562@richard> (raw)
In-Reply-To: <20141120010213.GA11893@shangw>

On Thu, Nov 20, 2014 at 12:02:13PM +1100, Gavin Shan wrote:
>On Wed, Nov 19, 2014 at 04:30:24PM -0700, Bjorn Helgaas wrote:
>>On Sun, Nov 02, 2014 at 11:41:24PM +0800, Wei Yang wrote:
>>> From: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>> 
>>> pci_dn is the extension of PCI device node and it's created from
>>> device node. Unfortunately, VFs are enabled dynamically by the PF's
>>> driver, so they have neither corresponding device nodes nor
>>> pci_dn. The patch refactors pci_dn to support VFs:
>>> 
>>>    * pci_dn is organized as a hierarchy tree. VF's pci_dn is put
>>>      to the child list of pci_dn of PF's bridge. pci_dn of any other
>>>      device is put to the child list of pci_dn of its upstream bridge.
>>> 
>>>    * VF's pci_dn is expected to be created dynamically when applying
>>>      final fixup to PF. VF's pci_dn will be destroyed when releasing
>>>      PF's pci_dev instance. pci_dn of other device is still created
>>>      from device node as before.
>>> 
>>>    * For one particular PCI device (VF or not), its pci_dn can be
>>>      found from pdev->dev.archdata.firmware_data, PCI_DN(devnode),
>>>      or parent's list. The fast path (fetching pci_dn through PCI
>>>      device instance) is populated during early fixup time.
>>> 
>>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>> ---
>>> ...
>>
>>> +struct pci_dn *add_dev_pci_info(struct pci_dev *pdev)
>>> +{
>>> +#ifdef CONFIG_PCI_IOV
>>> +	struct pci_dn *parent, *pdn;
>>> +	int i;
>>> +
>>> +	/* Only support IOV for now */
>>> +	if (!pdev->is_physfn)
>>> +		return pci_get_pdn(pdev);
>>> +
>>> +	/* Check if VFs have been populated */
>>> +	pdn = pci_get_pdn(pdev);
>>> +	if (!pdn || (pdn->flags & PCI_DN_FLAG_IOV_VF))
>>> +		return NULL;
>>> +
>>> +	pdn->flags |= PCI_DN_FLAG_IOV_VF;
>>> +	parent = pci_bus_to_pdn(pdev->bus);
>>> +	if (!parent)
>>> +		return NULL;
>>> +
>>> +	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>>> +		pdn = add_one_dev_pci_info(parent, NULL,
>>> +					   pci_iov_virtfn_bus(pdev, i),
>>> +					   pci_iov_virtfn_devfn(pdev, i));
>>
>>I'm not sure this makes sense, but I certainly don't know this code, so
>>maybe I'm missing something.
>>
>
>For ARI, Richard had some patches to fix the issue from firmware side.
>
>>pci_iov_virtfn_bus() and pci_iov_virtfn_devfn() depend on
>>pdev->sriov->stride and pdev->sriov->offset.  These are read from VF Stride
>>and First VF Offset in the SR-IOV capability by sriov_init(), which is
>>called before add_dev_pci_info():
>>
>>  pci_scan_child_bus
>>    pci_scan_slot
>>      pci_scan_single_device
>>	pci_device_add
>>	  pci_init_capabilities
>>	    pci_iov_init(PF)
>>	      sriov_init(PF, pos)
>>		pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0)
>>		pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset)
>>		pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride)
>>		iov->offset = offset
>>		iov->stride = stride
>>
>>  pci_bus_add_devices
>>    pci_bus_add_device
>>      pci_fixup_device(pci_fixup_final)
>>	add_dev_pci_info
>>	  pci_iov_virtfn_bus
>>	    return ... + sriov->offset + (sriov->stride * id) ...
>>
>>But both First VF Offset and VF Stride change when ARI Capable Hierarchy or
>>NumVFs changes (SR-IOV spec sec 3.3.9, 3.3.10).  We set NumVFs to zero in
>>sriov_init() above.  We will change NumVFs to something different when a
>>driver calls pci_enable_sriov():
>>
>>  pci_enable_sriov
>>    sriov_enable
>>      pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, nr_virtfn)
>>
>>Now First VF Offset and VF Stride have changed from what they were when we
>>called pci_iov_virtfn_bus() above.
>>
>
>It's the case we missed: First VF Offset and VF Stride can change when
>PF's number of VFs is changed. It means the BDFN (Bus/Device/Function
>number) for one VF can't be determined until PF's number of VFs is
>populated and updated to HW (before calling to virtfn_add()).
>
>The dynamically created pci_dn is currently used in the PCI config accessors.
>That means we have to get it ready before the first PCI config request to the
>VF in pci_setup_device(). In an older revision of the code, we had a weak
>function called from pci_alloc_dev(), which gave the platform a chance to
>create the pci_dn. I think we have to switch back to the old way in order to
>fix the problem you caught. However, the old way comes at the cost of
>another weak function, which you're probably unhappy to see.
>
>  sriov_enable()
>    virtfn_add()
>      virtfn_add_bus()
>      pci_alloc_dev()
>      pci_setup_device()

OK, it sounds like the solution in my previous reply can't work. We need the
pci_dn ready before accessing the configuration space of the VFs.
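
To make the dependency concrete, here is a minimal user-space sketch of the
address calculation discussed above. It assumes the VF routing ID is the PF's
routing ID plus First VF Offset plus VF Stride times the VF index, which is how
I read the "sriov->offset + (sriov->stride * id)" expression in the call trace.
struct sriov_snapshot and the helper names are hypothetical and are not the
kernel's own types or functions.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical snapshot of the values the thread talks about. This is
 * NOT the kernel's struct pci_sriov, only an illustration.
 */
struct sriov_snapshot {
	uint8_t  pf_bus;    /* bus number of the PF */
	uint8_t  pf_devfn;  /* device/function number of the PF */
	uint16_t offset;    /* First VF Offset as read from the capability */
	uint16_t stride;    /* VF Stride as read from the capability */
};

/* 16-bit routing ID of VF 'id', before splitting into bus and devfn. */
unsigned int vf_routing_id(const struct sriov_snapshot *s, unsigned int id)
{
	return ((unsigned int)s->pf_bus << 8) + s->pf_devfn +
	       s->offset + s->stride * id;
}

/* Bus number of VF 'id': the upper byte of the routing ID. */
uint8_t vf_bus(const struct sriov_snapshot *s, unsigned int id)
{
	return (vf_routing_id(s, id) >> 8) & 0xff;
}

/* Device/function number of VF 'id': the lower byte of the routing ID. */
uint8_t vf_devfn(const struct sriov_snapshot *s, unsigned int id)
{
	return vf_routing_id(s, id) & 0xff;
}

/* Print the address a given snapshot yields for VF 'id'. */
void print_vf(const char *when, const struct sriov_snapshot *s,
	      unsigned int id)
{
	printf("%s: VF%u -> bus %02x devfn %02x\n",
	       when, id, vf_bus(s, id), vf_devfn(s, id));
}
```

Calling print_vf() once with the offset/stride read in sriov_init() (NumVFs
is 0 at that point) and once with the values read back after NumVFs is written
in sriov_enable() would show different addresses whenever the hardware changes
either field. A pci_dn keyed on the first pair can therefore end up pointing
at the wrong bus/devfn.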


-- 
Richard Yang
Help you, Help me

