From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Daniel Axtens <dja@axtens.net>,
	Gavin Shan <gwshan@linux.vnet.ibm.com>
Subject: Re: [PATCH kernel 2/2] powerpc/powernv/ioda2: Delay PE disposal
Date: Fri, 15 Apr 2016 12:26:43 +1000
Message-ID: <20160415022643.GF18218@voom.redhat.com>
In-Reply-To: <571043FC.8040509@ozlabs.ru>

On Fri, Apr 15, 2016 at 11:29:32AM +1000, Alexey Kardashevskiy wrote:
> On 04/14/2016 11:40 AM, David Gibson wrote:
> >On Fri, Apr 08, 2016 at 04:36:44PM +1000, Alexey Kardashevskiy wrote:
> >>When SRIOV is disabled, the existing code presumes there is no
> >>virtual function (VF) in use and destroys all associated PEs.
> >>However, it is possible to get into a situation where the user
> >>disables SRIOV while a VF is still in use via VFIO.
> >>For example, unbinding a physical function (PF) while a guest is
> >>running with a VF passed through via VFIO will trigger the bug.
> >>
> >>This defines an IODA2-specific IOMMU group release() callback.
> >>This moves all the disposal code from pnv_ioda_release_vf_PE() to this
> >>new callback so the cleanup happens when the last user of an IOMMU
> >>group releases its reference.
> >>
> >>As pnv_pci_ioda2_release_dma_pe() was reduced to just calling
> >>iommu_group_put(), this merges pnv_pci_ioda2_release_dma_pe()
> >>into pnv_ioda_release_vf_PE().
> >>
> >>Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>---
> >>  arch/powerpc/platforms/powernv/pci-ioda.c | 33 +++++++++++++------------------
> >>  1 file changed, 14 insertions(+), 19 deletions(-)
> >>
> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>index ce9f2bf..8108c54 100644
> >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
> >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>@@ -1333,27 +1333,25 @@ static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable);
> >>  static void pnv_pci_ioda2_group_release(void *iommu_data)
> >>  {
> >>  	struct iommu_table_group *table_group = iommu_data;
> >>+	struct pnv_ioda_pe *pe = container_of(table_group,
> >>+			struct pnv_ioda_pe, table_group);
> >>+	struct pci_controller *hose = pci_bus_to_host(pe->parent_dev->bus);
> >>+	struct pnv_phb *phb = hose->private_data;
> >>+	struct iommu_table *tbl = pe->table_group.tables[0];
> >>+	int64_t rc;
> >>
> >>-	table_group->group = NULL;
> >>-}
> >>-
> >>-static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe *pe)
> >>-{
> >>-	struct iommu_table    *tbl;
> >>-	int64_t               rc;
> >>-
> >>-	tbl = pe->table_group.tables[0];
> >>  	rc = pnv_pci_ioda2_unset_window(&pe->table_group, 0);
> >
> >Is it safe to go manipulating the PE windows, etc. after SR-IOV is
> >disabled?
> 
> Manipulating windows in this case is just updating 8 bytes in the TVT. At
> this point the VF is expected to have been destroyed, but the PE is expected
> to remain allocated, so pnv_ioda2_pick_m64_pe() (or
> pnv_ioda2_reserve_m64_pe()?) won't use it.

Ok.

> >When SR-IOV is disabled, you need to immediately disable the VF (I'm
> >guessing that happens somewhere) and stop all access to the VF
> >"hardware".
> 
> drivers/pci/iov.c
> ===
> static void sriov_disable(struct pci_dev *dev)
> {
> ...
> for (i = 0; i < iov->num_VFs; i++)
>         pci_iov_remove_virtfn(dev, i, 0);
> ...
> pcibios_sriov_disable(dev);
> ===
> 
> pcibios_sriov_disable() is where pnv_pci_ioda2_release_dma_pe() is called from.
> 
> >Only the iommu group structure *has* to stick around
> >until the reference count drops to zero.  I think other structures and
> >hardware reconfiguration can be deferred or done immediately,
> >whichever is more convenient.
> 
> I deferred everything for convenience, as iommu_table_group is embedded
> in struct pnv_ioda_pe rather than referenced through a pointer.

Ok.
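
For illustration only (this is not part of the patch): a minimal userspace
sketch of the deferral pattern discussed above. Because the table group is
embedded in the PE structure, the release callback can recover the PE with
container_of() and do all of the teardown once the last reference is dropped.
Every demo_* name is hypothetical, and the refcounting is a simplified
stand-in for the kernel's iommu_group_get()/iommu_group_put(), not the real
IOMMU API.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct iommu_group: refcount plus release hook. */
struct demo_group {
	int refcount;
	void (*release)(void *data);
	void *data;
};

/* Hypothetical stand-in for struct iommu_table_group. */
struct demo_table_group {
	struct demo_group *group;
};

/* Hypothetical stand-in for struct pnv_ioda_pe; table_group is embedded. */
struct demo_pe {
	int pe_number;
	int torn_down;			/* set once the deferred teardown ran */
	struct demo_table_group table_group;
};

static void demo_group_get(struct demo_group *g)
{
	g->refcount++;
}

/* Drop one reference; the release hook runs only when the count hits zero. */
static void demo_group_put(struct demo_group *g)
{
	assert(g->refcount > 0);
	if (--g->refcount == 0)
		g->release(g->data);
}

/* Release hook: recover the enclosing PE from the embedded member. */
static void demo_pe_release(void *data)
{
	struct demo_table_group *tg = data;
	struct demo_pe *pe = container_of(tg, struct demo_pe, table_group);

	/* The real callback would unset the DMA window, free the table,
	 * deconfigure the PE and free its number at this point. */
	pe->torn_down = 1;
	tg->group = NULL;
}

/* Returns 0 if teardown is deferred until the last user is gone. */
static int demo_run(void)
{
	struct demo_pe pe = { .pe_number = 5 };
	struct demo_group group = {
		.refcount = 1,		/* reference held by the platform code */
		.release = demo_pe_release,
		.data = &pe.table_group,
	};

	pe.table_group.group = &group;

	demo_group_get(&group);		/* a VFIO user takes a reference */
	demo_group_put(&group);		/* SR-IOV disable drops the platform's */
	if (pe.torn_down)
		return -1;		/* must not tear down while in use */

	demo_group_put(&group);		/* the last user goes away */
	return (pe.torn_down && pe.table_group.group == NULL) ? 0 : -1;
}
```

Calling demo_run() should return 0: the first put leaves the PE intact
because a user still holds a reference, and the second one triggers the
deferred teardown.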


With those queries answered,

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> >>  	if (rc)
> >>  		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
> >>
> >>  	pnv_pci_ioda2_set_bypass(pe, false);
> >>-	if (pe->table_group.group) {
> >>-		iommu_group_put(pe->table_group.group);
> >>-		BUG_ON(pe->table_group.group);
> >>-	}
> >>+
> >>+	BUG_ON(!tbl);
> >>  	pnv_pci_ioda2_table_free_pages(tbl);
> >>-	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
> >>+	iommu_free_table(tbl, of_node_full_name(pe->parent_dev->dev.of_node));
> >>+
> >>+	pnv_ioda_deconfigure_pe(phb, pe);
> >>+	pnv_ioda_free_pe(phb, pe->pe_number);
> >>  }
> >>
> >>  static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
> >>@@ -1376,16 +1374,13 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
> >>  		if (pe->parent_dev != pdev)
> >>  			continue;
> >>
> >>-		pnv_pci_ioda2_release_dma_pe(pdev, pe);
> >>-
> >>  		/* Remove from list */
> >>  		mutex_lock(&phb->ioda.pe_list_mutex);
> >>  		list_del(&pe->list);
> >>  		mutex_unlock(&phb->ioda.pe_list_mutex);
> >>
> >>-		pnv_ioda_deconfigure_pe(phb, pe);
> >>-
> >>-		pnv_ioda_free_pe(phb, pe->pe_number);
> >>+		if (pe->table_group.group)
> >>+			iommu_group_put(pe->table_group.group);
> >>  	}
> >>  }
> >>
> >
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
