linuxppc-dev.lists.ozlabs.org archive mirror
From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org,
	Alistair Popple <alistair@popple.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Daniel Axtens <dja@axtens.net>,
	Gavin Shan <gwshan@linux.vnet.ibm.com>,
	Paul Mackerras <paulus@samba.org>,
	Russell Currey <ruscur@russell.cc>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH kernel 08/10] powerpc/powernv/npu: Add NPU devices to IOMMU group
Date: Tue, 22 Mar 2016 11:25:50 +1100
Message-ID: <20160322002550.GR23586@voom.redhat.com>
In-Reply-To: <56EFAFF3.5090404@ozlabs.ru>


On Mon, Mar 21, 2016 at 07:25:23PM +1100, Alexey Kardashevskiy wrote:
> On 03/21/2016 03:48 PM, David Gibson wrote:
> >On Wed, Mar 09, 2016 at 05:29:04PM +1100, Alexey Kardashevskiy wrote:
> >>NPU devices have their own TVT which means they are isolated and can be
> >>passed to the userspace via VFIO. The first step is to create an IOMMU
> >>group and attach the devices to it, which is what this patch does.
> >>
> >>This adds a helper to npu-dma.c which gets GPU from the NPU's pdev and
> >>then walks through all devices on the same bus to determine which NPUs
> >>belong to the same GPU.
> >>
> >>This adds an additional loop over PEs in pnv_ioda_setup_dma() as the main
> >>loop skips NPU PEs as they do not have 32bit DMA segments.
> >>
> >>This uses get_gpu_pci_dev_and_pe() to get @gpdev rather than
> >>pnv_pci_get_gpu_dev() as the following patch will use @gpe as well.
> >>
> >>Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >
> >I'm not entirely clear on how these devices are assigned to groups.
> >Do they each get their own groups, or is the NPU device in the same
> >group as its corresponding GPU (I would have thought the latter makes
> >sense).
> 
> 
> I am putting them in a separate group as they have their own TCE table
> pointer, even though they are expected to share it with the GPU.

Hmm.. is this safe?  If the GPU and NPU got assigned to different
owners, what would happen?  Could they interfere with each other?

> If I put them in the same group as the GPUs, I would have to have an
> IODA2-linked-to-NPU bridge type with different iommu_table_group_ops, or
> have multiple hacks everywhere in IODA2 to enable/disable bypass,
> etc.

Well.. I suspect it would mean no longer having a 1:1 correspondence
between user-visible IOMMU groups and the internal iommu_table.
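
A standalone userspace sketch of that non-1:1 arrangement follows. It is
not kernel code: the model_* names are hypothetical stand-ins invented for
illustration, and it only shows two separately visible groups ending up
backed by one shared table.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS 4

struct model_table;

/* Hypothetical stand-in for a user-visible IOMMU group. */
struct model_group {
	const char *name;
	struct model_table *table;	/* table currently backing this group */
};

/* Hypothetical stand-in for a translation table shared between groups. */
struct model_table {
	struct model_group *groups[MAX_LINKS];
	size_t ngroups;
};

/* Link a group to a table; several groups may link to the same table. */
static void model_link(struct model_table *tbl, struct model_group *grp)
{
	assert(tbl->ngroups < MAX_LINKS);
	tbl->groups[tbl->ngroups++] = grp;
	grp->table = tbl;
}

/* Returns 1 when both groups translate through the same table. */
static int model_shares_table(const struct model_group *a,
			      const struct model_group *b)
{
	return a->table != NULL && a->table == b->table;
}
```

Linking a "gpu" group and an "npu" group to the same model_table makes
model_shares_table() return 1 for the pair, which is exactly the case
where handing the two groups to different owners would need care.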

> >>---
> >>  arch/powerpc/platforms/powernv/npu-dma.c  | 40 +++++++++++++++++++++++++++++++
> >>  arch/powerpc/platforms/powernv/pci-ioda.c |  8 +++++++
> >>  arch/powerpc/platforms/powernv/pci.h      |  1 +
> >>  3 files changed, 49 insertions(+)
> >>
> >>diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> >>index 866d3d3..e5a5feb 100644
> >>--- a/arch/powerpc/platforms/powernv/npu-dma.c
> >>+++ b/arch/powerpc/platforms/powernv/npu-dma.c
> >>@@ -263,3 +263,43 @@ void pnv_npu_try_dma_set_bypass(struct pci_dev *gpdev, bool bypass)
> >>  		}
> >>  	}
> >>  }
> >>+
> >>+void pnv_pci_npu_setup_iommu(struct pnv_ioda_pe *npe)
> >>+{
> >>+	struct iommu_table *tbl;
> >>+	struct pnv_phb *phb = npe->phb;
> >>+	struct pci_bus *pbus = phb->hose->bus;
> >>+	struct pci_dev *npdev, *gpdev = NULL, *gptmp;
> >>+	struct pnv_ioda_pe *gpe = get_gpu_pci_dev_and_pe(npe, &gpdev);
> >>+
> >>+	if (!gpe || !gpdev)
> >>+		return;
> >>+
> >>+	iommu_register_group(&npe->table_group, phb->hose->global_number,
> >>+			npe->pe_number);
> >>+
> >>+	tbl = pnv_pci_table_alloc(phb->hose->node);
> >>+
> >>+	list_for_each_entry(npdev, &pbus->devices, bus_list) {
> >>+		gptmp = pnv_pci_get_gpu_dev(npdev);
> >>+
> >>+		if (gptmp != gpdev)
> >>+			continue;
> >>+
> >>+		/*
> >>+		 * The iommu_add_device() picks an IOMMU group from
> >>+		 * the first IOMMU group attached to the iommu_table
> >>+		 * so we need to pretend that there is a table so
> >>+		 * iommu_add_device() can complete the job.
> >>+		 * We unlink the temporary table from the group afterwards.
> >>+		 */
> >>+		pnv_pci_link_table_and_group(phb->hose->node, 0,
> >>+				tbl, &npe->table_group);
> >>+		set_iommu_table_base(&npdev->dev, tbl);
> >>+		iommu_add_device(&npdev->dev);
> >>+		set_iommu_table_base(&npdev->dev, NULL);
> >>+		pnv_pci_unlink_table_and_group(tbl, &npe->table_group);
> >>+	}
> >>+
> >>+	iommu_free_table(tbl, "");
> >>+}
> >>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>index 5a6cf2e..becd168 100644
> >>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
> >>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>@@ -2570,6 +2570,14 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
> >>  		remaining -= segs;
> >>  		base += segs;
> >>  	}
> >>+	/*
> >>+	 * Create an IOMMU group and add devices to it.
> >>+	 * DMA setup is to be done via GPU's dma_set_mask().
> >>+	 */
> >>+	if (phb->type == PNV_PHB_NPU) {
> >>+		list_for_each_entry(pe, &phb->ioda.pe_dma_list, dma_link)
> >>+			pnv_pci_npu_setup_iommu(pe);
> >>+	}
> >>  }
> >>
> >>  #ifdef CONFIG_PCI_MSI
> >>diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> >>index 06405fd..0c0083a 100644
> >>--- a/arch/powerpc/platforms/powernv/pci.h
> >>+++ b/arch/powerpc/platforms/powernv/pci.h
> >>@@ -235,5 +235,6 @@ extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);
> >>  /* Nvlink functions */
> >>  extern void pnv_npu_try_dma_set_bypass(struct pci_dev *gpdev, bool bypass);
> >>  extern void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_phb *phb, bool rm);
> >>+extern void pnv_pci_npu_setup_iommu(struct pnv_ioda_pe *npe);
> >>
> >>  #endif /* __POWERNV_PCI_H */
> >
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
