From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Wei Yang <weiyang@linux.vnet.ibm.com>,
Gavin Shan <gwshan@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org,
Alex Williamson <alex.williamson@redhat.com>,
Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH kernel v10 06/34] vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver
Date: Wed, 13 May 2015 15:58:09 +1000
Message-ID: <20150513055809.GA3622@gwshan>
In-Reply-To: <1431358763-24371-7-git-send-email-aik@ozlabs.ru>
On Tue, May 12, 2015 at 01:38:55AM +1000, Alexey Kardashevskiy wrote:
>This moves page pinning (get_user_pages_fast()/put_page()) code out of
>the platform IOMMU code and puts it to VFIO IOMMU driver where it belongs
>to as the platform code does not deal with page pinning.
>
>This makes iommu_take_ownership()/iommu_release_ownership() deal with
>the IOMMU table bitmap only.
>
>This removes page unpinning from iommu_take_ownership() as the actual
>TCE table might contain garbage and doing put_page() on it is undefined
>behaviour.
>
>Besides the last part, the rest of the patch is mechanical.
>
>Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>[aw: for the vfio related changes]
>Acked-by: Alex Williamson <alex.williamson@redhat.com>
>Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>---
>Changes:
>v9:
>* added missing tce_iommu_clear call after iommu_release_ownership()
>* brought @offset (a local variable) back to make patch even more
>mechanical
>
>v4:
>* s/iommu_tce_build(tbl, entry + 1/iommu_tce_build(tbl, entry + i/
>---
> arch/powerpc/include/asm/iommu.h | 4 --
> arch/powerpc/kernel/iommu.c | 55 -------------------------
> drivers/vfio/vfio_iommu_spapr_tce.c | 80 +++++++++++++++++++++++++++++++------
> 3 files changed, 67 insertions(+), 72 deletions(-)
>
>diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
>index 8353c86..e94a5e3 100644
>--- a/arch/powerpc/include/asm/iommu.h
>+++ b/arch/powerpc/include/asm/iommu.h
>@@ -194,10 +194,6 @@ extern int iommu_tce_build(struct iommu_table *tbl, unsigned long entry,
> unsigned long hwaddr, enum dma_data_direction direction);
> extern unsigned long iommu_clear_tce(struct iommu_table *tbl,
> unsigned long entry);
>-extern int iommu_clear_tces_and_put_pages(struct iommu_table *tbl,
>- unsigned long entry, unsigned long pages);
>-extern int iommu_put_tce_user_mode(struct iommu_table *tbl,
>- unsigned long entry, unsigned long tce);
>
> extern void iommu_flush_tce(struct iommu_table *tbl);
> extern int iommu_take_ownership(struct iommu_table *tbl);
>diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>index 2c02d4c..8673c94 100644
>--- a/arch/powerpc/kernel/iommu.c
>+++ b/arch/powerpc/kernel/iommu.c
>@@ -983,30 +983,6 @@ unsigned long iommu_clear_tce(struct iommu_table *tbl, unsigned long entry)
> }
> EXPORT_SYMBOL_GPL(iommu_clear_tce);
>
>-int iommu_clear_tces_and_put_pages(struct iommu_table *tbl,
>- unsigned long entry, unsigned long pages)
>-{
>- unsigned long oldtce;
>- struct page *page;
>-
>- for ( ; pages; --pages, ++entry) {
>- oldtce = iommu_clear_tce(tbl, entry);
>- if (!oldtce)
>- continue;
>-
>- page = pfn_to_page(oldtce >> PAGE_SHIFT);
>- WARN_ON(!page);
>- if (page) {
>- if (oldtce & TCE_PCI_WRITE)
>- SetPageDirty(page);
>- put_page(page);
>- }
>- }
>-
>- return 0;
>-}
>-EXPORT_SYMBOL_GPL(iommu_clear_tces_and_put_pages);
>-
> /*
> * hwaddr is a kernel virtual address here (0xc... bazillion),
> * tce_build converts it to a physical address.
>@@ -1036,35 +1012,6 @@ int iommu_tce_build(struct iommu_table *tbl, unsigned long entry,
> }
> EXPORT_SYMBOL_GPL(iommu_tce_build);
>
>-int iommu_put_tce_user_mode(struct iommu_table *tbl, unsigned long entry,
>- unsigned long tce)
>-{
>- int ret;
>- struct page *page = NULL;
>- unsigned long hwaddr, offset = tce & IOMMU_PAGE_MASK(tbl) & ~PAGE_MASK;
>- enum dma_data_direction direction = iommu_tce_direction(tce);
>-
>- ret = get_user_pages_fast(tce & PAGE_MASK, 1,
>- direction != DMA_TO_DEVICE, &page);
>- if (unlikely(ret != 1)) {
>- /* pr_err("iommu_tce: get_user_pages_fast failed tce=%lx ioba=%lx ret=%d\n",
>- tce, entry << tbl->it_page_shift, ret); */
>- return -EFAULT;
>- }
>- hwaddr = (unsigned long) page_address(page) + offset;
>-
>- ret = iommu_tce_build(tbl, entry, hwaddr, direction);
>- if (ret)
>- put_page(page);
>-
>- if (ret < 0)
>- pr_err("iommu_tce: %s failed ioba=%lx, tce=%lx, ret=%d\n",
>- __func__, entry << tbl->it_page_shift, tce, ret);
>-
>- return ret;
>-}
>-EXPORT_SYMBOL_GPL(iommu_put_tce_user_mode);
>-
> int iommu_take_ownership(struct iommu_table *tbl)
> {
> unsigned long sz = (tbl->it_size + 7) >> 3;
>@@ -1078,7 +1025,6 @@ int iommu_take_ownership(struct iommu_table *tbl)
> }
>
> memset(tbl->it_map, 0xff, sz);
>- iommu_clear_tces_and_put_pages(tbl, tbl->it_offset, tbl->it_size);
>
> /*
> * Disable iommu bypass, otherwise the user can DMA to all of
>@@ -1096,7 +1042,6 @@ void iommu_release_ownership(struct iommu_table *tbl)
> {
> unsigned long sz = (tbl->it_size + 7) >> 3;
>
>- iommu_clear_tces_and_put_pages(tbl, tbl->it_offset, tbl->it_size);
> memset(tbl->it_map, 0, sz);
>
> /* Restore bit#0 set by iommu_init_table() */
>diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
>index 730b4ef..b95fa2b 100644
>--- a/drivers/vfio/vfio_iommu_spapr_tce.c
>+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
>@@ -147,6 +147,67 @@ static void tce_iommu_release(void *iommu_data)
> kfree(container);
> }
>
>+static int tce_iommu_clear(struct tce_container *container,
>+ struct iommu_table *tbl,
>+ unsigned long entry, unsigned long pages)
>+{
>+ unsigned long oldtce;
>+ struct page *page;
>+
>+ for ( ; pages; --pages, ++entry) {
>+ oldtce = iommu_clear_tce(tbl, entry);
It might be nice to rename iommu_clear_tce() to iommu_tce_free() in a
separate patch, for two reasons: iommu_tce_{build,free}() would then form a
pair of functions doing opposite things, and iommu_tce_free() is implemented
on top of ppc_md.tce_free() just as iommu_tce_build() is implemented on top
of ppc_md.tce_build().
>+ if (!oldtce)
>+ continue;
>+
>+ page = pfn_to_page(oldtce >> PAGE_SHIFT);
>+ WARN_ON(!page);
>+ if (page) {
>+ if (oldtce & TCE_PCI_WRITE)
>+ SetPageDirty(page);
>+ put_page(page);
>+ }
>+ }
>+
>+ return 0;
>+}
>+
>+static long tce_iommu_build(struct tce_container *container,
>+ struct iommu_table *tbl,
>+ unsigned long entry, unsigned long tce, unsigned long pages)
>+{
>+ long i, ret = 0;
>+ struct page *page = NULL;
>+ unsigned long hva;
>+ enum dma_data_direction direction = iommu_tce_direction(tce);
>+
>+ for (i = 0; i < pages; ++i) {
>+ unsigned long offset = tce & IOMMU_PAGE_MASK(tbl) & ~PAGE_MASK;
>+
>+ ret = get_user_pages_fast(tce & PAGE_MASK, 1,
>+ direction != DMA_TO_DEVICE, &page);
>+ if (unlikely(ret != 1)) {
>+ ret = -EFAULT;
>+ break;
>+ }
>+ hva = (unsigned long) page_address(page) + offset;
>+
>+ ret = iommu_tce_build(tbl, entry + i, hva, direction);
>+ if (ret) {
>+ put_page(page);
>+ pr_err("iommu_tce: %s failed ioba=%lx, tce=%lx, ret=%ld\n",
>+ __func__, entry << tbl->it_page_shift,
>+ tce, ret);
>+ break;
>+ }
>+ tce += IOMMU_PAGE_SIZE_4K;
>+ }
>+
>+ if (ret)
>+ tce_iommu_clear(container, tbl, entry, i);
>+
>+ return ret;
>+}
>+
> static long tce_iommu_ioctl(void *iommu_data,
> unsigned int cmd, unsigned long arg)
> {
>@@ -195,7 +256,7 @@ static long tce_iommu_ioctl(void *iommu_data,
> case VFIO_IOMMU_MAP_DMA: {
> struct vfio_iommu_type1_dma_map param;
> struct iommu_table *tbl = container->tbl;
>- unsigned long tce, i;
>+ unsigned long tce;
>
> if (!tbl)
> return -ENXIO;
>@@ -229,17 +290,9 @@ static long tce_iommu_ioctl(void *iommu_data,
> if (ret)
> return ret;
>
>- for (i = 0; i < (param.size >> IOMMU_PAGE_SHIFT_4K); ++i) {
>- ret = iommu_put_tce_user_mode(tbl,
>- (param.iova >> IOMMU_PAGE_SHIFT_4K) + i,
>- tce);
>- if (ret)
>- break;
>- tce += IOMMU_PAGE_SIZE_4K;
>- }
>- if (ret)
>- iommu_clear_tces_and_put_pages(tbl,
>- param.iova >> IOMMU_PAGE_SHIFT_4K, i);
>+ ret = tce_iommu_build(container, tbl,
>+ param.iova >> IOMMU_PAGE_SHIFT_4K,
>+ tce, param.size >> IOMMU_PAGE_SHIFT_4K);
>
> iommu_flush_tce(tbl);
>
>@@ -273,7 +326,7 @@ static long tce_iommu_ioctl(void *iommu_data,
> if (ret)
> return ret;
>
>- ret = iommu_clear_tces_and_put_pages(tbl,
>+ ret = tce_iommu_clear(container, tbl,
> param.iova >> IOMMU_PAGE_SHIFT_4K,
> param.size >> IOMMU_PAGE_SHIFT_4K);
> iommu_flush_tce(tbl);
>@@ -357,6 +410,7 @@ static void tce_iommu_detach_group(void *iommu_data,
> /* pr_debug("tce_vfio: detaching group #%u from iommu %p\n",
> iommu_group_id(iommu_group), iommu_group); */
> container->tbl = NULL;
>+ tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size);
> iommu_release_ownership(tbl);
> }
> mutex_unlock(&container->lock);
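To illustrate the rename suggested above for iommu_clear_tce(), here is a
minimal userspace sketch, not kernel code. The struct layouts, the
ppc_md_mock table and the callback signatures are simplified assumptions
made up for this example (the real ppc_md callbacks take different
arguments); it only shows how iommu_tce_build()/iommu_tce_free() would read
as a symmetric pair, each a thin wrapper around the matching platform
callback.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct iommu_table. */
struct iommu_table {
	unsigned long it_size;   /* number of TCE entries */
	unsigned long *it_base;  /* mock TCE array, one word per entry */
};

/* Hypothetical stand-in for the tce_build/tce_free members of ppc_md. */
struct machdep_calls_mock {
	int (*tce_build)(struct iommu_table *tbl, unsigned long entry,
			 unsigned long hwaddr);
	unsigned long (*tce_free)(struct iommu_table *tbl,
				  unsigned long entry);
};

/* Mock platform callback: store the address in the table slot. */
static int mock_tce_build(struct iommu_table *tbl, unsigned long entry,
			  unsigned long hwaddr)
{
	tbl->it_base[entry] = hwaddr;
	return 0;
}

/* Mock platform callback: clear the slot and return the old value. */
static unsigned long mock_tce_free(struct iommu_table *tbl,
				   unsigned long entry)
{
	unsigned long oldtce = tbl->it_base[entry];

	tbl->it_base[entry] = 0;
	return oldtce;
}

static struct machdep_calls_mock ppc_md_mock = {
	.tce_build = mock_tce_build,
	.tce_free = mock_tce_free,
};

/* Build one entry; mirrors the role of iommu_tce_build() in the patch. */
static int iommu_tce_build(struct iommu_table *tbl, unsigned long entry,
			   unsigned long hwaddr)
{
	if (entry >= tbl->it_size)
		return -1;
	return ppc_md_mock.tce_build(tbl, entry, hwaddr);
}

/*
 * Proposed name for what the patch calls iommu_clear_tce(): free one
 * entry and hand the old TCE back so the caller can unpin the page.
 */
static unsigned long iommu_tce_free(struct iommu_table *tbl,
				    unsigned long entry)
{
	if (entry >= tbl->it_size)
		return 0;
	return ppc_md_mock.tce_free(tbl, entry);
}
```

With that naming, tce_iommu_clear() in the VFIO driver would call
iommu_tce_free() and then do put_page() on the returned TCE, which mirrors
tce_iommu_build() doing get_user_pages_fast() before iommu_tce_build().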
Thanks,
Gavin
>--
>2.4.0.rc3.8.gfb3e7d5
>