All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Gavin Shan <gwshan@linux.vnet.ibm.com>,
	Paul Mackerras <paulus@samba.org>,
	Wei Yang <weiyang@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH kernel v12 33/34] vfio: powerpc/spapr: Register memory and define IOMMU v2
Date: Tue, 9 Jun 2015 14:21:25 +1000	[thread overview]
Message-ID: <20150609042125.GD11000@voom.redhat.com> (raw)
In-Reply-To: <1433486126-23551-34-git-send-email-aik@ozlabs.ru>

[-- Attachment #1: Type: text/plain, Size: 3082 bytes --]

On Fri, Jun 05, 2015 at 04:35:25PM +1000, Alexey Kardashevskiy wrote:
> The existing implementation accounts the whole DMA window in
> the locked_vm counter. This is going to be worse with multiple
> containers and huge DMA windows. Also, real-time accounting would requite
> additional tracking of accounted pages due to the page size difference -
> IOMMU uses 4K pages and system uses 4K or 64K pages.
> 
> Another issue is that actual pages pinning/unpinning happens on every
> DMA map/unmap request. This does not affect the performance much now as
> we spend way too much time now on switching context between
> guest/userspace/host but this will start to matter when we add in-kernel
> DMA map/unmap acceleration.
> 
> This introduces a new IOMMU type for SPAPR - VFIO_SPAPR_TCE_v2_IOMMU.
> New IOMMU deprecates VFIO_IOMMU_ENABLE/VFIO_IOMMU_DISABLE and introduces
> 2 new ioctls to register/unregister DMA memory -
> VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY -
> which receive user space address and size of a memory region which
> needs to be pinned/unpinned and counted in locked_vm.
> New IOMMU splits physical pages pinning and TCE table update
> into 2 different operations. It requires:
> 1) guest pages to be registered first
> 2) consequent map/unmap requests to work only with pre-registered memory.
> For the default single window case this means that the entire guest
> (instead of 2GB) needs to be pinned before using VFIO.
> When a huge DMA window is added, no additional pinning will be
> required, otherwise it would be guest RAM + 2GB.
> 
> The new memory registration ioctls are not supported by
> VFIO_SPAPR_TCE_IOMMU. Dynamic DMA window and in-kernel acceleration
> will require memory to be preregistered in order to work.
> 
> The accounting is done per the user process.
> 
> This advertises v2 SPAPR TCE IOMMU and restricts what the userspace
> can do with v1 or v2 IOMMUs.
> 
> In order to support memory pre-registration, we need a way to track
> the use of every registered memory region and only allow unregistration
> if a region is not in use anymore. So we need a way to tell from what
> region the just cleared TCE was from.
> 
> This adds a userspace view of the TCE table into iommu_table struct.
> It contains userspace address, one per TCE entry. The table is only
> allocated when the ownership over an IOMMU group is taken which means
> it is only used from outside of the powernv code (such as VFIO).
> 
> As v2 IOMMU supports IODA2 and pre-IODA2 IOMMUs (which do not support
> DDW API), this creates a default DMA window for IODA2 for consistency.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> [aw: for the vfio related changes]
> Acked-by: Alex Williamson <alex.williamson@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2015-06-09  4:56 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-05  6:34 [PATCH kernel v12 00/34] powerpc/iommu/vfio: Enable Dynamic DMA windows Alexey Kardashevskiy
2015-06-05  6:34 ` [PATCH kernel v12 01/34] powerpc/eeh/ioda2: Use device::iommu_group to check IOMMU group Alexey Kardashevskiy
2015-06-05  6:34 ` [PATCH kernel v12 02/34] powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group Alexey Kardashevskiy
2015-06-05  6:34 ` [PATCH kernel v12 03/34] powerpc/powernv/ioda: Clean up IOMMU group registration Alexey Kardashevskiy
2015-06-05  6:34 ` [PATCH kernel v12 04/34] powerpc/iommu: Put IOMMU group explicitly Alexey Kardashevskiy
2015-06-05  6:34 ` [PATCH kernel v12 05/34] powerpc/iommu: Always release iommu_table in iommu_free_table() Alexey Kardashevskiy
2015-06-05  6:34 ` [PATCH kernel v12 06/34] vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver Alexey Kardashevskiy
2015-06-05  6:34 ` [PATCH kernel v12 07/34] vfio: powerpc/spapr: Check that IOMMU page is fully contained by system page Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 08/34] vfio: powerpc/spapr: Use it_page_size Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 09/34] vfio: powerpc/spapr: Move locked_vm accounting to helpers Alexey Kardashevskiy
2015-06-09  4:22   ` David Gibson
2015-06-05  6:35 ` [PATCH kernel v12 10/34] vfio: powerpc/spapr: Disable DMA mappings on disabled container Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 11/34] vfio: powerpc/spapr: Moving pinning/unpinning to helpers Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 12/34] vfio: powerpc/spapr: Rework groups attaching Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 13/34] powerpc/powernv: Do not set "read" flag if direction==DMA_NONE Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 14/34] powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 15/34] powerpc/powernv/ioda/ioda2: Rework TCE invalidation in tce_build()/tce_free() Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 16/34] powerpc/spapr: vfio: Replace iommu_table with iommu_table_group Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 17/34] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group Alexey Kardashevskiy
2015-06-09  2:36   ` David Gibson
2015-06-09  5:59   ` [PATCH kernel] powerpc/pseries: Fix compile error when CONFIG_IOMMU_API if off Alexey Kardashevskiy
2015-06-09 12:23   ` [PATCH kernel v12 17/34] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group Michael Ellerman
2015-06-10  3:08     ` [PATCH kernel] powerpc/powernv: Fix crash when CONFIG_IOMMU_API is off Alexey Kardashevskiy
2015-06-10  7:33   ` [kernel, v12, 17/34] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group Michael Ellerman
2015-06-11  4:28     ` [PATCH kernel v12.2] powerpc/powernv: Fix crash when CONFIG_IOMMU_API is off Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 18/34] vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 19/34] powerpc/iommu: Fix IOMMU ownership control functions Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 20/34] powerpc/powernv/ioda2: Move TCE kill register address to PE Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 21/34] powerpc/powernv/ioda2: Add TCE invalidation for all attached groups Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 22/34] powerpc/powernv: Implement accessor to TCE entry Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 23/34] powerpc/iommu/powernv: Release replaced TCE Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 24/34] powerpc/powernv/ioda2: Rework iommu_table creation Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 25/34] powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 26/34] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window Alexey Kardashevskiy
2015-06-09  4:23   ` David Gibson
2015-06-05  6:35 ` [PATCH kernel v12 27/34] powerpc/powernv: Implement multilevel TCE tables Alexey Kardashevskiy
2015-06-09  3:57   ` David Gibson
2015-06-05  6:35 ` [PATCH kernel v12 28/34] vfio: powerpc/spapr: powerpc/powernv/ioda: Define and implement DMA windows API Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 29/34] powerpc/powernv/ioda2: Use new helpers to do proper cleanup on PE release Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 30/34] powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future table Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 31/34] vfio: powerpc/spapr: powerpc/powernv/ioda2: Use DMA windows API in ownership control Alexey Kardashevskiy
2015-06-05  6:35 ` [PATCH kernel v12 32/34] powerpc/mmu: Add userspace-to-physical addresses translation cache Alexey Kardashevskiy
2015-06-09  4:05   ` David Gibson
2015-06-05  6:35 ` [PATCH kernel v12 33/34] vfio: powerpc/spapr: Register memory and define IOMMU v2 Alexey Kardashevskiy
2015-06-09  4:21   ` David Gibson [this message]
2015-06-05  6:35 ` [PATCH kernel v12 34/34] vfio: powerpc/spapr: Support Dynamic DMA windows Alexey Kardashevskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150609042125.GD11000@voom.redhat.com \
    --to=david@gibson.dropbear.id.au \
    --cc=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=benh@kernel.crashing.org \
    --cc=gwshan@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    --cc=weiyang@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.