From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org,
Alex Williamson <alex.williamson@redhat.com>,
Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH kernel v9 28/32] powerpc/mmu: Add userspace-to-physical addresses translation cache
Date: Thu, 30 Apr 2015 16:34:55 +1000
Message-ID: <20150430063455.GA24886@voom.redhat.com>
In-Reply-To: <1429964096-11524-29-git-send-email-aik@ozlabs.ru>
On Sat, Apr 25, 2015 at 10:14:52PM +1000, Alexey Kardashevskiy wrote:
> We are adding support for DMA memory pre-registration to be used in
> conjunction with VFIO. The idea is that the userspace which is going to
> run a guest may want to pre-register a user space memory region so
> it all gets pinned once and never goes away. Having this done,
> a hypervisor will not have to pin/unpin pages on every DMA map/unmap
> request. This is going to help with multiple pinning of the same memory
> and in-kernel acceleration of DMA requests.
>
> This adds a list of memory regions to mm_context_t. Each region consists
> of a header and a list of physical addresses. This adds API to:
> 1. register/unregister memory regions;
> 2. do final cleanup (which puts all pre-registered pages);
> 3. do userspace to physical address translation;
> 4. manage a mapped pages counter; when it is zero, it is safe to
> unregister the region.
>
> Multiple registration of the same region is allowed, kref is used to
> track the number of registrations.
[snip]
> +long mm_iommu_alloc(unsigned long ua, unsigned long entries,
> + struct mm_iommu_table_group_mem_t **pmem)
> +{
> + struct mm_iommu_table_group_mem_t *mem;
> + long i, j;
> + struct page *page = NULL;
> +
> + list_for_each_entry_rcu(mem, &current->mm->context.iommu_group_mem_list,
> + next) {
> + if ((mem->ua == ua) && (mem->entries == entries))
> + return -EBUSY;
> +
> + /* Overlap? */
> + if ((mem->ua < (ua + (entries << PAGE_SHIFT))) &&
> + (ua < (mem->ua + (mem->entries << PAGE_SHIFT))))
> + return -EINVAL;
> + }
> +
> + mem = kzalloc(sizeof(*mem), GFP_KERNEL);
> + if (!mem)
> + return -ENOMEM;
> +
> + mem->hpas = vzalloc(entries * sizeof(mem->hpas[0]));
> + if (!mem->hpas) {
> + kfree(mem);
> + return -ENOMEM;
> + }
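For reference, the overlap test in the hunk above is the usual half-open interval check: two ranges intersect exactly when each one starts before the other ends. A minimal standalone sketch, assuming 4K pages and a hypothetical cut-down struct (only the ua and entries fields come from the quoted code; everything else is illustrative):

```c
#include <stdbool.h>

#define PAGE_SHIFT 12 /* assumption: 4K pages, for illustration only */

/* Hypothetical stand-in for the two header fields the quoted loop reads. */
struct region {
	unsigned long ua;      /* userspace start address */
	unsigned long entries; /* length in pages */
};

/* True if [a->ua, a_end) and [b->ua, b_end) share at least one byte. */
static bool regions_overlap(const struct region *a, const struct region *b)
{
	unsigned long a_end = a->ua + (a->entries << PAGE_SHIFT);
	unsigned long b_end = b->ua + (b->entries << PAGE_SHIFT);

	return a->ua < b_end && b->ua < a_end;
}
```

Under these assumptions a 4-page region at 0x10000 and a 2-page region starting at 0x14000 are merely adjacent and do not overlap, while an 8-page region starting at 0x12000 does. Note that in the quoted loop an exact match is caught first and reported as -EBUSY, so only partial overlaps reach the -EINVAL path.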
So, I've thought more about this and I'm really confused as to what
this is supposed to be accomplishing.
I see that you need to keep track of what regions are registered, so
you don't double lock or unlock, but I don't see what the point of
actually storing the translations in hpas is.
I had assumed it was so that you could later on get to the
translations in real mode when you do in-kernel acceleration. But
that doesn't make sense, because the array is vmalloc()ed, so can't be
accessed in real mode anyway.
I can't think of a circumstance in which you can use hpas where you
couldn't just walk the page tables anyway.
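To make the question concrete, a lookup through hpas would presumably amount to an array index plus the in-page offset. The sketch below is plain userspace C under stated assumptions: 4K pages, a hypothetical cut-down struct, and a made-up function name. It is not the patch's code.

```c
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumption: 4K pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical region header; ua, entries and hpas are the only
 * fields visible in the quoted hunk. */
struct region {
	unsigned long ua;      /* userspace start address */
	unsigned long entries; /* number of pages */
	unsigned long *hpas;   /* one host physical address per page */
};

/* Stores the translation in *hpa and returns 0, or returns -1 if ua
 * lies outside the region. */
static int region_ua_to_hpa(const struct region *mem, unsigned long ua,
			    unsigned long *hpa)
{
	unsigned long idx;

	if (ua < mem->ua)
		return -1;
	idx = (ua - mem->ua) >> PAGE_SHIFT;
	if (idx >= mem->entries)
		return -1;

	/* Page frame from the cache, byte offset from the user address. */
	*hpa = (mem->hpas[idx] & PAGE_MASK) | (ua & ~PAGE_MASK);
	return 0;
}
```

Every step here dereferences mem->hpas, which is exactly the vmalloc()ed array, so the same real-mode restriction applies to this path.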
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson