From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>,
qemu-devel@nongnu.org, Michael Roth <mdroth@linux.vnet.ibm.com>,
Alex Williamson <alex.williamson@redhat.com>,
qemu-ppc@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)
Date: Fri, 17 Jul 2015 17:13:37 +1000
Message-ID: <55A8AB21.8080307@ozlabs.ru>
In-Reply-To: <20150716051122.GA25179@voom.redhat.com>

On 07/16/2015 03:11 PM, David Gibson wrote:
> On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote:
>> This makes use of the new "memory registering" feature. The idea is
>> to give userspace the ability to notify the host kernel about pages
>> which are going to be used for DMA. Having this information, the host
>> kernel can pin them all once per user process, do locked pages
>> accounting (once) and not spend time on doing that in real time with
>> possible failures which cannot be handled nicely in some cases.
>>
>> This adds a guest RAM memory listener which notifies a VFIO container
>> about memory which needs to be pinned/unpinned. VFIO MMIO regions
>> (i.e. "skip dump" regions) are skipped.
>>
>> The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
>> are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
>> not call it when v2 is detected and enabled.
>>
>> This does not change the guest visible interface.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
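
To make the "pin once, account once" idea from the commit message concrete,
here is a minimal toy model in C. It is only a sketch under stated
assumptions: every name in it (toy_container, toy_prereg, toy_map, the 4K
page size, the fixed-size region table) is hypothetical and is neither the
kernel interface nor the QEMU code from this patch. It shows that locked-page
accounting happens at pre-registration time, so a later map request only has
to check that the range was pre-registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE   4096u   /* assumed page size, illustration only */
#define TOY_MAX_REGIONS 8       /* arbitrary limit for the toy table */

struct toy_region {
    uint64_t vaddr;
    uint64_t size;
};

/* Hypothetical stand-in for a container with pre-registered memory. */
struct toy_container {
    struct toy_region regions[TOY_MAX_REGIONS];
    size_t nr_regions;
    uint64_t locked_pages;      /* pages accounted so far */
    uint64_t locked_limit;      /* models a locked-memory limit */
};

/* Pre-register a range: account its pages once. Returns 0 or -1. */
int toy_prereg(struct toy_container *c, uint64_t vaddr, uint64_t size)
{
    uint64_t pages;

    assert(c->locked_pages <= c->locked_limit);
    if (size == 0 || vaddr % TOY_PAGE_SIZE || size % TOY_PAGE_SIZE) {
        return -1;              /* only whole, aligned pages */
    }
    if (c->nr_regions == TOY_MAX_REGIONS) {
        return -1;
    }
    pages = size / TOY_PAGE_SIZE;
    if (pages > c->locked_limit - c->locked_pages) {
        return -1;              /* the failure surfaces here, up front */
    }
    c->regions[c->nr_regions].vaddr = vaddr;
    c->regions[c->nr_regions].size = size;
    c->nr_regions++;
    c->locked_pages += pages;
    return 0;
}

/* True if [vaddr, vaddr + size) lies inside one pre-registered region. */
bool toy_is_preregistered(const struct toy_container *c,
                          uint64_t vaddr, uint64_t size)
{
    for (size_t i = 0; i < c->nr_regions; i++) {
        const struct toy_region *r = &c->regions[i];

        if (vaddr >= r->vaddr && size <= r->size &&
            vaddr - r->vaddr <= r->size - size) {
            return true;
        }
    }
    return false;
}

/* Map request: no accounting here, only a lookup. Returns 0 or -1. */
int toy_map(const struct toy_container *c, uint64_t vaddr, uint64_t size)
{
    return toy_is_preregistered(c, vaddr, size) ? 0 : -1;
}
```

In this model a map of a pre-registered range never touches locked_pages,
which is the point of doing the accounting ahead of time.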
>
> I've looked at this in more depth now, and attempting to unify the
> pre-reg and mapping listeners like this can't work - they need to be
> listening on different address spaces: mapping actions need to be
> listening on the PCI address space, whereas the pre-reg needs to be
> listening on address_space_memory. For x86 - for now - those end up
> being the same thing, but on Power they're not.
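
The address space point above can be illustrated with a small toy model in C.
This is a sketch only: the types and functions (toy_address_space,
toy_listener, toy_listener_register) are hypothetical and only loosely mirror
how a memory listener is shown the sections of the address space it is
registered on. It assumes a Power-like layout in which the PCI address space
holds a guest IOMMU region and no RAM, while the system memory address space
holds the RAM.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_RAM, TOY_IOMMU };

struct toy_section {
    enum toy_kind kind;
    const char *name;
};

/* Hypothetical stand-in for an address space: a flat list of sections. */
struct toy_address_space {
    const char *name;
    const struct toy_section *sections;
    size_t nr_sections;
};

/* Hypothetical listener that just counts what it is shown. */
struct toy_listener {
    size_t ram_seen;
    size_t iommu_seen;
};

/*
 * Register a listener on one address space. In this model the listener is
 * told about every section of that address space and about nothing else.
 */
void toy_listener_register(struct toy_listener *l,
                           const struct toy_address_space *as)
{
    assert(l != NULL && as != NULL);
    for (size_t i = 0; i < as->nr_sections; i++) {
        if (as->sections[i].kind == TOY_RAM) {
            l->ram_seen++;
        } else {
            l->iommu_seen++;
        }
    }
}
```

With this layout, a listener registered on the PCI address space sees one
IOMMU section and no RAM, so it has nothing to pre-register; a listener
registered on the memory address space sees the RAM. On a layout where the
PCI address space aliases system memory, both listeners would see the same
RAM sections.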
>
> We do need to be clear about what differences are due to the presence
> of a guest IOMMU versus which are due to arch or underlying IOMMU
> type. For now Power has a guest IOMMU and x86 doesn't, but that could
> well change: we could implement a guest-side IOMMU for x86 in
> future (or x86 could invent a paravirt IOMMU interface).
> On the other side, BenH's experimental powernv machine type could
> introduce Power machines without a guest side IOMMU (or at least an
> optional guest side IOMMU).
>
> The quick and dirty approach here is:
> 1. Leave the main listener as is
> 2. Add a new pre-reg notifier to the spapr iommu specific code,
> which listens on address_space_memory, *not* the PCI space
>
> The more generally correct approach, which allows for more complex
> IOMMU arrangements and the possibility of new IOMMU types with pre-reg
> is:
> 1. Have the core implement both a mapping listener and a pre-reg
> listener (optionally enabled by a per-iommu-type flag).
> Basically the first one sees what *is* mapped, the second sees
> what *could* be mapped.
>
> 2. As now, the mapping listener listens on PCI address space. If
> RAM blocks are added, immediately map them into the host IOMMU;
> if guest IOMMU blocks appear, register a notifier which will
> mirror guest IOMMU mappings to the host IOMMU (this is what we
> do now).
>
> 3. The pre-reg listener also listens on the PCI address space. RAM
> blocks added are pre-registered immediately.

PCI address space listeners won't be notified about RAM blocks on sPAPR.

> But, if guest
> IOMMU blocks are added, instead of registering a guest-iommu
> notifier,

Is the "guest-iommu notifier" the one called via memory_region_notify_iommu()
from H_PUT_TCE? "Instead" implies dropping it; how can this work?

> we register another listener on the *target* AS of the
> guest IOMMU, same callbacks as this one. In practice that
> target AS will almost always resolve to address_space_memory,
> but this can at least in theory handle crazy guest setups with
> multiple layers of IOMMU.
>
> 4. Have to ensure that the pre-reg callbacks always happen before
> the mapping calls. For a system with an IOMMU backend which
> requires pre-registration, but doesn't have a guest IOMMU, we
> need to pre-reg, then host-iommu-map RAM blocks that appear in
> PCI address space.
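
One possible reading of steps 1-4 above, written as a toy model in C. This is
a sketch under assumptions, not the proposed implementation: all names
(toy_as, toy_prereg_listen, toy_map_listen, toy_container_attach, the event
log) are hypothetical, the guest IOMMU notifier stays with the mapping
listener, and the chain of IOMMU target address spaces is assumed to be
acyclic. The pre-reg listener follows a guest IOMMU section to its target
address space, and the container runs pre-reg before mapping.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_RAM, TOY_IOMMU };

struct toy_as;

struct toy_section {
    enum toy_kind kind;
    int ram_id;                   /* meaningful for TOY_RAM */
    const struct toy_as *target;  /* meaningful for TOY_IOMMU */
};

struct toy_as {
    const struct toy_section *sections;
    size_t nr;
};

enum toy_op { TOY_PREREG, TOY_MAP, TOY_NOTIFIER };

#define TOY_LOG_MAX 16

struct toy_event {
    enum toy_op op;
    int ram_id;
};

/* Records the order of actions so that it can be inspected. */
struct toy_log {
    struct toy_event ev[TOY_LOG_MAX];
    size_t n;
};

static void toy_log_add(struct toy_log *log, enum toy_op op, int ram_id)
{
    assert(log->n < TOY_LOG_MAX);
    log->ev[log->n].op = op;
    log->ev[log->n].ram_id = ram_id;
    log->n++;
}

/* Pre-reg listener: sees what *could* be mapped (step 3). */
void toy_prereg_listen(struct toy_log *log, const struct toy_as *as)
{
    for (size_t i = 0; i < as->nr; i++) {
        const struct toy_section *s = &as->sections[i];

        if (s->kind == TOY_RAM) {
            toy_log_add(log, TOY_PREREG, s->ram_id);
        } else if (s->target != NULL) {
            /* Same callbacks, applied to the IOMMU's target AS. */
            toy_prereg_listen(log, s->target);
        }
    }
}

/* Mapping listener: sees what *is* mapped (step 2). */
void toy_map_listen(struct toy_log *log, const struct toy_as *as)
{
    for (size_t i = 0; i < as->nr; i++) {
        const struct toy_section *s = &as->sections[i];

        if (s->kind == TOY_RAM) {
            toy_log_add(log, TOY_MAP, s->ram_id);
        } else {
            /* Guest IOMMU: mappings are mirrored later via a notifier. */
            toy_log_add(log, TOY_NOTIFIER, -1);
        }
    }
}

/* Step 4: pre-registration always happens before mapping. */
void toy_container_attach(struct toy_log *log, const struct toy_as *pci_as)
{
    toy_prereg_listen(log, pci_as);
    toy_map_listen(log, pci_as);
}
```

For a Power-like layout (PCI address space containing only an IOMMU section
whose target holds the RAM), the log contains a pre-reg of the RAM followed
by a notifier registration. For a layout with RAM directly in the PCI address
space, it contains a pre-reg followed by a map of the same block.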
--
Alexey