qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>,
	qemu-devel@nongnu.org, Michael Roth <mdroth@linux.vnet.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	qemu-ppc@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)
Date: Fri, 17 Jul 2015 17:13:37 +1000	[thread overview]
Message-ID: <55A8AB21.8080307@ozlabs.ru> (raw)
In-Reply-To: <20150716051122.GA25179@voom.redhat.com>

On 07/16/2015 03:11 PM, David Gibson wrote:
> On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote:
>> This makes use of the new "memory registering" feature. The idea is
>> to provide the userspace ability to notify the host kernel about pages
>> which are going to be used for DMA. Having this information, the host
>> kernel can pin them all once per user process, do locked pages
>> accounting (once) and not spent time on doing that in real time with
>> possible failures which cannot be handled nicely in some cases.
>>
>> This adds a guest RAM memory listener which notifies a VFIO container
>> about memory which needs to be pinned/unpinned. VFIO MMIO regions
>> (i.e. "skip dump" regions) are skipped.
>>
>> The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
>> are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
>> not call it when v2 is detected and enabled.
>>
>> This does not change the guest visible interface.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> I've looked at this in more depth now, and attempting to unify the
> pre-reg and mapping listeners like this can't work - they need to be
> listening on different address spaces:  mapping actions need to be
> listening on the PCI address space, whereas the pre-reg needs to be
> listening on address_space_memory.  For x86 - for now - those end up
> being the same thing, but on Power they're not.
>
> We do need to be clear about what differences are due to the presence
> of a guest IOMMU versus which are due to arch or underlying IOMMU
> type.  For now Power has a guest IOMMU and x86 doesn't, but that could
> well change in future: we could well implement the guest side IOMMU
> for x86 in future (or x86 could invent a paravirt IOMMU interface).
> On the other side, BenH's experimental powernv machine type could
> introduce Power machines without a guest side IOMMU (or at least an
> optional guest side IOMMU).
>
> The quick and dirty approach here is:
>     1. Leave the main listener as is
>     2. Add a new pre-reg notifier to the spapr iommu specific code,
>        which listens on address_space_memory, *not* the PCI space
>
> The more generally correct approach, which allows for more complex
> IOMMU arrangements and the possibility of new IOMMU types with pre-reg
> is:
>     1. Have the core implement both a mapping listener and a pre-reg
>        listener (optionally enabled by a per-iommu-type flag).
>        Basically the first one sees what *is* mapped, the second sees
>        what *could* be mapped.
>
>     2. As now, the mapping listener listens on PCI address space, if
>        RAM blocks are added, immediately map them into the host IOMMU,
>        if guest IOMMU blocks appear register a notifier which will
>        mirror guest IOMMU mappings to the host IOMMU (this is what we
>        do now).
>
>     3. The pre-reg listener also listens on the PCI address space.  RAM
>        blocks added are pre-registered immediately.


PCI address space listeners won't be notified about RAM blocks on sPAPR.


> But, if guest
>        IOMMU blocks are added, instead of registering a guest-iommu
>        notifier,

"guest-iommu notifier" is the one called via memory_region_notify_iommu() 
from H_PUT_TCE? "Instead" implies dropping it, how this can work?


> we register another listener on the *target* AS of the
>        guest IOMMU, same callbacks as this one.  In practice that
>        target AS will almost always resolve to address_space_memory,
>        but this can at least in theory handle crazy guest setups with
>        multiple layers of IOMMU.
>
>     4. Have to ensure that the pre-reg callbacks always happen before
>        the mapping calls.  For a system with an IOMMU backend which
>        requires pre-registration, but doesn't have a guest IOMMU, we
>        need to pre-reg, then host-iommu-map RAM blocks that appear in
>        PCI address space.




-- 
Alexey

  parent reply	other threads:[~2015-07-17  7:13 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-14 12:21 [Qemu-devel] [RFC PATCH qemu v3 0/4] vfio: SPAPR IOMMU v2 (memory preregistration support) Alexey Kardashevskiy
2015-07-14 12:21 ` [Qemu-devel] [RFC PATCH qemu v3 1/4] memory: Add reporting of supported page sizes Alexey Kardashevskiy
2015-07-15 18:26   ` Alex Williamson
2015-07-16  1:12     ` Alexey Kardashevskiy
2015-07-14 12:21 ` [Qemu-devel] [RFC PATCH qemu v3 2/4] vfio: Use different page size for different IOMMU types Alexey Kardashevskiy
2015-07-15 18:26   ` Alex Williamson
2015-07-16  1:26     ` Alexey Kardashevskiy
2015-07-16  2:51       ` Alex Williamson
2015-07-16  3:31         ` Alexey Kardashevskiy
2015-07-14 12:21 ` [Qemu-devel] [RFC PATCH qemu v3 3/4] vfio: Store IOMMU type in container Alexey Kardashevskiy
2015-07-15 18:26   ` Alex Williamson
2015-07-14 12:21 ` [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering) Alexey Kardashevskiy
2015-07-16  5:11   ` David Gibson
2015-07-16 14:44     ` Alex Williamson
2015-07-17  5:20       ` David Gibson
2015-07-17 18:25         ` Alex Williamson
2015-07-18 15:05           ` David Gibson
2015-07-19 15:04             ` Alex Williamson
2015-07-17  7:13     ` Alexey Kardashevskiy [this message]
2015-07-17 13:39       ` David Gibson
2015-07-17 15:47         ` Alexey Kardashevskiy
2015-07-18 15:17           ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55A8AB21.8080307@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.crosthwaite@xilinx.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).