From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>,
	qemu-devel@nongnu.org, Michael Roth <mdroth@linux.vnet.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	qemu-ppc@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)
Date: Sat, 18 Jul 2015 01:47:34 +1000
Message-ID: <55A92396.2030506@ozlabs.ru>
In-Reply-To: <20150717133959.GG25179@voom.redhat.com>

On 07/17/2015 11:39 PM, David Gibson wrote:
> On Fri, Jul 17, 2015 at 05:13:37PM +1000, Alexey Kardashevskiy wrote:
>> On 07/16/2015 03:11 PM, David Gibson wrote:
>>> On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote:
>>>> This makes use of the new "memory registering" feature. The idea is
>>>> to provide the userspace ability to notify the host kernel about pages
>>>> which are going to be used for DMA. Having this information, the host
>>>> kernel can pin them all once per user process, do locked pages
>>>> accounting (once) and not spend time on doing that in real time with
>>>> possible failures which cannot be handled nicely in some cases.
>>>>
>>>> This adds a guest RAM memory listener which notifies a VFIO container
>>>> about memory which needs to be pinned/unpinned. VFIO MMIO regions
>>>> (i.e. "skip dump" regions) are skipped.
>>>>
>>>> The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
>>>> are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
>>>> not call it when v2 is detected and enabled.
>>>>
>>>> This does not change the guest visible interface.
>>>>
>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>
>>> I've looked at this in more depth now, and attempting to unify the
>>> pre-reg and mapping listeners like this can't work - they need to be
>>> listening on different address spaces:  mapping actions need to be
>>> listening on the PCI address space, whereas the pre-reg needs to be
>>> listening on address_space_memory.  For x86 - for now - those end up
>>> being the same thing, but on Power they're not.
>>>
>>> We do need to be clear about what differences are due to the presence
>>> of a guest IOMMU versus which are due to arch or underlying IOMMU
>>> type.  For now Power has a guest IOMMU and x86 doesn't, but that could
>>> well change in future: we could well implement the guest side IOMMU
>>> for x86 in future (or x86 could invent a paravirt IOMMU interface).
>>> On the other side, BenH's experimental powernv machine type could
>>> introduce Power machines without a guest side IOMMU (or at least an
>>> optional guest side IOMMU).
>>>
>>> The quick and dirty approach here is:
>>>     1. Leave the main listener as is
>>>     2. Add a new pre-reg notifier to the spapr iommu specific code,
>>>        which listens on address_space_memory, *not* the PCI space
>>>
>>> The more generally correct approach, which allows for more complex
>>> IOMMU arrangements and the possibility of new IOMMU types with pre-reg
>>> is:
>>>     1. Have the core implement both a mapping listener and a pre-reg
>>>        listener (optionally enabled by a per-iommu-type flag).
>>>        Basically the first one sees what *is* mapped, the second sees
>>>        what *could* be mapped.
>>>
>>>     2. As now, the mapping listener listens on PCI address space, if
>>>        RAM blocks are added, immediately map them into the host IOMMU,
>>>        if guest IOMMU blocks appear register a notifier which will
>>>        mirror guest IOMMU mappings to the host IOMMU (this is what we
>>>        do now).
>>>
>>>     3. The pre-reg listener also listens on the PCI address space.  RAM
>>>        blocks added are pre-registered immediately.
>>
>>
>> PCI address space listeners won't be notified about RAM blocks on sPAPR.
>
> Sure they will - if any RAM blocks were mapped directly into PCI
> address space, the listener would be notified.  It's just that no RAM
> blocks are directly mapped into PCI space, only partially mapped in
> via IOMMU blocks.

Right. No RAM blocks are mapped. So on *sPAPR* the PCI AS listener won't be
notified about *RAM*. But you say "they will". I am missing something here.
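
To make the point concrete, here is a toy model of what a listener sees
when it walks an address space. The types and names are invented for
this mail and are not QEMU's MemoryListener API; it only counts what
kind of sections get reported.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not QEMU's MemoryRegionSection. */
enum sketch_kind { SKETCH_RAM, SKETCH_IOMMU };

struct sketch_section {
    enum sketch_kind kind;
    unsigned long long offset;
    unsigned long long size;
};

struct sketch_counts {
    size_t ram;    /* sections a RAM callback would see */
    size_t iommu;  /* sections an IOMMU callback would see */
};

/* Walk every section of an address space, as a listener replay would. */
struct sketch_counts sketch_replay(const struct sketch_section *s, size_t n)
{
    struct sketch_counts c = { 0, 0 };

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (s[i].kind == SKETCH_RAM) {
            c.ram++;
        } else {
            c.iommu++;
        }
    }
    return c;
}
```

With only IOMMU sections in the PCI address space (the sPAPR case as I
understand it) the RAM count stays at zero; it becomes non-zero only on
a platform that also maps RAM directly, such as the "bypass" window case.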


> But the idea is this scheme could handle a platform that has both a
> "bypass" DMA window which maps directly onto a block of ram and an
> IOMMU controlled DMA window.  Or one which could have either setup
> depending on circumstances (which is probably true of BenH's "powernv"
> machine type).
>
>>> But, if guest
>>>        IOMMU blocks are added, instead of registering a guest-iommu
>>>        notifier,
>>
>> "guest-iommu notifier" is the one called via memory_region_notify_iommu()
>> from H_PUT_TCE? "Instead" implies dropping it, how can this work?
>
> Because the other listener - the mapping listener at (2) handles that
> part.  The pre-reg listener doesn't.


My bad, #2 included notifiers, right.


> But as noted in my other mail this whole scheme doesn't work without a
> way to discover an IOMMU region's target AS in advance, which doesn't
> currently exist.

We can add an AS to the IOMMU MR now, it is a few lines of code :)
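
Roughly what I have in mind, as a sketch with made-up types (these are
not the real MemoryRegion or AddressSpace structures, and the field and
function names are assumptions): the IOMMU region records its target
address space when it is created, so a listener can look it up in
advance.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not QEMU's AddressSpace or MemoryRegion. */
struct sketch_as {
    const char *name;
};

struct sketch_iommu_mr {
    const char *name;
    struct sketch_as *target_as;  /* where translated addresses land */
};

/* Record the target address space at init time. */
void sketch_iommu_mr_init(struct sketch_iommu_mr *mr, const char *name,
                          struct sketch_as *target_as)
{
    assert(mr != NULL && target_as != NULL);
    mr->name = name;
    mr->target_as = target_as;
}

/* Let a listener discover the target AS before any mapping exists. */
struct sketch_as *sketch_iommu_mr_target_as(const struct sketch_iommu_mr *mr)
{
    assert(mr != NULL);
    return mr->target_as;
}
```

A pre-reg listener could then call the accessor when an IOMMU block
appears and register itself on whatever AS it returns.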


>>> we register another listener on the *target* AS of the
>>>        guest IOMMU, same callbacks as this one.  In practice that
>>>        target AS will almost always resolve to address_space_memory,
>>>        but this can at least in theory handle crazy guest setups with
>>>        multiple layers of IOMMU.
>>>
>>>     4. Have to ensure that the pre-reg callbacks always happen before
>>>        the mapping calls.  For a system with an IOMMU backend which
>>>        requires pre-registration, but doesn't have a guest IOMMU, we
>>>        need to pre-reg, then host-iommu-map RAM blocks that appear in
>>>        PCI address space.
>
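
For the ordering requirement in #4, this is how I read it, as a toy
sketch only. All names are hypothetical and nothing below is VFIO or
QEMU code; it just records the order of operations when a RAM block
shows up directly in PCI address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are QEMU or VFIO types. */
enum sketch_op { SKETCH_PREREG, SKETCH_MAP };

struct sketch_event {
    enum sketch_op op;
    uint64_t addr;
    uint64_t size;
};

#define SKETCH_MAX_EVENTS 16

struct sketch_container {
    bool needs_prereg;  /* assumed per-IOMMU-type flag */
    struct sketch_event log[SKETCH_MAX_EVENTS];
    size_t nr_events;
};

/* Append one operation to the container's log. */
void sketch_record(struct sketch_container *c, enum sketch_op op,
                   uint64_t addr, uint64_t size)
{
    assert(c != NULL && c->nr_events < SKETCH_MAX_EVENTS);
    c->log[c->nr_events].op = op;
    c->log[c->nr_events].addr = addr;
    c->log[c->nr_events].size = size;
    c->nr_events++;
}

/* RAM appears directly in PCI space: pre-register first, then map. */
void sketch_ram_region_add(struct sketch_container *c,
                           uint64_t addr, uint64_t size)
{
    if (c->needs_prereg) {
        sketch_record(c, SKETCH_PREREG, addr, size);
    }
    sketch_record(c, SKETCH_MAP, addr, size);
}
```

A backend without the flag only ever sees the map step, so the existing
behaviour would not change there.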


-- 
Alexey
