From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Paolo Bonzini <pbonzini@redhat.com>,
David Gibson <david@gibson.dropbear.id.au>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>,
Alex Williamson <alex.williamson@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH qemu v7 06/14] spapr_iommu: Introduce "enabled" state for TCE table
Date: Wed, 27 May 2015 01:00:15 +1000
Message-ID: <55648A7F.9070100@ozlabs.ru>
In-Reply-To: <55648239.7070905@redhat.com>
On 05/27/2015 12:24 AM, Paolo Bonzini wrote:
>
>
> On 26/05/2015 16:17, Alexey Kardashevskiy wrote:
>> On 05/27/2015 12:03 AM, Paolo Bonzini wrote:
>>>
>>>
>>> On 26/05/2015 16:00, Alexey Kardashevskiy wrote:
>>>> On 05/26/2015 11:48 PM, Paolo Bonzini wrote:
>>>>>
>>>>>
>>>>> On 26/05/2015 15:42, Alexey Kardashevskiy wrote:
>>>>>>
>>>>>>
>>>>>> The next patch of this patchset changes:
>>>>>> spapr_tce_table_do_enable()
>>>>>>     memory_region_init_iommu(&iommu)
>>>>>>     memory_region_add_subregion(&root, &iommu)
>>>>>>
>>>>>> spapr_tce_table_disable()
>>>>>>     memory_region_del_subregion(&root, &iommu)
>>>>>>     object_unref(&iommu)
>>>>>>
>>>>>> These spapr_tce_xxx are called by request from the guest. &root is a
>>>>>> container and exists as long as sPAPRTCETable exists.
>>>>>>
>>>>>> Where do I get a leaking child property here?
>>>>>
>>>>> When you unref iommu and not unparent it. The next
>>>>> memory_region_init_iommu creates a second child property, and the first
>>>>> is gone.
>>>>
>>>> But when do I get this child property? In memory_region_add_subregion()?
>>>> And memory_region_del_subregion() does not do the opposite thing
>>>> (unparent)?
>>>
>>> In memory_region_init_iommu.
>>
>> Ah. So I need at least s/object_unref/object_unparent/ in my current
>> code, right?
>
> Yes, and then you hit the situation documented in docs/memory.txt.
Oh. ok.
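To make the leak discussed above concrete, here is a toy model of it. Everything in it (ToyOwner, ToyRegion, the toy_* functions) is a hypothetical stand-in and not QEMU's QOM or memory API; it only assumes, as Paolo describes, that the init step attaches a child property to the owner and that a bare unref does not detach it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not QEMU's QOM or memory API. */
typedef struct ToyOwner {
    int child_props;          /* number of child properties attached */
} ToyOwner;

typedef struct ToyRegion {
    ToyOwner *parent;         /* set while a child property points here */
    int refcount;
} ToyRegion;

/* Models the init step: registers the region as a child of the owner. */
static void toy_region_init(ToyRegion *r, ToyOwner *owner)
{
    r->parent = owner;
    r->refcount = 1;          /* the reference held via the child property */
    owner->child_props++;
}

/* Models a bare unref: drops a reference but leaves the property behind. */
static void toy_region_unref(ToyRegion *r)
{
    assert(r->refcount > 0);
    r->refcount--;
}

/* Models unparent: removes the child property, which drops its reference. */
static void toy_region_unparent(ToyRegion *r)
{
    if (r->parent != NULL) {
        r->parent->child_props--;
        r->parent = NULL;
        r->refcount--;
    }
}

/* Runs enable/disable cycles and returns the leftover property count. */
static int toy_cycle(int cycles, int use_unparent)
{
    ToyOwner owner = { 0 };
    ToyRegion region;

    for (int i = 0; i < cycles; i++) {
        toy_region_init(&region, &owner);
        if (use_unparent) {
            toy_region_unparent(&region);
        } else {
            toy_region_unref(&region);
        }
    }
    return owner.child_props;
}
```

In this model, cycling with the unref path leaves one stale property per cycle on the owner, while the unparent path leaves none.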
>>> Why do you need different regions? Why can't you have always the same
>>> IOMMU regions, and either:
>>
>> They may change a size.
>
> That's not a problem, there's memory_region_set_size for that.
It was not there when I started doing this DDW :) If so, I can keep the
existing structure and just set size to zero instead of
memory_region_del_subregion().
>> These are dynamic DMA windows, guest may remove
>> all and create randomly. Each region is backed by a separate TCE table
>> with different page size.
>
> Okay.
>
>>> 1) create/destroy an alias to that region
>>
>> How does this change things compared to iommus in regard to parenting?
>
> Aliases do not have the same restriction. But this doesn't help your
> case if you have separate TCE tables etc.
I need windows to appear and disappear on a bus dynamically, that's it. The
actual sPAPRTCETable objects always exist. Aliases will do the job as far
as I can tell.
>>> 2) change the behavior of the translation function, while keeping a
>>> single region?
>>
>> Have one sPAPRTCETable object with 0, 1 or 2 (and potentially more)
>> actual TCE tables? I can do that too but I thought subregions are just
>> natural for that.
>
> They may be. You may need more than one though.
I fail to see when :)
> What guest actions trigger the change? Is it a hypercall? If so, what
> hypercall is it so I can look at the documentation?
It is a bunch of RTAS calls which are highly classified in PAPR spec :)
Linux guests do this:
1. load a driver
2. driver calls set_dma_mask()
3. if mask < 64, usual old-style &dma_iommu_ops is used; exit
4. platform code calls enable_ddw()
5. enable_ddw() looks at PHB "ddw-applicable"
6. enable_ddw() calls ibm,query-pe-dma-window (returns page mask supported)
7. enable_ddw() calls ibm,create-pe-dma-window to create the actual window
with a specific size (the entire guest RAM in the case of Linux, but it might
be different for other OSes) and learns its bus address (RTAS returns it, the
guest does not choose it)
8. enable_ddw() calls H_PUT_TCE in a loop to map all guest RAM pages onto a
bus and does set_dma_ops(dev, &dma_direct_ops) so H_PUT_TCE is not called
again till guest reboot.
If any step in 5..8 fails, then &dma_iommu_ops is used.
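The decision logic in the steps above can be sketched roughly as follows. This is an illustration only and not the Linux kernel code: DdwProbe, DmaOps and choose_dma_ops() are hypothetical names, and each boolean simply stands for "that step succeeded".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not taken from the Linux kernel. */
typedef enum { DMA_OPS_IOMMU, DMA_OPS_DIRECT } DmaOps;

typedef struct {
    bool ddw_applicable;   /* step 5: PHB has "ddw-applicable" */
    bool query_ok;         /* step 6: ibm,query-pe-dma-window succeeded */
    bool create_ok;        /* step 7: ibm,create-pe-dma-window succeeded */
    bool map_all_ok;       /* step 8: H_PUT_TCE loop mapped all guest RAM */
} DdwProbe;

/* Returns which DMA ops the guest ends up with for a given DMA mask. */
static DmaOps choose_dma_ops(unsigned mask_bits, const DdwProbe *p)
{
    assert(p != NULL);

    /* Step 3: a mask narrower than 64 bits keeps the old-style IOMMU ops. */
    if (mask_bits < 64) {
        return DMA_OPS_IOMMU;
    }
    /* Steps 5..8: any failure falls back to the IOMMU ops. */
    if (!p->ddw_applicable || !p->query_ok || !p->create_ok || !p->map_all_ok) {
        return DMA_OPS_IOMMU;
    }
    /* All steps succeeded: guest RAM is mapped once, direct ops are used. */
    return DMA_OPS_DIRECT;
}
```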
The pseries platform expects the default DMA window (4K pages, <=2GB) to
exist. And there is an extra ibm,remove-pe-dma-window call to remove any
window (including default one) so a following ibm,create-pe-dma-window will
create a new window at zero offset on a bus (as big as the guest RAM and
page size bigger than 4K).
Aaaaand there is an extension - ibm,reset-pe-dma-window which should delete
all windows and create the default one (kernels before v3.10 or so used to
do this). The machine reset should do the same thing.
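The window lifecycle described above (default window, remove, create, reset) could be modelled along these lines. This is a toy sketch, not QEMU or PAPR code: ToyPhb, ToyWindow and the toy_* functions are hypothetical, the limit of two windows and the 1 GiB default size are assumptions picked for illustration (the text only says <=2GB), and slot 0 merely stands for "zero offset on the bus".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_WINDOWS 2      /* assumption: two window slots per PHB */

typedef struct {
    bool enabled;
    uint32_t page_shift;       /* 12 means 4K pages */
    uint64_t size;             /* window size in bytes */
} ToyWindow;

typedef struct {
    ToyWindow win[TOY_MAX_WINDOWS];
} ToyPhb;

/* Models ibm,reset-pe-dma-window and machine reset: drop every window,
 * then recreate the default one (4K pages; 1 GiB is an assumed size). */
static void toy_reset(ToyPhb *phb)
{
    assert(phb != NULL);
    for (int i = 0; i < TOY_MAX_WINDOWS; i++) {
        phb->win[i] = (ToyWindow){ 0 };
    }
    phb->win[0] = (ToyWindow){ .enabled = true, .page_shift = 12,
                               .size = 1ULL << 30 };
}

/* Models ibm,create-pe-dma-window: takes the first free slot.
 * Returns the slot index, or -1 if no slot is free. */
static int toy_create(ToyPhb *phb, uint32_t page_shift, uint64_t size)
{
    for (int i = 0; i < TOY_MAX_WINDOWS; i++) {
        if (!phb->win[i].enabled) {
            phb->win[i] = (ToyWindow){ .enabled = true,
                                       .page_shift = page_shift,
                                       .size = size };
            return i;
        }
    }
    return -1;
}

/* Models ibm,remove-pe-dma-window: any window, including the default
 * one, may be removed. Returns false if the slot holds no window. */
static bool toy_remove(ToyPhb *phb, int slot)
{
    if (slot < 0 || slot >= TOY_MAX_WINDOWS || !phb->win[slot].enabled) {
        return false;
    }
    phb->win[slot] = (ToyWindow){ 0 };
    return true;
}
```

In this model, removing the default window and then creating a new one lands the new window in slot 0, which mirrors the "new window at zero offset" case; a reset always returns to a single default window.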
>> I even wanted to create sPAPRTCETable's dynamically but
>> this would break migration (because we cannot start QEMU with an
>> additional sPAPRTCETable if it exists in the source which is not always
>> the case).
>
> Creating sPAPRTCETables dynamically would be a fix as well. You _can_
> unparent the sPAPRTCETable whenever you want. But it's not necessarily
> the right solution.
>
> Why does it break migration? There is only one migration handler for
> all htabs, I think. Or is this a different thing than the htabs?
sPAPRTCETable stores the actual table and if I want it to migrate, the
destination QEMU must have the object created-and-vmstate_register'ated.
But the table (and class) may be absent or present on the source side so I
need to start the destination with or without -device sPAPRTCETable, and if
I need to create this object, I need to make it a child of a PHB and last
time I checked - there is no command line interface for linking children.
>
> The sPAPRTCETable would be created in its parent device's post_load handler.
>
>> Ok. I'll redo this thing again and try using less QOM objects...
>
> Wait, I haven't understood the problem yet.
Oookay :)
But I started thinking that always having 2 sPAPRTCETable objects (some may
be "disabled") is not better than a single sPAPRTCETable with multiple TCE
tables...
--
Alexey