From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Alex Williamson <alex.williamson@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH qemu v7 13/14] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)
Date: Thu, 18 Jun 2015 21:35:44 +1000
Message-ID: <5582AD10.3070400@ozlabs.ru>
In-Reply-To: <20150505124940.GS14090@voom.redhat.com>
On 05/05/2015 10:49 PM, David Gibson wrote:
> On Sat, Apr 25, 2015 at 10:24:43PM +1000, Alexey Kardashevskiy wrote:
>> This adds support for the Dynamic DMA Windows (DDW) option defined by
>> the SPAPR specification, which allows additional DMA window(s).
>>
>> This implements DDW for emulated and VFIO devices. As all TCE root regions
>> are mapped at 0 and 64bit long (and actual tables are child regions),
>> this replaces memory_region_add_subregion() with _overlap() to make
>> QEMU memory API happy.
>>
>> This reserves RTAS token numbers for DDW calls.
>>
>> This implements helpers to interact with VFIO kernel interface.
>>
>> This changes the TCE table migration descriptor to support dynamic
>> tables as from now on, PHB will create as many stub TCE table objects
>> as PHB can possibly support but not all of them might be initialized at
>> the time of migration because DDW might or might not be requested by
>> the guest.
>>
>> The "ddw" property is enabled by default on a PHB but for compatibility
>> the pseries-2.3 machine and older disable it.
>>
>> This implements DDW for VFIO. The host kernel support is required.
>> This adds a "levels" property to PHB to control the number of levels
>> in the actual TCE table allocated by the host kernel, 0 is the default
>> value to tell QEMU to calculate the correct value. Current hardware
>> supports up to 5 levels.
>>
>> Existing Linux guests try to create one additional huge DMA window
>> with 64K or 16MB pages and map the entire guest RAM to it. If this succeeds,
>> the guest switches to dma_direct_ops and never calls TCE hypercalls
>> (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM
>> and not waste time on map/unmap later.
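To put rough numbers on the "huge window" described above, here is a minimal sketch of the sizing arithmetic. It assumes the window size is rounded up to a power of two and that each TCE is an 8-byte entry; the helper names (pow2_roundup, ddw_nb_tces, ddw_table_bytes) are made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest power of two >= x; returns 0 if x is 0 or too large. */
static uint64_t pow2_roundup(uint64_t x)
{
    uint64_t p = 1;

    if (x == 0 || x > (1ULL << 63)) {
        return 0;
    }
    while (p < x) {
        p <<= 1;
    }
    return p;
}

/* Number of TCEs needed to map ram_size bytes with pages of (1 << page_shift) bytes. */
static uint64_t ddw_nb_tces(uint64_t ram_size, unsigned page_shift)
{
    /* Assumption: the window covers RAM rounded up to a power of two. */
    return pow2_roundup(ram_size) >> page_shift;
}

/* Size in bytes of a flat, single-level table, assuming 8 bytes per TCE. */
static uint64_t ddw_table_bytes(uint64_t ram_size, unsigned page_shift)
{
    return ddw_nb_tces(ram_size, page_shift) * sizeof(uint64_t);
}
```

With 4 GiB of guest RAM this gives 65536 entries for 64K pages (page_shift 16) and 256 entries for 16MB pages (page_shift 24), which is why the larger page size keeps the table small.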
>>
>> This adds 4 RTAS handlers:
>> * ibm,query-pe-dma-window
>> * ibm,create-pe-dma-window
>> * ibm,remove-pe-dma-window
>> * ibm,reset-pe-dma-window
>> These are registered from type_init() callback.
>>
>> These RTAS handlers are implemented in a separate file to avoid polluting
>> spapr_iommu.c with PCI.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
I saw this and decided there were no more comments, but I was wrong :)
>> ---
>> Changes:
>> v6:
>> * rework as there is no more special device for VFIO PHB
>>
>> v5:
>> * total rework
>> * enabled for machines >2.3
>> * fixed migration
>> * merged rtas handlers here
>>
>> v4:
>> * reset handler is back in generalized form
>>
>> v3:
>> * removed reset
>> * windows_num is now 1 or bigger rather than 0-based value and it is only
>> changed in PHB code, not in RTAS
>> * added page mask check in create()
>> * added SPAPR_PCI_DDW_MAX_WINDOWS to track how many windows are already
>> created
>>
>> v2:
>> * tested on hacked emulated E1000
>> * implemented DDW reset on the PHB reset
>> * spapr_pci_ddw_remove/spapr_pci_ddw_reset are public for reuse by VFIO
>> ---
>> hw/ppc/Makefile.objs | 3 +
>> hw/ppc/spapr.c | 10 +-
>> hw/ppc/spapr_iommu.c | 35 +++++-
>> hw/ppc/spapr_pci.c | 66 ++++++++--
>> hw/ppc/spapr_pci_vfio.c | 80 ++++++++++++
>> hw/ppc/spapr_rtas_ddw.c | 300 ++++++++++++++++++++++++++++++++++++++++++++
>> include/hw/pci-host/spapr.h | 21 ++++
>> include/hw/ppc/spapr.h | 17 ++-
>> trace-events | 4 +
>> 9 files changed, 521 insertions(+), 15 deletions(-)
>> create mode 100644 hw/ppc/spapr_rtas_ddw.c
>>
>> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
>> index 437955d..c6b344f 100644
>> --- a/hw/ppc/Makefile.objs
>> +++ b/hw/ppc/Makefile.objs
>> @@ -7,6 +7,9 @@ obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o
>> ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
>> obj-y += spapr_pci_vfio.o
>> endif
>> +ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES), yy)
>> +obj-y += spapr_rtas_ddw.o
>> +endif
>> # PowerPC 4xx boards
>> obj-y += ppc405_boards.o ppc4xx_devs.o ppc405_uc.o ppc440_bamboo.o
>> obj-y += ppc4xx_pci.o
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index b28209f..fd7fdb3 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1801,7 +1801,15 @@ static const TypeInfo spapr_machine_info = {
>> },
>> };
>>
>> +#define SPAPR_COMPAT_2_3 \
>> + {\
>> + .driver = TYPE_SPAPR_PCI_HOST_BRIDGE,\
>> + .property = "ddw",\
>> + .value = stringify(off),\
>> + }
>> +
>> #define SPAPR_COMPAT_2_2 \
>> + SPAPR_COMPAT_2_3, \
>> {\
>> .driver = TYPE_SPAPR_PCI_HOST_BRIDGE,\
>> .property = "mem_win_size",\
>> @@ -1853,7 +1861,7 @@ static const TypeInfo spapr_machine_2_2_info = {
>> static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
>> {
>> static GlobalProperty compat_props[] = {
>> - SPAPR_COMPAT_2_2,
>> + SPAPR_COMPAT_2_3,
>> { /* end of list */ }
>> };
>> MachineClass *mc = MACHINE_CLASS(oc);
>> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
>> index 245534f..df4c72d 100644
>> --- a/hw/ppc/spapr_iommu.c
>> +++ b/hw/ppc/spapr_iommu.c
>> @@ -90,6 +90,15 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr,
>> return ret;
>> }
>>
>> +static void spapr_tce_table_pre_save(void *opaque)
>> +{
>> + sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
>> +
>> + tcet->migtable = tcet->table;
>> +}
>> +
>> +static void spapr_tce_table_do_enable(sPAPRTCETable *tcet);
>> +
>> static int spapr_tce_table_post_load(void *opaque, int version_id)
>> {
>> sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
>> @@ -98,22 +107,42 @@ static int spapr_tce_table_post_load(void *opaque, int version_id)
>> spapr_vio_set_bypass(tcet->vdev, tcet->bypass);
>> }
>>
>> + if (!tcet->migtable) {
>
> What's the case where migtable will be NULL? IIUC an old->new
> migration will result in the data saved for "table" being loaded into
> "migtable".
>
> So "migtable" should only be NULL, when tce->enabled is also false?
That seems to be true, and this is just an extra precaution. Shall I remove it?
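For reference, one possible shape if the check is kept as an explicit consistency test rather than a silent precaution. This is only a sketch under assumptions: TCETableSketch is a hypothetical stand-in with just the three fields discussed here, not the real sPAPRTCETable, and the function does not reproduce what the patch does after the quoted "if".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields under discussion. */
typedef struct {
    bool enabled;        /* window was enabled on the source */
    uint64_t *table;     /* live TCE table */
    uint64_t *migtable;  /* table as loaded from the migration stream */
} TCETableSketch;

/*
 * Sketch of a post_load step: a disabled table has nothing to adopt,
 * an enabled table must have arrived with a migtable.
 * Returns 0 on success, -1 if the incoming state is inconsistent.
 */
static int tce_table_post_load_sketch(TCETableSketch *t)
{
    if (!t->enabled) {
        return 0;
    }
    if (!t->migtable) {
        /* enabled without a migrated table: should not happen */
        return -1;
    }
    /* Adopt the migrated table as the live one. */
    t->table = t->migtable;
    t->migtable = NULL;
    return 0;
}
```

Failing the load in the "enabled but no migtable" case would make a broken stream visible instead of hiding it, at the cost of a check that should never trigger.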
--
Alexey