From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60868) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YwtxB-0004n6-NG for qemu-devel@nongnu.org; Mon, 25 May 2015 11:06:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Ywtx7-0004KI-Jt for qemu-devel@nongnu.org; Mon, 25 May 2015 11:06:09 -0400 Received: from mail-pa0-f51.google.com ([209.85.220.51]:35949) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Ywtx7-0004Jv-5y for qemu-devel@nongnu.org; Mon, 25 May 2015 11:06:05 -0400 Received: by paza2 with SMTP id a2so63612059paz.3 for ; Mon, 25 May 2015 08:06:03 -0700 (PDT) Message-ID: <55633A54.8080807@ozlabs.ru> Date: Tue, 26 May 2015 01:05:56 +1000 From: Alexey Kardashevskiy MIME-Version: 1.0 References: <1429964684-23872-1-git-send-email-aik@ozlabs.ru> <1429964684-23872-7-git-send-email-aik@ozlabs.ru> In-Reply-To: <1429964684-23872-7-git-send-email-aik@ozlabs.ru> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH qemu v7 06/14] spapr_iommu: Introduce "enabled" state for TCE table List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: Alexander Graf , Michael Roth , qemu-devel@nongnu.org, Alex Williamson , qemu-ppc@nongnu.org, David Gibson Hi Paolo, I have had a conversation with Mike and it turns out I am not allowed to create/remove memory regions dynamically (docs/memory.txt:101); otherwise "destroying regions during reset causes assertion in RCU thread during PHB/IOMMU unplug/unparent". Is it because patch just missing some unref()/unparent() call or it is totally wrong and I have to implement subregions (on a PCI bus address space) myself if I want dynamic DMA windows? Thanks! On 04/25/2015 10:24 PM, Alexey Kardashevskiy wrote: > Currently TCE tables are created once at start and their size never > changes. We are going to change that by introducing a Dynamic DMA windows > support where DMA configuration may change during the guest execution. > > This changes spapr_tce_new_table() to create an empty stub object. Only > LIOBN is assigned by the time of creation. It still will be called once > at the owner object (VIO or PHB) creation. > > This introduces an "enabled" state for TCE table objects with two > helper functions - spapr_tce_table_enable()/spapr_tce_table_disable(). > spapr_tce_table_enable() receives TCE table parameters and allocates > a guest view of the TCE table (in the user space or KVM). > spapr_tce_table_disable() disposes the table. > > Follow up patches will disable+enable tables on reset (system reset > or DDW reset). > > No visible change in behaviour is expected except the actual table > will be reallocated every reset. We might optimize this later. > > The other way to implement this would be dynamically create/remove > the TCE table QOM objects but this would make migration impossible > as migration expects all QOM objects to exist at the receiver > so we have to have TCE table objects created when migration begins. > > spapr_tce_table_do_enable() is separated from from spapr_tce_table_enable() > as later it will be called at the sPAPRTCETable post-migration stage when > it has all the properties set after the migration. > > Signed-off-by: Alexey Kardashevskiy > --- > Changes: > v7: > * s'tmp[64]'tmp[32]' as we need less than 64bytes and more than 16 bytes > and 32 is the closest power-of-two (just looks nices to have power-of-two > values) > * updated commit log about having spapr_tce_table_do_enable() splitted > from spapr_tce_table_enable() > > v6: > * got rid of set_props() > --- > hw/ppc/spapr_iommu.c | 104 +++++++++++++++++++++++++++++++----------------- > hw/ppc/spapr_pci.c | 16 +++++--- > hw/ppc/spapr_pci_vfio.c | 10 ++--- > hw/ppc/spapr_vio.c | 9 ++--- > include/hw/ppc/spapr.h | 11 ++--- > 5 files changed, 93 insertions(+), 57 deletions(-) > > diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c > index a14cdc4..a3f2b83 100644 > --- a/hw/ppc/spapr_iommu.c > +++ b/hw/ppc/spapr_iommu.c > @@ -126,8 +126,47 @@ static MemoryRegionIOMMUOps spapr_iommu_ops = { > static int spapr_tce_table_realize(DeviceState *dev) > { > sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev); > + > + QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list); > + > + vmstate_register(DEVICE(tcet), tcet->liobn, &vmstate_spapr_tce_table, > + tcet); > + > + return 0; > +} > + > +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn) > +{ > + sPAPRTCETable *tcet; > + char tmp[32]; > + > + if (spapr_tce_find_by_liobn(liobn)) { > + fprintf(stderr, "Attempted to create TCE table with duplicate" > + " LIOBN 0x%x\n", liobn); > + return NULL; > + } > + > + tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE)); > + tcet->liobn = liobn; > + > + snprintf(tmp, sizeof(tmp), "tce-table-%x", liobn); > + object_property_add_child(OBJECT(owner), tmp, OBJECT(tcet), NULL); > + > + object_property_set_bool(OBJECT(tcet), true, "realized", NULL); > + > + trace_spapr_iommu_new_table(tcet->liobn, tcet, tcet->table, tcet->fd); > + > + return tcet; > +} > + > +static void spapr_tce_table_do_enable(sPAPRTCETable *tcet) > +{ > uint64_t window_size = (uint64_t)tcet->nb_table << tcet->page_shift; > > + if (!tcet->nb_table) { > + return; > + } > + > if (kvm_enabled() && !(window_size >> 32)) { > tcet->table = kvmppc_create_spapr_tce(tcet->liobn, > window_size, > @@ -140,65 +179,56 @@ static int spapr_tce_table_realize(DeviceState *dev) > tcet->table = g_malloc0(table_size); > } > > - trace_spapr_iommu_new_table(tcet->liobn, tcet, tcet->table, tcet->fd); > - > - memory_region_init_iommu(&tcet->iommu, OBJECT(dev), &spapr_iommu_ops, > + memory_region_init_iommu(&tcet->iommu, OBJECT(tcet), &spapr_iommu_ops, > "iommu-spapr", > (uint64_t)tcet->nb_table << tcet->page_shift); > > - QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list); > - > - vmstate_register(DEVICE(tcet), tcet->liobn, &vmstate_spapr_tce_table, > - tcet); > - > - return 0; > + tcet->enabled = true; > } > > -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, > - uint64_t bus_offset, > - uint32_t page_shift, > - uint32_t nb_table, > - bool vfio_accel) > +void spapr_tce_table_enable(sPAPRTCETable *tcet, > + uint64_t bus_offset, uint32_t page_shift, > + uint32_t nb_table, bool vfio_accel) > { > - sPAPRTCETable *tcet; > - char tmp[64]; > - > - if (spapr_tce_find_by_liobn(liobn)) { > - fprintf(stderr, "Attempted to create TCE table with duplicate" > - " LIOBN 0x%x\n", liobn); > - return NULL; > - } > - > - if (!nb_table) { > - return NULL; > + if (tcet->enabled) { > + return; > } > > - tcet = SPAPR_TCE_TABLE(object_new(TYPE_SPAPR_TCE_TABLE)); > - tcet->liobn = liobn; > tcet->bus_offset = bus_offset; > tcet->page_shift = page_shift; > tcet->nb_table = nb_table; > tcet->vfio_accel = vfio_accel; > > - snprintf(tmp, sizeof(tmp), "tce-table-%x", liobn); > - object_property_add_child(OBJECT(owner), tmp, OBJECT(tcet), NULL); > - > - object_property_set_bool(OBJECT(tcet), true, "realized", NULL); > - > - return tcet; > + spapr_tce_table_do_enable(tcet); > } > > -static void spapr_tce_table_unrealize(DeviceState *dev, Error **errp) > +void spapr_tce_table_disable(sPAPRTCETable *tcet) > { > - sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev); > - > - QLIST_REMOVE(tcet, list); > + if (!tcet->enabled) { > + return; > + } > > if (!kvm_enabled() || > (kvmppc_remove_spapr_tce(tcet->table, tcet->fd, > tcet->nb_table) != 0)) { > + tcet->fd = -1; > g_free(tcet->table); > } > + tcet->table = NULL; > + tcet->enabled = false; > + tcet->bus_offset = 0; > + tcet->page_shift = 0; > + tcet->nb_table = 0; > + tcet->vfio_accel = false; > +} > + > +static void spapr_tce_table_unrealize(DeviceState *dev, Error **errp) > +{ > + sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev); > + > + QLIST_REMOVE(tcet, list); > + > + spapr_tce_table_disable(tcet); > } > > MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet) > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 8c0d2eb..c3410b8 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -881,6 +881,12 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) > sphb->lsi_table[i].irq = irq; > } > > + tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn); > + if (!tcet) { > + error_setg(errp, "failed to create TCE table"); > + return; > + } > + > info->dma_capabilities_update(sphb); > info->dma_init_window(sphb, sphb->dma_liobn, SPAPR_TCE_PAGE_SHIFT, > sphb->dma32_window_size); > @@ -908,13 +914,13 @@ static int spapr_phb_dma_init_window(sPAPRPHBState *sphb, > uint64_t window_size) > { > uint64_t bus_offset = sphb->dma32_window_start; > - sPAPRTCETable *tcet; > + sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); > > - tcet = spapr_tce_new_table(DEVICE(sphb), liobn, bus_offset, page_shift, > - window_size >> page_shift, > - false); > + spapr_tce_table_enable(tcet, bus_offset, page_shift, > + window_size >> page_shift, > + false); > > - return tcet ? 0 : -1; > + return 0; > } > > static int spapr_phb_children_reset(Object *child, void *opaque) > diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c > index f1dd28c..a5b97d0 100644 > --- a/hw/ppc/spapr_pci_vfio.c > +++ b/hw/ppc/spapr_pci_vfio.c > @@ -49,13 +49,13 @@ static int spapr_phb_vfio_dma_init_window(sPAPRPHBState *sphb, > uint64_t window_size) > { > uint64_t bus_offset = sphb->dma32_window_start; > - sPAPRTCETable *tcet; > + sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); > > - tcet = spapr_tce_new_table(DEVICE(sphb), liobn, bus_offset, page_shift, > - window_size >> page_shift, > - true); > + spapr_tce_table_enable(tcet, bus_offset, page_shift, > + window_size >> page_shift, > + true); > > - return tcet ? 0 : -1; > + return 0; > } > > static void spapr_phb_vfio_reset(DeviceState *qdev) > diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c > index 174033d..3e28835 100644 > --- a/hw/ppc/spapr_vio.c > +++ b/hw/ppc/spapr_vio.c > @@ -479,11 +479,10 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp) > memory_region_add_subregion_overlap(&dev->mrroot, 0, &dev->mrbypass, 1); > address_space_init(&dev->as, &dev->mrroot, qdev->id); > > - dev->tcet = spapr_tce_new_table(qdev, liobn, > - 0, > - SPAPR_TCE_PAGE_SHIFT, > - pc->rtce_window_size >> > - SPAPR_TCE_PAGE_SHIFT, false); > + dev->tcet = spapr_tce_new_table(qdev, liobn); > + spapr_tce_table_enable(dev->tcet, 0, SPAPR_TCE_PAGE_SHIFT, > + pc->rtce_window_size >> SPAPR_TCE_PAGE_SHIFT, > + false); > dev->tcet->vdev = dev; > memory_region_add_subregion_overlap(&dev->mrroot, 0, > spapr_tce_get_iommu(dev->tcet), 2); > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h > index 7d9ab9d..074d837 100644 > --- a/include/hw/ppc/spapr.h > +++ b/include/hw/ppc/spapr.h > @@ -498,6 +498,7 @@ typedef struct sPAPRTCETable sPAPRTCETable; > > struct sPAPRTCETable { > DeviceState parent; > + bool enabled; > uint32_t liobn; > uint32_t nb_table; > uint64_t bus_offset; > @@ -515,11 +516,11 @@ sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn); > void spapr_events_init(sPAPREnvironment *spapr); > void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq); > int spapr_h_cas_compose_response(target_ulong addr, target_ulong size); > -sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn, > - uint64_t bus_offset, > - uint32_t page_shift, > - uint32_t nb_table, > - bool vfio_accel); > +sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn); > +void spapr_tce_table_enable(sPAPRTCETable *tcet, > + uint64_t bus_offset, uint32_t page_shift, > + uint32_t nb_table, bool vfio_accel); > +void spapr_tce_table_disable(sPAPRTCETable *tcet); > MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet); > int spapr_dma_dt(void *fdt, int node_off, const char *propname, > uint32_t liobn, uint64_t window, uint32_t size); > -- Alexey