From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:35872)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WsRP5-00017M-Un for qemu-devel@nongnu.org;
	Thu, 05 Jun 2014 02:44:06 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1WsROt-0001iW-2e for qemu-devel@nongnu.org;
	Thu, 05 Jun 2014 02:43:59 -0400
Received: from mail-pd0-f174.google.com ([209.85.192.174]:35161)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WsROs-0001hM-Po for qemu-devel@nongnu.org;
	Thu, 05 Jun 2014 02:43:46 -0400
Received: by mail-pd0-f174.google.com with SMTP id r10so654046pdi.19
	for ; Wed, 04 Jun 2014 23:43:45 -0700 (PDT)
Message-ID: <5390119D.8040201@ozlabs.ru>
Date: Thu, 05 Jun 2014 16:43:41 +1000
From: Alexey Kardashevskiy
MIME-Version: 1.0
References: <1401947401-21329-1-git-send-email-aik@ozlabs.ru>
	<1401947401-21329-2-git-send-email-aik@ozlabs.ru>
In-Reply-To: <1401947401-21329-2-git-send-email-aik@ozlabs.ru>
Content-Type: text/plain; charset=KOI8-R
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v7 1/4] spapr_iommu: Make in-kernel TCE table optional
To: qemu-devel@nongnu.org
Cc: Alex Williamson, qemu-ppc@nongnu.org, Alexander Graf, Gavin Shan

On 06/05/2014 03:49 PM, Alexey Kardashevskiy wrote:
> POWER KVM supports a KVM_CAP_SPAPR_TCE capability which allows allocating
> TCE tables in host kernel memory and handling H_PUT_TCE requests
> targeted at a specific LIOBN (logical bus number) right in the host without
> switching to QEMU. At the moment this is used for emulated devices only
> and the handler only puts the TCE into the table. If the in-kernel H_PUT_TCE
> handler finds a LIOBN and the corresponding table, it will put a TCE into
> the table and complete hypercall execution. User space will not be
> notified.
>
> Upcoming VFIO support is going to use the same sPAPRTCETable device class
> so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
> tables for VFIO are going to be allocated in the host as well.
> However, VFIO operates with real IOMMU tables, and simply copying
> a TCE to the real hardware TCE table will not work as guest physical
> to host physical address translation is required.
>
> So until the host kernel gets VFIO support for H_PUT_TCE, we had better
> not register VFIO's TCE tables in the host.
>
> This adds a bool @kvm_accel flag to the sPAPRTCETable device telling
> whether sPAPRTCETable should try allocating the TCE table in the host
> kernel. When the flag is false, the table is created in QEMU instead.
>
> This adds a kvm_accel parameter to spapr_tce_new_table() to let users
> choose whether to use acceleration or not. At the moment it is enabled
> for VIO and emulated PCI. Upcoming VFIO support will set it to false.
>
> Signed-off-by: Alexey Kardashevskiy
> ---
>
> This is a workaround but it lets me have one IOMMU device for VIO, emulated
> PCI and VFIO which is a good thing.
>
> The other way around would be a new KVM_CAP_SPAPR_TCE_VFIO capability but
> this needs a kernel update.

Never mind, I'll make it a capability. I'll post the capability reservation
patch separately.

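For anyone skimming the thread, the gating described in the commit message
boils down to roughly the following. This is a standalone sketch, not the
QEMU code: struct tce_table, sketch_kvm_enabled(), sketch_kernel_alloc() and
sketch_realize() are hypothetical stand-ins, and the user-space fallback is
my assumption about what happens when the kernel path is skipped or fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for sPAPRTCETable; only the fields the sketch needs. */
struct tce_table {
    uint32_t nb_table;   /* number of TCE entries */
    uint64_t *table;     /* backing storage for the entries */
    bool kvm_accel;      /* may the host kernel own the table? */
    bool in_kernel;      /* set when the kernel path was actually taken */
};

/* Hypothetical stub for kvm_enabled(): pretend KVM is always available. */
static bool sketch_kvm_enabled(void)
{
    return true;
}

/* Hypothetical stub for the in-kernel allocation; plain calloc() here. */
static uint64_t *sketch_kernel_alloc(uint32_t nb_table)
{
    return calloc(nb_table, sizeof(uint64_t));
}

/* Returns 0 on success, -1 if no table could be allocated. */
static int sketch_realize(struct tce_table *t)
{
    assert(t != NULL);
    t->table = NULL;
    t->in_kernel = false;

    /* Only try the kernel when the owner asked for acceleration. */
    if (t->kvm_accel && sketch_kvm_enabled()) {
        t->table = sketch_kernel_alloc(t->nb_table);
        t->in_kernel = (t->table != NULL);
    }

    /* Assumed fallback: otherwise keep the table in user space. */
    if (!t->table) {
        t->table = calloc(t->nb_table, sizeof(uint64_t));
    }
    return t->table ? 0 : -1;
}
```

A caller that sets kvm_accel = true (the VIO and emulated PCI case) ends up
with in_kernel set, while kvm_accel = false (the planned VFIO case) always
gets a user-space table even though KVM is enabled.
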
> ---
>  hw/ppc/spapr_iommu.c   | 6 ++++--
>  hw/ppc/spapr_pci.c     | 2 +-
>  hw/ppc/spapr_vio.c     | 2 +-
>  include/hw/ppc/spapr.h | 4 +++-
>  4 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index 3b6e373..bfd3701 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -115,7 +115,7 @@ static int spapr_tce_table_realize(DeviceState *dev)
>  {
>      sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
>
> -    if (kvm_enabled()) {
> +    if (tcet->kvm_accel && kvm_enabled()) {
>          tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
>                                                tcet->nb_table <<
>                                                tcet->page_shift,
> @@ -143,7 +143,8 @@ static int spapr_tce_table_realize(DeviceState *dev)
>  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
>                                     uint64_t bus_offset,
>                                     uint32_t page_shift,
> -                                   uint32_t nb_table)
> +                                   uint32_t nb_table,
> +                                   bool kvm_accel)
>  {
>      sPAPRTCETable *tcet;
>
> @@ -162,6 +163,7 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
>      tcet->bus_offset = bus_offset;
>      tcet->page_shift = page_shift;
>      tcet->nb_table = nb_table;
> +    tcet->kvm_accel = kvm_accel;
>
>      object_property_add_child(OBJECT(owner), "tce-table", OBJECT(tcet), NULL);
>
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index ddfd8bb..6021f35 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -658,7 +658,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp)
>      tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn,
>                                 0,
>                                 SPAPR_TCE_PAGE_SHIFT,
> -                               0x40000000 >> SPAPR_TCE_PAGE_SHIFT);
> +                               0x40000000 >> SPAPR_TCE_PAGE_SHIFT, true);
>      if (!tcet) {
>          error_setg(errp, "Unable to create TCE table for %s",
>                     sphb->dtbusname);
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 48b0125..16385e4 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -460,7 +460,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
>                                         0,
>                                         SPAPR_TCE_PAGE_SHIFT,
>                                         pc->rtce_window_size >>
> -                                       SPAPR_TCE_PAGE_SHIFT);
> +                                       SPAPR_TCE_PAGE_SHIFT, true);
>          address_space_init(&dev->as, spapr_tce_get_iommu(dev->tcet), qdev->id);
>      }
>
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 4ffb903..7db34ff 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -402,6 +402,7 @@ struct sPAPRTCETable {
>      uint32_t page_shift;
>      uint64_t *table;
>      bool bypass;
> +    bool kvm_accel;
>      int fd;
>      MemoryRegion iommu;
>      QLIST_ENTRY(sPAPRTCETable) list;
> @@ -413,7 +414,8 @@ int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
>  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
>                                     uint64_t bus_offset,
>                                     uint32_t page_shift,
> -                                   uint32_t nb_table);
> +                                   uint32_t nb_table,
> +                                   bool kvm_accel);
>  MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet);
>  void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass);
>  int spapr_dma_dt(void *fdt, int node_off, const char *propname,
>


--
Alexey