From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40078) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YJDhL-0000bK-RI for qemu-devel@nongnu.org; Wed, 04 Feb 2015 23:05:49 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YJDh7-0001eD-7I for qemu-devel@nongnu.org; Wed, 04 Feb 2015 23:05:47 -0500 Date: Thu, 5 Feb 2015 14:51:44 +1100 From: David Gibson Message-ID: <20150205035144.GK25675@voom.fritz.box> References: <1422523650-2888-1-git-send-email-aik@ozlabs.ru> <1422523650-2888-12-git-send-email-aik@ozlabs.ru> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="9RxwyT9MtfFuvYYZ" Content-Disposition: inline In-Reply-To: <1422523650-2888-12-git-send-email-aik@ozlabs.ru> Subject: Re: [Qemu-devel] [PATCH v4 11/18] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) To: Alexey Kardashevskiy Cc: Alex Williamson , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, Alexander Graf --9RxwyT9MtfFuvYYZ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jan 29, 2015 at 08:27:23PM +1100, Alexey Kardashevskiy wrote: > This implements DDW for emulated and VFIO PHB. >=20 > This removes all DMA windows on reset and creates the default window, > same is done on the "ibm,reset-pe-dma-window" call. > This converts sPAPRPHBClass::finish_realize to sPAPRPHBClass::ddw_reset > and others. >=20 > The "ddw" property is enabled by default on a PHB but for compatibility > pseries-2.1 machine disables it. Now that we're past the 2.2 release, this should change to only be enabled for 2.3+, yes?
> Signed-off-by: Alexey Kardashevskiy > --- > Changes: > v4: > * reset handler is back in generalized form >=20 > v3: > * removed reset > * windows_num is now 1 or bigger rather than 0-based value and it is only > changed in PHB code, not in RTAS > * added page mask check in create() > * added SPAPR_PCI_DDW_MAX_WINDOWS to track how many windows are already > created >=20 > v2: > * tested on hacked emulated E1000 > * implemented DDW reset on the PHB reset > * spapr_pci_ddw_remove/spapr_pci_ddw_reset are public for reuse by VFIO >=20 > spapr_pci_vfio: Enable DDW >=20 > This implements DDW for VFIO. Host kernel support is required for this. >=20 > After this patch DDW will be enabled on all machines but pseries-2.1. >=20 > Signed-off-by: Alexey Kardashevskiy > --- > Changes: > v2: > * remove()/reset() callbacks use spapr_pci's ones > --- > hw/ppc/spapr_pci.c | 160 +++++++++++++++++++++++++++++++++++---= ------ > hw/ppc/spapr_pci_vfio.c | 98 +++++++++++++++++---------- > include/hw/pci-host/spapr.h | 15 ++++- > 3 files changed, 203 insertions(+), 70 deletions(-) >=20 > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 6bd00e8..3ec03be 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -469,6 +469,126 @@ static const MemoryRegionOps spapr_msi_ops =3D { > .endianness =3D DEVICE_LITTLE_ENDIAN > }; > =20 > +static int spapr_phb_get_win_num_cb(Object *child, void *opaque) > +{ > + if (object_dynamic_cast(child, TYPE_SPAPR_TCE_TABLE)) { > + ++*(unsigned *)opaque; > + } > + return 0; > +} > + > +unsigned spapr_phb_get_win_num(sPAPRPHBState *sphb) > +{ > + unsigned ret =3D 0; > + > + object_child_foreach(OBJECT(sphb), spapr_phb_get_win_num_cb, &ret); > + > + return ret; > +} > + > +/* > + * Dynamic DMA windows > + */ > +static int spapr_pci_ddw_query(sPAPRPHBState *sphb, > + uint32_t *windows_supported, > + uint32_t *page_size_mask, > + uint32_t *dma32_window_size, > + uint64_t *dma64_window_size) > +{ > + *windows_supported =3D 
SPAPR_PCI_DDW_MAX_WINDOWS; > + *page_size_mask =3D DDW_PGSIZE_64K | DDW_PGSIZE_16M; > + *dma32_window_size =3D SPAPR_PCI_TCE32_WIN_SIZE; > + *dma64_window_size =3D ram_size; > + > + return 0; > +} > + > +static int spapr_pci_ddw_create(sPAPRPHBState *sphb, uint32_t liobn, > + uint32_t page_shift, uint32_t window_shi= ft, > + sPAPRTCETable **ptcet) > +{ > + uint64_t bus_offset =3D spapr_phb_get_win_num(sphb) ? > + SPAPR_PCI_TCE64_START : 0; Should you also have an assert that spapr_phb_get_win_num(sphb) <=3D1 at this point? > + > + if (((page_shift !=3D 16) && (page_shift !=3D 24) && (page_shift != =3D 12))) { > + return -1; You only have two return values: failure and success. So is there a reason you're using an int, rather than returning the sPAPRTCETable * or NULL? > + } > + > + *ptcet =3D spapr_tce_new_table(DEVICE(sphb), liobn, > + bus_offset, > + page_shift, > + 1ULL << (window_shift - page_shift), > + false); > + if (!*ptcet) { > + return -1; > + } > + memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset, > + spapr_tce_get_iommu(*ptcet)); > + > + return 0; > +} > + > +int spapr_pci_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable *tcet) > +{ > + memory_region_del_subregion(&sphb->iommu_root, > + spapr_tce_get_iommu(tcet)); > + spapr_tce_free_table(tcet); > + > + return 0; > +} > + > +static int spapr_pci_remove_ddw_cb(Object *child, void *opaque) > +{ > + sPAPRTCETable *tcet; > + > + tcet =3D (sPAPRTCETable *) object_dynamic_cast(child, TYPE_SPAPR_TCE= _TABLE); > + > + if (tcet) { > + sPAPRPHBState *sphb =3D opaque; > + sPAPRPHBClass *spc =3D SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); > + > + spc->ddw_remove(sphb, tcet); > + } > + > + return 0; > +} > + > +int spapr_pci_ddw_reset(sPAPRPHBState *sphb) > +{ > + int ret; > + sPAPRPHBClass *spc; > + sPAPRTCETable *tcet; > + uint32_t windows_supported =3D 0, page_size_mask =3D 0, dma32_window= _size =3D 0; > + uint64_t dma64_window_size =3D 0; > + > + /* Remove all windows */ > + 
object_child_foreach(OBJECT(sphb), spapr_pci_remove_ddw_cb, sphb); > + > + /* Create default 32bit window */ This comment seems to belong a few lines down from here. > + spc =3D SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); > + if (!spc->ddw_create || !spc->ddw_query) { > + return -1; > + } > + > + ret =3D spc->ddw_query(sphb, &windows_supported, &page_size_mask, > + &dma32_window_size, &dma64_window_size); > + if (ret) { > + return ret; > + } > + > + sphb->ddw_enabled =3D (windows_supported > 1); ddw_enabled doesn't actually seem to be tested anywhere. And shouldn't it depend on the externally set property for pre-2.3 compat, not just on the # windows supported by the underlying implementation? > + ret =3D spc->ddw_create(sphb, SPAPR_PCI_LIOBN(sphb->index, 0), > + SPAPR_TCE_PAGE_SHIFT, ctzl(dma32_window_size),= &tcet); > + if (ret) { > + return ret; > + } > + > + object_unref(OBJECT(tcet)); This could perhaps do with a comment saying why you've ended up with an extraneous reference. > + > + return 0; > +} > + > /* > * PHB PCI device > */ > @@ -484,7 +604,6 @@ static void spapr_phb_realize(DeviceState *dev, Error= **errp) > SysBusDevice *s =3D SYS_BUS_DEVICE(dev); > sPAPRPHBState *sphb =3D SPAPR_PCI_HOST_BRIDGE(s); > PCIHostState *phb =3D PCI_HOST_BRIDGE(s); > - sPAPRPHBClass *info =3D SPAPR_PCI_HOST_BRIDGE_GET_CLASS(s); > char *namebuf; > int i; > PCIBus *bus; > @@ -622,37 +741,9 @@ static void spapr_phb_realize(DeviceState *dev, Erro= r **errp) > sphb->lsi_table[i].irq =3D irq; > } > =20 > - if (!info->finish_realize) { > - error_setg(errp, "finish_realize not defined"); > - return; > - } > - > - info->finish_realize(sphb, errp); > - > sphb->msi =3D g_hash_table_new_full(g_int_hash, g_int_equal, g_free,= g_free); > } > =20 > -static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) > -{ > - sPAPRTCETable *tcet; > - > - tcet =3D spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn, > - 0, > - SPAPR_TCE_PAGE_SHIFT, > - 0x40000000 >> SPAPR_TCE_PAGE_SHIFT, false=
); > - if (!tcet) { > - error_setg(errp, "Unable to create TCE table for %s", > - sphb->dtbusname); > - return ; > - } > - > - /* Register default 32bit DMA window */ > - memory_region_add_subregion(&sphb->iommu_root, 0, > - spapr_tce_get_iommu(tcet)); > - > - object_unref(OBJECT(tcet)); > -} > - > static int spapr_phb_children_reset(Object *child, void *opaque) > { > DeviceState *dev =3D (DeviceState *) object_dynamic_cast(child, TYPE= _DEVICE); > @@ -666,7 +757,11 @@ static int spapr_phb_children_reset(Object *child, v= oid *opaque) > =20 > static void spapr_phb_reset(DeviceState *qdev) > { > - /* Reset the IOMMU state */ > + sPAPRPHBClass *spc =3D SPAPR_PCI_HOST_BRIDGE_GET_CLASS(qdev); > + > + if (spc->ddw_reset) { > + spc->ddw_reset(SPAPR_PCI_HOST_BRIDGE(qdev)); > + } > object_child_foreach(OBJECT(qdev), spapr_phb_children_reset, NULL); > } > =20 > @@ -801,7 +896,10 @@ static void spapr_phb_class_init(ObjectClass *klass,= void *data) > dc->vmsd =3D &vmstate_spapr_pci; > set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); > dc->cannot_instantiate_with_device_add_yet =3D false; > - spc->finish_realize =3D spapr_phb_finish_realize; > + spc->ddw_query =3D spapr_pci_ddw_query; > + spc->ddw_create =3D spapr_pci_ddw_create; > + spc->ddw_remove =3D spapr_pci_ddw_remove; > + spc->ddw_reset =3D spapr_pci_ddw_reset; > } > =20 > static const TypeInfo spapr_phb_info =3D { > diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c > index aabf0ae..b20ac90 100644 > --- a/hw/ppc/spapr_pci_vfio.c > +++ b/hw/ppc/spapr_pci_vfio.c > @@ -27,65 +27,89 @@ static Property spapr_phb_vfio_properties[] =3D { > DEFINE_PROP_END_OF_LIST(), > }; > =20 > -static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **e= rrp) > +static int spapr_pci_vfio_ddw_query(sPAPRPHBState *sphb, > + uint32_t *windows_supported, > + uint32_t *page_size_mask, > + uint32_t *dma32_window_size, > + uint64_t *dma64_window_size) > { > sPAPRPHBVFIOState *svphb =3D SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); > 
struct vfio_iommu_spapr_tce_info info =3D { .argsz =3D sizeof(info) = }; > int ret; > - sPAPRTCETable *tcet; > - uint32_t liobn =3D svphb->phb.dma_liobn; > =20 > - if (svphb->iommugroupid =3D=3D -1) { > - error_setg(errp, "Wrong IOMMU group ID %d", svphb->iommugroupid); > - return; > - } > - > - ret =3D vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroup= id, > - VFIO_CHECK_EXTENSION, > - (void *) VFIO_SPAPR_TCE_IOMMU); > - if (ret !=3D 1) { > - error_setg_errno(errp, -ret, > - "spapr-vfio: SPAPR extension is not supported"); > - return; > - } > - > - ret =3D vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroup= id, > + ret =3D vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid, > VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info); > if (ret) { > - error_setg_errno(errp, -ret, > - "spapr-vfio: get info from container failed"); > - return; > + return ret; > } > =20 > - tcet =3D spapr_tce_new_table(DEVICE(sphb), liobn, info.dma32_window_= start, > - SPAPR_TCE_PAGE_SHIFT, > - info.dma32_window_size >> SPAPR_TCE_PAGE_= SHIFT, > - true); > - if (!tcet) { > - error_setg(errp, "spapr-vfio: failed to create VFIO TCE table"); > - return; > + *windows_supported =3D info.windows_supported; > + *page_size_mask =3D info.flags & DDW_PGSIZE_MASK; > + *dma32_window_size =3D info.dma32_window_size; > + *dma64_window_size =3D ram_size; > + > + return ret; > +} > + > +static int spapr_pci_vfio_ddw_create(sPAPRPHBState *sphb, uint32_t liobn, > + uint32_t page_shift, uint32_t windo= w_shift, > + sPAPRTCETable **ptcet) > +{ > + sPAPRPHBVFIOState *svphb =3D SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); > + struct vfio_iommu_spapr_tce_create create =3D { > + .argsz =3D sizeof(create), > + .page_shift =3D page_shift, > + .window_shift =3D window_shift, > + .levels =3D 1, > + .start_addr =3D 0, > + }; > + int ret; > + > + ret =3D vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid, > + VFIO_IOMMU_SPAPR_TCE_CREATE, &create); > + if (ret) { > + return ret; > } > =20 > - /* Register 
default 32bit DMA window */ > - memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset, > - spapr_tce_get_iommu(tcet)); > + *ptcet =3D spapr_tce_new_table(DEVICE(sphb), liobn, > + create.start_addr, > + page_shift, > + 1ULL << (window_shift - page_shift), > + true); > + if (!*ptcet) { > + return -1; > + } > + memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset, > + spapr_tce_get_iommu(*ptcet)); > =20 > - object_unref(OBJECT(tcet)); > + return ret; > } > =20 > -static void spapr_phb_vfio_reset(DeviceState *qdev) > +static int spapr_pci_vfio_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable = *tcet) > { > - /* Do nothing */ > + sPAPRPHBVFIOState *svphb =3D SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); > + struct vfio_iommu_spapr_tce_remove remove =3D { > + .argsz =3D sizeof(remove), > + .start_addr =3D tcet->bus_offset > + }; > + int ret; > + > + spapr_pci_ddw_remove(sphb, tcet); > + ret =3D vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid, > + VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove); > + > + return ret; > } > =20 > static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) > { > - DeviceClass *dc =3D DEVICE_CLASS(klass); > sPAPRPHBClass *spc =3D SPAPR_PCI_HOST_BRIDGE_CLASS(klass); > + DeviceClass *dc =3D DEVICE_CLASS(klass); > =20 > dc->props =3D spapr_phb_vfio_properties; > - dc->reset =3D spapr_phb_vfio_reset; > - spc->finish_realize =3D spapr_phb_vfio_finish_realize; > + spc->ddw_query =3D spapr_pci_vfio_ddw_query; > + spc->ddw_create =3D spapr_pci_vfio_ddw_create; > + spc->ddw_remove =3D spapr_pci_vfio_ddw_remove; > } > =20 > static const TypeInfo spapr_phb_vfio_info =3D { > diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h > index eec95f3..577f908 100644 > --- a/include/hw/pci-host/spapr.h > +++ b/include/hw/pci-host/spapr.h > @@ -48,8 +48,6 @@ typedef struct sPAPRPHBVFIOState sPAPRPHBVFIOState; > struct sPAPRPHBClass { > PCIHostBridgeClass parent_class; > =20 > - void (*finish_realize)(sPAPRPHBState *sphb, Error 
**errp); > - > /* sPAPR spec defined pagesize mask values */ > #define DDW_PGSIZE_4K 0x01 > #define DDW_PGSIZE_64K 0x02 > @@ -106,6 +104,8 @@ struct sPAPRPHBState { > int32_t msi_devs_num; > spapr_pci_msi_mig *msi_devs; > =20 > + bool ddw_enabled; > + > QLIST_ENTRY(sPAPRPHBState) list; > }; > =20 > @@ -129,6 +129,14 @@ struct sPAPRPHBVFIOState { > =20 > #define SPAPR_PCI_MSI_WINDOW 0x40000000000ULL > =20 > +#define SPAPR_PCI_TCE32_WIN_SIZE 0x80000000ULL > + > +/* Default 64bit dynamic window offset */ > +#define SPAPR_PCI_TCE64_START 0x8000000000000000ULL > + > +/* Maximum allowed number of DMA windows for emulated PHB */ > +#define SPAPR_PCI_DDW_MAX_WINDOWS 2 > + > static inline qemu_irq spapr_phb_lsi_qirq(struct sPAPRPHBState *phb, int= pin) > { > return xics_get_qirq(spapr->icp, phb->lsi_table[pin].irq); > @@ -147,5 +155,8 @@ void spapr_pci_rtas_init(void); > sPAPRPHBState *spapr_pci_find_phb(sPAPREnvironment *spapr, uint64_t buid= ); > PCIDevice *spapr_pci_find_dev(sPAPREnvironment *spapr, uint64_t buid, > uint32_t config_addr); > +int spapr_pci_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable *tcet); > +int spapr_pci_ddw_reset(sPAPRPHBState *sphb); > +unsigned spapr_phb_get_win_num(sPAPRPHBState *sphb); > =20 > #endif /* __HW_SPAPR_PCI_H__ */ --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/~dgibson --9RxwyT9MtfFuvYYZ--