From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58324) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XGtQy-00013B-KJ for qemu-devel@nongnu.org; Mon, 11 Aug 2014 13:31:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XGtQs-0005b6-3A for qemu-devel@nongnu.org; Mon, 11 Aug 2014 13:31:00 -0400 Message-ID: <53E8FDCC.7060308@suse.de> Date: Mon, 11 Aug 2014 19:30:52 +0200 From: Alexander Graf MIME-Version: 1.0 References: <1406799254-25223-1-git-send-email-aik@ozlabs.ru> <1406799254-25223-10-git-send-email-aik@ozlabs.ru> <53E8B0BB.60006@suse.de> <53E8DAC2.1020307@ozlabs.ru> In-Reply-To: <53E8DAC2.1020307@ozlabs.ru> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC PATCH 09/10] spapr_pci_vfio: Enable DDW List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexey Kardashevskiy , qemu-devel@nongnu.org Cc: Alex Williamson , qemu-ppc@nongnu.org On 11.08.14 17:01, Alexey Kardashevskiy wrote: > On 08/11/2014 10:02 PM, Alexander Graf wrote: >> On 31.07.14 11:34, Alexey Kardashevskiy wrote: >>> This implements DDW for VFIO. Host kernel support is required for this. >>> >>> Signed-off-by: Alexey Kardashevskiy >>> --- >>> hw/ppc/spapr_pci_vfio.c | 75 >>> +++++++++++++++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 75 insertions(+) >>> >>> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c >>> index d3bddf2..dc443e2 100644 >>> --- a/hw/ppc/spapr_pci_vfio.c >>> +++ b/hw/ppc/spapr_pci_vfio.c >>> @@ -69,6 +69,77 @@ static void >>> spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp) >>> /* Register default 32bit DMA window */ >>> memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset, >>> spapr_tce_get_iommu(tcet)); >>> + >>> + sphb->ddw_supported = !!(info.flags & VFIO_IOMMU_SPAPR_TCE_FLAG_DDW); >>> +} >>> + >>> +static int spapr_pci_vfio_ddw_query(sPAPRPHBState *sphb, >>> + uint32_t *windows_available, >>> + uint32_t *page_size_mask) >>> +{ >>> + sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); >>> + struct vfio_iommu_spapr_tce_query query = { .argsz = sizeof(query) }; >>> + int ret; >>> + >>> + ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid, >>> + VFIO_IOMMU_SPAPR_TCE_QUERY, &query); >>> + if (ret) { >>> + return ret; >>> + } >>> + >>> + *windows_available = query.windows_available; >>> + *page_size_mask = query.page_size_mask; >>> + >>> + return ret; >>> +} >>> + >>> +static int spapr_pci_vfio_ddw_create(sPAPRPHBState *sphb, uint32_t >>> page_shift, >>> + uint32_t window_shift, uint32_t liobn, >>> + sPAPRTCETable **ptcet) >>> +{ >>> + sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); >>> + struct vfio_iommu_spapr_tce_create create = { >>> + .argsz = sizeof(create), >>> + .page_shift = page_shift, >>> + .window_shift = window_shift, >>> + .start_addr = 0 >>> + }; >>> + int ret; >>> + >>> + ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid, >>> + VFIO_IOMMU_SPAPR_TCE_CREATE, &create); >>> + if (ret) { >>> + return ret; >>> + } >>> + >>> + *ptcet = spapr_tce_new_table(DEVICE(sphb), liobn, create.start_addr, >>> + page_shift, 1 << (window_shift - >>> page_shift), >> I spot a 1 without ULL again - this time it might work out ok, but please >> just always use ULL when you pass around addresses. > My bad. I keep forgetting this, I'll adjust my own checkpatch.py :) > > >> Please walk me though the abstraction levels on what each page size >> honoration means. If I use THP, what page size granularity can I use for >> TCE entries? > > [RFC PATCH 06/10] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS calls support > > + const struct { int shift; uint32_t mask; } masks[] = { > + { 12, DDW_PGSIZE_4K }, > + { 16, DDW_PGSIZE_64K }, > + { 24, DDW_PGSIZE_16M }, > + { 25, DDW_PGSIZE_32M }, > + { 26, DDW_PGSIZE_64M }, > + { 27, DDW_PGSIZE_128M }, > + { 28, DDW_PGSIZE_256M }, > + { 34, DDW_PGSIZE_16G }, > + }; > > > Supported page sizes are returned by the host kernel via "query". For 16MB > pages, page shift will return DDW_PGSIZE_4K|DDW_PGSIZE_64K|DDW_PGSIZE_16M. > Or I did not understand the question... Why do we care about the sizes? Anything bigger than what we support should always work, no? What happens if the guest creates a 16MB map but my pages are 4kb mapped? Wouldn't the same logic be able to deal with 16G pages? Alex