From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36013) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a4Whq-0005ED-V9 for qemu-devel@nongnu.org; Thu, 03 Dec 2015 11:26:11 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a4Whm-00041e-SO for qemu-devel@nongnu.org; Thu, 03 Dec 2015 11:26:06 -0500 Received: from mx1.redhat.com ([209.132.183.28]:49131) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a4Whm-00041T-LV for qemu-devel@nongnu.org; Thu, 03 Dec 2015 11:26:02 -0500 Message-ID: <1449159960.15753.160.camel@redhat.com> From: Alex Williamson Date: Thu, 03 Dec 2015 09:26:00 -0700 In-Reply-To: <008a01d12da9$67f816d0$37e84470$@samsung.com> References: <00fe01d1210c$1be12880$53a37980$@samsung.com> <1447884282.4697.111.camel@redhat.com> <013101d122b5$240ef500$6c2cdf00$@samsung.com> <1447976037.4697.205.camel@redhat.com> <013801d126cc$3efcbdf0$bcf639d0$@samsung.com> <1449085233.15753.101.camel@redhat.com> <008a01d12da9$67f816d0$37e84470$@samsung.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] vfio: Align iova also to IOMMU page size List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Pavel Fedin Cc: 'Peter Maydell' , qemu-devel@nongnu.org On Thu, 2015-12-03 at 12:02 +0300, Pavel Fedin wrote: > Hello! > > > > My device defines this BAR to be of 2M size. In this case qemu splits it up into three > > > regions: > > > 1) Region below the MSI-X table (it's called "mmap", for me it's empty because table offset > > > is 0) > > > 2) MSI-X table itself (20 vectors = 0x00000140 bytes for me). > > > 3) Region above the MSi-X table (it's called "mmap msix-hi"). Size for my case is 2M - > > > REAL_HOST_PAGE_ALIGN(0x140) = 0x1FF000. > > > Regions (1) and (3) are passed through and directly mapped by VFIO, region (2) is emulated. > > > Now, we have PBA. The device says that PBA is placed in memory specified by the same BAR as > > > table, and its offset is 0x000f0000. PBA is also emulated by qemu, and as a result it overlaps > > > "mmap msix-hi" region, which breaks up into two fragments as a consequence: > > > 3a) offset 0x00000 size 0x0EF000 > > > 3b) offset 0xEF008 size 0x10FFF8 > > > PBA sits between these fragments, having only 8 bytes in size. > > > Attempt to map (3b) causes the problem. vfio_mmap_region() is expected to align this up, > > > and it does, but it does it to a minimum possible granularity for ARM platform, which, as qemu > > > thinks, is always 1K. > > > > Yep, on x86 we'd have target page size == host page size == minimum > > iommu page size and we'd align that up to something that can be mapped, > > which means we wouldn't have an iommu mapping for peer-to-peer in this > > space, but nobody is likely to notice. > > So, OK, to keep the quoting short... I've also read your previous reply about that "rounding up just works". Let's sum things up. > > 1. Rounding up "just works" in my case too. So i don't see a direct reason to change it. > 2. The only problem is rounding up to 1K, which TARGET_PAGE_ALIGN() does on ARM. What we need is 4K, > which is appropriate for host too. And, by the way, for modern ARM guests too. So, we can do either of: > a) Keep my original approach - TARGET_PAGE_ALIGN(), then align to vfio_container_granularity(). > b) Just align to vfio_container_granularity() instead of using TARGET_PAGE_ALIGN(). > c) Use HOST_PAGE_ALIGN() instead of TARGET_PAGE_ALIGN() (what Peter suggested). > > On x86 there would be no difference, because all these alignments are identical. On ARM, actually, all of these approaches would also give correct result, because it's only TARGET_PAGE_ALIGN() wrong in this case. So - what do you think is the most appropriate? Let's make a choice and i'll send a proper patch. I feel a lot more comfortable if we limit the scope to MMIO regions of PCI devices. The problems I brought up before about the device not being able to DMA to a target aligned RAM address are still a possibility that I think we want to catch. To do that, I think we just need: Object *obj = memory_region_owner(section->mr); if (object_dynamic_cast(obj, "pci-device")) { /* HOST_PAGE_ALIGN... */ } else { /* TARGET_PAGE_ALIGN... */ } Thanks, Alex