Date: Fri, 13 May 2016 00:42:06 -0700
From: Neo Jia <cjia@nvidia.com>
To: Jike Song
Cc: "Tian, Kevin", Jike Song, Alex Williamson, Kirti Wankhede,
	"pbonzini@redhat.com", "kraxel@redhat.com", "qemu-devel@nongnu.org",
	"kvm@vger.kernel.org", "Ruan, Shuai", "Lv, Zhiyuan"
Subject: Re: [Qemu-devel] [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with iommu and without iommu
Message-ID: <20160513074206.GC5749@nvidia.com>
In-Reply-To: <57358293.7060408@intel.com>
References: <5731933B.90508@intel.com> <20160510160257.GA4125@nvidia.com>
	<5732F823.3090409@intel.com> <20160511160628.690876f9@t450s.home>
	<20160512194924.GA24334@nvidia.com> <573572AD.2010407@intel.com>
	<20160513064357.GB30970@nvidia.com> <57358293.7060408@intel.com>

On Fri, May 13, 2016 at 03:30:27PM +0800, Jike Song wrote:
> On 05/13/2016 02:43 PM, Neo Jia wrote:
> > On Fri, May 13, 2016 at 02:22:37PM +0800, Jike Song wrote:
> >> On 05/13/2016 10:41 AM, Tian, Kevin wrote:
> >>>> From: Neo Jia [mailto:cjia@nvidia.com]
> >>>> Sent: Friday, May 13, 2016 3:49 AM
> >>>>
> >>>>>
> >>>>>> Perhaps one possibility would be to allow the vgpu driver
> >>>>>> to register map and unmap callbacks. The unmap callback
> >>>>>> might provide the invalidation interface that we're so far
> >>>>>> missing. The combination of map and unmap callbacks might
> >>>>>> simplify the Intel approach of pinning the entire VM memory
> >>>>>> space, ie. for each map callback do a translation (pin) and
> >>>>>> dma_map_page, for each unmap do a dma_unmap_page and
> >>>>>> release the translation.
> >>>>>
> >>>>> Yes, adding map/unmap ops in the pGPU driver (I assume you are
> >>>>> referring to gpu_device_ops as implemented in Kirti's patch)
> >>>>> sounds like a good idea, satisfying both: 1) keeping vGPU purely
> >>>>> virtual; 2) dealing with the Linux DMA API to achieve hardware
> >>>>> IOMMU compatibility.
> >>>>>
> >>>>> PS, this has very little to do with pinning wholly or
> >>>>> partially. Intel KVMGT once had the whole guest memory pinned,
> >>>>> only because we used a spinlock, which can't sleep at runtime.
> >>>>> We have removed that spinlock in another upstreaming effort,
> >>>>> not here but in the i915 driver, so it's probably no biggie.
> >>>>>
> >>>>
> >>>> OK, then you guys don't need to pin everything. The next
> >>>> question will be whether you can send the pinning request from
> >>>> your mediated driver backend to request memory pinning, as we
> >>>> have demonstrated in the v3 patch with the functions
> >>>> vfio_pin_pages and vfio_unpin_pages?
> >>>>
> >>>
> >>> Jike, can you confirm this statement? My feeling is that we don't
> >>> have such logic in our device model to figure out which pages
> >>> need to be pinned on demand. So currently pin-everything is the
> >>> same requirement on both the KVM and Xen sides...
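
(To illustrate the on-demand pinning mentioned above: a very rough
sketch of how a mediated driver backend could request pinning through
the vfio_pin_pages / vfio_unpin_pages interface from the v3 patch.
The prototypes below are assumed for this sketch and may not match
the actual patch.)

/*
 * Sketch only: pin just the range the guest has programmed, instead
 * of pinning all guest memory up front.  The vfio_pin_pages() /
 * vfio_unpin_pages() prototypes are assumed here for illustration.
 */
#include <linux/errno.h>
#include <linux/iommu.h>
#include <linux/types.h>

extern long vfio_pin_pages(unsigned long *user_pfns, long npage,
			   int prot, unsigned long *host_pfns);
extern long vfio_unpin_pages(unsigned long *host_pfns, long npage);

/*
 * Called by the device model when the guest sets up a GPU mapping
 * for a range of guest pfns: pin only that range and keep the
 * resulting host pfns for DMA.
 */
static long example_pin_on_demand(unsigned long *guest_pfns, long npage,
				  unsigned long *host_pfns)
{
	long pinned;

	pinned = vfio_pin_pages(guest_pfns, npage,
				IOMMU_READ | IOMMU_WRITE, host_pfns);
	if (pinned < 0)
		return pinned;
	if (pinned != npage) {
		/* roll back a partial pin and report failure */
		vfio_unpin_pages(host_pfns, pinned);
		return -EFAULT;
	}
	return pinned;
}
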
> >>
> >> [Correct me if I've overlooked anything :)]
> >>
> >> IMO the ultimate reason to pin a page is for DMA. Accessing RAM
> >> from a GPU is certainly a DMA operation. The DMA facility of most
> >> platforms, IGD and NVIDIA GPUs included, is not capable of
> >> faulting, handling and retrying.
> >>
> >> As for the vGPU solutions that Nvidia and Intel provide: whenever
> >> the Guest sets up the mappings for the memory region it uses for
> >> GPU access, the Host intercepts that, so it's safe to pin a page
> >> only just before it gets used by the Guest. This probably doesn't
> >> need the device model to change :)
> >
> > Hi Jike,
> >
> > Just out of curiosity, how does the host intercept this before it
> > goes on the bus?
> >
>
> Hi Neo,
>
> [Apologies if I expressed myself poorly, bad English ..]
>
> I was talking about intercepting the setting-up of the GPU page
> tables, not the DMA itself. For current Intel GPUs, the page tables
> are either MMIO registers or simply RAM pages, called the GTT
> (Graphics Translation Table); a write to a GTT entry from the Guest
> is always intercepted by the Host.

Hi Jike,

Thanks for the details. One more question: if the page tables are in
guest RAM, how do you intercept writes to them from the host? I can
see how they get intercepted when they are in an MMIO range.

Thanks,
Neo

>
> --
> Thanks,
> Jike
>
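
P.S. For completeness, a rough sketch of the map/unmap callback idea
Alex suggested near the top of the thread - each map pins a page and
dma_map_page()s it, each unmap does dma_unmap_page() and drops the
pin. The structure and names below are invented for illustration;
they are not the actual gpu_device_ops from Kirti's patch.

/*
 * Illustration only: one possible shape for per-page map/unmap
 * callbacks registered by a vGPU/mediated driver.
 */
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/mm.h>

struct vgpu_map_ops_sketch {
	/* translate (pin) a guest page and set up a DMA mapping for it */
	int  (*map)(struct device *dev, unsigned long user_addr,
		    struct page **page, dma_addr_t *dma_addr);
	/* invalidation path: tear down the DMA mapping and drop the pin */
	void (*unmap)(struct device *dev, struct page *page,
		      dma_addr_t dma_addr);
};

static int sketch_map(struct device *dev, unsigned long user_addr,
		      struct page **page, dma_addr_t *dma_addr)
{
	int ret;

	/* pin the backing page (the "translation" step) */
	ret = get_user_pages_fast(user_addr, 1, 1, page);
	if (ret != 1)
		return ret < 0 ? ret : -EFAULT;

	/* make the page visible to the device through the hardware IOMMU */
	*dma_addr = dma_map_page(dev, *page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *dma_addr)) {
		put_page(*page);
		return -EIO;
	}
	return 0;
}

static void sketch_unmap(struct device *dev, struct page *page,
			 dma_addr_t dma_addr)
{
	dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
	put_page(page);
}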