Date: Fri, 13 May 2016 00:42:06 -0700
From: Neo Jia <cjia@nvidia.com>
To: Jike Song
Cc: "Tian, Kevin", Jike Song, Alex Williamson, Kirti Wankhede,
	"pbonzini@redhat.com", "kraxel@redhat.com", "qemu-devel@nongnu.org",
	"kvm@vger.kernel.org", "Ruan, Shuai", "Lv, Zhiyuan"
Subject: Re: [Qemu-devel] [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with iommu and without iommu
Message-ID: <20160513074206.GC5749@nvidia.com>
In-Reply-To: <57358293.7060408@intel.com>
References: <5731933B.90508@intel.com> <20160510160257.GA4125@nvidia.com>
	<5732F823.3090409@intel.com> <20160511160628.690876f9@t450s.home>
	<20160512194924.GA24334@nvidia.com> <573572AD.2010407@intel.com>
	<20160513064357.GB30970@nvidia.com> <57358293.7060408@intel.com>

On Fri, May 13, 2016 at 03:30:27PM +0800, Jike Song wrote:
> On 05/13/2016 02:43 PM, Neo Jia wrote:
> > On Fri, May 13, 2016 at 02:22:37PM +0800, Jike Song wrote:
> >> On 05/13/2016 10:41 AM, Tian, Kevin wrote:
> >>>> From: Neo Jia [mailto:cjia@nvidia.com]
> >>>> Sent: Friday, May 13, 2016 3:49 AM
> >>>>
> >>>>>
> >>>>>> Perhaps one possibility would be to allow the vgpu driver
> >>>>>> to register map and unmap callbacks. The unmap callback
> >>>>>> might provide the invalidation interface that we're so far
> >>>>>> missing. The combination of map and unmap callbacks might
> >>>>>> simplify the Intel approach of pinning the entire VM memory
> >>>>>> space, ie. for each map callback do a translation (pin) and
> >>>>>> dma_map_page, for each unmap do a dma_unmap_page and
> >>>>>> release the translation.
> >>>>>
> >>>>> Yes, adding map/unmap ops in the pGPU driver (I assume you are
> >>>>> referring to gpu_device_ops as implemented in Kirti's patch)
> >>>>> sounds like a good idea, satisfying both: 1) keeping vGPU purely
> >>>>> virtual; 2) dealing with the Linux DMA API to achieve hardware
> >>>>> IOMMU compatibility.
> >>>>>
> >>>>> PS, this has very little to do with pinning wholly or
> >>>>> partially. Intel KVMGT once had the whole guest memory pinned,
> >>>>> only because we used a spinlock, which can't sleep at runtime.
> >>>>> We have removed that spinlock in another upstreaming effort,
> >>>>> not here but in the i915 driver, so it's probably no biggie.
> >>>>>
> >>>>
> >>>> OK, then you guys don't need to pin everything. The next
> >>>> question will be whether you can send the pinning request from
> >>>> your mediated driver backend to request memory pinning, as we
> >>>> have demonstrated in the v3 patch with the functions
> >>>> vfio_pin_pages and vfio_unpin_pages?
> >>>>
> >>>
> >>> Jike, can you confirm this statement? My feeling is that we don't
> >>> have such logic in our device model to figure out which pages
> >>> need to be pinned on demand. So currently pin-everything is the
> >>> same requirement on both the KVM and Xen sides...
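
(To illustrate the on-demand pinning mentioned above: a very rough
sketch of how a mediated driver backend could request pinning through
the vfio_pin_pages / vfio_unpin_pages interface from the v3 patch.
The prototypes below are assumed for this sketch and may not match
the actual patch.)

/*
 * Sketch only: pin just the range the guest has programmed, instead
 * of pinning all guest memory up front.  The vfio_pin_pages() /
 * vfio_unpin_pages() prototypes are assumed here for illustration.
 */
#include <linux/errno.h>
#include <linux/iommu.h>
#include <linux/types.h>

extern long vfio_pin_pages(unsigned long *user_pfns, long npage,
			   int prot, unsigned long *host_pfns);
extern long vfio_unpin_pages(unsigned long *host_pfns, long npage);

/*
 * Called by the device model when the guest sets up a GPU mapping
 * for a range of guest pfns: pin only that range and keep the
 * resulting host pfns for DMA.
 */
static long example_pin_on_demand(unsigned long *guest_pfns, long npage,
				  unsigned long *host_pfns)
{
	long pinned;

	pinned = vfio_pin_pages(guest_pfns, npage,
				IOMMU_READ | IOMMU_WRITE, host_pfns);
	if (pinned < 0)
		return pinned;
	if (pinned != npage) {
		/* roll back a partial pin and report failure */
		vfio_unpin_pages(host_pfns, pinned);
		return -EFAULT;
	}
	return pinned;
}
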
> >>
> >> [Correct me if I've overlooked anything :)]
> >>
> >> IMO the ultimate reason to pin a page is for DMA. Accessing RAM
> >> from a GPU is certainly a DMA operation. The DMA facility of most
> >> platforms, IGD and NVIDIA GPUs included, is not capable of
> >> faulting, handling and retrying.
> >>
> >> As for the vGPU solutions that Nvidia and Intel provide: whenever
> >> the Guest sets up the mappings for the memory region it uses for
> >> GPU access, the Host intercepts that, so it's safe to pin a page
> >> only just before it gets used by the Guest. This probably doesn't
> >> need the device model to change :)
> >
> > Hi Jike,
> >
> > Just out of curiosity, how does the host intercept this before it
> > goes on the bus?
> >
>
> Hi Neo,
>
> [Apologies if I expressed myself poorly, bad English ..]
>
> I was talking about intercepting the setting-up of the GPU page
> tables, not the DMA itself. For current Intel GPUs, the page tables
> are either MMIO registers or simply RAM pages, called the GTT
> (Graphics Translation Table); a write to a GTT entry from the Guest
> is always intercepted by the Host.

Hi Jike,

Thanks for the details. One more question: if the page tables are in
guest RAM, how do you intercept writes to them from the host? I can
see how they get intercepted when they are in an MMIO range.

Thanks,
Neo

>
> --
> Thanks,
> Jike
>
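
P.S. For completeness, a rough sketch of the map/unmap callback idea
Alex suggested near the top of the thread - each map pins a page and
dma_map_page()s it, each unmap does dma_unmap_page() and drops the
pin. The structure and names below are invented for illustration;
they are not the actual gpu_device_ops from Kirti's patch.

/*
 * Illustration only: one possible shape for per-page map/unmap
 * callbacks registered by a vGPU/mediated driver.
 */
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/mm.h>

struct vgpu_map_ops_sketch {
	/* translate (pin) a guest page and set up a DMA mapping for it */
	int  (*map)(struct device *dev, unsigned long user_addr,
		    struct page **page, dma_addr_t *dma_addr);
	/* invalidation path: tear down the DMA mapping and drop the pin */
	void (*unmap)(struct device *dev, struct page *page,
		      dma_addr_t dma_addr);
};

static int sketch_map(struct device *dev, unsigned long user_addr,
		      struct page **page, dma_addr_t *dma_addr)
{
	int ret;

	/* pin the backing page (the "translation" step) */
	ret = get_user_pages_fast(user_addr, 1, 1, page);
	if (ret != 1)
		return ret < 0 ? ret : -EFAULT;

	/* make the page visible to the device through the hardware IOMMU */
	*dma_addr = dma_map_page(dev, *page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *dma_addr)) {
		put_page(*page);
		return -EIO;
	}
	return 0;
}

static void sketch_unmap(struct device *dev, struct page *page,
			 dma_addr_t dma_addr)
{
	dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
	put_page(page);
}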