From mboxrd@z Thu Jan  1 00:00:00 1970
From: Neo Jia <cjia@nvidia.com>
Subject: Re: [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with
 iommu and without iommu
Date: Thu, 12 May 2016 23:43:57 -0700
Message-ID: <20160513064357.GB30970@nvidia.com>
References: <572AEE72.90008@intel.com>
 <AADFC41AFE54684AB9EE6CBC0274A5D15F8470F6@SHSMSX101.ccr.corp.intel.com>
 <5731933B.90508@intel.com>
 <20160510160257.GA4125@nvidia.com>
 <5732F823.3090409@intel.com>
 <20160511160628.690876f9@t450s.home>
 <CANE52KgothHxNED00pXBHBT1KgejMQiEhHBcMKppyFT_gZgPSw@mail.gmail.com>
 <20160512194924.GA24334@nvidia.com>
 <AADFC41AFE54684AB9EE6CBC0274A5D15F853FE9@SHSMSX101.ccr.corp.intel.com>
 <573572AD.2010407@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	Jike Song <albcamus@gmail.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kraxel@redhat.com" <kraxel@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Ruan, Shuai" <shuai.ruan@intel.com>,
	"Lv, Zhiyuan" <zhiyuan.lv@intel.com>
To: Jike Song <jike.song@intel.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from hqemgate16.nvidia.com ([216.228.121.65]:9375 "EHLO
	hqemgate16.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750918AbcEMGoE (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 13 May 2016 02:44:04 -0400
Content-Disposition: inline
In-Reply-To: <573572AD.2010407@intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Fri, May 13, 2016 at 02:22:37PM +0800, Jike Song wrote:
> On 05/13/2016 10:41 AM, Tian, Kevin wrote:
> >> From: Neo Jia [mailto:cjia@nvidia.com]
> >> Sent: Friday, May 13, 2016 3:49 AM
> >>
> >>>
> >>>> Perhaps one possibility would be to allow the vgpu driver to register
> >>>> map and unmap callbacks.  The unmap callback might provide the
> >>>> invalidation interface that we're so far missing.  The combination of
> >>>> map and unmap callbacks might simplify the Intel approach of pinning the
> >>>> entire VM memory space, ie. for each map callback do a translation
> >>>> (pin) and dma_map_page, for each unmap do a dma_unmap_page and release
> >>>> the translation.
> >>>
> >>> Yes, adding map/unmap ops in the pGPU driver (I assume you are referring to
> >>> gpu_device_ops as implemented in Kirti's patch) sounds like a good idea,
> >>> satisfying both: 1) keeping vGPU purely virtual; 2) dealing with the Linux
> >>> DMA API to achieve hardware IOMMU compatibility.
> >>>
> >>> PS, this has very little to do with pinning wholly or partially. Intel KVMGT
> >>> once had the whole guest memory pinned, only because we used a spinlock, and
> >>> code holding a spinlock can't sleep.  We have removed that spinlock in another
> >>> upstreaming effort of ours, not here but for the i915 driver, so probably no
> >>> biggie.
> >>>
> >>
> >> OK, then you guys don't need to pin everything. The next question is whether
> >> you can send the pinning request from your mediated driver backend, as we
> >> demonstrated in the v3 patch with the functions vfio_pin_pages and
> >> vfio_unpin_pages?
> >>
> > 
> > Jike, can you confirm this statement? My feeling is that we don't have such logic
> > in our device model to figure out which pages need to be pinned on demand. So
> > currently pin-everything is the same requirement on both the KVM and Xen sides...
> 
> [Correct me if I have missed anything :)]
> 
> IMO the ultimate reason to pin a page is for DMA. Accessing RAM from a GPU is
> certainly a DMA operation. The DMA facility of most platforms, IGD and NVIDIA
> GPU included, is not capable of taking a fault, handling it, and retrying.
> 
> As for vGPU solutions like the ones Nvidia and Intel provide, whenever the Guest
> sets up mappings in the memory address region it uses for GPU access, the
> update is intercepted by the Host, so it's safe to pin a page only just before
> it gets used by the Guest. This probably doesn't need the device model to change :)

Hi Jike

Just out of curiosity, how does the host intercept this before it goes on the
bus?
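
For reference, the pin-on-map flow discussed above (pin a page when the guest
mapping is intercepted, release it on unmap) could look roughly like the sketch
below. This is a userspace toy model in plain C, not kernel code and not the
code from the v3 patch. All names in it (toy_vm, toy_map, toy_unmap,
toy_pinned_pages, TOY_NPAGES) are hypothetical, and a per-page counter stands
in for the real pin and dma_map_page/dma_unmap_page steps.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPAGES 16   /* hypothetical size of the guest address space, in pages */

/* Hypothetical per-VM state: one pin reference count per guest page. */
struct toy_vm {
    unsigned int pin_count[TOY_NPAGES];
};

/* Map side: called when the host intercepts a guest mapping update.
 * Real code would translate and pin the page, then DMA-map it. */
static int toy_map(struct toy_vm *vm, size_t gfn)
{
    assert(vm != NULL);
    if (gfn >= TOY_NPAGES)
        return -1;              /* outside the guest address space */
    vm->pin_count[gfn]++;       /* stand-in for pin + DMA map */
    return 0;
}

/* Unmap side: called when the guest tears the mapping down.
 * Real code would DMA-unmap the page and release the translation. */
static int toy_unmap(struct toy_vm *vm, size_t gfn)
{
    assert(vm != NULL);
    if (gfn >= TOY_NPAGES || vm->pin_count[gfn] == 0)
        return -1;              /* never mapped, nothing to release */
    vm->pin_count[gfn]--;       /* stand-in for DMA unmap + unpin */
    return 0;
}

/* Count the pages currently pinned: only those the guest has mapped. */
static size_t toy_pinned_pages(const struct toy_vm *vm)
{
    size_t n = 0;

    assert(vm != NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++)
        if (vm->pin_count[i] > 0)
            n++;
    return n;
}
```

With a zero-initialised struct toy_vm, calling toy_map() for two distinct
pages leaves only those two pages pinned, and a page stays pinned until every
mapping of it has been dropped through toy_unmap(). The counter is there
because the same guest page may be mapped more than once.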

Thanks,
Neo

> 
> 
> --
> Thanks,
> Jike
> 
