Date: Thu, 5 May 2016 13:27:39 -0700
From: Neo Jia
To: "Tian, Kevin"
Cc: Alex Williamson, Kirti Wankhede, "pbonzini@redhat.com",
	"kraxel@redhat.com", "qemu-devel@nongnu.org", "kvm@vger.kernel.org",
	"Ruan, Shuai", "Song, Jike", "Lv, Zhiyuan"
Subject: Re: [Qemu-devel] [RFC PATCH v3 2/3] VFIO driver for vGPU device
Message-ID: <20160505202739.GB28046@nvidia.com>
References: <1462214441-3732-1-git-send-email-kwankhede@nvidia.com>
	<1462214441-3732-3-git-send-email-kwankhede@nvidia.com>
	<20160503164326.14dafcf5@t450s.home>
	<20160504110619.1c75cb69@t450s.home>

On Thu, May 05, 2016 at 09:24:26AM +0000, Tian, Kevin wrote:
> > From: Alex Williamson
> > Sent: Thursday, May 05, 2016 1:06 AM
> >
> > > > > +
> > > > > +static int vgpu_dev_mmio_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > > > +{
> > > > > +	int ret = 0;
> > > > > +	struct vfio_vgpu_device *vdev = vma->vm_private_data;
> > > > > +	struct vgpu_device *vgpu_dev;
> > > > > +	struct gpu_device *gpu_dev;
> > > > > +	u64 virtaddr = (u64)vmf->virtual_address;
> > > > > +	u64 offset, phyaddr;
> > > > > +	unsigned long req_size, pgoff;
> > > > > +	pgprot_t pg_prot;
> > > > > +
> > > > > +	if (!vdev || !vdev->vgpu_dev)
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	vgpu_dev = vdev->vgpu_dev;
> > > > > +	gpu_dev = vgpu_dev->gpu_dev;
> > > > > +
> > > > > +	offset = vma->vm_pgoff << PAGE_SHIFT;
> > > > > +	phyaddr = virtaddr - vma->vm_start + offset;
> > > > > +	pgoff = phyaddr >> PAGE_SHIFT;
> > > > > +	req_size = vma->vm_end - virtaddr;
> > > > > +	pg_prot = vma->vm_page_prot;
> > > > > +
> > > > > +	if (gpu_dev->ops->validate_map_request) {
> > > > > +		ret = gpu_dev->ops->validate_map_request(vgpu_dev, virtaddr,
> > > > > +							 &pgoff, &req_size,
> > > > > +							 &pg_prot);
> > > > > +		if (ret)
> > > > > +			return ret;
> > > > > +
> > > > > +		if (!req_size)
> > > > > +			return -EINVAL;
> > > > > +	}
> > > > > +
> > > > > +	ret = remap_pfn_range(vma, virtaddr, pgoff, req_size, pg_prot);
> > > >
> > > > So not supporting validate_map_request() means that the user can
> > > > directly mmap BARs of the host GPU and, as shown below, we assume a 1:1
> > > > mapping of vGPU BAR to host GPU BAR. Is that ever valid in a vGPU
> > > > scenario, or should this callback be required? It's not clear to me how
> > > > the vendor driver determines what this maps to; do they compare it to
> > > > the physical device's own BAR addresses?
> > >
> > > I didn't quite understand either. Based on the earlier discussion, do we
> > > need something like this, or could we achieve the purpose just by
> > > leveraging the recent sparse mmap support?
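To illustrate the question above, here is a minimal sketch of what a
vendor-side validate_map_request() implementation might look like. Only the
callback signature comes from the quoted patch; the bar_base field and the
my_vendor_alloc_bar_range() helper are hypothetical placeholders for a
vendor driver's own bookkeeping and resource allocator, not any real
implementation.

	/*
	 * Hypothetical vendor callback: translate a fault in the vGPU's
	 * virtual BAR into freshly committed pages on the host GPU BAR.
	 */
	static int my_vendor_validate_map_request(struct vgpu_device *vgpu_dev,
						  u64 virtaddr,
						  unsigned long *pgoff,
						  unsigned long *req_size,
						  pgprot_t *pg_prot)
	{
		/* Offset of the faulting page within the virtual BAR;
		 * bar_base is an assumed per-vGPU field. */
		u64 bar_offset = (*pgoff << PAGE_SHIFT) - vgpu_dev->bar_base;
		u64 host_phys;

		/*
		 * Commit backing pages on the physical GPU only now, at
		 * first access, rather than at mmap() time; the allocator
		 * name is a placeholder.
		 */
		host_phys = my_vendor_alloc_bar_range(vgpu_dev, bar_offset,
						      PAGE_SIZE);
		if (!host_phys)
			return -ENOMEM;

		/* Redirect the fault to the pages just committed. */
		*pgoff = host_phys >> PAGE_SHIFT;
		*req_size = PAGE_SIZE;		/* map one page per fault */
		*pg_prot = pgprot_noncached(*pg_prot);

		return 0;
	}

Under this reading, the vendor driver never compares against the physical
device's BAR addresses; it owns the translation outright, which is one
argument for making the callback required.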
>
> > The reason for faulting in the mmio space, if I recall correctly, is to
> > enable an ordering where the user driver (QEMU) can mmap regions of the
> > device prior to resources being allocated on the host GPU to handle
> > them.  Sparse mmap only partially handles that; it's not dynamic.  With
> > this faulting mechanism, the host GPU doesn't need to commit resources
> > until the mmap is actually accessed.  Thanks,
> >
> > Alex
>
> Neo/Kirti, is there a specific example of how the above works in
> practice?  Based on Alex's explanation I can see the difference from
> sparse mmap, but I still can't map the first sentence to a real
> scenario.  Our side doesn't currently use such a faulting-based method,
> so I'd like to understand it clearly and then see whether it's worth
> doing the same thing for the Intel GPU.

Hi Kevin,

The short answer is CPU access to GPU resources via the MMIO region.

Thanks,
Neo

>
> Thanks
> Kevin
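To make the ordering Alex describes concrete, a minimal userspace sketch
follows. The assumptions: the region offset and size would in reality come
from VFIO_DEVICE_GET_REGION_INFO, and map_vgpu_bar()/read_vgpu_reg() are
hypothetical helpers for illustration, not QEMU code.

	#include <stdint.h>
	#include <stddef.h>
	#include <sys/mman.h>

	/*
	 * mmap() the vGPU BAR region up front.  No host GPU resources are
	 * committed yet, because no page has been touched.
	 */
	static volatile uint32_t *map_vgpu_bar(int device_fd,
					       off_t region_offset,
					       size_t region_size)
	{
		void *bar = mmap(NULL, region_size, PROT_READ | PROT_WRITE,
				 MAP_SHARED, device_fd, region_offset);

		return (bar == MAP_FAILED) ? NULL : (volatile uint32_t *)bar;
	}

	/*
	 * The first CPU access to a page of the mapping takes the fault
	 * path: vgpu_dev_mmio_fault() -> validate_map_request() ->
	 * remap_pfn_range().  Only then does the host GPU commit resources
	 * for that page.
	 */
	static uint32_t read_vgpu_reg(volatile uint32_t *bar,
				      size_t reg_offset)
	{
		return bar[reg_offset / sizeof(uint32_t)];
	}

Subsequent accesses to an already-faulted page go straight to the host BAR
with no further faults; the dynamic, access-driven commitment is what a
static sparse-mmap layout cannot express.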