From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59419)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1aNyFC-0000gU-Bl for qemu-devel@nongnu.org;
    Tue, 26 Jan 2016 02:40:55 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1aNyF8-0007CO-B0 for qemu-devel@nongnu.org;
    Tue, 26 Jan 2016 02:40:54 -0500
Received: from mga14.intel.com ([192.55.52.115]:30218)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1aNyF8-0007B9-1W for qemu-devel@nongnu.org;
    Tue, 26 Jan 2016 02:40:50 -0500
Message-ID: <56A72313.9030009@intel.com>
Date: Tue, 26 Jan 2016 15:41:07 +0800
From: Jike Song
MIME-Version: 1.0
References: <569C5071.6080004@intel.com>
    <1453092476.32741.67.camel@redhat.com>
    <569CA8AD.6070200@intel.com>
    <1453143919.32741.169.camel@redhat.com>
    <569F4C86.2070501@intel.com>
    <56A6083E.10703@intel.com>
    <1453757426.32741.614.camel@redhat.com>
In-Reply-To: <1453757426.32741.614.camel@redhat.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3
    release of XenGT - a Mediated ...)
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Alex Williamson
Cc: "Ruan, Shuai" , "Tian, Kevin" , Neo Jia ,
    "kvm@vger.kernel.org" , "igvt-g@lists.01.org" , qemu-devel ,
    Gerd Hoffmann , Paolo Bonzini , "Lv, Zhiyuan"

On 01/26/2016 05:30 AM, Alex Williamson wrote:
> [cc +Neo @Nvidia]
>
> Hi Jike,
>
> On Mon, 2016-01-25 at 19:34 +0800, Jike Song wrote:
>> On 01/20/2016 05:05 PM, Tian, Kevin wrote:
>>> I would expect we can spell out next level tasks toward above
>>> direction, upon which Alex can easily judge whether there are
>>> some common VFIO framework changes that he can help :-)
>>
>> Hi Alex,
>>
>> Here is a draft task list after a short discussion w/ Kevin,
>> would you please have a look?
>>
>> Bus Driver
>>
>> { in i915/vgt/xxx.c }
>>
>> - define a subset of vfio_pci interfaces
>> - selective pass-through (say aperture)
>> - trap MMIO: interface w/ QEMU
>
> What's included in the subset? Certainly the bus reset ioctls really
> don't apply, but you'll need to support the full device interface,
> right? That includes the region info ioctl and access through the vfio
> device file descriptor as well as the interrupt info and setup ioctls.
>

[All interfaces I thought are via ioctl:) For other stuff like file
descriptor we'll definitely keep it.]

The list of ioctl commands provided by vfio_pci:

    - VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
    - VFIO_DEVICE_PCI_HOT_RESET

As you said, above 2 don't apply. But for this:

    - VFIO_DEVICE_RESET

In my opinion it should be kept, no matter what will be provided in
the bus driver.

    - VFIO_PCI_ROM_REGION_INDEX
    - VFIO_PCI_VGA_REGION_INDEX

I suppose above 2 don't apply neither? For a vgpu we don't provide a
ROM BAR or VGA region.

    - VFIO_DEVICE_GET_INFO
    - VFIO_DEVICE_GET_REGION_INFO
    - VFIO_DEVICE_GET_IRQ_INFO
    - VFIO_DEVICE_SET_IRQS

Above 4 are needed of course.

We will need to extend:

    - VFIO_DEVICE_GET_REGION_INFO

      a) adding a flag: DONT_MAP. For example, the MMIO of vgpu
         should be trapped instead of being mmap-ed.

      b) adding other information. For example, for the OpRegion,
         QEMU need to do more than mmap a region, it has to:

           - allocate a region
           - copy contents from somewhere in host to that region
           - mmap it to guest

I remember you already have a prototype for this?

>> IOMMU
>>
>> { in a new vfio_xxx.c }
>>
>> - allocate: struct device & IOMMU group
>
> It seems like the vgpu instance management would do this.
>

Yes, it can be removed from here.

>> - map/unmap functions for vgpu
>> - rb-tree to maintain iova/hpa mappings
>
> Yep, pretty much what type1 does now, but without mapping through the
> IOMMU API.
> Essentially just a database of the current userspace
> mappings that can be accessed for page pinning and IOVA->HPA
> translation.
>

Yes.

>> - interacts with kvmgt.c
>>
>>
>> vgpu instance management
>>
>> { in i915 }
>>
>> - path, create/destroy
>>
>
> Yes, and since you're creating and destroying the vgpu here, this is
> where I'd expect a struct device to be created and added to an IOMMU
> group. The lifecycle management should really include links between
> the vGPU and physical GPU, which would be much, much easier to do with
> struct devices create here rather than at the point where we start
> doing vfio "stuff".
>

Yes, just like the SRIOV does.

> Nvidia has also been looking at this and has some ideas how we might
> standardize on some of the interfaces and create a vgpu framework to
> help share code between vendors and hopefully make a more consistent
> userspace interface for libvirt as well. I'll let Neo provide some
> details. Thanks,

Good to know that, so we can possibly cooperate on some common part,
e.g. the instance management :)

>
> Alex
>

--
Thanks,
Jike
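
For illustration, the "database of the current userspace mappings"
discussed in the IOMMU part of this thread could be modelled roughly as
below. This is a minimal Python sketch under stated assumptions, not
i915, kvmgt or vfio code: the class name MappingDB and its map/unmap/
translate methods are hypothetical, a sorted list with bisect stands in
for the rb-tree, ranges are assumed not to overlap, and each range is
assumed physically contiguous (real pinned memory generally is not).

```python
import bisect


class MappingDB:
    """Toy IOVA -> HPA database (hypothetical model, not kernel code)."""

    def __init__(self):
        self._starts = []   # sorted IOVA start addresses (stand-in for an rb-tree)
        self._ranges = {}   # start IOVA -> (size, hpa)

    def map(self, iova, size, hpa):
        """Record [iova, iova + size) -> hpa; reject empty or overlapping ranges."""
        if size <= 0:
            raise ValueError("size must be positive")
        i = bisect.bisect_left(self._starts, iova)
        # Does the range just below reach into the new one?
        if i > 0:
            prev = self._starts[i - 1]
            if prev + self._ranges[prev][0] > iova:
                raise ValueError("overlaps an existing mapping")
        # Does the new range reach into the one at or above it?
        if i < len(self._starts) and self._starts[i] < iova + size:
            raise ValueError("overlaps an existing mapping")
        self._starts.insert(i, iova)
        self._ranges[iova] = (size, hpa)

    def unmap(self, iova):
        """Drop the range starting exactly at iova and return its size."""
        if iova not in self._ranges:
            raise KeyError(iova)
        size, _ = self._ranges.pop(iova)
        self._starts.remove(iova)
        return size

    def translate(self, iova):
        """Return the HPA backing iova, or None if iova is not mapped."""
        i = bisect.bisect_right(self._starts, iova) - 1
        if i < 0:
            return None
        start = self._starts[i]
        size, hpa = self._ranges[start]
        if iova >= start + size:
            return None
        return hpa + (iova - start)
```

In this model nothing is programmed into a hardware IOMMU: map() and
unmap() only update the table, and translate() is the lookup that a
device-model component would call when it needs the host address behind
a guest-programmed IOVA.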