From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jike Song
Subject: Re: VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Tue, 26 Jan 2016 15:41:07 +0800
Message-ID: <56A72313.9030009@intel.com>
References: <569C5071.6080004@intel.com> <1453092476.32741.67.camel@redhat.com> <569CA8AD.6070200@intel.com> <1453143919.32741.169.camel@redhat.com> <569F4C86.2070501@intel.com> <56A6083E.10703@intel.com> <1453757426.32741.614.camel@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: "Tian, Kevin" , Gerd Hoffmann , Paolo Bonzini , "Lv, Zhiyuan" , "Ruan, Shuai" , "kvm@vger.kernel.org" , qemu-devel , "igvt-g@lists.01.org" , Neo Jia
To: Alex Williamson
Return-path:
Received: from mga02.intel.com ([134.134.136.20]:62574 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754436AbcAZHkt (ORCPT ); Tue, 26 Jan 2016 02:40:49 -0500
In-Reply-To: <1453757426.32741.614.camel@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 01/26/2016 05:30 AM, Alex Williamson wrote:
> [cc +Neo @Nvidia]
>
> Hi Jike,
>
> On Mon, 2016-01-25 at 19:34 +0800, Jike Song wrote:
>> On 01/20/2016 05:05 PM, Tian, Kevin wrote:
>>> I would expect we can spell out next level tasks toward above
>>> direction, upon which Alex can easily judge whether there are
>>> some common VFIO framework changes that he can help :-)
>>
>> Hi Alex,
>>
>> Here is a draft task list after a short discussion w/ Kevin,
>> would you please have a look?
>>
>> Bus Driver
>>
>>     { in i915/vgt/xxx.c }
>>
>>     - define a subset of vfio_pci interfaces
>>     - selective pass-through (say aperture)
>>     - trap MMIO: interface w/ QEMU
>
> What's included in the subset?  Certainly the bus reset ioctls really
> don't apply, but you'll need to support the full device interface,
> right?  That includes the region info ioctl and access through the vfio
> device file descriptor as well as the interrupt info and setup ioctls.
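
To make the "selective pass-through (say aperture)" and "trap MMIO" items in the quoted task list concrete, here is a minimal C sketch of a per-region policy table. Everything in it is an assumption for illustration: the names vgpu_region, vgpu_access, vgpu_find_region and vgpu_may_mmap, and the offsets and sizes, are hypothetical and do not come from VFIO or i915.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-region policy; none of these names exist in VFIO. */
enum vgpu_access { VGPU_PASSTHROUGH, VGPU_TRAP };

struct vgpu_region {
    const char *name;
    uint64_t offset;          /* start within the device fd offset space */
    uint64_t size;            /* length in bytes */
    enum vgpu_access policy;  /* mmap directly, or trap every access */
};

/* Illustrative layout only: MMIO is trapped, the aperture is passed through. */
static const struct vgpu_region vgpu_regions[] = {
    { "mmio",     0x0000000, 0x0200000, VGPU_TRAP },
    { "aperture", 0x1000000, 0x1000000, VGPU_PASSTHROUGH },
};

/* Return the region containing `off`, or NULL if no region covers it. */
static const struct vgpu_region *vgpu_find_region(uint64_t off)
{
    size_t n = sizeof(vgpu_regions) / sizeof(vgpu_regions[0]);

    for (size_t i = 0; i < n; i++) {
        const struct vgpu_region *r = &vgpu_regions[i];

        assert(r->size > 0); /* table entries must be non-empty */
        if (off >= r->offset && off - r->offset < r->size)
            return r;
    }
    return NULL;
}

/*
 * 1 if the region containing `off` may be mmap-ed by userspace,
 * 0 if accesses must be trapped and emulated, -1 if `off` is unmapped.
 */
static int vgpu_may_mmap(uint64_t off)
{
    const struct vgpu_region *r = vgpu_find_region(off);

    if (!r)
        return -1;
    return r->policy == VGPU_PASSTHROUGH;
}
```

With this layout, vgpu_may_mmap(0x1000) reports a trapped access and vgpu_may_mmap(0x1000000) reports a pass-through one. A real bus driver would presumably report the same decision to QEMU through the region info ioctl rather than through a helper like this.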

[All interfaces I thought are via ioctl :)  For other stuff like file
descriptor we'll definitely keep it.]

The list of ioctl commands provided by vfio_pci:

    - VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
    - VFIO_DEVICE_PCI_HOT_RESET

As you said, above 2 don't apply. But for this:

    - VFIO_DEVICE_RESET

In my opinion it should be kept, no matter what will be provided in
the bus driver.

    - VFIO_PCI_ROM_REGION_INDEX
    - VFIO_PCI_VGA_REGION_INDEX

I suppose above 2 don't apply either? For a vgpu we don't provide a
ROM BAR or VGA region.

    - VFIO_DEVICE_GET_INFO
    - VFIO_DEVICE_GET_REGION_INFO
    - VFIO_DEVICE_GET_IRQ_INFO
    - VFIO_DEVICE_SET_IRQS

Above 4 are needed of course.

We will need to extend:

    - VFIO_DEVICE_GET_REGION_INFO

    a) adding a flag: DONT_MAP. For example, the MMIO of vgpu
       should be trapped instead of being mmap-ed.

    b) adding other information. For example, for the OpRegion, QEMU
       needs to do more than mmap a region, it has to:

       - allocate a region
       - copy contents from somewhere in host to that region
       - mmap it to guest

I remember you already have a prototype for this?

>> IOMMU
>>
>>     { in a new vfio_xxx.c }
>>
>>     - allocate: struct device & IOMMU group
>
> It seems like the vgpu instance management would do this.
>

Yes, it can be removed from here.

>>     - map/unmap functions for vgpu
>>     - rb-tree to maintain iova/hpa mappings
>
> Yep, pretty much what type1 does now, but without mapping through the
> IOMMU API.  Essentially just a database of the current userspace
> mappings that can be accessed for page pinning and IOVA->HPA
> translation.
>

Yes.

>>     - interacts with kvmgt.c
>>
>>
>> vgpu instance management
>>
>>     { in i915 }
>>
>>     - path, create/destroy
>>
>
> Yes, and since you're creating and destroying the vgpu here, this is
> where I'd expect a struct device to be created and added to an IOMMU
> group.
> The lifecycle management should really include links between
> the vGPU and physical GPU, which would be much, much easier to do with
> struct devices created here rather than at the point where we start
> doing vfio "stuff".
>

Yes, just like SR-IOV does.

> Nvidia has also been looking at this and has some ideas how we might
> standardize on some of the interfaces and create a vgpu framework to
> help share code between vendors and hopefully make a more consistent
> userspace interface for libvirt as well.  I'll let Neo provide some
> details.  Thanks,

Good to know that, so we can possibly cooperate on some common part,
e.g. the instance management :)

>
> Alex
>

--
Thanks,
Jike
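
For the IOMMU item discussed above (a database of userspace mappings used for IOVA->HPA translation), the following is a minimal userspace C sketch, not kernel code. To stay short it uses a sorted array with binary search in place of the rb-tree named in the task list. All names here (vgpu_map, vgpu_map_db, vgpu_map_add, vgpu_map_del, vgpu_iova_to_hpa, VGPU_MAX_MAPS) are hypothetical and are not part of VFIO or type1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VGPU_MAX_MAPS 64 /* hypothetical fixed capacity for the sketch */

struct vgpu_map {
    uint64_t iova; /* start of the I/O virtual range */
    uint64_t hpa;  /* host physical address backing `iova` */
    uint64_t size; /* length in bytes */
};

struct vgpu_map_db {
    struct vgpu_map maps[VGPU_MAX_MAPS]; /* sorted by iova, non-overlapping */
    size_t count;
};

/* Index of the first entry whose start is strictly greater than `iova`. */
static size_t vgpu_upper_bound(const struct vgpu_map_db *db, uint64_t iova)
{
    size_t lo = 0, hi = db->count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (db->maps[mid].iova <= iova)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Add a mapping; -1 if full, empty, wrapping, or overlapping an entry. */
static int vgpu_map_add(struct vgpu_map_db *db, uint64_t iova,
                        uint64_t hpa, uint64_t size)
{
    size_t pos;

    if (size == 0 || iova + size < iova || db->count == VGPU_MAX_MAPS)
        return -1;
    pos = vgpu_upper_bound(db, iova);
    /* The predecessor starts at or before iova: check that it ends first. */
    if (pos > 0 && iova - db->maps[pos - 1].iova < db->maps[pos - 1].size)
        return -1;
    /* The successor starts after iova: check that we end before it. */
    if (pos < db->count && db->maps[pos].iova - iova < size)
        return -1;
    for (size_t i = db->count; i > pos; i--)
        db->maps[i] = db->maps[i - 1];
    db->maps[pos] = (struct vgpu_map){ iova, hpa, size };
    db->count++;
    return 0;
}

/* Remove the mapping that starts exactly at `iova`; -1 if there is none. */
static int vgpu_map_del(struct vgpu_map_db *db, uint64_t iova)
{
    size_t pos = vgpu_upper_bound(db, iova);

    if (pos == 0 || db->maps[pos - 1].iova != iova)
        return -1;
    for (size_t i = pos; i < db->count; i++)
        db->maps[i - 1] = db->maps[i];
    db->count--;
    return 0;
}

/* Translate `iova` into *hpa; -1 if no mapping covers it. */
static int vgpu_iova_to_hpa(const struct vgpu_map_db *db, uint64_t iova,
                            uint64_t *hpa)
{
    size_t pos = vgpu_upper_bound(db, iova);
    const struct vgpu_map *m;

    assert(hpa != NULL);
    if (pos == 0)
        return -1;
    m = &db->maps[pos - 1];
    if (iova - m->iova >= m->size)
        return -1;
    *hpa = m->hpa + (iova - m->iova);
    return 0;
}
```

A caller would zero-initialise a struct vgpu_map_db, call vgpu_map_add() on map requests, and vgpu_iova_to_hpa() when a page must be pinned or translated. An in-kernel version would more likely use the kernel rb-tree with locking, which this sketch leaves out.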