From: Alex Williamson
Date: Wed, 27 Jan 2016 14:58:28 -0700
Subject: Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
To: Kirti Wankhede, Neo Jia, "Tian, Kevin"
Cc: "Ruan, Shuai", "Song, Jike", "kvm@vger.kernel.org", "igvt-g@lists.01.org", qemu-devel, Gerd Hoffmann, Paolo Bonzini, "Lv, Zhiyuan"
Message-ID: <1453931908.18221.5.camel@redhat.com>
In-Reply-To: <56A92EC4.5050105@nvidia.com>
References: <569C5071.6080004@intel.com>
 <1453092476.32741.67.camel@redhat.com>
 <569CA8AD.6070200@intel.com>
 <1453143919.32741.169.camel@redhat.com>
 <569F4C86.2070501@intel.com>
 <56A6083E.10703@intel.com>
 <1453757426.32741.614.camel@redhat.com>
 <20160126102003.GA14400@nvidia.com>
 <1453838773.15515.1.camel@redhat.com>
 <56A87A93.3000105@nvidia.com>
 <1453910459.6261.1.camel@redhat.com>
 <56A92EC4.5050105@nvidia.com>

On Thu, 2016-01-28 at 02:25 +0530, Kirti Wankhede wrote:
>
> On 1/27/2016 9:30 PM, Alex Williamson wrote:
> > On Wed, 2016-01-27 at 13:36 +0530, Kirti Wankhede wrote:
> > >
> > > On 1/27/2016 1:36 AM, Alex Williamson wrote:
> > > > On Tue, 2016-01-26 at 02:20 -0800, Neo Jia wrote:
> > > > > On Mon, Jan 25, 2016 at 09:45:14PM +0000, Tian, Kevin wrote:
> > > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > >
> > > > > Hi Alex, Kevin and Jike,
> > > > >
> > > > > (Seems I shouldn't use attachment, resend it again to the list, patches are
> > > > > inline at the end)
> > > > >
> > > > > Thanks for adding me to this technical discussion, a great opportunity
> > > > > for us to design together which can bring both Intel and NVIDIA vGPU solution to
> > > > > KVM platform.
> > > > >
> > > > > Instead of directly jumping to the proposal that we have been working on
> > > > > recently for NVIDIA vGPU on KVM, I think it is better for me to put out couple
> > > > > quick comments / thoughts regarding the existing discussions on this thread as
> > > > > fundamentally I think we are solving the same problem, DMA, interrupt and MMIO.
> > > > >
> > > > > Then we can look at what we have, hopefully we can reach some consensus soon.
> > > > >
> > > > > > Yes, and since you're creating and destroying the vgpu here, this is
> > > > > > where I'd expect a struct device to be created and added to an IOMMU
> > > > > > group.  The lifecycle management should really include links between
> > > > > > the vGPU and physical GPU, which would be much, much easier to do with
> > > > > > struct devices create here rather than at the point where we start
> > > > > > doing vfio "stuff".
> > > > >
> > > > > Infact to keep vfio-vgpu to be more generic, vgpu device creation and management
> > > > > can be centralized and done in vfio-vgpu. That also include adding to IOMMU
> > > > > group and VFIO group.
> > > > Is this really a good idea?  The concept of a vgpu is not unique to
> > > > vfio, we want vfio to be a driver for a vgpu, not an integral part of
> > > > the lifecycle of a vgpu.  That certainly doesn't exclude adding
> > > > infrastructure to make lifecycle management of a vgpu more consistent
> > > > between drivers, but it should be done independently of vfio.  I'll go
> > > > back to the SR-IOV model, vfio is often used with SR-IOV VFs, but vfio
> > > > does not create the VF, that's done in coordination with the PF making
> > > > use of some PCI infrastructure for consistency between drivers.
> > > >
> > > > It seems like we need to take more advantage of the class and driver
> > > > core support to perhaps setup a vgpu bus and class with vfio-vgpu just
> > > > being a driver for those devices.
> > >
> > > For device passthrough or SR-IOV model, PCI devices are created by PCI
> > > bus driver and from the probe routine each device is added in vfio group.
> >
> > An SR-IOV VF is created by the PF driver using standard interfaces
> > provided by the PCI core.  The IOMMU group for a VF is added by the
> > IOMMU driver when the device is created on the pci_bus_type.  The probe
> > routine of the vfio bus driver (vfio-pci) is what adds the device into
> > the vfio group.
> >
> > > For vgpu, there should be a common module that create vgpu device, say
> > > vgpu module, add vgpu device to an IOMMU group and then add it to vfio
> > > group.  This module can handle management of vgpus. Advantage of keeping
> > > this module a separate module than doing device creation in vendor
> > > modules is to have generic interface for vgpu management, for example,
> > > files /sys/class/vgpu/vgpu_start and /sys/class/vgpu/vgpu_shudown and
> > > vgpu driver registration interface.
> >
> > But you're suggesting something very different from the SR-IOV model.
> > If we wanted to mimic that model, the GPU specific driver should create
> > the vgpu using services provided by a common interface.  For instance
> > i915 could call a new vgpu_device_create() which creates the device,
> > adds it to the vgpu class, etc.  That vgpu device should not be assumed
> > to be used with vfio though, that should happen via a separate probe
> > using a vfio-vgpu driver.  It's that vfio bus driver that will add the
> > device to a vfio group.
> >
>
> In that case vgpu driver should provide a driver registration interface
> to register vfio-vgpu driver.
>
> struct vgpu_driver {
>  const char *name;
>  int (*probe) (struct vgpu_device *vdev);
>  void (*remove) (struct vgpu_device *vdev);
> };
>
> int vgpu_register_driver(struct vgpu_driver *driver)
> {
> ...
> }
> EXPORT_SYMBOL(vgpu_register_driver);
>
> int vgpu_unregister_driver(struct vgpu_driver *driver)
> {
> ...
> }
> EXPORT_SYMBOL(vgpu_unregister_driver);
>
> vfio-vgpu driver registers to vgpu driver. Then from
> vgpu_device_create(), after creating the device it calls
> vgpu_driver->probe(vgpu_device) and vfio-vgpu driver adds the device to
> vfio group.
>
> +--------------+    vgpu_register_driver()+---------------+
> |     __init() +------------------------->+               |
> |              |                          |               |
> |              +<-------------------------+    vgpu.ko    |
> | vfio_vgpu.ko |   probe()/remove()       |               |
> |              |                +---------+               +---------+
> +--------------+                |         +-------+-------+         |
>                                 |                 ^                 |
>                                 | callback        |                 |
>                                 |         +-------+--------+        |
>                                 |         |vgpu_register_device()   |
>                                 |         |                |        |
>                                 +---^-----+-----+    +-----+------+-+
>                                     | nvidia.ko |    |  i915.ko   |
>                                     |           |    |            |
>                                     +-----------+    +------------+
>
> Is my understanding correct?

We have an entire driver core subsystem in Linux for the purpose of
matching devices to drivers; I don't think we should be re-inventing
that.  That's why I'm suggesting that we should have infrastructure
that lets GPU drivers create vGPU devices in a common way, perhaps
even placing the devices on a virtual vgpu bus, and then allow a
vfio-vgpu driver to register as a driver for devices of that bus/class
and use the existing driver callbacks.  Thanks,

Alex
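
For readers following the thread, the bus-style model described above can be
sketched in plain userspace C.  This is only an illustrative model, assuming
a fixed-size table stands in for the real driver core; it is not kernel code
and not necessarily the interface that any vendor or vfio driver ended up
using.  The names vgpu_device_create(), vgpu_register_driver() and
vgpu_unregister_driver() come from the thread; the table layout, the helper
vgpu_bus_try_bind() and the counter vfio_vgpu_bound are hypothetical.  In a
real kernel implementation the matching would presumably be left to the
existing driver core (struct bus_type with bus_register(), device_register()
and driver_register()) rather than hand-written as it is here.

```c
/* Userspace model only: mimics how a bus pairs devices with drivers.
 * Table layout and helper names are assumptions made for illustration. */
#include <assert.h>
#include <stddef.h>

#define VGPU_MAX 8

struct vgpu_device;

struct vgpu_driver {
    const char *name;
    int  (*probe)(struct vgpu_device *vdev);   /* return 0 to bind */
    void (*remove)(struct vgpu_device *vdev);
};

struct vgpu_device {
    const char *parent;          /* creating GPU driver, e.g. "i915" */
    int id;
    struct vgpu_driver *driver;  /* NULL until a driver binds */
};

/* The "bus" tracks devices and drivers and knows nothing about vfio. */
static struct vgpu_device *bus_devices[VGPU_MAX];
static struct vgpu_driver *bus_drivers[VGPU_MAX];
static size_t n_devices, n_drivers;

/* Offer one device to one driver; bind only if probe() succeeds. */
static void vgpu_bus_try_bind(struct vgpu_device *vdev, struct vgpu_driver *drv)
{
    if (vdev->driver == NULL && drv->probe(vdev) == 0)
        vdev->driver = drv;
}

/* Called by a GPU vendor driver; the device may stay unbound. */
int vgpu_device_create(struct vgpu_device *vdev)
{
    assert(vdev != NULL);
    if (n_devices == VGPU_MAX)
        return -1;
    vdev->driver = NULL;
    bus_devices[n_devices++] = vdev;
    for (size_t i = 0; i < n_drivers; i++)
        vgpu_bus_try_bind(vdev, bus_drivers[i]);
    return 0;
}

/* Called by a consumer such as vfio-vgpu; picks up existing devices too. */
int vgpu_register_driver(struct vgpu_driver *drv)
{
    assert(drv != NULL && drv->probe != NULL);
    if (n_drivers == VGPU_MAX)
        return -1;
    bus_drivers[n_drivers++] = drv;
    for (size_t i = 0; i < n_devices; i++)
        vgpu_bus_try_bind(bus_devices[i], drv);
    return 0;
}

/* Unbind every device held by drv, then drop drv from the table. */
void vgpu_unregister_driver(struct vgpu_driver *drv)
{
    for (size_t i = 0; i < n_devices; i++) {
        if (bus_devices[i]->driver == drv) {
            if (drv->remove)
                drv->remove(bus_devices[i]);
            bus_devices[i]->driver = NULL;
        }
    }
    for (size_t i = 0; i < n_drivers; i++) {
        if (bus_drivers[i] == drv) {
            bus_drivers[i] = bus_drivers[--n_drivers];
            break;
        }
    }
}

/* Stand-in for vfio-vgpu: it only counts the devices it has claimed. */
int vfio_vgpu_bound;

static int vfio_vgpu_probe(struct vgpu_device *vdev)
{
    (void)vdev;
    vfio_vgpu_bound++;   /* a real driver would join a vfio group here */
    return 0;
}

static void vfio_vgpu_remove(struct vgpu_device *vdev)
{
    (void)vdev;
    vfio_vgpu_bound--;
}

struct vgpu_driver vfio_vgpu_driver = {
    .name   = "vfio-vgpu",
    .probe  = vfio_vgpu_probe,
    .remove = vfio_vgpu_remove,
};
```

In this model a device created before any driver is registered simply stays
unbound; registering vfio_vgpu_driver later binds it, and devices created
after that bind immediately.  The vendor side never calls into vfio directly,
which is the separation the SR-IOV comparison in the thread is arguing for.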