From: Alex Williamson
Subject: Re: VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Wed, 27 Jan 2016 14:58:28 -0700
Message-ID: <1453931908.18221.5.camel@redhat.com>
In-Reply-To: <56A92EC4.5050105@nvidia.com>
References: <569C5071.6080004@intel.com> <1453092476.32741.67.camel@redhat.com> <569CA8AD.6070200@intel.com> <1453143919.32741.169.camel@redhat.com> <569F4C86.2070501@intel.com> <56A6083E.10703@intel.com> <1453757426.32741.614.camel@redhat.com> <20160126102003.GA14400@nvidia.com> <1453838773.15515.1.camel@redhat.com> <56A87A93.3000105@nvidia.com> <1453910459.6261.1.camel@redhat.com> <56A92EC4.5050105@nvidia.com>
To: Kirti Wankhede, Neo Jia, "Tian, Kevin"
Cc: "Song, Jike", Gerd Hoffmann, Paolo Bonzini, "Lv, Zhiyuan", "Ruan, Shuai", "kvm@vger.kernel.org", qemu-devel, "igvt-g@lists.01.org"
Content-Type: text/plain; charset=UTF-8

On Thu, 2016-01-28 at 02:25 +0530, Kirti Wankhede wrote:
>
> On 1/27/2016 9:30 PM, Alex Williamson wrote:
> > On Wed, 2016-01-27 at 13:36 +0530, Kirti Wankhede wrote:
> > >
> > > On 1/27/2016 1:36 AM, Alex Williamson wrote:
> > > > On Tue, 2016-01-26 at 02:20 -0800, Neo Jia wrote:
> > > > > On Mon, Jan 25, 2016 at 09:45:14PM +0000, Tian, Kevin wrote:
> > > > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > > >
> > > > > Hi Alex, Kevin and Jike,
> > > > >
> > > > > (Seems I shouldn't use attachment, resend it again to the list, patches are
> > > > > inline at the end)
> > > > >
> > > > > Thanks for adding me to this technical discussion, a great opportunity
> > > > > for us to design together which can bring both Intel and NVIDIA vGPU solution to
> > > > > KVM platform.
> > > > >
> > > > > Instead of directly jumping to the proposal that we have been working on
> > > > > recently for NVIDIA vGPU on KVM, I think it is better for me to put out couple
> > > > > quick comments / thoughts regarding the existing discussions on this thread as
> > > > > fundamentally I think we are solving the same problem, DMA, interrupt and MMIO.
> > > > >
> > > > > Then we can look at what we have, hopefully we can reach some consensus soon.
> > > > >
> > > > > > Yes, and since you're creating and destroying the vgpu here, this is
> > > > > > where I'd expect a struct device to be created and added to an IOMMU
> > > > > > group.  The lifecycle management should really include links between
> > > > > > the vGPU and physical GPU, which would be much, much easier to do with
> > > > > > struct devices create here rather than at the point where we start
> > > > > > doing vfio "stuff".
> > > > >
> > > > > Infact to keep vfio-vgpu to be more generic, vgpu device creation and management
> > > > > can be centralized and done in vfio-vgpu. That also include adding to IOMMU
> > > > > group and VFIO group.
> > > > Is this really a good idea?  The concept of a vgpu is not unique to
> > > > vfio, we want vfio to be a driver for a vgpu, not an integral part of
> > > > the lifecycle of a vgpu.  That certainly doesn't exclude adding
> > > > infrastructure to make lifecycle management of a vgpu more consistent
> > > > between drivers, but it should be done independently of vfio.  I'll go
> > > > back to the SR-IOV model, vfio is often used with SR-IOV VFs, but vfio
> > > > does not create the VF, that's done in coordination with the PF making
> > > > use of some PCI infrastructure for consistency between drivers.
> > > >
> > > > It seems like we need to take more advantage of the class and driver
> > > > core support to perhaps setup a vgpu bus and class with vfio-vgpu just
> > > > being a driver for those devices.
> > >
> > > For device passthrough or SR-IOV model, PCI devices are created by PCI
> > > bus driver and from the probe routine each device is added in vfio group.
> >
> > An SR-IOV VF is created by the PF driver using standard interfaces
> > provided by the PCI core.  The IOMMU group for a VF is added by the
> > IOMMU driver when the device is created on the pci_bus_type.  The probe
> > routine of the vfio bus driver (vfio-pci) is what adds the device into
> > the vfio group.
> >
> > > For vgpu, there should be a common module that create vgpu device, say
> > > vgpu module, add vgpu device to an IOMMU group and then add it to vfio
> > > group.  This module can handle management of vgpus. Advantage of keeping
> > > this module a separate module than doing device creation in vendor
> > > modules is to have generic interface for vgpu management, for example,
> > > files /sys/class/vgpu/vgpu_start and  /sys/class/vgpu/vgpu_shudown and
> > > vgpu driver registration interface.
> >
> > But you're suggesting something very different from the SR-IOV model.
> > If we wanted to mimic that model, the GPU specific driver should create
> > the vgpu using services provided by a common interface.  For instance
> > i915 could call a new vgpu_device_create() which creates the device,
> > adds it to the vgpu class, etc.  That vgpu device should not be assumed
> > to be used with vfio though, that should happen via a separate probe
> > using a vfio-vgpu driver.  It's that vfio bus driver that will add the
> > device to a vfio group.
> >
>
> In that case vgpu driver should provide a driver registration interface
> to register vfio-vgpu driver.
>
> struct vgpu_driver {
>     const char *name;
>     int (*probe) (struct vgpu_device *vdev);
>     void (*remove) (struct vgpu_device *vdev);
> }
>
> int vgpu_register_driver(struct vgpu_driver *driver)
> {
> ...
> }
> EXPORT_SYMBOL(vgpu_register_driver);
>
> int vgpu_unregister_driver(struct vgpu_driver *driver)
> {
> ...
> }
> EXPORT_SYMBOL(vgpu_unregister_driver);
>
> vfio-vgpu driver registers to vgpu driver. Then from
> vgpu_device_create(), after creating the device it calls
> vgpu_driver->probe(vgpu_device) and vfio-vgpu driver adds the device to
> vfio group.
>
> +--------------+    vgpu_register_driver()+---------------+
> |     __init() +------------------------->+               |
> |              |                          |               |
> |              +<-------------------------+    vgpu.ko    |
> | vfio_vgpu.ko |   probe()/remove()       |               |
> |              |                +---------+               +---------+
> +--------------+                |         +-------+-------+         |
>                                 |                 ^                 |
>                                 | callback        |                 |
>                                 |         +-------+--------+        |
>                                 |         |vgpu_register_device()   |
>                                 |         |                |        |
>                                 +---^-----+-----+    +-----+------+-+
>                                     | nvidia.ko |    |  i915.ko   |
>                                     |           |    |            |
>                                     +-----------+    +------------+
>
> Is my understanding correct?

We have an entire driver core subsystem in Linux for the purpose of
matching devices to drivers, I don't think we should be re-inventing
that.  That's why I'm suggesting that we should have infrastructure
which facilitates GPU drivers to create vGPU devices in a common way,
perhaps even placing the devices on a virtual vgpu bus, and then allow a
vfio-vgpu driver to register as a driver for devices of that bus/class
and use the existing driver callbacks.  Thanks,

Alex
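
The bus-based matching suggested in the reply above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code and not anything proposed in the thread: every toy_* name is hypothetical, the "bus" is a plain string, and the "vfio group" is reduced to a counter. In the kernel the corresponding roles would presumably be played by the existing driver core (bus_register(), device_register(), driver_register()) rather than by a private vgpu_register_driver(). The point it tries to show is that the core, not the vendor module, pairs devices with drivers, regardless of which one shows up first.

```c
/* Userspace toy model of driver-core style matching. Hypothetical names only. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

struct toy_device;

struct toy_driver {
	const char *name;
	const char *bus;                        /* bus this driver binds to */
	int (*probe)(struct toy_device *dev);   /* 0 means "I claim this device" */
	void (*remove)(struct toy_device *dev);
};

struct toy_device {
	const char *name;
	const char *bus;                        /* bus the device sits on */
	struct toy_driver *driver;              /* NULL until a driver binds */
};

struct toy_core {
	struct toy_device *devs[TOY_MAX];
	struct toy_driver *drvs[TOY_MAX];
	size_t ndevs, ndrvs;
};

/* Bind drv to dev if dev is unbound, the bus matches and probe succeeds. */
static void toy_try_bind(struct toy_device *dev, struct toy_driver *drv)
{
	if (dev->driver || strcmp(dev->bus, drv->bus) != 0)
		return;
	if (drv->probe(dev) == 0)
		dev->driver = drv;
}

/* Called by whoever creates the device (the GPU vendor driver's role). */
int toy_device_add(struct toy_core *core, struct toy_device *dev)
{
	size_t i;

	if (core->ndevs == TOY_MAX)
		return -1;
	dev->driver = NULL;
	core->devs[core->ndevs++] = dev;
	for (i = 0; i < core->ndrvs; i++)
		toy_try_bind(dev, core->drvs[i]);
	return 0;
}

/* Called by a consumer driver (the vfio-vgpu role). */
int toy_driver_register(struct toy_core *core, struct toy_driver *drv)
{
	size_t i;

	if (core->ndrvs == TOY_MAX)
		return -1;
	core->drvs[core->ndrvs++] = drv;
	for (i = 0; i < core->ndevs; i++)
		toy_try_bind(core->devs[i], drv);
	return 0;
}

/* Stand-in for "add the device to a vfio group": just count members. */
static int toy_vfio_group_members;

static int toy_vfio_vgpu_probe(struct toy_device *dev)
{
	(void)dev;
	toy_vfio_group_members++;
	return 0;
}

static void toy_vfio_vgpu_remove(struct toy_device *dev)
{
	(void)dev;
	toy_vfio_group_members--;
}

/* One vgpu exists before the driver registers, one is created afterwards. */
int toy_demo(void)
{
	struct toy_core core;
	struct toy_driver vfio_vgpu = { "vfio-vgpu", "vgpu",
					toy_vfio_vgpu_probe, toy_vfio_vgpu_remove };
	struct toy_device vgpu0 = { "vgpu0", "vgpu", NULL };
	struct toy_device vgpu1 = { "vgpu1", "vgpu", NULL };
	struct toy_device other = { "eth0", "pci", NULL };

	memset(&core, 0, sizeof(core));
	toy_vfio_group_members = 0;

	toy_device_add(&core, &vgpu0);          /* no driver yet, stays unbound */
	toy_device_add(&core, &other);          /* different bus, never matched */
	toy_driver_register(&core, &vfio_vgpu); /* binds vgpu0 retroactively */
	toy_device_add(&core, &vgpu1);          /* binds immediately */

	assert(vgpu0.driver == &vfio_vgpu);
	assert(vgpu1.driver == &vfio_vgpu);
	assert(other.driver == NULL);
	return toy_vfio_group_members;
}
```

Calling toy_demo() from a main() exercises both orderings. In this model the device creator never calls a probe routine itself and knows nothing about vfio, which mirrors the separation argued for above.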