From: Alex Williamson
Subject: Re: VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Wed, 27 Jan 2016 09:00:59 -0700
Message-ID: <1453910459.6261.1.camel@redhat.com>
References: <569C5071.6080004@intel.com> <1453092476.32741.67.camel@redhat.com> <569CA8AD.6070200@intel.com> <1453143919.32741.169.camel@redhat.com> <569F4C86.2070501@intel.com> <56A6083E.10703@intel.com> <1453757426.32741.614.camel@redhat.com> <20160126102003.GA14400@nvidia.com> <1453838773.15515.1.camel@redhat.com> <56A87A93.3000105@nvidia.com>
In-Reply-To: <56A87A93.3000105@nvidia.com>
To: Kirti Wankhede, Neo Jia, "Tian, Kevin"
Cc: "Song, Jike", Gerd Hoffmann, Paolo Bonzini, "Lv, Zhiyuan", "Ruan, Shuai", "kvm@vger.kernel.org", qemu-devel, "igvt-g@lists.01.org"

On Wed, 2016-01-27 at 13:36 +0530, Kirti Wankhede wrote:
>
> On 1/27/2016 1:36 AM, Alex Williamson wrote:
> > On Tue, 2016-01-26 at 02:20 -0800, Neo Jia wrote:
> > > On Mon, Jan 25, 2016 at 09:45:14PM +0000, Tian, Kevin wrote:
> > > > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > >
> > > Hi Alex, Kevin and Jike,
> > >
> > > (Seems I shouldn't use attachment, resend it again to the list, patches are
> > > inline at the end)
> > >
> > > Thanks for adding me to this technical discussion, a great opportunity
> > > for us to design together which can bring both Intel and NVIDIA vGPU solution to
> > > KVM platform.
> > >
> > > Instead of directly jumping to the proposal that we have been working on
> > > recently for NVIDIA vGPU on KVM, I think it is better for me to put out a couple
> > > quick comments / thoughts regarding the existing discussions on this thread as
> > > fundamentally I think we are solving the same problem, DMA, interrupt and MMIO.
> > >
> > > Then we can look at what we have, hopefully we can reach some consensus soon.
> > >
> > > > Yes, and since you're creating and destroying the vgpu here, this is
> > > > where I'd expect a struct device to be created and added to an IOMMU
> > > > group.  The lifecycle management should really include links between
> > > > the vGPU and physical GPU, which would be much, much easier to do with
> > > > struct devices created here rather than at the point where we start
> > > > doing vfio "stuff".
> > >
> > > In fact, to keep vfio-vgpu more generic, vgpu device creation and management
> > > can be centralized and done in vfio-vgpu. That also includes adding to IOMMU
> > > group and VFIO group.
> > Is this really a good idea?  The concept of a vgpu is not unique to
> > vfio, we want vfio to be a driver for a vgpu, not an integral part of
> > the lifecycle of a vgpu.  That certainly doesn't exclude adding
> > infrastructure to make lifecycle management of a vgpu more consistent
> > between drivers, but it should be done independently of vfio.  I'll go
> > back to the SR-IOV model, vfio is often used with SR-IOV VFs, but vfio
> > does not create the VF, that's done in coordination with the PF making
> > use of some PCI infrastructure for consistency between drivers.
> >
> > It seems like we need to take more advantage of the class and driver
> > core support to perhaps set up a vgpu bus and class with vfio-vgpu just
> > being a driver for those devices.
>
> For device passthrough or SR-IOV model, PCI devices are created by PCI
> bus driver and from the probe routine each device is added in vfio group.

An SR-IOV VF is created by the PF driver using standard interfaces
provided by the PCI core.  The IOMMU group for a VF is added by the
IOMMU driver when the device is created on the pci_bus_type.  The probe
routine of the vfio bus driver (vfio-pci) is what adds the device into
the vfio group.

> For vgpu, there should be a common module that creates the vgpu device, say
> vgpu module, adds the vgpu device to an IOMMU group and then adds it to a vfio
> group.  This module can handle management of vgpus. Advantage of keeping
> this module a separate module than doing device creation in vendor
> modules is to have generic interface for vgpu management, for example,
> files /sys/class/vgpu/vgpu_start and /sys/class/vgpu/vgpu_shudown and
> vgpu driver registration interface.

But you're suggesting something very different from the SR-IOV model.
If we wanted to mimic that model, the GPU specific driver should create
the vgpu using services provided by a common interface.  For instance
i915 could call a new vgpu_device_create() which creates the device,
adds it to the vgpu class, etc.  That vgpu device should not be assumed
to be used with vfio though, that should happen via a separate probe
using a vfio-vgpu driver.  It's that vfio bus driver that will add the
device to a vfio group.

> In the patch, vgpu_dev.c + vgpu_sysfs.c form such vgpu module and
> vgpu_vfio.c is for VFIO interface. Each vgpu device should be added to
> vfio group, so vgpu_group_init() from vgpu_vfio.c should be called per
> device.
> In the vgpu module, vgpu devices are created on request, so
> vgpu_group_init() should be called explicitly for each vgpu device.
> That’s why we had merged the 2 modules, vgpu + vgpu_vfio, to form one vgpu
> module.  Vgpu_vfio would remain a separate entity but merged with the vgpu
> module.

I disagree with this design, creation of a vgpu necessarily involves the
GPU driver and should not be tied to use of the vgpu with vfio.  vfio
should be a driver for the device, maybe eventually not the only driver
for the device.  Thanks,

Alex
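
The layering argued for in this reply can be sketched as a small user-space C model. This is only an illustration under stated assumptions, not kernel code and not the patch under discussion: vgpu_device_create() is the name floated in the message, while the struct layouts, vgpu_register_driver(), vgpu_try_bind() and the integer "vfio group" are hypothetical stand-ins for the real class, driver-core and vfio machinery. The point it tries to show is that the common vgpu layer creates devices without knowing about vfio, and that a separately registered vfio-vgpu driver is what places a device into a vfio group from its probe routine.

```c
/* User-space model only: none of these are existing kernel interfaces. */
#include <assert.h>
#include <stddef.h>

struct vgpu_driver;                    /* forward declaration */

struct vgpu_device {
    const char *name;                  /* e.g. "vgpu0" */
    const char *parent;                /* physical GPU driver that created it */
    struct vgpu_driver *drv;           /* bound driver, NULL while unbound */
    int vfio_group;                    /* -1 until a vfio driver claims it */
};

struct vgpu_driver {
    const char *name;
    int (*probe)(struct vgpu_device *dev);   /* returns 0 on success */
};

#define MAX_VGPU 8
static struct vgpu_device *vgpu_class[MAX_VGPU];  /* stand-in for a vgpu class */
static size_t vgpu_count;
static struct vgpu_driver *registered_driver;     /* model allows one driver */

/* Bind an unbound device if a driver is registered and its probe succeeds. */
static void vgpu_try_bind(struct vgpu_device *dev)
{
    if (!dev->drv && registered_driver && registered_driver->probe(dev) == 0)
        dev->drv = registered_driver;
}

/*
 * Common service called by the GPU-specific driver (i915 in the example
 * from the message). It knows nothing about vfio.
 */
int vgpu_device_create(struct vgpu_device *dev, const char *name,
                       const char *parent)
{
    assert(dev != NULL);
    if (vgpu_count == MAX_VGPU)
        return -1;
    dev->name = name;
    dev->parent = parent;
    dev->drv = NULL;
    dev->vfio_group = -1;
    vgpu_class[vgpu_count++] = dev;
    vgpu_try_bind(dev);
    return 0;
}

/* Register a driver and offer it every device that already exists. */
int vgpu_register_driver(struct vgpu_driver *drv)
{
    if (registered_driver)
        return -1;
    registered_driver = drv;
    for (size_t i = 0; i < vgpu_count; i++)
        vgpu_try_bind(vgpu_class[i]);
    return 0;
}

/* Hypothetical vfio-vgpu driver: only its probe touches vfio groups. */
static int next_vfio_group;

static int vfio_vgpu_probe(struct vgpu_device *dev)
{
    dev->vfio_group = next_vfio_group++;
    return 0;
}

struct vgpu_driver vfio_vgpu_driver = {
    .name = "vfio-vgpu",
    .probe = vfio_vgpu_probe,
};
```

In this model a device created before vfio_vgpu_driver is registered stays unbound with no vfio group, and it only gains one once the driver registers and probes it. A device created afterwards is probed immediately. Another driver could be substituted for vfio_vgpu_driver without changing vgpu_device_create(), which is the separation the reply asks for.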