From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yang Zhang
Subject: Re: VFIO based vGPU (was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Tue, 26 Jan 2016 22:05:33 +0800
Message-ID: <56A77D2D.40109@gmail.com>
References: <569C5071.6080004@intel.com> <1453092476.32741.67.camel@redhat.com>
 <569CA8AD.6070200@intel.com> <1453143919.32741.169.camel@redhat.com>
 <569F4C86.2070501@intel.com> <56A6083E.10703@intel.com>
 <1453757426.32741.614.camel@redhat.com> <56A72313.9030009@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "Tian, Kevin", Gerd Hoffmann, Paolo Bonzini, "Lv, Zhiyuan",
 "Ruan, Shuai", kvm@vger.kernel.org, qemu-devel, igvt-g@lists.01.org,
 Neo Jia
To: Jike Song, Alex Williamson
In-Reply-To: <56A72313.9030009@intel.com>

On 2016/1/26 15:41, Jike Song wrote:
> On 01/26/2016 05:30 AM, Alex Williamson wrote:
>> [cc +Neo @Nvidia]
>>
>> Hi Jike,
>>
>> On Mon, 2016-01-25 at 19:34 +0800, Jike Song wrote:
>>> On 01/20/2016 05:05 PM, Tian, Kevin wrote:
>>>> I would expect we can spell out next level tasks toward above
>>>> direction, upon which Alex can easily judge whether there are
>>>> some common VFIO framework changes that he can help :-)
>>>
>>> Hi Alex,
>>>
>>> Here is a draft task list after a short discussion w/ Kevin,
>>> would you please have a look?
>>>
>>> Bus Driver
>>>
>>> { in i915/vgt/xxx.c }
>>>
>>> - define a subset of vfio_pci interfaces
>>> - selective pass-through (say aperture)
>>> - trap MMIO: interface w/ QEMU
>>
>> What's included in the subset?  Certainly the bus reset ioctls really
>> don't apply, but you'll need to support the full device interface,
>> right?  That includes the region info ioctl and access through the vfio
>> device file descriptor as well as the interrupt info and setup ioctls.
>>
>
> [All interfaces I thought are via ioctl:) For other stuff like file
> descriptor we'll definitely keep it.]
>
> The list of ioctl commands provided by vfio_pci:
>
> - VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
> - VFIO_DEVICE_PCI_HOT_RESET
>
> As you said, above 2 don't apply. But for this:
>
> - VFIO_DEVICE_RESET
>
> In my opinion it should be kept, no matter what will be provided in
> the bus driver.
>
> - VFIO_PCI_ROM_REGION_INDEX
> - VFIO_PCI_VGA_REGION_INDEX
>
> I suppose the above 2 don't apply either? For a vgpu we don't provide a
> ROM BAR or VGA region.
>
> - VFIO_DEVICE_GET_INFO
> - VFIO_DEVICE_GET_REGION_INFO
> - VFIO_DEVICE_GET_IRQ_INFO
> - VFIO_DEVICE_SET_IRQS
>
> Above 4 are needed of course.
>
> We will need to extend:
>
> - VFIO_DEVICE_GET_REGION_INFO
>
> a) adding a flag: DONT_MAP. For example, the MMIO of vgpu
> should be trapped instead of being mmap-ed.

I may be missing some context, but I am curious how DONT_MAP would be
handled in the vfio driver. Since there is no real MMIO mapped into that
region, I suppose accesses to it should be handled by the vgpu code in
the i915 driver, but currently most of the MMIO accesses are handled by
QEMU.
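Just to make sure I understand the intent: if DONT_MAP simply means the
region is reported without VFIO_REGION_INFO_FLAG_MMAP, then I would
expect userspace to fall back to read()/write() on the device fd, with
every such access emulated by the vendor driver. A minimal sketch of
what I have in mind (my own illustration only, not existing code):

#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Read a 32-bit register at @reg inside region @index, either through
 * a direct mapping or through the trapped (pread on the fd) path. */
static int read_reg32(int device_fd, unsigned int index, uint64_t reg,
		      uint32_t *val)
{
	struct vfio_region_info info = {
		.argsz = sizeof(info),
		.index = index,		/* e.g. VFIO_PCI_BAR0_REGION_INDEX */
	};

	if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0)
		return -1;

	if (info.flags & VFIO_REGION_INFO_FLAG_MMAP) {
		/* fast path: map the region and touch it directly */
		void *bar = mmap(NULL, info.size, PROT_READ | PROT_WRITE,
				 MAP_SHARED, device_fd, info.offset);
		if (bar == MAP_FAILED)
			return -1;
		*val = *(volatile uint32_t *)((char *)bar + reg);
		munmap(bar, info.size);
		return 0;
	}

	/* trapped path: no mapping exists, every access goes through the
	 * device fd and must be emulated by the bus driver in i915/vgt */
	if (pread(device_fd, val, sizeof(*val), info.offset + reg) !=
	    (ssize_t)sizeof(*val))
		return -1;
	return 0;
}

If that is the picture, then the trapped accesses end up in the bus
driver rather than in QEMU's current MMIO handling, which is exactly
the part I am unsure about.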
>
> b) adding other information. For example, for the OpRegion, QEMU needs
> to do more than mmap a region, it has to:
>
> - allocate a region
> - copy contents from somewhere in host to that region
> - mmap it to guest
>
> I remember you already have a prototype for this?
>
>>> IOMMU
>>>
>>> { in a new vfio_xxx.c }
>>>
>>> - allocate: struct device & IOMMU group
>>
>> It seems like the vgpu instance management would do this.
>>
>
> Yes, it can be removed from here.
>
>>> - map/unmap functions for vgpu
>>> - rb-tree to maintain iova/hpa mappings
>>
>> Yep, pretty much what type1 does now, but without mapping through the
>> IOMMU API.  Essentially just a database of the current userspace
>> mappings that can be accessed for page pinning and IOVA->HPA
>> translation.
>>
>
> Yes.
>
>>> - interacts with kvmgt.c
>>>
>>>
>>> vgpu instance management
>>>
>>> { in i915 }
>>>
>>> - path, create/destroy
>>>
>>
>> Yes, and since you're creating and destroying the vgpu here, this is
>> where I'd expect a struct device to be created and added to an IOMMU
>> group.  The lifecycle management should really include links between
>> the vGPU and physical GPU, which would be much, much easier to do with
>> struct devices created here rather than at the point where we start
>> doing vfio "stuff".
>>
>
> Yes, just like the SRIOV does.
>
>> Nvidia has also been looking at this and has some ideas how we might
>> standardize on some of the interfaces and create a vgpu framework to
>> help share code between vendors and hopefully make a more consistent
>> userspace interface for libvirt as well.  I'll let Neo provide some
>> details.  Thanks,
>
> Good to know that, so we can possibly cooperate on some common part,
> e.g. the instance management :)
>
>>
>> Alex
>>
>
> --
> Thanks,
> Jike

--
best regards
yang
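P.S. For the rb-tree of iova/hpa mappings mentioned above, what I
picture the new vfio_xxx.c keeping is essentially type1's database with
the iommu_map() call left out, so it can be consulted for page pinning
and IOVA->HPA translation. A rough sketch, with all names made up for
illustration:

#include <linux/rbtree.h>
#include <linux/types.h>

/* one pinned userspace mapping, keyed by guest IOVA */
struct vgpu_dma {
	struct rb_node	node;
	unsigned long	iova;	/* guest physical address */
	unsigned long	hpa;	/* host physical address of the pinned pages */
	size_t		size;
};

static struct rb_root vgpu_dma_tree = RB_ROOT;

/* look up the mapping covering @iova, for IOVA->HPA translation */
static struct vgpu_dma *vgpu_dma_find(unsigned long iova)
{
	struct rb_node *n = vgpu_dma_tree.rb_node;

	while (n) {
		struct vgpu_dma *dma = rb_entry(n, struct vgpu_dma, node);

		if (iova < dma->iova)
			n = n->rb_left;
		else if (iova >= dma->iova + dma->size)
			n = n->rb_right;
		else
			return dma;
	}
	return NULL;
}

/* insert a new mapping after its pages have been pinned */
static void vgpu_dma_insert(struct vgpu_dma *new)
{
	struct rb_node **link = &vgpu_dma_tree.rb_node, *parent = NULL;

	while (*link) {
		struct vgpu_dma *dma;

		parent = *link;
		dma = rb_entry(parent, struct vgpu_dma, node);
		if (new->iova < dma->iova)
			link = &(*link)->rb_left;
		else
			link = &(*link)->rb_right;
	}
	rb_link_node(&new->node, parent, link);
	rb_insert_color(&new->node, &vgpu_dma_tree);
}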